by Brian Wang from NextBigFuture.com on (#6Q9J8)
Cerebras is a startup that makes wafer-sized AI chips. It is building a data center with those wafer chips to provide super-fast AI inference. The headline figures: Llama3.1-70B at 450 tokens/s, 20x faster than GPUs; 60c per M tokens, a fifth the price of hyperscalers; full 16-bit precision for full model ...
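
The quoted ratios imply a GPU baseline and a hyperscaler price that the teaser does not state outright. Below is a rough Python sketch of that arithmetic. It assumes "20x" and "a fifth" are exact ratios and that "60c" means US$0.60; the constant and function names are illustrative only and do not come from Cerebras.

```python
# Figures as quoted in the teaser (treated as exact for illustration only).
TOKENS_PER_SECOND = 450.0        # Llama3.1-70B throughput
SPEEDUP_VS_GPU = 20.0            # "20x faster than GPUs"
PRICE_PER_M_TOKENS_USD = 0.60    # "60c per M tokens" (assumed to be US cents)
HYPERSCALER_PRICE_MULTIPLE = 5   # "a fifth the price of hyperscalers"


def implied_gpu_tokens_per_second() -> float:
    """GPU throughput implied by the quoted speedup."""
    return TOKENS_PER_SECOND / SPEEDUP_VS_GPU


def implied_hyperscaler_price_per_m_tokens() -> float:
    """Hyperscaler price per million tokens implied by the quoted ratio."""
    return PRICE_PER_M_TOKENS_USD * HYPERSCALER_PRICE_MULTIPLE


def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a steady throughput."""
    return num_tokens / tokens_per_second


def cost_usd(num_tokens: int, price_per_m_tokens: float) -> float:
    """Cost of num_tokens at a given price per million tokens."""
    return num_tokens / 1_000_000 * price_per_m_tokens


if __name__ == "__main__":
    gpu_tps = implied_gpu_tokens_per_second()
    print(f"Implied GPU throughput: {gpu_tps:.1f} tokens/s")
    print(f"Implied hyperscaler price: "
          f"${implied_hyperscaler_price_per_m_tokens():.2f} per M tokens")
    print(f"1,000 tokens: {seconds_to_generate(1000, TOKENS_PER_SECOND):.1f} s "
          f"vs {seconds_to_generate(1000, gpu_tps):.1f} s on the implied GPU baseline")
```

Under these assumptions, a 1,000-token reply takes a little over 2 seconds at the quoted rate, versus roughly 44 seconds at the implied GPU baseline.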