Articles

Cerebras Inference – Cloud Access to Wafer Scale AI Chips
Cerebras is a startup that makes wafer-sized AI chips. It is building a data center around those wafer-scale chips to provide super-fast AI inference. The headline figures: Llama3.1-70B at 450 tokens/s (20x faster than GPUs); 60c per M tokens (a fifth the price of hyperscalers); full 16-bit precision for full model ...