
Cerebras Inference – Cloud Access to Wafer Scale AI Chips

by Brian Wang from NextBigFuture.com (#6Q9J8)
Cerebras is a startup that makes wafer-scale AI chips. It is building a data center with those wafer chips to provide super-fast AI inference:

- Llama 3.1-70B at 450 tokens/s, 20x faster than GPUs
- 60c per million tokens, a fifth the price of hyperscalers
- Full 16-bit precision for the full model ...
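The quoted throughput and price imply some simple back-of-envelope numbers. A minimal sketch (the helper names are mine, not from Cerebras):

```python
# Back-of-envelope math from the figures quoted above:
# 450 tokens/s for Llama 3.1-70B, 60c ($0.60) per million tokens.

TOKENS_PER_SEC = 450
PRICE_PER_M_TOKENS = 0.60  # dollars

def seconds_for(tokens: int) -> float:
    """Wall-clock time to generate `tokens` output tokens at the quoted rate."""
    return tokens / TOKENS_PER_SEC

def cost_for(tokens: int) -> float:
    """Dollar cost of `tokens` tokens at the quoted price."""
    return tokens * PRICE_PER_M_TOKENS / 1_000_000

# A 1,000-token answer takes about 2.2 s and costs $0.0006.
print(round(seconds_for(1000), 2), round(cost_for(1000), 4))
```

At these rates, a million tokens of output would take roughly 37 minutes on one stream and cost 60 cents, versus roughly five times that price at the hyperscalers per the article's claim.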

Read more

Source RSS or Atom Feed
Feed Location http://feeds.feedburner.com/blogspot/advancednano
Feed Title NextBigFuture.com
Feed Link https://www.nextbigfuture.com/