Cerebras Claims Fastest AI Inference

by staff, High-Performance Computing News Analysis | insideHPC

AI compute company Cerebras Systems today announced what it calls the fastest AI inference solution. According to the company, Cerebras Inference delivers 1,800 tokens per second for Llama3.1 8B and 450 tokens per second for Llama3.1 70B, which it says makes the service 20 times faster than GPU-based solutions in hyperscale clouds.

Source: High-Performance Computing News Analysis | insideHPC, https://insidehpc.com/