
Cerebras Reports 3,000 Tokens Per Second Inference on OpenAI gpt-oss-120b Model

by staff, from Inside HPC & AI News | High-Performance Computing & Artificial Intelligence (#6Z58Q)

Cerebras Systems today announced inference support for gpt-oss-120b, OpenAI's first open-weight reasoning model, running at a record inference speed of 3,000 tokens per second on the Cerebras AI Inference Cloud, according to ....
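As a rough illustration of what a sustained decode rate of 3,000 tokens per second means in practice, the sketch below converts a response length into wall-clock generation time. The function name and the example response length are illustrative, not from the announcement.

```python
def generation_time_s(num_tokens: int, tokens_per_second: float = 3000.0) -> float:
    """Wall-clock time (seconds) to stream num_tokens at a sustained
    decode throughput. Illustrative back-of-the-envelope math only;
    real latency also includes prompt processing and network overhead."""
    return num_tokens / tokens_per_second

# Hypothetical example: a 1,000-token response at 3,000 tokens/s
# streams out in about a third of a second.
print(round(generation_time_s(1000), 3))
```

At that rate, even long reasoning traces of several thousand tokens would complete in a couple of seconds, which is the practical appeal of the throughput figure the announcement highlights.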

