Articles

Cerebras Reports 3,000 Tokens Per Second Inference on OpenAI gpt-oss-120b Model
Cerebras Systems today announced inference support for gpt-oss-120b, OpenAI's first open-weight reasoning model, running at record inference speeds of 3,000 tokens per second on the Cerebras AI Inference Cloud, according to ....
This post first appeared on Inside HPC & AI News | High-Performance Computing & Artificial Intelligence.