Anthropic Releases New Version of Claude That Beats GPT-4 and Gemini Ultra in Some Benchmark Tests
Anthropic, a leading artificial intelligence startup, unveiled its Claude 3 series of AI models today, designed to meet the diverse needs of enterprise customers with a balance of intelligence, speed, and cost efficiency. The lineup includes three models: Opus, Sonnet, and the upcoming Haiku. From a report: The star of the lineup is Opus, which Anthropic claims is more capable than any other openly available AI system on the market, even outperforming leading models from rivals OpenAI and Google. "Opus is capable of the widest range of tasks and performs them exceptionally well," said Anthropic cofounder and CEO Dario Amodei in an interview with VentureBeat. Amodei explained that Opus outperforms top AI models like GPT-4, GPT-3.5 and Gemini Ultra on a wide range of benchmarks. This includes topping the leaderboard on academic benchmarks like GSM-8k for mathematical reasoning and MMLU for expert-level knowledge. "It seems to outperform everyone and get scores that we haven't seen before on some tasks," Amodei said. While companies like Anthropic and Google have not disclosed the full parameters of their leading models, the reported benchmark results from both companies imply Opus either matches or surpasses major alternatives like GPT-4 and Gemini in core capabilities. This, at least on paper, establishes a new high watermark for commercially available conversational AI. Engineered for complex tasks requiring advanced reasoning, Opus stands out in Anthropic's lineup for its superior performance. Sonnet, the mid-range model, offers businesses a more cost-effective solution for routine data analysis and knowledge work, maintaining high performance without the premium price tag of the flagship model. Meanwhile, Haiku is designed to be swift and economical, suited for applications such as consumer-facing chatbots, where responsiveness and cost are crucial factors. Amodei told VentureBeat he expects Haiku to launch publicly in a matter of "weeks, not months."
Read more of this story at Slashdot.