Article 6NXXE Anthropic Looks To Fund a New, More Comprehensive Generation of AI Benchmarks

by msmash, from Slashdot (#6NXXE)
AI firm Anthropic launched a funding program Monday to develop new benchmarks for evaluating AI models, including its chatbot Claude. The initiative will pay third-party organizations to create metrics for assessing advanced AI capabilities. Anthropic aims to "elevate the entire field of AI safety" with this investment, according to its blog.

TechCrunch adds: As we've highlighted before, AI has a benchmarking problem. The most commonly cited benchmarks for AI today do a poor job of capturing how the average person actually uses the systems being tested. There are also questions as to whether some benchmarks, particularly those released before the dawn of modern generative AI, even measure what they purport to measure, given their age. The very-high-level, harder-than-it-sounds solution Anthropic is proposing is creating challenging benchmarks with a focus on AI security and societal implications via new tools, infrastructure and methods.
