Search-capable AI agents may cheat on benchmark tests

from The Register (#6ZGYD)
Data contamination can make models seem more capable than they really are

Researchers with Scale AI have found that search-based AI models may cheat on benchmark tests by fetching the answers directly from online sources rather than deriving those answers through a "reasoning" process....

External Content
Source RSS or Atom Feed
Feed Location http://www.theregister.co.uk/headlines.atom
Feed Title The Register
Feed Link https://www.theregister.com/
Feed Copyright Copyright © 2025, Situation Publishing