Experts find flaws in hundreds of tests that check AI safety and effectiveness

Robert Booth, UK technology editor

from World news | The Guardian on 2025-11-04 00:05 (#7176A)

Scientists say almost all have weaknesses in at least one area that can undermine the validity of resulting claims

Experts have found weaknesses, some serious, in hundreds of tests used to check the safety and effectiveness of new artificial intelligence models being released into the world.

Computer scientists from the British government's AI Security Institute, and experts at universities including Stanford, Berkeley and Oxford, examined more than 440 benchmarks that provide an important safety net.
