OpenAI’s flagship AI model has gotten more trustworthy but easier to trick

Emilia David

from The Verge on 2023-10-17 21:38 (#6FNC1)

OpenAI's GPT-4 large language model may be more trustworthy than GPT-3.5 but also more vulnerable to jailbreaking and bias, according to research backed by Microsoft.

The paper (by researchers from the University of Illinois Urbana-Champaign; Stanford University; the University of California, Berkeley; the Center for AI Safety; and Microsoft Research) gave GPT-4 a higher trustworthiness score than its predecessor. That means they found it was generally better at protecting private information, avoiding toxic results like biased information, and resisting adversarial attacks. However, it could also be told to ignore security measures and leak personal information and conversation histories. Researchers found that users can bypass safeguards...
