OpenAI’s flagship AI model has gotten more trustworthy but easier to trick


by Emilia David, The Verge
Image: Microsoft

OpenAI's GPT-4 large language model may be more trustworthy than GPT-3.5 but also more vulnerable to jailbreaking and bias, according to research backed by Microsoft.

The paper, by researchers from the University of Illinois Urbana-Champaign, Stanford University, the University of California, Berkeley, the Center for AI Safety, and Microsoft Research, gave GPT-4 a higher trustworthiness score than its predecessor. That means they found it was generally better at protecting private information, avoiding toxic outputs such as biased text, and resisting adversarial attacks. However, it could also be instructed to ignore security measures and leak personal information and conversation histories. Researchers found that users can bypass safeguards...

Continue reading...
