How Often Do AI Chatbots Lead Users Down a Harmful Path?
Freeman writes:
https://arstechnica.com/ai/2026/01/how-often-do-ai-chatbots-lead-users-down-a-harmful-path/
At this point, we've all heard plenty of stories about AI chatbots leading users to harmful actions, harmful beliefs, or simply incorrect information. Despite the prevalence of these stories, though, it's hard to know just how often users are being manipulated. Are these tales of AI harms anecdotal outliers or signs of a frighteningly common problem?
Anthropic took a stab at answering that question this week, releasing a paper studying the potential for what it calls "disempowering patterns" across 1.5 million anonymized real-world conversations with its Claude AI model.
[...]
In the newly published paper "Who's in Charge? Disempowerment Patterns in Real-World LLM Usage," [PDF] researchers from Anthropic and the University of Toronto try to quantify the potential for a specific set of "user disempowering" harms.
[...]
Reality distortion: Their beliefs about reality become less accurate (e.g., a chatbot validates their belief in a conspiracy theory)
Belief distortion: Their value judgments shift away from those they actually hold (e.g., a user begins to see a relationship as "manipulative" based on Claude's evaluation)
Action distortion: Their actions become misaligned with their values (e.g., a user disregards their instincts and follows Claude-written instructions for confronting their boss)
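To make the three-part taxonomy concrete, here is a minimal sketch of one way flagged conversations could be represented against these categories. All class and field names below are hypothetical illustrations, not Anthropic's or Clio's actual schema.

```python
# Purely illustrative sketch: one way to represent the paper's three
# "disempowerment" categories when tagging conversations. All names here
# are hypothetical, not Anthropic's or Clio's actual schema.
from dataclasses import dataclass
from enum import Enum


class Distortion(Enum):
    REALITY = "reality distortion"  # beliefs about reality become less accurate
    BELIEF = "belief distortion"    # value judgments drift away from the user's own
    ACTION = "action distortion"    # actions become misaligned with the user's values


@dataclass
class FlaggedConversation:
    conversation_id: str
    category: Distortion
    severe_risk: bool  # whether the pattern was rated a "severe risk"


# Example: a conversation tagged as a severe reality-distortion risk.
example = FlaggedConversation("conv-0001", Distortion.REALITY, severe_risk=True)
print(example)
```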
Anthropic ran nearly 1.5 million Claude conversations through Clio, an automated analysis tool and classification system.
[...]
That analysis found a "severe risk" of disempowerment potential in anything from 1 in 1,300 conversations (for "reality distortion") to 1 in 6,000 conversations (for "action distortion").
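To put those rates in perspective, here is a quick back-of-the-envelope calculation. It assumes the full ~1.5 million-conversation sample as the denominator, which the excerpt does not confirm, so the figures are only rough.

```python
# Back-of-the-envelope check of the reported rates, assuming the full sample
# of roughly 1.5 million conversations as the denominator (the paper's exact
# denominators may differ).
total_conversations = 1_500_000

rates = {
    "reality distortion": 1 / 1_300,  # ~1 in 1,300 conversations
    "action distortion": 1 / 6_000,   # ~1 in 6,000 conversations
}

for category, rate in rates.items():
    expected = total_conversations * rate
    print(f"{category}: roughly {expected:,.0f} conversations at severe risk")
```

At those rates, the "severe risk" label would cover on the order of a thousand reality-distortion conversations and a few hundred action-distortion conversations within the analyzed sample.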
Read more of this story at SoylentNews.