
Anthropic's Claude vulnerable to 'emotional manipulation'

from The Register on 2024-10-12 10:30 (#6RDT3)

AI model safety only goes so far. Anthropic's Claude 3.5 Sonnet, despite its reputation as one of the better-behaved generative AI models, can still be convinced to emit racist hate speech and malware....

