Refusal in Language Models Is Mediated by a Single Direction, from Hacker News on 2024-06-18 17:09 (#6NM5B)