Refusal in Language Models Is Mediated by a Single Direction from Hacker News on 2026-05-02 13:15 (#75BWX)