Efficient streaming language models with attention sinks, from Hacker News on 2023-10-02 16:56 (#6F86N)