Implementing DeepSeek R1's GRPO algorithm from scratch, from Hacker News on 2025-04-13 18:33 (#6WKHD)