Batched reward model inference and Best-of-N sampling from Hacker News on 2024-11-19 06:19 (#6SATH)