Task-Specific LLM Evals That Do and Don't Work from Hacker News on 2024-12-09 14:23 (#6ST4R)