Transparency is often lacking in datasets used to train large language models from Hacker News on 2024-09-03 14:05 (#6QEGV) Comments