Thumbnail - Pipedot

Thumbnail 1663365

Articles

Built RL for long-horizon agents – tested on 32x H100s but too poor to train

from Hacker News on 2025-07-29 11:12 (#6YZ1J)

Show HN: Terminal-Bench-RL: Training Long-Horizon Terminal Agents with RL

from Hacker News on 2025-07-29 11:12 (#6YZ3R)
