Articles

Built RL for long-horizon agents – tested on 32x H100s but too poor to train
Show HN: Terminal-Bench-RL: Training Long-Horizon Terminal Agents with RL