Eureka: With GPT-4 overseeing training, robots can learn much faster

Benj Edwards

from Ars Technica - All content on 2023-10-23 13:37 (#6FSYX)

In this still captured from a video provided by Nvidia, a simulated robot hand learns pen tricks, trained by Eureka, using simultaneous trials. (credit: Nvidia)

On Friday, researchers from Nvidia, UPenn, Caltech, and the University of Texas at Austin announced Eureka, an algorithm that uses OpenAI's GPT-4 language model for designing training goals (called "reward functions") to enhance robot dexterity. The work aims to bridge the gap between high-level reasoning and low-level motor control, allowing robots to learn complex tasks rapidly using massively parallel simulations that run through trials simultaneously. According to the team, Eureka outperforms human-written reward functions by a substantial margin.

Before robots can interact with the real world successfully, they need to learn how to move their bodies to achieve goals, like picking up objects or moving around. Instead of having a physical robot learn in a lab by trying and failing at one task at a time, researchers at Nvidia have been experimenting with video game-like computer worlds (built on platforms called Isaac Sim and Isaac Gym) that simulate three-dimensional physics. These allow massively parallel training sessions to take place in many virtual worlds at once, dramatically speeding up training.

"Leveraging state-of-the-art GPU-accelerated simulation in Nvidia Isaac Gym," writes Nvidia on its demonstration page, "Eureka is able to quickly evaluate the quality of a large batch of reward candidates, enabling scalable search in the reward function space." They call it "rapid reward evaluation via massively parallel reinforcement learning."
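
To make the idea of scoring a batch of reward candidates concrete, here is a toy Python sketch. It is not Eureka's code and rests on several assumptions: the three reward candidates are hand-written and hypothetical (in Eureka they would come from GPT-4), a one-dimensional point stands in for the robot, a grid search over a single policy gain stands in for reinforcement learning, and a thread pool stands in for GPU-parallel simulation in Isaac Gym. Every name in it is invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

TARGET = 1.0                            # where the simulated point should end up
STEPS = 20                              # length of one simulated trial
GAINS = [i / 10 for i in range(21)]     # policy parameters to try: 0.0 .. 2.0


def rollout(gain):
    """Simulate a 1-D point chasing TARGET; return (position, action) per step."""
    pos, trajectory = 0.0, []
    for _ in range(STEPS):
        action = gain * (TARGET - pos)
        pos += action
        trajectory.append((pos, action))
    return trajectory


def task_fitness(gain):
    """Ground-truth score: negative final distance to TARGET (higher is better)."""
    final_pos, _ = rollout(gain)[-1]
    return -abs(TARGET - final_pos)


# Hypothetical reward candidates (assumption: hand-written here, not model output).
REWARD_CANDIDATES = {
    "negative_distance": lambda pos, action: -abs(TARGET - pos),
    "reward_big_actions": lambda pos, action: abs(action),
    "constant": lambda pos, action: 1.0,
}


def train_and_score(item):
    """'Train' under one candidate reward, then score the result on the real task."""
    name, reward_fn = item
    # Stand-in for RL: pick the gain that maximizes cumulative candidate reward.
    best_gain = max(GAINS, key=lambda g: sum(reward_fn(p, a) for p, a in rollout(g)))
    return name, best_gain, task_fitness(best_gain)


def evaluate_candidates(candidates):
    """Evaluate all candidates concurrently and rank them by task fitness."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(train_and_score, candidates.items()))
    return sorted(results, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    for name, gain, fitness in evaluate_candidates(REWARD_CANDIDATES):
        print(f"{name}: trained gain={gain}, task fitness={fitness}")
```

In this toy setup, the distance-based candidate leads to a policy that reaches the target, while the other two produce policies that never get there. That gap in task fitness is the kind of signal a search over reward functions could use to keep good candidates and discard poor ones.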

