Google’s PaLM-E is a generalist robot brain that takes commands

by Benj Edwards, Ars Technica
[Image: A robotic arm controlled by PaLM-E reaches for a bag of chips in a demonstration video. (credit: Google Research)]

On Monday, a group of AI researchers from Google and the Technical University of Berlin unveiled PaLM-E, a multimodal embodied visual-language model (VLM) with 562 billion parameters that integrates vision and language for robotic control. They claim it is the largest VLM ever developed and that it can perform a variety of tasks without the need for retraining.
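
PaLM-E's central design move, per the paper, is treating images as just another kind of token: features from a vision encoder are projected into the language model's embedding space and interleaved with the text tokens. Below is a minimal PyTorch sketch of that idea; the sizes are toy, the encoder stack stands in for PaLM's much larger decoder, and none of the names come from Google's code.

```python
# Toy sketch of the PaLM-E-style idea: project vision-encoder features
# into the language model's embedding space and feed one interleaved
# sequence. All names and sizes here are illustrative, not Google's.
import torch
import torch.nn as nn

class TinyMultimodalLM(nn.Module):
    def __init__(self, vocab_size=1000, d_model=256, n_img_features=768):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        # Linear projection from vision features (e.g. ViT patch
        # embeddings) into the language model's token embedding space.
        self.img_proj = nn.Linear(n_img_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, text_ids, img_features):
        # text_ids: (batch, n_text); img_features: (batch, n_patches, 768)
        text_emb = self.token_emb(text_ids)
        img_emb = self.img_proj(img_features)
        # "Interleaving" reduced to its simplest form: prepend the
        # projected image tokens to the text tokens.
        seq = torch.cat([img_emb, text_emb], dim=1)
        return self.lm_head(self.backbone(seq))

model = TinyMultimodalLM()
text = torch.randint(0, 1000, (1, 12))   # tokenized command
patches = torch.randn(1, 16, 768)        # stand-in ViT patch features
logits = model(text, patches)
print(logits.shape)  # (1, 28, 1000): one prediction per image+text position
```

The important part is the projection layer: once image features live in the same space as word embeddings, the language model can attend over vision and text uniformly, which is what lets one model handle many tasks without retraining.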

According to Google, when given a high-level command, such as "bring me the rice chips from the drawer," PaLM-E can generate a plan of action for a mobile robot platform with an arm (developed by Google Robotics) and execute the actions by itself.
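
As a rough picture of what "generate a plan of action and execute it" could look like as control flow, here is a hedged sketch. The plan text, the skill names, and the `planner()` stub are all illustrative assumptions; in the real system, the model's plan steps are carried out by separately trained low-level robot policies, not Python lambdas.

```python
# Hypothetical sketch: a language model emits a step-by-step plan as text,
# and each step is dispatched to a low-level skill. planner() and SKILLS
# are stand-ins, not Google's actual interfaces.
def planner(command: str) -> str:
    # Stand-in for the language model; returns a newline-separated plan.
    return "go_to drawer\nopen drawer\npick rice_chips\nbring_to user"

SKILLS = {
    "go_to":    lambda target: print(f"navigating to {target}"),
    "open":     lambda target: print(f"opening {target}"),
    "pick":     lambda target: print(f"grasping {target}"),
    "bring_to": lambda target: print(f"delivering to {target}"),
}

def execute(command: str) -> None:
    for step in planner(command).splitlines():
        skill, target = step.split(maxsplit=1)
        SKILLS[skill](target)  # hand each plan step to a low-level skill

execute("bring me the rice chips from the drawer")
```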

PaLM-E does this by analyzing data from the robot's camera without needing a pre-processed scene representation. This eliminates the need for a human to pre-process or annotate the data and allows for more autonomous robotic control.
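
One way to picture control without a pre-processed scene representation is a closed loop that re-encodes the raw camera frame at every step. The sketch below is hypothetical throughout: `capture_frame()`, `next_action()`, and the action vocabulary are stand-ins rather than Google's API. The point it illustrates is that the only perceptual input is pixels, with no human-built scene graph or annotations in between.

```python
# Hypothetical closed-loop sketch: each control step re-observes the raw
# camera frame and asks the model for the next action. All functions here
# are illustrative stubs.
import random

def capture_frame():
    # Fake raw pixels standing in for the robot's camera image.
    return [[random.random() for _ in range(8)] for _ in range(8)]

def next_action(frame, goal, history):
    # Stand-in for the VLM, which would consume raw pixels plus goal text.
    remaining = [a for a in ("approach", "grasp", "retrieve", "done")
                 if a not in history]
    return remaining[0]

def run(goal: str) -> None:
    history = []
    while True:
        action = next_action(capture_frame(), goal, history)
        if action == "done":
            break
        print(f"executing: {action}")
        history.append(action)  # act, then re-observe on the next pass

run("bring me the rice chips from the drawer")
```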
