Article 6MTA8 Project Astra Is Google's 'Multimodal' Answer to the New ChatGPT

Project Astra Is Google's 'Multimodal' Answer to the New ChatGPT

by
BeauHD
from Slashdot on (#6MTA8)
At Google I/O today, Google introduced a "next-generation AI assistant" called Project Astra that can "make sense of what your phone's camera sees," reports Wired. It follows yesterday's launch of GPT-4o, a new AI model from OpenAI that can quickly respond to prompts via voice and talk about what it 'sees' through a smartphone camera or on a computer screen. It "also uses a more humanlike voice and emotionally expressive tone, simulating emotions like surprise and even flirtatiousness," notes Wired. From the report: In response to spoken commands, Astra was able to make sense of objects and scenes as viewed through the devices' cameras, and converse about them in natural language. It identified a computer speaker and answered questions about its components, recognized a London neighborhood from the view out of an office window, read and analyzed code from a computer screen, composed a limerick about some pencils, and recalled where a person had left a pair of glasses. [...] Google says Project Astra will be made available through a new interface called Gemini Live later this year. [Demis Hassabis, the executive leading the company's effort to reestablish leadership inAAI] said that the company is still testing several prototype smart glasses and has yet to make a decision on whether to launch any of them. Hassabis believes that imbuing AI models with a deeper understanding of the physical world will be key to further progress in AI, and to making systems like Project Astra more robust. Other frontiers of AI, including Google DeepMind's work on game-playing AI programs could help, he says. Hassabis and others hope such work could be revolutionary for robotics, an area that Google is also investing in."A multimodal universal agent assistant is on the sort of track to artificial general intelligence," Hassabis said in reference to a hoped-for but largely undefined future point where machines can do anything and everything that a human mind can. "This is not AGI or anything, but it's the beginning of something."

twitter_icon_large.pngfacebook_icon_large.png

Read more of this story at Slashdot.

External Content
Source RSS or Atom Feed
Feed Location https://rss.slashdot.org/Slashdot/slashdotMain
Feed Title Slashdot
Feed Link https://slashdot.org/
Feed Copyright Copyright Slashdot Media. All Rights Reserved.
Reply 0 comments