Article 75R05 Google touts its tokenmaxxing and capex spending amid AI orgy

from www.theregister.com - Articles on (#75R05)
Sundar Pichai, CEO of Google and doting parent company Alphabet, opened the Google I/O developer conference with a celebration of token and capital expenditures.

Tokens are the basic data exchange unit of AI models and Google has vastly increased its token processing to accommodate internal and external demand for AI inference. Two years ago, Pichai said, Google handled 9.7 trillion tokens per month. Last year, it was 480 trillion per month. Currently, the Chocolate Factory handles 3.2 quadrillion tokens per month.

"Now some out there might call this tokenmaxxing and there's probably some truth to it," said Pichai. "I still think it tells an important story about our products and how others are building as well, especially our developers."

Pichai said more than 8.5 million developers build applications with Google's Gemini model family each month, consuming about 19 billion tokens per minute in API calls. And over the past 12 months, more than 375 customers have consumed more than 1 trillion tokens each - an indication there's some demand for AI among businesses.

That token processing is possible because of the vast capital expenditures Google has made in datacenters, compute capacity, and TPU hardware.

"Supporting all of this at scale for our users while also serving enterprises and developers around the world requires massive investments in infrastructure," said Pichai. "And we've been investing for today and for the future. In 2022, we were spending $31 billion annually in capex. This year, we expect that number to be about six times that, approximately 180 to 190 billion dollars."

Demis Hassabis, co-founder and CEO of Google DeepMind, took a turn on stage to provide an update on Google's progress toward AGI - artificial general intelligence - that ill-defined point when AI models perform some set of tasks as well as a human. Gemini Omni, Hassabis suggested, is a step in that direction.
It can, he said, "create anything from any input," meaning digital stuff as opposed to atomic replication. "It combines Gemini's intelligence with the best of our generative media models for a new level of world understanding, multimodality and editing," he explained.

Gemini Omni combines the video, image, and interactive simulation capabilities of models like Veo, Nano Banana, and Genie with physics modeling, so projects accurately depict object interactions involving kinetic energy and gravity. The first model in that family, Gemini Omni Flash, is now available.

Pichai returned to announce an expansion of SynthID, Google's AI watermarking technology. Google, he said, will support C2PA content credentials verification across its products, to help people distinguish between content created by AI and content captured by a camera, and to tell whether it has been edited with Google Photos.

"We are expanding both SynthID and content credentials verification to Search and Chrome," said Pichai. "You can simply circle to search or right-click in Chrome and ask, 'was this generated with AI?' and you'll get a clear response along with other helpful context."

To help make this technology more broadly useful, Google said OpenAI, Kakao and ElevenLabs have decided to adopt SynthID.

Pichai went on to announce the next generation of Google's Gemini model family, Gemini 3.5 Flash. "When compared to 3.1 Pro, Flash is better across the board, in almost all benchmarks," he said, adding that the model has made "huge progress in coding," one of the more remunerative use cases for AI models presently.

One of the major selling points of Gemini 3.5 Flash is that it offers performance comparable to other frontier models, but much faster. The model manages about 289 tokens per second, about 4x more than other frontier models, Google claims. Those using Google's coding harness Antigravity can look forward to even greater speed gains.
"We've optimized Flash to be not just four times, but 12 times faster in Antigravity," said DeepMind engineer Varun Mohan, adding that the 2.0 release of Antigravity is out now.

The other major selling point is price. "Top companies in Google Cloud are processing about 1 trillion tokens a day," said Pichai. "If they shifted 80% of their workloads from other frontier models to 3.5 Flash, they'd save over $1 billion annually."

Gemini 3.5 Flash is also making its presence known in the Google Gemini app and in Search through its integration with Gemini Spark, an agent service. "It's your personal AI agent that helps you navigate your digital life, taking action on your behalf and under your direction," Pichai explained. "It runs on dedicated virtual machines on Google Cloud. And it's 24/7."

Based on Gemini 3.5 Flash, with an assist from the Antigravity harness, Spark can perform long-running tasks in the background, presumably without incurring a huge token bill. Spark will be able to connect to other tools - initially Google apps like Gmail and Chat, then third-party tools via MCP. Chrome integration, which will enable agentic browsing, is planned for later this summer.

Josh Woodward, VP of Google Labs, Gemini and AI Studio, described how he used Spark to arrange a block party, emailing neighbors, recording their responses in a spreadsheet, and creating a slide deck. Spark is rolling out now to trusted testers and will reach Google AI Ultra subscribers in the US next week.

Spark's arrival coincides with a new $100/month Ultra plan tier and the deflation of the top Ultra tier from $250/month to $200/month.

Pichai offered up one of his timeworn phrases - "It's still the early days when it comes to making agents easy to use, super secure, and truly helpful" - to gloss over the security and privacy implications of AI agents acting on user data and applications without supervision.
Then he handed off to Liz Reid, VP of Search, who proceeded to detail further AI incursions into Google's Search service.

Gemini 3.5 Flash, she said, has become the default model for AI Mode. And the Search box itself has been redesigned to surface AI-based suggestions and to accept inputs other than text, such as images, files, videos, and Chrome tabs.

The biggest change is Search Agents, which, like Gemini Spark, will be accessible from Search and will run while you're away from the keyboard.

"You can set information agents to work for you 24/7 in the background," said Reid. "They can find you exactly what you need, exactly when you need it, and help you take action. You can spin up multiple agents in search simultaneously to get updated and make progress on all those things that matter to you."

Google is also taking a page from Anthropic by offering code-based interactive widgets or mini-apps on demand. Search users will be able to create dynamic layouts, charts, graphs, and the like through the integration of Gemini 3.5 Flash and Antigravity in a containerized environment. This generative UI capability is rolling out this summer.

Expect Google's token expenditures to continue to grow, along with pressure to purchase subscriptions to pay for the agentic labor. (R)
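
For readers who want to check the keynote arithmetic themselves, here is a minimal Python sketch that works only from the figures quoted above. It rests on several assumptions that Google did not state: a 30-day month, a 365-day year, that the 19 billion tokens per minute of API traffic is a sustained average counted within the 3.2 quadrillion monthly total, and that "1 trillion tokens a day" is the aggregate across the top Cloud customers. The function and constant names are illustrative, not anything Google published.

```python
# Figures as quoted in the keynote coverage above.
TOKENS_PER_MONTH = {
    "two years ago": 9.7e12,
    "last year": 480e12,
    "now": 3.2e15,
}
API_TOKENS_PER_MINUTE = 19e9
CAPEX_2022_USD = 31e9
CAPEX_THIS_YEAR_USD = (180e9, 190e9)   # low and high end of the stated range
CLOUD_TOKENS_PER_DAY = 1e12
SHIFTED_SHARE = 0.80
CLAIMED_ANNUAL_SAVINGS_USD = 1e9

# Assumptions (not from the keynote): 30-day month, 365-day year.
MINUTES_PER_MONTH = 60 * 24 * 30
DAYS_PER_YEAR = 365


def growth_multiples(series):
    """Year-over-year multiples between consecutive monthly token figures."""
    values = list(series.values())
    return [later / earlier for earlier, later in zip(values, values[1:])]


def api_share_of_total():
    """Fraction of current monthly tokens attributable to API calls,
    assuming the per-minute rate is a sustained average."""
    api_tokens_per_month = API_TOKENS_PER_MINUTE * MINUTES_PER_MONTH
    return api_tokens_per_month / TOKENS_PER_MONTH["now"]


def capex_multiples():
    """This year's capex range expressed as multiples of the 2022 figure."""
    return tuple(amount / CAPEX_2022_USD for amount in CAPEX_THIS_YEAR_USD)


def implied_saving_per_million_tokens():
    """Minimum price gap (USD per million tokens) implied by the savings claim."""
    shifted_tokens_per_year = CLOUD_TOKENS_PER_DAY * SHIFTED_SHARE * DAYS_PER_YEAR
    return CLAIMED_ANNUAL_SAVINGS_USD / shifted_tokens_per_year * 1e6


if __name__ == "__main__":
    print("growth multiples:", [round(m, 1) for m in growth_multiples(TOKENS_PER_MONTH)])
    print("API share of monthly tokens:", round(api_share_of_total(), 3))
    print("capex vs 2022:", [round(m, 2) for m in capex_multiples()])
    print("implied saving, USD per 1M tokens:", round(implied_saving_per_million_tokens(), 2))
```

Under those assumptions, monthly volume grew roughly 49x and then about 6.7x, API calls would account for about a quarter of current volume, the capex range works out to between 5.8 and 6.1 times the 2022 figure, and the savings claim implies a price gap of at least about $3.42 per million tokens. Change any of the assumptions and the derived numbers move with them.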