You can now give Google's AI video model camera directions

by
Jess Weatherbed
from The Verge

Google is trying to make it easier for users of its video AI model Veo 2 to make cinematic-looking generations and edit real footage. The new Veo 2 capabilities are available to preview via Google Cloud's Vertex AI platform, alongside other updates to improve Google's text-to-image generator, Imagen 3, and audio-related AI models.

New Veo 2 features include inpainting, which "can automatically remove unwanted background images, logos, or distractions from your videos," according to Google, and outpainting, which extends the frame of the original video into a different format. The latter tool fills the new space with AI-generated video footage that blends into the original clip, similar to Adobe's Generative Expand feature for images.


The update also lets Veo 2 users select cinematic technique presets to include alongside their text descriptions when generating footage, which can be used to help guide shot composition, camera angles, and pacing in the final results. Example presets include timelapse effects, drone-style POV, and simulating camera-panning in different directions.

A new interpolation feature has also been added that can create a video transition between two still images, filling in the beginning and end sequences with new frames.


Adobe's competing Firefly video model has some similar capabilities, with a generative AI video extension feature launching in Premiere Pro last week. Google also adds SynthID digital attribution watermarks to its AI-generated outputs, much like Adobe's Content Credentials system, but Adobe goes a step further by pledging that its tools are fully commercially safe because they're trained on licensed and public domain content, something Google can't match after inhaling the web to train its AI models.

Editing capabilities in Google's text-to-image model Imagen 3 have also been updated to "significantly" improve automatic object removal, according to Google, providing what are supposed to be more natural results when removing distractions. Both Veo 2 and Imagen 3 are already being used by companies like L'Oréal and Kraft Heinz for marketing content production, with Kraft Heinz's digital experience leader Justin Thomas saying "the type of task that once took us eight weeks is now only taking eight hours."


On the audio side, Google has released its text-to-music model, Lyria, in a private preview and rolled out an "Instant Custom Voice" feature for its synthetic speech model, Chirp 3. Google says that Chirp 3 can now generate realistic custom voices from "10 seconds of audio input," and that a new transcription feature is launching in preview that can identify and separate individual speakers to provide clearer transcriptions for calls where multiple people are talking.

These updates are just a handful of the AI-related announcements that Google made today. Gemini 2.5 Flash, the latest version of the company's efficiency-optimized Flash model, will soon be available on Vertex AI. Google says that Gemini 2.5 Flash "automatically adjusts processing time" based on the complexity of the task to provide faster results for simple requests.

Google is also updating its enterprise-focused Agentic AI tools this week to allow AI agents to communicate with each other and perform tasks across platforms like PayPal and Salesforce. Meanwhile, a new section is being launched on Google's Cloud Marketplace for companies to browse and purchase AI agents built by third-party Google partners.
