OpenAI’s new video generation tool could learn a lot from babies | John Naughton

John Naughton

from on 2024-02-24 16:00 (#6JW25)

The footage put together by Sora looks swish, but closer examination reveals it doesn't understand physical reality

"First text, then images, now OpenAI has a model for generating videos," screamed Mashable the other day. The makers of ChatGPT and Dall-E had just announced Sora, a text-to-video diffusion model. Cue excited commentary all over the web about what will doubtless become known as T2V, covering the usual spectrum - from "Does this mark the end of [insert threatened activity here]?" to "meh" and everything in between.

Sora (the name is Japanese for "sky") is not the first T2V tool, but it looks more sophisticated than earlier efforts like Meta's Make-a-Video AI. It can turn a brief text description into a detailed, high-definition film clip up to a minute long. For example, the prompt "A cat waking up its sleeping owner, demanding breakfast. The owner tries to ignore the cat, but the cat tries new tactics, and finally, the owner pulls out his secret stash of treats from underneath the pillow to hold off the cat a little longer," produces a slick video clip that would go viral on any social network.
