Google is going head-to-head with OpenAI’s Sora via the newest version of its video generation model, Veo 2, which it says produces more realistic-looking videos.
The company also updated its image generation model, Imagen 3, to produce richer, more detailed images.
Google said Veo 2 has “a better understanding of real-world physics and the nuances of human movement and expression.” It is available on Google Labs’ VideoFX platform, but only on a waitlisted basis: users must sign up through a Google Form, and Google will grant access at a time of its choosing.
“Veo 2 also understands the language of cinematography: Ask it for a genre, specify a lens, suggest cinematic effects and Veo 2 will deliver — at resolutions up to 4K,” Google said in a blog post.
While Veo 2 is available only to select users, the original Veo remains available on Vertex AI. Videos created with Veo 2 will carry Google’s SynthID metadata watermark to identify them as AI-generated.
Google admits that Veo 2 may still hallucinate extra fingers and the like, but it promises the new model produces fewer such errors.
Veo 2 will compete against OpenAI’s recently released Sora video generation model to attract filmmakers and content creators. Sora had been in preview for a while before OpenAI made it available to paying subscribers.
Impressively, Google says that in its own internal tests gauging “overall preference” (i.e., which videos an audience liked better) and “prompt adherence” (how well the videos matched the instructions given by the human creator), human evaluators preferred Veo 2 to Sora and other rival AI models.
Google announced Veo in May of this year during its Google I/O developer conference with a video made in partnership with actor-musician Donald Glover, aka Childish Gambino.
AI video generation still needs some work
AI video generation has long been an area of generative AI in which big model developers like Google and OpenAI regularly compete with, and play catch-up to, relatively smaller companies.
RunwayML, one of the pioneers of AI video generation, recently launched advanced controls for its Gen-3 Alpha Turbo model. Pika Labs released Pika 2.0, giving users more control and enabling them to add their own characters to a video. Luma AI announced a partnership with AWS to bring its models to Bedrock for enterprise use. Luma also expanded its Dream Machine generation model.
However, AI video generation still needs to convince both creators and viewers. After Sora’s long-anticipated release, people remained skeptical of its capabilities when it continued to generate physics- and anatomy-defying figures, and users felt it gave inconsistent results.
Reaction to a trailer shown at the recent Game Awards also reflected people’s distrust of what they perceive as “AI slop.”
Some filmmakers, though, have begun to embrace the possibilities AI video generators can provide. Famed director James Cameron joined the board of Stability AI, while actor Andy Serkis announced he was building an AI-focused production company.
Google said it’s seeing interest from many users; for example, YouTube creators have been using VideoFX to make backgrounds for YouTube Shorts to save time.
Updates to Imagen 3
Google also updated its image model Imagen 3, which it recently made available through its Gemini chatbot on the web, to produce more realistic and brighter images.
Imagen 3 can now render more art styles accurately, “from photorealism to impressionism, from abstract to anime.” Google said the model will also follow prompts more faithfully.
People can access Imagen 3 through ImageFX.
The post Google debuts new AI video generator Veo 2 claiming better audience scores than Sora appeared first on VentureBeat.