Google Expands AI-Based Video and Image Generation Toolbox
By John K. Waters
12/17/2024
Google on Tuesday announced new versions of its video and image generation tools, Veo 2 and Imagen 3, as the tech giant continues advancing artificial intelligence capabilities for content creation. It also introduced Whisk, an experimental tool blending image generation and remixing capabilities.
The updated AI models, now integrated into Google Labs tools VideoFX and ImageFX, offer significant improvements in quality, realism, and creative control, catering to filmmakers, content creators, and businesses.
Veo 2, Google’s latest video generation model, delivers enhanced video quality across various styles and subjects, with outputs reaching resolutions up to 4K and extending to minutes in length. The model’s understanding of real-world physics and human movement has been refined, reducing errors common in AI-generated content, such as extra limbs or unrealistic details.
"Veo 2 understands the unique language of cinematography," Google said in a blog post, highlighting the tool’s ability to incorporate specific lens types, cinematic effects, and camera movements. Users can generate everything from a low-angle tracking shot to a close-up scene by specifying prompts such as "18mm lens" or "shallow depth of field."
Veo 2 outputs also carry SynthID, Google’s invisible watermarking technology, which helps identify AI-generated content and mitigate misinformation risks. Google said it remains committed to a measured rollout of Veo 2 via VideoFX, YouTube, and its Vertex AI enterprise platform, emphasizing responsible development and user safety.
Alongside Veo 2, Google unveiled improvements to Imagen 3, its leading image-generation model. The updated version produces brighter, better-composed images and renders diverse art styles more accurately, from photorealism and impressionism to anime.
Imagen 3 also follows user prompts more accurately and generates richer textures and details, outperforming other leading models in side-by-side comparisons conducted by human raters, Google said.
Starting Tuesday, the latest Imagen 3 is available in Google Labs' ImageFX tool in more than 100 countries.
Google also introduced Whisk, an experimental image remixing tool that combines Imagen 3 with the company’s Gemini model for visual understanding. Whisk enables users to generate and remix images by combining subjects, styles, and scenes. For instance, users can create items like digital plushies, stickers, or enamel pins with a few inputs.
Under the hood, Gemini generates captions describing user-provided images, feeding these descriptions into Imagen 3 to facilitate unique, customizable outputs.
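The caption-then-generate flow described above can be pictured with a short sketch. This is an illustration only, not Google's implementation: the function names (`remix`, `build_prompt`), the prompt wording, and the two stand-in models (`fake_caption`, `fake_generate`) are all hypothetical, and no real Gemini or Imagen API is called.

```python
from typing import Callable, Dict, Tuple

ROLES = ("subject", "scene", "style")


def build_prompt(captions: Dict[str, str]) -> str:
    """Combine the three captions into one text prompt (wording is an assumption)."""
    return (
        f"{captions['subject']}, set in {captions['scene']}, "
        f"in the style of {captions['style']}"
    )


def remix(
    images: Dict[str, bytes],
    caption_image: Callable[[bytes], str],
    generate_image: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Caption each input image, then pass the combined description to a generator."""
    # Step 1: a vision model (Gemini's role in the article) describes each image.
    captions = {role: caption_image(images[role]) for role in ROLES}
    # Step 2: the descriptions, not the pixels, are what reach the image model.
    prompt = build_prompt(captions)
    # Step 3: an image model (Imagen 3's role in the article) renders the prompt.
    return prompt, generate_image(prompt)


# Hypothetical stand-ins so the sketch runs without any external service.
def fake_caption(image: bytes) -> str:
    return image.decode("utf-8")


def fake_generate(prompt: str) -> bytes:
    return f"<image for: {prompt}>".encode("utf-8")


if __name__ == "__main__":
    prompt, image = remix(
        {
            "subject": b"a plush dinosaur",
            "scene": b"a snowy forest",
            "style": b"an enamel pin",
        },
        fake_caption,
        fake_generate,
    )
    print(prompt)
```

Because only the text descriptions are passed along in this sketch, the generated image reflects what the captioner noticed rather than the exact pixels of the inputs, which is consistent with the article's framing of Whisk as a remixing tool rather than an editor.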
Google’s latest AI advancements reflect its ongoing push to equip creators, businesses, and developers with sophisticated tools for video and image generation. Veo 2 and Imagen 3 are expected to expand across YouTube Shorts and other products next year, marking another step in integrating generative AI into mainstream creative workflows.
Content creators and businesses can now explore the tools via Google Labs, where sign-ups for VideoFX and ImageFX access are open globally.
About the Author
John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at jwaters@converge360.com.