Google Primes New Gemini Models, Spruces Up Vertex AI

The 2024 Google I/O event kicked off on Tuesday with, predictably, AI taking the lion's share of the spotlight.

"Google is fully in our Gemini era," said CEO Sundar Pichai in the opening keynote, referring to the company's flagship AI model family. He noted how Google combined its Google Brain and DeepMind teams to advance AI efforts like Gemini.

"Using the computational resources of Google, they're focused on building more capable systems safely and responsibly," he said. "This includes our next-generation foundation model, Gemini, which is still in training. Gemini was created from the ground up to be a multimodal, highly efficient tool with API integrations and built to enable future innovations like memory and planning. While still early, we're already seeing impressive multimodal capabilities not seen in prior models. Once fine-tuned and rigorously tested for safety, Gemini will be available at various sizes and capabilities, just like PaLM 2."

Google also announced several updates to its cloud-based Vertex AI service, a fully managed, unified development platform for leveraging models at scale. Vertex AI provides a selection of more than 150 first-party, open and third-party foundation models. It helps developers build AI agents and can be used to customize models with enterprise-ready tuning, grounding, monitoring and deployment capabilities.

New models available now include Gemini 1.5 Flash and PaliGemma. Gemini 1.5 Flash is a lighter-weight alternative to the Gemini 1.5 Pro model, designed for high-volume tasks like chat applications. PaliGemma is the first vision-language model in the Gemma family of open models, optimized for tasks such as image captioning and visual question-answering. It's available in Vertex AI Model Garden.

Coming later are Imagen 3, a text-to-image model that can generate detailed, photorealistic images, and Gemma 2, a new open model based on Gemini technology and built for a broad range of AI developer use cases.

Finally, Gemini 1.5 Pro with an expanded 2 million-token context window will be available to those accepted from a waitlist.

Vertex AI is also getting three new capabilities, joining recently announced prompt management and model evaluation tools:

  • Context caching: Entering public preview next month, this helps users actively manage and reuse cached context data. "As processing costs increase by context length, it can be expensive to move long-context applications to production," Google said. "Vertex AI context caching helps customers significantly reduce costs by leveraging cached data."
  • Controlled generation: This will enter public preview sooner than context caching, coming later this month. It helps users define Gemini model outputs according to specific formats or schemas. "Most models cannot guarantee the format and syntax of their outputs, even with specified instructions," Google said. "Vertex AI controlled generation lets customers choose the desired output format via pre-built options like YAML and XML, or by defining custom formats. JSON, as a pre-built option, is live today."
  • Batch API: Now available in public preview, this is described as "a super-efficient way to send large numbers of non-latency sensitive text prompt requests, supporting use cases such as classification and sentiment analysis, data extraction, and description generation." Benefits are said to include speeding up developer workflows and reducing costs by enabling multiple prompts to be sent to models in one request.
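
To illustrate the problem controlled generation is meant to address, here is a minimal Python sketch of the client-side checking developers typically write when a model's output format is not guaranteed. It uses only the standard library and does not call Vertex AI. The schema, the field names and the sample strings are hypothetical and are not taken from Google's announcement.

```python
import json

# Hypothetical schema for illustration: field name -> expected Python type.
REVIEW_SCHEMA = {"sentiment": str, "score": float}


def parse_structured_output(raw_text, schema):
    """Return the parsed dict if raw_text is a JSON object matching schema, else None."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        # The text is not valid JSON at all (for example, prose wrapped around JSON).
        return None
    if not isinstance(data, dict):
        return None
    for field, expected_type in schema.items():
        # A missing field yields None, which fails the type check.
        if not isinstance(data.get(field), expected_type):
            return None
    return data


# Hypothetical model responses: one well-formed, one with extra prose and a missing field.
well_formed = '{"sentiment": "positive", "score": 0.9}'
malformed = 'Sure! Here is the JSON: {"sentiment": "positive"}'

print(parse_structured_output(well_formed, REVIEW_SCHEMA))
print(parse_structured_output(malformed, REVIEW_SCHEMA))
```

When the format is enforced on the model side, as Google describes, the failure branches in a check like this should rarely be reached, though keeping some validation in place remains a reasonable defensive choice.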

About the Author

David Ramel is an editor and writer at Converge 360.
