Google Enters Its 'Gemini' AI Era

The first version of the Gemini AI model is now ready for primetime.

Gemini 1.0 is Google's "largest and most capable AI model," said Google DeepMind chief Demis Hassabis in a blog post Wednesday.

Gemini currently comes in three sizes. Ultra, the largest size, is for "highly complex tasks; Pro is a scalable, general-purpose model; and Nano, the smallest, is designed to run on devices with limited memory.

The Google Bard AI chatbot now runs Gemini Pro, marking what Hassabis called "the biggest upgrade to Bard since it launched." Meanwhile, Gemini Nano can be accessed on new generations of Google's Pixel smartphones, starting with the Pixel 8 Pro.

Gemini Ultra is not yet publicly available; that'll come in early 2024, after Google completes "extensive trust and safety checks," does further model refinements and previews it first to "select customers, developers, partners and safety and responsibility experts."

For developers, Gemini Pro will be available via API starting Dec. 13. Gemini understands common programming languages like Python, Java and  C++.

Of the three sizes, Google is touting Gemini Ultra as being especially sophisticated. For instance, Google found Ultra to outperform humans in an MMLU (massive multitask language understanding) test, which measures general knowledge and problem-solving abilities in nearly 60 subjects. It's the first model to do so, according to Hassabis.

In addition, Google reports Ultra achieved "state-of-the-art" performance when tested on its ability to reason and problem-solve in multimodal contexts (i.e., contexts that require understanding of different data types, including text, video, image, audio and code).

"With the image benchmarks we tested, Gemini Ultra outperformed previous state-of-the-art models, without assistance from object character recognition (OCR) systems that extract text from images for further processing," Hassabis said. "These benchmarks highlight Gemini's native multimodality and indicate early signs of Gemini's more complex reasoning abilities." 

Google argues that this "multimodality" makes Gemini well-suited for tasks that involve processing, understanding and communicating about large volumes of unstandardized data -- and in relatively short time.

"Its remarkable ability to extract insights from hundreds of thousands of documents through reading, filtering and understanding information will help deliver new breakthroughs at digital speeds in many fields from science to finance," Hassabis said, adding, "[Gemini] better understands nuanced information and can answer questions relating to complicated topics. This makes it especially good at explaining reasoning in complex subjects like math and physics." 

New AI Hardware
To train Gemini 1.0, Google built a new version of its AI accelerator. Also announced Wednesday, the new Cloud Tensor Processing Unit (TPU) v5p is Google's "most powerful and scalable TPU accelerator to date," according to a post by Google machine learning chiefs Amin Vahdat and Mark Lohmeyer.

The company has also launched a new AI Hypercomputer, described as a "supercomputer architecture that employs an integrated system of performance-optimized hardware, open software, leading ML frameworks, and flexible consumption models."

More information about Gemini 1.0 is available here.

About the Author

Gladys Rama (@GladysRama3) is the editorial director of Converge360.