Mistral AI and NVIDIA Launch Advanced Language Model Mistral NeMo 12B

Mistral AI and NVIDIA have jointly unveiled Mistral NeMo 12B, a new language model designed to enhance enterprise applications, including chatbots, multilingual processing, coding, and summarization. The collaboration leverages Mistral AI's expertise in training data and NVIDIA's advanced hardware and software ecosystem, promising high performance across a range of tasks.

"We are fortunate to collaborate with the NVIDIA team, leveraging their top-tier hardware and software," said Guillaume Lample, cofounder and chief scientist of Mistral AI, in a statement. "Together, we have developed a model with unprecedented accuracy, flexibility, high efficiency, and enterprise-grade support and security thanks to NVIDIA AI Enterprise deployment."

Mistral NeMo was trained on the NVIDIA DGX Cloud AI platform and utilizes NVIDIA TensorRT-LLM for accelerated inference performance on large language models (LLMs) and the NVIDIA NeMo development platform for building custom generative AI models.

The 12-billion-parameter design ("12B") gives Mistral NeMo extra horsepower for multi-turn conversations, math, common sense reasoning, world knowledge, and coding, Lample said. The model's 128K context length allows it to process extensive and complex information coherently and accurately, ensuring contextually relevant outputs. Also, using the FP8 data format reduces memory size and enhances efficiency, speeding deployment without compromising accuracy, he said.

Mistral NeMo is packaged as an NVIDIA NIM inference microservice, which includes performance-optimized inference with NVIDIA TensorRT-LLM engines. This containerized format facilitates easy deployment across platforms, from cloud environments to RTX workstations, significantly reducing setup times from days to minutes, the company said.
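For readers who want to try the hosted microservice, here is a minimal sketch of how a NIM endpoint is typically queried. NVIDIA's hosted NIMs expose an OpenAI-compatible API, but the base URL, the model identifier ("mistralai/mistral-nemo-12b-instruct"), and the API-key environment variable below are assumptions and should be checked against NVIDIA's API catalog before use.

# Minimal sketch: calling a hosted Mistral NeMo NIM through its
# OpenAI-compatible endpoint. The base URL, model ID, and the
# NVIDIA_API_KEY variable are assumptions, not details from this article.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed NIM endpoint
    api_key=os.environ["NVIDIA_API_KEY"],            # assumed key variable
)

response = client.chat.completions.create(
    model="mistralai/mistral-nemo-12b-instruct",     # assumed model ID
    messages=[{"role": "user",
               "content": "Summarize this quarter's sales notes in three bullets."}],
    temperature=0.3,
    max_tokens=256,
)
print(response.choices[0].message.content)

Because the microservice speaks the same chat-completions protocol many applications already use, swapping it in for another backend is largely a matter of changing the base URL and model name.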

NIM features, integrated into NVIDIA AI Enterprise, include enterprise-grade software with dedicated feature branches, rigorous validation processes, and comprehensive security and support. This setup is meant to ensure reliable and consistent performance, Lample said, backed by direct access to NVIDIA AI experts and defined service-level agreements.

Mistral NeMo is being released under the Apache 2.0 license. The open model license supports seamless integration of Mistral NeMo into commercial applications. The model was designed to run efficiently on a range of hardware, including NVIDIA L40S, GeForce RTX 4090, and RTX 4500 GPUs, offering low compute costs and enhanced security and privacy.
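Because the weights are openly licensed, they can also be pulled into standard open-source tooling rather than accessed only through NVIDIA's stack. The sketch below uses Hugging Face Transformers; the repository name "mistralai/Mistral-Nemo-Instruct-2407" and the bfloat16 setting are assumptions based on Mistral's usual release conventions, not details confirmed in this announcement.

# Minimal sketch: loading the open Mistral NeMo weights with Hugging Face
# Transformers. The repo ID below is an assumption; check Mistral AI's
# Hugging Face organization for the actual model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-Nemo-Instruct-2407"  # assumed repo name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # reduced-precision weights to fit a single large GPU
    device_map="auto",
)

prompt = "Write a short product announcement for an internal chatbot."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))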

Mistral NeMo is available now for deployment across various platforms, including cloud, data centers, and RTX workstations. It can currently be accessed as a hosted NVIDIA NIM, and the company says a downloadable NIM version is coming soon.

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at jwaters@converge360.com.
