News

Mistral AI and NVIDIA Launch Advanced Language Model Mistral NeMo 12B

Mistral AI and NVIDIA have jointly unveiled Mistral NeMo 12B, a new language model designed to enhance enterprise applications, including chatbots, multilingual processing, coding, and summarization. The collaboration leverages Mistral AI's expertise in training data and NVIDIA's advanced hardware and software ecosystem, promising high performance across a range of tasks.

"We are fortunate to collaborate with the NVIDIA team, leveraging their top-tier hardware and software," said Guillaume Lample, cofounder and chief scientist of Mistral AI, in a statement. "Together, we have developed a model with unprecedented accuracy, flexibility, high efficiency, and enterprise-grade support and security thanks to NVIDIA AI Enterprise deployment."

Mistral NeMo was trained on the NVIDIA DGX Cloud AI platform and utilizes NVIDIA TensorRT-LLM for accelerated inference performance on large language models (LLMs) and the NVIDIA NeMo development platform for building custom generative AI models.

The 12-billion-parameter design ("12B") gives Mistral NeMo extra horsepower for multi-turn conversations, math, common-sense reasoning, world knowledge, and coding, Lample said. The model's 128K-token context length allows it to process long and complex information coherently and accurately, producing contextually relevant outputs. The model also uses the FP8 data format, which reduces memory requirements and improves efficiency, speeding deployment without compromising accuracy, he said.

Mistral NeMo is packaged as an NVIDIA NIM inference microservice, which includes performance-optimized inference with NVIDIA TensorRT-LLM engines. This containerized format facilitates easy deployment across platforms, from cloud environments to RTX workstations, significantly reducing setup times from days to minutes, the company said.
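
For illustration, here is a minimal sketch of how an application might call a deployed NIM microservice, which exposes an OpenAI-compatible API. The endpoint URL, model identifier, and API key shown below are placeholder assumptions for demonstration, not details from the announcement.

    # Hypothetical sketch: querying a Mistral NeMo NIM microservice through its
    # OpenAI-compatible endpoint. The base URL, model ID, and key are placeholder
    # assumptions, not values taken from the announcement.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8000/v1",  # assumed address of a locally running NIM container
        api_key="not-needed-locally",         # a hosted endpoint would require a real API key
    )

    response = client.chat.completions.create(
        model="mistral-nemo-12b-instruct",    # assumed model identifier exposed by the microservice
        messages=[
            {"role": "user", "content": "Summarize this support ticket in two sentences: ..."}
        ],
        max_tokens=256,
    )

    print(response.choices[0].message.content)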

As part of NVIDIA AI Enterprise, the NIM packaging includes enterprise-grade software with dedicated feature branches, rigorous validation processes, and comprehensive security and support. This setup is meant to ensure reliable and consistent performance, Lample said, backed by direct access to NVIDIA AI experts and defined service-level agreements.

Mistral NeMo is being released under the Apache 2.0 license. The open model license supports seamless integration of Mistral NeMo into commercial applications. The model was designed to run efficiently on a range of hardware, including NVIDIA L40S, GeForce RTX 4090, and RTX 4500 GPUs, offering low compute costs and enhanced security and privacy.
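
Because the weights are openly licensed, developers can also run the model with standard open-source tooling. The sketch below uses the Hugging Face Transformers library; the repository ID is an assumption for illustration and should be checked against Mistral AI's official model listings.

    # Hypothetical sketch: loading openly licensed Mistral NeMo weights with
    # Hugging Face Transformers. The repository ID below is an assumption;
    # confirm the actual name on Mistral AI's official listings.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "mistralai/Mistral-Nemo-Instruct-2407"  # assumed repository ID

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = "Draft a one-paragraph summary of a customer support conversation."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=128)

    print(tokenizer.decode(outputs[0], skip_special_tokens=True))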

Mistral NeMo is available now for deployment across various platforms, including cloud environments, data centers, and RTX workstations. The model can be accessed as an NVIDIA NIM today, and the company says a downloadable NIM version is coming soon.

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at jwaters@converge360.com.
