News
AWS Primes Its Cloud for Nvidia's Blackwell GPUs
With the launch of Nvidia's new Blackwell GPU platform at GTC this week, Amazon Web Services stands to gain significantly more compute power to drive its AI efforts.
"AI is driving breakthroughs at an unprecedented pace, leading to new applications, business models, and innovation across industries," said Nvidia CEO Jensen Huang in a prepared statement at GTC. "Our collaboration with AWS is accelerating new generative AI capabilities and providing customers with unprecedented computing power to push the boundaries of what's possible."
The two companies have been partners for many years, but lately their efforts have focused on integrating their technologies to build out their respective AI and machine learning infrastructures.
Project Ceiba
For instance, AWS' supercomputer project, dubbed "Ceiba," will run on the new GB200 NVL72 technology from Nvidia. AWS first unveiled Ceiba at last year's re:Invent conference, touting it as the "world's fastest GPU-powered AI supercomputer." Ceiba is targeted for heavy AI workloads, including those used for weather forecasting, robotics, advanced LLMs, autonomous cars and more.
Originally, Ceiba was intended to run on Nvidia's older Hopper chips. The use of the newer Blackwell chips, however, promises to increase performance sixfold.
Ceiba, a "first-of-its-kind supercomputer with 20,736 B200 GPUs," is "being built using the new NVIDIA GB200 NVL72, a system featuring fifth-generation NVLink connected to 10,368 NVIDIA Grace CPUs," AWS said in its announcement Tuesday. "The system scales out using fourth-generation EFA networking, providing up to 800 Gbps per Superchip of low-latency, high-bandwidth networking throughput -- capable of processing a massive 414 exaflops of AI."
EC2
AWS customers will also be able to tap into the new Blackwell chips via Elastic Compute Cloud (EC2) instances.
"AWS plans to offer EC2 instances featuring the new B100 GPUs deployed in EC2 UltraClusters for accelerating generative AI training and inference at massive scale," said AWS. "GB200s will also be available on NVIDIA DGX Cloud, an AI platform co-engineered on AWS, that gives enterprise developers dedicated access to the infrastructure and software needed to build and deploy advanced generative AI models."
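For readers curious what provisioning such capacity might look like in practice, the sketch below assembles an EC2 `RunInstances` request using boto3. Note that the instance type `p6.48xlarge`, the AMI ID, and the placement group name are hypothetical placeholders: AWS had not published instance names for the Blackwell-based offerings at the time of these announcements.

```python
# Sketch: assembling a request for a GPU training instance with boto3.
# The instance type, AMI ID, and placement group below are hypothetical
# placeholders, not announced AWS identifiers.

def build_run_instances_request(instance_type: str, ami_id: str,
                                count: int = 1) -> dict:
    """Build the keyword arguments for ec2_client.run_instances()."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
        # A cluster placement group keeps GPU nodes physically close,
        # which matters for low-latency EFA traffic between servers.
        "Placement": {"GroupName": "ultracluster-demo"},
    }

request = build_run_instances_request("p6.48xlarge", "ami-0123456789abcdef0")
# With credentials configured, one would then call:
#   boto3.client("ec2").run_instances(**request)
```

Building the request as a plain dictionary keeps the example runnable without AWS credentials; the final call is shown only as a comment.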
At re:Invent last year, AWS announced it would host Nvidia's DGX Cloud AI-training-as-a-service platform on its cloud.
Security
Nvidia's new Blackwell technology will also enable more secure AI workloads in AWS by combining the GB200 chip with Amazon's Nitro hypervisor technology.
"The combination of the AWS Nitro System and the NVIDIA GB200 takes AI security even further by preventing unauthorized individuals from accessing model weights," said AWS. "The GB200 allows inline encryption of the NVLink connections between GPUs, and encrypts data transfers, while EFA encrypts data across servers for distributed training and inference."
AWS CEO Adam Selipsky touted his company's GTC announcements as the natural extension of its partnership with Nvidia, which has spanned more than a decade.
"Today we offer the widest range of NVIDIA GPU solutions for customers," he said. "NVIDIA's next-generation Grace Blackwell processor marks a significant step forward in generative AI and GPU computing."