News

AWS to Collaborate with Nvidia on Advanced Compute for AI

In a bid to keep pace with Microsoft, Amazon Web Services (AWS) announced on Tuesday that it is entering a wide-ranging collaboration with chip maker Nvidia spanning multiple aspects of artificial intelligence (AI).

These announcements, unveiled during the 2023 re:Invent conference hosted by AWS this week, serve to bolster AWS's standing as a significant contender in the ongoing generative AI race. While AWS leads the cloud market, it has been notably slower to capitalize on its platform's AI capabilities, especially compared with Microsoft.

Still, recent investments by AWS, such as its partnership with Claude chatbot steward Anthropic and a large language model reportedly in development under the name "Olympus," alongside product launches such as the generative AI developer platform Bedrock, have been instrumental in narrowing the gap. The newly expanded collaboration with Nvidia, a dominant player in the AI chip sector, is poised to sharpen AWS's competitive edge even further.

For instance, AWS is bringing the massive compute power of Nvidia's GH200 Grace Hopper Superchips to its customers via its Elastic Compute Cloud (EC2) service. 

This means AWS customers who need to run resource-intensive, distributed and complex AI and machine learning workloads will be able to rent the chip power to do so from AWS whenever they need it, at a time when AI-capable chips are in particularly short supply. AWS claims to be "the first cloud provider" to offer such access to its customers.

"AWS instances with GH200 NVL32 will provide customers on-demand access to supercomputer-class performance, which is critical for large-scale AI/ML workloads that need to be distributed across multiple nodes for complex generative AI workloads -- spanning FMs [foundational models], recommender systems, and vector databases," AWS said in a press release Tuesday

Nvidia will also power three new EC2 instance types designed for large workloads, including AI model training and inference, 3-D AI development, digital twins and more. The new instance types -- G6, G6e and P5e -- are coming next year and will be powered by, respectively, Nvidia's L4, L40S and H200 Tensor Core GPUs.

AWS is also working with Nvidia on an AI supercomputer called "Project Ceiba," which the two companies are touting as the "world's fastest GPU-powered AI supercomputer."

AWS has enabled Ceiba to integrate with its product stack, including Amazon Virtual Private Cloud and Amazon Elastic Block Store. Powering the Ceiba supercomputer are more than 16,000 of Nvidia's GH200 Superchips, giving it enough horsepower to process 65 exaflops' worth of AI workloads.

When complete, Ceiba will serve as a sandbox for Nvidia's army of researchers looking to "advance AI for LLMs, graphics (image/video/3D generation) and simulation, digital biology, robotics, self-driving cars, Earth-2 climate prediction, and more."

Notably, Nvidia has also built supercomputers with AWS rival Microsoft, including the Azure supercomputer dubbed "Eagle," which was recently rated the world's third-fastest supercomputer and the fastest one based in the cloud.

AWS and Nvidia are also collaborating around developer software. For instance, AWS will host Nvidia's DGX Cloud AI-training-as-a-service platform on its cloud.

"It will be the first DGX Cloud featuring GH200 NVL32, providing developers the largest shared memory in a single instance," according to AWS. "DGX Cloud on AWS will accelerate training of cutting-edge generative AI and large language models that can reach beyond 1 trillion parameters."

In addition, AWS developer customers will have access to the NeMo Retriever microservice from Nvidia. The tool lets developers "create highly accurate chatbots and summarization tools using accelerated semantic retrieval."

In a prepared statement, Nvidia CEO Jensen Huang characterized the collaboration with AWS as emblematic of the two companies' mission to bring AI to everyday customers.

"Generative AI is transforming cloud workloads and putting accelerated computing at the foundation of diverse content generation," Huang said. "Driven by a common mission to deliver cost-effective, state-of-the-art generative AI to every customer, NVIDIA and AWS are collaborating across the entire computing stack, spanning AI infrastructure, acceleration libraries, foundation models, to generative AI services."

About the Author

Gladys Rama (@GladysRama3) is the editorial director of Converge360.
