At GTC 2026, NVIDIA, AWS, and Google Cloud Shift Focus from Chips to AI Infrastructure
NVIDIA, Amazon Web Services, and Google Cloud used GTC 2026 to make a broader point about the AI market: the race is no longer only about who has the most advanced chips, but about who can turn those chips into usable cloud infrastructure for training, inference, and large-scale deployment.
The announcements from AWS and Google Cloud suggested that NVIDIA’s conference has become as much a showcase for cloud architecture as for silicon. Both companies focused on how NVIDIA hardware is packaged with networking, virtualization, orchestration, and managed services for enterprises looking to move AI projects into production.
AWS said it plans to deploy more than 1 million NVIDIA GPUs across its cloud regions starting in 2026, including systems based on the Blackwell and Vera Rubin architectures. It also announced Amazon EC2 instance support for NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs. Separately, AWS said it is adding NVIDIA's Inference Xfer Library (NIXL) to AWS Elastic Fabric Adapter to improve disaggregated large language model inference across NVIDIA GPUs and AWS Trainium systems.
The company also used the event to tie AI infrastructure more closely to its broader cloud stack. AWS said it can deliver 3x faster Apache Spark performance using Amazon EMR on EKS with Amazon EC2 G7e instances powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs. It also said support for NVIDIA Nemotron models is expanding in Amazon Bedrock, including upcoming reinforcement fine-tuning and planned availability of Nemotron 3 Super.
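AWS did not detail how the Spark speedup is achieved, but GPU acceleration of Spark SQL typically goes through NVIDIA's RAPIDS Accelerator plugin. As a rough, hypothetical sketch, a Spark configuration enabling that plugin on a GPU-backed cluster might look like the following (the property values are illustrative assumptions, not AWS-published settings for EMR on EKS):

```properties
# Hypothetical spark-defaults.conf fragment: enables NVIDIA's RAPIDS
# Accelerator for Apache Spark. Values are illustrative assumptions,
# not settings published by AWS.
spark.plugins                        com.nvidia.spark.SQLPlugin
spark.rapids.sql.enabled             true
# One GPU per executor, shared across 8 concurrent tasks
spark.executor.resource.gpu.amount   1
spark.task.resource.gpu.amount       0.125
```

In this model, the plugin transparently rewrites eligible Spark SQL operations to run on the GPU, which is one plausible path to the kind of speedup AWS described.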
Google Cloud took a somewhat different approach, leaning into flexibility and software integration. The company said it is previewing fractional G4 virtual machines using NVIDIA virtual GPU technology, giving customers access to 1/2, 1/4, and 1/8 GPU configurations for workloads ranging from inference and rendering to remote desktops and streaming.
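The fractional tiers turn slice selection into a simple bin-fit decision: pick the smallest offered fraction that covers a workload's estimated share of a full GPU. A minimal sketch, assuming only the announced 1/8, 1/4, and 1/2 tiers plus a full GPU (the workload estimates in the usage comments are hypothetical, not from the announcement):

```python
# Pick the smallest offered G4 slice that covers a workload's estimated
# share of a full GPU. The tier list mirrors Google's announced 1/8, 1/4,
# and 1/2 fractional configurations; everything else is illustrative.
FRACTIONS = (0.125, 0.25, 0.5, 1.0)

def smallest_slice(required_share: float) -> float:
    """Return the smallest GPU fraction that covers `required_share`."""
    if not 0 < required_share <= 1:
        raise ValueError("share must be in (0, 1]")
    for fraction in FRACTIONS:
        if fraction >= required_share:
            return fraction

# A remote-desktop session estimated at ~10% of a GPU fits a 1/8 slice;
# a rendering job estimated at ~40% needs a 1/2 slice.
print(smallest_slice(0.10))  # 0.125
print(smallest_slice(0.40))  # 0.5
```

The appeal for customers is cost granularity: lighter workloads such as streaming or remote desktops no longer have to pay for a whole GPU.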
Google also said it is integrating NVIDIA Dynamo with GKE Inference Gateway, a move aimed at improving how AI workloads are managed across Kubernetes infrastructure. Looking further ahead, the company said it plans to be among the first cloud providers to offer NVIDIA Vera Rubin NVL72 rack-scale systems in the second half of 2026 as part of its AI Hypercomputer platform.
Taken together, the announcements pointed to a maturing market in which cloud providers are trying to differentiate not just on raw compute supply, but on how efficiently that compute can be consumed. That includes better interconnects for inference, more granular access to GPUs, software designed to reduce bottlenecks, and managed services that let customers build AI systems without assembling the stack themselves.
For NVIDIA, that is a useful reframing. The company still dominates the conversation around AI accelerators, but the message at GTC 2026 was that future growth may depend as much on cloud packaging and deployment models as on the next processor cycle. AWS and Google Cloud, in effect, used NVIDIA’s event to make the case that the infrastructure around the GPU is becoming the real battleground.
About the Author
David Ramel is an editor and writer at Converge 360.