AWS Launches Custom Chip for ML Training
- By John K. Waters
Amazon took the wraps off another custom machine learning (ML) chip at its annual re:Invent conference, underway this week online. The new Trainium processor was designed by Amazon Web Services (AWS) to provide "the best price performance for training ML models in the cloud," the company says.
AWS CEO Andy Jassy introduced AWS Trainium during his conference keynote. "We know that we want to keep pushing the price performance on machine learning training," Jassy said, "so we're going to have to invest in our own chips. You have an unmatched array of instances in AWS, coupled with innovation in chips."
The Trainium chip was designed to provide the highest performance with the most teraflops (TFLOPS) of compute power for ML in the cloud, Jassy said, to enable a broader set of ML applications. The chip is specifically optimized for deep learning (DL) training workloads for applications including image classification, semantic search, translation, voice recognition, natural language processing and recommendation engines.
Trainium is Amazon's second piece of custom, in-house ML silicon. Its predecessor, Inferentia, debuted two years ago. The company recently announced plans to move some Alexa and facial recognition computing to the Inferentia chips.
Inferentia enables up to 30% higher throughput and up to 45% lower cost-per-inference than Amazon EC2 G4 instances, which the company claims were already the lowest-cost instances for ML inference in the cloud.
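To make those percentage claims concrete, here is a minimal arithmetic sketch. The G4 baseline figures below are hypothetical placeholders, not published AWS numbers; only the 30% and 45% improvement factors come from the article.

```python
# Hypothetical baseline figures for an Amazon EC2 G4 instance
# (illustrative only; not published AWS benchmarks).
g4_throughput = 1000.0          # inferences per second
g4_cost_per_inference = 0.0010  # dollars per inference

# AWS's claimed improvements for Inferentia-based Inf1 instances:
# up to 30% higher throughput, up to 45% lower cost-per-inference.
inf1_throughput = g4_throughput * 1.30            # 30% more throughput
inf1_cost_per_inference = g4_cost_per_inference * (1 - 0.45)  # 45% cheaper

print(inf1_throughput)          # higher inferences per second than baseline
print(inf1_cost_per_inference)  # roughly 0.00055 dollars per inference
```

In other words, against any G4 baseline, the claimed ceiling is 1.3x the throughput at 0.55x the per-inference cost.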
"While Inferentia addressed the cost of inference, which constitutes up to 90% of ML infrastructure costs, many development teams are also limited by fixed ML training budgets," the company states on the AWS blog. "This puts a cap on the scope and frequency of training needed to improve their models and applications. AWS Trainium addresses this challenge by providing the highest performance and lowest cost for ML training in the cloud."
The Trainium and Inferentia chips share the same AWS Neuron SDK, which makes it easy for developers already up to speed on Inferentia to get started with Trainium. Because the Neuron SDK is integrated with such popular ML frameworks as TensorFlow, PyTorch, and MXNet, developers can readily migrate to AWS Trainium from GPU-based instances with minimal code changes. AWS Trainium will be available via Amazon EC2 instances and AWS Deep Learning AMIs as well as managed services including Amazon SageMaker, Amazon ECS, EKS, and AWS Batch.
The combination of Trainium and Inferentia provides an end-to-end flow of ML compute "from scaling training workloads to deploying accelerated inference," the company says.
Amazon plans to make the chip available in 2021.
John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at firstname.lastname@example.org.