AWS Packs Nine New Features into its SageMaker Machine Learning Service

Amazon Web Services (AWS) has packed the latest release of its SageMaker machine learning service with nine new capabilities designed to make it easier for developers to automate and scale all steps of the end-to-end machine learning workflow.

Announced at the re:Invent online conference, this upgrade includes such new capabilities as faster data preparation, a purpose-built repository for prepared data, workflow automation, greater transparency into training data to mitigate bias and explain predictions, distributed training capabilities to train large models up to two times faster, and model monitoring on edge devices.

"Machine learning is becoming more mainstream, but it is still evolving at a rapid clip," the company said in a statement. "With all the attention machine learning has received, it seems like it should be simple to create machine learning models, but it isn't. In order to create a model, developers need to start with the highly manual process of preparing the data. Then they need to visualize it in notebooks, pick the right algorithm, set up the framework, train the model, tune millions of possible parameters, deploy the model, and monitor its performance. This process needs to be continuously repeated to ensure that the model is performing as expected over time. In the past, this process put machine learning out of the reach of all but the most skilled developers. However, Amazon SageMaker has changed that. Amazon SageMaker is a fully managed service that removes challenges from each stage of the machine learning process, making it radically easier and faster for everyday developers and data scientists to build, train, and deploy machine learning models."

Amazon SageMaker is a service that enables developers to build and train machine learning models for predictive or analytical applications in the AWS public cloud.

"One of the best parts about having such a widely-adopted service like SageMaker is that we get lots of customer suggestions which fuel our next set of deliverables," said Swami Sivasubramanian, VP of the Amazon Machine Learning group at AWS.

The list of new capabilities AWS is announcing for SageMaker includes:

  • Data Wrangler, which provides a fast and easy way for developers to prepare data for machine learning.
  • Feature Store, a purpose-built data store for storing, updating, retrieving, and sharing machine learning features.
  • Pipelines, which gives developers the first purpose-built, easy-to-use continuous integration and continuous delivery (CI/CD) service for machine learning.
  • Clarify, which provides developers with greater visibility into their training data, so they can limit bias in machine learning models and explain predictions.
  • Deep profiling for SageMaker Debugger, which monitors machine learning training performance to help developers train models faster.
  • Distributed Training, which provides new capabilities that can train large models up to two times faster than would otherwise be possible with today's machine learning processors.
  • Edge Manager, which delivers machine learning model monitoring and management for edge devices to ensure that models deployed in production are operating correctly.
  • JumpStart, a developer portal for pre-trained models and pre-built workflows.


About the Author

John K. Waters is the editor in chief of a number of sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at