News
Google Open Sources Library for Training Large-Scale Neural Network Models
This week Google announced that it has open-sourced GPipe, a library for training the large deep neural networks (DNNs) behind large-scale machine learning projects such as language processing, speech recognition and image recognition.
Google noted in its announcement that current state-of-the-art image models are facing limitations imposed by GPU memory, even with the many advances in that area.
In a paper accompanying the announcement, the company described how GPipe uses "pipeline parallelism" to "overcome this limitation": the library partitions a model's sequential layers across multiple accelerators and trains them with synchronous stochastic gradient descent, allowing training to scale in a distributed manner.
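To make the idea concrete, the following is a minimal, illustrative sketch of micro-batch pipeline parallelism in plain Python. The function names and structure are assumptions for illustration only, not the real GPipe API, and the stages run sequentially here rather than on separate accelerators.

```python
# Illustrative sketch of GPipe-style micro-batch pipelining.
# (Names and structure are assumptions, not the actual GPipe API.)

def split_into_microbatches(minibatch, num_microbatches):
    """Split one mini-batch into roughly equal micro-batches."""
    step = -(-len(minibatch) // num_microbatches)  # ceiling division
    return [minibatch[i:i + step] for i in range(0, len(minibatch), step)]

def pipeline_forward(stages, minibatch, num_microbatches):
    """Run each micro-batch through the sequence of model stages.

    In real pipeline parallelism each stage lives on its own accelerator,
    so stage k can work on micro-batch i+1 while stage k+1 works on
    micro-batch i; this sketch only shows the micro-batch schedule.
    """
    outputs = []
    for micro in split_into_microbatches(minibatch, num_microbatches):
        for stage in stages:  # sequential layers, partitioned into stages
            micro = stage(micro)
        outputs.append(micro)
    # All micro-batch results are combined before a single synchronous
    # gradient update, so training behaves like ordinary mini-batch SGD.
    return [x for micro in outputs for x in micro]

# Toy "model": two sequential stages, each a simple elementwise transform.
stages = [
    lambda xs: [x * 2 for x in xs],   # stage 0 (e.g. accelerator 0)
    lambda xs: [x + 1 for x in xs],   # stage 1 (e.g. accelerator 1)
]
print(pipeline_forward(stages, [1, 2, 3, 4], num_microbatches=2))
# -> [3, 5, 7, 9]
```

Because the update is synchronous across micro-batches, accuracy matches non-pipelined training; the pipelining only changes how the work is scheduled across devices.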
"Importantly, GPipe allows researchers to easily deploy more accelerators to train larger models and to scale the performance without tuning hyperparameters," the company commented.
"To demonstrate the effectiveness of GPipe, we trained an AmoebaNet-B with 557 million model parameters and input image size of 480 x 480 on Google Cloud TPUv2s," it continued.
"This model performed well on multiple popular datasets, including pushing the single-crop ImageNet accuracy to 84.3%, the CIFAR-10 accuracy to 99%, and the CIFAR-100 accuracy to 91.3%."
In contrast, Google pointed out that the winner of the 2017 ImageNet challenge, Squeeze-and-Excitation Networks, achieved 82.7% top-1 accuracy with 145.8 million parameters.
Much more detail about GPipe can be found in Google's official announcement.
According to Google, the core GPipe library is open-sourced under the Lingvo framework.
About the Author
Becky Nagel is vice president of AI for 1105 Media, where she specializes in training internal and external customers on maximizing their business potential via a wide variety of generative AI technologies as well as developing cutting-edge AI content and events. She's the author of "ChatGPT Prompt 101 Guide for Business Uses," regularly leads research studies on generative AI business usage, and serves as the director of AI Boardroom, a new resource for C-level executives looking to excel in the AI era. Prior to her current position she was a technical leader for 1105 Media's Web, advertising and production teams as well as editorial director for a suite of enterprise technology publications, including serving as founding editor of PureAI.com. She has 20 years of enterprise technology journalism experience, and regularly speaks and writes about generative AI, AI, edge computing and other cutting-edge technologies. She can be reached at bnagel@1105media.com.