Nvidia Proposes AI-Savvy Benchmark for Supercomputer Performance

Since 1993, the TOP500 list has provided a broadly respected ranking of the world's most powerful supercomputers, employing the High Performance Linpack (HPL) benchmark to compare performance among systems. Now, Nvidia is proposing that the TOP500 expand its benchmark suite to better reflect the computational demands posed by AI workloads.

"Multiple benchmarking approaches provide different perspectives, contributing to a more holistic picture of a supercomputer's capabilities," writes Ian Buck, general manager and vice president of Accelerated Computing at NVIDIA, in a blog post.

The current HPL benchmark emphasizes double-precision (64-bit) floating point calculations, which are common in complex simulations where accuracy is paramount. But AI-oriented scenarios, like neural networks for deep learning, can run far more efficiently using less expensive single-precision (FP32) or half-precision (FP16) computation. The proposed HPL-AI benchmark incorporates the mixed-precision calculations that are a common feature of AI workloads and training, providing valuable insight into system performance in this fast-emerging space.
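
To make the precision trade-off concrete, the short Python sketch below rounds one value to FP16, FP32 and FP64 and prints how much accuracy each format loses. It is only an illustration and is not part of HPL or HPL-AI; the helper name `round_to` is hypothetical, and the rounding relies on the standard library's `struct` format codes for IEEE 754 half, single and double precision.

```python
import struct

def round_to(fmt, x):
    """Round a Python float (FP64) to the format named by a struct code.

    "e" is IEEE 754 half precision (FP16), "f" is single precision (FP32),
    and "d" is double precision (FP64). (Hypothetical helper for illustration.)
    """
    return struct.unpack("<" + fmt, struct.pack("<" + fmt, x))[0]

value = 0.1  # not exactly representable in any binary floating point format
for name, fmt in [("FP16", "e"), ("FP32", "f"), ("FP64", "d")]:
    stored = round_to(fmt, value)
    # The error is measured against the FP64 value Python already holds.
    print(f"{name}: stored={stored!r} rounding error={abs(stored - value):.3e}")
```

The narrower formats take less memory and bandwidth per number, which is the source of the efficiency gain, but each stored value carries a larger rounding error.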

Jack Dongarra created the HPL benchmark nearly 30 years ago. He says HPL-AI reflects the evolving demands placed on supercomputers as machine learning and AI become more prevalent.

"Mixed-precision techniques have become increasingly important to improve the computing efficiency of supercomputers, both for traditional simulations with iterative refinement techniques, as well as for AI applications," Dongarra said. "Just as HPL allows benchmarking of double-precision capabilities, this new approach based on HPL allows benchmarking of mixed-precision capabilities of supercomputers at scale."

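The iterative refinement idea Dongarra refers to can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the HPL-AI reference code: the function names are hypothetical, FP32 is simulated by rounding every intermediate result through `struct`, the LU factorization skips pivoting (so the example matrix is chosen to be diagonally dominant), and a plain residual-correction loop stands in for whatever solver a real benchmark run would use.

```python
import struct

def to_fp32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def lu_factor_low(A):
    """LU factorization without pivoting, rounding every result to FP32."""
    n = len(A)
    LU = [[to_fp32(v) for v in row] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            LU[i][k] = to_fp32(LU[i][k] / LU[k][k])
            for j in range(k + 1, n):
                LU[i][j] = to_fp32(LU[i][j] - to_fp32(LU[i][k] * LU[k][j]))
    return LU

def lu_solve_low(LU, b):
    """Forward and back substitution with the FP32 factors, in FP32."""
    n = len(LU)
    y = [0.0] * n
    for i in range(n):
        s = to_fp32(b[i])
        for j in range(i):
            s = to_fp32(s - to_fp32(LU[i][j] * y[j]))
        y[i] = s
    x = [0.0] * n
    for i in reversed(range(n)):
        s = y[i]
        for j in range(i + 1, n):
            s = to_fp32(s - to_fp32(LU[i][j] * x[j]))
        x[i] = to_fp32(s / LU[i][i])
    return x

def residual(A, x, b):
    """Compute r = b - A x in full FP64 precision."""
    return [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]

def solve_mixed(A, b, iters=5):
    """Factor once in low precision, then refine the answer in FP64."""
    LU = lu_factor_low(A)            # expensive step, done in FP32
    x = lu_solve_low(LU, b)          # initial low-precision solution
    for _ in range(iters):
        r = residual(A, x, b)        # cheap step, done in FP64
        d = lu_solve_low(LU, r)      # correction reuses the FP32 factors
        x = [xi + di for xi, di in zip(x, d)]
    return x

# Example system (values chosen for illustration only).
A = [[4.1, 1.3, 0.7], [1.2, 5.3, 0.9], [0.6, 1.1, 6.7]]
x_true = [1.0, -2.0, 3.0]
b = [sum(a * xj for a, xj in zip(row, x_true)) for row in A]

x = solve_mixed(A, b)
print("refined solution:", x)
print("largest residual:", max(abs(r) for r in residual(A, x, b)))
```
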
Nvidia and the U.S. Department of Energy (DOE) recently tested the IBM-built Summit supercomputer at Oak Ridge National Laboratory in Oak Ridge, Tenn., using the HPL-AI benchmark. As the top-ranked system on the current TOP500 list, Summit produced 148.6 petaflops under the HPL benchmark -- nearly 50 percent higher than the second-place system on the list. When tested under HPL-AI, Summit produced 445 petaflops, roughly three times its HPL result.

The increase reflects the efficiency of the Summit system architecture in performing mixed-precision math. It also points, Nvidia says, to the growing need for an industry-standard benchmark that provides a comparative assessment of performance in AI and other scenarios that rely heavily on mixed-precision calculations.

"Today, no benchmark measures the mixed-precision capabilities of the largest-scale supercomputing systems the way the original HPL does for double-precision capabilities," Buck writes. "HPL-AI can fill this need, showing how a supercomputing system might handle mixed-precision workloads such as large-scale AI."

About the Author

Michael Desmond is an editor and writer for 1105 Media's Enterprise Computing Group.