Microsoft Launches Windows ML Runtime for On-Device AI Applications -- Pure AI

By John K. Waters
09/26/2025

Microsoft said on Thursday it has made its Windows ML artificial intelligence runtime generally available for production use, enabling developers to run AI models directly on Windows devices without relying on cloud computing.

The AI inferencing runtime, first introduced at Microsoft's Build conference in May, lets applications on Windows 11 computers run AI workloads locally on processors from Advanced Micro Devices, Intel, Nvidia, and Qualcomm.

Windows ML serves as a hardware abstraction layer that automatically detects a device's capabilities and downloads the appropriate execution providers. Because applications no longer need to bundle those runtime components themselves, their size drops by tens to hundreds of megabytes, Microsoft said.
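
The selection step described above can be pictured as a priority list matched against whatever the device reports. The following is a minimal sketch, not Windows ML's actual logic: the function name `choose_providers` and the ranking are hypothetical, and the provider strings are ONNX Runtime-style identifiers that should be checked against the installed runtime (the NVIDIA entry in particular is an assumed spelling).

```python
# Hypothetical priority order. The strings follow ONNX Runtime's naming style,
# but the ranking itself is an assumption made for this illustration.
PREFERRED_PROVIDERS = [
    "NvTensorRTRTXExecutionProvider",  # NVIDIA RTX GPUs (assumed spelling)
    "QNNExecutionProvider",            # Qualcomm Snapdragon NPUs
    "OpenVINOExecutionProvider",       # Intel CPU, GPU, and NPU
    "VitisAIExecutionProvider",        # AMD Ryzen AI
    "DmlExecutionProvider",            # DirectML
]
CPU_FALLBACK = "CPUExecutionProvider"


def choose_providers(available, preferred=PREFERRED_PROVIDERS):
    """Return the preferred providers present on this device, CPU last."""
    available_set = set(available)
    # Keep the preference order, dropping anything the device lacks.
    chosen = [name for name in preferred if name in available_set]
    # Always end with the CPU provider so inference can still run.
    if CPU_FALLBACK not in chosen:
        chosen.append(CPU_FALLBACK)
    return chosen


# Example: a device that reports a Qualcomm NPU provider and DirectML.
detected = ["CPUExecutionProvider", "DmlExecutionProvider", "QNNExecutionProvider"]
print(choose_providers(detected))
# ['QNNExecutionProvider', 'DmlExecutionProvider', 'CPUExecutionProvider']
```

In ONNX Runtime's Python API, a list like `detected` is what `onnxruntime.get_available_providers()` returns, and an ordered list like the result can be passed as the `providers` argument when creating an `InferenceSession`.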

The system is compatible with ONNX Runtime APIs and supports models in the Open Neural Network Exchange (ONNX) format, so developers can convert existing PyTorch models for deployment across different Windows hardware configurations.
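
For readers unfamiliar with that conversion step, here is a minimal sketch using PyTorch's `torch.onnx.export`. The helper name `export_to_onnx`, the tensor names "input" and "output", and the default file name are illustrative assumptions; real models may need extra options, such as dynamic axes or a specific opset, that are not shown here.

```python
def export_to_onnx(model, example_input, output_path="model.onnx"):
    """Export a PyTorch module to an ONNX file and return the file path.

    `model` is assumed to be a torch.nn.Module that takes a single tensor,
    and `example_input` a tensor of the shape the model expects.
    """
    # Imported lazily so this helper can be defined without PyTorch installed.
    import torch

    model.eval()  # switch off training-only behavior such as dropout
    torch.onnx.export(
        model,
        (example_input,),          # example arguments used to trace the model
        output_path,
        input_names=["input"],     # illustrative tensor names
        output_names=["output"],
    )
    return output_path
```

A call might look like `export_to_onnx(my_model, torch.randn(1, 3, 224, 224))`, where `my_model` and the input shape are placeholders. The resulting `.onnx` file is the kind of artifact an ONNX Runtime-compatible runtime loads.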

Software companies, including Adobe, McAfee, and Topaz Labs, are integrating the technology into upcoming releases. Adobe plans to use Windows ML for semantic search and content tagging in Premiere Pro and After Effects, while McAfee will employ it for deepfake detection in social media content.

"Windows ML with TensorRT for RTX delivers over 50% faster inferencing on NVIDIA RTX GPUs compared to DirectML," said Jason Paul, vice president of Consumer AI at Nvidia, in a blog post, referring to Microsoft's existing graphics-based machine learning framework.

The runtime includes execution providers developed by chip manufacturers to optimize performance for their specific processors. AMD's implementation supports the company's Ryzen AI platform across CPU, GPU, and neural processing unit configurations.

Intel's execution provider combines the company's OpenVINO software with Windows ML, enabling developers to select the optimal processing units for AI workloads on Intel Core Ultra processors.

Qualcomm Technologies worked with Microsoft to optimize Windows ML for Snapdragon X Series neural processing units, with support planned for the company's upcoming Snapdragon X2 platform.

The technology allows applications to process AI workloads locally rather than sending data to cloud services, potentially improving response times and data privacy while reducing operational costs for developers.

Windows ML is included in the Windows App SDK version 1.8.1 and requires Windows 11 version 24H2 or newer. Microsoft has also released supporting tools, including an AI Toolkit for Visual Studio Code and an AI Dev Gallery for testing custom models.

The launch represents Microsoft's effort to position Windows as a platform for edge AI computing as the technology industry increasingly focuses on running artificial intelligence applications on personal devices rather than exclusively in data centers.

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS.