Will Inference and AI Agents Break Enterprise GenAI Budgets?
Enterprises may be underestimating the actual cost of generative AI as they move from experimentation to production, according to Gartner's "10 Best Practices for Optimizing Generative and Agentic AI Costs" report.
"Organizations transitioning from GenAI pilots to production experience a rude awakening when it comes to costs," Gartner researchers found. "Creating a production-ready GenAI system can be orders of magnitude more expensive than running a pilot."
Gartner predicts that, through 2028, at least 50 percent of GenAI initiatives will exceed their planned budgets because of poor architectural choices and a lack of operational expertise.
The warning reflects a growing challenge facing the AI industry. While much of the conversation has focused on model capabilities, Gartner argues that the real test for enterprises will be operating AI systems efficiently at scale.
A major driver of those costs is inference, the process of using a trained AI model to respond to prompts, generate content, analyze data, or perform other tasks in production. Unlike training, which is typically a large upfront expense, inference costs recur every time users or applications call the model. Gartner expects inference to account for at least 70 percent of a model's lifetime costs, shifting attention away from training and toward the day-to-day realities of serving AI workloads at scale.
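To make the split between upfront and recurring costs concrete, here is a minimal back-of-the-envelope sketch. Every figure in it (training cost, per-call cost, call volume, deployment length) is a hypothetical value chosen for illustration, not a number from the Gartner report, and the function name `inference_share` is likewise an assumption.

```python
def inference_share(training_cost, cost_per_call, calls_per_month, months):
    """Return the fraction of lifetime cost that comes from inference.

    Training is modeled as a single upfront expense; inference is modeled
    as a fixed cost per call that recurs for the life of the deployment.
    """
    inference_cost = cost_per_call * calls_per_month * months
    return inference_cost / (training_cost + inference_cost)


# Hypothetical inputs, for illustration only.
share = inference_share(
    training_cost=300_000,      # one-time cost, in dollars
    cost_per_call=0.01,         # dollars per model call
    calls_per_month=2_000_000,  # production call volume
    months=36,                  # deployment lifetime
)
print(f"Inference share of lifetime cost: {share:.0%}")
```

Under these assumed inputs, inference comes to roughly 70 percent of the total. The share climbs further as call volume or deployment length grows, because the training term stays fixed.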
The challenge becomes even greater with agentic AI. Unlike traditional chatbots that generate a single response, AI agents can trigger multiple model calls, retrieve data, access external tools, and execute multi-step workflows.
Because a single user request can fan out into many model calls, AI usage and related costs can rise significantly as organizations deploy more autonomous systems.
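A rough per-request comparison shows how that multiplication works. The sketch below assumes a simple pricing model in which cost scales with the number of model calls and the tokens processed per call. The call counts, token counts, and price are hypothetical placeholders rather than figures from the report or from any vendor's price list.

```python
def request_cost(model_calls, tokens_per_call, price_per_1k_tokens):
    """Estimate the cost of one user request under per-token pricing."""
    return model_calls * tokens_per_call / 1000 * price_per_1k_tokens


# Hypothetical workloads, for illustration only.
chatbot = request_cost(model_calls=1, tokens_per_call=1_500, price_per_1k_tokens=0.01)
agent = request_cost(model_calls=12, tokens_per_call=4_000, price_per_1k_tokens=0.01)

print(f"Chatbot request: ${chatbot:.3f}")
print(f"Agent request:   ${agent:.3f}")
print(f"Multiplier:      {agent / chatbot:.0f}x")
```

In this illustration the agent makes more calls and also carries more context per call, so the two factors compound rather than add.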
The message is that success in the AI era will depend on more than model performance. Gartner claims that organizations must focus on cost governance, architectural efficiency, model selection, and usage monitoring to scale generative and agentic AI without incurring unsustainable spending.
"Through 2028, at least 50% of GenAI projects will overrun their budgeted costs due to poor architectural choices and lack of operational know-how," the report stated.