Many Companies Say They Are Ready for Agentic AI Failures, but Few Test Recovery Often

Most companies say they are ready to recover from disruptions involving agentic AI, but a new survey of more than 300 IT decision-makers from Australia, New Zealand, Europe, the United Kingdom, and the United States suggests relatively few test those plans often enough to prove it.

The survey, conducted by Keepit, a vendor-independent cloud backup and recovery service based in Denmark, found that 94% of respondents were confident their disaster recovery plans covered agentic AI systems, even though only 32% said they tested those plans monthly.

Perhaps more worrying, 33% of IT and security leaders responding to the survey said they have only partial control over the use of agentic AI in their organizations, and 52% had doubts about whether their recovery plans cover agentic AI scenarios. 

"Organizations need to put more emphasis on creating long-term, structured, and tested disaster recovery plans," said Kim Larsen, Group Chief Information Security Officer at Keepit, in a statement. "This also means putting a spotlight on data governance and accountability, which is the foundation for any resiliency plan."

Among the key findings: most organizations test recovery, but not consistently. Around 90% have evaluated large-scale data recovery at least once; however, testing is neither frequent nor systematic across all systems.

Furthermore, access and authentication, a critical part of IT, was often overlooked in recovery planning. Identity-related systems, such as Microsoft's Entra ID and Okta, are tested far less often than other data systems.

Keepit found that, on average, productivity applications such as Microsoft 365, Google Workspace, and Salesforce are restored four times as frequently as identity applications.

"For every four companies who run a yearly test on their productivity workload, only one of them (25%) will have run a test on their identity applications," the report stated. 

The survey also found that most restore activity involves single-file downloads, reflecting routine operational needs rather than large-scale recovery events. Many incidents are granular, making it faster and more practical to retrieve a specific file.

The report's authors noted that backup creates value when organizations can recover confidently, correctly, and efficiently, whether the need is small and immediate or broad and time-critical. They also found that restore activity is strong among larger organizations.

The report's authors said their aim was to determine whether external, high-profile events caused any changes in restoration behavior. Keepit investigated two such events in 2024 and one event in 2025 that could have caused data loss or unavailability: the solar flares in May 2024, the CrowdStrike incident in July 2024, and the Microsoft outages in October 2025. 

The results were worrying because none of these events prompted any change in user behavior. There was no sign of increased activity to confirm that backups were working in the days and weeks following the events. 

The report proposes two explanations for the lack of change. First, organizations did not experience widespread, immediate restoration needs as a direct result of these events. Second, "awareness moments" do not automatically translate into changes in recovery routines.

The solution, according to the report's authors, is to be proactive rather than reactive to similar events. "Organizations can use external events as structured triggers for guided recovery checks — short, repeatable validations that reinforce confidence without requiring large-scale, disruptive exercises," the report noted. 

They also suggested implementing "guided recovery" enabled by MCP (Model Context Protocol), which opens the door to "asking for help" in the moment that matters. 

Furthermore, an MCP-enabled assistant can help identify unhealthy tenants or suspicious patterns in protected data and guide administrators through the right recovery steps, turning recovery into a manageable, repeatable process. 

"It all boils down to knowing who is in charge of recovery and which systems are restored first when multiple systems are affected," Larsen said. "When decisions are delayed, recovery takes longer than necessary." 
