What happened
A unit of Amazon Web Services (AWS) suffered at least two outages caused by errors related to the use of its own AI tools for writing code, the Financial Times reports. The incidents occurred over the past few months and have prompted internal debates about the safety of broadly rolling out agent AI assistants.
In mid‑December one of AWS’s services was down for about 13 hours after engineers allowed the tool Kiro to make changes without additional approval. The agent AI decided, according to sources, to "delete and recreate the environment," which led to an interruption of a service used to analyze customer costs.
"The agent decided to 'delete and recreate the environment,' which caused prolonged service downtime."
— An AWS employee, Financial Times
Why it matters
AWS generates roughly 60% of Amazon's operating profit and is a key cloud provider for hundreds of thousands of companies worldwide. When AI‑based tools are granted the right to change running environments without human review, the risks stop being local — they become systemic.
The causes of the incidents are simpler than the headlines: rapid deployment of agent capabilities, expanded rights to perform actions, and insufficient operational controls. As a result, technology designed to speed up engineers' work can potentially undermine that very stability.
How AWS is responding
The company prepared an internal report and implemented additional safeguards: mandatory peer review of changes and extra staff training. AWS also emphasizes that one of the outages was localized and did not affect the majority of customers.
"The December incident was limited in scope and involved a single service in mainland China," the company said in an official statement.
— AWS, official statement
At the same time, Amazon continues to invest in AI: in 2025 the company introduced new Nova 2 models and the Nova Forge service, and is developing products like Kiro. Parallel to this, Amazon and Google are working on rapid interconnection of their clouds — which on one hand increases flexibility, and on the other makes ensuring consistent controls more difficult.
What it means for Ukraine and the markets
For Ukrainian companies and government services that increasingly rely on foreign cloud providers, these incidents are a reminder of two simple things: first, innovations should be implemented with a security checklist; second, critical infrastructure must have resilient backup scenarios.
Practical takeaways: require transparent SLAs from providers, dual‑control (two‑person) processes for executing changes, logging and audit, and diversification of providers for key services. This is not technological paranoia, but business reality on which access to data, services and even security depends.
Summary
The incidents at AWS show that the speed of innovation and operational security must go hand in hand. Clouds will continue to evolve — and that is good for the economy — but customer trust is built through control mechanisms, not just marketing. The question remains open: can the cloud giants find a balance between autonomous AI and strict operational guarantees on which the stability and security of digital infrastructure depend?