Amazon today blamed human error for the big AWS outage that took down a number of large internet sites for several hours on Tuesday afternoon.
In a blog post, the company said that one of its employees was debugging an issue with the billing system and accidentally took more servers offline than intended. That error set off a domino effect, taking down two other server subsystems, which in turn brought down others.
“Removing a significant portion of the capacity caused each of these systems to require a full restart,” the post read. “While these subsystems were being restarted, S3 was unable to service requests. Other AWS services in the US-EAST-1 Region that rely on S3 for storage, including the S3 console, Amazon Elastic Compute Cloud (EC2) new instance launches, Amazon Elastic Block Store (EBS) volumes (when data was needed from a S3 snapshot), and AWS Lambda were also impacted while the S3 APIs were unavailable.”
In response, the company said it is making changes to ensure that a similar human error won’t have as large an impact. One is that the tool employees use to remove server capacity will no longer allow them to remove as much capacity, or to remove it as quickly, as they previously could.
Amazon also said it is making changes to prevent the AWS Service Health Dashboard — the webpage that shows which AWS services are operating normally and which aren’t — from going down itself in the event of a similar failure.
AWS, which leases computing power and data storage to companies big and small, is on pace to be a $14 billion business over the next year. It also drives a large portion of Amazon’s operating income.
This article originally appeared on Recode.net.