Just picture a mass of people, an enormous crowd, rushing through a door when one person trips, and the entire mass goes down like dominoes.
It shouldn’t be that hard to picture. We’re almost upon Black Friday, after all, and that’s more or less how store entrances look at opening time anyway.
That was basically the internet yesterday. Between 11:49 PM PDT (Pacific Daylight Time) on October 19 and 2:24 AM PDT on October 20, Amazon Web Services (AWS), the cloud platform that underpins an enormous part of the digital world, went down, and it took a lot of websites down with it.
Amazon’s Massive AWS Outage
My first clue was that my eye doctor couldn’t access their system to order me a new batch of contacts. Then I couldn’t reimburse a friend for dinner on Venmo.
Throughout the day, news sites reported other people’s much more important problems, such as students unable to do homework (yay!) and teachers unable to grade any work (yay?). The list of downed services and apps stretched on:
Duolingo, Roblox, Fortnite, Coinbase, Robinhood, Perplexity AI, ChatGPT, United Airlines, Canva, Reddit, Flickr, and of course, Amazon itself. That’s just a sampling of the sites people had difficulty accessing because they depend on AWS servers.
Amazon put out this press release, which reads like an urgent, breathless telegram sent by the operator of an underground nuclear missile silo.
“By 12:26 AM PDT on October 20, we determined that the event was the result of DNS resolution issues for the regional DynamoDB service endpoints, and mitigated the issue by 2:24 AM PDT.”
Ah, yes. That old Dynamo, always playing tricks on Grandpa Internet.
“After resolving the DynamoDB DNS issue, AWS services began recovering, but a small subset of internal subsystems continued to be impaired. To facilitate full recovery, we temporarily throttled some impaired operations such as EC2 instance launches.”
It feels like I should be reading this on a ticker tape machine. Yeah, that stuff that they’d drop from windows in New York for parades.
Anyway, these sorts of mass outages will continue to happen as long as so much of our internet infrastructure is built on just a few pylons, such as AWS.
The post Explaining Monday’s Massive AWS Outage appeared first on VICE.