Amazon Web Services (AWS) suffered an outage lasting nearly 15 hours on Diwali day, disrupting thousands of popular applications and websites, including Snapchat, Fortnite, banking apps and government services.
More than 1,000 companies across the globe were affected by the disruption, which began at 12.30 pm. The outage-tracking site Downdetector recorded over 6.5 million user reports, which showed that several services, including Venmo, WhatsApp, Duolingo, Roblox, and Amazon’s Alexa assistant, were out of action for hours before AWS declared all operations back to normal at approximately 3.30 am IST on October 21.
The outage spread through everyday services and hit multiple sectors. Financial platforms including Coinbase, Robinhood, and major UK banks such as Lloyds and Halifax were affected. Airlines such as United and Delta saw flights delayed, while streaming services, workplace tools like Zoom and Slack, and even the UK’s tax authority website became inaccessible. Gaming platforms faced lengthy disruptions, with Roblox and Fortnite down for nearly six hours. Duolingo, for its part, promised to protect users’ learning streaks.
The outage reportedly originated in AWS’s US-EAST-1 data center region in northern Virginia, the most crucial hub of the world’s largest cloud provider. The main cause was a Domain Name System (DNS) resolution issue affecting DynamoDB, a core database service that stores critical information for AWS clients.
DNS functions as the internet’s phone book, converting website names into the numeric IP addresses that computers use to reach one another. Once DNS resolution failed, applications could not locate the servers they needed, leading to cascading failures across the internet.
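For readers who want to see what that lookup involves, here is a minimal Python sketch of a name-to-address resolution and of what an application sees when it fails. The function name `resolve` and the `lookup` parameter are illustrative assumptions, not anything taken from AWS; the underlying call, `socket.getaddrinfo`, is the standard-library resolver interface.

```python
import socket


def resolve(hostname, lookup=socket.getaddrinfo):
    """Return the unique IP addresses for hostname, or [] if DNS resolution fails.

    `lookup` defaults to the system resolver; it is a parameter only so the
    failure case can be simulated without touching the network (an assumption
    made for this sketch).
    """
    try:
        records = lookup(hostname, None)
    except socket.gaierror:
        # The "phone book" has no usable answer: the caller gets no address,
        # so it cannot even attempt a connection to the server.
        return []
    # Each record is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({record[4][0] for record in records})


# Example call (the output depends on the network and resolver in use):
# resolve("example.com")
```

An application that receives the empty list has nowhere to send its request, which is how a failure in one lookup can ripple into every service that depends on it.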
Meanwhile, technology experts are pointing fingers at both Amazon and the companies that depend on its infrastructure. Some argue that companies using Amazon services have not taken enough care to build safeguards into their applications.
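One safeguard of the kind those experts describe is to avoid depending on a single region's endpoint. The sketch below is a simplified illustration of that idea, not a description of what any affected company runs: the hostnames are hypothetical, and `first_resolvable` and its `lookup` parameter are names invented for this example.

```python
import socket


def first_resolvable(endpoints, port=443, lookup=socket.getaddrinfo):
    """Return (hostname, ip) for the first endpoint whose name resolves, else None.

    `endpoints` is an ordered list of hostnames, primary region first.
    `lookup` defaults to the system resolver and is injectable for testing
    (an assumption made for this sketch).
    """
    for hostname in endpoints:
        try:
            records = lookup(hostname, port)
        except socket.gaierror:
            # This region's name did not resolve; fall through to the next one.
            continue
        if records:
            # sockaddr is the last field of each record; its first item is the IP.
            return hostname, records[0][4][0]
    return None


# Hypothetical endpoints, listed in order of preference.
ENDPOINTS = [
    "db.us-east-1.example.com",
    "db.us-west-2.example.com",
]
```

Falling back at the DNS level is only one piece of the picture. A real multi-region design also needs the data itself to be available in the second region, which this sketch does not address.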
Keeping DNS well maintained is key to avoiding outages. Even the largest cloud environments can stop functioning if DNS, a seemingly minor piece of infrastructure, is affected. Such outages can also lead to prolonged legal battles over the financial losses companies incur. Delta Air Lines is still seeking over $500 million from CrowdStrike, more than a year after that company’s outage led to widespread flight disruptions.
