
Major AWS Outage Highlights Dependencies within Cloud Providers (Week of Nov. 23-30) | Outage Deep Dive

From The Internet Report


Length:
15 minutes
Released:
Dec 1, 2020
Format:
Podcast episode

Description

If you’re an AWS customer or rely on services that use AWS, you might have noticed the major, hours-long outage last week. On November 25th, at approximately 5:15 am PST, users of Kinesis, a real-time processor of streaming data, began to experience service interruptions. The issue was not network-related, and AWS later issued a detailed incident post-mortem analysis identifying an existing operating system configuration issue that was triggered by a maintenance event that involved adding server capacity. Over the course of the day, Amazon attempted several mitigation measures, but the outage was not completely resolved until approximately 10:23 pm PST.

What was notable about this outage was its blast radius, which extended far beyond AWS’s direct customers. Several AWS services that use Kinesis, including Cognito and CloudWatch, were affected, as were users of any applications consuming those services (e.g., Ring, iRobot, Adobe). This is a good reminder of the risk of hidden service dependencies, as well as the need for visibility to understand what has gone wrong and to communicate it to customers.

Titles in the series (90)

This is the Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. Catch our Outage Deep Dive series for special coverage of Internet outages. We go under the hood to determine what happened, covering key lessons and ways IT teams can minimize downtime in similar situations. Also tune in every other week for the Pulse Update podcast series to hear from the Internet experts at ThousandEyes as they share the latest data on ISP outages, public cloud provider network outages, collaboration app network outages, and more. Then, the hosts dig into some of the most interesting outage events from the last few weeks.