Service Outages Postmortems

Comments discuss recent major outages at providers such as Fastly, Cloudflare, and Backblaze, analyzing root causes including configuration errors, network failures, overloads, and cascading effects, along with praise for detailed postmortem blog posts.

➡️ Stable 0.6x DevOps & Infrastructure
Comments: 4,599
Years Active: 20
Top Authors: 5
Topic ID: #9092

Activity Over Time

2007: 9
2008: 29
2009: 68
2010: 69
2011: 99
2012: 175
2013: 150
2014: 133
2015: 139
2016: 200
2017: 199
2018: 231
2019: 350
2020: 365
2021: 605
2022: 418
2023: 479
2024: 357
2025: 499
2026: 27

Keywords

IT CPU MUST PDT QED thestreet.com youtube.com backblaze.com MB AI outage outages caused incident failure traffic network service triggered failed

Sample Comments

mehphp Apr 20, 2021 View on HN

Wasn't their previous major outage because of a bad migration?

tpike01 Oct 7, 2021 View on HN

Was the outage a configuration error by the admins or was it foul play?

radishingr Nov 13, 2021 View on HN

"Impact: fixed processes that led to 8 hr outage" seems like an easy case to make.

Redsquare Jun 12, 2018 View on HN

Bit drastic considering this was their first major global outage...

seldo Jun 12, 2014 View on HN

Our sincere apologies for tonight's downtime. We're back up now after 30 incredibly frustrating minutes, but we're making changes to ensure this incident can't be repeated. The root cause was a network failure at our CDN, Fastly. The incident was limited to a single Point of Presence (POP) in San Jose, so if you were in Europe or Asia you didn't see anything wrong, but obviously at this time of day most traffic is from the west coast. While our uptime over the last f

mattbillenstein Jul 30, 2025 View on HN

We're sorry https://www.youtube.com/watch?v=9u0EL_u4nvw Edit, an outage of this length smells of bad systems architecture...

sp8 Mar 12, 2021 View on HN

That literally happened, they blogged about it recently. https://www.backblaze.com/blog/recent-outages-why-we-acceler...

elil17 Mar 19, 2018 View on HN

Title seems misleading - a poorly chosen default behavior caused the outage

immjs Jul 20, 2023 View on HN

Companies that explain in great detail why an outage happened - chefskiss

Thorrez Nov 1, 2021 View on HN

From the blog post it sounds like no. They say a service got overloaded due to an increase in the number of datacenters and triggered a bug.