Digraph
Software system incident postmortems
Software system incidents
Parent topics
Software systems
2012-08 Knight Capital stock trading disruption
Software system incident postmortems
Computer security postmortems
Privacy, computer security, vulnerabilities and attacks
Software system incident postmortems
5 Whys - Wikipedia
https://en.wikipedia.org/wiki/5_Whys
Software system incident postmortems
Cloudflare outage on June 21, 2022
https://blog.cloudflare.com/cloudflare-outage-on-june-21-2022/
Software system incident postmortems
Cloudflare outage on June 21, 2022 | Hacker News
https://news.ycombinator.com/item?id=31823132
Software system incident postmortems
Details of the Cloudflare outage on July 2, 2019
https://blog.cloudflare.com/details-of-the-cloudflare-outage-on-july-2-2019/
Cloudflare
Kyoto Tycoon
Quicksilver (key-value store)
Software system incident postmortems
GitHub - hjacobs/kubernetes-failure-stories: Compilation of public failure/horror stories related to Kubernetes
https://github.com/hjacobs/kubernetes-failure-stories
Kubernetes
Software system incident postmortems
Google Cloud Status Dashboard
https://status.cloud.google.com/incident/storage/19002
Software system incident postmortems
How to lose $172k per second for 45 minutes (2013) | Hacker News
https://news.ycombinator.com/item?id=19542766
Software system incident postmortems
Kubernetes Failure Stories | Hacker News
https://news.ycombinator.com/item?id=20163500
Kubernetes
Software system incident postmortems
Roblox Return to Service 10/28-10/31 2021 - Roblox Blog
https://blog.roblox.com/2022/01/roblox-return-to-service-10-28-10-31-2021/
Software system incident postmortems
Root cause analysis: significantly elevated error rates on 2019‑07‑10
https://stripe.com/rcas/2019-07-10
Software system incident postmortems
Root cause analysis: significantly elevated error rates on 2019‑07‑10 | Hacker News
https://news.ycombinator.com/item?id=20422337
Software system incident postmortems
Route Leak Impacting Cloudflare | Hacker News
https://news.ycombinator.com/item?id=20262214
Border Gateway Protocol (BGP)
Cloudflare
Software system incident postmortems
Summary of the AWS Service Event in the Northern Virginia (US-EAST-1) Region
https://aws.amazon.com/message/12721/
Software system incident postmortems
System separation in the Continental Europe Synchronous Area on 8 January 2021 – 2nd update
https://www.entsoe.eu/news/2021/01/26/system-separation-in-the-continental-europe-synchronous-area-on-8-january-2021-2nd-update/
Software system incident postmortems
Today's Outage Post Mortem
https://blog.cloudflare.com/todays-outage-post-mortem-82515/
Cloudflare
Software system incident postmortems
Update to Security Incident [May 17, 2019] - Stack Overflow Blog
https://stackoverflow.blog/2019/05/17/update-to-security-incident-may-17-2019/
Software system incident postmortems
StackOverflow
Update to Security Incident | Hacker News
https://news.ycombinator.com/item?id=19941797
Software system incident postmortems
StackOverflow
Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline | Hacker News
https://news.ycombinator.com/item?id=20267790
Border Gateway Protocol (BGP)
Cloudflare
Software system incident postmortems
python sweetness — How to lose $172,222 per second for 45 minutes
https://sweetness.hmmz.org/2013-10-22-how-to-lose-172222-a-second-for-45-minutes.html
2012-08 Knight Capital stock trading disruption
Software system incident postmortems