metastable failures blog post
- 
What is a metastable failure? (distributed systems)
Metastable failures occur in open systems with an uncontrolled source of load, where a trigger pushes the system into a bad state that persists even after the trigger is removed.
- The key to a metastable failure is the sustaining feedback loop, rather than the trigger.
 
- grey failures - https://blog.acolyer.org/2017/06/15/gray-failure-the-achilles-heel-of-cloud-scale-systems/
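
A toy simulation may help make the definition concrete for the post. This is only a sketch under made-up assumptions, not a model of any real system: a single FIFO server, clients that re-send any attempt still queued when it hits its timeout (no backoff, no retry limit), and a server that still spends capacity on attempts the client has already abandoned. The function `simulate` and every number in it (load, capacity, timeout, trigger window) are hypothetical.

```python
from collections import deque


def simulate(retries_enabled, ticks=200, base_load=80, capacity=100,
             timeout=5, trigger=(20, 30), degraded_capacity=20):
    """Return goodput (requests answered before their timeout) for each tick."""
    queue = deque()  # FIFO of [arrival_tick, pending_count] buckets
    goodput = []
    for now in range(ticks):
        # Sustaining feedback loop: attempts still queued when they reach
        # the timeout are re-sent by the client as new requests.
        retries = 0
        if retries_enabled:
            retries = sum(n for arrived, n in queue if now - arrived == timeout)
        queue.append([now, base_load + retries])

        # Trigger: a temporary capacity drop during the trigger window.
        in_trigger = trigger[0] <= now < trigger[1]
        budget = degraded_capacity if in_trigger else capacity

        served_in_time = 0
        while budget > 0 and queue:
            bucket = queue[0]
            arrived, pending = bucket
            done = min(pending, budget)
            budget -= done
            bucket[1] -= done
            if bucket[1] == 0:
                queue.popleft()
            # Work on attempts older than the timeout is wasted:
            # the client has already given up on them.
            if now - arrived < timeout:
                served_in_time += done
        goodput.append(served_in_time)
    return goodput


with_retries = simulate(retries_enabled=True)
without_retries = simulate(retries_enabled=False)
print(with_retries[-1], without_retries[-1])  # 0 80
```

In this sketch the trigger is identical in both runs and ends at tick 30. Without retries the backlog drains and goodput returns to the offered load. With retries, the backlog keeps every attempt waiting past its timeout, each timeout adds more load, and goodput stays at zero long after capacity is restored. The feedback loop, not the trigger, is what keeps the system in the bad state.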
 
 - 
Examples of metastable failure
- 
supply chain crunch
- semiconductors
 
 - 
black start problems
 - 
traffic engineering problems
- Gazis, Denos C., and Robert Herman. “The Moving and ‘Phantom’ Bottlenecks.” Transportation Science 26, no. 3 (August 1992): 223–29. https://doi.org/10.1287/trsc.26.3.223.
 
 - 
thundering herd problems
 - 
Joyent PXE boot - https://www.youtube.com/watch?v=30jNsCVLpAE
- also black start
 
 - 
Rasmussen, Jens. “Risk Management in a Dynamic Society: A Modelling Problem.” Safety Science 27, no. 2–3 (November 1997): 183–213. https://doi.org/10.1016/S0925-7535(97)00052-0.
 - 
Brooker, Marc. “The Perils of Not Always Coordinating,” n.d., 39.
 
 -