IT failures affect us more than we might like to admit. To guard against large-scale disasters, in 2012, Netflix engineers decided to test the limits of their systems. They designed a program called Chaos Monkey, which deliberately(!) put several of their production infrastructure servers out of action. The goal? To test the resilience of their IT architecture and self-recovery processes. Thankfully, it all worked!