Murphy’s law is an adage or epigram that is typically stated as: “Anything that can go wrong will go wrong”.
Actually, the way I always heard Murphy’s law formulated is “Everything that can go wrong will eventually go wrong.” However he said it, after running operations and release management for a web application with significant traffic, I can tell you this:
Murphy was an optimist.
“Let’s just get this release out of the door at 5pm, just before we go home. What could possibly go wrong?”
“Let’s release this minor update on Friday. What could possibly go wrong?”
“Let’s migrate everybody over to our new VPC setup at once. What could possibly go wrong?”
“Let’s not apply everything I learned at Cloud9 to my personal website and have it run on a single, not externally backed up, unoptimised WordPress install. What could possibly go wrong?”
For a long time at Cloud9, we’ve had guidelines based on past experience. Here are just three:
- No end-of-day releases, because things always catch on fire while you’re in transit home or just when you’re having dinner.
- No releases on Fridays, because nobody likes firefighting on weekends.
- Data has to be on two servers at the very least, because — you know — servers and hard drives break.
Once in a while there seemed to be very compelling reasons not to follow a guideline, just this once. “It’s just a tiny tweak, it cannot possibly break anything.” I saw a variant of the 5pm release happen while I was driving to my parents-in-law. It started with a proud internal email stating “We have released A and B just now!” I looked at my watch: 6pm. Alright, let’s see how this plays out. And indeed, only half an hour later, another email: “ok, so that was a bad idea — reverted.”
Improbable as it seemed, things went wrong almost every time they had the chance, and certainly too often to ignore. So there you have it, Zef’s law:
Everything that can possibly go wrong will immediately blow up in your face. — Zef’s Law
Since then, these guidelines have been turned into unbreakable laws.
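A law is easier to keep when a script says no for you. Here is a minimal sketch of a pre-release check for the three guidelines above. It is not Cloud9's actual tooling: the function name `release_blockers`, the 3pm cutoff, and the idea of passing in a replica count are all assumptions made up for illustration.

```python
from datetime import datetime, time
from typing import List

# Hypothetical policy values. The guidelines only say "end of day",
# "Friday" and "two servers"; the exact cutoff hour is an assumption.
LATEST_RELEASE_TIME = time(15, 0)  # assumed cutoff, well before anyone heads home
FRIDAY = 4                         # datetime.weekday(): Monday is 0, Friday is 4
MIN_DATA_REPLICAS = 2              # data on two servers at the very least


def release_blockers(now: datetime, data_replicas: int) -> List[str]:
    """Return the reasons a release must not go out now (empty list means go)."""
    blockers = []
    if now.weekday() == FRIDAY:
        blockers.append("no releases on Fridays")
    if now.time() >= LATEST_RELEASE_TIME:
        blockers.append("no end-of-day releases")
    if data_replicas < MIN_DATA_REPLICAS:
        blockers.append("data must live on at least two servers")
    return blockers
```

A deploy script could call `release_blockers(datetime.now(), data_replicas=2)` and refuse to continue whenever the list is non-empty, printing the reasons so nobody has to argue about "just this once".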
If you live by Zef’s law, you learn to manage and accept the consequence of taking risks: immediate failure. The trick is to lower the risk to the level where failure becomes tolerable.