Provoking thought

No More Playing It Safe

I don’t consider myself a risk-taker. Far from it. The entire last chapter was about figuring out ways to reduce our risk by no longer having to take big bets. Now, just to mess with your mind, let me try to push you in the exact opposite direction. I’m going to attempt to talk you into taking more risk. Managed risk.

It’s for the greater good. Trust me.

The reason to take risks is to allow more unexpected things to happen. This will help us down the line when larger unexpected things happen: black swans. As a bonus, some unexpected things will result in big jumps forward. Others will be setbacks. The setbacks are opportunities to learn, so we will be better prepared for the next unexpected event.

Nassim Nicholas Taleb wrote a series of books, the most famous of which is likely The Black Swan (not to be confused with the movie Black Swan, starring the excellent Natalie Portman).

In The Black Swan, Taleb argues that whereas humanity feels it has a reasonable handle on historic events and how they happened, the reality is that the world is too complex to explain them accurately, let alone predict when and how they will happen. Yet our world is dominated by such surprising events, which he calls black swans. Most historical events were not realistically predictable, and most inventions had random origins as well. These events are called black swans because, after seeing a lifetime of white swans, one may assume that all swans are white. And then, blam, it appears: a black swan. How could we have seen that coming? The reality is that we couldn’t. That’s reality, and our world is shaped by it.

We are not completely powerless regarding black swans, however. Different things respond differently to black swan events.

Fragile things simply collapse. Markets have collapsed multiple times as a result of black swans, as recently as 2008.

Other things attempt to be robust towards black swans — so ideally they remain standing.

But most interesting is a third category of things that Taleb calls antifragile:

Some things benefit from shocks; they thrive and grow when exposed to volatility, randomness, disorder, and stressors and love adventure, risk, and uncertainty.

One example of something acting antifragile in many ways is the human body. When put under significant stress, e.g. when lifting significantly more weight than normal, the body adapts to the experience and starts to build muscle to be better prepared next time. Similarly, vaccines work because they make the body a little ill, so that it recovers better prepared for when an actual infection happens.

With similar techniques, we can attempt to make our organizations antifragile as well.

How? By artificially increasing the number of random events. This is why we have fire drills, and this is why Chaos Engineering is a thing.

The skill to acquire is to turn the aftermath of any unexpected event into a net positive.

Let’s say you have a successful product that is earning you a stable income. This can be a good time to experiment with riskier ideas for new products. With each attempt, we can get lucky and strike gold. Or we fail miserably. Fail as in: the product doesn’t find a market. Nevertheless, we can still turn that failure into a source of organizational learning, and therefore get significant value out of failure as well, which is another net positive.

Let’s say you run a web application, and it has essentially no downtime. That’s great. The downside is you’re not learning anything. Perhaps you have so much process and infrastructure in place that nothing can go wrong. That may seem good, but perhaps it’s slowing you down unnecessarily, or you’re overspending on infrastructure. How will you know? Consider removing some of the process. Let engineers push their code live, without having an operations team extensively evaluate and stress test every release for a week. Maybe nothing happens, which would be a win. However, hopefully something blows up, and by better understanding what blew up, you learn more about your system’s reality, and can create a more targeted safety net than your previous expensive “catch all” approach.

Let’s say you have been releasing software for ages without any significant quality problems. That’s great, but perhaps you’ve been playing it unnecessarily safe here too. Look at your QA process, pick some aspects, ideally some of the more expensive and slow ones, and remove them. See what happens. Nothing? Great. Something blows up? Yay! Now let’s dig into what exactly broke, what it teaches us about our software, and whether we have better ways of dealing with it than the processes we had in place before.

If we get good at this, we turn antifragile — randomness doesn’t kill us, it makes us stronger. Whatever event hits us, we know how to get better as a result. It trains the general organizational muscle of dealing with things going wrong.

More things will go wrong in the future, and likely more serious ones than the ones we introduce ourselves in our somewhat controlled manner. When that day comes, we will be much better prepared. And ideally, we will know how to come out of it better than we went in.

We become antifragile.