Zero cost and maximum cost

In a video game I can die again and again and again. Thus, trying to make my way through a tricky section of a first-person shooter’s single-player campaign is a trial-and-error affair. I just keep dying until I hit upon a suitable strategy. Conversely, the virtue of real life is that I can’t die again and again and again. My actions and decisions in the present are woven into the fabric of an immutable past that shapes my future to some degree. If I’m not getting my way in a salary negotiation, I can’t just open-palm slap the person responsible for giving me a raise. I can’t experiment with blackmail or intimidation. I can’t do a trial run of extreme flattery or exaggerated indifference. I have one shot and I must live with the consequences of my actions.

Let me paint a different picture. A pilot in training can fly a real plane and he can fly a simulator. He can’t afford to crash the real plane, but he can crash in the simulator without consequence. Thus, he flits between the two extremes necessary for maximal learning: an environment which carries an infinitesimal cost for failure, and an environment which carries an unbearably high cost for failure.

I only began thinking about this because I noticed a discrepancy in my practice of Brazilian jiu-jitsu. Sometimes when I roll, I play. I experiment. I’ll give up an advantageous position, or let an opportunity for a submission slip by just to see what happens. Because I’m not focusing on desirable or undesirable outcomes, I do the unorthodox. Other times, I don’t play. I roll to win, or in most cases, to survive. In these situations, no position, advantage or opportunity is yielded without resistance. Movement is deliberate, submission attempts are executed with vigour, and I try to give nothing away and to make it as hard as possible for my sparring partner to gain anything.

In Antifragile, Nassim Taleb talks about “barbell strategies”. For example, a barbell investing strategy would be to invest 90% of a portfolio in low-risk assets (ones that keep up with inflation, for example) and 10% in high-risk ventures. A barbell strategy in the domain of health and fitness would be to engage in low-effort activity the majority of the time (walking, cycling, swimming) and near-maximal effort activity for a minimal amount of time (heavy deadlifts, sprinting). A barbell strategy in the domain of learning is what I described above: the creation of two learning environments, one which holds zero cost for failure and one which holds maximum cost for failure.

Say I want to learn the art of investing. The best strategy, a barbell strategy, would be to create multiple, divergent portfolios using fake money: pretend I had a certain amount of currency, decide how I would invest it, and then monitor the hypothetical returns over a certain period. Then, either afterwards or in parallel, I would put my money where my mouth is and invest my own currency in what seemed like the best way. Bad choice equals lost money. Good decision equals gain.

The opposite of the barbell strategy is moderation. In terms of learning environments, it would be having one single environment in which there’s a moderate cost for failure. In such a place, one can’t experiment, because every failure carries weight. But one also never feels true pressure because all failures are endurable. Not good. We learn best in two situations. First, when we can afford to make as many mistakes as possible—when mistakes have positive utility—and second, when we can’t afford to make any error whatsoever—when we are compelled by the cost of error to deploy all our resources to achieve the best outcome.