Probability is not Predictability
Weather Forecasts, Cancer Diagnoses, Coin Tosses

Barry Leybovich
Apr 23

Suppose the nightly news (or, more likely, your phone) tells you that tomorrow there is a 30% chance of rain.
Tomorrow, do you carry an umbrella?

It’s a seemingly benign question, but the way we interpret probabilities has real ramifications for how we make decisions in business and in life.
In the following, I’ll break down what probabilities and statistics like this mean at their core, and how we make decisions around this uncertainty.

When Statistics Become Probabilities

Let’s use the same example: tomorrow there is a 30% chance of rain.
Generally, statements like these use inferential statistics: they look at how frequently an event has occurred in the past and use inference to apply that frequency to the future.
Suppose that of the past 100 times a particular weather pattern occurred, it rained the next day 30 times.
If today a meteorologist sees that pattern, he or she may say:

“30% of the times I have seen this pattern, it rained the next day. So there is a 30% chance it will rain tomorrow.”
- Some meteorologist, somewhere

However, that’s not quite what the probability means (note: statistics are historical frequencies, probabilities are future likelihoods).
What it actually means is roughly:

“Based on the fact that 30 out of the past 100 times I’ve seen this pattern, it rained the next day, I am inferring that out of the next 100 times I see this pattern, it will rain on the following day 30 times. Given that I see the pattern today, there is a 30% chance that this is one of the days when I see the pattern and it rains the next day.”
- No meteorologist, anywhere

We see right away that this is a much more involved statement.
Importantly, we see how the historical observations (i.e., statistics) are translating into forward-looking probabilities.
Unfortunately we don’t know if tomorrow will be one of the days it will rain — knowing the probability doesn’t help us predict whether or not to bring an umbrella tomorrow.
After all, tomorrow isn’t happening 100 times, it’s happening once.
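
The gap between the long run and the single day can be sketched with a small simulation. This is only an illustration under assumptions that are not part of any real forecast: each "pattern day" is treated as an independent draw with a fixed 30% chance of rain, and `simulate_days` is a hypothetical helper name.

```python
import random


def simulate_days(p_rain, n_days, seed=0):
    """Return one True/False rain outcome per simulated pattern day.

    Assumption: days are independent and share the same rain probability.
    """
    rng = random.Random(seed)
    return [rng.random() < p_rain for _ in range(n_days)]


# Over many pattern days, the share of rainy days settles near 30%.
days = simulate_days(p_rain=0.30, n_days=10_000)
long_run_frequency = sum(days) / len(days)

# A single day is not 30% rainy: it either rains or it does not.
tomorrow = simulate_days(p_rain=0.30, n_days=1, seed=1)[0]

print(f"Share of rainy days over 10,000 pattern days: {long_run_frequency:.3f}")
print(f"Did it rain on one particular day? {tomorrow}")
```

The first number is stable from run to run; the second is just a yes or a no, which is all that tomorrow can ever be.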

Probabilities and Discrete Events

Weather is something that happens every day, and so it may be appropriate to use probabilities to characterize weather in the long run.
Let’s look at something discrete: a patient receiving a cancer diagnosis.
When a patient is diagnosed with cancer, something that they may commonly ask (or be told by their physician) is “what are my chances?” This is frequently used to help patients assess the seriousness of the diagnosis.
Intuitively, we understand that “80% of similar patients survive after 10 years” is a better prognosis than “8% chance to make it to 10 years”.
However, we ought to unpack this further to really try to understand what these mean.
To help us, I am using a model published by Memorial Sloan Kettering Cancer Center (one of several models they publish), which displays the survival rate 5 years after surgery to remove a colorectal cancer.
In the model, I input a hypothetical T3 stage cancer at N1 stage, having spread to 2 of 16 lymph nodes taken, with moderate differentiation, in a 27-year-old male.
The model correctly notes, and I shall add here as well, that “the prediction tools are not to be used as a substitute for medical advice, diagnosis, or treatment of any health condition or problem.”

So, what are the results?

“This number shows, as a percentage, the probability that you will survive at least 5 years after undergoing a complete resection (surgical removal of all cancerous tissue) for colon cancer. This probability means that for every 100 patients like you, we expect that 85 will survive 5 years after surgery and 15 will have died within 5 years.”

Let’s look at the second sentence first: “for every 100 patients like you, we expect that 85 will survive 5 years after surgery.” This closely matches our example with the rain, where our hypothetical meteorologist noted that for every 100 days with a specific weather pattern, there would be rain on 30 of the following days.
MSK’s model is using past patient data, and making an inference about the future likelihood of survival for 100 patients, of which our patient is just one.
Then (or beforehand even), MSK makes the jump and says, since 85 patients out of 100 will be alive after 5 years, our patient has an 85% chance of surviving to 5 years.
But there’s a problem with making this jump — it isn’t sound.
Our patient isn’t having 100 cancers of which he will survive 85 — he is having one single cancer.
Maybe it is useful for the doctor to know that if the doctor has 100 patients he can expect 85 to survive to five years, but that doesn’t make it useful to each individual patient.
That doesn’t provide any sort of predictability: it doesn’t tell a patient the crucial information of whether they are in the 85% or in the 15%, which of course a doctor will not be able to say.
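
To make the doctor's view concrete, here is a rough sketch of how the tally for a group of 100 behaves. It assumes, purely for illustration, that the 100 patients are independent and each has the same 85% chance of surviving, which lets us use the binomial distribution; this is not a description of how MSK's model works. `binomial_pmf` is a hypothetical helper, and the 80-to-90 window is an arbitrary choice.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n_patients = 100
p_survive = 0.85  # figure quoted from the model's output in the text

# What the doctor can expect across the whole group.
expected_survivors = n_patients * p_survive

# Chance that the group's tally lands between 80 and 90 survivors.
prob_80_to_90 = sum(
    binomial_pmf(k, n_patients, p_survive) for k in range(80, 91)
)

# What any single patient can experience: one of two outcomes.
individual_outcomes = {"survives 5 years", "does not survive 5 years"}

print(f"Expected survivors out of {n_patients}: {expected_survivors:.0f}")
print(f"Chance the tally falls between 80 and 90: {prob_80_to_90:.2f}")
print(f"Possible outcomes for one patient: {sorted(individual_outcomes)}")
```

Under these assumptions the group's tally is fairly tightly bunched around 85, so the number is informative about the group. What the tally cannot say is which of the two outcomes belongs to a given patient.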
This is something Paul Kalanithi wrote about in his memoir When Breath Becomes Air:

“Rather than saying, ‘Median survival is eleven months’ or ‘You have a ninety-five percent chance of being dead in two years,’ I’d say, ‘Most patients live many months to a couple of years.’ This was, to me, a more honest description. The problem is that you can’t tell an individual patient where she sits on the curve: Will she die in six months or sixty? I came to believe that it is irresponsible to be more precise than you can be accurate.”

In fact, even mathematicians argue about this.
Specifically, “frequentists” insist that probabilities apply only to events, like flipping a coin, that can be repeated extensively (ad infinitum in theory).
Since coin flips and weather happen regularly and repeatedly, you can assign a probability because over time you will make enough predictions to compare to the probability assigned.
For cancer prognoses, and other discrete events like the Super Bowl (and the coin toss at the Super Bowl) and presidential elections, however, there aren’t hundreds of events being repeated — there is only the one.
Furthermore, probability is not predictability.
Knowing that the probability that a fair coin will land on heads is 50% in no way lets you accurately predict the next flip.
Maybe you can predict on average how many flips out of 100 will be heads, but you won’t be able to predict the next flip with any certainty.
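
A quick simulation shows both halves of that claim. It is a sketch using Python's pseudo-random generator as a stand-in for a fair coin; the guessing rule and the 40-to-60 window are arbitrary choices made for illustration.

```python
import random

rng = random.Random(42)

# 100,000 flips of a simulated fair coin.
flips = [rng.choice("HT") for _ in range(100_000)]

# Predicting individual flips: a guesser who cannot see the coin
# is right only about half the time.
guesses = [rng.choice("HT") for _ in flips]
hit_rate = sum(g == f for g, f in zip(guesses, flips)) / len(flips)

# Predicting the aggregate: count heads in each block of 100 flips
# and check how often the count lands between 40 and 60.
heads_per_block = [
    flips[i:i + 100].count("H") for i in range(0, len(flips), 100)
]
share_in_window = sum(40 <= h <= 60 for h in heads_per_block) / len(heads_per_block)

print(f"Share of single flips guessed correctly: {hit_rate:.3f}")
print(f"Share of 100-flip blocks with 40 to 60 heads: {share_in_window:.3f}")
```

The forecast "about 50 heads per 100 flips" holds up well across blocks, while the hit rate on any single flip stays at roughly a coin toss.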
[Figure: Huffington Post’s 2016 presidential election probabilities, for posterity.]
This method of thinking may increase confusion because it seems less precise, and it is! But it makes us face the uncertainty ahead of us.
Too often we are overconfident because probabilities make unlikely events seem impossible, when in fact they are very much possible.
That is one reason that businesses are evolving towards emergent strategy.

Writes Duncan Watts in Everything Is Obvious:

“So if even [mathematicians] have trouble wrapping their heads around the meaning of the statement that ‘the probability of rain tomorrow is 60 percent,’ then it’s no surprise that the rest of us do as well.”

Further reading:
[Essay] The Median Isn’t the Message by Stephen Jay Gould
[Book] The Signal and the Noise by Nate Silver