Never start with a hypothesis
Lies, damned lies, and STAT101
Cassie Kozyrkov
Nov 30, 2018
Setting up hypothesis testing is a ballroom dance; its steps are action-action-worlds-worlds.
There’s a nice foxtrot rhythm to it.
Unfortunately, most people bungle it by starting on the wrong foot.
Here’s how to dance it right.
Step 1: Write down the default action
Statistics is the science of changing your mind under uncertainty, so the first order of business is to figure out what you’re going to do unless the data talk you out of it.
What do you commit to doing if you stay ignorant?
That’s why everything begins with a physical action/decision that you commit to doing if you don’t gather any (more) evidence.
This is called your default action.
Getting started is about actions, not beliefs.
What I’m asking you is, “What will you actually do if you walk away and remain ignorant of the information?”
“Gather data” is not an appropriate answer.
I’m prodding you to tell me which of the options you’d go for if I forced you to choose RIGHT NOW.
(Sorry I yelled.)
Step 2: Write down the alternative action
You’ll keep your decision binary, framed as do thing vs not do thing.
Whichever is not your default is your alternative action.
If binary feels too basic, the amazing variety of shapes on your screen speaks volumes about the power of binary options put together.
When you need to make a more complex decision, you can compound several hypothesis tests.
Let’s start with one at a time.
The first part is not about beliefs
I’m not asking you what you think you know, since as a good Frequentist (a.k.a. classical statistician, follower of the philosophy taught in most STAT101 classes) you don’t believe anything before you do the analysis.
You believe in Nothing.
Say it with me.
Bayesians are different when it comes to this, but if you’re feeling righteous Bayesian rage because you’re at philosophical odds with the logic here, take a deep breath and think of this as a lesson in knowing your enemy.
We’ll talk about the Bayesian way of life soon enough.
For now, the clue as to which kind of statistics you’re dealing with is in the jargon floating about.
If you hear “confidence interval” or “p-value”, hello Frequentist.
If you hear “credible interval” or “prior” or “posterior” (this is nothing rude, I promise), hello Bayesian.
If the first is more familiar, it’s because most educational programs teach Frequentist thinking before/instead of Bayesian thinking.
Dealing with no information
Which action to pick as your default is not a question for the numbers nerd.
It’s an MBA thing that’s the province of the team’s decision-maker.
You make it based on business sense while meditating in a closet.
Picking a default action requires business savvy and is the duty of the team’s decision-maker.
I’m asking you what you’d prefer to do if you stay ignorant, so you don’t need data to answer my question, though you may find a previous analysis inspiring.
Exploratory data analysis (EDA) is a sort of guided meditation, if you will.
It’s a tool to help decision-makers through this part.
Read this if you’re keen to dive deeper into how analysts and decision-makers work together.
EDA is pretty useful… if you can afford it.
The price is that all the data you used for it has to be nuked from orbit before you get to the statistics part.
For teams that aren’t flush with data, excluding any of it from inference is too expensive.
They’re entirely at the mercy of the mental span and brainstorming ability of their decision-maker.
Playing it safe
Imagine a decision about launching a new product.
The typical choice among decision-makers is to play it safe: don’t launch it unless the data give you a good reason to hit the green button.
If you don’t have data, you’d cheerfully mothball the project.
Maybe that’s a mistake, but hey — you can live with yourself.
You picked the default in a way that makes sticking to it the lesser evil as far as mistakes go.
The default action is the option that you find palatable under ignorance.
Other examples where society considers the default to be fairly obvious are innocent-until-proven-guilty (default = don’t convict if there’s no evidence), testing new medications (default = don’t approve if there’s no evidence), and scientific publication (default = don’t publish if there’s no evidence).
If you don’t have a default, you don’t need fancy statistics.
Although true indifference is fairly rare in the human animal, if you’d honestly be willing to flip a coin in the absence of data, then you don’t need statistics.
If your mind isn’t set, it can’t be changed.
Move along and read this instead.
Statistical inference is for decision-making under uncertainty.
If you have the answer already, go home.
To be dry about it, the first move involves framing your decision under no information and I hope you see that a decision-maker’s training is more relevant for this than a mathematician’s.
Dealing with full information
The next step in the dance is a bit strange.
STAT101 teaches it to you like it ain’t no thing, but it’s quite an intense mental leap.
Your job is to imagine all possible states of the world.
Yes, you heard me.
This is one of the decision-making tasks on the tougher end of the spectrum.
For non-trivial examples (stuff that’s slightly more involved than the baby examples you’ll see in class) it really takes a lot of mental discipline, creativity, flexibility, and concentration to do it well.
Once you’ve imagined all possible parallel worlds, it’s time to put each in one of two buckets: let’s call Bucket 1 “Worlds Where I’d Be Happy To Take My Default Action” and Bucket 2 “All The Other Ones.”
Step 3: Describe the null hypotheses (H0)
If you don’t like the 10-word name for Bucket 1, its technical name is null hypothesis.
Statistics classes teach you to test hypotheses, not form them.
They tend to be pre-made for you on those exams.
You might have heard shorthand descriptions of the null hypothesis like “status quo” or “the boring one” or “the thing we don’t want to prove.”
All of these are subtly inaccurate, lazy things a professor might teach a first-year college kid of untrustworthy mental sophistication.
But I trust you to handle the philosophical weirdness, so now you know that the null hypothesis describes the full collection of universes in which you’d happily choose your default action.
Let’s have a few moments of silence out of respect for the mental gymnastics we’re asking decision-makers to handle.
Not everyone has the mental flexibility it takes to zoom out.
Choose your decision-maker wisely.
Let’s have a quick reminder of where we stand.
The point here is that you’ve set things up so you’re committed to doing your default action as long as you know nothing, you know only a little, or you know with absolute certainty that you’re a citizen of a null hypothesis universe.
Hypotheses are like cockroaches.
When you see one, it’s never just the one.
There’s always more hiding somewhere nearby.
Step 4: Describe the alternative hypotheses (H1)
Bucket 2 is the alternative hypothesis and you put all the leftovers in there.
It’s everything that could be true when the null is false.
The two hypotheses are mathematical complements, which is another way of saying there’s no third bucket.
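To make the action-action-worlds-worlds setup concrete, here is a minimal Python sketch built on the product-launch example from earlier. Everything named in it is an illustrative assumption rather than something from the article: the `DecisionFrame` class, the idea of describing each world by a single number (a hypothetical "true lift" from launching), and the rule that you'd be happy not launching whenever that lift is zero or negative.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DecisionFrame:
    """Hypothetical container for the four steps: action, action, worlds, worlds."""
    default_action: str                          # Step 1: what you do if you stay ignorant
    alternative_action: str                      # Step 2: the only other option
    happy_with_default: Callable[[float], bool]  # Step 3: True for worlds in H0

    def in_null(self, world: float) -> bool:
        return self.happy_with_default(world)

    def in_alternative(self, world: float) -> bool:
        # Step 4: H1 is everything that is not in H0, so there is no third bucket.
        return not self.in_null(world)

    def action_if_world_known(self, world: float) -> str:
        # With full information, the bucket alone determines the action.
        return self.default_action if self.in_null(world) else self.alternative_action


# Assumed example: a "world" is the true lift (in percentage points) from launching.
launch = DecisionFrame(
    default_action="don't launch",
    alternative_action="launch",
    happy_with_default=lambda true_lift: true_lift <= 0.0,
)

candidate_worlds = [-2.0, -0.5, 0.0, 0.5, 3.0]
for w in candidate_worlds:
    bucket = "H0" if launch.in_null(w) else "H1"
    print(w, bucket, "->", launch.action_if_world_known(w))
```

Defining the alternative as the negation of the null, instead of writing it out separately, is one way to guarantee that the two buckets never overlap and never leave a world unassigned.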
In a nutshell, the alternative hypothesis is your answer to this: “What would it take to change your mind?”
Action (default)-action-worlds-worlds: the dance is complete.
We’re ready to add data, so what’s the game there?
The science of changing your mind
Between them, your hypotheses cover all possibilities.
They don’t overlap.
If I convince you (with data!) that you reside in one of the alternative hypothesis worlds… my goodness, what are you doing still considering the default action? Stop! It’s not a happy choice here.
If data convince you that you live in the alternative hypothesis world, switch actions.
You’d better switch from the default action to the alternative action: NOT doing your default.
This might spiral off into a series of other decisions, but one thing’s for sure: you’re not touching the default with a bargepole.
The data have changed your mind!
Active versus passive
A huge part of this decision context is that from the get-go, the actions are not the same to you.
You’re as fully open-minded as a Frequentist should be, but that doesn’t mean you don’t consider one of the actions more sensible or ethical under ignorance.
That’s the key.
If both actions are the same to you, read this instead.
The default is the action you’re okay with falling into passively whereas the alternative action is something you need to be actively convinced to do.
Dealing with partial information
If you only have a partial view of your data, you’ll have to deal with uncertainty.
That’s where the fancy-pants probability calculations come in.
They boil down to one sentence and it’s the same thing every time, as we’ll see in the next chapter.
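Without jumping ahead to the next chapter, here is a rough sketch of the kind of calculation involved, written in Python with only the standard library. It is an assumed toy setup, not the article's own example: each of `n` trial users either likes the new product or doesn't, the null hypothesis is that the true like-rate is at most 50%, and the 5% threshold (`alpha`) is a conventional choice, not something the article prescribes. The function names are made up for illustration.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def choose_action(likes: int, n: int, p_null: float = 0.5, alpha: float = 0.05) -> str:
    # p-value at the boundary of H0 (true rate == p_null): how surprising would
    # it be to see at least this many likes if the null were true?
    p_value = binomial_upper_tail(likes, n, p_null)
    # Stick with the default unless the data are surprising enough under the null.
    return "launch" if p_value <= alpha else "don't launch"


print(choose_action(likes=12, n=20))  # 12 of 20 is not surprising under the null
print(choose_action(likes=17, n=20))  # 17 of 20 would be very surprising under the null
```

Note the asymmetry in `choose_action`: the default is what you get whenever the evidence falls short, which mirrors the passive-versus-active framing above.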
The point is that you won’t ever know for sure which of the worlds is your world.
That’s why it’s important that your default action is chosen in a way that accurately reflects your values.
How do you check? If you’ve framed things right, a Type I error should feel worse than a Type II error.
In other words: The idea of incorrectly leaving your cozy comfort zone (default action) should be more painful than the idea of incorrectly sticking to it.
If that’s not true, you haven’t really been honest with yourself about which action is which.
Let’s take it again from the top!
There’s no magic that makes certainty out of uncertainty.
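The Type I versus Type II check described in this section can be sketched as a small lookup. This is only an illustration: the function name `classify` and the numeric "pain" scores are made-up assumptions standing in for a decision-maker's own judgment.

```python
def classify(world_in_null: bool, took_default: bool) -> str:
    """Label a decision once the true world is (hypothetically) revealed."""
    if world_in_null and not took_default:
        return "Type I error"   # left the default when you should have stayed
    if not world_in_null and took_default:
        return "Type II error"  # stayed with the default when you should have left
    return "correct"


# Assumed, subjective pain scores a decision-maker might assign (higher = worse).
pain = {"Type I error": 10, "Type II error": 3, "correct": 0}

# Sanity check on the framing: wrongly leaving the default should hurt more
# than wrongly sticking to it. If it doesn't, the actions are probably swapped.
framing_is_honest = pain["Type I error"] > pain["Type II error"]
print(framing_is_honest)
```

If `framing_is_honest` came out false for your own scores, that would be the cue to go back to Step 1 and swap which action is the default.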
Actions speak loudest
In order to be able to set up statistical hypotheses, you must know what your default action is.
The entire thing falls apart when you start elsewhere.
Unfortunately, picking your default action incorrectly is a common mistake among those who learn the math without absorbing any of the philosophy.
It’s also a symptom of a team where the decision-maker is missing in action and the numbers nerds are out en masse.
Picking your default action incorrectly is a painfully common mistake.
It’s everywhere!
A surefire way to set yourself up for failure is to start with the hypotheses instead of the actions.
That’s a vestige of the way the class exercises are structured (because statistics classes don’t teach you the decision-maker’s role, those things are almost always done for you by the professor), but in real life it amounts to getting off on the wrong foot.
With all the effort you’re about to put into the rest of it, wouldn’t it be a shame to faceplant barely out of the gate?
Always start with the default action.
If you’re craving these ideas in example form (with aliens!), read on here.
Don’t faceplant right out of the gate by starting with the hypotheses; always start with the default action.