The plot below shows the distributions under the null and alternative hypotheses.
Analytical Approach

Let’s take the plot above and annotate what we are actually seeing.
The critical value p* for determining statistical significance falls between p_null and p_alt.
Therefore the distance from p_null to p_alt can be subdivided into two parts:
The distance from p_null to p*
The distance from p_alt to p*
We can express these distances in terms of the z score for a one-tailed test.
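For the distance from p_null to p*:

$$p^* - p_{null} = z_{1-\alpha} \, SE_{null}$$

For the distance from p* to p_alt:

$$p_{alt} - p^* = z_{1-\beta} \, SE_{alt}$$

Here z_{1-α} is the z score for the significance level α and z_{1-β} is the z score for the desired power 1-β.
Summing the two we obtain:

$$p_{alt} - p_{null} = z_{1-\alpha} \, SE_{null} + z_{1-\beta} \, SE_{alt}$$

The standard error SE can be expressed as:

$$SE = \frac{s}{\sqrt{n}}$$

Where:
n = the number of observations
s_null = the standard deviation of the difference in proportions between the two groups under the null hypothesis
s_alt = the standard deviation of the difference in proportions under the desired detectable difference (12% here)

Substituting in and solving for n, the number of observations is given by:

$$n = \left( \frac{z_{1-\alpha} \, s_{null} + z_{1-\beta} \, s_{alt}}{p_{alt} - p_{null}} \right)^2$$

Python Script

Implementing the above in Python code, we obtain the function below.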
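As a minimal sketch of what such a function could look like: the function name, the defaults (α = 0.05, power = 0.80) and the baseline rate of 10% are assumptions, chosen because together with the 12% alternative they reproduce the sample size quoted in this post. The formulas for s_null and s_alt are the usual ones for the difference of two independent proportions with one observation per group, which is also an assumption about how the original was written.

```python
import math

from scipy import stats


def experiment_size(p_null, p_alt, alpha=0.05, beta=0.20):
    """Observations needed per group for a one-tailed test (illustrative sketch)."""
    # z scores for the significance level and for the desired power (1 - beta)
    z_alpha = stats.norm.ppf(1 - alpha)
    z_beta = stats.norm.ppf(1 - beta)

    # Standard deviations of the difference in proportions at one observation
    # per group (assumed formulas, see the lead-in)
    s_null = math.sqrt(2 * p_null * (1 - p_null))
    s_alt = math.sqrt(p_null * (1 - p_null) + p_alt * (1 - p_alt))

    # Solve p_alt - p_null = (z_alpha * s_null + z_beta * s_alt) / sqrt(n) for n
    n = ((z_alpha * s_null + z_beta * s_alt) / (p_alt - p_null)) ** 2

    # Round up: a fractional observation cannot be collected
    return math.ceil(n)


print(experiment_size(0.10, 0.12))  # 2863 with the assumed inputs
```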
Note that we round up the sample size at the end.
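We obtain the same sample size, 2863, as in the simulation approach!

StatsModels

Alternatively, we can use free Python packages such as StatsModels, which uses Cohen’s h as a measure of distance between two probabilities.
Its power solver defaults to a two-tailed test, so a one-tailed alternative has to be requested explicitly to stay comparable with the analysis above.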
The code below is succinct and easy to implement.
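One way this might look with StatsModels is sketched here. The inputs (10% baseline, 12% alternative, α = 0.05, power = 0.80) and the one-tailed setting `alternative='larger'` are assumptions, chosen because they are consistent with the figure reported in this post; the original snippet may have been written differently.

```python
import math

import numpy as np
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Cohen's h for the assumed rates: 12% (alternative) vs 10% (null)
h = proportion_effectsize(0.12, 0.10)

# Solve for the number of observations in the first group (equal group sizes)
n = NormalIndPower().solve_power(
    effect_size=h,
    alpha=0.05,
    power=0.80,
    alternative="larger",  # one-tailed; the default is "two-sided"
)

# The solver may return a float or a one-element array, so normalise first
n_required = math.ceil(np.asarray(n).item())
print(n_required)  # 3021 with the assumed inputs
```

Dropping `alternative="larger"` falls back to the two-sided default and yields a larger requirement.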
We obtain a larger sample size than with the analytical approach: 3021 versus 2863.
Conclusion

We can estimate the sample size using a simulation, rigorous analysis, and specialized packages.
There are multiple sample size calculators online that we can use to obtain similar results.
In the end all methods should give a good ballpark estimate of how many observations we might need.
This in turn gives us the scope of the experiment: how much time it needs and whether it is actually worth running.
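To make the time estimate concrete, here is a rough sketch of the arithmetic. The helper name `experiment_days` and the traffic figure of 500 visitors per day are hypothetical, and the sketch assumes the sample size is required per group with two groups in the experiment.

```python
import math


def experiment_days(n_per_group, daily_visitors, n_groups=2):
    """Days needed to collect the sample (hypothetical helper, illustrative only)."""
    # Total observations across all groups, assuming n applies to each group
    total_observations = n_per_group * n_groups

    # Round up: a partial day still has to be run
    return math.ceil(total_observations / daily_visitors)


# 500 visitors per day is a made-up figure for illustration
print(experiment_days(2863, daily_visitors=500))  # 12
```

If the resulting duration is longer than the team can afford, that is a signal to revisit the detectable difference or the power rather than to cut the experiment short.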
Keep in mind the end goal: Set up a practically feasible experiment.
This post is based on an exercise required for the Udacity Data Scientist Nanodegree.
I recommend enrolling in the course to gain a deeper understanding of the topics shown here.
In addition, feel free to check out this post on statistical significance.