Understanding Confidence Interval

Remember that for a frequentist, there is one true population mean, and it exists independently of how many samples you draw.

We have precisely calculated the population mean to be 0.


We have already simulated one million samples.

In the code below, we build a confidence interval for each sample, and check whether the population mean falls within the confidence interval.

    # make 95% confidence interval
    z = 1.96
    success = 0
    for sample in tqdm(samples):
        x = np.mean(sample)
        s = np.std(sample)
        up = x + z * s / np.sqrt(sample_size)
        lo = x - z * s / np.sqrt(sample_size)
        if lo <= mean and mean <= up:
            success += 1
    print("False positive rate: %.3f" % (1 - success / len(samples)))

    False positive rate: 0.056

It turns out that 94.4% of the confidence intervals capture the population mean.

This is what a confidence interval really means: if we repeat the sampling procedure infinitely many times, about 95% of the resulting confidence intervals will capture the population mean.

In other words, approximately 5% of the confidence intervals fail to capture the population mean.
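For readers who want to try the coverage check without the notebook's setup, here is a minimal, self-contained sketch using only the Python standard library. The population (normal, mean 0, standard deviation 1), the sample size of 100, and the 2,000 repetitions are all assumptions chosen for illustration; they are not the article's one million samples, so the printed rate will differ slightly from the figure quoted above.

```python
import math
import random
import statistics


def coverage_rate(n_samples=2000, sample_size=100, pop_mean=0.0,
                  pop_sd=1.0, z=1.96, seed=0):
    """Fraction of z-based 95% intervals that contain the known population mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_samples):
        # Draw one sample from the assumed normal population.
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(sample_size)]
        x = statistics.fmean(sample)
        # pstdev divides by n, matching the default behavior of np.std.
        s = statistics.pstdev(sample)
        half_width = z * s / math.sqrt(sample_size)
        # The population mean is fixed; only the interval changes per sample.
        if x - half_width <= pop_mean <= x + half_width:
            hits += 1
    return hits / n_samples


rate = coverage_rate()
print("Coverage: %.3f" % rate)
print("Miss rate: %.3f" % (1 - rate))
```

With these assumed settings the coverage should land close to, though not exactly at, 95%.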

In the graph below, this happens when the blue dots (upper bounds) fall below the population mean, or when the orange dots (lower bounds) rise above the population mean.

The difference is nuanced.

The key realization is that the population mean (the horizontal line) never moves; it is the boundaries of the confidence interval that move from sample to sample.

Now you see why it makes no sense to say “the population mean falls within the confidence interval 95% of the time.” The population mean does not fall here and there.

It never moves.
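The two ways an interval can miss (its upper bound landing below the population mean, or its lower bound landing above it) can also be counted directly. The sketch below is illustrative only: the helper `interval_bounds`, the normal population with mean 0 and standard deviation 1, the sample size of 100, and the 2,000 repetitions are assumptions, not taken from the article's notebook.

```python
import math
import random
import statistics


def interval_bounds(sample, z=1.96):
    """Return (lower, upper) bounds of a z-based interval for the mean."""
    x = statistics.fmean(sample)
    s = statistics.pstdev(sample)  # divides by n, like np.std's default
    half_width = z * s / math.sqrt(len(sample))
    return x - half_width, x + half_width


rng = random.Random(1)
pop_mean = 0.0        # fixed: this value never changes inside the loop
n_samples = 2000
upper_below = 0       # intervals that sit entirely below the population mean
lower_above = 0       # intervals that sit entirely above the population mean

for _ in range(n_samples):
    sample = [rng.gauss(pop_mean, 1.0) for _ in range(100)]
    lo, up = interval_bounds(sample)  # these bounds move from sample to sample
    if up < pop_mean:
        upper_below += 1
    elif lo > pop_mean:
        lower_above += 1

misses = upper_below + lower_above
print("Upper bound below the mean:", upper_below)
print("Lower bound above the mean:", lower_above)
print("Miss rate: %.3f" % (misses / n_samples))
```

Note that `pop_mean` is assigned once and never updated; everything that varies in the loop is a property of the sample.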

This article focuses on the confidence interval for the population mean.

We also often encounter confidence intervals for proportions (z-test) and for linear regression parameters.

Still, the interpretation is the same.
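As a rough illustration of the same interpretation applied to a proportion, the sketch below repeats the coverage experiment with a normal-approximation (Wald) interval. The true proportion of 0.3, the sample size of 200, the 2,000 repetitions, and the function name `proportion_coverage` are assumptions made up for this example; the article does not specify which interval for a proportion it has in mind.

```python
import math
import random


def proportion_coverage(p=0.3, sample_size=200, n_samples=2000,
                        z=1.96, seed=2):
    """Fraction of Wald intervals that contain the fixed true proportion p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Simulate sample_size Bernoulli(p) trials and count the successes.
        successes = sum(rng.random() < p for _ in range(sample_size))
        p_hat = successes / sample_size
        # Normal-approximation standard error of the sample proportion.
        half_width = z * math.sqrt(p_hat * (1 - p_hat) / sample_size)
        if p_hat - half_width <= p <= p_hat + half_width:
            hits += 1
    return hits / n_samples


prop_rate = proportion_coverage()
print("Coverage for the proportion: %.3f" % prop_rate)
```

As before, the true proportion stays put and the interval is what changes from one sample to the next.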

Conclusion

We live in a Bayesian world.

One could easily be forgiven for saying “we are 95% confident that…” Your managers don’t want to repeat one million experiments, after all.

What we often do is draw one sample and compute one confidence interval, but when we tell others what it means, we should not forget that the confidence interval is a frequentist concept.

Notebook: confidence_interval.ipynb
