To solve such problems we always start with a null hypothesis, and we assume that the null hypothesis is true until proven otherwise.
For our example, the null hypothesis is that there is no statistically significant increase in the average marks scored by students between 2009 and 2019, that is, μ₁₉ = μ₀₉.
From the given data, we can compute the probability of observing a result at least this extreme if the null hypothesis were true. If that probability turns out to be very low, we reject the null hypothesis in favor of the research hypothesis; otherwise, we reject the research hypothesis.
We’ll soon learn how to compute the desired probability, but for now we should set up our hypothesis about the given data as follows:
Equation 2: Setting up the hypothesis
Step 2: Select appropriate test statistic
To do this we can invoke the central limit theorem (CLT), which states that the distribution of the means of sample distributions is normal with mean μ and standard deviation σ/√n, i.e., Z(μ, σ/√n).
You can read more about the central limit theorem from my earlier post.
Although we do not have access to the population parameters μ, σ, and n, we know from the CLT that μ₀₉ and μ₁₉ belong to a normal distribution with unknown parameters.
Here comes the trickiest part of hypothesis testing, which is to transform (or map) the values of μ₀₉ and μ₁₉ from an unspecified normal distribution, Z(μ, σ/√n), to the standard normal distribution, i.e., Z(0, 1).
Essentially, we want to center the mean to zero and scale the standard deviation to 1 for the points, μ₀₉ and μ₁₉, in Z(μ, σ/√n).
This can be done as follows for our hypothesis:
Equation 3: Mapping our hypothesis to a standard normal distribution, Z(0,1)
Even after mapping the values (μ₀₉ and μ₁₉) to Z(0,1), the unknowns μ, σ, and n still appear in our hypothesis.
We can easily remove μ by rearranging the terms in our hypothesis as follows:
Equation 4: Rearranging the terms in our hypothesis to get rid of the unknown μ
Solving the left-hand side of our hypothesis in Equation 4 gives us:
Equation 5: Hypothesis without the unknown μ
To get rid of σ, we should replace it with the larger of the two sample standard deviations. In our example, σ₁₉ > σ₀₉, so we’ll replace σ with σ₁₉ and use the corresponding n₁₉ in the denominator of our hypothesis.
Hence, our final hypothesis in terms of the known sample distribution parameters can be written as:
Equation 6: Our hypothesis in terms of the known sample distribution parameters
Now, we can plug the values of the respective parameters from Equation 1 into Equation 6 to get the numerical values for our hypothesis.
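For reference, the chain of transformations in Equations 3–6 can be sketched compactly (the original equations are images, so this is a reconstruction using the same symbols as the surrounding text):

```latex
% Equation 3: standardize both means using the CLT
z_{09} = \frac{\mu_{09} - \mu}{\sigma/\sqrt{n}}, \qquad
z_{19} = \frac{\mu_{19} - \mu}{\sigma/\sqrt{n}}

% Equations 4-5: subtracting cancels the unknown \mu
z_{19} - z_{09} = \frac{\mu_{19} - \mu_{09}}{\sigma/\sqrt{n}}

% Equation 6: replace \sigma and n with the larger sample's values
z = \frac{\mu_{19} - \mu_{09}}{\sigma_{19}/\sqrt{n_{19}}}
```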
Equation 7: Numerical values of our hypothesis testing setup
Caution: Everyone knows that numerically 2.4 is greater than 0 in Equation 7, but please note that hypothesis testing is not a numerical comparison problem; it is a statistical problem. This means that we are asking whether 2.4 is statistically significantly greater than 0 (not just whether 2.4 is greater than 0). Here, z = 2.4 is our test statistic.
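As a quick sanity check, the test statistic can be computed in a few lines of Python. The sample values below are placeholders (the actual numbers come from Equation 1, which is not reproduced here); they are chosen only so that the result matches z = 2.4:

```python
import math

# Hypothetical sample statistics (placeholders for Equation 1's values):
mean_09 = 60.0   # sample mean of the 2009 group
mean_19 = 63.0   # sample mean of the 2019 group
std_19 = 12.5    # larger sample standard deviation (sigma_19 > sigma_09)
n_19 = 100       # size of the 2019 sample

# Equation 6: z = (mu_19 - mu_09) / (sigma_19 / sqrt(n_19))
z = (mean_19 - mean_09) / (std_19 / math.sqrt(n_19))
print(z)  # 2.4 for these placeholder values
```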
By statistically significant we precisely mean: if we select multiple independent random samples, of 100 students each, who appeared in our aptitude test in 2019, what is the probability that most of those random samples will have z > 2.4 (implicitly meaning μ₁₉ > μ₀₉)? Our random sample S₁₉ contains only one set of 100 students, and comparing its mean (μ₁₉) with μ₀₉ gives only one test statistic (z = 2.4). But we are interested in testing a general hypothesis: if we obtain the marks of another group of 100 randomly selected students from the 2019 data, what is the likelihood that the average of the new group is also greater than μ₀₉? That is why we test for z > 2.4 and not for z = 2.4.
Notice that I highlighted “most” because we should quantify what “most” means in our statistical hypothesis testing problem, and we’ll do that in the next step.
But first, let’s calculate the probability of z > 2.4, that is p(z > 2.4), from the table of the standard normal distribution, Z(0,1). At z = -2.4 (the bell curve is symmetric, so the values at -2.4 and 2.4 are equal) we get p(z > 2.4) = 0.0082. You can also use this calculator to compute p(z > 2.4).
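If you prefer code to a printed Z-table, the same tail probability can be computed directly with the standard library, since the standard normal survival function can be written in terms of the complementary error function:

```python
import math

def p_upper(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

print(round(p_upper(2.4), 4))  # 0.0082
```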
Step 3: Select a level of significance (α)
This is the parameter we use to decide the statistical significance of our hypothesis test.
To do so, let’s first look at the probability distribution of Z(0,1) as shown in Figure 1.
Figure 1: A plot of the standard normal distribution, Z(0,1)
From Figure 1, it is evident that in a standard normal distribution roughly 68% of the data lies within 1 standard deviation, 95% within 2 standard deviations, and 99.7% within 3 standard deviations of the mean (= 0).
In hypothesis testing, we select a level of significance α such that we accept the research hypothesis only if at least (1 − α) × 100% of random samples would support it.
Let’s understand this: remember I highlighted “most” in the last paragraph of Step 2? That “most” is quantified in terms of the level of significance (α), in the sense that if we randomly select multiple groups, of 100 students each, who appeared in the exam in 2019, then we accept our research hypothesis (μ₁₉ > μ₀₉, equivalently z > 2.4) only if (1 − α) × 100% of those groups have their means greater than μ₀₉. Usually, α = 0.05 is selected to test the statistical significance of a hypothesis test. This means that we want (1 − α) = 1 − 0.05 = 95% of random samples to support our research hypothesis.
Since we usually have one z-statistic for our hypothesis testing problem, we simply require p(z > z-statistic) < α to accept the research hypothesis; otherwise we reject it in favor of the null hypothesis.
For our example, if we select α = 0.05, then we should accept the research hypothesis only if p(z > 2.4) = 0.0082 < 0.05, which is indeed the case. By accepting the research hypothesis we mean that, at the 0.05 level of significance (α = 0.05), we conclude that μ₁₉ is statistically significantly greater than μ₀₉.
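Putting the pieces together, the decision rule is just a comparison of the p-value against α:

```python
import math

alpha = 0.05  # chosen level of significance
z = 2.4       # test statistic from Equation 7

# p-value for an upper-tailed test: P(Z > z)
p_value = 0.5 * math.erfc(z / math.sqrt(2))

if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject the null hypothesis")
else:
    print(f"p = {p_value:.4f} >= {alpha}: fail to reject the null hypothesis")
```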
To summarize, a three-step procedure is adopted for statistical hypothesis testing:
Set up the hypothesis: The user formulates a research hypothesis (for example, μ₁₉ > μ₀₉) and selects a null hypothesis (μ₁₉ = μ₀₉) for comparison.
Select appropriate test statistic: A test statistic is a single number that summarizes the sample information.
For instance, we used the Z statistic in our discussion, and hence our example hypothesis test is popularly known as the Z-test.
However, depending on the nature of the problem, a different test statistic should be appropriately selected.
For example, a t-statistic (computed from Student’s t-distribution) should be used for problems with a small sample size (n < 30), and the corresponding hypothesis test is known as the t-test.
Select a level of significance (α): To establish the statistical significance of a hypothesis test, the user selects a suitable level of significance.
Since the level of significance is a probability measure, its value ranges between 0 and 1.
The typical values of α used in the scientific literature are 0.05 and 0.10, with α = 0.05 being the most commonly used value.
Finally, depending on the nature of the research hypothesis, a hypothesis test can be of three types:
Upper-tailed test: A hypothesis test where the research hypothesis states that the value of the parameter has increased in comparison to the null hypothesis (for example, μ₁₉ > μ₀₉).
Lower-tailed test: A hypothesis test where the research hypothesis states that the value of the parameter has decreased in comparison to the null hypothesis (for example, μ₁₉ < μ₀₉).
Two-tailed test: A hypothesis test where the research hypothesis states that the value of the parameter has changed in comparison to the null hypothesis (for example, μ₁₉ ≠ μ₀₉).
Notably, a two-tailed test is a test of inequality rather than of an increase or decrease in the value of the parameter of interest.
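The three variants differ only in which tail (or tails) of Z(0,1) contributes to the p-value. A minimal sketch, using the same error-function identity for the standard normal CDF:

```python
import math

def phi(z):
    """Standard normal CDF, P(Z <= z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def p_value(z, tail):
    """p-value of a z-statistic for an upper-, lower-, or two-tailed test."""
    if tail == "upper":   # research hypothesis: parameter increased
        return 1.0 - phi(z)
    if tail == "lower":   # research hypothesis: parameter decreased
        return phi(z)
    if tail == "two":     # research hypothesis: parameter changed
        return 2.0 * (1.0 - phi(abs(z)))
    raise ValueError(f"unknown tail: {tail}")

for tail in ("upper", "lower", "two"):
    print(tail, round(p_value(2.4, tail), 4))
```

For our example statistic z = 2.4, the upper-tailed p-value is the 0.0082 computed earlier, and the two-tailed p-value is exactly twice that, since both tails beyond |z| count as "at least this extreme".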