9.4 - Comparing Two Proportions | STAT 415

Time magazine reported the result of a telephone poll of 800 adult Americans. The question posed of the Americans who were surveyed was: "Should the federal tax on cigarettes be raised to pay for health care reform?" The results of the survey were:

Non- Smokers Smokers

\(n_1 = 605\) \(y_1 = 351 \text { said "yes"}\) \(\hat{p}_1 = \dfrac{351}{605} = 0.58\)

\(n_2 = 195\) \(y_2 = 41 \text { said "yes"}\) \(\hat{p}_2 = \dfrac{41}{195} = 0.21\)

Is there sufficient evidence at the \(\alpha = 0.05\), say, to conclude that the two populations — smokers and non-smokers — differ significantly with respect to their opinions?

Answer

If \(p_1\) = the proportion of the non-smoker population who reply "yes" and \(p_2\) = the proportion of the smoker population who reply "yes," then we are interested in testing the null hypothesis:

\(H_0 \colon p_1 = p_2\)

against the alternative hypothesis:

\(H_A \colon p_1 \ne p_2\)

Before we can actually conduct the hypothesis test, we'll have to derive the appropriate test statistic.

Theorem

The test statistic for testing the difference in two population proportions, that is, for testing the null hypothesis \(H_0:p_1-p_2=0\) is:

\(Z=\dfrac{(\hat{p}_1-\hat{p}_2)-0}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}\)

where:

\(\hat{p}=\dfrac{Y_1+Y_2}{n_1+n_2}\)

the proportion of "successes" in the two samples combined.

Proof

Recall that:

\(\hat{p}_1-\hat{p}_2\)

is approximately normally distributed with mean:

\(p_1-p_2\)

and variance:

\(\dfrac{p_1(1-p_1)}{n_1}+\dfrac{p_2(1-p_2)}{n_2}\)

But, if we assume that the null hypothesis is true, then the population proportions equal some common value p, say, that is, \(p_1 = p_2 = p\). In that case, then the variance becomes:

\(p(1-p)\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)\)

So, under the assumption that the null hypothesis is true, we have that:

\( {\displaystyle Z=\frac{\left(\hat{p}_{1}-\hat{p}_{2}\right)- \color{blue}\overbrace{\color{black}\left(p_{1}-p_{2}\right)}^0}{\sqrt{p(1-p)\left(\frac{1}{n_{1}}+\frac{1}{n_{2}}\right)}} } \)

follows (at least approximately) the standard normal N(0,1) distribution. Since we don't know the (assumed) common population proportion p any more than we know the proportions \(p_1\) and \(p_2\) of each population, we can estimate p using:

\(\hat{p}=\dfrac{Y_1+Y_2}{n_1+n_2}\)

the proportion of "successes" in the two samples combined. And, hence, our test statistic becomes:

\(Z=\dfrac{(\hat{p}_1-\hat{p}_2)-0}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}\)

as was to be proved.

Tag » When To Use A Two Proportion Z Test