Proportions


Political Affiliation and Immigration

The dataset below contains the results of a poll based on a random sample. It has two variables: response, giving each respondent's answer to the poll question, and political, giving their self-reported political ideology.

A number of randomly sampled registered voters from Tampa, FL were asked if they thought workers who have illegally entered the US should be (i) allowed to keep their jobs and apply for US citizenship, (ii) allowed to keep their jobs as temporary guest workers but not allowed to apply for US citizenship, or (iii) lose their jobs and have to leave the country.

## Copy and run this code to create the table
library(ggplot2)
library(dplyr)
immigration <- read.csv("https://collinn.github.io/data/immigrationpoll.csv")

with(immigration, table(response, political)) %>% addmargins(1)
##                        political
## response                conservative liberal moderate
##   Apply for citizenship           57     101      120
##   Guest worker                   121      28      113
##   Leave the country              179      45      126
##   Not sure                        15       1        4
##   Sum                            372     175      363
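The questions below work with proportions within a political group rather than raw counts. As a minimal sketch, the counts can be converted to within-group proportions with `prop.table()`. The matrix `counts` is re-typed from the output above so the example runs without downloading the CSV; the names `counts` and `col_props` are illustrative and not part of the lab.

```r
# Counts copied from the table above (the Sum row is left out).
counts <- matrix(
  c( 57, 101, 120,
    121,  28, 113,
    179,  45, 126,
     15,   1,   4),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    response  = c("Apply for citizenship", "Guest worker",
                  "Leave the country", "Not sure"),
    political = c("conservative", "liberal", "moderate")
  )
)

# margin = 2 divides each cell by its column total, so each column
# shows the distribution of responses within one political group.
col_props <- prop.table(counts, margin = 2)
round(col_props, 3)
```

Each column of `col_props` sums to 1, so a single cell can be read directly as a sample proportion for that group.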

Question 1

We will make a confidence interval to answer the question: “What proportion of conservative Tampa voters support workers being ‘allowed to keep their jobs and apply for US citizenship’?”

Part A: Describe the parameter of interest, including which symbol we use for it.

Part B: What is the corresponding value of the statistic and what symbol do we use for it?

Part C: What is the sample size for the group of conservatives?

Part D: Check the conditions for making a confidence interval.

Part E: Create a 95% confidence interval for the parameter.

Part F: Interpret the confidence interval.
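The steps in Parts B through E can be sketched in R. The example below uses the moderate group (120 of 363 chose "Apply for citizenship") rather than the conservatives, so the question itself stays open. It assumes the normal-approximation interval \(\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}\) and a condition of at least 10 successes and 10 failures; your course may use a different interval or threshold. The function `prop_ci` is a hypothetical helper, not something the lab provides.

```r
# Hypothetical helper: normal-approximation CI for one proportion.
# x = number of "successes", n = group size.
prop_ci <- function(x, n, conf_level = 0.95) {
  p_hat  <- x / n                                # sample proportion
  z_star <- qnorm(1 - (1 - conf_level) / 2)      # critical value
  se     <- sqrt(p_hat * (1 - p_hat) / n)        # standard error
  c(estimate = p_hat,
    lower    = p_hat - z_star * se,
    upper    = p_hat + z_star * se)
}

# Moderates who chose "Apply for citizenship": 120 of 363.
x <- 120
n <- 363

# Assumed condition check: at least 10 successes and 10 failures.
c(successes = x, failures = n - x) >= 10

ci_mod <- prop_ci(x, n, conf_level = 0.95)
round(ci_mod, 3)
```

The same call with the conservative counts from the table gives the interval asked for in Part E.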

Question 2

Let’s see if there is a difference between conservatives and liberals in the proportion who support workers being ‘allowed to keep their jobs and apply for US citizenship.’

Part A: What is the value of the statistic of interest? (Make ‘Liberal’ the first group.)

Part B: Check the conditions to make a confidence interval.

Part C: Make a 90% confidence interval.

Part D: Interpret the confidence interval.

Part E: According to the CI, is it plausible there is no difference between the groups?
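A difference in proportions can be handled the same way. The sketch below compares liberals with moderates (not the liberal and conservative comparison asked for above), assuming the unpooled standard error \(\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}\) and a normal critical value. The function `diff_prop_ci` is a hypothetical helper introduced only for illustration.

```r
# Hypothetical helper: normal-approximation CI for p1 - p2,
# using the unpooled standard error.
diff_prop_ci <- function(x1, n1, x2, n2, conf_level = 0.95) {
  p1 <- x1 / n1
  p2 <- x2 / n2
  est    <- p1 - p2                               # statistic of interest
  se     <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
  z_star <- qnorm(1 - (1 - conf_level) / 2)
  c(estimate = est,
    lower    = est - z_star * se,
    upper    = est + z_star * se)
}

# Liberals (101 of 175) minus moderates (120 of 363), 90% confidence.
ci_diff <- diff_prop_ci(101, 175, 120, 363, conf_level = 0.90)
round(ci_diff, 3)

# "No difference" is plausible only if 0 lies inside the interval.
zero_inside <- ci_diff[["lower"]] <= 0 && ci_diff[["upper"]] >= 0
zero_inside
```

Putting the liberal group first means a positive estimate indicates higher support among liberals.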


Supercommuters

The fraction of workers who are considered “supercommuters”, because they commute more than 90 minutes to get to work, varies by state. Suppose that 1% of Nebraska residents and 6% of New York residents are supercommuters. Now suppose that we plan a study to survey 1000 people from each state, and we will compute the sample proportions \(\hat{p}_{NE}\) for Nebraska and \(\hat{p}_{NY}\) for New York.

  1. What are the mean and standard deviation of \(\hat{p}_{NE}\) in repeated samples of size 1000?

  2. What are the mean and standard deviation of \(\hat{p}_{NY}\) in repeated samples of size 1000?

  3. Calculate and interpret the mean and standard deviation of the difference in sample proportions for the two groups, \(\hat{p}_{NY} - \hat{p}_{NE}\), in repeated samples of size 1000 from each group.

  4. How are the standard deviations from parts 1, 2, and 3 related?
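One way to check answers of this kind is to simulate repeated samples and compare the simulated spread with the formula \(\sqrt{p(1-p)/n}\). The sketch below uses made-up values (\(p_A = 0.20\), \(p_B = 0.35\), \(n = 500\)) that are not the values from this exercise; all variable names are illustrative.

```r
set.seed(1)   # make the simulation reproducible

# Hypothetical population proportions and sample size (NOT the
# Nebraska / New York values from the exercise).
p_a  <- 0.20
p_b  <- 0.35
n    <- 500
reps <- 20000   # number of simulated studies

# Each draw is the sample proportion from one simulated study.
p_hat_a <- rbinom(reps, size = n, prob = p_a) / n
p_hat_b <- rbinom(reps, size = n, prob = p_b) / n

# Theoretical standard deviations of the sample proportions.
sd_a <- sqrt(p_a * (1 - p_a) / n)
sd_b <- sqrt(p_b * (1 - p_b) / n)
# For independent samples, the variances add.
sd_diff <- sqrt(sd_a^2 + sd_b^2)

results <- rbind(
  theory    = c(sd_a, sd_b, sd_diff),
  simulated = c(sd(p_hat_a), sd(p_hat_b), sd(p_hat_b - p_hat_a))
)
colnames(results) <- c("group A", "group B", "B - A")
round(results, 4)
```

Replacing the hypothetical values with those from the exercise gives a simulation-based check on the hand calculations.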