This assignment has 25 points possible. Your score will be calculated out of 22 points and scaled to be out of 5 points.
The General Social Survey asked a random sample of 1,390 Americans the following question: “On the whole, do you think it should or should not be the government’s responsibility to promote equality between men and women?” 82% of the respondents said it “should be”. At a 95% confidence level, this sample has a 2% margin of error. Based on this information, determine if the following statements are true or false, and explain your reasoning. (NORC 2016)
Part A: We are 95% confident that 80% to 84% of Americans in this sample think it’s the government’s responsibility to promote equality between men and women.
Part B: We are 95% confident that 80% to 84% of all Americans think it’s the government’s responsibility to promote equality between men and women.
Part C: If we considered many random samples of 1,390 Americans, and we calculated 95% confidence intervals for each, 95% of these intervals would include the true population proportion of Americans who think it’s the government’s responsibility to promote equality between men and women.
Part D: In order to decrease the margin of error to 1%, we would need to quadruple (multiply by 4) the sample size.
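The link between sample size and margin of error in Part D can be explored numerically. The following is a minimal sketch, assuming the usual normal-approximation formula for a sample proportion; `moe()` is a hypothetical helper name, and the inputs are illustrative values rather than the survey's numbers.

```r
# Margin of error for a sample proportion under the normal approximation.
# moe() is a hypothetical helper, not something provided by the assignment.
moe <- function(p_hat, n, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)       # critical value for the confidence level
  z * sqrt(p_hat * (1 - p_hat) / n)    # z times the standard error
}

# Illustrative values only (not the survey's numbers)
m_small <- moe(p_hat = 0.5, n = 400)
m_large <- moe(p_hat = 0.5, n = 1600)

# Ratio of the two margins of error
m_small / m_large
```

Changing `n` while holding `p_hat` and `conf` fixed shows how the margin of error responds to sample size alone.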
In the US, businesses and schools shut down due to the COVID-19 pandemic in March 2020, and a vaccine became publicly available for the first time in April 2021. That month, a Gallup poll surveyed a random sample of 3,731 US adults, asking how they felt about the COVID-19 vaccine requirement for air travel. The poll found that 57% said they would favor it. (Gallup 2021b)
Part A: Describe the population parameter of interest.
Part C: Construct a 95% confidence interval for the proportion of US adults who favored requiring proof of COVID-19 vaccination for travel by airplane. (Show work and final value)
Part D: Interpret the confidence interval.
Part E: Without doing any calculations, describe what would happen to the confidence interval if we decided to use a higher confidence level.
Part F: Without doing any calculations, describe what would happen to the confidence interval if we used a larger sample.
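The mechanics of a one-proportion confidence interval can be sketched in a few lines of R. This assumes the normal approximation is appropriate; `prop_ci()` is a hypothetical helper, and the inputs below are made-up values, not the Gallup results.

```r
# Confidence interval for a single proportion (normal approximation).
# prop_ci() is a hypothetical helper name used only for illustration.
prop_ci <- function(p_hat, n, conf = 0.95) {
  z  <- qnorm(1 - (1 - conf) / 2)          # critical value
  se <- sqrt(p_hat * (1 - p_hat) / n)      # standard error of p_hat
  c(lower = p_hat - z * se, upper = p_hat + z * se)
}

# Made-up inputs for illustration
ci_95 <- prop_ci(p_hat = 0.40, n = 500, conf = 0.95)
ci_99 <- prop_ci(p_hat = 0.40, n = 500, conf = 0.99)

ci_95
ci_99
```

Comparing the two printed intervals, or rerunning with a different `n`, is one way to check the reasoning asked for in Parts E and F after answering them without calculation.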
Is yawning contagious? An experiment conducted by the MythBusters, a science entertainment TV program on the Discovery Channel, tested if a person can be subconsciously influenced into yawning if another person near them yawns. 50 people were randomly assigned to two groups: 34 to a group where a person near them yawned (treatment) and 16 to a group where there wasn’t a person yawning near them (control). The visualization below displays how many participants yawned in each group.
knitr::include_graphics("https://nfriedrichsen.github.io/homework/mythbusters-chart.jpg")
Suppose we are interested in estimating the difference in yawning rates between the control and treatment groups using a confidence interval. Explain why we cannot construct such an interval using the normal approximation. What might go wrong if we constructed the confidence interval despite this problem?
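One common way to check whether the normal approximation is reasonable is the success-failure condition. The sketch below assumes a threshold of at least 10 successes and 10 failures in each group (your course materials may state a different threshold); `success_failure_ok()` is a hypothetical helper, and the counts are invented, since the MythBusters counts appear only in the chart.

```r
# Success-failure check for each group.
# success_failure_ok() is a hypothetical helper; min_count = 10 is an assumed threshold.
success_failure_ok <- function(successes, n, min_count = 10) {
  failures <- n - successes
  all(successes >= min_count, failures >= min_count)
}

# Invented counts for two groups (NOT the MythBusters data)
check_small <- success_failure_ok(successes = c(12, 3),  n = c(40, 15))
check_large <- success_failure_ok(successes = c(30, 25), n = c(60, 50))

check_small
check_large
```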
A CDC report on sleep deprivation rates shows that the proportion of California residents who reported insufficient rest or sleep during each of the preceding 30 days is 8.0%, while this proportion is 8.8% for Oregon residents. These data are based on simple random samples of 11,545 California and 4,691 Oregon residents.
Construct a 90% confidence interval for the difference in proportions (Oregon - California) and interpret the interval in context.
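A confidence interval for a difference in two proportions follows the same pattern as the one-sample case, with a standard error that combines both groups. The following is a sketch under the normal approximation for independent samples; `diff_prop_ci()` is a hypothetical helper, and the inputs are made-up values rather than the CDC figures.

```r
# Confidence interval for p1 - p2 using the unpooled standard error.
# diff_prop_ci() is a hypothetical helper name used only for illustration.
diff_prop_ci <- function(p1, n1, p2, n2, conf = 0.90) {
  z  <- qnorm(1 - (1 - conf) / 2)                        # critical value
  se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)    # standard error of the difference
  d  <- p1 - p2                                          # point estimate
  c(lower = d - z * se, upper = d + z * se)
}

# Made-up inputs for illustration
ci_diff <- diff_prop_ci(p1 = 0.30, n1 = 200, p2 = 0.25, n2 = 300)
ci_diff
```

The order of the arguments sets the direction of the difference, so for this problem the Oregon values would go first.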
This is a continuation of the Rainbow Trout problem from HW5. Instead of using the t-distribution, we will use bootstrapping to construct a confidence interval. Because we don’t have the original sample data in its entirety, we will generate a sample that matches the summary statistics given in that question for the sinking net, and use this generated sample as a ‘new’ dataset.
The mean length of the fish caught in the sinking net was 254.972 mm, with a standard deviation of 73.619 mm. Sample size = 71.
library(ggplot2)
library(dplyr)
theme_set(theme_bw())
## Better histograms
gh <- function(bins = 10) {
  geom_histogram(color = 'black', fill = 'gray80', bins = bins)
}
## Bootstrapping function
bootstrap <- function(x, statistic, n = 1000L) {
  bs <- replicate(n, {
    sb <- sample(x, replace = TRUE)
    statistic(sb)
  })
  data.frame(Sample = seq_len(n),
             Statistic = bs)
}
set.seed(8675309)
# generate new sample
data_sink <- rnorm(71, mean = 254.972, sd = 73.619)
Part A: Make a bootstrap distribution for the mean length of all fish caught with the sinking net using 1000 samples (use the gh() function from above).
Part B: Use the quantile() function and the bootstrap distribution to make a 90% CI for the mean.
Part C: Suppose the true mean length of fish caught with the floating net is 300 mm. According to the bootstrap CI, is there a difference in mean lengths for fish caught with the different net types?
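The percentile bootstrap workflow used in this problem can be sketched end to end in base R. This is only an illustration: the data, the seed, and the hypothesized value below are invented and unrelated to the trout sample, and the resampling is written inline rather than with the assignment's bootstrap() helper so that the snippet runs on its own.

```r
set.seed(1)                                   # arbitrary seed, for reproducibility only
x <- rnorm(50, mean = 100, sd = 15)           # invented sample, NOT the trout data

# Resample with replacement and record the mean of each resample
boot_means <- replicate(1000, mean(sample(x, replace = TRUE)))

# Percentile interval: the middle 90% of the bootstrap distribution
ci_boot <- quantile(boot_means, probs = c(0.05, 0.95))
ci_boot

# A hypothesized mean is plausible at this level only if it falls inside the interval
hypothesized <- 120                           # invented value for illustration
inside <- hypothesized >= ci_boot[[1]] && hypothesized <= ci_boot[[2]]
inside
```

The same comparison applies in Part C: check whether the stated floating-net mean falls inside or outside the interval from Part B.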