Part A: What different types of questions do CIs and Hypothesis testing answer?
# CI: estimating values, getting a range of plausible ones
# HT: answer to binary research questions
Part B: Describe the difference between the Null hypothesis and the Alternate hypothesis. Which one are we trying to show is true?
Answer: \(H_0\) represents the status quo or an assumed value of the parameter. \(H_A\) represents the research question we are trying to answer. We are looking for evidence that \(H_A\) is true; we care about \(H_0\) mainly because it gives us a specific value to use when carrying out the test.
Part C: What is a Null distribution?
Answer: A distribution that shows what our statistic will do in repeated samples (of the same sample size) when the null hypothesis actually is true.
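A null distribution can be approximated by simulation, which is roughly what StatKey does behind the scenes. The sketch below is only an illustration: the function name `simulate_null_proportions`, the sample size of 50, and the null value of 0.5 are all made-up choices, not values from this lab.

```python
import random

def simulate_null_proportions(n, p0, reps=2000, seed=1):
    """Simulate sample proportions from repeated samples of size n,
    assuming the null hypothesis p = p0 is actually true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    props = []
    for _ in range(reps):
        # Each observation is a "success" with probability p0.
        successes = sum(rng.random() < p0 for _ in range(n))
        props.append(successes / n)
    return props

# Hypothetical setting: n = 50 observations, null value p0 = 0.5.
null_props = simulate_null_proportions(n=50, p0=0.5)
center = sum(null_props) / len(null_props)
print(f"{len(null_props)} simulated proportions, centered near {center:.3f}")
```

The simulated proportions pile up around the null value, and the spread shows how far a sample proportion can drift from it by chance alone.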
Part D: What does the p-value tell us for a hypothesis test?
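Answer: It is the probability of getting a statistic at least as extreme as the one we observed if the null hypothesis were true. The smaller the p-value, the stronger the evidence against \(H_0\) and in favor of the alternative hypothesis.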
For each of the following situations, state whether the parameter of interest is a mean or a proportion.
Part A: A poll shows that 64% of Americans personally worry a great deal about federal spending and the budget deficit.
Part B: A survey reports that local TV news has shown a 17% increase in revenue within a two-year period while newspaper revenues decreased by 6.4% during this time period.
Part C: In a survey, smart phone users are asked whether they use a web-based taxi service.
Part D: In a survey, smart phone users are asked how many times they used a web-based taxi service over the last year.
# A -- proportion
# B -- mean (quantitative)
# C -- proportion
# D -- mean
Write the null and alternative hypotheses in words and then symbols for each of the following situations.
Part A: New York is known as “the city that never sleeps”. A random sample of 25 New Yorkers was asked how much sleep they get per night. Do these data provide convincing evidence that New Yorkers on average sleep less than 8 hours a night?
\(H_0\): New Yorkers get 8 hours of sleep per night on average (same as everyone else);
\(H_A\): New Yorkers get less than 8 hours of sleep per night on average
\(H_0\): \(\mu = 8\);
\(H_A\): \(\mu < 8\)
Part B: Employers at a firm are worried about the effect of March Madness, a basketball championship held each spring in the US, on employee productivity. They estimate that on a regular business day employees spend on average 15 minutes of company time checking personal email, making personal phone calls, etc. They also collect data on how much company time employees spend on such non-business activities during March Madness. They want to determine if these data provide convincing evidence that employee productivity decreases during March Madness.
\(H_0\): during March Madness, employees spend on average 15 minutes of company time on non-business activities (same as a regular business day);
\(H_A\): during March Madness, employees spend on average more than 15 minutes of company time on non-business activities
\(H_0\): \(\mu = 15\);
\(H_A\): \(\mu > 15\)
Part C: A public health researcher wants to know if the rate at which adults in Iowa smoke cigarettes is higher than the national rate of 11%. A random sample of 500 Iowa adults is surveyed to estimate the rate of smoking. Do these data provide convincing evidence that the smoking rate in Iowa is higher than the national rate?
\(H_0\): the smoking rate among Iowa adults is 11% (same as the national rate);
\(H_A\): the smoking rate among Iowa adults is higher than 11%
\(H_0\): \(p = 0.11\);
\(H_A\): \(p > 0.11\)
Part D: A university wants to know if its new online orientation program increases student engagement compared to the old in-person version. They randomly assign incoming students to one of the two orientations and record whether each student attends at least one optional campus event afterward.
\(H_0\): there is no difference in proportion of attendance for the two programs;
\(H_A\): the online program has a higher rate of attendance than in-person
\(H_0\): \(p_{\text{online}} - p_{\text{in-person}} = 0\);
\(H_A\): \(p_{\text{online}} - p_{\text{in-person}} > 0\)
We are going to use the StatKey website to calculate p-values for us. StatKey can also handle single means and differences, but we are starting with the basics: a test for a single proportion. Follow these steps.
The following is an example from Dr. Laura Ziegler’s intro statistics class at ISU.
Facebook is a social networking website. One piece of data that members of Facebook often report is their relationship status: single, in a relationship, married, it’s complicated, etc.
With the help of Lee Byron of Facebook, David McCandless examined changes in people’s relationship status, in particular, breakups. A plot of the results showed that there were repeated peaks on Mondays. Based on this initial examination of data, McCandless speculated that breakups are reported at higher frequency on Mondays.
To test this research hypothesis, McCandless collected a random sample of 75 breakups reported on Facebook within the last year. Of these sampled breakups, 20 occurred on a Monday.
Research Question: Are people more likely to break up on Mondays than other days of the week?
Part A: Describe the parameter in context and provide the appropriate symbol.
Answer: \(p\) = the true proportion of all breakups reported on Facebook that occur on a Monday
Part B: What is the value of the statistic?
20/75
## [1] 0.2666667
Part C: Is this an observational study or an experiment?
Observational.
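Part D: What types of inferences will we be able to draw? Causal claims? Generalizations?
Generalizations only. The random sample lets us generalize to the population of breakups reported on Facebook, but without random assignment we cannot make causal claims.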
Part E: State the null and alternative hypotheses using the notation in the slides.
# H0: p = 1/7
# HA: p > 1/7
Part F: Find the p-value using StatKey.
Should get something small (well below 0.01 in most runs). Answers will vary because StatKey uses random simulation.
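As a cross-check on the StatKey simulation, the same one-sided p-value can be computed exactly from the binomial distribution. This is a minimal sketch that assumes the 75 sampled breakups are independent and each falls on a Monday with probability 1/7 under \(H_0\); the helper name `binomial_upper_tail` is made up for this illustration.

```python
from math import comb

def binomial_upper_tail(k, n, p0):
    """P(X >= k) when X counts successes in n independent trials,
    each with success probability p0."""
    return sum(comb(n, x) * p0**x * (1 - p0)**(n - x) for x in range(k, n + 1))

# Observed data: 20 Monday breakups out of 75; null value p0 = 1/7.
p_value = binomial_upper_tail(k=20, n=75, p0=1/7)
print(f"exact one-sided p-value: {p_value:.4f}")
```

A StatKey randomization p-value should land close to this exact value, with some run-to-run variation.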
Part G: Write up a conclusion to answer the research question.
With a p-value this small, we have strong evidence to suggest that break-ups are reported more often on Mondays than we would expect if they were spread evenly across the days of the week.
The Lady Tasting Tea experiment took place in the late 1920s in England, where statistician Ronald A. Fisher (responsible for much of the experiment design stuff in this class) and psychologist Dr. Muriel Bristol both worked. During a social gathering, Dr. Bristol claimed she could tell whether milk or tea had been poured first into a cup simply by tasting it. Her future husband, Dr. William Roach, suggested turning the claim into a lighthearted test by preparing eight cups (four with milk poured first and four with tea poured first) and asking her to identify which were which. She got all 8 correct.
Part A: Write out the hypothesis statements for this test. Think about what proportion she would get right if just guessing.
\(H_0\): p = 0.5; \(H_A\): p \(>\) 0.5
Part B: Use StatKey to get the p-value for this experiment.
Should get something incredibly small.
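For this experiment the p-value can also be worked out by hand, because "at least as extreme as 8 out of 8" means exactly 8 correct. The sketch below assumes each cup is an independent guess with probability 0.5 of being right, which matches the hypotheses in Part A; the variable names are made up for this illustration.

```python
from fractions import Fraction

n_cups = 8                  # cups tasted
p_guess = Fraction(1, 2)    # chance of a correct call under H0 (pure guessing)

# Probability of getting all 8 right by guessing alone.
p_value = p_guess ** n_cups
print(p_value, float(p_value))  # prints: 1/256 0.00390625
```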
Part C: Interpret the p-value and write a conclusion to the experiment with context. Do you think Dr. Bristol could actually tell the difference?
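The probability of Dr. Bristol getting all 8 correct if she could not actually tell them apart is very small (less than 1%; about 0.4% under \(H_0\): p = 0.5, since \(0.5^8 = 1/256\)). This is strong evidence that she was not just guessing and really could tell whether the milk or the tea was poured first.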