Question 1

Part A: Explanatory: Percent with Bachelor’s degree, Response: Per capita income

Part B: There is a moderately strong, positive linear relationship between a county’s percent with a bachelor’s degree and its per capita income. A few outliers have large values of both variables.

Part C: No, for two reasons. First, the relationship shown is an association, not evidence of causation. Second, this would be an ecological fallacy: relationships observed across counties do not always hold for individuals.
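As a rough illustration of the ecological fallacy, here is a minimal sketch using made-up data. The data frame `people`, its columns, and all of its values are hypothetical and are not taken from the county dataset in this question. Across the three invented counties the group means line up perfectly, yet within each county the individual-level relationship runs the opposite way.

```r
# Hypothetical toy data: 3 counties, 3 people each (all values are made up).
people <- data.frame(
  county    = rep(c("A", "B", "C"), each = 3),
  education = c(0.5, 1.0, 1.5,  1.5, 2.0, 2.5,  2.5, 3.0, 3.5),
  income    = c(1.5, 1.0, 0.5,  2.5, 2.0, 1.5,  3.5, 3.0, 2.5)
)

# County-level summary: one row per county, holding the county means.
county_means <- aggregate(cbind(education, income) ~ county,
                          data = people, FUN = mean)

# Correlation across counties (the aggregate level).
county_cor <- cor(county_means$education, county_means$income)

# Correlation among individuals, computed separately within each county.
within_cor <- sapply(split(people, people$county),
                     function(d) cor(d$education, d$income))

print(county_cor)
print(within_cor)
```

In this toy setup the county-level correlation is positive while every within-county correlation is negative, so a conclusion drawn from the county means would not carry over to individuals.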

Question 2

Part A: There is a moderate, positive, non-linear relationship between a country’s percent of internet users and its life expectancy. There are no obvious outliers.

Part B: This is an observational study.

Part C: There are many possible answers. One is wealth: wealthier countries plausibly have both more internet users and higher life expectancy.

Question 3 (Correlation)

  1. positive linear association
  2. none
  3. positive non-linear association
  4. negative linear association

Question 4

Part A: Footprint is explanatory, happiness is response. We are trying to use footprint to predict happiness.

Part B: There is a moderately strong, positive, non-linear relationship between these variables. There are no extreme outliers.

Part C: No, Pearson’s correlation is not appropriate. It measures the strength of linear relationships, but this relationship is not linear.
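To see why Pearson’s correlation only captures linear trend, here is a small sketch on made-up numbers (the vectors `x` and `y` are hypothetical and unrelated to the Happy data). The relationship below is perfectly deterministic, but because it is curved and symmetric, the correlation says nothing about it.

```r
# Made-up data: y is completely determined by x, but not linearly.
x <- -5:5
y <- x^2

# Pearson's correlation (the default method for cor()).
r <- cor(x, y)
print(r)
```

Here `r` comes out at essentially zero even though knowing `x` tells you `y` exactly, which is the sense in which the statistic is blind to non-linear structure.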

Question 5

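library(tidyverse)  # provides %>%, filter(), and ggplot()
theme_set(theme_bw())
# Scatterplot of happiness vs. footprint for Region 1 only
Happy %>% filter(Region == "1") %>% ggplot(aes(x=Footprint, y=Happiness)) + geom_point()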

Region 1 corresponds to South American countries. There is no linear relationship between these variables for South American countries.

Question 6

Part A: Life Expectancy

Part B:

Happy %>% ggplot(aes(x=LifeExpectancy, y=Happiness)) + geom_point()

There is a moderately strong positive linear relationship between life expectancy and happiness of a country. There are no outliers. Pearson’s correlation is OK because the relationship is linear.

Part C: It is possible to get large correlation values from non-linear relationships. Blindly applying the usual interpretation of correlation would then wrongly suggest that the relationship is linear (no bueno).
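A quick sketch of this point, again on made-up numbers rather than the Happy data (`x` and `y` are hypothetical): an exactly quadratic relationship can still produce a correlation close to 1.

```r
# Made-up data: y grows as the square of x, so the relationship is curved.
x <- 1:20
y <- x^2

# Pearson's correlation is still large because y increases steadily with x.
r <- cor(x, y)
print(r)
```

The value of `r` is well above 0.9 here, yet a scatterplot of `y` against `x` would show a clear curve. That is why the plot should be checked before the correlation is interpreted.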

Question 7 (More Correlation)

  1. strong, not reasonable
  2. strong, not reasonable
  3. strong, reasonable
  4. moderate, reasonable
  5. moderate, reasonable
  6. strong, reasonable

Question 8 (Would you believe me if I said even more correlation?)

  1. plot 4
  2. plot 3
  3. plot 1
  4. plot 2