This assignment has a total of 22 pts possible. Your score out of 20 will be noted and scaled to 5 points (maximum of 5).
Question 1 – Conceptual Questions: (1 pt each)
Part A What does it mean to say two variables are associated with each other?
Part B What does it mean to say two variables are independent of each other?
Part C What does the distribution of a variable tell us?
Question 2 For this question, we will be using the
iris dataset, which gives measurements, in centimeters, of
sepal and petal length and width. You can read more on
the dataset here. An
image of what these variables correspond to on a flower is provided
below.
NOTE: You will need to remove the image from your document in order for it to knit to a .pdf doc. If your doc does not knit, the image may be the cause.
To load this data into R, simply copy and paste the following into your Rmd file in an R code chunk:
library(ggplot2)
data(iris)
Use this data to answer the following questions:
Part A How many observations and variables are
in the iris dataset? In one sentence, briefly describe what
constitutes an observation in this data. (2 pts)
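As a starting point for Part A, here is a minimal sketch of how the size and structure of a data frame can be inspected in base R. The names n_obs and n_vars are illustrative only and are not required by the assignment.

```r
# Load the built-in iris data frame
data(iris)

# Number of rows (observations) and columns (variables)
n_obs  <- nrow(iris)
n_vars <- ncol(iris)
print(c(observations = n_obs, variables = n_vars))

# Column names, types, and the first few values of each column
str(iris)

# The first rows help show what a single observation looks like
head(iris)
```

You still need to explain in your own words what one row of the data represents.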
Part B Use the code below to create the
appropriate plot to visualize the relationship between the variables
Sepal.Width and Sepal.Length. Do these two
variables appear to be associated? If so, comment on the
strength of this association. (2 pts)
ggplot(iris, aes(Sepal.Width, Sepal.Length)) + geom_point()
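If you want a numeric check to go with the scatterplot, one option is the Pearson correlation coefficient. This is a sketch, not a required step, and it only summarizes the linear part of an association, so it should support what you see in the plot rather than replace it. The name r is illustrative.

```r
# Load the built-in iris data frame
data(iris)

# Pearson correlation between the two plotted variables.
# Values near 0 suggest a weak linear association; values near -1 or 1 suggest a strong one.
r <- cor(iris$Sepal.Width, iris$Sepal.Length)
print(round(r, 3))
```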
Part C Use the code below to color the points by
Species. Has anything changed in the association between
Sepal.Width and Sepal.Length? Comment on the
strength, form, and
direction of any associations you see. (1 pt)
ggplot(iris, aes(Sepal.Width, Sepal.Length, color = Species)) + geom_point()
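To compare the association within each species numerically, one possible approach is to split the data by Species and compute the correlation in each group. This is a hedged sketch using base R only; the name by_species is illustrative, and the plot remains the main evidence for form.

```r
# Load the built-in iris data frame
data(iris)

# Split the data frame into one piece per species, then compute the
# Pearson correlation of Sepal.Width and Sepal.Length within each piece
by_species <- sapply(
  split(iris, iris$Species),
  function(d) cor(d$Sepal.Width, d$Sepal.Length)
)
print(round(by_species, 2))
```

Comparing these values with the correlation for the full dataset shows whether grouping by species changes the direction or strength of the association.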
Question 3:
From the IMS Textbook, do the following exercises (you do not need to read anything from the textbook to answer these):
Write your answers to these exercises below.
I recommend formatting your answers consistently, like the following:
Part A: What features are apparent in the bar plot but not in the pie chart?
Answer:
Part B: What features are apparent in the pie chart but not in the bar plot?
Answer:
Part C: Which graph would you prefer to use for displaying these categorical data?
Answer: