Part A Create a frequency table using the titanic data set to find how many children and adults were on board the Titanic.
with(titanic, table(Age))
## Age
## Child Adult
##   109  2092
Part B Determine what percentage of the passengers on board the Titanic were adults.
with(titanic, table(Age)) %>% proportions()
## Age
##      Child      Adult
## 0.04952294 0.95047706
95.0% of the passengers were adults. The same figure follows from the counts in the previous table: \(\frac{2092}{2092 + 109} \approx 0.950\).
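The hand calculation can be checked in R. This is a minimal sketch, assuming the course's titanic data frame holds the same counts as R's built-in Titanic table; the names counts, titanic_df, age_counts and adult_share are illustrative and not part of the original solution.

```r
# Assumption: the built-in `Titanic` table matches the course's `titanic`
# data frame. Expand it to one row per person (zero-count rows drop out).
counts <- as.data.frame(Titanic)
titanic_df <- counts[rep(seq_len(nrow(counts)), counts$Freq),
                     c("Class", "Sex", "Age", "Survived")]

# One-way frequency table of Age
age_counts <- table(titanic_df$Age)

# Adults divided by everyone on board
adult_share <- as.numeric(age_counts["Adult"]) / sum(age_counts)
round(adult_share, 3)
```

Dividing a single count by the table total is exactly what proportions() does for every cell at once.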
Part C Determine what percentage of the passengers on board the Titanic were members of the crew.
with(titanic, table(Class)) %>% proportions()
## Class
##       1st       2nd       3rd      Crew
## 0.1476602 0.1294866 0.3207633 0.4020900
40.2% of passengers were members of the crew.
Part A How many children were included in second class?
with(titanic, table(Age, Class))
##        Class
## Age     1st 2nd 3rd Crew
##   Child   6  24  79    0
##   Adult 319 261 627  885
24 children.
Part B What percentage of the crew survived? How about children?
with(titanic, table(Survived, Class)) %>% proportions(margin = 2)
##         Class
## Survived       1st       2nd       3rd      Crew
##      No  0.3753846 0.5859649 0.7478754 0.7604520
##      Yes 0.6246154 0.4140351 0.2521246 0.2395480
with(titanic, table(Survived, Age)) %>% proportions(margin=2)
##         Age
## Survived     Child     Adult
##      No  0.4770642 0.6873805
##      Yes 0.5229358 0.3126195
24% of the crew survived, and 52% of children survived. These are conditional probabilities because each table narrows the focus to one group (crew in the first table, children in the second). Survived and not survived need to add up to 100% within each group. Because I arranged my tables with Survived on the rows, the groups are the columns, so I needed the margin = 2 argument in the proportions() function.
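The effect of margin = 2 can be confirmed by checking that every column sums to 1. The sketch below assumes R's built-in Titanic table matches the course's titanic data frame and uses base R only; titanic_df, surv_by_class and col_props are illustrative names.

```r
# Assumption: the built-in `Titanic` table matches the course's data.
counts <- as.data.frame(Titanic)
titanic_df <- counts[rep(seq_len(nrow(counts)), counts$Freq),
                     c("Class", "Sex", "Age", "Survived")]

surv_by_class <- with(titanic_df, table(Survived, Class))

# margin = 2 divides each cell by its column total, so every Class
# column becomes a distribution over Survived
col_props <- proportions(surv_by_class, margin = 2)

colSums(col_props)        # each class column sums to 1
col_props["Yes", "Crew"]  # share of the crew who survived
```

If Survived had been placed on the columns instead, the same conditional proportions would come from margin = 1.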
Part C What proportion of individuals who survived were members of the crew? Construct the plot associated with the table you create.
with(titanic, table(Survived, Class)) %>% proportions(margin=1) %>% addmargins(2)
##         Class
## Survived        1st        2nd        3rd       Crew        Sum
##      No  0.08187919 0.11208054 0.35436242 0.45167785 1.00000000
##      Yes 0.28551336 0.16596343 0.25035162 0.29817159 1.00000000
The proportion of survivors who were members of the crew is 0.298. This is a conditional probability because we are restricting ourselves to the Survived = Yes row. The rows need to add to 100%, so I used proportions(margin = 1).
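The direction of the conditioning matters: the share of survivors who were crew is not the same as the share of crew who survived. A minimal sketch using raw counts, again assuming the built-in Titanic table stands in for the course's titanic data frame (all object names are illustrative):

```r
# Assumption: the built-in `Titanic` table matches the course's data.
counts <- as.data.frame(Titanic)
titanic_df <- counts[rep(seq_len(nrow(counts)), counts$Freq),
                     c("Class", "Sex", "Age", "Survived")]

surv_by_class <- with(titanic_df, table(Survived, Class))
with_totals <- addmargins(surv_by_class)  # adds a "Sum" row and column

crew_survivors <- with_totals["Yes", "Crew"]
all_survivors  <- with_totals["Yes", "Sum"]
crew_total     <- with_totals["Sum", "Crew"]

# Row-conditional: among survivors, the share who were crew (margin = 1)
crew_given_survived <- crew_survivors / all_survivors

# Column-conditional: among crew, the share who survived (margin = 2)
survived_given_crew <- crew_survivors / crew_total

round(c(crew_given_survived, survived_given_crew), 3)
```

The numerator is the same in both cases; only the denominator (row total versus column total) changes.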
ggplot(titanic, aes(Class)) +
  geom_bar() +
  facet_grid(Survived ~ Sex)
Part A: Amongst female passengers, which class had the most who did not survive? How many female passengers in this class did not survive?
3rd class. 106 female passengers in 3rd class did not survive.
Part B: Amongst male passengers, which class had the fewest people survive? How many male passengers in this class survived?
2nd class. 25 male passengers in 2nd class survived.
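The values read off the faceted bar chart can be cross-checked with a three-way table. This is a sketch under the assumption that R's built-in Titanic table matches the course's titanic data frame; three_way, female_deaths and male_survivors are illustrative names.

```r
# Assumption: the built-in `Titanic` table matches the course's data.
counts <- as.data.frame(Titanic)
titanic_df <- counts[rep(seq_len(nrow(counts)), counts$Freq),
                     c("Class", "Sex", "Age", "Survived")]

# Counts for every Class x Survived x Sex combination, i.e. the bar
# heights in each panel of the faceted plot
three_way <- with(titanic_df, table(Class, Survived, Sex))

# Part A: female passengers who did not survive, by class
female_deaths <- three_way[, "No", "Female"]
names(which.max(female_deaths))
max(female_deaths)

# Part B: male passengers who survived, by class
male_survivors <- three_way[, "Yes", "Male"]
names(which.min(male_survivors))
min(male_survivors)
```

Each slice of the three-way table corresponds to one panel of facet_grid(Survived ~ Sex).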
Part A: Write the code to produce tables displaying information for the two plots of college data (Region and Type)
# Plot 1
with(college, table(Region, Type)) %>% proportions(margin = 1) %>% addmargins(2)
##                  Type
## Region              Private    Public       Sum
##   Far West        0.5673077 0.4326923 1.0000000
##   Great Lakes     0.6613757 0.3386243 1.0000000
##   Mid East        0.6363636 0.3636364 1.0000000
##   New England     0.6197183 0.3802817 1.0000000
##   Plains          0.6666667 0.3333333 1.0000000
##   Rocky Mountains 0.2666667 0.7333333 1.0000000
##   South East      0.5563140 0.4436860 1.0000000
##   South West      0.4523810 0.5476190 1.0000000
# Plot 2
with(college, table(Region, Type)) %>% proportions(margin = 2) %>% addmargins(1)
##                  Type
## Region               Private     Public
##   Far West        0.09119011 0.10044643
##   Great Lakes     0.19319938 0.14285714
##   Mid East        0.19474498 0.16071429
##   New England     0.06800618 0.06026786
##   Plains          0.12982998 0.09375000
##   Rocky Mountains 0.01236476 0.04910714
##   South East      0.25193199 0.29017857
##   South West      0.05873261 0.10267857
##   Sum             1.00000000 1.00000000
Part B: Using the appropriate graph and table, do any regions have public schools as a majority? If so, what percent of schools in that region are public?
# Rocky Mountains (73%) and South West (55%)
Part C: Using the appropriate graph and table, amongst public schools, which region has the largest percentage? Also give that percentage.
# South East (29%)