The R function pnorm()
will be very helpful to use for
calculating probabilities. In order to use it we need to tell the
function a few specific things we are using:
Here is an example of using the pnorm()
function to
calculate the probability P(X > 25) for a normal distribution with
mean 30 and std.dev. 10.
pnorm(25, mean=30, sd=10, lower.tail = FALSE)
## [1] 0.6914625
# Nice
We will use a standard normal distribution for practice. For all of these question parts, show your R code and your calculations. I also recommend being able to draw a rough sketch to illustrate visually the probability on the normal distribution graph.
Part A: Use R to find the probability of randomly picking a value more than 0.5.
Part B: Use R to find the probability of randomly picking a value less than -0.5.
Part C: Do your answers to Parts A and B match?
Part D: Use Parts A and B to find the probability of randomly picking a value between -0.5 and 0.5.
Part E: Verify the 68-95-99.7% rule using 1, 2, and 3 std.dev’s, and state the results in your own words.
Mercury is a chemical that is toxic to humans (and many others animals). From the smokestacks of power plants to the discharges from wastewater treatment plants, and other places, mercury in the form of the compound methylmercury exists in the enviroment, and it can settle to the seafloor and be taken up by tiny organisms that live or feed on bottom sediments.
These compounds aren’t digested, they accumulate within the animals that ingest them, and become more and more concentrated as they pass along the food chain as animals eat and then are eaten in turn. This is biomagnification, and it means that higher-level predators-fish, birds, and marine mammals-build up greater and more dangerous amounts of toxic materials than animals lower on the food chain. (Info taken from here)
The U.S. Food and Drug Administration recommends avoiding eating fish with mercury levels higher than 0.46 \(\mu\)g/g (micro-grams mercury per gram of fish), as they may be harmful (source).
Suppose the population of yellowfin tuna follows a Normal distribution with an average mercury level of 0.354 \(\mu\)g/g , and a variance of 0.02. (Mean value taken from here, I made up the variance amount – hard to find this)
Part A: What is the standard deviation of this population?
Part B: What is the probability of catching a yellowfin tuna with an unsafe amount of mercury? Suppose you enjoy eating yellowfin tuna – would you frequently eat yellowfin tuna knowing this extra information?
Part C: Suppose the standard deviation is acutally .03 instead of .14. Find the probability of catching a yellowfin tuna with an unsafe amount of mercury. Does this change your answer to whether or not you would frequently eat yellowfin tuna?
Consider an 8-sided die with faces labeled 1 through 8.
Part A: Find and interpret the expected value of this die.
Part B: Find and interpret the standard deviation of this die.
Part C: What is the expected value of rolling and adding up three 8-sided dice?
Part D: Which has more variability: rolling three 8-sided dice and adding up the values or rolling one 8-sided die and multiplying the result by 3 (consider variance formulas)
Go to [this website] and go through the tutorial and associated examples. It will help explain what the sampling distribution is all about.
Question 3: What are some things you learned from this example?
Go to [this website] and go through the tutorial and associate examples. It will help explain what the CLT is all about.
Question 4: What are some things you learned from this example?
Part A: In your own words, explain what a population distribution is.
Part B: In your own words, explain what a sample distribution is.
Part C: In your own words, explain what a sampling distribution is.
Part D: I mentioned before that sampling is a random process. What is meant by the term sampling variability?
Part E: Which distribution does the standard error refer to?
We will use the following link to explore sampling distribution concepts. StatKey Link for Sampling Distributions.
This is an app for simulating sampling distributions. In the top left, click “Baseball Players 3e…” to navigate a drop down menu of data sets and select “NFL Contracts 3e”. The panel on the right hand side shows what the original sample looks like under Population and gives some statistics for the population. At the top we are able to choose different sample sizes with which we create a random sample from the population.
Part A: Leave the sample size at n=10. Click “Generate 1 Sample.” The app puts a point on the sampling distribution corresponding to this point. The sample we just generated is represented in the bottom right of the screen. What are the mean and standard deviation of the sample?
Part B: Generate another sample. What values did you get for the mean and standard deviation?
Part C: Generate 5000 samples. What shape does the distribution have and what is the value of the standard error?
Part D: Change the sample size to 50. Generate 5000 samples. What shape does the distribution have and what is the value of the standard error?
Part E: Change the sample size to 200. Generate 5000 samples. What shape does the distribution have and what is the value of the standard error?
Part F: Add 3000 more samples to the plot from Part E. Did the shape or std error change very much? Why?
Part G: In general, how would you describe what happens to the shape of the sampling distribution as the sample size increases? What about the standard error?