Probability, matrices, and apply()

Purpose of this assignment

The tasks you’ll work on today will help you hone the skills you’ve been practicing over the last week and give you specific experience working with probability and probability distributions in R. Post your code and answers to the “Probability worksheet” discussion on Canvas. One submission per group is fine, and you’ll get a grade for completion, not quality or correctness; I just want you to get some practice. I’ll post my answers to these questions at the end of class.

Task 1

Suppose height in a human population is normally distributed with a mean of 1.8 meters and a standard deviation of 0.3 meters. Answer the following questions:

What is the probability that a randomly chosen individual will be less than (or at most) 1.5 meters tall?
What is the probability that a randomly chosen individual has a height between 1.6 and 1.8 meters?
Assume you sample a single individual from the population. Would you be surprised if they were more than 2.4 meters tall?
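
As a starting point, here is a minimal sketch of how these questions could be set up with pnorm(), which returns the cumulative probability P(X ≤ q) for a normal distribution. The object names (mu, sigma, p_below, and so on) are my own choices for illustration and not something you have to use.

```r
## parameters of the height distribution given in the task
mu <- 1.8
sigma <- 0.3

## P(height <= 1.5): the lower tail up to 1.5
p_below <- pnorm(1.5, mean = mu, sd = sigma)

## P(1.6 <= height <= 1.8): difference of two cumulative probabilities
p_between <- pnorm(1.8, mean = mu, sd = sigma) - pnorm(1.6, mean = mu, sd = sigma)

## P(height > 2.4): the upper tail, requested with lower.tail = FALSE
p_above <- pnorm(2.4, mean = mu, sd = sigma, lower.tail = FALSE)

p_below
p_between
p_above
```

Because the normal distribution is continuous, "less than" and "at most" give the same probability. For the last question, note that 2.4 meters is two standard deviations above the mean, so p_above tells you how rare such a draw is.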

Task 2

Consider a genetic locus in a haploid population (one gene copy per individual). Assume there are two alleles, A and a, and that the frequency of the A allele is 0.3.

Draw a random sample of 25 alleles (i.e. individuals) and assign the output to an object y. (Treat A as a success and a as a failure.)
Now, draw 15 random samples of 25 alleles each and assign them to a matrix, Y. You can think of the 15 samples as distinct samples taken from the same population.
Compute the frequency of the A allele in your 15 samples and the mean frequency across the 15 samples. How do these compare to the population mean of 0.3?
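
One possible way to set this up is sketched below, assuming A is coded as 1 and a as 0 and that each column of Y holds one sample (the layout is my assumption; rows would work just as well if you adjust the apply() margin). The seed value is arbitrary and only makes the example reproducible.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

p_A <- 0.3        # frequency of the A allele
n_alleles <- 25   # alleles per sample
n_samples <- 15   # number of samples

## one sample of 25 alleles: 1 = A (success), 0 = a (failure)
y <- rbinom(n_alleles, size = 1, prob = p_A)

## 15 samples of 25 alleles each, one sample per column
Y <- matrix(rbinom(n_alleles * n_samples, size = 1, prob = p_A),
            nrow = n_alleles, ncol = n_samples)

## frequency of A in each sample: apply mean() over columns (margin 2)
freq_A <- apply(Y, 2, mean)

## mean frequency across the 15 samples
mean_freq <- mean(freq_A)

freq_A
mean_freq
```

Since the alleles are coded 0/1, the mean of a column is the proportion of A alleles in that sample. colMeans(Y) would give the same result as the apply() call.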

Task 3

In this task, we are going to graph results for the first time. Let’s use the histogram function hist(), as shown below:

## draw 100 samples from a Poisson with rate parameter 2.5
x <- rpois(100, lambda = 2.5)
## make a histogram summarizing the sample; label the x and y axes
## with xlab and ylab, and choose the color of the plot with col
hist(x, xlab = "Value", ylab = "Frequency", col = "red")

Draw 15 samples from a Poisson distribution with rate parameter (lambda) 4.6. Visualize the samples with a histogram.
Now draw (i) 100 samples, (ii) 1000 samples, and (iii) 10,000 samples from the same distribution and visualize them all. How does the number of samples affect the distribution as seen in the histogram? Compare it to the theoretical distribution below. My code for the theoretical distribution is included; don’t worry about some of the details. We’ll go over these (especially the plot function) at a later time.

x <- seq(0, 20, by = 1)
y <- dpois(x, lambda = 4.6)
plot(x, y, type = "h", lwd = 2, xlab = "Value", ylab = "Probability")
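
The sampling steps in this task could be sketched as follows. This is only one way to organize it: the use of lapply() to draw all four samples at once and par(mfrow = ...) to put the histograms on one page are my own choices, and drawing each sample separately with its own hist() call works just as well.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

lambda <- 4.6
sizes <- c(15, 100, 1000, 10000)

## draw one Poisson sample per sample size; the result is a list of vectors
samples <- lapply(sizes, function(n) rpois(n, lambda = lambda))

## arrange the four histograms in a 2 x 2 grid
old_par <- par(mfrow = c(2, 2))
for (i in seq_along(sizes)) {
  hist(samples[[i]], xlab = "Value", ylab = "Frequency",
       main = paste("n =", sizes[i]), col = "red")
}
par(old_par)  # restore the previous plot layout
```

Note that hist() shows counts by default, while the theoretical plot shows probabilities, so compare the shapes rather than the y-axis values.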

Task 4

Consider a standard normal distribution (\(\mu\) = 0, \(\sigma\) = 1).

Draw 3000 values from the normal distribution. Compute the mean of the sample. Display the distribution of sampled values as a histogram.
Now draw 20 samples of 75 values from the same distribution. Store these in the matrix Y. Compute the mean of each sample (you should have 20 means). Create histograms of the means and a few of the individual samples of 75 values. Do these histograms look the same as the one with the original 3000 values? How are they different?
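
A minimal sketch of the steps in this task is given below, assuming each column of Y holds one sample of 75 values (again, the layout is my assumption). The seed and the object name sample_means are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

## 3000 draws from the standard normal, their mean, and a histogram
x <- rnorm(3000, mean = 0, sd = 1)
mean(x)
hist(x, xlab = "Value", ylab = "Frequency", main = "3000 values")

## 20 samples of 75 values each, one sample per column
Y <- matrix(rnorm(20 * 75, mean = 0, sd = 1), nrow = 75, ncol = 20)

## mean of each sample: apply mean() over columns (margin 2)
sample_means <- apply(Y, 2, mean)

## histogram of the 20 means, and of one individual sample
hist(sample_means, xlab = "Sample mean", ylab = "Frequency", main = "20 sample means")
hist(Y[, 1], xlab = "Value", ylab = "Frequency", main = "Sample 1 (75 values)")
```

When you compare the plots, pay attention to the range on the x-axis of each histogram as well as its shape.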
