Sampling distributions and confidence intervals:
By simulation and by bootstrap.
Module 3, Week 6, 2021-10
Bioestadística BIOL 2205, Ciencias Biológicas, Universidad de los Andes
Professor Andrew J. Crawford
Names of the members of your usual Lab group: 4
1. Maria Fernanda Herreño
2. Viviana Guayara
3. Nicol Vergara
ANSWER SHEET
Part 1
Paste below a small copy of your original sample from the Gamma(1, 10) distribution:
Paste below a small copy of one of your sampling distributions of the possible means from a
draw of size n (labeled, so we know which of the two this figure represents).
Results of Shapiro-Wilk tests of normality applied to sampling distributions estimated from
5,000 replicate samples of size n:
                       Test statistic, W     P-value
Sample size n = 500
Sample size n = 20
Does the SE of the mean of one sample provide a ‘good’ estimate of the SD of the sampling
distribution of means, as our textbook claims?
                       SE of sample     SD of sampling distribution
Sample size n = 500
Sample size n = 20
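One way to check this comparison by simulation is sketched below in Python, using only the standard library. It assumes that Gamma(1, 10) means shape = 1 and scale = 10 (so the population SD is 10); if the second parameter is a rate, use scale = 1/10 instead. The function names and the seed are illustrative choices, not part of the lab instructions.

```python
import math
import random
import statistics

SHAPE, SCALE = 1.0, 10.0   # assumption: Gamma(1, 10) read as shape = 1, scale = 10
REPLICATES = 5000          # number of replicate samples, as in Part 1

def draw_sample(n, rng):
    """Draw one sample of size n from the assumed Gamma distribution."""
    return [rng.gammavariate(SHAPE, SCALE) for _ in range(n)]

def standard_error(sample):
    """SE of the mean estimated from a single sample: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

def sampling_distribution(n, replicates, rng):
    """Means of many independent samples, each of size n."""
    return [statistics.fmean(draw_sample(n, rng)) for _ in range(replicates)]

rng = random.Random(2021)  # arbitrary fixed seed so the run is repeatable
results = {}
for n in (500, 20):
    se_one_sample = standard_error(draw_sample(n, rng))
    sd_of_means = statistics.stdev(sampling_distribution(n, REPLICATES, rng))
    results[n] = (se_one_sample, sd_of_means)
    print(f"n = {n:3d}  SE of one sample = {se_one_sample:.3f}  "
          f"SD of sampling distribution = {sd_of_means:.3f}")
```

Under these assumptions the SD of the sampling distribution should fall close to 10 / sqrt(n), that is, about 0.45 for n = 500 and about 2.24 for n = 20. The single-sample SE varies from one sample to the next, and more so at n = 20.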
Which gave us the larger sampling error (uncertainty in our estimated mean), n = 500 or n = 20?
Advanced question:
How might you next evaluate whether our estimate of the SD of the sampling distribution shows
signs of bias or not?
Part 2A
For the variable Tacc (or ‘jumps’) based on n = 9 green frogs, report the following:
Mean:
SD:
SE:
90% Confidence Interval based on the t distribution:
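A minimal Python sketch of these four quantities follows. The values in `jumps` are placeholders (1 through 9), not the frog measurements; replace them with the nine observed values. The critical value 1.8595 is the 0.95 quantile of the t distribution with 8 degrees of freedom, taken from a standard t table, so it applies only to n = 9 and a 90% interval.

```python
import math
import statistics

# Placeholder values only (NOT the real frog data): substitute the n = 9 observations.
jumps = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

n = len(jumps)
mean = statistics.fmean(jumps)
sd = statistics.stdev(jumps)      # sample SD (n - 1 in the denominator)
se = sd / math.sqrt(n)            # standard error of the mean

# 0.95 quantile of t with df = n - 1 = 8 (from a t table); leaves 5% in each tail.
T_CRIT_DF8 = 1.8595
ci_t = (mean - T_CRIT_DF8 * se, mean + T_CRIT_DF8 * se)

print(f"mean = {mean:.3f}, SD = {sd:.3f}, SE = {se:.3f}")
print(f"90% CI (t, df = {n - 1}): ({ci_t[0]:.3f}, {ci_t[1]:.3f})")
```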
Part 2B
Paste a histogram for the sampling distribution based on 1,000 bootstrap samples.
Paste a histogram for the sampling distribution based on 10,000 bootstrap samples.
Which one looks more ‘normal’ and why? Which one will we use to calculate the bootstrap
confidence interval?
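The bootstrap sampling distribution can be sketched in Python as below: resample the n observations with replacement, record the mean of each resample, and repeat B times. Here `jumps` holds placeholder values rather than the frog data, and `bootstrap_means` and the seed are illustrative names and choices.

```python
import random
import statistics

# Placeholder values only (NOT the real frog data).
jumps = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

def bootstrap_means(data, n_boot, rng):
    """Return n_boot means, each from a resample of len(data) drawn with replacement."""
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_boot)]

rng = random.Random(6)  # arbitrary fixed seed for repeatability
boot_1k = bootstrap_means(jumps, 1_000, rng)
boot_10k = bootstrap_means(jumps, 10_000, rng)

for label, boot in (("1,000", boot_1k), ("10,000", boot_10k)):
    print(f"B = {label}: mean = {statistics.fmean(boot):.3f}, "
          f"SD = {statistics.stdev(boot):.3f}")
```

A histogram of each list, drawn with whatever plotting tool the lab uses, gives the two figures requested above.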
Part 3
For your bootstrap sampling distribution, report the following:
Mean:
SD:
Compare the above values with your single-sample estimates of the:
Mean:
SE:
90% bootstrap confidence interval via the formula (CI90th):
90% bootstrap confidence interval via 5th & 95th quantiles (CI90quantiles):
90% confidence interval based on t distribution (from Part 2A, above):
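The two bootstrap intervals can be sketched in Python as follows. This assumes that "the formula" refers to the normal-approximation interval, bootstrap mean ± 1.645 × bootstrap SD; if the lab handout uses a t critical value or centers the interval on the sample mean, adjust accordingly. `jumps` holds placeholder values, not the frog data, and `N_BOOT` and the seed are illustrative choices.

```python
import random
import statistics

# Placeholder values only (NOT the real frog data).
jumps = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
N_BOOT = 10_000

rng = random.Random(90)  # arbitrary fixed seed for repeatability
boot = [statistics.fmean(rng.choices(jumps, k=len(jumps))) for _ in range(N_BOOT)]

boot_mean = statistics.fmean(boot)
boot_sd = statistics.stdev(boot)   # bootstrap estimate of the SE of the mean

# Assumed "formula" interval: mean +/- z * SD, with z the 0.95 normal quantile.
z = statistics.NormalDist().inv_cdf(0.95)   # about 1.645
ci90_formula = (boot_mean - z * boot_sd, boot_mean + z * boot_sd)

# Quantile interval: 5th and 95th percentiles of the bootstrap means.
# n=20 yields 19 cut points at 5%, 10%, ..., 95%.
cuts = statistics.quantiles(boot, n=20, method="inclusive")
ci90_quantiles = (cuts[0], cuts[-1])

print(f"CI90 (formula):   ({ci90_formula[0]:.3f}, {ci90_formula[1]:.3f})")
print(f"CI90 (quantiles): ({ci90_quantiles[0]:.3f}, {ci90_quantiles[1]:.3f})")
```

The formula interval is symmetric about the bootstrap mean by construction, while the quantile interval follows the shape of the bootstrap distribution and need not be symmetric.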
Question:
Which of the above three methods for obtaining a Confidence Interval seems the most
conservative and why?
Question:
Which of the above three methods for obtaining a Confidence Interval do you prefer and why?
True / False question:
The most important assumption in this exercise is that the original data were sampled randomly
with respect to (and are therefore representative of) the reference population.
Answer T or F, and provide a one sentence justification.
Question:
Write a concise and correct definition of the 100(1 – α)% Confidence Interval: