UCSD Extended Studies: Biostatistics

Final Exam

Name:

Directions: This exam is open book, so please feel free to utilize all the resources provided
throughout the course (lecture notes and slides, practice problems, discussion boards). Do
not use any external resources and refer to the “Academic Integrity Pledge” for any
additional guidance on expectations.

CHECK LIST BEFORE YOU HAND IN:

[] Type your answers clearly and concisely.

[] Formulate the null and alternative hypotheses clearly, define parameters, check assumptions,
etc.

[] Please do not paste raw R output. Interpret your findings (graphs, confidence intervals, p-
values) in terms of the problem statement.

GOOD LUCK!
1. (3 pts) The table below shows the hours of relief provided by two analgesic drugs in 12
patients suffering from arthritis. Is there any evidence that one drug provides longer
relief than the other?

case 1 2 3 4 5 6 7 8 9 10 11 12
DrugA 2 3.6 2.6 2.6 7.3 3.4 14.9 6.6 2.3 2.0 6.8 8.5
DrugB 3.5 5.7 2.9 2.4 9.9 3.3 16.7 6.0 3.8 4.0 9.1 20.9

a) (1 pt) Check the assumptions and choose the appropriate test.

b) (2 pts) Perform the test you chose in part (a), manually or using R. What are the
results of the test, and what do you conclude?
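
Because each patient received both drugs, the observations are paired, and one reasonable way to organize the analysis is around the within-patient differences. The following is a minimal sketch under that assumption, not a prescribed solution; the variable names (drug_a, drug_b, relief_diff) are illustrative choices, and both a paired t-test and a Wilcoxon signed-rank test are shown so the choice can follow from the assumption check.

```r
# Hours of relief for the 12 patients, copied from the table above.
drug_a <- c(2, 3.6, 2.6, 2.6, 7.3, 3.4, 14.9, 6.6, 2.3, 2.0, 6.8, 8.5)
drug_b <- c(3.5, 5.7, 2.9, 2.4, 9.9, 3.3, 16.7, 6.0, 3.8, 4.0, 9.1, 20.9)

# Within-patient differences (Drug B minus Drug A).
relief_diff <- drug_b - drug_a

# Assumption check: are the differences plausibly normal?
shapiro_result <- shapiro.test(relief_diff)
if (interactive()) {
  qqnorm(relief_diff)
  qqline(relief_diff)
}

# Candidate 1: paired t-test (relies on roughly normal differences).
paired_t <- t.test(drug_b, drug_a, paired = TRUE)

# Candidate 2: Wilcoxon signed-rank test (no normality assumption).
# exact = FALSE requests the normal approximation, since the differences contain ties.
signed_rank <- wilcox.test(drug_b, drug_a, paired = TRUE, exact = FALSE)

print(shapiro_result)
print(paired_t)
print(signed_rank)
```

Which of the two tests to report depends on what the normality check suggests for these 12 differences.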
2. (4 pts) Body temperatures are provided for a sample of n = 65 healthy men and n =
65 healthy women. Do men and women have the same body temperature on average?
Analyze this data set using R and complete the following parts to answer this question. Use
the following code (copy and paste) to load the data into R:

maletemp <- c(96.3,96.7,96.9,97,97.1,97.1,97.1,97.2,97.3,97.4,97.4,97.4,97.4,
97.5,97.5,97.6,97.6,97.6,97.7,97.8,97.8,97.8,97.8,97.9,97.9,98,98,98,98,98,98,
98.1,98.1,98.2,98.2,98.2,98.2,98.3,98.3,98.4,98.4,98.4,98.4,98.5,98.5,
98.6,98.6,98.6,98.6,98.6,98.6,98.7,98.7,98.8,98.8,98.8,98.9,99,99,99,
99.1,99.2,99.3,99.4,99.5)

femaletemp <- c(96.4,96.7,96.8,97.2,97.2,97.4,97.6,97.7,97.7,97.8,97.8,97.8,
97.9,97.9,97.9,98,98,98,98,98,98.1,98.2,98.2,98.2,98.2,98.2,98.2,
98.3,98.3,98.3,98.4,98.4,98.4,98.4,98.4,98.5,98.6,98.6,98.6,98.6,
98.7,98.7,98.7,98.7,98.7,98.7,98.8,98.8,98.8,98.8,98.8,98.8,98.8,
98.9,99,99,99.1,99.1,99.2,99.2,99.3,99.4,99.9,100,100.8)

a) (1 pt) In order to test if body temperatures for men and women differ on average,
set up the appropriate null and alternative hypotheses. Define the parameters of
interest and state H0 and Ha in terms of these parameters.

b) (2 pts) Suppose we test the hypothesis in part (a) using a significance level α = 0.05.
Assuming the variances for men and women are equal, perform a t-test manually or
in R to test the hypothesis. State the conclusion of your test in the context of this
problem.

c) (1 pt) Compute a 99% confidence interval for the difference in mean temperatures
between men and women. Write a sentence interpreting this estimate of the
difference. Is this interval narrower or wider than the 95% confidence interval? Is
zero in this interval? Comment on this.
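
The equal-variance (pooled) t-test and its confidence interval can be computed by hand and checked against R's built-in t.test(). The sketch below is illustrative only: pooled_t is a hypothetical helper name, and the two vectors are simulated placeholders, not the exam data. Substituting maletemp and femaletemp for x and y would apply the same steps to the actual samples.

```r
# Hypothetical helper: pooled two-sample t-test computed step by step.
pooled_t <- function(x, y, conf_level = 0.95) {
  n1 <- length(x)
  n2 <- length(y)
  df <- n1 + n2 - 2
  # Pooled variance: weighted average of the two sample variances.
  sp2 <- ((n1 - 1) * var(x) + (n2 - 1) * var(y)) / df
  se <- sqrt(sp2 * (1 / n1 + 1 / n2))
  mean_diff <- mean(x) - mean(y)
  t_stat <- mean_diff / se
  p_value <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)
  t_crit <- qt(1 - (1 - conf_level) / 2, df)
  list(t = t_stat, df = df, p_value = p_value,
       ci = mean_diff + c(-1, 1) * t_crit * se)
}

# Simulated placeholder data (NOT the exam data), used only to show the mechanics.
set.seed(1)
x <- rnorm(65, mean = 98.1, sd = 0.7)
y <- rnorm(65, mean = 98.4, sd = 0.7)

manual_99 <- pooled_t(x, y, conf_level = 0.99)
manual_95 <- pooled_t(x, y, conf_level = 0.95)
builtin_99 <- t.test(x, y, var.equal = TRUE, conf.level = 0.99)

print(manual_99)
print(builtin_99)
```

For the same data, the 99% interval uses a larger critical value than the 95% interval, so comparing diff(manual_99$ci) with diff(manual_95$ci) shows how the width changes with the confidence level.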
3. (4 pts) A study was done on the effects of thermal pollution on clams. Clams were
collected at three sites: an intake site to a plant, a discharge site, and a site near
Interstate 55. The goal of this problem is to determine if the clams differ in terms of
heights at the three sites. The ANOVA table is provided below. Based on the table,
please do the following parts:

                       Sum of
Source    DF          Squares    Mean Square    F Value    Pr > F
Model      2       0.57607022     0.28803511       1.62    0.2052
Error     71      12.62678383     0.17784203
Total     73      13.20285405

a) (1 pt) To test if the mean heights of the clams are equal at the three sites, define the
appropriate parameters and state H0 and Ha for this problem.

b) (1 pt) Read the ANOVA output and verify that the F-test statistic is the ratio of the
appropriate mean squares from the ANOVA table. What are the numerator and
denominator degrees of freedom for the F-test?

c) (1 pt) What is your conclusion using a significance level α = 0.05? How about α =
0.10?

d) (1 pt) Does it make sense to do multiple comparisons looking at differences in pairs
of mean heights for this problem? Explain.
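
The arithmetic behind the ANOVA table can be reproduced directly from the sums of squares and degrees of freedom it reports. This is a small sketch of that check; the variable names are illustrative, and the numbers are copied from the table above.

```r
# Values copied from the ANOVA table.
ss_model <- 0.57607022
df_model <- 2
ss_error <- 12.62678383
df_error <- 71

# Each mean square is a sum of squares divided by its degrees of freedom.
ms_model <- ss_model / df_model
ms_error <- ss_error / df_error

# The F statistic is the ratio of the model mean square to the error mean square.
f_stat <- ms_model / ms_error

# Upper-tail probability from the F distribution with (2, 71) degrees of freedom.
p_value <- pf(f_stat, df_model, df_error, lower.tail = FALSE)

print(round(c(ms_model = ms_model, ms_error = ms_error,
              f_stat = f_stat, p_value = p_value), 4))
```

The p-value can then be compared against whichever significance level is in use.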
4. (4 pts) A carcinogenicity study was conducted to examine the tumor potential of a drug
product scheduled for initial testing in humans. A total of 300 rats (150 males and 150
females) were studied for a 6-month period. At the beginning of the study, 100 rats (50
males and 50 females) were randomly assigned to the control group, 100 to the low-dose
group, and the remaining 100 to the high-dose group. On each day of the 6-month period,
the rats in the control group received an injection of an inert solution, whereas those in the
treatment groups received an injection of the solution plus drug. The sample data are
shown in the accompanying table.

Number of Tumors
Rat group One or more none
Control 10 90
Low dose 14 86
High dose 19 81

a. (1 pt) Give the percentage of rats with one or more tumors for each of the three
treatment groups.

b. (1 pt) Conduct a test of whether there is a significant difference in the proportion of
rats having one or more tumors for the three treatment groups with α = 0.05.

c. (2 pts) Does there appear to be a drug-related problem regarding tumors for this
drug product? That is, as the dose is increased, does there appear to be an increase
in the proportion of rats with tumors?
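
One way to set up this analysis in R is to enter the counts as a 3 x 2 table, compute the row percentages, and run a chi-square test of homogeneity. The sketch below assumes that approach; the object names are illustrative. It also calls prop.trend.test() as one possible way to look at a dose-ordered trend, under the additional assumption that the groups are scored 1, 2, 3 (the function's default scores).

```r
# Counts copied from the table: rows are groups, columns are tumor status.
tumor_counts <- matrix(
  c(10, 90,
    14, 86,
    19, 81),
  nrow = 3, byrow = TRUE,
  dimnames = list(group = c("Control", "Low dose", "High dose"),
                  tumors = c("One or more", "None"))
)

# Percentage of rats with one or more tumors in each group.
group_size <- rowSums(tumor_counts)
pct_tumor <- 100 * tumor_counts[, "One or more"] / group_size

# Chi-square test of equal tumor proportions across the three groups.
homogeneity <- chisq.test(tumor_counts)

# Chi-square test for a linear trend in proportions across ordered dose groups.
trend <- prop.trend.test(x = tumor_counts[, "One or more"], n = group_size)

print(pct_tumor)
print(homogeneity)
print(trend)
```

The homogeneity test ignores the ordering of the doses, while the trend test uses it, so the two can give different impressions of the same table.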
5. (5 pts) The following table gives data on the concentration of chlorophyll-a in a lake along
with the concentration of phosphorus in the lake (Source: Smith and Shapiro 1981).
Chlorophyll-a is used as an indicator of water quality that measures the density of algal
cells. Phosphorus stimulates algal growth. We want to use a simple linear regression
model to model the relationship between chlorophyll-a and phosphorus in the lake.
Please analyze the data and answer the questions.

Data:
Obs chlor phos log(chlor) log(phos)
1 95.0 329.0 4.55388 5.79606
2 39.0 211.0 3.66356 5.35186
3 27.0 108.0 3.29584 4.68213
4 12.9 20.7 2.55723 3.03013
5 34.8 60.2 3.54962 4.09767
6 14.9 26.3 2.70136 3.26957
7 157.0 596.0 5.05625 6.39024
8 5.1 39.0 1.62924 3.66356
9 10.6 42.0 2.36085 3.73767
10 96.0 99.0 4.56435 4.59512
11 7.2 13.1 1.97408 2.57261
12 130.0 267.0 4.86753 5.58725
13 4.7 14.9 1.54756 2.70136
14 138.0 217.0 4.92725 5.37990
15 24.8 49.3 3.21084 3.89792
16 50.0 138.0 3.91202 4.92725
17 12.7 21.1 2.54160 3.04927
18 7.4 25.0 2.00148 3.21888
19 8.6 42.0 2.15176 3.73767
20 94.0 207.0 4.54329 5.33272
21 3.9 10.5 1.36098 2.35138
22 5.0 25.0 1.60944 3.21888
23 129.0 373.0 4.85981 5.92158
24 86.0 220.0 4.45435 5.39363
25 64.0 67.0 4.15888 4.20469

“Q5.txt” provides you the above data. Use the following code to load the data to R.

setwd("C:\\Users\\HP\\Desktop")
## change to your specific path where you saved the data

data <- read.table("Q5.txt", header = TRUE)

a) (1 pt) Provide and examine the scatterplots of chlorophyll-a vs. phosphorus
(columns 2 and 3) and log(chlorophyll-a) vs. log(phosphorus) (columns 4 and 5).
Determine whether a linear model adequately describes the relationship between
chlorophyll-a and phosphorus, or between their log-transformed values.

b) (1 pt) Provide the fitted linear regression equation for the relationship you
selected in part (a).

c) (2 pts) Perform the t-test to check whether the slope of the regression is significant.
Write down the null and alternative hypotheses being tested with this t-test
statistic. Report the p-value and your conclusion.

d) (1 pt) What is the predicted amount of the average chlorophyll-a given the
concentration of phosphorus is 200?
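
The regression steps above can be sketched in R without the data file by entering the chlor and phos columns directly. This is an illustrative outline only: the object names are assumptions, both the raw-scale and log-log fits are shown so either can be used depending on the answer to part (a), and the back-transformed prediction from the log-log fit is exp() of a predicted mean on the log scale, which is not identical to a predicted arithmetic mean.

```r
# chlor and phos columns copied from the data table (25 observations).
chlor <- c(95.0, 39.0, 27.0, 12.9, 34.8, 14.9, 157.0, 5.1, 10.6, 96.0,
           7.2, 130.0, 4.7, 138.0, 24.8, 50.0, 12.7, 7.4, 8.6, 94.0,
           3.9, 5.0, 129.0, 86.0, 64.0)
phos <- c(329.0, 211.0, 108.0, 20.7, 60.2, 26.3, 596.0, 39.0, 42.0, 99.0,
          13.1, 267.0, 14.9, 217.0, 49.3, 138.0, 21.1, 25.0, 42.0, 207.0,
          10.5, 25.0, 373.0, 220.0, 67.0)
lake <- data.frame(chlor, phos)

# Scatterplots on both scales (drawn only in an interactive session).
if (interactive()) {
  plot(chlor ~ phos, data = lake)
  plot(log(chlor) ~ log(phos), data = lake)
}

# Simple linear regression on the raw scale and on the log-log scale.
fit_raw <- lm(chlor ~ phos, data = lake)
fit_log <- lm(log(chlor) ~ log(phos), data = lake)

# Coefficient tables: estimate, standard error, t value, and p-value for the slope.
print(summary(fit_raw)$coefficients)
print(summary(fit_log)$coefficients)

# Predictions at phosphorus = 200 under each model.
new_lake <- data.frame(phos = 200)
pred_raw <- predict(fit_raw, newdata = new_lake)
pred_log <- exp(predict(fit_log, newdata = new_lake))  # back-transformed from the log scale
print(c(raw_scale = unname(pred_raw), log_log_scale = unname(pred_log)))
```

The slope row of each coefficient table holds the t statistic and p-value for the test that the slope equals zero.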
