MAST20005 Statistics Assignment 3

1) The document provides the details and working for questions from a statistics assignment, including the use of likelihood ratio tests, sign tests, normal approximations, and confidence intervals. 2) Question 1 examines likelihood ratio tests and normal approximations. Question 2 uses sign tests and binomial distributions. Question 3 involves exponential distributions and chi-squared tests. 3) Question 4 applies a sign test to compare a sample median to a hypothesized value. Question 5 uses a t-test to compare two sample means and calculates a confidence interval for the difference between the population means.
MAST20005 Statistics, Assignment 3

Brendan Hill - Student 699917 (Tutorial Thursday 10am)

November 19, 2016

Question 1
(a)
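The likelihood ratio test is based on the following:

$$\lambda = \frac{L(\theta_0)}{L(\theta_{ML})} \le k$$

Since the MLE of $\theta$ in this case is given by $\bar{X}$, this is determined as follows (where $\prod$ means $\prod_{i=1}^{n}$ and $\sum$ means $\sum_{i=1}^{n}$ throughout):

$$\begin{aligned}
\lambda &= \frac{L(\theta_0)}{L(\bar{x})} \\
&= \frac{\prod \left[\theta_0^{x_i} (1 - \theta_0)^{1 - x_i}\right]}{\prod \left[\bar{x}^{x_i} (1 - \bar{x})^{1 - x_i}\right]} \\
&= \frac{\theta_0^{\sum x_i} (1 - \theta_0)^{n - \sum x_i}}{\bar{x}^{\sum x_i} (1 - \bar{x})^{n - \sum x_i}} \\
&= \frac{\theta_0^{y} (1 - \theta_0)^{n - y}}{(y/n)^{y} (1 - y/n)^{n - y}} \quad \text{(where } y = \textstyle\sum x_i \text{)} \\
&= \left(\frac{\theta_0}{y/n}\right)^{y} \left(\frac{1 - \theta_0}{1 - y/n}\right)^{n - y} \\
&= \left(\frac{n\theta_0}{y}\right)^{y} \left(\frac{n - n\theta_0}{n - y}\right)^{n - y}
\end{aligned}$$

and the test rejects $H_0$ when $\lambda \le k$. Therefore the likelihood ratio test is based on the statistic $Y = \sum X_i$.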
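To see how $\lambda$ depends on the data only through $y$, here is a minimal Python sketch that evaluates the final expression. The function name `likelihood_ratio` is my own, and the values $n = 100$, $\theta_0 = 0.5$ are borrowed from part (c) purely for illustration.

```python
import math

def likelihood_ratio(y: int, n: int, theta0: float) -> float:
    """lambda = (n*theta0/y)^y * ((n - n*theta0)/(n - y))^(n - y)."""
    def term(count: float, expected: float) -> float:
        # count * log(expected / count), using the convention 0 * log(.) = 0
        return 0.0 if count == 0 else count * math.log(expected / count)

    log_lam = term(y, n * theta0) + term(n - y, n - n * theta0)
    return math.exp(log_lam)

n, theta0 = 100, 0.5            # illustrative values taken from part (c)
lam_50 = likelihood_ratio(50, n, theta0)
lam_40 = likelihood_ratio(40, n, theta0)
lam_60 = likelihood_ratio(60, n, theta0)
print(lam_50, lam_40, lam_60)
```

The ratio equals 1 when $y = n\theta_0$ and falls off symmetrically on either side when $\theta_0 = 0.5$, so small values of $\lambda$ correspond to values of $y$ far from $n\theta_0$.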

(b)
When $H_0$ is true, $Y$ is the sum of $n$ i.i.d. Bernoulli trials with $p = \theta_0$. Therefore, when $H_0$ is true, $Y \sim Bi(n, \theta_0)$.

(c)
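Using the normal approximation, $Y \approx N(\mu = n\theta_0,\ \sigma^2 = n\theta_0(1 - \theta_0))$.
Hence for $n = 100$, $\theta_0 = \frac{1}{2}$, the following test rejects $H_0$ at the 0.05 significance level:

$$z = \frac{|Y - n\theta_0|}{\sqrt{n\theta_0(1 - \theta_0)}} \ge z_{0.025} \quad\Rightarrow\quad z = \frac{|Y - 50|}{5} \ge 1.96$$

That is, $|Y - 50| \ge 9.8$. Since $Y$ is integer-valued, in terms of a critical region for $Y$ this is equivalent to:

$$c = 40$$

where the critical region is $Y \le 40$ or $Y \ge 60$.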
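The conversion from the z-statistic to an integer critical region can be sketched as follows. This is only an illustration using Python's standard library; the variable names are my own.

```python
import math
from statistics import NormalDist

n, theta0, alpha = 100, 0.5, 0.05
mean = n * theta0                                  # 50
sd = math.sqrt(n * theta0 * (1 - theta0))          # 5
z_crit = NormalDist().inv_cdf(1 - alpha / 2)       # about 1.96
half_width = z_crit * sd                           # about 9.8

# Y is an integer count, so round outwards from the mean.
lower = math.floor(mean - half_width)   # reject when Y <= lower
upper = math.ceil(mean + half_width)    # reject when Y >= upper
print(lower, upper)
```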

Question 2
(a)
Under the null hypothesis that the median of $X$ is 0, each observation is $> 0$ with probability 0.5. Hence the sign test statistic $S(0) = \sum_{i=1}^{n} I(X_i > 0)$ has a binomial distribution with $n = 25$, $p = 0.5$, i.e. $S(0) \sim Bi(25, 0.5)$.
So the significance level of a test which rejects $H_0$ when $S(0) \ge 16$ is given by $P(S(0) \ge 16 \mid H_0) = 1 - P(S(0) \le 15 \mid H_0) = 0.11476$.
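The tail probability can be reproduced with an exact binomial sum. The helper `binom_upper_tail` below is a hypothetical name of my own, not something from the assignment.

```python
from math import comb

def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(S >= k) for S ~ Bi(n, p), summed term by term."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

alpha = binom_upper_tail(16, 25, 0.5)
print(round(alpha, 5))
```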

(b)
Let $H_2$ be the hypothesis that $X \sim N(0.1, 1)$. Then $P(X > 0 \mid H_2) = \Phi(0.1) = 0.53983$, so under $H_2$, $S(0)$ has a binomial distribution with parameters $n = 25$, $p = 0.53983$.
Then the power of the test $S(0) \ge 16$ is $1 - P(S(0) \le 15 \mid H_2) = 0.21150$.
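A short sketch of the same power calculation, assuming only the Python standard library (the variable names are illustrative):

```python
from math import comb
from statistics import NormalDist

# P(X > 0) when X ~ N(0.1, 1)
p_alt = 1 - NormalDist(mu=0.1, sigma=1).cdf(0)

# Power = P(S(0) >= 16) when S(0) ~ Bi(25, p_alt)
power = sum(comb(25, j) * p_alt**j * (1 - p_alt)**(25 - j) for j in range(16, 26))
print(p_alt, power)
```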

(c)
Note that this question relies on the assumption that $X$ is normally (or at least symmetrically) distributed, so that the median equals the mean. This assumption has been confirmed with the lecturer even though it is not stated explicitly in the assignment.

Note that by the central limit theorem, for $\sigma_X^2 = 1$, we have that $\bar{X} \approx N(\mu_X, 1/n)$.

Let the null hypothesis $H_0$ be that $\mu_X = 0$, and let $Z = \bar{X}/\sqrt{1/n} = 5\bar{X}$ for $n = 25$. Hence under $H_0$, $Z \approx N(0, 1)$, and the following test has significance level 0.11476:

$$Z \ge z_{0.11476} = 1.201596$$

(d)
If $X \sim N(0.1, 1)$ ($H_1$), then the power of the test in (c) under $H_1$ is given by:

$$\begin{aligned}
P(Z \ge 1.201596) &= P(5\bar{X} \ge 1.201596) \\
&= P(\bar{X} \ge (1/5)\,1.201596) \\
&= P((\bar{X} - 0.1) \ge (1/5)\,1.201596 - 0.1) \\
&= P(5(\bar{X} - 0.1) \ge 1.201596 - 0.5) \\
&= P(5(\bar{X} - 0.1) \ge 0.701596) \\
&= 1 - \Phi(0.701596) \quad \text{since } 5(\bar{X} - 0.1) \sim N(0, 1) \text{ under } H_1 \\
&= 0.2414656
\end{aligned}$$
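Both the critical value in (c) and the power in (d) can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are my own.

```python
from statistics import NormalDist

std_normal = NormalDist()           # N(0, 1)
n, alpha, mu_alt = 25, 0.11476, 0.1

# Critical value: reject H0 when Z = 5 * xbar >= z_crit
z_crit = std_normal.inv_cdf(1 - alpha)

# Under H1, Z ~ N(mu_alt * sqrt(n), 1), so shift the critical value by 0.5
shift = mu_alt * n ** 0.5
power = 1 - std_normal.cdf(z_crit - shift)
print(z_crit, power)
```

The z-test has slightly higher power (about 0.24) than the sign test in (b) (about 0.21) at the same significance level.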

Question 3
(a)
Since $X \sim Exp(\lambda = 1)$, $F(x) = 1 - e^{-x}$. Hence if $y = F(x)$ then:

$$\begin{aligned}
y &= 1 - e^{-x} \\
e^{-x} &= 1 - y \\
-x &= \log(1 - y) \\
x &= -\log(1 - y), \quad y \in (0, 1)
\end{aligned}$$

In other words, $F^{-1}(y) = -\log(1 - y)$.

(b)
$H_0$ is the hypothesis that $X \sim Exp(1)$.
Five equally probable class intervals for $X$ are given by evaluating $F^{-1}(y)$ at $y \in \{0, 0.2, 0.4, 0.6, 0.8, 1.0\}$, hence:

(0, 0.223144], (0.223144, 0.510826], (0.510826, 0.916291], (0.916291, 1.60944], (1.60944, ∞)
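The interior class boundaries follow directly from the inverse CDF found in (a). A minimal sketch (the function name `exp1_quantile` is my own):

```python
import math

def exp1_quantile(y: float) -> float:
    """Inverse CDF of Exp(1): F^{-1}(y) = -log(1 - y), for 0 <= y < 1."""
    return -math.log(1 - y)

# Interior cut points at y = 0.2, 0.4, 0.6, 0.8; the outer ends are 0 and infinity.
cuts = [exp1_quantile(k / 5) for k in range(1, 5)]
print([round(c, 6) for c in cuts])
```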

(c)
Using these 5 categories, the following test statistic can be used to test at the $\alpha = 0.05$ significance level:

$$Q = \frac{(6 - 8)^2}{8} + \frac{(16 - 8)^2}{8} + \frac{(7 - 8)^2}{8} + \frac{(6 - 8)^2}{8} + \frac{(5 - 8)^2}{8} = 10.25$$

With a critical region:

$$Q \ge c = \chi^2_{0.05}(4) = 9.487729$$

Since $Q > c$, we reject $H_0$ at the 0.05 significance level.
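The statistic can be recomputed from the observed counts. The sketch below also reports a p-value using the closed form of the chi-squared upper tail for 4 degrees of freedom, $P(Q > q) = e^{-q/2}(1 + q/2)$; that p-value is my own addition and is not part of the original working.

```python
import math

observed = [6, 16, 7, 6, 5]
expected = sum(observed) / len(observed)     # 40 / 5 = 8 per class under H0

q = sum((o - expected) ** 2 / expected for o in observed)

# Upper tail of chi-squared with 4 degrees of freedom (closed form, valid for df = 4 only)
p_value = math.exp(-q / 2) * (1 + q / 2)
print(q, p_value)
```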

Question 4
(a)
Let $H_0$ be the hypothesis that the median $m = 130$, and $H_1$ be the alternative hypothesis that $m > 130$.

Let $Y$ equal the number of observations such that $(x_i - 130) > 0$. Hence under $H_0$, since 130 is the median, $Y \sim Binom(8, 0.5)$.

The critical region under $H_0$ at significance 0.05 is given by $Y > 6$, since $P(Y \ge 7) = 9/256 \approx 0.035$ while $P(Y \ge 6) = 37/256 \approx 0.145$.

The dataset provided yields $y = 3$, hence we do not reject $H_0$ at this significance level.
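The critical value and the observed count can be reproduced as below. This sketch assumes that the eight observations used here are the exposed sample listed in part (b); that is my assumption, although it does reproduce $y = 3$.

```python
from math import comb

# Assumed data: the exposed-babies sample from part (b)
exposed = [35, 56, 83, 92, 128, 150, 176, 208]
n = len(exposed)
y = sum(1 for x in exposed if x > 130)

def upper_tail(k: int, n: int) -> float:
    """P(Y >= k) for Y ~ Binom(n, 0.5)."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n

# Smallest c with P(Y >= c) <= 0.05; reject H0 when y >= c
c = next(k for k in range(n + 1) if upper_tail(k, n) <= 0.05)
reject = y >= c
print(y, c, reject)
```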

(b)
Let $X$ represent the sample of unexposed babies and $Y$ the sample of exposed babies. Note that $n_x = 7$ and $n_y = 8$.

Let the null hypothesis $H_0$ be $m_x = m_y$, and the alternative hypothesis $H_1$ be that $m_x < m_y$.
The normal approximation of the distribution of the $W$ statistic is:

$$\mu_W = \frac{n_y(n_x + n_y + 1)}{2} = 64$$

$$Var(W) = \frac{n_x n_y (n_x + n_y + 1)}{12} = 74.667$$

Hence the critical region for $W$ is:

$$W \ge 64 + z_{0.01}\sqrt{74.667} = 84.10199$$

The value of the $W$ statistic for the observations provided is calculated as follows:

Dataset X
Observations:   8   11   12   14   20   43   111
Rank:           1    2    3    4    5    7    11

Dataset Y
Observations:  35   56   83   92   128   150   176   208
Rank:           6    8    9   10    12    13    14    15

Now $w = \sum y_{rank} = 87 > 84.10199$. Hence we reject $H_0$ at the 0.01 significance level.
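The ranking and the normal-approximation critical value can be checked with a few lines of Python. This is a sketch that relies on there being no ties in the pooled sample, which holds for this data; the variable names are my own.

```python
from statistics import NormalDist

x = [8, 11, 12, 14, 20, 43, 111]               # unexposed
y = [35, 56, 83, 92, 128, 150, 176, 208]       # exposed

# Rank the pooled sample (1 = smallest); valid because there are no ties.
pooled = sorted(x + y)
rank = {value: i + 1 for i, value in enumerate(pooled)}
w = sum(rank[v] for v in y)                    # sum of the ranks of the Y sample

nx, ny = len(x), len(y)
mean_w = ny * (nx + ny + 1) / 2
var_w = nx * ny * (nx + ny + 1) / 12
crit = mean_w + NormalDist().inv_cdf(0.99) * var_w ** 0.5
print(w, mean_w, round(var_w, 3), round(crit, 3))
```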

Question 5
(a)
Let $\mu_X$ and $\mu_Y$ represent the means of $X$ and $Y$ respectively. Then, let the null hypothesis $H_0$ be that $\mu_X = \mu_Y = \mu$, where $X \sim N(\mu, \sigma^2)$ and $Y \sim N(\mu, \sigma^2)$. The alternative hypothesis $H_1$ is that $\mu_X \ne \mu_Y$.
The following test statistic can be used to test $H_0$ at the 0.05 significance level:

$$T = \frac{|\bar{X} - \bar{Y}|}{S_P \sqrt{1/n + 1/m}} \ge t_{0.025}(n + m - 2)$$

where $n$ and $m$ are the sample sizes of $X$ and $Y$ respectively, and the pooled standard deviation is $S_P = \sqrt{\dfrac{(n - 1)S_X^2 + (m - 1)S_Y^2}{n + m - 2}}$.

For the sample given we have $n = 14$, $m = 5$, $\bar{x} = 12.56$, $s_x^2 = 24.65$, $\bar{y} = 17.32$, $s_y^2 = 11.01$, hence the test statistic is:

$$S_P = \sqrt{\frac{13 \cdot 24.65 + 4 \cdot 11.01}{17}} = 4.6304$$

$$t = \frac{|12.56 - 17.32|}{4.6304 \cdot \sqrt{1/14 + 1/5}} = 1.97315$$

With critical value:

$$t_{0.025}(17) = 2.109816$$

Since $1.97315 < 2.109816$, we do not reject $H_0$ at the 0.05 significance level.
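The arithmetic for the pooled standard deviation and the test statistic can be reproduced as follows. The critical value is copied from the working above rather than computed, since the Python standard library has no t-distribution quantile function.

```python
import math

n, m = 14, 5
xbar, s2x = 12.56, 24.65
ybar, s2y = 17.32, 11.01

# Pooled standard deviation and two-sample t statistic
sp = math.sqrt(((n - 1) * s2x + (m - 1) * s2y) / (n + m - 2))
t_stat = abs(xbar - ybar) / (sp * math.sqrt(1 / n + 1 / m))

t_crit = 2.109816            # t_{0.025}(17), taken from the text
reject = t_stat >= t_crit
print(round(sp, 4), round(t_stat, 5), reject)
```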

(b)
The p-value for the test in (a) is $p = 0.064964$.

(c)
Endpoints of the 95% confidence interval for $\mu_x - \mu_y$ are given by:

$$(\bar{x} - \bar{y}) \pm t_{0.025}(17)\, S_P \sqrt{1/14 + 1/5}$$

Hence we are 95% confident that the true difference lies in the following interval:

[−9.84968, 0.329682]

Note that this interval includes 0, reflecting the fact that we failed to reject the null hypothesis in (a).
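The interval endpoints can be verified numerically. In this sketch the pooled standard deviation and the t critical value are copied from part (a) rather than recomputed, and the second sample size is $m = 5$.

```python
import math

n, m = 14, 5
xbar, ybar = 12.56, 17.32
sp = 4.6304                  # pooled standard deviation from (a)
t_crit = 2.109816            # t_{0.025}(17) from (a)

half_width = t_crit * sp * math.sqrt(1 / n + 1 / m)
ci = (xbar - ybar - half_width, xbar - ybar + half_width)
print(ci)
```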

(d)
The following F statistic can be used to test the hypothesis that $\sigma_X^2 = \sigma_Y^2$ at the 0.05 significance level, rejecting when:

$$F = \frac{s_x^2}{s_y^2} \notin [f_{0.975}(n - 1, m - 1),\ f_{0.025}(n - 1, m - 1)]$$

where $f_\alpha$ denotes the upper-$\alpha$ critical value of the F distribution, consistent with the notation $t_{0.025}$ in (a).

For the samples provided, the value of the test statistic is:

$$f = 24.65 / 11.01 = 2.238874$$

And the critical region is:

(0, 0.2502567) ∪ (8.7149963, ∞)

However, since $0.2502567 < 2.238874 < 8.7149963$, there is not sufficient evidence to reject the hypothesis that $\sigma_X^2 = \sigma_Y^2$ at the 0.05 significance level.

Question 6
The null hypothesis $H_0$ is that the angle of pull does not affect the separation force required. The alternative hypothesis $H_1$ states that the angle of pull does affect the separation force required.

The relevant test statistic is:

$$F = \frac{SS(A)/3}{SS(E)/12} \ge f_{0.05}(3, 12) = 3.490295$$

Using the "anova" function in R to analyze variance by the A factor yields the following results:

$$SS(A) = 58.157$$

$$SS(E) = 91.005$$

Hence

$$f = 2.5562$$

Since $2.5562 < 3.490295$, we do not reject $H_0$ at the 0.05 significance level. So there is not enough evidence to suggest that the angle of pull affects the separation force required.

Note that the p-value for this result is 0.104162.

Note that I have assumed normality and equal variances of the underlying data.
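The F statistic follows from the two sums of squares and their degrees of freedom. A small sketch of that arithmetic, with the critical value copied from the working above (the raw data are not reproduced here, so the sums of squares are taken as given):

```python
ss_a, df_a = 58.157, 3       # between-treatment sum of squares, m - 1 = 3
ss_e, df_e = 91.005, 12      # error sum of squares, n - m = 12

ms_a = ss_a / df_a           # mean square for factor A
ms_e = ss_e / df_e           # mean square error
f_stat = ms_a / ms_e

f_crit = 3.490295            # f_{0.05}(3, 12), taken from the text
reject = f_stat >= f_crit
print(round(f_stat, 4), reject)
```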

Question 7
(a)
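Note that $E(X_{ij}) = E(\mu + \alpha_i + \epsilon_{ij}) = \mu + \alpha_i$, so:

$$\begin{aligned}
E(\bar{X}_{..}) &= E\left(\frac{1}{n} \sum_{i=1}^{m} \sum_{j=1}^{n_i} X_{ij}\right) \\
&= \frac{1}{n} \sum_{i=1}^{m} \sum_{j=1}^{n_i} E(X_{ij}) \\
&= \frac{1}{n} \sum_{i=1}^{m} \sum_{j=1}^{n_i} (\mu + \alpha_i) \\
&= \frac{1}{n} \sum_{i=1}^{m} n_i [\mu + \alpha_i] \\
&= \frac{1}{n}(n\mu) + \frac{1}{n} \sum_{i=1}^{m} n_i \alpha_i \\
&= \mu + \frac{1}{n} \sum_{i=1}^{m} n_i \alpha_i
\end{aligned}$$

$\bar{X}_{..}$ will be an unbiased estimator of $\mu$ when $\frac{1}{n} \sum_{i=1}^{m} n_i \alpha_i = 0$. Specifically, when all values of $n_i$ are equal (say $n_0$), this becomes $\frac{1}{n} \sum_{i=1}^{m} n_0 \alpha_i = \frac{1}{n} n_0 \sum_{i=1}^{m} \alpha_i = 0$, since $\sum_{i=1}^{m} \alpha_i = 0$.

So $\bar{X}_{..}$ is an unbiased estimator of $\mu$ when the sample sizes $n_i$ for all categories are the same.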
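To make the bias term $\frac{1}{n}\sum n_i \alpha_i$ concrete, the sketch below evaluates it exactly for a balanced and an unbalanced design. The effects and sample sizes are hypothetical values chosen only for illustration, and `grand_mean_bias` is my own helper name.

```python
from fractions import Fraction

def grand_mean_bias(sizes, effects):
    """Return (1/n) * sum(n_i * alpha_i), the bias of the grand mean as an estimator of mu."""
    n = sum(sizes)
    return Fraction(sum(n_i * a_i for n_i, a_i in zip(sizes, effects)), n)

effects = [-1, 0, 1]                                  # hypothetical alpha_i, summing to 0
balanced = grand_mean_bias([2, 2, 2], effects)        # equal n_i
unbalanced = grand_mean_bias([1, 2, 3], effects)      # unequal n_i
print(balanced, unbalanced)
```

With equal sample sizes the bias vanishes, whereas in this unbalanced example it is 1/3.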

(b)

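$$\begin{aligned}
\sum_{i=1}^{m} n_i (\bar{X}_{i.} - \bar{X}_{..})^2 &= \sum_{i=1}^{m} n_i (\bar{X}_{i.}^2 - 2\bar{X}_{i.}\bar{X}_{..} + \bar{X}_{..}^2) \\
&= \sum_{i=1}^{m} n_i \bar{X}_{i.}^2 - 2\bar{X}_{..} \sum_{i=1}^{m} n_i \bar{X}_{i.} + \bar{X}_{..}^2 \sum_{i=1}^{m} n_i \\
&= \sum_{i=1}^{m} n_i \bar{X}_{i.}^2 - 2\bar{X}_{..} \sum_{i=1}^{m} n_i \left[\frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}\right] + n\bar{X}_{..}^2 \\
&= \sum_{i=1}^{m} n_i \bar{X}_{i.}^2 - 2\bar{X}_{..} \sum_{i=1}^{m} \sum_{j=1}^{n_i} X_{ij} + n\bar{X}_{..}^2 \\
&= \sum_{i=1}^{m} n_i \bar{X}_{i.}^2 - 2\bar{X}_{..} [n\bar{X}_{..}] + n\bar{X}_{..}^2 \\
&= \sum_{i=1}^{m} n_i \bar{X}_{i.}^2 - 2n\bar{X}_{..}^2 + n\bar{X}_{..}^2 \\
&= \sum_{i=1}^{m} n_i \bar{X}_{i.}^2 - n\bar{X}_{..}^2
\end{aligned}$$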
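As a numerical sanity check of this identity, the sketch below compares both sides on randomly generated groups of unequal size. The data are simulated with a fixed seed purely for illustration.

```python
import random

random.seed(0)
# Three simulated groups with unequal sizes n_i = 3, 5, 4
groups = [[random.gauss(0, 1) for _ in range(size)] for size in (3, 5, 4)]

n = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n
group_means = [sum(g) / len(g) for g in groups]

# Left side: sum of n_i * (group mean - grand mean)^2
lhs = sum(len(g) * (gm - grand_mean) ** 2 for g, gm in zip(groups, group_means))
# Right side: sum of n_i * (group mean)^2 minus n * (grand mean)^2
rhs = sum(len(g) * gm ** 2 for g, gm in zip(groups, group_means)) - n * grand_mean ** 2
print(lhs, rhs)
```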

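(c)
Required to show: that $\frac{1}{n - m} \sum_{i=1}^{m} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{i.})^2$ is an unbiased estimator of $\sigma^2$.

First, note that:

$$\begin{aligned}
E[X_{ij}] &= \mu + \alpha_i \\
Var[X_{ij}] &= \sigma^2 \\
E[X_{ij}^2] &= Var[X_{ij}] + E[X_{ij}]^2 = \sigma^2 + (\mu + \alpha_i)^2
\end{aligned}$$

Also, since $\bar{X}_{i.} = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}$, we have:

$$\begin{aligned}
E[\bar{X}_{i.}] &= \mu + \alpha_i \\
Var[\bar{X}_{i.}] &= \frac{\sigma^2}{n_i} \\
E[\bar{X}_{i.}^2] &= Var[\bar{X}_{i.}] + E[\bar{X}_{i.}]^2 = \frac{\sigma^2}{n_i} + (\mu + \alpha_i)^2
\end{aligned}$$

Now, the expectation of the expression under consideration is calculated as follows:

$$\begin{aligned}
E\left[\frac{1}{n - m} \sum_{i=1}^{m} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{i.})^2\right]
&= \frac{1}{n - m} E\left[\sum_{i=1}^{m} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_{i.})^2\right] \\
&= \frac{1}{n - m} E\left[\sum_{i=1}^{m} \sum_{j=1}^{n_i} (X_{ij}^2 - 2X_{ij}\bar{X}_{i.} + \bar{X}_{i.}^2)\right] \\
&= \frac{1}{n - m} E\left[\sum_{i=1}^{m} \left(\Big[\sum_{j=1}^{n_i} X_{ij}^2\Big] - 2\bar{X}_{i.}\Big[\sum_{j=1}^{n_i} X_{ij}\Big] + n_i \bar{X}_{i.}^2\right)\right] \\
&= \frac{1}{n - m} E\left[\sum_{i=1}^{m} \left(\Big[\sum_{j=1}^{n_i} X_{ij}^2\Big] - 2\bar{X}_{i.}[n_i \bar{X}_{i.}] + n_i \bar{X}_{i.}^2\right)\right] \quad \text{since } n_i \bar{X}_{i.} = \sum_{j=1}^{n_i} X_{ij} \\
&= \frac{1}{n - m} E\left[\sum_{i=1}^{m} \left(\Big[\sum_{j=1}^{n_i} X_{ij}^2\Big] - 2n_i \bar{X}_{i.}^2 + n_i \bar{X}_{i.}^2\right)\right] \\
&= \frac{1}{n - m} E\left[\sum_{i=1}^{m} \left(\Big[\sum_{j=1}^{n_i} X_{ij}^2\Big] - n_i \bar{X}_{i.}^2\right)\right] \\
&= \frac{1}{n - m} E\left[\sum_{i=1}^{m} \sum_{j=1}^{n_i} \left[X_{ij}^2 - \bar{X}_{i.}^2\right]\right] \\
&= \frac{1}{n - m} \sum_{i=1}^{m} \sum_{j=1}^{n_i} E\left[X_{ij}^2 - \bar{X}_{i.}^2\right] \\
&= \frac{1}{n - m} \sum_{i=1}^{m} \sum_{j=1}^{n_i} \left(E[X_{ij}^2] - E[\bar{X}_{i.}^2]\right) \\
&= \frac{1}{n - m} \sum_{i=1}^{m} \sum_{j=1}^{n_i} \left(\big(\sigma^2 + (\mu + \alpha_i)^2\big) - \big(\tfrac{\sigma^2}{n_i} + (\mu + \alpha_i)^2\big)\right) \\
&= \frac{1}{n - m} \sum_{i=1}^{m} \sum_{j=1}^{n_i} \left(\sigma^2 - \frac{\sigma^2}{n_i}\right) \\
&= \frac{1}{n - m}\, \sigma^2 \sum_{i=1}^{m} \sum_{j=1}^{n_i} \left(1 - \frac{1}{n_i}\right) \\
&= \frac{1}{n - m}\, \sigma^2 \sum_{i=1}^{m} (n_i - 1) \\
&= \frac{1}{n - m}\, \sigma^2 (n - m) \\
&= \sigma^2
\end{aligned}$$

Hence, it is an unbiased estimator of $\sigma^2$.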
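The result can be illustrated (not proved) by simulation: averaging the estimator over many simulated datasets should give a value close to $\sigma^2$. All parameter values below are hypothetical choices of my own, and normal errors are assumed only for convenience.

```python
import random

random.seed(1)
mu, sigma = 10.0, 2.0            # hypothetical parameters, so sigma^2 = 4
effects = [-1.0, 0.0, 1.0]       # hypothetical alpha_i
sizes = [2, 3, 4]                # hypothetical, deliberately unequal n_i
n, m = sum(sizes), len(sizes)

def pooled_variance_estimate() -> float:
    """Simulate one dataset and return (1/(n-m)) * sum_i sum_j (X_ij - Xbar_i.)^2."""
    total = 0.0
    for n_i, a_i in zip(sizes, effects):
        sample = [random.gauss(mu + a_i, sigma) for _ in range(n_i)]
        mean_i = sum(sample) / n_i
        total += sum((x - mean_i) ** 2 for x in sample)
    return total / (n - m)

reps = 20000
average = sum(pooled_variance_estimate() for _ in range(reps)) / reps
print(average)
```

The simulated average should land near 4 even though the group sizes are unequal, in line with the derivation, which never required equal $n_i$.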
