
TAMS65 - Lecture 2

Point estimation:
Methods to obtain a point estimate/estimator

Zhenxia Liu
Matematisk statistik
Matematiska institutionen
Content

- Review of Lecture 1
- Introduction
- Method of Moments (MM)
- Least Square Method (LSM)
- Maximum Likelihood Method (ML)
- Appendix


Review of Lecture 1

- Population (population) X with unknown parameter θ.
  Goal: to estimate θ.
- Random sample (slumpmässigt stickprov) X_1, ..., X_n, with
  observations (observationer) x_1, ..., x_n.
- Point estimator (stickprovsvariabeln/skattningsvariabel)
  $\hat{\Theta} = f(X_1, \ldots, X_n)$.
- Point estimate (punktskattning) $\hat{\theta} = f(x_1, \ldots, x_n)$.
- Unbiased (väntevärdesriktighet)
- More effective (effektivare)
- Consistent (konsistent)


Review of Lecture 1

Population mean: $\mu \approx \hat{\mu} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ (the sample mean).

Population variance: $\sigma^2 \approx \hat{\sigma}^2$, where

$$\hat{\sigma}^2 = \begin{cases} \dfrac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2 & \text{if } \mu \text{ is known (känt)}, \\[6pt] s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 & \text{if } \mu \text{ is unknown (okänt)}. \end{cases}$$

Here $s^2$ is called the sample variance, and $s = \sqrt{s^2}$ is called the sample standard deviation.

Population standard deviation: $\sigma \approx \hat{\sigma} = \sqrt{\hat{\sigma}^2}$.
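As a quick illustration (my own sketch, not part of the original slides), these estimates translate directly into NumPy; the data values are made up for demonstration.

```python
import numpy as np

x = np.array([4.1, 5.3, 3.8, 4.9, 5.0])  # hypothetical observations

x_bar = x.mean()        # sample mean, estimates mu
s2 = x.var(ddof=1)      # sample variance s^2 (divides by n - 1)
s = np.sqrt(s2)         # sample standard deviation

# If mu were known, the variance estimate divides by n instead:
mu_known = 4.5
sigma2_known_mu = np.mean((x - mu_known) ** 2)
```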


Introduction

There are several methods to find a point estimate/estimator:

- The Method of Moments (momentmetoden)
- Least Square Method (minsta-kvadrat-metoden)
- Maximum Likelihood Method (maximum-likelihood-metoden)

Skill: applying these methods amounts to applying the pdf/pmf, expectation, variance, and related properties.


The Method of Moments

The population X has an unknown parameter θ that we want to estimate.
Random sample X_1, ..., X_n, observations x_1, ..., x_n.

The Method of Moments (momentmetoden), MM, sets

Population moment = Sample moment:

$$\begin{aligned} \text{1st moment:}\quad & E(X) = \bar{x} = \tfrac{1}{n}\textstyle\sum_{i=1}^{n} x_i \\ \text{2nd moment:}\quad & E(X^2) = \tfrac{1}{n}\textstyle\sum_{i=1}^{n} x_i^2 \\ & \ \ \vdots \\ \text{kth moment:}\quad & E(X^k) = \tfrac{1}{n}\textstyle\sum_{i=1}^{n} x_i^k \end{aligned}$$

Solve for θ to get $\hat{\theta}_{MM}$ (or simply $\hat{\theta}$).

Note: the number of equations used equals the number of unknown parameters.
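A minimal numerical sketch of this recipe (my own illustration, not from the slides): for X ∼ Exp(1/µ) the first moment gives E(X) = µ, so setting it equal to the sample moment yields µ̂_MM = x̄.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
mu_true = 5.0
x = rng.exponential(scale=mu_true, size=1000)  # simulated sample

# 1st population moment E(X) = mu; equate it to the sample moment x_bar
mu_mm = x.mean()
print(mu_mm)  # close to 5.0
```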
Example 1

Example 1: Assume that x_1, ..., x_m are observations of independent random variables X_1, ..., X_m, where X_i ∼ Bin(n, p). Find the point estimate (punktskattning) $\hat{p}_{MM}$ of p using the Method of Moments.

Binomial distribution (binomialfördelning), X ∼ Bin(n, p):

- p = population proportion
- E(X) = np

$$\hat{p}_{MM} = \frac{\bar{x}}{n}, \quad \text{where } \bar{x} = \frac{1}{m}\sum_{i=1}^{m} x_i.$$

(Exercise: prove that $\hat{p}_{MM}$ is unbiased.)
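To sanity-check the formula $\hat{p}_{MM} = \bar{x}/n$, a small simulation (my own sketch, assuming NumPy; the parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n, p_true, m = 20, 0.3, 500
x = rng.binomial(n, p_true, size=m)  # m observations of Bin(n, p)

p_mm = x.mean() / n                  # p_hat = x_bar / n
print(p_mm)                          # close to 0.3
```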


Least Square Method

The population X has an unknown parameter θ that we want to estimate.
Random sample X_1, ..., X_n, observations x_1, ..., x_n.

Least Square Method (minsta-kvadrat-metoden), LSM: find the θ that minimizes

$$Q(\theta) = \sum_{i=1}^{n} [x_i - E(X_i)]^2.$$

Then we get $\hat{\theta}_{LSM}$ (or simply $\hat{\theta}$).

Note: θ may be multidimensional.


Example 2

Example 2: Suppose that the distribution of a population X has the probability density function (täthetsfunktionen)

$$f_X(x) = \begin{cases} \dfrac{1}{2a}\, e^{-x/(2a)} & \text{if } x \ge 0; \\[4pt] 0 & \text{otherwise}, \end{cases}$$

where a > 0 is an unknown parameter. Observations x_1, x_2, ..., x_n are given. Find a point estimate (punktskattning) of a using the Least Square Method.

Exponential distribution (exponentialfördelning):

- X ∼ Exp(1/µ) with pdf $f_X(x) = \frac{1}{\mu} e^{-x/\mu}$ for x ≥ 0, and E(X) = µ.

$$\hat{a}_{LSM} = \frac{\bar{x}}{2}$$

(Exercise: prove that $\hat{a}_{LSM}$ is unbiased.)
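A numerical check of $\hat{a}_{LSM} = \bar{x}/2$ (my own sketch, assuming NumPy): minimize Q(a) = Σ(x_i − 2a)² on a grid and compare with the closed form.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
a_true = 2.0
x = rng.exponential(scale=2 * a_true, size=400)  # E(X) = 2a

grid = np.linspace(0.1, 10, 2000)                # candidate values of a
Q = ((x[:, None] - 2 * grid[None, :]) ** 2).sum(axis=0)  # Q(a) on the grid
a_grid = grid[Q.argmin()]

print(a_grid, x.mean() / 2)  # both close to a_true = 2.0
```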
Preparations for the Maximum Likelihood Method

Summation:

$$\sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n, \qquad \sum_{i=1}^{n} c = nc,$$

$$\sum_{i=1}^{n} c\,x_i = cx_1 + cx_2 + \dots + cx_n = c(x_1 + x_2 + \dots + x_n) = c \sum_{i=1}^{n} x_i.$$

Product:

$$\prod_{i=1}^{n} x_i = x_1 \cdot x_2 \cdot \ldots \cdot x_n, \qquad \prod_{i=1}^{n} c = c^n,$$

$$\prod_{i=1}^{n} (c\,x_i) = (cx_1)(cx_2)\cdots(cx_n) = c^n\, x_1 x_2 \cdots x_n = c^n \prod_{i=1}^{n} x_i.$$


Preparations for the Maximum Likelihood Method

$$\ln(a \cdot b) = \ln a + \ln b, \qquad \ln\frac{a}{b} = \ln a - \ln b,$$

$$\ln a^c = c \ln a, \qquad \ln e^c = c, \qquad e^{\ln c} = c.$$


Maximum Likelihood Method

Maximum Likelihood Method (maximum-likelihood-metoden), ML:

Let x_1, ..., x_n be observations of independent r.v.s X_1, ..., X_n with

- $p(x; \theta) := p_X(x)$, the pmf of a discrete r.v.;
- $f(x; \theta) := f_X(x)$, the pdf of a continuous r.v.

The likelihood function (likelihoodfunktionen) L(θ) is defined as

$$L(\theta) = \begin{cases} \prod_{i=1}^{n} p(x_i; \theta) = p(x_1; \theta) \cdot \ldots \cdot p(x_n; \theta), & \text{discrete r.v.} \\[4pt] \prod_{i=1}^{n} f(x_i; \theta) = f(x_1; \theta) \cdot \ldots \cdot f(x_n; \theta), & \text{continuous r.v.} \end{cases}$$

Find the θ that maximizes L(θ); this gives $\hat{\theta}_{ML}$ (or simply $\hat{\theta}$).
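In code, maximizing L(θ) is usually done through ln L(θ). A minimal sketch (my own, assuming NumPy) for a continuous model, here Exp(1/µ) with made-up data:

```python
import numpy as np

def log_likelihood(mu, x):
    # ln L(mu) = sum of ln f(x_i; mu) for f(x) = (1/mu) e^(-x/mu)
    return np.sum(-np.log(mu) - x / mu)

x = np.array([2.3, 0.7, 1.9, 3.1, 0.4])         # hypothetical observations
grid = np.linspace(0.1, 10, 5000)               # candidate values of mu
ll = np.array([log_likelihood(m, x) for m in grid])
mu_ml = grid[ll.argmax()]
print(mu_ml, x.mean())                          # grid maximizer ~ x_bar
```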
Maximum Likelihood Method

Observations: (−0.5, 0, 0.3, 0.5, 0.7, 0.8, 0.95, 1.15, 1.25, 1.30, 1.6, 1.9, 2.7, 3.5).

When θ changes from θ_1 to θ_2, we get a "new" probability density function. ML chooses the pdf that makes L(θ) as large as possible.


Maximum Likelihood Method

Note:

- In general, it is easier to maximize ln L(θ).
- If there are observations x_1, ..., x_n and y_1, ..., y_m from independent r.v.s X_i, i = 1, ..., n and Y_j, j = 1, ..., m, respectively, where X_i and Y_j have different distributions but both distributions contain the same unknown parameter θ, then (see the sketch after this list)

  $$L(\theta) = L_1(\theta) \cdot L_2(\theta).$$

- The parameter θ can be multidimensional.
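A sketch of the second note (my own illustration): with two independent samples whose distributions share θ, the log-likelihoods add, so one maximizes ln L1(θ) + ln L2(θ). Here both samples are taken to be exponential with the same mean µ, an assumption chosen just for the demo.

```python
import numpy as np

def exp_loglik(mu, data):
    # ln L(mu) for an Exp(1/mu) sample
    return np.sum(-np.log(mu) - data / mu)

x = np.array([1.2, 3.4, 0.8, 2.2])   # sample of the X_i
y = np.array([2.7, 1.1, 0.5])        # sample of the Y_j, same mu

grid = np.linspace(0.1, 10, 5000)
ll = np.array([exp_loglik(m, x) + exp_loglik(m, y) for m in grid])
mu_ml = grid[ll.argmax()]
# For this particular model the maximizer is the pooled mean:
print(mu_ml, (x.sum() + y.sum()) / (len(x) + len(y)))
```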


Example 3

Example 3: During a "short" geological period, it may be reasonable to assume that the times between successive eruptions of a volcano are independent and exponentially distributed with an expected value µ that is characteristic of the individual volcano. The table below shows the times in months between 36 successive eruptions of the volcano Mauna Loa in Hawaii, 1832-1950.

126  73   3   6  37  23
 73  23   2  65  94  51
 26  21   6  68  16  20
  6  18   6  41  40  18
 41  11  12  38  77  61
 26   3  38  50  91  12

According to the data, estimate µ by the Maximum Likelihood Method.
Example 3

According to the data, estimate µ by the Maximum Likelihood Method.

Model: Let X be the time between two successive eruptions; then X ∼ Exp(1/µ) with pdf $f_X(x) = \frac{1}{\mu} e^{-x/\mu}$ for x ≥ 0.

$$\hat{\mu}_{ML} = \bar{x} = \frac{x_1 + x_2 + \dots + x_{36}}{36} = 36.72$$
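The arithmetic can be checked directly (my own sketch, assuming NumPy):

```python
import numpy as np

times = np.array([126, 73, 3, 6, 37, 23,
                  73, 23, 2, 65, 94, 51,
                  26, 21, 6, 68, 16, 20,
                  6, 18, 6, 41, 40, 18,
                  41, 11, 12, 38, 77, 61,
                  26, 3, 38, 50, 91, 12])

print(times.size, times.mean())  # 36 observations, mean ~ 36.72
```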


Example 4

Example 4: The following data are 40 observations from a Poisson distribution with parameter λ (which is the mean):

Value       0   1   2   3   4
Frequency  21   0  11   6   2

Find the Maximum Likelihood estimate of λ from the given information.

Poisson distribution (Poissonfördelning), X ∼ Po(λ):

- $p_X(k) = \frac{\lambda^k}{k!} e^{-\lambda}$ for k = 0, 1, ...

Note that there are 21 + 0 + 11 + 6 + 2 = 40 observations in total.

$$\hat{\lambda}_{ML} = \bar{x} = 1.2.$$
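Computing x̄ from the frequency table (my own sketch):

```python
import numpy as np

values = np.array([0, 1, 2, 3, 4])
freq = np.array([21, 0, 11, 6, 2])

n = freq.sum()                      # 40 observations
lam_ml = (values * freq).sum() / n  # frequency-weighted mean
print(n, lam_ml)                    # 40, 1.2
```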


Maximum Likelihood Method - Normal distribution

We have observations x_1, ..., x_n from independent r.v.s X_1, ..., X_n, where X_i ∼ N(µ, σ).

Normal distribution (normalfördelning), X ∼ N(µ, σ):

- $f_X(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, x ∈ R.

Case 1: σ is known and µ is unknown. Then $\hat{\mu}_{ML} = \bar{x}$.
The proof is given in the Appendix.

Case 2: µ is known and σ is unknown. Then

$$\hat{\sigma}^2_{ML} = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2 \qquad \text{(Exercise)}$$


Maximum Likelihood Method - Normal distribution

Case 3: Both µ and σ are unknown. Then

$$\hat{\mu}_{ML} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x} \ \text{(unbiased)}; \qquad \hat{\sigma}^2_{ML} = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 \ \text{(biased)}.$$

The proof is given in the Appendix.

Note that

$$E(\hat{\Sigma}^2_{ML}) = E\Big(\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2\Big) = \frac{n-1}{n}\,\sigma^2 \ne \sigma^2,$$

so $\hat{\sigma}^2_{ML}$ is NOT unbiased. We therefore make an adjustment by choosing $\frac{n}{n-1}\hat{\sigma}^2_{ML}$, since it is unbiased. That is,

$$\hat{\sigma}^2 = \frac{n}{n-1}\,\hat{\sigma}^2_{ML} = \frac{n}{n-1}\cdot\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = s^2.$$
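A simulation makes the n − 1 adjustment visible (my own sketch, with arbitrary parameter values): averaging $\hat{\sigma}^2_{ML}$ over many samples undershoots σ², while s² hits it.

```python
import numpy as np

rng = np.random.default_rng(seed=4)
mu, sigma, n, reps = 0.0, 2.0, 5, 100_000

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1, keepdims=True)

sig2_ml = ((samples - xbar) ** 2).mean(axis=1)      # divides by n (biased)
s2 = ((samples - xbar) ** 2).sum(axis=1) / (n - 1)  # divides by n-1 (unbiased)

print(sig2_ml.mean())  # ~ (n-1)/n * sigma^2 = 3.2
print(s2.mean())       # ~ sigma^2 = 4.0
```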


Maximum Likelihood Method - Corrected/Adjusted estimate

The corrected/adjusted (korrigerade) point estimate of σ² is the sample variance:

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2.$$

For a sample from a Normal distribution where both µ and σ² are unknown, we use the point estimates

$$\hat{\mu} = \bar{x}, \qquad \hat{\sigma}^2 = s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2.$$


Maximum Likelihood Method - More samples from Normal distributions

Now suppose we have two samples from independent Normal distributions:

x_1, ..., x_{n_1}, where X_1, ..., X_{n_1} are independent and N(µ_1, σ);
y_1, ..., y_{n_2}, where Y_1, ..., Y_{n_2} are independent and N(µ_2, σ).

The estimates of the three parameters can be deduced from the likelihood function

$$L(\mu_1, \mu_2, \sigma^2) = L_1(\mu_1, \sigma^2) \cdot L_2(\mu_2, \sigma^2) = \prod_{i=1}^{n_1} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x_i-\mu_1)^2}{2\sigma^2}} \cdot \prod_{i=1}^{n_2} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(y_i-\mu_2)^2}{2\sigma^2}}.$$


Maximum Likelihood Method - More samples from Normal distributions

Then $\hat{\mu}_1 = \bar{x}$ and $\hat{\mu}_2 = \bar{y}$.

The corrected/adjusted point estimate of σ² is

$$\hat{\sigma}^2 = s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)},$$

where

$$s_1^2 = \frac{1}{n_1 - 1}\sum_{i=1}^{n_1}(x_i - \bar{x})^2 \quad \text{and} \quad s_2^2 = \frac{1}{n_2 - 1}\sum_{i=1}^{n_2}(y_i - \bar{y})^2$$

are the sample variances of the respective samples.

Here $s^2$ is called the combined/pooled sample variance.

Note: This result generalizes to more than two samples.
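A direct translation of the pooled-variance formula (my own sketch, assuming NumPy; the data are made up):

```python
import numpy as np

def pooled_variance(x, y):
    # Combined/pooled sample variance for two samples sharing sigma
    n1, n2 = len(x), len(y)
    s1_sq = x.var(ddof=1)
    s2_sq = y.var(ddof=1)
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / ((n1 - 1) + (n2 - 1))

x = np.array([4.2, 5.1, 3.9, 4.8])
y = np.array([7.0, 6.4, 7.7, 6.9, 7.2])
print(pooled_variance(x, y))
```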


MM, LSM, ML

MM versus LSM versus ML:

- MM, the Method of Moments:
  - simple;
  - consistent point estimate/estimator;
  - usually biased.
- LSM, Least Square Method:
  - the idea is good;
  - used in linear regression.
- ML, Maximum Likelihood Method:
  - can take more samples into account;
  - usually lower variance than the other methods.


Practice after the lecture:

Exercises - Lesson 2:

(I) 11.23, PS-1, 11.10, 11.14, 11.12, 11.15.

(II) 11.13(a), 11.11, 11.16, 11.28, 11.22, 11.25.

Prepare your questions for the lessons.

Thank you!


Appendix

Solution to Example 1

By the Method of Moments, we have E(X) = x̄, which gives

$$np = \bar{x} = \frac{1}{m}\sum_{i=1}^{m} x_i, \quad \text{that is,} \quad p = \frac{\bar{x}}{n}.$$

So

$$\hat{p}_{MM} = \frac{\bar{x}}{n}, \quad \text{where } \bar{x} = \frac{1}{m}\sum_{i=1}^{m} x_i.$$


Appendix

Solution to Example 2

Note that $E(X_i) = E(X) = \int_0^\infty x \cdot \frac{1}{2a}\, e^{-x/(2a)}\, dx = 2a$.

Then $Q(a) = \sum_{i=1}^{n}[x_i - E(X_i)]^2 = \sum_{i=1}^{n}(x_i - 2a)^2$.

Setting $Q'(a) = 2\sum_{i=1}^{n}(x_i - 2a)(-2) = 0$ gives $a = \bar{x}/2$.

Since $Q''(a) = 8n > 0$, this is the minimum of Q(a).

That is, $\hat{a}_{LSM} = \bar{x}/2$.


Appendix

Solution to Example 3

Model: Let X be the time between two successive eruptions; then X ∼ Exp(1/µ) with pdf $f(x) = \frac{1}{\mu} e^{-x/\mu}$ for x ≥ 0.

We have 36 observations x_1, ..., x_36, so n = 36.

By the Maximum Likelihood Method,

$$L(\mu) = \prod_{i=1}^{n} f(x_i; \mu) = \prod_{i=1}^{n} \frac{1}{\mu}\, e^{-x_i/\mu} = \frac{1}{\mu^n}\, e^{-\frac{1}{\mu}\sum_{i=1}^{n} x_i},$$

$$\ln L(\mu) = -n\ln\mu - \frac{1}{\mu}\sum_{i=1}^{n} x_i.$$

Setting $\dfrac{d \ln L(\mu)}{d\mu} = -\dfrac{n}{\mu} + \dfrac{1}{\mu^2}\sum_{i=1}^{n} x_i = 0$ gives µ = x̄. Is this a maximum?

$$\frac{d^2 \ln L(\mu)}{d\mu^2}\Big|_{\mu=\bar{x}} = \Big(\frac{n}{\mu^2} - \frac{2}{\mu^3}\sum_{i=1}^{n} x_i\Big)\Big|_{\mu=\bar{x}} = \frac{n}{\bar{x}^2} - \frac{2n\bar{x}}{\bar{x}^3} = -\frac{n}{\bar{x}^2} < 0,$$

so it is indeed a maximum. Hence

$$\hat{\mu}_{ML} = \bar{x} = \frac{x_1 + x_2 + \dots + x_{36}}{36} = 36.72.$$
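The derivative conditions can also be verified numerically (my own sketch, assuming SciPy is available): minimize −ln L(µ) and confirm the optimum sits at the sample mean.

```python
import numpy as np
from scipy.optimize import minimize_scalar

x = np.array([126, 73, 3, 6, 37, 23, 73, 23, 2, 65, 94, 51,
              26, 21, 6, 68, 16, 20, 6, 18, 6, 41, 40, 18,
              41, 11, 12, 38, 77, 61, 26, 3, 38, 50, 91, 12])

def neg_log_lik(mu):
    # -ln L(mu) = n ln(mu) + (1/mu) * sum(x_i)
    return len(x) * np.log(mu) + x.sum() / mu

res = minimize_scalar(neg_log_lik, bounds=(1.0, 200.0), method="bounded")
print(res.x, x.mean())  # both ~ 36.72
```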


Appendix

Solution to Example 4

The likelihood function is

$$L(\lambda) = \prod_{i=1}^{n} \frac{\lambda^{x_i}}{x_i!}\, e^{-\lambda} = \frac{\lambda^{\sum_{i=1}^{n} x_i}}{\prod_{i=1}^{n} x_i!}\, e^{-n\lambda},$$

$$\ln L(\lambda) = \sum_{i=1}^{n} x_i \ln\lambda - \ln\Big(\prod_{i=1}^{n} x_i!\Big) - n\lambda.$$

Setting $\dfrac{d \ln L(\lambda)}{d\lambda} = \dfrac{\sum_{i=1}^{n} x_i}{\lambda} - n = 0$ gives $\lambda = \bar{x}$.

Since $\dfrac{d^2 \ln L(\lambda)}{d\lambda^2} = -\dfrac{\sum_{i=1}^{n} x_i}{\lambda^2} < 0$, this is a maximum.

Then the Maximum Likelihood estimate for λ is

$$\hat{\lambda}_{ML} = \bar{x} = \frac{21 \cdot 0 + 0 \cdot 1 + 11 \cdot 2 + 6 \cdot 3 + 2 \cdot 4}{40} = 1.2.$$


Appendix

We have observations x_1, ..., x_n from independent r.v.s X_1, ..., X_n, where X_i ∼ N(µ, σ).

Case 1: σ is known and µ is unknown. Then $\hat{\mu}_{ML} = \bar{x}$.

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

$$L(\mu) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x_i-\mu)^2}{2\sigma^2}} = \frac{1}{\sigma^n (2\pi)^{n/2}}\, e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2}$$

L(µ) attains its maximum when $\sum_{i=1}^{n}(x_i - \mu)^2$ attains its minimum, i.e. $\hat{\mu}_{ML} = \bar{x}$ (the same as with MM and LSM).


Appendix

Case 3: Both µ and σ are unknown.

The likelihood function is

$$L(\mu, \sigma) = \Big[\frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x_1-\mu)^2/2\sigma^2}\Big] \cdot \ldots \cdot \Big[\frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x_n-\mu)^2/2\sigma^2}\Big] = \Big(\frac{1}{\sqrt{2\pi}}\Big)^{n} \sigma^{-n}\, e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2}.$$

Then we get

$$\ln L(\mu, \sigma) = \text{constant} - n\ln\sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2.$$


Appendix

$$\frac{\partial \ln L(\mu,\sigma)}{\partial\mu} = -\frac{1}{2\sigma^2}\sum_{i=1}^{n} 2(x_i - \mu)(-1) = \frac{1}{\sigma^2}\Big(\sum_{i=1}^{n} x_i - n\mu\Big)$$

$$\frac{\partial \ln L(\mu,\sigma)}{\partial\sigma} = -\frac{n}{\sigma} + \frac{1}{\sigma^3}\sum_{i=1}^{n}(x_i - \mu)^2$$

Setting both partial derivatives to zero gives

$$\hat{\mu}_{ML} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x} \ \text{(unbiased)}, \qquad \hat{\sigma}^2_{ML} = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 \ \text{(biased)}.$$


Thank you!
