
TAMS65 - Lecture 1

Introduction and Point estimation

Zhenxia Liu
Matematisk statistik
Matematiska institutionen
Content

- Syllabus, Course Plan, Course evaluation
- Introduction
- Repetitions
- Definitions in Statistics
- Point estimation
  - Point estimate, point estimator
  - Unbiased, more effective
  - Consistent
  - Commonly used point estimates/estimators
- Appendix

TAMS65 - Lecture1 1/27


Introduction

- Probability Theory - TAMS79/TAMS80
  Construct models that describe how common different events are and that
  explain the variation in measurement data.

- Statistical Theory - TAMS65
  Provide basic knowledge of statistical methods, i.e. how to draw
  conclusions about phenomena affected by chance based on observed data.



Big picture of TAMS65

    Y = β0 + β1·x + ε

where Y = Heights, x = Weights, and ε = Error.

- Typical linear regression model (Lectures 10-12)
- β0, β1: point/interval estimation, hypothesis testing (Lectures 1-7)
- ε: χ²-test, random vectors (Lectures 8-9)



Repetition
General normal distribution (normalfördelning):

X ∼ N(µ, σ) if its pdf is f_X(x) = (1/(σ√(2π))) e^(−(x−µ)²/(2σ²)), x ∈ R.

Theorem 1: If X ∼ N(µ, σ), then (X − µ)/σ ∼ N(0, 1).

Standard normal distribution:

Z ∼ N(0, 1) if its pdf is φ(z) = (1/√(2π)) e^(−z²/2), z ∈ R.



Repetition
Standard normal distribution:

Z ∼ N(0, 1) if its pdf is φ(z) = (1/√(2π)) e^(−z²/2), z ∈ R.

Its cdf is Φ(z) = P(Z ≤ z).

- P(Z ≤ a) = Φ(a)
- P(a ≤ Z ≤ b) = Φ(b) − Φ(a)
- Φ(−a) = 1 − Φ(a)
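These cdf properties can be checked numerically; a minimal sketch (assuming Python, and using the standard identity Φ(z) = (1 + erf(z/√2))/2):

```python
import math

def Phi(z: float) -> float:
    """Standard normal cdf Φ(z), computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

a, b = 0.5, 1.96
print(Phi(b) - Phi(a))        # P(a <= Z <= b) = Φ(b) − Φ(a)
print(Phi(-a), 1 - Phi(a))    # symmetry: Φ(−a) = 1 − Φ(a)
```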



Definitions in Statistics
Population (population) := the entire collection of objects that we are
interested in. Denoted by X, Y, ...

Assumption: The population X can be modelled by a certain kind of
distribution with unknown parameter(s) θ.

Goal: Estimate the unknown parameter(s) θ.

Example: What is the 'true' average height of all adults in Sweden?

- Let X = {heights of all adults in Sweden}; then X is the population.
- It is reasonable to assume X ∼ N(µ, σ). (Why?)
- µ = population mean = true average height
- σ = population standard deviation = true standard deviation of heights
- For example, we want to estimate µ.



Definitions in Statistics
Sample (stickprov), denoted by {X1, ..., Xn}, is a subset of the
population.

Random sample (slumpmässigt stickprov), denoted by {X1, ..., Xn}, is a
sample such that X1, ..., Xn are independent and X1, ..., Xn have the
same distribution as the population.

Note: n = sample size (stickprovsstorlek).

- Population X = {heights of all adults in Sweden}; we assume
  X ∼ N(µ, σ).
- e.g. Plan to choose n adults (from the population) who are not
  genetically related.
- Let Xi be the height of adult i, i = 1, ..., n.
- Then X1, ..., Xn are independent and Xi ∼ N(µ, σ), i = 1, ..., n.
- So {X1, ..., Xn} is a random sample.



Definitions in Statistics

- Before we measure/observe: X1, ..., Xn are random variables.
- After we measure/observe: the observations (observationer)
  x1, ..., xn are numbers.

For example:

- Population X = {heights of all adults in Sweden}
- Plan to choose n adults who are not genetically related.
- Before measuring/observing: X1, X2, ..., Xn.
- After measuring/observing: x1 = 180 cm, x2 = 175 cm, ..., xn = 182 cm.



Point estimation - point estimator, point estimate
Point estimator (stickprovsvariabeln/skattningsvariabeln) of an unknown
parameter θ, denoted by Θ̂, is a function of the random sample
{X1, ..., Xn}. That is, Θ̂ = f(X1, ..., Xn).

- A point estimator is a random variable.
- e.g. The sample mean (stickprovsmedelvärdet)
  X̄ = (1/n)·Σᵢ₌₁ⁿ Xi = (X1 + ... + Xn)/n is a point estimator.

Point estimate (punktskattning) of an unknown parameter θ, denoted by θ̂,
is an observed value of the point estimator, that is, θ̂ = f(x1, ..., xn).

- A point estimate is a number.
- e.g. The sample mean x̄ = (1/n)·Σᵢ₌₁ⁿ xi is a point estimate.



Point estimation
True value θ vs. point estimate θ̂ vs. point estimator Θ̂

- θ = the true/real/theoretical value.
- θ̂ = a point estimate = an estimate of θ, calculated from the
  observations.
- Θ̂ = a point estimator = describes the variation of θ̂ over different
  sets of observations.

Standard error (medelfelet) of a point estimate θ̂:

d = d(θ̂) := an estimate of D(Θ̂) = an estimate of √V(Θ̂).

In the textbook, you may see the notation
point estimator Θ̂ = θ*(X) and point estimate θ̂ = θ*(x).



Point estimation - point estimate, point estimator
Example 1: Let X = {heights of all adults in Sweden}, and assume that
X ∼ N(µ, σ).
To estimate µ, we chose three (unrelated) adults in Sweden and got
x1 = 178 cm, x2 = 182 cm, and x3 = 186 cm.

To estimate µ, we try different estimates, for example:

µ̂1 = (1/3)(x1 + x2 + x3),   µ̂2 = (1/2)(x1 + x3),   µ̂3 = x3 − 2.

Are µ̂1, µ̂2 and µ̂3 point estimates of µ?

Then we get

µ̂1 = (1/3)(178 + 182 + 186) = 182,
µ̂2 = (1/2)(178 + 186) = 182,   µ̂3 = 186 − 2 = 184.

Are they equally good point estimates? If not, which is the best?
Point estimation - unbiased
Rules to compare different point estimates/estimators

Rule 1:
A point estimator Θ̂ of θ is unbiased (väntevärdesriktig) if

E(Θ̂) = θ.

Example 1 continued: Three point estimates of µ were given:

µ̂1 = (1/3)(x1 + x2 + x3),   µ̂2 = (1/2)(x1 + x3),   µ̂3 = x3 − 2.

Their corresponding point estimators of µ are

M̂1 = (1/3)(X1 + X2 + X3),   M̂2 = (1/2)(X1 + X3),   M̂3 = X3 − 2.

(Lowercase µ̂ ↔ uppercase M̂.) Are they unbiased?



Point estimation - unbiased
Note that {X1, X2, X3} is a random sample since X1, X2, X3 are
independent and Xi ∼ N(µ, σ), i = 1, 2, 3.

E(M̂1) = E((1/3)(X1 + X2 + X3)) = (1/3)E(X1) + (1/3)E(X2) + (1/3)E(X3) = µ,
E(M̂2) = E((1/2)(X1 + X3)) = (1/2)E(X1) + (1/2)E(X3) = µ,
E(M̂3) = E(X3 − 2) = E(X3) − 2 = µ − 2 ≠ µ.

Point estimator M̂3 / point estimate µ̂3 is NOT unbiased.

Both point estimators M̂1 and M̂2 are unbiased, that is, the point
estimates µ̂1 and µ̂2 are unbiased. Which is better?



Point estimation - more effective
Rule 2:
If Θ̂1 and Θ̂2 are unbiased point estimators of θ, then Θ̂1 is more
effective (effektivare) than Θ̂2 if

V(Θ̂1) < V(Θ̂2).

Example 1 continued: Both point estimators M̂1 and M̂2 of µ are unbiased,
but which point estimator is more effective?

V(M̂1) = V((1/3)(X1 + X2 + X3)) = (1/3²)V(X1) + (1/3²)V(X2) + (1/3²)V(X3) = σ²/3,
V(M̂2) = V((1/2)(X1 + X3)) = (1/2²)V(X1) + (1/2²)V(X3) = σ²/2.

Then V(M̂1) < V(M̂2), so M̂1 / µ̂1 is more effective. Thus we choose
µ̂1 = (1/3)(x1 + x2 + x3) among these three point estimates.
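Rules 1 and 2 can also be seen by simulation; below is a minimal sketch that repeatedly draws samples of size 3 and compares the three estimators from Example 1 (the values µ = 182 and σ = 7 are illustrative assumptions, not from the lecture):

```python
import random
import statistics

random.seed(1)
mu, sigma, reps = 182.0, 7.0, 20000

m1, m2, m3 = [], [], []
for _ in range(reps):
    x1, x2, x3 = (random.gauss(mu, sigma) for _ in range(3))
    m1.append((x1 + x2 + x3) / 3)   # M̂1: unbiased, variance σ²/3
    m2.append((x1 + x3) / 2)        # M̂2: unbiased, variance σ²/2
    m3.append(x3 - 2)               # M̂3: biased, mean µ − 2

print(statistics.mean(m1), statistics.mean(m2), statistics.mean(m3))
print(statistics.variance(m1) < statistics.variance(m2))  # M̂1 is more effective
```

The empirical means of M̂1 and M̂2 land near µ while M̂3 sits near µ − 2, and the empirical variances order as σ²/3 < σ²/2 < σ², matching the calculation above.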



Point estimation - consistent

Rule 3: A point estimator Θ̂n of θ is said to be consistent (konsistent)
if

P(|Θ̂n − θ| > ε) → 0 as n → ∞,

for every ε > 0.

Theorem 2: If E(Θ̂n) = θ and V(Θ̂n) → 0 as n → ∞, then Θ̂n is a
consistent point estimator of θ.

Note: The proof of Theorem 2 is given in the Appendix.

Remark: Bias (systematiskt fel) is a systematic error that leads to an
incorrect estimate of an effect or association: bias = E(Θ̂) − θ.
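Consistency of the sample mean (E(X̄) = µ and V(X̄) = σ²/n → 0, so Theorem 2 applies) can be illustrated empirically; a sketch with illustrative values µ = 0, σ = 1, ε = 0.1:

```python
import random

random.seed(0)
mu, sigma, eps, reps = 0.0, 1.0, 0.1, 2000

def exceed_prob(n: int) -> float:
    """Monte Carlo estimate of P(|X̄_n − µ| > ε) over `reps` samples."""
    hits = 0
    for _ in range(reps):
        xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
        if abs(xbar - mu) > eps:
            hits += 1
    return hits / reps

probs = [exceed_prob(n) for n in (10, 100, 1000)]
print(probs)  # decreasing toward 0 as n grows
```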



Point estimation - Example 2

Example 2: Let x1, ..., x7 be independent observations of a random
variable X with E(X) = µ and V(X) = σ². Is the following point estimate
of σ² unbiased?

σ̂² = (x2² + x6² − (x2 + x6)/2) / 2.

Note: lowercase σ̂² ↔ uppercase Σ̂².

σ̂² is not unbiased.



Point estimation - Theorems
Let x1, ..., xn be observations of independent r.v.s X1, ..., Xn with
E(Xi) = µ and V(Xi) = σ².

Theorem 3: The sample mean (stickprovsmedelvärdet)

M̂ = X̄ = (1/n)·Σᵢ₌₁ⁿ Xi,   µ̂ = x̄ = (1/n)·Σᵢ₌₁ⁿ xi

is an unbiased and consistent point estimator of µ.

Commonly used point estimates/estimators

- Population mean µ ≈ µ̂ = x̄ (sample mean)
  - µ̂ = x̄ = (1/n)·Σᵢ₌₁ⁿ xi
  - M̂ = X̄ = (1/n)·Σᵢ₌₁ⁿ Xi

The proof is given in the Appendix.

Point estimation - Theorems
Let x1, ..., xn be observations of independent r.v.s X1, ..., Xn with
E(Xi) = µ and V(Xi) = σ².

Theorem 4: The sample variance (stickprovsvariansen)

S² = (1/(n−1))·Σᵢ₌₁ⁿ (Xi − X̄)²,   s² = (1/(n−1))·Σᵢ₌₁ⁿ (xi − x̄)²

is an unbiased and consistent point estimator of σ².

Commonly used point estimates/estimators

- Population variance σ² ≈ σ̂², if µ is known (känt):
  - σ̂² = (1/n)·Σᵢ₌₁ⁿ (xi − µ)²
  - Σ̂² = (1/n)·Σᵢ₌₁ⁿ (Xi − µ)²

The proof of unbiasedness is given in the Appendix.

Point estimation

Commonly used point estimates/estimators

Population variance σ² ≈ σ̂², if µ is unknown (okänt):

- σ̂² = s² = (1/(n−1))·Σᵢ₌₁ⁿ (xi − x̄)² (sample variance)
- Σ̂² = S² = (1/(n−1))·Σᵢ₌₁ⁿ (Xi − X̄)²
- Sample standard deviation s = √s² and S = √S².

Note: s² = (1/(n−1))·Σᵢ₌₁ⁿ (xi − x̄)² = (Σᵢ₌₁ⁿ xi² − n·x̄²)/(n − 1).

Population standard deviation σ ≈ σ̂.
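The note's shortcut formula for s² can be verified directly; a small sketch (the five height observations are hypothetical, in the spirit of Example 1):

```python
import statistics

x = [178.0, 182.0, 186.0, 175.0, 180.0]  # hypothetical heights in cm
n = len(x)

xbar = sum(x) / n                                                # sample mean x̄
s2_def = sum((xi - xbar) ** 2 for xi in x) / (n - 1)             # definition of s²
s2_short = (sum(xi ** 2 for xi in x) - n * xbar ** 2) / (n - 1)  # shortcut form

print(xbar, s2_def, s2_short)  # the two forms of s² agree
```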



Point estimation

Sample standard deviation (stickprovsstandardavvikelse) S or s:

S = √S² = √((1/(n−1))·Σᵢ₌₁ⁿ (Xi − X̄)²),
s = √s² = √((1/(n−1))·Σᵢ₌₁ⁿ (xi − x̄)²)

Note: S is NOT an unbiased point estimator of σ, since

0 < V(S) = E(S²) − [E(S)]² = σ² − [E(S)]².

That is, [E(S)]² < σ² and E(S) < σ.
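The strict inequality E(S) < σ can be made concrete by simulation; a sketch drawing many samples of size n = 5 from N(0, 1), so that σ = 1 (an illustrative choice):

```python
import random
import statistics

random.seed(2)
n, reps = 5, 20000

s_values = [statistics.stdev([random.gauss(0.0, 1.0) for _ in range(n)])
            for _ in range(reps)]

print(statistics.mean(s_values))                 # noticeably below σ = 1
print(statistics.mean(s * s for s in s_values))  # close to σ² = 1: S² is unbiased
```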



Commonly used point estimates/estimators
Population mean µ ≈ µ̂ = x̄ (sample mean)

- µ̂ = x̄ = (1/n)·Σᵢ₌₁ⁿ xi
- M̂ = X̄ = (1/n)·Σᵢ₌₁ⁿ Xi

Population variance σ² ≈ σ̂², population standard deviation σ ≈ σ̂.

- If µ is known (känt):
  - σ̂² = (1/n)·Σᵢ₌₁ⁿ (xi − µ)²
  - Σ̂² = (1/n)·Σᵢ₌₁ⁿ (Xi − µ)²
- If µ is unknown (okänt):
  - σ̂² = s² = (1/(n−1))·Σᵢ₌₁ⁿ (xi − x̄)² (sample variance)
  - Σ̂² = S² = (1/(n−1))·Σᵢ₌₁ⁿ (Xi − X̄)²
- Sample standard deviation s = √s² and S = √S².



10.1: Coefficient of variation (variationskoefficient) σ/µ:

σ/µ ≈ s/x̄.

Practice after the lecture:

Exercises - Lesson 1:

(I) 11.6, 11.8, 11.9, 10.1, 10.4.

(II) 5.7, 5.12, 5.22, 6.1, 5.13, 6.9.

Prepare your questions for the Lessons.

Thank you!



Appendix

Proof of Theorem 2.

Apply Chebyshev's inequality:

P(|X − E(X)| ≥ k) ≤ V(X)/k².

For any given ε > 0,

P(|Θ̂n − θ| > ε) ≤ P(|Θ̂n − θ| ≥ ε) ≤ V(Θ̂n)/ε² → 0 as n → ∞,

since V(Θ̂n) → 0 as n → ∞. So Θ̂n is consistent.



Appendix
Solution to Example 2:

E(Σ̂²) = E(X2²/2 + X6²/2 − X2/4 − X6/4)
      = E(X2²)/2 + E(X6²)/2 − E(X2)/4 − E(X6)/4
      [using E(Z²) = V(Z) + (E(Z))²]
      = (V(X2) + (E(X2))²)/2 + (V(X6) + (E(X6))²)/2 − µ/4 − µ/4
      = (σ² + µ²)/2 + (σ² + µ²)/2 − µ/2
      = σ² + µ² − µ/2 ≠ σ².

No, σ̂² is not unbiased. (Lowercase σ̂² ↔ uppercase Σ̂².)
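The computed bias can be cross-checked by simulation; a sketch with illustrative values µ = 3, σ = 2 and normally distributed data (an assumption — Example 2 only requires E(X) = µ, V(X) = σ²), for which σ² + µ² − µ/2 = 11.5:

```python
import random
import statistics

random.seed(3)
mu, sigma, reps = 3.0, 2.0, 50000

vals = []
for _ in range(reps):
    x2 = random.gauss(mu, sigma)  # plays the role of X2
    x6 = random.gauss(mu, sigma)  # plays the role of X6
    vals.append((x2 ** 2 + x6 ** 2 - (x2 + x6) / 2) / 2)

print(statistics.mean(vals))  # near 11.5 = σ² + µ² − µ/2, not σ² = 4
```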



Appendix
Proof of Theorem 3.

E(M̂) = E(X̄) = (1/n)·Σᵢ₌₁ⁿ E(Xi) = (1/n)·nµ = µ,   since each E(Xi) = µ.

So M̂ is unbiased.

V(M̂) = V(X̄) = (1/n)²·Σᵢ₌₁ⁿ V(Xi) = (1/n²)·nσ² = σ²/n,
since each V(Xi) = σ².

So V(M̂) → 0 as n → ∞. By Theorem 2, M̂ is consistent.

Therefore, M̂ is an unbiased and consistent point estimator of µ.



Appendix
Proof of Theorem 4. We have

Σᵢ₌₁ⁿ (Xi − X̄)² = Σᵢ₌₁ⁿ (Xi² − 2Xi·X̄ + X̄²)
                = Σᵢ₌₁ⁿ Xi² − 2X̄·Σᵢ₌₁ⁿ Xi + n·X̄²   [Σᵢ₌₁ⁿ Xi = n·X̄]
                = Σᵢ₌₁ⁿ Xi² − 2X̄·n·X̄ + n·X̄²
                = Σᵢ₌₁ⁿ Xi² − n·X̄²

and

E(Xi²) = V(Xi) + (E(Xi))² = σ² + µ²,
E(X̄²) = V(X̄) + (E(X̄))² = σ²/n + µ²,

which gives
Appendix

E(S²) = E((1/(n−1))·Σᵢ₌₁ⁿ (Xi − X̄)²)
      = (1/(n−1))·E(Σᵢ₌₁ⁿ Xi² − n·X̄²)
      = (1/(n−1))·(Σᵢ₌₁ⁿ E(Xi²) − n·E(X̄²))
      = (1/(n−1))·(n(σ² + µ²) − σ² − nµ²)
      = (1/(n−1))·(n − 1)σ² = σ².

So S² is an unbiased point estimator of σ².



Thank you!
