Introduction to Statistical Inference
Edwin Leuven
Introduction
- Define key terms that are associated with inferential statistics.
- Revise concepts related to random variables, the sampling distribution and the Central Limit Theorem.
Introduction
Until now we’ve mostly dealt with descriptive statistics and with
probability.
In descriptive statistics one investigates the characteristics of the
data
- using graphical tools and numerical summaries
The frame of reference is the observed data
In probability, the frame of reference is all data sets that could have
potentially emerged from a population
Introduction
The aim of statistical inference is to learn about the population
using the observed data
This involves:
- computing something with the data
  - a statistic: a function of the data
- interpreting the result
  - in probabilistic terms: the sampling distribution of the statistic
Introduction
[Diagram: Probability runs from the Population to the Sample (Data); Inference runs from the Sample back to the Population.]

Population: f_X(x), with parameter E[X] = µ
Sample (Data): (x1, . . . , xn), with statistic x̄ = (x1 + . . . + xn)/n
Point estimation
We want to estimate a population parameter using the observed
data.
- e.g. some measure of variation, an average, min, max, a quantile, etc.
Point estimation attempts to obtain a best guess for the value of
that parameter.
An estimator is a statistic (function of data) that produces such a
guess.
By “best” we usually mean an estimator whose sampling distribution is more concentrated about the population parameter value than those of other estimators.
Hence, the choice of a specific statistic as an estimator depends on
the probabilistic characteristics of this statistic in the context of the
sampling distribution.
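As an illustration (a minimal sketch with a made-up numeric sample x), an estimator in R is simply a function applied to the data:

x = c(4.2, 5.1, 3.8, 4.9, 5.5, 4.4)  # hypothetical sample
mean(x)            # point estimate of the population mean
median(x)          # point estimate of the population median
quantile(x, .25)   # point estimate of the first quartile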
Confidence Interval
We can also quantify the uncertainty (sampling distribution) of our
point estimate.
One way of doing this is by constructing an interval that is likely to
contain the population parameter.
Such an interval, computed on the basis of the data, is called a confidence interval.
The sampling probability that the confidence interval will indeed
contain the parameter value is called the confidence level.
We construct confidence intervals for a given confidence level.
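For illustration, a sketch of a 95% confidence interval for a proportion based on the Normal approximation (the values n = 1,000 and p̂ = 0.54 anticipate the polling example below):

n = 1000
p_hat = 0.54
se = sqrt(p_hat * (1 - p_hat) / n)       # estimated standard error
c(p_hat - 1.96 * se, p_hat + 1.96 * se)  # roughly [0.509, 0.571]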
Hypothesis Testing
The scientific paradigm involves the proposal of new theories that
presumably provide a better description of the laws of Nature.
If the empirical evidence is inconsistent with the predictions of the old theory but not with those of the new theory
- then the old theory is rejected in favor of the new one
- otherwise, the old theory maintains its status
Statistical hypothesis testing is a formal method, built on this paradigm, for determining which of the two hypotheses should prevail.
Statistical hypothesis testing
Each of the two hypotheses, the old and the new, predicts a different
distribution for the empirical measurements.
In order to decide which of the distributions is more in tune with the data, a statistic is computed.
This statistic t is called the test statistic.
A threshold c is set and the old theory is rejected if t > c.
Hypothesis testing consists in asking a binary question about the
sampling distribution of t
Statistical hypothesis testing
This decision rule is not error-proof, since the test statistic may fall by chance on the wrong side of the threshold.
Suppose we know the sampling distribution of the test statistic t
We can then set the probability of making an error to a given level
by setting c
The probability of erroneously rejecting the currently accepted
theory (the old one) is called the significance level of the test.
The threshold is selected in order to assure a small enough
significance level.
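As a sketch (borrowing the polling example from later slides), a test of the old theory p = 0.5 against p > 0.5 at the 5% significance level:

# test statistic: standardized difference from p = 0.5 under the old theory
n = 1000
p_hat = 0.54
t = (p_hat - 0.5) / sqrt(0.5 * (1 - 0.5) / n)
# threshold c chosen so that Pr(reject | old theory true) = 0.05
c = qnorm(0.95)
t > c  # TRUE here: reject the old theory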
Multiple measurements
The method of hypothesis testing is also applied in other practical settings where decisions must be made.
Consider a randomized trial of a new treatment for a medical condition where
- the treated get the new treatment
- the controls get the old treatment
and we measure their response.
We now have 2 measurements that we can compare.
We will use statistical inference to make a decision about whether
the new treatment is better.
Statistics
Statistical inference, be it point estimation, confidence intervals, or hypothesis testing, is based on statistics computed from the data.
A statistic is a formula applied to the data, and we think of it as a statistical summary of the data.
Examples of statistics are
- the sample average and
- the sample standard deviation
For a given dataset a statistic has a single numerical value,
but it will be different for a different random sample!
The statistic is therefore a random variable.
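A quick sketch (with a made-up Normal population) of how the same statistic takes a different value in each random sample:

# three random samples from the same population: three different sample means
mean(rnorm(100, mean = 5, sd = 2))
mean(rnorm(100, mean = 5, sd = 2))
mean(rnorm(100, mean = 5, sd = 2))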
Statistics
It is important to distinguish between
1. the statistic (a random variable)
2. the realisation of the statistic for a given sample (a number)
We therefore denote the statistic with capital letters, e.g. the sample mean:
- X̄ = (X1 + . . . + Xn)/n
and the realisation of the statistic with small letters:
- x̄ = (x1 + . . . + xn)/n
Example: Polling
Imagine we want to predict whether the left bloc or the right bloc will get a majority in parliament.
Key quantities:
- N = 4,166,612, the population size
- p = (# people who support the right) / N
- 1 − p = (# people who support the left) / N
We can ask the following questions:
1. What is p?
2. Is p > 0.5?
3. We estimate p but are we sure?
Example: Polling
We poll a random sample of n = 1,000 people from the population
without replacement:
- choose person 1 at random from the N, choose person 2 at random from the N − 1 remaining, etc.
- or, choose a random set of n people from all (N choose n) = N!/(n!(N − n)!) possible sets

Let
    Xi = 1 if person i supports the right
    Xi = 0 if person i supports the left
and denote our data by x1 , . . . , xn
Then we can estimate p by
p̂ = (x1 + . . . + xn )/n
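A minimal sketch (simulating a hypothetical poll, since in practice p is unknown and we only observe the sample):

x = rbinom(1000, 1, .54)  # assumed true support p = 0.54
p_hat = mean(x)           # equals (x1 + . . . + xn)/n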
Example: Polling
To construct the poll we randomly sampled the population
With a random sample, each of the N people in the population is equally likely to be the ith person sampled, therefore
    E[Xi] = 1 · p + 0 · (1 − p) = p

and therefore

    E[p̂] = E[(X1 + . . . + Xn)/n] = (E[X1] + . . . + E[Xn])/n = p
The “average” value of p̂ is p, and we say that p̂ is unbiased
Unbiasedness refers to the average error over repeated sampling,
and not the error for the observed data!
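A sketch checking unbiasedness by simulation (assuming p = 0.54):

# the average of p_hat over many repeated samples is close to p
p_hats = replicate(10000, mean(rbinom(1000, 1, .54)))
mean(p_hats)  # close to 0.54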
Example: Polling
Say 540 support the right, so p̂ = 0.54
Does this mean that in the population:
- p = 0.54?
- p > 0.5?
The data are a realization of a random sample and p̂ is therefore a
random variable!
For a given sample we will therefore have estimation error

    estimation error = p̂ − p ≠ 0
which comes from the difference between our sample and the
population
Example: Polling
When sampling with replacement the Xi are independent, and
- Var[p̂] = p(1 − p)/n

When sampling without replacement the Xi are not independent:

    Pr(Xi = 1 | Xj = 1) = (N1 − 1)/(N − 1) ≠ N1/(N − 1) = Pr(Xi = 1 | Xj = 0)

where N1 = pN is the number of right supporters, and we can show that
- Var[p̂] = (p(1 − p)/n) · (1 − (n − 1)/(N − 1))
For N = 4,166,612, n = 1,000, and p = 0.54, the standard deviation of p̂ ≈ 0.016.
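A sketch of this calculation:

# s.d. of p_hat when sampling without replacement
N = 4166612; n = 1000; p = 0.54
sqrt(p * (1 - p) / n * (1 - (n - 1) / (N - 1)))  # roughly 0.0158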
But what is the distribution of p̂?
The Sampling Distribution
Statistics vary from sample to sample
The sampling distribution of a statistic
- is the nature of this variability
- can sometimes be determined and often approximated

The distribution of the values we get when computing a statistic in (infinitely) many random samples is called the sampling distribution of that statistic.
The Sampling Distribution
We can sample from
- a population
  - eligible voters in Norway today
- a model (theoretical population)
  - Pr(vote right bloc) = p
The sampling distribution of a statistic depends on the population
distribution of values of the variables that are used to compute the
statistic.
Sampling Distribution of Statistics
Theoretical models describe the distribution of a measurement as a
function of one or more parameters.
For example (see the sketch after this list),
- in n trials with success probability p, the total number of successes follows a Binomial distribution with parameters n and p
- if events happen at rate λ per unit time, then the number of events occurring in a time interval of length t follows a Poisson distribution with parameter λt
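A minimal sketch of these two models in R (with made-up parameter values):

dbinom(5, size = 10, prob = 0.5)  # Pr(5 successes in 10 trials with p = 0.5)
dpois(3, lambda = 2 * 1.5)        # Pr(3 events when the rate is 2 and t = 1.5)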
Sampling Distribution of Statistics
More generally the sampling distribution of a statistic depends on
- the sample size
- the population distribution of the data used to construct the statistic
and can be complicated!

We can sometimes learn about the sampling distribution of a statistic by
- deriving the finite sample distribution
- approximating it with a Normal distribution in large samples
- approximating it through numerical simulation
Finite sample distributions
Sometimes we can derive the finite sample distribution of a statistic
Let the fraction of people voting right in the population be p
Because we know the distribution of the data (up to the unknown
parameter p) we can derive the sampling distribution
In a random sample of size n the probability of observing k people voting for the right can be derived and follows a binomial distribution:

    Pr(X = k) = (n choose k) · p^k · (1 − p)^(n−k)
This depends on p which is unknown.
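A sketch (with assumed values n = 1,000 and p = 0.54) evaluating this distribution in R:

n = 1000; p = 0.54
dbinom(540, size = n, prob = p)  # Pr(X = 540), i.e. Pr(p_hat = 0.54)
sum(dbinom(500:580, n, p))       # Pr(500 <= X <= 580)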
This approach is however often not feasible:
the statistic may be complicated, depend on several variables, and the population distribution of these variables may be unknown.
Theoretical Distributions of Observations (Models)
Distribution    Sample Space      f(x)
Binomial        0, 1, . . . , n   (n choose k) p^k (1 − p)^(n−k)
Poisson         0, 1, 2, . . .    λ^k exp(−λ)/k!
Uniform         [a, b]            1/(b − a)
Exponential     [0, ∞)            λ exp(−λx)
Normal          (−∞, ∞)           (1/√(2πσ²)) exp(−((x − µ)/σ)²/2)

Distribution    E[X]         Var(X)        R
Binomial        np           np(1 − p)     d,p,q,rbinom
Poisson         λ            λ             d,p,q,rpois
Uniform         (a + b)/2    (b − a)²/12   d,p,q,runif
Exponential     1/λ          1/λ²          d,p,q,rexp
Normal          µ            σ²            d,p,q,rnorm
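A sketch of the four R prefixes (d = density, p = CDF, q = quantile, r = random draws) for, e.g., the Normal distribution:

dnorm(0)      # density of N(0, 1) at 0
pnorm(1.96)   # Pr(Z <= 1.96), about 0.975
qnorm(0.975)  # 97.5% quantile, about 1.96
rnorm(5)      # 5 random draws from N(0, 1)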
Example: Polling
hist(replicate(10000, mean(rbinom(1000, 1, .54))),
     main = "", xlab = "p_hat", prob = TRUE, breaks = 50)

[Histogram: the simulated sampling distribution of p̂ (density scale), roughly bell-shaped and centered near 0.54, ranging from about 0.48 to 0.60.]
The Normal Approximation
In general, the sampling distribution of a statistic is not the same as
the sampling distribution of the measurements from which it is
computed.
If the statistic is
1. (a function of) a sample average and
2. the sample is large
then we can often approximate the sampling distribution with a
Normal distribution
Example: Polling
In the graph p̂ looked like it had a Normal distribution with mean 0.54 and s.d. 0.016
If N ≫ n then the Xi are approximately independent, and if n is large then

    √n (p̂ − p) ∼ N(0, p(1 − p))

or equivalently

    p̂ ∼ N(p, p(1 − p)/n)
by the Central Limit Theorem
Example: Polling
curve(dnorm(x, mean = .54, sd = 0.016),
      col = "darkblue", lwd = 2, add = TRUE, yaxt = "n")

[Plot: the N(0.54, 0.016²) density overlaid on the histogram of p̂; the curve tracks the simulated sampling distribution closely.]
Approximation through numerical simulation
Computerized simulations can be carried out to approximate
sampling distributions.
With a model we can draw many random samples, compute the statistic, and characterize its sampling distribution.
Assume price ∼ Exponential(λ) and consider samples of size n = 201.

Then E[price] = 1/λ and Var[price] = 1/λ², and therefore the standard deviation of the average price is

    √((1/λ²)/201) ≈ 0.0705/λ
Approximation through numerical simulation
Remember that 95% of the probability density of a normal
distribution is within 1.96 s.d. of its mean.
The Normal approximation for the sampling distribution of the
average price suggests
    1/λ ± 1.96/(λ√n)

should contain 95% of the distribution.
Approximation through numerical simulation
We may use simulations in order to validate this approximation
Assume λ = 1/12,000

# fraction of simulated sample means within 1.96 s.d. of the mean
X.bar = replicate(10^5, mean(rexp(201, 1/12000)))
mean(abs(X.bar - 12000) <= 1.96 * 0.0705 * 12000)
## [1] 0.95173
This shows that the Normal approximation is adequate in this example.
How about other values of n or λ?
Approximation through numerical simulation
Simulations may also be used in order to compute probabilities in
cases where the Normal approximation does not hold.
Consider the following statistic
    (min(xi) + max(xi))/2
where Xi ∼ Uniform(3, 7) and n = 100
What interval contains 95% of its sampling distribution?
Approximation through numerical simulation
Let us carry out the simulation that produces an approximation of
the central region that contains 95% of the sampling distribution of
the mid-range statistic for the Uniform distribution:
# simulate the sampling distribution of the mid-range statistic
mid.range <- rep(0, 10^5)
for (i in 1:10^5) {
  X <- runif(100, 3, 7)
  mid.range[i] <- (max(X) + min(X)) / 2
}
quantile(mid.range, c(0.025, 0.975))
##      2.5%     97.5%
## 4.9409107 5.0591218
Observe that (approximately) 95% of the sampling distribution of the statistic is in the range [4.9409, 5.0591].
Approximation through numerical simulation
Simulations can be used in order to compute any numerical
summary of the sampling distribution of a statistic
To obtain the expectation and the standard deviation of the
mid-range statistic of a sample of 100 observations from the
Uniform(3, 7) distribution:
mean(mid.range)
## [1] 4.9998949
sd(mid.range)
## [1] 0.027876151
Approximation through numerical simulation
Simulations can also be used to approximate sampling distributions using only the data:
1. draw a random sample of size n with replacement from our data
2. compute our statistic
3. do 1. & 2. many times
The resulting distribution of statistics across our resamples is an
approximation of the sampling distribution of our statistic
The idea is that a random sample of a random sample from the population is again a random sample from the population.
This is called the bootstrap and computes the sampling distribution
without a model!
Approximation through numerical simulation
n = 1000
data = rbinom(n, 1, .54)  # true distribution, usually unknown

estimates = rep(0, 999)
for (i in 1:999) {
  id = sample(1:n, n, replace = TRUE)  # resample the data with replacement
  estimates[i] = mean(data[id])        # recompute the statistic
}
sd(estimates)  # bootstrap estimate of the s.d. of p_hat
## [1] 0.015946413

sqrt(.54 * (1 - .54) / 1000)  # true value, usually unknown
## [1] 0.015760711
Summary
Today we looked at the elements of statistical inference
- Estimation:
  - Determining the distribution, or some characteristic of it. (What is our best guess for p?)
- Confidence intervals:
  - Quantifying the uncertainty of our estimate. (What is a range of values to which we're reasonably sure p belongs?)
- Hypothesis testing:
  - Asking a binary question about the distribution. (Is p > 0.5?)
Summary
In statistical inference we think of data as a realization of a random
process
There are many reasons why we think of our data as (ex-ante)
random:
1. We introduced randomness in our data collection (random
sampling, or random assigning treatment)
2. We are actually studying a random phenomenon (coin tosses or
dice rolls)
3. We treat as random the part of our data that we don’t
understand (errors in measurements)
In the coming weeks we will take a closer look at how this randomness affects what we can learn about the population from the data.