Lecture 24

The document provides details about the final exam for the STA732 Statistical Inference course at Duke University, including logistics, preparation advice, and main topics covered. It emphasizes key concepts in statistical inference such as point estimation, hypothesis testing, and interval estimation, along with optimality types and important technical details. Additionally, it includes exercises and extended readings for further study.

STA732

Statistical Inference
Final Review

Yuansi Chen
Spring 2022
Duke University

https://www2.stat.duke.edu/courses/Spring22/sta732.01/

Exam logistics
Final Exam

• Time: April 27, 7pm-10pm?


• Location: Old Chem 025
• Logistics:
• Format:
• Similar to midterm, but with 5 problems
• Emphasize a bit more on the second half
• Closed-book exam
• No electronic devices
• I will print two help sheets:
• Formula sheet with common distributions (like the one in the first-year exam)
• Theorem sheet with 10 theorems copied from the textbook (see the post on Ed)

Advice for preparation

• Review the homework, the midterm, and the sample midterm


• Go back to required readings of each lecture
• It can be helpful to recite
• Main concepts, definitions
• Main proof techniques
• Try to solve exercises in Keener, Lehmann and Casella, or Lehmann and Romano

Office hours

• Yuansi: Mon/Wed 3:30-4:45pm, 223B


• Youngsoo: to be announced on Sakai
• Ed Discussion: we will reply within 24 hours

Overview of the main topics
Three main types of statistical inference problems

• Point estimation
• Hypothesis testing
• Interval estimation

Types of optimality in point estimation

With model family 𝑃𝜃 , 𝜃 ∈ Ω, loss 𝐿 and risk 𝑅, we can consider an optimal estimator in the sense of

• Uniform minimum risk


• Restrict to a smaller class
• Uniform minimum variance unbiased (UMVU) (Lec 4-6)
• Minimum risk equivariant (MRE) (Lec 7-8)
• Global approaches
• Bayes estimator (Lec 9-11)
• Minimax estimator (Lec 12-13)
• Large sample: asymptotic efficiency of MLE (Lec 14-15)

Types of optimality in hypothesis testing

With model family 𝑃𝜃 , 𝜃 ∈ Ω, null and alternative hypotheses, and the Neyman-Pearson paradigm, we can consider an optimal test

• Uniformly most powerful (UMP) (Lec 16-19)


• Restrict to a smaller class
• Uniformly most powerful unbiased (UMPU) (Lec 20-22)
• Large sample: asymptotic of likelihood ratio tests (LRT) (Lec
23)

Basic toy model families

• Exponential families: Bernoulli, Binomial, Poisson, Exponential, Chi-square, Normal, Gamma, Beta, Multinomial, etc.
• Linear model or Generalized linear model

Important concepts (not exhaustive)

• Loss and risk


• Sufficiency, completeness
• Bias, Variance, Fisher information
• Equivariance
• Bayes estimator, empirical Bayes and hierarchical Bayes,
shrinkage
• Admissibility
• Minimax estimator, least favorable prior
• Convergence in probability, convergence in distribution, MLE
• Level, power, power function in hypothesis testing
• Neyman-Pearson paradigm
• Simple vs composite, one-sided vs two-sided tests
• Unbiased test, 𝛼-similar
• Conditional test, Neyman structure
• Canonical linear model, general linear model
Important technical details (not exhaustive)

What are the typical ways to


• prove sufficiency?
• prove completeness?
• show independence?
• show an estimator is UMVU?
• find MRE?
• show an estimator is Bayes or admissible or minimax?
• manipulate functions of converging random variables? derive
asymptotic distribution?
• derive the form of a UMP test?
• prove a test is UMP in a single-parameter model?
• prove a test is UMPU in a single-parameter model? With nuisance parameters?
• construct optimal tests in general linear model?
• ...
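As a quick recitation example (my addition, not on the slides): the first question, proving sufficiency, is typically handled with the factorization theorem. For i.i.d. Poisson(𝜆) data the joint pmf factorizes as

```latex
p_\lambda(x_1,\dots,x_n)
  = \prod_{i=1}^{n} \frac{\lambda^{x_i} e^{-\lambda}}{x_i!}
  = \underbrace{\lambda^{\sum_i x_i}\, e^{-n\lambda}}_{g_\lambda(T(x))}
    \cdot
    \underbrace{\prod_{i=1}^{n} \frac{1}{x_i!}}_{h(x)},
  \qquad T(x) = \sum_{i=1}^{n} x_i ,
```

so 𝑇(𝑋) = ∑ 𝑋𝑖 is sufficient; recognizing the family as a full-rank exponential family then also gives completeness.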
If you have mastered the above materials, congratulations!
Your background in theoretical statistics distinguishes you from 99% of PhDs in other fields!

Possible extended readings (in theoretical statistical inference)

• Asymptotics:
• Asymptotic Statistics, van der Vaart 1998
• Asymptotics in Statistics, Le Cam 2000
• Large sample non-asymptotics:
• Empirical Processes in M-Estimation, van de Geer 2000
• High-Dimensional Statistics: A Non-Asymptotic Viewpoint,
Wainwright 2017
• Semi-parametric models:
• Efficient and Adaptive Estimation for Semiparametric Models, Bickel, Klaassen, Ritov, Wellner 1998

Some research ideas

General question: can we derive an optimal estimator (uniform or minimax) if we restrict to estimators that can be computed in polynomial time (in 𝑛 and 𝑑)?
See workshops at Simons Institute
• Computational Complexity of Statistical Inference Boot Camp,
2021
• Rigorous Evidence for Information-Computation Trade-offs,
2021

Exercises
Chapter 8, Ex 26 in Keener:
Let 𝑋1 , … , 𝑋𝑛 be i.i.d. Poisson with mean 𝜆, and consider
estimating
𝑔(𝜆) = 𝑃𝜆 (𝑋𝑖 = 1) = 𝜆 𝑒^{−𝜆}
One natural estimator might be the proportion of ones in the sample:
𝑝𝑛̂ = (1/𝑛) # {𝑖 ≤ 𝑛 ∶ 𝑋𝑖 = 1} .
Another choice would be the maximum likelihood estimator,
𝑔 (𝑋̄ 𝑛 ), with 𝑋̄ 𝑛 the sample average.
1. Find the asymptotic relative efficiency of 𝑝𝑛̂ with respect to
𝑔 (𝑋̄ 𝑛 ).
2. Determine the limiting distribution of 𝑛 [𝑔 (𝑋̄ 𝑛 ) − 1/𝑒] when 𝜆 = 1. See Page 24 in Lecture 14.
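Not on the original slides, but part 1 can be sanity-checked numerically. The sketch below assumes the usual convention ARE(𝑝𝑛̂ w.r.t. MLE) = avar(MLE)/avar(𝑝𝑛̂); the choice 𝜆 = 2 and the sample sizes are arbitrary.

```python
import numpy as np

# Monte Carlo check for Keener Ch. 8, Ex. 26 (a sketch, not the official solution).
# g(lam) = lam * exp(-lam), g'(lam) = (1 - lam) * exp(-lam).
# Delta method:        n * Var[g(Xbar_n)] -> lam * g'(lam)^2
# CLT for 1{X_i = 1}:  n * Var[p_hat_n]   -> g(lam) * (1 - g(lam))

lam, n, reps = 2.0, 400, 10_000
rng = np.random.default_rng(0)

g = lam * np.exp(-lam)
are_theory = (lam * ((1 - lam) * np.exp(-lam)) ** 2) / (g * (1 - g))

X = rng.poisson(lam, size=(reps, n))
p_hat = (X == 1).mean(axis=1)                    # proportion of ones
xbar = X.mean(axis=1)
mle = xbar * np.exp(-xbar)                       # plug-in MLE g(Xbar_n)

are_sim = mle.var() / p_hat.var()                # empirical variance ratio
print(are_theory, are_sim)                       # should agree up to MC error
```

At 𝜆 = 2 the theoretical ratio is well below 1, reflecting the efficiency loss of the proportion estimator relative to the MLE.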
Chapter 14, Ex 10 in Keener:
Consider a regression version of the two-sample problem in which


{𝛽1 + 𝛽2 𝑥𝑖 + 𝜖𝑖 , 𝑖 = 1, … , 𝑛1
𝑌𝑖 = ⎨
{
⎩𝛽3 + 𝛽4 𝑥𝑖 + 𝜖𝑖 , 𝑖 = 𝑛1 + 1, … , 𝑛1 + 𝑛2 = 𝑛,

with 𝜖1 , … , 𝜖𝑛 i.i.d. from 𝑁 (0, 𝜎²). Derive a 1 − 𝛼 confidence interval for 𝛽4 − 𝛽2 , the difference between the two regression slopes.

Chapter 9, Ex 7 in Keener
Let 𝑋1 , 𝑋2 , … be i.i.d. from a uniform distribution on (0, 1) and let 𝑇𝑛 maximize

(1/𝑡) ∑_{𝑖=1}^{𝑛} log (1 + 𝑡² 𝑋𝑖 )

over 𝑡 > 0.
1. Show that 𝑇𝑛 → 𝑐 in probability as 𝑛 → ∞, identifying the constant 𝑐.
2. Find the limiting distribution of √𝑛 (𝑇𝑛 − 𝑐) as 𝑛 → ∞.
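A numerical sketch for part 1 (my addition; it assumes the objective reads 𝑓𝑛(𝑡) = (1/𝑡) ∑ log(1 + 𝑡² 𝑋𝑖)). For 𝑋 ~ U(0, 1), integrating log(1 + 𝑡²𝑥) in 𝑥 gives the closed-form population objective ℎ(𝑡) = ((1 + 𝑡²) log(1 + 𝑡²) − 𝑡²)/𝑡³, and 𝑐 = argmax ℎ(𝑡). A grid search locates both maximizers:

```python
import numpy as np

# Locate the population maximizer c of h(t) and the sample maximizer T_n of
# f_n(t) = (1/t) * mean(log(1 + t^2 X_i)) on a common grid; consistency says
# T_n should sit close to c for large n.

rng = np.random.default_rng(2)
grid = np.linspace(0.05, 8.0, 1600)

h = ((1 + grid**2) * np.log1p(grid**2) - grid**2) / grid**3
c = grid[h.argmax()]                            # population maximizer

X = rng.uniform(size=50_000)
fvals = np.array([np.log1p(t * t * X).mean() / t for t in grid])
T_n = grid[fvals.argmax()]                      # sample maximizer

print(c, T_n)                                   # T_n close to c (part 1)
```

Note that without the 1/𝑡 factor the objective would be increasing in 𝑡 with no interior maximizer, which is one reason to read the garbled display as a ratio.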

Thank you for attending
See you on April 27 in Old Chem 025
