Lecture 7 – MEEN 260, Dr. R. Tafreshi
Lecture 7: Statistical Data Analysis – Student’s t distribution
Lecture Outline:
Normal Probability Distributions
To understand the difference between two-sided and one-sided confidence intervals
To understand where to use Student’s t distribution
o Confidence Interval for Small Samples
Comparing Two Sample Sets
Examples
Announcements / Reminders / Discussions:
Conference Papers…
Reading Assignments:
Chapter 3 of textbook by Beckwith
o Especially the additional problem-solving examples in the textbook:
Examples 3.4 – 3.10
Next Lecture: Propagation of Uncertainty
Chi-Squared Distribution
Review:
Note: The textbook (and many statistics textbooks) use the notation $z_{c/2}$, where $c$ is the confidence level
Example: for $c = 0.95$, $z_{c/2} = z_{0.475}$
This refers to the value of $z$ when:
$$\int_{0}^{z_{c/2}} p(z)\,dz = c/2 = 0.95/2 = 0.475$$
$$P(-z_{c/2} \le x \le z_{c/2}) = \int_{-z_{c/2}}^{z_{c/2}} p(x)\,dx$$
– ±1σ = 68.3%
– ±1.96σ = 95.0%
– ±2σ = 95.5%
– ±2.575σ = 99.0%
– ±3σ = 99.73%
– ±6σ = 99.9999998%
From: http://www.answers.com/topic/normal-distribution
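The percentages above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name `coverage_within` is an illustrative choice, not something from the textbook.

```python
from statistics import NormalDist

def coverage_within(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Multiples of sigma listed in the notes
for k in (1, 1.96, 2, 2.575, 3, 6):
    print(f"+/-{k} sigma covers {100 * coverage_within(k):.7f}% of the population")
```

Because the normal distribution is symmetric, the same numbers follow from doubling the area between 0 and z, which is the quantity the z-table lists.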
Two-Sided vs. One-Sided confidence interval
“I am 95% sure the average students’ age at TAMUQ is 20.5 ± 2.5 yrs”
Two-sided confidence interval (18 to 23 yrs):
$$\bar{x} - z_{c/2}\frac{s_x}{\sqrt{n}} \le \mu \le \bar{x} + z_{c/2}\frac{s_x}{\sqrt{n}}$$
“I am 97.5% sure the average students’ age at TAMUQ is at least 18 yrs”
One-sided confidence interval (at least 18 yrs):
$$\bar{x} - z_{\alpha}\frac{s_x}{\sqrt{n}} \le \mu$$
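The two statements about student age share the same lower limit (18 yrs) because a two-sided 95% interval leaves 2.5% in each tail, while a one-sided 97.5% interval leaves 2.5% in a single tail. A short sketch of this, assuming a standard normal variable and using illustrative function names:

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def z_two_sided(confidence: float) -> float:
    """z multiplier when (1 - confidence) is split equally between both tails."""
    return std_normal.inv_cdf(1 - (1 - confidence) / 2)

def z_one_sided(confidence: float) -> float:
    """z multiplier when all of (1 - confidence) sits in one tail."""
    return std_normal.inv_cdf(confidence)

print(f"two-sided 95%:   z = {z_two_sided(0.95):.3f}")
print(f"one-sided 97.5%: z = {z_one_sided(0.975):.3f}")
```

Both calls return the same multiplier, which is why the one-sided statement can claim a higher confidence for the same limit.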
Samples
A sample should be representative of the complete set (the population) of values
from which it is drawn
The larger the random sample the more representative of the population it is likely
to be
What if we can’t afford to take ~100 samples?
o Time
o $$$
o Opportunity
For small samples (n < 30), our sampling assumptions are no longer valid
Solution: “Student’s t” distribution
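A quick simulation illustrates why the z-based interval is unreliable for small samples. This is only a sketch under stated assumptions: a standard normal population, 4000 repeated experiments, a fixed random seed, and a hypothetical helper name `z_interval_coverage`. It counts how often a nominal 95% z-interval built from the sample standard deviation actually contains the true mean.

```python
import random
from statistics import NormalDist, mean, stdev

def z_interval_coverage(n, trials=4000, confidence=0.95, seed=0):
    """Fraction of z-based intervals (using the sample std. dev.) that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    true_mean, true_sd = 0.0, 1.0  # assumed population parameters
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        half_width = z * stdev(sample) / n ** 0.5
        if abs(mean(sample) - true_mean) <= half_width:
            hits += 1
    return hits / trials

for n in (5, 30):
    print(f"n = {n:2d}: observed coverage = {z_interval_coverage(n):.3f} (nominal 0.950)")
```

With only a handful of readings the observed coverage falls noticeably short of the nominal 95%, and it moves closer to 95% as n grows.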
Student’s t distribution
In the early 1900s, the Guinness brewery hired top college graduates in statistics and
biochemistry to improve its manufacturing process.
At Guinness, William Gosset developed a way to test the quality of the product with
only a few samples, but because of his employer’s desire for secrecy, he published his
work under the name of “Student”.
The t-test assumes that the underlying population follows a Gaussian distribution,
but accounts for the fact that when using $S_x \approx \sigma$ with only a few samples, we
tend to underestimate the standard deviation.
Thus the values in the t-table are greater than those in the z-table.
The t-statistic depends not only on the chosen confidence level, but also on the
number of samples taken. The “degrees of freedom” of the t-statistic is ν = n − 1.
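To see how the t-values compare with the z-value, the critical values can be approximated numerically. The sketch below integrates the Student's t density with Simpson's rule and finds the critical value by bisection. It is a rough stand-in for the printed table, not the textbook's method; the function names, step count, and search bound of 100 are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def t_pdf(t, nu):
    """Student's t probability density with nu degrees of freedom."""
    coeff = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coeff * (1 + t * t / nu) ** (-(nu + 1) / 2)

def t_upper_tail(t, nu, steps=1000):
    """P(T > t) for t >= 0: 0.5 minus the Simpson integral of the density over [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(t, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 - total * h / 3

def t_critical(alpha, nu, upper=100.0):
    """Value t such that P(T > t) = alpha, found by bisection on [0, upper]."""
    lo, hi = 0.0, upper
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, nu) > alpha:
            lo = mid  # tail still too large, move right
        else:
            hi = mid
    return (lo + hi) / 2

z = NormalDist().inv_cdf(0.975)
for nu in (2, 5, 8, 21, 30):
    print(f"nu = {nu:2d}: t(0.025, nu) = {t_critical(0.025, nu):.3f}   (z = {z:.3f})")
```

The t-value is always larger than z for the same tail area and approaches z as the degrees of freedom increase.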
Confidence Interval for Small Samples
Student’s t distribution for calculating confidence intervals
$$\bar{x} - t_{\alpha/2,\nu}\frac{S_x}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\nu}\frac{S_x}{\sqrt{n}}$$
o Where $t_{\alpha/2,\nu}$ is the t-statistic that we get from the tables
o What is ν?
The degrees of freedom equals sample size minus 1:
ν = n −1
o What is α?
The significance level, the complement of the confidence level:
α = 1 – c
Example – Prob. 3.22 of textbook
The manufacturer of inexpensive outdoor thermometers checks a sample of nine against a
68 °F standard. The following results were obtained:
68.5 67.5 67 69 68 67 67.5 69 69
Calculate the range within which the population mean is expected to lie with a
confidence level of 95%
Solution:
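Given: $\bar{x} = 68.06$, $S_x = 0.846$, $n = 9$, $\alpha = 0.05$
From table …
Confidence Interval: $\bar{x} - t_{\alpha/2,\nu}\frac{S_x}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\nu}\frac{S_x}{\sqrt{n}}$
… $67.41 \le \mu \le 68.71$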
Table 3.5 of textbook: Student's t-Distribution (Values of $t_{\alpha,\nu}$)
Row for $\nu = 30$: 1.301 1.697 2.042 2.75
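The two-sided interval for the thermometer readings can be reproduced in a few lines. This sketch assumes the standard table value $t_{0.025,8} = 2.306$ (for $\nu = 9 - 1 = 8$ and $\alpha/2 = 0.025$); the function name `t_interval` is illustrative.

```python
from math import sqrt
from statistics import mean, stdev

readings = [68.5, 67.5, 67, 69, 68, 67, 67.5, 69, 69]  # data from the problem statement
T_CRIT = 2.306  # assumed standard table value t_{0.025, 8}

def t_interval(data, t_value):
    """Two-sided confidence interval for the population mean from a small sample."""
    n = len(data)
    x_bar = mean(data)
    s_x = stdev(data)  # sample standard deviation (divides by n - 1)
    half_width = t_value * s_x / sqrt(n)
    return x_bar - half_width, x_bar + half_width

low, high = t_interval(readings, T_CRIT)
print(f"mean = {mean(readings):.2f}, S_x = {stdev(readings):.3f}")
print(f"{low:.2f} <= mu <= {high:.2f}")
```

The computed mean, standard deviation, and limits agree with the values quoted in the solution.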
Example – Prob. 3.22 (modified) of textbook
The manufacturer of inexpensive outdoor thermometers checks a sample of nine against a
68 °F standard. The following results were obtained:
68.5 67.5 67 69 68 67 67.5 69 69
Calculate the lower limit above which the population mean is expected to lie with a
confidence level of 99%
Solution:
Given: $\bar{x} = 68.06$, $s_x = 0.846$, $n = 9$, $\alpha = 0.01$
From table …
Confidence Interval: $\mu \ge \bar{x} - t_{\alpha,\nu}\frac{s_x}{\sqrt{n}} = 67.24$
See page 57 of your textbook for more details about one-sided vs. two-sided
confidence intervals
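For the one-sided case, the whole significance level sits in one tail, so the table is entered at $\alpha$ rather than $\alpha/2$. A minimal sketch for the modified problem, assuming the standard table value $t_{0.01,8} = 2.896$ and an illustrative function name `lower_limit`:

```python
from math import sqrt
from statistics import mean, stdev

readings = [68.5, 67.5, 67, 69, 68, 67, 67.5, 69, 69]  # data from the problem statement
T_ONE_SIDED = 2.896  # assumed standard table value t_{0.01, 8}

def lower_limit(data, t_value):
    """One-sided lower confidence limit for the population mean."""
    return mean(data) - t_value * stdev(data) / sqrt(len(data))

limit = lower_limit(readings, T_ONE_SIDED)
print(f"mu >= {limit:.2f} with 99% confidence")
```

The same data gave a two-sided lower limit of 67.41 at 95% confidence; asking for 99% confidence pushes the one-sided limit down to 67.24.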
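Summary of distributions:
Confidence interval of all possible measurements
$$\mu - z_{c/2}\,\sigma \le x \le \mu + z_{c/2}\,\sigma; \quad z = \frac{x - \mu}{\sigma}$$
Confidence interval containing the “true value” (population mean), large samples
$$\bar{x} - z_{c/2}\frac{s_x}{\sqrt{n}} \le \mu \le \bar{x} + z_{c/2}\frac{s_x}{\sqrt{n}}; \quad z = \frac{\bar{x} - \mu}{s_x/\sqrt{n}}$$
Confidence interval containing the “true value” (population mean), small samples ($n < 30$)
$$\bar{x} - t_{\alpha/2,\nu}\frac{s_x}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\nu}\frac{s_x}{\sqrt{n}}; \quad t = \frac{\bar{x} - \mu}{s_x/\sqrt{n}}$$
One-sided confidence intervals,
o e.g. $\bar{x} - z_{\alpha}\frac{s_x}{\sqrt{n}}$ ; $\bar{x} - t_{\alpha,\nu}\frac{s_x}{\sqrt{n}}$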
Comparing Two Sample Sets
How do we compare two sets of samples to see if they are statistically different?
Examples:
o Compare the quality of manufactured goods between day-shift workers and
night-shift workers
o Compare the quality of products between two different companies
Assume:
o $\bar{x}_1, \bar{x}_2$: the sample means
o $s_1, s_2$: the sample standard deviations
o $n_1, n_2$: the number of samples in each set
Use a t-test where:
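$$t_{\bar{x}_1-\bar{x}_2} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{(s_1)^2}{n_1} + \dfrac{(s_2)^2}{n_2}}}; \quad \nu = \frac{\left[\dfrac{(s_1)^2}{n_1} + \dfrac{(s_2)^2}{n_2}\right]^2}{\dfrac{\left[(s_1)^2/n_1\right]^2}{n_1 - 1} + \dfrac{\left[(s_2)^2/n_2\right]^2}{n_2 - 1}}$$
(ν is rounded down to the nearest integer)
To verify if the sets are significantly different (statistically):
If $|t_{\bar{x}_1-\bar{x}_2}| > t_{\alpha/2,\nu}$, they are different with a confidence level $(1-\alpha)$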
Example: assume two manufacturing machines whose products have
$\bar{x}_1 = 1076.75$, $(s_1)^2 = 29.30$, $n_1 = 12$
$\bar{x}_2 = 1072.33$, $(s_2)^2 = 26.24$, $n_2 = 12$
$$t_{\bar{x}_1-\bar{x}_2} = t_{exp} = \frac{1076.75 - 1072.33}{\sqrt{\dfrac{29.30}{12} + \dfrac{26.24}{12}}} = 2.055$$
$$\nu = \frac{\left[\dfrac{29.30}{12} + \dfrac{26.24}{12}\right]^2}{\dfrac{(29.30/12)^2}{11} + \dfrac{(26.24/12)^2}{11}} = 21.9$$
Compare $t_{exp}$ and $t_{\alpha/2,\nu}$ for α = 0.10:
$t_{exp} = 2.055 > t_{0.10/2,21} = 1.721$
o Statistically different at the 10% significance level
(We are 90% sure they are different)
But for α = 0.05:
$t_{exp} = 2.055 < t_{0.05/2,21} = 2.080$
o Not statistically different at the 5% significance level
(We cannot be 95% sure that they are different)
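The two-machine comparison can be checked with a short script. This is a sketch, not a general-purpose test routine: it takes the sample variances $(s_1)^2$ and $(s_2)^2$ directly, uses the two table values quoted above for $\nu = 21$, and the function name `two_sample_t` is an illustrative choice.

```python
from math import floor, sqrt

def two_sample_t(mean1, var1, n1, mean2, var2, n2):
    """t-statistic and degrees of freedom (rounded down) for two samples with unequal variances."""
    a, b = var1 / n1, var2 / n2  # variance of each sample mean
    t = abs(mean1 - mean2) / sqrt(a + b)
    nu = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, floor(nu)

t_exp, nu = two_sample_t(1076.75, 29.30, 12, 1072.33, 26.24, 12)
print(f"t_exp = {t_exp:.3f}, nu = {nu}")

# Table values t_{alpha/2, 21} as quoted in the example
T_TABLE = {0.10: 1.721, 0.05: 2.080}
for alpha, t_crit in T_TABLE.items():
    verdict = "different" if t_exp > t_crit else "not shown to be different"
    print(f"alpha = {alpha:.2f}: t_crit = {t_crit:.3f} -> {verdict}")
```

The same pair of samples passes the test at α = 0.10 but fails it at α = 0.05, which shows that the conclusion depends on the significance level chosen before the comparison.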
Research Example:
“Predicting Epileptic Seizures in Scalp EEG Based on a Variational Bayesian Gaussian Mixture Model of Zero-Crossing
Intervals”, Ali Shahidi Zandi, Reza Tafreshi, Manouchehr Javidan, and Guy A. Dumont, IEEE TBME Journal, 2013.
* Some material based on original slides by Drs. Reza Langari and Bryan Rasmussen, TAMU MEEN.
Table 3.2