LFS PROJECT I – ANSWERS FOR MASTER SAMPLE.XLS
Graphical Descriptive Techniques
Q1
Pg 27 K&W 5th ed
Distribution of Ages
Bin   Freq.
10    0
20    3
30    23
40    34
50    25
60    8
70    6
80    1
90    0
100   0
More  0
Max   34
[Histogram: Frequency (vertical axis, 0 to 40) against Bin (horizontal axis, 10 to 100 and More)]
The maximum frequency value is 34: there are 34 observations whose age is between 30 (exclusive)
and 40 (inclusive).
WebCT 1 : 34
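The bin convention used above (lower limit exclusive, upper limit inclusive) can be sketched in Python. This is only an illustration: the ages in `example_ages` are hypothetical values chosen to show the boundary behaviour, not the project data, and the function names are made up for this sketch.

```python
from bisect import bisect_left
from collections import Counter

# Bin upper limits as listed in the frequency table; anything above 100 goes to "More".
BIN_UPPERS = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

def bin_label(age):
    """Return the label of the bin that `age` falls into (upper limit inclusive)."""
    i = bisect_left(BIN_UPPERS, age)  # index of the first upper limit >= age
    return BIN_UPPERS[i] if i < len(BIN_UPPERS) else "More"

def bin_frequencies(ages):
    """Count how many ages fall into each bin, including empty bins."""
    counts = Counter(bin_label(a) for a in ages)
    return {label: counts.get(label, 0) for label in BIN_UPPERS + ["More"]}

# Hypothetical ages, not taken from the sample.
example_ages = [30, 31, 40, 41, 105]
freq = bin_frequencies(example_ages)
print(freq[30], freq[40], freq[50], freq["More"])
```

An age of exactly 30 is counted in bin 30, while 31 and 40 are both counted in bin 40.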
Q2
Pg 46 K&W 5th ed
Population group     Count  Share
African/Black (1)    79     79%
Coloured (2)         10     10%
Indian/Asian (3)     2      2%
White (4)            9      9%
Sum                  100
[Pie chart: share of each population group]
The largest population group represented in the pie chart is the African/Black group, with 79%.
WebCT 2 : 79
Numerical Descriptive Measures
Q3, Q4, Q5
Pg 97 K&W 5th ed
Tools… > Data Analysis… > Descriptive Statistics…
yearsed
Mean 7.46
Standard Error 0.412291
Median 8
Mode 10
Standard Deviation 4.12291
Sample Variance 16.99838
Kurtosis -0.77938
Skewness -0.48005
Range 15
Minimum 0
Maximum 15
Sum 746
Count 100
WebCT 3: 7.46
WebCT 4: 8
WebCT 5: 15
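As a quick consistency check, several figures in the Descriptive Statistics output can be derived from one another. The sketch below uses only values copied from the table; the variable names are chosen for illustration.

```python
import math

# Values copied from the Descriptive Statistics output for yearsed.
total, count = 746, 100
std_dev = 4.12291
minimum, maximum = 0, 15

mean = total / count                     # Mean = Sum / Count
std_error = std_dev / math.sqrt(count)   # Standard Error = Standard Deviation / sqrt(Count)
variance = std_dev ** 2                  # Sample Variance = Standard Deviation squared
value_range = maximum - minimum          # Range = Maximum - Minimum

print(mean, round(std_error, 6), round(variance, 4), value_range)
```

The results agree with the Mean, Standard Error, Sample Variance and Range rows of the table, up to rounding.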
Q6
Pg 130 K&W 5th ed
Correlation Matrix:
annualsal yearsed
annualsal 1
yearsed 0.452155 1
The coefficient of correlation between years of education (yearsed) and annual salary (annualsal)
is 0.452155, which indicates a positive relationship between the two variables.
WebCT 6 : 0.4522
Probability (Pivot Tables)
EXCEL 2007
Insert … Pivot Table
Drag Gender to left column, drag Type to top row, drag Type to body of table. Change
the body of the table from Sum to Count.
EXCEL 2003
Data … Pivot Table and Pivot Chart Wizard
Step 2: Select your data … Next
Step 3: Select Layout (bottom left) …
Drag Gender to left column, drag Type to top row and drag Type (from the full list of
headings) to body of table. Change the body of the table from Sum to Count (double
click on ‘sum of type’) … OK
Click FINISH
Sum of C_Gender Type
C_Gender 1 2 Grand Total
1 41 22 63
2 19 18 37
Grand Total 60 40 100
Q7 What is P( M )? 0.63
Q8 What is P( U )? 0.6
Q9 What is P( M or U )? 0.63 + 0.6 − 0.41 = 0.82
Q10 What is P( M and U )? 0.41
Q11 What is P( M | U )? 0.41/0.6 = 0.683
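The probabilities can be read off the pivot table as follows. This is a minimal sketch that assumes M corresponds to C_Gender = 1 and U to Type = 1, which is what the values 0.63 and 0.6 for P(M) and P(U) imply; the variable names are illustrative.

```python
# Counts from the pivot table: keys are (C_Gender, Type).
counts = {(1, 1): 41, (1, 2): 22, (2, 1): 19, (2, 2): 18}
n = sum(counts.values())

# Assumption: M is C_Gender = 1 and U is Type = 1.
p_m = sum(v for (g, t), v in counts.items() if g == 1) / n   # row total / grand total
p_u = sum(v for (g, t), v in counts.items() if t == 1) / n   # column total / grand total
p_m_and_u = counts[(1, 1)] / n                               # joint cell / grand total
p_m_or_u = p_m + p_u - p_m_and_u                             # addition rule
p_m_given_u = p_m_and_u / p_u                                # conditional probability

print(p_m, p_u, round(p_m_or_u, 2), p_m_and_u, round(p_m_given_u, 3))
```

The joint probability comes from a single cell of the table, and the union follows from the addition rule P(M or U) = P(M) + P(U) − P(M and U).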
Introduction to Hypothesis Testing
Q12
Pg 323 K&W 5th ed
Excel gives us the following population figures for the logannualsal variable (please note that this
table was calculated using the entire dataset):
logannualsal
Mean 9.322461
Standard Error 0.008055
Median 9.357207
Mode 8.188689
Standard Deviation 1.162719
From the sample we calculate the following values:
logannualsal
Mean 9.334814
Standard Error 0.126136
Median 9.287301
Mode 9.798127
Standard Deviation 1.261362
TEST:
Tools… > Data Analysis CC… > One Sample Inference > Inference about a Mean (Sigma known)…
H0: μ = 9.32 (the population mean is equal to 9.32)
H1: μ ≠ 9.32 (the population mean is not equal to 9.32)
Test Statistic:
$$ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{9.3348 - 9.32}{1.16 / \sqrt{100}} = 0.1276 $$
From Excel:
One Sample z-test
H0: Mu = 9.32
H1: Mu <> 9.32
Sample Mean 9.3348
Sigma 1.16
Z-statistic 0.1277
Sample Size 100
p-value 0.8984
95% Confidence Interval
(9.1075,9.5622)
0.8984 > 0.05, hence we cannot reject the null hypothesis at the 5% significance level.
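The z-test can also be reproduced outside Excel. The sketch below uses Python's standard `statistics.NormalDist`; the function name `z_test_two_sided` is made up for this illustration, and it assumes the unrounded sample mean 9.334814 from the sample table together with sigma rounded to 1.16, as in the Excel output.

```python
import math
from statistics import NormalDist

def z_test_two_sided(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided one-sample z-test with known sigma.

    Returns (z, p_value, (ci_lower, ci_upper)).
    """
    std_error = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / std_error
    # Two-sided p-value: probability of a statistic at least as extreme as |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Confidence interval around the sample mean at level 1 - alpha.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (sample_mean - z_crit * std_error, sample_mean + z_crit * std_error)
    return z, p_value, ci

# Assumption: unrounded sample mean from the sample table, sigma rounded to 1.16.
z, p, ci = z_test_two_sided(9.334814, 9.32, 1.16, 100)
print(round(z, 4), round(p, 4), tuple(round(x, 4) for x in ci))
```

Calling the same function with `mu0=9.1` gives the z-statistic and p-value for the test of H0: Mu = 9.1 in Q14.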
Q12 WebCT 7: 0.8984
Q13
WebCT 9: Type II
Q14
One Sample z-test
H0: Mu = 9.1
H1: Mu <> 9.1
Sample Mean 9.3348
Sigma 1.16
Z-statistic 2.0243
Sample Size 100
p-value 0.0429
WebCT 10: 0.0429
Q15
0.0429 < 0.05, so we can now reject the null hypothesis at the 5% significance level and conclude that
there is sufficient evidence to infer that the population mean is not equal to 9.1.
WebCT 11: Yes
Q16
WebCT 12 : Type I
Inference about the Description of a Single Population
Q17
Pg 356 K&W 5th ed
Tools… > Data Analysis CC… > One Sample Inference > Inference about a Mean (Sigma
unknown)…
H0: μ = 40 (the population mean is equal to 40)
H1: μ ≠ 40 (the population mean is not equal to 40)
Test Statistic:
$$ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{38.98 - 40}{11.5697 / \sqrt{100}} = -0.8816 $$
From Excel:
One Sample t-test
H0: Mu = 40
H1: Mu <> 40
Sample Mean 38.98
Sample Standard Deviation 11.56971
t-statistic -0.8816
Sample Size 100
p-value 0.3801
95% Confidence Interval
(36.6843,41.2757)
0.3801 > 0.05, hence we cannot reject the null hypothesis at the 5% significance level. There is
insufficient evidence to infer that the population mean is not equal to 40.
WebCT 13: 0.3801
Q18
Tools… > Data Analysis CC… > One Sample Inference > Inference about a Mean (Sigma
unknown)…
H0: μ = 40 (the population mean is equal to 40)
H1: μ < 40 (the population mean is less than 40)
From Excel:
One Sample t-test
H0: Mu = 40
H1: Mu < 40
Sample Mean 38.98
Sample Standard Deviation 11.56971
t-statistic -0.8816
Sample Size 100
p-value 0.1901
0.1901 > 0.05, hence we cannot reject the null hypothesis at the 5% significance level. There is
insufficient evidence to infer that the population mean is less than 40.
WebCT 14: 0.1901
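The two t-tests above (two-tailed and lower-tailed) can be checked with a short Python sketch. Python's standard library has no Student's t distribution, so this version integrates the t density numerically with Simpson's rule; that is an illustrative choice, not the method Excel uses, and the function names are made up for this sketch.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log(1 + x * x / df))

def t_cdf(t, df, steps=2000):
    """CDF of Student's t via Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 + area if t >= 0 else 0.5 - area

def one_sample_t(sample_mean, mu0, s, n):
    """Return (t, two-sided p-value, lower-tail p-value) with n - 1 degrees of freedom."""
    t = (sample_mean - mu0) / (s / math.sqrt(n))
    lower_tail = t_cdf(t, n - 1)
    two_sided = 2 * min(lower_tail, 1 - lower_tail)
    return t, two_sided, lower_tail

t, p_two, p_lower = one_sample_t(38.98, 40, 11.56971, 100)
print(round(t, 4), round(p_two, 4), round(p_lower, 4))
```

The lower-tail p-value is half of the two-sided one here because the t-statistic is negative, which is the direction of the alternative H1: Mu < 40.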
Q19
Pg 359 K&W 5th ed
0.95 Confidence Interval Estimate of MU (SIGMA Unknown)
Sample mean = 38.98
Sample standard deviation = 11.5697
Lower confidence limit = 36.6843
Upper confidence limit = 41.2757
WebCT 15: 41.2757
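The confidence limits follow from the sample mean plus or minus the critical t value times the standard error. A minimal sketch, assuming a critical value of about 1.9842 for 99 degrees of freedom at the 95% level (taken from a t table, not from the source):

```python
import math

# Values from the confidence interval output above.
sample_mean, s, n = 38.98, 11.5697, 100

# Assumption: t(0.025, 99) is approximately 1.9842, as read from a t table.
t_crit = 1.9842

margin = t_crit * s / math.sqrt(n)   # critical value times the standard error
lower, upper = sample_mean - margin, sample_mean + margin
print(round(lower, 4), round(upper, 4))
```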
Additional tests for Qualitative Data
Q20
Pg 548 K&W 5th ed
Expected Values based on previous year’s figures:
Number of observations Proportion Expected Value
100 0.75 75
100 0.12 12
100 0.08 8
100 0.05 5
Observed Values from sample data:
Observed Values
79
10
9
2
H0: p1 = 0.75, p2 = 0.12, p3 = 0.08, p4 = 0.05
H1: at least one of the proportions is not equal to its specified value
Test Statistic
Race            Obs. Freq. (f_i)  Expected Freq. (e_i)  (f_i − e_i)  (f_i − e_i)² / e_i
Black/African   79                75                    4            0.2133
Coloured        10                12                    -2           0.3333
White           9                 8                     1            0.1250
Indian/Asian    2                 5                     -3           1.8000
TOTAL           100               100                                2.4717
$$ \chi^2 = \sum_i \frac{(f_i - e_i)^2}{e_i} = 2.4717 $$
THE IMPORTANT BIT!!!
Set up a table as below & use CHITEST.
From Excel: Using the CHITEST function in Excel, the following p-value can be calculated:
Obs Exp
79 75
10 12
9 8
2 5
P-value 0.480433
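The test statistic and the p-value can be recomputed from the observed and expected counts. The sketch below is an illustration in Python rather than Excel; it uses the closed-form upper-tail probability of the chi-squared distribution that holds for 3 degrees of freedom only, and the helper name `chi2_sf_df3` is made up for this example.

```python
import math

observed = [79, 10, 9, 2]
expected = [75, 12, 8, 5]

# Test statistic: sum of (f_i - e_i)^2 / e_i over the k = 4 categories.
chi_sq = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
df = len(observed) - 1  # k - 1 = 3 degrees of freedom

def chi2_sf_df3(x):
    """Upper-tail probability of the chi-squared distribution with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_sf_df3(chi_sq)
print(round(chi_sq, 4), round(p_value, 6))
```

For a different number of categories the closed form no longer applies and a general chi-squared routine would be needed.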
Rejection Region
$$ \chi^2 > \chi^2_{\alpha, k-1} = \chi^2_{0.05, 3} = 7.815 $$
Conclusion
2.4717 < 7.815, hence we cannot reject the null hypothesis at the 5% significance level. There is
insufficient evidence to infer that the proportions differ from the previous year's figures.
WebCT 23: 0.4804