Descriptive statistics using Excel
- Data Analysis
The Analysis ToolPak is an Excel add-in program that provides data analysis tools
for financial, statistical and engineering data analysis.
To load the Analysis ToolPak add-in, execute the following steps.
1. Click on Excel Options.
2. Under Add-ins, select Analysis ToolPak and click on the Go button.
3. Check Analysis ToolPak and click on OK.
4. On the Data tab, you can now click on Data Analysis.
The following dialog box appears.
Examples on Data analysis
Descriptive Statistics
You can use the Analysis ToolPak add-in to generate descriptive statistics. For
example, suppose you have the test scores of 14 participants in cells A2:A15.
To generate descriptive statistics for these scores, execute the following steps.
1. On the Data tab, click Data Analysis.
Note: if you can't find the Data Analysis button, load the Analysis ToolPak
add-in first, as described under Data Analysis above.
2. Select Descriptive Statistics and click OK.
3. Select the range A2:A15 as the Input Range.
4. Select cell C1 as the Output Range.
5. Make sure Summary statistics is checked.
6. Click OK.
Result:
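The summary table that Excel produces lists the mean, standard error, median, mode, standard deviation, sample variance, kurtosis, skewness, range, minimum, maximum, sum and count. As a rough cross-check outside Excel, the following Python sketch computes most of these with the standard library. The scores are hypothetical values chosen for illustration, not the worksheet data, and the helper name describe is an assumption of this sketch; kurtosis and skewness are left out.

```python
import math
import statistics


def describe(values):
    """Return a dict of summary statistics for a list of numbers."""
    n = len(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return {
        "Mean": statistics.mean(values),
        "Standard Error": stdev / math.sqrt(n),
        "Median": statistics.median(values),
        "Mode": statistics.mode(values),
        "Standard Deviation": stdev,
        "Sample Variance": statistics.variance(values),
        "Range": max(values) - min(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "Sum": sum(values),
        "Count": n,
    }


# Hypothetical scores for 14 participants (NOT the worksheet data)
scores = [82, 93, 91, 69, 96, 61, 85, 93, 72, 88, 93, 79, 84, 90]
summary = describe(scores)
for label, value in summary.items():
    print(f"{label:20s}{value:>12.4f}")
```

The sample (n - 1) formulas are used here because the ToolPak reports the sample variance and sample standard deviation.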
Statistical tests using Excel
1- Analysis of variance (ANOVA)
- One Way ANOVA
𝛼 = 0.05.
Hypotheses testing:
𝐻0 : 𝜇1 = 𝜇2 = ⋯ = 𝜇𝑘
𝐻1 : at least one of the means is different.
Test statistic:
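𝐹 = 𝑀𝑆𝐴 / 𝑀𝑆𝐸
where 𝑀𝑆𝐴 is the mean square between groups and 𝑀𝑆𝐸 is the mean square error (within groups).
Critical region:
Reject 𝐻0 if 𝐹 > 𝐹1−𝛼,(𝑣1,𝑣2), with 𝑣1 = 𝑘 − 1 and 𝑣2 = 𝑁 − 𝑘 degrees of freedom (𝑁 is the total number of observations),
or, equivalently, if
𝑝-value ≤ 𝛼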
Decision:
This example teaches you how to perform a single factor ANOVA (analysis of
variance) in Excel. A single factor or one-way ANOVA is used to test the null
hypothesis that the means of several populations are all equal.
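To make the test statistic concrete, here is a minimal Python sketch that computes 𝐹 = 𝑀𝑆𝐴 / 𝑀𝑆𝐸 from raw data. The salary figures are hypothetical values invented for illustration, not the worksheet data, and the function name one_way_anova is an assumption of this sketch.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, v1, v2) for a list of samples, one list per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares (SSA)
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares (SSE)
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    v1, v2 = k - 1, n_total - k
    msa, mse = ssa / v1, sse / v2
    return msa / mse, v1, v2


# Hypothetical salaries in thousands (NOT the worksheet data)
economics = [42, 43, 44]
medicine = [45, 46, 47]
history = [40, 41, 42]

f_stat, v1, v2 = one_way_anova([economics, medicine, history])
print(f"F = {f_stat:.4f} with ({v1}, {v2}) degrees of freedom")
```

The groups do not need to have equal sizes; the degrees of freedom depend only on the number of groups and the total number of observations.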
Below you can find the salaries of people who have a degree in economics,
medicine or history.
To perform a single factor ANOVA, execute the following steps.
1. On the Data tab, click Data Analysis.
Note: if you can't find the Data Analysis button, load the Analysis ToolPak
add-in first, as described under Data Analysis above.
2. Select Anova: Single Factor and click OK.
3. Click in the Input Range box and select the range A2:C10.
4. Click in the Output Range box and select cell E1.
5. Click OK.
Result:
𝛼 = 0.05.
Hypotheses testing:
𝐻0 : 𝜇1 = 𝜇2 = 𝜇3
𝐻1 : at least one of the means is different.
Test statistic:
𝐹 = 𝑀𝑆𝐴 / 𝑀𝑆𝐸 = 15.19623
Critical region:
𝐹1−𝛼,(𝑣1,𝑣2) = 𝐹 crit = 3.443357
𝑝-value = 0.00007163
Reject 𝐻0 if 𝐹 > 3.443
or
𝑝-value ≤ 0.05
Decision:
Therefore, we reject the null hypothesis. The means of the three populations are
not all equal. At least one of the means is different. However, the ANOVA does
not tell you where the difference lies. You need a t-Test to test each pair of means.
- Two-Factor Analysis of Variance
𝛼 = 0.05.
The three hypotheses to be tested are as follows:
Hypotheses testing:
1. 𝐻0 : 𝑎1 = 𝑎2 = ⋯ = 𝑎𝑎 = 0
𝐻1 : At least one of the 𝑎𝑖 is not equal to zero.
2. 𝐻0 : 𝛽1 = 𝛽2 = ⋯ = 𝛽𝑏 = 0
𝐻1 : At least one of the 𝛽𝑗 is not equal to zero.
3. 𝐻0 : (𝑎𝛽)11 = (𝑎𝛽)12 = ⋯ = (𝑎𝛽)𝑎𝑏 = 0
𝐻1 : At least one of the (𝑎𝛽)𝑖𝑗 is not equal to zero.
Critical region:
Reject 𝐻0 if 𝑝-value ≤ 𝛼
Decision:
- ANOVA two ways with Replication
The following example compares the cost of three swimming programs (SQ, RQ and
TQ) in two regions, East and West, with six observations for each combination of
region and program.
Region   SQ   RQ   TQ
East     19    8   21
         18   10   31
         25   10   26
         20   18   28
         17    7   14
         21   16   24
West     15   23   31
         28   32   31
         17   26   26
         14   16   20
         17   27   14
         13   18    9
1. On the Data tab, click Data Analysis.
2. Select Anova: Two-Factor With Replication and click OK.
3. Click in the Input Range box and select the range A1:D13.
4. In the Rows per sample box, type 6 (each region has six rows of data).
5. Click in the Output Range box and select cell E2.
6. Click OK.
Result:
𝛼 = 0.05.
1) Hypotheses testing:
𝐻0 : 𝑎𝐸𝑎𝑠𝑡 = 𝑎𝑊𝑒𝑠𝑡 = 0
𝐻1 : at least one of the means is different.
Test statistic:
𝐹 = 1.4416
Critical region:
𝐹1−𝛼,(𝑣1,𝑣2) = 𝐹 crit = 4.1708
𝑝-value = 0.2392
Reject 𝐻0 if 𝐹 > 4.1708 or 𝑝-value ≤ 0.05
Decision:
Since 𝐹 = 1.4416 < 4.1708 (𝑝-value = 0.2392 > 0.05), we do not reject the null
hypothesis. There is no significant difference in mean cost between the East and
West regions.
2) Hypotheses testing:
𝐻0 : 𝛽𝑆𝑄 = 𝛽𝑅𝑄 = 𝛽𝑇𝑄 = 0
𝐻1 : at least one of the means is different.
Test statistic:
𝐹 = 2.871
Critical region:
𝐹1−𝛼,(𝑣1,𝑣2) = 𝐹 crit = 3.3158
𝑝-value = 0.0722
Reject 𝐻0 if 𝐹 > 3.3158 or 𝑝-value ≤ 0.05
Decision:
Since 𝐹 = 2.871 < 3.3158 (𝑝-value = 0.0722 > 0.05), we do not reject the null
hypothesis. There is no significant difference between the mean costs of the
three programs.
3) Hypotheses testing:
𝐻0 : 𝑎𝛽𝐸𝑎𝑠𝑡,𝑆𝑄 = 𝑎𝛽𝐸𝑎𝑠𝑡,𝑅𝑄 = 𝑎𝛽𝐸𝑎𝑠𝑡,𝑇𝑄 = 𝑎𝛽𝑊𝑒𝑠𝑡,𝑆𝑄 = 𝑎𝛽𝑊𝑒𝑠𝑡,𝑅𝑄 = 𝑎𝛽𝑊𝑒𝑠𝑡,𝑇𝑄 = 0
𝐻1 : at least one of the means is different.
Test statistic:
𝐹 = 6.4163
Critical region:
𝐹1−𝛼,(𝑣1,𝑣2) = 𝐹 crit = 3.3158
𝑝-value = 0.0047
Reject 𝐻0 if 𝐹 > 3.3158 or 𝑝-value ≤ 0.05
Decision:
Therefore, we reject the null hypothesis. There is interaction between the two
factors (region, type of program).
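The three decisions above all follow the same rule, which can be sketched in a few lines of Python. The numbers are the 𝐹, 𝐹 crit and 𝑝-values reported in the output above; the function name anova_decision and the consistency check are assumptions of this sketch, not part of Excel.

```python
def anova_decision(f_stat, f_crit, p_value, alpha=0.05):
    """Apply the two equivalent rejection rules and return the decision."""
    reject_by_f = f_stat > f_crit
    reject_by_p = p_value <= alpha
    # Both rules should agree; a mismatch suggests a transcription error.
    if reject_by_f != reject_by_p:
        raise ValueError("F rule and p-value rule disagree; check the inputs")
    return "reject H0" if reject_by_f else "do not reject H0"


# (F, F crit, p-value) as reported for the swimming-program example
reported = {
    "region": (1.4416, 4.1708, 0.2392),
    "program": (2.871, 3.3158, 0.0722),
    "interaction": (6.4163, 3.3158, 0.0047),
}

decisions = {name: anova_decision(*values) for name, values in reported.items()}
for name, decision in decisions.items():
    print(f"{name}: {decision}")
```

Checking both rules at once is a cheap way to catch a value copied from the wrong row of the ANOVA table.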
- ANOVA two ways without Replication
The following example compares the price of three swimming programs (SQ, RQ and TQ) across six areas, with one observation for each combination of area and program.
Area   SQ   RQ   TQ
1      19    8   21
2      18   10   31
3      25   10   26
4      20   18   28
5      17    7   14
6      21   16   24
𝛼 = 0.05.
The two hypotheses to be tested are as follows:
Hypotheses testing:
1. 𝐻0 : 𝑎1 = 𝑎2 = ⋯ = 𝑎𝑎 = 0
𝐻1 : At least one of the 𝑎𝑖 is not equal to zero.
2. 𝐻0 : 𝛽1 = 𝛽2 = ⋯ = 𝛽𝑏 = 0
𝐻1 : At least one of the 𝛽𝑗 is not equal to zero.
Critical region:
Reject 𝐻0 if 𝑝-value ≤ 𝛼
Decision:
1. On the Data tab, click Data Analysis.
2. Select Anova: Two-Factor Without Replication and click OK.
3. Click in the Input Range box and select the range B2:D7.
4. Click in the Output Range box and select cell E3.
5. Click OK.
Result:
𝛼 = 0.05.
1) Hypotheses testing:
𝐻0 : 𝑎1 = 𝑎2 = ⋯ = 𝑎6 = 0
𝐻1 : at least one of the means is different.
Test statistic:
𝐹 = 2.6805
Critical region:
𝐹1−𝛼,(𝑣1,𝑣2) = 𝐹 crit = 3.3258
𝑝-value = 0.0866
Reject 𝐻0 if 𝐹 > 3.3258 or 𝑝-value ≤ 0.05
Decision:
Since 𝐹 = 2.6805 < 3.3258 (𝑝-value = 0.0866 > 0.05), we do not reject the null
hypothesis. There is no significant difference in mean price between the six areas.
2) Hypotheses testing:
𝐻0 : 𝛽𝑆𝑄 = 𝛽𝑅𝑄 = 𝛽𝑇𝑄 = 0
𝐻1 : at least one of the means is different.
Test statistic:
𝐹 = 18.022
Critical region:
𝐹1−𝛼,(𝑣1,𝑣2) = 𝐹 crit = 4.1028
𝑝-value = 0.0004
Reject 𝐻0 if 𝐹 > 4.1028 or 𝑝-value ≤ 0.05
Decision:
Therefore, we reject the null hypothesis. There is a significant difference between
the mean prices of the three programs.
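The two 𝐹 statistics for this example can be recomputed from the price table with a short Python sketch. It uses the usual sums-of-squares decomposition for a two-factor layout with one observation per cell; the function name two_way_anova_no_replication is an assumption of this sketch. For the table above, the results should come out close to the reported 2.68 (areas) and 18.02 (programs).

```python
def two_way_anova_no_replication(table):
    """Return (F_rows, F_cols) for table[i][j], one observation per cell."""
    r, c = len(table), len(table[0])
    grand = sum(map(sum, table)) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(row[j] for row in table) / r for j in range(c)]
    # Sums of squares for the row factor, the column factor and the total
    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    # Without replication, the residual plays the role of the error term
    ss_error = ss_total - ss_rows - ss_cols
    ms_error = ss_error / ((r - 1) * (c - 1))
    f_rows = (ss_rows / (r - 1)) / ms_error
    f_cols = (ss_cols / (c - 1)) / ms_error
    return f_rows, f_cols


# Prices from the table above: rows are areas 1-6, columns are SQ, RQ, TQ
prices = [
    [19, 8, 21],
    [18, 10, 31],
    [25, 10, 26],
    [20, 18, 28],
    [17, 7, 14],
    [21, 16, 24],
]

f_areas, f_programs = two_way_anova_no_replication(prices)
print(f"F (areas) = {f_areas:.4f}, F (programs) = {f_programs:.4f}")
```

With a single observation per cell there is no separate within-cell variation, so no interaction test is available; this is why only two hypotheses are tested here.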