
Unit 4 QB Part B Answer (2023)

The document discusses non-parametric tests, which are statistical methods that do not assume a specific distribution for the data. It outlines various tests such as the sign test, Wilcoxon signed-rank test, Mann-Whitney U test, and Kruskal-Wallis test, including their definitions, uses, advantages, and disadvantages. Additionally, it provides examples of applying the Kruskal-Wallis test to analyze production volumes and corrosion prevention methods.

Uploaded by

jossuji80

MA4401-Probability and Statistics, Dept. of Mathematics, 2023-24
UNIT IV- NON-PARAMETRIC TESTS
PART – A
1 What is a non-parametric test?
Non-parametric tests are methods of statistical analysis that do not require the data to follow a specific distribution (in particular, the data need not be normally distributed). They are also known as distribution-free tests.
2 Name any three non-parametric tests.
1. Sign test for paired data
2.Rank sum tests
(a) Mann-Whitney U test
(b) Kruskal-Wallis test or H - test
3.One sample run test.
3 Write the advantages of non-parametric tests.
(i) They do not require the assumption that a population is distributed in the shape of a normal curve or any other specific shape.
(ii) Generally they are easier to carry out and to understand.
(iii) Sometimes even formal ordering or ranking is not required.
4 Write the disadvantages of non-parametric tests.
(i) They are less efficient than parametric tests.
(ii) Because they are distribution free, the results may not always provide an accurate answer.
5 Define the sign test.
The sign test is a non-parametric test based on the signs (plus or minus) of the differences between paired observations, or between the observations and a hypothesized value. It tests whether plus and minus signs are equally likely.
6 When is the sign test used?
(i) When there are pairs of observations on the two things being compared.
(ii) For any given pair, the two observations are made under similar conditions.
(iii) No assumptions are made regarding the parent population.
7 Define the Wilcoxon signed-rank test.
The Wilcoxon signed-rank test is a non-parametric statistical hypothesis test used to compare a continuous outcome in two related samples, matched samples, or repeated measurements on a single sample, to assess whether their population mean ranks differ.
8 How do you interpret the Wilcoxon signed-rank test?
The test statistic for the Wilcoxon signed-rank test is W, defined as the smaller of W+ (sum of the positive ranks) and W- (sum of the negative ranks). If the null hypothesis is true, we expect W+ and W- to be similar, so a very small value of W is evidence against the null hypothesis.
9 Define the Kolmogorov-Smirnov test.
It is a method for determining the goodness of fit between an observed sample and a theoretical probability distribution.
10 Write the importance of the Kolmogorov-Smirnov test.
The Kolmogorov-Smirnov test is a non-parametric goodness-of-fit test used to determine whether two distributions differ, or whether an underlying probability distribution differs from a hypothesized distribution. It is used when we have two samples coming from two populations that can be different.
11 Define the statistic used in the Kolmogorov-Smirnov test.
The test statistic for the Kolmogorov-Smirnov test is the maximum absolute deviation between the expected cumulative relative frequency $F_e$ and the observed cumulative relative frequency $F_o$. It is denoted by $D_n$, i.e.,
$D_n = \max |F_e - F_o|$

12 When is Mann-Whitney ‘U’ test used?
The Mann-Whitney U test is used to compare whether there is a difference in the dependent
variable for two independent groups. It compares whether the distribution of the dependent
variable is the same for the two groups and therefore from the same population.
13 Define the statistic used in the U-test and give its mean.
$U = n_1 n_2 + \dfrac{n_1(n_1+1)}{2} - R_1$, Mean $= \dfrac{n_1 n_2}{2}$
14 Explain the Kruskal-Wallis test or H-test.
The Kruskal-Wallis test is a generalization of the Mann-Whitney U test to the case of $k > 2$ samples. It is used to test the null hypothesis $H_0$ that $k$ independent samples are drawn from identical populations.
15 When do we use the Kruskal-Wallis test?
(i) The observations are independent within and between samples.
(ii) The variable under study is continuous.
(iii) The populations are identical except possibly in respect of the median.
16 Write the formula for the Kruskal-Wallis test.
The test statistic for the Kruskal-Wallis test is
$$H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n+1)$$
17 What are the assumptions of run test
Run test of randomness assumes that the mean and the variance are constant and the probability
is independent.
18 Define One Sample Run Test?
It is a method to determine the randomness with which the sample items have been selected.
19 Explain the term “Run” with an example.
A run is a subsequence of one or more identical symbols representing a common property of the
data. In other words, a run is defined as a set of identical (or related) symbols contained between
two different symbols or no symbols (such as at the beginning or end of the sequence).
Ex: Consider a sequence made up of two symbols, A and B, such as AAA|BB|A|BBB|AAAA|BBBB|AA. In tossing a coin, for example, the A's could represent 'heads' and the B's 'tails'. This sequence has 7 runs.
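The run count in the example above can be checked with a short sketch. The helper name `split_runs` is an assumption made for illustration; it simply groups consecutive identical symbols.

```python
from itertools import groupby


def split_runs(sequence):
    """Split a sequence of symbols into its runs of consecutive identical symbols."""
    return ["".join(group) for _, group in groupby(sequence)]


# The example sequence, written without the separator bars.
runs = split_runs("AAABBABBBAAAABBBBAA")
print(runs, len(runs))
```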
20 Write the formula for the run test.
Let R be the number of runs, $n_1$ = number of items of the first kind, $n_2$ = number of items of the second kind.
Here, R is approximated by a normal distribution:
$$Z = \frac{R - E(R)}{\sqrt{V(R)}} \sim N(0,1)$$
where
$$E(R) = \mu = \frac{2 n_1 n_2}{n_1 + n_2} + 1, \qquad V(R) = \sigma^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}$$
so that $Z = \dfrac{R - \mu}{\sigma} \sim N(0,1)$.
PART B
1 (a) The production volume of units assembled by three different operators during 9 shifts is summarized below. Check whether there is a significant difference between the production volumes of units assembled by the three operators using the Kruskal-Wallis test at a significance level of 0.05.
Operator I     29 34 34 20 32 45 42 24 35
Operator II    30 21 23 25 44 37 34 19 38
Operator III   26 36 41 48 27 39 28 46 15
Solution:
Null Hypothesis: H0: There is no significant difference between the production volumes of units assembled by the three operators.
Alternative Hypothesis: H1: There is a significant difference between the production volumes of units assembled by the three operators.
Since there are three operators (I, II and III) and each group consists of 9 shifts, we have N1 = N2 = N3 = 9 and N = N1 + N2 + N3 = 9 + 9 + 9 = 27.
Arranging all these units in increasing order of magnitude and assigning appropriate ranks (the three tied values of 34 occupy positions 14, 15 and 16, so each receives the average rank 15), we get
Units   15 19 20 21 23 24 25 26 27 28
Ranks    1  2  3  4  5  6  7  8  9 10
Units   29 30 32 34 34 34 35 36 37 38
Ranks   11 12 13 15 15 15 17 18 19 20
Units   39 41 42 44 45 46 48
Ranks   21 22 23 24 25 26 27

Ranks by operator                               Sum of ranks
Operator I     11 15 15  3 13 25 23  6 17       R1 = 128
Operator II    12  4  5  7 24 19 15  2 20       R2 = 108
Operator III    8 18 22 27  9 21 10 26  1       R3 = 142

$$H = \frac{12}{N(N+1)} \left[ \frac{R_1^2}{N_1} + \frac{R_2^2}{N_2} + \frac{R_3^2}{N_3} \right] - 3(N+1)$$
$$= \frac{12}{27(28)} \left[ \frac{128^2}{9} + \frac{108^2}{9} + \frac{142^2}{9} \right] - 3(28)$$
$$= \frac{12}{6804} \left[ 16384 + 11664 + 20164 \right] - 84 = \frac{12}{6804}(48212) - 84 = 85.030 - 84$$
$$H \approx 1.03$$
Degrees of freedom = k - 1 = 3 - 1 = 2. Level of significance: $\alpha = 0.05$.
$\chi^2_{0.05}$ for 2 degrees of freedom = 5.991 (from the $\chi^2$ table).
Decision: Accept $H_0$ if $H < \chi^2_{0.05,2}$.
Now 1.03 < 5.991, so $H < \chi^2_{0.05,2}$.
The null hypothesis H0 is accepted. We conclude that there is no significant difference between the production volumes of units assembled by the three operators.
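The ranking and the H statistic above can be reproduced with a minimal sketch. The function names `average_ranks` and `kruskal_wallis_h` are assumptions made for illustration; tied values receive the average of the positions they occupy, and no tie correction is applied, matching the formula used in this solution.

```python
def average_ranks(values):
    """Map each distinct value to the average of the positions it occupies when sorted."""
    positions = {}
    for pos, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(pos)
    return {value: sum(p) / len(p) for value, p in positions.items()}


def kruskal_wallis_h(groups):
    """Return the rank sums and H = 12/(N(N+1)) * sum(R_i^2 / n_i) - 3(N+1)."""
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    rank_of = average_ranks(pooled)
    rank_sums = [sum(rank_of[x] for x in group) for group in groups]
    weighted = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return rank_sums, h


operators = [
    [29, 34, 34, 20, 32, 45, 42, 24, 35],  # Operator I
    [30, 21, 23, 25, 44, 37, 34, 19, 38],  # Operator II
    [26, 36, 41, 48, 27, 39, 28, 46, 15],  # Operator III
]
rank_sums, h = kruskal_wallis_h(operators)
print(rank_sums, round(h, 2))
```

The computed H is then compared with the tabulated $\chi^2$ value for k - 1 degrees of freedom, as in the decision step above.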
(b) An experiment designed to compare three preventive methods against corrosion yielded the following maximum depths of pits (in thousandths of an inch) in pieces of wire subjected to the respective treatments. Use the 0.05 level of significance to test whether the three samples come from identical populations using the Kruskal-Wallis test.
Method A   77 54 67 74 71 66  -
Method B   60 41 59 65 62 64 52
Method C   49 52 69 47 56  -  -
Solution:
Null Hypothesis: H0: The three samples come from identical populations.
Alternative Hypothesis: H1: The three samples do not come from identical populations.
$n = n_1 + n_2 + n_3 = 6 + 7 + 5 = 18$
Arranging these depths in increasing order of magnitude and assigning appropriate ranks, we have the table of ranks.
Depths  41 47 49 52  52  54 56 59 60 62 64 65 66 67 69 71 74 77
Rank     1  2  3 4.5 4.5  6  7  8  9 10 11 12 13 14 15 16 17 18

Ranks by method                        Sum of ranks
Method A   18  6  14 17 16 13          84
Method B    9  1   8 12 10 11 4.5      55.5
Method C    3 4.5 15  2  7             31.5
$N_1 = 6,\; N_2 = 7,\; N_3 = 5$
$N = N_1 + N_2 + N_3 = 6 + 7 + 5 = 18$
$R_1 = 84,\; R_2 = 55.5,\; R_3 = 31.5$
$$H = \frac{12}{N(N+1)} \left[ \frac{R_1^2}{N_1} + \frac{R_2^2}{N_2} + \frac{R_3^2}{N_3} \right] - 3(N+1)$$
$$= \frac{12}{18(19)} \left[ \frac{84^2}{6} + \frac{55.5^2}{7} + \frac{31.5^2}{5} \right] - 3(19) = \frac{12}{342}(1814.49) - 57$$
$$H \approx 6.67$$
Degrees of freedom = k - 1 = 3 - 1 = 2
Level of significance: $\alpha = 0.05$
The value of $\chi^2$ for 2 degrees of freedom at $\alpha = 0.05$ is 5.991.
Conclusion:
Since the value of H is greater than the tabulated value of $\chi^2$ at $\alpha = 0.05$ for 2 degrees of freedom (6.67 > 5.991), we reject the null hypothesis and conclude that the three samples do not come from identical populations.
2 (a) The manager of a company believes that the difference in sales performance depends upon the salesperson's age. Independent samples of salespeople were taken and their weekly sales records are reported below. At the 95% confidence level, test the hypothesis using the Kruskal-Wallis test.
No. of sales
Below 30 years         24 16 21 15 19 26  -
Between 30-40 years    23 17 22 25 18 29 27
Over 45 years          30 20 23 25 34 36 28
Solution:
Null Hypothesis: $H_0: \mu_1 = \mu_2 = \mu_3$, i.e. there is no significant difference between the 3 age groups with respect to their mean sales.
Alternative Hypothesis: $H_1$: $\mu_1, \mu_2, \mu_3$ are not all equal, i.e. there is a significant difference between the 3 age groups with respect to their mean sales.
Level of significance $\alpha = 0.05$
                                                     Sum of ranks
Below 30 years         24   16  21  15   19  26
Rank R1                11    2   7   1    5  14          40
Between 30-40 years    23   17  22  25   18  29  27
Rank R2               9.5    3   8 12.5   4  17  15      69
Over 45 years          30   20  23  25   34  36  28
Rank R3                18    6 9.5 12.5  19  20  16     101
$n = n_1 + n_2 + n_3 = 6 + 7 + 7 = 20$
Test Statistic
$$W = \frac{12}{n(n+1)} \left[ \frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \frac{R_3^2}{n_3} \right] - 3(n+1) = \frac{12}{20(21)} \left[ \frac{40^2}{6} + \frac{69^2}{7} + \frac{101^2}{7} \right] - 3(21) = 5.688$$
The $\chi^2$ value at the 5% level of significance with 2 d.f. is 5.991.
Conclusion:
Since W = 5.688 < $\chi^2$ = 5.991, we accept the null hypothesis and conclude that there is no significant difference between the 3 age groups with respect to their mean sales.
(b) During one semester a student received the marks shown below in various subjects. Test at the 5% level of significance whether there is a significant difference between the marks in these subjects using the Kruskal-Wallis test.
Mathematics   72 80 83 75
Science       81 74 77
English       88 82 90 87 80
Economics     90 71 77 70
Solution:
Null Hypothesis: $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$, i.e. there is no significant difference between the marks in these subjects.
Alternative Hypothesis: $H_1$: $\mu_1, \mu_2, \mu_3, \mu_4$ are not all equal, i.e. there is a significant difference between the marks in these subjects.
Level of significance $\alpha = 0.05$
                                          Sum of ranks
Mathematics    72   80   83  75
Rank R1         3  8.5   12   5               28.5
Science        81   74   77
Rank R2        10    4  6.5                   20.5
English        88   82   90  87   80
Rank R3        14   11 15.5  13  8.5          62
Economics      90   71   77  70
Rank R4      15.5    2  6.5   1               25
$n = n_1 + n_2 + n_3 + n_4 = 4 + 3 + 5 + 4 = 16$
Test Statistic
$$W = \frac{12}{n(n+1)} \left[ \frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \frac{R_3^2}{n_3} + \frac{R_4^2}{n_4} \right] - 3(n+1) = \frac{12}{16(17)} \left[ \frac{28.5^2}{4} + \frac{20.5^2}{3} + \frac{62^2}{5} + \frac{25^2}{4} \right] - 3(17) \approx 4.95$$
Degrees of freedom = k - 1 = 4 - 1 = 3. The $\chi^2$ value at the 5% level of significance with 3 d.f. is 7.815.
Conclusion:
Since W = 4.95 < $\chi^2$ = 7.815, we accept the null hypothesis and conclude that there is no significant difference between the marks in these subjects.
3 (a) Two samples are given below.
Values of $X_i$   1 2 3 5 7 9 11 18 -
Values of $Y_i$   4 6 8 10 12 13 14 15 19
Test whether the two samples come from the same population at level $\alpha = 0.10$ by using the Mann-Whitney U test.
Solution:
Null Hypothesis: $H_0: \mu_1 = \mu_2$, i.e. the two samples come from the same population.
Alternative Hypothesis: $H_1: \mu_1 \ne \mu_2$, i.e. the two samples do not come from the same population.
Level of Significance: $\alpha = 0.10$
Computation of U-statistic:
The observations are arranged in ascending order and ranks from 1 to 17 are assigned.
Original data   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 18 19
Rank            1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
The sums of the ranks of the observations belonging to each sample are
R1 = 1 + 2 + 3 + 5 + 7 + 9 + 11 + 16 = 54
R2 = 4 + 6 + 8 + 10 + 12 + 13 + 14 + 15 + 17 = 99
Also $n_1 = 8,\; n_2 = 9$
U-statistic:
$$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = (8 \times 9) + \frac{8 \times 9}{2} - 54 = 72 + 36 - 54 = 54$$
Mean:
$$\mu_u = \frac{n_1 n_2}{2} = \frac{8 \times 9}{2} = \frac{72}{2} = 36$$
Variance:
$$\sigma_u^2 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12} = \frac{8 \times 9 \times (8 + 9 + 1)}{12} = \frac{72(18)}{12} = 108, \qquad \sigma_u = \sqrt{108} = 10.39$$
Here $n_2 = 9 > 8$, so we can use the statistic
$$Z = \frac{U - \mu_u}{\sigma_u} = \frac{54 - 36}{10.39} = \frac{18}{10.39} = 1.73$$
The table value of $Z_\alpha$ for a two-tailed test at $\alpha = 0.10$ is 1.645. Now $|Z| > Z_\alpha$, as 1.73 > 1.645, so we reject the null hypothesis $H_0$ and conclude that the two samples do not come from the same population.
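The U statistic and its normal approximation can be sketched in a few lines. The function name `mann_whitney_z` is an assumption for illustration; it follows the working in this solution, using the rank sum of the first sample and no tie correction in the variance.

```python
import math


def mann_whitney_z(sample1, sample2):
    """Return (R1, U, Z) with U = n1*n2 + n1(n1+1)/2 - R1 and Z = (U - mean) / sd."""
    positions = {}
    for pos, value in enumerate(sorted(sample1 + sample2), start=1):
        positions.setdefault(value, []).append(pos)
    # Tied values share the average of the positions they occupy.
    rank_of = {value: sum(p) / len(p) for value, p in positions.items()}
    n1, n2 = len(sample1), len(sample2)
    r1 = sum(rank_of[value] for value in sample1)
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return r1, u, (u - mean_u) / sd_u


x = [1, 2, 3, 5, 7, 9, 11, 18]
y = [4, 6, 8, 10, 12, 13, 14, 15, 19]
r1, u, z = mann_whitney_z(x, y)
print(r1, u, round(z, 2))
```

The resulting |Z| is compared with the tabulated normal value for the chosen level of significance.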
(b) The following are the numbers of mistakes counted on pages randomly selected from reports typed by a company's two secretaries.
Male secretary     15 10 5 6 8 10 12
Female secretary   12 8 7 9 10 5 4
Use the Mann-Whitney U test at the 2% level of significance to test the null hypothesis that the 2 secretaries average equal mistakes per page.
Solution:
$H_0: \mu_1 = \mu_2$, i.e. there is no significant difference between the average mistakes per page of the 2 secretaries.
$H_1: \mu_1 \ne \mu_2$, i.e. there is a significant difference between the average mistakes per page of the 2 secretaries.
                                                   Sum of ranks
Male secretary       15   10    5   6    8  10   12
Rank R1              14   10  2.5   4  6.5  10 12.5      59.5
Female secretary     12    8    7   9   10   5    4
Rank R2            12.5  6.5    5   8   10 2.5    1      45.5
Here $n_1 = 7,\; n_2 = 7,\; R_1 = 59.5,\; R_2 = 45.5$
$$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = 7 \times 7 + \frac{7(7+1)}{2} - 59.5 = 49 + 28 - 59.5 = 17.5$$
$$\mu = \frac{n_1 n_2}{2} = \frac{7 \times 7}{2} = 24.5$$
$$\sigma = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{7 \times 7 \times (7 + 7 + 1)}{12}} = \sqrt{\frac{735}{12}} = \sqrt{61.25} = 7.8262$$
$$Z = \frac{U - \mu}{\sigma} \sim N(0,1), \qquad Z = \frac{17.5 - 24.5}{7.8262} = \frac{-7}{7.8262} = -0.8944$$
$|Z| = |-0.8944| = 0.8944$
At the $\alpha = 2\% = 0.02$ level of significance for a two-tailed test, $Z_\alpha = 2.33$.
Since |Z| < 2.33, we accept H0 and conclude that there is no significant difference between the average mistakes per page of the 2 secretaries.
4 (a) The times of service by two cashiers in a period in a bank are given below:
Consumer Number      SB139  Cu A/c 32  SB453  SB093  Cu A/c 21  SB123
Cashier A (in min)      12         23      1      5         16     17
Consumer Number      SB309  Cu A/c 12  SB678  SB090  SB121      -
Cashier B (in min)       2          3     10      8     12      -
Test whether the service times differ significantly between the two cashiers using the Mann-Whitney U test.
Solution:
$H_0: \mu_1 = \mu_2$, i.e. there is no significant difference between the cashiers with respect to average service time.
$H_1: \mu_1 \ne \mu_2$, i.e. there is a significant difference between the cashiers with respect to average service time.
Rank      1  2  3  4  5  6  7.5 7.5  9 10 11
Time      1  2  3  5  8 10  12  12  16 17 23
Cashier   A  B  B  A  B  B   A   B   A  A  A
$n_1 = 6,\; n_2 = 5$
$R_1 = 7.5 + 11 + 1 + 4 + 9 + 10 = 42.5$
U-Statistic:
$$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = (6 \times 5) + \frac{6(6+1)}{2} - 42.5 = 8.5$$
$$E(U) = \frac{n_1 n_2}{2} = \frac{6 \times 5}{2} = 15$$
$$V(U) = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12} = \frac{(6 \times 5)(6 + 5 + 1)}{12} = 30$$
$$Z = \frac{U - E(U)}{\sqrt{V(U)}} = \frac{8.5 - 15}{\sqrt{30}} = -1.1867, \qquad |Z| = 1.1867$$
The table value of $Z_\alpha$ at the $\alpha = 5\%$ level of significance is 1.96.
Since |Z| = 1.1867 < 1.96, we accept the null hypothesis and conclude that there is no significant difference between the cashiers with respect to average service time.
(b) Two classes of students are tested using a certain competitive exam. The scores of a sample of students from each class are given below.
Class A   45 44 47 48 55 53 55 63  -  -  -
Class B   65 67 77 65 56 67 78 55 66 65 58
Test whether both classes have similar scholastic levels using the Mann-Whitney U test.
Solution:
$H_0: \mu_1 = \mu_2$, i.e. there is no significant difference between the scholastic levels of the two classes.
$H_1: \mu_1 \ne \mu_2$, i.e. there is a significant difference between the scholastic levels of the two classes.
                                                         Sum of ranks
Class A     45   44  47  48  55   53  55  63
Rank R1      2    1   3   4   7    5   7  11                  40
Class B     65   67  77  65  56   67  78  55  66  65  58
Rank R2     13 16.5  18  13   9 16.5  19   7  15  13  10     150
$$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 = 8 \times 11 + \frac{8(8+1)}{2} - 40 = 84$$
$$\mu = \frac{n_1 n_2}{2} = \frac{8 \times 11}{2} = 44$$
$$\sigma = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{8 \times 11 \times (8 + 11 + 1)}{12}} = \sqrt{\frac{88 \times 20}{12}} = \sqrt{146.667} = 12.111$$
$$Z = \frac{U - \mu}{\sigma} \sim N(0,1), \qquad Z = \frac{84 - 44}{12.111} = 3.303$$
At the $\alpha = 5\% = 0.05$ level of significance for a two-tailed test, $Z_\alpha = 1.96$.
Since |Z| > 1.96, we reject the null hypothesis and conclude that there is a significant difference between the scholastic levels of the two classes.
5 (a) Kevin Morgan, national sales manager of an electronics firm, has collected the following salary statistics on his field sales force earnings. He has both observed frequencies and expected frequencies if the distribution of salaries is normal. At the 0.05 level of significance, can Kevin conclude that the distribution of sales force earnings is normal?
Earnings in thousands   25-30 31-36 37-42 43-48 49-54 55-60 61-66
Observed Frequency          9    22    25    30    21    12     6
Expected Frequency          6    17    32    35    18    13     4
Solution:
Null Hypothesis: $H_0$: The distribution of sales force earnings is normal
Alternative Hypothesis: $H_1$: The distribution of sales force earnings is not normal
Level of significance: $\alpha = 0.05$
Here O.F = observed frequency, O.C.F = observed cumulative frequency, O.R.F = observed relative (cumulative) frequency $F_o$, E.F = expected frequency, C.E.F = cumulative expected frequency, E.R.F = expected relative (cumulative) frequency $F_e$.
O.F   O.C.F   O.R.F             E.F   C.E.F   E.R.F             D = |Fe - Fo|
  9       9   9/125 = 0.072       6       6   6/125 = 0.048     0.024
 22      31   31/125 = 0.248     17      23   23/125 = 0.184    0.064
 25      56   56/125 = 0.448     32      55   55/125 = 0.440    0.008
 30      86   86/125 = 0.688     35      90   90/125 = 0.720    0.032
 21     107   107/125 = 0.856    18     108   108/125 = 0.864   0.008
 12     119   119/125 = 0.952    13     121   121/125 = 0.968   0.016
  6     125   125/125 = 1         4     125   125/125 = 1       0
Test statistic: $D_n = \max |F_e - F_o|$
$\Rightarrow D_n = 0.064$
The tabulated value of $D_n$ for n = 7 and $\alpha = 0.05$ is 0.486.
Conclusion:
Since the table value of $D_n$ (0.486) is greater than the calculated value of $D_n$ (0.064), we accept the null hypothesis, i.e., the distribution of sales force earnings is normal.
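The cumulative relative frequencies and $D_n$ in the table above can be recomputed with a minimal sketch. The function name `ks_statistic` is an assumption for illustration; it takes the grouped observed and expected frequencies exactly as they appear in the problem.

```python
from itertools import accumulate


def ks_statistic(observed, expected):
    """Return the per-class deviations |Fe - Fo| and their maximum D_n."""
    cum_obs = list(accumulate(observed))
    cum_exp = list(accumulate(expected))
    total_obs, total_exp = cum_obs[-1], cum_exp[-1]
    deviations = [
        abs(e / total_exp - o / total_obs) for o, e in zip(cum_obs, cum_exp)
    ]
    return deviations, max(deviations)


observed = [9, 22, 25, 30, 21, 12, 6]
expected = [6, 17, 32, 35, 18, 13, 4]
deviations, d_n = ks_statistic(observed, expected)
print([round(d, 3) for d in deviations], round(d_n, 3))
```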
(b) Suppose it is desired to check whether pinholes in electrolytic tin plate are distributed uniformly across a plated coil on the basis of the following distances of 10 pinholes from one edge of a long strip of tin plate 30 inches wide.
4.8 14.8 28.2 23.1 4.4 28.7 19.5 2.4 25 6.2
Use the Kolmogorov-Smirnov test to test the null hypothesis.
Solution:
Null hypothesis: $H_0$: The pinholes are distributed uniformly across the tin plate
Alternative hypothesis: $H_1$: The pinholes are not distributed uniformly across the tin plate
O.F O.C.F O.R.F E.F C.E.F E.R.F D = |Fe - Fo|
4.8 4.8 0.03 15.73 15.73 0.1 0.07
14.8 19.6 0.13 15.73 31.5 0.2 0.07
28.2 47.8 0.30 15.73 47.25 0.3 0
23.1 70.9 0.45 15.73 63 0.4 0.05
4.4 75.3 0.48 15.73 78.75 0.5 0.02
28.7 104 0.66 15.73 94.5 0.6 0.06
19.7 123.7 0.79 15.73 110.25 0.7 0.09
2.4 126.1 0.80 15.73 126 0.8 0
25 151.1 0.96 15.73 141.75 0.9 0.06
6.2 157.3 1 15.73 157.5 1 0
$D_n = \max |F_e - F_o| = 0.09$
The tabulated $D_n$ for n = 10 at the 0.05 level of significance is 0.410.
Since the calculated $D_n$ (0.09) is less than the tabulated $D_n$ (0.410), we accept the null hypothesis and conclude that the pinholes are uniformly distributed across the tin plate.
6 (a) The following table gives the observed frequency along with the expected frequency under a normal distribution.
(i) Calculate the Kolmogorov-Smirnov statistic.
(ii) Can we conclude that this distribution does in fact follow a normal distribution? Use the 0.10 level of significance.
Test Score           51-60 61-70 71-80 81-90 91-100
Observed Frequency      30   100   440   500    130
Expected Frequency      40   170   500   390    100
Solution:
Null Hypothesis: $H_0$: The distribution is normal
Alternative Hypothesis: $H_1$: The distribution is not normal
Level of significance: $\alpha = 0.10$
O.F   O.C.F   O.R.F                E.F   C.E.F   E.R.F                D = |Fe - Fo|
 30      30   30/1200 = 0.025       40      40   40/1200 = 0.033      0.008
100     130   130/1200 = 0.108     170     210   210/1200 = 0.175     0.067
440     570   570/1200 = 0.475     500     710   710/1200 = 0.592     0.117
500    1070   1070/1200 = 0.892    390    1100   1100/1200 = 0.917    0.025
130    1200   1200/1200 = 1        100    1200   1200/1200 = 1        0

Test statistic: $D_n = \max |F_e - F_o|$
$\Rightarrow D_n = 0.117$
The tabulated value of $D_n$ for n = 5 and $\alpha = 0.10$ is 0.510.
Conclusion:
Since the table value of $D_n$ (0.510) is greater than the calculated value of $D_n$ (0.117), we accept the null hypothesis, i.e., the distribution is normal.
(b) The following data show the employees' rates of defective work before and after a change in the wage incentive plan. Compare the following two sets of data to see whether the change lowered the defective units produced, using the sign test with $\alpha = 0.01$.
Before   8 7 6 9 7 10 8 6 5 8 10 8
After    6 5 8 6 9 8 10 7 5 6 9 8
Solution:
Null Hypothesis: $H_0: p = 0.5$
Alternative Hypothesis: $H_1: p > 0.5$ (one-tailed test)
Level of significance: $\alpha = 0.01$
Test statistic:
$d_i$ (after - before):  -  -  +  -  +  -  +  +  0  -  -  0
Here n = 4 + 6 = 10 (omitting the two zero differences)
k = number of negative deviations = 6
Now,
$$p' = P(u \ge k) = \left(\frac{1}{2}\right)^{n} \sum_{x=k}^{n} \binom{n}{x} \qquad (\text{here } np = 10 \times 0.5 = 5)$$
$$= \left(\frac{1}{2}\right)^{10} \sum_{x=6}^{10} \binom{10}{x} = \left(\frac{1}{2}\right)^{10} \left[ \binom{10}{6} + \binom{10}{7} + \cdots + \binom{10}{10} \right] = (0.000976)(386) = 0.3767$$
Conclusion:
Since $p' = 0.3767 > 0.01$, we accept the null hypothesis and conclude that there is no significant change in the defective units produced.
7 (a) The following are the measurements of breaking strength of a certain kind of 2-inch cotton ribbon in pounds.
163 165 160 189 161 171 158 151 169 162
163 139 172 165 148 166 172 163 187 173
Use the sign test to test the null hypothesis $\mu = 160$ against the hypothesis $\mu > 160$ at the 0.05 level of significance.
Solution:
Null Hypothesis: $H_0: \mu = 160$
Alternative Hypothesis: $H_1: \mu > 160$ (one-tailed test)
Level of significance: $\alpha = 5\%$
Test statistic:
Let u = the observed number of plus signs.
Replacing each value exceeding 160 with a plus sign, each value less than 160 with a minus sign, and discarding the one value which equals 160, we get
+  +  0  +  +  +  -  -  +  +
+  -  +  +  -  +  +  +  +  +
Here n = the total number of plus and minus signs = 19
u = number of plus signs = 15
Now, we find
$$p' = P(u \ge 15) = \left(\frac{1}{2}\right)^{n} \sum_{x=15}^{n} \binom{n}{x} = \left(\frac{1}{2}\right)^{19} \sum_{x=15}^{19} \binom{19}{x}$$
$$= \left(\frac{1}{2}\right)^{19} \left[ \binom{19}{15} + \binom{19}{16} + \binom{19}{17} + \binom{19}{18} + \binom{19}{19} \right] = \frac{5036}{524288} = 0.0096$$
Conclusion:
Since $p'$ (0.0096) is less than 0.05, we reject the null hypothesis and conclude that the mean breaking strength of the given kind of ribbon exceeds 160 pounds.
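The sign count and the exact binomial tail probability above can be verified with a short sketch. The function name `sign_test_upper_tail` is an assumption for illustration; it computes $P(u \ge \text{plus})$ under $p = 0.5$ after dropping values equal to the hypothesized value.

```python
from math import comb


def sign_test_upper_tail(values, hypothesized):
    """Return (plus signs, usable n, P(u >= plus)) for a one-tailed sign test."""
    plus = sum(1 for v in values if v > hypothesized)
    minus = sum(1 for v in values if v < hypothesized)
    n = plus + minus  # values equal to the hypothesized value are discarded
    p_value = sum(comb(n, x) for x in range(plus, n + 1)) / 2 ** n
    return plus, n, p_value


strengths = [
    163, 165, 160, 189, 161, 171, 158, 151, 169, 162,
    163, 139, 172, 165, 148, 166, 172, 163, 187, 173,
]
plus, n, p_value = sign_test_upper_tail(strengths, 160)
print(plus, n, round(p_value, 4))
```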
(b) A consumer panel includes 14 individuals. It is asked to rate two brands of cola according to a point evaluation system based on several criteria. Test the null hypothesis that there is no difference in the level of ratings for the two brands of cola at the 5% level of significance using the sign test.
Brand I    20 24 28 24 20 29 19 27 20 30 18 28 26 24
Brand II   16 26 18 17 20 21 23 22 23 20 18 21 17 26
Solution:
Null Hypothesis: $H_0: p = 0.5$, i.e. there is no significant difference in the level of ratings for the two brands.
Alternative Hypothesis: $H_1: p \ne 0.5$, i.e. there is a significant difference in the level of ratings for the two brands.
Level of significance: $\alpha = 5\%$
Test statistic:
From the given data
$d_i$ (Brand I - Brand II):  +  -  +  +  0  +  -  +  -  +  0  +  +  -
Here n = 4 + 8 = 12 (by omitting zero differences)
u = number of plus signs = 8
Now, we find
$$p' = P(u \ge 8) = \left(\frac{1}{2}\right)^{12} \sum_{x=8}^{12} \binom{12}{x} = \left(\frac{1}{2}\right)^{12} \left[ \binom{12}{8} + \binom{12}{9} + \cdots + \binom{12}{12} \right] = \frac{794}{4096} = 0.194$$
Conclusion:
Since $p' = 0.194 > 0.05$, we accept the null hypothesis and conclude that there is no significant difference in the level of ratings for the two brands.
8 (a) The following data, in tons, are the amounts of sulphur oxides emitted by a large industrial plant in 40 days.
24 15 20 29 19 18 22 25 27 9
17 20 17 6 24 14 15 23 24 26
19 23 28 19 16 22 24 17 20 13
19 10 23 18 31 13 20 17 24 14
Use the sign test to test the null hypothesis $\mu = 21.5$ against the alternative hypothesis $\mu < 21.5$ at the 0.01 level of significance.
Solution:
Null Hypothesis: $H_0: \mu = 21.5$
Alternative Hypothesis: $H_1: \mu < 21.5$ (one-tailed test)
Level of significance: $\alpha = 1\%$
Test statistic:
Let u = the observed number of plus signs.
Replacing each value exceeding 21.5 with a plus sign and each value less than 21.5 with a minus sign (no value equals 21.5, so none is discarded), we get
+  -  -  +  -  -  +  +  +  -
-  -  -  -  +  -  -  +  +  +
-  +  +  -  -  +  +  -  -  -
-  -  +  -  +  -  -  -  +  -
Here n = the total number of plus and minus signs = 16 + 24 = 40
u = number of plus signs = 16
As the sample size n = 40 is large, we shall use the normal approximation to the binomial distribution.
$$Z = \frac{u - np}{\sqrt{npq}} = \frac{16 - 40(0.5)}{\sqrt{40(0.5)(0.5)}} = -1.26$$
$|Z| = |-1.26| = 1.26$
The critical value $Z_\alpha$ at $\alpha = 0.01$ for a one-tailed test is 2.33.
Conclusion:
Since |Z| = 1.26 < 2.33, we accept the null hypothesis and conclude that $\mu = 21.5$.
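For large n the same sign count feeds the normal approximation used above. The sketch below is illustrative only; the function name `sign_test_z` is an assumption, and it uses $p = q = 0.5$ as in the solution.

```python
import math


def sign_test_z(values, hypothesized):
    """Return (plus signs, usable n, Z) with Z = (u - n*p) / sqrt(n*p*q), p = q = 0.5."""
    plus = sum(1 for v in values if v > hypothesized)
    minus = sum(1 for v in values if v < hypothesized)
    n = plus + minus
    z = (plus - n * 0.5) / math.sqrt(n * 0.5 * 0.5)
    return plus, n, z


emissions = [
    24, 15, 20, 29, 19, 18, 22, 25, 27, 9,
    17, 20, 17, 6, 24, 14, 15, 23, 24, 26,
    19, 23, 28, 19, 16, 22, 24, 17, 20, 13,
    19, 10, 23, 18, 31, 13, 20, 17, 24, 14,
]
plus, n, z = sign_test_z(emissions, 21.5)
print(plus, n, round(z, 2))
```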

(b) In an industrial production line, items are inspected periodically for defectives. The following is a sequence of defective items (D) and non-defective items (N) produced by this production line.
DD NNN D NN DD NNNNN DDD NN D NNNN D N D
Test whether the defectives are occurring at random or not at the 5% level of significance.
Solution:
Null hypothesis: $H_0$: the defectives occur at random
Alternative hypothesis: $H_1$: the defectives do not occur at random
Level of significance $\alpha = 0.05$
Run      DD NNN D NN DD NNNNN DDD NN D NNNN D  N  D
Run no.   1   2 3  4  5     6   7  8 9   10 11 12 13
R = 13 (the number of runs), $n_1 = 11$ (number of D's), $n_2 = 17$ (number of N's)
$$\mu = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(11)(17)}{11 + 17} + 1 = 14.357$$
$$\sigma = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} = \sqrt{\frac{2(11)(17)\,[2(11)(17) - 11 - 17]}{(11 + 17)^2 (11 + 17 - 1)}} = 2.472$$
Test Statistic
$$Z = \frac{R - \mu}{\sigma} = \frac{13 - 14.357}{2.472} = -0.549, \qquad |Z| = |-0.549| = 0.549$$
The value of $Z_\alpha$ at the 5% level of significance is 1.96.
Conclusion:
Since |Z| < 1.96, we accept the null hypothesis and conclude that the defectives occur at random.
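The counting of runs and the Z statistic above can be checked with a minimal sketch. The function name `runs_test` is an assumption for illustration, and the code assumes the sequence contains exactly two distinct symbols (spaces are ignored).

```python
import math
from itertools import groupby


def runs_test(sequence):
    """Return (R, n1, n2, Z) for a two-symbol sequence using the normal approximation."""
    symbols = [s for s in sequence if not s.isspace()]
    runs = sum(1 for _ in groupby(symbols))  # one group per run
    n1 = symbols.count(symbols[0])
    n2 = len(symbols) - n1
    mean = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return runs, n1, n2, (runs - mean) / math.sqrt(variance)


runs, n1, n2, z = runs_test("DD NNN D NN DD NNNNN DDD NN D NNNN D N D")
print(runs, n1, n2, round(z, 3))
```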
9 (a) A technician is asked to analyze the results of 22 items made in a preparation run. Each item has been measured and compared to engineering specifications. The order of acceptances 'a' and rejections 'r' is aarrrarraaaaarrarraara. Determine whether it is a random sample or not. Use $\alpha = 0.05$.
Solution:
Null hypothesis: $H_0$: the sample is randomly chosen
Alternative hypothesis: $H_1$: the sample is not randomly chosen
Level of significance $\alpha = 0.05$
Given
Run      aa rrr a rr aaaaa rr a rr aa  r  a
Run no.   1   2 3  4     5  6 7  8  9 10 11
Here $n_1 = 12$, $n_2 = 10$, R = 11 (the number of runs)
$$\mu = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(12)(10)}{12 + 10} + 1 = 11.909$$
$$\sigma = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} = \sqrt{\frac{2(12)(10)\,[2(12)(10) - 12 - 10]}{(12 + 10)^2 (12 + 10 - 1)}} = 2.269$$
Test statistic:
$$Z = \frac{R - \mu}{\sigma} \sim N(0,1), \qquad Z = \frac{11 - 11.9091}{2.2688} = -0.4007$$
$|Z| = |-0.4007| = 0.4007$
The value of $Z_\alpha$ at the $\alpha = 0.05$ level of significance for a two-tailed test is 1.96.
Conclusion:
Since |Z| < 1.96, we accept $H_0$ and conclude that the sample is randomly chosen.
(b) The production manager of a large undertaking randomly paid 10 visits to the work site in a month. The numbers of workers who reported late for duty were found to be 2, 4, 5, 1, 6, 3, 2, 1, 7 and 8 respectively. Use the run test of randomness at $\alpha = 0.05$ to check the claim of the production superintendent that on average not more than 3 workers report late for duty.
Solution:
Null hypothesis: $H_0$: the sample is randomly chosen
Alternative hypothesis: $H_1$: the sample is not randomly chosen
Level of significance $\alpha = 0.05$
2 4 5 1 6 3 2 1 7 8
B A A B A - B B A A
Here A = a value above 3 and B = a value below 3; the value equal to 3 is omitted.
The above sequence can be written as
Run      B AA B A BB AA
Run no.  1  2 3 4  5  6
R = 6 (the number of runs)
$n_1 = 5$ (the number of occurrences of A)
$n_2 = 4$ (the number of occurrences of B)
$$\mu = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(5)(4)}{5 + 4} + 1 = 5.444$$
$$\sigma = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} = \sqrt{\frac{2(5)(4)\,[2(5)(4) - 5 - 4]}{(5 + 4)^2 (5 + 4 - 1)}} = 1.383$$
Test Statistic
$$Z = \frac{R - \mu}{\sigma} = \frac{6 - 5.444}{1.383} = 0.402$$
The value of $Z_\alpha$ at the 5% level of significance is 1.96.
Conclusion:
Since |Z| < 1.96, we accept the null hypothesis and conclude that the sample is randomly chosen.
10 (a) Explain the Kruskal-Wallis test procedure with appropriate examples.
The Mann-Whitney U test can be used to test whether two populations are identical. It has been extended to the case of 3 or more populations by Kruskal and Wallis. The hypotheses for the K-W test with $k \ge 3$ populations can be written as follows.
Null Hypothesis: $H_0: \mu_1 = \mu_2 = \mu_3$ (i.e. all the populations are identical)
Alternative hypothesis: $H_1$: $\mu_1, \mu_2, \mu_3$ are not all equal (i.e. not all the populations are identical)
The K-W test is based on the analysis of independent random samples from each of the k populations. The K-W test statistic, which is based on the sum of the ranks for each of the samples, can be computed as follows:
$$H \text{ (or } W) = \frac{12}{n(n+1)} \left[ \sum_{i=1}^{k} \frac{R_i^2}{n_i} \right] - 3(n+1)$$
where
$n_i$ = the number of items in sample i
k = number of populations (or samples)
$n = \sum n_i = n_1 + n_2 + \cdots + n_k$
$R_i$ = sum of the ranks of all items in sample i
To compute the W statistic, we must first rank all the given sample items together.
Kruskal and Wallis showed that under the null hypothesis, in which the populations are identical, the sampling distribution of W can be approximated by a $\chi^2$ distribution with (k - 1) degrees of freedom. The approximation is generally accepted if each of the sample sizes is greater than or equal to 5.
If $H < \chi^2_\alpha$ with (k - 1) degrees of freedom, we accept the null hypothesis at the $\alpha$ level of significance; otherwise H falls in the critical region $H > \chi^2_\alpha$ and we reject $H_0$.
(b) Explain the Mann-Whitney U test for comparing samples with appropriate examples.
Solution:
Use of the Mann-Whitney U test enables us to determine whether two populations are identical. Let $\{x_1, x_2, \ldots, x_m\}$ and $\{y_1, y_2, \ldots, y_n\}$ be two independent random samples from two populations. Here we set up the hypotheses
$H_0: \mu_1 = \mu_2$, i.e. the two populations are identical, and
$H_1: \mu_1 \ne \mu_2$, i.e. the two populations are not identical.
Working Rule:
1. Combine all the given samples, arrange the values from smallest to largest, and assign ranks to all values.
2. Assign the average of the ranks if sample values are the same (i.e. there are tied scores).
3. Find the sum of the ranks for each of the samples. Let us denote these sums by $R_1$ and $R_2$. Also, $n_1$ and $n_2$ are their respective sample sizes. For convenience choose $n_1 \le n_2$ (if they are unequal).
4. Calculate the U-statistic:
$$U = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 \quad \text{[for sample 1]}$$
(or)
$$U = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2 \quad \text{[for sample 2]}$$
Now, the mean and variance of the sampling distribution of U are
$$\text{Mean} = \frac{n_1 n_2}{2} \quad \text{and} \quad \text{Variance} = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$$
Therefore the standard normal variate of U is
$$Z = \frac{U - E(U)}{\sqrt{V(U)}} = \frac{U - \dfrac{n_1 n_2}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}} \sim N(0,1)$$
5. If $|Z| < Z_\alpha$, we accept $H_0$; if $|Z| > Z_\alpha$, we reject $H_0$, where $Z_\alpha$ is the tabulated value of Z for the given level of significance $\alpha$.
For example: if $|Z| < 1.96$, $H_0$ is accepted at the 5% level, and it is rejected if $|Z| > 1.96$.
