STA101 Assignment 1&2 Solution
Answer to the question no. 1
Marks: 45
1. (i) [5.5]
Qualitative | Quantitative (discrete) | Quantitative (continuous)
Hair color | Shoe size | Foot length
Computer password | Shirt size (10, 12, 14 etc.) | Height
License plate number | | Time to drive to campus
Shirt size (S, M, L) | |
Zip code | |
(ii)
Nominal | Ordinal | Interval | Ratio
Zip code | Grade | IQ | Height
Gender | Rating | SAT score | Time
Eye color | Ranking | Temperature (F, C) | Weight
2. A sample of 100 students was taken, and these students were asked about the amount of
money they possess. The following table gives the frequency distribution of their responses.
[7]
Amount of Money (Tk.) | Number of Students
0 - 99 | 18
100 - 199 | K
200 - 299 | 12
300 - 399 | K-4
400 - 499 | √81
500 - 599 | √49
600 - 699 | 8
700 - 799 | 9
800 - 899 | 6
900 - 999 | 5
a) Find the value of K and class midpoints.
STA101 (Introduction to Statistics) _Assignment 1&2_Summer 24
b) Do all the classes have the same width? If so, what is the width?
c) Prepare the relative frequency and percentage distribution columns.
d) Prepare a cumulative frequency distribution. Calculate the cumulative relative frequencies
and cumulative percentages for all classes.
e) Find the percentage of the students who possess money -
i. Minimum 500
ii. Maximum 499
f) Represent the data set in a suitable graph with appropriate information (like title, axis, label
etc.) and comments on the graph.
Answer to the question no. 2
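a) The frequencies must sum to 100: 18 + K + 12 + (K - 4) + 9 + 7 + 8 + 9 + 6 + 5 = 100, so 2K + 70 = 100 and K = 15.
Amount | Midpoint (x_i) | Frequency (f_i)
0 - 99 | 49.5 | 18
100 - 199 | 149.5 | K = 15
200 - 299 | 249.5 | 12
300 - 399 | 349.5 | K-4 = 11
400 - 499 | 449.5 | 9
500 - 599 | 549.5 | 7
600 - 699 | 649.5 | 8
700 - 799 | 749.5 | 9
800 - 899 | 849.5 | 6
900 - 999 | 949.5 | 5
Total: $\sum_{i=1}^{10} f_i = 100$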
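As a quick check of part (a), the following sketch solves for K from the requirement that the frequencies total 100 and computes each class midpoint as the average of its lower and upper limits. The variable names are illustrative only.

```python
# Frequencies from the question; None marks the two entries that depend on K.
# sqrt(81) = 9 and sqrt(49) = 7 have already been evaluated.
known = [18, None, 12, None, 9, 7, 8, 9, 6, 5]
total = 100

# The unknown entries are K and K - 4, so known_sum + K + (K - 4) = total.
known_sum = sum(f for f in known if f is not None)
K = (total - known_sum + 4) // 2

# Classes are 0-99, 100-199, ..., 900-999; midpoint = (lower + upper) / 2.
lower_limits = list(range(0, 1000, 100))
midpoints = [(lo + (lo + 99)) / 2 for lo in lower_limits]

print(K)              # 15
print(midpoints[:3])  # [49.5, 149.5, 249.5]
```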
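b) Yes, all classes have the same width. The width is 100, measured as the difference between consecutive lower limits (for example, 100 - 0); the gap between the stated limits of a single class (99 - 0) is 99.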
c & d)
Amount | f_i | Relative Frequency | Percentage | Cumulative Frequency | Cumulative Relative Frequency | Cumulative Percentage
0 - 99 (modal class) | 18 | 0.18 | 18 | 18 | 0.18 | 18
100 - 199 | 15 | 0.15 | 15 | 33 | 0.33 | 33
200 - 299 | 12 | 0.12 | 12 | 45 | 0.45 | 45
300 - 399 | 11 | 0.11 | 11 | 56 | 0.56 | 56
400 - 499 | 9 | 0.09 | 9 | 65 | 0.65 | 65
500 - 599 | 7 | 0.07 | 7 | 72 | 0.72 | 72
600 - 699 | 8 | 0.08 | 8 | 80 | 0.80 | 80
700 - 799 | 9 | 0.09 | 9 | 89 | 0.89 | 89
800 - 899 | 6 | 0.06 | 6 | 95 | 0.95 | 95
900 - 999 | 5 | 0.05 | 5 | 100 | 1.00 | 100
Total: $\sum_{i=1}^{10} f_i = 100$
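The columns for parts (c) and (d) can be reproduced with a short sketch like the one below. It divides each frequency by the total for the relative frequency and uses a running sum for the cumulative columns; the variable names are illustrative.

```python
from itertools import accumulate

freqs = [18, 15, 12, 11, 9, 7, 8, 9, 6, 5]  # class frequencies with K = 15
n = sum(freqs)

rel = [f / n for f in freqs]            # relative frequency
pct = [100 * f / n for f in freqs]      # percentage
cum = list(accumulate(freqs))           # cumulative frequency
cum_rel = [c / n for c in cum]          # cumulative relative frequency
cum_pct = [100 * c / n for c in cum]    # cumulative percentage

for lo, row in zip(range(0, 1000, 100), zip(freqs, rel, pct, cum, cum_rel, cum_pct)):
    print(f"{lo} - {lo + 99}", *row)
```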
e)
i. For a minimum of 500, we consider the classes from 500-599 to 900-999.
Total frequency of those classes = 7+8+9+6+5 = 35
Percentage = (35 / 100) × 100% = 35%
ii. For a maximum of 499, we consider the classes from 0-99 to 400-499.
Total frequency of those classes = 18+15+12+11+9 = 65
Percentage = (65 / 100) × 100% = 65%
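The two percentages in part (e) amount to summing the frequencies of the classes on either side of 500. A minimal sketch, keyed by each class's lower limit (a layout chosen here for illustration):

```python
# Lower class limit -> frequency, with K = 15.
freq_by_lower = {0: 18, 100: 15, 200: 12, 300: 11, 400: 9,
                 500: 7, 600: 8, 700: 9, 800: 6, 900: 5}
n = sum(freq_by_lower.values())

at_least_500 = sum(f for lo, f in freq_by_lower.items() if lo >= 500)
at_most_499 = sum(f for lo, f in freq_by_lower.items() if lo < 500)

pct_min_500 = 100 * at_least_500 / n
pct_max_499 = 100 * at_most_499 / n
print(pct_min_500, pct_max_499)  # 35.0 65.0
```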
f)
3. The following data set represents the record high temperatures in degree Fahrenheit (℉)
for each of the 50 US states: [5.5]
106 98 96 108 90 93 89 103 104 119
111 85 97 102 85 109 93 120 98 102
90 96 114 108 91 100 96 105 89 96
107 99 113 125 88 122 110 85 99 90
93 102 123 110 111 101 92 96 89 116
a) Construct a suitable frequency distribution table using interval 85 – 95, 95 – 105 and
so on. [2]
b) Construct a stem and leaf plot and mention the interesting features like maximum and
minimum value, range, modal value and median value. [3.5]
Answer to the question no. 3
a)
Class Limit | Tally | Frequency | Relative Frequency | Percentage
85-95 | IIII IIII IIII | 15 | 0.30 | 30
95-105 | IIII IIII IIII II | 17 | 0.34 | 34
105-115 | IIII IIII II | 12 | 0.24 | 24
115-125 | IIII | 5 | 0.10 | 10
125-135 | I | 1 | 0.02 | 2
Total | | 50 | 1 | 100
In the tally column, each IIII group stands for a crossed bundle of five.
* Lower limit included and upper limit excluded
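The counts in this table can be checked with a sketch that applies the same convention (lower limit included, upper limit excluded) to the 50 temperatures from the question. The variable names are illustrative.

```python
temps = [106, 98, 96, 108, 90, 93, 89, 103, 104, 119,
         111, 85, 97, 102, 85, 109, 93, 120, 98, 102,
         90, 96, 114, 108, 91, 100, 96, 105, 89, 96,
         107, 99, 113, 125, 88, 122, 110, 85, 99, 90,
         93, 102, 123, 110, 111, 101, 92, 96, 89, 116]

edges = list(range(85, 136, 10))  # 85, 95, 105, 115, 125, 135

# Half-open classes [lo, hi): a value equal to hi goes to the next class.
counts = [sum(lo <= t < hi for t in temps) for lo, hi in zip(edges, edges[1:])]
rel_freq = [c / len(temps) for c in counts]

print(counts)  # [15, 17, 12, 5, 1]
```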
b) Stem and Leaf Plot:
Key: 10|4 → means 104
Stem Leaf
8 5558999
9 000123336666678899
10 0122234567889
11 00113469
12 0235
Maximum value = 125
Minimum value = 85
Range = 125-85 = 40
Modal value = 96 (it occurs 5 times, more often than any other value)
Median value = average of the 25th and 26th values = (99 + 100) / 2 = 99.5
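The stem and leaf plot and the summary features can be rebuilt from the raw data with a sketch like this one, which splits each value into a tens stem and a units leaf. It uses only the Python standard library; the variable names are illustrative.

```python
from collections import Counter, defaultdict
from statistics import median

temps = [106, 98, 96, 108, 90, 93, 89, 103, 104, 119,
         111, 85, 97, 102, 85, 109, 93, 120, 98, 102,
         90, 96, 114, 108, 91, 100, 96, 105, 89, 96,
         107, 99, 113, 125, 88, 122, 110, 85, 99, 90,
         93, 102, 123, 110, 111, 101, 92, 96, 89, 116]

# Stem = value // 10, leaf = value % 10, leaves listed in ascending order.
stems = defaultdict(list)
for t in sorted(temps):
    stems[t // 10].append(t % 10)
for stem in sorted(stems):
    print(f"{stem:>2} | {''.join(map(str, stems[stem]))}")

data_range = max(temps) - min(temps)
modal_value, modal_count = Counter(temps).most_common(1)[0]
print(data_range, modal_value, modal_count, median(temps))  # 40 96 5 99.5
```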
4. The number of Tesla, Inc. employees who will be selected for various salary bands in 2023 is
demonstrated in the following table: [5.0]
Wages ($) No. of Employees
40k – 50k Y+8
50k – 60k 24+√49
60k – 70k 19
70k – 80k √121 + Y
80k – 90k √225
[Here, Y is the last digit of your student ID (e.g., 20100012, 21123415). Suppose your ID
is 20100012; then the second-to-last row (70k – 80k) will be √121 + 2 = 13.] Estimate the following:
a) Find the Range of wages ($) and complete the frequency distribution table. [0.5+0.5]
b) Find:
i. Mean [1.0]
ii. Median [1.5]
iii. Mode [1.5]
Answer to the question no. 4
a) Range = 90k – 40k = 50k (the wage classes below are written in thousands of dollars)
Let, Y = 2
Frequency of class (70– 80) = 11 + Y= 13
Complete frequency distribution table:
Wages ($k) | Frequency (f_i) | Mid value (x_i) | Cumulative frequency | f_i x_i
40 - 50 | 10 | 45 | 10 | 450
50 - 60 (modal class) | 31 | 55 | 41 | 1705
60 - 70 (median class) | 19 | 65 | 60 | 1235
70 - 80 | 13 | 75 | 73 | 975
80 - 90 | 15 | 85 | 88 | 1275
Total | $\sum_{i=1}^{5} f_i = 88$ | | | $\sum_{i=1}^{5} f_i x_i = 5640$
b)
i) Mean:
$\bar{x} = \frac{\sum_{i=1}^{5} f_i x_i}{\sum_{i=1}^{5} f_i} = \frac{5640}{88} = 64.09$
The mean is 64.09
ii) Median:
$\frac{n}{2} = \frac{88}{2} = 44$
$L_m = 60$
$F = 41$
$f_m = 19$
$c = 10$
$\text{Median} = L_m + \frac{\frac{n}{2} - F}{f_m} \times c = 60 + \frac{44 - 41}{19} \times 10 = 61.5789$
The median is 61.58
iii) Mode:
$L_0 = 50$
$\Delta_1 = 31 - 10 = 21$
$\Delta_2 = 31 - 19 = 12$
$c = 10$
$\text{Mode} = L_0 + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times c = 50 + \frac{21}{21 + 12} \times 10 = 56.36$
The mode is 56.36
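Results for each possible value of Y. The median does not depend on Y, because n/2 = 42 + Y and F = 39 + Y, so n/2 − F = 3 in every case:
Y | Mean | Median | Mode
0 | 64.28571 | 61.57895 | 56.57143
1 | 64.18605 | 61.57895 | 56.47059
2 | 64.09091 | 61.57895 | 56.36364
3 | 64 | 61.57895 | 56.25
4 | 63.91304 | 61.57895 | 56.12903
5 | 63.82979 | 61.57895 | 56
6 | 63.75 | 61.57895 | 55.86207
7 | 63.67347 | 61.57895 | 55.71429
8 | 63.6 | 61.57895 | 55.55556
9 | 63.52941 | 61.57895 | 55.38462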
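The grouped-data formulas for the mean, median and mode can be evaluated for any Y with a sketch like the one below. The function name `grouped_stats` is hypothetical. Under these formulas the cumulative frequency before the median class grows by Y while n/2 also grows by Y, so the median comes out the same for every Y.

```python
from math import isqrt

def grouped_stats(Y):
    """Mean, median and mode of the wage table for a given last ID digit Y."""
    lowers = [40, 50, 60, 70, 80]   # lower class limits, in $k
    width = 10
    freqs = [Y + 8, 24 + isqrt(49), 19, isqrt(121) + Y, isqrt(225)]
    mids = [lo + width / 2 for lo in lowers]
    n = sum(freqs)

    mean = sum(f * x for f, x in zip(freqs, mids)) / n

    # Median class: first class whose cumulative frequency reaches n/2.
    cum = 0
    for lo, f in zip(lowers, freqs):
        if cum + f >= n / 2:
            median = lo + (n / 2 - cum) / f * width
            break
        cum += f

    # Modal class: the class with the largest frequency.
    m = freqs.index(max(freqs))
    d1 = freqs[m] - (freqs[m - 1] if m > 0 else 0)
    d2 = freqs[m] - (freqs[m + 1] if m < len(freqs) - 1 else 0)
    mode = lowers[m] + d1 / (d1 + d2) * width
    return mean, median, mode

mean, median, mode = grouped_stats(2)
print(round(mean, 2), round(median, 2), round(mode, 2))  # 64.09 61.58 56.36
```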
5. The stock prices of AAA Cable Company for 24 trading days are given in the table
below: [9.5]
92 87 72 65 86 77 81 69
88 77 62 65 57 47 31 69
52 54 68 63 42 45 49 58
For the given information
a. Construct a stem and leaf plot [1]
b. Determine the quartiles Q1, Q2 and Q3 [1.5]
c. Determine the deciles D5 and D8 [1]
d. Determine the percentiles P30, P80 and P67 [1.5]
e. Determine the IQR [0.5]
f. Draw a box and whiskers plot, and find the outliers, if any [2.5]
g. Calculate the coefficient of skewness and comment on the shape of the distribution [1.5]
Answer to the question no. 5
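Unarranged data | sl # | Arranged data | Quartiles | Deciles | Percentiles
92 | 1 | 31 | | |
87 | 2 | 42 | | |
72 | 3 | 45 | | |
65 | 4 | 47 | | |
86 | 5 | 49 | | |
77 | 6 | 52 | Q1 = AM of 6th and 7th value = 53 | |
81 | 7 | 54 | | |
69 | 8 | 57 | | | P30 => 7.2th = 8th = 57
88 | 9 | 58 | | |
77 | 10 | 62 | | |
62 | 11 | 63 | | |
65 | 12 | 65 | Q2 = AM of 12th and 13th value = 65 | D5 = AM of 12th and 13th value = 65 |
57 | 13 | 65 | | |
47 | 14 | 68 | | |
31 | 15 | 69 | | |
69 | 16 | 69 | | |
52 | 17 | 72 | | | P67 => 16.08th = 17th = 72
54 | 18 | 77 | Q3 = AM of 18th and 19th value = 77 | |
68 | 19 | 77 | | |
63 | 20 | 81 | | D8 => 19.2th = 20th = 81 | P80 => 19.2th = 20th = 81
42 | 21 | 86 | | |
45 | 22 | 87 | | |
49 | 23 | 88 | | |
58 | 24 | 92 | | |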
a) Stem and leaf plot:
Key: 3|9 means 39
Stem | Leaf
3 | 1
4 | 2,5,7,9
5 | 2,4,7,8
6 | 2,3,5,5,8,9,9
7 | 2,7,7
8 | 1,6,7,8
9 | 2
b) Determining Q1, Q2 and Q3 Quartiles:
Here n = 24, and for $Q_1$: $\frac{1 \times 24}{4} = 6$ (an integer)
So $Q_1 = \frac{1}{2}[\text{value of 6th observation} + \text{value of 7th observation}] = \frac{52 + 54}{2} = 53$
Here n = 24, and for $Q_2$: $\frac{2 \times 24}{4} = 12$ (an integer)
So $Q_2 = \frac{1}{2}[\text{value of 12th observation} + \text{value of 13th observation}] = \frac{65 + 65}{2} = 65$
Here n = 24, and for $Q_3$: $\frac{3 \times 24}{4} = 18$ (an integer)
So $Q_3 = \frac{1}{2}[\text{value of 18th observation} + \text{value of 19th observation}] = \frac{77 + 77}{2} = 77$
c) Determining D5 and D8 Deciles:
Here n = 24, and for $D_5$: $\frac{5 \times 24}{10} = 12$ (an integer)
So $D_5 = \frac{1}{2}[\text{value of 12th observation} + \text{value of 13th observation}] = \frac{65 + 65}{2} = 65$
Here n = 24, and for $D_8$: $\frac{8 \times 24}{10} = 19.2$ (not an integer)
So $D_8 = [\text{value of 20th observation}] = 81$
d) Determining P30, P80 and P67 Percentiles:
Here n = 24, and for $P_{30}$: $\frac{30 \times 24}{100} = 7.2$ (not an integer)
So $P_{30} = [\text{value of 8th observation}] = 57$
Here n = 24, and for $P_{80}$: $\frac{80 \times 24}{100} = 19.2$ (not an integer)
So $P_{80} = [\text{value of 20th observation}] = 81$
Here n = 24, and for $P_{67}$: $\frac{67 \times 24}{100} = 16.08$ (not an integer)
So $P_{67} = [\text{value of 17th observation}] = 72$
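The position rule used in parts (b) to (d) can be written as one small function: compute k × n / parts, average that observation and the next one if the position is an integer, and otherwise round up to the next observation. The function name `quantile` is hypothetical, and the sketch assumes the position falls strictly inside the data. Library routines such as `statistics.quantiles` or `numpy.percentile` use other interpolation rules by default and may return slightly different values.

```python
prices = [92, 87, 72, 65, 86, 77, 81, 69,
          88, 77, 62, 65, 57, 47, 31, 69,
          52, 54, 68, 63, 42, 45, 49, 58]
data = sorted(prices)

def quantile(sorted_data, k, parts):
    """k-th of `parts` quantiles, using the position rule of this solution."""
    # Integer arithmetic avoids floating-point issues in the integer test.
    q, r = divmod(k * len(sorted_data), parts)
    if r == 0:
        # Position is an integer: average the q-th and (q+1)-th observations.
        return (sorted_data[q - 1] + sorted_data[q]) / 2
    # Position is not an integer: take the (q+1)-th observation (index q).
    return sorted_data[q]

q1, q2, q3 = (quantile(data, k, 4) for k in (1, 2, 3))
d5, d8 = quantile(data, 5, 10), quantile(data, 8, 10)
p30, p67, p80 = (quantile(data, k, 100) for k in (30, 67, 80))
print(q1, q2, q3, d5, d8, p30, p67, p80)
```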
e) 𝑰𝑸𝑹 = 𝑸𝟑 − 𝑸𝟏
= 77 − 53
= 24
f) Box and whiskers plot:
Outliers are individual data points that fall outside the fences, which lie 1.5 times the
interquartile range (IQR) below Q1 and above Q3. The whiskers extend to the minimum and
maximum values that lie inside the fences.
So,
[Lower fence, Upper fence] = [Q1 - 1.5×IQR, Q3 + 1.5×IQR]
= [53 - (1.5 × 24), 77 + (1.5 × 24)]
= [17, 113]
Since no data points lie outside this range, there are no outliers.
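The fence calculation and the outlier check can be verified with a few lines. This sketch takes Q1 = 53 and Q3 = 77 from part (b) as given rather than recomputing them.

```python
prices = [92, 87, 72, 65, 86, 77, 81, 69,
          88, 77, 62, 65, 57, 47, 31, 69,
          52, 54, 68, 63, 42, 45, 49, 58]

q1, q3 = 53, 77              # quartiles from part (b)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Any price outside [lower_fence, upper_fence] would be flagged as an outlier.
outliers = [p for p in prices if p < lower_fence or p > upper_fence]
print(lower_fence, upper_fence, outliers)  # 17.0 113.0 []
```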
g) Coefficient of skewness:
Bowley's coefficient of skewness
$= \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{Q_3 - Q_1} = \frac{(77 - 65) - (65 - 53)}{77 - 53} = 0.0$
The distribution is approximately symmetric, because the coefficient of skewness is equal to 0.
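Bowley's coefficient depends only on the three quartiles, so it fits in a one-line function. The name `bowley_skewness` is illustrative.

```python
def bowley_skewness(q1, q2, q3):
    """Quartile-based skewness: positive if Q3 is farther from Q2 than Q1 is."""
    return ((q3 - q2) - (q2 - q1)) / (q3 - q1)

skew = bowley_skewness(53, 65, 77)
print(skew)  # 0.0
```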
6. A study on a range of automotive lubricants reported the following data on oxidation-
induction time (min) for various commercial oils: [12.5]
Sample 1:
87 103 130 160 180 195 132 145 211 105
145 153 152 138 87 99 93 119 129
Sample 2:
99 102 110 33 56 112 130 111 124 155
201 209 103 66 84 75 107 202 59
a) What are the sample sizes of samples 1 and 2 individually? [0.5]
b) Compute the sample mean, variance, and standard deviation for sample 1. [1+2.5+0.5]
c) Compute the sample mean, variance, and standard deviation for sample 2. [1+2.5+0.5]
d) Compute the Coefficient of variation for sample 1. [0.5]
e) Compute the Coefficient of variation for sample 2. [0.5]
f) Which measure should one consider to compare the performance/consistency among the
sample data, and why? [2]
g) For which sample of commercial oils is the relative variability of oxidation-induction time
higher? [1]
Answer to the question no. 6
a) Sample size for sample 1=19
Sample size for sample 2=19
b) For sample 1:
Sample mean $= \frac{87 + 103 + 130 + \dots + 119 + 129}{19} = \frac{2563}{19} = 134.895$
Variance $= \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{22765.7895}{19 - 1} = 1264.766$
Standard deviation $= \sqrt{\text{Variance}} = 35.564$
c) For sample 2:
Sample mean $= \frac{99 + 102 + 110 + \dots + 202 + 59}{19} = \frac{2138}{19} = 112.526$
Variance $= \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{44576.7368}{19 - 1} = 2476.485$
Standard deviation $= \sqrt{\text{Variance}} = 49.764$
d) Coefficient of Variation, $CV_1 = \frac{SD}{\bar{x}} \times 100 = \frac{35.564}{134.895} \times 100 = 26.364\%$
e) Coefficient of Variation, $CV_2 = \frac{SD}{\bar{x}} \times 100 = \frac{49.764}{112.526} \times 100 = 44.225\%$
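The mean, sample variance (divisor n − 1), standard deviation and coefficient of variation for both samples can be checked with Python's standard `statistics` module. The helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev, variance

sample1 = [87, 103, 130, 160, 180, 195, 132, 145, 211, 105,
           145, 153, 152, 138, 87, 99, 93, 119, 129]
sample2 = [99, 102, 110, 33, 56, 112, 130, 111, 124, 155,
           201, 209, 103, 66, 84, 75, 107, 202, 59]

def summarize(sample):
    """Return (mean, sample variance, sample SD, CV in percent)."""
    m = mean(sample)
    var = variance(sample)   # divides by n - 1
    sd = stdev(sample)       # square root of the sample variance
    cv = sd / m * 100
    return m, var, sd, cv

for name, sample in (("Sample 1", sample1), ("Sample 2", sample2)):
    m, var, sd, cv = summarize(sample)
    print(f"{name}: n={len(sample)} mean={m:.3f} var={var:.3f} sd={sd:.3f} CV={cv:.3f}%")
```

A larger CV means more spread relative to the mean, which is the comparison made between the two samples.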
f) Depending on the situation, one should consider the coefficient of variation (CV) or the
standard deviation to compare the performance/consistency of the two samples.
The coefficient of variation is the ratio of the standard deviation to the
mean, and it is a useful statistic for comparing the degree of variation from one data
series to another, even if the means are drastically different from one another.
g) As CV1 < CV2, Sample 2 of commercial oils has relatively higher variation in
oxidation-induction time compared with Sample 1.