Assignment 2

Uploaded by

eshaan arora

Instructions

This take-home assignment contains questions from Chapter 10 to 11 in the course textbook.
This is a take home assignment, which means that you have the liberty to complete this assignment at your
You are expected to complete this assignment on your own. You may reach out to classmates, the course ins
The instructor will grade you on the answers reached, as well as the methods used to arrive at the answers.
Excel workings, in particular, are difficult to parse. You will need to provide information on both the calculatio
In short, the presentation of your answers is important and will have bearing on your grades in this assignme
There are 8 questions and they all carry equal weightage.
Please submit ONE SINGLE .xls file containing responses to all problems. Each problem should be
1. Consumer Reports uses a survey of readers to obtain customer satisfaction ratings for the nation’s largest retailers in India (Consumer Reports, March 2012).
Each survey respondent is asked to rate a specified retailer in terms of six factors: quality of products, selection, value, checkout efficiency, service, and store layout.
An overall satisfaction score summarizes the rating for each respondent, with 100 meaning the respondent is completely satisfied in terms of all six factors.
Sample data representative of independent samples of "Mart-India" and "Bharat Bazaar" customers are shown below.

a. Formulate the null and alternative hypotheses to test whether there is a difference between the population mean customer satisfaction scores for the two retailers.
b. Assume that experience with the Consumer Reports satisfaction rating scale indicates that a population standard deviation of 12 is a reasonable assumption for both retailers.
Conduct the hypothesis test and report the p-value. Round your answer to four decimal places.
c. Provide a 95% confidence interval for the difference between the population mean customer satisfaction scores for the two retailers.
Which retailer, if either, appears to have the greater customer satisfaction?

Solutions
ans. a The null hypothesis (Ho) is that the population means μ1 and μ2 are equal to each other (Ho: μ1 − μ2 = 0),
whereas the alternative hypothesis (Ha) is that μ1 and μ2 are not equal to each other (Ha: μ1 − μ2 ≠ 0).

ans. b x̄1 = 79   σ1 = 12   n1 = 25
       x̄2 = 71   σ2 = 12   n2 = 30
       μ1 = μ2 under the null hypothesis

Standard error = sqrt(σ1^2/n1 + σ2^2/n2) = sqrt(144/25 + 144/30) = 3.2496
z value = (x̄1 − x̄2)/standard error = 8/3.2496 = 2.4618
z table value = 0.9931
P(Z > z) = 1 − 0.9931 = 0.0069
p-value = 2*P(Z > z) = 0.0138

ans. c x̄1 = 79   σ1 = 12   n1 = 25
       x̄2 = 71   σ2 = 12   n2 = 30

x̄1 − x̄2 = 8
Standard error = 3.2496
z α/2 = 1.96
Margin of error = 1.96 × 3.2496 = 6.3692
Upper limit (with +) = 8 + 6.3692 = 14.3692
Lower limit (with −) = 8 − 6.3692 = 1.6308
Interval = [1.63, 14.37]
The interval lies entirely above zero, so Mart-India appears to have the greater customer satisfaction.
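The problem 1 arithmetic can be cross-checked outside Excel. The following is a minimal sketch using only the Python standard library; the function name `two_sample_z` is hypothetical, and it assumes the summary statistics quoted in the solution (sample means 79 and 71, σ = 12 for both retailers, n1 = 25, n2 = 30).

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(x1, x2, sigma1, sigma2, n1, n2, conf=0.95):
    """Two-sided z test and confidence interval for mu1 - mu2 (known sigmas)."""
    diff = x1 - x2
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)       # standard error of the difference
    z = diff / se                                        # test statistic under Ho: mu1 = mu2
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))         # two-tailed p-value
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)    # about 1.96 for 95%
    margin = z_crit * se
    return z, p_value, (diff - margin, diff + margin)


# Summary statistics taken from the worked solution above.
z, p, ci = two_sample_z(79, 71, 12, 12, 25, 30)
print(f"z = {z:.4f}, p-value = {p:.4f}, 95% CI = [{ci[0]:.4f}, {ci[1]:.4f}]")
```

Because the sketch uses the unrounded critical value rather than 1.96, the interval endpoints can differ from the worksheet in the fourth decimal place.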
2. Scores in the first and fourth (final) rounds for a sample of 20 golfers who competed in the "Indian Golf Association" tournaments are shown in the following table.
Suppose you would like to determine if the mean score for the first round of an "Indian Golf Association" event is significantly different than the mean score for the fourth and final round.
Does the pressure of playing in the final round cause scores to go up? Or does the increased player concentration cause scores to come down?

Player            First Round   Final Round      Player            First Round   Final Round
Arjun Patel       70            72               Arjun Prakash     72            72
Neha Sharma       71            72               Chetan Mehta      72            70
Rajesh Singh      70            75               Jayesh Dubey      70            73
Priya Patel       72            71               Mihir Desai       70            77
Aakash Mehta      70            69               Chirag Patel      68            70
Kavita Reddy      67            67               Bhaskar Verma     68            65
Ankit Gupta       71            67               Eknath Singh      71            70
Deepika Desai     68            75               Kamlesh Bhatia    70            68
Rahul Kumar       67            73               Nikhil Wagh       69            68
Sneha Choudhary   70            69               Tejas Agarwal     67            71

a. Use α = .10 to test for a statistically significant difference between the population means for first- and fourth-round scores. What is the p-value? What is your conclusion?
b. What is the point estimate of the difference between the two population means? For
which round is the population mean score lower?
c. What is the margin of error for a 90% confidence interval estimate for the difference
between the population means? Could this confidence interval have been used to test
the hypothesis in part (a)? Explain.

Solutions

Ans. a Differences are taken as d = final round score − first round score for each player.

Sample size = 20
Mean of differences (d dash) = sum of differences/sample size = 21/20 = 1.05
Standard deviation of differences = 3.3162
t test value = d dash/(standard deviation/sqrt(n)) = 1.05/(3.3162/sqrt(20)) = 1.416
Degrees of freedom = n − 1 = 19
p-value = 0.173 (two-tailed, using the TDIST function)

Since the p-value (0.173) is greater than α = 0.10, the null hypothesis of no difference cannot be rejected.

Ans. b The point estimate of the difference between the two population means is d dash = 1.05.
This means that the final round scores are higher than the first round scores, so the first round has the lower estimated mean score.

Ans. c Critical value for a 90% confidence interval

α = 0.1
α/2 = 0.05
DOF = 19
t table value = 1.729

Margin of error = 1.729 × 3.3162/sqrt(20) = 1.2821

Upper limit = 1.05 + 1.2821 = 2.3321
Lower limit = 1.05 − 1.2821 = −0.2321

Interval = [−0.2321, 2.3321]

Conclusion: Yes, the interval could have been used for the test in part (a). It contains zero, so the null hypothesis cannot be rejected.
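The matched-sample calculation for problem 2 can be reproduced from the score table. This is a minimal sketch in standard-library Python; the critical value 1.729 is the t table value quoted in the solution (α/2 = 0.05, 19 degrees of freedom), hard-coded here because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# First- and final-round scores in table order (left block, then right block).
first = [70, 71, 70, 72, 70, 67, 71, 68, 67, 70,
         72, 72, 70, 70, 68, 68, 71, 70, 69, 67]
final = [72, 72, 75, 71, 69, 67, 67, 75, 73, 69,
         72, 70, 73, 77, 70, 65, 70, 68, 68, 71]

# Paired differences: final round minus first round for each player.
diffs = [fi - fr for fr, fi in zip(first, final)]
n = len(diffs)

d_bar = mean(diffs)            # point estimate of the mean difference
s_d = stdev(diffs)             # sample standard deviation of the differences
se = s_d / sqrt(n)             # standard error of the mean difference
t_stat = d_bar / se            # matched-sample t statistic, n - 1 degrees of freedom

T_CRIT = 1.729                 # t table value for alpha/2 = 0.05, df = 19 (from the solution)
margin = T_CRIT * se
ci = (d_bar - margin, d_bar + margin)

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.4f}, t = {t_stat:.3f}")
print(f"margin = {margin:.4f}, 90% CI = [{ci[0]:.4f}, {ci[1]:.4f}]")
print("reject Ho" if abs(t_stat) > T_CRIT else "cannot reject Ho")
```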
3. In "Born together—Reared apart: the Landmark Maharashtra twin study" (2012),
Nancy Segal discusses the efforts of research psychologists at the University of Mumbai to understand similarities and differences between twins by studying sets of twins who were raised separately.
Below are critical reading SAT scores for several pairs of identical twins (twins who share all of their genes), one of whom was raised in a family with no other children (no siblings) and one of whom was raised in a family with other children (with siblings).

Name (No Siblings)   SAT Score   Name (With Siblings)   SAT Score
Raj                  441         Dinesh                 428
Mathew               641         Ramesh                 589
Shalini              540         Kavya                  662
Tarun                376         Kunal                  449
Meera                384         Esha                   495
Dev                  468         Mohan                  498
Vasundhara           539         Jyotsna                437
Deepika              595         Jasmin                 523
Devendra             494         Karthik                418
Laxmi                681         Bhavna                 618
Bharat               580         Lokesh                 497
Jyoti                577         Janhvi                 654
Ananya               486         Deepika                447
Ganesh               633         Lavanya                629
Rahul                543         Bhaskar                440
Ganesh               580         Karishma               639
Nisha                499         Radha                  529
Sameer               595         Jatin                  569
Siddharth            494         Dev                    470
Mira                 632         Ananya                 630

a. What is the mean difference between the critical reading SAT scores for the twins raised with no siblings and the twins raised with siblings?
b. Provide a 90% confidence interval estimate of the mean difference between the critical reading SAT scores for the twins raised with no siblings and the twins raised with siblings.
c. Conduct a hypothesis test of equality of the critical reading SAT scores for the twins raised with no siblings and the twins raised with siblings at α = .01. What is your conclusion?

Solutions

Ans. a No. of pairs = 20
(No Siblings) Twins' Mean = 538.9 (Sum of SAT Scores/No. of Pairs)
(With Siblings) Twins' Mean = 531.05 (Sum of SAT Scores/No. of Pairs)

Mean Difference = No Sibling Mean − With Sibling Mean = 7.85

(This shows us that on average twins with no siblings scored slightly higher than twins with siblings)

Ans. b Confidence interval @ 90%
α = 0.1

Interval estimation

stdv 1 = 83.6175
stdv 2 = 85.0204
Standard Error = sqrt(S1^2/n1 + S2^2/n2) = 26.66491634

Degrees of freedom = Numerator/Denominator = 37.9895 (rounding to 38)
Numerator = (S1^2/n1 + S2^2/n2)^2 = 505546.2595
Denominator = Variance square 1/(n1 − 1) + Variance square 2/(n2 − 1) = 13307.5314
Variance square 1 = ((S1)^2/n1)^2 = 122216.112
Variance square 2 = ((S2)^2/n2)^2 = 130626.9844

t α/2,df value = 1.686 (from t table) (38, 0.05)

Confidence interval calculation
with + = 7.85 + 1.686 × 26.6649 = 52.807
with − = 7.85 − 1.686 × 26.6649 = −37.107

Interval = [−37.1070, 52.8070]

Ans. c Hypothesis test:
Ho: μ1 = μ2 There is no difference in the twins' mean SAT critical reading scores
Ha: μ1 ≠ μ2 There is a difference in the twins' mean SAT critical reading scores

test statistic = 7.85/26.6649 = 0.2943943233
critical value = 2.712 (α = 0.01, dof = 38, α/2 = 0.005)

(0.294 < 2.712) Hence we fail to reject the null hypothesis.
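Each row of the problem 3 table is a pair of twins, so a matched-sample analysis of the pairwise differences is a natural cross-check on the worksheet, which treats the two columns as independent samples. The following is a sketch only, not the worksheet's method; the critical values 1.729 and 2.861 are standard t table entries for 19 degrees of freedom (α/2 = 0.05 and α/2 = 0.005) and are hard-coded as an assumption.

```python
from math import sqrt
from statistics import mean, stdev

# SAT scores in table order: twin raised with no siblings, twin raised with siblings.
no_sib = [441, 641, 540, 376, 384, 468, 539, 595, 494, 681,
          580, 577, 486, 633, 543, 580, 499, 595, 494, 632]
with_sib = [428, 589, 662, 449, 495, 498, 437, 523, 418, 618,
            497, 654, 447, 629, 440, 639, 529, 569, 470, 630]

# Pairwise differences within each twin pair (no siblings minus with siblings).
diffs = [a - b for a, b in zip(no_sib, with_sib)]
n = len(diffs)

d_bar = mean(diffs)                 # same 7.85 as the difference of the two column means
se = stdev(diffs) / sqrt(n)         # standard error based on the paired differences
t_stat = d_bar / se                 # matched-sample t statistic, df = n - 1 = 19

T_90 = 1.729                        # t table value, alpha/2 = 0.05, df = 19 (assumed)
T_99 = 2.861                        # t table value, alpha/2 = 0.005, df = 19 (assumed)
ci90 = (d_bar - T_90 * se, d_bar + T_90 * se)
reject_at_01 = abs(t_stat) > T_99

print(f"d_bar = {d_bar:.2f}, se = {se:.4f}, t = {t_stat:.4f}")
print(f"90% CI = [{ci90[0]:.2f}, {ci90[1]:.2f}], reject Ho at 0.01: {reject_at_01}")
```

Under this pairing the interval is narrower than the worksheet's, but it still contains zero and the test still fails to reject the null hypothesis at α = .01, so the conclusion is unchanged.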
4. A survey of 320 households with 5 pets each revealed the distribution shown in the table.
Is the result consistent with the hypothesis that dogs (male) and cats (female) are equally probable as pets?

Male pets              5         4         3         2         1         0         Total
Female pets            0         1         2         3         4         5
Number of Families     18        56        110       88        40        8         320
Expected Probability   0.03125   0.15625   0.3125    0.3125    0.15625   0.03125
Expected Frequency     10        50        100       100       50        10

Solution:
Null hypothesis - Male and female pets are equally probable, meaning the expected proportion is 50-50.
Using the binomial distribution formula with n = 5 and p = 0.5, the expected probability of k male pets is C(5, k) × (0.5)^5, and the expected frequency is 320 times that probability, which gives the values in the table.
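The worksheet stops after the expected frequencies. One way to finish the comparison is a chi-square goodness-of-fit statistic, sketched below in standard-library Python. The problem does not state a significance level, so the critical values 11.070 (α = 0.05) and 15.086 (α = 0.01) for 5 degrees of freedom are table values chosen here as an assumption.

```python
from math import comb

# Observed number of families with 5, 4, 3, 2, 1, 0 male pets (from the table).
male_counts = [5, 4, 3, 2, 1, 0]
observed = [18, 56, 110, 88, 40, 8]

n_pets, p_male = 5, 0.5
total = sum(observed)

# Binomial probabilities under Ho (male and female equally probable).
probs = [comb(n_pets, k) * p_male ** k * (1 - p_male) ** (n_pets - k)
         for k in male_counts]
expected = [total * p for p in probs]

# Chi-square goodness-of-fit statistic with (number of categories - 1) degrees of freedom.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

CHI_CRIT_05 = 11.070   # chi-square table value, alpha = 0.05, df = 5 (assumed level)
CHI_CRIT_01 = 15.086   # chi-square table value, alpha = 0.01, df = 5 (assumed level)

print(f"expected = {expected}")
print(f"chi-square = {chi_sq:.2f} with {df} degrees of freedom")
print(f"reject Ho at 0.05: {chi_sq > CHI_CRIT_05}, at 0.01: {chi_sq > CHI_CRIT_01}")
```

With these data the statistic falls between the two critical values, so whether the result is judged consistent with the 50-50 hypothesis depends on the significance level chosen.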


5. Vinay Kumar, head cricket coach for the 2012 national champion Mumbai Tigers, is the highest paid coach in Indian sports with an annual salary of ₹5.4 million (Indian Sports Today, March 29, 2012).
The sample below shows the head cricket coach's salary for a sample of 10 schools playing in the Indian Premier League (IPL). Salary data are in millions of rupees.

University   Coach's Salary
Karnataka    2.2
Mumbai       1.5
Kolkata      0.5
Chennai      0.2
Delhi        2.4
Hyderabad    1.5
Punjab       2.7
Gujarat      0.1
Rajasthan    2
Kerala       0.2

a. Use the sample mean for the 10 schools to estimate the population mean annual salary for head cricket coaches at colleges and universities playing NCAA Division I cricket.
b. Use the data to estimate the population standard deviation for the annual salary for head cricket coaches.
c. What is the 95% confidence interval for the population variance?
d. What is the 95% confidence interval for the population standard deviation?

Solution:
a) Sample mean to estimate the population mean annual salary for head cricket coaches at universities playing NCAA Division I cricket = Sum of all salaries/Number of Universities
In Excel (using the AVERAGE function) = 13.3/10 = 1.33 million rupees

b) Estimate of the population standard deviation (sample standard deviation, using the STDEV function) = 1.002275 million rupees

c) 95% confidence interval for the population variance

Sample variance (using the VAR.S function) = 1.004556
Degrees of freedom = n − 1 = 9
Lower critical value (using the CHISQ.INV function) = 2.700389
Upper critical value = 19.02277

Lower bound = (n − 1) × sample variance/upper critical value = 9 × 1.004556/19.02277 = 0.475273
Upper bound = (n − 1) × sample variance/lower critical value = 9 × 1.004556/2.700389 = 3.348036

The 95% confidence interval for the variance is (0.4753, 3.3480).

d) Lower bound = sqrt(0.475273) = 0.6894
Upper bound = sqrt(3.348036) = 1.829764

The 95% confidence interval for the standard deviation is (0.6894, 1.8298).
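The problem 5 figures can be reproduced from the salary column. This is a minimal standard-library sketch; the two chi-square critical values are copied from the worksheet (9 degrees of freedom) rather than computed, since the standard library has no chi-square distribution.

```python
from math import sqrt
from statistics import mean, variance

# Coach salaries in millions of rupees, in table order.
salaries = [2.2, 1.5, 0.5, 0.2, 2.4, 1.5, 2.7, 0.1, 2.0, 0.2]
n = len(salaries)

x_bar = mean(salaries)          # point estimate of the population mean
s2 = variance(salaries)         # sample variance (n - 1 denominator, like VAR.S)
s = sqrt(s2)                    # sample standard deviation (like STDEV)

CHI_LOWER = 2.700389            # chi-square value with 0.025 in the lower tail, df = 9 (worksheet)
CHI_UPPER = 19.02277            # chi-square value with 0.025 in the upper tail, df = 9 (worksheet)

# The larger critical value gives the lower bound, and vice versa.
var_ci = ((n - 1) * s2 / CHI_UPPER, (n - 1) * s2 / CHI_LOWER)
sd_ci = (sqrt(var_ci[0]), sqrt(var_ci[1]))

print(f"mean = {x_bar:.2f}, variance = {s2:.6f}, std dev = {s:.6f}")
print(f"95% CI for variance = ({var_ci[0]:.4f}, {var_ci[1]:.4f})")
print(f"95% CI for std dev  = ({sd_ci[0]:.4f}, {sd_ci[1]:.4f})")
```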
6. Our adventure involves conducting a statistical test at a significance level of 0.05.
Conduct a statistical test to determine whether there is a significant difference between the variances in the bag weights for two machines.
What is your conclusion? Which machine, if either, provides the greater opportunity for quality improvements?

Measurement   Machine A   Machine B
1             2.95        3.22
2             3.45        3.3
3             3.5         3.34
4             3.75        3.28
5             3.48        3.29
6             3.26        3.25
7             3.33        3.3
8             3.2         3.27
9             3.16        3.38
10            3.2         3.34
11            3.22        3.35
12            3.38        3.19
13            3.9         3.35
14            3.36        3.05
15            3.25        3.36
16            3.28        3.28
17            3.2         3.3
18            3.22        3.28
19            2.98        3.3
20            3.45        3.2
21            3.7         3.16
22            3.34        3.33
23            3.18
24            3.35
25            3.12

Solution:
Hypothesis for the F-Test = Null Hypothesis (Same Variance for both machines) and Alternative Hypothesis (Different Variance)
The significance level is given as 0.05.
Variance of Machine 1 (Machine A, using the VAR.S function) = 0.048889
Variance of Machine 2 (Machine B) = 0.0059012987013

F-statistic is calculated as the ratio of the two sample variances: F = Variance of Machine 1/Variance of Machine 2 = 0.048889/0.0059013 = 8.2845
The worksheet reports 7.221579666E-06 at this step, which is too small to be this ratio and reads as a p-value rather than the F-statistic.
Number of Measurements for Machine 1 = 25 (degrees of freedom = 24)
Number of Measurements for Machine 2 = 22 (degrees of freedom = 21)

Critical Value = 2.05400431223557

Since the F-statistic (8.2845) is greater than the critical value (2.0540), the null hypothesis of equal variances is rejected.
Machine A has the larger variance in bag weights and therefore provides the greater opportunity for quality improvements.
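The two sample variances and their ratio for problem 6 can be recomputed directly from the measurements. This is a minimal standard-library sketch; the critical value is the one quoted in the worksheet, used here as given rather than derived.

```python
from statistics import variance

# Bag weights in measurement order (Machine A has 25 readings, Machine B has 22).
machine_a = [2.95, 3.45, 3.5, 3.75, 3.48, 3.26, 3.33, 3.2, 3.16,
             3.2, 3.22, 3.38, 3.9, 3.36, 3.25, 3.28, 3.2, 3.22,
             2.98, 3.45, 3.7, 3.34, 3.18, 3.35, 3.12]
machine_b = [3.22, 3.3, 3.34, 3.28, 3.29, 3.25, 3.3, 3.27, 3.38,
             3.34, 3.35, 3.19, 3.35, 3.05, 3.36, 3.28, 3.3, 3.28,
             3.3, 3.2, 3.16, 3.33]

var_a = variance(machine_a)      # sample variance, like VAR.S
var_b = variance(machine_b)

# Put the larger sample variance in the numerator so that F >= 1.
f_stat = max(var_a, var_b) / min(var_a, var_b)
df_num = len(machine_a) - 1 if var_a >= var_b else len(machine_b) - 1
df_den = len(machine_b) - 1 if var_a >= var_b else len(machine_a) - 1

F_CRIT = 2.05400431223557        # critical value quoted in the worksheet

print(f"var A = {var_a:.6f}, var B = {var_b:.7f}")
print(f"F = {f_stat:.4f} with ({df_num}, {df_den}) degrees of freedom")
print("reject Ho" if f_stat > F_CRIT else "cannot reject Ho")
```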
7. The test scores of 352 students who completed a training program in software development have a standard deviation of 0.940.
On the other hand, the test scores of 73 students who withdrew from the same program have a standard deviation of 0.797.
Are there significant differences in the variances of test scores between students who completed the software development program and those who dropped out?
Let's test this at a significance level of 0.05.
Note: F.025 with 351 and 72 degrees of freedom is 1.466.

Group               Sample Size   Standard Deviation
Completed Program   352           0.94
Dropped Out         73            0.797

Solution:
Hypothesis for the F-Test = Null Hypothesis (Same Variance) and Alternative Hypothesis (Different Variance)
The significance level is given as 0.05.

For completed program:
Sample Size (n₁) = 352
Standard Deviation (s₁) = 0.94
Degrees of Freedom (df₁) = 351

For dropped out:
Sample Size (n₂) = 73
Standard Deviation (s₂) = 0.797
Degrees of Freedom (df₂) = 72

F₀.025 with 351 and 72 degrees of freedom = 1.466
F-Statistic = Variance of Completed Program/Variance of Dropped Out Group

Variance of completed program = 0.94^2 = 0.8836
Variance of dropped out group = 0.797^2 = 0.635209

F-Statistic = 0.8836/0.635209 = 1.39103822521406

Since the F-Statistic (1.391038225) is less than the critical value (1.466), the null hypothesis is not rejected.

In conclusion, there is no significant difference in the variances of test scores between students who completed the program and those who dropped out at the 0.05 significance level. Therefore, the variance in test scores is statistically similar for both groups.
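The problem 7 decision rule reduces to a variance ratio compared with the critical value given in the note. This is a minimal sketch in plain Python; the helper name `variance_ratio` is hypothetical.

```python
def variance_ratio(s_num, s_den):
    """F statistic from two sample standard deviations (numerator over denominator)."""
    return s_num ** 2 / s_den ** 2


s_completed, n_completed = 0.94, 352
s_dropped, n_dropped = 0.797, 73

# Larger sample variance in the numerator, so the upper-tail critical value applies.
f_stat = variance_ratio(s_completed, s_dropped)
df = (n_completed - 1, n_dropped - 1)

F_CRIT = 1.466   # F.025 with 351 and 72 degrees of freedom, as given in the problem note
reject = f_stat > F_CRIT

print(f"F = {f_stat:.6f} with {df} degrees of freedom, reject Ho: {reject}")
```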
8. A research hypothesis suggests that the variance in the heights of trees in tropical rainforests is significantly higher than the variance in the heights of trees in temperate forests.
In the study, 16 trees from each type of forest are selected and their heights are measured.
In the tropical rainforest, the standard deviation of tree heights is found to be 28 meters, while in the temperate forest, the standard deviation is 14 meters.
Let's denote the sample of tree heights in the tropical rainforest as population 1.

Population            Sample Size   Standard Deviation
Tropical Rainforest   16            28 meters
Temperate Forest      16            14 meters

a. At a .05 level of significance, do the sample data justify the conclusion that the variance in tree heights in tropical rainforests is significantly higher than in temperate forests?
What is the p-value?
b. What are the implications of your statistical conclusions?

Solution:
To test whether the variance of the heights of tropical rainforest trees is significantly higher than in temperate forests.
Hypothesis for the F-Test = Null Hypothesis (Same Variance) and Alternative Hypothesis (Variance of tropical rainforest heights is greater than the temperate forest)
Significance level = 0.05

Tropical Rainforest:
Sample size (n1) = 16
Standard deviation (s1) = 28 meters
Variance (Square of Standard Deviation) = 28^2 sq. m = 784 square meters

Temperate Forest:
Sample size (n2) = 16
Standard deviation (s2) = 14 meters
Variance (Square of SD) = 14^2 sq. m = 196 square meters

F-Statistic is the ratio of the two sample variances = 784/196 = 4

Degrees of freedom for tropical rainforest = df1 = n1 − 1 = 16 − 1 = 15
Degrees of freedom for temperate forest = df2 = n2 − 1 = 16 − 1 = 15

Critical Value for F-Test (using the F.INV.RT function) = 2.40344707149534

On comparison, the F-Statistic (4) is higher than the Critical Value (2.40344), meaning that we reject the null hypothesis.

Finding the P-Value (using the F.DIST.RT function) = 0.00544477352917739

At the given significance level of 0.05, we see that the variance in tree heights is significantly higher in tropical rainforests when compared to temperate forests. This is also supported by the p-value of 0.005445, which is less than 0.05.
This shows that the heights of trees in tropical rainforests are more diverse, which could be due to many reasons including the presence of diverse species, varied growth conditions, and also environmental factors.
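The right-tail p-value for problem 8 can be approximated without Excel. The sketch below is one possible approach, not the worksheet's method: it uses the identity P(F > f) = I_x(d2/2, d1/2) with x = d2/(d2 + d1·f), where I is the regularized incomplete beta function, and evaluates the integral with Simpson's rule. The helper name `f_right_tail` is hypothetical, and the sketch assumes d2 > 2 so that the integrand is finite at zero.

```python
from math import gamma


def f_right_tail(f, d1, d2, steps=2000):
    """Approximate P(F > f) for F(d1, d2) by Simpson's rule (steps must be even, d2 > 2)."""
    a, b = d2 / 2, d1 / 2
    x = d2 / (d2 + d1 * f)                  # upper limit of the incomplete beta integral
    h = x / steps

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3

    beta_fn = gamma(a) * gamma(b) / gamma(a + b)   # complete beta function B(a, b)
    return integral / beta_fn


s1, s2, n1, n2 = 28, 14, 16, 16
f_stat = s1 ** 2 / s2 ** 2                  # 784 / 196
p_value = f_right_tail(f_stat, n1 - 1, n2 - 1)

print(f"F = {f_stat:.1f}, right-tail p-value = {p_value:.6f}")
print("reject Ho at 0.05" if p_value < 0.05 else "cannot reject Ho at 0.05")
```

A statistics library with an F distribution would normally replace the hand-rolled integral; it is written out here only to keep the example dependency-free.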
