
Rachit. Assignment

This document contains the responses to questions from an exercise on IT skills and data analysis. Multiple regression analysis determines the relationship between a dependent variable and more than one independent variable. Scatter diagrams show relationships between variables and can indicate if the relationships are direct, inverse, linear, or curvilinear. Regression analysis is used to calculate the estimating equation that best fits the data points on a scatter plot.

Uploaded by

kracc0744

Sec assignment – IT SKILLS AND DATA ANALYSIS II

Submitted by – Rachit kumar

Exercise 12.1

Question 1 : What is regression analysis?

Answer: It is the process of determining the relationship between a dependent variable and an independent variable.

Question 7: Explain why and how we construct a scatter diagram.

Answer: A scatter diagram is constructed by plotting paired data points on a graph to determine
whether or not there is a relationship between a dependent and an independent variable. If there is a
relationship, the points follow a recognizable pattern, approximating a straight line when the relationship is linear.

Question 8: What is multiple-regression analysis?

Answer: It is the process that determines the relationship between a dependent variable and more than one
independent variable.

Question 9: For each of the following scatter diagrams, indicate whether a relationship exists and, if
so, whether it is direct or inverse and linear or curvilinear.

Answer :a) yes, direct and linear

b) yes, inverse and circular

c) yes, inverse and curvilinear

Exercise 12.2
x     y      xy        x mean  y mean  x^2     x mean^2  n*xm
2.7   16.66  44.982    13.2    52.015  7.29    174.24
4.8   16.92  81.216                    23.04
5.6   22.3   124.88                    31.36
18.4  71.8   1321.12                   338.56
19.6  80.88  1585.248                  384.16
21.5  81.4   1750.1                    462.25
18.7  77.46  1448.502                  349.69
14.3  48.7   696.41                    204.49

sum of columns: xy = 7052.458, x^2 = 1800.84

b= 3.832876241

a= 1.421033618

estimating equation is given by y=1.421034+3.832876*x

when x=6, y = 24.41829106

when x=13.4,y= 52.78157525

when x=20.5,y= 79.99499656
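
The slope and intercept above follow from the least-squares shortcut formulas b = (sum of xy - n*xmean*ymean) / (sum of x^2 - n*xmean^2) and a = ymean - b*xmean, which is what the table columns build up. A minimal Python sketch that reproduces the calculation from the eight (x, y) pairs in the table; the function name fit_line is an illustrative choice and not part of the assignment:

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Shortcut formulas for the slope and intercept.
    b = (sum_xy - n * x_mean * y_mean) / (sum_x2 - n * x_mean ** 2)
    a = y_mean - b * x_mean
    return a, b


# Data copied from the first table of Exercise 12.2.
xs = [2.7, 4.8, 5.6, 18.4, 19.6, 21.5, 18.7, 14.3]
ys = [16.66, 16.92, 22.3, 71.8, 80.88, 81.4, 77.46, 48.7]

a, b = fit_line(xs, ys)
print(f"a = {a:.6f}, b = {b:.6f}")

# Predicted y for the three x values asked for in the exercise.
for x in (6, 13.4, 20.5):
    print(f"x = {x}: y = {a + b * x:.4f}")
```

The same function can be reused for the other tables in this exercise by swapping in their x and y columns.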


[Chart: y plotted against x]

x     y      xy        x mean  y mean  x^2     x mean^2  n*xmean^2  n*xm
11.6  50.48  585.568   13.35   56.19   134.56  178.2225  1247.5575
10.9  47.82  521.238                   118.81
18.4  71.5   1315.6                    338.56
19.7  81.26  1600.822                  388.09
12.3  50.1   616.23                    151.29
6.8   39.4   267.92                    46.24
13.8  52.8   728.64                    190.44

sum of columns: xy = 5636.01, x^2 = 1367.99

b= 3.20

a= 13.50652523

estimating equation is y = 13.506525+3.197264*x

when x=6, y= 32.6901094

when x=13.4,y= 56.3498632


when x=20.5,y= 79.0504378

[Chart: y plotted against x, with linear trendline]

Question 12.14

x   y     xy     x mean  y mean  x^2  x mean^2  n*xmean^2  n*xmean*ymean
16  -4.4  -70.4  10.5    1.93    256  110.25    661.5      121.59
6   8     48                     36
10  2.1   21                     100
5   8.7   43.5                   25
12  0.1   1.2                    144
14  -2.9  -40.6                  196

sum of columns: xy = 2.7, x^2 = 757

b= -1.244921466

a= 15.00167539

estimating equation is y = 15.00168-1.24492*x

when x is 5, y is 8.777068063

when x is 6, y is 7.532146597

when x is 7, y is 6.287225131

[Chart: y plotted against x]

Question 12.15

x   y     xy      x mean  y mean  x^2   y^           y-y^          (y-y^)^2     x mean^2
56  45    2520    47.57   38.55   3136  44.78633545  0.21366455    0.04565254   2262.9049
48  38.5  1848                    2304  38.86810489  -0.368104892  0.135501212
42  34.5  1449                    1764  34.42943197  0.070568025   0.004979846
58  46.1  2673.8                  3364  46.26589309  -0.165893089  0.027520517
40  33.3  1332                    1600  32.94987434  0.350125665   0.122587981
39  32.1  1251.9                  1521  32.21009552  -0.110095516  0.012121023
50  40.4  2020                    2500  40.34766253  0.052337468   0.002739211

sum of columns: xy = 13094.7, x^2 = 16189, (y-y^)^2 = 0.35

n*xme = 12836.7645

b= 0.73977882

Se= 0.264575131

a= 3.358721549

estimating equation is 3.358722+0.739779*x
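
The standard error of estimate reported above is Se = sqrt(sum of (y-y^)^2 / (n-2)). A minimal Python sketch of that step, assuming the a and b values from this question; the function name standard_error is an illustrative choice:

```python
import math


def standard_error(xs, ys, a, b):
    """Standard error of estimate for the line y = a + b*x."""
    # Sum of squared residuals (y - y^)^2, as in the table's last columns.
    residual_sq = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Two degrees of freedom are used up by estimating a and b.
    return math.sqrt(residual_sq / (len(xs) - 2))


# Data and coefficients copied from Question 12.15.
xs = [56, 48, 42, 58, 40, 39, 50]
ys = [45, 38.5, 34.5, 46.1, 33.3, 32.1, 40.4]
a, b = 3.358721549, 0.73977882

se = standard_error(xs, ys, a, b)
print(f"Se = {se:.4f}")
```

The result is close to the Se given above; the small difference arises because the table rounds the sum of squared residuals to 0.35 before taking the square root.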

[Chart: y plotted against x, with linear trendline]

Question 12.16

housing start x appliance sales y xy x mean y mean b a x^2

2 5 10 3.72 7.55 1.715553 1.168745

2.5 5.5 13.75 1.715553 1.168745 6.2

3.2 6 19.2 1.715553 1.168745 10.2

3.6 7 25.2 1.715553 1.168745 12.9

3.3 7.2 23.76 1.715553 1.168745 10.8

4 7.7 30.8 1.715553 1.168745 1

4.2 8.4 35.28 1.715553 1.168745 17.6

4.6 9 41.4 1.715553 1.168745 21.1

4.8 9.7 46.56 1.715553 1.168745 23.0

5 10 50 1.715553 1.168745 2

sum of columns= 295.95 147.1

b= 1.715552524 x mean^2 n*xmean^2

a= 1.168144611 13.8384 138.38

eqn of relationship is 1.168145+1.715553*x

slope of regression line is 1.715553

standard error of estimate is equal to 0.374165739


Question 12.17

x y xy x mean y mean x^2 x mean^2 n*xmean^2 n*xmean*ymean

5 9 45 5.6 5 25 31.36 156.8 140

5.5 6 33 30.25

6 3 18 36

6.5 0 0 42.25

5 7 35 25

131 158.5

Unreturned lobs (L) is the dependent variable y.

b is equal to -5.294117647

a is equal to 34.64705882

the estimating equation is y = 34.64705882-5.29411765*x

when height is 5.9, the number of unreturned lobs found by putting x=5.9 into the estimating equation is about 3.4

Question 12.18

[Chart: passenger y]

price x passenger y xy x mean y mean x^2 x mean^2 n*xmean^2 n*xmean*ymean

25 800 20000 42.5 687.5 625 1806.25 14450 233750

30 780 23400 900

35 780 27300 1225

40 660 26400 1600

45 640 28800 2025

50 600 30000 2500

55 620 34100 3025

60 620 37200 3600

227200 15500

b is -6.238095238

a is 952.6190476

estimating equation is y= 952.62-6.24*x

when the ticket price is 50 cents, putting x= 50 in the estimating equation with the unrounded a and b gives about 640.71 passengers per 100 miles


Question 12.19

x y y^ xy x mean y mean x^2 x mean^2 n*xmean^2 n*xmean*ymean

5 58 56.5 290 15 28.5 25 225 1800 3420

10 41 42.5 410 100

10 45 42.5 450 100

15 27 28.5 405 225

15 26 28.5 390 225

20 12 14.5 240 400

20 16 14.5 320 400

25 3 0.5 75 625

2580 2100

b= -2.8

a= 70.5

equation is y = 70.5-2.8*x

test score if worker is interrupted 18 times = 20.1

[Chart: y and y^ plotted against x, with linear trendlines]

Question 12.20

noise x arousal y xy x mean y mean x^2 x mean^2 n*xmean^2 n*xmean*ymean

4 39 156 3.5 32.5 16 12.25 98 910

3 38 114 9

1 16 16 1

2 18 36 4

6 41 246 36

7 45 315 49

2 25 50 4

3 38 114 9

1047 128

b is 4.566666667

a is 16.51666667

estimating equation is y=16.52+4.57*x

degree of arousal when noise level is 5 is found by putting x=5 in the estimating equation, using the unrounded a and b:

y=16.51666667+4.566666667*5

y=39.35

[Chart: arousal y]

Question 12.21

s.no. test score x units sold y xy x mean y mean x^2 x mean^2 n*xmean^2 n*xmean*ym

1 2.6 95 247 3.4 137.1 6.76 11.56 115.6 4

2 3.7 140 518 13.69

3 2.4 85 204 5.76

4 4.5 180 810 20.25

5 2.6 100 260 6.76

6 5 195 975 25

7 2.8 115 322 7.84

8 3 136 408 9

9 4 175 700 16

10 3.4 150 510 11.56

4954 122.62

b is 41.68091168

a is -4.615099715

regression line is given by y = -4.61+41.68*x

change in y for a unit change in x is equal to b that is 41.68

putting our x mean of 3.4 into the regression line equation gives y = -4.61+41.68*3.4 = 137.102, which matches the y mean of 137.1 up to rounding

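
The check above works for any least-squares line: because a = ymean - b*xmean, the fitted value at x mean is exactly y mean. A short Python sketch, using the ten (test score, units sold) pairs from the table, that illustrates this; the variable names are illustrative choices:

```python
# Test scores (x) and units sold (y) copied from Question 12.21.
xs = [2.6, 3.7, 2.4, 4.5, 2.6, 5, 2.8, 3, 4, 3.4]
ys = [95, 140, 85, 180, 100, 195, 115, 136, 175, 150]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Least-squares slope and intercept via the shortcut formulas.
b = (sum(x * y for x, y in zip(xs, ys)) - n * x_mean * y_mean) / (
    sum(x * x for x in xs) - n * x_mean ** 2
)
a = y_mean - b * x_mean

# The fitted value at x_mean equals y_mean by construction of a.
y_at_x_mean = a + b * x_mean
print(f"b = {b:.4f}, a = {a:.4f}")
print(f"fitted y at x mean = {y_at_x_mean:.3f}, y mean = {y_mean:.3f}")
```

The slope b is also the change in units sold for a one-unit change in test score, as stated in the answer.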
Question 12.22

soccer games x minor accidents y xy x mean y mean x^2 x mean^2 n*xmean^2 n*xmean*ym

20 6 120 20.85 6.85 400 434.7225 3043.0575 99

30 9 270 900

10 4 40 100

12 5 60 144

15 7 105 225

25 8 200 625

34 9 306 1156

1101 3550

b is 0.199711999

a is 2.686004823

estimating equation is y= 2.68+0.19*x

no. of minor accidents when 33 soccer games took place is 8.95

standard error of estimate is sqrt(3.4582/(7-2)), which is about 0.832, using the sum of (y-y^)^2 from the table below

y^    y-y^   (y-y^)^2  a     b
6.48  -0.48  0.2304    2.68  0.19
8.38  0.62   0.3844    2.68  0.19
4.58  -0.58  0.3364    2.68  0.19
4.96  0.04   0.0016    2.68  0.19
5.53  1.47   2.1609    2.68  0.19
7.43  0.57   0.3249    2.68  0.19
9.14  -0.14  0.0196    2.68  0.19

sum of (y-y^)^2 = 3.4582

[Chart: minor accidents y]

Question 12.23

p or x q or y y^ xy x mean y mean x^2 x mean^2 n*xmean^2

20 125 130.6463 2500 13.06 203.75 400 170.5636 1364.5088

17.5 156 156.9806 2730 306.25

16 183 172.7811 2928 256

14 190 193.8485 2660 196

12.5 212 209.6491 2650 156.25

10 238 235.9833 2380 100

8 250 257.0507 2000 64

6.5 276 272.8513 1794 42.25

19642 1520.75

b is -10.53371326

a is 341.3202952

regression line is y = 341.32-10.53*x

n*xmean*ymean y^ y-y^ (y-y^)^2 a b

21287.8 130.6463 -5.6463 31.88070369 341.3203 -10.5337

156.98055 -0.98055 0.961478302 341.3203 -10.5337

172.7811 10.2189 104.4259172 341.3203 -10.5337

193.8485 -3.8485 14.81095225 341.3203 -10.5337

209.64905 2.35095 5.526965903 341.3203 -10.5337

235.9833 2.0167 4.06707889 341.3203 -10.5337

257.0507 -7.0507 49.71237049 341.3203 -10.5337

272.85125 3.14875 9.914626563 341.3203 -10.5337


[Chart: q or y and y^, with linear trendlines]

Question 12.24

money spent x %age of pollutants y xy x mean y mean x^2 x mean^2 n*xmean^2 n*xm

8.4 35.9 301.56 14.29 29.53 70.56 204.2041 2654.6533

10.2 31.8 324.36 104.04

16.5 24.7 407.55 272.25

21.7 25.2 546.84 470.89

9.4 36.8 345.92 88.36

8.3 35.8 297.14 68.89

11.5 33.4 384.1 132.25

18.4 25.4 467.36 338.56

16.7 31.4 524.38 278.89

19.3 27.4 528.82 372.49

28.4 15.8 448.72 806.56

4.7 31.5 148.05 22.09

12.3 28.9 355.47 151.29

5080.27 3177.12

b is -0.776160662

a is 40.62133587

regression equation is given by y = 40.62-0.77*x

percentage of dangerous pollutants when x = 20k is 25.09812262

n*xmean^2 n*xmean*ymean y^ y-y^ (y-y^)^2 a b

2654.6533 5485.7881 34.101596 1.798404 3.234256947 40.62134 -0.77616

32.704508 -0.904508 0.818134722 40.62134 -0.77616

27.8147 -3.1147 9.70135609 40.62134 -0.77616

23.778668 1.421332 2.020184654 40.62134 -0.77616

33.325436 3.474564 12.07259499 40.62134 -0.77616

34.179212 1.620788 2.626953741 40.62134 -0.77616

31.6955 1.7045 2.90532025 40.62134 -0.77616

26.339996 -0.939996 0.88359248 40.62134 -0.77616

27.659468 3.740532 13.99157964 40.62134 -0.77616

25.641452 1.758548 3.092491068 40.62134 -0.77616

18.578396 -2.778396 7.719484333 40.62134 -0.77616

36.973388 -5.473388 29.9579762 40.62134 -0.77616

31.074572 -2.174572 4.728763383 40.62134 -0.77616

93.752

standard error = 2.91940218 (about 2.9194)


Exercise 12.3

Question 12.25

a positive

b positive

c positive

d zero

Question 12.26

x y

5 9

5.5 6

6 3

6.5 0

5 7

coefficient of correlation= -0.9762

coefficient of determination = 0.95297
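
The coefficient of correlation above can be computed directly as r = Sxy / sqrt(Sxx * Syy), where Sxy, Sxx and Syy are sums of products of deviations from the means. A minimal Python sketch using the five (x, y) pairs of this question; the function name pearson_r is an illustrative choice:

```python
import math


def pearson_r(xs, ys):
    """Pearson coefficient of correlation between xs and ys."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Data copied from Question 12.26.
xs = [5, 5.5, 6, 6.5, 5]
ys = [9, 6, 3, 0, 7]

r = pearson_r(xs, ys)
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
```

Squaring r gives the coefficient of determination, and the sign of r matches the sign of the slope b of the fitted line.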

Question 12.27

price x passenger y

25 800

30 780

35 780

40 660

45 640

50 600
55 620

60 620

coefficient of correlation= -0.908

coefficient of determination= 0.8244

Question 12.28

x y y^ xy x mean y mean x^2 x mean^2

5 58 56.5 290 15 28.5 25 225

10 41 42.5 410 28.5 100

10 45 42.5 450 28.5 100

15 27 28.5 405 28.5 225

15 26 28.5 390 28.5 225

20 12 14.5 240 28.5 400

20 16 14.5 320 28.5 400

25 3 0.5 75 28.5 625

coeff. of correlation is -0.99284954 by Excel's built-in command

          Column 1      Column 2
Column 1  1
Column 2  -0.99284954   1

(correlation matrix from the Data Analysis ToolPak)


n*xmean^2 n*xmean*ymean a b (y-y^)^2 (y-y mean)^2

1800 3420 70.5 -2.8 2.25 870.25

70.5 -2.8 2.25 156.25

70.5 -2.8 6.25 272.25

70.5 -2.8 2.25 2.25

70.5 -2.8 6.25 6.25

70.5 -2.8 6.25 272.25

70.5 -2.8 2.25 156.25

70.5 -2.8 6.25 650.25

34 2386

by formula coeff. of determination is 0.98575021

coeff. of correlation is -0.99284954
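
The "by formula" values come from r^2 = 1 - sum of (y-y^)^2 / sum of (y-y mean)^2, with r taking the sign of the slope b. A minimal Python sketch of that calculation, assuming a = 70.5 and b = -2.8 from the table; the variable names sse and sst are illustrative labels for the two column sums:

```python
import math

# Data and coefficients copied from Question 12.28.
xs = [5, 10, 10, 15, 15, 20, 20, 25]
ys = [58, 41, 45, 27, 26, 12, 16, 3]
a, b = 70.5, -2.8

y_mean = sum(ys) / len(ys)

# Unexplained variation: sum of (y - y^)^2.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
# Total variation: sum of (y - y mean)^2.
sst = sum((y - y_mean) ** 2 for y in ys)

r_squared = 1 - sse / sst
# r carries the sign of the slope b.
r = math.copysign(math.sqrt(r_squared), b)

print(f"sum (y-y^)^2 = {sse}, sum (y-y mean)^2 = {sst}")
print(f"r^2 = {r_squared:.8f}, r = {r:.8f}")
```

The two sums printed here correspond to the 34 and 2386 column totals in the table.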

Question 12.29

noise x arousal y xy x mean y mean x^2 x mean^2 n*xmean^2

4 39 156 3.5 32.5 16 12.25 98

3 38 114 32.5 9

1 16 16 32.5 1

2 18 36 32.5 4

6 41 246 32.5 36

7 45 315 32.5 49

2 25 50 32.5 4
3 38 114 32.5 9

coeff of correl. Is 0.848008711

Column 1 Column 2

Column 1 1

Column 2 0.848008711 1

n*xmean*ymean y^ (y-y^)^2 a b (y-y mean)^2

910 34.783338 17.78023842 16.51667 4.566667 42.25

30.216671 60.58021032 16.51667 4.566667 30.25

21.083337 25.84031506 16.51667 4.566667 272.25

25.650004 58.5225612 16.51667 4.566667 210.25

43.916672 8.506975556 16.51667 4.566667 72.25

48.483339 12.13365059 16.51667 4.566667 156.25

25.650004 0.4225052 16.51667 4.566667 56.25

30.216671 60.58021032 16.51667 4.566667 30.25

244.37 870

coeff. Of determination is 0.719114943

coeff of correl. Is 0.848006452


Question 12.30

s.no. test score x units sold y

1 2.6 95

2 3.7 140

3 2.4 85

4 4.5 180

5 2.6 100

6 5 195

7 2.8 115

8 3 136

9 4 175

10 3.4 150

b is 41.68091

a is -4.6151

coeff of correl. Is 0.962784254

coeff. of determination is 0.926953918

Question 12.31

x y xy x mean y mean x^2 x mean^2

2 12.8 25.6 3.23 9.04 4 10.4329

3 11.3 33.9 9.04 9

5 3.2 16 9.04 25
4 6.4 25.6 9.04 16

2 11.6 23.2 9.04 4

6 3.2 19.2 9.04 36

1 8.7 8.7 9.04 1

3 10.5 31.5 9.04 9

4 8.2 32.8 9.04 16

3 11.3 33.9 9.04 9

3 9.4 28.2 9.04 9

2 12.8 25.6 9.04 4

4 8.2 32.8 9.04 16

Column 1 Column 2

Column 1 1

Column 2 -0.816369484 1

coeff of correl. is -0.816369484

n*xmean^2  n*xmean*ymean  y^       (y-y^)^2     a        b        (y-y mean)^2
135.6277   379.5896       11.4159  1.91573281   15.2665  -1.9253  14.1376
                          9.4906   3.27392836   15.2665  -1.9253  5.1076
                          5.64     5.9536       15.2665  -1.9253  34.1056
                          7.5653   1.35792409   15.2665  -1.9253  6.9696
                          11.4159  0.03389281   15.2665  -1.9253  6.5536
                          3.7147   0.26491609   15.2665  -1.9253  34.1056
                          13.3412  21.54073744  15.2665  -1.9253  0.1156
                          9.4906   1.01888836   15.2665  -1.9253  2.1316
                          7.5653   0.40284409   15.2665  -1.9253  0.7056
                          9.4906   3.27392836   15.2665  -1.9253  5.1076
                          9.4906   0.00820836   15.2665  -1.9253  0.1296
                          11.4159  1.91573281   15.2665  -1.9253  14.1376
                          7.5653   0.40284409   15.2665  -1.9253  0.7056

sum of columns: (y-y^)^2 = 41.363, (y-y mean)^2 = 124.0128

coeff. of determination is 0.666461849

coeff of correl. is -0.816371147

regression line equation is y = 15.2665-1.9253*x

Question 12.32
x y xy x mean y mean x^2 x mean^2 n*xmean^2 n*xmean*ymean

3 11 33 2.875 8.25 9 8.265625 66.125 189.75

7 18 126 49

4 9 36 16

2 4 8 4

0 7 0 0

4 6 24 16

1 3 3 1

2 8 16 4

a=3.3309

b=1.7110

estimating equation is y= 3.3309+1.7110*x

Column 1 Column 2

Column 1 1

Column 2 0.786727714 1

coefficient of correlation= 0.7867

coefficient of determination= 0.6188 (0.7867^2 = 0.61889689)
