R Class 20

The document outlines an assignment consisting of multiple statistical analysis tasks involving datasets related to cable failures, auto claims, and marketing spend. It includes fitting linear and generalized linear models, evaluating model significance, performing hypothesis tests, and analyzing correlations and principal components. Each question requires specific statistical methods and interpretations, along with justifications for data manipulations and model adjustments.

Uploaded by

sarthakgarg0401

CHAPTER 11 & 12 ASSIGNMENT PREPARED BY – RAKESH GUPTA

Question 1: A statistician is analysing a dataset that describes the failure times of outdoor telephone cables with respect to the cable material quality (graded 1 to 4) and the level of rainfall, in centimetres, to which each cable is exposed. The file “Cables_dataset.csv” gives the failure times, in years, of 20 different cables.
(i) Fit a linear model to the data with the failure time as the response, including both cable material quality and level of rainfall as the
two covariates. Your answer should include a summary of the fitted model. [5]
(ii) (a) State the formula of the model fitted in part (i), clearly explaining the notation that you use.
(b) Comment on the significance of the parameters of the model fitted in part (i). [6]
(iii) (a) Plot the residuals of the model in part (i).
(b) Comment on the plot created in (iii)(a). [4]

An analyst suggests that the 6th row of the original data should be removed.
(iv) (a) Construct a new data set from the original data “Cables_dataset.csv” with the 6th row removed. [2]
(b) Justify the removal of the 6th row from the original data. [2]

(v) (a) Fit a linear model to the new data set constructed in part (iv)(a). [1]

(b) Comment on the fit of the model from part (v)(a) compared to the model fitted in part (i), by comparing suitable statistics from the R outputs. [3]
(vi) (a) Fit a generalized linear model (GLM) to the data set constructed in part (iv)(a) using a Gamma distribution.
(b) State the formula of the model fitted in part (vi)(a), clearly explaining the notation that you use.
(c) Comment on the significance of the parameters of the model fitted in part (vi)(a). [6]
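
The workflow in Question 1 can be sketched in R as follows. This is only an illustrative outline: the data are simulated because the contents of “Cables_dataset.csv” are not shown here, and the column names `failure_time`, `quality` and `rainfall` are assumptions. The Gamma GLM uses a log link so that fitted means stay positive; R's default link for the Gamma family is the inverse link, and the question does not say which one is intended.

```r
# Simulated stand-in for Cables_dataset.csv (column names are assumptions)
set.seed(1)
n <- 20
cables <- data.frame(
  quality  = sample(1:4, n, replace = TRUE),
  rainfall = runif(n, min = 50, max = 200)
)
# Positive failure times drawn from a Gamma distribution with a log-linear mean
mu <- exp(1.5 + 0.3 * cables$quality - 0.004 * cables$rainfall)
cables$failure_time <- rgamma(n, shape = 5, rate = 5 / mu)

# (i) Linear model with both covariates
fit_lm <- lm(failure_time ~ quality + rainfall, data = cables)
summary(fit_lm)

# (iv) New data set with the 6th row removed
cables_new <- cables[-6, ]

# (v) Refit and compare suitable statistics from the two fits
fit_lm_new <- lm(failure_time ~ quality + rainfall, data = cables_new)
rbind(
  full    = c(adj_r2 = summary(fit_lm)$adj.r.squared,
              sigma  = summary(fit_lm)$sigma),
  reduced = c(adj_r2 = summary(fit_lm_new)$adj.r.squared,
              sigma  = summary(fit_lm_new)$sigma)
)

# (vi) Gamma GLM on the reduced data (log link chosen here as an assumption)
fit_glm <- glm(failure_time ~ quality + rainfall,
               family = Gamma(link = "log"), data = cables_new)
summary(fit_glm)
```

With the real file, `cables` would instead come from `read.csv("Cables_dataset.csv")`, and the formula would use whatever column names that file contains.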

Question 2
Refer to the dataset “AutoClaims.csv” and answer the following questions.

(i) Fit a linear regression model to predict the “PAID” claim amount from the other variables (treat AGE as a numerical variable and all others as categorical). Interpret the model by explaining the R-squared, the adjusted R-squared, the p-value of the model and the p-value of each coefficient. Identify the variables that are significant in predicting “PAID” claims.

(ii) Comment on the applicability of the linear regression model by plotting “Residuals vs. Fitted Values” and a “QQ Plot of the residuals”.

(iii) Your actuarial friend has suggested that you use the natural logarithm of the “PAID” claims instead of the actual “PAID” claim amount, because log_e(PAID) is closer to a normal distribution than the “PAID” claims. Verify your friend's statement by comparing the skewness and excess kurtosis of both the PAID claims and log_e(PAID). Write appropriate custom functions to compute both measures.
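
One possible shape for the custom functions in part (iii) is sketched below. It uses the simple moment-based definitions (skewness m3 / m2^(3/2) and excess kurtosis m4 / m2^2 - 3, where m_k is the k-th central sample moment); bias-corrected variants also exist and give slightly different values. The vector `paid` is simulated from a lognormal distribution purely as a stand-in for the PAID column.

```r
# Moment-based sample skewness: m3 / m2^(3/2)
skewness <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Excess kurtosis: m4 / m2^2 - 3 (zero for a normal distribution)
excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

# Simulated right-skewed stand-in for the PAID column (not the real data)
set.seed(42)
paid <- rlnorm(5000, meanlog = 7, sdlog = 1)

# Compare the raw and log-transformed values
c(skew_paid = skewness(paid), skew_log = skewness(log(paid)))
c(kurt_paid = excess_kurtosis(paid), kurt_log = excess_kurtosis(log(paid)))
```

Values of both measures close to zero for log(PAID), and well away from zero for PAID, would support the friend's statement.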

(iv) Repeat the model in (i), taking into account the suggestion in (iii). Identify and comment on the key differences between the two models.

(v) Your manager has suggested that the model can be improved by adding interaction effects between STATE and CLASS, STATE and GENDER, and CLASS and GENDER to the set of independent variables used in (i). Evaluate the merit of this suggestion.

Question 3: Refer to the data file “Indices_Returns.csv” and answer the following questions:
The Indices_Returns.csv file is provided in the system.

(i) Compute the pairwise Pearson correlation coefficients between the returns of the 10 sectors (BM, CD, EN, FM, FI, HC, IN, IT, TE and UT), rounded to three digits after the decimal point. Display the correlation matrix in the output.
(ii) Identify the pair with the highest correlation coefficient and the pair with the lowest correlation coefficient.
(iii) Perform principal component analysis on the return values of the 10 sectors.
(iv) How many principal components have an eigenvalue of more than 1?
(v) What is the approximate proportion of total variation explained by the first two principal components?
(vi) Compute the pairwise correlations among the 10 principal components (rounded to 3 digits after the decimal point) and display the results. What do you infer about the resulting correlations?
(vii) Using a scree plot, comment on the number of significant components in the model.
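
The steps in Question 3 might be approached along the following lines. The returns are simulated here, since the contents of “Indices_Returns.csv” are not shown, and the PCA is run on standardised returns (`scale. = TRUE`), which is an assumption; the eigenvalue-greater-than-1 rule in part (iv) is normally applied to the correlation matrix.

```r
# Simulated returns with a shared market factor (stand-in for the real file)
set.seed(7)
sectors <- c("BM", "CD", "EN", "FM", "FI", "HC", "IN", "IT", "TE", "UT")
market  <- rnorm(250, sd = 0.01)
returns <- sapply(sectors, function(s) 0.8 * market + rnorm(250, sd = 0.008))

# (i) Pairwise Pearson correlations, rounded to 3 decimal places
cor_mat <- round(cor(returns), 3)
print(cor_mat)

# (ii) Highest and lowest correlation among distinct pairs
off <- cor_mat
off[!upper.tri(off)] <- NA   # keep each pair once, drop the diagonal
pair_name <- function(idx) {
  paste(rownames(cor_mat)[idx[1, "row"]], colnames(cor_mat)[idx[1, "col"]], sep = "-")
}
highest <- pair_name(which(off == max(off, na.rm = TRUE), arr.ind = TRUE))
lowest  <- pair_name(which(off == min(off, na.rm = TRUE), arr.ind = TRUE))

# (iii) PCA on standardised returns
pca <- prcomp(returns, center = TRUE, scale. = TRUE)

# (iv) Eigenvalues are the squared standard deviations of the components
eigenvalues <- pca$sdev^2
sum(eigenvalues > 1)

# (v) Proportion of variance explained by the first two components
prop_first_two <- sum(eigenvalues[1:2]) / sum(eigenvalues)

# (vi) Correlations among the component scores
pc_cor <- round(cor(pca$x), 3)
print(pc_cor)
```

For part (vii), `screeplot(pca, type = "lines")` draws the scree plot from the same fitted object.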

Question 4: Five years of marketing spend and company sales by month

i) Construct a scatterplot of the data. Comment on the relationship between Sales and Spend based on the plot. (4)

ii) Calculate Pearson’s correlation coefficient between Sales and Spend of the company. (2)

iii) Perform a hypothesis test for the null hypothesis that Pearson’s population correlation coefficient is equal to zero,
against the alternative that it is positive. You should report the p-value of the test and a clear conclusion. (5)

iv) Perform a simple linear regression analysis on the data. Your answer should report the estimate of the parameter sigma. (6)

v) Plot the fitted line on the data scatterplot. (2)

vi) State the proportion of the total variability of the responses explained by the model, based on your output in (iv). (1)

vii) Plot a graph of the residuals of the model fitted in (iv) against the explanatory variable. (2)

viii) Obtain a 99% confidence interval for parameter sigma. (4)

ix) Comment on the validity of the model based on results in part (vii) and part (viii). (2)

x) It is suggested that the slope parameter is equal to 10. Calculate the p-value of a hypothesis test for this suggestion by creating a suitable test statistic. (7)

xi) Comment on the suggestion in point (x). (2)

xii) Calculate the predicted amount of sales when the marketing spend is INR 4500. (2)
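
The inferential parts of Question 4 can be sketched in R as below. The data are simulated (60 months), the column names `spend` and `sales` are assumptions, and the slope test is taken to be two-sided because the question does not state the alternative. The interval for sigma relies on the standard result that (n - 2) * sigma_hat^2 / sigma^2 follows a chi-square distribution with n - 2 degrees of freedom under the normal linear model.

```r
# Simulated stand-in for five years of monthly data (column names are assumptions)
set.seed(123)
spend <- runif(60, min = 1000, max = 8000)
sales <- 5000 + 10 * spend + rnorm(60, sd = 4000)
mkt   <- data.frame(spend = spend, sales = sales)

# (ii)-(iii) Pearson correlation and one-sided test of H0: rho = 0 vs H1: rho > 0
ct <- cor.test(mkt$spend, mkt$sales, alternative = "greater", method = "pearson")
ct$estimate
ct$p.value

# (iv) Simple linear regression and the estimate of sigma
fit       <- lm(sales ~ spend, data = mkt)
sigma_hat <- summary(fit)$sigma
df        <- df.residual(fit)   # n - 2

# (vi) Proportion of variability explained
r_squared <- summary(fit)$r.squared

# (viii) 99% confidence interval for sigma via the chi-square pivot
ci_sigma <- sqrt(df * sigma_hat^2 / qchisq(c(0.995, 0.005), df))

# (x) t test of H0: slope = 10 (two-sided alternative assumed)
est    <- coef(summary(fit))["spend", "Estimate"]
se     <- coef(summary(fit))["spend", "Std. Error"]
t_stat <- (est - 10) / se
p_val  <- 2 * pt(-abs(t_stat), df)

# (xii) Predicted sales at a marketing spend of 4500
pred_4500 <- predict(fit, newdata = data.frame(spend = 4500))
```

For parts (v) and (vii), `plot(mkt$spend, mkt$sales)` followed by `abline(fit)` overlays the fitted line, and `plot(mkt$spend, resid(fit))` shows the residuals against the explanatory variable.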
