Biostatistics II. Final Assignment

Here’s the final assignment for Biostatistics II.

Complete the exercises below and upload them to the Final Assignment folder. Please
create a separate folder inside it with your name and upload the answers as separate files.
The files should be named as follows: name, last name, task.
The deadline is June 7th at 23:59.

1) Confounding Task:
Using the provided dataset, you are required to determine if age is a confounder and
calculate the adjusted (if needed) association between diabetes and sugar consumption in
grams per day.
Instructions:
1. Examine the dataset to understand the variables: "Age" (in years),
"Sugar_Consumption" (in grams per day), and "Has_Diabetes" (0 for no, 1 for yes).
2. Determine if age is a confounder in the association between diabetes and sugar
consumption.
3. If age is a confounder, calculate the adjusted association between diabetes and sugar
consumption.
4. Present your findings and methodology used in a clear and concise manner. Perform
tasks in R, and present R code.
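
One possible way to approach steps 2 and 3 is sketched below in base R. It is only an illustration under several assumptions: the data are simulated (the real file is not reproduced here), diabetes is treated as the outcome of a logistic regression, and confounding is judged by how much the sugar estimate changes after adding age. A change of roughly 10% or more is a common rule of thumb, not a requirement stated in the task.

```r
# Simulated stand-in for the course dataset; all values are made up.
set.seed(1)
n <- 1000
Age <- round(runif(n, 20, 80))
# In this simulation older people eat less sugar ...
Sugar_Consumption <- 120 - 0.8 * Age + rnorm(n, sd = 10)
# ... and are more likely to have diabetes.
Has_Diabetes <- rbinom(n, 1, plogis(-5 + 0.06 * Age + 0.01 * Sugar_Consumption))
dat <- data.frame(Age, Sugar_Consumption, Has_Diabetes)

# Is age associated with the exposure?
age_sugar_cor <- cor(dat$Age, dat$Sugar_Consumption)

# Crude and age-adjusted logistic models (assumption: diabetes is the outcome).
crude    <- glm(Has_Diabetes ~ Sugar_Consumption, family = binomial, data = dat)
adjusted <- glm(Has_Diabetes ~ Sugar_Consumption + Age, family = binomial, data = dat)

# Odds ratios per 1 g/day of sugar.
or_crude    <- unname(exp(coef(crude)["Sugar_Consumption"]))
or_adjusted <- unname(exp(coef(adjusted)["Sugar_Consumption"]))

# Relative change in the odds ratio after adjustment, in percent of the crude value.
change_pct <- 100 * abs(or_adjusted - or_crude) / or_crude

print(c(crude = or_crude, adjusted = or_adjusted, change_pct = change_pct))
```

With the real data, the same comparison would be run on the file that carries your ID; the adjusted model is then the one to report if age turns out to be a confounder.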

If you have any questions regarding the task, kindly email: [email protected]

Dataset:
Use the dataset from the "Confounding" folder that has your ID in its filename.
Student IDs can be found in the linked table.


2) Missing Data Task:

Using the provided dataset, you are required to determine if the missing data follows the
Missing at Random (MAR) or Missing Not at Random (MNAR) mechanism. If MAR, you
should impute the missing values using the Multiple Imputation by Chained Equations
(MICE) method and estimate the association between asbestos exposure and lung cancer. If
MNAR, you should perform worst/best scenario analysis.

Instructions:
1. Examine the dataset to understand the variables: "smoking" (0 for no, 1 for yes), "age"
(in years), "asbestos" (0 for no exposure, 1 for exposure), and "lung_cancer" (0 for no, 1 for
yes).
2. Determine if the missing data follows the Missing at Random (MAR) or Missing Not at
Random (MNAR) mechanism.
3. If MAR, impute the missing values using the Multiple Imputation by Chained Equations
(MICE) method.
4. Estimate the association between asbestos exposure and lung cancer after
imputation.
5. If MNAR, perform worst/best scenario analysis to estimate the association between
asbestos exposure and lung cancer.
6. Present your findings and methodology used in a clear and concise manner. Perform
tasks in R, and present R code.
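
The sketch below illustrates steps 2 and 5 in base R on simulated data. Several things are assumptions made for illustration only: that the missing values sit in "asbestos", that the association is summarised as an adjusted odds ratio, and that the best/worst scenarios are built by filling the gaps in the way that most strengthens or most weakens the association. Note that the observed data can show whether missingness depends on observed variables, but they cannot by themselves rule out MNAR.

```r
# Simulated stand-in for the course dataset; all values are made up.
set.seed(2)
n <- 400
dat <- data.frame(
  smoking  = rbinom(n, 1, 0.3),
  age      = round(runif(n, 30, 80)),
  asbestos = rbinom(n, 1, 0.25)
)
dat$lung_cancer <- rbinom(
  n, 1, plogis(-3 + 1.0 * dat$smoking + 0.8 * dat$asbestos + 0.02 * dat$age)
)
# Assumption: exposure is missing more often in older people.
is_missing <- rbinom(n, 1, plogis(-4 + 0.05 * dat$age)) == 1
dat$asbestos[is_missing] <- NA

# Step 2: does missingness depend on the observed variables?
dat$asb_missing <- as.integer(is.na(dat$asbestos))
miss_model <- glm(asb_missing ~ age + smoking + lung_cancer,
                  family = binomial, data = dat)
print(summary(miss_model)$coefficients)

# Adjusted odds ratio for asbestos from a logistic model.
or_asbestos <- function(d) {
  fit <- glm(lung_cancer ~ asbestos + smoking + age, family = binomial, data = d)
  unname(exp(coef(fit)["asbestos"]))
}

# Step 5: two extreme fillings of the missing exposure values.
# "high": missing cases counted as exposed, missing non-cases as unexposed.
# "low":  the reverse.
high <- transform(dat, asbestos = ifelse(is.na(asbestos), lung_cancer, asbestos))
low  <- transform(dat, asbestos = ifelse(is.na(asbestos), 1 - lung_cancer, asbestos))
scenario_or <- c(low = or_asbestos(low), high = or_asbestos(high))
print(scenario_or)
```

If the pattern looks consistent with MAR, the imputation in step 3 can be done with the mice package, whose mice(), with() and pool() functions cover imputing, fitting the model on each imputed dataset, and combining the estimates.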

If you have any questions regarding the task, kindly email: [email protected]

Dataset:
Use the dataset from the "Missing Data" folder that has your ID in its filename.

3) Simple and Multiple Linear Regressions Task

Using the Boston Housing dataset from “mlbench” library, you are required to perform
simple and multiple linear regression analyses using built-in R statistical functions, and write
a short technical report on your findings.

Instructions:
1. Before starting this task, apply the following transformation to the data so that
each of you works with different data and may reach different results and
conclusions. To do that, use the code below, setting the seed of the random number
generator to your individual student number (e.g. 336245 for Imam, 410860 for
Maria), i.e. replacing “123” with your own ID. If you are unsure, please let me know.

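library(mlbench)  # provides the BostonHousing dataset
library(dplyr)
data("BostonHousing")
# Keep the outcome (crim) and five predictors
BostonHousing <- BostonHousing %>%
  select("crim", "indus", "rm", "age", "tax", "ptratio")
set.seed(123)  # replace 123 with your own student ID
# Add uniform noise on [-2, 10], rounded to 2 decimals, to every predictor value
df <- cbind(BostonHousing[1], apply(BostonHousing[2:6], c(1, 2),
  function(x) { x + round(runif(1, -2, 10), 2) }))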

2. First, give a brief introduction to the dataset being analysed, using the R help or
Google to explain the variables: how many records (rows) and how many variables
(columns) the data contain, and what these variables represent. Comment on whether
any values of the quantitative variables are surprising (you are advised to produce a
boxplot to support your comment).
3. Fit a multiple linear regression model with “per capita crime rate” (variable
"crim") as the outcome, using all the available predictors. Comment on the regression
coefficients (null hypothesis and interpretation) and p-values for all the variables
that are statistically significant at the 5% level. If no variable is statistically
significant at the 5% level, comment on the one with the smallest p-value.
4. Choose the predictor with the smallest p-value (if several predictors share the
smallest p-value, choose any one of them) and fit a simple linear regression
model using it. Comment on the regression coefficient and p-value for this single
predictor.
5. Compare the R-squared and adjusted R-squared between multiple and simple linear
regressions and comment on your findings.
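
Steps 3 to 5 can be sketched as follows. To stay self-contained, the example uses a simulated data frame that only borrows the column names of the assignment data; its values and the relationship between the columns are made up, and the helper fit_stats() is hypothetical. With the real data, the same calls would be applied to the transformed "df" from step 1.

```r
# Simulated data frame with the same column names as the assignment data.
set.seed(3)
n <- 200
df <- data.frame(
  indus   = runif(n, 0, 28),
  rm      = rnorm(n, 6.3, 0.7),
  age     = runif(n, 3, 100),
  tax     = runif(n, 190, 710),
  ptratio = runif(n, 12.6, 22)
)
# Made-up outcome: depends on tax and indus plus noise.
df$crim <- 0.02 * df$tax + 0.1 * df$indus + rnorm(n, sd = 3)

# Step 3: multiple linear regression with all predictors.
multi <- lm(crim ~ ., data = df)
coefs <- summary(multi)$coefficients
print(coefs)

# Step 4: predictor with the smallest p-value (intercept row dropped).
p_values <- coefs[-1, "Pr(>|t|)"]
best <- names(which.min(p_values))
simple <- lm(reformulate(best, response = "crim"), data = df)
print(summary(simple)$coefficients)

# Step 5: R-squared and adjusted R-squared side by side.
fit_stats <- function(m) {
  s <- summary(m)
  c(r2 = s$r.squared, adj_r2 = s$adj.r.squared)
}
comparison <- rbind(multiple = fit_stats(multi), simple = fit_stats(simple))
print(comparison)
```

R-squared can never decrease when predictors are added to a model that already contains the simple model's predictor, which is why the adjusted R-squared is the more informative column when comparing the two fits.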

If you have any questions regarding the task, kindly email: [email protected]

4) Life Tables Task

Using the available databases with age-specific death rates, e.g.

- http://demogr.nes.ru/en/demogr_indicat/data_description (RusFMD)

- https://tochno.st/datasets

- https://www.mortality.org/

or any others, please construct life tables.

The following variables should be presented in the life table:
l(x), q(x), p(x), d(x), L(x), T(x), e(x)
Use any ax you wish (based on the formulas from the presentations).
The life table can be constructed in Excel or R (all the formulas used in your calculations
should be visible).
The task is individual; please choose countries and/or regions that are unique to each student.
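
A minimal sketch of an abridged life table in R is given below. The death rates are hypothetical and do not come from any of the databases listed above. The formulas are the standard textbook ones and may differ in detail from those in the presentations: ax is assumed to be half the interval width, q(x) = n·m(x) / (1 + (n − ax)·m(x)), and the open-ended last interval uses q = 1 and L = l/m.

```r
# Hypothetical age-specific death rates m(x); values are made up.
x  <- c(0, 1, 5, 15, 45, 65, 85)          # start of each age interval
n  <- c(diff(x), NA)                      # interval widths; last interval is open
mx <- c(0.0060, 0.0003, 0.0002, 0.0015, 0.0080, 0.0400, 0.1800)
ax <- n / 2                               # assumption: deaths occur mid-interval
k  <- length(x)

# Probability of dying in the interval, and of surviving it.
qx <- n * mx / (1 + (n - ax) * mx)
qx[k] <- 1                                # everyone dies in the open interval
px <- 1 - qx

# Survivors at exact age x (radix 100000) and deaths in the interval.
lx <- 100000 * cumprod(c(1, px[-k]))
dx <- lx * qx

# Person-years lived in the interval, person-years remaining, life expectancy.
Lx <- n * (lx - dx) + ax * dx
Lx[k] <- lx[k] / mx[k]                    # open interval
Tx <- rev(cumsum(rev(Lx)))
ex <- Tx / lx

life_table <- data.frame(x, n, mx, qx, px, lx, dx, Lx, Tx, ex)
print(round(life_table, 4))
```

Keeping each column as its own line of code leaves every formula visible, which is what the task asks for whether the table is built in R or in Excel.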

If you have any questions regarding the task, kindly email: [email protected]
