SN1 Project Part2

Sn1 stats project instructions

Probability and Statistics

Math 201-SN1-RE PROJECT PART 2 Fall 2024


Due on Wednesday December 11 before 11:59pm.

Instructions:
• This part of the project can be completed individually or in teams of two students. Please send your instructor an MIO
to inform them whether you will be working alone or with a partner. The instructor will then reply to your team with
a .csv file containing fake data for analysis.

• You are required to use R for all the calculations and to produce all the graphs.

• Students will submit their project electronically on Léa, in the Assignments and Dropbox section.

• One student per team will need to submit two files:

– a report (.pdf file) containing a presentation of your results, including the appropriate graphs and discussions,
– an R script (.R file) containing all the code used, cleaned up and commented.

1. Import the Data


(a) Import the data file in RStudio either by using the read.csv command or by clicking on File > Import Dataset >
From Text (base)....
(b) This will result in a data frame containing three columns (variables): the index of the subject, a random variable X
and another random variable Y. If you wish, you can pretend these are the results of two cognitive tests performed
on a random sample of a newly discovered alien species (humans would normally score 100 on average on these
cognitive tests).
(c) Important! Do not change the name of the data file, do not modify the data file with Excel, and do not change the
name of the data frame you obtain in R after importing the data. This will allow your teacher to verify that the
code you submit runs without errors.
(d) Remember! Type, run, and save all your code by using an R script! In other words, use the upper-left pane in
RStudio to write your code and save regularly. The lower-left pane (console) should only be used directly for quick
and short testing of commands that don’t need to be saved.
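
A minimal sketch of step (a) follows. The file name project_data.csv and the column names index, X and Y are assumptions made for illustration only; use the file and headers your instructor actually sends. So that the sketch runs on its own, it first writes a small made-up file to a temporary folder.

```r
# ASSUMPTION: file name and column names are hypothetical placeholders.
# The real file comes from your instructor and must not be renamed.
set.seed(1)
fake <- data.frame(index = 1:30,
                   X = round(rnorm(30, mean = 95, sd = 15), 1),
                   Y = round(rnorm(30, mean = 102, sd = 15), 1))
csv_path <- file.path(tempdir(), "project_data.csv")
write.csv(fake, csv_path, row.names = FALSE)

# Step (a): import the file with read.csv() (header = TRUE is the default).
project_data <- read.csv(csv_path)

# Quick checks that the import produced the three expected columns.
str(project_data)
head(project_data)
```

With the real file, only the read.csv() line is needed, pointing at the file's actual location.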
2. Hypothesis test on the variable X

(a) Determine if one can safely use the t-procedures on X by producing a properly labelled histogram of X.
Comment on the shape and outliers.
(b) Perform a hypothesis test on the population average for X using the t-procedure. Choose a lower-tailed alternative
hypothesis (<) that results in a P-value between 0.001 and 0.1. (You can find a suitable hypothesis with a bit of
trial-and-error, or you can mathematically reverse-engineer a desired P-value to find a value of µ₀ that will work.)
(c) In your report, show the details of your hypothesis test following the same presentation that your teacher has used
in class. Use the following functions in R to perform your calculations: mean(), sd(), sqrt() and pt().
(d) In addition to the above, check your answers with the function t.test(), which gives a summary including the
test statistic and the P-value. Copy-paste the output of t.test() into your report.
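
The steps of question 2 can be sketched as follows. This is an illustration under stated assumptions, not the presentation used in class: the vector x is simulated with rnorm() as a stand-in for the real column X, the target P-value of 0.03 is an arbitrary choice, and qt() is used only for the optional reverse-engineering of µ₀.

```r
# ASSUMPTION: x is simulated; in the project it is the X column of the data frame.
set.seed(201)
x <- rnorm(40, mean = 95, sd = 15)

# (a) Labelled histogram to judge shape and spot outliers.
hist(x, main = "Histogram of X", xlab = "Score on test X", ylab = "Frequency")

# (b) Reverse-engineer mu0 so that the lower-tailed P-value equals a target.
n  <- length(x)
se <- sd(x) / sqrt(n)              # standard error of the sample mean
target_p <- 0.03                   # arbitrary value between 0.001 and 0.1
mu0 <- mean(x) - qt(target_p, df = n - 1) * se

# (c) Lower-tailed test of H0: mu = mu0 against Ha: mu < mu0.
t_stat  <- (mean(x) - mu0) / se
p_value <- pt(t_stat, df = n - 1)  # area to the left of the test statistic

# (d) Check against t.test().
check <- t.test(x, mu = mu0, alternative = "less")
print(check)
```

Because qt(target_p, df) is negative for a target below 0.5, the resulting µ₀ lies above the sample mean. In a report one would normally round µ₀ to a tidy value and recompute the P-value, which then moves slightly away from the target.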
3. Hypothesis test on the variable Y

(a) Determine if one can safely use the t-procedures on Y by producing a properly labelled histogram of Y.
Comment on the shape and outliers.
(b) Perform a hypothesis test on the population average for Y using the t-procedure. Choose a two-tailed alternative
hypothesis (≠) that results in a P-value between 0.2 and 0.5. (You can find a suitable hypothesis with a bit of
trial-and-error, or you can mathematically reverse-engineer a desired P-value to find a value of µ₀ that will work.)
(c) In your report, show the details of your hypothesis test following the same presentation that your teacher has used
in class. Use the following functions in R to perform your calculations: mean(), sd(), sqrt() and pt().
(d) In addition to the above, check your answers with the function t.test(), which gives a summary including the
test statistic and the P-value. Copy-paste the output of t.test() into your report.
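
For the two-tailed case, the main change is that the P-value is twice the tail area beyond |t|. The sketch below assumes a simulated vector y in place of the real column Y and an arbitrary target P-value of 0.35; qt() appears only in the optional reverse-engineering step.

```r
# ASSUMPTION: y is simulated; in the project it is the Y column of the data frame.
set.seed(202)
y <- rnorm(40, mean = 102, sd = 15)

# (a) Labelled histogram to judge shape and spot outliers.
hist(y, main = "Histogram of Y", xlab = "Score on test Y", ylab = "Frequency")

# (b) Reverse-engineer mu0 for a two-tailed target: each tail holds target_p / 2.
n  <- length(y)
se <- sd(y) / sqrt(n)
target_p <- 0.35                       # arbitrary value between 0.2 and 0.5
t_crit <- qt(1 - target_p / 2, df = n - 1)
mu0 <- mean(y) - t_crit * se           # mean(y) + t_crit * se works equally well

# (c) Two-tailed test of H0: mu = mu0 against Ha: mu != mu0.
t_stat  <- (mean(y) - mu0) / se
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# (d) Check against t.test().
check <- t.test(y, mu = mu0, alternative = "two.sided")
print(check)
```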
4. Hypothesis test on the variable X − Y (matched pairs)
(a) Determine if one can safely use the t-procedures on X − Y by producing a properly labelled histogram of X − Y.
Comment on the shape and outliers.
(b) Perform a hypothesis test on the average difference X − Y using the t-procedure. Choose either a lower-tailed (<),
upper-tailed (>) or two-tailed (≠) alternative, but the null hypothesis should be that the average difference is 0 between
the two variables.
(c) In your report, show the details of your hypothesis test following the same presentation that your teacher has used
in class. Use the following functions in R to perform your calculations: mean(), sd(), sqrt() and pt().
(d) Produce three confidence intervals for the average difference for the confidence levels of 90%, 95% and 99%. Use the
following functions in R to perform your calculations: mean(), sd(), sqrt() and qt().
(e) In addition to the above, check your answers with the function t.test(), which gives a summary including the
test statistic, P-value and confidence interval. Copy-paste the output of t.test() into your report.
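
A possible outline of question 4 is sketched below. The paired vectors x and y are simulated stand-ins for the real columns, the two-tailed alternative is just one of the three allowed choices, and the helper ci() is a hypothetical name introduced here for illustration.

```r
# ASSUMPTION: x and y are simulated paired scores, not the instructor's data.
set.seed(203)
x <- rnorm(40, mean = 95, sd = 15)
y <- x + rnorm(40, mean = 4, sd = 8)
d <- x - y                                  # one difference per subject

# (a) Labelled histogram of the differences.
hist(d, main = "Histogram of X - Y", xlab = "Difference X - Y", ylab = "Frequency")

# (b), (c) Test of H0: mu_d = 0 against Ha: mu_d != 0.
n  <- length(d)
se <- sd(d) / sqrt(n)
t_stat  <- (mean(d) - 0) / se
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# (d) Confidence interval: mean(d) +/- t* x se, with t* taken from qt().
ci <- function(level) {                     # hypothetical helper
  t_star <- qt(1 - (1 - level) / 2, df = n - 1)
  mean(d) + c(-1, 1) * t_star * se
}
ci90 <- ci(0.90)
ci95 <- ci(0.95)
ci99 <- ci(0.99)

# (e) Check: a one-sample t.test() on the differences is the matched-pairs test.
check <- t.test(d, mu = 0, conf.level = 0.95)
print(check)
```

Changing conf.level in t.test() to 0.90 or 0.99 gives the other two intervals for comparison. The intervals widen as the confidence level increases.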
5. P-hacking
(a) How did you feel about the process of finding a hypothesis that suits your needs for a P-value in questions 2 and 3?
Should such a procedure be considered scientifically sound? Why or why not?
(b) Research and explain the principle of P-hacking (in fewer than 100 words) and how it relates to the above.
Cite at least one reference.
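
One common form of P-hacking, trying many tests and reporting only the best one, can be illustrated with a small simulation. This is a sketch under assumptions and shows a different variant from the µ₀ selection in questions 2 and 3: all data are simulated, every null hypothesis is true by construction, and the numbers of studies, tests and subjects are arbitrary.

```r
# ASSUMPTION: purely simulated data; every H0 below is true (true mean = 100).
set.seed(2024)
n_studies <- 1000   # simulated research projects
n_tests   <- 20     # tests tried per project before keeping the "best" one
n         <- 30     # sample size of each test

# For each project, keep only the smallest P-value among its 20 tests.
smallest_p <- replicate(n_studies, {
  min(replicate(n_tests,
                t.test(rnorm(n, mean = 100, sd = 15), mu = 100)$p.value))
})

# Share of projects that can report P < 0.05 even though nothing is going on.
hacked_rate   <- mean(smallest_p < 0.05)
expected_rate <- 1 - 0.95^n_tests   # about 0.64 for 20 independent tests
print(c(simulated = hacked_rate, expected = expected_rate))
```

A single honest test has a 5% false-positive rate at the 0.05 level; selecting the best of 20 raises it to roughly 64%.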
6. Project journal

(a) In your report, include a brief description of how you worked together as a team, if applicable, and the methods used
(email, MIO, Snapchat, online video meetings, in-person meetings, Google Docs, etc.).
(b) Specify which part(s) of the assignment each student worked on. Both students have to be involved with the R
coding, and both students have to be involved with the written report.
7. Grading scheme
This part of the project counts for 5% of your final grade.
Late submissions will not be accepted.
Here is a tentative grading scheme for this part of the project.

PART 2

Presentation (including submission, code, journal and overall effort)    1
Hypothesis test on X                                                      1
Hypothesis test on Y                                                      1
Hypothesis test on X − Y                                                  1
P-hacking                                                                 1
TOTAL (Part 2)                                                            5
