Probably Overthinking It
A blog by Allen Downey.

Tuesday, May 31, 2011

There is only one test!
UPDATE: Here is a more recent (and I think clearer) version of this post.
A few weeks ago I started reading reddit/r/statistics regularly. I post links to this blog there
and the readers have given me some good feedback. And I have answered a few
questions.
One of the most common questions I see is something like, "I have some data. Which test
should I use?" When I see this question, I always have two thoughts:
1) I hate the way statistics is taught, because it gives students the impression that
hypothesis testing is all about choosing the right test, and if you get it wrong the statistics
wonks will yell at you, and
2) There is only one test!
Let me explain. All tests try to answer the same question: "Is the apparent effect real, or is it due to chance?" To answer that question, we formulate two hypotheses: the null hypothesis, H0, is a model of the system if the effect is due to chance; the alternate hypothesis, HA, is a model where the effect is real.

Ideally we should compute the probability of seeing the effect (E) under both hypotheses; that is, P(E | H0) and P(E | HA). But formulating HA is not always easy, so in conventional hypothesis testing, we just compute P(E | H0), which is the p-value. If the p-value is small, we conclude that the effect is unlikely to have occurred by chance, which suggests that it is real.
That's hypothesis testing. All of the so-called tests you learn in statistics class are just ways to compute p-values efficiently. When computation was expensive, these shortcuts were important, but now that computation is virtually free, they are not.

And the shortcuts are often based on simplifying assumptions and approximations. If you violate the assumptions, the results can be misleading, which is why statistics classes are filled with dire warnings and students end up paralyzed with fear.

Fortunately, there is a simple alternative: simulation. If you can construct a model of the null hypothesis, you can estimate p-values just by counting. This figure shows the structure of the simulation:

[Figure: the structure of the simulation -- data and the null hypothesis model both feed a test statistic, and the p-value comes from counting.]
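Since the figure didn't survive here, below is a minimal sketch of that structure in Python. The names (run_model, test_stat, iters) are mine, for illustration; any function that generates one fake dataset under the null hypothesis will do for run_model.

def simulate_p_value(data, run_model, test_stat, iters=1000):
    """Estimates a p-value by counting.

    data: the observed dataset
    run_model: function that generates one fake dataset under H0
    test_stat: function that maps a dataset to a number
    iters: number of simulated datasets to generate
    """
    # delta measures the size of the apparent effect in the real data
    delta = test_stat(data)

    # count how often simulated data shows an effect at least as big
    count = sum(1 for _ in range(iters)
                if test_stat(run_model()) >= delta)
    return count / iters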
The first step is to get data from the system and compute a test statistic. The result is some measure of the size of the effect, which I call "delta". For example, if you are comparing the means of two groups, delta is the difference in the means. If you are comparing actual values with expected values, delta is a chi-squared statistic (described below) or some other measure of the distance between the observed and expected values.

The null hypothesis is a model of the system under the assumption that the apparent effect is due to chance. For example, if you observed a difference in means between two groups, the null hypothesis is that the two groups are the same.
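As a concrete illustration (mine, not from the post), a difference-in-means test statistic is just a few lines:

def DiffInMeans(group1, group2):
    """Test statistic: absolute difference in the means of two groups."""
    mean1 = sum(group1) / len(group1)
    mean2 = sum(group2) / len(group2)
    return abs(mean1 - mean2)

One way to model that null hypothesis is to pool the two groups, shuffle, and deal out two new groups of the original sizes.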
The next step is to use the model to generate simulated data that has the same sample size as the actual data. Then apply the test statistic to the simulated data.

The last step is the easiest; all you have to do is count how many times the test statistic for the simulated data exceeds delta. That's the p-value.
As an example, let's do the Casino problem from Think Stats:
Suppose you run a casino and you suspect that a customer has replaced a die provided by the casino with a "crooked die"; that is, one that has been tampered with to make one of the faces more likely to come up than the others. You apprehend the alleged cheater and confiscate the die, but now you have to prove that it is crooked. You roll the die 60 times and get the following results:

Value      1   2   3   4   5   6
Frequency  8   9  19   6   8  10

What is the probability of seeing results like this by chance?
The null hypothesis is that the die is fair. Under that assumption, the expected frequency for each value is 10, so the frequencies 8, 9 and 10 are not surprising, but 6 is a little funny and 19 is suspicious.
To compute a p-value, we have to choose a test statistic that measures how unexpected these results are. The chi-squared statistic is a reasonable choice: for each value we compare the expected frequency, exp, and the observed frequency, obs, and compute the sum of the squared relative differences:
def ChiSquared(expected, observed):
    """Computes the chi-squared statistic.

    expected and observed are Hist objects from the Think Stats
    Pmf module, mapping each value to its frequency.
    """
    total = 0.0
    for x, exp in expected.Items():
        obs = observed.Freq(x)
        total += (obs - exp)**2 / exp
    return total
Why relative? Because the variation in the observed values depends on the expected
value. Why squared? Well, squaring makes the differences positive, so they don't cancel
each other when we add them up. But other than that, there is no special reason to choose
the exponent 2. The absolute value would also be a reasonable choice.
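To make that concrete, here is a variant (my sketch, not code from the post) that uses absolute relative differences instead of squared ones:

def AbsDiff(expected, observed):
    """Alternative test statistic: sum of absolute relative differences."""
    total = 0.0
    for x, exp in expected.Items():
        obs = observed.Freq(x)
        total += abs(obs - exp) / exp
    return total

Since the choice of test statistic is somewhat arbitrary anyway, either version answers the question.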
For the observed frequencies, the chi-squared statistic is 20.6. By itself, this number doesn't
mean anything. We have to compare it to results from the null hypothesis.
Here is the code that generates simulated data:
import random

import Pmf  # Hist comes from the Think Stats Pmf module

def SimulateRolls(sides, num_rolls):
    """Generates a Hist of simulated die rolls.

    Args:
        sides: number of sides on the die
        num_rolls: number of times to roll the die

    Returns:
        Hist object
    """
    hist = Pmf.Hist()
    for i in range(num_rolls):
        roll = random.randint(1, sides)
        hist.Incr(roll)
    return hist
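The code below needs expected and observed as Hist objects. The post doesn't show how they are built; here is one way, assuming Hist.Incr accepts an increment amount, as it does in the Think Stats Pmf module:

sides = 6
num_rolls = 60

# observed frequencies from the table above
observed = Pmf.Hist()
for value, freq in zip(range(1, sides + 1), [8, 9, 19, 6, 8, 10]):
    observed.Incr(value, freq)

# under the null hypothesis, each face is expected 10 times
expected = Pmf.Hist()
for value in range(1, sides + 1):
    expected.Incr(value, num_rolls / sides)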
And here is the code that runs the simulations:
count = 0.0
num_trials = 1000
threshold = ChiSquared(expected, observed)

for _ in range(num_trials):
    simulated = SimulateRolls(sides, num_rolls)
    chi2 = ChiSquared(expected, simulated)
    if chi2 >= threshold:
        count += 1

pvalue = count / num_trials
print('p-value', pvalue)
Out of 1000 simulations, only 34 generated a chi-squared value greater than 20.6, so the
estimated p-value is 3.4%. I would characterize that as borderline significant. Based on this
result, I think the die is probably crooked, but would I have the guy whacked? No, I would
not. Maybe just roughed up a little.
For this problem, I could have computed the sampling distribution of the test statistic
analytically and computed the p-value quickly and exactly. If I had to run this test many
times for large datasets, computational efficiency might be important. But usually it's not.
And accuracy isn't very important either. Remember that the test statistic is arbitrary, and
the null hypothesis often involves arbitrary choices, too. There is no point in computing an
exact answer to an arbitrary question.
For most problems, we only care about the order of magnitude: if the p-value is smaller than 1/100, the effect is likely to be real; if it is greater than 1/10, probably not. If you think there is a difference between a p-value of 4.8% (significant!) and 5.2% (not significant!), you are taking it too seriously.
So the advantages of analysis are mostly irrelevant, but the disadvantages are not:
1) Analysis often dictates the test statistic; simulation lets you choose whatever test statistic is most appropriate. For example, if someone is cheating at craps, they will load the die to increase the frequency of 3 and/or 4, not 1 and/or 6. So in the casino problem the results are suspicious not just because one of the frequencies is high, but also because the frequent value is 3. We could construct a test statistic that captures this domain knowledge (see the sketch after this list), and the resulting p-value would be lower.
2) Analytic methods are inflexible. If you have issues like censored data, non-independence, and long-tailed distributions, you won't find an off-the-shelf test; and unless you are a mathematical statistician, you won't be able to make one. With simulation, these kinds of issues are easy.
3) When people think of analytic methods as black boxes, they often fixate on finding the
right test and figuring out how to apply it, instead of thinking carefully about the problem.
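For point 1, here is one hedged sketch of such a statistic; it is my illustration, not the post's. It counts only the excess frequency of the faces a craps cheater would load:

def CrapsStat(expected, observed):
    """Test statistic that only rewards excess 3s and 4s."""
    total = 0.0
    for x in [3, 4]:
        exp = expected.Freq(x)
        obs = observed.Freq(x)
        total += max(obs - exp, 0) / exp
    return total

Plugging CrapsStat into the simulation loop in place of ChiSquared gives a test tailored to the cheating we actually suspect.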
In summary, don't worry about finding the "right" test. There's no such thing. Put your effort
into choosing a test statistic (one that reflects the effect you are testing) and modeling the
null hypothesis. Then simulate it, and count!
Added June 24, 2011: For some additional examples, see this followup post.
Posted by Allen Downey at 6:28 PM
Wednesday, May 4, 2011
Think Stats will be published by O'Reilly in June
No statistics today, just news. I signed a contract with O'Reilly to publish Think Stats. I turn
in the manuscript next week, and it should be out in June. So if you find any errors, now is
the time to tell me!
Here's my new author page at O'Reilly.
Some people have asked if I get to choose the animal on the cover (which is a signature of
O'Reilly cover design). The short answer is no.
First, I'm actually not sure whether Think Stats will be an "animal book" or part of another
O'Reilly series. We haven't talked about it.
Also, here's what the O'Reilly author materials have to say on the topic:
Cover Design
The purpose of a book cover is to get a potential reader to pick up the book, and to persuade a bookstore to
display it.
We're confident that we have the most striking and effective covers in technical publishing. Despite our
relatively small size, it's not unusual to see window displays of our books at technical bookstores.
Our covers are all designed by Edie Freedman. She is open to suggestions, but has the final say on all cover
designs. Here's what Edie has to say about how she designs the animal covers for Nutshell Handbooks:
I ask the authors to supply me with a description of the topic of the book. What I am looking for
is adjectives that really give me an idea of the "personality" of the topic. Authors are free to
make suggestions about animals, but I prefer to deal with adjectives. Once I have the
information from the author, I spend some time thinking about it, and then I choose an animal.
Sometimes it is based on no more than what the title sounds like. (COFF, for example,
sounded like a walrus noise to me.) Sometimes it is very much linked to the name of the book
or software. (For example, vi, the "Visual Editor," suggested some beast with huge eyes). And
it is always subject to what sort of artwork I can find.
It seems to work best this way. We've tried doing animals that the authors want, and have
found that it works better if I do the selection, and then submit it for approval. I don't think
anyone has been too disappointed with the final choices we've made.
I'll add that if you do want a particular animal, Edie is more inclined to respond to "right brain" reasons than to
obvious mental associations. For example, several people on the net suggested the obvious clam for the Perl
handbook, but Edie responded instead to Larry Wall's obscure argument for the camel: ugly but serviceable,
able to go long distances without a lot of nourishment.
In any event, you'll have a chance to see and comment on Edie's cover design before we commit it to print. If
you really hate it, and can't be persuaded to feel otherwise, she'll probably try again (but no promises!).
So here's what I have for "right brain" ideas: the themes of the book are curiosity and analysis. One of the problems I see with conventional approaches to statistics is that they make people timid -- afraid to do the wrong thing. One of my goals with this approach is to help students be fearless. So I want something curious and fearless. Ideas?
Here's another article about the animal selection process.
And here's the list of animals they've already used.
And here it is:

[Cover image: a line drawing of a fish]
The line drawing looks better at higher resolution, but you get the idea. Can anyone name
that fish?
Posted by Allen Downey at 5:11 PM