Final Practice Questions No Answers

1. A study of married couples found those married over 5 years reported higher happiness than those married under 5 years, leading researchers to conclude "marriage gets better over time." However, survivor bias and age as a confounding factor could influence these findings. The different sample sizes of couples married over and under 5 years may also confound results. 2. A company found a small positive correlation between bill size and payment time, but residential bills showed a larger positive correlation while commercial bills showed a negative correlation. 3. An analyst studied food sales data and found two regression models predicting sales based on price and competitor price had higher R2 and lower error than a model based just on price. The addition of other variables

Uploaded by

Kathy Thanh PK

1. A survey of married couples finds that those who have been married more than 5 years
report higher happiness levels in their lives on average compared with those who have
been married less than 5 years. The researchers conclude that “marriage gets better
over time.”
a. Is survivor bias likely to play an important role here? Explain briefly.

b. Someone suggests that age is very likely to be a confounding factor here. Do you
agree or disagree? Explain briefly.

c. In the study the researchers mention that their sample consists of 1000
couples that have been married more than 5 years and only 300 couples that
have been married less than 5 years. Would the different group sizes be a
likely confounding factor here? Briefly explain why or why not.

2. A company studies its accounts receivable records and sees that there is a small
correlation (+.02) between the size of the bill and the number of days it takes to collect
on the bill. However, when it separately examined the relationship for residential and
commercial bills, it found that for residential bills, there was a positive correlation, +.35;
for commercial bills, there was a negative correlation, -.43. Draw a scatterplot that is
consistent with these findings.
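
A small numerical sketch can confirm that such a pattern is possible before drawing it. The toy dataset below is invented for illustration and is not taken from the company's records: the cluster centers, offsets, and point counts are all assumptions, with bill size on the x-axis and days to collect on the y-axis. Residential bills are placed at small bill sizes and commercial bills at large ones, both with the same average collection time.

```python
from math import sqrt

def pearson(points):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / sqrt(sxx * syy)

def cluster(center_x, center_y, n_same, n_opposite):
    """Hypothetical helper: put n_same points at each corner where x and y
    move together, and n_opposite points at each corner where they move
    in opposite directions, around the given center."""
    dx, dy = 1, 5  # assumed offsets (bill size units, days)
    pts = []
    pts += [(center_x + dx, center_y + dy)] * n_same
    pts += [(center_x - dx, center_y - dy)] * n_same
    pts += [(center_x + dx, center_y - dy)] * n_opposite
    pts += [(center_x - dx, center_y + dy)] * n_opposite
    return pts

# Assumed layout: residential bills are small, commercial bills are large.
residential = cluster(2, 30, n_same=27, n_opposite=13)
commercial = cluster(10, 30, n_same=6, n_opposite=15)

r_res = pearson(residential)
r_com = pearson(commercial)
r_all = pearson(residential + commercial)

print(f"residential: {r_res:+.2f}")
print(f"commercial:  {r_com:+.2f}")
print(f"pooled:      {r_all:+.2f}")
```

In this toy data many points sit on top of each other, so a hand-drawn scatterplot would spread them out into two clouds: an upward-tilted cloud at small bill sizes and a downward-tilted cloud at large bill sizes, at about the same height.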

3. An analyst gathers weekly market data for a Pillsbury breakfast food product over the
past 40 weeks. Let Q = weekly quantity of Pillsbury sold (in millions of units), P = weekly
average price ($) of the Pillsbury product, A = the fraction of the Pillsbury product sold
during the week during a promotion, Pc = average weekly price ($) of products from
Pillsbury’s direct competitors, Ac = fraction of Pillsbury’s direct competing products sold
during the week during a (competitor’s) promotion, t = week number (1 through 40) and
SAc = 1 if a major competitor ran a special high intensity promotion during that week
and 0 otherwise. The analyst computes two regression models along with the
corresponding R² and standard errors:

Q = 12.45 – 2.77*P, R² = 61.3%, SE = .38
Q = 11.31 – 3.61*P + 1.11*Pc, R² = 67.4%, SE = .36

a. Give managerial interpretations for the numbers 12.45, -2.77, -3.61 and 1.11.

b. What could explain why the -3.61 coefficient is much more negative than the
corresponding -2.77 coefficient? Explain in language a manager would understand.

c. The analyst computes a fourth model: Q = 4.19 - .47P + 3.67A - .013t - .36SAc, R²
= .81, SE = .28. Next week (week number 41) Pillsbury plans to set its price at
$3.12 and sell 35% of product during promotions (A = 0.35), and a major competitor is
planning a special high intensity promotion campaign. What are the forecasted sales
for Pillsbury and how would you communicate to a manager the uncertainty
associated with the forecast?
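
As a hedged sketch of the arithmetic in part (c), the snippet below plugs the week-41 plan into the fourth model. It reads the 35% promotion share as A = 0.35 and the competitor campaign as SAc = 1. The "forecast plus or minus 2 standard errors" range is a common rule of thumb and an assumption here; it ignores the uncertainty in the estimated coefficients.

```python
# Coefficients of the fourth model as printed in the question.
coef = {"const": 4.19, "P": -0.47, "A": 3.67, "t": -0.013, "SAc": -0.36}
se = 0.28  # standard error of the regression, in millions of units

# Week-41 inputs; A = 0.35 and SAc = 1 are our reading of the scenario.
week41 = {"P": 3.12, "A": 0.35, "t": 41, "SAc": 1}

forecast = coef["const"] + sum(coef[name] * value for name, value in week41.items())

# Rough range: forecast +/- 2 * SE (approximation, see the note above).
low = forecast - 2 * se
high = forecast + 2 * se

print(f"forecast: {forecast:.2f} million units")
print(f"rough range: {low:.2f} to {high:.2f} million units")
```

Under these assumptions the point forecast is about 3.12 million units, with a rough range of about 2.56 to 3.68 million units.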

4. To study the benefits of upgrading information technology (IT), which many believe can
both reduce costs and improve quality, a number of hospitals are offered some
incentives to upgrade to a new state-of-the-art IT system. Two years later
researchers find that the hospitals that chose to accept the offer and upgrade had
lower costs per patient and higher levels of healthcare quality measures compared with
hospitals that did not upgrade to the new IT system.
(a) Analysts conclude from this that the upgraded IT system led to this boost in
performance. Can you think of a confounding factor that could play a large role here?
Please give an example of one such factor, explain why it's likely to be a confounding
factor, and please limit your answer to a couple sentences.

(b) Without repeating the study, can you think of a better way to analyze the data already
gathered?

Different researchers decide to choose a group of hospitals and randomly select half the
hospitals in the group to be offered the incentives. Two years later researchers find that the
hospitals that chose to accept the offer and upgrade had higher levels of healthcare quality
measures compared with hospitals that were not offered the incentives.

(c) A manager says that now the study is experimental and thus there are unlikely to be
confounding factors. Do you agree? Explain briefly.

(d) Without repeating the study, can you think of a better way to analyze the data already
gathered?

5. You are hired by the Department of Health and Human Services to help understand the
determinants of the obesity epidemic in the US. You are given data on more than 20,000
individuals, aged 22-60. You have the following information:
- Weight in pounds
- Height in inches
- Gender
- Age
- Region: Northeast, South, Midwest, West (these are the only 4 regions in the US)
- Immigrant status (=1 if person is an immigrant, 0 if not)

With this data in hand you start by running regression models where the dependent variable is
weight.

Based on your results presented in the Table at the end of the document, answer the following
questions:

a. Explain why the coefficient of immigrant goes from more negative to less
negative from column (1) to column (2).

b. Interpret the intercept and the coefficient on midwest in regression 3.

c. What is the predicted difference in weight between a female and a male who
have the same age, height, and immigrant status, and live in the same region? Give a
95% confidence interval for this prediction.

d. Based on your models, and assuming no changes in other variables but age,
whom do you predict to have a larger increase in their weight (in pounds) from
this year to the next on average?

1. QM 716 students
2. Their professor
3. Same for both
4. Cannot tell

e. Which variable has a stronger correlation with weight: immigrant status or
gender? Explain.

6. Air pollution in Beijing has been a very serious problem in recent years. Air pollution has
a negative impact on health, particularly the health of infants. As a policymaker in
Beijing, you want to investigate if providing air filters to households improves infant
health. You randomly sample 1,000 households with an infant in their home.

Experiment A

First, suppose you asked households if they would like to receive an air filter. Of the 1000
households, 500 households requested and received an air filter, while the other 500
households did not request one. Let’s call this “Experiment A”. You collect data on the
following variables:

InfantHealth: Health indicator for infants in each household (Worst score = 1, Best score = 100)

AirFilter: Dummy variable = 1 if the household received an air filter, = 0 if no filter

MothersEducation: Mother’s years of schooling (minimum 9 years, and maximum 22 years)

a) Analyzing data from “Experiment A,” you find the following regression result. Standard
errors are in parentheses.

Regression 1: InfantHealth = 2.2 + 30.5*AirFilter + 3.5*MothersEducation
                            (0.8)  (8.4)              (0.9)

Interpret the coefficient on AirFilter in regression #1. (One sentence).

b) Calculate the t-statistic and 95% confidence interval for the coefficient on AirFilter in
regression (#1). Is it statistically significant?

t-statistic: ____________________

95% confidence interval: ____________________

Is the AirFilter coefficient statistically significant? ____________________
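
The mechanics of part (b) can be sketched as follows. This is a minimal illustration that uses the coefficient and standard error printed in Regression 1; the critical value 1.96 is an assumption based on the large-sample normal approximation (reasonable with 1,000 households).

```python
coef = 30.5     # AirFilter coefficient from Regression 1
std_err = 8.4   # its standard error (in parentheses in the question)
z_crit = 1.96   # assumed large-sample critical value for a 95% interval

t_stat = coef / std_err
ci_low = coef - z_crit * std_err
ci_high = coef + z_crit * std_err

# Significant at the 5% level if |t| exceeds the critical value,
# or equivalently if the 95% interval excludes zero.
significant = abs(t_stat) > z_crit

print(f"t-statistic: {t_stat:.2f}")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
print(f"significant at 5%: {significant}")
```

The same pattern applies to any coefficient reported with a standard error; statistical significance on its own says nothing about whether the estimate is causal.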

c) Based on this result, your colleague says, “Providing air filters is a good public policy
because regression #1 shows that air filters cause better infant health.” Do you agree
that this evidence shows that air filters cause better infant health? Why or why not?

Experiment B

Now, instead of asking households to choose whether they want an air filter, you randomly
assign air filters among another 1000 randomly selected households. Let’s call this “Experiment
B”. You provide air filters to 500 randomly selected households while the other 500 households
do not receive air filters.

d) Using the Experiment B data, you find the following regression result.

Reg #2: InfantHealth = α1 + 10.2*AirFilter + 4.5*MothersEducation
                            (8.4)            (0.9)

Based on this result, your colleague says, “Providing air filters is a good public policy
because regression (2) shows that air filters cause better infant health.” Do you agree
that this evidence shows that air filters cause better infant health? Why or why not?

Answer:

e) Using the Experiment B data, you now run the following regression:

Regression #4: AirFilter = b0 + b1*MothersEducation

Which of the following statements is most likely to be correct? Circle one and explain
why.

i) b1 will be positive and statistically significant

ii) b1 will be negative and statistically significant

iii) b1 will be close to zero and not statistically significant

iv) Cannot tell

Explanation:

f) Using the Experiment B data, you now run another regression (#5):

Regression #5: InfantHealth = θ0 + θ1*AirFilter

Which of the following statements is correct? Circle one and explain why. (Note: 10.2 is
the coefficient on AirFilter in Regression 2 above.)

i) θ1 will be larger than 10.2

ii) θ1 will be smaller than 10.2

iii) θ1 will be very close to 10.2

iv) Cannot tell

Explanation:

7. The Dean of Questrom wants to encourage business students to study more. As a result,
the dean proposes a $1,000 incentive for all students who get an A grade in all of their
fall classes. You encourage the Dean to randomize the incentive across students – half
of the students are offered the incentive and another half are not offered the incentive.
(The ones offered the incentive are chosen by flipping a coin: heads they get it.)

After the fall term, you collect the data on the students and run the following regression to
see whether the incentive worked in increasing the GPA (t-statistics in parentheses):

GPA = 3.00 + 0.60*Incentive    adj. R² = .12
      (2.5)  (2.02)
(t-statistics in parentheses)

a) Did the incentive accomplish its goal of increasing GPA? CIRCLE ONE:

YES NO CAN’T TELL


Explain how you know (1 sentence)

b) The QM222 coordinator looks at all this analysis and says, “You cannot tell at all from this
analysis whether getting people to study more would improve their GPAs. There
are likely to be so many additional factors determining GPA.” Assuming that professors
never change their criteria for grading (i.e., don’t curve), could she be right? CIRCLE ONE:

YES NO MAYBE

Explain:

c) You also know the average daily study hours (studyhours) of each student that semester and
include this variable in the regression as well:

GPA = 1.40 + 0.60*Incentive + 0.15*studyhours    adj. R² = .24
      (2.8)  (2.03)           (3.5)
(t-statistics in parentheses)

The Dean is confused as to why the coefficient on the Incentive dummy variable did not change
when you included study hours. How would you respond to the Dean? (1 sentence)

Table. Dependent variable is weight in pounds

VARIABLES            (1)        (2)        (3)        (4)        (5)
                  weight     weight     weight     weight     weight

immigrant        -16.717     -6.351    -16.027     -7.479
                 [0.757]    [0.663]    [0.765]    [0.673]
height                        5.392                 4.295
                            [0.059]               [0.081]
northeast                              -0.481     -0.171
                                      [0.897]    [0.765]
midwest                                 3.967      3.265
                                      [0.809]    [0.689]
south                                   2.961      4.559
                                      [0.747]    [0.637]
age                                                0.797
                                                 [0.189]
age_sq                                            -0.006
                                                 [0.002]
male                                              13.004     37.123
                                                 [0.656]    [0.506]
Constant         179.498   -183.437    177.399   -141.028    159.687
                 [0.301]    [3.978]    [0.611]    [6.435]    [0.344]

Observations      24,407     24,407     24,407     24,407     24,407
R-squared          0.020      0.270      0.021      0.290      0.181
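
For questions that turn on the age and age_sq terms, a short sketch of how a quadratic age profile translates into a one-year change in predicted weight may help. It uses the age (0.797) and age_sq (-0.006) coefficients from the table; the two ages passed in at the end are purely illustrative assumptions, and the function name is hypothetical.

```python
B_AGE = 0.797      # coefficient on age from the table
B_AGE_SQ = -0.006  # coefficient on age_sq from the table

def yearly_change(age):
    """Predicted change in weight (pounds) when age goes from `age` to
    `age + 1`, holding every other variable in the model fixed."""
    linear_part = B_AGE * ((age + 1) - age)
    quadratic_part = B_AGE_SQ * ((age + 1) ** 2 - age ** 2)
    return linear_part + quadratic_part

# Illustrative ages only (assumptions, not given in the question).
for age in (25, 50):
    print(f"age {age} -> {age + 1}: {yearly_change(age):+.3f} lb")
```

Because the age_sq coefficient is negative, the predicted one-year change shrinks as age rises, which is the pattern to reason about when comparing a younger and an older person.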
