Biol/Stat 2244B S25
Biol/Stat 2244B
Activity 1 Solutions
Purpose of this file
While you can view the original Activity on Gradescope at any time, you might want to know
the correct answers to the multiple choice questions and an example of what a ‘good’ short
answer question looks like, while we are still grading your work. That’s what this file is for!
Research Scenario
Loblaw Companies Limited (“LCL”) is a Canadian retailer that operates several well-known
grocery store **brands**, including Loblaws, Real Canadian Superstore, No Frills, and Your
Independent Grocer. The grocery stores under these brands are located across several
provinces in Canada, and within many different cities. For example, in London ON, there
are several different locations of Loblaws, No Frills, Your Independent Grocer, and Real
Canadian Superstore. These brands differ in a variety of ways:
• their prices tend to vary: Loblaws prices are typically higher than those at No Frills
and Your Independent Grocer, while Superstore prices tend to be intermediate.
• the diversity of products available tends to vary; Loblaws and Superstore typically
have more options for products (e.g. more types of cereal, greater diversity of brand
names for cookies, greater variation in fruits and vegetables, etc.)
• the services provided vary, with Loblaws and Superstore regularly having staffed
deli / meat / seafood counters and floral shops, whereas No Frills and Your
Independent Grocer typically do not.
• The location within a community in which they are found; Loblaws and Superstore
tend to be located in more affluent communities within a city.
This variation in products available and services provided (and locations) tends to stem
from the differences in prices; higher prices generate greater revenue to offer services, and
carry greater diversity of stock. Greater revenue also impacts sourcing practices (from whom
and where products are purchased for sale in the store) and level of care in handling fresh
produce (i.e. fruits and vegetables, e.g. apples, bananas, potatoes, broccoli, tomatoes,
bok choy, asparagus, carrots, etc.). So-called ‘premium stores’ (with greater revenue
and prices) can prioritize sourcing their products from higher-end vendors and regionally
local farms, while other stores may focus on buying bulk from less expensive distributors
that ship from greater distances.
Property of Jennifer Peter (Western University)
A marketing manager from LCL has been tasked with an annual report on the quality of
fresh produce offered across the LCL brands for use in future advertisement campaigns.
Specifically, the manager is going to collect data to address the Research Question: What
is the typical level of quality of fresh produce sold by LCL brands?
To collect the data to address this Research Question (and some related ones), the
manager—whose office is located in Brampton, ON, at the LCL headquarters—obtains a
list of all LCL grocery store locations in Brampton, Mississauga, and Oakville. From that list
of locations, the manager randomly selects two of the LCL brands: Loblaws and Real
Canadian Superstore. For those two brands, there are a total of 15 Loblaws locations and
20 Real Canadian Superstore locations. The manager then randomly selects 5 of the
Loblaws locations and 5 of the Real Canadian Superstore locations from which to collect
data on the fresh produce. At each of the 10 stores, the manager has their data collection
team choose 10 individual pieces of fruit/vegetable from those available for each of the
following foods: apples, oranges, watermelon, potatoes, carrots, and beets.
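The sampling steps described above can be sketched in R. This is only an illustration under assumptions: the store labels are made up (the Scenario does not list the actual locations), the seed is arbitrary, and the brand-selection step is skipped, so the sketch starts from the two brands that were chosen.

```r
set.seed(2244)  # arbitrary seed so the sketch is repeatable

# Hypothetical store labels; the real location list is not given in the Scenario
loblaws    <- paste0("Loblaws_", 1:15)
superstore <- paste0("Superstore_", 1:20)

# Randomly select 5 locations within each of the two selected brands
stores <- c(sample(loblaws, 5), sample(superstore, 5))

foods <- c("apples", "oranges", "watermelon", "potatoes", "carrots", "beets")

# One row per piece of produce: 10 pieces of each food at each store
pieces <- expand.grid(piece = 1:10, food = foods, store = stores,
                      stringsAsFactors = FALSE)

nrow(pieces)  # 10 stores x 6 foods x 10 pieces = 600
```

The 600 rows correspond to the 600 individual pieces of fruit/vegetable that make up the sample.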
The data that the manager collects includes:
• the level of freshness (on a scale from 1 = obviously overripe / bruised to 5 =
extremely fresh / unblemished) for each piece of fruit/vegetable selected
• the mass (in grams) for each piece of fruit/vegetable selected
• The cost (dollars per pound) of the fruits and vegetables typically sold at the store
Question 1.1
What is the best description of the sample for the LCL manager's research?
• the LCL brands (Loblaws, Real Canadian Superstore, No Frills, Your Independent
Grocer)
• the fresh produce sold in LCL stores
• the LCL brand stores in Brampton, Mississauga, and Oakville
• the individual pieces of fruit/vegetables selected from the 5 Loblaws and 5 Real
Canadian Superstores
Answer: the individual pieces of fruit/vegetables selected from the 5 Loblaws and 5 Real
Canadian Superstores
Question 1.2
The manager is using the data they collect to answer a series of research questions in
addition to the main research question that motivated their study. Consider the research
question, Does greater mass of fruit or vegetables result in higher price per unit/mass?
What is the explanatory variable for this research question?
• type of fruit/vegetable
• fruit/vegetable freshness
• cost per unit/mass
• mass of fruit/vegetable
Answer: mass of fruit/vegetable
Question 1.3
Consider the research question, Does greater freshness of fruit or vegetables result in
higher price per unit/mass?
What is the research goal exemplified by this research question?
• Causative
• Descriptive
• Predictive
Answer: Causative
Question 1.4
You've been asked to review the manager's data file. An image is shown below of their data
file; it doesn't show the entire datafile--just a few lines.
There is definitely a problem with the way the manager has built their data file. You--based
on your expert knowledge of what R requires to function properly--have been called in to
point out the problem.
What feedback does the manager need to fix their file so that R can properly work with the
data?
• Remove any capital / uppercase letters
• Get rid of spaces in the column/vector names
• Take the symbols out of the column/vector names
Answer: Take the symbols out of the column/vector names
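To make the idea concrete, here is a small R sketch. The column names are hypothetical (the manager's actual file appears only as an image in the Activity), and the cleanup shown is just one possible approach, not the required fix.

```r
# Hypothetical column names; the manager's actual names are not reproduced here
raw_names <- c("store", "freshness", "mass(g)", "cost($/lb)")

# make.names() returns the syntactically valid version of each name,
# so a name that comes back unchanged was already valid
is_valid <- raw_names == make.names(raw_names)

# One possible cleanup: replace each run of symbols with "_",
# then trim any leading or trailing "_"
clean_names <- gsub("[^A-Za-z0-9]+", "_", raw_names)
clean_names <- gsub("^_|_$", "", clean_names)

data.frame(raw_names, is_valid, clean_names)
```

In this sketch, only the two names containing symbols are flagged as invalid.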
Question 1.5
Consider the manager's decisions when selecting their sample. Which do you think is a
bigger concern / problem for the quality (representativeness) of the sample they ended up
with: the choice of sampling frame, or the sampling strategy? Justify/explain your decision.
In your answer, be sure to:
• incorporate relevant statistical vocabulary from 2244
• use information/details from the Research Scenario
• clearly identify the sampling frame and sample
A typical paragraph (e.g. 5-8 sentences) should be sufficient.
Example Answer:
I think that the bigger concern for representativeness of the sample is the choice of
sampling frame. The sampling frame used by the researcher was the produce sold by the
LCL brands in Brampton, Mississauga, and Oakville; this sampling frame is supposed to be
representative of the population of interest, produce sold by LCL brands. My concern is
that the sampling frame will have significant undercoverage bias because the store
locations are only in Brampton, Mississauga, and Oakville. THESE CITIES ARE VERY CLOSE TO A
MAJOR INTERNATIONAL AIRPORT, AS WELL AS RELATIVELY CLOSE TO ONTARIO FARMS; as a consequence,
their access to fresh regionally available produce will be greater than, for example, stores
located in Thunder Bay or Nipissing. WITH THE INTERNATIONAL AIRPORT IN TORONTO THAT HAS
PLANES ARRIVING ALL THE TIME, INTERNATIONAL SHIPMENTS OF PRODUCE WILL ALSO BE MORE FREQUENT,
POTENTIALLY LEADING TO GREATER FRESHNESS FOR THE PRODUCE in those stores compared to stores
located in other regions with less frequent international supplies coming. As a
consequence, when the sample of the 600 pieces of fruit/vegetables from the 10 selected
locations is obtained, they might end up with produce that seems much fresher on
average/typically than might have occurred if the sampling frame had originally allowed for
store locations outside of the metropolitan areas of Brampton, Mississauga, and Oakville.
Note: this is just an example answer which happens to use all the vocabulary / concepts
correctly. This is NOT the only acceptable answer! You might have decided that the
sampling strategy was the bigger concern; that’s okay. You might have had different
reasoning/justification than I did. The point of this example answer is to illustrate how to
structure a good answer (i.e. use information/details from the Scenario and incorporate
relevant statistical vocabulary). I’ve used some text formatting, as described below, to help
emphasize different elements of my example answer, to show what makes it well
structured:
• I’ve bolded the key vocabulary terms from 2244 that were incorporated to draw your
attention to them for this solutions file. While I also used ‘sampling frame’ and
‘sample’, these were kind of token vocabulary uses, because they were already in
the question itself.
• I’ve italicized the statements where I “clearly identify the sampling frame and
sample”, to draw your attention that that part of the question is answered in my
response.
• Notice how the entirety of the answer is talking about produce, grocery store
locations, farms, etc. I haven’t needed to use any definitions or generic ideas
because I’m talking about the particular context. This is the idea of “writing in
context” or “using information/details from the Research Scenario”.
• I’ve underlined the parts of my answer that actually address the question prompts
(to draw your attention to them in this solutions file), i.e. “which do you think is
the bigger concern” and “justify/explain your decision”. In particular,
notice how my explanation of the application of undercoverage bias in this
scenario—linked to the use of only the three cities—tries to link my concern
specifically back to the research question. It’s not just that Brampton, Mississauga,
and Oakville might be “different” from other cities; I’ve been very explicit about how
they (might) differ—their proximity to local farms and frequent international flights—
in a way that is relevant to the research question (in this case, I’m linking my
concern back to something that would impact the response variable, produce
freshness).
• I’ve used red font on the last sentence; this sentence is really helpful in tying back
my thoughts / explanation to the big goal of the question, i.e. how this concern
about the sampling frame will impact the representativeness of the sample.
• I’ve used CAPITAL LETTERS to highlight how I am using my personal/lived experience or
knowledge to support my justification
Question 2
NOTE CAREFULLY: This is a "Select all that apply" question. This type of question is graded
as follows:
• 1.0 point awarded if ALL correct answers selected and NO incorrect answers (i.e.
entirely correct answer)
• 0.5 points awarded if either (i) only some of the correct answers are selected and no
incorrect answers, or, (ii) MORE correct answers selected than incorrect;
• 0 points awarded if the number of incorrect answers is equal to or greater than the
number of correct answers selected.
It is in your best interest to only select the answer options of which you are most sure. And,
don’t force yourself to select more than one answer just because it is a ‘select all that
apply’; the question could still have only one of the answers as actually correct.
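If it helps to see the grading rule spelled out, here is one way to express it as an R function. This is only a sketch of the rule as written above: the function name and the short option labels are made up for illustration, and it is not the actual grading tool.

```r
# Hypothetical helper: score one "select all that apply" response
score_select_all <- function(selected, correct) {
  n_right <- sum(selected %in% correct)     # correct options chosen
  n_wrong <- sum(!(selected %in% correct))  # incorrect options chosen

  # Entirely correct: every correct option chosen and nothing else
  if (n_wrong == 0 && n_right == length(correct)) return(1.0)

  # Partial credit: more correct than incorrect options chosen
  # (this also covers "some correct, none incorrect")
  if (n_right > n_wrong) return(0.5)

  # Otherwise incorrect choices equal or outnumber correct ones
  0
}

correct <- c("B", "C")                  # made-up labels for the correct options
score_select_all(c("B", "C"), correct)  # all correct, none incorrect
score_select_all("C", correct)          # some correct, none incorrect
score_select_all(c("C", "D"), correct)  # one correct, one incorrect
```

The three calls return 1, 0.5, and 0 respectively, which is why selecting an option you are unsure of can cost you the partial credit.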
Now for the question!
Consider the following image of a line of R code. You do not need to understand what the
code does; focus on the structure / syntax.
Which statement about the code is correct?
• boxplot is an example of an argument name.
• the code has an error in the syntax / structure
• xlab is an example of an argument name.
• mtcars is an example of an argument name.
Answers: “the code has an error in the syntax / structure” and “xlab is an example of an
argument name”.
Note that the error in the syntax / structure is the use of the semi-colon (rather than a
comma) between x = mpg and xlab = "Miles per gallon".
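The sketch below illustrates the point in R. The two lines of code are an assumed reconstruction (the exact line from the Activity image is not reproduced in this file, and writing the data as mtcars$mpg is my assumption); the helper parses() is also made up for this illustration. The code only checks whether R can read the syntax; it does not draw the plot.

```r
# Assumed reconstruction of the line in the image, with and without the error
bad  <- 'boxplot(x = mtcars$mpg; xlab = "Miles per gallon")'
good <- 'boxplot(x = mtcars$mpg, xlab = "Miles per gallon")'

# Hypothetical helper: TRUE if R can parse the text, FALSE if it is a syntax error
parses <- function(code) {
  !inherits(try(parse(text = code), silent = TRUE), "try-error")
}

parses(bad)   # FALSE: a semi-colon cannot separate arguments inside a call
parses(good)  # TRUE: arguments are separated by commas

# Pull apart the corrected call to see its structure
call_obj  <- parse(text = good)[[1]]
fun_name  <- as.character(call_obj[[1]])  # the function name: "boxplot"
arg_names <- names(call_obj)[-1]          # the argument names: "x" and "xlab"
```

Here boxplot is the function name, x and xlab are argument names, and mtcars is part of a value supplied to an argument.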
Question 3
I have not included the reflection question prompt here or an example answer. There is no
rush on getting immediate feedback (in my opinion) on the reflective question at this point.