A7 Bse Problem

This document provides instructions for an assignment involving the analysis of a dataset on fruit flies. Students are asked to: 1) Load and clean the dataset, describing the variable types and missing data. 2) Explore the data by making a bar chart of survival rates by diet for control flies. 3) Form and test hypotheses about the relationship between protein content, treatment (control, injured, infected) and outcomes of lifespan and fecundity using statistical tests and box plots. The dataset tests hypotheses about whether dietary restriction only extends lifespan in safe laboratory conditions or also in more stressful wild conditions with pathogens or injuries.

Assignment 6

Introduction to Data Science

Subhasis Ray

November 18, 2023

Note
This is an individual assignment. You may not consult your peers or use AI
tools for these tasks except where explicitly asked to. Any misconduct
will result in a score of 0 for the entire assignment and will be noted and
reported to the Academic Integrity Committee.

Submit to CodePost: (1) all code in a single Python script, with a comment
indicating the problem number before the solution to each problem, and (2)
plots under headings indicating the problem numbers in a Word document.

1 Introduction
This dataset is from the article "Testing evolutionary explanations for the
lifespan benefit of dietary restriction in fruit flies (Drosophila melanogaster)"
by Eevi Savola et al., 2021, URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8609428/
The abstract is reproduced here:

Abstract
Dietary restriction (DR), limiting calories or specific nutrients
without malnutrition, extends lifespan across diverse taxa.
Traditionally, this lifespan extension has been explained as a result
of diet-mediated changes in the trade-off between lifespan and
reproduction, with survival favored when resources are scarce.
However, a recently proposed alternative suggests that the selective
benefit of the response to DR is the maintenance of reproduction. This
hypothesis predicts that lifespan extension is a side effect of benign
laboratory conditions, and DR individuals would be frailer and unable
to deal with additional stressors, and thus lifespan extension should
disappear under more stressful conditions. We tested this by rearing
outbred female fruit flies (Drosophila melanogaster) on 10 different
protein:carbohydrate diets. Flies were either infected with a
bacterial pathogen (Pseudomonas entomophila), injured with a sterile
pinprick, or unstressed. We monitored lifespan, fecundity, and
measures of aging. DR extended lifespan and reduced reproduction
irrespective of injury and infection. Infected flies on lower protein
diets had particularly poor survival. Exposure to infection and injury
did not substantially alter the relationship between diet and aging
patterns. These results do not provide support for lifespan extension
under DR being a side effect of benign laboratory conditions.

There are two files in the zip archive: (1) the .docx file contains the
metadata, and (2) the .csv file contains the actual data. Read the metadata
and refer to the article to understand what each column and its values
mean.

2 Load and clean up the data

1. Load the data into a dataframe using pandas. What type of data
(nominal, ordinal, numeric, etc.) does each column hold? In your answer,
name each column, state what it means, and give its type. 3 marks

2. How many unknown/null/NA values are there in this dataset? Do all
columns have them? Which columns have the most? 2 marks
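
The loading and null-counting steps above could be sketched as follows. This is only an illustration: the column names (diet, treatment, lifespan, eggs) and their values are hypothetical stand-ins, and a small in-memory CSV replaces the real file so that the snippet runs on its own. The real names are listed in the metadata file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the assignment's .csv file; the real column
# names and values are described in the metadata (.docx) file.
csv_text = """diet,treatment,lifespan,eggs
1,Control,34,120
1,Sham,,95
2,Infection,12,
2,Control,40,
"""

# With the real file this would be pd.read_csv("<path to the .csv file>").
df = pd.read_csv(io.StringIO(csv_text))

# Storage dtypes hint at, but do not decide, the statistical type:
# a numeric diet code may still be an ordinal or nominal variable.
print(df.dtypes)

# Missing values per column, sorted so the worst column comes first.
na_per_column = df.isna().sum().sort_values(ascending=False)
total_na = int(na_per_column.sum())
columns_with_na = na_per_column[na_per_column > 0].index.tolist()

print(na_per_column)
print("total missing:", total_na)
print("columns with missing values:", columns_with_na)
```

Note that isna() only catches values pandas already parsed as missing. If the metadata says that a placeholder string marks unknown values, it would have to be passed to read_csv through its na_values argument.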

3 Explore the data


Note that there are two kinds of controls here: (1) flies that were
anaesthetized and handled, indicated as Control, and (2) flies that were
anaesthetized and then poked with a sterile pin, indicated as Sham.
To test survival from disease, another set of flies was anaesthetized
and then poked with a needle carrying a pathogen to infect them (indicated
as Infection).

1. Make a bar chart showing what fraction of flies in the control group
survived until day 20 for each diet. 5 marks
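
One possible shape for this computation is sketched below on synthetic data. It assumes one row per fly, a hypothetical lifespan column in days, and that "survived until day 20" means a lifespan of at least 20 days. All three are assumptions that should be checked against the metadata before use.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic stand-in data; column names and values are hypothetical.
df = pd.DataFrame({
    "diet": [1, 1, 1, 1, 2, 2, 2, 2, 1, 2],
    "treatment": ["Control"] * 8 + ["Sham", "Infection"],
    "lifespan": [10, 25, 30, 15, 22, 35, 18, 40, 5, 3],
})

# Keep only the handled-but-unharmed control flies.
control = df[df["treatment"] == "Control"]

# Assumption: a fly "survived until day 20" if its lifespan is >= 20 days.
survived = control["lifespan"] >= 20

# The mean of a boolean series is the fraction of True values per diet.
frac_survived = survived.groupby(control["diet"]).mean()
print(frac_survived)

fig, ax = plt.subplots()
ax.bar(frac_survived.index.astype(str), frac_survived.values)
ax.set_xlabel("Diet")
ax.set_ylabel("Fraction of control flies alive at day 20")
ax.set_ylim(0, 1)
# fig.savefig("problem_3_1.png")  # uncomment to write the figure to disk
```

Rows with a missing lifespan count as "not survived" in this sketch, because a comparison with NaN is False. Whether such rows should be dropped instead depends on what a missing lifespan means in the real dataset.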

4 Hypothesis testing
The authors of the article set out to verify the claim that dietary restriction
extends lifespan only in safe and clean laboratory conditions (where the
effect is well established), but not in the wild, where animals may be killed
by pathogens irrespective of diet. The article also tests the relationship
between the reproductive efficiency of the animals and dietary restriction.
Read about the resource reallocation hypothesis (RRH) and the nutrient
recycling hypothesis (NRH) in the introduction section of the article.

1. Form a hypothesis regarding protein content in diet and average
lifespan of unharmed (control) flies. Make box plots showing the
distribution of lifespan vs protein content in food (this should give you
an idea of whether the data are consistent with your hypothesis). Conduct
a statistical test for your hypothesis and provide your conclusion. 5 marks

2. Form a hypothesis regarding low protein (<30%) diet and average
lifespan of control vs injured flies. Make a box plot showing the
distribution of lifespan under these two treatments. Conduct a statistical
test for your hypothesis and provide your conclusion. 5 marks

3. Form a hypothesis regarding low protein (<30%) diet and average
lifespan of injured vs infected flies. Make a box plot showing the
distribution of lifespan under these two treatments. Conduct a statistical
test for your hypothesis and provide your conclusion. 5 marks

4. Form a hypothesis regarding low protein (<30%) diet and fecundity,
as measured by the average number of eggs produced in 50 days, in
the three groups. Treat the treatment as an ordinal variable ordered by
severity (0 for control, 1 for injury, and 2 for infection). Make box
plots showing the distribution of egg counts under these treatments. Fit
a linear regression model and overlay the fit on your box plots.
Summarize your results. 5 marks
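
The general mechanics behind these tasks (a two-group comparison, and a regression line drawn over box plots) could be sketched as below on randomly generated data. The column names (protein_pct, lifespan, eggs) are hypothetical, the numbers carry no biological meaning, and the Mann-Whitney U test is only one possible choice made for this illustration. The assignment does not prescribe a particular test.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)

# Ordinal severity coding given in the problem statement.
severity = {"Control": 0, "Sham": 1, "Infection": 2}

# Synthetic stand-in data; column names and effect sizes are made up.
rows = []
for name, code in severity.items():
    for _ in range(30):
        rows.append({
            "treatment": name,
            "protein_pct": 20,
            "lifespan": rng.normal(40 - 8 * code, 5),
            "eggs": rng.normal(200 - 30 * code, 20),
        })
df = pd.DataFrame(rows)

# Restrict to low-protein diets (< 30 %).
low = df[df["protein_pct"] < 30]

# Two-group comparison of lifespan: control vs injured (Sham) flies.
ctrl = low.loc[low["treatment"] == "Control", "lifespan"].dropna()
sham = low.loc[low["treatment"] == "Sham", "lifespan"].dropna()
u_stat, p_value = stats.mannwhitneyu(ctrl, sham, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_value:.3g}")

# Linear regression of egg count on the ordinal severity code.
x = low["treatment"].map(severity)
fit = stats.linregress(x, low["eggs"])
print(f"slope = {fit.slope:.2f}, intercept = {fit.intercept:.2f}")

# Box plots placed at x = 0, 1, 2 so the fitted line shares the same axis.
codes = [0, 1, 2]
groups = [low.loc[x == code, "eggs"].to_numpy() for code in codes]
fig, ax = plt.subplots()
ax.boxplot(groups, positions=codes)
xs = np.array(codes)
ax.plot(xs, fit.intercept + fit.slope * xs, marker="o")
ax.set_xticks(codes)
ax.set_xticklabels(list(severity))
ax.set_xlabel("Treatment (ordered by severity)")
ax.set_ylabel("Eggs")
```

Passing positions=[0, 1, 2] to boxplot is what lets the regression line line up with the boxes, since both then use the severity code as the x coordinate. If the group distributions look roughly normal, a t-test (scipy.stats.ttest_ind) would be a reasonable alternative to the rank-based test used here.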
