Introduction
The Model
Results
Predicting survival on the Titanic.
Moderator: C. Fodya.
Group Members:
M. Durojaye, R. Rakotonirainy, S. Shabalala, A. Akinyelu, D.
Raphulu, S. Simelane.
Graduate Student Workshop.
January 11, 2014
Titanic MISG 2014
Outline
1 Introduction
2 The Model
Approach
Analysis
3 Results
Figure: The sinking Titanic (Photo: D. Paris)
Introduction
On April 15, 1912, during her maiden voyage, the Titanic
sank after colliding with an iceberg.
1502 out of 2224 passengers and crew died.
This sensational tragedy led to better safety regulations for
ships.
Introduction cont...
One of the reasons that the shipwreck led to such loss of
life was that there were not enough lifeboats for the
passengers and crew.
Some groups of people, such as women, children, and the
upper class, were more likely to survive than others.
Figure: Schematic of the Titanic (Photo: D. Raphulu)
Material
We work with two data sets: a training set and a test set.
For the training set, all information for each passenger is
given, including whether the passenger survived.
Below we have our test data set with the empty survival
column.
Main Purpose
Our main aim is to fill in the survival column of the test
data set.
How?
Finding patterns and building models from the training data.
Using those models to predict survival for the test passengers.
Tools and algorithms
Python, Excel and C#
Random forest is the machine learning algorithm used.
Testing
Model accuracy was measured by submitting predictions to the
Kaggle competition.
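The workflow above can be sketched roughly as follows. This is a minimal illustration, not the group's actual code: the column names `PassengerId`, `Sex`, and `Survived` are assumed to follow the Kaggle files, the rows are made-up placeholders, and the prediction rule (women survive) is only a stand-in for a real model.

```python
import csv
import io

# Hypothetical test rows; the real test set has more columns and passengers.
test_rows = [
    {"PassengerId": "1", "Sex": "male"},
    {"PassengerId": "2", "Sex": "female"},
    {"PassengerId": "3", "Sex": "female"},
]

def predict_survived(row):
    """Stand-in rule (an assumption): predict survival for women only."""
    return 1 if row["Sex"] == "female" else 0

def fill_survival_column(rows):
    """Return (PassengerId, Survived) pairs, one per test passenger."""
    return [(row["PassengerId"], predict_survived(row)) for row in rows]

def write_submission(pairs, handle):
    """Write the predictions as a two-column CSV with a header row."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["PassengerId", "Survived"])
    writer.writerows(pairs)

# Write to an in-memory buffer here; a real run would open a file instead.
buffer = io.StringIO()
write_submission(fill_survival_column(test_rows), buffer)
submission_text = buffer.getvalue()
```

Swapping `predict_survived` for a trained model leaves the rest of the sketch unchanged, which is why the rule is kept in its own function.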
Variable Rank
We ranked each variable by its correlation with survival.
Variable   Correlation coefficient
Gender     0.5434
PClass     0.3385
Cabin      0.3196
Fare       0.257
Embark     0.1018
Parch      0.0816
Age        0.0772
SibSp      0.0353
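A ranking of this kind could be computed along the following lines. The sketch assumes the coefficient is the absolute Pearson correlation between a numerically coded variable and the 0/1 survival flag; the slides do not say which coefficient or coding was used, and the toy columns below are invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def rank_variables(columns, survived):
    """Sort variable names by |correlation with survival|, strongest first."""
    scores = {name: abs(pearson(values, survived))
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Invented toy data: gender coded 1 = female, 0 = male; class coded 1 to 3.
survived = [1, 1, 0, 0, 1, 0]
columns = {
    "Gender": [1, 1, 0, 0, 1, 0],
    "PClass": [1, 2, 3, 3, 2, 1],
}
ranking = rank_variables(columns, survived)
```

Taking the absolute value means a strong negative relationship ranks as highly as a strong positive one.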
Figure: The variables and their categories: Gender (M), PClass (1st, 2nd, 3rd), Fare (a, c, d), Cabin (yes, no), Embark (S, Q, C)
Single Variable
Predicting survival using gender.
Gender   Proportion survived
Women    0.74
Men      0.18
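The proportions above are per-group survival rates. A minimal sketch of how such rates, and a prediction rule derived from them, might be computed is given here; the field names `Sex` and `Survived` and the eight toy rows are assumptions for illustration, not the training data.

```python
from collections import defaultdict

def survival_rate_by(rows, key):
    """Fraction of survivors within each value of `key`."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        survivors[row[key]] += row["Survived"]
    return {value: survivors[value] / totals[value] for value in totals}

# Invented toy training rows.
train_rows = [
    {"Sex": "female", "Survived": 1},
    {"Sex": "female", "Survived": 1},
    {"Sex": "female", "Survived": 0},
    {"Sex": "female", "Survived": 1},
    {"Sex": "male", "Survived": 0},
    {"Sex": "male", "Survived": 0},
    {"Sex": "male", "Survived": 1},
    {"Sex": "male", "Survived": 0},
]
rates = survival_rate_by(train_rows, "Sex")

# Predict survival for a group when its observed rate exceeds one half.
rule = {value: int(rate > 0.5) for value, rate in rates.items()}
```

With rates like 0.74 for women and 0.18 for men, the same thresholding gives the rule "women survive, men do not".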
Three variables combination
Proportion survived by gender, class, and fare band.
Female
Fare         0-9     10-19   20-29   > 30
1st class    0       0       0.8     0.97
2nd class    0       0.91    0.9     1
3rd class    0.59    0.58    0.3     0.12
Male
Fare         0-9     10-19   20-29   > 30
1st class    0       0       0.4     0.38
2nd class    0       0.15    0.16    0.21
3rd class    0.11    0.23    0.13    0.24
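The two tables above can be used directly as a lookup model. The sketch below assumes the column headings are fare bands, which matches the "Gender + Pclass + Fare" combination listed in the results, and assumes a passenger is predicted to survive when the group's rate is at least 0.5. The band boundaries and the threshold are illustrative choices, not taken from the slides.

```python
# Survival rates copied from the two tables above, indexed by fare band
# (0-9, 10-19, 20-29, and the top band, in that order).
SURVIVAL_TABLE = {
    ("female", 1): [0, 0, 0.8, 0.97],
    ("female", 2): [0, 0.91, 0.9, 1],
    ("female", 3): [0.59, 0.58, 0.3, 0.12],
    ("male", 1): [0, 0, 0.4, 0.38],
    ("male", 2): [0, 0.15, 0.16, 0.21],
    ("male", 3): [0.11, 0.23, 0.13, 0.24],
}

def fare_band(fare):
    """Map a fare to a band index 0..3; fares of 30 or more share the
    top band (an assumption about where the last band starts)."""
    return min(int(fare // 10), 3)

def predict(sex, pclass, fare, threshold=0.5):
    """Predict 1 (survived) when the table rate reaches the threshold."""
    rate = SURVIVAL_TABLE[(sex, pclass)][fare_band(fare)]
    return int(rate >= threshold)
```

For example, `predict("female", 3, 25.0)` looks up a rate of 0.3 and returns 0, while `predict("female", 3, 8.0)` looks up 0.59 and returns 1.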
Four variables combination
Female + Cabin Crew
             0-9     10-19   20-29   > 30
1st class    0       0       0       1
2nd class    0       0.92    0.9     1
3rd class    0.59    0.57    0.3     0.125
Female + Not Cabin Crew
             0-9     10-19   20-29   > 30
1st class    0       0       0.83    0.97
2nd class    0       0.88    0       1
3rd class    0       0.6     1       0
Cont...
Male + Cabin Crew
             0-9     10-19   20-29   > 30
1st class    0       0       0.33    0.15
2nd class    0       0.14    0.09    0.15
3rd class    0.11    0.21    0.125   0.24
Male + Not Cabin Crew
             0-9     10-19   20-29   > 30
1st class    0       0       0.44    0.42
2nd class    0       0.5     0.66    1
3rd class    0.2     1       0       0
Results
Measuring the model accuracy
Variables                                  Model accuracy   Random forest
Gender                                     77.1 %           77.14 %
Gender + Pclass                            76.0 %           76.02 %
Gender + Pclass + Fare                     77.9 %           77.93 %
Gender + Pclass + Fare + Cabin             76.5 %           76.51 %
Gender + Pclass + Fare + Cabin + Embark    69.1 %           69.17 %
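The percentages above came from Kaggle submissions. A similar figure can be estimated locally on labelled rows held back from training; the sketch below assumes accuracy means the share of passengers predicted correctly, and the two label lists are invented for illustration.

```python
def accuracy_percent(predicted, actual):
    """Percentage of positions where the prediction equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

# Invented labels for illustration only.
actual = [1, 0, 0, 1, 0, 1, 0, 0]
predicted = [1, 0, 1, 1, 0, 0, 0, 0]
score = accuracy_percent(predicted, actual)
```

Here six of the eight predictions match, so `score` is 75.0.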
Discussion
A simple model is not always a bad model.
Building a more sophisticated model by adding more variables
might not improve prediction accuracy: accuracy peaked at
77.9 % with three variables and fell to 69.1 % with five.
A moderate model (not too simple and not too complex) is
sufficient for developing a robust prediction system.
Thank you!!!!
Any questions are most
welcome!!!!