Titanic Survival Prediction Model

The document summarizes a study on predicting survival on the Titanic using machine learning. It introduces the topic and describes the approach, which uses a training dataset, a test dataset, and the random forest algorithm. It then reports results for models built from different combinations of variables, such as gender, class, fare and cabin, and compares their accuracy with a gender-only model: the best models reach about 77 to 78% accuracy, while the five-variable model drops to about 69%. It concludes that a moderately complex model is sufficient and that adding too many variables does not help.


Introduction

The Model
Results

Predicting survival on the Titanic.

Moderator: C. Fodya.

Group Members:

M. Durojaye, R. Rakotonirainy, S. Shabalala, A. Akinyelu, D. Raphulu, S. Simelane.

Graduate Student Workshop.

January 11, 2014

Titanic MISG 2014



Outline

1 Introduction

2 The Model
Approach
Analysis

3 Results


Figure: The sinking Titanic (Photo: D. Paris)


Introduction

On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg.
1502 out of 2224 passengers and crew died.
This sensational tragedy led to better safety regulations for ships.


Introduction (cont.)

One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew.
Some groups of people, such as women, children, and the upper class, were more likely to survive than others.


Figure: Schematic of the Titanic (Photo: D. Raphulu)


Material

We work with two datasets: training data and test data.
For the training set, all information for each passenger is given, including whether the passenger survived.


In the test data set, the survival column is empty.


Main Purpose

Our main aim is to fill in the survival column of the test data set.
How?
  Find patterns in the training data and build models from them.
  Use the models to predict survival for the test data.
Tools and algorithms
  Python, Excel and C#.
  Random forest is the machine learning algorithm used.
Testing
  Model accuracy was measured by submitting predictions to the Kaggle competition.
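
The workflow above (learn from the training data, then fill in the survival column of the test data) can be sketched in pure Python. This is a deliberately tiny random forest (bootstrap samples, a random subset of features at each split, majority vote), not the implementation the group used; the slides do not show their code. The passenger records, the feature names `sex` and `pclass`, and every function name here are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def build_tree(rows, labels, features, rng, depth=0, max_depth=3):
    """Grow one tree on categorical features; leaves are 0/1 labels."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1 or not features:
        return majority
    # Consider only a random subset of the features at this split.
    k = max(1, int(len(features) ** 0.5))
    best_feature, best_score = None, None
    for f in rng.sample(features, k):
        groups = {}
        for row, y in zip(rows, labels):
            groups.setdefault(row[f], []).append(y)
        score = sum(len(g) * gini(g) for g in groups.values()) / len(labels)
        if best_score is None or score < best_score:
            best_feature, best_score = f, score
    remaining = [f for f in features if f != best_feature]
    children = {}
    # Sorted so that the random draws happen in a reproducible order.
    for value in sorted({row[best_feature] for row in rows}):
        idx = [i for i, row in enumerate(rows) if row[best_feature] == value]
        children[value] = build_tree([rows[i] for i in idx],
                                     [labels[i] for i in idx],
                                     remaining, rng, depth + 1, max_depth)
    return (best_feature, children, majority)

def predict_tree(tree, row):
    while isinstance(tree, tuple):
        feature, children, majority = tree
        # Fall back to the node's majority label for unseen values.
        tree = children.get(row[feature], majority)
    return tree

def fit_forest(rows, labels, features, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], features, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

# Hypothetical toy training data: every woman survives, every man dies.
train_rows = [{"sex": s, "pclass": c} for s in ("female", "male")
              for c in (1, 2, 3) for _ in range(5)]
train_labels = [1 if row["sex"] == "female" else 0 for row in train_rows]

forest = fit_forest(train_rows, train_labels, ["sex", "pclass"])

# Fill in the empty survival column of a (toy) test set.
test_rows = [{"sex": "female", "pclass": 3}, {"sex": "male", "pclass": 1}]
for row in test_rows:
    row["survived"] = predict_forest(forest, row)
```

In practice a library implementation would normally replace the hand-written trees; the sketch only shows where the training data, the model and the empty survival column fit together.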


Variable Rank

We ranked each variable by its correlation with survival.

Variable   Correlation coefficient
Gender     0.5434
PClass     0.3385
Cabin      0.3196
Fare       0.257
Embark     0.1018
Parch      0.0816
Age        0.0772
SibSp      0.0353
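
A ranking of this kind can be sketched as follows. The slides do not say which correlation coefficient was used or how categorical variables were encoded, so this sketch assumes Pearson correlation on numerically encoded columns, ranked by absolute value. The column names and the six toy passengers are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return cov / sqrt(var_x * var_y)

def rank_variables(columns, survived):
    """Sort variables by |correlation with survival|, strongest first."""
    scores = {name: abs(pearson(values, survived))
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data: one entry per passenger.
survived = [1, 1, 0, 0, 1, 0]
columns = {
    "sibsp":     [1, 0, 1, 0, 0, 1],
    "is_female": [1, 1, 0, 0, 1, 0],  # gender encoded as 0/1 (assumption)
}
ranking = rank_variables(columns, survived)
```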


Figure: Values of each variable. Gender: M; PClass: 1st, 2nd, 3rd; Fare: a, d, c; Cabin: yes, no; Embark: S, Q, C.


Single Variable

Predicting survival using gender.

Gender   Survival rate
Women    0.74
Men      0.18
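
A single-variable model of this kind can be sketched as: compute the survival rate of each group in the training data, then predict survival for a group whose rate exceeds a threshold. The 0.5 threshold, the record layout and the toy passengers are assumptions, not taken from the slides.

```python
from collections import defaultdict

def survival_rates(passengers, key):
    """Fraction of survivors in each group of passengers[key]."""
    totals = defaultdict(lambda: [0, 0])  # group -> [survivors, count]
    for p in passengers:
        totals[p[key]][0] += p["survived"]
        totals[p[key]][1] += 1
    return {group: s / n for group, (s, n) in totals.items()}

def predict(rates, group, threshold=0.5):
    """Predict 1 (survived) if the group's rate exceeds the threshold."""
    return 1 if rates[group] > threshold else 0

# Hypothetical toy training data: 3 of 4 women and 1 of 5 men survive.
train = (
    [{"sex": "female", "survived": 1}] * 3
    + [{"sex": "female", "survived": 0}]
    + [{"sex": "male", "survived": 1}]
    + [{"sex": "male", "survived": 0}] * 4
)
rates = survival_rates(train, "sex")

# The same rule applied to the rates reported in the table above.
slide_rates = {"Women": 0.74, "Men": 0.18}
slide_predictions = {g: predict(slide_rates, g) for g in slide_rates}
```

With the rates from the table, this rule predicts that every woman survives and every man does not.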


Three variables combination

Female

Class / Age   0-9    10-19   20-29   > 30
1st class     0      0       0.8     0.97
2nd class     0      0.91    0.9     1
3rd class     0.59   0.58    0.3     0.12

Male

Class / Age   0-9    10-19   20-29   > 30
1st class     0      0       0.4     0.38
2nd class     0      0.15    0.16    0.21
3rd class     0.11   0.23    0.13    0.24
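
The tables above can be used directly as a lookup model. The sketch below copies the values from the two tables and predicts survival when the rate in the matching cell exceeds 0.5. The threshold, the function names, and placing age 30 in the last bin are assumptions; the slides do not say whether a 0 means a true zero rate or a cell with no passengers, and passengers with a missing age are not handled.

```python
# Survival rates by class (key) and age bin (list index), from the tables.
# Age bins: 0-9, 10-19, 20-29, 30 and over (assumed to include age 30).
RATES = {
    "female": {1: [0, 0, 0.8, 0.97],
               2: [0, 0.91, 0.9, 1],
               3: [0.59, 0.58, 0.3, 0.12]},
    "male":   {1: [0, 0, 0.4, 0.38],
               2: [0, 0.15, 0.16, 0.21],
               3: [0.11, 0.23, 0.13, 0.24]},
}

def age_bin(age):
    """Index of the ten-year age bin, capped at the last bin."""
    if age < 0:
        raise ValueError("age must be non-negative")
    return min(int(age // 10), 3)

def survival_rate(gender, pclass, age):
    return RATES[gender][pclass][age_bin(age)]

def predict(gender, pclass, age, threshold=0.5):
    """Predict 1 (survived) if the cell's rate exceeds the threshold."""
    return 1 if survival_rate(gender, pclass, age) > threshold else 0
```

For example, `predict("female", 1, 35)` looks up the cell with rate 0.97 and returns 1, while `predict("male", 1, 25)` looks up 0.4 and returns 0.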


Four variables combination

Female + Cabin crew

Class / Age   0-9    10-19   20-29   > 30
1st class     0      0       0       1
2nd class     0      0.92    0.9     1
3rd class     0.59   0.57    0.3     0.125

Female + Not cabin crew

Class / Age   0-9    10-19   20-29   > 30
1st class     0      0       0.83    0.97
2nd class     0      0.88    0       1
3rd class     0      0.6     1       0


Four variables combination (cont.)

Male + Cabin crew

Class / Age   0-9    10-19   20-29   > 30
1st class     0      0       0.33    0.15
2nd class     0      0.14    0.09    0.15
3rd class     0.11   0.21    0.125   0.24

Male + Not cabin crew

Class / Age   0-9    10-19   20-29   > 30
1st class     0      0       0.44    0.42
2nd class     0      0.5     0.66    1
3rd class     0.2    1       0       0


Results

Measuring the model accuracy

Variables         Model accuracy   Random forest
Gender            77.1 %           77.14 %
Gender + Pclass   76.0 %           76.02 %
3                 77.9 %           77.93 %
4                 76.5 %           76.51 %
5                 69.1 %           69.17 %

3 = Gender + Pclass + Fare
4 = Gender + Pclass + Fare + Cabin
5 = Gender + Pclass + Fare + Cabin + Embark
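
Accuracy here is the fraction of passengers whose survival is predicted correctly. Kaggle computes it against labels that participants cannot see, so the sketch below uses hypothetical predictions and labels only to show the calculation.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one label is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Hypothetical example: three of four predictions are correct.
score = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```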


Discussion

A simple model is not always a bad model: the gender-only model reaches 77.1 % accuracy.
Building a more sophisticated model by adding more variables does not necessarily improve prediction accuracy: the five-variable model reaches only 69.1 %.
A moderate model (not too simple and not too complex) is sufficient for developing a robust prediction system.


Thank you!
Any questions are most welcome!
