Data Science Project
Project Name:
Data Science with Python Minor Project
Project Description:
Flight Passenger Satisfaction Prediction
Dataset Download link:
https://drive.google.com/file/d/1yzi22_tzoCDtW0iA6cZikwVS6ogQrlPN/view?usp=sharing
Dataset Features
1. Gender: Gender of the passengers (Female, Male)
2. Customer Type: The customer type (Loyal customer, disloyal customer)
3. Age: The actual age of the passengers
4. Type of Travel: Purpose of the passenger's flight (Personal Travel, Business Travel)
5. Class: Travel class of the passenger (Business, Eco, Eco Plus)
6. Flight distance: The flight distance of this journey
7. Inflight wifi service: Satisfaction level of the inflight wifi service (0: Not Applicable; 1-5)
8. Departure/Arrival time convenient: Satisfaction level with the convenience of the departure/arrival time
9. Ease of Online booking: Satisfaction level of online booking
10. Gate location: Satisfaction level of Gate location
11. Food and drink: Satisfaction level of Food and drink
12. Online boarding: Satisfaction level of online boarding
13. Seat comfort: Satisfaction level of Seat comfort
SKILLFORGE E- LEARNING SOLUTIONS PRIVATE LIMITED
No. 1537 , 5th Main Road, Rajiv Gandhi Nagar, Sector - 7, HSR Layout, Bangalore - 560102
W | www.skillforge.in E | [email protected] M | +91 6361512442
14. Inflight entertainment: Satisfaction level of inflight entertainment
15. On-board service: Satisfaction level of On-board service
16. Leg room service: Satisfaction level of Leg room service
17. Baggage handling: Satisfaction level of baggage handling
18. Check-in service: Satisfaction level of Check-in service
19. Inflight service: Satisfaction level of inflight service
20. Cleanliness: Satisfaction level of Cleanliness
21. Departure Delay in Minutes: Delay at departure, in minutes
22. Arrival Delay in Minutes: Delay at arrival, in minutes
23. Satisfaction: Airline satisfaction level (Satisfaction, neutral or dissatisfaction)
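The rating features in the list above can be sanity-checked before any analysis. The sketch below is only illustrative: it assumes the CSV headers match the feature names exactly as written here (the real file may differ), that every rating column uses the same 0-5 scale described for the wifi feature, and the helper name `invalid_rating_counts` is hypothetical.

```python
import pandas as pd

# Column names copied from the feature list; the real CSV headers may differ.
RATING_COLUMNS = [
    "Inflight wifi service", "Departure/Arrival time convenient",
    "Ease of Online booking", "Gate location", "Food and drink",
    "Online boarding", "Seat comfort", "Inflight entertainment",
    "On-board service", "Leg room service", "Baggage handling",
    "Check-in service", "Inflight service", "Cleanliness",
]

def invalid_rating_counts(df: pd.DataFrame) -> pd.Series:
    """Count, per rating column present in df, the values outside 0..5.

    Missing values (NaN) are counted as invalid as well.
    """
    present = [c for c in RATING_COLUMNS if c in df.columns]
    valid = df[present].isin([0, 1, 2, 3, 4, 5])
    return (~valid).sum()

# Tiny hand-made frame standing in for the real dataset.
sample = pd.DataFrame({
    "Inflight wifi service": [0, 3, 5, 7],   # 7 is out of range
    "Seat comfort": [1, 2, 4, 5],
    "Age": [25, 40, 31, 58],                 # not a rating, so ignored
})
counts = invalid_rating_counts(sample)
print(counts)
```

Any column with a non-zero count is a candidate for the "Handling Inconsistent data" step.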
Steps
1. Read the train and test datasets
2. Apply data cleaning/preprocessing including
• Handling Null Values
• Handling Duplicates
• Handling Inconsistent data
• Changing feature data types
3. Split the data into numerical and categorical features
4. Apply EDA
• Uni-variate Analysis - Categorical features – Countplot and Piechart
• Bi-variate Analysis – Categorical vs Target Features – Barplot
• Uni-variate Analysis - Numerical features – KDEplot
• Bi-variate Analysis – Numerical vs Target Feature – Barplot
5. Generate heatmap to represent correlation between the features
6. Handle Outliers if any
7. Separate the train data into x (independent features) and y (target variable)
8. Split the x, y data into x_train and x_test, y_train and y_test
9. Apply the following ML Algorithms
• Logistic Regression
• Decision Tree Classifier
• Random Forest Classifier
10. Evaluate each of these models based on the classification metrics – Accuracy, Precision, Recall, F1-Score
11. Select the ML model with the best score.
12. Use the best ML model to generate predictions for the test data
13. Create an ML model Web App using Streamlit.
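Steps 2, 3 and 7 to 11 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a reference solution: a small synthetic frame stands in for the training file, the column names follow the feature list, median imputation and one-hot encoding are illustrative choices, and F1 is picked as the selection score. The brief does not prescribe any of these.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the training CSV (assumption); use pd.read_csv on the real file.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "Age": rng.integers(18, 80, n),
    "Class": rng.choice(["Business", "Eco", "Eco Plus"], n),
    "Online boarding": rng.integers(0, 6, n),
    "Arrival Delay in Minutes": rng.integers(0, 120, n).astype(float),
})
# Made-up rule so the toy target is learnable; the real target comes from the data.
df["Satisfaction"] = np.where(df["Online boarding"] >= 3,
                              "satisfied", "neutral or dissatisfied")
# Inject a few nulls so the cleaning step has something to do.
df.loc[df.index % 25 == 0, "Arrival Delay in Minutes"] = np.nan

# Step 2: basic cleaning (median imputation is one option among several).
df = df.drop_duplicates()
delay = "Arrival Delay in Minutes"
df[delay] = df[delay].fillna(df[delay].median())

# Step 3: split the feature columns into numerical and categorical.
features = df.drop(columns="Satisfaction")
numerical = features.select_dtypes(include="number").columns.tolist()
categorical = features.select_dtypes(exclude="number").columns.tolist()

# Step 7: x (one-hot encoded features) and y (binary target).
X = pd.get_dummies(features, columns=categorical, dtype=int)
y = (df["Satisfaction"] == "satisfied").astype(int)

# Step 8: hold out part of the training data for evaluation.
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Steps 9-10: fit each model and collect the four metrics.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}
scores = {}
for name, model in models.items():
    model.fit(x_train, y_train)
    pred = model.predict(x_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }

# Step 11: pick the model with the highest F1 (an assumed selection rule).
results = pd.DataFrame(scores).T
best_name = results["f1"].idxmax()
print(results.round(3))
print("Best by F1:", best_name)
```

For step 12, the fitted `models[best_name]` would then be applied to the separately provided test dataset after the same cleaning and encoding, so that its columns line up with `X`.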