
EE353 / EE769 Introduction to Machine Learning (Jul 2024 edition)

Electrical Engineering, Indian Institute of Technology Bombay

Programming Assignment 3: Classification and Feature Engineering

Instructions:

a) Name your files A3_<RollNo>.<extension>. Submit three files (or links to them): an .ipynb notebook, a .py script,
and a video demo (approximately 10 minutes).

b) Use good coding practices, such as avoiding hard-coding, using self-explanatory variable names, and using
functions where applicable. This will also be graded.

c) You may use libraries such as scikit-learn, and need not code anything from scratch.

d) If you use code from the internet, cite your sources (line by line or block by block), and clarify what you have
modified.

Objective 1: Practice the various steps and due diligence needed to train successful classification models.

Background: Banks and other businesses run marketing campaigns to nudge existing or potential customers to take
particular actions; if a customer takes the desired action, the campaign was successful for that customer. To
optimize the marketing budget across campaigns, it would be very useful to predict the customers for whom a
particular campaign will succeed. This prediction can be based on past data on the success or failure of similar
campaigns. You are tasked with building and testing such a model using a dataset available on Kaggle at:
https://www.kaggle.com/datasets/janiobachmann/bank-marketing-dataset/data

1. Perform exploratory data analysis (a minimal EDA sketch is given after this task list) to find out: [1.5]

a. Which variables are usable, and which are not? Why?

b. Are there significant correlations or other relations among variables?

c. Are the classes balanced? The classes are in the poutcome column.

d. Which classes will you use?

2. Select the metrics that you will use, such as accuracy, F1 score, balanced accuracy, or AUC, and state the reason
for your choice. [0.5]

3. Develop a strategy to filter and encode the variables (a preprocessing sketch follows the task list). [2]

a. Should you use continuous variables as they are, normalize them, or apply a transform? Why?

b. Should you use all values of the discrete variables, or should you try to reduce them by combining
some of the values?

c. Are some variables very likely to be unreliable, noisy, or otherwise immaterial?

4. Carve out some test data. Should this be balanced in some way? [1]

5. Use five-fold cross-validation (you can use GridSearchCV from scikit-learn) to find reasonable hyperparameter
settings for the following model types (see the grid-search sketch after this task list):

a. RBF kernel SVM with kernel width and regularization as hyperparameters [1]
b. Neural network with a single ReLU hidden layer and softmax output (hyperparameters: number of
neurons, weight decay) [1]

c. Random forest (max tree depth, max number of variables per node) [1]

6. Check feature importance for each model to see whether the same variables are important across models. Read up
on how to compute feature importance. [1]

7. See whether systematically removing some features improves your models (e.g., using recursive feature
elimination: https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.RFECV.html). [1]

8. Finally, test a few promising models on the test data. Is the model useful for the business? [1]

9. See whether the model still works if you separate the training and test data in at least two pathological ways:

a. All the training calls were in months other than June and July, while the testing was in June and July.
If the test results are worse, then speculate on reasons why. [1]

b. All the training calls were for professions other than technician, while testing was on technicians. Is
there a profession close to technician that can be used as a substitute? [1]
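
The code sketches below are illustrative, not required solutions. They assume the Kaggle CSV is saved locally as
bank.csv and that the column names (including poutcome as the target) match the dataset description on Kaggle;
adjust the names to whatever your downloaded file actually contains. A minimal EDA sketch for step 1:

import pandas as pd

# Load the Kaggle bank-marketing CSV (file name is an assumption; adjust to your download).
df = pd.read_csv("bank.csv")

# Basic usability checks: types, missing values, and cardinality of each column.
print(df.dtypes)
print(df.isna().sum())
print(df.nunique())

# Correlations among numeric variables.
print(df.select_dtypes(include="number").corr().round(2))

# Class balance of the target column (poutcome, as stated in the assignment).
print(df["poutcome"].value_counts(normalize=True))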
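
One possible preprocessing and splitting setup for steps 3 and 4; the numeric and categorical column lists below are
placeholders and should come from your own EDA:

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.read_csv("bank.csv")  # file name is an assumption

# Placeholder feature lists -- decide these from your own EDA in step 1.
numeric_cols = ["age", "balance", "campaign", "previous"]
categorical_cols = ["job", "marital", "education", "housing", "loan"]

X = df[numeric_cols + categorical_cols]
y = df["poutcome"]

# Standardize continuous variables, one-hot encode discrete ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

# A stratified split keeps the class proportions similar in the train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)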
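
A sketch of the five-fold grid search for step 5, reusing preprocess, X_train, and y_train from the previous sketch;
the parameter grids and the f1_macro scoring choice are illustrative, not prescribed:

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

models = {
    # RBF SVM: gamma controls the kernel width, C the regularization strength.
    "svm": (SVC(kernel="rbf"),
            {"clf__C": [0.1, 1, 10], "clf__gamma": [0.01, 0.1, 1]}),
    # Single ReLU hidden layer; alpha is the L2 weight-decay term.
    "mlp": (MLPClassifier(activation="relu", max_iter=1000),
            {"clf__hidden_layer_sizes": [(16,), (64,), (256,)], "clf__alpha": [1e-4, 1e-2, 1]}),
    # Random forest: tree depth and number of candidate features per split.
    "rf": (RandomForestClassifier(n_estimators=200),
           {"clf__max_depth": [4, 8, None], "clf__max_features": ["sqrt", 0.5]}),
}

best = {}
for name, (clf, grid) in models.items():
    pipe = Pipeline([("prep", preprocess), ("clf", clf)])
    search = GridSearchCV(pipe, grid, cv=5, scoring="f1_macro", n_jobs=-1)
    search.fit(X_train, y_train)
    best[name] = search
    print(name, search.best_params_, round(search.best_score_, 3))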
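
For steps 6 and 7, permutation importance works with any fitted estimator, while RFECV needs a model that exposes
feature_importances_ or coef_ (a random forest is a convenient choice). A sketch continuing from the objects above:

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.inspection import permutation_importance

# Permutation importance on the held-out data; works for any of the fitted grid searches.
result = permutation_importance(
    best["rf"].best_estimator_, X_test, y_test,
    scoring="f1_macro", n_repeats=10, random_state=0
)
for col, imp in sorted(zip(X_test.columns, result.importances_mean), key=lambda t: -t[1]):
    print(f"{col}: {imp:.4f}")

# Recursive feature elimination with cross-validation on the encoded features.
Xt_train = preprocess.fit_transform(X_train)
rfecv = RFECV(RandomForestClassifier(n_estimators=200, random_state=0), cv=5, scoring="f1_macro")
rfecv.fit(Xt_train, y_train)
print("optimal number of encoded features:", rfecv.n_features_)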
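
A sketch of the first pathological split in step 9, assuming the month column holds lowercase three-letter
abbreviations ("jun", "jul") as in the Kaggle dataset description; the profession split in 9b can be built the same
way on the job column:

from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline

# Reuses df, X, y, and preprocess from the sketches above.
# Train on calls made outside June/July, test only on June/July calls.
is_summer = df["month"].isin(["jun", "jul"])

X_tr, y_tr = X[~is_summer], y[~is_summer]
X_te, y_te = X[is_summer], y[is_summer]

pipe = Pipeline([
    ("prep", preprocess),
    ("clf", RandomForestClassifier(n_estimators=200, random_state=0)),
])
pipe.fit(X_tr, y_tr)
print("accuracy on unseen months:", round(pipe.score(X_te, y_te), 3))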

Objective 2: Practice using pre-trained neural networks to extract domain-specific features for new tasks.

10. Read the PyTorch tutorial on using a pre-trained “ConvNet as fixed feature extractor” at
https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html; you can ignore the “Finetuning the
ConvNet” section. After eliminating the code blocks that you do not need, test the code to see whether it runs
properly in your environment. [1]
11. Write a function that outputs ResNet18 features for a given input image (a sketch appears after this task list).
Extract features for the training images (in image_datasets['train']). You should get an N×512-dimensional array. [1]
12. Compare L2-regularized logistic regression and a random forest (do a grid search on max depth and number of
trees). Test the final model on the test data and report the results: accuracy and F1 score. [2]
13. Summarize your findings and write your references. [1]
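
For steps 10 and 11, one way to turn a pre-trained ResNet18 into a fixed 512-dimensional feature extractor is to
replace its final fully connected layer with an identity mapping. This is a sketch rather than the tutorial's exact
code; it assumes a recent torchvision and that image_datasets['train'] has been built as in the tutorial:

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import models

# Load an ImageNet-pretrained ResNet18 and drop the classification head,
# so forward() returns the 512-dimensional pooled features.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Identity()
model.eval()

@torch.no_grad()
def extract_features(images: torch.Tensor) -> torch.Tensor:
    """Return an (N, 512) feature tensor for a batch of preprocessed images."""
    return model(images)

# Extract features for the training images from the tutorial's image_datasets.
loader = DataLoader(image_datasets["train"], batch_size=32, shuffle=False)
features, labels = [], []
for batch_images, batch_labels in loader:
    features.append(extract_features(batch_images))
    labels.append(batch_labels)
features = torch.cat(features).numpy()   # shape (N, 512)
labels = torch.cat(labels).numpy()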
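
A sketch for step 12, comparing L2-regularized logistic regression with a grid-searched random forest on the
extracted features; test_features and test_labels are assumed to have been produced by the same extractor on your
held-out images:

from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV

# L2-regularized logistic regression on the 512-dimensional ResNet18 features.
logreg = LogisticRegression(penalty="l2", C=1.0, max_iter=2000)
logreg.fit(features, labels)

# Random forest with a grid search over max depth and number of trees.
rf_search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    {"max_depth": [4, 8, None], "n_estimators": [100, 300, 500]},
    cv=5,
    scoring="f1_macro",
)
rf_search.fit(features, labels)

# Evaluate both models on held-out features extracted the same way.
for name, clf in [("logreg", logreg), ("random forest", rf_search.best_estimator_)]:
    pred = clf.predict(test_features)
    print(name,
          "accuracy:", round(accuracy_score(test_labels, pred), 3),
          "F1:", round(f1_score(test_labels, pred, average="macro"), 3))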
