22-382-0203 MACHINE LEARNING
CATEGORY L T P CREDIT
CORE 3 1 0 4
Prerequisite: Basic concepts related to data
Course Outcomes: After completing the course, the student will be able to:
CO 1 Describe various data reduction and transformation methods. (Cognitive level: Understand)
CO 2 Solve problems related to NN and DNN. (Cognitive level: Apply)
CO 3 Apply association rule mining algorithms for frequent pattern mining. (Cognitive level: Apply)
CO 4 Apply various regression, classification and clustering algorithms. (Cognitive level: Apply)
CO 5 Compare the performance of various machine learning algorithms. (Cognitive level: Analyze)
Mapping of Course Outcomes with Programme Outcomes - Low=1, Medium=2, High=3
PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
CO 1 2 1
CO 2 3 2
CO 3 3 2 1 3 2
CO 4 3 2 3 3 2
CO 5 3 2 2 3 3 2
22-382-0203 MACHINE LEARNING
UNIT I (10 Hours)
Foundations of Learning - Components of learning – learning versus design – Introduction to
Machine Learning - characteristics of machine learning – learning models – types of learning –
training versus testing; Exploratory Data Analysis – mean, median, mode, quartile deviation,
visualizing numeric variables – boxplots, histograms, understanding categorical data – binomial and
multinomial distributions, understanding numeric data – uniform, normal and chi-square
distributions; Data Pre-processing - Data Cleaning, Missing Values, Outliers, Noisy Data; Data
Transformation and Discretization – Data Transformation Strategies, Data Transformation by
Normalization, various methods of Discretization.
UNIT II (8 Hours)
Association rule mining - Associations and Correlations, Market Basket Analysis, Frequent
Itemsets and Association Rules, Mining Methods – The Apriori Algorithm, Generating Association
Rules from Frequent Itemsets, Finding Frequent Itemsets without Candidate Generation,
FP-Growth, FP-Tree.
UNIT III (7 Hours)
Regression and Classification - Regression – Simple Linear Regression, Multiple Regression,
Assessing Performance, bias-variance dichotomy, overfitting and underfitting, regularization.
Classification - Decision tree induction, Bayes Classification, Rule Based Classification, Model
evaluation and selection, Advanced Classification methods – Bayesian classification, Support
Vector Machines, Ensemble methods of classification, gradient boosting.
UNIT IV (10 Hours)
Cluster Analysis - Overview of Clustering Methods, Distance Measures, Partitioning methods -
k-Means, k-Medoids; Hierarchical methods - Agglomerative versus Divisive Clustering, BIRCH,
Chameleon; Density-based methods - DBSCAN; Grid-based methods – STING; Evaluation of
Clustering. KNN algorithm.
UNIT V (10 Hours)
Neural Networks - Biological neuron, idea of computational units, McCulloch–Pitts unit and
Threshold logic, Linear Perceptron, Multilayer Perceptron, Perceptron Learning Algorithm, Linear
separability; loss functions – various types, hyperparameter tuning, Feed Forward Neural
Networks, Forward propagation, activation functions and their derivatives, backpropagation and
optimization functions, batch normalization, implementation.
TEXT BOOKS
1. Jiawei Han, Micheline Kamber, Jian Pei, “Data Mining - Concepts and Techniques” - Morgan
Kaufmann Publishers, Third Edition, 2012.
2. T. M. Mitchell, “Machine Learning”, McGraw Hill, 2017.
REFERENCES
1. Ian H. Witten, Eibe Frank, “Data Mining - Practical Machine Learning Tools and Techniques”,
Morgan Kaufmann Publishers, Third Edition, 2011.
2. Soman, Divakar and Ajay, “Data Mining – Theory and Practice”, PHI, 2006.
3. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, “Introduction to Data Mining”, Pearson
Addison Wesley, 2006.
4. Arun K Pujari, “Data Mining Techniques”, Universities Press, 2001.
5. Margaret H Dunham, “Data Mining: Introductory and Advanced Topics”, Pearson Education
India, 2006.