:COMPUTER SCIENCE AND ENGINEERING-CS1
CODE COURSE NAME CATEGORY L T P CREDIT
222ECS056 INTRODUCTION TO MACHINE LEARNING INTERDISCIPLINARY ELECTIVE 3 0 0 3
Preamble: This course helps the learners to understand the concepts in Machine Learning.
Students will be able to understand the basics of regression, classification and clustering.
After completing this course, students will be able to develop machine learning based solutions
for real world problems in multidisciplinary environments.
Course Outcomes: After the completion of the course the student will be able to
CO 1 Illustrate the concept, purpose, scope, steps, and applications of ML techniques.
(Knowledge level : Apply)
CO 2 Understand the concepts of supervised, unsupervised and reinforcement learning to
apply in real world problems. (Knowledge level : Apply)
CO 3 Illustrate the working of classifiers and clustering techniques for typical machine
learning applications. (Knowledge level : Apply)
CO 4 Acquire skills to improve the performance of Machine Learning models using
ensemble techniques. (Knowledge level : Apply)
CO 5 Design and implement a solution for a real world problem using Machine Learning
algorithms. (Cognitive Knowledge Level: Create)
Program Outcomes ( PO)
Outcomes are the attributes that are to be demonstrated by a graduate after completing the
course.
PO1: An ability to independently carry out research/investigation and development work in
engineering and allied streams
PO2: An ability to communicate effectively, write and present technical reports on complex
engineering activities by interacting with the engineering fraternity and with society at
large.
PO3: An ability to demonstrate a degree of mastery over the area as per the specialization of
the program. The mastery should be at a level higher than the requirements in the
appropriate bachelor program
PO4: An ability to apply stream knowledge to design or develop solutions for real world
problems by following the standards
PO5: An ability to identify, select and apply appropriate techniques, resources and
state-of-the-art tools to model, analyse and solve practical engineering problems.
PO6: An ability to engage in life-long learning for the design and development related to the
stream related problems taking into consideration sustainability, societal, ethical and
environmental aspects
PO7: An ability to develop cognitive load management skills related to project management
and finance which focus on Entrepreneurship and Industry relevance.
Mapping of course outcomes with program outcomes
PO 1 PO 2 PO 3 PO 4 PO 5 PO 6 PO 7
CO 1
CO 2
CO 3
CO 4
CO5
Assessment Pattern
Bloom’s Category End Semester Examination
Apply 80%
Analyse 20%
Evaluate
Create
Mark distribution
Total Marks | CIE | ESE | ESE Duration
100 | 40 | 60 | 2.5 hours
Continuous Internal Evaluation Pattern:
Continuous Internal Evaluation : 40 marks
Micro project/Course based project : 20 marks
Course based task/Seminar/Quiz : 10 marks
Test paper, 1 no. : 10 marks
The project shall be done individually. Group projects are not permitted.
The test paper shall cover a minimum of 80% of the syllabus.
Course based task/test paper questions shall be useful in the testing of knowledge, skills,
comprehension, application, analysis, synthesis, evaluation and understanding of the students.
End Semester Examination Pattern:
Total : 60 marks
The end semester examination will be conducted by the respective College.
There will be two parts; Part A and Part B.
Part A will contain 5 numerical/short answer questions with 1 question from each module,
having 5 marks for each question. Students should answer all questions. Part B will contain 7
questions (such questions shall be useful in the testing of overall achievement and maturity of
the students in a course, through long answer questions relating to theoretical/practical
knowledge, derivations, problem solving and quantitative evaluation), with a minimum of one
question from each module, of which the student should answer any five. Each question carries
7 marks.
Total duration of the examination will be 150 minutes.
Note: The marks obtained for the ESE for an elective course shall not exceed 20% over the
average ESE mark % for the core courses. ESE marks awarded to a student for each elective
course shall be normalized accordingly.
For example if the average end semester mark % for a core course is 40, then the maximum
eligible mark % for an elective course is 40+20 = 60 %.
Course Level Assessment Questions
Course Outcome 1 (CO1):
1. Suppose 10000 patients get tested for flu; out of them, 9000 are actually healthy and 1000
are actually sick. For the sick people, a test was positive for 620 and negative for 380. For the
healthy people, the same test was positive for 180 and negative for 8820. Construct a
confusion matrix for the data and compute the precision and recall for the data.
2. Distinguish between supervised learning and Reinforcement learning. Illustrate with an
example.
3. Discuss any four examples of machine learning applications.
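As an illustration of CO1 question 1, the confusion matrix and the two requested measures can be checked with a short script. This is a minimal sketch that assumes "sick" is the positive class; the function name confusion_metrics is hypothetical and not part of the syllabus.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Return (precision, recall, accuracy) from the four confusion-matrix counts."""
    precision = tp / (tp + fp)   # of all positive test results, how many are truly sick
    recall = tp / (tp + fn)      # of all sick patients, how many tested positive
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return precision, recall, accuracy


# Counts taken from the question, treating "sick" as the positive class (assumption).
tp, fn = 620, 380     # sick patients: test positive / test negative
fp, tn = 180, 8820    # healthy patients: test positive / test negative

# Confusion matrix laid out as rows = actual class, columns = predicted class.
print("            Pred. sick  Pred. healthy")
print(f"Sick        {tp:>10}  {fn:>13}")
print(f"Healthy     {fp:>10}  {tn:>13}")

precision, recall, accuracy = confusion_metrics(tp, fn, fp, tn)
print(f"precision={precision:.3f} recall={recall:.3f} accuracy={accuracy:.3f}")
```

With these counts, precision is 620/800 = 0.775 and recall is 620/1000 = 0.62.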
Course Outcome 2 (CO2)
1. State the mathematical formulation of the SVM problem. Give an outline of the method for
solving the problem.
2. Show the final result of hierarchical clustering with complete link by drawing a
dendrogram.
Course Outcome 3(CO3):
1. Identify the first splitting attribute for the decision tree by using the ID3 algorithm with
the following dataset.
2. Consider the training data in the following table where Play is a class attribute. In the
table, the Humidity attribute has values “L” (for low) or “H” (for high), Sunny has values
“Y” (for yes) or “N” (for no), Wind has values “S” (for strong) or “W” (for weak), and
Play has values “Yes” or “No”.
What is the class label for the following day (Humidity=L, Sunny=N, Wind=W), according
to naïve Bayesian classification?
3. Explain DBSCAN algorithm for density based clustering. List out its advantages
compared to K-means.
4. Explain how Support Vector Machine can be used for classification of linearly separable
data.
5. Define Hidden Markov Model. What is meant by the evaluation problem and how is this
solved?
6. Use K-means clustering to cluster the following data into two groups. Assume the initial
cluster centroids are m1=2 and m2=4. The distance function used is Euclidean distance.
{ 2, 4, 10, 12, 3, 20, 30, 11, 25 }
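The K-means exercise in question 6 can be traced with a few lines of code. The sketch below is one possible implementation for one-dimensional data, where Euclidean distance reduces to the absolute difference. Two choices are assumptions of this sketch, not stated in the question: a point equidistant from both centroids (here the value 3) is assigned to the first centroid, and iteration stops when the centroids no longer change. The name kmeans_1d is hypothetical.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Plain K-means on 1-D data; returns (clusters, centroids)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        # list.index(min(...)) picks the first centroid on ties (assumption).
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [abs(p - c) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: each centroid becomes the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:   # converged: no centroid moved
            break
        centroids = new_centroids
    return clusters, centroids


data = [2, 4, 10, 12, 3, 20, 30, 11, 25]
clusters, centroids = kmeans_1d(data, [2, 4])
print(sorted(clusters[0]), sorted(clusters[1]), centroids)
```

Under these assumptions the run converges to the groups {2, 3, 4, 10, 11, 12} and {20, 25, 30}, with centroids 7 and 25.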
Course Outcome 4 (CO4):
1. Explain how Random Forests produce output for classification and regression
problems.
2. Is Random Forest an ensemble algorithm?
3. Why is the training efficiency of Random Forest better than Bagging?
Model Question Paper
Reg No: _______________
Name: _________________ PAGES: 4
APJ ABDUL KALAM TECHNOLOGICAL UNIVERSITY
SECOND SEMESTER M.TECH DEGREE EXAMINATION, MONTH & YEAR
Course Code: 222ECS056
Course Name: INTRODUCTION TO MACHINE LEARNING
Max. Marks: 60 Duration: 2.5 Hours
PART A
Answer All Questions. Each Question Carries 5 Marks
1. Bias-Variance trade-off is a design consideration while training the machine
learning model. Justify.
2. Derive the expression for sigmoid function associated with Logistic
Regression.
3. How does the optimal separating hyperplane contribute to the accuracy of
predictions using SVM? Justify how kernel functions are used in linearly
inseparable problems.
4. Discuss how good DBSCAN is in clustering data points available in dense
Euclidean space.
5. Suggest an ensemble method that generates one classifier per round (5x5=25)
Part B
(Answer any five questions. Each question carries 7 marks)
6. Explain various Cost Functions associated with Regression &
Classification. (7)
7. Weight updation contributes to the performance of a Neural Network
model. Justify the statement using the Back propagation algorithm. (7)
8. Compute the Principal Components for the 2D data: (7)
X=(x1,x2)={(1,2),(3,3),(3,5),(5,4),(5,6),(6,5),(8,7),(9,8)}
9. Using Naïve Bayes algorithm, predict whether a Red color car which
is imported as a Sports category will be stolen or not. (7)
10. Construct Dendrograms based on Complete Linkage and Average
Linkage. (7)
11. Perform k-means algorithm on the data given in qn.8
(given no. of clusters = 2, iterations = 2). (7)
12. (a) If P(Rain) = 0.4 and P(Dry) = 0.6 compute the probability for the
sequence “Rain, Rain, Dry, Dry”. (3)
(b) Elucidate the three basic problems of HMM. (4)
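For question 8, the principal components of a two-dimensional data set can be obtained in closed form from the 2x2 covariance matrix. The following is a minimal sketch, not a prescribed solution method. It assumes the sample covariance (division by n-1) and a non-zero covariance between x1 and x2; the function name pca_2d is hypothetical.

```python
import math


def pca_2d(points):
    """Return (mean, eigenvalues, unit eigenvectors) of the 2x2 sample covariance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sample covariance entries (assumption: divide by n - 1, not n).
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2
    root = math.hypot((sxx - syy) / 2, sxy)
    eigenvalues = (half_trace + root, half_trace - root)   # largest first
    # For sxy != 0, (sxy, lam - sxx) is an eigenvector for eigenvalue lam.
    components = []
    for lam in eigenvalues:
        vx, vy = sxy, lam - sxx
        norm = math.hypot(vx, vy)
        components.append((vx / norm, vy / norm))
    return (mean_x, mean_y), eigenvalues, components


X = [(1, 2), (3, 3), (3, 5), (5, 4), (5, 6), (6, 5), (8, 7), (9, 8)]
mean, eigenvalues, components = pca_2d(X)
print("mean:", mean)
print("eigenvalues:", eigenvalues)
print("first principal component:", components[0])
print("second principal component:", components[1])
```

The first returned direction corresponds to the larger eigenvalue and is therefore the first principal component. Dividing by n instead of n-1 scales the eigenvalues but leaves the directions unchanged.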
Syllabus
Module Contents Hours
I Overview of machine learning: supervised, semi-supervised,
unsupervised learning, reinforcement learning. Types of ML
problems: Classification, Clustering and Regression, Cost functions:
Definition and Types, Data PreProcessing, Bias-Variance trade off,
Cross validation techniques, Classifier performance measures, ROC
Curves (Hours: 6)
II Introduction to neural network: Linear Regression, Least square
Gradients, Logistic Regression, Sigmoid function &
differentiation, Logistic Regression – Regularization, Neural
Networks – Concept of perceptron and Artificial neuron, Weight
initialization techniques, Feed Forward Neural Network, Back
Propagation algorithm (Hours: 8)
III Classification Methods: Support Vector Machine, Optimal
Separating hyper plane, Kernel trick, Kernel functions, Gaussian
class conditional distribution, Bayes Rule, Naïve Bayes Model,
Decision Tree – ID3, Maximum Likelihood estimation techniques (Hours: 8)
IV Clustering Methods: K-means clustering, Hierarchical clustering
techniques, Density Based clustering, Feature Selection techniques:
Entropy, Correlation Coefficient, Chi-square Test, Forward &
Backward Selection, Dimensionality Reduction: PCA, LDA, t-SNE (Hours: 7)
V Basics of graphical models - Bayesian networks, Hidden Markov
model, Ensemble methods – Boosting, Bagging, Random forest,
XGBoost (Case study) (Hours: 6)
Lesson Plan
1 Introduction to machine learning (Hours: 6)
1.1 Overview of machine learning: supervised, semi-supervised,
unsupervised learning, reinforcement learning 1
1.2 Types of ML problems: Classification, Clustering and Regression 1
1.3 Cost functions: Definition and Types 1
1.4 Data PreProcessing, Bias-Variance trade off 1
1.5 Cross validation techniques 1
1.6 Classifier performance measures, ROC Curves 1
2 Introduction to neural network (Hours: 8)
2.1 Linear Regression, Least square Gradients 1
2.2 Logistic Regression 1
2.3 Sigmoid function & differentiation 1
2.4 Logistic Regression - Regularization 1
2.5 Neural Networks – Concept of perceptron and Artificial neuron 1
2.6 Weight initialization techniques 1
2.7 Feed Forward Neural Network 1
2.8 Back Propagation algorithm 1
3 Classification Methods (Hours: 8)
3.1 Support Vector Machine 1
3.2 Optimal Separating hyper plane 1
3.3 Kernel trick, Kernel functions 1
3.4 Gaussian class conditional distribution 1
3.5 Bayes Rule 1
3.6 Naïve Bayes Model 1
3.7 Decision Tree – ID3 1
3.8 Maximum Likelihood estimation techniques 1
4 Clustering Methods (Hours: 7)
4.1 K-means clustering 1
4.2 Hierarchical clustering techniques 1
4.3 BIRCH 1
4.4 Density Based clustering 1
4.5 Feature Selection techniques: Entropy, Correlation Coefficient,
Chi-square Test 1
4.6 Forward & Backward Selection 1
4.7 Dimensionality Reduction: PCA 1
4.8 LDA 1
4.9 t-SNE 1
5 Basics of graphical models (Hours: 6)
5.1 Basics of graphical models - Bayesian networks 1
5.2 Hidden Markov model 1
5.3 Ensemble methods - Boosting 1
5.4 Bagging 1
5.5 Random forest 1
5.6 XGBoost (Case study) 1
Reference Books
Learning)”, MIT Press, 2004.
2. Kevin Murphy, Machine Learning: A Probabilistic Perspective (MLAPP), MIT Press,
2012
3. Han, Jiawei, and Micheline Kamber. Data Mining: Concepts and Techniques. San
Francisco: Morgan Kaufmann Publishers
4. Christopher M. Bishop, “Pattern Recognition and Machine Learning”, Springer, 2006
CODE COURSE NAME CATEGORY L T P CREDIT
222ECS057 DATA STRUCTURES INTERDISCIPLINARY ELECTIVE 3 0 0 3
Preamble: The purpose of the syllabus is to create awareness about Data Structures and their
applications. After the completion of the course, the learners should be able to either use existing
data structures or design their own data structures to solve real world problems.
Course Outcomes: After the completion of the course the student will be able to
CO1 Design algorithms for a task and calculate the time complexity of that algorithm
(Cognitive Knowledge Level: Apply)
CO2 Use arrays and linked lists for problem solving (Cognitive Knowledge Level: Apply)
CO3 Represent data using trees, graphs and manipulate them to solve computational
problems. (Cognitive Knowledge Level:Apply)
CO4 Make use of appropriate sorting algorithms to order data based on the situation.
(Cognitive Knowledge Level: Apply)
CO5 Design and Implement appropriate Data Structures for solving a real world problem
(Cognitive Knowledge Level: Create)
Program Outcomes ( PO)
Outcomes are the attributes that are to be demonstrated by a graduate after completing the course.
PO1: An ability to independently carry out research/investigation and development work in
engineering and allied streams
PO2: An ability to communicate effectively, write and present technical reports on complex
engineering activities by interacting with the engineering fraternity and with society at large.
PO3: An ability to demonstrate a degree of mastery over the area as per the specialization of the
program. The mastery should be at a level higher than the requirements in the appropriate
bachelor program
PO4: An ability to apply stream knowledge to design or develop solutions for real world problems
by following the standards
PO5: An ability to identify, select and apply appropriate techniques, resources and state-of-the-art
tool to model, analyse and solve practical engineering problems.
PO6: An ability to engage in life-long learning for the design and development related to the