Machine Learning (4 Credits)
UNIT 1
Introduction: Learning, Machine Learning, Machine Learning Applications, History of ML, Life cycle of
Machine Learning, Machine Learning and Data Science, AI, Types of Learning, Supervised Machine
Learning, Unsupervised Machine Learning, Supervised vs Unsupervised Learning, Advantages of Machine
Learning, Disadvantages of Machine Learning, Install Anaconda & Python, AI vs Machine Learning, How to
Get Datasets, Data Pre-processing.
UNIT 2
Regression: Supervised Learning; Regression Analysis, Linear Regression, Simple Linear Regression, Multiple
Linear Regression, Polynomial Regression, Underfitting and Overfitting, Advantages of Using Linear
Regression, Limitations of Linear Regression, Logistic Regression.
UNIT 3
Decision tree learning: Classification, Logistic Regression, Decision tree learning, Types of Decision Tree
(Classification, Regression), Decision tree learning algorithm, Advantages of Decision tree learning, Entropy,
Information gain, Issues in Decision tree learning.
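Illustrative sketch for the Entropy and Information gain topics above. It is not part of the official syllabus text. It assumes the standard Shannon entropy in bits, and the function names and toy labels are hypothetical:

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(labels, groups):
    """Entropy of the parent node minus the size-weighted entropy of the child groups."""
    total = len(labels)
    weighted = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - weighted


# Toy data (hypothetical): four examples, two per class.
labels = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]   # each child is pure
useless_split = [["yes", "no"], ["yes", "no"]]   # each child is as mixed as the parent

gain_perfect = information_gain(labels, perfect_split)
gain_useless = information_gain(labels, useless_split)
```

A decision tree learner would typically prefer the attribute whose split gives the larger gain, here the first one.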
Support vector machine: Introduction, Types of support vector kernel (linear kernel, polynomial kernel,
and radial basis function kernel), Hyperplane (decision surface), Properties of SVM, Issues in SVM, Random
Forest.
UNIT 4
Bayesian learning: Probability Fundamentals (joint probability, conditional probability), Bayes' theorem,
Concept learning, Naïve Bayes classifier and its applications.
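Illustrative sketch for the Bayes theorem topic above, assuming a binary hypothesis H and a single piece of evidence E, so that P(H|E) = P(E|H) P(H) / P(E). The function name and the numbers are hypothetical:

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) via Bayes' theorem for a binary hypothesis H.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    # Total probability of the evidence: P(E) = P(E|H)P(H) + P(E|not H)P(not H)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% base rate, 90% detection rate, 5% false-positive rate.
p = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
```

With these assumed numbers the posterior stays well below 50%, because the low prior dominates the fairly high detection rate.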
Clustering: k-means clustering, k-Nearest Neighbor Learning, Association rule learning, Apriori algorithm,
Neural networks.
Big Data (2/4 Credits)
UNIT I
Introduction to Big Data: Big data definition, Difference between Traditional data and big data, Evolution of
Big Data, The Sources of Big Data, Types of Big Data, Advantages of Big Data (Features), Applications of Big
Data, Big Data Case Studies, Challenges with Big Data.
UNIT II
What is Hadoop: History of Hadoop, Modules of Hadoop, Hadoop Architecture, Hadoop Distributed File
System, Advantages of Hadoop, HDFS, Where to Use HDFS, Where Not to Use HDFS, HDFS Concepts, HDFS
Features and Goals.
UNIT III
YARN: Components of YARN, Benefits of YARN, MapReduce and the New Software Stack: Distributed File
Systems, MapReduce, Algorithms Using MapReduce, Complexity Theory for MapReduce.
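Illustrative sketch for the MapReduce topic above: a single-process word count that mimics the map, shuffle, and reduce phases. It does not use Hadoop or any cluster API, and all function names are hypothetical:

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    # In a real cluster the map and reduce calls would run on different nodes;
    # here they run sequentially in one process.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data", "Big clusters", "data data"])
```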
UNIT IV
Frequent Itemsets from Big Data: The Market-Basket Model, Market Baskets and the A-Priori Algorithm,
Handling Larger Datasets in Main Memory, Limited-Pass Algorithms, Clustering for Big Data: Introduction to
Clustering Techniques, Hierarchical Clustering, Clustering in Non-Euclidean Spaces, Clustering for Streams
and Parallelism.
Deep Learning (4 Credits)
Unit 1: Fundamentals and Learning Models
AI vs ML vs DL, relevance of deep learning.
Supervised vs unsupervised learning.
Perceptron, MLP, backpropagation algorithm, XOR problem.
Hebbian learning, competitive learning, error-correction learning.
Learning tasks: Pattern recognition, function approximation.
RBF networks, kernel regression, simulated annealing.
Key Focus:
Core concepts of neural networks, learning paradigms, and foundational algorithms.
Unit 2: Math and Deep Learning Implementation
Matrix operations, tensor shapes, gradient calculation.
Loss functions (MSE, cross-entropy), optimizers (SGD, Adam).
Activation functions (ReLU, Tanh, Sigmoid).
Data preparation, label encoding, K-fold validation.
Building neural networks: Regression example (e.g., house price prediction).
Key Focus:
Mathematical foundations and practical steps to implement basic neural networks.
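Illustrative sketch for the loss function, gradient calculation, and regression topics in this unit: a one-feature linear model trained with full-batch gradient descent on the MSE loss. It uses plain Python rather than a deep learning framework, and the toy data, learning rate, and function names are assumptions chosen for the example:

```python
def mse(xs, ys, w, b):
    """Mean squared error of the line y = w*x + b on the data."""
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w*x + b by full-batch gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE with respect to w and b
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data (hypothetical) generated from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
final_loss = mse(xs, ys, w, b)
```

On real data such as house prices, features would normally be scaled first. Otherwise a fixed learning rate like the one assumed here may diverge.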
Unit 3: Convolutional and Recurrent Neural Networks
CNN: Convolution operations, pooling, building CNNs layer-by-layer.
Data augmentation, feature extraction, visualizing filters.
RNN: Motivation, LSTM/GRU architectures, backpropagation through time.
Bidirectional RNNs, stacked LSTMs, sequence processing.
Key Focus:
Architectures for image (CNN) and sequential data (RNN), with practical tuning techniques.
Unit 4: Advanced Topics and Generative Models
Generative Models: GANs (generator/discriminator), VAEs, latent space.
Neural style transfer, text synthesis with LSTMs.
Hyperparameter tuning (process, algorithms, optimization).
Ensemble techniques (bagging, boosting), MIMO models.
Key Focus:
Cutting-edge applications (GANs, VAEs) and optimization strategies for robust models.
Predictive Analytics (2 Credits)
Unit 1
Fundamentals of Data Mining: Definition and importance of data mining, KDD process model, CRISP-DM
methodology, Types of data mining tasks, Applications of data mining.
Predictive Analytics Overview: What is predictive analytics? Relationship between data mining and
machine learning, Common applications, Key challenges and issues.
Data Types and Sources: Structured vs unstructured data, Mining different data types, Basic data quality
considerations.
Unit 2
Data Understanding: Data quality assessment, Outlier detection methods, Data collection and sampling
techniques.
Data Cleaning: Handling missing data, Data transformation techniques, Discretization methods,
Standardization and normalization.
Exploratory Data Analysis: Univariate and multivariate analysis, Basic data visualization techniques,
Descriptive statistics (mean, SD, percentiles), Categorical data analysis.
Unit 3
Model Selection: Data partitioning (train/test), Cross-validation introduction.
Basic Modeling Approaches: Simple and multiple linear regression, Logistic regression concepts, Decision
trees.
Advanced Techniques Overview: Clustering fundamentals, Association rules basics, Introduction to neural
networks, Support Vector Machines (SVM).
Unit 4
Evaluation metrics: Accuracy, MAE, RMSE, Confusion Matrix, ROC, AUC.
Overfitting and underfitting, Cross-validation, Ensemble Learning and model selection.
Basics of model deployment and updating, Web Mining.
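Illustrative sketch for the evaluation metrics listed in this unit (Accuracy, Confusion Matrix, MAE, RMSE), written in plain Python. The function names and sample values are hypothetical, and binary labels are assumed to be encoded as 0 and 1:

```python
import math


def confusion_matrix(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def mae(actual, predicted):
    """Mean absolute error for regression outputs."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error for regression outputs."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical classification results
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred)
acc = accuracy(y_true, y_pred)

# Hypothetical regression results
actual = [3.0, 5.0, 2.0]
predicted = [2.0, 5.0, 4.0]
err_mae = mae(actual, predicted)
err_rmse = rmse(actual, predicted)
```

RMSE penalizes the single large error more heavily than MAE does, which is why the two values differ on the same predictions.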