Pokhara University
Faculty of Science and Technology
Course Code: CMP 632
Course title: Data Science and Analytics (3-0-0)
Full marks: 100
Nature of the course: Theory & Practice
Pass marks: 60
Year: First, Semester II
Total periods: 45 hrs
Level: Master
Program: MSc
1. Course Description
This course offers a comprehensive exploration of data science and analytics with a focus on
real-world applications and hands-on problem-solving. Students will learn the full data
science lifecycle including data acquisition, cleaning, transformation, exploration, feature
engineering, predictive modeling, and model evaluation. The course also introduces advanced
machine learning techniques and emphasizes interpretability, ethical considerations, and
case-based learning. Students will apply statistical and algorithmic methods to solve practical
problems in domains such as business, healthcare, finance, and technology.
2. General Objectives
To develop the analytical, technical, and critical thinking skills required to extract insights
from data using advanced data science and machine learning techniques.
3. Methods of Instruction
1. Lectures and class discussions
2. Case study analysis and project-based learning
3. Presentations and group assignments
4. Contents in Detail
This course has been divided into seven units. The specific objectives and contents in detail
are as follows:
Specific objectives Contents Hours
Unit I: Introduction to Data (5 hrs)
Specific objective: Understand data and explain the data science lifecycle including roles, acquisition techniques, and ethical considerations.
Contents:
1.1 Data Types and Quality
1.1.1 Clean and Dirty Data
1.1.2 Structured, Semi-structured and Unstructured Data
1.2 Data Science Life Cycle
1.2.1 Roles of Data Science and Data Analytics
1.3 Data acquisition and Sources: APIs, web scraping, sensors, public datasets
1.4 Ethics and societal impact: Bias in data and algorithms, transparency and accountability in AI/ML models, environmental impacts of data centres, trust in models
Unit II: Data Preparation and Cleaning (6 hrs)
Specific objective: Apply techniques to clean, validate, format, and merge datasets while addressing missing values and outliers effectively.
Contents:
2.1 Data Imputation
2.2 Data validation: formatting, standardization and integrity checks
2.3 Data transformation: deduplication, normalization and encoding
2.4 Outlier detection and handling
2.5 Matching and fuzzy merging
Case study
Unit III: Exploratory Data Analysis and Visualization (8 hrs)
Specific objective: Perform statistical analysis and create visualizations to identify patterns, trends, and anomalies in data.
Contents:
3.1 Exploratory Data Analysis Fundamentals
3.1.1 Data distributions and correlations
3.1.2 Identifying hidden patterns and anomalies
3.2 Statistical foundations
3.2.1 Descriptive and inferential statistics
3.2.2 Hypothesis testing
3.3 Data visualization
3.3.1 Data visualization plots for time series data, multivariate data
3.3.2 Graph data visualization methods
3.3.3 Dimensionality reduction methods for visualization of high-dimensional data
Case study
Unit IV: Feature Engineering and Model Selection (8 hrs)
Specific objective: Develop and apply techniques for extracting, transforming, encoding, and selecting meaningful features from structured and unstructured data.
Contents:
4.1 Feature extraction and creation
4.1.1 Creating interaction features
4.2 Feature scaling, normalization, and transformation
4.3 Encoding categorical variables
4.4 Feature selection methods
4.4.1 Filter, wrapper, and embedded methods
4.4.2 Feature importance analysis
4.5 Feature engineering for time series and text data
Case study
Unit V: Supervised Learning (8 hrs)
Specific objective: Implement and evaluate supervised learning models while addressing data imbalance and interpretability.
Contents:
5.1 Generalized Linear Models and Logistic Regression
5.2 Regularization and handling imbalanced data
5.3 Support Vector Machine (SVM)
5.3.1 Kernel trick
5.3.2 Outlier Detection
5.4 Neural Network (NN)
5.4.1 Single-Layer Perceptron
5.4.2 Multilayer Perceptron
5.4.3 NN as black box model
5.5 Model Interpretability
5.5.1 SHAP, LIME for model explanations
Case study
Unit VI: Unsupervised Learning (5 hrs)
Specific objective: Apply clustering and probabilistic modeling techniques to discover hidden patterns and structures in unlabeled data.
Contents:
6.1 K-means, Hierarchical and DBSCAN
6.2 Gaussian Mixture Models
6.3 Expectation Maximization
Case study
Unit VII: Statistical Methods for Text Analysis (5 hrs)
Specific objective: Analyze and extract latent semantic structures from textual data using dimensionality reduction and topic modeling techniques.
Contents:
7.1 Latent Semantic Analysis (LSA)
7.2 LSA with Term Alignments
7.3 Topic models
7.4 Latent Dirichlet Allocation
Case study
5. Evaluation System and Students’ Responsibilities
Evaluation System
In addition to the formal exam(s) conducted by the Office of the Controller of Examination of
Pokhara University, the internal evaluation of a student may consist of class attendance, class
participation, quizzes, assignments, presentations, written exams, etc. The tabular
presentation of the evaluation system is as follows.
External Evaluation                      Marks
Semester-End Examination                 40
Total External                           40

Internal Evaluation                      Marks
Class attendance and participation       5
Field visit and report                   10
Quizzes/assignments and presentations    10
Research paper review                    10
Internal Term Exam                       25
Total Internal                           60

Full Marks: 40 + 60 = 100
Students’ Responsibilities:
To appear in the Semester-End Examination, each student must secure at least 60% of the marks
in the internal evaluation and maintain at least 80% class attendance. A student who fails to
meet these requirements will be marked NOT QUALIFIED (NQ) and will not be eligible to appear
in the Semester-End Examination. Students are advised to attend all classes and complete all
assignments within the specified time period. If a student does not attend a class, it is the
student's sole responsibility to cover the topics taught during that period. If a student fails
to attend a formal exam, quiz, test, etc., there will be no provision for a re-exam.
6. Prescribed Books and References
Reference books
Reis, J. Fundamentals of Data Engineering, O'Reilly Media, June 2022
Kazil, J. and Jarmul, K. Data Wrangling with Python, O'Reilly Media, February 2016
Zheng, A. and Casari, A. Feature Engineering for Machine Learning: Principles and
Techniques for Data Scientists, 1st ed., O'Reilly, April 2018
Bishop, C.M. Pattern Recognition and Machine Learning, Springer, 2006
Jurafsky, D. and Martin, J. H. Speech and Language Processing, 3rd ed., 2025
Shumway, R. H. and Stoffer, D. S. Time Series Analysis and Its Applications with R
Examples, 5th ed., Springer, 2025