Machine Learning – Assignment Questions
(Arranged as per Syllabus)
Unit 1: Introduction to Machine Learning
1. Define Machine Learning. Explain various application areas of Machine Learning.
2. Given the definitions of AI, ML, and Data Science, analyze how they interrelate and overlap.
Provide a Venn diagram and real-life examples.
3. Create a comparison table showing the key differences between ML, AI, and Data Science with
an example.
4. Compare and contrast Supervised vs. Unsupervised Learning approaches.
5. Elaborate on a semi-supervised learning scenario in which only a portion of the labels is
available.
6. Elaborate on the terms agent, environment, actions, and rewards in Reinforcement Learning.
Explain the process of learning based on feedback.
7. Describe a real-world scenario where Reinforcement Learning is applied, and explain how
feedback improves performance.
8. Given a dataset of customer transactions, identify the learning type (Supervised, Unsupervised,
Semi-supervised, RL) for each task: predicting future purchases, grouping similar users, and
learning an optimal pricing strategy from feedback.
9. Differentiate between Feature Selection and Feature Extraction with suitable examples.
10. State the importance of feature engineering in the performance of ML models.
11. Discuss techniques for handling missing data: forward fill, backward fill, and interpolation.
12. Describe Python functions for detecting and managing missing data, with an example.
13. Provide sample code to encode categorical variables using Label Encoding and One-Hot
Encoding.
14. Write a Python script to preprocess a dataset: fill missing Age values with the median, encode
Gender with Label Encoding, and encode Purchased with One-Hot Encoding.
15. With a suitable diagram, explain the concept of learning in computer systems.
16. Analyze the effect of a linear vs. non-linear decision surface on classification problems.
17. Elaborate on the various steps used to develop an ML model.
18. State and explain the train-test split code used to build classifier models.
19. State the importance of scaling. Demonstrate feature scaling using both StandardScaler
(standardization) and MinMaxScaler (min-max normalization) with an example.
20. Explain cross-validation in model development.
21. Define and explain precision, recall, F1-measure, accuracy, and AUC.
22. Compute precision and recall for the given confusion matrix values.
23. Compute the F1 score given Precision = 0.8 and Recall = 0.6.
24. Describe the importance of ROC-AUC in classification evaluation.
25. Draw a ROC curve given TPR & FPR values.
26. State the significance of R² score in regression.
27. Compute MAE, MSE, and RMSE for the given actual vs. predicted values.
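Worked sketch for Questions 22 and 23: the computations asked for there can be checked with a few lines of Python. This is only an illustrative sketch. The confusion matrix counts (TP=40, FP=10, FN=20) and the helper names precision_recall and f1_score are assumptions made for the example, not values or names given in the assignment; only Precision=0.8 and Recall=0.6 come from Question 23.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion-matrix counts.

    Assumes tp + fp > 0 and tp + fn > 0, so neither ratio divides by zero.
    """
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that were found
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only for illustration (not from the assignment).
tp, fp, fn = 40, 10, 20
precision, recall = precision_recall(tp, fp, fn)
print(f"precision={precision:.3f}, recall={recall:.3f}")

# Values stated in Question 23.
f1_q23 = f1_score(0.8, 0.6)
print(f"F1={f1_q23:.4f}")
```

With the Question 23 values, F1 = 2(0.8)(0.6) / (0.8 + 0.6) = 0.96 / 1.4, which is approximately 0.686.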
Unit 2: Supervised Learning – Regression
1. Differentiate between dependent and independent variables in regression.
2. Differentiate between simple regression and multiple regression.
3. Explain what residuals represent in regression. Describe the role of the cost function in
linear regression.
4. Interpret the slope and intercept in a linear regression model.
5. Explain why polynomial regression is used for non-linear datasets.
6. Compare Ridge Regression vs. Lasso Regression in terms of feature selection & overfitting.
7. Explain Logistic Regression with an example.
8. Analyze the impact of outliers on regression models.
9. Explain the relationship between gradient descent and minimizing error.
10. Explain the importance of the train-test split and cross-validation.
11. Describe various evaluation metrics (Accuracy, Precision, Recall, F1 score).
12. Work through example computations of MAE, MSE, RMSE, and R².
13. Explain ROC-AUC and its applications.
14. House Price Prediction using Linear, Ridge, and Polynomial Regression (case-based
discussion).
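Worked sketch for Question 12: the four regression metrics can be computed directly from their definitions. The actual and predicted values below, and the helper name regression_metrics, are hypothetical and chosen only so the arithmetic is easy to follow by hand. The sketch assumes the actual values are not all identical, since R² is undefined when the total sum of squares is zero.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE, and R^2 for two equal-length sequences."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # root mean squared error

    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                     # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)    # total sum of squares
    r2 = 1 - ss_res / ss_tot                                # assumes ss_tot > 0

    return mae, mse, rmse, r2


# Hypothetical data, for illustration only.
actual = [3, 5, 7, 9]
predicted = [2, 5, 8, 11]

mae, mse, rmse, r2 = regression_metrics(actual, predicted)
print(f"MAE={mae:.3f}  MSE={mse:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")
```

For these values the errors are -1, 0, 1, and 2, so MAE = 4/4 = 1.0, MSE = 6/4 = 1.5, RMSE = sqrt(1.5), and R² = 1 - 6/20 = 0.7.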
Unit 3: Supervised Learning – Classification
1. What does the parameter k signify in k-NN? Explain the effects of small vs. large k.
2. k-NN is called a lazy learner algorithm. Justify.
3. Apply k-NN (k=3) to classify a query point given a dataset of 6 points.
4. Explain the concept of bias and variance in ML with examples.
5. Differentiate between instance-based and model-based learning.
6. Define and compute Euclidean & Manhattan distance between points.
7. Compare Euclidean vs. Manhattan distance.
8. Explain weighted k-NN.
9. Analyze how outliers affect k-NN classification.
10. Differentiate between hard-margin SVM and soft-margin SVM.
11. What are support vectors in SVM? Explain the concept of maximum margin hyperplane.
12. Draw the separating hyperplane for linearly separable points.
13. What is the kernel trick in SVM? Differentiate between linear & non-linear SVM.
14. Differentiate between linear, polynomial, and RBF kernels.
15. Compare the performance of SVM with linear, polynomial, and RBF kernels.
16. Justify why SVM is preferred over k-NN in high-dimensional datasets.
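Worked sketch for Questions 3 and 6: the distance measures and the k-NN vote (k=3) can be traced in plain Python. The six labelled points, the query point (5, 4), and the helper names euclidean, manhattan, and knn_classify are all hypothetical, invented for illustration; the assignment does not supply a dataset. The sketch uses Euclidean distance and an unweighted majority vote, and does not handle ties.

```python
import math
from collections import Counter


def euclidean(p, q):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """City-block distance: sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def knn_classify(dataset, query, k=3, distance=euclidean):
    """Return the majority label among the k points nearest to the query."""
    ranked = sorted(dataset, key=lambda item: distance(item[0], query))
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical dataset of 6 labelled points (not from the assignment).
dataset = [
    ((1, 2), "A"), ((2, 3), "A"), ((3, 3), "A"),
    ((6, 5), "B"), ((7, 7), "B"), ((8, 6), "B"),
]
query = (5, 4)

d_euclid = euclidean(query, (6, 5))
d_manhat = manhattan(query, (6, 5))
predicted_label = knn_classify(dataset, query, k=3)

print(f"Euclidean={d_euclid:.3f}, Manhattan={d_manhat}")
print(f"Predicted class for {query}: {predicted_label}")
```

In this made-up example the single nearest point, (6, 5), belongs to class B, but the three nearest points are (6, 5), (3, 3), and (2, 3), so the k=3 vote gives class A by two votes to one. This also illustrates why a small k and a larger k can disagree.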