
Machine Learning Exam Paper

Total Marks: 100

Section A: Multiple Choice Questions (10 x 1 = 10 Marks)

1. Which of the following is not a type of linear regression?


a) Lasso Regression
b) Ridge Regression
c) Polynomial Regression
d) Decision Tree Regression

2. In Lasso Regression, the regularization term is:


a) L1 norm
b) L2 norm
c) L1 and L2 norm
d) No regularization

3. Which algorithm is used for classification and regression and is based on the concept of
decision trees?
a) K-means
b) SVM
c) Random Forest
d) PCA

4. What does the ‘k’ represent in the k-nearest neighbors algorithm?


a) Number of features
b) Number of nearest points
c) Number of clusters
d) Number of trees

5. Which of the following is true about Support Vector Machines?


a) It is only used for regression tasks.
b) It finds the hyperplane that maximizes the margin.
c) It is not effective for high-dimensional spaces.
d) It can only be used with linear kernels.

6. What is the purpose of cross-validation?


a) To train the model
b) To tune hyperparameters
c) To reduce overfitting
d) To measure model performance

7. The elbow method is used to determine:


a) The number of clusters in K-means
b) The number of components in PCA
c) The depth of a decision tree
d) The value of 'k' in KNN

8. In ensemble learning, which method combines the predictions of multiple models by averaging them?
a) Bagging
b) Boosting
c) Stacking
d) Blending

9. Which of the following statements is true about the bias-variance tradeoff?


a) High bias always results in overfitting.
b) High variance always results in underfitting.
c) Reducing bias increases variance and vice versa.
d) Bias and variance are independent.

10. Which technique is used to transform correlated features into a set of uncorrelated
components?
a) Clustering
b) PCA
c) Decision Trees
d) SVM

Section B: True or False (10 x 1 = 10 Marks)

1. Lasso regression includes an L2 penalty. False


2. Decision trees are prone to overfitting. True
3. K-means clustering can only handle numerical data. True
4. Random forests can be used for both classification and regression. True
5. SVM is sensitive to the choice of the kernel. True
6. Bias and variance are inversely related. True
7. Principal Component Analysis is used for dimensionality reduction. True
8. Ensemble learning always improves model performance. False
9. Polynomial regression is a type of linear regression. True
10. Hyperparameter tuning is a part of the model training process. True

Section C: Short Notes (5 x 4 = 20 Marks)

1. Explain the concept of Ridge Regression.


Answer: Ridge regression is a linear regression technique that adds an L2 penalty (the sum of the squared coefficients, scaled by a parameter alpha) to the loss function in order to reduce and prevent overfitting. It is especially useful when the independent variables are highly correlated or when there are many of them. By shrinking the coefficients toward zero (without setting them exactly to zero, unlike Lasso), ridge regression reduces the variance of the model and improves its generalization performance.
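
A minimal sketch of ridge regression with scikit-learn; the synthetic data and the alpha value are illustrative assumptions, and alpha scales the strength of the L2 penalty:

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic data with many features (illustrative only)
X, y = make_regression(n_samples=200, n_features=50, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# alpha controls the L2 penalty; larger values shrink the coefficients more
model = Ridge(alpha=1.0)
model.fit(X_train, y_train)
print("Test R^2:", model.score(X_test, y_test))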

2. Describe the bias-variance tradeoff with an example.


Answer:
1) Bias refers to the difference between a model's predictions and the actual values it tries to predict. A model with high bias makes overly simple assumptions and shows high error on both the training and the test data. High bias results in underfitting.
2) Variance is the opposite concern: it measures how much the model's predictions change when it is trained on different samples of the data. A model with high variance performs very well on the training data but poorly on the test data, because it depends too heavily on the training data to generalize to new data. High variance results in overfitting.

For example:
1. Low bias, low variance: the ideal model. This combination is rarely achieved in practice, so we usually speak of "reasonable bias" and "reasonable variance."
2. Low bias, high variance: overfitting. Predictions are accurate on average but inconsistent. This occurs when a model has too many parameters and fits the training data too closely.
3. High bias, low variance: underfitting. Predictions are consistent but inaccurate on average. This happens when the model has too few parameters or does not learn enough from the training data.
4. High bias, high variance: the worst case. Predictions are both inconsistent and inaccurate on average.
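
A minimal sketch of the tradeoff, fitting polynomials of increasing degree to noisy data; the dataset and the degree values are illustrative assumptions. The degree-1 model underfits (case 3 above), while the degree-15 model scores well on the training data but worse on the test data (case 2):

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Noisy sine-shaped data (illustrative only)
rng = np.random.RandomState(0)
X = rng.uniform(0, 6, 100).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.3, 100)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Compare train and test scores as model complexity grows
for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(degree, model.score(X_train, y_train), model.score(X_test, y_test))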

3. Discuss the advantages and disadvantages of using decision trees.


Answer:-
Advantages of Decision Trees

● Easy to interpret: Decision trees are easy to understand and visualize, making them a popular choice when results must be explained to non-technical stakeholders.
● Handle both numerical and categorical data: Decision trees can work with both types of features.
● Non-parametric: Decision trees make no assumptions about the data distribution, capture non-linear relationships, and are fairly robust to outliers.
● Feature selection: Decision trees automatically identify important features through the splits chosen in the tree.
● Can handle missing values: Some implementations handle missing values directly, for example by creating separate branches for missing data.

Disadvantages of Decision Trees

● Overfitting: Decision trees tend to overfit and can grow overly complex; pruning or limiting the tree depth mitigates this (see the sketch after this list).
● Sensitive to small changes: A small change in the training data can produce a very different tree structure, making the model unstable.
● Bias towards the majority class: Decision trees can be biased towards the majority class, especially on imbalanced datasets.
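
A minimal sketch of the overfitting point above, showing how capping tree depth trades training accuracy for test accuracy; the dataset and the depth values are illustrative assumptions:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unrestricted tree tends to memorize the training data; capping depth reduces overfitting
for depth in (None, 3):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    print(depth, tree.score(X_train, y_train), tree.score(X_test, y_test))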

4. Explain the elbow method in the context of K-means clustering.


Answer: The elbow method is a heuristic for choosing the number of clusters k in K-means. K-means is run for a range of k values, and the within-cluster sum of squares (WCSS, also called inertia) is plotted against k. WCSS always decreases as k grows, but beyond the natural number of clusters the improvement becomes marginal, so the curve bends like an elbow. The value of k at that elbow point is chosen as the number of clusters.
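
A minimal sketch of the method with scikit-learn; the synthetic data and the range of k are illustrative assumptions:

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with a known cluster structure (illustrative only)
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# WCSS (inertia) for each candidate k; the curve should bend near the true k
ks = range(1, 10)
inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_ for k in ks]

plt.plot(list(ks), inertias, marker="o")
plt.xlabel("Number of clusters k")
plt.ylabel("WCSS (inertia)")
plt.show()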

Section D: Compulsory Questions (20 Marks)

1. **Manual Decision Tree Classification**

Given the dataset below, manually construct a decision tree to classify the target column.

| Outlook | Temperature | Humidity | Wind | Day | Time | Season | Weather | Target |
|---------|-------------|----------|------|-----|------|--------|---------|--------|
| ? | Hot | High | Weak | Monday | Morning | Summer | Clear | Yes |
| ? | Hot | High | Strong | Tuesday | Noon | Summer | Clear | No |
| Overcast | Hot | High | Weak | Wednesday | Evening | Summer | Clear | Yes |
| ? | Mild | High | Weak | Thursday | Night | Summer | Cloudy | Yes |
| ? | Cool | Normal | Weak | Friday | Morning | Autumn | Cloudy | Yes |
| ? | Cool | Normal | Strong | Saturday | Noon | Autumn | Rainy | No |
| Overcast | Cool | Normal | Strong | Sunday | Evening | Autumn | Cloudy | Yes |
| ? | Mild | High | Weak | Monday | Night | Winter | Clear | No |
| ? | Cool | Normal | Weak | Tuesday | Morning | Winter | Clear | Yes |
| ? | Mild | Normal | Weak | Wednesday | Noon | Winter | Rainy | Yes |
| ? | Mild | Normal | Strong | Thursday | Evening | Winter | Clear | Yes |
| Overcast | Mild | High | Strong | Friday | Night | Spring | Clear | Yes |
| Overcast | Hot | Normal | Weak | Saturday | Morning | Spring | Clear | Yes |
| ? | Mild | High | Strong | Sunday | Noon | Spring | Rainy | No |
| ? | Hot | Normal | Weak | Monday | Evening | Spring | Clear | No |
| Overcast | Cool | Normal | Weak | Tuesday | Night | Spring | Cloudy | Yes |
| ? | Cool | High | Weak | Wednesday | Morning | Summer | Clear | No |
| ? | Hot | Normal | Weak | Thursday | Noon | Summer | Rainy | Yes |
| ? | Mild | High | Strong | Friday | Evening | Summer | Clear | No |
| Overcast | Mild | High | Weak | Saturday | Night | Autumn | Cloudy | Yes |
| ? | Cool | High | Weak | Sunday | Morning | Autumn | Cloudy | Yes |
| ? | Mild | Normal | Weak | Monday | Noon | Autumn | Rainy | Yes |
| Overcast | Mild | High | Strong | Tuesday | Evening | Autumn | Cloudy | Yes |
| ? | Hot | High | Weak | Wednesday | Night | Winter | Clear | No |
| ? | Hot | High | Strong | Thursday | Morning | Winter | Clear | No |
| Overcast | Hot | High | Weak | Friday | Noon | Winter | Clear | Yes |
| ? | Mild | Normal | Weak | Saturday | Evening | Winter | Rainy | Yes |
| ? | Cool | High | Weak | Sunday | Night | Spring | Cloudy | Yes |
| ? | Mild | High | Strong | Monday | Morning | Spring | Clear | No |
| Overcast | Cool | Normal | Weak | Tuesday | Noon | Spring | Cloudy | Yes |

(A "?" marks an Outlook value that is missing from the source table.)
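
The manual construction works by computing, at each node, the entropy of the Target column and the information gain of each candidate feature, then splitting on the feature with the highest gain. A minimal sketch of that computation; the tiny Wind/Target dataset at the bottom is an illustrative assumption, not the table above:

import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a list of class labels
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(values, labels):
    # Entropy reduction from splitting the labels by a feature's values
    total = len(labels)
    by_value = {}
    for value, label in zip(values, labels):
        by_value.setdefault(value, []).append(label)
    return entropy(labels) - sum(
        (len(subset) / total) * entropy(subset) for subset in by_value.values()
    )

# Toy Wind feature and Target labels (illustrative assumption)
wind = ["Weak", "Strong", "Weak", "Weak", "Strong"]
target = ["Yes", "No", "Yes", "Yes", "No"]
print("Gain from splitting on Wind:", information_gain(wind, target))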

Section E: Descriptive Questions (4 x 5 = 20 Marks)

1. **Linear Regression**

Discuss the assumptions of linear regression. What are the consequences if these
assumptions are violated? Provide examples to illustrate your points.

Answer:
1) Linearity: The relationship between the dependent variable and the independent variables should be linear.
2) Independence: The observations should be independent of one another; the value of one observation does not influence another.
3) Homoscedasticity: The variance of the residuals (the differences between actual and predicted values) should be constant across all levels of the independent variables.
4) No multicollinearity: The independent variables should not be highly correlated with one another, since this makes the coefficient estimates unstable.
5) Normality: The errors should follow a normal distribution.
Consequences of Violated Assumptions

If these assumptions are violated, the accuracy and reliability of the linear regression model can
be compromised.

1. Linearity

● Consequence: If the relationship is non-linear, the model will not accurately capture the
underlying relationship, leading to biased predictions.
● Example: If the relationship between income and happiness is non-linear (e.g., a U-
shaped relationship), a linear regression model would not accurately predict happiness
based on income.

2. Independence

● Consequence: If the observations are not independent, the standard errors of the
coefficients will be biased, leading to incorrect inferences about the statistical
significance of the model.
● Example: If you are analyzing sales data over time and there is a seasonal pattern, the
observations may not be independent, as sales in one period may be correlated with
sales in previous periods.
3. Homoscedasticity

● Consequence: If the variance of the residuals is not constant, the standard errors of the
coefficients will be biased, leading to incorrect inferences about the statistical
significance of the model.
● Example: If the variance of the errors increases as the independent variable increases,
the model will be less accurate for larger values of the independent variable.

4. No Multicollinearity

● Consequence: If the independent variables are highly correlated, it can be difficult to determine the individual effect of each variable on the dependent variable. This can lead to unstable estimates of the coefficients.
● Example: If you are modeling the price of a house and include both the square footage
and the number of bedrooms as independent variables, these variables may be highly
correlated, making it difficult to determine the individual impact of each variable on the
price.

5. Normality

● Consequence: If the residuals are not normally distributed, the t-tests and F-tests used
to assess the statistical significance of the model may not be valid.
● Example: If the residuals are skewed or have outliers, the normality assumption may be
violated, leading to incorrect inferences about the model.
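
A minimal sketch of checking the normality and homoscedasticity assumptions from the residuals; the synthetic data is an illustrative assumption:

import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

# Synthetic data satisfying the assumptions (illustrative only)
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(size=200)

model = LinearRegression().fit(X, y)
predictions = model.predict(X)
residuals = y - predictions

# Normality of residuals: Shapiro-Wilk test (a large p-value shows no evidence of violation)
stat, p = stats.shapiro(residuals)
print("Shapiro-Wilk p-value:", p)

# Rough homoscedasticity check: residual spread for low vs. high fitted values
low = predictions < np.median(predictions)
print("Residual std (low / high fitted values):", residuals[low].std(), residuals[~low].std())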

2. **Support Vector Machine (SVM)**

Explain how the kernel trick works in SVM. Discuss different types of kernels and their
applications with examples.
Answer:-
The Kernel Trick in SVM

The kernel trick is a technique used in Support Vector Machines (SVMs) to handle data that is not linearly separable. Instead of explicitly transforming the data, a kernel function computes inner products as if the data had been mapped into a higher-dimensional feature space, where a linear separating hyperplane may exist even though one does not in the original space.

Types of Kernels
● Linear kernel: the plain inner product, suited to data that is already close to linearly separable. Example: classifying simple geometric shapes (e.g., circles vs. squares).
● Polynomial kernel: raises the inner product to a power, capturing polynomial interactions between features. Example: classifying handwritten digits.
● Radial Basis Function (RBF) kernel: measures similarity by distance, handling highly non-linear boundaries; a common default choice. Example: classifying images or natural language text.
● Sigmoid kernel: based on the hyperbolic tangent function, loosely related to neural network activations. Example: classifying medical data.
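
A minimal sketch comparing these kernels on data that is not linearly separable; the dataset and the default kernel parameters are illustrative assumptions:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaving half-moons: not separable by a straight line (illustrative only)
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The RBF kernel should clearly outperform the linear kernel here
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    clf = SVC(kernel=kernel).fit(X_train, y_train)
    print(kernel, clf.score(X_test, y_test))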

3. **Random Forest Classification**


Describe the random forest algorithm in detail. Explain how it reduces overfitting and improves
accuracy. Discuss its advantages and disadvantages.
Answer:-

A random forest is an ensemble learning method that combines multiple decision trees to make predictions. It is a powerful and versatile algorithm that can be used for both classification and regression tasks.

● Bootstrap aggregation (Bagging): The algorithm creates multiple decision trees by randomly sampling subsets of the original dataset with replacement. This process is called bootstrapping. The randomness introduced by bootstrapping helps to prevent individual trees from becoming too dependent on specific data points.
● Feature Bagging: For each decision tree, only a random subset of features is considered
when making splits. By randomly selecting features for each tree, the algorithm prevents
individual trees from overfitting to specific features.
● Prediction: The final prediction is made by aggregating the predictions from all the
decision trees. For classification, this is typically done by majority voting, while for
regression, it's done by averaging the predictions.
● Combining multiple decision trees helps to reduce the variance of the model, making it
less sensitive to fluctuations in the training data.

Advantages of Random Forests:

● Handles both classification and regression: Random forests can be used for both types
of tasks.
● Handles missing values: Some implementations can handle missing values directly, for example through surrogate splits or separate branches for missing data.
● Robust to outliers: Random forests are relatively robust to outliers due to the ensemble
nature.
● Feature importance: The algorithm can be used to assess the importance of different
features in the model.
● Parallel processing: The creation of multiple decision trees can be parallelized, making
the algorithm efficient for large datasets.

Disadvantages of Random Forest:-

● While individual decision trees are relatively easy to understand, the combined forest is much harder to interpret.
● Random forests can be computationally expensive for large datasets and complex
models.
● The performance of random forests depends on the choice of hyperparameters (e.g.,
number of trees, maximum depth, number of features per tree), which can be
challenging to tune.
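
A minimal sketch of a random forest classifier with scikit-learn; the dataset and the hyperparameter values are illustrative assumptions:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# n_estimators = number of bootstrapped trees; max_features controls feature bagging
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
forest.fit(X_train, y_train)
print("Test accuracy:", forest.score(X_test, y_test))

# Feature importances, one of the advantages listed above
print("Most important feature index:", forest.feature_importances_.argmax())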

4. **Principal Component Analysis (PCA)**


Explain the steps involved in Principal Component Analysis. How is it used for dimensionality
reduction? Provide a practical example with code snippets.
Answer:-

PCA is a statistical technique used to transform a high-dimensional dataset into a lower-dimensional one while preserving the most important information. This is achieved by identifying the principal components, which are uncorrelated linear combinations of the original features.

Steps Involved in PCA

1. Standardize the data so that all features have a mean of 0 and a standard deviation of 1. This matters because PCA is sensitive to the scale of the features.
2. Compute the covariance matrix of the standardized data. The covariance matrix measures the relationships between the different features.
3. Decompose the covariance matrix into its eigenvalues and eigenvectors.
4. Choose the principal components (eigenvectors) with the highest eigenvalues. These components capture the most variance in the data.
5. Project the original data onto the selected principal components to create the lower-dimensional representation.

Dimensionality Reduction with PCA

PCA can be used for dimensionality reduction by selecting only the principal components with
the highest eigenvalues. This reduces the number of features while keeping the most important
information.

Code:

import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Load the data
data = pd.read_csv("your_data.csv")

# Standardize the data so every feature has mean 0 and standard deviation 1
scaler = StandardScaler()
scaled_data = scaler.fit_transform(data)

# Apply PCA, reducing the data to 2 principal components
pca = PCA(n_components=2)
principal_components = pca.fit_transform(scaled_data)

# Proportion of variance captured by each component
print("Explained variance ratio:", pca.explained_variance_ratio_)

Section F: Assignment Question (1 x 20 = 20 Marks)

1. **Model Tuning and Performance Evaluation**

Choose a machine learning model of your choice and describe the process of tuning its
hyperparameters using cross-validation.
https://www.kaggle.com/datasets/sahirmaharajj/driver-application-status

Answer: Provided in the accompanying Python file, using a Random Forest Regressor.
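
Since the full solution lives in the Python file, a minimal sketch of the approach is shown here, tuning a Random Forest Regressor with grid search and 5-fold cross-validation; the file name, the target column, and the parameter grid are illustrative assumptions:

import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Load the Kaggle dataset (file and column names are hypothetical placeholders)
df = pd.read_csv("driver_application_status.csv")
X = df.drop(columns=["status"])
y = df["status"]

# Candidate hyperparameter values (illustrative assumptions)
param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [None, 10, 20],
    "min_samples_split": [2, 5],
}

# Exhaustive search over the grid, scoring each combination with 5-fold cross-validation
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated score:", search.best_score_)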

You might also like