Machine-Learning Notes

UNIT I

Q.1 Describe Machine Learning and Compare Machine Learning with Traditional
programming.[5][6]
Ans:- Machine Learning (ML): A branch of AI that focuses on creating systems that learn from data and improve over time without being explicitly programmed.
Definition: A computer program is said to learn from experience (E) with respect to a task (T) if its performance (P) on the task, as measured by P, improves with experience.
Why ML is needed:
• Some problems (like face recognition) are hard to solve by writing traditional code.
• ML can use data to create programs that adapt and improve over time.
Goal of ML:
• Develop systems that learn automatically.
• Build programs that adapt and work in new situations.
Applications: ML is used in fields like computer vision, speech recognition, and robotics, using statistical methods to make decisions based on data.

Aspect | Machine Learning | Traditional Programming
Logic Creation | Learns logic automatically from data. | Logic is written manually by the programmer.
Input | Data is the input, and the system learns from it. | Rules (program) and data are both inputs.
Output | Produces a program/model that can make predictions. | Produces output based on fixed logic.
Process | The system improves its performance over time by learning. | No learning; it always follows the same rules.
Automation | Automates learning from data. | Depends entirely on manually written rules.
Efficiency | More efficient for tasks like image recognition or prediction. | Efficient when the problem is clearly defined by rules.



Q.2 What is Dimensionality Reduction, Explain any one Dimensionality Reduction
technique.[6]
Q.2.1 Explain Principal Component Analysis used in Machine Learning. [5]
Q.2.2 Explain Linear Discriminant Analysis (LDA) used in Machine Learning.[5]
Ans:- Dimensionality Reduction is the process of reducing the number of input variables (features) in a dataset while retaining as much important information as possible.
It transforms the original high-dimensional data into a lower-dimensional space (fewer features) while keeping the essential structure or patterns.
Why is Dimensionality Reduction important?
• Real-world datasets often have too many features (also called high dimensionality), which can:
  o Make models slower and more complex
  o Cause overfitting (the model learns noise instead of useful patterns)
  o Make visualization difficult
Dimensionality Reduction techniques:
Q.2.1 Explain Principal Component Analysis used in Machine Learning. [5]
Principal Component Analysis (PCA)
Principal Component Analysis (PCA) is a dimensionality reduction technique used in machine
learning to simplify large datasets by reducing the number of features (variables), while
keeping the most important information.
PCA is an unsupervised method (doesn't need output labels).
It helps remove noise and redundancy in the data.
It improves efficiency and accuracy of models in many cases.
Commonly used in image compression, pattern recognition, and data visualization.
Why PCA is Used:
• To reduce the size of data without losing much information
• To make models faster and easier to train
• To help in visualizing high-dimensional data in 2D or 3D
How PCA Works (in simple steps):
1. Standardize the data – make sure each feature has the same scale.
2. Find correlations – PCA finds patterns or directions (called components) in the data that capture the most variation (spread).
3. Create new features – it creates new features (called principal components) that are combinations of the old ones.
4. Keep top components – select the top few components that capture most of the information and drop the rest.
Example:
Suppose we have 100 features. PCA may reduce them to just 2 or 3 new features that still
explain most of the data's behavior.
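For illustration, a minimal Python sketch of these steps (assuming scikit-learn is available; the 200 x 100 dataset is synthetic, not from the notes):

```python
# Minimal PCA sketch: standardize, fit PCA, keep the top 3 components.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))                 # 200 samples, 100 features

X_scaled = StandardScaler().fit_transform(X)    # step 1: same scale per feature
pca = PCA(n_components=3)                       # keep the top 3 components
X_reduced = pca.fit_transform(X_scaled)         # steps 2-4 in one call

print(X_reduced.shape)                          # (200, 3)
print(pca.explained_variance_ratio_)            # variance captured per component
```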

Q.2.2 Explain Linear Discriminant Analysis (LDA) used in Machine Learning.[5]


Linear Discriminant Analysis (LDA)
Linear Discriminant Analysis (LDA) is a supervised machine learning technique used for
classification and dimensionality reduction. It reduces the number of input features while
keeping the class-discriminating information.
• LDA is supervised (uses class labels).
• It is useful when the output has multiple classes (e.g., Class A, B, C).
• It works well when the data is normally distributed.
• LDA improves classification accuracy by reducing noise and redundancy.
Purpose of LDA:
• To separate different classes in a dataset as much as possible
• To reduce dimensions while keeping the data well-separated based on class labels
How LDA Works (in simple steps):
1. Calculate the mean of each class.
2. Measure how spread out the data is within and between classes.
3. Find a new axis (direction) that maximizes separation between the classes.
4. Project the data onto this new axis with fewer dimensions.
Example Use Cases:
• Face recognition
• Medical diagnosis
• Text classification
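A minimal LDA sketch (illustrative; assumes scikit-learn and uses the built-in Iris dataset, which has 3 classes, so at most 2 discriminant axes):

```python
# Minimal LDA sketch: supervised dimensionality reduction with class labels.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
lda = LinearDiscriminantAnalysis(n_components=2)  # at most (classes - 1) axes
X_lda = lda.fit_transform(X, y)                   # supervised: uses labels y

print(X_lda.shape)       # (150, 2): 4 features projected onto 2 axes
print(lda.score(X, y))   # classification accuracy on the training data
```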
Q.3 Write a note on Reinforcement Learning.[4]
Ans:-
• RL is a type of machine learning focused on decision making.
• It learns how to take actions in an environment to get maximum rewards.
• It works by trial and error, learning from feedback (reward or penalty).
• RL is different from supervised learning; it doesn't use labeled input-output data.
Key Points:
• The agent learns by doing, not by being told.
• It's self-learning and autonomous.
• Works well in tasks where continuous decisions are needed (like games, robotics, etc.).
How It Works:
• An agent takes an action in an environment.
• The environment gives feedback in the form of a reward.
• Based on the reward, the agent learns which actions are better for the future.
Elements of Reinforcement Learning – Simplified
RL consists of four main elements:
1. Policy:
• The strategy used by the agent to decide actions based on the current state.
• It's like a map from "situation" to "what action to take."
2. Reward Function:
• Gives the agent feedback for each action.
• Helps the agent know if it did well or poorly.
• It guides the agent to achieve the goal.
3. Value Function:
• Tells the agent how good a state is in the long run.
• It's the total expected future reward from a given state.
4. Model of the Environment:
• Used for planning.
• It helps the agent predict the next state and reward based on the current action.
Example:
• A robot learning to walk: it tries steps, falls, gets feedback, adjusts, and tries again.
• Over time, it learns how to walk by improving based on rewards (positive/negative).
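A toy Python sketch of these elements using tabular Q-learning (illustrative only: the 5-state "corridor" environment and all hyperparameters are assumptions, not from these notes):

```python
# Toy tabular Q-learning: an agent learns by trial and error from rewards.
import numpy as np

n_states, n_actions = 5, 2             # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))    # value of each (state, action) pair
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration

rng = np.random.default_rng(0)
for episode in range(500):
    s = 0
    while s != n_states - 1:           # the last state is the goal
        # Policy: epsilon-greedy (explore sometimes, exploit otherwise)
        a = int(rng.integers(n_actions)) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0  # reward only at the goal
        # Update: move Q(s, a) toward reward + discounted best future value
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.argmax(axis=1))  # states 0-3 should prefer action 1 (move right)
```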

Q.4 Explain parametric & nonparametric models in machine learning.[5]


Ans:- What are Parametric Models?
• Parametric models use a fixed number of parameters (like mean, variance).
• We assume a specific form for the function/model, then estimate its parameters using data.
• Once the parameters are known, we can describe the entire distribution.
• Likelihood is a function that tells us how likely our data is, given a certain parameter.
Examples:
• Logistic Regression
• Linear Discriminant Analysis
• Naive Bayes
• Perceptron
• Simple Neural Networks
Advantages:
1. Simple and easy to understand.
2. Learn quickly from small datasets.
3. Use less training data.
4. Good for simple problems.
Maximum Likelihood Estimation (MLE)
What is MLE?
• A method to estimate the parameters of a model that make the observed data most probable.
• It tries to find the value of the parameter (θ) that maximizes the likelihood of the data.
Example:
• Tossing a coin and estimating the probability of heads or tails from the observed outcomes.
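A minimal sketch of the coin example (the toss data below is made up; for h heads in n tosses the likelihood is maximized at θ = h/n):

```python
# MLE for a coin: search over theta for the value that maximizes likelihood.
import numpy as np

tosses = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])  # 1 = heads, 0 = tails
h, n = tosses.sum(), tosses.size

thetas = np.linspace(0.01, 0.99, 99)
# Log-likelihood of h heads and (n - h) tails for each candidate theta
log_likelihood = h * np.log(thetas) + (n - h) * np.log(1 - thetas)

theta_mle = thetas[log_likelihood.argmax()]
print(theta_mle, h / n)  # both approximately 0.7: MLE matches the sample ratio
```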

Non-Parametric Methods
• These do not assume a fixed form for the model.
• They adapt based on the amount and nature of the data.
• Often used when we don't know the exact distribution of the data.
How It Works:
• Uses techniques like density estimation (e.g., histograms, kernel methods).
• Divides data into bins and counts observations in each bin to estimate the distribution.
Examples:
• k-Nearest Neighbors (k-NN)
• Decision Trees
• Support Vector Machines (SVM)
• Random Forest
Advantages:
1. No assumption about the data's shape.
2. Can learn complex functions.
3. More flexible and powerful for large data.
Limitations:
1. Need more training data.
2. Slower and computationally expensive.
3. Risk of overfitting.
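A minimal k-NN sketch as a non-parametric example (assumes scikit-learn; the "model" is just the stored training data, with no fixed functional form):

```python
# k-NN: prediction is a vote among the k nearest training points.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)  # vote of the 5 nearest points
knn.fit(X_tr, y_tr)                        # "training" just stores the data
print(knn.score(X_te, y_te))               # accuracy on held-out data
```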
Q.5 Differentiate supervised and unsupervised learning techniques.[5]
Ans:-

Sr. No. | Supervised Learning | Unsupervised Learning
1 | Output is known and given. | Output is unknown or not given.
2 | Hard to learn very complex patterns. | Can discover complex patterns.
3 | Uses labeled training data. | No labeled data is used.
4 | Every input has a matching output. | Outputs (labels) are not shown to the system.
5 | Goal is to predict output. | Goal is to find patterns or groups in data.
6 | Needs a clear and defined target/output. | Target may be missing or unclear.
7 | Example: OCR (Optical Character Recognition) | Example: Finding faces in images
8 | Model can be tested for accuracy. | Model cannot be directly tested.
9 | Also called classification or regression. | Also called clustering.

Q.6 Elaborate grouping and grading models and Differentiate Grouping and Grading
models of Machine Learning.[5][4]
Ans: 1. Grouping Model (Clustering / Segmentation)
Definition:
Grouping models aim to divide a dataset into clusters or groups such that items in the same
group are more similar to each other than to those in other groups.
Key Points:
• It is a type of unsupervised learning.
• The model does not use labeled output data.
• It looks for natural structures or hidden patterns in the data.
• Groups are formed based on distance, density, or similarity.
Techniques Used:
• K-Means Clustering
• Hierarchical Clustering
• DBSCAN
Applications:
• Customer segmentation in marketing
• Grouping similar news articles
• Market basket analysis
• Medical diagnosis (grouping patients with similar symptoms)
Example:
Given customer data (age, income, purchase history), a grouping model can automatically divide customers into segments like budget shoppers, premium buyers, or occasional buyers.
2. Grading Model (Classification / Ranking)
Definition:
Grading models predict the category, rank, or score of an input based on past labeled examples.
Key Points:
• It is a type of supervised learning.
• The model uses input-output pairs for training.
• The goal is to assign a grade/class to new, unseen data.
• It can also involve ordinal values (ordered categories like low, medium, high).
Techniques Used:
• Decision Trees
• Logistic Regression
• Support Vector Machines
• Naive Bayes
• Neural Networks
Applications:
• Credit score prediction
• Exam paper grading (A, B, C...)
• Spam detection
• Disease classification
• Product quality rating
Example:
A grading model trained on past student exam scores can predict the grade (A/B/C) for a
new student based on their test performance.

Aspect | Grouping Model | Grading Model
Learning Type | Unsupervised Learning | Supervised Learning
Goal | Group similar data points | Assign a level, grade, or class to data
Output Labels | Not given during training | Given during training
Basis | Similarity or distance between data points | Predefined rules or labeled data
Example | Customer segmentation, face clustering | Exam scoring, credit risk prediction
Also Known As | Clustering | Classification or Ranking

Q.7 Explain the relationship between Artificial Intelligence, Machine Learning and Data Science. [4]
Ans:-

Aspect | Machine Learning (ML) | Artificial Intelligence (AI) | Data Science
Focus | Learning from data to improve performance over time. | Making machines act like humans using intelligence. | Extracting useful insights from data.
Uses | Statistical models and data patterns. | Logic, decision trees, and intelligent behavior. | Structured data, analytics, and visualizations.
Definition | Software learns patterns in data to make decisions. | Simulates human thinking and decision-making. | Uses data analysis to find useful information.
Main Goal | Maximize accuracy. | Maximize success rate. | Gain insights for decision-making.
Learning Type | Supervised, unsupervised, reinforcement learning. | Includes planning, prediction, perception. | Uses ML, statistics, big data tools.
Concerned With | Learning and improving knowledge. | Acting smart and making decisions. | Managing and analyzing data effectively.

Q.8 Explain types of Machine Learning. [6]


Ans:- Types of Machine Learning (ML)
Machine Learning is categorized into 3 main types:
1. Supervised Learning
2. Unsupervised Learning
3. Reinforcement Learning

1. Supervised Learning
• Definition: The algorithm learns from a labeled dataset (both input and correct output are given).
• Goal: Predict output for new data based on what it learned.
• Example: Predicting whether an image is of a dog or a cat after training with labeled examples.
Types:
• Classification: Predicts categories (e.g., spam or not spam).
  o Algorithms: Logistic Regression, Decision Tree, KNN, Naive Bayes, SVM
• Regression: Predicts numerical values (e.g., house price).
  o Algorithms: Linear Regression, Polynomial Regression, Random Forest, Ridge
Advantages:
• High accuracy possible
• Decision-making is explainable
• Can reuse pre-trained models

Disadvantages:
• Needs labeled data (costly and time-consuming)
• May not work well on new or unexpected data
Applications:
• Image & speech recognition
• Fraud detection
• Customer churn prediction

2. Unsupervised Learning
• Definition: The algorithm works on unlabeled data and finds hidden patterns.
• Goal: Explore and group data without knowing the outcome.
• Example: Grouping customers with similar buying behavior (clustering).
Types:
• Clustering: Groups similar data (e.g., K-Means, DBSCAN)
• Association: Finds relationships (e.g., Apriori, FP-Growth)
Advantages:
• No labeled data needed
• Good for exploring unknown data
• Helps in pattern discovery and data reduction
Disadvantages:
• Hard to evaluate accuracy
• Results may be difficult to interpret
Applications:
• Customer segmentation
• Market basket analysis
• Anomaly detection
• Image compression
• Topic modeling (in NLP)

3. Reinforcement Learning
• Definition: An agent learns by trial and error, receiving rewards or penalties for its actions (covered in detail in Q.3 above).
Q.9 Models of Machine learning: Geometric model, Probabilistic Models, Logical Models
Ans:-
Machine Learning models can be grouped based on how they represent and learn from
data. The main types are:
1. Geometric Models
• Idea: These models represent data as points in a high-dimensional space and try to draw boundaries or fit lines/curves between different categories or values.
• Use: Mostly in classification and regression tasks.
• Goal: Separate or relate data using geometric shapes like lines, planes, or curves.
Examples:
• Linear Regression – fits a straight line to data points.
• Support Vector Machines (SVM) – finds the best boundary (hyperplane) between classes.
• K-Nearest Neighbors (KNN) – uses distance (Euclidean) to find nearby points and classify.

2. Probabilistic Models
• Idea: These models use probability and statistics to model the uncertainty in data.
• Use: Handle noisy, uncertain, or incomplete data well.
• Goal: Predict the probability of outcomes and make decisions based on likelihood.
Examples:
• Naive Bayes Classifier – uses Bayes' theorem to classify.
• Hidden Markov Models (HMM) – used in speech and sequence modeling.
• Gaussian Mixture Models (GMM) – models data as a mixture of normal distributions.
3. Logical Models
• Idea: These models use rules, logic, and decision structures to learn patterns.
• Use: Interpretable models for decision-making.
• Goal: Learn if-else type rules or structured logical expressions.
Examples:
• Decision Trees – split data based on questions (rules).
• Rule-based Learning – generates logical rules from data.
• Inductive Logic Programming – uses logic programming to learn structured knowledge.

UNIT II
Q.2 Elaborate random forest regression. [5][5]
Ans: Random Forest Regression
• It is a supervised learning method used for classification and regression tasks.
• Works by combining multiple decision trees (ensemble learning) to solve complex problems.
• Improves accuracy and reduces overfitting by using multiple trees.

How does the Random Forest Algorithm work?
Steps:
Step 1: Select Random Samples
• Randomly select K subsets from the dataset.
Step 2: Build Decision Trees
• Use each subset to build one decision tree.
Step 3: Choose Number of Trees
• Decide how many trees (N) you want in the forest.
Step 4: Repeat
• Repeat steps 1 and 2 to build N trees.
Step 5: Make Predictions
• For new data, each tree predicts an output.
• The final result is based on majority voting (classification) or the average (regression).
Example:
• Suppose you want to classify fruit photos.
• The dataset is divided and given to each decision tree.
• Each tree gives a prediction; the majority vote determines the final class.
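A minimal sketch of the regression case (assumes scikit-learn; the sine-shaped dataset and the tree count N = 100 are illustrative assumptions):

```python
# Random forest regression: average the predictions of many decision trees.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(300, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=300)  # noisy target

forest = RandomForestRegressor(n_estimators=100, random_state=0)  # N = 100 trees
forest.fit(X, y)  # each tree is trained on a bootstrap sample (steps 1-4)

# Step 5: the prediction is the average of all 100 tree outputs.
print(forest.predict([[1.5]]), np.sin(1.5))
```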
Applications of Random Forest
1. Banking: Identify loan risks and defaults.
2. Medicine: Diagnose diseases and assess risk factors.
3. Land Use: Analyze patterns in land use.
4. Marketing: Predict market trends.
Advantages of Random Forest
• Suitable for both classification and regression tasks.
• Handles large datasets with high dimensionality well.
• Increases accuracy and reduces overfitting.
Disadvantages of Random Forest
• Can be complex and difficult to interpret.
• Requires more computational power and resources.

Q.2 Differentiate multivariate regression and univariate regression. [4]


Ans:

Sr. No. | Univariate | Multivariate
1 | Univariate analysis refers to the analysis of one variable. | Multivariate analysis refers to the analysis of more than one variable.
2 | It does not deal with causes and relationships. | It deals with causes and relationships.
3 | It does not contain any dependent variable. | It contains more than one dependent variable.
4 | Equation: Y = A + BX | Equation: Y = A + BX₁ + CX₂

Q.3 Define Regression. Explain types of regression. [6][6]


Ans: Regression:
• Regression analysis is a statistical method used to find the relationship between a dependent variable and one or more independent variables.
• It helps to measure how variables are related and to predict future outcomes.
Types of Regression
1. Simple Linear Regression
• Uses one independent variable to predict one dependent variable.
• Assumes a straight-line relationship.
• Example: Predicting house price based on size (a sketch contrasting this with polynomial regression follows this list).
2. Multiple Linear Regression
• Uses multiple independent variables to predict one dependent variable.
• Example: Predicting house price based on size, location, number of rooms, etc.
3. Polynomial Regression
• Used when the relationship is non-linear.
• Adds polynomial terms (like x², x³) to model curved trends.
• Example: Predicting population growth over time.
4. Support Vector Regression (SVR)
• Based on Support Vector Machines (SVM).
• Tries to find a line (hyperplane) that best predicts values with minimum error.
• Works for both linear and non-linear relationships.
5. Decision Tree Regression
• Uses a tree-like structure to make predictions.
• Each decision splits the data based on a feature.
• Example: Predicting customer behavior based on age, income, etc.
6. Random Forest Regression
• An ensemble method using many decision trees.
• Combines multiple tree predictions for better accuracy.
• Example: Predicting sales or customer churn.
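A minimal sketch contrasting simple linear regression (type 1) and polynomial regression (type 3) on curved, synthetic data (assumes scikit-learn; all numbers are illustrative):

```python
# Linear vs. polynomial regression: a curve needs polynomial terms.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.linspace(0, 3, 60).reshape(-1, 1)
y = 1 + 2 * X.ravel() ** 2 + rng.normal(scale=0.3, size=60)  # quadratic trend

linear = LinearRegression().fit(X, y)
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

print(linear.score(X, y))  # lower R²: a straight line underfits the curve
print(poly.score(X, y))    # higher R²: the x² term captures the trend
```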
Advantages of Regression
• Easy to understand and interpret.
• Works well when the relationship is approximately linear.
Disadvantages of Regression
• Assumes linearity (in the basic linear forms).
• Sensitive to outliers.
• May not be suitable for highly complex relationships.

Q.4 What is underfitting and overfitting in machine Learning explain the techniques to
reduce overfitting? [5]
Ans:
Underfitting
Underfitting happens when a machine learning model is too simple and cannot capture the underlying patterns in the data.
As a result:
• The model performs poorly on both training and testing data.
• It has high bias and low variance.
Example: Using a straight line to fit data that clearly follows a curve.
Overfitting
Overfitting happens when a model learns not only the patterns but also the noise in the training data.
As a result:
• The model performs well on training data but poorly on unseen test data.
• It has low bias but very high variance.
Example: A model that is too complex and tries to perfectly fit every point in the training data.
Techniques to Reduce Overfitting
1. Limit Model Complexity
  o Reduce the number of hidden nodes or layers in neural networks.
  o Use simpler models to prevent capturing noise.
2. Early Stopping
  o Stop training the model before it starts memorizing the training data.
3. Regularization (see the sketch after this list)
  o Apply techniques like weight decay to limit large weights in the model.
  o Use Ridge Regression or Lasso Regression in linear models.
4. Use More Data
  o A larger training set helps the model generalize better.
5. Feature Selection
  o Remove irrelevant or highly correlated features to avoid confusing the model.
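A sketch of technique 3, regularization (assumes scikit-learn; the data, polynomial degree, and alpha are illustrative assumptions, and the scores are typical outcomes, not guaranteed):

```python
# A high-degree polynomial overfits; a ridge (L2) penalty tames it.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(1)
X = np.linspace(0, 1, 40).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.2, size=40)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

overfit = make_pipeline(PolynomialFeatures(15), LinearRegression()).fit(X_tr, y_tr)
ridge = make_pipeline(PolynomialFeatures(15), Ridge(alpha=0.01)).fit(X_tr, y_tr)

print(overfit.score(X_te, y_te))  # typically poor test R²: memorized noise
print(ridge.score(X_te, y_te))    # typically better test R²: weights kept small
```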
Common Reasons for Overfitting
• Noisy data
• Small training set
• Large number of features
Q.5 Explain any two Evaluation Metrics for regression/ Explain three evaluation metrics
used for regression model.[5][6]
Ans:
In regression models, evaluation metrics help measure how well the model predicts
continuous values. They compare the actual target values with the predicted values from the
model.
Here are three important evaluation metrics:

1. Mean Squared Error (MSE)
• Definition: MSE is the average of the squares of the differences between actual and predicted values.
• Formula: MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)², where yᵢ is the actual value and ŷᵢ the predicted value.
• Explanation:
  o MSE shows how much error, on average, the model makes.
  o Squaring the differences penalizes larger errors more heavily.
  o A lower MSE indicates a more accurate model.
• Example: Used in predicting house prices, where large errors are costly and should be penalized.
2. Mean Absolute Error (MAE)
• Definition: MAE is the average of the absolute differences between the actual and predicted values.
• Formula: MAE = (1/n) Σᵢ |yᵢ − ŷᵢ|
• Explanation:
  o MAE provides a straightforward interpretation: the average error in prediction.
  o It treats all errors equally and is less sensitive to outliers.
  o The smaller the MAE, the better the model's performance.
• Example: Suitable when outliers are present, such as in forecasting sales data.
3. R-squared (R²): Coefficient of Determination
• Definition: R² measures the proportion of the variance in the dependent variable that is predictable from the independent variables.
• Formula: R² = 1 − Σᵢ (yᵢ − ŷᵢ)² / Σᵢ (yᵢ − ȳ)², where ȳ is the mean of the actual values.
• Explanation:
  o R² typically ranges between 0 and 1.
  o An R² value close to 1 indicates that the model explains most of the variance.
  o An R² of 0 means the model does not explain any variance.
• Example: Used in evaluating models like stock price prediction to see how well the model explains price movements.
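A minimal sketch computing all three metrics (assumes scikit-learn; the y values are toy numbers):

```python
# Compute MSE, MAE, and R² for a handful of toy predictions.
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

print(mean_squared_error(y_true, y_pred))   # MSE: squares penalize big errors
print(mean_absolute_error(y_true, y_pred))  # MAE: plain average absolute error
print(r2_score(y_true, y_pred))             # R²: fraction of variance explained
```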

Q.6 Explain Elastic Net regression in Machine Learning. [5]


Ans: ElasticNet Regression
• Definition:
  o ElasticNet Regression combines Ridge (L2 penalty) and Lasso (L1 penalty) techniques in linear regression.
  o It balances feature selection and feature preservation by blending both approaches.
• Use Case:
  o Especially useful when there are more features than observations.
  o Helps when features are correlated:
    - Lasso may remove most correlated features.
    - Ridge may keep all features.
    - ElasticNet selects a subset of correlated features while maintaining stability.
Advantages:
• Reduces model complexity: effectively eliminates irrelevant features, better than Ridge regression.
• Better bias-variance trade-off: by tuning the regularization parameters, it achieves a better balance between bias and variance than Lasso or Ridge alone.
• Versatile: applicable to various regression models (Linear, Logistic, Cox, etc.).
Disadvantages:
• Higher computational cost: requires more resources and time due to two regularization parameters and cross-validation.
• Interpretability issues: may become less interpretable with a large number of features or large coefficients.
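A minimal ElasticNet sketch (assumes scikit-learn; the synthetic correlated features and the alpha/l1_ratio values are illustrative assumptions):

```python
# ElasticNet on correlated features: blends L1 and L2 penalties.
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
X[:, 1] = X[:, 0] + rng.normal(scale=0.01, size=100)  # two correlated features
y = 3 * X[:, 0] + rng.normal(scale=0.1, size=100)     # only feature 0 matters

model = ElasticNet(alpha=0.1, l1_ratio=0.5)  # l1_ratio blends L1 and L2
model.fit(X, y)
print(model.coef_.round(2))  # the correlated pair shares weight; rest near zero
```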
Q.7 Differentiate between Regression and Correlation. [4]
Ans:

Aspect | Correlation | Regression
Definition | Measures the strength of the relationship between two or more variables. | Explains how one variable affects another.
Range | Values range between -1 and +1. | Coefficients can be positive or negative (slope & intercept).
Variables | No distinction; all variables are treated equally. | Distinction exists: independent and dependent variables.
Purpose | Shows the degree of association between variables. | Predicts the value of one variable based on another.
Symmetry | Symmetrical; correlation between A and B is the same as between B and A. | Not symmetrical; regression of Y on X ≠ regression of X on Y.
Coefficient Type | Provides a relative measure (correlation coefficient). | Provides an absolute measure (regression equation).

Q.8 Explain Bias-Variance Trade-off with respect to Machine Learning. [5]


Ans:
Bias-Variance Tradeoff in Machine Learning
In machine learning, the bias-variance tradeoff explains how model complexity affects prediction errors and generalization.
Bias
• Error due to simplifying assumptions in the model.
• High bias → model is too simple → underfitting.
• Leads to poor performance on both training and test data.
Variance
• Error from the model's sensitivity to small changes in the training data.
• High variance → model is too complex → overfitting.
• Performs well on training data but poorly on new data.

Trade-off
• Underfitting (High Bias, Low Variance):
  o The model fails to capture data patterns.
  o Caused by using an overly simple model or too little data.
  o Fix: Use a more complex model or add data.
• Overfitting (High Variance, Low Bias):
  o The model captures both patterns and noise.
  o Caused by overly complex models or too many features.
  o Fix: Simplify the model, reduce features, or apply regularization.

Examples
• Underfitting: A linear model on curved data.
• Overfitting: A 10th-degree polynomial on noisy data.
Practical Techniques
• Use cross-validation to evaluate model generalization.
• Apply regularization (L1/L2) to control complexity.
• Increase training data to reduce overfitting.
Q.9 Differentiate Ridge and Lasso Regression techniques. [4]
Ans:

Characteristic | Ridge Regression | Lasso Regression
Regularization Type | Uses L2 regularization (penalty = square of coefficients) | Uses L1 regularization (penalty = absolute value of coefficients)
Use Case | Best when all features are important and you want to reduce overfitting | Best when only some features are important and others can be ignored
Model Simplicity | Includes all features with smaller weights | Results in a simpler model with only key features
Effect on Coefficients | Shrinks them close to zero, but not exactly zero | Shrinks some to exactly zero, removing them from the model
Computation Speed | Usually faster, as it doesn't remove variables | Slightly slower due to the feature selection process
Example | Predicting house prices using many factors (size, location, amenities) | Genetic analysis where only a few genes affect the outcome
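A sketch of the key row in the table, "Effect on Coefficients" (assumes scikit-learn; the synthetic data and alpha values are illustrative assumptions):

```python
# Ridge shrinks all coefficients; Lasso zeroes out the irrelevant ones.
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = 4 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.5, size=200)  # 2 real signals

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

print(ridge.coef_.round(2))  # all 8 coefficients small but nonzero
print(lasso.coef_.round(2))  # the 6 irrelevant coefficients driven to 0.0
```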

Q.10 Regression techniques:


Ans:
Regression Techniques
1. Polynomial Regression
• Extends linear regression by fitting a curved line (polynomial equation) to the data.
• Suitable when the relationship between variables is non-linear.
• Example: Fitting a curve for house price vs. size where the price increases more sharply at larger sizes.
2. Decision Tree Regression
• Uses a tree-like structure to model decisions and outcomes.
• Splits the data based on feature values into branches until a prediction is made.
• Easy to interpret and handles non-linear data well.
3. Random Forest Regression
• An ensemble method that builds multiple decision trees and averages their results.
• More accurate and stable than a single decision tree.
• Reduces overfitting and improves prediction performance.
4. Support Vector Regression (SVR)
• Based on Support Vector Machines; it tries to fit the best line within a margin of tolerance (epsilon).
• Works well for high-dimensional or non-linear data using kernel functions.
• Good at handling outliers and complex patterns.
5. Ridge Regression
• A type of linear regression that uses L2 regularization (squares of coefficients).
• Helps reduce overfitting by shrinking large coefficients.
• Keeps all features but with smaller weights.
6. Lasso Regression
• Uses L1 regularization (absolute values of coefficients).
• Can shrink some coefficients to zero, effectively performing feature selection.
• Useful when we want a simpler model with fewer predictors.

7. ElasticNet Regression
• Combines L1 (Lasso) and L2 (Ridge) regularization.
• Balances between feature selection and coefficient shrinkage.
• Best when features are highly correlated or when there are more features than observations.
8. Bayesian Linear Regression
• Applies Bayesian probability to linear regression.
• Provides probabilistic predictions and measures of uncertainty.
• Useful when we need confidence intervals or prior knowledge included in the model.
