Unit 1

The document provides an overview of machine learning, detailing its definition, key elements, and various types including supervised, unsupervised, and reinforcement learning. It explains the processes involved in each type, their advantages and disadvantages, and their applications in artificial intelligence. Additionally, it highlights the role of machine learning in automation, natural language processing, computer vision, predictive analytics, and more.

Uploaded by

Jakka Karthik

UNIT – I: Introduction to Machine Learning and Prerequisites

Introduction to Machine Learning – Learning Paradigms – PAC learning – Version
Spaces – Role of Machine Learning in Artificial Intelligence applications.

What is Machine Learning?

According to Tom M. Mitchell:

● Definition: A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance at tasks in T,
as measured by P, improves with experience E.

Key Elements in Machine Learning:

● E (Experience): Refers to past data, including labeled data (e.g., whether someone
is obese or not).
● T (Task): Refers to the specific task, such as classifying new data.
● P (Performance Measure): Measures how well the system performs, like the
accuracy of classifying obesity.

Examples

i) Handwriting recognition learning problem

• Task T: Recognising and classifying handwritten words within images

• Performance P: Percent of words correctly classified

• Training experience E: A dataset of handwritten words with given classifications

ii) A robot driving learning problem

• Task T: Driving on highways using vision sensors

• Performance measure P: Average distance traveled before an error

• Training experience E: A sequence of images and steering commands recorded while
observing a human driver

iii) A chess learning problem

• Task T: Playing chess

• Performance measure P: Percent of games won against opponents

• Training experience E: Playing practice games against itself


What is Machine Learning? (In Different Words)

1. Subset of Artificial Intelligence: Machine learning is a branch of AI focused on
creating models that allow computers to learn from data and previous experiences,
enabling them to make predictions.
2. Pattern Recognition: Machine learning develops models that learn hidden patterns
in datasets and use these patterns to make predictions on similar new data.
3. Automated Task Performance: Machine learning uses algorithms trained on data
sets to perform tasks that humans usually do, such as categorizing images,
analyzing data, or predicting trends.
4. Learning without Explicit Programming: Machine learning refers to systems that
have the capability to learn from data without being programmed explicitly for every
task.

Types of Machine Learning

Supervised Learning

Supervised learning is a type of machine learning where the model is trained on a labeled
dataset. This means that each training example includes both the input data and the correct
output (or label). The goal is for the model to learn the mapping between inputs and outputs
so that it can predict the output for unseen data.

Key Characteristics:

● Labeled Data: Each training data point has a corresponding correct output (label).
● Learning Process: The algorithm learns by adjusting its parameters based on the
errors it makes during training. The performance is evaluated by comparing
predictions with the true output labels.
● Output: The model’s task is to predict an output based on new input data.

Types of Supervised Learning

Supervised learning can be divided into two main categories:


1. Classification

● Definition: In classification, the output variable is categorical (e.g., labels, classes,
categories).
● Goal: To predict which category a new observation belongs to based on the features
of the data.

Example:

● Predicting whether an email is spam or not (spam vs. non-spam).
● Diagnosing diseases based on medical tests (e.g., “malignant” vs. “benign”).

Algorithms Used:

● Decision Trees
● K-Nearest Neighbors (KNN)
● Support Vector Machines (SVM)
● Naive Bayes

2. Regression

● Definition: In regression, the output variable is continuous, meaning that the model
predicts a value within a range, such as a number.
● Goal: To predict a continuous value based on input features.

Example:

● Predicting the price of a house based on features like location, square footage, and
number of bedrooms.
● Estimating sales revenue based on past data.

Algorithms Used:

● Linear Regression
● Polynomial Regression
● Ridge Regression
● Lasso Regression

How Supervised Learning Works

1. Training Phase:
○ The algorithm is given a labeled dataset consisting of input-output pairs.
○ It learns the relationship between the inputs and outputs, adjusting its model
parameters to minimize errors (e.g., through gradient descent or other
optimization techniques).
2. Prediction Phase:
○ After training, the model can make predictions on unseen data by applying
the learned relationship.
3. Evaluation:
○ The model’s performance is assessed using metrics like accuracy (for
classification) or mean squared error (for regression).
○ Cross-validation or separate test datasets are often used to ensure the model
generalizes well to new data.
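
The three phases above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it fits a line by gradient descent on mean squared error, and the toy data, learning rate, and epoch count are assumptions chosen only for the example.

```python
# Minimal sketch: fit y ≈ w*x + b by gradient descent on mean squared error.
# The data, learning rate, and epoch count are illustrative assumptions.

def mse(w, b, xs, ys):
    """Mean squared error of the line w*x + b on the given input-output pairs."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Training phase: repeatedly adjust w and b against the MSE gradient."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical labeled data generated from y = 2x + 1.
train_x, train_y = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
test_x, test_y = [4.0, 5.0], [9.0, 11.0]

w, b = fit_line(train_x, train_y)           # 1. training phase
predictions = [w * x + b for x in test_x]   # 2. prediction phase
test_error = mse(w, b, test_x, test_y)      # 3. evaluation on held-out data
```

Evaluating on `test_x` and `test_y`, which the model never saw during training, is what indicates whether the learned relationship generalizes.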

Examples of Supervised Learning

1. Spam Detection:
○ Input: Features of an email (e.g., word frequency, sender, subject).
○ Output: Class label (spam or non-spam).
○ Algorithm: Naive Bayes classifier.
2. Credit Scoring:
○ Input: Information about an individual (e.g., income, credit history, loan
amount).
○ Output: A score or classification (e.g., good or bad credit risk).
○ Algorithm: Logistic Regression.
3. House Price Prediction:
○ Input: Features like location, size, and number of rooms.
○ Output: A continuous value representing the price of the house.
○ Algorithm: Linear Regression.

Advantages of Supervised Learning

● Clear Objective: Since the data is labeled, the goal of learning is clear and
well-defined.
● Effective for Known Tasks: Supervised learning is effective when the input-output
relationship is known, and the problem is well-defined (e.g., classification,
regression).
● Wide Range of Applications: It is widely applicable in real-world scenarios like
speech recognition, image classification, and financial predictions.

Disadvantages of Supervised Learning

● Requires Labeled Data: Labeling data can be expensive and time-consuming,
especially for large datasets.
● Overfitting Risk: If the model is too complex or not properly regularized, it can
overfit the training data, resulting in poor performance on unseen data.
● Limited Flexibility: The model can only make predictions on problems similar to the
training data.
Unsupervised Learning

Definition: Unsupervised learning is a type of machine learning in which models are trained
using unlabeled data. Unlike supervised learning, there are no corresponding output labels
in the data. The model is tasked with identifying patterns, structures, and relationships within
the data on its own.

In unsupervised learning, the goal is to:

● Discover hidden patterns or underlying structure in the data.
● Group the data based on similarities (clustering).
● Represent the data in a compressed form.

Example: Given an input dataset containing images of cats and dogs, the algorithm will
group similar images together based on their features without knowing which image belongs
to which category (cat or dog). This process is done through clustering.

Why Use Unsupervised Learning?

1. Discover Hidden Insights: It can reveal important patterns and structures within the
data.
2. Similarity to Human Learning: Just like humans learn from experience,
unsupervised learning algorithms adapt to data without labeled examples.
3. Works with Unlabeled Data: It can handle data that is unlabeled and uncategorized,
which is common in many real-world applications.
4. Real-World Relevance: In many cases, labeled data isn't available, making
unsupervised learning essential.

How Unsupervised Learning Works

1. Input Data:
○ The model is provided with data that does not include any labels or
predefined outputs.
2. Pattern Recognition:
○ The algorithm tries to discover patterns or structures within the data. It may
look for similarities between data points (clustering) or attempt to reduce the
number of dimensions while preserving the essential data (dimensionality
reduction).
3. Output:
○ The result could be a set of clusters, reduced dimensions, or an identified
structure that helps in understanding the data better or preparing it for further
processing.
4. Evaluation:
○ Since there are no labels in unsupervised learning, evaluating performance is
often more subjective and may involve techniques like silhouette scores (for
clustering) or visual inspection (for dimensionality reduction).

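
As a rough illustration of the steps above, here is a small k-means sketch in plain Python. The 2-D points and the choice of the first k points as starting centroids are assumptions made for the example; practical implementations usually use smarter initialization and a convergence check.

```python
# Minimal k-means sketch on 2-D points (no labels are used anywhere).

def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    distances = [squared_distance(point, c) for c in centroids]
    return distances.index(min(distances))

def kmeans(points, k, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    centroids = list(points[:k])  # assumption: first k points seed the clusters
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (0.8, 1.1), (9.1, 8.8), (8.9, 9.2)]
centroids, labels = kmeans(points, k=2)
```

The returned `labels` are arbitrary cluster indices, not class names; interpreting what each cluster means is left to the analyst.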
Types of Unsupervised Learning Problems:

1. Clustering: Grouping similar data points together based on their features. Examples:
○ K-Means Clustering
○ Hierarchical Clustering
○ DBSCAN
2. Association: Identifying relationships between variables in large datasets. Example:
○ Market Basket Analysis: "People who buy bread are likely to buy butter as
well."

Popular Unsupervised Learning Algorithms:

1. K-Means Clustering: Groups data into K clusters based on similarity.


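2. K-Nearest Neighbors (KNN): Classifies new data points based on the closest data
points. Note that KNN relies on the labels of those neighbors, so it is strictly a
supervised method; only its distance-based notion of similarity is shared with clustering.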
3. Hierarchical Clustering: Builds a tree of clusters based on data similarity.
4. Anomaly Detection: Identifies outliers or unusual data points.
5. Principal Component Analysis (PCA): Reduces data dimensions while preserving
variance.
6. Apriori Algorithm: Used in market basket analysis to identify associations between
items.

Advantages of Unsupervised Learning:

● No Labeled Data Needed: Easier to get unlabeled data compared to labeled data.
● Ability to Handle Complex Tasks: Can be applied to complex problems where
supervised learning is not feasible.
● Discovering New Patterns: Helps in discovering unknown patterns and
relationships in data.

Disadvantages of Unsupervised Learning:

● Difficult to Evaluate: Without labeled data, it's hard to measure the model's
accuracy or performance.
● Less Accurate: Results may not be as precise as supervised learning due to lack of
labeled outputs.
● Higher Complexity: Understanding the results and finding the right algorithm can be
more challenging.

Unsupervised learning plays a crucial role in tasks like market analysis, anomaly detection,
and customer segmentation, where labeled data is scarce or unavailable.
Reinforcement Learning (RL)

Definition: Reinforcement Learning (RL) is a feedback-based machine learning technique in
which an agent learns to make decisions by performing actions and receiving feedback from
the environment. The goal is for the agent to maximize its total cumulative reward by
selecting actions that lead to positive outcomes and avoiding actions that lead to negative
consequences. Unlike supervised learning, RL does not require labeled data and learns
through trial and error.

Key Concepts in Reinforcement Learning:

1. Agent: The entity that interacts with the environment, makes decisions, and learns
from its actions.
2. Environment: The external system with which the agent interacts. The environment
provides feedback based on the agent's actions.
3. Actions: The decisions made by the agent that affect the environment.
4. States: The conditions or situations that describe the environment at any given time.
5. Rewards: Positive or negative feedback received from the environment after
performing an action. A higher reward signifies a good action, while a penalty is given
for bad actions.
6. Policy: A strategy that the agent uses to decide which actions to take in different
states.
7. Value Function: A function that estimates how good it is for an agent to be in a given
state, often used to predict long-term rewards.

How Reinforcement Learning Works:

● The agent starts in an initial state.
● The agent takes an action based on its current state.
● The environment reacts to the action by transitioning to a new state and providing a
reward or penalty.
● The agent learns from this feedback to improve future actions.

Through repeated interaction with the environment, the agent optimizes its policy to
maximize the cumulative reward over time.
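
The interaction loop above can be sketched with tabular Q-learning, one common RL algorithm (the notes do not prescribe a specific one). The corridor environment, reward scheme, and hyperparameters below are all hypothetical choices for illustration.

```python
import random

# Minimal Q-learning sketch on a 5-cell corridor (hypothetical environment).
# The agent starts in cell 0 and is rewarded only when it reaches the last cell.

def train_corridor(n_states=5, episodes=300, alpha=0.5, gamma=0.9,
                   epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]: 0=left, 1=right
    goal = n_states - 1
    for _ in range(episodes):
        state = 0                                    # initial state
        while state != goal:
            # Explore with probability epsilon (or when values are tied),
            # otherwise exploit the action with the higher estimated value.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            # The environment reacts: new state and reward.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Learn from the feedback (Q-learning update rule).
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train_corridor()
# Greedy policy derived from the learned values, for every non-goal state.
policy = [0 if left > right else 1 for left, right in q_table[:-1]]
```

The `epsilon` parameter controls the balance between trying new actions and reusing actions already known to work, and `gamma` discounts rewards that arrive later.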

Example:

In a video game scenario, an RL agent might control a character that needs to avoid
obstacles and collect rewards. For each good action (like collecting a reward or avoiding an
obstacle), the agent receives positive feedback (a reward), and for each bad action
(such as hitting an obstacle), the agent receives negative feedback (a penalty). Over time,
the agent learns to maximize its total score by improving its decision-making.

Applications of Reinforcement Learning:

1. Game Playing: Training agents to play games (e.g., AlphaGo, chess, or video
games).
2. Robotics: Enabling robots to learn and improve their behavior in real-world tasks like
navigation and manipulation.
3. Autonomous Vehicles: Teaching self-driving cars to make decisions for safe and
efficient driving.
4. Finance: Optimizing trading strategies and investment portfolios.
5. Healthcare: Personalized treatment recommendations and drug discovery.
6. Advertising: Optimizing bidding strategies for digital ads.

Advantages of Reinforcement Learning:

● Autonomous Learning: The agent can learn without labeled data, making it
applicable to real-world situations where labeled data is hard to come by.
● Dynamic Adaptation: It adapts to changing environments through continuous
learning and feedback.
● Optimization of Long-Term Goals: Focuses on maximizing long-term cumulative
rewards rather than short-term gains.

Disadvantages of Reinforcement Learning:

● Computationally Expensive: Training RL models can be resource-intensive due to
the need for extensive simulations or real-world interactions.
● Exploration vs. Exploitation Dilemma: The agent must balance exploring new
actions (which may lead to better outcomes) versus exploiting known actions that
give good results.
● Delayed Feedback: In some environments, rewards are delayed, making it difficult
for the agent to correlate actions with outcomes.
Role of Machine Learning in Artificial Intelligence (AI) Applications

Machine learning (ML) plays a crucial role in enabling AI systems to function autonomously,
adapt to new data, and improve over time. Here’s how ML fits into various AI applications:

1. Automation and Decision-Making

● AI Task: Automating tasks like driving a car (autonomous driving) or performing
diagnostics in healthcare.
● ML Role: ML algorithms learn from vast amounts of data, enabling machines to
make decisions or take actions without human intervention. For example, self-driving
cars use ML to interpret sensor data, make driving decisions, and avoid obstacles.

2. Natural Language Processing (NLP)

● AI Task: Understanding, interpreting, and generating human language (chatbots,
translation systems).
● ML Role: ML techniques, especially deep learning, help systems like virtual
assistants (e.g., Siri, Alexa) to understand speech, recognize intent, translate
languages, and generate human-like text. NLP algorithms improve as they are
exposed to more data.
3. Computer Vision

● AI Task: Interpreting and analyzing visual information (image and video recognition).
● ML Role: ML models (like convolutional neural networks) are trained on large
datasets of images to detect objects, faces, and text in images. This is used in facial
recognition, object detection, and even medical imaging analysis.

4. Predictive Analytics

● AI Task: Predicting future outcomes based on historical data (forecasting,
recommendations).
● ML Role: ML algorithms, like regression and classification models, analyze patterns
in data to predict future trends. For example, in retail, ML can predict customer
buying behavior, while in healthcare, it can predict disease progression.

5. Personalization and Recommendation Systems

● AI Task: Providing personalized content or recommendations (movies, products).
● ML Role: ML algorithms analyze user data (preferences, behavior) to recommend
relevant products, music, or videos. For example, Netflix uses ML to suggest movies
based on viewing history, and Amazon recommends products based on past
purchases.

6. Fraud Detection

● AI Task: Identifying suspicious activity (bank fraud, online fraud).
● ML Role: ML models learn from historical data of legitimate and fraudulent
transactions, detecting anomalies in real-time and flagging potentially fraudulent
activities. This is used extensively in finance and e-commerce.

7. Robotics and Automation

● AI Task: Enabling robots to perform tasks autonomously (manufacturing, drones).
● ML Role: Robots use ML to improve their movements and operations based on
feedback from their environment. This is applied in manufacturing for quality control,
warehouse automation, and even surgical robots in healthcare.

8. Healthcare and Diagnostics

● AI Task: Diagnosing diseases, predicting patient outcomes.
● ML Role: ML algorithms analyze medical data (e.g., images, patient histories) to
identify diseases, predict patient outcomes, and recommend treatments. For
example, ML is used in detecting cancer in medical imaging and in predicting patient
risks for conditions like heart disease.

9. AI in Games and Simulations

● AI Task: Creating intelligent, adaptive game characters or simulations.
● ML Role: ML algorithms allow game AI to learn strategies and adapt to player
behavior. Reinforcement learning, for example, is used in training AI agents to
improve at complex games like chess or Go by playing against themselves and
refining their strategies.

10. Speech Recognition and Synthesis

● AI Task: Converting spoken language into text or generating human-like speech
(speech-to-text, virtual assistants).
● ML Role: ML algorithms help virtual assistants (like Siri and Google Assistant) to
recognize and respond to spoken commands by learning from vast amounts of audio
data. This is achieved through deep learning techniques like recurrent neural
networks (RNNs) and transformers.

Issues/challenges in Machine Learning

1. Data Quality and Quantity:
○ Insufficient Data: Lack of labeled data for training.
○ Imbalanced Data: Disproportionate representation of classes.
○ Noisy Data: Errors and irrelevant information in data.
2. Bias and Fairness:
○ Algorithmic Bias: Inherited biases from training data.
○ Fairness Issues: Ensuring equitable predictions for all groups.
3. Overfitting and Underfitting:
○ Overfitting: Model captures noise, failing to generalize.
○ Underfitting: Model too simple to capture patterns.
4. Interpretability and Explainability:
○ Black-box Models: Difficulty understanding model decisions.
○ Need for Transparency: Essential for trust in critical applications.
5. Scalability:
○ Computational Complexity: High resource demands for training.
○ Large Datasets: Challenges in scaling models efficiently.
6. Lack of Generalization:
○ Poor Generalization: Models fail on new or changing data.
○ Domain Transfer: Models don’t perform well across different domains.
7. Security and Privacy:
○ Adversarial Attacks: Vulnerability to manipulations.
○ Data Privacy: Concerns with using personal data.
8. Ethical Issues:
○ Decision-Making Autonomy: Risks of automated decisions without
oversight.
○ Responsibility: Unclear accountability for wrong decisions.
9. Real-time Prediction and Deployment:
○ Latency: Need for quick predictions in real-time applications.
○ Model Maintenance: Continuous updates and monitoring required.
10. Model Complexity: Complex models are difficult to deploy and maintain.

PAC Learning (Probably Approximately Correct Learning)

PAC learning and examples: https://www.youtube.com/watch?v=SIf32P0bE28

Definition:
PAC Learning is a framework in machine learning, introduced by Leslie Valiant, which aims
to formalize learning by ensuring that the learner finds a hypothesis that is "probably" correct
(with high confidence) and "approximately" accurate (within a small error margin).

Key Components:

1. Probably: The learner has a high probability (1−δ) of finding a good hypothesis.
2. Approximately: The hypothesis learned is close to the correct one, with a small error
(ϵ).
3. Correct: The hypothesis performs well on unseen data.

Goal:

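The objective is to find a hypothesis h in the hypothesis space (H) such that:

P(error(h) ≤ ϵ) ≥ 1 − δ

Where:

● error(h): The probability that h misclassifies an unseen example.
● ϵ: Maximum allowable error.
● δ: Maximum allowable probability of failure, so 1−δ is the confidence.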

Requirements for PAC Learning:

1. Efficiency: The learning algorithm should run in polynomial time relative to the input
size and hypothesis complexity.
2. Sufficient Data: The learner must have enough training samples to meet the
specified error and confidence levels.

Example:

In a binary classification task (e.g., spam detection):

● The PAC learning algorithm ensures that with 95% confidence (1−δ=0.95), the
hypothesis has at most 5% error (ϵ=0.05) on unseen data.

Importance:

PAC Learning provides a theoretical foundation for understanding how much data and
computational resources are required to learn effectively, ensuring generalization from finite
data.
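
To make "how much data" concrete, the sketch below evaluates a standard sample-complexity bound that is not stated in these notes: for a finite hypothesis space H and a learner that always outputs a hypothesis consistent with the training data, m ≥ (1/ϵ)(ln|H| + ln(1/δ)) examples are sufficient. The function name and the example hypothesis-space size are assumptions for illustration.

```python
import math

def pac_sample_bound(hypothesis_count, epsilon, delta):
    """Sufficient number of training examples for a consistent learner over a
    finite hypothesis space: m >= (ln|H| + ln(1/delta)) / epsilon."""
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / epsilon)

# Settings from the spam example (epsilon = delta = 0.05) with a
# hypothetical hypothesis space of 1000 candidate rules.
needed = pac_sample_bound(1000, epsilon=0.05, delta=0.05)
```

The bound grows only logarithmically with |H| and 1/δ but linearly with 1/ϵ, so tightening the error tolerance is the more expensive request.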
Version Spaces

Definition:
A version space is the subset of all possible hypotheses in the hypothesis space H that are
consistent with the observed training examples. It is used in concept learning to represent
the set of candidate hypotheses that correctly classify the training data.

Key Concepts:

1. Hypothesis Space (H):
The set of all possible hypotheses that can describe the target concept.
2. Consistent Hypothesis:
A hypothesis h is considered consistent if it correctly classifies all training examples.
3. General Boundary (G):
The set of the most general hypotheses in H that are consistent with the training
data.
4. Specific Boundary (S):
The set of the most specific hypotheses in H that are consistent with the training
data.
5. Version Space:
The space between the general (G) and specific (S) boundaries. It represents all the
hypotheses that are consistent with the training data:

VS(H, D) = { h ∈ H | h is consistent with D }

Where D is the set of training examples.

A consistent hypothesis in machine learning is one that correctly predicts the target values
for all the training examples. In other words, a hypothesis is consistent if it perfectly matches
the training data, meaning there are no errors between the predicted and actual values in the
training set.

The mathematical formulation of a consistent hypothesis can be defined as follows:

Let h be a hypothesis (a model or function) and D={(x1,y1),(x2,y2),…,(xm,ym)} be a training
set with m examples, where xi is the input and yi is the corresponding target value.

A hypothesis h is consistent with the training data D if:

h(xi)=yi for all i=1,2,…,m

In other words, the hypothesis h correctly predicts the output yi for every input xi in the
training set.

Goal of Version Space:

To iteratively narrow down G and S as more training examples are provided, converging to
the target hypothesis.
Example:

Training Data:

● Attributes: Shape (Circle, Triangle), Color (Red, Blue), Size (Small, Large).
● Target Concept: Objects that are Red and Large.

Example   Shape      Color   Size    Label (Target Concept)
1         Circle     Red     Large   Positive
2         Triangle   Blue    Small   Negative

Initial H:

All possible combinations of attributes.

After Training Examples:

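● General Boundary (G): {⟨Circle, ?, ?⟩, ⟨?, Red, ?⟩, ⟨?, ?, Large⟩}, the most general
hypotheses that still exclude the negative example.
● Specific Boundary (S): ⟨Circle, Red, Large⟩, the only positive example seen so far.

Here S and G have not yet met: two examples are not enough to single out the target. As
further examples arrive (for instance a Red, Large Triangle labeled Positive, and Red-Small
and Blue-Large objects labeled Negative), both boundaries converge to
(Color=Red,Size=Large), and S=G then indicates convergence to the target concept.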

Advantages of Version Spaces:

1. Helps in systematically narrowing down possible hypotheses.
2. Provides a visual and logical structure for learning.

Limitations:

1. Cannot handle noisy or inconsistent data effectively.
2. Computationally expensive for large hypothesis spaces.
3. Struggles with incomplete training data.

Applications:

1. Concept learning and decision-making.
2. Rule-based systems for classification.
3. Narrowing down candidate solutions in search problems.
Hypothesis Space and Inductive Bias in Machine Learning

Hypothesis Space:

The hypothesis space (H) represents the set of all possible hypotheses (or functions) that a
learning algorithm can consider to map inputs to outputs for a given task. It is defined by:

● The representation language (e.g., decision trees, linear models, neural networks).
● The constraints or assumptions imposed on the hypothesis.

For example:

● In a decision tree classifier, H consists of all possible decision trees.
● For linear regression, H includes all possible linear equations of the form y=mx+b.

Inductive Bias:

Inductive bias refers to the set of assumptions a machine learning algorithm makes to
generalize from the training data to unseen examples. Without inductive bias, learning is
impossible since the model would have no preference for one hypothesis over another.

Common types of inductive biases:

1. Restrictive Bias:
○ Reduces the hypothesis space by limiting the form of hypotheses.
○ Example: Linear regression assumes the data follows a linear relationship.
2. Preference Bias:
○ Considers all hypotheses but prefers some over others based on criteria like
simplicity or likelihood.
○ Example: Decision trees prefer smaller trees with fewer splits.

Types of Algorithms in Hypothesis Space and Inductive Bias

1. Concept Learning Algorithms:
○ Aim: Identify a hypothesis consistent with training examples.
○ Examples:
■ Find-S Algorithm: Finds the most specific hypothesis consistent with
positive examples.
■ Candidate Elimination Algorithm: Maintains a version space
containing all hypotheses consistent with the data.
■ List-Then-Eliminate Algorithm: Enumerates every hypothesis in H and
removes those inconsistent with the data.
2. Classification Algorithms:
○ Aim: Learn a mapping from input features to discrete labels.
○ Examples:
■ Decision Trees: Use splitting criteria to reduce hypothesis space.
■ Naive Bayes: Assumes independence between features.
3. Regression Algorithms:
○ Aim: Model continuous target variables.
○ Examples:
■ Linear Regression: Inductive bias assumes linearity between features
and target.
■ Polynomial Regression: Expands hypothesis space by allowing
polynomial relationships.
4. Clustering Algorithms:
○ Aim: Group similar data points without supervision.
○ Examples:
■ k-Means: Assumes data is grouped in spherical clusters.
■ Hierarchical Clustering: Bias prefers proximity-based grouping.
5. Reinforcement Learning:
○ Hypothesis space involves policies that map states to actions.
○ Inductive bias comes from the assumption about rewards and exploration
strategies.

Connection to Machine Learning:

● The choice of hypothesis space and inductive bias directly impacts the performance,
generalization, and efficiency of a machine learning algorithm.
● A bias that is too restrictive limits flexibility; a hypothesis space that is too broad
makes generalization difficult. Finding the right balance is key to effective learning.

Concept Learning Algorithms:

https://www.youtube.com/watch?v=z5AKsT3apWI&list=PL4gu8xQu0_5JBO1FKRO5p20wc8DprlOgn&index=32

1. Find-S Algorithm

The Find-S (Find-Specific) algorithm is a simple method used in concept learning to find
the most specific hypothesis h that is consistent with the given training examples.

Steps of the Find-S Algorithm:

1. Initialize h:
○ Set the initial hypothesis h to the most specific hypothesis in the hypothesis
space, h=⟨∅,∅,…,∅⟩
2. Iterate through the training examples:
○ For each positive example:
■ Compare h with the example.
■ Generalize h minimally to include the example if it does not already.
3. Ignore negative examples:
○ The algorithm does not modify h for negative examples.
4. Output the final hypothesis h:
○ This is the most specific hypothesis consistent with all positive examples.
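
The steps above translate almost directly into code. In this sketch, `None` stands for ∅ and the string "?" for "any value"; the three-attribute training set (Shape, Color, Size) is a hypothetical example, not data from these notes.

```python
# Find-S sketch for conjunctive hypotheses over discrete attributes.
# None plays the role of ∅ (matches nothing); "?" matches any value.

def find_s(examples):
    """examples: list of (attribute_tuple, is_positive) pairs."""
    n_attributes = len(examples[0][0])
    h = [None] * n_attributes              # step 1: most specific hypothesis
    for attributes, is_positive in examples:
        if not is_positive:
            continue                        # step 3: negative examples are ignored
        for i, value in enumerate(attributes):
            if h[i] is None:
                h[i] = value                # first positive example: copy its values
            elif h[i] != value:
                h[i] = "?"                  # minimal generalization on a mismatch
    return tuple(h)                         # step 4: final hypothesis

# Hypothetical training data: (Shape, Color, Size) with a Positive/Negative label.
training_data = [
    (("Circle", "Red", "Large"), True),
    (("Triangle", "Blue", "Small"), False),
    (("Triangle", "Red", "Large"), True),
]
hypothesis = find_s(training_data)
```

Because negative examples are never consulted, Find-S cannot detect when its output is inconsistent with them; that gap is one motivation for tracking the general boundary as well.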

Example of Find-S Algorithm:
1. https://www.youtube.com/watch?v=O6vwN74aSGY&list=PL4gu8xQu0_5JBO1FKRO5p20wc8DprlOgn&index=33
2. https://www.youtube.com/watch?v=SD6MQLC2DdQ&list=PL4gu8xQu0_5JBO1FKRO5p20wc8DprlOgn&index=34

3. Training Data:

2. Candidate Elimination Algorithm

https://www.youtube.com/watch?v=l-Uk3jDFrWI&list=PL4gu8xQu0_5JBO1FKRO5p20wc8DprlOgn&index=38

The Candidate Elimination Algorithm finds all hypotheses consistent with the training data
by maintaining both the General Boundary (G) and Specific Boundary (S).

Steps of Candidate Elimination Algorithm:

1. Initialize S and G:
○ S: Set to the most specific hypothesis ⟨∅,∅,…,∅⟩
○ G: Set to the most general hypothesis ⟨?,?,…,?⟩
2. For each training example:
○ If the example is positive:
■ Remove hypotheses from G that do not cover the example.
■ Generalize S minimally to include the example, ensuring consistency
with G.
○ If the example is negative:
■ Remove hypotheses from S that cover the example.
■ Specialize G minimally to exclude the example, ensuring consistency
with S.
3. Repeat for all examples.
4. Output S and G:
○ The version space is the region between S and G.
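
A compact sketch of these steps follows. It assumes noise-free data and purely conjunctive hypotheses, under which S can be kept as a single hypothesis while G is a set; the attribute domains and training examples are hypothetical.

```python
# Candidate Elimination sketch. None stands for ∅, "?" for "any value".

def covers(h, x):
    """True if hypothesis h classifies instance x as positive."""
    return all(hv == "?" or hv == xv for hv, xv in zip(h, x))

def more_general_or_equal(h1, h2):
    """True if h1 covers every instance that h2 covers."""
    return all(a == "?" or a == b or b is None for a, b in zip(h1, h2))

def candidate_elimination(examples, domains):
    n = len(domains)
    s = tuple([None] * n)                 # most specific hypothesis
    g = {tuple(["?"] * n)}                # most general hypothesis
    for x, positive in examples:
        if positive:
            g = {h for h in g if covers(h, x)}
            # Minimal generalization of S so that it covers x.
            s = tuple(xv if sv is None else (sv if sv == xv else "?")
                      for sv, xv in zip(s, x))
        else:
            if covers(s, x):
                raise ValueError("data is inconsistent with the hypothesis space")
            specialized = set()
            for h in g:
                if not covers(h, x):
                    specialized.add(h)
                    continue
                # Minimal specializations: fix one "?" to a value that excludes x.
                for i in range(n):
                    if h[i] != "?":
                        continue
                    for value in domains[i]:
                        candidate = h[:i] + (value,) + h[i + 1:]
                        if value != x[i] and more_general_or_equal(candidate, s):
                            specialized.add(candidate)
            # Keep only the maximally general members.
            g = {h for h in specialized
                 if not any(o != h and more_general_or_equal(o, h)
                            for o in specialized)}
    return s, g

domains = [("Circle", "Triangle"), ("Red", "Blue"), ("Small", "Large")]
examples = [
    (("Circle", "Red", "Large"), True),
    (("Triangle", "Blue", "Small"), False),
    (("Triangle", "Red", "Large"), True),
    (("Circle", "Red", "Small"), False),
    (("Circle", "Blue", "Large"), False),
]
s_boundary, g_boundary = candidate_elimination(examples, domains)
```

Running the function on only a prefix of `examples` shows how the two boundaries approach each other one example at a time.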

Solved Problems:
1. https://www.youtube.com/watch?v=O2wYwFOMQ24&list=PL4gu8xQu0_5JBO1FKRO5p20wc8DprlOgn&index=39
2. https://www.youtube.com/watch?v=VMoPY9Wimi4&list=PL4gu8xQu0_5JBO1FKRO5p20wc8DprlOgn&index=40
3. https://www.youtube.com/watch?v=kGaR2PQfqlk&list=PL4gu8xQu0_5JBO1FKRO5p20wc8DprlOgn&index=41
4. https://www.youtube.com/watch?v=8Cud5fmnvJQ&list=PL4gu8xQu0_5JBO1FKRO5p20wc8DprlOgn&index=42
5. https://www.youtube.com/watch?v=Hr96fzShANk&list=PL4gu8xQu0_5JBO1FKRO5p20wc8DprlOgn&index=43
6. https://www.youtube.com/watch?v=wrf4YuZA7Io&list=PL4gu8xQu0_5JBO1FKRO5p20wc8DprlOgn&index=45

3. List-Then-Eliminate Algorithm

https://www.youtube.com/watch?v=_FMDyEoIX3A&list=PL4gu8xQu0_5JBO1FKRO5p20wc8DprlOgn&index=37

The List-Then-Eliminate algorithm is a brute-force method used in concept learning. It
generates the complete hypothesis space first and then eliminates inconsistent hypotheses
based on the training data.

Steps of List-Then-Eliminate Algorithm:

1. Generate the hypothesis space H:
○ Construct all possible hypotheses from the attributes and their values.
2. Process training examples:
○ For each training example:
■ If the example is positive, eliminate all hypotheses in H that do not
cover the example.
■ If the example is negative, eliminate all hypotheses in H that cover
the example.
3. Output:
○ The remaining hypotheses in H are consistent with the training data.
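
For small attribute sets the brute-force procedure is easy to run directly. The sketch below enumerates conjunctive hypotheses where each attribute is either a specific value or "?"; it leaves out the all-∅ hypothesis, which covers nothing and is eliminated by the first positive example anyway. The domains and examples are hypothetical.

```python
from itertools import product

def covers(h, x):
    """True if hypothesis h classifies instance x as positive."""
    return all(hv == "?" or hv == xv for hv, xv in zip(h, x))

def list_then_eliminate(domains, examples):
    # Step 1: generate the hypothesis space (every attribute: a value or "?").
    hypotheses = list(product(*[list(values) + ["?"] for values in domains]))
    # Step 2: keep a hypothesis only if it labels each example correctly.
    for x, is_positive in examples:
        hypotheses = [h for h in hypotheses if covers(h, x) == is_positive]
    # Step 3: whatever remains is the version space.
    return hypotheses

domains = [("Circle", "Triangle"), ("Red", "Blue"), ("Small", "Large")]
examples = [
    (("Circle", "Red", "Large"), True),
    (("Triangle", "Blue", "Small"), False),
]
version_space = list_then_eliminate(domains, examples)
```

The enumeration step grows exponentially with the number of attributes, which is why this method is practical only for very small hypothesis spaces.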
OTHER IMPORTANT CONCEPTS

Overfitting in Decision Trees:

● Overfitting occurs when the decision tree becomes overly complex and fits the
training data too closely, capturing noise or outliers.
● Causes:
○ Too many branches: Reflect anomalies or noise.
○ Excessive complexity: Results in poor generalization.
● Impact:
○ Poor accuracy on unseen data (test data).

Approaches to Avoid Overfitting:

1. Pruning:
○ Reduces tree complexity by removing irrelevant branches.
○ Improves model generalization and reduces overfitting.
○ Two types:
■ Pre-pruning (Early Stopping):
■ Stops tree growth before it fully classifies the data.
■ The current node becomes a leaf node if a stopping condition
is met.
■ Common criteria:
■ Minimum entropy or Gini Impurity threshold.
■ Minimum gain from splitting.
■ Maximum depth of the tree.
■ Minimum number of samples in a node.
■ Post-pruning:
■ Builds a complete tree and prunes nodes in a bottom-up
manner.
■ Replaces subtrees with leaf nodes if this reduces validation
error.
2. Regularization:
○ Constrain tree complexity (for example, depth or number of leaves) and use a
validation set to determine the optimal level of complexity.

Pre-Pruning Example:

● Stop splitting if the entropy e(54|5) = 0.29 is below the threshold θ_ent = 0.4.
● The node becomes a leaf with the majority class label.
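
The stopping check in this example can be reproduced with a short sketch. It assumes that
e(54|5) denotes the entropy of a node holding 54 samples of one class and 5 of the other, and
that the natural logarithm is used; under those assumptions the value comes out at about 0.29.
The function names are illustrative only.

```python
import math

def entropy(class_counts):
    """Entropy (natural log) of a node, given the number of samples in each class."""
    total = sum(class_counts)
    return -sum((c / total) * math.log(c / total) for c in class_counts if c > 0)

def should_stop_splitting(class_counts, threshold):
    """Pre-pruning rule: make the node a leaf when its entropy is below the threshold."""
    return entropy(class_counts) < threshold

node_counts = [54, 5]   # assumed reading of e(54|5): 54 samples vs. 5 samples
theta_ent = 0.4         # threshold from the example

node_entropy = entropy(node_counts)
print(round(node_entropy, 2))                          # 0.29
print(should_stop_splitting(node_counts, theta_ent))   # True
```

With a base-2 logarithm the same node would have an entropy of roughly 0.42, so the choice of
logarithm base matters when comparing against a fixed threshold.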

Post-Pruning Process:

● Build the full tree using the training dataset.
● Evaluate branches and prune subtrees if:
○ Replacing them with leaf nodes reduces validation error.
● This ensures the tree retains only significant splits.

In summary, pre-pruning prevents overfitting early but risks underfitting, while post-pruning
allows complete growth before simplification, balancing fit and generalization.
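
The post-pruning process can be sketched as bottom-up reduced-error pruning on a small
hand-built tree. Everything concrete here is an assumption for illustration: the dictionary
representation of nodes, the "majority" field (the majority training class at a node, which a
real tree builder would record), the thresholds, and the validation samples.

```python
def predict(node, x):
    """Follow splits from the given node until a leaf is reached."""
    while "label" not in node:
        node = node["left"] if x[node["feature"]] <= node["threshold"] else node["right"]
    return node["label"]

def error_count(node, data):
    """Number of misclassified (x, y) pairs."""
    return sum(predict(node, x) != y for x, y in data)

def prune(node, validation):
    """Prune children first, then replace this subtree with a leaf if that
    strictly reduces the error on the validation samples reaching it."""
    if "label" in node:
        return node
    left_data = [(x, y) for x, y in validation if x[node["feature"]] <= node["threshold"]]
    right_data = [(x, y) for x, y in validation if x[node["feature"]] > node["threshold"]]
    node = dict(node,
                left=prune(node["left"], left_data),
                right=prune(node["right"], right_data))
    leaf = {"label": node["majority"]}
    if error_count(leaf, validation) < error_count(node, validation):
        return leaf
    return node

# Hypothetical overgrown tree: the split at 7.0 fits a single noisy training point.
full_tree = {
    "feature": 0, "threshold": 5.0, "majority": 1,
    "left": {"label": 0},
    "right": {
        "feature": 0, "threshold": 7.0, "majority": 1,
        "left": {"label": 1},
        "right": {"label": 0},
    },
}
validation = [([2.0], 0), ([4.0], 0), ([6.0], 1), ([8.0], 1), ([9.0], 1)]

pruned_tree = prune(full_tree, validation)
print(error_count(full_tree, validation), error_count(pruned_tree, validation))
```

Here the noisy lower split is collapsed into a leaf, while the root split is kept because
replacing it with a leaf would increase validation error. Many implementations also prune when
the error is unchanged, which yields smaller trees; the strict comparison follows the wording
used in these notes.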
Bias and Variance in Machine Learning

Bias and variance are two fundamental sources of error that help explain how well a
machine learning model can generalize to new, unseen data. Understanding the trade-off
between bias and variance is key to building effective models.

1. Bias

● Definition: Bias refers to the error introduced by approximating a real-world problem
with a simplified model. Essentially, it occurs when the model's assumptions about
the data lead to systematic errors.
● Causes of Bias:
○ The model is too simple (underfitting).
○ The algorithm makes overly simplistic assumptions, such as linearity in a
non-linear problem.
○ Limited features are used to train the model.
● Characteristics:
○ High Bias: The model fails to capture the true underlying patterns in the data.
As a result, it performs poorly on both the training and testing datasets,
leading to underfitting.
○ Low Bias: The model captures the underlying patterns of the data accurately,
so it fits the training data well; test performance also improves as long as
variance stays low.
● Example:
○ A linear regression model (with high bias) trying to predict a non-linear
relationship will perform poorly as it assumes a linear relationship between
input and output.

2. Variance

● Definition: Variance refers to the error introduced by the model’s sensitivity to small
changes in the training dataset. High variance indicates that the model is too
complex and learns not only the true underlying patterns but also the noise in the
training data.
● Causes of Variance:
○ The model has too many parameters (overfitting).
○ The model is highly flexible, allowing it to fit even the noise in the data.
○ Small fluctuations or variations in the training data result in large changes in
the model's predictions.
● Characteristics:
○ High Variance: The model performs very well on the training data but poorly
on new, unseen test data. This happens when the model learns to fit the
noise in the training data, resulting in overfitting.
○ Low Variance: The model’s predictions are stable and consistent across
different training sets. The model does not react strongly to small changes in
the data.
● Example:
○ A very deep decision tree with a large number of branches (high variance)
might perfectly classify the training data but fail to generalize to unseen data
because it has learned too much noise.

Bias-Variance Trade-off

There is a fundamental trade-off between bias and variance in machine learning models.
This trade-off dictates the model's ability to generalize to new data:

● Low Bias, Low Variance: This is the ideal scenario. The model is able to capture the
true patterns in the data without being overly influenced by noise. It generalizes well
to unseen data. However, achieving this perfect balance is often difficult.
● Low Bias, High Variance: This is an indication of overfitting. The model fits the
training data very well, including its noise and outliers, which results in high variance.
It performs poorly on new data because it fails to generalize.
● High Bias, Low Variance: This is an indication of underfitting. The model is too
simplistic to capture the true patterns of the data. Although the predictions may be
stable (low variance), they are consistently off the mark (high bias), resulting in poor
performance on both training and test data.
● High Bias, High Variance: This is the worst case. The model not only fails to
capture the true patterns (high bias) but also reacts inconsistently to fluctuations in
the training data (high variance). It leads to poor performance on both training and
test data.
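
The difference between a high-bias and a high-variance model can be illustrated with a small
simulation. This is a sketch under stated assumptions: the true function sin(2πx), the noise
level, the test point x0 = 0.25, and the two models (a constant predictor and a 1-nearest-
neighbour predictor) are all chosen for illustration. Each model is refitted on many noisy
training sets, and the squared bias and variance of its prediction at x0 are estimated.

```python
import math
import random
import statistics

def true_function(x):
    return math.sin(2 * math.pi * x)

def sample_training_set(rng, n=20, noise_sd=0.3):
    """Fixed inputs on [0, 1]; only the noise changes between training sets."""
    xs = [i / (n - 1) for i in range(n)]
    return [(x, true_function(x) + rng.gauss(0, noise_sd)) for x in xs]

def constant_model(train, x0):
    """Very simple model: always predict the mean target (expected to show high bias)."""
    return statistics.fmean(y for _, y in train)

def nearest_neighbour_model(train, x0):
    """Very flexible model: copy the target of the closest training input
    (expected to show high variance)."""
    return min(train, key=lambda pair: abs(pair[0] - x0))[1]

def bias_squared_and_variance(model, x0, n_repeats=500, seed=0):
    """Refit the model on many training sets and summarise its predictions at x0."""
    rng = random.Random(seed)
    predictions = [model(sample_training_set(rng), x0) for _ in range(n_repeats)]
    bias = statistics.fmean(predictions) - true_function(x0)
    return bias ** 2, statistics.pvariance(predictions)

x0 = 0.25
bias2_const, var_const = bias_squared_and_variance(constant_model, x0)
bias2_nn, var_nn = bias_squared_and_variance(nearest_neighbour_model, x0)
print(f"constant model:      bias^2={bias2_const:.3f}  variance={var_const:.3f}")
print(f"1-nearest neighbour: bias^2={bias2_nn:.3f}  variance={var_nn:.3f}")
```

The constant model should show a large squared bias with very little variance (underfitting),
while the nearest-neighbour model should show a small bias but a variance close to the noise
variance (overfitting).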

Bias and Variance in Relation to Overfitting and Underfitting

● Underfitting: Occurs when the model is too simple to capture the underlying patterns
of the data, resulting in high bias and low variance. The model cannot learn enough
from the training data, which leads to poor performance on both training and test
datasets.
○ Example: Using a linear regression model to fit data that has a non-linear
relationship between input and output.
● Overfitting: Occurs when the model is too complex and learns not only the patterns
but also the noise or outliers in the training data, leading to low bias and high
variance. While it performs excellently on the training data, it fails to generalize to
new, unseen data.
○ Example: Using a very deep decision tree to classify data where the model
fits even the smallest noise points in the data.

StandardScaler is a feature scaling technique that standardizes data by transforming it to
have a mean of 0 and a standard deviation of 1. It uses the formula:

z = (X − μ) / σ

Where:

● z is the standardized value,
● X is the data point,
● μ is the mean of the feature,
● σ is the standard deviation of the feature.

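This scaling puts all features on a comparable scale, so that no feature dominates merely
because of its units. It typically improves the performance of algorithms that are sensitive
to feature magnitudes, such as K-Means, PCA, and regression models trained with gradient
descent or regularization.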
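
The standardization described above can be written out by hand for a single feature. This is
a minimal sketch, not the scikit-learn implementation; the function name and the sample values
are hypothetical. It uses the population standard deviation, which is also what scikit-learn's
StandardScaler uses.

```python
import statistics

def standardize(values):
    """Return z-scores, (x - mean) / standard deviation, for one feature."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)   # population standard deviation
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]

heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]   # hypothetical feature values
scaled = standardize(heights_cm)
print([round(z, 3) for z in scaled])
```

In practice the mean and standard deviation should be computed on the training data only and
then reused to transform the test data, so that no information from the test set leaks into
training.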
COST function of Linear regression:

Use of Sigmoid function in Logistic regression
