Unit 4
Memory Based Learning
K-Nearest Neighbors (KNN) is a supervised machine learning algorithm
generally used for classification, though it can also be used for regression
tasks. It works by finding the "k" closest data points (neighbors) to a given
input and making a prediction based on the majority class (for classification)
or the average value (for regression). Since KNN makes no assumptions about the
underlying data distribution, it is a non-parametric, instance-based learning
method.
K-Nearest Neighbors is also called a lazy learner algorithm because it does not
learn from the training set immediately; instead, it stores the dataset and
performs the computation only at classification time.
What is 'K' in K Nearest Neighbour?
In the k-Nearest Neighbours algorithm k is just a number that tells the algorithm
how many nearby points or neighbors to look at when it makes a decision.
Example: Imagine you're deciding which fruit a new item is based on its shape
and size. You compare it to fruits you already know.
If k = 3, the algorithm looks at the 3 closest fruits to the new one.
If 2 of those 3 fruits are apples and 1 is a banana, the algorithm says the new
fruit is an apple because most of its neighbors are apples.
How to choose the value of k for KNN Algorithm?
The value of k in KNN decides how many neighbors the algorithm looks at
when making a prediction.
Choosing the right k is important for good results.
If the data has lots of noise or outliers, using a larger k can make the
predictions more stable.
But if k is too large, the model may become too simple and miss important
patterns; this is called underfitting.
So k should be picked carefully based on the data.
Statistical Methods for Selecting k
Cross-Validation: A good way to find the best value of k is k-fold
cross-validation. This means dividing the dataset into several parts (folds):
the model is trained on some of these parts and tested on the remaining
ones, and the process is repeated for each part. The k value that gives the
highest average accuracy during these tests is usually the best one to use,
as sketched below.
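Below is a minimal sketch of this selection procedure using scikit-learn; the
Iris dataset and the candidate range of k values (1 to 20) are illustrative
choices, not prescribed by the method.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
best_k, best_score = 1, 0.0
for k in range(1, 21):
    knn = KNeighborsClassifier(n_neighbors=k)
    # 5-fold cross-validation: mean accuracy across the folds
    score = cross_val_score(knn, X, y, cv=5).mean()
    if score > best_score:
        best_k, best_score = k, score
print("Best k:", best_k, "mean accuracy:", best_score)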
Elbow Method: In the Elbow Method we draw a graph showing the error rate or
accuracy for different k values. As k increases, the error usually drops at
first. But after a certain point, the error stops decreasing quickly. The point
where the curve changes direction and looks like an "elbow" is usually the best
choice for k.
Odd Values for k: It’s a good idea to use an odd number for k especially in
classification problems. This helps avoid ties when deciding which class is the
most common among the neighbors.
Distance Metrics Used in KNN Algorithm
KNN uses distance metrics to identify the nearest neighbors, and these
neighbors are then used for the classification or regression task. To identify
the nearest neighbors we use the distance metrics below:
1. Euclidean Distance
Euclidean distance is defined as the straight-line distance between two points in a
plane or space. You can think of it like the shortest path you would walk if you
were to go directly from one point to another.
d(x, X_i) = \sqrt{\sum_{j=1}^{d} (x_j - X_{ij})^2}
2. Manhattan Distance
This is the total distance you would travel if you could only move along
horizontal and vertical lines like a grid or city streets. It’s also called "taxicab
distance" because a taxi can only drive along the grid-like streets of a city.
d(x, y) = \sum_{i=1}^{n} |x_i - y_i|
3. Minkowski Distance
Minkowski distance is like a family of distances, which includes both Euclidean
and Manhattan distances as special cases.
d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}
From the formula above, when p=2, it becomes the same as the Euclidean
distance formula and when p=1, it turns into the Manhattan distance formula.
Minkowski distance is essentially a flexible formula that can represent either
Euclidean or Manhattan distance depending on the value of p.
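As a quick check of these formulas, the short NumPy sketch below computes all
three distances for two example points (the points themselves are arbitrary)
and confirms that Minkowski distance reduces to Euclidean at p = 2 and
Manhattan at p = 1.
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 0.0, 3.0])

euclidean = np.sqrt(np.sum((x - y) ** 2))  # straight-line distance
manhattan = np.sum(np.abs(x - y))          # grid ("taxicab") distance

def minkowski(x, y, p):
    # General form: p = 1 gives Manhattan, p = 2 gives Euclidean
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

print(euclidean, minkowski(x, y, 2))  # same value
print(manhattan, minkowski(x, y, 1))  # same value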
Working of KNN algorithm
The K-Nearest Neighbors (KNN) algorithm operates on the principle of similarity:
it predicts the label or value of a new data point by considering the labels
or values of its K nearest neighbors in the training dataset.
Step 1: Selecting the optimal value of K
K represents the number of nearest neighbors that need to be considered while
making a prediction.
Step 2: Calculating distance
To measure the similarity between the target point and the training data
points, Euclidean distance is typically used. The distance is calculated
between each data point in the dataset and the target point.
Step 3: Finding Nearest Neighbors
The k data points with the smallest distances to the target point are nearest
neighbors.
Step 4: Voting for Classification or Taking Average for Regression
When you want to classify a data point into a category like spam or not spam,
the KNN algorithm looks at the K closest points in the dataset. These closest
points are called neighbors. The algorithm then looks at which category the
neighbors belong to and picks the one that appears the most. This is called
majority voting.
In regression, the algorithm still looks for the K closest points. But instead
of voting for a class as in classification, it takes the average of the values
of those K neighbors. This average becomes the algorithm's predicted value for
the new point.
A typical illustration of this process shows how a test point is classified
based on its nearest neighbors: as the test point moves, the algorithm
identifies the closest k data points (5 in the illustration) and assigns the
test point the majority class label among those neighbors.
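The four steps can be condensed into a small from-scratch sketch; the toy data,
the value of k, and the function name below are illustrative assumptions.
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=5):
    # Step 2: Euclidean distance from the query to every training point
    distances = np.sqrt(np.sum((X_train - x_query) ** 2, axis=1))
    # Step 3: indices of the k nearest neighbors
    nearest = np.argsort(distances)[:k]
    # Step 4: majority vote among the neighbors' labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[1, 1], [1, 2], [2, 2], [6, 6], [7, 7], [6, 7]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X_train, y_train, np.array([2, 1]), k=3))  # prints 0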
Locally Weighted Regression
Locally weighted linear regression is a non-parametric regression method that
combines linear regression with k-nearest-neighbor-style, memory-based
learning. It is referred to as locally weighted because, for a query point, the
function is approximated on the basis of the data near that point, and weighted
because each data point's contribution is weighted by its distance from the
query point.
Locally Weighted Regression (LWR) is a non-parametric, memory-based algorithm,
which means it explicitly retains the training data and uses it every time a
prediction is made.
To explain the locally weighted linear regression, we first need to understand the
linear regression. The linear regression can be explained with the following
equations:
Let x be the query point and (x^{(i)}, y^{(i)}) the training observations.
Ordinary linear regression minimizes the cost function

J(\theta) = \sum_{i=1}^{m} \left( y^{(i)} - \theta^T x^{(i)} \right)^2

by calculating \theta so that it minimizes the above cost function.
Our output will be \theta^T x.
The closed-form formula for calculating \theta is:

\theta = (X^T X)^{-1} X^T Y

where X is the matrix of all observations and Y is the vector of all target
values.
For locally weighted linear regression, we instead minimize

J(\theta) = \sum_{i=1}^{m} w^{(i)} \left( y^{(i)} - \theta^T x^{(i)} \right)^2

by calculating \theta so that it minimizes this weighted cost function.
Our output will again be \theta^T x.
Here, w^{(i)} is the weight associated with each observation of the training
data. It can be calculated by the formula:

w^{(i)} = \exp\left( -\frac{(x^{(i)} - x)^2}{2\tau^2} \right)

Equivalently, the weights can be collected into a diagonal matrix W (with
W_{ii} = w^{(i)}) for a matrix calculation.
Impact of Bandwidth
Here x^{(i)} is an observation from the training data, x is the particular
point from which the distance is calculated, and \tau (tau) is the bandwidth.
\tau decides how local the fit is: if the function is to be closely fitted, its
value will be small. With the weight matrix W defined above, we can then
calculate \theta with the following equation:

\theta = (X^T W X)^{-1} X^T W Y
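A minimal NumPy sketch of this closed-form solution follows; the synthetic sine
data, the bias column, and the choice of \tau = 0.5 are illustrative
assumptions.
import numpy as np

def lwr_predict(X, y, x_query, tau=0.5):
    # Gaussian weights: w_i = exp(-||x_i - x||^2 / (2 tau^2))
    w = np.exp(-np.sum((X - x_query) ** 2, axis=1) / (2 * tau ** 2))
    W = np.diag(w)
    # Closed-form weighted least squares: theta = (X^T W X)^{-1} X^T W y
    theta = np.linalg.pinv(X.T @ W @ X) @ (X.T @ W @ y)
    return x_query @ theta

x = np.linspace(0, 3, 30)
X = np.c_[np.ones_like(x), x]  # bias column plus the raw feature
y = np.sin(x)
print(lwr_predict(X, y, np.array([1.0, 1.5])))  # close to sin(1.5)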
Radial Basis Functions
Radial Basis Function (RBF) Neural Networks are used for function
approximation tasks. They are a special category of feed-forward neural
networks comprising three layers. Due to this distinct three-layer architecture
and their universal approximation capabilities, they offer fast learning and
efficient performance in classification and regression problems.
How Do RBF Networks Work?
RBF Networks are conceptually similar to K-Nearest Neighbor (k-NN) models
though their implementation is distinct. The fundamental idea is that nearby items
with similar predictor variable values influence an item's predicted target value.
Here’s how RBF Networks operate:
1. Input Vector: The network receives an n-dimensional input vector that needs
classification or regression.
2. RBF Neurons: Each neuron in the hidden layer represents a prototype vector
from the training set. The network computes the Euclidean distance between
the input vector and each neuron's center.
3. Activation Function: The Euclidean distance is transformed using a Radial
   Basis Function (typically a Gaussian function) to compute the neuron's
   activation value. This value decreases exponentially as the distance
   increases (a minimal sketch of this step follows the list).
4. Output Nodes: Each output node calculates a score based on a weighted sum
of the activation values from all RBF neurons. For classification the category
with the highest score is chosen.
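To make step 3 concrete, here is a minimal sketch of a Gaussian RBF activation,
\varphi(x) = \exp(-\lVert x - c \rVert^2 / (2\sigma^2)); the center and spread
values are illustrative.
import numpy as np

def rbf_activation(x, center, sigma):
    # Activation is 1 at the center and decays exponentially with distance
    return np.exp(-np.sum((x - center) ** 2) / (2 * sigma ** 2))

c = np.array([0.0, 0.0])
print(rbf_activation(np.array([0.0, 0.0]), c, sigma=1.0))  # 1.0 at the center
print(rbf_activation(np.array([2.0, 0.0]), c, sigma=1.0))  # ~0.135 farther out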
For example, consider a dataset with two-dimensional data points from two
classes. An RBF Network trained with 20 neurons will have each neuron
representing a prototype in the input space. The network computes category
scores which can be visualized using 3-D mesh or contour plots. We assign
positive weights to neurons in the same category and negative weights to neurons
in different categories. The decision boundary can be plotted by evaluating scores
over a grid.
Key Characteristics of RBFs
Radial Basis Functions: These are real-valued functions dependent solely on
the distance from a central point. The Gaussian function is the most commonly
used type.
Dimensionality: The network's dimensions correspond to the number of
predictor variables.
Center and Radius: Each RBF neuron has a center and a radius (spread). The
radius affects how broadly each neuron influences the input space.
Architecture of RBF Networks
The architecture of an RBF Network typically consists of three layers:
Input Layer
Function: After receiving the input features the input layer sends them straight
to the hidden layer.
Components: It is made up of the same number of neurons as there are features
in the input data; each neuron in the input layer corresponds to one feature
of the input vector.
Hidden Layer
Function: This layer uses radial basis functions (RBFs) to conduct the non-
linear transformation of the input data.
Components: Neurons in the hidden layer apply the RBF to the incoming data.
The Gaussian function is the RBF that is most frequently used.
RBF Neurons: Every neuron in the hidden layer has a spread parameter (σ)
and a center, which are also referred to as prototype vectors. The spread
parameter modulates how the distance between the input vector and the neuron's
center translates into the neuron's output.
Output Layer
Function: The output layer uses weighted sums to integrate the hidden layer
neurons outputs to create the network's final output.
Components: It is made up of neurons that combine the outputs of the hidden
layer in a linear fashion. To reduce the error between the network's predictions
and the actual target values, the weights of these combinations are changed
during training.
Implementing Radial Basis Function Neural Network
An RBF neural network is trained in three stages: choosing the centers,
determining the spread parameters, and training the output weights.
Step 1: Selecting the Centers
Techniques for Center Selection: Centers can be picked at random from the
training data or by applying techniques such as k-means clustering.
K-Means Clustering: In this widely used center-selection technique, the input
data is grouped into k clusters, and the centers of these clusters are
employed as the centers for the RBF neurons.
Step 2: Determining the Spread Parameters
The spread parameter (σ) governs each RBF neuron's area of effect and
establishes the width of the RBF.
Calculation: The spread parameter can be manually adjusted for each neuron
or set as a constant for all neurons. A popular method is to set σ based on
the separation between the centers, frequently accomplished with a heuristic
such as dividing the greatest distance between centers by the square root of
twice the number of centers: σ = d_max / √(2K).
Step 3: Training the Output Weights
Linear Regression: Linear regression techniques are commonly used to estimate
the output-layer weights; the objective is to minimize the error between the
predicted output and the actual target values.
Pseudo-Inverse Method: One popular technique for determining the weights is to
use the pseudo-inverse of the matrix of hidden-layer outputs.
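The three stages can be sketched end to end as follows; scikit-learn's KMeans
is one way to implement Step 1, and the number of centers, dataset, and one-hot
output encoding here are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
K = 10  # number of RBF neurons

# Step 1: pick centers with k-means clustering
centers = KMeans(n_clusters=K, n_init=10, random_state=42).fit(X).cluster_centers_

# Step 2: one shared spread, sigma = d_max / sqrt(2K)
d_max = max(np.linalg.norm(a - b) for a in centers for b in centers)
sigma = d_max / np.sqrt(2 * K)

def hidden_layer(X):
    # Gaussian RBF activation for every (sample, center) pair
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma ** 2))

# Step 3: output weights via the pseudo-inverse, with one-hot targets
T = np.eye(3)[y]
W = np.linalg.pinv(hidden_layer(X)) @ T

pred = np.argmax(hidden_layer(X) @ W, axis=1)
print("Training accuracy:", (pred == y).mean())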
Case Based Learning in Machine Learning
Case-Based Learning (CBL) in Machine Learning (ML) is a method where a
model solves new problems by comparing them with previously encountered cases
(examples). It is a memory-based learning approach that doesn't explicitly learn
a model during training but instead stores instances (cases) and defers
generalization until a new query is presented.
🔍 What is Case-Based Learning?
Case-Based Learning (CBL) is inspired by human reasoning: when we encounter
a problem, we think of similar past experiences (cases) and reuse the knowledge
from them to make decisions.
Cases = stored experiences or examples, typically represented as feature
vectors with corresponding labels.
When a new problem (query) is encountered, the system searches for
similar past cases.
Decision is made based on the most similar case(s).
🔍 Key Components of a Case-Based Learning System
Case Base: A memory of past cases (instances, examples)
Similarity Measure: A method to compute how similar a new problem is to stored
cases (e.g., Euclidean distance)
Retrieval Mechanism: Algorithm to find the most relevant past cases
Adaptation: Adjust the solution from retrieved case(s) to fit the new problem
Learning: Update the case base by adding new cases (and possibly removing old
ones)
🔍 Examples of Case-Based Learning Algorithms
1. K-Nearest Neighbors (KNN) – A classic case-based learner.
o Stores all training data.
o When a query comes, finds the k most similar cases and makes a
prediction (e.g., by majority vote for classification).
2. Locally Weighted Regression (LWR) – Predicts based on locally relevant
training data.
o Assigns weights to training instances based on proximity to the query.
3. Case-Based Reasoning (CBR) – Used in expert systems and AI, involving:
o Retrieve → Reuse → Revise → Retain cycle.
🔍 Case-Based Reasoning (CBR) Cycle
1. Retrieve most similar case(s)
2. Reuse the case to solve the problem
3. Revise the proposed solution if necessary
4. Retain the new solution as part of the case base
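A toy sketch of this cycle follows; the case representation, the distance-based
similarity measure, and the optional revise hook are illustrative assumptions.
import numpy as np

case_base = [  # each case: (feature vector, stored solution)
    (np.array([1.0, 1.0]), "solution A"),
    (np.array([5.0, 5.0]), "solution B"),
]

def solve(query, revise=None):
    # Retrieve: the most similar stored case (Euclidean distance)
    features, solution = min(case_base, key=lambda c: np.linalg.norm(c[0] - query))
    # Reuse the retrieved solution; Revise it if a correction is supplied
    if revise is not None:
        solution = revise(solution)
    # Retain: store the solved query as a new case
    case_base.append((query, solution))
    return solution

print(solve(np.array([1.2, 0.8])))  # retrieves and reuses "solution A"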
🔍 Advantages of Case-Based Learning
No need for an explicit training phase.
Naturally supports incremental learning.
Good for domains with episodic memory, like helpdesk systems, diagnosis,
etc.
Highly interpretable (reasoning is traceable through past cases).
🔍 Disadvantages
Slow during inference (especially for large datasets).
Memory-intensive, since it stores all (or most) of the training data.
PAC learning model
Probably Approximately Correct (PAC) learning stands as a cornerstone theory,
offering insights into the fundamental question of how much data is needed for
learning algorithms to reliably generalize to unseen instances. PAC learning
provides a theoretical framework that underpins many machine learning
algorithms.
PAC Learning Theorem
The PAC learning theorem provides formal guarantees about the performance of
learning algorithms. It states that for a given accuracy (ε) and confidence (δ),
there exists a sample size (m) such that any learning algorithm that returns a
hypothesis consistent with the training samples will, with probability at least 1-δ,
have an error rate less than ε on unseen data.
Mathematically, the PAC learning theorem can be expressed as a
sample-complexity bound; a standard form of this bound is:

m \geq \frac{1}{\epsilon} \left( 4 \log_2 \frac{2}{\delta} + 8\, VC(H) \log_2 \frac{13}{\epsilon} \right)

where:
m is the number of samples,
ϵ is the desired accuracy,
δ is the desired confidence level,
VC(H) is the Vapnik-Chervonenkis dimension of the hypothesis space H.
The VC dimension is a measure of the capacity or complexity of the hypothesis
space. It quantifies the maximum number of points that can be shattered (i.e.,
correctly classified in all possible ways) by the hypotheses in the space. A higher
VC dimension indicates a more complex hypothesis space, which may require
more samples to ensure good generalization.
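A worked computation of this bound makes the sample-size guidance concrete;
the values of ε, δ, and VC(H) below are illustrative.
import math

def pac_sample_size(epsilon, delta, vc_dim):
    # m >= (1/eps) * (4*log2(2/delta) + 8*VC(H)*log2(13/eps))
    return math.ceil((1 / epsilon) * (4 * math.log2(2 / delta)
                                      + 8 * vc_dim * math.log2(13 / epsilon)))

# 90% accuracy (eps = 0.1), 95% confidence (delta = 0.05), VC(H) = 3
print(pac_sample_size(0.1, 0.05, 3))  # on the order of a couple thousand samples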
The PAC learning theorem provides a powerful tool for analyzing and designing
learning algorithms. It helps determine the sample size needed to achieve a
desired level of accuracy and confidence, guiding the development of efficient
and effective models.
Challenges of PAC Learning
Real-world Applicability
While PAC learning provides a solid theoretical foundation, applying it to real-
world problems can be challenging. The assumptions made in PAC learning, such
as the availability of a finite hypothesis space and the existence of a true
underlying function, may not always hold in practice.
In real-world scenarios, data distributions can be complex and unknown, and the
hypothesis space may be infinite or unbounded. These factors can complicate the
application of PAC learning, requiring additional techniques and considerations
to achieve practical results.
Computational Complexity
Finding the optimal hypothesis within the PAC framework can be
computationally expensive, especially for large and complex hypothesis spaces.
This can limit the practical use of PAC learning for certain applications,
particularly those involving high-dimensional data or complex models.
Efficient algorithms and optimization techniques are needed to make PAC
learning feasible for practical use. Researchers are continually developing new
methods to address the computational challenges of PAC learning and improve its
applicability to real-world problems.
Ensemble Learning
Ensemble learning is a method where we use many small models instead of just
one. Each of these models may not be very strong on its own, but when we put
their results together, we get a better and more accurate answer. It's like asking a
group of people for advice instead of just one person—each one might be a little
wrong, but together, they usually give a better answer.
Types of Ensembles Learning in Machine Learning
There are three main types of ensemble methods:
1. Bagging (Bootstrap Aggregating):
Models are trained independently on different random subsets of the training
data. Their results are then combined—usually by averaging (for regression) or
voting (for classification). This helps reduce variance and prevents overfitting.
2. Boosting:
Models are trained one after another. Each new model focuses on fixing the
errors made by the previous ones. The final prediction is a weighted
combination of all models, which helps reduce bias and improve accuracy.
3. Stacking (Stacked Generalization):
Multiple different models (often of different types) are trained, and their
predictions are used as inputs to a final model, called a meta-model. The meta-
model learns how to best combine the predictions of the base models, aiming
for better performance than any individual model (a short sketch follows this
list).
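Since only bagging and boosting are coded later in this unit, here is a minimal
scikit-learn sketch of stacking; the particular base models and meta-model
chosen are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

stacking_classifier = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier()), ("knn", KNeighborsClassifier())],
    final_estimator=LogisticRegression(max_iter=1000),  # the meta-model
)
stacking_classifier.fit(X, y)
print("Training accuracy:", stacking_classifier.score(X, y))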
1. Bagging Algorithm
Bagging can be used for both regression and classification tasks. Here is an
overview of the Bagging algorithm:
Bootstrap Sampling: Creates 'N' subsets of the original training data by
randomly sampling rows with replacement, so a subset may contain duplicate
rows and omit others. This step ensures that the base models are trained on
diverse subsets of the data.
Base Model Training: For each bootstrapped sample we train a base model
independently on that subset of data. These weak models are trained in parallel
to increase computational efficiency and reduce time consumption. We can use
different base learners i.e. different ML models as base learners to bring
variety and robustness.
Prediction Aggregation: To make a prediction on testing data, the predictions
of all base models are combined. For classification tasks this can be majority
voting or weighted majority voting, while for regression it involves averaging
the predictions.
Out-of-Bag (OOB) Evaluation: Some samples are excluded from the training
subset of particular base models during the bootstrapping method. These “out-
of-bag” samples can be used to estimate the model’s performance without the
need for cross-validation.
Final Prediction: After aggregating the predictions from all the base models,
Bagging produces a final prediction for each instance.
Python code for implementing a Bagging estimator with scikit-learn:
1. Importing Libraries and Loading Data
BaggingClassifier: for creating an ensemble of classifiers trained on different
subsets of data.
DecisionTreeClassifier: the base classifier used in the bagging ensemble.
load_iris: to load the Iris dataset for classification.
train_test_split: to split the dataset into training and testing subsets.
accuracy_score: to evaluate the model’s prediction accuracy.
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
2. Loading and Splitting the Iris Dataset
data = load_iris(): loads the Iris dataset, which includes features and target
labels.
X = data.data: extracts the feature matrix (input variables).
y = data.target: extracts the target vector (class labels).
train_test_split(...): splits the data into training (80%) and testing (20%) sets,
with random_state=42 to ensure reproducibility.
data = load_iris()
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
random_state=42)
3. Creating a Base Classifier
A decision tree is chosen as the base model. Decision trees are prone to
overfitting when trained on small datasets, which makes them good candidates
for bagging.
base_classifier = DecisionTreeClassifier(): initializes a Decision Tree
classifier, which will serve as the base estimator in the Bagging ensemble.
base_classifier = DecisionTreeClassifier()
4. Creating and Training the Bagging Classifier
A BaggingClassifier is created using the decision tree as the base classifier.
n_estimators = 10 specifies that 10 decision trees will be trained on different
bootstrapped subsets of the training data.
bagging_classifier = BaggingClassifier(base_classifier, n_estimators=10,
random_state=42)
bagging_classifier.fit(X_train, y_train)
5. Making Predictions and Evaluating Accuracy
The trained bagging model predicts labels for test data.
The accuracy of the predictions is calculated by comparing the predicted labels
(y_pred) to the actual labels (y_test).
y_pred = bagging_classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
Output:
Accuracy: 1.0
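As a follow-up to the OOB evaluation point above, BaggingClassifier can score
itself on the out-of-bag samples via its oob_score option; this sketch reuses
base_classifier and the training split from the example.
# More estimators, so every sample is out-of-bag for at least one tree
bagging_oob = BaggingClassifier(base_classifier, n_estimators=50,
                                oob_score=True, random_state=42)
bagging_oob.fit(X_train, y_train)
print("OOB score:", bagging_oob.oob_score_)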
2. Boosting Algorithm
Boosting is an ensemble technique that combines multiple weak learners to create
a strong learner. Weak models are trained in series such that each next model
tries to correct errors of the previous model until the entire training dataset is
predicted correctly. One of the most well-known boosting algorithms is AdaBoost
(Adaptive Boosting). Here is an overview of Boosting algorithm:
Initialize Model Weights: Begin with a single weak learner and assign equal
weights to all training examples.
Train Weak Learner: Train a weak learner on this dataset.
Sequential Learning: Boosting works by training models sequentially where
each model focuses on correcting the errors of its predecessor. Boosting
typically uses a single type of weak learner like decision trees.
Weight Adjustment: Boosting assigns weights to training datapoints.
Misclassified examples receive higher weights in the next iteration so that next
models pay more attention to them.
Python code for implementing a Boosting estimator with scikit-learn:
1. Importing Libraries and Modules
AdaBoostClassifier from sklearn.ensemble: for building the AdaBoost
ensemble model.
DecisionTreeClassifier from sklearn.tree: as the base weak learner for
AdaBoost.
load_iris from sklearn.datasets: to load the Iris dataset.
train_test_split from sklearn.model_selection: to split the dataset into
training and testing sets.
accuracy_score from sklearn.metrics: to evaluate the model’s accuracy.
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
2. Loading and Splitting the Dataset
data = load_iris(): loads the Iris dataset, which includes features and
target labels.
X = data.data: extracts the feature matrix (input variables).
y = data.target: extracts the target vector (class labels).
train_test_split(...): splits the data into training (80%) and testing (20%)
sets, with random_state=42 to ensure reproducibility.
data = load_iris()
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
random_state=42)
3. Defining the Weak Learner
We are creating the base classifier as a decision tree with maximum depth 1 (a
decision stump). This simple tree will act as a weak learner for the AdaBoost
algorithm, which iteratively improves by combining many such weak learners.
base_classifier = DecisionTreeClassifier(max_depth=1)
4. Creating and Training the AdaBoost Classifier
base_classifier: The weak learner used in boosting.
n_estimators = 50: Number of weak learners to train sequentially.
learning_rate = 1.0: Controls the contribution of each weak learner to the
final model.
random_state = 42: Ensures reproducibility.
adaboost_classifier = AdaBoostClassifier(
base_classifier, n_estimators=50, learning_rate=1.0, random_state=42
)
adaboost_classifier.fit(X_train, y_train)
5. Making Predictions and Calculating Accuracy
The trained model first predicts labels for the test data. We then calculate
the accuracy by comparing the true labels y_test with the predicted
labels y_pred; the accuracy_score function returns the proportion of correctly
predicted samples. Finally, we print the accuracy value.
y_pred = adaboost_classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
Output:
Accuracy: 1.0
Benefits of Ensemble Learning in Machine Learning
Ensemble learning is a versatile approach that can be applied to machine
learning models for:
Reduction in Overfitting: By aggregating the predictions of multiple models,
ensembles can reduce the overfitting that individual complex models might
exhibit.
Improved Generalization: It generalizes better to unseen data by minimizing
variance and bias.
Increased Accuracy: Combining multiple models gives higher predictive
accuracy.
Robustness to Noise: It mitigates the effect of noisy or incorrect data points
by averaging out predictions from diverse models.
Flexibility: It can work with diverse models including decision trees, neural
networks and support vector machines making them highly adaptable.
Bias-Variance Tradeoff: Techniques like bagging reduce variance, while
boosting reduces bias leading to better overall performance.
There are various ensemble learning techniques we can use, each with its own
pros and cons.
Ensemble Learning Techniques
Technique | Category | Description
Random Forest | Bagging | Constructs multiple decision trees on bootstrapped
subsets of the data and aggregates their predictions for the final output,
reducing overfitting and variance.
Random Subspace Method | Bagging | Trains models on random subsets of the
input features to enhance diversity and improve generalization while reducing
overfitting.
Gradient Boosting Machines (GBM) | Boosting | Sequentially builds decision
trees, with each tree correcting the errors of the previous ones, enhancing
predictive accuracy iteratively.
Extreme Gradient Boosting (XGBoost) | Boosting | Adds optimizations like tree
pruning, regularization, and parallel processing for robust and efficient
predictive models.
AdaBoost (Adaptive Boosting) | Boosting | Focuses on challenging examples by
assigning weights to data points and combines weak classifiers with weighted
voting for final predictions.
CatBoost | Boosting | Specializes in handling categorical features natively
without extensive preprocessing, offering high predictive accuracy and
automatic overfitting handling.