Notes - Unit-III & IV - Computer Vision
UNIT-III & IV
COMPUTER VISION
Q.1) What is Clustering-based segmentation? Explain the K-Means clustering algorithm with an example.
Ans:- Clustering-based segmentation is a technique used in image processing and computer vision to divide an
image into its constituent parts or objects. The goal is to group similar pixels or regions together based on their
features, such as color, texture, or intensity.
*K-Means Algorithm:*
K-Means is a popular unsupervised learning algorithm used for clustering-based segmentation. The algorithm
works as follows:
1. *Initialization*: Choose the number of clusters (K) and randomly select K initial centroids.
2. *Assignment*: Assign each pixel in the image to the closest centroid based on a distance metric (e.g.,
Euclidean distance).
3. *Update*: Update the centroids by calculating the mean of all pixels assigned to each cluster.
4. *Repeat*: Repeat steps 2-3 until convergence or a stopping criterion is reached.
*Example:*
Suppose we have an image of a sunset with three main regions: sky, sea, and beach. We want to segment the
image into these three regions using K-Means.
1. *Initialization*: We choose K=3 and randomly select three initial centroids (e.g., (100, 100, 100), (200, 200,
200), and (50, 50, 50)).
2. *Assignment*: We assign each pixel in the image to the closest centroid based on Euclidean distance.
3. *Update*: We update the centroids by calculating the mean of all pixels assigned to each cluster.
4. *Repeat*: We repeat steps 2-3 until convergence. The final centroids represent the mean color of each region.
*Segmented Image:*
The resulting segmented image will have three regions, each corresponding to one of the centroids.
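A minimal sketch of this pipeline with OpenCV and NumPy; the file name sunset.jpg and the output name are hypothetical placeholders:

```python
# K-Means colour segmentation with OpenCV.
import cv2
import numpy as np

img = cv2.imread("sunset.jpg")                     # hypothetical input image
pixels = img.reshape(-1, 3).astype(np.float32)     # one row per pixel

# Stop after 10 iterations or when centroids move by less than 1.0
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K = 3
_, labels, centers = cv2.kmeans(pixels, K, None, criteria,
                                10, cv2.KMEANS_RANDOM_CENTERS)

# Recolour every pixel with the mean colour of its cluster
segmented = centers[labels.flatten()].astype(np.uint8).reshape(img.shape)
cv2.imwrite("segmented.jpg", segmented)
```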
K-Means is a simple yet effective algorithm for clustering-based segmentation. However, it has some limitations: the result depends on the random initial centroids, the number of clusters K must be chosen in advance, and the algorithm assumes compact, roughly spherical clusters in feature space.
-----------------------------------------------------------------------------------------------------------------------------------
Q.2) What is Region-based segmentation? Explain region growing, region splitting, and region merging with examples.
Ans:- Region-based segmentation is a technique used in image processing and computer vision to divide an image into its constituent parts or objects. The goal is to group similar pixels or regions together based on their features, such as color, texture, or intensity.
1. Region Growing
Region growing is a technique that starts with a seed point or a small region and gradually grows the region by
adding neighboring pixels that meet certain criteria.
_Steps:_
1. Select one or more seed points inside the region of interest.
2. Examine the neighboring pixels of the region and add those that satisfy a similarity criterion (e.g., intensity difference below a threshold).
3. Repeat until no more neighboring pixels satisfy the criterion.
_Example:_
Suppose we have an image of a brain MRI scan, and we want to segment the tumor region. We can start with a
seed point within the tumor and grow the region by adding neighboring pixels with similar intensity values.
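A minimal region-growing sketch on a grayscale NumPy array, using 4-connectivity and an absolute intensity tolerance relative to the seed; this criterion is one common choice among several:

```python
# Breadth-first region growing from a seed pixel.
from collections import deque
import numpy as np

def region_grow(img, seed, tol=10):
    """Grow a region from `seed`, adding 4-connected neighbours whose
    intensity differs from the seed intensity by at most `tol`."""
    h, w = img.shape
    mask = np.zeros((h, w), dtype=bool)
    seed_val = int(img[seed])
    q = deque([seed])
    while q:
        y, x = q.popleft()
        if not (0 <= y < h and 0 <= x < w) or mask[y, x]:
            continue
        if abs(int(img[y, x]) - seed_val) <= tol:
            mask[y, x] = True
            q.extend([(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)])
    return mask

# e.g. mask = region_grow(mri_slice, seed=(120, 80), tol=12)
```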
2. Region Splitting
Region splitting is a technique that starts with a large region and recursively splits it into smaller sub-regions
based on certain criteria.
_Steps:_
1. Start with the entire image as a single region.
2. Test the region for homogeneity (e.g., intensity variance below a threshold).
3. If the region is not homogeneous, split it into sub-regions (commonly four quadrants) and repeat the test on each sub-region.
_Example:_
Suppose we have an image of a landscape with a mountain range, and we want to segment the different regions
(e.g., sky, mountain, forest). We can start with the entire image and recursively split it into smaller regions
based on color and texture differences.
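A recursive quadtree-splitting sketch in plain NumPy; the variance threshold and minimum block size are illustrative homogeneity criteria, not values from the notes:

```python
# Recursive quadtree splitting driven by an intensity-variance test.
import numpy as np

def split(img, y=0, x=0, h=None, w=None, thresh=100.0, min_size=8,
          regions=None):
    """Return (y, x, h, w) blocks whose intensity variance is below `thresh`."""
    if regions is None:
        regions = []
    if h is None:
        h, w = img.shape
    block = img[y:y + h, x:x + w]
    if block.var() > thresh and min(h, w) > min_size:
        h2, w2 = h // 2, w // 2   # split into four quadrants
        for dy, dx, hh, ww in [(0, 0, h2, w2), (0, w2, h2, w - w2),
                               (h2, 0, h - h2, w2), (h2, w2, h - h2, w - w2)]:
            split(img, y + dy, x + dx, hh, ww, thresh, min_size, regions)
    else:
        regions.append((y, x, h, w))
    return regions
```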
3. Region Merging
Region merging is a technique that starts with a set of small regions and gradually merges them into larger
regions based on certain criteria.
_Steps:_
1. Start with an initial set of small regions (e.g., individual pixels or an over-segmentation).
2. Compare adjacent regions using a similarity criterion such as mean intensity, color, or texture.
3. Merge adjacent regions that satisfy the criterion and repeat until no further merges are possible.
_Example:_
Suppose we have an image of a cityscape with many small buildings, and we want to segment the different
buildings. We can start with a set of small regions corresponding to individual buildings and merge them into
larger regions based on color and texture similarities.
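A toy region-merging sketch that merges labelled regions with similar mean intensity; a full implementation would also check that regions are adjacent, which is omitted here for brevity:

```python
# Toy region merging on a label image: regions with similar mean
# intensity are merged (adjacency checking omitted for brevity).
import numpy as np

def merge_regions(img, labels, tol=10):
    means = {l: float(img[labels == l].mean()) for l in np.unique(labels)}
    merged = labels.copy()
    for a in sorted(means):
        for b in sorted(means):
            if a < b and abs(means[a] - means[b]) < tol:
                merged[merged == b] = a   # relabel region b as region a
                means[b] = means[a]
    return merged
```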
-----------------------------------------------------------------------------------------------------------------------------------
Q.3) What is Thresholding? Explain the different types of thresholding.
Ans:- Thresholding is a technique used to separate objects from the background in an image based on intensity values.
The simplest thresholding methods replace each pixel in an image with a black pixel if the image intensity is less than a fixed value called the threshold, or a white pixel if the pixel intensity is greater than that threshold. For example, thresholding an image containing a dark puppy and a bright puppy would turn the dark puppy completely black and the bright puppy completely white.
*Types of Thresholding:*
1. Binary Thresholding: Divides pixels into two classes (object and background). Binary thresholding converts a grayscale image into a binary image by applying a single threshold value: pixels with intensity above the threshold are assigned a value of 255 (white), while pixels with intensity below the threshold are assigned a value of 0 (black). (A combined code sketch of these methods follows the list.)
2. Adaptive Thresholding: Adjusts the threshold value based on local image characteristics. Rather than using one global value, adaptive thresholding computes a threshold from the local neighbourhood of each pixel (its intensity statistics, texture, or color). This approach is useful when the image has varying illumination, so the threshold must change dynamically across the image.
3. Otsu Thresholding: Automatically determines the optimal global threshold for segmenting an image into two regions, by choosing the value that maximises the between-class variance between foreground and background. The method was proposed by Nobuyuki Otsu in 1979.
4. Manual Thresholding: A user-defined threshold value. The user segments the image into two regions by manually selecting a threshold, typically by inspecting the image histogram.
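A minimal sketch of the first three variants with OpenCV; the file name, the global threshold 127, and the adaptive parameters (block size 11, offset 2) are illustrative choices, not prescribed by the notes:

```python
# Binary, adaptive, and Otsu thresholding sketches with OpenCV.
import cv2

gray = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)

# Binary: one global threshold (127 chosen arbitrarily)
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

# Adaptive: threshold computed per 11x11 neighbourhood, offset by 2
adaptive = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY, 11, 2)

# Otsu: the optimal global threshold is chosen automatically (returned as t)
t, otsu = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
```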
-----------------------------------------------------------------------------------------------------------------------------------
Q.4) What is Feature extraction? Explain common feature extraction techniques.
Ans:- Feature extraction is the process of automatically extracting relevant features from an image. The goal of feature extraction is to reduce the dimensionality of the image data while retaining the most important information.
1. Edge Detection
Edge detection is a technique used to identify the boundaries or edges within an image. Common edge detection
algorithms include:
- Sobel operator
- Canny edge detector
- Laplacian of Gaussian (LoG)
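A brief sketch of the first two detectors with OpenCV; the input file and the Canny hysteresis thresholds (100, 200) are illustrative:

```python
# Edge detection sketches: Sobel gradients and Canny edges.
import cv2

gray = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)
sobel_x = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)  # horizontal gradient
sobel_y = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)  # vertical gradient
edges = cv2.Canny(gray, 100, 200)  # lower/upper hysteresis thresholds
```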
2. Corner Detection
Corner detection is a technique used to identify the corners or interest points within an image. Common corner detection algorithms include:
- Harris corner detector
- Shi-Tomasi (Good Features to Track)
- FAST (Features from Accelerated Segment Test)
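A Harris corner sketch with OpenCV; blockSize, ksize, k, and the 0.01 relative response cutoff are typical illustrative values:

```python
# Harris corner detection with OpenCV.
import cv2
import numpy as np

gray = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)
response = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=0.04)
corners = np.argwhere(response > 0.01 * response.max())  # (row, col) pairs
```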
3. Blob Detection
Blob detection is a technique used to identify the blobs or regions of interest within an image. Common blob detection algorithms include:
- Laplacian of Gaussian (LoG)
- Difference of Gaussians (DoG)
- Determinant of Hessian (DoH)
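A Laplacian-of-Gaussian blob sketch using scikit-image; the parameter values are illustrative:

```python
# Laplacian-of-Gaussian blob detection with scikit-image.
from skimage import io
from skimage.feature import blob_log

gray = io.imread("input.jpg", as_gray=True)
# Each row is (row, col, sigma); blob radius is roughly sqrt(2) * sigma
blobs = blob_log(gray, max_sigma=30, num_sigma=10, threshold=0.1)
```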
4. Texture Analysis
Texture analysis is a technique used to describe the texture or pattern within an image. Common texture analysis
algorithms include:
- Gabor filters
- Wavelet transform
- Local Binary Patterns (LBP)
5. Color Features
Color features are used to describe the color properties of an image. Common color features include:
- Color histograms
- Color moments
- Color correlograms
6. Shape Features
Shape features are used to describe the shape properties of an object within an image. Common shape features
include:
- Fourier descriptors
- Moment invariants
- Shape context
Q.5) How are Color, Texture and Shape Features represented?
Ans:- Color, texture, and shape features are represented using various mathematical and computational
techniques.
1.Color Features:-Color features can be represented using various color models, such as:
1. *RGB (Red, Green, Blue)*: Each pixel is represented by three values (R, G, B) that range from 0 to 255.
2. *HSV (Hue, Saturation, Value)*: Each pixel is represented by three values (H, S, V); hue is an angle from 0 to 360 degrees, while saturation and value typically range from 0 to 1 (or 0 to 255 in 8-bit encodings).
3. *Color Histograms*: A color histogram represents the distribution of colors in an image. It's a graphical
representation of the number of pixels for each color value.
4. *Color Moments*: Color moments are a set of statistical measures that describe the distribution of colors in
an image.
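A short sketch of two of these colour features with OpenCV: a 256-bin histogram of one channel, and the first two colour moments (per-channel mean and standard deviation). The file name is illustrative:

```python
# A 256-bin histogram of the blue channel and per-channel colour moments.
import cv2

img = cv2.imread("input.jpg")                             # BGR image
hist_b = cv2.calcHist([img], [0], None, [256], [0, 256])  # blue channel

# First two colour moments: mean and standard deviation per channel
means, stds = cv2.meanStdDev(img)
```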
2.Texture Features:-Texture features can be represented using various techniques, such as:
1. *Gray-Level Co-Occurrence Matrix (GLCM)*: The GLCM represents the probability that pairs of pixels with particular intensity values occur at a given spatial relationship (distance and direction).
2. *Local Binary Patterns (LBP)*: LBP represents the texture features by comparing the intensity values of
neighboring pixels.
3. *Gabor Filters*: Gabor filters are used to extract texture features by applying a set of filters with different
frequencies and orientations.
4. *Wavelet Transform*: The wavelet transform represents the texture features by decomposing the image into
different frequency bands.
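A sketch computing a GLCM contrast statistic and uniform LBP codes with scikit-image (OpenCV used only for loading); the distance, angle, and the (P=8, R=1) neighbourhood are illustrative choices:

```python
# Texture features: GLCM contrast and Local Binary Patterns.
import cv2
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern

gray = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)

# GLCM for distance 1, angle 0; then a contrast statistic from it
glcm = graycomatrix(gray, distances=[1], angles=[0], levels=256,
                    symmetric=True, normed=True)
contrast = graycoprops(glcm, "contrast")

# Uniform LBP with 8 neighbours at radius 1
lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
```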
3.Shape Features:- Shape features can be represented using various techniques, such as:
1. *Fourier Descriptors*: Fourier descriptors represent the shape features by describing the shape boundary
using a set of Fourier coefficients.
2. *Moment Invariants*: Moment invariants represent the shape features by calculating a set of statistical
measures that are invariant to translation, rotation, and scaling.
3. *Shape Context*: Shape context represents the shape features by describing the distribution of points within
the shape boundary.
4. *Convex Hull*: The convex hull represents the shape features by describing the smallest convex polygon that
encloses the shape.
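A sketch of two of these shape features with OpenCV: the seven Hu moment invariants of the largest contour, and its convex hull. The threshold of 127 is an arbitrary illustrative choice:

```python
# Shape features: Hu moment invariants and the convex hull.
import cv2

gray = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)

cnt = max(contours, key=cv2.contourArea)   # largest object
hu = cv2.HuMoments(cv2.moments(cnt))       # 7 invariant moments
hull = cv2.convexHull(cnt)                 # smallest enclosing convex polygon
```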
--------------------------------------------------------------------------------------------------------------------------------------
Q.6) What are the different types of shape classes? Explain any two in detail.
Ans:- Shape classes are categories used to describe and classify shapes based on their geometric properties. Two widely used classes are contour-based shape representation, which describes an object by its boundary, and region-based shape representation, which describes an object by its interior region.
Types of Contour-Based Shape Representations:
1. *Chain Code*: A chain code is a sequence of numbers that represent the direction of the contour at each
point.
2. *Polygonal Approximation*: The contour is approximated by a polygon with a specified number of vertices.
3. *Fourier Descriptors*: The contour is represented by a set of Fourier coefficients that describe the shape of
the contour.
4. *Moment Invariants*: The contour is represented by a set of moment invariants that are invariant to
translation, rotation, and scaling.
5. *Shape Context*: The contour is represented by a set of shape context descriptors that describe the
distribution of points along the contour.
Contour Extraction Techniques:
1. *Boundary Following*: The contour is extracted by following the boundary of the object.
2. *Edge Detection*: The contour is extracted by detecting the edges of the object using edge detection
algorithms such as Canny or Sobel.
3. *Active Contours*: The contour is extracted by using active contour models such as snakes or level sets.
4. *Contour Matching*: The contour is matched with a set of predefined contours to determine the shape of the
object.
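A sketch of edge-based contour extraction followed by polygonal approximation with OpenCV; CHAIN_APPROX_NONE keeps every boundary point (the raw material for a chain code), and the 1% perimeter tolerance is an illustrative choice:

```python
# Contour extraction and polygonal approximation with OpenCV.
import cv2

gray = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(gray, 100, 200)
contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)

# Approximate each boundary by a polygon within 1% of its perimeter
polygons = [cv2.approxPolyDP(c, 0.01 * cv2.arcLength(c, True), True)
            for c in contours]
```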
Advantages of Contour-Based Shape Representation:
1. *Robustness to Noise*: Contour-based shape representation is robust to noise and can handle noisy or
incomplete data.
2. *Invariance to Transformations*: Contour-based shape representation can be invariant to transformations
such as translation, rotation, and scaling.
3. *Efficient Representation*: Contour-based shape representation can provide an efficient representation of the
shape of an object.
Applications of Contour-Based Shape Representation:
1. *Object Recognition*: Contour-based shape representation is used in object recognition applications such as
image classification and object detection.
2. *Image Segmentation*: Contour-based shape representation is used in image segmentation applications such
as boundary detection and region segmentation.
3. *Computer-Aided Design (CAD)*: Contour-based shape representation is used in CAD applications such as
shape modeling and shape analysis.
Region-based shape representation is a technique used to describe the shape of an object by representing its
interior region.
Types of Region-Based Shape Representations
1. _Medial Axis Transform (MAT)_: The MAT represents the shape of an object by its medial axis, which is the
set of points that are equidistant from the object's boundary.
2. _Skeletonization_: Skeletonization represents the shape of an object by its skeleton, which is a simplified
representation of the object's structure.
3. _Region Growing_: Region growing represents the shape of an object by iteratively growing a region from a
seed point until it reaches the object's boundary.
4. _Watershed Transform_: The watershed transform represents the shape of an object by dividing the image
into regions based on the gradient of the intensity values.
5. _Shape Masks_: Shape masks represent the shape of an object by a binary mask that indicates the presence or
absence of the object at each pixel location.
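A sketch of skeletonization and the medial axis transform with scikit-image; mask.png is a hypothetical binary mask:

```python
# Skeletonization and medial-axis sketches with scikit-image.
from skimage import io
from skimage.morphology import skeletonize, medial_axis

binary = io.imread("mask.png", as_gray=True) > 0.5   # assumed binary mask
skeleton = skeletonize(binary)
medial, distance = medial_axis(binary, return_distance=True)
```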
Region-Based Techniques:
1. _Region Filling_: Region filling involves filling a region with a specific value or color to represent the shape
of an object.
2. _Boundary Detection_: Boundary detection involves detecting the boundary of an object to represent its
shape.
3. _Region Merging_: Region merging involves merging multiple regions to represent the shape of an object.
4. _Region Splitting_: Region splitting involves splitting a region into multiple sub-regions to represent the
shape of an object.
Advantages of Region-Based Shape Representation:
1. _Robustness to Noise_: Region-based shape representation is robust to noise and can handle noisy or
incomplete data.
2. _Invariance to Transformations_: Region-based shape representation can be invariant to transformations such
as translation, rotation, and scaling.
3. _Efficient Representation_: Region-based shape representation can provide an efficient representation of the
shape of an object.
Applications of Region-Based Shape Representation:
1. _Object Recognition_: Region-based shape representation is used in object recognition applications such as
image classification and object detection.
2. _Image Segmentation_: Region-based shape representation is used in image segmentation applications such
as region growing and watershed transform.
3. _Computer-Aided Design (CAD)_: Region-based shape representation is used in CAD applications such as
shape modeling and shape analysis.
------------------------------------------------------------------------------------------------------------------------------------
Q.8) What is region identification, and how does region identification handle noisy or incomplete data?
Ans:-Region identification is the process of identifying and labeling regions or areas of interest within an image
or a dataset. It involves segmenting the image or data into distinct regions based on certain characteristics, such
as intensity, texture, or color.
Region identification is a crucial step in various applications, including image analysis, computer vision, and
data mining.
It helps to:
1. *Simplify complex data*: By dividing the data into regions, it becomes easier to analyze and understand the
underlying patterns and structures.
2. *Extract meaningful information*: Region identification enables the extraction of meaningful information,
such as object boundaries, textures, or patterns.
3. *Improve data visualization*: By labeling and coloring regions, it becomes easier to visualize and
communicate complex data insights.
To handle noisy or incomplete data, region identification techniques employ various strategies, including:
1. *Noise reduction filters*: Applying filters, such as Gaussian blur or median filter, to reduce noise and smooth
out the data.
2. *Thresholding*: Applying thresholding techniques to separate regions based on intensity or other
characteristics.
3. *Edge detection*: Using edge detection algorithms to identify region boundaries and separate regions.
4. *Region growing*: Starting from a seed point, region growing algorithms expand the region based on
similarity criteria, such as intensity or texture.
5. *Watershed transform*: The watershed transform is a technique that separates regions based on the gradient
of the intensity values.
6. *Machine learning-based approaches*: Using machine learning algorithms, such as convolutional neural
networks (CNNs), to learn features and identify regions from noisy or incomplete data.
7. *Data augmentation*: Augmenting the data with additional information, such as texture or color features, to
improve region identification.
8. *Post-processing techniques*: Applying post-processing techniques, such as morphological operations or
active contours, to refine region boundaries and improve accuracy.
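A sketch chaining several of the strategies above on a noisy grayscale image: median filtering (strategy 1), Otsu thresholding (2), morphological opening (8), and connected-component labelling; all parameter values are illustrative:

```python
# Noise-robust region identification pipeline with OpenCV.
import cv2
import numpy as np

gray = cv2.imread("noisy.jpg", cv2.IMREAD_GRAYSCALE)
smooth = cv2.medianBlur(gray, 5)                          # noise reduction
_, binary = cv2.threshold(smooth, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
kernel = np.ones((3, 3), np.uint8)
clean = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)  # remove speckles
n_regions, labels = cv2.connectedComponents(clean)        # label regions
```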
Q.9) What is the difference between hierarchical and non-hierarchical approaches in object recognition?
Ans:- In object recognition, hierarchical and non-hierarchical approaches are two different methods used to represent and recognize objects. A hierarchical approach organises recognition in levels: low-level features (edges, corners) are grouped into intermediate parts, which are in turn combined into complete objects, so recognition proceeds from coarse to fine through the hierarchy. A non-hierarchical (flat) approach represents each object directly, for example as a single template or feature vector, and matches it against the image in one step without intermediate levels. Hierarchical methods share parts across objects and cope better with occlusion and variation, while non-hierarchical methods are simpler and faster for small, well-controlled object sets.
------------------------------------------------------------------------------------------------------------------------------------
Q.10) What is knowledge representation in object recognition? Explain with an example.
Ans:- Knowledge representation is the way an object recognition system encodes what it knows about objects, their attributes, and the relationships between them. Common knowledge representation schemes include:
1. *Semantic Networks*: A graph-based representation of knowledge, where nodes represent concepts and edges represent relationships between them.
2. *Frames*: A structured representation of knowledge, where each frame represents an object or concept and
contains slots for its attributes and values.
3. *Ontologies*: A formal representation of knowledge, where concepts and relationships are defined using a
set of axioms and rules.
4. *Decision Trees*: A tree-based representation of knowledge, where each node represents a decision or a
feature, and the edges represent the flow of decisions.
Suppose we want to recognize objects in an image, such as cars, trees, and buildings. We can represent our
knowledge about these objects using a semantic network, as shown below:
+--------+----------+----------+
|  Car   |   Tree   | Building |
+--------+----------+----------+
| Wheels | Leaves   | Walls    |
| Doors  | Branches | Windows  |
| Engine | Trunk    | Roof     |
+--------+----------+----------+
In this example, the semantic network represents our knowledge about the objects, their features, and their
relationships. The network can be used to recognize objects in an image by matching the features of the objects
in the image with the features represented in the network.
For example, if we want to recognize a car in an image, we can use the network to match the features of the car
in the image, such as the wheels, doors, and engine, with the features represented in the network. If the features
match, we can conclude that the object in the image is a car.
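A toy sketch of this matching idea, with the semantic network stored as a plain dictionary; all names and the overlap-count score are illustrative simplifications:

```python
# Toy feature matching against a semantic network stored as a dict.
network = {
    "car":      {"wheels", "doors", "engine"},
    "tree":     {"leaves", "branches", "trunk"},
    "building": {"walls", "windows", "roof"},
}

def recognize(detected_features):
    """Return the object whose feature set best overlaps the detections."""
    scores = {obj: len(feats & detected_features)
              for obj, feats in network.items()}
    return max(scores, key=scores.get)

print(recognize({"wheels", "doors"}))  # -> car
```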
Benefits of Knowledge Representation in Object Recognition:
1. *Improved Accuracy*: Knowledge representation can improve the accuracy of object recognition by
providing a more structured and formal representation of knowledge.
2. *Increased Efficiency*: Knowledge representation can increase the efficiency of object recognition by
reducing the amount of data that needs to be processed.
3. *Better Handling of Uncertainty*: Knowledge representation can handle uncertainty and ambiguity in object
recognition by providing a more nuanced and contextual representation of knowledge.
Q.11) What is syntactic clustering? How does syntactic clustering handle ambiguity and uncertainty?
Ans:- Syntactic clustering is a technique used in pattern recognition and machine learning to group similar
patterns or objects into clusters based on their structural or syntactic properties. This approach is particularly
useful when dealing with complex patterns or objects that cannot be easily represented using numerical features.
The main steps in syntactic clustering are:
1. _Pattern representation_: Each pattern or object is represented using a syntactic structure, such as a string,
tree, or graph.
2. _Similarity measurement_: A similarity measure is defined to compare the syntactic structures of different
patterns or objects.
3. _Clustering_: The patterns or objects are grouped into clusters based on their similarity.
Approaches for Handling Ambiguity and Uncertainty:
1. _Fuzzy syntactic clustering_: This approach uses fuzzy logic to represent the uncertainty or ambiguity in the
syntactic structures of the patterns or objects.
2. _Probabilistic syntactic clustering_: This approach uses probabilistic models to represent the uncertainty or
ambiguity in the syntactic structures of the patterns or objects.
3. _Hierarchical syntactic clustering_: This approach uses a hierarchical representation of the syntactic
structures to handle ambiguity and uncertainty.
4. _Ensemble syntactic clustering_: This approach combines multiple clustering models to improve the
robustness and accuracy of the clustering results.
Similarity Measurement Techniques:
1. _String matching_: This technique is used to compare the syntactic structures of two patterns or objects
represented as strings.
2. _Tree matching_: This technique is used to compare the syntactic structures of two patterns or objects
represented as trees.
3. _Graph matching_: This technique is used to compare the syntactic structures of two patterns or objects
represented as graphs.
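A minimal sketch of the string-matching case using Levenshtein (edit) distance, a standard syntactic similarity measure; the example strings are arbitrary:

```python
# Dynamic-programming Levenshtein (edit) distance between two strings.
def edit_distance(a, b):
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]

print(edit_distance("abab", "abba"))  # -> 2
```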
Applications of Syntactic Clustering:
1. _Image recognition_: Syntactic clustering can be used to recognize objects in images based on their shape,
texture, and other structural properties.
2. _Natural language processing_: Syntactic clustering can be used to analyze the syntactic structure of
sentences and texts.
3. _Bioinformatics_: Syntactic clustering can be used to analyze the structural properties of biological
molecules, such as proteins and DNA sequences.
-------------------------------------------------------------------------------------------------------------
Q.12) What is statistical object recognition? Explain with an example.
Ans:- Statistical object recognition is a technique used in computer vision and machine learning to recognize objects in images or videos based on statistical models of their appearance and shape.
The goal of statistical object recognition is to learn a probabilistic model of the object's appearance and shape
from a set of training images, and then use this model to recognize the object in new, unseen images.
Suppose we want to recognize images of cars. We collect a dataset of images of cars, each labeled with the
location and scale of the car in the image. We then use this dataset to learn a statistical model of the car's
appearance and shape.
One common approach to statistical object recognition is to use a deformable template model, which represents
the object's shape as a set of connected points or lines. The model is learned by finding the optimal alignment of
the template to each training image, and then computing the statistics of the aligned templates.
For example, we might learn a deformable template model of a car by aligning a template of a car to each
training image, and then computing the mean and covariance of the aligned templates. The resulting model can
then be used to recognize cars in new images by finding the best alignment of the template to the image.
Common Statistical Models:
1. _Deformable template models_: These models represent the object's shape as a set of connected points or
lines, and learn the statistics of the aligned templates.
2. _Gaussian mixture models_: These models represent the object's appearance as a mixture of Gaussian
distributions, and learn the parameters of the mixture model from the training data.
3. _Hidden Markov models_: These models represent the object's appearance as a sequence of hidden states, and
learn the parameters of the model from the training data.
4. _Support vector machines_: These models represent the object's appearance as a set of support vectors, and
learn the parameters of the model from the training data.
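A sketch of the Gaussian-mixture approach with scikit-learn; the random features stand in for real appearance descriptors, and the choice of 3 components is arbitrary:

```python
# Modelling object appearance with a Gaussian mixture (scikit-learn).
import numpy as np
from sklearn.mixture import GaussianMixture

features = np.random.rand(500, 8)   # stand-in for real training descriptors
gmm = GaussianMixture(n_components=3, covariance_type="full").fit(features)

# Higher log-likelihood suggests a patch resembles the learned object
scores = gmm.score_samples(np.random.rand(10, 8))
```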
Advantages of Statistical Object Recognition:
1. _Robustness to variation_: Statistical object recognition can handle variations in the object's appearance and
shape.
2. _Ability to handle occlusion_: Statistical object recognition can handle occlusion by modeling the object's
appearance as a mixture of visible and occluded regions.
3. _Ability to handle clutter_: Statistical object recognition can handle clutter by modeling the object's
appearance as a mixture of object and background regions.
------------------------------------------------------------------------------------------------------------------------------------
Q.13) Explain the Bayes classifier and the KNN classifier.
Ans:- 1) Bayes classifier:- The Bayes classifier is a probabilistic classifier based on Bayes' theorem. It is a simple and effective classifier that can be used for binary and multi-class classification problems.
*Bayes' Theorem*
Bayes' theorem describes the probability of an event occurring given some prior knowledge of conditions that might be related to the event. Mathematically, it can be expressed as:
P(A|B) = [P(B|A) x P(A)] / P(B)
where:
- P(A|B) is the posterior probability of A given B
- P(B|A) is the likelihood of B given A
- P(A) is the prior probability of A
- P(B) is the probability of the evidence B
*Bayes Classifier*
The Bayes classifier uses Bayes' theorem to calculate the posterior probability of a class given a set of features. The classifier chooses the class with the highest posterior probability:
y = argmax over classes c of P(c|x) = argmax over c of P(x|c) x P(c)
where:
- x is the feature vector of the instance to classify
- P(c) is the prior probability of class c
- P(x|c) is the likelihood of the features given class c
*Advantages:*
1. *Simple to implement*: The Bayes classifier is a simple and intuitive classifier to implement.
2. *Robust to noise*: The Bayes classifier is robust to noise and can handle noisy data.
3. *Handles missing values*: The Bayes classifier can handle missing values by using the prior probabilities.
*Disadvantages:*
1. *Assumes independence*: The Bayes classifier assumes that the features are independent, which may not
always be the case.
2. *Requires prior probabilities*: The Bayes classifier requires prior probabilities, which may not always be
available.
3. *Can be computationally expensive*: The Bayes classifier can be computationally expensive, especially for
large datasets.
_Example:_
Suppose we want to classify a person as either "healthy" or "sick" based on their symptoms. We have a dataset
of people with their symptoms and corresponding health status. We can use the Bayes classifier to classify a
new person as either "healthy" or "sick" based on their symptoms.
Let's say we have two symptoms: "fever" and "headache". We can calculate the prior probabilities of each
symptom given the health status, and then use the Bayes classifier to calculate the posterior probability of each
health status given the symptoms.
For example, let's say we have a person with a fever and a headache. We can calculate the posterior probability
of each health status given the symptoms as follows:
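A worked computation with assumed numbers (the notes do not give actual probabilities), under the naive independence assumption that the symptoms are conditionally independent given the health status:

```python
# Naive Bayes posterior with illustrative, assumed probabilities.
p_sick, p_healthy = 0.2, 0.8                    # assumed priors
p_fever_sick, p_fever_healthy = 0.7, 0.1        # assumed likelihoods
p_headache_sick, p_headache_healthy = 0.6, 0.2

# Unnormalised posteriors: P(class) * P(fever|class) * P(headache|class)
sick = p_sick * p_fever_sick * p_headache_sick              # 0.084
healthy = p_healthy * p_fever_healthy * p_headache_healthy  # 0.016

p_sick_given_symptoms = sick / (sick + healthy)             # = 0.84
print(p_sick_given_symptoms)  # the classifier outputs "sick"
```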
2) KNN classifier :- K-Nearest Neighbors (KNN) is a supervised learning algorithm used for classification and
regression tasks. It is a simple, yet effective algorithm that works well for many types of data.
_Steps:_
1. *Data Preprocessing*: The dataset is preprocessed to ensure that all features are on the same scale. This is
typically done using normalization or standardization techniques.
2. *Choose K*: The value of K is chosen, which represents the number of nearest neighbors to consider when
making a prediction.
3. *Calculate Distances*: When a new instance is presented to the model, the distances between the new
instance and all instances in the training set are calculated using a distance metric such as Euclidean distance or
Manhattan distance.
4. *Find K-Nearest Neighbors*: The K instances with the smallest distances to the new instance are selected as
the K-nearest neighbors.
5. *Make Prediction*: The prediction is made by taking a majority vote of the K-nearest neighbors. In other
words, the class label of the new instance is assigned based on the class label of the majority of the K-nearest
neighbors.
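A minimal KNN sketch with scikit-learn on the built-in Iris dataset; K=5 and the use of standard scaling (step 1 above) are illustrative choices:

```python
# KNN classification with scikit-learn on a toy dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale features so distances are comparable (step 1 above)
scaler = StandardScaler().fit(X_train)
knn = KNeighborsClassifier(n_neighbors=5)   # K = 5
knn.fit(scaler.transform(X_train), y_train)
print(knn.score(scaler.transform(X_test), y_test))
```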
*Advantages:*
1. *Simple to Implement*: KNN is a simple algorithm to implement, and it can be used for both classification
and regression tasks.
2. *Robust to Noise*: KNN is robust to noisy data, as the prediction is based on the majority vote of the K-
nearest neighbors.
3. *Handling Non-Linear Relationships*: KNN can handle non-linear relationships between features, as it does
not rely on linear combinations of features.
*Disadvantages:*
1. *Computational Complexity*: KNN can be computationally expensive, especially for large datasets, as it
requires calculating distances between all instances.
2. *Sensitive to Choice of K*: The choice of K can significantly affect the performance of the KNN classifier. If
K is too small, the model may be prone to overfitting, while if K is too large, the model may be prone to
underfitting.
3. *Not Suitable for High-Dimensional Data*: KNN can be less effective for high-dimensional data, as the
distance metric used may not be effective in capturing the relationships between features.