Computer Vision Unit 3

The document covers the fundamentals of feature detection and matching in computer vision, including techniques like edge detection, corner detection, and blob detection. It introduces various feature descriptors such as SIFT, SURF, and ORB, and discusses the importance of feature matching for applications like object recognition and image registration. Additionally, it highlights the use of RANSAC for robust matching and outlines real-world applications of object detection in industries such as autonomous vehicles and security.

Lecture 1

Introduction To Feature Detection And Matching
Feature detection and matching are important tasks in many computer vision applications, such as structure-from-motion, image retrieval, object detection, and more.

Edge Detection in Image Processing: An Introduction
Edge detection is a fundamental image processing technique for identifying and locating the boundaries or edges of objects in an image. It detects discontinuities in image intensity and extracts the outlines of objects present in an image. The edges of any object in an image (e.g. a flower) are typically defined as the regions where there is a sudden change in intensity.

The goal of edge detection algorithms is to identify the most significant edges within an image or
scene. These detected edges should then be connected to form meaningful lines and boundaries,
resulting in a segmented image that contains two or more distinct regions. The segmented results
are subsequently used in various stages of a machine vision system for tasks such as object counting,
measuring, feature extraction, and classification.
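As a concrete illustration, here is a minimal sketch of edge detection with OpenCV's Canny detector; the file name and threshold values are placeholder assumptions, not from the original text.

```python
import cv2

# Load the image in grayscale (file name is a placeholder).
img = cv2.imread("flower.jpg", cv2.IMREAD_GRAYSCALE)

# Canny edge detection: the two thresholds control hysteresis linking
# of weak and strong edges; 100/200 are common starting values.
edges = cv2.Canny(img, 100, 200)

cv2.imwrite("flower_edges.jpg", edges)
```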

Interest Points and Corners:


Corner detection is valuable for locating complex objects and for tracking them in 2-D or 3-D.
Corner detection is an approach used within computer vision systems to extract certain kinds of
features and infer the contents of an image. Corner detection is frequently used in motion detection,
image registration, video tracking, image mosaicing, panorama stitching, 3D reconstruction and
object recognition. Corner detection overlaps with the topic of interest point detection.
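A minimal sketch of corner detection using OpenCV's Harris detector; the file name and parameter values are illustrative assumptions.

```python
import cv2
import numpy as np

# Harris corner detection sketch (file name is a placeholder).
img = cv2.imread("building.jpg")
gray = np.float32(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))

# blockSize=2 (neighborhood), ksize=3 (Sobel aperture), k=0.04 (Harris parameter).
response = cv2.cornerHarris(gray, 2, 3, 0.04)

# Mark pixels whose corner response exceeds 1% of the maximum in red.
img[response > 0.01 * response.max()] = [0, 0, 255]
cv2.imwrite("corners.jpg", img)
```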

Blob Detection:

Blob detection is a technique used in computer vision to identify and locate regions of
interest in an image. These regions, known as blobs, can represent objects, shapes, or
patterns. Blob detection algorithms analyze the intensity or color information of an image to
identify regions that differ significantly from their surrounding areas.

One common approach in blob detection is to use a thresholding technique, where a threshold
value is set to separate the blobs from the background. Pixels with intensities above the
threshold are considered part of a blob, while pixels below the threshold are considered
background.

Blobs often appear as small dots in an image and can be viewed as small objects. Blobs can be detected without using the deep learning-based models that perform many other tasks in the computer vision field. Such models mainly rely on convolution layers, and since a convolution layer compresses the image, the features of small pixel regions can be lost; this is one reason classical blob detectors remain useful.
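A minimal sketch of classical blob detection using OpenCV's SimpleBlobDetector, which thresholds the image internally and groups connected regions; the file name and parameter values are illustrative assumptions.

```python
import cv2

# Blob detection sketch (file name and parameters are placeholders).
img = cv2.imread("dots.png", cv2.IMREAD_GRAYSCALE)

params = cv2.SimpleBlobDetector_Params()
params.filterByArea = True
params.minArea = 50          # ignore blobs smaller than 50 pixels

detector = cv2.SimpleBlobDetector_create(params)
keypoints = detector.detect(img)  # thresholds internally and groups regions

# Draw each detected blob as a circle sized to match the blob.
out = cv2.drawKeypoints(img, keypoints, None, (0, 0, 255),
                        cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imwrite("blobs.png", out)
```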

Local Image Features:

A local feature in computer vision refers to a distinctive point, object, or region extracted from an image, described by a data vector that is invariant to changes in camera position and orientation. These features are crucial for tasks like image matching, object detection, and wide baseline stereo matching.
A local feature is an image pattern which differs from its immediate neighborhood. It is usually
associated with a change of an image property or several properties simultaneously, though it is not
necessarily localized exactly on this change. The image properties commonly considered are
intensity, color, and texture.

Local features can be points, but also edgels or small image patches. Typically, some measurements
are taken from a region centered on a local feature and converted into descriptors. The descriptors
can then be used for various applications.
Lecture 2

Feature Descriptors (SIFT, SURF, ORB, and BRIEF):

Feature extraction is an important part of many image processing methods, with use cases ranging from image panorama stitching to robotics. The ideal feature extraction method would be robust to changes in illumination, rotation, scale, noise, and other transformations while being fast enough to be of use in real-time scenarios.

The three methods we will explore today are Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Feature (SURF), and Oriented FAST and Rotated BRIEF (ORB).

Scale-Invariant Feature Transform (SIFT)

SIFT was created in 2004 by D. Lowe at the University of British Columbia to solve the problem of scale variance in feature extraction. SIFT can be broken down into two parts: key-point detection and key-point descriptor extraction.

The key-point detection works by approximating the Laplacian of Gaussian (LoG), which solves the scale variance problem but is expensive to compute, with the Difference of Gaussians (DoG). The DoG images are stacked and searched for local extrema in a 3x3x3 neighborhood, which are identified as key-points.

An orientation is also assigned to each key-point. This is done by extracting the neighborhood around the key-point and creating an orientation histogram; the peak of the histogram is used as the orientation, and any other peak above 80% of the maximum is also considered.

To generate the descriptor, a 16x16 neighborhood around the key-point is taken and divided into 4x4 cells. An orientation histogram is calculated for each cell, and the combined histograms are concatenated into a 128-dimensional feature descriptor.
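A minimal sketch of SIFT key-point detection and descriptor extraction with OpenCV (requires opencv-python 4.4 or later, where SIFT is included in the main module; the file name is a placeholder).

```python
import cv2

# SIFT key-point detection and descriptor extraction sketch.
img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# Each descriptor is a 128-dimensional vector, one row per key-point.
print(len(keypoints), descriptors.shape)  # e.g. N and (N, 128)
```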

Speeded-Up Robust Feature (SURF)

SURF was created as an improvement on SIFT in 2006, aimed at increasing the speed of the
algorithm.

Rather than using the Difference of Gaussians to approximate the LoG, SURF utilises box filters. The benefit of this is that box filters can be computed very cheaply, and calculations for different scales can be done simultaneously.

To extract the descriptor of a key-point, a 20s x 20s neighborhood (where s is the scale at which the key-point was detected) is extracted and divided into 4x4 cells. Haar wavelet responses in x and y are computed in each cell, and the responses from all cells are concatenated to form a 64-dimensional feature descriptor.
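A minimal SURF sketch for comparison. Note that SURF is patented and only available in opencv-contrib-python builds compiled with the nonfree modules enabled; the file name and Hessian threshold are illustrative assumptions.

```python
import cv2

# SURF sketch; requires an OpenCV contrib build with nonfree enabled.
img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)

surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
keypoints, descriptors = surf.detectAndCompute(img, None)

print(len(keypoints), descriptors.shape)  # descriptors are 64-dimensional rows
```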

Oriented FAST and Rotated BRIEF (ORB)

ORB, as the name suggests, is a combination of two algorithms, FAST and BRIEF, and was created in 2011 as an alternative to both SIFT and SURF.
FAST, or Features from Accelerated Segment Test, is used as the key-point detector. It works by selecting pixels on a circle around a key-point candidate and checking whether there are n contiguous pixels that are all brighter or darker than the candidate pixel. This is sped up by comparing only a subset of these pixels before testing the whole range. One thing to note is that FAST does not compute orientation; to solve this, the authors of ORB compute the intensity-weighted centroid of the key-point patch and use the direction from the key-point to this centroid as the orientation.

BRIEF, or Binary Robust Independent Elementary Features, is used as the key-point descriptor. As BRIEF performs poorly under rotation, the computed orientation of each key-point is used to steer the orientation of the key-point patch before extracting the descriptor. Once this is done, a series of binary tests is computed comparing a pattern of pixels in the patch. The outputs of the binary tests are concatenated and used as the feature descriptor.
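A minimal ORB sketch with OpenCV; the file name and feature count are illustrative assumptions.

```python
import cv2

# ORB key-point detection and binary descriptor extraction sketch.
img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=500)
keypoints, descriptors = orb.detectAndCompute(img, None)

# ORB descriptors are 256-bit binary strings stored as 32 bytes per row,
# so they are matched with Hamming distance rather than Euclidean distance.
print(len(keypoints), descriptors.shape)  # e.g. N and (N, 32)
```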
Lecture 3

Feature Matching:

A feature is a piece of information that can be used to solve a computational task related to a certain application. Points, edges, or objects specific to a structure in the image may be considered features. Features can be subdivided into 2 main categories:

• Keypoint features: features located at specific locations in the image, such as a mountain peak or a building corner. They are described by the appearance of the patch of pixels surrounding the location.
• Edges: edges define boundaries and can be excellent indicators because they signify a sharp discontinuity in pixel values. Edges can be matched based on their local appearance and orientation.

Components of Feature Matching


Interest Point Identification

First, we need to identify interest points. An interest point is expressive in texture: it is a point at which the direction of the boundary of an object changes significantly, or the intersection of two edge segments.

Description

The visual descriptions are defined around each feature point and these are invariant to other
factors like illumination and in-plane rotation. This is done with the help of descriptor vectors
that capture all the necessary information. They describe elementary characteristics like
colour, shape, and texture around the feature point.

Feature descriptors store the information by encoding it into a unique numerical series, which enables the computer to differentiate among the features. Ideally, descriptors capture information in such a way that it is invariant to transformations. Descriptors can be subdivided into 2 categories: local and global descriptors.

• Local descriptor: the aim is to capture the information in the most immediate surroundings of the feature point. They try to represent the local neighbourhood of the feature point.
• Global descriptor: they capture the image as a whole. This makes them highly unreliable in real-world scenarios, since the slightest transformation in an image sample may cause the model to fail.
Feature Matching

Feature matching, or image matching, is widely used for various real-world applications such as object recognition and image registration. It establishes correspondences between two similar images based on the detected interest points and their descriptors. We match the local descriptors to draw the correspondence between the images in comparison.

Feature matching is the process of comparing two images based on their respective keypoints (features and descriptors). Nearest-neighbour search over descriptor distances is a widely used technique for this process.
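A minimal sketch of the matching process using ORB features and OpenCV's brute-force matcher; file names are placeholder assumptions. Here crossCheck=True keeps only mutual best matches, a simple alternative to the ratio test shown in Lecture 4.

```python
import cv2

# Descriptor matching sketch between two images.
img1 = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("train.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Hamming distance suits ORB's binary descriptors.
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)

# Visualize the 30 closest matches.
out = cv2.drawMatches(img1, kp1, img2, kp2, matches[:30], None)
cv2.imwrite("matches.jpg", out)
```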
Lecture 4

Matching Techniques:

Image matching in computer vision refers to the process of finding correspondences between
different images or parts of images. This can involve identifying objects, features, or patterns
in one image that are similar to those in another image.

The goal is to establish relationships between different images or parts of images, which can
be used for tasks such as object recognition, image registration, and augmented reality.

Image Matching Techniques

There are several approaches to image matching. We will discuss two of the most common
approaches here.

1. Feature-based matching
2. Template Matching

Feature-based matching

This method involves identifying distinctive features (such as corners, edges, or blobs) in the
images and matching them based on their descriptors. Some common and popular algorithms
used for feature-based matching include SIFT (Scale-Invariant Feature Transform), SURF
(Speeded-Up Robust Features), ORB (Oriented FAST and Rotated BRIEF), AKAZE
(Accelerated-KAZE), BRISK (Binary Robust Invariant Scalable Keypoints), and FREAK
(Fast Retina Keypoint).

Feature-based matching involves the following two important steps.

1. Detect keypoints and descriptors: Detect distinctive points or regions in both images that are likely to be matched, and extract numerical descriptors or feature vectors around each keypoint to describe its local neighborhood. These descriptors should be distinctive and invariant to changes in scale, rotation, and illumination. Algorithms such as SIFT are used for this step.
2. Match keypoints: Compare the descriptors of keypoints between the two images to find correspondences. Filtering techniques may be applied to remove incorrect matches and retain only reliable correspondences. Feature matchers such as the Brute-Force matcher and the FLANN matcher are used for this step, as shown in the sketch below.
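A minimal sketch of step 2 using the FLANN matcher with SIFT descriptors and Lowe's ratio test to filter unreliable matches; file names and the 0.7 ratio are illustrative assumptions.

```python
import cv2

# FLANN-based matching sketch with a ratio test.
img1 = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("train.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# FLANN with KD-trees works on SIFT's floating-point descriptors.
flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))
knn = flann.knnMatch(des1, des2, k=2)

# Keep a match only if it is clearly better than the second-best candidate.
good = []
for pair in knn:
    if len(pair) == 2 and pair[0].distance < 0.7 * pair[1].distance:
        good.append(pair[0])
print(f"{len(good)} reliable matches")
```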

Template matching

Template matching is a technique used in image processing and computer vision to find a
template image within a larger image. It involves sliding the template image over the larger
image and comparing their pixel values or features to find the best match. Here's how it
works in detail:

• Input Images: You have a template image and a larger image within which you want to find occurrences of the template.
• Sliding Window: The template image is moved (or "slid") over the larger image in a systematic way, usually pixel by pixel or in larger strides.
• Comparison: At each position of the template, a similarity measure is computed between the template and the corresponding region in the larger image. This measure can be based on pixel-wise differences, correlation coefficients, or other metrics depending on the application.
• Best Match: The position with the highest similarity measure indicates the best match of the template within the larger image.
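A minimal template-matching sketch with OpenCV's matchTemplate; file names and the choice of normalized cross-correlation are illustrative assumptions.

```python
import cv2

# Template matching sketch using normalized cross-correlation.
img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("template.jpg", cv2.IMREAD_GRAYSCALE)
h, w = template.shape

# Slide the template over the image and score each position.
result = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)

# For TM_CCOEFF_NORMED the best match is the maximum of the result map.
_, max_val, _, max_loc = cv2.minMaxLoc(result)
top_left = max_loc
bottom_right = (top_left[0] + w, top_left[1] + h)
print("best match at", top_left, "score", max_val)
```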

RANSAC for Robust Matching:

RANSAC (Random Sample Consensus) is a robust algorithm used in machine learning and
computer vision to estimate model parameters in the presence of outliers. It is particularly
useful when there is a large amount of noisy data, and the goal is to find a model that fits
the inliers well. RANSAC is an iterative algorithm that randomly samples a subset of the data
and fits a model to that subset. The model is then used to classify the remaining data as
either inliers or outliers. The algorithm continues to iterate, selecting new random subsets of
the data, until a satisfactory model is found.

Random sample consensus (RANSAC) is an iterative method to estimate parameters of a mathematical model from a set of observed data that contains outliers, when the outliers are to be accorded no influence on the values of the estimates. Therefore, it can also be interpreted as an outlier detection method. It is a non-deterministic algorithm in the sense that it produces a reasonable result only with a certain probability, with this probability increasing as more iterations are allowed. The algorithm was first published by Fischler and Bolles at SRI International in 1981. They used RANSAC to solve the Location Determination Problem (LDP), where the goal is to determine the points in space that project onto an image into a set of landmarks with known locations.

RANSAC uses repeated random sub-sampling. A basic assumption is that the data consists of "inliers", i.e., data whose distribution can be explained by some set of model parameters, though it may be subject to noise, and "outliers", which are data that do not fit the model. The outliers can come, for example, from extreme values of the noise, or from erroneous measurements or incorrect hypotheses about the interpretation of the data. RANSAC also assumes that, given a (usually small) set of inliers, there exists a procedure which can estimate the parameters of a model that optimally explains or fits this data.
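A minimal sketch of RANSAC applied to feature matching via OpenCV's findHomography. It assumes the key-points kp1, kp2 and the filtered matches good from the FLANN example above; the 5.0-pixel threshold is an illustrative choice.

```python
import cv2
import numpy as np

# Keep only the matches consistent with a single planar transformation.
src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

# 5.0 is the RANSAC reprojection-error threshold in pixels; `mask` flags
# which matches were classified as inliers.
H, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
inliers = int(mask.sum())
print(f"{inliers} of {len(good)} matches are RANSAC inliers")
```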

Example

A simple example is fitting a line in two dimensions to a set of observations. Assuming that
this set contains both inliers, i.e., points which approximately can be fitted to a line, and
outliers, points which cannot be fitted to this line, a simple least squares method for line
fitting will generally produce a line with a bad fit to the data including inliers and outliers.
The reason is that it is optimally fitted to all points, including the outliers. RANSAC, on the
other hand, attempts to exclude the outliers and find a linear model that only uses the inliers
in its calculation. This is done by fitting linear models to several random samplings of the
data and returning the model that has the best fit to a subset of the data. Since the inliers tend
to be more linearly related than a random mixture of inliers and outliers, a random subset that
consists entirely of inliers will have the best model fit. In practice, there is no guarantee that a
subset of inliers will be randomly sampled, and the probability of the algorithm succeeding
depends on the proportion of inliers in the data as well as the choice of several algorithm
parameters.

[Figure: A data set with many outliers for which a line has to be fitted.]

[Figure: Fitted line with RANSAC; outliers have no influence on the result.]
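A minimal pure-NumPy sketch of the line-fitting example above; the iteration count, inlier threshold, and synthetic data are illustrative assumptions.

```python
import numpy as np

def ransac_line(points, n_iters=200, threshold=0.5):
    """Fit y = m*x + c robustly; points is an (N, 2) array."""
    rng = np.random.default_rng(0)
    best_model, best_inliers = None, 0
    for _ in range(n_iters):
        # 1. Randomly sample the minimal set: two points define a line.
        (x1, y1), (x2, y2) = points[rng.choice(len(points), 2, replace=False)]
        if x1 == x2:
            continue  # skip vertical sample pairs for this simple model
        m = (y2 - y1) / (x2 - x1)
        c = y1 - m * x1
        # 2. Count inliers: points within `threshold` of the candidate line.
        residuals = np.abs(points[:, 1] - (m * points[:, 0] + c))
        inliers = int((residuals < threshold).sum())
        # 3. Keep the model with the largest consensus set.
        if inliers > best_inliers:
            best_model, best_inliers = (m, c), inliers
    return best_model, best_inliers

# Synthetic data: points near the line y = 2x + 1, plus gross outliers.
x = np.linspace(0, 10, 50)
line_pts = np.column_stack([x, 2 * x + 1 + np.random.normal(0, 0.2, 50)])
outliers = np.random.uniform(0, 20, (20, 2))
model, count = ransac_line(np.vstack([line_pts, outliers]))
print("estimated (m, c):", model, "with", count, "inliers")
```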

Applications in Object Recognition:

Object detection is a computer vision technique that allows you to recognize objects in
various environments to further classify and analyze them.

Due to developments in machine learning and artificial intelligence, the technology has significantly expanded its range of potential applications.

The object detection system operates in several stages:

1. The algorithm analyzes input data obtained in various ways (images, videos, real-time
photography, and video capturing).
2. Each object is localized and assigned a class label, which is displayed as a labelled bounding box.
3. Based on the previous stage, the algorithm recognizes an object and its position,
displaying the location, size, and class label for each object in the image.
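As a concrete illustration of these stages, here is a minimal sketch using OpenCV's built-in HOG + linear SVM pedestrian detector; the file name and winStride value are illustrative assumptions.

```python
import cv2

# Object detection pipeline sketch with the default people detector.
img = cv2.imread("street.jpg")

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

# Stages 1-2: analyze the image and localize objects as bounding boxes.
boxes, weights = hog.detectMultiScale(img, winStride=(8, 8))

# Stage 3: report location and size for each detected object (class: person).
for (x, y, w, h) in boxes:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detections.jpg", img)
```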

Real-world Applications of Object Detection in Various Industries

Software systems utilizing object detection technology can be a valuable and efficient tool for solving diverse tasks. Below are examples of how companies operating in various industries currently use them in real-world scenarios.

Self-Driving Vehicles

Object detection is a fundamental technology underlying the concept of autonomous transport. Precise recognition of objects (pedestrians, obstacles, and other vehicles) is a prerequisite for self-driving vehicles to move safely on public roads.

Object detectors used in autonomous vehicles pinpoint the vehicle's location in the environment, account for the surrounding context, and track other objects to plan further routes.

Biometric and Facial Recognition

Security checks at airports or customs points, biometric access control, and tracking specific
objects by police officers and detectives — these and many more similar use cases show how
automated object detection systems can speed up result generation and reduce manual work.

Safety Control

In production environments, logistics, and construction, object and people detection technology can become a crucial factor, primarily in the context of identifying potential security threats. Systems that track and detect objects in real time monitor hazardous areas, improve compliance with safety protocols, and reduce threats to personnel.
Crowd Counting and Traffic Monitoring

Object detection can find its use in solving such tasks as:

• Automatic counting of people in public areas;
• Vehicle monitoring on specific routes and streets;
• Pedestrian traffic inside and outside stores or malls.

The information obtained can be used for further processing by a person or for making automated decisions based on algorithms embedded in the system.

Inventory and Warehousing

Object detection can serve well in logistics, warehousing, and retail to track, control, and
optimize the stock management process. Autonomous robots and cheaper manual devices are
currently available for this purpose.

Video Surveillance Solutions

Video surveillance systems use object detection technology to identify any abnormal activity,
after which they activate automatic alerts if necessary.

It is an efficient tool to provide security monitoring for financial institutions, retail establishments, and warehouses. Security services and law enforcement agencies also use similar solutions.

Enhancing Quality Control

Manufacturing companies use object detection technology to find defects in their products or raw materials, and by doing so, they achieve a highly automated quality assurance process.

Neural networks can quickly detect even minute defects, thus completely transforming a previously manual process that required huge human resources.

Medical Image Analysis

According to IBM, images constitute about 90% of all medical data. Machine learning technologies capable of analyzing images to identify abnormalities and observable objects hold tremendous potential as a tool for doctors.

When properly utilized, such software solutions significantly speed up data processing and
allow healthcare specialists to detect patterns that would be extremely difficult to observe
even for the most experienced professionals.
Visual Product Search

Visual search technology allows potential buyers to scan a product in the real world and quickly find it on sale in an online store. This gives buyers a seamless customer experience and blurs the boundaries between offline and online shopping.

The app’s operational mechanism is simple: the neural network processes images and
compares their features with the products (or images) available in the database. After that, it
displays relevant results based on the similarity assessment.

SKU Management in Retail

By combining AI, machine learning, and object detection technologies, retailers obtain
software solutions that monitor product availability on the shop shelves, identify
discrepancies between planograms and the actual shelves’ appearance, or organize
contactless service at checkouts.
Retail outlets can also apply the technology to count people entering their premises, track
where they head, and monitor what products they interact with. This approach allows retail
owners to understand customer behavior better.

Crop Monitoring and Pest Detection

Many agricultural companies are now actively implementing digital solutions armed with
object detection technology. First of all, such tools are helpful for plant health monitoring,
pest control, crop maintenance, and yield estimation.

Harness the Power of Object Detection Technology for Your Business

Object detection is one of the key technologies in computer vision. It enables machines to
identify and pinpoint the location of objects in images or video. This capability makes the
technology a powerful tool for companies looking to integrate innovative solutions to
optimize and automate their business processes.

You might also like