Thanks to visit codestin.com
Credit goes to www.scribd.com

0% found this document useful (0 votes)
22 views14 pages

Object Detection Using CNN-RCNN.-1

R-CNN (Regions with Convolutional Neural Networks) is a deep learning framework for object detection that classifies and localizes objects in images using a three-step process: region proposal, feature extraction, and classification with bounding box regression. While R-CNN offers accurate detection and robustness to object variations, it suffers from high computational complexity and slow inference times. Additionally, R-CNN is not an end-to-end model, which can limit its performance compared to more modern architectures.

Uploaded by

samgarg703
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
22 views14 pages

Object Detection Using CNN-RCNN.-1

R-CNN (Regions with Convolutional Neural Networks) is a deep learning framework for object detection that classifies and localizes objects in images using a three-step process: region proposal, feature extraction, and classification with bounding box regression. While R-CNN offers accurate detection and robustness to object variations, it suffers from high computational complexity and slow inference times. Additionally, R-CNN is not an end-to-end model, which can limit its performance compared to more modern architectures.

Uploaded by

samgarg703
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 14

Object Detection using CNN-RCNN.

Object Detection using R-CNN

R-CNN (Regions with Convolutional Neural Networks) is a deep learning framework for object detection,
which involves not only classifying objects in an image but also localizing them via bounding boxes. It
was proposed by Ross Girshick et al. in 2014 and was one of the first successful deep learning
approaches to object detection.
How R-CNN Works

R-CNN works in the following three main steps:


 Region Proposal:
Use an algorithm like Selective Search to generate around 2,000 region proposals (possible object
locations).
 Feature Extraction:
For each region, warp it to a fixed size and pass it through a pre-trained CNN (like AlexNet) to
extract features.
 Classification + Bounding Box Regression:
Use a set of SVM classifiers to label each region (car, dog, background, etc.).
Use a regression model to fine-tune the coordinates of the bounding box.
How R-CNN Works
How R-CNN Works
How R-CNN Works

Region Proposal

R-CNN starts by dividing the input image into multiple regions or subregions. These regions are
referred to as "region proposals" or "region candidates." The region proposal step is responsible for
generating a set of potential regions in the image that are likely to contain objects. R-CNN does not
generate these proposals itself; instead, it relies on external methods like Selective Search or
EdgeBoxes to generate region proposals.
How R-CNN Works

Feature Extraction

In CNNs, feature extraction is the process of identifying and isolating essential patterns and
information from visual data, like images, to enable the network to understand the input. This is
achieved through a series of convolutional and pooling layers that learn hierarchical representations
of features, starting with basic edges and shapes and progressing to more complex and specific
features.
How R-CNN Works

Feature Extraction-How it Works?


How it works:
1. Convolutional Layers:
These layers apply filters (also called kernels) to the input image, performing convolution operations to detect
patterns like edges, corners, and textures. The number of output images from a convolutional layer equals the
number of filters used.
2. Pooling Layers:
Pooling layers reduce the spatial dimensions of the feature maps (output from convolution layers) by selecting the
maximum or average value within a region. This helps to make the model more robust to variations in the input and
reduces computational cost.
3. Activation Functions:
Activation functions introduce non-linearity to the network, allowing it to learn more complex patterns.
4. Hierarchical Feature Extraction:
As the input image passes through multiple convolutional and pooling layers, the network learns increasingly
abstract and complex features.
5. Classifier Network:
The extracted features are then fed into a classifier network, typically consisting of fully connected layers, which
makes the final prediction or classification.
How R-CNN Works

Object Classification

The extracted feature vectors from the region proposals are fed into a separate machine learning
classifier for each object class of interest. R-CNN typically uses Support Vector Machines (SVMs)
for classification. For each class, a unique SVM is trained to determine whether or not the region
proposal contains an instance of that class.
During training, positive samples are regions that contain an instance of the class. Negative samples
are regions that do not.
How R-CNN Works

Bounding Box Regression

In addition to classifying objects, R-CNN also performs bounding box regression. For each class, a
separate regression model is trained to refine the location and size of the bounding box around the
detected object. The bounding box regression helps improve the accuracy of object localization by
adjusting the initially proposed bounding box to better fit the object's actual boundaries.
How R-CNN Works

Non-Maximum Suppression (NMS)

After classifying and regressing bounding boxes for each region proposal, R-CNN applies non-
maximum suppression to eliminate duplicate or highly overlapping bounding boxes. NMS ensures
that only the most confident and non-overlapping bounding boxes are retained as final object
detections.
Advantages of CNN

Accurate Object Detection: R-CNN provides accurate object detection by leveraging region-based
convolutional features. It excels in scenarios where precise object localization and recognition are
crucial.
Robustness to Object Variations: R-CNN models can handle objects with different sizes,
orientations, and scales, making them suitable for real-world scenarios with diverse objects and
complex backgrounds.
Flexibility: R-CNN is a versatile framework that can be adapted to various object detection tasks,
including instance segmentation and object tracking. By modifying the final layers of the network,
you can tailor R-CNN to suit your specific needs.
Disadvantages of CNN

Computational Complexity: R-CNN is computationally intensive. It involves extracting region


proposals, applying a CNN to each proposal, and then running the extracted features through a
classifier. This multi-stage process can be slow and resource-demanding.
Slow Inference: Due to its sequential processing of region proposals, R-CNN is relatively slow
during inference. Real-time applications may find this latency unacceptable.
Overlapping Region Proposals: R-CNN may generate multiple region proposals that overlap
significantly, leading to redundant computation and potentially affecting detection performance.
R-CNN is Not End-to-End: Unlike more modern object detection architectures like Faster R-CNN,
R-CNN is not an end-to-end model. It involves separate modules for region proposal and
classification, which can lead to suboptimal performance compared to models that optimize both
tasks jointly.

You might also like