Object Detection

Object Detection: Only Convolution Based Models

Copyright 2019 RESTRICTED CIRCULATION


Object Localisation & Detection (single object)

Source: https://towardsdatascience.com/evolution-of-object-detection-and-localization-algorithms-e241021d8bad

Multiple Objects with Sliding Window

• Slide a window across the image and run the simple object detection CNN that we built earlier on each crop
• Stride can vary
• Window size can vary
• Computation cost is huge (slow models), since every window needs its own forward pass
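
To make the cost concrete, here is a minimal sketch that only enumerates the windows. The image size, window size and stride are illustrative assumptions, not values from the slides, and `sliding_windows` is a hypothetical helper.

```python
from typing import Iterator, Tuple


def sliding_windows(img_h: int, img_w: int, win: int, stride: int) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (top, left, bottom, right) for every square window that fits in the image."""
    for top in range(0, img_h - win + 1, stride):
        for left in range(0, img_w - win + 1, stride):
            yield top, left, top + win, left + win


# Assumed example values: a 224 x 224 image, 64 x 64 windows, stride 16.
# Each window would be cropped and passed through the classifier CNN,
# so the number of forward passes equals the number of windows.
windows = list(sliding_windows(img_h=224, img_w=224, win=64, stride=16))
num_windows = len(windows)
```

Repeating this for several window sizes multiplies the number of forward passes again, which is why the approach is slow.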

Issues

• Multiple aspect ratios
• Multiple bounding boxes for the same object
• Overlapping objects are not handled properly
• Overlapping bounding boxes go through repeated convolutions instead of sharing features

Localisation and detection as a single convolution

• Usual CNN layers
• Image is divided into a grid
• Output is a 3 x 3 x 8 tensor
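
As a rough sketch of what such a 3 x 3 x 8 tensor could hold, the example below builds a training target for one object. The per-cell layout [Pc, bx, by, bh, bw, c1, c2, c3], the choice of 3 classes, and the helper name `encode_target` are assumptions made for illustration; the slide only gives the tensor shape.

```python
import numpy as np

GRID = 3          # 3 x 3 grid, as on the slide
NUM_CLASSES = 3   # assumed, so that 1 + 4 + 3 = 8 values per cell


def encode_target(cx, cy, w, h, class_id):
    """Build a GRID x GRID x 8 target for a single object.

    cx, cy, w, h are the box centre and size as fractions of the image (0..1).
    Assumed per-cell layout: [Pc, bx, by, bh, bw, c1, c2, c3].
    """
    target = np.zeros((GRID, GRID, 5 + NUM_CLASSES), dtype=np.float32)
    col = min(int(cx * GRID), GRID - 1)     # grid cell that contains the box centre
    row = min(int(cy * GRID), GRID - 1)
    target[row, col, 0] = 1.0               # Pc: an object is present in this cell
    target[row, col, 1] = cx * GRID - col   # bx: centre offset inside the cell
    target[row, col, 2] = cy * GRID - row   # by
    target[row, col, 3] = h                 # bh: here relative to the whole image
    target[row, col, 4] = w                 # bw
    target[row, col, 5 + class_id] = 1.0    # one-hot class indicator
    return target


y = encode_target(cx=0.5, cy=0.5, w=0.4, h=0.3, class_id=1)
```

Only the cell containing the object's centre is filled; every other cell keeps Pc = 0 and is treated as background.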

Evaluation => IOU: Intersection Over Union
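
IOU is the area of overlap between two boxes divided by the area of their union. A minimal sketch, assuming boxes are given as (x1, y1, x2, y2) corner coordinates (the slides do not fix a box format):

```python
def iou(box_a, box_b):
    """IOU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    # Corners of the intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))   # intersection 1, union 4 + 4 - 1 = 7
```

The value ranges from 0 (no overlap) to 1 (identical boxes).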

Non Max Suppression
• Multiple detections of the same object must be reduced to one
• Discard all bounding boxes with Pc < 0.6
• Pick the box with the highest Pc and discard all boxes that have IOU > 0.5 with it
• Repeat with the remaining boxes until every box has been either picked or discarded
• For multiple classes, NMS needs to be done separately for each class
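
The steps above can be sketched for a single class as follows. The thresholds 0.6 and 0.5 come from the slide; the (x1, y1, x2, y2) box format, the function names and the sample boxes are assumptions made for illustration.

```python
def iou(a, b):
    """IOU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, score_thresh=0.6, iou_thresh=0.5):
    """Return the indices of the boxes kept for ONE class."""
    # Step 1: discard low-confidence boxes (Pc < score_thresh)
    candidates = [i for i, s in enumerate(scores) if s >= score_thresh]
    # Highest Pc first
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep = []
    while candidates:
        # Step 2: pick the box with the highest Pc
        best = candidates.pop(0)
        keep.append(best)
        # Step 3: discard boxes that overlap it too much (IOU > iou_thresh)
        candidates = [i for i in candidates if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep


# Made-up detections: boxes 0 and 1 cover the same object, box 3 has low Pc
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140), (11, 11, 49, 49)]
scores = [0.9, 0.8, 0.7, 0.3]
kept = nms(boxes, scores)
```

With several classes, the same function would be called once per class on that class's boxes and scores.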

Anchor Boxes

• One grid cell might need to output multiple bounding boxes for multiple classes
• We can simply output multiple instances for each grid cell
• The number of such outputs per grid cell is called the number of anchor boxes

YOLO

• A CNN with a tensor output is used to build the model (input needs to be prepared according to the grid size)
• Output is: n x n x A x (1 + 4 + C)
• n = grid size, A = number of anchor boxes, 1 = probability of object vs. background, 4 = bounding box coordinates, C = number of classes being considered
• Use NMS at prediction time to get better bounding boxes (separately for each class)
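
A small sketch of how an output of this shape can be sliced apart. The values n = 13, A = 5, C = 20 and the ordering [objectness, 4 box values, C class values] along the last axis are assumptions for illustration, and random numbers stand in for a real network output.

```python
import numpy as np

n, A, C = 13, 5, 20                          # hypothetical grid size, anchors, classes
rng = np.random.default_rng(0)
output = rng.random((n, n, A, 1 + 4 + C))    # stand-in for the network output

objectness = output[..., 0]                  # (n, n, A): object vs. background
boxes = output[..., 1:5]                     # (n, n, A, 4): bounding box coordinates
class_probs = output[..., 5:]                # (n, n, A, C): per-class values

# One common convention (assumed here): score = objectness * class probability
class_scores = objectness[..., None] * class_probs   # (n, n, A, C)
best_class = class_scores.argmax(axis=-1)             # (n, n, A)
best_score = class_scores.max(axis=-1)                # (n, n, A)
```

The resulting n * n * A candidate boxes and their scores are what NMS then filters, class by class.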

SSD: Single Shot Detection

• Issue with YOLO: it cannot detect objects at different scales very well
• SSD adds convolutions at multiple scales on top of the features created by VGG16
• Predictions are made from several of these convolution outputs
• Outputs of early layers help predict objects at a finer scale, because their receptive fields are limited to smaller areas of the image
• As we move forward, the layers' receptive fields grow larger and they favour predicting larger objects
• Unlike YOLO, SSD does not split the image into grids of arbitrary size but predicts offsets of predefined anchor boxes (called "default boxes" in the paper) for every location of the feature map
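
To see what "default boxes for every location of the feature map" adds up to, the sketch below counts them. The feature map sizes and boxes per location are not from the slides; they are taken from the 300 x 300 input configuration described in the SSD paper and are used here as an assumption.

```python
# (feature map side length, default boxes per location), assumed SSD300-style setup
feature_maps = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]

# Every location of every feature map predicts offsets for each of its default boxes
boxes_per_map = [size * size * k for size, k in feature_maps]
total_boxes = sum(boxes_per_map)
```

The large early feature maps contribute most of the boxes and cover small objects, while the last maps contribute only a few boxes that span most of the image.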

SSD Architecture

Object Detection: Region Proposal Based Models

What is Region Proposal

• Region proposal is the process of identifying parts of an image [rectangles] which have a high chance of containing an object instead of background
• Selective Search is a common approach for coming up with region proposals
• It's pretty fast with high recall [many of the region proposals might not contain any object, but almost all the objects will be contained in the proposed regions]
• It's not part of the network being built
• For a deep dive into Selective Search: https://www.learnopencv.com/selective-search-for-object-detection-cpp-python/

R-CNN

Issues with R-CNN

• Very slow training, because the convnet is run separately on every region proposal
• Prediction is also very slow, 47 seconds/image

Fast R-CNN

• Instead of running the convnet feature extraction once per region, the region proposals are projected onto a single convnet feature map
• A linear layer on top of the fully connected layers is used for bounding box regression
• The loss is actually a composite one, containing both classification and regression losses. We can use a weighted sum [weight as a hyperparameter] instead of a simple sum.
• Removing the repeated application of the convnet gives a huge reduction in prediction time as well as training time
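
The composite loss can be sketched for a single region as below. The slide only says that classification and regression losses are combined with a weight; the use of cross-entropy and smooth L1, the convention of skipping the box term for background regions, and all function names are assumptions that follow the Fast R-CNN paper.

```python
import numpy as np


def cross_entropy(class_probs, label):
    """Classification loss: negative log-probability of the true class."""
    return -np.log(class_probs[label] + 1e-12)


def smooth_l1(pred, target):
    """Regression loss: quadratic for small errors, linear for large ones."""
    diff = np.abs(pred - target)
    return np.sum(np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5))


def detection_loss(class_probs, label, box_pred, box_target, weight=1.0, background=0):
    """Weighted sum of classification and box regression losses for one region."""
    cls = cross_entropy(class_probs, label)
    if label == background:
        return cls                      # background regions have no box to regress
    return cls + weight * smooth_l1(box_pred, box_target)


# Made-up predictions for one region whose true class is 1
probs = np.array([0.1, 0.7, 0.2])
box_pred = np.array([0.5, 0.5, 2.0, 0.0])
box_target = np.zeros(4)
loss = detection_loss(probs, 1, box_pred, box_target, weight=2.0)
```

Here `weight` is the hyperparameter mentioned above; setting it to 1.0 gives the simple sum.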

R-CNN Vs Fast R-CNN

Notice that during prediction, most of the time in Fast R-CNN is taken by the external region proposal process. Faster R-CNN makes the region proposals part of the network as well, which speeds things up further.

Faster R-CNN

Pixel Level Masks: Mask R-CNN

Semantic Segmentation

• Pixel-level classification
• Doesn’t differentiate between two objects of the same class if they are adjacent [no mask boundaries]
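
A tiny illustration of both points, using made-up per-pixel class scores for a strip of six pixels and two classes (0 = background, 1 = an object class). The numbers are purely hypothetical.

```python
import numpy as np

# Scores with shape (height, width, classes) for a 1 x 6 image strip
scores = np.array([[[0.9, 0.1],
                    [0.2, 0.8],
                    [0.3, 0.7],
                    [0.1, 0.9],
                    [0.2, 0.8],
                    [0.8, 0.2]]])

# Pixel-level classification: each pixel gets its highest-scoring class
label_map = scores.argmax(axis=-1)
```

Suppose pixels 1 to 2 belong to one object and pixels 3 to 4 to a second object of the same class. The label map marks all four pixels with the same label, so nothing in it separates the two objects.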

Mask R-CNN

• The upper branch is essentially doing what Faster R-CNN does
• The lower branch does semantic segmentation within each bounding box for each class. This combination eventually gives instance segmentation
