
Deep Learning For Object Detection - 131124

The document discusses deep learning techniques for object detection, highlighting its applications in areas such as self-driving cars, robotics, and facial recognition. It details various models like YOLO, Faster R-CNN, and SSD, emphasizing the advantages of YOLO for real-time detection. Additionally, it covers dataset creation, model training, and performance evaluation metrics essential for developing effective object detection systems.

DEEP LEARNING FOR OBJECT DETECTION

• Djoko Purwanto
• Artificial Intelligence and Health Technology Research Center
• Institut Teknologi Sepuluh Nopember (ITS)
OBJECT DETECTION
Introduction

Object detection is a computer vision task that involves identifying and locating objects within
images. Deep learning has significantly advanced this field, leveraging neural networks to improve
accuracy and efficiency.

[Figure: a model with parameters; example output labels: dog, cat]
Object Detection Applications
• Optical character recognition (OCR): the recognition of handwritten, printed, or typed characters
in an image. It is used to scan printed books into digital documents. Other applications include
data entry and traffic sign recognition.
• Self-driving cars: one of the major capabilities of a self-driving car is detecting pedestrians,
cars, trucks, traffic signs, and so on. These detections are essential for the car to operate properly.
• Verification using face and iris codes: face and iris verification are used for authentication on
iPhone and Android phones. The device is authorized only when a matching face or iris is detected.
• Robotics: object detection has many applications in robotics. A common one is bin picking and
sorting. Using object detection, the robot can determine the location of objects, then pick and
sort them.
• Object tracking and counting: object detection can be used to track an object and to count
objects, for example how many cars have crossed a junction or how many people have entered a
shopping mall.
• Other applications
Object Detection Models

Model Name | Note
YOLO (You Only Look Once) | Processes images in real time by predicting bounding boxes and class probabilities simultaneously.
Faster R-CNN | Combines region proposal networks with CNNs to enhance detection speed and accuracy.
SSD (Single Shot MultiBox Detector) | Balances speed and accuracy by detecting objects at multiple scales in a single pass.
You Only Look Once (YOLO)
YOLO is a deep learning-based approach to object detection. Object detection algorithms that use
deep learning can be classified into two groups:

1. Classification-based algorithms: these algorithms work in two stages. In the first stage, they
select a set of Regions of Interest (ROI) in the image where objects are likely to be found. In the
second stage, they apply a convolutional neural network to these regions to detect the presence of
an object. One problem with this method is that the detector has to be executed on every ROI,
which makes it slow and computationally expensive. One example of this type of algorithm is R-CNN.
2. Regression-based algorithms: these algorithms do not select ROIs in the image. Instead, they
predict the classes and bounding boxes for the entire image at once. This makes detection faster
than with classification-based algorithms. One well-known regression-based algorithm is YOLO
("You Only Look Once"). The YOLO detector is very fast, so it is used in self-driving cars and
other applications where real-time object detection is required.
The YOLO detector predicts the class of an object, its bounding box, and the probability of the
object's class in the bounding box.

Each bounding box has the following parameters:

• center position of the bounding box in the image ($b_x$, $b_y$)
• width of the box ($b_w$)
• height of the box ($b_h$)
• class of object ($c$)
• probability of the object's class ($p_c$)
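
As an illustration of the parameters listed above, the following minimal sketch converts a box given by its center, width, and height into pixel corner coordinates. The `BoundingBox` class, its field names, and the assumption that positions and sizes are normalized to [0, 1] are illustrative choices made here, not something taken from a particular YOLO implementation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class BoundingBox:
    """Hypothetical container for one detection, using the symbols from the list above."""
    bx: float  # center x, assumed normalized to [0, 1]
    by: float  # center y, assumed normalized to [0, 1]
    bw: float  # box width, assumed normalized to [0, 1]
    bh: float  # box height, assumed normalized to [0, 1]
    c: int     # class index
    pc: float  # probability of the class

    def to_corners(self, img_w: int, img_h: int) -> Tuple[int, int, int, int]:
        """Return (x_min, y_min, x_max, y_max) in pixels for an image of the given size."""
        x_min = (self.bx - self.bw / 2) * img_w
        y_min = (self.by - self.bh / 2) * img_h
        x_max = (self.bx + self.bw / 2) * img_w
        y_max = (self.by + self.bh / 2) * img_h
        return round(x_min), round(y_min), round(x_max), round(y_max)


# A box centered in a 640x480 image, a quarter of its width and half of its height.
box = BoundingBox(bx=0.5, by=0.5, bw=0.25, bh=0.5, c=0, pc=0.9)
print(box.to_corners(640, 480))
```

Corner coordinates are convenient for drawing boxes and for computing overlap between boxes, while the center format matches the parameters on this slide.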
YOLO in Darknet Framework

Darknet is an open-source neural network framework written primarily in C and CUDA. It is
designed for high performance and is particularly well known for its implementation of the YOLO
(You Only Look Once) object detection system.
• YOLOv4: Known for its balance of speed and accuracy.
• YOLOv5: Developed separately by Ultralytics in PyTorch rather than in Darknet, and widely used for its ease of use.
• YOLOv7: Further optimizations for speed and accuracy.
YOLO in PyTorch Framework

Ultralytics is a company focused on advancing artificial intelligence, particularly in computer vision.
They are best known for their work on the YOLO (You Only Look Once) series of models, which are
widely used for real-time object detection and image segmentation.
Ultralytics provides a comprehensive ecosystem for working with object detection models, from
dataset preparation to training and deployment. This makes it a powerful tool for both researchers
and developers looking to implement AI solutions in various domains.

As of these slides, the latest YOLO model from Ultralytics is YOLOv11 (styled YOLO11 by Ultralytics),
which was released in late 2024.
Object Detection Illustration

Mask detection
Mask detection technology enhances public health and
safety by monitoring compliance in crowded places like
airports and malls.

Person recognition
Person recognition accurately identifies individuals,
enhancing security and efficiency. In smart office
environments, this technology can be leveraged for staff
authentication, ensuring secure access while also enabling
personalized settings that cater to individual preferences.

Ship detection
Ship detection using maritime drones is increasingly
utilized for effective surveillance and monitoring of
waterways. Drones equipped with advanced imaging
technology can identify and track vessels in real-time,
enhancing maritime safety and security.

Ping-pong ball detection


Ping-pong ball detection is used in sports training and
robotics to track the ball's position and trajectory in real-time.
This technology enhances training by providing analytics on
ball movement, helping athletes improve their techniques.

DATASET
Dataset on the Internet
Datasets are used in the learning process of models for specific applications. Some datasets are
freely available on the internet, for example on:

• Roboflow
• Kaggle
Custom Dataset Builder for Object Detection
Custom datasets for object detection can be created using an image labeling tool such as labelImg.
A dataset can also be created efficiently with free or commercial web-based tools such as Roboflow.
Dataset Criteria for Object Detection

• Number of Images:
Aim for 1,000 to 5,000 images as a minimum; more complex tasks may need 10,000 to 100,000.
• Number of Instances:
  • Each image should have 5-10 instances of the target objects.
  • Ensure a balanced representation of different classes.
• Annotation Quality:
Use accurate bounding boxes and correct class labels for each object.
• Data Augmentation:
Apply techniques like rotation and flipping to increase dataset size and diversity.
• Dataset Split:
Divide into training (70-80%), validation (10-15%), and testing (10-15%) sets.
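
The dataset split can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the function name `split_dataset`, the 80/10/10 default, the fixed random seed, and the example file names are all illustrative choices within the ranges given above.

```python
import random
from typing import Dict, List, Sequence


def split_dataset(items: Sequence[str], train: float = 0.8, val: float = 0.1,
                  seed: int = 0) -> Dict[str, List[str]]:
    """Shuffle the items and split them into train/val/test subsets.

    Whatever is left after the train and val fractions becomes the test set.
    """
    if train <= 0 or val < 0 or train + val >= 1:
        raise ValueError("train and val must leave a non-empty test fraction")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


# Hypothetical file names standing in for a real image folder.
images = [f"img_{i:04d}.jpg" for i in range(1000)]
splits = split_dataset(images)
print({name: len(files) for name, files in splits.items()})
```

Shuffling before splitting helps avoid subsets that each come from a single recording session or camera, although a stratified split per class may be preferable when classes are strongly imbalanced.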

TRAIN THE MODEL
Train using Roboflow
Train using Google Colab

Train on a PC using the Darknet Framework

To run the Darknet framework on your PC, start by installing the necessary prerequisites: Visual
Studio on Windows, or build-essential and git on Linux. Next, clone the Darknet repository from
GitHub and, if desired, modify the Makefile to enable GPU and OpenCV support. Build the
project using make on Linux or Visual Studio on Windows. Prepare the dataset and the YOLO model
files, and place them in the Darknet directory. Finally, execute the appropriate command in the
terminal or command prompt to train your model or detect objects in an image.
Train on a PC using the PyTorch Framework
Model Performance Evaluation
• Intersection over Union (IoU): measures the overlap between predicted and ground
truth bounding boxes.
• Precision: measures the accuracy of the positive predictions made by the model.
• Recall: assesses the model's ability to identify all relevant instances.
• Mean Average Precision (mAP): measures how well a model can detect and locate
objects in images. It is calculated by averaging the Average Precision (AP) scores for
different object classes and various Intersection over Union (IoU) thresholds.
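
The first three metrics can be sketched directly in Python. This is a minimal illustration, assuming boxes are given as `(x_min, y_min, x_max, y_max)` tuples and that the counts of true positives, false positives, and false negatives have already been determined (typically by matching predictions to ground truth at a chosen IoU threshold). The function names are illustrative.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Two 2x2 boxes overlapping in a 1x1 square: intersection 1, union 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
# 8 correct detections, 2 false alarms, 2 missed objects.
print(precision_recall(tp=8, fp=2, fn=2))
```

A prediction is usually counted as a true positive only when its IoU with a ground truth box of the same class exceeds a threshold such as 0.5, which is how IoU feeds into precision, recall, and ultimately mAP.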

OBJECT DETECTION PROGRAMMING
Framework
• PyTorch: An open-source deep learning library that provides the foundation for building and
training neural networks. It is known for its flexibility and dynamic computation capabilities.
• Ultralytics: A company that specializes in computer vision, particularly through the development
of the YOLO (You Only Look Once) models. They focus on creating user-friendly tools and
frameworks for object detection.
• YOLO: A series of real-time object detection models that can identify multiple objects in images
quickly and accurately. The latest version, YOLOv11, is implemented in PyTorch, benefiting from
its efficient training and inference capabilities.
Inference using YOLOv11 Pre-trained Model
• The YOLOv11 pre-trained model is downloaded from https://github.com/ultralytics/ultralytics
• The YOLOv11 model is trained on the MS COCO (Common Objects in Context) dataset
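
Inference with a pre-trained model might look like the sketch below. It assumes the `ultralytics` package is installed (`pip install ultralytics`) and uses the nano weights file `yolo11n.pt`, which the package fetches on first use. The helper names `summarize` and `detect`, the image path, and the confidence threshold are illustrative assumptions, not the program shown on the slide.

```python
from typing import Dict, List


def summarize(class_ids: List[int], names: Dict[int, str]) -> Dict[str, int]:
    """Count detections per class name."""
    counts: Dict[str, int] = {}
    for class_id in class_ids:
        label = names[class_id]
        counts[label] = counts.get(label, 0) + 1
    return counts


def detect(image_path: str, weights: str = "yolo11n.pt", conf: float = 0.25) -> Dict[str, int]:
    """Run a COCO-pretrained model on one image and count the detected classes."""
    # Imported here so that summarize() can be used without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)                          # loads (and if needed downloads) the weights
    result = model.predict(image_path, conf=conf)[0]  # one Results object per input image
    class_ids = [int(c) for c in result.boxes.cls.tolist()]
    return summarize(class_ids, result.names)


# Example call (hypothetical image file):
#   print(detect("street.jpg"))
print(summarize([0, 0, 16], {0: "person", 16: "dog"}))
```

Because the model is trained on COCO, it can only report the COCO classes; detecting other objects requires training on a custom dataset.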

Train YOLOv11 Model using Transfer Learning
Directory structure YAML file
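
The slide's directory listing and YAML file are not reproduced in this text, so the following is only a sketch of a commonly used Ultralytics-style layout, generated from Python. The folder names, the dataset root `datasets/masks`, the class names, and the helper `make_data_yaml` are assumptions made for illustration.

```python
from typing import List

# Assumed layout (labels mirror the images folder, one .txt file per image):
#   datasets/masks/
#     images/train/  images/val/  images/test/
#     labels/train/  labels/val/  labels/test/


def make_data_yaml(root: str, class_names: List[str]) -> str:
    """Build the text of a dataset YAML file for the assumed layout above."""
    lines = [
        f"path: {root}",          # dataset root directory
        "train: images/train",    # training images, relative to path
        "val: images/val",        # validation images, relative to path
        "test: images/test",      # test images, relative to path
        "names:",                 # class index -> class name
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


yaml_text = make_data_yaml("datasets/masks", ["mask", "no_mask"])
print(yaml_text)
# To save it next to the training script:
#   open("data.yaml", "w").write(yaml_text)
```

The class indices in the YAML file must match the class IDs used in the label files.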

Python program for training the model
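
The program on the slide is not reproduced in this text. A minimal sketch of transfer learning with the `ultralytics` package could look like the following, assuming a dataset file named `data.yaml` and the pre-trained `yolo11n.pt` weights. The helper names, the epoch count, and the image size are illustrative assumptions.

```python
from typing import Any, Dict, Tuple


def build_train_args(data_yaml: str, epochs: int = 50, imgsz: int = 640,
                     use_gpu: bool = True) -> Dict[str, Any]:
    """Collect the training settings; device 0 is the first GPU, "cpu" forces CPU."""
    return {
        "data": data_yaml,
        "epochs": epochs,
        "imgsz": imgsz,
        "device": 0 if use_gpu else "cpu",
    }


def train_and_validate(data_yaml: str = "data.yaml", weights: str = "yolo11n.pt",
                       use_gpu: bool = True) -> Tuple[float, float]:
    """Fine-tune pre-trained weights on a custom dataset, then validate."""
    # Imported here so that build_train_args() works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # start from pre-trained weights (transfer learning)
    model.train(**build_train_args(data_yaml, use_gpu=use_gpu))
    metrics = model.val()  # evaluates on the validation split from data.yaml
    return metrics.box.map50, metrics.box.map  # mAP@0.5 and mAP@0.5:0.95


# Example call (requires the dataset and a working ultralytics installation):
#   map50, map50_95 = train_and_validate("data.yaml", use_gpu=False)
print(build_train_args("data.yaml", use_gpu=False))
```

Switching `use_gpu` is the only change needed to compare GPU and CPU runs; the results should be similar, but training time on a CPU is typically much longer.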

Class distribution
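
A class distribution like the one plotted during training can also be computed by hand from the label files. The sketch below assumes YOLO-format labels, where each line is `class_id x_center y_center width height`. The function name and the sample lines are illustrative.

```python
from collections import Counter
from typing import Iterable


def class_distribution(label_lines: Iterable[str]) -> Counter:
    """Count object instances per class ID from YOLO-format label lines."""
    counts: Counter = Counter()
    for line in label_lines:
        parts = line.split()
        if parts:  # skip empty lines
            counts[int(parts[0])] += 1
    return counts


# Hypothetical contents of one or more label files.
sample = [
    "0 0.50 0.50 0.20 0.30",
    "1 0.10 0.10 0.05 0.05",
    "0 0.70 0.20 0.10 0.10",
    "",
]
print(class_distribution(sample))
# For a real dataset, feed in the lines of every file under the labels folder, e.g.:
#   from pathlib import Path
#   lines = (l for f in Path("labels/train").glob("*.txt") for l in f.read_text().splitlines())
```

A strongly skewed count is a sign that the dataset does not yet meet the balanced-representation criterion and may need more images of the rare classes.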

Training results obtained using a GPU accelerator

Training results obtained using CPU

Model performance evaluation

Labeling of validation data
Prediction results for the validation data

Prediction result for the test image

THANK YOU
