A waste-management project that uses a fast real-time object detector to classify and localize objects in a live webcam feed, aiding waste segregation. It can also be used for autonomous surveillance during a ban.

Prakhar-Verma39/Object-Detection

Object Detection

INTRODUCTION

This project uses a fast real-time object detector to identify and localize various objects present in a live feed through a webcam. This will help segregate objects of concern from other objects. The real-time detection feature of this detector can also help in the surveillance of multiple places at once.

In this project, two major global problems are identified, and a solution is proposed for each.

Poor waste management contributes to climate change and air pollution and directly affects many ecosystems and species. Failing to segregate waste properly means that it will end up mixed in landfills. Waste items like food scraps, paper, and liquid waste can mix and decompose, releasing run-off into the soil and harmful gas into the atmosphere.

On the other hand, plastic bags cause many minor and major ecological and environmental issues. In 2002, India banned the production of plastic bags thinner than 20 µm, both to prevent them from clogging municipal drainage systems and to keep cows from ingesting bags they mistake for food. However, enforcement remains a problem. On 18 March 2016, the Ministry of Environment, Forest and Climate Change also passed a regulation banning all polythene bags thinner than 50 µm. Due to poor implementation of this regulation, regional authorities (states and municipal corporations) have had to implement their own regulations [source: Wikipedia].

TECHNOLOGIES & TOOLS USED

  • Python 3.9
  • OpenCV, Numpy
  • PyCharm, Google Colab
  • labelImg
  • OIDv4_ToolKit

ACTIVITY DIAGRAM

IMPLEMENTATION STEPS

  1. Firstly, images are gathered from the Open Images dataset: 513 images of plastic bags, 800 images of bottles, and 800 images of tin cans (note: images are in JPG format only).

  2. Figure 1. Images Collected.

  3. Preprocessing/annotation is performed. A text file is generated for each image, containing the location(s) of object instances together with their class identities in YOLO format: class ID, object center (x, y), object width, and object height. The coordinates are normalized by the actual width and height of the image. The text files are generated using the labelImg tool.

  4. Figure 2. Annotated Images

  5. Training is done on Google Colab, using its online GPU to speed up the process. Pre-trained weights are taken advantage of as a starting point, and weights are saved and tested after every 2000 iterations. Overall, 6000 iterations are performed, i.e., approximately 9 hours of training. The Darknet framework, created by one of the authors of the YOLO algorithm, Joseph Redmon, is used for training and serves as the backbone/feature extractor. Images are split 7:3 between training and validation.

  6. Figure 3. Training

  7. Finally, the model is evaluated by using charts provided by the Darknet framework and tested over some real-time images/feed from the webcam.
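The YOLO-format annotation described in step 3 can be sketched as a small conversion helper. This is a minimal illustration, assuming pixel-space boxes given as (x_min, y_min, x_max, y_max); labelImg produces equivalent lines when set to YOLO output mode.

```python
def to_yolo_format(box, img_w, img_h):
    """Convert a pixel-space box (x_min, y_min, x_max, y_max) to the
    normalized YOLO format (x_center, y_center, width, height)."""
    x_min, y_min, x_max, y_max = box
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return x_c, y_c, w, h

# One line of an annotation .txt file (class ID 0 is hypothetical):
line = "0 {:.6f} {:.6f} {:.6f} {:.6f}".format(
    *to_yolo_format((270, 140, 370, 340), 640, 480))
```

Because every value is divided by the image dimensions, the same annotation remains valid if the image is resized before training.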

TESTING & FINDING

Firstly, Mean Average Precision (mAP) is used to evaluate model performance: the mean of average precision values is calculated over recall values from 0 to 1. It builds on sub-metrics such as the confusion matrix, Intersection over Union (Jaccard index), recall, and precision.

These values are computed by the Darknet framework after every 1000 iterations.
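Intersection over Union, the sub-metric that decides whether a predicted box counts as a true positive, can be computed directly. A minimal sketch, assuming boxes in (x_min, y_min, x_max, y_max) form:

```python
def iou(a, b):
    """Intersection over Union (Jaccard index) of two boxes,
    each given as (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # 0 if no overlap
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```

Darknet uses an IoU threshold (commonly 0.5) against the ground-truth box when classifying a detection as a true or false positive for mAP.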

Secondly, YOLOv3 uses binary cross-entropy loss for each class label; the total loss and mAP are plotted against the iteration count.
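The per-label binary cross-entropy term can be illustrated in a few lines. This is a sketch of the loss formula only, not Darknet's implementation:

```python
import math

def bce(y_true, p_pred, eps=1e-7):
    """Binary cross-entropy averaged over labels: for each label,
    -(y*log(p) + (1-y)*log(1-p)), with p clamped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)
```

Each class gets its own independent sigmoid output, so a box can in principle carry more than one label, unlike a softmax over classes.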

Figure 4. Chart showing loss and mAP after 3000 iterations.

Figure 5. Object detected in real-time (True Positives).

Figure 6. Object detected in real-time (True Negatives).

Figure 7. Object detected in real-time (False Negatives).

Figure 8. Object detected in real-time (False Positives).

CONCLUSION & FUTURE SCOPE

By implementing the YOLO algorithm, an efficient object detector is developed that can detect plastic bags, bottles, and tin cans. Detection is very fast, running in near real time. As soon as any object of concern is detected, a short clip is recorded, and an alert sound is generated if the model keeps detecting the object for a fixed amount of time. This model can be used for implementing tighter bans or for surveillance.
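The "alert after sustained detection" behavior can be sketched as a small debounce state machine. The hold duration below is a hypothetical value, not the one used in the project:

```python
import time

class DetectionAlert:
    """Fire an alert only after the object has been detected
    continuously for `hold` seconds (illustrative debounce logic)."""

    def __init__(self, hold=2.0):  # hold time is an assumed example value
        self.hold = hold
        self.first_seen = None

    def update(self, detected, now=None):
        """Call once per frame; returns True when the alert should fire."""
        now = time.monotonic() if now is None else now
        if not detected:
            self.first_seen = None  # detection lapsed, reset the timer
            return False
        if self.first_seen is None:
            self.first_seen = now
        return now - self.first_seen >= self.hold
```

Resetting the timer on any missed frame keeps one-off false positives from triggering the alert, at the cost of delaying it when detection flickers.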

YOLO's architecture can be trained on multiple object classes, so it scales easily: a new set of items/objects can be detected simply by loading newly trained weights, without affecting existing classes. However, receptive field and feature resolution remain the factors of utmost concern.

Increasing the size of the dataset will increase detection accuracy, but for scale-robust detection, data augmentation needs to be implemented.

Transfer learning is used to improve generalization and training speed, but a domain mismatch between the original MS COCO dataset and the dataset used here may increase false positive or false negative results.

Fixed-size anchor boxes are scale-variant, leading to missed detections when object scale changes unexpectedly.

To address the problem of waste management and effective sorting of different items from a pool of waste, this detector will be implemented on a robotic hand that can work 24x7 to sort waste into different categories.

Efforts will also be made to improve the training dataset's quality and quantity, using methods like data augmentation, random weight initialization, and different input image sizes (320x320, 608x608, etc.).
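The planned augmentation could be as simple as generating flipped and brightness-shifted variants of each training image. A minimal NumPy sketch (the specific transforms and the +30 brightness offset are illustrative choices):

```python
import numpy as np

def augment(image):
    """Yield simple augmented variants of an HxWxC uint8 image:
    the original, a horizontal flip, and a brightness shift."""
    yield image
    yield image[:, ::-1]  # mirror left-right; box x-coords must be mirrored too
    brighter = image.astype(np.int16) + 30  # widen dtype to avoid uint8 wraparound
    yield np.clip(brighter, 0, 255).astype(np.uint8)
```

For detection data, geometric transforms such as the flip must also be applied to the YOLO annotations (here, x_center becomes 1 - x_center), while pixel-level transforms leave the boxes unchanged.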
