A waste-management project that uses a fast real-time object detector to classify and localize objects in a live webcam feed, aiding waste segregation. It can also be used for autonomous surveillance during a ban.

Prakhar-Verma39/Object-Detection

Object Detection

INTRODUCTION

This project uses a fast real-time object detector to identify and localize various objects present in a live feed through a webcam. This will help segregate objects of concern from other objects. The real-time detection feature of this detector can also help in the surveillance of multiple places at once.

In this project, two major global problems are identified, and a solution is proposed for each.

Poor waste management contributes to climate change and air pollution and directly affects many ecosystems and species. Failing to segregate waste properly means that it will end up mixed in landfills. Waste items like food scraps, paper, and liquid waste can mix and decompose, releasing run-off into the soil and harmful gas into the atmosphere.

On the other hand, plastic bags cause many minor and major ecological and environmental issues. In 2002, India banned the production of plastic bags thinner than 20 µm, both to prevent them from clogging municipal drainage systems and to keep cows from ingesting bags they mistake for food. However, enforcement remains a problem. On 18 March 2016, the Ministry of Environment, Forest and Climate Change also passed a regulation banning all polythene bags thinner than 50 µm. Due to poor implementation of this regulation, regional authorities (states and municipal corporations) have had to implement their own regulations [source: Wikipedia].

TECHNOLOGIES & TOOLS USED

  • Python 3.9
  • OpenCV, Numpy
  • PyCharm, Google Colab
  • labelImg
  • OIDv4_ToolKit

ACTIVITY DIAGRAM

IMPLEMENTATION STEPS

  1. Firstly, images are gathered from the Open Images dataset: 513 images of plastic bags, 800 images of bottles, and 800 images of tin cans (note: images are in JPG format only).

  2. Figure 1. Images Collected.

  3. Preprocessing/annotation is performed. A text file is generated for each image, containing the location(s) of object instances together with their class identities in YOLO format: class ID, object center (x, y), object width, and object height. The coordinates are normalized by the actual width and height of the image. The text files are generated using the labelImg tool.

  4. Figure 2. Annotated Images

  5. Training is done on Google Colab, using its online GPU to speed up the process. Pre-trained weights are taken advantage of as a starting point, and weights are saved and tested after every 2000 iterations. Overall, 6000 iterations are performed, i.e., approximately 9 hours of training. The Darknet framework, created by one of the authors of the YOLO algorithm, Joseph Redmon, is used for training and serves as the backbone/feature extractor. Images are split 7:3 between training and validation.

  6. Figure 3. Training

  7. Finally, the model is evaluated by using charts provided by the Darknet framework and tested over some real-time images/feed from the webcam.
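The YOLO-format annotation described in step 3 can be sketched as a small conversion helper. This is a minimal illustration, assuming pixel-space boxes given as (x_min, y_min, x_max, y_max); labelImg produces equivalent lines when set to YOLO output mode.

```python
def to_yolo_format(box, img_w, img_h):
    """Convert a pixel-space box (x_min, y_min, x_max, y_max) to the
    normalized YOLO format (x_center, y_center, width, height)."""
    x_min, y_min, x_max, y_max = box
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return x_c, y_c, w, h

# One line of an annotation .txt file (class ID 0 is hypothetical):
line = "0 {:.6f} {:.6f} {:.6f} {:.6f}".format(
    *to_yolo_format((270, 140, 370, 340), 640, 480))
```

Because every value is divided by the image dimensions, the same annotation remains valid if the image is resized before training.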

TESTING & FINDING

Firstly, Mean Average Precision (mAP) is used to evaluate model performance: the mean of average precision values is calculated over recall values from 0 to 1. It builds on sub-metrics such as the confusion matrix, Intersection over Union (Jaccard index), recall, and precision.

These values are computed by the Darknet framework after every 1000 iterations.
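Intersection over Union, the sub-metric that decides whether a predicted box counts as a true positive, can be computed directly. A minimal sketch, assuming boxes in (x_min, y_min, x_max, y_max) form:

```python
def iou(a, b):
    """Intersection over Union (Jaccard index) of two boxes,
    each given as (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # 0 if no overlap
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```

Darknet uses an IoU threshold (commonly 0.5) against the ground-truth box when classifying a detection as a true or false positive for mAP.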

Secondly, YOLOv3 uses binary cross-entropy loss for each class label; the total loss and mAP are plotted against the iteration count.
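The per-label binary cross-entropy term can be illustrated in a few lines. This is a sketch of the loss formula only, not Darknet's implementation:

```python
import math

def bce(y_true, p_pred, eps=1e-7):
    """Binary cross-entropy averaged over labels: for each label,
    -(y*log(p) + (1-y)*log(1-p)), with p clamped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)
```

Each class gets its own independent sigmoid output, so a box can in principle carry more than one label, unlike a softmax over classes.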

Figure 4. Chart showing loss and mAP after 3000 iterations.

Figure 5. Object detected in real-time (True Positives).

Figure 6. Object detected in real-time (True Negatives).

Figure 7. Object detected in real-time (False Negatives).

Figure 8. Object detected in real-time (False Positives).

CONCLUSION & FUTURE SCOPE

By implementing the YOLO algorithm, an efficient object detector is developed that can detect plastic bags, bottles, and tin cans. Detection is very fast, running in near real time. As soon as any object of concern is detected, a short clip is recorded, and an alert sound is generated if the model keeps detecting the object for a fixed amount of time. This model can be used for implementing tighter bans or for surveillance.
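The "alert after sustained detection" behavior can be sketched as a small debounce state machine. The hold duration below is a hypothetical value, not the one used in the project:

```python
import time

class DetectionAlert:
    """Fire an alert only after the object has been detected
    continuously for `hold` seconds (illustrative debounce logic)."""

    def __init__(self, hold=2.0):  # hold time is an assumed example value
        self.hold = hold
        self.first_seen = None

    def update(self, detected, now=None):
        """Call once per frame; returns True when the alert should fire."""
        now = time.monotonic() if now is None else now
        if not detected:
            self.first_seen = None  # detection lapsed, reset the timer
            return False
        if self.first_seen is None:
            self.first_seen = now
        return now - self.first_seen >= self.hold
```

Resetting the timer on any missed frame keeps one-off false positives from triggering the alert, at the cost of delaying it when detection flickers.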

YOLO's architecture can be trained on multiple object classes, so it scales easily: a new set of items/objects can be detected simply by loading newly trained weights, without affecting existing classes. However, receptive field and feature resolution remain the factors of utmost concern.

Increasing the size of the dataset will increase detection accuracy, but for scale-robust detection, data augmentation needs to be implemented.

Transfer learning is used to improve generalization and training speed, but a domain mismatch between the original MS COCO dataset and the dataset used here may increase false positive or false negative results.

Fixed-size anchor boxes are scale-variant, leading to missed detections when object scale changes unexpectedly.

To address the problem of waste management and effective sorting of different items from a pool of waste, this detector will be implemented on a robotic hand that can work 24x7 to sort waste into different categories.

Efforts will also be made to improve the training dataset's quality and quantity, using methods like data augmentation, random weight initialization, and different input image sizes (320x320, 608x608, etc.).
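The planned augmentation could be as simple as generating flipped and brightness-shifted variants of each training image. A minimal NumPy sketch (the specific transforms and the +30 brightness offset are illustrative choices):

```python
import numpy as np

def augment(image):
    """Yield simple augmented variants of an HxWxC uint8 image:
    the original, a horizontal flip, and a brightness shift."""
    yield image
    yield image[:, ::-1]  # mirror left-right; box x-coords must be mirrored too
    brighter = image.astype(np.int16) + 30  # widen dtype to avoid uint8 wraparound
    yield np.clip(brighter, 0, 255).astype(np.uint8)
```

For detection data, geometric transforms such as the flip must also be applied to the YOLO annotations (here, x_center becomes 1 - x_center), while pixel-level transforms leave the boxes unchanged.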
