This repository provides a PyTorch implementation of the original RCNN research paper, focusing on object detection and semantic segmentation.
- Implement region proposal using selective search.
- Fine-tune AlexNet on the VOC dataset.
- Implement a bounding box regressor for precise localization.
- Apply Non-Maximum Suppression (NMS) to reduce overlapping boxes.
- Improve accuracy by integrating a VGG-based architecture.
We use selective search to extract approximately 2000 candidate region proposals from each image.
The resulting output looks like this:
| Original Image | Region Proposal Image |
|---|---|
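A minimal sketch of the proposal step, assuming selective search comes from OpenCV's contrib module (`opencv-contrib-python`) rather than from this repository's own code. The function names `propose_regions` and `clean_proposals`, the `min_size` filter, and the de-duplication step are illustrative assumptions, not taken from the project.

```python
def propose_regions(image_bgr, max_proposals=2000):
    """Run OpenCV selective search on a BGR image (assumes opencv-contrib-python)."""
    import cv2  # imported lazily so clean_proposals works without OpenCV installed

    ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
    ss.setBaseImage(image_bgr)
    ss.switchToSelectiveSearchFast()  # faster, lower-recall mode
    rects = ss.process()              # rows of (x, y, w, h)
    return clean_proposals(rects, max_proposals=max_proposals)


def clean_proposals(rects, min_size=20, max_proposals=2000):
    """Convert (x, y, w, h) rects to (x1, y1, x2, y2) boxes.

    Drops boxes narrower or shorter than min_size (an assumed threshold),
    removes exact duplicates, and keeps at most max_proposals boxes.
    """
    seen = set()
    boxes = []
    for x, y, w, h in rects:
        x, y, w, h = int(x), int(y), int(w), int(h)
        if w < min_size or h < min_size:
            continue
        box = (x, y, x + w, y + h)
        if box in seen:
            continue
        seen.add(box)
        boxes.append(box)
        if len(boxes) == max_proposals:
            break
    return boxes
```

Each returned box can then be cropped from the image and resized to the input size the classifier expects.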
We fine-tune AlexNet on the VOC dataset to classify each region proposal, and train a bounding box regressor to refine the location of the predicted boxes.
The resulting output looks like this:
| Original Image | Detected Objects |
|---|---|
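The bounding box regressor can be illustrated with the target parameterization from the R-CNN paper: offsets of the box center scaled by the proposal size, and log-ratios of width and height. This is a sketch under that assumption, not the repository's actual code. The helper names are hypothetical and boxes are assumed to be `(x1, y1, x2, y2)` tuples.

```python
import math


def to_center_form(box):
    """Convert (x1, y1, x2, y2) to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return x1 + 0.5 * w, y1 + 0.5 * h, w, h


def encode_targets(proposal, ground_truth):
    """Regression targets (tx, ty, tw, th) that map a proposal onto its ground-truth box."""
    px, py, pw, ph = to_center_form(proposal)
    gx, gy, gw, gh = to_center_form(ground_truth)
    return (
        (gx - px) / pw,      # horizontal shift, in units of proposal width
        (gy - py) / ph,      # vertical shift, in units of proposal height
        math.log(gw / pw),   # log width ratio
        math.log(gh / ph),   # log height ratio
    )


def apply_deltas(proposal, deltas):
    """Apply predicted (dx, dy, dw, dh) to a proposal and return the refined box."""
    px, py, pw, ph = to_center_form(proposal)
    dx, dy, dw, dh = deltas
    cx, cy = px + dx * pw, py + dy * ph
    w, h = pw * math.exp(dw), ph * math.exp(dh)
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)
```

During training the regressor learns to predict the output of `encode_targets`. At inference time `apply_deltas` turns its predictions back into box coordinates.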
Non-Maximum Suppression (NMS) removes bounding boxes that heavily overlap a higher-scoring box, so that only the most confident prediction is retained for each detected object.
The resulting output looks like this:
| Original Image | After NMS |
|---|---|
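Greedy NMS can be sketched in plain Python as follows. The `iou_threshold` default of 0.5 is an assumed value for illustration, and the function names are hypothetical. Boxes are assumed to be `(x1, y1, x2, y2)` tuples scored for a single class.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first.

    A box is kept only if its IoU with every already-kept box
    is at or below iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

In a multi-class detector this is usually run once per class. For tensors, `torchvision.ops.nms(boxes, scores, iou_threshold)` provides the same greedy procedure.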
This project is maintained and developed by:
- Arun Kumar
- Deepak Diwan
- Satyam Kumar
- Mansi Aggarwal