
EE50013 Autonomous Navigation

Assignment: Object Detection using Webcam with ROS


Achyut Morang – [email protected]
March 16, 2025

1 Introduction
Object detection plays a fundamental role in modern robotics and autonomous systems, enabling machines to perceive and understand their environment. With advancements in deep learning, real-time object detection has become more efficient and accurate, making it crucial for applications such as autonomous vehicles, industrial automation, and surveillance [3, 1, 2]. This project focuses on integrating a real-time object detection system into the ROS 2 framework using YOLOv8.
The motivation for this experiment stems from the increasing need for robust and efficient
perception systems in autonomous navigation. Traditional object detection methods often
suffer from high computational costs or limited generalization to real-world scenarios. YOLO
(You Only Look Once) is known for its ability to perform object detection with high accuracy
and speed, making it ideal for real-time robotic applications [3, 2]. By implementing YOLOv8
in ROS 2, this project aims to explore its feasibility in a robotic vision pipeline.
The system is designed to capture live images from a webcam, publish them to a ROS
topic, and process them for object detection. The detected objects are visualized with
bounding boxes and confidence scores, and detection results are stored in a structured format
for further analysis. This implementation demonstrates how deep learning-based object
detection can be effectively integrated into a real-time ROS 2-based perception system, which
is crucial for smart mobility applications and autonomous decision-making in robots [4, 5].

2 System Design
2.1 Development Environment
The project was implemented in a virtualized Ubuntu environment running on macOS using
UTM as the virtualization software. Since ROS 2 is not natively supported on macOS, the
VM setup was necessary to ensure compatibility with ROS 2 Humble Hawksbill. A bridged
network configuration was used to enable seamless file transfers and remote execution.

ROS 2 was chosen as the middleware due to its modernized publisher-subscriber architecture, support for real-time applications, and integration with robotic frameworks. The YOLOv8 model was selected for its balance between accuracy and efficiency in object detection tasks.

2.2 Object Detection Pipeline


The system followed a modular architecture, enabling image streaming, object detection,
and result visualization. The major components were:

• Image Publisher: Captures webcam frames and publishes them to the /image_raw topic.

• YOLO Detector: Subscribes to /image_raw, processes frames using YOLOv8, and publishes detected objects to /detection_results.

• Result Logger: Stores detection results in a structured JSON format for further
analysis.

• Bag File Recorder: Logs both raw images and detection results for offline analysis.

2.3 ROS 2 Topics and Message Flow


The system was built using a publisher-subscriber model in ROS 2. The two main topics
used for communication were:

• /image_raw: Published by the webcam node to stream real-time images.

• /detection_results: Published by the YOLO detector node with the object detection outputs.

Since custom message generation in ROS 2 was unsuccessful, a workaround was implemented by using the standard std_msgs/String message type for detection results.
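
As a rough illustration of this workaround (the field names below are assumptions made for illustration, not taken from the project code), each detection can be serialized to JSON and wrapped in a std_msgs/String message:

```python
# Hypothetical sketch of the String-based workaround: one detection is
# serialized to JSON and published on /detection_results as std_msgs/String.
import json
from std_msgs.msg import String

def make_detection_msg(class_name, confidence, box_xyxy):
    """Pack a single detection into a std_msgs/String message (assumed schema)."""
    msg = String()
    msg.data = json.dumps({
        "class": class_name,            # e.g. "person"
        "confidence": round(confidence, 2),
        "bbox_xyxy": box_xyxy,          # [x1, y1, x2, y2] in pixels
    })
    return msg
```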

2.4 Challenges Faced


Several challenges were encountered during development, requiring iterative debugging and
improvements:

• ROS 2 Message Generation Issues: Custom messages were not successfully built
due to CMake and package.xml errors, leading to the use of standard message types.

• Library Dependencies: Compatibility issues arose with OpenCV, PyTorch, and Ultralytics, requiring multiple troubleshooting steps.

• Webcam Detection Issues: The correct device mapping had to be manually specified due to multiple video input sources.

• Storage and Transfer Constraints: The ROS 2 bag files were large (over 400MB)
and had to be compressed for efficient storage and submission.

• Visualization Issues: OpenCV’s GUI functions required additional configuration when running inside a virtual machine.

2.5 Final Architecture


The final implementation followed a structured design:

• Publisher: image_publisher → Publishes images to /image_raw.

• Subscriber: yolo_detector → Processes images and publishes results to /detection_results.

• Logger: Saves detections to detections.json.

• Bag File Recorder: Captures and stores image and detection data for offline review.

This design ensures modularity, real-time processing, and efficient data logging.

3 Implementation Steps
3.1 Setting Up the ROS 2 Workspace
A ROS 2 workspace was initialized with the standard structure. The object detection
package was created inside the src directory. The necessary ROS 2 dependencies were
configured in package.xml and CMakeLists.txt to support Python-based execution.

3.2 Writing the Image Publisher Node


The image_publisher.py script was developed to:

• Capture real-time images from the webcam.

• Convert them into ROS 2-compatible messages.

• Publish them on the /image_raw topic.

The node was tested independently to ensure correct image streaming.
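
A minimal sketch of such a node is given below; the frame rate, device index, and image encoding are assumptions made for illustration rather than the exact values used in image_publisher.py:

```python
# Minimal sketch of an image publisher node, assuming rclpy, OpenCV, and
# cv_bridge are installed; device index 0 is a placeholder for the webcam.
import cv2
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

class ImagePublisher(Node):
    def __init__(self):
        super().__init__('image_publisher')
        self.pub = self.create_publisher(Image, '/image_raw', 10)
        self.cap = cv2.VideoCapture(0)      # adjust index if several video devices exist
        self.bridge = CvBridge()
        self.timer = self.create_timer(0.1, self.publish_frame)  # ~10 Hz

    def publish_frame(self):
        ok, frame = self.cap.read()
        if not ok:
            self.get_logger().warning('Failed to read frame from webcam')
            return
        # Convert the OpenCV BGR frame into a ROS 2 sensor_msgs/Image and publish it
        self.pub.publish(self.bridge.cv2_to_imgmsg(frame, encoding='bgr8'))

def main():
    rclpy.init()
    node = ImagePublisher()
    try:
        rclpy.spin(node)
    finally:
        node.cap.release()
        node.destroy_node()
        rclpy.shutdown()

if __name__ == '__main__':
    main()
```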

3.3 Implementing the YOLOv8 Detector Node
The yolo_detector.py script was designed to:

• Subscribe to the /image_raw topic to receive images.
• Process each frame using the YOLOv8 model.
• Extract bounding boxes, confidence scores, and class names.
• Publish the results to the /detection_results topic.
• Log the results in detections.json for further analysis.

Additionally, the detection confidence threshold was lowered so that objects with low confidence scores were still reported, ensuring that even uncertain detections were logged.
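
The sketch below outlines one possible structure for such a node. The 0.25 threshold follows the report; the model variant (yolov8n.pt) and the line-delimited logging format are assumptions made for illustration:

```python
# Rough sketch of the detector node: subscribes to /image_raw, runs YOLOv8,
# republishes detections as a JSON string, and appends them to detections.json.
import json
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from std_msgs.msg import String
from cv_bridge import CvBridge
from ultralytics import YOLO

class YoloDetector(Node):
    def __init__(self):
        super().__init__('yolo_detector')
        self.model = YOLO('yolov8n.pt')                     # assumed model variant
        self.bridge = CvBridge()
        self.pub = self.create_publisher(String, '/detection_results', 10)
        self.create_subscription(Image, '/image_raw', self.on_image, 10)

    def on_image(self, msg):
        frame = self.bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')
        result = self.model(frame, conf=0.25, verbose=False)[0]  # low threshold keeps uncertain detections
        detections = []
        for box in result.boxes:
            detections.append({
                'class': result.names[int(box.cls)],
                'confidence': float(box.conf),
                'bbox_xyxy': [float(v) for v in box.xyxy[0]],
            })
        self.pub.publish(String(data=json.dumps(detections)))
        # Append one JSON array per frame to detections.json (assumed logging scheme)
        with open('detections.json', 'a') as f:
            f.write(json.dumps(detections) + '\n')

def main():
    rclpy.init()
    rclpy.spin(YoloDetector())

if __name__ == '__main__':
    main()
```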

3.4 Configuring ROS 2 Topics


The ROS 2 topics were verified to ensure proper message flow. The two key topics used
were:

• /image_raw: Streaming real-time frames from the webcam.

• /detection_results: JSON-formatted object detection results.

3.5 Running and Testing the System


The object detection system was executed and tested in real time. Debugging involved:

• Ensuring ROS 2 was sourced properly before running nodes.
• Checking the availability of expected ROS 2 topics.
• Verifying frame processing and detection accuracy.
• Analyzing logged detection results.

3.6 Recording and Managing the ROS 2 Bag File


For post-processing and offline analysis, a ROS 2 bag file was recorded using the ros2 bag record command. Unlike ROS 1, which stores bag data in a single .bag file, ROS 2 bags use an SQLite-based .db3 format by default, requiring rosbag2-aware tools for playback. The following topics were logged (a short inspection sketch follows the list):

• /image_raw – Stores all captured webcam images.

• /detection_results – Logs object detection outputs.
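
As one way to sanity-check the recording without full playback, the .db3 file can be queried directly. The sketch below assumes the default rosbag2 SQLite schema (topics and messages tables) and uses a placeholder bag path:

```python
# Hedged sketch: count messages per topic directly from the rosbag2 .db3 file.
# Assumes the default SQLite storage schema; the bag path is a placeholder.
import sqlite3

def count_messages(bag_db_path='rosbag2_recording/rosbag2_recording_0.db3'):
    conn = sqlite3.connect(bag_db_path)
    rows = conn.execute(
        "SELECT t.name, COUNT(m.id) "
        "FROM topics t LEFT JOIN messages m ON m.topic_id = t.id "
        "GROUP BY t.name"
    ).fetchall()
    conn.close()
    for name, count in rows:
        print(f'{name}: {count} messages')

if __name__ == '__main__':
    count_messages()
```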

3.7 Logging and Storing Detection Results
The detection results were stored in detections.json. The logging mechanism was modified
to:

• Append new detections instead of overwriting previous ones.

• Maintain consistency in formatting for easy visualization.

• Store bounding box coordinates, object class, and confidence scores.

The recorded JSON data was later analyzed to extract meaningful insights, including
object occurrence frequency, confidence distribution, and trends across multiple runs.
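
A small analysis script along these lines could produce the frequency and confidence statistics; the exact layout of detections.json is not reproduced in this report, so a line-delimited list of detection dicts (as in the earlier sketches) is assumed:

```python
# Illustrative analysis of the logged detections, assuming one JSON array of
# detection dicts per line in detections.json.
import json
from collections import Counter
from statistics import mean

classes, confidences = [], []
with open('detections.json') as f:
    for line in f:
        for det in json.loads(line):
            classes.append(det['class'])
            confidences.append(det['confidence'])

print('Most common classes:', Counter(classes).most_common(5))
print(f'Mean confidence: {mean(confidences):.2f}, '
      f'min: {min(confidences):.2f}, max: {max(confidences):.2f}')
```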

Figure 1: Frequency distribution of detected object classes.

4 Results and Observations


The implemented object detection system successfully captured live webcam images, processed them using YOLOv8, and logged detections in both ROS 2 topics and structured JSON files. The experiment provided insights into real-time object detection, confidence score distribution, and the effectiveness of ROS 2 for modular robotic applications.

4.1 Key Findings


The key observations from the experiment are summarized as follows:

• The majority of detections were persons, with a total of 679 instances recorded,
highlighting the model’s strong performance in detecting humans in the scene.

• Other frequently detected objects included cell phones (66 times), bottles (60
times), remotes (38 times), and vases (26 times), reflecting common objects in
the test environment.

• Less frequently detected objects included refrigerators (8 times), laptops (2 times), chairs (1 time), and a single instance of a banana, demonstrating the variability in the dataset.

• The confidence score statistics revealed that:

– The average confidence score across all detections was 0.83, indicating a generally reliable detection performance.
– The highest confidence recorded was 0.98, while the lowest was 0.25.
– 75% of the detected objects had confidence scores above 0.93, ensuring
high reliability in most predictions.
– A minimum confidence threshold of 0.25 was used to log even lower-confidence
detections, which provided insight into borderline predictions.

Figure 2: Confidence score distribution of detected objects.

4.2 Detection Analysis
A frequency analysis of detected objects showed that certain classes were identified far more
often than others. Figure 1 illustrates the distribution of detected objects, with dominant
categories being people and commonly used items. This pattern was influenced by the
camera placement, the scene composition, and the presence of dynamic elements such as
moving persons.

Figure 3: Sample detection: A cell phone identified with bounding box and confidence score.

An important observation was the confidence score distribution of detections. Figure 2 shows that most detections had confidence scores above 0.60, indicating that the model was generally confident in its predictions. However, a small percentage of detections had confidence below 0.30, suggesting occasional false positives or ambiguous detections. Adjusting the confidence threshold allows a trade-off between precision and recall.

4.3 Sample Detections


The system effectively detected various objects in real time. Figure 3 illustrates a sample detection in which a cell phone was successfully identified. The bounding boxes and confidence scores are clearly displayed, demonstrating the robustness of the YOLOv8 model.

4.4 Performance and System Behavior


The system was tested in real time, achieving an average inference time of approximately 100 ms per frame. Performance was influenced by factors such as hardware limitations in the virtualized environment, input frame resolution, and the number of objects present in each frame.
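
For reference, per-frame inference time can be estimated with a simple timing loop; the sketch below uses a dummy frame and the yolov8n.pt variant as placeholders rather than the report's actual setup:

```python
# Hedged sketch of how per-frame inference time can be measured offline.
import time
import numpy as np
from ultralytics import YOLO

model = YOLO('yolov8n.pt')                          # assumed model variant
frame = np.zeros((480, 640, 3), dtype=np.uint8)     # stand-in for a webcam frame

times = []
for _ in range(20):
    t0 = time.perf_counter()
    model(frame, conf=0.25, verbose=False)
    times.append((time.perf_counter() - t0) * 1000.0)

print(f'Average inference time: {sum(times) / len(times):.1f} ms per frame')
```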

For offline analysis, a ROS 2 bag file was recorded, capturing both raw images and detection results. Unlike ROS 1, where bag files are stored as a single binary, ROS 2 bag files use an SQLite-based .db3 format. The recorded topics included /image_raw for webcam frames and /detection_results for object metadata. The bag file was compressed and archived as detection_data_ros2bag.zip, submitted alongside detections.json and a demonstration video demo_detection.mp4.

5 Conclusion
This experiment demonstrated the feasibility of using ROS 2 and YOLOv8 for real-time object detection. The structured logging in JSON allowed for post-processing and visualization, aiding in a quantitative assessment of system performance. The findings suggest that fine-tuning confidence thresholds and applying computational optimizations can further improve the system. Future work could explore multi-camera setups, object tracking, and integration with robotic motion planning to enhance real-world applicability.

References
[1] Alexey Bochkovskiy, Chien-Yao Wang, and Hong-Yuan Mark Liao. “YOLOv4: Optimal Speed and Accuracy of Object Detection”. In: arXiv preprint arXiv:2004.10934 (2020).
[2] Glenn Jocher, Ayush Chaurasia, Jing Qiu, et al. YOLOv8: Cutting-Edge, Real-Time Object Detection and Segmentation. Available at https://github.com/ultralytics/ultralytics. 2023.
[3] Joseph Redmon and Ali Farhadi. “YOLO9000: Better, Faster, Stronger”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017), pp. 7263–7271.
[4] Open Robotics. ROS 2: Robot Operating System. Available at https://docs.ros.org/en/rolling/. 2023.
[5] Mujahed Talha et al. “ROS 2: An Overview of the Next Generation Robotics Middleware”. In: arXiv preprint arXiv:2101.00689 (2021).
