
SmartVision: A Modular Framework for Accurate Vehicle Speed Detection via YOLOv8 and Perspective Transformation

Sunakshi Mehra1[0000-0002-6397-6049], Arkin Kansra2[0009-0002-1653-2127], and Yash Mishra2[0009-0003-6914-6032]


1 Department of Computer Science & Engineering, Delhi Technological University, Bawana Road, Delhi, India
2 University School of Information, Communication and Technology, Guru Gobind Singh Indraprastha University, Dwarka, Delhi, India


*[email protected], [email protected], [email protected]

Abstract. In today’s era of smart cities and advancing transportation technologies, accurately
estimating vehicle speed is essential for improving traffic management, safety, and overall
transportation efficiency. Reliable speed estimation benefits both road users and traffic
authorities by enabling effective traffic monitoring and control. However, precise speed
measurement from video data remains challenging due to dynamic traffic conditions and real-
world complexities. To address these challenges, this study proposes a novel vehicle speed
estimation system built around several key modules: VehicleDetector, which uses YOLOv8 for
robust vehicle detection; SimpleTracker, employing centroid distance for multi-object tracking;
PerspectiveTransformer, converting image coordinates to real-world distances; SpeedEstimator,
calculating speeds with smoothing and filtering for stability; and SpeedDetectionSystem, which
integrates all components into a cohesive framework. The system processes CCTV footage in
real time, detecting and tracking vehicles, mapping their positions to physical space, and
estimating their speeds accurately. Experimental results show an average speed fluctuation of
5.98 km/h, reflecting consistent performance across different vehicles, and an average speed
estimation error of 7.81%, demonstrating reasonable accuracy given the limitations of video-
based measurements and calibration. This integrated approach holds promise for deployment in
intelligent transportation systems to enhance urban traffic control and road safety.

Keywords: VehicleDetector, PerspectiveTransformer, YOLOv8

1 Introduction

In recent years, growing concerns over traffic congestion have led to an increased focus on
improving existing traffic systems. As a result, numerous studies have emerged aimed at enhancing
traffic flow and efficiency. These studies play a crucial role in the development of Intelligent
Transportation Systems (ITS), which leverage advanced statistical and machine learning techniques
to optimize road traffic. The primary areas of focus in this research include vehicle rerouting [1],
traffic forecasting [2], and traffic management [3]. Road speed is a critical metric for traffic managers
and infrastructure designers, as it is closely linked to road safety, pollutant emissions, and the overall
quality of service within urban road networks. When traffic flow or density surpasses a road’s
capacity, a reduction in speed can lead to significant congestion [4]. Moreover, drivers' speed
choices—shaped largely by road design and infrastructure—have a considerable impact on both the
frequency and severity of traffic accidents. For instance, drivers often reduce their speed on narrower
roads or uneven pavement surfaces due to an increased perception of risk [5]. In air quality models,
traffic speed plays a critical role alongside vehicle flow, composition, and road network characteristics
[6]. Additionally, both low and high traffic speeds significantly impact fuel consumption costs [7].
Speed modeling studies are typically limited in scope, often focusing on a single road corridor [8], a
small number of road segments [9], or a specific set of routes [10].

Intelligent Transportation Systems (ITS) depend heavily on accurate forecasting of traffic conditions—including flow, speed, and congestion—which is essential for enhancing urban traffic
efficiency and alleviating congestion, two major challenges faced by modern cities. This forecasting
process involves the analysis of large-scale datasets collected from GPS devices, road sensors, and
historical traffic records to identify and interpret traffic patterns. Significant progress in this area has
been driven by the application of advanced techniques such as deep learning frameworks, machine
learning algorithms, and statistical models. Artificial intelligence (AI) has been widely used to
optimize traditional data-driven approaches in a wide range of research disciplines [11]. The AI-based
vehicle-to-everything (V2X) system collects information from various sources to enhance driver
awareness and anticipate crashes. The Internet of Vehicles (IoV), an automotive application of the Internet of Things (IoT), enables data sharing and communication between vehicles, allowing a motorist on a particular highway to be aware of other vehicles approaching from the opposite direction
[12, 13]. Deep learning algorithms have proven effective in capturing the spatiotemporal
characteristics of trajectory sequences. Among these, recurrent neural network (RNN)-based models
are particularly notable, having demonstrated strong performance across various prediction tasks [14].
For instance, Park et al. [15] proposed an LSTM-based encoder–decoder model to predict future
vehicle locations on a map grid. Building on this, Li et al. [16] introduced a multi-layer LSTM encoder–
decoder model with a temporal attention mechanism to enhance sequence learning for human mobility
forecasting. Similarly, Capobianco et al. [17] incorporated the attention mechanism into a recurrent
network model to improve the forecasting of vessel trajectories. Accurate prediction of energy
consumption and speed has significant practical implications for the development of sustainable and
efficient urban environments.

2 Literature Review

Recent advancements in deep learning and computer vision have enabled more scalable and cost-
effective solutions. Vision-based approaches offer a promising alternative for speed estimation in
traffic scenarios by leveraging existing camera infrastructures [18]. For tasks involving precise
vehicle speed prediction, deep learning models—especially those trained for real-time object
detection—have shown significant improvements in both accuracy and robustness [19]. A typical
Deep Neural Network (DNN) is not inherently designed for real-time object detection; however,
YOLO (You Only Look Once) is specifically optimized for this purpose. In general, DNNs operate
in multiple stages, which can impact speed and efficiency. For object detection tasks, traditional
models often follow image classification with additional processes such as region proposal (e.g., R-
CNN or Fast R-CNN), which can be computationally intensive and time-consuming. In contrast,
YOLO streamlines this process by detecting objects in a single forward pass, making it highly suitable
for applications that demand real-time analysis, such as live traffic monitoring [20]. YOLO divides
the input image into a grid and simultaneously predicts bounding boxes and class probabilities for
each grid cell in one pass.
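
As an illustration of this single-pass design, the following minimal sketch (assuming the Ultralytics Python package and a pretrained yolov8n.pt checkpoint, neither of which is prescribed by the text above) runs one forward pass over an image and reads back the predicted boxes and class probabilities:

# Minimal single-pass YOLOv8 inference sketch; assumes the ultralytics
# package and a pretrained yolov8n.pt checkpoint are available.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")               # load a pretrained YOLOv8 model
results = model("frame.jpg")             # one forward pass over the whole image

for box in results[0].boxes:             # boxes and class scores from that pass
    cls_id = int(box.cls[0])             # predicted class index
    conf = float(box.conf[0])            # class confidence
    x1, y1, x2, y2 = box.xyxy[0].tolist()  # bounding box corners in pixels
    print(results[0].names[cls_id], round(conf, 2), (x1, y1, x2, y2))
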
Due to its single-pass architecture, YOLO is capable of processing images in real time, in contrast
to traditional DNNs, which are often slower owing to their multilayered processing pipelines.
Conventional DNN-based object detection models [38, 39]—such as those relying on region proposals
or sliding windows—typically analyze discrete parts of an image independently, which can limit
global context awareness and result in fragmented understanding. In contrast, YOLO processes the
entire image simultaneously, enabling it to capture the global context. This holistic approach allows
YOLO to detect objects that might otherwise be overlooked or misclassified by models with limited
contextual perception [21]. Gollapalli et al. [22] employed YOLOv8 for vehicle detection and Deep SORT for multi-
object tracking to estimate vehicle speed and count in real time, achieving over 80% accuracy on a
real-world dataset. This integrated approach offers a cost-effective and efficient solution for
monitoring high-speed traffic. Jin-xiang Wang [23] proposed a video-based speed estimation method
that calculates vehicle velocity by tracking centroids and extracting features using three-frame
differencing and background subtraction. Using a static camera (640×480, 18 fps), the method
effectively reduced noise and accurately detected vehicle boundaries. Saif B. Neamah and Abdulamir
A. Karim [24] developed a real-time traffic monitoring system utilizing YOLOv8, achieving high
accuracy across five preprocessing stages—including 87.28% for speed estimation and 97.54% for
vehicle counting—using 1080p, 24 fps video and a GTX 1070 GPU. Héctor Rodríguez-Rangel et al.
[25] estimated vehicle speeds from monocular camera data using methods such as YOLOv3 and
Kalman filtering, with the LRM model demonstrating the highest accuracy and efficiency for
integration into traffic management systems.

Using a Deep SARSA algorithm with experience replay, Ogbeide et al. [26] introduced the Deep
SARSA Replay model for real-time traffic control. The model improves training stability and
adaptability by using a multilayer perceptron (MLP) to process lane-wise traffic data and update Q-
values. When tested on SUMO over 1000 episodes under varying traffic conditions (low to heavy), it
outperformed baseline deep reinforcement learning (DRL) models in terms of stability, reward
maximization, and traffic flow optimization. A system [27] was developed to dynamically regulate
traffic signals using image analysis and real-time vehicle counts at intersections. It optimizes traffic
flow by controlling both moving and stationary vehicles. To prevent deadlocks and enhance
efficiency, study [28] proposed a hybrid traffic light control system that combines the Structured
Systems Analysis and Design Methodology (SSADM) with fuzzy logic. This system classifies
junctions and simulates traffic control rules. Using Arduino and RFID, Miyim & Muhammad [29]
developed a smart traffic management system that prioritizes emergency vehicles. However, it
overlooks regular traffic, resulting in delays for non-emergency vehicles. Khan et al. [30] proposed a
lightweight neural network for traffic sign recognition, trained on the GTSRB and BelgiumTS
datasets. The model achieved 98.41% accuracy on GTSRB and 92.06% on BelgiumTS,
outperforming state-of-the-art models such as GoogleNet, AlexNet, VGG16/19, MobileNetv2, and
ResNetv2 by margins of 0.1–4.20% (GTSRB) and 9.33–33.18% (BelgiumTS).
Deep learning has significantly advanced object detection in recent years. The first use of
convolutional neural networks (CNNs) for this task was introduced with R-CNN [31] in 2014. Today,
most object detectors fall into two categories: one-stage (e.g., YOLO [32]–[37]) and two-stage (e.g.,
R-CNN [31]) models. One-stage methods are faster as they directly predict object locations, while
two-stage methods are slower but generally more accurate. However, one-stage models still face
challenges in detecting small objects.

3 Proposed Methodology

3.1 Proposed SmartVision

The SmartVision system presents a comprehensive and modular framework for accurate vehicle
speed detection, seamlessly integrating advanced computer vision techniques with deep learning. The
workflow begins with video input captured from traffic surveillance cameras, which is processed by
the VehicleDetector module leveraging the powerful YOLOv8 model for precise, real-time vehicle
detection. Detected vehicles are then fed into the SimpleTracker component, which applies a
centroid-distance-based multi-object tracking algorithm to maintain consistent vehicle identities
across frames. To translate image-based detections into meaningful physical measurements, the
PerspectiveTransformer module converts pixel coordinates into real-world distances through a
calibrated homography transformation, ensuring accurate spatial representation. Speed estimation is
then performed by the SpeedEstimator module, which calculates vehicle velocity by analyzing
changes in real-world positions over time, incorporating smoothing and filtering techniques to reduce
noise and improve stability. Finally, the SpeedDetectionSystem integrates all these modules into a
cohesive pipeline that outputs real-time visualizations including bounding boxes, vehicle IDs, speed
annotations, and trajectory paths, while also compiling traffic statistics for monitoring and analysis.
This end-to-end system combines state-of-the-art detection, tracking, and transformation methods to
provide a reliable, efficient solution for intelligent transportation systems aiming to enhance urban
traffic safety and management. The overall process can be understood in three key stages:

3.1.1 Detecting and Tracking Vehicles

The system starts by detecting vehicles in video frames using YOLOv8, a fast and accurate deep
learning model designed for object detection. It can identify different types of vehicles, such as cars,
buses, and motorcycles. Once the vehicles are detected, they are tracked across frames using a simple
yet effective centroid-based tracking method. This approach keeps track of each vehicle by measuring
the distance between their positions in consecutive frames, which helps maintain consistent
identification over time, even in moderately busy scenes.
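
A minimal sketch of such a centroid-distance tracker is given below; the max_distance threshold, the method names, and the absence of track expiry are illustrative simplifications rather than the exact implementation:

import math

class SimpleTracker:
    # Sketch of centroid-distance tracking: each new detection is matched to
    # the nearest previously seen centroid within max_distance pixels.
    def __init__(self, max_distance=50.0):
        self.max_distance = max_distance   # matching threshold (assumed value)
        self.next_id = 0
        self.tracks = {}                   # track_id -> last centroid (x, y)

    def update(self, centroids):
        matched, used_ids = [], set()
        for cx, cy in centroids:
            best_id, best_dist = None, self.max_distance
            for tid, (px, py) in self.tracks.items():
                d = math.hypot(cx - px, cy - py)
                if d < best_dist and tid not in used_ids:
                    best_id, best_dist = tid, d
            if best_id is None:            # no close track: assign a new ID
                best_id, self.next_id = self.next_id, self.next_id + 1
            self.tracks[best_id] = (cx, cy)
            used_ids.add(best_id)
            matched.append((best_id, (cx, cy)))
        return matched
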

3.1.2 Mapping Video Frames to Real-World Scale

To measure speed accurately, we need to know how far a vehicle has actually traveled—not just
in pixels, but in real-world units like meters. For this, we use a perspective transformation technique.
The user manually selects four reference points from the video and provides their actual real-world
measurements. These calibration points allow the system to translate pixel movements into physical
distances, forming the basis for accurate speed calculations.

3.1.3 Estimating Speed and Showing Results

With detection, tracking, and calibration in place, the system calculates the speed of each vehicle
by measuring the distance it moves between frames and dividing it by the time taken. To make the
speed readings more stable, we apply smoothing techniques that help reduce noise and filter out
unusual spikes. The final system displays real-time visuals—like bounding boxes, speed labels, and
vehicle paths—alongside summary statistics such as average speed, total vehicles counted, and frame
processing speed. These insights help evaluate both traffic conditions and the performance of the
system itself.
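
The core calculation can be sketched as a moving average over per-frame displacement with a simple spike filter; the window size and jump threshold are assumed values:

from collections import deque

class SpeedEstimator:
    # Sketch: speed from real-world displacement per frame, smoothed with a
    # moving average; sudden implausible jumps are filtered out.
    def __init__(self, fps, window=10, max_jump_kmh=40.0):
        self.dt = 1.0 / fps                  # seconds between frames
        self.max_jump = max_jump_kmh         # spike-rejection threshold (assumed)
        self.window = window                 # smoothing window (assumed)
        self.history = {}                    # track_id -> recent speed readings

    def update(self, track_id, prev_xy, curr_xy):
        dx, dy = curr_xy[0] - prev_xy[0], curr_xy[1] - prev_xy[1]
        speed_kmh = (dx * dx + dy * dy) ** 0.5 / self.dt * 3.6  # m/s -> km/h
        h = self.history.setdefault(track_id, deque(maxlen=self.window))
        if h and abs(speed_kmh - h[-1]) > self.max_jump:
            speed_kmh = h[-1]                # ignore an unusual spike
        h.append(speed_kmh)
        return sum(h) / len(h)               # smoothed estimate
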

4 Experimental Results

The dataset used is taken from Kaggle [40]. The experiment is performed in Python 3.7 and relies
on several key libraries, including OpenCV for video processing and visualization, NumPy for
numerical computations, PyTorch to support the YOLOv8 model, and the Ultralytics package for
model integration. SciPy is optionally used for advanced signal filtering. The experiment follows a
modular architecture, which ensures clarity, maintainability, and ease of future enhancements. Each
core component—vehicle detection, multi-object tracking, perspective transformation, and speed
estimation—is implemented as a separate module, all integrated through a centralized execution
script. YOLOv8, provided by the Ultralytics library, is used for accurate and real-time vehicle
detection. A simple yet effective centroid-based tracking algorithm is employed to maintain object
identities across frames. To convert pixel coordinates into real-world distances, perspective
transformation is applied based on a calibrated mapping between image and physical space. Vehicle
speeds are then estimated by calculating displacement over time, with optional smoothing and
filtering applied to enhance measurement stability.
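
A condensed sketch of how such a centralized execution script might wire the modules together is given below; it reuses the SimpleTracker, to_world, and SpeedEstimator sketches from Section 3, and the file names and exact signatures are assumptions:

import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                   # vehicle detector
cap = cv2.VideoCapture("traffic.mp4")        # placeholder input path
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0      # fallback frame rate
tracker = SimpleTracker()                    # centroid tracker (sketch above)
estimator = SpeedEstimator(fps)              # speed module (sketch above)
last_world = {}                              # track_id -> last world position

while True:
    ok, frame = cap.read()
    if not ok:
        break
    boxes = model(frame)[0].boxes.xyxy.tolist()            # detect vehicles
    centroids = [((x1 + x2) / 2, (y1 + y2) / 2) for x1, y1, x2, y2 in boxes]
    for tid, (cx, cy) in tracker.update(centroids):        # track identities
        wx, wy = to_world(cx, cy)                          # pixels -> metres
        if tid in last_world:
            speed = estimator.update(tid, last_world[tid], (wx, wy))
            cv2.putText(frame, "ID %d: %.1f km/h" % (tid, speed),
                        (int(cx), int(cy)), cv2.FONT_HERSHEY_SIMPLEX,
                        0.5, (0, 255, 0), 2)
        last_world[tid] = (wx, wy)
    cv2.imshow("SmartVision", frame)
    if cv2.waitKey(1) == 27:                               # Esc to stop
        break
cap.release()
cv2.destroyAllWindows()
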

4.1 System Configuration

The system’s behavior and performance are managed through a centralized configuration file,
allowing users to adjust key parameters without modifying the core codebase. Users can set video
input paths, fallback frame rates, detection thresholds, and model locations. Additionally, tracking
settings—such as maximum allowable frame gaps and distance thresholds for maintaining identity
across frames—are customizable. Perspective transformation is defined through manually selected
image and world coordinate pairs to calibrate the scene for accurate spatial measurement. The
configuration also allows users to toggle visualization elements, such as detection boxes and
trajectories, enabling a flexible and user-centric experience suitable for both development and
deployment environments.
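
For illustration, such a configuration might look like the following; every key and value here is hypothetical, chosen only to mirror the parameters described above:

# Hypothetical configuration sketch mirroring the parameters described above.
CONFIG = {
    "video_path": "traffic.mp4",        # input video location
    "fallback_fps": 30.0,               # used if FPS metadata is missing
    "model_path": "yolov8n.pt",         # YOLOv8 weights
    "detection_confidence": 0.4,        # minimum detection score
    "tracking": {
        "max_frame_gap": 15,            # frames a track may go unseen
        "max_distance": 50.0,           # pixel threshold for keeping identity
    },
    "calibration": {
        "image_points": [[420, 300], [860, 300], [1100, 700], [180, 700]],
        "world_points": [[0, 0], [7.0, 0], [7.0, 30.0], [0, 30.0]],  # metres
    },
    "visualization": {
        "draw_boxes": True,             # toggle detection boxes
        "draw_trajectories": True,      # toggle motion paths
    },
}
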

4.2 Visualization and Output

The system provides an intuitive real-time visualization of the traffic scene, overlaying key
information directly on the video feed. Each detected vehicle is highlighted with a bounding box and
assigned a unique ID for consistent tracking across frames. Alongside these, the estimated speed for
each vehicle is displayed, with color-coded indicators to visually distinguish different speed ranges.
Vehicle trajectories are also rendered as motion paths, providing insights into vehicle movement
patterns. At the end of the analysis, summary statistics are generated, including the total number of
vehicles detected, average speeds, and overall system performance metrics such as processing time
and frames per second (FPS). These outputs not only enhance system transparency but also support
post-analysis and decision-making in traffic management.
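
A minimal OpenCV overlay sketch of this visualization is given below; the colour scheme and the "Calculating..." state follow the behaviour described for Figure 1, while the exact speed thresholds are assumptions:

import cv2

def draw_vehicle(frame, track_id, box, speed_kmh, trail):
    # Sketch: colour-coded box, ID/speed label, and trajectory path.
    # Thresholds (80 / 50 km/h) are illustrative, not the exact values used.
    x1, y1, x2, y2 = map(int, box)
    if speed_kmh is None:
        color, label = (255, 0, 0), "ID %d: Calculating..." % track_id      # blue
    elif speed_kmh > 80:
        color, label = (0, 0, 255), "ID %d: %.0f km/h" % (track_id, speed_kmh)  # red: fast
    elif speed_kmh > 50:
        color, label = (0, 255, 255), "ID %d: %.0f km/h" % (track_id, speed_kmh)  # yellow: moderate
    else:
        color, label = (0, 255, 0), "ID %d: %.0f km/h" % (track_id, speed_kmh)  # green: slow
    cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
    cv2.putText(frame, label, (x1, y1 - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
    for p, q in zip(trail, trail[1:]):           # trajectory as a polyline
        cv2.line(frame, tuple(map(int, p)), tuple(map(int, q)), (255, 0, 0), 1)
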

Fig. 1. A real-time traffic monitoring system on a multi-lane highway: vehicles are detected, tracked, and their speeds estimated using computer vision. Each vehicle is assigned a unique ID, with speed displayed in a color-coded bounding box indicating its velocity level, trajectory lines showing movement paths, and on-screen statistics summarizing active vehicles, total measurements, and average speed.

Figure 1 shows a real-time traffic monitoring system applied to a multi-lane highway, where vehicles are automatically detected, tracked, and their speeds estimated using computer vision techniques. Each vehicle is enclosed in a colored bounding box with a unique identifier (e.g., ID:111 in the last sampled frame of Figure 1) and an associated speed in kilometers per hour. The color of the
bounding boxes indicates the relative speed of the vehicles—red for fast-moving vehicles, yellow for
those at moderate speed, and green for slower vehicles. One of the vehicles (ID:77) is still in the
process of speed calculation, marked with a blue box and labeled "Calculating...". Blue trajectory
lines show the predicted movement path of each vehicle. At the top left corner, system statistics are
displayed, including the number of active and total tracked vehicles (20 active, 126 total), the number
of speed measurements taken (12,515), and the average vehicle speed (62.7 km/h). The scene
exemplifies an intelligent traffic surveillance system that leverages object detection and tracking
algorithms to monitor vehicular movement and estimate real-time speeds, potentially aiding in traffic
management, law enforcement, and road safety analysis.
Fig. 2. Minimum and maximum recorded speeds for ten vehicles, together with per-vehicle speed fluctuation and percentage error.

Figure 2 presents a comparative analysis of the minimum and maximum speeds (in km/h) recorded
for ten different vehicles. The blue and green lines represent the minimum and maximum speeds,
respectively, plotted against vehicle numbers on the X-axis. The left Y-axis denotes the speed values,
while the right Y-axis shows the fluctuation in speed (red dashed line) and the percentage error
(orange dotted line). The plot reveals that the first five vehicles exhibit higher speed ranges with
considerable fluctuations, indicating greater performance variability. In contrast, vehicles 6 to 10
maintain lower speed values and more stable behavior with smaller fluctuations. However, certain
vehicles—particularly numbers 6 and 9—show noticeable spikes in the percentage error, suggesting
potential inconsistencies or anomalies in their speed measurements. Overall, the chart provides
valuable insight into the performance dynamics and reliability of each vehicle based on speed
fluctuations and measurement errors.
Fig. 3. Speed Comparison and Error Analysis between Kaggle and Our Model

Figure 3 illustrates a comparative study between the speeds predicted by the Kaggle model
(green line) and those from our proposed model (blue line) for 63 vehicles, plotted along the X-axis.
The left Y-axis represents the speed in km/h, while the right Y-axis captures the associated error
metrics. The red dashed line indicates the absolute error, and the orange dotted line shows the
percentage error between the two predictions. A close inspection reveals that while both models
generally follow a similar trend, our model occasionally exhibits sharper fluctuations, especially
between vehicle IDs 1–20 and 40–50. Despite these variations, the percentage error remains relatively
moderate for the majority of vehicles, staying under 5% in most cases, except for a few outliers where
it spikes up to around 9–10%. This analysis highlights the consistency of our model in closely
approximating Kaggle’s performance, with only a few instances of notable deviation, making it a
promising approach for accurate speed prediction. The system demonstrated an average speed
fluctuation of 5.98 km/h, indicating a relatively stable performance across different vehicle instances.
Additionally, the average error in speed estimation was calculated to be 7.81%, which reflects a
reasonably accurate outcome given the constraints of video-based measurement and real-world
calibration challenges.

5 Conclusion

In conclusion, the proposed vehicle speed estimation system effectively leverages the strengths of
YOLOv8-based detection, multi-object tracking, and perspective transformation to provide accurate
and reliable speed measurements from video data. Despite challenges inherent in real-world traffic
environments and video-based analysis, the system demonstrated stable performance with an average
speed fluctuation of 5.98 km/h and an error rate of 7.81%. Its modular architecture allows for
flexibility and potential enhancements, making it well-suited for real-time applications in intelligent
transportation systems. By improving traffic monitoring and management capabilities, this system
contributes meaningfully toward safer and more efficient urban mobility solutions.

References

1. Ho, M. C., Lim, J. M. Y., Soon, K. L., & Chong, C. Y. (2019). An improved pheromone-based vehicle
rerouting system to reduce traffic congestion. Applied Soft Computing, 84, 105702.
2. Qu, L., Li, W., Li, W., Ma, D., & Wang, Y. (2019). Daily long-term traffic flow forecasting based on a deep
neural network. Expert Systems with Applications, 121, 304-312.
3. Ning, Z., Huang, J., & Wang, X. (2019). Vehicular fog computing: Enabling real-time traffic management
for smart cities. IEEE Wireless Communications, 26(1), 87-93.
4. Wang, C., Quddus, M. A., & Ison, S. G. (2013). The effect of traffic and road characteristics on road safety:
A review and future research direction. Safety Science, 57, 264-275.
5. Stephan, K. L., & Newstead, S. V. (2014). Characteristics of the road and surrounding environment in
metropolitan shopping strips: association with the frequency and severity of single-vehicle crashes. Traffic Injury Prevention, 15(sup1), S74-S80.
6. Pinto, J. A., Kumar, P., Alonso, M. F., Andreão, W. L., Pedruzzi, R., dos Santos, F. S., ... & de Almeida
Albuquerque, T. T. (2020). Traffic data in air quality modeling: A review of key variables, improvements
in results, open problems and challenges in current research. Atmospheric Pollution Research, 11(3), 454-
468.
7. Hosseinlou, M. H., Kheyrabadi, S. A., & Zolfaghari, A. (2015). Determining optimal speed limits in traffic
networks. IATSS Research, 39(1), 36-41.
8. Zedda, M., & Pinna, F. (2017, December). Prediction models for space mean speed on urban roads.
In Proceedings of International Conference on Computational Intelligence and Data Engineering: ICCIDE
2017 (pp. 11-28). Singapore: Springer Singapore.
9. Thiessen, A., El-Basyouny, K., & Gargoum, S. (2017). Operating speed models for tangent segments on
urban roads. Transportation Research Record, 2618(1), 91-99.
10. Zheng, F., Li, J., Van Zuylen, H., & Lu, C. (2017). Influence of driver characteristics on emissions and fuel
consumption. Transportation Research Procedia, 27, 624-631.
11. Tong, W., Hussain, A., Bo, W. X., & Maharjan, S. (2019). Artificial intelligence for vehicle-to-everything:
A survey. IEEE Access, 7, 10823-10843.
12. Dabboussi, A. (2019). Dependability approaches for mobile environment: Application on connected
autonomous vehicles (Doctoral dissertation, Université Bourgogne Franche-Comté).
13. Storck, C. R., & Duarte-Figueiredo, F. (2019). A 5G V2X ecosystem providing internet of
vehicles. Sensors, 19(3), 550.
14. Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and
translate. arXiv preprint arXiv:1409.0473.
15. Park, S. H., Kim, B., Kang, C. M., Chung, C. C., & Choi, J. W. (2018, June). Sequence-to-sequence
prediction of vehicle trajectory via LSTM encoder-decoder architecture. In 2018 IEEE intelligent vehicles
symposium (IV) (pp. 1672-1678). IEEE.
16. Li, F., Gui, Z., Zhang, Z., Peng, D., Tian, S., Yuan, K., ... & Lei, Y. (2020). A hierarchical temporal attention-
based LSTM encoder-decoder model for individual mobility prediction. Neurocomputing, 403, 153-166.
17. Capobianco, S., Millefiori, L. M., Forti, N., Braca, P., & Willett, P. (2021). Deep learning methods for vessel
trajectory prediction based on recurrent neural networks. IEEE Transactions on Aerospace and Electronic
Systems, 57(6), 4329-4346.
18. Tsekeris, T., & Vogiatzoglou, K. (2014). Public infrastructure investments and regional specialization:
empirical evidence from Greece. Regional Science Policy & Practice, 6(3), 265-290.
19. Dilek, E., & Dener, M. (2023). Computer vision applications in intelligent transportation systems: a
survey. Sensors, 23(6), 2938.
20. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016). Ssd: Single shot
multibox detector. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The
Netherlands, October 11–14, 2016, Proceedings, Part I 14 (pp. 21-37). Springer International Publishing.
21. Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional
networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4700-
4708).
22. Gollapalli, P., Muthyala, N., Godugu, P., Didikadi, N., & Ankem, P. K. (2024). Speed sense: Smart traffic
analysis with deep learning and machine learning. World Journal of Advanced Research and Reviews, 21(3),
2240-2247.
23. Wang, J. X. (2016, July). Research of vehicle speed detection algorithm in video surveillance. In 2016
International Conference on Audio, Language and Image Processing (ICALIP) (pp. 349-352). IEEE.
24. Neamah, S. B., & Karim, A. A. (2023). Real-time traffic monitoring system based on deep learning and
YOLOv8. ARO-The Scientific Journal of Koya University, 11(2), 137-150.
25. Rodríguez-Rangel, H., Morales-Rosales, L. A., Imperial-Rojo, R., Roman-Garay, M. A., Peralta-Peñuñuri,
G. E., & Lobato-Báez, M. (2022). Analysis of statistical and artificial intelligence algorithms for real-time
speed estimation based on vehicle detection with YOLO. Applied Sciences, 12(6), 2907.
26. Ayodeji, O., Olusegun, A., Olu, A., Obafemi, J. R., Akinrolabu, O. D., & Rotiba, O. A Deep Learning-based
Model for Traffic Signal Control using the YOLO Algorithm. International Journal of Computer
Applications, 975, 8887.
27. Amin, A., Bahnasy, S., Elhadidy, A., & Elattar, M. (2020, October). Real-time 4-way Intersection Smart
Traffic Control System. In 2020 2nd Novel Intelligent and Leading Emerging Sciences Conference
(NILES) (pp. 428-433). IEEE.
28. Chinyere, O. U., Francisca, O. O., & Amano, O. E. (2011). Design and simulation of an intelligent traffic
control system. International Journal of Advances in Engineering & Technology, 1(5), 47.
29. Miyim, A. M., & Muhammed, M. A. (2019, December). Smart traffic management system. In 2019 15th
International Conference on Electronics, Computer and Computation (ICECCO) (pp. 1-6). IEEE.
30. Khan, M. A., Park, H., & Chae, J. (2023). A lightweight convolutional neural network (CNN) architecture
for traffic sign recognition in urban road networks. Electronics, 12(8), 1802.
31. Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object
detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and
pattern recognition (pp. 580-587).
Jocher, G. (2020). YOLOv5 by Ultralytics.
33. Li, C., Li, L., Jiang, H., Weng, K., Geng, Y., Li, L., ... & Wei, X. (2022). YOLOv6: A single-stage object
detection framework for industrial applications. arXiv preprint arXiv:2209.02976.
34. Wang, C. Y., Bochkovskiy, A., & Liao, H. Y. M. (2023). YOLOv7: Trainable bag-of-freebies sets new
state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF conference on computer
vision and pattern recognition (pp. 7464-7475).
35. Wang, C., He, W., Nie, Y., Guo, J., Liu, C., Wang, Y., & Han, K. (2023). Gold-YOLO: Efficient object
detector via gather-and-distribute mechanism. Advances in Neural Information Processing Systems, 36,
51094-51112.
36. Wang, C. Y., Yeh, I. H., & Mark Liao, H. Y. (2024, September). Yolov9: Learning what you want to learn
using programmable gradient information. In European conference on computer vision (pp. 1-21). Cham:
Springer Nature Switzerland.
37. Wang, A., Chen, H., Liu, L., Chen, K., Lin, Z., & Han, J. (2024). Yolov10: Real-time end-to-end object
detection. Advances in Neural Information Processing Systems, 37, 107984-108011.
38. Mehra, S., & Susan, S. (2022, December). Early fusion of phone embeddings for recognition of low-
resourced accented speech. In 2022 4th International Conference on Artificial Intelligence and Speech
Technology (AIST) (pp. 1-5). IEEE.
39. Mehra, S., Ranga, V., & Agarwal, R. (2024). Multimodal Integration of Mel Spectrograms and Text
Transcripts for Enhanced Automatic Speech Recognition: Leveraging Extractive Transformer‐Based
Approaches and Late Fusion Strategies. Computational Intelligence, 40(6), e70012.
40. https://www.kaggle.com/datasets/yashmishra2006/motorway-dataset-for-speed-detection-models/data
