
Object Detection

The document discusses multi-object detection as a computer vision approach aimed at efficient object localization and recognition. It covers various aspects including image processing, machine learning, and deep learning techniques, particularly focusing on the YOLO (You Only Look Once) model. The document also presents experimental results, accuracy metrics, and future work directions in the field of object detection.

Uploaded by Mc Swathi


Multi Object Detection: a computer vision approach for efficient object localization and recognition

Table of Contents

List of Figures

Chapter

I. Introduction
Overview
Motivation
Research aim
Problem description and Research Questions
What is an image in Computer Science?
What is Object Detection?
What is Image Recognition?

II. Background
Overview
Image Processing
Artificial intelligence and learning
Machine Learning
Machine Learning Evaluation
Deep Learning
Neural Network
Unified Detection Model YOLO
Pytesseract
Regular Expressions

III. Experiment
Overview
Data Preparation
Training
Evaluation Metric

IV. Results
Overview
Detection Accuracy
Font Style and Size
Input Size

V. Conclusion
Overview
Application
Limitations of Work
Future work

References

List of Figures

1. Image Recognition and Object Detection difference [36].
2. Enhancing grayscale images with histogram equalization [37].
3. Structure of Perceptron [35].
4. Output of a Perceptron [35].
5. Activation functions. (1) the left curve is a sigmoid function curve. (2) the right curve is a tanh function curve [35].
6. A sample fully connected neural network with only one hidden layer [35].
7. Visualization of a Training Process [35].
8. Max-pooling. Pooling from 24 × 24 to 12 × 12 [35].
9. Structure of a CNN Based Object Detection Model [35].
10. YOLO Structure [35].
11. YOLO Network Architecture [42].
12. Bounding Box prediction formula [32].
13. Image before pre-processing [43].
14. Image after pre-processing [43].
15. Word bank used to generate our training image data.
16. Sample generated image using our custom-built tagging tool.
17. Sample grid generated by YOLO for our detection algorithm.
18. Our detection model summary.
19. Graphical View of the IoU equation [41].
20. The higher the IoU, the better the performance [24].
21. (a) Object detection algorithm after 10 epochs (b) Object detection algorithm for an image without any link (c) Improved Object detection algorithm after 200 epochs.
22. Accuracy between two Optical Character Recognition techniques and our model on ten images using regular font style.
23. Regular Expression used to recognize a URL in a string of text.
24. Our model experiencing overfitting.
25. OCR image conversion to text versus our object detection model.
26. Example italicized text.
27. Detection Accuracy between Optical Character Recognition techniques and our model on ten images using italicized font style.
28. Detection on different input image sizes.
29. Our model's performance on slightly varying font size.

Figure 1. Image Recognition and Object Detection difference [36].
Figure 2. Enhancing grayscale images with histogram equalization [37].
Figure 3. Structure of Perceptron [35].
Figure 4. Output of a Perceptron [35].

Figure 5. Activation functions. (1) the left curve is a sigmoid function curve. (2) the right curve is a tanh function curve [35].
Figure 6. A sample fully connected neural network with only one hidden layer [35].
Figure 7. Visualization of a Training Process [35].
Figure 8. Max-pooling. Pooling from 24 × 24 to 12 × 12 [35].
Figure 9. Structure of a CNN Based Object Detection Model [35].
Figure 10. YOLO Structure [35].

Figure 11. YOLO Network Architecture [42].

Figure 12. Bounding Box prediction formula [32].
Figure 13. Image before pre-processing [43].
Figure 14. Image after pre-processing [43].
Figure 15. Word bank used to generate our training image data.

Figure 16. Sample generated image using our custom-built tagging tool.
Figure 17. Sample grid generated by YOLO for our detection algorithm.
Figure 20. The higher the IoU, the better the performance [24].
Figure 21. (a) Object detection algorithm after 10 epochs (b) Object detection algorithm for an image without any link (c) Improved Object detection algorithm after 200 epochs.
Figure 22. Accuracy between two Optical Character Recognition techniques and our model on ten images using regular font style.

Figure 23. Regular Expression used to recognize a URL in a string of text.

Figure 24. Our model experiencing overfitting.
Figure 26. Example italicized text.

[Chart: Detection Accuracy on italicized texts. Labels: Thesis, OCR Pattern 1, OCR Pattern 2 - regEx; Errors, Overfitting, Exact. Vertical axis: 0 to 80.]

Figure 27. Detection Accuracy between Optical Character Recognition techniques and our model on ten images using italicized font style.

References

[1] P. Chakravorty, "What Is a Signal? [Lecture Notes]," in IEEE Signal Processing Magazine, vol. 35, no. 5, pp. 175-177, Sept. 2018, doi: 10.1109/MSP.2018.2832195.
[2] M. Rouse, [Online]. Available at: https://whatis.techtarget.com/definition/image. [Accessed: 01/10/20]
[3] Merriam-Webster [Online]. Available at: https://www.merriam-webster.com/dictionary/image. [Accessed: 01/10/20]
[4] Wikipedia, Image [Online]. Available at: https://en.wikipedia.org/wiki/Image. [Accessed: 01/10/20]
[5] The Editors of Encyclopedia Britannica, Image-processing [Online]. Available at: https://www.britannica.com/technology/image-processing. [Accessed: 01/10/20]
[6] Wikipedia, [Online]. Available at: https://www.bbc.co.uk/bitesize/guides/zqyrq6f/revision/3. [Accessed: 02/10/20]
[7] Object (image processing) [Online]. Available at: https://en.wikipedia.org/wiki/Object_(image_processing). [Accessed: 02/10/20]
[8] P. Ganesh, Object Detection: Simplified [Online]. Available at: https://towardsdatascience.com/object-detection-simplified-e07aa3830954. [Accessed: 02/10/20]
[9] Tensorflow. Available at: https://www.tensorflow.org/lite/models/object_detection/overview. [Accessed: 03/10/20]
[10] Wikipedia. Available at: https://en.wikipedia.org/wiki/Object_detection. [Accessed: 02/10/20]
[11] Fritz, Detection. Available at: https://www.fritz.ai/object-detection/. [Accessed: 03/10/20]
[12] Wikipedia, of object. Available at: https://en.wikipedia.org/wiki/Outline_of_object_recognition. [Accessed: 02/10/20]
[13] Object Recognition. Available at: cse.usf.edu/~r1k/MachineVisionBook/MachineVision.files/MachineVision_Chapter15.pdf. [Accessed: 03/10/20]
[14] N. Pinto, D. D. Cox, and J.J. DiCarlo, Why is Real-World Visual Object Recognition (2008) PLoS Comput Biol 4(1): e27.
[15] A. Gulli and P. Sujit, Learning with Keras (2017). Available at: https://1lib.us/book/3411804/7ea47a?id=3411804. [Accessed: 03/10/20]
[16] Definitive Glossary of Higher Mathematical Jargon - Algorithm. Available at: https://mathvault.ca/math-glossary/. [Accessed: 03/10/20]
[17] of ALGORITHM. Merriam-Webster Online Dictionary. Available at: https://www.merriam-webster.com/dictionary/algorithm. [Accessed: 04/10/20]
[18] Y. Gavrilova, Artificial Intelligence vs. Machine Learning vs. Deep Learning: Available at: https://serokell.io/blog/ai-ml-dl-difference. [Accessed: 04/10/20]
[19] J. Brownlee, Gentle Introduction to Object Recognition with Deep (2018). Available at: https://machinelearningmastery.com/object-recognition-with-deep-learning/. [Accessed: 01/10/20]
[20] A. Kamal, YOLOv2 and YOLOv3: All You want to. Available at: https://medium.com/@amrokamal_47691/yolo-yolov2-and-yolov3-all-you-want-to-know-7e3e92dc4899. [Accessed: 04/10/20]
[21] You Only Look Once: Unified, Real-Time Object Detection, 2015. Available at: https://arxiv.org/abs/1506.02640. [Accessed: 04/10/20]
[22] A. Rosebrock, over Union (IoU) for object (2016). Available at: https://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/. [Accessed: 04/10/20]
[23] I. Tan, Labelling Quality with IOU and F1. Available at: https://medium.com/supahands-techblog/measuring-labelling-quality-with-iou-and-f1-score-1717e29e492f. [Accessed: 01/10/20]
[24] StackOverflow, Over Union (IoU) ground truth in. Available at: https://stackoverflow.com/questions/61758075/intersection-over-union-iou-ground-truth-in-yolo. [Accessed: 02/10/20]
[25] J. Redmon & A. Farhadi (University of Washington), YOLO9000: Better, Faster,. Available at: https://pjreddie.com/media/files/papers/YOLO9000.pdf. [Accessed: 02/10/20]
[26] v2 Object. Available at: https://www.geeksforgeeks.org/yolo-v2-object-detection/. [Accessed: 04/10/20]
[27] A. Aggarwal, Explained. Available at: https://medium.com/analytics-vidhya/yolo-explained-5b6f4564f31. [Accessed: 04/10/20]
[28] K. Mahesh Babu, M.V. Raghunadh, Vehicle number plate detection and recognition using bounding box method, May 2016, pp. 106-110.
[29] L. Cai, F. Jiang, W. Zhou, and K. Li (Fellow, IEEE), Design and Application of An Attractiveness Index for Urban Hotspots Based on GPS Trajectory Data, p. 4.
[30] Wikipedia, bounding. Available at: https://en.wikipedia.org/wiki/Minimum_bounding_box. [Accessed: 04/10/20]
[31] Dive into Deep Learning, Object Detection and Bounding. Available at: https://d2l.ai/chapter_computer-vision/bounding-box.html. [Accessed: 04/10/20]
[32] J. Redmon & A. Farhadi, YOLOv3: An Incremental Improvement, University of Washington. Available at: https://arxiv.org/abs/1804.02767. [Accessed: 05/10/20]
[33] YOLO: You Only Look Once. Available at: jeremyjordan.me/object-detection-one-stage/#yolo. [Accessed: 05/10/20]
[34] Longman Dictionary, Definition of training. Available at: https://www.ldoceonline.com/dictionary/training. [Accessed: 05/10/20]
[35] Guangrui Liu -Time Object Detection for Autonomous Driving Based on Deep Learn- (2017). Available at: https://tamucc-ir.tdl.org/handle/1969.6/5637. [Accessed: 03/18/21]
[36] A. Abdulkader & C. Vlahija -time vehicle and pedestrian detection, a data-driven recommendation focusing on safety as a perception to autonomous. Available at: http://www.diva-portal.org/smash/record.jsf?pid=diva2%3A1479957&dswid=-3676. [Accessed: 03/18/21]
[37] Enhancement methods in image processing. Available at: https://www.mathworks.com/discovery/image-enhancement.html. [Accessed: 03/18/21]
[38] Evaluating a machine learning model. Available at: https://www.jeremyjordan.me/evaluating-a-machine-learning-model/. [Accessed: 03/18/21]
[39] A Comprehensive Guide to Convolutional Neural Networks the ELI5 way. Available at: https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53. [Accessed: 03/18/21]
[40] J. Jokela Counter Using Real-Time Object Detection and a Small Neural Turku University of Applied Sciences. Available at: https://www.theseus.fi/bitstream/handle/10024/153489/Jokela_Jussi.pdf?sequence=1&isAllowed=y. [Accessed: 03/18/21]
[41] Manishgupta You Only Look. Available at: https://towardsdatascience.com/yolo-you-only-look-once-3dbdbb608ec4. [Accessed: 03/18/21]
[42] E. Y. Li Really Deep into YOLO v3: A. Available at: https://towardsdatascience.com/dive-really-deep-into-yolo-v3-a-beginners-guide-9e3d2666280e. [Accessed: 03/18/21]
[43] F. Zelic & A. Sable A comprehensive guide to OCR with Tesseract, OpenCV and Python. Available at: https://nanonets.com/blog/ocr-with-tesseract/. [Accessed: 03/18/21]
[44] Tessdoc. Available at: https://github.com/tesseract-ocr/tessdoc/blob/master/ImproveQuality.md. [Accessed: 06/04/21]
[45] J. Goyvaerts Expressions: The Complete. Available at: https://www.regular-expressions.info/print.html. [Accessed: 06/04/21]
[46] M. Erwig & R. Gopinath, for Regular. Available at: https://web.engr.oregonstate.edu/~erwig/papers/ExplRegExp_FASE12.pdf. [Accessed: 06/04/21]
