
2020 IEEE 6th International Conference on Computer and Communications

Face Mask Recognition System with YOLOV5 Based on Image Recognition

Guanhao Yang1*, Wei Feng2*, Jintao Jin1, Qujiang Lei1†, Xiuhao Li1, Guangchao Gui1, Weijun Wang1
Intelligent Robot & Equipment Center
1 Guangzhou Institute of Advanced Technology, Chinese Academy of Sciences, Guangzhou, 511458, China
2 University of Chinese Academy of Sciences, Beijing, 100039, China
*Co-first authors
†Corresponding author
e-mail: [email protected]
978-1-7281-8635-1/20/$31.00 ©2020 IEEE | DOI: 10.1109/ICCC51575.2020.9345042

Abstract—The rapid development of computer vision makes human-computer interaction possible and gives it wide application prospects. Since the discovery of the first case of COVID-19, a global fight against the epidemic has been underway. In addition to the studies and findings of medical and health care experts, people's daily behaviors have also become key to combating the epidemic. In China, the government has taken active and effective measures of isolation and closure, with the active cooperation of the general public, such as staying indoors when travel is unnecessary and wearing masks. China, as the country where the epidemic first broke out, has now become a benchmark country for epidemic prevention in the world. Of course, it is not enough for people to wear masks voluntarily; wearing masks in all kinds of public places still needs supervision. To this end, this paper proposes to replace manual inspection with a deep learning method and to use YOLOV5, one of the most powerful object detection algorithms at present, to better apply it in real environments, especially in the supervision of mask wearing in public places. The experimental results show that the algorithm proposed in this paper can effectively recognize face masks and realize effective monitoring of personnel.

Keywords—computer vision; COVID-19; YOLOV5; wearing masks; deep learning; public places

I. INTRODUCTION

Computer vision technology uses a variety of imaging systems instead of visual organs as input means and uses computers to replace the brain in processing and interpreting visual information. With the continuous development of computer vision technology, computers can recognize all kinds of faces and give feedback. At present, face mask recognition is most widely used in public places.

Since the first case of pneumonia of unknown cause appeared in Wuhan, China, in late 2019, the world has been gripped by a new pandemic. On 12 January 2020, the World Health Organization (WHO) named the pneumonia caused by the novel coronavirus "COVID-19" and raised the global risk level for COVID-19 to the highest level [1].

The difference between the COVID-19 epidemic and previous infectious diseases is that it can spread through aerosols and infect people through contact within a very short period. According to a report by Beijing Satellite TV on June 25, a couple in Beijing's Haidian district became infected simply by visiting a public toilet. In addition, COVID-19 can lurk in the body of an infected person for up to 20 days before symptoms appear [2]. During this time, patients show no symptoms, which means that asymptomatic patients cannot be detected and the government cannot isolate them in time to avoid further disaster. So far, vaccines and specific treatments for COVID-19 are still on the way; until the virus is eradicated, wearing masks remains the most effective way for people around the world to avoid infection. Masks were once in short supply worldwide, and merchants even raised mask prices by dozens of times until market supply could meet consumer demand, so masks have become an essential item for personal travel. As masks have become a necessity, many public places where crowds gather, such as malls and swimming pools, need to monitor whether people are wearing them, but most Chinese shopping malls today still rely on manual inspection and restricted access to monitor the flow of people. Manual inspection has certain defects; although they can be mitigated by adding more manpower, and although epidemic prevention and control in China is excellent, infection cannot be completely avoided in places with large, scattered crowds such as shopping malls. Assigning staff to supervise each entrance wastes a great deal of resources and manpower, and if one entrance has a large flow of people, staff may miss inspections. In addition, human supervision wastes time, causing large numbers of people to gather at the entrances of shops and supermarkets and creating a risk of infection. The full system described in this paper also includes temperature detection and photo-taking alongside mask identification; only when all checks pass at the same time will the gate open and allow people to enter the site. This paper mainly addresses the identification of masks.

© IEEE 2021. This article is free to access and download, along with rights for full text and data mining, re-use and analysis.

To solve this problem, this paper proposes a mask-wearing detection algorithm based on the deep learning method YOLOV5. The algorithm can identify whether faces in public places such as malls and factories are wearing masks.

II. RELATED WORK

A. Object Detection

Object detection tasks can be understood via Fig. 1.

Figure 1. Three steps of object detection: (a) classification, (b) detection, (c) segmentation

1. Classification structures the image into a certain type of information, describing the image with a predefined category (string) or an instance ID.
2. Detection: whereas the classification task is concerned with the image as a whole, detection additionally separates the picture's foreground from its background. The output of the detection model is a list, each item of which uses a data group to give the category and position of a detected target (commonly represented by the coordinates of a rectangular detection box).
3. Segmentation is a pixel-level description of an image that assigns each pixel a category (or instance) and is suitable for scenes requiring fine-grained understanding.

B. One-Stage and Two-Stage

One-stage networks are represented by the YOLO series, while two-stage networks are represented by Faster R-CNN [3].

C. YOLOV5

Joseph Redmon, the original author of the YOLO algorithm, announced that he would stop all research in the computer vision field after becoming dissatisfied with the military and privacy applications of his open-source algorithm. When the YOLO series seemed to be going nowhere, Alexey Bochkovskiy published a paper continuing the development of the YOLO series on April 23, 2020, and subsequently received official approval to use the YOLO name. While YOLOV4 was still drawing attention, the Ultralytics LLC team released YOLOV5 on May 30. Although the original author's official website has not acknowledged it, and many regard it as merely "YOLO 4.5", this does not affect its usefulness. Compared with the rest of the YOLO series, YOLOV5 is implemented in PyTorch instead of Darknet and improves further on YOLOV4: YOLOV5 achieves 140 FPS on a Tesla P100, while YOLOv4 reaches only 50 FPS. Meanwhile, YOLOV5 is only 27 MB, whereas YOLOv4 with the Darknet architecture is 244 MB, and YOLOV5 matches YOLOV4's accuracy. YOLOV5 inherits the advantages of YOLOV4, namely adding SPP-Net (shown in Fig. 2), modifying the SOTA methods, and putting forward new data enhancement methods such as Mosaic training, self-adversarial training (SAT), and PANet [4] replacing FPN for multi-channel feature fusion.

Figure 2. SPP-net

The network structure diagram of YOLOV5 is shown in Fig. 3 (mainly the structure diagram of YOLOV5S).

Figure 3. YOLOV5 network

D. Center Loss [5]

In recent years, in addition to improvements in network structure, a group of researchers has also been studying improvements to the loss layer. Wen Yandong introduced the supervision method of center loss [5] in a novel way; it can effectively enhance the discriminative ability of the deep features learned by a neural network. The center loss is formulated in Eq. (1):

$L_C = \frac{1}{2} \sum_{i=1}^{m} \left\| x_i - c_{y_i} \right\|_2^2$  (1)

where $x_i$ denotes the $i$-th deep feature, $y_i$ its class label, $c_{y_i}$ the $y_i$-th class center, and $m$ the mini-batch size.

The gradients of $L_C$ with respect to $x_i$ and the update equation of the class centers $c_j$ are computed as:

$\frac{\partial L_C}{\partial x_i} = x_i - c_{y_i}$  (2)

$\Delta c_j = \frac{\sum_{i=1}^{m} \delta(y_i = j) \cdot (c_j - x_i)}{1 + \sum_{i=1}^{m} \delta(y_i = j)}$  (3)

where $\delta(\text{condition}) = 1$ if the condition is satisfied and $\delta(\text{condition}) = 0$ if not. The center learning rate $\alpha$ is restricted to $[0, 1]$.
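Equations (1)–(3) can be sketched in NumPy as follows. This is a minimal illustration under our own naming conventions (features as rows of a batch matrix), not the paper's code:

```python
import numpy as np

def center_loss(features, labels, centers):
    # Eq. (1): L_C = 1/2 * sum_i || x_i - c_{y_i} ||_2^2
    # `diffs` is also the gradient of Eq. (2): x_i - c_{y_i}
    diffs = features - centers[labels]
    return 0.5 * float(np.sum(diffs ** 2))

def update_centers(features, labels, centers, alpha=0.5):
    # Eq. (3): each center moves against the mean offset of its class;
    # the +1 in the denominator avoids division by zero for classes
    # absent from the mini-batch. alpha is the center learning rate.
    new_centers = centers.copy()
    for j in range(centers.shape[0]):
        mask = labels == j
        delta = np.sum(centers[j] - features[mask], axis=0) / (1.0 + mask.sum())
        new_centers[j] = centers[j] - alpha * delta
    return new_centers
```

Applied per mini-batch alongside the soft-max loss, this pulls features of each class toward their center, which is the intra-class compaction effect shown in Figure 5.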
After center loss [5] is added to the soft-max loss, the inter-class distance increases and the intra-class distance decreases. In his paper, Wen Yandong presented comparison results for soft-max loss alone and for soft-max loss + center loss; the direct impact is shown in Figure 4.

Figure 4. The distribution of deeply learned features in (a) the training set and (b) the testing set, both under the supervision of soft-max loss, using 50K/10K train/test splits. Points with different colors denote features from different classes. Best viewed in color. (Color figure online)

Figure 5. The distribution of deeply learned features under the joint supervision of soft-max loss and center loss [5]. Points with different colors denote features from different classes. Different $\lambda$ lead to different deep feature distributions ($\alpha = 0.5$). The white dots (c0, c1, …, c9) denote the 10 class centers of the deep features. Best viewed in color. (Color figure online)

Here are some test images from the web that show the results in Fig. 6.

Figure 6. The detection results in markets

III. PROPOSED SYSTEM

The system presented in our study is shown in Fig. 7. First, people entering the mall are photographed by the camera, and the images are sent to the interface for face mask recognition. If a face with a mask is identified within two seconds, the mall gate opens and displays that the person may pass; otherwise, the system returns to face mask recognition until it succeeds.

Figure 7. Working system

We divided the whole system into four parts: facial mask image enhancement, facial mask image segmentation, facial mask image recognition, and interface interaction. Facial mask image enhancement improves the quality of the image of the worn mask for easier detection. Facial mask image segmentation extracts the mask information. The facial mask recognition part classifies the extracted mask information. The final interface output opens the gate smoothly and lets customers enter. The recognition model is shown in Fig. 8.

Figure 8. Recognition model
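The two-second recognition window in the flow above can be sketched as a simple control loop. `capture_frame` and `detect_mask` are hypothetical stand-ins for the camera interface and the YOLOV5 detector, and the string return values are our own convention, not the authors' implementation:

```python
import time

def gate_decision(detect_mask, capture_frame, window_s=2.0, clock=time.monotonic):
    """Poll camera frames for up to window_s seconds; report "open" on the
    first frame recognized as a masked face, otherwise report "closed"."""
    deadline = clock() + window_s
    while clock() < deadline:
        frame = capture_frame()
        if detect_mask(frame):  # True when a masked face is recognized
            return "open"
    return "closed"
```

In the real system, the "closed" branch would loop back into recognition, as the paper describes, rather than terminate.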

A. Enhancement of Facial Mask Images

In real life, when people enter the mall, the image of a face mask is often affected by complex and unfavorable environmental factors such as lighting, stains, and colors, which reduce image quality and weaken the features of the mask. Therefore, it is very important to improve the quality of the digital images. This operation requires enhancing the part of the face wearing the mask.

The main purpose of image smoothing is to reduce image noise and extract useful information. A smoothing filter enhances the low-frequency components of the image, weakens the high-frequency components, eliminates random noise, and plays a smoothing role. Common smoothing filtering methods include mean filtering, median filtering, and Gaussian filtering [6-7].

(1) Gaussian filtering

The Gaussian filter is a linear smoothing filter suitable for eliminating Gaussian noise, and it is widely used in image processing [8]. Generally speaking, Gaussian filtering is a weighted-average process over the whole image: the value of each pixel is obtained as the weighted average of itself and the other pixel values in its neighborhood. The specific operation of Gaussian filtering is to scan every pixel in the image with a template (also called a convolution kernel or mask) and replace the value of the central pixel of the template with the weighted average gray value of the pixels in the neighborhood determined by the template:

$G(x, y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}}$  (4)

In Equation (4), $x^2$ and $y^2$ represent the squared horizontal and vertical distances between a pixel in the neighborhood and the center pixel, and $\sigma$ represents the standard deviation.

(2) Median filtering [9]

The median filtering method [9] is a nonlinear signal smoothing technique based on order statistics that can effectively suppress noise. A two-dimensional sliding template is used to sort the pixels in the window by pixel value, producing a monotonically ascending (or descending) two-dimensional data sequence. The output of the two-dimensional median filter is

$g(x, y) = \mathrm{med}\{ f(x - k, y - l), \ (k, l) \in W \}$  (5)

where $f(x, y)$ and $g(x, y)$ are the original and processed images, respectively, and $W$ is the two-dimensional template.

(3) Mean filtering [10]

Mean filtering [10] is a traditional spatial-domain image denoising method. Its application to image denoising mainly uses various image smoothing templates to convolve the image in order to suppress or remove noise. The basic idea of mean filtering is to replace the gray value of a pixel with the average gray value of the pixels around it. For a pixel $(x, y)$ in a given image $f(x, y)$ whose neighborhood $S$ contains $M$ pixels, the image after mean filtering and smoothing, $g(x, y)$, is defined by Equation (6):

$g(x, y) = \frac{1}{M} \sum_{(i, j) \in S} f(i, j)$  (6)

The results of the comparison are shown in Fig. 9.

Figure 9. Comparison of the three methods: original image, mean filtering, median filtering, Gaussian filtering

B. Segmentation of Facial Mask Images

Face segmentation is one of many image segmentation applications, and it is the basic premise for realizing a robust and practical face recognition system. An image segmentation algorithm can accurately locate the contour of each facial organ and extract the target area of interest from the whole image, thus establishing a description of the face image that is easier to analyze and more expressive [11]. At present, face segmentation algorithms have become mature, and a large number of image segmentation algorithms with good robustness have been proposed.

Model-driven segmentation is the most widely used class of face segmentation algorithm, but in actual face recognition, the face image is often affected by interference factors. Therefore, to extract a more accurate target segmentation from the background image, model-based segmentation algorithms are applied to face images. The active contour model [12], also known as the Snakes model, has proved to be an effective image segmentation framework. Active contour models can be roughly divided into two categories: those based on edge information [13][14][15] and those based on region information [16][17]. The C-V model proposed by Chan and Vese [16], the most popular regional active contour model, can naturally handle topological changes and global segmentation. However, the C-V model also has shortcomings: it assumes that the gray values of pixels in the same region have good continuity while the gray values of pixels in different regions differ greatly. This assumption basically does not hold for many real images, and the segmentation effect is then bound to be poor. Wang et al. [18] proposed an effective local C-V model that is insensitive to the selection of the initial contour position and control parameters and has lower time complexity.

Therefore, this paper adopts the region-based C-V active contour model.

The energy functional of the C-V model [16] is:

$E(c_1, c_2, C) = \mu \oint_C ds + \lambda_1 \iint_{inside(C)} (I(x, y) - c_1)^2 \, dx \, dy + \lambda_2 \iint_{outside(C)} (I(x, y) - c_2)^2 \, dx \, dy$  (7)

$M_1(\phi) = H(\phi), \quad M_2(\phi) = 1 - H(\phi)$  (8)

$c_i = \frac{\iint M_i(\phi) \, I(x, y) \, dx \, dy}{\iint M_i(\phi) \, dx \, dy}, \quad i = 1, 2$  (9)

Here $I(x, y)$ is the image to be segmented, $C$ is the evolution curve, $c_i$ ($i = 1, 2$) are the mean intensities of the areas inside and outside the curve, $\phi$ is the level set function, $H(\phi)$ is the Heaviside function, and $\mu \ge 0$ and $\lambda_1, \lambda_2 > 0$ are weight coefficients. Through the level set representation and the gradient descent flow, the partial differential equation controlling the level set can be derived [19]:

$\frac{\partial \phi}{\partial t} = \delta(\phi) \left[ \mu \, \mathrm{div}\!\left( \frac{\nabla \phi}{|\nabla \phi|} \right) \right] + \delta(\phi) \left[ -\lambda_1 (I - c_1)^2 + \lambda_2 (I - c_2)^2 \right]$  (10)

$\delta(\phi)$ is the Dirac function.

C. Face Mask Image Recognition

As the latest detection method in the YOLO (You Only Look Once: Unified, Real-Time Object Detection) series, YOLOV5 combines the functions of YOLOV4 and at the same time proposes new data enhancement methods: Mosaic training, self-adversarial training (SAT), and PANet [4] instead of FPN. In this paper, we use YOLOV5 to train on the data set, which can quickly and accurately identify whether a face mask is worn or not.

D. Interface Interaction

Human-Computer Interaction (HCI) refers to the process of information exchange between a person and a computer to complete certain tasks, using a certain dialogue language and in a certain way. In the current era of artificial intelligence, human-computer interaction has very important impact and significance [20].

In this paper, we design a human-machine interface for recognizing whether a face is wearing a mask. When people reach the entrance of the mall, they are prompted to "face the camera". The computer then determines whether the customer is wearing a mask; if not, the gate will not open. Considering that various complex environments can affect the accuracy of the results, we set the recognition window to 2 seconds. After successful recognition, the voice system announces "recognition successful", the gate opens, and customers can enter the mall.

Figure 10 shows the process as people enter the mall.

Figure 10. The process of people entering the mall

IV. EXPERIMENT

A. Datasets

In this paper, the datasets we used come from the AIZOOTech team's FaceMaskDetection (https://github.com/AIZOOTech/FaceMaskDetection). The datasets contain images from which faces can be detected and mask wearing determined, with 7,959 open-source facial mask annotations. We select 92% of the data set for training and 8% for testing. In the classification, "0" stands for "mask" and "1" stands for "no mask". The experimental data are shown in Figure 11.

Figure 11. The different classes, "no mask" and "mask"

B. GIOU Loss [21]

The formula for GIoU is given in Algorithm 1.

Algorithm 1: Generalized Intersection over Union
Input: two arbitrary convex shapes $A, B \subseteq S \in \mathbb{R}^n$
Output: GIoU
1. For $A$ and $B$, find the smallest enclosing convex object $C$, where $C \subseteq S \in \mathbb{R}^n$.
2. $IoU = \frac{|A \cap B|}{|A \cup B|}$
3. $GIoU = IoU - \frac{|C \setminus (A \cup B)|}{|C|}$
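Specialized to axis-aligned boxes in (x1, y1, x2, y2) form, the GIoU computation of Algorithm 1 can be sketched as follows (a minimal sketch, not the authors' implementation):

```python
def giou(bp, bg):
    """GIoU of a predicted and a ground-truth box, each (x1, y1, x2, y2)."""
    # Order predicted coordinates so that x2 > x1 and y2 > y1
    x1p, x2p = sorted((bp[0], bp[2]))
    y1p, y2p = sorted((bp[1], bp[3]))
    x1g, y1g, x2g, y2g = bg
    area_p = (x2p - x1p) * (y2p - y1p)
    area_g = (x2g - x1g) * (y2g - y1g)
    # Intersection area (zero when the boxes do not overlap)
    xi1, xi2 = max(x1p, x1g), min(x2p, x2g)
    yi1, yi2 = max(y1p, y1g), min(y2p, y2g)
    inter = (xi2 - xi1) * (yi2 - yi1) if xi2 > xi1 and yi2 > yi1 else 0.0
    union = area_p + area_g - inter
    iou = inter / union
    # Smallest enclosing box C penalizes empty space between the boxes
    ac = (max(x2p, x2g) - min(x1p, x1g)) * (max(y2p, y2g) - min(y1p, y1g))
    return iou - (ac - union) / ac
```

Unlike IoU, which is 0 for every pair of disjoint boxes, GIoU goes negative as the boxes move apart, so the loss 1 − GIoU still provides a useful gradient signal for non-overlapping predictions.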

Algorithm 2: IoU and GIoU as bounding-box losses
Input: predicted box $B^p$ and ground-truth box $B^g$ coordinates: $B^p = (x_1^p, y_1^p, x_2^p, y_2^p)$, $B^g = (x_1^g, y_1^g, x_2^g, y_2^g)$
Output: $L_{IoU}$, $L_{GIoU}$
1. For the predicted box $B^p$, ensure $x_2^p > x_1^p$ and $y_2^p > y_1^p$:
   $\hat{x}_1^p = \min(x_1^p, x_2^p)$, $\hat{x}_2^p = \max(x_1^p, x_2^p)$, $\hat{y}_1^p = \min(y_1^p, y_2^p)$, $\hat{y}_2^p = \max(y_1^p, y_2^p)$.
2. Calculate the area of $B^g$: $A^g = (x_2^g - x_1^g)(y_2^g - y_1^g)$.
3. Calculate the area of $B^p$: $A^p = (\hat{x}_2^p - \hat{x}_1^p)(\hat{y}_2^p - \hat{y}_1^p)$.
4. Calculate the intersection $I$ between $B^p$ and $B^g$:
   $x_1^I = \max(\hat{x}_1^p, x_1^g)$, $x_2^I = \min(\hat{x}_2^p, x_2^g)$, $y_1^I = \max(\hat{y}_1^p, y_1^g)$, $y_2^I = \min(\hat{y}_2^p, y_2^g)$,
   $I = (x_2^I - x_1^I)(y_2^I - y_1^I)$ if $x_2^I > x_1^I$ and $y_2^I > y_1^I$, otherwise $I = 0$.
5. Find the coordinates of the smallest enclosing box $B^c$:
   $x_1^c = \min(\hat{x}_1^p, x_1^g)$, $x_2^c = \max(\hat{x}_2^p, x_2^g)$, $y_1^c = \min(\hat{y}_1^p, y_1^g)$, $y_2^c = \max(\hat{y}_2^p, y_2^g)$.
6. Calculate the area of $B^c$: $A^c = (x_2^c - x_1^c)(y_2^c - y_1^c)$.
7. $IoU = \frac{I}{U}$, where $U = A^p + A^g - I$.
8. $GIoU = IoU - \frac{A^c - U}{A^c}$.
9. $L_{IoU} = 1 - IoU$, $L_{GIoU} = 1 - GIoU$.

In this paper, we adopted the method of combining GIoU loss [21] and center loss [5] to identify whether a face mask is worn. For face-covering recognition, masks can be regarded as covering objects, and center loss [5] helps recognize faces, so we used this combination to carry out the experiments. The test results of the model used in the experiment are shown in Fig. 12.

Figure 12. The GIoU loss curve during training

V. CONCLUSION

This paper puts forward a new method based on YOLOV5 for recognizing whether faces are wearing masks. When people enter a store, they only need to stand in front of the camera to be identified; if recognition succeeds, the interface indicates that they may enter and the gate opens. This method no longer requires manual crowd control, greatly saving time and labor. Through testing, our experiment achieved a success rate of about 97.9%. We selected some other classic machine learning models for comparison; the result is shown in Fig. 13. A face wearing a mask that does not cover the nose can also be recognized well; the experimental results are shown in Fig. 14. Because of the global impact of COVID-19, we believe that this design can effectively reduce exposure and implement effective surveillance.

Figure 13. Accuracy comparison

Model         Accuracy
Faster R-CNN  70.40%
R-FCN         77.60%
SSD           84.60%
YOLOV5        97.90%

Figure 14. The detection results of wearing the mask but not covering the nose

VI. FUTURE WORK

In this paper, YOLOV5 is applied to identify whether a face mask is worn so that the gate at the entrance of the shopping mall can be opened and closed successfully. However, this kind of recognition handles only the mask itself: in some special circumstances, if the customer covers part of the mask with a hand, recognition will fail. In the future, we will improve the handling of masks covered by hands or other occluding objects, making it more convenient for people to enter the shopping mall in such situations and making the identification system more intelligent.

ACKNOWLEDGEMENT

This research is supported by the Guangdong Province Key Field R&D Program Project (Grant No. 2020B090925002) and the Major Projects of Guangzhou City of China (Grant No. 201907010012).

REFERENCES
[1] World Health Organization. (2020). Coronavirus disease 2019 (COVID-19): situation report, 72.
[2] Nishiura, H., Kobayashi, T., Miyama, T., Suzuki, A., Jung, S. M., Hayashi, K., ... & Linton, N. M. (2020). Estimation of the asymptomatic ratio of novel coronavirus infections (COVID-19). International Journal of Infectious Diseases, 94, 154.
[3] Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems (pp. 91-99).
[4] Liu, S., Qi, L., Qin, H., Shi, J., & Jia, J. (2018). Path aggregation network for instance segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 8759-8768).
[5] Wen, Y., Zhang, K., Li, Z., & Qiao, Y. (2016, October). A discriminative feature learning approach for deep face recognition. In European Conference on Computer Vision (pp. 499-515). Springer, Cham.
[6] Li Wei. Application and research of curve fitting in steel pipe counting [D]. Southwest Jiaotong University, 2010.
[7] Milan Sonka, Ai Haizhou. Image Processing, Analysis, and Machine Vision [M]. Tsinghua University Press, 2016.
[8] Wei Ying. Gaussian filter OpenMP parallelization [J]. Telecom World, 2015(10): 194.
[9] Gong Shengrong, Liu Chunping, Wang Qiang. Digital Image Processing and Analysis [M]. Beijing: Tsinghua University Press, 2006.
[10] Yan Bing, Wang Jin-he, Zhao Jing. Research of image de-noising technology based on mean filtering and wavelet transformation [J]. Computer Technology and Development, 2011, 21(2): 51-53, 57.
[11] Li Dongmei. Research of masked face recognition based on image segmentation [D]. Wuhan: South-Central University for Nationalities, 2015: 5-8.
[12] Kass M, Witkin A, Terzopoulos D. Snakes: Active contour models [J]. International Journal of Computer Vision, 1988, 1(4): 321-331.
[13] Caselles V, Catté F, Coll T, et al. A geometric model for active contours in image processing [J]. Numerische Mathematik, 1993, 66(1): 1-31.
[14] Caselles V, Kimmel R, Sapiro G. Geodesic active contours [J]. International Journal of Computer Vision, 1997, 22(1): 61-79.
[15] Malladi R, Sethian J A, Vemuri B C. Shape modeling with front propagation: A level set approach [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1995, 17(2): 158-175.
[16] Chan T F, Vese L A. Active contours without edges [J]. IEEE Transactions on Image Processing, 2001, 10(2): 266-277.
[17] Tsai A, Yezzi A, Willsky A S. Curve evolution implementation of the Mumford-Shah functional for image segmentation, denoising, interpolation, and magnification [J]. IEEE Transactions on Image Processing, 2001, 10(8): 1169-1186.
[18] Liu C, Wechsler H. Gabor feature based classification using the enhanced Fisher linear discriminant model for face recognition [J]. IEEE Transactions on Image Processing, 2002, 11(4): 467-476.
[19] Xu Dong, Peng Zhenming. Improved image segmentation method based on fast level set and C-V model [J]. High Power Laser and Particle Beams, 2012, 24(12): 2817-2821.
[20] Lin H I, Hsu M H, Chen W K. Human hand gesture recognition using a convolution neural network [C]. 2014 IEEE International Conference on Automation Science and Engineering (CASE). IEEE, 2014: 1038-1043.
[21] Rezatofighi H, Tsoi N, Gwak J, Sadeghian A, Reid I, Savarese S. (2019). Generalized Intersection over Union: A metric and a loss for bounding box regression. https://arxiv.org/pdf/1902.09630.pdf
