Human Abnormality Classification Report
CNN-RNN Approach
By
Md. Mohsin Kabir, ID:16172103218
Farisa Benta Safir, ID:16172103049
Saifullah Shahen, ID:16172103186
Jannatul Maua, ID:16172103291
Iffat Ara Binte Awlad, ID:16172103054
Bachelor of Science in Computer Science and Engineering
May 2021
Declaration
We do hereby declare that the research works presented in this thesis entitled
“Human Abnormality Classification: CNN-RNN Approach” are the results of our
own works. We further declare that the thesis has been compiled and written
by us. No part of this thesis has been submitted elsewhere for the
requirements of any degree, award or diploma, or any other purposes except
for publications. The materials that are obtained from other sources are
duly acknowledged in this thesis.
Md. Mohsin Kabir
ID: 16172103218 Signature

Farisa Benta Safir
ID: 16172103049 Signature

Saifullah Shahen
ID: 16172103186 Signature

Jannatul Maua
ID: 16172103291 Signature

Iffat Ara Binte Awlad
ID: 16172103054 Signature
Approval
The research works presented in this thesis entitled “Human Abnormality
Classification: CNN-RNN Approach” result from the original works carried out
under the supervision of Dr. Muhammad. We further declare that no part of
this thesis has been submitted elsewhere for the requirements of any degree,
award or diploma, or any other purposes except for publications.
Acknowledgement
We would like to express our heartfelt gratitude to the almighty Allah, who
offered our family and us kind care throughout this journey until the end.
Also, we express our sincere respect and gratitude to our supervisor,
Dr. Muhammad, without whom this work would not exist. We are grateful to him
for his excellent supervision and for putting his utmost effort into
developing this project. We owe him a lot for his guidance in our growth
as researchers.
Finally, we are grateful to all our faculty members of the CSE department,
BUBT, for making us capable of completing this research work with the proper
knowledge and support.
Abstract
Facial Expression Recognition (FER) has flourished in the Deep Learning
domain with the advent of big data. However, facial-expression-based
classifications of human abnormalities such as drug addiction, autism,
criminal mentality, etc., are quite challenging due to the limitation of
existing FER systems. Besides, there are no existing datasets that consist
of helpful images that describe the human face's abnormal expressions. In
this work, we introduce such a dataset along with a combined CNN-RNN
architecture: the convolutional network extracts features within facial
portions of the images, and the recurrent network considers the temporal
dependencies which exist in the images. The proposed architecture classifies
four classes of human abnormalities.
List of Tables

4.1 Validation Accuracy, Precision, and Recall of the proposed architecture compared with basic CNN
4.2 Validation Accuracy, Precision, and Recall using different numbers of hidden units
4.3 Validation Accuracy, Precision, and Recall using different numbers of hidden layers
List of Figures

1.1 The flow of the thesis work
3.1 The workflow of the proposed system
3.2 The CNN architecture
3.3 The combined CNN-RNN architecture
4.1 Sample images of the four classes in the NAHFE dataset
List of Abbreviations

DL Deep Learning
FER Facial Expression Recognition
CNN Convolutional Neural Network
RNN Recurrent Neural Network
DNN Deep Neural Network
NAHFE Normal and Abnormal Human Facial Expression
Contents
Declaration
Approval
Acknowledgement
Abstract
List of Tables
List of Figures
List of Abbreviations

1 Introduction
1.1 Introduction
1.2 Problem Statement
1.3 Problem Background
1.4 Research Objectives
1.5 Motivations
1.6 Flow of the Research
1.7 Significance of the Research
1.8 Research Contribution
1.9 Thesis Organization
1.10 Summary

2 Background
2.1 Introduction
2.2 Literature Review
2.3 Problem Analysis
2.4 Summary

3 Proposed Model
3.1 Introduction
3.2 Feasibility Analysis
3.3 Requirement Analysis
3.4 Research Methodology
3.4.1 Data Pre-processing
3.4.2 Convolutional Neural Network
3.4.3 Recurrent Neural Network
3.4.4 Combined CNN-RNN Architecture
3.5 Design, Implementation, and Simulation
3.6 Summary

4 Implementation, Testing, and Result Analysis
4.1 Introduction
4.2 Dataset
4.3 System Setup
4.4 Evaluation
4.5 Result Analysis
4.6 Summary

5 Standards, Constraints, Milestones
5.1 Introduction
5.2 Sustainability and Social Impact
5.3 Ethics
5.4 Challenges
5.5 Constraints
5.6 Timeline and Gantt Chart

6 Conclusion
6.1 Introduction
6.2 Future Works and Limitations

References
Introduction
1.1 Introduction
Facial expression is one of the most prominent ways for human beings to
convey their emotional states and intentions [1, 2]. Facial expressions
differ from person to person and are also influenced by the surroundings,
so they do not always reveal the real feelings of a human being. FER has
therefore become a promising research area. In this work, we combine a
Convolutional Neural Network (CNN) [3] and a Recurrent Neural Network
(RNN) [4] to classify human abnormalities. This approach analyses the human
face and predicts the class of abnormality. The analysis of facial
expressions is a laborious task for Deep Learning (DL) [5] approaches
because human beings can vary the way they express their emotions. CNN is a
class of deep learning mostly used for image analysis and image processing
tasks [7, 8]. Deep CNNs have gained momentous success and have been explicitly
proven well suited for image recognition tasks from massive datasets [9, 10].
Recent CNN architectures employ several ways to shorten the training time,
such as hardware acceleration [9]. RNNs have been rapidly gaining popularity
because an RNN not only assesses its current input but also bases its
evaluation on the past input(s) [12, 13]. Thus, the result is generated from
a composition of
information coming from the past and present. Hence, in this work, we have
developed an architecture that uses both CNN and RNN to classify human
abnormalities.
1.2 Problem Statement

Real-time facial expression analysis and finding facial patterns have
remained challenging for existing techniques since people can vary
significantly in the way they show their faces. Moreover, existing FER
systems do not distinguish between abnormal and regular people. The datasets
available on the internet are not bringing this distinction either.
Therefore, the world has never seen such human abnormality detection
architectures.
1.3 Problem Background
The present Facial Emotion Recognizer systems only classify human emotions
like sadness, happiness, anger, fear, disgust, etc. Identifying human
abnormalities instead requires images of autistic, criminal, and drug-addict
human faces. Lacking a proper dataset is the critical challenge of our
research. We are looking forward to working with the newly created dataset
of our own.
1.4 Research Objectives

The objectives of this research work are:

• Introducing a new dataset, the Normal and Abnormal Human Facial
Expression (NAHFE) dataset, for human abnormality classification.
• Developing a combined CNN-RNN architecture that classifies human
abnormalities accurately.
• Comparing the existing architectures with the proposed one for the
human abnormality classification task.
1.5 Motivations
FER has attracted significant attention from computer vision researchers
during the past decade since it lies at the heart of many applications. Much
research work has already been done that gives us reasonable solutions for a
few FER problems, but none of it addresses human abnormality classification.

1.6 Flow of the Research

The research work developed in several steps. First, we analysed the
research topics and then studied the basic theory of facial expression
recognition. Figure 1.1 illustrates the overall steps of the research
procedure in the following diagram.
Figure 1.1. The figure illustrates the flow of the thesis work.
1.7 Significance of the Research

Most datasets on the web are built for classifying six or seven primary
expressions of the human face. In contrast, our introduced dataset is
divided into four classes: Drug addiction, Autism, Criminal, and Normal.
1.8 Research Contribution

Therefore, this study introduces a new dataset to classify human stability
into four classes, along with a combined CNN-RNN architecture which gives
the best result. This human stability classification problem has not been
addressed before.
1.9 Thesis Organization

The rest of the thesis is organized as follows. Chapter 2 reviews the
background and related literature. Chapter 3 presents the proposed model.
Chapter 4 includes the details of the tests and evaluations performed to
validate the proposed architecture. Chapter 5 discusses standards,
constraints, and milestones. Chapter 6 concludes the thesis.
1.10 Summary
This chapter looked explicitly at our research work's objectives, the
background, and the research work's motivation. This chapter also
illustrated the overall steps on which the research work is conducted.
Background
2.1 Introduction
Facial expression recognition has attracted computer vision researchers'
attention during the past few years. The expressions studied are those
which have been universal across cultures and subgroups: neutral, happy,
surprised, fear, angry, sad, and disgusted. However, before this study, no
work addressed human abnormality classification.

2.2 Literature Review

FER has gained significant attention from computer vision researchers
during the past few years [14, 15]. Popular approaches classify seven basic
expressions, including happy, sad, surprised, disgusted, fear, angry, and
neutral, but no prior work on classifying human abnormalities exists.
Takalkar et al. [16] analysed the use of deep learning for micro-expression
recognition. Jung et al. [17] proposed two deep network models using CNN and
DNN for the FER problem. Using the FER 2013 database, they achieved 72.78%
accuracy for the DNN architecture and 86.45% for the CNN. The authors first
detected faces from input images by Haar-like features and then applied the
networks. Qu et al. [18] proposed a fast face recognition model using CNN.
The authors divide the process into two parts: first, the network is trained
on a PC, and then the network is implemented on a Field Programmable Gate
Array (FPGA), achieving 99.25% accuracy, which is state-of-the-art.
Neha Jain et al. [19] proposed a face emotion recognition model combining
deep CNN and RNN models. In this model, the authors used two datasets: the
MMI Facial Expression Database (FED) and the Japanese Female Facial
Expression (JAFFE) database; 80% of each dataset is used for training and
20% for validation. The model achieves 94.91% accuracy on the JAFFE dataset
and 92.07% accuracy on the MMI dataset. Another study fine-tuned the VGG
model on famous datasets like CK+ and MUG and got nearly 99% accuracy,
which is state-of-the-art.
Mollahosseini et al. [7] proposed a deep neural network architecture for
automated facial expression recognition, which has two convolutional layers,
each followed by max pooling, and then four inception layers. The proposed
approach takes a facial image as input and classifies that image into one of
six basic expressions or neutral. The authors used different databases and
achieved 94.7%, the best accuracy, using the CMU Multi-PIE database.
Another study compared facial recognition services among Google, Amazon,
and Microsoft, each of which provides the best output for its own dataset;
the comparison used 140 images as a dataset. However, in recognition
accuracy, Google does its best job. Other authors achieved suitable and
faster processing times for facial expression recognition: on the CUFace
dataset, a CNN-based architecture got nearly 90% accuracy.
Another approach used a combination of DBNs and MLP on the JAFFE and
Cohn-Kanade datasets. The proposed DBNs + MLP method achieves its highest
accuracy of 90.95% (using 64×64 images) on the JAFFE database and 98.57%
(using 64×64 images) on the Cohn-Kanade database.
Another work used the PUBALL dataset, which is the authors' own creation;
on the FER 2013 dataset, their BKVGG model got the highest accuracy among
the compared variants, and they also used pre-processing techniques in this
method. A further model applied a Convolutional Neural Network for
classifying one of six emotional states; this model was trained on two
famous datasets, CK+ and JAFFE, and outperformed the baselines. Yet another
approach combined the HOG feature descriptor, which is used to detect human
faces, with a CNN; on Facial Expression Recognition datasets, the authors
got high accuracy with low computation time.
Another work proposed convolutional deep belief network (CDBN) models for
Multimodal Emotion Recognition and presented the emoFBVP database, which the
authors created, obtaining the expected results. Sun et al. [28] proposed a
new face detection scheme using deep learning and small datasets for facial
expression recognition. Other authors' best submissions achieve better
results than the state-of-the-art on IARPA's CS2 and NIST's IJB-A in both
verification and identification tasks; they have used CASIA-WebFace [21] for
training and both IJB-A [18] and IARPA's Janus CS2 for evaluation.
Tolga Soyata et al. [32] work with the basic functionalities of MOCHA, a
mobile-cloudlet-cloud architecture. They capture images with a mobile
device and send them to the cloudlet for face recognition. As a result, by
increasing cloud resources, the response time observed by the mobile device
decreases. However, a 50W power budget and a cost of under $100 can be two
constraints.
Another work proposed a face recognition algorithm using deep learning with
Linear Discriminant Regression Classification (LDRC). The authors try to
provide an efficient system of error reduction. They use 400 face images
from 40 individuals with 10 different face actions, tested with the help of
BCRE and WCRE results. Using a deep learning algorithm, it performs at 87%
on the ORL face dataset, higher than normal LDRC.
De Silva et al. [34] proposed a new basis function called the cloud basis
function (CBF) and evaluated it on a facial expression images database that
included images taken from the Carnegie Mellon University facial expression
image database and their own database; the CBF outperformed conventional
basis functions. Another work used the RML database and the eNTERFACE'05
database, reporting the highest accuracy among the compared methods.
Zeng et al. [36] proposed a novel framework for facial expression
recognition to distinguish the expressions with high accuracy
automatically. The framework recognizes facial expressions with high
accuracy by learning robust and discriminative features from the data; the
experiment used the Cohn-Kanade database. Another work proposed a deep
convolutional neural network; the authors used their own dataset and
compared their results with the state of the art.
Md. Zia Uddin et al. [38] extract valuable features from depth faces, which
are further combined with deep learning for recognition using a modified
local directional pattern binary code; each value is calculated from its
relative edge strengths in eight directions. Another work operates on the
raw signal and performs end-to-end emotion prediction tasks from speech and
visual data. For this, they use 96×96 images and 30 videos, exploring 75,
150, and 300 layers for the speech and visual models; the best value for
the speech model is 150 layers, while for the visual model it is 300 layers.
Firstly, they extract facial landmarks using the face alignment method by
Eng et al. and, giving these as input to the recurrent model, train only
the recurrent network.
Md. Zia Uddin et al. [38] improve the Directional Position Pattern (DPP)
with Principal Component Analysis (PCA). They use RGB cameras to capture
faces; RGB intensities change very rapidly due to illumination changes in
the scene, and distance-based capture does not work correctly. They use 120
videos, where 40 videos have ten face reactions (Anger, Happy, Sad, etc.).
The average recognition rate using PCA with HMM on depth faces is 58%,
while PCA-LDA averages 61.50%; applying ICA with HMM improves the result
further.
Tian et al. [40] proposed a novel deep feature fusion convolutional neural
network that uses facial attribute maps (such as depth and shape index
values). The authors combine different facial attribute maps through a
feature fusion CNN subnet trained from a large-scale image dataset for
universal visual tasks. They reach an accuracy level of 79.17% using this
approach, outperforming the baseline approaches; the authors also report
81.5% accuracy.
However, all these FER techniques only analysed the six or seven basic
expressions but did not give any solution to human stability identification.
2.3 Problem Analysis

FER has attracted computer vision researchers' attention during the past few
years. These FER techniques only analysed the six or seven basic
expressions: happy, sad, surprised, disgusted, fear, angry, and neutral, but
no prior work on classifying human abnormalities has been done yet. In this
thesis work, we have discussed how the CNN-RNN combined approach solves
this issue.
2.4 Summary
This chapter investigated and reviewed the latest techniques of facial
expression recognition and analysed the problem addressed by this thesis.
Proposed Model
3.1 Introduction
This chapter presents the proposed model and the methodology followed in
this structure. Finally, this chapter illustrates the model's overall
architecture.
3.2 Feasibility Analysis

This research work required five researchers with one supervisor and took
three semesters to complete. It required certain resources, including
hardware and software. The research work also required a dataset, which we
built ourselves. The immense data collection of the work was executed
considering the legal feasibility of the dataset. Also, the thesis work did
not require any financial support.
3.3 Requirement Analysis
The research work required hardware and software resources along with a
dataset to train and evaluate the proposed model.

3.4 Research Methodology

This section is sub-sectioned into four segments. The sub-sections are
sorted from the input to the output phase of the model consecutively, with
detailed architecture.
Figure 3.1. The figure illustrates the workflow of the proposed system
(from left to right): the input dataset is processed by the combined
CNN+RNN model to classify abnormalities.
3.4.1 Data Pre-processing
The data pre-processing occurred in two stages: data normalization and data
augmentation. The normalization removes local variations such as intensity
offsets, which are fixed in the local region, using Equation (3.1):
$$\psi(\pi, \theta) = \frac{\xi(\pi, \theta) - \mu(\pi, \theta)}{6\,\sigma(\pi, \theta)} \qquad (3.1)$$

where $\xi(\pi, \theta)$ is the input intensity, and $\mu(\pi, \theta)$ and
$\sigma(\pi, \theta)$ are the local mean and standard deviation.
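To make the normalization step concrete, the following is a minimal sketch
of Equation (3.1), assuming a grayscale input and a square local window; the
window size and the small epsilon guard are illustrative choices, not values
taken from this thesis.

```python
# Sketch of the local intensity normalization in Equation (3.1).
# Assumptions: grayscale image, square local window; window size and
# eps are illustrative, not values specified in the thesis.
import numpy as np
from scipy.ndimage import uniform_filter

def local_normalize(image, window=7, eps=1e-8):
    """psi = (xi - mu) / (6 * sigma), computed over a local window."""
    img = image.astype(np.float64)
    mu = uniform_filter(img, size=window)              # local mean mu(pi, theta)
    mu_sq = uniform_filter(img ** 2, size=window)      # local mean of squares
    sigma = np.sqrt(np.maximum(mu_sq - mu ** 2, 0.0))  # local std sigma(pi, theta)
    return (img - mu) / (6.0 * sigma + eps)            # Equation (3.1)
```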
Although our NAHFE dataset has 1936 images for 4 classes, it is still
inadequate for training a deep network directly. Therefore, we applied
augmentation techniques for propagating diverse tiny variations in
appearances and generated an augmented version for every actual image in
the dataset. Therefore the number of training samples increases. Finally,
the dataset is split to train and evaluate the model: 80% of the dataset is
used to train the model, and the remaining 20% is used for testing.
Together, these steps define the data pre-processing technique that can
work with any type of input shape and quality.

3.4.2 Convolutional Neural Network

In this architecture, the CNN is constructed with six convolutional layers
and two dense layers, each with a ReLU activation function, and dropout is
applied during training. Equation (3.4) and Equation (3.5) describe these
layers; the regularization applied to every weight matrix shortens the
volume of parameters:
$$Y_i^{(l)} = B_i^{(l)} + \sum_{j=1}^{m_3^{(l-1)}} K_{i,j}^{(l)} * Y_j^{(l-1)} \qquad (3.4)$$

where the output of layer $l$ consists of $m_3^{(l)}$ feature maps of size
$m_1^{(l)} \times m_2^{(l)}$; the $i$th feature map is denoted $Y_i^{(l)}$,
$B_i^{(l)}$ is a bias matrix, and $K_{i,j}^{(l)}$ is the filter connecting
the $j$th feature map of layer $l-1$ to the $i$th feature map of layer $l$.
$$d(x) = \operatorname{Activation}\left(W^{T} x + b\right) \qquad (3.5)$$

where $W$ is the weight matrix and $b$ is the bias of a dense layer.
If the ReLU function gets any non-positive value, it returns zero; for a
positive value, it returns the value itself:

$$\operatorname{ReLU}(x) = \max(0, x) \qquad (3.6)$$

Dropout randomly disables units during training:

$$\operatorname{Dropout}(x, p) = \begin{cases} 0, & \text{with prob. } p \\ x, & \text{with prob. } 1 - p \end{cases} \qquad (3.7)$$

where $p$ is the dropout probability.
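As an illustration of this stack, the following is a minimal Keras sketch of
a six-convolutional-layer, two-dense-layer network with ReLU and dropout;
the input shape, filter counts, kernel sizes, pooling placement, dropout
rate, and layer widths are assumptions for illustration, not the exact
values used in this thesis.

```python
# Sketch of the CNN described in Section 3.4.2: six convolutional layers
# and two dense layers, each with ReLU, plus dropout. All hyperparameters
# below are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(input_shape=(64, 64, 3), num_classes=4):
    model = models.Sequential([layers.Input(shape=input_shape)])
    # Six convolutional layers (Equation (3.4)) with ReLU (Equation (3.6)).
    for i, filters in enumerate([32, 32, 64, 64, 128, 128]):
        model.add(layers.Conv2D(filters, (3, 3), padding="same", activation="relu"))
        if i % 2 == 1:
            model.add(layers.MaxPooling2D((2, 2)))  # downsample every second block
    model.add(layers.Flatten())
    # Two dense layers (Equation (3.5)) with dropout (Equation (3.7)).
    model.add(layers.Dense(256, activation="relu"))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(128, activation="relu"))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(num_classes, activation="softmax"))
    return model
```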
3.4.3 Recurrent Neural Network

An RNN is a class of neural networks used to process time-series and other
sequential data. In an RNN, the tensors flow over time: the network
generates an output at each time step and has recursive connections between
a set of nodes, so its evaluation depends on past inputs as well as the
current one.
Figure 3.2. The figure illustrates the CNN architecture: the input passes
through six convolutional layers (CL 1 to CL 6) followed by two fully
connected layers (FC LAYER 1 and FC LAYER 2). Each of the cubes represents
an output of the convolution; the height and width are the gained
information, and each cube's depth is equal to the number of kernels.
The hidden state and the output at time step $t$ are

$$h_t = \sigma_h\left(W_h x_t + U_h h_{t-1} + b_h\right) \qquad (3.8)$$

$$y_t = \sigma_y\left(W_y h_t + b_y\right) \qquad (3.9)$$
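A minimal NumPy sketch of one forward pass through Equations (3.8) and (3.9)
is given below; using tanh as the hidden activation and the identity as the
output activation are assumptions for illustration.

```python
# Sketch of a simple (Elman-style) RNN forward pass, Equations (3.8)-(3.9).
# tanh for sigma_h and the identity for sigma_y are illustrative assumptions.
import numpy as np

def rnn_forward(x_seq, W_h, U_h, b_h, W_y, b_y):
    """x_seq has shape (T, input_dim); returns one output per time step."""
    h = np.zeros_like(b_h)
    outputs = []
    for x_t in x_seq:
        h = np.tanh(W_h @ x_t + U_h @ h + b_h)  # Equation (3.8)
        y_t = W_y @ h + b_y                     # Equation (3.9)
        outputs.append(y_t)
    return np.stack(outputs)
```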
3.4.4 Combined CNN-RNN Architecture

The combined architecture uses the CNN to extract the features and the RNN
to learn the information. To regulate all the parameters, the CNN feature
extraction method is used. The RNN classifies the images by adding the
extracted features from the successive CNN network of each image, and
finally, the prediction uses Softmax. While experimenting, the image
features are taken from the dense layer. For the learned time $t$, the
network takes $P$ frames from the past ($[t - P, t]$). After that, every
frame from time $t - P$ to $t$ runs through the CNN, which extracts $P$
vectors, one for every input. Each vector then passes through a node of the
RNN, and each node of that model gives some outputs of the valence label.
The experiment and evaluation of the architecture are done with various
layers of CNN as input features, and the proposed one has acquired the
maximum score on test data. To calculate the cost function, the mean
squared error is used while optimizing. The overall architecture is shown
below.
Figure 3.3. The figure illustrates the combined CNN-RNN architecture
classifying human abnormalities using the NAHFE dataset. The CNN extracts
the information and the RNN classifies the images: the input images first
pass to the CNN, and then the output from the dense layer of the CNN goes
through the RNN, which classifies the abnormalities and produces the output.
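The sketch below shows one way to assemble this pipeline in Keras: the
per-frame CNN is applied to each of the $P$ frames with TimeDistributed, a
simple RNN with 150 hidden units (the best setting reported in Table 4.2)
consumes the sequence, and a softmax layer predicts one of the four classes.
$P$, the frame shape, the encoder widths, and the optimizer are illustrative
assumptions; the mean squared error cost follows the text above.

```python
# Sketch of the combined CNN-RNN architecture of Section 3.4.4.
# P, frame shape, encoder widths, and optimizer are illustrative assumptions;
# 150 RNN units and the MSE cost follow the thesis text.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn_rnn(P=8, frame_shape=(64, 64, 3), num_classes=4):
    # Per-frame feature extractor (a reduced version of the Section 3.4.2 CNN).
    frame_encoder = models.Sequential([
        layers.Input(shape=frame_shape),
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),  # dense-layer features fed to the RNN
    ])
    model = models.Sequential([
        layers.Input(shape=(P, *frame_shape)),
        layers.TimeDistributed(frame_encoder),  # P feature vectors, one per frame
        layers.SimpleRNN(150),                  # 150 hidden units (Table 4.2)
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["accuracy"])
    return model
```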
3.5 Design, Implementation, and Simulation
The design of the system follows the workflow shown in Figure 3.1. All the
mentioned steps of the prototype are implemented using Python [43]. The
convolutional and recurrent neural network models are implemented with
standard deep learning libraries; for numerical support, Numpy [44] is used.
The dataset used to test the architecture is the introduced NAHFE dataset,
on which the simulation evaluates the architecture.
3.6 Summary

This chapter described the feasibility, requirements, and methodology of
the proposed model, including the data pre-processing, the CNN and RNN
components, and the combined CNN-RNN architecture.
Implementation, Testing, and Result Analysis
4.1 Introduction

This chapter presents the dataset, the system setup, the evaluation
metrics, and the result analysis of the proposed architecture.
4.2 Dataset
Most datasets on the web are built for classifying six or seven basic
expressions of the human face. But, to analyse human stability, we need a
dataset that describes abnormal human faces. Hence, we gathered Normal and
Abnormal human images from the web and created our Normal and Abnormal
Human Facial Expression (NAHFE) dataset. Images were searched using
keywords related to the classes in addition to the name of the class
(e.g., sinful, convicted, sinner for the Criminal class).
Figure 4.1. Sample images of Normal, Autistic, Drug addict, and Criminal
human faces from the dataset used in this research, from top to bottom.
These images are gathered from the web using the web gathering technique.
Each gathered image was checked to confirm that it belongs to the same
class. Finally, we placed 1936 images in the four classes. In evaluations,
we have used 80% (1548) of the images for training and the rest 20% (388)
for testing. Also, the images of the four classes are distributed evenly,
and the number of samples in each class is 484. A couple of sample images
are shown in Figure 4.1.
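As an illustration, the split described above can be reproduced with a
standard directory-per-class loader; the directory layout, image size,
batch size, and seed below are assumptions, while the 80/20 ratio and the
class names come from this section.

```python
# Sketch of the 80% (1548) / 20% (388) split over the four NAHFE classes.
# Directory layout, image size, batch size, and seed are assumptions.
import tensorflow as tf

CLASSES = ["Normal", "Autistic", "Drug addict", "Criminal"]

def load_split(subset):
    return tf.keras.utils.image_dataset_from_directory(
        "NAHFE/",                      # hypothetical dataset directory
        class_names=CLASSES,
        validation_split=0.2,          # 20% held out for testing
        subset=subset,                 # "training" or "validation"
        seed=42,
        image_size=(64, 64),
        batch_size=32,
    )

train_ds = load_split("training")      # ~1548 images
test_ds = load_split("validation")     # ~388 images
```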
4.3 System Setup

The experiments are implemented in Python with the numerical and deep
learning libraries described in Section 3.5.
4.4 Evaluation

This section evaluates how well the proposed architecture classifies human
abnormalities. Evaluation metrics measure how superior an algorithm or
approach is. The major problem for evaluating any method is adopting
training and testing sets, which can introduce a bias. The evaluation here
is based upon the confusion matrix, which consists of true-positive (TP),
true-negative (TN), false-positive (FP), and false-negative (FN) [45]
values, from which the following metrics are computed once testing is done.
Accuracy defines the proportion of correct guesses the model estimates out
of the model's total estimations. The accuracy is calculated as:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (4.1)$$
Precision defines, of all the positive classes the model predicted, how
many are actually positive. To obtain the value of precision, we divide the
total number of correctly classified positive examples by the total number
of predicted positive examples:
$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (4.2)$$
Recall defines how much the model predicted correctly among all positive
classes. Recall is the ratio of the total number of correctly classified
positive examples to the total number of actual positive examples:
$$\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (4.3)$$
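A minimal sketch computing Equations (4.1)-(4.3) directly from
confusion-matrix counts follows; the counts in the usage line are made up
for illustration.

```python
# Sketch of the metrics in Equations (4.1)-(4.3) from TP, TN, FP, FN counts.
def evaluate(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # Equation (4.1)
    precision = tp / (tp + fp)                  # Equation (4.2)
    recall = tp / (tp + fn)                     # Equation (4.3)
    return accuracy, precision, recall

# Made-up counts for illustration: accuracy 0.7, precision 0.8, recall ~0.667.
print(evaluate(tp=40, tn=30, fp=10, fn=20))
```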
4.5 Result Analysis

Table 4.1 shows the prediction accuracy, precision, and recall of the
proposed architecture. We compared the predicted results of the proposed
architecture with a basic CNN architecture, which does not classify the
human abnormalities properly: the basic CNN architecture gives only 0.732
accuracy, while the CNN-RNN combined approach gives 0.895. It is noted that
the performance measurement by precision and recall also favours the
proposed architecture. So, to get better results, combining CNN with RNN is
found to be more effective. Every result in this research is given as a
mean of four runs.
Table 4.1. The table shows the validation Accuracy, Precision, and Recall
of the proposed architecture compared with the basic CNN architecture.
Table 4.2. The table shows the validation Accuracy, Precision, and Recall
of the proposed model using different numbers of hidden units.
Table 4.2 presents the results of the proposed model using different
numbers of hidden units. The investigation shows that the model performs
best while using 150 hidden units. The accuracy, precision, and recall
scores increase with an increasing number of hidden units; however, when
the number of hidden units crosses 150, the model starts producing worse
results.
Table 4.3. The table shows the validation Accuracy, Precision, and Recall
of the proposed model using different numbers of hidden layers.
Similarly, Table 4.3 shows the results of using different numbers of hidden
layers in the model, and it is found that using 6 hidden layers gives the
best result. Overall, the model performs best while using 150 hidden units
and 6 hidden layers.
4.6 Summary

This chapter presented the NAHFE dataset, the system setup, the evaluation
metrics, and the results, showing that the proposed CNN-RNN architecture
outperforms the basic CNN baseline.
Standards, Constraints, Milestones
5.1 Introduction

This chapter first explains the standards, impacts, and ethics of the
thesis work. Then, the Constraints and Alternatives are illustrated.
Finally, the Schedules, Tasks, and Milestones of the proposed work are
presented.
5.2 Sustainability and Social Impact

We ensure that our thesis work will be sustainable for many years. Facial
abnormality detection is a FER problem whose solution can be helpful for
society and the country. Moreover, CNN and RNN, which we used for the
implementation, are current cutting-edge deep learning approaches. As our
used resources will be available for extended periods of time, we can say
this thesis work will be sustainable.
In society, there are different types of people around us. Different people
show different gestures due to their health abnormalities. As the facial
expressions of a drug addict are different from a non-drug addict, we can
group people by their facial expressions. For autistic children, we have
special schools or daycare, but at home, not every parent knows about
autism; for them, it is hard to find out whether their child is typical or
not. The police cannot identify a criminal or drug addict by eye every
time; with the help of our research work, they can catch criminal or
drug-addicted people. By recognising these people, our society can be
cleaner and more aware.
5.3 Ethics
The predictions of the proposed system rely on the dataset applied to train
the model. The installation of such systems must sustain individuals'
privacy concerns and should not be applied for any motive that enhances a
social, national, or global security threat. The collection of the dataset
must be conducted under a code of moral conduct.
5.4 Challenges
The main challenge of this system is accurately detecting the user's facial
expression. Sometimes people may behave unusually, and people may use a
facemask or cover their face, which can create a problem in detecting a
person. Besides, misuse is a big problem: ill-intentioned people can abuse
this system.
5.5 Constraints
However, the budget can vary in the market environment because product and
resource prices change over time.
5.6 Timeline and Gantt Chart
Our thesis work timeline is divided into three divisions, as we get three
semesters to complete our work. We have conducted our work through the
semesters as follows: first, we prepared a proposal and reviewed the
related work of the thesis. Then we designed the model, and finally, we
implemented the overall architecture, tested it with the introduced
dataset, and reported the overall workflow. In the meantime, we also wrote
the thesis report, following a structured process to complete this work.
The thesis work is completed within three semesters, where each semester is
four months, which means 12 months in total.
The Gantt chart of the thesis work is summarized below.

1st Semester (Weeks 1–12): Topic Selection, Prototype Building, Evaluation.
2nd Semester (Weeks 13–24): Model Diagram, Design Submission, Model Analysis, Partial Implementation, Evaluation.
3rd Semester (Weeks 25–36): Model Finalization, Result Evaluation, Report Writing.
In summary, this chapter briefly explained the standards, impacts, ethics,
and challenges of the thesis work. Also, the constraints, alternatives,
schedules, tasks, and milestones were presented.
Conclusion
6.1 Introduction

In this thesis work, we have introduced a new dataset and a combined method
of CNN and RNN to train and test our method precisely. We believe this
research opens a new direction for classifying human abnormalities from
facial expressions.

6.2 Future Works and Limitations

Human abnormality detection from video and audio using deep learning
architectures can be explored in future work. In this work, a simple
recurrent network is used in the architecture, but the latest architectures
like GRU and LSTM can be tested as well.
References
[1] Yadan Lv, Zhiyong Feng, and Chao Xu. Facial expression recognition
[2] Inchul Song, Hyun-Jun Kim, and Paul Barom Jeon. Deep learning for
real-time robust facial expression recognition on a smartphone. In 2014
IEEE International Conference on Consumer Electronics (ICCE), pages
564–567, 2014.
[3] Steve Lawrence, C Lee Giles, Ah Chung Tsoi, and Andrew D Back. Face
recognition: A convolutional neural-network approach. IEEE Transactions on
Neural Networks, 8(1):98–113, 1997.
[4] Tomáš Mikolov, Martin Karafiát, Lukáš Burget, Jan Černockỳ, and Sanjeev
Khudanpur. Recurrent neural network based language model. In Eleventh
Annual Conference of the International Speech Communication Association,
2010.
[5] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature,
521(7553):436–444, 2015.
[6] Heechul Jung, Sihaeng Lee, Junho Yim, Sunjeong Park, and Junmo Kim.
Joint fine-tuning in deep neural networks for facial expression
recognition. In Proceedings of the IEEE International Conference on
Computer Vision, 2015.
[7] Ali Mollahosseini, David Chan, and Mohammad H Mahoor. Going deeper in
facial expression recognition using deep neural networks. In 2016 IEEE
Winter Conference on Applications of Computer Vision (WACV), pages 1–10,
2016.
[8] Masakazu Matsugu, Katsuhiko Mori, Yusuke Mitari, and Yuji Kaneda.
Subject independent facial expression recognition with robust face
detection using a convolutional neural network. Neural Networks,
16(5-6):555–559, 2003.
[10] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks
for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
[11] Patrice Y Simard, David Steinkraus, John C Platt, et al. Best
practices for convolutional neural networks applied to visual document
analysis. In ICDAR, volume 3, 2003.
[12] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever,
and Ruslan Salakhutdinov. Dropout: a simple way to prevent neural networks
from overfitting. The Journal of Machine Learning Research,
15(1):1929–1958, 2014.
[13] Hiroshi Kobayashi and Fumio Hara. Dynamic recognition of basic facial
[14] Abir Fathallah, Lotfi Abdi, and Ali Douik. Facial expression
recognition via deep learning. In 2017 IEEE/ACS 14th International
Conference on Computer Systems and Applications (AICCSA), 2017.
[15] Anima Majumder, Laxmidhar Behera, and Venkatesh K Subramanian.
2016.
[16] Madhumita A Takalkar and Min Xu. Image based facial micro-expression
recognition using deep learning on small datasets. In 2017 International
Conference on Digital Image Computing: Techniques and Applications (DICTA),
2017.
[17] Heechul Jung, Sihaeng Lee, Sunjeong Park, Byungju Kim, Junmo Kim,
[18] Xiujie Qu, Tianbo Wei, Cheng Peng, and Peng Du. A fast face recogni-
[19] Neha Jain, Shishir Kumar, Amit Kumar, Pourya Shamsolmoali, and
142–148, 2018.
1324, 2018.
41
[23] Dinh Viet Sang, Nguyen Van Dat, et al. Facial expression recognition
using deep convolutional neural networks. In 2017 9th International
Conference on Knowledge and Systems Engineering (KSE), 2017.
[24] André Teixeira Lopes, Edilson de Aguiar, Alberto F De Souza, and
Thiago Oliveira-Santos. Facial expression recognition with convolutional
neural networks: coping with few data and the training sample order.
Pattern Recognition, 61:610–628, 2017.
[25] Deepak Kumar Jain, Pourya Shamsolmoali, and Paramjit Sehdev. Extended
deep neural network for facial emotion recognition. Pattern Recognition
Letters, 120:69–74, 2019.
[26] Jinwoo Jeon, Jun-Cheol Park, YoungJoo Jo, ChangMo Nam, Kyung-
[28] Xudong Sun, Pengcheng Wu, and Steven CH Hoi. Face detection using deep
learning: An improved faster RCNN approach. Neurocomputing, 299:42–50,
2018.
[29] Hong-Wei Ng, Viet Dung Nguyen, Vassilios Vonikakis, and Stefan
Winkler. Deep learning for emotion recognition on small datasets using
transfer learning. In Proceedings of the 2015 ACM International Conference
on Multimodal Interaction, 2015.
[30] Wael AbdAlmageed, Yue Wu, Stephen Rawls, Shai Harel, Tal Hassner, Prem
Natarajan, et al. Face recognition using deep multi-pose representations.
In 2016 IEEE Winter Conference on Applications of Computer Vision (WACV),
2016.
[31] Yanan Guo, Dapeng Tao, Jun Yu, Hao Xiong, Yaotang Li, and Dacheng Tao.
Deep neural networks with relativity learning for facial expression
recognition. In 2016 IEEE International Conference on Multimedia & Expo
Workshops (ICMEW), 2016.
[32] Tolga Soyata, Rajani Muraleedharan, Colin Funai, Minseok Kwon, and
Wendi Heinzelman. Cloud-vision: Real-time face recognition using a
mobile-cloudlet-cloud acceleration architecture. In 2012 IEEE Symposium on
Computers and Communications (ISCC), 2012.
1253, 2008.
601, 2019.
[36] Nianyin Zeng, Hong Zhang, Baoye Song, Weibo Liu, Yurong Li, and
[37] Samuli Laine, Tero Karras, Timo Aila, Antti Herva, Shunsuke Saito,
Ronald Yu, Hao Li, and Jaakko Lehtinen. Production-level facial performance
capture using deep convolutional neural networks. arXiv preprint
arXiv:1609.06536, 2016.
[38] Md Zia Uddin, Weria Khaksar, and Jim Torresen. Facial expression
[40] Kun Tian, Liaoyuan Zeng, Sean McGrath, Qian Yin, and Wenyi Wang.
1–6, 2019.
[41] Wei Li, Min Li, Zhong Su, and Zhigang Zhu. A deep-learning approach to
facial expression recognition with candid images. In 2015 14th IAPR
International Conference on Machine Vision Applications (MVA), 2015.
[42] Tong Zhang, Wenming Zheng, Zhen Cui, Yuan Zong, and Yang Li.
[44] Stefan Van Der Walt, S Chris Colbert, and Gael Varoquaux. The numpy
array: a structure for efficient numerical computation. Computing in
Science & Engineering, 13(2):22–30, 2011.
[45] Jake Lever, Martin Krzywinski, and Naomi Altman. Points of
significance: Classification evaluation. Nature Methods, 13(8):603–604,
2016.