Image Processing

The document presents a report on a Face Mask Detection System developed by a group of students in response to the COVID-19 pandemic. It outlines the importance of face masks in controlling virus transmission and describes the use of Haar cascade classifiers and deep learning algorithms for real-time face and mask detection. The report also includes a literature survey of existing methodologies and their results related to face mask detection.

Uploaded by

utkarshv036

Image Processing J Component Report

Slot: G1+TG1

submitted by

PRABHVEER SINGH KHURANA(19BCE0280)

UTKARSH VERMA(19BCE0078)

ALKA RANI(19BCI0004)

ADVIKA SRIVASTAVA(19BCE2217)

On

Face Mask Detection System

SCHOOL OF COMPUTER SCIENCE AND ENGINEERING


Abstract
Coronavirus disease 2019 (COVID-19) is a new respiratory infectious illness caused by
severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). According to a World Health
Organization (WHO) report released on September 15, 2020, COVID-19 has swiftly spread to
the majority of countries globally, affecting tens of millions of people and causing
hundreds of thousands of fatalities. The fight against the coronavirus requires several
important tools, and the face mask is one of them. Initially, wearing a face mask was not
required for everyone, but as time passed, scientists and physicians began to urge that
everyone do so. Our project uses a face mask detection technique to determine who is
wearing a mask and who is not. This would aid in minimising illness transmission from one
person to the next in public areas. The proposed algorithm uses Haar cascade classifiers
to detect the face and the mask. The whole system has been built and demonstrated in a
practical application for checking people wearing face masks. In this paper, we present a
methodology for robust face detection in a real-time environment. Object detection using
Haar feature-based cascade classifiers is an effective object detection method proposed by
Paul Viola and Michael Jones in their 2001 paper, "Rapid Object Detection using a Boosted
Cascade of Simple Features". It is a machine learning based approach in which a cascade
function is trained from a large number of positive and negative images and is then used
to detect objects in other images. Here we use the Haar classifier and deep learning
algorithms to track faces with OpenCV, an open-source library originally developed by
Intel.

Keywords:
Deep Learning; Computer Vision; OpenCV; Haar cascade

Introduction
India witnessed a surge in COVID-19 infections from mid-July 2020. The number of new cases
rose to as many as 97,900 per day by late September and then declined to fewer new cases
per day by November. This was termed the first wave of COVID-19. However, with the arrival
of the new year and other festivals, restrictions were loosened, people became careless,
and face masks were no longer worn even in large crowds. The pandemic then struck again,
raising the count of new infections to as high as 4,00,000 per day, as recorded on
6th-8th May 2021. Now, as infection rates subside, people have become careless yet again.
Taking lessons from the second wave, public places such as malls and cinemas have enforced
the wearing of masks, but deploying a large workforce for this purpose is infeasible. This
is where our project comes in: it helps ensure that everyone is wearing a mask with little
to no human intervention.

LITERATURE SURVEY
SR. NO. TITLE METHODOLOGY RESULTS DRAWBACK

1. Title: Face Mask detection using transfer learning. 2020 IEEE 2nd International Conference on Electronics, Control, Optimization and Computer Science (ICECOCS).
   Methodology: This paper used different deep Convolutional Neural Networks (CNN) to extract deep features from images of faces. The extracted features are further processed using various machine learning classifiers such as Support Vector Machine (SVM) and K-Nearest Neighbors (K-NN). Metrics such as accuracy and precision were examined to compare the performance of all models.
   Results: The experimental results show that, despite a small dataset size (1376 faces), transfer learning is a good approach to classifying faces with masks and faces without masks. The combination of the MobileNet-V2 model with SVM achieved excellent performance (97.11%).
   Drawback: The dataset used is not large enough to deal with complex problems.

2. Title: Social distance Monitoring and Face Mask detection using Deep neural network. MSc Internet of Things with Data Analytics, Bournemouth University, United Kingdom (2020).
   Methodology: A robust model to initiate the basic mandatory step to be taken by society to control COVID transmission. A deep neural network is used to check that social distancing is being maintained at public sites and to make sure people are wearing a mask while entering a premise.
   Results: An image processing based social distancing framework with a deep learning neural network for analysis is evaluated. The proposed idea suggests a global display using the AWS-IoT platform. The research would be further extended by implementing the robust prediction system and real-time monitoring of the face mask detection system with a large data size.
   Drawback: The proposed model is not implemented over a large coverage area, which is required to improve the accuracy of the prediction model by improving the training model.

3. Title: An Improved Neural Network Cascade for Face Detection in Large Scene Surveillance. Received: 30 September 2018; Accepted: 8 November 2018; Published: 11 November 2018.
   Methodology: The author proposed a technique for masked face detection using four different steps: estimating distance from camera, eye line detection, facial part detection and eye detection. They improve popular cascade algorithms by proposing a novel multi-resolution framework that utilises parallel convolutional neural network cascades for detecting faces in large scenes. Compared with popular cascade algorithms, this method outperforms them by a large margin.
   Results: When testing, different sizes of detection windows are normalised to fit the input size of each algorithm. The proposed method outperforms the two compared approaches by a large margin. The numbers of true positive and false positive detections are reported as the threshold value is tuned. When the threshold value is smaller, more faces are accepted, as well as more false alarms. Yet, overall, the approach performs relatively well over different thresholds.
   Drawback: Nil

4. Title: Face Mask Detection Using CNN for Covid-19. Vol. 10, Issue 6, June 2021.
   Methodology: The project implementation is divided into two phases: Training and Deployment. In the training phase, the GitHub dataset named 'Face Mask Detection Dataset' is loaded. This dataset is used to train the Face Mask Classifier using a CNN algorithm. The VGG-16 (Visual Geometry Group) architecture is used to train the CNN classifier. The trained Face Mask Classifier is stored on disk. In the deployment phase, the trained Face Mask Classifier is first loaded.
   Results: Detected the people not wearing face masks, thus decreasing the risk of getting infected by COVID-19.
   Drawback: Nil

5. Title: "Facial Mask Detection using Semantic Segmentation," 2019 4th International Conference on Computing, Communications and Security (ICCCS), 2019.
   Methodology: This paper proposes the twin objective of creating a binary face classifier which can detect faces in any orientation irrespective of alignment, and training it in an appropriate neural network to get accurate results. The model takes an RGB image of any arbitrary size as input. The model's basic function is feature extraction and class prediction.
   Results: Results are demonstrated on the Multi Human Parsing Dataset with mean pixel level accuracy. Also, the problem of erroneous predictions has been solved and a proper bounding box has been drawn around the segmented region. The proposed network can detect non-frontal faces and multiple faces from a single image. The method can find applications in advanced tasks such as facial part detection.
   Drawback: Nil

6. Title: An Efficient Ant Colony System for Edge Detection in Image Processing.
   Methodology: In this paper, they use the particular variant Ant Colony System (ACS) (Dorigo and Gambardella, 1997) to solve the problem of edge detection. ACS applied two rules for the pheromone update, local and global, in order to achieve a better search of the problem space. They propose two modified algorithms of ACS, which are developed to obtain an efficient edge detection with a minimized complexity.
   Results: The proposed algorithms were compared with the ACS system presented in Baterina and Oppus (2010) without filtering. The two algorithms proposed in this paper have outperformed the traditional ACS (without filtering), and higher quality solutions were obtained in significantly shorter time, without the need for additional noise-filtering processes.
   Drawback: NIL

7. Title: A FRAMEWORK FOR VIDEO SEGMENTATION USING GLOBAL AND LOCAL FEATURES.
   Methodology: They propose a simple and effective framework for scene detection by computing the similarity between the key frames in a sliding window. Candidate abrupt boundaries are detected by analysing the abrupt changes in consecutive frames. SURF feature matching is used to refine the candidate abrupt boundaries. Before detecting the scene boundaries, fade effects present in video streams are detected. Fade boundaries are detected by analysing the changing pattern of entropy in temporal order. This paper is limited to fade effects in gradual boundaries.
   Results: For scene detection, key frames are extracted from abrupt boundaries and a window of size K is used for similarity of key frames in temporal order. Substantial improvements are achieved over SBD and encouraging results are obtained for scene detection. The proposed framework can be rebuilt easily on any commodity hardware.
   Drawback: It does not include scene classification, content-based video classification, and efficient video retrieval based on visual queries.

8. Title: Berryman, J. G. (1985). Measurement of spatial correlation functions using image processing techniques.
   Methodology: Fourier transform methods and array processor techniques for calculating the spatial correlation functions are treated. By introducing a minimal set of lattice-commensurate triangles, a method of sorting and storing the values of three-point correlation functions in a compact one-dimensional array is developed. Although results depend somewhat on magnification and on relative volume fraction, it is found that photographs digitized with 512×512 pixels generally have sufficiently good statistics for most practical purposes. To illustrate the use of the correlation functions, bounds on conductivity for the penetrable sphere model are calculated with a general numerical scheme developed for treating the singular three-dimensional ...
   Results: It is found that a digitized image with 512×512 pixels gives sufficiently good statistics to provide a good reproduction of the expected values of the spatial correlation functions. The computed correlation functions are then used to calculate bounds on the conductivity for a model material.
   Drawback: The major remaining obstacle to application of these techniques to real two-phase materials is again one of resolution if the constituents' colours are not "black" and "white" as assumed here. However, the resolution problem is not so much an image processing problem as a sample surface preparation problem which should be solved in the laboratory prior to photographing the surface.

9. Title: Chowdary, P. R. V., Babu, M. N., Subbareddy, T. V., Reddy, B. M., & Elamaran, V. (2014). Image processing algorithms for gesture recognition using MATLAB.
   Methodology: Gesture recognition is a fast-growing field in image processing and artificial technology. Gesture recognition is a process in which the gestures or postures of human body parts are identified and used to control computers and other electronic appliances. The main reason for the emergence of gesture recognition is that it can create a simple communication path between human and computer, called HCI. MATLAB code can be converted to HDL or VHDL code and embedded in an FPGA for hardware execution. This future work can be implemented by using Xilinx System Generator software, which is linked with Xilinx FPGAs by implementing hardware co-simulation.
   Results: They have proposed four simple MATLAB algorithms for gesture recognition which are invariant to rotation. Out of all the algorithms, the scanning method is the most robust, delivering accurate results for 82.47% of images. The results obtained from the above algorithms can be used to control any electronic appliance. Some of them can control the VLC media player or a PowerPoint presentation without any physical contact with the computer, thus establishing a better human computer interaction.
   Drawback: It is not implemented using Xilinx System Generator software, which is linked with Xilinx FPGAs by implementing hardware co-simulation.

10. Title: Stefan van der Walt, Johannes L. Schönberger, Juan Nunez-Iglesias, François Boulogne and Neil Yager. Scikit-image: Image processing in Python.
    Methodology: Scikit-image is an image processing library that implements algorithms and utilities for use in research, education and industry applications. It provides easy access to a powerful array of image processing functionality. Over the past few years, it has seen significant growth in both adoption and contribution. Due to the breadth and maturity of its code base, as well as its commercial friendly licence, scikit-image is well suited for industrial application.
    Results: scikit-image provides easy access to a powerful array of image processing functionality. Over the past few years, it has seen significant growth in both adoption and contribution, and the team is excited to collaborate with others to see it grow even further, and to establish it as the de facto library for image processing in Python.
    Drawback: NIL

11. Title: H. Adusumalli, D. Kalyani, R. K. Sri, M. Pratapteja and P. V. R. D. P. Rao, "Face Mask Detection Using OpenCV," 2021 Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV), 2021, pp. 1304-1309, doi: 10.1109/ICICV50876.2021.9388375.
    Methodology: The methodology consisted of 4 stages. First stage is dataset collection: the dataset was collected from Kaggle and divided into training and testing data after analysis. Second stage is training the model to detect face masks: OpenCV module is used to obtain faces followed by training a Keras model to identify face masks. Third stage is detecting the person not wearing a mask and finally sending them an email or notification for the same.
    Results: Due to increasing COVID cases a system to replace humans to check masks on the faces of people is greatly needed. The system proposed in the paper can be installed in public places like airports, malls, etc. to ensure safety of workers and public. It can also be used in MNCs where there are hundreds of workers, as this system can store employee data as well and can notify them if they are not wearing a mask.
    Drawback: NIL

12. Title: H. Jiang and E. Learned-Miller, "Face Detection with the Faster R-CNN," 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), 2017, pp. 650-657, doi: 10.1109/FG.2017.82.
    Methodology: R-CNN is essentially a region-based convolutional neural network used for object recognition. The pipeline has a total of two stages. First, a set of category-independent object suggestions is generated by a selective search. In the second refinement step, the image area in each proposal is warped to a fixed size and then mapped to a 4096-dimensional feature vector. This feature vector is sent to the classifier and the regressor to narrow down the detection position. R-CNN brings the high accuracy of CNNs on classification problems to object detection. R-CNN requires a forward pass through the convolutional neural network for each object to extract its features. This requires large computations, creating a computational burden. To solve this problem Fast R-CNN was introduced, which runs through the network exactly once for an entire input image.
    Results: Although the Faster R-CNN is designed for generic object detection, it demonstrates impressive face detection performance when retrained on a suitable face detection training set. It may be possible to further boost its performance by considering the special patterns of human faces.
    Drawback: MultiresHPM is a better detector than Faster R-CNN.

13. Title: Q. Du, J. Zhao, L. Shi and L. Wang, "Research on the two-dimensional face image feature extraction method," 2012 3rd International Conference on System Science, Engineering Design and Manufacturing Informatization, 2012, pp. 251-254, doi: 10.1109/ICSSEM.2012.6340720.
    Methodology: Input: the elements/properties obtained after human facial feature extraction are the raw input for facial image analysis. Facial image analysis will play a crucial role in our proposed project, i.e. the Face Mask Detection System. The author focuses on automatic extraction of feature points from a two-dimensional digital face photo. The image is adjusted and converted to grey before data processing. The face regional calibration method is edge detection and binarization after clustering, and the result is controlled by the integral projection combined with the physiological structure. Experimental results show the method can effectively extract 2D face image features and can be used for practical purposes. The key contributions listed in the paper are:
      1. Defining what feature points are. 54 points are chosen as the facial feature point set with face characteristics parameters in the MPEG-4 standard.
      2. The feature extraction method. Feature points are automatically calibrated using the SUSAN algorithm, cluster analysis and image binarization.
      3. A flow diagram of the methodology.
    Results: The paper defines the feature points introduced for 3D face model reconstruction and the calibration of feature regions using various methods such as edge detection, clustering, etc.
    Drawback: NA

14. Title: G. Deore, R. Bodhula, V. Udpikar and V. More, "Study of masked face detection approach in video analytics," 2016 Conference on Advances in Signal Processing (CASP), 2016, pp. 196-200, doi: 10.1109/CASP.2016.7746164.
    Methodology: Input: estimated distance from camera, eye line detection, facial part detection, eye detection. The author proposes a technique of masked face detection using the four inputs mentioned above. Distance from the camera can be calculated using the pinhole camera model. Eyes and eyebrows correspond to low grey levels compared to other parts; these locations correspond to the local valley of the horizontal projection histogram. For face detection, the Viola-Jones algorithm is used. Eye detection based on the Viola-Jones algorithm is the last step and determines whether a person has a mask or not. If both eyes and face are detected, the person is not wearing a mask. If only eyes are detected, the person is wearing a mask.
    Results: The distance-from-camera method is used to see whether a person is approaching the camera or going away, to decide whether to trigger face detection or not. Eye line detection finds the valley in the histogram projection. If the eye line is detected correctly, face detection can be applied to check for a mask. (Eye line and eye detection have the maximum false rates because the algorithm detects a very small part of the image in low resolution.)
    Drawback: The algorithm detects a very small part of the image in low resolution.

15. Title: Mandal, B., Okeukwu, A. and Theis, Y., 2021. Masked face recognition using resnet-50. arXiv preprint arXiv:2104.08997.
    Methodology: Input: pre-trained ResNet-50 architecture trained on human faces. Identifying an individual who is wearing a mask is difficult because the features available for prediction are reduced to the eyes and forehead. The paper proposes a framework to solve the problem of detecting masked individuals using the ResNet-50 architecture. Transfer learning is applied to adapt a pre-trained ResNet-50 model to images of people without masks. The model is then subjected to hyperparameter tuning to identify individuals with a mask (images of the same individuals without masks were used before). The accuracy obtained in the experiment is 89%. The dataset used for the experiment is the Real World Masked Face Recognition Dataset (RMFRD). It contains three different types of data: a real-world masked face recognition dataset, simulated masked face recognition datasets, and a real-world masked face verification dataset. Additionally, the dataset has 5000 masked and 90,000 unmasked faces.
    Results: Precision, recall and F1-score are used to observe how well the model works. Precision was 0.8993, recall was 0.8970 and F1 score was 0.897.
    Drawback: NA

Modules:

1. Image Enhancement
a. Grayscale Conversion
b. Histogram Equalization
c. Blurring

2. Classification using pre-trained Classifiers (Haar-cascade).
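
The Haar-cascade classification named in module 2 rests on rectangle features evaluated over an integral image, as in the Viola-Jones method cited in the abstract. The following is a minimal, dependency-free sketch of one such feature, not the OpenCV implementation; the function names and the toy 4x4 patch are assumptions made for illustration.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: top half minus bottom half of the window."""
    half = h // 2
    top = rect_sum(ii, x, y, w, half)
    bottom = rect_sum(ii, x, y + half, w, half)
    return top - bottom


# Hypothetical toy patch: bright upper half, dark lower half.
patch = [
    [9, 9, 9, 9],
    [9, 9, 9, 9],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
ii = integral_image(patch)
feature_value = two_rect_feature(ii, 0, 0, 4, 4)
print(feature_value)  # 72 - 8 = 64
```

Each rectangle sum costs four table lookups regardless of its size, which is what makes evaluating many such features per window cheap enough for real-time use.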

Methodology:
➔ The image is converted to grayscale. A grayscale image stores only luminance (brightness)
information rather than colour information: maximum luminance is white, zero luminance is
black, and anything in between is a shade of grey. This makes the image easier to process,
as there is only one channel.
➔ After conversion to grayscale, the image undergoes histogram equalisation. Histogram
equalisation, also known as histogram flattening, is one of the most essential nonlinear point
operations. Its premise is similar to that of full-scale histogram stretch (FSHS): a picture
should not only cover the available grayscale range, but also be distributed evenly across it.
This could be seen in the results we obtained.
➔ After equalisation, the image is blurred. Blurring reduces the deviation of intensity
values, that is, it smooths sudden changes in intensity. We blur the image because doing so
reduces the number of unwanted edges (e.g. the folds of the eye, face or clothing).
➔ Next, the image goes through edge detection using the Canny edge detector, a popular
edge detection algorithm developed by John F. Canny. It is a multi-stage algorithm comprising
noise reduction, intensity gradient computation, non-maximum suppression and hysteresis
thresholding.
➔ Finally, the important features are extracted from the image and fed into the Haar
classifier, and the result is displayed on the original image.
Architecture Diagram :
Code:
import cv2

# Pre-trained Haar cascade files (the eye and upper-body cascades are loaded but not used below)
face_cascade = cv2.CascadeClassifier('data\\xml\\haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier('data\\xml\\haarcascade_eye.xml')
mouth_cascade = cv2.CascadeClassifier('data\\xml\\haarcascade_mcs_mouth.xml')
upper_body = cv2.CascadeClassifier('data\\xml\\haarcascade_upperbody.xml')

# Adjust threshold value in range 80 to 105 based on your light.
bw_threshold = 90

# User message
font = cv2.FONT_HERSHEY_SIMPLEX
org = (30, 30)
weared_mask_font_color = (255, 255, 255)
not_weared_mask_font_color = (0, 0, 255)
thickness = 1
font_scale = 1
weared_mask = "Thank You for wearing mask"
not_weared_mask = "Please wear mask to prevent Corona"

# Read video from the default webcam
cap = cv2.VideoCapture(0)

while True:
    # Get individual frame; stop if the camera returns nothing
    ret, img = cap.read()
    if not ret:
        break
    # img = cv2.flip(img, 1)

    # Convert image to grayscale (the Haar classifier works with grayscale images)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Convert image to black and white
    (thresh, black_and_white) = cv2.threshold(gray, bw_threshold, 255, cv2.THRESH_BINARY)
    # cv2.imshow('black_and_white', black_and_white)

    # Detect faces (scale factor 1.1, minimum neighbours 4)
    faces = face_cascade.detectMultiScale(gray, 1.1, 4)

    # Face prediction for black and white
    faces_bw = face_cascade.detectMultiScale(black_and_white, 1.1, 4)

    if len(faces) == 0 and len(faces_bw) == 0:
        cv2.putText(img, "No face found...", org, font, font_scale,
                    weared_mask_font_color, thickness, cv2.LINE_AA)
    elif len(faces) == 0 and len(faces_bw) == 1:
        # It has been observed that for a white mask covering the mouth,
        # face prediction does not happen on the gray image
        cv2.putText(img, weared_mask, org, font, font_scale,
                    weared_mask_font_color, thickness, cv2.LINE_AA)
    else:
        # Draw a white rectangle of thickness 2 on each face
        for (x, y, w, h) in faces:
            cv2.rectangle(img, (x, y), (x + w, y + h), (255, 255, 255), 2)

        # Detect lips contours
        mouth_rects = mouth_cascade.detectMultiScale(gray, 1.5, 4)

        if len(mouth_rects) == 0:
            # Face detected but lips not detected, which means the person is wearing a mask
            cv2.putText(img, weared_mask, org, font, font_scale,
                        weared_mask_font_color, thickness, cv2.LINE_AA)
        elif any(y < my < y + h
                 for (x, y, w, h) in faces
                 for (mx, my, mw, mh) in mouth_rects):
            # Lips lie within the face coordinates, which means the lips
            # prediction is true and the person is not wearing a mask
            cv2.putText(img, not_weared_mask, org, font, font_scale,
                        not_weared_mask_font_color, thickness, cv2.LINE_AA)

    # Show frame with results
    cv2.imshow('Mask Detection', img)
    k = cv2.waitKey(30) & 0xff
    if k == ord('q'):  # press q to quit; ord(char) returns the character's code
        break

# Release video
cap.release()
cv2.destroyAllWindows()
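
The per-frame decision made inside the loop can be isolated as a pure function, which makes the rule easier to read and to test without a webcam. This is a sketch under stated assumptions: the function name `classify_frame` and the returned labels are hypothetical, rectangles are `(x, y, w, h)` tuples as returned by `detectMultiScale`, and the mouth test is applied to every detected face.

```python
def classify_frame(faces, faces_bw, mouths):
    """Return a label for one frame from the detected rectangles.

    faces    : face rectangles found in the grayscale image
    faces_bw : face rectangles found in the thresholded image
    mouths   : mouth rectangles found in the grayscale image
    """
    if len(faces) == 0 and len(faces_bw) == 0:
        return "no_face"
    if len(faces) == 0 and len(faces_bw) == 1:
        # Face visible only after thresholding: treated as a masked face.
        return "mask"
    if len(mouths) == 0:
        # A face but no mouth: treated as a masked face.
        return "mask"
    for (x, y, w, h) in faces:
        for (mx, my, mw, mh) in mouths:
            # Mouth top edge falls within the vertical extent of the face.
            if y < my < y + h:
                return "no_mask"
    return "undecided"


# Hypothetical rectangles for illustration.
face = (10, 10, 100, 100)
print(classify_frame([face], [], [(30, 80, 40, 20)]))   # no_mask
print(classify_frame([face], [], []))                    # mask
```

Because the rule only compares vertical coordinates, a mouth detection beside the face at the right height would also count as "no mask", which is one plausible source of the misclassifications reported below.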

Output Screenshots:
[Screenshot: False Positive]

Experimental Results:
[Screenshots: True Positive, False Negative, True Negative, False Positive, False Positive]
Evaluation Metrics:

Accuracy = (tp + tn) / (tp + fp + tn + fn) = 3/6

Precision = tp / (tp + fp) = 1/3

Recall = tp / (tp + fn) = 1/2
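
The three metrics can be checked with a few lines of Python. The counts tp = 1, fp = 2 and fn = 1 follow from the precision and recall fractions above; tn = 2 is an assumption inferred from the total of 6 samples in the accuracy denominator, and the function name `evaluate` is hypothetical.

```python
def evaluate(tp, fp, tn, fn):
    """Accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # correct predictions over all samples
        "precision": tp / (tp + fp),     # correct positives over predicted positives
        "recall": tp / (tp + fn),        # correct positives over actual positives
    }


# tn=2 is inferred from the 6-sample total, not stated in the report.
metrics = evaluate(tp=1, fp=2, tn=2, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```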

Conclusion:
Hence, with the help of OpenCV and the Haar cascade classifier, we were able to implement
mask detection.

The outputs were accurate only up to a certain extent. In low lighting the model was
inaccurate, and there were also false positives, as shown in the output section.

This proposed architecture can be implemented in malls and other densely populated places
and connected to an alarm that notifies owners or other administrators when people without
masks are present, thereby helping ensure the safety and well-being of other people in the
surrounding area.
Future Work:
Future enhancement focuses on reducing false positives, i.e. predicting that a person is
wearing a mask even though they are not, which can be done by training a custom classifier,
as well as on reducing false negatives, i.e. wrongly predicting that a person is not wearing
a mask, thus improving the overall accuracy.
