Face Mask Detection System Using Python


FACE MASK DETECTION SYSTEM

USING PYTHON

A MINOR PROJECT REPORT ON

SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE AWARD


OF THE DEGREE OF

BACHELOR OF TECHNOLOGY
(Electrical Engineering)

SUBMITTED TO
BIPIN TRIPATHI KUMAON INSTITUTE OF TECHNOLOGY

SUBMITTED BY
Name of Student(s) University Roll No.
Mohan Singh (180180105018)
Pooja Agarwal (180180105021)
Vishwajeet Queera (180180105033)

SUPERVISED BY
Mr. Lalit Mohan Tiwari
Assistant Professor, Department of Electrical Engineering
March, 2022

BIPIN TRIPATHI KUMAON INSTITUTE OF TECHNOLOGY


DWARAHAT, UTTARAKHAND

CERTIFICATE

I hereby certify that the work which is being presented in the B.Tech. Minor Project
Report entitled “Face Mask Detection using Python”, in partial fulfillment of the
requirements for the award of the Bachelor of Technology in Electrical Engineering
and submitted to the Department of Electrical Engineering of BTKIT, Dwarahat
(Almora), Uttarakhand is an authentic record of my own work carried out during a period
from October 2021 to March 2022 (7th semester) under the supervision of Mr Lalit
Mohan Tiwari, Electrical Department.

The matter presented in this Project Report has not been submitted by me for the award
of any other degree elsewhere.

Mohan Singh (180180105018) Pooja Agarwal (180180105021) Vishwajeet Queera (180180105033)

This is to certify that the above statement made by the student(s) is correct to the best of
my knowledge.

Signature of Supervisor(s)
Date: Mr. Lalit Mohan Tiwari
Assistant Professor
Department of Electrical Engineering

Head

Electrical Engineering Department

ACKNOWLEDGEMENT

I would like to place on record my deep sense of gratitude to Prof. Miss Vijya Bhandari,
HOD-Dept. of Electrical Engineering, BTKIT, Dwarahat (Almora), India, for her
generous guidance, help and useful suggestions.

I express my sincere gratitude to Prof. Mr Lalit Tiwari, Dept. of Electrical Engineering,
BTKIT, Dwarahat (Almora), India, for his stimulating guidance, continuous
encouragement and supervision throughout the course of the present work.

I also wish to extend my thanks to Prof. Mr. Sachin Tyagi and other colleagues for
attending my seminars and for their insightful comments and constructive suggestions
to improve the quality of this project work.

I am extremely thankful to Dr. Satyendra Singh, Director, BTKIT, Dwarahat (Almora),
for providing me infrastructural facilities to work in, without which this work would
not have been possible.

Signature(s) of Students
Mohan Singh (180180105018)
Pooja Agarwal (180180105021)
Vishwajeet Queera (180180105033)

ABSTRACT

Interest in face mask recognition has grown rapidly since the outbreak of the coronavirus
pandemic because of its many uses in law enforcement, security, and commercial
applications. Since an uncovered face can spread the virus to others, a novel approach to
face mask recognition is proposed. The proposed system classifies whether a face mask is
being worn, in the context of COVID-19.

After the outbreak of the worldwide COVID-19 pandemic, there arose a severe need for
protection mechanisms, face masks being the primary one. According to the World
Health Organization, the COVID-19 coronavirus pandemic is causing a global health
crisis, and one of the most effective safety measures is wearing a face mask in public
places. Convolutional Neural Networks (CNNs) have established themselves as a dominant
class of image recognition models.

The aim of this research is to examine and test machine learning capabilities for detecting
and recognizing face masks worn by people in any given video or picture or in real time.
This project develops a real-time, GUI-based automatic Face detection and recognition
system. It can be used as an entry management device by registering an organization's
employees or students with their faces, and then recognizing individuals when they
approach or leave the premises by recording their photographs with faces. The proposed
methodology makes use of the HAAR Cascade Algorithm. Based on the performance and
accuracy of our model, the result of the binary classifier will be indicated showing a
green rectangle superimposed around the section of the face indicating that the person at
the camera is wearing a mask, or a red rectangle indicating that the person on camera is
not wearing a mask along with face identification of the person.

Keywords: Face Recognition and Detection, Convolutional Neural Network, GUI,
HAAR Cascade Algorithm

Table of Contents
Certificate
Acknowledgement
Abstract
List of figures
List of symbols
List of abbreviations
Chapter 1 Introduction
1.1 Motivation of work
1.2 Problem Statement
Chapter 2 Literature Survey
2.1 An Automated System to Limit COVID-19 Using Facial Mask Detection
2.2 Masked Face Recognition Using Convolutional Neural Network
Chapter 3 Methodology
3.1 Proposed system
3.2 TensorFlow framework
3.3 OpenCV
3.4 NumPy
3.5 IPython
3.6 Python Concepts
3.7 Keras
3.8 Machine Learning Approaches
3.9 Haar Feature-Based Cascade Classifiers
3.10 Deep Learning
3.11 Neural Networks Versus Conventional Computers
3.12 Architecture of Neural Networks
3.13 Convolutional Neural Network
3.14 Convolutional Layer
3.15 CNN Model
3.16 System Architecture
Chapter 4 Design
4.1 UML Diagrams
Chapter 5 Experimental Analysis
5.1 Modules
5.2 Datasets
5.3 Code Implementation
5.4 Functional Requirements
5.5 Non-Functional Requirements
5.6 Input and Output
Chapter 6 Conclusion
6.1 Summary
6.2 Future Scope
6.3 Conclusion
6.4 References

LIST OF FIGURES

Fig. no. Description

1. Feed Forward Network
2. Layer in NN
3. Basic structure of CNN
4. Layers in CNN
5. System Architecture
6. Building the Model
7. Use Case Diagram
8. Activity Diagram
9. Block Diagram
10. Flowchart Diagram
11. Datasets with facemask
12. Datasets without facemask
13. Real time input/output

LIST OF SYMBOLS

h - Height
w - Width
f - Filter
d - Dimension

LIST OF ABBREVIATIONS

● MTCNN - Multi-Task Cascaded Convolutional Neural Networks
● NN - Neural Network
● CNN - Convolutional Neural Network
● CCTV - Closed-Circuit Television
● PyPI - Python Package Index
● SIFT - Scale-Invariant Feature Transform
● HOG - Histogram of Oriented Gradients
● UML - Unified Modeling Language

CHAPTER 1

INTRODUCTION

1.1 MOTIVATION OF WORK

The world has not yet fully recovered from this pandemic, and a treatment that can
reliably cure Covid-19 is yet to be discovered. However, to reduce the impact of the
pandemic on their economies, several governments have allowed a limited number
of economic activities to resume once the number of new Covid-19 cases has
dropped below a certain level. As these countries cautiously restart their economic
activities, concerns have emerged regarding workplace safety in the new post-Covid-19
environment.

To reduce the possibility of infection, people are advised to wear masks and
maintain a distance of at least 1 meter from each other. Deep learning has gained
attention in object detection, where it has been used for human detection, and it can be
used to develop a face mask detection tool that determines whether an individual is
wearing a mask. This is done by classifying frames from a real-time camera stream and
evaluating the classification results. Deep learning projects need a training dataset,
which is the data actually used to fit the model for the task.

1.2 PROBLEM STATEMENT

The main objective of the face mask detection model is to detect the faces of individuals
and determine whether they are wearing masks at the moment they are captured in the
image. The large-scale losses seen across the world due to the Covid-19 pandemic have
been shocking and have led to great loss of life and property. The pandemic was sudden,
and people and governments could not prepare themselves effectively beforehand to
mitigate its effects. The virus is highly deadly and has caused many casualties that could
have been prevented through effective preventive measures. Wearing a mask is therefore
an effective way to help prevent further spread of the virus.

CHAPTER 2

LITERATURE SURVEY

2.1 An Automated System to Limit COVID-19 Using Facial Mask Detection in Smart City
Network [1]:

The COVID-19 pandemic caused by the novel coronavirus is still spreading all over the
world. The impact of COVID-19 has fallen on almost all sectors of development, and the
healthcare system is going through a crisis. Many precautionary measures have been
taken to reduce the spread of this disease, and wearing a mask is one of them. The authors
of [1] propose a system that restricts the growth of COVID-19 by finding people who are
not wearing a facial mask in a smart city network where all public places are monitored
with Closed-Circuit Television (CCTV) cameras. When a person without a mask is
detected, the corresponding authority is informed through the city network. A deep
learning architecture is trained on a dataset that consists of images of people with and
without masks collected from various sources. The trained architecture achieved 98.7%
accuracy in distinguishing people with and without a facial mask on previously unseen
test data. The authors hope that the study will be a useful tool for reducing the spread of
this communicable disease in many countries.

2.2 Masked Face Recognition Using Convolutional Neural Network [2]:

Face recognition has become a popular and significant technology in recent years. Face
alterations and the presence of different masks make it considerably more challenging. In
the real world, masking is an even more common scenario when a person is uncooperative
with the system, such as in video surveillance. With such masks, current face recognition
performance degrades. An abundant amount of research work has been performed on
recognizing faces under different conditions like changing pose or illumination, degraded
images, etc. Still, the difficulties created by masks are usually disregarded. The primary
concern of this work is facial masks, and especially improving the recognition
accuracy for different masked faces. A feasible approach is proposed that consists
of first detecting the facial regions. The occluded face detection problem is
approached using a Multi-Task Cascaded Convolutional Neural Network (MTCNN).
Facial feature extraction is then performed using the Google FaceNet embedding model.

CHAPTER 3

METHODOLOGY

3.1 PROPOSED SYSTEM

1. The system can train a model on a dataset of images of persons wearing masks and
persons not wearing masks.
2. After training, the system can predict whether a person is wearing a mask or not.
3. It can also access the webcam and predict the result.

3.2 TENSORFLOW FRAMEWORK:

TensorFlow is an open-source software library. It was originally developed by researchers
and engineers working on the Google Brain team within Google's Machine
Intelligence research organization for the purposes of conducting machine learning and
deep neural networks research.

It is an open source framework to run deep learning and other statistical and predictive
analytics workloads. It is a python library that supports many classification and
regression algorithms and more generally deep learning.

TensorFlow is a free and open-source software library for dataflow and differentiable
programming across a range of tasks. It is a symbolic math library, and is also used for
machine learning applications such as neural networks. It is used for both research and
production at Google. TensorFlow is Google Brain's second-generation system.

Version 1.0.0 was released on February 11, 2017. While the reference implementation runs
on single devices, TensorFlow can run on multiple CPUs and GPUs (with optional CUDA
and SYCL extensions for general-purpose computing on graphics processing units).

TensorFlow is available on 64-bit Linux, macOS, Windows, and mobile computing
platforms including Android and iOS. Its flexible architecture allows for the easy
deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from
desktops to clusters of servers to mobile and edge devices.

The name TensorFlow derives from the operations that such neural networks perform on
multidimensional data arrays, which are referred to as tensors.
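
To make the idea of a tensor concrete, the following minimal sketch uses NumPy arrays as stand-ins for tensors. This is an assumption made for illustration only, although TensorFlow's own tensor objects report rank and shape in a comparable way. The array sizes are hypothetical and are not taken from this project.

```python
import numpy as np

# Tensors of increasing rank, represented here as NumPy arrays.
scalar = np.array(0.5)                 # rank 0: a single value
vector = np.array([0.1, 0.9])          # rank 1: e.g. two class scores
image = np.zeros((224, 224, 3))        # rank 3: height x width x colour channels
batch = np.zeros((32, 224, 224, 3))    # rank 4: a batch of 32 such images

for name, t in [("scalar", scalar), ("vector", vector),
                ("image", image), ("batch", batch)]:
    print(f"{name}: rank={t.ndim}, shape={t.shape}")
```

A network that processes images typically consumes rank-4 tensors of this kind, one batch at a time.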

3.3 OPENCV:

It is a cross-platform library with which we can develop real-time computer vision
applications.

It mainly focuses on image processing, video capture and analysis including features like
face detection and object detection.

Currently OpenCV supports a wide variety of programming languages like C++, Python,
Java etc. and is available on different platforms including Windows, Linux, OS X,
Android, iOS etc.

Interfaces based on CUDA and OpenCL are also under active development for
high-speed GPU operations. OpenCV-Python is the Python API of OpenCV.

It combines the best qualities of the OpenCV C++ API and the Python language.

OpenCV (Open-Source Computer Vision Library) is an open source computer vision and
machine learning software library. OpenCV was built to provide a common infrastructure
for computer vision applications and to accelerate the use of machine perception in the
commercial products. Being a BSD-licensed product, OpenCV makes it easy for
businesses to utilize and modify the code.

The library has more than 2500 optimized algorithms, which include a comprehensive set
of both classic and state-of-the-art computer vision and machine learning algorithms.

Algorithms can be used to detect and recognize faces, identify objects, classify human
actions in videos, track camera movements, track moving objects, extract 3D models of
objects, produce 3D point clouds from stereo cameras, stitch images together to produce a
high resolution image of an entire scene, find similar images from an image database,
remove red eyes from images taken using flash, follow eye movements, recognize
scenery and establish markers to overlay it with augmented reality, etc.

3.4 NUMPY:

NumPy is a library for the Python programming language, adding support for large,
multi-dimensional arrays and matrices, along with a large collection of high-level
mathematical functions to operate on these arrays. The ancestor of NumPy, Numeric, was
originally created by Jim Hugunin with contributions from several other developers. In
2005, Travis Oliphant created NumPy by incorporating features of the competing
Numarray into Numeric, with extensive modifications.
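
As a brief illustration of array-level mathematical functions, the sketch below rescales a tiny image and summarises it without writing any explicit loops. The pixel values are hypothetical, and rescaling to the range [0, 1] is shown as a common preprocessing step rather than as the exact step used in this project.

```python
import numpy as np

# A hypothetical 2 x 2 grayscale image with 8-bit pixel intensities.
pixels = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# One vectorised expression rescales every pixel to the range [0, 1].
scaled = pixels.astype(np.float32) / 255.0

# High-level functions operate on whole arrays or along a chosen axis.
row_means = scaled.mean(axis=1)   # one mean per image row
brightest = scaled.max()          # largest value in the whole array
```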

The Python programming language was not initially designed for numerical computing,
but attracted the attention of the scientific and engineering community early on, so that a
special interest group called matrix-sig was founded in 1995 with the aim of defining an
array computing package. Among its members was Python designer and maintainer
Guido van Rossum, who implemented extensions to Python's syntax (in particular the
indexing syntax) to make array computing easier.

An implementation of a matrix package was completed by Jim Fulton, then generalized
by Jim Hugunin to become Numeric, also variously called the Numerical Python extensions
or NumPy. Hugunin, a graduate student at the Massachusetts Institute of Technology (MIT),
joined the Corporation for National Research Initiatives (CNRI) to work on JPython in
1997, leaving Paul Dubois of Lawrence Livermore National Laboratory (LLNL) to take
over as maintainer.

In early 2005, NumPy developer Travis Oliphant wanted to unify the community around
a single array package and ported Numarray's features to Numeric, releasing the result as
NumPy 1.0 in 2006. This new project was part of SciPy. To avoid installing the large
SciPy package just to get an array object, this new package was separated and called
NumPy.

3.5 IPYTHON

What exactly is Python? You may be wondering about that. You may be reading this
because you wish to learn programming but are not familiar with programming languages.
Alternatively, you may be familiar with programming languages such as C, C++, C#, or
Java and wish to learn more about the Python language and how it compares to these
"big name" languages.

3.6 PYTHON CONCEPTS

Python was designed to be an easy-to-use programming language. It frequently uses
English keywords where other languages use punctuation, and it has fewer syntactical
constructions than other languages. Python is a high-level, interpreted, interactive, and
object-oriented language.

●Python is interpreted - The interpreter processes Python code at runtime. You do not
need to compile your program before executing it. This is similar to Perl and PHP.

●Python is interactive - You can sit at a Python prompt and interact with the interpreter
directly to write your programs.

●Python is object-oriented - Python supports the object-oriented style of programming,
which encapsulates code within objects.

●Python is a beginner's language - Python is an excellent language for beginners, as it
allows for the creation of a variety of programs, from simple text applications to web
browsers and games.

3.6.1 Python Features

Python features include:

1. Easy-to-learn - Python has few keywords, a simple structure, and a clearly defined
syntax. This allows the student to learn the language quickly.
2. Easy-to-read - Python code is clearly defined and easy on the eyes.
3. Easy-to-maintain - Python source code is fairly easy to maintain.
4. A broad standard library - The bulk of Python's library is very portable and
cross-platform compatible on UNIX, Windows, and Macintosh.
5. Interactive mode - Python supports an interactive mode that allows interactive
testing and debugging of snippets of code.
6. Portable - Python runs on a wide variety of hardware platforms and has the same
interface on all of them.
7. Extensible - Low-level modules can be added to the Python interpreter. These
modules enable programmers to add to or customize their tools to be more efficient.
8. Databases - Python provides interfaces to all major commercial databases.
9. GUI programming - Python supports GUI applications that can be created and ported
to many system calls, libraries, and windowing systems, such as Windows MFC,
Macintosh, and the X Window System of Unix.
10. Scalable - Python provides better structure and support for large programs than
shell scripting.

Aside from the characteristics stated above, Python offers a long list of useful features,
some of which are described below:

●It supports OOP as well as functional and structured programming methodologies.

●It can be used as a scripting language or compiled into byte-code for large-scale
application development.

●It provides very high-level dynamic data types and supports dynamic type checking.

●It supports automatic garbage collection.

3.6.2 ADVANTAGES/BENEFITS OF PYTHON:

The diverse application of the Python language is a result of the combination of features
which give this language an edge over others. Some of the benefits of programming in
Python include:

1. Presence of Third-Party Modules:

The Python Package Index (PyPI) contains numerous third-party modules that make
Python capable of interacting with most other languages and platforms.

2. Extensive Support Libraries:

Python provides a large standard library which includes areas like internet protocols,
string operations, web services tools and operating system interfaces. Many high-use
programming tasks have already been scripted into the standard library, which
significantly reduces the length of code to be written.

3. Open Source and Community Development:

Python language is developed under an OSI-approved open source license, which makes
it free to use and distribute, including for commercial purposes. Further, its development
is driven by the community which collaborates for its code through hosting conferences
and mailing lists, and provides for its numerous modules.

4. Learning Ease and Support Available:

Python offers excellent readability and uncluttered simple-to-learn syntax which helps
beginners to utilize this programming language. The code style guidelines, PEP 8,
provide a set of rules to facilitate the formatting of code. Additionally, the wide base of
users and active developers has resulted in a rich internet resource bank to encourage
development and the continued adoption of the language.

5. User-friendly Data Structures:

Python has built-in list and dictionary data structures which can be used to construct fast
runtime data structures. Further, Python also provides the option of dynamic high level
data typing which reduces the length of support code that is needed.

6. Productivity and Speed:

Python has a clean object-oriented design, provides enhanced process control capabilities,
and possesses strong integration and text processing capabilities and its own unit testing
framework, all of which contribute to developer speed and productivity. Python is
considered a viable option for building complex multi-protocol network applications.

As can be seen from the above-mentioned points, Python offers a number of advantages
for software development. As the language continues to be upgraded, its base of loyal
users could grow as well.

Python has five standard data types:

●Numbers
●String
●List
●Tuple
●Dictionary

3.6.3 Python Numbers

Number data types store numeric values. A number object is created when you assign a
numeric value to a variable.

3.6.4 Python Strings

In Python, a string is defined as a contiguous set of characters enclosed in
quotation marks. Python allows either pairs of single quotes or pairs of double quotes.
The slice operator ([] and [:]) can be used to extract subsets of strings, with indexes
starting at 0 at the start of the string and counting back from -1 at the end.
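
The indexing and slicing rules above can be sketched with a short example. The string value is chosen only for illustration.

```python
text = "face mask"

first_char = text[0]      # 'f'    (indexes start at 0)
last_char = text[-1]      # 'k'    (-1 refers to the final character)
first_word = text[0:4]    # 'face' (the end index is excluded)
second_word = text[5:]    # 'mask' (omitting the end runs to the end of the string)

# Single and double quotes produce the same string.
same = ('mask' == "mask")
```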

3.6.5 Python Lists

Lists are the most versatile of Python's compound data types. A list contains items
separated by commas and enclosed in square brackets ([]). Lists are similar to arrays in
C in some ways. One difference between them is that the items in a list can be of
different data types. The slice operator ([] and [:]) can be used to retrieve values
stored in a list, with indexes starting at 0 at the beginning of the list and working
their way to the end of the list. The concatenation operator for lists is the plus sign
(+), while the repetition operator is the asterisk (*).
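
A short sketch of these list operations follows. The label names are hypothetical and serve only as sample data.

```python
labels = ["with_mask", "without_mask"]
mixed = ["frame", 42, 0.98, True]      # items may have different types

first = labels[0]                      # 'with_mask'
tail = mixed[1:3]                      # [42, 0.98]

combined = labels + ["unknown"]        # concatenation with +
repeated = [0] * 3                     # repetition with *

labels.append("incorrect_mask")        # lists can grow in place
```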

3.6.6 Python Tuples

A tuple is another sequence data type that is similar to a list. A tuple consists of a
number of values separated by commas. Unlike lists, however, tuples are enclosed in
parentheses. Lists are enclosed in square brackets ([]), and their elements and size can
be changed, whereas tuples are enclosed in parentheses (()) and cannot be updated.
Tuples can be thought of as read-only lists.
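
The difference between a read-only tuple and a changeable list can be sketched as follows. The bounding-box values are hypothetical.

```python
box = (10, 20, 64, 64)     # hypothetical face box: x, y, width, height
x, y, w, h = box           # tuples unpack like any other sequence

try:
    box[0] = 15            # tuples cannot be updated in place
    modified = True
except TypeError:
    modified = False

box_list = list(box)       # a list copy can be changed instead
box_list[0] = 15
```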

3.7 KERAS

Keras is an API designed for human beings, not machines. Keras follows best practices
for reducing cognitive load: it offers consistent & simple APIs, it minimizes the number
of user actions required for common use cases, and it provides clear & actionable error
messages.

It also has extensive documentation and developer guides. Keras contains numerous
implementations of commonly used neural network building blocks such as layers,
objectives, activation functions, optimizers, and a host of tools to make working with
image and text data easier to simplify the coding necessary for writing deep neural
network code.

The code is hosted on GitHub, and community support forums include the GitHub issues
page, and a Slack channel. Keras is a minimalist Python library for deep learning that can
run on top of Theano or TensorFlow. It was developed to make implementing deep
learning models as fast and easy as possible for research and development.

FOUR PRINCIPLES:

Modularity: A model can be understood as a sequence or a graph alone. All the concerns
of a deep learning model are discrete components that can be combined in arbitrary ways.

Minimalism: The library provides just enough to achieve an outcome, no frills and
maximizing readability.

Extensibility: New components are intentionally easy to add and use within the
framework, intended for researchers to trial and explore new ideas.

Python: No separate model files with custom file formats. Everything is native Python.
Keras is designed for minimalism and modularity allowing you to very quickly define
deep learning models and run them on top of a Theano or TensorFlow backend.

3.8 Machine Learning approaches:

1. Viola–Jones object detection framework based on HAAR features

2. Scale-invariant feature transform (SIFT)

3. Histogram of oriented gradients (HOG) features

Machine learning (ML) is the study of computer algorithms that improve automatically
through experience. It is seen as a subset of artificial intelligence. Machine learning
algorithms build a mathematical model based on sample data, known as "training data",
in order to make predictions or decisions without being explicitly programmed to do so.

Machine learning algorithms are used in a wide variety of applications, such as email
filtering and computer vision, where it is difficult or infeasible to develop conventional
algorithms to perform the needed tasks.

Machine learning is closely related to computational statistics, which focuses on making
predictions using computers. The study of mathematical optimization delivers methods,
theory and application domains to the field of machine learning.

Data mining is a related field of study, focusing on exploratory data analysis through
unsupervised learning. In its application across business problems, machine learning is
also referred to as predictive analytics.

Approaches for Machine Learning:

3.8.1 Viola-Jones Object detection framework based on HAAR features:

The Viola-Jones algorithm is one of the most popular algorithms for object detection in
an image. Prior work on parametric optimization of the Viola-Jones algorithm, aimed at
achieving maximum efficiency in specific environmental conditions, reports that with
additional modifications it is possible to increase the speed of the algorithm on a
particular image by 2-5 times, with a loss of accuracy and completeness of no more
than 3-5%.
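
Much of the speed of the Viola-Jones framework comes from evaluating HAAR-like features on an integral image, where any rectangle sum costs four lookups. The following is a minimal NumPy sketch of that idea, not the implementation used in this project. The function names and the toy image are hypothetical.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero first row and column."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_feature(ii, top, left, height, width):
    """Left-half sum minus right-half sum (a vertical-edge HAAR-like feature)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return int(left_sum - right_sum)

# Toy 4 x 4 image: bright left half, dark right half.
img = np.array([[9, 9, 1, 1]] * 4)
ii = integral_image(img)
feature = two_rect_feature(ii, 0, 0, 4, 4)
```

A large positive value indicates a bright-to-dark vertical edge inside the window. A cascade combines many such features, rejecting most non-face windows after only a few of them.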

3.8.2 Scale-invariant feature transform (SIFT):

The scale-invariant feature transform (SIFT) is a feature detection algorithm in computer
vision to detect and describe local features in images. SIFT key points of objects are first
extracted from a set of reference images and stored in a database. An object is recognized
in a new image by individually comparing each feature from the new image to this
database and finding candidate matching features based on Euclidean distance of their
feature vectors.
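
The matching step described above can be sketched as a nearest-neighbour search under Euclidean distance. This is a simplified illustration: the `match_descriptors` helper and its `max_distance` threshold are hypothetical, and the toy descriptors are 2-dimensional for readability, whereas standard SIFT descriptors have 128 dimensions.

```python
import numpy as np

def match_descriptors(new_desc, db_desc, max_distance):
    """For each new descriptor, find the closest database descriptor.

    Returns (new_index, db_index, distance) for pairs closer than max_distance.
    """
    # Pairwise Euclidean distances, shape (len(new_desc), len(db_desc)).
    diffs = new_desc[:, None, :] - db_desc[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))

    matches = []
    for i, row in enumerate(dists):
        j = int(row.argmin())
        if row[j] < max_distance:
            matches.append((i, j, float(row[j])))
    return matches

# Toy 2-D descriptors standing in for real feature vectors.
database = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
new_image = np.array([[0.9, 1.1], [9.0, 9.0]])
matches = match_descriptors(new_image, database, max_distance=0.5)
```

Here the first new descriptor matches database entry 1, while the second has no sufficiently close neighbour and is discarded as a candidate.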

From the full set of matches, subsets of key points that agree on the object and its
location, scale, and orientation in the new image are identified to filter out good matches.
The determination of consistent clusters is performed rapidly by using an efficient hash
table implementation of the generalized Hough transform. Each cluster of 3 or more
features that agree on an object and its pose is then subject to further detailed model
verification, and subsequently outliers are discarded. Finally, the probability that a
particular set of features indicates the presence of an object is computed, given the
accuracy of fit and number of probable false matches. Object matches that pass all these
tests can be identified as correct with high confidence.

3.8.3 Histogram of oriented gradients (HOG):

The histogram of oriented gradients (HOG) is a feature descriptor used in computer
vision and image processing for the purpose of object detection. The technique counts
occurrences of gradient orientation in localised portions of an image.
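
As a rough sketch of counting gradient orientations in one localised portion (a "cell") of an image, the example below builds a single orientation histogram. The choice of 9 bins, unsigned orientations in the range 0 to 180 degrees, and magnitude-weighted votes are common conventions assumed here for illustration. A full HOG descriptor would repeat this over many cells and normalise the results.

```python
import numpy as np

def cell_histogram(cell, bins=9):
    """Histogram of unsigned gradient orientations (0-180 degrees) for one cell,
    with each pixel's vote weighted by its gradient magnitude."""
    gy, gx = np.gradient(cell.astype(float))   # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    orientation = np.degrees(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(orientation, bins=bins, range=(0.0, 180.0),
                           weights=magnitude)
    return hist

# Toy 8 x 8 cell whose intensity increases from left to right,
# so every gradient points horizontally (orientation 0 degrees).
cell = np.tile(np.arange(8), (8, 1))
hist = cell_histogram(cell)
```

All of the gradient energy of this toy cell falls into the first bin, which is what a purely horizontal intensity ramp should produce.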

This method is similar to that of edge orientation histograms, scale-invariant feature
transform descriptors, and shape contexts, but differs in that it is computed on a
dense grid of uniformly spaced cells and uses overlapping local contrast
normalization for improved accuracy.

The algorithm used in the proposed system is the HAAR feature-based cascade classifier.

3.9 HAAR Feature-Based Cascade Classifiers

It is an object detection algorithm used to identify faces in an image or a real-time
video.

It is an effective approach to object detection. In this approach, a large number of
positive and negative images are used to train the classifier. A model pre-trained on
frontal face features is used in this experiment to detect faces in real time.
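
A pre-trained frontal-face cascade of this kind ships with the opencv-python package, and a minimal sketch of using it is given below. This is an illustration under stated assumptions rather than the project's actual code: the helper names are hypothetical, the `scaleFactor` and `minNeighbors` values are typical defaults rather than tuned settings, and the per-face mask decision (`mask_flags`) is assumed to come from a separate classifier.

```python
GREEN = (0, 255, 0)   # BGR colour for "mask"
RED = (0, 0, 255)     # BGR colour for "no mask"

def box_color(has_mask):
    """Choose the rectangle colour for a detected face."""
    return GREEN if has_mask else RED

def detect_faces(frame):
    """Return (x, y, w, h) boxes for frontal faces in a BGR frame.

    Requires the opencv-python package, imported lazily so that
    box_color can be used without it.
    """
    import cv2
    cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    cascade = cv2.CascadeClassifier(cascade_path)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

def annotate(frame, boxes, mask_flags):
    """Draw one coloured rectangle per detected face."""
    import cv2
    for box, has_mask in zip(boxes, mask_flags):
        x, y, w, h = (int(v) for v in box)
        cv2.rectangle(frame, (x, y), (x + w, y + h), box_color(has_mask), 2)
    return frame
```

In a real-time loop the cascade would normally be loaded once and reused for every frame. It is reloaded inside `detect_faces` here only to keep the sketch short.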

3.10 DEEP LEARNING

1. Deep learning is an AI function that mimics the workings of the human brain in
processing data for use in detecting objects, recognizing speech, translating languages,
and making decisions.

2. Deep learning AI is able to learn without human supervision, drawing from data that is
both unstructured and unlabeled.

3. In this project, face mask detection is built using a deep learning technique called
Convolutional Neural Networks (CNN).

Deep learning methods aim at learning feature hierarchies with features from higher
levels of the hierarchy formed by the composition of lower-level features. Automatically
learning features at multiple levels of abstraction allow a system to learn complex
functions mapping the input to the output directly from data, without depending
completely on human-crafted features. Deep learning algorithms seek to exploit the
unknown structure in the input distribution in order to discover good representations,
often at multiple levels, with higher-level learned features defined in terms of lower-level
features.

3.11 NEURAL NETWORKS VERSUS CONVENTIONAL COMPUTERS:

Neural networks take a different approach to problem solving than that of conventional
computers. Conventional computers use an algorithmic approach i.e the computer follows
a set of instructions in order to solve a problem.

Unless the specific steps that the computer needs to follow are known the computer
cannot solve the problem. That restricts the problem-solving capability of conventional
computers to problems that we already understand and know how to solve. But computers
would be so much more useful if they could do things that we don't exactly know how to
do.

Neural networks process information in a similar way the human brain does. The network
is composed of a large number of highly interconnected processing elements (neurons)
working in parallel to solve a specific problem.

Neural networks learn by example. They cannot be programmed to perform a specific
task. The examples must be selected carefully; otherwise useful time is wasted or, even
worse, the network might function incorrectly. The disadvantage is that because the
network finds out how to solve the problem by itself, its operation can be unpredictable.

On the other hand, conventional computers use a cognitive approach to problem solving;
the way the problem is solved must be known and stated in small unambiguous
instructions. These instructions are then converted to a high-level language program and
then into machine code that the computer can understand. These machines are totally
predictable; if anything goes wrong, it is due to a software or hardware fault.

Neural networks and conventional algorithmic computers are not in competition but
complement each other. There are tasks more suited to an algorithmic approach like
arithmetic operations and tasks that are more suited to neural networks. Even more, a
large number of tasks require systems that use a combination of the two approaches
(normally a conventional computer is used to supervise the neural network) in order to
perform at maximum efficiency.

3.12 ARCHITECTURE OF NEURAL NETWORKS:

3.12.1 FEED-FORWARD NETWORKS:

Feed-forward ANNs allow signals to travel one way only: from input to output. There is
no feedback (loops), i.e., the output of any layer does not affect that same layer.
Feed-forward ANNs tend to be straightforward networks that associate inputs with
outputs. They are extensively used in pattern recognition. This type of organization is also
referred to as bottom-up or top-down.

3.12.2 FEEDBACK NETWORKS:

Feedback networks can have signals travelling in both directions by introducing loops in
the network. Feedback networks are very powerful and can get extremely complicated.
Feedback networks are dynamic; their state changes continuously until they reach an
equilibrium point.

They remain at the equilibrium point until the input changes and a new equilibrium needs
to be found. Feedback architectures are also referred to as interactive or recurrent,
although the latter term is often used to denote feedback connections in single-layer
organization.
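
The settling behaviour described above can be illustrated with a tiny recurrent update. This is only a minimal sketch and not part of the project code: the weight matrix `W`, the inputs, the tolerance and the function name `settle` are illustrative assumptions. The state is fed back through the network repeatedly until it stops changing, and a new input leads to a new equilibrium.

```python
import numpy as np

def settle(W, u, tol=1e-8, max_steps=1000):
    """Iterate x <- tanh(W @ x + u) until the state stops changing."""
    x = np.zeros_like(u, dtype=float)
    for step in range(1, max_steps + 1):
        x_next = np.tanh(W @ x + u)
        if np.max(np.abs(x_next - x)) < tol:
            return x_next, step
        x = x_next
    return x, max_steps

# Small feedback weights (assumed values) so that the iteration converges.
W = np.array([[0.2, -0.1],
              [0.1,  0.3]])
u = np.array([0.5, -0.2])

state, steps = settle(W, u)                      # equilibrium for the first input
new_state, _ = settle(W, np.array([1.0, 0.4]))   # a changed input gives a new equilibrium
```

With larger feedback weights the iteration may oscillate or fail to settle, which is one reason feedback networks can become complicated.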

3.12.3 NETWORK LAYERS:

The commonest type of artificial neural network consists of three groups, or layers, of
units: a layer of input units is connected to a layer of hidden units, which is connected to
a layer of output units.

The activity of the input units represents the raw information that is fed into the network.

The activity of each hidden unit is determined by the activities of the input units and the
weights on the connections between the input and the hidden units.

The behaviour of the output units depends on the activity of the hidden units and the
weights between the hidden and output units.

This simple type of network is interesting because the hidden units are free to construct
their own representations of the input. The weights between the input and hidden units
determine when each hidden unit is active, and so by modifying these weights, a hidden
unit can choose what it represents.
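
The three-layer arrangement above can be sketched as a short forward pass. This is a hedged illustration rather than the project's model: the layer sizes, the random weights and the choice of a sigmoid activation are assumptions made only for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W_hidden, b_hidden, W_out, b_out):
    """Forward pass through an input -> hidden -> output network."""
    # Hidden activity depends on the inputs and the input-to-hidden weights.
    hidden = sigmoid(W_hidden @ x + b_hidden)
    # Output behaviour depends on the hidden activity and the hidden-to-output weights.
    output = sigmoid(W_out @ hidden + b_out)
    return hidden, output

rng = np.random.default_rng(0)
x = np.array([0.5, -1.0, 2.0])        # 3 input units (raw information)
W_hidden = rng.normal(size=(4, 3))    # weights into 4 hidden units
b_hidden = np.zeros(4)
W_out = rng.normal(size=(2, 4))       # weights into 2 output units
b_out = np.zeros(2)

hidden, output = forward(x, W_hidden, b_hidden, W_out, b_out)
```

Changing `W_hidden` changes when each hidden unit is active, which is the sense in which a hidden unit "chooses" what it represents.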

We also distinguish single-layer and multi-layer architectures. The single-layer organization,
in which all units are connected to one another, constitutes the most general case and has
more potential computational power than hierarchically structured multi-layer
organizations. In multi-layer networks, units are often numbered by layer, instead of
following a global numbering.

3.13 CONVOLUTIONAL NEURAL NETWORK

A convolutional neural network (CNN) is a special architecture of artificial neural network
proposed by Yann LeCun in the late 1980s. One of the most popular uses of the architecture is
image classification. CNNs have wide applications in image and video recognition,
recommender systems and natural language processing. In this project, the example
is taken from computer vision. However, the basic concept remains
the same and can be applied to any other use case.

CNNs, like other neural networks, are made up of neurons with learnable weights and biases.
Each neuron receives several inputs, takes a weighted sum over them, passes it through an
activation function and responds with an output. The whole network has a loss function,
and all the tips and tricks developed for neural networks still apply to CNNs. In
more detail, the image is passed through a series of convolutional, nonlinear, pooling
and fully connected layers, which then generate the output.

In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of deep,
feed-forward artificial neural networks, most commonly applied to analyzing visual
imagery.

Convolutional networks were inspired by biological processes in that the connectivity
pattern between neurons resembles the organization of the visual cortex. CNNs use
relatively little pre-processing compared to other image classification algorithms.

A CNN is a special kind of multi-layer neural network applied to 2-D arrays (usually images), based
on spatially localized neural input. CNNs generate ‘patterns of patterns’ for pattern
recognition.

Each layer combines patches from previous layers. Convolutional networks are trainable
multistage architectures. The input and output of each stage are
sets of arrays called feature maps. At the output, each feature map represents a particular
feature extracted at all locations on the input.

Each stage is composed of: a filter bank layer, a non-linearity layer, and a feature pooling
layer. A ConvNet is composed of 1, 2 or 3 such 3-layer stages, followed by a
classification module.

Basic structure of CNN, where C1, C3 are convolution layers and S2, S4 are
pooled/sampled layers.

Filter: A trainable filter (kernel) in the filter bank connects an input feature map to an output
feature map. Convolutional layers apply a convolution operation to the input, passing the
result to the next layer. The convolution emulates the response of an individual neuron to
visual stimuli.

3.14 CONVOLUTIONAL LAYER

The convolutional layer always comes first. The image (a matrix of pixel values) is fed into it.
Imagine that the reading of the input matrix begins at the top left of the image. The software
then selects a smaller matrix there, which is called a filter. The filter then performs
convolution, i.e., it moves along the input image.

The filter's task is to multiply its values by the original pixel values. All these
multiplications are summed up, and one number is obtained at the end. Since the filter has
read the image only in the upper left corner, it moves one unit to the right and performs
a similar operation. After the filter has passed across all positions, a matrix is obtained
that is smaller than the input matrix.

This operation, from a human perspective, is analogous to identifying boundaries and
simple colors in the image. But in order to recognize the object in the image, the whole
network is needed. The network consists of several convolutional layers mixed with nonlinear
and pooling layers.

Convolution is the first layer to extract features from an input image. Convolution
preserves the relationship between pixels by learning image features using small squares
of input data.

It is a mathematical operation that takes two inputs: an image matrix and a filter
(kernel).

● An image matrix of dimension (h x w x d)

● A filter of dimension (fh x fw x d)

● An output volume of dimension (h - fh + 1) x (w - fw + 1) x 1
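
The sliding multiply-and-sum operation and the output size given above can be sketched in a few lines of NumPy. This is an illustrative sketch assuming stride 1 and no padding; the function name `conv2d_valid` and the sample values are assumptions. As in most deep learning libraries, the kernel is not flipped.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    h, w, d = image.shape
    fh, fw, fd = kernel.shape
    assert d == fd, "filter depth must match image depth"
    out = np.zeros((h - fh + 1, w - fw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + fh, j:j + fw, :]   # region currently under the filter
            out[i, j] = np.sum(patch * kernel)     # multiply element-wise, then sum
    return out

image = np.arange(5 * 5 * 1, dtype=float).reshape(5, 5, 1)   # h=5, w=5, d=1
kernel = np.ones((3, 3, 1))                                  # fh=3, fw=3
feature_map = conv2d_valid(image, kernel)                    # shape (5-3+1, 5-3+1) = (3, 3)
```

A 5 x 5 input with a 3 x 3 filter therefore gives a 3 x 3 feature map, smaller than the input, as stated above.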

3.15 CNN MODEL

1. This CNN model is built using the TensorFlow framework and the OpenCV library,
which is widely used for real-time applications.

2. This model can also be used to develop full-fledged software that scans every person
before they enter a public gathering.

3.15.1 LAYERS IN CNN MODEL

● Conv2D
● MaxPooling2D
● Flatten()
● Dropout
● Dense

1. Conv2D Layer: It has 100 filters and the activation function used is ‘ReLU’.
ReLU stands for Rectified Linear Unit, which outputs the input
directly if it is positive; otherwise it outputs zero.
2. MaxPooling2D: It is used with a pool size (filter size) of 2x2.
3. Flatten() Layer: It is used to flatten the feature maps into a single 1D vector.
4. Dropout Layer: It is used to prevent the model from overfitting.
5. Dense Layer: The activation function here is softmax, which outputs a vector
of two probabilities, one for each class.
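
The operations named in the list above can be written out by hand to show what each one computes. This is a minimal NumPy sketch, not the Keras implementation; the function names and the sample feature map are assumptions made for illustration.

```python
import numpy as np

def relu(x):
    """Pass positive values through unchanged and replace the rest with zero."""
    return np.maximum(x, 0)

def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]   # drop an odd last row/column
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

fm = np.array([[ 1., -2.,  3.,  0.],
               [-1.,  5., -3.,  2.],
               [ 0.,  1., -4., -1.],
               [ 2., -6.,  7.,  1.]])

activated = relu(fm)                    # negative entries become 0
pooled = max_pool_2x2(activated)        # 4x4 -> 2x2
flat = pooled.flatten()                 # 2x2 -> vector of length 4
probs = softmax(np.array([2.0, 0.5]))   # two class scores -> two probabilities
```

Dropout is omitted here because it only acts during training, where it randomly zeroes a fraction of the activations.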

3.16 SYSTEM ARCHITECTURE:

1. Data Visualization.
2. Data Augmentation.
3. Splitting the data.
4. Labeling the Information.
5. Importing Face detection.
6. Detecting the Faces with and without Masks.

Data Visualization

In the first step, let us visualize the total number of images in our dataset in both
categories. We can see that there are 690 images in the ‘yes’ class and 686 images in the
‘no’ class.
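
Counting the images per class can be sketched as follows. This is an assumed layout, with one sub-folder per class, and the helper name `count_images` is hypothetical; the demonstration runs on a throwaway folder rather than on the real dataset.

```python
from pathlib import Path
import tempfile

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def count_images(dataset_dir):
    """Return {class_name: number of image files} for each sub-folder."""
    counts = {}
    for class_dir in sorted(Path(dataset_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts

# Demonstration on a temporary folder; the real dataset path would replace it.
with tempfile.TemporaryDirectory() as tmp:
    for name, n in (("yes", 3), ("no", 2)):
        folder = Path(tmp) / name
        folder.mkdir()
        for i in range(n):
            (folder / f"{i}.jpg").touch()
    demo_counts = count_images(tmp)
```

Run on the project dataset, the same function would be expected to report the per-class totals quoted above.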

Data Augmentation

In the next step, we augment our dataset to include more images for our training. In this
step of data augmentation, we rotate and flip each of the images in our dataset.
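
Rotating and flipping can be sketched directly on an image array. The report does not state the rotation angles used, so the 90-degree and 180-degree rotations below are assumptions, as is the helper name `augment`.

```python
import numpy as np

def augment(img):
    """Return rotated and flipped copies of one image array (H x W x C)."""
    return [
        np.rot90(img, k=1),   # rotate 90 degrees counter-clockwise
        np.rot90(img, k=2),   # rotate 180 degrees
        np.fliplr(img),       # mirror left-right
        np.flipud(img),       # mirror top-bottom
    ]

img = np.arange(2 * 3 * 3).reshape(2, 3, 3)   # a tiny 2x3 "image" with 3 channels
augmented = augment(img)                      # four extra training samples from one image
```

Each original image yields several variants, so the training set grows without collecting new photographs.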

Splitting the data

In this step, we split our data into the training set which will contain the images on which
the CNN model will be trained and the test set with the images on which our model will
be tested.
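
A shuffled split can be sketched with the standard library alone. The 80/20 ratio, the seed and the file names below are assumptions; the report does not state the proportions used.

```python
import random

def train_test_split(items, test_fraction=0.2, seed=42):
    """Shuffle the items and return (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split reproducible
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

filenames = [f"img_{i}.jpg" for i in range(10)]   # placeholder file names
train_files, test_files = train_test_split(filenames)
```

Keeping the two sets disjoint matters: images seen during training must not appear in the test set, or the measured accuracy will be too optimistic.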

Building the Model

In the next step, we build our Sequential CNN model with various layers such as
Conv2D, MaxPooling2D, Flatten, Dropout and Dense.

Pre-Training the CNN model

After building our model, let us create the ‘train_generator’ and ‘validation_generator’ to
fit them to our model in the next step.

Training the CNN model

This is the main step, where we fit the images in the training set and the test set to the
Sequential model we built using the Keras library. We have trained the model for 10 epochs
(complete passes over the training set). We can train for more epochs to attain higher
accuracy, provided that over-fitting does not occur.
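
One common way to decide how many epochs are enough is early stopping: watch the validation loss and stop once it no longer improves. The report does not say this technique was used; the sketch below, with its hypothetical `best_epoch` helper, patience value and made-up loss values, only illustrates the idea.

```python
def best_epoch(val_losses, patience=2):
    """Return the 1-based epoch with the lowest validation loss, ending the
    scan once the loss has failed to improve for `patience` epochs in a row."""
    best_loss = float("inf")
    best = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break   # validation loss is rising: likely over-fitting
    return best

val_losses = [0.62, 0.48, 0.41, 0.39, 0.42, 0.45, 0.50]   # made-up values
stop_at = best_epoch(val_losses)
```

When the validation loss starts rising while the training loss keeps falling, further epochs are fitting noise rather than improving the model.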

Labeling the Information

After building the model, we label the two possible results [‘0’ as ‘without_mask’
and ‘1’ as ‘with_mask’]. We also set the bounding rectangle color using
color values (OpenCV expects them in BGR order).
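
The label and colour lookup can be sketched as two small dictionaries. The class indices follow the labelling stated above, red and green are the colours the report uses for the two outcomes, and the names `LABELS`, `COLORS_BGR` and `describe` are assumptions for illustration.

```python
# Class index -> label, following the labelling described in the text.
LABELS = {0: "without_mask", 1: "with_mask"}

# OpenCV colours are given as (blue, green, red).
COLORS_BGR = {0: (0, 0, 255),   # red rectangle when no mask is detected
              1: (0, 255, 0)}   # green rectangle when a mask is detected

def describe(class_index):
    """Return the text label and rectangle colour for a predicted class."""
    return LABELS[class_index], COLORS_BGR[class_index]

label, color = describe(1)
```

Keeping the mapping in one place avoids mixing up the class order, which depends on how the training folders were indexed.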

Importing the Face detection Program

After this, we intend to use the model to detect whether we are wearing a face mask using our
PC’s webcam. For this, we first need to implement face detection. We used Haar
feature-based cascade classifiers to detect the features of the face.

Detecting the Faces with and without Masks

In the last step, we use the OpenCV library to run an infinite loop that reads frames from
the web camera and detects faces in each frame using the cascade classifier.

CHAPTER 4

DESIGN

4.1 UML Diagrams:

A UML diagram is a partial graphical representation (view) of a model of a system under
design, implementation, or already in existence. The UML diagram contains graphical
elements (symbols) - UML nodes connected with edges (also known as paths or flows) -
that represent elements in the UML model of the designed system. The UML model of
the system might also contain other documentation such as use cases written as templated
texts.

The kind of the diagram is defined by the primary graphical symbols shown on the
diagram. For example, a diagram where the primary symbols in the contents area are
classes is a class diagram. A diagram which shows use cases and actors is a use case
diagram. A sequence diagram shows sequence of message exchanges between lifelines.

UML specification does not preclude mixing of different kinds of diagrams, e.g. to
combine structural and behavioral elements to show a state machine nested inside a use
case. Consequently, the boundaries between the various kinds of diagrams are not strictly
enforced. At the same time, some UML Tools do restrict the set of available graphical
elements which could be used when working on a specific type of diagram.

UML specification defines two major kinds of UML diagrams: structure diagrams and
behavior diagrams.

Structure diagrams show the static structure of the system and its parts on different
abstraction and implementation levels and how they are related to each other. The
elements in a structure diagram represent the meaningful concepts of a system, and may
include abstract, real world and implementation concepts.

Behavior diagrams show the dynamic behavior of the objects in a system, which can be
described as a series of changes to the system over time.

4.1.1 Use Case Diagram

In the Unified Modeling Language (UML), a use case diagram can summarize the details
of your system's users (also known as actors) and their interactions with the system. To
build one, you'll use a set of specialized symbols and connectors. An effective use case
diagram can help your team discuss and represent:

● Scenarios in which your system or application interacts with people, organizations, or
external systems.

● Goals that your system or application helps those entities (known as actors) achieve.

● The scope of your system

4.1.2 Activity Diagram

An activity diagram is a behavioral diagram i.e., it depicts the behavior of a system.

An activity diagram portrays the control flow from a start point to a finish point showing
the various decision paths that exist while the activity is being executed.

4.1.3 BLOCK DIAGRAM

A block diagram is a graphical representation of a system – it provides a functional view
of a system. Block diagrams give us a better understanding of a system’s functions and
help create interconnections within it.

They are used to describe hardware and software systems as well as to represent
processes.

4.1.4 FLOWCHART DIAGRAM

CHAPTER 5

EXPERIMENT ANALYSIS

5.1 MODULES

We have four modules:

Datasets Collecting: We collect datasets of faces with and without masks.
Accuracy generally improves as more images are collected.

Datasets Extracting: We extract features from the mask and no-mask sets using
MobileNetV2.

Models Training: We train the model using OpenCV and Keras (a Python library).

Facemask Detection: We can run detection on pre-processed images and also on live
video or a webcam feed. If the person is wearing a mask, the rectangle around the face will be
green; otherwise it will be red.

5.2 DATASETS

● Datasets are collected from Kaggle.
● The two datasets are composed of faces of various people from different parts of the
world and of all age groups.
● One dataset consists of faces of people with masks.
● The other dataset consists of faces of people without masks.

DATASET WITH FACEMASK:

DATASET WITHOUT FACEMASK:

5.3 CODE IMPLEMENTATION

import datetime

import cv2
import numpy as np
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Sequential, load_model
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator

# TRAINING THE CNN FROM SCRATCH
# BUILDING MODEL TO CLASSIFY BETWEEN MASK AND NO MASK
model = Sequential()
# Three convolution + pooling stages extract features from 150x150 colour images
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)))
model.add(MaxPooling2D())
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D())
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D())
# Classification head: flatten, one hidden layer, one sigmoid output unit
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Augment the training images; the test images are only rescaled
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)

# Read images from the 'train' and 'test' folders (one sub-folder per class)
training_set = train_datagen.flow_from_directory(
    'train',
    target_size=(150, 150),
    batch_size=16,
    class_mode='binary')
test_set = test_datagen.flow_from_directory(
    'test',
    target_size=(150, 150),
    batch_size=16,
    class_mode='binary')
# model.fit accepts generators directly; fit_generator is deprecated in newer Keras releases
model_saved = model.fit(
    training_set,
    epochs=10,
    validation_data=test_set)
model.save('mymodel.h5')

# To test for individual images
mymodel = load_model('mymodel.h5')
# test_image = image.load_img(r'D:\FaceMaskDetector-master/ML Datasets/Face Mask Detection/Dataset/test/without_mask/30.jpg', target_size=(150, 150))
test_image = image.load_img(r'D:\FaceMaskDetector-master/test/with_mask/1-with-mask.jpg',
                            target_size=(150, 150))
# Rescale to [0, 1] to match the 1/255 rescaling used during training
test_image = image.img_to_array(test_image) / 255.0
test_image = np.expand_dims(test_image, axis=0)  # add the batch dimension
mymodel.predict(test_image)[0][0]

# IMPLEMENTING LIVE DETECTION OF FACE MASK

mymodel = load_model('mymodel.h5')
cap = cv2.VideoCapture(0)
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
while cap.isOpened():
    ret, img = cap.read()
    if not ret:  # stop if no frame could be read from the camera
        break
    face = face_cascade.detectMultiScale(img, scaleFactor=1.1, minNeighbors=4)
    for (x, y, w, h) in face:
        # Crop the detected face and prepare it the same way as the training images
        face_img = img[y:y+h, x:x+w]
        cv2.imwrite('temp.jpg', face_img)
        test_image = image.load_img('temp.jpg', target_size=(150, 150))
        test_image = image.img_to_array(test_image) / 255.0
        test_image = np.expand_dims(test_image, axis=0)
        pred = mymodel.predict(test_image)[0][0]
        # The sigmoid output is a probability, so threshold it instead of testing for exactly 1
        if pred > 0.5:
            cv2.rectangle(img, (x, y), (x+w, y+h), (0, 0, 255), 3)
            cv2.putText(img, 'NO MASK', ((x+w)//2, y+h+20),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 3)
        else:
            cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 3)
            cv2.putText(img, 'MASK', ((x+w)//2, y+h+20),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 3)
    # Stamp the current date and time on the frame
    datet = str(datetime.datetime.now())
    cv2.putText(img, datet, (400, 450), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)
    cv2.imshow('img', img)
    if cv2.waitKey(1) == ord('q'):  # press 'q' to quit
        break
cap.release()
cv2.destroyAllWindows()

5.4 FUNCTIONAL REQUIREMENTS

The primary purpose of computer results is to deliver processing results to users. They
are also employed to maintain a permanent record of the results for future use. In
general, results are of the following types:

● External results are those that are exported outside the company.
● Internal results, which are the main user and computer display and have a place
within the organization.
● Operating results used only by the computer department.
● User-interface results that allow the user to communicate directly with the
system.
● Understanding the user's preferences, the level of technology and the needs of
his or her business through a friendly questionnaire.

5.5 NON-FUNCTIONAL REQUIREMENTS

5.5.1 SYSTEM CONFIGURATION

This project can run on commodity hardware. We ran the entire project on an Intel
i5 processor with 8 GB of RAM and a 2 GB Nvidia graphics processor. The processor has 2 cores,
which run at 1.7 GHz and 2.1 GHz respectively. The first part is the training phase, which
takes 10-15 minutes, and the second part is the testing phase, which takes only a
few seconds to make predictions and calculate accuracy.

5.5.2 HARDWARE REQUIREMENTS

● RAM: 4 GB
● Storage: 1 TB
● CPU: 2 GHz or faster
● Architecture: 32-bit or 64-bit

5.5.3 SOFTWARE REQUIREMENTS

● Python 3.5 in Google Colab is used for data pre-processing, model training
and prediction
● Operating System: Windows 7 and above, a Linux-based OS, or macOS
● Coding Language: Python

5.6 INPUT AND OUTPUT

REAL TIME INPUT/OUTPUT:

CHAPTER 6

6.1 SUMMARY
● Collected the datasets of people with and without facemask.
● Trained the facemask classifier using Keras/TensorFlow.
● Created a model using the datasets.
● The built model is loaded.
● Cameras are switched on.
● CV2 detects faces, faces are extracted.
● People with or without facemasks are detected.

6.2 FUTURE SCOPE


● Manually checking whether people are wearing masks is very difficult for officers.
Our technique therefore uses a webcam to detect people's faces and help prevent
virus transmission.
● It can be implemented in ATMs, banks, etc.
● This real-time system can be used in public places such as bus stands, airports and railway
stations to detect travelers without masks.
● It is fast, accurate and cost-effective.
● The system can help protect people from virus transmission and hence
acts as a life saver.

6.3 CONCLUSION

As technology advances with emerging trends, the availability of new face mask
detectors can contribute to public healthcare. The architecture uses
MobileNet as the backbone; it can be used in both high- and low-computation scenarios. In
order to extract more robust features, we utilize transfer learning to adopt weights from a
similar task, face detection, which is trained on a very large dataset.

We used OpenCV, TensorFlow and a neural network to detect whether people were wearing face
masks or not. The models were tested with images and real-time video streams. The
model's accuracy has been established, and its optimization is a continuous
process; we are working towards a highly accurate solution by tuning the hyperparameters.
This specific model could be used as a use case for edge analytics.

Furthermore, the proposed method achieves state-of-the-art results on a public face mask
dataset. A face mask detection system that can detect whether a person is
wearing a face mask and allow their entry accordingly would be of great help to society.

6.4 REFERENCES

[1] M. S. Ejaz and M. R. Islam, "Masked Face Recognition Using Convolutional Neural
Network," 2019 International Conference on Sustainable Technologies for Industry 4.0
(STI), 2019, pp. 1-6, doi: 10.1109/STI47673.2019.9068044.

[2] M. R. Bhuiyan, S. A. Khushbu and M. S. Islam, "A Deep Learning Based Assistive
System to Classify COVID-19 Face Mask for Human Safety with YOLOv3," 2020 11th
International Conference on Computing, Communication and Networking Technologies
(ICCCNT)

[3] M. M. Rahman, M. M. H. Manik, M. M. Islam, S. Mahmud and J. -H. Kim, "An


Automated System to Limit COVID-19 Using Facial Mask Detection in Smart City
Network," 2020 IEEE International IOT, Electronics and Mechatronics Conference
(IEMTRONICS), 2020

[4] Y. Sun, Y. Chen, X. Wang, and X. Tang, “Deep learning face representation by joint
identification-verification,” in Advances in neural information processing systems, 2014,
pp. 198hy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei, “ImageNet Large Scale
Visual Recognition Challenge,” 2014.

[5] W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, “Face recognition: A


literature survey,” ACM computing surveys (CSUR), vol. 35, no. 4, pp. 399–458, 2003.
