Pfe Bader El Hajari
Département d’Informatique
Master in Intelligent Processing Systems
(Traitement Intelligent des Systèmes)
Title:
At the end of this work, we would like to express our deep gratitude and sincere
appreciation to our supervisor, Professor IDRISSI Abdellah, for all the time he devoted to us,
his invaluable advice, and the quality of his follow-up
throughout the period of our project.
I would also like to thank the ICOZ company for giving me the opportunity to be part of their
team over these last months, and especially my tutor, TAOUAF FAYSSAL, who always
came up with creative ideas to make the project goals achievable.
Résumé
This system aims to help companies, especially stores and marketplaces, to know the number
of employees present and/or to monitor their visitor traffic.
The system can send real-time alerts if the total number of people in a room or store
exceeds a given threshold.
Mots clés: real-time people counting system, Deep Learning, Computer Vision,
artificial intelligence, OpenCV
Abstract
This internship report presents the different phases of the design and realization of a
real-time people counting system.
This system helps companies, especially stores and marketplaces, to know the number
of employees inside the building and/or to track and count the visitors entering.
The system can send an alert in real time if the number of people in a store or building
exceeds a given limit.
I realized the system using artificial intelligence techniques, especially Deep Learning and
Computer Vision, together with libraries such as OpenCV.
The system is able to count the number of people (in/out) and visualize the conversion
rate data, in order to evaluate the performance of stores and other businesses.
Keywords: real-time people counting, Deep Learning, Computer Vision, artificial intelligence,
OpenCV
Table of Contents
Remerciements ...................................................................................................................................................... iii
Résumé ................................................................................................................................................................... iii
Abstract ................................................................................................................................................................. iv
List of Abbreviations .............................................................................................................................................. 1
General context of the project ........................................................................................................................... 2
1.1 Introduction .............................................................................................................................................. 2
1.2 Presentation of the organization ............................................................................................................. 2
1.2.1 ICOZ Company .................................................................................................................................. 2
1.2.2 Technical Sheet of the Company ........................................................................................................ 3
1.2.3 Company Diagram .............................................................................................................................. 3
1.3 Issue / Problematic ................................................................................................................................... 4
1.4 Project goals and objectives ..................................................................................................... 4
1.5 Agile methodology................................................................................................................................... 4
1.6 Scrum methodology ................................................................................................................................. 5
1.7 Communication within the ICOZ team.................................................................................................... 5
1.8 Resources provided and/or to be used .................................................................................................... 6
Conception of the project ....................................................................................................................................... 7
2.1 Introduction .............................................................................................................................................. 7
2.2 Artificial intelligence ............................................................................................................................... 7
2.3 Deep learning ........................................................................................................................................... 8
2.4 Computer vision ....................................................................................................................................... 8
2.4.1 Image classification ............................................................................................................................ 9
2.4.2 Object Detection ............................................................................................................................... 10
2.4.2.1 SSD MobileNet Architecture ........................................................................................................... 11
2.4.2.2 SSD architecture ............................................................................................................................... 11
2.4.3 Object Tracking ................................................................................................................................ 12
2.4.4 Difference between object detection and object tracking ............................................................... 12
2.4.5 Combine the concept of object detection and object tracking ....................................................... 12
3.1 Development Tools and technologies .................................................................................................... 14
3.1.1 Tools used .......................................................................................................................................... 14
3.1.2 Work environment ............................................................................................................................. 14
3.1.3 Libraries ............................................................................................................ 15
Realization of the project ....................................................................................................................................... 18
4.1.1 Introduction ............................................................................................................................................... 18
4.1.2 Conception of the project .......................................................................................................................... 18
4.1.3 Concept of Centroid tracking .................................................................................................................... 18
4.2 Realization ................................................................................................................................................... 23
Conclusion ........................................................................................................................................................... 31
References ............................................................................................................................................................ 32
List of Figures
Figure 1: ICOZ LOGO ............................................................................................................................ 2
Figure 2 : Technical Sheet of ICOZ ....................................................................................................................... 3
Figure 3: Company Diagram .................................................................................................................................. 3
Figure 4 : agile methodology diagram .................................................................................................................... 5
Figure 5: scrum methodology diagram ................................................................................................................. 5
Figure 6 : Core Elements of Artificial intelligence ................................................................................................. 7
Figure 7 : A typical neural network ....................................................................................................................... 8
Figure 8 : Output from SSD Mobilenet Object Detection Model ......................................................................... 10
Figure 9 : Connection of MobileNet and SSD ...................................................................................................... 11
Figure 10 : SSD architecture ................................................................................................................................. 11
Figure 11: Accept bounding box coordinates and compute centroids .................................................................. 19
Figure 12: Compute Euclidean distance between new bounding boxes and existing objects .............................. 20
Figure 13 : Updating coordinates.......................................................................................................................... 21
Figure 14 : Register new objects........................................................................................................................... 22
Figure 15: importing classes and libraries ......................................................................................................... 23
Figure 16: construct the argument parse and parse the arguments ....................................................................... 24
Figure 17: initialize the list of class labels ............................................................................................................ 24
Figure 18 : initialization of the video stream ........................................................................................................ 25
Figure 19: finding objects belonging to the “person” class .................................................................................. 25
Figure 20: adding the bounding box for the objects ......................................................................................... 26
Figure 21 : counting if the person has moved up or down through the frame ...................................................... 27
Figure 22 : the case if the direction of the person was moving down................................................................... 28
Figure 23 : the case if the direction of the person was moving up ....................................................................... 28
Figure 24: counting the number of people ............................................................................................ 29
Figure 25 : script for the email alert .................................................................................................................... 29
Figure 26 : example of an email alert .................................................................................................................. 30
Figure 27: Graph of number of visitors in a week ............................................................................. 30
List of Abbreviations
AI: Artificial Intelligence
Chapter 1
General context of the project
1.1 Introduction
As part of my final year of the Master's degree in Intelligent Processing Systems at
Mohammed V University, Faculty of Sciences of Rabat (FS Rabat), I was required to complete a
six-month internship.
In this report, I present my work environment as well as the main mission I carried out
within the ICOZ company, namely the realization of a system for counting people in real time.
1.2.2 Technical Sheet of the Company
Staff: 10
Email: contact@icoz.ma
1.3 Issue / Problematic
Controlling the number of visitors poses a problem for essential public service providers,
such as pharmacies, supermarkets, hospitals, clinics and vet shops, banks, government offices
and others.
These businesses cannot provide their services online and require physical visits, which, in
turn, may lead to overcrowding.
To ensure that people are as spaced out as needed, businesses turn to occupancy monitoring
by manually counting visitors, and also to gathering data for statistical analysis.
Yet this rudimentary technique is limited in both scope and reliability, running the risk of
inaccuracies due to human error.
1.4 Project goals and objectives
The project idea is to create a system that can detect and track visitors in real time.
Our system will count the number of visitors.
The system can use the visitor counting data to visualize and analyze the conversion
rate data, in order to evaluate the performance of stores and other businesses.
If the total number of people inside exceeds a given limit, the system can send an email
alert in real time.
1.5 Agile methodology
Agile is a project management approach developed as a more flexible and efficient way to get
products to market. The word 'agile' refers to the ability to move quickly and easily.
Therefore, an Agile approach enables project teams to adapt faster and more easily than other
project methodologies.
Agile allows the team to plan continuously throughout the project, which makes it easier to
make adjustments and changes when required. An Agile team is highly recommended for firms
that work in a dynamic environment and want to meet tight project deadlines. Especially in the
tech industry, Agile methods and teams are preferred for their innovative and adaptable nature.
Figure 4 : agile methodology diagram
1.6 Scrum methodology
The Scrum approach includes assembling the project's requirements and using them to define
the project. We can then plan the necessary sprints and divide each sprint into its own list of
requirements. Daily Scrum meetings help keep the project on target, as do regular inspections
and reviews. At the end of every sprint, you hold a sprint retrospective to look for ways to
improve the next sprint.
Chapter 2
Conception of the project
2.1 Introduction
The conception phase of a system is the most important part of the project. It consists of
finding out the needs of the stakeholders, modelling the system, and preparing its development,
so that it can be correctly integrated into the company's IT system. At the end of this conception
phase, we should have a functional specification.
We will also explain the criteria for choosing one technology over another.
In this phase we will talk about the technologies that we used during our project: artificial
intelligence, deep learning, and computer vision.
2.4 Computer vision
Computer vision offers many models that help us answer questions about an image.
What objects are in the image? Where are those objects in the image? Where are the key points
on an object? Which pixels belong to each object? We can answer these questions by building
different types of DNNs. These DNNs can then be used in applications to solve problems like
determining how many cars are in an image, whether a person is sitting or standing, or whether
an animal in a picture is a cat or a dog.
Among these models are image classification, object detection, image segmentation and
object tracking.
Since the project is about tracking and counting people in real time, we will focus on the
image classification, object detection and object tracking models.
2.4.1 Image classification
Image classification (or image recognition) attempts to identify the most significant object
class in the image. It is a subdomain of computer vision in which an algorithm looks at an image
and assigns it a tag from a collection of predefined tags or categories that it has been trained on.
Vision is responsible for 80-85 percent of our perception of the world, and we, as human
beings, trivially perform classification daily on whatever data we come across.
Therefore, emulating a classification task with the help of neural networks is one of the first
uses of computer vision that researchers thought about.
2.4.2 Object Detection
Well-researched domains of object detection include face detection and pedestrian detection.
Object detection has applications in many areas of computer vision, including image retrieval
and video surveillance.
Object detection, as the term suggests, is the procedure of detecting objects in the real world,
for example dogs, cars, humans, or birds. In this process we can detect the presence of any still
object with ease; another great thing is that the detection of multiple objects in a single frame
can be done easily. For example, in the image below the SSD (Single Shot Detector) model has
detected a mobile phone, a laptop, a coffee cup and glasses in a single shot.
We have chosen to work with this model since it is one of the most representative detection
methods with respect to the speed/accuracy trade-off. Compared to other detectors such as
R-CNN, SSD is faster with the MobileNet architecture.
2.4.2.1 SSD MobileNet Architecture
SSD MobileNet is an object detection framework trained to detect and classify the objects in a
captured image. Here, the MobileNet network is used to extract high-level features from the
images for classification and detection, and SSD is a detection model which uses the MobileNet
feature map outputs and convolution layers of different sizes to classify objects and detect
bounding boxes through regression.

2.4.2.2 SSD architecture
The SSD architecture is a single convolutional network that learns to predict bounding box
locations and classify these locations in one pass; hence, SSD can be trained end-to-end. The
SSD network consists of a base architecture (MobileNet in this case) followed by several
convolution layers:
2.4.3 Object Tracking
Object Tracking refers to the process of following a specific object of interest, or multiple
objects, in a given scene. It traditionally has applications in video and real-world interactions
where observations are made following an initial object detection. Now, it’s crucial to autonomous
driving systems such as self-driving vehicles from companies like Uber and Tesla.
Object tracking methods can be divided into two categories according to the observation
model: generative methods and discriminative methods. The generative method uses a generative
model to describe the apparent characteristics of the object and minimizes the reconstruction
error to search for the object.
The discriminative method can be used to distinguish between the object and the background;
its performance is more robust, and it has gradually become the main method in tracking. The
discriminative method is also referred to as tracking-by-detection, and deep learning belongs to
this category.
An object tracker will take the input (x, y)-coordinates of where an object is located in an
image and will then predict the location of that object in the next frame based on various image
attributes (gradient, optical flow, etc.).
2.4.5 Combining object detection and object tracking
Combining the concepts of object detection and object tracking into a single algorithm is
normally divided into two phases:
Phase 1 - Detection:
In the detection phase, we run our more computationally expensive object detector to detect
whether new objects have entered our view, and to see if we can find objects that were "lost"
in the tracking phase.
For each detected object, we create or update an object tracker with the new bounding box
coordinates. Since our object detector is more computationally expensive, we only run this phase
once every N frames.
Phase 2 - Tracking:
When we are not in the "detection" phase, we are in the "tracking" phase. For each of our
detected objects, we create an object tracker to follow the object as it moves through the frame.
Our object tracker should be faster and more efficient than the object detector.
We will keep tracking until we reach the Nth frame, then run our object detector again. The
entire process is then repeated.
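The two phases above can be sketched as a single loop. In this minimal sketch, detect() and update_trackers() are hypothetical stand-ins for the expensive SSD detector and the cheap per-frame trackers, and N is an assumed skip-frame count.

```python
# Sketch of the hybrid detect/track loop: run the expensive detector every
# N frames (Phase 1), otherwise just update the cheap trackers (Phase 2).
N = 5  # assumed: run the detector once every N frames

def detect(frame):
    # placeholder for the computationally expensive object detector
    return [(10, 10, 50, 50)]          # list of bounding boxes

def update_trackers(trackers, frame):
    # placeholder for the cheap per-frame tracker update
    return trackers

def process(frames):
    trackers, detector_runs = [], 0
    for i, frame in enumerate(frames):
        if i % N == 0:                 # Phase 1: detection
            boxes = detect(frame)
            trackers = list(boxes)     # (re)initialize one tracker per box
            detector_runs += 1
        else:                          # Phase 2: tracking
            trackers = update_trackers(trackers, frame)
    return detector_runs

print(process(range(20)))  # → 4
```

Over 20 frames with N = 5, the detector runs only 4 times; the remaining 16 frames cost only a tracker update.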
Chapter 3
3.1 Development Tools and technologies
3.1.1 Tools used
In this part, we will talk about the tools that we used in our project. We will talk about:
Jupyter Notebook
PyCharm
Python
3.1.3 Libraries
Numpy
— Matrix-Matrix and Matrix-Vector multiplication
— Element-wise operations on vectors and matrices (i.e., adding, subtracting,
multiplying, and dividing by a number)
— Element-wise or array-wise comparisons
— Applying functions element-wise to a vector/matrix (like pow, log, and exp)
— A whole lot of linear algebra operations can be found in numpy.linalg: reductions,
statistics, and much more.
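The operations listed above can be illustrated in a few lines; the arrays below are arbitrary examples.

```python
# Brief illustration of the NumPy operations listed above.
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
v = np.array([1.0, 1.0])

mv = A @ v                  # matrix-vector multiplication
elt = A * 2.0               # element-wise multiplication by a number
cmp = A > 2.0               # element-wise comparison
logs = np.log(A)            # apply a function element-wise
det = np.linalg.det(A)      # linear algebra via numpy.linalg

print(mv, det)
```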
Pandas
OpenCV
There are lots of applications which are solved using OpenCV; some of them are listed below:
— Face recognition
— Automated inspection and surveillance
— Counting people (foot traffic in a mall, etc.)
— Vehicle counting on highways along with their speeds
— Interactive art installations
— Anomaly (defect) detection in the manufacturing process (the odd defective
products)
— Street view image stitching
— Video/image search and retrieval
— Robot and driver-less car navigation and control
— Object recognition
— Medical image analysis
— Movies – 3D structure from motion
— TV Channels advertisement recognition
SciPy
Imutils
Dlib
Chapter 4
Realization of the project
4.1.1 Introduction
In this chapter, we will present the practical part of our project: we will explain the general
concept of the project and we will also talk about the implementation.
To implement our system, the people counter, we used OpenCV and dlib.
We used OpenCV for the standard computer vision functions, along with the deep learning
object detector for counting people, and dlib for its implementation of correlation filters.
We also use centroid tracking.
4.1.3 Concept of Centroid tracking

Step 1: Accept bounding box coordinates and compute centroids
To create an object tracking algorithm using centroid tracking, the first step is to accept bounding
box coordinates from an object detector and use them to compute centroids.
The centroid tracking algorithm supposes that we are passing in a set of bounding box (x, y)-
coordinates for each detected object in every single frame.
These bounding boxes can be produced by any type of object detector (color thresholding + contour
extraction, Haar cascades, HOG + Linear SVM, SSDs, Faster R-CNNs, etc.), provided that they are
computed for every frame in the video.
After obtaining the bounding box coordinates, we must compute the centroid, or more simply,
the center (x, y)-coordinates of the bounding box.
Then we assign each bounding box presented to our algorithm a unique ID.
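Step 1 can be sketched as follows; the box coordinates are made-up examples, and any detector that produces (start_x, start_y, end_x, end_y) boxes would work.

```python
# Step 1 sketch: turn detector bounding boxes into centroids with unique IDs.
def centroid(box):
    """Center (x, y) of a (start_x, start_y, end_x, end_y) bounding box."""
    start_x, start_y, end_x, end_y = box
    return ((start_x + end_x) // 2, (start_y + end_y) // 2)

boxes = [(10, 20, 50, 60), (100, 40, 140, 120)]
objects = {object_id: centroid(b) for object_id, b in enumerate(boxes)}
print(objects)  # → {0: (30, 40), 1: (120, 80)}
```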
Step 2: Compute Euclidean distances between new bounding boxes and existing objects
For each subsequent frame in our video stream, we apply Step 1 and compute the object
centroids; from the figure we can see that we have this time detected three objects in our image.
Instead of assigning a new unique ID to each detected object (which would defeat the purpose
of object tracking), we first need to determine whether we can associate the new object centroids
(yellow) with the old object centroids (purple).
To accomplish this, we compute the Euclidean distance (highlighted with green arrows)
between each pair of existing object centroids and input object centroids.
The two pairs that are close together are two existing objects: we then calculate the Euclidean
distances between each pair of original centroids (purple) and new centroids (yellow).
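This distance computation can be done in one call with SciPy's cdist; the centroids below are invented examples with two existing objects and three new detections.

```python
# Step 2 sketch: pairwise Euclidean distances between old and new centroids.
import numpy as np
from scipy.spatial import distance as dist

old_centroids = np.array([(30, 40), (120, 80)])             # existing objects
new_centroids = np.array([(32, 41), (118, 83), (400, 10)])  # current frame

# D[i, j] = distance from existing object i to new centroid j
D = dist.cdist(old_centroids, new_centroids)
print(D.shape)  # → (2, 3)
```

Each row of D belongs to an existing object; the column with the smallest value in that row is its most likely match in the new frame.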
Step 3: Update the (x, y)-coordinates of existing objects
The primary assumption of the centroid tracking algorithm is that a given object will
potentially move in between subsequent frames, but the distance between its centroids in frames
F_t and F_(t+1) will be smaller than all other distances between objects.
Therefore, if we choose to associate centroids with minimum distances between subsequent
frames, we can build our object tracker.
In the figure we can see how our centroid tracker algorithm chooses to associate centroids
that minimize their respective Euclidean distances.
The lonely point in the bottom-left did not get associated with anything, so it becomes a new
object.
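The minimum-distance association can be sketched on a precomputed distance matrix D (rows are existing objects, columns are new centroids); the matrix below is an invented example matching the three-detection scenario above.

```python
# Step 3 sketch: greedily match each existing object to its nearest new
# centroid, processing the smallest distances first.
import numpy as np

D = np.array([[ 2.2, 95.0, 370.0],
              [97.0,  3.6, 290.0]])

rows = D.min(axis=1).argsort()   # existing objects, closest match first
cols = D.argmin(axis=1)[rows]    # nearest new centroid for each of them

matches = list(zip(rows.tolist(), cols.tolist()))
print(matches)  # → [(0, 0), (1, 1)]
```

The third column (the lonely point) is matched to no row, so it will be registered as a new object in Step 4.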
Step 4: Register new objects
We have a new object that was not linked with an existing object, so it is registered as
object ID 3.
In the case that there are more input detections than existing objects being tracked, we have
to register the new objects; registering simply means adding the new object to our list of
tracked objects.
After that, we can go back to Step 2 and repeat the pipeline of steps for every frame in our
video stream.
Any reasonable object tracking algorithm needs to be able to handle the case when an object
has been lost, has disappeared, or has left the field of view.
Exactly how we handle these situations depends on where the object tracker is meant to be
deployed, but for this implementation, we will deregister old objects when they cannot be
matched to any existing objects for a total of N subsequent frames.
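Steps 1 to 4, including deregistration, can be condensed into one small class. This is a sketch with our own class and method names, not the project's exact code; the real implementation handles ambiguous row/column matches more carefully.

```python
# Condensed sketch of the centroid tracker described in Steps 1-4.
from collections import OrderedDict
import numpy as np

class CentroidTracker:
    def __init__(self, max_disappeared=50):
        self.next_id = 0
        self.objects = OrderedDict()      # object ID -> centroid
        self.disappeared = OrderedDict()  # object ID -> missed-frame count
        self.max_disappeared = max_disappeared

    def register(self, centroid):
        self.objects[self.next_id] = centroid
        self.disappeared[self.next_id] = 0
        self.next_id += 1

    def deregister(self, object_id):
        del self.objects[object_id]
        del self.disappeared[object_id]

    def update(self, boxes):
        if len(boxes) == 0:               # no detections this frame
            for object_id in list(self.disappeared):
                self.disappeared[object_id] += 1
                if self.disappeared[object_id] > self.max_disappeared:
                    self.deregister(object_id)
            return self.objects
        centroids = np.array([((x1 + x2) // 2, (y1 + y2) // 2)
                              for (x1, y1, x2, y2) in boxes])
        if not self.objects:              # first frame: register everything
            for c in centroids:
                self.register(tuple(c))
            return self.objects
        ids = list(self.objects)
        D = np.linalg.norm(np.array(list(self.objects.values()))[:, None]
                           - centroids[None, :], axis=2)
        used_rows, used_cols = set(), set()
        for row in D.min(axis=1).argsort():   # greedy min-distance matching
            col = D[row].argmin()
            if row in used_rows or col in used_cols:
                continue
            self.objects[ids[row]] = tuple(centroids[col])
            self.disappeared[ids[row]] = 0
            used_rows.add(row); used_cols.add(col)
        for col in set(range(len(centroids))) - used_cols:
            self.register(tuple(centroids[col]))   # Step 4: new objects
        for row in set(range(len(ids))) - used_rows:
            self.disappeared[ids[row]] += 1        # unmatched existing object
            if self.disappeared[ids[row]] > self.max_disappeared:
                self.deregister(ids[row])
        return self.objects

tracker = CentroidTracker()
print(tracker.update([(10, 10, 30, 30)]))
print(tracker.update([(12, 12, 32, 32), (200, 200, 240, 240)]))
```

The second call keeps ID 0 for the slightly-moved box and registers the far-away box as ID 1, exactly the behavior described above.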
4.2 Realization
The first step is to download the required packages. Installation via pip:
OpenCV will be used for deep neural network inference, opening video files, writing video files
and displaying output images on our screen.
The VideoStream and FPS modules from imutils.video allow us to work with a webcam
and to calculate the estimated frames-per-second (FPS) throughput rate.
Figure 16: construct the argument parse and parse the arguments
These arguments allow us to transfer information to our people counter script from the terminal
at run-time:
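A plausible version of this argument parser is sketched below; the exact option names are assumptions modelled on common OpenCV people-counter scripts, not necessarily the project's real flags.

```python
# Hypothetical command-line arguments for the people counter script.
import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
                help="path to the Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
                help="path to the pre-trained Caffe model")
ap.add_argument("-i", "--input", type=str,
                help="optional path to an input video file")
ap.add_argument("-c", "--confidence", type=float, default=0.4,
                help="minimum probability to filter weak detections")
ap.add_argument("-s", "--skip-frames", type=int, default=30,
                help="number of frames to skip between detections")

# Parsing an example command line instead of sys.argv, for illustration:
args = vars(ap.parse_args(["-p", "deploy.prototxt", "-m", "model.caffemodel"]))
print(args["confidence"], args["skip_frames"])  # → 0.4 30
```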
Then we initialize CLASSES, the list of classes that our SSD supports. We could count other
moving objects as well (such as "car", "bus", or "bicycle"), but we are only interested in the
"person" class.
We also load the pretrained MobileNet SSD used to detect objects and track people.
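For reference, the class list of the standard Caffe MobileNet SSD (trained on PASCAL VOC) is the following; the counter keeps only "person".

```python
# Class labels of the standard PASCAL VOC Caffe MobileNet SSD.
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
           "bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
           "dog", "horse", "motorbike", "person", "pottedplant", "sheep",
           "sofa", "train", "tvmonitor"]

print(CLASSES.index("person"))  # → 15
```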
If a video path was not supplied, we grab a reference to the IP camera; otherwise, we capture
frames from a video file.
By looping over the detections, we capture the confidence and filter out the weak results and
the detections that do not belong to the "person" class.
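The filtering step can be sketched as follows. The 4-D array imitates the shape of an SSD's forward-pass output in OpenCV ([batch, 1, num_detections, 7]); its values are invented.

```python
# Sketch of the confidence/class filtering step over SSD-style detections.
import numpy as np

PERSON_CLASS_ID = 15   # "person" in the PASCAL VOC class list
MIN_CONFIDENCE = 0.4

detections = np.array([[[
    # [_, class_id, confidence, x1, y1, x2, y2] (coordinates normalized)
    [0, 15, 0.92, 0.1, 0.1, 0.3, 0.8],
    [0,  7, 0.88, 0.4, 0.2, 0.6, 0.9],   # a car: wrong class, dropped
    [0, 15, 0.20, 0.7, 0.1, 0.9, 0.8],   # a person, but too weak, dropped
]]])

people = []
for i in range(detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    class_id = int(detections[0, 0, i, 1])
    if confidence > MIN_CONFIDENCE and class_id == PERSON_CLASS_ID:
        people.append(detections[0, 0, i, 3:7])

print(len(people))  # → 1
```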
Figure 20: adding the bounding box for the objects
Then we compute the (x, y)-coordinates of the bounding box for each object.
We instantiate our dlib correlation tracker, then pass the object's bounding box coordinates
to dlib.rectangle, storing the result as rect.
Subsequently, we start tracking and append the tracker to the trackers list.
We update the status to "Tracking", catch the object's position, and from there extract the
position coordinates.
Figure 21 : counting if the person has moved up or down through the frame
If there is an existing trackable object, we have to determine whether the "person" object is
moving up or down: the difference between the y-coordinate of the current centroid and the
mean of the previous centroids tells us in which direction the object is moving (negative for
'up' and positive for 'down').
We take the mean to ensure that our direction tracking is more stable.
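The direction computation can be sketched in a few lines; remember that in image coordinates y grows downward, so a negative difference means the person moved up. The centroid history below is invented.

```python
# Sketch of the direction computation from the centroid history.
import numpy as np

previous_centroids = [(200, 310), (201, 300), (199, 292)]  # earlier frames
current_centroid = (198, 280)

ys = [c[1] for c in previous_centroids]
direction = current_centroid[1] - np.mean(ys)  # negative: up, positive: down

print(direction)
```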
Figure 22 : the case if the direction of the person was moving down
After checking that the direction is positive and the centroid is below the center line, we
increment the total number of people moving down.
In the second case, if the direction is negative and the centroid is above the center line, we
increment the total number of people moving up.
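The counting rule can be sketched as follows, with the center line at half the frame height H; the (direction, centroid y) observations are invented sample crossings.

```python
# Sketch of the up/down counting rule (y grows downward in the image).
H = 480
total_up = total_down = 0

# (direction, current centroid y) pairs: invented sample crossings
observations = [(-15.0, 120), (22.0, 400), (-3.0, 470)]

for direction, cy in observations:
    if direction < 0 and cy < H // 2:      # moving up, above the line
        total_up += 1
    elif direction > 0 and cy > H // 2:    # moving down, below the line
        total_down += 1

print(total_up, total_down)  # → 1 1
```

The third observation is moving up but still below the line, so it is not counted yet; it will be counted once it crosses the center line.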
Figure 24 :counting the number of people
This time, 7 people have entered the building and 7 people have left.
Real-Time alert
If the total number of people inside exceeds the limit, the system can send an email alert in
real time to the staff using the SMTP protocol, which handles sending and routing e-mail
between mail servers.
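The alert step can be sketched with the standard library. The addresses, server name and threshold below are placeholders, and the actual send is kept in a separate function so the message-building logic can run without a mail server.

```python
# Sketch of the email alert: build the message, then send it over SMTP.
import smtplib
from email.message import EmailMessage

LIMIT = 10  # assumed occupancy threshold

def build_alert(count, limit=LIMIT):
    msg = EmailMessage()
    msg["Subject"] = "People counter alert"
    msg["From"] = "counter@example.com"       # placeholder sender
    msg["To"] = "staff@example.com"           # placeholder recipient
    msg.set_content(f"Occupancy limit exceeded: {count} people inside "
                    f"(limit is {limit}).")
    return msg

def send_alert(msg, host="smtp.example.com", port=587):
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        # server.login(user, password)  # credentials omitted in this sketch
        server.send_message(msg)

alert = build_alert(12)
print(alert["Subject"])  # → People counter alert
```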
Figure 26 : example of an email alert
In addition, we can use the people counting data to visualize and analyze the conversion rate
data, in order to evaluate the performance of stores and other businesses.
Conclusion
This report presents in an explicit way the course of our internship period, which is part of our
master's degree final project in Intelligent Processing Systems; the internship took place in a
professional environment.
In this project, we developed a system that can count people in real time using computer
vision techniques such as image classification, object detection and object tracking.
On the professional side, we learned to work in a team with constraints and instructions to
respect, especially Agile working.
We would like to specify that the internship continues after our graduation because we wish to
achieve the objectives mentioned above; we aspire to :
References
[1] A. G. Howard, M. Zhu, B. Chen et al., "MobileNets: Efficient Convolutional Neural
Networks for Mobile Vision Applications," http://arxiv.org/abs/1704.04861.
[2] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, A. C. Berg, "SSD: Single
Shot MultiBox Detector," http://arxiv.org/abs/1512.02325.
[4] Z.-Q. Zhao, P. Zheng, S.-T. Xu, X. Wu, "Object Detection with Deep Learning: A Review,"
https://arxiv.org/pdf/1807.05511.pdf.