
PEOPLE’S DEMOCRATIC REPUBLIC OF ALGERIA
MINISTRY OF HIGHER EDUCATION AND SCIENTIFIC RESEARCH

UNIVERSITY BATNA 2 MUSTAPHA BEN BOULAID
FACULTY OF TECHNOLOGY
DEPARTMENT OF ELECTRONICS, TELECOMMUNICATIONS SYSTEMS

FINAL YEAR PROJECT
FOR THE MASTER’S DEGREE IN TELECOMMUNICATIONS SYSTEMS

MAIN TITLE OF YOUR THESIS GOES HERE

by

DAOUDI KHADIDJA
SENNI AMINA

Copyright © 2024
Dedication

I dedicate this thesis to my family for their unwavering support and encourage-
ment throughout my academic journey. To my parents, who have always believed
in me and pushed me to reach my full potential. To my siblings, who have been
my biggest cheerleaders and inspiration. And to my friends, who have stood by me
through late-night study sessions and moments of doubt. Without their love and en-
couragement, I would not have been able to achieve this milestone. I also dedicate
this thesis to my professors and mentors, who have guided me and nurtured my in-
tellectual growth. Their expertise and guidance have been invaluable in shaping my
research. This thesis is dedicated to all those who have supported, influenced, and
inspired me on my journey to graduation. Thank you for believing in me and standing
by me every step of the way. “Khadidja”


This work is dedicated to my family, whose unwavering support and love have
been my strength. To my late father, who remains my guiding star and inspiration;
your memory continues to motivate me to strive for excellence.
To my friends, for their constant encouragement and companionship throughout
this journey.
To my teachers, for their invaluable guidance and wisdom that have shaped my
path.
Thank you all for being an integral part of my success. “Amina”
Abstract

As brick-and-mortar stores face increasing competition from online shopping, innovative
solutions like smart human-following shopping trolleys are emerging to enhance
customer experiences and retention. These trolleys are designed to follow customers
automatically, utilizing a combination of sensors and cameras to track the shopper’s
movements and navigate the environment. This technology not only offers conve-
nience by allowing customers to focus on shopping without pushing the trolley but
also provides significant benefits to individuals with limited mobility. By integrat-
ing intelligent control strategies and environmental awareness, these trolleys ensure
stable and reliable following behavior, thus improving the overall efficiency and enjoy-
ment of the shopping experience. This thesis presents the design and implementation
of a smart human-following shopping trolley built around a Raspberry Pi single-board computer, a
Raspberry Pi camera module, and a mobile platform. The trolley leverages computer
vision and deep learning techniques to robustly detect and track the target person in
real-time. The proposed system aims to provide a seamless and personalized shop-
ping experience, catering to the needs of a wide range of customers while helping
traditional retail stores remain competitive in the digital age. Through extensive
testing and evaluation, this work demonstrates the feasibility and effectiveness of
deploying smart following trolleys in real-world shopping environments.


Acknowledgements

Above all, we are grateful to God for giving us the strength, endurance, and health
we have needed over these years. We would like to express our sincere gratitude
to our supervisors, Dr. Fateh Bougerra, Dr. Serairi Fouzi, and Madame Khacha, for their
invaluable guidance, support, and encouragement throughout this project. We are
also thankful to the faculty and staff of the Department of Electronics and Telecom-
munications Systems at University Batna 2 and the FabLab laboratory for providing us
with the necessary resources and facilities to complete this work. Lastly, we extend
our heartfelt appreciation to our families and friends for their unwavering support
and understanding during the course of this project.

Contents

Dedication
Abstract
Acknowledgements

1 Deep learning and computer vision
  1.1 Introduction
  1.2 Artificial Intelligence
  1.3 Machine Learning
  1.4 Deep Learning
    1.4.1 Types of deep neural network
    1.4.2 Deep learning applications
  1.5 Convolutional Neural Network (CNN)
    1.5.1 CNN Architecture
    1.5.2 CNN applications
    1.5.3 Learning types
    1.5.4 Data conditioning (data preparation)
  1.6 Computer Vision
    1.6.1 Computer vision and image processing
    1.6.2 Examples of computer vision applications
  1.7 Conclusion

2 System description
  2.1 Introduction
  2.2 The control platform
    2.2.1 Raspberry Pi
    2.2.2 Raspberry Pi 4 specifications
  2.3 Camera module
    2.3.1 The Raspberry Pi Camera Module
    2.3.2 Raspberry Pi Camera Module specifications
  2.4 The mobile platform
  2.5 Additional components
    2.5.1 The HC-SR04 sensor
    2.5.2 Connection wires

3 Practical aspect
  3.1 Introduction
    3.1.1 Implementation Overview
  3.2 Background
  3.3 Problem Statement
  3.4 Synoptic diagram and function
  3.5 Training
    3.5.1 Roboflow
    3.5.2 Google Colab
  3.6 Software part
    3.6.1 Operating system
    3.6.2 Programming language and required libraries
  3.7 Flowchart

4 Chapter 4 Name
  4.1 Section 1
    4.1.1 Subsection 1
  4.2 Section 2
    4.2.1 Subsection 2

A Title of appendix
List of Tables

List of Figures

1.1 Differences between artificial intelligence, machine learning, and deep learning.
1.2 Example of a network with many convolutional layers.
2.1 Raspberry Pi 4 Model B board.
2.2 Raspberry Pi Camera Module Rev 1.3.
2.3 Operating principle of the HC-SR04 sensor.
2.4 Connection wires.
3.1 Synoptic diagram of the system.
3.2 Flowchart of the program.
Acronyms

AI Artificial Intelligence

CNN Convolutional Neural Network

DL Deep Learning

FOV Field of View

FPS Frames Per Second

GPU Graphics Processing Unit

IoT Internet of Things

mAP Mean Average Precision

ML Machine Learning

RNN Recurrent Neural Network

ROS Robot Operating System


Chapter 1

Deep learning and computer vision

1.1 Introduction

We inhabit an era of unparalleled connectivity where every mundane interaction,
from a phone call to a payment transaction to web browsing, becomes a drop in an
ever-expanding ocean of data. The advent of the Internet of Things (IoT) further
amplifies this torrent of information, as cars, alarms, wearables, and myriad other
devices contribute voluminous data streams daily. In this chapter, we will delve into
foundational definitions and principles of the key technologies shaping the future,
within the domain of our professional focus.

1.2 Artificial Intelligence

Artificial Intelligence (AI) has revolutionized the way we perceive and interact with
technology, enabling machines to perform tasks traditionally requiring human intel-
ligence. It involves imbuing computers, computer-controlled robots, or software with
the capacity to think intelligently, akin to the human mind. This is achieved through
the examination of human brain patterns and analysis of cognitive processes. The
culmination of these investigations leads to the creation of intelligent software and
systems. AI emphasizes the capability to think deeply and analyze data comprehen-
sively.

1.3 Machine Learning

Machine learning addresses the question of how to build computers that improve
automatically through experience. It is one of today’s most rapidly growing technical
fields, lying at the intersection of computer science and statistics, and at the core
of artificial intelligence and data science. Recent progress in machine learning has
been driven both by the development of new learning algorithms and theory and by
the ongoing explosion in the availability of online data and low-cost computation.
The adoption of data-intensive machine-learning methods can be found throughout
science, technology and commerce, leading to more evidence-based decision-making
across many walks of life, including health care, manufacturing, education, financial
modeling, policing, and marketing (Jordan and Mitchell, 2015).

1.4 Deep Learning

Deep learning allows computational models that are composed of multiple process-
ing layers to learn representations of data with multiple levels of abstraction. These
methods have dramatically improved the state-of-the-art in speech recognition, visual
object recognition, object detection and many other domains such as drug discovery
and genomics. Deep learning discovers intricate structure in large data sets by using
1.4. Deep Learning 3

the backpropagation algorithm to indicate how a machine should change its inter-
nal parameters that are used to compute the representation in each layer from the
representation in the previous layer. Deep convolutional nets have brought about
breakthroughs in processing images, video, speech and audio, whereas recurrent nets
have shone light on sequential data such as text and speech (LeCun et al., 2015).

Figure 1.1: Differences between artificial intelligence, machine learning, and deep
learning.

1.4.1 Types of deep neural network

A deep neural network (DNN) is a type of artificial neural network with multiple
layers between the input and output layers. These layers enable the network to learn
hierarchical representations of data, allowing it to model complex relationships and
make increasingly abstract interpretations. DNNs have gained popularity due to
their ability to handle large and complex datasets, making them well-suited for tasks
such as image and speech recognition, natural language processing, and more.
There are several types of deep neural networks, each designed to address specific
tasks and data types. Some common types include:
1. Convolutional Neural Networks (CNNs): Particularly effective for image recog-
nition and classification tasks, CNNs use convolutional layers to extract features from
input images and pooling layers to reduce dimensionality.
2. Recurrent Neural Networks (RNNs): Well-suited for sequential data processing
tasks such as time series analysis, language modeling, and speech recognition. RNNs
have recurrent connections that allow them to retain information over time.
3. Long Short-Term Memory Networks (LSTMs): A specialized type of RNN
designed to address the vanishing gradient problem, LSTMs are capable of learning
long-term dependencies in sequential data, making them useful for tasks like speech
recognition and language translation.
4. Generative Adversarial Networks (GANs): Consisting of two neural networks,
a generator and a discriminator, GANs are used for generating new data instances
that resemble a given dataset. They are commonly used for image generation and
data augmentation.
5. Autoencoders: These networks are designed to learn efficient representations
of data by training to reconstruct their input. They can be used for tasks such as
dimensionality reduction, data denoising, and anomaly detection.
6. Deep Belief Networks (DBNs): Comprising multiple layers of stochastic, latent
variables, DBNs are capable of learning hierarchical representations of data. They
are often used for unsupervised learning tasks such as feature learning and density
estimation.
These are just a few examples of deep neural network architectures, and there
are many other variations and combinations tailored to specific applications and
domains (Liu et al., 2017).

1.4.2 Deep learning applications

Deep learning finds applications across various fields such as:


1. Image Recognition and Classification.
2. Natural Language Processing.
3. Speech Recognition and Synthesis.
4. Recommendation Systems.
5. Healthcare (Medical Image Analysis, Patient Outcome Prediction).
6. Autonomous Vehicles.
7. Financial Services (Fraud Detection, Risk Assessment).
8. Manufacturing (Quality Control, Predictive Maintenance).
These applications demonstrate the versatility and effectiveness of deep learning
in solving complex problems and driving innovation across industries.

1.5 Convolutional Neural Network (CNN)

Convolutional Neural Networks (CNNs) represent a revolutionary approach to deep
learning, enabling machines to learn intricate patterns and features from data, par-
ticularly in the domain of computer vision. CNNs have revolutionized various in-
dustries, including healthcare, automotive, and entertainment, through applications
such as image classification, object detection, and facial recognition. Despite their
efficacy, CNNs face challenges such as overfitting, which can hinder their generaliza-
tion capability. The rapid growth of CNNs is facilitated by powerful deep learning
frameworks such as TensorFlow, Keras, and PyTorch.

1.5.1 CNN Architecture

CNNs draw inspiration from the hierarchical architecture and local connectivity of
the human visual system.
The architecture of CNNs comprises several key components, including convolu-
tional layers, rectified linear units (ReLU), pooling layers, and fully connected layers.
This section elucidates the functionality of each component and its role in feature
extraction and prediction (Saha, 2018).
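
To make these building blocks concrete, the following minimal sketch (added here for illustration, not taken from the project code) stacks convolution, ReLU, pooling, and fully connected layers with the Keras API of TensorFlow; the input shape, filter counts, and class count are assumptions:

# A minimal, illustrative CNN in the Keras API of TensorFlow.
# Layer sizes, input shape, and class count are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    # Convolution + ReLU: extract local features with 3x3 filters.
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(64, 64, 3)),
    # Pooling: reduce spatial dimensionality.
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    # Fully connected layers: map extracted features to class scores.
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),  # assumed 10 output classes
])
model.summary()  # prints the architecture layer by layer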

Figure 1.2: Example of a network with many convolutional layers.



1.5.2 CNN applications

Convolutional neural networks (CNNs) are powerful deep learning models that are
mostly applied to the interpretation of images and videos. Their main uses include:
- Image recognition and categorization: search engines classify and analyze photos
to improve search results; social media platforms recognize and tag individuals in
pictures; recommender systems suggest similar items by analyzing visual content.
- Object detection: recognizing and localizing objects in images is vital for robots,
self-driving cars, and surveillance systems.
- Face recognition: analyzing facial traits to identify faces, a technique used in social
media tagging, smartphone unlock features, and security systems.
These uses highlight the adaptability and strength of CNNs in interpreting and
analyzing visual data.

1.5.3 learning types

1. Supervised Learning:
In supervised learning, the model learns from labeled data: it is given input-output
pairs (features and corresponding labels) during training. The goal is to learn a
mapping from inputs to outputs, so the model can make accurate predictions on
unseen data. Examples include image classification (where each image has a label),
regression (predicting a continuous value), and natural language processing tasks.
2. Unsupervised Learning:
In unsupervised learning, the model learns from unlabeled data; it has no explicit
output labels. The goal is to discover patterns, structures, or relationships within
the data. Examples include clustering (grouping similar data points), dimensionality
reduction (reducing features while preserving information), and anomaly detection.
3. Semi-Supervised Learning:
Semi-supervised learning combines elements of both supervised and unsupervised
learning: it uses a small amount of labeled data along with a larger amount of
unlabeled data, the idea being to leverage the unlabeled data to improve model
performance. An example is using a few labeled images together with a large set of
unlabeled images to train an image classifier (Robert, 2024).
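
As a brief illustration of the first two paradigms, the sketch below (our own toy example, assuming scikit-learn and synthetic data, neither taken from the project) fits a supervised classifier to labeled data and an unsupervised clustering model to the same data without labels:

# Illustrative contrast between supervised and unsupervised learning,
# using scikit-learn on synthetic data (all names here are our own).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))              # 100 samples, 2 features
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # labels, used only when supervised

# Supervised: learn the mapping from inputs X to known labels y.
clf = LogisticRegression().fit(X, y)
print("supervised predictions:", clf.predict(X[:5]))

# Unsupervised: group the same inputs without ever seeing labels.
km = KMeans(n_clusters=2, n_init=10).fit(X)
print("cluster assignments:  ", km.labels_[:5])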

1.5.4 Data conditioning (data preparation)

What is Data Preparation in Machine Learning?


Data preparation, or “dataprep,” involves preparing training data for ingestion
by a machine learning model. This key preliminary stage includes phases from data
collection to validation. The process involves formatting data, correcting errors,
and potentially enriching it to improve data quality before processing. Detecting
anomalies helps correct biases that could negatively impact model results. Data
preparation is also useful for data visualization and analysis.
-Steps of Data Preparation:
**Data collection
**Data evaluation
**Cleaning, adding, or removing values
**Data transformation and formatting
**Data validation
**Data storage or routing
Difference Between Data Preparation and Data Exploration:
Data preparation involves transforming raw data into usable data through collec-
tion, cleaning, and formatting. Data exploration, the next step, involves navigating
and understanding the assembled dataset, crucial for creating analytical dashboards
or training machine learning models.
-Roles of Training, Validation, and Test Datasets:
**Training Dataset: Used to train the model, enabling it to make predictions on
new data.
**Validation Dataset: Used to validate the trained model and adjust its param-
eters, based on examples not included in the training dataset.
**Test Dataset: Used to evaluate the final model’s performance, accuracy, and
robustness (Crochet-Damais, 2022).
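
As a minimal sketch of such a three-way split, assuming scikit-learn and a toy dataset (both illustrative choices, not taken from the project):

# Illustrative 60/20/20 train/validation/test split with scikit-learn.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(100, 4)               # toy feature matrix
y = np.random.randint(0, 2, size=100)    # toy binary labels

# First reserve 20% of the data as the held-out test set...
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42)
# ...then split the remaining 80% into training (60%) and validation (20%).
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20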

1.6 Computer Vision

Computer vision, a subset of artificial intelligence, focuses on instructing computers
to interpret digital images, including photographs and videos. Its primary aim is to
emulate and automate tasks performed by the human visual system. These tasks en-
compass various techniques for capturing, processing, analyzing, and comprehending
digital images.

1.6.1 Computer vision and image processing

Image processing primarily involves the utilization and application of mathematical
functions and transformations on images, without necessarily involving intelligent in-
ference. This entails performing operations such as smoothing, sharpening, contrast
adjustment, and stretching on the image using algorithms.
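
As a short illustration, the following OpenCV-Python sketch (our own example) applies the operations just mentioned, smoothing, sharpening, and contrast adjustment; "input.jpg" is a placeholder file name:

# Classical image-processing operations with OpenCV.
import cv2
import numpy as np

img = cv2.imread("input.jpg")            # placeholder input image

# Smoothing: Gaussian blur with a 5x5 kernel.
smoothed = cv2.GaussianBlur(img, (5, 5), 0)

# Sharpening: convolution with a simple sharpening kernel.
kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])
sharpened = cv2.filter2D(img, -1, kernel)

# Contrast adjustment: scale intensities by alpha, shift by beta.
adjusted = cv2.convertScaleAbs(img, alpha=1.5, beta=10)

cv2.imwrite("output.jpg", sharpened)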

1.6.2 Examples of computer vision applications

Here are some examples of computer vision applications:


1. **Object Recognition and Classification**: Automatically identifying and
categorizing objects within images or videos, such as detecting pedestrians in au-
tonomous vehicles or recognizing different species of animals in wildlife conservation
efforts.
2. **Facial Recognition**: Recognizing and identifying faces in images or videos,
which has applications in security systems, surveillance, and biometric authentica-
tion.
3. **Image Segmentation**: Partitioning an image into multiple segments or re-
gions to simplify its representation, which is useful in medical imaging for identifying
organs or tumors, as well as in satellite imagery for land cover classification.
4. **Gesture Recognition**: Interpreting human gestures captured by cameras,
enabling hands-free interaction with devices or controlling interfaces in virtual reality
(VR) and augmented reality (AR) applications.
5. **Visual Inspection**: Automatically inspecting manufactured products for
defects or abnormalities in industries such as automotive, electronics, and pharma-
ceuticals, ensuring product quality and consistency.
6. **Autonomous Vehicles**: Providing real-time perception and understanding
of the vehicle’s surroundings through cameras and sensors, enabling self-driving cars
to navigate safely and make informed decisions.
7. **Augmented Reality (AR)**: Overlaying digital information or virtual ob-
jects onto the real-world environment in real-time, enhancing user experiences in
fields like gaming, retail, education, and interior design.
8. **Medical Image Analysis**: Assisting healthcare professionals in diagnosing
diseases and conditions by analyzing medical images such as X-rays, MRIs, CT scans,
and histopathology slides for abnormalities or anomalies.
9. **Visual Search**: Allowing users to search for similar or related items based
on images rather than text, improving e-commerce experiences by enabling product
discovery through visual similarity.
10. **Human Pose Estimation**: Estimating the poses or positions of human
bodies in images or videos, which has applications in fitness tracking, sports analysis,
and motion capture for animation and gaming.


These examples demonstrate the diverse range of applications where computer
vision technologies play a crucial role in extracting meaningful information from
visual data.

1.7 Conclusion

The definitions provided underscore the capability of deep learning to efficiently man-
age and process vast amounts of data, performing intricate mathematical operations
that can rival the expertise of human specialists in certain tasks, or tackle problems
that are beyond human capacity to solve within reasonable timeframes.
However, this significant advantage comes with its own set of drawbacks. Com-
plex deep learning models require substantial computing resources, typically only
available on high-end computers or specialized computing accelerators. Moreover,
the development of these models demands extensive resources, both in terms of
computational power and time. Such resources are often costly, prompting many
developers to utilize platforms like Google Colab, where computing resources are
rented out by big companies to facilitate model development.
Chapter 2

System description

2.1 Introduction

In this section, we will introduce the primary components and technologies that form
the foundation of our project. We will discuss their specifications and elaborate on
the rationale behind selecting them.

2.2 The control platform

Our image processing algorithms need to run on-board. Several embedded processing
platforms are known for this, such as the NVIDIA Jetson, Odroid, and Asus Tinker
Board. Based on availability, we decided to go with the Raspberry Pi 4 Model B
with 8GB of RAM.

2.2.1 Raspberry Pi

Raspberry Pi 4 is a small, low-cost computer device known for its high processing
power and advanced capabilities. It belongs to the fourth generation of the Raspberry
Pi series and is designed to meet the needs of developers and enthusiasts in areas
such as programming, electronics, and artificial intelligence.

Figure 2.1: Raspberry Pi 4 Model B board.

2.2.2 Raspberry Pi 4 specifications

The specifications of the Raspberry Pi 4 Model B include:


1. Processor: Quad-core ARM Cortex-A72 (64-bit) CPU running at 1.5GHz.
2. RAM: Options for 2GB, 4GB, or 8GB LPDDR4 SDRAM.
3. Connectivity: Gigabit Ethernet (RJ45), dual-band 802.11 b/g/n/ac wireless
LAN, Bluetooth 5.0, two USB 3.0 ports, and two USB 2.0 ports.
4. Video/Audio: two micro HDMI ports (supporting dual-display configurations
up to 4K resolution), H.265 (4Kp60) decode capability, and stereo audio and
composite video output via a 3.5mm jack.
5. Storage: MicroSD card slot for operating system and data storage.
6. GPIO: 40-pin GPIO header (fully backwards-compatible with previous Rasp-
berry Pi boards).
7. Power: USB-C connector for power input (supports up to 3A).
8. Dimensions: 85mm × 56mm × 22mm.
These specifications make the Raspberry Pi 4 Model B a versatile and power-
ful single-board computer suitable for various applications, including DIY projects,
educational purposes, and embedded systems development.

2.3 Camera module

There are a lot of cameras adapted for this use. We chose to go with the Raspberry
Pi Camera Module Rev 1.3 for availability and compatibility with our Raspberry Pi.

2.3.1 The Raspberry Pi Camera Module

The Raspberry Pi Camera Module, often referred to simply as the “PiCam,” is a
compact camera attachment designed specifically for Raspberry Pi single-board computers.
Available in different versions, it offers resolutions of 5MP or 8MP. It connects to the
Raspberry Pi via a ribbon cable and is widely used in various applications, including
home security, wildlife monitoring, and DIY projects, thanks to its affordability and
versatility.

Figure 2.2: Raspberry Pi Camera Module Rev 1.3.

2.3.2 Raspberry Pi Camera Module specifications

The specifications of the latest version of the Raspberry Pi Camera Module (V2)
are as follows (the Rev 1.3 module used in our project is the earlier, 5MP version):
1. **Resolution**: 8 megapixels (3280 × 2464 pixels).
2. **Sensor**: Sony IMX219PQ CMOS sensor.
3. **Lens**: Fixed focus.
4. **Aperture**: f/2.0.
5. **Field of View (FOV)**: 62.2 degrees diagonal.
6. **Supported Video Modes**: 1080p at 30fps, 720p at 60fps, and VGA (640 ×
480 pixels) at 90fps.
7. **Supported Still Image Modes**: 3280 × 2464, 2592 × 1944, 1920 × 1080
(1080p), 1280 × 720 (720p), and 640 × 480 (VGA) pixels.
8. **Connectivity**: Ribbon cable connection to the Raspberry Pi’s camera port.
9. **Dimensions**: 25mm x 23mm x 9mm.
10. **Weight**: Approximately 3 grams.
These specifications make the Raspberry Pi Camera Module a versatile and ca-
pable imaging solution for various projects and applications (Richardson and Wallace,
2012).
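
As an illustration of how the module is typically driven from Python, here is a minimal capture sketch using the legacy picamera library; the library choice, resolution, and file name are assumptions for illustration, not a description of our final program:

# Minimal still capture with the legacy picamera library.
from time import sleep
from picamera import PiCamera

camera = PiCamera()
camera.resolution = (1280, 720)   # one of the supported modes listed above
camera.start_preview()
sleep(2)                          # let the sensor adjust its exposure
camera.capture("test.jpg")        # save a still image to disk
camera.stop_preview()
camera.close()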

2.4 The mobile platform

We chose the Keyestudio KS4032 4WD mecanum robot car for micro:bit due to its
omnidirectional movement capabilities, which allow for translations and rotations in
all directions using mecanum wheels. This smart DIY car is designed for micro:bit
and includes a car body with extended functions, a PCB base plate with integrated
motor drive sensors, 4 decelerating DC motors, mecanum wheels, various modules,
sensors, and acrylic boards.

2.5 Additional components

2.5.1 The HC-SR04 sensor

The HC-SR04 sensor is an ultrasonic distance sensor frequently used in robotics
projects to measure distances to objects. It works by emitting an ultrasonic signal
and measuring the time it takes for the echo to return to the sensor. Using the speed
of sound and the round-trip travel time, the sensor can determine the distance to an
object.
The HC-SR04 sensor is easy to use and integrate into robotics projects because
it has four pins: power, ground, trigger, and echo. It can be controlled by a variety
of electronic controllers, such as an Arduino board.
HC-SR04 sensors are useful for robots that need to avoid obstacles or navigate
in an environment. They can also be used to detect moving objects, measure distances
to objects, and so on; a short measurement sketch follows.
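
Below is a hedged sketch of this measurement on a Raspberry Pi with the RPi.GPIO library; the BCM pin numbers are assumptions that depend on the actual wiring:

# Hedged sketch of reading the HC-SR04 with RPi.GPIO on a Raspberry Pi;
# the BCM pin numbers below are assumptions that depend on the wiring.
import time
import RPi.GPIO as GPIO

TRIG, ECHO = 23, 24                     # assumed trigger and echo pins

GPIO.setmode(GPIO.BCM)
GPIO.setup(TRIG, GPIO.OUT)
GPIO.setup(ECHO, GPIO.IN)

def distance_cm():
    # Emit a 10-microsecond trigger pulse.
    GPIO.output(TRIG, True)
    time.sleep(0.00001)
    GPIO.output(TRIG, False)

    # The echo pin stays high for the round-trip travel time of the pulse.
    start = stop = time.time()
    while GPIO.input(ECHO) == 0:
        start = time.time()
    while GPIO.input(ECHO) == 1:
        stop = time.time()

    # distance = speed of sound (~34300 cm/s) * elapsed time / 2 (round trip)
    return (stop - start) * 34300 / 2

print(f"distance: {distance_cm():.1f} cm")
GPIO.cleanup()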

Figure 2.3: Operating principle of the HC-SR04 sensor.

2.5.2 Connection wires

Connection wires are electrical wires used to connect electronic components to the
Raspberry Pi. They are commonly used to connect sensors, actuators, displays, and
other components to the Raspberry Pi to create electronic circuits.

There are two types of connection wires: male-male connections and male-female
connections. Male-male connector wires are used to connect components that both
have male pins, while male-female connectors are used for connecting components
with a male pin and a female pin.
Connection wires are usually made of copper or copper alloy and are coated with
a plastic insulator to protect electrical wires and prevent short circuits. They are
available in a variety of colors to help identify and organize the different wires in a
circuit.

Figure 2.4: Connection wires.


Chapter 3

Practical aspect

3.1 Introduction

3.1.1 Implementation Overview

In this chapter, we will outline the step-by-step implementation of our project. We
will detail the connection and assembly process, describe the system development
and its underlying logic, and discuss the problems we encountered along with the
solutions we applied.

3.2 Background

The rapid advancements in artificial intelligence, computer vision, and robotics have
paved the way for innovative solutions across various industries. In the retail sector,
the increasing competition from online shopping platforms has compelled brick-and-
mortar stores to explore new ways to enhance customer experiences and maintain
their relevance. One promising approach is the integration of smart technologies, such
as autonomous human-following shopping trolleys, to provide a more personalized
and efficient shopping experience. Traditional shopping trolleys, while serving their
purpose, often pose challenges for customers, especially those with limited mobility
or those who find it cumbersome to navigate crowded store aisles while pushing
a trolley. Moreover, the lack of interactive features and personalized assistance in
conventional trolleys may lead to a less engaging shopping experience. To address
these issues and bridge the gap between online and offline shopping, the concept of
smart human-following shopping trolleys has emerged.

3.3 Problem Statement

Despite the potential benefits of smart human-following shopping trolleys, their im-
plementation faces several challenges. Firstly, robust human detection and tracking
in dynamic and crowded environments, such as shopping malls, is a complex task.
Traditional computer vision techniques may struggle to accurately identify and fol-
low the target person amidst occlusions, varying lighting conditions, and multiple
individuals in the scene. Secondly, autonomous navigation in indoor environments
requires precise localization, obstacle avoidance, and path planning capabilities. The
trolley must be able to seamlessly navigate through narrow aisles, avoid collisions
with static and moving obstacles, and maintain a safe following distance from the
target person. Ensuring smooth and reliable navigation while considering factors
such as speed, stability, and responsiveness is crucial for the trolley’s usability and
user acceptance. Lastly, integrating various hardware components, such as sensors,
cameras, and actuators, and developing a cohesive software architecture that enables
real-time processing, decision-making, and control is a significant challenge. The
system must be computationally efficient, modular, and scalable to accommodate
future enhancements and adaptations to different retail environments.

3.4 Synoptic diagram and function

Figure 3.1: Synoptic diagram of the system.

3.5 Training

3.5.1 Roboflow

Roboflow simplifies the process of annotating datasets for machine learning projects,
enabling teams to collaborate efficiently. The platform allows users to invite team
members, upload and share datasets, annotate images, and merge individual datasets
into a final one. This workflow not only speeds up the annotation process but
also ensures consistency and quality in the labeled data. Roboflow also supports
preprocessing and augmentations to enhance dataset quality, making it easier to
train robust machine learning models (Roboflow, 2024).

3.5.2 Google Colab

Training a model in Google Colab with TensorFlow involves a series of steps to set up
the environment, configure the model, and initiate the training process. Google
Colab provides an excellent platform for this due to its free access to powerful GPU
resources, which are crucial for training deep learning models efficiently.
The process begins with setting up the required dependencies, including the TensorFlow
repository and necessary Python packages. Next, configuration files are cre-
ated for both the model architecture and training parameters. These configuration
files specify details such as the number of classes, paths to the training and validation
datasets, and various model-specific settings.
Once the setup is complete, the training process can be started using specific
commands in Colab. These commands define the image size, batch size, number of
epochs, and paths to the configuration files and pre-trained weights. During training,
it is essential to monitor performance metrics like mAP (mean Average Precision)
to ensure the model is learning correctly.
After training, the model’s performance is evaluated on a validation dataset, and
visualizations of the training results are generated using tools like TensorBoard or
custom plotting scripts. Finally, the trained model can be tested on new images to
assess its real-world performance (Natsunoyuki, 2024).
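
The exact commands depend on the chosen model; as a simplified illustration of the workflow just described (image size, batch size, epochs, pre-trained weights, monitoring of validation metrics), the following Keras sketch fine-tunes a pre-trained backbone on an image dataset. The dataset paths, sizes, class count, and the MobileNetV2 backbone are all illustrative assumptions, not the project's exact configuration:

# Simplified, illustrative transfer-learning run in Colab with the Keras
# API of TensorFlow. Paths, sizes, and the backbone are assumptions.
import tensorflow as tf

IMG_SIZE, BATCH_SIZE, EPOCHS, NUM_CLASSES = (224, 224), 16, 50, 2

train_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset/train", image_size=IMG_SIZE, batch_size=BATCH_SIZE)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset/val", image_size=IMG_SIZE, batch_size=BATCH_SIZE)

# Start from pre-trained ImageNet weights and freeze the backbone.
base = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet")
base.trainable = False

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # MobileNetV2 scaling
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Validation metrics are printed each epoch and can be monitored here.
history = model.fit(train_ds, validation_data=val_ds, epochs=EPOCHS)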

3.6 Software part

3.6.1 Operating system

The operating system used on the Raspberry Pi was created specifically for it: a
Linux distribution tailored to Raspberry Pi devices. It is renowned for its dependability,
adaptability, security, and low power usage, and it comes with programming
languages and tools such as C, C++, and Python preinstalled.

3.6.2 Programming language and required libraries

In principle, any programming language could be used; however, Python is the most
suitable option for AI, ML, DL, and data science.
The libraries that are going to be used are:
**OpenCV-Python: This Python binding library was created to address computer
vision problems. Python is a general-purpose programming language created by
Guido van Rossum. It gained popularity very quickly, mostly due to its ease of use
and readable code: it allows the programmer to express ideas in fewer lines without
sacrificing readability. Python is slower than languages like C/C++. That said,
Python can easily be extended with C/C++, enabling us to write computationally
demanding code in C/C++ and produce Python wrappers that can be used as
Python modules. Because actual C++ code runs in the background, this offers two
benefits: the code is as fast as the original C/C++ code, and it is simpler to write.
A minimal capture loop is sketched below.
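
The sketch below shows such a loop (our own minimal example; the camera index and window name are arbitrary):

# Minimal OpenCV-Python capture loop: read frames from the default
# camera and display them until the 'q' key is pressed.
import cv2

cap = cv2.VideoCapture(0)            # device 0: the first camera
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break                        # camera disconnected or stream ended
    cv2.imshow("frame", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()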

**NumPy: The core Python module for scientific computing is called NumPy. It
provides a multidimensional array object, various derived objects (such as masked
arrays and matrices), and an assortment of routines for fast operations on arrays,
including sorting, selection, I/O, logical and mathematical operations, discrete
Fourier transforms, basic statistical operations, random simulation, and much more.
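
A few of these routines in action, as a quick illustration:

# A few of the NumPy routines mentioned above.
import numpy as np

a = np.array([[3, 1], [2, 4]])             # a 2-D array (matrix)
print(np.sort(a, axis=1))                  # sorting along each row
print(a.mean(), a.std())                   # basic statistics
print(np.fft.fft(np.ones(4)))              # discrete Fourier transform
print(np.random.default_rng(0).random(3))  # random simulation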
**TensorFlow Lite: TensorFlow Lite is a condensed version of TensorFlow for mobile
and embedded devices. It facilitates on-device machine learning by offering optimized
pre-trained models and tools to efficiently convert and deploy bespoke models.
Compatible with several platforms, TensorFlow Lite is well suited to applications
that demand low latency and low power consumption, such as mobile apps, Internet
of Things devices, and edge computing solutions. Its fast inference and low memory
consumption make it appropriate for real-time applications; a minimal inference
sketch follows.
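
The sketch below assumes a float32 model exported to a file named "model.tflite" and uses the tflite_runtime interpreter; both the file name and the dummy input are illustrative assumptions:

# Hedged sketch of on-device inference with the TensorFlow Lite Python
# interpreter; "model.tflite" and the dummy input are assumptions.
import numpy as np
import tflite_runtime.interpreter as tflite  # or: tf.lite.Interpreter

interpreter = tflite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed one dummy frame shaped like the model's expected input tensor.
frame = np.zeros(input_details[0]["shape"], dtype=np.float32)
interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()

output = interpreter.get_tensor(output_details[0]["index"])
print(output.shape)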

3.7 Flowchart

Figure 3.2: Flowchart of the program.


Chapter 4

Chapter 4 Name

4.1 Section 1

4.1.1 Subsection 1

4.2 Section 2

4.2.1 Subsection 2

Bibliography

Antoine Crochet-Damais. Data préparation : définition et fonctionnement. Journal
du Net, 2022. URL https://www.journaldunet.fr/intelligence-artificielle/guide-de-l-intelligence-artificielle/1501329-data-preparation-en-machine-learning-definition/.
Updated October 21, 2022.

Michael I Jordan and Tom M Mitchell. Machine learning: Trends, perspectives, and
prospects. Science, 349(6245):255–260, 2015.

Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):
436–444, 2015.

Weibo Liu, Zidong Wang, Xiaohui Liu, Nianyin Zeng, Yurong Liu, and Fuad E
Alsaadi. A survey of deep neural network architectures and their applications.
Neurocomputing, 234:11–26, 2017.

Natsunoyuki. Speeding up model training with Google Colab, 2024. URL
https://medium.com/@natsunoyuki/speeding-up-model-training-with-google-colab-b1ad7c48573e.
Accessed: 2024-06-04.

Matt Richardson and Shawn Wallace. Getting Started with Raspberry Pi. O’Reilly
Media, Inc., 2012.

Jérémy Robert. Stable Diffusion : tout savoir sur ce modèle de machine learning.
DataScientest, April 2024.

Roboflow. Getting started with Roboflow, 2024. URL
https://blog.roboflow.com/getting-started-with-roboflow/. Accessed: 2024-06-04.

Sumit Saha. A comprehensive guide to convolutional neural networks: the ELI5 way.
Towards Data Science, 2018.
Appendix A

Title of appendix

