BELAGAVI-590018, KARNATAKA.
A PROJECT REPORT
On
“Virtual Mouse using Hand Gesture Recognition”
1EE22CS400 ANUSHA R D
1EE22CS401 D SIMRAN
1EE22CS403 N RAJASHEKAR
1EE22CS404 NANDINI K C
Certificate
Certified that the Mini Project work entitled “Virtual Mouse using Hand Gesture Recognition”, carried out by Anusha R D (1EE22CS400), D Simran (1EE22CS401), N Rajashekar (1EE22CS403), and Nandini K C (1EE22CS404), bonafide students of East West College of Engineering, is in partial fulfillment for the award of the degree of Bachelor of Engineering in Computer Science and Engineering of Visvesvaraya Technological University, Belagavi, during the academic year 2023-24. It is certified that all corrections and suggestions indicated for Internal Assessment have been incorporated in the report deposited in the department library. The project report has been approved as it satisfies the academic requirements in respect of Project Work Phase II (21CSMP67) prescribed for the said degree.
DECLARATION
1) ANUSHA R D 1EE22CS400
2) D SIMRAN 1EE22CS401
3) N RAJASHEKAR 1EE22CS403
4) NANDINI K C 1EE22CS404
ACKNOWLEDGEMENT
The satisfaction and euphoria that accompany the completion of any task would be incomplete without mentioning the people who made it possible, whose constant guidance and encouragement crowned our efforts with success.
We consider it a privilege to express our gratitude and respect to all those who guided us in the completion of this project.
We are grateful to our Principal, Dr. Santhosh Kumar G, East West College of Engineering, for the constant support throughout our course and for the facilities provided to carry out this work successfully.
It is a great privilege to place on record our deep sense of gratitude to our beloved HOD, Dr. Lavanya N L, Department of Computer Science & Engineering, for the constant support throughout our course and for the facilities provided to carry out this work successfully.
We are also grateful to our project guide, Prof. Chaithra K R, Assistant Professor, Department of CSE, for her invaluable support and guidance.
We would also like to thank the teaching and non-teaching staff members who have helped us directly or indirectly during the project.
Lastly, but most importantly, we would like to thank our family and friends for their co-operation and motivation to complete this project successfully.
ANUSHA R D 1EE22CS400
D SIMRAN 1EE22CS401
N RAJASHEKAR 1EE22CS403
NANDINI K C 1EE22CS404
ABSTRACT
Recent improvements in gesture detection and hand tracking have brought about both opportunities and challenges. This work explores some of these techniques while outlining the challenges and the promising future possibilities for virtual reality and user engagement. In the wake of COVID-19, the goal of this work is also to reduce physical contact between people and the shared devices used to operate computers. The results should motivate additional research and, in the long run, support the use of virtual environments. With gesture recognition there are no such physical restrictions, and it may be used as a substitute for the traditional mouse: a variety of hand actions make it possible to move the cursor and to click and drag objects. The proposed system requires only a camera as its input device and is implemented using the Python language and the OpenCV library.
TABLE OF CONTENTS
Sl No CHAPTERS Page no
1 Introduction 1-2
1.1 Background 1
1.2 Overview 1
1.4 Objectives 2
5 Implementation 23-35
6 Testing 36-41
6.1 Overview 36
8.1 Conclusion 46
References 48-49
LIST OF FIGURES
CHAPTER 1
INTRODUCTION
1.1 Background
The development of human-computer interaction (HCI) technologies has led to innovative ways
for users to interact with digital devices. One such advancement is the use of hand gestures to
control virtual devices, such as a virtual mouse. This technology leverages computer vision and
machine learning to interpret human hand movements, enabling a touch-free method of
interacting with a computer.
Users who do not have a physical mouse can nevertheless use a virtual mouse to control their computer. A virtual mouse is software that allows users to give mouse inputs to a system without an actual mouse; since it relies on an ordinary webcam, it can, at a stretch, also be regarded as a hardware-based solution. A virtual mouse can usually be operated alongside other input devices, such as a physical mouse or a computer keyboard. A camera-controlled virtual mouse works with the help of a variety of image-processing techniques: the webcam captures images continuously, and mouse clicks are interpreted from the user's hand motions.
A physical mouse is also not easily adaptable to different environments, and its performance varies depending on the environment.
1.4 Objectives
CHAPTER 2
LITERATURE REVIEW
The system works by identifying the colour of the hand and deciding the position of the cursor accordingly, but several conditions and scenarios make it difficult for such an algorithm to run in a real environment, for example:
● Noise in the environment.
It therefore becomes very important that the colour-determining algorithm works accurately. The proposed system can work for any skin tone and in any lighting condition; for clicking, the user needs to create roughly a 15-degree angle between two fingers. The proposed system can replace both the traditional mouse and earlier algorithms that require coloured tapes for controlling the cursor. This work can be a pioneer in its field and a source of further research in the corresponding area. The project can be developed at zero cost and can easily be integrated with the existing system.
1. Virtual Mouse using hand gesture recognition allows users to control the mouse with the help of hand gestures.
2. The system's webcam is used for tracking hand gestures.
4. OpenCV provides a VideoCapture interface that is used to capture frames from a live video (a minimal example is sketched below).
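As a rough sketch, and assuming the default webcam at index 0, frames can be read in a loop with OpenCV's VideoCapture interface:

import cv2

cap = cv2.VideoCapture(0)                 # open the default webcam
while True:
    success, frame = cap.read()           # grab one frame per iteration
    if not success:
        break
    cv2.imshow("Webcam", frame)           # preview window
    if cv2.waitKey(1) & 0xFF == ord('q'): # press 'q' to quit
        break
cap.release()
cv2.destroyAllWindows()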
CHAPTER 3
SYSTEM REQUIREMENTS
2. Cursor Control
Cursor Movement: Translate hand movements into corresponding cursor movements on
the screen.
Sensitivity Adjustment: Allow users to adjust the sensitivity and speed of cursor
movement based on hand gestures.
Boundary Handling: Ensure the cursor does not move beyond the screen boundaries
regardless of hand position.
3. Click Functions
Left-Click: Recognize and execute a gesture for left-clicking (e.g., closing the fist).
Right-Click: Recognize and execute a gesture for right-clicking (e.g., a specific hand
pose).
Double-Click: Recognize and execute a gesture for double-clicking.
4. Scroll Functions
Vertical Scrolling: Recognize gestures for scrolling up and down.
Horizontal Scrolling: Recognize gestures for scrolling left and right (if applicable).
Scroll Speed Adjustment: Allow users to adjust the scrolling speed based on the gesture's
intensity or speed.
5. Drag and Drop
Drag Initiation: Recognize a gesture to initiate a drag action (e.g., holding the fist).
Dragging Movement: Translate hand movements into dragging actions.
Drop Action: Recognize a gesture to release the drag (e.g., opening the fist).
Privacy Considerations: Provide transparency about data usage and allow users to control
their data privacy settings.
Software requirements
Operating system: Windows 10
Coding Language: Python
IDE: Visual Studio
Python: Python is used to access the camera and track all hand motions; it is easy to use and accurate. Python comes with many built-in libraries, which keeps the code short and easily understandable. The Python version required for building this application is 3.7.
OpenCV Library: The OpenCV library is also used in this program. OpenCV (Open Source Computer Vision) is a library of programming functions for real-time computer vision. OpenCV can read image pixel values and can also be used to build real-time applications such as eye tracking and blink detection.
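For reference, one possible way to set up the environment is shown below; the exact package names are assumptions inferred from the libraries imported later in this report (OpenCV, MediaPipe, PyAutoGUI, AutoPy, pynput, pycaw, comtypes and screen_brightness_control):

# Indicative environment setup (package names assumed from the imports used in Chapter 5)
python -m pip install opencv-python mediapipe pyautogui autopy pynput pycaw comtypes screen-brightness-control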
CHAPTER 4
SYSTEM DESIGN/METHODOLOGY
4.1 Methodology
The various functions and conditions used in the system are explained in the flowchart of the real-time AI virtual mouse system.
The proposed AI virtual mouse system is based on the frames that have been captured by the
webcam in a laptop or PC. By using the Python computer vision library OpenCV, the video
capture object is created, and the web camera will start capturing video. The web camera
captures and passes the frames to the AI virtual system.
Fig. 4.1.1
The AI virtual mouse system uses the webcam where each frame is captured till the termination of the
program.
4.1.3 (Virtual Screen Matching) Rectangular Region for Moving through the
Window
The AI virtual mouse system makes use of the transformational algorithm, and it converts the
co-ordinates of fingertip from the webcam screen to the computer window full screen for
controlling the mouse. When the hands are detected and when we find which finger is up for
performing the specific mouse function, a rectangular box is drawn with respect to the computer
window in the webcam region where we move throughout the window using the mouse cursor.
Fig. 4.1.3
In this stage, we are detecting which finger is up using the tip Id of the respective finger that
we found using the MediaPipe and the respective co-ordinates of the fingers that are up and
according to that, the particular mouse function is performed.
Fig. 4.1.4
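A simplified sketch of this check is given below; it assumes that landmarks is the list of 21 MediaPipe hand landmarks, where the fingertip landmarks are 4, 8, 12, 16 and 20, and it reports which fingers are raised by comparing each tip with the joint below it:

tip_ids = [4, 8, 12, 16, 20]   # thumb, index, middle, ring, pinky fingertips

def fingers_up(landmarks):
    fingers = []
    # Thumb: compare the tip with the joint next to it along x (one hand orientation assumed)
    fingers.append(1 if landmarks[tip_ids[0]].x < landmarks[tip_ids[0] - 1].x else 0)
    # Other fingers: a finger counts as "up" when its tip is above its middle joint (smaller y)
    for tid in tip_ids[1:]:
        fingers.append(1 if landmarks[tid].y < landmarks[tid - 2].y else 0)
    return fingers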
4.1.5. Mouse Functions Depending on the Hand Gestures and Hand Tip Detection
Using Computer Vision
4.1.6. For the Mouse Cursor Moving around the Computer Window
If the index finger is up with tip Id = 1 or both the index finger with tip Id = 1 and the middle
finger with tip Id = 2 are up, the mouse cursor is made to move around the window of the
computer using the AutoPy package of Python.
Fig. 4.1.6
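A minimal sketch of this cursor-movement step is shown below; the frame size (640 x 480), the margin frameR and the use of NumPy's interp for the mapping are assumptions, while autopy.mouse.move is the AutoPy call referred to above:

import numpy as np
import autopy

wScr, hScr = autopy.screen.size()        # screen resolution
wCam, hCam, frameR = 640, 480, 100       # assumed webcam frame size and margin

def move_cursor(x_tip, y_tip):
    # Map the fingertip position inside the rectangular region to full-screen coordinates
    x = np.interp(x_tip, (frameR, wCam - frameR), (0, wScr))
    y = np.interp(y_tip, (frameR, hCam - frameR), (0, hScr))
    autopy.mouse.move(x, y)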
If both the index finger with tip Id = 1 and the thumb with tip Id = 0 are up and the distance between the two fingers is less than 30 px, the computer performs the left mouse button click using the pynput Python package.
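Assuming the distance between the two fingertips has already been computed in pixels, the click itself could be issued with the pynput library roughly as follows (the threshold of 30 px matches the description above):

from pynput.mouse import Button, Controller as MouseController

mouse = MouseController()

def maybe_left_click(distance_px, threshold_px=30):
    # Fire a single left click when the index tip and thumb tip come close together
    if distance_px < threshold_px:
        mouse.click(Button.left, 1)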
If both the index finger with tip Id = 1 and the middle finger with tip Id = 2 are up and the distance between the two fingers is less than 40 px, the computer performs the right mouse button click using the pynput Python package.
If both the index finger with tip Id = 1 and the thumb with tip Id = 0 are up, the distance between the two fingers is less than 10 px, and the two fingers are moved up the page, the computer performs the scroll-up mouse function using the PyAutoGUI Python package.
If both the index finger with tip Id = 1 and the thumb with tip Id = 0 are up, the distance between the two fingers is less than 10 px, and the two fingers are moved down the page, the computer performs the scroll-down mouse function using the PyAutoGUI Python package.
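Once the scroll gesture and its direction have been decided, the action itself is a single PyAutoGUI call; a small sketch (the step of 120 units is an assumption):

import pyautogui

def scroll(up: bool, step: int = 120):
    # Positive values scroll up, negative values scroll down
    pyautogui.scroll(step if up else -step)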
If all the fingers are up (tip Ids 0, 1, 2, 3 and 4), the computer does not perform any mouse event on the screen.
Capturing the Video and Processing. The video frames are converted from the BGR to the RGB colour space so that the hands can be found in each frame, as shown in the following code:
imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
self.results = self.hands.process(imgRGB)
Fig. 4.1.10
For the purpose of detecting hand gestures and hand tracking, the MediaPipe framework is
used, and the OpenCV library is used for computer vision. The algorithm makes use of
machine learning concepts to track and recognize hand gestures and hand tips.
MediaPipe:
MediaPipe is a framework used for building machine learning pipelines. The framework is multimodal, so it can be applied to various audio and video streams. Developers use the MediaPipe framework to build and analyse systems through graphs, and it has also been used to develop systems for application purposes.
OpenCV:
OpenCV is a computer vision library that contains image-processing algorithms for object detection. OpenCV provides Python bindings, and real-time computer vision applications can be developed with it. The OpenCV library is used in image and video processing and analysis, such as face detection and object detection.
System Design
System design is a modelling process. It can be defined as a transition from the user's view to the programmer's (developer's) view. It concentrates on translating the requirement specification into a design specification. The design phase acts as a bridge between the requirements specification and the implementation phase. In this stage, the complete description of our project was understood and all possible combinations to be implemented were considered.
The high-level design involves decomposing the system into modules and representing the interfaces and invocation relationships among the modules. A high-level design is also referred to as the software architecture. A high-level design document will usually include a high-level architecture diagram depicting the components and may also depict, or otherwise refer to, the workflow between the components of the system.
Detailed design involves the design of the internal logic of the individual modules and deciding on the class structures and algorithms to be used in the system. The detailed design determines how the components identified in the high-level design are implemented in software.
UML stands for Unified Modelling Language. UML is a standardized general-purpose modelling language in the field of object-oriented software engineering. The standard is managed, and was created, by the Object Management Group.
The goal is for UML to become a common language for creating models of object-oriented computer software. In its current form, UML comprises two major components: a meta-model and a notation. In the future, some form of method or process may also be added to, or associated with, UML.
The Unified Modelling Language is a standard language for specifying, visualizing, constructing and documenting the artifacts of software systems, as well as for business modelling and other non-software systems.
The UML represents a collection of best engineering practices that have proven successful in the modelling of large and complex systems.
The UML is a very important part of developing object-oriented software and the software development process. The UML uses mostly graphical notations to express the design of software projects.
4.5.1 Goals
Provide users with a ready-to-use, expressive visual modelling language so that they can develop and exchange meaningful models.
A use case diagram in the Unified Modelling Language (UML) is a type of behavioural
diagram defined by and created from a Use-case analysis. Its purpose is to present a graphical
overview of the functionality provided by a system in terms of actors, their goals (represented
as use cases), and any dependencies between those use cases. The main purpose of a use case
diagram is to show what system functions are performed for which actor. Roles of the actors
in the system can be depicted.
A sequence diagram is sometimes called a message sequence chart. Sequence diagrams are also called event diagrams, event scenarios, or timing diagrams.
1. Hand Detection:
Background Subtraction: Differentiates the moving hand from the static background.
Skin Color Detection: Identifies hand regions based on skin color using color spaces like
HSV or YCbCr.
Haar Cascades: Uses a machine learning-based object detection method trained with many positive and negative images of hands.
Deep Learning Models: Convolutional Neural Networks (CNNs) like YOLO, SSD, or
Faster R-CNN for precise hand detection.
2. Feature Extraction:
Edge Detection: Uses algorithms like Canny or Sobel to detect edges of the hand.
Contour Detection: Uses methods like the Douglas-Peucker algorithm to find and simplify
hand contours.
Keypoint Detection: Identifies key points on the hand using algorithms like the MediaPipe
Hands model or OpenPose.
3. Gesture Classification:
Template Matching: Compares the detected gesture with pre-defined gesture templates.
K-Nearest Neighbours (KNN): Classifies gestures based on the majority class among the
nearest neighbours in the feature space.
Support Vector Machines (SVM): Finds the hyperplane that best separates different
gesture classes in the feature space.
Neural Networks: Uses deep learning models like CNNs or RNNs for complex gesture
recognition.
Hidden Markov Models (HMM): Models the temporal dynamics of hand movements for
recognizing gestures involving motion sequences.
4. Tracking:
Kalman Filter: Predicts and updates the hand's position over time, reducing the impact
of noise.
Mean Shift Tracking: Tracks the hand by iteratively shifting a window to the maximum
density of pixels.
5. Pose Estimation:
MediaPipe: Google's framework for high-fidelity hand and pose tracking using deep
learning models.
OpenPose: Detects hand key points and pose estimation for robust gesture recognition.
These algorithms work together to detect, track, and recognize hand gestures, enabling the
creation of a virtual mouse system that can interpret and respond to user hand movements.
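As an illustration of the tracking step, the sketch below smooths the detected fingertip position with OpenCV's Kalman filter; the constant-velocity model and the noise values are illustrative assumptions rather than the exact configuration used in this project:

import numpy as np
import cv2

# State [x, y, dx, dy] observed through measurements [x, y]
kalman = cv2.KalmanFilter(4, 2)
kalman.measurementMatrix = np.array([[1, 0, 0, 0],
                                     [0, 1, 0, 0]], np.float32)
kalman.transitionMatrix = np.array([[1, 0, 1, 0],
                                    [0, 1, 0, 1],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], np.float32)
kalman.processNoiseCov = np.eye(4, dtype=np.float32) * 0.03

def smooth(x, y):
    # Predict the next position, then correct the filter with the new measurement
    prediction = kalman.predict()
    kalman.correct(np.array([[np.float32(x)], [np.float32(y)]]))
    return float(prediction[0]), float(prediction[1])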
Algorithm
Virtual mouse systems that use hand gesture recognition typically leverage various techniques
from computer vision and natural language processing (NLP) domains. Here are some of the
key components and techniques that might be involved:
4. Model Integration:
Real-time Processing: Efficient algorithms are crucial for real-time processing of hand
gestures and translating them into mouse actions without significant delay.
Accuracy and Robustness: NLP models need to be trained on diverse gesture datasets to
ensure accurate recognition across different users and environments.
Examples of specific models and technologies used could include:
For hand gesture recognition: CNN-based models like ResNet or MobileNet, combined
with pose estimation techniques.
For NLP: Transformer models like BERT or GPT, adapted to translate recognized gestures
into executable mouse commands.
These systems often require a combination of expertise from computer vision, machine learning, and NLP to create robust and intuitive virtual mouse interfaces based on hand gesture recognition.
CHAPTER 5
IMPLEMENTATION
Creating a virtual mouse using hand gesture recognition involves several key steps. Here's a
high-level overview of how you might implement this:
1. Capture Video Input: Use a camera to capture video input of the user's hand.
2. Hand Detection: Detect the hand in the video frames.
3. Gesture Recognition: Recognize specific hand gestures for different mouse actions.
4. Mouse Control: Map the recognized gestures to mouse movements and actions.
Step-by-Step Implementation:
1. Capture Video Input
Use OpenCV to capture video input from the camera.
2. Hand Detection
Use a pre-trained model like MediaPipe Hands to detect hands in the video frames.
3. Gesture Recognition
Recognize gestures by analysing the positions of the hand landmarks. For simplicity, let's
recognize two gestures: open hand (move mouse) and closed fist (click).
4. Mouse Control
Map the recognized gestures to mouse movements and actions.
This is a basic implementation. For a more robust solution, you might need to refine the gesture
recognition, handle edge cases, and add more gestures.
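A minimal sketch of this basic version is given below: the open hand moves the cursor and a closed fist clicks. The fist test (middle fingertip below its middle joint) and the use of the index fingertip to drive the cursor are simplifying assumptions; the full implementation used in this project follows in the next listing.

import cv2
import mediapipe as mp
import pyautogui

pyautogui.FAILSAFE = False
hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.7)
screen_w, screen_h = pyautogui.size()
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        continue
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = hands.process(rgb)
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0].landmark
        # Index fingertip (landmark 8) drives the cursor
        x, y = int(lm[8].x * screen_w), int(lm[8].y * screen_h)
        if lm[12].y > lm[10].y:           # crude "closed fist" test
            pyautogui.click()
        else:
            pyautogui.moveTo(x, y)
    cv2.imshow("Virtual Mouse (sketch)", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()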
Implementation Code
# Imports
import cv2
import mediapipe as mp
import pyautogui
import math
from enum import IntEnum
from ctypes import cast, POINTER
from comtypes import CLSCTX_ALL
from pycaw.pycaw import AudioUtilities, IAudioEndpointVolume
from google.protobuf.json_format import MessageToDict
import screen_brightness_control as sbcontrol
pyautogui.FAILSAFE = False
mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands
# Gesture Encodings
class Gest(IntEnum):
    # Binary encoded finger states
    FIST = 0
    PINKY = 1
    RING = 2
    MID = 4
    LAST3 = 7
    INDEX = 8
    FIRST2 = 12
    LAST4 = 15
    THUMB = 16
    PALM = 31
    # Extra Mappings
    V_GEST = 33
    TWO_FINGER_CLOSED = 34
    PINCH_MAJOR = 35
    PINCH_MINOR = 36
# Multi-handedness Labels
class HLabel(IntEnum):
    MINOR = 0
    MAJOR = 1
class HandRecog:
    def __init__(self, hand_label):
        # State that persists between frames for one hand
        self.finger = 0
        self.ori_gesture = Gest.PALM
        self.prev_gesture = Gest.PALM
        self.frame_count = 0
        self.hand_result = None
        self.hand_label = hand_label

    def update_hand_result(self, hand_result):
        self.hand_result = hand_result

    def get_dist(self, point):
        # Euclidean (x, y) distance between two landmarks
        dist = (self.hand_result.landmark[point[0]].x - self.hand_result.landmark[point[1]].x)**2
        dist += (self.hand_result.landmark[point[0]].y - self.hand_result.landmark[point[1]].y)**2
        dist = math.sqrt(dist)
        return dist

    def get_dz(self, point):
        return abs(self.hand_result.landmark[point[0]].z - self.hand_result.landmark[point[1]].z)

    def set_finger_state(self):
        # Encode open/closed fingers as bits of self.finger
        points = [[8, 5, 0], [12, 9, 0], [16, 13, 0], [20, 17, 0]]
        self.finger = 0
        self.finger = self.finger | 0  # thumb
        for idx, point in enumerate(points):
            dist = self.get_signed_dist(point[:2])
            dist2 = self.get_signed_dist(point[1:])
            try:
                ratio = round(dist / dist2, 1)
            except:
                ratio = round(dist / 0.01, 1)
            self.finger = self.finger << 1
            if ratio > 0.5:
                self.finger = self.finger | 1

    def get_gesture(self):
        # A gesture must persist for a few frames before it is accepted
        current_gesture = Gest.PALM
        if self.finger in [Gest.LAST3, Gest.LAST4] and self.get_dist([8, 4]) < 0.05:
            if self.hand_label == HLabel.MINOR:
                current_gesture = Gest.PINCH_MINOR
            else:
                current_gesture = Gest.PINCH_MAJOR
        if current_gesture == self.prev_gesture:
            self.frame_count += 1
        else:
            self.frame_count = 0
        self.prev_gesture = current_gesture
        if self.frame_count > 4:
            self.ori_gesture = current_gesture
        return self.ori_gesture
class Controller:
    # Shared state used while translating gestures into mouse and system actions
    tx_old = 0
    ty_old = 0
    trial = True
    flag = False
    grabflag = False
    pinchmajorflag = False
    pinchminorflag = False
    pinchstartxcoord = None
    pinchstartycoord = None
    pinchdirectionflag = None
    prevpinchlv = 0
    pinchlv = 0
    framecount = 0
    prev_hand = None
    pinch_threshold = 0.3
    def getpinchylv(hand_result):
        # Vertical displacement of the index tip since the pinch started
        dist = round((Controller.pinchstartycoord - hand_result.landmark[8].y) * 10, 1)
        return dist

    def getpinchxlv(hand_result):
        # Horizontal displacement of the index tip since the pinch started
        dist = round((hand_result.landmark[8].x - Controller.pinchstartxcoord) * 10, 1)
        return dist

    def changesystembrightness():
        # Raise or lower screen brightness in proportion to the pinch level
        currentBrightnessLv = sbcontrol.get_brightness() / 100.0
        currentBrightnessLv += Controller.pinchlv / 50.0
        if currentBrightnessLv > 1.0:
            currentBrightnessLv = 1.0
        elif currentBrightnessLv < 0.0:
            currentBrightnessLv = 0.0
        sbcontrol.fade_brightness(int(100 * currentBrightnessLv), start=sbcontrol.get_brightness())

    def changesystemvolume():
        # Adjust the master volume in proportion to the pinch level (pycaw)
        devices = AudioUtilities.GetSpeakers()
        interface = devices.Activate(IAudioEndpointVolume._iid_, CLSCTX_ALL, None)
        volume = cast(interface, POINTER(IAudioEndpointVolume))
        newVolumeLv = max(0.0, min(1.0, volume.GetMasterVolumeLevelScalar() + Controller.pinchlv / 50.0))
        volume.SetMasterVolumeLevelScalar(newVolumeLv, None)
    def scrollVertical():
        # Scroll up or down depending on the sign of the pinch level
        pyautogui.scroll(120 if Controller.pinchlv > 0.0 else -120)

    def scrollHorizontal():
        # Shift + Ctrl + scroll produces a horizontal scroll in most applications
        pyautogui.keyDown('shift')
        pyautogui.keyDown('ctrl')
        pyautogui.scroll(-120 if Controller.pinchlv > 0.0 else 120)
        pyautogui.keyUp('ctrl')
        pyautogui.keyUp('shift')
    def get_position(hand_result):
        # Simplified position helper (an assumption): maps the palm landmark to screen coordinates
        sx, sy = pyautogui.size()
        x = int(hand_result.landmark[9].x * sx)
        y = int(hand_result.landmark[9].y * sy)
        Controller.prev_hand = [x, y]
        return (x, y)

    def pinch_control_init(hand_result):
        # Remember where the pinch started so later movement can be measured against it
        Controller.pinchstartxcoord = hand_result.landmark[8].x
        Controller.pinchstartycoord = hand_result.landmark[8].y
        Controller.pinchlv = 0
        Controller.prevpinchlv = 0
        Controller.framecount = 0

    def pinch_control(hand_result, controlHorizontal, controlVertical):
        # Route the accumulated pinch displacement to the horizontal or vertical handler
        if Controller.pinchdirectionflag == True:
            controlHorizontal()  # x
        lvx = Controller.getpinchxlv(hand_result)
        lvy = Controller.getpinchylv(hand_result)

    def handle_controls(gesture, hand_result):
        x, y = Controller.get_position(hand_result)
        # flag reset
        if gesture != Gest.FIST and Controller.grabflag:
            Controller.grabflag = False
            pyautogui.mouseUp(button="left")
        # implementation
        if gesture == Gest.V_GEST:
            Controller.flag = True
            pyautogui.moveTo(x, y, duration=0.1)
'''
---------------------------------------- Main Class ----------------------------------------
Entry point of Gesture Controller
'''
class GestureController:
    gc_mode = 0
    cap = None
    CAM_HEIGHT = None
    CAM_WIDTH = None
    hr_major = None  # Right hand by default
    hr_minor = None  # Left hand by default
    dom_hand = True

    def __init__(self):
        GestureController.gc_mode = 1
        GestureController.cap = cv2.VideoCapture(0)
        GestureController.CAM_HEIGHT = GestureController.cap.get(cv2.CAP_PROP_FRAME_HEIGHT)
        GestureController.CAM_WIDTH = GestureController.cap.get(cv2.CAP_PROP_FRAME_WIDTH)
    def classify_hands(results):
        # Decide which detected hand is left and which is right using MediaPipe handedness
        left, right = None, None
        try:
            handedness_dict = MessageToDict(results.multi_handedness[0])
            if handedness_dict['classification'][0]['label'] == 'Right':
                right = results.multi_hand_landmarks[0]
            else:
                left = results.multi_hand_landmarks[0]
        except:
            pass

        try:
            handedness_dict = MessageToDict(results.multi_handedness[1])
            if handedness_dict['classification'][0]['label'] == 'Right':
                right = results.multi_hand_landmarks[1]
            else:
                left = results.multi_hand_landmarks[1]
        except:
            pass

        if GestureController.dom_hand == True:
            GestureController.hr_major = right
            GestureController.hr_minor = left
        else:
            GestureController.hr_major = left
            GestureController.hr_minor = right
    def start(self):
        handmajor = HandRecog(HLabel.MAJOR)
        handminor = HandRecog(HLabel.MINOR)

        with mp_hands.Hands(max_num_hands=2, min_detection_confidence=0.5,
                            min_tracking_confidence=0.5) as hands:
            while GestureController.cap.isOpened() and GestureController.gc_mode:
                success, image = GestureController.cap.read()
                if not success:
                    print("Ignoring empty camera frame.")
                    continue

                # MediaPipe expects RGB input, so convert before processing
                image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
                results = hands.process(image)
                image.flags.writeable = True
                image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

                if results.multi_hand_landmarks:
                    GestureController.classify_hands(results)
                    handmajor.update_hand_result(GestureController.hr_major)
                    handminor.update_hand_result(GestureController.hr_minor)

                    handmajor.set_finger_state()
                    handminor.set_finger_state()
                    gest_name = handminor.get_gesture()

                    if gest_name == Gest.PINCH_MINOR:
                        Controller.handle_controls(gest_name, handminor.hand_result)
                    else:
                        gest_name = handmajor.get_gesture()
                        Controller.handle_controls(gest_name, handmajor.hand_result)
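The controller defined above can then be started from a small entry point, for example:

# Entry point: create the gesture controller and begin processing webcam frames
gc = GestureController()
gc.start()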
CHAPTER 6
TESTING
6.1 Overview
The purpose of testing is to discover errors. Testing is the process of trying to discover every
conceivable fault or weakness in a work product. It provides a way to check the functionality
of components, sub-assemblies, assemblies, and/or a finished product. It is the process of
exercising software with the intent of ensuring that the Software system meets its requirements
and user expectations and does not fail in an unacceptable manner. There are various types of
tests. Each test type addresses a specific testing requirement.
Members of a design team will disagree on what the best path to pursue is.
The client and the designer will disagree as to which variation of an interface will work better.
The business or marketing team and the design team will disagree on which option will work better.
Apart from being a platform for everyone to raise sometimes heated personal opinions and biases, discussions like these usually lead nowhere other than hour-long, heavily loaded meetings. Data is, by far, the best way to settle these debates. A client wouldn't argue that a blue button is better than a red one if he knew the red variation would increase his revenue by 0.5% (or, let's say, $500/day).
A design team wouldn't argue over which imagery to use if they knew that a certain variation increases retention. A/B testing helps teams deliver better work, more efficiently. Going further, it also allows you to improve key business metrics. Testing – especially if it is conducted continually – enables you to optimize your interface and make sure your website is delivering the best results possible. Picture for a moment an e-commerce store.
The goal is to increase the number of checkouts. A/B testing only the listing page would have a small effect on the total number of checkouts; it wouldn't move the needle significantly, and neither would optimizing just the homepage header, for example. However, running tests to optimize all areas – from the menus all the way to the checkout confirmation – will result in a compound effect that makes more of an impact.
The term “black box” is used because in this type of testing, you don’t look inside of the
application. For this reason, non-technical people often conduct black box testing. Types of
black box testing include functional testing, system testing, usability testing, and regression
testing.
Features to be tested
Is it fast?
Functional tests provide systematic demonstrations that functions tested are available as
specified by the business and technical requirements, system documentation, and user
manuals.
In addition, systematic coverage pertaining to identifying business process flows, data fields, predefined processes, and successive processes must be considered for testing. Before functional testing is complete, additional tests are identified and the effective value of the current tests is determined.
The task of integration testing is to check that components or software modules – for example, components in a software system or, one step up, software applications at the company level – interact without error.
Test Results
All the test cases mentioned above passed successfully. No defects encountered.
Test Cases:
Testing is a multi-faceted process that ensures the hand-gesture virtual mouse system operates reliably, accurately, and efficiently. Below, we delve deeper into each testing aspect.
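As an illustration of how individual pieces can be unit-tested, the binary gesture encodings defined in the implementation can be checked for internal consistency; the sketch below uses Python's unittest module and assumes the Gest enum can be imported from the implementation module (the module name gesture_controller is hypothetical):

import unittest
from gesture_controller import Gest   # hypothetical module name for the implementation

class TestGestureEncodings(unittest.TestCase):
    def test_combined_codes(self):
        # Combined codes must equal the bitwise OR of their component fingers
        self.assertEqual(Gest.FIRST2, Gest.INDEX | Gest.MID)
        self.assertEqual(Gest.LAST3, Gest.PINKY | Gest.RING | Gest.MID)
        self.assertEqual(Gest.LAST4, Gest.LAST3 | Gest.INDEX)
        self.assertEqual(Gest.PALM, Gest.LAST4 | Gest.THUMB)

if __name__ == "__main__":
    unittest.main()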
CHAPTER 7
As computer use has become ingrained in our daily lives, human-computer interaction is becoming more and more convenient. While most people take these interactions for granted, people with disabilities frequently struggle to use them properly. In order to imitate mouse activities on a computer, this study offers a gesture-based virtual mouse system that makes use of hand movements and hand-tip detection. The major objective of the proposed system is to replace the traditional mouse with a web camera or a computer's built-in camera to perform computer mouse pointer and scroll functions. Real-time detection and identification of a hand or palm is accomplished using a single-shot detector model; MediaPipe uses the single-shot detector concept. Because it is simpler to learn palms, the hand-detection module initially trains a model for palm detection. Furthermore, for small objects like hands or fists, non-maximum suppression performs noticeably better. A hand landmark model then finds the joint or knuckle coordinates within the detected hand region. The camera frames from a laptop or PC serve as the foundation for the proposed gesture-based virtual mouse system. The video capture object is constructed using the Python computer vision package OpenCV, and the web camera begins recording footage. The virtual system receives frames from the web camera and processes them. The transformational method is used by the gesture-based virtual mouse system to translate the fingertip's coordinates from the camera screen to the full-screen computer window for operating the mouse.
CHAPTER 8
8.1 Conclusion
Summary of Work
In this study, we developed and evaluated a virtual mouse system that uses hand gesture
recognition to control the cursor and perform various mouse functions. The system leverages
computer vision techniques and machine learning algorithms to detect and interpret hand
gestures in real-time.
Key Findings
Accuracy and Responsiveness:
The hand gesture recognition system demonstrated high accuracy in detecting and
interpreting basic gestures such as clicking, dragging, and scrolling.
The system's responsiveness was found to be adequate for real-time applications, with
minimal latency observed during gesture recognition and cursor movement.
User Experience:
User feedback indicated that the virtual mouse system is intuitive and easy to learn,
providing a natural way of interacting with the computer.
The system's ability to reduce physical strain associated with traditional mouse usage was
highlighted as a significant benefit, especially for users prone to repetitive strain injuries.
Technical Challenges:
Lighting conditions and background complexity were identified as factors that could affect
the accuracy of hand gesture recognition. Solutions such as adaptive algorithms and
improved background segmentation were discussed.
Hand occlusions and variations in hand sizes and shapes presented challenges that were
partially mitigated through robust training data and advanced preprocessing techniques.
System Integration:
The virtual mouse system was successfully integrated with standard operating systems and
applications, demonstrating compatibility and ease of use.
The system's performance in various software environments was consistent, showing its
versatility and potential for widespread adoption.
The development of a virtual mouse using hand gesture recognition represents a significant
step forward in human-computer interaction. The positive results from this study suggest that
such systems could become a viable alternative to traditional input devices, offering improved
ergonomics and accessibility.
Future work should focus on:
Improving Gesture Recognition: Enhancing the accuracy and robustness of the gesture
recognition algorithm, particularly in challenging environments.
Expanding Gesture Vocabulary: Developing a broader set of gestures to provide more
functionality and customization options for users.
Optimizing Performance: Reducing latency and computational requirements to ensure
smooth and efficient operation on a wide range of hardware.
User Customization: Allowing users to personalize gesture controls to suit their
preferences and workflows.
Accessibility Enhancements: Tailoring the system to meet the needs of users with
disabilities, providing greater inclusivity in computer interactions.
In conclusion, the virtual mouse system using hand gesture recognition has demonstrated its potential as an innovative and practical tool for enhancing human-computer interaction. With continued development and refinement, it holds promise for transforming the way we interact with digital devices, making technology more accessible and intuitive for all users.
REFERENCES
1. Rao, A.K., Gordon, A.M., 2001. Contribution of tactile information to accuracy in pointing
movements. Exp. Brain Res. 138, 438–445. https://doi.org/10.1007/s002210100717.
2. Masurovsky, A., Chojecki, P., Runde, D., Lafci, M., Przewozny, D., Gaebler, M., 2020. Controller-Free Hand Tracking for Grab-and-Place Tasks in Immersive Virtual Reality: Design Elements and Their Empirical Study. Multimodal Technol. Interact. 4, 91. https://doi.org/10.3390/mti4040091.
3. Lira, M., Egito, J.H., Dall'Agnol, P.A., Amodio, D.M., Gonçalves, O.F., Boggio, P.S., 2017. The influence of skin colour on the experience of ownership in the rubber hand illusion. Sci. Rep. 7, 15745. https://doi.org/10.1038/s41598-017-16137-3.
4. Inside Facebook Reality Labs: Wrist-based interaction for the next computing platform [WWW Document], 2021. Facebook Technol. URL https://tech.fb.com/inside-facebook-reality-labs-wrist-based-interaction-for-the-next-computing-platform/ (accessed 3.18.21).
5. Danckert, J., Goodale, M.A., 2001. Superior performance for visually guided pointing in the lower visual field. Exp. Brain Res. 137, 303-308. https://doi.org/10.1007/s002210000653.
6. Carlton, B., 2021. HaptX Launches True-Contact Haptic Gloves For VR And Robotics. VRScout. URL https://vrscout.com/news/haptx-truecontact-haptic-gloves-vr/ (accessed 3.10.21).
7. Brenton, H., Gillies, M., Ballin, D., Chatting, D., 2005. The uncanny valley: does it exist? In: 19th British HCI Group Annual Conference: Workshop on Human-Animated Character Interaction.
8. Buckingham, G., Michelakakis, E.E., Cole, J., 2016. Perceiving and acting upon weight illusions in the absence of somatosensory information. J. Neurophysiol. 115, 1946-1953. https://doi.org/10.1152/jn.00587.2015.
9. J. Katona, "A review of human-computer interaction and virtual reality research fields in cognitive InfoCommunications," Applied Sciences, vol. 11, no. 6, p. 2646, 2021.
10. D. L. Quam, "Gesture recognition with a DataGlove," IEEE Conference on Aerospace and Electronics, vol. 2, pp. 755-760, 1990.
11. D.-H. Liou, D. Lee, and C.-C. Hsieh, "A real time hand gesture recognition system using motion history image," in Proceedings of the 2010 2nd International Conference on Signal Processing Systems, IEEE, Dalian, China, July 2010.
12. S. U. Dudhane, "Cursor control system using hand gesture recognition," IJARCCE, vol. 2, no. 5, 2013.
13. K. P. Vinay, "Cursor control using hand gestures," International Journal of Critical Accounting, vol. 0975-8887, 2016.
14. J. Katona, "A review of human-computer interaction and virtual reality research fields in cognitive InfoCommunications," Applied Sciences, vol. 11, no. 6, p. 2646, 2021.
15. J. Jaya and K. Thanushkodi, "Implementation of classification system for medical images," European Journal of Scientific Research, vol. 53, no. 4, pp. 561-569, 2011.