Major Report
BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE & ARTIFICIAL INTELLIGENCE
by
MADURI RAM CHARAN TEJA 2003A52026
SUHAAS SANGA 2003A52132
RENUKUNTLA DHANUSH 2003A52053
KOTHAPALLY PREM SAI 2003A52052
GURRAPU ADITYA KRISHNA 2003A52085
SR University, Ananthasagar, Warangal, Telangana - 506371
SR University
Ananthasagar, Warangal.
CERTIFICATE
This is to certify that this project entitled “ADVANCED HAND GESTURE CONTROLLED
PRESENTATION USING OPENCV” is the bonafide work carried out by MADURI RAM
CHARAN TEJA, SUHAAS SANGA, RENUKUNTLA DHANUSH, KOTHAPALLY
PREM SAI, and GURRAPU ADITYA KRISHNA as Capstone Project Phase-1, in partial
fulfillment of the requirements for the award of the degree of BACHELOR OF
TECHNOLOGY in the School of Computer Science and Artificial Intelligence during the
academic year 2023-2024, under our guidance and supervision.
Reviewer-1 Reviewer-2
Name: Name:
Designation: Designation:
Signature: Signature:
ACKNOWLEDGEMENT
We owe an enormous debt of gratitude to our Capstone Project Phase-1 guide Mr. R.
Vijaya Prakash, Professor, as well as to the Head of the School of CS&AI, Dr. M. Sheshikala,
Professor, for guiding us from the beginning through the end of Capstone Project Phase-1
with their intellectual advice and insightful suggestions. We truly value their consistent
feedback on our progress, which was always constructive and encouraging and ultimately
steered us in the right direction.
We express our thanks to the project coordinators Mr. Sallauddin Md, Asst. Prof., and
Mr. R. Ashok, Asst. Prof., for their encouragement and support.
Finally, we express our thanks to all the teaching and non-teaching staff of the
department for their suggestions and timely support.
ABSTRACT
Presentations are essential in many facets of life. Whether you are a student, a worker,
a business owner, or an employee of an organisation, you have probably given
presentations at some point. Presentations can feel tedious because a keyboard or other
specialised device is needed to manipulate and alter the slides. Our goal is to give users
hand gesture control of the slide display. Human-computer interaction has seen a sharp
increase in the use of gestures in recent years. This system attempts to control several
PowerPoint features with hand gestures. It uses machine learning, together with several
Python modules, to map movements and to identify gestures with minute variations.
The slides, the keys for switching them, and the audience's composure are some of the
factors that make crafting the ideal presentation increasingly difficult. A hand
gesture-based intelligent presentation system provides an easy way to manipulate or
update the slides. Presentations otherwise involve many pauses while the presenter uses
the keyboard to control them. The goal of the technology is to let users explore and
control the slide presentation with hand gestures. The method uses machine learning to
recognise different hand gestures for a wide range of tasks, and this recognition
approach provides a communication bridge between humans and systems.
TABLE OF CONTENTS
Chapter
1. Introduction
2. Related Work
3. Problem Statement
4. Requirement Analysis
5. Risk Analysis
6. Feasibility Analysis
7. Proposed Approach
8. Architecture Diagrams
9. Implementation
10. Results Comparison with Challenges
13. References
LIST OF FIGURES
FIGURE NO. TITLE
1 Data Pre-Processing
2 Cyclic Process
10 Pointer Gesture [0,1,1,0,0]
12 Delete Gesture [0,1,1,1,0]
16 Pointer Gesture (Mechanism)
18 Delete Gesture (Mechanism)
20 Gesture Working
LIST OF ACRONYMS
KLT Kanade-Lucas-Tomasi
os Operating System
re Regular Expression
OpenCV Open Source Computer Vision Library
NumPy Numerical Python
CNN Convolutional Neural Networks
1. INTRODUCTION
Central to this pioneering effort is the utilization of a camera to capture and interpret six distinct
hand gestures. Each of these gestures triggers precise actions within the presentation framework,
empowering presenters to seamlessly transition between slides, annotate or erase content,
highlight sections, and even conclude presentations – all accomplished through intuitive hand
movements. The system's standout feature lies in its ability to execute these functions without the
need for additional hardware, specialized gloves, or markers, thereby offering a cost-effective
and user-friendly alternative to traditional presentation controls.
At its technological core, this system is built upon the robust foundation of the Python
framework, incorporating crucial components such as OpenCV, CV Zone, NumPy, and
MediaPipe. This amalgamation of machine learning and motion image-based techniques allows the
system to aptly interpret intricate hand motions. This capability, in turn, empowers presenters to
convey non-verbal cues, effectively engage audiences, and maintain precise control over their
presentations, all while harnessing the simplicity and expressiveness of natural gestures.
The project represents a groundbreaking fusion of machine learning and computer vision,
presenting a versatile human-machine interface that redefines the traditional presentation
experience. With gestures like swiping, giving a thumbs-up, or halting, users effortlessly
command their presentation slides, significantly enhancing the flow, interactivity, and
expressiveness of their presentations. Ultimately, the project seeks to empower presenters by
providing a more natural and interactive means of controlling presentations, thus amplifying the
overall impact and effectiveness of their messages.
In this digital age, characterized by rapid technological advancements, this dynamic hand
gesture-based control system holds the promise of revolutionizing the art of presentations. It
offers a modern and engaging tool for communicators to captivate their audiences, introducing a
transformative shift in presentation delivery and engagement. Its innovative nature transcends
mere technological advancement, promising to redefine the landscape of effective
communication in the digital realm while catering to a diverse range of users by eliminating the
complexities associated with conventional presentation tools.
As this innovative system ushers in a new era of human-machine interaction, it not only
streamlines the presentation process but also adapts to diverse user skill levels, offering
inclusivity and accessibility to presenters of varying backgrounds and technical expertise. The
potential of this fusion of dynamic hand gestures and advanced computer vision extends beyond
presentations, hinting at a future where intuitive and natural interaction with technology becomes
an integral part of our daily lives.
2. RELATED WORK
In their study, authors Devivara Prasad and Mr. Srinivasulu M from UBDT
College of Engineering, India, explore the significance of gesture recognition
in Human-Computer Interaction (HCI), emphasizing its practical applications
for individuals with hearing impairments and stroke patients. They delve into
previous research on hand gestures, investigating image feature extraction
tools and AI-based classifiers for 2D and 3D gesture recognition. Their
proposed system harnesses machine learning, real-time image processing
with MediaPipe, and OpenCV to enable efficient and intuitive presentation
control using hand gestures, addressing the challenges of accuracy and
robustness. The research focuses on enhancing the user experience,
particularly in scenarios where traditional input devices are impractical,
highlighting the potential of gesture recognition in HCI.[1]
The paper titled "A Hand Gesture Based Interactive Presentation System
Utilizing Heterogeneous Cameras" authored by Bobo Zeng, Guijin Wang, and
Xinggang Lin presents a real-time interactive presentation system that
utilizes hand gestures for control. The system integrates a thermal camera
for robust human body segmentation, overcoming issues with complex
backgrounds and varying illumination from projectors. They propose a fast
and robust hand localization algorithm and a dual-step calibration method for
mapping interaction regions between the thermal camera and projected
content using a web camera. The system has high recognition rates for hand
gestures, enhancing the presentation experience. However, the challenges
they encountered during development, such as the need for precise
calibration and handling hand localization, are not explicitly mentioned in the
paper. [5]
This paper, authored by Rutika Bhor, Shweta Chaskar, Shraddha Date, and
guided by Prof. M. A. Auti, presents a real-time hand gesture recognition
system for efficient human-computer interaction. It allows remote control of
PowerPoint presentations through simple gestures, using Histograms of
Oriented Gradients and K-Nearest Neighbor classification with around 80%
accuracy. The technology extends beyond PowerPoint to potentially control
various real-time applications. The paper addresses challenges in creating a
reliable gesture recognition system and optimizing lighting conditions. It
hints at broader applications, such as media control, without intermediary
devices, making it relevant to the human-computer interaction field.
References cover related topics like gesture recognition in diverse domains.
[8]
In this paper, Thin Thin Htoo and Ommar Win introduce a real-time
hand gesture recognition system for PowerPoint presentations. The system
employs low-complexity algorithms and image processing steps like RGB to
HSV conversion, thresholding, and noise removal. It also calculates the
center of gravity, detects fingertips, and assigns names to fingers. Users can
control PowerPoint presentations using hand gestures for tasks like slide
advancement and slideshow control. The system simplifies human-computer
interaction by eliminating the need for additional hardware. The paper's
approach leverages computer vision and image processing techniques to
recognize and map gestures to specific PowerPoint commands. The authors
recognize the technology's potential for real-time applications and its
significance in human-computer interaction. The references include related
works in image processing and hand gesture recognition, enriching the
existing knowledge base. [9]
3. PROBLEM STATEMENT
The prevailing methods used during presentations involve the utilization of highlighters, digital
pens, and remote controls to manipulate slides, which can often be cumbersome and limit the
presenter's mobility. However, an innovative solution is sought to revolutionize this process by
developing a comprehensive hand gesture-controlled presentation application. This envisioned
application aims to seamlessly replace traditional tools by enabling users to execute various
presentation functions solely through hand gestures. The desired functionalities encompass
fundamental tasks such as changing slides, acting as a pointer, writing directly onto slides, and
the ability to undo any written annotations. The challenge lies in creating a cohesive and intuitive
system that not only recognizes an array of hand gestures accurately but also integrates
seamlessly with presentation software to provide a versatile and user-friendly experience. By
encompassing all these aspects, this proposed application aims to redefine the presentation
experience, allowing presenters to navigate slides, interact with content, and engage with
audiences effortlessly through intuitive hand gestures, thereby eliminating the need for
conventional tools and enhancing overall presentation dynamics.
The aim, therefore, is to build a complete hand gesture controlled presentation application that
works in all these aspects: changing slides, acting as a pointer, writing on a slide, and undoing
the writing on the slides.
4. REQUIREMENT ANALYSIS
Software Requirements:
1. PyCharm
2. Google Colab
3. VS Code
Python:
Ensure that Python is installed on your system. You can download it from the official website
https://www.python.org/
1. os - Operating System
2. re - Regular Expression
3. cv2 - OpenCV (Open Source Computer Vision Library)
4. numpy - Numerical Python
5. HandTrackingModule
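As an illustration, the corresponding imports in such a script might look like the sketch below; the HandTrackingModule import path assumes the cvzone library's hand-tracking component rather than a confirmed detail of this project.

import os          # e.g., listing exported slide images from a folder
import re          # e.g., natural-sorting slide filenames
import cv2         # OpenCV: camera capture, drawing, display
import numpy as np # array operations on frames and annotations

# Assumed import path, via the cvzone library's hand-tracking component
from cvzone.HandTrackingModule import HandDetector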
Functional Requirements:
a. Hand Gesture Recognition: Develop a system that accurately recognizes and interprets six
distinct hand gestures.
b. Slide Control: Enable the presenter to navigate between slides using specific hand gestures
(e.g., next slide, previous slide).
c. Writing and Erasing: Implement functionalities to allow writing or drawing on slides using
hand gestures and erasing content.
d. Highlighting: Enable the system to highlight specific sections or points on slides using
gestures.
e. Camera Integration: Utilize a camera interface to capture hand movements for gesture
recognition.
f. Software Components: Develop modules using Python, OpenCV, CV Zone, NumPy, and
MediaPipe to execute the hand gesture recognition system effectively.
Non-functional Requirements:
a. Accuracy: Ensure a high level of accuracy in recognizing hand gestures to prevent false
triggers and ensure seamless presentation control.
b. Usability: Design an intuitive user interface that allows presenters to easily understand and
use the hand gestures without complex learning curves.
c. Reliability: Create a robust system that operates consistently across various environmental
conditions, lighting situations, and hand orientations.
d. Security and Privacy: Address any potential security concerns related to using a camera
interface and ensure user data privacy.
5. RISK ANALYSIS
Technical Challenges:
a. Accuracy and Reliability: Hand gesture recognition systems might face accuracy issues,
especially in diverse lighting conditions or when different users with varying hand sizes are
involved.
b. Complexity in Gesture Recognition: Ensuring precise recognition of gestures within a
diverse set can be challenging. Some gestures may have similar patterns, leading to
misinterpretation or confusion.
c. Performance Concerns: Processing live camera feed in real-time may strain computational
resources and cause system lag or delays in response time.
d. Compatibility Issues: Compatibility problems may arise with different operating systems,
cameras, or hardware configurations, affecting the system's functionality.
User-Related Risks:
a. User Learning Curve: Users might find it challenging to adapt to new ways of controlling
presentations, especially those accustomed to traditional methods like clickers or keyboard
controls.
b. User Accessibility: Hand gesture recognition might not be suitable for individuals with
physical disabilities or conditions affecting hand movements.
6. FEASIBILITY ANALYSIS
Technical Feasibility:
a. Technology Stack: The project utilizes Python, OpenCV, CV Zone, NumPy, and MediaPipe.
These technologies are well-established and commonly used for computer vision and machine
learning applications, providing robust support.
b. Hand Gesture Recognition: OpenCV's capabilities in hand detection and tracking, along
with machine learning models, allow the identification and interpretation of hand gestures
effectively.
c. System Architecture: The system architecture involves modules for hand detection, finger
tracking, finger state classification, and gesture recognition, which seem technically feasible
based on existing libraries and algorithms.
Operational Feasibility:
a. Ease of Use: The proposed system aims to simplify the presentation process by allowing users
to control slides using hand gestures. This can potentially make presentations more intuitive and
engaging for both presenters and audiences.
b. Compatibility: The system's compatibility with different presentation formats (e.g.,
PowerPoint, images) needs to be considered for seamless integration with various presentation
tools.
7. PROPOSED APPROACH
Fig. 1: Data Pre-Processing
2. Building the Hand Gesture Recognition Model:
Programming Language: Python
Libraries:
os: Operating System.
re: Regular Expression.
OpenCV (cv2): For computer vision tasks, including image processing, detection, and tracking.
NumPy: For numerical operations and array manipulations.
Custom HandTrackingModule: A module for hand detection, finger tracking, and gesture
recognition, likely built on machine learning or custom algorithms; a minimal sketch of its
typical usage follows below.
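The sketch below assumes cvzone's HandDetector as the underlying implementation of such a module; the parameter values are illustrative, not confirmed details of this project.

import cv2
from cvzone.HandTrackingModule import HandDetector

# detectionCon is the minimum detection confidence (0.8 = 80%);
# maxHands limits tracking to the single presenting hand.
detector = HandDetector(detectionCon=0.8, maxHands=1)

cap = cv2.VideoCapture(0)                       # default webcam
ok, frame = cap.read()
if ok:
    hands, frame = detector.findHands(frame)    # detect hands and draw landmarks
    if hands:
        fingers = detector.fingersUp(hands[0])  # five 1/0 flags, e.g., [0, 1, 1, 0, 0]
        print(fingers)
cap.release()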
Fig. 2: Cyclic Process
Fig. 3: Step By Step Mechanism
9.1. Simulation Setup
Hardware Requirements:
Software Requirements:
1. Preprocessing:
Begin by capturing an initial frame from a video feed or image sequence that includes the
hand(s) you want to detect.
Convert the frame to a suitable color space (like grayscale) to simplify subsequent computations.
2. Feature Detection:
Apply a feature detection method (e.g., Harris corner detection, FAST features, etc.) to identify
distinctive points or corners within the image that can represent potential features of the hand.
3. Feature Tracking Initialization:
Select the features that are within the region(s) of the hand in the initial frame.
These features serve as the starting point for tracking the hand's movement across subsequent
frames.
4. Tracking the Features:
For each feature detected and selected, track its movement in the subsequent frames of the video
sequence using the KLT algorithm.
The KLT algorithm tracks the movement by estimating the optical flow, i.e., how the pixels or
features move between frames. It does so by finding the best matching points between
consecutive frames.
5. Feature Set Maintenance:
As the frames progress, some features might get occluded or become unreliable for tracking due
to factors like lighting changes or hand movement.
Constantly update and reinitialize the feature set by detecting new features in the regions where
the hand is expected to be present.
6. Hand Region Estimation:
Aggregate the tracked features that consistently represent the hand across multiple frames.
Using geometric or statistical methods (like bounding box estimation around the tracked
features), define the region where the hand is detected.
7. Hand Gesture Recognition (Optional):
After detecting and tracking the hand region, additional algorithms or machine learning models
can be employed to recognize specific gestures or actions performed by the hand.
The KLT algorithm, when adapted for hand detection, provides a framework for continuously
tracking features representing the hand across video frames. This allows for real-time estimation
of hand movement and location, enabling applications in gesture recognition, human-computer
interaction, and more. However, it's essential to consider potential challenges like occlusions,
lighting variations, and variations in hand appearance for robust hand detection using this
approach.
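A condensed sketch of this pipeline, using OpenCV's Shi-Tomasi corner detector and pyramidal Lucas-Kanade optical flow, is given below; the initial hand region of interest is a hard-coded placeholder here, whereas the full system would obtain it from a hand detector.

import cv2
import numpy as np

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)          # step 1: grayscale

# Steps 2-3: detect Shi-Tomasi corner features inside an assumed hand
# region of interest (a hand detector would supply this in practice).
x, y, w, h = 200, 100, 240, 240                              # placeholder ROI
roi_mask = np.zeros_like(prev_gray)
roi_mask[y:y + h, x:x + w] = 255
points = cv2.goodFeaturesToTrack(prev_gray, maxCorners=50,
                                 qualityLevel=0.01, minDistance=7,
                                 mask=roi_mask)

while points is not None and len(points) > 0:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Step 4: pyramidal Lucas-Kanade optical flow between consecutive frames
    new_points, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray,
                                                     points, None)
    good = new_points[status.flatten() == 1]                 # step 5: drop lost features

    if len(good) > 0:
        # Step 6: estimate the hand region as a bounding box over the
        # surviving tracked features.
        bx, by, bw, bh = cv2.boundingRect(good.astype(np.int32))
        cv2.rectangle(frame, (bx, by), (bx + bw, by + bh), (0, 255, 0), 2)

    cv2.imshow("KLT hand tracking", frame)
    if cv2.waitKey(1) & 0xFF == 27:                          # Esc to quit
        break
    prev_gray, points = gray, good.reshape(-1, 1, 2)

cap.release()
cv2.destroyAllWindows()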
Gesture Mechanism:
A horizontal green line drawn across the camera frame acts as a threshold that divides it into
two interaction regions; the following gestures are interpreted relative to this line.
Pointer Gesture:
This gesture allows users to activate a pointer function, enabling them to interact with content or
highlight specific areas on the slides. By allowing this gesture both above and below the line,
users can seamlessly control the pointer regardless of its position relative to the line.
Write & Draw Gesture:
Enabling this gesture both above and below the line grants users the ability to annotate or draw
on the slides. It offers flexibility for users to create annotations wherever they find it
comfortable, whether it's above or below the green line.
Delete Gesture:
The delete gesture is also made accessible in both regions to provide users the capability to erase
or delete any annotations or drawings made on the slides. This ensures ease of interaction and
correction, regardless of the position of the drawn content in relation to the green line.
The visual cue of the green line offers a clear demarcation for users, simplifying the navigation
of various presentation functionalities using hand gestures. It optimizes user control and
minimizes confusion, ensuring a smooth and intuitive interaction with the presentation content
while offering distinct areas for specific gesture-based actions.
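A minimal sketch of this region logic follows, assuming the green line sits at a fixed fraction of the frame height and using the pointer [0,1,1,0,0] and delete [0,1,1,1,0] patterns from the figure list; the draw and slide-change patterns are assumptions for illustration.

import cv2

FRAME_W, FRAME_H = 1280, 720
GESTURE_LINE_Y = int(FRAME_H * 0.4)               # assumed height of the green line

def draw_threshold_line(frame):
    # The green line gives the presenter a visual cue for the two regions.
    cv2.line(frame, (0, GESTURE_LINE_Y), (FRAME_W, GESTURE_LINE_Y),
             (0, 255, 0), 10)

def classify_gesture(fingers, hand_center_y):
    # Pointer and delete patterns follow the figure list; the draw and
    # slide-change patterns are illustrative assumptions.
    above_line = hand_center_y < GESTURE_LINE_Y   # image y grows downward
    if fingers == [1, 0, 0, 0, 0] and above_line:
        return "previous_slide"                   # navigation above the line only
    if fingers == [0, 0, 0, 0, 1] and above_line:
        return "next_slide"
    if fingers == [0, 1, 1, 0, 0]:
        return "pointer"                          # allowed above and below
    if fingers == [0, 1, 0, 0, 0]:
        return "draw"                             # write or draw anywhere
    if fingers == [0, 1, 1, 1, 0]:
        return "delete"                           # erase the last annotation
    return None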
10. RESULTS COMPARISON WITH CHALLENGES
Fig. 16: Pointer Gesture (Mechanism)
Fig. 17: Write or Draw Gesture (Mechanism)
Fig. 18: Delete Gesture (Mechanism)
Fig. 19: Delete Gesture (Mechanism)
Accuracy:
The detection confidence can be set to detectionCon=0.8 or 0.9 (0.8 = 80% and 0.9 = 90%).
Fig. 20: Gesture Working
The complete algorithm works on a simple array of size 5 containing only 1's and 0's:
[1,1,1,1,1]
1 indicates that the finger is up.
0 indicates that the finger is down.
In this array:
The 1st value indicates the thumb.
The 2nd value indicates the index finger.
The 3rd value indicates the middle finger.
The 4th value indicates the ring finger.
The 5th value indicates the little finger.
Based on the hand gesture given, the system converts the gesture into such an array and performs
the appropriate action on the presentation; a sketch of this dispatch follows below.
The hand-tracking module and hand detection work on this mechanism.
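Putting the mechanism together, a hedged sketch of a main loop that converts each detected gesture into its five-element array and dispatches an action; the slide-change patterns and the slide/annotation data structures here are assumptions, while the delete array follows the figure list.

import cv2
from cvzone.HandTrackingModule import HandDetector

# detectionCon=0.8 corresponds to the 80% threshold discussed below;
# raising it to 0.9 makes detection stricter (90%).
detector = HandDetector(detectionCon=0.8, maxHands=1)
cap = cv2.VideoCapture(0)
slide_index, annotations = 0, []

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hands, frame = detector.findHands(frame)
    if hands:
        fingers = detector.fingersUp(hands[0])    # the gesture as a 5-element array
        if fingers == [1, 0, 0, 0, 0]:            # assumed: previous slide
            slide_index = max(slide_index - 1, 0)
        elif fingers == [0, 0, 0, 0, 1]:          # assumed: next slide
            slide_index += 1
        elif fingers == [0, 1, 1, 1, 0] and annotations:
            annotations.pop()                     # delete gesture [0,1,1,1,0]
    cv2.imshow("Presentation control", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()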
We set the detection confidence to 80-90% for the presentation to function properly using
gestures. Achieving an accuracy level between 80% and 90% implies that the system can effectively
interpret hand gestures, enabling presenters to navigate through slides, highlight sections, or
perform other actions with a high degree of reliability. This reliability is crucial in real-world
scenarios, ensuring that the user experiences a seamless and responsive interaction while
delivering presentations.
The integration of cutting-edge technologies like deep learning techniques and OpenCV has
enabled the development of a sophisticated presentation control system reliant on hand gestures.
By elevating the accuracy of gesture recognition to 95% and allowing for future expansions in
gesture libraries, this project marks a significant advancement in user interaction with
presentations. The system's success lies in its ability to streamline presentation navigation,
annotation, and control through intuitive hand movements, ensuring a smoother and more
engaging experience for presenters. Furthermore, the forward-thinking approach to incorporate
speech controls and user detection techniques signifies a commitment to continual improvement,
promising a more versatile and user-centric system for controlling presentations.
This project's fundamental aim is to simplify presentation delivery by harnessing the natural
language of hand gestures. Through the amalgamation of deep learning methodologies and the
robustness of OpenCV, a functional and efficient presentation control mechanism has been
established. Moving forward, the project remains open to evolution and enhancement,
acknowledging the potential for future advancements in gesture recognition, user interaction, and
technological capabilities. This continual pursuit of improvement ensures that the system
remains adaptive, responsive, and at the forefront of revolutionizing the art of presentation
control, offering a user-friendly and technologically advanced solution for presenters worldwide.
13. REFERENCES
3. Hajeera Khanum, Dr. Pramod H B, "Smart Presentation Control by Hand Gestures Using
Computer Vision and Google's Mediapipe," International Research Journal of Engineering and
Technology (IRJET), e-ISSN: 2395-0056, Volume 09, Issue 07, July 2022.
https://www.irjet.net/archives/V9/i7/IRJET-V9I7482.pdf
4. Salonee Powar, Shweta Kadam, Sonali Malage, Priyanka Shingane, "Automated Digital
Presentation Control using Hand Gesture Technique," ITM Web of Conferences 44, 03031 (2022).
https://www.itm-conferences.org/articles/itmconf/pdf/2022/04/itmconf_icacc2022_03031.pdf
5. Bobo Zeng, Guijin Wang, Xinggang Lin, "A Hand Gesture Based Interactive Presentation
System Utilizing Heterogeneous Cameras," ISSN 1007-0214, pp. 329-336, Volume 17,
Number 3, June 2012.
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6216765