Report
BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE & ARTIFICIAL INTELLIGENCE
by
MADURI RAM CHARAN TEJA 2003A52026
SUHAAS SANGA 2003A52132
RENUKUNTLA DHANUSH 2003A52053
KOTHAPALLY PREM SAI 2003A52052
GURRAPU ADITYA KRISHNA 2003A52085
SR University, Ananthasagar, Warangal, Telangana - 506371
SR University
Ananthasagar, Warangal.
CERTIFICATE
This is to certify that this project entitled “GESTURE CONTROLLED PRESENTATION
SYSTEM WITH SPEECH RECOGNITION AND WEB INTERACTION” is the bona fide
work carried out by MADURI RAM CHARAN TEJA, SUHAAS SANGA, RENUKUNTLA
DHANUSH, KOTHAPALLY PREM SAI, and GURRAPU ADITYA KRISHNA as a Capstone
Project Phase-2 in partial fulfillment of the requirements for the award of the degree of
BACHELOR OF TECHNOLOGY in the School of Computer Science and Artificial Intelligence
during the academic year 2023-2024, under our guidance and supervision.
Reviewer-1 Reviewer-2
Name: Name:
Designation: Designation:
Signature: Signature:
ACKNOWLEDGEMENT
We owe an enormous debt of gratitude to our Capstone Project Phase-2 guide, Dr. R.
Vijaya Prakash, Professor, as well as to the Head of the School of CS&AI, Dr. M. Sheshikala,
Professor, for guiding us from the beginning through the end of the Capstone Project Phase-2
with their intellectual advice and insightful suggestions. We truly value their consistent
feedback on our progress, which was always constructive and encouraging and ultimately steered
us in the right direction.
We express our thanks to the project coordinators, Mr. Sallauddin Md, Asst. Prof., and
R. Ashok, Asst. Prof., for their encouragement and support.
Finally, we express our thanks to all the teaching and non-teaching staff of the
department for their suggestions and timely support.
ABSTRACT
The system allows users to control a presentation using hand gestures captured through a webcam. It
starts by prompting the user to select a folder containing PNG images representing slides. The
program then renames the images sequentially for easier navigation. Once the folder is selected, the
webcam captures the user's hand gestures. Hand gestures are interpreted to navigate through slides
(swipe left/right), annotate slides (draw), erase annotations, and perform additional commands
(speech recognition). Speech recognition enables users to issue commands like "open slide X,"
"next," "previous," "delete," "delete all," and "terminate." These commands facilitate easy navigation
and interaction with the presentation. Annotations can be drawn on slides using specific hand
gestures, enhancing interactivity during presentations. Additionally, there is an option to delete
individual annotations or all annotations on a slide. The system provides visual feedback by
overlaying annotations on the slides in real-time. It also displays the total number of slides and a
rectangle indicating the area for gesture recognition. In summary, the project combines computer
vision, speech recognition, and user interface design to create an intuitive and interactive
presentation system.
TABLE OF CONTENTS
6 Feasibility Analysis 13
LIST OF FIGURES:
1 Data Pre-Processing 14
2 Cyclic Process 17
4 Working Mechanism 18
12 Pointer Gesture[0,1,1,0,0] 31
14 Delete Gesture[0,1,1,1,0] 32
19 Pointer Gesture(Mechanism) 37
21 Delete Gesture(Mechanism) 38
23 Listening Mode 39
24 Recognizing mode 39
25 Gesture Working 40
LIST OF ACRONYMS
KLT Kanade-Lucas-Tomasi
os Operating System
re Regular Expression
OpenCV Open Source Computer Vision Library
numpy Numerical Python
CNN Convolutional Neural Networks
ML Machine Learning
DL Deep Learning
1. INTRODUCTION
Built upon a foundation of cutting-edge technologies, our system harnesses the power of the
OpenCV library for seamless hand tracking and gesture recognition, allowing users to intuitively
navigate slides with simple hand movements. Furthermore, we've seamlessly integrated speech
recognition functionalities using the SpeechRecognition library, enabling users to interact
effortlessly with the presentation through voice commands. This fusion of computer vision and
speech recognition technologies empowers presenters to deliver captivating presentations with
enhanced interactivity and engagement. By combining these key components, our system offers a
streamlined and immersive presentation experience that elevates communication and audience
interaction to new heights.
The user interface is designed to be intuitive and easy to use. Users can select a folder containing
their presentation slides using a graphical interface built with the tkinter library. Once the
presentation folder is selected, the system automatically renames the slides and prepares them for
display.
At the heart of our presentation control system are hand gestures, which serve as the primary
method for navigating slides and interacting with content. Users can effortlessly swipe left or
right to move between slides, while specific gestures enable annotation and command activation,
such as deleting annotations or advancing slides. Powered by the HandDetector module, our
system accurately tracks hand movements and interprets gestures in real-time, ensuring seamless
and responsive control throughout the presentation. This intuitive gesture-based interface
enhances user engagement and simplifies the presentation experience, fostering smoother
communication and interaction.
In addition to hand gestures, our system offers voice command functionality, enhancing control
and convenience for users. With simple commands like "next slide" or "delete annotation," users
can effortlessly navigate through the presentation and manage content using natural language.
Leveraging the Google Web Speech API, our system accurately recognizes and processes spoken
commands, ensuring smooth and intuitive interaction with the presentation. This integration of
voice control adds versatility to the user experience, allowing for hands-free operation and
enabling users to focus on delivering their message effectively.
The fusion of gesture-based control, voice recognition, and an intuitive interface results in our
Interactive Presentation System providing an immersive and engaging presentation experience.
Ideal for various settings such as classrooms, boardrooms, or conferences, this system enables
presenters to captivate their audience and deliver impactful presentations effortlessly. With
seamless integration and user-friendly features, presenters can navigate slides, annotate content,
and execute commands with precision and ease. The system's versatility ensures adaptability to
diverse presentation styles and enhances the overall effectiveness of communication and
engagement.
2. RELATED WORK
In their study, authors Devivara Prasad and Mr. Srinivasulu M from UBDT College of
Engineering, India, explore the significance of gesture recognition in Human-Computer
Interaction (HCI), emphasizing its practical applications for individuals with hearing
impairments and stroke patients. They delve into previous research on hand gestures,
investigating image feature extraction tools and AI-based classifiers for 2D and 3D gesture
recognition. Their proposed system harnesses machine learning, real-time image processing with
MediaPipe, and OpenCV to enable efficient and intuitive presentation control using hand
gestures, addressing the challenges of accuracy and robustness. The research focuses on
enhancing the user experience, particularly in scenarios where traditional input devices are
impractical, highlighting the potential of gesture recognition in HCI.[1]
The paper authored by G. Reethika, P. Anuhya, and M. Bhargavi from JNTU, ECE, Sreenidhi
Institute of Science and Technology, Hyderabad, India, presents a study on Human-Computer
Interaction (HCI) with a focus on hand gesture recognition as a natural interaction technique. It
explores the significance of real-time hand gesture recognition, particularly in scenarios where
traditional input devices are impractical. The methodology involves vision-based techniques that
utilize cameras to capture and process hand motions, offering the potential to replace
conventional input methods. The paper discusses the advantages and challenges of this approach,
such as the computational intensity of image processing and privacy concerns regarding camera
usage. Additionally, it highlights the benefits of gesture recognition for applications ranging
from controlling computer mouse actions to creating a virtual HCI device.[2]
The paper titled "Smart Presentation Control by Hand Gestures Using Computer Vision and
Google’s MediaPipe" was authored by Hajeera Khanum, an M.Tech student, and Dr. Pramod H
B, an Associate Professor from the Department of Computer Science Engineering at Rajeev
Institute of Technology in Hassan, Karnataka, India. Their research, though lacking a specific
publication year, outlines a methodology that harnesses OpenCV and Google's MediaPipe
framework to create a presentation control system that interprets hand gestures. Using a webcam,
the system captures and translates hand movements into actions such as slide control, drawing on
slides, and erasing content, eliminating the need for traditional input devices. While the paper
does not explicitly enumerate the challenges encountered during system development, common
obstacles in this field may include achieving precise gesture recognition, adapting to varying
lighting conditions, and ensuring the system's reliability in real-world usage scenarios. This work
contributes to the advancement of human-computer interaction, offering a modern and intuitive
approach to controlling presentations through hand gestures.[3]
In their paper titled "Automated Digital Presentation Control Using Hand Gesture Technique,"
authors Salonee Powar, Shweta Kadam, Sonali Malage, and Priyanka Shingane introduce a
system that utilizes artificial intelligence-based hand gesture detection, employing OpenCV and
MediaPipe. While the publication year is unspecified, the system allows users to control
presentation slides via intuitive hand gestures, eliminating the reliance on conventional input
devices like keyboards or mice. The gestures correspond to various actions, including initiating
presentations, pausing videos, transitioning between slides, and adjusting volume. This
innovative approach enhances the natural interaction between presenters and computers during
presentations, demonstrating its potential in educational and corporate settings. Notably, the
paper does not explicitly detail the challenges encountered during the system's development, but
it makes a valuable contribution to the realm of human-computer interaction by rendering digital
presentations more interactive and user-friendly. [4]
The paper titled "A Hand Gesture Based Interactive Presentation System Utilizing
Heterogeneous Cameras" authored by Bobo Zeng, Guijin Wang, and Xinggang Lin presents a
real-time interactive presentation system that utilizes hand gestures for control. The system
integrates a thermal camera for robust human body segmentation, overcoming issues with
complex backgrounds and varying illumination from projectors. They propose a fast and robust
hand localization algorithm and a dual-step calibration method for mapping interaction regions
between the thermal camera and projected content using a web camera. The system has high
recognition rates for hand gestures, enhancing the presentation experience. However, the
challenges they encountered during development, such as the need for precise calibration and
handling hand localization, are not explicitly mentioned in the paper. [5]
The paper "Smart Presentation Using Gesture Recognition" by Meera Paulson, Nathasha P R,
Silpa Davis, and Soumya Varma introduces a gesture recognition system for enhancing
presentations and enabling remote control of electronic devices through hand gestures. It
incorporates ATMEGA 328, Python, Arduino, Gesture Recognition, Zigbee, and wireless
transmission. The paper emphasizes the significance of gesture recognition in human-computer
interaction, its applicability in various domains, and its flexibility to cater to diverse user needs.
The system offers features such as presentation control, home automation, background change,
and sign language interpretation. The authors demonstrated a cost-effective prototype with easy
installation and extensive wireless signal transmission capabilities. The paper discusses the
results, applications, methodology, and challenges, highlighting its potential to improve
human-machine interaction across different fields.[6]
The paper "Adaptive Hand Gesture Recognition System Using Machine Learning Approach,"
authored by Rina Damdoo, Kanak Kalyani, and Jignyasa Sanghavi from the Department of
Computer Science & Engineering at Shri Ramdeobaba College of Engineering and Management
in Nagpur, India, was received on 7th October 2020 and accepted after revision on 28th
December 2020. This paper presents a vision-based adaptive hand gesture recognition system
employing Convolutional Neural Networks (CNN) for machine learning classification. The study
addresses the challenges of recognizing dynamic hand gestures in real time and focuses on the
impact of lighting conditions. The authors highlight that the performance of the system
significantly depends on lighting conditions, with better results achieved under good lighting.
They acknowledge that developing a robust system for real-time dynamic hand gesture
recognition, particularly under varying lighting conditions, is a complex task. The paper offers
insights into the potential for further improvement and the use of filtering methods to mitigate
the effects of poor lighting, contributing to the field of dynamic hand gesture recognition.[7]
This paper, authored by Rutika Bhor, Shweta Chaskar, Shraddha Date, and guided by Prof. M.
A. Auti, presents a real-time hand gesture recognition system for efficient human-computer
interaction. It allows remote control of PowerPoint presentations through simple gestures, using
Histograms of Oriented Gradients and K-Nearest Neighbor classification with around 80%
accuracy. The technology extends beyond PowerPoint to potentially control various real-time
applications. The paper addresses challenges in creating a reliable gesture recognition system
and optimizing lighting conditions. It hints at broader applications, such as media control,
without intermediary devices, making it relevant to the human-computer interaction field.
References cover related topics like gesture recognition in diverse domains. [8]
In this paper by Thin Thin Htoo and Ommar Win, they introduce a real-time hand gesture
recognition system for PowerPoint presentations. The system employs low-complexity
algorithms and image processing steps like RGB to HSV conversion, thresholding, and noise
removal. It also calculates the center of gravity, detects fingertips, and assigns names to fingers.
Users can control PowerPoint presentations using hand gestures for tasks like slide advancement
and slideshow control. The system simplifies human-computer interaction by eliminating the
need for additional hardware. The paper's approach leverages computer vision and image
processing techniques to recognize and map gestures to specific PowerPoint commands. The
authors recognize the technology's potential for real-time applications and its significance in
human-computer interaction. The references include related works in image processing and hand
gesture recognition, enriching the existing knowledge base. [9]
The authors propose a novel method for hands-free control of PowerPoint presentations using
real-time hand gestures, eliminating the need for external devices. Their approach involves
segmenting the hand in real-time video by detecting skin color, even in varying lighting
conditions. The number of active fingers is counted to recognize specific gestures, allowing
actions like advancing slides, going back, starting, and exiting the slideshow. The method,
implemented with .Net functions and MATLAB, achieved over 90% accuracy in tests with
various participants. Challenges include hand positioning variations, potential misplacements,
and issues with similar background elements. Future work may focus on accuracy improvement,
gesture expansion, and broader software control applications. [10]
3. PROBLEM STATEMENT
4. REQUIREMENT ANALYSIS
Software Requirements:
1. PyCharm
2. Google Colab
3. VS Code
Python:
Ensure that Python is installed on your system. You can download it from the official website
https://www.python.org/
1. os - Operating System
2. re - Regular Expression
3. cv2 - OpenCV (Open Source Computer Vision Library)
4. numpy - Numerical Python
5. HandTrackingModule
6. SpeechRecognition Module
Functional Requirements:
a. Speech Recognition:
The system should accurately recognize speech commands spoken by the user using a
microphone.
It should support commands such as "next," "previous," "delete," "delete all," "open slide no,"
and "terminate".
b. User Interface:
The system should provide a user-friendly interface for selecting a folder using a file dialog.
It should display slides along with hand-drawn annotations and a webcam feed.
The interface should indicate the total number of slides and the current slide number.
c. Gesture Recognition:
Hand gestures should be detected to control slide navigation (e.g., a left swipe for the previous
slide, a right swipe for the next slide).
Gestures for drawing annotations and erasing content on slides should be recognized.
d. Slide Control:
Users should be able to navigate between slides using both speech commands and hand gestures.
The system should allow users to jump to a specific slide by speaking the command "open slide
[slide number]."
There should be support for erasing annotations using specific gestures or commands.
f. Folder Operations:
Users should be able to select a folder containing slide images.
The system should rename PNG files in the selected folder with sequential numbers.
g. Camera Integration: Utilize a camera interface to capture hand movements for gesture
recognition.
h. Software Components: Develop modules using Python, OpenCV, CV Zone, NumPy, and
MediaPipe to execute the hand gesture recognition system effectively.
Non-functional Requirements:
a. Accuracy:
Ensure a high level of accuracy in recognizing hand gestures to prevent false triggers and
ensure seamless presentation control.
b. Performance:
Aim for real-time responsiveness in interpreting gestures to maintain a smooth and uninterrupted
presentation flow.
c. Usability:
Design an intuitive user interface that allows presenters to easily understand and use the hand
gestures without complex learning curves.
d. Compatibility:
e. Reliability:
Create a robust system that operates consistently across various environmental conditions,
lighting situations, and hand orientations.
f. Security:
Address any potential security concerns related to using a camera interface and ensure user
data privacy.
5. RISK ANALYSIS
Technical Challenges:
User-Related Risks
d. Privacy Concerns:
Risk: User concerns about microphone and camera usage.
Mitigation: Communicate privacy policies, offer control options.
6. FEASIBILITY ANALYSIS
Technical Feasibility:
a. Technology Stack: The project utilizes Python, OpenCV, CV Zone, NumPy, and MediaPipe.
These technologies are well-established and commonly used for computer vision and
machine learning applications, providing robust support.
b. Hand Gesture Recognition: OpenCV's capabilities in hand detection and tracking,
along with machine learning models, allow the identification and interpretation of hand
gestures effectively.
c. System Architecture: The system architecture involves modules for hand detection,
finger tracking, finger-state classification, gesture recognition, and speech recognition, which
appear technically feasible based on existing libraries and algorithms.
Operational Feasibility:
a. Ease of Use: The proposed system aims to simplify the presentation process by allowing users to
control slides using hand gestures. This can potentially make presentations more intuitive and
engaging for both presenters and audiences.
b. Compatibility: The system's compatibility with different presentation formats (e.g.,
PowerPoint, images) needs to be considered for seamless integration with various
presentation tools.
7. Proposed Approach
Fig.1, Data Pre-Processing
2. Building the Hand Gesture Recognition and Speech Recognition Model:
Programming Language: Python
Libraries:
os: Operating System.
re: Regular Expression.
OpenCV (cv2): For computer vision tasks, including image processing, detection, and tracking.
NumPy: For numerical operations and array manipulations.
Custom HandTrackingModule: A module for hand detection, finger tracking, and gesture
recognition. Likely built using machine learning or custom algorithms.
SpeechRecognition: a Python module that helps programs understand and process spoken
language. It listens to audio input, such as speech captured through a microphone, and converts it
into text that the program can understand and work with.
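A minimal setup sketch of these libraries is shown below; the cvzone import path for the hand tracking module and the parameter values are assumptions based on the modules named in this report, not the project's exact code.

# Minimal setup sketch of the libraries listed above.
import os                                   # file and folder operations
import re                                   # regular expressions for renaming slides
import cv2                                  # OpenCV: image processing and webcam feed
import numpy as np                          # numerical operations on image arrays
import speech_recognition as sr             # speech-to-text for voice commands
from cvzone.HandTrackingModule import HandDetector  # hand detection and finger tracking

detector = HandDetector(detectionCon=0.8, maxHands=1)  # 80% detection confidence
recognizer = sr.Recognizer()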
5. Implementation and Testing:
User Interaction:
Allow users to interact with the presentation using predefined hand gestures.
Utilize hand gesture recognition to control slide transitions, writing, erasing, highlighting, and
other actions.
Presentation Control:
Trigger actions such as next slide, previous slide, write/draw, delete, pointer, and exit/terminate
presentation based on recognized gestures.
Implement speech recognition to provide an alternative means of user interaction, allowing users
to control the presentation using voice commands.
Testing:
Test the system extensively with various hand gestures and speech commands to ensure accurate
recognition and reliable action execution.
Conduct usability testing to gather feedback from users and refine the system based on their
experience and suggestions.
7. ARCHITECTURE DIAGRAMS:
Fig.2, Cyclic Process
Fig.4, Working Mechanism
8. SIMULATION SETUP AND IMPLEMENTATION
Hardware Requirements:
Software Requirements:
7.2 Implementation
Fig.6, Code Implementation
Fig.7, Code Implementation
Fig.8, Code Implementation
Fig.9, Code Implementation
1. User Uploaded PPT Images:
Users upload PowerPoint slides which are converted to PNG format.
2. Renaming the PPT Images with Sequence Numbers:
The system assigns sequential numbers to each uploaded PNG image to ensure the slides are in a
specific order.
3. Sorting the PPT Images:
After renaming, the images are sorted based on the assigned numerical sequence to ensure the
correct order of slides.
4. Storing the Images Folder to Variables:
The folder containing the sorted PNG images is stored as a variable or accessed for further
processing in the Python environment.
5. Importing Hand Tracking Module (KLT Algorithm):
A custom Hand Tracking Module utilizing the Kanade-Lucas-Tomasi (KLT) algorithm is
imported. This module enables hand detection, finger tracking, and gesture recognition.
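Steps 2-4 above (renaming, sorting, and storing the PNG slides) can be sketched as follows; the helper name and the exact renaming scheme are assumptions used only for illustration.

import os
import re

def prepare_slides(folder_path):
    # Sketch of steps 2-4: rename the PNG slides to sequential numbers
    # (1.png, 2.png, ...) and return them in display order.
    pngs = [f for f in os.listdir(folder_path) if f.lower().endswith(".png")]
    # Order by any number already present in the file name, else alphabetically.
    def slide_key(name):
        match = re.search(r"\d+", name)
        return (int(match.group()) if match else float("inf"), name)
    pngs.sort(key=slide_key)
    # Two-pass rename (via temporary names) to avoid collisions with existing files.
    temp_names = []
    for i, name in enumerate(pngs, start=1):
        temp = f"__tmp_{i}.png"
        os.rename(os.path.join(folder_path, name), os.path.join(folder_path, temp))
        temp_names.append(temp)
    final_names = []
    for i, temp in enumerate(temp_names, start=1):
        final = f"{i}.png"
        os.rename(os.path.join(folder_path, temp), os.path.join(folder_path, final))
        final_names.append(final)
    return final_names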
8. Initialization and Configuration:
The speech recognition module (speech_recognition) is imported at the beginning of the code.
An instance of the Recognizer class is created and assigned to the variable recognizer.
The recognize_speech() function is defined to handle speech recognition tasks.
This function uses the microphone as the audio source to listen for user commands.
When invoked, it displays a "Listening..." message on the image combined with the webcam feed
(img_combined) to indicate that it's ready to receive commands.
It adjusts for ambient noise for one second using recognizer.adjust_for_ambient_noise() to
improve recognition accuracy.
The listen() method of the recognizer records audio input from the microphone until a pause or
silence is detected.
The recorded audio is then passed to the Google Web Speech API (recognize_google()) for speech
recognition.
If the API successfully recognizes the speech, the recognized text is printed, and the function
returns the recognized text in lowercase.
If the API fails to recognize the speech due to unknown input or connection issues, appropriate
error messages are displayed, and an empty string is returned.
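A minimal sketch of the recognize_speech() routine described above, using the SpeechRecognition library, is given below; the on-screen "Listening..." overlay on img_combined is omitted for brevity.

import speech_recognition as sr

recognizer = sr.Recognizer()

def recognize_speech():
    # Listen on the microphone and return the recognized command in lowercase,
    # or an empty string if recognition fails.
    with sr.Microphone() as source:
        print("Listening...")                                     # shown on img_combined in the real system
        recognizer.adjust_for_ambient_noise(source, duration=1)   # reduce background noise
        audio = recognizer.listen(source)                          # record until a pause is detected
    try:
        text = recognizer.recognize_google(audio)                  # Google Web Speech API
        print("Recognized:", text)
        return text.lower()
    except sr.UnknownValueError:
        print("Could not understand the audio.")
    except sr.RequestError:
        print("Speech service is unreachable.")
    return ""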
Within the main loop of the code, after hand detection and gesture recognition, a condition checks
whether all fingers are raised (fingers == [1, 1, 1, 1, 1]), indicating the open-palm speech-enable
gesture.
Upon detecting this gesture, the recognize_speech() function is called to listen for the user's voice
command.
Based on the recognized text, specific actions are performed, such as navigating to the next or
previous slide, opening a specific slide, deleting annotations, terminating the presentation, or
deleting all annotations.
The recognized text is used to trigger actions within the presentation control logic, providing a
seamless integration of voice commands with hand gestures for controlling PowerPoint slides.
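The following sketch illustrates how the recognized text could be dispatched to presentation actions; the function and variable names (handle_voice_command, img_number, annotations) are illustrative assumptions rather than the project's exact code.

import re

def handle_voice_command(command, img_number, slide_count, annotations):
    # Map a recognized command to a presentation action and return the updated
    # slide index plus a flag indicating whether the presentation should continue.
    running = True
    if "delete all" in command:
        annotations.clear()                              # remove every annotation
    elif "delete" in command and annotations:
        annotations.pop()                                # remove the latest annotation
    elif "open slide" in command:
        match = re.search(r"\d+", command)               # "open slide 5" -> slide index 4
        if match:
            img_number = min(max(int(match.group()) - 1, 0), slide_count - 1)
    elif "next" in command and img_number < slide_count - 1:
        img_number += 1
    elif "previous" in command and img_number > 0:
        img_number -= 1
    elif "terminate" in command:
        running = False
    return img_number, running

# Called from the main loop when the open-palm gesture is detected:
# if fingers == [1, 1, 1, 1, 1]:
#     img_number, running = handle_voice_command(recognize_speech(), img_number,
#                                                len(slides), annotations)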
Kanade-Lucas-Tomasi (KLT) algorithm
1. Preprocessing:
Begin by capturing an initial frame from a video feed or image sequence that includes the
hand(s) you want to detect.
Convert the frame to a suitable color space (like grayscale) to simplify subsequent computations.
2. Feature Detection:
Apply a feature detection method (e.g., Harris corner detection, FAST features, etc.) to identify
distinctive points or corners within the image that can represent potential features of the hand.
3. Feature Tracking Initialization:
Select the features that are within the region(s) of the hand in the initial frame.
These features serve as the starting point for tracking the hand's movement across subsequent
frames.
4. Tracking the Features:
For each feature detected and selected, track its movement in the subsequent frames of the video
sequence using the KLT algorithm.
The KLT algorithm tracks the movement by estimating the optical flow, i.e., how the pixels or
features move between frames. It does so by finding the best matching points between
consecutive frames.
5. Feature Re-initialization:
As the frames progress, some features might get occluded or become unreliable for tracking due
to factors like lighting changes or hand movement.
Constantly update and reinitialize the feature set by detecting new features in the regions where
the hand is expected to be present.
6. Hand Region Estimation:
Aggregate the tracked features that consistently represent the hand across multiple frames.
Using geometric or statistical methods (like bounding box estimation around the tracked
features), define the region where the hand is detected.
7. Hand Gesture Recognition (Optional):
After detecting and tracking the hand region, additional algorithms or machine learning models
can be employed to recognize specific gestures or actions performed by the hand.
The KLT algorithm, when adapted for hand detection, provides a framework for continuously
tracking features representing the hand across video frames. This allows for real-time estimation
of hand movement and location, enabling applications in gesture recognition, human-computer
interaction, and more. However, it's essential to consider potential challenges like occlusions,
lighting variations, and variations in hand appearance for robust hand detection using this
approach.
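A minimal OpenCV sketch of KLT-style tracking (corner detection plus pyramidal Lucas-Kanade optical flow) is given below; the internals of the project's HandTrackingModule are not listed in this report, so this only illustrates the steps described above.

import cv2
import numpy as np

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if not ok:
    raise RuntimeError("Webcam frame could not be read")
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
# Step 2: detect distinctive corner features in the initial frame.
prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100, qualityLevel=0.3, minDistance=7)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if prev_pts is None or len(prev_pts) == 0:
        # Step 5: re-detect features when all of them are lost (occlusion, lighting changes).
        prev_pts = cv2.goodFeaturesToTrack(gray, maxCorners=100, qualityLevel=0.3, minDistance=7)
        prev_gray = gray
        continue
    # Step 4: estimate the optical flow of each feature between consecutive frames.
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, prev_pts, None)
    good = next_pts[status.flatten() == 1]
    if len(good) > 0:
        # Step 6: bound the tracked features to estimate the hand region.
        x, y, w, h = cv2.boundingRect(good.astype(np.int32))
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("KLT tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    prev_gray, prev_pts = gray, good.reshape(-1, 1, 2)

cap.release()
cv2.destroyAllWindows()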
2. Feature Extraction:
The audio stream is converted into a format suitable for analysis, often in the form of
spectrograms or other feature representations.
This step extracts relevant features from the audio signal that can be used for recognition.
4. Acoustic and Language Models:
Acoustic models map audio features to phonemes or basic speech sounds.
Language models incorporate linguistic knowledge to predict word sequences and improve
recognition accuracy.
Some engines may use statistical methods, while others rely on neural network-based
approaches.
6. Output:
The recognized text output is returned to the calling program for further processing or display.
The module may also provide additional metadata, such as confidence scores or timing
information, to assess the accuracy and reliability of the transcription.
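As a small illustration of such metadata, the SpeechRecognition library can return the full Google Web Speech response, from which alternative transcripts and (when provided) confidence scores can be read; the JSON layout belongs to Google's service and is an assumption here, not part of this project's code.

import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:
    audio = recognizer.listen(source)
# show_all=True returns the full response instead of only the best transcript.
result = recognizer.recognize_google(audio, show_all=True)
if result:
    best = result["alternative"][0]
    print(best["transcript"], best.get("confidence"))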
Gesture Mechanism:
Fig.12, Pointer Gesture[0,1,1,0,0]
Fig.14, Delete Gesture[0,1,1,1,0]
Fig.16, Speech Enable Gesture [1,1,1,1,1]
The Speech Enable Gesture switches the system to speech recognition mode, in which commands
can be given to the program by voice.
It enables the following operations:
1. Next Slide: command "next".
2. Previous Slide: command "previous".
3. Delete: command "delete".
4. Delete All: command "delete all".
5. Terminate the presentation: command "terminate".
6. Open any slide: command "open slide [number]".
7. Writing on the slide remains accessible only through hand gestures.
The use of a green line on the screen serves as a visual guide and segmentation element for users
during a presentation, particularly for gesture-based interactions. By dividing the screen with this
line, it delineates specific regions for different gesture functionalities, enhancing the user
experience and ensuring accuracy in gesture recognition. Some gestures are deliberately restricted
to the area above the line, while others work both above and below it, depending on their
function.
Previous Slide Gesture, Next Slide Gesture, Exit Gesture, Speech Enable Gesture:
These gestures are exclusively designed to work in the space above the green line. They facilitate
essential presentation controls, such as moving to the previous or next slide, exiting the
presentation mode, and switching to speech recognition mode.
Restricting these gestures to the area above the line ensures a clear and distinct control space for
navigating through slides without interference from other functionalities.
Gestures Above and Below the Line:
Pointer Gesture:
This gesture allows users to activate a pointer function, enabling them to interact with content or
highlight specific areas on the slides. By allowing this gesture both above and below the line,
users can seamlessly control the pointer regardless of its position relative to the line.
Write & Draw Gesture:
Enabling this gesture both above and below the line grants users the ability to annotate or draw
on the slides. It offers flexibility for users to create annotations wherever they find it
comfortable, whether it's above or below the green line.
Delete Gesture:
The delete gesture is also made accessible in both regions to provide users the capability to erase
or delete any annotations or drawings made on the slides. This ensures ease of interaction and
correction, regardless of the position of the drawn content in relation to the green line.
The visual cue of the green line offers a clear demarcation for users, simplifying the navigation
of various presentation functionalities using hand gestures. It optimizes user control and
minimizes confusion, ensuring a smooth and intuitive interaction with the presentation content
while offering distinct areas for specific gesture-based actions.
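A short sketch of this green-line rule follows; the threshold value, window handling, and printed actions are assumptions used only to illustrate the region check, and a recent cvzone version (whose findHands returns both the hand list and the annotated image) is assumed.

import cv2
from cvzone.HandTrackingModule import HandDetector

gestureThreshold = 300                         # assumed y-coordinate of the green line (pixels)
detector = HandDetector(detectionCon=0.8, maxHands=1)
cap = cv2.VideoCapture(0)

while True:
    ok, img = cap.read()
    if not ok:
        break
    hands, img = detector.findHands(img)       # detect the hand and draw its landmarks
    cv2.line(img, (0, gestureThreshold), (img.shape[1], gestureThreshold), (0, 255, 0), 5)
    if hands:
        hand = hands[0]
        fingers = detector.fingersUp(hand)
        cx, cy = hand["center"]
        if cy <= gestureThreshold and fingers == [1, 1, 1, 1, 1]:
            print("speech-enable gesture (allowed only above the line)")
        elif fingers == [0, 1, 1, 0, 0]:
            print("pointer gesture (allowed above or below the line)")
    cv2.imshow("Gesture regions", img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()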
9. RESULTS COMPARISON WITH CHALLENGES:
Fig.19, Write or Draw Gesture (Mechanism)
Fig.21, Pointer (Mechanism)
Speech Enable Gesture (Mechanism)
Accuracy:
We can set detectionCon=0.8 or 0.9 (i.e., 80% or 90% detection confidence).
Fig.25, Gesture Working
The complete algorithm works on a simple array of size 5 containing only 1s and 0s, for example
[1,1,1,1,1].
1 indicates that the finger is up; 0 indicates that the finger is down.
The 1st position corresponds to the thumb, the 2nd to the index finger, the 3rd to the middle
finger, the 4th to the ring finger, and the 5th to the little finger.
Based on the hand gesture given by the user, the system converts the gesture into this array and
performs the appropriate action on the presentation.
The hand tracking module and hand detection work on this mechanism.
We set the detection confidence to 80-90% so that the presentation can be controlled reliably
using gestures.
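A minimal sketch of this mapping, covering only the gesture arrays documented in this report, is given below; the function name is an illustrative assumption.

def gesture_to_action(fingers):
    # Map the 5-element finger-state array (thumb .. little finger) to the
    # action names used in this report; only documented gestures are included.
    mapping = {
        (0, 1, 1, 0, 0): "pointer",
        (0, 1, 1, 1, 0): "delete",
        (1, 1, 1, 1, 1): "speech-enable",
    }
    return mapping.get(tuple(fingers), "none")

print(gesture_to_action([0, 1, 1, 1, 0]))      # -> delete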
The algorithm's precision is essential to discern various hand configurations accurately,
translating them into discrete actions within the presentation software. It establishes a reliable
interface between the user's hand movements and the control of the presentation slides, ensuring
that the gestures are correctly identified and mapped to the intended commands.
Achieving an accuracy level between 80% and 90% implies that the system can effectively
interpret hand gestures, enabling presenters to navigate through slides, highlight sections, or
perform other actions with a high degree of reliability. This reliability is crucial in real-world
scenarios, ensuring that the user experiences a seamless and responsive interaction while
delivering presentations.
Speech Recognition in Gesture Recognition Loop:
Within the gesture recognition loop, there's a conditional block checking for hand gestures and
speech recognition.
If hand gestures are detected and certain conditions are met, speech recognition is triggered by
calling recognize_speech().
Depending on the recognized command, various actions are performed such as navigating
between slides, deleting annotations, etc.
10. LEARNING OUTCOMES
11. CONCLUSION WITH CHALLENGES
In conclusion, while the presented system demonstrates the potential of combining hand gesture
and speech recognition for interactive presentations, there are opportunities for refinement and
enhancement. Addressing challenges related to recognition accuracy, system responsiveness, and
user interface intuitiveness will be critical for developing a more robust and user-friendly
presentation system. With further refinement and iteration, the system can offer an engaging and
intuitive user experience, empowering presenters to interact with their slides more seamlessly.
12. REFERENCES
[3] Hajeera Khanum, Dr. Pramod H B. “Smart Presentation Control by Hand Gestures Using
Computer Vision and Google’s MediaPipe.” IRJET, e-ISSN: 2395-0056, Volume 09, Issue 07,
July 2022.
https://www.irjet.net/archives/V9/i7/IRJET-V9I7482.pdf
[5] Bobo Zeng, Guijin Wang, Xinggang Lin. “A Hand Gesture Based Interactive Presentation
System Utilizing Heterogeneous Cameras.” ISSN 1007-0214, pp. 329-336, Volume 17,
Number 3, June 2012.
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6216765
[7] Rina Damdoo, Kanak Kalyani, Jignyasa Sanghavi. “Adaptive Hand Gesture Recognition
System Using Machine Learning Approach.” Biosc.Biotech.Res.Comm. Special Issue Vol 13
No 14 (2020), pp. 106-110.
https://bbrc.in/wp-content/uploads/2021/01/13_14-SPL-Galley-proof-026.pdf
[8] Bhor Rutika, Chaskar Shweta, Date Shraddha, Prof. Auti M. A. “Power Point Presentation
Control Using Hand Gestures Recognition.” International Journal of Research Publication and
Reviews, Vol 4, No 5, pp. 5865-5869, May 2023.
https://ijrpr.com/uploads/V4ISSUE5/IJRPR13592.pdf
[9] Thin Thin Htoo, Ommar Win. “Hand Gesture Recognition System for Power Point
Presentation.” ISSN 2319-8885, Vol. 07, Issue 02, February 2018.
https://ijsetr.com/uploads/251346IJSETR16450-43.pdf
[10] Ram Rajesh J, Sudharshan R, Nagarjunan D, Aarthi R. “Remotely controlled PowerPoint
presentation navigation using hand gestures.”