Final Report R
PROJECT SYNOPSIS
1) INTRODUCTION
2) OBJECTIVE
4) METHODOLOGY
6) SYSTEM DESIGN
7) TECHNOLOGY USED
8) MODULE USED
11) REFERENCES
PROJECT REPORT
1) ABSTRACT
2) SYSTEM ANALYSIS
2.1 IDENTIFICATION OF NEED
2.2 PRELIMINARY INVESTIGATION
2.3 FEASIBILITY STUDY
2.4 PROJECT PLANNING
2.5 PROJECT SCHEDULING
2.6 SOFTWARE REQUIREMENT SPECIFICATION (SRS)
2.7 SOFTWARE ENGINEERING PARADIGM USED
2.8 DIAGRAMS
3) SYSTEM DESIGN
3.1 PROGRAM STRUCTURE
3.2 DATABASE DESIGN
3.3 MODULARIZATION DETAILS
3.4 DATA FLOW DIAGRAM
3.5 E-R DIAGRAM
3.6 CLASS DIAGRAM
3.7 SEQUENCE DIAGRAM
3.8 USER INTERFACE DESIGN
4) IMPLEMENTATION
4.1 CODING
4.2 CODE EFFICIENCY
4.3 PARAMETER PASSING / CALLING
4.4 VALIDATION CHECKS
5) TESTING
5.1 TESTING TECHNIQUES & STRATEGIES
5.2 CODE IMPROVEMENT
8) SCREENSHOTS
9.2 LIMITATIONS
This app offers a number of useful features: it can send emails, launch the command prompt, your preferred IDE, or Notepad, play music, and run Wikipedia searches for you. Basic conversation is also possible. There has been research on the similarities and differences between various voice assistant devices and services.
In the twenty-first century, whether it's your house or your automobile, everything is moving toward automation, and technology has seen incredible growth over the last several years. In today's environment you can communicate with your computer. As a human, how do you engage with a computer? Obviously you must provide some input, but what if you typed nothing at all and instead used your own voice? With more than 70 percent of all intelligent voice-assistant-enabled devices running the Alexa platform, it is the dominant market leader (Griswold, 2018). Is it possible to have the computer communicate with you the way a personal assistant would, not only giving you the best results but also suggesting better alternatives? Using voice instructions to control a machine is a revolutionary method of human-system interaction. To interpret the input we need a voice-to-text API, and companies like Google and Amazon are attempting to make this universally available. Isn't it remarkable that you can create reminders simply by saying "Remind me to...", or set a timer or an alarm to wake you up? Recognising the relevance of this idea, a system has been developed that can be installed anywhere in the neighbourhood and asked to accomplish tasks for you simply by speaking to the device. In the future, two of these gadgets could be linked over Wi-Fi to enable communication between them. Used daily, such a gadget can help you work more effectively by continually reminding you of tasks and providing updates and notifications. What is the point of it all? Your voice, rather than an enter key, is becoming the ideal input method.
Because the voice assistant is powered by Artificial Intelligence, the results it provides are very accurate and efficient. Using an assistant reduces the human work and time required to accomplish a job; it eliminates the need for typing entirely and acts as an additional person to whom we may talk and delegate tasks. Science and the educational sector are also examining whether these new gadgets can aid education, as happens with every groundbreaking technology; personal computers and tablet computers raised comparable questions in the past (Algoufi, 2016; Gikas and Grant, 2013; Herrington and Herrington, 2007).
We will be using Visual Studio Code to construct this project, and all of the .py files were produced in VSCode. The following modules and libraries were also utilised in the project: PyAudio, pyttsx3, Wikipedia, smtplib, OS, Webbrowser, and so on.
Virtual assistants are genuinely helpful in today's world: they make everyday life easier by letting a user operate a computer or laptop with voice commands alone. A virtual assistant saves time and frees the user to devote attention to other projects.
A virtual assistant is typically a cloud-based application that works with internet-connected devices. As a means of developing a virtual assistant, Python will take over our PC. Task-oriented virtual assistants are the most common kind; a remote assistant's value lies in its ability to understand and follow instructions. In a three-week study, Beirl et al. (2019) examined how Alexa was used in the household; the goal of the research was to study how families use Alexa's new talents in music, storytelling, and gaming. A virtual assistant is a computer program that can recognise and respond to user requests, following clients' instructions given verbally or in writing. Put simply, virtual assistants understand human speech and react to it through the use of artificial voice synthesis. A variety of assistants are on the market, such as Apple's Siri, Google Assistant on Pixel phones, Alexa-powered smart speakers (some built on a Raspberry Pi), and Cortana on Microsoft Windows 10. Our own virtual assistant was produced along the same lines for Windows. Artificial intelligence techniques benefit this project greatly, and Python works well as the language, since it has a large number of well-known libraries. A microphone is required to run this programme.
Savago et al. (2019) look at the usage of voice assistants by seniors (age 65 and above). The authors stress the need for further study in order to better understand the usage of digital technology by older persons. Additionally, Kowalski et al. (2019) studied older individuals' usage of voice-activated devices in research that included seven elderly persons.
Voice assistants are voice-activated input and output devices. Several diverse technologies, such as speech recognition, voice analysis, and language processing, are used in this procedure. Natural language processing is used by virtual assistants to translate text and voice input from users into actionable instructions. Audio signals are translated to digital signals when a user instructs their personal virtual assistant to do a job.
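The pipeline just described, audio in, text out, instruction executed, can be sketched in a few lines of Python. The recogniser below is a stub (a real build would call a service such as Google's speech recognition), and the function and intent names are illustrative assumptions, not part of any particular product:

```python
def recognise(audio_bytes):
    """Stub for a speech-to-text engine; a real assistant would send the
    audio to a recognition service here."""
    # Pretend the audio decodes to this fixed phrase for the sketch.
    return "what is the weather today"

def to_instruction(text):
    """Map recognised text to an actionable instruction name."""
    text = text.lower()
    if "weather" in text:
        return "FETCH_WEATHER"
    if "remind me" in text:
        return "SET_REMINDER"
    return "SMALL_TALK"

def handle(audio_bytes):
    """Full pipeline: audio signal -> digital text -> executable instruction."""
    return to_instruction(recognise(audio_bytes))

print(handle(b""))  # FETCH_WEATHER
```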
OBJECTIVES
1. We need to create a voice assistant that helps infants and children interact easily with the desktop. Most voice assistants face difficulties understanding commands; they misinterpret them, and the commands then get executed wrongly.
2. We will create a desktop voice assistant that works even when the speaker makes grammatical errors, so that these are not a barrier to using the assistant.
3. We will create a voice assistant that takes commands from the user, understands the instructions being given, and performs them accordingly. The tasks may include answering queries, connecting to a device, or opening Netflix.
4. Integrate the voice assistant with the system and its applications in such a way that it performs tasks seamlessly and efficiently.
5. The voice assistant will also provide information, such as updates on stocks, weather, and sports.
6. The voice assistant will ensure the security of user data. We will try to implement encryption methods and careful data handling.
PROPOSED PLAN OF WORK
Analysing the user's microphone instructions was the first step in the process. Everything from retrieving data to managing a computer's internal files falls under this category. Reading and testing the cases from the literature cited above, this is an empirical qualitative investigation. Programming is done according to books and internet resources, with the intention of discovering best practices and a deeper grasp of voice assistants.
A limited subset of frequently asked questions (e.g., "what's the weather like today?") should be prioritised by an ASR system while performing voice assistant activities. This is often done by raising the weight of the language model (LM), or reducing the acoustic model (AM) by proxy, so that the probability of recognising a frequent command or phrase is high. If the LM weight is too high, the model will only produce sentences it has already seen. Dysfluencies such as sound repetitions or prolongations tend to generate mistakes in the AM component of the system, so we may exploit this trade-off: a higher LM weight tends to produce fewer repetitions in the output than a lower one. While the default ASR decoder recognised a dysfluent speech sample as "what is my brother's add add add address", increasing the LM weight improved the output to "what is my brother's address": the repeated "EY d" sounds, which had produced the half-word "add", were eliminated.
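The trade-off above amounts to picking the hypothesis that maximises a weighted log-linear combination of acoustic-model and language-model scores. A minimal sketch follows; the scores are invented purely to illustrate the effect, with the dysfluent string fitting the audio slightly better (AM) while the fluent string is far more probable language (LM):

```python
def best_hypothesis(hypotheses, lm_weight):
    """Each hypothesis is (text, am_logprob, lm_logprob); return the text
    maximising am + lm_weight * lm."""
    return max(hypotheses, key=lambda h: h[1] + lm_weight * h[2])[0]

# Invented n-best scores for the dysfluent-speech example in the text.
nbest = [
    ("what is my brother's add add add address", -10.0, -40.0),
    ("what is my brother's address",             -12.0, -15.0),
]
print(best_hypothesis(nbest, lm_weight=0.05))  # acoustics dominate: dysfluent text wins
print(best_hypothesis(nbest, lm_weight=1.0))   # language model dominates: clean text wins
```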
3.1 AI VOICE ASSISTANCE
In its role as a personal assistant, the AI helps the end-user with everyday tasks such as general human conversation, searching queries in various search engines like Google and Bing, retrieving videos, live weather conditions, word meanings, medicine details, health recommendations based on symptoms, and reminders of scheduled events and tasks. Machine learning is used to determine the best course of action based on the user's comments and requests.
Presently, Jarvis is being developed as an automation tool and virtual assistant. Among the various roles played by Jarvis are:
1. Search Engine with voice interactions
2. Medical diagnosis with Medicine aid.
3. Reminder and To-Do application.
4. Vocabulary App to show meanings and correct spelling errors.
5. Weather Forecasting Application.
Everything remains the same, even for a developer working on Linux who relies on running queries. By allowing online searches, we've fulfilled a critical need for internet users. Node JS and the Selenium framework have been used here to extract and show results from the web: Jarvis scrapes the entered searches and shows results from a variety of search engines, including Google, Bing, and Yahoo.
As a primary source of entertainment, videos have remained a top priority for virtual assistants. These videos have a dual purpose, entertainment and education, since much educational and scientific material now lives on YouTube. This facilitates a more hands-on, outside-the-classroom learning experience.
The core Golang service manages a subprocess module that Jarvis uses to implement this functionality: a Node JS subprocess drives the Selenium WebDriver to scrape the YouTube search query.
It is easier to send emails from Jarvis than it would be to open the email account in question. Jarvis eliminates the need to switch to another tab for a common daily job. Emails may be sent to the recipient of the user's choice: once the user selects Send mail, a form appears; fill out the form and click the Send Email button.
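Under the hood, sending mail reduces to composing a message and handing it to an SMTP server. A sketch using Python's standard library follows; the server host, port, and the addresses shown are placeholder assumptions, and real credentials would come from the user's provider:

```python
import smtplib
from email.message import EmailMessage

def build_message(sender, recipient, subject, body):
    """Compose the email that the Send Email form collects."""
    msg = EmailMessage()
    msg["From"], msg["To"], msg["Subject"] = sender, recipient, subject
    msg.set_content(body)
    return msg

def send_mail(msg, host="smtp.example.com", port=587):
    """Deliver the message; host, port, and login are placeholders."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        # server.login(user, password)  # credentials supplied by the user
        server.send_message(msg)

msg = build_message("me@example.com", "friend@example.com",
                    "Hello", "Sent by the assistant.")
print(msg["Subject"])  # Hello
```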
METHODOLOGY
User input may be matched with executable instructions using Natural Language Processing (NLP). When a user asks a query, the audio signal is translated into executable commands or digital data that software can use to perform a specific action. The virtual assistant is used to operate machines based on your own instructions, and this data is compared with the software's data to obtain an appropriate solution. We utilise Python package installers to set up the virtual assistant.
Horn (2018) proposes a classroom environment: each classroom should have enough microphones to detect each student's voice and offer individualised replies to each student's headphones via voice assistants. Each classroom might have a smart speaker where students can ask questions. Alternatively, teachers should have access to voice assistant data in real time so they may step in as necessary. Teachers are not replaced by the gadgets; rather, their job is amplified by the use of them.
Neiffer (2018) investigates the impact of intentional education using the intelligent voice assistant Siri on student participation in science classes in upper elementary and middle school grades. Student involvement is connected with student graduation rates, and high involvement leads to greater teacher satisfaction. Research shows that the relationship between technology and education is too complex to draw firm conclusions, and there is no clear correlation between the use of Siri in 5th-grade and middle school science classrooms and an increase in students' interest in learning science. A unique Alexa Skill about Scotland was made by Davie and Hilber (2018), who used it with students prior to a trip to the country. Students used the Amazon Echo device and found the skill interesting.
The proposed multi-domain ASR framework consists of three main modules: a basic ASR module to conduct first-pass decoding and generate the top N hypotheses of a speech query, a text classification module to determine which domain the speech query belongs to, and a reranking module to rescore the n-best lists of the first-pass decoding output using domain-specific language models. Figure 1 shows the diagram of the proposed multi-domain ASR framework.
Speech recognition:
To translate spoken input into text, the system makes use of Google's online voice recognition technology. A corpus of voice data is saved on a network server at the information centre and then delivered to Google Cloud for speech recognition, allowing users to talk and get text as the result of their voice input. The recognised text is then passed on to the voice assistant application.
Backend in Python:
Python is used as the backend for the whole software. With the use of a speech recognition module, the Python backend distinguishes between context extraction, API calls, and system calls. The output is then provided back to the user.
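One plausible way to realise this three-way split is a small keyword classifier over the recognised text; the keyword lists below are illustrative assumptions, not the project's actual routing rules:

```python
def classify(command):
    """Route a recognised command to the backend that should handle it."""
    command = command.lower()
    # Commands that touch the operating system (launch apps, play media).
    if any(k in command for k in ("open", "launch", "shutdown", "play")):
        return "system_call"
    # Commands that need data from the web (weather, news, searches).
    if any(k in command for k in ("weather", "news", "search", "wikipedia")):
        return "api_call"
    # Everything else: extract context and keep the conversation going.
    return "context_extraction"

print(classify("open notepad"))  # system_call
```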
The API's job is to act as a bridge between two programmes so that they may
communicate with one another. This means that APIs act as a messenger between the service
provider and the user, delivering their requests and subsequently returning their responses.
Content Extraction:
System Calls
Accessing the hard disc drive, creating new processes, and communicating with the process scheduler are all examples of system calls. A key part of the OS-process interaction is provided by this component.
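From Python, such a system call, creating a new process for the application the user named, can be issued with the standard subprocess module. The program table here is a hypothetical mapping and its entries would differ per platform:

```python
import subprocess
import sys

# Hypothetical mapping from spoken names to commands; platform-specific.
PROGRAMS = {
    "notepad": ["notepad.exe"],                      # Windows only
    "python": [sys.executable, "-c", "print('hi')"]  # works everywhere
}

def open_program(name):
    """Spawn the named program as a child process and return its exit code."""
    proc = subprocess.run(PROGRAMS[name], capture_output=True, text=True)
    return proc.returncode

print(open_program("python"))  # 0
```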
Google-Text-to-Speech
Text-To-Speech is used to turn user-provided text into speech. A TTS engine translates the phonemic representation of the text into a waveform from which sound is generated. Third-party publishers have contributed a variety of languages to the TTS's growing feature set.
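In code, this step is a thin wrapper around a TTS engine such as pyttsx3. The sketch below assumes pyttsx3 may be missing or the machine headless, and degrades to printing so the rest of the assistant keeps working:

```python
try:
    import pyttsx3
    _engine = pyttsx3.init()   # may fail without an audio device
except Exception:
    _engine = None

def speak(text):
    """Voice `text` when an engine is available; always return what was said."""
    if _engine is not None:
        _engine.say(text)
        _engine.runAndWait()
    else:
        print(text)            # plain-text fallback for headless machines
    return text
```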
1. PRESENT SYSTEM
Many current voice assistants, such as Alexa, Siri, Google Assistant, and Cortana, utilise the language processing and speech recognition concepts that we are all acquainted with. They pay attention to the user's instructions and carry out the requested task quickly and effectively.
Using Artificial Intelligence, these voice assistants are able to provide results that are very accurate and efficient. With them, we may do more with less human effort and time, since they require no typing at all and act as if they were an actual person to whom we were conversing and giving instructions. These helpers are no substitute for a person, yet they are effective and efficient at many duties, and the method used to create them minimises the time required.
2. PROPOSED SYSTEM
Creating my own personal helper was a fascinating challenge. With a single voice command, you can now send emails, search the internet, play music, and launch your favourite IDE without ever having to open a browser. While most standard voice assistants rely on an internet connection to get instructions, Jarvis is unique in that it is desktop-specific and does not need a user account.
VSCode is the IDE used in this project. Using VSCode, I was able to construct the Python files and install all of the essential dependencies. The following modules and libraries were needed for this project, including pyttsx3, SpeechRecognition, and Datetime. I have also constructed a live GUI for JARVIS that allows interaction in a more visually appealing way.
Tutor has grown to the point where it can complete many tasks as effectively as we can, or even better. Through the creation of this project I discovered that applying AI in any sector reduces human work and saves time. Among the features of this project are the ability to send emails and read PDF files; the ability to launch the command prompt, your preferred IDE, Notepad, and other applications; the ability to play music; the ability to make Wikipedia searches; and the ability to set up desktop reminders of your choosing. Basic conversation is also possible.
The following functionalities will be included in the system as proposed:
1) It always retains a list of its names so that it can respond to a call with the specified functionality.
2) It retains the sequence of inquiries asked of it in relation to its setting, for future use. As a result, every time an identical situation comes up, it is in a position to raise pertinent points of discussion.
3) Performing arithmetic computations from voice instructions and returning the results by voice.
4) Searching the Internet based on the user's voice input and returning a voice response with more interactive questions.
5) Keeping the data on its cloud server auto-synchronised and up to date.
6) Updating the data in the cloud with the help of a Firebase server.
7) Letting the user connect smart devices and conduct actions such as turning lights on and off with the assistance of the IoT architecture.
8) Push notifications, such as email or text messages, may be used to alert the owner of a smartphone.
9) Further options include playing music, setting an alarm, monitoring local weather conditions, reminders, spell-checks, etc.
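Feature 3, arithmetic over voice, reduces to mapping number words and operator words in the recognised text and evaluating the result. The vocabulary below is a deliberately tiny, illustrative subset:

```python
WORDS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}
OPS = {"plus": lambda a, b: a + b,
       "minus": lambda a, b: a - b,
       "times": lambda a, b: a * b}

def calculate(command):
    """Evaluate a spoken phrase like 'what is three plus five'."""
    # Keep only the tokens that are numbers or operators.
    tokens = [t for t in command.lower().split() if t in WORDS or t in OPS]
    a, op, b = tokens  # expect exactly: number, operator, number
    return OPS[op](WORDS[a], WORDS[b])

print(calculate("what is three plus five"))  # 8
```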
REQUIREMENTS
Software requirements
Hardware requirements
Intel Core i3
4 GB RAM
30 GB hard drive space
SYSTEM DESIGN
Speech-to-Text Interface
The goal of voice recognition is to offer a way to convert spoken words into written ones. This objective may be achieved in a variety of ways; the simplest method is to build a model for each word that has to be identified. A speech signal mainly transmits the words or message being said, and the underlying meaning of the utterance is the focus of speech recognition. Extracting and modelling the speech-dependent properties that can successfully differentiate one word from another is the key to success in speech recognition. The system consists of a set of components.
Because all such systems are based on machine learning and are trained on vast quantities of data acquired from different sources, the source of this data plays a vital part in their production. The kind of assistance that emerges depends on the quantity of data gathered from various sources. Despite the wide variety of learning methodologies, algorithms, and techniques, the basic building blocks of these systems remain essentially the same across the industry. A virtual assistant is often a cloud-based application that works with devices connected to the internet; one advantage of such assistive technology is that users can rely on just the services they need. As a means of developing a virtual assistant, Python will take over your PC. Task-oriented virtual assistants are the most common kind; their value lies in their understanding of, and capacity to follow, instructions.
The system is built on the idea of Artificial Intelligence and the relevant Python packages; for example, using libraries such as pyttsx3 together with a PDF package, it can read PDFs aloud. Chapter 3 of this study goes into depth about these packages.
Everything in this project is based on human input, so the assistant will do whatever the user commands. Anything a user wishes to be done can be entered as a task in plain human language: English.
DFD 0
DFD 1
DFD 2
USE CASE DIAGRAM
SEQUENCE DIAGRAM
TECHNOLOGY USED
Speech Recognition: Python's Speech Recognition library allows you to easily convert speech
to text, enabling your voice assistant to understand user commands and queries.
Natural Language Processing (NLP): Libraries such as nltk and spaCy can be used for NLP
tasks, such as tokenization, part-of-speech tagging, and named entity recognition, to better
understand the context of user inputs.
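nltk and spaCy provide these steps out of the box. To make the idea concrete without their model downloads, here is a minimal stand-in: a regex tokeniser plus a toy entity lookup, where the entity table is invented for illustration and a real system would use a trained NER model:

```python
import re

def tokenize(text):
    """Split text into word and punctuation tokens, roughly as
    nltk.word_tokenize would."""
    return re.findall(r"\w+|[^\w\s]", text)

# Toy named-entity table standing in for a trained NER model.
KNOWN_ENTITIES = {"google": "ORG", "london": "GPE"}

def tag_entities(tokens):
    """Label each token with an entity type, or 'O' for none."""
    return [(t, KNOWN_ENTITIES.get(t.lower(), "O")) for t in tokens]

print(tag_entities(tokenize("Open Google, please.")))
```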
Text-to-Speech (TTS): Python provides libraries like pyttsx3 and gTTS for converting text to
speech, allowing your voice assistant to respond to user queries in a natural and human-like
voice.
User Interface: Python offers various GUI libraries such as tkinter and PyQt for creating
interactive user interfaces for your voice assistant, making it easy for users to interact with the
assistant.
Integration with APIs and Services: Python's requests library allows you to easily make HTTP
requests to APIs and services, enabling your voice assistant to fetch information from the
internet or interact with external services.
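In practice this is a requests.get against a weather or news endpoint followed by JSON parsing. To keep the sketch offline, the HTTP call is stubbed with a canned payload; the payload shape and field names are hypothetical, not a real API:

```python
import json

def fetch_weather(city):
    """Stand-in for an HTTP call such as
    requests.get('https://api.example.com/weather', params={'q': city}).json();
    returns a canned payload so the sketch runs offline."""
    canned = '{"city": "%s", "temp_c": 21, "condition": "clear"}' % city
    return json.loads(canned)

def describe_weather(city):
    """Turn the parsed payload into a sentence the assistant can speak."""
    data = fetch_weather(city)
    return f"It is {data['temp_c']} degrees and {data['condition']} in {data['city']}."

print(describe_weather("Delhi"))  # It is 21 degrees and clear in Delhi.
```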
Platform Independence: Python is platform-independent, meaning your voice assistant can run
on different operating systems, such as Windows, macOS, and Linux, without modification.
Ease of Development: Python's simple and readable syntax makes it easy to develop and
maintain code, speeding up the development process for your voice assistant.
Community Support: Python has a large and active community of developers, providing access
to a wealth of libraries, tutorials, and resources to help you build your voice assistant.
By leveraging Python's features and libraries, you can create a simple desktop voice assistant that can
understand user commands, retrieve information, and provide helpful responses, enhancing the user's
desktop experience.
Modules that will be used for desktop voice assistant
You can utilize multiple modules and libraries to handle different areas of the assistant's functionality
when creating a desktop voice assistant in Python. The following are some essential Python modules
and libraries that are frequently used to create voice assistants:
In order to automate many routine desktop operations, such as playing music or launching your preferred IDE, TUTOR, a desktop assistant, utilises a voice interface. While most standard voice assistants rely on an internet connection to get instructions, Jarvis is unique in that it is desktop-specific and does not need a user account.
Installing all of the required packages and libraries is a good place to start: each can be installed with "pip install" and then imported. The set includes the following components.
FUNCTIONS
1.) takeCommand() takes a command from the user's microphone and returns it as a string.
2.) wishMe() greets the user with Good Morning, Good Afternoon, or Good Evening, depending on the current time.
3.) taskExecution() defines SendEmail(), pdf reader(), news(), and the numerous if-conditions such as "open google", "open notepad", "search on Wikipedia", "play music", and so forth.
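Of these, wishMe() is easy to make testable by letting the hour be injected; takeCommand() is not reproduced here because it needs a live microphone. A sketch:

```python
import datetime

def wishMe(hour=None):
    """Greet the user by time of day; `hour` defaults to the current hour."""
    if hour is None:
        hour = datetime.datetime.now().hour
    if hour < 12:
        return "Good Morning!"
    if hour < 18:
        return "Good Afternoon!"
    return "Good Evening!"

print(wishMe(9))  # Good Morning!
```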
Without a doubt, the effectiveness and efficiency of Tutor as a voice assistant make it a valuable tool for busy users. The limitations and opportunities for improvement discovered while working on this project are outlined in the following sections.
Artificial Intelligence and Natural Language Processing will be used to create a voice-activated personal assistant that can operate IoT devices and even search the web for answers
to specific questions. There are various subsystems that may be automated to reduce the
amount of time and effort required to communicate with the main system. The system's goal
is to make human existence as pleasant as possible. In further detail, this system is meant to
communicate intelligently with other subsystems and operate these devices, including
Internet of Things (IoT) devices or receiving news from the Internet, delivering other
information, obtaining customised data previously kept on the system, and so on. The
Android app should allow the user to add data, such as calendar entries, alarms, or reminders,
to the app. All of these platforms will be made more accessible with the help of the software,
which will go through the following stages: voice data collecting, analysis, text conversion,
data storage, and speech generation from text output processed via these stages. The data
collected at each stage may be utilised to identify trends and provide recommendations to the
user. Artificial intelligence devices that can learn and comprehend their users may utilise this
as a significant foundation. It has been determined that the suggested system not only makes it easier for us to interface with other systems and modules but also helps us stay organised. With a little help from the device, we can build a new generation of voice-controlled devices and bring about a long-term change in the automation industry. This paper offers a prototype for a wide range of future applications.
As a result, voice recognition systems have made their way into a wide range of industries.
The use of speech signals as input to a system is one of the many advantages of IVR
(Interactive Voice Response) systems. This is why we proposed the creation of an Interactive Voice Response (IVR) system that includes automatic speech recognition (ASR). The primary goal of the project was to design a system that could recognise speech signals in the Nepali language.
SCOPE FOR FUTURE WORK
● For further protection, voice instructions may be encrypted.
● The system can assist the severely disabled or those who have suffered minor repetitive stress injuries, i.e., those who may otherwise need the assistance of others to manage their surroundings.
The use of IVR systems is increasing on a daily basis. Such technologies make it easier for the user to communicate with the computer system, which in turn facilitates the completion of a variety of activities; the IVR system acts as an intermediary between humans and computers. Due to time and research constraints, the existing IVR system is only suitable for desktop computers and will not be implemented in real phone devices. This is a disadvantage, since an IVR system with Automatic Voice Recognition (AVR) may be used in a wide range of applications. Although the project is still in its infancy, there is plenty of room for improvement in the years to come. The following are some of the areas that may be relevant:
1. Organisational inquiry desk: the system may be utilised in different organisations for simple access to information about the organisation using voice commands.
2. The suggested system only detects isolated words, but it might be expanded to full audio-to-text conversion with further improvements in the algorithms.
3. Voice recognition of Nepali phrases could be employed in newly developed apps, making them more user-friendly.
4. In embedded systems, voice commands may be used to handle multiple activities using speech recognition technology. This promotes automation of labour and can be very advantageous in industrial process automation.
5. Application for people with disabilities: people with disabilities may also benefit from voice recognition software; it is particularly beneficial for those who are unable to use their hands.
REFERENCES
[1] Diksha Goutam, "A Review: Desktop Voice Assistant", IJRASET, Volume 5, Issue 1, January 2023, e-ISSN: 2582-5208.
[2] Asodariya, H., Vachhani, K., Ghori, E., Babariya, B., & Patel, T., "Desktop Voice Assistant".
[3] Gaurav Agrawal, Harsh Gupta, Divyanshu Jain, Chinmay Jain, Prof. Ronak Jain, "Desktop Voice Assistant", International Research Journal of Modernization in Engineering Technology and Science, Volume 2, Issue 5, May 2020.
[4] Bandari, Bhosale, Pawar, Shelar, Nikam, Salunkhe, "Intelligent Desktop Assistant", JETIR, Volume 10, Issue 6, June 2023, ISSN: 2349-5162.
[5] Vishal Kumar Dhanraj, Lokesh Kriplani, Semal Mahajan, IJRES, Volume 10, Issue 2, 2022, pp. 15-20, ISSN (Online): 2320-9364, ISSN (Print): 2320-9356.
[6] Ujjwal Gupta, Utkarsh Jindal, Apurv Goel, Vaishali Malik, "Desktop Voice Assistant", IJRASET, May 2022.
[7] Vishal Kumar Dhanraj, Lokesh Kriplani, Semal Mahajan, "Research Paper on Desktop Voice Assistant", IJRES, Volume 10, Issue 2, 2022, pp. 15-20.
[8] V. Geetha, C. K. Gomathy, Kottamasu Manasa Sri Vardhan, Nukala Pavan Kumar, "The Voice Enabled Personal Assistant for PC using Python", International Journal of Engineering and Advanced Technology (IJEAT), April 2021.
[9] Chen, X., Liu, C., & Guo, W. (2020), "A Survey on Voice Assistant Systems", IEEE Access, 8, 27056-27070.
PROJECT
ABSTRACT
Voice assistants have improved accessibility and convenience across a range of devices in recent years,
becoming indispensable components of everyday life. The goal of this project is to create a Desktop
Voice Assistant system that will enable smooth voice-activated desktop computer interaction. To
comprehend customer inquiries and complete tasks quickly, the Desktop Voice Assistant system
combines speech recognition, natural language processing (NLP), and text-to-speech (TTS) algorithms.
In order to help users with tasks like online browsing, scheduling, reminders, and information retrieval,
the project's main goals are to build an intuitive user interface, implement strong speech recognition
capabilities, and integrate a wide variety of functionality. The Desktop Voice Assistant system uses
cutting-edge natural language processing (NLP) models to accurately understand user commands and
provide timely, pertinent information or actions in response. By using Visual Studio Code for
development with modules like Wikipedia, PyAudio, and pyttsx3, the project shows how Python can be
used to create complex voice assistant systems that are both feasible and adaptable. Voice assistants are
positioned to become commonplace companions as technology develops, streamlining activities and
improving human-computer interaction. The potential for voice assistants to completely transform daily
life is growing thanks to continuous developments in artificial intelligence and speech recognition
technologies, which present fresh chances for creativity and effectiveness.
Problem Statement
Design and develop a desktop voice assistant application to enhance user productivity and accessibility
within a computer environment. The voice assistant should provide seamless interaction through voice
commands, catering to a diverse range of user needs and tasks.
SYSTEM ANALYSIS
Accessibility: Voice assistants improve accessibility for users with disabilities, allowing
them to use computers more easily and efficiently.
Increased Productivity: Voice assistants can help users complete tasks more quickly
and efficiently, reducing the time spent on manual input and navigation.
Natural Interaction: Voice interaction provides a more natural and intuitive way to
interact with computers, making technology more accessible to a wider range of users.
Personalization: Voice assistants can be personalized to understand and respond to
individual user preferences, providing a customized user experience.
Multitasking: Voice assistants allow users to multitask more effectively by enabling them
to perform tasks while keeping their hands and eyes focused on other activities.
Efficient Information Retrieval: Voice assistants can quickly retrieve information
from the internet or other sources, saving users time and effort.
Improved User Experience: Voice assistants can enhance the overall user experience
by providing a more interactive and engaging interface.
Integration with Other Applications: Voice assistants can be integrated with other
desktop applications, such as calendars, email clients, and task managers, to provide a
seamless user experience.
Future Technology Trends: Voice technology is a growing trend in computing, and
developing a desktop voice assistant can help users adapt to and benefit from future
technological advancements.
By conducting a thorough investigation across these areas, you can gain valuable insights into the
strengths and weaknesses of desktop voice assistants and make informed decisions about their
suitability for specific use cases.
Target Market
Based on your research, define your target market for the voice assistant. Consider factors
such as demographics, interests, and needs of the target market. This will help you tailor your
marketing efforts and product features to attract and retain users.
Cost Estimation
Estimate the costs associated with developing and maintaining the voice assistant. This
includes costs for hardware, software, personnel, and any other resources needed for
development. Consider both one-time costs for initial development and ongoing costs for
maintenance and updates.
Revenue Generation and Cost-Benefit Analysis
Identify potential revenue streams for the voice assistant. This may include selling the
application, offering premium features through a subscription model, or integrating
advertisements. Estimate the potential revenue from each stream based on market research
and competitive analysis and also conduct a cost-benefit analysis to assess the financial
viability of the project. Compare the estimated costs with the potential revenue to determine
if the project is financially feasible.
The following additional points should be considered when assessing the project's feasibility:
Integration
Assess how easily the voice assistant can be integrated into existing desktop environments.
Consider factors such as compatibility with different operating systems and software
applications. Determine if any modifications or additional resources will be needed to ensure
smooth integration.
Resource Availability
Determine if there are enough resources, such as time, manpower, and expertise, available to
develop and maintain the voice assistant. Consider if additional resources may be needed and
if they can be obtained within the project's constraints. Ensure that there is adequate support
for the voice assistant's operation and maintenance after deployment.
2.3.6. Risk Analysis
Risk analysis identifies potential risks and uncertainties that could affect the success of the
project. It involves assessing the likelihood and impact of these risks and developing
strategies to mitigate them.
Identifying Risks
Identify potential risks that could impact the development and implementation of the voice
assistant. This may include technical challenges, such as difficulties with speech recognition
or natural language processing, as well as external factors like changes in user preferences or
market conditions.
- Facilitating communication,
- Monitoring/measuring the project progress, and
- Providing overall documentation of assumptions/planning decisions.
The Project Planning Phases can be broadly classified as follows:
- Development of the Project Plan
- Execution of the Project Plan
- Change Control and Corrective Actions
Project Planning spans the various aspects of the project. Generally, Project Planning is considered a
process of estimating, scheduling, and assigning the project's resources in order to deliver an end
product of suitable quality. However, it is much more than that: it can assume a strategic role that
determines the very success of the project. Developing a Project Plan is one of the crucial steps in
Project Planning.
Typically Project Planning can include the following types of project Planning:
1) Project Scope Definition and Scope Planning
2) Project Activity Definition and Activity Sequencing
3) Time, Effort and Resource Estimation
4) Risk Factors Identification
5) Cost Estimation and Budgeting
6) Organizational and Resource Planning
7) Schedule Development
8) Quality Planning
9) Risk Management Planning
10) Project Plan Development and Execution
11) Performance Reporting
12) Planning Change Management
13) Project Rollout Planning
2) Quality Planning:
The relevant quality standards are determined for the project. This is an important aspect of Project
Planning. Based on the inputs captured in the previous steps such as the Project Scope,
Requirements, deliverables, etc. various factors influencing the quality of the final product are
determined. The processes required to deliver the Product as promised and as per the standards are
defined.
Project Scheduling is one of the most important and also the most difficult tasks of Project Planning.
In very large projects, several teams may work on developing the project. They may work in parallel,
yet their work may be interdependent.
Popular tools such as Gantt charts can be used for creating and reporting schedules.
Program evaluation and review technique (PERT) and critical path method (CPM) are two project
scheduling methods that can be applied to software development.
The PERT chart for this application software is illustrated below; the critical path runs through
Design, Code Generation, and Integration & Testing.
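As a hedged illustration of the CPM idea, the critical path through a small task graph can be found with a single longest-path pass. The durations (in weeks) and dependencies below are assumptions for demonstration only, not the project's actual estimates:

```python
# Sketch: critical-path computation over a small task graph.
# Durations (weeks) and dependencies are illustrative assumptions.
tasks = {
    "Requirements": {"duration": 3, "depends_on": []},
    "Design": {"duration": 4, "depends_on": ["Requirements"]},
    "Coding": {"duration": 6, "depends_on": ["Design"]},
    "Integration & Testing": {"duration": 5, "depends_on": ["Coding"]},
    "Documentation": {"duration": 2, "depends_on": ["Design"]},
}

def critical_path(tasks):
    finish = {}  # earliest finish time per task
    pred = {}    # predecessor on the longest path

    def earliest_finish(name):
        if name in finish:
            return finish[name]
        deps = tasks[name]["depends_on"]
        start = max((earliest_finish(d) for d in deps), default=0)
        pred[name] = max(deps, key=lambda d: finish[d]) if deps else None
        finish[name] = start + tasks[name]["duration"]
        return finish[name]

    for name in tasks:
        earliest_finish(name)
    end = max(finish, key=finish.get)      # task that finishes last
    chain = [end]
    while pred.get(chain[-1]):
        chain.append(pred[chain[-1]])
    return list(reversed(chain)), finish[end]

chain, total = critical_path(tasks)
print(chain)  # ['Requirements', 'Design', 'Coding', 'Integration & Testing']
print(total)  # 18
```

With these assumed durations the longest chain matches the critical path named above: Requirements, Design, Coding, then Integration & Testing.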
[Figure: PERT chart for the application. Phases Planning, Requirements, Design, Estimation,
Development, Testing, and Implementation are plotted against the months June through February;
Coding and Integration & Testing span 5th June to 10th Aug 2009, ending at FINISH.]
PERT CHART
2.6 Software requirement specifications (SRS)
Requirements analysis is done in order to understand the problem the software system is to solve.
Once the problem is analyzed and the essentials understood, the requirements must be specified in the
requirement specification document. For requirement specification in the form of a document, some
specification language has to be selected (for example: English, regular expressions, tables, or a
combination of these). The requirements document must specify all functional and performance
requirements; the formats of inputs and outputs; any required standards; and all design constraints that
exist due to political, economic, environmental, and security reasons. The phase ends with validation of
the requirements specified in the document. The basic purpose of validation is to make sure that the
requirements specified in the document actually reflect the client's real requirements or needs, and that
all requirements are specified. Validation is often done through a requirements review, in which a group
of people, including representatives of the client, critically review the requirements specification.
A requirement is a condition or capability that must be met or possessed by a system to satisfy a
contract, standard, specification, or other formally imposed document.
2.7 SOFTWARE ENGINEERING PARADIGM USED
The development of a desktop voice assistant typically involves the application of various software
engineering paradigms to ensure efficient design, implementation, and maintenance. Here are some
key paradigms commonly used in creating desktop voice assistants:
2.8 DIAGRAMS
USE CASE DIAGRAM
SYSTEM DESIGN
User:     Key (Text), Value (Text), Lock (Boolean), Password (Text)
Question: Qid (Integer, PRIMARY KEY), Query (Text), Answer (Text)
Task:     Tid (Integer, PRIMARY KEY), Priority (Integer)
Reminder: Rid (Integer, PRIMARY KEY), Tid (Integer, FOREIGN KEY), What (Text), When (Time), On (Date)
Note:     Nid (Integer, PRIMARY KEY), Data (Text), Priority (Integer)
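The tables above can be sketched as a SQLite schema using Python's built-in sqlite3 module (a minimal illustration; "When" and "On" are quoted because they collide with SQL keywords):

```python
import sqlite3

# Create the schema from the database design above (in-memory for demonstration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (
    Key      TEXT,
    Value    TEXT,
    Lock     BOOLEAN,
    Password TEXT
);
CREATE TABLE Question (
    Qid    INTEGER PRIMARY KEY,
    Query  TEXT,
    Answer TEXT
);
CREATE TABLE Task (
    Tid      INTEGER PRIMARY KEY,
    Priority INTEGER
);
CREATE TABLE Reminder (
    Rid    INTEGER PRIMARY KEY,
    Tid    INTEGER REFERENCES Task(Tid),
    What   TEXT,
    "When" TIME,
    "On"   DATE
);
CREATE TABLE Note (
    Nid      INTEGER PRIMARY KEY,
    Data     TEXT,
    Priority INTEGER
);
""")
conn.execute("INSERT INTO Note (Data, Priority) VALUES (?, ?)", ("buy milk", 1))
print(conn.execute("SELECT Data FROM Note").fetchone()[0])  # buy milk
```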
DFD 1
DFD 2
1. Minimalistic Interface:
Keep the interface clean and uncluttered to avoid overwhelming the user.
Prioritize essential features and information, and avoid unnecessary visual elements.
3. Feedback Mechanisms:
Provide feedback to users to confirm that their commands have been recognized and
understood.
Use visual and auditory cues, such as animations or voice responses, to acknowledge
user input.
5. Contextual Information:
Provide contextually relevant information based on the user's current interaction or
task.
Display relevant data, such as weather updates, upcoming events, or recent
notifications, when appropriate.
6. Customizable Preferences:
Allow users to customize their preferences and settings, such as language, voice, or
preferred applications.
Provide options for adjusting the assistant's behavior and personalizing the user
experience.
7. Accessibility Features:
Ensure accessibility for users with disabilities by incorporating features such as voice
commands, keyboard shortcuts, or screen reader compatibility.
Design the interface to be inclusive and accessible to all users, regardless of their
abilities.
8. Error Handling:
Design error messages and recovery mechanisms to help users understand and resolve
any issues that may arise.
Provide clear instructions on how to correct errors or retry commands, and offer
assistance when needed.
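A minimal sketch of such an error-handling path, assuming a takeCommand-style recognizer that returns the string "None" on failure (as in the implementation later in this report). The handle_query function and its retry policy are illustrative assumptions, not part of the project code:

```python
def handle_query(query, ask_again, retries=2):
    """Respond to a query; on recognition failure, prompt the user to retry.

    ask_again is any callable that re-captures a command (e.g. takeCommand);
    passing it in lets the retry logic be tested without a microphone.
    """
    for attempt in range(retries + 1):
        if query and query != "None":
            return f"Executing: {query}"
        if attempt < retries:
            print("Sorry, I didn't catch that. Please repeat.")
            query = ask_again()
    return "I still couldn't understand. Try rephrasing, or say 'help'."

# Example: the first capture fails, the retry succeeds.
attempts = iter(["None", "open youtube"])
print(handle_query(next(attempts), lambda: next(attempts)))
# Executing: open youtube
```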
IMPLEMENTATION
4.1 CODING
import subprocess
import wolframalpha
import pyttsx3
import tkinter
import json
import random
import operator
import speech_recognition as sr
import datetime
import wikipedia
import webbrowser
import os
import winshell
import pyjokes
import feedparser
import smtplib
import ctypes
import time
import requests
import shutil
engine = pyttsx3.init('sapi5')
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[1].id)

def speak(audio):
    engine.say(audio)
    engine.runAndWait()
def wishMe():
    hour = int(datetime.datetime.now().hour)
    if 0 <= hour < 12:
        speak("Good Morning")
    elif 12 <= hour < 18:
        speak("Good Afternoon")
    else:
        speak("Good Evening")
    speak(assname)
def username():
    uname = takeCommand()
    speak("Welcome Mister")
    speak(uname)
    columns = shutil.get_terminal_size().columns
    print("#####################".center(columns))
    print("#####################".center(columns))
def takeCommand():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        r.pause_threshold = 1
        audio = r.listen(source)
    try:
        print("Recognizing...")
        query = r.recognize_google(audio, language='en-in')
        print("User said:", query)
    except Exception as e:
        print(e)
        return "None"
    return query
def sendEmail(to, content):
    server = smtplib.SMTP('smtp.gmail.com', 587)
    server.ehlo()
    server.starttls()
    # Placeholder credentials; replace with your own account details
    server.login('your-email@gmail.com', 'your-password')
    server.sendmail('your-email@gmail.com', to, content)
    server.close()
if __name__ == '__main__':
    clear = lambda: os.system('cls')
    clear()
    wishMe()
    username()

    while True:
        query = takeCommand().lower()

        if 'wikipedia' in query:
            speak('Searching Wikipedia...')
            query = query.replace("wikipedia", "")
            results = wikipedia.summary(query, sentences=3)
            speak("According to Wikipedia")
            print(results)
            speak(results)

        elif 'open youtube' in query:
            webbrowser.open("youtube.com")

        elif 'open google' in query:
            webbrowser.open("google.com")

        elif 'open stackoverflow' in query:
            webbrowser.open("stackoverflow.com")
        elif 'play music' in query:
            # music_dir = "G:\\Song"
            music_dir = "C:\\Users\\GAURAV\\Music"
            songs = os.listdir(music_dir)
            print(songs)
            os.startfile(os.path.join(music_dir, songs[0]))

        elif 'open opera' in query:
            codePath = r"C:\Users\GAURAV\AppData\Local\Programs\Opera\launcher.exe"
            os.startfile(codePath)
try:
speak("What should I say?")
content = takeCommand()
sendEmail(to, content)
except Exception as e:
print(e)
try:
content = takeCommand()
to = input()
sendEmail(to, content)
except Exception as e:
print(e)
assname = query
assname = takeCommand()
speak(assname)
exit()
speak(pyjokes.get_joke())
client = wolframalpha.Client(app_id)
indx = query.lower().split().index('calculate')
answer = next(res.results).text
webbrowser.open(query)
os.startfile(power)
ctypes.windll.user32.SystemParametersInfoW(20,
0,
"Location of wallpaper",
0)
appli = r"C:\\ProgramData\\BlueStacks\\Client\\Bluestacks.exe"
os.startfile(appli)
try:
data = json.load(jsonObj)
i=1
print(item['description'] + '\n')
i += 1
except Exception as e:
print(str(e))
ctypes.windll.user32.LockWorkStation()
subprocess.call(["shutdown", "/p", "/f"])
speak("for how much time you want to stop jarvis from listening commands")
a = int(takeCommand())
time.sleep(a)
print(a)
location = query
speak(location)
subprocess.call(["shutdown", "/r"])
speak("Hibernating")
subprocess.call(["shutdown", "/h"])
time.sleep(5)
subprocess.call(["shutdown", "/l"])
note = takeCommand()
snfm = takeCommand()
file.write(strTime)
file.write(" :- ")
file.write(note)
else:
file.write(note)
elif "show note" in query:
speak("Showing Notes")
print(file.read())
speak(file.read(6))
speak("After downloading file please replace this file with the downloaded one")
total_length = int(r.headers.get('content-length'))
if ch:
Pypdf.write(ch)
wishMe()
speak(assname)
city_name = takeCommand()
response = requests.get(complete_url)
x = response.json()
if x["cod"] != "404":
    y = x["main"]
    current_temperature = y["temp"]
    current_pressure = y["pressure"]
    current_humidity = y["humidity"]
    z = x["weather"]
    weather_description = z[0]["description"]
    print("Temperature (in kelvin unit) = " + str(current_temperature)
          + "\n atmospheric pressure (in hPa unit) = " + str(current_pressure)
          + "\n humidity (in percentage) = " + str(current_humidity)
          + "\n description = " + str(weather_description))
else:
message = client.messages.create(
    body=takeCommand(),
    from_='Sender No',
    to='Receiver No'
)
print(message.sid)
webbrowser.open("wikipedia.com")
speak(assname)
speak("I'm not sure about, may be you should give me some time")
client = wolframalpha.Client("API_ID")
res = client.query(query)
try:
    print(next(res.results).text)
    speak(next(res.results).text)
except StopIteration:
    speak("No results found")
1. Algorithm Selection:
Choose algorithms and data structures that are well-suited for the tasks performed by
the voice assistant.
Opt for efficient algorithms with lower time complexity for tasks such as speech
recognition, natural language processing, and task execution.
3. Resource Management:
Manage system resources, such as memory and CPU usage, efficiently to prevent
bottlenecks and improve performance.
Implement techniques like lazy loading or caching to reduce resource consumption
and improve responsiveness.
By employing these strategies, you can enhance the code efficiency of a desktop voice assistant,
resulting in better performance, reduced resource consumption, and an overall improved user
experience.
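The caching strategy mentioned above can be as simple as memoizing expensive lookups with the standard library's functools.lru_cache. The fetch_summary function below is a hypothetical stand-in for a slow operation such as a web or Wikipedia query:

```python
from functools import lru_cache

call_count = 0  # tracks how many real lookups were performed

@lru_cache(maxsize=128)
def fetch_summary(topic):
    """Hypothetical expensive lookup (e.g. a Wikipedia query)."""
    global call_count
    call_count += 1
    return f"summary of {topic}"

fetch_summary("python")  # computed
fetch_summary("python")  # served from the cache, no second lookup
print(call_count)  # 1
```

Repeated queries for the same topic then cost a dictionary lookup instead of a network round trip.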
4.3 PARAMETER PASSING / CALLING
In the implementation of a desktop voice assistant, parameter passing and calling are fundamental
concepts for passing information between different components of the system. Here's how parameter
passing and calling might be utilized:
# Command Processing Module
def process_input(transcribed_text):
    # Parse the transcribed text into a command and its parameters
    # (nlp_module, like the other module names here, is illustrative)
    command, parameters = nlp_module.parse(transcribed_text)
    task_execution_module.execute_command(command, parameters)

# Speech Recognition Module
def get_voice_input():
    audio = speech_recognition_module.capture_audio()
    return audio

def capture_audio():
    # Record raw audio from the microphone and return it
    ...

# Logging Module
def log_error(error_message):
    # Log the error message to a file or console
    ...
Parameter passing and calling facilitate the flow of information between different modules or
components of the desktop voice assistant, enabling seamless communication and collaboration
within the system. By carefully designing and implementing these interactions, you can create a
robust and efficient voice assistant application.
4.4 VALIDATION CHECKS
1. Functionality Testing:
Verify that the voice assistant accurately understands and responds to user
commands.
Test a variety of commands and queries to ensure comprehensive coverage.
Check for proper handling of errors and fallback mechanisms when the assistant
doesn't understand a command.
4. Performance Testing:
Measure the response time of the voice assistant to user inputs.
Assess the system's performance under different loads and usage scenarios.
Check for any latency issues or delays in processing user requests.
5. Compatibility Testing:
Validate that the voice assistant works seamlessly on different desktop platforms
(e.g., Windows, macOS, Linux).
Ensure compatibility with various hardware configurations and system settings.
Test the assistant across different web browsers if it has a web-based component.
TESTING
2. Integration Testing:
Test the interaction and integration between different components of the voice
assistant.
Validate the flow of data and control between modules, such as speech recognition,
natural language understanding, and task execution.
Use integration tests to verify the behavior of the system as a whole, including the
handling of various user inputs and scenarios.
3. End-to-End Testing:
Conduct end-to-end tests to validate the entire user journey and interaction flow of
the voice assistant application.
Test common user scenarios, from initiating voice commands to receiving and
verifying the assistant's responses.
Use real-world user inputs or scripted scenarios to simulate typical usage patterns and
evaluate the system's behavior under different conditions.
4. Usability Testing:
Evaluate the usability and user experience of the voice assistant through usability
testing.
Gather feedback from real users to assess the intuitiveness, effectiveness, and
satisfaction of the voice interaction interface.
Identify usability issues, navigation difficulties, and areas for improvement based on
user feedback and observations.
5. Performance Testing:
Measure and evaluate the performance characteristics of the voice assistant
application.
Conduct performance tests to assess factors such as response time, latency,
throughput, and resource utilization under varying loads and conditions.
Identify performance bottlenecks, scalability limitations, and areas for optimization to
ensure the voice assistant meets performance requirements.
6. Security Testing:
Perform security testing to identify and mitigate potential vulnerabilities and threats
in the voice assistant application.
Assess the application's security posture by conducting penetration testing, code
reviews, and vulnerability assessments.
Verify the implementation of security best practices, such as data encryption, access
controls, and secure communication protocols, to protect sensitive information and
ensure user privacy.
7. Regression Testing:
Conduct regression testing to validate that recent changes or updates to the voice
assistant application do not introduce new defects or regressions in functionality.
Maintain a suite of automated regression tests to systematically verify the behavior of
the system across different use cases and scenarios.
8. Accessibility Testing:
Evaluate the accessibility of the voice assistant application to ensure it is usable by
individuals with disabilities.
Test for compliance with accessibility standards and guidelines, such as the Web
Content Accessibility Guidelines (WCAG), to support users with diverse needs and
abilities.
By employing a comprehensive testing approach that incorporates these techniques and strategies,
you can ensure the quality, reliability, and effectiveness of the desktop voice assistant application.
4. Parameterized Tests:
Use parameterized tests to test a function or method with different sets of input
parameters and expected outputs.
Parameterized tests help increase test coverage and reduce code duplication by testing
multiple scenarios with a single test case.
7. Mutation Testing:
Implement mutation testing to assess the quality of the test suite by introducing small
changes (mutations) to the code and checking if the tests detect these mutations.
Use mutation testing tools to measure the effectiveness of the test suite in detecting
faults and identifying areas for improvement.
By incorporating these code improvement strategies into the testing process, you can enhance the
reliability, maintainability, and quality of the desktop voice assistant project.
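The parameterized-test strategy described above can be sketched with the standard library's unittest.subTest. The parse_command function is a hypothetical example of the kind of unit under test, not part of the project code:

```python
import unittest

def parse_command(query):
    """Hypothetical unit under test: map a raw query to a command keyword."""
    query = query.lower().strip()
    for keyword in ("wikipedia", "open youtube", "play music"):
        if keyword in query:
            return keyword
    return "unknown"

class TestParseCommand(unittest.TestCase):
    def test_known_and_unknown_commands(self):
        cases = [
            ("Search Wikipedia for Python", "wikipedia"),
            ("please OPEN YOUTUBE", "open youtube"),
            ("play music now", "play music"),
            ("what time is it", "unknown"),
        ]
        # One test method covers every case; a failure reports which case broke.
        for query, expected in cases:
            with self.subTest(query=query):
                self.assertEqual(parse_command(query), expected)

# Run the case directly (normally you would invoke `python -m unittest`)
suite = unittest.TestLoader().loadTestsFromTestCase(TestParseCommand)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```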
SYSTEM SECURITY
Ensuring the security of a desktop voice assistant project is essential to protect user privacy, data
integrity, and system confidentiality. Here are some key aspects to consider for system security:
1. Data Encryption:
Encrypt sensitive data, such as user voice recordings, command history, and personal
information, to prevent unauthorized access or interception.
Use strong encryption algorithms and key management practices to secure data both
at rest and in transit.
2. Access Control:
Implement access control mechanisms to restrict access to sensitive functionality and
resources within the voice assistant application.
Authenticate users and enforce proper authorization levels to prevent unauthorized
users from accessing privileged features or data.
3. Secure Communication:
Use secure communication protocols, such as HTTPS, SSL/TLS, or SSH, to encrypt
data exchanged between the voice assistant application and external services or APIs.
Verify the authenticity of remote endpoints and validate server certificates to prevent
man-in-the-middle attacks.
4. User Authentication:
Implement strong user authentication mechanisms, such as multi-factor authentication
(MFA) or biometric authentication, to verify the identity of users accessing the voice
assistant application.
Enforce password policies, session management, and account lockout mechanisms to
protect against unauthorized access and brute force attacks.
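Password handling for the authentication described above can be sketched with the standard library's hashlib.pbkdf2_hmac. This is a minimal illustration; a production system would use a vetted password-hashing library and its own parameters:

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=200_000):
    """Derive a salted hash; store the salt and hash, never the password."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password, salt, stored_digest, iterations=200_000):
    """Recompute the hash and compare in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(digest, stored_digest)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))                   # False
```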
5. Secure Storage:
Store sensitive data securely using encrypted storage mechanisms and access controls
to prevent unauthorized access or data leakage.
Follow best practices for secure configuration and management of databases, file
systems, and other storage repositories.
6. Input Validation:
Validate and sanitize user input to prevent injection attacks, such as SQL injection or
cross-site scripting (XSS), which could compromise the security of the voice assistant
application.
Use input validation libraries or frameworks to enforce data integrity and mitigate the
risk of common security vulnerabilities.
8. Vulnerability Management:
Regularly scan the voice assistant application for security vulnerabilities using
automated scanning tools, vulnerability databases, and security assessments.
Maintain an up-to-date inventory of software dependencies and third-party libraries,
and promptly apply security patches and updates to address known vulnerabilities.
By incorporating these security measures into the design, development, and operation of the desktop
voice assistant project, you can mitigate security risks and safeguard the integrity, confidentiality,
and availability of the system and its data.
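The SQL-injection risk noted under Input Validation comes down to never interpolating user input into query strings. A minimal sketch with Python's built-in sqlite3, reusing the Note table from the database design earlier in this report:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Note (Nid INTEGER PRIMARY KEY, Data TEXT)")
conn.execute("INSERT INTO Note (Data) VALUES ('private entry')")

user_input = "x' OR '1'='1"  # a classic injection attempt

# UNSAFE (do not do this): string interpolation lets the input rewrite the query:
#   conn.execute(f"SELECT * FROM Note WHERE Data = '{user_input}'")

# Safe: the ? placeholder treats the input strictly as data, never as SQL.
rows = conn.execute("SELECT * FROM Note WHERE Data = ?", (user_input,)).fetchall()
print(rows)  # [] -- the injection attempt matches nothing
```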
SCREENSHOTS
CONCLUSION & FUTURE SCOPE
We've covered Python-based personal virtual assistants for Windows in this report. Virtual assistants
make humans' lives simpler, and using one gives you the freedom to contract for just the services you
need. Python is used to create virtual assistants for all Windows versions, much like Alexa, Cortana,
Siri, and Google Assistant. Artificial intelligence is used in this project, and virtual personal assistants
are an excellent method to keep track of your calendar. Because of their portability, loyalty, and
availability at any moment, virtual personal assistants are more dependable than human personal
assistants. Our virtual assistant will get to know you better and be able to offer suggestions and follow
orders. This technology will most likely be with us for the rest of our lives. It is also possible to enhance
education by using immersive technology.
Voice assistants may help students study in new and innovative ways, and this report draws on studies
of AI voice assistants in education. There hasn't been much study of voice assistants yet, but that is
about to change, and new discoveries could follow from the results so far. The next few years will be
all about voice-activated devices like smart speakers and virtual assistants, though exactly how they
will be most successful in the classroom remains an open question. Not all voice assistants are
multilingual, which can be problematic, and they lack sufficient security safeguards and protection
filters for classroom use by students. The use of these devices in the classroom can only be successful
if instructors are given the proper training and incentives. Although most students and teachers have
reported positive results, the data are sparse, fragmentary, and unstructured; more research is required
to better understand the use of these devices in the classroom.
9.1 LIMITATIONS
● The lack of voice command encryption raises concerns about the project's overall security.
● Unlike Google Assistant, which can be invoked at any time by saying "Ok Google!", TUTOR cannot
be activated externally.
● For further protection, voice instructions may be encrypted.
Voice-driven systems are especially valuable for the severely disabled or those who have suffered
repetitive stress injuries, i.e., those who may otherwise need the assistance of others to manage their
surroundings. Such technologies make it easier for the user to communicate with the computer system,
which in turn facilitates the completion of a variety of activities. The IVR system acts as an
intermediary between humans and computers.
Due to the time and research constraints, the existing IVR system is only suitable to desktop computers and
will not be implemented in real phone devices. This is a disadvantage since the IVR system with Automatic
Voice Recognition (AVR) may be used in a wide range of applications. Although the project is still in its
infancy, there is plenty of room for improvement in the years to come. The following are some of the places
that may be relevant:
1. Organizational inquiry desk: The system may be utilised in different organisations for simple access
to information about the organisation using voice commands.
2. The suggested system only detects isolated words, but it could be expanded to include audio-to-text
conversion with further improvements in the algorithms.
3. Voice recognition of Nepali phrases could be employed to accomplish the work of newly created
apps, making applications more user-friendly.
4. In embedded systems, voice commands may be used to handle multiple activities using speech
recognition technology. This promotes automation of labour and can thus be very advantageous in
industrial process automation.
5. Application for people with disabilities: People with disabilities may also benefit from voice
recognition software. It is particularly beneficial for those who are unable to use their hands.
REFERENCES
[1] Diksha Goutam, "A Review: Desktop Voice Assistant", IJRASET, Volume 05, Issue 01, January
2023, e-ISSN: 2582-5208.
[2] Asodariya, H., Vachhani, K., Ghori, E., Babariya, B., & Patel, T. Desktop Voice Assistant.
[3] Gaurav Agrawal, Harsh Gupta, Divyanshu Jain, Chinmay Jain, Prof. Ronak Jain, "Desktop Voice
Assistant", International Research Journal of Modernization in Engineering Technology and Science,
Volume 02, Issue 05, May 2020.
[4] Bandari, Bhosale, Pawar, Shelar, Nikam, Salunkhe (2023). Intelligent Desktop Assistant. JETIR,
June 2023, Volume 10, Issue 6, www.jetir.org (ISSN: 2349-5162).
[5] Vishal Kumar Dhanraj, Lokeshkriplani, Semal Mahajan, IJRES, ISSN (Online): 2320-9364, ISSN
(Print): 2320-9356, www.ijres.org, Volume 10, Issue 2, 2022, pp. 15-20.
[6] Ujjwal Gupta, Utkarsh Jindal, Apurv Goel, Vaishali Malik, "Desktop Voice Assistant", IJRASET,
8 May 2022.
[7] Vishal Kumar Dhanraj, Lokeshkriplani, Semal Mahajan, "Research Paper on Desktop Voice
Assistant", IJRES, ISSN (Online): 2320-9364, ISSN (Print): 2320-9356, www.ijres.org, Volume 10,
Issue 2, 2022, pp. 15-20.
[8] V. Geetha, C. K. Gomathy, Kottamasu Manasa Sri Vardhan, Nukala Pavan Kumar, "The Voice
Enabled Personal Assistant for PC using Python", International Journal of Engineering and Advanced
Technology (IJEAT), April 2021.
[9] Chen, X., Liu, C., & Guo, W. (2020). A Survey on Voice Assistant Systems. IEEE Access, 8,
27056-27070