1822 B.E Cse Batchno 10

SMART VOICE ASSISTANT USING PYTHON
Submitted in partial fulfillment of the requirements

for the award of
Bachelor of Engineering degree in Computer Science and Engineering by
Akash S (38110016)
Neeraj Jayaram (38110360)
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
SCHOOL OF COMPUTING
SATHYABAMA
INSTITUTE OF SCIENCE AND TECHNOLOGY
(DEEMED TO BE UNIVERSITY)
Accredited with Grade “A” by NAAC
JEPPIAAR NAGAR, RAJIV GANDHI SALAI,

CHENNAI - 600 119
MARCH – 2022
SATHYABAMA
INSTITUTE OF SCIENCE AND TECHNOLOGY
(DEEMED TO BE UNIVERSITY)
Accredited with “A” grade by NAAC
Jeppiaar Nagar, Rajiv Gandhi Salai, Chennai - 600119
www.sathyabama.ac.in
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
BONAFIDE CERTIFICATE
This is to certify that this Project Report is the bonafide work of Akash S (38110016)
and Neeraj Jayaram (38110360) who carried out the project entitled “SMART VOICE
ASSISTANT USING PYTHON” under my supervision from November 2020 to March
2021.
Internal Guide
Dr. A. JESUDOSS, M.E., Ph.D.
Head of the Department

Dr. S. Vigneshwari M.E., Ph.D., and Dr. L. Lakshmanan M.E., Ph.D.,
Submitted for Viva voce Examination held on
Internal Examiner External Examiner
DECLARATION
We, Akash S (38110016) and Neeraj Jayaram (38110360), hereby declare that the Project
Report entitled “SMART VOICE ASSISTANT USING PYTHON”, done by us under the
guidance of Dr. A. JESUDOSS, M.E., Ph.D., is submitted in partial fulfillment of the
requirements for the award of the Bachelor of Engineering degree (2018-2022).
DATE:
PLACE: SIGNATURE OF THE CANDIDATE
ACKNOWLEDGEMENT
We are pleased to express our sincere thanks to the Board of Management of SATHYABAMA
for their kind encouragement in doing this project and for completing it successfully.
We are grateful to them.
We convey our thanks to Dr. T. Sasikala M.E., Ph.D., Dean, School of Computing, and to
Dr. S. Vigneshwari M.E., Ph.D. and Dr. L. Lakshmanan M.E., Ph.D., Heads of the
Department of Computer Science and Engineering, for providing us the necessary support
and details at the right time during the progressive reviews.
We would like to express our sincere and deep sense of gratitude to our Project Guide,
Dr. A. JESUDOSS, M.E., Ph.D., whose valuable guidance, suggestions and constant
encouragement paved the way for the successful completion of our project work.
We wish to express our thanks to all the teaching and non-teaching staff members of the
Department of Computer Science and Engineering who were helpful in many
ways for the completion of the project.
ABSTRACT
Voice assistants, one of the hot topics of the current world, are programs that listen to a
user's verbal commands and respond to them, which makes them a form of human-computer/device
interaction. Voice assistants are now widespread and are very useful in these busy days. They
range from the Google Assistant on smartphones, which even five-year-old children know how to
use because the pandemic has made them use smartphones, to Amazon's Alexa, which can do work
ranging from entertaining its users to turning household products on and off (Internet of
Things). One of their greatest features is that they are very useful even to physically
challenged people: for example, people who are not able to walk can use the Internet of Things
(IoT) feature to operate and maintain household products. We therefore set out to develop a
voice assistant that is as useful to its users as the voice assistants currently available.
TABLE OF CONTENTS
Chapter No. TITLE Page No.

ABSTRACT v
LIST OF FIGURES viii
LIST OF ABBREVIATIONS ix
1 INTRODUCTION 1
1.1. OVERVIEW 2
1.2. DESIGN 3
1.3. VOICE ASSISTANT 3
1.3.1 WHAT IS VOICE ASSISTANT 3
1.3.2 WHY DO WE NEED IT 3
1.3.3 WHERE TO USE IT 4
2 LITERATURE SURVEY 5
2.1. RELATED WORK 5
3 METHODOLOGY 8
3.1. EXISTING SYSTEM 8
3.2. PROPOSED SYSTEM 8
3.3. OBJECTIVE OF THE PROJECT 9
3.4. SOFTWARE AND HARDWARE REQUIREMENTS 10
3.4.1. SOFTWARE REQUIREMENTS 10
3.4.2. HARDWARE REQUIREMENTS 10
3.4.3. LIBRARIES 11
3.5. PROGRAMMING LANGUAGES 14
3.5.1. PYTHON 14
3.5.2. DOMAIN 15
3.6. SYSTEM ARCHITECTURE 16
3.7. ALGORITHMS USED 16
3.7.1. SPEECH RECOGNITION MODULE 16
3.7.2. SPEECH TO TEXT & TEXT TO SPEECH CONVERSION 17
3.7.3. PROCESS & EXECUTES THE REQUIRED COMMAND 17
3.8. SYSTEM DESIGN 18
3.8.1 USE CASE DIAGRAM 18
3.8.2 COMPONENT DIAGRAM 19
3.8.3 SEQUENCE DIAGRAM 19
3.9. FEASIBILITY STUDY 21
3.10. TYPES OF OPERATION 22
4 RESULTS AND DISCUSSION 25
4.1. WORKING 25
5 CONCLUSION 29
5.1. CONCLUSION 29
5.2. FUTURE WORK 29
REFERENCES 31
APPENDICES 32
A. SOURCE CODE 32
B. SCREENSHOTS 37
C. PLAGIARISM REPORT 39
D. JOURNAL PAPER 40
LIST OF FIGURES
Figure No. Figure Name Page No.

3.1 SYSTEM ARCHITECTURE 16
3.2 USE CASE DIAGRAM 18
3.3 COMPONENT DIAGRAM 19
3.4 SEQUENCE DIAGRAM 19
3.5 SEQUENCE DIAGRAM (Answering the user) 20
4.1 FLOWCHART 24
LIST OF ABBREVIATIONS
ABBREVIATIONS EXPANSION
IOT Internet of Things
AI Artificial Intelligence
COM Communication Port
OOPs Object Oriented Programming
API Application Programming Interface
TTS Text to Speech
STT Speech to Text
RAD Rapid Application Development
UIDs Unique Identifiers
NOVA Next-Gen Optimal Voice Assistant
IP Internet Protocol
CHAPTER 1
INTRODUCTION
The very first voice-activated product was released in 1922 as Radio Rex. This toy was very simple:
a toy dog stayed inside a dog house until the user exclaimed its name, “Rex”, at which point
it jumped out of the house. This was done by an electromagnet tuned to a frequency similar to that of
the vowel in the word Rex, and it predated modern computers by over 20 years.
In the 21st century, human interaction is being replaced by automation very quickly. One of the
main reasons for this change is performance. The change in technology is drastic rather than
incremental. Today we train machines to do their tasks by themselves, or to think like
humans, using technologies such as Machine Learning and Neural Networks. In the current era, we can
talk to our machines with the help of virtual assistants.
Virtual assistants are software programs that ease day-to-day tasks such as showing
weather reports, giving the daily news and searching the internet. They can take commands by voice.
Voice-based intelligent assistants need an invoking word, or wake word, to activate the listener,
followed by the command. There are many virtual assistants, such as Apple’s Siri, Amazon’s Alexa and
Microsoft’s Cortana, and they have been our inspiration for this project. This system
is designed to be used efficiently on desktops. Voice assistants are programs on digital devices that
listen and respond to verbal commands. A user can say, “What's the weather?” and the voice assistant
will answer with the weather report for that day and location.
1.1 OVERVIEW
A disease is a condition that affects the functioning of an individual's body. Diseases, if
neglected, can lead to the death of an individual. Diseases can be identified by the symptoms
shown by the body. Health is the most important thing in every human’s life. A weekly or
monthly check-up is important for prevention and for staying healthy.
Healthcare is one of the most crucial parts of human life. Nowadays, many people are not
willing to go to hospital, owing to work overload and negligence of their health. Doctors and
nurses are putting in maximum effort to save people’s lives without even considering their
own lives. There are also some villages which lack medical facilities.
Accurate and on-time analysis of any health-related problem is important for the
prevention and treatment of the illness. The traditional way of diagnosis may not be sufficient
in the case of a serious ailment. In this situation, where everything has turned virtual,
doctors and nurses are putting in maximum effort to save people’s lives even if they have to
endanger their own.
There are also some remote villages which lack medical facilities. The dataset was
processed with the ML models Naive Bayes and Decision Tree. While processing the data,
symptoms are given as input and the disease is received as output. This project helps to
get an idea of an individual's disease based on the symptoms he/she has, and to get
treatment easily by contacting the concerned doctor.
1.2 DESIGN
a) The voice assistant takes an input word, called the "signal word", to be
activated. Once it hears the signal word, it starts listening for user
commands.
b) The assistant converts the speech into text.
c) The converted text is then processed to get the required results.
d) The text given by the user should contain one or two keywords that determine
which query is to be executed. If the keywords do not match any of the queries in
the code, the assistant asks the user to speak again.
e) Finally, the output for the user's query is given by converting text to
speech.
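Steps (a), (c) and (d) can be sketched on already transcribed text as follows. This is only a minimal illustration: the wake word is the one quoted in this report, while the keyword table, the query names and the function names are hypothetical and are not taken from the project's source code.

```python
WAKE_WORD = "hello nova"  # wake word quoted in this report

# Hypothetical keyword table: keyword -> name of the query to run.
KEYWORDS = {
    "weather": "weather_report",
    "wikipedia": "wikipedia_search",
    "joke": "tell_joke",
}


def strip_wake_word(transcript):
    """Return the command that follows the wake word, or None if it is absent."""
    text = transcript.lower().strip()
    if not text.startswith(WAKE_WORD):
        return None
    return text[len(WAKE_WORD):].strip(" ,.")


def match_query(command):
    """Return the first query whose keyword occurs in the command, else None."""
    words = [word.strip(".,?!") for word in command.split()]
    for keyword, query in KEYWORDS.items():
        if keyword in words:
            return query
    return None


def handle(transcript):
    """Return the reply to be spoken, or None while the assistant stays idle."""
    command = strip_wake_word(transcript)
    if command is None:
        return None  # the signal word was not heard
    query = match_query(command)
    if query is None:
        return "Sorry, please say that again."
    return f"Running {query}"
```

Matching on whole words rather than substrings avoids accidental hits inside longer words; the real assistant may well use a different matching rule.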
1.3 VOICE ASSISTANT
Our assistant “NOVA” helps us when working on the system on which it is
installed. We can access it by calling the wake word "Hello NOVA".
1.3.1 WHAT IS VOICE ASSISTANT
A voice assistant, also known as an intelligent personal assistant or a connected speaker, is a new
type of device based on natural-language speech recognition, offered by popular
companies like Apple, Amazon, and Google. We were inspired by these and created one ourselves.
1.3.2 WHY DO WE NEED IT
Typing out searches or doing day-to-day tasks by hand can become hectic, but our life doesn’t
need to be like that. One can ask a voice assistant for help. Voice assistants let users perform a
task using a speech command, as well as retrieve information via voice synthesis.
The following are the reasons to have a voice assistant.
• Minimal effort
◦ It is easier to say a few words than to type them on a small smartphone screen.
• Eyes free
◦ A voice assistant helps even a user who cannot look at the screen; our ears are enough.
One can also ask the bot about something while cooking at the same time.
• Fast response
◦ Imagine how much time it takes to find some information on a website, or how
many clicks are needed before finding the right thing in a mobile application.
Voice assistants do not generate such difficulties: one asks a question and gets the
answer.
1.3.3 WHERE TO USE IT
Voice search has been a hot topic of discussion, and visibility in voice results will undoubtedly be
a challenge because voice assistants lack a visual interface. Users cannot see or interact with a
voice interface unless it is linked to the Alexa or Google Assistant app. Search behaviour patterns
will change dramatically as a result.
Brands are currently undergoing a transformation in which touchpoints are turning into
listening points, and organic search will be the primary means of brand visibility. Advertising
agencies are becoming more popular as voice search grows in popularity. Voice assistants will also
continue to offer more individualized experiences as they get better at differentiating between
voices. The number of people using voice assistants is expected to grow. According to the
Voicebot Smart Speaker Consumer Adoption Report 2018, almost ten percent of people who do not
own a smart speaker plan to purchase one. If this holds true, the user base of smart speakers
will grow by 50 percent, meaning a quarter of adults in the United States will own a smart speaker.
CHAPTER 2
LITERATURE SURVEY
2.1 RELATED WORK
The field of virtual assistants with speech recognition has seen some major advancements and
innovations. This is mainly because of the demand for them in devices like smartwatches, fitness
bands, speakers, Bluetooth earphones, mobile phones, laptops, desktops and televisions. Almost all
digital devices released nowadays come with voice assistants that help to control the device
through speech recognition alone. New techniques are constantly being developed to improve the
performance of voice-automated search.
As the amount of data, now known as Big Data, is increasing exponentially, the best way to improve the
results of virtual assistants is to combine them with machine learning and train our devices
according to their uses. Other equally important techniques are Artificial Intelligence, the
Internet of Things, and Big Data access and management. With voice assistants we can
automate tasks easily: we just give the input to the machine in speech form, and it does all the
work, from converting the speech into text, to extracting keywords from that text, to executing
the query and giving the results to the user.
Machine Learning is a subset of Artificial Intelligence and has been one of the most helpful
advancements in technology. Before AI, we were the ones who upgraded technology to do a task;
now the machine itself can take on new tasks and solve them without humans having to
evolve it.
This has been helpful in day-to-day life. From mobile phones to personal desktops to mechanical
industries, these assistants are in great demand for automating tasks and increasing efficiency.
• Nivedita Singh et al. (2021) proposed a voice assistant built with a Python speech-to-text (STT)
module that performs API calls and system calls, allowing the user to run any type of command
by voice without touching the keyboard. It can also run on hybrid platforms. However, the paper
is lacking in some parts, such as system calls that are not well supported.
• Abeed Sayyed et al. (2021) presented a paper on a Desktop Assistant AI using Python with IoT
features; it also uses Artificial Intelligence (AI) features along with an SQLite database. The
project has a database connection and a query framework but lacks API-call and system-call features.
• P. Krishnaraj et al. (2021) presented a project on Portable Voice Recognition with GUI Automation.
The system uses Google’s online speech recognition system, along with Python, to convert speech
input to text. The project has a GUI and a portable framework. The accuracy of its
text-to-speech (TTS) engine is comparatively low, and it lacks IoT support.
• Rajdip Paul et al. (2021) presented a project named A Novel Python-based Voice Assistance System
for reducing the Hardware Dependency of Modern Age Physical Servers. The authors proposed an
assistant with Python as the backend, supporting system calls, API calls and various other features.
The project is quite responsive with API calls but needs improvement in understanding and reliability.
• V. Geetha et al. (2021) presented a project named The Voice Enabled Personal Assistant for Pc using
Python. The authors proposed an assistant with Python as the backend, in which features like turning
the PC off, restarting it, or reciting the latest news are just one voice command away. The
project has well-supported libraries, but not every API is able to convert the raw JSON data
into text, and there is a delay in processing request calls.
• Dilawar Shah Zwakman et al. (2021) proposed the Usability Evaluation of Artificial
Intelligence-Based Voice Assistants, which can give a proper response to the user's request. It also
has a feature to make an appointment by voice with a person mentioned by the user, but it lacks
API calls.
• Dimitrios Buhalis et al. (2021) proposed a paper on In-room Voice-Based AI Digital Assistants
Transforming On-Site Hotel Services and Guests’ Experiences, in which a voice assistant is used for
hotel services. This is very useful in the current COVID-19 era: human touch is considered a danger
at this time, so the loss of human touch that comes with a voice assistant is not considered a
disadvantage. It can also be used to control the room temperature and lighting, but it needs
complex integration and staff training.
• Philipp Sprengholz et al. (2021) proposed Ok Google: Using virtual assistants for data collection
in psychological and behavioural research. They developed Survey Mate, an
extension of the Google Assistant, which was used to check the reliability and validity of the data
collected by this test. Possible answers and synonyms are defined for every type of question, so it
can be used to analyse the behaviour of an individual, as it is a psychological and behavioural
research assistant.
• Rahul Kumar et al. (2020) proposed Power Efficient Smart Home with voice assistant, which shows
that a voice assistant is one of the important parts of a smart home, itself becoming
one of the major things in the current world. It can operate home appliances by voice alone and
increases home security through smart locks, but it requires a reliable internet connection,
which is crucial, and sometimes users might lock themselves out of their own house.
• Benedict D. C et al. (2020) proposed Consumer decisions with artificially intelligent voice
assistants, in which consumers have stronger psychological reactions to the system's human-like
behaviours. The assistant has IoT (Internet of Things) features and can also order items the user
wants, but there are some drawbacks: the voice assistant relies on the speaker’s ability to
represent the decision alternatives in voice dialogues, and another main disadvantage is that it
lacks system calls.
• Tae-Kook Kim et al. (2020) proposed a Short Research on Voice Control System Based on
Artificial Intelligence Assistant, which describes an AI assistant system using an open-API
artificial intelligence and the conditional auto-run system IFTTT (IF This, Then That). It can
control the system using a Raspberry Pi board, but it lacks system calls.
CHAPTER 3
METHODOLOGY
3.1 EXISTING SYSTEM
From the above literature survey, we have inferred that all the existing systems predict
only particular diseases, namely lung disease, breast cancer, heart disease and diabetes, by
implementing various algorithms on the particular datasets.
After implementing various algorithms, the most accurate one is selected and
used for prediction of the disease. Sometimes, we may be confused about which algorithm to use.
Also, all the systems find only the particular disease and not the disease based on the
symptoms.
3.2 PROPOSED SYSTEM
We propose an efficient way of implementing a personal voice assistant. The Speech
Recognition library has many built-in functions that let the assistant
understand the command given by the user, and the response is sent back to the user in
voice with text-to-speech functions. When the assistant captures the voice command given
by the user, the underlying algorithms convert the voice into text. According to the
keywords present in the text (the command given by the user), the respective action is performed
by the assistant.
This is made possible by the functions present in different libraries. The
assistant also achieves some of its functionality with the help of APIs. We used
these APIs for performing calculations, extracting news from web
sources, and telling the weather: we send a request and, through the API,
get the respective output. APIs like WOLFRAMALPHA are very helpful for
calculations, small web searches, and getting data
from the web. In this way, we are able to extract news from web sources and send it as
input to a function for further purposes. We also use libraries like Random and many
others, each corresponding to a different technology. We used the OS library to
implement operating-system-related functionality such as shutting down or
restarting the system.
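As a rough sketch of the operating-system functionality mentioned above, the helper below only builds the shutdown or restart command for the current platform and does not run it. The function name is hypothetical, and the Linux/macOS branch is an assumption, since the report targets Windows.

```python
import platform


def power_command(action, system=None):
    """Return the argument list for shutting down or restarting, without running it."""
    system = system or platform.system()
    if action not in ("shutdown", "restart"):
        raise ValueError(f"unsupported action: {action}")
    if system == "Windows":
        flag = "/s" if action == "shutdown" else "/r"
        return ["shutdown", flag, "/t", "0"]
    # Assumed for Linux and macOS; these usually need administrator rights.
    flag = "-h" if action == "shutdown" else "-r"
    return ["shutdown", flag, "now"]
```

The resulting list could be passed to subprocess.run(), or joined into a string for os.system(), once the user has confirmed the action by voice.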
At the outset we make our program capable of using the system voice with the help of
sapi5 and pyttsx3. pyttsx3 is a text-to-speech conversion library in Python. Unlike
alternative libraries, it works offline, and it is compatible with both Python 2 and 3. The
Speech Application Programming Interface, or SAPI, is an API developed by Microsoft to
allow the use of speech recognition and speech synthesis within Windows applications.
We then define the speak function to enable the program to speak its outputs.
After that we define a function to take voice commands using the system
microphone. Finally, the main function is defined, where all the capabilities of the program
are defined.
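A speak function of this kind might look like the sketch below. It uses only pyttsx3 calls that exist in the library (init, getProperty, setProperty, say, runAndWait); the helper names, the default rate and the assumption about the order of the installed voices are illustrative and are not taken from the project's code.

```python
def configure_voice(engine, female=False, rate=170, volume=1.0):
    """Pick a voice and set the speaking rate (words per minute) and volume (0.0-1.0)."""
    voices = engine.getProperty("voices")
    # Assumption: index 0 is a male voice and index 1 a female one, as on many
    # Windows SAPI5 installations; the order differs between machines.
    index = 1 if female and len(voices) > 1 else 0
    engine.setProperty("voice", voices[index].id)
    engine.setProperty("rate", rate)
    engine.setProperty("volume", volume)


def speak(engine, text):
    """Queue the text and block until it has been spoken."""
    engine.say(text)
    engine.runAndWait()


def make_engine():
    """Create the real engine; needs `pip install pyttsx3` and a Windows SAPI5 voice."""
    import pyttsx3  # imported lazily so the helpers above stay importable without it
    return pyttsx3.init("sapi5")
```

Passing the engine in as an argument keeps the helpers easy to try out with a stand-in object on a machine that has no speech driver installed.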
• The proposed system will have the following functionality:
(a) The system keeps listening for commands, and the listening time is variable
and can be changed according to user requirements.
(b) If the system is not able to gather information from the user input, it keeps
asking the user to repeat, up to the desired number of times.
(c) The system can use either a male or a female voice, according to user requirements.
(d) Features supported in the current version include playing music, texts, searching
Wikipedia, opening applications installed on the system, opening anything in the web
browser, etc.
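The repeat-until-understood behaviour in (b) can be sketched as a small loop. The function name, the prompt wording and the default of three attempts are assumptions; listen and speak stand for whatever functions the assistant uses to capture a command and to talk back.

```python
def ask_until_understood(listen, speak, max_attempts=3):
    """Call listen() until it returns text, prompting the user between failed tries.

    listen: callable returning the recognised text, or None when nothing was understood.
    speak:  callable taking the prompt to say aloud.
    """
    for attempt in range(1, max_attempts + 1):
        text = listen()
        if text:
            return text
        if attempt < max_attempts:
            speak("Sorry, I did not catch that. Please repeat.")
    return None  # give up after the configured number of attempts
```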
3.3 OBJECTIVE OF PROJECT
The main objective of building personal assistant software (a virtual assistant) is to use semantic
data sources available on the web and user-generated content, and to provide knowledge from knowledge
databases. The main purpose of an intelligent virtual assistant is to answer questions that users may have.
This may be done in a business environment, for example on a business website with a chat interface.
On the mobile platform, the intelligent virtual assistant is available as a call-button-operated service
where a voice asks the user “What can I do for you?” and then responds to verbal input. Virtual assistants
can save a tremendous amount of time. We spend hours on online research and then on writing the report
in our own terms.
With an assistant, we can provide a topic for research and continue with our tasks while the assistant
does the research. Another difficult task is remembering test dates, birthdates or anniversaries. It
comes as a surprise when you enter the class and realize there is a class test today. Just tell the
assistant about your tests in advance and it reminds you well ahead of time so you can prepare. One of
the main advantages of voice search is its rapidity. Voice is reputed to be nearly four times faster
than a written search: whereas we can write about 40 words per minute, we can speak around 150 in the
same period of time. In this respect, the ability of personal assistants to accurately recognize
spoken words is a prerequisite for their adoption by consumers.
3.4 SOFTWARE AND HARDWARE REQUIREMENTS
3.4.1 Software Requirements:
• Python 3.5 and above
• Windows 7 and above
3.4.2 Hardware Requirements:
• Processor: Intel Core i5
• RAM: 4GB
• OS: Windows / Mac
• Microphone
• ARDUINO UNO board
• Relay
• A Light Bulb
• USB Cable
• Electronic Wires
• Plug Point & a Plug
3.4.3 Libraries:
• Pyttsx3- A text-to-speech conversion library in Python, used to convert the text passed as
an argument into speech. It is compatible with Python 2 and 3. An application invokes the
pyttsx3.init() factory function to get a reference to a pyttsx3.Engine instance. It is a very
easy-to-use tool which converts the entered text into speech. The pyttsx3 module supports two
voices, the first female and the second male, which are provided by “sapi5” on Windows.
Command to install: pip install pyttsx3
It supports three TTS engines:
sapi5 - SAPI5 on Windows
nsss - NSSpeechSynthesizer on Mac OS X
espeak - eSpeak on every other platform
• Speech_recognition- It allows computers to understand human language. Speech
recognition is a machine's ability to listen to spoken words and identify them. We
can use speech recognition in Python to convert the spoken words into text, make
a query or give a reply. The library supports many speech recognition engines and APIs,
including Google Speech Recognition and the Google Cloud Speech API.
Command to install: pip install SpeechRecognition
• WolframAlpha- Wolfram Alpha is an API which can compute expert-level answers
using Wolfram's algorithms, knowledgebase and AI technology. It is made possible
by the Wolfram Language. The WolframAlpha API provides a web-based API
allowing the computational and presentation capabilities of WolframAlpha to be
integrated into web, mobile and desktop applications.
Command to install: pip install wolframalpha
• Randfacts- Randfacts is a Python library that generates random facts. We can use
randfacts.get_fact() to return a random fun fact.
Command to install: pip install randfacts
• Pyjokes- Pyjokes is a Python library that is used to produce one-line jokes for the users.
Informally, it can also be described as a fun Python library which is pretty simple to use.
Command to install: pip install pyjokes
• Datetime- This module is used to get the date and time for the user. It is a built-in module,
so there is no need to install it separately. The datetime module supplies classes
to work with dates and times. Date and datetime values are objects in Python, so when we manipulate
them, we are manipulating objects and not strings or timestamps.
• Random2- Python version 2 has a module named "random". Random2 provides a Python
3 ported version of Python 2.7's random module. It has also been back-ported to work in
Python 2.6. In Python 3, the implementation of randrange() was changed, so that even with the
same seed you get different sequences in Python 2 and 3.
• Math- This is a built-in module which is used to perform mathematical tasks. For example,
math.cos() returns the cosine of a number, and math.log() returns the natural logarithm of
a number, or the logarithm of the number to a given base.
• Warnings- The warnings module issues and filters warning messages; the Warning category
classes are subclasses of the built-in Exception class. A warning in a program is distinct from
an error: a warning is not critical. It shows a message, but the program keeps running.
• OS- The os module is a built-in module which provides functions with which the user can
interact with the operating system while running the program. This module provides a portable way
of using operating-system-dependent functionality. It has functions with which the
user can open the file which is mentioned in the program.
• Serial- This module encapsulates access to the serial port. It provides backends for
Python running on Windows, OSX, Linux, BSD and IronPython. The module named “serial”
automatically selects the appropriate backend.
Command to install: pip install pyserial
• Time- This module provides many ways of representing time in code, such as objects,
numbers, and strings. It also provides functionality other than representing time, like waiting
during code execution and measuring the efficiency of our code. This is a built-in module, so
installation is not necessary.
• Wikipedia- This is a Python library that makes it easy to access and parse data from
Wikipedia: search Wikipedia, get article summaries, get data like links and images from a
page, and more. Wikipedia is a multilingual online encyclopedia.
Command to install: pip install wikipedia
• Selenium WebDriver- The selenium module is used to automate web browser interaction
from Python. Several browsers/drivers are supported (Firefox, Chrome, Internet Explorer), as
well as the Remote protocol. The supported Python versions are Python 3.5 and above.
Command to install: pip install selenium
• Requests- The requests module allows you to send HTTP requests using Python. The HTTP
request returns a Response object with all the response data. With it, we can add content like
headers, form data, multipart files, and parameters via simple Python libraries. It also allows
you to access the response data in the same way.
Command to install: pip install requests
• Webbrowser- The webbrowser module is a convenient web browser controller. It provides a
high-level interface that allows displaying Web-based documents to users. webbrowser can
also be used as a CLI tool. It accepts a URL as the argument with the following optional
parameters: -n opens the URL in a new browser window, if possible, and -t opens the URL in
a new browser tab. This is a built-in module, so installation is not required.
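To show how the built-in modules listed above can back simple commands, here is a small sketch that uses only datetime, random and math. The function names and reply wording are hypothetical and are not taken from the project's source code.

```python
import datetime
import math
import random


def time_reply(now=None):
    """Answer 'what time is it' using the datetime module."""
    now = now or datetime.datetime.now()
    return now.strftime("It is %H:%M on %d %B %Y")


def pick_greeting(rng=random):
    """Vary the assistant's greeting using the random module."""
    return rng.choice(["Hello!", "Hi there!", "At your service."])


def cosine_reply(degrees):
    """Answer a simple trigonometry question using the math module."""
    value = math.cos(math.radians(degrees))
    return f"The cosine of {degrees} degrees is {value:.3f}"
```

Each function returns a string, so the result can be handed directly to the text-to-speech step.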
3.5. PROGRAMMING LANGUAGES
3.5.1 PYTHON
Python is an object-oriented (OOPs), high-level, interpreted programming language. It is a
robust, highly useful language focused on rapid
application development (RAD). Python makes code easy to write and execute, and it
can often implement the same logic in as little as one fifth of the code needed in other
OOPs languages. Python provides a long list of benefits, and its usage
cannot be limited to only one activity. Its growing popularity has allowed it to
enter some of the most popular and complex fields, like Artificial Intelligence
(AI), Machine Learning (ML), natural language processing and data science. Python has
libraries for every need of this project. The libraries used in this project include speech
recognition to recognize voice, pyttsx3 for text to speech, and selenium for web automation.
This is owing to the following strengths of Python:
• Easy to learn and understand - The syntax of Python is simple; hence it is
comparatively straightforward, even for beginners, to learn and understand
the language.
• Multi-purpose language - Python is a multi-purpose programming language
because it supports structured programming and object-oriented programming as well as
functional programming.
• Support of the open source community - Being an open source programming
language, Python is supported by a very large developer community. Because of
this, bugs are easily fixed by the Python community. This
characteristic makes Python very robust and adaptive.
3.5.2 DOMAIN
The internet of things, or IoT, is a system of interrelated computing devices,
mechanical and digital machines, objects, animals or people that are provided with
unique identifiers (UIDs) and the ability to transfer data over a network without requiring
human-to-human or human-to-computer interaction.
A thing in the internet of things can be a person with a heart monitor implant, a farm
animal with a biochip transponder, an automobile that has built-in sensors to
alert the driver when tire pressure is low, or any other natural or man-made object that can
be assigned an Internet Protocol (IP) address and is able to transfer data over a network.
Increasingly, organizations in a variety of industries are using IoT to operate more
efficiently, better understand customers to deliver enhanced customer service, improve
decision-making and increase the value of the business. An IoT ecosystem consists of
web-enabled smart devices that use embedded systems, such as processors, sensors and
communication hardware, to collect, send and act on data they acquire from their
environments.
IoT devices share the sensor data they collect by connecting to an IoT gateway or
other edge device where data is either sent to the cloud to be analysed or analysed
locally. Sometimes, these devices communicate with other related devices and act on the
information they get from one another. The devices do most of the work without human
intervention, although people can interact with the devices -- for instance, to set them up,
give them instructions or access the data.
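For the hardware listed in Section 3.4.2 (Arduino UNO, relay and light bulb), a voice command could reach the board over the serial port roughly as sketched below. Everything specific here is an assumption: the port name "COM3", the 9600 baud rate and the one-byte protocol ("1" for on, "0" for off) are placeholders, and the Arduino sketch on the other side would have to agree on them.

```python
import time


def relay_byte(turn_on):
    """Encode the desired relay state as the single byte sent to the board."""
    return b"1" if turn_on else b"0"


def switch_light(turn_on, port="COM3", baudrate=9600):
    """Open the serial port, send one command byte and close the port again."""
    import serial  # third-party: pip install pyserial

    with serial.Serial(port, baudrate, timeout=1) as board:
        # Many UNO boards reset when the port is opened, so wait before writing.
        time.sleep(2)
        board.write(relay_byte(turn_on))
```

A command such as "turn on the light" would then map to switch_light(True) in the command handler.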
3.6. SYSTEM ARCHITECTURE
Fig 3.1 System Architecture
3.7 ALGORITHMS USED
3.7.1 SPEECH RECOGNITION MODULE
 The class we use is called Recognizer.
 It converts the audio input into text, and a separate module is used to give the output as speech.
 The energy_threshold property represents the energy level threshold for sounds. Values below
this threshold are considered silence, and values above this threshold are considered speech.
 recognizer_instance.adjust_for_ambient_noise(source, duration = 1) adjusts the energy
threshold dynamically using audio from source (an AudioSource instance) to account for
ambient noise.
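The idea of an energy threshold can be illustrated with a small standalone sketch. It is not the code used inside the SpeechRecognition library; the function names, the 16-bit PCM frame format and the calibration rule (a fixed margin above the loudest ambient frame) are assumptions made only for illustration.

```python
import math
import struct


def rms_energy(frame: bytes) -> float:
    """Root-mean-square energy of a frame of 16-bit little-endian mono PCM samples."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, frame[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def is_speech(frame: bytes, energy_threshold: float) -> bool:
    """Frames above the threshold count as speech, frames below it as silence."""
    return rms_energy(frame) > energy_threshold


def calibrate_threshold(ambient_frames, margin: float = 1.5) -> float:
    """Assumed calibration rule: a fixed margin above the loudest ambient frame."""
    return margin * max(rms_energy(frame) for frame in ambient_frames)


def make_frame(amplitude: int, n: int = 160) -> bytes:
    """Synthetic square-wave frame, used here only to exercise the functions."""
    samples = [amplitude if i % 2 == 0 else -amplitude for i in range(n)]
    return struct.pack("<%dh" % n, *samples)


# Calibrate on two quiet frames, then classify a quiet and a loud frame.
threshold = calibrate_threshold([make_frame(50), make_frame(100)])
print(threshold, is_speech(make_frame(100), threshold), is_speech(make_frame(2000), threshold))
```

Calibrating on a short stretch of background audio before listening plays the same role as the adjust_for_ambient_noise call described above: the threshold follows the room instead of being fixed in advance.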
3.7.2 SPEECH TO TEXT & TEXT TO SPEECH CONVERSION
 Pyttsx3 is a text-to-speech conversion library in Python. The voice, rate and volume can be
changed with specific commands.
 Python provides a library called SpeechRecognition that converts audio into text for further
processing, including large or long audio files.
 We have included the sapi5 and espeak TTS engines, which can be used to produce the speech output.
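The voice, rate and volume settings mentioned above can be sketched as follows. The helper names configure_voice and speak are ours, and RecordingEngine is a hypothetical stand-in that only mimics the setProperty, getProperty, say and runAndWait calls of a pyttsx3 engine so that the sketch runs without audio hardware. With pyttsx3 installed, the object returned by pyttsx3.init() would be passed in instead.

```python
def configure_voice(engine, rate=150, volume=1.0, voice_index=0):
    """Apply rate, volume and voice settings to a pyttsx3-style engine."""
    engine.setProperty("rate", rate)
    # Volume is clamped to the 0.0 to 1.0 range expected by the engine.
    engine.setProperty("volume", max(0.0, min(1.0, volume)))
    voices = engine.getProperty("voices")
    if voices:
        index = voice_index if 0 <= voice_index < len(voices) else 0
        engine.setProperty("voice", voices[index].id)
    return engine


def speak(engine, text):
    """Queue the text and block until it has been spoken."""
    engine.say(text)
    engine.runAndWait()


class Voice:
    """Hypothetical voice record with the 'id' attribute the engine exposes."""

    def __init__(self, voice_id):
        self.id = voice_id


class RecordingEngine:
    """Hypothetical stand-in for a TTS engine; records calls instead of speaking."""

    def __init__(self, voice_ids):
        self.properties = {"voices": [Voice(v) for v in voice_ids]}
        self.spoken = []

    def getProperty(self, name):
        return self.properties.get(name)

    def setProperty(self, name, value):
        self.properties[name] = value

    def say(self, text):
        self.spoken.append(text)

    def runAndWait(self):
        pass


demo = configure_voice(RecordingEngine(["voice-a", "voice-b"]), rate=150, voice_index=1)
speak(demo, "Hello, I am ready")
print(demo.properties["voice"], demo.spoken)
```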
3.7.3 PROCESSING AND EXECUTING THE REQUIRED COMMAND
 The spoken command is converted into text by the speech recognition module and stored in a
temporary variable (temp).
 The text in temp is then analysed inside a while loop to decide what the user needs, based on
the input provided.
 Finally, the matching command is executed.
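The three steps above can be sketched as a small keyword dispatcher. This is a minimal illustration under our own assumptions: the handler table, the reply strings and the injected listen and speak callables are hypothetical and stand in for the microphone and the TTS engine.

```python
def handle_command(text, handlers):
    """Return the reply of the first handler whose keywords all occur in the text."""
    lowered = text.lower()
    for keywords, handler in handlers:
        if all(keyword in lowered for keyword in keywords):
            return handler(lowered)
    return None


def run_assistant(listen, speak, handlers, exit_words=("exit", "end", "stop")):
    """Loop: store the recognized text in temp, analyse it, execute the command."""
    while True:
        temp = listen()
        # Exit words are matched as whole words so that "send" does not match "end".
        if set(temp.lower().split()) & set(exit_words):
            speak("Goodbye")
            break
        reply = handle_command(temp, handlers)
        speak(reply if reply is not None else "Sorry, I did not understand that")


# Hypothetical handlers; a real assistant would call the feature modules here.
HANDLERS = [
    (("play", "video"), lambda text: "Which video should I play?"),
    (("joke",), lambda text: "Here is a joke"),
    (("news",), lambda text: "Here is the news"),
]

scripted = iter(["Tell me a joke", "stop"])
run_assistant(lambda: next(scripted), print, HANDLERS)
```

Checking that every keyword occurs in the text avoids a common pitfall: an expression such as "play" and "video" in text only tests the last keyword.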
3.8 SYSTEM DESIGN:
3.8.1 USE CASE DIAGRAM:
Fig 3.2. Use Case Diagram
 This project has only one user. The user issues a command to the system, which interprets it
and fetches an answer. The response is then sent back to the user.
3.8.2 COMPONENT DIAGRAM:
Fig 3.3. Component Diagram
 The main component here is the Virtual Assistant. It provides two specific services:
executing a task or answering the user's question.
3.8.3 SEQUENCE DIAGRAM:
Fig 3.4. Sequence Diagram
 The user sends a command to the virtual assistant in audio form. The command is passed to the
interpreter, which identifies what the user has asked and directs it to the task executor. If the
task is missing some information, the virtual assistant asks the user for it. The information
received is passed back to the task, which is then completed. After execution, feedback is sent
back to the user.
Fig 3.5. Sequence Diagram (Answering the user)
 The above sequence diagram shows how the answer to a question asked by the user is fetched
from the internet. The audio query is interpreted and sent to the web scraper, which searches for
and finds the answer. The answer is then sent to the speaker, which speaks it to the user.
3.9 FEASIBILITY STUDY
A feasibility study helps determine whether or not to proceed with a project. It is essential to
evaluate the cost and benefit of the proposed system. Five types of feasibility are taken into
consideration.
1. Technical feasibility: This covers identifying the technologies needed for the project, both
hardware and software. For a virtual assistant, the user must have a microphone to convey their
message and a speaker to hear the system's response. These are very cheap nowadays and most people
already possess them. In addition, the system needs a steady internet connection while in use.
This is also not an issue in an era where almost every home or office has Wi-Fi.
2. Operational feasibility: This is the ease and simplicity of operating the proposed system. The
system does not require any special skill set from its users. In fact, it is designed to be used
by almost everyone. Children who do not yet know how to write can read out problems to the system
and get answers.
3. Economic feasibility: Here, we assess the total cost and benefit of the proposed system
compared with the current system. For this project, the main cost is the documentation cost. The
user would also have to pay for a microphone and speakers, but these are cheap and readily
available. Maintenance is not expected to cost much.
4. Organizational feasibility: This concerns the management and organizational structure of the
project. This project is not built by a team; the management tasks are all carried out by a single
person. This avoids management issues and increases the feasibility of the project.
5. Cultural feasibility: This deals with the compatibility of the project with its cultural
environment. The virtual assistant is built in accordance with the general culture.
Overall, the project is technically feasible with no external hardware requirements, is simple to
operate, and incurs no training or repair costs. The feasibility study reveals that the goals of
the proposed system are achievable, and the decision is taken to proceed with the project.
3.10. TYPES OF OPERATION
 Information:
If we ask for some information, the assistant opens Wikipedia and asks for the topic on which we
want the information. It then clicks on the Wikipedia search box using its XPath, types the topic
into the search box, clicks the search button using the XPath of the button and reads out a
paragraph about that topic.
Keyword: information
 Plays the video which we ask:
If we ask it to play a video, the assistant opens YouTube and asks for the name of the video that
we want to play. After that, it clicks on the YouTube search box using its XPath, clicks on the
search button using its XPath and clicks the first result of the search using the XPath of the
first video.
Keyword: Play and video or music
 News of the day:
If we ask for the news, it reads out the Indian news of the day on which it is asked.
Keyword: news
 Temperature and Weather:
If the user asks for the temperature, it gives the current temperature.
Keyword: temperature
 Joke:
If the user asks for a joke, it tells the user a one-liner joke.
Keyword: funny or joke
 Fact:
If the user asks for a fact, it tells the user a fact.
Keyword: fact
 Game:
The assistant can play a number guessing game with the user. First, it asks for the lower and the
upper limit between which the number should lie. Then it picks a random number between those
limits. After that, it uses a formula to calculate the number of turns within which the user
should guess the number.
Keyword: game
 Restart the system:
The assistant restarts the system if the user asks it to.
Keyword: Restart the system or Reboot the system
 Open:
The assistant opens some of the folders and applications that the user asks it to open.
Keyword: Open
 Date and Time:
If the user asks for the date or time, the assistant tells it.
Keyword: date or time or date and time
 Calculate:
The assistant calculates the equations that the user tells it to calculate, using the
WolframAlpha API with an API key.
Keyword: calculate (along with the equation)
 Turn on the light:
This is an IoT feature where the assistant turns on the light if the user asks it to.
Keyword: light on
 Turn off the light:
This is an IoT feature where the assistant turns off the light if the user asks it to.
Keyword: light off
 Tells its name:
The assistant tells its name if the user asks for it. The name of the assistant is Next Gen
Optimal Voice Assistant (NOVA).
Keyword: Name
 Exit:
The assistant stops assisting the user if the user asks it to exit.
Keyword: exit or end or stop.
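The turn formula mentioned under Game can be made concrete with a short sketch. The source listing in the appendix announces round(log2(upper - lower + 1)) chances; the function names below and the way a round is replayed from a list of guesses are our own assumptions for illustration.

```python
import math


def allowed_guesses(lower, upper):
    """Number of turns announced for a range, as in the appendix listing."""
    return round(math.log2(upper - lower + 1))


def play_round(secret, guesses, lower, upper):
    """Replay a list of guesses; return (won, turns_used) within the allowed turns."""
    budget = allowed_guesses(lower, upper)
    for attempt, guess in enumerate(guesses[:budget], start=1):
        if guess == secret:
            return True, attempt
    return False, min(len(guesses), budget)


print(allowed_guesses(1, 100))             # range of 100 numbers
print(play_round(7, [5, 8, 7], 1, 10))     # guessed on the last allowed turn
```

Because the value is rounded, the number of turns can be smaller than what a halving strategy needs in the worst case, so the game is not always winnable by halving alone. A range containing a single number yields zero turns, which a real implementation would need to handle.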
CHAPTER 4
RESULTS AND DISCUSSION
This report has explained the working of the voice assistant, how useful it is, and how a user can
rely on it to perform a wide range of everyday tasks. Voice assistants are developing every day,
and we expect them to become one of the major technologies of the current technological world.
Development of the software is almost complete from our side and it works as expected; some
further development was also discussed. Advancements in the near future may therefore make the
assistant we developed even more useful than it is now.
4.1. WORKING
It starts with a signal word. Users say the name of their voice assistant to wake it: they might
say, “Hey Siri!” or simply, “Alexa!” Whatever the signal word is, it wakes up the device and
signals to the voice assistant that it should begin paying attention. After the voice assistant
hears its signal word, it starts to listen. The device waits for a pause to know that the user has
finished the request. The voice assistant then sends the request to its source code, where it is
compared with known requests and split into separate commands that the voice assistant can
understand. The source code then sends these commands back to the voice assistant. Once it
receives the commands, the voice assistant knows what to do next and, if it understands, carries
out the task that was asked for. For example, “Hey NOVA! What’s the weather?” NOVA reports back in
seconds. The more directions the devices receive, the better and faster they get at fulfilling our
requests.
In our assistant, the user gives voice input through the microphone and the assistant is triggered
by the wake-up word. It performs STT (Speech to Text) to convert the voice input into text, works
out what was asked, performs the requested task and delivers the result through the TTS (Text to
Speech) module in a synthesized voice. This cycle repeats for each command.
These are the important features of the voice assistant, but it can do plenty of other things as
well.
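The wake-word gate described above can be sketched as follows. The wake phrase "hello nova" follows the source listing in the appendix; the function names, the retry count and the injected listen, understand and speak callables are assumptions that stand in for the STT module, the command logic and the TTS module.

```python
def normalize(phrase):
    """Lower-case the phrase and collapse repeated whitespace."""
    return " ".join(phrase.lower().split())


def wait_for_wake_word(listen, wake_phrase="hello nova", max_attempts=3):
    """Listen up to max_attempts times and report whether the wake phrase was heard."""
    target = normalize(wake_phrase)
    for _ in range(max_attempts):
        if normalize(listen()) == target:
            return True
    return False


def assistant_session(listen, understand, speak, wake_phrase="hello nova"):
    """Wake word, then STT for the request, then TTS for the reply."""
    if not wait_for_wake_word(listen, wake_phrase):
        return False
    request = listen()           # speech to text
    speak(understand(request))   # text to speech
    return True


heard = iter(["hello", "Hello  Nova", "what's the weather"])
assistant_session(lambda: next(heard), lambda text: "You asked: " + text, print)
```

Comparing normalized phrases makes the gate tolerant of capitalization and spacing differences in the recognizer's output, which an exact string comparison would reject.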
List of features offered by the assistant:
- Playing a video that the user wants to see.
- Telling a random fact at the start of the day, so that the user starts work having learned
something new.
- Playing a game, a feature found in almost every assistant, so that the user can spend free time
in a fun way.
- Turning off the system. Users might forget to turn off a system that contains useful data; with
a voice assistant, this can be done even after leaving the place where the system is, simply by
commanding the assistant to turn the system off.
The mandatory features of a voice assistant discussed above are implemented in this work; a brief
explanation is given below.
API CALLS
We have used API keys to get news from newsapi and weather forecasts from openweathermap, which
fetch accurate information and return the results to the user.
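A minimal sketch of how such API calls are typically structured is given below. It only builds the request URLs and parses JSON responses, so it runs without a network connection or a real key. The endpoint paths and field names follow the public OpenWeatherMap current-weather and NewsAPI top-headlines interfaces as we understand them; the function names, the placeholder key and the sample payload are assumptions.

```python
import json
from urllib.parse import urlencode

WEATHER_URL = "https://api.openweathermap.org/data/2.5/weather"
NEWS_URL = "https://newsapi.org/v2/top-headlines"


def weather_request_url(city, api_key):
    """URL for the current weather of a city, in metric units."""
    return WEATHER_URL + "?" + urlencode({"q": city, "appid": api_key, "units": "metric"})


def parse_weather(payload):
    """Extract (temperature, description) from a weather response body."""
    data = json.loads(payload)
    return data["main"]["temp"], data["weather"][0]["description"]


def news_request_url(country, api_key):
    """URL for the top headlines of a country (for example 'in' for India)."""
    return NEWS_URL + "?" + urlencode({"country": country, "apiKey": api_key})


def parse_headlines(payload, limit=5):
    """Extract up to 'limit' headline titles from a news response body."""
    data = json.loads(payload)
    return [article["title"] for article in data.get("articles", [])[:limit]]


# Sample payload written by hand for illustration, not real API output.
sample = '{"main": {"temp": 31.5}, "weather": [{"description": "haze"}]}'
print(weather_request_url("Chennai", "YOUR_API_KEY"))
print(parse_weather(sample))
```

Keeping URL construction and response parsing separate from the actual HTTP request makes both parts easy to check offline.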
SYSTEM CALLS
In this feature, we have used the os and webbrowser modules to access the desktop, calculator,
task manager, command prompt and user folder. This can also restart the PC and open the Chrome
application.
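The system-call feature can be sketched as a table that maps spoken phrases to shell commands. The phrases and the Windows commands in the table are assumptions chosen for illustration, not the project's actual table. The runner and opener arguments default to os.system and webbrowser.open but can be replaced, so that the logic can be tried without launching anything.

```python
import os
import webbrowser

# Assumed mapping from spoken phrases to Windows shell commands.
SYSTEM_COMMANDS = {
    "calculator": "calc",
    "task manager": "taskmgr",
    "command prompt": "start cmd",
    "restart": "shutdown /r /t 1",
}


def resolve_command(spoken_text, commands=SYSTEM_COMMANDS):
    """Return the shell command for the first phrase found in the text, else None."""
    lowered = spoken_text.lower()
    for phrase, command in commands.items():
        if phrase in lowered:
            return command
    return None


def run_system_call(spoken_text, runner=os.system):
    """Run the matching command through 'runner'; return False if nothing matches."""
    command = resolve_command(spoken_text)
    if command is None:
        return False
    runner(command)
    return True


def open_in_browser(url, opener=webbrowser.open):
    """Open a URL with the default browser (or with an injected opener)."""
    return opener(url)


# Dry run: print the command instead of executing it.
run_system_call("please open the calculator", runner=print)
```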
CONTENT EXTRACTION
The assistant performs content extraction from YouTube, Wikipedia and Chrome using the webdriver
module from selenium, which provides the WebDriver implementations needed for tasks such as
searching for a specific video to play or getting specific information from Google or Wikipedia.
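The XPath-driven flow can be sketched as below. The function takes any object that offers the get and find_element calls of a Selenium WebDriver. The three XPath strings are assumptions about the page layout and would need to be checked against the live site. FakeDriver and FakeElement are hypothetical stand-ins that record the calls so the sketch runs without a browser.

```python
WIKIPEDIA_URL = "https://www.wikipedia.org/"
SEARCH_BOX_XPATH = '//input[@id="searchInput"]'    # assumed XPath
SEARCH_BUTTON_XPATH = '//button[@type="submit"]'   # assumed XPath
FIRST_PARAGRAPH_XPATH = '//div[@id="mw-content-text"]//p[normalize-space()][1]'  # assumed


def get_info(driver, topic):
    """Search Wikipedia for a topic and return the text of the first paragraph."""
    driver.get(WIKIPEDIA_URL)
    driver.find_element("xpath", SEARCH_BOX_XPATH).send_keys(topic)
    driver.find_element("xpath", SEARCH_BUTTON_XPATH).click()
    return driver.find_element("xpath", FIRST_PARAGRAPH_XPATH).text


class FakeElement:
    """Hypothetical element that logs typing and clicking."""

    def __init__(self, log, xpath, text=""):
        self.log, self.xpath, self.text = log, xpath, text

    def send_keys(self, value):
        self.log.append(("type", self.xpath, value))

    def click(self):
        self.log.append(("click", self.xpath))


class FakeDriver:
    """Hypothetical driver that returns a canned paragraph."""

    def __init__(self, paragraph):
        self.log, self.paragraph = [], paragraph

    def get(self, url):
        self.log.append(("get", url))

    def find_element(self, by, value):
        text = self.paragraph if value == FIRST_PARAGRAPH_XPATH else ""
        return FakeElement(self.log, value, text)


print(get_info(FakeDriver("Python is a programming language."), "Python"))
```

With Selenium installed, a real browser driver would be passed to get_info in place of FakeDriver.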
SERIAL MODULES
Finally, we used the serial module to implement the Internet of Things (IoT) feature of this
project. This module provides access to the serial port of the Arduino board; we used port number
11 and COM3.
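The exchange with the board can be sketched as a write followed by a read of one reply line, in the spirit of the write_read helper in the appendix. FakeArduino is a hypothetical stand-in for the serial port, and the "ON"/"OFF" commands and the "LIGHT ON"/"LIGHT OFF" replies are assumed conventions. With pyserial installed, an opened serial.Serial object would be passed in instead.

```python
import time


def write_read(port, command, settle_seconds=0.05):
    """Send a command to the board, wait briefly, and return its reply line as text."""
    port.write(command.encode("utf-8"))
    time.sleep(settle_seconds)
    return port.readline().decode("utf-8", errors="replace").strip()


class FakeArduino:
    """Hypothetical board that switches a light and acknowledges each command."""

    def __init__(self):
        self.light_on = False
        self._reply = b""

    def write(self, data):
        text = data.decode("utf-8").strip().upper()
        if text == "ON":
            self.light_on = True
        elif text == "OFF":
            self.light_on = False
        self._reply = b"LIGHT ON\n" if self.light_on else b"LIGHT OFF\n"

    def readline(self):
        reply, self._reply = self._reply, b""
        return reply


board = FakeArduino()
print(write_read(board, "ON"), board.light_on)
print(write_read(board, "OFF"), board.light_on)
```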
Fig 4.1. Flowchart
1) Must provide the user any information which they ask for: -
The user might need information that is available on the internet, but searching for it and
reading it takes a lot of time. With a voice assistant, the same information can be obtained
sooner than by searching and reading manually. This is a small demonstration that a voice
assistant helps the user save time.
2) Telling the day's hot news in the user's location: -
Watching a news channel just to learn the important news in one's location usually takes a lot of
time, and the user may have to sit through news that is irrelevant to them or about a different
location before reaching the news they want, which requires a lot of patience. A voice assistant
removes this effort: it gives the news for the location, or on the topic, that the user wants to
know about.
3) Telling some joke to chill up the moment: -
Everyone has had at least one moment in their life when they were tense or had an argument with
someone close to them. A random joke can ease such a moment at least a little, or even stop the
fight. As the saying goes, "Laughter is the best medicine".
4) Opening the file/folder which the user wants: -
In a busy world, everything has to be done quickly or our schedule slips, and sometimes we need
someone's assistance to complete a task quickly. With a voice assistant, the task can be completed
right away in a hassle-free way. For example, suppose the user is writing documentation and, after
a while, needs a file for reference. Searching for that file wastes a lot of time and the user may
end up missing the deadline. With a voice assistant, the search can be done quickly by commanding
the assistant to open the folder. This makes it one of the important features of a voice
assistant.
5) Telling the temperature/weather at the user's location: -
Why is it important to know the weather of the day, or to monitor the weather every day? The
answer is simple: it forewarns the user. When asked about the weather, the assistant can say "it
might rain today, so carry an umbrella if you go out" or "it will be a sunny day, so wear
sunglasses". This is therefore also a must-have feature.
6) Searching for what the user asks:
Today, in the 21st century, we often have doubts that need to be cleared as soon as possible;
otherwise one doubt multiplies into many. Searching the question on the internet gives an answer,
and asking the assistant to do so saves a lot of time. Apart from clearing doubts, we also need to
search many questions or topics on the internet to keep up with trends, and this can be done just
by commanding the assistant to search for a specific topic or question.
7) Internet of Things:
The final and most important feature is the Internet of Things, which is very useful because it
saves a lot of time. For example, consider a person with a walking disability who has to turn on
the fan when the switch is some distance away. The person can simply tell the assistant to turn on
the fan, and the assistant turns it on. This is just one example; with the help of IoT, many
similar helpful tasks are possible. These are the important features of the voice assistant, but
it can do plenty of other things as well.
CHAPTER 5
CONCLUSION
5.1. CONCLUSION
As stated before, a voice assistant is one of the biggest problem solvers, and the examples in the
proposal show that this is indeed the case in the current world. The same examples show that the
voice assistant is one of the major evolving applications of artificial intelligence: in the past,
the best features a voice assistant had were telling the date, searching the web and giving the
results, whereas it can now perform the much wider range of functions described in this report.
The main idea is to make the assistant even more advanced than it is now and make it the best AI
in the world, saving a great deal of time for its users. We would like to conclude with the
statement that we will try our best to deliver one of the best voice assistants that we are able
to.
5.2. FUTURE SCOPE
We are entering the era of implementing voice-activated technologies to remain relevant and competitive. Voice-
activation technology is vital not only for businesses to stay relevant with their target customers, but also for
internal operations. Technology may be utilized to automate human operations, saving time for everyone. Routine
operations, such as sending basic emails or scheduling appointments, can be completed more quickly, with less
effort, and without the use of a computer, just by employing a simple voice command. People can multitask as a
result, enhancing their productivity. Furthermore, relieving employees from hours of tedious administrative tasks
allows them to devote more time to strategy meetings, brainstorming sessions, and other jobs that need creativity
and human interaction.
1) Sending Emails with a voice assistant:
Emails are crucial for communication because they can be used for any professional contact, and
one of the most widely used services for sending and receiving emails is Gmail. Gmail is a free
email service created by Google. It can be accessed over the web or through third-party apps that
use the POP or IMAP protocols to synchronize email content.
To integrate Gmail with the voice assistant, we have to use the Gmail API, which allows access to
and control of threads, messages, and labels in a Gmail mailbox.
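As a first step towards this integration, the message dictated by the user has to be packaged in the form the Gmail API expects when sending: a standard email message encoded as URL-safe base64 in a field named raw. The sketch below covers only this packaging step, using the Python standard library. The function names and the example addresses are assumptions, and authentication and the actual API request are left out.

```python
import base64
from email import message_from_bytes
from email.message import EmailMessage


def build_gmail_payload(sender, recipient, subject, body):
    """Package an email as the {'raw': ...} body used when sending through the Gmail API."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    raw = base64.urlsafe_b64encode(message.as_bytes()).decode("ascii")
    return {"raw": raw}


def decode_gmail_payload(payload):
    """Inverse of build_gmail_payload, useful for checking what would be sent."""
    return message_from_bytes(base64.urlsafe_b64decode(payload["raw"]))


payload = build_gmail_payload("me@example.com", "you@example.com", "Meeting", "See you at 5")
print(decode_gmail_payload(payload)["Subject"])
```

In a voice assistant, the recipient, subject and body would come from successive speech-to-text prompts before the payload is built.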
2) Scheduling appointments using a voice assistant:
The demands on our time increase as our company grows. A growing number of people want to meet with us. We
have a growing number of people who rely on us. We must check in on certain projects or set aside time to chat
with possible business leads. There won't be enough hours in the day if we keep doing things the old way.
We need to get a better handle on our full-time schedule and devise a strategy for arranging appointments that
doesn't interfere with our most critical job. By working with a virtual scheduler or, in other words, a virtual
assistant, we let someone else worry about the organization and prioritize our schedule while we focus on the work.
3) Improved Interface of a voice assistant (VUI):

Voice user interfaces (VUIs) allow users to interact with a system by speaking commands. VUIs include virtual
assistants like Amazon's Alexa and Apple's Siri. The real advantage of a VUI is that it allows users to interact with
a product without using their hands or their eyes while focusing on anything else.
-Other benefits of a Voice user interface (VUI):
Speed and Efficiency:

Hands-free interactions are possible with VUIs. This method of interaction eliminates the need to click
buttons or tap on the screen. The major means of human communication is speech. People have been using speech
to form relationships for ages. As a result, solutions that allow customers to do the same are extremely valuable.
Furthermore, even for experienced texters, dictating text messages has been demonstrated to be faster than typing.
Hands-free interactions, at least in some circumstances, save time and boost efficiency.
Intuitiveness and convenience:

Intuitive user flow is required of high-quality VUIs, and technical advancements are expected to continue to
improve the intuitiveness of voice interfaces. Compared to graphical UIs, VUIs require less cognitive effort from
the user. Furthermore, everyone – from a small child to your grandmother – can communicate. As a result, VUI
designers are in a better position than GUI designers, who run the danger of producing incomprehensible menus
and exposing users to the agony of poor interface design. Customers are unlikely to need to be instructed on how to
utilize the technology by VUI makers. People can instead ask their voice
assistant for assistance.
REFERENCES
[1] K. Noda, H. Arie, Y. Suga, T. Ogata, “Multimodal integration learning of robot behavior using
deep neural networks”, Elsevier: Robotics and Autonomous Systems, 2014.
[2] “Artificial intelligence (AI), sometimes called machine intelligence”,
https://en.wikipedia.org/wiki/Artificial_intelligence.
[3] Deepak Shende, Ria Umahiya, Monika Raghorte, Aishwarya Bhisikar, Anup Bhange, “AI Based Voice
Assistant Using Python”, Journal of Emerging Technologies and Innovative Research (JETIR),
February 2019, Volume 6, Issue 2.
[4] J. B. Allen, “From Lord Rayleigh to Shannon: How do humans decode speech”, in International
Conference on Acoustics, Speech and Signal Processing, 2002.
[5] John Levis and Ruslan Suvorov, “Automatic Speech Recognition”.
[6] B.H. Juang and Lawrence R. Rabiner, “Automatic Speech Recognition - A Brief History of the
Technology Development”.
[7] Abhay Dekate, Chaitanya Kulkarni, Rohan Killedar, “Study of Voice Controlled Personal Assistant
Device”, International Journal of Computer Trends and Technology (IJCTT), Volume 42, Number 1,
December 2016.
APPENDICES
A) SOURCE CODE
import speech_recognition as sr
import pyttsx3 as p
import wolframalpha
from YT_auto import music
from selenium_web_driver import inforr
from News import *
import randfacts
import pyjokes
from weather import *
import datetime
from search import sear
import random2
import math
import warnings
import open  # project module; note that it shadows the built-in open()
import os
import serial
import sys
import time

# Serial link to the Arduino board that switches the light
arduino = serial.Serial(port='COM3', baudrate=115200, timeout=.1)
warnings.filterwarnings("ignore")

# Text-to-speech engine: speaking rate 150, first installed voice
engine = p.init()
rate = engine.getProperty('rate')
engine.setProperty('rate', 150)
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[0].id)

r = sr.Recognizer()


def speak(text):
    """Speak the text aloud and wait until it has been spoken."""
    engine.say(text)
    engine.runAndWait()


def listen():
    """Record one utterance from the microphone and return it as text."""
    with sr.Microphone() as source:
        r.energy_threshold = 10000
        r.adjust_for_ambient_noise(source, 1.2)
        print("Listening")
        audio = r.listen(source)
    return r.recognize_google(audio)


def wishme():
    """Return the part of the day used in the greeting."""
    hour = datetime.datetime.now().hour
    if 0 < hour < 12:
        return "Morning"
    elif 12 <= hour < 16:
        return "Afternoon"
    elif 16 <= hour < 19:
        return "evening"
    else:
        return "night"


def quitApp():
    """Say goodbye and terminate the program."""
    hour = datetime.datetime.now().hour
    if 3 <= hour < 18:
        print("have a good day sir")
        speak("have a good day sir")
    else:
        print("Goodnight sir")
        speak("Goodnight sir")
    print("Offline")
    sys.exit(0)


def write_read(x):
    """Send a command string to the Arduino and return its reply line."""
    arduino.write(bytes(x, 'utf-8'))
    time.sleep(0.05)
    data = arduino.readline()
    return data


# flags
Light_status_flag = False
today_date = datetime.datetime.now()


def main():
    speak("Tell the wake up word")
    wake = "hello Nova"
    wakeword = listen()
    print(wakeword)
    # Continue only if the wake-up word was heard
    if wake.lower() != wakeword.lower():
        return

    speak("hello sir, good " + wishme() + ", i'm here to assist you.")
    speak("How are you")
    text = listen()
    print(text)
    if all(word in text for word in ("what", "about", "you")):
        speak("I am also having a good day")

    while True:
        speak("What can i do for you??")
        text2 = listen()
        if "information" in text2:
            speak("You need information related to which topic")
            infor = listen()
            speak("Searching {} in wikipedia".format(infor))
            print("Searching {} in wikipedia".format(infor))
            assist = inforr()
            assist.get_info(infor)
        elif "play" in text2 and "video" in text2:
            speak("Which video you want me to play??")
            vid = listen()
            speak("Playing {} on youtube".format(vid))
            print("Playing {} on youtube".format(vid))
            assist = music()
            assist.play(vid)
        elif "news" in text2:
            speak("Sure sir, Now I will read news for you")
            arr = news()
            for headline in arr:
                print(headline)
                speak(headline)
        elif "temperature" in text2:
            report = ("Temperature in Chennai is " + str(temp())
                      + " degree celsius and with " + str(des()))
            speak(report)
            print(report)
        elif "funny" in text2:
            speak("Get ready for some chuckles")
            joke = pyjokes.get_joke()
            speak(joke)
            print(joke)
        elif "your name" in text2:
            speak("My name is Next genn Optimal Voice Assistant Nova")
        elif "fact" in text2:
            speak("Sure sir , ")
            x = randfacts.getFact()
            speak("Did you know that," + x)
            print(x)
        elif "search" in text2:
            speak("What should i search for sir")
            searc = listen()
            speak("Searching {} in Google".format(searc))
            print("Searching {} in Google".format(searc))
            asist = sear()
            asist.get_infoo(searc)
        elif "game" in text2:
            speak("enter your lower limit sir")
            lower = int(listen())
            speak("now, Enter your upper limit")
            upper = int(listen())
            x = random2.randint(lower, upper)
            # Number of turns allowed: log2 of the size of the range
            limit = math.log(upper - lower + 1, 2)
            chances = "You've only " + str(round(limit)) + " chances to guess the integer!"
            speak(chances)
            print("\n\t" + chances + "\n" + str(upper), str(lower))
            count = 0
            while count < limit:
                count += 1
                speak("start guessing")
                speak("Guess a number")
                guess = int(listen())
                if x == guess:
                    print("Congratulations you did it in " + str(count) + " try")
                    speak("Congratulations you did it in " + str(count) + " try")
                    break
                elif x > guess:
                    print("You guessed too small!")
                    speak("You guessed too small!")
                else:
                    print("You Guessed too high!")
                    speak("You Guessed too high!")
            else:
                # All turns were used without a correct guess
                print("\nThe number is %d" % x)
                speak("The number is %d" % x)
                print("\tBetter Luck Next time!")
                speak("Better Luck Next time!")
        elif "reboot the system" in text2:
            speak("Do you wish to restart your computer ?")
            restart = listen()
            # The action taken on this reply is not part of the listing
        elif "light off" in text2:
            cmd = "OFF"
            Status = write_read(cmd)
            speak("Lights are turned off")
        elif any(word in text2 for word in ("stop", "exit", "end")):
            speak("It's a pleasure helping you and I am always here to help you out!")
            quitApp()


if __name__ == "__main__":
    main()
B) SCREENSHOTS
C) PLAGIARISM REPORT
D) JOURNAL PAPER
DESKTOP BASED SMART VOICE ASSISTANT USING PYTHON LANGUAGE INTEGRATED WITH ARDUINO
Mr. Akash S
UG Student, Department of Computer Science and Engineering
Sathyabama Institute of Science & Technology, Chennai
[email protected]

Mr. Neeraj Jayaram
UG Student, Department of Computer Science and Engineering
Sathyabama Institute of Science & Technology, Chennai
jayaramansarma1971@gmail.com

Dr. Jesudoss A
Associate Professor, Department of Computer Science and Engineering
Sathyabama Institute of Science & Technology, Chennai
jesudossa.cse@sathyabama.ac.in

Abstract- A voice assistant, also known as voice-based Artificial Intelligence (AI), is one of the
hot topics in the current world. Voice assistants are programs that listen to a human’s verbal
commands and respond to them, which makes this a human-computer/device interaction. Nowadays,
voice assistants are everywhere and are very useful in these busy days. Almost everyone is using a
voice assistant, starting from Google Assistant, which even 5-year-old kids can use because the
current pandemic has made them use smartphones. Amazon's Alexa competes equally with Google
Assistant and is useful for a variety of tasks, from entertaining users to turning household
products on and off through the Internet of Things (IoT). One of the greatest features is that it
is helpful even to physically challenged people; for example, people who are not able to walk can
use the Internet of Things feature to operate and maintain household products. So, we aim to
develop a voice assistant that is as convenient for users as the other voice assistants currently
in trend.

Keywords- Voice Assistant, Speech to Text, Text to Speech, Internet of Things, Python.

I. INTRODUCTION
VOICE ASSISTANT is one of the major evolving and trending topics in the current tech world; in
other words, it is becoming an integral part of everyone's life. It helps people a great deal just
by being given a command, and giving that command is not a big deal: the user's voice is enough.
Because of this, even a kid who can speak the required language (English in most cases) can
operate a voice assistant. As mentioned in the abstract, it can be very helpful to physically
challenged people. For example, if a person is not able to walk and wants to turn off the lights,
they can do it just with their voice by giving the command to their assistant; in simple words, it
has Internet of Things (IoT) capability. It reminds the user of the things they ask it to remind
them about, it can order food from wherever the user wants, and many more.
One of the most important features of a voice assistant is that it saves a lot of time and makes
everything simpler to do. In other words, you can multitask.

II. LITERATURE REVIEW
Aarthi Easwara Moorthy (2014) et al. proposed Voice Activated Personal Assistant Acceptability of
Use in the Public Space, in which home appliances can be controlled just with the user's voice,
but the presence of strangers may affect the user's behaviour towards the voice assistant. Also,
the voice-activated personal assistant model acts as a base interpreter and performs the needs of
the user through the desktop assistant [1].

Veton Këpuska (2018) et al. proposed Next-Generation of Virtual Personal Assistants and stated
that a voice assistant will be used to increase the interaction between humans and systems with
the help of gesture recognition, image or video recognition and, finally, speech recognition. The
assistant will choose the most suitable output device and show the result to the user. It lacks
system calls [2].

George Terzopoulos (2019) et al. proposed Voice Assistants and Smart Speakers in Everyday Life and
in Education, stating that artificial intelligence in the form of natural language processing will
be very helpful for people in day-to-day life. It will be very useful for blind people because it
helps them to make everyday tasks possible with its IoT features [3].

Rahul Kumar (2020) et al. proposed Power Efficient Smart Home with voice assistant, according to
which a voice assistant is one of the important parts of the smart home, which is becoming one of
the major things in the current world as it can operate home appliances just with voice. It also
increases home security because of smart locks, but it requires a reliable internet connection,
which is crucial, and sometimes the user might lock themselves out of their own house [4].

Tae-Kook Kim (2020) et al. proposed a Short Research on Voice Control System Based on Artificial
Intelligence (AI) Assistant, which describes an assistant system using open Application
Programming Interface (API) artificial intelligence and the conditional auto-run system, IFTTT (IF
This, Then That). It can control the system using the Raspberry Pi board but it lacks system
calls [5].

Subhash S (2020) et al. proposed Artificial Intelligence based Voice Assistant. The author has
implemented voice to text via a recorded audio file, after which the functions are processed. This
project has a well-supported library and uses the Google Text To Speech (GTTS) engine, although it
has no features like system calls and IoT [6].

Benedict D. C (2020) et al. proposed Consumer decisions with artificially intelligent voice
assistants that will have stronger psychological reactions to the system's look on human-like
behaviours. The assistant has Internet of Things features. It can also order things that the user
wants, but there are some cons in this paper. The voice assistant relies on the speaker’s ability
to represent the decision alternatives to catch up in voice dialogues, and another main
disadvantage is that it lacks system calls [7].

Nivedita Singh (2021) et al. proposed a voice assistant using the Python speech to text module and
performed some API calls and system calls, which led to the development of a voice assistant using
Python that allows the user to run any type of command through voice without using the keyboard.
This can also run on hybrid platforms. However, this paper is lacking in some parts, such as the
system calls, which are not well supported [8].

Abeed Sayyed (2021) et al. presented a paper on Desktop Assistant AI using Python with IoT
features, which also used Artificial Intelligence (AI) features along with an SQLite DB (Database).
This project has a database connection and a query framework but lacks API call and system call
features [9].

and dialog management process and features like System calls and IoT are low [11].

Rajdip Paul (2021) et al. presented the project named A Novel Python-based Voice Assistance System
for reducing the Hardware Dependency of Modern Age Physical Servers. The author has proposed an
assistant project with Python as a backend supporting system calls, API calls and various
features. This project is quite responsive with API calls, but needs improvement in understanding
and reliability [12].

V. Geetha (2021) et al. presented a project named The Voice Enabled Personal Assistant for PC
using Python. The author has proposed an assistant project with Python as a backend in which
features like turning the PC off, restarting it, or reciting some latest news are just one voice
command away. Although this project has a well-supported library, not every API has the capability
to convert the raw JavaScript Object Notation (JSON) data into text, and there is a delay in
processing request calls [13].

Dilawar Shah Zwakman (2021) et al. proposed the Usability Evaluation of Artificial
Intelligence-Based Voice Assistants, which can give a proper response to the user's request. It
also has a feature where it can make an appointment with the person mentioned by the user through
voice, but it lacks API calls [14].

Philipp Sprengholz (2021) et al. proposed Ok Google: Using virtual assistants for data collection
in psychological and behavioural research, which is a survey mate that they have developed as an
extension of the Google Assistant and that was used to check the reliability and validity of
data collected by this test. Possible answers and synonyms are
P.Krishnaraj (2021) et al. presented a project on defined for every different type of questions so, it can be used to
Portable Voice Recognition with Graphical User Interface (GUI) analyse the behaviour of an individual. As it is a psychological
Automation, This system uses Google’s online speech and behavioural research assistant [15].
recognition system for converting speech input to text along with
Python. Therefore, this project has a GUI and is also has a Dimitrios Buhalis (2021) et al. proposed a paper on In-
portable framework. Accuracy of this Text to Speech (TTS) room Voice-Based AI Digital Assistants Transforming On-Site
engine is comparatively less and also lacks IoT [10]. Hotel Services and Guests’ Experiences. Where voice assistant is
used for hotel services. It'll be very useful in this current
A.M.Sermakani (2021) et al. proposed Creating COVID-19 era. Human Touch is considered as a danger in this
Desktop Speech Recognition Using Python Programming. This COVID time and with a voice assistant, loss of human touch is
project is built with AI technologies, voice activation, automatic not considered as an advantage. It can also be used to control the
speech recognition, Teach-To-Speech, voice biometrics with temperature controls and room light controls but it needs
Python as a Backend. This project has a solid brain technology Complex Integration and Staff Training [16].
III. PROPOSED WORK
A voice assistant is one of the biggest problem solvers in the modern world. The
solution to a problem can be found in seconds using only the user's voice;
nothing extra is required. It is also useful in maintaining a house, for
example by setting a timer to turn something on or off. So, a voice assistant is
a must-have piece of AI software in the current world. Our assistant requires a
wake-up word, “Hello NOVA” (Next-Gen Optimal Voice Assistant).
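
The wake-up word check can be illustrated with a few lines of Python. This is
only a sketch under assumptions: it works on text that has already been
transcribed, and the function name extract_command and the punctuation handling
are hypothetical, not taken from the NOVA source code.

```python
WAKE_PHRASE = "hello nova"  # wake-up word named in the text, lowercased


def extract_command(transcript, wake_phrase=WAKE_PHRASE):
    """Return the words spoken after the wake phrase, or None if it is absent."""
    # Normalise case and whitespace so "Hello   NOVA" still matches.
    text = " ".join(transcript.lower().split())
    index = text.find(wake_phrase)
    if index == -1:
        return None  # the assistant stays idle
    # Keep only what follows the wake phrase and trim stray punctuation.
    return text[index + len(wake_phrase):].strip(" ,.!?")
```

A caller would ignore any transcript for which the function returns None and
pass everything else on to the command-processing step.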
Fig 1. Use Case Diagram

Here, we will try to understand the design of the use case diagram. Some
possible scenarios of the system are explained as follows:

Each user is an actor in the use case diagram, and the functionality offered by
NOVA, the voice assistant, is the use case. The user gives a query to the
assistant through the microphone. The assistant understands it using the
speech-to-text module, which converts the actor's speech to text, then passes
the text to the modules and performs the calls accordingly.

A). Mandatory features of a Voice Assistant: -

1) Must provide the user any information which they ask for: -
The user might need any information available on the internet, but searching
for that information and reading it takes a lot of time. With the help of a
voice assistant, the user can get the information sooner than by searching and
reading. This is a small proof that a voice assistant helps the user save time.

2) Telling the day's hot news in the user's location: -
In general, watching a news channel just to know the important news in one's
location takes a lot of time. The user may have to sit through news that is
unnecessary to them, or news from a different location, before getting to the
news they want, which needs a lot of patience. A voice assistant removes all of
that: it gives the news of the location the user wants, or the news they want
to know.

3) Telling some joke to chill up the moment: -
Let's be honest: everyone has had at least one moment in their life when they
were tense or had an argument with people close to them. Such moments can be
eased, at least by ten percent, with a random joke that might cool us down or
stop the fight. There is even a saying, "Laughter is the best medicine", which
relates to this.

4) Opening the file/folder which the user wants: -
In a busy world, everything should be done quickly, or else our schedule gets
disrupted. Sometimes we need someone's assistance to complete a task quickly,
but with a voice assistant we can complete it right away in a hassle-free way.
For example, say the user is working on some documentation and after a while
needs a file for reference. He goes searching for that file, which wastes a lot
of time, and he ends up missing the deadline. With a voice assistant, the
search can be done quickly by commanding the assistant to open the folder. So
this is one of the important features of a voice assistant.

5) Telling the temperature/weather at the user's location: -
Let's start with a question: why is it important for us to know the weather of
the day, or to monitor the weather every day? The answer is simple: it
forewarns users who ask about the weather, for example "It might rain today, so
carry an umbrella if you go out" or "It will be a sunny day, so wear
sunglasses". So this is also a must-have feature.

6) Searching for what the user asks:
Today, in the 21st century, people often have doubts, and a doubt needs to be
cleared as soon as possible or it multiplies until we have n doubts. Searching
the question on the internet gives us an answer, and asking the assistant to do
it saves a lot of time. Beyond clearing doubts, we need to search many
questions or topics on the internet to keep up with trends, and we can do this
just by commanding the assistant to search for a specific topic or question.

7) Internet of Things:
The final and most important feature is the Internet of Things, which is very
useful because it saves a lot of time. For example, say a person with a walking
disability has to turn on the fan, but the switch is far away and he cannot
walk to it. He can tell the assistant to turn on the fan, and it will turn it
on. This is just one example; with the help of IoT, we can do many helpful
things like this.

These are the important features of the voice assistant, but beyond these, we
can do plenty of other things with the assistant.

Fig 2. Workflow Model

The user gives the voice input through the microphone, and the assistant is
triggered by the wake-up word, performs STT (Speech to Text) to convert the
input into text, and understands the
voice input and further performs the task said by the user repeatedly and
delivers it via the TTS module in an AI voice.

B). Limitations: -

Voice assistants have been part of the current technological world for a long
time, and the existing systems have a few drawbacks, or let's say limitations.

1) The voice recognition isn't perfect - When the user asks a query,
   sometimes the user has to repeat it for the assistant to understand and
   process it, and sometimes the assistant misinterprets the query and gives
   different results. For example, it cannot always differentiate between
   homophones, such as "their" and "there".

2) Background Noise Interference - One of the biggest drawbacks of a voice
   assistant is that we need to be in a quiet environment to operate it. This
   is because it may not be able to differentiate between the person speaking
   to the voice assistant, other people talking, and other ambient noise,
   leading to transcript mix-ups which might cause errors.

3) Security Concerns - Anyone who has access to a voice-activated device can
   ask it questions and obtain information about the device's accounts and
   services. Because the gadgets will read out calendar contents, emails, and
   highly personal information, this poses a significant security concern.
   Voice assistants are also susceptible to a variety of other threats, such
   as man-in-the-middle attacks.

IV. IMPLEMENTATION OF PROPOSED WORK
A). Why Python is used:
We used Python to build the assistant because it supports object-oriented
programming and provides many built-in functions, which keeps building the
assistant less complicated. The assistant's queries can be modified to suit the
user's needs. Speech recognition is the process of converting audio into text,
which the assistant can then use to find out what the user is requesting.
Python's usage is not limited to one activity: its growing popularity has
brought it into some of the most popular and complex fields, such as Artificial
Intelligence, Machine Learning (ML), natural language processing and data
science. Python has libraries for every need of this project.
These are the important features of the voice assistant, but other than these,
we can do plenty of things with the assistant. List of features that can be
done with the assistant:

- Playing any video which the user wants to see.
- Telling some random fact at the start of the day, with which the user can do
  their work in an informative way and also learn something new.
- One of the features found in every assistant is playing a game, so that the
  user can spend their free time in a fun way.
- Users might forget to turn off a system which might contain some useful
  data, but with a voice assistant we can do that even after leaving the place
  where the system is, just by commanding the assistant to turn the system
  off.

B). What can this Voice Assistant do?
The mandatory features discussed above are implemented in this work. A brief
explanation is given below.

API CALLS

We used API keys to get news from the News API, a simple JSON-based REST
(Representational State Transfer) API for finding and retrieving news articles
from across the web. It can be used to post top news on news sites or to search
for top news on a particular topic. Weather forecasts come from the
OpenWeatherMap platform, one of the most widely recognized weather APIs. It is
powered by convolutional machine learning and can deliver all the weather
information necessary for decision-making for any location on the globe.
Together, these APIs fetch information accurately and give results to the user.

SYSTEM CALLS

In this feature, we used the OS and Web Browser modules to access the desktop,
calculator, task manager, command prompt and user folder. It can also restart
the PC and open the Chrome application.

CONTENT EXTRACTION

The assistant can perform content extraction from YouTube, Wikipedia and Chrome
using the web driver module from Selenium, which provides all the
implementations for the web driver, such as searching for a specific video to
play or getting specific information from Google or Wikipedia. First, the
assistant asks the user what it can do for them. If the user requests a search
in the browser, the assistant opens the browser, finds the search element using
the XPath of the element, and clicks it. Then the assistant types what the user
asked it to search for. After typing the request, the assistant clicks the
search button the same way it clicked the search bar, by using the XPath of the
search button. This is how the assistant interacts with the browser.
For example, if the user asks the assistant to play a video on YouTube, the
assistant directly opens YouTube in the browser, searches for what the user
asked to play, and clicks the search button. It then clicks the first video by
using the XPath of the video.

SERIAL MODULES

Finally, we used the serial module for implementing the IoT feature for this
project. It is a module which acquires the
access for the serial port of the Arduino board, and we used port number 11 and
COM3.

C). Algorithm:
- Speech Recognition Module

The class which we are using is called Recognizer. It converts audio files into
text, and a separate module is used to give the output in speech. The energy
threshold represents the energy level threshold for sounds: values below this
threshold are considered silence, and values above it are considered speech.
The Recognizer instance adjusts for ambient noise, given the source and a
duration, which sets the energy threshold dynamically from the audio.

- Speech to Text & Text to Speech Conversion

Pyttsx3 is a text-to-speech conversion library in Python. The voice, rate and
volume can be changed with specific commands. The Speech Recognition API in
Python allows us to convert audio, including large or long audio files, into
text for further processing. We have included the sapi5 and espeak TTS engines,
which can process the same.

- Process & Executes the Required Command

The spoken command is converted into text by the speech recognition module and
stored in a temporary variable (temp). The assistant then analyses the text in
temp, decides what the user needs based on the input provided, and runs this
inside a while loop. This is how the commands are executed.

Fig 3. Architecture Diagram

V. RESULTS AND DISCUSSION
The project work of the voice assistant has been explained in this report: how
useful it is, how we can rely on a voice assistant to perform any task which
the user needs to complete, and how the assistant is developing every day, so
that we can hope it will become one of the biggest technologies in the current
technological world. Development of the software is almost complete from our
side, and it is working as expected. Some extra development was discussed, so
further advancement may come in the near future, making the assistant we
developed even more useful than it is now.

Fig 4. Voice Assistant Environment

VI. CONCLUSION

As stated before, a voice assistant is one of the biggest problem solvers, and
the proposals and examples show that it is in fact one of the biggest problem
solvers of the current world. The same examples show that the voice assistant is
one of the major evolving forms of artificial intelligence: in the past, the
best a voice assistant could do was tell the date, search the web and give the
results, but the functions it can perform now show that it is an evolving
software. The main idea is to make the assistant even more advanced than it is
now by linking other smart applications into this project, which will save
users plenty of time. We would like to conclude with the statement that we will
try our best to deliver one of the finest voice assistants that we are able to.

VII. REFERENCES
Easwara Moorthy, Aarthi & Vu, Kim-Phuong, “Voice Activated Personal
Assistant: Acceptability of Use in the Public Space” HIMI 2014. Lecture
Notes in Computer Science, vol 8522. Springer, pp. 324-334,
10.1007/978-3-319-07863-2_32.
V. Këpuska and G. Bohouta, "Next-generation of virtual personal assistants
(Microsoft Cortana, Apple Siri, Amazon Alexa and Google Home)," 2018
IEEE 8th Annual Computing and Communication Workshop and
Conference (CCWC), 2018, pp. 99-103, doi:
10.1109/CCWC.2018.8301638.
George Terzopoulos and Maya Satratzemi. 2019. “Voice Assistants and Artificial
Intelligence in Education”. In Proceedings of the 9th Balkan Conference on
Informatics (BCI'19). Association for Computing Machinery, New York,
NY, USA, Article 34, 1–6. DOI:https://doi.org/10.1145/3351556.3351588
R. Kumar, G. Sarupria, V. Panwala, S. Shah and N. Shah, "Power Efficient Smart
Home with Voice Assistant," 2020 11th International Conference on
Computing, Communication and Networking Technologies (ICCCNT),
2020, pp. 1-5, doi: 10.1109/ICCCNT49239.2020.9225612.
T. -K. Kim, "Short Research on Voice Control System Based on Artificial
Intelligence Assistant," 2020 International Conference on Electronics,
Information, and Communication (ICEIC), 2020, pp. 1-2, doi:
10.1109/ICEIC49074.2020.9051160.
S. Subhash, P. N. Srivatsa, S. Siddesh, A. Ullas and B. Santhosh, "Artificial
Intelligence-based Voice Assistant," 2020 Fourth World Conference on
Smart Trends in Systems, Security and Sustainability (WorldS4), 2020, pp.
593-596, doi: 10.1109/WorldS450073.2020.9210344.
Dellaert, B.G.C., Shu, S.B., Arentze, T.A. et al. Consumer decisions with
artificially intelligent voice assistants. Mark Lett 31, 335–347 (2020).
https://doi.org/10.1007/s11002-020-09537-5
Harshit Agrawal, Nivedita Singh, Gaurav Kumar, Dr. Diwakar Yagyasen, Mr.
Surya Vikram Singh “Voice Assistant Using Python” 2021, IJIRT Volume
8 Issue 2, ISSN: 2349-6002, pp.419-423.
Abeed Sayyed, Ashpak Shaikh, Ashish Sancheti, Swikar Sangamnere, Prof. Jayant
H Bhangale. “Desktop Assistant AI Using Python” (2021) International
Journal of Advanced Research in Science, Communication and Technology
(IJARSCT) Volume 6, Issue 2, June 2021. ISSN (Online) 2581-942.
P. Krishnaraj, F. Mohamed Faris, D. Rajesh “Portable Voice Recognition with GUI
Automation” 2021, IJIRT, Volume 9 Issue 6 ISSN (Print): 2320-9356 PP.
20-23
Mrs.A.M.Sermakani, J.Monisha, G.Shrisha, G.Sumisha, “Creating Desktop
Speech Recognization Using Python Programming.” IJARCCE, Vol. 10,
Issue 3, March 2021, ISSN (Online), pp.129-134.
Rajdip Paul, Nirmalya Mukhopadhya “A Novel Python-based Voice Assistance
System for reducing the Hardware Dependency of Modern Age Physical
Servers”. IRJET, Volume: 08 Issue: 05, May 2021, e-ISSN: 2395-0056,
ISSN: 2395-0072.
V Geetha & Gomathy, C K & Kottamasu, Manasa & Kumar, Nukala. (2021). The
Voice Enabled Personal Assistant for Pc using Python. International Journal
of Engineering and Advanced Technology. 10. 162-165.
10.35940/ijeat.D2425.0410421.
Zwakman, Dilawar Shah & Pal, Debajyoti & Triyason, Tuul & Arpnikanondt,
Chonlameth. (2021). Voice Usability Scale: Measuring the User Experience
with Voice Assistants. 308-311. 10.1109/iSES50453.2020.00074.
Sprengholz, Philipp & Betsch, Cornelia “Ok Google: Using virtual assistants for
data collection in psychological and behavioral research” (2021) Behavior
Research Methods. DOI:10.3758/s13428-021-01629-y
Buhalis D., Moldavska I. “In-room Voice-Based AI Digital Assistants
Transforming On-Site Hotel Services and Guests’ Experiences” 2021 In:
Wörndl W., Koo C., Stienmetz J.L. (eds) Information and Communication
Technologies in Tourism 2021. Springer, Cham.
https://doi.org/10.1007/978-3-030-65785-7_3, pp.
47
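
As a supplement to the Speech Recognition Module and the Speech to Text & Text
to Speech Conversion steps described under C). Algorithm, the following is a
minimal sketch of how they might look in code. It assumes the third-party
SpeechRecognition and pyttsx3 packages, a working microphone and an internet
connection for the Google recognizer. The function names listen_once, speak and
is_speech, the one-second calibration time and the rate value are illustrative
assumptions, not the project's actual settings.

```python
def is_speech(energy, energy_threshold=300):
    """Rule from the text: values above the threshold count as speech."""
    return energy > energy_threshold


def listen_once(calibration_seconds=1.0):
    """Record one phrase from the microphone and return it as lowercase text."""
    import speech_recognition as sr  # third-party: SpeechRecognition

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Sample the room noise so the energy threshold is set dynamically.
        recognizer.adjust_for_ambient_noise(source, duration=calibration_seconds)
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio).lower()
    except (sr.UnknownValueError, sr.RequestError):
        return ""  # unintelligible speech or no connection


def speak(text, rate=170, volume=1.0):
    """Read text aloud through the platform TTS driver (sapi5, espeak, ...)."""
    import pyttsx3  # third-party: pyttsx3

    engine = pyttsx3.init()
    engine.setProperty("rate", rate)      # speaking rate in words per minute
    engine.setProperty("volume", volume)  # 0.0 to 1.0
    engine.say(text)
    engine.runAndWait()
```

The third-party imports sit inside the functions so that the module can be
loaded, and the threshold rule tried out, on a machine without audio hardware.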

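
The Process & Executes the Required Command step and the SYSTEM CALLS feature
from Section IV can be sketched together as a small dispatch loop. This is a
hedged illustration only: the keyword table, the handler names and the Windows
commands are assumptions made for the example, and next_command stands in for
whatever function returns the recognized text.

```python
import os
import subprocess
import webbrowser


def open_calculator():
    subprocess.Popen(["calc.exe"])  # Windows-only program name


def open_command_prompt():
    os.system("start cmd")  # Windows shell built-in


def open_browser():
    webbrowser.open("https://www.google.com")  # opens the default browser


def restart_pc():
    os.system("shutdown /r /t 5")  # Windows: restart after five seconds


# Hypothetical keyword table mapping spoken phrases to handlers.
COMMANDS = {
    "open calculator": open_calculator,
    "open command prompt": open_command_prompt,
    "open google": open_browser,
    "restart": restart_pc,
}


def choose_action(text, commands=None):
    """Return the handler whose keyword occurs in the text, else None."""
    commands = COMMANDS if commands is None else commands
    for keyword, action in commands.items():
        if keyword in text:
            return action
    return None


def run_assistant(next_command, commands=None):
    """Loop until the user says 'stop'; return how many commands were run."""
    executed = 0
    while True:
        temp = next_command()  # recognized text, the "temp" of the text
        if "stop" in temp:
            break
        action = choose_action(temp, commands)
        if action is not None:
            action()
            executed += 1
    return executed
```

Keeping the keyword table separate from the loop means a new feature only needs
a new entry in the table.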

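
The API CALLS feature from Section IV (news from the News API, weather from
OpenWeatherMap) could be implemented along the following lines. This is a sketch
under assumptions: it uses only the standard library, the helper names are
hypothetical, the API keys are placeholders supplied by the caller, and the
response fields read here (articles/title, name, main/temp,
weather/description) should be checked against the providers' current
documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

NEWS_URL = "https://newsapi.org/v2/top-headlines"
WEATHER_URL = "https://api.openweathermap.org/data/2.5/weather"


def build_news_url(api_key, country="in"):
    """URL for the top headlines of one country."""
    return NEWS_URL + "?" + urlencode({"country": country, "apiKey": api_key})


def build_weather_url(api_key, city):
    """URL for the current weather of a city, in metric units."""
    query = {"q": city, "units": "metric", "appid": api_key}
    return WEATHER_URL + "?" + urlencode(query)


def fetch_json(url):
    """Download a URL and decode the JSON body (needs network access)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def headlines(payload, limit=5):
    """Pick the article titles out of a News API style response."""
    return [article["title"] for article in payload.get("articles", [])[:limit]]


def weather_sentence(payload):
    """Turn an OpenWeatherMap style response into a sentence to speak."""
    temperature = payload["main"]["temp"]
    description = payload["weather"][0]["description"]
    return (f"It is {temperature:.0f} degrees Celsius with {description} "
            f"in {payload['name']}.")
```

For example, weather_sentence(fetch_json(build_weather_url(key, "Chennai")))
would produce a sentence that can be handed to the text-to-speech step.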

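
The SERIAL MODULES feature from Section IV, where the assistant talks to the
Arduino board over COM3, might be sketched as below. Everything beyond the port
name is an assumption: the one-byte protocol ("1" for on, "0" for off), the
9600 baud rate and the function names are hypothetical, and the sketch running
on the board, which would switch pin 11 when it receives the byte, is not
shown. It relies on the third-party pyserial package.

```python
import time


def command_from_text(text):
    """Map a spoken request about the fan to True (on), False (off) or None."""
    text = text.lower()
    if "fan" not in text:
        return None
    if "turn on" in text or "switch on" in text:
        return True
    if "turn off" in text or "switch off" in text:
        return False
    return None


def encode_fan_command(turn_on):
    """Assumed one-byte protocol understood by the board."""
    return b"1" if turn_on else b"0"


def send_fan_command(turn_on, port="COM3", baudrate=9600):
    """Open the serial port and send the command byte to the Arduino."""
    import serial  # third-party: pyserial

    with serial.Serial(port, baudrate, timeout=1) as board:
        time.sleep(2)  # many Arduino boards reset when the port is opened
        board.write(encode_fan_command(turn_on))
```

With this split, the text handling can be checked without a board attached,
and only send_fan_command needs the hardware.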