CHAPTER 1
Introduction
1.1 Introduction
In this chapter we look at what a voice assistant is and how it works. Many of us already
know about voice assistants and use them in our day-to-day lives. A voice assistant is a
digital assistant that uses voice recognition, language processing algorithms, and voice
synthesis to listen to specific voice commands and return relevant information or perform
specific functions as requested by the user. A brief description of these components is given
in this chapter.
Speech is an effective and natural way for people to interact with applications,
complementing or even replacing the use of mice, keyboards, controllers, and gestures. As a
hands-free yet accurate way to communicate with applications, speech lets people be
productive and stay informed in a variety of situations where other interfaces cannot.
Speech recognition is therefore useful in many applications and environments in our
daily lives.
Generally, a speech recognizer is a machine that understands a person's spoken words in
some way and can act on them. Another aspect of speech recognition is assisting people
with functional disabilities or other impairments. Voice control can make their daily chores
easier: with their voice they can switch lights on or off or operate other domestic
appliances. This leads to the discussion of intelligent homes, where such operations are
made available to the common user as well as to people with disabilities.
1.2 Existing System
The existing system likely involves using JavaScript, together with a JavaScript web
framework, to create a voice assistant similar to Siri or Alexa. This involves designing a web
application with a user interface through which users can interact with the voice assistant.
The backend would integrate natural language processing (NLP) capabilities, possibly using
libraries for understanding user input and rendering generated text responses (for example
as Markdown). Additionally, pre-trained language models such as GPT-3 may be used
through APIs to enhance the voice assistant's conversational abilities.
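To make the last point concrete, the following is a minimal sketch of how a JavaScript (Node.js 18+) backend might forward a transcribed user utterance to an OpenAI-style chat completion API and return the generated reply. The endpoint, model name, and OPENAI_API_KEY environment variable are illustrative assumptions, not part of the original design.

// Minimal sketch: send the user's transcribed command to an OpenAI-style
// chat completion endpoint and return the assistant's text reply.
// Assumes Node.js 18+ (global fetch) and an OPENAI_API_KEY environment variable.
async function askAssistant(userText) {
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo", // illustrative model name
      messages: [
        { role: "system", content: "You are a helpful voice assistant." },
        { role: "user", content: userText },
      ],
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content; // text to be spoken back to the user
}

The returned text can then be passed to a speech synthesis step so that the assistant replies aloud.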
User Experience: Designing an intuitive and engaging user interface for the
chatbot to ensure a smooth user experience might require significant effort and
expertise in user interface design.
CHAPTER 2
Literature Survey for Problem Identification and
Specification
2.1 Literature Survey
The rise of voice interaction has revolutionized how we interact with technology. Voice
assistants like Alexa, Siri, and Google Assistant have become ubiquitous, offering hands-free
control over smart devices and information access. JavaScript, a versatile scripting language,
plays a surprisingly significant role in building these intelligent systems. This survey explores
the current landscape of JavaScript-based voice assistant development, highlighting key
libraries, frameworks, and research directions.
The voice interface presents a natural and intuitive way for users to interact with computers.
Voice assistants leverage Speech Recognition (SR) and Natural Language Processing (NLP) to
understand spoken commands and convert them into actionable tasks. JavaScript, traditionally
associated with web development, has emerged as a viable option for building voice assistants
due to its:
Versatility: JavaScript runs in web browsers, can be embedded in servers, and can be used for
desktop and mobile applications. This flexibility allows developers to create voice assistants that
function across various platforms.
Large Community: JavaScript boasts a vast and active developer community. This translates
to readily available libraries, frameworks, and support resources for building voice assistants.
Web Speech API: The Web Speech API provides JavaScript with built-in functionalities for
speech recognition and synthesis. This native integration simplifies the development process.
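To illustrate this, the snippet below is a minimal browser sketch of the Web Speech API: it captures a single utterance with SpeechRecognition, logs the transcript, and speaks a reply through speechSynthesis. Support varies across browsers (Chrome exposes the recognizer as webkitSpeechRecognition), so this is illustrative rather than production-ready code.

// Minimal Web Speech API sketch: capture one spoken phrase,
// log the transcript, and speak a reply back to the user.
const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
const recognition = new SpeechRecognition();
recognition.lang = "en-US";

recognition.onresult = (event) => {
  const transcript = event.results[0][0].transcript;
  console.log("You said:", transcript);

  // Text-to-speech reply via the SpeechSynthesis interface
  const reply = new SpeechSynthesisUtterance("You said: " + transcript);
  window.speechSynthesis.speak(reply);
};

recognition.start(); // typically must be triggered by a user gesture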
2. Key Libraries and Frameworks:
Several JavaScript libraries and frameworks empower developers to build sophisticated voice
assistants. Here are some prominent examples:
Web Speech API: As mentioned earlier, this native browser API provides core functionalities
for speech recognition and synthesis. Developers can use it to capture user voice input, convert it
to text, and generate audio responses.
Speech.js: This lightweight library simplifies Web Speech API usage by offering a user-friendly
interface for speech recognition and text-to-speech functionalities.
annyang: This open-source library enables developers to create voice-controlled applications
with minimal code. It allows for defining voice commands and associating them with specific
actions in your code (a minimal usage sketch is given after this list).
Wit.ai: Acquired by Facebook (now Meta), Wit.ai is a popular platform for building
voice-powered applications. It offers features like speech recognition, intent detection, and
entity extraction, making it easier to understand the user intent behind spoken commands.
Dialogflow: A Google Cloud service, Dialogflow provides a comprehensive framework for
building conversational interfaces. It integrates seamlessly with JavaScript and offers features like
intent recognition, entity detection, and context management, allowing for more complex and
natural interactions with the voice assistant.
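As a brief illustration of the annyang entry above, the sketch below registers two voice commands and speaks a response using the browser's speechSynthesis. It assumes annyang has already been loaded as a global via a script tag; the command phrases and handlers are illustrative.

// Minimal annyang sketch: assumes annyang.min.js has been loaded
// via a <script> tag so the annyang global is available.
if (window.annyang) {
  const commands = {
    // Fixed phrase: "hello assistant"
    "hello assistant": () => {
      speechSynthesis.speak(new SpeechSynthesisUtterance("Hello! How can I help?"));
    },
    // Splat (*) captures the rest of the utterance as an argument
    "search for *query": (query) => {
      console.log("User asked to search for:", query);
    },
  };

  annyang.addCommands(commands);
  annyang.start(); // begins listening (requires microphone permission)
}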
2.2 Problem Definition
During the first phase, the requirements for the system are established.
The process starts with eliciting the Ask me website requirements, then
analyzing and prioritizing them, and ends with the creation of the Vision &
Scope document. Vision is defined as a "long-term strategic concept of the
ultimate purpose and form of a new system." The scope "draws the
boundary between what's in and what's out for the project." In this phase
we gathered the requirements of the Voice Assistant application.