
FIRST PHASE SYNOPSIS REPORT

on
SIGN LANGUAGE RECOGNITION

(CSE VII Semester)


2021-22

TEAM:
AWANISH DEOPA (1018465)
DEEPAK SINGH (1018474)
DEV DHIMAN (1018476)
GAJENDER SINGH (1018482)
SYNOPSIS REPORT: SIGN LANGUAGE RECOGNITION

1. ABOUT PROJECT
The goal of this project was to build a neural network able to classify which letter of the
American Sign Language (ASL) alphabet is being signed, given an image of a signing
hand. This project is a first step towards building a possible sign language translator,
which can take communication in sign language and translate it into written and oral
language. Such a translator would greatly lower the barrier for many deaf and mute
individuals to communicate with others in day-to-day interactions. This goal is further
motivated by the isolation felt within the deaf community. Loneliness and depression
exist at higher rates among the deaf population, especially when they are immersed in a
hearing world. Large barriers that profoundly affect quality of life stem from the
communication disconnect between the deaf and the hearing: information deprivation,
limitation of social connections, and difficulty integrating into society. Most research
implementations for this task have used depth maps generated by depth cameras,
together with high-resolution images. The objective of this project was to see whether
neural networks can classify signed ASL letters using simple images of hands taken with
a personal device such as a laptop webcam. This aligns with the motivation above, as it
would make a future implementation of a real-time ASL-to-oral/written language
translator practical in everyday situations.

2. DETAILED DESCRIPTION OF THE PROJECT


This is divided into three parts:

1. Creating the dataset
2. Training a CNN on the captured dataset
3. Predicting the data

1. Creating the dataset

It is entirely possible to find a suitable dataset on the internet, but in this project we create
the dataset ourselves.

We take a live feed from the webcam, and every frame in which a hand is detected inside the
ROI (region of interest) is saved into a directory (here, the gesture directory) that contains
two folders, train and test, each containing the same number of subfolders of images
captured using create_gesture_data.py.

To create the dataset, we get the live camera feed using OpenCV and draw an ROI, which is
simply the part of the frame in which we want to detect the hand for the gestures.
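
A minimal sketch of such a capture loop is shown below; the ROI coordinates, folder layout,
and key bindings are illustrative assumptions rather than the exact contents of
create_gesture_data.py (here samples are saved on a keypress for simplicity):

    import os
    import cv2

    save_dir = os.path.join("gesture", "train", "A")  # one folder per letter (assumed layout)
    os.makedirs(save_dir, exist_ok=True)

    cap = cv2.VideoCapture(0)                # live feed from the webcam
    count = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.flip(frame, 1)           # mirror for natural interaction
        x1, y1, x2, y2 = 350, 100, 600, 350  # ROI where the hand is expected
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        roi = frame[y1:y2, x1:x2]
        cv2.imshow("create_gesture_data", frame)
        key = cv2.waitKey(1) & 0xFF
        if key == ord("s"):                  # save the current ROI as a sample
            cv2.imwrite(os.path.join(save_dir, f"{count}.jpg"), roi)
            count += 1
        elif key == ord("q"):                # quit
            break
    cap.release()
    cv2.destroyAllWindows()
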
2. Training a CNN on the captured dataset

Supervised machine learning: this is a form of machine learning in which the model is
trained on input data together with the expected output for that data. To create such a model,
it is necessary to go through the following phases:

1. model construction
2. model training
3. model testing

Model construction:

This depends on the machine learning algorithm chosen; in this project, it is a neural
network. Constructing such a model looks like this:

1. begin with the model object: model = Sequential()
2. add layers of the required types: model.add(type_of_layer())
3. after adding a sufficient number of layers, compile the model. At this point Keras
communicates with TensorFlow to construct the model. During compilation it is important to
specify a loss function and an optimizer algorithm; the loss function measures the error of
each prediction made by the model.

Before model training it is important to scale the data for further use, as shown at the end of
the sketch below.
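
A minimal sketch of this construction phase, assuming 64x64 grayscale input images and 26
output classes, one per ASL letter (both figures are illustrative assumptions):

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

    model = Sequential()                                  # 1. the model object
    model.add(Conv2D(32, (3, 3), activation="relu",
                     input_shape=(64, 64, 1)))            # 2. add layers
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation="relu"))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation="relu"))
    model.add(Dense(26, activation="softmax"))            # one output per letter

    # 3. compile: Keras hands the model to TensorFlow; the loss function
    # and optimizer must be specified here.
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])

    # Scaling: bring pixel values from [0, 255] into [0, 1] before training.
    # x_train = x_train.astype("float32") / 255.0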

Model training:

After model construction it is time for model training. In this phase the model is trained
using the training data and the expected output for that data. Progress is visible on the
console while the script runs, and at the end it reports the final accuracy of the model.
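
A minimal training call might look like the following; the epoch count and batch size are
illustrative choices, and x_train/y_train are assumed to hold the scaled images and their
one-hot labels:

    history = model.fit(x_train, y_train,
                        epochs=10,
                        batch_size=32,
                        validation_split=0.1)  # per-epoch progress is printed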

Model testing:

During this phase a second set of data is loaded. This data set has never been seen by the
model, so its true accuracy can be verified. Once training is complete and the model is
confirmed to give the right results, it can be saved, and the saved model can then be used
in the real world. The name of this phase is model evaluation: the model can be used to
evaluate new data.
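
A sketch of this phase, assuming x_test/y_test hold the unseen test images and labels; the
filename is a hypothetical choice:

    # Evaluate on data the model has never seen, then persist the model.
    loss, accuracy = model.evaluate(x_test, y_test)
    print(f"test loss={loss:.4f}, test accuracy={accuracy:.4f}")
    model.save("sign_language_model.h5")  # hypothetical filename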

3. Predicting the data


After training the model, we test how accurate it is on new data, i.e. the test dataset, using
the evaluate method, which returns the loss and accuracy values for the model in test mode;
its inputs are the test images and their target variables. We then predict the output for the
input samples, i.e. the test dataset, using the predict method.
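
A sketch of the prediction step; the mapping from class indices to letters is an illustrative
assumption about the class ordering:

    import numpy as np
    import string

    probs = model.predict(x_test)            # shape: (num_samples, 26)
    pred_indices = np.argmax(probs, axis=1)  # most likely class per image
    letters = list(string.ascii_uppercase)   # assumed index-to-letter mapping
    predictions = [letters[i] for i in pred_indices]
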
3. TECHNOLOGY USED

1. HARDWARE REQUIREMENT
- Camera: good quality, 3 MP
- RAM: 4 GB minimum
- Processor: Intel Pentium 4 or higher
- HDD: 10 GB or more
- Monitor: 15” or 17” colour monitor
- Mouse: scroll or optical mouse, or touch pad
- Keyboard: standard 110-key keyboard

2. SOFTWARE REQUIREMENT
- Operating system: Windows, macOS, or Linux
- Libraries: OpenCV, TensorFlow, Keras, NumPy
- Python 3.7 (32-bit)
- IDLE/Jupyter Notebook
4. MODULES USED IN THE PROJECT

1. TensorFlow:
TensorFlow is an end-to-end open source platform for machine learning. It has a
comprehensive, flexible ecosystem of tools, libraries and community resources that lets
researchers push the state-of-the-art in ML and developers easily build and deploy ML
powered applications.
Features:
- Easy model building
Build and train ML models easily using intuitive high-level APIs like Keras with eager
execution, which makes for immediate model iteration and easy debugging.
- Robust ML production anywhere
Easily train and deploy models in the cloud, on-prem, in the browser, or on-device, no matter
what language you use.
- Powerful experimentation for research
A simple and flexible architecture to take new ideas from concept to code, to state-of-the-art
models, and to publication faster.

2. NumPy
NumPy is a Python library for working with arrays. It also has functions for working in the
domains of linear algebra, Fourier transforms, and matrices. NumPy was created in 2005 by
Travis Oliphant; it is an open-source project that can be used freely. NumPy stands for
Numerical Python.

In Python, lists serve the purpose of arrays, but they are slow to process. NumPy aims to
provide an array object that is up to 50x faster than traditional Python lists. The array object
in NumPy is called ndarray, and it comes with many supporting functions that make working
with ndarrays very easy.

Arrays are used very frequently in data science, where speed and resources are very
important. NumPy is written partially in Python, but most of the parts that require fast
computation are written in C or C++.
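
A small illustration of the ndarray and its vectorized arithmetic:

    import numpy as np

    a = np.array([[1, 2], [3, 4]], dtype=np.float32)
    print(a.shape, a.dtype)  # (2, 2) float32
    print(a * 255.0)         # elementwise multiply, no explicit Python loop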

3. OpenCV

OpenCV (Open Source Computer Vision Library) is an open-source computer vision and
machine learning software library. OpenCV was built to provide a common infrastructure for
computer vision applications and to accelerate the use of machine perception in commercial
products. Being a BSD-licensed product, OpenCV makes it easy for businesses to utilize and
modify the code. OpenCV is among the most popular libraries for computer vision.
Originally written in C/C++, it now provides bindings for Python as well.
The library has more than 2500 optimized algorithms, including a comprehensive set of both
classic and state-of-the-art computer vision and machine learning algorithms. These
algorithms can be used to detect and recognize faces, identify objects, classify human actions
in videos, track camera movements, track moving objects, extract 3D models of objects,
produce 3D point clouds from stereo cameras, stitch images together to produce a high-
resolution image of an entire scene, find similar images in an image database, remove red
eyes from images taken with flash, follow eye movements, recognize scenery, establish
markers to overlay scenes with augmented reality, and more.

4. Keras

Keras is an open-source neural-network library written in Python. It is capable of running on
top of TensorFlow, Microsoft Cognitive Toolkit, R, Theano, or PlaidML. Designed to enable
fast experimentation with deep neural networks, it focuses on being user-friendly, modular,
and extensible. It was developed as part of the research effort of project ONEIROS (Open-
ended Neuro-Electronic Intelligent Robot Operating System), and its primary author and
maintainer is François Chollet, a Google engineer. Chollet is also the author of the Xception
deep neural network model.

Features: Keras contains numerous implementations of commonly used neural-network
building blocks such as layers, objectives, activation functions, and optimizers, along with a
host of tools for working with image and text data that simplify the coding necessary for
writing deep neural network code. The code is hosted on GitHub, and community support
forums include the GitHub issues page and a Slack channel.
Keras allows users to productize deep models on smartphones (iOS and Android), on the
web, or on the Java Virtual Machine. It also allows distributed training of deep-learning
models on clusters of graphics processing units (GPUs) and tensor processing units (TPUs),
principally in conjunction with CUDA.
5. MOTIVATION FOR THE PROJECT
The 2011 Indian census cites roughly 1.3 million people with “hearing impairment”. In
contrast, India’s National Association of the Deaf estimates that 18 million people, roughly
1 per cent of the Indian population, are deaf. These statistics formed the motivation for our
project. Since speech-impaired and deaf people need a proper channel to communicate with
hearing people, there is a need for such a system: not everyone can understand the sign
language of impaired people. Our project is therefore aimed at converting sign language
gestures into text that anyone can read.

Deafness is a disability that impairs hearing and makes a person unable to hear [1], while
muteness is a disability that impairs speaking and makes a person unable to speak [2]. Both
affect only hearing and/or speaking, so deaf-mute people can still do many other things. The
only thing that separates them from hearing people is communication. If there were a way
for hearing and deaf-mute people to communicate, deaf-mute people could live as easily as
anyone else, and the way for them to communicate is sign language. While sign language is
very important to deaf-mute people, for communicating both with hearing people and among
themselves, it still receives little attention from the hearing, who tend to ignore its importance
unless they have loved ones who are deaf-mute.

One solution for communicating with deaf-mute people is to use the services of a sign
language interpreter, but this can be costly, and a cheap solution is required so that deaf-mute
and hearing people can communicate normally. Researchers therefore want to find a way for
deaf-mute people to communicate easily with hearing people. The breakthrough for this is
the Sign Language Recognition System, which aims to recognize sign language and translate
it into the local language via text or speech. However, such systems have been very costly to
build and difficult to apply to daily use. Early research succeeded in Sign Language
Recognition by using data gloves, but the high cost of the gloves and their wearable nature
made them difficult to commercialize [3]. Knowing that, researchers have since tried to
develop pure-vision Sign Language Recognition Systems. These, however, also come with
difficulties, especially in precisely tracking hand movements.
