Synopsis Report
on
SIGN LANGUAGE RECOGNITION
TEAM:
AWANISH DEOPA (1018465)
DEEPAK SINGH (1018474)
DEV DHIMAN (1018476)
GAJENDER SINGH (1018482)
1. ABOUT PROJECT
The goal of this project was to build a neural network able to classify which letter of the
American Sign Language (ASL) alphabet is being signed, given an image of a signing
hand. This project is a first step towards building a possible sign language translator,
which can take communications in sign language and translate them into written and oral
language. Such a translator would greatly lower the barrier for many deaf and mute
individuals to communicate better with others in day-to-day interactions. This
goal is further motivated by the isolation felt within the deaf community.
Loneliness and depression exist at higher rates among the deaf population, especially
when they are immersed in a hearing world. Large barriers that profoundly affect quality
of life stem from the communication disconnect between the deaf and the hearing: some
examples are information deprivation, limited social connections, and difficulty
integrating into society. Most research implementations for this task have used depth maps
generated by depth cameras, together with high-resolution images. The objective of this project was
to see whether neural networks are able to classify signed ASL letters using simple images of
hands taken with a personal device such as a laptop webcam. This aligns with
the motivation above, as it would make a future implementation of a real-time
ASL-to-oral/written language translator practical in everyday situations.
It is certainly possible to find a suitable dataset on the internet, but in this project we will be
creating the dataset on our own.
We will take a live feed from the webcam, and every frame in which a hand is detected inside the
created ROI (region of interest) will be saved in a directory (here, the gesture directory) that
contains two folders, train and test, each containing the same number of sub-folders of
images captured using create_gesture_data.py.
To create the dataset, we obtain the live camera feed using OpenCV and create an ROI, which
is simply the part of the frame in which we want to detect the hand for the gestures.
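Below is a minimal sketch of how such a capture script might look. The ROI coordinates,
the save path, and the key bindings are illustrative assumptions, not the exact contents
of create_gesture_data.py.

    import os
    import cv2

    # Illustrative ROI coordinates and save path (assumptions, not the
    # exact values used by create_gesture_data.py).
    ROI_TOP, ROI_BOTTOM, ROI_LEFT, ROI_RIGHT = 100, 300, 350, 550
    SAVE_DIR = "gesture/train/A"  # assumed layout: one folder per letter
    os.makedirs(SAVE_DIR, exist_ok=True)

    cap = cv2.VideoCapture(0)  # live feed from the webcam
    count = 0
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        frame = cv2.flip(frame, 1)  # mirror the frame for natural interaction
        roi = frame[ROI_TOP:ROI_BOTTOM, ROI_LEFT:ROI_RIGHT]
        cv2.rectangle(frame, (ROI_LEFT, ROI_TOP), (ROI_RIGHT, ROI_BOTTOM),
                      (0, 255, 0), 2)
        cv2.imshow("Capture", frame)
        key = cv2.waitKey(1) & 0xFF
        if key == ord("s"):  # press 's' to save the current ROI as an image
            cv2.imwrite(os.path.join(SAVE_DIR, str(count) + ".jpg"), roi)
            count += 1
        elif key == ord("q"):  # press 'q' to quit
            break
    cap.release()
    cv2.destroyAllWindows()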
2. Training a CNN on the captured dataset
Supervised machine learning: it is a form of machine learning in which the model is
trained on input data together with the expected output for that data. To create such a model, it is
necessary to go through the following phases:
1. model construction
2. model training
3. model testing
Model construction:
It depends on the machine learning algorithm; in this project's case, it was a neural network. In
Keras, such an algorithm looks like this:
1. begin with the model object: model = Sequential()
2. then add layers with their types: model.add(type_of_layer())
3. after adding a sufficient number of layers, the model is compiled. At this moment Keras
communicates with TensorFlow to construct the model. During model compilation it is
important to specify a loss function and an optimizer algorithm. The loss function measures
how far each prediction made by the model is from the expected output.
Before model training, it is important to scale the data for further use; a sketch of the
construction and scaling steps follows.
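The sketch below shows these steps in Keras. The layer sizes, the 64x64 grayscale input
shape, and the 26-letter class count are illustrative assumptions, not the project's final
architecture.

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

    NUM_CLASSES = 26  # assumption: one class per ASL letter

    # 1. begin with the model object
    model = Sequential()
    # 2. add layers with their types
    model.add(Conv2D(32, (3, 3), activation="relu", input_shape=(64, 64, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation="relu"))
    model.add(Dense(NUM_CLASSES, activation="softmax"))
    # 3. compile the model with a loss function and an optimizer
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])

    # scale pixel values to [0, 1] before training, as noted above
    # x_train = x_train.astype("float32") / 255.0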
Model training:
After model construction it is time for model training. In this phase, the model is trained on the
training data and the expected output for that data. Progress is visible on the console while the
script runs, and at the end it reports the final accuracy of the model.
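A minimal training sketch, assuming x_train and y_train are the scaled images and one-hot
expected outputs prepared earlier; the epoch count and batch size are illustrative.

    # x_train: scaled images, y_train: one-hot expected outputs (assumed prepared)
    history = model.fit(x_train, y_train,
                        epochs=10,       # illustrative hyperparameters
                        batch_size=32,
                        validation_split=0.1)
    print("final training accuracy:", history.history["accuracy"][-1])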
Model Testing:
During this phase a second set of data is loaded. This data set has never been seen by the model,
so its true accuracy can be verified. After training is complete, and once it is clear that the
model gives the right results, it can be saved. Finally, the saved model can be used in the real
world. The name of this phase is model evaluation: the model can now be used to evaluate new
data.
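A minimal evaluation-and-save sketch, with x_test and y_test as the held-out set; the model
filename is a hypothetical example.

    # x_test, y_test: the held-out set the model has never seen
    loss, accuracy = model.evaluate(x_test, y_test)
    print("test accuracy:", accuracy)

    model.save("gesture_model.h5")  # hypothetical filename; persists the model
    # later, reload it with tensorflow.keras.models.load_model("gesture_model.h5")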
3. HARDWARE REQUIREMENT
Camera: Good quality, 3 MP
RAM: 4 GB or higher
Processor: Intel Pentium 4 or higher
HDD: 10 GB or higher
Monitor: 15" or 17" colour monitor
Mouse: Scroll or optical mouse, or touchpad
Keyboard: Standard 110-key keyboard
4. SOFTWARE REQUIREMENT
Operating System: Windows, Mac, Linux
SDK: OpenCV, TensorFlow, Keras, NumPy
Language: Python 3.7 (32-bit)
IDE: IDLE / Jupyter Notebook
MODULES USED IN THE PROJECT
1. TensorFlow:
TensorFlow is an end-to-end open source platform for machine learning. It has a
comprehensive, flexible ecosystem of tools, libraries and community resources that lets
researchers push the state-of-the-art in ML and developers easily build and deploy ML
powered applications.
Features:
Easy model building
Build and train ML models easily using intuitive high-level APIs like Keras with eager
execution, which makes for immediate model iteration and easy debugging.
Robust ML production anywhere
Easily train and deploy models in the cloud, on-prem, in the browser, or on-device, no matter
what language you use.
Powerful experimentation for research
A simple and flexible architecture to take new ideas from concept to code, to state-of-the-art
models, and to publication faster.
2. NumPy
NumPy is a Python library used for working with arrays.
It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices.
NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can use
it freely. NumPy stands for Numerical Python.
In Python we have lists that serve the purpose of arrays, but they are slow to process.
NumPy aims to provide an array object that is up to 50x faster than traditional Python lists.
The array object in NumPy is called ndarray; it provides a lot of supporting functions that
make working with ndarray very easy.
Arrays are very frequently used in data science, where speed and resources are very
important. NumPy is a Python library written partially in Python, but most of the parts
that require fast computation are written in C or C++.
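A small sketch of the vectorized ndarray operations described above; the values are
arbitrary examples.

    import numpy as np

    # ndarray supports elementwise (vectorized) arithmetic, which plain
    # Python lists lack
    pixels = np.array([[0, 128, 255],
                       [64, 192, 32]], dtype=np.uint8)
    scaled = pixels / 255.0            # scales the whole array at once
    print(scaled.shape, scaled.dtype)  # (2, 3) float64
    print(scaled.mean())               # fast aggregate over all elements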
3. OpenCV
OpenCV (Open Source Computer Vision Library) is an open source computer vision and
machine learning software library. OpenCV was built to provide a common infrastructure for
computer vision applications and to accelerate the use of machine perception in
commercial products. Being a BSD-licensed product, OpenCV makes it easy for businesses
to utilize and modify the code. OpenCV is the most popular library for computer vision.
Originally written in C/C++, it now provides bindings for Python.
The library has more than 2500 optimized algorithms, which include a comprehensive set of
both classic and state-of-the-art computer vision and machine learning algorithms. These
algorithms can be used to detect and recognize faces, identify objects, classify human actions
in videos, track camera movements, track moving objects, extract 3D models of objects,
produce 3D point clouds from stereo cameras, stitch images together to produce a high
resolution image of an entire scene, find similar images from an image database, remove red
eyes from images taken using flash, follow eye movements, recognize scenery and establish
markers to overlay it with augmented reality, etc.
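As a small illustration of the kind of basic image operations OpenCV provides, the sketch
below reads an image, converts it to grayscale, and thresholds it. This is one plausible way
to segment a hand inside the ROI; the filenames are hypothetical and this is not necessarily
the exact preprocessing pipeline of this project.

    import cv2

    # grayscale, blur, and Otsu thresholding: a simple hand-segmentation step
    image = cv2.imread("hand.jpg")  # hypothetical input file
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (7, 7), 0)
    _, binary = cv2.threshold(blurred, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    cv2.imwrite("hand_binary.jpg", binary)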
4. Keras