arXiv:2303.07451
1. Introduction
Blindness is a daunting condition. According to a 2013 WHO report, an estimated 40 to
45 million people are blind, and about 135 million have low or weak sight. According to a report by
The Hindu, 62 million people in India are visually impaired, of whom eight million are blind. Visual
impairment can impact a person's quality of life and make them prone to discrimination, and they face many
challenges in navigating around places. A large range of adaptive equipment enables
visually impaired people to live their lives independently. However, such devices are found only in select shops or
marketplaces, and they are quite expensive, so only some blind and visually impaired (BVI) people can use such resources.
The camera-based mechanism in [1] helps them easily read text on objects held in their
hands. That system uses a camera to capture the target object, and an algorithm separates the text
in the captured image from the background. Individual characters are recognised by optical character
recognition (OCR), and the recognised text is rendered as audio output through a speech software
development kit. The wearable system in [2] is a device that receives user input and recognises objects.
It includes an ultrasonic sensor that warns the user of objects in his or her path, and the
items are located using the Haar cascade method. The device is mounted on the user's chest, and the
user receives an audio description of each object that is detected.
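The ultrasonic warning in [2] reduces to converting the sensor's echo round-trip time into a distance and comparing it with a threshold. A minimal sketch of that step; the speed-of-sound constant and the one-metre threshold are our illustrative assumptions, not values taken from [2]:

```python
SPEED_OF_SOUND_CM_PER_US = 0.0343  # approximate speed of sound at ~20 degrees C

def echo_to_distance_cm(round_trip_us: float) -> float:
    """Convert an ultrasonic echo round-trip time (microseconds) to distance (cm).

    The pulse travels to the obstacle and back, hence the division by 2.
    """
    return round_trip_us * SPEED_OF_SOUND_CM_PER_US / 2

def obstacle_warning(round_trip_us: float, threshold_cm: float = 100.0) -> bool:
    """True when the reflecting object is closer than the warning threshold."""
    return echo_to_distance_cm(round_trip_us) < threshold_cm
```

For example, a 1000 microsecond round trip corresponds to an obstacle roughly 17 cm away, well inside the warning threshold.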
The system in [3] applies object detection technology to traffic scenes. It combines
R-FCN (Region-based Fully Convolutional Network) with OYOLO (Optimized You Only
Look Once), which is 1.18 times faster than YOLO, to identify and categorise vehicles,
cyclists, and other objects in images. YOLO is prone to localisation errors, and OYOLO is
employed to reduce them. Other possible categories of solutions, with promising present and future
utility based on existing technologies, are analysed in [4].
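YOLO-family detectors emit many overlapping candidate boxes, and their localisation quality is typically measured by intersection-over-union (IoU), with duplicates pruned by non-maximum suppression. A generic sketch of that post-processing step (not the OYOLO implementation from [3]):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(detections, iou_threshold=0.5):
    """Non-maximum suppression: keep the best-scoring box of each overlap cluster.

    detections: list of (box, score) pairs, box = (x1, y1, x2, y2).
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        # Keep this box only if it does not heavily overlap an already-kept one.
        if all(iou(box, kept_box) < iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept
```

Two near-duplicate detections of the same vehicle thus collapse into one, while a distant cyclist survives as a separate detection.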
The system in [5] uses artificial intelligence and machine learning to address the
problem: a device camera captures images, objects are detected, and distances are calculated
by an application. The prototype in [6] is mounted on top of a walking cane; it uses a Pi camera to take
pictures and then runs the YOLO algorithm, which performs object detection more accurately
than comparable approaches. The gTTS module then converts the resulting text to speech, producing a human-sounding voice.
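The text-to-speech stage in [6] only needs a sentence describing what was detected. A minimal sketch: the sentence template is our assumption, and the gTTS call (a real library, requiring network access) is left under the main guard so the pure formatting step can be exercised on its own:

```python
def detections_to_sentence(labels):
    """Turn a list of detected object labels into a sentence for speech output."""
    if not labels:
        return "No objects detected."
    return "Detected: " + ", ".join(labels) + "."

if __name__ == "__main__":
    sentence = detections_to_sentence(["person", "chair"])
    print(sentence)
    # gTTS needs internet access; uncomment to synthesise an MP3 announcement.
    # from gtts import gTTS
    # gTTS(text=sentence, lang="en").save("announcement.mp3")
```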
All currently available systems have one or more of the following drawbacks: (i) unaffordable
price; (ii) lack of sales, marketing, or servicing in developing nations; (iii) high inaccuracy, making
them unfit for general use; and/or (iv) bulkiness or difficulty of use.
The paper is organised into seven sections. Section 1 introduces the paper and reviews
existing work in the domain of visual assistance devices. Section 2 presents the results of a survey
conducted as part of the idea-validation process. Section 3 details the hardware and software of the
proposed system, and section 4 covers its real-time test results. Section 5 concludes the paper,
section 6 outlines the future scope of the project, and section 7 lists the references.
3. Proposed system
This is a microcontroller-based virtual visual assistance device for visually challenged people. Its main
objective is to translate the visual world into an audio world, making it easier for people with
visual disabilities to navigate and carry out their daily activities without feeling deprived of vision.
The ESP32 camera module captures images of the target(s) (such as surrounding objects, people,
text documents, currency, roads, traffic signals, and traffic sign boards). The captured image is sent to a
smartphone and processed in real time by multiple algorithms, and the extracted information is
converted into audio signals that serve as feedback to the user.
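One common way to move frames from an ESP32-CAM to a companion device is the module's built-in HTTP camera server. A sketch assuming the stock Arduino CameraWebServer example firmware, whose /capture endpoint returns a single JPEG; the endpoint and IP address are deployment assumptions, not details stated in this paper:

```python
import urllib.request

def capture_url(host: str) -> str:
    """URL of a single JPEG frame on the stock CameraWebServer firmware."""
    return f"http://{host}/capture"

def fetch_frame(host: str, timeout_s: float = 5.0) -> bytes:
    """Download one JPEG frame from the ESP32-CAM (device must be on the LAN)."""
    with urllib.request.urlopen(capture_url(host), timeout=timeout_s) as resp:
        return resp.read()

if __name__ == "__main__":
    jpeg = fetch_frame("192.168.1.50")  # placeholder address for the module
    with open("frame.jpg", "wb") as f:
        f.write(jpeg)
```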
Fig 1. System Block Diagram
4. Results
4.1. Circuit diagram
An FTDI programmer is used to program the ESP32-CAM. In this circuit, the GND, U0T, and U0R pins of
the ESP32-CAM module are connected to the GND, RX, and TX pins of the FTDI module, respectively.
The whole setup is powered by a 5 V battery. Note: the GPIO 0 pin of the ESP32-CAM
module must be connected to GND while uploading code.
Fig 2. Connecting ESP32Cam with Future Technology Devices International (FTDI) module [7]
Fig 3. Image of INR 100 given as input Fig 4. Model’s text and audio-based prediction
(1) Case-I. When a Rs 100 note is placed in front of the camera, the system captures the image,
processes it, and applies the detection algorithm. It recognises the denomination and outputs 100
with a probability of 1. The output is delivered as both audio and text.
Fig 5. Image of INR 10 given as input Fig 6. Model’s text and audio-based prediction
(2) Case-II. When a Rs 10 note is given as input, the system captures and analyses the image and
then applies the algorithm. It outputs 10 with a probability of 0.89. The output is delivered as
both audio and text.
Fig 7. Image of laptop given as input Fig 8. Model’s text and audio-based prediction
(3) Case-III. When the image above (a laptop, with no currency note) is given as input, the system
captures and processes the image and then applies the algorithm. It classifies the scene as
background with a probability of 1. The output is delivered as both audio and text.
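The announcements in the three cases above follow one pattern: take the classifier's most likely class and its probability, and render them as the text that the audio stage speaks. A small sketch; the class names and wording are illustrative, not the device's exact output format:

```python
def top_prediction(probs):
    """Return (label, probability) of the most likely class in a probability dict."""
    label = max(probs, key=probs.get)
    return label, probs[label]

def announce(probs):
    """Format the top prediction as the text passed to the speech stage."""
    label, p = top_prediction(probs)
    return f"{label} with probability {p:.2f}"
```

For Case-II, a distribution like {"10": 0.89, "100": 0.08, "background": 0.03} would be announced as "10 with probability 0.89".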
5. Conclusion
Because they cannot directly perceive visual information, many blind and visually impaired people
struggle to maintain a constant, healthy rhythm in their lives. A navigation system that enables blind
people to travel their routes freely and tells them where they are at any given time is therefore
necessary. To make the lives of visually impaired people easier, this article uses technology to give
them a visual aid. The project's main goal is to design an object detector that can recognise
obstructions and guide a visually impaired person along a path via voiced instructions. Hence, a
cost-effective device was designed and implemented to provide support and independence to visually
impaired people when they travel to new or unknown places.
6. Future scope
Future work will include improving the device's design to make it affordable for commercial use,
adding computer-vision-based algorithms for analysing the nature of travel paths and the obstacles
ahead, and conducting user research to improve the overall usability of the system. These
improvements will help blind and visually challenged people navigate independently.
7. References
[1] S. Deshpande and R. Shriram, "Real time text detection and recognition on hand held objects to
assist blind people", 2016 International Conference on Automatic Control and Dynamic
Optimization Techniques (ICACDOT), 2016
[2] B. Deepthi Jain, S. M. Thakur and K. V. Suresh, "Visual Assistance for Blind Using Image
Processing", 2018 International Conference on Communication and Signal Processing (ICCSP),
Chennai, India, 2018
[3] J. Tao, H. Wang, X. Zhang, X. Li and H. Yang, "An object detection system based on YOLO in
traffic scene", 2017 6th International Conference on Computer Science and Network Technology
(ICCSNT), Dalian, China, 2017
[4] Csapó, Á., Wersényi, G., Nagy, H. et al., “A survey of assistive technologies and applications
for blind users on mobile platforms: a review and foundation for research”, J Multimodal
User Interfaces 9, 275–286 (2015)
[5] S. M. Felix, S. Kumar and A. Veeramuthu, "A Smart Personal AI Assistant for Visually
Impaired People", 2018 2nd International Conference on Trends in Electronics and
Informatics (ICOEI), Tirunelveli, India, 2018
[6] Therese Yamuna Mahesh, Parvathy S S, Shibin Thomas, Shilpa Rachel Thomas and Thomas
Sebastian, "CICERONE - A Real Time Object Detection for Visually Impaired People", IOP
Conference Series: Materials Science and Engineering, 2021
[7] Arduino Forum thread, "A fatal error occured", https://forum.arduino.cc/t/a-fatal-error-occured/902580/5