Review 2
PRESENTED BY
P.YASWANTH SAI (15F11A0563)
SK.ANEEF (16F15A0501)
T.SAI HARISH (15F11A0585)
ABSTRACT
Describing a scene automatically is at the heart of the human
visual system. An application that captions the scenes around its
user and returns each caption as a plain-text message would have
many uses. In this paper, we present a model based on CNN-LSTM
neural networks that automatically detects the objects in an image
and generates a description of it. The model performs two
operations: it detects objects in the image using a convolutional
neural network (CNN), and it captions the image using an LSTM, a
type of recurrent neural network (RNN). The interface to the model
is built with Flask, a Python web framework, and exposed as a REST
API. The main use case of this project is to help visually impaired
people understand their surroundings and act accordingly.
EXISTING SYSTEM
We discuss three papers to define the existing system:
DETECTING OBJECTS
CNN
GENERATING CAPTIONS
Flickr30k: 30k images
MSCOCO: 123,000 images
Object detection
Sentence Generation
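Sentence generation with an LSTM is usually done one word at a time: the model is given the caption so far and proposes the next word, until an end marker appears. The sketch below illustrates that loop with greedy selection. It is not the project's code: the `startseq`/`endseq` markers, the `greedy_caption` function and the `toy_model` lookup table are assumptions made for illustration. In the real system the callable would wrap the trained CNN-LSTM and would also receive the image features.

```python
from typing import Callable, Dict, List

# Assumed boundary tokens; the source does not say which markers it uses.
START, END = "startseq", "endseq"


def greedy_caption(
    next_word_probs: Callable[[List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Build a caption word by word, always taking the most likely next word."""
    words = [START]
    for _ in range(max_len):
        probs = next_word_probs(words)
        best = max(probs, key=probs.get)
        if best == END:
            break
        words.append(best)
    # Drop the start marker before returning the caption.
    return " ".join(words[1:])


# Hypothetical stand-in for the trained model: a fixed table keyed on the
# previous word only. A real model would condition on the image as well.
_TOY_TABLE = {
    "startseq": {"dog": 0.7, "cat": 0.3},
    "dog": {"runs": 0.6, "endseq": 0.4},
    "runs": {"endseq": 0.9, "dog": 0.1},
}


def toy_model(words: List[str]) -> Dict[str, float]:
    return _TOY_TABLE[words[-1]]


caption = greedy_caption(toy_model)
print(caption)  # dog runs
```

Greedy selection is the simplest choice; beam search is a common alternative when caption quality matters more than speed.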
• Descriptions.txt
• Features.pkl
• Tokenizer.pkl
• Model.h5
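The two `.pkl` files above are presumably written and read with Python's `pickle` module, which is listed among the libraries used. The sketch below shows a save/load round trip under that assumption. The helper names and the contents of the dictionary (an image id mapped to a feature vector) are hypothetical; the source does not describe what the files contain.

```python
import os
import pickle
import tempfile


def save_pickle(obj, path):
    """Serialise obj to path in binary pickle format."""
    with open(path, "wb") as fh:
        pickle.dump(obj, fh)


def load_pickle(path):
    """Load a pickled object. Only unpickle files from a trusted source."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Hypothetical contents: image id -> feature vector.
features = {"img_001": [0.12, 0.80, 0.05]}

# A temporary directory keeps the example from touching real project files.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Features.pkl")
    save_pickle(features, path)
    restored = load_pickle(path)

print(restored == features)  # True
```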
LIBRARIES USED
• TENSORFLOW
• KERAS
• PICKLE
• NLTK
DEPLOYING AS A WEB APPLICATION
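A minimal sketch of how such a deployment might look with Flask is given below. The `/caption` route, the `image` form field, the JSON response shape and the `dummy_captioner` function are all assumptions for illustration; the source does not specify the actual endpoints. The request-handling logic is kept in a plain function so it can be checked without starting a server.

```python
from typing import Callable, Dict, Tuple


def caption_response(
    image_bytes: bytes, captioner: Callable[[bytes], str]
) -> Tuple[Dict[str, str], int]:
    """Return a JSON-serialisable body and an HTTP status code."""
    if not image_bytes:
        return {"error": "no image uploaded"}, 400
    return {"caption": captioner(image_bytes)}, 200


def create_app(captioner: Callable[[bytes], str]):
    # Flask is imported here so the helper above stays usable without it.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/caption", methods=["POST"])  # hypothetical route name
    def caption():
        upload = request.files.get("image")  # hypothetical form field name
        data = upload.read() if upload is not None else b""
        body, status = caption_response(data, captioner)
        return jsonify(body), status

    return app


def dummy_captioner(image_bytes: bytes) -> str:
    # Placeholder for the trained model; it only reports the upload size.
    return f"image of {len(image_bytes)} bytes"


ok_body, ok_status = caption_response(b"\x89PNG", dummy_captioner)
print(ok_status, ok_body)  # 200 {'caption': 'image of 4 bytes'}
```

With Flask installed, `create_app(dummy_captioner).run()` would start the development server; a production deployment would normally sit behind a WSGI server instead.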