Review 2


DETECTING AND CAPTIONING IMAGES USING
CNN-LSTM DEEP NEURAL NETWORKS AND FLASK
UNDER GUIDANCE OF
Dr. P. VENKATESWARA RAO
PROFESSOR & HOD
DEPARTMENT OF CSE

PRESENTED BY
P.YASWANTH SAI (15F11A0563)
SK.ANEEF (16F15A0501)
T.SAI HARISH (15F11A0585)
ABSTRACT
Captioning images automatically is a core capability of the human
visual system. An application that automatically captions the scene
around the user and returns the caption as a plain text message
would offer many advantages. In this paper, we present a model based
on CNN-LSTM neural networks which automatically detects the objects
in images and generates descriptions for them. The model performs
two operations: the first is to detect objects in the image using
convolutional neural networks, and the second is to caption the
image using an RNN-based LSTM. The interface of the model is
developed using a Flask REST API; Flask is a web development
framework for Python. The main use case of this project is to help
the visually impaired understand the surrounding environment and
act accordingly.
EXISTING SYSTEM
We discuss three papers to define the existing system:

Deep Visual-Semantic Alignments for Generating Image Descriptions
Sentences act as weak labels: contiguous sequences of words correspond to
some particular (unknown) location in the image.

Show and Tell: A Neural Image Caption Generator
Performs well only on images similar to those in its training dataset.

Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
A very complex and time-consuming strategy, which is difficult and costly to
deploy in the real world.
PROPOSED SYSTEM

DETECTING OBJECTS
CNN

CONVERTING TO NATURAL LANGUAGE


RNN & LSTM

GENERATING CAPTIONS

Using rigorous training and pre-trained Python libraries


DATASETS USED

Flickr8k
 8,000 images, each annotated with 5 sentences via AMT (Amazon Mechanical Turk)

 1,000 each for validation and testing

Flickr30k
 30,000 images

 1,000 for validation, 1,000 for testing


MSCOCO
 123,000 images

 5,000 each for validation and testing


SYSTEM REQUIREMENTS

Recommended Hardware Requirements:
 RAM: 4 GB (minimum)

 Hard disk: 500 GB

Recommended Software Requirements:


 Operating System: Windows 7 or later (64-bit), Linux, or macOS

 Web Interface: Flask REST API (Python web framework)

 Programming Language: Python

 Libraries: TensorFlow, Keras, NumPy, PIL, Flask, captionBot

 Browser: Chrome, Firefox


ARCHITECTURE DIAGRAM OF THE PROJECT
MODULAR DIVISION OF THE PROJECT

Creating pre-trained model (Transfer Learning)

Object detection

Sentence Generation

Ranking based caption retrieval

Deployment to Web Server


TRANSFER LEARNING
• Transfer learning is a popular method in
computer vision because it allows us to build
accurate models in a time-saving manner.
• With transfer learning, instead of starting the
learning process from scratch, you start from
patterns that were learned when solving a
different problem.
• In computer vision, transfer learning is usually
applied through the use of pre-trained
models.
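For example, with TensorFlow/Keras (assumed here), a pre-trained VGG16 can be turned into a frozen feature extractor. This is only a sketch: in the real project `weights="imagenet"` would be passed so the filters come pre-trained; `weights=None` below just avoids the weight download for illustration.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model

# Build VGG16. In practice pass weights="imagenet" so the convolutional
# filters are pre-trained; weights=None here only avoids the download.
base = VGG16(weights=None)

# Drop the final 1000-way ImageNet classifier and keep the 4096-d "fc2"
# layer as the image feature extractor.
extractor = Model(inputs=base.input, outputs=base.get_layer("fc2").output)
extractor.trainable = False  # freeze: we reuse these layers, not retrain them
```

Each image is then passed through `extractor` once, and the resulting 4096-d vector is what the captioning model consumes (the project's Features.pkl file would hold these vectors).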
OBJECT DETECTION/WORD DETECTION
SENTENCE GENERATION
RANKING BASED CAPTION RETRIEVAL
NLP PROBABILISTIC MODEL
WORKING OF CNN

• CNN stands for CONVOLUTIONAL NEURAL NETWORK.


WORKING OF RNN
• RNN stands for RECURRENT NEURAL NETWORK.
• St = f(U·Xt + W·St-1): the hidden state at step t combines the current input Xt with the previous state St-1.
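A minimal NumPy sketch of this recurrence, unrolled over a toy sequence (dimensions and random weights are for illustration only):

```python
import numpy as np

def rnn_step(x_t, s_prev, U, W, b):
    # One recurrence step: mix the current input (U @ x_t) with the
    # previous hidden state (W @ s_prev), squashed by tanh.
    return np.tanh(U @ x_t + W @ s_prev + b)

# Toy dimensions: 3-d inputs, 4-d hidden state.
rng = np.random.default_rng(0)
U = rng.normal(size=(4, 3))
W = rng.normal(size=(4, 4))
b = np.zeros(4)

s = np.zeros(4)                      # initial state S0
for x in rng.normal(size=(5, 3)):    # unroll over a 5-step sequence
    s = rnn_step(x, s, U, W, b)
```

Because the same U and W are reused at every step, the state s carries information from earlier inputs forward, which is what lets the network model word order.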
WORKING OF LSTM

• LSTM stands for LONG SHORT TERM MEMORY.
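As a hedged sketch of how the CNN and LSTM sides can be merged into one caption model in Keras, the layer sizes and the `vocab_size`/`max_len` values below are illustrative assumptions, not the project's exact configuration:

```python
from tensorflow.keras.layers import (Input, Dense, Dropout, Embedding,
                                     LSTM, add)
from tensorflow.keras.models import Model

vocab_size, max_len = 5000, 34   # assumed values for illustration

# Image branch: 4096-d CNN features squeezed to 256 dims.
img_in = Input(shape=(4096,))
img = Dense(256, activation="relu")(Dropout(0.5)(img_in))

# Text branch: the partial caption so far, embedded and run through an LSTM.
txt_in = Input(shape=(max_len,))
emb = Embedding(vocab_size, 256, mask_zero=True)(txt_in)
txt = LSTM(256)(Dropout(0.5)(emb))

# Merge both branches and predict the next word of the caption.
merged = Dense(256, activation="relu")(add([img, txt]))
out = Dense(vocab_size, activation="softmax")(merged)
model = Model(inputs=[img_in, txt_in], outputs=out)
```

Training then pairs each image's feature vector with every prefix of its caption, with the following word as the target.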


PRE TRAINED FILES

• Descriptions.txt
• Features.pkl
• Tokenizer.pkl
• Model.h5
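To sketch how these files fit together at inference time, greedy decoding can be written as below; `predict_next` is a placeholder standing in for the trained model (Model.h5) plus the tokenizer (Tokenizer.pkl), and the tiny vocabulary is fabricated for illustration:

```python
def generate_caption(predict_next, id_to_word, start_id, max_len=34):
    # Greedy decoding: starting from 'startseq', repeatedly ask the
    # model for the most probable next word until it emits 'endseq'
    # or the caption reaches max_len words.
    seq, words = [start_id], []
    for _ in range(max_len):
        next_id = predict_next(seq)
        if id_to_word[next_id] == "endseq":
            break
        words.append(id_to_word[next_id])
        seq.append(next_id)
    return " ".join(words)

# Tiny stand-in vocabulary and a fake "model" that replays a fixed caption.
vocab = ["startseq", "a", "dog", "runs", "endseq"]
fake_model = lambda seq: min(len(seq), 4)  # next id = position in caption
print(generate_caption(fake_model, vocab, start_id=0))  # → a dog runs
```

In the real pipeline `predict_next` would pad the sequence to `max_len`, call `model.predict` with the image features, and take the argmax over the vocabulary.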
LIBRARIES USED

• TENSORFLOW
• KERAS
• PICKLE
• NLTK
DEPLOYING AS A WEB APPLICATION

• We have used Flask to deploy our
project as a REST API in the form of
a web application.
• Flask is a Python web application
framework, widely used to deploy
machine learning models.
