Image Captioning including Attention

Background

This product is designed for people who are blind or visually impaired, who cannot see pictures but can listen to a description of them. We aim to help them access more information and experience the beauty of the world. The application predicts a caption for an image and displays it to the user. A user can upload an image and play the caption as audio, so that they can "hear" the image.

Modeling

The backend uses an image captioning model, which takes an image as input and produces a short textual summary of its content. It combines Computer Vision and Natural Language Processing to generate the captions. The model implements an Encoder-Decoder architecture: a Convolutional Neural Network (CNN) encodes the image into a high-level representation, and a Recurrent Neural Network (RNN) decodes this representation into text. Unlike a plain encoder-decoder model, ours also includes Attention, which helps the model focus on the most relevant parts of the image as it produces the caption.
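To make the attention step concrete, here is a minimal sketch in plain Python. It is not the code from model.py: the function names, the toy numbers, and the dot-product scoring are all assumptions chosen for illustration, and the trained model may score regions differently. The idea it shows is that each image region gets a weight, the weights sum to 1, and the decoder receives the weighted average of the region features.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, decoder_state):
    """Soft attention over image regions (illustrative only).

    region_features: one feature vector per image region, a
                     hypothetical stand-in for the CNN encoder output.
    decoder_state:   current hidden state of the RNN decoder.
    Returns (weights, context); context is the weighted average
    of the region features.
    """
    # Assumption: dot-product scoring between each region and the state.
    scores = [sum(f * h for f, h in zip(region, decoder_state))
              for region in region_features]
    weights = softmax(scores)
    dim = len(region_features[0])
    context = [sum(w * region[i] for w, region in zip(weights, region_features))
               for i in range(dim)]
    return weights, context


# Toy example: three regions with 2-dimensional features (made-up numbers).
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
state = [2.0, 0.0]
weights, context = attend(regions, state)
```

In this toy input the first region aligns best with the decoder state, so it receives the largest weight and dominates the context vector.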

Dataset

Flickr8k dataset from Kaggle
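Flickr8k pairs each image with several reference captions, so a common first step is to group captions by image. The sketch below assumes the captions are stored as a CSV file with an "image" column and a "caption" column; that layout is an assumption about the Kaggle download, not something this README states, and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def group_captions(csv_file):
    """Map each image filename to the list of its captions."""
    captions = defaultdict(list)
    # Assumption: the header row names the columns "image" and "caption".
    for row in csv.DictReader(csv_file):
        captions[row["image"]].append(row["caption"].strip())
    return dict(captions)


# Made-up rows for illustration only; not taken from the dataset.
sample = io.StringIO(
    "image,caption\n"
    "example_1.jpg,A dog runs across a field .\n"
    "example_1.jpg,\"A brown dog, running outside .\"\n"
    "example_2.jpg,Two children play on a beach .\n"
)
grouped = group_captions(sample)
```

With a real file, pass an open file handle (opened with newline="") instead of the in-memory sample.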

Application pages

Home page:

When you upload an image:

Demo:

demo.mov

Getting started on a local machine:

Clone this repo.
Download my trained model and put it in a new "models" folder within the repo directory.
Run make install.
Open a Python shell and run:

import spacy
from spacy.cli.download import download
download(model="en_core_web_sm")

Run app.py.
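Before running app.py, it can help to confirm that the setup steps above took effect. The following is a hypothetical helper, not part of this repo: check_setup is an invented name, and it only checks that a non-empty "models" folder exists and that the en_core_web_sm package can be found by Python. It does not verify that the model file itself is valid.

```python
import importlib.util
from pathlib import Path


def check_setup(repo_dir="."):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    models_dir = Path(repo_dir) / "models"
    if not models_dir.is_dir():
        problems.append('missing "models" folder')
    elif not any(models_dir.iterdir()):
        problems.append('"models" folder is empty')
    # find_spec returns None when the package is not installed.
    if importlib.util.find_spec("en_core_web_sm") is None:
        problems.append("spaCy model en_core_web_sm is not installed")
    return problems
```

Calling check_setup() from the repo directory and printing the result gives a quick list of anything still missing.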

Testing the model in Colab:

Download the "Image" folder from the dataset above, zip it, and put the zip file in the current Colab working directory.
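The zipping step can also be done from Python with the standard library. This is a sketch under assumptions: the function names are invented for illustration, and the archive name Image.zip is a guess at what the Colab notebook expects. The first function runs where the dataset was downloaded; the second runs in Colab after the zip file has been uploaded.

```python
import shutil
from pathlib import Path


def zip_image_folder(dataset_dir, out_dir):
    """Create <out_dir>/Image.zip containing the "Image" folder."""
    archive_base = Path(out_dir) / "Image"
    # base_dir keeps the top-level "Image/" prefix inside the archive.
    return shutil.make_archive(str(archive_base), "zip",
                               root_dir=str(dataset_dir), base_dir="Image")


def unpack_image_folder(zip_path, work_dir="."):
    """Extract the archive so that <work_dir>/Image/ exists."""
    shutil.unpack_archive(str(zip_path), extract_dir=str(work_dir))
    return Path(work_dir) / "Image"
```

For example, zip_image_folder("flickr8k", ".") would write Image.zip next to a hypothetical "flickr8k" download folder, and unpack_image_folder("Image.zip") would restore the folder in the Colab working directory.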

