
agdanoji/ImageCaptioning


ImageCaptioning

Image captioning can be useful for visually impaired people, such as semi-blind or blind users, if voice output is added to the generated captions. It can also be used in virtual assistants such as Siri or Cortana to help search for images of a particular type, for example: "Show me pictures of myself wearing a blue shirt." There is therefore plenty of motivation and practical use for the image captioning task.

In this project, we implemented three different techniques used for Image Captioning:

  1. CNN-RNN (Google's implementation, used as our baseline)
  2. CNN-BRNN (Deep Visual-Semantic Alignments for Generating Image Descriptions - Andrej Karpathy)
  3. Attention-based model (Show, Attend and Tell)
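The attention-based model differs from the first two in that the decoder attends over a grid of image region features at every step instead of a single global feature vector. The following is a minimal NumPy sketch of that core idea (soft attention via a dot-product score and softmax); the shapes and the plain dot-product scoring function are illustrative assumptions, not the exact formulation used in the paper, which uses a learned MLP to score alignments.

```python
import numpy as np

def soft_attention(features, hidden):
    """Compute a context vector by attending over image region features.

    features: (L, D) array - L image regions, each a D-dim feature vector
    hidden:   (D,)   array - current decoder hidden state
    Returns the (D,) context vector and the (L,) attention weights.
    """
    scores = features @ hidden                 # alignment score per region
    weights = np.exp(scores - scores.max())    # numerically stable softmax
    weights /= weights.sum()                   # weights sum to 1
    context = weights @ features               # weighted average of regions
    return context, weights

# Toy usage: 4 regions with 8-dim features, random decoder state
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8))
h = rng.standard_normal(8)
context, weights = soft_attention(feats, h)
```

At each decoding step the context vector is concatenated with the previous word embedding and fed to the RNN, so the model can "look at" different regions while emitting different words.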

The various evaluation metrics used are:

  1. BLEU (Bilingual Evaluation Understudy) score
  2. METEOR
  3. CIDEr
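To make the first metric concrete, here is a minimal pure-Python sketch of BLEU-1 (unigram precision with clipping and the brevity penalty) against a single reference caption. Real evaluation uses higher-order n-grams and multiple references, so this is an illustrative simplification, not the scorer used in the project.

```python
from collections import Counter
import math

def bleu1(candidate, reference):
    """BLEU-1 for one candidate caption against one reference caption."""
    cand, ref = candidate.split(), reference.split()
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    # Clipped unigram matches: a candidate word counts at most as often
    # as it appears in the reference.
    clipped = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    precision = clipped / len(cand)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision

print(bleu1("a cat on a mat", "a cat on a mat"))  # perfect match -> 1.0
```

METEOR additionally rewards stem/synonym matches and word order, and CIDEr weights n-grams by TF-IDF across the reference set, which is why the three metrics can rank the same captions differently.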

Datasets used:

  1. Flickr8k (8,000 images, about 1 GB)
  2. Flickr30k (31K images, about 6 GB)
  3. MSCOCO (123K images, about 18 GB)
