ImageCaptioning using Token Embeddings and inception_resnet_v2

The main idea behind this research is to improve image captioning results. The change I made is to use BERT token embeddings for the text (captions) and Inception_resnet_v2 for the images; both are implemented. BERTScore [2] is also proposed for evaluating the captions, but it is not included in the implementation.
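Since the BERTScore evaluation is proposed but not implemented here, the following is a minimal sketch of its core scoring step: greedy matching of cosine similarities between candidate and reference token embeddings. It assumes the embeddings have already been computed (for example by BERT). The function names `cosine` and `bert_score_sketch` and the toy 2-dimensional vectors are hypothetical stand-ins, and the sketch leaves out the optional idf weighting and baseline rescaling, so it is not the repository's code or the official scorer.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def bert_score_sketch(candidate_vecs, reference_vecs):
    """Greedy-matching precision, recall and F1 over token embeddings.

    Each argument is a non-empty list of token embedding vectors.
    """
    # sim[i][j] = similarity of candidate token i and reference token j
    sim = [[cosine(c, r) for r in reference_vecs] for c in candidate_vecs]

    # Precision: each candidate token is matched to its most similar reference token.
    precision = sum(max(row) for row in sim) / len(candidate_vecs)

    # Recall: each reference token is matched to its most similar candidate token.
    recall = sum(
        max(sim[i][j] for i in range(len(candidate_vecs)))
        for j in range(len(reference_vecs))
    ) / len(reference_vecs)

    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Toy embeddings (assumption: real ones would come from a BERT model).
candidate = [[1.0, 0.0], [0.0, 1.0]]
reference = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
precision, recall, f1 = bert_score_sketch(candidate, reference)
print(precision, recall, f1)
```

In this toy case every candidate token has an exact match in the reference, so precision is 1, while the third reference token is only partially covered, which lowers recall.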

Dataset: I downloaded it via torrent because it is quite large and the forum provided for downloading it was not working.

Presentation: https://docs.google.com/presentation/d/10kZi5rQVZ-Tkrt5IQh_1J5iZoWbjTUtJMPaCpHAkMlk/edit#slide=id.g57487e59e2_0_26

Documentation: https://docs.google.com/document/d/1W1FD2-nDMSW6x4HnuopuOBJO4qPl3-lSSklyPb5dRiQ/edit

