Image-Captioning-with-Inception-LSTM

An image captioning model that combines deep learning and NLP, using the Flickr8k dataset.

The web interface is built with Gradio (https://gradio.app/) on top of our pretrained model.
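As a rough illustration of how a Gradio front end can wrap a captioning function, here is a minimal sketch. The names `caption_image` and `build_demo` are hypothetical and not taken from this repository, and the caption logic is only a placeholder; the notebooks contain the actual model code.

```python
def caption_image(image):
    """Hypothetical stand-in for the pretrained captioning model."""
    if image is None:
        return "No image provided."
    # Placeholder: replace this with a call into the real model.
    width, height = image.size  # PIL images expose (width, height)
    return f"(placeholder caption for a {width}x{height} image)"


def build_demo():
    """Wrap caption_image in a Gradio interface (requires `pip install gradio`)."""
    import gradio as gr  # imported lazily so the stub above runs without Gradio

    return gr.Interface(
        fn=caption_image,
        inputs=gr.Image(type="pil"),          # hand the function a PIL image
        outputs=gr.Textbox(label="Caption"),  # show the caption as text
        title="Image Captioning",
    )


# To serve the demo locally:
# build_demo().launch()
```

Swapping the placeholder body of `caption_image` for real inference code should leave the rest of the interface unchanged.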

References: https://arxiv.org/abs/1502.03044 and https://www.youtube.com/watch?v=y2BaTt1fxJU (video by Aladdin Persson).

Download the models, weights, and files from: https://drive.google.com/drive/folders/1ThbT5oBHeZ83TyUisJUe9KRyfW2q9aJj

Download the dataset used: https://www.kaggle.com/dataset/e1cd22253a9b23b073794872bf565648ddbe4f17e7fa9e74766ad3707141adeb

Then place the images folder and captions.txt inside the dataset folder, arranged as follows:

flikr8k/images/

flikr8k/captions.txt
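To sanity-check the layout above before running the notebooks, something like the following sketch may help. It assumes captions.txt is a comma-separated file with an `image,caption` header row and one caption per line, which is how Flickr8k is commonly distributed on Kaggle. The helper names `load_captions` and `missing_images` are hypothetical and not part of this repository.

```python
import csv
from collections import defaultdict
from pathlib import Path


def load_captions(captions_path):
    """Map each image filename to its list of captions."""
    captions = defaultdict(list)
    with open(captions_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the assumed "image,caption" header row
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or malformed lines
            name, *rest = row
            # Rejoin in case an unquoted caption itself contained commas.
            captions[name].append(",".join(rest).strip())
    return dict(captions)


def missing_images(captions, images_dir):
    """Return filenames listed in captions.txt but absent from images_dir."""
    images_dir = Path(images_dir)
    return sorted(name for name in captions if not (images_dir / name).is_file())
```

For example, `missing_images(load_captions("flikr8k/captions.txt"), "flikr8k/images")` should return an empty list when every captioned image is present.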

Steps to run:

Run webcam_test.ipynb to caption images from the webcam, or test.ipynb to upload an image and caption it.

Working:

Upload any image:

[Screenshot: Gradio running the image captioning code on an uploaded image]

Use Webcam:

[Screenshot: Gradio running the image captioning code with webcam input]