METGen: A Multimodal Emotion Generation Framework

We present an emotion-based image captioning pipeline built on a transformer architecture and contrast it with an RNN-based image captioning baseline. We also run experiments with intermediate fine-tuning and back-translation. Finally, we develop a rigorous evaluation scheme comprising human evaluation, MAUVE score, and classification-based evaluation. We measure all our explorations against this scheme and highlight the shortcomings both qualitatively and quantitatively.
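As a rough illustration of what a classification-based evaluation can look like, the sketch below scores a batch of generated captions by the fraction that a classifier assigns to the target emotion. This is an assumption about the general idea, not the metric implemented in this repository: `emotion_hit_rate` and `toy_classifier` are hypothetical names, and the keyword-based classifier is only a stand-in for a trained emotion classifier.

```python
from typing import Callable, Iterable


def emotion_hit_rate(
    captions: Iterable[str],
    classify: Callable[[str], str],
    target: str,
) -> float:
    """Return the fraction of captions that `classify` labels as `target`.

    Hypothetical helper: the repository's actual evaluation may differ.
    """
    captions = list(captions)
    if not captions:
        raise ValueError("at least one caption is required")
    hits = sum(1 for caption in captions if classify(caption) == target)
    return hits / len(captions)


def toy_classifier(caption: str) -> str:
    """Stand-in classifier (assumption): keyword match instead of a trained model."""
    positive_words = {"happy", "joyful", "lovely", "beautiful"}
    tokens = set(caption.lower().split())
    return "positive" if tokens & positive_words else "neutral"


if __name__ == "__main__":
    generated = ["a happy dog runs", "a dog on grass"]
    print(emotion_hit_rate(generated, toy_classifier, "positive"))
```

In practice `classify` would wrap a trained text classifier, so the same function can score captions for either the "positive" or the "sarcasm" setting by changing `target`.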

Directory Structure

catr: Contains working IPYNB notebooks for fine-tuning the captioning transformer model proposed by CATR. It also contains the fine-tuned models obtained with the optimal hyperparameters. Refer to the README file in the directory for details on how the internal files are used.

baseline: Contains the baseline architecture code for training and for generating captions from an input image. The baseline is an RNN-based image captioning module. Refer to the README file in the directory for details on how the internal files are used.

emogen: Contains the image data and annotations used to fine-tune CATR to generate "positive" emotion captions for an input image. Refer to the README file in the directory for details on how the internal files are used.

sarcasm: Contains the image data and annotations used to fine-tune CATR to generate "sarcasm" emotion captions for an input image. Refer to the README file in the directory for details on how the internal files are used.

Fine-Tuning and Predicting Captions

Refer to the catr directory to see how the base model is fine-tuned and how captions are predicted.

