This repository contains the SPO (subject-predicate-object) triples extracted from the MSCOCO validation dataset.
spo_raw.json
contains the image ID and caption together with the extracted triples.
mscoco_spo.csv
contains the image ID and caption together with the extracted triples, in CSV format.
my_verb_conjugations.csv
contains conjugations for the verbs that appear in the extracted triples.
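The exact field and column names are not documented here, so a quick way to get oriented is to inspect the files before writing any parsing code. The following is a minimal sketch in Python, not part of the repository: the helper names `peek_csv` and `peek_json` are hypothetical, and it assumes the CSV files start with a header row and that all files are UTF-8 encoded.

```python
import csv
import json
from itertools import islice


def peek_csv(path, n=3):
    """Return (column names, first n rows) of a CSV file.

    Assumption: the first row of the file is a header row.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        rows = list(islice(reader, n))
        return reader.fieldnames, rows


def peek_json(path, n=3):
    """Load a JSON file and return up to n sample entries from its top level.

    No schema is assumed: a top-level object yields (key, value) pairs,
    a top-level array yields its first n elements.
    """
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if isinstance(data, dict):
        return list(islice(data.items(), n))
    if isinstance(data, list):
        return data[:n]
    return [data]


# Example usage (run from the repository root):
#   columns, rows = peek_csv("mscoco_spo.csv")
#   print(columns)
#   print(peek_json("spo_raw.json", n=1))
```

Printing the column names and a single JSON entry first shows how the image ID, caption, and triples are actually laid out, which avoids hard-coding guesses about the schema.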
If you use these annotations, please cite our paper:
@INPROCEEDINGS{8695365,
author={P. {Harzig} and D. {Zecha} and R. {Lienhart} and C. {Kaiser} and R. {Schallner}},
booktitle={2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)},
title={Image Captioning with Clause-Focused Metrics in a Multi-modal Setting for Marketing},
year={2019},
volume={},
number={},
pages={419-424},
keywords={computer vision;image annotation;learning (artificial intelligence);marketing data processing;neural nets;image ratings;image captioning datasets;multimodal setting;computer vision;branded product;human-product interaction;deep neural network architecture;multitask learning setting;product name;evaluation criteria;clause-focused metrics;annotator disagreements;soft targets;marketing;Measurement;Task analysis;Decoding;Computer vision;Semantics;Companies;Face;image captioning;multi-task learning;marketing analysis;lstm;multi-modal learning},
doi={10.1109/MIPR.2019.00085},
ISSN={},
month={March},}