owanr/annotation_datasets
| Year | Reference | Task | Dataset | Has indiv. attributes¹ | # annots/instance | # rows | Score | Metric |
|------|-----------|------|---------|------------------------|-------------------|--------|-------|--------|
| 2022 | ArMIS: The Arabic Misogyny and Sexism Corpus with Annotator Subjective Disagreements (Dina Almanea, Massimo Poesio) | Hate speech identification | ArMIS | 👨‍👩‍👧‍👦 | 3 | 964 | 0.525 | Fleiss' Kappa |
| 2021 | Whose Opinions Matter? Perspective-aware Models to Identify Opinions of Hate Speech Victims in Abusive Language Detection (Sohail Akhtar, Valerio Basile, Viviana Patti) | Hate speech identification | HS-Brexit | 👨‍👩‍👧‍👦 | 6 | 1120 | 0.35 | Fleiss' Kappa² |
| 2021 | ConvAbuse: Data, Analysis, and Benchmarks for Nuanced Abuse Detection in Conversational AI (Amanda Cercas Curry, Gavin Abercrombie, Verena Rieser) | Hate speech identification | ConvAbuse | 👨‍👩‍👧‍👦 | 3-8 | 4185 | 0.69 | Alpha |
| 2021 | Agreeing to Disagree: Annotating Offensive Language Datasets with Annotators' Disagreement (Elisa Leonardelli, Stefano Menini, Alessio Palmero Aprosio, Marco Guerini, Sara Tonelli) | Hate speech identification | MD-Agreement | 👨‍👩‍👧‍👦 | 5 | 10K | 71.172 | Percent agreement⁴ |
| 2021 | Designing Toxic Content Classification for a Diversity of Perspectives (Deepak Kumar, Patrick Gage Kelley, Sunny Consolvo, Joshua Mason, Elie Bursztein, Zakir Durumeric, Kurt Thomas, Michael Bailey) (jury learning) | Hate speech identification | Dataset □ | 👨‍👩‍👧‍👦 | 5 | 107,620 | 65.2-90% | Percent agreement² |
| 2021 | Did they answer? Subjective acts and intents in conversational discourse (Elisa Ferracane, Greg Durrett, Junyi Jessy Li, Katrin Erk) | Sentiment analysis + intent classification | Dataset | 👨‍👩‍👧‍👦 | 3-7 | 1K | Overall: 0.494; conversation act: 0.652; intent: 0.376 | Alpha |
| 2020 | On Faithfulness and Factuality in Abstractive Summarization (Joshua Maynez, Shashi Narayan, Bernd Bohnet, Ryan McDonald) | Hallucination classification | Dataset | 👨‍👩‍👧‍👦 | 3 | | 0.61-0.80 | Fleiss' Kappa² |
| 2020 | On Faithfulness and Factuality in Abstractive Summarization (Joshua Maynez, Shashi Narayan, Bernd Bohnet, Ryan McDonald) | Factuality classification | Dataset | 👨‍👩‍👧‍👦 | 3 | | 0.81-1.00 | Fleiss' Kappa² |
| 2019 | Understanding Discourse on Work and Job-Related Well-Being in Public Social Media (Tong Liu, Christopher Homan, Cecilia Ovesdotter Alm, Megan Lytle, Ann Marie White, Henry Kautz) | | Dataset | 👨‍👩‍👧‍👦 | | | | |
| 2019 | Learning to Predict Population-Level Label Distributions (Tong Liu, Akash Venkatachalam, Pratik Sanjay Bongale, Christopher M. Homan) | | Dataset | 👨‍👩‍👧‍👦 | | | | |
| 2018 | Introducing the Gab Hate Corpus: Defining and Applying Hate-Based Rhetoric to Social Media Posts at Scale (Brendan Kennedy, Mohammad Atari, Aida M. Davani, Leigh Yeh, Ali Omrani, Yehsong Kim, Kris Coombs, et al.) | Hate speech identification | Dataset | 👨‍👩‍👧‍👦 | | | | |
| 2018 | Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization (Shashi Narayan, Shay B. Cohen, Mirella Lapata) | | Dataset | 👨‍👩‍👧‍👦 | | | | |
| 2018 | Quantifying Qualitative Data for Understanding Controversial Issues () | | | | | | | |
| 2018 | Addressing Age-Related Bias in Sentiment Analysis (Mark Diaz, Isaac Johnson, Amanda Lazar, Anne Marie Piper, Darren Gergle) | Sentiment analysis | Dataset | | | | | |
| 2018 | A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference (Adina Williams, Nikita Nangia, Samuel Bowman) | Multi-genre NLI | Dataset | 👨‍👩‍👧‍👦 | | | | |
| 2015 | A large annotated corpus for learning natural language inference (Samuel R. Bowman, Gabor Angeli, Christopher Potts, Christopher D. Manning) | NLI | Dataset | 👨‍👩‍👧‍👦 | | | | |
| 2014 | Lexical Acquisition for Opinion Inference: A Sense-Level Lexicon of Benefactive and Malefactive Events (Yoonjung Choi, Lingjia Deng, Janyce Wiebe) | WSD for sentiment analysis | Dataset △ | | | | 0.84 | Percent agreement |
| ″ | ″ | ″ | ″ | ″ | ″ | ″ | 0.75 | Kappa |

- The order of datasets within each year is random.

¹ If the 👨‍👩‍👧‍👦 emoji is absent, the dataset contains only aggregate-level annotations; if present, it also contains annotator-level data. × = no data released; □ = data can be obtained by contacting the authors.

² 📝 = contains annotator instructions; 👨‍👩‍👧‍👦 = has annotator-level data; 💻 = the data itself is crowdsourced, not just the annotations (e.g., MTurk workers wrote the sentence pairs rather than the text being scraped from the web).

³ Refer to the paper for details; inter-rater agreement is reported per subset and/or with several metrics.

⁴ Calculated using information given in the paper/dataset.
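
For the scores marked with footnote 4 (recomputed from annotator-level data), the most common coefficient in the table is Fleiss' kappa. Below is a minimal sketch of that computation, assuming labels have already been tallied into an items × categories count matrix with the same number of annotators per item; the `fleiss_kappa` helper and the toy data are illustrative, not code from this repository.

```python
import numpy as np

def fleiss_kappa(counts: np.ndarray) -> float:
    """Fleiss' kappa from a (n_items, n_categories) count matrix.

    counts[i, j] = number of annotators who put item i in category j.
    Assumes a constant number of annotators per item (equal row sums).
    """
    n = counts.sum(axis=1)[0]  # annotators per item
    # Observed agreement: fraction of agreeing annotator pairs per item.
    p_i = (counts * (counts - 1)).sum(axis=1) / (n * (n - 1))
    p_bar = p_i.mean()
    # Expected (chance) agreement from marginal category proportions.
    p_j = counts.sum(axis=0) / counts.sum()
    p_e = (p_j ** 2).sum()
    return (p_bar - p_e) / (1 - p_e)

# Toy check: 4 items, 2 categories, 3 annotators each (as in ArMIS' 3/instance).
counts = np.array([[3, 0], [2, 1], [1, 2], [0, 3]])
print(round(fleiss_kappa(counts), 3))  # ~0.333 on this toy data
```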

Papers about agreement-calculation methods (TODO: organize this into a table)
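
Related to these calculation methods: some datasets above (e.g., ConvAbuse with 3-8 annotators per instance) have a varying number of annotators per item, where Fleiss' kappa does not apply directly; the "Alpha" entries are presumably Krippendorff's alpha, which tolerates missing annotators. A minimal sketch for nominal labels, assuming each item's labels are collected into a list; the function name and toy data are illustrative, not from this repository.

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units: list[list[str]]) -> float:
    """Krippendorff's alpha for nominal labels.

    `units` holds one label list per item; lengths may differ because
    annotators may skip items. Items with fewer than 2 labels are ignored.
    """
    o = Counter()  # coincidence matrix, keyed by (label, label)
    for labels in units:
        m = len(labels)
        if m < 2:
            continue
        for a, b in permutations(range(m), 2):
            o[(labels[a], labels[b])] += 1.0 / (m - 1)
    n_c = Counter()  # marginal label frequencies
    for (c, _k), w in o.items():
        n_c[c] += w
    n = sum(n_c.values())
    d_o = sum(w for (c, k), w in o.items() if c != k)  # observed disagreement
    d_e = sum(n_c[c] * n_c[k] for c, k in permutations(n_c, 2)) / (n - 1)
    return 1.0 - d_o / d_e

# Toy usage: 3 items with variable numbers of annotators.
units = [
    ["abusive", "abusive", "not"],
    ["not", "not", "not", "not"],
    ["abusive", "abusive"],
]
print(round(krippendorff_alpha_nominal(units), 3))
```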

Not label annotations, but contains input from individual annotators (TODO: organize into a table)
