This project runs experiments comparing the benefit of soft labeling and low-agreement filtering against label aggregation for learning classification models on natural language tasks. It is the experiment code described in the paper "Noise or additional information? Leveraging crowdsource annotation item agreement for natural language tasks" (Jamison and Gurevych, 2015).
Please use the following citation:
@inproceedings{TUD-CS-2015179,
  author    = {Emily Jamison and Iryna Gurevych},
  title     = {Noise or additional information? Leveraging crowdsource annotation
               item agreement for natural language tasks},
  month     = sep,
  year      = {2015},
  publisher = {Association for Computational Linguistics},
  booktitle = {Proceedings of the 2015 Conference on Empirical Methods in
               Natural Language Processing (EMNLP)},
  pages     = {291--297},
  address   = {Lisbon, Portugal},
  pubkey    = {TUD-CS-2015-1179},
  research_area = {Ubiquitous Knowledge Processing},
  research_sub_area = {UKP_reviewed},
  url       = {https://aclweb.org/anthology/D/D15/D15-1035.pdf}
}
Abstract: In order to reduce noise in training data, most natural language crowdsourcing annotation tasks gather redundant labels and aggregate them into an integrated label, which is provided to the classifier. However, aggregation discards potentially useful information from linguistically ambiguous instances. For five natural language tasks, we pass item agreement on to the task classifier via soft labeling and low-agreement filtering of the training dataset. We find a statistically significant benefit from low item agreement training filtering in four of our five tasks, and no systematic benefit from soft labeling.
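To make the two strategies concrete, here is a toy sketch (not code from this repository; the instance format, names, and the 0.8 threshold are invented for illustration):

```groovy
// Each training instance carries the fraction of annotators who chose
// its majority label ("item agreement").
class Instance {
    String text
    String majorityLabel
    double itemAgreement   // e.g. 0.6 if 3 of 5 annotators agreed
}

def training = [
        new Instance(text: 'ex1', majorityLabel: 'pos', itemAgreement: 1.0),
        new Instance(text: 'ex2', majorityLabel: 'neg', itemAgreement: 0.6)
]

// Soft labeling: keep every instance and pass the agreement on to the
// learner, e.g. as an instance weight.
def softLabeled = training.collect { [instance: it, weight: it.itemAgreement] }

// Low-agreement filtering: drop ambiguous instances below a threshold.
def filtered = training.findAll { it.itemAgreement >= 0.8 }

assert softLabeled.size() == 2   // nothing discarded
assert filtered.size() == 1      // the ambiguous instance is dropped
```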
Contact person: Emily Jamison, EmilyKJamison {at} gmail {dot} com
http://www.ukp.tu-darmstadt.de/
Don't hesitate to send us an e-mail or report an issue if something is broken (and it shouldn't be) or if you have further questions.
This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.
src/main/groovy/de/tudarmstadt/ukp/experiments/ej/repeatwithcrowdsource
-- this folder contains the Java experiment code for the 5 natural language tasks

src/main/resources/scripts
-- this folder contains the Groovy files where experiment parameters may be set

Please note: 3rd-party datasets must be downloaded from elsewhere.
- Java 1.7 or higher
- Maven
- Tested on 64-bit Linux with 2 GB RAM (-Xmx2g)
- Follow the DKPro Core instructions to set your DKPro Home environment variable.
- All dependencies are available from Maven Central; no 3rd-party projects need to be installed.
- You will need to obtain the necessary corpora for the respective experiment you plan to run. Corpora and their locations are described in (Jamison & Gurevych 2015), cited above.
- For all experiments except Affective Text, prepare your corpus for our experiment architecture by dividing instances into cross-validation rounds of training and test data (a sketch of one way to do this follows the file listing below). We created "dev" and "final" batches of "train" and "test" datasets, resulting in (for RTE):
rte_orig.r0.devtest.txt
rte_orig.r0.devtrain.txt
rte_orig.r0.finaltest.txt
rte_orig.r0.finaltrain.txt
rte_orig.r1.devtest.txt
rte_orig.r1.devtrain.txt
rte_orig.r1.finaltest.txt
rte_orig.r1.finaltrain.txt
etc.
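How you produce these files is up to you. The following is a minimal sketch (not part of this repository), assuming one instance per line; the fold count, source file name, and the decision to write the "dev" and "final" batches from the same shuffle are all illustrative:

```groovy
// Split a corpus file into per-round train/test files named like the
// listing above. Assumes one instance per line (adapt to your format).
def lines = new File('rte_orig.txt').readLines()
Collections.shuffle(lines, new Random(42))   // fixed seed for reproducibility

int folds = 10   // illustrative; use as many rounds as you need
def chunks = lines.collate((int) Math.ceil(lines.size() / (double) folds))

['dev', 'final'].each { batch ->
    chunks.eachWithIndex { testChunk, r ->
        def trainLines = chunks.findAll { !it.is(testChunk) }.flatten()
        new File("rte_orig.r${r}.${batch}test.txt").text = testChunk.join('\n')
        new File("rte_orig.r${r}.${batch}train.txt").text = trainLines.join('\n')
    }
}
```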
For each experiment, update file locations in the Groovy file in src/main/resources/scripts (such as in the method runManualCVExperiment()).
To run an experiment, first set the experiment parameters in the respective Groovy file in src/main/resources/scripts; in particular, you may wish to change the path to your corpus, the parameter instanceModeTrain, the feature set, or the feature parameters.
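The variable names differ from script to script; the snippet below only illustrates the kind of edits meant here, with placeholder values (check the actual Groovy file for the real names and admissible values):

```groovy
// Illustrative placeholders only -- not copied from the scripts.
def corpusFilePathTrain = '/path/to/corpus/rte_orig.r0.devtrain.txt'
def instanceModeTrain   = 'soft'       // hypothetical value; see the script for options
def featureSet          = ['ngrams']   // hypothetical feature name
```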
Then, run the respective "RunXXXExperiment" class in src/main/groovy/EXPERIMENTTORUN/. For example, to run the Biased Language experiment, run the class src/main/groovy/biasedlanguage/RunBiasedLangExperiment.
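If the POM is set up for the exec-maven-plugin (an assumption; you can equally run the class from your IDE), an invocation along the lines of mvn compile exec:java -Dexec.mainClass=de.tudarmstadt.ukp.experiments.ej.repeatwithcrowdsource.biasedlanguage.RunBiasedLangExperiment should work; verify the fully qualified class name first.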
Affective Text experiments run in a few seconds, while POS Tagging experiments may take several hours.
After running the experiments, results are printed to stdout. They can also be found in your DKPro Home folder, under de.tudarmstadt.ukp.dkpro.lab/repository. You can change which results get printed in src/main/groovy/util/CombineTestResultsRegression or CombineTestResultsClassification, as appropriate: the Biased Language and Affective Text tasks use regression, while Stemming, RTE, and POS Tagging use classification.