gabrielStanovsky/template-oie

Templated Open Information Extraction

Introduction

Extracts templated Open Information Extraction (Open IE) propositions, allowing for diverse, non-contiguous, multi-word predicates while keeping the arguments short and useful for downstream applications.

For example, given the sentence:

Under the agreement with the House and Senate leaders , the minimum wage would rise from the current $ 3.35 an hour to $ 4.25 an hour.

One of the extractions is:

  • Under {A0} {A1} would rise from {A2} to {A3}

      A0:	the agreement
      A1:	the minimum wage
      A2:	$ 3.35 an hour
      A3:	$ 4.25 an hour
    

Note that the head of the predicate (rise) is also identified.
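To illustrate how a template relates to its arguments, the sketch below substitutes the argument values from the example above back into the `{A0}` ... `{A3}` slots. The helper name `fill_template` is hypothetical and not part of template-oie, and the sketch assumes the template text contains no literal braces other than the argument slots.

```python
def fill_template(template, arguments):
    """Substitute each {A<i>} slot in the template with its argument text.

    `fill_template` is a hypothetical helper for illustration only;
    it is not a function provided by template-oie.
    """
    return template.format(**arguments)


# Template and arguments copied from the example extraction above.
template = "Under {A0} {A1} would rise from {A2} to {A3}"
arguments = {
    "A0": "the agreement",
    "A1": "the minimum wage",
    "A2": "$ 3.35 an hour",
    "A3": "$ 4.25 an hour",
}

proposition = fill_template(template, arguments)
print(proposition)
```

The filled-in string is a shortened paraphrase of the sentence: material outside the template and its arguments (here, "with the House and Senate leaders" and "the current") is dropped.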

Prerequisites

  1. Python 2.7
  2. pip 9.x

Installation

  1. Install required packages:
     pip install -r ./requirements.txt
  2. Download spaCy English models:
     python -m spacy download en

Running

Usage:
    prop_extraction --in=INPUT_FILE --out=OUTPUT_FILE [--id]

Extract propositions from a given input file; output is written to a separate output file.
If both the in and out parameters are directories, the script will iterate over all *.txt files in the input directory and
output to *.prop files in the output directory.

Options:
   --in=INPUT_FILE      The input file, each sentence on a separate line
   --out=OUTPUT_FILE    The output file, each extraction on a tab-separated line consisting of the original sentence,
   predicate template, lemmatized predicate template, argument name, argument value, ...
   --id                 Indicate that the input file is composed of sentence-id \t sentence, and copy this id to the output.
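As a rough sketch of the input layout expected with `--id`, the snippet below builds `sentence-id \t sentence` lines, one per sentence. The helper `format_id_line` and the ids `s1` and `s2` are made up for illustration; rejecting tabs and newlines inside a sentence is an assumption about what keeps the file parseable, not documented behavior of the tool.

```python
def format_id_line(sent_id, sentence):
    """Return one 'sentence-id<TAB>sentence' line for an --id input file.

    Hypothetical helper, not part of template-oie. Tabs and newlines in
    either field are rejected because they would break the one-sentence-
    per-line, tab-separated layout (an assumption of this sketch).
    """
    for field in (str(sent_id), sentence):
        if "\t" in field or "\n" in field:
            raise ValueError("fields must not contain tabs or newlines")
    return "{}\t{}".format(sent_id, sentence)


# Illustrative ids; any unique identifiers should work the same way.
sentences = [
    ("s1", "The minimum wage would rise to $ 4.25 an hour ."),
    ("s2", "The House and Senate leaders reached an agreement ."),
]
input_text = "\n".join(format_id_line(i, s) for i, s in sentences) + "\n"
print(input_text)
```

Writing `input_text` to a file and passing that file as `--in` together with `--id` should then carry each id through to the output, as described in the options above.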

Output format

Each extraction is presented on a tab-separated line, consisting of:

  1. Original sentence
    Words are separated by a single space, chunks by double spaces.
  2. Index of the main predicate's chunk
  3. Predicate template
  4. Lemmatized predicate
  5. Argument1 name
  6. Argument1 value
  7. ...
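The field layout above can be read back with a few lines of Python. This is a minimal sketch, assuming the file was produced without `--id` (so the sentence is the first field) and that argument names and values alternate after the fourth field. The function name `parse_extraction` is hypothetical, and the sample line is hand-written for illustration rather than real tool output. Whether the chunk index is 0-based or 1-based is not stated here, so the sketch returns it as-is without resolving it to a chunk.

```python
def parse_extraction(line):
    """Parse one tab-separated extraction line into a dictionary.

    Hypothetical helper; assumes the layout: sentence, predicate chunk
    index, predicate template, lemmatized predicate, then alternating
    argument names and values.
    """
    fields = line.rstrip("\n").split("\t")
    sentence, pred_index, template, lemma = fields[:4]
    rest = fields[4:]
    return {
        # Chunks are separated by double spaces, words by single spaces.
        "chunks": [chunk.split(" ") for chunk in sentence.split("  ")],
        "predicate_chunk_index": int(pred_index),
        "template": template,
        "lemmatized_predicate": lemma,
        # Pair each argument name with the value that follows it.
        "arguments": dict(zip(rest[0::2], rest[1::2])),
    }


# Hand-written sample line (not actual output of the tool).
sample = "\t".join([
    "The minimum wage  would rise  to $ 4.25",
    "1",
    "{A0} would rise to {A1}",
    "{A0} would rise to {A1}",
    "A0", "The minimum wage",
    "A1", "$ 4.25",
])
parsed = parse_extraction(sample)
print(parsed["arguments"])
```

For files produced with `--id`, the copied id would need to be split off first; its position in the output line is not specified above, so that step is left out of the sketch.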

Examples

See the examples folder for the output over more than 3K sentences from the news domain.

python ./prop_extraction.py --in=../examples/sentences.txt --out=../examples/sentences.prop

Users of template-oie

The following projects make use of template-oie:
