module__YateaExtractor
#org.bibliome.alvisnlp.modules.yatea.YateaExtractor
Extract terms from the corpus using the YaTeA term extractor.
org.bibliome.alvisnlp.modules.yatea.YateaExtractor hands the corpus to the YaTeA extractor. The corpus is first written to a file in the YaTeA input format. Tokens are the annotations in the layer wordLayerName; their surface form, POS tag and lemma are taken from the features formFeature, posFeature and lemmaFeature respectively. If sentenceLayerName is set, an additional SENT marker is added to reinforce the sentence boundaries corresponding to the annotations in this layer.
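As a rough illustration of the token serialization described above, the following Python sketch writes one token per line with its form, POS tag and lemma, and appends a marker after each sentence. The tab-separated three-column layout and the exact marker line (`.`, `SENT`, `.`) are assumptions made for this example, not taken from the module's source; `format_corpus` and `Token` are hypothetical names.

```python
from typing import Iterable, Optional, Tuple

# (form, POS tag, lemma): the values the module reads from
# formFeature, posFeature and lemmaFeature.
Token = Tuple[str, str, str]


def format_corpus(
    sentences: Iterable[Iterable[Token]],
    sent_marker: Optional[str] = ".\tSENT\t.",  # assumed marker line
) -> str:
    """Serialize sentences as one tab-separated token per line.

    If sent_marker is not None, it is written after each sentence,
    mimicking the effect of setting sentenceLayerName.
    """
    lines = []
    for sentence in sentences:
        for form, pos, lemma in sentence:
            lines.append(f"{form}\t{pos}\t{lemma}")
        if sent_marker is not None:
            lines.append(sent_marker)
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    corpus = [[("cells", "NNS", "cell"), ("divide", "VBP", "divide")]]
    print(format_corpus(corpus), end="")
```

Passing `sent_marker=None` corresponds to leaving sentenceLayerName unset: only the token lines are written.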
YaTeA is called using the executable set in yateaExecutable. It runs as if it were called from the directory workingDir, and the result is written to the subdirectory named corpusName.
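The launch behaviour can be pictured with a small Python sketch: an external program is started with its current directory set to the working directory and, optionally, with PERLLIB set in its environment. This is only an approximation of what the module does internally; `run_in_working_dir` is a hypothetical helper, and the actual YaTeA command-line arguments are deliberately not shown.

```python
import os
import subprocess
import sys
from typing import Optional, Sequence


def run_in_working_dir(
    executable: str,
    args: Sequence[str],
    working_dir: str,
    perl_lib: Optional[str] = None,
) -> subprocess.CompletedProcess:
    """Run an executable as if it were called from working_dir.

    If perl_lib is given, it becomes the value of PERLLIB in the
    child process environment; the parent environment is untouched.
    """
    env = dict(os.environ)
    if perl_lib is not None:
        env["PERLLIB"] = perl_lib
    return subprocess.run(
        [executable, *args],
        cwd=working_dir,       # plays the role of workingDir
        env=env,
        capture_output=True,
        text=True,
        check=True,            # raise if the program exits with an error
    )


if __name__ == "__main__":
    # The Python interpreter stands in for the YaTeA executable here.
    result = run_in_working_dir(
        sys.executable, ["-c", "import os; print(os.getcwd())"], os.getcwd()
    )
    print(result.stdout, end="")
```

Because relative output paths are resolved against the child's current directory, running from workingDir is what makes the result land in a subdirectory of it.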
Optional
Type: SourceStream
Path to the YaTeA configuration file.
Optional
Type: WorkingDirectory
Path to the directory where YaTeA is launched.
Optional
Type: ExecutableFile
Path to the YaTeA executable file.
Optional
Type: InputDirectory
Optional
Type: String
Optional
Type: InputDirectory
Optional
Type: OutputDirectory
Optional
Type: String
Value of the PERLLIB variable in the environment of the YaTeA binary.
Optional
Type: InputFile
BioYaTeA option: path to the post-processing options file.
Optional
Type: OutputFile
BioYaTeA option: path to the result file after post-processing.
Optional
Type: String
Optional
Type: TestifiedTerminology
Default value: false
Type: Boolean
Default value: true
Type: Expression
Only process documents that satisfy this filter.
Default value: true
Type: Boolean
Whether to write DOCUMENT special tokens. Not every YaTeA version accepts them.
Default value: form
Type: String
Feature containing the word form.
Default value: lemma
Type: String
Feature containing the word lemma.
Default value: pos
Type: String
Feature containing the word POS tag.
Default value: boolean:and(true, nav:layer:words())
Type: Expression
Process only sections that satisfy this filter.
Default value: sentences
Type: String
Name of the layer containing sentence annotations; sentence boundaries are reinforced.
Default value: words
Type: String
Name of the layer containing the word annotations.
Default value: {}
Type: Mapping
Default value: {}
Type: Mapping