
Ideas to improve

Pablo Gamallo edited this page Sep 24, 2020 · 1 revision

Ambiguity involving splitting

correos in Spanish: either verb + pronoun (corred + os) or a noun.

desse in Portuguese: either preposition + demonstrative (de + esse) or a verb.

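Solution: add a warning in splitter_exe.perl so that these forms are not split before tagging, and add a post-rule in tagger_exe.perl that splits them afterwards, depending on the tag they received.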

The rules could be:

If "correos" is found and was not tagged as a noun, split it into: corred VERB / os PROUN

If "desse" is found and was not tagged as a verb, split it into: de PRP / esse DET or PROUN??? (this may not be the best solution)
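
The two rules above could be sketched roughly as follows. This is only an illustration in Python, not the Perl code of tagger_exe.perl: the (form, tag) token format, the SPLIT_RULES table, the function name apply_split_post_rules and the NOUN tag name are all assumptions made for the example, and DET is chosen arbitrarily for "esse" because the DET/PROUN question is still open. A real implementation would presumably keep one rule table per language.

```python
from typing import Dict, List, Tuple

# Hypothetical token format: (word form, tag). The real tagger output may differ.
Token = Tuple[str, str]

# Hypothetical rule table: form -> (tag that blocks splitting, replacement tokens).
# Tag names are assumptions; "esse" as DET is an arbitrary choice (could be PROUN).
SPLIT_RULES: Dict[str, Tuple[str, List[Token]]] = {
    "correos": ("NOUN", [("corred", "VERB"), ("os", "PROUN")]),
    "desse": ("VERB", [("de", "PRP"), ("esse", "DET")]),
}


def apply_split_post_rules(tokens: List[Token]) -> List[Token]:
    """Split ambiguous forms after tagging, unless the tagger chose the unsplit reading."""
    result: List[Token] = []
    for form, tag in tokens:
        rule = SPLIT_RULES.get(form.lower())
        if rule is not None and tag != rule[0]:
            # The form was not tagged with the blocking tag: replace it by its parts.
            result.extend(rule[1])
        else:
            # No rule applies, or the unsplit reading (noun / verb) was chosen.
            result.append((form, tag))
    return result
```

With this sketch, a "correos" token tagged as a noun is left untouched, while the same form with any other tag is replaced by the two tokens corred/VERB and os/PROUN.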
