
Project to generate POS tag dictionary for Ukrainian language


drupal-ukraine/dict_uk

 
 


This is a project to generate a POS (part-of-speech) tag dictionary for the Ukrainian language.

Description:

dict_uk/expand/expand_all.py -aff data/affix -dict data/dict

For every file in data/dict, the project generates all possible word forms with POS tags
by applying the affix rules from the files in data/affix.
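
The expansion idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `AffixRule` structure, the `expand` function and the tag strings are hypothetical and do not reflect the actual file format in data/affix or the code in expand_all.py.

```python
from typing import List, NamedTuple, Tuple


class AffixRule(NamedTuple):
    """Hypothetical rule: strip a suffix from the lemma, add another, attach a tag."""
    strip: str
    add: str
    tag: str


def expand(lemma: str, rules: List[AffixRule]) -> List[Tuple[str, str, str]]:
    """Return (word_form, lemma, tag) for every rule whose suffix matches the lemma."""
    forms = []
    for rule in rules:
        # Skip rules whose suffix to strip is not present on the lemma.
        if rule.strip and not lemma.endswith(rule.strip):
            continue
        stem = lemma[: len(lemma) - len(rule.strip)]
        forms.append((stem + rule.add, lemma, rule.tag))
    return forms


# Illustrative rules only; the tag names are made up for this sketch.
NOUN_RULES = [
    AffixRule("а", "а", "noun:f:v_naz"),
    AffixRule("а", "и", "noun:f:v_rod"),
    AffixRule("а", "у", "noun:f:v_zna"),
]

for form, lemma, tag in expand("мова", NOUN_RULES):
    print(form, lemma, tag)
```

Each lemma is paired with a set of rules, and every matching rule yields one tagged word form. The real affix files most likely carry more conditions than a plain suffix match.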

How to run:

# dict_uk/expand/expand_all.py -aff data/affix -dict data/dict -corp -indent -mfl -wordlist
Output:

    * dict_corp_vis.txt - Dictionary in visual (indented) format for review, analysis or conversion
    * dict_corp_lt.txt - Dictionary for LT (LanguageTool) for annotating the corpus
    * words.txt, lemmas.txt, tags.txt - lists of all unique words, lemmas and tags
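
As a rough illustration of what the unique word, lemma and tag lists contain, the sketch below collects them from tagged entries. It assumes each entry is a whitespace-separated `word lemma tag` triple. That layout is an assumption, not the documented format of dict_corp_lt.txt, and `collect_unique` is a hypothetical helper that is not part of this project.

```python
from typing import Iterable, List, Tuple


def collect_unique(lines: Iterable[str]) -> Tuple[List[str], List[str], List[str]]:
    """Collect sorted unique words, lemmas and tags from 'word lemma tag' lines."""
    words, lemmas, tags = set(), set(), set()
    for line in lines:
        parts = line.split()
        # Ignore blank or malformed lines instead of guessing their meaning.
        if len(parts) != 3:
            continue
        word, lemma, tag = parts
        words.add(word)
        lemmas.add(lemma)
        tags.add(tag)
    return sorted(words), sorted(lemmas), sorted(tags)


# Made-up sample entries; tag names are illustrative only.
sample = [
    "мова мова noun:f:v_naz",
    "мови мова noun:f:v_rod",
    "мову мова noun:f:v_zna",
    "",
]

words, lemmas, tags = collect_unique(sample)
print(words)
print(lemmas)
print(tags)
```

Using sets removes duplicates, and sorting gives a stable order, which makes the resulting lists easier to compare between runs.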

# dict_uk/expand/expand_all.py -aff data/affix -dict data/dict
Output:

    * dict_rules_lt.txt - Dictionary file for LT (LanguageTool), used for checking grammar rules
