ProSub is a collection of datasets and corpus annotations dealing with pronoun substitutes and related linguistic categories (personal pronouns, honorific titles, address terms). Pronoun substitutes are non-pronominal expressions (e.g. 'mother', 'aunt', 'teacher') used to refer to the speaker and the addressee, thus functioning like 1st and 2nd person personal pronouns. They are very common in the languages of Southeast Asia, Japan and Korea, but extremely limited elsewhere.

The Common subset is based on a common questionnaire. It records whether a given concept (e.g. 'child') can be used as a 1st person expression, a 2nd person expression, a title, or an address term. Where a use exists, example sentences are also given.

The Annotations subset contains annotations of 1st and 2nd person expressions (both personal pronouns and pronoun substitutes) and of address terms. The corpora used differ from language to language, but the annotation scheme is the same across languages.
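To make the shape of a Common subset entry concrete, here is a minimal Python sketch of one questionnaire record. It is only an illustration under stated assumptions: the class name, field names, and the sample values are hypothetical and are not taken from `prosub/prosub.py` or from the actual data files, whose format may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical names throughout; the real dataloader schema may differ.
# The four uses the Common questionnaire asks about for each concept.
USES = ("first_person", "second_person", "title", "address_term")


@dataclass
class CommonEntry:
    language: str  # ISO 639-3 code, e.g. "vie"
    concept: str   # questionnaire concept, e.g. "child"
    first_person: bool = False
    second_person: bool = False
    title: bool = False
    address_term: bool = False
    # Maps a use name to the example sentences given for that use.
    examples: Dict[str, List[str]] = field(default_factory=dict)

    def attested_uses(self) -> List[str]:
        """Return the uses marked as existing for this concept."""
        return [use for use in USES if getattr(self, use)]

    def missing_examples(self) -> List[str]:
        """Return attested uses that have no example sentence yet."""
        return [use for use in self.attested_uses() if not self.examples.get(use)]


# Placeholder values for illustration only, not real ProSub data.
entry = CommonEntry(
    language="vie",
    concept="child",
    first_person=True,
    second_person=True,
    examples={"first_person": ["<example sentence>"]},
)

print(entry.attested_uses())
print(entry.missing_examples())
```

A check like `missing_examples` reflects the rule that example sentences accompany every use that exists, so it could serve as a simple consistency test when writing the dataloader.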
Subsets
Common, Annotations
Languages
zsm, ind, jav, tha, vie, mya
Tasks
Word Sense Disambiguation, Word lists, Semantic Role Labeling, Machine Translation
Dataloader name:
prosub/prosub.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?prosub