Collective nouns in other senses & their lemmas #478

Open
AngledLuffa opened this issue Nov 28, 2023 · 5 comments

@AngledLuffa
Contributor

Other examples, especially when different between EWT and GUM

savings/earnings/goods

GUM

# sent_id = GUM_voyage_lodz-27
# text = Although the city was not destroyed in the aftermath, the material losses were serious as the machinery, raw materials and finished goods were taken away by the fleeing Nazis.
24      goods   good    NOUN    NNS     Number=Plur     18      conj    18:conj:and|26:nsubj:pass       Entity=125)|MSeg=good-s

EWT

# sent_id = newsgroup-groups.google.com_alt.animals_0084bdc731bfc8d8_ENG_20040905_212000-0119
# text = 95 - Percentage of foreign goods that arrive in the United States by sea.
6       goods   goods   NOUN    NNS     Number=Plur     3       nmod    3:nmod:of       _

Similar questions could be asked of earnings, savings, etc., although those show up in 0 or 1 of the treebanks.
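A rough sketch of how one might list such mismatches automatically, assuming each treebank is available as plain CoNLL-U text with tab-separated columns. The function names `nns_lemmas` and `lemma_conflicts` are made up for illustration and are not part of any UD tool; the two input strings are just the `goods` tokens quoted above.

```python
from collections import defaultdict


def nns_lemmas(conllu_text):
    """Map each lowercased NNS form to the set of lemmas it receives."""
    lemmas = defaultdict(set)
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        form, lemma, xpos = cols[1], cols[2], cols[4]
        if xpos == "NNS":
            lemmas[form.lower()].add(lemma)
    return lemmas


def lemma_conflicts(text_a, text_b):
    """Return forms present in both texts whose lemma sets differ."""
    a, b = nns_lemmas(text_a), nns_lemmas(text_b)
    return {
        form: (a[form], b[form])
        for form in a.keys() & b.keys()
        if a[form] != b[form]
    }


# The two "goods" tokens from the examples above, rebuilt with tabs.
gum = "\t".join(["24", "goods", "good", "NOUN", "NNS", "Number=Plur",
                 "18", "conj", "18:conj:and|26:nsubj:pass",
                 "Entity=125)|MSeg=good-s"])
ewt = "\t".join(["6", "goods", "goods", "NOUN", "NNS", "Number=Plur",
                 "3", "nmod", "3:nmod:of", "_"])

print(lemma_conflicts(gum, ewt))  # {'goods': ({'good'}, {'goods'})}
```

Run over the full treebank files, this would only surface forms that occur in both, so items like earnings and savings that appear in 0 or 1 of the treebanks would need a separate pass.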

troops, also inconsistent:

EWT

# sent_id = weblog-blogspot.com_dakbangla_20041028153019_ENG_20041028_153019-0023
# text = Indeed, their reports are filled with tales of the "atrocities" of Indian troops on the innocent jihadis.
16      troops  troops  NOUN    NNS     Number=Plur     12      nmod    12:nmod:of      _

GUM

# sent_id = GUM_textbook_alamo-34
# text = In January 1835, reneging on earlier promises, he dispatched troops to the town of Anahuac to collect customs duties.
12      troops  troop   NOUN    NNS     Number=Plur     11      obj     11:obj  Entity=(142-person-new-cf4-1-sgl)|MSeg=troop-s

Here I find myself wanting to agree with GUM. I was, after all, a member of a single Boy Scout troop

economics I suppose stays plural? I've never heard of a single economic

# sent_id = newsgroup-groups.google.com_FOOLED_7554c5ce34a5a49e_ENG_20051012_144800-0027
# text = Economics will be a big determiner in the speed of their program."
1       Economics       economics       NOUN    NNS     Number=Plur     6       nsubj   6:nsubj _

regards should also stay plural I suppose? Jaime Lannister sends his regard. Just one, though

legal grounds is less clear to me. You can have a singular ground for doing something, I think

GUM

# sent_id = GUM_news_afghan-23
# text = After some interplay between the State and Homeland Security Departments, the girls were granted "parole" status on the grounds that ...
22      grounds ground  NOUN    NNS     Number=Plur     15      obl     15:obl:on       MSeg=ground-s

EWT

# sent_id = weblog-juancole.com_juancole_20051126063000_ENG_20051126_063000-0027
# text = The Commission said it had no legal grounds for such an exclusion.
8       grounds grounds NOUN    NNS     Number=Plur     5       obj     5:obj   _

coffee grounds though? would that be a single bean, or a single chunk of ground up coffee, or ...?

@nschneid
Contributor

I would keep the -s on fields of study (economics, physics, mathematics...).

For others I think there's gray area. Maybe we should decide on a dictionary (or set of dictionaries) to serve as arbiters.

@nschneid
Contributor

How about we consult Wiktionary and Open English WordNet, and if they both list the plural as an entry we take that as the lemma?

That would mean:

Is there a principled reason why "earnings" is in and "savings" is out? Maybe not. But I suspect each of us would have slightly different intuitions about how to draw the line. At least looking at dictionaries would give a clear operational way to decide.
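To make the proposed rule concrete, here is a minimal sketch of the decision procedure: keep the plural form as the lemma only when every consulted dictionary lists it as its own entry. The headword sets `dict_a` and `dict_b` are stand-ins invented for this example; they are chosen only to reproduce the outcome mentioned in this comment (earnings in, savings out) and say nothing about which real dictionary has or lacks which entry. `choose_lemma` is a hypothetical helper name.

```python
def choose_lemma(form, singular, dictionaries):
    """Return the plural form as lemma only if every dictionary lists it.

    `dictionaries` is an iterable of headword sets; `singular` is the
    fallback lemma used when at least one dictionary lacks the plural.
    """
    if all(form in headwords for headwords in dictionaries):
        return form
    return singular


# Placeholder headword sets (assumption: NOT real dictionary contents).
dict_a = {"earnings", "savings"}
dict_b = {"earnings"}
dictionaries = [dict_a, dict_b]

print(choose_lemma("earnings", "earning", dictionaries))  # earnings
print(choose_lemma("savings", "saving", dictionaries))    # saving
```

Requiring agreement between both sources is the conservative reading of the proposal; a real implementation would have to query actual Wiktionary and Open English WordNet data rather than hand-written sets.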

@AngledLuffa
Contributor Author

For the most part that works, but what the heck is a single saving? I feel bad for whichever bank teller I try to have that conversation with

@amir-zeldes
Contributor

I'd be fine with all this, but I'd like an authoritative list for the items in the corpus. The GUM list of items that tolerate xpos=NNS AND lemma=form is here:

https://github.com/amir-zeldes/gum/blob/master/_build/utils/validate.py#L725-L735

Are those all OK from the EWT perspective? If you let me know which of those are OK/not OK, I can revise accordingly.
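For readers who have not looked at the validator, the kind of check being described can be sketched as below. This is not the code from `validate.py`; the function name `check_nns_lemma` and the contents of `ALLOWED_PLURAL_LEMMAS` are assumptions for illustration (the three entries are the fields of study mentioned earlier in this thread, not the actual GUM list).

```python
# Hypothetical allowlist: items permitted to have xpos=NNS AND lemma=form.
ALLOWED_PLURAL_LEMMAS = {"economics", "physics", "mathematics"}


def check_nns_lemma(form, lemma, xpos, allowed=ALLOWED_PLURAL_LEMMAS):
    """Return a warning string for a suspicious NNS token, else None.

    A token is flagged when it is tagged NNS, its lemma is identical to
    its surface form (ignoring case), and the lemma is not allowlisted.
    """
    same = lemma.lower() == form.lower()
    if xpos == "NNS" and same and lemma.lower() not in allowed:
        return f"NNS token '{form}' has lemma=form but is not in the allowlist"
    return None


print(check_nns_lemma("Economics", "economics", "NNS"))  # None
print(check_nns_lemma("goods", "goods", "NNS"))          # warning string
print(check_nns_lemma("goods", "good", "NNS"))           # None
```

With a shared list of this shape, agreeing on which items are OK from both the GUM and EWT perspectives comes down to editing one set.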

@nschneid
Contributor

Not sure about all of those. I have singular bicep and tricep (historically these are back-formations).

Note that words like series and species can be singular or plural, so they're not pluralia tantum.

Let's move discussion of new guidelines to UniversalDependencies/docs#999

@nschneid added the lemma label Dec 10, 2023