"do", "have" tagged as VERB with no object #403

Closed
nschneid opened this issue Jun 10, 2023 · 14 comments

Comments

@nschneid
Contributor

Many of these are embedded in an object relative clause. The enhanced edge for the object (E:obj) is missing (cf. #392).

Many of the rest are instances of elliptical stranding, and should be tagged AUX.
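
A minimal sketch of how such tokens could be listed from a CoNLL-U file, assuming plain Python with no external libraries. The helper names (`read_sentences`, `objectless_do_have`) and the two toy sentences are hypothetical, written by hand for illustration and not taken from EWT or GUM. Only basic dependencies are checked here, so objects that exist only as enhanced edges would still be reported.

```python
FIELDS = ("id", "form", "lemma", "upos", "xpos", "feats",
          "head", "deprel", "deps", "misc")

def read_sentences(conllu):
    """Return sentences as lists of token dicts. Comment lines, multiword-token
    ranges (e.g. 1-2) and empty nodes (e.g. 1.1) are skipped."""
    sentences, current = [], []
    for line in conllu.splitlines():
        if not line.strip():            # blank line closes a sentence
            if current:
                sentences.append(current)
                current = []
            continue
        cols = line.split("\t")
        if line.startswith("#") or len(cols) != 10 or not cols[0].isdigit():
            continue
        current.append(dict(zip(FIELDS, cols)))
    if current:
        sentences.append(current)
    return sentences

def objectless_do_have(conllu):
    """Return (sentence index, form) for each do/have tagged VERB
    that has no basic obj dependent."""
    hits = []
    for i, sent in enumerate(read_sentences(conllu)):
        heads_with_obj = {t["head"] for t in sent
                          if t["deprel"].split(":")[0] == "obj"}
        for t in sent:
            if (t["lemma"] in ("do", "have") and t["upos"] == "VERB"
                    and t["id"] not in heads_with_obj):
                hits.append((i, t["form"]))
    return hits

# Toy data, written with spaces for readability and converted to tabs below.
RAW = """\
1 I I PRON PRP _ 2 nsubj _ _
2 do do VERB VBP _ 0 root _ _
3 . . PUNCT . _ 2 punct _ _

1 You you PRON PRP _ 2 nsubj _ _
2 do do VERB VBP _ 0 root _ _
3 yoga yoga NOUN NN _ 2 obj _ _
4 . . PUNCT . _ 2 punct _ _
"""
SAMPLE = "\n".join("\t".join(line.split()) for line in RAW.splitlines())

print(objectless_do_have(SAMPLE))  # [(0, 'do')]
```

Only the stranded "do" in the first toy sentence is reported; "do yoga" has an `obj` child and is left alone.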

@amir-zeldes
Contributor

Not sure if it makes sense for these to be AUX if they are not functioning as AUX to anything overt... I mean, if you have ellipsis and just answer "I do!" to some question, isn't that the main verb at that point?

@AngledLuffa
Contributor

AngledLuffa commented Jun 28, 2023 via email

@amir-zeldes
Contributor

OK, I'm curious to see what others think. I guess I'm a little uncomfortable tagging things as AUX when they are not auxiliaries to anything actually present, because AUX is a relational type of category. While I understand the 'object that isn't there' idea, UD is largely surface-oriented and has pretty solid ideas about promotion: we also attach elliptical relatives headed by an 'auxiliary' as acl:relcl, because once there is no lexical predicate, they are effectively the main predicate.

@jnivre

jnivre commented Jun 28, 2023

In clear cases of VP ellipsis, the auxiliary has to be promoted with respect to its deprel but should in my opinion retain the AUX tag, even in cases where the controlling context for the ellipsis is inter-sentential. For example:

You do understand, don't you?

(A: Do you understand?)
B: I do.

The behavior of these is clearly different from main verb uses of "do" and "have", which themselves can take "do" and "have" as auxiliaries. For example (with VERB instances capitalised):

You DO yoga, don't you?

(A: Do you DO yoga?)
B: I do.

This is

@nschneid
Contributor Author

Agree with @jnivre: these bear properties of auxiliaries as distinct from lexical verbs.

  • He couldn't open the door,
    • ...so I did. [do-support]
    • ...but I could. [modal aux]
    • *...so I opened. [lexical verb cannot work here]

Other properties include subject-aux inversion and negation, as in this tag question:

  • You opened the door, didn't you?

UPOS tags tend to be lexically-oriented rather than determined by the syntactic construction (except where there is ambiguity). In particular, we wouldn't want to have to say that every modal aux is polysemous between AUX and VERB.

Thus it should remain AUX even when promoted to predicate of the clause due to ellipsis.

@amir-zeldes
Contributor

OK, this seems to be the consensus, so I will modify it in the corpora I maintain.

@nschneid
Contributor Author

nschneid commented Jul 23, 2023

Refined criteria:

  • "do" VERB->AUX - rules out various idioms/lexical verb senses of "do"
  • "have" VERB->AUX - rules out some EWT tokens that are ambiguous from the syntactic context alone

@nschneid
Contributor Author

DepEdit script:

; VERB->AUX for stranded "do"
; (may need to run twice because of bleeding with the 2nd-to-last rule)
; with object -> lexical "do"
lemma=/do/&upos=/VERB/;func=/obj|[xc]comp|.*:pass/	#1>#2	#1:storage=lex_do
; do well, how are you doing
lemma=/do/&upos=/VERB/;lemma=/how|likewise|so|good|fine|well|great/&func=/advmod/	#1>#2	#1:storage=lex_do
; hard to do, have little to do with
lemma=/do/&upos=/VERB/;lemma=/to/&upos=/PART/&func=/mark/	#1>#2	#1:storage=lex_do
; do as you will (wilt)
lemma=/do/&upos=/VERB/;lemma=/will/&func=/advcl/	#1>#2	#1:storage=lex_do
; that will do
lemma=/do/&upos=/VERB/;lemma=/will/&func=/aux/	#1>#2	#1:storage=lex_do
; monkey see, monkey do
lemma=/do/&upos=/VERB/;lemma=/monkey/&func=/nsubj/	#1>#2	#1:storage=lex_do
; do or die
lemma=/do/&upos=/VERB/;lemma=/die/&func=/conj/	#1>#2	#1:storage=lex_do
; what it has done and is still doing
lemma=/do/&upos=/VERB/&func=/conj/;lemma=/do/&upos=/VERB/	#2>#1	#1:storage=lex_do
; exclude: things to do, things we do (zero relative), have it done
lemma=/do/&upos=/VERB/&xpos!=/VBN/&storage!=/lex_do/&func!=/xcomp|acl|.*:relcl/	none	#1:upos=AUX


; VERB->AUX for stranded "have"
; with object -> lexical "have"
lemma=/have/&upos=/VERB/;func=/obj|[xc]comp|.*:pass/	#1>#2	#1:storage=lex_have
; fun to have
lemma=/have/&upos=/VERB/;lemma=/to/&upos=/PART/&func=/mark/	#1>#2	#1:storage=lex_have
; comparative clause
lemma=/have/&upos=/VERB/;lemma=/than/&func=/mark/;lemma=/have/&upos=/VERB/	#1>#2;#3.*#1	#1:storage=lex_have
; as much as you have
lemma=/have/&upos=/VERB/&func=/advcl/;lemma=/much/	#2>#1	#1:storage=lex_have
; have and will have
lemma=/have/&upos=/VERB/&func=/conj/;lemma=/have/&upos=/VERB/	#2>#1	#1:storage=lex_have
; is influenced by and has Jamaican references
lemma=/have/&upos=/VERB/&func=/conj/;lemma=/influence/;func=/obl/	#2>#1;#2>#3	#1:storage=lex_have
; which one had: (before list)
lemma=/have/&upos=/VERB/;upos=/PUNCT/&form=/:/	#1.#2	#1:storage=lex_have
; reparandum (disfluency) whose head is a VERB
lemma=/have/&upos=/VERB/&func=/reparandum/;upos=/VERB/	#2>#1	#1:storage=lex_have
; exclude: things we have (zero relative)
lemma=/have/&upos=/VERB/&storage!=/lex_have/&func!=/xcomp|acl|.*:relcl/&num!=/.*\..*/	none	#1:upos=AUX

@nschneid
Contributor Author

@amir-zeldes Does the above look good to you?

@amir-zeldes
Contributor

Probably, but will need to find a moment to check these in more detail. It might be a few days, sorry!

@amir-zeldes
Contributor

This is a bit trickier: some of these could go either way, and some are not so lexicalized. Consider "do as you will": is it really limited to "will"? What about "do as you're told"? If we extend it to any advcl, I think you'll quickly get ambiguous cases:

  • Do as you will (VERB)
  • Do as you're told (VERB)
  • As Kim did not build one, I did (AUX)
  • Did you build one? I did (AUX), as they told me to.

For some cases I'm not even sure what's correct, for example:

  • Can you build one? Will do (AUX/VERB???).

Things like "what it has done and is still doing" could be VERB, but imagine you have "which":

  • ... which it has done and is still doing

I think as advcl:relcl it's maybe AUX, but as acl:relcl maybe VERB. We could try to list all of these cases, but it would get awfully complicated very fast... I'm wondering if we should adopt a simpler heuristic definition (along the lines of "VERB if it has obj, x or y, otherwise always AUX") at the expense of a very nuanced approach which is harder to maintain. Or really just scan edge cases and manually annotate those, but TBH I don't love playing with UPOS while also investing so much effort in XPOS.
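
A rough sketch of what such a simpler heuristic might look like in Python. The "x or y" relations are not specified above, so this version borrows the relation pattern from the first DepEdit rule earlier in the thread (`obj|[xc]comp|.*:pass`); that choice, the whole-label matching, the helper names (`tok`, `simple_retag`), and the hand-built toy sentences are all assumptions for illustration.

```python
import re

# Dependents that signal a lexical use of do/have. The pattern is copied from
# the first DepEdit rule in this thread; reusing it here is an assumption.
LEXICAL_DEPS = re.compile(r"obj|[xc]comp|.*:pass")

def tok(id_, lemma, upos, head, deprel):
    """Build a minimal token record (hypothetical helper for the examples)."""
    return {"id": id_, "lemma": lemma, "upos": upos, "head": head, "deprel": deprel}

def simple_retag(tokens):
    """Return the UPOS sequence after the heuristic: do/have tagged VERB stays
    VERB only if it governs one of the LEXICAL_DEPS relations, else AUX."""
    lexical_heads = {t["head"] for t in tokens
                     if LEXICAL_DEPS.fullmatch(t["deprel"])}
    return ["AUX" if (t["lemma"] in ("do", "have") and t["upos"] == "VERB"
                      and t["id"] not in lexical_heads)
            else t["upos"]
            for t in tokens]

# "I did" (stranded), "I did it" (with object), "Do as you are told"
stranded = [tok(1, "I", "PRON", 2, "nsubj"), tok(2, "do", "VERB", 0, "root")]
with_obj = stranded + [tok(3, "it", "PRON", 2, "obj")]
as_told = [tok(1, "do", "VERB", 0, "root"), tok(2, "as", "SCONJ", 5, "mark"),
           tok(3, "you", "PRON", 5, "nsubj:pass"), tok(4, "be", "AUX", 5, "aux:pass"),
           tok(5, "tell", "VERB", 1, "advcl")]

print(simple_retag(stranded))  # ['PRON', 'AUX']
print(simple_retag(with_obj))  # ['PRON', 'VERB', 'PRON']
print(simple_retag(as_told))   # ['AUX', 'SCONJ', 'PRON', 'AUX', 'VERB']
```

The third example shows the cost of the simplification: "Do as you're told" comes out as AUX, although it is listed as VERB above. The sketch also ignores the relative-clause exclusions (`xcomp|acl|.*:relcl`) that the full script applies.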

@nschneid
Contributor Author

nschneid commented Aug 3, 2023

I developed the rules by looking pretty meticulously through EWT. I don't know if I checked them systematically against GUM, so in principle there could be ambiguity, yeah. I don't think "do" in "do as you will" is an auxiliary though, so if UD makes the distinction we should try to implement it....

Things like "what it has done and is still doing" could be VERB, but imagine you have "which":

  • ... which it has done and is still doing

Perhaps that's a tricky case (because sentence anaphora are weird), but I don't see it in either corpus so in practice it's not likely to be a big source of errors.

I would definitely say VERB for "what it has done" (acl:relcl) because that should correspond to an E:obj(done,what) relation.
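
As an illustration of that check, here is a small sketch that looks for an enhanced `obj` edge pointing at a given verb by reading the CoNLL-U DEPS column (values of the form `head:relation|head:relation`). The function names and the toy annotation of "what it has done" are hypothetical and hand-written, not copied from either corpus.

```python
def enhanced_edges(deps):
    """Split a DEPS value such as '4:obj|6:nsubj' into (head, relation) pairs.
    An underscore means the token has no enhanced edges."""
    if deps in ("_", ""):
        return []
    return [tuple(edge.split(":", 1)) for edge in deps.split("|")]

def has_enhanced_obj(tokens, verb_id):
    """True if any token in the sentence is an enhanced obj of verb_id."""
    return any(head == verb_id and rel.split(":")[0] == "obj"
               for t in tokens
               for head, rel in enhanced_edges(t["deps"]))

# Toy fragment "what it has done": only id, form and DEPS are modelled.
with_edge = [
    {"id": "1", "form": "what", "deps": "4:obj"},
    {"id": "2", "form": "it", "deps": "4:nsubj"},
    {"id": "3", "form": "has", "deps": "4:aux"},
    {"id": "4", "form": "done", "deps": "1:acl:relcl"},
]
# Same fragment with the enhanced object edge missing.
without_edge = [dict(t, deps="_") if t["form"] == "what" else t for t in with_edge]

print(has_enhanced_obj(with_edge, "4"))     # True
print(has_enhanced_obj(without_edge, "4"))  # False
```

A "do" or "have" for which this returns True has an object in the enhanced graph even when no basic `obj` dependent is present, which is the situation described for relative clauses.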

"Will do." as a response—I think it's short for "Will do it" or "Will do that" or "Will do what you suggest", so VERB.

@nschneid
Contributor Author

@amir-zeldes Thoughts on incorporating the above script into GUM?

@amir-zeldes
Contributor

Yes, this is now folded into the build bot here - I tried to compress it a bit to keep the rules from exploding too much:

https://github.com/amir-zeldes/gum/blob/339dd86501aabfe8b3b9b5df00d42bbda25092e8/_build/utils/upos.ini#L125
