Demo disambiguation favors disambiguate-add-names #60
Comments
It favours disambiguate-add-names because the spec demands adding names first, and stopping if that works. In this case, it does. “Adding names” here does not start from the previous attempt at disambiguation (i.e. Kurt/Kurtz); it starts from zero every time, which is the only deterministic approach. So it looks like this, for the first reference:
Perform step 1 of disambiguation, add names until no further names yield a reduction in degree. It expands out to
And stops as successful. The same happens to the second cite, but it resolves to the second reference/citekey2 without the z. If adding names does not reduce the degree, then the name count bumper is reset to the last value that reduced it, or 0 if it was never reduced. Hence disambiguate-add-givenname can start with any number of additional names, as long as adding them reduced the degree. (You’ll note we use different language to describe the process, and citeproc-rs doesn’t have “sets of ambiguous cites”, only sets of references that could have produced a cite. If you want me to expound a little, then I can, but I think it’s clear what’s going on in this example.)
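The name-adding step described above can be sketched roughly as follows. This is only an illustration, not citeproc-rs code: the function names (`render`, `degree`, `add_names`), the family-name-only rendering, and the sample references are all invented for the sketch. It assumes "degree" means the number of references that could have produced the same cite, and that every remaining name is tried, keeping the last count that lowered the degree.

```python
def render(names, count):
    """Render the first `count` family names, with 'et al.' if any are cut."""
    text = ", ".join(names[:count])
    if count < len(names):
        text += " et al."
    return text


def degree(target, refs, count):
    """Number of references that render identically to `target` (assumed meaning of 'degree')."""
    want = render(target, count)
    return sum(1 for r in refs if render(r, count) == want)


def add_names(target, refs, base=1):
    """Always start from zero extra names; keep the last extra count that reduced the degree."""
    best_extra = 0                      # falls back to 0 if nothing ever helps
    best_degree = degree(target, refs, base)
    extra = 0
    while best_degree > 1 and base + extra < len(target):
        extra += 1
        d = degree(target, refs, base + extra)
        if d < best_degree:             # only a strict reduction moves the bumper
            best_extra, best_degree = extra, d
    return best_extra, best_degree


# Hypothetical references, family names only.
refs = [["Doe", "Smith", "Lee"], ["Doe", "Smith", "Park"], ["Doe", "Jones"]]
extra, remaining = add_names(refs[0], refs)
print(render(refs[0], 1 + extra), remaining)
```

Whatever count this returns would then be the starting point for the given-name step, which is why that step can begin with any number of additional names.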
Sticking with Kurt/Kurtz is the expected behaviour if you have
You're right that it follows the specification exactly, but it's not what a copy editor would expect (and not what
Are you saying citeproc-js reverses the first two disambiguation passes, because it’s simply better?
Yes, with by-cite it attempts given name addition/expansion before adding a name, to keep the cite as compact as possible.
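To make the difference between the two pass orders concrete, here is a toy comparison. It is a sketch under heavy assumptions and does not reproduce either processor: the data is invented (loosely echoing the Kurt/Kurtz names in this thread), `disambiguate` and its `order` argument are hypothetical, and a pass is accepted only if it makes the cite fully unique, which is a simplification.

```python
def render(ref, count, form):
    """form: 0 = family only, 1 = initial + family, 2 = full given + family."""
    parts = []
    for given, family in ref[:count]:
        if form == 0:
            parts.append(family)
        elif form == 1:
            parts.append(f"{given[0]}. {family}")
        else:
            parts.append(f"{given} {family}")
    text = ", ".join(parts)
    return text + " et al." if count < len(ref) else text


def is_unique(target, refs, count, form):
    want = render(target, count, form)
    return sum(render(r, count, form) == want for r in refs) == 1


def disambiguate(target, refs, order):
    """Run the passes in `order`, stopping as soon as the cite is unique."""
    count, form = 1, 0                  # assumed start: one name, family only
    for step in order:
        if is_unique(target, refs, count, form):
            break
        if step == "add-names":
            for c in range(count + 1, len(target) + 1):
                if is_unique(target, refs, c, form):
                    count = c
                    break
        elif step == "add-givenname":
            for f in range(form + 1, 3):
                if is_unique(target, refs, count, f):
                    form = f
                    break
    return render(target, count, form)


# Invented references: (given, family) pairs.
a = [("Kurt", "Rossi"), ("Ann", "Lee")]
b = [("Kurtz", "Rossi"), ("Bob", "Park")]
refs = [a, b]

print(disambiguate(a, refs, ["add-names", "add-givenname"]))   # names first
print(disambiguate(a, refs, ["add-givenname", "add-names"]))   # given names first
```

With names first, the toy cite grows a second name; with given names first, it stays at one name plus "et al.", which is the more compact outcome described above.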
Only with by-cite? I don’t immediately see why the others wouldn’t also benefit, e.g. primary-name getting the second name Kurtzed. Right?
By second name, do you mean the A.Rossi names in the example? My understanding has been that styles that apply primary-name disambiguation do that and only that. The ones that I have seen mostly have year-suffix as a fallback, in case the limited disambiguation by name fails to resolve everything.
(Following up on the speculation above, I checked the CSL repo. Out of 217 styles that use
Ah yes, my mistake, I think you’re right only by-cite needs it. In all other GNDRs, names that would have initials and/or given names added during cite disambiguation have already been added globally. I think this also means that the description of all-names as being for both name and cite disambiguation is redundant. If all names are as unique as possible already, you can’t disambiguate a cite by expanding them. The upshot is you
I think that's right, yes. Impressive and handy demo, by the way.
Thanks for this discussion folks! Did either of you happen to write tests for the disambiguation behavior?
No but I'll take it
Cool. Did you see the clarified spec linked above?
https://gist.github.com/cormacrelf/84bc9592cd10602d05a52bed938adece While I was at it I wrote a better tool to convert between the two test case formats in case anyone feels like going full YAML anytime. I'll publish that tomorrow, it does have a caveat where it writes out fully parsed names and also won't handle ALL the weird sections/modes, but at least it does multiline strings for the
One note is that (it may be my setup that's wrong, but) citeproc-js fails that test, by not writing out the
There is nothing wrong with your setup, and the test fails here as well. At first I thought, "WTF?" but on a closer look I realized it's caused by
In the current disambiguation demo, citeproc-rs is adding names where disambiguation can be done by adding names or initials to existing partners.
Steps to reproduce:
Expected
Further disambiguation is unnecessary, and with the style set to etal-use-first="1" and etal-min="1" or etal-min="2", a single name should be preferred. The result should remain:
Actual Result
The first given name is reverted to an initial, and two names are added: