Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OS-7708 nss_ldap needs to cache netgroups Reviewed by: Robert Mustacchi <[email protected]> Reviewed by: Jason King <[email protected]> #221

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

joyent-automation
Copy link

OS-7708 nss_ldap needs to cache netgroups
Reviewed by: Robert Mustacchi [email protected]
Reviewed by: Jason King [email protected]

This PR was migrated-from-gerrit, https://cr.joyent.us/#/c/6083/.
The raw archive of this CR is here.
See MANTA-4594 for info on Joyent Eng's migration from Gerrit.

CR discussion

@mgerdts commented at 2019-04-10T21:18:03

Patch Set 1:

The ticket has test notes (including attachments) that demonstrate how this all works.

The dtrace probes (especially the quantity of them) are probably overkill. I'm quite open to significant trimming but would like to save that until code review is winding down.

@jasonbking commented at 2019-04-10T21:24:09

Patch Set 1:

(7 comments)

Pass 1. I'm going to come back in a bit for getnetgrent.c

@jasonbking commented at 2019-04-11T05:11:08

Patch Set 1:

(8 comments)

@jasonbking commented at 2019-04-11T05:11:36

Patch Set 1:

Overall, I think the design is fine. Mostly just some questions and a few nits.

Patch Set 1 code comments
usr/src/lib/nsswitch/ldap/Makefile.com#62 @jasonbking

I'm a little weary of linking cmdutils in since it seems to wreak havoc where it isn't intentionally used. It might be preferred to just build + link list.o from usr/src/common, esp. given this can sometimes make it's way into an aribitrary process.

usr/src/lib/nsswitch/ldap/Makefile.com#62 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getexecattr.c#425 @jasonbking

Is this just to fix compiler warnings?

usr/src/lib/nsswitch/ldap/common/getexecattr.c#425 @mgerdts

Yep

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#327 @jasonbking

Maybe ERRORCHECKMUTEX instead?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#327 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#451 @jasonbking

The wording here feels a bit awkward. Do you mean 'we need to be able to assign a value to ng_name when using avl_find() without doing a heap allocation' or something like that?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#451 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#458 @jasonbking

Might ng->ng_name = (const char *)ng + sizeof (*ng) be a bit clearer to emphasize that it's pointing to the first byte past the end of the struct?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#458 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#603 @jasonbking

nit: if i and entries can never be negative, consider using an unsigned type.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#603 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#698 @jasonbking

nit: parenthesis around return value.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#698 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#927 @jasonbking

probably should annotate data as __unused

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#927 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#943 @jasonbking

I think we're still doing (void) to discard the return value..

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#943 @mgerdts

oops, forgot to cstyle. Fixed this and others.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#983 @jasonbking

That's probably the safer approach -- I would think that netgroups aren't going to be updated so frequently that the potential extra work would amount to much.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#983 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1620 @jasonbking

Consider making leftp size_t

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1620 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1874 @jasonbking

The error semantics of atoi() are undefined, might it be better to use strtol() instead?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1874 @mgerdts

strtol is awful to, but better. done.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1891 @jasonbking

rF for the mode?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1891 @mgerdts

TIL, thanks!

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1943 @jasonbking

If someone puts in a non-numeric value, you'll end up with an error message "netgroup attribute 'xxxx' value out of range" which seems a bit odd. Would it be worth splitting the two cases (out of range and not a number) out.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1943 @mgerdts

Done

usr/src/man/man4/nscd.conf.4#7 @jasonbking

Date should probably be updated

usr/src/man/man4/nscd.conf.4#7 @mgerdts

Done

@mgerdts commented at 2019-04-11T21:34:19

Patch Set 1:

(15 comments)

@rmustacc commented at 2019-04-12T01:39:26

Patch Set 2:

(40 comments)

Patch Set 2 code comments
usr/src/lib/nsswitch/ldap/common/getnetgrent.c#41 @rmustacc

What cleans up the non-static data in those cases? Reading this suggests that if we weren't using nscd we wouldn't end up freeing the data that was in the various lists and we'd slowly leak. Can you expand the comment to point how all the internal data is cleaned up? I think most of this is covered based on how ngc_enable is handled, but we'd like to avoid another occurrence of something like OS-6449.

I'm not sure we want to design something such that it's only ever cleaned by forcing the parent process to restart. I haven't managed to convince myself that the several dlclose() vectors in nscd can't trigger this, but at least form a general library design perspective, it does feel a little unsporting.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#41 @mgerdts

I'd previously had cleanup code and realized that I was only triggering it when nscd restarted itself. Added it back and will come up with a test program to exercise it outside of nscd.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#90 @rmustacc

This section should probably explain why there are atomics that are being used to track parts of the data structures and why they can't be used to protect the atomic.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#90 @mgerdts

By taking the lock for a short while in the unhappy path I can get rid of atomics.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#92 @rmustacc

I would suggest adding something that talks about how you're using time that's subject to ntp drift / sudden jumps and why that's OK/intentional.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#92 @mgerdts

I think that what you are really saying is that I should be using gethrtime() instead.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#92 @rmustacc

You don't have to use hrtime's, it should just be explained that the drift or sudden time skips are OK and how that impacts the caching. Originally I had thought it was being done that way because you're talking to ntp servers and looking at modified times.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#92 @mgerdts

If we were guaranteed to be using some other protocol (e.g. kerberos) that made the system broken due to clock skew, it would be easier to wave my hands at that.

I think the new implementation where I have monotonic seconds since start rather than seconds since the epoch is a more robust solution and no hands need to be waved.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#212 @rmustacc

It'd be nice to mention the unit the constant is in.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#212 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#237 @rmustacc

Is there any reason these can't be const? It may be that we need to change split_triple to be able to handle that, but it appears that once these have gone through split_triple(), we shouldn't need to modify the actual contents of what they point to ever again and in fact doing so incorrectly, could lead the various items that point into the same underlying string to have bad things happen. That's not happening here, but mostly suggesting as a bit of defense in depth.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#237 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#272 @rmustacc

Is there a reason this is a signed type? It appears that the value can't go negative.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#272 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#303 @rmustacc

Missing structure prefixes.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#303 @mgerdts

I avoided changing this in the spirit of not making changes unrelated to the bug I'm trying to fix. I've changed most of the rest of the file, so why not?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#310 @rmustacc

This deserves structure prefixes.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#310 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#314 @rmustacc

This should be the same type as netgroup ng_triplecnt. It also should probably be unsigned (as should ng_triplecnt) as we should never have a negative index into an array.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#314 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#314 @rmustacc

I think this was missed as now ng_triplecnt is an int32_t, but this is still an int.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#314 @mgerdts

everything related to ngc_triplecnt should now be uint32_t.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#343 @rmustacc

Any reason these aren't static like everything else?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#343 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#344 @rmustacc

Is there a reason that ngc_reap_interval would ever need to be negative and thus need to be signed?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#344 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#355 @rmustacc

The surrounding code is treating this as a boolean_t, perhaps it should be.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#355 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#357 @rmustacc

Any reason these aren't static like everything else?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#357 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#361 @rmustacc

If we need this bit of the implementation of the list_t, perhaps it should be in sys/list_impl.h instead of us duplicating it here? I mostly worry because this does feel like we're encoding the implementaiton details of what should be opaque-like.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#361 @mgerdts

Maybe it would be better to have -DDEBUG on list.c make similar assertions about the list_node_t during insert operations. That (combined with avoidance of libcmdutils) would have caught the bug that inspired me to add the LIST_*() functions.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#361 @rmustacc

I had thought about suggesting that; however, you explicitly are asserting things related to the lock state that you'd lose. I had assumed that having the lock assert would be useful.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#361 @mgerdts

I'm just going to rip out the debug code. It has served its purpose and is now just a maintenance burden.

I have also had second thoughts about asserting that list_next and list_prev are null on debug. I have little confidence that list_link_init() or calloc() are used widely enough to prevent this laying a minefield for debug kernels.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#374 @rmustacc

abort has a noreturn keywords associated with it. Does the compiler not recognize that the return (NULL) will be unreached?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#374 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#504 @rmustacc

FYI, we added mutex_enter / mutex_exit to user land. They require an error checking mutex (which you're using), but behave like their kernel counterparts and therefore don't require VERIFY0 wrapping them.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#504 @mgerdts

Nice, thanks!

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#577 @rmustacc

It's probably worth a comment reminding readers where the >> 2 comes from. I'm assuming it's related to the 25% activity markers, but it's not clear.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#577 @mgerdts

Comment added. I've also made this "/ 4" rather than ">> 2" to make it more clear. Even if the compiler doesn't optimize out the division, it doesn't really matter. Code clarity should trump cleverness here.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#597 @rmustacc

Since this is only returning success/failure, maybe use a boolean_t?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#597 @mgerdts

Leaving as-is, per MM discussion

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#617 @rmustacc

i and entries are unsigned, not signed. Need to use %u, not %d.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#617 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#658 @rmustacc

Is this the right RFC? I can't find this section in it at https://tools.ietf.org/html/rfc4512#section-3.3. I found it instead in 3.4.4.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#658 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#662 @rmustacc

It would really help to explain why this is the case in the comment. I'm trying to convince myself that this is true, but having trouble.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#662 @mgerdts

Reworked to fully parse the spec, fractional hours/minutes/seconds, and all. FWIW, strptime() is useless by design or implementation.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#678 @rmustacc

The strptime(3C) manual page says it returns a value that indicates the last non-processed character. Do we not need to check and make sure that we don't have garbage following this that we don't understand or that it's at least a time zone indicator and something that we expect rather than not?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#678 @mgerdts

Sadly, I've found that strptime() is not as flexible as it should be. With some formats, trailing characters allow the string to be parsed and the trailing characters do not cause an error. With other formats, trailing characters (such as a fraction or offset) cause problems. After spending way too many hours trying to coax strptime into doing most of the lifting, I found that I ended up with no workaround code and less overall code by just parsing the string without strptime(). 106 tests added to verify correctness.

That being said, there are still some gaps. timegm() should notice things like illegal leap years and 31 days in April. If that or deficiencies in strptime() need to be addressed, that should be follow-on work.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#681 @rmustacc

mktime(3C) is sensitive to the time zone that the process is in. Particularly in a libc vectored call case, it seems like this'll end up causing issues if the program has set a time zone as you're otherwise using time(2) which I don't believe takes the time zone into account. Maybe you want to use timegm(3C) instead?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#681 @mgerdts

oops, yeah... I thought I had already made that change.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#716 @rmustacc

When we're coming in via libc, will libc try again?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#716 @mgerdts

This replicates pre-existing behavior. I added a comment after considering why we didn't just try again here.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#716 @rmustacc

OK, that's fine. I was just asking what the libc behavior would be since you called out the nscd case. If we're clarifying it for nscd, we should also clarify it for the libc case.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#774 @rmustacc

Explicit comparison for non-boolean_t?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#774 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#943 @rmustacc

Unchecked return value.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#943 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1282 @rmustacc

The type of trip should probably match exactly (and not through typedefs) that of ng_triplecnt which its being compared to.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1282 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1437 @rmustacc

Were these blanks added on purpose?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1437 @mgerdts

Nope, shuffled code around that had been added here and left some blanks behind. Deleted.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1603 @rmustacc

This appears to really be a boolean, I'd suggest using a boolean_t to make success/failure clearer.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1603 @mgerdts

Leaving as-is for consistency with rest of this file and related files in same directory.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1604 @rmustacc

Might be worth a brief comment affirming the intended semantics of this function.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1604 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1607 @rmustacc

Why is a size_t being cast to an int?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1607 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1614 @rmustacc

strlcpy returns a size_t, but is being cast to an int.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1614 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1620 @rmustacc

Maybe worth an ASSERT that left >= len before doing the subtraction?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1620 @mgerdts

Does the if statement at 1615 not suffice?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1620 @rmustacc

Should be fine, apologies for missing that.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1839 @rmustacc

Should this use a strcasecmp?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1839 @mgerdts

Sigh. Not according to nscd.conf(4), but nscd_config.c says otherwise. Sigh.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1839 @rmustacc

This is another case where now we're in terribly locale territory, fwiw.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1839 @mgerdts

I don't think config files are ever localized, so we should be ok, right?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1839 @rmustacc

The concern was that we were in a locale where the case comparison didn't operate as expected for ASCII. But I'm not sure if that actually exists for our current locales or not. It probably isn't worth the trouble.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1839 @mgerdts

While looking into the newlocale() / isdigit() issue, I stumbled across this:

https://grok.elemental.org/source/xref/illumos-joyent/usr/src/lib/libc/port/locale/isdigit.c?r=4f9d8ec0#31

  • As far as we know, every encoding we support is a strict superset of ASCII

Nothing more to be done here.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1839 @rmustacc

For the record, while the implementation can assume that, we should never assume that fact when using these interfaces. That said, it's fine to not do anything else here.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1862 @rmustacc

This looks like strtonum(3C).

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1862 @mgerdts

Indeed it does. TIL

I'm not so sure that using strtonum() would end up with less code or code that is more easily understood. In order to not have a series of "if" statements in the loop in read_nscd_conf, I need to have *toi functions with the same signature. Changing the signature of yntoi() to match strtonum() will not be a lot of help, as strtonum() does not produce an error message that is terribly helpful on its own.

Swapping out the existing call to strtol() with a call to strtonum() will not be very helpful either, as I would still need to handle error cases and form a useful error message.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1862 @rmustacc

OK, fair enough.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1889 @rmustacc

I'd suggest i and line be unsigned, as they really shouldn't ever be negative.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1889 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1889 @rmustacc

i is still signed, fwiw.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1889 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1904 @rmustacc

We may want to run this parsing code through a fuzzer given that this could be running now for a large number of processes, particularly if nscd disabled. Maybe we should drop privileges while doing this?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1904 @mgerdts

Pointers to any existing tools that we've found helpful are welcome.

Since we reading /etc/nscd.conf, how much do we really need to worry about a malicious actor here? If the attacker can modify nscd.conf or interpose in a way that something else is being read, buffer overflows or similar triggered by malicious content in the file being read are the least of our worries.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1904 @rmustacc

My understanding is that folks have used things like afl, etc. You're right that it may not be worth the effort. I just wanted to suggest it and put the idea out there.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1996 @rmustacc

If this is for testing, perhaps it should only be under ifdef DEBUG?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1996 @mgerdts

The tests that make use of this are documented in the ticket. There is some automation but I don't think it is automated enough to warrant inclusion in the new tests directory.

This is not in a performance-sensitive path and can only be triggered by someone that has administrative access to the name-service-cache service. The risk of having this code in place seems quite low while the usefulness of being able to tune the timeout while investigating problems seems rather high.

FWIW, I had initially explored adding this as a tunable via nscd.conf. The existing nscd parser does not like random configuration options being added, so I backed off that plan.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1996 @rmustacc

It's fine, you don't have to remove it. It was only a suggestion given the comment above that it was for testing and there was already a lot of ifdef DEBUG logic in the module.

usr/src/lib/nsswitch/ldap/common/ldap_common.h#74 @rmustacc

Was there any particular reason this part of the check was removed?

usr/src/lib/nsswitch/ldap/common/ldap_common.h#74 @mgerdts

It led to gcc warnings->errors after I got rid of the CFLAGS that caused various warnings to be ignored.

../common/gethostent6.c:116: error: the address of 'hostname' will never be NULL [-Waddress]
../common/gethostent.c:297: error: the address of 'hostname' will never be NULL [-Waddress]

There are two other places where this macro is called and they both handle NULL.

gethostent.c
192 if (be->toglue == NULL || DOTTEDSUBDOMAIN(cname)) in _nss_ldap_hosts2str_int()
212 if (be->toglue == NULL || DOTTEDSUBDOMAIN(name)) in _nss_ldap_hosts2str_int()

usr/src/lib/nsswitch/ldap/common/ldap_common.h#74 @rmustacc

OK, makes sense. Thanks for clarifying.

usr/src/lib/nsswitch/ldap/common/provider.d#2 @rmustacc

Minor nit, but modern prototypes don't have a CDDL HEADER START/END.

usr/src/lib/nsswitch/ldap/common/provider.d#2 @mgerdts

Done

@mgerdts commented at 2019-04-17T20:31:43

Patch Set 2:

(40 comments)

@rmustacc commented at 2019-04-18T01:21:28

Patch Set 3:

(35 comments)

Patch Set 3 code comments
usr/src/lib/nsswitch/ldap/Makefile#31 @rmustacc

It's great to see unit tests. Can they live in usr/src/test so we can make sure it's always run as part of the build and general test infrastructure? Otherwise these unit tests will be missed.

usr/src/lib/nsswitch/ldap/Makefile#31 @mgerdts

thanks for the pointer.

I'm not sure if anything else is needed to cause the new /opt/nsswitch-tests tests to be executed.

usr/src/lib/nsswitch/ldap/Makefile#31 @rmustacc

Today, the tests basically are all packaged up to run on the target system. It's a little less convenient, but hopefully by being part of the broader framework there, it'll be easier to make sure folks are running it regularly. Getting that more automated at our end is on a bit of my long-term TODO list.

usr/src/lib/nsswitch/ldap/Makefile#31 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#141 @rmustacc

Was the spacing switch intentional?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#141 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#449 @rmustacc

ngc_last_tick is a uint32_t, shouldn't this return a uint32_t? Especially as several of the other values we assign are uint32_t values.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#449 @mgerdts

oops, initially switched over to hrtime_t and it turned into a bit of needless pain while updating the dtrace script. Since we really only care about seconds, I switched almost everything away from hrtime_t.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#451 @rmustacc

Maybe worth an ASSERT(MUTEX_HELD())?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#451 @mgerdts

The lock isn't necessarily held, nor does it need to be.

I've changed the code a bit to make it so that if the compiler treats ngc_last_tick as volatile that the probe value and the return value are consistent. Updated comments should clarify the rest.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#680 @rmustacc

Why is the value being tracked as a signed quantity, but hte min and max being treated as an unsigned quantity? It this is parsing and returning an int, then the min and max should be of type int.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#680 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#687 @rmustacc

I suspect you need to really use isdigint_l and specify the C locale explicitly or verify that it's an ascii digit instead through another way and add a comment explaining why isdigit isn't necessarily what you want. Unfortunately, this isdigit leaves you at the mercy of the current locale, which can happen when you're mapped into a libc process. That could lead to unexpected failure.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#687 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#694 @rmustacc

This seems like it's missing overflow detection. The concern around that is someone else will come around and nor realize that's a concern and adopt this incorrectly in the future. You could check for that by using a temporary value for the result of the multiplication and then checking if the resulting value is less than the current one. However, that will only be defined behavior if operating on unsigned data.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#694 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#703 @rmustacc

next and gtime are pointers. The act of all this arithmetic is still a pointer. This will cause a type and size mismatch when not in an ILP32 environment because it will be a 64-bit quantity. I suspect you want to take all that and cast it to a uintptr_t or some other fixed size.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#703 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#723 @rmustacc

Can you please make sure a bug is filed for these issues?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#723 @mgerdts

While attempting to do this, I've been unable to reproduce those that were clearly bugs. I still find strptime() to be more trouble than its worth for the reasons stated in the updated comment.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#723 @rmustacc

OK, the choice not to use it makes sense.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#728 @rmustacc

Should we take a pass and make sure that gtime is all ascii before proceeding through and parsing it? As there's no guarantee that the LDAP server is giving us ascii, though we're assuming it's in it.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#728 @mgerdts

Sure, though probably not necessary now that isdigit_l() is being used.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#734 @rmustacc

This appears unused.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#734 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#764 @rmustacc

It's not clear exactly what you're trying to convey with this. Are you basically worried about behavior when things wrap around and it produces a negative time when cast to a time_t?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#764 @mgerdts

yep, reworded.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#795 @rmustacc

It's not clear to me why we don't care about the return value here.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#795 @mgerdts

Per RFC4517 3.3.13, offset (g-differential) is:

g-differential = ( MINUS / PLUS ) hour [ minute ]

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#987 @rmustacc

See note earlier, but if hrtime_t is kept here, there are numerous uint32_t / hrtime_t (int64_t) comparisons.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#987 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1111 @rmustacc

If we're interrupted do we care that we may reap sooner than the default ngc_reap_interval?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1111 @mgerdts

I don't see any harm in it. If we didn't just go ahead and reap, then we would need to keep track of how much of the time was left over and only sleep for that amount of time to avoid being starved by signal delivery.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1786 @rmustacc

the?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#1786 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#2072 @rmustacc

Extra space?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#2072 @mgerdts

cstyle failed me!

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#2240 @rmustacc

Shouldn't this explicitly mention libc also doing it? nscd isn't the only consumer of this library.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#2240 @mgerdts

yep, forgot to update the comment after resurrecting some code written before I learned a thing or two.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#2257 @rmustacc

Shouldn't the check for ngc_initialized be done before assigning to it?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#2257 @mgerdts

uh, yeah. :shame:

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#2261 @rmustacc

Shouldn't the ngc_warmer_thread actually check this value? Right now this'll never actually join.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#2261 @mgerdts

more resurrection problems, fixed.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#2282 @rmustacc

How do the links on this list get removed? ngc_free() doesn't clean them up. I suspect this'll blow the asserts in list.c.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#2282 @mgerdts

should have called ngc_dispose_locked() instead of ngc_free().

usr/src/lib/nsswitch/ldap/test/generalized_time.c#42 @rmustacc

was generated?

usr/src/lib/nsswitch/ldap/test/generalized_time.c#184 @rmustacc

Since you don't use argc or argv, just make this void?

usr/src/lib/nsswitch/ldap/test/generalized_time.c#184 @mgerdts

Done

usr/src/lib/nsswitch/ldap/test/generalized_time.c#190 @rmustacc

You don't have to change this, but in the future, I'd just keep these unsigned for things that can never actually be negative.

usr/src/lib/nsswitch/ldap/test/generalized_time.c#190 @mgerdts

Done

@mgerdts commented at 2019-04-18T20:06:43

Patch Set 3:

(28 comments)

@rmustacc commented at 2019-04-22T20:49:46

Patch Set 2:

(7 comments)

Patch Set 5 code comments
usr/src/lib/nsswitch/ldap/common/getnetgrent.c#385 @rmustacc

cstyle?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#385 @mgerdts

cstyle(1) didn't catch this either. Sigh.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#758 @rmustacc

Is it worth trying to distinguish between being not-present and having an invalid fractional value? For example, if we had a 99 here as opposed to a missing value, should we still consider it a valid timestamp?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#758 @mgerdts

I don't see why we would consider any timestamp that doesn't conform to the RFC as valid. The current code rightly rejects any such timestamp. This is verified with this test:

  •   { "20190416013457+0160", "Offset minutes too large" },
    
usr/src/lib/nsswitch/ldap/common/getnetgrent.c#758 @rmustacc

OK, that wasn't clear to me based on the cast to void and the prior comments. I'd suggest adding something like the following to the comment: 'If there was an invalid offset, it will cause us not to advance the pointer next, so we will fail the check for a nul character and can ignore the return value here.'

The reason I suggest that is that it wasn't clear that we were relying on the side effect of not advancing next initially to be used as a way to indicate that this succeeded or failed, which is why I asked about the invalid offsets.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#758 @mgerdts

Done

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#2152 @rmustacc

Missing check for failure?

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#2152 @mgerdts

There's no way to return an error and VERIFY() seems rather severe. The only thing that will be affected is isdigit_l(). Even if this were to fail, so long as the LDAP server returns ascii characters in the timestamp isdigit_l() never considers the locale.

Clearly what we want is the same outcome as this.

#define isdigit(c) ((c) >= '0' && c <= '9')

ISTM that we are only making life more difficult by trying to use any form of libc's isdigit().

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#2152 @rmustacc

OK, well, since we already verify that the response isascii(), if you want to just check that it's an isdigit manually that seems reasonable. I would just add a comment above it as to why we're not using locales.

usr/src/lib/nsswitch/ldap/common/getnetgrent.c#2152 @mgerdts

Done

usr/src/test/nsswitch-tests/runfiles/Makefile#16 @rmustacc

Did everyone here write these or was this just here because you copied it from somewhere else?

usr/src/test/nsswitch-tests/runfiles/Makefile#16 @mgerdts

I copied a file and made minor edits.

@mgerdts commented at 2019-04-23T16:31:45

Patch Set 5:

(6 comments)

@rmustacc commented at 2019-04-23T17:21:18

Patch Set 2:

(3 comments)

@mgerdts commented at 2019-04-23T19:24:10

Patch Set 5:

(2 comments)

@rmustacc commented at 2019-04-23T21:26:55

Patch Set 8: Code-Review+1

@mgerdts commented at 2019-04-23T21:27:04

Patch Set 8:

Retested and found a couple issues that have since been corrected. Test notes and associated scripts, ldif files, etc. updated in ticket.

@jasonbking commented at 2019-04-23T22:04:20

Patch Set 8: Code-Review+1

@mgerdts commented at 2019-04-24T14:42:20

Patch Set 9: Patch Set 8 was rebased

@mgerdts commented at 2019-04-24T14:44:30

Patch Set 10: Commit message was updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants