Backend running as non-root user cannot kill child processes after timeout #5536
Comments
@sumit-bose @simo5 Thoughts on this? In this situation we have |
Is this the only issue you face when trying to run sssd as non-root user? Unfortunately, this feature isn't "production ready" in general. For example, see https://lists.fedorahosted.org/archives/list/[email protected]/thread/CRMNX6NYDKBXY2ACKARU4LMOJRTNZ265/ |
Actually, yes, this has generally been working (RHEL 8). I have run it this way for a month or so. This is the one issue I have encountered so far that is related to running SSSD as a non-root user, although I may be missing others. I see the reports you linked to; but being able to run SSSD in a container with no root account seems like a very specialized use case. There was a request to automate SSSD configuration for running as a non-root user, but these configuration changes can be applied manually right now. This particular issue is different. The code here needs adjusting somehow in order to handle running as a non-root user, in what is otherwise a very typical deployment. But there is more than one approach that would resolve this, and it's not obvious which one might be preferred. It would also be possible to add CAP_KILL to the |
Hi, I think there are two items to consider. First, if you are using FAST or Kerberos ticket validation Second, HTH bye, |
In theory we could transfer the krbtgt received by krb5_child back to pam_sss via the PAM conversation and let pam_sss write it out as the user, but several chicken-and-egg issues may arise from such a change in workflow as well. |
Access to the keytab is the real blocking issue. |
Would this be a reasonable solution? %pre common
setfacl -m u:sssd:r /etc/krb5.keytab
%postun common
setfacl -x u:sssd /etc/krb5.keytab I don't know if that is oversimplifying things, or if there would be benefits to using gssproxy instead. |
No, for a few reasons. First, at install time there may be no keytab; systems are often joined to a domain after sssd has already been installed. Second, if the system (or the admin) makes a change to the keytab (like obtaining a new one) and doesn't carefully preserve that permission, you suddenly get users locked out, unable to log in, and figuring out why will be painful. Additionally, of course, giving sssd direct access to the keytab gives the sssd user another avenue to circle around and elevate privileges by forging a ticket and logging into the system. But that is already the case, as sssd can forge a password check anyway, so perhaps not hugely important. Perhaps having a systemd preexec script run by the sssd unit that fixes those permissions could help, but it would still require a reboot, or at least a restart of the unit, to fix a system where the keytab file has lost that ACL. |
This works. Besides, when 'sssd_be' run under 'sssd' starts suid binary 'krb5_child', "Real ID" is still set to 'sssd'. This is also enough to allow 'sssd_be' to signal 'krb5_child'. But:
It would be good to have a uniform solution. Among other things, we could replace 'suid' bit with file capability 'cap_setuid,cap_setgid+ep' and probably it's possible to lock process within this set from the very beginning using |
Taking a closer look, I guess there are two points where this issue can be observed: (1) FAST+KEYTAB path, when (2) No-PKINIT path, when |
But real-uid should be SSSD_USER at this point, so this actually leaves only (2) |
There are some known issues like SSSD#5536 but those have to be solved differently. Having 'CAP_KILL' in sssd.service doesn't help anyway (and currently isn't used anyhow).
There are some known issues like #5536 but those have to be solved differently. Having 'CAP_KILL' in sssd.service doesn't help anyway (and currently isn't used anyhow). Reviewed-by: Justin Stephenson <[email protected]> Reviewed-by: Pavel Březina <[email protected]> Reviewed-by: Sumit Bose <[email protected]>
Keep saved set-user-ID in `k5c_become_user()` so that 'sssd_be' running under SSSD_USER could signal it. Resolves: SSSD#5536
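To illustrate what "keeping the saved set-user-ID" can look like at the system-call level, here is a minimal sketch. It is not the body of `k5c_become_user()`; the helper name `switch_ids_keep_saved` is hypothetical, and a real implementation would also have to deal with supplementary groups. It relies on the documented behaviour of `setresuid(2)` and `setresgid(2)` that an argument of -1 leaves the corresponding ID unchanged.

```c
#define _GNU_SOURCE   /* setresuid(), setresgid(), getresuid() on glibc */
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not SSSD code: switch the real and effective IDs to
 * uid/gid while leaving the saved set-user-ID and saved set-group-ID as
 * they were. Returns 0 on success, -1 on failure. */
int switch_ids_keep_saved(uid_t uid, gid_t gid)
{
    /* -1 means "leave unchanged", so it is not a valid target ID here. */
    assert(uid != (uid_t)-1 && gid != (gid_t)-1);

    /* Change the group IDs first: once the UID has changed, the process
     * may no longer be allowed to change its GIDs. */
    if (setresgid(gid, gid, (gid_t)-1) != 0) {
        return -1;
    }
    if (setresuid(uid, uid, (uid_t)-1) != 0) {
        return -1;
    }

    /* Verify that the real and effective UIDs are what was requested. */
    uid_t r, e, s;
    if (getresuid(&r, &e, &s) != 0) {
        return -1;
    }
    return (r == uid && e == uid) ? 0 : -1;
}
```

After such a call the saved set-user-ID is still whatever it was before, so a process whose real or effective UID equals that value satisfies the kill(2) permission rule for this process.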
Each time the backend creates a krb5_child process, it schedules a timeout event. If the child process is still running when this timeout expires, the backend will send SIGKILL to it.
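The spawn, wait, and kill pattern described here can be sketched in plain POSIX C. This is only an illustration under simplifying assumptions: the function name `spawn_and_kill_after_timeout` is hypothetical, the child is a stand-in that never finishes, and the wait is a simple polling loop rather than the event-driven timer the backend actually schedules.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper, not SSSD code: fork a child that never finishes on
 * its own, wait up to timeout_ms for it, then send SIGKILL.
 * Returns 1 if the child had to be killed, 0 if it exited by itself, and
 * -1 on error (errno is EPERM when the kill is not permitted; in that case
 * the child is left running, which mirrors the problem reported here). */
int spawn_and_kill_after_timeout(int timeout_ms)
{
    assert(timeout_ms >= 0);

    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        for (;;) {
            pause();            /* stand-in for a hung child */
        }
    }

    const struct timespec tick = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    int status;

    /* Poll in 10 ms steps until the child exits or the timeout expires. */
    for (int waited = 0; waited < timeout_ms; waited += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) {
            return 0;           /* child finished in time */
        }
        if (r < 0) {
            return -1;
        }
        nanosleep(&tick, NULL);
    }

    /* Timeout expired: this is the call that needs signal permission. */
    if (kill(pid, SIGKILL) != 0) {
        return -1;
    }
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) ? 1 : 0;
}
```

When parent and child share the same user IDs, as in this sketch, the `kill()` call succeeds. The failure discussed in this issue arises only once the child has switched to different IDs.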
However when the backend is running as a non-root user, it does not have permission to do that:
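The reason is described in the kill(2) man page: for a process to have permission to send a signal, it must either be privileged (under Linux: have the CAP_KILL capability in the user namespace of the target process), or the real or effective user ID of the sending process must equal the real or saved set-user-ID of the target process.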
This condition is not being met. The backend does not have CAP_KILL when it is running as a non-root user. krb5_child changes both its real user ID and saved set-user-ID when it runs (although there is special handling for PKINIT).
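The permission rule from kill(2) can be written as a small predicate, which makes it easier to see why the signal is refused. This is a model of the documented rule, not kernel or SSSD code; the struct `proc_creds` and the function `may_signal` are hypothetical names, and user namespaces are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical model of the credentials relevant to kill(2). */
struct proc_creds {
    uid_t ruid;      /* real user ID */
    uid_t euid;      /* effective user ID */
    uid_t suid;      /* saved set-user-ID */
    bool cap_kill;   /* true if the process holds CAP_KILL */
};

/* Returns true if, per the kill(2) rule, sender may signal target:
 * either sender is privileged, or sender's real or effective UID equals
 * target's real or saved set-user-ID. */
bool may_signal(const struct proc_creds *sender,
                const struct proc_creds *target)
{
    assert(sender != NULL && target != NULL);

    if (sender->cap_kill) {
        return true;
    }
    return sender->ruid == target->ruid || sender->ruid == target->suid
        || sender->euid == target->ruid || sender->euid == target->suid;
}
```

With placeholder UIDs (say 991 for the service user and 1000 for the authenticating user), a sender of `{991, 991, 991, false}` is refused against a target of `{1000, 1000, 1000, false}`, which is the situation described above. It would be allowed against `{1000, 1000, 991, false}`, that is, if the child kept the service UID as its saved set-user-ID.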
Afterwards, the backend will proceed to launch a second krb5_child process, which actually will execute in parallel with the first one. Notice the interleaved log messages between two different PIDs here, which are both trying to accomplish the same thing:
I'm not sure which approach is preferable for fixing this.