-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent modunload once underlay has been set #485
Conversation
3db42b9
to
03d371b
Compare
This PR moves the underlay teardown logic from xde_detach into a new ioctl (and opteadm command) to unset the XDE underlay. Detach will now fail so long as the underlay is set -- this successfully prevents `modunload` from knocking us down after this state has been set. The semantics are more or less what you'd expect: * Clearing the underlay will fail if there exists a port. * Clearing the underlay will fail if there is no underlay. * Driver detach now requires that you `opteadm clear-xde-underlay`.
03d371b
to
bca9fd0
Compare
As rough proof of operation:
Naturally we need to change the virtual hardware teardown scripts in omicron's SoftNPU-based environment to call |
@@ -171,6 +172,13 @@ impl OpteHdl { | |||
run_cmd_ioctl(self.device.as_raw_fd(), cmd, Some(&req)) | |||
} | |||
|
|||
/// Clear xde underlay devices. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vague architecture question: why is this code duplicated in opte-ioctl
and opteadm
? It looks like opteadm
already has a dependency on opte-ioctl
; is there a subtle difference between struct OpteAdm
and struct OpteHdl
, such that struct OpteAdm
couldn't be implemented by calling into an OpteHdl
handle?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Staring at it for a bit, I don't see that there is much reason for us to be doing this. We have a decent number of ioctls that appear in one but not the other. OpteAdm
doesn't appear to have any methods which convert from CLI-specific datastructures, either: and if we did, there's no reason that OpteAdm
couldn't just deref to an inner OpteHdl
.
Opened #529.
xde/src/xde.rs
Outdated
drop(u.muh); | ||
|
||
// 2. Remove MAC client handle | ||
if Arc::strong_count(&u.mch) > 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't obviously correct; there could be a race between checking Arc::strong_count
and someone else cloning the Arc
(maybe this race is impossible due to other factors, but it's not obvious here).
Using Arc::into_inner
(and removing drop(u.mch)
below) would be more obviously correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that this race is at all impossible -- xde_rx
has another strong handle to the MAC client handle here, and it could be called into just before we deregister the promisc handle.
That gives two bad orderings between {detach (today), underlay clear} and xde_rx
, which makes me a little nervous if the two happen concurrently: we either bail on cleaning up our extra clients and leak holds on the underlying NIC, or xde_rx
increments the strong count on an invalid Arc
(!!!).
I think the only place this should be a risk is dev environments today, but I'm going to have a think about how we can safely quiesce here. Removing all ports is not a safe precondition, as any underlay traffic (or MAC flow matches, in future) will still land in xde_rx
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent a little time double checking promisc handle behaviour in illumos, and it does give us the guarantees we need. I've left a comment explaining its behaviour for the benefit of future readers. 😄
Following discussion, we need to figure out a safer quiesce on rx as there is a theoretical bad ordering between `xde_rx` and underlay removal.
After some research into MAC promiscuous callback behaviour, I've verified that we don't need to do our own quiescing since the cleanup operations will await any active promisc list walkers terminating. I've explained this logic for the benefit of others and/or future Kyle.
Minor papercut I just ran into, this should keep the command zero-friction.
As part of #485, we need to *inspect* the contents of the shared `XdeState` before taking ownership and dropping its contents. However, I had created a scenario where: - The lock is acquired, based on a reference derived from the dev info ptr (`dip`), - Ownership of the contents of `dip` is taken using `Box::new_raw`, - These contents (`XdeState`) are freed, - The `MutexGuard` pointing into the `XdeState` is dropped, and attempts to unlock a freed mutex. This PR scopes the `&XdeState` and the derived `MutexGuard` in an inner block, correcting this drop order issue by forcing the free to happen at the end of a lexica scope. Fixes #543.
As part of #485, we need to *inspect* the contents of the shared `XdeState` before taking ownership and dropping its contents. However, I had created a scenario where: - The lock is acquired, based on a reference derived from the dev info ptr (`dip`), - Ownership of the contents of `dip` is taken using `Box::new_raw`, - These contents (`XdeState`) are freed, - The `MutexGuard` pointing into the `XdeState` is dropped, and attempts to unlock a freed mutex. This PR scopes the `&XdeState` and the derived `MutexGuard` in an inner block, correcting this drop order issue by forcing the free to happen at the end of a lexical scope. Fixes #543.
This PR moves the underlay teardown logic from xde_detach into a new ioctl (and opteadm command) to unset the XDE underlay. Detach will now fail so long as the underlay is set -- this successfully prevents
modunload
from knocking us down after this state has been set.The semantics are more or less what you'd expect:
opteadm clear-xde-underlay
.