don't reconfigure networkd on "stop" #107
Previously, stopping [email protected] would delete the installed configuration for the foo interface and trigger a networkd configuration reload. Deleting the configuration would revert the interface to the default configuration, and the subsequent networkd reload would reset any conntrack state for connections associated with that interface. Traffic for any connection that relied on the RELATED or ESTABLISHED conntrack properties would then be dropped, when the expectation is that it would continue to be passed.

Impact from this issue was particularly visible on systems running Docker in bridged networking mode, where the containers rely on the Docker-installed iptables rules for connectivity, including, by default, an ACCEPT rule based on established connections. In this case, any connections open from local containers to a remote service would see 100% packet loss after stopping [email protected] (where foo is the interface through which container-generated traffic would egress).

With this change, the generated config is left behind after stopping the [email protected], even after an ENI is removed. In practice, this is not a problem because:

1. Re-attaching the same ENI will use the old configuration, with any configuration changes picked up by the policy-routes service.
2. Connecting a different ENI in the same slot (thus with the same name) will not match the MAC Address value, and will use the default configuration. The policy-routes service will then generate the correct ENI-specific configuration, overwriting any existing configuration left behind by the previously attached ENI.
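The two cases in the list above can be sketched as a toy model in Python. This is only an illustration of first-match selection on a MAC address, assuming the leftover ENI-specific config sorts ahead of the default one; `NetworkConfig`, `select_config`, and the MAC addresses are hypothetical names and values, not part of ec2-net-utils or systemd-networkd.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NetworkConfig:
    """Hypothetical stand-in for a generated network config file."""
    name: str
    match_mac: Optional[str]  # None means "matches any interface"


def select_config(iface_mac: str, configs: List[NetworkConfig]) -> NetworkConfig:
    """Return the first config whose MAC match accepts iface_mac."""
    for cfg in configs:
        if cfg.match_mac is None or cfg.match_mac.lower() == iface_mac.lower():
            return cfg
    raise LookupError(f"no config matches {iface_mac}")


# Config left behind by a previously attached ENI, followed by the default.
configs = [
    NetworkConfig("leftover-eni-specific", match_mac="02:00:00:00:00:aa"),
    NetworkConfig("default", match_mac=None),
]

# Case 1: the same ENI is re-attached, so the old config is reused.
same_eni = select_config("02:00:00:00:00:AA", configs)

# Case 2: a different ENI lands in the same slot; the MAC does not match,
# so the default applies until a new ENI-specific config is generated.
other_eni = select_config("02:00:00:00:00:bb", configs)

print(same_eni.name, other_eni.name)
```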
The systemd default of `control-group` for this value is more aggressive than we want.
...rather than explicitly in the udev rules.
LGTM!
@nmeyerhans When will this become available? I lost network connectivity last night and saw this service ultimately time out. I also hit the Systems Manager role issue (EC2RoleProvider Failed to connect to Systems Manager with instance profile role credentials - resulting in 404 from get http://169.254.169.254/latest (I think)). I have a newly installed image (Apr 3) and it is fairly vanilla. I am running gunicorn/uvicorn (I have seen that mentioned in another post with the same errors). It's odd, as I only have one network interface (enX0). I'd like to try this to see if my instance stays stable.
@rickwargo I'm no longer involved in Amazon Linux development and thus cannot answer your question. Maybe @vigh-m can help. I suspect this is blocked on #108.
Issue #, if available: n/a
Description of changes:
This makes some changes to the behavior of ec2-net-utils when the `policy-routes` service is stopped. The major change is that `stop` no longer removes the generated config. This reduces the amount of work done and eliminates reloading of `systemd-networkd` when doing so provides no meaningful benefit. Any routes and policy rules associated with an instance are deleted when an interface is removed, so the config removal is not meaningful.

This fixes an issue observed when stopping [email protected] that caused forwarded connections (e.g. from a local Docker bridge network) to be flushed from the conntrack tables, leading to dropped packets.
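To make the failure mode concrete, here is a deliberately simplified Python model of a firewall that admits reply traffic only for flows already present in a connection-tracking table. It is a sketch under stated assumptions, not a description of netfilter: `ToyFirewall` and the addresses are hypothetical, and NAT and TCP state tracking are ignored.

```python
from typing import Set, Tuple

# (container_ip, remote_ip, remote_port): a hypothetical flow key.
Flow = Tuple[str, str, int]


class ToyFirewall:
    """Toy model: replies pass only if the flow is already tracked."""

    def __init__(self) -> None:
        self.conntrack: Set[Flow] = set()

    def outbound(self, flow: Flow) -> bool:
        # A new outbound connection is allowed and recorded.
        self.conntrack.add(flow)
        return True

    def inbound_reply(self, flow: Flow) -> bool:
        # Mirrors an ACCEPT rule restricted to established connections.
        return flow in self.conntrack

    def flush(self) -> None:
        # Stands in for the state reset that followed the networkd reload.
        self.conntrack.clear()


fw = ToyFirewall()
flow: Flow = ("172.17.0.2", "203.0.113.10", 443)

fw.outbound(flow)
before_flush = fw.inbound_reply(flow)  # reply accepted while tracked

fw.flush()
after_flush = fw.inbound_reply(flow)   # reply dropped once state is gone

print(before_flush, after_flush)
```

In this model the reply is accepted before the flush and dropped after it, which is the behavior the change avoids by no longer reloading `systemd-networkd` on `stop`.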
There are other smaller changes to the systemd unit files:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.