Support for multiple server addresses on agent #394

Open

jnummelin opened this issue Aug 30, 2022 · 10 comments
Assignees
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@jnummelin
Contributor

jnummelin commented Aug 30, 2022

The current way of implementing HA setups is a bit cumbersome in many cases. When running multiple servers we need to configure each server with the proper --server-count, but the agent can be configured with only one --proxy-server-host address. Essentially this requires an LB of sorts in front of the servers. While this is not really an issue in cloud environments with ELBs and such at one's disposal, it's a real burden in bare-metal and similar environments.

What if we could configure the agent with multiple addresses in --proxy-server-host (or a new flag)? In that case the agent could "just" open connections to all provided servers and thus achieve the same effect as getting --server-count unique connections via the LB. The big pro (IMO) is that it's pretty simple to re-configure the agent (in the k0s case it runs as a DaemonSet) based on e.g. watching some service endpoints.
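For illustration, a rough sketch of how the agent side might behave if --proxy-server-host accepted a comma-separated list. This is purely hypothetical, not current agent code; the multi-value flag semantics and the connectToServer helper are assumptions:

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// Hypothetical sketch of the proposal: instead of dialing one LB address
// --server-count times, open one tunnel per configured proxy server.
func main() {
	hosts := flag.String("proxy-server-host", "", "comma-separated proxy server addresses (hypothetical multi-value form)")
	port := flag.Int("proxy-server-port", 8091, "proxy server port")
	flag.Parse()

	for _, h := range strings.Split(*hosts, ",") {
		h = strings.TrimSpace(h)
		if h == "" {
			continue
		}
		// In the real agent this would start the existing dial/sync loop
		// for the given server; here it only prints the target.
		go connectToServer(fmt.Sprintf("%s:%d", h, *port))
	}
	select {} // keep the goroutines alive
}

func connectToServer(addr string) {
	fmt.Println("would open a tunnel to", addr)
}
```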

WDYT?

There are a couple of somewhat related issues about better support for dynamic server counts worth referencing:
#358
#273

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 28, 2022
@twz123

twz123 commented Dec 5, 2022

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 5, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 5, 2023
@cheftako
Contributor

cheftako commented Mar 5, 2023

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 5, 2023
@cheftako
Contributor

cheftako commented Mar 5, 2023

From another user "Perhaps konnectivity-agent should have an alternate flag --proxy-server-service-name, and it takes a value of a Kubernetes Service and looks at the underlying Endpoints object to find out the specific IP addresses it should connect to. Then it can be sure it is opening connections directly to each replica, without going through the LB."

Another example "Setting the kubernetes service name directly for --proxy-server-host would be a nice improvement and I have also wondered similarly if that is possible."
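A minimal sketch of what that Endpoints lookup could look like with client-go, assuming the agent runs with in-cluster credentials; the proxyServerIPs helper and the "konnectivity-server" service name are examples, not existing code or flags:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// proxyServerIPs resolves the pod IPs behind a Service by reading its
// Endpoints object, so the agent could dial each replica directly
// instead of going through the Service's load balancing.
func proxyServerIPs(ctx context.Context, namespace, service string) ([]string, error) {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		return nil, err
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return nil, err
	}
	ep, err := cs.CoreV1().Endpoints(namespace).Get(ctx, service, metav1.GetOptions{})
	if err != nil {
		return nil, err
	}
	var ips []string
	for _, subset := range ep.Subsets {
		for _, addr := range subset.Addresses {
			ips = append(ips, addr.IP)
		}
	}
	return ips, nil
}

func main() {
	// Service name and namespace are placeholders for whatever the
	// hypothetical --proxy-server-service-name flag would point at.
	ips, err := proxyServerIPs(context.Background(), "kube-system", "konnectivity-server")
	if err != nil {
		panic(err)
	}
	fmt.Println("proxy server endpoints:", ips)
}
```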

@cheftako
Contributor

cheftako commented Mar 5, 2023

Would it make sense, when multiple servers are set simultaneously, to assume there is no LB involved and ignore any indication from the server on the number of connections to attempt?

Alternatively, we could have an explicit flag to turn off the LB retry logic. For now, if multiple endpoints are configured but the LB retry is not turned off, we could error out. This would more easily allow us to support this case in the future once we properly understand what the retry/reconnect logic should be. (E.g. if the requested connection count > the number of configured hosts, do we randomly pick an address to connect to? Do we round-robin? Do we attempt to keep the count per host even?)
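To make that last question concrete, one way "keeping the count per host even" might look; this is only a sketch, and assignConnections is a hypothetical helper, not existing agent code:

```go
package main

import "fmt"

// assignConnections spreads a requested connection count across the
// configured hosts as evenly as possible (round-robin over the list).
func assignConnections(hosts []string, count int) map[string]int {
	perHost := make(map[string]int, len(hosts))
	for i := 0; i < count; i++ {
		perHost[hosts[i%len(hosts)]]++
	}
	return perHost
}

func main() {
	hosts := []string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}
	// e.g. a requested count of 5 across 3 hosts -> 2, 2, 1
	fmt.Println(assignConnections(hosts, 5))
}
```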

@jnummelin
Contributor Author

Would it make sense, when multiple servers are set simultaneously, to assume there is no LB involved and ignore any indication from the server on the number of connections to attempt?

IMO that sounds good. In our use case it would be much easier to just configure the agent with all server addresses.

Perhaps konnectivity-agent should have an alternate flag --proxy-server-service-name, and it takes a value of a Kubernetes Service and looks at the underlying Endpoints object to find out the specific IP addresses it should connect to.

This sounds pretty good to me. In many cases, at least in all cases in the world of k0s, the server is sitting next to the API server and thus we'd be able to use the kubernetes svc endpoints pretty much directly.

Another example "Setting the kubernetes service name directly for --proxy-server-host would be a nice improvement and I have also wondered similarly if that is possible."

This would AFAIK have the downside that you could not run the agent on the host network, as it cannot resolve the svc names.

@liangyuanpeng
Contributor

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Oct 10, 2023
@jnummelin
Contributor Author

jnummelin commented Jan 30, 2024

@liangyuanpeng Why frozen?

@cheftako Have there been any discussions on this in the SIG group? If any of the proposed alternatives sound feasible, someone could have a go at the implementation.

@jkh52
Contributor

jkh52 commented Apr 22, 2024

/assign @cheftako
