Broken compatibility for socketcluster 16.x #2

Open
rfsbsb opened this issue Nov 7, 2023 · 1 comment

Comments

@rfsbsb

rfsbsb commented Nov 7, 2023

We have a project that runs SocketCluster on a Kubernetes cluster, and SocketCluster uses this project internally.
When it was upgraded to the newest version of stream-demux, containers on K8s lost the ability to exchange messages, which broke horizontal scaling of SocketCluster.

After some time investigating, we found the culprit: when this project was updated to version 5.0.1 (theoretically just a patch release), it jumped stream-demux from version 8 to 9. This in turn upgraded writable-consumable-stream to version 4.1.0, which stopped ag-simple-broker from working.

Unfortunately, none of the projects seem to have a changelog, so we're not sure why this happened. To revert it, we overrode the dependency of socketcluster-server so that it uses ag-simple-broker 5.0.0 instead of pulling in 5.0.1.
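
For anyone wanting to apply the same workaround, here is a minimal sketch of what such a pin could look like in the application's package.json. It assumes npm 8.3 or later, which supports the `overrides` field; the exact mechanism we used is not spelled out above, and Yarn users would reach for `resolutions` instead.

```json
{
  "overrides": {
    "socketcluster-server": {
      "ag-simple-broker": "5.0.0"
    }
  }
}
```

After editing package.json, reinstalling and running `npm ls ag-simple-broker` should show which version was actually resolved.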

@jondubois
Member

jondubois commented Nov 8, 2023

Glad to hear you've found the issue/fix.

This change to stream-demux was supposed to be non-breaking, as it did not change any external APIs. However, since it was related to timeouts, it could potentially affect behavior in certain situations.

The issue in writable-consumable-stream was related to the once(timeout) functionality, which was failing to time out (or timing out too late) under certain specific conditions. It's possible that fixing this issue caused some timeouts to (correctly) trigger which did not trigger before. It's possible that your cluster needs a bit of extra time to boot up, though I can't be sure that is actually your issue. I just thought it was worth mentioning.
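
To illustrate the effect described above, here is a rough, self-contained sketch. It does not use writable-consumable-stream at all; `withTimeout` and `simulateSlowBoot` are hypothetical helpers written only for this example. It shows how an operation that takes longer than its limit is rejected once the timeout is actually enforced, while the same operation succeeds with a more generous limit.

```javascript
// Hypothetical helper (not part of any SocketCluster package):
// rejects with a TimeoutError if `promise` does not settle within `ms`.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      const err = new Error(`Operation timed out after ${ms}ms`);
      err.name = 'TimeoutError';
      reject(err);
    }, ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Stand-in for a peer that needs `ms` milliseconds before it is ready.
function simulateSlowBoot(ms) {
  return new Promise((resolve) => setTimeout(() => resolve('connected'), ms));
}

// Try the same 50ms "boot" with a strict and a generous limit.
async function demo() {
  const results = [];
  for (const limit of [20, 200]) {
    try {
      results.push(await withTimeout(simulateSlowBoot(50), limit));
    } catch (err) {
      results.push(err.name);
    }
  }
  return results;
}

demo().then((results) => console.log(results));
```

If a timeout previously never fired, the first case would have quietly succeeded as well, which is one way a bug fix like this can change observable behavior without any API change.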

If that is the issue, it may be worth increasing the various timeout environment variables in your cluster. For example:

BROKER_SERVER_CONNECT_TIMEOUT (default 10000ms)
BROKER_SERVER_ACK_TIMEOUT (default 10000ms)
STATE_SERVER_CONNECT_TIMEOUT (default 3000ms)
STATE_SERVER_ACK_TIMEOUT (default 2000ms)
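
As a rough aid for experimenting with these values, the sketch below scales the defaults listed above by a factor and prints `NAME=value` lines that could be pasted into a deployment's environment. The helper names `scaleTimeouts` and `toEnvLines` are made up for this example, and the defaults are simply copied from the list above, so check them against the version you actually run.

```javascript
// Defaults as quoted in the list above (milliseconds); treat them as
// assumptions and verify against your installed versions.
const DEFAULT_TIMEOUTS_MS = {
  BROKER_SERVER_CONNECT_TIMEOUT: 10000,
  BROKER_SERVER_ACK_TIMEOUT: 10000,
  STATE_SERVER_CONNECT_TIMEOUT: 3000,
  STATE_SERVER_ACK_TIMEOUT: 2000,
};

// Hypothetical helper: multiply every default by `factor` and return the
// results as strings, since environment variables are always strings.
function scaleTimeouts(factor, defaults = DEFAULT_TIMEOUTS_MS) {
  if (!Number.isFinite(factor) || factor <= 0) {
    throw new RangeError('factor must be a positive number');
  }
  return Object.fromEntries(
    Object.entries(defaults).map(([name, ms]) => [name, String(Math.round(ms * factor))])
  );
}

// Hypothetical helper: render the values as NAME=value lines.
function toEnvLines(values) {
  return Object.entries(values)
    .map(([name, value]) => `${name}=${value}`)
    .join('\n');
}

// Example: double every timeout.
console.log(toEnvLines(scaleTimeouts(2)));
```

Doubling is only a starting point; if the problem really is boot time, raising the connect timeouts alone may be enough.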

Anyway, I'm not sure if increasing those values would clear the way to use the newer [email protected]. Downgrading sounds like a good/safe strategy. Thanks for raising awareness of this issue and this solution.
