-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flaky test - "subscribed query gets simultaneous insert and remove after reconnecting" #131
Comments
A new test case failed in a test rerun today, similar issue:
As the error indicates, the test cases pass, then at some point shortly thereafter
|
More info from debugging. The "Already closed" error comes from the server doing a That might be enough hints to set up a local repro, with artificial Server code logs:
Client code logs:
|
The double-done Mocha error can be repro'd locally by adding a Lines 812 to 823 in 58c8006
When a client reconnects a query subscription, Agent#querySubscribe in sharedb core does two things in parallel:
In the double-callback failure, the second one finishes quickly, sending the query diff down to the client. Once the once the insert and remove diffs are done processing on the client, the test immediately finishes, and an afterEach hook closes the test's sharedb Backend instance. If the first one's Lines 113 to 117 in 58c8006
That error gets sent to the client code, which trips a |
Fixes share/sharedb-mongo#131 This change attempts to fix some tests that are flaky in `sharedb-mongo`. The flakiness can be reproduced locally by wrapping the `Agent._querySubscribe()` [call to `_fetchBulkOps()`][1] in a long `setTimeout()`: ```js if (options.fetchOps) { wait++; setTimeout(function() { console.log('fetch bulk ops'); agent._fetchBulkOps(collection, options.fetchOps, finish); }, 1000); } ``` This forces us into an edge case where the subscribe query triggers and returns the diff from a [`queryPoll()`][2], which triggers the tests' `'insert'` handlers, finishes the test, and closes the backend, all before the `_fetchBulkOps()` call is issued (which subsequently fails because the database has been closed). Handling this query subscribe actually triggers a variety of responses to be sent to the client at different times: 1. The actual `_querySubscribe()` [callback][3] (which ultimately triggers [`agent._reply()`][4] in response to the original request) 2. Ops sent [independently][5] by [`_fetchBulkOps()`][6] 3. The diff resulting from [`queryPoll()`][2] In order to reduce flakiness, this change adds checks that the query's [`'ready'` event][7] has been called, which will happen once the resubscribe request has been replied to by the `Agent`. This ensures we've waited for events 1 & 3 of the above list, although we notably aren't waiting for event 2 (which is where the error is actually coming from). Since no ops will actually be sent to the client, I'm not sure how best to wait for this. Hopefully waiting for the subscribe ack should be sufficient. [1]: https://github.com/share/sharedb/blob/b84290f8a66ee42b3d2c75d6de780e125b99c514/lib/agent.js#L521-L524 [2]: https://github.com/share/sharedb/blob/b84290f8a66ee42b3d2c75d6de780e125b99c514/lib/agent.js#L534 [3]: https://github.com/share/sharedb/blob/b84290f8a66ee42b3d2c75d6de780e125b99c514/lib/agent.js#L511 [4]: https://github.com/share/sharedb/blob/b84290f8a66ee42b3d2c75d6de780e125b99c514/lib/agent.js#L344 [5]: https://github.com/share/sharedb/blob/b84290f8a66ee42b3d2c75d6de780e125b99c514/lib/agent.js#L265 [6]: https://github.com/share/sharedb/blob/b84290f8a66ee42b3d2c75d6de780e125b99c514/lib/agent.js#L523 [7]: https://github.com/share/sharedb/blob/b84290f8a66ee42b3d2c75d6de780e125b99c514/lib/client/query.js#L148
Looks like this is still happening: https://github.com/share/sharedb-mongo/actions/runs/4295623744/jobs/7486290853 |
This test is occasionally flaky on CI:
https://github.com/share/sharedb-mongo/runs/6276722964?check_suite_focus=true#step:7:2548
The text was updated successfully, but these errors were encountered: