fix(cli-repl): only ever run MongoshRepl.onExit once MONGOSH-1943 #2300

addaleax · 2024-12-12T14:29:28Z

(Depends on #2299 as the first commit here and on #2298 logically – without #2298, this just fails consistently now instead of flakily)

Previously, a task in our CI was flaky because it was running exit\n as literal input to the shell, expecting it to reliably make the shell exit.

exit as a command leads to the MongoshRepl.onExit function called, which then calls a) .close() and, indirectly, repl.close() on the Node.js REPL instance as well b) the CliRepl.exit function (and through that, indirectly, process.exit()).

However, closing the Node.js REPL instance makes it flush the history file to disk as a typically fast but asynchronous operation, and, after that, emit exit on the REPL; that in turn called MongoshRepl.onExit again (i.e. calling MongoshRepl.onExit leads to another call to the same function, just asynchronously with a short delay).

The flaky test in CI failed because of a combination of issues, including the fact that process.exit() did not necessarily actually exit the process because of coverage tooling interfering with it (which in turn happened because it was running with an altered working directory pointing at a temporary directory where nyc had not created a subdirectory for storing coverage output).

In typical operations, the REPL would flush history quickly, and calling .onExit() again would allow the process to exit normally (by ways of calling CliRepl.exit and process.exit() a second time). However, in the flaky failure case, the history flushing operation would take long enough that the original process.exit() call (which failed due to nyc) would set the internal process._exiting flag before throwing, which in turn prevents process.nextTick() from scheduling callbacks, and which then ultimately prevented the REPL from cleaning itself up, leading to the process timing out instead of exiting.

This commit introduces safeguards against calling .onExit() twice, always returning the same (never-resolving) Promise instance, and explicitly keeps track of whether the REPL was closed or not.

gagik

Nice, seems tricky to catch this!

gagik · 2024-12-13T10:43:31Z

packages/cli-repl/src/mongosh-repl.ts

+      throw new MongoshInternalError(
+        `mongosh not initialized yet\nCurrentTrace: ${
+          new Error().stack
+        }\nClose trace: ${this.closeTrace}\n`


is there a reason we're ending with a newline?

Yeah no, good point. I think I have a tendency to always terminate string that contain multiple newlines with one (just like most editors do for files), but Errors are a bit special in that their messages are usually displayed with an extra newline anyway. Removed in 3bcdc52!

gagik · 2024-12-13T10:49:02Z

packages/cli-repl/src/mongosh-repl.ts

@@ -544,6 +546,7 @@ class MongoshNodeRepl implements EvaluationListener {
    });

    repl.on('exit', () => {
+      if (this._runtimeState) this._runtimeState.repl = null;


what are the implications of this being set to null? is this to force a quicker garbage collection / flushing? might be worth leaving a comment, though perhaps I could just be lacking context

Oh yeah, comment is a good idea – also added in 3bcdc52. (The shell can also exit when stdin ends, which would result in us calling this listener here first, before we call .onExit() through some other means)

Previously, a task in our CI was flaky because it was running `exit\n` as literal input to the shell, expecting it to reliably make the shell exit. `exit` as a command leads to the `MongoshRepl.onExit` function called, which then calls a) `.close()` and, indirectly, `repl.close()` on the Node.js REPL instance as well b) the `CliRepl.exit` function (and through that, indirectly, `process.exit()`). However, closing the Node.js REPL instance makes it flush the history file to disk as a typically fast but asynchronous operation, and, after that, emit `exit` on the REPL; that in turn called `MongoshRepl.onExit` again (i.e. calling `MongoshRepl.onExit` leads to another call to the same function, just asynchronously with a short delay). The flaky test in CI failed because of a combination of issues, including the fact that `process.exit()` did not necessarily actually exit the process because of coverage tooling interfering with it (which in turn happened because it was running with an altered working directory pointing at a temporary directory where `nyc` had not created a subdirectory for storing coverage output). In typical operations, the REPL would flush history quickly, and calling `.onExit()` again would allow the process to exit normally (by ways of calling `CliRepl.exit` and `process.exit()` a second time). However, in the flaky failure case, the history flushing operation would take long enough that the original `process.exit()` call (which failed due to nyc) would set the internal `process._exiting` flag before throwing, which in turn prevents `process.nextTick()` from scheduling callbacks, and which then ultimately prevented the REPL from cleaning itself up, leading to the process timing out instead of exiting. This commit introduces safeguards against calling `.onExit()` twice, always returning the same (never-resolving) `Promise` instance, and explicitly keeps track of whether the REPL was closed or not.

addaleax mentioned this pull request Dec 12, 2024

chore(e2e-tests): consistently wait for successful exits of shell MONGOSH-1943 #2301

Merged

gagik approved these changes Dec 13, 2024

View reviewed changes

addaleax added 2 commits December 13, 2024 08:44

fixup: cr comments

3bcdc52

addaleax force-pushed the 1943-mongoshrepl-onexit branch from df41dba to 3bcdc52 Compare December 13, 2024 13:46

addaleax merged commit 18a499e into main Dec 13, 2024
127 of 130 checks passed

addaleax deleted the 1943-mongoshrepl-onexit branch December 13, 2024 15:42

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(cli-repl): only ever run MongoshRepl.onExit once MONGOSH-1943 #2300

fix(cli-repl): only ever run MongoshRepl.onExit once MONGOSH-1943 #2300

addaleax commented Dec 12, 2024 •

edited

Loading

gagik left a comment

gagik Dec 13, 2024

addaleax Dec 13, 2024

gagik Dec 13, 2024

addaleax Dec 13, 2024

fix(cli-repl): only ever run MongoshRepl.onExit once MONGOSH-1943 #2300

fix(cli-repl): only ever run MongoshRepl.onExit once MONGOSH-1943 #2300

Conversation

addaleax commented Dec 12, 2024 • edited Loading

gagik left a comment

Choose a reason for hiding this comment

gagik Dec 13, 2024

Choose a reason for hiding this comment

addaleax Dec 13, 2024

Choose a reason for hiding this comment

gagik Dec 13, 2024

Choose a reason for hiding this comment

addaleax Dec 13, 2024

Choose a reason for hiding this comment

addaleax commented Dec 12, 2024 •

edited

Loading