Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timeout requests and stop processing further #204

Merged
merged 14 commits into from
May 2, 2023

Conversation

MadLittleMods
Copy link
Contributor

@MadLittleMods MadLittleMods commented May 1, 2023

Timeout requests and stop processing further

Fix #148
Fix #40

  • Apply timeout middleware to all room directory and room routes
  • Stop messing with the response after we timeout. Fix Apply timeout middleware to all routes #148
    • This also involves cancelling any async/await things like requests in the routes so we throw an abort error instead of continuing on. Fix Cancellable requests #40
  • Also abort the route if we see that the user closed the request before we could respond to them
  • Bumps minimum supported Node.js version to v18 because we're now using the built-in native fetch in Node.js vs node-fetch. This gives us the custom signal.reason that we aborted with instead of a generic AbortError.

Dev notes

Avoid errors like the following where the route function still continued on after we sent a timeout response to the user:

Error [ERR_HTTP_HEADERS_SENT]: Cannot set headers after they are sent to the client
    at new NodeError (node:internal/errors:371:5)
    at ServerResponse.setHeader (node:_http_outgoing:576:11)
    at ServerResponse.header (matrix-public-archive/node_modules/express/lib/response.js:776:10)
    at ServerResponse.append (matrix-public-archive/node_modules/express/lib/response.js:737:15)
    at setHeadersToPreloadAssets (matrix-public-archive/server/lib/set-headers-to-preload-assets.js:73:7)
    at matrix-public-archive/server/routes/room-routes.js:908:5

Instead, we are left with these errors logged to the console each time we timeout. It would be best if we could ignore these in an error handler (added as a TODO here)

AbortError: The user aborted a request.
    at abort (matrix-public-archive/node_modules/node-fetch/lib/index.js:1448:16)
    at AbortSignal.abortAndFinalize (matrix-public-archive/node_modules/node-fetch/lib/index.js:1463:4)
    at AbortSignal.[nodejs.internal.kHybridDispatch] (node:internal/event_target:647:20)
    at AbortSignal.dispatchEvent (node:internal/event_target:589:26)
    at abortSignal (node:internal/abort_controller:283:10)
    at AbortController.abort (node:internal/abort_controller:314:5)
    at Timeout.<anonymous> (matrix-public-archive/server/middleware/timeout-middleware.js:25:25)
    at listOnTimeout (node:internal/timers:559:17)
    at processTimers (node:internal/timers:502:7)

Update: Switched to custom errors and native fetch in Node.js which emit the custom signal.reason instead of a generic error. So we now get a lovely error like this:

RouteTimeoutAbortError: Timed out after 25000ms while trying to respond to route /r/too-many-messages-on-day:my.synapse.server/date/2022/11/16
    at Timeout.<anonymous> (server/middleware/timeout-middleware.js:28:7)
    at listOnTimeout (node:internal/timers:569:17)
    at process.processTimers (node:internal/timers:512:7)

Todo

  • Support abortSignal in renderHydrogenToString(...)
  • Add an error handler to ignore AbortError (see dev notes for example) which appear in the logs whenever we timeout. We don't need to see these in the logs

@MadLittleMods MadLittleMods added T-Enhancement New feature or request A-backend Related to the backend Node.js server pieces labels May 1, 2023
@MadLittleMods MadLittleMods mentioned this pull request May 1, 2023
2 tasks
@MadLittleMods MadLittleMods marked this pull request as ready for review May 2, 2023 04:58
Comment on lines +29 to +31
new RouteTimeoutAbortError(
`Timed out after ${requestTimeoutMs}ms while trying to respond to route ${req.originalUrl}`
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, it would be interesting to attach some more info about the long-running requests that took up all of the time (get the spans).

Future PR stuff though.

Perhaps, it would just be better to have per-request correlated logging and see the other info that way, #9

Comment on lines +187 to +192
// We should be able to just `reject(err)` without any special-case handling
// here since ideally, we expect the error to be whatever `signal.reason` we
// aborted with but `child_process.fork(...)` doesn't seem play nicely, see
// https://github.com/nodejs/node/issues/47814
if (err.name === 'AbortError') {
const childErrorSummary = assembleErrorAfterChildExitsWithErrors(
childExitCode,
childErrors,
childStdErr
);
reject(
new RethrownError(
`Timed out while running ${modulePath} so we aborted the child process after ${timeout}ms. Any child errors? (${childErrors.length})`,
childErrorSummary
)
);
reject(abortController.signal.reason || err);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created nodejs/node#47814 to track when we can stop doing this workaround

@MadLittleMods MadLittleMods merged commit 9078abf into main May 2, 2023
@MadLittleMods MadLittleMods added the A-tracing OpenTelemetry tracing (spans, timing, observability) label May 2, 2023
@MadLittleMods MadLittleMods deleted the madlittlemods/148-40-timeout-all-requests branch May 2, 2023 06:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-backend Related to the backend Node.js server pieces A-tracing OpenTelemetry tracing (spans, timing, observability) T-Enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Apply timeout middleware to all routes Cancellable requests
1 participant