resolvers: run in separate processes #538
Comments
#548 may reduce the urgency on this by optimizing some resolver stuff. |
Yeah, the reason I didn't go with separate processes is so that the resolvers have access to the central data-store (if it works that way) |
We could potentially put the data store into shared memory but I'm guessing that managing parallel access would be challenging. |
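As a rough illustration of the coordination cost (all names here are hypothetical, nothing like this exists in the codebase): even with a managed shared dict, a read-modify-write spans two proxy calls and needs an explicit lock:

```python
import multiprocessing as mp

# Hypothetical sketch of resolver processes sharing one data store via a
# Manager proxy. Assumes a POSIX host (the "fork" start method).
ctx = mp.get_context("fork")

def _bump(store, lock):
    # The increment is a read followed by a write (two proxy calls), so
    # the Manager alone doesn't make it atomic; it must be locked.
    with lock:
        store["updates"] += 1

def run_resolvers(n_procs):
    """Spawn n_procs resolver-like processes sharing one managed dict."""
    with ctx.Manager() as manager:
        store = manager.dict({"updates": 0})
        lock = ctx.Lock()
        procs = [ctx.Process(target=_bump, args=(store, lock))
                 for _ in range(n_procs)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return store["updates"]
```

Every access goes through IPC to the manager process, so this trades the GIL for serialisation overhead rather than removing the bottleneck.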
Silly idea, and wider scope..
Would we even consider building a data-store for each subscription? The advantages being:
Disadvantages would include machine load, amongst others.. |
First part, yes (and I think we've discussed it).. Second part, maybe? |
That sounds like a good idea. I think this aligns with what I was thinking of in #464. I think resolving subscriptions off of the same data store still makes sense so long as filtering by the n-window doesn't become prohibitively expensive. |
Yes we originally wanted n=0 (or perhaps 1) at the scheduler, since that's all it needs to schedule. |
(the only reason we would want n=1 at the scheduler is Tui, we can drop back to n=0 if there isn't a client connected) |
See also #194 |
In multi-user setups we may have multiple users subscribing to multiple workflows simultaneously.
Large workflows can cause heavy server load in some cases (e.g. #547). Because we handle each request synchronously on the same process, this means larger updates can hold up other updates. In the extreme, this can cause UIs to show as disconnected as a result of websocket heartbeat timeout.
Ideally we would find a way to run the resolvers for each workflow in separate processes to isolate them from each other. Though we would probably want to limit the number of processes and distribute subscriptions across a pool.
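One simple distribution scheme (purely illustrative, not proposing an implementation) is to pin each workflow to one worker in a fixed-size pool via a stable hash, so a single heavy workflow can only saturate one process:

```python
from zlib import crc32

POOL_SIZE = 4  # hypothetical cap on resolver processes

def worker_for(workflow_id: str) -> int:
    # crc32 is stable across runs (unlike hash(), which is salted by
    # PYTHONHASHSEED), so a workflow's subscriptions always land on the
    # same worker process.
    return crc32(workflow_id.encode()) % POOL_SIZE
```

A round-robin or least-loaded assignment would spread load more evenly, at the cost of a workflow's state being touched from multiple processes.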
Original Post
We run subscriptions in a ThreadPoolExecutor.
In Python only one thread can actively run at a time (because of the GIL) so there is no compute parallelism advantage to this (but there may be IO concurrency advantages depending on the implementation details of the code being run).
This means that one large workflow can hog 100% of the CPU of the server, causing issues with other workflows.
We should be able to swap the ThreadPoolExecutor for a ProcessPoolExecutor. I had a quick try, but it didn't work first time: the first update came through but subsequent ones got stuck, so a little work is required.
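The swap itself is one line, but everything submitted must be picklable and the worker processes don't see the parent's in-memory data store, either of which could explain updates getting stuck after the first one. A minimal sketch (the resolver function is a stand-in, not the real code):

```python
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor

def resolve(update_id):
    # Stand-in for per-workflow resolver work; it must be a picklable
    # top-level function for ProcessPoolExecutor to ship to workers.
    return update_id * 2

def resolve_all(updates, max_workers=2):
    # "fork" assumed (POSIX) so the sketch also runs when defined in
    # __main__ without an import guard.
    ctx = mp.get_context("fork")
    with ProcessPoolExecutor(max_workers=max_workers, mp_context=ctx) as pool:
        return list(pool.map(resolve, updates))
```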
This does raise the question of how many processes the UI Server should be allowed to spawn. I think we should be able to run more subscribers than processes in the pool, but would need to read the docs to find out.
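On the pool-size question: `concurrent.futures` executors accept more submissions than `max_workers` and queue the excess, so subscribers can outnumber processes and simply wait their turn. A toy demonstration with a thread pool (the queuing behaviour is the same for `ProcessPoolExecutor`):

```python
from concurrent.futures import ThreadPoolExecutor

def run_many(n_tasks, max_workers=2):
    # More submissions than workers: extra tasks sit in the executor's
    # internal queue until a worker frees up; nothing is rejected.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(lambda i=i: i * i) for i in range(n_tasks)]
        return [f.result() for f in futures]
```

The practical question is then latency rather than capacity: queued subscribers see delayed updates while the workers are busy.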