Add thread pool and concurrency max threads configuration option #1

Draft
wants to merge 58 commits into master
Conversation


@MattFenelon MattFenelon commented Aug 30, 2024

MattFenelon and others added 30 commits March 27, 2024 12:31
…handle on what's causing thread pool hangs. refs graphiti-api#469"

This reverts commit 7941b6f.
This option allows limiting the maximum number of resources that can be
sideloaded concurrently. With a properly configured connection pool,
this ensures that ActiveRecord's connection pool is not exhausted by
the sideloading process.

The thread pool configuration is based on ActiveRecord's
global_thread_pool_async_query_executor.

This was previously attempted but there were reports of deadlocks. This
code differs from the original by using a Concurrent::Delay assigned to a
constant instead of a regular Mutex. The Delay+constant pattern is how
concurrent-ruby sets up its own global thread pools, so it's more likely
to be correct.

Closes graphiti-api#469

See: https://github.com/rails/rails/blob/94faed4dd4c915829afc2698e055a0265d9b5773/activerecord/lib/active_record.rb#L279-L287
See: graphiti-api#470
This includes some logging for diagnostics to show what's happening when
the deadlock occurs.

```
Graphiti::Scope
  #resolve_sideloads
    when the requested sideload exists on the resource
      with concurrency
        with nested sideloads greater than Graphiti.config.concurrency_max_threads
thread 6220: employees queuing positions
thread 6220: employees waiting on [:positions]
thread 6240: running positions
thread 6240: positions queuing department
thread 6240: positions waiting on [:department]
          does not deadlock (FAILED - 1)

Failures:

  1) Graphiti::Scope#resolve_sideloads when the requested sideload exists on the resource with concurrency with nested sideloads greater than Graphiti.config.concurrency_max_threads does not deadlock
     Failure/Error: expect { instance.resolve_sideloads(results) }.not_to raise_error

       expected no Exception, got #<fatal:"No live threads left. Deadlock?\n2 threads, 2 sleeps current:0x00007f7e6f7b1780 main thread:...or.rb:339 sleep_forever>\n   rb_thread_t:0x00007f7e6f7b1780 native:0x000070000cfb4000 int:0\n   \n"> with backtrace:
         # ./lib/graphiti/scope.rb:78:in `resolve_sideloads'
         # ./spec/scope_spec.rb:145:in `block (7 levels) in <top (required)>'
         # ./spec/scope_spec.rb:145:in `block (6 levels) in <top (required)>'
```
…s-configuration-option

# Conflicts:
#	lib/graphiti/scope.rb
Not sure if I want to keep this, but adding for now.
Rails assigns a connection to a thread, so the close call has to happen
on that same thread for the thread's connection to be closed.
@resource.resolve will return either a promise or a real value
I misunderstood what @resource.resolve would call, assuming that
`scope.to_a` referred to the Scope class. It actually refers to the
base_scope, which for ActiveRecord will be an ActiveRecord query.
Pass the arguments to the promise so it runs with the correct values.

I've also disabled the adapter close for now as I'm not sure how to
close it without breaking other promises that might run on the same
thread in the pool.
To debug the rails tests. I'll remove this once the tests are passing.
I'll remove this once the PR is ready
This is a slightly different implementation due to the use of promises
for both async and sync modes. The only difference between the modes is
whether the thread pool is set to synchronous or not.
Includes fix:

* Worker threads get stuck when using caller_runs policy (ruby-concurrency/concurrent-ruby#933)
This makes it clearer what's going on when a method is called by
enforcing that the method returns either a future or a resolved value.
It simplifies the code by removing the conditional statements that were
needed to check whether a future or a resolved value was being returned,
which was necessary to know whether the code was being called async or
not.

It also ensures that sync code always has a synchronous entry point to
call that will always block.
MattFenelon and others added 28 commits June 13, 2024 17:36
This ensures that any thread locals that were available to the Graphiti
resolving thread are also accessible to the sideload threads.
This ensures that any fiber locals that were available to the Graphiti
resolving thread are also accessible to the sideload threads.
Otherwise the thread/fiber locals are reset back to the same values.
Avoid memory leaks where the thread/fiber pool stores every local seen.
For ruby 3.3 and head on GitHub actions, the following error happens
when accessing Fiber.current.storage.keys:

```
Failure/Error:
  fiber_storage = Fiber.current.storage.keys.each_with_object({}) do |key, memo|
    memo[key] = Fiber[key]
  end

NoMethodError:
   undefined method 'keys' for nil
```

I don't think this should be happening as Fiber.current.storage is meant
to always return a hash, even with Fiber.new(storage: nil) or
Fiber.current.storage = nil.
This proves that #flat is required to flatten the promise returned by
sideload.future_resolve that is itself called from within a promise.
Collecting stats on the thread pool can be useful for understanding the
state of the thread pool; for example, whether a lot of tasks are having
to be queued.
Multiple sideloads can error concurrently. In that case, raise the first
error and ignore the rest. This allows normal processing to deal with an
expected error, rather than the Concurrent::MultipleErrors class that
concurrent-ruby raises.
I don't have a repro of the missing exception, but have seen this error
in testing:

```
exception class/object expected .rescue { |err| raise err } ^^^
```
With synchronous loading, the first error is raised and loading is
aborted.

With async, multiple errors can happen together and we need to pick one.

Pick the first to match how synchronous errors happen.