Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize enqueue a bit #3573

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open

Optimize enqueue a bit #3573

wants to merge 9 commits into from

Conversation

bugadani
Copy link
Contributor

@bugadani bugadani commented Nov 25, 2024

This is a measurable difference on Xtensa, on ESP32-S3 (atomic CAS). The difference for the ESP32-S2 (critical_section) disappeared after a compiler update, unfortunately 🤔

  • Before: Test OK, count=25431, cycles=47186/100
  • After: Test OK, count=26810, cycles=44759/100

This PR adds a new CLAIMED state. Tasks are first CLAIMED, then transition to SPAWNED. This establishes the invariant that if a task is SPAWNED, it has a valid executor configured. Going from CLAIMED to SPAWNED clears the CLAIMED bit.

The pull request then uses this invariant to enqueue a task unconditionally upon wake. This saves a few instructions in the common case, but RUN_QUEUED state no longer means that a task can definitely be found in a run queue.

Edge cases (may be incomplete)

  • Dropping the SpawnToken of a claimed task will leak that task. Implementing Drop on SpawnToken can prevent this, but the implementation already exists and panics.
  • Waking a task that is CLAIMED. As the task's executor may be written at any point, we don't enqueue the task. This prevents the task from getting enqueued in two different run queues. Transitioning a task to SPAWNED will also enqueue it in its new executor.
  • Waking a task just before it becomes CLAIMED. run_enqueue is an atomic operation, so the RUN_QUEUED bit will be set, and the task will not be claimed. The task will be enqueued in the last executor that ran it, and it will be run_dequeue'd, then ignored. A subsequent spawn call may claim the task.
  • Waking a task that has exited. Same behaviour as the "just before it becomes CLAIMED" case.
  • A task can be RUN_QUEUED, but not be present in any run queue. This is possible only if the task is CLAIMED, and transitioning to SPAWNED will properly enqueue the task.
  • On its last poll, task wakes itself, then exits. The task ends up in a run queue, with SPAWNED and CLAIMED clear, and RUN_QUEUED set. The executor's next poll will clear this state in run_enqueue, and the unspawned task will NOT be polled.

Not necessarily a separate edge case, but a concrete concern voiced on Matrix:

also, I wonder if there's some possible race between "thread A respawns a task on a different executor" and "thread B wakes the task"
feels like if it races the right way the task might end up enqueued in the old executor
if that's the case then it's an existing unsoundness

This case is covered by "Waking a task that is CLAIMED" and "Waking a task just before it becomes CLAIMED" I believe. Depending on which thread gets to mutating the state first, the task will either not get spawned, or will not be enqueued.

yeah but waking spawning a task first sets the SPAWNED bit, then writes the executor pointer
what if another thread wakes the task between those operations?

The CLAIMED state ensures that the task will not be queued until the executor is properly updated.

@bugadani bugadani force-pushed the enqueue branch 3 times, most recently from 44ccf40 to 07fade3 Compare November 26, 2024 09:14
@bugadani
Copy link
Contributor Author

bugadani commented Nov 26, 2024

The new test cases fail on main, because exited tasks are not pended there:

---- waking_with_old_waker_after_respawn stdout ----
thread 'waking_with_old_waker_after_respawn' panicked at tests\test.rs:230:5:
assertion `left == right` failed
  left: ["pend", "yield_now", "pend", "poll task1"]
 right: ["pend", "yield_now", "pend", "poll task1", "pend"]
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- waking_after_completion_does_not_poll stdout ----
thread 'waking_after_completion_does_not_poll' panicked at tests\test.rs:176:5:
assertion `left == right` failed
  left: ["pend", "poll task1", "pend", "poll task1"]
 right: ["pend", "poll task1", "pend", "pend", "pend", "poll task1"]

Pending exited tasks make sure the RUNNING bit (which we now unconditionally set) is cleared, which is needed to not leak a task's storage. This clearing is done by run_dequeue and I think it does not matter which executor does it - but the CLAIMED state makes sure it's the old one pre-respawn, and the new one post-respawn. During respawn, pending isn't necessary so we don't do it.

ARM changes are best guess only.

@bugadani bugadani marked this pull request as ready for review November 26, 2024 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant