diff --git a/execution.bs b/execution.bs index f38ac5c..f7de51e 100644 --- a/execution.bs +++ b/execution.bs @@ -1503,6 +1503,9 @@ The changes since R9 are as follows: Fixes: + * `ensure_started`, `start_detached`, `execute`, and `execute_may_block_caller` + are removed from the proposal. They are to be replaced with safer and more + structured APIs by [@P3149R3]. Enhancements: @@ -2303,7 +2306,7 @@ usages will only accept multi-shot senders. Algorithms that accept senders will typically either decay-copy an input sender and store it somewhere for later usage (for example as a data-member of the returned sender) or will immediately call `execution::connect` on the input -sender, such as in `this_thread::sync_wait` or `execution::start_detached`. +sender, such as in `this_thread::sync_wait`. Some multi-use sender algorithms may require that an input sender be copy-constructible but will only call `execution::connect` on an rvalue of each @@ -2573,10 +2576,10 @@ accelerator can sometimes be considerable. However, in the process of working on this paper and implementations of the features proposed within, our set of requirements has shifted, as we understood the different implementation strategies that are available for the feature set -of this paper better, and, after weighting the earlier concerns against the +of this paper better, and, after weighing the earlier concerns against the points presented below, we have arrived at the conclusion that a purely lazy model is enough for most algorithms, and users who intend to launch work earlier -may use an algorithm such as `ensure_started` to achieve that goal. We have also +may write an algorithm to achieve that goal. We have also come to deeply appreciate the fact that a purely lazy model allows both the implementation and the compiler to have a much better understanding of what the complete graph of tasks looks like, allowing them to better optimize the code - @@ -3239,8 +3242,7 @@ is related to the sender arguments it has received. Sender adaptors are lazy, that is, they are never allowed to submit any work for execution prior to the returned sender being [=started=] later on, and are also guaranteed to not start any input senders passed into them. Sender -consumers such as [[#design-sender-consumer-start_detached]] and -[[#design-sender-consumer-sync_wait]] start senders. +consumers such as [[#design-sender-consumer-sync_wait]] start senders. For more implementer-centric description of starting senders, see [[#design-laziness]]. @@ -3483,50 +3485,11 @@ execution::sender auto final = execution::then(both, [](auto... args){ // when final executes, it will print "the two args: 1, abc" -### `execution::ensure_started` ### {#design-sender-adaptor-ensure_started} - -
-execution::sender auto ensure_started(
-    execution::sender auto sender
-);
-
- -Once `ensure_started` returns, it is known that the provided sender has been -[=connect|connected=] and `start` has been called on the resulting operation -state (see [[#design-states]]); in other words, the work described by the -provided sender has been submitted -for execution on the appropriate execution resources. Returns a sender which -completes when the provided sender completes and sends values equivalent to -those of the provided sender. - -If the returned sender is destroyed before `execution::connect()` is called, or -if `execution::connect()` is called but the returned operation-state is -destroyed before `execution::start()` is called, then a stop-request is sent to -the eagerly launched operation and the operation is detached and will run to -completion in the background. Its result will be discarded when it eventually -completes. - -Note that the application will need to make sure that resources are kept alive -in the case that the operation detaches. e.g. by holding a `std::shared_ptr` to -those resources or otherwise having some out-of-band way to signal completion of -the operation so that resource release can be sequenced after the completion. - ## User-facing sender consumers ## {#design-sender-consumers} A [=sender consumer=] is an algorithm that takes one or more senders, which it may `execution::connect`, as parameters, and does not return a sender. -### `execution::start_detached` ### {#design-sender-consumer-start_detached} - -
-void start_detached(
-    execution::sender auto sender
-);
-
- -Like `ensure_started`, but does not return a value; if the provided sender sends -an error instead of a value, `std::terminate` is called. - ### `this_thread::sync_wait` ### {#design-sender-consumer-sync_wait}
@@ -3537,12 +3500,12 @@ auto sync_wait(
 
`this_thread::sync_wait` is a sender consumer that submits the work described by -the provided sender for execution, similarly to `ensure_started`, except that it -blocks the current `std::thread` or thread of `main` until the work is +the provided sender for execution, +blocking the current `std::thread` or thread of `main` until the work is completed, and returns an optional tuple of values that were sent by the provided sender on its completion of work. Where [[#design-sender-factory-schedule]] and [[#design-sender-factory-just]] are -meant to enter the domain of senders, `sync_wait` is meant to exit +meant to enter the domain of senders, `sync_wait` is one way to exit the domain of senders, retrieving the result of the task graph. If the provided sender sends an error instead of values, `sync_wait` throws that @@ -3568,28 +3531,6 @@ different synchronization mechanisms than `std::thread`'s will provide their own flavors of `sync_wait` as well (assuming their execution agents have the means to block in a non-deadlock manner). -## `execution::execute` ## {#design-execute} - -In addition to the three categories of functions presented above, we also -propose to include a convenience function for fire-and-forget eager one-way -submission of an invocable to a scheduler, to fulfil the role of one-way -executors from P0443. - -
-void execution::execute(
-    execution::schedule auto sched,
-    std::invocable auto fn
-);
-
- -Submits the provided function for execution on the provided scheduler, as-if by: - -
-auto snd = execution::schedule(sched);
-auto work = execution::then(snd, fn);
-execution::start_detached(work);
-
- # Design - implementer side # {#design-implementer} ## Receivers serve as glue between senders ## {#design-receivers} @@ -3632,8 +3573,8 @@ algorithm: `start`, which serves as the submission point of the work represented by a given operation state. Operation states are not a part of the user-facing API of this proposal; they -are necessary for implementing sender consumers like `execution::ensure_started` -and `this_thread::sync_wait`, and the knowledge of them is necessary to +are necessary for implementing sender consumers like `this_thread::sync_wait`, +and the knowledge of them is necessary to implement senders, so the only users who will interact with operation states directly are authors of senders and authors of sender algorithms. @@ -3765,9 +3706,7 @@ that accepts a sender as its first argument, should do the following: ## Sender adaptors are lazy ## {#design-laziness} Contrary to early revisions of this paper, we propose to make all sender -adaptors perform strictly lazy submission, unless specified otherwise (the one -notable exception in this paper is [[#design-sender-adaptor-ensure_started]], -whose sole purpose is to start an input sender). +adaptors perform strictly lazy submission, unless specified otherwise. Strictly lazy submission means that there is a guarantee that no work is submitted to an execution resource before a receiver is @@ -3794,10 +3733,7 @@ capable of removing the senders abstraction entirely, while still allowing for composition of functions across different parts of a program. The second way for this to occur is when a sender algorithm is specialized for a -specific set of arguments. For instance, we expect that, for senders which are -known to have been started already, [[#design-sender-adaptor-ensure_started]] -will be an identity transformation, because the sender algorithm will be -specialized for such senders. Similarly, an implementation could recognize two +specific set of arguments. For instance, an implementation could recognize two subsequent [[#design-sender-adaptor-bulk]]s of compatible shapes, and merge them together into a single submission of a GPU kernel. @@ -5291,7 +5227,6 @@ template<class Initializer> [exec.recv]Receivers [exec.opstate]Operation states [exec.snd]Senders -[exec.execute]One-way execution 3. Table 2 shows the types of customization point objects @@ -5307,7 +5242,7 @@ template<class Initializer> core provide core execution functionality, and connection between core components - e.g., `connect`, `start`, `execute` + e.g., `connect`, `start` completion functions @@ -5321,7 +5256,7 @@ template<class Initializer> @@ -5332,7 +5267,7 @@ template<class Initializer> @@ -5815,7 +5750,6 @@ namespace std::execution { struct let_stopped_t { see below }; struct bulk_t { see below }; struct split_t { see below }; - struct ensure_started_t { see below }; struct when_all_t { see below }; struct when_all_with_variant_t { see below }; struct into_variant_t { see below }; @@ -5834,17 +5768,12 @@ namespace std::execution { inline constexpr let_stopped_t let_stopped{}; inline constexpr bulk_t bulk{}; inline constexpr split_t split{}; - inline constexpr ensure_started_t ensure_started{}; inline constexpr when_all_t when_all{}; inline constexpr when_all_with_variant_t when_all_with_variant{}; inline constexpr into_variant_t into_variant{}; inline constexpr stopped_as_optional_t stopped_as_optional{}; inline constexpr stopped_as_error_t stopped_as_error{}; - // [exec.consumers], sender consumers - struct start_detached_t { see below }; - inline constexpr start_detached_t start_detached{}; - // [exec.utils], sender and receiver utilities // [exec.utils.cmplsigs] template<class Fn> @@ -5885,10 +5814,7 @@ namespace std::execution { } namespace std::this_thread { - // [exec.queries], queries - struct execute_may_block_caller_t { see below }; - inline constexpr execute_may_block_caller_t execute_may_block_caller{}; - + // [exec.consumers], consumers struct sync_wait_t { see below }; struct sync_wait_with_variant_t { see below }; @@ -5897,10 +5823,6 @@ namespace std::this_thread { } namespace std::execution { - // [exec.execute], one-way execution - struct execute_t { see below }; - inline constexpr execute_t execute{}; - // [exec.as.awaitable] struct as_awaitable_t { see below }; inline constexpr as_awaitable_t as_awaitable{}; @@ -6111,29 +6033,6 @@ namespace std::execution { `forward_progress_guarantee::parallel`, all such execution agents shall provide at least the parallel forward progress guarantee. -### `this_thread::execute_may_block_caller` [exec.execute.may.block.caller] ### {#spec-execution.execute_may_block_caller} - -1. `execute_may_block_caller` asks a scheduler `sch` whether any invocation of - the `execute` algorithm ([exec.execute]) with `sch` may block the current - thread of execution ([defns.block]). - -2. The name `execute_may_block_caller` denotes a query object. For - a subexpression `sch`, let `Sch` be `decltype((sch))`. If `Sch` does not - satisfy `scheduler`, `execute_may_block_caller(sch)` is ill-formed. - Otherwise, `execute_may_block_caller(sch)` is - expression-equivalent to: - - 1. MANDATE-NOTHROW(as_const(sch).query(execute_may_block_caller)), - if that expression is well-formed. - - * Mandates: The type of the expression above is `bool`. - - 2. Otherwise, `true`. - -3. If `execute_may_block_caller(sch)` returns `false` for some scheduler `sch`, - no invocation of the `execute` algorithm with `sch` shall block the calling - thread. - ### `execution::get_completion_scheduler` [exec.completion.scheduler] ### {#spec-execution.get_completion_scheduler} 1. get_completion_scheduler<completion-tag> obtains the @@ -8388,37 +8287,34 @@ namespace std::execution { - propagates all completion operations sent by `sndr`. -#### `execution::split` and `execution::ensure_started` [exec.split] #### {#spec-execution.senders.adapt.split} +#### `execution::split` [exec.split] #### {#spec-execution.senders.adapt.split} 1. `split` adapts an arbitrary sender into a sender that can be connected - multiple times. `ensure_started` eagerly starts the execution of a sender, - returning a sender that is usable as input to additional sender algorithms. + multiple times. -2. Let `shared-env` be the type of an environment such that, +2. Let `split-env` be the type of an environment such that, given an instance `env`, the expression `get_stop_token(env)` is well-formed and has type `inplace_stop_token`. -3. The names `split` and `ensure_started` denote pipeable sender adaptor objects. - Let the expression `shared-cpo` be one of `split` or - `ensure_started`. For a subexpression `sndr`, let `Sndr` be - `decltype((sndr))`. If sender_in<Sndr, shared-env> is - `false`, shared-cpo(sndr) is ill-formed. +3. The name `split` denotes a pipeable sender adaptor object. + For a subexpression `sndr`, let `Sndr` be `decltype((sndr))`. + If sender_in<Sndr, split-env> is + `false`, split(sndr) is ill-formed. -4. Otherwise, the expression shared-cpo(sndr) is +4. Otherwise, the expression split(sndr) is expression-equivalent to:
       transform_sender(
         get-domain-early(sndr),
-        make-sender(shared-cpo, {}, sndr))
+        make-sender(split, {}, sndr))
       
except that `sndr` is evaluated only once. - The default implementation of `transform_sender` - will have the effect of connecting the sender to a receiver and, in the - case of `ensure_started`, calling `start` on the resulting operation - state. It will return a sender with a different tag type. + will have the effect of connecting the sender to a receiver. + It will return a sender with a different tag type. 5. Let `local-state` denote the following exposition-only class template: @@ -8427,7 +8323,6 @@ namespace std::execution { struct local-state-base { // exposition only virtual ~local-state-base() = default; virtual void notify() noexcept = 0; // exposition only - virtual void detach() noexcept = 0; // exposition only }; template<class Sndr, class Rcvr> @@ -8439,7 +8334,6 @@ namespace std::execution { ~local-state(); void notify() noexcept override; - void detach() noexcept override; private: optional<on-stop-callback> on_stop; // exposition only @@ -8467,7 +8361,6 @@ namespace std::execution { 1. *Effects:* Equivalent to:
-            detach();
             sh_state->dec-ref();
             
@@ -8479,42 +8372,29 @@ namespace std::execution {
             on_stop.reset();
             visit(
-              [this]<class Tuple>(Tuple&& tupl) noexcept -> void {
+              [this](const auto& tupl) noexcept -> void {
                 apply(
-                  [this](auto tag, auto&... args) noexcept -> void {
-                    tag(std::move(*rcvr), std::forward_like<Tuple>(args)...);
+                  [this](auto tag, const auto&... args) noexcept -> void {
+                    tag(std::move(*rcvr), args...);
                   },
                   tupl);
               },
-              QUAL(sh_state->result));
+              sh_state->result);
             
- where `QUAL` is `std::move` if - same_as<tag_of_t<Sndr>, - ensure-started-impl-tag> is `true`, and `as_const` - otherwise. - - 4.
-        void detach() noexcept override;
- - 1. *Effects:* Equivalent to sh_state->detach() if - same_as<tag_of_t<Sndr>, - ensure-started-impl-tag> is `true`; otherwise, - nothing. - -6. Let `shared-receiver` denote the following exposition-only class +6. Let `split-receiver` denote the following exposition-only class template:
     namespace std::execution {
       template<class Sndr>
-      struct shared-receiver {
+      struct split-receiver {
         using receiver_concept = receiver_t;
 
         template<class Tag, class... Args>
         void complete(Tag, Args&&... args) noexcept { // exposition only
+          using tuple_t = decayed-tuple<Tag, Args...>;
           try {
-            using tuple_t = decayed-tuple<Tag, Args...>;
             sh_state->result.template emplace<tuple_t>(Tag(), std::forward<Args>(args)...);
           } catch (...) {
             using tuple_t = tuple<set_error_t, exception_ptr>;
@@ -8568,7 +8448,6 @@ namespace std::execution {
 
         void start-op() noexcept;  // exposition only
         void notify() noexcept;  // exposition only
-        void detach() noexcept;  // exposition only
         void inc-ref() noexcept; // exposition only
         void dec-ref() noexcept; // exposition only
 
@@ -8577,7 +8456,7 @@ namespace std::execution {
         state-list-type waiting_states;    // exposition only
         atomic<bool> completed{false};   // exposition only
         atomic<size_t> ref_count{1};   // exposition only
-        connect_result_t<Sndr, shared-receiver<Sndr>> op_state;    // exposition only
+        connect_result_t<Sndr, split-receiver<Sndr>> op_state;    // exposition only
       };
     }
     
@@ -8600,7 +8479,7 @@ namespace std::execution { explicit shared-state(Sndr&& sndr); 1. *Effects:* Initializes `op_state` with the result of - connect(std::forward<Sndr>(sndr), shared-receiver{this}). + connect(std::forward<Sndr>(sndr), split-receiver{this}). 2. *Postcondition:* `waiting_states` is empty, and `completed` is `false`. @@ -8626,21 +8505,11 @@ namespace std::execution { dec-ref(). 6.
-          void detach() noexcept;
- - 1. *Effects:* If `completed` is `false` and `waiting_states` is empty, - calls `stop_src.request_stop()`. This has - the effect of requesting early termination of any asynchronous - operation that was started as a result of a call to `ensure_started`, - but only if the resulting sender was never connected and started. - - - 7.
           void inc-ref() noexcept;
1. *Effects:* Increments `ref_count`. - 8.
+      7. 
           void dec-ref() noexcept;
1. *Effects:* Decrements `ref_count`. If the new value of @@ -8650,39 +8519,34 @@ namespace std::execution { the `ref_count` to `0` then synchronizes with the call to dec-ref() that decrements `ref_count` to `0`. -8. For each type `split_t` and `ensure_started_t`, there is a different, - associated exposition-only implementation tag type, `split-impl-tag` - and `ensure-started-impl-tag`, respectively. Let - `shared-impl-tag` be the associated implementation tag type of - `shared-cpo`. Given an expression `sndr`, the expression - shared-cpo.transform_sender(sndr) is equivalent to: +8. Let `split-impl-tag` be an empty exposition-only class type. + Given an expression `sndr`, the expression + split.transform_sender(sndr) is equivalent to:
       auto&& [tag, _, child] = sndr;
       auto* sh_state = new shared-state{std::forward_like<decltype((sndr))>(child)};
-      return make-sender(shared-impl-tag(), shared-wrapper{sh_state, tag});
+      return make-sender(split-impl-tag(), shared-wrapper{sh_state, tag});
       
where `shared-wrapper` is an exposition-only class that manages the reference count of the `shared-state` object pointed to by `sh_state`. - `shared-wrapper` models `movable` with move operations nulling out the - moved-from object. If `tag` is `split_t`, `shared-wrapper` models - `copyable` with copy operations incrementing the reference count by calling - sh_state->inc-ref(). The constructor calls - sh_state->start-op() if `tag` is `ensure_started_t`. The - destructor has no effect if `sh_state` is null; otherwise, it calls - sh_state->detach() if `tag` is `ensure_started_t`; - and finally, it decrements the reference count by calling + `shared-wrapper` models `copyable` with move operations nulling out the + moved-from object, copy operations incrementing the reference count by calling + sh_state->inc-ref(), and assignment operations performing + a copy-and-swap operation. The + destructor has no effect if `sh_state` is null; otherwise, it + decrements the reference count by calling sh_state->dec-ref(). 9. The exposition-only class template `impls-for` - ([exec.snd.general]) is specialized for `shared-impl-tag` + ([exec.snd.general]) is specialized for `split-impl-tag` as follows:
         namespace std::execution {
           template<>
-          struct impls-for<shared-impl-tag> : default-impls {
+          struct impls-for<split-impl-tag> : default-impls {
             static constexpr auto get-state = see below;
             static constexpr auto start = see below;
           };
@@ -8690,7 +8554,7 @@ namespace std::execution {
         
1. The member - impls-for<shared-impl-tag>::get-state + impls-for<split-impl-tag>::get-state is initialized with a callable object equivalent to the following lambda expression: @@ -8701,7 +8565,7 @@ namespace std::execution {
2. The member - impls-for<shared-impl-tag>::start + impls-for<split-impl-tag>::start is initialized with a callable object that has a function call operator equivalent to the following: @@ -8723,37 +8587,19 @@ namespace std::execution { on-stop-request{state.sh_state->stop_src}); - 2. If `shared-impl-tag` is `ensure-started-impl-tag`, - and if `state.sh_state->stop_src.stop_requested()` is `true`, - calls `set_stopped(std::move(rcvr))` and returns. + 2. Then atomically does the following: - 3. Otherwise, atomically does the following: + - Reads the value `c` of `state.sh_state->completed`, and - - Inserts `addressof(state)` into `state.sh_state->waiting_states`, and + - Inserts `addressof(state)` into `state.sh_state->waiting_states` + if `c` is `false`. - - Reads the value of `state.sh_state->completed`. + 3. If `c` is `true`, calls state.notify() and returns. - 4. If the value read from `state.sh_state->completed` is `true`, - calls state.notify() and returns. - - 5. Otherwise, if `shared-impl-tag` is - `split-impl-tag`, and if `addressof(state)` is the first item added + 4. Otherwise, if `addressof(state)` is the first item added to `state.sh_state->waiting_states`, calls state.sh_state->start-op(). -10.
Under the following conditions, the results of the - child operation are discarded: - - - When a sender returned from `ensure_started` is destroyed without being - connected to a receiver, or - - - If the sender is connected to a receiver but the operation state - is destroyed without having been started, or - - - If polling the receiver's stop token indicates that stop has been - requested when `start` is called, and the operation has not yet - completed.
- #### `execution::when_all` [exec.when.all] #### {#spec-execution.senders.adaptor.when_all} 1. `when_all` and `when_all_with_variant` both adapt multiple input senders into @@ -9168,63 +9014,6 @@ namespace std::execution { ### Sender consumers [exec.consumers] ### {#spec-execution.senders.consumers} -#### `execution::start_detached` [exec.start.detached] #### {#spec-execution.senders.consumers.start_detached} - -1. `start_detached` eagerly starts a sender without the caller needing to manage - the lifetimes of any objects. - -2. The name `start_detached` denotes a customization point object. For a - subexpression `sndr`, let `Sndr` be `decltype((sndr))`. If - `sender_in` is `false`, `start_detached` is ill-formed. - Otherwise, the expression `start_detached(sndr)` is expression-equivalent to - the following except that `sndr` is evaluated only once: - -
-    apply_sender(get-domain-early(sndr), start_detached, sndr)
-    
- - * Mandates: same_as<decltype(e), void> is - `true` where e is the expression above. - - If the expression above does not eagerly start the sender `sndr` after - connecting it with a receiver that ignores value and stopped completion - operations and calls `terminate()` on error completions, the behavior of - calling `start_detached(sndr)` is undefined. - -3. Let `sndr` be a subexpression such that `Sndr` is `decltype((sndr))`, and let - `detached-receiver` and - `detached-operation` be the following exposition-only - class templates: - -
-    namespace std::execution {
-      template<class Sndr>
-      struct detached-receiver {
-        using receiver_concept = receiver_t;
-        detached-operation<Sndr>* op; // exposition only
-
-        void set_value() && noexcept { delete op; }
-        void set_error() && noexcept { terminate(); }
-        void set_stopped() && noexcept { delete op; }
-        empty_env get_env() const noexcept { return {}; }
-      };
-
-      template<class Sndr>
-      struct detached-operation {
-        connect_result_t<Sndr, detached-receiver<Sndr>> op; // exposition only
-
-        explicit detached-operation(Sndr&& sndr)
-          : op(connect(std::forward<Sndr>(sndr), detached-receiver<Sndr>{this}))
-        {}
-      };
-    }
-    
- -4. If sender_to<Sndr, detached-receiver<Sndr>> is `false`, the - expression `start_detached.apply_sender(sndr)` is ill-formed; otherwise, it is - expression-equivalent to start((new - detached-operation<Sndr>(sndr))->op). - #### `this_thread::sync_wait` [exec.sync.wait] #### {#spec-execution.senders.consumers.sync_wait} 1. `this_thread::sync_wait` and `this_thread::sync_wait_with_variant` are used @@ -9438,29 +9227,6 @@ namespace std::execution { 3. For a stopped completion, a disengaged `optional` object is returned. -## `execution::execute` [exec.execute] ## {#spec-execution.execute} - -1. `execute` executes a specified callable object on a specified scheduler. - -2. The name `execute` denotes a customization point object. For some - subexpressions `sch` and `f`, let `Sch` be `decltype((sch))` and `F` be the - decayed type of `f`. If `Sch` does not satisfy `scheduler` or `F` does not - satisfy `invocable`, `execute(sch, f)` is ill-formed. Otherwise, - `execute(sch, f)` is expression-equivalent to: - -
-    apply_sender(
-      query-or-default(get_domain, sch, default_domain()),
-      execute, schedule(sch), f)
-    
- - * Mandates: The type of the expression above is `void`. - -3. For some subexpressions `sndr` and `f` where `F` is the decayed type of `f`, - if `F` does not satisfy `invocable`, the expression - `execute.apply_sender(sndr, f)` is ill-formed; otherwise it is - expression-equivalent to `start_detached(then(sndr, f))`. - ## Sender/receiver utilities [exec.utils] ## {#spec-execution.snd_rec_utils} ### `execution::completion_signatures` [exec.utils.cmplsigs] ### {#spec-execution.snd_rec_utils.completion_sigs}