
h2: Make requests walk away from the waiting list #3835

Closed
wants to merge 5 commits

Conversation

dridi
Member

@dridi dridi commented Aug 12, 2022

When a session goes to the cleanup step, it will indefinitely try to
kill all the streams until those stuck in a waiting list reembark a
thread. This can lead to clients timing out, closing the h2 session,
and opening a new h2 session for the exact same requests. This can
artificially inflate the number of dead tasks in a waiting list with
generous backend timeouts.

To avoid that, an h2 stream will keep a list of requests currently
in a waiting list and make them walk away quietly during cleanup.
An h2 stream would normally only have one request running at any
time but this may no longer be true with modules like vmod_pesi.


See https://varnish-cache.org/lists/pipermail/varnish-misc/2022-July/027160.html
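The tracking described above can be sketched as a small tail-queue walk. All names here (`h2_stream_walkaway`, `waiting_reqs`, `on_waitinglist`) are illustrative rather than the actual patch, and plain `sys/queue.h` TAILQ macros stand in for Varnish's VTAILQ wrappers:

```c
/* Minimal sketch of the cleanup-time walkaway described above.
 * All names are illustrative; Varnish uses VTAILQ_* wrappers
 * around the same BSD queue macros. */
#include <stddef.h>
#include <sys/queue.h>

struct req {
	TAILQ_ENTRY(req) t_list;	/* membership in the tracking list */
	int on_waitinglist;		/* 1 while parked on a waiting list */
};

struct h2_stream {
	TAILQ_HEAD(, req) waiting_reqs;	/* requests parked on waiting lists */
};

/* During session cleanup, make every parked request leave quietly
 * instead of waiting for a rush to reembark a worker thread. */
static int
h2_stream_walkaway(struct h2_stream *s)
{
	struct req *r;
	int n = 0;

	while ((r = TAILQ_FIRST(&s->waiting_reqs)) != NULL) {
		TAILQ_REMOVE(&s->waiting_reqs, r, t_list);
		r->on_waitinglist = 0;	/* the request walks away */
		n++;
	}
	return (n);
}
```

An h2 stream would usually contribute a single entry to this list, but the list form accommodates multiple requests per stream, as with vmod_pesi.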

This is equivalent to what we have in Varnish Enterprise that I
mentioned in this thread. It does not solve the problem described
in the email thread, because we have no path to walk away from
an HTTP/1 request (pun intended).

I believe we could potentially solve the HTTP/1 case if we involved
a waiter that would only care about closed connections, ignoring
incoming data that I believe triggers a wake-up today.

I am convinced I came up with a better design than the original
attempt, and I'm looking forward to reviews. Unfortunately, I will
miss the next two bugwash meetings.

@nigoroll
Member

nigoroll commented Aug 29, 2022

generally in favor, have not looked closer yet, still :(

bin/varnishd/cache/cache_hash.c
AZ(req->stale_oc);
VSLb_ts_req(req, "Waitinglist", W_TIM_real(wrk));
VSLb(req->vsl, SLT_Error, "The client is going away");
if (req->esi_level > 0) {
Contributor

Why should we only go to VCLFAIL for subrequests? This implies that HSH_WalkAway will only ever be called as part of a broken connection, where we have to close. Could it not also be useful for a possible future waiting list timeout feature, where we would need to produce a 503?

Member Author

First, we can add a dedicated HSH_TIMEOUT. Otherwise, I could use the same argument about the error message preventing us from using this branch for future timeouts. I think we will be fine if we dedicate HSH_WALKAWAY to the case where the client went away while a request was in a waiting list.

For the timeout case, we probably want to have different semantics than plainly ending the transaction. Maybe a waiting list timeout would be a deferred hash_ignore_busy?

I must admit I don't remember precisely why we did that downstream. I think we needed this to interrupt ESI delivery and prevent it from recursing as the client is gone anyway.

Member Author

There's actually a comment a few lines below explaining why we fail sub-requests:

/* NB: Interrupt delivery upwards to avoid engaging
 * new sub-request tasks.
 */

Granted, it doesn't give many details.

@@ -466,11 +466,13 @@ struct req {
void *transport_priv;

VTAILQ_ENTRY(req) w_list;
VTAILQ_ENTRY(req) t_list;
Contributor

This feels like quite a waste of memory, adding 24 bytes for the list entry and the objhead reference flag on struct req. I think I prefer the original way of doing the workspace allocation, perhaps replaced or expanded with a fallback malloc.
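The 24-byte figure checks out on an LP64 platform: a tail-queue entry is two pointers and the objhead reference is one more. A stand-in struct (the real struct req has many other fields) makes the arithmetic concrete:

```c
/* Stand-in for the two fields the review discusses: on LP64 a
 * TAILQ_ENTRY is two pointers (16 bytes) and the objhead pointer
 * adds 8 more, hence the 24 bytes mentioned above. */
#include <sys/queue.h>

struct objhead;		/* opaque, as in the cache code */

struct req_overhead {
	TAILQ_ENTRY(req_overhead) t_list;	/* next + prev pointers */
	struct objhead *transport_objhead;	/* reference to the objhead */
};
```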

Member Author

I had considered this, and decided the current approach would be simpler, especially with the generalization of the feature to include HTTP/1 too.

I can ponder something similar to what we did downstream but more general (not tied to h2 code).

Member Author

I just remembered why I didn't use the workspace: it is reserved for vary processing.

Member Author

I got some results on the generalization side, but didn't manage to do it without allocating space somewhere. Since the workspace is reserved during waiting list operations, it is either a heap allocation or preallocated storage in struct req (the current solution). Another possibility could be to opportunistically allocate an "entry" for transport tracking on the workspace before the vary reservation, if the top transport has waiting list methods.

To be continued.

Member Author

@dridi dridi Sep 19, 2022

Considering that eventually both h2 and http/1 (100% of client top transports) will track requests and sub-requests in waiting lists, it would be much simpler to go with the current design and have the tracking state directly in struct req.


Lck_Lock(&r2->h2sess->sess->mtx);
AZ(req->transport_objhead);
req->transport_objhead = oh;
Contributor

The variable req->transport_objhead should be renamed req->toptransport_objhead to reflect that it is "owned" by the top transport. As it is, it would cause trouble if e.g. esi_deliver wanted to use it. The same argument goes for t_list.

Member Author

@dridi dridi Sep 14, 2022

Or, maybe we could move req::t_list and req::transport_objhead to reqtop::w_list and reqtop::hash_objhead?

edit: obviously, not.

@dridi
Member Author

dridi commented Sep 15, 2022

Rebased against current master to solve the trivial merge conflict. I switched to more explicit transport::top_*() methods as suggested. Not addressing other review items yet.

It comes with new functions hsh_waitlist() and hsh_reembark() that
centralize the logic. The top_waitlist() and top_reembark() methods
are meant for client-facing transport to register when requests and
sub-requests alike enter and leave a waiting list.
Since both vtr_reembark_f and vtr_waitlist_f have the same signature, we
can use vtr_waitlist_f to convey that those callbacks relate to waiting
lists, and the name of the methods to convey their roles.

When there's no rush, a request can simply walk away from the queue.

The concept is simple: when a factor (pun intended) other than the
backend fetch requires the request to leave the queue, it can just
walk away. It short-circuits the usual rush exit.

We can repurpose the dead counter busy_killed to count those events,
and get rid of the dead enum entry HSH_CONTINUE.

But operate on a different objhead reference guarded by the h2_sess lock
instead. This will allow coordination with the h2 session thread.
Better diff with the --ignore-all-space option.
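The shared-signature idea from the commit messages above can be sketched as a C method table. The layout below is illustrative, not varnishd's actual struct transport, and the toy h2 implementation exists only to show the wiring:

```c
/* Sketch: one callback signature (vtr_waitlist_f) for both methods;
 * the field names, not the types, convey the roles. Layout is
 * illustrative, not the actual varnishd struct transport. */
#include <stddef.h>

struct req;
typedef void vtr_waitlist_f(struct req *);

struct transport {
	const char *name;
	vtr_waitlist_f *top_waitlist;	/* request enters a waiting list */
	vtr_waitlist_f *top_reembark;	/* request leaves a waiting list */
};

/* Toy h2 implementation counting events, for illustration only. */
static int h2_entered, h2_left;

static void h2_top_waitlist(struct req *req) { (void)req; h2_entered++; }
static void h2_top_reembark(struct req *req) { (void)req; h2_left++; }

static const struct transport h2_transport = {
	"H2", h2_top_waitlist, h2_top_reembark,
};
```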
@dridi
Member Author

dridi commented Sep 28, 2023

Closing for now because #3992 will pull the objhead rug pretty hard.

@dridi dridi closed this Sep 28, 2023
@dridi dridi deleted the h2_walkaway branch September 28, 2023 06:59
dridi added a commit to dridi/varnish-cache that referenced this pull request Oct 2, 2023
The objcore is officially referenced by a request when it leaves its
waiting list during a rush. This allows migrating a waiting list from
one objcore to the next without the need to touch reference counters.

The hash_oc field will still have to be updated. This field must be
kept to navigate from a request to its waiting list to implement
walking away from a waiting list.

Refs varnishcache#3835
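The reference rule in that commit message can be sketched as follows: the waiting list itself holds no per-request objcore references, so migrating it between objcores only rewrites hash_oc, and a request takes its reference only when it leaves during a rush. All names here are illustrative:

```c
/* Sketch of the objcore reference handoff described above. */
#include <stddef.h>
#include <sys/queue.h>

struct objcore;

struct req {
	TAILQ_ENTRY(req) w_list;
	struct objcore *hash_oc;	/* which waiting list we are on */
};

struct objcore {
	int refcnt;
	TAILQ_HEAD(, req) waitinglist;
};

/* Move every waiter from one objcore to the next without touching
 * reference counters; only hash_oc is rewritten. */
static void
wl_migrate(struct objcore *from, struct objcore *to)
{
	struct req *r;

	while ((r = TAILQ_FIRST(&from->waitinglist)) != NULL) {
		TAILQ_REMOVE(&from->waitinglist, r, w_list);
		r->hash_oc = to;
		TAILQ_INSERT_TAIL(&to->waitinglist, r, w_list);
	}
}

/* A request takes its objcore reference only when it leaves. */
static struct objcore *
wl_leave(struct req *r)
{
	struct objcore *oc = r->hash_oc;

	TAILQ_REMOVE(&oc->waitinglist, r, w_list);
	r->hash_oc = NULL;
	oc->refcnt++;	/* the departing request now holds a reference */
	return (oc);
}
```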
dridi added a commit to dridi/varnish-cache that referenced this pull request Oct 3, 2023
dridi added a commit to dridi/varnish-cache that referenced this pull request Oct 13, 2023
Once a client is reportedly gone, processing its VCL task(s) is just a
waste of resources. The execution of client-facing VCL is intercepted
and an artificial return(fail) is returned in that scenario.

Thanks to the introduction of the universal return(fail) proper error
handling and resource tear down is already in place, which makes this
change safe modulo unknown bugs. This adds a circuit breaker anywhere
in the client state machine where there is VCL execution.

A new Reset time stamp is logged to convey when a task does not complete
because the client is gone. This is a good complement to the walk away
feature and its original circuit breaker for the waiting list, but this
has not been integrated yet.

While the request is technically failed, it won't increase the vcl_fail
counter, and a new req_reset counter is incremented.

Refs varnishcache#3835
Refs 61a15cb
Refs e5efc2c
Refs ba54dc9
Refs 6f50a00
Refs b881699
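The interception described in that commit message can be sketched as a guard in front of VCL method calls. Everything below (names, counter fields, the sample method) is illustrative rather than the actual patch:

```c
/* Sketch of the circuit breaker: before running client-facing VCL,
 * short-circuit to an artificial return(fail) when the client is
 * reported gone. All names are illustrative. */
enum vcl_ret { VCL_RET_OK, VCL_RET_FAIL };

struct req_counters {
	int client_gone;	/* connection reported closed */
	int vcl_fail;		/* genuine VCL failures */
	int req_reset;		/* tasks cut short because the client left */
};

static enum vcl_ret
vcl_call_method(struct req_counters *rc, enum vcl_ret (*method)(void))
{
	enum vcl_ret r;

	if (rc->client_gone) {
		rc->req_reset++;	/* counted separately from vcl_fail */
		return (VCL_RET_FAIL);	/* artificial return(fail) */
	}
	r = method();
	if (r == VCL_RET_FAIL)
		rc->vcl_fail++;
	return (r);
}

/* Sample VCL method for illustration. */
static enum vcl_ret
vcl_recv_ok(void)
{
	return (VCL_RET_OK);
}
```

The key property matches the commit message: a task reset for a gone client fails the request without inflating the vcl_fail counter.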
dridi added a commit to dridi/varnish-cache that referenced this pull request Oct 17, 2023
Once a client is reportedly gone, processing its VCL task(s) is just a
waste of resources. The execution of client-facing VCL is intercepted
and an artificial return(fail) is returned in that scenario.

Thanks to the introduction of the universal return(fail) proper error
handling and resource tear down is already in place, which makes this
change safe modulo unknown bugs. This adds a circuit breaker anywhere
in the client state machine where there is VCL execution.

A new Reset time stamp is logged to convey when a task does not complete
because the client is gone. This is a good complement to the walk away
feature and its original circuit breaker for the waiting list, but this
has not been integrated yet.

While the request is technically failed, it won't increase the vcl_fail
counter, and a new req_reset counter is incremented. This new behavior
is guarded by a new vcl_req_reset feature flag, enabled by default.

Refs varnishcache#3835
Refs 61a15cb
Refs e5efc2c
Refs ba54dc9
Refs 6f50a00
Refs b881699
dridi added a commit that referenced this pull request Oct 18, 2023
dridi added a commit to dridi/varnish-cache that referenced this pull request Oct 18, 2023
dridi added a commit that referenced this pull request Oct 18, 2023
dridi added a commit to dridi/varnish-cache that referenced this pull request Oct 18, 2023
dridi added a commit to dridi/varnish-cache that referenced this pull request Oct 18, 2023
dridi added a commit that referenced this pull request Oct 24, 2023