
fix: Do CC reaction before largest_acked #2117

Open

wants to merge 6 commits into main
Conversation

larseggert (Collaborator)
Packets are only declared lost relative to `largest_acked`. If we hit a PTO while we do not yet have a `largest_acked`, also perform a congestion control reaction (because otherwise none would happen).
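
Roughly, the change amounts to the following minimal sketch (hypothetical types and names; the Reno-style halving and the 1232-byte packet size are assumptions for illustration, not the actual neqo code):

```rust
use std::time::Instant;

/// Toy congestion controller: just enough state to show the reaction.
struct Cc {
    cwnd: usize,
    ssthresh: usize,
}

impl Cc {
    /// React to a congestion signal (Reno-style halving, with a two-packet floor).
    fn on_congestion_event(&mut self, _now: Instant) {
        self.ssthresh = self.cwnd / 2;
        self.cwnd = self.ssthresh.max(2 * 1232);
    }
}

struct LossRecovery {
    /// Largest packet number acknowledged by the peer, if any ACK has arrived yet.
    largest_acked: Option<u64>,
    cc: Cc,
}

impl LossRecovery {
    fn on_pto(&mut self, now: Instant) {
        if self.largest_acked.is_none() {
            // No ACK so far, so nothing can be declared lost relative to
            // `largest_acked`; trigger the congestion reaction here instead.
            self.cc.on_congestion_event(now);
        }
        // ... arm the next PTO, queue probe packets, etc.
    }
}

fn main() {
    let mut lr = LossRecovery {
        largest_acked: None,
        cc: Cc { cwnd: 10 * 1232, ssthresh: usize::MAX },
    };
    lr.on_pto(Instant::now());
    assert!(lr.cc.cwnd < 10 * 1232); // the PTO alone reduced the window
}
```

The point is only the `largest_acked.is_none()` branch: without it, a PTO that fires before any ACK would neither declare losses nor shrink the window.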

Broken out of #1998


codecov bot commented Sep 16, 2024

Codecov Report

Attention: Patch coverage is 98.14815% with 1 line in your changes missing coverage. Please review.

Project coverage is 95.38%. Comparing base (5fe0c89) to head (d1a0f83).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| neqo-transport/src/path.rs | 90.90% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2117   +/-   ##
=======================================
  Coverage   95.38%   95.38%           
=======================================
  Files         112      112           
  Lines       36593    36615   +22     
=======================================
+ Hits        34903    34924   +21     
- Misses       1690     1691    +1     



github-actions bot commented Sep 16, 2024

Failed Interop Tests

QUIC Interop Runner, client vs. server: neqo-latest as client, neqo-latest as server. (All results)

Succeeded Interop Tests

QUIC Interop Runner, client vs. server: neqo-latest as client, neqo-latest as server.

Unsupported Interop Tests

QUIC Interop Runner, client vs. server: neqo-latest as client, neqo-latest as server.


github-actions bot commented Sep 16, 2024

Benchmark results

Performance differences relative to 5fe0c89.

coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [99.163 ns 99.503 ns 99.896 ns]
       change: [-0.3454% +0.0430% +0.4756%] (p = 0.84 > 0.05)

Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) high mild
5 (5.00%) high severe

coalesce_acked_from_zero 3+1 entries: Change within noise threshold.
       time:   [117.97 ns 118.32 ns 118.70 ns]
       change: [+0.0240% +0.5735% +1.1118%] (p = 0.03 < 0.05)

Found 21 outliers among 100 measurements (21.00%)
2 (2.00%) low severe
1 (1.00%) low mild
5 (5.00%) high mild
13 (13.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [117.62 ns 118.06 ns 118.59 ns]
       change: [-2.4932% -0.4443% +0.9141%] (p = 0.72 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
4 (4.00%) low severe
2 (2.00%) low mild
7 (7.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [98.536 ns 98.680 ns 98.843 ns]
       change: [-19.862% -6.7169% +2.0329%] (p = 0.64 > 0.05)

Found 10 outliers among 100 measurements (10.00%)
5 (5.00%) high mild
5 (5.00%) high severe

RxStreamOrderer::inbound_frame(): Change within noise threshold.
       time:   [112.78 ms 112.83 ms 112.88 ms]
       change: [+0.7474% +0.8121% +0.8822%] (p = 0.00 < 0.05)

Found 15 outliers among 100 measurements (15.00%)
8 (8.00%) low mild
7 (7.00%) high mild

SentPackets::take_ranges: No change in performance detected.
       time:   [5.6737 µs 5.8581 µs 6.0504 µs]
       change: [-13.447% -1.7477% +6.7591%] (p = 0.83 > 0.05)

Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe

transfer/pacing-false/varying-seeds: No change in performance detected.
       time:   [26.591 ms 27.730 ms 28.900 ms]
       change: [-9.0860% -3.9509% +1.7664%] (p = 0.17 > 0.05)

Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild

transfer/pacing-true/varying-seeds: No change in performance detected.
       time:   [33.638 ms 35.457 ms 37.296 ms]
       change: [-4.2984% +2.9796% +9.9402%] (p = 0.41 > 0.05)
transfer/pacing-false/same-seed: No change in performance detected.
       time:   [25.394 ms 26.144 ms 26.908 ms]
       change: [-4.9375% -0.6823% +3.7173%] (p = 0.74 > 0.05)

Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild

transfer/pacing-true/same-seed: No change in performance detected.
       time:   [39.502 ms 41.113 ms 42.776 ms]
       change: [-8.6814% -2.8935% +3.5444%] (p = 0.37 > 0.05)
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: No change in performance detected.
       time:   [925.84 ms 936.72 ms 947.74 ms]
       thrpt:  [105.51 MiB/s 106.75 MiB/s 108.01 MiB/s]
change:
       time:   [-0.3743% +1.1445% +2.5735%] (p = 0.15 > 0.05)
       thrpt:  [-2.5090% -1.1316% +0.3757%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: No change in performance detected.
       time:   [318.75 ms 321.71 ms 324.71 ms]
       thrpt:  [30.797 Kelem/s 31.084 Kelem/s 31.372 Kelem/s]
change:
       time:   [-1.9371% -0.6178% +0.7130%] (p = 0.36 > 0.05)
       thrpt:  [-0.7079% +0.6216% +1.9754%]
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected.
       time:   [33.846 ms 34.024 ms 34.216 ms]
       thrpt:  [29.226  elem/s 29.391  elem/s 29.546  elem/s]
change:
       time:   [-0.1725% +0.7496% +1.6595%] (p = 0.11 > 0.05)
       thrpt:  [-1.6324% -0.7440% +0.1728%]

Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) high mild
2 (2.00%) high severe

1-conn/1-100mb-resp/mtu-1504 (aka. Upload)/client: Change within noise threshold.
       time:   [1.6835 s 1.7070 s 1.7314 s]
       thrpt:  [57.758 MiB/s 58.582 MiB/s 59.401 MiB/s]
change:
       time:   [+0.6548% +2.5513% +4.4730%] (p = 0.01 < 0.05)
       thrpt:  [-4.2815% -2.4878% -0.6505%]

Client/server transfer results

Transfer of 33554432 bytes over loopback.

| Client | Server | CC | Pacing | MTU | Mean [ms] | Min [ms] | Max [ms] |
| --- | --- | --- | --- | --- | --- | --- | --- |
| gquiche | gquiche | | | 1504 | 583.8 ± 72.5 | 503.2 | 717.1 |
| neqo | gquiche | reno | on | 1504 | 871.0 ± 141.5 | 752.5 | 1131.4 |
| neqo | gquiche | reno | | 1504 | 811.2 ± 153.9 | 721.2 | 1209.4 |
| neqo | gquiche | cubic | on | 1504 | 787.5 ± 76.0 | 744.0 | 1000.9 |
| neqo | gquiche | cubic | | 1504 | 797.3 ± 62.2 | 757.7 | 964.1 |
| msquic | msquic | | | 1504 | 167.0 ± 93.2 | 98.2 | 406.3 |
| neqo | msquic | reno | on | 1504 | 252.7 ± 70.2 | 211.9 | 430.9 |
| neqo | msquic | reno | | 1504 | 322.5 ± 97.7 | 214.7 | 432.4 |
| neqo | msquic | cubic | on | 1504 | 262.5 ± 61.7 | 205.0 | 390.0 |
| neqo | msquic | cubic | | 1504 | 261.8 ± 86.3 | 206.9 | 479.5 |
| gquiche | neqo | reno | on | 1504 | 738.5 ± 62.2 | 653.1 | 835.7 |
| gquiche | neqo | reno | | 1504 | 702.7 ± 108.1 | 561.9 | 884.8 |
| gquiche | neqo | cubic | on | 1504 | 733.5 ± 114.1 | 579.1 | 938.3 |
| gquiche | neqo | cubic | | 1504 | 702.1 ± 91.2 | 568.4 | 814.2 |
| msquic | neqo | reno | on | 1504 | 504.6 ± 68.8 | 438.5 | 690.7 |
| msquic | neqo | reno | | 1504 | 480.1 ± 10.0 | 463.3 | 492.1 |
| msquic | neqo | cubic | on | 1504 | 490.0 ± 30.3 | 467.3 | 557.7 |
| msquic | neqo | cubic | | 1504 | 502.8 ± 37.7 | 477.5 | 603.8 |
| neqo | neqo | reno | on | 1504 | 529.5 ± 94.1 | 448.4 | 721.7 |
| neqo | neqo | reno | | 1504 | 538.5 ± 60.6 | 470.1 | 676.3 |
| neqo | neqo | cubic | on | 1504 | 574.2 ± 61.4 | 504.9 | 723.8 |
| neqo | neqo | cubic | | 1504 | 543.4 ± 75.3 | 477.9 | 731.5 |

⬇️ Download logs


github-actions bot commented Sep 16, 2024

Firefox builds for this PR

The following builds are available for testing. Crossed-out builds did not succeed.

larseggert (Collaborator, Author)

@martinthomson I'd appreciate a review, since the code I am touching is pretty complex.

martinthomson (Member) left a comment


The structure looks fine, but this seems overly conservative to me.

@@ -337,6 +337,41 @@ impl<T: WindowAdjustment> CongestionControl for ClassicCongestionControl<T> {
congestion || persistent_congestion
}

/// Handle a congestion event.
Member


Is this just a move? It looks like it.

Collaborator Author


Yes, but from `ClassicCongestionControl` to `CongestionControl`.
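
A rough before/after sketch of what such a move looks like (illustrative names and signatures only, not the exact neqo API): the reaction used to be reachable only on the concrete type, whereas as a trait method it can be invoked by code that holds the controller behind the `CongestionControl` trait, e.g. from the PTO path.

```rust
trait CongestionControl {
    /// Handle a congestion event.
    fn on_congestion_event(&mut self) -> bool;
}

struct ClassicCongestionControl {
    cwnd: usize,
}

// Previously a method like this would live in `impl ClassicCongestionControl { ... }`;
// moving it into the trait impl makes it callable through the trait.
impl CongestionControl for ClassicCongestionControl {
    fn on_congestion_event(&mut self) -> bool {
        self.cwnd /= 2; // Reno-style reaction, for illustration only
        true
    }
}

fn main() {
    // The caller (e.g. the PTO handler) only sees the trait object.
    let mut cc: Box<dyn CongestionControl> =
        Box::new(ClassicCongestionControl { cwnd: 12_000 });
    assert!(cc.on_congestion_event());
}
```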

Comment on lines +854 to +856
// Packets are only declared as lost relative to `largest_acked`. If we hit a PTO while
// we don't have a largest_acked yet, also do a congestion control reaction (because
// otherwise none would happen).
Member


Are we sure that we want to slow down? If we haven't received an ACK, we should be at a low send rate. Do we really want to slow down further?

One thing that concerns me here is that a fast PTO (if we ever enable that) will hit this condition more often and that might not be good for performance. The same goes for long RTT connections, which might be OK while we are still retransmitting within the RTT, but once we get an RTT estimate that is long, we'll slow our send rate by a huge amount for the false PTOs we've hit.

Collaborator Author


I think the argument is that the combination of initial cwnd and initial RTT (= PTO) were made with some sort of safety criteria in mind. A PTO is an indication that either the RTT is much longer than the default assumption, or there is quite a bit of loss. In either of these two cases, we probably want to slow down?
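
(For reference, per RFC 9002: with no RTT sample yet, smoothed_rtt is the assumed initial RTT of 333 ms and rttvar is half that, so the first PTO is roughly smoothed_rtt + max(4 × rttvar, kGranularity) = 333 ms + 666 ms ≈ 1 s. Hitting that timer therefore means either the handshake round trip exceeded about a second or the packets in flight were lost, which is the "default assumption" referred to above.)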

Member


I guess my main concern is that if we want to be more aggressive about PTO, such as sending another Initial, we create a situation where slowing down is not the result of a misunderstanding of the RTT, but a deliberate choice to send more to start with.
