Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the Client's performance for non-stream bodies #2919

Merged
merged 17 commits into from
Jun 20, 2024

Conversation

kyri-petrou
Copy link
Collaborator

@kyri-petrou kyri-petrou commented Jun 19, 2024

/claim #2117

This PR reimplements the way that we "collect" the response body when the user uses asArray, asChunk, asText etc. (basically any method other than asStream).

When those methods are used, regardless of whether we use a ZStream or just a plain ArrayBuilder as the buffer, we will be materializing the entire body into memory. Therefore, we can avoid all the overheads of ZStream and simply buffer the entire response into the array builder which performs much better.

Since we don't have benchmarks for the Client, I don't have any results to show at the moment, but based on some local testing I'm seeing the following:

  • 20% improvement of using asChunk vs asStream.collectAll
  • 35-40% improvement over the baseline (series/2.x). This is due to some of the optimizations that benefit both streaming and non-streaming bodies

EDIT: Benchmark suite added and results are added below

Note:
The one thing I'm a bit sceptical about this implementation is whether the aggregation should be happening within UnsafeAsync.Aggregating instead of within AsyncBodyReader. I think semantically it would have been better to have this logic within UnsafeAsync.Aggregating, but unfortunately the performance is not as good as doing this within the AsyncBodyReader since we can't utilize a single ArrayBuilder to buffer the entire response, and so we need to create an intermediate array when connect is called.

@jdegoes Let me know what you think. Do we want to prioritise an implementation that performs better or one that it's cleaner/easier to reason with and follow

Benchmark results

TLDR:

  • ~5% improvement when body is collected asStream
  • ~45% improvement when body is collected asChunk / asArray / asString 🚀

Main branch (NOTE: performance of asChunk and asStream.runCollect is the same since asChunk uses a stream under the hood)

[info] Benchmark                                    (path)   Mode  Cnt     Score      Error  Units
[info] ClientBenchmark.zhttpChunkBenchmark           small  thrpt    3  9978.642 ± 1836.980  ops/s
[info] ClientBenchmark.zhttpChunkBenchmark           large  thrpt    3  9077.854 ± 1626.184  ops/s
[info] ClientBenchmark.zhttpStreamToChunkBenchmark   small  thrpt    3  9959.477 ±  593.264  ops/s
[info] ClientBenchmark.zhttpStreamToChunkBenchmark   large  thrpt    3  9135.447 ± 1355.958  ops/s

PR branch:

[info] Benchmark                                    (path)   Mode  Cnt      Score      Error  Units
[info] ClientBenchmark.zhttpChunkBenchmark           small  thrpt    3  14554.512 ± 2518.083  ops/s
[info] ClientBenchmark.zhttpChunkBenchmark           large  thrpt    3  13730.751 ± 1534.280  ops/s
[info] ClientBenchmark.zhttpStreamToChunkBenchmark   small  thrpt    3  10506.628 ±  327.098  ops/s
[info] ClientBenchmark.zhttpStreamToChunkBenchmark   large  thrpt    3   9642.722 ±  904.606  ops/s

@kyri-petrou
Copy link
Collaborator Author

@jdegoes, @987Nabil brought to my attention this issue #2117 after seeing my PR.

I think the issue is pretty outdated with a part of it already solved in a previous release. However this PR does improve the Client's performance quite a fair bit. Do you think it's applicable to claim it (or part of it) if I were to add benchmarks to showcase the improvement? It's totally okay if you think it's not related, just thought to ask :)

@jdegoes
Copy link
Member

jdegoes commented Jun 19, 2024

@kyri-petrou Yes, adding benchmarks would be great! Complexity can be ignored for 20-40% improvement in performance.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realised that these macros did not work while adding the benchmarks because the fromAbsoluteURI method is package-private.

I decided to just fix it here instead so that I could use them in the benchmarks

@kyri-petrou
Copy link
Collaborator Author

@jdegoes done! See the updated PR description for the results

@jdegoes jdegoes merged commit fd5eb22 into zio:main Jun 20, 2024
34 checks passed
@jdegoes
Copy link
Member

jdegoes commented Jun 20, 2024

❤️

@kyri-petrou kyri-petrou deleted the improve-client-non-streaming-performance branch June 20, 2024 14:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants