parallel-stream

— 2020-03-17

Last weekend I released parallel-stream, a data parallelism library for async-std. It is to streams what Rayon is to iterators. This is an implementation of a design I wrote about earlier.
The way parallel-stream works is that instead of calling into_stream to create a sequential stream, you call into_par_stream to create a "parallel" stream. Each item in the stream is then operated on in its own task, which enables multi-core processing of items backed by a thread pool.
For example:
```rust
use parallel_stream::prelude::*;

#[async_std::main]
async fn main() {
    // Create a vec of numbers to square.
    let v = vec![1, 2, 3, 4];

    // Convert the vec into a parallel stream and collect each item into a vector.
    let mut res: Vec<usize> = v
        .into_par_stream()
        .map(|n| async move { n * n })
        .collect()
        .await;

    // Items are stored as soon as they're ready, so we need to sort them.
    res.sort();
    assert_eq!(res, vec![1, 4, 9, 16]);
}
```
This model should make it easy to quickly process streams in parallel where previously it would've required significant effort to set up.
parallel-stream today only provides the basics: map, for_each, next, and collect. This was enough to prove it works, and should cover the 90% use case for most people.
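For instance, consuming a stream for its side effects would use for_each. Here's a minimal sketch, assuming for_each mirrors the map example above in taking an async closure and returning a future to await (check the docs for the exact signature):

```rust
use parallel_stream::prelude::*;

#[async_std::main]
async fn main() {
    // Each closure runs as its own task on the thread pool,
    // so output may arrive in any order.
    vec![1, 2, 3, 4]
        .into_par_stream()
        .for_each(|n| async move {
            println!("{}", n * n);
        })
        .await;
}
```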
Comparing to other libraries
parallel-stream does not inherently enable anything that couldn't be done before. Instead it makes existing capabilities much more accessible, which increases the likelihood they'll be used, which in turn is good for performance.
Rayon
Rayon is a data parallelism library built for synchronous Rust, powered by an underlying thread pool. async-std manages a thread pool as well, but the key difference with Rayon is that async-std (and futures) are optimized for latency, while Rayon is optimized for throughput.
As a rule of thumb: if you want to speed up doing heavy calculations you probably want to use Rayon. If you want to parallelize network requests consider using parallel-stream.
```rust
// Rayon (parallel): convert a vec into a parallel iterator.
// Map each item in the iterator, and collect into a vec.
let res: Vec<usize> = vec![1, 2, 3, 4]
    .into_par_iter()
    .map(|n| n * n)
    .map(|n| n + n)
    .collect();

// parallel-stream (parallel): convert a vec into a parallel stream.
// Map each item in the stream, and collect into a vec.
let res: Vec<usize> = vec![1, 2, 3, 4]
    .into_par_stream()
    .map(|n| async move { n * n })
    .map(|n| async move { n + n })
    .collect()
    .await;
```
Futures
futures provides several abstractions for parallelizing streams. These include:
stream::futures_unordered: a set of futures that can resolve in any order.
StreamExt::for_each_concurrent: a concurrent version of StreamExt::for_each.
StreamExt::buffer_unordered: buffer up to n futures from a stream simultaneously.
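To get a feel for these, here's a minimal sketch of for_each_concurrent: it processes up to two items at a time, but concurrently on the current task rather than in parallel across cores:

```rust
use futures::stream::{self, StreamExt};

async fn squares() {
    // Poll at most 2 of the closures' futures at a time. No tasks
    // are spawned, so all work happens on the current task.
    stream::iter(vec![1, 2, 3, 4])
        .for_each_concurrent(2, |n| async move {
            println!("{}", n * n);
        })
        .await;
}
```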
These methods provide the ability to process multiple items from a stream at the same time. But by themselves they don't schedule work on multiple cores: they need to be combined with task::spawn in order to do that.
```rust
// futures-rs (parallel): convert a vec into a stream. Then map the
// stream to a stream of futures (tasks). Then process all of them
// in parallel, and collect into a vec.
let res: Vec<usize> = stream::iter(vec![1, 2, 3, 4])
    .map(|n| task::spawn(async move { n * n }))
    .map(|fut| fut.map(|n| n + n))
    .buffer_unordered(usize::MAX)
    .collect()
    .await;

// parallel-stream (parallel): convert a vec into a parallel stream.
// Map each item in the stream, and collect into a vec.
let res: Vec<usize> = vec![1, 2, 3, 4]
    .into_par_stream()
    .map(|n| async move { n * n })
    .map(|n| async move { n + n })
    .collect()
    .await;
```
As you can see both approaches are about the same length, yet they are quite different. The core difference lies in how tasks are spawned: futures-rs has a two-step process of converting values into tasks, taking their handles, and then awaiting those handles in parallel.
parallel-stream, by contrast, only needs to be initialized once; every subsequent operation on the stream can be assumed to be parallel. This is especially noticeable when chaining multiple stream adapters.
Configuring parallelism limits
Currently parallel-stream assumes all tasks in the stream will be spawned in parallel, and doesn't impose any limit on how many are running at the same time. However in practice systems have very real limits, and being able to configure maximum concurrency is useful.
To that end we've introduced the limit method as part of the trait, which governs concurrency. The idea is that setting a maximum concurrency is a matter of calling limit on the stream chain:
```rust
let res: Vec<usize> = vec![1, 2, 3, 4]
    .into_par_stream()
    .limit(2) // Process up to 2 items at the same time.
    .map(|n| async move { n * n })
    .collect()
    .await;
```
This roughly corresponds to the limit param in StreamExt::for_each_concurrent and the n parameter in StreamExt::buffer_unordered.

Configuring ordering

As its name implies, buffer_unordered has an ordered counterpart: buffered. Instead of returning items as soon as they're ready, it returns items in order (but still processes them in parallel).
As you've seen in the examples so far, we've been calling Vec::sort after having finished collecting items. This generally only works for finite streams whose items all fit in memory. If a stream is infinite, or its full contents can't fit in memory, this doesn't work. Instead it'd be better if items could be sorted on the spot, which would make much more efficient use of memory.
parallel-stream doesn't support this yet. It's somewhat involved to implement, and would require resolving issue #4 first. But the API would likely end up looking something like this:
```rust
let res: Vec<usize> = vec![1, 2, 3, 4]
    .into_par_stream()
    .order(true) // Return items from the stream in order.
    .map(|n| async move { n * n })
    .collect()
    .await;
```
Another reason ordering is not enabled by default is that it uses more memory in cases where ordering is not important. And since items are returned in order, it increases overall latency, which is often not what you want when using async/await.
Together with the limit method, this functionality would provide fine-grained control over how much work is done at the same time, and whether to trade ordering for latency and memory use.
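Combined, the two knobs might read like this. To be clear, this is a hypothetical sketch: limit exists today, but order does not (see issue #4):

```rust
let res: Vec<usize> = vec![1, 2, 3, 4]
    .into_par_stream()
    .limit(2)     // At most 2 items in flight at a time.
    .order(true)  // Hypothetical: yield results in input order.
    .map(|n| async move { n * n })
    .collect()
    .await;
```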
Future Directions
As I mentioned in the "language support" section of my "streams concurrency" post, it'd be fantastic if the Rust language itself could eventually provide support for parallel iteration. For example, writing a parallel async TCP listener shouldn't take more lines than writing a synchronous sequential one:
```rust
let mut listener = TcpListener::bind("127.0.0.1:8080").await?;
println!("Listening on {}", listener.local_addr()?);

for par stream.await? in listener.incoming() {
    println!("Accepting from: {}", stream.peer_addr()?);
    io::copy(&stream, &stream).await?;
}
```
But that's quite a bit further out. Though now that parallel-stream exists we can probably start prototyping what this would look like through the use of proc macros. After all, language support for async/await was also first implemented as a proc macro. There are some limitations around what proc macros can currently do, but I believe the following should be feasible:
```rust
let mut listener = TcpListener::bind("127.0.0.1:8080").await?;
println!("Listening on {}", listener.local_addr()?);

#[for_par]
{
    stream.await? in listener.incoming() {
        println!("Accepting from: {}", stream.peer_addr()?);
        io::copy(&stream, &stream).await?;
    }
}
```
In addition to that there are quite a few more APIs to add. parallel-stream currently only covers the basics, but we've set up a comprehensive list of all methods to add in #2. Contributions would be most welcome!
Conclusion
In this post we've introduced parallel-stream, a data parallelism library for async-std. It provides a familiar interface powering non-blocking parallel processing of work.
Work on this library started a few months after async-std was kicked off. However I had some trouble prototyping it, and it wasn't until last week that I figured out how to simplify internals enough to get it to work well.
Personally I'm happy that we've proven this can work, but I probably don't have much time to maintain it going forward. If someone would like to help maintain this and move it forward, please do get in touch!
For me it's probably time to work on Tide again, and finish the migration to http-types. Probably more on that in a later post. That's all for now. Have a good week, and stay safe!