
Sparse matrix vector product benchmark (in comparison with spark and dask) #68

Open · wants to merge 1 commit into master

Conversation

pcmoritz (Collaborator) commented May 24, 2016

The numbers on the 3Mx3M test matrix from https://snap.stanford.edu/data/com-Orkut.html look like this:

scipy.sparse single threaded: 1.2s
halo (1 node, 4 workers): 600ms
halo (2 nodes, 4 workers each): 430ms
dask (4 workers): 1.0s
dask.distributed (1 node, 4 workers): 11s
dask.distributed (2 nodes, 4 workers each): 8.1s

Distributed Dask presumably does not perform well because it lacks an object store in which the sparse matrix blocks can be kept. The single-node version of Dask does not need to perform serialization, but it is limited by the Python GIL.
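
For reference, the single-threaded baseline is roughly of this shape (a minimal sketch; the file name, loading code, and timing details may differ from the actual benchmark script):

```python
import time
import numpy as np
import scipy.sparse as sp

# Sketch of the single-threaded baseline: build the com-Orkut adjacency
# matrix as CSR and time one sparse matrix-vector product.
# File name and loading details are assumptions.
edges = np.loadtxt("com-orkut.ungraph.txt", dtype=np.int64)  # one "src dst" pair per line
n = int(edges.max()) + 1
A = sp.csr_matrix((np.ones(len(edges)), (edges[:, 0], edges[:, 1])), shape=(n, n))

x = np.random.rand(n)
start = time.time()
y = A.dot(x)  # single-threaded sparse matrix-vector product
print("scipy.sparse matvec: {:.2f}s".format(time.time() - start))
```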

For PySpark, the full matrix gave a serialization error; using a 2Mx2M matrix gives:

scipy.sparse single threaded: 0.76s
spark (1 node, 4 workers): 1.41s
spark (2 nodes, 4 workers each): 1.56s

Before this is merged, we should check with the author of Dask whether there is a more efficient way to implement these operations.
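
For context, a straightforward way to express the row-blocked product with dask.delayed looks roughly like this (an illustrative sketch assuming a list of CSR row blocks; not necessarily the code behind the numbers above):

```python
import numpy as np
from dask import compute, delayed

# Illustrative only: a row-blocked sparse matrix-vector product expressed
# with dask.delayed; 'blocks' is assumed to be a list of CSR row blocks.
@delayed
def block_matvec(block, x):
    return block.dot(x)

def dask_spmv(blocks, x):
    parts = [block_matvec(blk, x) for blk in blocks]
    return np.concatenate(compute(*parts))
```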

ludwigschmidt commented May 24, 2016

Nice numbers! I assume each node has at least four hardware threads?

Is it clear why the speed-up is roughly 3x in total instead of 8x (total number of cores)?

And I guess halo is the new name for Orchestra? :-)

cathywu commented May 25, 2016

Also curious. :)

pcmoritz (Collaborator, Author)

Hey Cathy + Ludwig,

Glad to hear from you! Scaling up sparse linear algebra on non-MPI systems is challenging because each task is typically very small (in this case, on the order of a few milliseconds).
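
To make the granularity concrete, here is a hypothetical sketch of the row-block decomposition (helper name, stand-in matrix, and block count are made up): each block-level product takes only a few milliseconds, so scheduler and data-movement overhead dominates unless both are very cheap.

```python
import time
import numpy as np
import scipy.sparse as sp

def split_rows(A, num_blocks):
    """Partition a CSR matrix into contiguous horizontal row blocks."""
    bounds = np.linspace(0, A.shape[0], num_blocks + 1, dtype=int)
    return [A[lo:hi, :] for lo, hi in zip(bounds[:-1], bounds[1:])]

# Stand-in random matrix (much smaller than com-Orkut) just to show the idea.
A = sp.random(300000, 300000, density=1e-5, format="csr")
x = np.random.rand(A.shape[1])
blocks = split_rows(A, 16)

t0 = time.time()
_ = blocks[0].dot(x)  # one task's worth of work
print("one block-level task: {:.1f} ms".format(1000 * (time.time() - t0)))
```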

This was the first experiment where we got a speedup for sparse linear algebra on multiple nodes using Halo. Now that we understand better where the bottlenecks are (mainly the synchronous gRPC calls to the scheduler), we are going to address them in the next development iteration.

Best,
Philipp.

pcmoritz added a commit to pcmoritz/ray that referenced this pull request Dec 18, 2017