Speed up `has_path` by avoiding `popfirst!` and `push!` #406

thchr · 2024-10-31T15:06:20Z

This speeds up has_path considerably by allocating the next buffer up front, avoiding the need to push! and popfirst! (and hence resize / reallocate) during the search.

Examples below (has_path = old implementation; new_has_path = implementation in this PR):

julia> using Graphs, Chairmarks

julia> g = path_graph(4)
julia> @b has_path($g, 1, 3) # true
96.970 ns (4 allocs: 272 bytes)
julia> @b new_has_path($g, 1, 3) # true
68.720 ns (3 allocs: 224 bytes)

julia> g = path_graph(10)
julia> @b has_path($g, 1, 10) # true
162.154 ns (4 allocs: 272 bytes)
julia> @b new_has_path($g, 1, 10) # true
94.956 ns (3 allocs: 272 bytes)

julia> p = complete_multipartite_graph(Int8[5,5,5])
julia> @b has_path($p, 1, 15) # true
139.452 ns (5 allocs: 288 bytes)
julia> @b new_has_path($p, 1, 15) # true
72.054 ns (3 allocs: 192 bytes)

julia> pp = SimpleGraph(30, vcat([p.fadjlist, [v .+ Int8(15) for v in p.fadjlist]]...))
julia> @b has_path($pp, 1, 16) # false
322.962 ns (5 allocs: 304 bytes)
julia> @b new_has_path($pp, 1, 16) # false
183.883 ns (3 allocs: 224 bytes)

It can occasionally be slower if the graph is very large and the vertices are very "close", e.g.:

julia> g = path_graph(1000)

julia> @b has_path($g, 1, 1000)
11.267 μs (4 allocs: 1.266 KiB)
julia> @b new_has_path($g, 1, 1000)
5.283 μs (3 allocs: 9.062 KiB)

julia> @b has_path($g, 1, 3)
148.652 ns (4 allocs: 1.266 KiB)
julia> @b new_has_path($g, 1, 3)
251.704 ns (3 allocs: 9.062 KiB)

For the above example, the new implementation is faster for all vertices separated by more than 14 steps.

Overall, I think this should be a very worthwhile trade-off for most applications.

codecov · 2024-10-31T15:17:51Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.30%. Comparing base (24539fd) to head (6954605).

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #406   +/-   ##
=======================================
  Coverage   97.30%   97.30%           
=======================================
  Files         117      117           
  Lines        6948     6951    +3     
=======================================
+ Hits         6761     6764    +3     
  Misses        187      187

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚨 Try these New Features:

Flaky Tests Detection - Detect and resolve failed and flaky tests

simonschoelly · 2024-11-01T02:23:40Z

May I ask what Julia version you used for these benchmarks? In the past this kind of optimization would often lead to significant speedups but from Julia v1.11 on, arrays are implemented differently: https://github.com/JuliaLang/julia/blob/v1.11.0/NEWS.md#new-language-features

I also think that perhaps the graphs you choose for benchmarking are kind of very small - for this kind of function we can also work with graphs with millions of vertices.
And so far I saw that you benchmarked with either paths or complete multipartite graphs - both are kind of pathological cases where either the search beam is very small or very wide - perhaps there is a more natural version somewhere in the middle?

thchr · 2024-11-01T11:20:50Z

This was on 1.10.5, so indeed before the Memory changes, good point.

The original version improves on 1.11 - but so does this revision (but the gap probably narrows a bit overall):

julia> g = path_graph(4)
julia> @b has_path($g, 1, 3) # true
50.041 ns (5 allocs: 224 bytes)
julia> @b new_has_path($g, 1, 3) # true
42.559 ns (5 allocs: 192 bytes)

julia> g = path_graph(10)
julia> @b has_path($g, 1, 10) # true
94.484 ns (5 allocs: 240 bytes)
julia> @b new_has_path($g, 1, 10) # true
77.722 ns (5 allocs: 256 bytes)

julia> g = path_graph(10)
julia> @b has_path($g, 1, 10) # true
100.112 ns (5 allocs: 240 bytes)
julia> @b new_has_path($g, 1, 10) # true
79.012 ns (5 allocs: 256 bytes)

julia> pp = SimpleGraph(30, vcat([p.fadjlist, [v .+ (15) for v in p.fadjlist]]...))
julia> @b has_path($pp, 1, 16) # false
203.519 ns (6 allocs: 624 bytes)
julia> @b new_has_path($pp, 1, 16) # false
148.489 ns (5 allocs: 432 bytes)

julia> g = path_graph(1000)
julia> @b has_path($g, 1, 1000)
5.148 μs (5 allocs: 1.250 KiB)
julia> @b new_has_path($g, 1, 1000)
5.498 μs (6 allocs: 9.000 KiB)

julia> @b has_path($g, 1, 3)
101.500 ns (5 allocs: 1.250 KiB)
julia> @b new_has_path($g, 1, 3)
202.632 ns (6 allocs: 9.000 KiB)

I also think that perhaps the graphs you choose for benchmarking are kind of very small - for this kind of function we can also work with graphs with millions of vertices.

Sure, that is an undeniable tradeoff. To me though, we are already instantiating an nv(g)-sized vector in seen, so this change is in line with that. Still, it's a tradeoff, no doubt.
(I don't feel strongly about this PR either way; I ended up factoring out some bits myself to reuse the buffers for my use-case, which is much faster)

And so far I saw that you benchmarked with either paths or complete multipartite graphs - both are kind of pathological cases where either the search beam is very small or very wide - perhaps there is a more natural version somewhere in the middle?

Also fair; I just don't know that many canonical/conventional graph kinds, so I just picked a few of the ones provided by Graphs.jl. Open to suggestions.

simonschoelly · 2024-11-20T13:21:59Z

@thchr Can you rebase your PR on the lastest master? I think the nightly tests might succeed then.

gdalle · 2024-11-21T11:14:10Z

Thanks for this suggestion, but I would rather weigh in against greedy allocation of a huge vector that we may never need. I think @simonschoelly is right and relying on Julia's compiler to get better is the right call

gdalle · 2024-11-21T11:16:01Z

We could use an actual queue from DataStructures.jl instead of a vector though

thchr · 2024-11-21T12:10:59Z

Thanks for this suggestion, but I would rather weigh in against greedy allocation of a huge vector that we may never need. I think @simonschoelly is right and relying on Julia's compiler to get better is the right call

Okay, no worries; feel free to close.

We could use an actual queue from DataStructures.jl instead of a vector though

I think that is likely to be a good bit slower for small graphs, probably big ones too; the Queue type is quite complicated and allocates several arrays. For my use-case, this would almost certainly be substantially worse.

speed up has_path by avoiding popfirst! and push!

6954605

thchr force-pushed the has_path branch from 5059bc5 to 6954605 Compare November 21, 2024 08:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speed up `has_path` by avoiding `popfirst!` and `push!` #406

Speed up `has_path` by avoiding `popfirst!` and `push!` #406

thchr commented Oct 31, 2024 •

edited

Loading

codecov bot commented Oct 31, 2024 •

edited

Loading

simonschoelly commented Nov 1, 2024

thchr commented Nov 1, 2024

simonschoelly commented Nov 20, 2024

gdalle commented Nov 21, 2024

gdalle commented Nov 21, 2024

thchr commented Nov 21, 2024

Speed up has_path by avoiding popfirst! and push! #406

Are you sure you want to change the base?

Speed up has_path by avoiding popfirst! and push! #406

Conversation

thchr commented Oct 31, 2024 • edited Loading

codecov bot commented Oct 31, 2024 • edited Loading

Codecov Report

simonschoelly commented Nov 1, 2024

thchr commented Nov 1, 2024

simonschoelly commented Nov 20, 2024

gdalle commented Nov 21, 2024

gdalle commented Nov 21, 2024

thchr commented Nov 21, 2024

Speed up `has_path` by avoiding `popfirst!` and `push!` #406

Speed up `has_path` by avoiding `popfirst!` and `push!` #406

thchr commented Oct 31, 2024 •

edited

Loading

codecov bot commented Oct 31, 2024 •

edited

Loading