LLP: mmap labels instead of loading them in memory
LLP on SWH's graph now needs more than 3TB, which makes it crash in this step when
there is anything else running on our 4TB machine that needs significant memory,
like a previous version of the graph.
progval committed Dec 16, 2024
1 parent e8b0012 commit 997fd89
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions src/algo/llp/mod.rs
@@ -363,16 +363,17 @@ pub fn layered_label_propagation<R: RandomAccessGraph + Sync>(
         .context("Could not load labels from best gammar")?
         .to_vec();
 
+    let mmap_flags = Flags::TRANSPARENT_HUGE_PAGES | Flags::RANDOM_ACCESS;
     for (i, gamma_index) in gamma_indices.iter().enumerate() {
         info!("Starting step {}...", i);
         let labels =
-            <Vec<usize>>::load_mem(labels_path(*gamma_index)).context("Could not load labels")?;
+            <Vec<usize>>::load_mmap(labels_path(*gamma_index), mmap_flags).context("Could not load labels")?;
         combine(&mut result_labels, *labels, &mut temp_perm).context("Could not combine labels")?;
         // This recombination with the best labels does not appear in the paper, but
         // it is not harmful and fixes a few corner cases in which experimentally
         // LLP does not perform well. It was introduced by Marco Rosa in the Java
         // LAW code.
-        let best_labels = <Vec<usize>>::load_mem(labels_path(best_gamma_index))
+        let best_labels = <Vec<usize>>::load_mmap(labels_path(best_gamma_index), mmap_flags)
             .context("Could not load labels from best gamma")?;
         let number_of_labels = combine(&mut result_labels, *best_labels, &mut temp_perm)?;
         info!("Number of labels: {}", number_of_labels);
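The trade-off behind this change can be illustrated with a minimal, std-only sketch. It is not the code path touched by the commit: `load_mmap` maps the label file so that the operating system pages data in on access, whereas the sketch below substitutes explicit seek-and-read as a stand-in, because Rust's standard library has no memory-mapping API. The on-disk format (little-endian 64-bit labels), the file name, and the helpers `write_labels`, `load_all`, and `read_label_at` are all hypothetical and exist only for illustration.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::Path;

/// Bytes per stored label (hypothetical format: little-endian u64).
const LABEL_BYTES: usize = 8;

/// Writes `labels` to `path` as consecutive little-endian u64 values.
fn write_labels(path: &Path, labels: &[u64]) -> io::Result<()> {
    let mut file = File::create(path)?;
    for label in labels {
        file.write_all(&label.to_le_bytes())?;
    }
    Ok(())
}

/// Eager strategy: read the whole file, so resident memory grows with file size.
fn load_all(path: &Path) -> io::Result<Vec<u64>> {
    let mut bytes = Vec::new();
    File::open(path)?.read_to_end(&mut bytes)?;
    let labels = bytes
        .chunks_exact(LABEL_BYTES)
        .map(|chunk| {
            let mut buf = [0u8; LABEL_BYTES];
            buf.copy_from_slice(chunk);
            u64::from_le_bytes(buf)
        })
        .collect();
    Ok(labels)
}

/// On-demand strategy: fetch only the label at `index`, holding 8 bytes at a time.
/// (Stand-in for mmap, where the kernel performs this paging transparently.)
fn read_label_at(file: &mut File, index: u64) -> io::Result<u64> {
    let mut buf = [0u8; LABEL_BYTES];
    file.seek(SeekFrom::Start(index * LABEL_BYTES as u64))?;
    file.read_exact(&mut buf)?;
    Ok(u64::from_le_bytes(buf))
}

fn main() -> io::Result<()> {
    // Hypothetical scratch file in the system temp directory.
    let path = std::env::temp_dir().join("llp_labels_sketch.bin");
    write_labels(&path, &[7, 3, 3, 9, 1])?;

    let eager = load_all(&path)?;
    let mut file = File::open(&path)?;
    let on_demand = read_label_at(&mut file, 3)?;

    println!("eager[3] = {}, on-demand[3] = {}", eager[3], on_demand);
    std::fs::remove_file(&path)?;
    Ok(())
}
```

Both strategies return the same values; they differ in how much of the file must be resident at once, which is the property that matters when the label arrays compete for memory with other processes on the same machine.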