
Upgrade to hashbrown 0.15.1: migrate from hashbrown::raw::RawTable to hashbrown::hash_table::HashTable #13433

crepererum opened this issue Nov 15, 2024 · 6 comments

Comments

@crepererum
Contributor

crepererum commented Nov 15, 2024

What

Migrate from hashbrown::raw::RawTable to hashbrown::hash_table::HashTable.

Why

RawTable and raw_entry are removed in hashbrown 0.15.1; see #13256. Personally I think it's the right thing to do, because RawTable was a bit of a weird API.

How

First, we need memory accounting for HashTable. This can be done in a similar way to the existing extension trait for RawTable:

/// Extension trait for hashbrown's [`RawTable`] to account for allocations.
pub trait RawTableAllocExt {

The HashTable counterpart could look roughly like this:

/// Extension trait for hashbrown's [`HashTable`] to account for allocations.
pub trait HashTableAllocExt {
    /// Item type.
    type T;

    /// Insert new element into table and increase
    /// `accounting` by any newly allocated bytes.
    ///
    /// Returns the bucket where the element was inserted.
    /// Note that allocation counts capacity, not size.
    ///
    /// # Example:
    /// ```
    /// TODO: rewrite this example!
    /// ```
    fn insert_accounted(
        &mut self,
        x: Self::T,
        hasher: impl Fn(&Self::T) -> u64,
        accounting: &mut usize,
    );
}

impl<T> HashTableAllocExt for HashTable<T>
where
    T: Eq,
{
    type T = T;

    fn insert_accounted(
        &mut self,
        x: Self::T,
        hasher: impl Fn(&Self::T) -> u64,
        accounting: &mut usize,
    ) {
        let hash = hasher(&x);

        // NOTE: `find_entry` does NOT grow!
        match self.find_entry(hash, |y| y == &x) {
            Ok(_occupied) => {}
            Err(_absent) => {
                if self.len() == self.capacity() {
                    // need to request more memory
                    let bump_elements = self.capacity().max(16);
                    let bump_size = bump_elements * size_of::<T>();
                    *accounting = (*accounting).checked_add(bump_size).expect("overflow");

                    self.reserve(bump_elements, &hasher);
                }

                // still need to insert the element since first try failed
                self.entry(hash, |y| y == &x, hasher).insert(x);
            }
        }
    }
}
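
For illustration, a minimal usage sketch of the extension trait (assuming the trait and impl above are in scope; `RandomState` is just a stand-in for whatever `BuildHasher` the call site actually uses):

use hashbrown::hash_table::HashTable;
use std::collections::hash_map::RandomState;
use std::hash::BuildHasher;

fn demo() {
    let state = RandomState::new();
    let mut table: HashTable<u64> = HashTable::new();
    let mut accounting = 0usize;

    for v in 0..100u64 {
        // grows the table on demand and adds the newly reserved bytes to `accounting`
        table.insert_accounted(v, |x| state.hash_one(*x), &mut accounting);
    }

    // `accounting` approximates the reserved capacity in bytes, not the element count
    assert!(accounting > 0);
}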

Then, migrate every RawTable to HashTable. Since RawTable is used all over the place, I think this should be split into multiple PRs. Luckily, HashTable is already available in hashbrown 0.14.x, so we can do this iteratively before upgrading to hashbrown 0.15.x.
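
To give a flavor of the mechanical part of that migration, here is what a typical call site could look like on `HashTable` (a hypothetical `upsert` helper, not actual DataFusion code; the `RawTable` version of the same logic would go through `get_mut` and `insert`):

use hashbrown::hash_table::HashTable;
use std::collections::hash_map::RandomState;
use std::hash::BuildHasher;

// Insert-or-update a (key, value) pair keyed by the first tuple element.
// `find_mut` and `insert_unique` are the safe counterparts of the raw lookups.
fn upsert(table: &mut HashTable<(u64, u64)>, state: &RandomState, key: u64, value: u64) {
    let hash = state.hash_one(key);
    match table.find_mut(hash, |(k, _)| *k == key) {
        Some((_, v)) => *v = value,
        None => {
            table.insert_unique(hash, (key, value), |(k, _)| state.hash_one(*k));
        }
    }
}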

Progress

crepererum added a commit that referenced this issue Dec 4, 2024
…nd 2) (#13524)

* refactor: migrate `GroupValuesRows` to `HashTable`

For #13433.

* refactor: migrate `GroupValuesPrimitive` to `HashTable`

For #13433.

* refactor: migrate `GroupValuesColumn` to `HashTable`

For #13433.
alamb changed the title from "Migrate from hashbrown::raw::RawTable to hashbrown::hash_table::HashTable" to "Upgrade to hashbrown 0.15.1: migrate from hashbrown::raw::RawTable to hashbrown::hash_table::HashTable" on Dec 6, 2024
@Dandandan
Contributor

Thanks @crepererum for the steady progress on this

@crepererum
Contributor Author

crepererum commented Dec 9, 2024

TBH the last two batches are rather hard:

hashbrown 0.14 & allocation size

hashbrown 0.14 doesn't expose the allocation size for HashTable, which we would need here:

/// Calculates the size of the `PruningJoinHashMap` in bytes.
///
/// # Returns
/// The size of the hash map in bytes.
pub(crate) fn size(&self) -> usize {
    self.map.allocation_info().1.size() + self.next.capacity() * size_of::<u64>()
}

hashbrown 0.15 offers that, but then we have to roll the RawTable->HashTable change together with the upgrade, not as separate steps.
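
If I read the hashbrown 0.15 changelog correctly, that is exposed as `HashTable::allocation_size`, so the post-upgrade version could be as simple as the following sketch (treat the method name as an assumption to double-check):

// hypothetical post-upgrade version, assuming `self.map` is a
// `hashbrown::hash_table::HashTable` and that `allocation_size()`
// reports the bytes allocated by the table
pub(crate) fn size(&self) -> usize {
    self.map.allocation_size() + self.next.capacity() * size_of::<u64>()
}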

ArrowHashTable

This is a bit of a wild interface and I dunno if it's worth the 15% uplift TBH:

/// An interface to hide the generic type signature of TopKHashTable behind arrow arrays
pub trait ArrowHashTable {
    fn set_batch(&mut self, ids: ArrayRef);
    fn len(&self) -> usize;

    // JUSTIFICATION
    // Benefit: ~15% speedup + required to index into RawTable from binary heap
    // Soundness: the caller must provide valid indexes
    unsafe fn update_heap_idx(&mut self, mapper: &[(usize, usize)]);

    // JUSTIFICATION
    // Benefit: ~15% speedup + required to index into RawTable from binary heap
    // Soundness: the caller must provide a valid index
    unsafe fn heap_idx_at(&self, map_idx: usize) -> usize;

    unsafe fn take_all(&mut self, indexes: Vec<usize>) -> ArrayRef;

    // JUSTIFICATION
    // Benefit: ~15% speedup + required to index into RawTable from binary heap
    // Soundness: the caller must provide valid indexes
    unsafe fn find_or_insert(
        &mut self,
        row_idx: usize,
        replace_idx: usize,
        map: &mut Vec<(usize, usize)>,
    ) -> (usize, bool);
}

The issue here is that HashTable doesn't offer an unsafe "directly address the bucket via index" kind of interface because it is fundamentally rather risky. I wonder if we should somewhat redesign this part. As far as I understand, this uses the following concepts:

  1. data heap: a "heap" (which I guess is just a vector) to store the actual payload data
  2. mutable slot: a slot that stores a mutable index into the data heap. I guess this is used to update the data pointer whenever a new/better value is found. The slots are referenced by a simple usize index.
  3. key lookup: a way to look up key -> mutable slot.

The current solution fuses 2 & 3 into a single RawTable w/ a LOT of unsafe. I think we could deconstruct that into a HashTable (for 3) + Vec (for 2). This can be done BEFORE the hashbrown 0.15 update.
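
To make that concrete, here is a rough sketch of the deconstruction (all names hypothetical, not the actual DataFusion types): the safe `HashTable` only maps a key to a plain `usize` slot id, and the mutable heap indexes live in a `Vec`, so no unsafe bucket addressing is needed:

use hashbrown::hash_table::HashTable;
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hash};

struct TopKIndex<K> {
    lookup: HashTable<(K, usize)>, // 3. key lookup: key -> slot id
    slots: Vec<usize>,             // 2. mutable slot: slot id -> index into the data heap (1.)
    state: RandomState,
}

impl<K: Hash + Eq> TopKIndex<K> {
    /// Find the slot for `key`, or create one pointing at `heap_idx`.
    fn find_or_insert(&mut self, key: K, heap_idx: usize) -> usize {
        let hash = self.state.hash_one(&key);
        if let Some((_, slot)) = self.lookup.find(hash, |(k, _)| k == &key) {
            return *slot;
        }
        let slot = self.slots.len();
        self.slots.push(heap_idx);
        let state = &self.state;
        self.lookup
            .insert_unique(hash, (key, slot), |(k, _)| state.hash_one(k));
        slot
    }

    /// Slots are addressed by plain indexes, so the binary heap can update them safely.
    fn update_heap_idx(&mut self, slot: usize, new_heap_idx: usize) {
        self.slots[slot] = new_heap_idx;
    }
}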

@alamb
Contributor

alamb commented Dec 9, 2024

This is a bit of a wild interface and I dunno if it's worth the 15% uplift TBH:

I think this is something @avantgardnerio and @thinkharderdev may know more about. Perhaps they can offer some suggestions / thoughts about improving the performance, or some benchmarks we can use to see how much performance would be affected if we switched to the newer hashbrown version.

@avantgardnerio
Contributor

I think this is something @avantgardnerio and @thinkharderdev may know more about.

The main reason that 15% was so important was to keep it on par with the unoptimized version. If CPU is equal, then the optimizer rule was easy: just always use this rule.

If we are okay taking a performance hit, or if @crepererum 's idea keeps performance the same, then there's no reason to keep the unsafe code.

@thinkharderdev
Contributor

The current solution fuses 2 & 3 into a single RawTable w/ a LOT of unsafe. I think we could deconstruct that into a HashTable (for 3) + Vec (for 2). This can be done BEFORE the hashbrown 0.15 update.

The basic idea is just a hash map where the values are sorted, so you can use it as a combo aggregation-by-key + top-k heap. @avantgardnerio and I went through a couple of iterations on the design and landed on this because it doesn't regress performance when the grouping key is low cardinality.

@crepererum
Contributor Author

If we are okay taking a performance hit, or if @crepererum 's idea keeps performance the same, then there's no reason to keep the unsafe code.

My idea was to ideally not regress significantly, at least not without discussing it first.
