Optimize hash table insertion #2107

Merged 5 commits on Oct 4, 2023

Commits on Oct 3, 2023

  1. hash_table REFACTOR simplify collision management

    The current implementation handles collisions by probing the following
    records until it finds an unused one.
    
    This has the following consequences:
    - when there are many collisions, an insertion can take a long time
      to find an empty entry
    - the table may need to be rehashed to get rid of invalid records
    - the collision-handling code is not trivial
    
    This commit reworks the hash table to use a per-hash list of records.
    
    This prepares for O(1) insertion into the hash table even when there
    are hash collisions. This commit alone is not sufficient, since we
    still check for duplicates on every insertion. See the introduction
    of lyht_insert_no_check() in a later commit.
    
    Note: this change breaks the validation unit test. It is fixed by the
    next commit.
    
    Signed-off-by: Olivier Matz <[email protected]>
    olivier-matz-6wind committed Oct 3, 2023
    55936cb
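
The per-hash list described in this commit can be illustrated with a minimal sketch. It is not the code from this pull request: the names (`struct table`, `struct rec`, `NO_RECORD`, `table_insert`, `table_find`) are hypothetical, the payload is a plain `int`, and resizing is left out. Records live in one array and each bucket stores the index of its first record, so an insertion never has to probe for a free slot. New records are linked at the head of the bucket list, which matches the "last inserted is browsed first" order that commit 2 describes for the state before its change.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NO_RECORD UINT32_MAX    /* hypothetical "end of list" marker */

struct rec {
    uint32_t hash;              /* full hash of the stored value */
    uint32_t next;              /* next record in the same bucket, or NO_RECORD */
    int val;                    /* payload; an int keeps the sketch short */
};

struct table {
    uint32_t *heads;            /* per-bucket index of the first record */
    struct rec *recs;           /* record storage */
    uint32_t size;              /* bucket count and record capacity, power of two */
    uint32_t used;              /* number of records stored so far */
};

static int table_init(struct table *t, uint32_t size)
{
    assert(size && !(size & (size - 1)));   /* power of two required */
    t->heads = malloc(size * sizeof *t->heads);
    t->recs = malloc(size * sizeof *t->recs);
    if (!t->heads || !t->recs) {
        free(t->heads);
        free(t->recs);
        return -1;
    }
    for (uint32_t i = 0; i < size; i++) {
        t->heads[i] = NO_RECORD;
    }
    t->size = size;
    t->used = 0;
    return 0;
}

/* Link a new record at the head of its bucket list: no probing needed. */
static int table_insert(struct table *t, uint32_t hash, int val)
{
    uint32_t b = hash & (t->size - 1);
    struct rec *r;

    if (t->used == t->size) {
        return -1;              /* a real table would resize here */
    }
    r = &t->recs[t->used];
    r->hash = hash;
    r->val = val;
    r->next = t->heads[b];
    t->heads[b] = t->used++;
    return 0;
}

/* Walk only the records that share the bucket of 'hash'. */
static struct rec *table_find(const struct table *t, uint32_t hash, int val)
{
    uint32_t i;

    for (i = t->heads[hash & (t->size - 1)]; i != NO_RECORD; i = t->recs[i].next) {
        if (t->recs[i].hash == hash && t->recs[i].val == val) {
            return &t->recs[i];
        }
    }
    return NULL;
}
```

With 8 buckets, the hashes 3 and 11 land in the same bucket; both values remain reachable through that bucket's list, and no other bucket is touched.
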
  2. hash_table UPDATE insert record at the end of the hlist

    Change the type of the hlist head: instead of referencing only the
    first record, reference both the first and last records. This lets us
    add new elements at the tail of the list.
    
    This impacts how the records of a hlist will be browsed in case of
    collisions:
    - before this commit: last inserted is browsed first
    - after this commit: first inserted is browsed first
    
    This fixes the validation unit test that was broken by the previous
    commit.
    
    Signed-off-by: Olivier Matz <[email protected]>
    olivier-matz-6wind committed Oct 3, 2023
    0c76cb2
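
A head that tracks both ends of the list makes tail insertion a constant-time operation. The sketch below uses pointers and hypothetical names (`struct node`, `struct hlist_head`, `hlist_add_tail`); it only illustrates the idea and is not the data layout used in this pull request.

```c
#include <stddef.h>

struct node {
    int val;
    struct node *next;
};

/* Head referencing both ends of the list. */
struct hlist_head {
    struct node *first;
    struct node *last;
};

/* Append in O(1); iteration from 'first' then visits nodes in insertion order. */
static void hlist_add_tail(struct hlist_head *h, struct node *n)
{
    n->next = NULL;
    if (h->last) {
        h->last->next = n;
    } else {
        h->first = n;       /* list was empty */
    }
    h->last = n;
}
```

After appending a, b and c in that order, walking from `first` yields a, b, c: the first inserted record is browsed first.
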
  3. hash_table FEATURE new api to insert without lookup

    When lyht_insert() is called, it first searches for an existing object
    that compares equal according to the previously specified val_equal()
    callback. During this search, the callback is invoked with mod=1.
    
    When there are many collisions, this check can take some time.
    
    Introduce a new API that bypasses this lookup. It will be used in a
    later commit to optimize the insertion of keyless list elements.
    
    Signed-off-by: Olivier Matz <[email protected]>
    olivier-matz-6wind committed Oct 3, 2023
    5d43e55
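
The cost of the duplicate lookup can be made visible with a small sketch that counts comparisons in a single collision list. The names (`struct bucket`, `bucket_insert`, `bucket_insert_no_check`) are hypothetical and do not reproduce the signatures of lyht_insert() or lyht_insert_no_check(); equality is reduced to an `int` comparison instead of a val_equal() callback.

```c
#include <stddef.h>

struct node {
    int val;
    struct node *next;
};

struct bucket {
    struct node *first;
    struct node *last;
    unsigned long compares;     /* equality checks performed so far */
};

static void bucket_append(struct bucket *b, struct node *n)
{
    n->next = NULL;
    if (b->last) {
        b->last->next = n;
    } else {
        b->first = n;
    }
    b->last = n;
}

/* Checked insert: walks the whole list first, so the cost is O(list length).
 * Returns 0 on success, -1 if an equal value is already present. */
static int bucket_insert(struct bucket *b, struct node *n)
{
    for (struct node *it = b->first; it; it = it->next) {
        b->compares++;
        if (it->val == n->val) {
            return -1;
        }
    }
    bucket_append(b, n);
    return 0;
}

/* Unchecked insert: O(1). The caller must guarantee there is no duplicate. */
static void bucket_insert_no_check(struct bucket *b, struct node *n)
{
    bucket_append(b, n);
}
```

Inserting 100 distinct values that all collide costs 0 + 1 + ... + 99 = 4950 comparisons with the checked variant and none with the unchecked one, which is why skipping the lookup matters when many records share a hash.
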
  4. tree_data OPTIMIZE don't check dups on htable insertion

    When inserting a node into the children hash table, we check that
    it was not already added, using a pointer comparison.
    
    This check can take a lot of time if there are a lot of collisions
    in the hash table, which is the case for keyless lists.
    
    Use the new API lyht_insert_no_check() to optimize the insertion.
    
    Signed-off-by: Olivier Matz <[email protected]>
    olivier-matz-6wind committed Oct 3, 2023
    e764b3f
  5. tree_data OPTIMIZE optimize creation of children htable

    When creating the children hash table, we already know the number of
    children that will be added, so create the hash table with the correct
    number of elements to avoid automatic resizes.
    
    Signed-off-by: Olivier Matz <[email protected]>
    olivier-matz-6wind committed Oct 3, 2023
    9f69558
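
The benefit of sizing the table up front can be estimated with a short sketch. It assumes a table that grows by doubling whenever it is full, which is a simplification: a real table typically resizes at a load-factor threshold, and the policy used by this project is not shown here. The helper names (`pow2_ceil`, `resizes_needed`) are hypothetical.

```c
#include <stdint.h>

/* Smallest power of two >= n (at least 1); returns 0 if it does not fit. */
static uint32_t pow2_ceil(uint32_t n)
{
    uint32_t s = 1;

    if (n > UINT32_C(0x80000000)) {
        return 0;
    }
    while (s < n) {
        s <<= 1;
    }
    return s;
}

/* Number of doublings needed to grow from 'initial' (> 0) to hold 'count'. */
static unsigned resizes_needed(uint32_t initial, uint32_t count)
{
    uint64_t size = initial;    /* 64-bit so the doubling cannot overflow */
    unsigned resizes = 0;

    while (size < count) {
        size <<= 1;
        resizes++;
    }
    return resizes;
}
```

Under these assumptions, filling a table that starts with 8 slots with 1000 children takes 7 resizes (8 to 1024), each of which has to move the existing records, while a table created with `pow2_ceil(1000)` slots needs none.
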