Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change std::hash<int> to hash_combine #1557

Open
wants to merge 4 commits into
base: inference
Choose a base branch
from

Conversation

suranap
Copy link
Collaborator

@suranap suranap commented Dec 14, 2024

Description of changes:
A user was trying to use this with 32 gpus and got the following error:
LEGION ERROR: ERROR: ShardingID 486726351 has already been used by another sharding functor.
Turns out the hash function in MachineView doesn't work. Changed all but one place where std::hash is used. Cherry-picked improved hash_combine from master.

There's still this spot that uses std::hash:
https://github.com/flexflow/FlexFlow/blob/3a825ed6ae7425ea3578006e458cfcaf19e038a5/src/runtime/graph.cc#L1868-L1872

Related Issues:

Linked Issues:

  • Issue #

Issues closed by this PR:


This change is Reviewable

This exposes the collision found for 8 nodes * 4 gpus.
These hashes use hash<int> which is the identity function. So no
hashing occurs at all!
@suranap suranap added the bug Something isn't working label Dec 14, 2024
@suranap suranap linked an issue Dec 16, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hash function doesn't hash hash_combine results in hash collision
1 participant