nlist computer example and nprobe suggestion #2771
base: v2.4.x
If the data volume of the entities is within the millions, you might consider using brute-force search. In other words, set nprobe to nlist.
one more nlist example
Seetimee patch 1
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by: seetimee.
The full list of commands accepted by this bot can be found here.
Setting `nprobe` is specific to the dataset and scenario, and involves a trade-off between accuracy and query performance. We recommend finding the ideal value through repeated experimentation.
If the data volume of the entities is within the millions, you might consider using brute-force search. In other words, set `nprobe` to `nlist`.
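To illustrate why setting `nprobe` to `nlist` amounts to brute-force search, here is a minimal pure-Python sketch of an IVF-style index. It is a toy model, not Milvus's implementation: the centroids are random samples rather than trained with k-means, and the helper names (`build_ivf`, `ivf_search`, `brute_force`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, nlist, seed=0):
    """Toy IVF build: sample nlist vectors as centroids (assumption: no
    k-means training) and put each vector in its nearest centroid's bucket."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, nlist)
    buckets = [[] for _ in range(nlist)]
    for idx, vec in enumerate(vectors):
        nearest = min(range(nlist), key=lambda c: sq_dist(vec, centroids[c]))
        buckets[nearest].append(idx)
    return centroids, buckets


def ivf_search(query, vectors, centroids, buckets, nprobe, k):
    """Scan only the nprobe buckets whose centroids are closest to the query."""
    order = sorted(range(len(centroids)),
                   key=lambda c: sq_dist(query, centroids[c]))
    candidates = [i for c in order[:nprobe] for i in buckets[c]]
    # Ties are broken by index so results are deterministic.
    return sorted(candidates, key=lambda i: (sq_dist(query, vectors[i]), i))[:k]


def brute_force(query, vectors, k):
    """Exhaustive search over every vector."""
    return sorted(range(len(vectors)),
                  key=lambda i: (sq_dist(query, vectors[i]), i))[:k]
```

With `nprobe == nlist` every bucket is scanned, so `ivf_search` returns the same ids as `brute_force`. With a smaller `nprobe`, fewer vectors are compared and some true neighbors may be missed, which is the accuracy versus performance trade-off described above.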
To my understanding, brute-force (BF) search shows better performance only when the dataset is at the thousands level.
It seems that at the millions level it doesn't affect performance significantly.
There is some misunderstanding here.
He meant that it doesn't hurt that much for small cases.
Actually, for a 1M-row dataset, the performance gap between searching with and without an index can be 10x to 100x. Only when the number of rows is smaller than thousands can FLAT outperform. But Milvus will handle that for you.
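As a rough sanity check on that gap, one can count distance computations per query under a simplified cost model. This is only a back-of-the-envelope sketch: it assumes evenly sized buckets and ignores memory access, SIMD, and other engine-level effects, and the `nlist` and `nprobe` values are illustrative assumptions, not recommendations.

```python
def ivf_distance_computations(n, nlist, nprobe):
    """Approximate distance computations for one IVF query: compare the
    query against nlist centroids, then scan nprobe buckets of about
    n / nlist vectors each (assumption: buckets are evenly sized)."""
    return nlist + nprobe * n / nlist


n = 1_000_000             # dataset size from the comment above
nlist, nprobe = 1024, 16  # assumed values, for illustration only

# Brute force computes n distances per query.
speedup = n / ivf_distance_computations(n, nlist, nprobe)

# With nprobe == nlist every bucket is scanned, so the index saves nothing.
full_probe_speedup = n / ivf_distance_computations(n, nlist, nlist)
```

Under these assumptions `speedup` comes out at roughly 60, which falls inside the 10x to 100x range mentioned above, while `full_probe_speedup` is slightly below 1 because the centroid comparisons are pure overhead.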