[WARNING] False Positive Issues #105

Open
ziczhu opened this issue Jun 29, 2023 · 1 comment

ziczhu commented Jun 29, 2023

Currently, we are experiencing a high number of false positives when utilizing this library. In our scenario, approximately 70% of the results are false positives, which significantly impacts the accuracy of our application.

To address this issue, I suggest applying the following prechecks before using the library:

  1. Preprocessing based on video length: Consider adding a preprocessing step that filters out videos shorter than 1 minute. This criterion can help eliminate short videos, which often contribute to false-positive matches.

  2. Similarity threshold adjustment: Modify the similarity threshold used by the library to make it more stringent. By increasing the threshold, the library will only consider videos with a higher degree of similarity, reducing the occurrence of false positives. This adjustment can significantly improve the precision of the matching process.

  3. Comparison of video durations: Introduce a comparison mechanism that checks the proximity of video durations when assessing similarity. This step would ensure that two videos are not considered similar if their durations differ significantly. By including this additional criterion, we can reduce the occurrence of false positives caused by videos with vastly different lengths.
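
The three prechecks above can be sketched roughly as follows. This is only an illustration, not the library's API: the `Video` class, the function names, and the `MAX_DURATION_RATIO` and `MIN_SIMILARITY` values are assumptions chosen for the example, and `similarity` stands for whatever score in [0, 1] you derive from the library's output. Only the 1-minute minimum comes from this issue.

```python
from dataclasses import dataclass

MIN_DURATION_S = 60.0      # from the issue: ignore videos shorter than 1 minute
MAX_DURATION_RATIO = 1.2   # assumption: longer video at most 20% longer than the shorter
MIN_SIMILARITY = 0.9       # assumption: tune this on your own labelled data


@dataclass
class Video:
    """Hypothetical record holding just what the prechecks need."""
    name: str
    duration_s: float


def long_enough(video: Video, min_duration_s: float = MIN_DURATION_S) -> bool:
    """Precheck 1: drop videos below the minimum duration."""
    return video.duration_s >= min_duration_s


def durations_close(a: Video, b: Video, max_ratio: float = MAX_DURATION_RATIO) -> bool:
    """Precheck 3: require the two durations to be within a ratio of each other."""
    shorter, longer = sorted((a.duration_s, b.duration_s))
    if shorter <= 0:
        return False
    return longer / shorter <= max_ratio


def is_probable_match(a: Video, b: Video, similarity: float,
                      min_similarity: float = MIN_SIMILARITY) -> bool:
    """Combine the duration prechecks with a stricter threshold (precheck 2)."""
    return (
        long_enough(a)
        and long_enough(b)
        and durations_close(a, b)
        and similarity >= min_similarity
    )
```

The duration checks are cheap, so in practice they can run before the library is called at all, which also avoids hashing videos that would be rejected anyway.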

Still, thanks to the author for providing this library for low-cost comparison. If you're using it in a very serious scenario, though, I would suggest using it like a Bloom filter: treat a positive result only as a candidate, and run a more intensive algorithm to confirm it.
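
A minimal sketch of that two-stage idea, assuming you supply both comparison functions yourself: `cheap_match` would wrap this library's comparison and `expensive_match` would be whatever slower, more accurate check you trust. Both names are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

V = TypeVar("V")


def confirm_matches(
    pairs: Iterable[Tuple[V, V]],
    cheap_match: Callable[[V, V], bool],
    expensive_match: Callable[[V, V], bool],
) -> List[Tuple[V, V]]:
    """Run the expensive check only on pairs the cheap check flags as positive."""
    confirmed = []
    for a, b in pairs:
        if not cheap_match(a, b):
            continue  # cheap negative: skip the costly comparison
        if expensive_match(a, b):
            confirmed.append((a, b))  # positive confirmed by the second stage
    return confirmed
```

Note that, unlike a real Bloom filter, a hash-based video comparison may also miss true matches, so this arrangement only reduces false positives.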

@Qinmayyear

Wish I had seen this earlier. This library cannot be used to detect videos shorter than 1 minute; there were many false-positive cases :(
