Issues with baseline PeakFP #2

Open
shenberg opened this issue Mar 5, 2024 · 3 comments

Comments

@shenberg

shenberg commented Mar 5, 2024

Hi, I'm trying to use PeakFP to the best of my understanding. I took 35 songs (roughly 120 minutes of audio), ran extractor.py on each one, then ran indexer.py to generate an index (I provided a txt file containing the paths to all the fingerprint files). The generated index is ~20MB. Next, I tried to run matcher.py on an 8-second query, but the script never finishes.

If I interrupt it after a long time (e.g. >10 minutes), it's busy adding to the interval tree: itree.addi(match['start'], match['end'], match). I let it run for >3h and it still did not finish a single 8-second query.

I reduced the sensitivity by raising the minimum from 5 matches to 12, which cuts the number of matches by roughly a factor of 10, and even then it gets stuck at itree.split_overlaps(). I gave up after some tens of minutes. From what was reported in the paper, I was expecting a runtime only about 10x slower than real time, but I'm getting well over 100x.
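
For reference, the slowdown implied by the run times quoted in this comment can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch; `realtime_factor` is a hypothetical helper name and is not part of PeakFP.

```python
def realtime_factor(runtime_s, audio_s):
    """Return how many times slower than real time a run was."""
    return runtime_s / audio_s

query_s = 8  # length of the query, in seconds

# Interrupted after 10 minutes without finishing: at least this slow.
print(realtime_factor(10 * 60, query_s))   # 75.0

# Left running for 3 hours without finishing: at least this slow.
print(realtime_factor(3 * 3600, query_s))  # 1350.0
```

Both values are lower bounds, since the query had not completed in either case.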

Any ideas why? Am I using the library incorrectly?

@guillemcortes
Owner

Hi @shenberg,
Thanks for your interest in baf. Right now I am a bit busy with other things, but I will try to take a look at it.
In the meantime, can you provide some information to reproduce your error, such as OS platform, Python and package versions, the exact commands you are running, etc.?
Many thanks!

@shenberg
Author

Sure, thanks for the response!

I'm running on an M1 Max MacBook Pro with macOS Ventura 13.6.2, and installed PeakFP in a virtualenv with Python 3.8.18.

I had to change the requirements a bit in order to get a functional install. Older versions of numpy don't work smoothly with ARM-based Macs, so I changed the requirements file to be:

intervaltree==3.1.0
librosa==0.8.1
scikit-image

And then everything installed smoothly.

I don't know if it's normal, but for me, run_matching() in matcher.py would end up with on the order of 300k matches in processed_matches with the default settings, then spend an incredibly long time in the interval tree. I tried modifying the code to create an interval tree per reference id, and then to coalesce matches between references, but I still ended up having to reduce the sensitivity drastically (min_peak_threshold 13 instead of 5 where necessary), and the results were pretty bad.
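
A minimal sketch of the per-reference idea described above, written without an interval tree: matches are grouped by reference id, sorted, and merged in a single sweep. It assumes each match is a dict with `start` and `end` keys (as in the `itree.addi` call quoted earlier) plus a `ref` key, which is a hypothetical name. This is not the code in matcher.py, and it merges overlapping spans rather than reproducing what `itree.split_overlaps()` does.

```python
from collections import defaultdict

def merge_matches_per_reference(matches):
    """Group matches by reference id, then merge overlapping spans.

    Assumes each match is a dict with 'ref', 'start' and 'end' keys.
    The 'ref' key name is an assumption, not taken from matcher.py.
    """
    by_ref = defaultdict(list)
    for m in matches:
        by_ref[m["ref"]].append((m["start"], m["end"]))

    merged = {}
    for ref, spans in by_ref.items():
        spans.sort()  # order by start time, then end time
        out = [list(spans[0])]
        for start, end in spans[1:]:
            if start <= out[-1][1]:
                # Overlaps or touches the current span: extend it.
                out[-1][1] = max(out[-1][1], end)
            else:
                # Gap before this span: start a new merged span.
                out.append([start, end])
        merged[ref] = [tuple(s) for s in out]
    return merged

matches = [
    {"ref": "a", "start": 0, "end": 5},
    {"ref": "a", "start": 3, "end": 8},
    {"ref": "a", "start": 10, "end": 12},
    {"ref": "b", "start": 1, "end": 2},
]
print(merge_matches_per_reference(matches))
# {'a': [(0, 8), (10, 12)], 'b': [(1, 2)]}
```

The cost is dominated by the per-reference sort, so the whole pass is O(n log n) in the number of matches. Whether merged spans are an acceptable substitute for the original interval handling depends on how the later scoring step uses them.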

@guillemcortes
Owner

PeakFP takes a lot of time to compute the matching. Please refer to the paper for more information on that: https://zenodo.org/records/7372162. Basically, it took 98 hours to do the matching on BAF (single thread). Do you mean that the results do not match the metrics reported in the paper? In the end, PeakFP was designed to be a naive, non-scalable baseline for background music identification.
