Audio Fingerprinting Spike #45
Comments
Echonest spike: Easy to use, fast, and it identifies files. Unfortunately, they provide very limited metadata: we only get an Echonest ID, an artist, and a track name. While they claim to integrate with other services (MusicBrainz, etc.), they only do so for artists and albums, not tracks. The reason they give is that they can't tell which album an individual track comes from.

The workaround would be to take the track title and artist name and look them up on a separate metadata API. Here's the problem: I have a track that Echonest identifies as artist: Ratatat, title: Stomp. But MusicBrainz lists it under artist: Young Buck, T.I. & Ludacris, so searching for Ratatat doesn't find anything. The fingerprinting is therefore basically meaningless for us, because we don't get any real identifier that would allow us to look the track up elsewhere.

Here are the steps and data I used:

I downloaded the ENMFP binary from: http://developer.echonest.com/downloads/license
[{"metadata":{"artist":"", "release":"", "title":"", "genre":"", "bitrate":247,"sample_rate":44100, "duration":306, "filename":"/Users/agrieser/remixes/Ratatat_-_01_-_Young_Buck,_T.I._&_Ludacris_-_Stomp.mp3", "samples_decoded":661814, "given_duration":30, "start_offset":10, "version":3.16, "codegen_time":0.295904, "decode_time":0.193694}, "code_count":141, "code":"eJwllYl5hCEIRFv6EQ8ox7P_EvJmk_2C68ghMLhfGd_3lXERMREZiFkkhC1hW9sjvbMRVypP2EPFP1TcKgIIIcyF1Y5oRUJYx8yHMEXzkEhtp2wVzRfufUvvyOzIy5V46NWvSKBXTduCWfVPArOqaLVpq2i1Y1YVrYbMQiopTIHqktg6PfJyZfaI2wjwNfskUG5FmNw3uW9Np12nHZM2dCr3Tcm0lN6Sity3LQdHZlcqV5hidOXRFaPLfXdUetW24qDLf5f7PrQNTnpqu3SwpSfPXcWR2tffT7Ad-P6G4X4UbR3bIc9DGY4mTO0euvgICXkeUwe6-Ng_Iezqm5o8HjFDTQ41WekihFVt5TTU2hhFAtsQkfSP6BLCllTEoRCHQhxKdl-ql1mKBAcp5qQ8pzyn2pjKQbxEaPvjS-da93hv31mPisbJ6r0cjwLIusg5ylisx6lKXq3vs93H8lfo6jDWt-OJyW-vV2y_xad4xPD7rlEggfhpl4rZu3nPSEKMN8f6xW3vlviPa43CE9dntjrDxa3ZWil2PgMkfpBN9dt9AzqXT8Wv9fV3IYSnAdqGfrdW-w5NDQev3rb5YCDPdxiyc28orxWAkJm8rspiAr9cZ-26C_TseGYoGLVcXs2sN9XlMDKtLLj1qhUKfcjnW3u2uXrh7wF-s0cEYxmt5CSNFcXq7D4n0QCD-cwVVuDkE49qXYAfX83U0FlmMKu44n2Z33yziUzHy7rZjRzqno91_MCxxmRjutfhkHHQvWgcdeZeX2weg2ajUO-PRpnmLGBG7DYKU1U6oPn650yttB_mrG0CqXuhGNVgQJKXUd1N3U3zCki8-gGyEg87OpSjN38Mk-LefGdzz9nHHG_YXDPzJmufdyq0Cqn-cZ-5yk4iuLm3VYjA9336Tj_X4jl52BuBQsy2B2DngskIBwZie2UKDqW2oG_h6DPjPHOOfrXgft9F38-mjj6Nbg9A7luhwCaPdw4lyY88SCK_WcVTyktTeO57myOV8oQhqbd6A5LHJ-rhJ3gT3GZ_57oVez3XGyVrjEO-NLgBwovAWXXLrteOELs3qPk6cS392jtLw-ovxutlNep5rawyTz10sc89V9OcqWzg9L918gGn_wxIghNnJyC_WP0C_qh-9Lak7arfMn4lqkBKphfmZhniX2Mka17fGyZe8PFcYC2ebzM8DGjc8QeI4b7u", "tag":0}
]

This was then posted to their API:
Which responded with:

{"response": {"status": {"version": "4.2", "code": 0, "message": "Success"}, "songs": [{"tag": 0, "score": 6, "title": "Stomp", "message": "OK (match type 6)", "artist_id": "AREPZK61187B990670", "artist_name": "Ratatat", "id": "SOWNNKW12B0B806712"}]}}

I also tried it with Weezer - Hash Pipe and got the following response:

{"response": {"status": {"version": "4.2", "code": 0, "message": "Success"}, "songs": []}}

The empty songs list means they were unable to identify it. Definitely not a good sign for a popular track like this.
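For reference, here is a minimal Python sketch of how the two responses above could be read programmatically. The function name `extract_matches` is made up for illustration, and the field names are taken only from the responses pasted in this comment, so treat the response shape as an assumption rather than a documented contract.

```python
import json


def extract_matches(raw):
    """Return (artist_name, title, echonest_id) tuples from an identify response.

    Assumes the response shape shown in this comment: a "response" object
    with a "status" block and a "songs" list. An empty list means no match.
    """
    body = json.loads(raw)["response"]
    if body["status"]["code"] != 0:
        raise ValueError(body["status"]["message"])
    return [(s["artist_name"], s["title"], s["id"]) for s in body["songs"]]


# The two responses quoted above, verbatim.
RATATAT = (
    '{"response": {"status": {"version": "4.2", "code": 0, "message": "Success"}, '
    '"songs": [{"tag": 0, "score": 6, "title": "Stomp", '
    '"message": "OK (match type 6)", "artist_id": "AREPZK61187B990670", '
    '"artist_name": "Ratatat", "id": "SOWNNKW12B0B806712"}]}}'
)
NO_MATCH = (
    '{"response": {"status": {"version": "4.2", "code": 0, "message": "Success"}, '
    '"songs": []}}'
)

if __name__ == "__main__":
    print(extract_matches(RATATAT))   # one match: Ratatat / Stomp
    print(extract_matches(NO_MATCH))  # empty list: track not identified
```

Note that nothing in the matched entry is a MusicBrainz identifier, which is the gap described above: the only keys available for a follow-up lookup are the artist name and title strings.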
lastfm spike: Relatively easy to set up, and available via Homebrew. When lastfm finds the track, it returns the track title, artist, and an mbid (MusicBrainz ID). If the search returns multiple tracks, as it seemed to do pretty regularly, it ranks the matches. In my limited use, the highest-ranking track sometimes had no MusicBrainz ID while one of the lower-ranking tracks did. The workaround would be to search for a track and grab the highest-ranking result that also has a MusicBrainz ID. Then we'd hit MusicBrainz for metadata.

Here are the steps and data I used:

I downloaded the lastfm Fingerprinter from: https://github.com/lastfm/Fingerprinter

Obscure Test
Mainstream Test
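The lastfm workaround described above (take the highest-ranking result that also carries a MusicBrainz ID) could be sketched roughly as follows. This is only an illustration: the candidate dictionaries, the `rank` field (assumed here to mean "higher is better"), and the placeholder `mbid` value are all hypothetical and do not reflect the Fingerprinter's actual output format.

```python
def best_with_mbid(candidates):
    """Pick the highest-ranked candidate that has a non-empty 'mbid'.

    `candidates` is a hypothetical list of dicts with 'rank', 'artist',
    'title' and 'mbid' keys; 'rank' is assumed to be higher-is-better.
    Returns None when no candidate carries a MusicBrainz ID.
    """
    with_id = [c for c in candidates if c.get("mbid")]
    return max(with_id, key=lambda c: c["rank"], default=None)


if __name__ == "__main__":
    # Made-up sample data: the top-ranked hit has no mbid, a lower one does.
    sample = [
        {"rank": 1.0, "artist": "Artist A", "title": "Track", "mbid": ""},
        {"rank": 0.8, "artist": "Artist A", "title": "Track", "mbid": "placeholder-mbid"},
        {"rank": 0.5, "artist": "Artist B", "title": "Track", "mbid": None},
    ]
    print(best_with_mbid(sample))
```

If no candidate has an mbid, the function returns `None`, and the caller would have to fall back to a title and artist search with the same mismatch risk seen in the Echonest spike.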
When accepting audio files, we should not rely on user-submitted metadata. It would be better to use audio fingerprinting to get the metadata.
We should spike the following libraries:
http://developer.echonest.com/docs/v4/song.html#identify
http://acoustid.org/faq
https://github.com/lastfm/Fingerprinter
https://github.com/sampsyo/beets