Pretrained model #10

Mihonarium · 2021-07-12T11:10:05Z

While it's relatively easy to train the model on the Dataset-mini (even Colab allows that), it's not as easy to reproduce the paper's results with the Dataset-full. It would be great if you could publish a model trained on the full dataset.

(By the way, congratulations on the paper, and thanks for publishing the work, it's really cool!)

Mihonarium · 2021-07-12T12:39:03Z

Oh, sorry, I just saw that you actually use the mini dataset for training and the full one for a full-scale evaluation. Closing the issue.

mimbres · 2021-07-12T14:40:00Z

Thanks. Yes, actually the training part is the same.
I have a plan for Colab. The g-drive (raw) files are there exactly for the purpose of mounting them on Colab.

Training in colab:
I didn't test it, but it should work. You first need to modify config/default.yaml. OUTPUT_ROOT_DIR and LOG_ROOT_DIR must be set to your gdrive directory, and the other paths such as SOURCE_ROOT etc. should point to the dataset (raw) I shared.
During training, it saves a model checkpoint every epoch, usually every twenty minutes, though it can take longer.
So if Colab shuts down automatically, you can continue training from the last checkpoint.
If you run into any problem, just let me know. It will be a nice contribution.
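
A minimal sketch of the config edit described above, assuming the keys appear in config/default.yaml as `KEY : value` lines. The `DIR:` section name, the sample values, and the Google Drive path are hypothetical placeholders rather than the repository's actual contents, and `set_yaml_value` is an illustrative helper, not part of the project.

```python
import re


def set_yaml_value(text: str, key: str, value: str) -> str:
    """Replace the value on every `key : ...` line of a YAML text.

    Indentation and any trailing `# comment` are kept as they are.
    Raises KeyError if the key does not occur at all.
    """
    pattern = re.compile(
        r"^(?P<head>[ \t]*" + re.escape(key) + r"[ \t]*:[ \t]*)"
        r"(?P<val>[^#\n]*?)"
        r"(?P<tail>[ \t]*(?:#.*)?)$",
        re.MULTILINE,
    )
    # A function replacement avoids backslash surprises in `value`.
    new_text, n_hits = pattern.subn(
        lambda m: m.group("head") + value + m.group("tail"), text
    )
    if n_hits == 0:
        raise KeyError(f"{key} not found in config text")
    return new_text


# Hypothetical fragment standing in for config/default.yaml.
SAMPLE = (
    "DIR:\n"
    "    OUTPUT_ROOT_DIR : ./logs/emb/  # embeddings\n"
    "    LOG_ROOT_DIR : ./logs/\n"
)

# Hypothetical location of the mounted Google Drive folder.
GDRIVE = "/content/drive/MyDrive/neural-audio-fp"

patched = SAMPLE
for key, sub in (("OUTPUT_ROOT_DIR", "emb/"), ("LOG_ROOT_DIR", "logs/")):
    patched = set_yaml_value(patched, key, f"{GDRIVE}/{sub}")

print(patched)
```

In a notebook, the same helper could be applied to the text read from config/default.yaml and the result written back before launching training.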

About sharing a trained model: yes, I can. The plan is to write a one-page Colab demo that loads it for the next update.
But if you want to try it early, here is the link.

I really welcome feedback from Colab users. I feel this is the way for this open project to go.

mimbres · 2021-07-12T16:16:07Z

I am wondering whether faiss (required for constructing the search engine) can be installed smoothly in Colab. I haven't tried it yet. It is also an important prerequisite for developing the Colab demo. I'll test it out a bit tonight.

Installation of faiss-gpu on colab.

Mihonarium · 2021-07-12T17:02:20Z

I was able to run the training process in Colab with Miniconda, but just installing requirements without Miniconda leads to an error. #12 should fix it.

Restoring from that checkpoint doesn't work for some reason. It outputs a long list of messages like WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer's state 'v' for (root).model.div_enc.split_fc_layers.124.layer_with_weights-0.bias for all the layers, weights, etc., and this warning at the end:

WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details

mimbres · 2021-07-12T17:14:51Z

@Mihonarium Thanks for the report. Yes, it seems we don't need conda for Colab. A plain pip install works smoothly. Installation of faiss-gpu was super smooth too: !pip install faiss-gpu.

About your checkpoint loading issue, let me ask:

Just use the config/640_lamb.yaml in the repo.
Did you specify the config? The command should look like:

python run.py train -c 640_lamb  640_lamb # ignore this line..
!python run.py generate -c 640_lamb 640_lamb 101

BTW, just try the generate command. Continuing training from a checkpoint made on a different type of device is an unusual scenario.
If you send me your notebook, I'll look at it tomorrow.

Mihonarium · 2021-07-12T17:22:51Z

Yes, I did specify the config.

What's even stranger, the issue with a lot of warnings appears only with run.py train and doesn't appear for generate.

The notebook: https://gist.github.com/Mihonarium/e3fd355cb560b82373fd2186139f1bc2 (the last cells show that generate and training from scratch work).

mimbres · 2021-07-12T17:29:03Z

@Mihonarium Oh, it is expected behavior, as I wrote above. The checkpoint file contains the optimizer's state info, which is GPU device dependent. So, if you want to continue training using my checkpoint as the initial parameters, it's possible, but I didn't consider such use. It requires loading the model without connecting the optimizer first (as in generate), then initializing the optimizer and starting training.

mimbres · 2021-07-12T17:55:51Z

@Mihonarium About the training-from-scratch error: first, for a P100 GPU, I recommend

BSZ:
    TR_BATCH_SZ : 320 # Training batch size N must be an EVEN number.
    TR_N_ANCHOR : 160

You didn't get an out-of-memory error, though, and this is not related to your issue.
I am now checking the CPU info of Colab.
In the config, try:

DEVICE:
    CPU_N_WORKERS : 4 # 4 for minimal system. 8 is recommended.
    CPU_MAX_QUEUE : 10 # 10 for minimal system. 20 is recommended.

It depends on how many threads the system can handle.
I will run it tomorrow.
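
One possible way to choose between these two settings from the thread count the runtime reports. The two value pairs (4/10 and 8/20) come from the config comments above; the cut-off of 8 threads and the helper name `suggest_loader_settings` are assumptions made for illustration only.

```python
import os


def suggest_loader_settings(n_threads=None):
    """Return the 'minimal' or 'recommended' pair quoted in the config.

    The threshold of 8 threads is an assumption, not a project rule.
    """
    if n_threads is None:
        # os.cpu_count() may return None on unusual platforms.
        n_threads = os.cpu_count() or 1
    if n_threads >= 8:
        return {"CPU_N_WORKERS": 8, "CPU_MAX_QUEUE": 20}
    return {"CPU_N_WORKERS": 4, "CPU_MAX_QUEUE": 10}


if __name__ == "__main__":
    print("threads reported:", os.cpu_count())
    print("suggested DEVICE settings:", suggest_loader_settings())
```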

Mihonarium · 2021-07-12T18:04:19Z

it is an expected behavior as I wrote it above. The checkpoint file contains optimizer's states info which is GPU device dependent.

Got it, makes sense. Thanks!

Training from scratch didn't give any errors; I interrupted it. I included it to show that the errors come from loading the checkpoint (I didn't know it was the expected behavior) and not from something else. You're right though, I would probably get an out of memory error if I trained for longer. I was actually able to train the model successfully with a batch size of 320.

Mihonarium · 2021-07-12T18:28:26Z

Got unsupported operand type(s) for +: 'PosixPath' and 'str' from line 306 of dataset.py when I tried to generate from a custom source.

mimbres · 2021-07-12T22:27:17Z

@Mihonarium Solved by removing pathlib for the argument input. Also fixed the same issue for the --output option.
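
For reference, a small standalone reproduction of this kind of error and two common ways around it. The directory name and glob pattern are made up for illustration; this is not the actual code from dataset.py.

```python
from pathlib import Path

# Hypothetical custom source directory passed on the command line.
source_root = Path("my_audio")

# Path objects do not support `+` with a string, so this raises TypeError.
message = None
try:
    pattern = source_root + "/**/*.wav"
except TypeError as err:
    message = str(err)
print("error:", message)

# Fix 1: keep the argument as a plain string and concatenate strings.
pattern_str = str(source_root) + "/**/*.wav"

# Fix 2: stay with pathlib and join components using the `/` operator.
pattern_path = source_root / "**" / "*.wav"

print(pattern_str)
print(pattern_path.as_posix())
```

The fix mentioned above corresponds to the first option: the argument is treated as a string from the start.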

TheMightyRaider · 2021-07-28T05:36:29Z

@mimbres @Mihonarium Is it possible for you guys to share the trained model? It's quite hard to train with 320 as the batch size. 🤞

Mihonarium · 2021-07-28T06:43:49Z

@TheMightyRaider the trained model is available here

TheMightyRaider · 2021-07-28T06:45:34Z

Thanks! @Mihonarium

haha010508 · 2022-11-08T02:29:41Z

I used the pretrained model and the same database (Dataset-mini) for the evaluation step, but I got very poor results. I'd like to know why. This is my command and its output:
`
CUDA_VISIBLE_DEVICES=1 python run.py evaluate 640_lamb 101.index -c 640_lamb
cli: Configuration from ./config/640_lamb.yaml
Load 29,500 items from ./logs/emb/640_lamb/101.index/query.mm.
Load 29,500 items from ./logs/emb/640_lamb/101.index/db.mm.
Load 581,922 items from ./logs/emb/640_lamb/101.index/dummy_db.mm.
Creating index: ivfpq
Copy index to GPU.
Training index...
Elapsed time: 23.07 seconds.
581922 items from dummy DB
29500 items from reference DB
Added total 611422 items to DB. 2.25 sec.
Created fake_recon_index, total 611422 items. 0.04 sec.
test_id: icassp, n_test: 2000
========= Top1 hit rate (%) of segment-level search =========
              ---------------- Query length ----------------
segments           1       3       5       9      11      19
seconds         (1s)    (2s)    (3s)    (5s)    (6s)   (10s)

Top1 exact      3.75    5.90    6.45    7.25    7.25    7.80
Top1 near       4.00    6.15    6.70    7.30    7.30    7.80
Top3 exact      4.40    7.00    7.85    8.60    8.45    8.95
Top10 exact     5.40    8.35    9.40   10.90   11.15   10.90

average search + evaluation time 7.25 ms/query
Saved test_ids and raw score to ./logs/emb/640_lamb/101.index/.
`
Do I need to retrain?

Mihonarium closed this as completed Jul 12, 2021

mimbres self-assigned this Jul 12, 2021

mimbres reopened this Jul 12, 2021

mimbres added the enhancement New feature or request label Jul 12, 2021

mimbres added a commit that referenced this issue Jul 12, 2021

solving #10:use string for arugment input, instead of pathlib

ccca0e1

mimbres added a commit that referenced this issue Jul 12, 2021

solving #10:use string for arugment input, instead of pathlib

d809424

mimbres mentioned this issue Jul 13, 2021

New feature for database adaptation #13

Closed

mimbres mentioned this issue Aug 28, 2021

loss to convergence #15

Closed

guillemcortes mentioned this issue Jan 18, 2022

Performance benchmark #8

Closed

mimbres mentioned this issue Jan 19, 2022

Speed of generating fingereprints from custom source #23

Closed

Rodrigo29Almeida pushed a commit to Rodrigo29Almeida/neural-audio-fp that referenced this issue Apr 16, 2024

solving mimbres#10:use string for arugment input, instead of pathlib

886db91

Rodrigo29Almeida pushed a commit to Rodrigo29Almeida/neural-audio-fp that referenced this issue Apr 16, 2024

solving mimbres#10:use string for arugment input, instead of pathlib

b28412f
