
Make training of segnet, unet and classifiers easier by providing a single entry point to all training steps #54

Merged · 19 commits · Feb 17, 2024

Conversation

@liebharc commented Dec 1, 2023

Hi,

This PR should make it easier to retrain segnet, unet and the classifiers. It adds a train.py script as a single entry point for training all models. The goal of this PR is to reproduce the current models without any modifications.
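
Roughly, the entry point looks like this (a minimal sketch; the flag names and the train_model import below are illustrative placeholders, not the PR's exact interface):

```python
# Minimal sketch of a single training entry point. The train_model import
# and the flag names are illustrative assumptions, not the actual interface.
import argparse

from oemer.train import train_model  # assumed location of the shared training routine


def main() -> None:
    parser = argparse.ArgumentParser(description="Train all oemer models from one place")
    parser.add_argument("model", choices=["segnet", "unet", "classifier"])
    parser.add_argument("--dataset", default="ds_dense", help="training dataset")
    parser.add_argument("--epochs", type=int, default=15)
    args = parser.parse_args()
    train_model(args.model, dataset=args.dataset, epochs=args.epochs)


if __name__ == "__main__":
    main()
```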

With the changes, I was able to get classifiers and unet models which work as well as the current models.

segnet fails to give meaningful results. It seems to converge to a local optimum where all resulting predictions are zero, i.e. completely black output. @BreezeWhite, do you recall whether there is anything I missed that might improve the segnet training? I assume that adding regularization or changing parameters might help, but in order to reproduce the current models, it would be good not to experiment too much here.

Assumptions I made:

  • classifier_min is the definition that was used to train the current version of segnet. There is also classifier, but it contains many more categories, and oemer only uses 3 of segnet's outputs, which matches classifier_min (ignoring the background and other categories).
  • I assumed that the Adam optimizer with a warmup learning rate was used and that SigmoidFocalCrossEntropy is the correct loss function (see the sketch after this list). There are other loss functions commented out in the code; I tried them in runs with a few epochs, but they don't seem to change the result much.
  • Unet used CHANNEL_NUM in the commented-out code. Instead, I set a hard-coded out_class of 3, since from what I understood CHANNEL_NUM is used for segnet. Without that change, unet failed to train because the dimensions were wrong.
  • That the default values of the train_model function were used to train segnet and unet
  • ds_dense was used to train segnet; I haven't tried the full ds dataset yet due to its download size
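
Here is a minimal sketch of that assumed optimizer/loss setup (the schedule parameters and the placeholder model are illustrative; SigmoidFocalCrossEntropy comes from tensorflow_addons):

```python
# Sketch of the assumed setup: Adam with a linear warmup learning rate and
# SigmoidFocalCrossEntropy from tensorflow_addons. The model and the schedule
# parameters are placeholders.
import tensorflow as tf
import tensorflow_addons as tfa


class WarmupSchedule(tf.keras.optimizers.schedules.LearningRateSchedule):
    def __init__(self, peak_lr: float = 1e-4, warmup_steps: int = 1000):
        self.peak_lr = peak_lr
        self.warmup_steps = warmup_steps

    def __call__(self, step):
        # Cast to float to avoid the int64 EagerTensor TypeError (see #39).
        step = tf.cast(step, tf.float32)
        # Linearly ramp up to peak_lr, then stay constant.
        return tf.minimum(self.peak_lr * step / self.warmup_steps, self.peak_lr)


model = tf.keras.Sequential([tf.keras.layers.Dense(4, activation="sigmoid")])  # placeholder
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=WarmupSchedule()),
    loss=tfa.losses.SigmoidFocalCrossEntropy(),
)
```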

Fixes:

  • Workaround for the removal of np.float, which caused errors in imaugs
  • Fixed out-of-memory issues by decreasing the queue sizes; training speed seems to be unaffected by this (see the sketch after this list)
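
A sketch of both fixes (the concrete values here are placeholders):

```python
# Sketch of both fixes; the concrete values are placeholders.
import numpy as np

# np.float was removed in numpy 1.24; re-adding the alias keeps imaugs working.
if not hasattr(np, "float"):
    np.float = float  # type: ignore[attr-defined]

# A smaller generator queue keeps fewer pre-fetched batches in memory;
# max_queue_size is a standard Keras fit() argument:
# model.fit(train_generator, epochs=15, max_queue_size=2)
```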

Changes which are not mandatory but appeared useful for me:

  • Also added main.py to make it easier to just run the main function
  • Tried to refactor the dataset definitions by extracting constants for each color. I was looking for an official source; while I'm sure one exists, I just couldn't find it, so I created the map by looking at examples. I double-checked that the refactoring doesn't actually change the definitions in classifier and classifier_min. It's far from perfect, but it feels like it makes it clearer what data exists in the dataset and which part of it is used by oemer.

@BreezeWhite (Owner) commented Dec 20, 2023

Since I am now quite far removed from the initial development and model training, I may not be able to answer all of the questions above correctly.

I assume the failing model you are trying to train is the second UNet model, which I refer to as "model 2" in the Model Prediction section. The most likely reason for the failure is that you fixed it to predict 3 channels, but it should be 4, where the 4th channel corresponds to the "nothing" channel. When training with a cross-entropy loss, such an additional channel is always required for the formula itself to work. Maybe you could adjust it and train again; otherwise I cannot come up with any other possible reasons.
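
A minimal sketch of the idea (the layer sizes are illustrative, not oemer's actual architecture):

```python
# Sketch: predict one channel per semantic class PLUS one "nothing"/background
# channel, so the softmax cross-entropy has a valid target for empty pixels.
# The layer sizes are illustrative, not oemer's actual architecture.
import tensorflow as tf

num_classes = 3              # the symbol classes the model should detect
out_class = num_classes + 1  # +1 background channel required by the loss

inputs = tf.keras.Input(shape=(256, 256, 1))
x = tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
outputs = tf.keras.layers.Conv2D(out_class, 1, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)  # per-pixel class probabilities
```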

For the PR itself, I will review it later when I have time, since the modifications are relatively large.

@liebharc (Author) commented

Thanks! By now, I'm also sure that I mixed up the models. That's something you can see if you compare the generated arch.json files to the ones that are checked in. That alone didn't improve the results yet, but I'm optimistic that fixing it together with what you proposed will give better results.

I'll update the PR and let you know about the results as soon as I find the time for this.

@liebharc (Author) commented

I've now spent some hours diffing the arch.json files and consolidated them so that the arch.json produced by the training matches the ones that are already checked into git. That required changing the network definitions in the code a bit, which came as a surprise to me, but it can easily be verified by taking a look at the arch.json files. I will double-check the results, but in the first runs the models seem to be good now, as they give results similar to the pretrained models.

@BreezeWhite (Owner) commented

Glad to hear that! By the way, I don't see any modifications to the arch.json file though. Did you commit it to git, or are you still working on it?

@liebharc (Author) commented Jan 23, 2024

No, I never committed the training results to git. From what I've experienced, you should always keep the weights and arch.json in sync. And right now, I don't intend to update the weights, as the results with this PR are the same as the ones oemer already has. If I find the time, I might contribute improved weights in the future.

If you want to verify this PR, then I think the easiest way is to run the training for just 1, 2 or 3 epochs. That's not enough to give you good weights, but you can take a look at the predictions to see that they are converging towards the ones oemer uses right now. You can then also diff the generated arch.json files against the current ones.
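
For illustration, the comparison can be as simple as this (both paths are placeholders):

```python
# Sketch: compare a freshly trained arch.json against a checked-in one.
# Both paths below are placeholders.
import json


def normalized(path: str) -> str:
    # Canonicalize the JSON so formatting differences don't matter.
    with open(path) as f:
        return json.dumps(json.load(f), sort_keys=True, indent=2)


trained = normalized("train_output/arch.json")
checked_in = normalized("oemer/checkpoints/unet_big/arch.json")
print("identical" if trained == checked_in else "different")
```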

@BreezeWhite (Owner) commented

That's a bit more work than I expected, and perhaps more than I can afford right now 😂 Maybe you could put it somewhere and show the training results and the performance? Like under your forked repo.

liebharc and others added 2 commits January 31, 2024 08:08
* Fixed typos in comments

* IndexError while scanning for a dot should not abort the whole process

* Bound check while getting the note label

* Added check if label is in the note_type_map

* Filter staffs instead of aborting with an exception

* Bound check during symbol extraction

* Marking notes as invalid instead of aborting with an exception

* Bound check

* Fixed type error

* Fixed TypeError at start of unet or segnet training (BreezeWhite#52)

* Fixed 'TypeError: Cannot convert 4.999899999999999e-07 to EagerTensor of dtype int64' in training, fixes BreezeWhite#39

https://stackoverflow.com/questions/76511182/tensorflow-custom-learning-rate-scheduler-gives-unexpected-eagertensor-type-erro

* --format was deprecated in ruff and replaced with --output-format

* HoughLinesP can return None if no lines are found

* Fixed error which happens if no rest bboxes were found

* Limited try/except block

* Fixed typo

* Pinned the linter dependencies to fixed versions so that results for the same source code level don't differ between test runs due to dependency updates

* Fixed type errors which came up with the recent version of cv2

* Going back to the newest version of ruff and mypy as the type errors were introduced by cv2

@liebharc (Author) commented

I've added weights and predictions from one example to this dummy release: https://github.com/liebharc/oemer/releases/tag/v0.0.1
The weights were trained for just 5 epochs instead of the default 15. That's just because I didn't want to use an older result, so I can be absolutely sure that the weights were produced by the code level of this PR, and not by one of my previous experiments. At the same time, I don't have the capacity right now to run the training for the full 15 epochs.

@BreezeWhite (Owner) commented

The results look awesome! I think it's good to merge this PR when you are ready.

@liebharc (Author) commented Feb 16, 2024

Thank you! The PR is complete, so you can hit the merge button.

@BreezeWhite (Owner) commented

Great! By the way, I'd like to add you to this project with write permission, since you've contributed a lot and have a deep understanding of oemer. Would you be willing to become a maintainer of this project?

@BreezeWhite merged commit 072b69e into BreezeWhite:main on Feb 17, 2024 (1 check passed)
@liebharc (Author) commented

Thanks @BreezeWhite, that's a big compliment! However, I'm not sure if I'll continue working on oemer. If I find more time to work on OMR, I'd likely try to go towards a transformer approach such as https://github.com/NetEase/Polyphonic-TrOMR. I put together a prototype based on TrOMR and the first results looked promising. TrOMR comes without training data and code, so it's not really usable as-is, but it still might be worthwhile to look further into it. The transformer approach could also be combined with oemer's staff detection, though that would likely still mean a complete rewrite of how oemer works right now. I'm happy to share more details if you are interested, but I would assume that this isn't what you envision for the oemer repository?

@BreezeWhite (Owner) commented

@liebharc I see. I still want to promote you to be one of the maintainers of this project, even though you might not have enough time to really maintain it. It's my stubborn way of thanking you ^^
Regarding TrOMR, that looks interesting. I would be very glad to see better performance from combining TrOMR with Oemer. Maybe you could apply techniques from both sides and develop them in a new project. I am willing to become the first stargazer ★

@liebharc deleted the training branch on March 21, 2024 at 20:23
@liebharc (Author) commented

Hello @BreezeWhite, reusing this PR felt like the easiest way to contact you directly. I was able to train TrOMR on an alternative dataset and combine it with a staff detection based on oemer's segnet/unet. The results look quite promising. By now I've also been able to clean up my changes (a little bit), and I released them as https://github.com/liebharc/homr

It also seems to be able to deal with two of the images provided in the issues section of oemer. Would it be alright with you if I reach out to those people? Getting some evaluation and test data would help to better understand the performance and limitations of the transformer approach.

@BreezeWhite (Owner) commented

Hi @liebharc, glad to hear the good news! I would definitely be the first one to try it out~
As for your concern, I am totally fine with it, since this is an open-source world and of course you have the freedom to reach out to anyone you want ^^ Just curious, which issues do you refer to?

@liebharc (Author) commented

Thanks!

since this is an open-source world and of course you have the freedom to reach out to anyone you want ^^

Right, but it's still good to show some respect to fellow contributors, so I thought it best to ask you first :).

The staff detection of homr was written for images taken with phone cameras, so homr and oemer have slightly different use cases and strengths. Since camera pictures can be full of noise, homr's staff detection works differently, and as a result it doesn't seem to get confused by the lyrics in issue 59 and is able to recognize that issue 51 has three voices.

I didn't link the issues, as GitHub would then create a back reference from those issues to this PR, and the original PR has nothing to do with them.

@BreezeWhite (Owner) commented

Thanks for your kindness~

I haven't tried the sheets in 51 and 59, but I've tried homr on several example sheets in oemer/docs/images, and it seems like it has difficulties transcribing multiple melody lines in the same staff 🤔 I haven't read the technical details of TrOMR and don't know whether it can handle such situations. Perhaps this can be adjusted in later processing stages. There are also other bugs. I assume it is still under development and the code will become more robust in the future? If you need any help, I can do some work ^^

Maybe it's too much for this CLOSED PR lol, how about changing the place of discussion? Maybe under homr?

@liebharc (Author) commented

Thanks for the feedback!

Maybe it's too much for this CLOSED PR lol, how about changing the place of discussion? Maybe under homr?

Sure, perhaps just create an issue in homr and we can continue there.

it seems like it has difficulties transcribing multiple melody lines in the same staff 🤔

The TrOMR weights I added to the repo might have issues, as I noticed yesterday. A new training run will take two days or so, and then I'll know more. But I can easily imagine that homr doesn't reach oemer's performance on these examples. The transformer is capable of handling chords and multiple melodies in one staff, and should have learned that from https://sites.google.com/view/multiscore-project/datasets. But in my experience the transformer model is like a die roll in these cases: sometimes you get astonishingly good results and sometimes not so much. There is still plenty of room for experiments :).

If you need any help, I can do some works^^

Always :)

@BreezeWhite (Owner) commented

I see. Just can't wait to see the full power of the fully trained weights ^^
Wish you hit it in one shot!
