
Configs to run the result on table 5.1 #3

Open
Learner23333 opened this issue Sep 20, 2024 · 8 comments

@Learner23333

Thanks for your contribution to this great work!
Could you please provide the settings used to produce the RF-POMO results in Table 5.1? Are they the same as rf-100.yaml and rf-50.yaml in the configs?
I ran run.py as guided in the README, training on 8 × RTX 3090 GPUs for 300 epochs with the uploaded test dataset, but the result I got is much larger than the one in Table 5.1.

@fedebotu (Member) commented Sep 20, 2024

Could you share your current results?

Actually, we found a mistake in the original code and fixed RF-POMO, so the results might not exactly match the currently shared version of the paper.

PS: in around two weeks' time we should share the latest version of RouteFinder with better reproducibility and possibly the model checkpoints!

@Learner23333 (Author)

Thanks for your quick reply. Looking forward to the latest version of RouteFinder!

@fedebotu (Member) commented Oct 2, 2024

Hi @Learner23333! We have released the latest version :)
We added several new features, including checkpoints, BKS solutions to calculate gaps, a testing script, and more. Feel free to let us know if you encounter problems!
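
For context, the gap is computed relative to the best-known solution (BKS); here is a minimal sketch of the usual convention, assuming a minimization objective (the actual testing script may differ in details):

```python
# Minimal sketch of an optimality-gap computation against best-known
# solutions (BKS); assumes a minimization objective.
def gap_percent(cost: float, bks: float) -> float:
    """Relative gap (%) of a solution cost to the best-known solution."""
    return 100.0 * (cost - bks) / bks

# Example: a tour cost of 10.25 against a BKS of 10.03 is a ~2.19% gap.
print(f"{gap_percent(10.25, 10.03):.2f}%")
```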

PS: we will also release the updated version of the paper on arXiv; Table 5.1 will be much improved with new, more meaningful and reproducible results!

@fedebotu (Member)

Follow-up: the latest preprint is now available on arXiv: https://arxiv.org/abs/2406.15007

@hanseul-jeong

Thanks for the great work, @fedebotu!

I couldn't reproduce rf-pomo-50 when I tried to train it from scratch. (I checked that the result of your uploaded checkpoint is the same as the paper's.)

Average gap: 2.14 (paper) -> 2.273 (mine)

I changed only these two config values:
experiment: main/rf/rf-50.yaml
max_epochs: 300

Could you let me know your config?

@fedebotu (Member)

Hi @hanseul-jeong!

I just double-checked, but the configs seem to be correct 🤔

For 50 nodes, we actually ran multiple runs with different seeds (image below), and while the overall trends hold, there is some variance between runs. That variance might explain your value, and re-running with another seed may yield better results!

PS: have you also tried the RF-TE variant?

Please let me know if this helps :)

[image: results of the multiple seeded runs on 50 nodes]

@hanseul-jeong

Thank you for your rapid reply :)
Am I right that the uploaded checkpoint wasn't trained with the "69420" seed written in the config? When I checked, the performance reported in the paper matched the performance of the uploaded checkpoint. Would you be able to share the seeds used to train each baseline (rf-pomo, rf-moe-l, rf-te)?
If that's difficult, would it be acceptable if I train new models using 3 randomly selected seeds and report the average performance?
(I also ran RF-TE, but I haven't checked the result yet.)
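
For example, I would aggregate the per-seed gaps roughly like this (a minimal sketch; the gap values below are placeholders, not actual results):

```python
# Aggregate per-seed test gaps into a mean ± std summary.
# The gap values below are hypothetical placeholders, not real results.
import statistics

gaps = [2.27, 2.19, 2.23]  # hypothetical average gaps from 3 seeded runs
mean, std = statistics.mean(gaps), statistics.stdev(gaps)
print(f"average gap: {mean:.2f} ± {std:.2f}")
```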

@fedebotu (Member)

About seeds: for those 3 runs we used 33,609 / 28,027 / 76,131

For the peculiar choice of numbers, @ngastzepeda may know why these were chosen specifically ;)

However, note that I don't think the runs are perfectly reproducible unless they are run on exactly the same hardware under the same conditions; check out this PyTorch blog.
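
For reference, these are the usual PyTorch determinism knobs (a minimal sketch; even with all of them set, runs can still differ across GPU models and CUDA/driver versions):

```python
# Standard PyTorch reproducibility settings; bitwise-identical runs are
# still only guaranteed on identical hardware and software stacks.
import os
import random

import numpy as np
import torch

def seed_everything(seed: int) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Required by cuBLAS for deterministic GEMMs on CUDA >= 10.2;
    # should be set before the first CUDA operation.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
    torch.use_deterministic_algorithms(True, warn_only=True)
    torch.backends.cudnn.benchmark = False

seed_everything(33609)  # e.g. one of the seeds mentioned above
```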

> If that's difficult, would it be acceptable if I train new models using 3 randomly selected seeds and report the average performance?

Of course! 😁
