About #3
@rockywind Did you use the provided config to train your model?
@Xianpeng919 Yes, I used the default config.
@rockywind I'll double-check and get back to you ASAP.
Hello, I trained the model with the given command. The result at epoch 200 is attached, and the training log can be seen here.
@rockywind We have tested our released checkpoints on multiple GPUs. The result is 26.33 | 19.03 | 16.00, the same as in the README. I'm not sure what the problem is here. You might provide me with your log so that I can help you check the details.
@Cc-Hy You may replace the training split with the trainval split in the config.
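For reference, a minimal sketch of what that change could look like in an mmcv-style dataset config (the file names below are assumptions; use whatever your data-preparation step actually produced):

```python
# Hypothetical fragment of the dataset config: point the training set at the
# trainval annotations instead of the train split. Paths are placeholders.
data = dict(
    train=dict(
        ann_file='data/kitti/kitti_infos_trainval.pkl',  # was .../kitti_infos_train.pkl
    ),
)
```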
@kaixinbear Your dimension branch exploded during training. We did observe this in our experiments; the dimension-aware loss is a little unstable. You can restart your training from the un-exploded checkpoints.
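In mmcv-style configs, restarting from an earlier, still-healthy checkpoint is usually a matter of pointing `resume_from` at it; a sketch, with a placeholder path:

```python
# Resume training (including optimizer state and epoch counter) from a checkpoint
# saved before the dimension branch exploded. The path is hypothetical.
resume_from = 'work_dirs/monocon/epoch_90.pth'
```

Note that `load_from` would load only the weights and reset the optimizer state, which is usually not what you want when resuming.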
Thanks for your kind reply! I will try it later.
Hello author, I resumed my training from the un-exploded checkpoints, but it still explodes in the following epochs. Have you met this phenomenon? Should I turn down my learning rate?
@Xianpeng919 Hi, have you tried multi-GPU training, or do you still use single-GPU training? I retrained with 4 GPUs and got lower results than the README.
@ganyz You can restart the training from scratch.
@rockywind I double-checked your log and the config looks good to me. I'll double-check the code. You can also try training again with another random seed to see how the performance changes.
@rockywind @ganyz @kaixinbear @Xianpeng919
I conducted 3 experiments with different seeds, and the best performance is 17.80. Besides, results are not reproducible with the same seed and deterministic==True in the codebase.
I retrained twice and got 16.20 on a GTX 1080 Ti and 16.80 on a Titan V. It seems that no one in this issue can get above 18.00 when retraining, which makes me frustrated... =_=!
It's normal. Mono3D performance is always unstable. Just pay attention to the eval results of the last few checkpoints. 0.0
@excitohe I know that Mono3D performance is always unstable, but results are reproducible with the same seed and deterministic==True in the Monodle codebase. I'm just wondering why nondeterministic behavior appears in the mmdet reimplementation.
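For anyone comparing runs, a hedged sketch of the kind of seeding/determinism setup being discussed; mmdet-based repos usually do something equivalent inside their train script, so the helper below is illustrative rather than the repo's actual code:

```python
import random

import numpy as np
import torch


def set_seed(seed: int, deterministic: bool = True) -> None:
    """Seed Python, NumPy and PyTorch; optionally force deterministic cuDNN kernels."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    if deterministic:
        # Even with these flags, some CUDA ops stay nondeterministic,
        # which can explain run-to-run variation despite a fixed seed.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False


set_seed(1903919922)  # example seed value mentioned later in this thread
```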
@Cc-Hy You can refer to mmdet3d's visualization scripts. Their scripts are very helpful.
@Cc-Hy You can run inference with your model first and then revise the show_results function in mmdet3d.core.visualizer.
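A rough sketch of that workflow using mmdet3d's own demo helpers (this assumes the mmdet3d 0.x/1.x API of `init_model`, `inference_mono_3d_detector`, and `show_result_meshlab`; the config, checkpoint, and demo-data paths are placeholders):

```python
from mmdet3d.apis import (init_model, inference_mono_3d_detector,
                          show_result_meshlab)

config_file = 'configs/monocon/monocon_dla34_kitti.py'  # hypothetical config path
checkpoint_file = 'work_dirs/monocon/latest.pth'        # hypothetical checkpoint
model = init_model(config_file, checkpoint_file, device='cuda:0')

# Monocular inference needs the image plus an info/annotation file carrying the
# camera intrinsics.
result, data = inference_mono_3d_detector(
    model, 'demo/data/kitti/000008.png', 'demo/data/kitti/000008_mono3d.json')

# Project the predicted 3D boxes onto the image and write them to out_dir.
# For finer control, customize the helpers in mmdet3d.core.visualizer instead.
show_result_meshlab(data, result, out_dir='vis/', score_thr=0.3, task='mono-det')
```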
@Xianpeng919 Have you finished your retraining yet? Looking forward to your training log file. ^_^
Hi, I migrated MonoCon into the latest mmdet3d in the plugin_dir manner and tried again with only_car using your latest updated config on 4 GPUs.
The training log is attached: can you see where the problem is? Thank you so much, and keep in touch. ^_^ I will reconfigure your original environment and test again with a single GPU...
@Cc-Hy Hi, is this your recent result with the only_car config? It looks like we're about the same...
No, these are 3-class results.
Could you please tell me how to solve this model-collapse problem? By turning down the learning rate or changing the random seed?
If you keep hitting this problem, you can modify the dimension loss to use the L1 loss only, L = |gt - pred|.
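A minimal sketch of that fallback, i.e. replacing the dimension term with a plain L1 penalty (the function name and tensors are illustrative, not the repo's actual API):

```python
import torch
import torch.nn.functional as F


def dim_l1_loss(pred_dims: torch.Tensor, gt_dims: torch.Tensor) -> torch.Tensor:
    """Plain L1 loss on the predicted 3D dimensions, L = |gt - pred|,
    without the dimension-aware weighting that can blow up."""
    return F.l1_loss(pred_dims, gt_dims, reduction='mean')
```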
@Xianpeng919 Hi, I got an AP of 19.0217 by setting `cfg.SEED = 1903919922`.
Hello, I tried to train the model, but after 120 epochs the performance is a lot worse than yours.
The modification is that I used a larger learning rate, 0.001, compared to your original 0.000225.
So first I want to ask why the learning rate you chose is so small (the network learning rates I usually encounter are around 0.003 to 0.001). Do you use pre-training, so that this is effectively fine-tuning?
I would also like some ideas about the results I got; I think the learning rate alone would not cause such a large gap.
I will retrain later with your original learning rate.
Thanks a lot.
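For context, the base learning rate discussed above normally lives in the optimizer section of the training config; a hedged sketch of that fragment (0.000225 is the value from this thread, while the optimizer type is an assumption rather than something confirmed from the repo):

```python
# Hypothetical mmcv-style optimizer fragment, shown only to indicate where the
# small base learning rate is configured. AdamW is an assumption.
optimizer = dict(type='AdamW', lr=0.000225)
```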