Hi, do you have a plan to release the segmentation code? I reproduced MambaOut-Tiny on ImageNet-1K and got 82.6% top-1 accuracy, which seems reasonable given the randomness and the different batch size (1024 rather than 4096). However, when I transfer the pre-trained model to semantic segmentation, it only reaches 46.1 mIoU on ADE20K, much lower than the 47.4 reported in the paper. I use the Swin config based on the MMSegmentation codebase, with a drop path rate of 0.2 and no layer-wise learning rate decay.
Hi @ydhongHIT, many thanks for your attention. I am currently busy and will arrange the release of the segmentation code as soon as possible. MambaOut also uses the Swin config. Note that an LN should be added to the backbone outputs at each stage. For all model sizes, the learning rate is 1e-4. The drop path rate is 0.3 for Tiny, 0.3 or 0.4 for Small, and 0.6 for Base.
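For reference, here is a minimal sketch of the "LN on each stage output" idea in PyTorch, wired up for an MMSegmentation-style decode head. The `MambaOutSegBackbone` wrapper name, the assumed (B, H, W, C) stage outputs, and the Tiny stage widths are illustrative assumptions, not the official implementation:

```python
import torch
import torch.nn as nn


class MambaOutSegBackbone(nn.Module):
    """Wrap a classification backbone so each stage output is passed through a
    LayerNorm before going to the segmentation decode head.

    Assumptions (hypothetical, pending the official segmentation release):
    - `backbone(x)` returns one (B, H, W, C) feature map per stage;
    - `stage_dims` are the MambaOut-Tiny stage widths.
    """

    def __init__(self, backbone, stage_dims=(96, 192, 384, 576)):
        super().__init__()
        self.backbone = backbone
        # One LayerNorm per stage, applied over the channel dimension.
        self.norms = nn.ModuleList(nn.LayerNorm(d) for d in stage_dims)

    def forward(self, x):
        feats = self.backbone(x)
        outs = []
        for feat, norm in zip(feats, self.norms):
            feat = norm(feat)                                   # LN over channels
            outs.append(feat.permute(0, 3, 1, 2).contiguous())  # (B, C, H, W) for the decode head
        return tuple(outs)
```

Training would then use AdamW with the learning rate (1e-4) and per-size drop path rates mentioned above, in place of the values from the stock Swin config.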
Thank you for the reply. I will train the model again following your instructions. By the way, do you employ a layer-wise decaying learning rate like ConvNeXt?