
Update PSPNet and ICNet #81

Merged (13 commits into meetps:master, Apr 20, 2018)
Conversation

adam9500370 (Contributor) commented on Apr 18, 2018

Modifications:

  1. In cityscapes_loader.py, add args for the mean version and img_norm; the input to scipy.misc.imresize needs to be uint8 in RGB mode (the original pretrained model uses the PASCAL mean image, and images are not normalized to [0, 1]).
  2. Fix the wrong block count n_blocks for bottleNeckIdentityPSP in the residualBlockPSP function (n_blocks -> n_blocks-1).
  3. Add auxiliary training layers for training PSPNet.
  4. Modify tile_predict with a flip arg and add support for batched tensor input.
  5. Add PSPNet support for training and testing (extra args: img_norm, eval_flip, measure_time).

Validation results on the Cityscapes validation set (mIoU/pAcc):
Without flip: 78.65/96.31
With flip: 78.80/96.34
I feed images to the network at input size 1025x2049, single scale (since the model input size must be odd, e.g. 713x713).
Running on a single GTX 1080 Ti, inference runs at about 1.2~1.3 fps.
python validate.py --model_path checkpoints/pspnet_101_cityscapes.pth --dataset cityscapes --img_rows 1025 --img_cols 2049 --no-img_norm --eval_flip --measure_time --batch_size 2 --split val
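
For reference, a minimal sketch of what the --eval_flip style averaging can look like (illustrative only; it assumes model returns raw class scores of shape (N, C, H, W) and img is a CUDA tensor, and is not the exact tile_predict code in this PR):

import torch

def predict_with_flip(model, img):
    # average the scores of the image and its horizontal mirror
    out = model(img)
    out_flip = model(torch.flip(img, dims=[3]))          # flip along the width axis
    out = (out + torch.flip(out_flip, dims=[3])) / 2.0   # un-flip before averaging
    return out.argmax(dim=1)                             # per-pixel class labels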

Pretrained models in pytorch:
pspnet_50_ade20k.pth
pspnet_101_cityscapes.pth
pspnet_101_pascalvoc.pth

adam9500370 (Contributor, Author) commented on Apr 18, 2018

Thank you for your nice implementation.
I think the PSPNet weights loaded from the caffemodel are OK.
By the way, I have also implemented a near real-time (avg. ~23 fps) ICNet based on this PSPNet. The following are results on the Cityscapes validation set (without flip) (mIoU/pAcc):
ICNet_train_30k: 65.86/94.21
ICNet_trainval_90k: 79.18/96.31
Do you want to merge my ICNet implementation into the current repo?

meetps (Owner) commented on Apr 20, 2018

@adam9500370 - Thanks for the PR!

Could you update the PR to include ICNet? It'd be great to have it in the suite as well.

Review comment on ptsemseg/loss.py (outdated):

for i in range(nt):
    lbl_resized[i, :, :] = m.imresize(lbl[i, :, :], (h, w), 'nearest', mode='F')
lbl_resized = lbl_resized.astype(int)
target = Variable(torch.from_numpy(lbl_resized).long().cuda())
meetps (Owner) commented:
Perhaps it would be faster if we used torch.nn.functional.upsample_nearest instead of scipy imresize.
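
For illustration, a possible GPU-side replacement along those lines (a sketch only; it assumes lbl is already a LongTensor of shape (N, H, W) and uses F.interpolate, which supersedes upsample_nearest in newer PyTorch versions):

import torch
import torch.nn.functional as F

def resize_labels(lbl, h, w):
    # nearest-neighbour resizing keeps label ids intact (no blending of class values)
    lbl = lbl.unsqueeze(1).float()                        # (N, 1, H, W): add a channel dim
    lbl = F.interpolate(lbl, size=(h, w), mode='nearest')
    return lbl.squeeze(1).long()                          # back to (N, h, w) integer labels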

adam9500370 (Contributor, Author) commented:
I will rewrite this part and update ICNet.
I will also create an auxiliary loss function for PSPNet and ICNet, and have the model outputs depend on training vs. eval mode, to keep the train.py, validate.py, and test.py scripts clean.
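
A toy sketch of that idea (a hypothetical module, not the actual PSPNet/ICNet code in this PR): the auxiliary output is only returned in training mode, so the evaluation scripts always receive a single prediction map.

import torch
import torch.nn as nn

class TinyAuxNet(nn.Module):
    """Stand-in for a model with a main head and an auxiliary head."""
    def __init__(self, n_classes=19):
        super(TinyAuxNet, self).__init__()
        self.backbone = nn.Conv2d(3, 16, 3, padding=1)
        self.main_head = nn.Conv2d(16, n_classes, 1)
        self.aux_head = nn.Conv2d(16, n_classes, 1)

    def forward(self, x):
        feats = self.backbone(x)
        out = self.main_head(feats)
        if self.training:                        # toggled by model.train() / model.eval()
            return out, self.aux_head(feats)     # train.py can weight both losses
        return out                               # validate.py / test.py see one output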

Review comment (diff context):

data_loader = get_loader(args.dataset)
data_path = get_data_path(args.dataset)
- loader = data_loader(data_path, is_transform=True)
+ loader = data_loader(data_path, is_transform=True, img_norm=args.img_norm)
meetps (Owner) commented:
This will break the test script for all other dataloaders, which don't have the img_norm argument.

adam9500370 (Contributor, Author) commented:
Since I only added this arg to the Cityscapes dataloader, I will add it to the other dataloaders as well.
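
One way to avoid breaking existing call sites is to give the new argument a default value; a sketch with an illustrative class name (not the repo's actual loader):

import numpy as np

class ExampleLoader(object):
    def __init__(self, root, is_transform=False, img_norm=True):
        # img_norm defaults to True, so data_loader(data_path, is_transform=True)
        # keeps working for scripts that never pass it
        self.root = root
        self.is_transform = is_transform
        self.img_norm = img_norm

    def transform_img(self, img):
        img = img.astype(np.float64)
        if self.img_norm:
            img /= 255.0          # scale to [0, 1] only when requested
        return img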

meetps (Owner) commented on Apr 20, 2018

Looks good to me; only some very minor changes are needed in the dataloaders to incorporate img_norm.

Looking forward to the ICNet implementation! I will merge ASAP.

adam9500370 (Contributor, Author) commented on Apr 20, 2018

Update:

  1. Add the img_norm arg to all dataloaders and change the order of operations around scipy.misc.imresize.
  2. Add an auxiliary loss function (multi_scale_cross_entropy2d; see the sketch below).
  3. Add the loss and make the PSPNet model outputs depend on training vs. eval mode.
  4. Clean up the train.py, validate.py, and test.py scripts.
  5. Add ICNet support.
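
A condensed sketch of what such a multi-scale / auxiliary loss can look like (a minimal version assuming the model returns a tuple of score maps, e.g. (main output, auxiliary output); not necessarily identical to the function added in this PR):

import torch
import torch.nn.functional as F

def multi_scale_cross_entropy2d(inputs, target, scale_weight=(1.0, 0.4)):
    # inputs: iterable of (N, C, h_i, w_i) score maps; target: (N, H, W) LongTensor of class ids
    loss = 0.0
    for weight, inp in zip(scale_weight, inputs):
        inp = F.interpolate(inp, size=target.shape[1:], mode='bilinear', align_corners=True)
        loss = loss + weight * F.cross_entropy(inp, target)
    return loss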

Pretrained models:
ICNet_train_30k
ICNet_train_30k_bn
ICNet_trainval_90k
ICNet_trainval_90k_bn

meetps merged commit 730b5af into meetps:master on Apr 20, 2018
nichalin commented on May 5, 2018

Hi adam9500370, thanks for sharing. Could you please also share how you trained those PSPNet and ICNet models?

adam9500370 (Contributor, Author) commented:
Hi @nichalin,
I didn't actually train those models myself.
The validation results above were obtained by converting the original pretrained Caffe models to PyTorch models.

nichalin commented on May 6, 2018

Got it. Thanks, @adam9500370.

adam9500370 mentioned this pull request on Jun 27, 2018
adam9500370 changed the title from "Update PSPNet" to "Update PSPNet and ICNet" on Jun 27, 2018
lxtGH commented on Aug 11, 2018

Hi @nichalin! Did you use crop prediction to get 78.80 mIoU / 96.34?
I used the provided model but only got 77.32 mIoU. I think a crop test may give better results.

lxtGH commented on Aug 11, 2018

OK, I found the reason: since I use the latest PyTorch, I need to set align_corners=True to reproduce your result. I will do the crop test.
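
For context, PyTorch 0.4+ defaults align_corners to False for bilinear upsampling, while the Caffe-style interpolation these converted weights were trained with corresponds to align_corners=True; a tiny illustration:

import torch
import torch.nn.functional as F

scores = torch.randn(1, 19, 129, 257)   # dummy (N, C, h, w) score map
# explicit align_corners=True matches the older / Caffe interpolation behaviour
up = F.interpolate(scores, size=(1025, 2049), mode='bilinear', align_corners=True)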

erichhhhho commented on May 23, 2019

@adam9500370 @meetshah1995 @lxtGH
Hi guys, when doing inference with the pretrained PSPNet model pspnet_101_cityscapes.pth on the Cityscapes dataset (with input image size 1025x2049, as @adam9500370 mentioned), I ran into the following problem:

  • RuntimeError: CUDA out of memory. Tried to allocate 130.00 MiB (GPU 0; 10.91 GiB total capacity; 10.12 GiB already allocated; 115.38 MiB free; 85.29 MiB cached)

May I ask how much GPU memory is needed for full-resolution inference? Thanks.

lxtGH commented on May 23, 2019

@erichhhhho I guess you should run inference with code like this:

with torch.no_grad():
    out = model(img)
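
For context, a slightly fuller version of that pattern (assuming model and img as above): model.eval() fixes batch-norm/dropout behaviour, and torch.no_grad() stops activations from being stored for backprop, which is what usually exhausts memory during inference.

model.eval()                      # inference mode for batch norm / dropout
with torch.no_grad():             # do not keep activations for a backward pass
    out = model(img)
    pred = out.argmax(dim=1)      # per-pixel class ids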

erichhhhho replied:

Thank you!!

erichhhhho commented on Jul 1, 2019

@lxtGH @adam9500370 @meetshah1995
Hi, I think the Cityscapes val-set mIoU for PSPNet reported here (78.65) can only be obtained at 713x713. However, the original implementation reports 79.70 at 1024x2048 resolution. And here, if the inference size is 1024x2048, the mIoU is 0.6759.

I think the performance gap is mainly due to the inference method; it is related to the sliding prediction scheme discussed here.

Has anyone tried this before?
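
For reference, a sliding (cropped) prediction scheme usually looks roughly like this (a simplified sketch with overlapping crops and averaged softmax scores; not the exact evaluation code behind the reported numbers):

import torch

def sliding_predict(model, img, crop_size=713, stride=476, n_classes=19):
    # img: (1, 3, H, W) tensor; overlapping crop predictions are averaged
    _, _, H, W = img.shape
    scores = torch.zeros(1, n_classes, H, W, device=img.device)
    counts = torch.zeros(1, 1, H, W, device=img.device)
    ys = list(range(0, max(H - crop_size, 0) + 1, stride))
    xs = list(range(0, max(W - crop_size, 0) + 1, stride))
    if H > crop_size and ys[-1] != H - crop_size:
        ys.append(H - crop_size)                 # make the last crop reach the bottom border
    if W > crop_size and xs[-1] != W - crop_size:
        xs.append(W - crop_size)                 # make the last crop reach the right border
    with torch.no_grad():
        for y in ys:
            for x in xs:
                crop = img[:, :, y:y + crop_size, x:x + crop_size]
                out = torch.softmax(model(crop), dim=1)
                scores[:, :, y:y + crop_size, x:x + crop_size] += out
                counts[:, :, y:y + crop_size, x:x + crop_size] += 1
    return (scores / counts).argmax(dim=1)       # (1, H, W) class ids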
