
What is the performance compared with the original implementation? #2

Open
John1231983 opened this issue Jun 19, 2018 · 4 comments

@John1231983

Great implementation. Could you provide reproduced results that can be compared with the original Caffe2 implementation? Thanks

@John1231983
Author

John1231983 commented Jun 19, 2018

I think the first conv should be Conv2d. Am I right?
The corrected version would look like:

    self.spatial_conv = nn.Conv2d(in_channels, intermed_channels, kernel_size=3,
                                  stride=1, padding=1, bias=bias)
    self.bn = nn.BatchNorm2d(intermed_channels)
    self.relu = nn.ReLU()
    self.temporal_conv = nn.Conv3d(intermed_channels, out_channels, temporal_kernel_size,
                                   stride=temporal_stride, padding=temporal_padding, bias=bias)

@yechanp

yechanp commented Oct 13, 2019

I think it is okay.
It should be kept as Conv3d, but it effectively behaves like a Conv2d because one of the kernel dimensions is 1.
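
To make that concrete, here is a minimal sketch (the channel counts and tensor shapes are made-up examples, not taken from this repo) showing that a Conv3d whose temporal kernel size is 1 reproduces a Conv2d applied frame by frame:

    import torch
    import torch.nn as nn

    # A Conv3d with kernel_size=(1, 3, 3) mixes nothing across time,
    # so frame by frame it computes the same thing as a Conv2d.
    conv3d = nn.Conv3d(16, 32, kernel_size=(1, 3, 3), padding=(0, 1, 1), bias=False)
    conv2d = nn.Conv2d(16, 32, kernel_size=3, padding=1, bias=False)
    with torch.no_grad():
        conv2d.weight.copy_(conv3d.weight.squeeze(2))  # drop the size-1 time dim

    x = torch.randn(2, 16, 8, 24, 24)  # (N, C, T, H, W)
    out3d = conv3d(x)
    # Fold time into the batch dimension and apply the 2D conv per frame.
    out2d = conv2d(x.transpose(1, 2).reshape(-1, 16, 24, 24))
    out2d = out2d.reshape(2, 8, 32, 24, 24).transpose(1, 2)
    print(torch.allclose(out3d, out2d, atol=1e-5))  # True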

@JinXiaozhao

    self.conv3 = SpatioTemporalResLayer(64, 128, 3, layer_sizes[1], block_type=block_type, downsample=True)

Why downsample=True? The input has 64 channels and the output has 128; I can't understand this. Can you help me?
Thanks! @irhum
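
In the standard ResNet convention (which this layer appears to follow; the sketch below uses illustrative names, not this repo's exact code), downsample=True does two things: the first convolution uses stride 2 to halve the spatio-temporal resolution, and the shortcut gets a strided 1x1x1 projection so the 64-channel input can be added to the 128-channel output.

    import torch
    import torch.nn as nn

    # Hedged sketch of a ResNet-style residual block; names and details
    # are illustrative, not necessarily identical to this repo's code.
    class ResBlockSketch(nn.Module):
        def __init__(self, in_channels, out_channels, downsample=False):
            super().__init__()
            stride = 2 if downsample else 1
            self.conv1 = nn.Conv3d(in_channels, out_channels, 3, stride=stride, padding=1)
            self.conv2 = nn.Conv3d(out_channels, out_channels, 3, padding=1)
            self.relu = nn.ReLU()
            # The projection exists precisely because 64 != 128: without
            # it, the residual addition below would fail.
            self.shortcut = (
                nn.Conv3d(in_channels, out_channels, 1, stride=stride)
                if downsample else nn.Identity()
            )

        def forward(self, x):
            out = self.conv2(self.relu(self.conv1(x)))
            return self.relu(out + self.shortcut(x))

    x = torch.randn(1, 64, 8, 28, 28)  # (N, C, T, H, W)
    print(ResBlockSketch(64, 128, downsample=True)(x).shape)  # (1, 128, 4, 14, 14)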

@Litou1

Litou1 commented Dec 6, 2019

My finding is that R(2+1)D is actually slower than C3D with fp16.
With fp32, R(2+1)D is faster.

PyTorch 1.3
CUDA 10.2
cuDNN 7.6.5

I think the newer cuDNN is quite efficient at performing 3D convolutions on fp16 inputs.
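
A rough way to check this on your own hardware is to time a single Conv3d in both precisions; the layer and input shapes below are arbitrary assumptions, not the models discussed in this thread:

    import torch
    import torch.nn as nn

    # Time one Conv3d forward pass in a given dtype.
    def time_conv(dtype):
        conv = nn.Conv3d(64, 64, kernel_size=3, padding=1).cuda().to(dtype)
        x = torch.randn(8, 64, 16, 56, 56, device="cuda", dtype=dtype)
        for _ in range(5):  # warm-up so cuDNN selects its algorithm
            conv(x)
        torch.cuda.synchronize()
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        for _ in range(20):
            conv(x)
        end.record()
        torch.cuda.synchronize()
        return start.elapsed_time(end) / 20  # ms per forward pass

    for dtype in (torch.float32, torch.float16):
        print(dtype, f"{time_conv(dtype):.2f} ms")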
