GPU version support? #14
I think you should be able to. The model itself is pretty simple, so you should be able to load the pre-trained weights without the last layer, but that requires some manual work with forking this repo.
VGGish strictly extracts features every 0.96 seconds, but my image features are extracted every 1 second. Do you have a good way to align the features? I look forward to your suggestions.
You should be able to just crop the 1-second audio to 0.96 seconds.
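A minimal sketch of that cropping step, assuming 16 kHz mono audio (the `crop_to_096` helper and the sample rate are my own assumptions, not part of this repo):

```python
import numpy as np

def crop_to_096(waveform, sample_rate=16000):
    """Keep only the first 0.96 s of every 1-second chunk of audio.

    `waveform` is a 1-D sample array; any trailing partial second is
    dropped. Returns an array of shape (n_seconds, 0.96 * sample_rate).
    """
    keep = int(0.96 * sample_rate)  # 15360 samples at 16 kHz
    n_seconds = len(waveform) // sample_rate
    chunks = waveform[: n_seconds * sample_rate].reshape(n_seconds, sample_rate)
    return chunks[:, :keep]

# Example: 3.5 s of audio at 16 kHz -> 3 cropped chunks of 15360 samples.
audio = np.random.randn(int(3.5 * 16000)).astype(np.float32)
print(crop_to_096(audio).shape)  # (3, 15360)
```

Each cropped chunk can then be fed to VGGish independently, giving one 128-d embedding per video second.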
I'm sorry, I may have described the problem poorly. For example, my video is half an hour long. I select one frame per second and extract image features with ResNet-18, while the audio features come from VGGish. The image features have dimension [30 * 60, 512], but the audio features have dimension [30 * 60 / 0.96, 128]. I want to align the features along the time dimension. What should I do?
I found that a 4-second video does not have this problem, because [4, 512] and [floor(4 / 0.96), 128] == [4, 128] have the same length in time. Any suggestion is welcome, thanks very much.
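One common way to handle the mismatch above is to resample the audio features onto the video's 1-second timeline with linear interpolation. This sketch (the `align_audio_to_video` helper is hypothetical, not from this repo) assumes audio features with a 0.96 s hop and one video frame per second:

```python
import numpy as np

def align_audio_to_video(audio_feats, n_video_frames):
    """Linearly interpolate audio features (hop = 0.96 s) onto a
    1-frame-per-second video timeline.

    audio_feats: (T_audio, D) array. Returns (n_video_frames, D).
    """
    t_audio = np.arange(audio_feats.shape[0]) * 0.96  # audio timestamps (s)
    t_video = np.arange(n_video_frames) * 1.0         # video timestamps (s)
    # np.interp works on 1-D arrays, so interpolate each feature dim.
    aligned = np.stack(
        [np.interp(t_video, t_audio, audio_feats[:, d])
         for d in range(audio_feats.shape[1])],
        axis=1,
    )
    return aligned

# 30 minutes: ~1875 audio frames vs. 1800 video frames.
audio = np.random.randn(int(1800 / 0.96), 128)
print(align_audio_to_video(audio, 1800).shape)  # (1800, 128)
```

Cropping each second to 0.96 s before running VGGish, as suggested above, achieves the same alignment without interpolation.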
@leemengxing this repo is just a port of VGGish to PyTorch. I suggest you ask this question on https://groups.google.com/forum/#!forum/audioset-users - you're more likely to get a useful response from those guys 😄 I'm not really sure how to help with that particular problem other than cropping the audio per second to 0.96 s, as @stevenguh suggested. As the GPU support has been resolved upthread, I'm closing this issue now. Thanks.
I know this is closed, but when I try to send the model to CUDA, I use:

```python
def forward(self, x, fs=None):
    if self.preprocess:
        x = self._preprocess(x, fs)
    # start added code
    if next(self.parameters()).is_cuda:
        x = x.cuda()
    # end added code
    x = VGG.forward(self, x)
    if self.postprocess:
        x = self._postprocess(x)
    return x
```

It's not the most elegant solution, but I am just checking whether the model weights are on CUDA and, if so, moving the input data there too. From my tests so far it seems to work, but please let me know if there is something wrong with this.
@botkevin nothing wrong with that if it works! However, I have realised that the offending line is torchvggish/torchvggish/vggish.py line 148 in e1e2273.
Sending the model to the GPU works fine, but PyTorch will complain. That said, the speedup is not dramatic because most of the time is spent in pre-processing: for a 2-second audio clip that I tested on CPU, 70 milliseconds went to pre-processing the audio file into an array of spectrogram patches, and 20 milliseconds went to inference itself.
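The pre-processing vs. inference split above can be measured with a simple timer. This sketch uses stand-in functions with hypothetical shapes (96x64 log-mel patches, as VGGish uses); the `timed` helper and both stage functions are illustrative assumptions, not the repo's code:

```python
import time
import numpy as np

def timed(fn, *args):
    """Return (result, elapsed_ms) for a single call."""
    t0 = time.perf_counter()
    out = fn(*args)
    return out, (time.perf_counter() - t0) * 1000.0

def preprocess(waveform):
    # Stand-in for log-mel spectrogram patching: two 96x64 patches.
    return np.tile(waveform[: 96 * 64].reshape(1, 96, 64), (2, 1, 1))

def infer(patches):
    # Stand-in for the network's forward pass.
    return patches.mean(axis=(1, 2))

wave = np.random.randn(32000).astype(np.float32)  # ~2 s at 16 kHz
patches, pre_ms = timed(preprocess, wave)
embed, inf_ms = timed(infer, patches)
print(f"preprocess: {pre_ms:.2f} ms, inference: {inf_ms:.2f} ms")
```

With real VGGish pre-processing, `pre_ms` dominating `inf_ms` on short clips is consistent with the numbers reported above.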
Hi, you can check my configuration based on v0.1 at https://github.com/nhattruongpham/torchvggish-gpu. It worked for me because I converted the PCA params tensor to CUDA. Good luck!
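One robust way to avoid the PCA-params device mismatch is to register those tensors as module buffers, so `model.to('cuda')` moves them together with the weights. A hypothetical sketch (this `Postprocessor` class and its shapes are illustrative, not torchvggish's actual code):

```python
import torch
import torch.nn as nn

class Postprocessor(nn.Module):
    """Store PCA matrix and means as buffers so .to()/.cuda() relocates
    them automatically along with the model's parameters."""

    def __init__(self, dim=128):
        super().__init__()
        # register_buffer (rather than plain attributes) is what lets
        # .to(device) and state_dict() track these tensors.
        self.register_buffer("pca_matrix", torch.eye(dim))
        self.register_buffer("pca_means", torch.zeros(dim, 1))

    def forward(self, embeddings):
        # (dim, dim) @ (dim, batch) -> (dim, batch) -> (batch, dim)
        return torch.mm(self.pca_matrix, embeddings.t() - self.pca_means).t()

post = Postprocessor()
out = post(torch.randn(5, 128))
print(out.shape)  # torch.Size([5, 128])
```

After `post.to('cuda')`, both buffers live on the GPU, so no manual `.cuda()` call on the PCA tensors is needed.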
Thank you for your work. Could you add a GPU support option in torch.hub? Another question: does the embedding size have to be fixed at 128, or is there a way to convert it to 2048 dimensions?
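VGGish's final layer fixes the embedding at 128 dimensions, so getting 2048-d features means either tapping an earlier activation or learning a projection on top. A minimal sketch of the projection-head option (the `project` layer is my own addition, not part of the model):

```python
import torch
import torch.nn as nn

# Hypothetical projection head on top of the 128-d VGGish embedding.
# Note this layer is randomly initialised; it would need training
# (or at least a task-specific fine-tune) to produce useful features.
project = nn.Linear(128, 2048)

embedding = torch.randn(10, 128)  # stand-in for a batch of VGGish outputs
features = project(embedding)
print(features.shape)  # torch.Size([10, 2048])
```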