
During C-ViViT pretraining, how do you set fps of MiT dataset? #6

Open
9B8DY6 opened this issue Nov 1, 2022 · 1 comment

Comments

@9B8DY6

9B8DY6 commented Nov 1, 2022

In the Phenaki paper, they pretrain Phenaki on the MiT dataset downsampled from 25 fps to 6 fps. So when transforming the video into .jpg frames with cv2.VideoCapture(videofile), should the fps be re-set to 6 fps?

import cv2

cap = cv2.VideoCapture(videofile)  # open the video file (0 would open a camera)
cap.set(cv2.CAP_PROP_FPS, 6)       # cv2.CAP_PROP_FPS replaces the old cv2.cv.CV_CAP_PROP_FPS

like code above?

Then, when training the transformer, should the input videos also be 6 fps, the same as the C-ViViT input videos?

Since you are reproducing Phenaki, I wonder how you preprocess the video dataset to pretrain C-ViViT.
Thank you.

@dome272
Collaborator

dome272 commented Nov 2, 2022

Hey there,
you can see that here. I'm basically using Python's built-in step when slicing, like videos[start_frame:end_frame:step]. But this dataloader is not very efficient, as it loads the entire video and then cuts it. torchvision.io.read_video has arguments (start_pts/end_pts) that let you load only a certain part of the video. We still have to improve that, so keep that in mind.
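The step-slicing described above can be sketched in a few lines. This is a minimal illustration, not the repo's actual dataloader: `subsample_frames` and the fps values are hypothetical, and `frames` stands in for any sliceable sequence of decoded frames (a list of arrays, a video tensor, etc.).

```python
def subsample_frames(frames, source_fps=25, target_fps=6):
    """Keep every Nth frame so the effective rate approaches target_fps.

    A sketch of the videos[start_frame:end_frame:step] slicing mentioned
    above; for 25 fps -> 6 fps the step works out to 4.
    """
    step = max(1, round(source_fps / target_fps))  # round(25 / 6) == 4
    return frames[::step]

frames = list(range(100))        # stand-in for 100 decoded frames at 25 fps
clip = subsample_frames(frames)  # every 4th frame -> 25 frames kept
```

Note that the slicing only approximates the target fps (25 / 4 = 6.25 fps here); for exact resampling you would need to re-encode the video, e.g. with ffmpeg.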
