In the Phenaki paper, they pretrain Phenaki on the MiT dataset downsampled from 25 fps to 6 fps. So when extracting frames as .jpg files with cv2.VideoCapture(videofile), should the fps be re-set to 6 fps?
import cv2

cap = cv2.VideoCapture(videofile)
cap.set(cv2.CAP_PROP_FPS, 6)  # cv2.cv.CV_CAP_PROP_FPS is the removed OpenCV 1.x name
Like the code above?
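As far as I can tell, though, cap.set(cv2.CAP_PROP_FPS, 6) has no effect when decoding from a file, so maybe the frames have to be skipped manually instead. A minimal sketch of what I mean, keeping roughly every 25/6th frame (the path and output naming are just placeholders):

import cv2

cap = cv2.VideoCapture("video.mp4")   # placeholder path
src_fps = cap.get(cv2.CAP_PROP_FPS)   # e.g. 25.0 for MiT
target_fps = 6.0
step = src_fps / target_fps           # keep one frame per ~4.17 source frames

idx, next_keep, saved = 0, 0.0, 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if idx >= next_keep:              # accumulate fractional steps
        cv2.imwrite(f"frame_{saved:05d}.jpg", frame)
        saved += 1
        next_keep += step
    idx += 1
cap.release()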
Then, when training the transformer, should the input videos also be 6 fps, the same as the C-ViViT input videos?
Since you are reproducing Phenaki, I wonder how you preprocess the video dataset to pretrain C-ViViT.
Thank you.
Hey there,
you can see that here. I'm basically using the built-in Python step slicing when indexing, like videos[start_frame:end_frame:step]. But this dataloader is not very efficient, since it loads the entire video and then cuts it. torchvision.io.read_video has start_pts/end_pts arguments that let you load only a certain part of the video. We still have to improve that, so keep that in mind.
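As a rough sketch of that (the path, frame range, and target fps here are just illustrative, not what the repo ships):

from torchvision.io import read_video

# Load the whole clip as a (T, H, W, C) uint8 tensor, then subsample via
# plain Python step slicing, as in the current dataloader.
video, _, info = read_video("clip.mp4", pts_unit="sec")
step = round(info["video_fps"] / 6)   # 25 fps -> keep every ~4th frame (~6 fps)
clip = video[0:video.shape[0]:step]   # videos[start_frame:end_frame:step]

# read_video also accepts start_pts/end_pts, so only part of the file
# needs to be decoded instead of cutting after a full load:
segment, _, _ = read_video("clip.mp4", start_pts=0.0, end_pts=2.0, pts_unit="sec")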