DMVC: Decomposed Motion Modeling for Learned Video Compression

Pytorch implementation of the paper "DMVC: Decpomsed Motion Modeling for Learned Video Compression". T-CSVT 2022.

Requirements

Python==3.7
PyTorch==1.11

About

This repository defines a learned video compression framework with decomposed motion modeling for low delay coding scenario. The past reconstructions from DPB are used to capture the intrinsic motion at the first place. The intrinsic motion is conveyed along the temporal axis and takes part in the initial temporal transition. Subsequently, the compensatory motion is unsupervisedly learned conditioned on the approximated feature, so as to refine the temporal transition. The core temporal transition is occurred in the feature space. The left pixel-level residue between $\bar{x}_t$ and $x_t$ is signaled to derive the reconstruction, which is enhanced by the in-loop filtering module at the last step.

Data Preparation

Training Dataset

Vimeo-90k dataset

Test Dataset

This method focuses on the P-frame compression. In terms of I frames, we apply CompressAI to compress them. The test datasets include:

HEVC common test sequences
UVG dataset (1080p/8bit/YUV)
MCLJCV dataset (1080p/8bit/YUV)

Basically, the test sequences are cropped. After that, both the width and height are the multiplier of 64. Subsequently, we split them into consecutive pictures by ffmpeg. Taking UVG as example, the data process is shown as follows.

Crop Videos from 1920x1080 to 1920x1024.

ffmpeg -pix_fmt yuv420p  -s 1920x1080 -i ./videos/xxxx.yuv -vf crop=1920:1024:0:0 ./videos_crop/xxxx.yuv

Convert YUV files to images.

ffmpg -s 1920x1024 -pix_fmt yuv420p -i ./videos_crop/xxxx.yuv ./images_crop/xxxx/im%3d.png

Evaluation

We respectively train four differnt models for PSNR metric, where $\lambda$ equals to 256, 512, 1024 and 2048. As for MS-SSIM metric, we set $\lambda$ as 8, 16, 32 and 64. Our pretrained models are provided on Google Drive.

python eval.py --eval_lambda 256 --metric mse --intra_model cheng2020_anchor --test_class ClassD --gop_size 10 --pretrain ./checkpoints/dmvc_psnr_256.model

python eval.py --eval_lambda 8 --metric ms-ssim --intra_model cheng2020_anchor --test_class ClassD --gop_size 10 --pretrain ./checkpoints/dmvc_msssim_8.model

Citation

If you find this paper useful, please cite:

@article{lin2022dmvc,
  title={DMVC: Decomposed Motion Modeling for Learned Video Compression},
  author={Lin, Kai and Jia, Chuanmin and Zhang, Xinfeng and Wang, Shanshe and Ma, Siwei and Gao, Wen},
  journal={IEEE Transactions on Circuits and Systems for Video Technology},
  year={2022},
  publisher={IEEE}
}

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
assets		assets
dataload		dataload
models		models
utils		utils
LICENSE		LICENSE
README.md		README.md
__init__.py		__init__.py
eval.py		eval.py
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DMVC: Decomposed Motion Modeling for Learned Video Compression

Requirements

About

Data Preparation

Training Dataset

Test Dataset

Evaluation

Citation

Related links

About

Releases

Packages

Languages

License

Linkeyboard/DMVC

Folders and files

Latest commit

History

Repository files navigation

DMVC: Decomposed Motion Modeling for Learned Video Compression

Requirements

About

Data Preparation

Training Dataset

Test Dataset

Evaluation

Citation

Related links

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages