IPT Description

This repository is an official implementation of the paper "Pre-Trained Image Processing Transformer" from CVPR 2021.

We study low-level computer vision tasks (e.g., denoising, super-resolution, and deraining) and develop a new pre-trained model, the image processing transformer (IPT). To make full use of the transformer's capacity, we use the well-known ImageNet benchmark to generate a large number of corrupted image pairs. The IPT model is trained on these images with multiple heads and multiple tails. In addition, contrastive learning is introduced so that the model adapts well to different image processing tasks. The pre-trained model can therefore be employed efficiently on the desired task after fine-tuning. With only one pre-trained model, IPT outperforms the current state-of-the-art methods on various low-level benchmarks.

If you find our work useful in your research or publication, please cite our work: [1] Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, and Wen Gao. "Pre-trained image processing transformer". CVPR 2021. [arXiv]

@inproceedings{chen2020pre,
  title={Pre-trained image processing transformer},
  author={Chen, Hanting and Wang, Yunhe and Guo, Tianyu and Xu, Chang and Deng, Yiping and Liu, Zhenhua and Ma, Siwei and Xu, Chunjing and Xu, Chao and Gao, Wen},
  booktitle={CVPR},
  year={2021}
}

Model architecture

The overall network architecture of IPT is shown below.

[Figure: overall network architecture of IPT]

Dataset

The benchmark datasets can be downloaded as follows:

For super-resolution:

Set5, Set14, B100, Urban100.

For denoising:

CBSD68.

For deraining:

Rain100L.

The result images are converted into YCbCr color space. The PSNR is evaluated on the Y channel only.
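The evaluation rule above can be sketched in a few lines. This is a minimal illustration, not the repository's evaluation code: it assumes 8-bit RGB inputs and the ITU-R BT.601 luma coefficients commonly used for this conversion, and the helper names `rgb_to_y` and `psnr_y` are hypothetical. The repository's exact coefficients, rounding, and any border cropping may differ.

```python
import math


def rgb_to_y(pixel):
    """Map an 8-bit (R, G, B) tuple to the BT.601 luma value Y in [16, 235]."""
    r, g, b = pixel
    # Assumed coefficients (ITU-R BT.601, studio range); not taken from the repository.
    return 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0


def psnr_y(result, reference, max_value=255.0):
    """PSNR in dB on the Y channel; images are rows of (R, G, B) tuples."""
    squared_error = 0.0
    count = 0
    for result_row, reference_row in zip(result, reference):
        for result_px, reference_px in zip(result_row, reference_row):
            diff = rgb_to_y(result_px) - rgb_to_y(reference_px)
            squared_error += diff * diff
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 1x2 images, only to show the call shape.
reference = [[(0, 0, 0), (255, 255, 255)]]
result = [[(0, 0, 0), (250, 250, 250)]]
print(psnr_y(result, reference))
```

Plain Python lists keep the sketch dependency-free. For real images an array library would be the usual choice.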

Environmental requirements

Hardware (Ascend)

Prepare a hardware environment with Ascend processors.

Framework

MindSpore>=1.9

For more information, please check the resources below:

MindSpore Tutorials

MindSpore Python API

Performance

Training Performance

Training speed.

In MindSpore, dynamic shapes incur additional cost, especially for two images that differ greatly in size.

Platform	Device type	Device	Training time per step (seconds)
IPT-MS-Ascend	Ascend	Ascend-910A	0.159 (batch size: 32)

Evaluation Performance

The results on all tasks are listed below.

Super-resolution results (PSNR, dB):

Scale	Set5	Set14	B100	Urban100
×2	38.33	34.49	32.46	33.74
×3	34.86	30.85	29.38	29.50
×4	32.71	29.03	27.84	27.24

Denoising results (PSNR, dB):

Noise level	CBSD68	Urban100
30	32.35	33.99
50	29.93	31.49

Deraining results (PSNR, dB):

Task	Rain100L
Deraining	42.08

Inference speed on the super-resolution (×4) task:

Platform	Device type	Device	Inference time for each image in the Set5 dataset (seconds)
IPT-Torch	GPU	V100	1.33, 0.69, 0.59, 0.69, 1.09
IPT-MS (MindSpore 1.7)	GPU	V100	39.29, 13.45, 0.69, 0.71, 14.01
IPT-MS-Ascend (MindSpore 1.9, Graph Mode)	Ascend	Ascend-910A	419.53, 254.83, 9.49, 22.68, 275.21
IPT-MS-Ascend (MindSpore 1.9, PyNative Mode)	Ascend	Ascend-910A	635.53, 195.69, 14.94, 15.07, 271.17
IPT-MS-Ascend (MindSpore 1.9, Graph Mode, warmup 50 exps)	Ascend	Ascend-910A	2.74, 2.68, 2.47, 2.38, 2.56
IPT-MS-Ascend (MindSpore 1.9, PyNative Mode, warmup 50 exps)	Ascend	Ascend-910A	4.43, 4.10, 4.44, 4.03, 4.11
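The gap between the rows with and without warmup suggests that the first calls pay a one-time cost, such as graph compilation for new input shapes. A measurement of this kind could be sketched as follows. This is not the script that produced the table: it reads "warmup 50 exps" as 50 untimed forward passes, which is an assumption, and `time_with_warmup` and `run_once` are hypothetical names. `run_once` is expected to block until the output is ready, for example by copying it to host memory.

```python
import time


def time_with_warmup(run_once, inputs, warmup_runs=50):
    """Return per-input wall-clock seconds measured after untimed warmup passes."""
    # Warmup: cycle through the inputs so one-time costs are paid before timing starts.
    for i in range(warmup_runs):
        run_once(inputs[i % len(inputs)])

    timings = []
    for item in inputs:
        start = time.perf_counter()
        run_once(item)
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in workload; replace with a real, blocking model call.
def fake_model(x):
    return sum(v * v for v in range(x))


print(time_with_warmup(fake_model, [1000, 2000, 3000], warmup_runs=5))
```

Cycling through the actual evaluation inputs during warmup matters when input shapes vary, because a shape that was never seen during warmup may still trigger the one-time cost in the timed pass.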
