waifu2x
Waifu2x is a well-known image super-resolution neural network for anime-style art.
Link:
Includes all known publicly available waifu2x models:
- anime_style_art: requires pre-scaled input for the scale2.0x variant
  - noise1 noise2 noise3 scale2.0x
- anime_style_art_rgb: requires pre-scaled input for the scale2.0x variant
  - noise0 noise1 noise2 noise3 scale2.0x
- photo: requires pre-scaled input for the scale2.0x variant
  - noise0 noise1 noise2 noise3 scale2.0x
- ukbench: requires pre-scaled input
  - scale2.0x
- upconv_7_anime_style_art_rgb
  - scale2.0x noise0_scale2.0x noise1_scale2.0x noise2_scale2.0x noise3_scale2.0x
- upconv_7_photo
  - scale2.0x noise0_scale2.0x noise1_scale2.0x noise2_scale2.0x noise3_scale2.0x
- cunet: tile size (block_w and block_h) must be a multiple of 4
  - noise0 noise1 noise2 noise3
  - scale2.0x
  - noise0_scale2.0x noise1_scale2.0x noise2_scale2.0x noise3_scale2.0x
- upresnet10
  - scale2.0x
  - noise0_scale2.0x noise1_scale2.0x noise2_scale2.0x noise3_scale2.0x
To simplify usage, we provide a Python wrapper module vsmlrt that exposes the full functionality of waifu2x-caffe through a more Pythonic interface:
```python
import vapoursynth as vs
from vapoursynth import core

from vsmlrt import Waifu2x, Waifu2xModel, Backend

src = core.std.BlankClip(format=vs.RGBS)

# backend could be:
# - CPU: Backend.OV_CPU(): the recommended CPU backend; generally faster than ORT-CPU.
# - CPU: Backend.ORT_CPU(num_streams=1, verbosity=2): vs-ort CPU backend.
# - GPU: Backend.ORT_CUDA(device_id=0, cudnn_benchmark=True, num_streams=1, verbosity=2)
#   - use device_id to select the device
#   - set cudnn_benchmark=False to reduce script reload latency when debugging,
#     at a slight cost in throughput
flt = Waifu2x(src, noise=-1, scale=2, model=Waifu2xModel.upconv_7_anime_style_art_rgb,
              backend=Backend.ORT_CUDA())
```
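The benchmarks below also exercise the TensorRT backend with fp16 and multiple streams. A minimal sketch, assuming your vsmlrt version provides a Backend.TRT with fp16 and num_streams parameters (check your installed vsmlrt.py):

```python
# Assumed variant: TensorRT backend in fp16 with two parallel streams.
flt_trt = Waifu2x(src, noise=-1, scale=2,
                  model=Waifu2xModel.upconv_7_anime_style_art_rgb,
                  backend=Backend.TRT(fp16=True, num_streams=2))
```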
This section is mostly for reference, since the suggested way is to use vsmlrt.py.
```python
import vapoursynth as vs
from vapoursynth import core

src = core.std.BlankClip(width=1920, height=1080, format=vs.RGBS)
flt = core.ov.Model(src, "upconv_7_anime_style_art_rgb_scale2.0x.onnx")
```
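The same model can be run through the other runtimes by swapping the plugin call. A sketch assuming the vs-ort plugin's core.ort.Model mirrors the core.ov.Model interface and takes a provider argument selecting the execution device:

```python
# Assumed interface: core.ort.Model(clip, network_path, provider=...)
# provider="CUDA" runs on the GPU, provider="CPU" on the CPU.
flt_ort = core.ort.Model(src, "upconv_7_anime_style_art_rgb_scale2.0x.onnx",
                         provider="CUDA")
```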
The anime_style_art, anime_style_art_rgb, photo and ukbench models do not include built-in upscaling, so you need to upscale the input 2x with Catmull-Rom (bicubic with b=0, c=0.5) before feeding it to the model:
```python
import vapoursynth as vs
from vapoursynth import core

src = core.std.BlankClip(width=1920, height=1080, format=vs.RGBS)
# fmtconv's bicubic with a1=0, a2=0.5 is the Catmull-Rom kernel
flt = core.ov.Model(src.fmtc.resample(scale=2, kernel="bicubic", a1=0, a2=0.5),
                    "anime_style_art_rgb_scale2.0x.onnx")
```
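If fmtconv is not available, the same Catmull-Rom pre-scaling can be done with VapourSynth's built-in resizer; a sketch of the equivalent call:

```python
# Catmull-Rom is bicubic with b=0, c=0.5; resize.Bicubic maps these to
# filter_param_a (b) and filter_param_b (c).
upscaled = core.resize.Bicubic(src, width=src.width * 2, height=src.height * 2,
                               filter_param_a=0, filter_param_b=0.5)
flt = core.ov.Model(upscaled, "anime_style_art_rgb_scale2.0x.onnx")
```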
- cunet networks work best when the tile size (block_w / block_h) is in the range 60-150 and a multiple of 4; a helper that picks such a size is sketched below.
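For illustration, a small helper (hypothetical, not part of vsmlrt) that picks a valid cunet tile size by rounding down to a multiple of 4 and clamping to that range:

```python
def cunet_tile_size(target: int) -> int:
    """Round `target` down to a multiple of 4 and clamp to [60, 148]."""
    return max(60, min(148, target - target % 4))

# Example: pass the result as block_w / block_h
# (or as vsmlrt's tilesize parameter, if your version exposes it).
print(cunet_tile_size(135))  # -> 132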
Software: VapourSynth R57, Windows Server 2019, graphics driver 511.23. Input size: 1920x1080.
- [1] vs-mlrt v6
- [2] VapourSynth-Waifu2x-caffe r14

Numbers are throughput (fps) / device memory usage (MB).
Model | [1] ort-cuda fp32 | [1] trt fp32 | [1] trt fp32 (2 streams) | [2] caffe fp32 (540p patch)
---|---|---|---|---
upconv7 | 5.98 / 5065 | 6.60 / 5033 | 8.43 / 9253 | 1.63 / 3248
upresnet10 | 4.36 / 5061 | N/A | N/A | 1.54 / 7232
cunet | 2.58 / 9155 | N/A | N/A | 1.11 / 11657
Model | [1] ort-cuda fp16 | [1] trt fp16 | [1] trt fp16 (2 streams)
---|---|---|---
upconv7 | 10.4 / 5189 | 13.8 / 3041 | 26.2 / 5253
upresnet10 | 6.43 / 5059 | N/A | N/A
cunet | 4.10 / 9535 | N/A | N/A