Implementation of NVIDIA's generative SPADE network in PyTorch, based on the original paper by Park et al.
GauGAN is a generative network built around a special normalization method, SPADE (spatially-adaptive normalization). The generator takes as input a segmentation mask together with either a corresponding real image or a noise vector in a high-dimensional space, and synthesizes a photorealistic image as output. Depending on the dataset used during training, it is capable of multi-modal image synthesis, meaning it can generate images in various styles from the same input segmentation mask. In my implementation I trained the network on parts of the COCO-Stuff dataset for about 160 epochs on my home computer, which limits the rendering capabilities of the network. Besides the network, I also created an interactive app hosted on Streamlit, where you can draw your own doodles and turn them into semi-photorealistic images.
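At its core, SPADE replaces the learned affine parameters of batch normalization with per-pixel scale and bias maps predicted from the segmentation mask, so the spatial layout information survives normalization. As a rough sketch of the idea (a minimal illustration based on the paper, not the code in this repository; names like `norm_channels` and `hidden` are placeholders):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPADE(nn.Module):
    """Spatially-adaptive normalization (Park et al., 2019), sketched.

    The activation is normalized with parameter-free batch norm, then
    modulated by per-pixel scale (gamma) and bias (beta) maps predicted
    from the segmentation mask.
    """

    def __init__(self, norm_channels: int, label_channels: int, hidden: int = 128):
        super().__init__()
        self.param_free_norm = nn.BatchNorm2d(norm_channels, affine=False)
        # Shared embedding of the (resized) segmentation map.
        self.shared = nn.Sequential(
            nn.Conv2d(label_channels, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.gamma = nn.Conv2d(hidden, norm_channels, kernel_size=3, padding=1)
        self.beta = nn.Conv2d(hidden, norm_channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor, segmap: torch.Tensor) -> torch.Tensor:
        normalized = self.param_free_norm(x)
        # Resize the one-hot mask to the spatial size of the activation.
        segmap = F.interpolate(segmap, size=x.shape[2:], mode="nearest")
        h = self.shared(segmap)
        return normalized * (1 + self.gamma(h)) + self.beta(h)
```

In the paper, every residual block of the generator applies this modulation, which is why the mask is fed in again at each scale of the network.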
(See demo.mov for a video demonstration of the app.)
You can test the demo yourself; note that image generation is quite slow without GPU acceleration. A sketch of how the canvas is wired up follows the list below.
- Select one of the many materials in the panel to the left.
- Draw your doodles.
- You can move objects by selecting 'transform' as the drawing tool.
- Images are generated automatically; this can be toggled on/off with the 'Update image in realtime' checkbox.
- The style image is intended for multi-modal synthesis but does not work all that well with the COCO-Stuff dataset.
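For reference, a minimal sketch of wiring a drawable canvas to a generator with streamlit-drawable-canvas; the material picker, the label mapping, and the generator call are illustrative placeholders, not this repository's actual code:

```python
import streamlit as st
from streamlit_drawable_canvas import st_canvas

# Hypothetical helper (not shown): would wrap the trained generator,
# mapping a label mask to a synthesized image.
# def generate_image(mask): ...

stroke_color = st.sidebar.color_picker("Material color", "#00FF00")
drawing_mode = st.sidebar.selectbox("Drawing tool", ["freedraw", "transform"])
realtime = st.sidebar.checkbox("Update image in realtime", value=True)

canvas = st_canvas(
    stroke_width=25,
    stroke_color=stroke_color,
    background_color="#000000",
    update_streamlit=realtime,  # re-run the app on every canvas change
    height=256,
    width=256,
    drawing_mode=drawing_mode,  # "transform" lets you move drawn objects
    key="canvas",
)

if canvas.image_data is not None:
    # canvas.image_data is an RGBA numpy array of the doodle; map its
    # colors to label indices and feed the mask to the generator here.
    st.image(canvas.image_data)
```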
Demo: https://engbergandreas-gaugan.streamlit.app/
- Semantic Image Synthesis with Spatially-Adaptive Normalization: https://arxiv.org/pdf/1903.07291.pdf
- Official implementation: https://github.com/NVlabs/SPADE
- COCO-Stuff dataset: https://github.com/nightrome/cocostuff
- Streamlit drawable canvas: https://github.com/andfanilo/streamlit-drawable-canvas
- Read full report here: https://github.com/engbergandreas/GauGAN/blob/main/Interactive_AI_canvas_TNM095.pdf