Style Transfer (Neural Style)

A TensorFlow implementation of style transfer (neural style) described in the papers:

The implementation follows the paper in both variable names and algorithms, so a reader of the paper can understand the code without much effort.

Usage

Prerequisites

  1. TensorFlow
  2. Python packages: numpy, scipy, PIL (or Pillow), matplotlib
  3. Pretrained VGG19 file: imagenet-vgg-verydeep-19.mat

      * Please download the file from the link above.
      * Save the file under pre_trained_model.
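
As a quick check before running, the expected location of the pretrained file can be verified with a few lines of Python. This is only a sketch: `find_vgg19_weights` is a hypothetical helper that is not part of this repository, and it assumes the file keeps its original name inside the model directory.

```python
import os

# File name and default directory are taken from the prerequisites above.
VGG19_FILENAME = "imagenet-vgg-verydeep-19.mat"


def find_vgg19_weights(model_path="pre_trained_model"):
    """Return the path to the VGG19 .mat file, or raise if it is missing.

    Hypothetical helper for illustration; not part of this repository.
    """
    candidate = os.path.join(model_path, VGG19_FILENAME)
    if not os.path.isfile(candidate):
        raise FileNotFoundError(
            "Expected %s; download the pretrained VGG19 file and save it there."
            % candidate
        )
    return candidate
```

Calling `find_vgg19_weights()` from the repository root either returns the path or fails early with a message naming the missing file.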

Running

python run_main.py --content <content file> --style <style file> --output <output file>

Example: python run_main.py --content images/tubingen.jpg --style images/starry-night.jpg --output result.jpg

Arguments

Required :

  • --content: Filename of the content image. Default: images/tubingen.jpg
  • --style: Filename of the style image. Default: images/starry-night.jpg
  • --output: Filename of the output image. Default: result.jpg

Optional :

  • --model_path: Relative or absolute path to the directory containing the pretrained model. Default: pre_trained_model
  • --loss_ratio: Weight of content-loss relative to style-loss. Alpha over beta in the paper. Default: 1e-3
  • --content_layers: Space-separated VGG-19 layer names used for content loss computation. Default: conv4_2
  • --style_layers: Space-separated VGG-19 layer names used for style loss computation. Default: relu1_1 relu2_1 relu3_1 relu4_1 relu5_1
  • --content_layer_weights: Space-separated weights, one per content layer, used in the content loss. Default: 1.0
  • --style_layer_weights: Space-separated weights, one per style layer, used in the style loss. Default: 0.2 0.2 0.2 0.2 0.2
  • --max_size: Maximum width or height of the input images. Default: 512
  • --num_iter: The number of iterations to run. Default: 1000
  • --initial_type: The initial image for optimization (denoted x in the paper). Choices: content, style, random. Default: content
  • --content_loss_norm_type: Different types of normalization for content loss. Choices: 1, 2, 3. Default: 3
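
The options listed above map naturally onto an `argparse` parser. The following is a minimal sketch built only from the flags and defaults documented here; the actual parser in run_main.py may be organized differently, and `build_parser` and `check_args` are hypothetical names.

```python
import argparse


def build_parser():
    """Sketch of a parser for the documented flags (not the repository's code)."""
    p = argparse.ArgumentParser(description="Style transfer (neural style)")
    p.add_argument("--content", type=str, default="images/tubingen.jpg")
    p.add_argument("--style", type=str, default="images/starry-night.jpg")
    p.add_argument("--output", type=str, default="result.jpg")
    p.add_argument("--model_path", type=str, default="pre_trained_model")
    p.add_argument("--loss_ratio", type=float, default=1e-3)
    # nargs="+" accepts space-separated values, e.g. --style_layers relu1_1 relu2_1
    p.add_argument("--content_layers", nargs="+", type=str, default=["conv4_2"])
    p.add_argument("--style_layers", nargs="+", type=str,
                   default=["relu1_1", "relu2_1", "relu3_1", "relu4_1", "relu5_1"])
    p.add_argument("--content_layer_weights", nargs="+", type=float, default=[1.0])
    p.add_argument("--style_layer_weights", nargs="+", type=float,
                   default=[0.2, 0.2, 0.2, 0.2, 0.2])
    p.add_argument("--max_size", type=int, default=512)
    p.add_argument("--num_iter", type=int, default=1000)
    p.add_argument("--initial_type", type=str, default="content",
                   choices=["content", "style", "random"])
    p.add_argument("--content_loss_norm_type", type=int, default=3,
                   choices=[1, 2, 3])
    return p


def check_args(args):
    """Hypothetical sanity check: each layer list needs one weight per layer."""
    if len(args.content_layers) != len(args.content_layer_weights):
        raise ValueError("content_layers and content_layer_weights differ in length")
    if len(args.style_layers) != len(args.style_layer_weights):
        raise ValueError("style_layers and style_layer_weights differ in length")
    return args
```

When overriding --style_layers, pass the same number of values to --style_layer_weights, since each weight is paired with one layer.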

Sample results

The Neckarfront in Tübingen, Germany

Results were obtained with the default settings.
Rendering an image took approximately 4 minutes on a GTX 980 Ti.

The Gyeongbokgung Palace in Seoul, South Korea

Results were obtained with the default settings, except for --max_size 1200.
Rendering an image took approximately 19.5 minutes on a GTX 980 Ti.

References

The implementation is based on the following projects:

  • This is a tutorial version. The code is well commented, and some exercises are provided to check what you have learned.
  • This is a simple and well-written implementation, but some parts, such as the optimizer, do not match the paper.
  • There are other implementations related to style transfer, such as video style transfer and color-preserving style transfer.

I went through these implementations and found several differences among them.

  1. Style image shape: implementations differ in how they resize the style image.
           In this implementation, the style image is resized to the shape of the content image.

  2. Optimizer: gradient descent, Adam, or L-BFGS.
           In this implementation, only L-BFGS is provided.

  3. Scale factor of loss: the scale factors for content loss and style loss differ between implementations.
           In this implementation, style loss is implemented as in the paper.
           For content loss, there are 3 choices.
           * Choice 1: as in A Neural Algorithm of Artistic Style
           * Choice 2: as in Artistic style transfer for videos
           * Choice 3: as in https://github.com/cysmith/neural-style-tf

  4. Total variation denoising: implementation details for total variation denoising differ slightly.
           In this implementation, total variation denoising is not provided since the paper does not use it.
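
To make the scale factors in item 3 concrete, here is a NumPy sketch of the per-layer losses. It is not the TensorFlow code from this repository. The style loss follows the Gram-matrix formulation in A Neural Algorithm of Artistic Style. The content-loss factors for choices 2 and 3 are assumptions based on my reading of the cited sources and should be checked against the code. Fixing beta to 1 so that alpha equals --loss_ratio is also an assumption.

```python
import numpy as np


def gram_matrix(F):
    """F has shape (N, M): N feature maps, each flattened to M = height * width."""
    return F @ F.T


def style_layer_loss(F, S):
    """Per-layer style loss: sum((G - A)^2) / (4 * N^2 * M^2)."""
    N, M = F.shape
    G, A = gram_matrix(F), gram_matrix(S)
    return np.sum((G - A) ** 2) / (4.0 * N ** 2 * M ** 2)


def content_layer_loss(F, P, norm_type=3):
    """Squared error between features F and P, scaled by the chosen factor."""
    N, M = F.shape
    sq_err = np.sum((F - P) ** 2)
    if norm_type == 1:   # as in A Neural Algorithm of Artistic Style
        return sq_err / 2.0
    if norm_type == 2:   # assumed factor for Artistic style transfer for videos
        return sq_err / (N * M)
    if norm_type == 3:   # assumed factor for cysmith/neural-style-tf
        return sq_err / (2.0 * np.sqrt(N) * np.sqrt(M))
    raise ValueError("norm_type must be 1, 2, or 3")


def total_loss(content_losses, content_weights, style_losses, style_weights,
               loss_ratio=1e-3):
    """alpha * L_content + beta * L_style, with alpha / beta = loss_ratio.

    beta is fixed to 1 here (an assumption), so alpha equals loss_ratio.
    """
    L_content = sum(w * l for w, l in zip(content_weights, content_losses))
    L_style = sum(w * l for w, l in zip(style_weights, style_losses))
    alpha, beta = loss_ratio, 1.0
    return alpha * L_content + beta * L_style
```

The three content-loss choices differ only by a constant factor per layer, so switching between them mainly changes the effective balance between content and style for a given --loss_ratio.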

Acknowledgements

This implementation has been tested with TensorFlow r0.12 on Windows 10 and Ubuntu 14.04.