Instance Normalization: The Missing Ingredient for Fast Stylization

Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky

2017

Problem

Framing

Feed-forward stylization networks run in real time, but their output still trailed the optimization-based transfer of Gatys et al., and quality degraded with larger training sets or longer training. The paper narrows this gap with one architectural change: replace batch normalization with instance normalization and keep it active at test time.

Currently Used Methods

Direct antecedents

Proposed Method

Architecture

The paper keeps the feed-forward generator and swaps every batch-normalization layer for instance normalization. It tests the change in both the earlier Ulyanov generator and a reproduced Johnson residual generator, with normalization still applied at inference.
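The swap and its inference-time behaviour can be sketched in a few lines. This is a toy, single-channel illustration and not the authors' implementation: `ToyBatchNorm`, `ToyInstanceNorm`, `swap_norm`, and the value of `EPS` are hypothetical names and choices made for the example, images are flat lists of floats, and learned scale and shift parameters are left out. Standard batch normalization falls back on stored running statistics at inference, whereas the instance-normalized layer recomputes statistics from each input image in both modes.

```python
import math

EPS = 1e-5  # small stability constant (assumed value, not taken from the paper)


def mean_var(values):
    """Mean and biased variance of a flat list of numbers."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def standardize(values, mean, var):
    """Subtract the mean and divide by the standard deviation."""
    return [(v - mean) / math.sqrt(var + EPS) for v in values]


class ToyBatchNorm:
    """Batch statistics while training, stored running statistics at inference."""

    def __init__(self, momentum=0.1):
        self.momentum = momentum
        self.running_mean = 0.0
        self.running_var = 1.0

    def __call__(self, batch, training):
        if training:
            # Pool every pixel of every image in the batch.
            mean, var = mean_var([v for image in batch for v in image])
            m = self.momentum
            self.running_mean = (1 - m) * self.running_mean + m * mean
            self.running_var = (1 - m) * self.running_var + m * var
        else:
            mean, var = self.running_mean, self.running_var
        return [standardize(image, mean, var) for image in batch]


class ToyInstanceNorm:
    """Each image is normalized by its own statistics, in training and at inference."""

    def __call__(self, batch, training):
        return [standardize(image, *mean_var(image)) for image in batch]


def swap_norm(layers):
    """Replace every batch-norm layer in a list of layers with instance norm."""
    return [ToyInstanceNorm() if isinstance(layer, ToyBatchNorm) else layer
            for layer in layers]


# Hypothetical two-layer stack; after the swap, inference still normalizes per image.
layers = swap_norm([ToyBatchNorm(), ToyBatchNorm()])
test_batch = [[0.0, 0.1, 0.2, 0.3], [10.0, 20.0, 30.0, 40.0]]
outputs = layers[0](test_batch, training=False)
```

In this sketch the `training` flag has no effect on `ToyInstanceNorm`, which is the point of keeping the normalization active at test time: a dark image and a bright image are each centred and rescaled by their own statistics.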

Qualitative comparison figure: top row shows content, style, and Gatys transfer; bottom row compares zero padding, improved padding, and zero padding plus instance normalization, where instance normalization suppresses border artifacts.

Loss / Objective

Training keeps the fixed-style perceptual objective:

$$\min_g \; \frac{1}{n} \sum_{t=1}^{n} \mathcal{L}\big(x_0, x_t, g(x_t, z_t)\big), \qquad z_t \sim \mathcal{N}(0,1)$$
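As a rough illustration of how this objective is evaluated, the sketch below averages a loss over n content samples, drawing a fresh noise value for each one. Here x_0 is the style image, x_t a content image, g the generator, and z_t the noise input. Everything concrete is an assumption made for the example: `empirical_objective`, `toy_generator`, and `toy_loss` are hypothetical names, the "images" and the noise are plain floats, and the toy loss merely stands in for the perceptual loss, which the paper computes from a pretrained network.

```python
import random


def empirical_objective(loss, generator, style, contents, rng):
    """Average of loss(style, x_t, g(x_t, z_t)) over the n content samples."""
    total = 0.0
    for x_t in contents:
        z_t = rng.gauss(0.0, 1.0)  # z_t ~ N(0, 1), one fresh draw per sample
        total += loss(style, x_t, generator(x_t, z_t))
    return total / len(contents)


# Toy stand-ins (hypothetical): scalars play the role of images.
def toy_generator(x_t, z_t):
    return 0.5 * x_t + 0.1 * z_t


def toy_loss(x_0, x_t, output):
    # Squared distance to the style plus squared distance to the content.
    return (output - x_0) ** 2 + (output - x_t) ** 2


value = empirical_objective(toy_loss, toy_generator, style=1.0,
                            contents=[0.0, 1.0, 2.0], rng=random.Random(0))
```

Training would minimize this quantity over the generator's parameters; the sketch only evaluates it for a fixed generator.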

Normalization Rule

The key change is per-instance, per-channel spatial normalization:

$$y_{tijk} = \frac{x_{tijk} - \mu_{ti}}{\sqrt{\sigma_{ti}^2 + \epsilon}}$$

$$\mu_{ti} = \frac{1}{HW} \sum_{l=1}^{W} \sum_{m=1}^{H} x_{tilm}, \qquad \sigma_{ti}^2 = \frac{1}{HW} \sum_{l=1}^{W} \sum_{m=1}^{H} \big(x_{tilm} - \mu_{ti}\big)^2$$
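The rule above can be written out index by index. The following is a minimal pure-Python sketch, assuming the batch is a nested list indexed as `x[t][i][j][k]` (image, channel, and two spatial positions); the function name `instance_norm` and the default `eps` are choices made for this example rather than values from the paper, and no learned scale or shift is applied.

```python
import math


def instance_norm(x, eps=1e-5):
    """Normalize x[t][i][j][k] over the spatial indices for each image t and channel i."""
    y = []
    for image in x:                      # index t
        normalized_image = []
        for channel in image:            # index i
            pixels = [v for row in channel for v in row]
            mu = sum(pixels) / len(pixels)                          # mu_ti
            var = sum((v - mu) ** 2 for v in pixels) / len(pixels)  # sigma_ti^2
            scale = math.sqrt(var + eps)
            normalized_image.append(
                [[(v - mu) / scale for v in row] for row in channel]
            )
        y.append(normalized_image)
    return y


# Two single-channel 2x2 images that differ only by a factor of 10.
batch = [
    [[[1.0, 2.0], [3.0, 4.0]]],
    [[[10.0, 20.0], [30.0, 40.0]]],
]
normalized = instance_norm(batch)
```

Because the mean and variance are computed separately for each image and channel, the two images in `batch` map to almost identical outputs; the small remaining difference comes from `eps`.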

Training Procedure

Evaluation

Datasets

Metrics

Headline results

Sample grid: two style images on the top row, then a portrait content image and its two stylized outputs from the proposed method.

Ablations

Method Strengths and Weaknesses

Strengths

Weaknesses

Suggestions from the authors

Links

Prior Papers

Further Papers