A Style-Based Generator Architecture for Generative Adversarial Networks

Tero Karras, Samuli Laine, Timo Aila

2019 · CVPR

Problem

Framing

Progressive GANs generate sharp high-resolution images, but feeding a single latent code into the first layer leaves semantics entangled across layers and scales. StyleGAN addresses this with an intermediate latent $\mathbf{w}$, per-layer style control, and explicit noise inputs, reducing FFHQ FID from 8.04 (baseline Progressive GAN) to 4.40 (full model, Table 1).

Currently Used Methods

Foundational

Proposed Method

Architecture

The generator maps $\mathbf{z} \in \mathcal{Z}$ through an 8-layer MLP into $\mathbf{w} \in \mathcal{W}$. A separate 18-layer synthesis network starts from a learned $4 \times 4 \times 512$ constant, applies AdaIN after each convolution, and injects single-channel Gaussian noise at every layer.

Architecture and ablation page: the left panel contrasts a traditional latent-input generator with StyleGAN's mapping network, learned constant input, per-layer AdaIN style control, and noise injection; the right panel shows the main FID ablation table.
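To make the layer count and the noise inputs concrete, here is a minimal NumPy sketch. It assumes, following the paper rather than this note, that the 18 synthesis layers come in pairs at nine resolutions from 4×4 to 1024×1024, and that each single-channel noise image is broadcast to all feature maps through per-channel scaling factors. The names `LAYER_RESOLUTIONS` and `inject_noise` are hypothetical, and the scaling factors are fixed constants here rather than learned parameters.

```python
import numpy as np

# Assumption (from the paper, not this note): two synthesis layers per
# resolution, with the resolution doubling from 4x4 up to 1024x1024.
RESOLUTIONS = [4 * 2**i for i in range(9)]                      # 4, 8, ..., 1024
LAYER_RESOLUTIONS = [r for r in RESOLUTIONS for _ in range(2)]  # 18 entries


def inject_noise(x, scale, rng):
    """Add one single-channel Gaussian noise image to every feature map.

    x:     feature maps, shape (channels, height, width)
    scale: per-channel scaling factors, shape (channels,); learned in the
           real model, supplied by the caller in this sketch
    """
    _, h, w = x.shape
    noise = rng.standard_normal((1, h, w))       # a single noise channel
    return x + scale[:, None, None] * noise      # broadcast across channels


rng = np.random.default_rng(0)
const_input = np.ones((512, 4, 4))               # stand-in for the learned constant
noisy = inject_noise(const_input, scale=np.full(512, 0.1), rng=rng)
```

Because the same noise image is shared by all channels, only the scaling factors decide how strongly each feature map responds to it.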

Loss / Objective

The paper keeps the GAN objective and changes the generator parameterization through adaptive instance normalization.

$$\mathrm{AdaIN}(\mathbf{x}_i, \mathbf{y}) = y_{s,i} \frac{\mathbf{x}_i - \mu(\mathbf{x}_i)}{\sigma(\mathbf{x}_i)} + y_{b,i}$$
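The AdaIN operation can be sketched in a few lines of NumPy. This is an illustrative version for a single image, assuming feature maps of shape `(channels, height, width)` and one scale/bias pair per channel; the function name `adain` and the small `eps` added to the denominator for numerical safety are choices made here, not details taken from the note.

```python
import numpy as np


def adain(x, y_s, y_b, eps=1e-8):
    """Adaptive instance normalization for one image.

    x:   feature maps, shape (channels, height, width)
    y_s: per-channel style scale, shape (channels,)
    y_b: per-channel style bias, shape (channels,)
    """
    # Normalize each feature map x_i separately over its spatial positions.
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    x_norm = (x - mu) / (sigma + eps)
    # Re-scale and shift each channel with the style (y_s, y_b).
    return y_s[:, None, None] * x_norm + y_b[:, None, None]


rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))
y_s = np.array([2.0, 0.5, 1.0])
y_b = np.array([0.0, 1.0, -1.0])
out = adain(x, y_s, y_b)
```

After the call, each output channel has mean close to its `y_b` entry and standard deviation close to its `y_s` entry, which is why a style overrides the statistics left by earlier layers.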

Sampling Rule / Algorithm

Sampling maps $\mathbf{z}$ into layerwise styles, then synthesizes from a learned constant plus stochastic noise.

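$$\mathbf{w} = f(\mathbf{z}), \qquad \mathbf{y}^{(\ell)} = A^{(\ell)}(\mathbf{w}), \qquad \mathbf{x} = g\big(\mathbf{c}; \{\mathbf{y}^{(\ell)}\}_{\ell=1}^{L}, \{\mathbf{n}^{(\ell)}\}_{\ell=1}^{L}\big)$$

Here $f$ is the mapping network, $A^{(\ell)}$ is the learned affine transform that produces the style for layer $\ell$, $g$ is the synthesis network, $\mathbf{c}$ is the learned constant input, and $\mathbf{n}^{(\ell)}$ is the noise injected at layer $\ell$.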
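The sampling rule can be traced end to end with a toy NumPy sketch. Everything here is an assumption made for illustration: the weights are random rather than trained, the sizes (`DIM`, `CHANNELS`, `NUM_LAYERS`) are far smaller than in the paper, a 1×1 channel mix stands in for the 3×3 convolutions, nearest-neighbour repetition stands in for the upsampling, and the affine bias is initialised so that the style scale starts near 1. The per-layer order of convolution, noise, nonlinearity, then AdaIN follows the paper's description.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, CHANNELS, NUM_LAYERS = 16, 8, 4   # toy sizes, not the paper's


def leaky_relu(v, slope=0.2):
    return np.where(v > 0, v, slope * v)


def adain(x, y_s, y_b, eps=1e-8):
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    return y_s[:, None, None] * (x - mu) / (sigma + eps) + y_b[:, None, None]


# f: 8-layer MLP with random (untrained) weights.
f_weights = [rng.standard_normal((DIM, DIM)) / np.sqrt(DIM) for _ in range(8)]
# A^(l): one affine map per layer, producing CHANNELS scales and CHANNELS biases.
affine = [
    (rng.standard_normal((2 * CHANNELS, DIM)) / np.sqrt(DIM),
     np.concatenate([np.ones(CHANNELS), np.zeros(CHANNELS)]))
    for _ in range(NUM_LAYERS)
]
# Stand-ins for convolution weights and noise scaling factors.
mix = [rng.standard_normal((CHANNELS, CHANNELS)) / np.sqrt(CHANNELS)
       for _ in range(NUM_LAYERS)]
noise_scale = [np.full(CHANNELS, 0.1) for _ in range(NUM_LAYERS)]
const = rng.standard_normal((CHANNELS, 4, 4))   # c: stand-in for the learned constant


def mapping(z):
    """w = f(z)."""
    w = z
    for weight in f_weights:
        w = leaky_relu(weight @ w)
    return w


def synthesize(w, noise_rng):
    """x = g(c; styles, noise), with two layers per resolution."""
    x = const
    for layer in range(NUM_LAYERS):
        if layer > 0 and layer % 2 == 0:
            x = x.repeat(2, axis=1).repeat(2, axis=2)        # upsample 2x
        x = np.einsum("oc,chw->ohw", mix[layer], x)          # 1x1 "convolution"
        noise = noise_rng.standard_normal((1,) + x.shape[1:])
        x = leaky_relu(x + noise_scale[layer][:, None, None] * noise)
        a_weight, a_bias = affine[layer]
        style = a_weight @ w + a_bias                        # y^(l) = A^(l)(w)
        x = adain(x, style[:CHANNELS], style[CHANNELS:])
    return x


z = rng.standard_normal(DIM)
w = mapping(z)
image_a = synthesize(w, np.random.default_rng(1))
image_b = synthesize(w, np.random.default_rng(2))   # same styles, different noise
```

Calling `synthesize` twice with the same `w` but different noise generators gives two outputs that share their styles and differ only in the stochastic detail.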

Training Procedure

Evaluation

Datasets

Metrics

Headline results

Table 1: FID for generator variants on CelebA-HQ and FFHQ

| Method | CelebA-HQ | FFHQ |
| --- | --- | --- |
| A Baseline Progressive GAN [30] | 7.79 | 8.04 |
| B + Tuning (incl. bilinear up/down) | 6.11 | 5.25 |
| C + Add mapping and styles | 5.34 | 4.85 |
| D + Remove traditional input | 5.07 | 4.88 |
| E + Add noise inputs | 5.06 | 4.42 |
| F + Mixing regularization | 5.17 | 4.40 |
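As a quick reading aid for Table 1, the relative FID reductions can be computed directly from the reported numbers. The snippet below only restates the table values; the dictionary layout and the helper name `relative_reduction` are illustrative choices.

```python
# FID values from Table 1 as (CelebA-HQ, FFHQ); lower is better.
FID = {
    "A": (7.79, 8.04),
    "B": (6.11, 5.25),
    "C": (5.34, 4.85),
    "D": (5.07, 4.88),
    "E": (5.06, 4.42),
    "F": (5.17, 4.40),
}


def relative_reduction(baseline, value):
    """Percentage by which `value` improves on `baseline`."""
    return 100.0 * (baseline - value) / baseline


celeba_gain = relative_reduction(FID["A"][0], FID["F"][0])
ffhq_gain = relative_reduction(FID["A"][1], FID["F"][1])
best_celeba = min(FID, key=lambda k: FID[k][0])
best_ffhq = min(FID, key=lambda k: FID[k][1])

print(f"CelebA-HQ, A to F: {celeba_gain:.1f}% lower FID (best config: {best_celeba})")
print(f"FFHQ, A to F: {ffhq_gain:.1f}% lower FID (best config: {best_ffhq})")
```

The full model F cuts FID by roughly 45% on FFHQ and 34% on CelebA-HQ relative to baseline A. Mixing regularization (E to F) slightly raises CelebA-HQ FID from 5.06 to 5.17 while lowering FFHQ FID from 4.42 to 4.40, so configuration E is the best row on CelebA-HQ.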

Ablations

Method Strengths and Weaknesses

Strengths

Weaknesses

Suggestions from the authors

Links

Prior Papers

Further Papers

No vault papers identified as further work yet.