Denoising Diffusion Probabilistic Models

Jonathan Ho, Ajay Jain, Pieter Abbeel

2020 · arXiv

Problem

Framing

Diffusion models had a variational formulation but had not shown GAN-class sample fidelity. The paper closes that gap with an $\epsilon$-prediction parameterization of the reverse Gaussian chain, reaching CIFAR-10 IS 9.46 and FID 3.17 while remaining a likelihood model.

Currently Used Methods

Foundational and direct antecedents

Proposed Method

Architecture

The reverse model is a U-Net-like Wide-ResNet with group normalization, shared weights across timesteps, sinusoidal timestep embeddings, and self-attention at $16 \times 16$ resolution. The $32 \times 32$ model uses four resolutions; the $256 \times 256$ model uses six.
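As an illustration of the sinusoidal timestep embedding mentioned above, here is a minimal sketch in the Transformer style. The function name `timestep_embedding`, the `max_period` default of 10000, and the sin-then-cos layout are assumptions made for this example; the exact layout used in the paper's code may differ.

```python
import math


def timestep_embedding(t, dim, max_period=10000.0):
    """Map an integer timestep t to a sinusoidal vector of length dim.

    Hypothetical helper: dim is assumed even, and the frequency spacing
    (geometric, from 1 down towards 1/max_period) is an assumption.
    """
    half = dim // 2
    # Geometrically spaced frequencies, highest first.
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    # First half holds sines, second half holds cosines.
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]


emb = timestep_embedding(10, 8)  # one 8-dimensional vector for timestep 10
```

Because the same network weights are shared across all timesteps, a vector like this is what would tell the network which noise level it is looking at.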

Directed graphical model: a forward Gaussian noising chain $q(\mathbf{x}_t \mid \mathbf{x}_{t-1})$ and a learned reverse denoising chain $p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t)$ from noise to image.

Loss / Objective

Training uses the simplified noise-prediction objective at a random timestep.

$$L_{\mathrm{simple}} = \mathbb{E}_{t,\mathbf{x}_0,\boldsymbol{\epsilon}}\left[\left\|\boldsymbol{\epsilon} - \boldsymbol{\epsilon}_{\theta}\left(\sqrt{\bar{\alpha}_t}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon},\, t\right)\right\|^2\right]$$
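The objective can be read as a three-step recipe: draw a timestep and a noise vector, form the noised input, and score the model's noise prediction. The sketch below computes one Monte Carlo sample of that loss on a flat list of floats. The names `linear_schedule` and `simple_loss_sample`, the linear $\beta$ schedule with its endpoint values, and the `eps_model(x_t, t)` callable are assumptions for illustration; the summary above does not specify a schedule.

```python
import math
import random


def linear_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Return (betas, alpha_bars) for an assumed linear beta schedule."""
    betas = [beta_start + (beta_end - beta_start) * i / (T - 1) for i in range(T)]
    alpha_bars, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b  # alpha_bar_t is the running product of (1 - beta_s)
        alpha_bars.append(prod)
    return betas, alpha_bars


def simple_loss_sample(x0, eps_model, alpha_bars, rng):
    """One sample of the squared error inside the expectation of L_simple."""
    t = rng.randrange(len(alpha_bars))       # uniform timestep (0-based index)
    eps = [rng.gauss(0.0, 1.0) for _ in x0]  # target noise
    a = alpha_bars[t]
    # Noised input: sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps
    x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    pred = eps_model(x_t, t)
    return sum((e - p) ** 2 for e, p in zip(eps, pred))


# Usage with a placeholder model that always predicts zero noise.
betas, alpha_bars = linear_schedule()
loss = simple_loss_sample([0.5, -0.2, 0.1], lambda x, t: [0.0] * len(x),
                          alpha_bars, random.Random(0))
```

In practice the loss would be averaged over a minibatch and minimized by gradient descent on the network parameters; the placeholder model here only shows the data flow.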

Sampling Rule / Algorithm

Sampling starts from Gaussian noise and applies one reverse Gaussian step per timestep.

$$\mathbf{x}_{t-1} = \frac{1}{\sqrt{\alpha_t}}\left(\mathbf{x}_t - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\,\boldsymbol{\epsilon}_{\theta}(\mathbf{x}_t, t)\right) + \sigma_t \mathbf{z}, \qquad \mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$$
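A minimal sketch of this update and the surrounding loop follows, operating on a flat list of floats. Several details are assumptions not stated in the summary above: the linear $\beta$ schedule, the choice $\sigma_t^2 = \beta_t$, setting $\mathbf{z} = \mathbf{0}$ on the final step, and the helper names `linear_schedule`, `reverse_step`, and `sample`. It also uses $\alpha_t = 1 - \beta_t$.

```python
import math
import random


def linear_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Return (betas, alpha_bars) for an assumed linear beta schedule."""
    betas = [beta_start + (beta_end - beta_start) * i / (T - 1) for i in range(T)]
    alpha_bars, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        alpha_bars.append(prod)
    return betas, alpha_bars


def reverse_step(x_t, t, eps_pred, betas, alpha_bars, rng):
    """Apply one reverse step; t is a 0-based index into the schedule."""
    beta = betas[t]
    alpha = 1.0 - beta
    coef = beta / math.sqrt(1.0 - alpha_bars[t])
    # Mean of the reverse Gaussian: (x_t - coef * eps_pred) / sqrt(alpha_t)
    mean = [(x - coef * e) / math.sqrt(alpha) for x, e in zip(x_t, eps_pred)]
    if t == 0:
        return mean  # assumed: no noise added on the last step
    sigma = math.sqrt(beta)  # assumed variance choice sigma_t^2 = beta_t
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]


def sample(eps_model, dim, betas, alpha_bars, rng):
    """Start from Gaussian noise and run every reverse step in order."""
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in reversed(range(len(betas))):
        x = reverse_step(x, t, eps_model(x, t), betas, alpha_bars, rng)
    return x


# Usage with a short schedule and a placeholder model predicting zero noise.
betas, alpha_bars = linear_schedule(T=50)
out = sample(lambda x, t: [0.0] * len(x), 3, betas, alpha_bars, random.Random(0))
```

Each generated sample costs one network evaluation per timestep, so the loop length equals the number of steps in the schedule.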

Training Procedure

Evaluation

Datasets

Metrics

Headline results

Sample grid: LSUN Church generations with varied building layouts, facades, and lighting

Ablations

Method Strengths and Weaknesses

Strengths

Weaknesses

Suggestions from the authors

Links

Prior Papers

Further Papers