Denoising Diffusion Implicit Models
Problem
Framing
DDPMs deliver strong image quality but need long reverse chains, usually $T = 1000$ steps. DDIM closes the latency gap by defining a non-Markovian forward family with the same denoising training objective, then using deterministic or partially stochastic short-step sampling. On CIFAR-10, deterministic DDIM reaches FID 4.67 in 50 steps and 4.16 in 100 steps (Table 1).
Currently Used Methods
Direct antecedents
- @DeepUnsupervisedLearningusing2015 — diffusion latent-variable learning from nonequilibrium thermodynamics.
- Limitation in context: no practical short-step image sampler.
- @DenoisingDiffusionProbabilisticModels2020 — denoising diffusion with strong likelihoods and sample quality.
- Limitation in context: generation needs long sequential reverse chains.
- @songScoreSDE2020 — score-based generation with continuous noise perturbations.
- Limitation in context: not framed as direct reuse of DDPM checkpoints.
- @goodfellowGAN2014 — one-pass adversarial image synthesis with high perceptual quality.
- Limitation in context: no diffusion-style inversion trajectory.
Proposed Method
Architecture
DDIM changes the sampler, not the denoiser. It reuses the DDPM network $\epsilon_\theta(x_t, t)$, chooses a subsequence $\tau$ of the timesteps $1, \dots, T$ with length $S$, and controls randomness with $\eta \ge 0$, where $\eta = 0$ gives a deterministic trajectory.
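The two sampler knobs can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' code: the linear $\beta$ schedule from $10^{-4}$ to $0.02$ is the common DDPM default and is assumed here, the helper names (`linear_subsequence`, `quadratic_subsequence`, `alpha_bar_schedule`, `ddim_sigma`) are hypothetical, and `alpha_bar[t]` is the cumulative signal level with `alpha_bar[0] = 1`.

```python
import math

def linear_subsequence(T, S):
    """S evenly spaced timesteps in 1..T, ending at T."""
    assert 1 <= S <= T
    return [(i + 1) * T // S for i in range(S)]

def quadratic_subsequence(T, S):
    """Timesteps spaced as i^2, denser near t = 0; duplicates are dropped,
    so the result can be shorter than S when T / S^2 < 1."""
    assert 1 <= S <= T
    return sorted({max(1, i * i * T // (S * S)) for i in range(1, S + 1)})

def alpha_bar_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bar = [1.0]  # index 0 means "no noise"
    for t in range(T):
        beta = beta_start + (beta_end - beta_start) * t / (T - 1)
        alpha_bar.append(alpha_bar[-1] * (1.0 - beta))
    return alpha_bar

def ddim_sigma(eta, a_t, a_prev):
    """Noise scale for one step; eta = 0 is deterministic, eta = 1 matches
    the DDPM posterior standard deviation."""
    return eta * math.sqrt((1.0 - a_prev) / (1.0 - a_t)) * math.sqrt(1.0 - a_t / a_prev)

T, S = 1000, 10
alpha_bar = alpha_bar_schedule(T)
tau = linear_subsequence(T, S)
prev = [0] + tau[:-1]  # each step moves from tau[i] to prev[i]
sigmas_det = [ddim_sigma(0.0, alpha_bar[t], alpha_bar[p]) for t, p in zip(tau, prev)]
sigmas_sto = [ddim_sigma(1.0, alpha_bar[t], alpha_bar[p]) for t, p in zip(tau, prev)]
```

With $\eta = 0$ every entry of `sigmas_det` is zero, so the whole trajectory is a function of the initial noise alone. The quadratic spacing is one alternative to even spacing; which spacing works better is dataset dependent.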

Loss / Objective
The non-Markovian family shares the DDPM denoising surrogate up to timestep weights.
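For reference, the shared surrogate can be written out explicitly. The notation below follows the DDIM paper and is not defined elsewhere in these notes, so treat the symbols as assumptions: $\alpha_t$ is the cumulative signal level (written $\bar{\alpha}_t$ in DDPM), $\gamma_t > 0$ are per-timestep weights, $J_\sigma$ is the variational objective of the non-Markovian family indexed by $\sigma$, and $C$ is a constant that does not depend on $\theta$.

$$
L_\gamma(\epsilon_\theta) = \sum_{t=1}^{T} \gamma_t \, \mathbb{E}_{x_0,\, \epsilon_t \sim \mathcal{N}(0, I)} \left[ \left\| \epsilon_\theta^{(t)}\!\left(\sqrt{\alpha_t}\, x_0 + \sqrt{1 - \alpha_t}\, \epsilon_t\right) - \epsilon_t \right\|_2^2 \right],
\qquad
J_\sigma(\epsilon_\theta) = L_\gamma(\epsilon_\theta) + C
$$

Setting $\gamma_t = 1$ for all $t$ gives the simplified DDPM loss, which is the argument for sampling a DDPM checkpoint with any choice of $\sigma$ without retraining.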
Sampling Rule
Sampling predicts $\hat{x}_0$ from $x_{\tau_i}$ and the predicted noise, then updates to $x_{\tau_{i-1}}$ along the chosen subsequence $\tau$.
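A single update can be sketched as below. It is a minimal illustration, not the reference implementation: `ddim_step` is a hypothetical name, samples are plain lists of floats, `a_t` and `a_prev` are cumulative signal levels at the current and next (less noisy) timestep, `eps` stands in for the network output, and `sigma` is the per-step noise scale (zero for the deterministic sampler). The step computes `sqrt(a_prev) * x0_hat + sqrt(1 - a_prev - sigma^2) * eps + sigma * z`.

```python
import math
import random

def ddim_step(x_t, eps, a_t, a_prev, sigma=0.0, rng=random):
    """One DDIM update from signal level a_t to a_prev."""
    # Predicted clean sample: remove the predicted noise and rescale.
    x0_hat = [(x - math.sqrt(1.0 - a_t) * e) / math.sqrt(a_t)
              for x, e in zip(x_t, eps)]
    # Weight of the term that points back toward x_t along the predicted noise.
    dir_coef = math.sqrt(max(0.0, 1.0 - a_prev - sigma ** 2))
    out = []
    for x0, e in zip(x0_hat, eps):
        # Fresh Gaussian noise is only drawn in the stochastic case.
        z = rng.gauss(0.0, 1.0) if sigma > 0.0 else 0.0
        out.append(math.sqrt(a_prev) * x0 + dir_coef * e + sigma * z)
    return out

# Sanity check with a known noise vector: stepping straight to a_prev = 1
# should return the clean sample that x_t was built from.
x0 = [0.5, -1.0, 2.0]
eps = [0.1, 0.3, -0.2]
a_t = 0.25
x_t = [math.sqrt(a_t) * x + math.sqrt(1.0 - a_t) * e for x, e in zip(x0, eps)]
x_rec = ddim_step(x_t, eps, a_t, a_prev=1.0)
```

In a full sampler this function would be called once per element of $\tau$, from the largest timestep down, with `eps` recomputed by the network at each step.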
Training Procedure
- Reuses pretrained DDPM denoisers
- Sampling uses a subsequence $\tau$ of length $S \le T$ (Table 1 reports $S \in \{10, 20, 50, 100, 1000\}$)
- Same dataset-specific architectures as DDPM
Evaluation
Datasets
- CIFAR-10, $32 \times 32$, unconditional
- CelebA, $64 \times 64$
- LSUN Bedroom, $256 \times 256$
- LSUN Church, $256 \times 256$
Metrics
- FID
- Reconstruction MSE on CIFAR-10 test images
- Sampling steps
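The reconstruction metric relies on running the deterministic sampler in both directions: encode $x_0 \to x_T$, decode $x_T \to x_0$, and measure the squared error. The sketch below illustrates this on a toy problem only. It assumes scalar data drawn from a unit Gaussian, for which the optimal noise prediction is `sqrt(1 - a) * x`, an assumed linear $\beta$ schedule, and the common approximation of reusing the $\eta = 0$ update formula for the encoding direction. All function names are hypothetical.

```python
import math

def alpha_bar_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal levels for an assumed linear beta schedule."""
    alpha_bar = [1.0]
    for t in range(T):
        beta = beta_start + (beta_end - beta_start) * t / (T - 1)
        alpha_bar.append(alpha_bar[-1] * (1.0 - beta))
    return alpha_bar

def toy_eps(x, a):
    """Stand-in for the network: optimal noise prediction for N(0, 1) data."""
    return math.sqrt(1.0 - a) * x

def ddim_move(x, a_from, a_to):
    """Deterministic (eta = 0) update between two signal levels, either direction."""
    eps = toy_eps(x, a_from)
    x0_hat = (x - math.sqrt(1.0 - a_from) * eps) / math.sqrt(a_from)
    return math.sqrt(a_to) * x0_hat + math.sqrt(1.0 - a_to) * eps

def reconstruction_mse(xs, alpha_bar, S):
    """Encode each x_0 to x_T in S steps, decode back, return mean squared error."""
    T = len(alpha_bar) - 1
    tau = [0] + [(i + 1) * T // S for i in range(S)]
    rev = tau[::-1]
    total = 0.0
    for x0 in xs:
        x = x0
        for s, u in zip(tau[:-1], tau[1:]):   # encode: increasing noise
            x = ddim_move(x, alpha_bar[s], alpha_bar[u])
        for s, u in zip(rev[:-1], rev[1:]):   # decode: decreasing noise
            x = ddim_move(x, alpha_bar[s], alpha_bar[u])
        total += (x - x0) ** 2
    return total / len(xs)

alpha_bar = alpha_bar_schedule(1000)
xs = [-1.5, -0.5, 0.5, 1.5]
mse_10 = reconstruction_mse(xs, alpha_bar, 10)
mse_100 = reconstruction_mse(xs, alpha_bar, 100)
```

In this toy setting the round-trip error is nonzero because of discretization and shrinks as $S$ grows, so `mse_100` is smaller than `mse_10`. A real evaluation would replace `toy_eps` with the trained network and average over test images.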
Headline results
- CIFAR-10 ($\eta = 0$): FID 4.67 at $S = 50$, 4.16 at $S = 100$
- CelebA ($\eta = 0$): FID 9.17 at $S = 50$, 6.53 at $S = 100$
- LSUN Bedroom (, ): FID
- LSUN Church (, ): FID
- DDPM baseline ($S = 1000$, $\hat{\sigma}$): CIFAR-10 FID 3.17, CelebA FID 3.26
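Table 1: CIFAR10 and CelebA image generation measured in FID. Rows give the number of sampling steps $S$; columns give $\eta$, where $\eta = 0.0$ is deterministic DDIM and $\hat{\sigma}$ is the larger-variance DDPM setting.
| $S$ | CIFAR10, $\eta = 0.0$ | CIFAR10, $\eta = 0.2$ | CIFAR10, $\eta = 0.5$ | CIFAR10, $\eta = 1.0$ | CIFAR10, $\hat{\sigma}$ | CelebA, $\eta = 0.0$ | CelebA, $\eta = 0.2$ | CelebA, $\eta = 0.5$ | CelebA, $\eta = 1.0$ | CelebA, $\hat{\sigma}$ |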
|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 13.36 | 14.04 | 16.66 | 41.07 | 367.43 | 17.33 | 17.66 | 19.86 | 33.12 | 299.71 |
| 20 | 6.84 | 7.11 | 8.35 | 18.36 | 133.37 | 13.73 | 14.11 | 16.06 | 26.03 | 183.83 |
| 50 | 4.67 | 4.77 | 5.25 | 8.01 | 32.72 | 9.17 | 9.51 | 11.01 | 18.48 | 71.71 |
| 100 | 4.16 | 4.25 | 4.46 | 5.78 | 9.99 | 6.53 | 6.79 | 8.09 | 13.93 | 45.20 |
| 1000 | 4.04 | 4.09 | 4.29 | 4.73 | 3.17 | 3.51 | 3.64 | 4.28 | 5.98 | 3.26 |
Ablations
- Step count $S$: longer subsequences improve FID across datasets.
- Stochasticity $\eta$: $\eta = 0$ is best in short-step regimes.
- DDIM versus DDPM: deterministic updates degrade far less when steps are truncated.
- Deterministic trajectories enable interpolation and low-error reconstruction.
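Because the $\eta = 0$ sampler maps each initial noise $x_T$ to one image, interpolation can be done in the space of $x_T$. A common choice is spherical linear interpolation (slerp), which keeps intermediate points at a norm typical of Gaussian noise. The sketch below is an illustration on plain lists of floats; the function name `slerp` and the fallback threshold are assumptions.

```python
import math

def slerp(z0, z1, alpha):
    """Spherical linear interpolation between two latents, alpha in [0, 1]."""
    dot = sum(a * b for a, b in zip(z0, z1))
    n0 = math.sqrt(sum(a * a for a in z0))
    n1 = math.sqrt(sum(b * b for b in z1))
    # Clamp to guard against rounding pushing the cosine outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (n0 * n1)))
    theta = math.acos(cos_theta)
    if theta < 1e-8:
        # Nearly parallel latents: fall back to ordinary linear interpolation.
        return [(1.0 - alpha) * a + alpha * b for a, b in zip(z0, z1)]
    w0 = math.sin((1.0 - alpha) * theta) / math.sin(theta)
    w1 = math.sin(alpha * theta) / math.sin(theta)
    return [w0 * a + w1 * b for a, b in zip(z0, z1)]

z0 = [1.0, 0.0]
z1 = [0.0, 1.0]
mid = slerp(z0, z1, 0.5)
mid_norm = math.sqrt(sum(v * v for v in mid))
```

Each interpolated latent would then be decoded with the deterministic sampler. The sketch does not handle exactly antipodal latents, where `sin(theta)` vanishes; this is not a practical concern for high-dimensional Gaussian noise.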
Method Strengths and Weaknesses
Strengths
- Reuses DDPM checkpoints without retraining.
- CIFAR-10 reaches FID 4.67 in 50 steps and 4.16 in 100 steps with $\eta = 0$.
- Gives $10\times$ to $50\times$ faster sampling than DDPM.
- Deterministic paths support inversion and interpolation.
Weaknesses
- Sampling remains iterative, not one-shot.
- Very short chains still lose quality sharply.
- Best overall FID still comes from $1000$-step DDPM.
- Quality depends on timestep subsequence design.
Suggestions from the authors
- Study shorter forward processes with the same denoising objective.
- Use deterministic inversion as a latent representation.
- Extend the construction beyond Gaussian continuous data.
- Analyze continuous-time limits of the DDIM process.
Links
Prior Papers
- @DenoisingDiffusionProbabilisticModels2020 — DDIM keeps the DDPM objective and replaces its slow reverse sampler.
- @DeepUnsupervisedLearningusing2015 — DDIM extends the original diffusion latent-variable line with practical short-step generation.
- @dinhNVP2017 — deterministic DDIM trajectories make diffusion sampling closer to invertible latent-variable mappings.
Further Papers
- @nicholImprovedDDPM2021 — improves diffusion samplers and variance parameterization in the design space DDIM opened.
- @ClassifierFreeDiffusionGuidance2022 — guidance is routinely paired with DDIM sampling for fast conditional generation.
- @rombachLatentDiffusion2022 — latent diffusion uses DDIM-style sampling in compressed latent spaces.
- @dhariwalDiffusionBeatGANs2021 — strong guided diffusion systems use accelerated samplers in the DDIM family.