Harnessing Low Dimensionality in Diffusion Models: From Theory to Practice
Yuxin Chen · Qing Qu · Liyue Shen
Mon 14 Jul 2025 • 09:30 – 12:00 PDT
West Ballroom B
Abstract
Diffusion models have emerged as a powerful class of deep generative models, achieving state-of-the-art performance in a wide range of data generation tasks. At a high level, they learn data distributions by progressively denoising Gaussian noise, mimicking non-equilibrium thermodynamic diffusion processes. Despite their empirical success, the theoretical foundations underlying their capabilities remain poorly understood. This lack of understanding limits their broader adoption—especially in high-stakes applications that demand interpretability, efficiency, and safety.
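To make the "progressively denoising Gaussian noise" idea above concrete, here is a minimal DDPM-style sketch of the forward (noising) process and one reverse (denoising) step. The linear schedule, step count, and toy data are illustrative assumptions for this sketch, not the tutorial's reference implementation.

```python
# Minimal sketch of DDPM-style forward noising and one reverse denoising step.
# All hyperparameters are illustrative choices.
import numpy as np

T = 1000                              # number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)    # illustrative linear noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)       # \bar{alpha}_t = prod_{s<=t} alpha_s

def forward_sample(x0, t, rng):
    """Closed-form forward noising: x_t ~ N(sqrt(abar_t) x0, (1 - abar_t) I)."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps

def reverse_step(xt, t, eps_hat, rng):
    """One ancestral denoising step, given a noise estimate eps_hat
    (in practice produced by a trained network)."""
    mean = (xt - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
    if t == 0:
        return mean
    return mean + np.sqrt(betas[t]) * rng.standard_normal(xt.shape)

rng = np.random.default_rng(0)
x0 = rng.standard_normal(2)                 # toy 2-D "data" point
xT, eps = forward_sample(x0, T - 1, rng)    # diffuse to (near) pure noise
x_prev = reverse_step(xT, T - 1, eps, rng)  # one reverse step, using the true noise
```

Training amounts to regressing the noise `eps` from `xT`; sampling runs `reverse_step` from pure noise down to `t = 0`.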
In response to growing concerns about the reliability and transparency of generative AI models, this tutorial—building on a joint tutorial by the organizers at CPAL'25 (Stanford)—offers a timely introduction to the theoretical principles governing diffusion models. We will focus on three core aspects: generalization, sampling efficiency, and scientific applications. Drawing on recent advances, the tutorial will highlight how low-dimensional structures in both data and models can be exploited to address key challenges in generalization, fast sampling convergence, and controllability.
Specifically, we will examine how diffusion models adaptively learn underlying data distributions, how to accelerate convergence during sampling, and how to characterize the intrinsic properties of the learned denoiser. These theoretical insights will then be connected to practical advances, demonstrating how to harness them for real-world scientific applications.
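As one standard instance of such a characterization (variance-preserving notation, stated here only for orientation), the Bayes-optimal denoiser is tied to the score of the noised distribution by Tweedie's formula:

$$
\mathbb{E}[x_0 \mid x_t] \;=\; \frac{x_t + (1 - \bar\alpha_t)\,\nabla_{x_t}\log p_t(x_t)}{\sqrt{\bar\alpha_t}},
\qquad
x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,\quad \epsilon \sim \mathcal{N}(0, I),
$$

so studying the learned denoiser is, up to reparameterization, studying a learned score function.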

The tutorial will cover the following three parts:
(i) Generalization: We begin with an introduction to diffusion models, followed by a comprehensive study of their generalization: when and why they can learn low-dimensional target structures, how sample complexity scales with intrinsic data dimension, and how they transition from memorization to generalization. We also introduce a probability flow-based metric to quantify generalization and highlight several intriguing behaviors observed during training.
(ii) Sampling Efficiency: We develop a sharp, non-asymptotic convergence theory for mainstream diffusion-based samplers, and leverage these insights to design provably faster higher-order samplers, covering both SDE-based and ODE-based solvers (a minimal ODE-style sampler is sketched after this list). We will also examine how diffusion-based samplers adapt to unknown low-dimensional data structures, and how adaptive parallel computing can provably speed up both training and sampling.
(iii) Scientific Applications: We will advance diffusion models for scientific imaging by improving their flexibility, efficiency, and robustness in solving inverse problems on high-dimensional, high-resolution data. Our focus includes efficient latent and patch-based methods, enhanced data consistency for challenging 3D tasks, and controllable sampling techniques that enforce desired constraints while maintaining sample quality.
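As a toy illustration of the ODE-based solvers mentioned in part (ii), the sketch below runs a deterministic DDIM-style update on a coarse sub-schedule, using far fewer steps than the forward process. The schedule, the 50-step choice, and the trivial stand-in denoiser are assumptions for this sketch only.

```python
# Illustrative deterministic (ODE-style) DDIM sampler on a coarse sub-schedule.
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)

def eps_hat(xt, t):
    """Stand-in for a trained noise-prediction network (returns zeros here)."""
    return np.zeros_like(xt)

def ddim_sample(shape, num_steps=50, seed=0):
    """Deterministic DDIM-style sampling with num_steps << T solver steps."""
    rng = np.random.default_rng(seed)
    ts = np.linspace(T - 1, 0, num_steps).astype(int)  # e.g., 50 steps instead of 1000
    x = rng.standard_normal(shape)                     # start from pure Gaussian noise
    for i, t in enumerate(ts):
        e = eps_hat(x, t)
        # Predict the clean signal, then jump to the next (coarser) noise level.
        x0_hat = (x - np.sqrt(1.0 - alpha_bars[t]) * e) / np.sqrt(alpha_bars[t])
        if i + 1 < len(ts):
            s = ts[i + 1]
            x = np.sqrt(alpha_bars[s]) * x0_hat + np.sqrt(1.0 - alpha_bars[s]) * e
        else:
            x = x0_hat
    return x

sample = ddim_sample((2,))
```

Each update is an Euler-type step along the probability-flow trajectory, which is why step counts can be cut dramatically relative to ancestral (SDE-based) sampling; the convergence rates of such discretizations are exactly what Part II analyzes.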
Tutorial Schedule
| Topic | Time (PDT) | Speaker |
| --- | --- | --- |
| Introduction | 09:30 – 09:40 a.m. | Qing Qu |
| Part I: Generalization of Learning Diffusion Models | 09:40 – 10:25 a.m. | |
| Part II: Guaranteed and Efficient Sampling of Diffusion Models | 10:25 – 11:10 a.m. | Yuxin Chen |
| Break | 11:10 – 11:15 a.m. | |
| Part III: From Theory to Scientific Applications | 11:15 – 12:00 p.m. | Liyue Shen |
Speakers

Yuxin Chen, Professor

Qing Qu, Assistant Professor

Liyue Shen, Assistant Professor