Losing dimensions: Geometric memorization in generative diffusion

arXiv:2410.08727v2
Abstract: Diffusion models power leading generative AI, but when and how they memorize training data, especially on low-dimensional manifolds, remains unclear. We find memorization emerges gradually, not abruptly: as data become scarce, diffusion models experience a smooth collapse where their capacity to vary across independent directions diminishes. Measuring latent dimensionality via the learned score field, we reveal how generative behavior increasingly centers on a few examples while other variations “freeze out”. We propose a geometric memorization theory, showing that salient features collapse first, then finer details, leading to near point-wise replication. This mirrors physical systems condensing into a few low-energy configurations. Our theoretical predictions align with both synthetic and real data, identifying geometric memorization as a distinct phase between generalization and exact copying.
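
As a rough illustration of what "measuring latent dimensionality via the learned score field" can look like, the sketch below uses a toy stand-in rather than the paper's procedure. It assumes the exact score of a Gaussian-smoothed training set in place of a trained network, and it counts the directions in which the score Jacobian is much weaker than -1/sigma^2. The function names, the threshold of 0.5, and the toy data are all assumptions made for this example.

```python
import numpy as np


def empirical_score_jacobian(x, train, sigma):
    """Jacobian at x of the score of (1/N) * sum_i N(y_i, sigma^2 I).

    Assumption: this closed-form score of the noised training set stands in
    for a learned score network. The Jacobian equals
    -I / sigma^2 + Cov_w(y) / sigma^4, where Cov_w is the covariance of the
    training points under the softmax posterior weights w_i(x).
    """
    sq_dist = np.sum((train - x) ** 2, axis=1)
    logits = -sq_dist / (2.0 * sigma**2)
    w = np.exp(logits - logits.max())  # numerically stable softmax
    w /= w.sum()
    mean = w @ train
    centered = train - mean
    cov = (centered * w[:, None]).T @ centered
    dim = x.shape[0]
    return -np.eye(dim) / sigma**2 + cov / sigma**4


def estimated_dimension(x, train, sigma, threshold=0.5):
    """Count directions along which the score still allows variation.

    Frozen (normal or memorized) directions have Jacobian eigenvalues near
    -1/sigma^2; the hypothetical threshold separates them from the rest.
    """
    eigvals = np.linalg.eigvalsh(empirical_score_jacobian(x, train, sigma))
    return int(np.sum(sigma**2 * eigvals > -threshold))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: a 2-dimensional Gaussian embedded in a 5-dimensional space.
    plenty = np.zeros((5000, 5))
    plenty[:, :2] = rng.standard_normal((5000, 2))
    scarce = plenty[:1]  # extreme data scarcity: a single training point
    print("plenty of data:", estimated_dimension(np.zeros(5), plenty, 0.3))
    print("single example:", estimated_dimension(scarce[0], scarce, 0.3))
```

In this toy setting, the count should recover the two tangent directions when data are plentiful and fall to zero when only one example remains. That is the kind of dimension loss the abstract describes, although the paper's own estimator and experiments may differ.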
