Losing dimensions: Geometric memorization in generative diffusion

arXiv:2410.08727v2 Announce Type: replace-cross
Abstract: Diffusion models power leading generative AI, but when and how they memorize training data, especially on low-dimensional manifolds, remains unclear. We find memorization emerges gradually, not abruptly: as data become scarce, diffusion models experience a smooth collapse where their capacity to vary across independent directions diminishes. Measuring latent dimensionality via the learned score field, we reveal how generative behavior increasingly centers on a few examples while other variations “freeze out”. We propose a geometric memorization theory, showing that salient features collapse first, then finer details, leading to near point-wise replication. This mirrors physical systems condensing into a few low-energy configurations. Our theoretical predictions align with both synthetic and real data, identifying geometric memorization as a distinct phase between generalization and exact copying.
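To make "measuring latent dimensionality via the learned score field" concrete, here is a minimal sketch of one common idea. It is not necessarily the estimator used in the paper. At small noise levels the score near a data manifold points mostly along the normal directions, so the number of large singular values of a stack of score vectors approximates the normal dimension. Everything here is an assumption made for illustration: the toy data (a standard Gaussian on a random linear subspace, whose score is known in closed form), the hypothetical helper names `gaussian_subspace_score` and `estimate_manifold_dim`, the centering step, and the largest-log-gap rule for splitting the spectrum.

```python
import numpy as np

def gaussian_subspace_score(x, basis, sigma):
    """Exact score of N(0, B B^T + sigma^2 I), where B has orthonormal columns.

    x has shape (n, D); basis has shape (D, d). This stands in for a
    learned score network (illustrative assumption).
    """
    tangent = x @ basis @ basis.T      # component inside the subspace
    normal = x - tangent               # component orthogonal to it
    return -tangent / (1.0 + sigma**2) - normal / sigma**2

def estimate_manifold_dim(score_fn, x0, sigma, n_probes=200, seed=0):
    """Estimate the local manifold dimension at x0 from score vectors."""
    rng = np.random.default_rng(seed)
    # Probe the score at small Gaussian perturbations of x0.
    probes = x0 + sigma * rng.standard_normal((n_probes, x0.size))
    scores = score_fn(probes)
    # Centering removes the shared drift along the manifold
    # (a choice made for this sketch).
    scores = scores - scores.mean(axis=0)
    s = np.linalg.svd(scores, compute_uv=False)   # sorted descending
    # The largest gap in the log-spectrum separates normal from tangent directions.
    gaps = np.log(s[:-1]) - np.log(s[1:])
    n_normal = int(np.argmax(gaps)) + 1
    return x0.size - n_normal, s

# Toy setup: a d-dimensional subspace embedded in D ambient dimensions.
rng = np.random.default_rng(1)
D, d, sigma = 10, 3, 1e-2
basis, _ = np.linalg.qr(rng.standard_normal((D, d)))   # (D, d), orthonormal columns
x0 = basis @ rng.standard_normal(d)                    # a point on the subspace

dim, spectrum = estimate_manifold_dim(
    lambda x: gaussian_subspace_score(x, basis, sigma), x0, sigma
)
print("estimated dimension:", dim)
```

In this toy setting the spectrum splits into D - d large singular values (normal directions) and d small ones (tangent directions). Read this way, the gradual loss of dimensions described in the abstract would appear as fewer and fewer small singular values as training data become scarce, though the paper's own analysis should be consulted for the precise definition it uses.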
