Interpretable Concept Evolution Flow in Diffusion Models

1University of North Texas 2University of California, San Diego

Intermediate decoding vs. our stepwise unconditional epsilon-perturbation. Top row: images obtained by decoding intermediate latents during standard sampling, which are dominated by noise and hard to interpret. Bottom row: our method, which perturbs a single intermediate step and then samples through to the final image, revealing step-specific concept changes. The example uses Stable Diffusion v1.5 with the Food-101 prompt template "a photo of apple pie, a type of food."

Abstract

We introduce stepwise unconditional epsilon-perturbation, a training-free single-step causal probe. At a chosen timestep, we set the unconditional noise prediction to zero, temporarily canceling the negative unconditional component of classifier-free guidance (CFG). Propagating this change through the remaining reverse dynamics exposes the generic structural priors that CFG attempts to suppress. By sweeping the perturbation timestep, we extract a human-readable concept evolution flow: an ordered sequence of visually explicit final samples that reveal how concepts appear, change, and stabilize. The flow lets us directly quantify and analyze its global semantic relevance and local concept traces with pre-trained vision models (CLIP and Grounded-SAM2), rather than probing complex intermediate latents. Extensive experiments reveal consistent coarse-to-fine emergence patterns and, interestingly, transient concept appearances that are otherwise hidden under the standard reverse process.
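The probe described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes the common classifier-free guidance form eps = eps_uncond + w * (eps_cond - eps_uncond), which the abstract does not spell out. The names `cfg_epsilon`, `sample`, `concept_evolution_flow`, `predict_cond`, `predict_uncond`, and `step_fn` are hypothetical, and latents are plain lists of floats so the sketch runs without any deep learning library.

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]
Predictor = Callable[[Vector, int], Vector]        # (latent, timestep) -> noise prediction
StepFn = Callable[[Vector, Vector, int], Vector]   # (latent, eps, timestep) -> next latent


def cfg_epsilon(eps_cond: Vector, eps_uncond: Vector,
                guidance_scale: float, zero_uncond: bool = False) -> Vector:
    """Combine predictions with the assumed CFG form u + w * (c - u).

    With zero_uncond=True the unconditional prediction is replaced by zeros,
    so the result reduces to w * c for this one call.
    """
    if zero_uncond:
        eps_uncond = [0.0] * len(eps_cond)
    return [u + guidance_scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


def sample(x_T: Vector, timesteps: Sequence[int],
           predict_cond: Predictor, predict_uncond: Predictor,
           step_fn: StepFn, guidance_scale: float,
           perturb_at: Optional[int] = None) -> Vector:
    """Run the reverse process, zeroing the unconditional term only at `perturb_at`."""
    x = list(x_T)
    for t in timesteps:
        eps = cfg_epsilon(predict_cond(x, t), predict_uncond(x, t),
                          guidance_scale, zero_uncond=(t == perturb_at))
        x = step_fn(x, eps, t)
    return x


def concept_evolution_flow(x_T: Vector, timesteps: Sequence[int],
                           predict_cond: Predictor, predict_uncond: Predictor,
                           step_fn: StepFn, guidance_scale: float) -> List[Vector]:
    """Sweep the perturbation timestep: one final sample per perturbed step."""
    return [sample(x_T, timesteps, predict_cond, predict_uncond,
                   step_fn, guidance_scale, perturb_at=t)
            for t in timesteps]
```

In a real pipeline, `predict_cond` and `predict_uncond` would presumably be denoiser calls with the prompt and empty-prompt embeddings, and `step_fn` a scheduler update. Every run in the sweep starts from the same initial latent `x_T`, so differences between final samples can be attributed to the single perturbed step.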

BibTeX

@misc{Liu2026EvolutionFlow,
      title={Interpretable Concept Evolution Flow in Diffusion Models}, 
      author={Yuxuan Liu and Dan Wang and Xinrui Cui},
      year={2026},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
}