Selective Aggregation of Attention Maps Improves Diffusion-Based Visual Interpretation
arXiv:2604.05906v1 Announce Type: new
Abstract: Numerous studies on text-to-image (T2I) generative models have utilized cross-attention maps to boost application performance and interpret model behavior. However, the distinct characteristics of attent…