cs.CV

Golden RPG: Confidence-Adaptive Region-Aware Noise for Compositional Text-to-Image Generation

arXiv:2604.25314v1 Announce Type: new
Abstract: Compositional text-to-image (T2I) generation requires a model to honour multiple sub-prompts that describe distinct image regions. Recent work shows that the starting noise of a diffusion model ca…