Circuit Mechanisms for Spatial Relation Generation in Diffusion Transformers
arXiv:2601.06338v2
Abstract: Diffusion Transformers (DiTs) have greatly advanced text-to-image generation, but models still struggle to generate the correct spatial relations between objects as specified in the text prompt…