When Do Diffusion Models Learn to Generate Multiple Objects?
arXiv:2605.00273v1 Announce Type: cross
Abstract: Text-to-image diffusion models achieve impressive visual fidelity, yet they remain unreliable in multi-object generation. Despite extensive empirical evidence of these failures, the underlying causes r…