cs.AI, cs.CV

ImAgent: A Unified Multimodal Agent Framework for Test-Time Scalable Image Generation

arXiv:2511.11483v4 Announce Type: replace
Abstract: Recent text-to-image (T2I) models have made remarkable progress in generating visually realistic and semantically coherent images. However, they still suffer from randomness and inconsistency with th…