cs.CV

A training-free framework for high-fidelity appearance transfer via diffusion transformers

arXiv:2603.26767v1 Announce Type: new
Abstract: Diffusion Transformers (DiTs) excel at generation, but their global self-attention makes controllable, reference-image-based editing a distinct challenge. Unlike U-Nets, naively injecting local appearanc…