The Thinking Pixel: Recursive Sparse Reasoning in Multimodal Diffusion Latents
arXiv:2604.25299v1 Announce Type: new
Abstract: Diffusion models have achieved success in high-fidelity data synthesis, yet their capacity for more complex, structured reasoning, such as text-following tasks, remains constrained. While advances in language…