cs.AI, cs.CV

Image Generators are Generalist Vision Learners

arXiv:2604.20329v2 Announce Type: replace-cross
Abstract: Recent works show that image and video generators exhibit zero-shot visual understanding behaviors, in a way reminiscent of how LLMs develop emergent capabilities of language understanding and …

cs.CL

Confidence Estimation for LLMs in Multi-turn Interactions

arXiv:2601.02179v2 Announce Type: replace
Abstract: While confidence estimation is a promising direction for mitigating hallucinations in Large Language Models (LLMs), current research overwhelmingly focuses on single-turn settings. The dynamics of mo…

cs.AI, cs.LG

MathAtlas: A Benchmark for Autoformalization in the Wild

arXiv:2605.14061v1 Announce Type: new
Abstract: Current autoformalization benchmarks are largely focused on olympiad or undergraduate mathematics, while graduate and research-level mathematics remains underexplored. In this paper, we introduce MathAtl…
