Author name: Itay Itzhak, Eliya Habba, Gabriel Stanovsky, Yonatan Belinkov

From Feelings to Metrics: Understanding and Formalizing How Users Vibe-Test LLMs

Itay Itzhak, Eliya Habba, Gabriel Stanovsky, Yonatan Belinkov / April 17, 2026

arXiv:2604.14137v2 Announce Type: replace-cross
Abstract: Evaluating LLMs is challenging, as benchmark scores often fail to capture models’ real-world usefulness. Instead, users often rely on “vibe-testing”: informal experience-based evaluation, suc…

cs.AI, cs.CL, cs.LG

Itay Itzhak, Eliya Habba, Gabriel Stanovsky, Yonatan Belinkov / April 16, 2026

arXiv:2604.14137v1 Announce Type: cross
Abstract: Evaluating LLMs is challenging, as benchmark scores often fail to capture models’ real-world usefulness. Instead, users often rely on “vibe-testing”: informal experience-based evaluation, such as com…