Zihao Yang, Mosh Levy, Yoav Goldberg, Byron C. Wallace

Compared to What? Baselines and Metrics for Counterfactual Prompting

Zihao Yang, Mosh Levy, Yoav Goldberg, Byron C. Wallace / May 5, 2026

arXiv:2605.01048v1
Abstract: Counterfactual prompting (i.e., perturbing a single factor and measuring output change) is widely used to evaluate things like LLM bias and CoT faithfulness. But in this work we argue that observed effec…
