cs.CL, cs.LG

Compared to What? Baselines and Metrics for Counterfactual Prompting

arXiv:2605.01048v1 Announce Type: new
Abstract: Counterfactual prompting (i.e., perturbing a single factor and measuring the resulting change in output) is widely used to evaluate properties such as LLM bias and CoT faithfulness. In this work, however, we argue that observed effec…