Ferdinand M. Schessl

The Autocorrelation Blind Spot: Why 42% of Turn-Level Findings in LLM Conversation Analysis May Be Spurious

Ferdinand M. Schessl / April 17, 2026

arXiv:2604.14414v1 Announce Type: new
Abstract: Turn-level metrics are widely used to evaluate properties of multi-turn human-LLM conversations, from safety and sycophancy to dialogue quality. However, consecutive turns within a conversation are not s…
