cs.CL, cs.LG

OCRR: A Benchmark for Online Correction Recovery under Distribution Shift

arXiv:2605.03153v1 Announce Type: new
Abstract: Static benchmarks measure a model frozen at training time. Real systems face distribution shift (new categories, paraphrased queries, drift) and must recover online via user corrections. No existing benc…