CRAFT: Counterfactual-to-Interactive Reinforcement Fine-Tuning for Driving Policies
arXiv:2605.04470v1 Announce Type: cross
Abstract: Open-loop imitation learning has advanced modern autonomous driving policy architectures, but closed-loop deployment remains vulnerable to policy-induced distribution shift. Existing post-training para…