cs.AI, cs.CL, cs.RO

Characterizing the Robustness of Black-Box LLM Planners Under Perturbed Observations with Adaptive Stress Testing

arXiv:2505.05665v4 Announce Type: replace-cross
Abstract: Large language models (LLMs) have recently demonstrated success in decision-making tasks, including planning, control, and prediction, but their tendency to hallucinate unsafe and undesired outp…