cs.AI

SAGE: A Service Agent Graph-guided Evaluation Benchmark

arXiv:2604.09285v1 Announce Type: new
Abstract: The development of Large Language Models (LLMs) has catalyzed automation in customer service, yet benchmarking their performance remains challenging. Existing benchmarks predominantly rely on static para…

cs.AI, cs.MA

Memory Intelligence Agent

arXiv:2604.04503v3 Announce Type: replace
Abstract: Deep research agents (DRAs) integrate LLM reasoning with external tools. Memory systems enable DRAs to leverage historical experiences, which are essential for efficient reasoning and autonomous evol…
