CalBench: Evaluating Coordination-Privacy Trade-offs in Multi-Agent LLMs
arXiv:2605.09823v1 Announce Type: cross
Abstract: We introduce CalBench, a controlled evaluation environment for studying multi-agent coordination through calendar scheduling. In CalBench, N agents each manage a private calendar containing pre-existin…