How do you benchmark structural properties of agent memory (isolation, context pollution, typed memory) beyond retrieval metrics? [D]
I'm working on open-source memory infrastructure for AI agents (CtxVault). It organizes agent memory into typed, isolated vaults rather than a single shared vector store. I've run standard retrieval benchmarks (BEIR, CoIR) comparing against …