LocalLLaMA

Benchmarked 4 agent memory systems: Mem0 scores 49% recall (worse than a coin flip), Zep uses 340x more tokens for a 15-point improvement. Here’s what’s actually going on.

I've been digging into how AI coding agents actually handle memory: not what the marketing says, but what the code and benchmarks show. Here's what I found.

TL;DR: Every agent memory system in 2026 is either too simple (can't search), too …