| I'm the author of YourMemory (disclosing upfront). I'm sharing benchmark results because I couldn't find reproducible comparisons for agent memory retrieval, specifically on multi-hop facts, where vector search has a known blind spot. The problem: How the retrieval stack works:
Benchmark results: LoCoMo-10 (1,534 QA pairs, 10 multi-session conversations)
LongMemEval-S (500 questions, ~53 haystack sessions each)
HotpotQA multi-hop (200 questions)
+14pp on bridge questions specifically. How are others handling bridge-type retrieval in long-running agents? Website: https://yourmemoryai.xyz/ |