Benchmarking an AI persistent memory server against connected memory

Retrieving only semantically similar memories with vector search is not sufficient to build a holistic context to feed to an LLM.

Most memory systems rely on pure vector search. While running an experiment, I found that working purely on semantics would never give me the complete picture; a hybrid approach of semantic search + entity graph is the way forward.
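Roughly, the hybrid idea can be sketched like this. It is a toy illustration under my own assumptions, not the YourMemory implementation: the function `hybrid_retrieve`, the `MEMORIES` layout, the 2-d vectors, and the entity sets are all hypothetical. Vector search picks the seed memories, then memories sharing an entity with a seed are pulled in as well.

```python
import math
from collections import defaultdict

# Hypothetical toy store: id -> embedding + extracted entities (all made up).
MEMORIES = {
    "m1": {"text": "Alice moved to Berlin", "vec": [1.0, 0.0], "entities": {"Alice", "Berlin"}},
    "m2": {"text": "Berlin office opens in March", "vec": [0.0, 1.0], "entities": {"Berlin"}},
    "m3": {"text": "Bob likes tea", "vec": [0.1, 0.9], "entities": {"Bob"}},
}

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_retrieve(query_vec, memories, top_k=1, max_extra=3):
    """Return top_k semantic hits plus up to max_extra entity-linked memories."""
    def score(mid):
        return cosine(query_vec, memories[mid]["vec"])

    # Step 1: plain vector search gives the seed memories.
    seeds = sorted(memories, key=score, reverse=True)[:top_k]

    # Step 2: build an entity -> memory ids index (the "entity graph").
    by_entity = defaultdict(set)
    for mid, mem in memories.items():
        for ent in mem["entities"]:
            by_entity[ent].add(mid)

    # Step 3: collect memories that share at least one entity with a seed.
    linked = set()
    for mid in seeds:
        for ent in memories[mid]["entities"]:
            linked |= by_entity[ent]
    linked -= set(seeds)

    # Keep the linked memories closest to the query, after the seeds.
    extra = sorted(linked, key=score, reverse=True)[:max_extra]
    return seeds + extra

query = [0.9, 0.1]  # stand-in embedding for "Where does Alice live?"
print(hybrid_retrieve(query, MEMORIES, top_k=1))  # ['m1', 'm2']
```

In this toy data, pure vector search with the same query ranks `m3` above `m2`, even though `m2` is the one connected to the seed through the `Berlin` entity. That is the gap the graph step is meant to cover.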

I tested the hybrid approach (semantic search + entity graph) on the following benchmarks:

LoCoMo-10 (1,534 QA pairs, 10 multi-session conversations)
scored 59%

LongMemEval-S (500 questions, ~53 haystack sessions each)
scored 84.8% on top-5 retrieval
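For anyone unfamiliar with top-5 retrieval scoring, here is a minimal sketch of one common way to compute it: a question counts as a hit if any gold session shows up in the first 5 results. Whether the benchmark scripts use exactly this definition is an assumption on my part; `hit_at_k`, `hit_rate`, and the `RUNS` data are hypothetical names and values.

```python
def hit_at_k(retrieved_ids, gold_ids, k=5):
    """True if any gold id appears among the first k retrieved ids."""
    return any(r in gold_ids for r in retrieved_ids[:k])

def hit_rate(runs, k=5):
    """Fraction of (retrieved, gold) pairs that are hits at cutoff k."""
    return sum(hit_at_k(retrieved, gold, k) for retrieved, gold in runs) / len(runs)

# Made-up example: (ranked retrieved session ids, gold session ids) per question.
RUNS = [
    (["s3", "s9", "s1"], {"s1"}),                    # hit at rank 3
    (["s4", "s2", "s8"], {"s7"}),                    # gold never retrieved
    (["s5", "s6", "s7", "s8", "s9", "s1"], {"s1"}),  # gold at rank 6, outside top 5
    (["s2"], {"s2", "s3"}),                          # hit at rank 1
]

print(hit_rate(RUNS, k=5))  # 0.5
```

Under that reading, 84.8% of 500 questions would be 424 questions with a gold session in the top 5.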

And the most interesting one is a benchmark that the current memory tool ecosystem misses:

HotpotQA multi-hop (200 questions)
This one tests capturing connected memories: scored 71.5%, counting a question only when it found all the connected memories.
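The "found all the connected memories" criterion is stricter than a plain hit rate, since a question only counts when every supporting item is retrieved. A minimal sketch of that kind of metric, assuming each question has a set of supporting ids (`all_found`, `full_chain_rate`, and the `EXAMPLES` data are hypothetical, not taken from the benchmark scripts):

```python
def all_found(retrieved_ids, supporting_ids):
    """True only if every supporting id is present in the retrieved ids."""
    return set(supporting_ids) <= set(retrieved_ids)

def full_chain_rate(examples):
    """Fraction of questions where the full set of supporting ids was retrieved."""
    return sum(all_found(retrieved, support) for retrieved, support in examples) / len(examples)

# Made-up example: (retrieved ids, supporting ids) per multi-hop question.
EXAMPLES = [
    (["p1", "p2", "p3"], {"p1", "p3"}),  # both hops found
    (["p1", "p4"], {"p1", "p3"}),        # second hop missing, does not count
    (["p7", "p8"], {"p7"}),              # single supporting item found
    (["p9"], {"p5", "p6"}),              # nothing found
]

print(full_chain_rate(EXAMPLES))  # 0.5
```

With 200 questions, 71.5% corresponds to 143 questions where the whole chain was retrieved.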

Benchmark scripts: https://github.com/sachitrafa/YourMemory/tree/main/benchmarks

Memory Graph

submitted by /u/Sufficient_Sir_5414