Memori Labs Outperforms Every Other Memory System with 81.95% Accuracy at 4.98% the Cost of Full Context
PR Newswire
SAN FRANCISCO, March 20, 2026
Memori Labs today announced the release of a benchmark paper, "Memori: A Persistent Memory Layer for Efficient, Context-Aware LLM Agents," demonstrating that its structured memory architecture outperforms every other retrieval-based memory system on the LoCoMo long-conversation memory benchmark. Memori achieved 81.95% overall accuracy on LoCoMo, surpassing Zep (78.94%), LangMem (78.05%), and Mem0 (62.47%). Crucially, Memori accomplished this while maintaining a context footprint of only 4.98% of the cost of full conversation context, using an average of just 1,294 tokens per query. This represents more than 20x lower context cost than full-context prompting. Memori Labs addresses the problem of "context rot" by treating memory as a data structuring problem, converting raw conversational history into high-signal memory assets, semantic triples and session summaries, via its Advanced Augmentation pipeline. This approach gives production AI agents both accurate recall and manageable context costs, translating into lower inference spend and better cross-session continuity for AI builders. The full paper is available for review at www.memorilabs.ai/benchmark.
SAN FRANCISCO, March 20, 2026 /PRNewswire-PRWeb/ -- Memori Labs today announced the release of a new benchmark paper, Memori: A Persistent Memory Layer for Efficient, Context-Aware LLM Agents, detailing how the company's structured memory architecture performs on the LoCoMo long-conversation memory benchmark.
Memori achieved 81.95% overall accuracy, outperforming every other retrieval-based memory system tested - including Zep (78.94%), LangMem (78.05%), and Mem0 (62.47%) - while using an average of just 1,294 tokens per query, or 4.98% of the cost of full conversation context. The paper also reports 67% fewer context tokens than Zep and more than 20x lower context cost than full-context prompting.
The paper evaluates Memori's Advanced Augmentation pipeline, which converts raw conversational history into structured memory assets - semantic triples for factual recall and session summaries for narrative context. The benchmark tests whether AI systems can retain and reason over facts, preferences, and temporal changes across long, noisy, multi-session conversations.
"Production AI agents need LLM-agnostic memory, but they also need to preserve unit economics," said Adam B. Struck, CEO and Co-Founder of Memori Labs. "This paper shows that Memori Labs outperforms competing approaches on accuracy while using a fraction of the tokens. Teams shouldn't have to choose between accurate recall and manageable costs."
Full Context Does Not Scale
As AI agents move into production, teams are hitting the limits of prompt-history-based memory. Passing full chat history into the model increases token costs, adds irrelevant conversational noise, and makes it harder for models to consistently use the most important information. This problem, known as context rot, means that relevant information is present but not effectively utilized.
Memori approaches this as a data structuring problem rather than a storage problem. Instead of retrieving large blocks of raw conversation, Memori extracts compact factual and narrative representations that are easier to search and cheaper to inject into prompts.
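To make the idea concrete, here is a minimal sketch of what storing conversational facts as semantic triples (subject, predicate, object) might look like. All names and structures below are illustrative assumptions for exposition, not Memori's actual API or schema.

```python
# Illustrative sketch: storing conversational facts as compact
# (subject, predicate, object) triples instead of raw transcripts.
# Hypothetical names -- not Memori's actual API.
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str


# A user turn like "I moved from Austin to Seattle last spring"
# might be distilled into a few searchable facts:
memory = [
    Triple("user", "previously_lived_in", "Austin"),
    Triple("user", "lives_in", "Seattle"),
    Triple("user", "moved", "last spring"),
]


def recall(facts: list[Triple], predicate: str) -> list[str]:
    """Return the objects of all stored triples matching a predicate."""
    return [t.obj for t in facts if t.predicate == predicate]


print(recall(memory, "lives_in"))  # -> ['Seattle']
```

The point of the structure is that a query only pulls the few matching facts into the prompt, rather than the entire turn of conversation they came from.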
What the Paper Evaluated
The paper uses the LoCoMo benchmark, a framework designed to test long-conversation memory, state tracking, temporal reasoning, and subtle preference recall across multi-session chat histories. Memori was evaluated across four reasoning categories:
- Single-hop reasoning (direct fact retrieval): Memori scored 87.87%
- Multi-hop reasoning (connecting disparate facts): Memori scored 72.70%
- Temporal reasoning (tracking changes over time): Memori scored 80.37%
- Open-domain reasoning (broad synthesis): Memori scored 63.54%
For the evaluation, Memori processed LoCoMo conversations through its Advanced Augmentation pipeline to generate semantic triples for precise fact recall and session summaries for timeline and narrative context. The resulting memory assets were embedded, indexed, and retrieved through a hybrid retrieval flow before being used to answer benchmark questions.
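The paper does not publish the retrieval internals, but "hybrid retrieval" commonly means blending a semantic (vector-similarity) score with a lexical (keyword-overlap) score. The sketch below shows that general pattern under stated assumptions; the weighting and scoring functions are illustrative, not Memori's implementation.

```python
# Hedged sketch of a hybrid retrieval score: a weighted blend of
# vector cosine similarity and keyword overlap. This illustrates the
# general technique only, not Memori's actual retrieval flow.
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def keyword_overlap(query: str, doc: str) -> float:
    """Fraction of query words that also appear in the document."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def hybrid_score(q_vec, d_vec, query, doc, alpha=0.7):
    """Blend semantic and lexical signals; alpha weights the semantic side."""
    return alpha * cosine(q_vec, d_vec) + (1 - alpha) * keyword_overlap(query, doc)
```

A memory asset that matches the query both semantically and lexically scores highest, which is why hybrid schemes tend to beat either signal alone on noisy multi-session histories.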
Key Results
- 81.95% overall accuracy on LoCoMo, the highest among retrieval-based systems tested
- 1,294 average tokens added to context per query
- 4.98% context footprint relative to full-context prompting
- 67% fewer context tokens than Zep
- More than 20x lower context cost than full-context prompting
- Outperforms Zep, LangMem, and Mem0 on overall accuracy
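The footprint and cost-reduction figures above are mutually consistent, as a quick back-of-envelope check shows:

```python
# Sanity check on the reported numbers: if 1,294 tokens is 4.98% of
# the full-context cost, the implied average full context is roughly
# 26,000 tokens, and the cost reduction is roughly 20x.
avg_tokens = 1294
footprint = 0.0498                      # 4.98% of full-context cost

implied_full_context = avg_tokens / footprint
cost_reduction = 1 / footprint

print(round(implied_full_context))      # ~25,984 tokens
print(round(cost_reduction, 1))         # ~20.1x, matching "more than 20x"
```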
Why It Matters for AI Builders
The release of the paper reinforces Memori Labs' core product thesis: durable AI memory should be compact, structured, and retrieval-friendly.
For builders of AI agents, copilots, and multi-session applications, that translates into:
- lower inference costs
- better continuity across sessions
- stronger recall of facts and preferences
- reduced dependence on oversized context windows
Availability
The full paper is available at www.memorilabs.ai/benchmark. Memori's memory platform is available through the Company's Memori Cloud offering.
About Memori Labs
Memori Labs builds SQL-native memory infrastructure for LLM applications, agents, and copilots. The platform continuously captures interactions, extracts structured knowledge, and intelligently retrieves relevant memory, giving AI systems persistent context across every session. Memori offers both Memori Cloud (fully managed) and flexible enterprise deployment options including BYODB, VPC, and on-premises configurations.
Media Contact
Amandeep Sandhu, Memori Labs, 1 4157137321, hello@memorilabs.ai, https://memorilabs.ai/
View original content: https://www.prweb.com/releases/memori-labs-outperforms-every-other-memory-system-with-81-95-accuracy-at-4-98-the-cost-of-full-context-302719406.html
SOURCE Memori Labs
