MongoDB targets AI’s retrieval problem

For all their technical capabilities, large language models (LLMs) still have a memory problem. They often cannot retain context across conversations, and they lack built-in mechanisms for accessing relevant data, which ultimately makes their results unreliable and untrustworthy.

NoSQL database pioneer MongoDB is taking on this problem, releasing new persistent memory, retrieval, embedding, and re-ranking features, all integrated into one platform. The company is also introducing new security connectivity, open-source plugins, and other framework integrations to support agentic AI workloads.

Supporting agentic memory

“Unlocking the power of agents requires memory,” Pete Johnson, MongoDB’s field CTO of AI, said during a press briefing. “Just like human memory, a good agentic memory organizes knowledge. It helps agents retrieve the right knowledge based on context and learn to make smarter decisions and take optimized actions over time.”

To advance automated retrieval and persistent agent memory, the company is adding Automated Voyage AI Embeddings in MongoDB Vector Search. The capability is now available in public preview.
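The article doesn't show query syntax, but as a rough sketch, an Atlas vector search is expressed as a `$vectorSearch` aggregation stage. The index and field names below are illustrative, not from the announcement; with automated Voyage AI embeddings, generating the query vector would be handled by the platform rather than by client code.

```python
# Sketch of a MongoDB Atlas $vectorSearch aggregation stage, built as a plain
# dict. Index and field names ("vector_index", "embedding") are hypothetical.
def vector_search_stage(query_vector, index="vector_index", path="embedding",
                        num_candidates=100, limit=5):
    """Return a $vectorSearch stage: widen to num_candidates approximate
    matches, then return the top `limit` results."""
    return {
        "$vectorSearch": {
            "index": index,
            "path": path,
            "queryVector": query_vector,
            "numCandidates": num_candidates,
            "limit": limit,
        }
    }

# In a real app this pipeline would be passed to collection.aggregate(...).
pipeline = [vector_search_stage([0.12, -0.07, 0.33])]
```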

Fragmented AI stacks present another challenge. As builders grapple with them, they are often stuck paying what Ben Cefalo, MongoDB’s CPO, called the “synchronization tax.” To make data agent-searchable, developers must stitch together vector search, operational data stores, embedding models, and caches, then build complex data pipelines that keep everything in sync across systems.

But by natively integrating Voyage AI into Atlas, MongoDB has turned a “multi-week engineering project into a two-minute configuration,” Cefalo claimed. Developers can ship reliable, trustworthy agents much more quickly and easily, and “without all the complex data plumbing.”

Dovetailing with this, the company is announcing the general availability of a LangGraph.js Long-Term Memory Store. Cefalo pointed out that JavaScript and TypeScript users comprise the world’s largest builder communities, but the company’s existing Python integration formerly limited these groups to short-term, single-threaded context.

Now, they can use MongoDB to give agents persistent long-term memory so they retain preferences and interaction history across conversations “on the data pipeline they already trust.” This underscores MongoDB’s longstanding “run anywhere” strategy, Cefalo explained.
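The article doesn't detail the store's API, but long-term memory stores of this kind typically organize values under a namespace (for example, per user) and a key. As an illustrative stand-in, the in-memory class below mirrors that shape; a MongoDB-backed store would persist the same data across sessions instead of holding it in a dict.

```python
# Toy long-term memory store keyed by (namespace, key). This is an
# illustrative sketch, not the LangGraph.js API: a real store would persist
# to MongoDB so preferences survive across conversations.
class MemoryStore:
    def __init__(self):
        self._data = {}

    def put(self, namespace, key, value):
        """Write a memory under (namespace, key); namespace is a tuple."""
        self._data[(namespace, key)] = value

    def get(self, namespace, key):
        """Read a memory back, or None if it was never stored."""
        return self._data.get((namespace, key))

store = MemoryStore()
# e.g. remember a user's preferred answer style between sessions
store.put(("user-42", "preferences"), "tone", {"style": "concise"})
```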

Introducing embedding and re-ranking

Agents must be able to retrieve information based on context, learn from and optimize that process, and minimize LLM token use, Johnson pointed out, because without consistent, high-accuracy retrieval, users lose trust.

Most users incorrectly blame this lack of trust on the LLM, and “the instinct is to upgrade to the latest, most expensive model,” he said. Ultimately, though, it’s a retrieval problem: Models can only act on the information they are given. If data is inaccurate, out of date, or lacking context, the output will be wrong, leading to potentially “disastrous” business consequences.

“That’s exactly the sentiment that we hear from customers: They’re excited about AI agents, but they’re nervous about putting them in front of their customers if the results are inconsistent, irrelevant or flat out wrong,” Johnson said.

The solution is getting the LLM the right information upfront; this is where embedding and re-ranking models come in. MongoDB has been integrating these technologies into its Voyage 4 family of models, building on the company’s acquisition of Voyage AI in February 2025.

As Johnson explained it, embedding models convert unstructured information like PDFs, images, videos, and audio into vectors, which capture and map data meaning and group related data. “That’s how you can get semantic-style searching for things that aren’t exact keyword search.”
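The mechanics Johnson describes can be shown with a toy example: documents become vectors, and "semantic" retrieval means finding the vector closest to the query's, typically by cosine similarity. The three-dimensional vectors below are made up for illustration; real embedding models emit hundreds or thousands of dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 for identical directions, 0.0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (hypothetical values for illustration).
docs = {
    "invoice pdf": [0.9, 0.1, 0.0],
    "cat video":   [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # e.g. an embedding of "billing document"

# The nearest vector wins, with no keyword overlap required.
best = max(docs, key=lambda d: cosine(query, docs[d]))
```

Here `best` is `"invoice pdf"`, even though the query shares no words with it: the vectors, not the keywords, group related meaning together.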

Re-rankers take this a step further. After results are retrieved by the embedding model, re-rankers compare them to the user’s query. This provides more relevant, grounded responses. “Think of the embeddings as a wide net, and the re-ranker hand picks the best fish out of it,” Johnson explained.
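The "wide net, then hand-pick" pattern can be sketched as two stages: a cheap retriever scores the whole corpus and keeps the top candidates, then a finer-grained re-ranker reorders only those candidates. The scoring functions below are deliberately simplistic placeholders; real systems use vector search for the first stage and a cross-encoder model for the second.

```python
def retrieve(query, corpus, k=3):
    # Wide net: cheap word-overlap score applied to every document.
    score = lambda doc: len(set(query.split()) & set(doc.split()))
    return sorted(corpus, key=score, reverse=True)[:k]

def rerank(query, candidates):
    # Hand-pick: a finer (still toy) score applied only to the candidates.
    # A real re-ranker would run a model over each (query, doc) pair.
    score = lambda doc: sum(doc.count(w) for w in query.split())
    return sorted(candidates, key=score, reverse=True)

corpus = [
    "mongodb vector search guide",
    "mongodb atlas pricing",
    "vector search vector index tips",
    "cooking pasta",
]
hits = retrieve("vector search", corpus)
ranked = rerank("vector search", hits)
```

Note that the re-ranker can promote a candidate the first stage ranked lower: here `ranked[0]` is `"vector search vector index tips"`, which the cruder overlap score had tied with another document.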

Both embedding and re-ranking capabilities are natively integrated into MongoDB, so enterprise customers don’t have to switch between vendors and end up “Frankensteining a stack that creates an operational headache,” he said.

Johnson also underscored the fact that the decisions technical leaders make about their data platform now will either accelerate their AI development or delay it by months or years. “This isn’t a question for the future, it’s a question for today, because the success of that development depends on the data platform they’re working with,” he said.

Database enhancements and new integrations

In addition to offering new memory capabilities, MongoDB is strengthening its data foundation. The latest version of its database, MongoDB 8.3, is now generally available, and represents a “deep architectural hardening” of its core offering to support faster AI workloads at lower cost, Cefalo explained.

Query expressions (instructions for retrieving and organizing data) are integrated natively into MongoDB, so developers don’t have to rely on external toolboxes; transformation logic stays inside the database. “It’s SQL-style data transformations for data engineers,” said Cefalo.
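The article doesn't show the new syntax, but "SQL-style data transformations" in MongoDB have long taken the form of aggregation pipelines, where stages like `$match`, `$group`, and `$sort` play the roles of WHERE, GROUP BY, and ORDER BY. The collection and field names below are illustrative.

```python
# A standard MongoDB aggregation pipeline expressed as plain dicts.
# Collection ("orders") and field names are hypothetical examples.
pipeline = [
    {"$match": {"status": "complete"}},          # SQL: WHERE status = 'complete'
    {"$group": {                                 # SQL: GROUP BY region
        "_id": "$region",
        "revenue": {"$sum": "$amount"},          #      SUM(amount) AS revenue
    }},
    {"$sort": {"revenue": -1}},                  # SQL: ORDER BY revenue DESC
]
# db.orders.aggregate(pipeline) would execute this server-side, keeping the
# transformation logic inside the database rather than in application code.
```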

Further, MongoDB is announcing an Atlas integration with Feast, the widely adopted open-source feature store that provides AI and LLM apps with structured data during training and inference. This means machine learning (ML) teams can operate without playing a “high-stakes game of database musical chairs” that requires them to move data from their primary training database to a separate system for real-time inferencing, said Cefalo.

“This database sprawl doesn’t just create operational overhead, it creates drift, where the model trains on one version of reality but makes predictions on another,” he said. This can be complex and expensive, and a hurdle to scaling AI.

Finally, to support security and compliance, MongoDB is providing cross-region connectivity to MongoDB Atlas via AWS PrivateLink, which supports connectivity between AWS services, virtual private clouds (VPCs), and on-premises networks without exposing traffic to the public internet. This integration, now generally available, provides a “single, auditable model” that simplifies compliance and maintains a strong security posture for organizations operating across multiple regions, Cefalo explained.
