How Metal AI Built an Agent-Driven Intelligence Platform on Vespa Cloud
How Metal built an AI-native intelligence platform on Vespa Cloud, where 95% of retrieval is handled by AI agents.
Documents are embedded once, so spending on a high-quality embedding model there pays off. Queries, by contrast, incur embedding cost on every request, and that is what drives spend at scale. This asymmetry shapes the retrieval setup Metal built with Voyage AI embeddings and Vespa, with real numbers and real configuration behind it.
Retrieval-Augmented Generation (RAG) allows an LLM to answer questions using your data at query time. On their own, LLMs are powerful but limited: they can hallucinate, they have a fixed knowledge cutoff, and they know nothing about your private documents, internal wikis, or proprietary systems.
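To make the RAG flow concrete, here is a minimal sketch of the retrieve-then-prompt step. It uses a toy bag-of-words similarity purely for illustration; a real deployment would embed with a model such as Voyage AI and retrieve from Vespa. All function names and the sample corpus are hypothetical.

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would call an embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rag_prompt(question: str, corpus: list[str], k: int = 2) -> str:
    # Rank documents by similarity to the question, then splice the
    # top-k into the prompt so the LLM answers from your data.
    q = embed(question)
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)
    context = "\n".join(ranked[:k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "Vespa serves vector and text search at scale.",
    "Our internal wiki documents the deployment process.",
    "The cafeteria menu changes weekly.",
]
print(rag_prompt("How does Vespa handle vector search?", docs))
```

The key point is that retrieval happens at query time, so answers can reflect private documents and data newer than the model's knowledge cutoff.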
Tensor-based retrieval preserves context across queries, maintaining an agent's "chain of thought", and lets a single rank expression weigh multiple relevance signals simultaneously.
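As a sketch of what "weighing multiple signals simultaneously" means, the snippet below combines semantic similarity, text match, and freshness into one score, in the spirit of a first-phase ranking function in a Vespa rank profile. The weights and document scores are illustrative, not tuned values from the source.

```python
def hybrid_score(semantic: float, text: float, freshness: float,
                 w_sem: float = 0.6, w_text: float = 0.3,
                 w_fresh: float = 0.1) -> float:
    # Weighted combination of several relevance signals in one ranking
    # expression; weights here are hypothetical, not production values.
    return w_sem * semantic + w_text * text + w_fresh * freshness

candidates = {
    "doc-a": hybrid_score(semantic=0.91, text=0.40, freshness=0.2),
    "doc-b": hybrid_score(semantic=0.55, text=0.95, freshness=0.9),
}
best = max(candidates, key=candidates.get)
print(best, round(candidates[best], 3))
```

Because all signals contribute to a single expression, a document that is merely decent on each factor can outrank one that excels on only one, which is exactly the trade-off a multi-signal rank profile is meant to capture.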