I've been pretty unsatisfied with the web search options for local LLM/RAG systems. Most setups rely either on paid APIs like Brave or on metasearch scrapers like SearXNG. So I built LLMSearchIndex, a Python library for fully local, internet-scale search. It uses a custom-trained, highly compressed search index that contains most of the webpages from FineWeb plus Wikipedia. The full index is only ~2GB and runs locally on most hardware with pretty fast retrieval speeds. The library makes it easy to retrieve these results as RAG context. You can also check out a demo here: https://zakerytclarke-llmsearchindex.hf.space/
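
For anyone wondering what "results as RAG context" could look like in practice, here's a rough sketch. It does not use the actual LLMSearchIndex API, which isn't shown in this post. The result shape (`title`, `url`, `text`) and the helpers `build_rag_context` and `build_prompt` are hypothetical, and the sample results are placeholders standing in for whatever the local index returns.

```python
from typing import Dict, List


def build_rag_context(results: List[Dict[str, str]], max_chars: int = 2000) -> str:
    """Join retrieved passages into a numbered context block within a character budget."""
    parts: List[str] = []
    used = 0
    for i, r in enumerate(results, start=1):
        entry = f"[{i}] {r['title']} ({r['url']})\n{r['text']}"
        # Stop once adding the next passage would exceed the budget.
        if used + len(entry) > max_chars:
            break
        parts.append(entry)
        used += len(entry) + 2  # account for the blank-line separator
    return "\n\n".join(parts)


def build_prompt(question: str, results: List[Dict[str, str]], max_chars: int = 2000) -> str:
    """Wrap the numbered sources and the user question into a single prompt string."""
    context = build_rag_context(results, max_chars)
    return f"Answer using only the sources below.\n\n{context}\n\nQuestion: {question}"


# Hypothetical results; a real setup would get these from the local search index.
sample_results = [
    {
        "title": "FineWeb",
        "url": "https://example.com/fineweb",
        "text": "FineWeb is a large web text dataset.",
    },
    {
        "title": "Wikipedia",
        "url": "https://example.com/wikipedia",
        "text": "Wikipedia is a free online encyclopedia.",
    },
]

prompt = build_prompt("What is FineWeb?", sample_results)
print(prompt)
```

The character budget is just a stand-in for a token budget; with a real tokenizer you'd count tokens instead so the context reliably fits the model's window.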