We tried vectors, ASTs, and brute-force context stuffing for code retrieval. Graphs with LLM-generated semantics worked best. Here’s what we learned.
We spent the last year building a code indexing system and tried pretty much every retrieval approach along the way. Sharing what actually worked and what didn't, because the discourse around "just use embeddings" or "just use Tree-si…