
Here is a problem you have probably run into without having a name for it. You spend an hour having a brilliant conversation with ChatGPT or Claude. You feed it a research paper, ask sharp questions, get sharp answers. The AI synthesizes connections you had not seen before. You close the tab feeling like you finally understand the topic.
Two weeks later, you open a fresh chat and ask a follow-up question.
The AI has no idea what you are talking about. It starts from zero. All that work, all those connections — gone. You are back to uploading the paper, re-explaining the context, re-discovering the same insights.
That is not a bug in the product. It is a fundamental design choice. And in early April 2026, one of the most respected researchers in AI quietly published a two-page document on GitHub that proposes a very different approach [1].
His name is Andrej Karpathy. He co-founded OpenAI, built Tesla’s Autopilot AI team from scratch, and has a gift for explaining complex ideas simply. When he shares something he finds useful, people pay attention. The document — a casual “idea file” he calls llm-wiki.md — has since gathered over 5,000 stars on GitHub and sparked dozens of independent implementations within two weeks of being published [1].
The idea it describes is called an LLM Wiki.
The Library That Never Forgets
To understand why the LLM Wiki matters, it helps to understand how most AI-powered knowledge tools work today.
Imagine you have a library of 200 research papers, articles, and notes. You want to ask the AI a question that requires connecting ideas across five of them. In a standard system — the kind used by tools like NotebookLM or basic ChatGPT file uploads — the AI picks up the five most relevant pieces, reads them quickly, answers your question, and then puts everything back on the shelf. Next time you ask something related, it repeats the whole process. The library looks exactly the same as the day you first walked in.
This approach is called RAG — Retrieval-Augmented Generation — and it works well enough for simple lookups. But it has a ceiling. The AI is always a tourist in your library. It never becomes the librarian.
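To make the "tourist" behavior concrete, here is a toy sketch of the retrieve-and-forget loop. The word-overlap scoring is a deliberate simplification (production RAG systems use vector embeddings), and the documents are stand-ins; the point is that nothing about the library changes after the answer is produced.

```python
from collections import Counter

def tokenize(text):
    # Lowercase bag-of-words; real systems use embeddings, not word overlap.
    return Counter(text.lower().split())

def retrieve(query, documents, k=5):
    """Score each document by word overlap with the query and return the top k.

    This is the 'tourist' step: nothing about the library changes afterwards.
    """
    q = tokenize(query)
    scored = sorted(
        documents,
        key=lambda doc: sum((q & tokenize(doc)).values()),
        reverse=True,
    )
    return scored[:k]

library = [
    "Paper A: memory systems for LLM agents",
    "Paper B: retrieval-augmented generation benchmarks",
    "Note C: grocery list",
]
top = retrieve("How do agents remember things?", library, k=2)
# The top documents go into the prompt; after the answer, everything is discarded.
```

Every query repeats this scoring from scratch, which is exactly the ceiling the LLM Wiki is designed to break through.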
The LLM Wiki flips this model. Instead of retrieving from your raw documents on every query, the AI studies your documents and builds its own organized notes — a living wiki of markdown files that gets richer every time you add something new. Karpathy’s analogy is precise: this is the difference between a compiler and an interpreter.
When a programmer writes code, they do not run the source files directly every time. They compile the source into an optimized artifact once, and then run the artifact. The compilation is expensive, but it pays for itself across every subsequent use. The LLM Wiki works the same way. Compile your sources into a structured knowledge base once. Query the artifact. The AI is no longer a tourist — it is the author of the library [1].
What This Actually Looks Like in Practice
Say you are a researcher spending three months reading everything published on a niche topic — let us say AI memory systems for agents. You read 40 papers and 20 blog posts, and watch 10 talks. In a standard workflow, you end up with 70 browser bookmarks, some highlights scattered in a PDF reader, and increasingly vague memories of which paper said what.

In an LLM Wiki workflow, every time you add a source, the AI does not just store it. It reads it, writes a structured summary page, and then updates its existing notes wherever the new source agrees with, contradicts, or extends what it already knows. By the time you have ingested 30 sources, you have a dense, interlinked network of notes — concept pages, entity pages, a running index, a log of everything the system has learned — that the AI has authored and maintains. You never wrote a single note yourself.
Karpathy’s own research wiki, built this way, contains roughly 100 articles and around 400,000 words — the equivalent of reading five novels cover to cover [2]. He did not write any of it. The AI did the writing while he did the reading and the thinking.
When he wants to ask a question, the AI is not starting from scratch. It is consulting its own notes. The answer is already partially assembled.
The Three Pieces That Make It Work
There is no complex infrastructure here. The system has three parts, and you do not need to be a developer to understand why each one matters.
The raw sources. This is your pile of original material — papers, articles, transcripts, notes. You drop them into a folder and never touch them again. They are your source of truth. If the AI ever makes a mistake in its notes, you can always go back to the original and correct it. Think of this as the books on your shelf.
The wiki. This is the AI’s notebook. A collection of structured text files — one file per concept, one file per important entity, one file per key topic — that the AI writes and maintains as it processes your sources. You only read this. The AI only writes it. That strict division is what keeps the system coherent over time. Think of this as the AI’s handwritten study guide, built from reading all your books.
The schema. This is the instruction manual you give the AI at the start of every session. It tells the AI exactly how the notebook is organized, what to do when a new source arrives, how to flag contradictions, and how to answer questions. Without this document, the AI is a generic chatbot. With it, it is a disciplined researcher who follows the same methodology every time. Think of this as the house rules you post on the fridge for a new flatmate — once they read it, they know exactly how this household runs.
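To make the division of labor concrete, here is a minimal sketch of how the three pieces might meet at the start of a session. The file layout — a `schema.md` file and a `wiki/` folder of markdown pages — is an illustrative assumption, not the exact structure from Karpathy's gist:

```python
from pathlib import Path

def build_session_context(root: str) -> str:
    """Assemble the prompt prefix for a new session: house rules first, then
    the AI's own notes. The raw sources are deliberately absent -- they are
    only consulted when something in the wiki needs correcting.
    """
    base = Path(root)
    schema = (base / "schema.md").read_text()           # the house rules
    wiki_pages = sorted((base / "wiki").glob("*.md"))   # the AI's notebook
    notes = "\n\n".join(p.read_text() for p in wiki_pages)
    return f"{schema}\n\n--- WIKI ---\n\n{notes}"
```

The schema always comes first so the model reads its methodology before it reads its notes; the raw source folder never enters the prompt at all.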
The Three Things You Actually Do
Once the system is set up, your entire workflow with it reduces to three verbs.
Ingest. You find a good article or paper, save it, and tell the AI to process it. The AI reads the source, writes a summary, and updates every relevant page in the wiki that the new source touches. A single article typically ripples across 10 to 15 existing pages — updating the concept that the article extends, flagging the entity page where it provides new data, noting the contradiction with a claim the wiki had previously treated as settled [1]. The more the wiki grows, the richer each new ingest becomes.
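A hedged sketch of what one ingest step might look like. The `llm` argument is a stand-in for whatever model call you use, and the prompts and file layout are assumptions for illustration, not Karpathy's actual implementation:

```python
from pathlib import Path

def ingest(source_path: str, wiki_dir: str, llm) -> list[str]:
    """Process one new source: summarize it, then revise every wiki page
    it touches.

    `llm` is any callable taking a prompt string and returning text -- a
    stand-in for a real model call. Returns the names of the changed pages.
    """
    source = Path(source_path).read_text()
    wiki = Path(wiki_dir)

    # 1. Write a structured summary page for the new source.
    summary = llm(f"Summarize this source as a wiki page:\n\n{source}")
    summary_name = f"{Path(source_path).stem}.md"
    (wiki / summary_name).write_text(summary)

    touched = [summary_name]
    # 2. Ripple the update through existing pages: extend, confirm, or
    #    flag contradictions, exactly as the schema instructs.
    for page in wiki.glob("*.md"):
        if page.name in touched:
            continue
        revised = llm(
            "Revise this wiki page in light of the new source. "
            "Flag contradictions explicitly.\n\n"
            f"PAGE:\n{page.read_text()}\n\nNEW SOURCE SUMMARY:\n{summary}"
        )
        if revised != page.read_text():
            page.write_text(revised)
            touched.append(page.name)
    return touched
```

Passing the model in as a callable keeps the mechanics testable; in practice you would also batch the page revisions rather than revise every page on every ingest.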
Query. You ask a question. The AI reads its own notes and synthesizes a grounded answer. The important thing here — and this is one of Karpathy’s sharper insights — is that a good answer should be saved back into the wiki as a new page. An analysis you asked for, a comparison table, a connection you discovered — these should not disappear into chat history. Filing them back into the wiki means your questions make the system smarter, not just your sources.
Lint. Every few weeks, you ask the AI to run a health check on its own notes. It looks for pages that contradict each other, claims that newer sources have rendered outdated, and concepts that are mentioned frequently but never got their own dedicated page. It also suggests gaps in your research — topics you should read about next, questions worth investigating. The system audits itself and tells you what it does not know.
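The "concepts mentioned frequently but never got their own page" check is the easiest part of lint to mechanize. A toy sketch, with a crude capitalized-word heuristic standing in for real concept extraction (contradiction-hunting would need the LLM itself, not a script):

```python
import re
from collections import Counter
from pathlib import Path

def lint_orphans(wiki_dir: str, threshold: int = 3) -> list[str]:
    """Find terms the wiki mentions repeatedly that have no dedicated page.

    Capitalized words stand in for 'concepts' -- a crude heuristic chosen
    so this sketch stays self-contained; a real lint pass would ask the
    model to extract concepts properly.
    """
    wiki = Path(wiki_dir)
    existing = {p.stem.lower() for p in wiki.glob("*.md")}
    counts = Counter()
    for page in wiki.glob("*.md"):
        for term in re.findall(r"\b[A-Z][a-z]+\b", page.read_text()):
            counts[term.lower()] += 1
    # A term mentioned `threshold`+ times with no page of its own is an orphan.
    return [t for t, n in counts.items() if n >= threshold and t not in existing]
```

The output is a reading list in disguise: each orphaned term is a page the AI should write on the next ingest, or a gap you should go research.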
Who Is Already Using This
The developer community picked up Karpathy’s gist faster than almost any idea I have seen spread in the past year.
Researchers are using it to synthesize paper reading across months-long literature reviews — no more scattered bookmarks, no more re-reading papers they vaguely remember. The wiki holds the synthesis; they hold the direction.
Software engineers are using it to turn AI coding session transcripts into a persistent knowledge base. Every time you work with Claude Code or Cursor, a full transcript is written to disk. Almost no one ever reads those transcripts again. An LLM Wiki converts that dormant history into searchable architecture notes, decision logs, and pattern libraries — a record of everything the codebase has taught you [3].
Teams are experimenting with feeding Slack threads, meeting notes, and customer call transcripts into shared wikis, with the LLM doing the maintenance work that nobody on the team ever actually does. The vision is an internal wiki that stays current because the AI handles the upkeep, and humans handle the direction.
And then there are the more personal uses. Karpathy’s own gist mentions tracking health data, journal entries, podcast notes, and self-improvement reading — building a structured picture of yourself over time. The pattern applies to anything where knowledge accumulates and where the overhead of organizing it usually kills the habit before it delivers value [1].
What It Cannot Do (Yet)
One thing I appreciate about Karpathy’s framing is that he published an idea file, not a finished product. The limitations are worth naming honestly.
The pattern works best at a personal scale — roughly 10 to a few hundred documents. As the wiki grows into thousands of pages, the cost of keeping the AI’s context window stocked with enough pages to answer complex questions climbs. At that scale, you need additional search tooling layered on top [4].
The quality of every page in the wiki depends entirely on the quality of the LLM doing the ingestion. A careless model can propagate errors without flagging them. This is a system that rewards periodic human review, especially in its early weeks.
And the system works best for knowledge that is relatively stable — concepts, relationships, accumulated research findings. It is less well-suited to tracking the kind of context that evolves day-to-day in active projects: who said what in yesterday’s meeting, what the current status of a task is, whether a deadline has moved. For that kind of dynamic ground truth, a knowledge graph of structured entities is a better fit than a wiki of narrative summaries [5].
The Bigger Point
The reason this idea spread so fast is not the architecture. The architecture is genuinely simple — markdown files, a schema, a folder structure.
The reason it spread is that it names a frustration everyone who works with LLMs has felt but not articulated: every conversation starts from zero, and every insight you work hard to develop with an AI gets thrown away when the chat window closes.
The LLM Wiki says: the AI should be doing the bookkeeping so you do not have to. Its job is to take your reading and your questions and compile them into something that compounds over time. Your job is to curate what goes in, decide what questions are worth asking, and think about what it all means.
That is a clean division of labor. And for anyone who works with large volumes of information — researchers, engineers, analysts, students, anyone who reads seriously for their work — it is one worth trying.
The full idea file is available at Karpathy’s GitHub Gist [1]. Start with five articles on a topic you care about. Watch what the system builds from them. Adjust the schema based on what you liked and what you would change. The compounding does not start on the first ingest — but it does not take long to feel it.
Images in this article are AI generated.
References
[1] Karpathy, A. (2026, April 4). llm-wiki.md [GitHub Gist]. https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
[2] Gupta, M. (2026, April). Andrej Karpathy’s LLM Knowledge Bases Explained. Medium — Data Science in Your Pocket. https://medium.com/data-science-in-your-pocket/andrej-karpathys-llm-knowledge-bases-explained-2d9fd3435707
[3] Pratiyush. (2026). llm-wiki: LLM-powered knowledge base from your coding sessions [GitHub]. https://github.com/Pratiyush/llm-wiki
[4] Anthemcreation. (2026, April). LLM wiki Karpathy: knowledge base with Claude and Obsidian. https://anthemcreation.com/en/artificial-intelligence/karpathy-llm-wiki-claude-obsidian/
[5] Chawla, A. (2026, April). The Next Step After Karpathy’s Wiki Idea. Daily Dose of Data Science. https://blog.dailydoseofds.com/p/the-next-step-after-karpathys-wiki
Your AI Assistant Has Amnesia. Andrej Karpathy Just Found the Cure (LLM-Wiki). was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.