Building Living Context: My LLM Local Setup
Andrej Karpathy published his LLM Wiki idea recently. A persistent, compounding knowledge base maintained by the LLM, not by you. No RAG, no embeddings, just structured markdown files that get richer over time.
I’ve been running something very similar with Claude Code TUI and wanted to share my setup.
The architecture
┌──────────────────────────────────────────────────────────┐
│                 CONTEXT ENRICHMENT LOOP                  │
└──────────────────────────────────────────────────────────┘
                    ┌───────────────────┐
                    │  Claude Code TUI  │
                    │     + Skills      │
                    └─────────┬─────────┘
                              │
                    1. read compass file
                              │
                              ▼
┌──────────────────────────────────────────────────────────┐
│ context/                                                 │
│ ├── DIRECTORY.md  ← first lookup (folder index)          │
│ ├── work/                                                │
│ │   └── topic.md                                         │
│ │       ---                                              │
│ │       desc: token-light preview  ← frontmatter         │
│ │       ---                                              │
│ │       full content below         ← body                │
│ ├── learning/                                            │
│ ├── personal/                                            │
│ └── ...                                                  │
└──────────────────────────────────────────────────────────┘
                              │
                      2. context found?
                              │
                        ┌─────┴──────┐
                        │            │
                       YES          NO
                        │            │
                        ▼            ▼
                     resolve     AskUserQuestion
                      auto       (built-in claude tool)
                        │            │
                        │            │ 3. human answers
                        │            ▼
                        │        file answer back
                        │        (new .md OR enrich existing)
                        │            │
                        └─────┬──────┘
                              │
                              ▼
                      context COMPOUNDS
                next session starts richer ↺
How it works
The context/ directory is one central brain that spans all projects. Every .md file has a frontmatter header with a curated, token-light description. The LLM reads the frontmatter first to decide if it needs the full content. No embeddings, no vector DB. Just modular folders with clear names and a DIRECTORY.md compass file that maps the whole structure. Think of it as a Unix filesystem abstraction: folder metadata + short descriptions = the LLM’s first lookup.
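To make the frontmatter-first lookup concrete, here is a minimal Python sketch of the same idea done by a script instead of the LLM. It is illustrative only and not taken from the setup itself: the function names read_desc and build_index are hypothetical, and it assumes every note opens with a `---` line and carries a single-line `desc:` key, as in the diagram above.

```python
from __future__ import annotations

from pathlib import Path


def read_desc(path: Path) -> str | None:
    """Return the `desc` value from a note's frontmatter, or None if absent."""
    with path.open(encoding="utf-8") as fh:
        if fh.readline().strip() != "---":
            return None  # the file does not start with a frontmatter block
        for line in fh:
            stripped = line.strip()
            if stripped == "---":
                break  # end of frontmatter: the body is never read
            key, sep, value = stripped.partition(":")
            if sep and key.strip() == "desc":
                return value.strip()
    return None


def build_index(root: Path) -> dict[str, str]:
    """Map each .md file under `root` to its token-light description."""
    index: dict[str, str] = {}
    for md in sorted(root.rglob("*.md")):
        desc = read_desc(md)
        if desc is not None:
            index[md.relative_to(root).as_posix()] = desc
    return index
```

Calling build_index(Path("context")) would return one short description per note. Because read_desc stops at the closing `---`, the cost of a first pass depends on the size of the headers, not on the size of the bodies.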
The key primitive is AskUserQuestion. Built into Claude Code, massively underrated. It flips the interaction: instead of me typing instructions, the LLM asks me targeted questions (multiple-choice options plus free-text input) when it can’t resolve something from context. My role becomes answering, not prompting.
And every answer gets filed back. New .md file, or enriched into an existing one. The loop continues. Every session makes the context richer for the next one.
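The filing step can be sketched the same way. This is a hedged illustration under stated assumptions, not the actual mechanism: in the setup described here the LLM edits the files itself, and the helper name file_answer, its parameters, and the append-to-body policy are all hypothetical.

```python
from pathlib import Path


def file_answer(root: Path, rel_path: str, desc: str, answer: str) -> str:
    """File an answer under `root`: create a new note or enrich an existing one."""
    target = root / rel_path
    if target.exists():
        # Enrich: append to the body and leave the existing frontmatter untouched.
        with target.open("a", encoding="utf-8") as fh:
            fh.write(f"\n{answer.strip()}\n")
        return "enriched"
    # New note: a short frontmatter description first, then the answer as the body.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(f"---\ndesc: {desc}\n---\n{answer.strip()}\n", encoding="utf-8")
    return "created"
```

The first call for a given path returns "created" and later calls return "enriched", which mirrors the "new .md OR enrich existing" branch of the loop.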
Where this converges with Karpathy
Karpathy frames his wiki as three layers: raw sources (immutable), wiki (LLM-maintained), and schema (the config that steers the LLM). My setup collapses the first two into one. The context/ directory IS both source and wiki. Living documents, not immutable archives.
Karpathy's 3 layers            My setup (2 layers)
───────────────────────        ───────────────────────
raw sources (immutable)  ┐
                         ├──►  context/ (source + wiki)
wiki (LLM-maintained)    ┘
schema (CLAUDE.md)       ───►  CLAUDE.md + skills
This is simpler but less rigorous. The three-layer split gives you an immutable source of truth you can always go back to. My merged layer means the LLM is editing the same files it reads from. Room to improve there.
But the core insight is the same: persistent, compounding context beats RAG. Build it up once, keep it current, let the LLM do the maintenance. As Karpathy puts it, the tedious part isn’t the reading or thinking, it’s the bookkeeping. LLMs handle that.
We’re all converging towards the same pattern. Karpathy’s idea file frames it well: this is intentionally abstract because the right implementation depends on your domain and tools. But the pattern itself is clear enough that it should converge into a proper product. A personal, persistent, LLM-maintained wiki that compounds with every interaction. Someone will build this.