"I think there is room here for an incredible new product instead of a hacky collection of scripts."
- Andrej Karpathy on LLM Knowledge Bases, Apr 3, 2026
An LLM-powered CLI that compiles raw documents into structured markdown wikis. Feed it articles, papers, notes, and transcripts -- it extracts concepts, builds cross-references, and maintains a living knowledge base.
Chat conversations are ephemeral. You ask, you get an answer, it's gone. wiki-compiler builds something persistent and structured:
- Accumulates knowledge -- every document you ingest adds to a growing, interconnected wiki
- Cross-references automatically -- concepts link to each other via `[[wiki-links]]`
- Lives in your filesystem -- plain markdown files you own, version with git, read with any editor
- Works offline -- use Ollama for fully local compilation
- Auditable -- every article tracks its sources and confidence level
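Concretely, a compiled article is just a markdown file. A hypothetical example (actual output varies by document):

```markdown
---
title: "Attention Mechanism"
tags: [deep-learning, nlp]
sources: [paper-2024]
confidence: high
---

The attention mechanism lets a model weigh parts of its input by
relevance. It is the core building block of the
[[transformer-architecture]], introduced in [[attention-is-all-you-need]].
```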
```console
$ wiki-compiler init my-research && cd my-research
Initialized knowledge base at ./my-research

$ wiki-compiler ingest paper.pdf --provider claude
Ingesting paper.pdf...
Created: wiki/sources/attention-is-all-you-need.md
Created: wiki/concepts/transformer-architecture.md
Created: wiki/concepts/attention-mechanism.md
Updated: wiki/_index.md
3 articles created, 0 updated

$ wiki-compiler stats
Articles: 3
Concepts: 2 | Sources: 1
Wiki-links: 7 | Backlinks: 5
Tags: 4 unique across 3 articles

$ wiki-compiler lint
0 errors, 0 warnings, 1 info
Health Score: 98/100
```

## Installation

```shell
pip install wiki-compiler

# Or with pipx for an isolated install
pipx install wiki-compiler

# With specific LLM provider support
pip install "wiki-compiler[anthropic]"  # Claude
pip install "wiki-compiler[openai]"     # OpenAI
pip install "wiki-compiler[all]"        # All providers
```

## Quick Start

```shell
# 1. Initialize a knowledge base
wiki-compiler init my-kb
cd my-kb

# 2. Ingest a document
wiki-compiler ingest paper.pdf --provider claude

# 3. See what you've built
wiki-compiler stats

# 4. Query your knowledge base
wiki-compiler query "what is transformer attention?"
```

## How It Works

```
Raw Documents         wiki-compiler            Structured Wiki

                  +------------------+
paper.pdf   -->   |     Ingester     |  -->  wiki/concepts/attention.md
notes.md    -->   |     Compiler     |  -->  wiki/concepts/transformers.md
article.txt -->   |      Linker      |  -->  wiki/sources/paper-2024.md
                  |     Indexer      |  -->  wiki/_index.md
                  +------------------+
                           |
                     LLM Provider
                 (Claude/OpenAI/Ollama)
```
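The LLM provider box at the bottom of the diagram is pluggable. One way to model such a slot -- a hypothetical sketch, not the project's actual interface -- is a minimal protocol that each backend implements:

```python
from typing import Protocol


class LLMProvider(Protocol):
    """Anything with a complete() method can act as a backend."""
    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in backend for demonstration; a real Claude, OpenAI, or
    Ollama backend would make an API call here instead."""
    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt}"


def ingest(text: str, provider: LLMProvider) -> str:
    # A real ingester would prompt the model to extract concepts and
    # emit wiki articles; here the prompt is just passed through.
    return provider.complete(f"Extract concepts from: {text}")


print(ingest("Attention is all you need.", EchoProvider()))
# [echo] Extract concepts from: Attention is all you need.
```

Keeping the provider behind a narrow interface like this is what lets the same pipeline run against a cloud API or a local Ollama model.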
- Ingest -- Raw document goes in; the LLM extracts concepts and creates individual wiki articles
- Link -- Scanner finds `[[wiki-links]]`, builds the backlink graph, and suggests new connections
- Compile -- LLM reads the full wiki, identifies gaps and overlaps, and reorganizes
- Index -- Generates the master index with categories, statistics, and navigation
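The core of the Link step can be approximated in a few lines. This sketch (not wiki-compiler's actual implementation) scans article text for `[[wiki-links]]` and inverts them into a backlink map:

```python
import re
from collections import defaultdict

# Capture the link target, stopping before any "|alias" or "#anchor"
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")


def backlinks(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each link target to the pages that reference it."""
    graph: dict[str, list[str]] = defaultdict(list)
    for page, text in pages.items():
        for target in WIKI_LINK.findall(text):
            graph[target.strip()].append(page)
    return dict(graph)


pages = {
    "attention": "See [[transformers]] and [[self-attention]].",
    "transformers": "Built on the [[attention-mechanism|attention]] idea.",
}
print(backlinks(pages))
# {'transformers': ['attention'], 'self-attention': ['attention'],
#  'attention-mechanism': ['transformers']}
```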
## Commands

| Command | LLM Required | Description |
|---|---|---|
| `wiki-compiler init [path]` | No | Initialize a new knowledge base |
| `wiki-compiler ingest <file>` | Yes | Ingest a raw document |
| `wiki-compiler compile` | Yes | Reorganize and improve the wiki |
| `wiki-compiler query "..."` | Yes | Ask questions about your knowledge base |
| `wiki-compiler lint [--fix]` | No | Check wiki for structural issues |
| `wiki-compiler stats` | No | Show wiki statistics |
| `wiki-compiler index` | No | Rebuild the master index |
## Configuration

Set your preferred provider via environment variables:

```shell
# Anthropic Claude (recommended)
export ANTHROPIC_API_KEY=sk-ant-...

# OpenAI
export OPENAI_API_KEY=sk-...

# Ollama (local, no key needed)
export OLLAMA_HOST=http://localhost:11434  # default
```

Override per-command:

```shell
wiki-compiler ingest doc.md --provider ollama --model llama3
wiki-compiler query "what is X?" --provider openai --model gpt-4o
```

## Wiki Structure

```
my-kb/
  wiki/
    _index.md        # Auto-generated master index
    concepts/        # Extracted concepts and knowledge
      attention.md
      transformers.md
    sources/         # Source document summaries
      paper-2024.md
  wiki.yaml          # Wiki configuration
```
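Because everything is plain markdown on disk, numbers like those reported by `stats` can be recomputed independently of the tool. A rough sketch (not wiki-compiler's own code), assuming the layout above:

```python
import re
import tempfile
from pathlib import Path


def quick_stats(wiki_dir: Path) -> dict[str, int]:
    """Count articles and [[wiki-links]] under a wiki directory,
    skipping the auto-generated index."""
    files = [p for p in wiki_dir.rglob("*.md") if p.name != "_index.md"]
    links = sum(len(re.findall(r"\[\[[^\]]+\]\]", p.read_text()))
                for p in files)
    return {"articles": len(files), "wiki_links": links}


# Demo on a throwaway wiki with a single concept article
with tempfile.TemporaryDirectory() as tmp:
    concepts = Path(tmp, "wiki", "concepts")
    concepts.mkdir(parents=True)
    (concepts / "attention.md").write_text(
        "See [[transformers]] and [[softmax]].")
    stats = quick_stats(Path(tmp, "wiki"))

print(stats)  # {'articles': 1, 'wiki_links': 2}
```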
Each article includes YAML frontmatter:

```yaml
---
title: "Attention Mechanism"
tags: [deep-learning, nlp, transformers]
sources: [paper-2024]
related: [transformers, self-attention]
confidence: high
created: 2026-04-03
updated: 2026-04-03
---
```

## FAQ

**How do I get started?** Install wiki-compiler via `pip install wiki-compiler`, run `wiki-compiler init my-kb` to create a knowledge base, then use `wiki-compiler ingest <file>` to feed it any document. The CLI uses an LLM to extract concepts, generate structured markdown articles, and automatically cross-reference them with `[[wiki-links]]`.

**Which LLM providers are supported?** wiki-compiler supports Anthropic Claude, OpenAI (GPT-4o and others), and Ollama for local models. You can set a default provider via environment variables or override per-command with the `--provider` and `--model` flags.

**Does it work offline?** Yes. Use Ollama as your provider to run fully offline with local models like Llama 3. Commands that don't require an LLM -- such as `lint`, `stats`, and `index` -- always work offline with no API key needed.

**How is this different from ChatGPT?** ChatGPT conversations are ephemeral -- you ask a question, get an answer, and it disappears into your history. wiki-compiler builds a persistent, structured knowledge base of plain markdown files that accumulate over time, cross-reference automatically, and live in your filesystem where you can version them with git.

**Is it compatible with Obsidian?** Yes. wiki-compiler outputs standard markdown with `[[wiki-links]]` and YAML frontmatter, which is exactly the format Obsidian uses. You can point Obsidian at your wiki directory and get full graph view, backlinks, and search out of the box.
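Because article metadata lives in plain YAML between `---` fences, other tools can read it too. A minimal stdlib-only sketch of the split (a real integration would hand the header to PyYAML; values here stay raw strings):

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a wiki article into (raw frontmatter fields, body).
    Values are kept as unparsed strings; use PyYAML for real typing."""
    body = text
    meta: dict[str, str] = {}
    if text.startswith("---\n"):
        header, _, body = text[4:].partition("\n---\n")
        for line in header.splitlines():
            key, sep, value = line.partition(":")
            if sep:  # skip lines without a "key: value" shape
                meta[key.strip()] = value.strip()
    return meta, body.lstrip("\n")


article = """---
title: "Attention Mechanism"
confidence: high
---
The attention mechanism weighs inputs by relevance.
"""

meta, body = split_frontmatter(article)
print(meta["confidence"])  # high
```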
## Related Projects

- awesome-llm-knowledge-bases -- Curated list of LLM-powered knowledge base tools
- karpathy-kb-template -- Ready-to-use wiki template with GitHub Actions
- kb-lint -- Standalone linter for markdown knowledge bases
## Development

```shell
git clone https://github.com/SingggggYee/wiki-compiler
cd wiki-compiler
pip install -e ".[dev,all]"
pytest
ruff check src/
```

## License

MIT