Wiki Agent Instructions

This is a personal LLM-maintained knowledge wiki for me (the user). I will place the source files I want synthesized in the raw/ directory. Obsidian is the reading interface for the wiki.

You (the agent) are the writer and maintainer of the wiki/ directory. Your job is to read my sources, synthesize knowledge, and keep the wiki current, cross-referenced, and free of staleness.

Architecture

1. Instruction layer

CLAUDE.md — User-owned instructions. NEVER modify.
syntax.md — User-owned reference file. Ignore it and NEVER modify it.

2. Raw sources layer

raw/ — User-owned immutable source documents. NEVER modify.
raw/backlog/ — Staging area for source files not yet ready for ingest. Agent must NEVER read or reference files here.

3.2. Wiki content layer

wiki/ — LLM-generated wiki. You own and maintain all subdirectories as well.
- Subdirectories: wiki/sources/, wiki/concepts/, wiki/entities/, wiki/comparisons/, wiki/syntheses/, wiki/queries/.
- When creating a file in wiki/, make sure to use the corresponding template from the templates/ directory.

4. Implementations layer

implementations/ — User-authored builds and reimplementations (e.g. buildGPT, micrograd). NEVER modify. These are exercises, not reference material.
Wiki pages may link to implementations where relevant (e.g. **Implementation:** [[buildGPT]]).

Template files

templates/ — User-defined templates. NEVER modify.
Always use the corresponding template file when creating any new file in wiki/.
Templates are the authoritative source for page structure. Key templates:
- templates/wiki-content.md — for all content pages (sources, concepts, entities, comparisons, syntheses, queries)
- templates/hub-*.md — for hub pages
If a template exists for the page type you’re creating, follow it exactly.

Images

All source images live in raw/assets/. Never modify or move them.
When a wiki page references an image, use: ![[raw/assets/image-name.png]]
If a source contains important diagrams or figures, mention them in the source summary and link them from relevant concept pages.

Directory structure

CLAUDE.md           # this file
syntax.md           # user reference file
raw/                # subfolders: articles, transcripts, papers, repos, etc
	articles-and-blogposts/
	video-and-podcast-transcripts/
	academic-papers/
	code-repositories/
	data/
	jupyter-notebooks/
	assets/         # Downloaded images from clipped articles
wiki/
	index.md        # Level 1 index. Catalogs broad topics and links to `hub-*.md` files.
	hub-*.md        # Level 2 topic-specific index. Each file catalogs ALL wiki pages of that topic.
	log.md          # Chronological activity record. Append-only operation history
	sources/        # 1 summary file per source file (or small cluster of closely related source files) in raw/
	concepts/       # default route. 1 file per fleshed-out concept of interest
	entities/       # 1 file per "entity" in raw (publication, person, company, blog author)
	comparisons/    # 1 file per meaningful comparison (e.g. of concepts)
	syntheses/      # clean, high-value, one-page summaries
	queries/        # for difficult questions asked by me
implementations/    # user-authored builds — agent reads, never modifies
	build-gpt/
	build-micrograd/
templates/          # template files for agent use
	hub-*.md
	wiki-content.md

File Naming conventions

Use kebab-case.md for all file names. Examples for each file type:

Hub page: wiki/hub-ai-ml.md, wiki/hub-business.md, wiki/hub-engineering.md, wiki/hub-physics.md
Source summary page: wiki/sources/src-attention-is-all-you-need.md, wiki/sources/src-blogpost-title.md
- For series/multi-part sources, use: src-creator-series-chN.md (e.g. src-3b1b-neural-networks-ch1.md).
- The full original title goes in the YAML title: field, not the filename.
Concept page: wiki/concepts/transformer-architecture.md
Entity page: wiki/entities/andrej-karpathy.md, wiki/entities/openai.md
Comparison: wiki/comparisons/rnn-vs-transformer.md
Synthesis: wiki/syntheses/recap-transformers.md, wiki/syntheses/recap-rsa.md
Query: wiki/queries/what-the-last-vector-in-context-represents.md
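
The naming conventions above can be sketched as a small helper. This is a minimal illustration, not part of the wiki tooling: the function names `to_kebab` and `source_filename` are hypothetical, and dropping every character that is not an ASCII letter or digit is an assumption the conventions do not spell out.

```python
import re
import unicodedata


def to_kebab(title: str) -> str:
    """Lowercase a title and join its alphanumeric runs with hyphens."""
    # Assumption: accented characters are folded to ASCII where possible,
    # and anything that cannot be folded is dropped.
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    return "-".join(words)


def source_filename(title: str) -> str:
    """Build a source-summary filename of the form src-<kebab-title>.md."""
    return f"src-{to_kebab(title)}.md"


print(source_filename("Attention Is All You Need"))
# prints: src-attention-is-all-you-need.md
```

The full original title would still go in the YAML title: field, so nothing is lost by the aggressive normalisation.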

Content routing

Use these rules to decide where a page goes:

| Category | What goes here | Examples |
|---|---|---|
| `entities/` | Named things: people, companies, organisations, specific models/products, publications | `andrej-karpathy.md`, `openai.md`, `gpt-4.md`, `nature-journal.md` |
| `concepts/` | Ideas, techniques, mechanisms, principles — anything you’d explain rather than identify | `transformer-architecture.md`, `attention-mechanism.md`, `backpropagation.md` |
| `sources/` | One summary per source document (or small cluster of closely related sources) | `src-attention-is-all-you-need.md` |
| `comparisons/` | Head-to-head analysis of two or more concepts/entities | `rnn-vs-transformer.md`, `adam-vs-sgd.md` |
| `syntheses/` | Concise, high-density reference sheets for complex end-to-end processes, pulling together multiple sources and concepts | `recap-modern-attention-mechanisms.md` |
| `queries/` | Reusable answers to specific questions I’ve asked | `why-layer-norm-before-attention.md` |

Boundary rules:

A specific named model (e.g. GPT-4, BERT) is an entity. The general technique it uses (e.g. transformer, masked language modelling) is a concept.
If something could be filed as either (e.g. “attention” is a concept, “Multi-Head Attention” is a specific mechanism), default to concept unless it’s a branded/named product.
If a concept is small enough to be a section within another concept page (e.g. “positional encoding” within “transformer-architecture”), make it a section first. Promote it to its own page only if it grows beyond ~200 words or is referenced from 3+ other pages.

Routing qualifications:

queries/ is the last resort. If a query answer could be filed as a concept, comparison, or synthesis, file it there instead. Use queries/ only when the question framing itself is the value.
syntheses/ pages are concise, high-density reference sheets — e.g. the full MLP forward pass in math notation, or the end-to-end flow through a GPT network in the shortest form possible. They require 3+ sources contributing meaningfully to the same narrative. A concept page that merely references multiple sources is not a synthesis. I will have significant input into how each synthesis page looks — ask me before writing one.

Scale

At the current scale (<100 sources, <200 wiki pages), the wiki/index.md → hub-*.md navigation is sufficient. The LLM reads the index, finds the relevant hub, drills into pages.

If the wiki grows beyond this:

Add keyword search via grep -r as a first step.
Consider adding a dedicated search tool (e.g. qmd) if grep becomes too slow or imprecise.
Flag it to me if you notice queries taking too many steps to find relevant pages — that’s the signal to add better search.

Writing style

Wiki pages are text-only by default. Do not embed images from raw/assets/ into wiki pages — the ## Sources section provides a path back to original diagrams. Exceptions:
- Function plots (e.g. sigmoid, ReLU) where the visual shape is the point — use LaTeX for the equation and a simple image from raw/assets/ for the plot if the equation alone isn’t sufficient.
- Architecture diagrams (e.g. network topology, layer flow) where spatial structure can’t be conveyed in text — use a clean reference image in raw/assets/.
- Custom diagrams created specifically for a synthesis page, with my approval.
Wiki concept pages can be either .md or .ipynb. Use .ipynb when the concept benefits from executable code alongside the explanation (e.g. linear algebra, probability). Use .md for concepts that are purely explanatory. Both formats render in Quartz and are wiki-linked identically.
Do not include an H1 heading in wiki pages. The YAML title field renders as the page title via Quartz. Start page content with the **Summary:** line, then H2 sections.
When a wiki page would benefit from a diagram or plot, insert a placeholder callout that describes what the image should show, e.g. `> [!image] TODO: sigmoid activation plot — S-curve from 0 to 1, squashing all inputs into (0,1).` I will find or create the image and replace the placeholder.
Be concise and clear. No filler, no hedging, no “it’s worth noting that.”
Never use vague metaphors as explanations. If you write something like “squished through a nonlinearity,” immediately follow it with what that actually means — which function, what it does to the input range, and why it’s there. The metaphor can stay as a one-line intuition, but it must be backed by the precise mechanism.
Prioritise insight over completeness. I care more about why something matters, edge cases, and surprising implications than about restating the obvious.
When a concept has a common misconception or subtle gotcha, call it out explicitly.
Distinguish general principles from specific examples. If a source uses a concrete example (e.g. 784 input neurons for MNIST), make clear in the wiki page that this is an illustrative example, not a property of the concept itself. Use phrasing like “e.g. 784 for a 28×28 image” rather than stating it as a definition.
Don’t just state formulas — explain the intuition. For any equation or algorithm, include: what each term means, why it’s there, and what would change if you removed it. E.g. for gradient descent, don’t just write the update rule — explain what the gradient vector represents, what the learning rate controls, and what happens if it’s too large or too small.
Show both compact and expanded notation. For any matrix/vector equation, include the shorthand notation AND an expanded example showing explicit dimensions (e.g. a [3×2] matrix multiplied by a [2×1] vector → [3×1] output). This makes it easy to verify that inner dimensions cancel and outputs have the expected shape.
I understand math notation — use it where appropriate, with colour highlighting preferred if there are many objects to keep track of (e.g. more than 4 vectors / matrices in the same equation, with several indices).
Use 0-indexed notation (layers, neurons, vector elements, matrix entries). Layer 0 is the input layer. The first element of a vector is index 0.
I understand algorithms and Python code — use pseudocode or real Python code freely.
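
As an illustration of the compact-plus-expanded rule combined with 0-indexing, a generic matrix-vector product might be written as follows. The symbols W, x, and y are placeholders chosen for this example, not notation taken from any source.

```latex
\[
\mathbf{y} = W\mathbf{x}, \qquad
W \in \mathbb{R}^{3 \times 2},\quad
\mathbf{x} \in \mathbb{R}^{2 \times 1},\quad
\mathbf{y} \in \mathbb{R}^{3 \times 1}
\]
\[
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix}
=
\begin{bmatrix} w_{00} & w_{01} \\ w_{10} & w_{11} \\ w_{20} & w_{21} \end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \end{bmatrix}
=
\begin{bmatrix}
w_{00}x_0 + w_{01}x_1 \\
w_{10}x_0 + w_{11}x_1 \\
w_{20}x_0 + w_{21}x_1
\end{bmatrix}
\]
```

The inner dimension (2) cancels, leaving a [3×1] output, and every index starts at 0.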

Tags

| Tag | Meaning |
|---|---|
| `#navigation` | Hub page, Index pages |
| `#log` | Log file |
| `#glossary` | Glossary definition page |
| `#synthesis` | Synthesis page requiring my input |
| `#implementation` | Page documents a hands-on build or reimplementation |
| `#has/code` | Page contains runnable Python code |
| `#has/math` | Page contains significant derivations or proofs |
| `#prerequisite` | Concept that other pages depend on understanding first |
| `#source/paper` | Content derived primarily from an academic paper |
| `#source/video` | Content derived primarily from a video or lecture |
| `#source/blog` | Content derived primarily from a blog post or article |
| `#todo/stub` | Page exists but needs significant expansion |
| `#todo/review` | Content may be inaccurate or incomplete — flagged for my review |
| `#todo/expand` | Specific section(s) flagged for expansion — page is decent but has gaps |
| `#todo/create-page` | Concept mentioned but lacking its own page — placeholder created |
| `#todo/formatting` | Content is correct but layout, notation, or structure needs cleanup |
| `#todo/update` | Page is stale — newer sources exist that should be integrated |
| `#contradiction` | Contains unresolved conflict between sources |

Page length targets

Source summaries (wiki/sources/): 200–500 words. Key claims, methodology, findings. Not a rehash of the whole source.
Concept pages (wiki/concepts/): 300–800 words. Definition, how it works, why it matters, edge cases, connections.
Entity pages (wiki/entities/): 200–600 words. What/who it is, why it’s relevant, key contributions or characteristics.
Comparisons (wiki/comparisons/): 300–800 words. Use tables where appropriate. Focus on when you’d pick one over the other.
Syntheses (wiki/syntheses/): 500–1500 words. These are the most substantial pages — they pull together multiple sources and concepts.
Hub pages: No word limit, but each entry is a single line: - [[page-name]] — one-line summary.

These are guidelines, not hard caps. A complex concept page can be 1200 words if it needs to be. But if a source summary is 900 words, it’s probably too detailed — trim it.
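
A rough way to check a page against these targets is sketched below. It is advisory only, in keeping with the "guidelines, not hard caps" note. The names `TARGETS`, `strip_frontmatter`, and `length_status` are hypothetical, and counting whitespace-separated tokens as words (after dropping a leading YAML block) is an assumption.

```python
import re

# Word-count ranges copied from the targets above, keyed by wiki/ subdirectory.
TARGETS = {
    "sources": (200, 500),
    "concepts": (300, 800),
    "entities": (200, 600),
    "comparisons": (300, 800),
    "syntheses": (500, 1500),
}


def strip_frontmatter(text: str) -> str:
    """Drop a leading YAML block delimited by '---' lines, if present."""
    match = re.match(r"\A---\n.*?\n---\n", text, flags=re.DOTALL)
    return text[match.end():] if match else text


def length_status(subdir: str, text: str) -> str:
    """Return 'short', 'ok', or 'long' relative to the target range."""
    low, high = TARGETS[subdir]
    words = len(strip_frontmatter(text).split())
    if words < low:
        return "short"
    if words > high:
        return "long"
    return "ok"
```

A "long" result on a concept page is a prompt to reread it, not an instruction to trim; a "long" result on a source summary is the stronger signal.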

Workflows

Ingest

When I say “ingest <filename>”:

Stage 1 — Source summary (wait for my approval before continuing)

Read the new source file in raw/.
If the source contains embedded images with formulas, diagrams, or derivations, list them and tell me which ones look important. I will review and tell you which images to examine, or I will describe the key content from those images.
Discuss key takeaways with me.
Create a source summary page in wiki/sources/ — naming convention: src-kebab-title.md.
Stop here. Show me the source summary and wait for my feedback before proceeding to Stage 2. I may ask you to add missing derivations, expand on formulas, or correct misunderstandings from image-heavy sections.

Stage 2 — Wiki integration (after my approval of the source summary)

For each concept, entity, or comparison the source touches:
a. Check whether a wiki page already exists (search wiki/index.md and relevant hubs, or use grep).
b. If the page exists, read it. Determine whether the new source confirms, extends, or contradicts the existing content.
- If it confirms: no change needed, or add the new source to the Sources section.
- If it extends: update the page with the new information. Add the source reference.
- If it contradicts: add a > [!warning] Contradiction callout with the conflicting claim and its source. Do not silently overwrite.
c. If no page exists, create one using the appropriate template.
Update the relevant wiki/hub-*.md page(s) — add new pages as line entries. If a hub doesn’t exist for this topic, create one.
Update wiki/index.md only if a new hub was created.
Append an entry to wiki/log.md: ## [YYYY-MM-DD] ingest | Source title

One source typically touches 5–15 pages. That is expected and good.
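
The log entry format used by the workflows can be produced with a helper like the one below. This is a sketch under stated assumptions: `log_heading` and `append_log` are hypothetical names, and opening the file in append mode is one simple way to respect the append-only rule, not a mechanism the instructions prescribe.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def log_heading(operation: str, summary: str, when: Optional[date] = None) -> str:
    """Format a heading such as '## [2025-01-31] ingest | Source title'."""
    when = when or date.today()
    return f"## [{when.isoformat()}] {operation} | {summary}"


def append_log(
    log_path: Path, operation: str, summary: str, when: Optional[date] = None
) -> None:
    """Append one heading to the log; mode 'a' never rewrites earlier entries."""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write("\n" + log_heading(operation, summary, when) + "\n")


print(log_heading("ingest", "Attention Is All You Need", date(2025, 1, 31)))
# prints: ## [2025-01-31] ingest | Attention Is All You Need
```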

Query

When I ask a question against the wiki:

Read wiki/index.md to find the relevant hub-*.md file(s).
Read those hub files to find specific pages.
Read those specific pages. Use grep for keyword lookup if the wiki is large.
Synthesize an answer with [[wiki-link]] citations (e.g. [[page-name]]).
Decide whether to file the answer as a new wiki page:
- File it if: the answer required reading 3+ wiki pages, or it produced a comparison/synthesis/analysis that doesn’t already exist in the wiki.
- Don’t file it if: the answer was a simple factual lookup, or it largely restates a single existing page.
- When filing, choose the appropriate subdirectory (comparisons/, syntheses/, queries/) and add it to the relevant hub(s).
Output formats: Answers don’t have to be wiki pages. Depending on the question, appropriate formats include:
- A wiki page (default for reusable knowledge).
- A comparison table (markdown table in a wiki page or standalone).
- A diagram or chart (described in markdown; I’ll render it in Obsidian or ask for a specific format).
- A slide deck (Marp format, saved in wiki/).
- Ask me if you’re unsure which format fits. Default to a wiki page.
Append to log.md: ## [YYYY-MM-DD] query | Brief question summary.

Lint

When I say “lint”:

Find orphan pages: pages not linked from any hub.
Find contradictions: claims on one page that conflict with another.
Find stubs: pages with frontmatter but no real content.
Find missing pages: concepts that are mentioned but lack their own page.
Check for stale claims superseded by newer sources.
Suggest questions to investigate next, or sources to fill knowledge gaps.
Append to log.md: ## [YYYY-MM-DD] lint | Summary of findings.
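
The orphan check in the first lint step could be approximated as follows. This is a sketch, assuming hubs sit directly in wiki/ as hub-*.md and content pages sit one directory below; `linked_targets` and `find_orphans` are hypothetical names, and matching links by file stem is an assumption that breaks if two subdirectories hold pages with the same name.

```python
import re
from pathlib import Path

# Matches [[target]], [[target|alias]], and [[target#heading]]; captures the target only.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")


def linked_targets(text: str) -> set[str]:
    """Return the page names referenced by [[wiki-links]] in a block of text."""
    # Keep only the last path component so [[concepts/foo]] and [[foo]] both map to foo.
    return {match.strip().split("/")[-1] for match in WIKI_LINK.findall(text)}


def find_orphans(wiki_dir: Path) -> list[str]:
    """List content pages that no hub-*.md file links to."""
    hub_links: set[str] = set()
    for hub in wiki_dir.glob("hub-*.md"):
        hub_links |= linked_targets(hub.read_text(encoding="utf-8"))
    # Content pages live one level down, e.g. wiki/concepts/foo.md.
    pages = [p for p in wiki_dir.glob("*/*") if p.suffix in {".md", ".ipynb"}]
    return sorted(p.stem for p in pages if p.stem not in hub_links)
```

Calling `find_orphans(Path("wiki"))` from the repository root would return the stems of unlinked pages. The stub and missing-page checks could reuse `linked_targets` in the same way.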

Hard rules

Never modify raw/ files. They are immutable sources of truth, not drafts.
Never modify templates/ files. They are user-defined.
Always use a templates/ file as the basis when creating new files in wiki/.
Only create wiki content files inside these subdirectories: wiki/sources/, wiki/concepts/, wiki/entities/, wiki/comparisons/, wiki/syntheses/, or wiki/queries/. Never create content files directly in wiki/.
Never delete log entries. wiki/log.md is append-only.
Keep wiki/index.md small. It links to hubs only — not to individual pages.
Frequently update the topic-specific wiki/hub-*.md files. They link to individual pages.
Split large hubs. If a hub page exceeds ~40 entries, split it into sub-hubs (e.g. hub-ai-ml.md → hub-ai-architectures.md + hub-ai-training.md + hub-ai-applications.md). Update wiki/index.md to point to the new hubs. Keep the original hub as a redirect or remove it.
Flag contradictions explicitly. If a new source contradicts an existing page, note it in the page with a > [!warning] Contradiction callout rather than silently overwriting.
Prefer updating over creating. If a concept page already exists, update it. Don’t create a duplicate.
Always use [[wiki-links]] for internal linking within files. Never use [markdown links](source.md).
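
Two of these rules are easy to check mechanically. The sketch below is illustrative only: `internal_markdown_links` and `hub_needs_split` are hypothetical names, treating any target with a URL scheme (such as https:) as external and therefore allowed is an assumption, and the hub check counts only lines that begin with "- [[".

```python
import re

# Inline markdown link [text](target); the lookbehind skips image embeds ![alt](src).
MARKDOWN_LINK = re.compile(r"(?<!!)\[([^\]\[]+)\]\(([^)\s]+)\)")
URL_SCHEME = re.compile(r"[a-z][a-z0-9+.-]*:", flags=re.IGNORECASE)
HUB_ENTRY = re.compile(r"^- \[\[", flags=re.MULTILINE)
MAX_HUB_ENTRIES = 40  # the "~40 entries" threshold from the split rule


def internal_markdown_links(text: str) -> list[str]:
    """Return targets of markdown-style links that do not point at an external URL."""
    targets = [target for _, target in MARKDOWN_LINK.findall(text)]
    return [t for t in targets if not URL_SCHEME.match(t)]


def hub_needs_split(hub_text: str) -> bool:
    """True when a hub lists more than MAX_HUB_ENTRIES '- [[page]]' lines."""
    return len(HUB_ENTRY.findall(hub_text)) > MAX_HUB_ENTRIES
```

Any non-empty result from `internal_markdown_links` points at a link that should be rewritten as a [[wiki-link]].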

When in doubt

Can’t find a relevant hub? Create a new one. Add it to wiki/index.md.
Source doesn’t fit any existing category? Create the most natural hub for it. Ask me if genuinely unsure.
Unsure if a concept deserves its own page? Start it as a section within the most related existing page. Promote to its own page later if it grows or gets referenced from multiple pages.
Unsure about a factual claim in a source? Include it but add a > [!question] callout noting the uncertainty.
Two sources say different things? Include both claims with a > [!warning] Contradiction callout. Don’t pick a winner silently.
A source covers a topic the wiki hasn’t touched before? Create the concept/entity pages anyway, even if they’re short. Stubs are better than gaps — they’ll get fleshed out on future ingests.
