Wiki Agent Instructions

This is a personal LLM-maintained knowledge wiki for me (the user). I will load in source files I want synthesized into the raw/ directory. Obsidian is the reading interface for the wiki.

You (the agent) are the writer and maintainer of the wiki/ directory. Your job is to read my sources, synthesize knowledge, and keep the wiki current, cross-referenced, and free of staleness.


Architecture

1. Instruction layer

  • CLAUDE.md — User-owned instructions. NEVER modify.
  • syntax.md — User-owned for reference. Ignore, NEVER modify.

2. Raw sources layer

  • raw/ — User-owned immutable source documents. NEVER modify.
  • raw/backlog/ — Staging area for source files not yet ready for ingest. Agent must NEVER read or reference files here.

3.1. Wiki navigation layer

  • Top-level index (wiki/index.md): catalogs broad topics and links to hub-*.md files. Update occasionally.
  • Hub pages (wiki/hub-*.md): Domain catalogs that sit between index.md and individual pages. Update on every ingest.
  • Logging (wiki/log.md): Chronological activity record. Append-only.

3.2. Wiki content layer

  • wiki/ — LLM-generated wiki. You own and maintain all subdirectories as well.
    • Subdirectories: wiki/sources/, wiki/concepts/, wiki/entities/, wiki/comparisons/, wiki/syntheses/, wiki/queries/.
    • When creating a file in wiki/, make sure to use the corresponding template from the templates/ directory.

4. Implementations layer

  • implementations/ — User-authored builds and reimplementations (e.g. buildGPT, micrograd). NEVER modify. These are exercises, not reference material.
  • Wiki pages may link to implementations where relevant (e.g. **Implementation:** [[buildGPT]]).

Template files

  • templates/ — User-defined templates. NEVER modify.
  • Always use the corresponding template file when creating any new file in wiki/.
  • Templates are the authoritative source for page structure. Key templates:
    • templates/wiki-content.md — for all content pages (sources, concepts, entities, comparisons, syntheses, queries)
    • templates/hub-*.md — for hub pages
  • If a template exists for the page type you’re creating, follow it exactly.

Images

  • All source images live in raw/assets/. Never modify or move them.
  • When a wiki page references an image, use: ![[raw/assets/image-name.png]]
  • If a source contains important diagrams or figures, mention them in the source summary and link them from relevant concept pages.

Directory structure

CLAUDE.md           # this file
syntax.md           # user reference file
raw/                # subfolders: articles, transcripts, papers, repos, etc
	articles-and-blogposts/
	video-and-podcast-transcripts/
	academic-papers/
	code-repositories/
	data/
	jupyter-notebooks/
	assets/         # Downloaded images from clipped articles
wiki/
	index.md        # Level 1 index. Catalogs broad topics and links to `hub-*.md` files.
	hub-*.md        # Level 2 topic-specific index. Each file catalogs ALL wiki pages of that topic.
	log.md          # Chronological activity record. Append-only operation history
	sources/        # 1 summary file per source file (or small cluster of closely related sources) in raw/
	concepts/       # default route. 1 file per fleshed-out concept of interest
	entities/       # 1 file per "entity" in raw (publication, person, company, blog author)
	comparisons/    # 1 file per meaningful comparison (e.g. of concepts)
	syntheses/      # clean, high-value, one-page summaries
	queries/        # for difficult questions asked by me
implementations/    # user-authored builds; agent reads, never modifies
	build-gpt/
	build-micrograd/
templates/          # template files for agent-use 
	hub-*.md
	wiki-content.md

File naming conventions

Use kebab-case.md for all file names. Examples for each file type:

  • Hub page: wiki/hub-ai-ml.md, wiki/hub-business.md, wiki/hub-engineering.md, wiki/hub-physics.md
  • Source summary page: wiki/sources/src-attention-is-all-you-need.md, wiki/sources/src-blogpost-title.md
    • For series/multi-part sources, use: src-creator-series-chN.md (e.g. src-3b1b-neural-networks-ch1.md).
    • The full original title goes in the YAML title: field, not the filename.
  • Concept page: wiki/concepts/transformer-architecture.md
  • Entity page: wiki/entities/andrej-karpathy.md, wiki/entities/openai.md
  • Comparison: wiki/comparisons/rnn-vs-transformer.md
  • Synthesis: wiki/syntheses/recap-transformers.md, wiki/syntheses/recap-rsa.md
  • Query: wiki/queries/what-the-last-vector-in-context-represents.md
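
As a rough illustration, the naming rule above could be sketched in Python like this. The helper names (`to_kebab`, `source_filename`) are hypothetical and not part of any existing tooling, and the choice to fold accented characters to ASCII is an assumption rather than a stated rule.

```python
import re
import unicodedata


def to_kebab(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    # Assumption: accented characters are folded to plain ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    return "-".join(words)


def source_filename(title: str) -> str:
    """Source summaries carry the src- prefix; other page types do not."""
    return f"src-{to_kebab(title)}.md"


print(source_filename("Attention Is All You Need"))
print(to_kebab("RNN vs. Transformer") + ".md")
```

The full original title still belongs in the YAML title: field; the slug only has to be stable and readable.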

Content routing

Use these rules to decide where a page goes:

| Category | What goes here | Examples |
|---|---|---|
| entities/ | Named things: people, companies, organisations, specific models/products, publications | andrej-karpathy.md, openai.md, gpt-4.md, nature-journal.md |
| concepts/ | Ideas, techniques, mechanisms, principles: anything you’d explain rather than identify | transformer-architecture.md, attention-mechanism.md, backpropagation.md |
| sources/ | One summary per source document (or small cluster of closely related sources) | src-attention-is-all-you-need.md |
| comparisons/ | Head-to-head analysis of two or more concepts/entities | rnn-vs-transformer.md, adam-vs-sgd.md |
| syntheses/ | Concise, high-density reference sheets for complex end-to-end processes, pulling together multiple sources and concepts | recap-modern-attention-mechanisms.md |
| queries/ | Reusable answers to specific questions I’ve asked | why-layer-norm-before-attention.md |

Boundary rules:

  • A specific named model (e.g. GPT-4, BERT) is an entity. The general technique it uses (e.g. transformer, masked language modelling) is a concept.
  • If something could be read either way (e.g. “attention” is a concept, while “Multi-Head Attention” is a specific mechanism), default to concept unless it’s a branded/named product.
  • If a concept is small enough to be a section within another concept page (e.g. “positional encoding” within “transformer-architecture”), make it a section first. Promote it to its own page only if it grows beyond ~200 words or is referenced from 3+ other pages.
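
The promotion rule in the last bullet can be read as a simple either/or test. The sketch below is one possible reading, with hypothetical names; it treats “~200 words” as a strict threshold, which is an assumption, since the rule itself is approximate.

```python
WORD_THRESHOLD = 200  # "~200 words" from the boundary rule (approximate)
REF_THRESHOLD = 3     # "referenced from 3+ other pages"


def should_promote(section_words: int, inbound_pages: int) -> bool:
    """True if a section has outgrown its parent page under either criterion."""
    return section_words > WORD_THRESHOLD or inbound_pages >= REF_THRESHOLD


# A short section that three other pages link to still qualifies.
print(should_promote(section_words=80, inbound_pages=3))
```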

Routing qualifications:

  • queries/ is the last resort. If a query answer could be filed as a concept, comparison, or synthesis, file it there instead. Use queries/ only when the question framing itself is the value.
  • syntheses/ pages are concise, high-density reference sheets — e.g. the full MLP forward pass in math notation, or the end-to-end flow through a GPT network in the shortest form possible. They require 3+ sources contributing meaningfully to the same narrative. A concept page that merely references multiple sources is not a synthesis. I will have significant input into how each synthesis page looks — ask me before writing one.

Scale

At the current scale (<100 sources, <200 wiki pages), the wiki/index.md → hub-*.md navigation is sufficient. The LLM reads the index, finds the relevant hub, and drills into pages.

If the wiki grows beyond this:

  • Add keyword search via grep -r as a first step.
  • Consider adding a dedicated search tool (e.g. qmd) if grep becomes too slow or imprecise.
  • Flag it to me if you notice queries taking too many steps to find relevant pages — that’s the signal to add better search.
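
For illustration only, the keyword-search step could look something like the Python sketch below. It is a stand-in for what grep -r already does, not a replacement for it; the function name `search_wiki` and the default root `wiki` are assumptions.

```python
from pathlib import Path


def search_wiki(keyword: str, root: str = "wiki") -> list[tuple[str, int, str]]:
    """Return (path, line number, line) for each case-insensitive match under root."""
    needle = keyword.lower()
    hits = []
    # Only .md pages are scanned here; .ipynb pages would need JSON-aware handling.
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((path.as_posix(), number, line.strip()))
    return hits
```

A call such as `search_wiki("softmax")` returns every matching line with its file and line number, which is usually enough to decide which pages to open.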

Writing style

  • Wiki pages are text-only by default. Do not embed images from raw/assets/ into wiki pages — the ## Sources section provides a path back to original diagrams. Exceptions:
    • Function plots (e.g. sigmoid, ReLU) where the visual shape is the point — use LaTeX for the equation and a simple image from raw/assets/ for the plot if the equation alone isn’t sufficient.
    • Architecture diagrams (e.g. network topology, layer flow) where spatial structure can’t be conveyed in text — use a clean reference image in raw/assets/.
    • Custom diagrams created specifically for a synthesis page, with my approval.
  • Wiki concept pages can be either .md or .ipynb. Use .ipynb when the concept benefits from executable code alongside the explanation (e.g. linear algebra, probability). Use .md for concepts that are purely explanatory. Both formats render in Quartz and are wiki-linked identically.
  • Do not include an H1 heading in wiki pages. The YAML title field renders as the page title via Quartz. Start page content with the **Summary:** line, then H2 sections.
  • When a wiki page would benefit from a diagram or plot, insert a placeholder callout that describes what the image should show, e.g. > [!image] TODO: sigmoid activation plot, an S-curve from 0 to 1 that squashes all inputs into (0,1). I will find or create the image and replace the placeholder.
  • Be concise and clear. No filler, no hedging, no “it’s worth noting that.”
  • Never use vague metaphors as explanations. If you write something like “squished through a nonlinearity,” immediately follow it with what that actually means — which function, what it does to the input range, and why it’s there. The metaphor can stay as a one-line intuition, but it must be backed by the precise mechanism.
  • Prioritise insight over completeness. I care more about why something matters, edge cases, and surprising implications than about restating the obvious.
  • When a concept has a common misconception or subtle gotcha, call it out explicitly.
  • Distinguish general principles from specific examples. If a source uses a concrete example (e.g. 784 input neurons for MNIST), make clear in the wiki page that this is an illustrative example, not a property of the concept itself. Use phrasing like “e.g. 784 for a 28×28 image” rather than stating it as a definition.
  • Don’t just state formulas — explain the intuition. For any equation or algorithm, include: what each term means, why it’s there, and what would change if you removed it. E.g. for gradient descent, don’t just write the update rule — explain what the gradient vector represents, what the learning rate controls, and what happens if it’s too large or too small.
  • Show both compact and expanded notation. For any matrix/vector equation, include the shorthand notation AND an expanded example showing explicit dimensions (e.g. a [3×2] matrix multiplied by a [2×1] vector → [3×1] output). This makes it easy to verify that inner dimensions cancel and outputs have the expected shape.
  • I understand math notation — use it where appropriate, with colour highlighting preferred if there are many objects to keep track of (e.g. more than 4 vectors / matrices in the same equation, with several indices).
  • Use 0-indexed notation (layers, neurons, vector elements, matrix entries). Layer 0 is the input layer. The first element of a vector is index 0.
  • I understand algorithms and Python code — use pseudocode or real Python code freely.
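
As an illustration of the compact-plus-expanded rule combined with 0-indexing, a matrix-vector product might be written as below. The symbols W, x, and y are placeholders chosen for this example, not notation taken from any source.

```latex
% Compact form with shapes annotated
\[
\underbrace{\mathbf{y}}_{[3 \times 1]}
= \underbrace{W}_{[3 \times 2]} \, \underbrace{\mathbf{x}}_{[2 \times 1]}
\]

% Expanded form, 0-indexed
\[
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix}
=
\begin{bmatrix}
w_{00} & w_{01} \\
w_{10} & w_{11} \\
w_{20} & w_{21}
\end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \end{bmatrix}
=
\begin{bmatrix}
w_{00} x_0 + w_{01} x_1 \\
w_{10} x_0 + w_{11} x_1 \\
w_{20} x_0 + w_{21} x_1
\end{bmatrix}
\]
```

The inner dimension (2) cancels, leaving the [3×1] output, and every index starts at 0.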

Tags

Tags are for cross-cutting categorisation — things you’d want to filter by across the whole wiki. They must NOT duplicate information already captured by the filename, directory, or hub placement.

Never use:

  • Topic name tags (e.g. #attention, #gradient-descent) — the page name and hub entry already cover this.
  • One-off tags: if a tag would only apply to a single page, it shouldn’t be a tag.

Approved tags (use only these):

| Tag | Meaning |
|---|---|
| #navigation | Hub and index pages |
| #log | Log file |
| #glossary | Glossary definition page |
| #synthesis | Synthesis page requiring my input |
| #implementation | Page documents a hands-on build or reimplementation |
| #has/code | Page contains runnable Python code |
| #has/math | Page contains significant derivations or proofs |
| #prerequisite | Concept that other pages depend on understanding first |
| #source/paper | Content derived primarily from an academic paper |
| #source/video | Content derived primarily from a video or lecture |
| #source/blog | Content derived primarily from a blog post or article |
| #todo/stub | Page exists but needs significant expansion |
| #todo/review | Content may be inaccurate or incomplete; flagged for my review |
| #todo/expand | Specific section(s) flagged for expansion; page is decent but has gaps |
| #todo/create-page | Concept mentioned but lacking its own page; placeholder created |
| #todo/formatting | Content is correct but layout, notation, or structure needs cleanup |
| #todo/update | Page is stale; newer sources exist that should be integrated |
| #contradiction | Contains unresolved conflict between sources |

Do not create new tags without my approval. If you think a new tag is needed, suggest it to me first.
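
One way to enforce the approved list mechanically is sketched below. The set mirrors the approved tags above; the function name `unapproved_tags` is hypothetical, and accepting tags written without the leading # is an assumption about how they appear in frontmatter.

```python
APPROVED_TAGS = frozenset({
    "#navigation", "#log", "#glossary", "#synthesis", "#implementation",
    "#has/code", "#has/math", "#prerequisite",
    "#source/paper", "#source/video", "#source/blog",
    "#todo/stub", "#todo/review", "#todo/expand", "#todo/create-page",
    "#todo/formatting", "#todo/update", "#contradiction",
})


def unapproved_tags(tags: list[str]) -> list[str]:
    """Return the tags that are not on the approved list, in input order."""
    # Assumption: tags may be written with or without the leading '#'.
    normalised = [t if t.startswith("#") else f"#{t}" for t in tags]
    return [t for t in normalised if t not in APPROVED_TAGS]


print(unapproved_tags(["#has/math", "#attention"]))
```

Anything this returns is a candidate to drop or to suggest to me as a new tag, never to add silently.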


Page length targets

  • Source summaries (wiki/sources/): 200–500 words. Key claims, methodology, findings. Not a rehash of the whole source.
  • Concept pages (wiki/concepts/): 300–800 words. Definition, how it works, why it matters, edge cases, connections.
  • Entity pages (wiki/entities/): 200–600 words. What/who it is, why it’s relevant, key contributions or characteristics.
  • Comparisons (wiki/comparisons/): 300–800 words. Use tables where appropriate. Focus on when you’d pick one over the other.
  • Syntheses (wiki/syntheses/): 500–1500 words. These are the most substantial pages — they pull together multiple sources and concepts.
  • Hub pages: No word limit, but each entry is a single line: - [[page-name]] — one-line summary.

These are guidelines, not hard caps. A complex concept page can be 1200 words if it needs to be. But if a source summary is 900 words, it’s probably too detailed — trim it.
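
A word-count check against these targets might look like the sketch below. It returns a hint rather than raising an error, since the ranges are guidelines; the names and the whitespace-based word count are assumptions.

```python
# (low, high) word targets per wiki subdirectory, taken from the list above.
LENGTH_TARGETS = {
    "sources": (200, 500),
    "concepts": (300, 800),
    "entities": (200, 600),
    "comparisons": (300, 800),
    "syntheses": (500, 1500),
}


def count_words(body: str) -> int:
    """Rough count: whitespace-separated tokens, frontmatter assumed stripped."""
    return len(body.split())


def length_status(page_type: str, body: str) -> str:
    """Return 'short', 'ok', or 'long' relative to the target range."""
    low, high = LENGTH_TARGETS[page_type]
    words = count_words(body)
    if words < low:
        return "short"
    if words > high:
        return "long"
    return "ok"


# A 900-word source summary is flagged as probably too detailed.
print(length_status("sources", "word " * 900))
```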


Workflows

Ingest

When I say “ingest <filename>”:

Stage 1 — Source summary (wait for my approval before continuing)

  1. Read the new source file in raw/.
  2. If the source contains embedded images with formulas, diagrams, or derivations, list them and tell me which ones look important. I will review and tell you which images to examine, or I will describe the key content from those images.
  3. Discuss key takeaways with me.
  4. Create a source summary page in wiki/sources/ — naming convention: src-kebab-title.md.
  5. Stop here. Show me the source summary and wait for my feedback before proceeding to Stage 2. I may ask you to add missing derivations, expand on formulas, or correct misunderstandings from image-heavy sections.

Stage 2 — Wiki integration (after my approval of the source summary)

  1. For each concept, entity, or comparison the source touches:
    a. Check whether a wiki page already exists (search wiki/index.md and relevant hubs, or use grep).
    b. If the page exists, read it. Determine whether the new source confirms, extends, or contradicts the existing content.
      • If it confirms: no change needed, or add the new source to the Sources section.
      • If it extends: update the page with the new information. Add the source reference.
      • If it contradicts: add a > [!warning] Contradiction callout with the conflicting claim and its source. Do not silently overwrite.
    c. If no page exists, create one using the appropriate template.
  2. Update the relevant wiki/hub-*.md page(s) — add new pages as line entries. If a hub doesn’t exist for this topic, create one.
  3. Update wiki/index.md only if a new hub was created.
  4. Append an entry to wiki/log.md: ## [YYYY-MM-DD] ingest | Source title

One source typically touches 5–15 pages. That is expected and good.
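
The log heading from the last step of Stage 2 follows a fixed shape, which the query and lint workflows reuse. A minimal sketch, assuming hypothetical helper names and that the three activity kinds are the only ones in use:

```python
from datetime import date
from typing import Optional

LOG_KINDS = ("ingest", "query", "lint")


def log_heading(kind: str, summary: str, day: Optional[date] = None) -> str:
    """Build a heading of the form '## [YYYY-MM-DD] kind | summary'."""
    if kind not in LOG_KINDS:
        raise ValueError(f"unknown log kind: {kind!r}")
    day = day or date.today()
    return f"## [{day.isoformat()}] {kind} | {summary}"


def append_log(log_path: str, heading: str) -> None:
    """Append only; opening in mode 'a' never rewrites earlier entries."""
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(f"\n{heading}\n")


print(log_heading("ingest", "Attention Is All You Need", date(2025, 1, 15)))
```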

Query

When I ask a question against the wiki:

  1. Read wiki/index.md to find the relevant hub-*.md file(s).
  2. Read those hub files to find specific pages.
  3. Read those specific pages. Use grep for keyword lookup if the wiki is large.
  4. Synthesize an answer with wiki-link citations (e.g. [[page-name]]).
  5. Decide whether to file the answer as a new wiki page:
    • File it if: the answer required reading 3+ wiki pages, or it produced a comparison/synthesis/analysis that doesn’t already exist in the wiki.
    • Don’t file it if: the answer was a simple factual lookup, or it largely restates a single existing page.
    • When filing, choose the appropriate subdirectory (comparisons/, syntheses/, queries/) and add it to the relevant hub(s).
  6. Output formats: Answers don’t have to be wiki pages. Depending on the question, appropriate formats include:
    • A wiki page (default for reusable knowledge).
    • A comparison table (markdown table in a wiki page or standalone).
    • A diagram or chart (described in markdown; I’ll render it in Obsidian or ask for a specific format).
    • A slide deck (Marp format, saved in wiki/).
    • Ask me if you’re unsure which format fits. Default to a wiki page.
  7. Append to log.md: ## [YYYY-MM-DD] query | Brief question summary.
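
The filing decision in step 5 can be sketched as a small predicate. The function and parameter names are hypothetical, and letting the “don’t file” conditions take precedence over the “file it” conditions is an assumption, since the steps above do not say which wins when both apply.

```python
def should_file_answer(
    pages_read: int,
    new_analysis: bool,
    simple_lookup: bool = False,
    restates_single_page: bool = False,
) -> bool:
    """Decide whether a query answer is worth filing as a wiki page."""
    # Assumption: the "don't file" conditions override the "file it" ones.
    if simple_lookup or restates_single_page:
        return False
    return pages_read >= 3 or new_analysis


# An answer that drew on four pages and restates none of them gets filed.
print(should_file_answer(pages_read=4, new_analysis=False))
```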

Lint

When I say “lint”:

  1. Find orphan pages: pages not linked from any hub.
  2. Find contradictions: claims on one page that conflict with another.
  3. Find stubs: pages with frontmatter but no real content.
  4. Find missing pages: concepts mentioned but lacking their own page.
  5. Check for stale claims superseded by newer sources.
  6. Suggest questions to investigate next, or sources to fill knowledge gaps.
  7. Append to log.md: ## [YYYY-MM-DD] lint | Summary of findings.
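
The orphan check in step 1 is the most mechanical part of a lint pass. A possible sketch follows, assuming hub entries use [[page-name]] links and that a page is identified by its filename stem; the function names are hypothetical.

```python
import re
from pathlib import Path

# Captures the link target, stopping at an alias (|), heading (#), or the closing ]].
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")
CONTENT_DIRS = ("sources", "concepts", "entities", "comparisons", "syntheses", "queries")
PAGE_SUFFIXES = {".md", ".ipynb"}


def linked_from_hubs(wiki_root: Path) -> set[str]:
    """Collect every page name that any hub-*.md file links to."""
    targets = set()
    for hub in wiki_root.glob("hub-*.md"):
        for match in WIKI_LINK.findall(hub.read_text(encoding="utf-8")):
            # Keep the last path component so [[concepts/foo]] and [[foo]] agree.
            targets.add(match.strip().split("/")[-1])
    return targets


def find_orphans(wiki_root: str = "wiki") -> list[str]:
    """Return content pages that no hub links to, as 'subdir/filename'."""
    root = Path(wiki_root)
    linked = linked_from_hubs(root)
    orphans = []
    for sub in CONTENT_DIRS:
        folder = root / sub
        if not folder.is_dir():
            continue
        for page in sorted(folder.iterdir()):
            if page.suffix in PAGE_SUFFIXES and page.stem not in linked:
                orphans.append(f"{sub}/{page.name}")
    return orphans
```

Contradictions, stale claims, and missing pages still need reading and judgment; only the orphan check reduces cleanly to link matching.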

Hard rules

  • Never modify raw/ files. They are immutable sources of truth, not drafts.
  • Never modify templates/ files. They are user-defined.
  • Always use the corresponding templates/ file as the basis when creating a new file in wiki/.
  • Only create wiki content files inside these subdirectories: wiki/sources/, wiki/concepts/, wiki/entities/, wiki/comparisons/, wiki/syntheses/, or wiki/queries/. Never create content files directly in wiki/.
  • Never delete log entries. wiki/log.md is append-only.
  • Keep wiki/index.md small. It links to hubs only — not to individual pages.
  • Frequently update the topic-specific wiki/hub-*.md files. They link to individual pages.
  • Split large hubs. If a hub page exceeds ~40 entries, split it into sub-hubs (e.g. hub-ai-ml.md → hub-ai-architectures.md + hub-ai-training.md + hub-ai-applications.md). Update wiki/index.md to point to the new hubs. Keep the original hub as a redirect or remove it.
  • Flag contradictions explicitly. If a new source contradicts an existing page, note it in the page with a > [!warning] Contradiction callout rather than silently overwriting.
  • Prefer updating over creating. If a concept page already exists, update it. Don’t create a duplicate.
  • Always use [[wiki-links]] for internal linking within files. Never use [markdown links](source.md).
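
The placement rule for content files can be checked before any write. The sketch below is one possible guard, with hypothetical names; treating index.md, log.md, and hub-*.md as the only files allowed directly in wiki/ is an assumption drawn from the navigation layer described earlier.

```python
from pathlib import PurePosixPath

ALLOWED_SUBDIRS = {"sources", "concepts", "entities", "comparisons", "syntheses", "queries"}
NAVIGATION_FILES = {"index.md", "log.md"}


def is_allowed_wiki_path(path: str) -> bool:
    """True if the agent may create or edit a file at this repo-relative path."""
    parts = PurePosixPath(path).parts
    if len(parts) < 2 or parts[0] != "wiki":
        return False  # raw/, templates/, implementations/ are never writable
    if len(parts) == 2:
        name = parts[1]
        # Assumption: only navigation files live directly in wiki/.
        return name in NAVIGATION_FILES or (name.startswith("hub-") and name.endswith(".md"))
    return parts[1] in ALLOWED_SUBDIRS


print(is_allowed_wiki_path("wiki/concepts/backpropagation.md"))
print(is_allowed_wiki_path("wiki/backpropagation.md"))
```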

When in doubt

  • Can’t find a relevant hub? Create a new one. Add it to wiki/index.md.
  • Source doesn’t fit any existing category? Create the most natural hub for it. Ask me if genuinely unsure.
  • Unsure if a concept deserves its own page? Start it as a section within the most related existing page. Promote to its own page later if it grows or gets referenced from multiple pages.
  • Unsure about a factual claim in a source? Include it but add a > [!question] callout noting the uncertainty.
  • Two sources say different things? Include both claims with a > [!warning] Contradiction callout. Don’t pick a winner silently.
  • A source covers a topic the wiki hasn’t touched before? Create the concept/entity pages anyway, even if they’re short. Stubs are better than gaps — they’ll get fleshed out on future ingests.