Documentation as a first-class concern in your agentic workflow

Part 5 of 7 in the Agentic (.NET) developer workflow series
  1. Turning your AI tool into your pair programming companion
  2. Dependency updates that understand your code
  3. Teaching your AI how to write tests with you
  4. Quality gates that actually run: verification and security in the agentic workflow
  5. Documentation as a first-class concern in your agentic workflow
  6. AI-driven usability testing: a think-aloud study with a team of AI testers
  7. Building and evolving your own AI development skills

Most teams write documentation after the feature ships. By then the context is stale, the pressure to move on is high, and the ADR nobody wrote is already forgotten. The agentic dev workflow treats docs as something you generate alongside the code, not something you backfill when someone complains the wiki is out of date.

If you’ve been following this series, you know the workflow gives your AI a persistent memory across sessions and a full catalog of skills for .NET and Angular development. In part 1, I covered the foundation: memory layers, heartbeat, hooks, and multi-repo support. One thread I touched on but didn’t fully pull was this idea of documentation as a first-class concern. That’s what this post is about.

The workflow ships with ten documentation skills. Together they cover every stage of software development: from early proposals to post-incident reviews, from bootstrapping docs for an existing codebase to rendering everything as a browsable site. This post walks through all of them.

Note: The skills shown here reflect my stack and conventions at the time of writing. They improve over time as the workflow learns from daily use. Your project will have different tools, different security concerns, different quality bars. These are examples of what’s possible, not prescriptions. Fork them, adjust them, or use them as inspiration for your own.

The problem with docs-as-afterthought

Documentation written after the fact is almost always incomplete. The developer who made the decision has moved on. The context that made option B obviously wrong is no longer obvious. The runbook gets written the second time an incident happens, not the first.

You’ve been there: a production issue on Friday when you were about to head into the weekend, scanning an empty runbook folder, trying to reconstruct what the service does from a six-month-old commit message. Not fun. The workflow’s answer to this isn’t discipline. It’s automation. When the AI is already reading and writing the code with you, it can write the documentation at the same time, while the context is still fresh.

Three things make this work in practice:

  1. Every doc skill writes to the shared docs repo, not the feature repo
  2. The workflow.json in each project points to that shared repo
  3. The session wrap-up checks for documentation gaps before closing

The result is that docs happen when decisions happen, not when someone schedules a documentation sprint.

workflow.json: the connective tissue

Every project in the workflow carries a workflow.json file at its root. It’s small, but it’s what makes documentation work across multiple repos.

{
  "docsRepo": "my-project.docs",
  "projectType": "dotnet",
  "templates": "../my-project.docs/templates",
  "output": {
    "adr": "../my-project.docs/adr",
    "design": "../my-project.docs/design",
    "architecture": "../my-project.docs/architecture",
    "runbooks": "../my-project.docs/runbooks",
    "postmortems": "../my-project.docs/postmortems",
    "spikes": "../my-project.docs/spikes",
    "prd": "../my-project.docs/prd",
    "rfc": "../my-project.docs/rfc"
  }
}

Every doc skill reads this file first. If it finds a docsRepo, it resolves the path to the shared docs repository and writes there. If no workflow.json exists, it falls back to a local docs/ folder. Either way, the skill works. With workflow.json, all your ADRs, runbooks, and design docs end up in one place regardless of which repo you’re working in.
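The resolution logic is simple enough to sketch. Here is one way it could look in TypeScript; the field names follow the workflow.json above, but the function itself is my illustration, not the skill’s actual code:

```typescript
import * as path from 'node:path';

// Shape of the workflow.json fields used for doc routing (illustrative).
interface WorkflowConfig {
  docsRepo?: string;
  output?: Record<string, string>;
}

// Resolve where a doc type should be written: use the configured
// shared-docs path when workflow.json provides one, otherwise fall
// back to a local docs/ folder.
function resolveDocPath(
  projectRoot: string,
  docType: string,
  config: WorkflowConfig | null, // parsed workflow.json, or null if absent
): string {
  const configured = config?.output?.[docType];
  if (configured) {
    // Paths in workflow.json are relative to the project root
    return path.resolve(projectRoot, configured);
  }
  return path.join(projectRoot, 'docs', docType);
}
```

Either branch returns a usable path, which is what makes the fallback behavior safe: the skill never refuses to run just because a project hasn’t been wired into the shared docs repo yet.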

Your directory layout might look like this:

my-project.api/          ← .NET backend
  └── workflow.json → ../my-project.docs

my-project.frontend/     ← Angular app
  └── workflow.json → ../my-project.docs

my-project.docs/         ← Shared documentation
  ├── adr/
  ├── architecture/
  ├── design/
  ├── runbooks/
  ├── postmortems/
  ├── spikes/
  ├── prd/
  └── rfc/

Two repos producing docs. One docs home. No duplicate folders, no “is this the up-to-date version?” guessing.

Starting from nothing: doc-init

If your project already exists but has no documentation, /doc-init is where you start. It doesn’t ask you to fill in a template. Instead, it analyzes your codebase first.

Phase 1 is all reading: project structure, dependency files, entry points, existing CLAUDE.md and README.md. Then it runs git archaeology using actual git commands that reconstruct your project’s history:

# When were key architectural files added?
git log --diff-filter=A --format="%ai %H" --name-only -- \
  "*.csproj" "*.sln" "package.json" "Dockerfile" "*.bicep"

# Track dependency additions and removals over time
git log --all -p -- "*.csproj" "package.json" \
  | grep -E "^\+.*PackageReference|^\-.*PackageReference"

From this it builds an evolution timeline, grouping history into phases: Genesis, Foundation, Growth, Maturation. It surfaces architectural inflection points: when a new service appeared, when a library was swapped out, when infra moved to Bicep.
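The phase grouping can be sketched as a pure function over the parsed file-addition events. The even four-way split of the project’s lifetime below is a simplification for illustration; the real skill infers phase boundaries from the history itself:

```typescript
// One file-addition event, e.g. parsed from the git log output above.
interface AdditionEvent {
  date: Date;
  file: string;
}

const PHASES = ['Genesis', 'Foundation', 'Growth', 'Maturation'] as const;

// Bucket each event into one of four equal spans of the project's lifetime.
function buildTimeline(events: AdditionEvent[]): Record<string, string[]> {
  const timeline: Record<string, string[]> = {};
  for (const p of PHASES) timeline[p] = [];
  const sorted = [...events].sort((a, b) => a.date.getTime() - b.date.getTime());
  if (sorted.length === 0) return timeline;
  const start = sorted[0].date.getTime();
  const span = Math.max(1, sorted[sorted.length - 1].date.getTime() - start);
  for (const e of sorted) {
    // Index 0..3: which quarter of the lifetime this event falls in
    const idx = Math.min(3, Math.floor(((e.date.getTime() - start) / span) * 4));
    timeline[PHASES[idx]].push(e.file);
  }
  return timeline;
}
```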

Critically, before generating anything, it shows you the plan and waits for approval:

### Codebase Analysis

Project:         my-project.api
Tech stack:      .NET 9, ASP.NET Core, EF Core, MediatR
Architecture:    Modular monolith
External systems: Azure Service Bus, Blob Storage, Stripe
History:         Created 2024-09, 847 commits, 4 contributors

### Documentation Plan
- C4 Context diagram (Level 1)
- C4 Container diagram (Level 2)
- C4 Component diagram — API internals
- Project evolution timeline
- ADRs for 7 inferred architectural decisions
- Domain model reference
- API endpoint reference
- Developer onboarding guide

Proceed? [y/N]

The key phrase is “inferred architectural decisions.” The skill can see that you’re using event sourcing and Azure Container Apps, but it doesn’t know why you chose them. So the generated ADRs are marked Inferred — needs review, with a note: “Rationale unknown — confirm with the team.” You fill in the why. The AI provides the what and the structure.

The output won’t be perfect. Inferred ADRs may miss alternatives you actually considered, and some documents will need significant additions. But it gives you a head start on the right pattern, which is far better than starting from an empty docs folder.

ADRs: decisions that survive the team

Architecture Decision Records are the single highest-value documentation artifact I know of. An ADR written well means that six months later, when someone asks “why are we using X instead of Y?”, the answer is in the repo.

You trigger /doc-adr during or after making a significant technical decision. The skill uses MADR format and handles the numbering automatically. It checks the adr/ output directory, finds the highest existing number, and increments. You never have to think about “is this ADR-0023 or ADR-0024?”

What makes the numbering interesting in a multi-repo setup: the skill counts ADRs in the shared docs repo, not the current project repo. So if my-project.api contributes ADRs 0001 through 0015, and then you switch to my-project.frontend and run /doc-adr, the next number is 0016, not 0001 again. The entire project’s decision history has a single, sequential timeline.
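The numbering rule itself is tiny. A sketch, assuming filenames follow the NNNN-title.md convention (the skill’s actual implementation may differ):

```typescript
// Next ADR number, derived from the filenames already present in the
// shared adr/ directory. Assumes the NNNN-title.md naming convention.
function nextAdrNumber(existingFiles: string[]): string {
  let highest = 0;
  for (const name of existingFiles) {
    const match = /^(\d{4})-/.exec(name);
    if (match) highest = Math.max(highest, parseInt(match[1], 10));
  }
  return String(highest + 1).padStart(4, '0');
}
```

So `nextAdrNumber(['0001-use-postgres.md', '0015-adopt-carter.md'])` yields `'0016'`, no matter which project repo you happen to be working in.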

The MADR format enforces good structure, but structure alone won’t stop you or the AI from writing a decision record that merely rationalizes a single opinion.

There’s a quality checklist baked in:

  • Context must be neutral, not pre-arguing for the chosen option
  • At least two real options with genuine pros and cons (not strawmen)
  • Decision outcome must include a “because” with clear justification
  • Consequences must include at least one negative

That last point matters. Every decision has trade-offs. An ADR that only lists upsides isn’t documenting the decision. It’s marketing it.

One more thing: ADRs are immutable. If you reverse a decision, you write a new ADR that supersedes the old one. The history stays intact.

RFC: think before you build

An RFC is for proposals that need discussion before anyone writes code. You use it when the change is significant enough that you don’t want to just start building and discover the problems halfway through.

/doc-rfc generates the document and auto-assigns the next RFC number from the rfc/ output directory. The structure is deliberately different from an ADR. An RFC is exploratory; an ADR is a record of a resolved decision. The tone guide in the skill puts it plainly: “An RFC is a discussion document, not a decision record — keep the tone exploratory.”

A common workflow is: write an RFC to explore options, get feedback, then close the RFC and write an ADR to record what was decided. The RFC becomes the “what we considered” artifact; the ADR becomes the “what we chose and why.”

PRD: requirements that drive decisions

A Product Requirements Document lives upstream of the technical work. Before the RFC, before the design doc, before the ADR. It captures the problem, the users, the goals, and critically, the non-goals.

/doc-prd is the skill that keeps requirements honest. Its quality checklist requires:

  • Goals that are measurable (numbers, not adjectives)
  • At least two explicit non-goals
  • User stories in “As a… I want… So that…” format
  • Acceptance criteria for every requirement
  • Current vs. target values in success metrics

The non-goals requirement is the one that saves the most arguments. A PRD that says “we are not solving mobile notifications in this phase” means nobody spends three weeks designing a notification system that wasn’t in scope.
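Parts of that checklist are mechanically checkable. A sketch of the idea, with deliberately crude heuristics (a measurable goal contains a number, at least two non-goals); the real skill’s checks are richer than this:

```typescript
// Minimal slice of a PRD draft for checklist purposes (illustrative).
interface PrdDraft {
  goals: string[];
  nonGoals: string[];
}

// Return a list of checklist failures; empty means the draft passes
// these two checks.
function prdChecklistFailures(prd: PrdDraft): string[] {
  const failures: string[] = [];
  for (const goal of prd.goals) {
    // "Measurable" approximated as: the goal states at least one number
    if (!/\d/.test(goal)) failures.push(`Goal not measurable: "${goal}"`);
  }
  if (prd.nonGoals.length < 2) {
    failures.push('Fewer than two explicit non-goals');
  }
  return failures;
}
```

A goal like “Make search faster” fails the first check; “Reduce p95 search latency to 200 ms” passes it.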

Design docs: blueprints with diagrams

A design doc is what comes after the PRD: the technical blueprint for how you’re going to build the thing. /doc-design generates it by reading the relevant source code first, so the document reflects actual project patterns rather than textbook examples.

The quality checklist mandates a C4 diagram at either the Context or Container level. A design doc with no architecture diagram is just prose. The diagram forces clarity about what talks to what. It also helps with estimating the work: once you can see the components and their interactions, scoping the time and effort becomes a lot more concrete.

Design docs link back to their PRD for requirements and forward to the ADRs that capture decisions made during implementation. When you read a design doc six months later, you can follow the chain: PRD defines the problem, design doc shows the solution, ADRs capture why the key choices were made that way.

C4 diagrams: architecture you can actually read

The /viz-c4-diagram skill generates Mermaid C4 diagrams at four levels. The skill follows the C4 model’s zoom principle: Context shows the system and its environment, Container shows the deployable units, Component shows the internals of one container, and Code shows class-level relationships.

A Level 2 Container diagram for a typical project looks like this:

C4Container
  title Container Diagram — my-project

  Person(user, "End User", "Web browser")
  Person(admin, "Admin", "Internal dashboard")

  System_Boundary(s1, "my-project") {
    Container(spa, "Angular SPA", "TypeScript, Angular 19", "User-facing application")
    Container(api, "ASP.NET Core API", ".NET 9, Carter", "Business logic and API endpoints")
    ContainerDb(db, "PostgreSQL", "EF Core", "Application data store")
    Container(worker, "Background Worker", ".NET 9", "Processes async jobs")
    ContainerQueue(bus, "Azure Service Bus", "", "Async messaging")
  }

  System_Ext(stripe, "Stripe", "Payment processing")
  System_Ext(blob, "Azure Blob Storage", "File storage")

  Rel(user, spa, "Uses", "HTTPS")
  Rel(admin, api, "Manages via", "HTTPS")
  Rel(spa, api, "Calls", "REST/JSON")
  Rel(api, db, "Reads/writes", "EF Core")
  Rel(api, bus, "Publishes events to")
  Rel(worker, bus, "Subscribes to")
  Rel(api, stripe, "Processes payments via", "HTTPS")
  Rel(worker, blob, "Stores files to", "SDK")

The skill enforces a few things: every relationship must have a label, no diagram should exceed about fifteen elements, and external systems use the _Ext suffix so they’re visually distinct. Small rules, but they prevent diagrams from becoming unreadable walls.
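Two of those rules are easy to express as lint checks over the Mermaid source. This is an illustration, not the skill’s actual linter, and the comma-splitting is a simplification that ignores commas inside labels:

```typescript
// Lint a Mermaid C4 diagram for two of the rules above: every Rel(...)
// must carry a label, and the element count must stay at or under 15.
function lintC4(source: string): string[] {
  const problems: string[] = [];
  const lines = source.split('\n').map(l => l.trim());
  const elements = lines.filter(l =>
    /^(Person|System|Container|ContainerDb|ContainerQueue|Component)(_Ext|_Boundary)?\(/.test(l));
  if (elements.length > 15) {
    problems.push(`Too many elements: ${elements.length} (max ~15)`);
  }
  for (const line of lines.filter(l => l.startsWith('Rel('))) {
    // Rel(from, to, "label", ...) needs at least three arguments
    const argCount = line.slice(4, -1).split(',').length;
    if (argCount < 3) problems.push(`Unlabeled relationship: ${line}`);
  }
  return problems;
}
```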

/doc-init uses this skill internally to generate architecture diagrams during the initial docs bootstrap. You can also invoke it standalone whenever the architecture changes significantly enough to warrant an update.

Runbooks: documentation for the 3 AM call

A runbook is written for someone who has been paged and doesn’t have time to think. The skill’s tip puts it directly: “Write for someone who’s been paged at 3 AM — be explicit, don’t assume context.”

/doc-runbook generates operational procedures with copy-pasteable commands. Not abstract instructions. Actual commands. Every diagnosis step has a concrete action. The structure forces a decision tree: if X then Y, if Y doesn’t resolve it then Z, here’s when to escalate.

The quality checklist for a runbook requires escalation criteria to be time-based or condition-based, not just “when in doubt, escalate.” That ambiguity is what causes people to sit on an incident longer than they should because they’re not sure if it’s “bad enough” yet.

The value of a runbook shows itself the second time an incident happens. The first time, you figure it out from scratch. The second time, you follow the runbook and resolve it in minutes instead of hours.

Post-mortems: blameless, honest, specific

After an incident is resolved, /doc-postmortem generates the write-up following blameless principles. The format is strict:

  • Impact with concrete numbers, not adjectives
  • Timeline with timestamps, not “then… later…”
  • Root cause that’s technical and specific
  • Contributing factors (root cause is never the only factor)
  • “Where we got lucky” section

That last section is the one most teams skip. It’s also the most valuable. “We got lucky that the staging environment caught this before prod” isn’t just an observation — it’s an implied action item to make the staging environment more production-like.

The skill enforces blameless language throughout. “The deploy pipeline didn’t catch the configuration change” instead of “Bob forgot to update the config.” The distinction matters. One leads to fixing the pipeline. The other leads to people being more careful, which is not a reliable defense.

Every post-mortem should update at least one runbook. The skill notes this explicitly. The knowledge gained during an incident shouldn’t live only in the post-mortem. It should feed back into the operational playbook.

Spikes: capturing research before you forget it

A spike is a time-boxed investigation to answer a specific question. You spend a day evaluating three state management libraries, or prototyping two database access patterns. Then you write it up before you lose the context.

/doc-spike generates a research report with a comparison table if you’re evaluating options. The format requires the question to be stated specifically and answerably — “Which state management library should we use for the dashboard?” not “What state management should we use?”

The recommendation section must be clear and tied back to the question: “We should use X because Y.” If the spike is inconclusive, the skill says to write that. An inconclusive spike that honestly documents what you still don’t know is more useful than a confident recommendation that glosses over unresolved concerns.

The comparison table format is genuinely useful to reference months later. When a new team member asks why you chose one approach over another, you point them at the spike document instead of reconstructing the reasoning from memory.

VitePress: from markdown to a browsable site

All these markdown files are useful on their own, but /tool-vitepress turns them into a proper documentation site that you can run locally or deploy anywhere.

The skill sets up VitePress with the vitepress-plugin-mermaid extension, which means all the C4 diagrams render as interactive diagrams in the browser. The setup takes about sixty seconds:

npm init -y
npm install -D vitepress vitepress-plugin-mermaid mermaid

The config wires up navigation for every doc type the workflow produces:

import { defineConfig } from 'vitepress'
import { withMermaid } from 'vitepress-plugin-mermaid'

export default withMermaid(
  defineConfig({
    title: 'my-project Documentation',
    description: 'Architecture, decisions, and operational docs',
    themeConfig: {
      nav: [
        { text: 'PRDs', link: '/prd/' },
        { text: 'RFCs', link: '/rfc/' },
        { text: 'ADRs', link: '/adr/' },
        { text: 'Design', link: '/design/' },
        { text: 'Architecture', link: '/architecture/' },
        { text: 'Runbooks', link: '/runbooks/' },
      ],
      sidebar: 'auto',
      search: { provider: 'local' },
    },
    mermaid: {},
  })
)

Run npm run docs:dev in my-project.docs and you have a searchable, navigable documentation site at http://localhost:5173. Every ADR, every runbook, every C4 diagram. The C4 diagrams are rendered Mermaid, so they’re not static images. They scale with the viewport and look sharp.
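The docs:dev script isn’t created by npm init; you add it to the docs repo’s package.json yourself. A minimal scripts block, following the usual VitePress naming convention (with the docs repo root as the content root):

```json
{
  "scripts": {
    "docs:dev": "vitepress dev",
    "docs:build": "vitepress build",
    "docs:preview": "vitepress preview"
  }
}
```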

For my project, I have a GitHub Action that builds the docs site and deploys to GitHub Pages on every push to main. The docs URL goes in the project README.md. New team members hit the docs site first, not a wiki that may or may not be accurate.
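A sketch of what such a workflow can look like, using GitHub’s standard Pages actions. Action versions and the build output path are assumptions (VitePress writes to .vitepress/dist when the content root is the repo root); adjust for your setup:

```yaml
name: docs
on:
  push:
    branches: [main]
permissions:
  contents: read
  pages: write
  id-token: write
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment:
      name: github-pages
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npx vitepress build
      - uses: actions/upload-pages-artifact@v3
        with:
          path: .vitepress/dist
      - uses: actions/deploy-pages@v4
```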

The wrap-up ritual and documentation gaps

There’s one more piece that ties this together: the session wrap-up. At the end of every working session, the /meta-wrap-up command (or typing “bye”) triggers a review that checks for documentation gaps.

If you merged a significant architectural change without writing an ADR, the wrap-up surfaces it. If you added a new operational component without a runbook, it flags that too. These checks aren’t blocking; you can still close the session. But they make the gap visible before you forget.

The combination of real-time generation (writing the ADR while making the decision) and gap detection (catching what was missed at session close) means docs stay current without requiring a separate discipline. The process does the reminding.
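The gap checks can be sketched as a pure function over what happened in the session. The trigger conditions here (which file changes count as architectural or operational) are invented for illustration:

```typescript
// What the wrap-up knows about a session (illustrative shape).
interface SessionSummary {
  changedFiles: string[]; // files touched this session
  docsWritten: string[];  // doc types produced, e.g. 'adr', 'runbook'
}

// Return human-readable documentation gaps; empty means nothing flagged.
function documentationGaps(session: SessionSummary): string[] {
  const gaps: string[] = [];
  // Heuristic: project/infra file changes suggest an architectural decision
  const archChange = session.changedFiles.some(f =>
    f.endsWith('.csproj') || f.endsWith('.bicep') || f.endsWith('Dockerfile'));
  if (archChange && !session.docsWritten.includes('adr')) {
    gaps.push('Architectural change without an ADR - consider /doc-adr');
  }
  // Heuristic: new worker or job code suggests an operational surface
  const opsChange = session.changedFiles.some(f =>
    f.includes('Worker') || f.includes('Jobs'));
  if (opsChange && !session.docsWritten.includes('runbook')) {
    gaps.push('New operational component without a runbook - consider /doc-runbook');
  }
  return gaps;
}
```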

Putting it together

Here’s what the documentation workflow looks like in practice for a new feature:

  1. Write the PRD with /doc-prd to define the problem and success criteria
  2. Propose the technical approach with /doc-rfc if it’s non-trivial
  3. Generate the technical blueprint with /doc-design, which includes C4 diagrams
  4. Write an ADR with /doc-adr once key decisions are made during implementation
  5. Write the runbook with /doc-runbook before the feature ships
  6. After any incidents: /doc-postmortem followed by a runbook update
  7. Run /viz-c4-diagram when the architecture changes significantly

This still takes some discipline. In a high-pressure sprint, documentation is the first thing to get skipped. But even when you skip it in the moment, the daily memory preserves the context. The wrap-up ritual flags documentation gaps before closing the session. And when you do get around to writing the docs, the AI generates the first draft from that preserved context instead of asking you to reconstruct it from your memory. The result is documentation that reflects what was actually built and why, not a best-effort reconstruction written six months later when someone complained that the wiki was stale.

For projects that have no docs today: start with /doc-init. It does the archaeology, generates the first draft of everything, and gives you drafts to review rather than blank pages to fill. That’s often the hardest part: starting.

The docs repository, the workflow.json connections, the VitePress site, the sequential ADR numbering across repos: none of it is magic. It’s just a set of conventions that the skills enforce consistently. The insight is that when the AI is already in the session generating code, the cost of also generating the documentation is low. The cost of not having it, when you need it, is always higher than you expect.
