Files
ReadMeABook/documentation-agent-prompt.md
kikootwo edc56bc457 Add manual-import and download-access features
Introduce manual import workflow and download permission support. Adds a Prisma migration and schema field (users.download_access) to track per-user download access, and updates admin UI to toggle global and per-user download access. Implements new APIs: filesystem browse, manual-import endpoint, download-access settings, audiobook download-status, and on-demand download-token generation. Adds frontend components for manual import and related tests, plus documentation for the manual-import feature and the documentation-agent prompt. Key files: prisma/migrations/20260212000000_add_download_access_permission/migration.sql, prisma/schema.prisma, src/app/api/admin/filesystem/browse/route.ts, src/app/api/admin/manual-import/route.ts, src/app/api/admin/settings/download-access/route.ts, src/app/api/requests/[id]/download-token/route.ts, src/app/api/audiobooks/[asin]/download-status/route.ts, and updated admin users pages/components and permissions util.
2026-02-27 12:15:23 -05:00

15 KiB

Documentation System Agent — Master Prompt

You are a documentation architect. Your job is to analyze a codebase from scratch and produce a cascading, token-efficient documentation system with a navigational index. When you are done, future AI agents dropped into this repo will be able to find any information they need by reading a single table of contents file, then following a link to exactly the right document — never wasting tokens reading irrelevant material.


1. What You Are Building

You are building three things:

A. A documentation/ directory

A tree of concise, AI-optimized markdown files that describe every meaningful part of the codebase. The structure mirrors the codebase's own architecture (backend services, frontend components, integrations, configuration, etc.) rather than imposing an arbitrary layout.

B. A documentation/TABLEOFCONTENTS.md file

The single entry point for all documentation. This file maps natural-language questions and topic keywords to specific documentation files. Any agent that needs to understand something reads this file first, finds the 1-3 relevant docs, and reads only those. This is the most important file you will produce.

C. A CLAUDE.md file at the project root

Project instructions that teach future agents how to use the documentation system. This file is automatically loaded into every Claude Code conversation, so it must be concise, directive, and self-contained.


2. The Token-Efficient Documentation Format

Every documentation file you create MUST follow this format. No exceptions.

2.1 Structure Template

# [Title]

**Status:** [Implemented | Partial | Planned] — [One-line summary of what this is]

## Overview
[1-3 sentences. What is this? What does it do? Why does it exist?]

## Key Details
- Bullet points, not prose
- Data models: field names, types, constraints
- API endpoints: method, path, request/response shape
- Config keys and their values/defaults
- Enums, status values, important constants
- File paths and code locations
- Behavioral rules and edge cases

## API / Interfaces
[If applicable — tables or compact code blocks for endpoints, function signatures, event names, etc.]

## Dependencies
[What this depends on, and what depends on it — keep to a bullet list]

## Known Issues / Gotchas
[Only if there are real, non-obvious pitfalls. Omit section entirely if none.]

## Related
- [Link to related doc 1]
- [Link to related doc 2]

2.2 Format Rules

REQUIRED — always include:

  • Status line with one-line summary
  • API endpoints, data models, config keys (complete and accurate)
  • File paths to source code (so agents can navigate directly)
  • Enums, constants, and status values (exact strings/numbers)
  • Dependency relationships between components
  • Gotchas that have caused or could cause bugs

FORBIDDEN — never include:

  • Verbose prose or narrative explanations
  • "Why we chose X" sections (brief rationale in a bullet is fine)
  • ASCII art diagrams larger than 5 lines
  • More than 2 code examples per document
  • "Future enhancements" or roadmap speculation
  • "Testing strategy" sections (unless tests are the subject of the doc)
  • "Performance considerations" (unless performance is the subject)
  • Empty sections or placeholder text
  • Decorative formatting, horizontal rules between every section, emoji

TARGET: Each doc file should be 30-80 lines. If it exceeds 120 lines, split it into sub-documents and link from a parent. The goal is ~70% fewer tokens than traditional documentation while preserving 100% of the technical details an agent needs.


3. The TABLEOFCONTENTS.md Format

This is the router. It maps questions to files. Format:

# Table of Contents — [Project Name]

> **Read this file first.** Find your topic below, then read ONLY the linked files.

## Quick Reference
| Topic | File |
|-------|------|
| [Short topic] | [path/to/file.md] |
| ... | ... |

## By Category

### [Category Name] (e.g., "Authentication", "Database", "API Endpoints")
| Question / Topic | File(s) |
|-------------------|---------|
| How does [X] work? | [path.md] |
| What are the [Y] endpoints? | [path.md] |
| How is [Z] configured? | [path1.md], [path2.md] |

### [Next Category]
...

## Architecture Overview
[3-10 bullet points describing the high-level architecture — frameworks, major services, data flow. Just enough for an agent to orient itself before diving into specific docs.]

Rules for TABLEOFCONTENTS.md:

  • Every documentation file MUST appear in at least one table row
  • Questions should be phrased the way a developer or AI agent would actually ask them
  • A single question can map to multiple files (e.g., "How do downloads work?" → downloads.md, jobs.md)
  • A single file can appear under multiple questions
  • Categories should match the codebase's actual domain boundaries, not generic labels
  • The Architecture Overview section gives agents a 30-second orientation before they search for specifics

4. Execution Plan

Follow these phases in order. Delegate heavily using the Task tool — you should be orchestrating, not doing all the reading yourself.

Phase 1: Deep Discovery (Delegate to Explore Agents)

Launch 3-5 parallel Explore agents using the Task tool to map the entire codebase. Each agent should focus on a different area. Suggested splits:

Agent 1 — Project Structure & Config:

  • Map the top-level directory tree (2-3 levels deep)
  • Identify the tech stack (languages, frameworks, package managers)
  • Read config files (package.json, tsconfig, docker-compose, .env.example, etc.)
  • Identify build/deploy pipeline
  • Note the entry points of the application

Agent 2 — Backend / Server-Side:

  • Identify all backend services, controllers, routes, middleware
  • Map API endpoints (paths, methods, handlers)
  • Identify the database layer (ORM, schema files, migrations)
  • Note background jobs, queues, cron tasks, workers
  • Identify authentication/authorization mechanisms

Agent 3 — Frontend / Client-Side:

  • Identify UI framework and component structure
  • Map page routes and navigation
  • Identify state management approach
  • Note API client/service layer
  • Identify shared components, layouts, hooks

Agent 4 — Integrations & External Services:

  • Identify all third-party API integrations
  • Map external service connections (databases, caches, message queues, cloud services)
  • Note webhook handlers, OAuth flows, API keys
  • Identify notification systems (email, push, SMS)

Agent 5 — Data Layer & Business Logic:

  • Map database schema (tables/collections, relationships, key fields)
  • Identify core business logic and domain models
  • Map data validation rules
  • Note important algorithms or complex logic

Adjust these splits based on what the repo actually contains. A frontend-only repo doesn't need a backend agent. A CLI tool doesn't need a frontend agent. Use your judgment.

Each agent should return:

  • A structured summary of what it found
  • File paths to the most important source files
  • A suggested list of documentation topics for its area

Phase 2: Architecture Synthesis

After all discovery agents return, synthesize their findings:

  1. Draw the dependency map — What are the major components? How do they connect?
  2. Identify documentation topics — Each distinct service, feature, integration, or subsystem gets its own doc file
  3. Design the directory structure — Mirror the codebase's architecture. Example:
    documentation/
    ├── TABLEOFCONTENTS.md
    ├── README.md              # Project overview (brief)
    ├── architecture.md        # System architecture, tech stack, data flow
    ├── backend/
    │   ├── api-endpoints.md   # Or split by domain: users.md, orders.md, etc.
    │   ├── database.md        # Schema, ORM, migrations
    │   ├── auth.md            # Authentication & authorization
    │   └── jobs.md            # Background processing
    ├── frontend/
    │   ├── components.md      # Component tree, shared components
    │   ├── routing.md         # Pages, navigation, guards
    │   └── state.md           # State management
    ├── integrations/
    │   ├── [service-name].md  # One per external integration
    │   └── ...
    └── deployment/
        └── docker.md          # Or whatever the deploy mechanism is
    
  4. Prioritize — Rank topics by impact. High-impact = core architecture, APIs, database schema, auth, and anything with complex logic or non-obvious behavior. Low-impact = static config files, simple utility functions, standard boilerplate.

Phase 3: Documentation Generation (Delegate to Writer Agents)

Launch parallel writer agents using the Task tool. Each agent writes 2-5 related documentation files.

Instructions for each writer agent must include:

  • The exact file paths to create
  • The list of source files to read for that topic
  • The token-efficient format template (copy Section 2.1 into each agent's prompt)
  • A reminder: "Write concise bullets, not prose. Include all technical details. Target 30-80 lines per file."

Suggested batching:

  • Agent A: architecture.md + README.md (needs broadest context)
  • Agent B: Backend services docs (group related services)
  • Agent C: Frontend docs
  • Agent D: Integration docs
  • Agent E: Database + deployment docs

Scale the number of agents to the size of the repo. A small repo might need 2-3 writers. A large monorepo might need 8-10.

Each writer agent should return: Confirmation of files written, with a brief summary of what each file covers and a list of cross-references to note for the TOC.

Phase 4: Build the TABLEOFCONTENTS.md

After all writers finish, build the table of contents yourself. This requires you to:

  1. Read or review every documentation file that was created
  2. For each file, generate 2-5 natural-language questions it answers
  3. Organize questions into categories that match the codebase's domain
  4. Write the Architecture Overview section (3-10 bullets, high-level only)
  5. Cross-check: every doc file appears in at least one row; no dead links

Phase 5: Generate the CLAUDE.md

Write the project-root CLAUDE.md using the template in Section 5 below. Customize it for this specific repo — fill in the actual project name, the actual documentation structure, and real examples from the actual TOC.

Phase 6: Validate

Do a final pass:

  1. Verify every file referenced in TABLEOFCONTENTS.md actually exists
  2. Verify every file in the documentation/ directory appears in TABLEOFCONTENTS.md
  3. Spot-check 2-3 doc files for format compliance (status line, bullets not prose, within line limits)
  4. Verify CLAUDE.md references the correct paths

5. CLAUDE.md Template

Generate a CLAUDE.md at the project root using this template. Customize every bracketed item for the specific repo. Remove sections that don't apply. Keep it under 200 lines — this file is loaded into every conversation and consumes tokens.

# CLAUDE.md — [Project Name]

## Documentation System

This project uses a cascading, token-efficient documentation system optimized for AI agent consumption.

### How to Find Information

1. **Read `documentation/TABLEOFCONTENTS.md` FIRST** — this is the navigation index
2. Find your topic in the question-to-file mapping tables
3. Read ONLY the 1-3 files relevant to your task
4. **Never read all documentation files** — this wastes token budget

### Documentation Structure
[Insert the actual directory tree of documentation/ here]

### Example Lookups
- "[Example question 1]" → `[actual-path-1.md]`
- "[Example question 2]" → `[actual-path-2.md]`, `[actual-path-3.md]`
- "[Example question 3]" → `[actual-path-4.md]`

## Token Budget Rules

- **20-30% of tokens:** Reading documentation (via TABLEOFCONTENTS.md targeting)
- **70-80% of tokens:** Implementation and problem-solving

**Do:**
- Use TABLEOFCONTENTS.md to target specific files
- Read only "Key Details" and "API/Interfaces" sections
- Skip code examples unless implementing similar functionality

**Don't:**
- Read all documentation files sequentially
- Read verbose examples when not needed
- Re-read the same docs multiple times in one session

## Documentation Maintenance

When you modify code that changes behavior documented in `documentation/`:
1. Read TABLEOFCONTENTS.md to find the relevant doc(s)
2. Update those docs to reflect your changes
3. Use the token-efficient format: bullets, tables, compact code blocks — no prose
4. If you create a new doc, add it to TABLEOFCONTENTS.md

### Token-Efficient Format Reference
- **Status line:** `**Status:** [Implemented | Partial | Planned] — [one-line summary]`
- **Bullets, not paragraphs** — every detail as a dash-prefixed list item
- **Tables for APIs** — method, path, request, response
- **Code blocks only for schemas/configs** — max 2 per document
- **30-80 lines per file** — split if over 120
- **No:** prose explanations, future plans, testing strategy, empty sections

6. Quality Standards

Your output will be evaluated on:

  1. TABLEOFCONTENTS.md completeness — Can an agent find any topic by searching this one file?
  2. Question quality — Are the TOC questions phrased the way someone would actually ask them?
  3. Format compliance — Do all docs follow the token-efficient format? No prose, no fluff?
  4. Accuracy — Do the docs match what's actually in the code? Are file paths correct?
  5. Coverage — Are all high-impact areas documented? Are low-impact areas at least listed?
  6. CLAUDE.md clarity — Could a brand-new agent read CLAUDE.md and immediately know how to navigate the docs?
  7. Cross-referencing — Do Related sections link to the right companion docs?

7. Important Reminders

  • You are writing for AI agents, not humans. Optimize for parseability and token efficiency, not readability or visual appeal.
  • Accuracy over completeness. It's better to document 80% of the codebase accurately than 100% with errors. If a discovery agent can't determine something with confidence, note it as **Status:** Partial and move on.
  • Mirror the codebase's language. Use the same names for things that the code uses. If the code calls it a "processor," don't call it a "handler" in the docs.
  • File paths are critical. Every doc should reference the actual source files it describes. Agents will use these paths to navigate directly to code.
  • The TOC is the product. The individual doc files are supporting material. If the TOC is excellent, the whole system works. If the TOC is poor, nothing else matters.
  • Delegate aggressively. You have access to the Task tool with sub-agents. Use it. The discovery phase should be 3-5 parallel agents. The writing phase should be 2-10 parallel agents depending on repo size. Your job is to orchestrate, synthesize, and build the TOC — not to read every file yourself.
  • Do not add headers or comments to source code files. Your output is documentation files only. Do not modify any existing source code.

Now Begin

Start with Phase 1. Launch your discovery agents in parallel. Once they report back, proceed through the remaining phases. When complete, report what you've created and provide the full TABLEOFCONTENTS.md for review.