---
name: docs-discovery
description: Load all .md documentation relevant to the user's current coding task BEFORE searching source code or making modifications. Scans `docs/` folders via Glob using topic keywords extracted from the user's message, loads paired main/ISSUES/TODO .md sets as one unit, and respects the no-re-read rule (skips already-loaded files). Use when the user's request mentions any domain concept, class name, file area, or feature behavior — invoke BEFORE the first `code_search`/`get_file`/`Grep` on source files. Typical triggers: any coding/refactoring task, "let's fix X", "review Y", "how does Z work", or any question that would otherwise lead to source-code exploration.
compatibility: Designed for Claude Code and GitHub Copilot (VS). Uses the host agent's Glob/Read tools.
metadata:
  author: Fullepi
  version: "1.0"
---

# docs-discovery

Ensure relevant `.md` documentation is loaded **before** any source-code search or modification, so the LLM has the documented behaviour, known issues, and planned work in context. This saves many `Grep` / `get_file` rounds: reading a handful of `.md` files upfront is cheaper than rediscovering the same information via code search.

## Before you start

This skill READS `.md` files and updates the LLM's `[LOADED_DOCS: ...]` state. It MUST NOT modify any file. Follow the **no-re-read** and **explicit-consent-for-modifications** rules from the active repo's `copilot-instructions.md` (rule numbers may differ per repo — refer to the rule NAMES).

## Step 1 — Extract topic keywords

Parse the user's most recent message (and the wider conversation tail if relevant) for concrete concepts. Examples:

- Class / type names: `AcLoggerBase`, `SegmentBufferReader`, `AcBinaryHubProtocol`, `<Consumer>SignalRClient` (any derived/consumer-specific type)
- Feature areas: "logger", "log writer", "serializer", "SignalR", "hub protocol", "chunked framing", "connection builder", "options"
- File hints: `Program.cs`, `AcLoggerBase.cs`, `SIGNALR.md`
- Patterns / idioms: "DI factory", "appsettings", "mode negotiation"

Derive **root topic tokens** from these — singular, lowercase, domain-defining words:
- `"AcLoggerBase"` → `logger`, `logging`
- `"SignalR client"` → `signalr`
- `"AcBinarySerializer"` → `binary`, `serializer`
- `"AcBinaryHubProtocol"` → `protocol`, `signalr_binary_protocol`, `binary`
- `"chunked"` → `signalr_binary_protocol`

Keep the set small (usually 1-3 root tokens). If the request genuinely spans multiple domains, include all.

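Token derivation is a semantic judgment made by the agent, not a lookup. Purely as an illustration, the mapping examples in this step could be approximated with a small table-driven sketch. `TOPIC_HINTS` and `root_tokens` are hypothetical names, the table only mirrors the examples listed here, and naive substring matching will produce false positives on real messages.

```python
# Hypothetical hint table: lowercase fragment -> root topic tokens.
# The entries only mirror the examples in this step; a real repo would differ.
TOPIC_HINTS = {
    "logger": ["logger", "logging"],
    "signalr": ["signalr"],
    "binary": ["binary"],
    "serializer": ["serializer"],
    "protocol": ["protocol", "signalr_binary_protocol"],
    "chunked": ["signalr_binary_protocol"],
}


def root_tokens(message: str) -> list[str]:
    """Return de-duplicated root tokens whose fragment occurs in the message."""
    lowered = message.lower()
    tokens: list[str] = []
    for fragment, roots in TOPIC_HINTS.items():
        if fragment in lowered:
            for root in roots:
                if root not in tokens:
                    tokens.append(root)
    return tokens


print(root_tokens("AcLoggerBase"))         # ['logger', 'logging']
print(root_tokens("AcBinaryHubProtocol"))  # ['binary', 'protocol', 'signalr_binary_protocol']
```

The result stays small by construction, which matches the "usually 1-3 root tokens" guidance.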
## Step 2 — Map tokens to glob patterns (semantic, not hardcoded)

### ⚠️ CRITICAL — the recursive `**/` wildcard is MANDATORY in every glob

The `**/` is NOT cosmetic. It matches `docs/` at **any depth** in the workspace:
- repo-root: `<Repo>/docs/TOPIC/`
- project-level: `<Repo>/<Project>/docs/TOPIC/` ← **very common for Pattern B layouts**
- nested: `<Repo>/<Project>/<SubProject>/docs/TOPIC/`

**Correct form — always**: `<OptionalRepoPrefix>/**/docs/{TOKEN}/**/*.md`
**Wrong form — never**: `<OptionalRepoPrefix>/docs/{TOKEN}/**/*.md` (missing the leading `**/`)

**Failure mode** (this happens often with Pattern B projects):
- You know the target repo (e.g. via `own-dep-repos`) — say `<Repo> = AyCode.Core`.
- You synthesize `<Repo>/docs/{TOKEN}/...` because "that's where docs usually live".
- Glob returns 0 matches (repo-root `docs/` doesn't contain topic folders — only flat reference docs).
- You conclude "no docs exist" and fall through to code-search.
- Meanwhile the actual docs sit at `<Repo>/<Project>/docs/{TOKEN}/` — one level deeper.

**The rule is absolute**: NEVER drop the leading `**/`, even when you "know" the repo. Let the recursive glob find the actual depth. Relative-path guesses based on "usual" layouts are a reliable source of false-empty conclusions.

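The failure mode can be reproduced in a throwaway directory. The sketch below uses Python's `pathlib` globbing as a stand-in for the host agent's Glob tool, which is an assumption: the real tool may differ in details. `SomeProject` and `find_docs` are hypothetical names; `AyCode.Core` is the example repo from the list above.

```python
import tempfile
from pathlib import Path


def find_docs(workspace: Path, pattern: str) -> list[str]:
    """Return workspace-relative POSIX paths that match a glob pattern."""
    return sorted(p.relative_to(workspace).as_posix() for p in workspace.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)

    # Pattern B layout: the topic folder sits under a project, not the repo root.
    topic_dir = workspace / "AyCode.Core" / "SomeProject" / "docs" / "LOGGING"
    topic_dir.mkdir(parents=True)
    (topic_dir / "README.md").write_text("# Logging\n", encoding="utf-8")

    # The repo-root docs/ folder holds only flat reference docs.
    root_docs = workspace / "AyCode.Core" / "docs"
    root_docs.mkdir()
    (root_docs / "ARCHITECTURE.md").write_text("# Architecture\n", encoding="utf-8")

    # Wrong form: guessed path without the leading **/
    guessed = find_docs(workspace, "AyCode.Core/docs/LOGGING/**/*.md")
    # Correct form: recursive wildcard before docs/
    recursive = find_docs(workspace, "AyCode.Core/**/docs/LOGGING/**/*.md")

print(guessed)    # []
print(recursive)  # ['AyCode.Core/SomeProject/docs/LOGGING/README.md']
```

The guessed pattern returns nothing even though the documentation exists, which is exactly the false-empty conclusion this section warns about.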
### File layout convention

(See `LLM_PROTOCOL_DECISIONS.md` entry "Docs migrated to folder+README pattern".)

Topics with multiple files live in named folders: `docs/TOPIC/README.md` + `docs/TOPIC/TOPIC_ISSUES.md` + `docs/TOPIC/TOPIC_TODO.md` (or other `TOPIC_*.md` companions). Single-file reference docs remain flat at the `docs/` root (e.g., `docs/ARCHITECTURE.md`, `docs/GLOSSARY.md`).

For each root token, synthesize glob patterns targeting BOTH layouts:

| Token example | Primary glob (folder) | Companion glob (flat + variants) |
|---|---|---|
| `logger`, `log`, `logging` | `**/docs/LOGGING/**/*.md` | `**/docs/LOGGING_*.md` (legacy/variants) |
| `binary`, `serializer` | `**/docs/BINARY/**/*.md` | `**/docs/BINARY_*.md` |
| `signalr`, `hub` | `**/docs/SIGNALR*/**/*.md` | — (covers SIGNALR + SIGNALR_BINARY_PROTOCOL folders) |
| `protocol`, `wire`, `chunked` | `**/docs/*PROTOCOL*/**/*.md` | — |
| `grid`, `mggrid` | `**/docs/MGGRID/**/*.md` | — |
| `architecture`, `conventions`, `glossary` | — (flat, single-file) | `**/docs/ARCHITECTURE.md`, `**/docs/CONVENTIONS.md`, `**/docs/GLOSSARY.md` |

Do NOT require tokens to match a pre-baked list — construct patterns from the token itself, uppercased:
- Primary: `**/docs/{TOKEN}/**/*.md` (matches everything inside the topic folder)
- Companion/variant: `**/docs/{TOKEN}_*.md` (matches flat companion files such as `LOGGING_ISSUES.md`; because it ends in `.md` it cannot match a folder, so variant-prefix folders like `SIGNALR_BINARY_PROTOCOL` need the wildcard folder form `**/docs/{TOKEN}*/**/*.md`, as in the `signalr` row above)

Natural language variants (logger/logging, serialize/serializer, binary/binaries) should all be attempted against both the primary and companion patterns.

**For README.md discovery** (folder-navigation rule): if a topic folder match is found, the `README.md` in that folder is the entry point and MUST be included in the load set (not just sibling `_ISSUES` / `_TODO` files).

(See the CRITICAL section at the top of this Step 2 for the full explanation of why the leading `**/` is mandatory — this is the most common cause of false-empty docs conclusions.)

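Pattern construction from a token is mechanical enough to sketch. The helper below is a minimal illustration, assuming the primary and companion forms given in this step; `glob_patterns` and its `repo_prefix` parameter are hypothetical names, not part of any host tool.

```python
def glob_patterns(token: str, repo_prefix: str = "") -> dict[str, str]:
    """Build the primary (folder) and companion (flat) globs for one root token.

    The leading **/ is always emitted, even when a repo prefix is known.
    """
    topic = token.upper()
    prefix = f"{repo_prefix.rstrip('/')}/" if repo_prefix else ""
    return {
        "primary": f"{prefix}**/docs/{topic}/**/*.md",
        "companion": f"{prefix}**/docs/{topic}_*.md",
    }


# Natural-language variants are each tried against both patterns.
for variant in ("logger", "logging"):
    print(glob_patterns(variant))

print(glob_patterns("binary", repo_prefix="AyCode.Core")["primary"])
# AyCode.Core/**/docs/BINARY/**/*.md
```

Because the prefix is only ever prepended to a pattern that already starts with `**/`, the helper cannot produce the wrong form described in the CRITICAL section.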
## Step 3 — Execute the Glob and dedupe against already-loaded docs

Run each glob pattern via the host agent's Glob tool. Collect all matching absolute paths.

**Dedupe against the `[LOADED_DOCS: ...]` prefix:**
- If a match is already in LOADED_DOCS → skip it (the **no-re-read** rule)
- If a match is under `bin/`, `obj/`, `node_modules/`, `Test_Benchmark_Results/`, or a worktree-backup path → skip it (not framework docs)

If the total match count exceeds 10, narrow the glob pattern (e.g., require the domain token near the start of the filename, not just as a substring). LLM context is finite.

**False-empty guardrail:** if the glob returns 0 matches OR all matched files are 0-byte, do NOT conclude "docs are empty". First re-validate the glob (typo? literal path substituted?) and retry once with the same token under a corrected `**/docs/...` pattern (NEVER with an ad-hoc path guess). Only after the validated retry also fails should you fall through to code-search.

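The dedupe and exclusion rules in this step can be sketched as a single filter. This is an illustration only: `filter_matches`, `needs_narrowing`, and the constants are hypothetical names, paths are assumed to use forward slashes, and the worktree-backup check is left out because its path convention is repo-specific.

```python
from pathlib import PurePosixPath

# Build/output directories whose .md files are not framework docs.
EXCLUDED_DIRS = {"bin", "obj", "node_modules", "Test_Benchmark_Results"}
MAX_FILES = 10  # beyond this, narrow the glob instead of loading everything


def filter_matches(matches: list[str], loaded: set[str]) -> list[str]:
    """Drop already-loaded docs, excluded directories, and duplicate hits."""
    kept: list[str] = []
    for path in matches:
        if path in loaded:
            continue  # no-re-read rule
        if EXCLUDED_DIRS & set(PurePosixPath(path).parts):
            continue  # build artefact or dependency copy
        if path not in kept:
            kept.append(path)
    return kept


def needs_narrowing(kept: list[str]) -> bool:
    """True when the filtered set is too large to load in one invocation."""
    return len(kept) > MAX_FILES


matches = [
    "P/docs/LOGGING/README.md",
    "P/docs/LOGGING/LOGGING_ISSUES.md",
    "P/bin/Debug/docs/LOGGING/README.md",
    "P/docs/LOGGING/README.md",  # same file hit by a second glob
]
print(filter_matches(matches, loaded={"P/docs/LOGGING/LOGGING_ISSUES.md"}))
# ['P/docs/LOGGING/README.md']
```

An empty result from this filter is not the same as "no docs exist": the false-empty guardrail applies to the raw glob result before filtering.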
## Step 4 — Load the filtered set

Read all remaining matches in parallel (batch the Read calls in one tool-use block). The newly-loaded files will appear in your next response's `[LOADED_DOCS: ...]` prefix under the `+K this turn: <short names>` delta, per the active repo's Rule #1 format (basename by default; `TOPIC/README.md` for topic-folder READMEs to disambiguate across the many `README.md` files the Pattern-B docs layout introduces).

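The short-name convention can be illustrated with a small formatter. This is a sketch under assumptions: the authoritative prefix format lives in the active repo's Rule #1, the comma separator between names is a guess, and `short_name` and `loaded_docs_prefix` are hypothetical names.

```python
from pathlib import PurePosixPath


def short_name(path: str) -> str:
    """Basename by default; TOPIC/README.md for READMEs inside a folder."""
    p = PurePosixPath(path)
    if p.name == "README.md" and p.parent.name:
        return f"{p.parent.name}/{p.name}"
    return p.name


def loaded_docs_prefix(total: int, new_paths: list[str]) -> str:
    """Format the cumulative count plus this turn's delta."""
    names = ", ".join(short_name(p) for p in new_paths)
    return f"[LOADED_DOCS: {total} files (+{len(new_paths)} this turn: {names})]"


new = ["Proj/docs/LOGGING/README.md", "Proj/docs/LOGGING/LOGGING_ISSUES.md"]
print(loaded_docs_prefix(5, new))
# [LOADED_DOCS: 5 files (+2 this turn: LOGGING/README.md, LOGGING_ISSUES.md)]
```

Qualifying only `README.md` keeps the prefix short while still distinguishing one topic's README from another's.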
## Step 5 — Respect the paired-docs convention

If any `{DOMAIN}.md` is loaded (e.g., `LOGGING.md`), or its folder-layout equivalent `{DOMAIN}/README.md`, ALSO glob and load its companions:

- `{DOMAIN}_ISSUES.md` — known issues / limitations / workarounds
- `{DOMAIN}_TODO.md` — planned work / open tickets

These are **paired docs** and must be loaded as a set. Skipping ISSUES/TODO risks reintroducing fixed bugs or conflicting with ongoing refactors.

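Companion lookup follows directly from the naming convention. The sketch below derives candidate companion paths for a main doc in either layout; `companion_paths` is a hypothetical helper, it only handles main docs (not a companion passed in as input), and the candidates still have to be confirmed with a glob before loading.

```python
from pathlib import PurePosixPath


def companion_paths(doc_path: str) -> list[str]:
    """Return candidate ISSUES/TODO siblings for a main doc.

    Flat layout:   docs/LOGGING.md        -> docs/LOGGING_ISSUES.md, ...
    Folder layout: docs/LOGGING/README.md -> docs/LOGGING/LOGGING_ISSUES.md, ...
    """
    p = PurePosixPath(doc_path)
    domain = p.parent.name if p.name == "README.md" else p.stem
    return [str(p.with_name(f"{domain}_{suffix}.md")) for suffix in ("ISSUES", "TODO")]


print(companion_paths("docs/LOGGING.md"))
# ['docs/LOGGING_ISSUES.md', 'docs/LOGGING_TODO.md']
print(companion_paths("docs/LOGGING/README.md"))
# ['docs/LOGGING/LOGGING_ISSUES.md', 'docs/LOGGING/LOGGING_TODO.md']
```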
## Step 6 — Proceed to the user's task

The response's `[LOADED_DOCS: N files (+K this turn: <basenames>)]` prefix (per the active repo's Rule #1) already surfaces the newly-loaded filenames and the cumulative count. **No separate confirmation line is needed** — the prefix itself is the confirmation. Continue directly to the user's actual request.

If any relevant docs were skipped as already loaded (no-re-read dedupe), you MAY mention them inline where relevant (e.g., "I already have LOGGING.md from earlier"). Do not reiterate the full loaded list.

## Do NOT

- **Re-read** any `.md` file already in `[LOADED_DOCS: ...]` — the **no-re-read** rule is absolute (check the active repo's `copilot-instructions.md` for the authoritative phrasing; rule number may differ per repo). The only exception: the user explicitly states that the file has changed on disk via external means.
- **Load unrelated domains** — if the user asks about the Logger, don't load SignalR docs "just in case".
- **Load more than ~10 files** in a single invocation — if the glob matches more, refine the pattern. If the request truly spans many domains, split it into multiple sequential invocations, each with a narrower scope.
- **Skip folder `README.md`** — if the active repo's conventions include a **folder-navigation / folder-README-first** rule, honour it. `README.md` in a loaded `docs/` folder is always in scope.

## Tool usage

This skill is tool-neutral. Map these capabilities to the host agent's tools (per the active repo's `CLAUDE.md`):

- Globbing file paths: `Glob` (Claude Code), `file_search` (Copilot), `Get-ChildItem -Recurse -Filter` (PowerShell)
- Reading files: `Read` (Claude Code), `get_file` (Copilot)
- Parallelizing reads: issue multiple tool calls in a single response where the host supports it

## Edge cases

- **No matching docs found:** Emit `> docs-discovery: no .md matches for tokens [list]. Proceeding with code-search only.` This is informational — the task may be in a domain without documentation, which is itself a signal to be careful.
- **Token extraction is ambiguous:** Prefer SUPERSET — load a few extra .md files rather than missing relevant ones. Loading 3 extra docs is cheap; missing ISSUES.md and reintroducing a fixed bug is expensive.
- **User says "don't load docs" / "just search the code":** Respect it. Skip this skill entirely for that turn.
- **Recursive trigger:** If loaded docs reference other `.md` files via cross-reference, do NOT auto-follow unless the user's request explicitly extends to them. Cross-refs can cascade; relevance-bounded glob is the primary mechanism.