From 9a53aa1d73bc981c8b708e4c3e6d75308780ca84 Mon Sep 17 00:00:00 2001 From: Loretta Date: Sat, 25 Apr 2026 17:55:21 +0200 Subject: [PATCH] [LOADED_DOCS: 5 files (+2 this turn: BINARY_ISSUES.md, LOGGING_ISSUES.md)] Simplify Status field; add docs-archive skill & archiving - Reduced Status field values in issues/TODOs to Open, InProgress, Closed; updated all affected entries to new convention - Introduced docs-archive skill for rotating Closed entries into year-bucketed archive files; process is user-invoked or LLM-suggested, never automatic - Expanded docs-discovery and protocol documentation to clarify archive file handling and on-demand loading - Updated session setup: only reactive skills pre-loaded, user-gated skills now lazy-loaded for token efficiency - Clarified and documented Status update workflow, archive eligibility, and lifecycle - Updated all relevant issue/TODO files to match new Status conventions and archival process --- .github/LLM_PROTOCOL_DECISIONS.md | 2 + .github/copilot-instructions.md | 12 +- .github/skills/docs-archive/SKILL.md | 138 +++++++++++++++++ .../docs-check/references/TOPIC_CODES.md | 33 +++++ .github/skills/docs-discovery/SKILL.md | 39 +++++ AyCode.Core/docs/BINARY/BINARY_ISSUES.md | 30 ++-- AyCode.Core/docs/LOGGING/LOGGING_ISSUES.md | 4 +- AyCode.Core/docs/XCUT/XCUT_ISSUES.md | 2 +- .../Logins/AcLoginServiceServer.cs | 16 ++ .../docs/SIGNALR/SIGNALR_ISSUES.md | 25 ++-- .../SIGNALR_BINARY_PROTOCOL_ISSUES.md | 4 +- .../SIGNALR_BINARY_PROTOCOL_TODO.md | 139 ++++++++++++++++++ 12 files changed, 411 insertions(+), 33 deletions(-) create mode 100644 .github/skills/docs-archive/SKILL.md diff --git a/.github/LLM_PROTOCOL_DECISIONS.md b/.github/LLM_PROTOCOL_DECISIONS.md index 9d98905..fe5ed9a 100644 --- a/.github/LLM_PROTOCOL_DECISIONS.md +++ b/.github/LLM_PROTOCOL_DECISIONS.md @@ -148,6 +148,8 @@ The "primary" vs "inherit" distinction is defined in `protocol-audit/references/ | LLMP-DEC-40 | 2026-04-24 | Rule #1 clarification: removed 
false "globally unique by basename" claim for `copilot-instructions.md` | **Observation**: a parallel Claude Code self-audit session noticed a literal contradiction in Rule #1. The short-name rule line 18 said `copilot-instructions.md` is "globally unique by basename" — true for a typical session (1 active repo = 1 loaded copilot-instructions.md) but FALSE for `protocol-audit` sessions, which load all 8 `copilot-instructions.md` files simultaneously. The claim conflicts with the cross-project-disambiguation rule one line above (line 17) whenever the audit use-case fires. The LLM correctly identified the ambiguity and proposed a patch. **Fix**: replaced the parenthetical "globally unique by basename" with an explicit acknowledgment of the `protocol-audit` collision case + pointer to the already-existing cross-project-disambiguation rule above. New wording uses concrete disambiguation examples (`AyCode.Core/copilot-instructions.md`, `FruitBankHybridApp/copilot-instructions.md`). The `.github/` prefix prohibition is retained — now justified as "implicit location" rather than "already unique". **Deferred (explicit YAGNI)**: proposed new `protocol-audit` C5 invariant (semantic validation: Rule #1 basename-uniqueness claims must match the actual file-set in REPOS.md). Rejected for now — one known instance isn't enough to justify invariant surface / false-positive risk; revisit if a second similar contradiction appears. | `5× primary` `copilot-instructions.md` (Rule #1 line 18 patched) | | LLMP-DEC-41 | 2026-04-24 | Created `adr-author` skill — 4th workspace skill, extends Session Setup pre-load from 3 to 4 | **Gap identified**: the existing skill-stack (docs-discovery, docs-check, protocol-audit) covers documentation retrieval, code↔docs drift detection, and cross-repo file-structural audit — but has no artifact-producing workflow for **pre-code architectural decisions** (greenfield repo planning, feature design, tech-stack choices, library selection, migration planning). 
The `{TOPIC}_TODO.md` format is an implementation queue (known scope, code ref expected, P1-P3 priority); it cannot naturally hold decision rationale, weighed alternatives, or trade-off tables. Without a dedicated artifact, design rationale tends to disappear into chat history or get flattened into TODO entries that lose the "why". **Solution**: `adr-author` skill encoding the canonical Michael Nygard ADR workflow as a structured interview (Step 1 routing → Step 2 context → Step 3 alternatives → Step 4 trade-offs → Step 5 decision → Step 6 consequences → Step 7 draft + write → Step 8 cross-refs). Tool-neutral like the other three skills. **Routing logic**: two decision-log paradigms supported — per-repo `docs/adr/NNNN-.md` Nygard-style files (product/code/domain decisions) VS workspace-level `LLM_PROTOCOL_DECISIONS.md` table rows with ID `LLMP-DEC-N` (protocol-meta decisions). **Invocation model (new): explicit user request OR LLM-suggest-back** — LLM flags when conversation exhibits 2+ ADR-heuristics (multiple alternatives weighed, irreversible, cross-cutting, 6-18mo re-openable); user must confirm before invocation. Never auto-invoke. **Session Setup pre-load 3 → 4 skills**: consistent with existing `protocol-audit` (on-demand) pattern — the skill's content must be in context for trigger recognition. **Handoff**: `docs-check` can flag ADR-worthy observations at end-of-response; `docs-discovery` owns "how does X work" questions, `adr-author` owns "let's design X" questions. **User rationale for broader scope than greenfield-only**: user correctly pointed out ADR-worthy decisions arise in mature code too (refactors, library additions, migration paths), not just new-repo planning — the original Nygard definition supports this scope. 
| new: `AyCode.Core/.github/skills/adr-author/SKILL.md` + `AyCode.Core/.github/skills/adr-author/references/ADR_TEMPLATE.md`; `5× primary` `copilot-instructions.md` (Session Setup: three→four; prefix count 4→5); `3× inherit` `copilot-instructions.md` (Session Setup: three→four; prefix count 5→6); `4× non-Core primary` + `3× inherit` `copilot-instructions.md` (Shared Agent Skills intro "All three"→"All four"; new adr-author bullet added after docs-check bullet) | | LLMP-DEC-42 | 2026-04-25 | `adr-author` SKILL.md: added "Multi-project repo case" placement guidance to Step 1 (cross-cutting vs project-scoped) | **Incident**: a Claude Code session executing `adr-author` for a bearer token design produced an excellent ADR draft but left the `docs/adr/` location ambiguous — the draft path said `docs/adr/0001-...` without resolving whether that meant `/docs/adr/` (repo root, cross-cutting placement) or `//docs/adr/` (project-scoped). The decision in question affected 2+ projects (`AyCode.Services` client + `AyCode.Services.Server`), so the placement question was real, not academic. The original SKILL.md Step 1 used `/docs/adr/...` placeholder which silently assumed single-project repos; multi-project layouts (which exist in this and likely other repos) had no explicit guidance. **Fix**: added a "Multi-project repo case" sub-section to Step 1 with three rules: (1) existing convention wins (Glob first; never fragment by creating a parallel folder); (2) no existing folder → match scope to placement (cross-cutting → highest common ancestor, typically repo root; project-scoped → that project's `docs/`); (3) ambiguous scope → ask the user explicitly before proceeding. Step 7's path reference updated to `` resolved in Step 1, removing the misleading `/docs/adr/` literal. **Generic, not hardcoded**: no specific repo names or project paths in the rules — they apply to any multi-project repo. 
| `adr-author/SKILL.md` (Step 1 + Step 7 updates) | +| LLMP-DEC-43 | 2026-04-25 | Lazy-load shift for user-gated skills + new `docs-archive` skill + Status field convention | **Three integrated changes consolidated into one decision** because they share the underlying principle "user-gated skills are lazy-loaded; reactive skills stay pre-loaded": (A) **Lazy-load shift**: `protocol-audit` and `adr-author` (LLMP-DEC-41) move from Session Setup pre-load to lazy-load. The `## Shared Agent Skills` bullets in `copilot-instructions.md` already contain enough trigger description for the LLM to recognize when to invoke; the full SKILL.md only needs to be in context at execution time. Token savings: ~10-14K per session (3 SKILL.md × ~5K avg). (B) **New `docs-archive` skill** (lazy-loaded, user-gated + LLM-suggest-back): rotates closed entries (`Status` ∈ {Fixed, Resolved, Won't fix, Superseded by X}) from active `_ISSUES.md` / `_TODO.md` / `LLM_PROTOCOL_DECISIONS.md` into year-bucketed `*_.md` archive files. Year of Status update determines destination bucket. Status-based filter (no foundational-flag complexity, no age threshold). Active file gets a 2-line pointer block at top; archive files NOT auto-loaded by `docs-discovery` (lazy-on-suspicion read pattern documented in `docs-discovery/SKILL.md` "Archive files" section). (C) **Status field convention** formalized in `TOPIC_CODES.md`: explicit value list (`Open`, `Partially Fixed`, `Fixed`, `Resolved`, `Won't fix`, `Documented limitation`, `Superseded by`), date stamping rules, partial-fix `### Resolution status` sub-section format, archive-eligibility filter. Status updates are the **one exception** to append-only — field is mutable; entry body / ID / Description remain immutable. **Cross-skill integration**: `docs-discovery/SKILL.md` gets new "Archive files" section (default-exclude year-suffixed glob; on-demand read on regression / supersession-ref / cross-ref signals). 
`docs-check` already had status-update-on-fix logic (LLMP-DEC-27); this entry formalizes the value vocabulary it should use. **Workflow**: `docs-archive` reports per-file analysis (Step 2), user selects scope (Step 3), generates plan with diff (Step 4), applies on explicit consent (Step 5, Rule #5). Archive operation is on-demand — no automatic monitoring, no background size checks. **Note**: an earlier in-conversation draft proposed the lazy-load shift as a standalone LLMP-DEC entry; folded into this LLMP-DEC-43 instead, since lazy-load and `docs-archive` skill share the same design rationale and benefit from being recorded as one decision. | new: `AyCode.Core/.github/skills/docs-archive/SKILL.md`; updated: `docs-check/references/TOPIC_CODES.md` (new "Status field conventions" section); `docs-discovery/SKILL.md` (new "Archive files" section); `5× primary` `copilot-instructions.md` (Session Setup: 4→2 pre-loaded SKILL.md, prefix count 5→3, lazy-load list added); `3× inherit` `copilot-instructions.md` (Session Setup: 4→2 pre-loaded SKILL.md, prefix count 6→4, lazy-load list added); `4× non-Core primary` + `3× inherit` `copilot-instructions.md` (Shared Agent Skills intro updated; new `docs-archive` bullet added after `adr-author`) | +| LLMP-DEC-44 | 2026-04-25 | Status field vocabulary simplified: 7 values → 3 (`Open`, `InProgress`, `Closed`); bulk-migrated all existing TODO/Issue entries to `Open` | **Refinement of LLMP-DEC-43 (C)**: the 7-value Status vocabulary (`Open`, `Partially Fixed`, `Fixed`, `Resolved`, `Won't fix`, `Documented limitation`, `Superseded by`) was deemed over-engineered. **Reduced to 3 values**: `Open` (active/unresolved; also used for documented-current-behaviour entries that must remain visible), `InProgress` (partial work in flight), `Closed` (done — bug fixed, decision made, TODO completed; archive-eligible). 
Distinctions previously encoded in the Status (Fixed vs Resolved vs Won't fix vs Superseded) move to the **entry body** — body explains "what happened" (date, ref, rationale); Status only signals "is this still in the active set?". Documented-current-behaviour entries (previously `Documented limitation`, `By design`, `Acceptable`, `Limitation`, `Upstream SDK limitation`, etc.) stay as `Open` with an optional body callout (`> **Note:** This entry documents accepted current behaviour — not scheduled for change.`). **Bulk migration** applied: every existing `Status` value in `_ISSUES.md` and `_TODO.md` files across the workspace (LOGGING, XCUT, BINARY, SIGNALR, SBP) was set to `Open` as a clean-slate starting point — future Status updates use the new 3-value vocabulary. **Archive criterion** in `docs-archive` simplified to a single rule: `Status == "Closed"` → archive-eligible. **Known anomaly**: `SIGNALR_ISSUES.md` had a single `Status: DONE` entry — also set to `Open` per the bulk-migration rule, but semantically should likely be `Closed` (work was actually done). User to manually re-mark if appropriate. **Why simpler is better**: the previous 7 values created false precision — the distinction between Fixed/Resolved/Won't fix matters in the body, not in the Status field. The `Documented limitation` value semantically conflicted with archiving (it's not "closed work"); folding it into `Open` with a body callout removes the conflict. Archive logic becomes a single equality check. 
| `docs-check/references/TOPIC_CODES.md` (Status section: 7 → 3 values, simpler workflow); `docs-archive/SKILL.md` (archive criterion simplified to `Status == "Closed"`); 6 files affected by bulk Status migration: `LOGGING_ISSUES.md`, `XCUT_ISSUES.md`, `BINARY_ISSUES.md`, `SIGNALR_ISSUES.md`, `SIGNALR_BINARY_PROTOCOL_ISSUES.md`, `SIGNALR_BINARY_PROTOCOL_TODO.md` (`TOON_ISSUES.md` unchanged — all real entries already `Open`) | ## Known follow-ups diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 7ad09ea..1f01be2 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -60,12 +60,16 @@ You are operating in a multi-repo, documentation-first architecture. You MUST ST ## Session Setup -**Mandatory reads at session start** — in addition to this `copilot-instructions.md`, the agent MUST load the four workspace skills' `SKILL.md` files: +**Mandatory reads at session start** — in addition to this `copilot-instructions.md`, the agent MUST load the **two reactive workspace skills' `SKILL.md` files** (the three user-gated skills are lazy-loaded on demand — see notes below): - `docs-discovery/SKILL.md` — **reactive** (triggers on any domain question — must be ready BEFORE the first domain query arrives) - `docs-check/SKILL.md` — **reactive** (triggers at the end of every code-modifying response) -- `protocol-audit/SKILL.md` — **on-demand** (triggers on explicit "audit protocol" command) -- `adr-author/SKILL.md` — **on-demand + LLM-suggested** (triggers on explicit planning/design requests, or when the LLM flags an ADR-worthy conversation and the user confirms) + +The remaining three skills are **lazy-loaded** — `SKILL.md` is read on demand at invocation, not at session start. 
The 1-2 line summaries in `## Shared Agent Skills` below are sufficient for trigger recognition; the full skill content loads only when needed: + +- `protocol-audit/SKILL.md` — user-invoked ("audit protocol"); LLM-suggest-back when a `.github/copilot-instructions.md` or `SKILL.md` is modified during the session. +- `adr-author/SKILL.md` — user-invoked ("let's plan X", "design Y"); LLM-suggest-back on ADR-worthy conversation drift. +- `docs-archive/SKILL.md` — user-invoked ("archive ISSUES", "rotate logs"); LLM-suggest-back when active `_ISSUES.md` / `_TODO.md` / DEC log shows many closed prior entries. **Path resolution**: if this repo is the canonical protocol host (see `@repo` block below — typically AyCode.Core), the paths are local: `.github/skills//SKILL.md`. Otherwise, prefix with this repo's `own-dep-repos` AyCode.Core path (see the `## Shared Agent Skills` section below for explicit paths). @@ -73,7 +77,7 @@ You are operating in a multi-repo, documentation-first architecture. You MUST ST **Amortization — critical, do NOT re-evaluate per-turn**: the Session Setup cost is measured over the ENTIRE session, not per single turn. A typical session has many turns; the first domain question alone already recoups the investment (alternative — repeated source-code `Grep`/`Read` per turn — costs 10-20K tokens *per turn* with lower output quality). Do NOT flag pre-loaded content as "wasteful" for turns that don't invoke it — the design depends on cross-turn amortization + Rule #3 (no-re-read) + on-demand specificity of Rule #4 (Context Recovery). This amortization is the **central token-economics principle** of the entire protocol stack. -The first response's `[LOADED_DOCS: ...]` prefix must reflect **5 files** (this `copilot-instructions.md` + 4 SKILL.md). +The first response's `[LOADED_DOCS: ...]` prefix must reflect **3 files** (this `copilot-instructions.md` + 2 reactive SKILL.md). Lazy-loaded skills add to the count when invoked. 
## Documentation-first coding diff --git a/.github/skills/docs-archive/SKILL.md b/.github/skills/docs-archive/SKILL.md new file mode 100644 index 0000000..f1a4de3 --- /dev/null +++ b/.github/skills/docs-archive/SKILL.md @@ -0,0 +1,138 @@ +--- +name: docs-archive +description: Rotate closed entries (Status: Closed) from active `_ISSUES.md` / `_TODO.md` / `LLM_PROTOCOL_DECISIONS.md` files into year-bucketed archive companions (`*_.md`). Year of the Status update determines destination file. Active files retain only Open/InProgress entries plus a header pointer to the archives. Archive files are NOT auto-loaded by `docs-discovery`; agents read them on-demand when historical context becomes relevant. Invoke explicitly ("archive ISSUES", "rotate logs") or via LLM-suggest-back when a loaded file shows many closed prior entries. +compatibility: Designed for Claude Code and GitHub Copilot (VS). Uses the host agent's Read/Write/Edit/Glob/Grep tools. +metadata: + author: Fullepi + version: "1.0" +--- + +# docs-archive + +Lifecycle skill for the workspace's append-only artifact files. Active files grow without bound as work accumulates; this skill consolidates closed entries into year-bucketed archive companions, keeping the active file as a focused "what's still open" view. + +## When to invoke + +### Explicit user triggers +- "archive ISSUES" / "archive TODOs" / "archive decisions" / "rotate logs" +- "consolidate closed entries" +- "archive {TOPIC}" — narrowed to a specific topic + +### LLM-suggest-back (you flag, user confirms) + +When a `_ISSUES.md` / `_TODO.md` / `LLM_PROTOCOL_DECISIONS.md` file is loaded and you observe: +- Many closed entries (`Status: Closed`) still in the active file +- The file is long enough that loading has noticeable token cost +- Closed-to-open ratio is high (e.g., > 50% closed) + +Phrase: *"`LOGGING_ISSUES.md` has N closed entries (`Status: Closed`) 
still in the active file — consider running `docs-archive` to consolidate into year-bucketed archives. (Confirm if you want me to invoke.)"* + +User confirmation required. Never auto-invoke. + +### When NOT to invoke +- File has only a few entries — overhead exceeds benefit +- File has no closed entries — nothing to move +- User is mid-investigation referencing entries that may be archived — defer until done + +## Archive criterion (single rule) + +Move entry X to archive IF `Status: Closed`. + +Year derived from a date in the entry body (e.g., `Fixed 2026-04-25`, `Won't fix 2026-04-25`, `Superseded by LOG-I-X 2026-04-25`). If no parseable date in body, default to current year. + +**Stay in active file**: +- `Status: Open` — including documented-current-behaviour entries (these stay Open with a body callout per `TOPIC_CODES.md`'s Status field conventions) +- `Status: InProgress` +- Any entry without parseable Status (treat as Open; flag to user) + +**Foundational LLMP-DEC entries**: protocol-meta decisions describing CURRENTLY-ACTIVE rules typically have no Status field — they stay in the active file naturally. If later superseded, they get `Status: Closed` and become archive-eligible normally. + +## Step 1 — Identify scope + +Glob the workspace for active artifact files (exclude existing year-suffixed archives): + +``` +**/{TOPIC}_ISSUES.md +**/{TOPIC}_TODO.md +**/LLM_PROTOCOL_DECISIONS.md +``` + +Default scope: all matches. Optionally narrow per user: "just LOGGING", "just DEC log", "just primary repos". + +## Step 2 — Per-file analysis (read-only) + +For each candidate file, report: + +| Field | Example | +|---|---| +| Total entries | 17 | +| Status: Open | 9 | +| Status: InProgress | 1 | +| Status missing/unparseable (treated as Open, flagged) | 1 | +| Status: Closed (archive-eligible) | 6 | +| ↳ broken down by year (Status update) | 2025: 3, 2026: 3 | + +Show the user a summary table covering all candidate files. 
+ +## Step 3 — User selects scope + +User indicates: "all eligible" / "just {TOPIC}" / "skip {file}" / "year ≥ {N}". + +If unclear, ask: *"process all 4 files (LOG, BIN, SIG, DEC)? Or narrow?"* + +## Step 4 — Generate archive plan (still no writes) + +For each affected file: + +1. **Build archive file content** for each year present: + - Filename: `_.md` (e.g., `LOGGING_ISSUES_2025.md`) + - Header (2 lines): topic + "Archived entries (Status: Closed) from ``. IDs preserved (never reassigned). Format identical to active file." + - Entries: closed entries from that year, in original chronological/ID order. + +2. **Build new active file content**: + - Original header preserved. + - **Insert a pointer block immediately after the header**: + ``` + > **Archived entries**: see `_2025.md`, `_2026.md` (closed entries by year). + > Archive files are not auto-loaded — read on demand if relevant context is suspected (regression hint, supersession reference). + ``` + - Remaining entries: only Open / InProgress / unknown-status. + +3. **Show diff** for each affected file (or summary if very large). + +## Step 5 — Apply with explicit consent (Rule #5) + +On explicit user approval: +1. `Write` archive files (new); `Edit` for any pre-existing archive files being extended. +2. `Edit` active files: remove archived entries, add pointer block. +3. Verify: re-read counts, confirm entries arrived intact. + +## Step 6 — Cross-reference verification + +`Grep` archived IDs in: code (`*.cs`), other `.md` files. Report any references — but **do not auto-rewrite**. IDs are preserved; `docs-discovery`'s on-demand archive read finds them when needed. + +## Step 7 — Decision Log archive case + +If `LLM_PROTOCOL_DECISIONS.md` was archived: +- "Current protocol state" summary at top stays in active file unchanged. +- Dated table rows for `Status: Closed` entries (e.g., superseded decisions) move to `LLM_PROTOCOL_DECISIONS_.md`. 
+- Yearly archive convention already documented (LLMP-DEC governance section). + +## Edge cases + +- **No closed entries anywhere**: exit cleanly with "no archive-eligible entries found across N candidate files". +- **Status field missing/malformed**: flag the entry; treat as Open; user decides. +- **Year unparseable from Status**: default to current year; flag for user fix-up. +- **Existing archive contains an ID we're about to add**: should never happen if IDs are append-only and Status is monotonic. Abort with explicit error — manual review. +- **Active file becomes empty**: keep file with header + pointer. Do not delete. +- **Concurrent edits / dirty git**: prompt user before proceeding. + +## Tool usage + +Tool-neutral: `Glob` (enumerate), `Read` (active files), `Write` (new archives), `Edit` (active files), `Grep` (cross-ref scan). + +## Handoff with other skills + +- **`docs-discovery`** — archive files excluded from default topic-loading globs. On-demand read only when relevant historical context is suspected. +- **`docs-check`** — when surfacing `Status: Closed` for an entry being closed, may note "this entry will be archive-eligible at next `docs-archive` invocation". No automatic chaining. +- **`adr-author`** — ADR files (`docs/adr/NNNN-*.md`) are NOT processed by this skill. ADRs use a different lifecycle (per-decision separate files, no rotation). diff --git a/.github/skills/docs-check/references/TOPIC_CODES.md b/.github/skills/docs-check/references/TOPIC_CODES.md index de44fce..7497389 100644 --- a/.github/skills/docs-check/references/TOPIC_CODES.md +++ b/.github/skills/docs-check/references/TOPIC_CODES.md @@ -85,6 +85,39 @@ LLMP-DEC-4 # LLM-protocol decision 4 (Decision Log entry) - **Code comments**: `// See LOG-I-5` — bare ID acceptable since it's globally unique. - **DB natural key** (future migration): `(topic, type, seq)` tuple; or the full string `LOG-I-5` as a single column. 
+## Status field conventions + +Every entry in `_ISSUES.md`, `_TODO.md`, and `LLM_PROTOCOL_DECISIONS.md` SHOULD carry an explicit `Status` field. **3 allowed values**: + +| Status | Meaning | Archive eligible? | +|---|---|---| +| `Open` | Active / unresolved (default for new entries); also used for documented-current-behaviour entries that must remain visible | No | +| `InProgress` | Partial work in flight; some scope addressed but more remains | No | +| `Closed` | Done — bug fixed, decision made (won't fix / superseded by another entry / accepted), TODO completed. The body of the entry explains *what happened* (date, ref, rationale). | Yes | + +### Defaults + +- New entries default to `Status: Open`. +- For documented current-behaviour entries (accepted limitations / "by design" / "this is how it works"), use `Status: Open` with an optional body callout: `> **Note:** This entry documents accepted current behaviour — not scheduled for change.` These never archive (Open status). + +### Update workflow + +When status changes, update the `Status` line in-place. **This is the ONE exception to append-only** — the Status field is mutable; entry body / ID / Description remain immutable. + +When marking `Closed`: +1. **Format the Status line as** `Status: Closed (YYYY-MM-DD)` — the inline date is what `docs-archive` uses to determine the destination year-bucket. +2. **Add a `### Resolution` sub-section** documenting the closure. **Strongly recommended** — without it, future readers (and the `docs-archive` skill on lookup) have no context for "what changed, why, where". Suggested fields: + - **What:** one-line summary of the change. + - **Where:** code reference (file/class/commit hash) or doc reference (ADR / PR). + - **Why:** the rationale (fix / "won't fix because X" / "superseded by LOG-I-Y" / "accepted as-is"). + - Optional: scope, date if different from Status line, related entries. + +The body carries the **nuance**; the Status field only signals archive-eligibility. 
+ +### Lifecycle: archive + +`Closed` entries are eligible for rotation into year-bucketed archive files (`_.md`) via the `docs-archive` skill. Year derived from a date in the entry body. Archive operation is user-invoked — closed entries don't disappear automatically. See `AyCode.Core/.github/skills/docs-archive/SKILL.md`. + ## Change history See the Decision Log (`../../../LLM_PROTOCOL_DECISIONS.md`) for the introduction of this registry and future topic-code additions. diff --git a/.github/skills/docs-discovery/SKILL.md b/.github/skills/docs-discovery/SKILL.md index e344150..afb7c91 100644 --- a/.github/skills/docs-discovery/SKILL.md +++ b/.github/skills/docs-discovery/SKILL.md @@ -106,6 +106,45 @@ If any `{DOMAIN}.md` is loaded (e.g., `LOGGING.md`), ALSO glob and load its comp These are **paired docs** and must be loaded as a set. Skipping ISSUES/TODO risks reintroducing fixed bugs or conflicting with ongoing refactors. +## Archive files (`*_.md`) + +Closed entries from `_ISSUES.md` / `_TODO.md` / `LLM_PROTOCOL_DECISIONS.md` may be rotated into year-bucketed archive files by the `docs-archive` skill. Examples: +- `LOGGING_ISSUES_2025.md` +- `BINARY_TODO_2026.md` +- `LLM_PROTOCOL_DECISIONS_2026.md` + +### Default behaviour: NOT auto-loaded + +The Step 2 glob patterns target **active** companions only — unsuffixed names. Year-suffixed variants are excluded by default. Practically: + +- `**/docs/{TOPIC}/{TOPIC}_ISSUES.md` matches; `**/docs/{TOPIC}/{TOPIC}_ISSUES_2025.md` does NOT. +- If a generic `{TOPIC}_*.md` pattern inadvertently matches year-suffixed files, filter them out before passing to Step 4 (Load). 
+ +### On-demand read (no user-confirm needed — read-only operation) + +Read an archive file when ANY of these signals appears: +- A loaded entry references an archived ID (e.g., `Superseded by LOG-I-X` where X resolves only to `_.md`) +- A code comment or other doc references an ID resolving only to an archive file +- The user's request describes a behaviour pattern matching an archived `Fixed` entry's Description (regression suspicion) +- The investigation feels like "this was solved before" — read the topic's archive(s) before re-deriving +- The user explicitly asks about historical context + +When read: include in `[LOADED_DOCS]` like any other `.md`. Rule #3 (no-re-read) applies. Cite from it like the active file. + +This is a **read** — Rule #5 (consent for modifications) is not engaged. The "don't pre-load" rule is about token economy, not access control. + +### Glob recap + +Active-only (default for topic discovery): +- `**/docs/{TOKEN}/{TOKEN}_ISSUES.md` +- `**/docs/{TOKEN}/{TOKEN}_TODO.md` +- `**/docs/{TOKEN}/README.md` + +On-demand archive lookup: +- `**/docs/{TOKEN}/{TOKEN}_ISSUES_*.md` (where `*` matches a 4-digit year) +- `**/docs/{TOKEN}/{TOKEN}_TODO_*.md` +- `**/LLM_PROTOCOL_DECISIONS_*.md` + ## Step 6 — Proceed to the user's task The response's `[LOADED_DOCS: N files (+K this turn: )]` prefix (per the active repo's Rule #1) already surfaces the newly-loaded filenames and the cumulative count. **No separate confirmation line is needed** — the prefix itself is the confirmation. Continue directly to the user's actual request. 
diff --git a/AyCode.Core/docs/BINARY/BINARY_ISSUES.md b/AyCode.Core/docs/BINARY/BINARY_ISSUES.md index a9f29b4..1f28098 100644 --- a/AyCode.Core/docs/BINARY/BINARY_ISSUES.md +++ b/AyCode.Core/docs/BINARY/BINARY_ISSUES.md @@ -4,7 +4,7 @@ ### BIN-I-1: Non-array-backed memory — per-segment copy -**Status:** By design +**Status:** Open **Affects:** `SequenceBinaryInput` **Path:** `ExtractArray()` fallback when `MemoryMarshal.TryGetArray` fails @@ -14,7 +14,7 @@ When `ReadOnlySequence` segments are backed by native memory (not managed ### BIN-I-2: Cross-boundary scratch buffer is not pooled across calls -**Status:** Acceptable +**Status:** Open **Affects:** `SequenceBinaryInput._scratchBuffer` The scratch buffer is `ArrayPool.Rent`-ed on first cross-boundary read and reused within a single deserialization. It is `Return`-ed in `Release()` after deserialization completes. However, the next deserialization will rent again. @@ -25,14 +25,14 @@ The scratch buffer is `ArrayPool.Rent`-ed on first cross-boundary read and reuse ### BIN-I-3: ReadBytes always copies -**Status:** By design +**Status:** Open **Affects:** `BinaryDeserializationContext.ReadBytes(int length)` `ReadBytes` allocates a new `byte[]` and copies from the buffer. This is unavoidable because the caller owns the returned array, and the source buffer (pipe segment or serialized data) may be recycled. ### BIN-I-4: ReadStringUtf8 requires contiguous buffer -**Status:** By design +**Status:** Open **Affects:** `BinaryDeserializationContext.ReadStringUtf8(int length)` `Encoding.GetString` and `Ascii.IsValid` require contiguous memory. For multi-segment reads, `EnsureAvailable` copies cross-boundary bytes into the scratch buffer first. This is the same approach `SequenceReader` uses internally. 
@@ -43,14 +43,14 @@ The scratch buffer is `ArrayPool.Rent`-ed on first cross-boundary read and reuse ### BIN-I-5: BufferWriterBinaryOutput fallback path allocates per-chunk -**Status:** Acceptable +**Status:** Open **Affects:** `BufferWriterBinaryOutput.AcquireChunk` fallback When `MemoryMarshal.TryGetArray` fails on `IBufferWriter.GetMemory()` (native memory-backed writer), a `byte[]` is rented from `ArrayPool` per chunk and copied to the writer on `Grow`/`Flush`. Same as BIN-I-1 — non-array-backed writers are extremely rare. ### BIN-I-6: AsyncPipeWriterOutput uses sync GetResult() for backpressure -**Status:** By design (v1) +**Status:** Open **Affects:** `AsyncPipeWriterOutput.Grow()` — `_lastFlush.GetAwaiter().GetResult()` When the previous `PipeWriter.FlushAsync()` hasn't completed by the next `Grow()` call, the serializer blocks the thread until the flush completes. This is necessary because `IHubProtocol.WriteMessage` is `void` (synchronous by design). @@ -61,7 +61,7 @@ When the previous `PipeWriter.FlushAsync()` hasn't completed by the next `Grow() ### BIN-I-7: AsyncPipeWriterOutput fallback path — same as BIN-I-5 -**Status:** Acceptable +**Status:** Open **Affects:** `AsyncPipeWriterOutput.AcquireChunk` fallback Same `TryGetArray` fallback as `BufferWriterBinaryOutput` (BIN-I-5). Kestrel `PipeWriter.GetMemory()` always returns array-backed memory — fallback is for non-standard `PipeWriter` implementations only. @@ -70,7 +70,7 @@ Same `TryGetArray` fallback as `BufferWriterBinaryOutput` (BIN-I-5). Kestrel `Pi ### BIN-I-8: PipeReaderBinaryInput uses sync ReadAsync().GetResult() -**Status:** By design (v1) +**Status:** Open **Affects:** `PipeReaderBinaryInput.Initialize()` and `TryAdvanceSegment()` Same constraint as BIN-I-6 — `IBinaryInputBase` interface is synchronous. `ReadAsync().GetAwaiter().GetResult()` blocks when waiting for more data from the pipe. Currently not used in production (SignalR delivers complete messages via `TryParseMessage`). 
Reserved for future direct-pipe deserialization scenarios. @@ -79,14 +79,14 @@ Same constraint as BIN-I-6 — `IBinaryInputBase` interface is synchronous. `Rea ### BIN-I-9: CS8625 warnings for non-nullable reference types -**Status:** Known +**Status:** Open **Affects:** Generated reader code The source generator emits `null` assignments for non-nullable reference type properties during deserialization (before the value is read from the stream). This produces CS8625 warnings. Functionally harmless — the property is always assigned before use. ### BIN-I-10: First-run cold-start overhead -**Status:** Active — mitigation planned (see `BINARY_TODO.md#bin-t-3`, `BINARY_TODO.md#bin-t-4`) +**Status:** Open **Affects:** First `Serialize`/`Deserialize` per `[AcBinarySerializable]` type, per process Cold-start cost chain on first use of an SGen type (before BIN-T-3 lands): @@ -106,7 +106,7 @@ Subsequent calls hit cached metadata/wrappers → only Tier 0→1 JIT transition ### BIN-I-11: Consumer entity with `new` Id shadowing — excluded from SGen -**Status:** Workaround-in-place (compiled-expression fallback) +**Status:** Open **Affects:** Any consumer entity whose base class hides `BaseEntity.Id` with `readonly new int Id { get; }` pattern (e.g. `DiscountProductMapping` in Mango.Nop.Core) When the base class shadows `Id` with a setter-less `new int Id { get; }`, SGen can't emit a setter without CS0200. Runtime falls back to compiled-expression serialization for these types. Low priority — affects a small number of consumer entities. @@ -117,28 +117,28 @@ When the base class shadows `Id` with a setter-less `new int Id { get; }`, SGen ### BIN-I-12: Struct copy semantics -**Status:** By design +**Status:** Open **Affects:** `BufferWriterBinaryOutput` value-type assignment Assigning a `BufferWriterBinaryOutput` value creates an independent copy. State changes (e.g. `_committedBytes` via `Grow`/`Flush`) are not reflected in the original. Copy back after use if needed. 
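
The copy behaviour described in BIN-I-12 can be illustrated with a minimal sketch. `CountingOutput` and `StructCopyDemo` are hypothetical stand-ins invented for this example; they are not the real `BufferWriterBinaryOutput`, and only the `_committedBytes` idea is borrowed from the text above.

```csharp
using System;

// Hypothetical stand-in for a mutable struct writer (NOT the real BufferWriterBinaryOutput).
public struct CountingOutput
{
    private int _committedBytes;

    public int CommittedBytes => _committedBytes;

    // Simulates a Grow/Flush that advances the committed byte counter.
    public void Commit(int count) => _committedBytes += count;
}

public static class StructCopyDemo
{
    // Receives a copy: the caller's instance is not affected by the mutation.
    public static void WriteByValue(CountingOutput output) => output.Commit(16);

    // Receives a reference: the caller's instance is mutated in place.
    public static void WriteByRef(ref CountingOutput output) => output.Commit(16);

    // Mutates the copy and returns it so the caller can "copy back after use".
    public static CountingOutput WriteAndReturn(CountingOutput output)
    {
        output.Commit(16);
        return output;
    }
}
```

Passing by `ref` or assigning the returned value back are two ways to keep the caller's state in sync. Which one fits the real type depends on how it is used in context mode.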
### BIN-I-13: Initialize resets tracking -**Status:** By design +**Status:** Open **Affects:** `BufferWriterBinaryOutput.Initialize` (context mode) `Initialize` sets `_committedBytes = 0`. Standalone bytes written before are lost if the BWO is then passed to a context. Call `FlushAndReset()` first, or track standalone bytes separately. ### BIN-I-14: Constructor acquires chunk -**Status:** Acceptable (not a leak) +**Status:** Open **Affects:** `BufferWriterBinaryOutput` ctor `AcquireChunk` runs in ctor for standalone readiness. Redundant if only context mode is used (context `Initialize` acquires its own). Not a leak — consecutive `GetMemory` without `Advance` returns overlapping memory. ### BIN-I-15: No mode mixing -**Status:** By design +**Status:** Open **Affects:** `BufferWriterBinaryOutput` — context vs standalone mode A single instance must not use context + standalone modes simultaneously — buffer states desynchronize. One mode per lifecycle phase; `FlushAndReset()` as boundary between modes. diff --git a/AyCode.Core/docs/LOGGING/LOGGING_ISSUES.md b/AyCode.Core/docs/LOGGING/LOGGING_ISSUES.md index aafb272..f1b196f 100644 --- a/AyCode.Core/docs/LOGGING/LOGGING_ISSUES.md +++ b/AyCode.Core/docs/LOGGING/LOGGING_ISSUES.md @@ -24,7 +24,7 @@ Console.Error noise tolerated. Alternatively, consumer uses DI-based `AddAcLogge ## LOG-I-2: AcEnv.AppConfiguration is filesystem-bound, MAUI/WASM-unsafe -**Severity:** Minor · **Status:** Documented limitation · **Area:** `AyCode.Core.Consts.AcEnv` +**Severity:** Minor · **Status:** Open · **Area:** `AyCode.Core.Consts.AcEnv` ### Description `AcEnv.AppConfiguration` is a static lazy singleton calling `new ConfigurationBuilder().AddJsonFile("appsettings.json").Build()` on first access. Reads current-working-directory filesystem. Throws on MAUI (no physical appsettings next to exe) and WASM (no filesystem at all). 
@@ -40,7 +40,7 @@ Consumer avoids the config-reading `AcLoggerBase(string)` ctor on these platform ## LOG-I-3: Two parallel logger-setup patterns -**Severity:** Minor (confusion, not functional) · **Status:** Documented · **Area:** LOGGING.md / consumer code +**Severity:** Minor (confusion, not functional) · **Status:** Open · **Area:** LOGGING.md / consumer code ### Description Two ways to construct a logger coexist: diff --git a/AyCode.Core/docs/XCUT/XCUT_ISSUES.md b/AyCode.Core/docs/XCUT/XCUT_ISSUES.md index 4644671..223ca7c 100644 --- a/AyCode.Core/docs/XCUT/XCUT_ISSUES.md +++ b/AyCode.Core/docs/XCUT/XCUT_ISSUES.md @@ -8,7 +8,7 @@ For planned cross-cutting work, see `XCUT_TODO.md`. ## XCUT-I-1: JSON-in-Binary request parameters -**Status:** Major tech debt, planned replacement (coordinated) +**Status:** Open **Affects:** BINARY serializer (wire format) ↔ SIGNALR transport (envelope) ↔ all consuming projects (caller code) ### Description diff --git a/AyCode.Services.Server/Logins/AcLoginServiceServer.cs b/AyCode.Services.Server/Logins/AcLoginServiceServer.cs index 8c46a8e..2e03f1e 100644 --- a/AyCode.Services.Server/Logins/AcLoginServiceServer.cs +++ b/AyCode.Services.Server/Logins/AcLoginServiceServer.cs @@ -189,7 +189,15 @@ public class AcLoginServiceServer { @@ -209,7 +217,15 @@ public class AcLoginServiceServer` for merge. The extra copy (pipe → byte[]) is the trade-off. @@ -88,7 +95,7 @@ See `SIGNALR_TODO.md#sig-t-5`. ### SIG-I-8: HubConnectionBuilder inner DI isolation -**Status:** Workaround-in-place (dedicated options-passing overload) +**Status:** Open **Affects:** Consumer client setup in `Program.cs` (MAUI, WASM, ASP.NET Core server prerender) `HubConnectionBuilder.Services` is a separate `IServiceCollection` from the outer host DI. `services.Configure(...)` registered in the outer container does NOT flow into `HubConnectionBuilder.Services`. Calling `hubBuilder.AddAcBinaryProtocol()` with no args silently falls back to default options. 
@@ -103,7 +110,7 @@ hubBuilder.AddAcBinaryProtocol(protocolOpts); ### SIG-I-9: First-call null response (observed) -**Status:** Open — not diagnosed +**Status:** Open **Affects:** `PostDataAsync` awaiter / OnReceiveMessage → pending-request correlation Observed symptom: first `GetProductDtos_80`-style call returns null despite server serializing and sending a valid ~80KB chunked response. Second call (client-side retry) works normally. diff --git a/AyCode.Services/docs/SIGNALR_BINARY_PROTOCOL/SIGNALR_BINARY_PROTOCOL_ISSUES.md b/AyCode.Services/docs/SIGNALR_BINARY_PROTOCOL/SIGNALR_BINARY_PROTOCOL_ISSUES.md index 578ebf1..ec295ab 100644 --- a/AyCode.Services/docs/SIGNALR_BINARY_PROTOCOL/SIGNALR_BINARY_PROTOCOL_ISSUES.md +++ b/AyCode.Services/docs/SIGNALR_BINARY_PROTOCOL/SIGNALR_BINARY_PROTOCOL_ISSUES.md @@ -5,7 +5,7 @@ For higher-level SignalR abstractions see `../SIGNALR/SIGNALR_ISSUES.md`. ## SBP-I-1: AsyncSegment send-path unsupported on WebAssembly -**Severity:** Major (on WASM) · **Status:** Workaround-in-place · **Area:** `AsyncPipeWriterOutput` / WASM runtime +**Severity:** Major (on WASM) · **Status:** Open · **Area:** `AsyncPipeWriterOutput` / WASM runtime ### Description `AsyncPipeWriterOutput.SyncAwaitFlush` uses `Task.Wait(timeout)` — deadlocks the single-threaded WASM UI thread. Therefore, WASM clients cannot SEND with `AsyncSegment` mode. @@ -20,7 +20,7 @@ None — architectural constraint of browser WASM threading model. ## SBP-I-2: StaticWebAssets SDK "Illegal characters" noise (consumer build) -**Severity:** Cosmetic (non-blocking) · **Status:** Upstream SDK limitation · **Area:** SDK, not our code +**Severity:** Cosmetic (non-blocking) · **Status:** Open · **Area:** SDK, not our code ### Description Consumer projects using `Microsoft.NET.Sdk.BlazorWebAssembly` may see "Illegal characters in path" errors in the VS design-time error list. 
Originates from the SDK's `DefineStaticWebAssets` task calling legacy `FileIOPermission.EmulateFileIOPermissionChecks`, which is stricter than NTFS (e.g., double spaces or certain path patterns in any wwwroot asset trigger it). diff --git a/AyCode.Services/docs/SIGNALR_BINARY_PROTOCOL/SIGNALR_BINARY_PROTOCOL_TODO.md b/AyCode.Services/docs/SIGNALR_BINARY_PROTOCOL/SIGNALR_BINARY_PROTOCOL_TODO.md index 45640a2..a55678e 100644 --- a/AyCode.Services/docs/SIGNALR_BINARY_PROTOCOL/SIGNALR_BINARY_PROTOCOL_TODO.md +++ b/AyCode.Services/docs/SIGNALR_BINARY_PROTOCOL/SIGNALR_BINARY_PROTOCOL_TODO.md @@ -44,3 +44,142 @@ Alternative to wire-detection: use SignalR handshake message's `extensions` JSON { "protocol": "acbinary", "version": 1, "extensions": { "acbinary": { "preferredMode": "AsyncSegment" } } } ``` Zero first-message overhead, fully explicit. Both sides advertise their send-modes; pick intersection. Specification to be drafted; compatibility with non-AC clients (pure JSON etc.) must remain. + +--- + +# 🟡 NuGet competitiveness ideas — NOT current priority + +> **Status: speculative / future-only.** The following P3 entries are feature ideas for a public NuGet release to broaden adoption appeal vs. competing protocols (gRPC, MessagePack, MemoryPack, Protobuf-net). **None are committed work.** They exist here for future reference only — safe to skip during current sprint planning. Each requires: +> - Threat model / use-case justification document **before** any code +> - Benchmark of zero-copy / pipeline impact +> - Decorator pattern (compose, do not embed in `AcBinaryHubProtocol`) +> - Opt-in via options; default path **must** stay free of any added overhead + +These ideas were captured because the **wider binary-protocol market is largely silent** on these features: the major .NET binary serializers ship no built-in encryption or tracing, and built-in compression exists only in some of them (for example the LZ4 modes in MessagePack for C#). Real differentiation potential, but only if executed correctly.
+ +## SBP-T-5: Optional payload encryption (`AcEncryptionOptions`) +**Priority:** P3 — IDEA · **Type:** Feature (NuGet competitiveness) · **Status:** Open + +### Niches where TLS alone is insufficient +- **SignalR backplane / Azure SignalR Service** — relay sees plaintext; payload encryption gives true end-to-end client↔client. +- **TLS-terminating proxies** (Cloudflare, nginx, ALB) — internal segment cleartext; payload encryption is TLS-topology-independent. +- **Mobile/MAUI reverse-engineering** — if the key derives from user credentials (not embedded), a leaked binary doesn't grant payload access. +- **Multi-tenant SaaS over shared channel** — per-tenant keys make broadcast leakage fail closed. +- **Zero-trust intra-cluster** (modern enterprise trend) — complementary to mTLS service mesh. +- **Regulatory compliance** (HIPAA, PCI-DSS 4.1, GDPR Art.32, FedRAMP) — auditors often reject "TLS only" for ePHI / cardholder / classified data. + +### Design constraints +- **Decorator only** — `EncryptingHubProtocolWrapper`, never embedded into `AcBinaryHubProtocol` (SRP). +- **`System.Security.Cryptography.AesGcm`** — never custom crypto. Optional ChaCha20-Poly1305 for ARM/mobile (faster on those targets). +- **Authenticated encryption mandatory** — encryption-only modes (CBC etc.) **forbidden**. Tamper detection is baseline, not optional. +- **Pluggable `IAcEncryptionKeyProvider`** — caller chooses: env var, KeyVault, HKDF from user creds, hardware token. No baked-in key source. +- **Replay protection bundled** — monotonic per-connection nonce counter; rejects out-of-window packets. +- **Opt-in** — `AcEncryptionOptions = null` (default) means the encryption code path is **completely absent** from the hot path (no branch, no check). +- **WASM compatibility** mandatory test: `AesGcm` is annotated as unsupported on the `browser` platform, so check `AesGcm.IsSupported` at startup and either provide a fallback or exclude browser WASM from this feature. +- **AsyncSegment compatibility decision** — per-chunk auth tag (~28 bytes of overhead per chunk with AES-GCM: 12-byte nonce + 16-byte tag) vs. single-stream cipher (loses pipeline parallelism).
Benchmark required. + +### Acceptance criteria (before any code) +- Threat model document: what does this defend against that TLS doesn't, in measurable terms. +- Benchmark plan: zero-copy loss percentage, AsyncSegment chunk overhead. +- Decorator API sketch reviewed. + +## SBP-T-6: Optional message compression with `MinSize` threshold +**Priority:** P3 — IDEA · **Type:** Feature (NuGet competitiveness) · **Status:** Open + +### Niches +- **Large structured payloads** (orders, product lists, shipping documents) — typical 50-90% compression on text-heavy DTOs. +- **Mobile / metered connections** — bandwidth cost reduction. +- **Legacy slow links** (satellite, GPRS-class IoT) — payload size matters more than CPU. +- **Cross-DC replication / SignalR backplane traffic** — bytes through Redis backplane × N consumers. +- **Mixed traffic real-time apps** (heartbeat / acks / lookups + occasional large DTO) — see `MinSize` rationale below. + +### `MinSize` threshold — market-gap differentiator + +**Industry observation:** the common compression-aware protocols are "all or nothing" at the specification level: gRPC, HTTP/2 response compression, WebSocket `permessage-deflate` (RFC 7692), Kafka producer compression. None of them defines a per-message size threshold in the protocol itself; where a threshold exists it is an implementation-level setting (for example nginx `gzip_min_length`), so behaviour varies by stack. Without such a setting they compress everything once enabled. + +**Empirical reality:** every compression algorithm has a **break-even point** below which compression LOSES (the output is larger than the input and the CPU cost is wasted): + +| Algorithm | Approximate break-even (bytes) | Recommended `MinSize` default (bytes) | +|-----------|------------------------------|-------------------------------| +| LZ4 | ~64-128 | 128 | +| Snappy | ~100 | 128 | +| Zstd | ~256-512 | 512 | +| Brotli | ~512-1024 | 1024 | +| Gzip/Deflate | ~200-500 | 512 | + +In a real-time SignalR app where small heartbeats / acks / status pings interleave with occasional large DTOs, "all or nothing" compression **wastes CPU on small frames AND increases their size**.
A per-message `MinSize` skip-threshold turns this from a coin-flip into a measurable win. + +**Proposed semantic:** +```csharp +public sealed class AcCompressionOptions +{ + public AcCompressionAlgorithm Algorithm { get; set; } // None | Lz4 | Brotli | Zstd | Snappy | Gzip + public int? MinSize { get; set; } = null; // null = use per-algorithm default; 0 = always compress; explicit value = override + public CompressionLevel Level { get; set; } = CompressionLevel.Optimal; + public bool AlwaysCompressInAsyncSegment { get; set; } = true; // see AsyncSegment trade-off below +} +``` + +### AsyncSegment interaction — the streaming dilemma + +In `AsyncSegment` mode the total message size is unknown until `CHUNK_END`, so a post-serialization `MinSize` check is impossible without buffering the entire message — which would defeat the pipeline-parallelism advantage of `AsyncSegment` in the first place. + +| Option | How it works | Trade-off | +|--------|-------------|-----------| +| **(A) Per-chunk threshold** | Each chunk independently compressed or not, marker byte per chunk | Streaming-friendly. Lose cross-chunk dictionary benefit (Brotli/Zstd). Small chunks may bypass compression even when whole-message would benefit. | +| **(B) Whole-message buffer** | Buffer until `MinSize`, then decide; if compress, buffer the rest too | **Kills AsyncSegment pipeline parallelism** — message becomes effectively non-streaming. | +| **(C) `AsyncSegment` = always compress** (PROPOSED DEFAULT) | `MinSize` ignored in AsyncSegment mode; `Bytes` and `Segment` modes honour it | Logical: anyone choosing `AsyncSegment` is already optimizing for large payloads (small ones would use `Bytes`/`Segment`), so the threshold would never trigger anyway. Simple mental model. | +| **(D) Sliding window first-chunk buffer** | Buffer first N bytes (= MinSize); if exceeded, that buffer flushes as compressed first chunk + downstream streaming with dictionary | Elegant, preserves both threshold and pipeline. 
Significantly more complex. Future optimization. | + +**Default decision: (C)** — `AlwaysCompressInAsyncSegment = true`. Override possible per-options. Document the trade-off explicitly so users picking `AsyncSegment + compression + small messages` know what they're choosing. + +### Other design constraints +- **Decorator pattern**, same as SBP-T-5. +- **Algorithm-pluggable** — at minimum LZ4 (fastest), Brotli (best ratio for text), Zstd (modern balanced). Default **none**. +- **Order with encryption matters**: compress FIRST, encrypt AFTER (compressing ciphertext is futile). +- **Wire marker byte** — 1-byte algorithm ID prefix so decoder knows what to expand. `0x00` = uncompressed (the small-message path), other values = algorithm IDs. +- **Per-algorithm `MinSize` default** — applied when `MinSize == null`; user override always wins. +- **Empty / single-byte messages** — bypass compression unconditionally regardless of `MinSize` (zero benefit, nonzero overhead). + +### Acceptance criteria +- Benchmark: compression ratio + CPU cost per algorithm on representative payloads (Order, ProductList, ShippingDocument) AND small payloads (Ping, Ack, StatusUpdate). Verify break-even points empirically — adjust `MinSize` per-algorithm defaults if measured values diverge from the table above. +- Decorator API sketch reviewed. +- AsyncSegment interaction documented — the (C) default with override path; sample log output showing `MinSize`-skipped vs. compressed messages for diagnostics. +- Fallback / negotiation strategy: what happens if peer can't decompress? (Suggested: wire-marker `0x00` always works since it means "uncompressed" — sender can downgrade per-message if peer signals incompatibility, similar to HTTP `Accept-Encoding` semantics.) 
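
The `MinSize` skip-threshold and the 1-byte wire marker described for SBP-T-6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the planned decorator: `MinSizeFraming`, `Encode`, `Decode` and the marker value `0x01` for Gzip are hypothetical names and IDs chosen for the example. Only the `0x00` = uncompressed marker and the empty / single-byte bypass come from the text above. Gzip is used simply because it ships in `System.IO.Compression`.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper for illustration only; not part of AcBinaryHubProtocol.
public static class MinSizeFraming
{
    public const byte MarkerUncompressed = 0x00; // from the design notes: 0x00 = uncompressed
    public const byte MarkerGzip = 0x01;         // assumed algorithm ID, chosen for this sketch

    // Prefixes the payload with a marker byte; compresses only at or above minSize.
    public static byte[] Encode(byte[] payload, int minSize)
    {
        // Empty and single-byte payloads bypass compression regardless of minSize.
        if (payload.Length < 2 || payload.Length < minSize)
        {
            var plain = new byte[payload.Length + 1];
            plain[0] = MarkerUncompressed;
            Buffer.BlockCopy(payload, 0, plain, 1, payload.Length);
            return plain;
        }

        using var buffer = new MemoryStream();
        buffer.WriteByte(MarkerGzip);
        // leaveOpen: true so the buffer is still readable after the gzip stream is flushed.
        using (var gzip = new GZipStream(buffer, CompressionLevel.Optimal, leaveOpen: true))
        {
            gzip.Write(payload, 0, payload.Length);
        }
        return buffer.ToArray();
    }

    // Reads the marker byte and expands the payload if needed.
    public static byte[] Decode(byte[] frame)
    {
        if (frame.Length == 0)
            throw new InvalidDataException("Frame is missing its marker byte.");

        switch (frame[0])
        {
            case MarkerUncompressed:
                return frame.AsSpan(1).ToArray();
            case MarkerGzip:
                using (var input = new MemoryStream(frame, 1, frame.Length - 1))
                using (var gzip = new GZipStream(input, CompressionMode.Decompress))
                using (var output = new MemoryStream())
                {
                    gzip.CopyTo(output);
                    return output.ToArray();
                }
            default:
                throw new InvalidDataException($"Unknown compression marker 0x{frame[0]:X2}.");
        }
    }
}
```

With `minSize: 0` every payload of two bytes or more is compressed, which matches the "0 = always compress" semantic of `AcCompressionOptions.MinSize`. The sketch covers only the non-streaming `Bytes` / `Segment` case. The `AsyncSegment` options (A) to (D) are not modelled.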
+ +## SBP-T-7: OpenTelemetry tracing integration +**Priority:** P3 — IDEA · **Type:** Feature (NuGet competitiveness) · **Status:** Open + +### Niches +- **Distributed tracing** is a modern .NET observability standard — gRPC has it, MessagePack/MemoryPack/Protobuf-net don't. +- **Production diagnostics** — correlate SignalR call → server method → DB query in one trace. +- **Backpressure / latency analysis** — flush time, chunk-by-chunk progress as span events. + +### Design constraints +- **`System.Diagnostics.ActivitySource`** based — no hard dependency on OTel SDK; consumers wire their own exporter. +- **Opt-in via `AcBinaryHubProtocolOptions.ActivitySource`** — null = zero overhead, no branch. +- **Span structure**: per `WriteMessage` / `TryParseMessage` invocation; chunked path emits chunk events as span events (not nested spans, to avoid cardinality explosion). +- **W3C Trace Context propagation** — read/write `traceparent` header in `Headers` dictionary for cross-service correlation. + +### Acceptance criteria +- API sketch with sample exporter wiring (Jaeger / OTel Collector). +- Hot-path branch verification — when `ActivitySource == null`, JIT must eliminate the tracing code completely. + +## SBP-T-8: Optional HMAC signing (without encryption) +**Priority:** P3 — IDEA · **Type:** Feature (NuGet competitiveness) · **Status:** Open + +### Niches +- **Tamper detection where confidentiality is acceptable** — audit log forwarding, telemetry where data isn't sensitive but integrity must be provable. +- **Compliance lite** — "we sign all messages" satisfies some integrity-focused audit requirements without encryption complexity. +- **Plaintext debugging + integrity** — payload remains readable in tcpdump / Wireshark (debugging-friendly), but tampering is detected. + +### Design constraints +- **Decorator pattern** — `SigningHubProtocolWrapper`, separate from encryption. 
Composable: encryption decorator + signing decorator can stack (encrypt-then-sign or sign-then-encrypt — pick canonical order, document it). +- **`System.Security.Cryptography.HMACSHA256`** default; SHA-512 optional. +- **Append-only on wire** — N-byte MAC tag at end of message, before any framing trailer. +- **Pluggable `IAcSigningKeyProvider`** — same idea as encryption key provider. +- **Constant-time comparison** mandatory — `CryptographicOperations.FixedTimeEquals`, never `==`. + +### Acceptance criteria +- API sketch. +- Stack-with-encryption order decision documented (industry standard: **encrypt-then-MAC**, but evaluate trade-offs).
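
The signing constraints for SBP-T-8 (HMAC-SHA256 tag appended at the end of the message, verified with a constant-time comparison) can be sketched as below. `HmacTagging`, `Sign` and `TryVerify` are hypothetical names for this example. The sketch takes the key as a plain `byte[]` rather than through `IAcSigningKeyProvider`, whose shape is not yet defined, and it ignores any framing trailer.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper for illustration only; not the planned SigningHubProtocolWrapper.
public static class HmacTagging
{
    public const int TagLength = 32; // HMAC-SHA256 produces a 32-byte tag

    // Returns message || tag.
    public static byte[] Sign(byte[] key, byte[] message)
    {
        using var hmac = new HMACSHA256(key);
        byte[] tag = hmac.ComputeHash(message);

        var framed = new byte[message.Length + tag.Length];
        Buffer.BlockCopy(message, 0, framed, 0, message.Length);
        Buffer.BlockCopy(tag, 0, framed, message.Length, tag.Length);
        return framed;
    }

    // Splits message || tag, recomputes the tag and compares in constant time.
    public static bool TryVerify(byte[] key, byte[] framed, out byte[] message)
    {
        message = Array.Empty<byte>();
        if (framed.Length < TagLength)
            return false;

        int bodyLength = framed.Length - TagLength;
        using var hmac = new HMACSHA256(key);
        byte[] expected = hmac.ComputeHash(framed, 0, bodyLength);
        ReadOnlySpan<byte> actual = framed.AsSpan(bodyLength, TagLength);

        // Constant-time comparison, as required by the design constraints; never use ==.
        if (!CryptographicOperations.FixedTimeEquals(expected, actual))
            return false;

        message = framed.AsSpan(0, bodyLength).ToArray();
        return true;
    }
}
```

If the signing decorator is stacked with encryption in encrypt-then-MAC order, the `message` passed to `Sign` would be the ciphertext. That ordering is still an open decision in the acceptance criteria.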