From bbae524e8d3447a0aa780682b6dba38b9efe162c Mon Sep 17 00:00:00 2001 From: Loretta Date: Sat, 4 Apr 2026 12:55:36 +0200 Subject: [PATCH] SGen root fast path: hybrid dispatch, docs, helpers Refactored AcBinarySerializer to add a source-generated (SGen) root fast path, bypassing the full runtime dispatch chain for SGen-decorated types and improving performance. Introduced helper methods for context pooling and buffer management. All entry points now use these helpers for consistency. Added comprehensive SGen architecture documentation (BINARY_SGEN.md), updated all related docs to explain the hybrid model, bridge methods, and configuration. Clarified doc loading rules in copilot-instructions.md for strict doc-first enforcement. --- .github/copilot-instructions.md | 11 +- .../Binaries/AcBinarySerializer.cs | 209 ++++++++++-------- AyCode.Core/Serializers/Binaries/README.md | 12 + AyCode.Core/docs/BINARY_FEATURES.md | 21 +- AyCode.Core/docs/BINARY_FORMAT.md | 5 +- AyCode.Core/docs/BINARY_IMPLEMENTATION.md | 64 +++++- AyCode.Core/docs/BINARY_OPTIONS.md | 2 +- AyCode.Core/docs/BINARY_SGEN.md | 161 ++++++++++++++ 8 files changed, 375 insertions(+), 110 deletions(-) create mode 100644 AyCode.Core/docs/BINARY_SGEN.md diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 0025ea2..ad7bc9f 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -11,13 +11,14 @@ You are operating in a multi-repo, documentation-first architecture. You MUST ST - If `[LOADED_DOCS: NONE]` applies, you **MUST STOP** and you are **STRICTLY FORBIDDEN** to use the following tools: `code_search`, `get_symbols_by_name`, `find_symbol`, or `get_file` (for non-markdown files). - Your VERY FIRST AND ONLY allowed tool calls must be `file_search` or `get_file` targeting the `.md` documentation in the relevant `docs/` folders or `README.md`. 
- Do not answer the user's core question until the `[LOADED_DOCS]` list is populated with the base architecture files. - - **CRITICAL EXCEPTION:** Do **NOT** re-read `.md` files that are already mapped in your context or `LOADED_DOCS` list (strictly maintain rule 20). + - **CRITICAL EXCEPTION:** Do **NOT** re-read `.md` files that are already mapped in your context or `LOADED_DOCS` list (strictly maintain rule 3). - **CROSS-REPO HARD-GATE:** When navigating to an external repo (via `own-dep-repos` paths), read that repo's `docs/` and `README.md` BEFORE searching its source code. The hard-gate applies to EVERY repo you enter, not just your own. - **PER-QUESTION DOC-FIRST:** Before searching source code for any user question, check whether there is a relevant `.md` file (folder `README.md`, other repo `docs/`, etc.) that has NOT yet been loaded. Read it first — it tells you where to look in the code, saving searches and tokens. Only after loading relevant docs should you search/read source files. 3. **STRICT NO-RE-READ POLICY (ANTI-LOOP):** You are PHYSICALLY FORBIDDEN from calling `get_file` or `file_search` on any `.md` file that is already listed in your `[LOADED_DOCS]` prefix. - - Once an `.md` file is in your context, it STAYS in your context. + - **Definition:** A doc is "in your context" ONLY if you have read its actual file content via a tool call in THIS conversation. Prior session summaries, compacted messages, and memory entries do NOT count — they are lossy compressions. + - Once an `.md` file is in your context, it STAYS in your context. - Re-reading them wastes tokens and breaks the protocol. - ONLY re-read an `.md` file if the user EXPLICITLY states "the file has changed on disk, read it again". - If the user simply mentions a glossary term or requests info found in a loaded doc, answer directly from memory. DO NOT search for it again. @@ -26,6 +27,12 @@ You are operating in a multi-repo, documentation-first architecture. 
You MUST ST If the user asks a domain/architecture specific question and you realize the essential `.md` files are NO LONGER in your current context (they dropped out of memory), you **MUST automatically re-read** the necessary documentation before answering. Do NOT wait for the user to explicitly tell you to re-read them. Prioritize scanning the `docs/` folders to recover the lost context. + **Auto-detection triggers (MUST treat ALL docs as NOT loaded):** + - Session starts with a summary of a previous conversation (context recovery/compaction) + - Message compaction or context compression occurred mid-session + - You cannot quote the exact content of a doc you claim to know + When any trigger fires → reset `[LOADED_DOCS: NONE]` and re-read per Rule #2. + Directories to read (when recovering context): - `docs/` (in this repository root) diff --git a/AyCode.Core/Serializers/Binaries/AcBinarySerializer.cs b/AyCode.Core/Serializers/Binaries/AcBinarySerializer.cs index 0c94b8d..59f69c1 100644 --- a/AyCode.Core/Serializers/Binaries/AcBinarySerializer.cs +++ b/AyCode.Core/Serializers/Binaries/AcBinarySerializer.cs @@ -301,54 +301,43 @@ public static partial class AcBinarySerializer /// public static byte[] Serialize(T value, AcBinarySerializerOptions options) { - if (value == null) - { - return [BinaryTypeCode.Null]; - } + if (value == null) return [BinaryTypeCode.Null]; var runtimeType = value.GetType(); - - // Handle IQueryable types - convert to AcExpressionNode (serialize the Expression) - object actualValue = value; - if (value is IQueryable queryable) - { - actualValue = AcSerializerCommon.QueryableToNode(queryable); - runtimeType = typeof(AcExpressionNode); - } - // Handle Expression types - convert to AcExpressionNode - else if (AcSerializerCommon.IsExpressionType(runtimeType)) - { - actualValue = AcExpressionConverter.ToNode((Expression)(object)value); - runtimeType = typeof(AcExpressionNode); - } - - var context = BinarySerializationContextPool.Get(options); - 
if (!context.OutputInitialized) - { - context.Output = new ArrayBinaryOutput(options.InitialBufferCapacity); - context.OutputInitialized = true; - } - context.Output.Initialize(out context._buffer, out context._position, out context._bufferEnd); + var context = AcquireArrayOutputContext(options); try { + // SGen fast path: skip IQueryable/Expression check + WriteValue dispatch chain. + // If root type has a GeneratedWriter it cannot be IQueryable/Expression/primitive/collection. + if (options.UseGeneratedCode) + { + var wrapper = context.GetWrapper(runtimeType); + if (wrapper.GeneratedWriter != null) + { + ScanForDuplicates(value, runtimeType, context); + context.WriteHeader(); + WriteObject(value, wrapper, context, 0); + + if (options.UseCompression != Lz4CompressionMode.None) + return Lz4.Compress(context.Output.AsSpan(context._buffer, context._position), options.UseCompression); + return context.Output.ToArray(context._buffer, context._position); + } + } + + // Full path: IQueryable/Expression conversion, primitive/collection dispatch + var actualValue = ConvertExpressionValue(value, ref runtimeType); ScanForDuplicates(actualValue, runtimeType, context); context.WriteHeader(); WriteValue(actualValue, runtimeType, context, 0); - // Apply compression if enabled - compress directly from buffer span (1 allocation) if (options.UseCompression != Lz4CompressionMode.None) - { return Lz4.Compress(context.Output.AsSpan(context._buffer, context._position), options.UseCompression); - } - - // No compression - single allocation for result return context.Output.ToArray(context._buffer, context._position); } finally { - if (options.UseAsync) BinarySerializationContextPool.ReturnAsync(context); - else BinarySerializationContextPool.Return(context); + ReturnContext(context, options); } } @@ -360,19 +349,14 @@ public static partial class AcBinarySerializer { if (value == null) return; var runtimeType = value.GetType(); - var context = BinarySerializationContextPool.Get(options); 
- if (!context.OutputInitialized) - { - context.Output = new ArrayBinaryOutput(options.InitialBufferCapacity); - context.OutputInitialized = true; - } + var context = AcquireArrayOutputContext(options); try { ScanForDuplicates(value, runtimeType, context); } finally { - BinarySerializationContextPool.Return(context); + ReturnContext(context, options); } } @@ -392,49 +376,49 @@ public static partial class AcBinarySerializer } var runtimeType = value.GetType(); - - // Handle IQueryable types - convert to AcExpressionNode (serialize the Expression) - object actualValue = value; - if (value is IQueryable queryable) - { - actualValue = AcSerializerCommon.QueryableToNode(queryable); - runtimeType = typeof(AcExpressionNode); - } - // Handle Expression types - convert to AcExpressionNode - else if (AcSerializerCommon.IsExpressionType(runtimeType)) - { - actualValue = AcExpressionConverter.ToNode((Expression)(object)value); - runtimeType = typeof(AcExpressionNode); - } - var context = BinarySerializationContextPool.Get(options); context.Output = new BufferWriterBinaryOutput(writer, options.BufferWriterChunkSize); context.Output.Initialize(out context._buffer, out context._position, out context._bufferEnd); try { + // SGen fast path: skip IQueryable/Expression check + WriteValue dispatch chain. + // If root type has a GeneratedWriter it cannot be IQueryable/Expression/primitive/collection. 
+ if (options.UseGeneratedCode) + { + var wrapper = context.GetWrapper(runtimeType); + if (wrapper.GeneratedWriter != null) + { + ScanForDuplicates(value, runtimeType, context); + context.WriteHeader(); + WriteObject(value, wrapper, context, 0); + + if (options.UseCompression != Lz4CompressionMode.None) + ThrowCompressionNotSupportedWithBufferWriter(context); + + var bytesWritten = context.Output.GetTotalPosition(context._position); + context.Output.Flush(context._buffer, context._position); + return bytesWritten; + } + } + + // Full path: IQueryable/Expression conversion, primitive/collection dispatch + var actualValue = ConvertExpressionValue(value, ref runtimeType); ScanForDuplicates(actualValue, runtimeType, context); context.WriteHeader(); WriteValue(actualValue, runtimeType, context, 0); - // Apply compression if enabled if (options.UseCompression != Lz4CompressionMode.None) - { - context.Output.Flush(context._buffer, context._position); - throw new NotSupportedException( - "Compression is not supported with IBufferWriter output. 
" + - "Use the byte[] overload or disable compression."); - } + ThrowCompressionNotSupportedWithBufferWriter(context); - var bytesWritten = context.Output.GetTotalPosition(context._position); + var totalBytesWritten = context.Output.GetTotalPosition(context._position); context.Output.Flush(context._buffer, context._position); - return bytesWritten; + return totalBytesWritten; } finally { context.Output = default; - if (options.UseAsync) BinarySerializationContextPool.ReturnAsync(context); - else BinarySerializationContextPool.Return(context); + ReturnContext(context, options); } } @@ -447,14 +431,7 @@ public static partial class AcBinarySerializer if (value == null) return 1; var runtimeType = value.GetType(); - - var context = BinarySerializationContextPool.Get(options); - if (!context.OutputInitialized) - { - context.Output = new ArrayBinaryOutput(options.InitialBufferCapacity); - context.OutputInitialized = true; - } - context.Output.Initialize(out context._buffer, out context._position, out context._bufferEnd); + var context = AcquireArrayOutputContext(options); try { @@ -465,8 +442,7 @@ public static partial class AcBinarySerializer } finally { - if (options.UseAsync) BinarySerializationContextPool.ReturnAsync(context); - else BinarySerializationContextPool.Return(context); + ReturnContext(context, options); } } @@ -477,20 +453,10 @@ public static partial class AcBinarySerializer /// public static BinarySerializationResult SerializeToPooledBuffer(T value, AcBinarySerializerOptions options) { - if (value == null) - { - return BinarySerializationResult.FromImmutable([BinaryTypeCode.Null]); - } + if (value == null) return BinarySerializationResult.FromImmutable([BinaryTypeCode.Null]); var runtimeType = value.GetType(); - - var context = BinarySerializationContextPool.Get(options); - if (!context.OutputInitialized) - { - context.Output = new ArrayBinaryOutput(options.InitialBufferCapacity); - context.OutputInitialized = true; - } - context.Output.Initialize(out 
context._buffer, out context._position, out context._bufferEnd); + var context = AcquireArrayOutputContext(options); try { @@ -509,11 +475,74 @@ public static partial class AcBinarySerializer } finally { - if (options.UseAsync) BinarySerializationContextPool<ArrayBinaryOutput>.ReturnAsync(context); - else BinarySerializationContextPool<ArrayBinaryOutput>.Return(context); + ReturnContext(context, options); } } + #region Entry Point Helpers + + /// <summary> + /// Acquires a pooled ArrayBinaryOutput context and initializes the output buffer. + /// Reuses pooled ArrayBinaryOutput instance when available. + /// AggressiveInlining: JIT must see buffer init in caller scope for register allocation. + /// </summary> + [MethodImpl(MethodImplOptions.AggressiveInlining)] + private static BinarySerializationContext<ArrayBinaryOutput> AcquireArrayOutputContext(AcBinarySerializerOptions options) + { + var context = BinarySerializationContextPool<ArrayBinaryOutput>.Get(options); + if (!context.OutputInitialized) + { + context.Output = new ArrayBinaryOutput(options.InitialBufferCapacity); + context.OutputInitialized = true; + } + context.Output.Initialize(out context._buffer, out context._position, out context._bufferEnd); + return context; + } + + /// <summary> + /// Returns a serialization context to its pool. Uses async return when UseAsync is enabled. + /// </summary> + private static void ReturnContext<TOutput>(BinarySerializationContext<TOutput> context, AcBinarySerializerOptions options) + where TOutput : struct, IBinaryOutputBase + { + if (options.UseAsync) BinarySerializationContextPool<TOutput>.ReturnAsync(context); + else BinarySerializationContextPool<TOutput>.Return(context); + } + + /// <summary> + /// Converts IQueryable/Expression values to AcExpressionNode for serialization. + /// Returns the converted value (or original if no conversion needed). 
+ /// </summary> + [MethodImpl(MethodImplOptions.NoInlining)] + private static object ConvertExpressionValue<T>(T value, ref Type runtimeType) + { + if (value is IQueryable queryable) + { + runtimeType = typeof(AcExpressionNode); + return AcSerializerCommon.QueryableToNode(queryable); + } + if (AcSerializerCommon.IsExpressionType(runtimeType)) + { + runtimeType = typeof(AcExpressionNode); + return AcExpressionConverter.ToNode((Expression)(object)value!); + } + return value!; + } + + /// <summary> + /// Flushes output and throws NotSupportedException for compression with IBufferWriter. + /// </summary> + [MethodImpl(MethodImplOptions.NoInlining)] + private static void ThrowCompressionNotSupportedWithBufferWriter(BinarySerializationContext<BufferWriterBinaryOutput> context) + { + context.Output.Flush(context._buffer, context._position); + throw new NotSupportedException( + "Compression is not supported with IBufferWriter output. " + + "Use the byte[] overload or disable compression."); + } + + #endregion + #endregion #region Generated Writer Bridge Methods diff --git a/AyCode.Core/Serializers/Binaries/README.md b/AyCode.Core/Serializers/Binaries/README.md index 77b1c2b..4d8789d 100644 --- a/AyCode.Core/Serializers/Binaries/README.md +++ b/AyCode.Core/Serializers/Binaries/README.md @@ -4,6 +4,7 @@ High-performance binary serialization/deserialization. Two-phase processing, mul > Implementation details (zero virtual dispatch, buffer management): `../../docs/BINARY_IMPLEMENTATION.md` > Output writers (ArrayBinaryOutput, BufferWriterBinaryOutput, chunk sizing): `../../docs/BINARY_WRITERS.md` +> Source generation (SGen architecture, hybrid model, bridge methods): `../../docs/BINARY_SGEN.md` ## Architecture @@ -14,6 +15,17 @@ High-performance binary serialization/deserialization. Two-phase processing, mul Generic over `TOutput` for strategy selection (`ArrayBinaryOutput` vs `BufferWriterBinaryOutput`). 
+### Root Dispatch + +Two root paths in `AcBinarySerializer.Serialize`: + +| Path | Condition | Dispatch depth | +|------|-----------|---------------| +| **SGen fast** | `UseGeneratedCode` + `GeneratedWriter != null` | 3 checks → `WriteObject` directly | +| **Full runtime** | No GeneratedWriter or `UseGeneratedCode=false` | IQueryable → Expression → TryWritePrimitive → WriteValueNonPrimitive → WriteObject | + +SGen fast path skips: `is IQueryable`, `IsExpressionType`, `TryWritePrimitive` (GetTypeCode + 15-case switch), `WriteValueNonPrimitive` (4 interface checks). Wire format identical. Details: `../../docs/BINARY_SGEN.md`. + ### Wire Format `BinaryTypeCode.cs` — 100+ type markers: diff --git a/AyCode.Core/docs/BINARY_FEATURES.md b/AyCode.Core/docs/BINARY_FEATURES.md index 11da5e5..71b500f 100644 --- a/AyCode.Core/docs/BINARY_FEATURES.md +++ b/AyCode.Core/docs/BINARY_FEATURES.md @@ -1,6 +1,6 @@ # AcBinary Features -Advanced serialization features built on top of the wire format. For core type markers and encoding see `BINARY_FORMAT.md`. For configuration options and presets see `BINARY_OPTIONS.md`. For internal architecture and memory management see `BINARY_IMPLEMENTATION.md`. +Advanced serialization features built on top of the wire format. For core type markers and encoding see `BINARY_FORMAT.md`. For configuration options and presets see `BINARY_OPTIONS.md`. For internal architecture and memory management see `BINARY_IMPLEMENTATION.md`. For source generation details see `BINARY_SGEN.md`. ## Compact Encoding Selection @@ -68,21 +68,16 @@ Wire output (Compact mode, ReferenceHandling=All): ## Hybrid Execution Model (Runtime vs Source Generated) -The serializer employs a "frictionless" hybrid execution model to balance ease of use with maximum performance. +Two execution modes, seamlessly interoperable in a single serialization run: -**Zero-Configuration (Runtime Fallback)** -By default, any class or record can be serialized without attributes. 
The serializer discovers properties via reflection, computes the deterministic base→derived order, and falls back to compiled delegates (`GetValue`) for property access. This provides a no-friction start and easy integration with 3rd-party types. +| Mode | Trigger | Property access | When to use | +|------|---------|----------------|-------------| +| **SGen** | `[AcBinarySerializable]` + `UseGeneratedCode=true` | `Unsafe.As` direct | Hot-path types | +| **Runtime** | No attribute or `UseGeneratedCode=false` | Compiled delegates | 3rd-party types, fallback | -**Source Generator (SGen)** -When a type is decorated with `[AcBinarySerializable]`, the Source Generator emits highly optimized, reflection-free serialization code (inlining property writes, avoiding dictionary lookups). +SGen root types use a **fast path** that skips the full dispatch chain (~12 calls → 3 checks). SGen children call directly into other SGen writers; non-SGen children fall back to runtime via bridge methods. Wire format is identical regardless of mode. -**Seamless Interoperability** -When `UseGeneratedCode = true` (default), the framework seamlessly mixes both approaches during a single serialization run: -- When the runtime encounters a type with a generated writer (`wrapper.GeneratedWriter != null`), it directly invokes it. -- If the generated code encounters a nested type that *lacks* a generated writer, it seamlessly calls back into the runtime pipeline (`ScanValueGenerated` / `WriteValueGenerated`). -- If `UseGeneratedCode = false`, the serializer ignores all SGen outputs and strictly uses the runtime fallback (useful for fallback testing or specific isolation needs). - -This allows developers to iteratively optimize performance bottlenecks (by adding attributes to hot-path classes) without breaking compatibility or requiring a total rewrite. 
+> Full SGen architecture, bridge methods, generated code patterns, wrapper slots: `BINARY_SGEN.md` ## Property Ordering diff --git a/AyCode.Core/docs/BINARY_FORMAT.md b/AyCode.Core/docs/BINARY_FORMAT.md index a4b90cb..9a171e1 100644 --- a/AyCode.Core/docs/BINARY_FORMAT.md +++ b/AyCode.Core/docs/BINARY_FORMAT.md @@ -2,9 +2,8 @@ Complete wire format specification for the AcBinary serializer. Source of truth: `Serializers/Binaries/BinaryTypeCode.cs`. -> For advanced features (compact encoding, string interning, reference tracking, property ordering) see `BINARY_FEATURES.md`. -> For configuration options, presets, and option interactions see `BINARY_OPTIONS.md`. -> For internal architecture and zero-allocation memory management see `BINARY_IMPLEMENTATION.md`. +> Features (interning, ref tracking, property ordering): `BINARY_FEATURES.md` | Options/presets: `BINARY_OPTIONS.md` +> Implementation (zero-alloc, buffer management): `BINARY_IMPLEMENTATION.md` | SGen architecture: `BINARY_SGEN.md` ## Stream Layout diff --git a/AyCode.Core/docs/BINARY_IMPLEMENTATION.md b/AyCode.Core/docs/BINARY_IMPLEMENTATION.md index bfd9d36..0b0d5a9 100644 --- a/AyCode.Core/docs/BINARY_IMPLEMENTATION.md +++ b/AyCode.Core/docs/BINARY_IMPLEMENTATION.md @@ -2,7 +2,7 @@ Low-level technical decisions, memory management, internal structure of `AcBinarySerializer`. For framework developers modifying the serialization pipeline. -> Format spec: `BINARY_FORMAT.md` | Options/presets: `BINARY_OPTIONS.md` | Features: `BINARY_FEATURES.md` | Output writers: `BINARY_WRITERS.md` +> Format spec: `BINARY_FORMAT.md` | Options/presets: `BINARY_OPTIONS.md` | Features: `BINARY_FEATURES.md` | Output writers: `BINARY_WRITERS.md` | SGen architecture: `BINARY_SGEN.md` > Benchmark results: `../../Test_Benchmark_Results/Benchmark/*.LLM` ## Zero-Allocation Buffer Management @@ -36,6 +36,50 @@ Generic `TOutput : struct, IBinaryOutputBase` → JIT devirtualizes `Grow()`. 
Ou > `ArrayBinaryOutput`, `BufferWriterBinaryOutput`, chunk sizing, dual buffer state: `BINARY_WRITERS.md` +## Root Serialization Dispatch + +### SGen Root Fast Path + +When root type has `GeneratedWriter` (SGen-decorated), the serializer **skips the full dispatch chain** and calls `WriteObject` directly with a pre-resolved wrapper. + +``` +Full path (runtime): + null → GetType → IQueryable? → IsExpressionType? → Pool.Get → Initialize + → ScanForDuplicates → WriteHeader + → WriteValue → TryWritePrimitive(GetTypeCode+15-case switch) + → WriteValueNonPrimitive(byte[]? IDictionary? IEnumerable? GetWrapper) + → WriteObject → WriteObjectProperties → SGen + +SGen fast path: + null → GetType → Pool.Get → Initialize + → GetWrapper → GeneratedWriter != null? + YES → ScanForDuplicates → WriteHeader → WriteObject(wrapper) → return + NO → full path (IQueryable/Expression + WriteValue dispatch) +``` + +**Why safe:** GeneratedWriter exists → type is NOT IQueryable/Expression/primitive/byte[]/IDictionary/IEnumerable. SGen only generates for object model types. WriteObject handles FixObj slot assignment, UseMetadata, RefHandling — all work correctly when called directly with pre-resolved wrapper. + +**Saved overhead:** `is IQueryable` interface check, `IsExpressionType` IsAssignableFrom, `TryWritePrimitive` Type.GetTypeCode + 15-case switch, `WriteValueNonPrimitive` 4 interface checks + duplicate GetWrapper, 2 eliminated method call levels. + +**Non-SGen penalty:** +1 bool check (`options.UseGeneratedCode`) + 1 `GetWrapper` (cached) + 1 null check ≈ 10-15 ns. + +**Location:** `AcBinarySerializer.cs` — both `Serialize(T, options)` (byte[] path) and `Serialize(T, IBufferWriter, options)`. 
+ +### Full Runtime Dispatch Chain + +For non-SGen root types, the full chain executes: + +| Step | Method | What it does | +|------|--------|-------------| +| 1 | `Serialize` | null check, GetType, pool get, IQueryable/Expression conversion | +| 2 | `ScanForDuplicates` | builds write plan (if caching enabled) | +| 3 | `WriteHeader` | version + flags + cacheCount | +| 4 | `WriteValue` | null check → `TryWritePrimitive` → `WriteValueNonPrimitive` | +| 5 | `TryWritePrimitive` | `Type.GetTypeCode` + 15-case switch (Int32/String/Bool/etc.) | +| 6 | `WriteValueNonPrimitive` | `is byte[]?` → `is IDictionary?` → `is IEnumerable?` → `GetWrapper` → `WriteObject` | +| 7 | `WriteObject` | FixObj/Object marker, ref handling, metadata → `WriteObjectProperties` | +| 8 | `WriteObjectProperties` | SGen `WriteProperties` or runtime property loop | + ## Direct Object Write (IsDirectObjectWrite) When `UseMetadata = false`: no inline property name hashes needed. SGen bypasses generic `WriteObject` loop entirely — writes `Object(64)` marker then sequential properties. Overhead ≈ raw byte writes. @@ -70,3 +114,21 @@ Two-phase: - **Hot:** single-byte check (e.g. `value < 0x80`), `AggressiveInlining` - **Cold:** multi-byte logic in separate `NoInlining` method (e.g. `WriteVarUIntMultiByteUnsafe`) - Keeps caller IL small, cache-friendly + +### 4. SGen Root Fast Path + +**Rule:** Root-level SGen types MUST skip `WriteValue`/`TryWritePrimitive`/`WriteValueNonPrimitive` dispatch chain. + +- `Serialize` checks `options.UseGeneratedCode` + `wrapper.GeneratedWriter != null` before IQueryable/Expression check +- Calls `WriteObject(value, wrapper, context, 0)` directly +- Wire format identical — only dispatch path differs +- Applies to both byte[] and IBufferWriter entry points + +### 5. Pool Management + +**Rule:** Context pool is generic over `TOutput`. Pool.Get → try/finally → Pool.Return. 
+ +- `BinarySerializationContextPool<ArrayBinaryOutput>` — byte[] path +- `BinarySerializationContextPool<BufferWriterBinaryOutput>` — IBufferWriter path +- `options.UseAsync` → `ReturnAsync` (ThreadPool enqueue) to avoid lock contention +- Pooled contexts retain wrapper caches, buffer instances across serializations diff --git a/AyCode.Core/docs/BINARY_OPTIONS.md b/AyCode.Core/docs/BINARY_OPTIONS.md index 7cb57d1..07b2670 100644 --- a/AyCode.Core/docs/BINARY_OPTIONS.md +++ b/AyCode.Core/docs/BINARY_OPTIONS.md @@ -118,7 +118,7 @@ delegate PropertyInfo? PropertyMapperDelegate(PropertyInfo sourceProperty, Type | Option | Type | Default | Purpose | |--------|------|---------|---------| -| `UseGeneratedCode` | bool | `true` | Use source-generated writers/readers when available | +| `UseGeneratedCode` | bool | `true` | Use source-generated writers/readers when available. Enables SGen root fast path (see `BINARY_SGEN.md`) | | `InitialBufferCapacity` | int | 4096 | Starting buffer size (bytes) for serialization output | | `RemoveOrphanedItems` | bool | `false` | During `PopulateMerge`: remove destination collection items with no matching source ID | | `UseAsync` | bool | `false` | Async context pool return via ThreadPool. Auto-disabled in WASM and when `ReferenceHandling=None` | diff --git a/AyCode.Core/docs/BINARY_SGEN.md b/AyCode.Core/docs/BINARY_SGEN.md new file mode 100644 index 0000000..ebff6d5 --- /dev/null +++ b/AyCode.Core/docs/BINARY_SGEN.md @@ -0,0 +1,161 @@ +# AcBinary Source Generation (SGen) + +Source-generated serialization architecture, hybrid execution model, bridge methods, and code generation patterns. For modifying SGen writers or the runtime ↔ SGen boundary. 
+ +> Wire format: `BINARY_FORMAT.md` | Options: `BINARY_OPTIONS.md` | Implementation: `BINARY_IMPLEMENTATION.md` | Writers: `BINARY_WRITERS.md` + +## Overview + +Two execution modes, seamlessly interoperable in a single serialization run: + +| Mode | Trigger | Property access | Type dispatch | Performance | +|------|---------|----------------|---------------|-------------| +| **SGen** | `[AcBinarySerializable]` + `UseGeneratedCode=true` | `Unsafe.As` direct field | Compile-time known | Fastest | +| **Runtime** | No attribute or `UseGeneratedCode=false` | Compiled delegates (`GetValue`) | Reflection + interface checks | Flexible | + +## Interfaces + +### IGeneratedBinaryWriter + +```csharp +void WriteProperties(object value, BinarySerializationContext context, int depth); +void ScanObject(object value, BinarySerializationContext context, int depth); +void ScanForDuplicates(object value, BinarySerializationContext context); +``` + +- `WriteProperties`: writes all properties directly (no marker, no ref handling — caller handles) +- `ScanObject`: recursive graph walk for duplicates (strings + IId objects) +- `ScanForDuplicates`: entry point — `HasCaching` check + `ScanObject` + `SortWritePlan` + +### IGeneratedBinaryReader + +```csharp +object? ReadObject(BinaryDeserializationContext context, int depth, int cacheIndex); +void ReadProperties(object value, BinaryDeserializationContext context, int depth); +``` + +## SGen Root Fast Path + +**Problem:** Runtime dispatch chain for root object: ~12 calls, ~8 branches before first property byte. MemoryPack: ~2 calls. 
+ +**Solution:** `AcBinarySerializer.Serialize` checks for GeneratedWriter before the full dispatch chain: + +``` +if (options.UseGeneratedCode) + wrapper = GetWrapper(runtimeType) + if (wrapper.GeneratedWriter != null) + ScanForDuplicates → WriteHeader → WriteObject(wrapper) → return +// else: full path (IQueryable/Expression + WriteValue dispatch) +``` + +**Skipped steps:** `is IQueryable` check, `IsExpressionType`, `TryWritePrimitive` (Type.GetTypeCode + 15-case switch), `WriteValueNonPrimitive` (4 interface checks + GetWrapper), 2 method call levels. + +**Safety guarantees:** +- GeneratedWriter exists → type is NOT IQueryable, Expression, primitive, byte[], IDictionary, IEnumerable +- SGen only generates for `[AcBinarySerializable]` object model types +- Root always uses `value.GetType()` → no declared vs runtime type mismatch (polymorphism safe) +- `WriteObject` handles FixObj slots, UseMetadata, RefHandling → wire format identical +- Hybrid children use bridge methods → unchanged behavior + +**Applies to:** both `Serialize(T, options)` (byte[]) and `Serialize(T, IBufferWriter, options)`. + +## Hybrid Execution Model + +SGen root type with non-SGen child types works transparently. SGen-generated `WriteProperties` calls bridge methods for unknown child types: + +``` +SGen Root (WriteProperties) + ├─ SGen Child A → direct SGen→SGen call (WriteProperties) + ├─ Runtime Child B → WriteValueGenerated (bridge → WriteValueNonPrimitive) + ├─ String property → WriteStringGenerated (bridge → WriteString) + └─ SGen Child C with runtime grandchild + ├─ SGen grandchild → direct call + └─ Runtime grandchild → WriteObjectGenerated (bridge → GetWrapper + WriteObject) +``` + +**Key:** SGen types can call **directly** into other SGen types' `WriteProperties` (zero dispatch). Non-SGen children fall back to runtime via bridge methods (full type dispatch). + +## Bridge Methods + +Located in `AcBinarySerializer.cs` region "Generated Writer Bridge Methods". 
Called by SGen-generated code to transition to runtime pipeline. + +| Bridge | Called when | What it does | +|--------|-----------|-------------| +| `WriteValueGenerated(value, type, ctx, depth)` | SGen encounters non-SGen complex child | → `WriteValueNonPrimitive` (byte[]? IDictionary? IEnumerable? GetWrapper → WriteObject) | +| `WriteObjectGenerated(value, type, ctx, depth)` | SGen knows child is object (not collection) | → `GetWrapper(type)` → `WriteObject` | +| `WriteObjectGenerated(value, type, slot, ctx, depth)` | SGen knows child is SGen object with known slot | → `GetWrapper(type, slot)` → `WriteObject`. Slot = compile-time known wrapper index, avoids dictionary lookup | +| `WriteStringGenerated(value, ctx)` | SGen writes string property | null → PropertySkip, empty → StringEmpty, else → `WriteString` (with interning) | +| `ScanValueGenerated(value, type, ctx, depth)` | SGen scan encounters non-SGen child | → runtime `ScanValue` for reference/string tracking | + +## Wrapper Slot System + +Each SGen type gets a unique slot index via `AllocateWrapperSlot()` (`Interlocked.Increment`). + +| Slot range | Purpose | +|-----------|---------| +| 0–63 | FixObj markers — runtime polymorphic types (dynamic assignment) | +| 64+ | SGen types (compile-time stable, sequential allocation) | + +`GetWrapper(type, slot)`: first call per slot per context → populates from `GetWrapper(type)`. Subsequent calls → direct array index `_wrapperSlots[slot]`. O(1) lookup, no dictionary. 
+ +## Generated Code Patterns + +### WriteProperties (generated) + +```csharp +// Generated for [AcBinarySerializable] type +void WriteProperties(object value, BinarySerializationContext ctx, int depth) +{ + var obj = Unsafe.As(value); // No cast, no box + ctx.WriteInt32Property(obj.Id); // Direct typed write + AcBinarySerializer.WriteStringGenerated(obj.Name, ctx); // String bridge + // SGen child → direct call: + ChildWriter.WriteProperties(obj.Child, ctx, depth + 1); + // Non-SGen child → bridge: + AcBinarySerializer.WriteValueGenerated(obj.Other, typeof(OtherType), ctx, depth + 1); +} +``` + +### ScanObject (generated) + +```csharp +void ScanObject(object value, BinarySerializationContext ctx, int depth) +{ + var obj = Unsafe.As(value); + // Track this object for ref handling + if (!ctx.TrackObjectForScan(value)) return; // Already visited + // Scan string properties for interning + ctx.ScanStringProperty(obj.Name); + // SGen child → direct scan: + ChildWriter.ScanObject(obj.Child, ctx, depth + 1); + // Non-SGen child → bridge: + AcBinarySerializer.ScanValueGenerated(obj.Other, typeof(OtherType), ctx, depth + 1); +} +``` + +## Performance Characteristics + +| Aspect | SGen | Runtime | +|--------|------|---------| +| Property access | `Unsafe.As` (0 overhead) | Compiled delegate invoke | +| Type dispatch | Compile-time resolved | Interface checks + GetTypeCode switch | +| Wrapper lookup | Slot array index O(1) | Dictionary lookup (amortized O(1) but hashing) | +| String write | Bridge to same WriteString | Same WriteString | +| Ref handling | Same IdentityMap | Same IdentityMap | +| Wire format | **Identical** | **Identical** | +| Root dispatch | Fast path (3 checks) | Full chain (~12 calls, ~8 branches) | + +## Configuration + +| Option | Default | Effect on SGen | +|--------|---------|---------------| +| `UseGeneratedCode` | `true` | `false` → ignores all GeneratedWriter/Reader, uses runtime only | +| `UseMetadata` | `false` | `false` + SGen → 
`IsDirectObjectWrite=true` → SGen inlines property writes | +| `ReferenceHandling` | `All` | `None` + `StringInterning=None` → no scan pass (single-phase) | + +## When to Use SGen + +- **Hot-path types**: types serialized frequently (SignalR messages, cache entries) +- **Large object graphs**: deep nesting benefits most from zero-dispatch property access +- **Small payloads**: root fast path eliminates dispatch overhead that dominates at small sizes +- **Not needed for**: one-off serialization, 3rd-party types, Expression/IQueryable types