Update defaults, docs, and internals for AcBinary serializer

- Add BINARY_IMPLEMENTATION.md with internal architecture and perf details
- Link new implementation doc from all relevant documentation
- Change default ReferenceHandlingMode to OnlyId
- Change default UseStringInterning to All
- Add InternalsVisibleTo for Mango.Nop.Core
Loretta 2026-04-02 22:17:46 +02:00
parent 75974bf238
commit 896ee257c4
8 changed files with 83 additions and 4 deletions


@ -5,3 +5,4 @@ using System.Runtime.CompilerServices;
[assembly: InternalsVisibleTo("AyCode.Core.Tests.Internal")]
[assembly: InternalsVisibleTo("AyCode.Benchmark")]
[assembly: InternalsVisibleTo("FruitBank.Common")]
[assembly: InternalsVisibleTo("Mango.Nop.Core")]


@ -19,7 +19,7 @@ public abstract class AcSerializerOptions
set => _referenceHandling = value;
}
-private ReferenceHandlingMode _referenceHandling = ReferenceHandlingMode.All;
+private ReferenceHandlingMode _referenceHandling = ReferenceHandlingMode.OnlyId;
private readonly byte _maxDepth = byte.MaxValue;
private readonly bool _throwOnCircularReference = true;


@ -99,7 +99,7 @@ public sealed class AcBinarySerializerOptions : AcSerializerOptions
/// All: All strings within length limits are interned (legacy behavior).
/// Default: All
/// </summary>
-public StringInterningMode UseStringInterning { get; set; } = StringInterningMode.Attribute;
+public StringInterningMode UseStringInterning { get; set; } = StringInterningMode.All;
/// <summary>
/// When true, checks for duplicate property name hashes during serialization (UseMetadata mode).


@ -2,6 +2,8 @@
High-performance binary serialization/deserialization with two-phase processing, multiple wire modes, string interning, and source generation support. The primary goal is **speed**: every design decision prioritizes minimal latency and maximum throughput.
> For deep technical implementation details (Zero Virtual Dispatch, Direct Buffer Management), see `../../docs/BINARY_IMPLEMENTATION.md`.
## Architecture
### Two-Phase Serialization


@ -1,6 +1,6 @@
# AcBinary Features
-Advanced serialization features built on top of the wire format. For core type markers and encoding see `BINARY_FORMAT.md`. For configuration options and presets see `BINARY_OPTIONS.md`.
+Advanced serialization features built on top of the wire format. For core type markers and encoding see `BINARY_FORMAT.md`. For configuration options and presets see `BINARY_OPTIONS.md`. For internal architecture and memory management see `BINARY_IMPLEMENTATION.md`.
## Compact Encoding Selection


@ -4,6 +4,7 @@ Complete wire format specification for the AcBinary serializer. Source of truth:
> For advanced features (compact encoding, string interning, reference tracking, property ordering) see `BINARY_FEATURES.md`.
> For configuration options, presets, and option interactions see `BINARY_OPTIONS.md`.
> For internal architecture and zero-allocation memory management see `BINARY_IMPLEMENTATION.md`.
## Stream Layout


@ -0,0 +1,75 @@
# AcBinary Implementation Details
This document covers the low-level technical decisions, memory management strategies, and internal structure of the `AcBinary` serializer. These details are intended for framework developers modifying or extending the serialization pipeline.
For format specifications, see `BINARY_FORMAT.md`. For options and presets, see `BINARY_OPTIONS.md`. For features, see `BINARY_FEATURES.md`.
## Zero-Allocation Buffer Management
The core design philosophy of `AcBinarySerializer` is **Zero Virtual Dispatch** and **Zero Direct Allocation** on the hot path.
### Context-Owned Buffer State
Instead of passing an `IBinaryWriter` interface down the object graph (which forces a virtual method call for every single byte or integer written), the `BinarySerializationContext<TOutput>` strictly owns the buffer state:
```csharp
internal byte[] _buffer = null!;
internal int _position;
internal int _bufferEnd;
```
All write methods (`WriteByte`, `WriteVarUInt`, `WriteStringUtf8`, etc.) are declared directly on the sealed context class and aggressively inlined.
### No Temporary Buffers for Strings
When writing strings, the serializer **never** allocates an intermediate `byte[]` to perform UTF-8 encoding. It encodes directly into the context's pinned `_buffer` through a `Span<byte>`, using `Ascii.FromUtf16` on the fast path and `Utf8NoBom.GetBytes` as the fallback.
**Speculative ASCII Fast Path:**
1. Assume the string is purely ASCII (byte length = char length).
2. Ensure capacity for `value.Length`.
3. Try `Ascii.FromUtf16` directly into the final buffer.
4. If the input contains a non-ASCII character, `Ascii.FromUtf16` stops and reports `OperationStatus.InvalidData` instead of throwing. The serializer then rewinds `_position`, calculates the true UTF-8 byte length, and encodes with `Utf8NoBom.GetBytes`.
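The four steps above can be sketched roughly as follows. This is a minimal illustration, not the serializer's actual code: `SketchWriter` and `EnsureCapacity` are hypothetical names, the length prefix that a real wire format would write is omitted, and growth uses `Array.Resize` for brevity where the real pipeline rents from `ArrayPool`. It assumes .NET 8 or later for `System.Text.Ascii`.

```csharp
using System;
using System.Buffers;
using System.Text;

// Hypothetical stand-in for the serialization context; only the string path is shown.
public sealed class SketchWriter
{
    private byte[] _buffer = new byte[16];
    private int _position;

    public ReadOnlySpan<byte> Written => _buffer.AsSpan(0, _position);

    public void WriteStringUtf8(string value)
    {
        // Steps 1-2: assume pure ASCII, so byte length == char length.
        EnsureCapacity(value.Length);

        // Step 3: encode straight into the final buffer, no temporary byte[].
        OperationStatus status = Ascii.FromUtf16(value.AsSpan(), _buffer.AsSpan(_position), out int written);
        if (status == OperationStatus.Done)
        {
            _position += written;
            return;
        }

        // Step 4: non-ASCII input. _position was not advanced yet, so the partially
        // written ASCII bytes are simply overwritten by the full UTF-8 encoding.
        int byteCount = Encoding.UTF8.GetByteCount(value);
        EnsureCapacity(byteCount);
        _position += Encoding.UTF8.GetBytes(value.AsSpan(), _buffer.AsSpan(_position));
    }

    // Simplified growth for the sketch; assumption: the real code grows via TOutput.
    private void EnsureCapacity(int extra)
    {
        if (_buffer.Length - _position >= extra) return;
        Array.Resize(ref _buffer, Math.Max(_buffer.Length * 2, _position + extra));
    }
}
```

In this sketch the "rewind" is implicit because `_position` only advances after a successful encode. A writer that emits a length prefix before the payload would need to restore the saved position explicitly.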
### Abstracting the Output (The `TOutput` Strategy)
To support both `byte[]` returns and streaming models (via `IBufferWriter<byte>`), the context is generic over `TOutput : struct, IBinaryOutputBase`.
The `TOutput` struct is **only** invoked on the cold path (when `_position >= _bufferEnd`).
- `ArrayBinaryOutput`: Rents an array of twice the current size from `ArrayPool`, copies the existing data across, and returns the old array to the pool. When serialization finishes, a slice of the final buffer is returned (often avoiding a `ToArray()` allocation altogether when wrapped in a `BinarySerializationResult`).
- `BufferWriterBinaryOutput`: Requests a new chunk from the underlying `IBufferWriter` and checks with `MemoryMarshal.TryGetArray` whether it is array-backed. If it is, the context writes directly into the `IBufferWriter`'s backing array without extra copying. If it is not (e.g. native memory), the output falls back to renting a temporary buffer from `ArrayPool`, writing to it, and later copying the chunk to the `IBufferWriter` on `Flush()` or `Grow()`.
Because `TOutput` is constrained to `struct`, the JIT compiles a specialized copy of the context for each output type and can fully devirtualize the `Grow()` calls.
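A minimal sketch of the struct-constrained strategy pattern described above. All names here (`IGrowStrategy`, `DoublingGrow`, `SketchContext`) are hypothetical and are not the library's `IBinaryOutputBase` or its implementations; the growth logic allocates with `new` only to keep the example short.

```csharp
using System;

// Hypothetical strategy interface; the real IBinaryOutputBase members are not shown in this document.
public interface IGrowStrategy
{
    byte[] Grow(byte[] current, int used, int needed);
}

// A struct implementation: calls through a struct-constrained type parameter are not boxed.
public struct DoublingGrow : IGrowStrategy
{
    public byte[] Grow(byte[] current, int used, int needed)
    {
        var next = new byte[Math.Max(current.Length * 2, used + needed)];
        Buffer.BlockCopy(current, 0, next, 0, used);
        return next;
    }
}

public sealed class SketchContext<TOutput> where TOutput : struct, IGrowStrategy
{
    private TOutput _output = default; // stored by value
    private byte[] _buffer = new byte[4];
    private int _position;

    public int Position => _position;

    public void WriteByte(byte value)
    {
        // Cold path only: the strategy is consulted when the buffer is full.
        if (_position >= _buffer.Length)
            _buffer = _output.Grow(_buffer, _position, 1);

        _buffer[_position++] = value;
    }
}
```

Because `SketchContext<DoublingGrow>` is instantiated over a value type, the runtime generates code specific to that type, so `_output.Grow(...)` resolves to a direct call rather than an interface dispatch.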
## Direct Object Write (IsDirectObjectWrite)
When `UseMetadata = false`, there is no need to track inline property name hashes, so the source generator (SGen) can completely bypass the generic `WriteObject` loop.
When SGen code executes, it checks `IsDirectObjectWrite`. If true, it writes the `Object(64)` marker and immediately outputs properties sequentially without any framework-level `foreach` or reflection. This reduces the overhead of an object write effectively to the time it takes to write raw bytes to an array.
## Property State Buffering
During the `UseMetadata=true` cross-type deserialization phase, properties might arrive in a different order than expected. The deserializer maintains a temporary state buffer (rented from `ArrayPool`) to track which properties have been seen and which have been populated, allowing it to efficiently map wire hashes to local property setters.
## High-Performance Coding Guidelines (LLM / Contributor Rules)
The following patterns are strictly enforced within the serialization pipeline. AI agents and developers modifying this layer MUST adhere to these rules to maintain zero-allocation and high-throughput characteristics.
### 1. The "Write Plan" (O(1) Reference Tracking)
**Rule:** Never use `Dictionary.ContainsKey`, `HashSet.Contains`, or similar lookups during the serialization hot path.
**Implementation:**
Reference tracking and string interning use a two-phase approach:
1. **Scan Pass:** Walks the object graph and populates an array-based `WriteDuplicateEntry[]` "Write Plan" representing occurrences of duplicates.
2. **Serialize Pass:** Iterates sequentially. The framework advances an integer cursor (`WriteVisitIndex`) and compares it to a pre-calculated index via `TryConsumeWritePlanEntry()`. This provides `O(1)` duplicate detection without any hashing overhead during the actual byte-writing phase.
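The two passes can be illustrated with a simplified sketch. It is an assumption-laden model, not the framework's code: a flat list of strings stands in for the object graph, `PlanEntry` is a hypothetical stand-in for `WriteDuplicateEntry` (whose real layout is not shown here), and the scan pass is assumed to be allowed to hash because it runs before any bytes are written.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical plan entry: "the value visited at VisitIndex repeats the one first seen at FirstIndex".
public readonly record struct PlanEntry(int VisitIndex, int FirstIndex);

public sealed class WritePlanSketch
{
    private PlanEntry[] _plan = Array.Empty<PlanEntry>();
    private int _planCursor;
    private int _visitIndex;

    // Scan pass: hashing happens here, before serialization starts.
    public void Scan(IReadOnlyList<string> values)
    {
        var firstSeen = new Dictionary<string, int>();
        var plan = new List<PlanEntry>();
        for (int i = 0; i < values.Count; i++)
        {
            if (firstSeen.TryGetValue(values[i], out int first))
                plan.Add(new PlanEntry(i, first)); // entries end up sorted by visit index
            else
                firstSeen[values[i]] = i;
        }
        _plan = plan.ToArray();
        _planCursor = 0;
        _visitIndex = 0;
    }

    // Serialize pass: one integer comparison per visited value, no hashing.
    public bool TryConsume(out int firstIndex)
    {
        int current = _visitIndex++;
        if (_planCursor < _plan.Length && _plan[_planCursor].VisitIndex == current)
        {
            firstIndex = _plan[_planCursor++].FirstIndex;
            return true;
        }
        firstIndex = -1;
        return false;
    }
}
```

The scheme only works if both passes visit values in exactly the same order, since the plan is keyed by visit position rather than by identity.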
### 2. Unsafe and SIMD Memory Access
**Rule:** Do not use `BitConverter` or manual bit-shifting for primitive unmanaged types or block copying.
**Implementation:**
- For structs and unmanaged types (e.g., `Guid`, `DateTime`, `decimal`), use `Unsafe.WriteUnaligned<T>` and `MemoryMarshal.AsBytes()` to write directly to the buffer.
- For contiguous block memory copies (e.g., `byte[]`), use `System.Span<T>.CopyTo` or SIMD hardware acceleration (e.g., `Vector<byte>` and `Vector.IsHardwareAccelerated` as seen in `WriteBytesSimd`).
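As a small illustration of the first bullet, the helper below writes any unmanaged value with `Unsafe.WriteUnaligned<T>`. `UnmanagedWriteSketch` and its methods are hypothetical names for this example; the explicit bounds check is an addition for safety and may not match how the real context guarantees capacity.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class UnmanagedWriteSketch
{
    // Writes an unmanaged value at the given offset and returns the new offset.
    public static int Write<T>(byte[] buffer, int position, T value) where T : unmanaged
    {
        int size = Unsafe.SizeOf<T>();
        if (position < 0 || buffer.Length - position < size)
            throw new ArgumentOutOfRangeException(nameof(position));

        // Single unaligned store; no BitConverter, no per-byte shifting.
        Unsafe.WriteUnaligned(ref buffer[position], value);
        return position + size;
    }

    // Matching read, used here only to demonstrate the round trip.
    public static T Read<T>(byte[] buffer, int position) where T : unmanaged
    {
        if (position < 0 || buffer.Length - position < Unsafe.SizeOf<T>())
            throw new ArgumentOutOfRangeException(nameof(position));

        return Unsafe.ReadUnaligned<T>(ref buffer[position]);
    }
}
```

Note that this copies the value's in-memory representation, so multi-byte primitives are written in the machine's native byte order.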
### 3. Hot-Path vs. Cold-Path Inlining (VarInt/VarUInt)
**Rule:** Ensure the JIT can inline the common cases by explicitly isolating the slow paths.
**Implementation:**
Methods called in tight loops are small and decorated with `[MethodImpl(MethodImplOptions.AggressiveInlining)]`. Heavy branching or loop logic (which prevents inlining) is extracted into separate methods:
- **Hot path:** Checks if the value fits in a single byte (e.g., `value < 0x80`). Aggressively inlined.
- **Cold path:** Multi-byte encoding logic is placed in a secondary method (e.g., `WriteVarUIntMultiByteUnsafe`) decorated with `[MethodImpl(MethodImplOptions.NoInlining)]`. This keeps the calling method's IL size extremely small and processor cache-friendly.
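The split can be sketched as below. The class and the private method name are hypothetical, capacity checks are left out (the buffer is a fixed 64 bytes), and the multi-byte layout is assumed to be a LEB128-style encoding with 7 payload bits per byte; the actual AcBinary wire encoding is defined in `BINARY_FORMAT.md` and may differ.

```csharp
using System;
using System.Runtime.CompilerServices;

public sealed class VarUIntSketch
{
    private readonly byte[] _buffer = new byte[64]; // fixed size, no growth in this sketch
    private int _position;

    public ReadOnlySpan<byte> Written => _buffer.AsSpan(0, _position);

    // Hot path: tiny body so the JIT can inline it at every call site.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void WriteVarUInt(uint value)
    {
        if (value < 0x80)
        {
            _buffer[_position++] = (byte)value;
            return;
        }
        WriteVarUIntMultiByte(value);
    }

    // Cold path: the loop lives in its own method and is kept out of callers.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private void WriteVarUIntMultiByte(uint value)
    {
        while (value >= 0x80)
        {
            _buffer[_position++] = (byte)((value & 0x7F) | 0x80); // low 7 bits plus continuation flag
            value >>= 7;
        }
        _buffer[_position++] = (byte)value;
    }
}
```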


@ -1,6 +1,6 @@
# AcBinary Configuration
-Configuration options, presets, and option interactions for `AcBinarySerializerOptions`. For wire format see `BINARY_FORMAT.md`. For features (interning, ref tracking, property ordering) see `BINARY_FEATURES.md`.
+Configuration options, presets, and option interactions for `AcBinarySerializerOptions`. For wire format see `BINARY_FORMAT.md`. For features (interning, ref tracking, property ordering) see `BINARY_FEATURES.md`. For internal architecture and memory management see `BINARY_IMPLEMENTATION.md`.
## WireMode