Update defaults, docs, and internals for AcBinary serializer
- Add BINARY_IMPLEMENTATION.md with internal architecture and perf details
- Link new implementation doc from all relevant documentation
- Change default ReferenceHandlingMode to OnlyId
- Change default UseStringInterning to All
- Add InternalsVisibleTo for Mango.Nop.Core
parent 75974bf238
commit 896ee257c4
@@ -5,3 +5,4 @@ using System.Runtime.CompilerServices;
 [assembly: InternalsVisibleTo("AyCode.Core.Tests.Internal")]
 [assembly: InternalsVisibleTo("AyCode.Benchmark")]
 [assembly: InternalsVisibleTo("FruitBank.Common")]
+[assembly: InternalsVisibleTo("Mango.Nop.Core")]
@@ -19,7 +19,7 @@ public abstract class AcSerializerOptions
 set => _referenceHandling = value;
 }

-private ReferenceHandlingMode _referenceHandling = ReferenceHandlingMode.All;
+private ReferenceHandlingMode _referenceHandling = ReferenceHandlingMode.OnlyId;

 private readonly byte _maxDepth = byte.MaxValue;
 private readonly bool _throwOnCircularReference = true;
@@ -99,7 +99,7 @@ public sealed class AcBinarySerializerOptions : AcSerializerOptions
 /// All: All strings within length limits are interned (legacy behavior).
 /// Default: All
 /// </summary>
-public StringInterningMode UseStringInterning { get; set; } = StringInterningMode.Attribute;
+public StringInterningMode UseStringInterning { get; set; } = StringInterningMode.All;

 /// <summary>
 /// When true, checks for duplicate property name hashes during serialization (UseMetadata mode).
@@ -2,6 +2,8 @@

 High-performance binary serialization/deserialization with two-phase processing, multiple wire modes, string interning, and source generation support. The primary goal is **speed**: every design decision prioritizes minimal latency and maximum throughput.

+> For deep technical implementation details (Zero Virtual Dispatch, Direct Buffer Management), see `../../docs/BINARY_IMPLEMENTATION.md`.
+
 ## Architecture

 ### Two-Phase Serialization
@@ -1,6 +1,6 @@
 # AcBinary Features

-Advanced serialization features built on top of the wire format. For core type markers and encoding see `BINARY_FORMAT.md`. For configuration options and presets see `BINARY_OPTIONS.md`.
+Advanced serialization features built on top of the wire format. For core type markers and encoding see `BINARY_FORMAT.md`. For configuration options and presets see `BINARY_OPTIONS.md`. For internal architecture and memory management see `BINARY_IMPLEMENTATION.md`.

 ## Compact Encoding Selection
@@ -4,6 +4,7 @@ Complete wire format specification for the AcBinary serializer. Source of truth:

 > For advanced features (compact encoding, string interning, reference tracking, property ordering) see `BINARY_FEATURES.md`.
 > For configuration options, presets, and option interactions see `BINARY_OPTIONS.md`.
+> For internal architecture and zero-allocation memory management see `BINARY_IMPLEMENTATION.md`.

 ## Stream Layout
@@ -0,0 +1,75 @@
# Binary Implementation Details

This document covers the low-level technical decisions, memory management strategies, and internal structure of the `AcBinary` serializer. These details are intended for framework developers modifying or extending the serialization pipeline.

For format specifications, see `BINARY_FORMAT.md`. For options and presets, see `BINARY_OPTIONS.md`. For features, see `BINARY_FEATURES.md`.

## Zero-Allocation Buffer Management

The core design philosophy of `AcBinarySerializer` is **Zero Virtual Dispatch** and **Zero Direct Allocation** on the hot path.

### Context-Owned Buffer State
Instead of passing an `IBinaryWriter` interface down the object graph (which forces a virtual method call for every single byte or integer written), the `BinarySerializationContext<TOutput>` strictly owns the buffer state:

```csharp
internal byte[] _buffer = null!; // buffer that all write methods target
internal int _position;          // offset of the next byte to write
internal int _bufferEnd;         // growth is triggered once _position >= _bufferEnd
```

All write methods (`WriteByte`, `WriteVarUInt`, `WriteStringUtf8`, etc.) are declared directly on the sealed context class and aggressively inlined.

### No Temporary Buffers for Strings
When writing strings, the serializer **never** allocates an intermediate `byte[]` for UTF-8 encoding. It encodes directly into the context's pinned `_buffer` through a `Span<byte>`, using `Ascii.FromUtf16` or `Utf8NoBom.GetBytes`.

**Speculative ASCII Fast Path:**
1. Assume the string is purely ASCII (byte length = char length).
2. Ensure capacity for `value.Length` bytes.
3. Try `Ascii.FromUtf16` directly into the final buffer.
4. If `Ascii.FromUtf16` hits a non-ASCII character, it aborts safely. The serializer then rewinds `_position`, calculates the true UTF-8 byte length, and encodes with `Utf8NoBom.GetBytes`.
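
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the serializer's actual code: `SpeculativeAsciiWriter` and `WriteString` are hypothetical names, `Encoding.UTF8` stands in for the `Utf8NoBom` instance, the length prefix is omitted, and the caller is assumed to have reserved enough capacity for the UTF-8 worst case.

```csharp
using System;
using System.Buffers;
using System.Text;

// Hypothetical helper; the real write methods live on the serialization context.
static class SpeculativeAsciiWriter
{
    // Encodes `value` into `buffer` starting at `position` and returns the
    // number of bytes written. Assumes the caller already ensured capacity.
    public static int WriteString(ReadOnlySpan<char> value, byte[] buffer, int position)
    {
        // Speculate that the string is pure ASCII: byte length == char length.
        Span<byte> asciiTarget = buffer.AsSpan(position, value.Length);
        OperationStatus status = Ascii.FromUtf16(value, asciiTarget, out int written);
        if (status == OperationStatus.Done)
            return written;

        // A non-ASCII char was found. "Rewind" by discarding the partial output
        // and re-encoding from the same start offset with the true byte length.
        int byteCount = Encoding.UTF8.GetByteCount(value);
        return Encoding.UTF8.GetBytes(value, buffer.AsSpan(position, byteCount));
    }
}
```

The fallback overwrites whatever the ASCII attempt wrote, so no intermediate `byte[]` is needed in either case.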

### Abstracting the Output (The `TOutput` Strategy)
To support both `byte[]` returns and streaming models (via `IBufferWriter<byte>`), the context is generic over `TOutput : struct, IBinaryOutputBase`.

The `TOutput` struct is **only** invoked on the cold path (when `_position >= _bufferEnd`).
- `ArrayBinaryOutput`: Rents a newly doubled array from `ArrayPool`, copies existing data across, and returns the old array to the pool. When serialization finishes, a final buffer slice is returned (often avoiding `ToArray()` allocations altogether if wrapped in a `BinarySerializationResult`).
- `BufferWriterBinaryOutput`: Requests a new chunk from the underlying `IBufferWriter<byte>` and uses `MemoryMarshal.TryGetArray` to reach its backing array. When that succeeds, it writes directly into the backing array without extra copying. If the `IBufferWriter<byte>` isn't backed by an array (e.g. native memory), it falls back to renting a temporary buffer from `ArrayPool`, writing to it, and copying the chunk to the `IBufferWriter<byte>` on `Flush()` or `Grow()`.

Because `TOutput` is constrained to a struct, the JIT specializes the generic code per output type and completely devirtualizes the `Grow()` calls.
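
A reduced sketch of this pattern is shown below. All type names (`IGrowableOutput`, `PooledArrayOutput`, `SketchContext`) are hypothetical stand-ins, and the real `IBinaryOutputBase` contract may look quite different. The sketch also compares against `_buffer.Length` instead of tracking a separate `_bufferEnd`.

```csharp
using System;
using System.Buffers;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for IBinaryOutputBase.
interface IGrowableOutput
{
    void Grow(ref byte[] buffer, int position, int needed);
}

// Array-backed strategy: rent a larger array, copy, return the old one.
struct PooledArrayOutput : IGrowableOutput
{
    public void Grow(ref byte[] buffer, int position, int needed)
    {
        int newSize = Math.Max(buffer.Length * 2, position + needed);
        byte[] bigger = ArrayPool<byte>.Shared.Rent(newSize);
        buffer.AsSpan(0, position).CopyTo(bigger);
        ArrayPool<byte>.Shared.Return(buffer);
        buffer = bigger;
    }
}

// The context owns the buffer state; TOutput is only touched when it is full.
sealed class SketchContext<TOutput> where TOutput : struct, IGrowableOutput
{
    private TOutput _output = default;
    private byte[] _buffer = ArrayPool<byte>.Shared.Rent(16);
    private int _position;

    public int Position => _position;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void WriteByte(byte value)
    {
        if (_position >= _buffer.Length)
            _output.Grow(ref _buffer, _position, 1); // cold path only
        _buffer[_position++] = value;
    }

    // Copies out the written bytes; a real implementation may return a slice instead.
    public byte[] ToArray() => _buffer.AsSpan(0, _position).ToArray();
}
```

Because `TOutput` is a struct type argument, `_output.Grow(...)` resolves to a direct call for each instantiation such as `SketchContext<PooledArrayOutput>`, with no interface dispatch.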

## Direct Object Write (IsDirectObjectWrite)

When `UseMetadata = false`, there is no need to track inline property name hashes. The source generator (SGen) can completely bypass the generic `WriteObject` loop.

When SGen code executes, it checks `IsDirectObjectWrite`. If true, it writes the `Object(64)` marker and immediately outputs properties sequentially without any framework-level `foreach` or reflection. This reduces the overhead of an object write to roughly the time it takes to write raw bytes to an array.

## Property State Buffering

During the `UseMetadata = true` cross-type deserialization phase, properties might arrive in a different order than expected. The deserializer maintains a temporary state buffer (rented from `ArrayPool`) to track which properties have been seen and which have been populated, allowing it to efficiently map wire hashes to local property setters.

## High-Performance Coding Guidelines (LLM / Contributor Rules)

The following patterns are strictly enforced within the serialization pipeline. AI agents and developers modifying this layer MUST adhere to these rules to maintain zero-allocation and high-throughput characteristics.

### 1. The "Write Plan" (O(1) Reference Tracking)
**Rule:** Never use `Dictionary.ContainsKey`, `HashSet.Contains`, or similar lookups during the serialization hot path.

**Implementation:**
Reference tracking and string interning use a two-phase approach:
1. **Scan Pass:** Walks the object graph and populates an array-based `WriteDuplicateEntry[]` "Write Plan" representing occurrences of duplicates.
2. **Serialize Pass:** Iterates sequentially. The framework advances an integer cursor (`WriteVisitIndex`) and compares it to a pre-calculated index via `TryConsumeWritePlanEntry()`. This provides `O(1)` duplicate detection without any hashing overhead during the actual byte-writing phase.
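
The cursor comparison in the serialize pass might look roughly like the sketch below. `PlanEntry` and `WritePlanCursor` are hypothetical names, and the fields of the real `WriteDuplicateEntry` are not shown in this document, so the shape here is an assumption: each entry records the visit index at which a duplicate occurs plus the reference id to emit.

```csharp
using System;

// Hypothetical entry shape; the real WriteDuplicateEntry may carry other fields.
readonly record struct PlanEntry(int VisitIndex, int ReferenceId);

sealed class WritePlanCursor
{
    private readonly PlanEntry[] _plan; // sorted by VisitIndex, built by the scan pass
    private int _planIndex;             // next unconsumed plan entry
    private int _visitIndex;            // counts visits in serialize order

    public WritePlanCursor(PlanEntry[] plan) => _plan = plan;

    // Call once per visited item, in the same order the scan pass used.
    // Only an integer compare is needed: no hashing, no dictionary lookup.
    public bool TryConsume(out int referenceId)
    {
        int current = _visitIndex++;
        if (_planIndex < _plan.Length && _plan[_planIndex].VisitIndex == current)
        {
            referenceId = _plan[_planIndex++].ReferenceId;
            return true;
        }
        referenceId = -1;
        return false;
    }
}
```

This only works if both passes traverse the graph in exactly the same order, which is why the plan is keyed by visit position rather than by object identity.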

### 2. Unsafe and SIMD Memory Access
**Rule:** Do not use `BitConverter` or manual bit-shifting for primitive unmanaged types or block copying.

**Implementation:**
- For structs and unmanaged types (e.g., `Guid`, `DateTime`, `decimal`), use `Unsafe.WriteUnaligned<T>` and `MemoryMarshal.AsBytes()` to write directly to the buffer.
- For contiguous block memory copies (e.g., `byte[]`), use `System.Span<T>.CopyTo` or SIMD hardware acceleration (e.g., `Vector<byte>` and `Vector.IsHardwareAccelerated` as seen in `WriteBytesSimd`).
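
As a small illustration of the first bullet, the sketch below writes any unmanaged value with `Unsafe.WriteUnaligned<T>`. `UnmanagedWriter` is a hypothetical helper, not part of the serializer, and it writes in the machine's native byte order, which may or may not match what the wire format requires.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper for illustration only.
static class UnmanagedWriter
{
    // Writes `value` at `position` and returns the new position.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int Write<T>(byte[] buffer, int position, T value) where T : unmanaged
    {
        int size = Unsafe.SizeOf<T>();
        // Slicing performs the bounds check once for the whole value.
        Span<byte> target = buffer.AsSpan(position, size);
        Unsafe.WriteUnaligned(ref target[0], value);
        return position + size;
    }
}
```

A call such as `UnmanagedWriter.Write(buffer, 0, Guid.NewGuid())` copies all 16 bytes in one store sequence instead of sixteen shifted byte writes.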

### 3. Hot-Path vs. Cold-Path Inlining (VarInt/VarUInt)
**Rule:** Ensure the JIT can inline the common cases by explicitly isolating the slow paths.

**Implementation:**
Methods called in tight loops are small and decorated with `[MethodImpl(MethodImplOptions.AggressiveInlining)]`. Heavy branching or loop logic (which prevents inlining) is extracted into separate methods:
- **Hot path:** Checks if the value fits in a single byte (e.g., `value < 0x80`). Aggressively inlined.
- **Cold path:** Multi-byte encoding logic is placed in a secondary method (e.g., `WriteVarUIntMultiByteUnsafe`) decorated with `[MethodImpl(MethodImplOptions.NoInlining)]`. This keeps the calling method's IL size extremely small and processor cache-friendly.
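
The split can be sketched as below. The encoding shown is a conventional 7-bit continuation scheme (LEB128 style), which is an assumption suggested by the `value < 0x80` check rather than something this document specifies. `VarUIntWriter` and its method names are hypothetical, and the caller is assumed to have reserved at least 5 bytes.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical helper; assumes a 7-bit continuation encoding.
static class VarUIntWriter
{
    // Hot path: tiny body so the JIT can inline it at every call site.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int Write(byte[] buffer, int position, uint value)
    {
        if (value < 0x80)
        {
            buffer[position] = (byte)value;
            return position + 1;
        }
        return WriteMultiByte(buffer, position, value);
    }

    // Cold path: the loop lives here and is kept out of the caller's body.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static int WriteMultiByte(byte[] buffer, int position, uint value)
    {
        while (value >= 0x80)
        {
            buffer[position++] = (byte)(value | 0x80); // low 7 bits + continuation flag
            value >>= 7;
        }
        buffer[position++] = (byte)value;
        return position;
    }
}
```

Both methods return the new write position, so a caller can chain writes without touching shared state.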
@@ -1,6 +1,6 @@
 # AcBinary Configuration

-Configuration options, presets, and option interactions for `AcBinarySerializerOptions`. For wire format see `BINARY_FORMAT.md`. For features (interning, ref tracking, property ordering) see `BINARY_FEATURES.md`.
+Configuration options, presets, and option interactions for `AcBinarySerializerOptions`. For wire format see `BINARY_FORMAT.md`. For features (interning, ref tracking, property ordering) see `BINARY_FEATURES.md`. For internal architecture and memory management see `BINARY_IMPLEMENTATION.md`.

 ## WireMode