Binary Implementation Details

Low-level technical decisions, memory management, internal structure of AcBinarySerializer. For framework developers modifying the serialization pipeline.

Format spec: BINARY_FORMAT.md | Options/presets: BINARY_OPTIONS.md | Features: BINARY_FEATURES.md | Output writers: BINARY_WRITERS.md | SGen architecture: BINARY_SGEN.md | Benchmark results: ../../Test_Benchmark_Results/Benchmark/*.LLM

Zero-Allocation Buffer Management

Core philosophy: zero virtual dispatch, zero direct allocation on hot path.

Context-Owned Buffer State

BinarySerializationContext<TOutput> owns buffer state directly (no IBinaryWriter interface dispatch per byte):

internal byte[] _buffer = null!;
internal int _position;
internal int _bufferEnd;

All write methods on the sealed context class, aggressively inlined.

No Temporary String Buffers

String → UTF-8 directly into context _buffer, no intermediate byte[].

Speculative ASCII fast path:

  1. Assume ASCII (byte length = char length), ensure capacity
  2. Ascii.FromUtf16 directly into buffer
  3. Non-ASCII hit → rewind, calculate true UTF-8 length, Utf8.GetBytes
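The three steps above can be sketched as follows. This is a minimal illustration, not the serializer's actual code: `Utf8WriterSketch` and `EnsureCapacity` are hypothetical names, the length prefix the real format writes is omitted, and `Encoding.UTF8.GetByteCount`/`GetBytes` are assumed stand-ins for the slow-path calls.

```csharp
using System;
using System.Buffers;
using System.Text;

// Hypothetical stand-in for the context-owned buffer state.
public sealed class Utf8WriterSketch
{
    private byte[] _buffer = new byte[16];
    private int _position;

    public ReadOnlySpan<byte> Written => _buffer.AsSpan(0, _position);

    public void WriteUtf8(ReadOnlySpan<char> value)
    {
        // 1. Speculate that the string is ASCII: byte length == char length.
        EnsureCapacity(value.Length);

        // 2. Narrow UTF-16 to ASCII directly into the destination buffer.
        OperationStatus status =
            Ascii.FromUtf16(value, _buffer.AsSpan(_position), out int written);
        if (status == OperationStatus.Done)
        {
            _position += written;
            return;
        }

        // 3. Non-ASCII hit. _position was never advanced, so the "rewind" is
        //    implicit: compute the true UTF-8 length and re-encode from the start.
        int byteCount = Encoding.UTF8.GetByteCount(value);
        EnsureCapacity(byteCount);
        _position += Encoding.UTF8.GetBytes(value, _buffer.AsSpan(_position));
    }

    private void EnsureCapacity(int needed)
    {
        if (_buffer.Length - _position >= needed) return;
        Array.Resize(ref _buffer, Math.Max(_buffer.Length * 2, _position + needed));
    }
}
```

The speculation costs one wasted partial copy on non-ASCII input, which is cheap as long as most strings are ASCII.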

TOutput Strategy

Generic TOutput : struct, IBinaryOutputBase → JIT devirtualizes Grow(). Output invoked only on cold path (buffer exhaustion).
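A rough sketch of why the struct constraint matters: for a value-type `TOutput` the JIT compiles a dedicated instantiation, so the `Grow` call below binds to the concrete struct method with no interface dispatch. All type names and the `Grow` signature here are assumptions for illustration; the real `IBinaryOutputBase` contract is described in BINARY_WRITERS.md.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical shape; the real IBinaryOutputBase may differ.
public interface IOutputSketch
{
    byte[] Grow(byte[] current, int minimumSize);
}

public struct ArrayOutputSketch : IOutputSketch
{
    public byte[] Grow(byte[] current, int minimumSize)
    {
        var next = new byte[Math.Max(current.Length * 2, minimumSize)];
        current.CopyTo(next, 0);
        return next;
    }
}

public sealed class OutputContextSketch<TOutput> where TOutput : struct, IOutputSketch
{
    private TOutput _output = default;
    private byte[] _buffer = new byte[4];
    private int _position;

    public int Position => _position;
    public int Capacity => _buffer.Length;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void WriteByte(byte value)
    {
        // Hot path: a bounds comparison and a store. The output is not touched.
        if (_position == _buffer.Length) GrowCold(1);
        _buffer[_position++] = value;
    }

    // Cold path: only reached on buffer exhaustion.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private void GrowCold(int needed) =>
        _buffer = _output.Grow(_buffer, _position + needed);
}
```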

ArrayBinaryOutput, BufferWriterBinaryOutput, chunk sizing, dual buffer state: BINARY_WRITERS.md

Root Serialization Dispatch

SGen Root Fast Path

When root type has GeneratedWriter (SGen-decorated), the serializer skips the full dispatch chain and calls WriteObject directly with a pre-resolved wrapper.

Full path (runtime):
  null → GetType → IQueryable? → IsExpressionType? → Pool.Get → Initialize
  → ScanForDuplicates → WriteHeader
  → WriteValue → TryWritePrimitive(GetTypeCode+15-case switch)
  → WriteValueNonPrimitive(byte[]? IDictionary? IEnumerable? GetWrapper)
  → WriteObject → WriteObjectProperties → SGen

SGen fast path:
  null → GetType → Pool.Get → Initialize
  → GetWrapper → GeneratedWriter != null?
    YES → ScanForDuplicates → WriteHeader → WriteObject(wrapper) → return
    NO  → full path (IQueryable/Expression + WriteValue dispatch)

Why safe: GeneratedWriter exists → type is NOT IQueryable/Expression/primitive/byte[]/IDictionary/IEnumerable. SGen only generates for object model types. WriteObject handles FixObj slot assignment, UseMetadata, RefHandling correctly when called with pre-resolved wrapper.

Saved overhead: is IQueryable interface check, IsExpressionType IsAssignableFrom, TryWritePrimitive Type.GetTypeCode + 15-case switch, WriteValueNonPrimitive 4 interface checks + duplicate GetWrapper, 2 eliminated method call levels.

Non-SGen penalty: +1 bool check (options.UseGeneratedCode) + 1 GetWrapper (cached) + 1 null check ≈ 10-15 ns.

Location: AcBinarySerializer.cs → Serialize<T>(T, options) (byte[] path), Serialize<T>(T, IBufferWriter, options) (BWO path), and Serialize<T>(T, PipeWriter, options) (AsyncPipeWriterOutput segment-streaming path).

Full Runtime Dispatch Chain

For non-SGen root types, the full chain executes:

| Step | Method | What it does |
|------|--------|--------------|
| 1 | Serialize<T> | null check, GetType, pool get, IQueryable/Expression conversion |
| 2 | ScanForDuplicates | builds write plan (if caching enabled) |
| 3 | WriteHeader | version + flags + cacheCount |
| 4 | WriteValue | null check → TryWritePrimitive → WriteValueNonPrimitive |
| 5 | TryWritePrimitive | Type.GetTypeCode + 15-case switch (Int32/String/Bool/etc.) |
| 6 | WriteValueNonPrimitive | is byte[]? → is IDictionary? → is IEnumerable? → GetWrapper → WriteObject |
| 7 | WriteObject | FixObj/Object marker, ref handling, metadata → WriteObjectProperties |
| 8 | WriteObjectProperties | SGen WriteProperties or runtime property loop |

Direct Object Write (IsDirectObjectWrite)

When UseMetadata = false: no inline property name hashes needed. SGen bypasses generic WriteObject loop entirely — writes Object(64) marker then sequential properties. Overhead ≈ raw byte writes.

Property State Buffering

UseMetadata=true cross-type deserialization: properties may arrive in different order. Deserializer rents temp buffer from ArrayPool to map wire hashes → local property setters.

High-Performance Coding Rules

Strictly enforced. AI agents and developers MUST follow.

1. The "Write Plan" (O(1) Reference Tracking)

Rule: Never use Dictionary.ContainsKey/HashSet.Contains on hot path.

Two-phase:

  1. Scan: walks graph → array-based WriteDuplicateEntry[] plan
  2. Serialize: sequential cursor (WriteVisitIndex) vs pre-calculated index via TryConsumeWritePlanEntry() → O(1) duplicate detection, zero hashing
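The two phases can be illustrated with a simplified sketch. It assumes the object graph is already flattened into its visit order, and it uses reference-keyed dictionaries in the scan phase purely for brevity; how the real scan detects duplicates is not specified here. `PlanEntrySketch`, `WritePlanSketch`, and `TryConsume` are hypothetical names standing in for `WriteDuplicateEntry`, the context state, and `TryConsumeWritePlanEntry()`.

```csharp
using System;
using System.Collections.Generic;

// One plan entry: "at this visit index, a multiply-referenced object occurs".
public readonly record struct PlanEntrySketch(int VisitIndex, int RefId, bool IsFirst);

public sealed class WritePlanSketch
{
    private readonly PlanEntrySketch[] _plan;
    private int _cursor;      // next unconsumed plan entry
    private int _visitIndex;  // objects visited so far in the serialize phase

    private WritePlanSketch(PlanEntrySketch[] plan) => _plan = plan;

    // Phase 1 (scan): record the visit indices where duplicates occur.
    public static WritePlanSketch Scan(IReadOnlyList<object> visitOrder)
    {
        var firstSeen = new Dictionary<object, int>(ReferenceEqualityComparer.Instance);
        var refIds = new Dictionary<object, int>(ReferenceEqualityComparer.Instance);
        var entries = new List<PlanEntrySketch>();

        for (int i = 0; i < visitOrder.Count; i++)
        {
            object o = visitOrder[i];
            if (!firstSeen.TryGetValue(o, out int first)) { firstSeen[o] = i; continue; }
            if (!refIds.TryGetValue(o, out int id))
            {
                id = refIds.Count;
                refIds[o] = id;
                entries.Add(new PlanEntrySketch(first, id, IsFirst: true));
            }
            entries.Add(new PlanEntrySketch(i, id, IsFirst: false));
        }

        entries.Sort((a, b) => a.VisitIndex.CompareTo(b.VisitIndex));
        return new WritePlanSketch(entries.ToArray());
    }

    // Phase 2 (serialize): one integer compare per visited object, no hashing.
    public bool TryConsume(out PlanEntrySketch entry)
    {
        int visit = _visitIndex++;
        if (_cursor < _plan.Length && _plan[_cursor].VisitIndex == visit)
        {
            entry = _plan[_cursor++];
            return true;
        }
        entry = default;
        return false;
    }
}
```

The sketch relies on both phases walking the graph in exactly the same order; if they diverge, the cursor and the visit counter fall out of step.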

2. Unsafe and SIMD Memory Access

Rule: No BitConverter or manual bit-shifting for unmanaged types.

  • Structs/unmanaged (Guid, DateTime, decimal): Unsafe.WriteUnaligned<T>, MemoryMarshal.AsBytes()
  • Block copies (byte[]): Span<T>.CopyTo or SIMD (Vector<byte>, Vector.IsHardwareAccelerated → WriteBytesSimd)
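As an illustration of the rule, the sketch below writes any unmanaged value with a single `Unsafe.WriteUnaligned` and copies byte blocks with `Span<T>.CopyTo`. `UnmanagedWriteSketch` is a hypothetical helper, not part of the serializer, and it writes the in-memory layout in native byte order; whether the wire format applies any normalization is defined in BINARY_FORMAT.md.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class UnmanagedWriteSketch
{
    // Writes the raw bytes of an unmanaged value; returns the new position.
    public static int Write<T>(byte[] buffer, int position, T value) where T : unmanaged
    {
        int size = Unsafe.SizeOf<T>();
        if ((uint)position > (uint)buffer.Length || buffer.Length - position < size)
            throw new ArgumentOutOfRangeException(nameof(position));

        // One unaligned store instead of per-byte shifting or BitConverter.
        Unsafe.WriteUnaligned(ref buffer[position], value);
        return position + size;
    }

    // Block copy; Span<T>.CopyTo is already vectorized by the runtime.
    public static int WriteBytes(byte[] buffer, int position, ReadOnlySpan<byte> source)
    {
        source.CopyTo(buffer.AsSpan(position));
        return position + source.Length;
    }
}
```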

3. Hot/Cold Path Inlining

Rule: Keep hot-path IL small for JIT inlining.

  • Hot: single-byte check (e.g. value < 0x80), AggressiveInlining
  • Cold: multi-byte logic in separate NoInlining method (e.g. WriteVarUIntMultiByteUnsafe)
  • Keeps caller IL small, cache-friendly
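A minimal sketch of the hot/cold split, assuming a 7-bit-group encoding with a continuation bit (LEB128 style); the actual VarUInt layout is defined in BINARY_FORMAT.md and may differ. `VarUIntWriterSketch` is a hypothetical type with a fixed buffer and no growth logic.

```csharp
using System;
using System.Runtime.CompilerServices;

public sealed class VarUIntWriterSketch
{
    private readonly byte[] _buffer = new byte[64];
    private int _position;

    public ReadOnlySpan<byte> Written => _buffer.AsSpan(0, _position);

    // Hot: a compare, a store, and a rarely taken call. Small enough to inline.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void WriteVarUInt(uint value)
    {
        if (value < 0x80)
        {
            _buffer[_position++] = (byte)value;
            return;
        }
        WriteVarUIntMultiByte(value);
    }

    // Cold: the loop stays out of the caller's IL.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private void WriteVarUIntMultiByte(uint value)
    {
        while (value >= 0x80)
        {
            _buffer[_position++] = (byte)((value & 0x7F) | 0x80); // low 7 bits + continuation
            value >>= 7;
        }
        _buffer[_position++] = (byte)value;
    }
}
```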

Inlining barriers — [MethodImpl(AggressiveInlining)] is silently ignored when:

  • try / catch / finally / using — on .NET 9 (project minimum, see copilot-instructions.md rule 16) any EH region is a hard JIT inlining barrier (inline.cpp in CoreCLR). using desugars to try/finally with the same effect. Move resource cleanup (Pool.Return, ArrayPool.Return, Dispose) into a separate cold method, or keep it outside the hot caller. The Pool.Get → try/finally → Pool.Return pattern (Rule #5) sits at the entry of Serialize<T>, not on a per-property hot path. — Hard rule regardless of runtime. .NET 10 partially lifts the restriction for same-module try-finally (dotnet/runtime#112998, merged 2025-03-20), but catch, cross-module try-finally, and P/Invoke-stub cases stay blocked. Code must run inline-friendly on .NET 9 today AND .NET 10+ tomorrow — staying EH-free is the portable choice. Audit: BINARY_TODO.md#accore-bin-t-t5j8.
  • stackalloc with non-constant or large size — small constant stackalloc (≤ ~1KB) is inlinable in .NET 6+, but adding any other barrier (try/finally, complex control flow) makes the method non-inlinable. When mixing stackalloc with try/finally (e.g. ArrayPool fallback + scratch buffer), expect a separate call frame — avoid inline-only assumptions in the caller.
  • Method size / IL token count — the JIT has IL-size and basic-block thresholds even with AggressiveInlining. For large generated methods (SGen WriteProperties for property-heavy types) the attribute is a hint, not a guarantee; see BINARY_TODO.md#accore-bin-t-t5j8 for AggressiveOptimization as a complementary tool.

When adding a new helper to the hot path: check for any of the above before placing [MethodImpl(AggressiveInlining)]. The attribute silently lies if the body has an EH region.

4. SGen Root Fast Path

Rule: Root-level SGen types MUST skip WriteValue/TryWritePrimitive/WriteValueNonPrimitive dispatch chain.

  • Serialize<T> checks options.UseGeneratedCode + wrapper.GeneratedWriter != null before IQueryable/Expression check
  • Calls WriteObject(value, wrapper, context, 0) directly
  • Wire format identical — only dispatch path differs
  • Applies to the byte[], IBufferWriter, and PipeWriter entry points

5. Pool Management

Rule: Context pool is generic over TOutput. Pool.Get → try/finally → Pool.Return.

  • BinarySerializationContextPool<ArrayBinaryOutput> — byte[] path
  • BinarySerializationContextPool<BufferWriterBinaryOutput> — IBufferWriter path
  • BinarySerializationContextPool<AsyncPipeWriterOutput> — PipeWriter segment-streaming path
  • options.UseAsyncReturnAsync (ThreadPool enqueue) to avoid lock contention
  • Pooled contexts retain wrapper caches, buffer instances across serializations
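The Get → try/finally → Return shape looks roughly like the sketch below. Everything here is hypothetical (`ContextPoolSketch`, `PooledContextSketch`, the `ConcurrentQueue` backing store, the synchronous return); it only shows that the EH region lives once at the entry point while the pooled context keeps its buffer between calls.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Concurrent;

public sealed class PooledContextSketch
{
    public static int Created;           // demo counter, not thread-safe
    private readonly byte[] _buffer = new byte[16]; // retained across serializations
    private int _position;

    public PooledContextSketch() => Created++;

    public void WriteInt32(int value)
    {
        BinaryPrimitives.WriteInt32LittleEndian(_buffer.AsSpan(_position), value);
        _position += sizeof(int);
    }

    public byte[] ToArray() => _buffer.AsSpan(0, _position).ToArray();
    public void Reset() => _position = 0;
}

public static class ContextPoolSketch<TContext> where TContext : class, new()
{
    private static readonly ConcurrentQueue<TContext> s_items = new();

    public static TContext Get() => s_items.TryDequeue(out var item) ? item : new TContext();
    public static void Return(TContext item) => s_items.Enqueue(item);
}

public static class SerializerEntrySketch
{
    public static byte[] Serialize(int value)
    {
        var context = ContextPoolSketch<PooledContextSketch>.Get();
        try
        {
            // Per-value writes happen inside EH-free methods on the context.
            context.WriteInt32(value);
            return context.ToArray();
        }
        finally
        {
            context.Reset();
            ContextPoolSketch<PooledContextSketch>.Return(context);
        }
    }
}
```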

6. Avoid Redundant Wrapper/GetType Lookups

Rule: When a bridge method calls GetWrapper(value.GetType(), slot) for metadata, cache the result in a local. Never call GetWrapper + value.GetType() twice in the same method.

  • WriteObjectFullMarkerIId / WriteObjectFullMarkerAll: wrapper.Metadata cached at entry, reused in ref-handling and non-ref branches
  • GetWrapper(type, slot) is O(1) array index after first call, but value.GetType() is a virtual call — avoid repeating it

Metadata Lifecycle & Cold-Start (planned: ACCORE-BIN-T-W9F1 / ACCORE-BIN-T-T5J8)

Today BinarySerializeTypeMetadata / BinaryDeserializeTypeMetadata are built lazily in GetWrapperSlow via GlobalMetadataCache.GetOrAdd(type, MetadataFactory). The factory runs reflection property enumeration, attribute scans, Expression.Compile per property — dominant first-call cost for SGen types (see BINARY_ISSUES.md#accore-bin-i-n6q3).

Planned evolution (BINARY_TODO.md#accore-bin-t-w9f1):

  • GeneratedMetadataRegistry: ModuleInit registers pre-built metadata per [AcBinarySerializable] type alongside the existing GeneratedWriterRegistry / GeneratedReaderRegistry entries. Generator passes references to its static s_typeNameHash / s_propertyHashes fields — single source of truth, no duplicate computation, no hot-path indirection (generator keeps using its own static fields).
  • Metadata ctor split: a second ctor on BinarySerializeTypeMetadata / BinaryDeserializeTypeMetadata accepts pre-computed values (hashes, MinWriteSize, ComplexPropertyCount, IsIId, IdAccessorType, flags). No reflection in this ctor.
  • Lazy RuntimeInit: TypeMetadataBase gets volatile bool _runtimeInitialized + internal void RuntimeInit(). GetWrapperSlow calls it only when wrapper.GeneratedWriter == null || !Options.UseGeneratedCode — i.e. for runtime-only types and the UseGeneratedCode=false edge case. SGen types skip it. Thread-safe by idempotence + volatile (no lock).
  • Hybrid safety: SGen root path (WriteObjectProperties → generatedWriter.WriteProperties) never touches the SGen type's own property accessors; non-SGen child types come through the MetadataFactory path as today.
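The lock-free lazy init described in the Lazy RuntimeInit bullet could look something like this. Since the feature is planned rather than implemented, the sketch is speculative: `LazyMetadataSketch` is a hypothetical class, and reflection over property names stands in for the real accessor construction.

```csharp
using System;
using System.Reflection;

public sealed class LazyMetadataSketch
{
    private readonly Type _type;
    private volatile bool _runtimeInitialized;
    private string[]? _propertyNames;

    public LazyMetadataSketch(Type type) => _type = type;

    public bool IsRuntimeInitialized => _runtimeInitialized;
    public string[]? PropertyNames => _propertyNames;

    // Safe without a lock because the work is idempotent: racing threads
    // compute equivalent arrays, and the last assignment wins harmlessly.
    public void RuntimeInit()
    {
        if (_runtimeInitialized) return;

        PropertyInfo[] props = _type.GetProperties();
        var names = new string[props.Length];
        for (int i = 0; i < props.Length; i++) names[i] = props[i].Name;

        _propertyNames = names;
        // Volatile write last: a reader that sees 'true' also sees the array.
        _runtimeInitialized = true;
    }
}
```

A caller on the SGen path would simply never invoke `RuntimeInit()`, so the reflection cost is paid only by runtime-only types or when generated code is disabled.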

Follow-up (BINARY_TODO.md#accore-bin-t-t5j8): after ACCORE-BIN-T-W9F1 removes reflection + Expression.Compile from the cold path, JIT of generated methods becomes dominant — mitigated via [AggressiveOptimization], background RuntimeHelpers.PrepareMethod, and/or R2R (consumer publish config).

Wire format unchanged; UseGeneratedCode=false fallback continues to work identically (triggers RuntimeInit for SGen types on demand).