Binaries Implementation Details
Low-level technical decisions, memory management, internal structure of AcBinarySerializer. For framework developers modifying the serialization pipeline.
Format spec: BINARY_FORMAT.md | Options/presets: BINARY_OPTIONS.md | Features: BINARY_FEATURES.md | Output writers: BINARY_WRITERS.md | SGen architecture: BINARY_SGEN.md
Benchmark results: ../../Test_Benchmark_Results/Benchmark/*.LLM
Zero-Allocation Buffer Management
Core philosophy: zero virtual dispatch, zero direct allocation on hot path.
Context-Owned Buffer State
BinarySerializationContext<TOutput> owns buffer state directly (no IBinaryWriter interface dispatch per byte):
internal byte[] _buffer = null!;
internal int _position;
internal int _bufferEnd;
All write methods on the sealed context class, aggressively inlined.
No Temporary String Buffers
String → UTF-8 directly into context _buffer, no intermediate byte[].
Speculative ASCII fast path:
- Assume ASCII (byte length = char length), ensure capacity
- Ascii.FromUtf16 directly into buffer
- Non-ASCII hit → rewind, calculate true UTF-8 length, Utf8.GetBytes
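The speculative path above can be sketched roughly as follows. This is a minimal illustration, not the actual context code: Utf8SketchWriter, its fields, and EnsureCapacity are hypothetical stand-ins, and the fallback uses Encoding.UTF8 as an assumed equivalent of the UTF-8 step named above.

```csharp
using System;
using System.Buffers;
using System.Text;

// Hypothetical stand-in for the context-owned buffer (not the real BinarySerializationContext).
public sealed class Utf8SketchWriter
{
    private byte[] _buffer = new byte[16];
    private int _position;

    public ReadOnlySpan<byte> Written => _buffer.AsSpan(0, _position);

    public void WriteUtf8(ReadOnlySpan<char> value)
    {
        // Speculate: for pure ASCII, byte length == char length.
        EnsureCapacity(value.Length);
        OperationStatus status = Ascii.FromUtf16(value, _buffer.AsSpan(_position), out int written);
        if (status == OperationStatus.Done)
        {
            _position += written;
            return;
        }

        // Non-ASCII hit: _position was never advanced, so the partial ASCII output
        // is simply overwritten ("rewind"). Compute the true length and re-encode.
        int byteCount = Encoding.UTF8.GetByteCount(value);
        EnsureCapacity(byteCount);
        _position += Encoding.UTF8.GetBytes(value, _buffer.AsSpan(_position));
    }

    private void EnsureCapacity(int additional)
    {
        if (_buffer.Length - _position >= additional) return;
        Array.Resize(ref _buffer, Math.Max(_buffer.Length * 2, _position + additional));
    }
}
```

In this sketch the string is encoded straight into the owned buffer in both branches; no intermediate byte[] is created.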
TOutput Strategy
Generic TOutput : struct, IBinaryOutputBase → JIT devirtualizes Grow(). Output invoked only on cold path (buffer exhaustion).
ArrayBinaryOutput, BufferWriterBinaryOutput, chunk sizing, dual buffer state: BINARY_WRITERS.md
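A minimal sketch of the struct-constrained output idea, under stated assumptions: IGrowableOutput, ArrayGrowOutput, and SketchContext are hypothetical names, and the Grow signature is invented for illustration (the real IBinaryOutputBase contract is described in BINARY_WRITERS.md).

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for IBinaryOutputBase; the real signature may differ.
public interface IGrowableOutput
{
    void Grow(ref byte[] buffer, int position, int needed);
}

// Struct implementation: the JIT specializes SketchContext<ArrayGrowOutput>,
// so the Grow call below needs no interface dispatch.
public struct ArrayGrowOutput : IGrowableOutput
{
    public void Grow(ref byte[] buffer, int position, int needed)
        => Array.Resize(ref buffer, Math.Max(buffer.Length * 2, position + needed));
}

public sealed class SketchContext<TOutput> where TOutput : struct, IGrowableOutput
{
    private TOutput _output = default;
    private byte[] _buffer = new byte[4];
    private int _position;

    public int Position => _position;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void WriteByte(byte value)
    {
        // Hot path: one bounds comparison, then a direct store into the owned buffer.
        if (_position == _buffer.Length) GrowCold(1);
        _buffer[_position++] = value;
    }

    // Cold path: the output is touched only when the buffer is exhausted.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private void GrowCold(int needed) => _output.Grow(ref _buffer, _position, needed);
}
```

Because TOutput is a value type, each closed generic gets its own native code, which is what lets the output call stay off the per-byte path.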
Root Serialization Dispatch
SGen Root Fast Path
When root type has GeneratedWriter (SGen-decorated), the serializer skips the full dispatch chain and calls WriteObject directly with a pre-resolved wrapper.
Full path (runtime):
null → GetType → IQueryable? → IsExpressionType? → Pool.Get → Initialize
→ ScanForDuplicates → WriteHeader
→ WriteValue → TryWritePrimitive(GetTypeCode+15-case switch)
→ WriteValueNonPrimitive(byte[]? IDictionary? IEnumerable? GetWrapper)
→ WriteObject → WriteObjectProperties → SGen
SGen fast path:
null → GetType → Pool.Get → Initialize
→ GetWrapper → GeneratedWriter != null?
YES → ScanForDuplicates → WriteHeader → WriteObject(wrapper) → return
NO → full path (IQueryable/Expression + WriteValue dispatch)
Why safe: GeneratedWriter exists → type is NOT IQueryable/Expression/primitive/byte[]/IDictionary/IEnumerable. SGen only generates for object model types. WriteObject handles FixObj slot assignment, UseMetadata, RefHandling correctly when called with pre-resolved wrapper.
Saved overhead: the is IQueryable interface check, the IsExpressionType IsAssignableFrom call, TryWritePrimitive's Type.GetTypeCode + 15-case switch, WriteValueNonPrimitive's 4 interface checks plus a duplicate GetWrapper call, and 2 method call levels.
Non-SGen penalty: +1 bool check (options.UseGeneratedCode) + 1 GetWrapper (cached) + 1 null check ≈ 10-15 ns.
Location: AcBinarySerializer.cs — Serialize<T>(T, options) (byte[] path), Serialize<T>(T, IBufferWriter, options) (BWO path), and Serialize<T>(T, PipeWriter, options) (AsyncPipeWriterOutput segment-streaming path).
Full Runtime Dispatch Chain
For non-SGen root types, the full chain executes:
| Step | Method | What it does |
|---|---|---|
| 1 | Serialize<T> | null check, GetType, pool get, IQueryable/Expression conversion |
| 2 | ScanForDuplicates | builds write plan (if caching enabled) |
| 3 | WriteHeader | version + flags + cacheCount |
| 4 | WriteValue | null check → TryWritePrimitive → WriteValueNonPrimitive |
| 5 | TryWritePrimitive | Type.GetTypeCode + 15-case switch (Int32/String/Bool/etc.) |
| 6 | WriteValueNonPrimitive | is byte[]? → is IDictionary? → is IEnumerable? → GetWrapper → WriteObject |
| 7 | WriteObject | FixObj/Object marker, ref handling, metadata → WriteObjectProperties |
| 8 | WriteObjectProperties | SGen WriteProperties or runtime property loop |
Direct Object Write (IsDirectObjectWrite)
When UseMetadata = false: no inline property name hashes needed. SGen bypasses generic WriteObject loop entirely — writes Object(64) marker then sequential properties. Overhead ≈ raw byte writes.
Property State Buffering
UseMetadata=true cross-type deserialization: properties may arrive in different order. Deserializer rents temp buffer from ArrayPool to map wire hashes → local property setters.
High-Performance Coding Rules
Strictly enforced. AI agents and developers MUST follow.
1. The "Write Plan" (O(1) Reference Tracking)
Rule: Never use Dictionary.ContainsKey/HashSet.Contains on hot path.
Two-phase:
- Scan: walks graph → array-based WriteDuplicateEntry[] plan
- Serialize: sequential cursor (WriteVisitIndex) vs pre-calculated index via TryConsumeWritePlanEntry() → O(1) duplicate detection, zero hashing
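The two phases can be illustrated with a simplified sketch. The PlanEntry shape, the flattened visitOrder list, and the string output are all assumptions made for illustration; the real WriteDuplicateEntry layout and graph walk are not shown in this document. Hashing is confined to the scan phase here, and the serialize phase only compares a cursor against the current visit index.

```csharp
using System.Collections.Generic;

// Hypothetical entry shape; the real WriteDuplicateEntry may carry different fields.
public readonly record struct PlanEntry(int VisitIndex, int RefId, bool IsFirst);

public static class WritePlanSketch
{
    // Scan phase (cold): runs once per graph; reference-keyed hashing is acceptable here.
    public static PlanEntry[] Scan(IReadOnlyList<object> visitOrder)
    {
        var counts = new Dictionary<object, int>(ReferenceEqualityComparer.Instance);
        foreach (object o in visitOrder)
            counts[o] = counts.TryGetValue(o, out int c) ? c + 1 : 1;

        var refIds = new Dictionary<object, int>(ReferenceEqualityComparer.Instance);
        var plan = new List<PlanEntry>();
        for (int i = 0; i < visitOrder.Count; i++)
        {
            object o = visitOrder[i];
            if (counts[o] < 2) continue;              // unique objects never enter the plan
            bool isFirst = !refIds.TryGetValue(o, out int id);
            if (isFirst) { id = refIds.Count; refIds.Add(o, id); }
            plan.Add(new PlanEntry(i, id, isFirst));  // entries are already sorted by visit index
        }
        return plan.ToArray();
    }

    // Serialize phase (hot): a sequential cursor, one integer compare per visit, no hashing.
    public static List<string> Serialize(IReadOnlyList<object> visitOrder, PlanEntry[] plan)
    {
        var output = new List<string>();
        int cursor = 0;
        for (int visitIndex = 0; visitIndex < visitOrder.Count; visitIndex++)
        {
            if (cursor < plan.Length && plan[cursor].VisitIndex == visitIndex)
            {
                PlanEntry e = plan[cursor++];
                output.Add(e.IsFirst ? $"def#{e.RefId}" : $"ref#{e.RefId}");
            }
            else
            {
                output.Add("inline");
            }
        }
        return output;
    }
}
```

The sketch only works if both phases visit objects in exactly the same order, which is presumably why the plan is built by a dedicated scan over the same graph.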
2. Unsafe and SIMD Memory Access
Rule: No BitConverter or manual bit-shifting for unmanaged types.
- Structs/unmanaged (Guid, DateTime, decimal): Unsafe.WriteUnaligned<T>, MemoryMarshal.AsBytes()
- Block copies (byte[]): Span<T>.CopyTo or SIMD (Vector<byte>, Vector.IsHardwareAccelerated → WriteBytesSimd)
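As a rough illustration of this rule, the following sketch writes an unmanaged value and a byte block without BitConverter. UnmanagedWriteSketch and its method names are hypothetical; only Unsafe.WriteUnaligned, Unsafe.SizeOf, and Span<T>.CopyTo are real APIs. The SIMD variant (WriteBytesSimd) is not reproduced here.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class UnmanagedWriteSketch
{
    // Writes any unmanaged value (Guid, DateTime, decimal, ...) in its in-memory layout.
    public static int WriteUnmanaged<T>(byte[] buffer, int position, T value) where T : unmanaged
    {
        int size = Unsafe.SizeOf<T>();
        if (position < 0 || buffer.Length - position < size)
            throw new ArgumentOutOfRangeException(nameof(position));

        // Single unaligned store; no per-byte shifting, no temporary array.
        Unsafe.WriteUnaligned(ref buffer[position], value);
        return position + size;
    }

    // Block copy for byte payloads; Span.CopyTo is bounds-checked and vectorized by the runtime.
    public static int WriteBlock(byte[] buffer, int position, ReadOnlySpan<byte> source)
    {
        source.CopyTo(buffer.AsSpan(position));
        return position + source.Length;
    }
}
```

Note that this writes the platform's native byte order; whether the wire format requires a fixed endianness is defined in BINARY_FORMAT.md, not here.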
3. Hot/Cold Path Inlining
Rule: Keep hot-path IL small for JIT inlining.
- Hot: single-byte check (e.g. value < 0x80), AggressiveInlining
- Cold: multi-byte logic in separate NoInlining method (e.g. WriteVarUIntMultiByteUnsafe)
- Keeps caller IL small, cache-friendly
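The hot/cold split might look like the sketch below. It assumes a LEB128-style 7-bit continuation encoding purely for illustration (the actual VarUInt layout is defined in BINARY_FORMAT.md), and VarUIntSketch with its method names is hypothetical.

```csharp
using System.Runtime.CompilerServices;

public static class VarUIntSketch
{
    // Hot path: one compare and one store, small enough for the JIT to inline at call sites.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int WriteVarUInt(byte[] buffer, int position, uint value)
    {
        if (value < 0x80)
        {
            buffer[position] = (byte)value;
            return position + 1;
        }
        return WriteVarUIntMultiByte(buffer, position, value);
    }

    // Cold path: the loop lives in its own frame so it does not bloat the caller's IL.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static int WriteVarUIntMultiByte(byte[] buffer, int position, uint value)
    {
        while (value >= 0x80)
        {
            buffer[position++] = (byte)(value | 0x80); // low 7 bits plus continuation flag
            value >>= 7;
        }
        buffer[position++] = (byte)value;
        return position;
    }
}
```

For example, the value 300 takes the cold path and is written as the two bytes 0xAC 0x02 under this assumed encoding, while any value below 128 never leaves the inlined method.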
Inlining barriers — [MethodImpl(AggressiveInlining)] is silently ignored when:
- try/catch/finally/using: on .NET 9 (project minimum, see copilot-instructions.md rule 16) any EH region is a hard JIT inlining barrier (inline.cpp in CoreCLR). using desugars to try/finally with the same effect. Move resource cleanup (Pool.Return, ArrayPool.Return, Dispose) into a separate cold method, or keep it outside the hot caller. The Pool.Get → try/finally → Pool.Return pattern (Rule #5) sits at the entry of Serialize<T>, not on a per-property hot path. Hard rule regardless of runtime: .NET 10 partially lifts the restriction for same-module try-finally (dotnet/runtime#112998, merged 2025-03-20), but catch, cross-module try-finally, and P/Invoke-stub cases stay blocked. Code must run inline-friendly on .NET 9 today AND .NET 10+ tomorrow, so staying EH-free is the portable choice. Audit: BINARY_TODO.md#accore-bin-t-t5j8.
- stackalloc with non-constant or large size: small constant stackalloc (≤ ~1KB) is inlinable in .NET 6+, but adding any other barrier (try/finally, complex control flow) makes the method non-inlinable. When mixing stackalloc with try/finally (e.g. ArrayPool fallback + scratch buffer), expect a separate call frame and avoid inline-only assumptions in the caller.
- Method size / IL token count: the JIT has IL-size and basic-block thresholds even with AggressiveInlining. For large generated methods (SGen WriteProperties for property-heavy types) the attribute is a hint, not a guarantee; see BINARY_TODO.md#accore-bin-t-t5j8 for AggressiveOptimization as a complementary tool.
When adding a new helper to the hot path: check for any of the above before placing [MethodImpl(AggressiveInlining)]. The attribute silently lies if the body has an EH region.
4. SGen Root Fast Path
Rule: Root-level SGen types MUST skip WriteValue/TryWritePrimitive/WriteValueNonPrimitive dispatch chain.
- Serialize<T> checks options.UseGeneratedCode + wrapper.GeneratedWriter != null before IQueryable/Expression check
- Calls WriteObject(value, wrapper, context, 0) directly
- Wire format identical; only dispatch path differs
- Applies to both byte[] and IBufferWriter entry points
5. Pool Management
Rule: Context pool is generic over TOutput. Pool.Get → try/finally → Pool.Return.
- BinarySerializationContextPool<ArrayBinaryOutput>: byte[] path
- BinarySerializationContextPool<BufferWriterBinaryOutput>: IBufferWriter path
- BinarySerializationContextPool<AsyncPipeWriterOutput>: PipeWriter segment-streaming path
- options.UseAsync → ReturnAsync (ThreadPool enqueue) to avoid lock contention
- Pooled contexts retain wrapper caches, buffer instances across serializations
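A minimal sketch of the Pool.Get → try/finally → Pool.Return shape, assuming a simplified pool: SketchCtx, SketchPool, and PoolUsageSketch are hypothetical, the pool is generic over the context rather than over TOutput, and ConcurrentBag stands in for whatever storage the real pool uses. The point illustrated is that the EH region sits in the entry method while the write helper stays EH-free.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical pooled context; retains its buffer instance across uses.
public sealed class SketchCtx
{
    public byte[] Buffer = new byte[256];
    public int Position;
    public void Reset() => Position = 0;
}

// Hypothetical pool; the real BinarySerializationContextPool is generic over TOutput.
public static class SketchPool<TContext> where TContext : class, new()
{
    private static readonly ConcurrentBag<TContext> s_items = new();
    public static TContext Get() => s_items.TryTake(out TContext? c) ? c : new TContext();
    public static void Return(TContext context) => s_items.Add(context);
}

public static class PoolUsageSketch
{
    // Entry point: the only method with a try/finally region.
    public static byte[] Serialize(ReadOnlySpan<byte> payload)
    {
        SketchCtx ctx = SketchPool<SketchCtx>.Get();
        try
        {
            WriteHot(ctx, payload);
            return ctx.Buffer.AsSpan(0, ctx.Position).ToArray();
        }
        finally
        {
            ctx.Reset();                        // position cleared, buffer instance kept
            SketchPool<SketchCtx>.Return(ctx);
        }
    }

    // EH-free helper, so it remains an inlining candidate.
    // The sketch assumes the payload fits in the fixed 256-byte buffer.
    private static void WriteHot(SketchCtx ctx, ReadOnlySpan<byte> payload)
    {
        payload.CopyTo(ctx.Buffer.AsSpan(ctx.Position));
        ctx.Position += payload.Length;
    }
}
```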
6. Avoid Redundant Wrapper/GetType Lookups
Rule: When a bridge method calls GetWrapper(value.GetType(), slot) for metadata, cache the result in a local. Never call GetWrapper + value.GetType() twice in the same method.
- WriteObjectFullMarkerIId / WriteObjectFullMarkerAll: wrapper.Metadata cached at entry, reused in ref-handling and non-ref branches
- GetWrapper(type, slot) is O(1) array index after first call, but value.GetType() is still an extra call each time (Object.GetType is non-virtual, yet not free); avoid repeating it
Metadata Lifecycle & Cold-Start (planned: ACCORE-BIN-T-W9F1 / ACCORE-BIN-T-T5J8)
Today BinarySerializeTypeMetadata / BinaryDeserializeTypeMetadata are built lazily in GetWrapperSlow via GlobalMetadataCache.GetOrAdd(type, MetadataFactory). The factory runs reflection property enumeration, attribute scans, Expression.Compile per property — dominant first-call cost for SGen types (see BINARY_ISSUES.md#accore-bin-i-n6q3).
Planned evolution (BINARY_TODO.md#accore-bin-t-w9f1):
- GeneratedMetadataRegistry: ModuleInit registers pre-built metadata per [AcBinarySerializable] type alongside the existing GeneratedWriterRegistry / GeneratedReaderRegistry entries. Generator passes references to its static s_typeNameHash / s_propertyHashes fields: single source of truth, no duplicate computation, no hot-path indirection (generator keeps using its own static fields).
- Metadata ctor split: a second ctor on BinarySerializeTypeMetadata / BinaryDeserializeTypeMetadata accepts pre-computed values (hashes, MinWriteSize, ComplexPropertyCount, IsIId, IdAccessorType, flags). No reflection in this ctor.
- Lazy RuntimeInit: TypeMetadataBase gets volatile bool _runtimeInitialized + internal void RuntimeInit(). GetWrapperSlow calls it only when wrapper.GeneratedWriter == null || !Options.UseGeneratedCode, i.e. for runtime-only types and the UseGeneratedCode=false edge case. SGen types skip it. Thread-safe by idempotence + volatile (no lock).
- Hybrid safety: SGen root path (WriteObjectProperties → generatedWriter.WriteProperties) never touches the SGen type's own property accessors; non-SGen child types come through the MetadataFactory path as today.
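The planned lazy RuntimeInit could be sketched along these lines. Everything here is an assumption about a design that is not yet implemented: LazyMetadataSketch, its accessor-name array, and the EnsureReady helper are hypothetical and only illustrate the volatile-flag, idempotent-init pattern described in the list above.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for TypeMetadataBase with the planned lazy runtime part.
public sealed class LazyMetadataSketch
{
    private volatile bool _runtimeInitialized;
    private string[] _accessorNames = Array.Empty<string>();
    private int _buildCount;

    public int BuildCount => _buildCount;
    public string[] AccessorNames => _accessorNames;

    public void RuntimeInit()
    {
        if (_runtimeInitialized) return;

        // Idempotent: two racing threads may both build, but they produce
        // equivalent state, so no lock is needed.
        _accessorNames = BuildAccessorNames();

        // The volatile write publishes _accessorNames before the flag becomes visible.
        _runtimeInitialized = true;
    }

    // Placeholder for the reflection / Expression.Compile work done today in MetadataFactory.
    private string[] BuildAccessorNames()
    {
        Interlocked.Increment(ref _buildCount);
        return new[] { "Id", "Name" };
    }
}

public static class LazyMetadataGate
{
    // Mirrors the described condition: only runtime-only types, or SGen types with
    // UseGeneratedCode=false, pay for RuntimeInit.
    public static void EnsureReady(LazyMetadataSketch metadata, bool hasGeneratedWriter, bool useGeneratedCode)
    {
        if (!hasGeneratedWriter || !useGeneratedCode) metadata.RuntimeInit();
    }
}
```

Under these assumptions, an SGen type with UseGeneratedCode=true never runs the expensive build step, while the fallback path triggers it on demand exactly as the list describes.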
Follow-up (BINARY_TODO.md#accore-bin-t-t5j8): after ACCORE-BIN-T-W9F1 removes reflection + Expression.Compile from the cold path, JIT of generated methods becomes dominant — mitigated via [AggressiveOptimization], background RuntimeHelpers.PrepareMethod, and/or R2R (consumer publish config).
Wire format unchanged; UseGeneratedCode=false fallback continues to work identically (triggers RuntimeInit for SGen types on demand).