# AcBinary Wire Format

Complete wire format specification for the AcBinary serializer. Source of truth: [`AyCode.Core/Serializers/Binaries/BinaryTypeCode.cs`](../AyCode.Core/Serializers/Binaries/BinaryTypeCode.cs).

## Stream Layout

```
[version : 1 byte] [flags : 1 byte] [cacheCount : VarUInt?] [payload...]
```

- **version**: `FormatVersion = 1` (current).
- **flags**: See [Header Flags](#header-flags).
- **cacheCount**: Present only when `HeaderFlag_HasCacheCount` is set. Number of type wrapper slots used by the serializer.

## Header Flags

The flags byte uses `0x90` (144) as its base value, with bit flags in the lower nibble:

| Bit | Mask | Flag | Meaning |
|-----|------|------|---------|
| 0 | `0x01` | Metadata | Property hash metadata included (cross-type deserialization) |
| 1 | `0x02` | RefHandling_OnlyId | Reference tracking for `IId` objects only |
| 2 | `0x04` | RefHandling_All | Reference tracking for all objects (always combined with bit 1) |
| 3 | `0x08` | HasCacheCount | VarUInt cache count follows the flags byte |

**Reference handling modes:** None = `0x00`, OnlyId = `0x02`, All = `0x06` (bits 1+2).

## Variable-Length Encoding

### VarUInt (unsigned)

LEB128: 7 data bits per byte, MSB = continuation flag.

```
value < 128      → 1 byte   [0xxxxxxx]
value < 16384    → 2 bytes  [1xxxxxxx] [0xxxxxxx]
value < 2097152  → 3 bytes  ...
(max 5 bytes for uint32)
```

### VarInt (signed)

ZigZag encoding maps signed to unsigned, then LEB128:

```
encode: (value << 1) ^ (value >> 31)
decode: (raw >> 1) ^ -(raw & 1)
```

Maps: `0 → 0`, `-1 → 1`, `1 → 2`, `-2 → 3`, etc.

### VarULong (unsigned 64-bit)

Same LEB128 encoding, max 10 bytes for uint64.

## Type Markers

All markers are defined in `BinaryTypeCode.cs`. `SlotCount = 64`.

### FixObj (0–63)

Single-byte object type. The marker byte **is** the type slot index, so no additional type identifier is needed.

```
[FixObj(N)] [properties...]
```

**Slot allocation:** Slots 0–63 are reserved for runtime polymorphic types, assigned dynamically on first encounter during serialization. Source-generated (SGen) types receive slots starting at 64 via `AllocateWrapperSlot()` (sequential, `Interlocked.Increment`). SGen slots are compile-time stable; runtime slots depend on serialization order.

### Complex Types (64–71)

| Code | Name | Wire format |
|------|------|-------------|
| 64 | Object | `[64] [VarUInt typeIndex] [properties...]` |
| 65 | ObjectRef | `[65] [VarUInt refCacheIndex]` |
| 66 | Array | `[66] [VarUInt count] [elements...]` |
| 67 | Dictionary | `[67] [VarUInt count] [key, value pairs...]` |
| 68 | ByteArray | `[68] [VarUInt length] [raw bytes]` |
| 69 | ObjectWithMetadata | `[69] [VarUInt typeIndex] [VarUInt hashCount] [hashes...] [properties...]` |
| 70 | ObjectRefFirst | `[70] [VarUInt refCacheIndex] [object body...]` |
| 71 | ObjectWithMetadataRefFirst | `[71] [VarUInt refCacheIndex] [metadata + properties...]` |

### Polymorphic Types (72–75)

Used when the runtime type differs from the declared property type and `UseMetadata=false`.

| Code | Name | Wire format |
|------|------|-------------|
| 72 | ObjectWithTypeName | `[72] [UTF8 typeName] [inner marker] [body...]` (prefix; an inner Object/Array/Dict marker follows) |
| 73 | ObjectWithTypeNameRefFirst | `[73] [UTF8 typeName] [VarUInt refCacheIndex] [properties...]` (combined, no inner marker) |
| 74 | ObjectWithTypeIndex | `[74] [VarUInt typeIndex] [inner marker] [body...]` (prefix) |
| 75 | ObjectWithTypeIndexRefFirst | `[75] [VarUInt typeIndex] [VarUInt refCacheIndex] [properties...]` (combined) |

The second occurrence of a referenced polymorphic object uses plain `ObjectRef(65)`; no polymorphic prefix is needed.
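
The VarUInt and VarInt rules from the Variable-Length Encoding section can be sketched in C# as follows. This is a minimal illustration written for this document, not the serializer's actual code: the class name `VarIntSketch` and all method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of AcBinary: illustrates LEB128 + ZigZag only.
public static class VarIntSketch
{
    // LEB128: emit 7 data bits per byte; the MSB is set while more bytes follow.
    public static byte[] WriteVarUInt(uint value)
    {
        var bytes = new List<byte>(5);
        while (value >= 0x80)
        {
            bytes.Add((byte)((value & 0x7F) | 0x80)); // low 7 bits + continuation flag
            value >>= 7;
        }
        bytes.Add((byte)value); // final byte, MSB clear
        return bytes.ToArray();
    }

    // Reads one VarUInt starting at pos and advances pos past it.
    public static uint ReadVarUInt(byte[] data, ref int pos)
    {
        uint result = 0;
        int shift = 0;
        while (true)
        {
            byte b = data[pos++];
            result |= (uint)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            shift += 7;
            if (shift > 28) throw new FormatException("VarUInt longer than 5 bytes");
        }
    }

    // ZigZag: interleaves negative and positive values so small magnitudes stay small.
    public static uint ZigZagEncode(int value)
        => unchecked((uint)((value << 1) ^ (value >> 31))); // >> is arithmetic on int

    public static int ZigZagDecode(uint raw)
        => (int)(raw >> 1) ^ -(int)(raw & 1);               // >> is logical on uint

    // VarInt = ZigZag, then LEB128.
    public static byte[] WriteVarInt(int value) => WriteVarUInt(ZigZagEncode(value));

    public static int ReadVarInt(byte[] data, ref int pos)
        => ZigZagDecode(ReadVarUInt(data, ref pos));
}
```

For example, `WriteVarUInt(300)` yields the two bytes `0xAC 0x02`, and `WriteVarInt(-1)` yields the single byte `0x01`. The 64-bit variants (VarULong, and the VarLong referenced in the marker tables) would presumably follow the same pattern with `ulong` and a `>> 63` shift; that is an inference, not something this document states.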
### Primitives (76–90)

| Code | Name | Wire format |
|------|------|-------------|
| 76 | Null | `[76]` (no payload) |
| 77 | True | `[77]` (no payload) |
| 78 | False | `[78]` (no payload) |
| 79 | Int8 | `[79] [1 byte]` |
| 80 | UInt8 | `[80] [1 byte]` |
| 81 | Int16 | `[81] [VarInt]` |
| 82 | UInt16 | `[82] [VarUInt]` |
| 83 | Int32 | `[83] [VarInt]` |
| 84 | UInt32 | `[84] [VarUInt]` |
| 85 | Int64 | `[85] [VarLong]` |
| 86 | UInt64 | `[86] [VarULong]` |
| 87 | Float32 | `[87] [4 bytes IEEE 754]` |
| 88 | Float64 | `[88] [8 bytes IEEE 754]` |
| 89 | Decimal | `[89] [16 bytes]` |
| 90 | Char | `[90] [VarUInt]` |

### Strings (91–94)

| Code | Name | Wire format |
|------|------|-------------|
| 91 | String | `[91] [VarUInt byteLength] [UTF-8 bytes]` |
| 92 | StringInterned | `[92] [VarUInt cacheIndex]` (2nd+ occurrence) |
| 93 | StringEmpty | `[93]` (no payload) |
| 94 | StringInternFirst | `[94] [VarUInt cacheIndex] [VarUInt byteLength] [UTF-8 bytes]` (1st occurrence) |

### Date/Time (95–98)

| Code | Name | Wire format |
|------|------|-------------|
| 95 | DateTime | `[95] [8 bytes ticks]` |
| 96 | DateTimeOffset | `[96] [8 bytes ticks] [VarInt offsetMinutes]` |
| 97 | TimeSpan | `[97] [VarLong ticks]` |
| 98 | Guid | `[98] [16 bytes]` |

### Other Markers

| Code | Name | Wire format |
|------|------|-------------|
| 99 | Enum | `[99] [VarInt underlyingValue]` |
| 100 | MetadataHeader | Legacy: implies `RefHandling=true` + metadata present |
| 101 | NoMetadataHeader | Legacy: implies `RefHandling=true`, no metadata |
| 102 | PropertySkip | `[102]`: marks a skipped property (default/null value) |

### FixStr (103–134)

Short ASCII strings encoded as a single marker byte + raw bytes (no length prefix):

```
[FixStrBase + byteLength] [ASCII bytes]
```

- Length range: 0–31 bytes (`FixStrBase=103`, `FixStrMax=134`)
- Saves 1 byte vs `String` marker + VarUInt length
- Falls back to `String(91)` if content is non-ASCII

### TinyInt (192–255)

Single-byte integer encoding for
small values:

```
value  = marker - 192 - 16   (range: -16 to 47)
marker = value + 16 + 192    (64 values total)
```

Saves at least 1 byte vs `Int32(83)` + VarInt for frequently occurring small integers (values in this range need a 1-byte VarInt, so 2 bytes in total), and more against fixed-width payloads.

## Compact Encoding Selection

The serializer applies compact encodings automatically:

| Data | Condition | Encoding | Savings |
|------|-----------|----------|---------|
| Integer | −16 ≤ v ≤ 47 | TinyInt (1 byte) | 1+ bytes (no separate payload) |
| String | ≤31 bytes, ASCII | FixStr (1+N bytes) | 1 byte (no length prefix) |
| Object | type index < 64 | FixObj (1 byte) | 1+ bytes (no VarUInt index) |
| String | empty | StringEmpty (1 byte) | 1+ bytes |
| Bool | always | True/False (1 byte) | no payload |

## String Interning Protocol

Controls deduplication of repeated string values.

**Modes** (`StringInterningMode`):

- `None`: all strings inline, no overhead
- `Attribute`: only `[AcStringIntern]` properties interned (default)
- `All`: all strings within length limits interned

**Length limits:** `MinStringInternLength=4`, `MaxStringInternLength=64` (configurable).

**Wire protocol:**

1. The serializer pre-scans all eligible strings to build a plan of which strings repeat
2. First occurrence: `[StringInternFirst(94)] [VarUInt cacheIndex] [VarUInt byteLength] [UTF-8 bytes]`
3. Subsequent: `[StringInterned(92)] [VarUInt cacheIndex]`
4. Single-occurrence strings: written as normal `String`/`FixStr` (no interning overhead)

## Reference Tracking

Prevents infinite loops and preserves object identity for repeated references.

**Modes** (`ReferenceHandlingMode`):

- `None`: no tracking (fastest, use when the graph is a tree)
- `OnlyId`: track only `IId` objects (matched by ID value)
- `All`: track all reference types (two-phase scan required)

**Two-phase process** (two graph traversals, with a sort step between them):

1. **Scan pass** (`ScanPass.cs`): walks the object graph and detects multi-referenced objects and repeated strings. Builds a `WriteDuplicateEntry[]` array (the "write plan") containing `VisitIndex`, `CacheMapIndex`, `IsFirst`, and `Value` for each duplicate.
2. **Sort**: write plan entries are sorted by `VisitIndex` to match the write pass traversal order.
3. **Serialize pass**: consumes the sorted write plan via `TryConsumeWritePlanEntry()`. A cursor (`_nextWritePlanVisitIndex`) advances through the plan in O(1) per visited object, so no dictionary lookups are needed during serialization.

**Wire protocol:**

- First occurrence: `[ObjectRefFirst(70)] [VarUInt refCacheIndex] [object body...]`
- Subsequent: `[ObjectRef(65)] [VarUInt refCacheIndex]`

**Example (same object referenced twice):**

```
Input: { Users: [userA, userA] }  (same instance)

Scan pass → WritePlan:
  [{VisitIndex:2, CacheMapIndex:0, IsFirst:true},
   {VisitIndex:3, CacheMapIndex:0, IsFirst:false}]

Wire output (Compact mode, ReferenceHandling=All):
  [version=1] [flags=0x9E] [VarUInt cacheCount=1]   ← header (0x90 | 0x06 | 0x08)
  [FixObj(0)]                                        ← root object
  [Array(66)] [VarUInt(2)]                           ← Users array, 2 elements
  [ObjectRefFirst(70)] [VarUInt(0)] [props...]       ← userA, 1st occurrence
  [ObjectRef(65)] [VarUInt(0)]                       ← userA, 2nd (2 bytes only)
```

## Property Ordering

Properties are serialized in a deterministic order defined by `TypeMetadataBase.GetUnfilteredProperties()`:

1. Walk the inheritance chain from **derived → base** (`currentType.BaseType` loop)
2. At each level, collect declared public instance properties
3. Sort **alphabetically** (`StringComparer.Ordinal`) within each level
4. Result: **base properties first, then derived, alphabetical within each level** (levels are emitted in the reverse of the walk order)

This order is stable across serializer and deserializer as long as the type hierarchy doesn't change.
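
The ordering rules above can be approximated with plain reflection. The sketch below is an assumption-laden illustration, not the body of `TypeMetadataBase.GetUnfilteredProperties()`: the class `PropertyOrderSketch` and the sample types `Animal` and `Dog` are hypothetical, and the real method may treat indexers, `System.Object`, or attributes differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical illustration of the documented ordering rules.
public static class PropertyOrderSketch
{
    public static List<string> GetOrderedPropertyNames(Type type)
    {
        var levels = new List<List<string>>();

        // Steps 1-3: walk derived → base, collecting declared public instance
        // properties at each level and sorting them ordinally.
        for (Type? current = type;
             current != null && current != typeof(object);
             current = current.BaseType)
        {
            var names = current
                .GetProperties(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
                .Select(p => p.Name)
                .OrderBy(n => n, StringComparer.Ordinal)
                .ToList();
            levels.Add(names);
        }

        // Step 4: emit base levels first, so reverse the walk order.
        levels.Reverse();
        return levels.SelectMany(level => level).ToList();
    }
}

// Sample hierarchy used only for this illustration.
public class Animal
{
    public string Name { get; set; } = "";
    public int Age { get; set; }
}

public class Dog : Animal
{
    public string Breed { get; set; } = "";
    public bool Adopted { get; set; }
}
```

Calling `GetOrderedPropertyNames(typeof(Dog))` returns `Age, Name, Adopted, Breed`: the base-class properties come first, and each level is sorted independently. Adding a property to `Animal` would therefore shift the positional index of every `Dog` property, which is why positional matching requires identical layouts on both sides.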
### Cross-Type Deserialization (UseMetadata)

When `UseMetadata=true`, property name hashes (FNV-1a via `FnvHash.ComputeString`) are written per type, enabling schema evolution:

- **Serializer** writes property hashes in the metadata section (`ObjectWithMetadata(69)`)
- **Deserializer** builds an index mapping array (`GetIndexMapping()`) that maps source property indices to destination indices by matching FNV-1a hashes
- This allows deserialization even when source and destination types have different property sets or ordering

When `UseMetadata=false`, properties are matched by **positional index only**: source and destination must have identical property layouts.

**Edge cases:**

- **Hash collision** (`CheckDuplicatePropName=true`, default): throws `InvalidOperationException`. When `false`, the collision is silently ignored, with a risk of data corruption.
- **Source has unknown property** (not in destination): silently skipped via `SkipValue()`, no error.
- **Destination has extra property** (not in source): left at default value (new instance) or unchanged (populate mode).

## Configuration Options

Options are defined in `AcBinarySerializerOptions` (inherits `AcSerializerOptions`). Each option controls which code paths execute and how the wire format changes.

### WireMode

| Value | Integers | Strings | Output size | Speed |
|-------|----------|---------|-------------|-------|
| `Compact` (default) | VarInt/VarUInt (1–5 bytes) | UTF-8 with speculative ASCII fast path | Smaller | Slightly slower |
| `Fast` | Fixed-width raw bytes (4/8 bytes) | UTF-16 memcpy (`charCount * 2` bytes) | Larger | Fastest encode/decode |

**Format difference for strings:**

- Compact: `[VarUInt byteLength] [UTF-8 bytes]`, written with a speculative ASCII pass (1 pass if all ASCII; rewind and UTF-8 fallback otherwise)
- Fast: `[VarUInt charCount] [raw UTF-16 bytes]`, a zero-encoding memcpy

**Code branch:** `context.FastWire` flag set at `context.Reset()`. Checked in `WriteStringUtf8()` and integer write methods.
The FixStr optimization is skipped in Fast mode (it is UTF-8 specific).

### ReferenceHandling

| Value | Tracked objects | Scan pass | Header flags | Wire markers |
|-------|----------------|-----------|--------------|-------------|
| `None` | Nothing | Skipped | `0x00` | Standard object markers only |
| `OnlyId` | `IId` objects only (by ID value) | Partial | `0x02` | `ObjectRefFirst(70)` + `ObjectRef(65)` |
| `All` (default) | All reference types | Full graph walk | `0x06` | `ObjectRefFirst(70)` + `ObjectRef(65)` |

**Format impact:** When enabled, multi-referenced objects are written once with `ObjectRefFirst(70) + VarUInt(refCacheIndex)` on first encounter, then replaced by `ObjectRef(65) + VarUInt(refCacheIndex)` on subsequent encounters. The header `HasCacheCount` flag is set and the cache count is written.

**Interaction with `ThrowOnCircularReference` (default: `true`):**

- `true` + ref handling enabled: all objects are tracked for cycle detection; throws `InvalidOperationException` on a circular reference
- `false` + ref handling enabled: only `IId` types are tracked for deduplication; non-`IId` circular refs are silently truncated at `MaxDepth`

### UseMetadata

| Value | Wire markers | Property matching | Overhead |
|-------|-------------|-------------------|----------|
| `false` (default) | `FixObj`/`Object` | Positional index only (types must match) | None |
| `true` | `ObjectWithMetadata(69)` / `ObjectWithMetadataRefFirst(71)` | FNV-1a property name hashes | 4 bytes per property per type |

**Format impact:** When enabled, each type's first occurrence writes `[VarUInt hashCount] [FNV-1a hash × N]` before properties. The deserializer uses the hashes to build a source→destination index mapping, enabling cross-type deserialization (different property sets/ordering).

**Code branch:** `context.UseMetadata` controls whether `ObjectWithMetadata(69)` or plain `Object(64)` markers are used.
When `false`, `IsDirectObjectWrite=true` allows source-generated writers to bypass `WriteObject` entirely and inline property writes.

**Related:** `CheckDuplicatePropName` (default: `true`) throws if an FNV-1a hash collision is detected between property names of the same type. It can be disabled in production for performance, but collisions are then silently ignored (see the edge cases under Cross-Type Deserialization).

### UseStringInterning

| Value | Eligible strings | Scan overhead | Wire markers |
|-------|-----------------|---------------|-------------|
| `None` | Nothing | None | `String(91)` / `FixStr` only |
| `Attribute` (default) | Properties with `[AcStringIntern(true)]` | Scans marked properties | `StringInternFirst(94)` + `StringInterned(92)` |
| `All` | All strings within length limits | Scans all strings | `StringInternFirst(94)` + `StringInterned(92)` |

**Length limits:** `MinStringInternLength` (default: 4) and `MaxStringInternLength` (default: 64, 0=unlimited). Strings outside this range are always written inline.

**Format impact:** Interned strings on first occurrence: `[StringInternFirst(94)] [VarUInt cacheIndex] [string data]`. Subsequent: `[StringInterned(92)] [VarUInt cacheIndex]` (marker plus a VarUInt index instead of the full string). Single-occurrence strings are never interned, so unique strings carry no overhead.

**Code branch:** `context.StringInternEligible` flag set per-property before `WriteString`. The scan pass builds a `WriteDuplicateEntry[]` plan; the write pass consumes it via a cursor.

### MaxDepth

| Value | Behavior |
|-------|----------|
| `255` (default) | Effectively unlimited nesting |
| `0` | Root level only: nested objects/collections written as `Null(76)` |
| `N` | Objects deeper than N levels written as `Null(76)` |

**Format impact:** Depth-exceeded values appear as `Null(76)` in the stream and are indistinguishable from actual null values. There is no special marker.

**Code branch:** Checked at entry of every object/collection write: `if (depth > MaxDepth) { WriteByte(Null); return; }`.
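
To make the truncation behaviour concrete, here is a toy writer that applies the same `depth > MaxDepth` check to nested arrays. It is only a sketch under stated assumptions: the class `MaxDepthSketch` is hypothetical, the root is assumed to sit at depth 0 (consistent with `MaxDepth=0` meaning "root level only"), and only `Array(66)`, `Null(76)`, and TinyInt values are emitted.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy writer: nested object[] arrays containing small ints only.
public static class MaxDepthSketch
{
    private const byte ArrayMarker = 66;        // Array(66)
    private const byte NullMarker = 76;         // Null(76)
    private const int TinyIntOffset = 192 + 16; // marker = value + 16 + 192

    public static byte[] Write(object[] root, int maxDepth)
    {
        var output = new List<byte>();
        WriteArray(root, 0, maxDepth, output);  // assumption: root is depth 0
        return output.ToArray();
    }

    private static void WriteArray(object[] items, int depth, int maxDepth, List<byte> output)
    {
        // Same shape as the documented check: too deep means a plain Null marker.
        if (depth > maxDepth)
        {
            output.Add(NullMarker);
            return;
        }

        output.Add(ArrayMarker);
        output.Add((byte)items.Length);         // VarUInt count; sketch assumes count < 128

        foreach (var item in items)
        {
            switch (item)
            {
                case object[] nested:
                    WriteArray(nested, depth + 1, maxDepth, output);
                    break;
                case int v when v >= -16 && v <= 47:
                    output.Add((byte)(v + TinyIntOffset));
                    break;
                default:
                    throw new NotSupportedException("Sketch handles only nested arrays and TinyInt values.");
            }
        }
    }
}
```

Writing `new object[] { 1, new object[] { 2 } }` with `maxDepth = 1` produces `66 2 209 66 1 210`, while `maxDepth = 0` produces `66 2 209 76`: the inner array collapses to a single `Null(76)` byte, and a reader cannot tell that apart from a genuine null element.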
### UseCompression

| Value | Method | Granularity | Memory |
|-------|--------|-------------|--------|
| `None` (default) | No compression | n/a | n/a |
| `Block` | LZ4 single block | Entire payload | Full buffer in memory |
| `BlockArray` | LZ4 chunked | 64KB chunks | Streaming-friendly, lower peak memory |

**Format impact:** Compression is applied **post-serialization** as a transparent wrapper; the inner wire format is unchanged. Both modes are pure managed C# (WASM-compatible, no native dependencies).

**Code branch:** Applied in `AcBinarySerializer.Serialize()` after the serialization context produces the raw buffer: `if (UseCompression != None) Lz4.Compress(buffer, mode)`. Decompression is automatic on deserialize.

### PropertyFilter

Optional delegate `BinaryPropertyFilter?` (default: `null`). When set, it is invoked for each property to decide inclusion.

```
delegate bool BinaryPropertyFilter(in BinaryPropertyFilterContext context);
```

**BinaryPropertyFilterContext fields:** `DeclaringType`, `PropertyName`, `PropertyType`, `Instance` (null during metadata phase), `IsMetadataPhase`, `GetValue()` (lazy).

**Format impact:** Excluded properties are completely absent from the stream: no marker, no placeholder. The deserializer must use `UseMetadata=true` or an identical filter to match property indices correctly.

**Code branch:** `context.HasPropertyFilter` checked in `ShouldSerializeProperty()`. The filter is called twice: once during metadata registration (`Instance=null`) and once during the write phase.

### PropertyMapper

Optional delegate `PropertyMapperDelegate?` (default: `null`) for cross-type deserialization property remapping.

```
delegate PropertyInfo? PropertyMapperDelegate(PropertyInfo sourceProperty, Type destinationType);
```

**Purpose:** Maps properties between different class hierarchies (renamed properties, external DTOs). The result is cached, so same-type operations (`Deserialize`) incur zero overhead.
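
A mapper for renamed properties might look like the sketch below. The delegate is redeclared locally with the signature shown above so that the example compiles on its own; the rename table, the class `PropertyMapperSketch`, and the sample types `CustomerEntity` and `CustomerDto` are hypothetical. How the library treats a `null` return is not stated in this document; the sketch simply returns `null` when the destination has no matching property.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Redeclared with the documented signature so the sketch is standalone.
public delegate PropertyInfo? PropertyMapperDelegate(PropertyInfo sourceProperty, Type destinationType);

// Hypothetical source and destination types with one renamed property.
public class CustomerEntity
{
    public int Id { get; set; }
    public string FullName { get; set; } = "";
    public string InternalNotes { get; set; } = "";
}

public class CustomerDto
{
    public int Id { get; set; }
    public string DisplayName { get; set; } = "";
}

public static class PropertyMapperSketch
{
    // Source property name → destination property name.
    private static readonly Dictionary<string, string> Renames =
        new Dictionary<string, string>(StringComparer.Ordinal)
        {
            ["FullName"] = "DisplayName",
        };

    // Matches the PropertyMapperDelegate signature.
    public static PropertyInfo? MapByRenameTable(PropertyInfo sourceProperty, Type destinationType)
    {
        // Use the renamed target if one is registered, otherwise the same name.
        string targetName = Renames.TryGetValue(sourceProperty.Name, out var renamed)
            ? renamed
            : sourceProperty.Name;

        // Returns null when the destination type has no such public instance property.
        return destinationType.GetProperty(targetName, BindingFlags.Public | BindingFlags.Instance);
    }
}
```

The method group converts directly to the delegate type (`PropertyMapperDelegate mapper = PropertyMapperSketch.MapByRenameTable;`). Because the document says the result is cached, a mapper like this should be deterministic: the same source property and destination type should always yield the same answer.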
### WASM Options

| Option | Default | Purpose |
|--------|---------|---------|
| `IsWasm` | `OperatingSystem.IsBrowser()` | Auto-detect WASM environment |
| `UseStringCaching` | follows `IsWasm` | Cache short strings during deserialization to reduce GC pressure |
| `MaxCachedStringLength` | 64 | Max string length to cache |

**Format impact:** None; these are deserialization-only optimizations. When `UseStringCaching=true`, the deserializer maintains an intern cache for strings ≤ `MaxCachedStringLength` chars. It is disabled automatically when a `StringInternFirst` marker is encountered (interning takes precedence).

### Other Options

| Option | Type | Default | Purpose |
|--------|------|---------|---------|
| `UseGeneratedCode` | bool | `true` | Use source-generated writers/readers when available |
| `InitialBufferCapacity` | int | 4096 | Starting buffer size (bytes) for serialization output |
| `RemoveOrphanedItems` | bool | `false` | During `PopulateMerge`: remove destination collection items with no matching source ID |
| `UseAsync` | bool | `false` | Async context pool return via ThreadPool. Auto-disabled in WASM and when `ReferenceHandling=None` |
| `MaxContextPoolSize` | int | 8 | Max serialization contexts kept in pool |

## Presets

| Preset | WireMode | Metadata | StringInterning | RefHandling | MaxDepth | Compression | Other |
|--------|----------|----------|-----------------|-------------|----------|-------------|-------|
| `Default` | Compact | false | Attribute | All | 255 | None | none |
| `FastMode` | Compact | false | None | None | 255 | None | No scan pass |
| `ShallowCopy` | Compact | false | None | None | **0** | None | Root level only |
| `WasmOptimized` | Compact | false | Attribute | All | 255 | None | +StringCaching |
| `WithoutReferenceHandling` | Compact | false | Attribute | **None** | 255 | None | No scan pass |
| `WithoutMetadata` | Compact | **false** | Attribute | All | 255 | None | none |

**Performance implication of presets:**

- `Default` / `WasmOptimized`: two-phase (scan + serialize) due to `ReferenceHandling=All`
- `FastMode` / `ShallowCopy`: single-phase (no scan pass) since both interning and refs are disabled
- The scan pass adds roughly 20–30% overhead; disable it when the object graph is a simple tree

## Option Interactions

Key interdependencies that affect which code branches execute:

| Combination | Effect |
|-------------|--------|
| `ReferenceHandling=None` + `UseStringInterning=None` | **No scan pass**: fastest path, single-phase serialization |
| `ReferenceHandling=All` + `UseMetadata=true` | Uses `ObjectWithMetadataRefFirst(71)` marker (combined ref + metadata) |
| `UseMetadata=false` + `UseGeneratedCode=true` | `IsDirectObjectWrite=true`: generated code inlines property writes, bypasses `WriteObject` |
| `UseMetadata=true` + `PropertyFilter` set | Filter invoked twice (metadata phase + write phase); filter results must be stable |
| `WireMode=Fast` + `UseStringInterning!=None` | Interned strings still use the fast string path (UTF-16 for first occurrence, VarUInt index for subsequent) |
| `UseCompression!=None` + any other option | Compression is orthogonal: applied post-serialization, inner format unchanged |
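
Several of these interactions reduce to small decision rules. The sketch below restates three of them as code: whether a scan pass is needed, how the flags byte is composed from the Header Flags table, and which object marker a first occurrence gets. Everything here is a hypothetical stand-in (`RefMode`, `InternMode`, `OptionInteractionSketch`); the scan-pass rule is inferred from the table above, and FixObj and polymorphic markers are deliberately left out.

```csharp
using System;

// Local stand-ins for ReferenceHandlingMode / StringInterningMode.
public enum RefMode { None, OnlyId, All }
public enum InternMode { None, Attribute, All }

public static class OptionInteractionSketch
{
    // Inferred rule: the scan pass is skipped only when both features are off.
    public static bool NeedsScanPass(RefMode refs, InternMode interning)
        => refs != RefMode.None || interning != InternMode.None;

    // Composes the header flags byte: 0x90 base + lower-nibble bit flags.
    public static byte FlagsByte(bool metadata, RefMode refs, bool hasCacheCount)
    {
        int flags = 0x90;
        if (metadata) flags |= 0x01;               // Metadata
        if (refs == RefMode.OnlyId) flags |= 0x02; // RefHandling_OnlyId
        if (refs == RefMode.All) flags |= 0x06;    // RefHandling_All (bits 1+2)
        if (hasCacheCount) flags |= 0x08;          // HasCacheCount
        return (byte)flags;
    }

    // Marker for the first occurrence of a non-polymorphic object,
    // ignoring the FixObj shortcut for type indices below 64.
    public static byte FirstOccurrenceMarker(bool useMetadata, bool multiReferenced)
    {
        if (useMetadata)
            return multiReferenced ? (byte)71 : (byte)69; // ObjectWithMetadataRefFirst / ObjectWithMetadata
        return multiReferenced ? (byte)70 : (byte)64;     // ObjectRefFirst / Object
    }
}
```

With `ReferenceHandling=All`, no metadata, and a cache count present, `FlagsByte` yields `0x9E`; with everything off it yields the bare base value `0x90`.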