AcBinary Wire Format
Complete wire format specification for the AcBinary serializer. Source of truth: AyCode.Core/Serializers/Binaries/BinaryTypeCode.cs.
Stream Layout
[version : 1 byte] [flags : 1 byte] [cacheCount : VarUInt?] [payload...]
- version: FormatVersion = 1 (current).
- flags: see Header Flags.
- cacheCount: present only when HeaderFlag_HasCacheCount is set. Number of type wrapper slots used by the serializer.
Header Flags
The flags byte uses 0x90 (144) as base with bit flags in the lower nibble:
| Bit | Mask | Flag | Meaning |
|---|---|---|---|
| 0 | 0x01 | Metadata | Property hash metadata included (cross-type deserialization) |
| 1 | 0x02 | RefHandling_OnlyId | Reference tracking for IId objects only |
| 2 | 0x04 | RefHandling_All | Reference tracking for all objects (always combined with bit 1) |
| 3 | 0x08 | HasCacheCount | VarUInt cache count follows the flags byte |
Reference handling modes: None = 0x00, OnlyId = 0x02, All = 0x06 (bits 1+2).
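Reading the header can be sketched as follows. This is a minimal C# illustration, not the library's code: ReadHeader, RefMode and the constant names are hypothetical, and rejecting a flags byte whose upper nibble is not 0x9 is an assumption drawn from the 0x90 base described above.

```csharp
using System;

// Masks from the Header Flags table; identifiers are illustrative only.
const byte FlagsBase = 0x90;
const byte RefOnlyId = 0x02, RefAll = 0x04, HasCacheCount = 0x08;

// Reads [version] [flags] [cacheCount?] and reports where the payload starts.
(byte Version, byte Flags, uint CacheCount, int PayloadOffset) ReadHeader(byte[] data)
{
    byte version = data[0];
    byte flags = data[1];
    // Assumption: the upper nibble always carries the 0x90 base.
    if ((flags & 0xF0) != FlagsBase)
        throw new FormatException("Unexpected flags base.");

    int pos = 2;
    uint cacheCount = 0;
    if ((flags & HasCacheCount) != 0)
    {
        // VarUInt: 7 data bits per byte, MSB set means another byte follows.
        int shift = 0;
        byte b;
        do
        {
            b = data[pos++];
            cacheCount |= (uint)(b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
    }
    return (version, flags, cacheCount, pos);
}

// Maps bits 1 and 2 to the reference handling mode.
string RefMode(byte flags) => (flags & (RefOnlyId | RefAll)) switch
{
    0x00 => "None",
    0x02 => "OnlyId",
    0x06 => "All",
    _ => throw new FormatException("Bit 2 must be combined with bit 1."),
};

// 0x9E = 0x90 base | HasCacheCount | RefHandling All; 0x4C (76) is a Null payload.
var header = ReadHeader(new byte[] { 0x01, 0x9E, 0x01, 0x4C });
Console.WriteLine($"version={header.Version} ref={RefMode(header.Flags)} " +
                  $"cacheCount={header.CacheCount} payloadOffset={header.PayloadOffset}");
```

The payload offset is returned rather than a slice so the caller can keep decoding from the same buffer.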
Variable-Length Encoding
VarUInt (unsigned)
LEB128: 7 data bits per byte, MSB = continuation flag.
value < 128 → 1 byte [0xxxxxxx]
value < 16384 → 2 bytes [1xxxxxxx] [0xxxxxxx]
value < 2097152 → 3 bytes ...
(max 5 bytes for uint32)
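A minimal sketch of this encoding in C#, assuming nothing beyond the LEB128 rules above. EncodeVarUInt and DecodeVarUInt are hypothetical names, not the serializer's own methods.

```csharp
using System;
using System.Collections.Generic;

// Encodes a 32-bit unsigned value: low 7 bits first, MSB = "more bytes follow".
byte[] EncodeVarUInt(uint value)
{
    var bytes = new List<byte>(5);
    while (value >= 0x80)
    {
        bytes.Add((byte)((value & 0x7F) | 0x80)); // 7 data bits plus continuation flag
        value >>= 7;
    }
    bytes.Add((byte)value); // final byte, continuation flag clear
    return bytes.ToArray();
}

// Decodes a VarUInt starting at 'pos' and advances 'pos' past it.
uint DecodeVarUInt(byte[] data, ref int pos)
{
    uint result = 0;
    for (int shift = 0; shift < 35; shift += 7)
    {
        byte b = data[pos++];
        result |= (uint)(b & 0x7F) << shift;
        if ((b & 0x80) == 0) return result;
    }
    throw new FormatException("VarUInt longer than 5 bytes.");
}

Console.WriteLine(BitConverter.ToString(EncodeVarUInt(300))); // AC-02
```

The decoder caps the loop at five bytes so a corrupt stream cannot keep it reading indefinitely.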
VarInt (signed)
ZigZag encoding maps signed to unsigned, then LEB128:
encode: (value << 1) ^ (value >> 31)
decode: (raw >> 1) ^ -(raw & 1)
Maps: 0 → 0, -1 → 1, 1 → 2, -2 → 3, etc.
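The two formulas translate directly to C# for 32-bit values. This is a hedged sketch with hypothetical function names; on the wire, the ZigZag result would then be written as a VarUInt.

```csharp
using System;

// ZigZag maps signed to unsigned so that small magnitudes give small values.
uint ZigZagEncode(int value) => unchecked((uint)((value << 1) ^ (value >> 31)));

// Inverse: logical shift on the unsigned value, then flip all bits if the low bit was set.
int ZigZagDecode(uint raw) => (int)(raw >> 1) ^ -(int)(raw & 1);

foreach (int v in new[] { 0, -1, 1, -2, int.MaxValue, int.MinValue })
    Console.WriteLine($"{v} -> {ZigZagEncode(v)} -> {ZigZagDecode(ZigZagEncode(v))}");
```

The decode shift has to be logical, which is why it operates on a uint: an arithmetic shift on a signed value would smear the sign bit.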
VarULong (unsigned 64-bit)
Same LEB128 encoding, max 10 bytes for uint64.
Type Markers
All markers defined in BinaryTypeCode.cs. SlotCount = 64.
FixObj (0–63)
Single-byte object type. The marker byte is the type slot index — no additional type identifier needed.
[FixObj(N)] [properties...]
Slot allocation: Slots 0–63 are reserved for runtime polymorphic types, assigned dynamically on first encounter during serialization. Source-generated (SGen) types receive slots starting at 64+ via AllocateWrapperSlot() (sequential, Interlocked.Increment). SGen slots are compile-time stable; runtime slots depend on serialization order.
Complex Types (64–71)
| Code | Name | Wire format |
|---|---|---|
| 64 | Object | [64] [VarUInt typeIndex] [properties...] |
| 65 | ObjectRef | [65] [VarUInt refCacheIndex] |
| 66 | Array | [66] [VarUInt count] [elements...] |
| 67 | Dictionary | [67] [VarUInt count] [key, value pairs...] |
| 68 | ByteArray | [68] [VarUInt length] [raw bytes] |
| 69 | ObjectWithMetadata | [69] [VarUInt typeIndex] [VarUInt hashCount] [hashes...] [properties...] |
| 70 | ObjectRefFirst | [70] [VarUInt refCacheIndex] [object body...] |
| 71 | ObjectWithMetadataRefFirst | [71] [VarUInt refCacheIndex] [metadata + properties...] |
Polymorphic Types (72–75)
Used when runtime type differs from declared property type and UseMetadata=false.
| Code | Name | Wire format |
|---|---|---|
| 72 | ObjectWithTypeName | [72] [UTF8 typeName] [inner marker] [body...] — prefix, inner Object/Array/Dict follows |
| 73 | ObjectWithTypeNameRefFirst | [73] [UTF8 typeName] [VarUInt refCacheIndex] [properties...] — combined, no inner marker |
| 74 | ObjectWithTypeIndex | [74] [VarUInt typeIndex] [inner marker] [body...] — prefix |
| 75 | ObjectWithTypeIndexRefFirst | [75] [VarUInt typeIndex] [VarUInt refCacheIndex] [properties...] — combined |
Second occurrence of a referenced polymorphic object uses plain ObjectRef(65) — no polymorphic prefix needed.
Primitives (76–90)
| Code | Name | Wire format |
|---|---|---|
| 76 | Null | [76] — no payload |
| 77 | True | [77] — no payload |
| 78 | False | [78] — no payload |
| 79 | Int8 | [79] [1 byte] |
| 80 | UInt8 | [80] [1 byte] |
| 81 | Int16 | [81] [VarInt] |
| 82 | UInt16 | [82] [VarUInt] |
| 83 | Int32 | [83] [VarInt] |
| 84 | UInt32 | [84] [VarUInt] |
| 85 | Int64 | [85] [VarLong] |
| 86 | UInt64 | [86] [VarULong] |
| 87 | Float32 | [87] [4 bytes IEEE 754] |
| 88 | Float64 | [88] [8 bytes IEEE 754] |
| 89 | Decimal | [89] [16 bytes] |
| 90 | Char | [90] [VarUInt] |
Strings (91–94)
| Code | Name | Wire format |
|---|---|---|
| 91 | String | [91] [VarUInt byteLength] [UTF-8 bytes] |
| 92 | StringInterned | [92] [VarUInt cacheIndex] — 2nd+ occurrence |
| 93 | StringEmpty | [93] — no payload |
| 94 | StringInternFirst | [94] [VarUInt cacheIndex] [VarUInt byteLength] [UTF-8 bytes] — 1st occurrence |
Date/Time (95–98)
| Code | Name | Wire format |
|---|---|---|
| 95 | DateTime | [95] [8 bytes ticks] |
| 96 | DateTimeOffset | [96] [8 bytes ticks] [VarInt offsetMinutes] |
| 97 | TimeSpan | [97] [VarLong ticks] |
| 98 | Guid | [98] [16 bytes] |
Other Markers
| Code | Name | Wire format |
|---|---|---|
| 99 | Enum | [99] [VarInt underlyingValue] |
| 100 | MetadataHeader | Legacy: implies RefHandling=true + metadata present |
| 101 | NoMetadataHeader | Legacy: implies RefHandling=true, no metadata |
| 102 | PropertySkip | [102] — marks skipped property (default/null value) |
FixStr (103–134)
Short ASCII strings encoded in a single marker byte + raw bytes (no length prefix):
[FixStrBase + byteLength] [ASCII bytes]
- Length range: 0–31 bytes (FixStrBase=103, FixStrMax=134)
- Saves 1 byte vs String marker + VarUInt length
- Falls back to String(91) if content is non-ASCII
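The selection between FixStr and String(91) could look like the sketch below. EncodeShortString is a hypothetical helper. It only handles the two cases described here: StringEmpty, interning and Fast wire mode are left out, and the fallback writes a single-byte length to stay short.

```csharp
using System;
using System.Text;

const byte StringMarker = 91, FixStrBase = 103;
const int FixStrMaxLength = 31;

byte[] EncodeShortString(string s)
{
    byte[] utf8 = Encoding.UTF8.GetBytes(s);
    bool isAscii = Array.TrueForAll(utf8, b => b < 0x80);

    if (isAscii && utf8.Length <= FixStrMaxLength)
    {
        var fix = new byte[1 + utf8.Length];
        fix[0] = (byte)(FixStrBase + utf8.Length); // the length lives in the marker
        utf8.CopyTo(fix, 1);
        return fix;
    }

    // Fallback: String(91) + VarUInt byteLength + UTF-8 bytes.
    // Sketch limitation: lengths below 128 only, so the VarUInt is one byte.
    if (utf8.Length >= 128)
        throw new NotSupportedException("Sketch handles lengths < 128 only.");
    var full = new byte[2 + utf8.Length];
    full[0] = StringMarker;
    full[1] = (byte)utf8.Length;
    utf8.CopyTo(full, 2);
    return full;
}

Console.WriteLine(BitConverter.ToString(EncodeShortString("id")));     // 69-69-64
Console.WriteLine(BitConverter.ToString(EncodeShortString("\u00E9"))); // 5B-02-C3-A9
```

In the first line of output the marker is 0x69 (105 = 103 + 2), which happens to equal the byte for 'i'.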
TinyInt (192–255)
Single-byte integer encoding for small values:
value = marker - 192 - 16 (range: -16 to 47)
marker = value + 16 + 192 (64 values total)
Replaces the two-byte minimum of Int32(83) + VarInt (marker plus a 1-byte VarInt) with a single byte for frequently occurring small integers.
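The two formulas above as a small C# sketch. TryEncodeTinyInt and DecodeTinyInt are hypothetical names. A false return means the caller would fall back to a regular integer marker.

```csharp
using System;

const int TinyIntBase = 192, TinyIntBias = 16, TinyIntMax = 47;

bool TryEncodeTinyInt(int value, out byte marker)
{
    if (value < -TinyIntBias || value > TinyIntMax)
    {
        marker = 0;
        return false; // out of range: caller uses e.g. Int32(83) + VarInt
    }
    marker = (byte)(value + TinyIntBias + TinyIntBase);
    return true;
}

int DecodeTinyInt(byte marker) => marker >= TinyIntBase
    ? marker - TinyIntBase - TinyIntBias
    : throw new ArgumentException("Not a TinyInt marker.");

foreach (int v in new[] { -16, 0, 47, 48 })
    Console.WriteLine(TryEncodeTinyInt(v, out byte m)
        ? $"{v} -> marker {m} -> {DecodeTinyInt(m)}"
        : $"{v} -> not a TinyInt");
```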
Compact Encoding Selection
The serializer applies compact encodings automatically:
| Data | Condition | Encoding | Savings |
|---|---|---|---|
| Integer | −16 ≤ v ≤ 47 | TinyInt (1 byte) | 2–5 bytes |
| String | ≤31 bytes, ASCII | FixStr (1+N bytes) | 1 byte (no length prefix) |
| Object | type index < 64 | FixObj (1 byte) | 1–5 bytes (no VarUInt index) |
| String | empty | StringEmpty (1 byte) | 1+ bytes |
| Bool | — | True/False (1 byte) | no payload |
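On the read side, the marker ranges from the Type Markers section allow a reader to dispatch on the first byte alone. The classifier below is an illustrative C# sketch with a hypothetical name. Codes 135–191 are not described in this document, so it reports them as unassigned.

```csharp
using System;

// Ranges follow the section headings under Type Markers.
string ClassifyMarker(byte marker) => marker switch
{
    <= 63 => "FixObj",
    <= 71 => "Complex",
    <= 75 => "Polymorphic",
    <= 90 => "Primitive",
    <= 94 => "String",
    <= 98 => "Date/Time",
    <= 102 => "Other",
    <= 134 => "FixStr",
    >= 192 => "TinyInt",
    _ => "Unassigned",
};

foreach (byte m in new byte[] { 0, 66, 76, 91, 103, 150, 208 })
    Console.WriteLine($"{m,3} -> {ClassifyMarker(m)}");
```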
String Interning Protocol
Controls deduplication of repeated string values.
Modes (StringInterningMode):
- None: all strings inline, no overhead
- Attribute: only [AcStringIntern] properties interned (default)
- All: all strings within length limits interned
Length limits: MinStringInternLength=4, MaxStringInternLength=64 (configurable).
Wire protocol:
- Serializer pre-scans all eligible strings to build a plan (which strings repeat)
- First occurrence: [StringInternFirst(94)] [VarUInt cacheIndex] [VarUInt byteLength] [UTF-8 bytes]
- Subsequent: [StringInterned(92)] [VarUInt cacheIndex]
- Single-occurrence strings: written as normal String/FixStr (no interning overhead)
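The protocol above can be sketched as a scan pass followed by a write pass. PlanInterning is a hypothetical helper that emits readable tokens instead of bytes. Measuring the length limits in characters and assigning cache indices in first-occurrence order are both assumptions.

```csharp
using System;
using System.Collections.Generic;

const int MinLen = 4, MaxLen = 64; // default limits from the text

// Returns one human-readable wire token per input string, in write order.
List<string> PlanInterning(IReadOnlyList<string> values)
{
    // Pass 1 (scan): count occurrences of eligible strings.
    var counts = new Dictionary<string, int>();
    foreach (var s in values)
        if (s.Length >= MinLen && s.Length <= MaxLen)
            counts[s] = counts.GetValueOrDefault(s) + 1;

    // Pass 2 (write): only repeated strings get a cache slot.
    var cacheIndex = new Dictionary<string, int>();
    var tokens = new List<string>();
    foreach (var s in values)
    {
        if (counts.GetValueOrDefault(s) < 2)
        {
            tokens.Add($"String \"{s}\""); // unique or ineligible: written inline
        }
        else if (cacheIndex.TryGetValue(s, out int index))
        {
            tokens.Add($"StringInterned(92) #{index}");
        }
        else
        {
            int next = cacheIndex.Count;
            cacheIndex[s] = next;
            tokens.Add($"StringInternFirst(94) #{next} \"{s}\"");
        }
    }
    return tokens;
}

foreach (var token in PlanInterning(new[] { "active", "active", "x", "pending", "active" }))
    Console.WriteLine(token);
```

Counting first is what keeps single-occurrence strings free of interning overhead: a string that never repeats is never given a cache index.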
Reference Tracking
Prevents infinite loops and preserves object identity for repeated references.
Modes (ReferenceHandlingMode):
- None: no tracking (fastest, use when graph is a tree)
- OnlyId: track only IId objects (matched by ID value)
- All: track all reference types (two-phase scan required)
Two-phase process:
- Scan pass (ScanPass.cs): walks the object graph, detects multi-referenced objects and repeated strings. Builds a WriteDuplicateEntry[] array (the "write plan") containing VisitIndex, CacheMapIndex, IsFirst, and Value for each duplicate.
- Sort: write plan entries are sorted by VisitIndex to match the write pass traversal order.
- Serialize pass: consumes the sorted write plan via TryConsumeWritePlanEntry(). A cursor (_nextWritePlanVisitIndex) advances through the plan in O(1), with no dictionary lookups during serialization.
Wire protocol:
- First occurrence: [ObjectRefFirst(70)] [VarUInt refCacheIndex] [object body...]
- Subsequent: [ObjectRef(65)] [VarUInt refCacheIndex]
Example — same object referenced twice:
Input: { Users: [userA, userA] } (same instance)
Scan pass → WritePlan:
[{VisitIndex:2, CacheMapIndex:0, IsFirst:true},
{VisitIndex:3, CacheMapIndex:0, IsFirst:false}]
Wire output (Compact mode, ReferenceHandling=All):
[version=1] [flags=0x9E] [VarUInt cacheCount=1] ← header
[FixObj(0)] ← root object
[Array(66)] [VarUInt(2)] ← Users array, 2 elements
[ObjectRefFirst(70)] [VarUInt(0)] [props...] ← userA, 1st occurrence
[ObjectRef(65)] [VarUInt(0)] ← userA, 2nd (2 bytes only)
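The cursor-based consumption of the write plan can be illustrated with the two entries from this example. TryConsume is a hypothetical stand-in for TryConsumeWritePlanEntry(), tuples stand in for WriteDuplicateEntry, and numbering the root as visit 0 and the array as visit 1 is an assumption.

```csharp
using System;

// Plan from the example, already sorted by VisitIndex.
var plan = new (int VisitIndex, int CacheMapIndex, bool IsFirst)[]
{
    (2, 0, true),
    (3, 0, false),
};
int cursor = 0;

// One comparison against the next plan entry; no dictionary lookup.
bool TryConsume(int visitIndex, out (int VisitIndex, int CacheMapIndex, bool IsFirst) entry)
{
    if (cursor < plan.Length && plan[cursor].VisitIndex == visitIndex)
    {
        entry = plan[cursor++];
        return true;
    }
    entry = default;
    return false;
}

// Picks the marker the write pass would emit for a given visit.
string Describe(int visitIndex) =>
    TryConsume(visitIndex, out var e)
        ? (e.IsFirst ? $"ObjectRefFirst(70) #{e.CacheMapIndex}" : $"ObjectRef(65) #{e.CacheMapIndex}")
        : "plain marker";

// Visits 0..3: root, Users array, userA, userA again.
for (int i = 0; i < 4; i++)
    Console.WriteLine($"{i}: {Describe(i)}");
```

Because the plan is sorted in traversal order, the write pass only ever looks at the entry under the cursor.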
Property Ordering
Properties are serialized in a deterministic order defined by TypeMetadataBase.GetUnfilteredProperties():
- Walk the inheritance chain from derived → base (currentType.BaseType loop)
- At each level, collect declared public instance properties
- Sort alphabetically (StringComparer.Ordinal) within each level
- Result: base properties first, then derived, alphabetical within each level
This order is stable across serializer/deserializer as long as the type hierarchy doesn't change.
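A reflection-based sketch of this ordering, under stated assumptions: OrderedPropertyNames and the Animal/Dog types are hypothetical, and the walk stops at System.Object. It is not the implementation of GetUnfilteredProperties(), only an illustration of the rules above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

List<string> OrderedPropertyNames(Type type)
{
    var levels = new List<List<string>>();
    // Walk derived -> base, one alphabetically sorted group per level.
    for (var current = type; current != null && current != typeof(object); current = current.BaseType)
    {
        levels.Add(current
            .GetProperties(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
            .Select(p => p.Name)
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList());
    }
    levels.Reverse(); // base-most level first
    return levels.SelectMany(level => level).ToList();
}

Console.WriteLine(string.Join(", ", OrderedPropertyNames(typeof(Dog))));
// Age, Name, Breed, IsTrained

// Hypothetical sample types (type declarations must follow top-level statements).
class Animal { public string Name { get; set; } = ""; public int Age { get; set; } }
class Dog : Animal { public string Breed { get; set; } = ""; public bool IsTrained { get; set; } }
```

Adding a property to Animal shifts the positional index of every Dog property, which is why positional matching requires identical hierarchies on both sides.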
Cross-Type Deserialization (UseMetadata)
When UseMetadata=true, property name hashes (FNV-1a via FnvHash.ComputeString) are written per type, enabling schema evolution:
- Serializer writes property hashes in the metadata section (ObjectWithMetadata(69))
- Deserializer builds an index mapping array (GetIndexMapping()) that maps source property indices to destination indices by matching FNV-1a hashes
- This allows deserialization even when source and destination types have different property sets or ordering
When UseMetadata=false, properties are matched by positional index only — source and destination must have identical property layouts.
Edge cases:
- Hash collision (CheckDuplicatePropName=true, default): throws InvalidOperationException. When false: collision silently ignored, with a risk of data corruption.
- Source has unknown property (not in destination): silently skipped via SkipValue(), no error.
- Destination has extra property (not in source): left at default value (new instance) or unchanged (populate mode).
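The hash-based matching can be sketched as below. Fnv1a is the standard 32-bit FNV-1a; applying it to the UTF-8 bytes of the property name is an assumption, since the exact input of FnvHash.ComputeString is not specified here. BuildIndexMapping is a hypothetical stand-in for GetIndexMapping(), and using -1 to mark a skipped source property is a convention chosen for this sketch.

```csharp
using System;
using System.Text;

// Standard 32-bit FNV-1a (offset basis 2166136261, prime 16777619).
uint Fnv1a(string text)
{
    uint hash = 2166136261;
    foreach (byte b in Encoding.UTF8.GetBytes(text))
    {
        hash ^= b;
        hash = unchecked(hash * 16777619);
    }
    return hash;
}

// For each source index: the matching destination index, or -1 (value is skipped on read).
int[] BuildIndexMapping(uint[] sourceHashes, string[] destinationNames)
{
    var mapping = new int[sourceHashes.Length];
    for (int s = 0; s < sourceHashes.Length; s++)
    {
        uint wanted = sourceHashes[s];
        mapping[s] = Array.FindIndex(destinationNames, name => Fnv1a(name) == wanted);
    }
    return mapping;
}

// Source type has Email/Id/Name; destination has Id/Name/Phone.
var sourceHashes = Array.ConvertAll(new[] { "Email", "Id", "Name" }, name => Fnv1a(name));
int[] map = BuildIndexMapping(sourceHashes, new[] { "Id", "Name", "Phone" });
Console.WriteLine(string.Join(", ", map)); // -1, 0, 1
```

Email has no destination and is skipped, while Phone is never assigned and keeps its default value, matching the edge cases listed above.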
Configuration Options
Options defined in AcBinarySerializerOptions (inherits AcSerializerOptions). Each option controls which code paths execute and how the wire format changes.
WireMode
| Value | Integers | Strings | Output size | Speed |
|---|---|---|---|---|
| Compact (default) | VarInt/VarUInt (1–5 bytes) | UTF-8 with speculative ASCII fast path | Smaller | Slightly slower |
| Fast | Fixed-width raw bytes (4/8 bytes) | UTF-16 memcpy (charCount * 2 bytes) | Larger | Fastest encode/decode |
Format difference for strings:
- Compact: [VarUInt byteLength] [UTF-8 bytes], speculative ASCII (1 pass if all ASCII, rewind + UTF-8 fallback otherwise)
- Fast: [VarUInt charCount] [raw UTF-16 bytes], zero-encoding memcpy
Code branch: context.FastWire flag set at context.Reset(). Checked in WriteStringUtf8() and integer write methods. FixStr optimization is skipped in Fast mode (UTF-8 specific).
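The difference between the two string bodies can be seen in a short sketch. EncodeStringBody is a hypothetical helper that omits the marker byte. Little-endian UTF-16 (Encoding.Unicode) is an assumption about what the memcpy produces, and lengths are limited to one VarUInt byte for brevity.

```csharp
using System;
using System.Text;

// Returns [length] [payload] for one string, without the leading marker.
byte[] EncodeStringBody(string s, bool fastWire)
{
    // Compact: UTF-8 payload, length counted in bytes.
    // Fast: UTF-16 payload, length counted in chars (payload is charCount * 2 bytes).
    byte[] payload = fastWire ? Encoding.Unicode.GetBytes(s) : Encoding.UTF8.GetBytes(s);
    int length = fastWire ? s.Length : payload.Length;
    if (length >= 128)
        throw new NotSupportedException("Sketch handles lengths < 128 only.");

    var result = new byte[1 + payload.Length];
    result[0] = (byte)length; // one-byte VarUInt
    payload.CopyTo(result, 1);
    return result;
}

Console.WriteLine(BitConverter.ToString(EncodeStringBody("h\u00E9llo", fastWire: false)));
// 06-68-C3-A9-6C-6C-6F
Console.WriteLine(BitConverter.ToString(EncodeStringBody("h\u00E9llo", fastWire: true)));
// 05-68-00-E9-00-6C-00-6C-00-6F-00
```

The length prefix means different things in the two modes (bytes vs chars), so a reader has to know the wire mode before it can skip over a string.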
ReferenceHandling
| Value | Tracked objects | Scan pass | Header flags | Wire markers |
|---|---|---|---|---|
| None | Nothing | Skipped | 0x00 | Standard object markers only |
| OnlyId | IId objects only (by ID value) | Partial | 0x02 | ObjectRefFirst(70) + ObjectRef(65) |
| All (default) | All reference types | Full graph walk | 0x06 | ObjectRefFirst(70) + ObjectRef(65) |
Format impact: When enabled, multi-referenced objects are written once with ObjectRefFirst(70) + VarUInt(refCacheIndex) on first encounter, then replaced by ObjectRef(65) + VarUInt(refCacheIndex) on subsequent encounters. Header HasCacheCount flag is set and cache count written.
Interaction with ThrowOnCircularReference (default: true):
- true + ref handling enabled: all objects tracked for cycle detection, throws InvalidOperationException on circular reference
- false + ref handling enabled: only IId types tracked for deduplication, non-IId circular refs silently truncated at MaxDepth
UseMetadata
| Value | Wire markers | Property matching | Overhead |
|---|---|---|---|
| false (default) | FixObj/Object | Positional index only (types must match) | None |
| true | ObjectWithMetadata(69) / ObjectWithMetadataRefFirst(71) | FNV-1a property name hashes | 4 bytes per property per type |
Format impact: When enabled, each type's first occurrence writes [VarUInt hashCount] [FNV-1a hash × N] before properties. Deserializer uses hashes to build source→destination index mapping, enabling cross-type deserialization (different property sets/ordering).
Code branch: context.UseMetadata controls whether ObjectWithMetadata(69) or plain Object(64) markers are used. When false, IsDirectObjectWrite=true allows source-generated writers to bypass WriteObject entirely and inline property writes.
Related: CheckDuplicatePropName (default: true) — throws if FNV-1a hash collision detected between property names of the same type. Disable in production for performance.
UseStringInterning
| Value | Eligible strings | Scan overhead | Wire markers |
|---|---|---|---|
| None | Nothing | None | String(91) / FixStr only |
| Attribute (default) | Properties with [AcStringIntern(true)] | Scans marked properties | StringInternFirst(94) + StringInterned(92) |
| All | All strings within length limits | Scans all strings | StringInternFirst(94) + StringInterned(92) |
Length limits: MinStringInternLength (default: 4) and MaxStringInternLength (default: 64, 0=unlimited). Strings outside this range are always written inline.
Format impact: Interned strings on first occurrence: [StringInternFirst(94)] [VarUInt cacheIndex] [string data]. Subsequent: [StringInterned(92)] [VarUInt cacheIndex] (2 bytes while cacheIndex < 128, vs the full string). Single-occurrence strings are never interned, so unique strings carry no overhead.
Code branch: context.StringInternEligible flag set per-property before WriteString. Scan pass builds a WriteDuplicateEntry[] plan; write pass consumes it via cursor.
MaxDepth
| Value | Behavior |
|---|---|
| 255 (default) | Effectively unlimited nesting |
| 0 | Root level only: nested objects/collections written as Null(76) |
| N | Objects deeper than N levels written as Null(76) |
Format impact: Depth-exceeded values appear as Null(76) in the stream — indistinguishable from actual null values. No special marker.
Code branch: Checked at entry of every object/collection write: if (depth > MaxDepth) { WriteByte(Null); return; }.
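The effect of the depth check can be illustrated with a toy writer. Render is a hypothetical function that prints a readable form instead of bytes, nested object[] arrays stand in for objects and collections, and treating the root as depth 0 is an assumption consistent with the table above.

```csharp
using System;
using System.Linq;

string Render(object value, int depth, int maxDepth)
{
    if (value is object[] items)
    {
        // Same shape as the check quoted above: too deep becomes a plain Null.
        if (depth > maxDepth) return "Null";
        return "[" + string.Join(", ", items.Select(item => Render(item, depth + 1, maxDepth))) + "]";
    }
    return Convert.ToString(value) ?? "Null"; // primitives are not depth-checked
}

var graph = new object[] { 1, new object[] { 2, new object[] { 3 } } };
Console.WriteLine(Render(graph, 0, 0));   // [1, Null]
Console.WriteLine(Render(graph, 0, 1));   // [1, [2, Null]]
Console.WriteLine(Render(graph, 0, 255)); // [1, [2, [3]]]
```

As the output shows, a truncated branch looks exactly like a real null, so a reader cannot tell from the stream that data was dropped.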
UseCompression
| Value | Method | Granularity | Memory |
|---|---|---|---|
| None (default) | No compression | - | - |
| Block | LZ4 single block | Entire payload | Full buffer in memory |
| BlockArray | LZ4 chunked | 64 KB chunks | Streaming-friendly, lower peak memory |
Format impact: Compression is applied post-serialization as a transparent wrapper — the inner wire format is unchanged. Both modes are pure managed C# (WASM-compatible, no native dependencies).
Code branch: Applied in AcBinarySerializer.Serialize() after the serialization context produces the raw buffer: if (UseCompression != None) Lz4.Compress(buffer, mode). Decompression is automatic on deserialize.
PropertyFilter
Optional delegate BinaryPropertyFilter? (default: null). When set, invoked for each property to decide inclusion.
delegate bool BinaryPropertyFilter(in BinaryPropertyFilterContext context);
BinaryPropertyFilterContext fields: DeclaringType, PropertyName, PropertyType, Instance (null during metadata phase), IsMetadataPhase, GetValue() (lazy).
Format impact: Excluded properties are completely absent from the stream: no marker, no placeholder. The deserializer must use UseMetadata=true or apply an identical filter to match property indices correctly.
Code branch: context.HasPropertyFilter checked in ShouldSerializeProperty(). Called twice: once during metadata registration (Instance=null), once during write phase.
PropertyMapper
Optional delegate PropertyMapperDelegate? (default: null) for cross-type deserialization property remapping.
delegate PropertyInfo? PropertyMapperDelegate(PropertyInfo sourceProperty, Type destinationType);
Purpose: Maps properties between different class hierarchies (renamed properties, external DTOs). Result is cached — zero overhead on same-type operations (Deserialize<T>).
WASM Options
| Option | Default | Purpose |
|---|---|---|
| IsWasm | OperatingSystem.IsBrowser() | Auto-detect WASM environment |
| UseStringCaching | follows IsWasm | Cache short strings during deserialization to reduce GC pressure |
| MaxCachedStringLength | 64 | Max string length to cache |
Format impact: None — these are deserialization-only optimizations. When UseStringCaching=true, the deserializer maintains an intern cache for strings ≤ MaxCachedStringLength chars. Disabled automatically when StringInternFirst marker is encountered (interning takes precedence).
Other Options
| Option | Type | Default | Purpose |
|---|---|---|---|
| UseGeneratedCode | bool | true | Use source-generated writers/readers when available |
| InitialBufferCapacity | int | 4096 | Starting buffer size (bytes) for serialization output |
| RemoveOrphanedItems | bool | false | During PopulateMerge: remove destination collection items with no matching source ID |
| UseAsync | bool | false | Async context pool return via ThreadPool. Auto-disabled in WASM and when ReferenceHandling=None |
| MaxContextPoolSize | int | 8 | Max serialization contexts kept in pool |
Presets
| Preset | WireMode | Metadata | StringInterning | RefHandling | MaxDepth | Compression | Other |
|---|---|---|---|---|---|---|---|
| Default | Compact | false | Attribute | All | 255 | None | - |
| FastMode | Compact | false | None | None | 255 | None | No scan pass |
| ShallowCopy | Compact | false | None | None | 0 | None | Root level only |
| WasmOptimized | Compact | false | Attribute | All | 255 | None | +StringCaching |
| WithoutReferenceHandling | Compact | false | Attribute | None | 255 | None | No scan pass |
| WithoutMetadata | Compact | false | Attribute | All | 255 | None | - |
Performance implication of presets:
- Default/WasmOptimized: two-phase (scan + serialize) due to ReferenceHandling=All
- FastMode/ShallowCopy: single-phase (no scan pass) since both interning and refs are disabled
- The scan pass adds ~20-30% overhead; disable it when the object graph is a simple tree
Option Interactions
Key interdependencies that affect which code branches execute:
| Combination | Effect |
|---|---|
| ReferenceHandling=None + UseStringInterning=None | No scan pass: fastest path, single-phase serialization |
| ReferenceHandling=All + UseMetadata=true | Uses ObjectWithMetadataRefFirst(71) marker (combined ref + metadata) |
| UseMetadata=false + UseGeneratedCode=true | IsDirectObjectWrite=true: generated code inlines property writes, bypasses WriteObject |
| UseMetadata=true + PropertyFilter set | Filter invoked twice (metadata phase + write phase); filter results must be stable |
| WireMode=Fast + UseStringInterning!=None | Interned strings still use the fast string path (UTF-16 for first occurrence, VarUInt index for subsequent) |
| UseCompression!=None + any other option | Compression is orthogonal: applied post-serialization, inner format unchanged |