87 KiB

Raw Blame History

AcBinarySerializer — TODO

This page covers planned work for the binary serializer core (format, SGen, options, deserialization context, buffer writer). Work specific to the streaming I/O layer (AsyncPipeReaderInput + AsyncPipeWriterOutput, multi-message wire framing, sliding-window buffer, producer-consumer synchronization) is tracked separately in BINARY_ASYNCPIPE_TODO.md.

Priority legend

P0 blocker · P1 important · P2 nice-to-have · P3 idea

ACCORE-BIN-T-S8P4: Replace JSON-in-Binary request parameters

Priority: P1 · Type: Refactor · Status: Closed (2026-04-26, landed in commits cdd54d3 2026-04-05 + 3b70070 2026-04-06) · Related: ../XCUT/XCUT_ISSUES.md#accore-xcut-i-x8q1 (canonical), AyCode.Services/docs/SIGNALR/SIGNALR_TODO.md

Migrate client→server request parameters from JSON-in-Binary envelope to direct Binary serialization (matching response path). Coordinated change across client, server, and all consuming projects. Do NOT attempt as side-effect of unrelated work.

Acceptance: SignalPostJsonDataMessage<T> replaced by a SignalPostBinaryDataMessage<T> (or equivalent); no JSON round-trip on the wire for request params; benchmarks confirm no regression.

Resolution

What: Length-prefixed, per-parameter binary format introduced via SignalRSerializationHelper.SerializeParametersToBinary / DeserializeParametersFromBinary; further unified into SignalParams (single byte[] carrying packed method parameters with SetParameterValues / GetParameterValues).
Where: AyCode.Services/SignalRs/AcSignalRClientBase.cs, AcWebSignalRHubBase.cs, ISignalParams.cs (server + client dispatch); IAcSignalRHubClient.cs (legacy wrappers).
Equivalent (not literal SignalPostBinaryDataMessage<T>): SignalParams was chosen over a 1:1 binary wrapper class — fewer indirections on the hot path, type-safe pack/unpack, and DataSerializerType field on SignalReceiveParams for response format indication.
Wire impact: No JSON round-trip on the wire for request params; this is a breaking change vs. previous JSON-in-Binary clients/servers (see commit message).
Legacy types: SignalPostJsonMessage, SignalPostJsonDataMessage<T>, SignalPostMessage<T>, ISignalPostMessage<T> all marked [Obsolete] in IAcSignalRHubClient.cs; deletion tracked separately in AyCode.Services/docs/SIGNALR/SIGNALR_TODO.md#accore-sig-t-s3n8 (gated on consumer migration).

ACCORE-BIN-T-Q2N7: Re-evaluate DiscountProductMapping SGen exclusion

Priority: P3 · Type: Investigation · Related: BINARY_ISSUES.md#accore-bin-i-f1w8

Investigate whether the new int Id shadowing pattern can be handled by SGen (via base-class introspection, property-setter lookup on the base) to eliminate the runtime compiled-expression fallback for this entity class.

ACCORE-BIN-T-W9F1: Generate `BinarySerializeTypeMetadata` / `BinaryDeserializeTypeMetadata` at compile time

Priority: P1 · Type: Performance · Related: BINARY_ISSUES.md#accore-bin-i-n6q3

Eliminate the dominant first-call cost (reflection + Expression.Compile in metadata ctor) for SGen types by emitting pre-built metadata from the source generator.

Design outline:

TypeMetadataBase / BinarySerializeTypeMetadata / BinaryDeserializeTypeMetadata get a second constructor that accepts pre-computed values (hashes, MinWriteSize, ComplexPropertyCount, flags, IsIId, IdAccessorType, etc.). No reflection executes in this ctor.
Source generator keeps its existing s_typeNameHash / s_propertyHashes static fields (hot-path access stays static, zero indirection) and passes the same references to the metadata — single source of truth, no duplicate computation.
ModuleInit registers both the writer/reader and the pre-built metadata into a GeneratedMetadataRegistry. GetWrapperSlow consults this registry first, falling back to the reflection-based MetadataFactory for runtime-only types.
Lazy RuntimeInit() pattern for Expression.Compile property accessors:
- TypeMetadataBase gets volatile bool _runtimeInitialized + internal void RuntimeInit() (idempotent, no lock needed).
- GetWrapperSlow calls metadata.RuntimeInit() only when wrapper.GeneratedWriter == null || !Options.UseGeneratedCode — SGen types skip it entirely (they never touch runtime accessors on their own metadata; non-SGen child types have their own metadata and run the factory path normally).
- Hybrid mode stays correct: an SGen type on the SGen path never uses its own property accessors; a non-SGen child type's metadata runs the reflection ctor as today.
volatile guards the flag; multiple contexts may race into RuntimeInit, second run is a no-op.

Thread safety: GlobalMetadataCache is ConcurrentDictionary; generated metadata is registered once at ModuleInit; wrapper construction is per-context and unchanged.

Acceptance:

Cold benchmark: first Serialize<T> of a fresh SGen type shows no reflection / Expression.Compile on the call stack.
Runtime fallback (UseGeneratedCode=false) still produces identical wire output and uses the full metadata accessors.
Deserialize side has parity (same approach for BinaryDeserializeTypeMetadata).
Existing tests pass; wire format unchanged.

ACCORE-BIN-T-T5J8: JIT Tier 1 warmup for generated hot methods

Priority: P2 · Type: Performance · Related: BINARY_ISSUES.md#accore-bin-i-n6q3

After ACCORE-BIN-T-W9F1 lands, JIT of generated WriteProperties / ScanObject / ScanForDuplicates becomes the dominant residual first-call cost for SGen types. Options to evaluate (benchmark before committing):

[MethodImpl(MethodImplOptions.AggressiveOptimization)] on the generated hot methods — skips Tier 0, compiles directly at Tier 1. Simple generator change. Trade-off: larger one-time JIT cost in exchange for eliminating the Tier 0→1 recompile step.
Background prewarm from ModuleInit: Task.Run(() => RuntimeHelpers.PrepareMethod(handle)) for each registered writer/reader method. Parallelizes JIT with app startup. Keep it opt-in (option flag) to avoid surprising consumers with extra startup threads.
ReadyToRun (R2R) in consuming projects' publish config — pre-compiles IL to native at publish time. External to SGen, complementary. Document as a recommended publish setting.
Code chunking (split generated methods exceeding a property threshold into sub-methods, e.g. WriteProperties_Part1 / _Part2) — measure first. Only beneficial for unusually large types (20+ properties / nested collections). Call overhead can offset gains; JIT inliner may already handle reasonably-sized methods well.
try / finally audit on hot path — On .NET 9 (project's minimum target), JIT silently refuses to inline any method containing an EH region (AggressiveInlining is ignored). [.NET 10 partially lifts this for same-module try-finally — see dotnet/runtime#112998, merged 2025-03-20 — but catch, cross-module, and P/Invoke-stub cases stay blocked. Until project's minimum runtime moves to .NET 10, treat EH as an absolute inlining barrier; even after the upgrade, several sub-cases keep the rule.] Audit scope:
- Hand-written bridges: WriteValueGenerated / WriteObjectGenerated / WriteStringGenerated / ScanValueGenerated and any helper called from generated WriteProperties for accidental try/finally / using blocks.
- SGen output template (AcBinarySourceGenerator.cs): generated WriteProperties / ScanObject / ScanForDuplicates / ReadObject / ReadProperties MUST stay straight-line. Future feature additions ([CustomSerializer] / [CustomDeserializer] hooks, OnSerializing / OnDeserialized callbacks, validation attributes, rented-buffer using blocks) are tempting candidates for try/catch/finally — emit them in separate cold helpers, never inline into the generated hot method. A single accidental try block in WriteProperties makes the whole generated method non-inlinable, killing the SGen Root Fast Path benefit.
- Resource cleanup (Pool/ArrayPool/Dispose) belongs in Serialize<T> entry-frame only, not in per-property helpers or generated hot methods. See BINARY_IMPLEMENTATION.md Rule #3 (Inlining barriers) and BINARY_SGEN.md (SGen Output Constraints).
stackalloc size discipline on hot path — On .NET 9, methods containing localloc (any C# stackalloc) historically blocked inlining. Modern .NET allows inlining only for fixed-size stackalloc ≤ 32 bytes outside loops (see dotnet/runtime#7113) — anything larger or loop-nested still blocks. Our typical scratch-buffer patterns (UTF-8 encoding scratch, ArrayPool fallbacks) sit far above 32 bytes (256+), so any helper containing such a stackalloc is non-inlinable. Combined with try/finally for ArrayPool.Return cleanup, the method is doubly non-inlinable on .NET 9. Plan accordingly: keep stackalloc-using helpers as deliberate cold call-frames, not as AggressiveInlining candidates.
Native AOT — out of scope for this TODO; separate architectural decision with deployment-model implications.

Acceptance:

Benchmark a realistic entity graph (≥ 3 referenced child types) and show first-call time within ~10% of steady-state after ACCORE-BIN-T-W9F1 + chosen mitigation(s).
Document which combination is recommended for SignalR hot-path workloads vs. batch serialization.

ACCORE-BIN-T-Z3K8: Replace `IId<T>` interface dependency with convention/attribute-based Id detection

Priority: P1 · Type: Refactor

The binary serializer currently detects Id-tracking properties via the IId<T> interface (AyCode.Interfaces). This couples the serializer to a framework-specific abstraction and forces consumer types to implement the interface for tracking participation. Move to a POCO-friendly detection scheme:

IdDetectionMode.Convention (default) — convention-based; any property named Id is treated as the tracking key. Zero-friction onboarding.
IdDetectionMode.Attribute — explicit; only properties marked with a serializer-native [Id] (or similar) attribute are tracked.
[IgnoreId] attribute — escape hatch in Convention mode to exclude an Id-named property from tracking when the developer wants explicit opt-out.

Implicit contract for Convention mode: within a single class, the Id property must be type-level unique. Whether it semantically represents a primary key or a sequence number is irrelevant — the tracker keys by (Type, Id), so per-type uniqueness is the only requirement. Violating this invariant typically signals a domain-modelling problem, not a serializer bug. Design rationale discussed in conversation 2026-04-27.

Acceptance:

Binary serializer no longer references IId<T> in any execution path (no interface checks, no where T : IId<TKey> constraints in the serializer surface).
Wire format unchanged.
Existing consumers using IId<T>-implementing types still work transparently in Convention mode (their Id property is detected via convention).
New consumers can use plain POCOs with no AyCode.Interfaces dependency.
IdDetectionMode exposed on AcBinaryOptions (or successor options class post-rebrand).
Default mode = Convention.

ACCORE-BIN-T-N7V1: Replace `[JsonIgnore]` dependency with serializer-native ignore attribute

Priority: P2 · Type: Refactor

Property exclusion from binary serialization currently relies on [JsonIgnore] (Newtonsoft.Json). This couples the binary serializer to a third-party JSON library's attribute and is conceptually wrong — a binary serializer should not consult a JSON-specific marker for its exclusion semantics.

Define a serializer-native ignore attribute (working name [BinaryIgnore]; final name TBD pending broader rebrand). For backward compatibility during transition, also continue recognizing [JsonIgnore] with a deprecation note.

Possible cross-cutting consideration: if Toon and other future serializers also need property-exclusion, a single shared attribute (e.g., [SerializerIgnore] in a common abstractions package) may be cleaner than per-serializer attributes. Decide before naming finalizes — this may belong in XCUT_TODO.md rather than purely BINARY scope.

Acceptance:

Native ignore attribute defined in the binary serializer's namespace (or shared abstractions package, pending the cross-cutting decision above).
Both native attribute and [JsonIgnore] recognized during a transitional period; native attribute takes precedence on conflict.
[JsonIgnore] recognition flagged for removal in a future major version (track in a follow-up cleanup TODO once consumer projects have migrated).
No new code dependency on Newtonsoft.Json for property-exclusion logic.

ACCORE-BIN-T-Y6R2: Implement projection serialization phase 1 (runtime path)

Priority: P1 · Type: Feature · Related: ../adr/0001-binary-projection-serialization.md (canonical)

Implement the phase 1 runtime path of source→target projection serialization per ADR 0001. See the ADR for full context, decision rationale, alternatives, consequences, and acceptance criteria.

Sibling rebrand-prep TODOs: ACCORE-BIN-T-Z3K8 (IId migration), ACCORE-BIN-T-N7V1 (JsonIgnore replacement).

ACCORE-BIN-T-K3W7: Rename `BufferWriterChunkSize` to reflect actual semantics

Priority: P3 · Type: Refactor · Breaking: Yes (public option API) · Streaming impact: see BINARY_ASYNCPIPE_TODO.md for the streaming-side companion considerations (chunk-on-wire vs internal-buffer semantics)

The property name BufferWriterChunkSize is misleading: across the three output paths it does NOT consistently represent a "chunk".

Output path	What `BufferWriterChunkSize` actually controls	Wire-format chunk?
`ArrayBinaryOutput` (Byte[] API)	Initial buffer capacity of the internal `byte[]`	No
`BufferWriterBinaryOutput` (IBufferWriter overload)	Internal buffer size — how much data accumulates before `Advance()` + new `GetMemory()` on the underlying writer	No
`AsyncPipeWriterOutput` (streaming)	Both internal buffer and wire-format chunk frame size for chunked framing	Yes (only here)
Receive side (`AsyncPipeReaderInput`)	Initial receive buffer = `BufferWriterChunkSize × 2`	No (just sizing hint)

Only the streaming AsyncPipeWriterOutput path has a wire-format "chunk" concept (chunked framing for length-prefixed segments). On the other 75% of paths the property name reads as if the serializer were segmenting the payload, which is not what happens.

Possible directions (decide before implementing):

Single rename, semantic-neutral — BufferWriterChunkSize → BufferWriterBufferSize or BufferWriterPageSize. Minimal API surface change, single-property semantics preserved. Downside: still slightly off for the streaming path where there IS chunked framing.
Two-property split — InternalBufferSize (universal: how much data accumulates before Advance/Grow) + StreamingChunkSize (only meaningful for AsyncPipeWriterOutput; separate knob, defaults to InternalBufferSize). Cleanest semantics, most ceremony, slightly more options to document.
Single rename, streaming-honest — Keep as BufferWriterChunkSize but document explicitly that on non-streaming paths the value is repurposed as buffer size. Cheapest change (docs only). Downside: doesn't fix the underlying confusion the field name causes.

Pick one before touching code. Option 2 is the most correct but adds API surface; Option 1 is the pragmatic middle.

Affected callers / docs to update on rename:

AcBinarySerializerOptions.cs (definition)
AcBinarySerializer.cs × 3 sites (ArrayBinaryOutput ctor, BufferWriterBinaryOutput ctor, AsyncPipeWriterOutput ctor)
AcBinaryDeserializer.cs × 1 site (receive-side initial capacity derivation)
AsyncPipeReaderInput.cs — XML doc cross-refs
BINARY_WRITERS.md, BINARY_TODO.md (this entry), BINARY_ISSUES.md (line 151 — already lists BufferWriterChunkSize among the struct-mutation issue's affected setters)
Consumer-side: AyCode.Services/SignalRs/AcBinaryHubProtocol.cs ctor mutates _options.BufferWriterChunkSize = options.BufferSize; — see BINARY_ISSUES.md#accore-bin-i-... (struct-mutation context). Coordinate the rename with the struct-mutation fix to avoid two cross-cutting churn waves on the same property.

Acceptance:

Property renamed (or split) per the chosen direction; all internal references updated.
XML docs reflect the actual semantics on each output path (initial capacity / advance threshold / chunk frame size — whichever applies).
Consumer-side usage in AcBinaryHubProtocol updated; if Option 2 is chosen, the protocol uses StreamingChunkSize (the streaming knob), not the universal one.
Wire format unchanged. Default values unchanged (65535 / equivalent).
Migration note in CHANGELOG / release notes since this is a breaking change to AcBinarySerializerOptions.

ACCORE-BIN-T-M4D2: Add `ReadOnlyMemory<byte>` / `Memory<byte>` deserialize overloads

Priority: P3 · Type: Feature

The public AcBinaryDeserializer.Deserialize surface accepts byte[] (with optional offset/length) and ReadOnlySequence<byte>, but not ReadOnlyMemory<byte> / Memory<byte>. Consumers that hold a ReadOnlyMemory<byte> (cached payloads, message-broker frames, in-memory pipe slices) must call .ToArray() to round-trip through byte[] — unnecessary copy + GC alloc.

Implementation:

Deserialize<T>(ReadOnlyMemory<byte> data, AcBinarySerializerOptions options) and the non-generic Type-based variant.
Body: MemoryMarshal.TryGetArray(data, out var seg) → array-backed path delegates to Deserialize<T>(seg.Array!, seg.Offset, seg.Count, options) (zero-copy). Non-array-backed fallback (rare — custom MemoryManager<T> with native memory) copies into a pooled byte[].
Memory<byte> overload trivially delegates to the ReadOnlyMemory<byte> one (Memory<byte> is implicitly convertible).
No new input-strategy struct needed — reuses existing ArrayBinaryInput.

Acceptance:

Both overloads compile and pass round-trip tests against byte[]-equivalent input.
Array-backed path measurably zero-alloc (BenchmarkDotNet allocation diagnoser).
Non-array-backed path documented as fallback (separate using var pooled = MemoryPool<byte>.Shared.Rent(...) style copy).
API doc-strings cross-reference the existing byte[] and ReadOnlySequence<byte> overloads.

ACCORE-BIN-T-S7X3: Add `ReadOnlySpan<byte>` deserialize overload

Priority: P2 · Type: Feature · Related: ACCORE-BIN-T-M4D2

The MemoryPack-style Deserialize<T>(ReadOnlySpan<byte>) API enables direct deserialization from stack-allocated buffers (stackalloc byte[256]), pinned native memory (fixed blocks), and ReadOnlyMemory<byte>.Span slices without round-tripping through a heap-allocated byte[]. The current AcBinary surface lacks this entry point.

Design tension: the existing IBinaryInputBase.Initialize(out byte[] buffer, ...) contract returns a byte[] — a ReadOnlySpan<byte> cannot be stored in a regular struct field, only in a ref struct field. Two implementation paths to evaluate:

ref struct SpanBinaryInput + interface bump to support ref byte buffer / int length fields. Pure zero-copy from any span. Cost: BinaryDeserializationContext<TInput> and IBinaryInputBase need a parallel ref-struct-friendly track (the existing pooled context cannot hold a ref struct). Major surgery on the deser core.
MemoryMarshal.CreateReadOnlySpanFromNullTerminated-style hack — accept ReadOnlySpan<byte>, use Unsafe.AsRef/MemoryMarshal.GetReference to obtain a ref byte, then copy into a pooled byte[] before deserialization. Not zero-copy, defeats the purpose. Reject.
Pinned-buffer trampoline — accept ReadOnlySpan<byte>, allocate a Memory<byte> view via a MemoryManager<byte>-like wrapper, delegate to ReadOnlyMemory<byte> overload. Awkward, allocations per call. Reject.

Recommendation: option (1) is the only correct path, but it's a substantial refactor — measure first whether real consumer demand justifies the surgery. The current byte[]-based pool-pattern outperforms MemoryPack on the dominant use-cases per existing benchmarks; this overload addresses an API-surface gap, not a perf gap.

Acceptance:

Deserialize<T>(ReadOnlySpan<byte> data, AcBinarySerializerOptions options) compiles and round-trips against byte[]-equivalent input.
Zero-alloc path verified for stackalloc-source spans (BenchmarkDotNet allocation diagnoser).
IBinaryInputBase (or successor interface) refactor preserves backward compatibility for existing ArrayBinaryInput / SequenceBinaryInput / AsyncPipeReaderInputAdapter consumers.
Doc-strings cross-reference the byte[] / ReadOnlyMemory<byte> (ACCORE-BIN-T-M4D2) / ReadOnlySequence<byte> overloads with use-case guidance.

ACCORE-BIN-T-T8K3: Add `SerializeAsync(Stream, T)` async overloads with mode-driven output strategy

Priority: P1 · Type: Feature · Related: ACCORE-BIN-T-N9G6 (Type-based coordination)

The mainstream serializer ecosystem (System.Text.Json, MessagePack, Newtonsoft.Json, MemoryPack) all expose SerializeAsync(Stream, T) as a primary entry point — async file I/O, network response body, log streaming. AcBinary's public API surface MUST include this overload regardless of what we do internally; consumers expect a Stream parameter and don't navigate PipeWriter.Create(stream) workarounds. Market-entry-blocking otherwise.

Mode-driven output strategy — three lanes for three workload shapes

AcBinary already models the three output strategies in BinaryProtocolMode (AyCode.Services/SignalRs/BinaryProtocolMode.cs) for the SignalR side. The same three-lane shape applies to the public SerializeAsync(Stream) API. Promote the concept to AcBinary core scope (e.g. AcBinaryOutputMode in AyCode.Core/Serializers/Binaries/) and let the SignalR BinaryProtocolMode either alias it or migrate to it. Migration timing: the existing BinaryProtocolMode keeps shipping until the new public API is stabilized; both names live for one major version, then BinaryProtocolMode becomes a using-alias.

Mode	Output strategy	Peak memory	Pipeline parallelism	Use when
`Bytes` (default)	`Serialize(T) → byte[]` + `stream.WriteAsync(bytes)`	Full payload in `byte[]` (pooled)	No	Typical payloads (<10 MB), throughput-focus
`Segment`	`BufferWriterBinaryOutput` → `PipeWriter`, single closing flush	PipeWriter pause-threshold-bounded (~64 KB Kestrel default)	No	Mid-size payloads, zero-copy desired
`AsyncSegment`	`SerializeChunked(PipeWriter)`, per-chunk async flush	Chunk-size-bounded (~8 KB at default `BufferWriterChunkSize`)	Yes (on parallel-capable PipeWriter — Kestrel / `Pipe`)	Very large payloads (>10 MB), memory-tight hosts, parallel-capable transport

Honest performance positioning vs. MemoryPack — three real axes

MemoryPack's SerializeAsync(Stream) is pseudo-streaming — serializes the entire payload into a pool-allocated linked-list buffer first (ReusableLinkedArrayBufferWriter), then writes the completed buffer to the stream in a single closing fence. Peak memory ≈ payload size; no pipeline parallelism. AcBinary's Bytes mode is architecturally similar (single pooled contiguous byte[] vs. MemoryPack's linked-list) — comparable peak-memory cost, often faster on the wire due to one contiguous WriteAsync call.

AcBinary's AsyncSegment mode is architecturally different in three real ways MemoryPack cannot match:

Axis	`Bytes` mode (default)	`AsyncSegment` mode	MemoryPack `SerializeAsync`
Heap allocation per call	Pooled `byte[]` rent (peak ≈ payload size)	Truly zero — `ArrayPool` + pooled context + `MemoryMarshal.TryGetArray` direct-buffer-write into the transport's own `byte[]`	Pool-allocated linked-list buffer per call (peak ≈ payload size)
Peak managed memory	≈ payload size	≈ chunk size (`BufferWriterChunkSize`, e.g. 4-8 KB)	≈ payload size
GC pressure	Touches GC pool on every call	Never touches GC for the serialize itself	Touches GC pool on every call
Pipeline parallelism	No	Yes on parallel-capable PipeWriter (Kestrel transport, `new Pipe()`)	No
GB-scale payload	OOM risk on memory-tight hosts	Works	OOM risk

The AsyncSegment zero-alloc claim is literal, not "almost zero": AsyncPipeWriterOutput.AcquireChunk calls _pipeWriter.GetMemory(chunkSize) and uses MemoryMarshal.TryGetArray(memory, out segment) to obtain the transport's own internal byte[] — the serializer writes directly into it. With chunkSize aligned to the transport's internal buffer (e.g. NamedPipe-server pipe-buffer-size), one chunk is one kernel-level transfer; no managed-side double-fragmentation.

Throughput nuance — `AsyncSegment` cost on Stream-backed transports

AsyncSegment IS slightly slower than Bytes on StreamPipeWriter-backed transports (NamedPipe / FileStream / NetworkStream), but not for the reason that initially seems obvious:

The cost is NOT "managed-side double-fragmentation on top of OS-level fragmentation" — that's not what happens. MemoryMarshal.TryGetArray zero-copy direct-buffer-writes mean the managed chunking is the same chunking the kernel does anyway, not redundant.
The cost IS the per-chunk async-await round-trip (SyncAwaitFlush(_lastFlush) blocks until the kernel acknowledges the write), forced sequential by the StreamPipeWriter._tailMemory reset race (ACCORE-BIN-I-...). N async cycles vs 1 in Bytes mode.
Empirically the gap is roughly 1.2-1.5x on NamedPipe — not 2-5x. The dominant cost on these transports is the transport itself (Windows IRP / Linux FIFO syscall overhead), independent of the serializer mode.

When AsyncSegment wins outright:

GC-sensitive hot-paths (server hubs, real-time game tick loops, mobile UI thread, embedded targets): zero-alloc + zero-GC-pressure beats a 1.2x throughput edge every time.
Memory-tight hosts (mobile, WASM, container-trimmed, embedded): chunk-bounded peak memory is the only option.
GB-scale payloads: Bytes OOMs; AsyncSegment works.
Kestrel transport / parallel-capable Pipe: pipeline parallelism makes AsyncSegment faster than Bytes for medium-to-large payloads.

When Bytes wins outright:

Tipikus NuGet workload (small-to-medium payload, throughput priority, GC-tolerant): one async cycle vs N is the simpler, faster path.
MemoryStream (in-memory): one large byte[] copy decisively beats N managed chunks.

Marketing claim — three-way honest comparison

"AcBinary offers a real choice. Bytes mode for typical throughput-priority workloads (matches MemoryPack's pseudo-streaming, often faster on the wire). AsyncSegment mode for the workloads MemoryPack cannot serve: zero-alloc serialize for GC-sensitive hot-paths, chunk-bounded peak memory for tight-budget hosts, GB-scale payloads, and pipeline parallelism on parallel-capable transports. You pick the mode; MemoryPack picks for you."

This is honest — does not overclaim universal speed, does not hide the small AsyncSegment cost on Stream-backed transports, AND clearly surfaces the three differentiator axes (alloc / memory / parallelism) where AcBinary architecturally beats MemoryPack.

Implementation outline:

New enum AcBinaryOutputMode { Bytes = 0, Segment = 1, AsyncSegment = 2 } in AyCode.Core/Serializers/Binaries/. Default Bytes.
New mode field on AcBinarySerializerOptions: AcBinaryOutputMode OutputMode { get; set; } = AcBinaryOutputMode.Bytes;. (Note: subject to ACCORE-BIN-I-L8N5 thread-safety treatment — defensive copy / immutable refactor coordination.)
public static ValueTask SerializeAsync<T>(T value, Stream stream, AcBinarySerializerOptions? options = null, bool leaveOpen = false, CancellationToken ct = default):
- Switch on options.OutputMode:
  - Bytes → var bytes = Serialize(value, options); await stream.WriteAsync(bytes, ct); ArrayPool.Return(bytes);
  - Segment → var pw = PipeWriter.Create(stream, new(leaveOpen: leaveOpen)); Serialize(value, pw, options); await pw.CompleteAsync();
  - AsyncSegment → var pw = PipeWriter.Create(stream, new(leaveOpen: leaveOpen)); SerializeChunked(value, pw, options); await pw.CompleteAsync();
public static ValueTask SerializeAsync(object? value, Type type, Stream stream, ...) — non-generic, same dispatch (coordinated with ACCORE-BIN-T-N9G6).
leaveOpen parameter standard for stream-async serializers (System.Text.Json, MessagePack convention).
The Bytes mode uses a pooled byte[] from ArrayBinaryOutput to keep alloc cost amortized.

SignalR migration coordination: the existing BinaryProtocolMode enum (in AyCode.Services) keeps shipping unchanged until the new public API is stabilized. After stabilization, BinaryProtocolMode becomes a deprecated alias of AcBinaryOutputMode, eventually removed in a major-bump. No SignalR-side churn during this TODO's implementation.

Acceptance:

SerializeAsync<T> round-trips against Deserialize<T>(byte[]) via MemoryStream in all three modes.
Cancellation propagates correctly (OperationCanceledException on cancelled token mid-stream).
Throughput matrix benchmark: 4 transports (MemoryStream, FileStream, NamedPipeStream, NetworkStream) × 3 modes × 3 payload sizes (small ~1 KB / medium ~100 KB / large ~10 MB). Results documented in Test_Benchmark_Results/Benchmark/SerializeAsync_Stream_Modes.LLM (or similar) and surfaced as a doc-string table for consumer guidance.
Memory-bounded benchmark: 100 MB payload to FileStream in AsyncSegment mode → peak managed-heap delta ≤ 1 MB throughout. Same payload in Bytes mode → peak ~100 MB (expected, documented).
API doc-string contains a "When to use which mode?" decision matrix; explicitly compares with MemoryPack's pseudo-streaming.
leaveOpen parameter behaves per the System.Text.Json / MessagePack convention across all three modes.

ACCORE-BIN-T-D7K4: Add `DeserializeAsync(Stream, T)` async overloads with mode-driven input strategy

Priority: P1 · Type: Feature · Related: ACCORE-BIN-T-T8K3 (companion write-side overload), ACCORE-BIN-T-N9G6 (non-generic Type-based dispatch)

Companion to T8K3 on the receive side. The mainstream serializer ecosystem (System.Text.Json, MessagePack, Newtonsoft.Json, MemoryPack) all expose DeserializeAsync<T>(Stream) — the symmetric counterpart of SerializeAsync(Stream, T). AcBinary's public API surface MUST include this overload for parity; consumers expect a Stream parameter for receive paths (file load, HTTP response body, network stream) and don't navigate PipeReader.Create(stream) workarounds. Market-entry-blocking otherwise.

Implementation: zero new `IBinaryInputBase` impl needed

The existing receive-side primitives cover the full strategy space via BCL PipeReader.Create(stream):

Mode	Input strategy	Peak memory	Pipeline parallelism	Use when
`Bytes` (default)	`await stream.CopyToAsync(MemoryStream)` → `Deserialize<T>(byte[])` (existing overload)	Full payload as `byte[]` (pooled)	No	Typical payloads (<10 MB), throughput-focus
`Segment`	`await PipeReader.Create(stream).ReadAsync()` → `Deserialize<T>(ReadOnlySequence<byte>)` (existing overload)	PipeReader pause-threshold-bounded (~64 KB)	No	Mid-size payloads, no full byte[] alloc desired
`AsyncSegment`	`AsyncPipeReaderInput` + `DrainFromAsync(PipeReader.Create(stream))` + `Deserialize<T>(input)` (existing overload)	Chunk-size-bounded (~8 KB)	Yes (producer drain Task in parallel with deser Task)	Very large payloads (>10 MB), memory-tight hosts

The AcBinaryOutputMode enum (introduced by T8K3) is symmetric — it controls deser-input strategy as well. The same enum value picks the matching read path. No new IBinaryInputBase implementation needed — the trio of existing inputs (ArrayBinaryInput, SequenceBinaryInput, AsyncPipeReaderInput) already cover all three modes; the new overload is a thin shim that wraps the Stream and routes to the right existing overload.

Public API shape

public static ValueTask<T?> DeserializeAsync<T>(
    Stream stream,
    AcBinarySerializerOptions? options = null,
    bool leaveOpen = false,
    CancellationToken ct = default);

// Non-generic Type-based variant (coordinated with N9G6):
public static ValueTask<object?> DeserializeAsync(
    Stream stream,
    Type targetType,
    AcBinarySerializerOptions? options = null,
    bool leaveOpen = false,
    CancellationToken ct = default);

Implementation outline (per mode)

// Bytes mode (default — simplest path, sub-LOH-friendly fast path):
public static async ValueTask<T?> DeserializeAsync_Bytes<T>(Stream stream, ..., CancellationToken ct)
{
    var rented = ArrayPool<byte>.Shared.Rent((int)Math.Min(stream.CanSeek ? stream.Length : 4096, int.MaxValue));
    try
    {
        var totalRead = 0;
        int read;
        while ((read = await stream.ReadAsync(rented.AsMemory(totalRead), ct)) > 0)
        {
            totalRead += read;
            if (totalRead == rented.Length) { /* grow rented */ }
        }
        return Deserialize<T>(rented, 0, totalRead, options);
    }
    finally { ArrayPool<byte>.Shared.Return(rented); }
}

// Segment mode (PipeReader.Create wrapping, then drain to ReadOnlySequence):
public static async ValueTask<T?> DeserializeAsync_Segment<T>(Stream stream, ..., CancellationToken ct)
{
    var pipeReader = PipeReader.Create(stream, new(leaveOpen: leaveOpen));
    var result = await pipeReader.ReadAtLeastAsync(int.MaxValue, ct);   // drain whole stream
    var seq = result.Buffer;
    var obj = Deserialize<T>(seq, options);
    pipeReader.AdvanceTo(seq.End);
    await pipeReader.CompleteAsync();
    return obj;
}

// AsyncSegment mode (chunked streaming pipeline, parallel drain + deser):
public static async ValueTask<T?> DeserializeAsync_AsyncSegment<T>(Stream stream, ..., CancellationToken ct)
{
    using var input = new AsyncPipeReaderInput(options.BufferWriterChunkSize * 2, multiMessage: false);
    var pipeReader = PipeReader.Create(stream, new(leaveOpen: leaveOpen));
    var deserTask = Task.Run(() => Deserialize<T>(input, options), ct);
    await input.DrainFromAsync(pipeReader, ct);
    await pipeReader.CompleteAsync();
    return await deserTask;
}

Honest performance positioning

Symmetric to T8K3's analysis:

Bytes mode: simplest, single contiguous byte[] (pooled) → Deserialize<T>(byte[]). Comparable to MemoryPack's DeserializeAsync (which does similar full-buffer-then-deser). Best for typical workloads.
Segment mode: zero-copy from PipeReader's natural ReadOnlySequence<byte> — no extra byte[] allocation. Best for mid-size payloads where allocation matters but pipeline overlap doesn't.
AsyncSegment mode: producer-drain Task and consumer-deser Task in parallel via AsyncPipeReaderInput. Wall-clock = max(network-drain, deser-CPU) + small overlap-cost. Best for large payloads + slow transports (network, mobile, satellite — where transit dominates and overlap pays).

Acceptance

DeserializeAsync<T> round-trips against SerializeAsync(Stream, T) (T8K3) via MemoryStream in all three modes.
Cancellation propagates correctly (OperationCanceledException on cancelled token mid-stream); partial-buffer state cleaned up; pooled byte[] returned even on cancellation.
Throughput matrix benchmark (mirror of T8K3): 4 transports (MemoryStream, FileStream, NamedPipeStream, NetworkStream) × 3 modes × 3 payload sizes. Results documented in Test_Benchmark_Results/Benchmark/DeserializeAsync_Stream_Modes.LLM.
Memory-bounded benchmark: 100 MB payload from FileStream in AsyncSegment mode → peak managed-heap delta ≤ 1 MB throughout. Same payload in Bytes mode → peak ~100 MB (expected, documented).
API doc-string contains a "When to use which mode?" decision matrix; cross-references T8K3's symmetric write-side guidance.
leaveOpen parameter behaves per the System.Text.Json / MessagePack convention across all three modes.

ACCORE-BIN-T-N9G6: Add non-generic `Type`-based `Serialize(object, Type, ...)` overloads

Priority: P2 · Type: Feature · Status: Closed (2026-05-04) · Related: ACCORE-BIN-T-T8K3

Resolution

Added in AcBinarySerializer.cs:

Serialize(object?, Type, opts) → byte[]
Serialize(object?, Type, IBufferWriter<byte>, opts) → int
SerializeChunked(object?, Type, PipeWriter, opts) → int
SerializeChunkedFramed(object?, Type, PipeWriter, opts) → int

Added in AcBinaryDeserializer.cs:

DeserializeFromPipeReaderAsync<T>(PipeReader, opts, ct) → Task<T?>
DeserializeFromPipeReaderAsync(PipeReader, Type, opts, ct) → Task<object?>

The Deserialize(byte[], Type, opts) / Deserialize(ReadOnlySequence<byte>, Type, opts) / Deserialize(AsyncPipeReaderInput, Type, opts) overloads already existed.

Consumed by ASP.NET Core MVC formatter package (AyCode.Services/Mvc/) — AcBinaryInputFormatter, AcBinaryOutputFormatter, AddAcBinaryFormatters extension. Media type: application/vnd.acbinary.

Plugin frameworks, ASP.NET ModelBinding, DI middleware, and DataContractSerializer-style "generic-API container" use-cases need to serialize an object whose type is known only at runtime. Current AcBinary surface forces a reflection trampoline through the generic Serialize<T>:

// Today's workaround (slow + noisy):
typeof(AcBinarySerializer).GetMethod("Serialize", new[] { type, typeof(AcBinarySerializerOptions) })
    .MakeGenericMethod(type).Invoke(null, new[] { value, options });

Implementation outline:

public static byte[] Serialize(object? value, Type type, AcBinarySerializerOptions? options = null)
public static int Serialize(object? value, Type type, IBufferWriter<byte> writer, AcBinarySerializerOptions? options = null)
public static int SerializeChunked(object? value, Type type, PipeWriter writer, AcBinarySerializerOptions? options = null) and Pipe overload
public static int SerializeChunkedFramed(object? value, Type type, PipeWriter writer, AcBinarySerializerOptions? options = null) and Pipe overload
public static ValueTask SerializeAsync(object? value, Type type, Stream stream, ...) — coordinated with ACCORE-BIN-T-T8K3
Internal dispatch: value.GetType() is the runtime type; the Type type parameter constrains the declared type for polymorphism handling (ObjectWithTypeName write decision).

Acceptance:

All non-generic overloads round-trip via the generic deserializer's Deserialize(byte[], Type) overload.
Plugin-style scenario: serialize IList<dynamic> of mixed-type elements → all elements correctly typed in the wire output.
API doc-strings call out the performance characteristics (slightly slower than generic due to runtime Type lookup but without the reflection trampoline cost).

ACCORE-BIN-T-R4P2: Expose low-level `ref Writer`-style API for custom formatters

Priority: P3 · Type: Feature

The MemoryPack-style Serialize<T>(ref MemoryPackWriter writer, in T value) low-level API enables:

Custom formatters that compose write primitives without the full Serialize entry-point overhead.
Nested-into-existing-stream scenarios where the caller already owns a writer-style cursor.
Test harnesses that exercise specific wire-format paths in isolation.

Today's BufferWriterBinaryOutput standalone-mode partly fills this gap — exposing WriteByte, WriteVarUInt, WriteStringUtf8, etc. — but it is not a ref struct, not a documented low-level public API for external custom formatters, and the relationship with BinarySerializationContext<TOutput> is unclear from the consumer's perspective.

Design tension (decide before implementing):

Promote BufferWriterBinaryOutput to documented public surface — add doc, examples, supported usage patterns. Cheapest, but the standalone-mode is currently a side-feature, not a primary API; documenting it commits to its current shape.
New ref struct AcBinaryWriter wrapper around BufferWriterBinaryOutput (or a dedicated impl) — explicit "this is the low-level writer" signal. More API surface but clearer mental model. Aesthetic alignment with MemoryPack.
Skip entirely — the IBufferWriter<byte> overload is already lower-level than most consumers need; custom formatters can write to an ArrayBufferWriter<byte> and use IBufferWriter-style primitives. This is what BufferWriterBinaryOutput already does internally.

Recommendation: option 3 is honest — the existing IBufferWriter<byte> overload covers the use case, and adding a ref struct AcBinaryWriter is mostly aesthetic alignment with MemoryPack. Re-evaluate when there's a concrete custom-formatter request that the current API can't accommodate.

Acceptance (if implemented):

AcBinaryWriter ref struct (or equivalent) compiles, supports the same write primitives as BufferWriterBinaryOutput standalone-mode.
At least one example custom formatter ships in tests (e.g., a Vector3 struct formatter).
Doc-string clearly distinguishes when to use the low-level writer vs. the high-level Serialize<T> entry-point.

ACCORE-BIN-T-U6Y8: Attribute-driven polymorphism via `[AcBinaryUnion]` + SGen (opt-in, AOT-friendly)

Priority: P1 (if AOT target required) / P2 (non-AOT only) · Type: Feature

Design philosophy alignment: AcBinary's market positioning is "JSON-style flexibility with MessagePack-class speed" — attributes are opt-in optimization, never required. The runtime polymorphism path (AQN-based, today's default) stays the default and continues to work for arbitrary unattributed types. This TODO adds a fast/AOT path alongside it, never replaces it.

AcBinary today handles polymorphism at runtime: the wire writes ObjectWithTypeName(72) + AQN string, and the deserializer calls Type.GetType(aqn) to resolve. This is flexible (no upfront declaration), but has three significant drawbacks for some consumers:

AOT-incompatible — Type.GetType(AQN) requires reflection metadata that the Native AOT trimmer strips by default. The runtime polymorphism path does not work at all under Native AOT. Hard blocker for AOT-targeting consumers (Blazor WASM, MAUI mobile, container-trimmed deployments).
Slower — AQN string parse + reflection lookup vs. a closed switch (tag) in code-gen.
Larger wire format — full AQN string (often 100+ bytes) vs. a single-byte tag.

Design — three coordinated pieces:

1. New 5th bool parameter on `[AcBinarySerializable]`: `EnablePolymorphismFeature`

Mirrors the existing EnableMetadataFeature / EnableIdTrackingFeature / EnableRefHandlingFeature / EnableInternStringFeature pattern. Per-type opt-out / opt-in via attribute parameter.

public AcBinarySerializableAttribute(
    bool enableMetadataFeature,
    bool enableIdTrackingFeature,
    bool enableRefHandlingFeature,
    bool enableInternStringFeature,
    bool enablePolymorphismFeature)   // ← ÚJ, default: true

Three behavior modes per type:

EnablePolymorphismFeature = false → disabled. SGen never emits polymorphism dispatch for this type; runtime path also short-circuits — runtime type ≠ declared type is silently treated as declared (or throws, decision TBD). Use for hot-path closed types where polymorphism is impossible-by-design and the perf/AOT cost is unwanted.
EnablePolymorphismFeature = true (default), no [AcBinaryUnion] → runtime options control. Behaves per AcBinarySerializerOptions.PolymorphismMode (Runtime/AQN today). This preserves the JSON-style flexibility for unattributed bases.
EnablePolymorphismFeature = true + [AcBinaryUnion(...)] declared → union-switch dispatch. SGen emits a closed switch (tag) dispatch using the declared subtype set. Fast + AOT-friendly. Overrides the options-level default for this type.

2. New `[AcBinaryUnion(byte tag, Type subtype)]` attribute

Multiple instances per base class / interface declare the closed polymorphism set:

[AcBinarySerializable]   // EnablePolymorphismFeature defaults to true
[AcBinaryUnion(0, typeof(Cat))]
[AcBinaryUnion(1, typeof(Dog))]
public abstract partial class Animal { ... }

SGen detects [AcBinaryUnion] on abstract / base type → emits the switch-based write/read dispatch instead of falling through to runtime AQN.

3. New `PolymorphismMode` enum on `AcBinarySerializerOptions`

Options-level default for unattributed polymorphism (i.e. the case where EnablePolymorphismFeature = true but no [AcBinaryUnion] is declared):

Runtime (today's default) — AQN-based. Flexible, AOT-incompatible.
Throw — fail fast on any polymorphic write that lacks a [AcBinaryUnion] attribute. AOT-friendly diagnostic mode for migration scenarios.

Note: there is no UnionAttribute-only mode — declaration is per-type via the attribute, not options-global. The options-level mode only governs the fallback when no [AcBinaryUnion] is present.

Wire-format addition:

New marker (e.g. UnionTagBase = <TBD>) + [byte tag][inner Object], parallel to existing ObjectWithTypeName(72). Slot number to be assigned avoiding clashes with existing 64–134 / 192–255 ranges.

Implementation outline:

AcBinarySerializableAttribute — new ctor parameter enablePolymorphismFeature, all existing ctors default it to true (backward compatible).
AcBinaryUnionAttribute — new attribute, AttributeUsage(AttributeTargets.Class | Interface, AllowMultiple = true).
Source generator — emit WriteUnion<TBase>(value, ctx, depth) and ReadUnion<TBase>(ctx, depth) static methods on the union-base type's generated writer/reader. Skipped entirely when EnablePolymorphismFeature = false.
Wire-format new marker + [byte tag][inner Object] body.
Runtime path: WriteValueNonPrimitive checks the wrapper's PolymorphismFeatureEnabled flag; when false, skips the value.GetType() != declaredType polymorphism branch entirely.

Acceptance:

EnablePolymorphismFeature = false: SGen-emitted dispatch contains zero is-typeof / GetType branches; runtime path also short-circuits. Verify in JIT disassembly.
EnablePolymorphismFeature = true, no union: runtime AQN polymorphism works as today (full backward compat); preserved JSON-style flexibility for unattributed bases.
EnablePolymorphismFeature = true + [AcBinaryUnion]: AOT-test (Native AOT publish) compiles and round-trips a polymorphic graph — Type.GetType() is never invoked on this path.
Benchmark: union-switch polymorphism measurably faster than AQN polymorphism on deser side (typed switch vs. reflection lookup).
Wire format documented in BINARY_FORMAT.md; BINARY_FEATURES.md cross-references the attribute pattern; BINARY_OPTIONS.md documents PolymorphismMode. AcBinarySerializableAttribute doc-string explains all three behavior modes.

ACCORE-BIN-T-B7H4: Implement `AcBinarySerializerOptions` thread-safety fix

Priority: P2 · Type: Refactor · Related: BINARY_ISSUES.md#accore-bin-i-l8n5 (canonical issue)

The latent thread-safety problem documented in ACCORE-BIN-I-L8N5 — mutable set; properties on AcBinarySerializerOptions shared across concurrent serialize/deserialize calls — needs a fix before AcBinary ships as a NuGet package. The package cannot constrain how consumers scope their options instances; defensive contract is needed in the serializer itself.

Three candidate fix directions (decide before implementing):

Defensive copy on ingress — add AcBinarySerializerOptions Clone() method (member-wise copy). Every API entry point that retains an options instance clones it on entry. External mutation to the original becomes invisible to the holder.
- Pro: non-breaking. Existing consumer code unchanged. No major version bump required.
- Pro: API surface change limited to one new Clone() method.
- Con: per-call clone overhead (small, but non-zero). Cache keyed on options-identity becomes invalid for downstream code using reference equality.
- Con: doesn't fix the underlying mutability — internal code can still race-mutate the cloned snapshot if a method retains both the snapshot and modifies it concurrently.
Immutable record refactor — set; → init; on all configuration properties. Mutation requires with-expression which produces a new instance.
- Pro: type-system-strong guarantee. Race becomes a compile error, not a runtime corruption risk.
- Pro: zero runtime overhead (init-only is compile-time check; record class semantics are unchanged at runtime).
- Con: breaking change for any consumer doing opts.UseGeneratedCode = false after construction. Major version bump.
- Con: source-generator coordination needed if SGen emits options-builder code that mutates properties.
Read-only flag pattern (à la JsonSerializerOptions.MakeReadOnly()) — mutable by default, holder calls MakeReadOnly() on entry; subsequent property setters throw InvalidOperationException.
- Pro: BCL-precedent — Microsoft adopted it for JsonSerializerOptions in .NET 7 (dotnet/runtime#74431) for exactly this problem. Familiar pattern for consumers.
- Pro: minimal API surface change (one new method + IsReadOnly flag property).
- Pro: per-call overhead = single bool check per setter call. Negligible.
- Con: opt-in by the holder — if a custom consumer-side wrapper forgets to call MakeReadOnly(), the safety hole stays open for that wrapper's clients. Documentation-driven safety, not type-system-driven.
- Con: bypasses static-analysis tooling (the setter signature stays public; the throw is runtime). IDE doesn't surface "this property is currently read-only" in autocomplete.

Recommendation: Option 3 (MakeReadOnly pattern) is the BCL-precedent, lowest-friction migration path. Microsoft adopted it for JsonSerializerOptions in .NET 7 to solve the same problem; AcBinary should follow the same pattern for consistency with consumers' mental model and zero migration cost.

Coordination with the existing AcBinaryHubProtocol setter side-effect (the second risk surface in ACCORE-BIN-I-L8N5): the protocol ctor currently mutates the caller-provided options reference (_options.BufferWriterChunkSize = options.BufferSize). After the fix:

Option 1 (Clone): ctor mutates the cloned snapshot → no side-channel to the caller. Fix transparent.
Option 2 (Immutable): ctor cannot mutate; needs to construct a new options via with-expression. Breaking change in the ctor's options-handling.
Option 3 (MakeReadOnly): ctor mutates before calling MakeReadOnly() — same as today, but explicit "frozen" point afterwards. Caller-side mutation post-ctor is now a runtime throw.

Implementation outline (Option 3 path):

AcBinarySerializerOptions.IsReadOnly { get; } — public bool property.
AcBinarySerializerOptions.MakeReadOnly() — sets the flag; idempotent (no-op if already set).
All set; accessors guard: if (IsReadOnly) throw new InvalidOperationException("AcBinarySerializerOptions has been made read-only and can no longer be mutated. Construct a new options instance instead.");.
AcBinarySerializer.Serialize<T> entry (and all sibling entries — Deserialize<T>, SerializeChunked, etc.): options.MakeReadOnly() before any property read.
AcBinaryHubProtocol ctor: complete the BufferWriterChunkSize mutation before calling options.MakeReadOnly(). After ctor returns, the options instance is frozen for that protocol's lifetime.
Doc-string update on AcBinarySerializerOptions class header: explicit "thread-safety contract" section explaining the freeze-on-first-use semantics.

Acceptance:

Concurrent stress test (16 threads × 1000 iterations) on a shared AcBinarySerializerOptions instance with property-mutation-attempts mid-iteration — all mutations after MakeReadOnly() throw InvalidOperationException; no silent corruption observed.
Existing tests pass unchanged (the MakeReadOnly is opt-in for the serializer entries; tests that build options + use them once continue to work transparently).
BINARY_ISSUES.md#accore-bin-i-l8n5 Status updated to Closed (YYYY-MM-DD) with a ### Resolution sub-section pointing to this TODO + the implementing commit.
Doc-string on AcBinarySerializerOptions documents the freeze-on-first-use contract; BINARY_FEATURES.md or BINARY_OPTIONS.md cross-references the BCL-precedent (JsonSerializerOptions.MakeReadOnly).

ACCORE-BIN-T-F8N3: Switch source-generator type-name hashing from simple-name to fully-qualified-name

Priority: P3 · Type: Refactor · Related: ACCORE-BIN-T-I3P8 (override mechanism for residual collisions)

The source generator's ComputeFnvHash(typeSymbol.Name) uses the simple name only (e.g. "User", not "MyApp.A.User"). Cross-namespace types with the same simple name silently collide on s_typeNameHash. The hash is currently only consumed by the WireMode=Metadata inline metadata-write path (cross-version property compat) — the framework explicitly does NOT add wire-format type-id (per CLAUDE.md Rule #7: type-dispatch is consumer responsibility, see BINARY_ASYNCPIPE_ISSUES.md#accore-bin-i-t6v2). Within UseMetadata, the simple-name collision can still cause silent property-set mismatches between two types with the same short name in different namespaces — this TODO fixes that.

Change scope (AcBinarySourceGenerator.cs) — 4 call sites: ComputeFnvHash(typeSymbol.Name) → ComputeFnvHash(typeSymbol.ToDisplayString()):

Self type-name hash (~line 358)
Child type-name hash (~line 157)
Element type-name hash (~line 254)
Dict-value type-name hash (~line 311)

No runtime code changes; output regenerates with new constants on next build.

Breaking change scope: any saved binary stream that uses WireMode=Metadata and was produced by an older version embeds the old simple-name hash; consumers reading those streams with the new hash compute would mismatch and throw. Pre-1.0: acceptable. Post-1.0 would require a WireMode=Metadata format-version bump.

Acceptance:

All *_GeneratedWriter.g.cs files regenerate with FQN-based s_typeNameHash values.
Existing tests pass (auto-regen propagates; no manual hash literals in tests).
Wire format identical for WireMode=Compact (no metadata embedded).
UseMetadata=true paths produce different hashes — explicitly tested via round-trip.

ACCORE-BIN-T-I3P8: `[AcBinaryTypeId(...)]` attribute — explicit type-id override

Priority: P3 · Type: Feature · Related: ACCORE-BIN-T-F8N3 (FQN base hash being overridden)

Once ACCORE-BIN-T-F8N3 reduces collision frequency by switching to FQN, residual FQN-hash collisions are still possible (32-bit hash space, birthday paradox). Currently the only consumer of s_typeNameHash is the WireMode=Metadata inline metadata-write path — a residual collision there causes a silent property-set mismatch.

[AcBinaryTypeId(0x12345)] attribute on a class:

Source generator emits s_typeNameHash = 0x12345 instead of computing FNV.
Two types with the same [AcBinaryTypeId(...)] value → compile-time / first-use error.

Useful for:

Resolving rare FQN-hash collisions deterministically (within WireMode=Metadata).
Pinning a stable type-id across class renames (wire-compat across versions in Metadata mode).
Future-proofing: if a Layer 1 consumer (hypothetically) builds a type-dispatch above AcBinary using s_typeNameHash, the same override mechanism applies.

Acceptance:

New attribute class shipped alongside [AcBinarySerializable].
Generator honours the override (emits explicit constant instead of FNV result).
Tests: rename a class with [AcBinaryTypeId] → s_typeNameHash unchanged.

ACCORE-BIN-T-X2M5: Evaluate xxHash3 vs FNV-1a for type-name hashes

Priority: P3 · Type: Investigation · Related: ACCORE-BIN-T-F8N3

FNV-1a is currently used for both s_typeNameHash and s_propertyHashes. For compile-time hashing, performance is irrelevant. For collision resistance:

FNV-1a 32-bit: ~50% collision at ~77K types (birthday paradox). Adequate for small/medium projects, marginal for large ones with many auto-generated types.
xxHash3 32-bit: comparable mathematical properties to FNV-1a (both non-cryptographic).
xxHash3 64-bit: dramatically better collision resistance (~50% at ~5B entries), at the cost of 8 wire bytes instead of 4.

Trigger: real collisions observed (1000+ types per assembly + cross-assembly aggregation), or community feedback indicating collision pain.

Investigation questions (no code change without a triggering pain signal):

Switch to xxHash3 32-bit (incremental improvement) — but doubles the change scope (touch property hashes too if uniformity desired).
Switch to xxHash3 64-bit (8 wire bytes instead of 4) — meaningful collision resistance, modest wire cost.
Stay on FNV-1a + force [AcBinaryTypeId] for collisions — minimal change, devops burden.

Investigation only — defer until pain signal arrives.

ACCORE-BIN-T-K9E4: `[RequiresDynamicCode]` + `[RequiresUnreferencedCode]` on Runtime-only methods

Priority: P3 · Type: Refactor · Related: BINARY_FEATURES.md#nativeaot-compatibility

The Runtime path (factories in AcSerializerCommon + wrapper-based deserialize fallback in AcBinaryDeserializer) currently works under NativeAOT thanks to DAMs propagation + RuntimeFeature.IsDynamicCodeSupported guards, but the trimmer still emits warnings for the well-known blind spots (polymorphism via obj.GetType(), nested-type chain via generic argument extraction). The library suppresses these with [UnconditionalSuppressMessage] and documented justification.

A complementary signal would be to mark the Runtime entry points (or the factories themselves) with [RequiresDynamicCode("AcBinary Runtime path uses Reflection.Emit / closed-generic instantiation; use [AcBinarySerializable] + SGen for NativeAOT.")] and [RequiresUnreferencedCode("...")]. Effect:

AOT publish in consumer's project surfaces a warning at the call site → consumer chooses SGen or accepts the Runtime cost
Mirrors the System.Text.Json reflection-mode pattern ([RequiresDynamicCode] on JsonSerializer.Serialize<T> overloads)
One-codebase, no NuGet split needed
Cheap implementation — attribute placement only

Coordination: [RequiresDynamicCode] is contagious; every caller must either propagate it or suppress with [UnconditionalSuppressMessage]. Scope:

Public Serialize<T> / Deserialize<T> entry points stay attribute-free (consumer-facing)
Runtime fallback methods get the attribute (contained inside the library)
The DAMs annotations we already have stay — they're orthogonal (one prevents trim, the other warns about JIT-only behavior)

Acceptance:

Consumer's AOT publish surfaces a IL2026/IL3050 warning when UseGeneratedCode=false is set or an unattributed type is deserialized
SGen path is warning-free
Library compiles 0 warnings (suppressions added at the propagation barrier)
BINARY_FEATURES.md NativeAOT Compatibility section updated to mention the explicit warning signal

ACCORE-BIN-T-A2J7: Optional `AyCode.Core.Aot` NuGet variant (SGen-only build)

Priority: P3 · Type: Feature · Related: BINARY_FEATURES.md#nativeaot-compatibility, ACCORE-BIN-T-K9E4

Binary-size-sensitive AOT consumers (Blazor WASM, MAUI mobile, embedded, container-trimmed) benefit from a smaller library variant that strips the Runtime fallback path entirely. Estimated savings: ~80-150 KB of native code (~25-60 KB compressed wire size for WASM publish).

Strippable code in the .Aot variant:

Component	LOC	Purpose	Removable in Aot?
`AcSerializerCommon.Create*` (7 factory methods + Expression-tree code)	~150	Runtime delegate compilation	✅ Yes
`TypeMetadataBase` runtime metadata path (`CompiledConstructor`, IdGetters via Expression.Compile)	~300	Reflection-based metadata	✅ Yes
`AcBinaryDeserializer` wrapper-based runtime fallback (`PopulateObjectPropertiesIndexed`, `ReadObjectCoreWithWrapper` non-SGen branches, `CreateInstance(type)` Activator-fallback)	~500	Runtime polymorphic dispatch	✅ Yes
Property accessor runtime delegate fields (`_dynamicGetter`, typed getter/setter caches outside SGen)	~150	Boxed property access	✅ Yes
`System.Linq.Expressions` transitive dependency	—	Expression-tree IL emission	✅ Yes (when nothing else in graph uses it)

Implementation sketch (avoid #if-erdő via file-level split):

AyCode.Core/Serializers/
  AcSerializerCommon.cs              // SGen-safe shared parts
  AcSerializerCommon.Runtime.cs      // 7 Create* factory methods only here
  AcBinaryDeserializer.cs            // SGen path
  AcBinaryDeserializer.Runtime.cs    // wrapper-based runtime fallback path
  TypeMetadataBase.cs                // SGen-safe metadata
  TypeMetadataBase.Runtime.cs        // Expression.Compile-based ctor + accessor wiring

Two .csproj files:

AyCode.Core.csproj — full package (current); includes all files
AyCode.Core.Aot.csproj — <Compile Remove="**/*.Runtime.cs" />; sets <PackageId>AyCode.Core.Aot</PackageId>; same version as full

Trade-offs:

✅ No #if directives in business code — physically separate file groups
✅ Source mostly shared via SDK include/exclude semantics
✅ DAMs annotations and trim-suppressions only land in the full package; .Aot variant is genuinely trim-clean by construction
✅ "Strict SGen" semantics in .Aot: a non-SGen type at deser time throws clearly instead of silently falling back. Marketing positioning: "guaranteed SGen path, no hidden slow lane".
⚠️ Two NuGet IDs, two changelogs, version sync (CI-automatable)
⚠️ Consumer must pick the right package — wrong choice = breaking switch later

Coordination:

Land ACCORE-BIN-T-K9E4 first ([RequiresDynamicCode] attributes) — if that pattern handles the consumer-side scenarios well, .Aot may not be needed
The current Runtime fallback code is already well-isolated (mostly in AcSerializerCommon factories + AcBinaryDeserializer wrapper-based methods), so the file-split refactor is mechanically straightforward
Marketing decision: is binary size a central pillar? If yes, .Aot is a NuGet differentiator; if not, K9E4 alone is enough

Acceptance:

AyCode.Core.Aot.csproj produces a NuGet ~25-60 KB smaller than AyCode.Core after compression
.Aot build emits zero IL/AOT trim warnings (no suppressions needed because the Runtime path code is physically removed)
Round-trip tests pass on .Aot for all SGen types
.Aot throws a clear InvalidOperationException (not MissingMethodException) when a non-[AcBinarySerializable] type is encountered at deser time
BINARY_FEATURES.md NativeAOT Compatibility section documents both packages and when to choose which

ACCORE-BIN-T-V4N2: .NET 11 SIMD-specialized UTF-8 decoder via multi-targeting

Priority: P3 · Type: Performance · Related: AcBinaryDeserializer.BinaryDeserializationContext.Read.cs::DecodeUtf8

The custom UTF-8 → UTF-16 decoder in DecodeUtf8 / CountUtf8Chars / DecodeUtf8ToChars currently targets .NET 9 — scalar two-pass with optional Vector256 ASCII prefix widen + DWORD ASCII batch (per Phase 1 optimization). .NET 11 (planned ~Nov 2026) exposes additional SIMD intrinsics that can meaningfully accelerate the decoder on AVX-512-capable hosts, particularly the vpcompressb-style mask-driven byte compression that simdutf relies on for its 64-byte AVX-512 transcoder.

Why .NET 11 specifically (and not .NET 10)

.NET 10: incremental SIMD improvements, but the changes that affect us are mostly inside the BCL (Encoding.UTF8.GetString internal SIMD widening). Our custom decoder bypasses the BCL — we don't benefit unless we hand-roll the same SIMD ourselves with .NET 9 intrinsics, which already work today. Multi-targeting net9.0;net10.0 adds CI/test overhead with marginal payoff. Skip.
.NET 11: PR #120628 (Vector512/Vector256 SIMD for UTF-8 utilities) was closed without merge but signals upcoming work in this area. Future iterations are expected to expose Avx512Vbmi-style mask-compress intrinsics that today require unsafe / Vector128-emulation paths. Target this once the framework lands.

Implementation outline (when triggered)

Multi-target <TargetFrameworks>net9.0;net11.0</TargetFrameworks> on AyCode.Core.csproj
#if NET11_0_OR_GREATER block in DecodeUtf8 selects an AVX-512-aware path: process 64-byte blocks via Vector512<byte> + vpcompressb for byte-stream extraction, fall back to the .NET 9 scalar+Vector256 path on non-AVX-512 hardware (Avx512Vbmi.IsSupported runtime check)
Reuse the .NET 9 scalar path for short strings (<64 bytes) — SIMD setup cost dominates
New benchmark cells comparing .NET 9 vs .NET 11 builds on the same hardware

Acceptance

dotnet test passes on both target frameworks
Benchmark on AVX-512 hardware (Sapphire Rapids / Zen 4+) shows ≥1.5x non-ASCII deser speedup vs .NET 9 build for strings ≥256 bytes
Short-string perf (≤64 bytes) within ±5% of .NET 9 build (no regression from multi-target setup)
BINARY_FEATURES.md documents the SIMD path selection logic

Trigger

Wait for .NET 11 release (or RC)
Re-evaluate once dotnet/runtime UTF-8 SIMD utilities re-land (post-PR #120628 follow-up)
Skip entirely if .NET 11 BCL Encoding.UTF8.GetString becomes fast enough that hybrid (≥256 bytes → BCL, <256 → custom) wins without hand-rolled SIMD

ACCORE-BIN-T-S5L8: Sentinel-length encoding for strings (wire-size optimization, both modes)

Priority: P3 · Type: Wire-format optimization · Related: AcBinarySerializer.WriteString, AcBinaryDeserializer.ReadValue string dispatch

The leading string-marker byte (String / StringEmpty / Null) exists primarily to distinguish null vs empty vs non-empty before dispatching. For non-polymorphic, non-interned string properties the marker can be replaced by a single sentinel-length VarUInt:

[VarUInt sentinelLength] [content bytes if applicable]
   sentinelLength == 0    → null
   sentinelLength == 1    → empty string
   sentinelLength == N+1  → string of N bytes/chars, content follows

MemoryPack-style encoding pattern. Applies to both Compact (UTF-8) and FastWire (UTF-16 raw) modes; the content following the sentinel differs by mode.

Per-mode impact

FastWire mode — wire layout today: [String marker][VarUInt charCount][UTF-16 raw bytes]. Sentinel saves 1 byte per non-null string.

TestData	Current FastWire wire	Estimated with sentinel	Δ
Small	3122 B	~3050 B	-2%
Medium	10905 B	~10500 B	-4%
Large	68603 B	~67000 B	-2%
Repeated	16244 B	~15700 B	-3%
Deep	15514 B	~14900 B	-4%

Closes the +1.7-8.1% FastWire wire gap vs MemoryPack to near zero or favorable while keeping AcBinary FastWire's +9-20% speed advantage.

Compact mode — wire layout today varies by length:

Short (≤31 byte): [FixStr+length][UTF-8 bytes] — already 1-byte marker, ties sentinel.
Long (>31 byte): [String marker][VarUInt byteCount][UTF-8 bytes] — sentinel saves 1 byte (the marker).

Compact gain: only on long strings (>31 byte UTF-8). Estimated −1 byte per long string. Workload-dependent: if most strings are short or use interning, gain is small. If many long mixed-content strings, meaningful saving.

Limitations (both modes)

Polymorphic object properties: marker needed for type discrimination. Sentinel encoding only applies when the property type is statically string or string?.
Interning incompatible: sentinel cannot express StringInternFirst / StringInterned markers (those carry cache-index semantics). Interned properties keep marker-based encoding. FastWire mode already disables interning by design (consistent); Compact mode needs per-property dispatch (interned → marker, non-interned → sentinel).
Compact-mode FixStr ties: short strings (≤31 byte UTF-8) gain nothing in Compact (FixStr is already 1-byte marker+length). The optimization wins only on long strings in Compact.

Implementation outline (rough — refine when implementing)

Writer: branch in WriteString on property metadata flags (IsString, IsNotInterned, IsNotPolymorphic). If sentinel-eligible, emit VarUInt sentinelLength + content. Else fall through to existing marker-based encoding.
Reader: matching branch in property reader. If sentinel-eligible (per property metadata), read VarUInt sentinelLength, dispatch on 0/1/N+1.
SGen: emit sentinel-encoding variant for non-polymorphic non-interned string typed properties; emit existing marker-encoding for the rest.
Wire format version bump OR header flag indicating sentinel-encoding-active. (Cross-version compat policy decided when implementing.)

Trigger

After D-2 / decoder optimization / marker-dispatch land (compact-mode focus completes)
When wire-size positioning becomes a primary pillar for NuGet release
Re-evaluate scope at implementation time — exact gain in Compact depends on consumer workload (long-string ratio, interning patterns)

Acceptance

FastWire mode: AcBinary wire ≤ MemoryPack on at least 4 of 5 test cells
Compact mode: long-string wire bytes -1 each, no regression on short or interned strings
Speed benchmark: no regression vs current encoding (essentially zero CPU cost — sentinel is shifted bookkeeping)
Cross-version compat: documented format version bump + clean fail on old reader / new wire mismatch
Polymorphic + interned property test cases pass unchanged (use existing marker-based encoding)

ACCORE-BIN-T-M3R7: ASCII marker-dispatch — writer detect + reader dedicated path

Priority: P2 · Type: Performance + wire optimization · Related: BinaryTypeCode.FixStrAsciiBase..StringAscii markers, WriteStringWithDispatch, ReadAsciiBytesAsString Status: Closed (2026-05-04)

Sorrendi megjegyzés: ezt AZ ENCODER OPTIMALIZÁCIÓ UTÁN csináljuk (lásd ACCORE-BIN-T-E2F9). Indok: a custom encoder/decoder Vector256 ASCII narrow/widen path-jai már magukban gyorsan kezelik az ASCII byte-ot. A marker-dispatch ezen FELÜL csak a per-call dispatch-overhead spórolást hozza (no Ascii.IsValid scan, no decoder layer). Garantált win, de additív — méréstechnikailag tisztább a decoder/encoder utánra hagyni.

The FixStrAscii* (135-166) and StringAscii (167) markers are defined in BinaryTypeCode.cs with helper methods (IsAsciiString, IsFixStrAscii, EncodeFixStrAscii, DecodeFixStrAsciiLength). Encoding/decoding logic NOT yet implemented — currently both writer and reader use the universal String / FixStr markers.

Implementation

Writer: in WriteStringUtf8 / WriteFixStrDirect, after UTF-8 encoding (D-2 path), check bytesWritten == charLength (= ASCII iff equal). If ASCII, emit FixStrAscii (≤31 byte) or StringAscii (>31 byte). Else emit existing FixStr / String. Free detect — both numbers already computed by D-2.
Reader: in ReadStringUtf8 (or upstream marker dispatch), branch on marker. ASCII markers → dedicated byte→char widening path (no UTF-8 decode, no Ascii.IsValid scan, no decoder dispatch). Non-ASCII markers → existing custom UTF-8 decoder.
SGen: regenerate readers/writers to dispatch on the new markers.
Re-enable ASCII fast paths: uncomment writer FixStr dispatch in AcBinarySerializer.cs and reader Ascii.IsValid block in ReadStringUtf8 — these temporarily disabled blocks become the marker-aware paths (no IsValid scan needed since the marker is the contract).

Wire format change

Format version bump (1 → 2). Old readers fail clean on new wire (version mismatch). New readers must reject old wire OR support backward read.

Acceptance

Repeated Strings (Hungarian content) Deser: AcBinary closes the ~10% gap vs MemoryPack
Pure ASCII tests (Small/Medium/Large/Deep): AcBinary Ser AND Deser ≥ MemoryPack
Wire size: minimum -25% vs MemoryPack across all test cells
SGen-generated code compiles and round-trips on all [AcBinarySerializable] types
Decision documented: backward-compat policy for v2 vs v1 wire

Resolution

End-to-end implementation landed (writer + reader + SGen + skip + populate). Key components:

Writer (AcBinarySerializer.BinarySerializationContext.WriteStringWithDispatch) — single-pass UTF-8 encode + ASCII detect via bytesWritten == charLength; emits one of 4 markers (FixStrAscii / FixStr / StringAscii / String). Split layout for hot path: charLength ≤ 31 encodes optimistically at savedPos+1 (FixStr position) → 0 shift on FixStr hit; charLength > 31 uses D-2 layout with backfill. The split avoids the post-encode left-shift that the unified layout introduced (regression seen in 12-42-32 bench).
Reader (AcBinaryDeserializer.BinaryDeserializationContext.ReadAsciiBytesAsString) — Encoding.Latin1.GetString (BCL SIMD-accelerated byte→char widen). Avoids the string.Create callback + scalar widen overhead — measurably better on Small Deser cell (closed the +20% MemPack-relative anomaly).
TypeReaderTable: StringAscii (167) + 32 × FixStrAscii (135-166) readers registered. IsFixStrAscii / StringAscii fast paths in PopulatePropertyWithMarker, ReadValue, SkipValue.
SGen (AcBinarySourceGenerator.EmitReadString) — regenerated readers branch on IsFixStr / IsFixStrAscii / case StringAscii per property.

Wire format version not bumped — the new markers occupy previously-unused codepoints (135-167); old wire (without ASCII markers) is forward-compatible (readers handle both String and StringAscii). v1 stays.

Acceptance (AOT bench 13-40-29, MemPack-relative ratios — JIT noise eliminated):

✅ AcBinary Ser AND Deser GYORSABB MemPack-nél MINDEN cellán (5/5)
- Small: Ser -8%, Deser -23%
- Medium: Ser -17%, Deser -30%
- Large: Ser -28%, Deser -32%
- Repeated: Ser -4%, Deser -9%
- Deep: Ser -24%, Deser -22%
✅ Wire size advantage: 2043-50419 byte (vs MemPack 3070-64986) = -22% to -33% across cells
✅ Round-trip tests: 167 pass (13 pre-existing failures are IId-tracking, unrelated to M3R7)

JIT vs AOT note: earlier JIT-mode benchmarks (12-50-43 → 13-27-20 series) showed elevated ratios on Small/Repeated cells (1.0-1.2 range) that disappeared under AOT publish. The JIT-mode numbers reflect tier-up artifacts (inconsistent inlining of SGen-generated reader hot paths during the 1000-iteration measurement window), not a structural M3R7 property. AOT (NativeAOT / ILC) compiles deterministically with fixed inline decisions — the steady-state numbers above reflect the actual production performance.

ACCORE-BIN-T-E2F9: Custom UTF-8 encoder (writer-side, symmetric with custom decoder)

Priority: P1 · Type: Performance · Related: decoder optimization (AcBinaryDeserializer.BinaryDeserializationContext.Read.cs::DecodeUtf8SinglePass) Status: Closed (2026-05-04)

Sorrendi megjegyzés: ezt A MARKER-DISPATCH ELŐTT csináljuk (lásd ACCORE-BIN-T-M3R7). Indok: a custom encoder/decoder optimalizáció a "nehezebb, kevésbé biztos" win — a non-ASCII / mixed content workload-okat (Repeated Strings Hungarian) hozza be. A marker-dispatch utána már csak additív tisztítás a pure ASCII path dispatch-overhead-jén.

Replace Encoding.UTF8.GetBytes calls in WriteStringUtf8 / WriteStringUtf8Internal / WriteFixStrDirect (collectively the writer's UTF-8 encode path, post-D-2) with a hand-rolled SIMD encoder. Symmetric to the decoder optimization (V4N2 / Read.cs::DecodeUtf8SinglePass).

Layered structure (mirrors decoder)

Phase 1 — Vector256 ASCII narrow: 16 chars (Vector256) → 16 bytes (Vector128) via Vector256.Narrow. ASCII detect via (v & 0xFF80).ExtractMostSignificantBits() == 0 (any high bit on UTF-16 char). Break on first non-ASCII char.
Phase 2 — DWORD ASCII batch: 4 chars at a time, OR-mask test, 4 bytes per iter when ASCII.
Phase 3 — Scalar multi-byte encode: 1-byte (ASCII) / 2-byte (Latin extended) / 3-byte (BMP) / 4-byte (surrogate pair → supplementary plane) UTF-8 encoding via direct bit-extract. No fallback dispatch — input is trusted UTF-16 (string).
Use System.Text.Unicode.Utf8.FromUtf16 as fallback target for scalar correctness — or skip BCL entirely with manual bit-pack.

Why

Encoding.UTF8.GetBytes carries virtual-dispatch + encoder-fallback overhead even with SIMD ASCII fast path internally. Custom encoder skips this. ~15-30% Ser improvement on ASCII content, ~5-10% on non-ASCII (multi-byte path stays scalar).

Trigger

NEXT — implementation order P1 before marker-dispatch (M3R7)
Re-evaluate if .NET 11 BCL UTF-8 GetBytes becomes faster (PR #120628 follow-up)

Acceptance

Writer-side benchmark: ≥15% Ser speedup on ASCII content (Small/Medium/Large/Deep), ≥5% on non-ASCII (Repeated)
Wire format unchanged (custom encoder produces same bytes as Encoding.UTF8)
Round-trip tests pass

Resolution

Implemented as EncodeUtf8SinglePass in AcBinarySerializer.BinarySerializationContext.cs — three-phase layered encoder (Vector256 ASCII narrow + DWORD ASCII batch + scalar 1/2/3-byte BMP & 4-byte surrogate-pair). Bypasses Encoding.UTF8.GetBytes virtual-dispatch + encoder-fallback overhead. Trusted-input path — no validation pass on writer side (the input is a .NET string with valid UTF-16 surrogate pairs by construction).

Used by WriteStringUtf8 (D-2 single-pass with VarUInt backfill) and WriteStringWithDispatch (M3R7 marker-dispatch path). Wire format unchanged — the encoder produces the same bytes as Encoding.UTF8.GetBytes.

Acceptance (per bench 12-50-43 → 13-27-20, MemPack-relative ratios on AcBinary Compact FastMode SGen):

✅ ASCII Ser ≥ MemPack on 4/5 cells (Small 0.94, Medium 0.80, Large 0.79, Deep 0.81)
⚠️ Repeated Ser ~1.04 (Hungarian, multi-byte path scalar) — see follow-up ACCORE-BIN-T-H7K3
✅ Round-trip tests pass (167 of 180; 13 pre-existing failures unrelated to encoder)

ACCORE-BIN-T-W7N5: Default-value omission policy — doc + optional opt-out

Priority: P2 · Type: Refactor + Documentation · Related: BINARY_ISSUES.md#accore-bin-i-d9y2 (canonical issue)

The serializer's PropertySkip (102) optimization saves 1 byte per default-valued property by omitting the full value from the wire — relying on the consumer-side type definition to have the same default(T). This is a latent correctness risk documented in ACCORE-BIN-I-D9Y2. This entry tracks the mitigation plan; full failure-mode analysis lives in the issue.

Decision tree (TBD when implementing)

Doc-only: position as a deliberate protobuf-style feature; consumer keeps type defaults stable across versions. Lowest cost, maximum benchmark wire-size advantage retained.
Option flag: AcBinarySerializerOptions.OmitDefaults boolean. Default true (preserves current behavior + benchmark numbers). false writes every property in full — opt-out for fragile-class-evolution scenarios.
Both: ship doc + flag. Default behavior unchanged; consumers who hit silent-corruption have an explicit opt-out.

Acceptance (when implementing)

BINARY_FEATURES.md adds a "Default-Value Omission" section documenting the semantic and the tradeoff (with cross-ref to ACCORE-BIN-I-D9Y2)
If flag added: round-trip tests covering both true and false; benchmark comparison table showing wire-size delta on ASCII / Hungarian / DTO-heavy workloads
Decision rationale recorded in LLM_PROTOCOL_DECISIONS.md (or a ### Resolution block on the issue) once implemented

ACCORE-BIN-T-H7K3: Hungarian / multi-byte content Ser optimization (Repeated Strings cell)

Priority: P3 · Type: Performance · Related: EncodeUtf8SinglePass Phase 3 (scalar multi-byte encode), ACCORE-BIN-T-E2F9 resolution Status: Closed (2026-05-04) — Won't Fix (JIT-only artifact)

The Repeated Strings benchmark (Hungarian content: "TermékNév_…", "RaklapKód_…") still shows AcBinary Ser ratio ~1.04 vs MemPack across multiple runs (12-50-43 / 13-21-27 / 13-27-20 series). All other ASCII-heavy cells (Small/Medium/Large/Deep) sit in the 0.79-0.94 ratio range — Repeated is the outlier.

The Phase 3 scalar multi-byte branch in EncodeUtf8SinglePass (1-byte ASCII / 2-byte Latin-extended / 3-byte BMP / 4-byte surrogate-pair) processes Hungarian diacritics (á, é, í, ő, ű, etc.) as 2-byte UTF-8 sequences via scalar bit-extract. MemPack's UTF-8 encoder appears to use a SIMD-accelerated mixed-content lane that processes 2-byte sequences in parallel.

Resolution

AOT bench 13-40-29: Repeated Ser ratio = 0.96 (AcBinary 14.50 µs vs MemPack 15.05 µs, AcBinary GYORSABB by 4%). Deser ratio 0.91 (also faster).

The 1.04+ ratio observed in JIT-mode benchmarks (12-50-43, 13-21-27, 13-27-20) was a JIT tier-up artifact — the SGen-generated writer's hot path (which calls EncodeUtf8SinglePass) didn't reliably tier up to fully-optimized code within the 1000-iteration measurement window, while MemPack's writer apparently warmed up faster. Under NativeAOT publish (-p:_IsPublishing=true) the issue disappears completely — both writers are deterministically optimized at compile time.

No structural problem in the Phase 3 scalar branch. The investigation directions (Vector256 mixed-content lane, BCL Utf8.FromUtf16 comparison) remain valid academic improvements but show no meaningful production-time win — closing as Won't Fix.

ACCORE-BIN-T-S2X9: Markerless schema lane — drop per-property type markers for fixed-shape primitives (SGen)

Priority: P3 · Type: Wire-format extension · Related: ACCORE-BIN-T-S5L8, ACCORE-BIN-T-W7N5

AcBinary is marker-driven: every value on the wire carries a 1-byte type code, so the reader can dispatch generically (handles polymorphism, null, intern markers, type-name lookup, etc.). MemPack is schema-driven: the SGen reader knows at compile time that "field 3 is int, field 4 is string" and reads values directly with no type code, no run-time dispatch.

For fixed-shape primitive properties (int, bool, double, Guid, DateTime, …) on [AcBinarySerializable] types, the per-property type marker is pure overhead — the SGen-generated reader already has compile-time knowledge of the property type, so the marker only confirms what is already known. Dropping it on this narrow class of properties is a clean wire+CPU win without losing any of the polymorphism / null / intern flexibility that the marker provides for variable-shape values.

Wire savings per property type

Type	Current encoding	Markerless lane	Wire saved
`int` (TinyInt range −16..47)	TinyInt (1 byte)	VarInt (1 byte)	0
`int` (out-of-tiny)	`[Int32]` `[VarInt]` (2-6 bytes)	VarInt (1-5 bytes)	1 byte
`bool`	`[True]` or `[False]` (1 byte)	1 byte (0/1)	0
`Guid`	`[Guid]` `[16 bytes]` (17 bytes)	16 bytes	1 byte
`DateTime`	`[DateTime]` `[9 bytes]` (10 bytes)	9 bytes	1 byte
`DateTimeOffset`	`[DateTimeOffset]` `[10 bytes]` (11 bytes)	10 bytes	1 byte
`TimeSpan`	`[TimeSpan]` `[VarLong]` (2-9 bytes)	VarLong (1-9 bytes)	1 byte
`decimal`	`[Decimal]` `[16 bytes]` (17 bytes)	16 bytes	1 byte
`double`	`[Float64]` `[8 bytes]` (9 bytes)	8 bytes	1 byte

DTO-heavy payloads with many Guid / DateTime properties benefit the most — easily -10..-20% wire size on top of the existing -22..-33% advantage.

CPU savings

Reader-side: SGen-generated code drops the per-property ReadByte() + IsTinyInt / IsFixStr / switch-case dispatch for primitive properties — direct context.ReadInt32Unsafe() / ReadGuidUnsafe() / etc. calls. Writer-side: drops the WriteByte(typeCode) per primitive. Effect amplifies on payloads with many primitive properties (Small/Medium benchmark cells) — independent of any JIT-vs-AOT measurement variance.

Sketch — opt-in markerless lane, SGen-only

New wire format flag (header HeaderFlag_MarkerlessSchema = 0x10 or similar) → activates a property-positional lane.
SGen-generated writer for [AcBinarySerializable] types: per primitive property, emits raw value (no marker). For variable-shape properties (string, complex, nullable, polymorphic) the existing marker-driven path stays.
SGen-generated reader: per primitive property, calls context.ReadInt32Unsafe() / ReadGuidUnsafe() / etc. directly. Variable-shape properties keep the marker-read + dispatch.
Heuristic: a property is markerless-eligible if IsValueType && !IsNullable && type is in {int, bool, byte, short, long, float, double, DateTime, DateTimeOffset, Guid, TimeSpan, decimal}. Anything else (string, list, nested object, nullable) keeps the marker.

Decision points

Backward compatibility: header flag + version negotiation. Old readers see the flag set and either reject (clean fail) or fall back to marker-driven (if they support both lanes). Default false preserves current wire format.
Schema evolution fragility: the markerless lane is positional, so adding/removing/reordering primitive properties breaks readers compiled against an older schema. Document this clearly — opt-in is for stable schemas only (DTO-frozen API contracts, internal SignalR messages with synchronized client/server SGen). For evolving schemas, marker-driven default stays.
Coordination with ACCORE-BIN-T-S5L8 (sentinel-length strings): the two could share the "no-marker per-call" infrastructure — markerless string lane uses sentinel-length VarUInt (null/empty/short distinguished by length value).

Acceptance

Wire size: ≥ -10% on DTO-heavy payloads (Guid/DateTime-rich) vs current marker-driven format
Round-trip on the markerless lane validated on representative DTO shapes (mixed primitive + string + nested object)
Schema-evolution fragility documented in BINARY_FEATURES.md (alongside the existing PropertySkip / default-omission caveat from ACCORE-BIN-I-D9Y2)
Opt-in flag with default false (preserves marker-driven default; consumers explicitly opt in for frozen-schema scenarios)

87 KiB Raw Blame History Unescape Escape

AcBinarySerializer — TODO

Priority legend

ACCORE-BIN-T-S8P4: Replace JSON-in-Binary request parameters

Resolution

ACCORE-BIN-T-Q2N7: Re-evaluate DiscountProductMapping SGen exclusion

ACCORE-BIN-T-W9F1: Generate BinarySerializeTypeMetadata / BinaryDeserializeTypeMetadata at compile time

ACCORE-BIN-T-T5J8: JIT Tier 1 warmup for generated hot methods

ACCORE-BIN-T-Z3K8: Replace IId<T> interface dependency with convention/attribute-based Id detection

ACCORE-BIN-T-N7V1: Replace [JsonIgnore] dependency with serializer-native ignore attribute

ACCORE-BIN-T-Y6R2: Implement projection serialization phase 1 (runtime path)

ACCORE-BIN-T-K3W7: Rename BufferWriterChunkSize to reflect actual semantics

ACCORE-BIN-T-M4D2: Add ReadOnlyMemory<byte> / Memory<byte> deserialize overloads

ACCORE-BIN-T-S7X3: Add ReadOnlySpan<byte> deserialize overload

ACCORE-BIN-T-T8K3: Add SerializeAsync(Stream, T) async overloads with mode-driven output strategy

Mode-driven output strategy — three lanes for three workload shapes

Honest performance positioning vs. MemoryPack — three real axes

Throughput nuance — AsyncSegment cost on Stream-backed transports

Marketing claim — three-way honest comparison

ACCORE-BIN-T-D7K4: Add DeserializeAsync(Stream, T) async overloads with mode-driven input strategy

Implementation: zero new IBinaryInputBase impl needed

Public API shape

Implementation outline (per mode)

Honest performance positioning

Acceptance

ACCORE-BIN-T-N9G6: Add non-generic Type-based Serialize(object, Type, ...) overloads

Resolution

ACCORE-BIN-T-R4P2: Expose low-level ref Writer-style API for custom formatters

ACCORE-BIN-T-U6Y8: Attribute-driven polymorphism via [AcBinaryUnion] + SGen (opt-in, AOT-friendly)

1. New 5th bool parameter on [AcBinarySerializable]: EnablePolymorphismFeature

2. New [AcBinaryUnion(byte tag, Type subtype)] attribute

3. New PolymorphismMode enum on AcBinarySerializerOptions

ACCORE-BIN-T-B7H4: Implement AcBinarySerializerOptions thread-safety fix

ACCORE-BIN-T-F8N3: Switch source-generator type-name hashing from simple-name to fully-qualified-name

ACCORE-BIN-T-I3P8: [AcBinaryTypeId(...)] attribute — explicit type-id override

ACCORE-BIN-T-X2M5: Evaluate xxHash3 vs FNV-1a for type-name hashes

ACCORE-BIN-T-K9E4: [RequiresDynamicCode] + [RequiresUnreferencedCode] on Runtime-only methods

ACCORE-BIN-T-A2J7: Optional AyCode.Core.Aot NuGet variant (SGen-only build)

ACCORE-BIN-T-V4N2: .NET 11 SIMD-specialized UTF-8 decoder via multi-targeting

Why .NET 11 specifically (and not .NET 10)

Implementation outline (when triggered)

Acceptance

Trigger

ACCORE-BIN-T-S5L8: Sentinel-length encoding for strings (wire-size optimization, both modes)

Per-mode impact

Limitations (both modes)

Implementation outline (rough — refine when implementing)

Trigger

Acceptance

ACCORE-BIN-T-M3R7: ASCII marker-dispatch — writer detect + reader dedicated path

Implementation

Wire format change

Acceptance

Resolution

ACCORE-BIN-T-E2F9: Custom UTF-8 encoder (writer-side, symmetric with custom decoder)

Layered structure (mirrors decoder)

Why

Trigger

Acceptance

Resolution

ACCORE-BIN-T-W7N5: Default-value omission policy — doc + optional opt-out

Decision tree (TBD when implementing)

Acceptance (when implementing)

ACCORE-BIN-T-H7K3: Hungarian / multi-byte content Ser optimization (Repeated Strings cell)

Resolution

ACCORE-BIN-T-S2X9: Markerless schema lane — drop per-property type markers for fixed-shape primitives (SGen)

Wire savings per property type

CPU savings

Sketch — opt-in markerless lane, SGen-only

Decision points

Acceptance

87 KiB

Raw Blame History

ACCORE-BIN-T-W9F1: Generate `BinarySerializeTypeMetadata` / `BinaryDeserializeTypeMetadata` at compile time

ACCORE-BIN-T-Z3K8: Replace `IId<T>` interface dependency with convention/attribute-based Id detection

ACCORE-BIN-T-N7V1: Replace `[JsonIgnore]` dependency with serializer-native ignore attribute

ACCORE-BIN-T-K3W7: Rename `BufferWriterChunkSize` to reflect actual semantics

ACCORE-BIN-T-M4D2: Add `ReadOnlyMemory<byte>` / `Memory<byte>` deserialize overloads

ACCORE-BIN-T-S7X3: Add `ReadOnlySpan<byte>` deserialize overload

ACCORE-BIN-T-T8K3: Add `SerializeAsync(Stream, T)` async overloads with mode-driven output strategy

Throughput nuance — `AsyncSegment` cost on Stream-backed transports

ACCORE-BIN-T-D7K4: Add `DeserializeAsync(Stream, T)` async overloads with mode-driven input strategy

Implementation: zero new `IBinaryInputBase` impl needed

ACCORE-BIN-T-N9G6: Add non-generic `Type`-based `Serialize(object, Type, ...)` overloads

ACCORE-BIN-T-R4P2: Expose low-level `ref Writer`-style API for custom formatters

ACCORE-BIN-T-U6Y8: Attribute-driven polymorphism via `[AcBinaryUnion]` + SGen (opt-in, AOT-friendly)

1. New 5th bool parameter on `[AcBinarySerializable]`: `EnablePolymorphismFeature`

2. New `[AcBinaryUnion(byte tag, Type subtype)]` attribute

3. New `PolymorphismMode` enum on `AcBinarySerializerOptions`

ACCORE-BIN-T-B7H4: Implement `AcBinarySerializerOptions` thread-safety fix

ACCORE-BIN-T-I3P8: `[AcBinaryTypeId(...)]` attribute — explicit type-id override

ACCORE-BIN-T-K9E4: `[RequiresDynamicCode]` + `[RequiresUnreferencedCode]` on Runtime-only methods

ACCORE-BIN-T-A2J7: Optional `AyCode.Core.Aot` NuGet variant (SGen-only build)