87 KiB
AcBinarySerializer — TODO
This page covers planned work for the binary serializer core (format, SGen, options, deserialization context, buffer writer). Work specific to the streaming I/O layer (AsyncPipeReaderInput + AsyncPipeWriterOutput, multi-message wire framing, sliding-window buffer, producer-consumer synchronization) is tracked separately in BINARY_ASYNCPIPE_TODO.md.
Priority legend
- P0 blocker · P1 important · P2 nice-to-have · P3 idea
ACCORE-BIN-T-S8P4: Replace JSON-in-Binary request parameters
Priority: P1 · Type: Refactor · Status: Closed (2026-04-26, landed in commits cdd54d3 2026-04-05 + 3b70070 2026-04-06) · Related: ../XCUT/XCUT_ISSUES.md#accore-xcut-i-x8q1 (canonical), AyCode.Services/docs/SIGNALR/SIGNALR_TODO.md
Migrate client→server request parameters from JSON-in-Binary envelope to direct Binary serialization (matching response path). Coordinated change across client, server, and all consuming projects. Do NOT attempt as side-effect of unrelated work.
Acceptance: SignalPostJsonDataMessage<T> replaced by a SignalPostBinaryDataMessage<T> (or equivalent); no JSON round-trip on the wire for request params; benchmarks confirm no regression.
Resolution
- What: Length-prefixed, per-parameter binary format introduced via
SignalRSerializationHelper.SerializeParametersToBinary/DeserializeParametersFromBinary; further unified intoSignalParams(singlebyte[]carrying packed method parameters withSetParameterValues/GetParameterValues). - Where:
AyCode.Services/SignalRs/AcSignalRClientBase.cs,AcWebSignalRHubBase.cs,ISignalParams.cs(server + client dispatch);IAcSignalRHubClient.cs(legacy wrappers). - Equivalent (not literal
SignalPostBinaryDataMessage<T>):SignalParamswas chosen over a 1:1 binary wrapper class — fewer indirections on the hot path, type-safe pack/unpack, andDataSerializerTypefield onSignalReceiveParamsfor response format indication. - Wire impact: No JSON round-trip on the wire for request params; this is a breaking change vs. previous JSON-in-Binary clients/servers (see commit message).
- Legacy types:
SignalPostJsonMessage,SignalPostJsonDataMessage<T>,SignalPostMessage<T>,ISignalPostMessage<T>all marked[Obsolete]inIAcSignalRHubClient.cs; deletion tracked separately inAyCode.Services/docs/SIGNALR/SIGNALR_TODO.md#accore-sig-t-s3n8(gated on consumer migration).
ACCORE-BIN-T-Q2N7: Re-evaluate DiscountProductMapping SGen exclusion
Priority: P3 · Type: Investigation · Related: BINARY_ISSUES.md#accore-bin-i-f1w8
Investigate whether the new int Id shadowing pattern can be handled by SGen (via base-class introspection, property-setter lookup on the base) to eliminate the runtime compiled-expression fallback for this entity class.
ACCORE-BIN-T-W9F1: Generate BinarySerializeTypeMetadata / BinaryDeserializeTypeMetadata at compile time
Priority: P1 · Type: Performance · Related: BINARY_ISSUES.md#accore-bin-i-n6q3
Eliminate the dominant first-call cost (reflection + Expression.Compile in metadata ctor) for SGen types by emitting pre-built metadata from the source generator.
Design outline:
TypeMetadataBase/BinarySerializeTypeMetadata/BinaryDeserializeTypeMetadataget a second constructor that accepts pre-computed values (hashes,MinWriteSize,ComplexPropertyCount, flags,IsIId,IdAccessorType, etc.). No reflection executes in this ctor.- Source generator keeps its existing
s_typeNameHash/s_propertyHashesstatic fields (hot-path access stays static, zero indirection) and passes the same references to the metadata — single source of truth, no duplicate computation. ModuleInitregisters both the writer/reader and the pre-built metadata into aGeneratedMetadataRegistry.GetWrapperSlowconsults this registry first, falling back to the reflection-basedMetadataFactoryfor runtime-only types.- Lazy
RuntimeInit()pattern forExpression.Compileproperty accessors:TypeMetadataBasegetsvolatile bool _runtimeInitialized+internal void RuntimeInit()(idempotent, no lock needed).GetWrapperSlowcallsmetadata.RuntimeInit()only whenwrapper.GeneratedWriter == null || !Options.UseGeneratedCode— SGen types skip it entirely (they never touch runtime accessors on their own metadata; non-SGen child types have their own metadata and run the factory path normally).- Hybrid mode stays correct: an SGen type on the SGen path never uses its own property accessors; a non-SGen child type's metadata runs the reflection ctor as today.
volatileguards the flag; multiple contexts may race intoRuntimeInit, second run is a no-op.
Thread safety: GlobalMetadataCache is ConcurrentDictionary; generated metadata is registered once at ModuleInit; wrapper construction is per-context and unchanged.
Acceptance:
- Cold benchmark: first
Serialize<T>of a fresh SGen type shows no reflection /Expression.Compileon the call stack. - Runtime fallback (
UseGeneratedCode=false) still produces identical wire output and uses the full metadata accessors. - Deserialize side has parity (same approach for
BinaryDeserializeTypeMetadata). - Existing tests pass; wire format unchanged.
ACCORE-BIN-T-T5J8: JIT Tier 1 warmup for generated hot methods
Priority: P2 · Type: Performance · Related: BINARY_ISSUES.md#accore-bin-i-n6q3
After ACCORE-BIN-T-W9F1 lands, JIT of generated WriteProperties / ScanObject / ScanForDuplicates becomes the dominant residual first-call cost for SGen types. Options to evaluate (benchmark before committing):
[MethodImpl(MethodImplOptions.AggressiveOptimization)]on the generated hot methods — skips Tier 0, compiles directly at Tier 1. Simple generator change. Trade-off: larger one-time JIT cost in exchange for eliminating the Tier 0→1 recompile step.- Background prewarm from
ModuleInit:Task.Run(() => RuntimeHelpers.PrepareMethod(handle))for each registered writer/reader method. Parallelizes JIT with app startup. Keep it opt-in (option flag) to avoid surprising consumers with extra startup threads. - ReadyToRun (R2R) in consuming projects' publish config — pre-compiles IL to native at publish time. External to SGen, complementary. Document as a recommended publish setting.
- Code chunking (split generated methods exceeding a property threshold into sub-methods, e.g.
WriteProperties_Part1/_Part2) — measure first. Only beneficial for unusually large types (20+ properties / nested collections). Call overhead can offset gains; JIT inliner may already handle reasonably-sized methods well. try/finallyaudit on hot path — On .NET 9 (project's minimum target), JIT silently refuses to inline any method containing an EH region (AggressiveInliningis ignored). [.NET 10 partially lifts this for same-module try-finally — seedotnet/runtime#112998, merged 2025-03-20 — butcatch, cross-module, and P/Invoke-stub cases stay blocked. Until project's minimum runtime moves to .NET 10, treat EH as an absolute inlining barrier; even after the upgrade, several sub-cases keep the rule.] Audit scope:- Hand-written bridges:
WriteValueGenerated/WriteObjectGenerated/WriteStringGenerated/ScanValueGeneratedand any helper called from generatedWritePropertiesfor accidentaltry/finally/usingblocks. - SGen output template (
AcBinarySourceGenerator.cs): generatedWriteProperties/ScanObject/ScanForDuplicates/ReadObject/ReadPropertiesMUST stay straight-line. Future feature additions ([CustomSerializer] / [CustomDeserializer] hooks,OnSerializing/OnDeserializedcallbacks, validation attributes, rented-bufferusingblocks) are tempting candidates fortry/catch/finally— emit them in separate cold helpers, never inline into the generated hot method. A single accidentaltryblock inWritePropertiesmakes the whole generated method non-inlinable, killing the SGen Root Fast Path benefit. - Resource cleanup (Pool/ArrayPool/Dispose) belongs in
Serialize<T>entry-frame only, not in per-property helpers or generated hot methods. SeeBINARY_IMPLEMENTATION.mdRule #3 (Inlining barriers) andBINARY_SGEN.md(SGen Output Constraints).
- Hand-written bridges:
stackallocsize discipline on hot path — On .NET 9, methods containinglocalloc(any C#stackalloc) historically blocked inlining. Modern .NET allows inlining only for fixed-sizestackalloc≤ 32 bytes outside loops (seedotnet/runtime#7113) — anything larger or loop-nested still blocks. Our typical scratch-buffer patterns (UTF-8 encoding scratch, ArrayPool fallbacks) sit far above 32 bytes (256+), so any helper containing such astackallocis non-inlinable. Combined withtry/finallyforArrayPool.Returncleanup, the method is doubly non-inlinable on .NET 9. Plan accordingly: keepstackalloc-using helpers as deliberate cold call-frames, not asAggressiveInliningcandidates.- Native AOT — out of scope for this TODO; separate architectural decision with deployment-model implications.
Acceptance:
- Benchmark a realistic entity graph (≥ 3 referenced child types) and show first-call time within ~10% of steady-state after ACCORE-BIN-T-W9F1 + chosen mitigation(s).
- Document which combination is recommended for SignalR hot-path workloads vs. batch serialization.
ACCORE-BIN-T-Z3K8: Replace IId<T> interface dependency with convention/attribute-based Id detection
Priority: P1 · Type: Refactor
The binary serializer currently detects Id-tracking properties via the IId<T> interface (AyCode.Interfaces). This couples the serializer to a framework-specific abstraction and forces consumer types to implement the interface for tracking participation. Move to a POCO-friendly detection scheme:
IdDetectionMode.Convention(default) — convention-based; any property namedIdis treated as the tracking key. Zero-friction onboarding.IdDetectionMode.Attribute— explicit; only properties marked with a serializer-native[Id](or similar) attribute are tracked.[IgnoreId]attribute — escape hatch inConventionmode to exclude an Id-named property from tracking when the developer wants explicit opt-out.
Implicit contract for Convention mode: within a single class, the Id property must be type-level unique. Whether it semantically represents a primary key or a sequence number is irrelevant — the tracker keys by (Type, Id), so per-type uniqueness is the only requirement. Violating this invariant typically signals a domain-modelling problem, not a serializer bug. Design rationale discussed in conversation 2026-04-27.
Acceptance:
- Binary serializer no longer references
IId<T>in any execution path (no interface checks, nowhere T : IId<TKey>constraints in the serializer surface). - Wire format unchanged.
- Existing consumers using
IId<T>-implementing types still work transparently inConventionmode (theirIdproperty is detected via convention). - New consumers can use plain POCOs with no
AyCode.Interfacesdependency. IdDetectionModeexposed onAcBinaryOptions(or successor options class post-rebrand).- Default mode =
Convention.
ACCORE-BIN-T-N7V1: Replace [JsonIgnore] dependency with serializer-native ignore attribute
Priority: P2 · Type: Refactor
Property exclusion from binary serialization currently relies on [JsonIgnore] (Newtonsoft.Json). This couples the binary serializer to a third-party JSON library's attribute and is conceptually wrong — a binary serializer should not consult a JSON-specific marker for its exclusion semantics.
Define a serializer-native ignore attribute (working name [BinaryIgnore]; final name TBD pending broader rebrand). For backward compatibility during transition, also continue recognizing [JsonIgnore] with a deprecation note.
Possible cross-cutting consideration: if Toon and other future serializers also need property-exclusion, a single shared attribute (e.g., [SerializerIgnore] in a common abstractions package) may be cleaner than per-serializer attributes. Decide before naming finalizes — this may belong in XCUT_TODO.md rather than purely BINARY scope.
Acceptance:
- Native ignore attribute defined in the binary serializer's namespace (or shared abstractions package, pending the cross-cutting decision above).
- Both native attribute and
[JsonIgnore]recognized during a transitional period; native attribute takes precedence on conflict. [JsonIgnore]recognition flagged for removal in a future major version (track in a follow-up cleanup TODO once consumer projects have migrated).- No new code dependency on Newtonsoft.Json for property-exclusion logic.
ACCORE-BIN-T-Y6R2: Implement projection serialization phase 1 (runtime path)
Priority: P1 · Type: Feature · Related: ../adr/0001-binary-projection-serialization.md (canonical)
Implement the phase 1 runtime path of source→target projection serialization per ADR 0001. See the ADR for full context, decision rationale, alternatives, consequences, and acceptance criteria.
Sibling rebrand-prep TODOs: ACCORE-BIN-T-Z3K8 (IId migration), ACCORE-BIN-T-N7V1 (JsonIgnore replacement).
ACCORE-BIN-T-K3W7: Rename BufferWriterChunkSize to reflect actual semantics
Priority: P3 · Type: Refactor · Breaking: Yes (public option API) · Streaming impact: see BINARY_ASYNCPIPE_TODO.md for the streaming-side companion considerations (chunk-on-wire vs internal-buffer semantics)
The property name BufferWriterChunkSize is misleading: across the three output paths it does NOT consistently represent a "chunk".
| Output path | What BufferWriterChunkSize actually controls |
Wire-format chunk? |
|---|---|---|
ArrayBinaryOutput (Byte[] API) |
Initial buffer capacity of the internal byte[] |
No |
BufferWriterBinaryOutput (IBufferWriter overload) |
Internal buffer size — how much data accumulates before Advance() + new GetMemory() on the underlying writer |
No |
AsyncPipeWriterOutput (streaming) |
Both internal buffer and wire-format chunk frame size for chunked framing | Yes (only here) |
Receive side (AsyncPipeReaderInput) |
Initial receive buffer = BufferWriterChunkSize × 2 |
No (just sizing hint) |
Only the streaming AsyncPipeWriterOutput path has a wire-format "chunk" concept (chunked framing for length-prefixed segments). On the other 75% of paths the property name reads as if the serializer were segmenting the payload, which is not what happens.
Possible directions (decide before implementing):
- Single rename, semantic-neutral —
BufferWriterChunkSize→BufferWriterBufferSizeorBufferWriterPageSize. Minimal API surface change, single-property semantics preserved. Downside: still slightly off for the streaming path where there IS chunked framing. - Two-property split —
InternalBufferSize(universal: how much data accumulates before Advance/Grow) +StreamingChunkSize(only meaningful forAsyncPipeWriterOutput; separate knob, defaults toInternalBufferSize). Cleanest semantics, most ceremony, slightly more options to document. - Single rename, streaming-honest — Keep as
BufferWriterChunkSizebut document explicitly that on non-streaming paths the value is repurposed as buffer size. Cheapest change (docs only). Downside: doesn't fix the underlying confusion the field name causes.
Pick one before touching code. Option 2 is the most correct but adds API surface; Option 1 is the pragmatic middle.
Affected callers / docs to update on rename:
AcBinarySerializerOptions.cs(definition)AcBinarySerializer.cs× 3 sites (ArrayBinaryOutputctor,BufferWriterBinaryOutputctor,AsyncPipeWriterOutputctor)AcBinaryDeserializer.cs× 1 site (receive-side initial capacity derivation)AsyncPipeReaderInput.cs— XML doc cross-refsBINARY_WRITERS.md,BINARY_TODO.md(this entry),BINARY_ISSUES.md(line 151 — already listsBufferWriterChunkSizeamong the struct-mutation issue's affected setters)- Consumer-side:
AyCode.Services/SignalRs/AcBinaryHubProtocol.csctor mutates_options.BufferWriterChunkSize = options.BufferSize;— seeBINARY_ISSUES.md#accore-bin-i-...(struct-mutation context). Coordinate the rename with the struct-mutation fix to avoid two cross-cutting churn waves on the same property.
Acceptance:
- Property renamed (or split) per the chosen direction; all internal references updated.
- XML docs reflect the actual semantics on each output path (initial capacity / advance threshold / chunk frame size — whichever applies).
- Consumer-side usage in
AcBinaryHubProtocolupdated; if Option 2 is chosen, the protocol usesStreamingChunkSize(the streaming knob), not the universal one. - Wire format unchanged. Default values unchanged (65535 / equivalent).
- Migration note in CHANGELOG / release notes since this is a breaking change to
AcBinarySerializerOptions.
ACCORE-BIN-T-M4D2: Add ReadOnlyMemory<byte> / Memory<byte> deserialize overloads
Priority: P3 · Type: Feature
The public AcBinaryDeserializer.Deserialize surface accepts byte[] (with optional offset/length) and ReadOnlySequence<byte>, but not ReadOnlyMemory<byte> / Memory<byte>. Consumers that hold a ReadOnlyMemory<byte> (cached payloads, message-broker frames, in-memory pipe slices) must call .ToArray() to round-trip through byte[] — unnecessary copy + GC alloc.
Implementation:
Deserialize<T>(ReadOnlyMemory<byte> data, AcBinarySerializerOptions options)and the non-genericType-based variant.- Body:
MemoryMarshal.TryGetArray(data, out var seg)→ array-backed path delegates toDeserialize<T>(seg.Array!, seg.Offset, seg.Count, options)(zero-copy). Non-array-backed fallback (rare — customMemoryManager<T>with native memory) copies into a pooledbyte[]. Memory<byte>overload trivially delegates to theReadOnlyMemory<byte>one (Memory<byte>is implicitly convertible).- No new input-strategy struct needed — reuses existing
ArrayBinaryInput.
Acceptance:
- Both overloads compile and pass round-trip tests against
byte[]-equivalent input. - Array-backed path measurably zero-alloc (BenchmarkDotNet allocation diagnoser).
- Non-array-backed path documented as fallback (separate
using var pooled = MemoryPool<byte>.Shared.Rent(...)style copy). - API doc-strings cross-reference the existing
byte[]andReadOnlySequence<byte>overloads.
ACCORE-BIN-T-S7X3: Add ReadOnlySpan<byte> deserialize overload
Priority: P2 · Type: Feature · Related: ACCORE-BIN-T-M4D2
The MemoryPack-style Deserialize<T>(ReadOnlySpan<byte>) API enables direct deserialization from stack-allocated buffers (stackalloc byte[256]), pinned native memory (fixed blocks), and ReadOnlyMemory<byte>.Span slices without round-tripping through a heap-allocated byte[]. The current AcBinary surface lacks this entry point.
Design tension: the existing IBinaryInputBase.Initialize(out byte[] buffer, ...) contract returns a byte[] — a ReadOnlySpan<byte> cannot be stored in a regular struct field, only in a ref struct field. Two implementation paths to evaluate:
ref struct SpanBinaryInput+ interface bump to supportref byte buffer/int lengthfields. Pure zero-copy from any span. Cost:BinaryDeserializationContext<TInput>andIBinaryInputBaseneed a parallel ref-struct-friendly track (the existing pooled context cannot hold aref struct). Major surgery on the deser core.MemoryMarshal.CreateReadOnlySpanFromNullTerminated-style hack — acceptReadOnlySpan<byte>, useUnsafe.AsRef/MemoryMarshal.GetReferenceto obtain aref byte, then copy into a pooledbyte[]before deserialization. Not zero-copy, defeats the purpose. Reject.- Pinned-buffer trampoline — accept
ReadOnlySpan<byte>, allocate aMemory<byte>view via aMemoryManager<byte>-like wrapper, delegate toReadOnlyMemory<byte>overload. Awkward, allocations per call. Reject.
Recommendation: option (1) is the only correct path, but it's a substantial refactor — measure first whether real consumer demand justifies the surgery. The current byte[]-based pool-pattern outperforms MemoryPack on the dominant use-cases per existing benchmarks; this overload addresses an API-surface gap, not a perf gap.
Acceptance:
Deserialize<T>(ReadOnlySpan<byte> data, AcBinarySerializerOptions options)compiles and round-trips againstbyte[]-equivalent input.- Zero-alloc path verified for
stackalloc-source spans (BenchmarkDotNet allocation diagnoser). IBinaryInputBase(or successor interface) refactor preserves backward compatibility for existingArrayBinaryInput/SequenceBinaryInput/AsyncPipeReaderInputAdapterconsumers.- Doc-strings cross-reference the
byte[]/ReadOnlyMemory<byte>(ACCORE-BIN-T-M4D2) /ReadOnlySequence<byte>overloads with use-case guidance.
ACCORE-BIN-T-T8K3: Add SerializeAsync(Stream, T) async overloads with mode-driven output strategy
Priority: P1 · Type: Feature · Related: ACCORE-BIN-T-N9G6 (Type-based coordination)
The mainstream serializer ecosystem (System.Text.Json, MessagePack, Newtonsoft.Json, MemoryPack) all expose SerializeAsync(Stream, T) as a primary entry point — async file I/O, network response body, log streaming. AcBinary's public API surface MUST include this overload regardless of what we do internally; consumers expect a Stream parameter and don't navigate PipeWriter.Create(stream) workarounds. Market-entry-blocking otherwise.
Mode-driven output strategy — three lanes for three workload shapes
AcBinary already models the three output strategies in BinaryProtocolMode (AyCode.Services/SignalRs/BinaryProtocolMode.cs) for the SignalR side. The same three-lane shape applies to the public SerializeAsync(Stream) API. Promote the concept to AcBinary core scope (e.g. AcBinaryOutputMode in AyCode.Core/Serializers/Binaries/) and let the SignalR BinaryProtocolMode either alias it or migrate to it. Migration timing: the existing BinaryProtocolMode keeps shipping until the new public API is stabilized; both names live for one major version, then BinaryProtocolMode becomes a using-alias.
| Mode | Output strategy | Peak memory | Pipeline parallelism | Use when |
|---|---|---|---|---|
Bytes (default) |
Serialize(T) → byte[] + stream.WriteAsync(bytes) |
Full payload in byte[] (pooled) |
No | Typical payloads (<10 MB), throughput-focus |
Segment |
BufferWriterBinaryOutput → PipeWriter, single closing flush |
PipeWriter pause-threshold-bounded (~64 KB Kestrel default) | No | Mid-size payloads, zero-copy desired |
AsyncSegment |
SerializeChunked(PipeWriter), per-chunk async flush |
Chunk-size-bounded (~8 KB at default BufferWriterChunkSize) |
Yes (on parallel-capable PipeWriter — Kestrel / Pipe) |
Very large payloads (>10 MB), memory-tight hosts, parallel-capable transport |
Honest performance positioning vs. MemoryPack — three real axes
MemoryPack's SerializeAsync(Stream) is pseudo-streaming — serializes the entire payload into a pool-allocated linked-list buffer first (ReusableLinkedArrayBufferWriter), then writes the completed buffer to the stream in a single closing fence. Peak memory ≈ payload size; no pipeline parallelism. AcBinary's Bytes mode is architecturally similar (single pooled contiguous byte[] vs. MemoryPack's linked-list) — comparable peak-memory cost, often faster on the wire due to one contiguous WriteAsync call.
AcBinary's AsyncSegment mode is architecturally different in three real ways MemoryPack cannot match:
| Axis | Bytes mode (default) |
AsyncSegment mode |
MemoryPack SerializeAsync |
|---|---|---|---|
| Heap allocation per call | Pooled byte[] rent (peak ≈ payload size) |
Truly zero — ArrayPool + pooled context + MemoryMarshal.TryGetArray direct-buffer-write into the transport's own byte[] |
Pool-allocated linked-list buffer per call (peak ≈ payload size) |
| Peak managed memory | ≈ payload size | ≈ chunk size (BufferWriterChunkSize, e.g. 4-8 KB) |
≈ payload size |
| GC pressure | Touches GC pool on every call | Never touches GC for the serialize itself | Touches GC pool on every call |
| Pipeline parallelism | No | Yes on parallel-capable PipeWriter (Kestrel transport, new Pipe()) |
No |
| GB-scale payload | OOM risk on memory-tight hosts | Works | OOM risk |
The AsyncSegment zero-alloc claim is literal, not "almost zero": AsyncPipeWriterOutput.AcquireChunk calls _pipeWriter.GetMemory(chunkSize) and uses MemoryMarshal.TryGetArray(memory, out segment) to obtain the transport's own internal byte[] — the serializer writes directly into it. With chunkSize aligned to the transport's internal buffer (e.g. NamedPipe-server pipe-buffer-size), one chunk is one kernel-level transfer; no managed-side double-fragmentation.
Throughput nuance — AsyncSegment cost on Stream-backed transports
AsyncSegment IS slightly slower than Bytes on StreamPipeWriter-backed transports (NamedPipe / FileStream / NetworkStream), but not for the reason that initially seems obvious:
- The cost is NOT "managed-side double-fragmentation on top of OS-level fragmentation" — that's not what happens.
MemoryMarshal.TryGetArrayzero-copy direct-buffer-writes mean the managed chunking is the same chunking the kernel does anyway, not redundant. - The cost IS the per-chunk async-await round-trip (
SyncAwaitFlush(_lastFlush)blocks until the kernel acknowledges the write), forced sequential by theStreamPipeWriter._tailMemoryreset race (ACCORE-BIN-I-...). N async cycles vs 1 inBytesmode. - Empirically the gap is roughly 1.2-1.5x on NamedPipe — not 2-5x. The dominant cost on these transports is the transport itself (Windows IRP / Linux FIFO syscall overhead), independent of the serializer mode.
When AsyncSegment wins outright:
- GC-sensitive hot-paths (server hubs, real-time game tick loops, mobile UI thread, embedded targets): zero-alloc + zero-GC-pressure beats a 1.2x throughput edge every time.
- Memory-tight hosts (mobile, WASM, container-trimmed, embedded): chunk-bounded peak memory is the only option.
- GB-scale payloads:
BytesOOMs;AsyncSegmentworks. - Kestrel transport / parallel-capable
Pipe: pipeline parallelism makesAsyncSegmentfaster thanBytesfor medium-to-large payloads.
When Bytes wins outright:
- Tipikus NuGet workload (small-to-medium payload, throughput priority, GC-tolerant): one async cycle vs N is the simpler, faster path.
MemoryStream(in-memory): one largebyte[]copy decisively beats N managed chunks.
Marketing claim — three-way honest comparison
"AcBinary offers a real choice.
Bytesmode for typical throughput-priority workloads (matches MemoryPack's pseudo-streaming, often faster on the wire).AsyncSegmentmode for the workloads MemoryPack cannot serve: zero-alloc serialize for GC-sensitive hot-paths, chunk-bounded peak memory for tight-budget hosts, GB-scale payloads, and pipeline parallelism on parallel-capable transports. You pick the mode; MemoryPack picks for you."
This is honest — does not overclaim universal speed, does not hide the small AsyncSegment cost on Stream-backed transports, AND clearly surfaces the three differentiator axes (alloc / memory / parallelism) where AcBinary architecturally beats MemoryPack.
Implementation outline:
- New enum
AcBinaryOutputMode { Bytes = 0, Segment = 1, AsyncSegment = 2 }inAyCode.Core/Serializers/Binaries/. DefaultBytes. - New mode field on
AcBinarySerializerOptions:AcBinaryOutputMode OutputMode { get; set; } = AcBinaryOutputMode.Bytes;. (Note: subject toACCORE-BIN-I-L8N5thread-safety treatment — defensive copy / immutable refactor coordination.) public static ValueTask SerializeAsync<T>(T value, Stream stream, AcBinarySerializerOptions? options = null, bool leaveOpen = false, CancellationToken ct = default):- Switch on
options.OutputMode:Bytes→var bytes = Serialize(value, options); await stream.WriteAsync(bytes, ct); ArrayPool.Return(bytes);Segment→var pw = PipeWriter.Create(stream, new(leaveOpen: leaveOpen)); Serialize(value, pw, options); await pw.CompleteAsync();AsyncSegment→var pw = PipeWriter.Create(stream, new(leaveOpen: leaveOpen)); SerializeChunked(value, pw, options); await pw.CompleteAsync();
- Switch on
public static ValueTask SerializeAsync(object? value, Type type, Stream stream, ...)— non-generic, same dispatch (coordinated withACCORE-BIN-T-N9G6).leaveOpenparameter standard for stream-async serializers (System.Text.Json, MessagePack convention).- The
Bytesmode uses a pooledbyte[]fromArrayBinaryOutputto keep alloc cost amortized.
SignalR migration coordination: the existing BinaryProtocolMode enum (in AyCode.Services) keeps shipping unchanged until the new public API is stabilized. After stabilization, BinaryProtocolMode becomes a deprecated alias of AcBinaryOutputMode, eventually removed in a major-bump. No SignalR-side churn during this TODO's implementation.
Acceptance:
SerializeAsync<T>round-trips againstDeserialize<T>(byte[])viaMemoryStreamin all three modes.- Cancellation propagates correctly (
OperationCanceledExceptionon cancelled token mid-stream). - Throughput matrix benchmark: 4 transports (
MemoryStream,FileStream,NamedPipeStream,NetworkStream) × 3 modes × 3 payload sizes (small ~1 KB / medium ~100 KB / large ~10 MB). Results documented inTest_Benchmark_Results/Benchmark/SerializeAsync_Stream_Modes.LLM(or similar) and surfaced as a doc-string table for consumer guidance. - Memory-bounded benchmark: 100 MB payload to
FileStreaminAsyncSegmentmode → peak managed-heap delta ≤ 1 MB throughout. Same payload inBytesmode → peak ~100 MB (expected, documented). - API doc-string contains a "When to use which mode?" decision matrix; explicitly compares with MemoryPack's pseudo-streaming.
leaveOpenparameter behaves per the System.Text.Json / MessagePack convention across all three modes.
ACCORE-BIN-T-D7K4: Add DeserializeAsync(Stream, T) async overloads with mode-driven input strategy
Priority: P1 · Type: Feature · Related: ACCORE-BIN-T-T8K3 (companion write-side overload), ACCORE-BIN-T-N9G6 (non-generic Type-based dispatch)
Companion to T8K3 on the receive side. The mainstream serializer ecosystem (System.Text.Json, MessagePack, Newtonsoft.Json, MemoryPack) all expose DeserializeAsync<T>(Stream) — the symmetric counterpart of SerializeAsync(Stream, T). AcBinary's public API surface MUST include this overload for parity; consumers expect a Stream parameter for receive paths (file load, HTTP response body, network stream) and don't navigate PipeReader.Create(stream) workarounds. Market-entry-blocking otherwise.
Implementation: zero new IBinaryInputBase impl needed
The existing receive-side primitives cover the full strategy space via BCL PipeReader.Create(stream):
| Mode | Input strategy | Peak memory | Pipeline parallelism | Use when |
|---|---|---|---|---|
Bytes (default) |
await stream.CopyToAsync(MemoryStream) → Deserialize<T>(byte[]) (existing overload) |
Full payload as byte[] (pooled) |
No | Typical payloads (<10 MB), throughput-focus |
Segment |
await PipeReader.Create(stream).ReadAsync() → Deserialize<T>(ReadOnlySequence<byte>) (existing overload) |
PipeReader pause-threshold-bounded (~64 KB) | No | Mid-size payloads, no full byte[] alloc desired |
AsyncSegment |
AsyncPipeReaderInput + DrainFromAsync(PipeReader.Create(stream)) + Deserialize<T>(input) (existing overload) |
Chunk-size-bounded (~8 KB) | Yes (producer drain Task in parallel with deser Task) | Very large payloads (>10 MB), memory-tight hosts |
The AcBinaryOutputMode enum (introduced by T8K3) is symmetric — it controls deser-input strategy as well. The same enum value picks the matching read path. No new IBinaryInputBase implementation needed — the trio of existing inputs (ArrayBinaryInput, SequenceBinaryInput, AsyncPipeReaderInput) already cover all three modes; the new overload is a thin shim that wraps the Stream and routes to the right existing overload.
Public API shape
public static ValueTask<T?> DeserializeAsync<T>(
Stream stream,
AcBinarySerializerOptions? options = null,
bool leaveOpen = false,
CancellationToken ct = default);
// Non-generic Type-based variant (coordinated with N9G6):
public static ValueTask<object?> DeserializeAsync(
Stream stream,
Type targetType,
AcBinarySerializerOptions? options = null,
bool leaveOpen = false,
CancellationToken ct = default);
Implementation outline (per mode)
// Bytes mode (default — simplest path, sub-LOH-friendly fast path):
public static async ValueTask<T?> DeserializeAsync_Bytes<T>(Stream stream, ..., CancellationToken ct)
{
var rented = ArrayPool<byte>.Shared.Rent((int)Math.Min(stream.CanSeek ? stream.Length : 4096, int.MaxValue));
try
{
var totalRead = 0;
int read;
while ((read = await stream.ReadAsync(rented.AsMemory(totalRead), ct)) > 0)
{
totalRead += read;
if (totalRead == rented.Length) { /* grow rented */ }
}
return Deserialize<T>(rented, 0, totalRead, options);
}
finally { ArrayPool<byte>.Shared.Return(rented); }
}
// Segment mode (PipeReader.Create wrapping, then drain to ReadOnlySequence):
public static async ValueTask<T?> DeserializeAsync_Segment<T>(Stream stream, ..., CancellationToken ct)
{
var pipeReader = PipeReader.Create(stream, new(leaveOpen: leaveOpen));
var result = await pipeReader.ReadAtLeastAsync(int.MaxValue, ct); // drain whole stream
var seq = result.Buffer;
var obj = Deserialize<T>(seq, options);
pipeReader.AdvanceTo(seq.End);
await pipeReader.CompleteAsync();
return obj;
}
// AsyncSegment mode (chunked streaming pipeline, parallel drain + deser):
public static async ValueTask<T?> DeserializeAsync_AsyncSegment<T>(Stream stream, ..., CancellationToken ct)
{
using var input = new AsyncPipeReaderInput(options.BufferWriterChunkSize * 2, multiMessage: false);
var pipeReader = PipeReader.Create(stream, new(leaveOpen: leaveOpen));
var deserTask = Task.Run(() => Deserialize<T>(input, options), ct);
await input.DrainFromAsync(pipeReader, ct);
await pipeReader.CompleteAsync();
return await deserTask;
}
Honest performance positioning
Symmetric to T8K3's analysis:
Bytesmode: simplest, single contiguousbyte[](pooled) →Deserialize<T>(byte[]). Comparable to MemoryPack'sDeserializeAsync(which does similar full-buffer-then-deser). Best for typical workloads.Segmentmode: zero-copy from PipeReader's naturalReadOnlySequence<byte>— no extra byte[] allocation. Best for mid-size payloads where allocation matters but pipeline overlap doesn't.AsyncSegmentmode: producer-drain Task and consumer-deser Task in parallel viaAsyncPipeReaderInput. Wall-clock = max(network-drain, deser-CPU) + small overlap-cost. Best for large payloads + slow transports (network, mobile, satellite — where transit dominates and overlap pays).
Acceptance
DeserializeAsync<T>round-trips againstSerializeAsync(Stream, T)(T8K3) viaMemoryStreamin all three modes.- Cancellation propagates correctly (
OperationCanceledExceptionon cancelled token mid-stream); partial-buffer state cleaned up; pooled byte[] returned even on cancellation. - Throughput matrix benchmark (mirror of T8K3): 4 transports (
MemoryStream,FileStream,NamedPipeStream,NetworkStream) × 3 modes × 3 payload sizes. Results documented inTest_Benchmark_Results/Benchmark/DeserializeAsync_Stream_Modes.LLM. - Memory-bounded benchmark: 100 MB payload from
FileStreaminAsyncSegmentmode → peak managed-heap delta ≤ 1 MB throughout. Same payload inBytesmode → peak ~100 MB (expected, documented). - API doc-string contains a "When to use which mode?" decision matrix; cross-references T8K3's symmetric write-side guidance.
leaveOpenparameter behaves per the System.Text.Json / MessagePack convention across all three modes.
ACCORE-BIN-T-N9G6: Add non-generic Type-based Serialize(object, Type, ...) overloads
Priority: P2 · Type: Feature · Status: Closed (2026-05-04) · Related: ACCORE-BIN-T-T8K3
Resolution
Added in AcBinarySerializer.cs:
Serialize(object?, Type, opts)→byte[]Serialize(object?, Type, IBufferWriter<byte>, opts)→intSerializeChunked(object?, Type, PipeWriter, opts)→intSerializeChunkedFramed(object?, Type, PipeWriter, opts)→int
Added in AcBinaryDeserializer.cs:
DeserializeFromPipeReaderAsync<T>(PipeReader, opts, ct)→Task<T?>DeserializeFromPipeReaderAsync(PipeReader, Type, opts, ct)→Task<object?>
The Deserialize(byte[], Type, opts) / Deserialize(ReadOnlySequence<byte>, Type, opts) / Deserialize(AsyncPipeReaderInput, Type, opts) overloads already existed.
Consumed by ASP.NET Core MVC formatter package (AyCode.Services/Mvc/) — AcBinaryInputFormatter, AcBinaryOutputFormatter, AddAcBinaryFormatters extension. Media type: application/vnd.acbinary.
Plugin frameworks, ASP.NET ModelBinding, DI middleware, and DataContractSerializer-style "generic-API container" use-cases need to serialize an object whose type is known only at runtime. Current AcBinary surface forces a reflection trampoline through the generic Serialize<T>:
// Today's workaround (slow + noisy):
typeof(AcBinarySerializer).GetMethod("Serialize", new[] { type, typeof(AcBinarySerializerOptions) })
.MakeGenericMethod(type).Invoke(null, new[] { value, options });
Implementation outline:
public static byte[] Serialize(object? value, Type type, AcBinarySerializerOptions? options = null)public static int Serialize(object? value, Type type, IBufferWriter<byte> writer, AcBinarySerializerOptions? options = null)public static int SerializeChunked(object? value, Type type, PipeWriter writer, AcBinarySerializerOptions? options = null)andPipeoverloadpublic static int SerializeChunkedFramed(object? value, Type type, PipeWriter writer, AcBinarySerializerOptions? options = null)andPipeoverloadpublic static ValueTask SerializeAsync(object? value, Type type, Stream stream, ...)— coordinated withACCORE-BIN-T-T8K3- Internal dispatch:
value.GetType()is the runtime type; theType typeparameter constrains the declared type for polymorphism handling (ObjectWithTypeNamewrite decision).
Acceptance:
- All non-generic overloads round-trip via the generic deserializer's
Deserialize(byte[], Type)overload. - Plugin-style scenario: serialize
IList<dynamic>of mixed-type elements → all elements correctly typed in the wire output. - API doc-strings call out the performance characteristics (slightly slower than generic due to runtime
Typelookup but without the reflection trampoline cost).
ACCORE-BIN-T-R4P2: Expose low-level ref Writer-style API for custom formatters
Priority: P3 · Type: Feature
The MemoryPack-style Serialize<T>(ref MemoryPackWriter writer, in T value) low-level API enables:
- Custom formatters that compose write primitives without the full Serialize entry-point overhead.
- Nested-into-existing-stream scenarios where the caller already owns a writer-style cursor.
- Test harnesses that exercise specific wire-format paths in isolation.
Today's BufferWriterBinaryOutput standalone-mode partly fills this gap — exposing WriteByte, WriteVarUInt, WriteStringUtf8, etc. — but it is not a ref struct, not a documented low-level public API for external custom formatters, and the relationship with BinarySerializationContext<TOutput> is unclear from the consumer's perspective.
Design tension (decide before implementing):
- Promote
BufferWriterBinaryOutputto documented public surface — add doc, examples, supported usage patterns. Cheapest, but the standalone-mode is currently a side-feature, not a primary API; documenting it commits to its current shape. - New
ref struct AcBinaryWriterwrapper aroundBufferWriterBinaryOutput(or a dedicated impl) — explicit "this is the low-level writer" signal. More API surface but clearer mental model. Aesthetic alignment with MemoryPack. - Skip entirely — the
IBufferWriter<byte>overload is already lower-level than most consumers need; custom formatters can write to anArrayBufferWriter<byte>and useIBufferWriter-style primitives. This is whatBufferWriterBinaryOutputalready does internally.
Recommendation: option 3 is honest — the existing IBufferWriter<byte> overload covers the use case, and adding a ref struct AcBinaryWriter is mostly aesthetic alignment with MemoryPack. Re-evaluate when there's a concrete custom-formatter request that the current API can't accommodate.
Acceptance (if implemented):
AcBinaryWriter ref struct(or equivalent) compiles, supports the same write primitives asBufferWriterBinaryOutputstandalone-mode.- At least one example custom formatter ships in tests (e.g., a
Vector3struct formatter). - Doc-string clearly distinguishes when to use the low-level writer vs. the high-level
Serialize<T>entry-point.
ACCORE-BIN-T-U6Y8: Attribute-driven polymorphism via [AcBinaryUnion] + SGen (opt-in, AOT-friendly)
Priority: P1 (if AOT target required) / P2 (non-AOT only) · Type: Feature
Design philosophy alignment: AcBinary's market positioning is "JSON-style flexibility with MessagePack-class speed" — attributes are opt-in optimization, never required. The runtime polymorphism path (AQN-based, today's default) stays the default and continues to work for arbitrary unattributed types. This TODO adds a fast/AOT path alongside it, never replaces it.
AcBinary today handles polymorphism at runtime: the wire writes ObjectWithTypeName(72) + AQN string, and the deserializer calls Type.GetType(aqn) to resolve. This is flexible (no upfront declaration), but has three significant drawbacks for some consumers:
- AOT-incompatible —
Type.GetType(AQN)requires reflection metadata that the Native AOT trimmer strips by default. The runtime polymorphism path does not work at all under Native AOT. Hard blocker for AOT-targeting consumers (Blazor WASM, MAUI mobile, container-trimmed deployments). - Slower — AQN string parse + reflection lookup vs. a closed
switch (tag)in code-gen. - Larger wire format — full AQN string (often 100+ bytes) vs. a single-byte
tag.
Design — three coordinated pieces:
1. New 5th bool parameter on [AcBinarySerializable]: EnablePolymorphismFeature
Mirrors the existing EnableMetadataFeature / EnableIdTrackingFeature / EnableRefHandlingFeature / EnableInternStringFeature pattern. Per-type opt-out / opt-in via attribute parameter.
public AcBinarySerializableAttribute(
bool enableMetadataFeature,
bool enableIdTrackingFeature,
bool enableRefHandlingFeature,
bool enableInternStringFeature,
bool enablePolymorphismFeature) // ← ÚJ, default: true
Three behavior modes per type:
EnablePolymorphismFeature = false→ disabled. SGen never emits polymorphism dispatch for this type; runtime path also short-circuits — runtime type ≠ declared type is silently treated as declared (or throws, decision TBD). Use for hot-path closed types where polymorphism is impossible-by-design and the perf/AOT cost is unwanted.EnablePolymorphismFeature = true(default), no[AcBinaryUnion]→ runtime options control. Behaves perAcBinarySerializerOptions.PolymorphismMode(Runtime/AQN today). This preserves the JSON-style flexibility for unattributed bases.EnablePolymorphismFeature = true+[AcBinaryUnion(...)]declared → union-switch dispatch. SGen emits a closedswitch (tag)dispatch using the declared subtype set. Fast + AOT-friendly. Overrides the options-level default for this type.
2. New [AcBinaryUnion(byte tag, Type subtype)] attribute
Multiple instances per base class / interface declare the closed polymorphism set:
[AcBinarySerializable] // EnablePolymorphismFeature defaults to true
[AcBinaryUnion(0, typeof(Cat))]
[AcBinaryUnion(1, typeof(Dog))]
public abstract partial class Animal { ... }
SGen detects [AcBinaryUnion] on abstract / base type → emits the switch-based write/read dispatch instead of falling through to runtime AQN.
3. New PolymorphismMode enum on AcBinarySerializerOptions
Options-level default for unattributed polymorphism (i.e. the case where EnablePolymorphismFeature = true but no [AcBinaryUnion] is declared):
Runtime(today's default) — AQN-based. Flexible, AOT-incompatible.Throw— fail fast on any polymorphic write that lacks a[AcBinaryUnion]attribute. AOT-friendly diagnostic mode for migration scenarios.
Note: there is no UnionAttribute-only mode — declaration is per-type via the attribute, not options-global. The options-level mode only governs the fallback when no [AcBinaryUnion] is present.
Wire-format addition:
New marker (e.g. UnionTagBase = <TBD>) + [byte tag][inner Object], parallel to existing ObjectWithTypeName(72). Slot number to be assigned avoiding clashes with existing 64–134 / 192–255 ranges.
Implementation outline:
AcBinarySerializableAttribute— new ctor parameterenablePolymorphismFeature, all existing ctors default it totrue(backward compatible).AcBinaryUnionAttribute— new attribute,AttributeUsage(AttributeTargets.Class | Interface, AllowMultiple = true).- Source generator — emit
WriteUnion<TBase>(value, ctx, depth)andReadUnion<TBase>(ctx, depth)static methods on the union-base type's generated writer/reader. Skipped entirely whenEnablePolymorphismFeature = false. - Wire-format new marker +
[byte tag][inner Object]body. - Runtime path:
WriteValueNonPrimitivechecks the wrapper'sPolymorphismFeatureEnabledflag; whenfalse, skips thevalue.GetType() != declaredTypepolymorphism branch entirely.
Acceptance:
EnablePolymorphismFeature = false: SGen-emitted dispatch contains zerois-typeof / GetType branches; runtime path also short-circuits. Verify in JIT disassembly.EnablePolymorphismFeature = true, no union: runtime AQN polymorphism works as today (full backward compat); preserved JSON-style flexibility for unattributed bases.EnablePolymorphismFeature = true+[AcBinaryUnion]: AOT-test (Native AOT publish) compiles and round-trips a polymorphic graph —Type.GetType()is never invoked on this path.- Benchmark: union-switch polymorphism measurably faster than AQN polymorphism on deser side (typed switch vs. reflection lookup).
- Wire format documented in
BINARY_FORMAT.md;BINARY_FEATURES.mdcross-references the attribute pattern;BINARY_OPTIONS.mddocumentsPolymorphismMode.AcBinarySerializableAttributedoc-string explains all three behavior modes.
ACCORE-BIN-T-B7H4: Implement AcBinarySerializerOptions thread-safety fix
Priority: P2 · Type: Refactor · Related: BINARY_ISSUES.md#accore-bin-i-l8n5 (canonical issue)
The latent thread-safety problem documented in ACCORE-BIN-I-L8N5 — mutable set; properties on AcBinarySerializerOptions shared across concurrent serialize/deserialize calls — needs a fix before AcBinary ships as a NuGet package. The package cannot constrain how consumers scope their options instances; defensive contract is needed in the serializer itself.
Three candidate fix directions (decide before implementing):
-
Defensive copy on ingress — add
AcBinarySerializerOptions Clone()method (member-wise copy). Every API entry point that retains an options instance clones it on entry. External mutation to the original becomes invisible to the holder.- Pro: non-breaking. Existing consumer code unchanged. No major version bump required.
- Pro: API surface change limited to one new
Clone()method. - Con: per-call clone overhead (small, but non-zero). Cache keyed on options-identity becomes invalid for downstream code using reference equality.
- Con: doesn't fix the underlying mutability — internal code can still race-mutate the cloned snapshot if a method retains both the snapshot and modifies it concurrently.
-
Immutable record refactor —
set;→init;on all configuration properties. Mutation requireswith-expression which produces a new instance.- Pro: type-system-strong guarantee. Race becomes a compile error, not a runtime corruption risk.
- Pro: zero runtime overhead (init-only is compile-time check; record class semantics are unchanged at runtime).
- Con: breaking change for any consumer doing
opts.UseGeneratedCode = falseafter construction. Major version bump. - Con: source-generator coordination needed if SGen emits options-builder code that mutates properties.
-
Read-only flag pattern (à la
JsonSerializerOptions.MakeReadOnly()) — mutable by default, holder callsMakeReadOnly()on entry; subsequent property setters throwInvalidOperationException.- Pro: BCL-precedent — Microsoft adopted it for
JsonSerializerOptionsin .NET 7 (dotnet/runtime#74431) for exactly this problem. Familiar pattern for consumers. - Pro: minimal API surface change (one new method +
IsReadOnlyflag property). - Pro: per-call overhead = single bool check per setter call. Negligible.
- Con: opt-in by the holder — if a custom consumer-side wrapper forgets to call
MakeReadOnly(), the safety hole stays open for that wrapper's clients. Documentation-driven safety, not type-system-driven. - Con: bypasses static-analysis tooling (the setter signature stays public; the throw is runtime). IDE doesn't surface "this property is currently read-only" in autocomplete.
- Pro: BCL-precedent — Microsoft adopted it for
Recommendation: Option 3 (MakeReadOnly pattern) is the BCL-precedent, lowest-friction migration path. Microsoft adopted it for JsonSerializerOptions in .NET 7 to solve the same problem; AcBinary should follow the same pattern for consistency with consumers' mental model and zero migration cost.
Coordination with the existing AcBinaryHubProtocol setter side-effect (the second risk surface in ACCORE-BIN-I-L8N5): the protocol ctor currently mutates the caller-provided options reference (_options.BufferWriterChunkSize = options.BufferSize). After the fix:
- Option 1 (Clone): ctor mutates the cloned snapshot → no side-channel to the caller. Fix transparent.
- Option 2 (Immutable): ctor cannot mutate; needs to construct a new options via
with-expression. Breaking change in the ctor's options-handling. - Option 3 (MakeReadOnly): ctor mutates before calling
MakeReadOnly()— same as today, but explicit "frozen" point afterwards. Caller-side mutation post-ctor is now a runtime throw.
Implementation outline (Option 3 path):
AcBinarySerializerOptions.IsReadOnly { get; }— public bool property.AcBinarySerializerOptions.MakeReadOnly()— sets the flag; idempotent (no-op if already set).- All
set;accessors guard:if (IsReadOnly) throw new InvalidOperationException("AcBinarySerializerOptions has been made read-only and can no longer be mutated. Construct a new options instance instead.");. AcBinarySerializer.Serialize<T>entry (and all sibling entries —Deserialize<T>,SerializeChunked, etc.):options.MakeReadOnly()before any property read.AcBinaryHubProtocolctor: complete theBufferWriterChunkSizemutation before callingoptions.MakeReadOnly(). After ctor returns, the options instance is frozen for that protocol's lifetime.- Doc-string update on
AcBinarySerializerOptionsclass header: explicit "thread-safety contract" section explaining the freeze-on-first-use semantics.
Acceptance:
- Concurrent stress test (16 threads × 1000 iterations) on a shared
AcBinarySerializerOptionsinstance with property-mutation-attempts mid-iteration — all mutations afterMakeReadOnly()throwInvalidOperationException; no silent corruption observed. - Existing tests pass unchanged (the
MakeReadOnlyis opt-in for the serializer entries; tests that build options + use them once continue to work transparently). BINARY_ISSUES.md#accore-bin-i-l8n5Status updated toClosed (YYYY-MM-DD)with a### Resolutionsub-section pointing to this TODO + the implementing commit.- Doc-string on
AcBinarySerializerOptionsdocuments the freeze-on-first-use contract;BINARY_FEATURES.mdorBINARY_OPTIONS.mdcross-references the BCL-precedent (JsonSerializerOptions.MakeReadOnly).
ACCORE-BIN-T-F8N3: Switch source-generator type-name hashing from simple-name to fully-qualified-name
Priority: P3 · Type: Refactor · Related: ACCORE-BIN-T-I3P8 (override mechanism for residual collisions)
The source generator's ComputeFnvHash(typeSymbol.Name) uses the simple name only (e.g. "User", not "MyApp.A.User"). Cross-namespace types with the same simple name silently collide on s_typeNameHash. The hash is currently only consumed by the WireMode=Metadata inline metadata-write path (cross-version property compat) — the framework explicitly does NOT add wire-format type-id (per CLAUDE.md Rule #7: type-dispatch is consumer responsibility, see BINARY_ASYNCPIPE_ISSUES.md#accore-bin-i-t6v2). Within UseMetadata, the simple-name collision can still cause silent property-set mismatches between two types with the same short name in different namespaces — this TODO fixes that.
Change scope (AcBinarySourceGenerator.cs) — 4 call sites: ComputeFnvHash(typeSymbol.Name) → ComputeFnvHash(typeSymbol.ToDisplayString()):
- Self type-name hash (~line 358)
- Child type-name hash (~line 157)
- Element type-name hash (~line 254)
- Dict-value type-name hash (~line 311)
No runtime code changes; output regenerates with new constants on next build.
Breaking change scope: any saved binary stream that uses WireMode=Metadata and was produced by an older version embeds the old simple-name hash; consumers reading those streams with the new hash compute would mismatch and throw. Pre-1.0: acceptable. Post-1.0 would require a WireMode=Metadata format-version bump.
Acceptance:
- All
*_GeneratedWriter.g.csfiles regenerate with FQN-baseds_typeNameHashvalues. - Existing tests pass (auto-regen propagates; no manual hash literals in tests).
- Wire format identical for
WireMode=Compact(no metadata embedded). UseMetadata=truepaths produce different hashes — explicitly tested via round-trip.
ACCORE-BIN-T-I3P8: [AcBinaryTypeId(...)] attribute — explicit type-id override
Priority: P3 · Type: Feature · Related: ACCORE-BIN-T-F8N3 (FQN base hash being overridden)
Once ACCORE-BIN-T-F8N3 reduces collision frequency by switching to FQN, residual FQN-hash collisions are still possible (32-bit hash space, birthday paradox). Currently the only consumer of s_typeNameHash is the WireMode=Metadata inline metadata-write path — a residual collision there causes a silent property-set mismatch.
[AcBinaryTypeId(0x12345)] attribute on a class:
- Source generator emits
s_typeNameHash = 0x12345instead of computing FNV. - Two types with the same
[AcBinaryTypeId(...)]value → compile-time / first-use error.
Useful for:
- Resolving rare FQN-hash collisions deterministically (within
WireMode=Metadata). - Pinning a stable type-id across class renames (wire-compat across versions in
Metadatamode). - Future-proofing: if a Layer 1 consumer (hypothetically) builds a type-dispatch above AcBinary using
s_typeNameHash, the same override mechanism applies.
Acceptance:
- New attribute class shipped alongside
[AcBinarySerializable]. - Generator honours the override (emits explicit constant instead of FNV result).
- Tests: rename a class with
[AcBinaryTypeId]→s_typeNameHashunchanged.
ACCORE-BIN-T-X2M5: Evaluate xxHash3 vs FNV-1a for type-name hashes
Priority: P3 · Type: Investigation · Related: ACCORE-BIN-T-F8N3
FNV-1a is currently used for both s_typeNameHash and s_propertyHashes. For compile-time hashing, performance is irrelevant. For collision resistance:
- FNV-1a 32-bit: ~50% collision at ~77K types (birthday paradox). Adequate for small/medium projects, marginal for large ones with many auto-generated types.
- xxHash3 32-bit: comparable mathematical properties to FNV-1a (both non-cryptographic).
- xxHash3 64-bit: dramatically better collision resistance (~50% at ~5B entries), at the cost of 8 wire bytes instead of 4.
Trigger: real collisions observed (1000+ types per assembly + cross-assembly aggregation), or community feedback indicating collision pain.
Investigation questions (no code change without a triggering pain signal):
- Switch to xxHash3 32-bit (incremental improvement) — but doubles the change scope (touch property hashes too if uniformity desired).
- Switch to xxHash3 64-bit (8 wire bytes instead of 4) — meaningful collision resistance, modest wire cost.
- Stay on FNV-1a + force
[AcBinaryTypeId]for collisions — minimal change, devops burden.
Investigation only — defer until pain signal arrives.
ACCORE-BIN-T-K9E4: [RequiresDynamicCode] + [RequiresUnreferencedCode] on Runtime-only methods
Priority: P3 · Type: Refactor · Related: BINARY_FEATURES.md#nativeaot-compatibility
The Runtime path (factories in AcSerializerCommon + wrapper-based deserialize fallback in AcBinaryDeserializer) currently works under NativeAOT thanks to DAMs propagation + RuntimeFeature.IsDynamicCodeSupported guards, but the trimmer still emits warnings for the well-known blind spots (polymorphism via obj.GetType(), nested-type chain via generic argument extraction). The library suppresses these with [UnconditionalSuppressMessage] and documented justification.
A complementary signal would be to mark the Runtime entry points (or the factories themselves) with [RequiresDynamicCode("AcBinary Runtime path uses Reflection.Emit / closed-generic instantiation; use [AcBinarySerializable] + SGen for NativeAOT.")] and [RequiresUnreferencedCode("...")]. Effect:
- AOT publish in consumer's project surfaces a warning at the call site → consumer chooses SGen or accepts the Runtime cost
- Mirrors the System.Text.Json reflection-mode pattern (
[RequiresDynamicCode]onJsonSerializer.Serialize<T>overloads) - One-codebase, no NuGet split needed
- Cheap implementation — attribute placement only
Coordination: [RequiresDynamicCode] is contagious; every caller must either propagate it or suppress with [UnconditionalSuppressMessage]. Scope:
- Public
Serialize<T>/Deserialize<T>entry points stay attribute-free (consumer-facing) - Runtime fallback methods get the attribute (contained inside the library)
- The DAMs annotations we already have stay — they're orthogonal (one prevents trim, the other warns about JIT-only behavior)
Acceptance:
- Consumer's AOT publish surfaces a IL2026/IL3050 warning when
UseGeneratedCode=falseis set or an unattributed type is deserialized - SGen path is warning-free
- Library compiles 0 warnings (suppressions added at the propagation barrier)
BINARY_FEATURES.mdNativeAOT Compatibility section updated to mention the explicit warning signal
ACCORE-BIN-T-A2J7: Optional AyCode.Core.Aot NuGet variant (SGen-only build)
Priority: P3 · Type: Feature · Related: BINARY_FEATURES.md#nativeaot-compatibility, ACCORE-BIN-T-K9E4
Binary-size-sensitive AOT consumers (Blazor WASM, MAUI mobile, embedded, container-trimmed) benefit from a smaller library variant that strips the Runtime fallback path entirely. Estimated savings: ~80-150 KB of native code (~25-60 KB compressed wire size for WASM publish).
Strippable code in the .Aot variant:
| Component | LOC | Purpose | Removable in Aot? |
|---|---|---|---|
AcSerializerCommon.Create* (7 factory methods + Expression-tree code) |
~150 | Runtime delegate compilation | ✅ Yes |
TypeMetadataBase runtime metadata path (CompiledConstructor, IdGetters via Expression.Compile) |
~300 | Reflection-based metadata | ✅ Yes |
AcBinaryDeserializer wrapper-based runtime fallback (PopulateObjectPropertiesIndexed, ReadObjectCoreWithWrapper non-SGen branches, CreateInstance(type) Activator-fallback) |
~500 | Runtime polymorphic dispatch | ✅ Yes |
Property accessor runtime delegate fields (_dynamicGetter, typed getter/setter caches outside SGen) |
~150 | Boxed property access | ✅ Yes |
System.Linq.Expressions transitive dependency |
— | Expression-tree IL emission | ✅ Yes (when nothing else in graph uses it) |
Implementation sketch (avoid #if-erdő via file-level split):
AyCode.Core/Serializers/
AcSerializerCommon.cs // SGen-safe shared parts
AcSerializerCommon.Runtime.cs // 7 Create* factory methods only here
AcBinaryDeserializer.cs // SGen path
AcBinaryDeserializer.Runtime.cs // wrapper-based runtime fallback path
TypeMetadataBase.cs // SGen-safe metadata
TypeMetadataBase.Runtime.cs // Expression.Compile-based ctor + accessor wiring
Two .csproj files:
AyCode.Core.csproj— full package (current); includes all filesAyCode.Core.Aot.csproj—<Compile Remove="**/*.Runtime.cs" />; sets<PackageId>AyCode.Core.Aot</PackageId>; same version as full
Trade-offs:
- ✅ No
#ifdirectives in business code — physically separate file groups - ✅ Source mostly shared via SDK include/exclude semantics
- ✅ DAMs annotations and trim-suppressions only land in the full package;
.Aotvariant is genuinely trim-clean by construction - ✅ "Strict SGen" semantics in
.Aot: a non-SGen type at deser time throws clearly instead of silently falling back. Marketing positioning: "guaranteed SGen path, no hidden slow lane". - ⚠️ Two NuGet IDs, two changelogs, version sync (CI-automatable)
- ⚠️ Consumer must pick the right package — wrong choice = breaking switch later
Coordination:
- Land
ACCORE-BIN-T-K9E4first ([RequiresDynamicCode]attributes) — if that pattern handles the consumer-side scenarios well,.Aotmay not be needed - The current Runtime fallback code is already well-isolated (mostly in
AcSerializerCommonfactories +AcBinaryDeserializerwrapper-based methods), so the file-split refactor is mechanically straightforward - Marketing decision: is binary size a central pillar? If yes,
.Aotis a NuGet differentiator; if not,K9E4alone is enough
Acceptance:
AyCode.Core.Aot.csprojproduces a NuGet ~25-60 KB smaller thanAyCode.Coreafter compression.Aotbuild emits zero IL/AOT trim warnings (no suppressions needed because the Runtime path code is physically removed)- Round-trip tests pass on
.Aotfor all SGen types .Aotthrows a clearInvalidOperationException(notMissingMethodException) when a non-[AcBinarySerializable]type is encountered at deser timeBINARY_FEATURES.mdNativeAOT Compatibility section documents both packages and when to choose which
ACCORE-BIN-T-V4N2: .NET 11 SIMD-specialized UTF-8 decoder via multi-targeting
Priority: P3 · Type: Performance · Related: AcBinaryDeserializer.BinaryDeserializationContext.Read.cs::DecodeUtf8
The custom UTF-8 → UTF-16 decoder in DecodeUtf8 / CountUtf8Chars / DecodeUtf8ToChars currently targets .NET 9 — scalar two-pass with optional Vector256 ASCII prefix widen + DWORD ASCII batch (per Phase 1 optimization). .NET 11 (planned ~Nov 2026) exposes additional SIMD intrinsics that can meaningfully accelerate the decoder on AVX-512-capable hosts, particularly the vpcompressb-style mask-driven byte compression that simdutf relies on for its 64-byte AVX-512 transcoder.
Why .NET 11 specifically (and not .NET 10)
- .NET 10: incremental SIMD improvements, but the changes that affect us are mostly inside the BCL (
Encoding.UTF8.GetStringinternal SIMD widening). Our custom decoder bypasses the BCL — we don't benefit unless we hand-roll the same SIMD ourselves with .NET 9 intrinsics, which already work today. Multi-targetingnet9.0;net10.0adds CI/test overhead with marginal payoff. Skip. - .NET 11: PR #120628 (Vector512/Vector256 SIMD for UTF-8 utilities) was closed without merge but signals upcoming work in this area. Future iterations are expected to expose
Avx512Vbmi-style mask-compress intrinsics that today require unsafe / Vector128-emulation paths. Target this once the framework lands.
Implementation outline (when triggered)
- Multi-target
<TargetFrameworks>net9.0;net11.0</TargetFrameworks>onAyCode.Core.csproj #if NET11_0_OR_GREATERblock inDecodeUtf8selects an AVX-512-aware path: process 64-byte blocks viaVector512<byte>+vpcompressbfor byte-stream extraction, fall back to the .NET 9 scalar+Vector256 path on non-AVX-512 hardware (Avx512Vbmi.IsSupportedruntime check)- Reuse the .NET 9 scalar path for short strings (<64 bytes) — SIMD setup cost dominates
- New benchmark cells comparing .NET 9 vs .NET 11 builds on the same hardware
Acceptance
dotnet testpasses on both target frameworks- Benchmark on AVX-512 hardware (Sapphire Rapids / Zen 4+) shows ≥1.5x non-ASCII deser speedup vs .NET 9 build for strings ≥256 bytes
- Short-string perf (≤64 bytes) within ±5% of .NET 9 build (no regression from multi-target setup)
BINARY_FEATURES.mddocuments the SIMD path selection logic
Trigger
- Wait for .NET 11 release (or RC)
- Re-evaluate once
dotnet/runtimeUTF-8 SIMD utilities re-land (post-PR #120628 follow-up) - Skip entirely if .NET 11 BCL
Encoding.UTF8.GetStringbecomes fast enough that hybrid (≥256 bytes → BCL, <256 → custom) wins without hand-rolled SIMD
ACCORE-BIN-T-S5L8: Sentinel-length encoding for strings (wire-size optimization, both modes)
Priority: P3 · Type: Wire-format optimization · Related: AcBinarySerializer.WriteString, AcBinaryDeserializer.ReadValue string dispatch
The leading string-marker byte (String / StringEmpty / Null) exists primarily to distinguish null vs empty vs non-empty before dispatching. For non-polymorphic, non-interned string properties the marker can be replaced by a single sentinel-length VarUInt:
[VarUInt sentinelLength] [content bytes if applicable]
sentinelLength == 0 → null
sentinelLength == 1 → empty string
sentinelLength == N+1 → string of N bytes/chars, content follows
MemoryPack-style encoding pattern. Applies to both Compact (UTF-8) and FastWire (UTF-16 raw) modes; the content following the sentinel differs by mode.
Per-mode impact
FastWire mode — wire layout today: [String marker][VarUInt charCount][UTF-16 raw bytes]. Sentinel saves 1 byte per non-null string.
| TestData | Current FastWire wire | Estimated with sentinel | Δ |
|---|---|---|---|
| Small | 3122 B | ~3050 B | -2% |
| Medium | 10905 B | ~10500 B | -4% |
| Large | 68603 B | ~67000 B | -2% |
| Repeated | 16244 B | ~15700 B | -3% |
| Deep | 15514 B | ~14900 B | -4% |
Closes the +1.7-8.1% FastWire wire gap vs MemoryPack to near zero or favorable while keeping AcBinary FastWire's +9-20% speed advantage.
Compact mode — wire layout today varies by length:
- Short (≤31 byte):
[FixStr+length][UTF-8 bytes]— already 1-byte marker, ties sentinel. - Long (>31 byte):
[String marker][VarUInt byteCount][UTF-8 bytes]— sentinel saves 1 byte (the marker).
Compact gain: only on long strings (>31 byte UTF-8). Estimated −1 byte per long string. Workload-dependent: if most strings are short or use interning, gain is small. If many long mixed-content strings, meaningful saving.
Limitations (both modes)
- Polymorphic
objectproperties: marker needed for type discrimination. Sentinel encoding only applies when the property type is staticallystringorstring?. - Interning incompatible: sentinel cannot express
StringInternFirst/StringInternedmarkers (those carry cache-index semantics). Interned properties keep marker-based encoding. FastWire mode already disables interning by design (consistent); Compact mode needs per-property dispatch (interned → marker, non-interned → sentinel). - Compact-mode FixStr ties: short strings (≤31 byte UTF-8) gain nothing in Compact (FixStr is already 1-byte marker+length). The optimization wins only on long strings in Compact.
Implementation outline (rough — refine when implementing)
- Writer: branch in
WriteStringon property metadata flags(IsString, IsNotInterned, IsNotPolymorphic). If sentinel-eligible, emitVarUInt sentinelLength+ content. Else fall through to existing marker-based encoding. - Reader: matching branch in property reader. If sentinel-eligible (per property metadata), read
VarUInt sentinelLength, dispatch on 0/1/N+1. - SGen: emit sentinel-encoding variant for non-polymorphic non-interned
stringtyped properties; emit existing marker-encoding for the rest. - Wire format version bump OR header flag indicating sentinel-encoding-active. (Cross-version compat policy decided when implementing.)
Trigger
- After D-2 / decoder optimization / marker-dispatch land (compact-mode focus completes)
- When wire-size positioning becomes a primary pillar for NuGet release
- Re-evaluate scope at implementation time — exact gain in Compact depends on consumer workload (long-string ratio, interning patterns)
Acceptance
- FastWire mode: AcBinary wire ≤ MemoryPack on at least 4 of 5 test cells
- Compact mode: long-string wire bytes -1 each, no regression on short or interned strings
- Speed benchmark: no regression vs current encoding (essentially zero CPU cost — sentinel is shifted bookkeeping)
- Cross-version compat: documented format version bump + clean fail on old reader / new wire mismatch
- Polymorphic + interned property test cases pass unchanged (use existing marker-based encoding)
ACCORE-BIN-T-M3R7: ASCII marker-dispatch — writer detect + reader dedicated path
Priority: P2 · Type: Performance + wire optimization · Related: BinaryTypeCode.FixStrAsciiBase..StringAscii markers, WriteStringWithDispatch, ReadAsciiBytesAsString
Status: Closed (2026-05-04)
Sorrendi megjegyzés: ezt AZ ENCODER OPTIMALIZÁCIÓ UTÁN csináljuk (lásd
ACCORE-BIN-T-E2F9). Indok: a custom encoder/decoder Vector256 ASCII narrow/widen path-jai már magukban gyorsan kezelik az ASCII byte-ot. A marker-dispatch ezen FELÜL csak a per-call dispatch-overhead spórolást hozza (noAscii.IsValidscan, no decoder layer). Garantált win, de additív — méréstechnikailag tisztább a decoder/encoder utánra hagyni.
The FixStrAscii* (135-166) and StringAscii (167) markers are defined in BinaryTypeCode.cs with helper methods (IsAsciiString, IsFixStrAscii, EncodeFixStrAscii, DecodeFixStrAsciiLength). Encoding/decoding logic NOT yet implemented — currently both writer and reader use the universal String / FixStr markers.
Implementation
- Writer: in
WriteStringUtf8/WriteFixStrDirect, after UTF-8 encoding (D-2 path), checkbytesWritten == charLength(= ASCII iff equal). If ASCII, emitFixStrAscii(≤31 byte) orStringAscii(>31 byte). Else emit existingFixStr/String. Free detect — both numbers already computed by D-2. - Reader: in
ReadStringUtf8(or upstream marker dispatch), branch on marker. ASCII markers → dedicated byte→char widening path (no UTF-8 decode, noAscii.IsValidscan, no decoder dispatch). Non-ASCII markers → existing custom UTF-8 decoder. - SGen: regenerate readers/writers to dispatch on the new markers.
- Re-enable ASCII fast paths: uncomment writer FixStr dispatch in
AcBinarySerializer.csand readerAscii.IsValidblock inReadStringUtf8— these temporarily disabled blocks become the marker-aware paths (no IsValid scan needed since the marker is the contract).
Wire format change
- Format version bump (1 → 2). Old readers fail clean on new wire (version mismatch). New readers must reject old wire OR support backward read.
Acceptance
- Repeated Strings (Hungarian content) Deser: AcBinary closes the ~10% gap vs MemoryPack
- Pure ASCII tests (Small/Medium/Large/Deep): AcBinary Ser AND Deser ≥ MemoryPack
- Wire size: minimum -25% vs MemoryPack across all test cells
- SGen-generated code compiles and round-trips on all
[AcBinarySerializable]types - Decision documented: backward-compat policy for v2 vs v1 wire
Resolution
End-to-end implementation landed (writer + reader + SGen + skip + populate). Key components:
- Writer (
AcBinarySerializer.BinarySerializationContext.WriteStringWithDispatch) — single-pass UTF-8 encode + ASCII detect viabytesWritten == charLength; emits one of 4 markers (FixStrAscii / FixStr / StringAscii / String). Split layout for hot path:charLength ≤ 31encodes optimistically atsavedPos+1(FixStr position) → 0 shift on FixStr hit;charLength > 31uses D-2 layout with backfill. The split avoids the post-encode left-shift that the unified layout introduced (regression seen in 12-42-32 bench). - Reader (
AcBinaryDeserializer.BinaryDeserializationContext.ReadAsciiBytesAsString) —Encoding.Latin1.GetString(BCL SIMD-accelerated byte→char widen). Avoids thestring.Createcallback + scalar widen overhead — measurably better on Small Deser cell (closed the +20% MemPack-relative anomaly). - TypeReaderTable:
StringAscii(167) + 32 ×FixStrAscii(135-166) readers registered.IsFixStrAscii/StringAsciifast paths inPopulatePropertyWithMarker,ReadValue,SkipValue. - SGen (
AcBinarySourceGenerator.EmitReadString) — regenerated readers branch onIsFixStr/IsFixStrAscii/case StringAsciiper property.
Wire format version not bumped — the new markers occupy previously-unused codepoints (135-167); old wire (without ASCII markers) is forward-compatible (readers handle both String and StringAscii). v1 stays.
Acceptance (AOT bench 13-40-29, MemPack-relative ratios — JIT noise eliminated):
- ✅ AcBinary Ser AND Deser GYORSABB MemPack-nél MINDEN cellán (5/5)
- Small: Ser -8%, Deser -23%
- Medium: Ser -17%, Deser -30%
- Large: Ser -28%, Deser -32%
- Repeated: Ser -4%, Deser -9%
- Deep: Ser -24%, Deser -22%
- ✅ Wire size advantage: 2043-50419 byte (vs MemPack 3070-64986) = -22% to -33% across cells
- ✅ Round-trip tests: 167 pass (13 pre-existing failures are IId-tracking, unrelated to M3R7)
JIT vs AOT note: earlier JIT-mode benchmarks (12-50-43 → 13-27-20 series) showed elevated ratios on Small/Repeated cells (1.0-1.2 range) that disappeared under AOT publish. The JIT-mode numbers reflect tier-up artifacts (inconsistent inlining of SGen-generated reader hot paths during the 1000-iteration measurement window), not a structural M3R7 property. AOT (NativeAOT / ILC) compiles deterministically with fixed inline decisions — the steady-state numbers above reflect the actual production performance.
ACCORE-BIN-T-E2F9: Custom UTF-8 encoder (writer-side, symmetric with custom decoder)
Priority: P1 · Type: Performance · Related: decoder optimization (AcBinaryDeserializer.BinaryDeserializationContext.Read.cs::DecodeUtf8SinglePass)
Status: Closed (2026-05-04)
Sorrendi megjegyzés: ezt A MARKER-DISPATCH ELŐTT csináljuk (lásd
ACCORE-BIN-T-M3R7). Indok: a custom encoder/decoder optimalizáció a "nehezebb, kevésbé biztos" win — a non-ASCII / mixed content workload-okat (Repeated Strings Hungarian) hozza be. A marker-dispatch utána már csak additív tisztítás a pure ASCII path dispatch-overhead-jén.
Replace Encoding.UTF8.GetBytes calls in WriteStringUtf8 / WriteStringUtf8Internal / WriteFixStrDirect (collectively the writer's UTF-8 encode path, post-D-2) with a hand-rolled SIMD encoder. Symmetric to the decoder optimization (V4N2 / Read.cs::DecodeUtf8SinglePass).
Layered structure (mirrors decoder)
- Phase 1 — Vector256 ASCII narrow: 16 chars (Vector256) → 16 bytes (Vector128) via
Vector256.Narrow. ASCII detect via(v & 0xFF80).ExtractMostSignificantBits() == 0(any high bit on UTF-16 char). Break on first non-ASCII char. - Phase 2 — DWORD ASCII batch: 4 chars at a time, OR-mask test, 4 bytes per iter when ASCII.
- Phase 3 — Scalar multi-byte encode: 1-byte (ASCII) / 2-byte (Latin extended) / 3-byte (BMP) / 4-byte (surrogate pair → supplementary plane) UTF-8 encoding via direct bit-extract. No fallback dispatch — input is trusted UTF-16 (string).
- Use
System.Text.Unicode.Utf8.FromUtf16as fallback target for scalar correctness — or skip BCL entirely with manual bit-pack.
Why
Encoding.UTF8.GetBytes carries virtual-dispatch + encoder-fallback overhead even with SIMD ASCII fast path internally. Custom encoder skips this. ~15-30% Ser improvement on ASCII content, ~5-10% on non-ASCII (multi-byte path stays scalar).
Trigger
- NEXT — implementation order P1 before marker-dispatch (M3R7)
- Re-evaluate if .NET 11 BCL UTF-8 GetBytes becomes faster (PR #120628 follow-up)
Acceptance
- Writer-side benchmark: ≥15% Ser speedup on ASCII content (Small/Medium/Large/Deep), ≥5% on non-ASCII (Repeated)
- Wire format unchanged (custom encoder produces same bytes as
Encoding.UTF8) - Round-trip tests pass
Resolution
Implemented as EncodeUtf8SinglePass in AcBinarySerializer.BinarySerializationContext.cs — three-phase layered encoder (Vector256 ASCII narrow + DWORD ASCII batch + scalar 1/2/3-byte BMP & 4-byte surrogate-pair). Bypasses Encoding.UTF8.GetBytes virtual-dispatch + encoder-fallback overhead. Trusted-input path — no validation pass on writer side (the input is a .NET string with valid UTF-16 surrogate pairs by construction).
Used by WriteStringUtf8 (D-2 single-pass with VarUInt backfill) and WriteStringWithDispatch (M3R7 marker-dispatch path). Wire format unchanged — the encoder produces the same bytes as Encoding.UTF8.GetBytes.
Acceptance (per bench 12-50-43 → 13-27-20, MemPack-relative ratios on AcBinary Compact FastMode SGen):
- ✅ ASCII Ser ≥ MemPack on 4/5 cells (Small 0.94, Medium 0.80, Large 0.79, Deep 0.81)
- ⚠️ Repeated Ser ~1.04 (Hungarian, multi-byte path scalar) — see follow-up
ACCORE-BIN-T-H7K3 - ✅ Round-trip tests pass (167 of 180; 13 pre-existing failures unrelated to encoder)
ACCORE-BIN-T-W7N5: Default-value omission policy — doc + optional opt-out
Priority: P2 · Type: Refactor + Documentation · Related: BINARY_ISSUES.md#accore-bin-i-d9y2 (canonical issue)
The serializer's PropertySkip (102) optimization saves 1 byte per default-valued property by omitting the full value from the wire — relying on the consumer-side type definition to have the same default(T). This is a latent correctness risk documented in ACCORE-BIN-I-D9Y2. This entry tracks the mitigation plan; full failure-mode analysis lives in the issue.
Decision tree (TBD when implementing)
- Doc-only: position as a deliberate protobuf-style feature; consumer keeps type defaults stable across versions. Lowest cost, maximum benchmark wire-size advantage retained.
- Option flag:
AcBinarySerializerOptions.OmitDefaultsboolean. Defaulttrue(preserves current behavior + benchmark numbers).falsewrites every property in full — opt-out for fragile-class-evolution scenarios. - Both: ship doc + flag. Default behavior unchanged; consumers who hit silent-corruption have an explicit opt-out.
Acceptance (when implementing)
BINARY_FEATURES.mdadds a "Default-Value Omission" section documenting the semantic and the tradeoff (with cross-ref toACCORE-BIN-I-D9Y2)- If flag added: round-trip tests covering both
trueandfalse; benchmark comparison table showing wire-size delta on ASCII / Hungarian / DTO-heavy workloads - Decision rationale recorded in
LLM_PROTOCOL_DECISIONS.md(or a### Resolutionblock on the issue) once implemented
ACCORE-BIN-T-H7K3: Hungarian / multi-byte content Ser optimization (Repeated Strings cell)
Priority: P3 · Type: Performance · Related: EncodeUtf8SinglePass Phase 3 (scalar multi-byte encode), ACCORE-BIN-T-E2F9 resolution
Status: Closed (2026-05-04) — Won't Fix (JIT-only artifact)
The Repeated Strings benchmark (Hungarian content: "TermékNév_…", "RaklapKód_…") still shows AcBinary Ser ratio ~1.04 vs MemPack across multiple runs (12-50-43 / 13-21-27 / 13-27-20 series). All other ASCII-heavy cells (Small/Medium/Large/Deep) sit in the 0.79-0.94 ratio range — Repeated is the outlier.
The Phase 3 scalar multi-byte branch in EncodeUtf8SinglePass (1-byte ASCII / 2-byte Latin-extended / 3-byte BMP / 4-byte surrogate-pair) processes Hungarian diacritics (á, é, í, ő, ű, etc.) as 2-byte UTF-8 sequences via scalar bit-extract. MemPack's UTF-8 encoder appears to use a SIMD-accelerated mixed-content lane that processes 2-byte sequences in parallel.
Resolution
AOT bench 13-40-29: Repeated Ser ratio = 0.96 (AcBinary 14.50 µs vs MemPack 15.05 µs, AcBinary GYORSABB by 4%). Deser ratio 0.91 (also faster).
The 1.04+ ratio observed in JIT-mode benchmarks (12-50-43, 13-21-27, 13-27-20) was a JIT tier-up artifact — the SGen-generated writer's hot path (which calls EncodeUtf8SinglePass) didn't reliably tier up to fully-optimized code within the 1000-iteration measurement window, while MemPack's writer apparently warmed up faster. Under NativeAOT publish (-p:_IsPublishing=true) the issue disappears completely — both writers are deterministically optimized at compile time.
No structural problem in the Phase 3 scalar branch. The investigation directions (Vector256 mixed-content lane, BCL Utf8.FromUtf16 comparison) remain valid academic improvements but show no meaningful production-time win — closing as Won't Fix.
ACCORE-BIN-T-S2X9: Markerless schema lane — drop per-property type markers for fixed-shape primitives (SGen)
Priority: P3 · Type: Wire-format extension · Related: ACCORE-BIN-T-S5L8, ACCORE-BIN-T-W7N5
AcBinary is marker-driven: every value on the wire carries a 1-byte type code, so the reader can dispatch generically (handles polymorphism, null, intern markers, type-name lookup, etc.). MemPack is schema-driven: the SGen reader knows at compile time that "field 3 is int, field 4 is string" and reads values directly with no type code, no run-time dispatch.
For fixed-shape primitive properties (int, bool, double, Guid, DateTime, …) on [AcBinarySerializable] types, the per-property type marker is pure overhead — the SGen-generated reader already has compile-time knowledge of the property type, so the marker only confirms what is already known. Dropping it on this narrow class of properties is a clean wire+CPU win without losing any of the polymorphism / null / intern flexibility that the marker provides for variable-shape values.
Wire savings per property type
| Type | Current encoding | Markerless lane | Wire saved |
|---|---|---|---|
int (TinyInt range −16..47) |
TinyInt (1 byte) | VarInt (1 byte) | 0 |
int (out-of-tiny) |
[Int32] [VarInt] (2-6 bytes) |
VarInt (1-5 bytes) | 1 byte |
bool |
[True] or [False] (1 byte) |
1 byte (0/1) | 0 |
Guid |
[Guid] [16 bytes] (17 bytes) |
16 bytes | 1 byte |
DateTime |
[DateTime] [9 bytes] (10 bytes) |
9 bytes | 1 byte |
DateTimeOffset |
[DateTimeOffset] [10 bytes] (11 bytes) |
10 bytes | 1 byte |
TimeSpan |
[TimeSpan] [VarLong] (2-9 bytes) |
VarLong (1-9 bytes) | 1 byte |
decimal |
[Decimal] [16 bytes] (17 bytes) |
16 bytes | 1 byte |
double |
[Float64] [8 bytes] (9 bytes) |
8 bytes | 1 byte |
DTO-heavy payloads with many Guid / DateTime properties benefit the most — easily -10..-20% wire size on top of the existing -22..-33% advantage.
CPU savings
Reader-side: SGen-generated code drops the per-property ReadByte() + IsTinyInt / IsFixStr / switch-case dispatch for primitive properties — direct context.ReadInt32Unsafe() / ReadGuidUnsafe() / etc. calls. Writer-side: drops the WriteByte(typeCode) per primitive. Effect amplifies on payloads with many primitive properties (Small/Medium benchmark cells) — independent of any JIT-vs-AOT measurement variance.
Sketch — opt-in markerless lane, SGen-only
- New wire format flag (header
HeaderFlag_MarkerlessSchema = 0x10or similar) → activates a property-positional lane. - SGen-generated writer for
[AcBinarySerializable]types: per primitive property, emits raw value (no marker). For variable-shape properties (string, complex, nullable, polymorphic) the existing marker-driven path stays. - SGen-generated reader: per primitive property, calls
context.ReadInt32Unsafe()/ReadGuidUnsafe()/ etc. directly. Variable-shape properties keep the marker-read + dispatch. - Heuristic: a property is markerless-eligible if
IsValueType && !IsNullable && type is in {int, bool, byte, short, long, float, double, DateTime, DateTimeOffset, Guid, TimeSpan, decimal}. Anything else (string, list, nested object, nullable) keeps the marker.
Decision points
- Backward compatibility: header flag + version negotiation. Old readers see the flag set and either reject (clean fail) or fall back to marker-driven (if they support both lanes). Default
falsepreserves current wire format. - Schema evolution fragility: the markerless lane is positional, so adding/removing/reordering primitive properties breaks readers compiled against an older schema. Document this clearly — opt-in is for stable schemas only (DTO-frozen API contracts, internal SignalR messages with synchronized client/server SGen). For evolving schemas, marker-driven default stays.
- Coordination with
ACCORE-BIN-T-S5L8(sentinel-length strings): the two could share the "no-marker per-call" infrastructure — markerless string lane uses sentinel-length VarUInt (null/empty/short distinguished by length value).
Acceptance
- Wire size: ≥ -10% on DTO-heavy payloads (Guid/DateTime-rich) vs current marker-driven format
- Round-trip on the markerless lane validated on representative DTO shapes (mixed primitive + string + nested object)
- Schema-evolution fragility documented in
BINARY_FEATURES.md(alongside the existingPropertySkip/ default-omission caveat fromACCORE-BIN-I-D9Y2) - Opt-in flag with default
false(preserves marker-driven default; consumers explicitly opt in for frozen-schema scenarios)