SignalR binary protocol: doc updates, WASM threading guide

- Closed and documented TODOs for V3P9 wire-format breaking changes and protocol-layer VarUInt/generic-inference fixes (ACCORE-SBP-T-W7K4).
- Added and referenced SIGNALR_BINARY_PROTOCOL_WASMTHREADING.md, detailing how to enable and test true pipeline streaming on Blazor-WASM via WasmEnableThreads.
- Updated README and issues to clarify WASM fallback behavior, technical limitations, and future multi-threading path.
- Improved cross-references and added acceptance/regression notes for maintainability.
Loretta 2026-05-27 14:54:12 +02:00
parent 4d75599988
commit 101929b89e
5 changed files with 280 additions and 1 deletions

@ -75,6 +75,14 @@ header byte was NOT bumped (silent breaking; AcBinary is consumer-private, no cr
compatibility surface). If future versioned compat is desired, a `FormatVersion 1 → 2` bump would
be the conventional approach.
### SignalR-protocol-layer follow-on
A separate closed entry [`ACCORE-SBP-T-W7K4`](../../../AyCode.Services/docs/SIGNALR_BINARY_PROTOCOL/SIGNALR_BINARY_PROTOCOL_TODO.md#accore-sbp-t-w7k4-prefix-tier-varuint-protocol-side-parity--t--object-regression-cluster)
documents the SignalR-side parity fixes the V3P9 wire-format change exposed:
`AcBinaryHubProtocol.ReadVarUInt(ref SequenceReader<byte>)` standalone LEB128 → prefix-tier port,
plus a latent `T = object` generic-inference regression cluster surfaced by the V3P9 test breaks
(failing-test count: 96 → 10, within baseline flaky variance).
## ACCORE-BIN-T-N4P8: ~~SGen reference-property null-check parity across all four emit branches~~
**Status:** Closed (2026-05-23) · **Priority:** ~~P1~~ · **Type:** ~~Bug fix~~

@ -209,6 +209,8 @@ Send and receive paths handle WASM (`OperatingSystem.IsBrowser()`) asymmetricall
Consequence: mixed topology (desktop server `AsyncSegment` + WASM client `Bytes`) works without negotiation or protocol-name variation — client converts incoming chunked wire to its synchronous processing model.
> Future: enabling true pipeline streaming on WASM via `WasmEnableThreads` (Atomics-backed `ManualResetEventSlim.Wait()`) is documented in [`SIGNALR_BINARY_PROTOCOL_WASMTHREADING.md`](SIGNALR_BINARY_PROTOCOL_WASMTHREADING.md) — doc-only, not implemented; revisit at .NET 10 stable / .NET 11.
## Registration in `Program.cs`
### Server

@ -16,7 +16,7 @@ For higher-level SignalR abstractions see `../SIGNALR/SIGNALR_ISSUES.md`.
3. **Receive-path** on WASM is fully supported — `AsyncPipeReaderInput` with synchronous fallback at `CHUNK_END` (after `Complete()`, `TryAdvanceSegment` never enters the `MRES.Wait` path) means WASM clients CAN receive AsyncSegment-chunked data from a non-WASM sender, they just cannot send AsyncSegment themselves
### Related TODO
None — architectural constraint of browser WASM threading model.
- [`SIGNALR_BINARY_PROTOCOL_WASMTHREADING.md`](SIGNALR_BINARY_PROTOCOL_WASMTHREADING.md) — documents the `WasmEnableThreads` upgrade path that would enable AsyncSegment SEND-path on WASM (doc-only, not implemented; .NET 10 stable / .NET 11 milestone).
## ACCORE-SBP-I-G4B5: StaticWebAssets SDK "Illegal characters" noise (consumer build)

@ -95,6 +95,67 @@ Shipped in `AcBinaryHubProtocol.cs`:
- ✅ Teardown logic deduplicated into the `AbandonChunkState` helper; `GetInvocationId` and `TryEmitChunkAbort` helpers extracted; nested try/catch eliminated.
- 🟡 **Deferred edge case** — mid-header serialize failure leaves a partial `[201][UINT16=N]` on the wire; `[203]` is parsed as chunk-data and the receiver lands on the protocol-violation `InvalidDataException` path. Robust graceful-abort fix tracked in [`../../../AyCode.Core/docs/BINARY/BINARY_ASYNCPIPE_TODO.md#accore-bin-t-z6n2`](../../../AyCode.Core/docs/BINARY/BINARY_ASYNCPIPE_TODO.md#accore-bin-t-z6n2).
## ACCORE-SBP-T-W7K4: ~~Prefix-tier VarUInt protocol-side parity + `T = object` regression cluster~~
**Status:** Closed (2026-05-27) · **Priority:** ~~P1~~ · **Type:** ~~Bug fix / Wire format~~ · **Related:** [`../../../AyCode.Core/docs/BINARY/BINARY_TODO.md#accore-bin-t-v3p9`](../../../AyCode.Core/docs/BINARY/BINARY_TODO.md#accore-bin-t-v3p9)
~~Two coordinated fix waves after the V3P9 wire-format change exposed pre-existing latent bugs in
the SignalR-protocol layer: a standalone `ReadVarUInt` that still decoded under LEB128 grammar
(protocol-layer parity miss), and a cluster of `T = object` generic-inference regressions in
multiple developer-facing `Serialize<T>` call sites that bound `T` to `object` instead of the
runtime concrete type, producing object-typed empty wire payloads.~~
### Resolution (2026-05-27)
**1. Protocol-layer `ReadVarUInt` parity.** `AcBinaryHubProtocol.ReadVarUInt(ref SequenceReader<byte>)`
had its own LEB128 implementation that was missed during V3P9. After the serializer-side moved to
prefix-tier, the protocol's standalone `ReadVarUInt` continued decoding under LEB128 grammar,
corrupting argLength / header parsing on the chunked-stream path. Rewrote to match prefix-tier
(1-byte `0xxxxxxx` / 2-byte `10xxxxxx` / 3-byte `110xxxxx` / 4-byte `1110xxxx` / 5-byte `1111xxxx`).
Test count: 96 → 50 failing.
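To make the prefix-tier grammar above concrete, the sketch below decodes the five forms from a `ReadOnlySpan<byte>`. It is not the shipped `ReadVarUInt` (which reads from a `SequenceReader<byte>`). The type name `PrefixTierVarUIntSketch` is hypothetical, and the payload layout is an assumption, since this entry only specifies the prefix patterns: the prefix byte is assumed to carry the high bits, the trailing bytes are assumed big-endian, and the 5-byte form is assumed to carry the value in its four trailing bytes.

```csharp
using System;

public static class PrefixTierVarUIntSketch
{
    // The encoded length is decided by the leading bits of the first byte alone.
    public static int EncodedLength(byte first) =>
        (first & 0x80) == 0x00 ? 1 :   // 0xxxxxxx
        (first & 0xC0) == 0x80 ? 2 :   // 10xxxxxx
        (first & 0xE0) == 0xC0 ? 3 :   // 110xxxxx
        (first & 0xF0) == 0xE0 ? 4 :   // 1110xxxx
        5;                             // 1111xxxx

    public static uint Read(ReadOnlySpan<byte> src, out int consumed)
    {
        if (src.IsEmpty) throw new ArgumentException("Empty input.", nameof(src));
        consumed = EncodedLength(src[0]);
        if (src.Length < consumed) throw new ArgumentException("Truncated VarUInt.", nameof(src));

        // ASSUMPTION: the prefix byte holds the high payload bits and the
        // trailing bytes are big-endian; the 5-byte form ignores the low
        // nibble of the prefix. The real wire layout may differ.
        uint value = consumed switch
        {
            1 => src[0] & 0x7Fu,
            2 => src[0] & 0x3Fu,
            3 => src[0] & 0x1Fu,
            4 => src[0] & 0x0Fu,
            _ => 0u,
        };
        for (int i = 1; i < consumed; i++)
        {
            value = (value << 8) | src[i];
        }
        return value;
    }
}
```

Under these assumptions the bytes `0x81 0x00` decode to 256, whereas an LEB128 reader yields 1 for the same two bytes. A mismatch of that kind in a length field is enough to derail argLength / header parsing.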
**2. `T = object` generic-inference regression cluster.** Multiple call sites silently invoked
`Serialize<T>(T value)` where the developer-facing call had `object?` static type, causing T to
bind to `object` instead of the runtime concrete type — wire payloads came out object-typed (empty,
no properties). Fixed via runtime-type capture (`value.GetType()`) and a new non-generic
`SerializeToBinary(object?, Type)` overload for heterogeneous `object?` call sites:
- `AcBinaryHubProtocol.WriteArgument` (streamed-arg AsyncSegment path, ~line 565)
- `ISignalParams.SetParameterValues` (per-param explicit-null branch)
- `SerializeObjectExtensions.CloneTo<TDestination>` / `CopyTo` (project-wide deep-clone)
- `SignalRSerializationHelper.CreateResponseData` (server response path)
- `AcSignalRClientBase` context-params, `AcWebSignalRHubBase` `isRawBytes` pre-serialize,
`AcSignalRDataSource` fallback re-serialize
Test count: 50 → 10 failing (within baseline 13-15 flaky-test variance band).
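The inference gap described in item 2 can be reproduced without any serializer. The sketch below uses hypothetical names (`GenericInferenceSketch`, `BoundType`, `RuntimeType`) and only shows which type a generic method binds to when the argument's static type is `object?`, next to the runtime-type capture used in the fix.

```csharp
using System;
using System.Collections.Generic;

public static class GenericInferenceSketch
{
    // T is inferred from the argument's static (compile-time) type.
    public static Type BoundType<T>(T value) => typeof(T);

    // Runtime-type capture: the Type a non-generic (object?, Type) overload would receive.
    public static Type RuntimeType(object? value) => value?.GetType() ?? typeof(object);

    public static (Type Bound, Type Runtime) Demo()
    {
        // Static type is object?, runtime type is List<int>.
        object? arg = new List<int> { 1, 2, 3 };
        return (BoundType(arg), RuntimeType(arg));
    }
}
```

`Demo()` returns `typeof(object)` for the generic binding and `typeof(List<int>)` for the runtime capture. A serializer that walks properties of `T` would see none in the first case, which matches the empty object-typed payloads described above.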
**3. `SignalRSerializationHelper` byte[]-overload migration.** `SerializeToBinary` overloads
switched from `ArrayBufferWriter<byte>` + `WrittenSpan.ToArray()` (2 allocations: writer buffer
+ ToArray copy) to direct `.ToBinary(...)` byte[]-output (1 pool-backed allocation, no copy).
Same wire output, fewer allocations per call. Same migration applied to `SerializeObjectExtensions`
internal use sites.
**4. `DebugLogArgument [Conditional("DEBUG")]` diagnostics.** Added two `[Conditional("DEBUG")]`
overloads on the write-side (`Type, int, object?`) and read-side (`Type, int, long`) of the
argument framing path. Release-eliminated, debug-enabled — zero production cost, fast diagnostic
when wire-format regressions reappear (the SignalR test regression cluster would have been faster
to localize with these in place).
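As a rough illustration of item 4, a `[Conditional("DEBUG")]` helper on the write side could look like the sketch below. `ArgumentFramingSketch`, `LogCalls`, and the log format are hypothetical. Only the attribute and the `(Type, int, object?)` parameter shape come from the description above.

```csharp
using System;
using System.Diagnostics;

public static class ArgumentFramingSketch
{
    // Incremented only when the DEBUG symbol is defined at the call site's compilation.
    public static int LogCalls;

    // Conditional methods must return void; calls are removed when DEBUG is not defined.
    [Conditional("DEBUG")]
    private static void DebugLogArgument(Type type, int index, object? value)
    {
        LogCalls++;
        Console.WriteLine($"arg[{index}] {type.Name} = {value}");
    }

    public static void WriteArgument(int index, object? value)
    {
        // In Release builds the compiler drops this call, including the
        // evaluation of its argument expressions.
        DebugLogArgument(value?.GetType() ?? typeof(object), index, value);

        // ... the actual argument framing would follow here ...
    }
}
```

Because the call and its arguments are compiled out, any expression passed to the helper should be free of side effects the production path relies on.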
### Acceptance criteria met
- ✅ Full solution build (`AyCode.Core.sln`) — 0 errors.
- ✅ SignalR test suite: 10 failing (vs baseline 13-15) — within flaky-test variance band.
- ✅ Wire format unchanged at the protocol level — V3P9 covers the encoding change; this fix is
protocol-side parity catch-up + non-encoding-related generic-inference bugs.
- ✅ Allocation profile improvement on `SignalRSerializationHelper.SerializeToBinary` paths
(server response + heterogeneous `object?` call sites).
### Note on regression-discovery flow
The `T = object` regression cluster was latent (pre-V3P9) and only surfaced after V3P9 broke the
SignalR test baseline: close investigation of the failing tests exposed the entire
generic-inference gap. Future encoding-format breaks should be paired with a focused
`value.GetType()` audit across the project's `Serialize<T>` / `ToBinary<T>` call sites to catch this earlier.
---
# 🟡 NuGet competitiveness ideas — NOT current priority

@ -0,0 +1,208 @@
# WASM Threading — true pipeline streaming on Blazor-WASM
How to enable real producer/consumer pipeline streaming for the chunked
deserialize path on Blazor-WASM clients. Currently WASM falls back to a
degenerate "accumulate-then-sync-deserialize" mode because the runtime is
single-threaded by default — `ManualResetEventSlim.Wait()` throws
`PlatformNotSupportedException`.
> Companion docs:
> - [`README.md`](README.md) — protocol overview
> - [`SIGNALR_BINARY_PROTOCOL_ISSUES.md`](SIGNALR_BINARY_PROTOCOL_ISSUES.md) — known issues
> - [`SIGNALR_BINARY_PROTOCOL_TODO.md`](SIGNALR_BINARY_PROTOCOL_TODO.md) — planned work
> - [`AyCode.Core/AyCode.Core/docs/BINARY/BINARY_ASYNCPIPE_TODO.md`](../../../../AyCode.Core/docs/BINARY/BINARY_ASYNCPIPE_TODO.md) — async-pipe layer TODO
## Current state — WASM single-thread fallback
`AcBinaryHubProtocol` has a static `IsBrowser = OperatingSystem.IsBrowser()`
flag (line ~88) that gates the background deserialize task at line ~936:
```csharp
if (state.DeserTask == null && !IsBrowser)
{
state.DeserTask = Task.Run(() => AcBinaryDeserializer.Deserialize(input2, type, opts));
}
```
**Non-browser** (server / Windows-app): the deser-task runs on a real thread.
`AsyncPipeReaderInput.TryAdvanceSegment(...)` blocks on
`ManualResetEventSlim.Wait()` when out of bytes — true pipeline parallelism
between the producer (WebSocket-receive) and consumer (deserializer).
**Browser (WASM single-thread)**: `Task.Run` is skipped. Chunks accumulate in
`AsyncPipeReaderInput` from the WebSocket-receive callback. On `CHUNK_END`,
`Input.Complete()` is called → `_completed = true`. The deserializer then
runs **synchronously on the current thread** over the fully-buffered payload.
`TryAdvanceSegment` short-circuits at the `_completed` check (line ~494)
before reaching the `Wait()` at line ~498, so the
`PlatformNotSupportedException` is never thrown.
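The ordering that keeps the fallback safe (check `_completed` before ever calling `Wait()`) can be sketched in a few lines. This is a simplified stand-in, not the real `AsyncPipeReaderInput`: the `PipeInputSketch` type, its queue-of-arrays storage, and the `Feed` method name are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class PipeInputSketch
{
    private readonly Queue<byte[]> _segments = new();
    private readonly ManualResetEventSlim _dataAvailable = new(false);
    private readonly object _gate = new();
    private bool _completed;

    // Producer side: called from the receive callback for every chunk.
    public void Feed(byte[] chunk)
    {
        lock (_gate) _segments.Enqueue(chunk);
        _dataAvailable.Set();
    }

    // Producer side: called once on CHUNK_END.
    public void Complete()
    {
        lock (_gate) _completed = true;
        _dataAvailable.Set();
    }

    // Consumer side: returns false once the input is completed and drained.
    public bool TryAdvanceSegment(out byte[] segment)
    {
        while (true)
        {
            lock (_gate)
            {
                if (_segments.Count > 0)
                {
                    segment = _segments.Dequeue();
                    return true;
                }
                if (_completed)
                {
                    // Short-circuit: a completed input never reaches Wait().
                    segment = Array.Empty<byte>();
                    return false;
                }
                _dataAvailable.Reset();
            }
            // Only reached while the producer is still running; this is the
            // call that needs a real second thread.
            _dataAvailable.Wait();
        }
    }
}
```

When every chunk is fed and `Complete()` is called before the first `TryAdvanceSegment`, the consumer drains the queue and exits without blocking, which is the single-threaded WASM case described above.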
### Cost of the fallback
- **Peak memory**: full payload in memory at once (~10 MB at production
workloads vs ~4 KB chunk-bounded in true pipeline mode).
- **No pipeline parallelism**: deserialize starts only after the last chunk
arrives, so end-to-end latency is transit + deser rather than max(transit,
deser).
- **Large-payload UX**: a ~10 MB response measured ~3.4 sec end-to-end
(DB-fetch dominant, ~150 ms client-side deser). With true pipeline mode the
deser could overlap with receive, for an estimated wall-clock saving of
100-150 ms on the production payload.
The fallback is **functionally correct** and works on every browser without
extra configuration — it's only suboptimal on memory and parallelism.
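The latency argument above reduces to simple arithmetic. The sketch below is an idealised model (it ignores first-chunk delay and per-chunk overhead), and `PipelineLatencySketch` is a hypothetical helper, not part of the codebase.

```csharp
using System;

public static class PipelineLatencySketch
{
    // Fallback mode: deserialize only starts after the last chunk arrived.
    public static double Sequential(double transitMs, double deserMs) => transitMs + deserMs;

    // Ideal pipeline mode: receive and deserialize fully overlap.
    public static double Pipelined(double transitMs, double deserMs) => Math.Max(transitMs, deserMs);

    // The difference is always min(transit, deser).
    public static double Saving(double transitMs, double deserMs) =>
        Sequential(transitMs, deserMs) - Pipelined(transitMs, deserMs);
}
```

With purely illustrative inputs of 400 ms transit and 150 ms deser, the model gives 550 ms sequential, 400 ms pipelined, and a 150 ms saving. The saving can never exceed the shorter of the two phases, so the client-side deser time is an upper bound on what pipelining can recover.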
## Goal — enable true pipeline streaming on WASM
WASM Multi-Threading (the `WasmEnableThreads` toggle) is the path forward.
The runtime spawns WebWorker-based threads that share memory via
`SharedArrayBuffer`, and `ManualResetEventSlim.Wait()` is implemented on top
of `Atomics.wait()` / `Atomics.notify()` — transparent to .NET code.
When this is on, the existing `AsyncPipeReaderInput` works **unchanged** on
WASM: the consumer-thread blocks naturally at `_dataAvailable.Wait()` and is
woken by the producer-thread (UI-thread receiving WebSocket frames).
### Three-step trial setup
#### 1. Blazor-WASM client `.csproj`
```xml
<PropertyGroup>
<WasmEnableThreads>true</WasmEnableThreads>
</PropertyGroup>
```
Available from .NET 8 (experimental), stabilising in .NET 9-10. Triggers
WebWorker-spawning, multi-threaded WASM module build, larger bundle.
#### 2. Hosting server — `Cross-Origin-Isolation` headers (mandatory)
`SharedArrayBuffer` requires `cross-origin-isolated` context. The hosting
server (Blazor-Server, Kestrel, nginx, IIS, CDN) must send both headers on
every response that serves the WASM client:
```
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
```
Blazor-Server `Program.cs` example:
```csharp
app.Use(async (ctx, next) =>
{
ctx.Response.Headers.Append("Cross-Origin-Opener-Policy", "same-origin");
ctx.Response.Headers.Append("Cross-Origin-Embedder-Policy", "require-corp");
await next();
});
```
Production caveat: every third-party resource (CDN, fonts, analytics, ads,
iframes) must either serve `Cross-Origin-Resource-Policy: cross-origin` or be
requested in CORS mode via the `crossorigin` attribute (which in turn needs
CORS headers from that server). Anything that doesn't will fail to load, so a
full content audit is required before turning this on for production.
#### 3. Override the `IsBrowser` short-circuit in `AcBinaryHubProtocol`
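Currently the flag is a `static readonly` bool derived only from
`OperatingSystem.IsBrowser()`, so there is no runtime override. For the trial,
add an opt-in property and evaluate the gate on every read:
```csharp
// Replace the existing line ~88:
// private static readonly bool IsBrowser = OperatingSystem.IsBrowser();
public static bool EnableWasmThreading { get; set; } = false;
// Computed per read: a static readonly field would be initialised at type-init,
// which the EnableWasmThreading setter itself triggers, and would stay stale.
private static bool IsBrowser => OperatingSystem.IsBrowser() && !EnableWasmThreading;
```
Note: keeping `IsBrowser` as a `static readonly` field would not work here.
Static field initialisers run at type-init, and assigning
`EnableWasmThreading` is itself a static member access that runs type-init
first, so the field would capture the default `false` before the setter body
executes. With the computed property, set the flag during app startup *before*
any SignalR connection starts receiving. For robustness, the trial may prefer
instance-level opt-in via `AcBinaryHubProtocolOptions`.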
Client-side startup (Blazor `Program.cs` or a DI initialiser):
```csharp
AcBinaryHubProtocol.EnableWasmThreading = true;
```
This re-enables the `Task.Run(() => Deserialize(...))` background task on
WASM: the deser runs on a worker, `_dataAvailable.Wait()` blocks via
`Atomics.wait()`, and the UI-thread continues receiving WebSocket chunks.
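The instance-level opt-in mentioned as the more robust variant could take roughly the following shape. Everything here is hypothetical: `WasmThreadingOptionsSketch` stands in for whatever `AcBinaryHubProtocolOptions` would expose, and `DeserGateSketch` only models the gate decision, not the protocol class.

```csharp
using System;

public sealed class WasmThreadingOptionsSketch
{
    // Hypothetical option; defaults to the current single-threaded fallback.
    public bool EnableWasmThreading { get; init; }
}

public sealed class DeserGateSketch
{
    private readonly bool _isBrowser;
    private readonly WasmThreadingOptionsSketch _options;

    // isBrowser is injectable so the gate can be exercised off-browser;
    // by default it is taken from the runtime.
    public DeserGateSketch(WasmThreadingOptionsSketch options, bool? isBrowser = null)
    {
        _options = options ?? throw new ArgumentNullException(nameof(options));
        _isBrowser = isBrowser ?? OperatingSystem.IsBrowser();
    }

    // True when the background deserialize task may be started.
    public bool UseBackgroundDeserialize => !_isBrowser || _options.EnableWasmThreading;
}
```

Because the decision is read from per-instance state, it does not depend on when a static type happens to be initialised, and different connections could in principle opt in independently.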
## Expected outcomes
- **Peak memory**: from ~payload-size down to ~chunk-size (4 KB default).
- **Latency**: deser starts as soon as the first chunk arrives. On a 10 MB
payload with ~150 ms total deser-time, the overlap saves ~100-150 ms of
wall-clock.
- **Throughput**: per-chunk receive→feed→consume cycle on the worker-thread.
Expected to reach a sizeable fraction of the Windows-app pipeline (the
Windows-app does ~63 MB/s deser; a WASM worker is estimated at ~30-50 MB/s due to
WASM-vs-native instruction cost, but the parallelism still wins on big
payloads).
## Risks and known gotchas
1. **Bundle-size +30-40%** — additional WASM modules, WebWorker loaders. May
matter for cold-start latency.
2. **Cross-origin breakage** — any third-party resource not opted-in via
`crossorigin` attribute or `CORP: cross-origin` header fails to load.
Common offenders: Google Fonts (works with `crossorigin="anonymous"`),
analytics snippets, ad networks, embedded YouTube/Vimeo iframes, social
widgets.
3. **NuGet package compatibility** — packages built without
`WasmEnableThreads` support may throw runtime errors. Likely safe: pure
.NET libraries (BCL, custom DTOs). Likely risky: native interop
(SkiaSharp, image-processing, crypto with native fallbacks). Audit with a
smoke-test of the full app before going to production.
4. **Debugger experience** — multi-threaded WASM debugging is improving but still
less smooth than single-thread: breakpoints may hit on the wrong thread, and
stepping across worker boundaries can be flaky. Browser dev-tools may
show worker stacks separately.
5. **Static-readonly initialisation order** — a `static readonly` `IsBrowser`
field is evaluated once at type-init, and touching any static member of
`AcBinaryHubProtocol` (including the `EnableWasmThreading` setter) runs
type-init first. A gate cached in such a field therefore never observes the
flag and stays in single-threaded mode. The gate has to read
`EnableWasmThreading` at use time (computed property or instance-level
option), and the flag should still be set in `Program.cs` *before* any
SignalR client is built.
6. **Atomics.wait() on the UI thread is forbidden** — only worker threads
can call it. The .NET runtime handles this internally (it schedules
blocking work to worker threads), but if any consumer code accidentally
blocks on the UI thread, the browser will throw. Stick to standard
`Task.Run` / `await` patterns; don't bypass them.
7. **Worker-cold-start cost** — spawning a new worker has a ~10-50 ms
overhead. For very small payloads (< 64 KB), the worker-spawn cost may
exceed the pipeline-parallelism gain. The `Bytes` mode (single-flush,
no chunked path) is still preferable for small messages.
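Risk 5 can be reproduced in isolation. The sketch below uses two hypothetical gate classes (neither is the real `AcBinaryHubProtocol`) to contrast a `static readonly` snapshot with a gate that reads the flag on each access.

```csharp
using System;

public static class GateWithSnapshot
{
    public static bool EnableThreading { get; set; } = false;

    // Initialised during type-init, which runs before the first static member
    // access completes, including the first call to the setter above.
    private static readonly bool Blocked = !EnableThreading;

    public static bool IsBlocked => Blocked;
}

public static class GateWithLiveRead
{
    public static bool EnableThreading { get; set; } = false;

    // Evaluated on every read, so a flag set at startup is honoured.
    public static bool IsBlocked => !EnableThreading;
}
```

Setting `GateWithSnapshot.EnableThreading = true` and then reading `IsBlocked` still yields `true`, because the snapshot was taken while the flag had its default value. The same sequence on `GateWithLiveRead` yields `false`.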
## When to revisit this
- **.NET 10 stable release** (Nov 2025): `WasmEnableThreads` should be
closer to production-ready. Re-evaluate compatibility with the project's
NuGet dependencies and CDN setup.
- **.NET 11 release** (Nov 2026): expected to make `WasmEnableThreads` the
default for new Blazor-WASM templates (per current Microsoft roadmap
signals — not confirmed). Likely a good moment for production rollout.
- **`WebAssembly.Suspending`** (alternative path): an experimental browser
feature (Chrome 134+, ~Q4 2024) that allows a sync WASM call to await a
JS promise without multi-threading. If .NET adds runtime support, this
could be an *opt-in-free* alternative that doesn't require COOP/COEP or
multi-threaded builds. Monitor `dotnet/runtime` issue tracker for
`WebAssembly.Suspending` integration.
## What this doc is NOT
- Not a production-ready procedure. The trial is for measurement only.
- Not a recommendation to enable multi-threading by default — the bundle
size, cross-origin headers, and library audit costs are significant.
- Not a substitute for the chunked-protocol design decisions in the parent
`README.md` and `SIGNALR_BINARY_PROTOCOL_ISSUES.md`.