diff --git a/docs/message-pump.md b/docs/message-pump.md
new file mode 100644
index 0000000..e9aba35
--- /dev/null
+++ b/docs/message-pump.md
@@ -0,0 +1,77 @@
+# Message Pump — End-to-End Flow
+
+The AgentServer message pump is a single, linear, attack-resistant pipeline. Every message — local or remote, request or response — follows exactly the same path.
+
+```mermaid
+flowchart TD
+    A[WebSocket Ingress] --> B[TOTP + Auth Check]
+    B --> C[Repair + Exclusive C14N]
+    C --> D["Lark Envelope Grammar<br/>(noise-tolerant, NOISE*)"]
+    D --> E[Extract Payload XML fragment]
+    E --> F{Payload namespace?}
+
+    F -->|meta/v1| G["Core Meta Handler<br/>(privileged, direct registry lookup)"]
+    F -->|user namespace| H[Route by namespace + root]
+    H --> I["Listener-specific Lark Grammar<br/>(auto-generated from @xmlify class)"]
+    I --> J[Parse → clean dict]
+    J --> K["Call handler(payload_dict: dict) → bytes"]
+    K --> L[Wrap response payload in envelope]
+
+    G --> L
+    L --> M[Exclusive C14N + Sign]
+    M --> N["WebSocket Egress<br/>(bytes)"]
+```
+
+## Detailed Stages
+
+1. **Ingress (raw bytes over WSS)**
+   Single port, TLS-terminated.
+
+2. **Authentication**
+   TOTP-based session scoping. Determines privilege level (admin vs regular agent).
+
+3. **Repair + Exclusive Canonicalization**
+   Normalizes XML (entity resolution disabled, huge_tree=False, no_network=True). Tamper-evident baseline.
+
+4. **Envelope Validation**
+   Fixed, shared Lark grammar for the envelope (with `NOISE*` token).
+   Seeks first valid `...` in noisy LLM output.
+   Consumes exactly one envelope per pass (handles conjoined messages cleanly).
+
+5. **Payload Extraction**
+   Clean payload XML fragment (bytes) + declared namespace/root.
+
+6. **Routing Decision**
+   - `https://xml-platform.org/meta/v1` → **Core Meta Handler** (privileged, internal).
+     No user listener involved. Direct registry lookup for `request-schema`, `request-example`, `request-prompt`, `list-capabilities`.
+   - Any other namespace → **User Listener** lookup by `(namespace, root_element)`.
+
+7. **Payload Validation & Conversion**
+   Listener-specific Lark grammar (auto-generated from `@xmlify` payload_class at registration).
+   One-pass, noise-tolerant parse → Transformer → guaranteed clean `dict[str, Any]`.
+
+8. **Handler Execution**
+   Pure callable: `handler(payload_dict) -> bytes`
+   Returns raw response payload XML fragment.
+   Synchronous by default (async supported).
+
+9. **Response Envelope**
+   Bus wraps handler bytes in standard response envelope.
+
+10. **Egress Canonicalization**
+    Same exclusive C14N + optional signing.
+
+11. **WebSocket Out**
+    Bytes to peer.
+
+## Safety Properties
+
+- **No entity expansion** anywhere (Lark ignores entities, lxml parsers hardened).
+- **Bounded depth/recursion** by schema design + size limits.
+- **No XML trees escape the pump** — only clean dicts reach handlers.
+- **Topology privacy** — normal flows reveal no upstream schemas unless meta privilege granted.
+- **Zero tool-call convention** — the payload *is* the structured invocation.
+
+The pump is deliberately simple: one path, no branches except the privileged meta shortcut. Everything else is data-driven by live, auto-generated grammars.
+
+XML in → XML out. Safely. Permanently.
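
The routing decision in stage 6 can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the AgentServer implementation: `meta_handlers`, `user_listeners`, `route`, and `UnroutableMessage` are hypothetical names, and the `urn:example:calc` namespace is invented for the example. Only the meta namespace URI, the `(namespace, root_element)` lookup key, and the `handler(payload_dict) -> bytes` shape come from the document.

```python
from typing import Any, Callable, Dict, Tuple

# Meta namespace as given in the routing stage of the document.
META_NS = "https://xml-platform.org/meta/v1"

# Handler shape from stage 8: clean dict in, response payload bytes out.
Handler = Callable[[Dict[str, Any]], bytes]

# Hypothetical registries; the real data structures are not shown in the source.
meta_handlers: Dict[str, Handler] = {}
user_listeners: Dict[Tuple[str, str], Handler] = {}


class UnroutableMessage(Exception):
    """Raised when no handler is registered for a payload (assumed behavior)."""


def route(namespace: str, root: str) -> Handler:
    """Pick a handler for an extracted payload.

    The meta namespace is looked up directly by root element and never
    consults user listeners; every other namespace is looked up by the
    (namespace, root_element) pair.
    """
    if namespace == META_NS:
        handler = meta_handlers.get(root)
    else:
        handler = user_listeners.get((namespace, root))
    if handler is None:
        raise UnroutableMessage(f"no handler for {{{namespace}}}{root}")
    return handler


# Illustrative registrations (made-up handlers and namespace).
meta_handlers["list-capabilities"] = lambda payload: b"<capabilities/>"
user_listeners[("urn:example:calc", "add")] = (
    lambda payload: b"<sum>%d</sum>" % (payload["a"] + payload["b"])
)

if __name__ == "__main__":
    print(route(META_NS, "list-capabilities")({}))
    print(route("urn:example:calc", "add")({"a": 2, "b": 3}))
```

Keeping the meta lookup separate from the user-listener table mirrors the single branch in the flowchart: a user listener registered under some other namespace can never shadow a meta request, because the meta path does not read that table at all.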