xml-pipeline/docs/message-pump.md
2026-01-02 15:46:33 -08:00

Message Pump — End-to-End Flow

The AgentServer message pump is a single, linear, attack-resistant pipeline. Every message — local or remote, request or response — follows exactly the same path.

flowchart TD
    A[WebSocket Ingress] --> B[TOTP + Auth Check]
    B --> C[Repair + Exclusive C14N]
    C --> D["Lark Envelope Grammar<br>(noise-tolerant, NOISE*)"]
    D --> E[Extract Payload XML fragment]
    E --> F{Payload namespace?}
    
    F -->|meta/v1| G["Core Meta Handler<br>(privileged, direct registry lookup)"]
    F -->|user namespace| H[Route by namespace + root]
    H --> I["Listener-specific Lark Grammar<br>(auto-generated from @xmlify class)"]
    I --> J[Parse → clean dict]
    J --> K["Call handler(payload_dict: dict) → bytes"]
    K --> L[Wrap response payload in envelope]
    
    G --> L
    L --> M[Exclusive C14N + Sign]
    M --> N["WebSocket Egress<br>(bytes)"]

Detailed Stages

  1. Ingress (raw bytes over WSS)
    Single port, TLS-terminated.

  2. Authentication
    TOTP-based session scoping. Determines privilege level (admin vs regular agent).
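As a rough illustration of the authentication stage, the sketch below computes a standard RFC 6238 TOTP code with HMAC-SHA1 and maps a matching code to a privilege level. The `SESSIONS` table, its secrets, and the `authenticate` helper are hypothetical; the document does not say how AgentServer stores secrets or scopes sessions.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for the given Unix time."""
    now = time.time() if at is None else at
    counter = int(now // step)
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, RFC 4226 section 5.3
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Hypothetical secret -> privilege table; a real deployment would not hard-code this.
SESSIONS = {b"admin-secret": "admin", b"agent-secret": "agent"}


def authenticate(code: str, at: Optional[float] = None) -> Optional[str]:
    """Return the privilege level whose secret produces `code`, or None."""
    for secret, level in SESSIONS.items():
        # Constant-time comparison avoids leaking how many digits matched.
        if hmac.compare_digest(totp(secret, at), code):
            return level
    return None
```

A production check would typically also accept the adjacent time step to tolerate clock skew and would reject reused codes.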

  3. Repair + Exclusive Canonicalization
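    Repairs and normalizes XML with a hardened lxml parser (entity resolution disabled, huge_tree=False, no_network=True), then applies exclusive C14N. The canonical bytes serve as the tamper-evident baseline.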
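A minimal sketch of this stage with lxml might look like the following. The parser flags `resolve_entities`, `huge_tree`, and `no_network` correspond to the settings named above; `recover=True` is an assumption about how the "repair" step is done, and the function name is hypothetical.

```python
from lxml import etree

# Hardened parser: no entity resolution, no oversized trees, no network fetches.
# recover=True (assumed) lets libxml2 repair mildly malformed input.
HARDENED_PARSER = etree.XMLParser(
    resolve_entities=False,
    huge_tree=False,
    no_network=True,
    recover=True,
)


def repair_and_canonicalize(raw: bytes) -> bytes:
    """Parse untrusted bytes and return their exclusive C14N form."""
    root = etree.fromstring(raw, parser=HARDENED_PARSER)
    if root is None:
        # In recover mode, input with no usable root element can yield None.
        raise ValueError("input could not be repaired into an XML document")
    # Exclusive canonicalization: stable attribute order, explicit end tags,
    # and only the namespace declarations that are visibly used.
    return etree.tostring(root, method="c14n", exclusive=True)
```

Because the output is byte-stable for equivalent inputs, it can be hashed or signed and later compared.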

  4. Envelope Validation
    Fixed, shared Lark grammar for the envelope (with NOISE* token).
    Seeks first valid <envelope>...</envelope> in noisy LLM output.
    Consumes exactly one envelope per pass (handles conjoined messages cleanly).
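The "one envelope per pass" behaviour can be sketched with a regular expression standing in for the Lark grammar. This is a deliberate simplification: a regex only locates the first `<envelope>...</envelope>` span and does not validate its contents the way a grammar would. The function name is hypothetical.

```python
import re
from typing import Optional, Tuple

# Non-greedy match so that conjoined envelopes are split rather than merged.
ENVELOPE_RE = re.compile(rb"<envelope\b[^>]*>.*?</envelope>", re.DOTALL)


def take_envelope(buffer: bytes) -> Tuple[Optional[bytes], bytes]:
    """Return (first envelope or None, unconsumed remainder of the buffer)."""
    match = ENVELOPE_RE.search(buffer)
    if match is None:
        return None, buffer
    # Everything before the match is treated as noise and dropped.
    return match.group(0), buffer[match.end():]
```

Calling `take_envelope` repeatedly on the remainder yields each conjoined message in turn until it returns `None`.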

  5. Payload Extraction
    Clean payload XML fragment (bytes) + declared namespace/root.

  6. Routing Decision

    • https://xml-platform.org/meta/v1 → Core Meta Handler (privileged, internal).
      No user listener involved. Direct registry lookup for request-schema, request-example, request-prompt, list-capabilities.
    • Any other namespace → User Listener lookup by (namespace, root_element).
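The routing decision reduces to one namespace check followed by a dictionary lookup. The sketch below assumes the declared namespace and root element are already available from payload extraction; the `listeners` registry, `register`, and `route` names are hypothetical.

```python
from typing import Any, Callable, Dict, Optional, Tuple

META_NS = "https://xml-platform.org/meta/v1"

Handler = Callable[[Dict[str, Any]], bytes]

# Hypothetical registry keyed by (namespace, root_element).
listeners: Dict[Tuple[str, str], Handler] = {}


def register(namespace: str, root: str, handler: Handler) -> None:
    """Register a user listener for one (namespace, root_element) pair."""
    listeners[(namespace, root)] = handler


def route(namespace: str, root: str) -> Tuple[str, Optional[Handler]]:
    """Return ("meta", None) for the privileged path, else the user listener."""
    if namespace == META_NS:
        # Privileged shortcut: no user listener is consulted.
        return "meta", None
    try:
        return "listener", listeners[(namespace, root)]
    except KeyError:
        raise LookupError(f"no listener for ({namespace!r}, {root!r})") from None
```

Keeping the meta check first means a user listener can never shadow the meta namespace, even if one were registered under it.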
  7. Payload Validation & Conversion
    Listener-specific Lark grammar (auto-generated from @xmlify payload_class at registration).
    One-pass, noise-tolerant parse → Transformer → guaranteed clean dict[str, Any].
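To show the idea of a schema-driven conversion to a clean dict, here is a simplified stand-in that uses a dataclass and the standard library XML parser instead of a generated Lark grammar. It is not noise-tolerant and assumes the payload has already passed the hardened canonicalization stage. `AddPayload` and `to_clean_dict` are hypothetical; a real payload class would be decorated with @xmlify.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class AddPayload:
    """Hypothetical payload class; its fields act as the schema."""
    a: int
    b: int


def to_clean_dict(payload: bytes, payload_class: type) -> Dict[str, Any]:
    """Keep only declared fields, coerced to their declared types."""
    root = ET.fromstring(payload)
    # Map local element names (namespace stripped) to trimmed text content.
    seen = {child.tag.rpartition("}")[2]: (child.text or "").strip() for child in root}
    clean: Dict[str, Any] = {}
    for field in fields(payload_class):
        if field.name not in seen:
            raise ValueError(f"missing field: {field.name}")
        # field.type is the annotated class (int here); calling it coerces the text.
        clean[field.name] = field.type(seen[field.name])
    return clean
```

Undeclared elements are silently dropped, so the handler only ever sees keys the schema names.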

  8. Handler Execution
    Pure callable: handler(payload_dict) -> bytes
    Returns raw response payload XML fragment.
    Synchronous by default (async supported).
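One way to support both synchronous and async handlers behind a single call site is to await the result only when it is awaitable. This is a sketch under that assumption; `call_handler` and the example handlers are hypothetical names.

```python
import asyncio
import inspect
from typing import Any, Awaitable, Callable, Dict, Union

Handler = Callable[[Dict[str, Any]], Union[bytes, Awaitable[bytes]]]


async def call_handler(handler: Handler, payload_dict: Dict[str, Any]) -> bytes:
    """Invoke a sync or async handler and insist on a bytes result."""
    result = handler(payload_dict)
    if inspect.isawaitable(result):
        # Async handler: the call returned a coroutine that still has to run.
        result = await result
    if not isinstance(result, bytes):
        raise TypeError("handler must return bytes")
    return result


def add(payload: Dict[str, Any]) -> bytes:
    """Example synchronous handler returning a raw XML fragment."""
    return b"<sum>%d</sum>" % (payload["a"] + payload["b"])


async def add_async(payload: Dict[str, Any]) -> bytes:
    """Example async handler with the same contract."""
    await asyncio.sleep(0)
    return add(payload)
```

Note that a synchronous handler called this way runs on the event loop thread, so a slow one would block other messages unless it is moved to an executor.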

  9. Response Envelope
    Bus wraps handler bytes in standard response envelope.

  10. Egress Canonicalization
    Same exclusive C14N as stage 3, plus optional signing.

  11. WebSocket Out
    Bytes to peer.

Safety Properties

  • No entity expansion anywhere (Lark ignores entities, lxml parsers hardened).
  • Bounded depth/recursion by schema design + size limits.
  • No XML trees escape the pump — only clean dicts reach handlers.
  • Topology privacy — normal flows reveal no upstream schemas unless meta privilege granted.
  • Zero tool-call convention — the payload is the structured invocation.

The pump is deliberately simple: one path, no branches except the privileged meta shortcut. Everything else is data-driven by live, auto-generated grammars.

XML in → XML out. Safely. Permanently.