From ab062bca18d674ceef08a77db5b96c89808951d4 Mon Sep 17 00:00:00 2001 From: dullfig Date: Sat, 3 Jan 2026 14:48:57 -0800 Subject: [PATCH] re-writing docs and code --- .idea/misc.xml | 2 +- .idea/xml-pipeline.iml | 2 +- README.md | 92 +++++++++++++------- agentserver/schema/envelope.xsd | 4 +- agentserver/xml_listener.py | 145 +++++++++++++------------------ docs/configuration.md | 112 +++++++++++------------- docs/core-principles-v2.0.md | 76 ++++++++++++++++ docs/message-pump.md | 23 ++--- docs/thread-management.md | 65 ++++++++++++++ scripts/generate_organism_key.py | 2 - 10 files changed, 330 insertions(+), 193 deletions(-) create mode 100644 docs/core-principles-v2.0.md create mode 100644 docs/thread-management.md delete mode 100644 scripts/generate_organism_key.py diff --git a/.idea/misc.xml b/.idea/misc.xml index 9ccd328..4c64606 100644 --- a/.idea/misc.xml +++ b/.idea/misc.xml @@ -3,5 +3,5 @@ - + \ No newline at end of file diff --git a/.idea/xml-pipeline.iml b/.idea/xml-pipeline.iml index eb72a7e..e3e342f 100644 --- a/.idea/xml-pipeline.iml +++ b/.idea/xml-pipeline.iml @@ -5,7 +5,7 @@ - + \ No newline at end of file diff --git a/README.md b/README.md index 07d4628..bc7ad21 100644 --- a/README.md +++ b/README.md @@ -1,48 +1,82 @@ -# AgentServer — The Living Substrate (v1.3) -***"It just works..."*** +# AgentServer — The Living Substrate (v2.0) +***"It just works... safely."*** -**January 01, 2026** -**Architecture: Autonomous Grammar-Driven, Turing-Complete Multi-Agent Organism** +**January 03, 2026** +**Architecture: Autonomous Schema-Driven, Turing-Complete Multi-Agent Organism** ## What It Is -AgentServer is a production-ready substrate for the `xml-pipeline` nervous system. Version 1.3 introduces **Autonomous Grammar Generation**, where the organism defines its own language and validation rules in real-time using Lark and XSD automation. +AgentServer is a production-ready substrate for the `xml-pipeline` nervous system. 
Version 2.0 stabilizes the design around exact XSD validation, typed dataclass handlers, mandatory hierarchical threading, and strict out-of-band privileged control. + +See [Core Architectural Principles](docs/core-principles-v2.0.md) for the single canonical source of truth. ## Core Philosophy -- **Autonomous DNA:** The system never requires a human to explain tool usage to an agent. Listeners automatically generate their own XSDs based on their parameters, which are then converted into **Lark Grammars** for high-speed, one-pass scanning and validation. -- **Grammar-Locked Intelligence:** Dirty LLM streams are scanned by Lark. Only text that satisfies the current organism's grammar is extracted and validated. Everything else is ignored as "Biological Noise." -- **Parameter-Keyed Logic:** Messages are delivered to agents as pristine Python dictionaries, automatically keyed to the listener's registered parameters. -- **Computational Sovereignty:** Turing-complete via `` and `` primitives, governed by a strict resource stack. +- **Autonomous DNA:** Listeners declare their contract via `@xmlify` dataclasses; the organism auto-generates XSDs, examples, and tool prompts. +- **Schema-Locked Intelligence:** Payloads validated directly against XSD (lxml) → deserialized to typed instances → pure handlers. +- **Multi-Response Tolerance:** Handlers return raw bytes; bus wraps in `` and extracts multiple payloads (perfect for parallel tool calls or dirty LLM output). +- **Computational Sovereignty:** Turing-complete via self-calls, subthreading primitives, and visible reasoning — all bounded by thread hierarchy and local-only control. + +## Developer Experience — Create a Listener in 12 Lines +That's it. No manual XML, no schemas, no prompts. 
+ +```python +from xmlable import xmlify +from dataclasses import dataclass +from xml_pipeline import Listener, bus # bus is the global MessageBus + +@xmlify +@dataclass +class AddPayload: + a: int + b: int + +def add_handler(payload: AddPayload) -> bytes: + result = payload.a + payload.b + return f"{result}".encode("utf-8") + +Listener( + payload_class=AddPayload, + handler=add_handler, + name="calculator.add", + description="Adds two integers and returns their sum." +).register() # ← Boom: XSD, example, prompt auto-generated + registered +``` + +The organism now speaks `` — fully validated, typed, and discoverable. ## Key Features +### 1. The Autonomous Schema Layer +- Dataclass → cached XSD + example + rich tool prompt (mandatory description + field docs). +- Namespaces: `https://xml-pipeline.org/ns///v1` (served live via domain for discoverability). -### 1. The Autonomous Language Layer -- **XSD-to-Lark Generator:** A core utility that transcribes XSD schema definitions into EBNF Lark grammars. This enables the server to search untrusted data streams for specific XML patterns with mathematical precision. -- **Auto-Descriptive Organs:** The base `XMLListener` class inspects its own instantiation parameters to generate a corresponding XSD. The tool itself tells the world how to use it. -- **Protocol Agnostic:** To add a new field (like ``) to the entire swarm, you simply update the central XSD. The entire organism's grammar updates instantly. -- **[Read Further: Self-Registration & Autonomous Grammars](docs/self-grammar-generation.md)** +### 2. Thread-Based Lifecycle & Reasoning +- Mandatory `` with hierarchical IDs for reliable subthreading and audit trails. +- LLM agents reason via open self-calls and optional ``. +- All thought steps visible as messages — no hidden state. -### 2. The Stack-Based Lifecycle -- **UUID Custody:** UUID v4 thread identifiers are born via `` and managed on a physical stack. 
-- **Leaf-to-Root Roll-up:** Threads remain active until the final leaf responds, ensuring perfect resource tracking and preventing runaway processes. +### 3. Message Pump +- Single linear pipeline with repair, C14N, XSD validation, deserialization, handler execution, and multi-payload extraction. +- Supports clean tools and forgiving LLM streams alike. +- Thread-based message queue with bounded memory. -### 3. The Sovereign Witness -- **Inline Auditing:** The Logger witnesses all traffic before routing. -- **The Confessional:** Agents record inner thoughts via ``. The Logger is **strictly write-only** to prevent rogue memory or shared-state leaks. +### 4. Structural Control +- Bootstrap from `organism.yaml`. +- Runtime changes (hot-reload, add/remove listeners) via local-only OOB channel (localhost WSS or Unix socket — GUI-ready). +- Main bus oblivious to privileged ops. -### 4. Isolated Structural Control -- **Out-of-Band (OOB) Port:** Structural commands (registration, wiring, shutdown) use a dedicated port and Ed25519 signatures, ensuring "Life/Death" commands cannot be delayed by agent traffic. -- **[Read Further: YAML Configuration System](docs/configuration.md)** +### 5. Federation & Introspection +- YAML-declared gateways with trusted keys. +- Controlled meta queries (schema/example/prompt/capability list). ## Technical Stack -- **Parsing:** Lark (EBNF Grammar) + `lxml` (Validation/C14N). -- **Protocol:** Mandatory WSS (TLS) + TOTP 2FA. -- **Identity:** Ed25519 (OOB) + UUID v4 (In-Bus). -- **Format:** `lxml` trees (Internal) / Exclusive C14N (External). +- **Validation & Parsing:** lxml (XSD, C14N, repair) + xmlable (round-trip). +- **Protocol:** Mandatory WSS (TLS) + TOTP on main port. +- **Identity:** Ed25519 (signing, federation, privileged). +- **Format:** Exclusive C14N XML (wire sovereign).
## Why This Matters -AgentServer v1.3 is the first multi-agent substrate where the **language is the security.** By automating the link between XSD, Grammar, and LLM Prompts, you’ve created an organism that is impossible to "misunderstand." It is a self-documenting, self-validating, and self-regulating intelligent system. +AgentServer v2.0 is a bounded, auditable, owner-controlled organism where the **XSD is the security**, the **thread is the memory**, and the **OOB channel is the sovereignty**. -**One port. Many bounded minds. Autonomous Evolution.** 🚀 +One port. Many bounded minds. Autonomous yet obedient evolution. 🚀 --- *XML wins. Safely. Permanently.* \ No newline at end of file diff --git a/agentserver/schema/envelope.xsd b/agentserver/schema/envelope.xsd index e621124..5d69c61 100644 --- a/agentserver/schema/envelope.xsd +++ b/agentserver/schema/envelope.xsd @@ -1,6 +1,6 @@ @@ -13,7 +13,7 @@ - + diff --git a/agentserver/xml_listener.py b/agentserver/xml_listener.py index d79f6ef..5954061 100644 --- a/agentserver/xml_listener.py +++ b/agentserver/xml_listener.py @@ -1,104 +1,83 @@ """ -xmllistener.py — The Sovereign Contract for All Capabilities - -In xml-pipeline, there are no "agents", no "tools", no "services". -There are only bounded, reactive XMLListeners. - -Every capability in the organism — whether driven by an LLM, -a pure function, a remote gateway, or privileged logic — -must inherit from this class. - -This file is intentionally verbose and heavily documented. -It is the constitution that all organs must obey. +xml_listener.py — The Sovereign Contract for All Capabilities (v1.3) """ from __future__ import annotations - -import uuid -from typing import Optional, List, ClassVar -from lxml import etree +from typing import Optional, Type, Callable +from pydantic import BaseModel class XMLListener: """ - Base class for all reactive capabilities in the organism. - - Key Invariants (never break these): - 1. Listeners are passive. They never initiate. 
They only react. - 2. They declare what they listen to via class variable. - 3. They have a globally unique agent_name. - 4. They receive the full parsed envelope tree (not raw XML). - 5. They return only payload XML (never the envelope). - 6. The MessageBus owns routing, threading, and envelope wrapping. + Base class for all reactive capabilities. + Now supports Autonomous Registration via Pydantic payload classes. """ - # =================================================================== - # Required class declarations — must be overridden in subclasses - # =================================================================== + def __init__( + self, + name: str, + payload_class: Type[BaseModel], + handler: Callable[[dict], bytes], + description: Optional[str] = None + ): + self.agent_name = name + self.payload_class = payload_class + self.handler = handler + self.description = description or payload_class.__doc__ or "No description provided." - listens_to: ClassVar[List[str]] = [] - """ - List of full XML tags this listener reacts to. - Example: ["{https://example.org/chat}message", "{https://example.org/calc}request"] - """ - - agent_name: ClassVar[str] = "" - """ - Globally unique name for this instance. - Enforced by MessageBus at registration. - Used in , routing, logging, and known_peers prompts. - """ - - # =================================================================== - # Core handler — the only method that does work - # =================================================================== + # In v1.3, the root tag is derived from the payload class name + self.root_tag = payload_class.__name__ + self.listens_to = [self.root_tag] async def handle( self, - envelope_tree: etree._Element, - convo_id: str, + payload_dict: dict, + thread_id: str, sender_name: str, - ) -> Optional[str]: + ) -> Optional[bytes]: """ - React to an incoming enveloped message. 
- - Parameters: - envelope_tree: Full root (parsed, post-repair/C14N) - convo_id: Current conversation UUID (injected or preserved by bus) - sender_name: The value (mandatory) - - Returns: - Payload XML string (no envelope) if responding, else None. - - The organism guarantees: - envelope_tree is valid against envelope.xsd - - is present and matches sender_name - convo_id is a valid UUID - - To reply in the current thread: omit convo_id in response → bus preserves it - To start a new thread: include new-uuid in returned envelope + React to a pre-validated dictionary payload. + Returns raw response XML bytes. """ - raise NotImplementedError( - f"{self.__class__.__name__} must implement handle()" - ) + # 1. Execute the handler logic (a plain callable returning bytes, so it is not awaited) + # Note: In v1.3, the Bus/Lark handles the XML -> Dict conversion + return self.handler(payload_dict) - # =================================================================== - # Optional convenience helpers (can be overridden) - # =================================================================== + def generate_xsd(self) -> str: + """ + Autonomous XSD Synthesis. + Inspects the payload_class and generates an XSD string. + """ + # Logic to iterate over self.payload_class.model_fields + # and build the definitions. + pass - def make_response( + def generate_prompt_fragment(self) -> str: + """ + Prompt Synthesis (The 'Mente'). + Generates the tool usage instructions for other agents.
+ """ + fragment = [ + f"Capability: {self.agent_name}", + f"Root Tag: <{self.root_tag}>", + f"Description: {self.description}", + "\nParameters:" + ] + + for name, field in self.payload_class.model_fields.items(): + field_type = field.annotation.__name__ + field_desc = field.description or "No description" + fragment.append(f" - {name} ({field_type}): {field_desc}") + + return "\n".join(fragment) + + def make_response_envelope( self, - payload: str | etree._Element, - *, - to: Optional[str] = None, - convo_id: Optional[str] = None, - ) -> str: + payload_bytes: bytes, + thread_id: str, + to: Optional[str] = None + ) -> bytes: """ - Helper for building correct response payloads. - Use this in subclasses to avoid envelope boilerplate. - - - If convo_id is None → reply in current thread - - If convo_id provided → force/start new thread - - to overrides default reply-to-sender + Wraps response bytes in a standard envelope. """ - # Implementation tomorrow — but declared here for contract clarity - raise NotImplementedError \ No newline at end of file + # Logic to build the meta block and append the payload_bytes + pass \ No newline at end of file diff --git a/docs/configuration.md b/docs/configuration.md index 5007b8b..bf1edec 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -1,118 +1,110 @@ -# Configuration — organism.yaml +# Configuration — organism.yaml (v2.0) The entire organism is declared in a single YAML file (default: `config/organism.yaml`). -All listeners, agents, and federation gateways are instantiated from this file at startup. -Changes require a restart (hot-reload planned for future). +Loaded at bootstrap — single source of truth for initial composition. +Runtime changes (hot-reload) via local OOB privileged commands.
## Example Full Configuration ```yaml organism: name: "ResearchSwarm-01" - identity: "config/identity/private.ed25519" # Ed25519 private key for signing - port: 8765 + identity: "config/identity/private.ed25519" # Ed25519 private key + port: 8765 # Main message bus WSS tls: cert: "certs/fullchain.pem" key: "certs/privkey.pem" +oob: # Out-of-band privileged channel (GUI/hot-reload ready) + enabled: true + bind: "127.0.0.1" # Localhost-only default + port: 8766 # Separate WSS port + # unix_socket: "/tmp/organism.sock" # Alternative + +thread_scheduling: "breadth-first" # or "depth-first" (default: breadth-first) + meta: enabled: true - allow_list_capabilities: true # Public catalog of capability names + allow_list_capabilities: true allow_schema_requests: "admin" # "admin" | "authenticated" | "none" allow_example_requests: "admin" allow_prompt_requests: "admin" - allow_remote: false # Federation peers can query meta + allow_remote: false # Federation peers query meta listeners: - name: calculator.add payload_class: examples.calculator.AddPayload handler: examples.calculator.add_handler - description: "Integer addition" - - - name: calculator.subtract - payload_class: examples.calculator.SubtractPayload - handler: examples.calculator.subtract_handler + description: "Adds two integers and returns their sum." # Mandatory for usable tool prompts - name: summarizer payload_class: agents.summarizer.SummarizePayload handler: agents.summarizer.summarize_handler - description: "Text summarization via local LLM" + description: "Summarizes text via local LLM." 
agents: - name: researcher system_prompt: "prompts/researcher_system.txt" tools: - calculator.add - - calculator.subtract - summarizer - - name: web_search # Remote tool via gateway below + - name: web_search remote: true gateways: - name: web_search remote_url: "wss://trusted-search-node.example.org" trusted_identity: "pubkeys/search_node.ed25519.pub" - description: "Federated web search capability" + description: "Federated web search capability." ``` ## Sections Explained ### `organism` -Core server settings. +Core settings. +- `name`: Logs/discovery. +- `identity`: Ed25519 private key path. +- `port` / `tls`: Main WSS bus. -- `name`: Human-readable identifier (used in logs, discovery). -- `identity`: Path to Ed25519 private key (for envelope signing, federation auth). -- `port` / `tls`: Single-port WSS configuration. +### `oob` +Privileged local control channel. +- `enabled: false` → pure static (restart for changes). +- Localhost default for GUI safety. +- Separate from main port — bus oblivious. + +### `thread_scheduling` +Balanced subthread execution. +- `"breadth-first"`: Fair round-robin (default, prevents deep starvation). +- `"depth-first"`: Dive deep into branches. ### `meta` -Controls the privileged introspection facility (`https://xml-platform.org/meta/v1`). - -- `allow_list_capabilities`: Publicly visible catalog (safe). -- `allow_*_requests`: Restrict schema/example/prompt emission to admin or authenticated sessions. -- `allow_remote`: Whether federation peers can query your meta namespace. +Introspection controls (`https://xml-pipeline.org/ns/meta/v1`). ### `listeners` -All bounded capabilities. Each entry triggers autonomous registration: +Bounded capabilities. +- `name`: Discovery/logging (dots for hierarchy). +- `payload_class`: Full import to `@xmlify` dataclass. +- `handler`: Full import to function (dataclass → bytes). +- `description`: **Mandatory** human-readable blurb (lead-in for auto-prompt; fallback to generic if omitted). 
-- `name`: Logical capability name (used in discovery, YAML tools lists). Dots allowed for hierarchy. -- `payload_class`: Full import path to the `@xmlify` dataclass (defines contract). -- `handler`: Full import path to the handler callable (`dict → bytes`). -- `description`: Optional human-readable text (included in `list-capabilities`). +At startup/hot-reload: imports → Listener instantiation → bus.register() → XSD/example/prompt synthesis. -At startup: -1. Import payload_class and handler. -2. Instantiate `Listener(payload_class=..., handler=..., name=...)`. -3. `bus.register(listener)` → XSD synthesis, Lark grammar generation, prompt caching. - -Filesystem artifacts: -- XSDs cached as `schemas//v1.xsd` (dots → underscores for Linux safety). +Cached XSDs: `schemas//v1.xsd`. ### `agents` -LLM-based reasoning agents. - -- `name`: Agent identifier. -- `system_prompt`: Path to static prompt file. -- `tools`: List of local capability names or remote references. - - Local: direct name match (`calculator.add`). - - Remote: `name:` + `remote: true` → routed via matching gateway. - -Live capability prompts are auto-injected into the agent's system prompt at runtime (no stale copies). +LLM reasoners. +- `system_prompt`: Static file path. +- `tools`: Local names or remote references. +- Auto-injected live tool prompts at runtime. ### `gateways` Federation peers. +- Trusted public key required. +- Bidirectional regular traffic only. -- `name`: Local alias for the remote organism. -- `remote_url`: WSS endpoint. -- `trusted_identity`: Path to remote's Ed25519 public key. -- `description`: Optional. +## Notes +- Hot-reload: Future privileged OOB commands (apply new YAML fragments, add/remove listeners). +- Namespaces: Capabilities under `https://xml-pipeline.org/ns///v1` (served live if configured). +- Edit → reload/restart → new bounded minds, self-describing and attack-resistant. -Remote tools referenced in `agents.tools` are routed through the gateway with matching `name`. 
- -## Future Extensions (planned) - -- Hot-reload of configuration. -- Per-agent privilege scoping. -- Capability versioning in YAML (`version: v2`). - -This YAML is the **single source of truth** for organism composition. -Edit → restart → new bounded minds appear, fully self-describing and attack-resistant. +This YAML is the organism's DNA — precise, auditable, and evolvable locally. \ No newline at end of file diff --git a/docs/core-principles-v2.0.md b/docs/core-principles-v2.0.md new file mode 100644 index 0000000..6eb57c5 --- /dev/null +++ b/docs/core-principles-v2.0.md @@ -0,0 +1,76 @@ +# AgentServer v2.0 — Core Architectural Principles +**January 03, 2026** +**Architecture: Autonomous Schema-Driven, Turing-Complete Multi-Agent Organism** + +These principles are the single canonical source of truth for the project. All documentation, code, and future decisions must align with this file. + +## Identity & Communication +- All traffic uses the universal `` envelope defined in `envelope.xsd` (namespace `https://xml-pipeline.org/ns/envelope/v1`). +- Mandatory `` and `` (convo_id string, supports hierarchical dot notation for subthreading, e.g., "root.1.research"). +- Optional `` (rare direct routing; most flows use payload namespace/root). +- Exclusive C14N on ingress and egress. +- Malformed XML repaired on ingress; repairs logged in `` metadata. + +## Configuration & Composition +- YAML file (`organism.yaml`) is the bootstrap source of truth, loaded at startup. +- Defines initial listeners, agents, gateways, meta privileges, and OOB channel configuration. +- Runtime structural changes (add/remove listeners, rewire agents, etc.) via local-only privileged commands on the dedicated OOB channel (hot-reload capability). +- No remote or unprivileged structural changes ever. + +## Autonomous Schema Layer +- Listeners defined by `@xmlify`-decorated dataclass (payload contract) + pure handler function. 
+- Mandatory human-readable description string (short "what this does" blurb for tool prompt lead-in). +- Registration (at startup or via hot-reload) automatically generates: + - XSD cached on disk (`schemas//v1.xsd`) + - Example XML + - Tool description prompt fragment (includes description, params with field docs if present, example input) +- All capability namespaces under `https://xml-pipeline.org/ns///v1`. +- Root element derived from payload class name (lowercase) or explicit. + +## Message Pump +- Single linear pipeline on main port: ingress → repair → C14N → envelope validation → payload routing. +- Routing key = (payload namespace, root element); unique per listener. +- Meta requests (`https://xml-pipeline.org/ns/meta/v1`) handled by privileged core handler. +- User payloads: + - Validated directly against listener's cached XSD (lxml) + - On success → deserialized to typed dataclass instance (`xmlable.from_xml`) + - Handler called with instance → returns raw bytes (XML fragment, possibly dirty/multi-root) + - Bytes wrapped in `` → repaired/parsed → all top-level payload elements extracted + - Each extracted payload wrapped in separate response envelope (inherits thread/from, optional new subthread if primitive used) + - Enveloped responses buffered and sent sequentially +- Supports single clean response, multi-payload emission (parallel tools/thoughts), and dirty LLM output tolerance. + +## Reasoning & Iteration +- LLM agents iterate via open self-calls (same root tag, same thread ID). +- Conversation thread = complete memory and audit trail (all messages logged). +- Subthreading natively supported via hierarchical thread IDs and primitives (e.g., reserved payload to spawn "parent.sub1"). +- Optional structured constructs like `` for visible planning. +- No hidden loops or state machines; all reasoning steps are visible messages. + +## Security & Sovereignty +- Privileged messages (per `privileged-msg.xsd`) handled exclusively on dedicated OOB channel. 
+- OOB channel bound to localhost by default (safe for local GUI); separate port/socket from main bus. +- Main MessageBus and pump oblivious to privileged operations — no routing or handling for privileged roots. +- Remote privileged attempts impossible (channel not exposed); any leak to main port logged as security event and dropped. +- Ed25519 identity key used for envelope signing, federation auth, and privileged command verification. +- No agent may modify organism structure, register listeners, or access host resources beyond declared scope. +- “No Paperclippers” manifesto injected as first system message for every LLM-based listener. + +## Federation +- Gateways declared in YAML with trusted remote public key. +- Remote tools referenced by gateway name in agent tool lists. +- Regular messages flow bidirectionally; privileged messages never forwarded or accepted. + +## Introspection (Meta) +- Controlled via YAML flags (`allow_list_capabilities`, `allow_schema_requests`, etc.). +- Supports `request-schema`, `request-example`, `request-prompt`, `list-capabilities`. +- Remote meta queries optionally allowed per YAML (federation peers). + +## Technical Constraints +- Mandatory WSS (TLS) + TOTP on main port. +- OOB channel WSS or Unix socket, localhost-default. +- Internal: lxml trees → XSD validation → xmlable deserialization → dataclass → handler → bytes → dummy extraction. +- Single process, async non-blocking. +- XML is the sovereign wire format; everything else is implementation detail. + +These principles are now locked. All existing docs will be updated to match this file exactly. Future changes require explicit discussion and amendment here first. \ No newline at end of file diff --git a/docs/message-pump.md b/docs/message-pump.md index e9aba35..2804cd3 100644 --- a/docs/message-pump.md +++ b/docs/message-pump.md @@ -5,8 +5,8 @@ The AgentServer message pump is a single, linear, attack-resistant pipeline. Eve ```mermaid flowchart TD A[WebSocket Ingress
] --> B[TOTP + Auth Check] - B --> C[Repair + Exclusive C14N] - C --> D["Lark Envelope Grammar
(noise-tolerant, NOISE*)"] + B --> C[lxml Repair + Exclusive C14N] + C --> D["Envelope Grammar
"] D --> E[Extract Payload XML fragment] E --> F{Payload namespace?} @@ -24,22 +24,15 @@ flowchart TD ## Detailed Stages -1. **Ingress (raw bytes over WSS)** - Single port, TLS-terminated. +1. **Ingress:** Raw bytes over WSS. -2. **Authentication** - TOTP-based session scoping. Determines privilege level (admin vs regular agent). +2. **The Immune System:** Every inbound packet is converted to a Tree. -3. **Repair + Exclusive Canonicalization** - Normalizes XML (entity resolution disabled, huge_tree=False, no_network=True). Tamper-evident baseline. +3. **Internal Routing:** Trees flow between organs via the `dispatch` method. -4. **Envelope Validation** - Fixed, shared Lark grammar for the envelope (with `NOISE*` token). - Seeks first valid `...` in noisy LLM output. - Consumes exactly one envelope per pass (handles conjoined messages cleanly). +4. **The Thought Stream (Egress):** Listeners return raw bytes. These are wrapped in a `` tag and run through a recovery parser. -5. **Payload Extraction** - Clean payload XML fragment (bytes) + declared namespace/root. +5. **Multi-Message Extraction:** Every `` found in the dummy tag is extracted as a Tree and re-injected into the Bus. 6. **Routing Decision** - `https://xml-platform.org/meta/v1` → **Core Meta Handler** (privileged, internal). @@ -66,7 +59,7 @@ flowchart TD ## Safety Properties -- **No entity expansion** anywhere (Lark ignores entities, lxml parsers hardened). +- **No entity expansion** anywhere (lxml parsers hardened). - **Bounded depth/recursion** by schema design + size limits. - **No XML trees escape the pump** — only clean dicts reach handlers. - **Topology privacy** — normal flows reveal no upstream schemas unless meta privilege granted. 
diff --git a/docs/thread-management.md b/docs/thread-management.md new file mode 100644 index 0000000..a1a8954 --- /dev/null +++ b/docs/thread-management.md @@ -0,0 +1,65 @@ +# Thread Management in AgentServer v2.0 +**January 03, 2026** + +This document clarifies the thread ID system, subthreading mechanics, and internals. It supplements the Core Architectural Principles — hierarchical dot notation examples there reflect the wire format. + +## Wire Format: Hierarchical String IDs +- Mandatory `` contains a **server-assigned** hierarchical string (dot notation, e.g., "root", "root.research", "root.research.images"). +- Root IDs: Short, opaque, server-generated (e.g., "sess-abcd1234"). +- Sub-IDs: Relative extensions for readability. +- Benefits: LLM/human-friendly copying, natural tree structure for logs/GUI. + +## Server Assignment Only +The organism assigns all final IDs — agents never invent them. + +- **Root initiation**: Client suggests or server auto-generates on first message; uniqueness enforced. +- **Subthread spawning**: Explicit reserved payload for intent clarity: + ```xml + + + + + + ``` + Core handler: + - Appends label (or auto-short if omitted). + - Resolves uniqueness conflicts (append "-1" etc.). + - Creates queue + seeds bootstrap. + - **Always responds** in current thread: + ```xml + + ``` + +## Error Handling (No Silent Failure) +- Unknown `` ID → no implicit creation. +- **Always inject** system error into parent thread (or root): + ```xml + + ``` +- LLM sees error immediately, retries without hanging. +- Logs warning for monitoring. + +## Internals +- Per-thread queues: dict[str, Queue]. +- Scheduling via `organism.yaml`: + ```yaml + thread_scheduling: "breadth-first" # or "depth-first" (default: breadth-first) + ``` + - Depth from dot count. +- Optional hidden UUID mapping for extra safety (implementation detail). + +## Design Rationale +- Explicit spawn = clear intent + bootstrap hook. +- Mandatory feedback = no LLM limbo. 
+- Readable IDs = easy copying without UUID mangling. +- Server control = sovereignty + no collisions. + +Future: Alias registry, thread metadata primitives. + +The organism branches reliably, visibly, and recoverably. diff --git a/scripts/generate_organism_key.py b/scripts/generate_organism_key.py deleted file mode 100644 index f478b64..0000000 --- a/scripts/generate_organism_key.py +++ /dev/null @@ -1,2 +0,0 @@ -# One-time tool to generate the permanent Ed25519 organism identity -# Run once, store private key offline/safely \ No newline at end of file
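
The server-side ID assignment described in `docs/thread-management.md` (append a label to the parent, resolve uniqueness conflicts with a numeric suffix, refuse unknown parents, derive depth from the dot count) could be sketched roughly as follows. This is an illustration under assumptions, not code from this patch: `ThreadRegistry`, its method names, and the fallback label `"sub"` are all hypothetical, and the real organism would also create a queue and seed a bootstrap message for each new thread.

```python
from __future__ import annotations


class ThreadRegistry:
    """Hypothetical in-memory registry; the server owns every final thread ID."""

    def __init__(self) -> None:
        self._known: set[str] = set()

    def create_root(self, root_id: str) -> str:
        """Register a root thread; uniqueness is enforced, never assumed."""
        if not root_id or "." in root_id:
            raise ValueError("root IDs must be non-empty and contain no dots")
        if root_id in self._known:
            raise ValueError(f"root thread already exists: {root_id}")
        self._known.add(root_id)
        return root_id

    def spawn(self, parent: str, label: str | None = None) -> str:
        """Assign a child ID under `parent`, resolving label collisions."""
        if parent not in self._known:
            # Unknown parent: no implicit creation; the caller reports the error.
            raise KeyError(f"unknown thread: {parent}")
        label = label or "sub"  # assumed auto-label when none is suggested
        if "." in label:
            raise ValueError("labels must not contain dots")
        candidate = f"{parent}.{label}"
        suffix = 0
        while candidate in self._known:
            # Conflict: try "label-1", "label-2", ... until the ID is free.
            suffix += 1
            candidate = f"{parent}.{label}-{suffix}"
        self._known.add(candidate)
        return candidate

    @staticmethod
    def depth(thread_id: str) -> int:
        """Depth is the dot count: a root is 0, 'root.research' is 1."""
        return thread_id.count(".")


registry = ThreadRegistry()
registry.create_root("root")
first = registry.spawn("root", "research")
second = registry.spawn("root", "research")  # same label requested twice
nested = registry.spawn(first, "images")
print(first, second, nested, ThreadRegistry.depth(nested))
```

Raising `KeyError` for an unknown parent keeps the "no implicit creation" rule in one place; turning that exception into a visible error message in the parent thread is left to the caller.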