re-writing docs and code

dullfig 2026-01-03 14:48:57 -08:00
parent 9e75cfffd6
commit ab062bca18
10 changed files with 330 additions and 193 deletions


@ -3,5 +3,5 @@
<component name="Black">
<option name="sdkName" value="Python 3.13 (xml-pipeline)" />
</component>
<component name="ProjectRootManager" version="2" project-jdk-name="Python 3.14 (xml-pipeline)" project-jdk-type="Python SDK" />
<component name="ProjectRootManager" version="2" project-jdk-name="Python 3.13 (xml-pipeline)" project-jdk-type="Python SDK" />
</project>


@ -5,7 +5,7 @@
<sourceFolder url="file://$MODULE_DIR$" isTestSource="false" />
<excludeFolder url="file://$MODULE_DIR$/.venv" />
</content>
<orderEntry type="jdk" jdkName="Python 3.14 (xml-pipeline)" jdkType="Python SDK" />
<orderEntry type="jdk" jdkName="Python 3.13 (xml-pipeline)" jdkType="Python SDK" />
<orderEntry type="sourceFolder" forTests="false" />
</component>
</module>


@ -1,48 +1,82 @@
# AgentServer — The Living Substrate (v1.3)
***"It just works..."***
# AgentServer — The Living Substrate (v2.0)
***"It just works... safely."***
**January 01, 2026**
**Architecture: Autonomous Grammar-Driven, Turing-Complete Multi-Agent Organism**
**January 03, 2026**
**Architecture: Autonomous Schema-Driven, Turing-Complete Multi-Agent Organism**
## What It Is
AgentServer is a production-ready substrate for the `xml-pipeline` nervous system. Version 1.3 introduces **Autonomous Grammar Generation**, where the organism defines its own language and validation rules in real-time using Lark and XSD automation.
AgentServer is a production-ready substrate for the `xml-pipeline` nervous system. Version 2.0 stabilizes the design around exact XSD validation, typed dataclass handlers, mandatory hierarchical threading, and strict out-of-band privileged control.
See [Core Architectural Principles](docs/core-principles-v2.0.md) for the single canonical source of truth.
## Core Philosophy
- **Autonomous DNA:** The system never requires a human to explain tool usage to an agent. Listeners automatically generate their own XSDs based on their parameters, which are then converted into **Lark Grammars** for high-speed, one-pass scanning and validation.
- **Grammar-Locked Intelligence:** Dirty LLM streams are scanned by Lark. Only text that satisfies the current organism's grammar is extracted and validated. Everything else is ignored as "Biological Noise."
- **Parameter-Keyed Logic:** Messages are delivered to agents as pristine Python dictionaries, automatically keyed to the listener's registered parameters.
- **Computational Sovereignty:** Turing-complete via `<todo-until/>` and `<start-thread/>` primitives, governed by a strict resource stack.
- **Autonomous DNA:** Listeners declare their contract via `@xmlify` dataclasses; the organism auto-generates XSDs, examples, and tool prompts.
- **Schema-Locked Intelligence:** Payloads validated directly against XSD (lxml) → deserialized to typed instances → pure handlers.
- **Multi-Response Tolerance:** Handlers return raw bytes; bus wraps in `<dummy></dummy>` and extracts multiple payloads (perfect for parallel tool calls or dirty LLM output).
- **Computational Sovereignty:** Turing-complete via self-calls, subthreading primitives, and visible reasoning — all bounded by thread hierarchy and local-only control.
## Developer Experience — Create a Listener in 12 Lines
That's it. No manual XML, no schemas, no prompts.
```python
from xmlable import xmlify
from dataclasses import dataclass
from xml_pipeline import Listener, bus  # bus is the global MessageBus

@xmlify
@dataclass
class AddPayload:
    a: int
    b: int

def add_handler(payload: AddPayload) -> bytes:
    result = payload.a + payload.b
    return f"<result>{result}</result>".encode("utf-8")

Listener(
    payload_class=AddPayload,
    handler=add_handler,
    name="calculator.add",
    description="Adds two integers and returns their sum."
).register()  # ← Boom: XSD, example, prompt auto-generated + registered
```
The organism now speaks `<add>` — fully validated, typed, and discoverable.
## Key Features
### 1. The Autonomous Schema Layer
- Dataclass → cached XSD + example + rich tool prompt (mandatory description + field docs).
- Namespaces: `https://xml-pipeline.org/ns/<category>/<name>/v1` (served live via domain for discoverability).
### 1. The Autonomous Language Layer
- **XSD-to-Lark Generator:** A core utility that transcribes XSD schema definitions into EBNF Lark grammars. This enables the server to search untrusted data streams for specific XML patterns with mathematical precision.
- **Auto-Descriptive Organs:** The base `XMLListener` class inspects its own instantiation parameters to generate a corresponding XSD. The tool itself tells the world how to use it.
- **Protocol Agnostic:** To add a new field (like `<cc/>`) to the entire swarm, you simply update the central XSD. The entire organism's grammar updates instantly.
- **[Read Further: Self-Registration & Autonomous Grammars](docs/self-grammar-generation.md)**
### 2. Thread-Based Lifecycle & Reasoning
- Mandatory `<thread/>` with hierarchical IDs for reliable subthreading and audit trails.
- LLM agents reason via open self-calls and optional `<todo-until/>`.
- All thought steps visible as messages — no hidden state.
### 2. The Stack-Based Lifecycle
- **UUID Custody:** UUID v4 thread identifiers are born via `<spawn-thread/>` and managed on a physical stack.
- **Leaf-to-Root Roll-up:** Threads remain active until the final leaf responds, ensuring perfect resource tracking and preventing runaway processes.
### 3. Message Pump
- Single linear pipeline with repair, C14N, XSD validation, deserialization, handler execution, and multi-payload extraction.
- Supports clean tools and forgiving LLM streams alike.
- Thread-based message queue with bounded memory.
### 3. The Sovereign Witness
- **Inline Auditing:** The Logger witnesses all traffic before routing.
- **The Confessional:** Agents record inner thoughts via `<logger/>`. The Logger is **strictly write-only** to prevent rogue memory or shared-state leaks.
### 4. Structural Control
- Bootstrap from `organism.yaml`.
- Runtime changes (hot-reload, add/remove listeners) via local-only OOB channel (localhost WSS or Unix socket — GUI-ready).
- Main bus oblivious to privileged ops.
### 4. Isolated Structural Control
- **Out-of-Band (OOB) Port:** Structural commands (registration, wiring, shutdown) use a dedicated port and Ed25519 signatures, ensuring "Life/Death" commands cannot be delayed by agent traffic.
- **[Read Further: YAML Configuration System](docs/configuration.md)**
### 5. Federation & Introspection
- YAML-declared gateways with trusted keys.
- Controlled meta queries (schema/example/prompt/capability list).
## Technical Stack
- **Parsing:** Lark (EBNF Grammar) + `lxml` (Validation/C14N).
- **Protocol:** Mandatory WSS (TLS) + TOTP 2FA.
- **Identity:** Ed25519 (OOB) + UUID v4 (In-Bus).
- **Format:** `lxml` trees (Internal) / Exclusive C14N (External).
- **Validation & Parsing:** lxml (XSD, C14N, repair) + xmlable (round-trip).
- **Protocol:** Mandatory WSS (TLS) + TOTP on main port.
- **Identity:** Ed25519 (signing, federation, privileged).
- **Format:** Exclusive C14N XML (wire sovereign).
## Why This Matters
AgentServer v1.3 is the first multi-agent substrate where the **language is the security.** By automating the link between XSD, Grammar, and LLM Prompts, you've created an organism that is impossible to "misunderstand." It is a self-documenting, self-validating, and self-regulating intelligent system.
AgentServer v2.0 is a bounded, auditable, owner-controlled organism where the **XSD is the security**, the **thread is the memory**, and the **OOB channel is the sovereignty**.
**One port. Many bounded minds. Autonomous Evolution.** 🚀
One port. Many bounded minds. Autonomous yet obedient evolution. 🚀
---
*XML wins. Safely. Permanently.*


@ -1,6 +1,6 @@
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="https://xml-pipeline.org/ns/envelope/1"
targetNamespace="https://xml-pipeline.org/ns/envelope/v1"
elementFormDefault="qualified">
<!-- The universal envelope for all non-privileged messages -->
@ -13,7 +13,7 @@
<xs:sequence>
<xs:element name="from" type="xs:string" minOccurs="1"/>
<xs:element name="to" type="xs:string" minOccurs="0"/>
<xs:element name="thread" type="xs:string" minOccurs="0"/>
<xs:element name="thread" type="xs:string" minOccurs="1"/>
<!-- Reserved for future standard fields (timestamp, priority, etc.) -->
<xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>


@ -1,104 +1,83 @@
"""
xmllistener.py: The Sovereign Contract for All Capabilities
In xml-pipeline, there are no "agents", no "tools", no "services".
There are only bounded, reactive XMLListeners.
Every capability in the organism, whether driven by an LLM,
a pure function, a remote gateway, or privileged logic,
must inherit from this class.
This file is intentionally verbose and heavily documented.
It is the constitution that all organs must obey.
xml_listener.py: The Sovereign Contract for All Capabilities (v1.3)
"""
from __future__ import annotations
import uuid
from typing import Optional, List, ClassVar
from lxml import etree
from typing import Optional, Type, Callable
from pydantic import BaseModel
class XMLListener:
"""
Base class for all reactive capabilities in the organism.
Key Invariants (never break these):
1. Listeners are passive. They never initiate. They only react.
2. They declare what they listen to via class variable.
3. They have a globally unique agent_name.
4. They receive the full parsed envelope tree (not raw XML).
5. They return only payload XML (never the envelope).
6. The MessageBus owns routing, threading, and envelope wrapping.
Base class for all reactive capabilities.
Now supports Autonomous Registration via Pydantic payload classes.
"""
# ===================================================================
# Required class declarations — must be overridden in subclasses
# ===================================================================
def __init__(
self,
name: str,
payload_class: Type[BaseModel],
handler: Callable[[dict], bytes],
description: Optional[str] = None
):
self.agent_name = name
self.payload_class = payload_class
self.handler = handler
self.description = description or payload_class.__doc__ or "No description provided."
listens_to: ClassVar[List[str]] = []
"""
List of full XML tags this listener reacts to.
Example: ["{https://example.org/chat}message", "{https://example.org/calc}request"]
"""
agent_name: ClassVar[str] = ""
"""
Globally unique name for this instance.
Enforced by MessageBus at registration.
Used in <from/>, routing, logging, and known_peers prompts.
"""
# ===================================================================
# Core handler — the only method that does work
# ===================================================================
# In v1.3, the root tag is derived from the payload class name
self.root_tag = payload_class.__name__
self.listens_to = [self.root_tag]
async def handle(
self,
envelope_tree: etree._Element,
convo_id: str,
payload_dict: dict,
thread_id: str,
sender_name: str,
) -> Optional[str]:
) -> Optional[bytes]:
"""
React to an incoming enveloped message.
Parameters:
envelope_tree: Full <env:message> root (parsed, post-repair/C14N)
convo_id: Current conversation UUID (injected or preserved by bus)
sender_name: The <from/> value (mandatory)
Returns:
Payload XML string (no envelope) if responding, else None.
The organism guarantees:
- envelope_tree is valid against envelope.xsd
- <from/> is present and matches sender_name
- convo_id is a valid UUID
To reply in the current thread: omit convo_id in the response; the bus preserves it
To start a new thread: include <env:convo_id>new-uuid</env:convo_id> in returned envelope
React to a pre-validated dictionary payload.
Returns raw response XML bytes.
"""
raise NotImplementedError(
f"{self.__class__.__name__} must implement handle()"
)
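# 1. Execute the handler logic. The handler is awaited, so it must be an
#    async callable (effectively Callable[[dict], Awaitable[bytes]]).
# Note: In v1.3, the Bus/Lark handles the XML -> Dict conversion
return await self.handler(payload_dict)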
# ===================================================================
# Optional convenience helpers (can be overridden)
# ===================================================================
def generate_xsd(self) -> str:
"""
Autonomous XSD Synthesis.
Inspects the payload_class and generates an XSD string.
"""
# Logic to iterate over self.payload_class.model_fields
# and build the <xs:element> definitions.
pass
def make_response(
def generate_prompt_fragment(self) -> str:
"""
Prompt Synthesis (The 'Mente').
Generates the tool usage instructions for other agents.
"""
fragment = [
f"Capability: {self.agent_name}",
f"Root Tag: <{self.root_tag}>",
f"Description: {self.description}",
"\nParameters:"
]
for name, field in self.payload_class.model_fields.items():
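# typing constructs such as Optional[int] may lack __name__, so fall back to str()
field_type = getattr(field.annotation, "__name__", str(field.annotation))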
field_desc = field.description or "No description"
fragment.append(f" - {name} ({field_type}): {field_desc}")
return "\n".join(fragment)
def make_response_envelope(
self,
payload: str | etree._Element,
*,
to: Optional[str] = None,
convo_id: Optional[str] = None,
) -> str:
payload_bytes: bytes,
thread_id: str,
to: Optional[str] = None
) -> bytes:
"""
Helper for building correct response payloads.
Use this in subclasses to avoid envelope boilerplate.
- If convo_id is None: reply in current thread
- If convo_id is provided: force/start new thread
- `to` overrides the default reply-to-sender
Wraps response bytes in a standard envelope.
"""
# Implementation tomorrow — but declared here for contract clarity
raise NotImplementedError
# Logic to build the <message> meta block and append the payload_bytes
pass


@ -1,118 +1,110 @@
# Configuration — organism.yaml
# Configuration — organism.yaml (v2.0)
The entire organism is declared in a single YAML file (default: `config/organism.yaml`).
All listeners, agents, and federation gateways are instantiated from this file at startup.
Changes require a restart (hot-reload planned for future).
Loaded at bootstrap — single source of truth for initial composition.
Runtime changes (hot-reload) via local OOB privileged commands.
## Example Full Configuration
```yaml
organism:
name: "ResearchSwarm-01"
identity: "config/identity/private.ed25519" # Ed25519 private key for signing
port: 8765
identity: "config/identity/private.ed25519" # Ed25519 private key
port: 8765 # Main message bus WSS
tls:
cert: "certs/fullchain.pem"
key: "certs/privkey.pem"
oob: # Out-of-band privileged channel (GUI/hot-reload ready)
enabled: true
bind: "127.0.0.1" # Localhost-only default
port: 8766 # Separate WSS port
# unix_socket: "/tmp/organism.sock" # Alternative
thread_scheduling: "breadth-first" # or "depth-first" (default: breadth-first)
meta:
enabled: true
allow_list_capabilities: true # Public catalog of capability names
allow_list_capabilities: true
allow_schema_requests: "admin" # "admin" | "authenticated" | "none"
allow_example_requests: "admin"
allow_prompt_requests: "admin"
allow_remote: false # Federation peers can query meta
allow_remote: false # Federation peers query meta
listeners:
- name: calculator.add
payload_class: examples.calculator.AddPayload
handler: examples.calculator.add_handler
description: "Integer addition"
- name: calculator.subtract
payload_class: examples.calculator.SubtractPayload
handler: examples.calculator.subtract_handler
description: "Adds two integers and returns their sum." # Mandatory for usable tool prompts
- name: summarizer
payload_class: agents.summarizer.SummarizePayload
handler: agents.summarizer.summarize_handler
description: "Text summarization via local LLM"
description: "Summarizes text via local LLM."
agents:
- name: researcher
system_prompt: "prompts/researcher_system.txt"
tools:
- calculator.add
- calculator.subtract
- summarizer
- name: web_search # Remote tool via gateway below
- name: web_search
remote: true
gateways:
- name: web_search
remote_url: "wss://trusted-search-node.example.org"
trusted_identity: "pubkeys/search_node.ed25519.pub"
description: "Federated web search capability"
description: "Federated web search capability."
```
## Sections Explained
### `organism`
Core server settings.
Core settings.
- `name`: Logs/discovery.
- `identity`: Ed25519 private key path.
- `port` / `tls`: Main WSS bus.
- `name`: Human-readable identifier (used in logs, discovery).
- `identity`: Path to Ed25519 private key (for envelope signing, federation auth).
- `port` / `tls`: Single-port WSS configuration.
### `oob`
Privileged local control channel.
- `enabled: false` → pure static (restart for changes).
- Localhost default for GUI safety.
- Separate from main port — bus oblivious.
### `thread_scheduling`
Balanced subthread execution.
- `"breadth-first"`: Fair round-robin (default, prevents deep starvation).
- `"depth-first"`: Dive deep into branches.
### `meta`
Controls the privileged introspection facility (`https://xml-platform.org/meta/v1`).
- `allow_list_capabilities`: Publicly visible catalog (safe).
- `allow_*_requests`: Restrict schema/example/prompt emission to admin or authenticated sessions.
- `allow_remote`: Whether federation peers can query your meta namespace.
Introspection controls (`https://xml-pipeline.org/ns/meta/v1`).
### `listeners`
All bounded capabilities. Each entry triggers autonomous registration:
Bounded capabilities.
- `name`: Discovery/logging (dots for hierarchy).
- `payload_class`: Full import to `@xmlify` dataclass.
- `handler`: Full import to function (dataclass → bytes).
- `description`: **Mandatory** human-readable blurb (lead-in for auto-prompt; fallback to generic if omitted).
- `name`: Logical capability name (used in discovery, YAML tools lists). Dots allowed for hierarchy.
- `payload_class`: Full import path to the `@xmlify` dataclass (defines contract).
- `handler`: Full import path to the handler callable (`dict → bytes`).
- `description`: Optional human-readable text (included in `list-capabilities`).
At startup/hot-reload: imports → Listener instantiation → bus.register() → XSD/example/prompt synthesis.
At startup:
1. Import payload_class and handler.
2. Instantiate `Listener(payload_class=..., handler=..., name=...)`.
3. `bus.register(listener)` → XSD synthesis, Lark grammar generation, prompt caching.
Filesystem artifacts:
- XSDs cached as `schemas/<name_with_underscores>/v1.xsd` (dots → underscores for Linux safety).
Cached XSDs: `schemas/<name>/v1.xsd`.
### `agents`
LLM-based reasoning agents.
- `name`: Agent identifier.
- `system_prompt`: Path to static prompt file.
- `tools`: List of local capability names or remote references.
- Local: direct name match (`calculator.add`).
- Remote: `name:` + `remote: true` → routed via matching gateway.
Live capability prompts are auto-injected into the agent's system prompt at runtime (no stale copies).
LLM reasoners.
- `system_prompt`: Static file path.
- `tools`: Local names or remote references.
- Auto-injected live tool prompts at runtime.
### `gateways`
Federation peers.
- Trusted public key required.
- Bidirectional regular traffic only.
- `name`: Local alias for the remote organism.
- `remote_url`: WSS endpoint.
- `trusted_identity`: Path to remote's Ed25519 public key.
- `description`: Optional.
## Notes
- Hot-reload: Future privileged OOB commands (apply new YAML fragments, add/remove listeners).
- Namespaces: Capabilities under `https://xml-pipeline.org/ns/<category>/<name>/v1` (served live if configured).
- Edit → reload/restart → new bounded minds, self-describing and attack-resistant.
Remote tools referenced in `agents.tools` are routed through the gateway with matching `name`.
## Future Extensions (planned)
- Hot-reload of configuration.
- Per-agent privilege scoping.
- Capability versioning in YAML (`version: v2`).
This YAML is the **single source of truth** for organism composition.
Edit → restart → new bounded minds appear, fully self-describing and attack-resistant.
This YAML is the organism's DNA — precise, auditable, and evolvable locally.


@ -0,0 +1,76 @@
# AgentServer v2.0 — Core Architectural Principles
**January 03, 2026**
**Architecture: Autonomous Schema-Driven, Turing-Complete Multi-Agent Organism**
These principles are the single canonical source of truth for the project. All documentation, code, and future decisions must align with this file.
## Identity & Communication
- All traffic uses the universal `<message>` envelope defined in `envelope.xsd` (namespace `https://xml-pipeline.org/ns/envelope/v1`).
- Mandatory `<from/>` and `<thread/>` (convo_id string, supports hierarchical dot notation for subthreading, e.g., "root.1.research").
- Optional `<to/>` (rare direct routing; most flows use payload namespace/root).
- Exclusive C14N on ingress and egress.
- Malformed XML repaired on ingress; repairs logged in `<huh/>` metadata.
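
The mandatory-field rule above can be illustrated with a minimal sketch. It uses only the standard library (`xml.etree.ElementTree`) rather than the project's lxml and XSD pipeline, and both the helper name `missing_envelope_fields` and the nesting of `<from/>` and `<thread/>` inside `<message>` are assumptions made for illustration. The authoritative check remains validation against `envelope.xsd`.

```python
import xml.etree.ElementTree as ET

ENVELOPE_NS = "https://xml-pipeline.org/ns/envelope/v1"
MANDATORY = ("from", "thread")  # <to/> stays optional


def missing_envelope_fields(raw: bytes) -> list:
    """Return the mandatory envelope fields that are absent or empty."""
    root = ET.fromstring(raw)
    missing = []
    for name in MANDATORY:
        # Search at any depth, because the exact nesting is an assumption here.
        node = root.find(f".//{{{ENVELOPE_NS}}}{name}")
        if node is None or not (node.text or "").strip():
            missing.append(name)
    return missing


sample = (
    b'<message xmlns="https://xml-pipeline.org/ns/envelope/v1">'
    b"<from>calculator.add</from><thread>root.1.research</thread>"
    b"</message>"
)
no_thread = (
    b'<message xmlns="https://xml-pipeline.org/ns/envelope/v1">'
    b"<from>calculator.add</from>"
    b"</message>"
)
```

Here `missing_envelope_fields(sample)` reports nothing missing, while `no_thread` is flagged for its absent `<thread/>`.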
## Configuration & Composition
- YAML file (`organism.yaml`) is the bootstrap source of truth, loaded at startup.
- Defines initial listeners, agents, gateways, meta privileges, and OOB channel configuration.
- Runtime structural changes (add/remove listeners, rewire agents, etc.) via local-only privileged commands on the dedicated OOB channel (hot-reload capability).
- No remote or unprivileged structural changes ever.
## Autonomous Schema Layer
- Listeners defined by `@xmlify`-decorated dataclass (payload contract) + pure handler function.
- Mandatory human-readable description string (short "what this does" blurb for tool prompt lead-in).
- Registration (at startup or via hot-reload) automatically generates:
- XSD cached on disk (`schemas/<name>/v1.xsd`)
- Example XML
- Tool description prompt fragment (includes description, params with field docs if present, example input)
- All capability namespaces under `https://xml-pipeline.org/ns/<category>/<name>/v1`.
- Root element derived from payload class name (lowercase) or explicit.
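
A rough sketch of two of the derivations listed above: the root element name and the on-disk XSD cache location. The helper names are hypothetical. Stripping a trailing `Payload` from the class name is an assumption inferred from the README example (`AddPayload` is spoken as `<add>`), and the dot-to-underscore directory name follows the configuration document rather than anything stated in this file.

```python
from pathlib import PurePosixPath


def root_tag_for(payload_class: type, suffix: str = "Payload") -> str:
    """Derive a lowercase root element name from a payload class name."""
    name = payload_class.__name__
    # Assumption: a trailing "Payload" is dropped before lowercasing.
    if name.endswith(suffix) and len(name) > len(suffix):
        name = name[: -len(suffix)]
    return name.lower()


def xsd_cache_path(listener_name: str) -> PurePosixPath:
    """Map a dotted listener name to its cached XSD path."""
    # Dots become underscores so the name is a single safe directory.
    return PurePosixPath("schemas") / listener_name.replace(".", "_") / "v1.xsd"


class AddPayload:
    """Stand-in for an @xmlify dataclass; only the class name matters here."""
```

An explicit root element, where one is declared, would simply bypass `root_tag_for`.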
## Message Pump
- Single linear pipeline on main port: ingress → repair → C14N → envelope validation → payload routing.
- Routing key = (payload namespace, root element); unique per listener.
- Meta requests (`https://xml-pipeline.org/ns/meta/v1`) handled by privileged core handler.
- User payloads:
- Validated directly against listener's cached XSD (lxml)
- On success → deserialized to typed dataclass instance (`xmlable.from_xml`)
- Handler called with instance → returns raw bytes (XML fragment, possibly dirty/multi-root)
- Bytes wrapped in `<dummy></dummy>` → repaired/parsed → all top-level payload elements extracted
- Each extracted payload wrapped in separate response envelope (inherits thread/from, optional new subthread if primitive used)
- Enveloped responses buffered and sent sequentially
- Supports single clean response, multi-payload emission (parallel tools/thoughts), and dirty LLM output tolerance.
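
The dummy-wrap step described above can be sketched as follows. This is an illustration only: it uses the strict standard-library parser, so the repair stage that the real pump performs with lxml is not reproduced, and the function name `extract_payloads` is hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_payloads(raw: bytes) -> list:
    """Wrap handler output in <dummy> and return each top-level element as bytes."""
    wrapped = b"<dummy>" + raw + b"</dummy>"
    root = ET.fromstring(wrapped)
    payloads = []
    for child in root:
        child.tail = None  # discard stray text between payloads
        payloads.append(ET.tostring(child))
    return payloads


handler_output = b"thinking... <result>3</result> and also <result>4</result>"
extracted = extract_payloads(handler_output)
```

Wrapping in a single synthetic root is what lets a multi-root fragment parse at all; each extracted element would then get its own response envelope.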
## Reasoning & Iteration
- LLM agents iterate via open self-calls (same root tag, same thread ID).
- Conversation thread = complete memory and audit trail (all messages logged).
- Subthreading natively supported via hierarchical thread IDs and primitives (e.g., reserved payload to spawn "parent.sub1").
- Optional structured constructs like `<todo-until/>` for visible planning.
- No hidden loops or state machines; all reasoning steps are visible messages.
## Security & Sovereignty
- Privileged messages (per `privileged-msg.xsd`) handled exclusively on dedicated OOB channel.
- OOB channel bound to localhost by default (safe for local GUI); separate port/socket from main bus.
- Main MessageBus and pump oblivious to privileged operations — no routing or handling for privileged roots.
- Remote privileged attempts impossible (channel not exposed); any leak to main port logged as security event and dropped.
- Ed25519 identity key used for envelope signing, federation auth, and privileged command verification.
- No agent may modify organism structure, register listeners, or access host resources beyond declared scope.
- “No Paperclippers” manifesto injected as first system message for every LLM-based listener.
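
The "logged as security event and dropped" behaviour could look roughly like the guard below. Everything named here is an assumption: the privileged root names, the logger name, and the function `admit_on_main_bus` are hypothetical, and the actual set of privileged roots is defined by `privileged-msg.xsd`.

```python
import logging

log = logging.getLogger("organism.security")  # hypothetical logger name

# Hypothetical root element names; the real ones come from privileged-msg.xsd.
PRIVILEGED_ROOTS = frozenset({"register-listener", "shutdown"})


def admit_on_main_bus(root_tag: str, sender: str) -> bool:
    """Return False (and log) when a privileged root shows up on the main port."""
    if root_tag in PRIVILEGED_ROOTS:
        log.warning(
            "security event: privileged root <%s> from %s dropped", root_tag, sender
        )
        return False
    return True
```

Ordinary payloads pass through untouched; only the leak case produces a log entry.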
## Federation
- Gateways declared in YAML with trusted remote public key.
- Remote tools referenced by gateway name in agent tool lists.
- Regular messages flow bidirectionally; privileged messages never forwarded or accepted.
## Introspection (Meta)
- Controlled via YAML flags (`allow_list_capabilities`, `allow_schema_requests`, etc.).
- Supports `request-schema`, `request-example`, `request-prompt`, `list-capabilities`.
- Remote meta queries optionally allowed per YAML (federation peers).
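
One plausible reading of the `allow_*_requests` flags is sketched below. The three setting values come from the YAML example; the session role names and the rule that an admin session also counts as authenticated are assumptions.

```python
def meta_request_allowed(setting: str, session_role: str) -> bool:
    """Decide whether a meta request may be served.

    setting:      "admin" | "authenticated" | "none" (from organism.yaml)
    session_role: "admin" | "authenticated" | "anonymous" (assumed role names)
    """
    if setting == "none":
        return False
    if setting == "authenticated":
        # Assumption: admin sessions are also authenticated sessions.
        return session_role in ("authenticated", "admin")
    if setting == "admin":
        return session_role == "admin"
    raise ValueError(f"unknown meta setting: {setting!r}")
```

Rejecting unknown settings outright keeps a typo in the YAML from silently widening access.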
## Technical Constraints
- Mandatory WSS (TLS) + TOTP on main port.
- OOB channel WSS or Unix socket, localhost-default.
- Internal: lxml trees → XSD validation → xmlable deserialization → dataclass → handler → bytes → dummy extraction.
- Single process, async non-blocking.
- XML is the sovereign wire format; everything else is implementation detail.
These principles are now locked. All existing docs will be updated to match this file exactly. Future changes require explicit discussion and amendment here first.


@ -5,8 +5,8 @@ The AgentServer message pump is a single, linear, attack-resistant pipeline. Eve
```mermaid
flowchart TD
A[WebSocket Ingress<br>] --> B[TOTP + Auth Check]
B --> C[Repair + Exclusive C14N]
C --> D["Lark Envelope Grammar<br>(noise-tolerant, NOISE*)"]
B --> C[lxml Repair + Exclusive C14N]
C --> D["Envelope Grammar<br>"]
D --> E[Extract Payload XML fragment]
E --> F{Payload namespace?}
@ -24,22 +24,15 @@ flowchart TD
## Detailed Stages
1. **Ingress (raw bytes over WSS)**
Single port, TLS-terminated.
1. **Ingress:** Raw bytes over WSS.
2. **Authentication**
TOTP-based session scoping. Determines privilege level (admin vs regular agent).
2. **The Immune System:** Every inbound packet is converted to a Tree.
3. **Repair + Exclusive Canonicalization**
Normalizes XML (entity resolution disabled, huge_tree=False, no_network=True). Tamper-evident baseline.
3. **Internal Routing:** Trees flow between organs via the `dispatch` method.
4. **Envelope Validation**
Fixed, shared Lark grammar for the envelope (with `NOISE*` token).
Seeks first valid `<envelope>...</envelope>` in noisy LLM output.
Consumes exactly one envelope per pass (handles conjoined messages cleanly).
4. **The Thought Stream (Egress):** Listeners return raw bytes. These are wrapped in a `<dummy/>` tag and run through a recovery parser.
5. **Payload Extraction**
Clean payload XML fragment (bytes) + declared namespace/root.
5. **Multi-Message Extraction:** Every `<message/>` found in the dummy tag is extracted as a Tree and re-injected into the Bus.
6. **Routing Decision**
- `https://xml-platform.org/meta/v1` → **Core Meta Handler** (privileged, internal).
@ -66,7 +59,7 @@ flowchart TD
## Safety Properties
- **No entity expansion** anywhere (Lark ignores entities, lxml parsers hardened).
- **No entity expansion** anywhere (lxml parsers hardened).
- **Bounded depth/recursion** by schema design + size limits.
- **No XML trees escape the pump** — only clean dicts reach handlers.
- **Topology privacy** — normal flows reveal no upstream schemas unless meta privilege granted.

docs/thread-management.md Normal file

@ -0,0 +1,65 @@
# Thread Management in AgentServer v2.0
**January 03, 2026**
This document clarifies the thread ID system, subthreading mechanics, and internals. It supplements the Core Architectural Principles — hierarchical dot notation examples there reflect the wire format.
## Wire Format: Hierarchical String IDs
- Mandatory `<thread/>` contains a **server-assigned** hierarchical string (dot notation, e.g., "root", "root.research", "root.research.images").
- Root IDs: Short, opaque, server-generated (e.g., "sess-abcd1234").
- Sub-IDs: Relative extensions for readability.
- Benefits: LLM/human-friendly copying, natural tree structure for logs/GUI.
## Server Assignment Only
The organism assigns all final IDs — agents never invent them.
- **Root initiation**: Client suggests or server auto-generates on first message; uniqueness enforced.
- **Subthread spawning**: Explicit reserved payload for intent clarity:
```xml
<spawn-thread suggested_sub_id="research"> <!-- optional relative label -->
  <initial-payload> <!-- optional bootstrap fragment -->
    <!-- any valid payload -->
  </initial-payload>
</spawn-thread>
```
Core handler:
- Appends label (or auto-short if omitted).
- Resolves uniqueness conflicts (append "-1" etc.).
- Creates queue + seeds bootstrap.
- **Always responds** in current thread:
```xml
<thread-spawned
    assigned_id="root.research"
    parent_id="root"
    message="Thread spawned successfully."/>
```
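
The assignment rules above (append the label, resolve conflicts with a numeric suffix) can be sketched as follows. The function names and the fallback label `"sub"` used when no label is suggested are hypothetical; the document only says an automatic short label is chosen.

```python
import itertools
from typing import Optional, Set


def assign_sub_id(parent_id: str, suggested: Optional[str], existing: Set[str]) -> str:
    """Return a unique child thread ID under parent_id."""
    label = suggested or "sub"  # hypothetical auto-label when none is suggested
    candidate = f"{parent_id}.{label}"
    if candidate not in existing:
        return candidate
    # Resolve conflicts by appending "-1", "-2", ... until the ID is free.
    for n in itertools.count(1):
        numbered = f"{candidate}-{n}"
        if numbered not in existing:
            return numbered
    raise AssertionError("unreachable: itertools.count is unbounded")


def thread_depth(thread_id: str) -> int:
    """Depth is the number of dots in the hierarchical ID."""
    return thread_id.count(".")
```

The returned value is what would be reported back as `assigned_id` in `<thread-spawned/>`.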
## Error Handling (No Silent Failure)
- Unknown `<thread/>` ID → no implicit creation.
- **Always inject** system error into parent thread (or root):
```xml
<system-thread-error
    unknown_id="root.badname"
    code="unknown_thread"
    message="Unknown thread; emit &lt;spawn-thread/&gt; to create or correct ID."/>
```
- LLM sees error immediately, retries without hanging.
- Logs warning for monitoring.
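
A minimal sketch of the error path above, assuming the error is delivered to the nearest known ancestor of the unknown ID and falls back to the first ID segment as the root. The function name is hypothetical, and the element is built with the standard library so that the `<spawn-thread/>` hint inside the `message` attribute is escaped automatically.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Set, Tuple


def unknown_thread_error(thread_id: str, known: Set[str]) -> Optional[Tuple[str, bytes]]:
    """Return (target_thread, error_xml) for an unknown ID, or None if it is known."""
    if thread_id in known:
        return None
    # Walk up the dotted ID until a known ancestor is found.
    parent = thread_id.rpartition(".")[0]
    while parent and parent not in known:
        parent = parent.rpartition(".")[0]
    target = parent or thread_id.split(".")[0]  # assumed fallback: the root segment
    error = ET.Element(
        "system-thread-error",
        {
            "unknown_id": thread_id,
            "code": "unknown_thread",
            "message": "Unknown thread; emit <spawn-thread/> to create or correct ID.",
        },
    )
    return target, ET.tostring(error)
```

No queue is created for the unknown ID, which matches the "no implicit creation" rule.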
## Internals
- Per-thread queues: dict[str, Queue].
- Scheduling via `organism.yaml`:
```yaml
thread_scheduling: "breadth-first" # or "depth-first" (default: breadth-first)
```
- Depth from dot count.
- Optional hidden UUID mapping for extra safety (implementation detail).
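
To make the two scheduling modes concrete, here is a simplified, synchronous sketch over per-thread queues. It is one interpretation, not the server's scheduler: breadth-first is modelled as a round-robin that takes one message per thread per pass, and depth-first as always serving the deepest thread that has pending work. The function name `drain` is hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def drain(queues: Dict[str, Deque[str]], mode: str = "breadth-first") -> List[Tuple[str, str]]:
    """Empty all queues and return (thread_id, message) in processing order."""
    order = []
    while any(queues.values()):
        if mode == "breadth-first":
            # Round-robin: one message from every non-empty thread per pass.
            for thread_id in list(queues):
                if queues[thread_id]:
                    order.append((thread_id, queues[thread_id].popleft()))
        else:
            # Depth-first: serve the deepest thread (most dots) with pending work.
            pending = [t for t in queues if queues[t]]
            thread_id = max(pending, key=lambda t: t.count("."))
            order.append((thread_id, queues[thread_id].popleft()))
    return order


def example_queues() -> Dict[str, Deque[str]]:
    return {"root": deque(["a", "b"]), "root.x": deque(["c"])}
```

With these two queues, breadth-first interleaves `root` and `root.x`, while depth-first finishes `root.x` before touching `root`.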
## Design Rationale
- Explicit spawn = clear intent + bootstrap hook.
- Mandatory feedback = no LLM limbo.
- Readable IDs = easy copying without UUID mangling.
- Server control = sovereignty + no collisions.
Future: Alias registry, thread metadata primitives.
The organism branches reliably, visibly, and recoverably.


@ -1,2 +0,0 @@
# One-time tool to generate the permanent Ed25519 organism identity
# Run once, store private key offline/safely