re-writing docs and code

dullfig 2026-01-03 14:48:57 -08:00
parent 9e75cfffd6
commit ab062bca18
10 changed files with 330 additions and 193 deletions


@ -3,5 +3,5 @@
<component name="Black"> <component name="Black">
<option name="sdkName" value="Python 3.13 (xml-pipeline)" /> <option name="sdkName" value="Python 3.13 (xml-pipeline)" />
</component> </component>
<component name="ProjectRootManager" version="2" project-jdk-name="Python 3.14 (xml-pipeline)" project-jdk-type="Python SDK" /> <component name="ProjectRootManager" version="2" project-jdk-name="Python 3.13 (xml-pipeline)" project-jdk-type="Python SDK" />
</project> </project>


@ -5,7 +5,7 @@
<sourceFolder url="file://$MODULE_DIR$" isTestSource="false" /> <sourceFolder url="file://$MODULE_DIR$" isTestSource="false" />
<excludeFolder url="file://$MODULE_DIR$/.venv" /> <excludeFolder url="file://$MODULE_DIR$/.venv" />
</content> </content>
<orderEntry type="jdk" jdkName="Python 3.14 (xml-pipeline)" jdkType="Python SDK" /> <orderEntry type="jdk" jdkName="Python 3.13 (xml-pipeline)" jdkType="Python SDK" />
<orderEntry type="sourceFolder" forTests="false" /> <orderEntry type="sourceFolder" forTests="false" />
</component> </component>
</module> </module>


@ -1,48 +1,82 @@
# AgentServer — The Living Substrate (v1.3) # AgentServer — The Living Substrate (v2.0)
***"It just works..."*** ***"It just works... safely."***
**January 01, 2026** **January 03, 2026**
**Architecture: Autonomous Grammar-Driven, Turing-Complete Multi-Agent Organism** **Architecture: Autonomous Schema-Driven, Turing-Complete Multi-Agent Organism**
## What It Is ## What It Is
AgentServer is a production-ready substrate for the `xml-pipeline` nervous system. Version 1.3 introduces **Autonomous Grammar Generation**, where the organism defines its own language and validation rules in real-time using Lark and XSD automation. AgentServer is a production-ready substrate for the `xml-pipeline` nervous system. Version 2.0 stabilizes the design around exact XSD validation, typed dataclass handlers, mandatory hierarchical threading, and strict out-of-band privileged control.
See [Core Architectural Principles](docs/core-principles-v2.0.md) for the single canonical source of truth.
## Core Philosophy ## Core Philosophy
- **Autonomous DNA:** The system never requires a human to explain tool usage to an agent. Listeners automatically generate their own XSDs based on their parameters, which are then converted into **Lark Grammars** for high-speed, one-pass scanning and validation. - **Autonomous DNA:** Listeners declare their contract via `@xmlify` dataclasses; the organism auto-generates XSDs, examples, and tool prompts.
- **Grammar-Locked Intelligence:** Dirty LLM streams are scanned by Lark. Only text that satisfies the current organism's grammar is extracted and validated. Everything else is ignored as "Biological Noise." - **Schema-Locked Intelligence:** Payloads validated directly against XSD (lxml) → deserialized to typed instances → pure handlers.
- **Parameter-Keyed Logic:** Messages are delivered to agents as pristine Python dictionaries, automatically keyed to the listener's registered parameters. - **Multi-Response Tolerance:** Handlers return raw bytes; bus wraps in `<dummy></dummy>` and extracts multiple payloads (perfect for parallel tool calls or dirty LLM output).
- **Computational Sovereignty:** Turing-complete via `<todo-until/>` and `<start-thread/>` primitives, governed by a strict resource stack. - **Computational Sovereignty:** Turing-complete via self-calls, subthreading primitives, and visible reasoning — all bounded by thread hierarchy and local-only control.
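The multi-response wrapping described in the list above can be sketched as follows. This is a minimal illustration rather than the project's implementation: it uses the standard library's `xml.etree.ElementTree` as a stand-in for lxml, so it performs no repair of malformed input, and the function name `extract_payloads` is hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_payloads(raw: bytes) -> list[ET.Element]:
    """Wrap handler output in <dummy> and return every top-level element.

    Assumption: the input is well-formed apart from having several roots.
    The real pump is described as using an lxml recovery parser instead.
    """
    wrapped = b"<dummy>" + raw + b"</dummy>"
    root = ET.fromstring(wrapped)
    # Stray text between elements ends up in .tail and is simply ignored.
    return list(root)


payloads = extract_payloads(b"<result>3</result> stray text <result>7</result>")
print([(p.tag, p.text) for p in payloads])
```

Wrapping in a single synthetic root is what lets an ordinary XML parser accept a fragment with several top-level elements.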
## Developer Experience — Create a Listener in 12 Lines
That's it. No manual XML, no schemas, no prompts.
```python
from xmlable import xmlify
from dataclasses import dataclass
from xml_pipeline import Listener, bus # bus is the global MessageBus
@xmlify
@dataclass
class AddPayload:
a: int
b: int
def add_handler(payload: AddPayload) -> bytes:
result = payload.a + payload.b
return f"<result>{result}</result>".encode("utf-8")
Listener(
payload_class=AddPayload,
handler=add_handler,
name="calculator.add",
description="Adds two integers and returns their sum."
).register() # ← Boom: XSD, example, prompt auto-generated + registered
```
The organism now speaks `<add>` — fully validated, typed, and discoverable.
## Key Features ## Key Features
### 1. The Autonomous Schema Layer
- Dataclass → cached XSD + example + rich tool prompt (mandatory description + field docs).
- Namespaces: `https://xml-pipeline.org/ns/<category>/<name>/v1` (served live via domain for discoverability).
### 1. The Autonomous Language Layer ### 2. Thread-Based Lifecycle & Reasoning
- **XSD-to-Lark Generator:** A core utility that transcribes XSD schema definitions into EBNF Lark grammars. This enables the server to search untrusted data streams for specific XML patterns with mathematical precision. - Mandatory `<thread/>` with hierarchical IDs for reliable subthreading and audit trails.
- **Auto-Descriptive Organs:** The base `XMLListener` class inspects its own instantiation parameters to generate a corresponding XSD. The tool itself tells the world how to use it. - LLM agents reason via open self-calls and optional `<todo-until/>`.
- **Protocol Agnostic:** To add a new field (like `<cc/>`) to the entire swarm, you simply update the central XSD. The entire organism's grammar updates instantly. - All thought steps visible as messages — no hidden state.
- **[Read Further: Self-Registration & Autonomous Grammars](docs/self-grammar-generation.md)**
### 2. The Stack-Based Lifecycle ### 3. Message Pump
- **UUID Custody:** UUID v4 thread identifiers are born via `<spawn-thread/>` and managed on a physical stack. - Single linear pipeline with repair, C14N, XSD validation, deserialization, handler execution, and multi-payload extraction.
- **Leaf-to-Root Roll-up:** Threads remain active until the final leaf responds, ensuring perfect resource tracking and preventing runaway processes. - Supports clean tools and forgiving LLM streams alike.
- Thread-based message queue with bounded memory.
### 3. The Sovereign Witness ### 4. Structural Control
- **Inline Auditing:** The Logger witnesses all traffic before routing. - Bootstrap from `organism.yaml`.
- **The Confessional:** Agents record inner thoughts via `<logger/>`. The Logger is **strictly write-only** to prevent rogue memory or shared-state leaks. - Runtime changes (hot-reload, add/remove listeners) via local-only OOB channel (localhost WSS or Unix socket — GUI-ready).
- Main bus oblivious to privileged ops.
### 4. Isolated Structural Control ### 5. Federation & Introspection
- **Out-of-Band (OOB) Port:** Structural commands (registration, wiring, shutdown) use a dedicated port and Ed25519 signatures, ensuring "Life/Death" commands cannot be delayed by agent traffic. - YAML-declared gateways with trusted keys.
- **[Read Further: YAML Configuration System](docs/configuration.md)** - Controlled meta queries (schema/example/prompt/capability list).
## Technical Stack ## Technical Stack
- **Parsing:** Lark (EBNF Grammar) + `lxml` (Validation/C14N). - **Validation & Parsing:** lxml (XSD, C14N, repair) + xmlable (round-trip).
- **Protocol:** Mandatory WSS (TLS) + TOTP 2FA. - **Protocol:** Mandatory WSS (TLS) + TOTP on main port.
- **Identity:** Ed25519 (OOB) + UUID v4 (In-Bus). - **Identity:** Ed25519 (signing, federation, privileged).
- **Format:** `lxml` trees (Internal) / Exclusive C14N (External). - **Format:** Exclusive C14N XML (wire sovereign).
## Why This Matters ## Why This Matters
AgentServer v1.3 is the first multi-agent substrate where the **language is the security.** By automating the link between XSD, Grammar, and LLM Prompts, you've created an organism that is impossible to "misunderstand." It is a self-documenting, self-validating, and self-regulating intelligent system. AgentServer v2.0 is a bounded, auditable, owner-controlled organism where the **XSD is the security**, the **thread is the memory**, and the **OOB channel is the sovereignty**.
**One port. Many bounded minds. Autonomous Evolution.** 🚀 One port. Many bounded minds. Autonomous yet obedient evolution. 🚀
--- ---
*XML wins. Safely. Permanently.* *XML wins. Safely. Permanently.*


@ -1,6 +1,6 @@
<?xml version="1.0" encoding="UTF-8"?> <?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="https://xml-pipeline.org/ns/envelope/1" targetNamespace="https://xml-pipeline.org/ns/envelope/v1"
elementFormDefault="qualified"> elementFormDefault="qualified">
<!-- The universal envelope for all non-privileged messages --> <!-- The universal envelope for all non-privileged messages -->
@ -13,7 +13,7 @@
<xs:sequence> <xs:sequence>
<xs:element name="from" type="xs:string" minOccurs="1"/> <xs:element name="from" type="xs:string" minOccurs="1"/>
<xs:element name="to" type="xs:string" minOccurs="0"/> <xs:element name="to" type="xs:string" minOccurs="0"/>
<xs:element name="thread" type="xs:string" minOccurs="0"/> <xs:element name="thread" type="xs:string" minOccurs="1"/>
<!-- Reserved for future standard fields (timestamp, priority, etc.) --> <!-- Reserved for future standard fields (timestamp, priority, etc.) -->
<xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/> <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence> </xs:sequence>


@ -1,104 +1,83 @@
""" """
xmllistener.py: The Sovereign Contract for All Capabilities xml_listener.py: The Sovereign Contract for All Capabilities (v1.3)
In xml-pipeline, there are no "agents", no "tools", no "services".
There are only bounded, reactive XMLListeners.
Every capability in the organism, whether driven by an LLM,
a pure function, a remote gateway, or privileged logic,
must inherit from this class.
This file is intentionally verbose and heavily documented.
It is the constitution that all organs must obey.
""" """
from __future__ import annotations from __future__ import annotations
from typing import Optional, Type, Callable
import uuid from pydantic import BaseModel
from typing import Optional, List, ClassVar
from lxml import etree
class XMLListener: class XMLListener:
""" """
Base class for all reactive capabilities in the organism. Base class for all reactive capabilities.
Now supports Autonomous Registration via Pydantic payload classes.
Key Invariants (never break these):
1. Listeners are passive. They never initiate. They only react.
2. They declare what they listen to via class variable.
3. They have a globally unique agent_name.
4. They receive the full parsed envelope tree (not raw XML).
5. They return only payload XML (never the envelope).
6. The MessageBus owns routing, threading, and envelope wrapping.
""" """
# =================================================================== def __init__(
# Required class declarations — must be overridden in subclasses self,
# =================================================================== name: str,
payload_class: Type[BaseModel],
handler: Callable[[dict], bytes],
description: Optional[str] = None
):
self.agent_name = name
self.payload_class = payload_class
self.handler = handler
self.description = description or payload_class.__doc__ or "No description provided."
listens_to: ClassVar[List[str]] = [] # In v1.3, the root tag is derived from the payload class name
""" self.root_tag = payload_class.__name__
List of full XML tags this listener reacts to. self.listens_to = [self.root_tag]
Example: ["{https://example.org/chat}message", "{https://example.org/calc}request"]
"""
agent_name: ClassVar[str] = ""
"""
Globally unique name for this instance.
Enforced by MessageBus at registration.
Used in <from/>, routing, logging, and known_peers prompts.
"""
# ===================================================================
# Core handler — the only method that does work
# ===================================================================
async def handle( async def handle(
self, self,
envelope_tree: etree._Element, payload_dict: dict,
convo_id: str, thread_id: str,
sender_name: str, sender_name: str,
) -> Optional[str]: ) -> Optional[bytes]:
""" """
React to an incoming enveloped message. React to a pre-validated dictionary payload.
Returns raw response XML bytes.
Parameters:
envelope_tree: Full <env:message> root (parsed, post-repair/C14N)
convo_id: Current conversation UUID (injected or preserved by bus)
sender_name: The <from/> value (mandatory)
Returns:
Payload XML string (no envelope) if responding, else None.
The organism guarantees:
- envelope_tree is valid against envelope.xsd
- <from/> is present and matches sender_name
- convo_id is a valid UUID
To reply in the current thread: omit convo_id in response → bus preserves it
To start a new thread: include <env:convo_id>new-uuid</env:convo_id> in returned envelope
""" """
raise NotImplementedError( # 1. Execute the handler logic
f"{self.__class__.__name__} must implement handle()" # Note: In v1.3, the Bus/Lark handles the XML -> Dict conversion
) return await self.handler(payload_dict)
# =================================================================== def generate_xsd(self) -> str:
# Optional convenience helpers (can be overridden) """
# =================================================================== Autonomous XSD Synthesis.
Inspects the payload_class and generates an XSD string.
"""
# Logic to iterate over self.payload_class.model_fields
# and build the <xs:element> definitions.
pass
def make_response( def generate_prompt_fragment(self) -> str:
"""
Prompt Synthesis (The 'Mente').
Generates the tool usage instructions for other agents.
"""
fragment = [
f"Capability: {self.agent_name}",
f"Root Tag: <{self.root_tag}>",
f"Description: {self.description}",
"\nParameters:"
]
for name, field in self.payload_class.model_fields.items():
field_type = field.annotation.__name__
field_desc = field.description or "No description"
fragment.append(f" - {name} ({field_type}): {field_desc}")
return "\n".join(fragment)
def make_response_envelope(
self, self,
payload: str | etree._Element, payload_bytes: bytes,
*, thread_id: str,
to: Optional[str] = None, to: Optional[str] = None
convo_id: Optional[str] = None, ) -> bytes:
) -> str:
""" """
Helper for building correct response payloads. Wraps response bytes in a standard envelope.
Use this in subclasses to avoid envelope boilerplate.
- If convo_id is None → reply in current thread
- If convo_id provided → force/start new thread
- to overrides default reply-to-sender
""" """
# Implementation tomorrow — but declared here for contract clarity # Logic to build the <message> meta block and append the payload_bytes
raise NotImplementedError pass


@ -1,118 +1,110 @@
# Configuration — organism.yaml # Configuration — organism.yaml (v2.0)
The entire organism is declared in a single YAML file (default: `config/organism.yaml`). The entire organism is declared in a single YAML file (default: `config/organism.yaml`).
All listeners, agents, and federation gateways are instantiated from this file at startup. Loaded at bootstrap — single source of truth for initial composition.
Changes require a restart (hot-reload planned for future). Runtime changes (hot-reload) via local OOB privileged commands.
## Example Full Configuration ## Example Full Configuration
```yaml ```yaml
organism: organism:
name: "ResearchSwarm-01" name: "ResearchSwarm-01"
identity: "config/identity/private.ed25519" # Ed25519 private key for signing identity: "config/identity/private.ed25519" # Ed25519 private key
port: 8765 port: 8765 # Main message bus WSS
tls: tls:
cert: "certs/fullchain.pem" cert: "certs/fullchain.pem"
key: "certs/privkey.pem" key: "certs/privkey.pem"
oob: # Out-of-band privileged channel (GUI/hot-reload ready)
enabled: true
bind: "127.0.0.1" # Localhost-only default
port: 8766 # Separate WSS port
# unix_socket: "/tmp/organism.sock" # Alternative
thread_scheduling: "breadth-first" # or "depth-first" (default: breadth-first)
meta: meta:
enabled: true enabled: true
allow_list_capabilities: true # Public catalog of capability names allow_list_capabilities: true
allow_schema_requests: "admin" # "admin" | "authenticated" | "none" allow_schema_requests: "admin" # "admin" | "authenticated" | "none"
allow_example_requests: "admin" allow_example_requests: "admin"
allow_prompt_requests: "admin" allow_prompt_requests: "admin"
allow_remote: false # Federation peers can query meta allow_remote: false # Federation peers query meta
listeners: listeners:
- name: calculator.add - name: calculator.add
payload_class: examples.calculator.AddPayload payload_class: examples.calculator.AddPayload
handler: examples.calculator.add_handler handler: examples.calculator.add_handler
description: "Integer addition" description: "Adds two integers and returns their sum." # Mandatory for usable tool prompts
- name: calculator.subtract
payload_class: examples.calculator.SubtractPayload
handler: examples.calculator.subtract_handler
- name: summarizer - name: summarizer
payload_class: agents.summarizer.SummarizePayload payload_class: agents.summarizer.SummarizePayload
handler: agents.summarizer.summarize_handler handler: agents.summarizer.summarize_handler
description: "Text summarization via local LLM" description: "Summarizes text via local LLM."
agents: agents:
- name: researcher - name: researcher
system_prompt: "prompts/researcher_system.txt" system_prompt: "prompts/researcher_system.txt"
tools: tools:
- calculator.add - calculator.add
- calculator.subtract
- summarizer - summarizer
- name: web_search # Remote tool via gateway below - name: web_search
remote: true remote: true
gateways: gateways:
- name: web_search - name: web_search
remote_url: "wss://trusted-search-node.example.org" remote_url: "wss://trusted-search-node.example.org"
trusted_identity: "pubkeys/search_node.ed25519.pub" trusted_identity: "pubkeys/search_node.ed25519.pub"
description: "Federated web search capability" description: "Federated web search capability."
``` ```
## Sections Explained ## Sections Explained
### `organism` ### `organism`
Core server settings. Core settings.
- `name`: Logs/discovery.
- `identity`: Ed25519 private key path.
- `port` / `tls`: Main WSS bus.
- `name`: Human-readable identifier (used in logs, discovery). ### `oob`
- `identity`: Path to Ed25519 private key (for envelope signing, federation auth). Privileged local control channel.
- `port` / `tls`: Single-port WSS configuration. - `enabled: false` → pure static (restart for changes).
- Localhost default for GUI safety.
- Separate from main port — bus oblivious.
### `thread_scheduling`
Balanced subthread execution.
- `"breadth-first"`: Fair round-robin (default, prevents deep starvation).
- `"depth-first"`: Dive deep into branches.
### `meta` ### `meta`
Controls the privileged introspection facility (`https://xml-platform.org/meta/v1`). Introspection controls (`https://xml-pipeline.org/ns/meta/v1`).
- `allow_list_capabilities`: Publicly visible catalog (safe).
- `allow_*_requests`: Restrict schema/example/prompt emission to admin or authenticated sessions.
- `allow_remote`: Whether federation peers can query your meta namespace.
### `listeners` ### `listeners`
All bounded capabilities. Each entry triggers autonomous registration: Bounded capabilities.
- `name`: Discovery/logging (dots for hierarchy).
- `payload_class`: Full import to `@xmlify` dataclass.
- `handler`: Full import to function (dataclass → bytes).
- `description`: **Mandatory** human-readable blurb (lead-in for auto-prompt; fallback to generic if omitted).
- `name`: Logical capability name (used in discovery, YAML tools lists). Dots allowed for hierarchy. At startup/hot-reload: imports → Listener instantiation → bus.register() → XSD/example/prompt synthesis.
- `payload_class`: Full import path to the `@xmlify` dataclass (defines contract).
- `handler`: Full import path to the handler callable (`dict → bytes`).
- `description`: Optional human-readable text (included in `list-capabilities`).
At startup: Cached XSDs: `schemas/<name>/v1.xsd`.
1. Import payload_class and handler.
2. Instantiate `Listener(payload_class=..., handler=..., name=...)`.
3. `bus.register(listener)` → XSD synthesis, Lark grammar generation, prompt caching.
Filesystem artifacts:
- XSDs cached as `schemas/<name_with_underscores>/v1.xsd` (dots → underscores for Linux safety).
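The "full import path" entries for `payload_class` and `handler` imply a small resolution step at startup. One way that step could look, as a sketch only: the helper name `resolve_dotted_path` is hypothetical, and a standard-library path stands in for `examples.calculator.add_handler`, which is not available here.

```python
import importlib
from typing import Any


def resolve_dotted_path(path: str) -> Any:
    """Import 'package.module.attr' and return the named attribute."""
    module_name, _, attr_name = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.attr', got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Standard-library stand-in for a configured handler path.
loads = resolve_dotted_path("json.loads")
print(loads('{"a": 1}'))
```

Failing loudly on a bad path at bootstrap keeps a misconfigured listener from surfacing later as a routing error.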
### `agents` ### `agents`
LLM-based reasoning agents. LLM reasoners.
- `system_prompt`: Static file path.
- `name`: Agent identifier. - `tools`: Local names or remote references.
- `system_prompt`: Path to static prompt file. - Auto-injected live tool prompts at runtime.
- `tools`: List of local capability names or remote references.
- Local: direct name match (`calculator.add`).
- Remote: `name:` + `remote: true` → routed via matching gateway.
Live capability prompts are auto-injected into the agent's system prompt at runtime (no stale copies).
### `gateways` ### `gateways`
Federation peers. Federation peers.
- Trusted public key required.
- Bidirectional regular traffic only.
- `name`: Local alias for the remote organism. ## Notes
- `remote_url`: WSS endpoint. - Hot-reload: Future privileged OOB commands (apply new YAML fragments, add/remove listeners).
- `trusted_identity`: Path to remote's Ed25519 public key. - Namespaces: Capabilities under `https://xml-pipeline.org/ns/<category>/<name>/v1` (served live if configured).
- `description`: Optional. - Edit → reload/restart → new bounded minds, self-describing and attack-resistant.
Remote tools referenced in `agents.tools` are routed through the gateway with matching `name`. This YAML is the organism's DNA — precise, auditable, and evolvable locally.
## Future Extensions (planned)
- Hot-reload of configuration.
- Per-agent privilege scoping.
- Capability versioning in YAML (`version: v2`).
This YAML is the **single source of truth** for organism composition.
Edit → restart → new bounded minds appear, fully self-describing and attack-resistant.


@ -0,0 +1,76 @@
# AgentServer v2.0 — Core Architectural Principles
**January 03, 2026**
**Architecture: Autonomous Schema-Driven, Turing-Complete Multi-Agent Organism**
These principles are the single canonical source of truth for the project. All documentation, code, and future decisions must align with this file.
## Identity & Communication
- All traffic uses the universal `<message>` envelope defined in `envelope.xsd` (namespace `https://xml-pipeline.org/ns/envelope/v1`).
- Mandatory `<from/>` and `<thread/>` (convo_id string, supports hierarchical dot notation for subthreading, e.g., "root.1.research").
- Optional `<to/>` (rare direct routing; most flows use payload namespace/root).
- Exclusive C14N on ingress and egress.
- Malformed XML repaired on ingress; repairs logged in `<huh/>` metadata.
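To make the envelope rules concrete, here is a hedged sketch that builds a `<message>` with the mandatory `<from/>` and `<thread/>` and an optional `<to/>`. The element order follows the sequence in `envelope.xsd`; placing the payload directly after `<thread/>` is an assumption, the helper `build_envelope` is hypothetical, and no Exclusive C14N is applied.

```python
import xml.etree.ElementTree as ET
from typing import Optional

ENVELOPE_NS = "https://xml-pipeline.org/ns/envelope/v1"


def build_envelope(sender: str, thread: str, payload: ET.Element,
                   to: Optional[str] = None) -> bytes:
    """Return a serialized envelope; payload placement is an assumption."""
    def q(tag: str) -> str:
        return f"{{{ENVELOPE_NS}}}{tag}"

    message = ET.Element(q("message"))
    ET.SubElement(message, q("from")).text = sender
    if to is not None:  # <to/> is optional per the schema
        ET.SubElement(message, q("to")).text = to
    ET.SubElement(message, q("thread")).text = thread
    message.append(payload)
    return ET.tostring(message)


result = ET.Element("result")
result.text = "3"
print(build_envelope("calculator.add", "root.1.research", result).decode())
```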
## Configuration & Composition
- YAML file (`organism.yaml`) is the bootstrap source of truth, loaded at startup.
- Defines initial listeners, agents, gateways, meta privileges, and OOB channel configuration.
- Runtime structural changes (add/remove listeners, rewire agents, etc.) via local-only privileged commands on the dedicated OOB channel (hot-reload capability).
- No remote or unprivileged structural changes ever.
## Autonomous Schema Layer
- Listeners defined by `@xmlify`-decorated dataclass (payload contract) + pure handler function.
- Mandatory human-readable description string (short "what this does" blurb for tool prompt lead-in).
- Registration (at startup or via hot-reload) automatically generates:
- XSD cached on disk (`schemas/<name>/v1.xsd`)
- Example XML
- Tool description prompt fragment (includes description, params with field docs if present, example input)
- All capability namespaces under `https://xml-pipeline.org/ns/<category>/<name>/v1`.
- Root element derived from payload class name (lowercase) or explicit.
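The schema-generation step above might be approximated like this. It is a sketch under stated assumptions, not the generator used by the project: the type mapping `XSD_TYPES`, the function `sketch_xsd`, and the example namespace are all illustrative, and only flat dataclasses with scalar fields are handled.

```python
from dataclasses import dataclass, fields

# Assumed mapping from Python annotations to XSD built-in types.
XSD_TYPES = {int: "xs:integer", str: "xs:string",
             float: "xs:double", bool: "xs:boolean"}


@dataclass
class AddPayload:
    a: int
    b: int


def sketch_xsd(payload_class: type, root_tag: str, namespace: str) -> str:
    """Emit one required child element per dataclass field."""
    elements = "\n".join(
        f'        <xs:element name="{f.name}" type="{XSD_TYPES[f.type]}"/>'
        for f in fields(payload_class)
    )
    return (
        '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"\n'
        f'           targetNamespace="{namespace}"\n'
        '           elementFormDefault="qualified">\n'
        f'  <xs:element name="{root_tag}">\n'
        '    <xs:complexType>\n'
        '      <xs:sequence>\n'
        f'{elements}\n'
        '      </xs:sequence>\n'
        '    </xs:complexType>\n'
        '  </xs:element>\n'
        '</xs:schema>\n'
    )


# Illustrative namespace following the documented pattern.
print(sketch_xsd(AddPayload, "add",
                 "https://xml-pipeline.org/ns/calculator/add/v1"))
```

The root tag is passed explicitly here because the rule allows either a derived or an explicit name.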
## Message Pump
- Single linear pipeline on main port: ingress → repair → C14N → envelope validation → payload routing.
- Routing key = (payload namespace, root element); unique per listener.
- Meta requests (`https://xml-pipeline.org/ns/meta/v1`) handled by privileged core handler.
- User payloads:
- Validated directly against listener's cached XSD (lxml)
- On success → deserialized to typed dataclass instance (`xmlable.from_xml`)
- Handler called with instance → returns raw bytes (XML fragment, possibly dirty/multi-root)
- Bytes wrapped in `<dummy></dummy>` → repaired/parsed → all top-level payload elements extracted
- Each extracted payload wrapped in separate response envelope (inherits thread/from, optional new subthread if primitive used)
- Enveloped responses buffered and sent sequentially
- Supports single clean response, multi-payload emission (parallel tools/thoughts), and dirty LLM output tolerance.
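The routing rule (key = payload namespace plus root element, unique per listener) can be illustrated with a small registry. The `Router` class, the `add` handler, and the namespace constant are hypothetical, and the sketch skips XSD validation and dataclass deserialization to stay focused on the key lookup.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Tuple

RoutingKey = Tuple[str, str]  # (payload namespace, root element)
Handler = Callable[[ET.Element], bytes]


def routing_key(payload: ET.Element) -> RoutingKey:
    """Split ElementTree's '{namespace}local' tag into a routing key."""
    if payload.tag.startswith("{"):
        namespace, _, local = payload.tag[1:].partition("}")
        return namespace, local
    return "", payload.tag


class Router:
    def __init__(self) -> None:
        self._routes: Dict[RoutingKey, Handler] = {}

    def register(self, key: RoutingKey, handler: Handler) -> None:
        # Keys must be unique per listener.
        if key in self._routes:
            raise ValueError(f"routing key already registered: {key}")
        self._routes[key] = handler

    def dispatch(self, payload: ET.Element) -> bytes:
        key = routing_key(payload)
        if key not in self._routes:
            raise LookupError(f"no listener for {key}")
        return self._routes[key](payload)


# Illustrative namespace following the documented pattern.
CALC_NS = "https://xml-pipeline.org/ns/calculator/add/v1"


def add(payload: ET.Element) -> bytes:
    a = int(payload.findtext(f"{{{CALC_NS}}}a"))
    b = int(payload.findtext(f"{{{CALC_NS}}}b"))
    return f"<result>{a + b}</result>".encode("utf-8")


router = Router()
router.register((CALC_NS, "add"), add)
```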
## Reasoning & Iteration
- LLM agents iterate via open self-calls (same root tag, same thread ID).
- Conversation thread = complete memory and audit trail (all messages logged).
- Subthreading natively supported via hierarchical thread IDs and primitives (e.g., reserved payload to spawn "parent.sub1").
- Optional structured constructs like `<todo-until/>` for visible planning.
- No hidden loops or state machines; all reasoning steps are visible messages.
## Security & Sovereignty
- Privileged messages (per `privileged-msg.xsd`) handled exclusively on dedicated OOB channel.
- OOB channel bound to localhost by default (safe for local GUI); separate port/socket from main bus.
- Main MessageBus and pump oblivious to privileged operations — no routing or handling for privileged roots.
- Remote privileged attempts impossible (channel not exposed); any leak to main port logged as security event and dropped.
- Ed25519 identity key used for envelope signing, federation auth, and privileged command verification.
- No agent may modify organism structure, register listeners, or access host resources beyond declared scope.
- “No Paperclippers” manifesto injected as first system message for every LLM-based listener.
## Federation
- Gateways declared in YAML with trusted remote public key.
- Remote tools referenced by gateway name in agent tool lists.
- Regular messages flow bidirectionally; privileged messages never forwarded or accepted.
## Introspection (Meta)
- Controlled via YAML flags (`allow_list_capabilities`, `allow_schema_requests`, etc.).
- Supports `request-schema`, `request-example`, `request-prompt`, `list-capabilities`.
- Remote meta queries optionally allowed per YAML (federation peers).
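A possible reading of the `allow_*_requests` values (`"admin"`, `"authenticated"`, `"none"`) is sketched below. The session roles and the function `meta_request_allowed` are assumptions made for illustration; in particular, treating admin sessions as also satisfying `"authenticated"` is a guess, not something these principles state.

```python
def meta_request_allowed(setting: str, session_role: str) -> bool:
    """Decide whether a meta request passes the configured gate.

    setting: value from organism.yaml ("admin", "authenticated", "none").
    session_role: hypothetical role of the caller's session
                  ("admin", "authenticated", or "anonymous").
    """
    if setting == "none":
        return False
    if setting == "admin":
        return session_role == "admin"
    if setting == "authenticated":
        # Assumption: admin sessions count as authenticated.
        return session_role in ("admin", "authenticated")
    raise ValueError(f"unknown setting: {setting!r}")


print(meta_request_allowed("admin", "authenticated"))
```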
## Technical Constraints
- Mandatory WSS (TLS) + TOTP on main port.
- OOB channel WSS or Unix socket, localhost-default.
- Internal: lxml trees → XSD validation → xmlable deserialization → dataclass → handler → bytes → dummy extraction.
- Single process, async non-blocking.
- XML is the sovereign wire format; everything else is implementation detail.
These principles are now locked. All existing docs will be updated to match this file exactly. Future changes require explicit discussion and amendment here first.


@ -5,8 +5,8 @@ The AgentServer message pump is a single, linear, attack-resistant pipeline. Eve
```mermaid ```mermaid
flowchart TD flowchart TD
A[WebSocket Ingress<br>] --> B[TOTP + Auth Check] A[WebSocket Ingress<br>] --> B[TOTP + Auth Check]
B --> C[Repair + Exclusive C14N] B --> C[lxml Repair + Exclusive C14N]
C --> D["Lark Envelope Grammar<br>(noise-tolerant, NOISE*)"] C --> D["Envelope Grammar<br>"]
D --> E[Extract Payload XML fragment] D --> E[Extract Payload XML fragment]
E --> F{Payload namespace?} E --> F{Payload namespace?}
@ -24,22 +24,15 @@ flowchart TD
## Detailed Stages ## Detailed Stages
1. **Ingress (raw bytes over WSS)** 1. **Ingress:** Raw bytes over WSS.
Single port, TLS-terminated.
2. **Authentication** 2. **The Immune System:** Every inbound packet is converted to a Tree.
TOTP-based session scoping. Determines privilege level (admin vs regular agent).
3. **Repair + Exclusive Canonicalization** 3. **Internal Routing:** Trees flow between organs via the `dispatch` method.
Normalizes XML (entity resolution disabled, huge_tree=False, no_network=True). Tamper-evident baseline.
4. **Envelope Validation** 4. **The Thought Stream (Egress):** Listeners return raw bytes. These are wrapped in a `<dummy/>` tag and run through a recovery parser.
Fixed, shared Lark grammar for the envelope (with `NOISE*` token).
Seeks first valid `<envelope>...</envelope>` in noisy LLM output.
Consumes exactly one envelope per pass (handles conjoined messages cleanly).
5. **Payload Extraction** 5. **Multi-Message Extraction:** Every `<message/>` found in the dummy tag is extracted as a Tree and re-injected into the Bus.
Clean payload XML fragment (bytes) + declared namespace/root.
6. **Routing Decision** 6. **Routing Decision**
- `https://xml-platform.org/meta/v1` → **Core Meta Handler** (privileged, internal).
@ -66,7 +59,7 @@ flowchart TD
## Safety Properties ## Safety Properties
- **No entity expansion** anywhere (Lark ignores entities, lxml parsers hardened). - **No entity expansion** anywhere (lxml parsers hardened).
- **Bounded depth/recursion** by schema design + size limits. - **Bounded depth/recursion** by schema design + size limits.
- **No XML trees escape the pump** — only clean dicts reach handlers. - **No XML trees escape the pump** — only clean dicts reach handlers.
- **Topology privacy** — normal flows reveal no upstream schemas unless meta privilege granted. - **Topology privacy** — normal flows reveal no upstream schemas unless meta privilege granted.

docs/thread-management.md (new file, 65 lines)

@ -0,0 +1,65 @@
# Thread Management in AgentServer v2.0
**January 03, 2026**
This document clarifies the thread ID system, subthreading mechanics, and internals. It supplements the Core Architectural Principles — hierarchical dot notation examples there reflect the wire format.
## Wire Format: Hierarchical String IDs
- Mandatory `<thread/>` contains a **server-assigned** hierarchical string (dot notation, e.g., "root", "root.research", "root.research.images").
- Root IDs: Short, opaque, server-generated (e.g., "sess-abcd1234").
- Sub-IDs: Relative extensions for readability.
- Benefits: LLM/human-friendly copying, natural tree structure for logs/GUI.
## Server Assignment Only
The organism assigns all final IDs — agents never invent them.
- **Root initiation**: Client suggests or server auto-generates on first message; uniqueness enforced.
- **Subthread spawning**: Explicit reserved payload for intent clarity:
```xml
<spawn-thread suggested_sub_id="research"> <!-- optional relative label -->
<initial-payload> <!-- optional bootstrap fragment -->
<!-- any valid payload -->
</initial-payload>
</spawn-thread>
```
Core handler:
- Appends label (or auto-short if omitted).
- Resolves uniqueness conflicts (append "-1" etc.).
- Creates queue + seeds bootstrap.
- **Always responds** in current thread:
```xml
<thread-spawned
assigned_id="root.research"
parent_id="root"
message="Thread spawned successfully."/>
```
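The spawn steps above (append the label, resolve conflicts with a numeric suffix, then confirm) could be sketched as follows. `ThreadRegistry`, `thread_spawned`, and the default label `"sub"` used when no suggestion is given are hypothetical; queue creation and bootstrap seeding are left out.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Set


class ThreadRegistry:
    """Tracks known thread IDs; only the server assigns them."""

    def __init__(self) -> None:
        self._known: Set[str] = set()

    def create_root(self, root_id: str) -> str:
        if root_id in self._known:
            raise ValueError(f"root already exists: {root_id}")
        self._known.add(root_id)
        return root_id

    def spawn(self, parent_id: str,
              suggested_sub_id: Optional[str] = None) -> str:
        if parent_id not in self._known:
            raise KeyError(parent_id)
        label = suggested_sub_id or "sub"  # assumed auto-short label
        candidate = f"{parent_id}.{label}"
        suffix = 0
        while candidate in self._known:  # resolve conflicts: -1, -2, ...
            suffix += 1
            candidate = f"{parent_id}.{label}-{suffix}"
        self._known.add(candidate)
        return candidate


def thread_spawned(assigned_id: str, parent_id: str) -> bytes:
    """Build the confirmation sent back in the current thread."""
    element = ET.Element("thread-spawned", {
        "assigned_id": assigned_id,
        "parent_id": parent_id,
        "message": "Thread spawned successfully.",
    })
    return ET.tostring(element)


registry = ThreadRegistry()
registry.create_root("root")
print(thread_spawned(registry.spawn("root", "research"), "root").decode())
```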
## Error Handling (No Silent Failure)
- Unknown `<thread/>` ID → no implicit creation.
- **Always inject** system error into parent thread (or root):
```xml
<system-thread-error
unknown_id="root.badname"
code="unknown_thread"
message="Unknown thread; emit &lt;spawn-thread/&gt; to create or correct ID."/>
```
- LLM sees error immediately, retries without hanging.
- Logs warning for monitoring.
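The error path can be sketched in two small functions: one picks the thread that receives the error, the other builds the error element. Interpreting "parent thread (or root)" as "nearest known ancestor, else the first ID segment" is an assumption, and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Set


def error_target(unknown_id: str, known: Set[str]) -> str:
    """Return the nearest known ancestor, else the root segment."""
    current = unknown_id
    while "." in current:
        current = current.rpartition(".")[0]
        if current in known:
            return current
    return unknown_id.split(".")[0]


def unknown_thread_error(unknown_id: str) -> ET.Element:
    """Build the system error; serialization escapes '<' in the message."""
    return ET.Element("system-thread-error", {
        "unknown_id": unknown_id,
        "code": "unknown_thread",
        "message": "Unknown thread; emit <spawn-thread/> "
                   "to create or correct ID.",
    })


known_threads = {"root", "root.research"}
print(error_target("root.badname", known_threads))
print(ET.tostring(unknown_thread_error("root.badname")).decode())
```

Building the element through an XML library rather than string formatting means the angle brackets inside the `message` attribute are escaped automatically.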
## Internals
- Per-thread queues: dict[str, Queue].
- Scheduling via `organism.yaml`:
```yaml
thread_scheduling: "breadth-first" # or "depth-first" (default: breadth-first)
```
- Depth from dot count.
- Optional hidden UUID mapping for extra safety (implementation detail).
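As a rough illustration of the internals listed above, the sketch below keeps one queue per thread and picks the next message according to the configured policy. The `ThreadScheduler` class is hypothetical, and the policies are interpreted here as round-robin across pending threads (breadth-first) versus always serving the deepest pending thread (depth-first); memory bounds are not modeled.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple


def depth(thread_id: str) -> int:
    """Depth is derived from the number of dots in the ID."""
    return thread_id.count(".")


class ThreadScheduler:
    def __init__(self, policy: str = "breadth-first") -> None:
        if policy not in ("breadth-first", "depth-first"):
            raise ValueError(f"unknown policy: {policy!r}")
        self.policy = policy
        self.queues: Dict[str, Deque[bytes]] = {}
        self._rotation: Deque[str] = deque()  # service order of threads

    def enqueue(self, thread_id: str, message: bytes) -> None:
        if thread_id not in self.queues:
            self.queues[thread_id] = deque()
            self._rotation.append(thread_id)
        self.queues[thread_id].append(message)

    def next_message(self) -> Optional[Tuple[str, bytes]]:
        pending = [t for t in self._rotation if self.queues[t]]
        if not pending:
            return None
        if self.policy == "depth-first":
            chosen = max(pending, key=depth)  # dive into deepest branch
        else:
            chosen = pending[0]
            # Move the served thread to the back so others get a turn.
            self._rotation.remove(chosen)
            self._rotation.append(chosen)
        return chosen, self.queues[chosen].popleft()


scheduler = ThreadScheduler("breadth-first")
scheduler.enqueue("root", b"<a/>")
scheduler.enqueue("root", b"<b/>")
scheduler.enqueue("root.research", b"<c/>")
print(scheduler.next_message())
```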
## Design Rationale
- Explicit spawn = clear intent + bootstrap hook.
- Mandatory feedback = no LLM limbo.
- Readable IDs = easy copying without UUID mangling.
- Server control = sovereignty + no collisions.
Future: Alias registry, thread metadata primitives.
The organism branches reliably, visibly, and recoverably.


@ -1,2 +0,0 @@
# One-time tool to generate the permanent Ed25519 organism identity
# Run once, store private key offline/safely