Add wiki documentation for xml-pipeline.org
Comprehensive documentation set for XWiki:
- Home, Installation, Quick Start guides
- Writing Handlers and LLM Router guides
- Architecture docs (Overview, Message Pump, Thread Registry, Shared Backend)
- Reference docs (Configuration, Handler Contract, CLI)
- Hello World tutorial
- Why XML rationale
- Pandoc conversion scripts (bash + PowerShell)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
parent
c01428260c
commit
515c738abb
17 changed files with 3632 additions and 0 deletions
67
docs/wiki/Home.md
Normal file
|
|
@ -0,0 +1,67 @@
|
|||
# xml-pipeline
|
||||
|
||||
**A tamper-proof nervous system for multi-agent AI systems.**
|
||||
|
||||
xml-pipeline (also called AgentServer) provides a schema-driven, Turing-complete message bus where AI agents communicate through validated XML payloads. It features automatic XSD generation, handler isolation, and built-in security guarantees against agent misbehavior.
|
||||
|
||||
## Why XML?
|
||||
|
||||
While JSON dominates web APIs, XML provides critical features for secure multi-agent systems:
|
||||
|
||||
- **Schema validation** — XSD enforces exact contracts on the wire
|
||||
- **Namespaces** — Safely mix vocabularies without collision
|
||||
- **Canonicalization** — C14N enables deterministic signing
|
||||
- **Repair tolerance** — Malformed XML can be recovered; malformed JSON cannot
|
||||
|
||||
See [[Why XML]] for the full rationale.
|
||||
|
||||
## Key Features
|
||||
|
||||
| Feature | Description |
|---------|-------------|
| **Schema-Driven** | Define payloads as Python dataclasses; XSD generated automatically |
| **Handler Isolation** | Handlers are sandboxed; they cannot forge identity or escape threads |
| **Thread Tracking** | Opaque UUIDs hide topology; call chains tracked privately |
| **LLM Router** | Multi-backend routing with failover, rate limiting, retries |
| **Multiprocess Ready** | CPU-bound handlers run in ProcessPoolExecutor |
| **Shared State** | Redis/Manager backends for distributed deployments |
|
||||
|
||||
## Quick Links
|
||||
|
||||
### Getting Started
|
||||
- [[Installation]] — Install the package
|
||||
- [[Quick Start]] — Run your first organism in 5 minutes
|
||||
- [[Configuration]] — Configure organisms via YAML
|
||||
|
||||
### Guides
|
||||
- [[Writing Handlers]] — Create message handlers
|
||||
- [[Using the LLM Router]] — Call language models from handlers
|
||||
- [[Multiprocess Handlers]] — Run CPU-bound work in separate processes
|
||||
|
||||
### Architecture
|
||||
- [[Architecture Overview]] — How xml-pipeline works
|
||||
- [[Message Pump]] — The streaming message processor
|
||||
- [[Thread Registry]] — Call chain tracking with opaque UUIDs
|
||||
- [[Shared Backend]] — Cross-process state with Redis
|
||||
|
||||
### Tutorials
|
||||
- [[Hello World Tutorial]] — Build a greeting agent step by step
|
||||
- [[Calculator Tool Tutorial]] — Create a tool that agents can call
|
||||
|
||||
### Reference
|
||||
- [[Handler Contract]] — Handler function signature and return types
|
||||
- [[Configuration Reference]] — Complete organism.yaml specification
|
||||
- [[CLI Reference]] — Command-line interface
|
||||
|
||||
## Version
|
||||
|
||||
Current version: **0.4.0**
|
||||
|
||||
## License
|
||||
|
||||
MIT License
|
||||
|
||||
## Links
|
||||
|
||||
- [GitHub Repository](https://github.com/xml-pipeline/xml-pipeline)
|
||||
- [PyPI Package](https://pypi.org/project/xml-pipeline/)
|
||||
97
docs/wiki/Installation.md
Normal file
|
|
@ -0,0 +1,97 @@
|
|||
# Installation
|
||||
|
||||
## Requirements
|
||||
|
||||
- Python 3.11 or higher
|
||||
- pip, uv, or pipx
|
||||
|
||||
## Install from PyPI
|
||||
|
||||
```bash
|
||||
pip install xml-pipeline
|
||||
```
|
||||
|
||||
## Install with Optional Features
|
||||
|
||||
xml-pipeline has optional dependencies for different use cases:
|
||||
|
||||
```bash
|
||||
# Core only (minimal)
|
||||
pip install xml-pipeline
|
||||
|
||||
# With Anthropic Claude support
|
||||
pip install xml-pipeline[anthropic]
|
||||
|
||||
# With OpenAI support
|
||||
pip install xml-pipeline[openai]
|
||||
|
||||
# With Redis for distributed deployments
|
||||
pip install xml-pipeline[redis]
|
||||
|
||||
# With web search capability
|
||||
pip install xml-pipeline[search]
|
||||
|
||||
# With interactive console (for examples)
|
||||
pip install xml-pipeline[console]
|
||||
|
||||
# Everything
|
||||
pip install xml-pipeline[all]
|
||||
|
||||
# Development (includes testing tools)
|
||||
pip install xml-pipeline[dev]
|
||||
```
|
||||
|
||||
## Install from Source
|
||||
|
||||
```bash
|
||||
# Clone the repository
|
||||
git clone https://github.com/xml-pipeline/xml-pipeline.git
|
||||
cd xml-pipeline
|
||||
|
||||
# Create virtual environment
|
||||
python -m venv .venv
|
||||
|
||||
# Activate (Windows)
|
||||
.venv\Scripts\activate
|
||||
|
||||
# Activate (Linux/macOS)
|
||||
source .venv/bin/activate
|
||||
|
||||
# Install in development mode
|
||||
pip install -e ".[all]"
|
||||
```
|
||||
|
||||
## Verify Installation
|
||||
|
||||
```bash
|
||||
# Check version
|
||||
xml-pipeline version
|
||||
|
||||
# Or use the short alias
|
||||
xp version
|
||||
```
|
||||
|
||||
Expected output:
|
||||
```
|
||||
xml-pipeline 0.4.0
|
||||
Python 3.11.x
|
||||
Features: anthropic, console, redis, search
|
||||
```
|
||||
|
||||
## Environment Variables
|
||||
|
||||
Create a `.env` file in your project root for API keys:
|
||||
|
||||
```env
|
||||
# LLM Provider Keys (add the ones you need)
|
||||
XAI_API_KEY=xai-...
|
||||
ANTHROPIC_API_KEY=sk-ant-...
|
||||
OPENAI_API_KEY=sk-...
|
||||
```
|
||||
|
||||
The library automatically loads `.env` files via python-dotenv.
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [[Quick Start]] — Run your first organism
|
||||
- [[Configuration]] — Learn about organism.yaml
|
||||
303
docs/wiki/LLM-Router.md
Normal file
|
|
@ -0,0 +1,303 @@
|
|||
# LLM Router
|
||||
|
||||
The LLM Router provides a unified interface for language model calls. Agents request a model by name; the router handles backend selection, failover, rate limiting, and retries.
|
||||
|
||||
## Overview
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
│                          Agent Handler                          │
│         response = await complete("grok-4.1", messages)         │
└─────────────────────────────────┬───────────────────────────────┘
                                  │
                                  ▼
┌─────────────────────────────────────────────────────────────────┐
│                           LLM Router                            │
│  • Find backends serving model                                  │
│  • Select backend (strategy)                                    │
│  • Retry on failure                                             │
│  • Track usage per agent                                        │
└────────────┬────────────────┬────────────────┬──────────────────┘
             │                │                │
             ▼                ▼                ▼
        ┌──────────┐     ┌──────────┐     ┌──────────┐
        │   XAI    │     │Anthropic │     │  Ollama  │
        │ Backend  │     │ Backend  │     │ Backend  │
        └──────────┘     └──────────┘     └──────────┘
|
||||
```
|
||||
|
||||
## Quick Start
|
||||
|
||||
### Basic Usage
|
||||
|
||||
```python
|
||||
from xml_pipeline.platform.llm_api import complete

response = await complete(
    model="grok-4.1",
    messages=[
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hello!"},
    ],
)

print(response.content)
|
||||
```
|
||||
|
||||
### In a Handler
|
||||
|
||||
```python
|
||||
async def my_agent(payload: Query, metadata: HandlerMetadata) -> HandlerResponse:
    from xml_pipeline.platform.llm_api import complete

    response = await complete(
        model="grok-4.1",
        messages=[
            {"role": "system", "content": metadata.usage_instructions},
            {"role": "user", "content": payload.question},
        ],
        temperature=0.7,
        max_tokens=2048,
    )

    return HandlerResponse(
        payload=Answer(text=response.content),
        to="output",
    )
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
### organism.yaml
|
||||
|
||||
```yaml
|
||||
llm:
  strategy: failover          # Backend selection strategy
  retries: 3                  # Max retry attempts
  retry_base_delay: 1.0       # Base delay for backoff
  retry_max_delay: 60.0       # Max delay between retries

  backends:
    - provider: xai
      api_key_env: XAI_API_KEY
      priority: 1             # Lower = preferred
      rate_limit_tpm: 100000  # Tokens per minute
      max_concurrent: 20      # Concurrent request limit

    - provider: anthropic
      api_key_env: ANTHROPIC_API_KEY
      priority: 2

    - provider: openai
      api_key_env: OPENAI_API_KEY
      priority: 3

    - provider: ollama
      base_url: http://localhost:11434
      supported_models: [llama3, mistral]
|
||||
```
|
||||
|
||||
### Environment Variables
|
||||
|
||||
```env
|
||||
# .env file
|
||||
XAI_API_KEY=xai-abc123...
|
||||
ANTHROPIC_API_KEY=sk-ant-...
|
||||
OPENAI_API_KEY=sk-...
|
||||
```
|
||||
|
||||
## Supported Providers
|
||||
|
||||
| Provider | Models | Auth |
|----------|--------|------|
| `xai` | grok-* | Bearer token |
| `anthropic` | claude-* | x-api-key header |
| `openai` | gpt-*, o1-*, o3-* | Bearer token |
| `ollama` | Any local model | None (local) |
|
||||
|
||||
### Model Routing
|
||||
|
||||
The router automatically selects backends based on model name:
|
||||
|
||||
- `grok-4.1` → XAI backend
|
||||
- `claude-sonnet-4` → Anthropic backend
|
||||
- `gpt-4o` → OpenAI backend
|
||||
- `llama3` → Ollama (if in `supported_models`)
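The mapping above can be pictured as a glob match from model name to provider. The following is only an illustrative sketch, not the router's implementation: `MODEL_PATTERNS` and `provider_for` are hypothetical names, and the patterns are copied from the Supported Providers table.

```python
from __future__ import annotations

from fnmatch import fnmatchcase

# Patterns copied from the Supported Providers table. The mapping itself is
# an assumption for illustration, not the library's real routing table.
MODEL_PATTERNS = {
    "xai": ["grok-*"],
    "anthropic": ["claude-*"],
    "openai": ["gpt-*", "o1-*", "o3-*"],
}


def provider_for(model: str, ollama_models: tuple[str, ...] = ()) -> str | None:
    """Return the provider whose pattern matches `model`, or None."""
    for provider, patterns in MODEL_PATTERNS.items():
        if any(fnmatchcase(model, pattern) for pattern in patterns):
            return provider
    # Ollama only serves models explicitly listed in `supported_models`.
    if model in ollama_models:
        return "ollama"
    return None


print(provider_for("claude-sonnet-4"))
print(provider_for("llama3", ollama_models=("llama3", "mistral")))
```

A model that matches no pattern and is not listed for Ollama yields `None` in this sketch; how the real router reports an unknown model is not specified here.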
|
||||
|
||||
## Strategies
|
||||
|
||||
### Failover (Default)
|
||||
|
||||
Tries backends in priority order. Falls back on error.
|
||||
|
||||
```yaml
|
||||
llm:
  strategy: failover
  backends:
    - provider: xai
      priority: 1    # Try first
    - provider: anthropic
      priority: 2    # Fallback
|
||||
```
|
||||
|
||||
### Round-Robin
|
||||
|
||||
Distributes requests evenly across backends.
|
||||
|
||||
```yaml
|
||||
llm:
  strategy: round-robin
|
||||
```
|
||||
|
||||
### Least-Loaded
|
||||
|
||||
Routes to the backend with lowest current load.
|
||||
|
||||
```yaml
|
||||
llm:
  strategy: least-loaded
|
||||
```
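To make the difference between the three strategies concrete, here is a minimal sketch of each selection rule. It assumes a hypothetical `Backend` record with a `priority` and an `in_flight` counter as the load metric; the router's real data structures and load measure may differ.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import count


@dataclass
class Backend:
    """Hypothetical stand-in for a configured backend."""
    name: str
    priority: int = 0   # lower = preferred, as in organism.yaml
    in_flight: int = 0  # requests currently running (assumed load metric)


def failover_order(backends: list[Backend]) -> list[Backend]:
    """Failover: try backends in ascending priority order."""
    return sorted(backends, key=lambda b: b.priority)


class RoundRobin:
    """Round-robin: hand out backends in rotation."""

    def __init__(self, backends: list[Backend]) -> None:
        self._backends = backends
        self._counter = count()

    def pick(self) -> Backend:
        return self._backends[next(self._counter) % len(self._backends)]


def least_loaded(backends: list[Backend]) -> Backend:
    """Least-loaded: pick the backend with the fewest in-flight requests."""
    return min(backends, key=lambda b: b.in_flight)


pool = [Backend("anthropic", priority=2, in_flight=1), Backend("xai", priority=1, in_flight=5)]
print([b.name for b in failover_order(pool)])
print(least_loaded(pool).name)
```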
|
||||
|
||||
## Response Format
|
||||
|
||||
```python
|
||||
from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class LLMResponse:
    content: str           # Generated text
    model: str             # Model used
    usage: Dict[str, int]  # Token counts
    finish_reason: str     # stop, length, tool_calls
    raw: Any               # Provider-specific response
|
||||
```
|
||||
|
||||
### Usage Dict
|
||||
|
||||
```python
|
||||
response.usage = {
    "prompt_tokens": 150,
    "completion_tokens": 50,
    "total_tokens": 200,
}
|
||||
```
|
||||
|
||||
## Parameters
|
||||
|
||||
```python
|
||||
response = await complete(
    model="grok-4.1",   # Required: model name
    messages=[...],     # Required: conversation
    temperature=0.7,    # Optional: randomness (0-2)
    max_tokens=2048,    # Optional: response limit
    top_p=0.9,          # Optional: nucleus sampling
    stop=["END"],       # Optional: stop sequences
)
|
||||
```
|
||||
|
||||
## Error Handling
|
||||
|
||||
### Rate Limits
|
||||
|
||||
On 429 responses:
|
||||
1. Reads `Retry-After` header
|
||||
2. Falls back to exponential backoff with jitter
|
||||
3. Tries next backend (if failover)
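The delay calculation behind steps 1 and 2 can be sketched as follows. This is an assumption-laden illustration rather than the router's code: `retry_delay` is a hypothetical helper, the jitter variant shown is "full jitter" (a uniform draw between zero and the capped exponential value), and the defaults simply mirror `retry_base_delay` and `retry_max_delay` from the configuration, taken here to be seconds.

```python
from __future__ import annotations

import random


def retry_delay(
    attempt: int,
    base: float = 1.0,   # mirrors retry_base_delay (assumed seconds)
    cap: float = 60.0,   # mirrors retry_max_delay (assumed seconds)
    retry_after: float | None = None,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # The server supplied a Retry-After value; use it as given.
        return retry_after
    # Exponential growth, capped, then full jitter: uniform in [0, ceiling].
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0.0, ceiling)


for attempt in range(4):
    print(attempt, round(retry_delay(attempt), 2))
```

Jitter spreads retries from many concurrent callers over time so they do not all hit the provider again at the same instant.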
|
||||
|
||||
### Provider Errors
|
||||
|
||||
On 5xx responses:
|
||||
1. Logs error
|
||||
2. Retries with backoff
|
||||
3. Tries next backend (if failover)
|
||||
|
||||
### All Backends Failed
|
||||
|
||||
```python
|
||||
import logging

from xml_pipeline.llm.router import BackendError
from xml_pipeline.platform.llm_api import complete

logger = logging.getLogger(__name__)

try:
    response = await complete(model, messages)
except BackendError as e:
    # All backends failed
    logger.error(f"LLM call failed: {e}")
|
||||
```
|
||||
|
||||
## Rate Limiting
|
||||
|
||||
Each backend has independent limits:
|
||||
|
||||
- **Token bucket**: Limits tokens per minute (`rate_limit_tpm`)
|
||||
- **Semaphore**: Limits concurrent requests (`max_concurrent`)
|
||||
|
||||
Requests wait if limits are reached.
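The combination of a token bucket and a semaphore might look roughly like the sketch below. The class names `TokenBucket` and `LimitedBackend` are hypothetical, and the per-request token count is assumed to be an estimate supplied by the caller; the router's actual accounting is not shown in this document.

```python
import asyncio
import time


class TokenBucket:
    """Token bucket refilled continuously at `tokens_per_minute`."""

    def __init__(self, tokens_per_minute: int) -> None:
        self.capacity = float(tokens_per_minute)
        self.rate = tokens_per_minute / 60.0  # tokens added per second
        self.tokens = self.capacity
        self.updated = time.monotonic()

    def _refill(self) -> None:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    async def acquire(self, amount: int) -> None:
        """Wait until `amount` tokens are available, then consume them."""
        if amount > self.capacity:
            raise ValueError("request exceeds the bucket capacity")
        while True:
            self._refill()
            if self.tokens >= amount:
                self.tokens -= amount
                return
            # Sleep just long enough for the missing tokens to accumulate.
            await asyncio.sleep((amount - self.tokens) / self.rate)


class LimitedBackend:
    """Applies both limits before running a request coroutine."""

    def __init__(self, rate_limit_tpm: int, max_concurrent: int) -> None:
        self.bucket = TokenBucket(rate_limit_tpm)
        self.slots = asyncio.Semaphore(max_concurrent)

    async def call(self, estimated_tokens: int, request):
        await self.bucket.acquire(estimated_tokens)  # tokens-per-minute limit
        async with self.slots:                       # concurrency limit
            return await request()
```

Here `request` is any zero-argument callable returning an awaitable. Callers are delayed rather than rejected, which matches the "requests wait" behavior described above.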
|
||||
|
||||
## Token Tracking
|
||||
|
||||
Track usage per agent:
|
||||
|
||||
```python
|
||||
from xml_pipeline.llm.router import get_router
|
||||
|
||||
router = get_router()
|
||||
|
||||
# Get usage for agent
|
||||
usage = router.get_agent_usage("greeter")
|
||||
print(f"Total tokens: {usage.total_tokens}")
|
||||
print(f"Requests: {usage.request_count}")
|
||||
|
||||
# Reset tracking
|
||||
router.reset_agent_usage("greeter")
|
||||
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
### 1. Use System Prompts
|
||||
|
||||
```python
|
||||
response = await complete(
    model="grok-4.1",
    messages=[
        {"role": "system", "content": metadata.usage_instructions},
        {"role": "user", "content": payload.query},
    ],
)
|
||||
```
|
||||
|
||||
### 2. Handle Errors Gracefully
|
||||
|
||||
```python
|
||||
try:
    response = await complete(model, messages)
except BackendError:
    return HandlerResponse(
        payload=ErrorResponse(message="LLM unavailable"),
        to=metadata.from_id,
    )
|
||||
```
|
||||
|
||||
### 3. Set Appropriate Limits
|
||||
|
||||
```yaml
|
||||
llm:
  backends:
    - provider: xai
      rate_limit_tpm: 50000  # Conservative limit
      max_concurrent: 10     # Prevent overload
|
||||
```
|
||||
|
||||
### 4. Use Failover for Reliability
|
||||
|
||||
```yaml
|
||||
llm:
  strategy: failover
  backends:
    - provider: xai
      priority: 1
    - provider: anthropic
      priority: 2  # Backup
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
- [[Writing Handlers]] — Using LLM in handlers
|
||||
- [[Configuration]] — Full LLM configuration
|
||||
- [[Architecture Overview]] — System architecture
|
||||
159
docs/wiki/Quick-Start.md
Normal file
|
|
@ -0,0 +1,159 @@
|
|||
# Quick Start
|
||||
|
||||
Get an organism running in 5 minutes.
|
||||
|
||||
## 1. Install the Package
|
||||
|
||||
```bash
|
||||
pip install xml-pipeline[console]
|
||||
```
|
||||
|
||||
## 2. Create a Project Directory
|
||||
|
||||
```bash
|
||||
mkdir my-organism
|
||||
cd my-organism
|
||||
```
|
||||
|
||||
## 3. Initialize Configuration
|
||||
|
||||
```bash
|
||||
xml-pipeline init my-organism
|
||||
```
|
||||
|
||||
This creates:
|
||||
```
|
||||
my-organism/
├── config/
│   └── organism.yaml
├── handlers/
│   └── hello.py
└── .env.example
|
||||
```
|
||||
|
||||
## 4. Examine the Generated Files
|
||||
|
||||
### config/organism.yaml
|
||||
|
||||
```yaml
|
||||
organism:
  name: my-organism
  port: 8765

listeners:
  - name: greeter
    payload_class: handlers.hello.Greeting
    handler: handlers.hello.handle_greeting
    description: A friendly greeting handler
    peers: []
|
||||
```
|
||||
|
||||
### handlers/hello.py
|
||||
|
||||
```python
|
||||
from dataclasses import dataclass
from third_party.xmlable import xmlify
from xml_pipeline.message_bus.message_state import HandlerMetadata, HandlerResponse

@xmlify
@dataclass
class Greeting:
    """A greeting request."""
    name: str

@xmlify
@dataclass
class GreetingResponse:
    """A greeting response."""
    message: str

async def handle_greeting(payload: Greeting, metadata: HandlerMetadata) -> HandlerResponse:
    """Handle a greeting and respond."""
    return HandlerResponse(
        payload=GreetingResponse(message=f"Hello, {payload.name}!"),
        to=metadata.from_id,  # Reply to sender
    )
|
||||
```
|
||||
|
||||
## 5. Run the Organism
|
||||
|
||||
```bash
|
||||
xml-pipeline run config/organism.yaml
|
||||
```
|
||||
|
||||
You should see:
|
||||
```
|
||||
Organism: my-organism
|
||||
Listeners: 1
|
||||
Root thread: abc123-...
|
||||
Routing: ['greeter.greeting']
|
||||
```
|
||||
|
||||
## 6. Try the Interactive Console
|
||||
|
||||
If you installed with `[console]`:
|
||||
|
||||
```bash
|
||||
python -m examples.console
|
||||
```
|
||||
|
||||
Type `@greeter Alice` to send a greeting message.
|
||||
|
||||
## What Just Happened?
|
||||
|
||||
1. **Payload defined** — `Greeting` dataclass with `@xmlify` decorator
|
||||
2. **XSD generated** — Schema auto-created at `schemas/greeter/v1.xsd`
|
||||
3. **Handler registered** — `handle_greeting` mapped to `greeter.greeting` root tag
|
||||
4. **Message pump started** — Waiting for messages
|
||||
|
||||
## Understanding the Message Flow
|
||||
|
||||
```
|
||||
Input: @greeter Alice
                   │
                   ▼
┌─────────────────────────────────────┐
│  XML Envelope Created               │
│  <message>                          │
│    <meta>                           │
│      <from>console</from>           │
│      <to>greeter</to>               │
│      <thread>uuid-123</thread>      │
│    </meta>                          │
│    <greeting>                       │
│      <name>Alice</name>             │
│    </greeting>                      │
│  </message>                         │
└─────────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────┐
│  Pipeline Processing                │
│  1. Repair (fix malformed XML)      │
│  2. C14N (canonicalize)             │
│  3. Envelope validation             │
│  4. Payload extraction              │
│  5. XSD validation                  │
│  6. Deserialization → Greeting      │
│  7. Route to greeter handler        │
└─────────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────┐
│  Handler Execution                  │
│  handle_greeting(                   │
│    payload=Greeting(name="Alice"),  │
│    metadata=HandlerMetadata(...)    │
│  )                                  │
│  → HandlerResponse(...)             │
└─────────────────────────────────────┘
                   │
                   ▼
Output: Hello, Alice!
|
||||
```
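For readers who want to see the envelope from the diagram as data, the sketch below builds the same element tree with the standard library. `build_envelope` is a hypothetical helper written for this page; real envelopes are created by the pipeline and may carry a namespace and extra metadata that are omitted here.

```python
import uuid
import xml.etree.ElementTree as ET


def build_envelope(sender: str, target: str, name: str) -> ET.Element:
    """Build a <message> envelope shaped like the one in the diagram."""
    message = ET.Element("message")

    # Routing metadata: who sent it, who receives it, which thread it is on.
    meta = ET.SubElement(message, "meta")
    ET.SubElement(meta, "from").text = sender
    ET.SubElement(meta, "to").text = target
    ET.SubElement(meta, "thread").text = str(uuid.uuid4())

    # The payload: a <greeting> element with a single <name> child.
    greeting = ET.SubElement(message, "greeting")
    ET.SubElement(greeting, "name").text = name
    return message


envelope = build_envelope("console", "greeter", "Alice")
print(ET.tostring(envelope, encoding="unicode"))
```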
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [[Writing Handlers]] — Create your own handlers
|
||||
- [[Configuration]] — Customize organism.yaml
|
||||
- [[Hello World Tutorial]] — Step-by-step tutorial
|
||||
97
docs/wiki/README.md
Normal file
|
|
@ -0,0 +1,97 @@
|
|||
# Wiki Documentation
|
||||
|
||||
This directory contains documentation for the xml-pipeline.org XWiki.
|
||||
|
||||
## Structure
|
||||
|
||||
```
|
||||
wiki/
├── Home.md                  # Wiki home page
├── Installation.md          # Installation guide
├── Quick-Start.md           # Quick start guide
├── Writing-Handlers.md      # Handler guide
├── LLM-Router.md            # LLM router guide
├── Why-XML.md               # Rationale for XML
├── architecture/
│   ├── Overview.md          # Architecture overview
│   ├── Message-Pump.md      # Message pump details
│   ├── Thread-Registry.md   # Thread registry details
│   └── Shared-Backend.md    # Shared backend details
├── reference/
│   ├── Configuration.md     # Full configuration reference
│   ├── Handler-Contract.md  # Handler specification
│   └── CLI.md               # CLI reference
├── tutorials/
│   └── Hello-World.md       # Step-by-step tutorial
├── convert-to-xwiki.sh      # Bash conversion script
├── convert-to-xwiki.ps1     # PowerShell conversion script
└── README.md                # This file
|
||||
```
|
||||
|
||||
## Converting to XWiki Format
|
||||
|
||||
### Prerequisites
|
||||
|
||||
Install Pandoc: https://pandoc.org/installing.html
|
||||
|
||||
### Convert All Files
|
||||
|
||||
**Windows (PowerShell):**
|
||||
```powershell
|
||||
cd docs/wiki
|
||||
.\convert-to-xwiki.ps1
|
||||
```
|
||||
|
||||
**Linux/macOS (Bash):**
|
||||
```bash
|
||||
cd docs/wiki
|
||||
chmod +x convert-to-xwiki.sh
|
||||
./convert-to-xwiki.sh
|
||||
```
|
||||
|
||||
### Output
|
||||
|
||||
Converted files are placed in the `xwiki/` directory with a `.xwiki` extension.
|
||||
|
||||
## Uploading to XWiki
|
||||
|
||||
### Option 1: XWiki REST API
|
||||
|
||||
```bash
|
||||
# Upload a single page
|
||||
curl -u admin:password -X PUT \
|
||||
'https://xml-pipeline.org/rest/wikis/xwiki/spaces/Docs/pages/Home' \
|
||||
-H 'Content-Type: text/plain' \
|
||||
-d @xwiki/Home.xwiki
|
||||
```
|
||||
|
||||
### Option 2: XWiki Import
|
||||
|
||||
1. Go to XWiki Administration
|
||||
2. Content → Import
|
||||
3. Upload the files
|
||||
|
||||
### Option 3: Copy/Paste
|
||||
|
||||
1. Create page in XWiki
|
||||
2. Switch to Wiki editing mode
|
||||
3. Paste converted content
|
||||
|
||||
## Wiki Links
|
||||
|
||||
The Markdown files use `[[Page Name]]` wiki-link syntax. Pandoc converts these to XWiki's `[[Page Name]]` format.
|
||||
|
||||
If your XWiki uses a different space structure, you may need to adjust links:
|
||||
|
||||
```
|
||||
[[Installation]] → [[Docs.Installation]]
|
||||
[[architecture/Overview]] → [[Docs.Architecture.Overview]]
|
||||
```
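If many links need this adjustment, a small script can apply it to the converted text. The sketch below is one possible approach, not part of the provided conversion scripts: `qualify_links` is a hypothetical helper, the `Docs` space name is taken from the example above, and links that contain a label (`|`) are left untouched.

```python
import re

# Matches [[Target]] links that have no label separator.
WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)\]\]")


def qualify_links(text: str, space: str = "Docs") -> str:
    """Prefix each wiki link with `space`, turning path slashes into dots."""

    def rewrite(match: re.Match) -> str:
        # Capitalize the first letter of each path segment, keep the rest.
        segments = [s[:1].upper() + s[1:] for s in match.group(1).split("/")]
        return "[[" + ".".join([space, *segments]) + "]]"

    return WIKI_LINK.sub(rewrite, text)


print(qualify_links("See [[Installation]] and [[architecture/Overview]]."))
```

The function is not idempotent: running it twice on the same text would add the space prefix twice, so apply it once to freshly converted output.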
|
||||
|
||||
## Updating Documentation
|
||||
|
||||
1. Edit the Markdown files in this directory
|
||||
2. Run the conversion script
|
||||
3. Upload to XWiki
|
||||
|
||||
Keep Markdown as the source of truth for version control.
|
||||
254
docs/wiki/Why-XML.md
Normal file
|
|
@ -0,0 +1,254 @@
|
|||
# Why XML?
|
||||
|
||||
XML is the right format for a sovereign, attack-resistant message bus in a multi-agent system. JSON is not.
|
||||
|
||||
## The Short Answer
|
||||
|
||||
| Feature | XML | JSON |
|---------|-----|------|
| Schema validation | XSD (built-in, precise) | JSON Schema (optional, lossy) |
| Namespaces | Native support | None |
| Canonicalization | C14N standard | No standard |
| Repair tolerance | lxml recover mode | Parser fails |
| Comments | Supported | Forbidden |
| Mixed content | Native | Fragile |
|
||||
|
||||
## JSON's Origins
|
||||
|
||||
JSON (JavaScript Object Notation) was invented in the early 2000s as a subset of JavaScript literal syntax for simple data exchange in web browsers. It was never designed as a general-purpose format—just a quick way to serialize objects for Ajax calls.
|
||||
|
||||
It became popular because:
|
||||
- Simple for JavaScript developers
|
||||
- Human-readable
|
||||
- Web API boom (REST over SOAP)
|
||||
- Low barrier to entry
|
||||
|
||||
## Why JSON Fails for Multi-Agent Systems
|
||||
|
||||
### No Schema Enforcement
|
||||
|
||||
JSON Schema exists but is:
|
||||
- Optional (rarely enforced on wire)
|
||||
- Lossy (can't express all constraints)
|
||||
- Inconsistently implemented
|
||||
|
||||
Result: Messages accepted without validation, bugs discovered at runtime.
|
||||
|
||||
### No Namespaces
|
||||
|
||||
Can't safely mix vocabularies:
|
||||
|
||||
```json
|
||||
{
  "name": "Alice",   // User name? Product name?
  "type": "admin"    // User type? Message type?
}
|
||||
```
|
||||
|
||||
### No Canonicalization
|
||||
|
||||
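No widely adopted standard way to normalize for signing (RFC 8785 defines a JSON Canonicalization Scheme, but support for it is far less universal than C14N):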
|
||||
|
||||
```json
|
||||
{"a": 1, "b": 2}
|
||||
{"b": 2, "a": 1}
|
||||
```
|
||||
|
||||
Same data? Different bytes. Can't sign reliably.
|
||||
|
||||
### No Repair Tolerance
|
||||
|
||||
One syntax error → entire payload rejected:
|
||||
|
||||
```json
|
||||
{"name": "Alice",} // Trailing comma → FAIL
|
||||
```
|
||||
|
||||
### Escaping Hell
|
||||
|
||||
Strings with special characters are fragile:
|
||||
|
||||
```json
|
||||
{"message": "She said \"hello\""} // Manual escaping
|
||||
```
|
||||
|
||||
Easy to break, security vulnerability vector.
|
||||
|
||||
## Why JSON Fails for LLM Integration
|
||||
|
||||
### Hallucination Fragility
|
||||
|
||||
LLMs routinely produce invalid JSON:
|
||||
- Trailing commas
|
||||
- Missing quotes
|
||||
- Wrong nesting
|
||||
- Comments (forbidden!)
|
||||
|
||||
Result: Massive prompt bloat ("You MUST output valid JSON, NO trailing commas EVER...") and post-processing parsers.
|
||||
|
||||
### No Graceful Degradation
|
||||
|
||||
One parse error → entire response lost. No partial recovery.
|
||||
|
||||
### Injection Attacks
|
||||
|
||||
User input spliced into a hand-built JSON string can break the JSON structure:
|
||||
|
||||
```json
|
||||
{"user_input": "Alice", "role": "admin"}
|
||||
```
|
||||
|
||||
If user provides `", "role": "admin"` in their name → injection.
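The attack only works when JSON is assembled by string concatenation, which the following sketch demonstrates with the standard `json` module. The field names and the `guest` default role are made up for illustration.

```python
import json

name = 'Alice", "role": "admin'  # attacker-controlled input

# Unsafe: the input is pasted into a hand-written JSON template.
unsafe = '{"role": "guest", "user_input": "' + name + '"}'

# Safe: the serializer escapes the quotes, so the input stays one string.
safe = json.dumps({"role": "guest", "user_input": name})

# Python's parser keeps the last value for a duplicated key, so the
# injected "role" overrides the intended one in the unsafe document.
print(json.loads(unsafe)["role"])
print(json.loads(safe)["role"])
```

The same class of bug exists for XML built by string concatenation; in both formats the defense is to build documents with a serializer and validate them against a schema.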
|
||||
|
||||
## Why XML Succeeds
|
||||
|
||||
### Schema as Contract
|
||||
|
||||
XSD enforces exact structure on the wire:
|
||||
|
||||
```xml
|
||||
<xs:element name="greeting">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
|
||||
```
|
||||
|
||||
Every message validated before processing. No ambiguity.
|
||||
|
||||
### Namespaces
|
||||
|
||||
Safe vocabulary mixing:
|
||||
|
||||
```xml
|
||||
<message xmlns="https://xml-pipeline.org/ns/envelope/v1">
  <user:profile xmlns:user="https://example.org/user">
    <user:name>Alice</user:name>
  </user:profile>
</message>
|
||||
```
|
||||
|
||||
### Canonicalization (C14N)
|
||||
|
||||
Deterministic representation for signing:
|
||||
|
||||
```python
|
||||
c14n_bytes = etree.tostring(tree, method='c14n')
|
||||
signature = sign(c14n_bytes)
|
||||
```
|
||||
|
||||
Same logical content → same bytes → verifiable signatures.
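A quick way to see this property is to canonicalize two spellings of the same element and compare digests. The sketch below uses the standard library's `xml.etree.ElementTree.canonicalize`, which implements C14N 2.0, rather than the lxml `method='c14n'` call shown above; the SHA-256 digest stands in for whatever signing scheme is actually used.

```python
import hashlib
from xml.etree.ElementTree import canonicalize

# Same logical element: different attribute order, quoting, and spacing.
doc_a = '<greeting lang="en" formal="true"><name>Alice</name></greeting>'
doc_b = "<greeting formal='true'   lang='en'><name>Alice</name></greeting>"

canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

# Identical canonical text means identical bytes to hash or sign.
digest_a = hashlib.sha256(canon_a.encode("utf-8")).hexdigest()
digest_b = hashlib.sha256(canon_b.encode("utf-8")).hexdigest()

print(canon_a)
print(digest_a == digest_b)
```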
|
||||
|
||||
### Repair Tolerance
|
||||
|
||||
lxml recover mode fixes common issues:
|
||||
|
||||
```python
|
||||
parser = etree.XMLParser(recover=True)
|
||||
tree = etree.fromstring(broken_xml, parser)
|
||||
```
|
||||
|
||||
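Partial documents, encoding issues, and missing closing tags can often be recovered, although the repaired tree should still be validated against the schema before use.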
|
||||
|
||||
### Self-Describing
|
||||
|
||||
Elements carry meaning:
|
||||
|
||||
```xml
|
||||
<greeting>
  <name>Alice</name>
</greeting>
|
||||
```
|
||||
|
||||
vs JSON:
|
||||
|
||||
```json
|
||||
["Alice"] // What is this?
|
||||
```
|
||||
|
||||
## LLM + XML = Reliable
|
||||
|
||||
### Natural Streaming
|
||||
|
||||
XML streams naturally (can process before complete).
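As a small illustration of incremental processing, the sketch below feeds a document to the standard library's `XMLPullParser` in arbitrary chunks and collects elements as their end tags are parsed. The chunk boundaries are invented for the example, and exactly when each event becomes available can vary with the underlying Expat version, which is why the parser is closed and drained once more at the end.

```python
from xml.etree.ElementTree import XMLPullParser

# The document arrives in three arbitrary pieces, as it might from a socket
# or a streaming LLM response.
chunks = ["<greeting><name>Al", "ice</name>", "<formal>true</formal></greeting>"]

parser = XMLPullParser(events=("end",))
completed = []  # (tag, text) for every element whose end tag has been parsed


def drain() -> None:
    """Collect any elements the parser has finished so far."""
    for _event, element in parser.read_events():
        completed.append((element.tag, element.text))


for chunk in chunks:
    parser.feed(chunk)
    drain()      # finished elements may be usable before the document ends

parser.close()   # signal end of input so any buffered data is processed
drain()

print(completed)
```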
|
||||
|
||||
### Repair on Output
|
||||
|
||||
LLM produces broken XML? lxml fixes it:
|
||||
|
||||
```python
|
||||
from lxml import etree
|
||||
|
||||
parser = etree.XMLParser(recover=True)
|
||||
tree = etree.fromstring(llm_output, parser)
|
||||
# Works even with minor errors
|
||||
```
|
||||
|
||||
### Schema-Guided Generation
|
||||
|
||||
XSD tells LLM exactly what to produce:
|
||||
|
||||
```
|
||||
Generate XML matching this schema:
|
||||
<greeting><name>string</name></greeting>
|
||||
```
|
||||
|
||||
Clear contract, fewer hallucinations.
|
||||
|
||||
### Graceful Validation
|
||||
|
||||
Validation errors become helpful feedback:
|
||||
|
||||
```xml
|
||||
<huh>
  <error>Element 'greeting' missing required element 'name'</error>
</huh>
|
||||
```
|
||||
|
||||
LLM can self-correct.
|
||||
|
||||
## The Trade-Offs
|
||||
|
||||
### XML is More Verbose
|
||||
|
||||
```xml
|
||||
<greeting><name>Alice</name></greeting>
|
||||
```
|
||||
|
||||
vs
|
||||
|
||||
```json
|
||||
{"name": "Alice"}
|
||||
```
|
||||
|
||||
**But:** Compression largely closes the size gap on the wire, and the verbosity aids debugging.
|
||||
|
||||
### XML Parsing is Slower
|
||||
|
||||
Typically microseconds more per message than JSON parsing.
|
||||
|
||||
**But:** Network latency dominates. And lxml is highly optimized.
|
||||
|
||||
### XML is "Old"
|
||||
|
||||
True. Also mature, battle-tested, standards-based.
|
||||
|
||||
## Conclusion
|
||||
|
||||
JSON won the web because it was "good enough" for stateless HTTP requests.
|
||||
|
||||
XML wins for multi-agent systems because:
|
||||
- Security requires schema enforcement
|
||||
- Signing requires canonicalization
|
||||
- LLMs require repair tolerance
|
||||
- Complexity requires namespaces
|
||||
|
||||
**JSON won the web. XML wins the swarm.**
|
||||
|
||||
## Further Reading
|
||||
|
||||
- [W3C XML Schema](https://www.w3.org/XML/Schema)
|
||||
- [Exclusive XML Canonicalization](https://www.w3.org/TR/xml-exc-c14n/)
|
||||
- [lxml Documentation](https://lxml.de/)
|
||||
308
docs/wiki/Writing-Handlers.md
Normal file
|
|
@ -0,0 +1,308 @@
|
|||
# Writing Handlers
|
||||
|
||||
Handlers are async Python functions that process messages. This guide covers everything you need to know to write effective handlers.
|
||||
|
||||
## Basic Handler Structure
|
||||
|
||||
Every handler follows this pattern:
|
||||
|
||||
```python
|
||||
from dataclasses import dataclass
from third_party.xmlable import xmlify
from xml_pipeline.message_bus.message_state import HandlerMetadata, HandlerResponse

# 1. Define your payload (input)
@xmlify
@dataclass
class MyPayload:
    """Description of what this payload represents."""
    field1: str
    field2: int

# 2. Define your response (output)
@xmlify
@dataclass
class MyResponse:
    """Description of the response."""
    result: str

# 3. Write the handler
async def my_handler(payload: MyPayload, metadata: HandlerMetadata) -> HandlerResponse:
    """Process the payload and return a response."""
    result = f"Processed {payload.field1} with {payload.field2}"

    return HandlerResponse(
        payload=MyResponse(result=result),
        to="next-listener",  # Where to send the response
    )
|
||||
```
|
||||
|
||||
## The @xmlify Decorator
|
||||
|
||||
The `@xmlify` decorator enables automatic XML serialization and XSD generation:
|
||||
|
||||
```python
|
||||
from dataclasses import dataclass
from third_party.xmlable import xmlify

@xmlify
@dataclass
class Greeting:
    name: str             # Required field
    formal: bool = False  # Optional with default
    count: int = 1        # Optional with default
|
||||
```
|
||||
|
||||
This generates XML like:
|
||||
```xml
|
||||
<greeting>
  <name>Alice</name>
  <formal>true</formal>
  <count>3</count>
</greeting>
|
||||
```
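To build intuition for the mapping from dataclass fields to elements, here is a standard-library sketch that produces the same XML for flat payloads. It is not how `@xmlify` is implemented: `to_element` is a hypothetical helper, it handles only scalar fields, and it generates no XSD.

```python
from dataclasses import dataclass, fields
import xml.etree.ElementTree as ET


@dataclass
class Greeting:
    name: str
    formal: bool = False
    count: int = 1


def to_element(obj) -> ET.Element:
    """Map a flat dataclass to an element named after the class."""
    root = ET.Element(type(obj).__name__.lower())
    for field in fields(obj):
        value = getattr(obj, field.name)
        # Booleans are written as lowercase true/false, other values via str().
        text = str(value).lower() if isinstance(value, bool) else str(value)
        ET.SubElement(root, field.name).text = text
    return root


xml_text = ET.tostring(to_element(Greeting("Alice", formal=True, count=3)), encoding="unicode")
print(xml_text)
```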
|
||||
|
||||
### Supported Types
|
||||
|
||||
| Python Type | XML Representation |
|-------------|-------------------|
| `str` | Text content |
| `int` | Integer text |
| `float` | Decimal text |
| `bool` | `true` / `false` |
| `list[T]` | Repeated elements |
| `Optional[T]` | Optional element |
| `@xmlify` class | Nested element |
|
||||
|
||||
### Nested Payloads
|
||||
|
||||
```python
|
||||
@xmlify
@dataclass
class Address:
    street: str
    city: str

@xmlify
@dataclass
class Person:
    name: str
    address: Address  # Nested payload
|
||||
```
|
||||
|
||||
## HandlerMetadata
|
||||
|
||||
The `metadata` parameter provides context about the message:
|
||||
|
||||
```python
|
||||
@dataclass
class HandlerMetadata:
    thread_id: str           # Opaque thread UUID
    from_id: str             # Who sent this message
    own_name: str | None     # This listener's name (agents only)
    is_self_call: bool       # True if message is from self
    usage_instructions: str  # Peer schemas for LLM prompts
    todo_nudge: str          # System reminders about pending todos
|
||||
```
|
||||
|
||||
### Usage Examples
|
||||
|
||||
```python
|
||||
async def my_handler(payload: MyPayload, metadata: HandlerMetadata) -> HandlerResponse:
    # Log who sent the message
    print(f"Received from: {metadata.from_id}")

    # Check if this is a self-call (agent iterating)
    if metadata.is_self_call:
        print("This is a self-call")

    # For agents: use peer schemas in LLM prompts
    if metadata.usage_instructions:
        system_prompt = metadata.usage_instructions + "\n\nYour custom instructions..."
|
||||
```
|
||||
|
||||
## Response Types
|
||||
|
||||
### Forward to Target
|
||||
|
||||
Send the response to a specific listener:
|
||||
|
||||
```python
|
||||
return HandlerResponse(
|
||||
payload=MyResponse(result="done"),
|
||||
to="next-listener",
|
||||
)
|
||||
```
|
||||
|
||||
### Respond to Caller
|
||||
|
||||
Return to whoever sent the message:
|
||||
|
||||
```python
|
||||
return HandlerResponse.respond(
|
||||
payload=ResultPayload(value=42)
|
||||
)
|
||||
```
|
||||
|
||||
This uses the thread registry to route back up the call chain.
|
||||
|
||||
### Terminate Chain
|
||||
|
||||
End processing with no response:
|
||||
|
||||
```python
|
||||
return None
|
||||
```
|
||||
|
||||
Use this for terminal handlers (logging, display, etc.).
|
||||
|
||||
## Handler Patterns
|
||||
|
||||
### Simple Tool
|
||||
|
||||
A stateless transformation:
|
||||
|
||||
```python
@xmlify
@dataclass
class AddPayload:
    a: int
    b: int

@xmlify
@dataclass
class AddResult:
    sum: int

async def add_handler(payload: AddPayload, metadata: HandlerMetadata) -> HandlerResponse:
    result = payload.a + payload.b
    return HandlerResponse.respond(payload=AddResult(sum=result))
```
|
||||
|
||||
### LLM Agent
|
||||
|
||||
An agent that uses language models:
|
||||
|
||||
```python
async def research_handler(payload: ResearchQuery, metadata: HandlerMetadata) -> HandlerResponse:
    from xml_pipeline.platform.llm_api import complete

    # Build prompt with peer awareness
    system_prompt = f"""
    {metadata.usage_instructions}

    You are a research agent. Answer the query using available tools.
    """

    response = await complete(
        model="grok-4.1",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": payload.query},
        ],
    )

    return HandlerResponse(
        payload=ResearchResult(answer=response.content),
        to="summarizer",
    )
```
|
||||
|
||||
### Terminal Handler
|
||||
|
||||
A handler that displays output and ends the chain:
|
||||
|
||||
```python
async def console_output(payload: TextOutput, metadata: HandlerMetadata) -> None:
    print(f"[{payload.source}] {payload.text}")
    return None  # Chain ends here
```
|
||||
|
||||
### Self-Iterating Agent
|
||||
|
||||
An agent that calls itself to continue reasoning:
|
||||
|
||||
```python
async def thinking_agent(payload: ThinkPayload, metadata: HandlerMetadata) -> HandlerResponse:
    # Check if we should continue thinking
    if payload.iteration >= 5:
        return HandlerResponse(
            payload=FinalAnswer(answer=payload.current_answer),
            to="output",
        )

    # Continue thinking by calling self
    return HandlerResponse(
        payload=ThinkPayload(
            iteration=payload.iteration + 1,
            current_answer=f"Refined: {payload.current_answer}",
        ),
        to=metadata.own_name,  # Self-call
    )
```
|
||||
|
||||
## Error Handling
|
||||
|
||||
Handlers should handle errors gracefully:
|
||||
|
||||
```python
import logging

logger = logging.getLogger(__name__)

async def safe_handler(payload: MyPayload, metadata: HandlerMetadata) -> HandlerResponse:
    try:
        result = await risky_operation(payload)
        return HandlerResponse.respond(payload=SuccessResult(data=result))
    except ValidationError as e:
        # Expected failure: report the specific problem to the caller
        return HandlerResponse.respond(payload=ErrorResult(error=str(e)))
    except Exception:
        # Log the traceback and return a generic error
        logger.exception("Handler failed")
        return HandlerResponse.respond(payload=ErrorResult(error="Internal error"))
```
|
||||
|
||||
## Registration
|
||||
|
||||
Register handlers in `organism.yaml`:
|
||||
|
||||
```yaml
listeners:
  - name: calculator.add
    payload_class: handlers.calc.AddPayload
    handler: handlers.calc.add_handler
    description: "Adds two numbers and returns the sum"
```
|
||||
|
||||
The `description` is important—it's used in auto-generated tool prompts for LLM agents.
|
||||
|
||||
## CPU-Bound Handlers
|
||||
|
||||
For computationally expensive handlers, mark them as `cpu_bound`:
|
||||
|
||||
```yaml
listeners:
  - name: analyzer
    payload_class: handlers.analyze.AnalyzePayload
    handler: handlers.analyze.analyze_handler
    description: "Heavy document analysis"
    cpu_bound: true
```
|
||||
|
||||
These run in a separate process pool to avoid blocking the event loop.
|
||||
|
||||
## Security Notes
|
||||
|
||||
Handlers are **untrusted code**. The system enforces:
|
||||
|
||||
1. **Identity injection** — `<from>` is always set by the pump, never by handlers
|
||||
2. **Thread isolation** — Handlers see only opaque UUIDs
|
||||
3. **Peer constraints** — Agents can only send to declared peers
|
||||
|
||||
Even compromised handlers cannot:
|
||||
- Forge sender identity
|
||||
- Access other threads
|
||||
- Discover organism topology
|
||||
- Route to undeclared peers
|
||||
|
||||
## See Also
|
||||
|
||||
- [[Handler Contract]] — Complete handler specification
|
||||
- [[Configuration]] — Registering handlers
|
||||
- [[LLM Router]] — Using language models
|
||||
346
docs/wiki/architecture/Message-Pump.md
Normal file
|
|
@ -0,0 +1,346 @@
|
|||
# Message Pump
|
||||
|
||||
The Message Pump (StreamPump) is the heart of xml-pipeline. It orchestrates message flow from ingress through processing to handler dispatch and response handling.
|
||||
|
||||
## Overview
|
||||
|
||||
The pump uses [aiostream](https://aiostream.readthedocs.io/) for stream-based processing with concurrent fan-out capabilities.
|
||||
|
||||
```python
|
||||
from xml_pipeline.message_bus.stream_pump import StreamPump
|
||||
|
||||
pump = StreamPump(config)
|
||||
await pump.start()
|
||||
|
||||
# Inject a message
|
||||
await pump.inject(raw_bytes, from_id="console")
|
||||
|
||||
await pump.shutdown()
|
||||
```
|
||||
|
||||
## Architecture
|
||||
|
||||
```
                    ┌─────────────────────┐
                    │   Message Source    │
                    │ (Console, WebSocket)│
                    └──────────┬──────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                       INGRESS PIPELINE                       │
│                                                              │
│  ┌─────────┐   ┌──────┐   ┌──────────┐   ┌─────────────┐     │
│  │ Repair  │ → │ C14N │ → │ Envelope │ → │   Payload   │     │
│  │  Step   │   │ Step │   │ Validate │   │  Extraction │     │
│  └─────────┘   └──────┘   └──────────┘   └─────────────┘     │
│                                                              │
│  ┌──────────┐   ┌─────────┐   ┌─────────────┐                │
│  │  Thread  │ → │   XSD   │ → │ Deserialize │                │
│  │  Assign  │   │ Validate│   │  to class   │                │
│  └──────────┘   └─────────┘   └─────────────┘                │
│                                                              │
└──────────────────────────────────────────────────────────────┘
                               │
                               ▼
                    ┌─────────────────────┐
                    │    ROUTING TABLE    │
                    │                     │
                    │  root_tag → [       │
                    │    Listener1,       │
                    │    Listener2,       │
                    │  ]                  │
                    └──────────┬──────────┘
                               │
             ┌─────────────────┼─────────────────┐
             ▼                 ▼                 ▼
    ┌─────────────────┐ ┌─────────────┐ ┌─────────────────┐
    │    Handler A    │ │  Handler B  │ │    Handler C    │
    │  (async/main)   │ │ (cpu_bound) │ │     (async)     │
    └────────┬────────┘ └──────┬──────┘ └────────┬────────┘
             │                 │                 │
             └─────────────────┼─────────────────┘
                               │
                               ▼
                    ┌─────────────────────┐
                    │  RESPONSE HANDLER   │
                    │                     │
                    │  • Serialize        │
                    │  • Wrap envelope    │
                    │  • Inject <from>    │
                    │  • Re-inject        │
                    └─────────────────────┘
```
|
||||
|
||||
## Pipeline Steps
|
||||
|
||||
### repair_step
|
||||
|
||||
Fixes malformed XML using lxml's recover mode:
|
||||
|
||||
```python
from lxml import etree

async def repair_step(state: MessageState) -> MessageState:
    # recover=True makes the parser build a best-effort tree instead of raising
    parser = etree.XMLParser(recover=True)
    state.envelope_tree = etree.fromstring(state.raw_bytes, parser)
    return state
```
|
||||
|
||||
Handles:
|
||||
- Missing closing tags
|
||||
- Invalid characters
|
||||
- Encoding issues
|
||||
|
||||
### c14n_step
|
||||
|
||||
Canonicalizes XML using Exclusive C14N:
|
||||
|
||||
```python
async def c14n_step(state: MessageState) -> MessageState:
    # exclusive=True selects Exclusive C14N; method='c14n' alone is inclusive C14N
    c14n_bytes = etree.tostring(state.envelope_tree, method='c14n', exclusive=True)
    state.envelope_tree = etree.fromstring(c14n_bytes)
    return state
```
|
||||
|
||||
Ensures deterministic representation for signing.
|
||||
|
||||
### envelope_validation_step
|
||||
|
||||
Validates against `envelope.xsd`:
|
||||
|
||||
```xml
|
||||
<message xmlns="https://xml-pipeline.org/ns/envelope/v1">
|
||||
<meta>
|
||||
<from>...</from>
|
||||
<to>...</to> <!-- optional -->
|
||||
<thread>...</thread> <!-- optional -->
|
||||
</meta>
|
||||
<!-- payload element -->
|
||||
</message>
|
||||
```
|
||||
|
||||
### payload_extraction_step
|
||||
|
||||
Extracts the payload element:
|
||||
|
||||
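```python
async def payload_extraction_step(state: MessageState) -> MessageState:
    # Find the first child element that is not <meta>. Compare local names,
    # because envelope children carry the envelope namespace, and skip
    # comments and processing instructions (their .tag is not a string).
    for child in state.envelope_tree:
        if isinstance(child.tag, str) and etree.QName(child).localname != 'meta':
            state.payload_tree = child
            break
    return state
```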
|
||||
|
||||
### thread_assignment_step
|
||||
|
||||
Assigns or inherits thread UUID:
|
||||
|
||||
```python
import uuid

async def thread_assignment_step(state: MessageState) -> MessageState:
    # '{*}' matches any namespace, so the lookup works on the namespaced envelope
    meta = state.envelope_tree.find('{*}meta')
    thread_elem = meta.find('{*}thread')

    if thread_elem is not None:
        # Existing thread - inherit the UUID from the envelope
        state.thread_id = thread_elem.text
    else:
        # New thread - assign UUID
        state.thread_id = str(uuid.uuid4())
    return state
```
|
||||
|
||||
### xsd_validation_step
|
||||
|
||||
Validates payload against listener's schema:
|
||||
|
||||
```python
async def xsd_validation_step(state: MessageState) -> MessageState:
    listener = find_listener_for_payload(state.payload_tree)
    schema = load_schema(listener.schema_path)

    if not schema.validate(state.payload_tree):
        state.error = "XSD validation failed"
    return state
```
|
||||
|
||||
### deserialization_step
|
||||
|
||||
Converts XML to typed dataclass:
|
||||
|
||||
```python
async def deserialization_step(state: MessageState) -> MessageState:
    listener = find_listener_for_payload(state.payload_tree)
    state.payload = xmlify_deserialize(
        state.payload_tree,
        listener.payload_class
    )
    return state
```
|
||||
|
||||
## Routing
|
||||
|
||||
### Root Tag Derivation
|
||||
|
||||
Root tag = `{listener_name}.{dataclass_name}` (lowercase):
|
||||
|
||||
```
|
||||
Listener: greeter
|
||||
Dataclass: Greeting
|
||||
Root tag: greeter.greeting
|
||||
```
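As a minimal sketch of the rule above, the derivation is a string join followed by lowercasing. The function name `derive_root_tag` is an assumption made for illustration; the source does not name the routine that does this.

```python
from dataclasses import dataclass


def derive_root_tag(listener_name: str, payload_class: type) -> str:
    """Hypothetical helper: '{listener_name}.{dataclass_name}', lowercased."""
    return f"{listener_name}.{payload_class.__name__}".lower()


@dataclass
class Greeting:
    name: str


root_tag = derive_root_tag("greeter", Greeting)
```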
|
||||
|
||||
### Routing Table
|
||||
|
||||
```python
|
||||
routing_table = {
|
||||
"greeter.greeting": [greeter_listener],
|
||||
"calculator.add": [calculator_listener],
|
||||
"search.query": [google_listener, bing_listener], # Broadcast
|
||||
}
|
||||
```
|
||||
|
||||
### Broadcast
|
||||
|
||||
Multiple listeners can share a root tag (`broadcast: true`):
|
||||
|
||||
```yaml
listeners:
  - name: search.google
    broadcast: true
    # ...

  - name: search.bing
    broadcast: true
    # ...
```
|
||||
|
||||
All matching listeners execute concurrently.
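Concurrent fan-out of this kind can be sketched with `asyncio.gather`. This is an illustration under stated assumptions, not the pump's code: the listeners are plain coroutine functions, and `dispatch_broadcast` is a hypothetical name. The real pump is described as using aiostream.

```python
import asyncio


async def google_listener(query: str) -> str:
    await asyncio.sleep(0.02)   # stand-in for real work
    return f"google:{query}"


async def bing_listener(query: str) -> str:
    await asyncio.sleep(0.01)
    return f"bing:{query}"


# Hypothetical routing table: one root tag mapped to several listeners
routing_table = {"search.query": [google_listener, bing_listener]}


async def dispatch_broadcast(root_tag: str, payload: str) -> list:
    listeners = routing_table.get(root_tag, [])
    # gather() runs all listeners concurrently and keeps results in listener order
    return await asyncio.gather(*(listener(payload) for listener in listeners))


results = asyncio.run(dispatch_broadcast("search.query", "xml"))
```

Both listeners start at once, so the total wait is roughly that of the slowest listener rather than the sum of all of them.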
|
||||
|
||||
## Handler Dispatch
|
||||
|
||||
### Async Handlers (Default)
|
||||
|
||||
Run in the main event loop:
|
||||
|
||||
```python
async def _dispatch_async(self, state, listener):
    metadata = build_metadata(state, listener)
    response = await listener.handler(state.payload, metadata)
    await self._process_response(response, listener, state)
```
|
||||
|
||||
### CPU-Bound Handlers
|
||||
|
||||
Dispatched to ProcessPoolExecutor:
|
||||
|
||||
```python
async def _dispatch_to_process_pool(self, state, listener):
    metadata = build_metadata(state, listener)

    # Store data in shared backend
    payload_uuid, metadata_uuid = store_task_data(
        self._backend, state.payload, metadata
    )

    # Submit to pool
    task = WorkerTask(
        thread_uuid=state.thread_id,
        payload_uuid=payload_uuid,
        handler_path=listener.handler_path,
        metadata_uuid=metadata_uuid,
    )

    # Inside a coroutine, get_running_loop() is the preferred accessor
    loop = asyncio.get_running_loop()
    result = await loop.run_in_executor(
        self._process_pool, execute_handler, task
    )

    # Fetch response from backend
    response = fetch_response(self._backend, result.response_uuid)
    await self._process_response(response, listener, state)
```
|
||||
|
||||
## Response Processing
|
||||
|
||||
When a handler returns `HandlerResponse`:
|
||||
|
||||
```python
async def _process_response(self, response, listener, state):
    if response is None:
        return  # Chain terminates

    # Validate target
    if listener.is_agent and listener.peers:
        if response.to not in listener.peers:
            await self._emit_error(state, "Routing error")
            return

    # Build new envelope
    payload_xml = xmlify_serialize(response.payload)
    new_state = MessageState(
        raw_bytes=wrap_in_envelope(
            payload_xml,
            from_id=listener.name,  # SYSTEM INJECTS
            thread_id=get_new_thread_id(response, state),
        ),
    )

    # Re-inject
    await self._process(new_state)
```
|
||||
|
||||
## Error Handling
|
||||
|
||||
### Pipeline Errors
|
||||
|
||||
If any step sets `state.error`, processing stops and `<huh>` is emitted:
|
||||
|
||||
```xml
|
||||
<huh xmlns="https://xml-pipeline.org/ns/core/v1">
|
||||
<error>XSD validation failed</error>
|
||||
<original-attempt>base64-encoded-bytes</original-attempt>
|
||||
</huh>
|
||||
```
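One way such an element could be assembled is sketched below with the standard library. The helper name `build_huh` is hypothetical, and the exact serialization the pump emits is not shown in the source; only the element names, the namespace, and the base64 encoding are taken from the example above.

```python
import base64
import xml.etree.ElementTree as ET

CORE_NS = "https://xml-pipeline.org/ns/core/v1"


def build_huh(error: str, raw_bytes: bytes) -> bytes:
    """Hypothetical helper: wrap an error and the original bytes in <huh>."""
    huh = ET.Element(f"{{{CORE_NS}}}huh")
    ET.SubElement(huh, f"{{{CORE_NS}}}error").text = error
    attempt = ET.SubElement(huh, f"{{{CORE_NS}}}original-attempt")
    # base64 keeps arbitrary (possibly malformed) input safe inside XML text
    attempt.text = base64.b64encode(raw_bytes).decode("ascii")
    return ET.tostring(huh, default_namespace=CORE_NS)


huh_bytes = build_huh("XSD validation failed", b"<greeting><name>Alice")
```

Encoding the original attempt means even input that could not be parsed can be echoed back without breaking the error document itself.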
|
||||
|
||||
### Routing Errors
|
||||
|
||||
If an agent tries to route to an undeclared peer:
|
||||
|
||||
```xml
|
||||
<SystemError>
|
||||
<code>routing</code>
|
||||
<message>Message could not be delivered.</message>
|
||||
<retry-allowed>true</retry-allowed>
|
||||
</SystemError>
|
||||
```
|
||||
|
||||
## Lifecycle
|
||||
|
||||
```python
|
||||
# Start
|
||||
pump = StreamPump(config)
|
||||
await pump.start()
|
||||
|
||||
# Inject messages
|
||||
await pump.inject(raw_bytes, from_id="console")
|
||||
|
||||
# Shutdown (graceful)
|
||||
await pump.shutdown()
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
```yaml
process_pool:
  workers: 4
  max_tasks_per_child: 100

backend:
  type: redis
  redis_url: "redis://localhost:6379"
```
|
||||
|
||||
## See Also
|
||||
|
||||
- [[Architecture Overview]] — High-level view
|
||||
- [[Thread Registry]] — Thread tracking
|
||||
- [[Shared Backend]] — Cross-process state
|
||||
- [[Writing Handlers]] — Handler patterns
|
||||
256
docs/wiki/architecture/Overview.md
Normal file
|
|
@ -0,0 +1,256 @@
|
|||
# Architecture Overview
|
||||
|
||||
xml-pipeline implements a stream-based message pump where all communication flows through validated XML envelopes. The architecture enforces strict isolation between handlers (untrusted code) and the system (trusted zone).
|
||||
|
||||
## High-Level Architecture
|
||||
|
||||
```
┌─────────────────────────────────────────────────────────────────────┐
│                        TRUSTED ZONE (System)                        │
│  • Thread registry (UUID ↔ call chain mapping)                      │
│  • Listener registry (name → peers, schema)                         │
│  • Envelope injection (<from>, <thread>, <to>)                      │
│  • Peer constraint enforcement                                      │
└─────────────────────────────────────────────────────────────────────┘
                                   ↕
                       Coroutine Capture Boundary
                                   ↕
┌─────────────────────────────────────────────────────────────────────┐
│                      UNTRUSTED ZONE (Handlers)                      │
│  • Receive typed payload + metadata                                 │
│  • Return HandlerResponse or None                                   │
│  • Cannot forge identity, escape thread, or probe topology          │
└─────────────────────────────────────────────────────────────────────┘
```
|
||||
|
||||
## Core Components
|
||||
|
||||
### Message Pump (StreamPump)
|
||||
|
||||
The central orchestrator that:
|
||||
1. Receives raw XML bytes
|
||||
2. Runs messages through preprocessing pipeline
|
||||
3. Routes to appropriate handlers
|
||||
4. Processes responses and re-injects
|
||||
|
||||
See [[Message Pump]] for details.
|
||||
|
||||
### Pipeline Steps
|
||||
|
||||
Messages flow through ordered processing stages:
|
||||
|
||||
```
|
||||
Raw Bytes
|
||||
│
|
||||
▼
|
||||
┌─────────────────┐
|
||||
│ repair_step │ Fix malformed XML (lxml recover mode)
|
||||
└────────┬────────┘
|
||||
▼
|
||||
┌─────────────────┐
|
||||
│ c14n_step │ Canonicalize XML (Exclusive C14N)
|
||||
└────────┬────────┘
|
||||
▼
|
||||
┌─────────────────┐
|
||||
│ envelope_valid │ Validate against envelope.xsd
|
||||
└────────┬────────┘
|
||||
▼
|
||||
┌─────────────────┐
|
||||
│ payload_extract │ Extract payload from envelope
|
||||
└────────┬────────┘
|
||||
▼
|
||||
┌─────────────────┐
|
||||
│ thread_assign │ Assign or inherit thread UUID
|
||||
└────────┬────────┘
|
||||
▼
|
||||
┌─────────────────┐
|
||||
│ xsd_validate │ Validate against listener's XSD
|
||||
└────────┬────────┘
|
||||
▼
|
||||
┌─────────────────┐
|
||||
│ deserialize │ XML → @xmlify dataclass
|
||||
└────────┬────────┘
|
||||
▼
|
||||
┌─────────────────┐
|
||||
│ routing │ Match to listener(s)
|
||||
└────────┬────────┘
|
||||
▼
|
||||
Handler
|
||||
```
|
||||
|
||||
### Thread Registry
|
||||
|
||||
Maps opaque UUIDs to call chains:
|
||||
|
||||
```
UUID:  550e8400-e29b-41d4-...
Chain: system.organism.console.greeter.calculator
       │      │        │       │       │
       │      │        │       │       └─ Current handler
       │      │        │       └─ Previous hop
       │      │        └─ Entry point
       │      └─ Organism name
       └─ Root
```
|
||||
|
||||
Handlers only see the UUID. The actual chain is private to the system.
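The two-way mapping can be pictured with a toy in-memory registry. This is a sketch, not the project's implementation: the class name `ToyThreadRegistry` is invented, and only the method names `get_or_create`, `lookup`, and `extend_chain` echo the API listed on the Thread Registry page.

```python
import uuid


class ToyThreadRegistry:
    """Illustrative only: maps private call chains to opaque UUIDs and back."""

    def __init__(self):
        self._chain_to_uuid = {}
        self._uuid_to_chain = {}

    def get_or_create(self, chain: str) -> str:
        if chain not in self._chain_to_uuid:
            thread_id = str(uuid.uuid4())
            self._chain_to_uuid[chain] = thread_id
            self._uuid_to_chain[thread_id] = chain
        return self._chain_to_uuid[chain]

    def lookup(self, thread_id: str):
        # Returns None for unknown UUIDs instead of raising
        return self._uuid_to_chain.get(thread_id)

    def extend_chain(self, thread_id: str, next_hop: str) -> str:
        # A longer chain gets its own UUID, so a handler cannot infer depth
        chain = self._uuid_to_chain[thread_id]
        return self.get_or_create(f"{chain}.{next_hop}")


registry = ToyThreadRegistry()
greeter_uuid = registry.get_or_create("console.greeter")
calc_uuid = registry.extend_chain(greeter_uuid, "calculator")
```

A handler would only ever receive `greeter_uuid` or `calc_uuid`; the chain strings stay inside the registry.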
|
||||
|
||||
See [[Thread Registry]] for details.
|
||||
|
||||
### Listener Registry
|
||||
|
||||
Tracks registered listeners:
|
||||
|
||||
```
|
||||
name: "greeter"
|
||||
├── payload_class: Greeting
|
||||
├── handler: handle_greeting
|
||||
├── description: "Friendly greeting handler"
|
||||
├── agent: true
|
||||
├── peers: [shouter, calculator]
|
||||
└── schema: schemas/greeter/v1.xsd
|
||||
```
|
||||
|
||||
### Context Buffer
|
||||
|
||||
Stores message history per thread:
|
||||
|
||||
```
|
||||
Thread: uuid-123
|
||||
├── Slot 0: Greeting(name="Alice") from console
|
||||
├── Slot 1: GreetingResponse(message="Hello!") from greeter
|
||||
└── Slot 2: ShoutResponse(text="HELLO!") from shouter
|
||||
```
|
||||
|
||||
Append-only, immutable slots. Auto-GC when thread is pruned.
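A rough sketch of these semantics, assuming frozen dataclasses for immutability and a plain dict for storage. `Slot` and `ToyContextBuffer` are hypothetical names, not the library's classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """Immutable record of one message; assigning to a field raises an error."""
    payload: object
    from_id: str


class ToyContextBuffer:
    def __init__(self):
        self._threads = {}

    def append(self, thread_id: str, slot: Slot) -> int:
        slots = self._threads.setdefault(thread_id, [])
        slots.append(slot)          # append-only: no update or insert API
        return len(slots) - 1       # index of the new slot

    def get_thread(self, thread_id: str) -> tuple:
        # Return a tuple so callers cannot reorder or drop slots
        return tuple(self._threads.get(thread_id, ()))

    def prune(self, thread_id: str) -> None:
        # Dropping the thread releases all of its slots at once
        self._threads.pop(thread_id, None)


buffer = ToyContextBuffer()
first = buffer.append("uuid-123", Slot("Greeting(name='Alice')", "console"))
second = buffer.append("uuid-123", Slot("GreetingResponse(message='Hello!')", "greeter"))
```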
|
||||
|
||||
## Message Flow
|
||||
|
||||
### 1. Message Arrival
|
||||
|
||||
External message arrives (console, WebSocket, etc.):
|
||||
|
||||
```xml
|
||||
<message xmlns="https://xml-pipeline.org/ns/envelope/v1">
|
||||
<meta>
|
||||
<from>console</from>
|
||||
<to>greeter</to>
|
||||
</meta>
|
||||
<greeting>
|
||||
<name>Alice</name>
|
||||
</greeting>
|
||||
</message>
|
||||
```
|
||||
|
||||
### 2. Pipeline Processing
|
||||
|
||||
Message flows through pipeline steps. Each step transforms `MessageState`:
|
||||
|
||||
```python
|
||||
@dataclass
class MessageState:
    raw_bytes: bytes | None          # Input
    envelope_tree: Element | None    # After repair
    payload_tree: Element | None     # After extraction
    payload: Any | None              # After deserialization
    thread_id: str | None            # After assignment
    from_id: str | None              # Sender
    target_listeners: list | None    # After routing
    error: str | None                # If step fails
|
||||
```
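How a sequence of steps might thread a state object through the pipeline, stopping at the first error, can be sketched as follows. The `State` class, the two toy steps, and `run_pipeline` are all assumptions made for illustration; the real steps operate on `MessageState` and XML trees.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class State:
    """Cut-down stand-in for MessageState."""
    raw_bytes: bytes
    text: Optional[str] = None
    error: Optional[str] = None


async def decode_step(state: State) -> State:
    try:
        state.text = state.raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        state.error = "not UTF-8"
    return state


async def upper_step(state: State) -> State:
    state.text = state.text.upper()
    return state


async def run_pipeline(state: State, steps) -> State:
    for step in steps:
        state = await step(state)
        if state.error is not None:
            break                   # later steps never see a failed state
    return state


ok = asyncio.run(run_pipeline(State(b"hi"), [decode_step, upper_step]))
bad = asyncio.run(run_pipeline(State(b"\xff"), [decode_step, upper_step]))
```

Because each step takes and returns the same state type, steps can be reordered or added without changing the runner.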
|
||||
|
||||
### 3. Handler Dispatch
|
||||
|
||||
Handler receives typed payload + metadata:
|
||||
|
||||
```python
async def handle_greeting(payload: Greeting, metadata: HandlerMetadata):
    # payload.name == "Alice"
    # metadata.thread_id == "uuid-123"
    # metadata.from_id == "console"
    ...
```
|
||||
|
||||
### 4. Response Processing
|
||||
|
||||
Handler returns `HandlerResponse`:
|
||||
|
||||
```python
|
||||
return HandlerResponse(
|
||||
payload=GreetingResponse(message="Hello, Alice!"),
|
||||
to="shouter",
|
||||
)
|
||||
```
|
||||
|
||||
System:
|
||||
1. Validates `to` against peer list
|
||||
2. Serializes payload to XML
|
||||
3. Creates new envelope with injected `<from>`
|
||||
4. Re-injects into pipeline
|
||||
|
||||
## Trust Boundaries
|
||||
|
||||
### What the System Controls
|
||||
|
||||
| Aspect | System Responsibility |
|
||||
|--------|----------------------|
|
||||
| `<from>` | Always injected from listener.name |
|
||||
| `<thread>` | Managed by thread registry |
|
||||
| `<to>` validation | Checked against peers list |
|
||||
| Schema enforcement | XSD validation on every message |
|
||||
| Call chain | Private, never exposed to handlers |
|
||||
|
||||
### What Handlers Control
|
||||
|
||||
| Aspect | Handler Capability |
|
||||
|--------|-------------------|
|
||||
| Payload content | Full control |
|
||||
| Target selection | Via `HandlerResponse.to` (validated) |
|
||||
| Response/no response | Return value |
|
||||
| Self-iteration | Call own name |
|
||||
|
||||
### What Handlers Cannot Do
|
||||
|
||||
- Forge sender identity
|
||||
- Access other threads
|
||||
- Discover topology
|
||||
- Route to undeclared peers
|
||||
- Modify message history
|
||||
- Access other handlers' state
|
||||
|
||||
## Multiprocess Architecture
|
||||
|
||||
For CPU-bound handlers:
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Main Process (StreamPump) │
|
||||
│ - Ingress pipeline │
|
||||
│ - Routing decisions │
|
||||
│ - Response re-injection │
|
||||
└───────────────────────────┬─────────────────────────────────────┘
|
||||
│ UUID + handler_path (minimal IPC)
|
||||
┌─────────────┼─────────────┐
|
||||
▼ ▼ ▼
|
||||
┌─────────────────┐ ┌─────────────┐ ┌─────────────────┐
|
||||
│ Python Async │ │ ProcessPool │ │ (Future: WASM) │
|
||||
│ (main process) │ │ (N workers) │ │ │
|
||||
│ - Default mode │ │ - cpu_bound │ │ │
|
||||
└────────┬────────┘ └──────┬──────┘ └────────┬────────┘
|
||||
│ │ │
|
||||
└─────────────────┼──────────────────┘
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Shared Backend (Redis / Manager / Memory) │
|
||||
│ - Context buffer slots │
|
||||
│ - Thread registry mappings │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
See [[Shared Backend]] for details.
|
||||
|
||||
## See Also
|
||||
|
||||
- [[Message Pump]] — Detailed pump architecture
|
||||
- [[Thread Registry]] — Call chain tracking
|
||||
- [[Shared Backend]] — Cross-process state
|
||||
- [[Handler Contract]] — Handler specification
|
||||
339
docs/wiki/architecture/Shared-Backend.md
Normal file
|
|
@ -0,0 +1,339 @@
|
|||
# Shared Backend
|
||||
|
||||
The Shared Backend enables cross-process state sharing for multiprocess deployments. It provides storage for the Context Buffer and Thread Registry.
|
||||
|
||||
## Overview
|
||||
|
||||
By default, xml-pipeline uses in-memory storage (single process). For CPU-bound handlers running in separate processes, you need shared state:
|
||||
|
||||
```
┌────────────────────┐     ┌────────────────────┐
│    Main Process    │     │   Worker Process   │
│    (StreamPump)    │     │    (cpu_bound)     │
└─────────┬──────────┘     └──────────┬─────────┘
          │                           │
          └───────────┬───────────────┘
                      │
                      ▼
           ┌─────────────────────┐
           │   Shared Backend    │
           │   (Redis/Manager)   │
           └─────────────────────┘
```
|
||||
|
||||
## Backend Types
|
||||
|
||||
### InMemoryBackend (Default)
|
||||
|
||||
Single-process, thread-safe storage using Python dictionaries.
|
||||
|
||||
```python
|
||||
from xml_pipeline.memory import get_shared_backend, BackendConfig
|
||||
|
||||
config = BackendConfig(backend_type="memory")
|
||||
backend = get_shared_backend(config)
|
||||
```
|
||||
|
||||
**Use when:**
|
||||
- Single process deployment
|
||||
- Development/testing
|
||||
- No CPU-bound handlers
|
||||
|
||||
### ManagerBackend
|
||||
|
||||
Uses `multiprocessing.Manager` for local multi-process sharing.
|
||||
|
||||
```python
|
||||
config = BackendConfig(backend_type="manager")
|
||||
backend = get_shared_backend(config)
|
||||
```
|
||||
|
||||
**Use when:**
|
||||
- Local deployment with CPU-bound handlers
|
||||
- No Redis available
|
||||
- Single machine, multiple processes
|
||||
|
||||
### RedisBackend
|
||||
|
||||
Distributed storage with TTL-based auto-cleanup.
|
||||
|
||||
```python
|
||||
config = BackendConfig(
|
||||
backend_type="redis",
|
||||
redis_url="redis://localhost:6379",
|
||||
redis_prefix="xp:",
|
||||
redis_ttl=86400, # 24 hours
|
||||
)
|
||||
backend = get_shared_backend(config)
|
||||
```
|
||||
|
||||
**Use when:**
|
||||
- Distributed deployment
|
||||
- Multiple machines
|
||||
- Need persistence
|
||||
- Production environments
|
||||
|
||||
## Configuration
|
||||
|
||||
### Via organism.yaml
|
||||
|
||||
```yaml
backend:
  type: redis                          # memory | manager | redis
  redis_url: "redis://localhost:6379"  # Redis connection URL
  redis_prefix: "xp:"                  # Key prefix for multi-tenancy
  redis_ttl: 86400                     # Key TTL in seconds
```
|
||||
|
||||
### Programmatic
|
||||
|
||||
```python
|
||||
from xml_pipeline.memory import get_shared_backend, BackendConfig
|
||||
|
||||
config = BackendConfig(
|
||||
backend_type="redis",
|
||||
redis_url="redis://localhost:6379",
|
||||
redis_prefix="myapp:",
|
||||
redis_ttl=3600,
|
||||
)
|
||||
backend = get_shared_backend(config)
|
||||
```
|
||||
|
||||
## Storage Schema
|
||||
|
||||
### Context Buffer
|
||||
|
||||
Stores message history per thread.
|
||||
|
||||
**In-Memory/Manager:**
|
||||
```python
|
||||
_buffers = {
|
||||
"thread-uuid-1": [slot_bytes_0, slot_bytes_1, ...],
|
||||
"thread-uuid-2": [...],
|
||||
}
|
||||
```
|
||||
|
||||
**Redis:**
|
||||
```
|
||||
{prefix}buffer:{thread_id} → LIST of pickled BufferSlots
|
||||
```
|
||||
|
||||
### Thread Registry
|
||||
|
||||
Maps UUIDs to call chains.
|
||||
|
||||
**In-Memory/Manager:**
|
||||
```python
|
||||
_chain_to_uuid = {"console.greeter": "uuid-123"}
|
||||
_uuid_to_chain = {"uuid-123": "console.greeter"}
|
||||
```
|
||||
|
||||
**Redis:**
|
||||
```
|
||||
{prefix}chain:{chain} → {uuid}
|
||||
{prefix}uuid:{uuid} → {chain}
|
||||
```
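The key layouts above can be expressed as small formatting helpers. The function names are hypothetical; only the key patterns come from the schema shown here.

```python
def buffer_key(prefix: str, thread_id: str) -> str:
    """Key of the Redis LIST holding a thread's buffer slots."""
    return f"{prefix}buffer:{thread_id}"


def chain_key(prefix: str, chain: str) -> str:
    """Key mapping a call chain to its UUID."""
    return f"{prefix}chain:{chain}"


def uuid_key(prefix: str, thread_uuid: str) -> str:
    """Key mapping a UUID back to its call chain."""
    return f"{prefix}uuid:{thread_uuid}"


key_a = buffer_key("orgA:", "uuid-123")
key_b = buffer_key("orgB:", "uuid-123")
```

Because the prefix leads every key, two organisms sharing one Redis instance never collide even when their thread IDs are identical.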
|
||||
|
||||
## API
|
||||
|
||||
### Buffer Operations
|
||||
|
||||
```python
|
||||
# Append a slot
|
||||
index = backend.buffer_append(thread_id, slot_bytes)
|
||||
|
||||
# Get all slots for thread
|
||||
slots = backend.buffer_get_thread(thread_id)
|
||||
|
||||
# Get specific slot
|
||||
slot = backend.buffer_get_slot(thread_id, index)
|
||||
|
||||
# Check thread exists
|
||||
exists = backend.buffer_thread_exists(thread_id)
|
||||
|
||||
# Delete thread
|
||||
deleted = backend.buffer_delete_thread(thread_id)
|
||||
|
||||
# List all threads
|
||||
threads = backend.buffer_list_threads()
|
||||
|
||||
# Clear all (testing)
|
||||
backend.buffer_clear()
|
||||
```
|
||||
|
||||
### Registry Operations
|
||||
|
||||
```python
|
||||
# Set chain ↔ UUID mapping
|
||||
backend.registry_set(chain, uuid)
|
||||
|
||||
# Get UUID from chain
|
||||
uuid = backend.registry_get_uuid(chain)
|
||||
|
||||
# Get chain from UUID
|
||||
chain = backend.registry_get_chain(uuid)
|
||||
|
||||
# Delete mapping
|
||||
deleted = backend.registry_delete(uuid)
|
||||
|
||||
# List all mappings
|
||||
all_mappings = backend.registry_list_all()
|
||||
|
||||
# Clear all (testing)
|
||||
backend.registry_clear()
|
||||
```
|
||||
|
||||
### Serialization
|
||||
|
||||
Slots are serialized using pickle:
|
||||
|
||||
```python
|
||||
from xml_pipeline.memory import serialize_slot, deserialize_slot
|
||||
|
||||
# Serialize for storage
|
||||
slot_bytes = serialize_slot(buffer_slot)
|
||||
|
||||
# Deserialize after retrieval
|
||||
slot = deserialize_slot(slot_bytes)
|
||||
```
|
||||
|
||||
## Integration
|
||||
|
||||
### With ContextBuffer
|
||||
|
||||
```python
|
||||
from xml_pipeline.memory import get_context_buffer
|
||||
|
||||
# Uses shared backend automatically if configured
|
||||
buffer = get_context_buffer(backend=backend)
|
||||
|
||||
# Check if using shared storage
|
||||
print(buffer.is_shared) # True
|
||||
```
|
||||
|
||||
### With ThreadRegistry
|
||||
|
||||
```python
|
||||
from xml_pipeline.message_bus.thread_registry import get_registry
|
||||
|
||||
registry = get_registry(backend=backend)
|
||||
|
||||
# Check if using shared storage
|
||||
print(registry.is_shared) # True
|
||||
```
|
||||
|
||||
### With StreamPump
|
||||
|
||||
The pump automatically uses the configured backend:
|
||||
|
||||
```yaml
backend:
  type: redis
  redis_url: "redis://localhost:6379"

process_pool:
  workers: 4

listeners:
  - name: analyzer
    cpu_bound: true  # Uses shared backend for data exchange
```
|
||||
|
||||
## Worker Data Flow
|
||||
|
||||
For CPU-bound handlers, data flows through the backend:
|
||||
|
||||
```
|
||||
1. Main Process
|
||||
├── Serialize payload + metadata
|
||||
├── Store in backend (payload_uuid, metadata_uuid)
|
||||
└── Submit WorkerTask to ProcessPool
|
||||
|
||||
2. Worker Process
|
||||
├── Fetch payload + metadata from backend
|
||||
├── Execute handler
|
||||
├── Store response in backend (response_uuid)
|
||||
└── Return WorkerResult
|
||||
|
||||
3. Main Process
|
||||
├── Fetch response from backend
|
||||
├── Clean up temporary data
|
||||
└── Process response normally
|
||||
```
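The three phases above can be walked through in a single process. In this sketch a plain dict stands in for the shared backend and the worker is called directly rather than through a process pool; `store`, `fetch`, `worker`, and `add` are hypothetical names, and pickle is used because the Serialization section says slots are pickled.

```python
import pickle
import uuid


def store(backend: dict, obj) -> str:
    """Serialize obj into the backend and return the key it was stored under."""
    key = str(uuid.uuid4())
    backend[key] = pickle.dumps(obj)
    return key


def fetch(backend: dict, key: str, cleanup: bool = False):
    """Load an object; with cleanup=True the temporary entry is removed."""
    data = backend.pop(key) if cleanup else backend[key]
    return pickle.loads(data)


def worker(backend: dict, payload_key: str, handler) -> str:
    # Only keys cross the process boundary; the data itself stays in the backend
    payload = fetch(backend, payload_key)
    return store(backend, handler(payload))


def add(payload: dict) -> int:
    return payload["a"] + payload["b"]


backend = {}
payload_key = store(backend, {"a": 2, "b": 3})        # 1. main process stores
response_key = worker(backend, payload_key, add)      # 2. worker executes
result = fetch(backend, response_key, cleanup=True)   # 3. main process fetches
backend.pop(payload_key)                              #    and cleans up
```

Passing keys instead of payloads keeps the inter-process message small regardless of payload size.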
|
||||
|
||||
## TTL and Cleanup
|
||||
|
||||
### Redis TTL
|
||||
|
||||
Redis keys automatically expire:
|
||||
|
||||
```yaml
backend:
  redis_ttl: 86400  # Keys expire after 24 hours
```
|
||||
|
||||
### Manual Cleanup
|
||||
|
||||
```python
|
||||
# Delete specific thread
|
||||
backend.buffer_delete_thread(thread_id)
|
||||
backend.registry_delete(uuid)
|
||||
|
||||
# Clear all (testing only)
|
||||
backend.buffer_clear()
|
||||
backend.registry_clear()
|
||||
```
|
||||
|
||||
## Multi-Tenancy
|
||||
|
||||
Use prefixes to isolate different organisms:
|
||||
|
||||
```yaml
# Organism A
backend:
  type: redis
  redis_prefix: "orgA:"
---
# Organism B (separate organism.yaml)
backend:
  type: redis
  redis_prefix: "orgB:"
```
|
||||
|
||||
## Monitoring
|
||||
|
||||
### Redis Info
|
||||
|
||||
```python
|
||||
info = backend.info()
|
||||
# {'buffer_threads': 5, 'registry_entries': 12}
|
||||
```
|
||||
|
||||
### Health Check
|
||||
|
||||
```python
|
||||
is_healthy = backend.ping() # True if connected
|
||||
```
|
||||
|
||||
## Testing
|
||||
|
||||
```python
import pytest
from xml_pipeline.memory import InMemoryBackend

@pytest.fixture
def backend():
    backend = InMemoryBackend()
    yield backend
    backend.close()

def test_buffer_operations(backend):
    backend.buffer_append("thread-1", b"data")
    assert backend.buffer_thread_exists("thread-1")
```
|
||||
|
||||
## See Also
|
||||
|
||||
- [[Architecture Overview]] — High-level architecture
|
||||
- [[Message Pump]] — How the pump uses backends
|
||||
- [[Configuration]] — Backend configuration options
|
||||
261
docs/wiki/architecture/Thread-Registry.md
Normal file
|
|
@ -0,0 +1,261 @@
|
|||
# Thread Registry
|
||||
|
||||
The Thread Registry maps opaque UUIDs to call chains, enabling thread tracking while hiding topology from handlers.
|
||||
|
||||
## Purpose
|
||||
|
||||
When agents communicate, they form call chains:
|
||||
|
||||
```
|
||||
console → greeter → calculator → back to greeter → shouter
|
||||
```
|
||||
|
||||
The registry:
|
||||
1. **Tracks call chains** for routing responses
|
||||
2. **Provides opaque UUIDs** to handlers (hiding topology)
|
||||
3. **Manages chain pruning** when handlers respond
|
||||
|
||||
## Concepts
|
||||
|
||||
### Call Chain
|
||||
|
||||
A dot-separated path showing message flow:
|
||||
|
||||
```
system.organism.console.greeter.calculator
│      │        │       │       │
│      │        │       │       └─ Current position
│      │        │       └─ Greeter called calculator
│      │        └─ Console called greeter
│      └─ Organism name
└─ Root
```
|
||||
|
||||
### Opaque UUID
|
||||
|
||||
What handlers actually see:
|
||||
|
||||
```
|
||||
550e8400-e29b-41d4-a716-446655440000
|
||||
```
|
||||
|
||||
Handlers never see the actual chain. This prevents:
|
||||
- Topology probing
|
||||
- Call chain forgery
|
||||
- Thread hijacking
|
||||
|
||||
## API
|
||||
|
||||
### Initialize Root
|
||||
|
||||
At boot time:
|
||||
|
||||
```python
|
||||
from xml_pipeline.message_bus.thread_registry import get_registry
|
||||
|
||||
registry = get_registry()
|
||||
root_uuid = registry.initialize_root("my-organism")
|
||||
# Creates: system.my-organism → uuid
|
||||
```
|
||||
|
||||
### Get or Create
|
||||
|
||||
Get UUID for a chain (creates if needed):
|
||||
|
||||
```python
|
||||
uuid = registry.get_or_create("console.greeter")
|
||||
# Returns: existing UUID or creates new one
|
||||
```
|
||||
|
||||
### Lookup
|
||||
|
||||
Get chain for a UUID:
|
||||
|
||||
```python
|
||||
chain = registry.lookup(uuid)
|
||||
# Returns: "console.greeter" or None
|
||||
```
|
||||
|
||||
### Extend Chain
|
||||
|
||||
When forwarding to a new handler:
|
||||
|
||||
```python
|
||||
new_uuid = registry.extend_chain(current_uuid, "calculator")
|
||||
# Before: console.greeter (uuid-123)
|
||||
# After: console.greeter.calculator (uuid-456)
|
||||
```
|
||||
|
||||
### Prune for Response
|
||||
|
||||
When a handler returns `.respond()`:
|
||||
|
||||
```python
|
||||
target, new_uuid = registry.prune_for_response(current_uuid)
|
||||
# Before: console.greeter.calculator (uuid-456)
|
||||
# After: console.greeter (uuid-123)
|
||||
# target: "greeter"
|
||||
```
|
||||
|
||||
### Register External Thread
|
||||
|
||||
For messages arriving with pre-assigned UUIDs:
|
||||
|
||||
```python
registry.register_thread(
    thread_id="external-uuid",
    initiator="console",
    target="greeter"
)
# Creates: system.organism.console.greeter → external-uuid
```
|
||||
|
||||
## Thread Lifecycle
|
||||
|
||||
### Creation
|
||||
|
||||
```
1. External message arrives without thread
   │
   ▼
2. thread_assignment_step generates UUID
   │
   ▼
3. Registry maps: chain → UUID
```
|
||||
|
||||
### Extension
|
||||
|
||||
```
1. Handler A forwards to Handler B
   │
   ▼
2. Pump calls extend_chain(uuid_A, "B")
   │
   ▼
3. Registry creates: chain.B → uuid_B
```
|
||||
|
||||
### Pruning
|
||||
|
||||
```
1. Handler B calls .respond()
   │
   ▼
2. Pump calls prune_for_response(uuid_B)
   │
   ▼
3. Registry:
   - Looks up chain: "...A.B"
   - Prunes last segment: "...A"
   - Returns target "A" and uuid_A
   │
   ▼
4. Response routed to Handler A
```
|
||||
|
||||
### Cleanup
|
||||
|
||||
```
1. Chain exhausted (root reached) or
   Handler returns None
   │
   ▼
2. UUID mapping removed
   │
   ▼
3. Context buffer for thread deleted
```
|
||||
|
||||
## Shared Backend Support
|
||||
|
||||
For multiprocess deployments, the registry can use a shared backend:
|
||||
|
||||
```python
|
||||
from xml_pipeline.memory.shared_backend import get_shared_backend, BackendConfig
|
||||
|
||||
# Use Redis for distributed deployments
|
||||
config = BackendConfig(backend_type="redis", redis_url="redis://localhost:6379")
|
||||
backend = get_shared_backend(config)
|
||||
registry = get_registry(backend=backend)
|
||||
```
|
||||
|
||||
### Storage Schema (Redis)
|
||||
|
||||
```
|
||||
xp:chain:{chain} → {uuid} # Chain to UUID
|
||||
xp:uuid:{uuid} → {chain} # UUID to Chain
|
||||
```
|
||||
|
||||
## Security Properties
|
||||
|
||||
### What Handlers See
|
||||
|
||||
```python
|
||||
metadata.thread_id = "550e8400-..." # Opaque UUID
|
||||
metadata.from_id = "greeter" # Only immediate caller
|
||||
```
|
||||
|
||||
### What Handlers Don't See
|
||||
|
||||
- Full call chain
|
||||
- Other thread UUIDs
|
||||
- Thread count or topology
|
||||
- Parent/child relationships
|
||||
|
||||
### Why This Matters
|
||||
|
||||
Even compromised handlers cannot:
|
||||
- **Forge thread IDs** — UUIDs are cryptographically random
|
||||
- **Discover topology** — Chain hidden behind UUID
|
||||
- **Hijack threads** — Registry validates all operations
|
||||
- **Probe other threads** — No enumeration API
|
||||
|
||||
## Debugging
|
||||
|
||||
For operators (not exposed to handlers):
|
||||
|
||||
```python
|
||||
# Dump all mappings
|
||||
chains = registry.debug_dump()
|
||||
# {'uuid-123': 'console.greeter', 'uuid-456': 'console.greeter.calc'}
|
||||
|
||||
# Clear (testing only)
|
||||
registry.clear()
|
||||
```
|
||||
|
||||
## Example Flow
|
||||
|
||||
```
1. Console sends @greeter hello
   ├── UUID assigned: uuid-1
   └── Chain: system.org.console.greeter

2. Greeter forwards to calculator
   ├── extend_chain(uuid-1, "calculator")
   ├── New UUID: uuid-2
   └── Chain: system.org.console.greeter.calculator

3. Calculator responds
   ├── prune_for_response(uuid-2)
   ├── Target: "greeter"
   └── UUID: uuid-1 (back to greeter's context)

4. Greeter responds
   ├── prune_for_response(uuid-1)
   ├── Target: "console"
   └── Chain exhausted → cleanup
```
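
The flow above can be replayed with a toy in-memory registry. This is only a sketch of the extend/prune bookkeeping, not the real `thread_registry` implementation: `ToyRegistry` is a hypothetical class, it skips cleanup when the chain is exhausted, and it assumes listener names contain no dots.

```python
import uuid


class ToyRegistry:
    """Illustrative only: maps opaque UUIDs to dot-separated call chains."""

    def __init__(self) -> None:
        self._chain_by_uuid: dict[str, str] = {}
        self._uuid_by_chain: dict[str, str] = {}

    def get_or_create(self, chain: str) -> str:
        if chain not in self._uuid_by_chain:
            new_id = str(uuid.uuid4())
            self._uuid_by_chain[chain] = new_id
            self._chain_by_uuid[new_id] = chain
        return self._uuid_by_chain[chain]

    def lookup(self, thread_id: str):
        return self._chain_by_uuid.get(thread_id)

    def extend_chain(self, thread_id: str, target: str) -> str:
        chain = self._chain_by_uuid[thread_id]
        return self.get_or_create(f"{chain}.{target}")

    def prune_for_response(self, thread_id: str):
        # Drop the last segment; the new last segment is the response target.
        chain = self._chain_by_uuid[thread_id]
        parent = chain.rpartition(".")[0]
        target = parent.rpartition(".")[2]
        return target, self.get_or_create(parent)


registry = ToyRegistry()
uuid_1 = registry.get_or_create("system.org.console.greeter")  # step 1
uuid_2 = registry.extend_chain(uuid_1, "calculator")           # step 2
target, back = registry.prune_for_response(uuid_2)             # step 3
```

After step 3, `target` names the greeter and `back` is the same UUID the greeter was originally given, which is how the response lands in the greeter's existing context.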
|
||||
|
||||
## Configuration
|
||||
|
||||
No explicit configuration needed. The registry:
|
||||
- Initializes automatically at pump startup
|
||||
- Uses shared backend if configured
|
||||
- Cleans up on thread termination
|
||||
|
||||
## See Also
|
||||
|
||||
- [[Architecture Overview]] — High-level architecture
|
||||
- [[Message Pump]] — How the pump uses the registry
|
||||
- [[Shared Backend]] — Cross-process storage
|
||||
76
docs/wiki/convert-to-xwiki.ps1
Normal file
|
|
@ -0,0 +1,76 @@
|
|||
# Convert Markdown wiki docs to XWiki format using Pandoc
|
||||
#
|
||||
# Prerequisites:
|
||||
# - Pandoc installed (https://pandoc.org/installing.html)
|
||||
# - Run from docs/wiki directory
|
||||
#
|
||||
# Usage:
|
||||
# .\convert-to-xwiki.ps1
|
||||
#
|
||||
# Output:
|
||||
# Creates xwiki/ directory with converted files
|
||||
|
||||
$ErrorActionPreference = "Stop"
|
||||
|
||||
$ScriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path
|
||||
$OutputDir = Join-Path $ScriptDir "xwiki"
|
||||
|
||||
Write-Host "Converting Markdown to XWiki format..."
|
||||
Write-Host "Output directory: $OutputDir"
|
||||
Write-Host ""
|
||||
|
||||
# Create output directory structure
|
||||
New-Item -ItemType Directory -Force -Path $OutputDir | Out-Null
|
||||
New-Item -ItemType Directory -Force -Path (Join-Path $OutputDir "architecture") | Out-Null
|
||||
New-Item -ItemType Directory -Force -Path (Join-Path $OutputDir "reference") | Out-Null
|
||||
New-Item -ItemType Directory -Force -Path (Join-Path $OutputDir "tutorials") | Out-Null
|
||||
|
||||
# Function to convert a single file
|
||||
function Convert-File {
    param (
        [string]$InputPath,
        [string]$OutputPath
    )

    # $Input is a PowerShell automatic variable, so the parameters use distinct names.
    if (Test-Path $InputPath) {
        Write-Host "Converting: $InputPath"
        pandoc -f markdown -t xwiki $InputPath -o $OutputPath
    }
}
|
||||
|
||||
# Convert root level docs
|
||||
Convert-File (Join-Path $ScriptDir "Home.md") (Join-Path $OutputDir "Home.xwiki")
|
||||
Convert-File (Join-Path $ScriptDir "Installation.md") (Join-Path $OutputDir "Installation.xwiki")
|
||||
Convert-File (Join-Path $ScriptDir "Quick-Start.md") (Join-Path $OutputDir "Quick-Start.xwiki")
|
||||
Convert-File (Join-Path $ScriptDir "Writing-Handlers.md") (Join-Path $OutputDir "Writing-Handlers.xwiki")
|
||||
Convert-File (Join-Path $ScriptDir "LLM-Router.md") (Join-Path $OutputDir "LLM-Router.xwiki")
|
||||
Convert-File (Join-Path $ScriptDir "Why-XML.md") (Join-Path $OutputDir "Why-XML.xwiki")
|
||||
|
||||
# Convert architecture docs
|
||||
Convert-File (Join-Path $ScriptDir "architecture\Overview.md") (Join-Path $OutputDir "architecture\Overview.xwiki")
|
||||
Convert-File (Join-Path $ScriptDir "architecture\Message-Pump.md") (Join-Path $OutputDir "architecture\Message-Pump.xwiki")
|
||||
Convert-File (Join-Path $ScriptDir "architecture\Thread-Registry.md") (Join-Path $OutputDir "architecture\Thread-Registry.xwiki")
|
||||
Convert-File (Join-Path $ScriptDir "architecture\Shared-Backend.md") (Join-Path $OutputDir "architecture\Shared-Backend.xwiki")
|
||||
|
||||
# Convert reference docs
|
||||
Convert-File (Join-Path $ScriptDir "reference\Configuration.md") (Join-Path $OutputDir "reference\Configuration.xwiki")
|
||||
Convert-File (Join-Path $ScriptDir "reference\Handler-Contract.md") (Join-Path $OutputDir "reference\Handler-Contract.xwiki")
|
||||
Convert-File (Join-Path $ScriptDir "reference\CLI.md") (Join-Path $OutputDir "reference\CLI.xwiki")
|
||||
|
||||
# Convert tutorials
|
||||
Convert-File (Join-Path $ScriptDir "tutorials\Hello-World.md") (Join-Path $OutputDir "tutorials\Hello-World.xwiki")
|
||||
|
||||
Write-Host ""
|
||||
Write-Host "Conversion complete!"
|
||||
Write-Host ""
|
||||
Write-Host "Files created in: $OutputDir"
|
||||
Write-Host ""
|
||||
Write-Host "Next steps:"
|
||||
Write-Host " 1. Review the converted files"
|
||||
Write-Host " 2. Upload to XWiki via REST API or import feature"
|
||||
Write-Host ""
|
||||
Write-Host "XWiki REST API example:"
|
||||
Write-Host " curl -u admin:password -X PUT ``"
|
||||
Write-Host " 'https://xml-pipeline.org/rest/wikis/xwiki/spaces/Docs/pages/Home' ``"
|
||||
Write-Host " -H 'Content-Type: text/plain' ``"
|
||||
Write-Host " -d @$OutputDir\Home.xwiki"
|
||||
75
docs/wiki/convert-to-xwiki.sh
Normal file
|
|
@ -0,0 +1,75 @@
|
|||
#!/bin/bash
|
||||
# Convert Markdown wiki docs to XWiki format using Pandoc
|
||||
#
|
||||
# Prerequisites:
|
||||
# - Pandoc installed (https://pandoc.org/installing.html)
|
||||
# - Run from docs/wiki directory
|
||||
#
|
||||
# Usage:
|
||||
# ./convert-to-xwiki.sh
|
||||
#
|
||||
# Output:
|
||||
# Creates xwiki/ directory with converted files
|
||||
|
||||
set -e
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
OUTPUT_DIR="$SCRIPT_DIR/xwiki"
|
||||
|
||||
echo "Converting Markdown to XWiki format..."
|
||||
echo "Output directory: $OUTPUT_DIR"
|
||||
echo ""
|
||||
|
||||
# Create output directory structure
|
||||
mkdir -p "$OUTPUT_DIR"
|
||||
mkdir -p "$OUTPUT_DIR/architecture"
|
||||
mkdir -p "$OUTPUT_DIR/reference"
|
||||
mkdir -p "$OUTPUT_DIR/tutorials"
|
||||
|
||||
# Function to convert a single file
|
||||
convert_file() {
    local input="$1"
    local output="$2"

    if [ -f "$input" ]; then
        echo "Converting: $input"
        pandoc -f markdown -t xwiki "$input" -o "$output"
    fi
}
|
||||
|
||||
# Convert root level docs
|
||||
convert_file "$SCRIPT_DIR/Home.md" "$OUTPUT_DIR/Home.xwiki"
|
||||
convert_file "$SCRIPT_DIR/Installation.md" "$OUTPUT_DIR/Installation.xwiki"
|
||||
convert_file "$SCRIPT_DIR/Quick-Start.md" "$OUTPUT_DIR/Quick-Start.xwiki"
|
||||
convert_file "$SCRIPT_DIR/Writing-Handlers.md" "$OUTPUT_DIR/Writing-Handlers.xwiki"
|
||||
convert_file "$SCRIPT_DIR/LLM-Router.md" "$OUTPUT_DIR/LLM-Router.xwiki"
|
||||
convert_file "$SCRIPT_DIR/Why-XML.md" "$OUTPUT_DIR/Why-XML.xwiki"
|
||||
|
||||
# Convert architecture docs
|
||||
convert_file "$SCRIPT_DIR/architecture/Overview.md" "$OUTPUT_DIR/architecture/Overview.xwiki"
|
||||
convert_file "$SCRIPT_DIR/architecture/Message-Pump.md" "$OUTPUT_DIR/architecture/Message-Pump.xwiki"
|
||||
convert_file "$SCRIPT_DIR/architecture/Thread-Registry.md" "$OUTPUT_DIR/architecture/Thread-Registry.xwiki"
|
||||
convert_file "$SCRIPT_DIR/architecture/Shared-Backend.md" "$OUTPUT_DIR/architecture/Shared-Backend.xwiki"
|
||||
|
||||
# Convert reference docs
|
||||
convert_file "$SCRIPT_DIR/reference/Configuration.md" "$OUTPUT_DIR/reference/Configuration.xwiki"
|
||||
convert_file "$SCRIPT_DIR/reference/Handler-Contract.md" "$OUTPUT_DIR/reference/Handler-Contract.xwiki"
|
||||
convert_file "$SCRIPT_DIR/reference/CLI.md" "$OUTPUT_DIR/reference/CLI.xwiki"
|
||||
|
||||
# Convert tutorials
|
||||
convert_file "$SCRIPT_DIR/tutorials/Hello-World.md" "$OUTPUT_DIR/tutorials/Hello-World.xwiki"
|
||||
|
||||
echo ""
|
||||
echo "Conversion complete!"
|
||||
echo ""
|
||||
echo "Files created in: $OUTPUT_DIR"
|
||||
echo ""
|
||||
echo "Next steps:"
|
||||
echo " 1. Review the converted files"
|
||||
echo " 2. Upload to XWiki via REST API or import feature"
|
||||
echo ""
|
||||
echo "XWiki REST API example:"
|
||||
echo " curl -u admin:password -X PUT \\"
|
||||
echo " 'https://xml-pipeline.org/rest/wikis/xwiki/spaces/Docs/pages/Home' \\"
|
||||
echo " -H 'Content-Type: text/plain' \\"
|
||||
echo " -d @$OUTPUT_DIR/Home.xwiki"
|
||||
129
docs/wiki/reference/CLI.md
Normal file
|
|
@ -0,0 +1,129 @@
|
|||
# CLI Reference
|
||||
|
||||
xml-pipeline provides a command-line interface for running and managing organisms.
|
||||
|
||||
## Commands
|
||||
|
||||
### xml-pipeline run
|
||||
|
||||
Run an organism from a configuration file.
|
||||
|
||||
```bash
|
||||
xml-pipeline run [CONFIG_PATH]
|
||||
```
|
||||
|
||||
**Arguments:**
|
||||
- `CONFIG_PATH` — Path to organism.yaml (default: `config/organism.yaml`)
|
||||
|
||||
**Examples:**
|
||||
```bash
|
||||
xml-pipeline run config/organism.yaml
|
||||
xml-pipeline run # Uses default path
|
||||
xp run config/my-organism.yaml # Short alias
|
||||
```
|
||||
|
||||
### xml-pipeline init
|
||||
|
||||
Create a new organism configuration from template.
|
||||
|
||||
```bash
|
||||
xml-pipeline init [NAME]
|
||||
```
|
||||
|
||||
**Arguments:**
|
||||
- `NAME` — Organism name (default: `my-organism`)
|
||||
|
||||
**Creates:**
|
||||
```
{NAME}/
├── config/
│   └── organism.yaml
├── handlers/
│   └── hello.py
└── .env.example
```
|
||||
|
||||
**Examples:**
|
||||
```bash
|
||||
xml-pipeline init my-agent
|
||||
xml-pipeline init # Uses default name
|
||||
```
|
||||
|
||||
### xml-pipeline check
|
||||
|
||||
Validate configuration without running.
|
||||
|
||||
```bash
|
||||
xml-pipeline check [CONFIG_PATH]
|
||||
```
|
||||
|
||||
**Arguments:**
|
||||
- `CONFIG_PATH` — Path to organism.yaml (default: `config/organism.yaml`)
|
||||
|
||||
**Output:**
|
||||
```
|
||||
Config valid: hello-world
|
||||
Listeners: 3
|
||||
LLM backends: 1
|
||||
```
|
||||
|
||||
**Examples:**
|
||||
```bash
|
||||
xml-pipeline check config/organism.yaml
|
||||
xml-pipeline check # Uses default path
|
||||
```
|
||||
|
||||
### xml-pipeline version
|
||||
|
||||
Show version and installed features.
|
||||
|
||||
```bash
|
||||
xml-pipeline version
|
||||
```
|
||||
|
||||
**Output:**
|
||||
```
|
||||
xml-pipeline 0.4.0
|
||||
Python 3.11.5
|
||||
Features: anthropic, console, redis, search
|
||||
```
|
||||
|
||||
## Short Alias
|
||||
|
||||
The `xp` command is a short alias for `xml-pipeline`:
|
||||
|
||||
```bash
|
||||
xp run config/organism.yaml
|
||||
xp init my-agent
|
||||
xp check
|
||||
xp version
|
||||
```
|
||||
|
||||
## Environment Variables
|
||||
|
||||
| Variable | Description |
|----------|-------------|
| `XAI_API_KEY` | xAI (Grok) API key |
| `ANTHROPIC_API_KEY` | Anthropic (Claude) API key |
| `OPENAI_API_KEY` | OpenAI API key |
|
||||
|
||||
Create a `.env` file in your project root:
|
||||
|
||||
```env
|
||||
XAI_API_KEY=xai-...
|
||||
ANTHROPIC_API_KEY=sk-ant-...
|
||||
```
|
||||
|
||||
## Exit Codes
|
||||
|
||||
| Code | Meaning |
|------|---------|
| 0 | Success |
| 1 | Configuration error |
| 2 | Runtime error |
|
||||
|
||||
## See Also
|
||||
|
||||
- [[Installation]] — Installing xml-pipeline
|
||||
- [[Quick Start]] — Getting started
|
||||
- [[Configuration]] — Configuration reference
|
||||
196
docs/wiki/reference/Configuration.md
Normal file
|
|
@ -0,0 +1,196 @@
|
|||
# Configuration Reference
|
||||
|
||||
Organisms are configured via YAML files. The default location is `config/organism.yaml`.
|
||||
|
||||
## Minimal Configuration
|
||||
|
||||
```yaml
organism:
  name: my-organism

listeners:
  - name: greeter
    payload_class: handlers.hello.Greeting
    handler: handlers.hello.handle_greeting
    description: Greeting handler
```
|
||||
|
||||
## Full Configuration Reference
|
||||
|
||||
```yaml
|
||||
# ============================================================
# ORGANISM SECTION
# Core identity and network settings
# ============================================================
organism:
  name: "my-organism"              # Human-readable name (required)
  port: 8765                       # WebSocket port (optional)
  identity: "config/identity.key"  # Ed25519 private key path (optional)
  tls:                             # TLS settings (optional)
    cert: "certs/fullchain.pem"
    key: "certs/privkey.pem"
|
||||
|
||||
# ============================================================
# LLM SECTION
# Language model routing configuration
# ============================================================
llm:
  strategy: failover               # failover | round-robin | least-loaded
  retries: 3                       # Max retry attempts
  retry_base_delay: 1.0            # Base delay for exponential backoff
  retry_max_delay: 60.0            # Maximum delay between retries

  backends:
    - provider: xai                # xai | anthropic | openai | ollama
      api_key_env: XAI_API_KEY     # Environment variable name
      priority: 1                  # Lower = preferred (for failover)
      rate_limit_tpm: 100000       # Tokens per minute limit
      max_concurrent: 20           # Max concurrent requests

    - provider: anthropic
      api_key_env: ANTHROPIC_API_KEY
      priority: 2

    - provider: ollama
      base_url: http://localhost:11434
      supported_models: [llama3, mistral]
|
||||
|
||||
# ============================================================
# BACKEND SECTION (Optional)
# Shared state for multiprocess deployments
# ============================================================
backend:
  type: memory                     # memory | manager | redis
  # Redis-specific settings (when type: redis)
  redis_url: "redis://localhost:6379"
  redis_prefix: "xp:"              # Key prefix for multi-tenancy
  redis_ttl: 86400                 # TTL in seconds (24 hours)
|
||||
|
||||
# ============================================================
# PROCESS POOL SECTION (Optional)
# Worker processes for CPU-bound handlers
# ============================================================
process_pool:
  workers: 4                       # Number of worker processes
  max_tasks_per_child: 100         # Restart workers after N tasks
|
||||
|
||||
# ============================================================
# LISTENERS SECTION
# Message handlers (tools and agents)
# ============================================================
listeners:
  # Simple tool (non-agent)
  - name: calculator.add
    payload_class: handlers.calc.AddPayload
    handler: handlers.calc.add_handler
    description: "Adds two numbers"

  # LLM Agent
  - name: researcher
    payload_class: handlers.research.ResearchQuery
    handler: handlers.research.research_handler
    description: "Research agent that searches and synthesizes"
    agent: true                    # Marks as LLM agent
    peers:                         # Allowed call targets
      - calculator.add
      - web_search
    prompt: |                      # System prompt for LLM
      You are a research assistant.
      Use tools to find information.

  # CPU-bound handler (runs in process pool)
  - name: librarian
    payload_class: handlers.librarian.Query
    handler: handlers.librarian.handle_query
    description: "Document analysis with heavy computation"
    cpu_bound: true                # Dispatch to ProcessPoolExecutor
|
||||
|
||||
# ============================================================
# GATEWAYS SECTION (Optional)
# Federation with remote organisms
# ============================================================
gateways:
  - name: remote_search
    remote_url: "wss://search.example.org"
    trusted_identity: "keys/search_node.pub"
    description: "Federated search gateway"
|
||||
```
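
The `retry_base_delay` and `retry_max_delay` settings point to capped exponential backoff. This page does not state the exact formula the router uses, so the sketch below is an assumption: it doubles the delay on each attempt, applies no jitter, and `backoff_delay` is a hypothetical helper name.

```python
def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based), assuming plain doubling."""
    return min(base * (2 ** attempt), cap)


# With retries: 3, retry_base_delay: 1.0 and retry_max_delay: 60.0
delays = [backoff_delay(n) for n in range(3)]
```

Under these assumptions the cap only matters for long retry sequences; with three retries the delays stay well below `retry_max_delay`.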
|
||||
|
||||
## Section Details
|
||||
|
||||
### organism
|
||||
|
||||
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `name` | string | Yes | Human-readable organism name |
| `port` | int | No | WebSocket server port |
| `identity` | path | No | Ed25519 private key for signing |
| `tls.cert` | path | No | TLS certificate path |
| `tls.key` | path | No | TLS private key path |
|
||||
|
||||
### llm.backends[]
|
||||
|
||||
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `provider` | string | Yes | `xai`, `anthropic`, `openai`, `ollama` |
| `api_key_env` | string | Depends | Env var containing API key |
| `base_url` | string | No | Override API endpoint |
| `priority` | int | No | Lower = preferred (default: 1) |
| `rate_limit_tpm` | int | No | Tokens per minute limit |
| `max_concurrent` | int | No | Max concurrent requests |
| `supported_models` | list | No | Models this backend serves (ollama) |
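
As a rough illustration of how `priority` could drive the `failover` strategy, the sketch below picks the healthy backend with the lowest priority number. The `Backend` class, the `healthy` flag, and `pick_backend` are hypothetical names for this example; the actual router may weigh rate limits and concurrency as well.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    provider: str
    priority: int = 1
    healthy: bool = True   # hypothetical flag; not a documented config field


def pick_backend(backends: list[Backend]) -> Backend:
    """Return the healthy backend with the lowest priority number."""
    candidates = [b for b in backends if b.healthy]
    if not candidates:
        raise RuntimeError("no healthy LLM backend available")
    return min(candidates, key=lambda b: b.priority)


pool = [Backend("anthropic", priority=2), Backend("xai", priority=1)]
first_choice = pick_backend(pool).provider
pool[1].healthy = False            # simulate an outage on the preferred backend
fallback = pick_backend(pool).provider
```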
|
||||
|
||||
### listeners[]
|
||||
|
||||
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `name` | string | Yes | Unique listener name |
| `payload_class` | string | Yes | Import path to `@xmlify` dataclass |
| `handler` | string | Yes | Import path to handler function |
| `description` | string | Yes | Short description (used in prompts) |
| `agent` | bool | No | Is this an LLM agent? (default: false) |
| `peers` | list | No | Allowed call targets for agents |
| `prompt` | string | No | System prompt for LLM agents |
| `cpu_bound` | bool | No | Run in ProcessPoolExecutor (default: false) |
| `broadcast` | bool | No | Allow shared root tag (default: false) |
|
||||
|
||||
### backend
|
||||
|
||||
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `type` | string | No | `memory`, `manager`, `redis` (default: memory) |
| `redis_url` | string | If redis | Redis connection URL |
| `redis_prefix` | string | No | Key prefix (default: `xp:`) |
| `redis_ttl` | int | No | Key TTL in seconds |
|
||||
|
||||
### process_pool
|
||||
|
||||
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `workers` | int | No | Number of worker processes (default: CPU count) |
| `max_tasks_per_child` | int | No | Tasks before worker restart |
|
||||
|
||||
## Environment Variables
|
||||
|
||||
API keys should be stored in environment variables, referenced via `api_key_env`:
|
||||
|
||||
```env
|
||||
# .env file
|
||||
XAI_API_KEY=xai-abc123...
|
||||
ANTHROPIC_API_KEY=sk-ant-...
|
||||
OPENAI_API_KEY=sk-...
|
||||
```
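
The indirection through `api_key_env` means the YAML file only ever holds the name of a variable, never the secret itself. A minimal sketch of that lookup, assuming a hypothetical `resolve_api_key` helper (the real loader is not shown on this page):

```python
import os


def resolve_api_key(backend: dict, environ=os.environ) -> str:
    """Look up the key named by `api_key_env`; fail loudly if it is unset."""
    var_name = backend["api_key_env"]
    value = environ.get(var_name)
    if not value:
        raise KeyError(f"environment variable {var_name} is not set")
    return value


# A fake environment keeps the example independent of the real shell.
fake_env = {"XAI_API_KEY": "xai-example"}
key = resolve_api_key({"provider": "xai", "api_key_env": "XAI_API_KEY"}, fake_env)
```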
|
||||
|
||||
## Validation
|
||||
|
||||
Validate your configuration without running:
|
||||
|
||||
```bash
|
||||
xml-pipeline check config/organism.yaml
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
- [[Quick Start]] — Get started quickly
|
||||
- [[Writing Handlers]] — Create handlers
|
||||
- [[LLM Router]] — LLM backend details
|
||||
293
docs/wiki/reference/Handler-Contract.md
Normal file
|
|
@ -0,0 +1,293 @@
|
|||
# Handler Contract
|
||||
|
||||
The complete specification for handler functions in xml-pipeline.
|
||||
|
||||
## Signature
|
||||
|
||||
Every handler must be declared as:
|
||||
|
||||
```python
async def handler(
    payload: PayloadDataclass,
    metadata: HandlerMetadata
) -> HandlerResponse | None:
    ...
```
|
||||
|
||||
### Requirements
|
||||
|
||||
| Aspect | Requirement |
|--------|-------------|
| Function type | Must be `async def` |
| First parameter | XSD-validated `@xmlify` dataclass |
| Second parameter | `HandlerMetadata` (required) |
| Return type | `HandlerResponse` or `None` |
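
Part of this contract can be checked mechanically before a handler is registered. The sketch below uses only the standard `inspect` module; `check_handler` is a hypothetical helper, and treating "exactly two parameters" as the rule is an assumption drawn from the table above rather than documented behavior.

```python
import inspect


def check_handler(func) -> list[str]:
    """Return a list of contract violations (empty if none were found)."""
    problems = []
    if not inspect.iscoroutinefunction(func):
        problems.append("handler must be declared with `async def`")
    params = list(inspect.signature(func).parameters)
    if len(params) != 2:
        problems.append("handler must accept (payload, metadata)")
    return problems


async def good_handler(payload, metadata):
    return None


def bad_handler(payload):   # synchronous and missing the metadata parameter
    return None
```

Calling `check_handler(good_handler)` yields an empty list, while `bad_handler` is reported for both problems.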
|
||||
|
||||
## HandlerMetadata
|
||||
|
||||
```python
@dataclass
class HandlerMetadata:
    thread_id: str              # Opaque thread UUID
    from_id: str                # Who sent this message
    own_name: str | None        # This listener's name (agents only)
    is_self_call: bool          # True if message from self
    usage_instructions: str     # Peer schemas for LLM prompts
    todo_nudge: str             # System reminders
```
|
||||
|
||||
### Field Details
|
||||
|
||||
#### thread_id
|
||||
|
||||
Opaque UUID for the current conversation thread.
|
||||
|
||||
- Use for thread-scoped storage
|
||||
- Never parse or make assumptions about format
|
||||
- Maps internally to call chain (hidden)
|
||||
|
||||
#### from_id
|
||||
|
||||
The registered name of the listener that sent this message.
|
||||
|
||||
- Only shows immediate sender, not full chain
|
||||
- Use for logging/debugging
|
||||
- Don't use for routing (use `HandlerResponse.to`)
|
||||
|
||||
#### own_name
|
||||
|
||||
This listener's registered name. Only set for agents (`agent: true`).
|
||||
|
||||
- Enables self-referential reasoning
|
||||
- Used for self-iteration: `to=metadata.own_name`
|
||||
- `None` for non-agent listeners
|
||||
|
||||
#### is_self_call
|
||||
|
||||
`True` if this message was sent by this same handler.
|
||||
|
||||
- Detect iteration loops
|
||||
- Handle self-messages differently if needed
|
||||
|
||||
#### usage_instructions
|
||||
|
||||
Auto-generated documentation of peer capabilities.
|
||||
|
||||
- Contains XSD schemas of declared peers
|
||||
- Inject into LLM system prompts
|
||||
- Empty if no peers declared
|
||||
|
||||
#### todo_nudge
|
||||
|
||||
System-generated reminders about pending todos.
|
||||
|
||||
- Contains info about raised watchers
|
||||
- Used by agents for task tracking
|
||||
- Empty if no pending todos
|
||||
|
||||
## HandlerResponse
|
||||
|
||||
```python
@dataclass
class HandlerResponse:
    payload: Any    # @xmlify dataclass instance
    to: str         # Target listener name
```
|
||||
|
||||
### Construction Methods
|
||||
|
||||
#### Direct Construction
|
||||
|
||||
Forward to a specific listener:
|
||||
|
||||
```python
return HandlerResponse(
    payload=MyResponse(data="result"),
    to="next-handler",
)
```
|
||||
|
||||
#### Respond to Caller
|
||||
|
||||
Return to whoever sent the message:
|
||||
|
||||
```python
return HandlerResponse.respond(
    payload=ResultPayload(value=42)
)
```
|
||||
|
||||
### Return None
|
||||
|
||||
End the chain with no response:
|
||||
|
||||
```python
|
||||
return None
|
||||
```
|
||||
|
||||
## Return Types
|
||||
|
||||
| Return | Effect |
|--------|--------|
| `HandlerResponse(to="x")` | Forward to listener "x" |
| `HandlerResponse.respond()` | Return to caller (prunes chain) |
| `None` | Terminate chain |
|
||||
|
||||
## Envelope Control
|
||||
|
||||
The system enforces these rules on responses:
|
||||
|
||||
| Field | Handler Control | System Override |
|-------|-----------------|-----------------|
| `<from>` | None | Always `listener.name` |
| `<to>` | Via `response.to` | Validated against peers |
| `<thread>` | None | Managed by registry |
| `<payload>` | Full control | — |
|
||||
|
||||
## Peer Constraints
|
||||
|
||||
Agents can only send to declared peers:
|
||||
|
||||
```yaml
listeners:
  - name: greeter
    agent: true
    peers: [shouter, logger]  # Only these allowed
```
|
||||
|
||||
### Violation Handling
|
||||
|
||||
If an agent sends to an undeclared peer:
|
||||
|
||||
1. Message **blocked** (never routed)
|
||||
2. `SystemError` returned to agent
|
||||
3. Thread stays alive (can retry)
|
||||
|
||||
```xml
<SystemError>
  <code>routing</code>
  <message>Message could not be delivered.</message>
  <retry-allowed>true</retry-allowed>
</SystemError>
```
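
The peer check itself can be pictured as a simple membership test. This is a sketch under assumptions: `validate_target` and the dictionary it returns are hypothetical, and the real pump builds a `SystemError` payload rather than a plain dict.

```python
def validate_target(sender: str, target: str, peers: dict[str, set[str]]) -> dict:
    """Return a delivery decision; the shape of the result is illustrative."""
    if target in peers.get(sender, set()):
        return {"deliver": True}
    # Mirror the generic wording of the SystemError shown above.
    return {
        "deliver": False,
        "code": "routing",
        "message": "Message could not be delivered.",
        "retry_allowed": True,
    }


peers = {"greeter": {"shouter", "logger"}}
ok = validate_target("greeter", "shouter", peers)
blocked = validate_target("greeter", "calculator", peers)
```

The blocked result carries the same fields as the XML error, so the agent can retry with a declared peer while the thread stays alive.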
|
||||
|
||||
## Response Semantics
|
||||
|
||||
### Critical: Pruning on Respond
|
||||
|
||||
When you call `.respond()`, the call chain is **pruned**:
|
||||
|
||||
```
Before: console → greeter → calculator
                            ↑ (you respond here)

After:  console → greeter
                  ↑ (response delivered here)
```
|
||||
|
||||
**Consequences:**
|
||||
|
||||
- Sub-agents you called are terminated
|
||||
- Their state/context is lost
|
||||
- You cannot call them again in this context
|
||||
|
||||
**Therefore:** Complete ALL sub-tasks before responding.
|
||||
|
||||
## Examples
|
||||
|
||||
### Simple Tool
|
||||
|
||||
```python
@xmlify
@dataclass
class AddPayload:
    a: int
    b: int

@xmlify
@dataclass
class AddResult:
    sum: int

async def add_handler(payload: AddPayload, metadata: HandlerMetadata) -> HandlerResponse:
    result = payload.a + payload.b
    return HandlerResponse.respond(payload=AddResult(sum=result))
```
|
||||
|
||||
### LLM Agent
|
||||
|
||||
```python
async def research_handler(payload: ResearchQuery, metadata: HandlerMetadata) -> HandlerResponse:
    from xml_pipeline.platform.llm_api import complete

    response = await complete(
        model="grok-4.1",
        messages=[
            {"role": "system", "content": metadata.usage_instructions},
            {"role": "user", "content": payload.query},
        ],
    )

    return HandlerResponse(
        payload=ResearchResult(answer=response.content),
        to="summarizer",
    )
```
|
||||
|
||||
### Self-Iterating Agent
|
||||
|
||||
```python
async def thinking_agent(payload: ThinkPayload, metadata: HandlerMetadata) -> HandlerResponse:
    if payload.iteration >= 5:
        # Done thinking - respond to caller
        return HandlerResponse.respond(
            payload=FinalAnswer(answer=payload.current_answer)
        )

    # Continue thinking by calling self
    return HandlerResponse(
        payload=ThinkPayload(
            iteration=payload.iteration + 1,
            current_answer=f"Refined: {payload.current_answer}",
        ),
        to=metadata.own_name,  # Self-call
    )
```
|
||||
|
||||
### Terminal Handler
|
||||
|
||||
```python
async def console_output(payload: TextOutput, metadata: HandlerMetadata) -> None:
    print(f"[{payload.source}] {payload.text}")
    return None  # Chain ends
```
|
||||
|
||||
### Error Handling
|
||||
|
||||
```python
import logging

logger = logging.getLogger(__name__)

async def safe_handler(payload: MyPayload, metadata: HandlerMetadata) -> HandlerResponse:
    try:
        result = await risky_operation(payload)
        return HandlerResponse.respond(payload=SuccessResult(data=result))
    except ValidationError as e:
        # Expected failure: report the validation message to the caller
        return HandlerResponse.respond(payload=ErrorResult(error=str(e)))
    except Exception:
        # Unexpected failure: log the traceback, return a generic error
        logger.exception("Handler failed")
        return HandlerResponse.respond(payload=ErrorResult(error="Internal error"))
```
|
||||
|
||||
## Security Properties
|
||||
|
||||
Handlers are **untrusted code**. Even compromised handlers cannot:
|
||||
|
||||
- Forge sender identity
|
||||
- Access other threads
|
||||
- Discover organism topology
|
||||
- Route to undeclared peers
|
||||
- Modify message history
|
||||
|
||||
## See Also
|
||||
|
||||
- [[Writing Handlers]] — Practical guide
|
||||
- [[Configuration]] — Registering handlers
|
||||
- [[Architecture Overview]] — System architecture
|
||||
376
docs/wiki/tutorials/Hello-World.md
Normal file
|
|
@ -0,0 +1,376 @@
|
|||
# Hello World Tutorial
|
||||
|
||||
Build a complete greeting agent from scratch. By the end, you'll understand payloads, handlers, configuration, and message flow.
|
||||
|
||||
## What We're Building
|
||||
|
||||
A greeting system with three components:
|
||||
|
||||
```
User Input → Greeter Agent → Shouter Tool → Output
    │              │              │            │
    │              │              │            │
 "Alice"        "Hello,        "HELLO,     Displayed
                Alice!"        ALICE!"      to user
```
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Python 3.11+
|
||||
- xml-pipeline installed (`pip install xml-pipeline[console]`)
|
||||
|
||||
## Step 1: Create Project Structure
|
||||
|
||||
```bash
|
||||
mkdir hello-world
|
||||
cd hello-world
|
||||
mkdir -p config handlers
|
||||
```
|
||||
|
||||
## Step 2: Define Payloads
|
||||
|
||||
Create `handlers/payloads.py`:
|
||||
|
||||
```python
from dataclasses import dataclass
from third_party.xmlable import xmlify

@xmlify
@dataclass
class Greeting:
    """Request to greet someone."""
    name: str

@xmlify
@dataclass
class GreetingResponse:
    """A friendly greeting."""
    message: str

@xmlify
@dataclass
class ShoutRequest:
    """Request to shout text."""
    text: str

@xmlify
@dataclass
class ShoutResponse:
    """Shouted text."""
    text: str

@xmlify
@dataclass
class ConsoleOutput:
    """Text to display."""
    text: str
    source: str = "system"
```
|
||||
|
||||
**What's happening:**
|
||||
- `@xmlify` enables XML serialization
|
||||
- `@dataclass` provides the fields
|
||||
- Each class becomes a valid XML payload
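
To make the dataclass-to-XML mapping concrete, here is a minimal sketch built only on the standard library. It is not the `xmlify` implementation: `to_xml` is a hypothetical helper, and the lowercased class name as the element name is an assumption inferred from the `<greeting>` element used later in this tutorial.

```python
from dataclasses import dataclass, fields
import xml.etree.ElementTree as ET


@dataclass
class Greeting:
    """Request to greet someone."""
    name: str


def to_xml(obj) -> str:
    """Hypothetical stand-in for @xmlify: one child element per dataclass field."""
    # Assumption: the element name is the lowercased class name.
    root = ET.Element(type(obj).__name__.lower())
    for field in fields(obj):
        child = ET.SubElement(root, field.name)
        child.text = str(getattr(obj, field.name))
    return ET.tostring(root, encoding="unicode")


print(to_xml(Greeting(name="Alice")))
# prints: <greeting><name>Alice</name></greeting>
```

The real decorator also drives XSD generation, which this sketch leaves out.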
|
||||
|
||||
## Step 3: Write Handlers
|
||||
|
||||
Create `handlers/greeter.py`:
|
||||
|
||||
```python
from xml_pipeline.message_bus.message_state import HandlerMetadata, HandlerResponse
from handlers.payloads import Greeting, GreetingResponse, ShoutRequest

async def handle_greeting(payload: Greeting, metadata: HandlerMetadata) -> HandlerResponse:
    """
    Receive a greeting request, create a friendly message,
    then forward to the shouter to make it exciting.
    """
    # Create the greeting
    message = f"Hello, {payload.name}! Welcome to xml-pipeline!"

    # Forward to the shouter; its result goes on to console-output, not back to us
    return HandlerResponse(
        payload=ShoutRequest(text=message),
        to="shouter",
    )
```
|
||||
|
||||
Create `handlers/shouter.py`:
|
||||
|
||||
```python
from xml_pipeline.message_bus.message_state import HandlerMetadata, HandlerResponse
from handlers.payloads import ShoutRequest, ConsoleOutput

async def handle_shout(payload: ShoutRequest, metadata: HandlerMetadata) -> HandlerResponse:
    """
    Take text and SHOUT IT!
    Then send to console for display.
    """
    shouted = payload.text.upper() + "!!!"

    return HandlerResponse(
        payload=ConsoleOutput(text=shouted, source="shouter"),
        to="console-output",
    )
```
|
||||
|
||||
Create `handlers/output.py`:
|
||||
|
||||
```python
from xml_pipeline.message_bus.message_state import HandlerMetadata
from handlers.payloads import ConsoleOutput

async def handle_output(payload: ConsoleOutput, metadata: HandlerMetadata) -> None:
    """
    Display text to console.
    Returns None to end the message chain.
    """
    print(f"\n[{payload.source}] {payload.text}\n")
    return None  # Chain ends here
```
|
||||
|
||||
## Step 4: Configure the Organism
|
||||
|
||||
Create `config/organism.yaml`:
|
||||
|
||||
```yaml
organism:
  name: hello-world

listeners:
  # The greeter agent
  - name: greeter
    payload_class: handlers.payloads.Greeting
    handler: handlers.greeter.handle_greeting
    description: "Greets people by name"
    peers:
      - shouter

  # The shouter tool
  - name: shouter
    payload_class: handlers.payloads.ShoutRequest
    handler: handlers.shouter.handle_shout
    description: "Makes text LOUD"
    peers:
      - console-output

  # Output handler
  - name: console-output
    payload_class: handlers.payloads.ConsoleOutput
    handler: handlers.output.handle_output
    description: "Displays text to console"
```
|
||||
|
||||
## Step 5: Verify Configuration
|
||||
|
||||
```bash
|
||||
xml-pipeline check config/organism.yaml
|
||||
```
|
||||
|
||||
Expected output:
|
||||
```
|
||||
Config valid: hello-world
|
||||
Listeners: 3
|
||||
LLM backends: 0
|
||||
```
|
||||
|
||||
## Step 6: Create Test Script
|
||||
|
||||
Create `test_greeting.py`:
|
||||
|
||||
```python
import asyncio
from xml_pipeline.message_bus.stream_pump import StreamPump
from xml_pipeline.config.loader import load_config

async def main():
    # Load configuration
    config = load_config("config/organism.yaml")

    # Create and start the pump
    pump = StreamPump(config)
    await pump.start()

    print("Organism started! Injecting greeting...")

    # Create a greeting message
    greeting_xml = b"""<?xml version="1.0"?>
    <message xmlns="https://xml-pipeline.org/ns/envelope/v1">
      <meta>
        <from>test</from>
        <to>greeter</to>
      </meta>
      <greeting>
        <name>Alice</name>
      </greeting>
    </message>
    """

    # Inject the message
    await pump.inject(greeting_xml, from_id="test")

    # Give it time to process
    await asyncio.sleep(1)

    # Shutdown
    await pump.shutdown()
    print("Done!")

if __name__ == "__main__":
    asyncio.run(main())
```
|
||||
|
||||
## Step 7: Run It
|
||||
|
||||
```bash
|
||||
python test_greeting.py
|
||||
```
|
||||
|
||||
Expected output:
|
||||
```
|
||||
Organism started! Injecting greeting...
|
||||
|
||||
[shouter] HELLO, ALICE! WELCOME TO XML-PIPELINE!!!!
|
||||
|
||||
Done!
|
||||
```
|
||||
|
||||
## What Just Happened?
|
||||
|
||||
Let's trace the message flow:
|
||||
|
||||
### 1. Message Injection
|
||||
|
||||
```xml
<greeting>
  <name>Alice</name>
</greeting>
```
|
||||
|
||||
Injected with `from=test`, `to=greeter`.
|
||||
|
||||
### 2. Pipeline Processing
|
||||
|
||||
```
|
||||
Raw bytes
|
||||
↓
|
||||
repair_step → Valid XML tree
|
||||
↓
|
||||
c14n_step → Canonicalized
|
||||
↓
|
||||
envelope_valid → Matches envelope.xsd
|
||||
↓
|
||||
payload_extract → Extracts <greeting>
|
||||
↓
|
||||
thread_assign → UUID: abc-123
|
||||
↓
|
||||
xsd_validate → Matches Greeting schema
|
||||
↓
|
||||
deserialize → Greeting(name="Alice")
|
||||
↓
|
||||
routing → target: greeter
|
||||
```
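
The step chain above can be pictured as a list of functions that each take the message state and return an enriched copy. The sketch below is an illustration using only the standard library, not the actual pump code: the function names and the dict-based state are assumptions, and the repair, canonicalization, and XSD validation steps are omitted.

```python
import uuid
import xml.etree.ElementTree as ET
from dataclasses import dataclass

NS = "{https://xml-pipeline.org/ns/envelope/v1}"

RAW = b"""<message xmlns="https://xml-pipeline.org/ns/envelope/v1">
  <meta><from>test</from><to>greeter</to></meta>
  <greeting><name>Alice</name></greeting>
</message>"""


@dataclass
class Greeting:
    name: str


def parse(state: dict) -> dict:
    # Raw bytes -> XML tree (the real pipeline also repairs and canonicalizes)
    return {**state, "tree": ET.fromstring(state["raw"])}


def payload_extract(state: dict) -> dict:
    # The payload is the first child of <message> that is not <meta>
    payload = next(el for el in state["tree"] if el.tag != f"{NS}meta")
    return {**state, "payload_xml": payload}


def thread_assign(state: dict) -> dict:
    # Give the message an opaque thread identifier
    return {**state, "thread_id": str(uuid.uuid4())}


def deserialize(state: dict) -> dict:
    # XML element -> typed dataclass instance
    name = state["payload_xml"].findtext(f"{NS}name")
    return {**state, "payload": Greeting(name=name)}


def routing(state: dict) -> dict:
    # Read the target listener from the envelope metadata
    return {**state, "target": state["tree"].findtext(f"{NS}meta/{NS}to")}


STEPS = [parse, payload_extract, thread_assign, deserialize, routing]


def run_pipeline(raw: bytes) -> dict:
    state = {"raw": raw}
    for step in STEPS:
        state = step(state)
    return state


result = run_pipeline(RAW)
print(result["payload"], "->", result["target"])
```

Keeping each step a small function that returns a new state makes it easy to see where a message would be rejected: any step can raise before the handler is ever called.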
|
||||
|
||||
### 3. Handler Dispatch
|
||||
|
||||
```python
|
||||
# greeter receives:
|
||||
payload = Greeting(name="Alice")
|
||||
metadata.thread_id = "abc-123"
|
||||
metadata.from_id = "test"
|
||||
```
|
||||
|
||||
### 4. Response Processing
|
||||
|
||||
Greeter returns:
|
||||
```python
HandlerResponse(
    payload=ShoutRequest(text="Hello, Alice!..."),
    to="shouter",
)
```
|
||||
|
||||
System:
|
||||
1. Validates `shouter` is in greeter's peers ✓
|
||||
2. Serializes ShoutRequest to XML
|
||||
3. Wraps in envelope with `from=greeter`
|
||||
4. Re-injects into pipeline
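
A rough sketch of the first three steps, assuming a hypothetical `PEERS` table built from the `peers` lists in `organism.yaml` and a hypothetical `wrap_response` helper. The key point it illustrates is that the sender name comes from the dispatcher, never from the handler, which is why a handler cannot forge its identity.

```python
import xml.etree.ElementTree as ET

ENVELOPE_NS = "https://xml-pipeline.org/ns/envelope/v1"
Q = "{" + ENVELOPE_NS + "}"  # qualified-name prefix for ElementTree

# Assumption: declared peers per listener, mirroring organism.yaml
PEERS = {
    "greeter": {"shouter"},
    "shouter": {"console-output"},
    "console-output": set(),
}


def wrap_response(sender: str, to: str, payload_xml: str) -> bytes:
    """Validate the route, then wrap the payload in an envelope."""
    # Step 1: the target must be a declared peer of the sender
    if to not in PEERS.get(sender, set()):
        raise PermissionError(f"{sender!r} may not route to {to!r}")

    # Step 3: the system, not the handler, fills in <from>
    message = ET.Element(Q + "message")
    meta = ET.SubElement(message, Q + "meta")
    ET.SubElement(meta, Q + "from").text = sender
    ET.SubElement(meta, Q + "to").text = to

    # Step 2 is assumed done already: payload_xml is the serialized payload
    message.append(ET.fromstring(payload_xml))
    return ET.tostring(message)


envelope = wrap_response(
    "greeter", "shouter", "<shoutrequest><text>Hello, Alice!</text></shoutrequest>"
)
print(envelope.decode())
```

Calling `wrap_response("greeter", "console-output", ...)` raises `PermissionError` in this sketch, because `console-output` is not in greeter's peer list.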
|
||||
|
||||
### 5. Chain Continues
|
||||
|
||||
Shouter receives ShoutRequest, returns ConsoleOutput to `console-output`.
|
||||
|
||||
### 6. Chain Terminates
|
||||
|
||||
`handle_output` returns `None` → chain ends.
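
The whole chain, including termination on `None`, can be simulated in memory. This is a simplified model with stand-in names (`Response`, `HANDLERS`, `dispatch`), not the real pump: it skips XML serialization and validation and only shows how each response selects the next handler until one returns `None`.

```python
import asyncio
from dataclasses import dataclass
from typing import Any


@dataclass
class Response:
    """Hypothetical stand-in for HandlerResponse."""
    payload: Any
    to: str


displayed: list[str] = []


async def greeter(name: str) -> Response:
    return Response(payload=f"Hello, {name}! Welcome to xml-pipeline!", to="shouter")


async def shouter(text: str) -> Response:
    return Response(payload=text.upper() + "!!!", to="console-output")


async def console_output(text: str) -> None:
    displayed.append(text)
    return None  # Chain ends here


HANDLERS = {"greeter": greeter, "shouter": shouter, "console-output": console_output}


async def dispatch(target: str, payload: Any) -> list[str]:
    """Follow responses from handler to handler until one returns None."""
    hops = []
    while True:
        hops.append(target)
        response = await HANDLERS[target](payload)
        if response is None:
            return hops
        target, payload = response.to, response.payload


hops = asyncio.run(dispatch("greeter", "Alice"))
print(hops)
print(displayed)
```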
|
||||
|
||||
## Exercises
|
||||
|
||||
### Exercise 1: Add a Counter
|
||||
|
||||
Modify the shouter to count how many times it's been called:
|
||||
|
||||
```python
|
||||
# Hint: Use a module-level variable (simple) or
|
||||
# the context buffer (proper way)
|
||||
```
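
As a starting point for the simple variant, a module-level counter might look like the sketch below. `shout` is a hypothetical helper that the handler would call; the context-buffer approach is not shown because its API is not covered in this tutorial.

```python
import itertools

# Module-level state: shared by every call in this process
_call_counter = itertools.count(1)


def shout(text: str) -> str:
    """Uppercase the text and tag it with the running call number."""
    call_number = next(_call_counter)
    return f"{text.upper()}!!! (call #{call_number})"
```

Module-level state like this is lost on restart and is shared across all threads of conversation, which is why the hint calls it the simple option.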
|
||||
|
||||
### Exercise 2: Add Error Handling
|
||||
|
||||
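What happens if someone sends an empty name? Add validation. Because this version routes directly to `console-output`, import `ConsoleOutput` in `handlers/greeter.py` and add `console-output` to greeter's `peers` in `config/organism.yaml`; otherwise the peer check rejects the response: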
|
||||
|
||||
```python
async def handle_greeting(payload: Greeting, metadata: HandlerMetadata):
    if not payload.name.strip():
        return HandlerResponse(
            payload=ConsoleOutput(text="Error: Name required", source="greeter"),
            to="console-output",
        )
    # ... rest of handler
```
|
||||
|
||||
### Exercise 3: Make Greeter an LLM Agent
|
||||
|
||||
Convert greeter to use an LLM for creative greetings:
|
||||
|
||||
```python
from xml_pipeline.platform.llm_api import complete

async def handle_greeting(payload: Greeting, metadata: HandlerMetadata):
    response = await complete(
        model="grok-4.1",
        messages=[
            {"role": "system", "content": "Generate a creative, friendly greeting."},
            {"role": "user", "content": f"Greet someone named {payload.name}"},
        ],
    )

    return HandlerResponse(
        payload=ShoutRequest(text=response.content),
        to="shouter",
    )
```
|
||||
|
||||
Don't forget to add LLM configuration:
|
||||
|
||||
```yaml
llm:
  backends:
    - provider: xai
      api_key_env: XAI_API_KEY
```
|
||||
|
||||
## Summary
|
||||
|
||||
You've learned:
|
||||
- **Payloads**: Define messages with `@xmlify` dataclasses
|
||||
- **Handlers**: Async functions that process and respond
|
||||
- **Configuration**: Wire everything in organism.yaml
|
||||
- **Message Flow**: How messages traverse the pipeline
|
||||
- **Chaining**: Handlers forward to each other via `HandlerResponse`
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [[Writing Handlers]] — More handler patterns
|
||||
- [[Configuration]] — Full configuration reference
|
||||
- [[Architecture Overview]] — Deep dive into internals
|
||||