dullfig a5e2ab22da Add thread registry, LLM router, console handler, and docs updates

Thread Registry:
- Root thread initialization at boot
- Thread chain tracking for message flow
- register_thread() for external message UUIDs

LLM Router:
- Multi-backend support with failover strategy
- Token bucket rate limiting per backend
- Async completion API with retries

Console Handler:
- Message-driven REPL (not separate async loop)
- ConsolePrompt/ConsoleInput payloads
- Handler returns None to disconnect

Boot System:
- System primitives module
- Boot message injected at startup
- Initializes root thread context

Documentation:
- Updated v2.1 docs for new architecture
- LLM router documentation
- Gap analysis cross-check

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-10 16:53:38 -08:00

4 KiB

Raw Blame History

AgentServer v2.1 — System Primitives

Updated: January 10, 2026

This document specifies system-level message types and handler return semantics.

Handler Return Semantics

Handlers control message flow through their return value, not through magic XML tags.

Forward to Target

return HandlerResponse(
    payload=MyPayload(...),
    to="target_listener",
)

Pump validates target against peers list (for agents)
Extends thread chain: a.b → a.b.target
Target receives the payload with updated thread

Respond to Caller

return HandlerResponse.respond(
    payload=ResultPayload(...)
)

Pump looks up call chain from thread registry
Prunes last segment (the responder)
Routes to new tail (the caller)
Sub-threads are terminated (calculator memory, scratch space, etc.)

Terminate Chain

return None

No message emitted
Chain ends here
Thread can be cleaned up

System Messages

These payload elements are emitted by the system (pump) only. Agents cannot emit them.

`<huh>` — Validation Error

Emitted when message processing fails (XSD validation, unknown root tag, etc.).

<huh xmlns="https://xml-pipeline.org/ns/core/v1">
  <error>Invalid payload structure</error>
  <original-attempt>SGVsbG8gV29ybGQ=</original-attempt>
</huh>

Field	Description
`error`	Brief, canned error message (never raw validator output)
`original-attempt`	Base64-encoded raw bytes (truncated if large)

Security notes:

Error messages are intentionally abstract and generic
Identical messages for "wrong schema" vs "capability doesn't exist"
Prevents topology probing by agents or external callers
Authorized introspection available via meta queries only

`<SystemError>` — Routing/Delivery Failure

Emitted when a handler tries to send to an unauthorized or unreachable target.

<SystemError xmlns="">
  <code>routing</code>
  <message>Message could not be delivered. Please verify your target and try again.</message>
  <retry-allowed>true</retry-allowed>
</SystemError>

Field	Description
`code`	Error category: `routing`, `validation`, `timeout`
`message`	Generic, non-revealing description
`retry-allowed`	Whether agent can retry the operation

Key properties:

Keeps thread alive (agent can retry)
Never reveals topology (no "target doesn't exist" vs "not authorized")
Replaces the failed message in the flow

Agent Iteration Patterns

Blind Self-Iteration

LLM agents iterate by emitting payloads with their own root tag. With unique root tags per agent, this automatically routes back to themselves.

# In agent handler
return HandlerResponse(
    payload=ThinkPayload(reasoning="Let me think more..."),
    to=metadata.own_name,  # Routes to self
)

The pump sets is_self_call=True in metadata for these messages.

Visible Planning (Optional)

Agents may include planning constructs in their output for clarity:

<answer>
  I need to:
  <todo-until condition="have final answer">
    1. Search for relevant data
    2. Analyze results
    3. Synthesize conclusion
  </todo-until>

  Starting with step 1...
</answer>

Note: <todo-until> is NOT interpreted by the system. It's visible structured text that LLMs can use for planning. The actual iteration happens through normal message routing.

Response Semantics Warning

Critical for LLM agents:

When you respond (return to caller via .respond()), your call chain is pruned:

Any sub-agents you called are effectively terminated
Their state/context is lost (calculator memory, scratch space, etc.)
You cannot call them again in the same context after responding

Therefore: Complete ALL sub-tasks before responding. If you need results from a peer, wait for their response first.

This warning is automatically included in usage_instructions provided to agents.

v2.1 Specification — Updated January 10, 2026

4 KiB Raw Blame History