xml-pipeline/xml_pipeline/llm/__init__.py
dullfig 8b11323a8b Add token budget enforcement and usage tracking
Token Budget System:
- ThreadBudgetRegistry tracks per-thread token usage with configurable limits
- BudgetExhaustedError raised when thread exceeds max_tokens_per_thread
- Integrates with LLMRouter to block LLM calls when budget exhausted
- Automatic cleanup when threads are pruned
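
As a rough illustration of the budget behaviour listed above, here is a minimal sketch of a per-thread registry. Only the names ThreadBudgetRegistry, BudgetExhaustedError and max_tokens_per_thread come from this commit; the method names (check, record, forget) and the in-memory dict are assumptions made for the example and may not match the real implementation.

```python
from dataclasses import dataclass, field


class BudgetExhaustedError(Exception):
    """Raised when a thread has used up its token budget."""


@dataclass
class ThreadBudgetRegistry:
    # Default mirrors the 100k limit mentioned in the commit message.
    max_tokens_per_thread: int = 100_000
    # Hypothetical storage: tokens consumed so far, keyed by thread id.
    _used: dict = field(default_factory=dict)

    def check(self, thread_id: str) -> None:
        """Raise if the thread has no budget left (call before an LLM request)."""
        used = self._used.get(thread_id, 0)
        if used >= self.max_tokens_per_thread:
            raise BudgetExhaustedError(
                f"thread {thread_id} used {used} of "
                f"{self.max_tokens_per_thread} tokens"
            )

    def record(self, thread_id: str, tokens: int) -> None:
        """Add the tokens consumed by a completed LLM call."""
        self._used[thread_id] = self._used.get(thread_id, 0) + tokens

    def forget(self, thread_id: str) -> None:
        """Drop usage for a pruned thread so the registry does not grow forever."""
        self._used.pop(thread_id, None)
```

In this sketch a router would call check() before dispatching a request and record() once the response reports its token count, so a thread can overshoot the limit by at most one call.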

Usage Tracking (for production billing):
- UsageTracker emits events after each LLM completion
- Subscribers receive UsageEvent with tokens, latency, estimated cost
- Cost estimation for common models (Grok, Claude, GPT, etc.)
- Aggregate stats by agent, model, and totals
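
The event flow described above could look roughly like the following sketch. UsageTracker, UsageEvent, subscribe() and get_totals() are names used by this package; the event fields, the record() method and the shape of the totals dict are assumptions for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    # Field names are hypothetical; the commit only says events carry
    # tokens, latency and an estimated cost.
    agent_id: str
    model: str
    total_tokens: int
    latency_ms: float
    estimated_cost: float


class UsageTracker:
    def __init__(self):
        self._subscribers = []
        self._tokens_by_agent = defaultdict(int)
        self._tokens_by_model = defaultdict(int)
        self._total_tokens = 0
        self._total_cost = 0.0

    def subscribe(self, callback):
        """Register a callable that receives every UsageEvent."""
        self._subscribers.append(callback)

    def record(self, event):
        """Update aggregates, then notify subscribers (hypothetical method)."""
        self._tokens_by_agent[event.agent_id] += event.total_tokens
        self._tokens_by_model[event.model] += event.total_tokens
        self._total_tokens += event.total_tokens
        self._total_cost += event.estimated_cost
        for callback in self._subscribers:
            callback(event)

    def get_totals(self):
        """Return aggregate usage as plain dicts."""
        return {
            "total_tokens": self._total_tokens,
            "total_cost": self._total_cost,
            "by_agent": dict(self._tokens_by_agent),
            "by_model": dict(self._tokens_by_model),
        }
```

A billing integration would subscribe once at startup and receive one event per completion, while dashboards can poll get_totals() without subscribing.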

Configuration:
- max_tokens_per_thread in organism.yaml (default 100k)
- LLMRouter.complete() accepts thread_id and metadata parameters

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-27 21:07:43 -08:00


"""
LLM abstraction layer.

Usage:
    from xml_pipeline.llm import router

    # Configure once at startup (or via organism.yaml)
    router.configure_router({
        "strategy": "failover",
        "backends": [
            {"provider": "xai", "api_key_env": "XAI_API_KEY"},
        ]
    })

    # Then anywhere in your code:
    response = await router.complete(
        model="grok-4.1",
        messages=[{"role": "user", "content": "Hello"}],
        thread_id=metadata.thread_id,  # For budget enforcement
        agent_id=metadata.own_name,    # For usage tracking
    )

Usage Tracking:
    from xml_pipeline.llm import get_usage_tracker

    tracker = get_usage_tracker()

    # Subscribe to events for billing
    tracker.subscribe(lambda event: billing_api.record(event))

    # Query totals
    totals = tracker.get_totals()
"""

from xml_pipeline.llm.router import (
    LLMRouter,
    get_router,
    configure_router,
    complete,
    Strategy,
)
from xml_pipeline.llm.backend import LLMRequest, LLMResponse, BackendError
from xml_pipeline.llm.usage_tracker import (
    UsageTracker,
    UsageEvent,
    get_usage_tracker,
    reset_usage_tracker,
)

__all__ = [
    # Router
    "LLMRouter",
    "get_router",
    "configure_router",
    "complete",
    "Strategy",
    # Backend
    "LLMRequest",
    "LLMResponse",
    "BackendError",
    # Usage tracking
    "UsageTracker",
    "UsageEvent",
    "get_usage_tracker",
    "reset_usage_tracker",
]