xml-pipeline

dullfig/xml-pipeline

Commit graph

Author	SHA1	Message	Date

dullfig

860395cd58

Add usage/gas tracking REST API endpoints

Endpoints:
- GET /api/v1/usage - Overview with totals, per-agent, per-model breakdown
- GET /api/v1/usage/threads - List all thread budgets sorted by usage
- GET /api/v1/usage/threads/{id} - Single thread budget details
- GET /api/v1/usage/agents/{id} - Usage totals for specific agent
- GET /api/v1/usage/models/{model} - Usage totals for specific model
- POST /api/v1/usage/reset - Reset all usage tracking
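
As a rough client-side illustration of the routes listed above, the sketch below builds (but does not send) one request per endpoint using only the standard library. The base address, the `usage_request` helper, and the example IDs are assumptions made for illustration; only the paths and HTTP methods come from the commit message.

```python
from urllib.parse import quote
from urllib.request import Request

API_PREFIX = "/api/v1/usage"  # path prefix taken from the commit message


def usage_request(base_url: str, *parts: str, method: str = "GET") -> Request:
    """Build (but do not send) a request for one of the usage endpoints."""
    # Percent-encode each segment so an ID containing "/" stays one segment.
    path = "/".join(quote(part, safe="") for part in parts)
    url = base_url.rstrip("/") + API_PREFIX + (f"/{path}" if path else "")
    return Request(url, method=method)


BASE = "http://localhost:8080"  # assumed address, not stated in the source

overview = usage_request(BASE)                               # GET /api/v1/usage
threads = usage_request(BASE, "threads")                     # GET .../threads
one_thread = usage_request(BASE, "threads", "thread-123")    # hypothetical thread id
one_agent = usage_request(BASE, "agents", "greeter")         # hypothetical agent id
one_model = usage_request(BASE, "models", "vendor/model-x")  # hypothetical model name
reset = usage_request(BASE, "reset", method="POST")          # POST .../reset
```

Passing any of these objects to `urllib.request.urlopen` would send it. The response bodies are not described in the commit beyond the model names, so they are not parsed here.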

Models:
- UsageTotals, UsageOverview, UsageResponse
- ThreadBudgetInfo, ThreadBudgetListResponse
- AgentUsageInfo, ModelUsageInfo

Also adds has_budget() method to ThreadBudgetRegistry for checking
if a thread exists without auto-creating it.
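
The difference between an auto-creating lookup and a pure existence check can be sketched as follows. This is a minimal stand-in, not the project's class: only the names `ThreadBudgetRegistry` and `has_budget()` appear in the commit message, while `ThreadBudget`, `get_budget()`, and the internal dict are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ThreadBudget:
    """Hypothetical per-thread record."""
    max_tokens: int
    used_tokens: int = 0


class ThreadBudgetRegistry:
    def __init__(self, max_tokens_per_thread: int = 100_000):
        self._max = max_tokens_per_thread
        self._budgets: dict[str, ThreadBudget] = {}

    def get_budget(self, thread_id: str) -> ThreadBudget:
        # Assumed accessor: creates a fresh budget the first time a thread is seen.
        if thread_id not in self._budgets:
            self._budgets[thread_id] = ThreadBudget(self._max)
        return self._budgets[thread_id]

    def has_budget(self, thread_id: str) -> bool:
        # Pure membership check: never creates an entry as a side effect.
        return thread_id in self._budgets


registry = ThreadBudgetRegistry()
before = registry.has_budget("t1")   # unknown thread, and still unknown afterwards
registry.get_budget("t1")            # auto-creates the entry
after = registry.has_budget("t1")
```

A lookup endpoint could plausibly use such a check to report an unknown thread instead of silently creating a budget for it, though the commit does not say where the method is called.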

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-27 21:20:36 -08:00

dullfig

8b11323a8b

Add token budget enforcement and usage tracking

Token Budget System:
- ThreadBudgetRegistry tracks per-thread token usage with configurable limits
- BudgetExhaustedError raised when thread exceeds max_tokens_per_thread
- Integrates with LLMRouter to block LLM calls when budget exhausted
- Automatic cleanup when threads are pruned
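
One way the pieces above could fit together is sketched below. It is an assumption-laden illustration: the class and error names come from the commit message, but the `check`/`consume`/`cleanup`/`used` methods, the synchronous `complete()`, the `backend` callable, and the choice to block once usage reaches the limit are all hypothetical.

```python
class BudgetExhaustedError(Exception):
    """Raised when a thread has used up its token budget."""


class ThreadBudgetRegistry:
    def __init__(self, max_tokens_per_thread: int = 100_000):
        self.max_tokens_per_thread = max_tokens_per_thread
        self._used: dict[str, int] = {}

    def used(self, thread_id: str) -> int:
        return self._used.get(thread_id, 0)

    def check(self, thread_id: str) -> None:
        # Before an LLM call: refuse once the limit has been reached.
        if self.used(thread_id) >= self.max_tokens_per_thread:
            raise BudgetExhaustedError(f"thread {thread_id!r} has no budget left")

    def consume(self, thread_id: str, tokens: int) -> int:
        # After a completion: record the tokens and return the new total.
        self._used[thread_id] = self.used(thread_id) + tokens
        return self._used[thread_id]

    def cleanup(self, thread_id: str) -> None:
        # When a thread is pruned: forget its usage.
        self._used.pop(thread_id, None)


class LLMRouter:
    """Stand-in router; only thread_id and metadata are known parameters."""

    def __init__(self, budgets, backend):
        self.budgets = budgets
        self.backend = backend  # hypothetical callable: prompt -> (text, tokens_used)

    def complete(self, prompt, thread_id=None, metadata=None):
        # metadata is accepted to mirror the described signature; unused here.
        if thread_id is not None:
            self.budgets.check(thread_id)
        text, tokens = self.backend(prompt)
        if thread_id is not None:
            self.budgets.consume(thread_id, tokens)
        return text


budgets = ThreadBudgetRegistry(max_tokens_per_thread=10)
router = LLMRouter(budgets, backend=lambda prompt: ("ok", 6))

router.complete("hi", thread_id="t1")  # 6 tokens used
router.complete("hi", thread_id="t1")  # 12 tokens used, now over the limit
try:
    router.complete("hi", thread_id="t1")
    blocked = False
except BudgetExhaustedError:
    blocked = True
```

Checking before the call and charging after it means a thread can overshoot its limit by one completion. A stricter design would reserve an estimated token count up front.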

Usage Tracking (for production billing):
- UsageTracker emits events after each LLM completion
- Subscribers receive UsageEvent with tokens, latency, estimated cost
- Cost estimation for common models (Grok, Claude, GPT, etc.)
- Aggregate stats by agent, model, and totals
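
The publish/subscribe flow described above might look roughly like this. `UsageTracker` and `UsageEvent` are named in the commit message; the field names, the `subscribe`/`record` methods, and the price table are assumptions, and the prices are placeholders rather than real vendor rates.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable

# Placeholder prices in USD per 1,000 tokens; NOT real vendor pricing.
PRICE_PER_1K = {"model-a": 0.5, "model-b": 2.0}


@dataclass(frozen=True)
class UsageEvent:
    agent: str
    model: str
    tokens: int
    latency_ms: float
    estimated_cost: float


class UsageTracker:
    def __init__(self):
        self._subscribers: list[Callable[[UsageEvent], None]] = []
        self.by_agent: dict[str, int] = defaultdict(int)
        self.by_model: dict[str, int] = defaultdict(int)
        self.total_tokens = 0

    def subscribe(self, callback: Callable[[UsageEvent], None]) -> None:
        self._subscribers.append(callback)

    def record(self, agent: str, model: str, tokens: int, latency_ms: float) -> UsageEvent:
        # Unknown models are costed at 0.0 rather than guessed.
        cost = tokens / 1000 * PRICE_PER_1K.get(model, 0.0)
        event = UsageEvent(agent, model, tokens, latency_ms, cost)
        # Update aggregates first, then notify subscribers.
        self.by_agent[agent] += tokens
        self.by_model[model] += tokens
        self.total_tokens += tokens
        for callback in self._subscribers:
            callback(event)
        return event


tracker = UsageTracker()
received: list[UsageEvent] = []
tracker.subscribe(received.append)  # e.g. a billing sink

tracker.record("greeter", "model-a", tokens=2000, latency_ms=120.0)
tracker.record("greeter", "model-b", tokens=500, latency_ms=80.0)
```

A billing subscriber would typically persist each event. Here the list `received` simply collects them so the aggregates can be compared against the raw events.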

Configuration:
- max_tokens_per_thread in organism.yaml (default 100k)
- LLMRouter.complete() accepts thread_id and metadata parameters
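
To illustrate the configuration default, the sketch below resolves the limit from an already-parsed config mapping. It assumes `max_tokens_per_thread` is a top-level key in `organism.yaml` and reads "100k" as 100,000; neither detail is confirmed by the commit message, and `resolve_max_tokens` is a hypothetical helper.

```python
from typing import Any, Mapping

DEFAULT_MAX_TOKENS_PER_THREAD = 100_000  # "default 100k", read here as 100,000


def resolve_max_tokens(config: Mapping[str, Any]) -> int:
    """Return the per-thread token limit, falling back to the default."""
    value = config.get("max_tokens_per_thread", DEFAULT_MAX_TOKENS_PER_THREAD)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError(f"max_tokens_per_thread must be a positive integer, got {value!r}")
    return value


default_limit = resolve_max_tokens({})                               # key absent
custom_limit = resolve_max_tokens({"max_tokens_per_thread": 5000})   # key present
```

Validating the value at load time keeps a typo in the config from surfacing later as a budget that is exhausted immediately.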

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-27 21:07:43 -08:00

2 commits