Add BloxServer API scaffold + architecture docs

BloxServer API (FastAPI + SQLAlchemy async):
- Database models: users, flows, triggers, executions, usage tracking
- Clerk JWT auth with dev-mode bypass for local testing
- SQLite support for local dev, PostgreSQL for production
- CRUD routes for flows, triggers, executions
- Public webhook endpoint with token auth
- Health/readiness endpoints
- Pydantic schemas with camelCase aliases for the frontend
- Docker + docker-compose setup

Architecture documentation:
- Librarian architecture with RLM-powered query engine
- Stripe billing integration (usage-based, trials, webhooks)
- LLM abstraction layer (rate limiting, semantic cache, failover)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
parent d184d22c60, commit a5c00c1e90
23 changed files with 4681 additions and 0 deletions
.gitignore (vendored) | 4 additions

@@ -33,3 +33,7 @@ xml_pipeline/config/*.signed.xml
 # OS
 Thumbs.db
 .DS_Store
+
+# BloxServer local dev
+bloxserver.db
+bloxserver/.env
bloxserver/.env.example (new file, 54 lines)

# BloxServer API Environment Variables
# Copy this file to .env and fill in the values

# =============================================================================
# Environment
# =============================================================================
ENV=development
# ENV=production

# =============================================================================
# Database (PostgreSQL)
# =============================================================================
DATABASE_URL=postgresql+asyncpg://postgres:postgres@localhost:5432/bloxserver

# Set to true to auto-create tables on startup (disable in production)
AUTO_CREATE_TABLES=true

# =============================================================================
# Clerk Authentication
# =============================================================================
CLERK_ISSUER=https://your-clerk-instance.clerk.accounts.dev
CLERK_AUDIENCE=your-clerk-audience

# =============================================================================
# Stripe Billing
# =============================================================================
STRIPE_SECRET_KEY=sk_test_...
STRIPE_WEBHOOK_SECRET=whsec_...

# =============================================================================
# API Key Encryption
# =============================================================================
# Generate with: python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
API_KEY_ENCRYPTION_KEY=your-fernet-key-here

# =============================================================================
# CORS
# =============================================================================
CORS_ORIGINS=http://localhost:3000,https://app.openblox.ai

# =============================================================================
# Webhooks
# =============================================================================
WEBHOOK_BASE_URL=https://api.openblox.ai/webhooks

# =============================================================================
# Redis (optional, for caching/rate limiting)
# =============================================================================
# REDIS_URL=redis://localhost:6379

# =============================================================================
# Docs
# =============================================================================
ENABLE_DOCS=true
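Files like this are typically loaded with python-dotenv or pydantic-settings. As a minimal stdlib sketch of what such a loader does (the `load_env` helper below is hypothetical, not part of the scaffold):

```python
import os

def load_env(path: str = ".env") -> None:
    """Minimal .env loader: KEY=VALUE lines, '#' comments skipped,
    already-set environment variables take precedence."""
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

In practice you would call `load_env()` once at startup, before any module reads `os.getenv`.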
bloxserver/Dockerfile (new file, 58 lines)

# BloxServer API Dockerfile
# Multi-stage build for smaller production image

# =============================================================================
# Build stage
# =============================================================================
FROM python:3.12-slim AS builder

WORKDIR /app

# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first for layer caching
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /app/wheels -r requirements.txt

# =============================================================================
# Production stage
# =============================================================================
FROM python:3.12-slim AS production

WORKDIR /app

# Create non-root user
RUN groupadd --gid 1000 bloxserver \
    && useradd --uid 1000 --gid bloxserver --shell /bin/bash --create-home bloxserver

# Install runtime dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Copy wheels from builder and install
COPY --from=builder /app/wheels /wheels
RUN pip install --no-cache-dir /wheels/* && rm -rf /wheels

# Copy application code
COPY --chown=bloxserver:bloxserver . /app/bloxserver

# Set Python path
ENV PYTHONPATH=/app
ENV PYTHONUNBUFFERED=1

# Switch to non-root user
USER bloxserver

# Health check
HEALTHCHECK --interval=30s --timeout=5s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health/live || exit 1

# Expose port
EXPOSE 8000

# Run with uvicorn
CMD ["uvicorn", "bloxserver.api.main:app", "--host", "0.0.0.0", "--port", "8000"]
bloxserver/README.md (new file, 203 lines)

# BloxServer API

Backend API for BloxServer (OpenBlox.ai) - Visual AI Agent Workflow Builder.

## Quick Start

### With Docker Compose (Recommended)

```bash
cd bloxserver

# Start PostgreSQL, Redis, and API
docker-compose up -d

# Check logs
docker-compose logs -f api

# API available at http://localhost:8000
# Docs at http://localhost:8000/docs
```

### Local Development

```bash
cd bloxserver

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Linux/macOS
# .venv\Scripts\activate   # Windows

# Install dependencies
pip install -r requirements.txt

# Copy environment variables
cp .env.example .env
# Edit .env with your settings

# Start PostgreSQL and Redis (or use Docker)
docker-compose up -d postgres redis

# Run the API
python -m bloxserver.api.main
# Or with uvicorn directly:
uvicorn bloxserver.api.main:app --reload
```

## API Endpoints

### Health

- `GET /health` - Basic health check
- `GET /health/ready` - Readiness check (includes DB)
- `GET /health/live` - Liveness check

### Flows

- `GET /api/v1/flows` - List flows
- `POST /api/v1/flows` - Create flow
- `GET /api/v1/flows/{id}` - Get flow
- `PATCH /api/v1/flows/{id}` - Update flow
- `DELETE /api/v1/flows/{id}` - Delete flow
- `POST /api/v1/flows/{id}/start` - Start flow
- `POST /api/v1/flows/{id}/stop` - Stop flow

### Triggers

- `GET /api/v1/flows/{flow_id}/triggers` - List triggers
- `POST /api/v1/flows/{flow_id}/triggers` - Create trigger
- `GET /api/v1/flows/{flow_id}/triggers/{id}` - Get trigger
- `DELETE /api/v1/flows/{flow_id}/triggers/{id}` - Delete trigger
- `POST /api/v1/flows/{flow_id}/triggers/{id}/regenerate-token` - Regenerate webhook token

### Executions

- `GET /api/v1/flows/{flow_id}/executions` - List executions
- `GET /api/v1/flows/{flow_id}/executions/{id}` - Get execution
- `POST /api/v1/flows/{flow_id}/executions/run` - Manual trigger
- `GET /api/v1/flows/{flow_id}/executions/stats` - Get stats

### Webhooks

- `POST /webhooks/{token}` - Trigger flow via webhook
- `GET /webhooks/{token}/test` - Test webhook token
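The webhook endpoints need no JWT; the token in the path is the credential. A stdlib sketch of building such a request (`build_webhook_request` is an illustrative helper, not part of the API):

```python
import json
import urllib.request

def build_webhook_request(base_url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build a POST /webhooks/{token} request; the path token is the only auth."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/{token}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Send with: urllib.request.urlopen(build_webhook_request(base, token, {"event": "ping"}))
```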
## Project Structure

```
bloxserver/
├── api/
│   ├── __init__.py
│   ├── main.py            # FastAPI app entry point
│   ├── dependencies.py    # Auth, DB session dependencies
│   ├── schemas.py         # Pydantic request/response models
│   ├── models/
│   │   ├── __init__.py
│   │   ├── database.py    # SQLAlchemy engine/session
│   │   └── tables.py      # ORM table definitions
│   └── routes/
│       ├── __init__.py
│       ├── flows.py       # Flow CRUD
│       ├── triggers.py    # Trigger CRUD
│       ├── executions.py  # Execution history
│       ├── webhooks.py    # Webhook handler
│       └── health.py      # Health checks
├── requirements.txt
├── Dockerfile
├── docker-compose.yml
├── .env.example
└── README.md
```
## Authentication

Uses Clerk for JWT authentication. All `/api/v1/*` endpoints require a valid JWT.

```bash
curl -H "Authorization: Bearer <clerk-jwt>" \
  http://localhost:8000/api/v1/flows
```

## Environment Variables

See `.env.example` for all configuration options.

Key variables:
- `DATABASE_URL` - PostgreSQL connection string
- `CLERK_ISSUER` - Clerk JWT issuer URL
- `STRIPE_SECRET_KEY` - Stripe API key
- `API_KEY_ENCRYPTION_KEY` - Fernet key for encrypting user API keys

## Database Migrations

Alembic will handle migrations (not yet set up):

```bash
# Initialize (first time)
alembic init alembic

# Create migration
alembic revision --autogenerate -m "description"

# Apply migrations
alembic upgrade head
```

## Testing

```bash
# Install test dependencies
pip install pytest pytest-asyncio httpx

# Run tests
pytest tests/ -v
```

## Deployment

### Railway / Render / Fly.io

1. Connect your repo
2. Set environment variables
3. Deploy

### Kubernetes

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bloxserver-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: bloxserver-api
  template:
    metadata:
      labels:
        app: bloxserver-api
    spec:
      containers:
        - name: api
          image: your-registry/bloxserver-api:latest
          ports:
            - containerPort: 8000
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: bloxserver-secrets
                  key: database-url
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8000
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8000
```

## Next Steps

- [ ] Alembic migrations setup
- [ ] Stripe webhook handlers
- [ ] Redis rate limiting
- [ ] Container orchestration integration
- [ ] WebSocket for real-time logs
bloxserver/__init__.py (new file, 7 lines)

"""
BloxServer - Visual AI Agent Workflow Builder

SaaS backend for OpenBlox.ai
"""

__version__ = "0.1.0"
bloxserver/api/__init__.py (new file, 1 line)

"""BloxServer API package."""
bloxserver/api/dependencies.py (new file, 236 lines)

"""
FastAPI dependencies for authentication and database access.

Uses Clerk for JWT validation.
"""

from __future__ import annotations

import os
from typing import Annotated
from uuid import UUID

import httpx
from fastapi import Depends, HTTPException, Request, status
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession

from bloxserver.api.models.database import get_db
from bloxserver.api.models.tables import UserRecord

# Dev mode - skip auth for local testing
DEV_MODE = os.getenv("ENV", "development") == "development" and not os.getenv("CLERK_ISSUER")

# Clerk configuration
CLERK_ISSUER = os.getenv("CLERK_ISSUER", "")
CLERK_JWKS_URL = f"{CLERK_ISSUER}/.well-known/jwks.json" if CLERK_ISSUER else ""

# Security scheme
security = HTTPBearer(auto_error=False)


# =============================================================================
# JWT Validation (Clerk)
# =============================================================================


async def get_clerk_jwks() -> dict:
    """Fetch Clerk's JWKS for JWT validation."""
    async with httpx.AsyncClient() as client:
        response = await client.get(CLERK_JWKS_URL)
        response.raise_for_status()
        return response.json()


async def validate_clerk_token(token: str) -> dict:
    """
    Validate a Clerk JWT token and return the payload.

    In production, use a proper JWT library with caching.
    This is a simplified version for the scaffold.
    """
    import jwt
    from jwt import PyJWKClient

    try:
        # Get signing key from Clerk's JWKS
        jwks_client = PyJWKClient(CLERK_JWKS_URL)
        signing_key = jwks_client.get_signing_key_from_jwt(token)

        # Decode and validate
        payload = jwt.decode(
            token,
            signing_key.key,
            algorithms=["RS256"],
            audience=os.getenv("CLERK_AUDIENCE"),
            issuer=CLERK_ISSUER,
        )

        return payload

    except jwt.ExpiredSignatureError:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Token has expired",
        )
    except jwt.InvalidTokenError as e:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail=f"Invalid token: {e}",
        )


# =============================================================================
# Current User Dependency
# =============================================================================


class CurrentUser:
    """Authenticated user context."""

    def __init__(self, user: UserRecord, clerk_payload: dict):
        self.user = user
        self.clerk_payload = clerk_payload

    @property
    def id(self) -> UUID:
        return self.user.id

    @property
    def clerk_id(self) -> str:
        return self.user.clerk_id

    @property
    def email(self) -> str:
        return self.user.email

    @property
    def tier(self) -> str:
        return self.user.tier.value


async def get_current_user(
    request: Request,
    credentials: Annotated[HTTPAuthorizationCredentials | None, Depends(security)],
    db: Annotated[AsyncSession, Depends(get_db)],
) -> CurrentUser:
    """
    Dependency that validates the JWT and returns the current user.

    Creates the user record if this is their first request (synced from Clerk).
    In DEV_MODE without Clerk configured, returns a test user.
    """
    # Dev mode - create/return a test user without auth
    if DEV_MODE:
        dev_clerk_id = "dev_user_001"
        result = await db.execute(
            select(UserRecord).where(UserRecord.clerk_id == dev_clerk_id)
        )
        user = result.scalar_one_or_none()

        if not user:
            from bloxserver.api.models.tables import Tier

            user = UserRecord(
                clerk_id=dev_clerk_id,
                email="dev@localhost",
                name="Dev User",
                tier=Tier.PRO,  # Give dev user Pro access
            )
            db.add(user)
            await db.flush()

        return CurrentUser(user=user, clerk_payload={"sub": dev_clerk_id, "dev": True})

    # Production mode - require Clerk auth
    if not credentials:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Missing authentication token",
            headers={"WWW-Authenticate": "Bearer"},
        )

    # Validate JWT
    payload = await validate_clerk_token(credentials.credentials)
    clerk_id = payload.get("sub")

    if not clerk_id:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid token: missing subject",
        )

    # Look up or create user
    result = await db.execute(
        select(UserRecord).where(UserRecord.clerk_id == clerk_id)
    )
    user = result.scalar_one_or_none()

    if not user:
        # First login - create user record from Clerk data
        user = UserRecord(
            clerk_id=clerk_id,
            email=payload.get("email", f"{clerk_id}@unknown"),
            name=payload.get("name"),
            avatar_url=payload.get("image_url"),
        )
        db.add(user)
        await db.flush()  # Get the ID without committing

    return CurrentUser(user=user, clerk_payload=payload)


# Type aliases for cleaner route signatures
AuthenticatedUser = Annotated[CurrentUser, Depends(get_current_user)]
DbSession = Annotated[AsyncSession, Depends(get_db)]


# =============================================================================
# Optional Auth (for public endpoints)
# =============================================================================


async def get_optional_user(
    request: Request,
    credentials: Annotated[HTTPAuthorizationCredentials | None, Depends(security)],
    db: Annotated[AsyncSession, Depends(get_db)],
) -> CurrentUser | None:
    """
    Like get_current_user, but returns None instead of raising if not authenticated.
    """
    if not credentials:
        return None

    try:
        return await get_current_user(request, credentials, db)
    except HTTPException:
        return None


OptionalUser = Annotated[CurrentUser | None, Depends(get_optional_user)]


# =============================================================================
# Tier Checks
# =============================================================================


def require_tier(*allowed_tiers: str):
    """
    Dependency factory that requires the user to be on one of the allowed tiers.

    Usage:
        @router.post("/wasm", dependencies=[Depends(require_tier("pro", "enterprise"))])
    """

    async def check_tier(user: AuthenticatedUser) -> None:
        if user.tier not in allowed_tiers:
            raise HTTPException(
                status_code=status.HTTP_403_FORBIDDEN,
                detail=f"This feature requires one of: {', '.join(allowed_tiers)}",
            )

    return check_tier


RequirePro = Depends(require_tier("pro", "enterprise", "high_frequency"))
RequireEnterprise = Depends(require_tier("enterprise", "high_frequency"))
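`require_tier` above is a dependency factory: the outer call captures the allowed tiers in a closure, and FastAPI invokes the returned callable per request. The same closure pattern in isolation, without FastAPI (names here are illustrative):

```python
def make_tier_check(*allowed_tiers: str):
    """Capture configuration in a closure; the inner callable runs per request."""
    def check_tier(user_tier: str) -> None:
        if user_tier not in allowed_tiers:
            raise PermissionError(f"requires one of: {', '.join(allowed_tiers)}")
    return check_tier

# One configured checker per tier gate, mirroring RequirePro above
require_pro = make_tier_check("pro", "enterprise", "high_frequency")
```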
bloxserver/api/main.py (new file, 166 lines)

"""
BloxServer API - FastAPI Application

Main entry point for the BloxServer backend API.
"""

from __future__ import annotations

import os
from contextlib import asynccontextmanager
from typing import AsyncGenerator

from fastapi import FastAPI, Request, status
from fastapi.exceptions import RequestValidationError
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse

from bloxserver.api.models.database import init_db
from bloxserver.api.routes import executions, flows, health, triggers, webhooks
from bloxserver.api.schemas import ApiError


@asynccontextmanager
async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
    """Application lifespan - startup and shutdown events."""
    # Startup
    print("Starting BloxServer API...")

    # Initialize database tables
    if os.getenv("AUTO_CREATE_TABLES", "true").lower() == "true":
        await init_db()
        print("Database tables initialized")

    yield

    # Shutdown
    print("Shutting down BloxServer API...")


# Create FastAPI app
app = FastAPI(
    title="BloxServer API",
    description="Backend API for BloxServer - Visual AI Agent Workflow Builder",
    version="0.1.0",
    lifespan=lifespan,
    docs_url="/docs" if os.getenv("ENABLE_DOCS", "true").lower() == "true" else None,
    redoc_url="/redoc" if os.getenv("ENABLE_DOCS", "true").lower() == "true" else None,
)


# =============================================================================
# CORS Middleware
# =============================================================================

# Allowed origins (configure via environment)
CORS_ORIGINS = os.getenv(
    "CORS_ORIGINS",
    "http://localhost:3000,https://app.openblox.ai",
).split(",")

app.add_middleware(
    CORSMiddleware,
    allow_origins=CORS_ORIGINS,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)


# =============================================================================
# Exception Handlers
# =============================================================================


@app.exception_handler(RequestValidationError)
async def validation_exception_handler(
    request: Request, exc: RequestValidationError
) -> JSONResponse:
    """Convert validation errors to standard API error format."""
    errors = exc.errors()
    details = {
        ".".join(str(loc) for loc in err["loc"]): err["msg"]
        for err in errors
    }

    return JSONResponse(
        status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
        content=ApiError(
            code="validation_error",
            message="Request validation failed",
            details=details,
        ).model_dump(by_alias=True),
    )


@app.exception_handler(Exception)
async def general_exception_handler(
    request: Request, exc: Exception
) -> JSONResponse:
    """Catch-all exception handler."""
    # In production, don't expose internal errors
    if os.getenv("ENV", "development") == "production":
        return JSONResponse(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            content=ApiError(
                code="internal_error",
                message="An unexpected error occurred",
            ).model_dump(by_alias=True),
        )

    # In development, include error details
    return JSONResponse(
        status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
        content=ApiError(
            code="internal_error",
            message=str(exc),
            details={"type": type(exc).__name__},
        ).model_dump(by_alias=True),
    )


# =============================================================================
# Routes
# =============================================================================

# Health checks (no auth)
app.include_router(health.router)

# Webhook endpoint (token-based auth)
app.include_router(webhooks.router)

# Protected API routes
app.include_router(flows.router, prefix="/api/v1")
app.include_router(triggers.router, prefix="/api/v1")
app.include_router(executions.router, prefix="/api/v1")


# =============================================================================
# Root endpoint
# =============================================================================


@app.get("/")
async def root() -> dict:
    """Root endpoint - API info."""
    return {
        "name": "BloxServer API",
        "version": "0.1.0",
        "docs": "/docs",
        "health": "/health",
    }


# =============================================================================
# Run with uvicorn
# =============================================================================

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(
        "bloxserver.api.main:app",
        host=os.getenv("HOST", "0.0.0.0"),
        port=int(os.getenv("PORT", "8000")),
        reload=os.getenv("ENV", "development") == "development",
    )
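The `model_dump(by_alias=True)` calls in the exception handlers above emit the camelCase aliases mentioned in the commit message. The alias generator itself is small; a minimal sketch of the conversion (Pydantic v2 ships `pydantic.alias_generators.to_camel` for exactly this):

```python
def to_camel(name: str) -> str:
    """snake_case -> camelCase, e.g. avatar_url -> avatarUrl."""
    head, *rest = name.split("_")
    return head + "".join(word.capitalize() for word in rest)
```

Wiring it up in a schema is one config line, e.g. `model_config = ConfigDict(alias_generator=to_camel, populate_by_name=True)`.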
bloxserver/api/models/__init__.py (new file, 23 lines)

"""Database and Pydantic models."""

from bloxserver.api.models.database import Base, get_db, init_db
from bloxserver.api.models.tables import (
    ExecutionRecord,
    FlowRecord,
    TriggerRecord,
    UserApiKeyRecord,
    UserRecord,
    UsageRecord,
)

__all__ = [
    "Base",
    "get_db",
    "init_db",
    "UserRecord",
    "FlowRecord",
    "TriggerRecord",
    "ExecutionRecord",
    "UserApiKeyRecord",
    "UsageRecord",
]
84
bloxserver/api/models/database.py
Normal file
84
bloxserver/api/models/database.py
Normal file
|
|
@ -0,0 +1,84 @@
|
||||||
|
"""
|
||||||
|
Database connection and session management.
|
||||||
|
|
||||||
|
Uses SQLAlchemy async with PostgreSQL.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import os
|
||||||
|
from collections.abc import AsyncGenerator
|
||||||
|
from contextlib import asynccontextmanager
|
||||||
|
|
||||||
|
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
from sqlalchemy.orm import DeclarativeBase


class Base(DeclarativeBase):
    """Base class for all ORM models."""

    pass


# Database URL from environment.
# Supports both PostgreSQL and SQLite (for local testing).
DATABASE_URL = os.getenv(
    "DATABASE_URL",
    "sqlite+aiosqlite:///./bloxserver.db",  # SQLite default for easy local testing
)

# Create async engine with appropriate settings
_is_sqlite = DATABASE_URL.startswith("sqlite")

if _is_sqlite:
    # SQLite doesn't support pool settings
    engine = create_async_engine(
        DATABASE_URL,
        echo=os.getenv("SQL_ECHO", "false").lower() == "true",
        connect_args={"check_same_thread": False},
    )
else:
    # PostgreSQL with connection pooling
    engine = create_async_engine(
        DATABASE_URL,
        echo=os.getenv("SQL_ECHO", "false").lower() == "true",
        pool_pre_ping=True,
        pool_size=10,
        max_overflow=20,
    )

# Session factory
async_session_maker = async_sessionmaker(
    engine,
    class_=AsyncSession,
    expire_on_commit=False,
)


async def init_db() -> None:
    """Create all tables. Call once at startup."""
    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.create_all)


async def get_db() -> AsyncGenerator[AsyncSession, None]:
    """Dependency for FastAPI routes. Yields a database session."""
    async with async_session_maker() as session:
        try:
            yield session
            await session.commit()
        except Exception:
            await session.rollback()
            raise


@asynccontextmanager
async def get_db_context() -> AsyncGenerator[AsyncSession, None]:
    """Context manager for use outside of FastAPI routes."""
    async with async_session_maker() as session:
        try:
            yield session
            await session.commit()
        except Exception:
            await session.rollback()
            raise
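The commit-on-success / rollback-on-error shape shared by `get_db` and `get_db_context` can be sketched without a database, using a stand-in session (the `FakeSession` class below is hypothetical, purely for illustrating the control flow):

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncGenerator


class FakeSession:
    """Stand-in for AsyncSession: records whether commit or rollback ran."""

    def __init__(self) -> None:
        self.committed = False
        self.rolled_back = False

    async def commit(self) -> None:
        self.committed = True

    async def rollback(self) -> None:
        self.rolled_back = True


@asynccontextmanager
async def session_scope(session: FakeSession) -> AsyncGenerator[FakeSession, None]:
    # Same shape as get_db_context: commit on success, rollback on error, re-raise.
    try:
        yield session
        await session.commit()
    except Exception:
        await session.rollback()
        raise


async def main() -> tuple[FakeSession, FakeSession]:
    ok = FakeSession()
    async with session_scope(ok):
        pass  # no error -> commit runs

    bad = FakeSession()
    try:
        async with session_scope(bad):
            raise RuntimeError("boom")  # error -> rollback runs, error re-raised
    except RuntimeError:
        pass
    return ok, bad


ok, bad = asyncio.run(main())
print(ok.committed, bad.rolled_back)  # prints: True True
```

Because the exception is re-raised after rollback, FastAPI still turns handler errors into 500 responses while the session is cleaned up.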
381 bloxserver/api/models/tables.py Normal file
@@ -0,0 +1,381 @@
"""
SQLAlchemy ORM models for BloxServer.

These map to the Pydantic models in schemas.py and TypeScript types in types.ts.
"""

from __future__ import annotations

import enum
from datetime import datetime
from typing import Any
from uuid import UUID, uuid4

from sqlalchemy import (
    JSON,
    Boolean,
    DateTime,
    Enum,
    ForeignKey,
    Index,
    Integer,
    LargeBinary,
    Numeric,
    String,
    Text,
    Uuid,
    func,
)
from sqlalchemy.orm import Mapped, mapped_column, relationship

from bloxserver.api.models.database import Base


# =============================================================================
# Enums
# =============================================================================


class Tier(str, enum.Enum):
    """User subscription tier."""

    FREE = "free"
    PRO = "pro"
    ENTERPRISE = "enterprise"
    HIGH_FREQUENCY = "high_frequency"


class BillingStatus(str, enum.Enum):
    """Subscription billing status."""

    ACTIVE = "active"
    TRIALING = "trialing"
    PAST_DUE = "past_due"
    CANCELED = "canceled"
    CANCELING = "canceling"


class FlowStatus(str, enum.Enum):
    """Flow runtime status."""

    STOPPED = "stopped"
    STARTING = "starting"
    RUNNING = "running"
    STOPPING = "stopping"
    ERROR = "error"


class TriggerType(str, enum.Enum):
    """How a flow can be triggered."""

    WEBHOOK = "webhook"
    SCHEDULE = "schedule"
    MANUAL = "manual"


class ExecutionStatus(str, enum.Enum):
    """Status of a flow execution."""

    RUNNING = "running"
    SUCCESS = "success"
    ERROR = "error"
    TIMEOUT = "timeout"


# =============================================================================
# Users (synced from Clerk)
# =============================================================================


class UserRecord(Base):
    """User account, synced from Clerk."""

    __tablename__ = "users"

    # sqlalchemy.Uuid is dialect-agnostic, so the same models work on both
    # PostgreSQL and the SQLite local-dev database.
    id: Mapped[UUID] = mapped_column(Uuid, primary_key=True, default=uuid4)
    clerk_id: Mapped[str] = mapped_column(String(255), unique=True, nullable=False)
    email: Mapped[str] = mapped_column(String(255), nullable=False)
    name: Mapped[str | None] = mapped_column(String(255))
    avatar_url: Mapped[str | None] = mapped_column(Text)

    # Stripe integration
    stripe_customer_id: Mapped[str | None] = mapped_column(String(255), unique=True)
    stripe_subscription_id: Mapped[str | None] = mapped_column(String(255))
    stripe_subscription_item_id: Mapped[str | None] = mapped_column(String(255))

    # Billing state (cached from Stripe)
    tier: Mapped[Tier] = mapped_column(Enum(Tier), default=Tier.FREE)
    billing_status: Mapped[BillingStatus] = mapped_column(
        Enum(BillingStatus), default=BillingStatus.ACTIVE
    )
    trial_ends_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))
    current_period_start: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))
    current_period_end: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))

    # Timestamps
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), server_default=func.now()
    )
    updated_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), server_default=func.now(), onupdate=func.now()
    )

    # Relationships
    flows: Mapped[list[FlowRecord]] = relationship(back_populates="user", cascade="all, delete-orphan")
    api_keys: Mapped[list[UserApiKeyRecord]] = relationship(back_populates="user", cascade="all, delete-orphan")
    usage_records: Mapped[list[UsageRecord]] = relationship(back_populates="user", cascade="all, delete-orphan")

    __table_args__ = (
        Index("idx_users_clerk_id", "clerk_id"),
        Index("idx_users_stripe_customer", "stripe_customer_id"),
    )


# =============================================================================
# Flows
# =============================================================================


class FlowRecord(Base):
    """A user's workflow/flow."""

    __tablename__ = "flows"

    id: Mapped[UUID] = mapped_column(Uuid, primary_key=True, default=uuid4)
    user_id: Mapped[UUID] = mapped_column(
        Uuid, ForeignKey("users.id", ondelete="CASCADE"), nullable=False
    )
    name: Mapped[str] = mapped_column(String(100), nullable=False)
    description: Mapped[str | None] = mapped_column(String(500))

    # The actual workflow definition
    organism_yaml: Mapped[str] = mapped_column(Text, nullable=False, default="")

    # React Flow canvas state (JSON)
    canvas_state: Mapped[dict[str, Any] | None] = mapped_column(JSON)

    # Runtime state
    status: Mapped[FlowStatus] = mapped_column(Enum(FlowStatus), default=FlowStatus.STOPPED)
    container_id: Mapped[str | None] = mapped_column(String(255))
    error_message: Mapped[str | None] = mapped_column(Text)

    # Timestamps
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), server_default=func.now()
    )
    updated_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), server_default=func.now(), onupdate=func.now()
    )

    # Relationships
    user: Mapped[UserRecord] = relationship(back_populates="flows")
    triggers: Mapped[list[TriggerRecord]] = relationship(back_populates="flow", cascade="all, delete-orphan")
    executions: Mapped[list[ExecutionRecord]] = relationship(back_populates="flow", cascade="all, delete-orphan")

    __table_args__ = (
        Index("idx_flows_user_id", "user_id"),
        Index("idx_flows_status", "status"),
    )


# =============================================================================
# Triggers
# =============================================================================


class TriggerRecord(Base):
    """A trigger that can start a flow."""

    __tablename__ = "triggers"

    id: Mapped[UUID] = mapped_column(Uuid, primary_key=True, default=uuid4)
    flow_id: Mapped[UUID] = mapped_column(
        Uuid, ForeignKey("flows.id", ondelete="CASCADE"), nullable=False
    )
    type: Mapped[TriggerType] = mapped_column(Enum(TriggerType), nullable=False)
    name: Mapped[str] = mapped_column(String(100), nullable=False)

    # Trigger configuration (JSON)
    config: Mapped[dict[str, Any]] = mapped_column(JSON, nullable=False, default=dict)

    # Webhook-specific fields
    webhook_token: Mapped[str | None] = mapped_column(String(64), unique=True)
    webhook_url: Mapped[str | None] = mapped_column(Text)

    # Timestamps
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), server_default=func.now()
    )

    # Relationships
    flow: Mapped[FlowRecord] = relationship(back_populates="triggers")
    executions: Mapped[list[ExecutionRecord]] = relationship(back_populates="trigger")

    __table_args__ = (
        Index("idx_triggers_flow_id", "flow_id"),
        Index("idx_triggers_webhook_token", "webhook_token"),
    )


# =============================================================================
# Executions
# =============================================================================


class ExecutionRecord(Base):
    """A single execution/run of a flow."""

    __tablename__ = "executions"

    id: Mapped[UUID] = mapped_column(Uuid, primary_key=True, default=uuid4)
    flow_id: Mapped[UUID] = mapped_column(
        Uuid, ForeignKey("flows.id", ondelete="CASCADE"), nullable=False
    )
    trigger_id: Mapped[UUID | None] = mapped_column(
        Uuid, ForeignKey("triggers.id", ondelete="SET NULL")
    )
    trigger_type: Mapped[TriggerType] = mapped_column(Enum(TriggerType), nullable=False)

    # Execution state
    status: Mapped[ExecutionStatus] = mapped_column(
        Enum(ExecutionStatus), default=ExecutionStatus.RUNNING
    )
    error_message: Mapped[str | None] = mapped_column(Text)

    # Payloads (JSON strings for flexibility)
    input_payload: Mapped[str | None] = mapped_column(Text)
    output_payload: Mapped[str | None] = mapped_column(Text)

    # Timing
    started_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), server_default=func.now()
    )
    completed_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))
    duration_ms: Mapped[int | None] = mapped_column(Integer)

    # Relationships
    flow: Mapped[FlowRecord] = relationship(back_populates="executions")
    trigger: Mapped[TriggerRecord | None] = relationship(back_populates="executions")

    __table_args__ = (
        Index("idx_executions_flow_id", "flow_id"),
        Index("idx_executions_started_at", "started_at"),
        Index("idx_executions_status", "status"),
    )


# =============================================================================
# User API Keys (BYOK)
# =============================================================================


class UserApiKeyRecord(Base):
    """User's own API keys for BYOK (Bring Your Own Key)."""

    __tablename__ = "user_api_keys"

    id: Mapped[UUID] = mapped_column(Uuid, primary_key=True, default=uuid4)
    user_id: Mapped[UUID] = mapped_column(
        Uuid, ForeignKey("users.id", ondelete="CASCADE"), nullable=False
    )
    provider: Mapped[str] = mapped_column(String(50), nullable=False)

    # Encrypted API key
    encrypted_key: Mapped[bytes] = mapped_column(LargeBinary, nullable=False)
    key_hint: Mapped[str | None] = mapped_column(String(20))  # Last few chars for display

    # Validation state
    is_valid: Mapped[bool] = mapped_column(Boolean, default=True)
    last_error: Mapped[str | None] = mapped_column(String(255))
    last_used_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))

    # Timestamps
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), server_default=func.now()
    )

    # Relationships
    user: Mapped[UserRecord] = relationship(back_populates="api_keys")

    __table_args__ = (
        Index("idx_user_api_keys_user_provider", "user_id", "provider", unique=True),
    )


# =============================================================================
# Usage Tracking
# =============================================================================


class UsageRecord(Base):
    """Usage tracking for billing."""

    __tablename__ = "usage_records"

    id: Mapped[UUID] = mapped_column(Uuid, primary_key=True, default=uuid4)
    user_id: Mapped[UUID] = mapped_column(
        Uuid, ForeignKey("users.id", ondelete="CASCADE"), nullable=False
    )
    period_start: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), nullable=False
    )

    # Metrics
    workflow_runs: Mapped[int] = mapped_column(Integer, default=0)
    llm_tokens_in: Mapped[int] = mapped_column(Integer, default=0)
    llm_tokens_out: Mapped[int] = mapped_column(Integer, default=0)
    wasm_cpu_seconds: Mapped[float] = mapped_column(Numeric(10, 2), default=0)
    storage_gb_hours: Mapped[float] = mapped_column(Numeric(10, 2), default=0)

    # Stripe sync state
    last_synced_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))
    last_synced_runs: Mapped[int] = mapped_column(Integer, default=0)

    # Timestamps
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), server_default=func.now()
    )
    updated_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), server_default=func.now(), onupdate=func.now()
    )

    # Relationships
    user: Mapped[UserRecord] = relationship(back_populates="usage_records")

    __table_args__ = (
        Index("idx_usage_user_period", "user_id", "period_start", unique=True),
    )


# =============================================================================
# Stripe Events (Idempotency)
# =============================================================================


class StripeEventRecord(Base):
    """Processed Stripe webhook events for idempotency."""

    __tablename__ = "stripe_events"

    event_id: Mapped[str] = mapped_column(String(255), primary_key=True)
    event_type: Mapped[str] = mapped_column(String(100), nullable=False)
    processed_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), server_default=func.now()
    )
    payload: Mapped[dict[str, Any] | None] = mapped_column(JSON)

    __table_args__ = (
        Index("idx_stripe_events_processed", "processed_at"),
    )
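The `webhook_token` column above is a `String(64)` with a unique index, so whatever generates tokens must stay under that width. A minimal sketch using the stdlib `secrets` module (the generator function itself is an assumption; it does not appear in this diff):

```python
import secrets


def generate_webhook_token(n_bytes: int = 32) -> str:
    """URL-safe random token; 32 random bytes encode to 43 chars, under String(64)."""
    return secrets.token_urlsafe(n_bytes)


token = generate_webhook_token()
print(len(token))  # prints: 43
```

`secrets.token_urlsafe` is also suitable for the constant-time comparison the public webhook endpoint would need (`secrets.compare_digest`) when checking the incoming token.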
1 bloxserver/api/routes/__init__.py Normal file
@@ -0,0 +1 @@
"""API route modules."""
204 bloxserver/api/routes/executions.py Normal file
@@ -0,0 +1,204 @@
"""
Execution history and manual trigger endpoints.

Executions are immutable records of flow runs.
"""

from __future__ import annotations

from uuid import UUID

from fastapi import APIRouter, HTTPException, status
from sqlalchemy import func, select

from bloxserver.api.dependencies import AuthenticatedUser, DbSession
from bloxserver.api.models.tables import (
    ExecutionRecord,
    ExecutionStatus,
    FlowRecord,
    FlowStatus,
    TriggerType,
)
from bloxserver.api.schemas import Execution, ExecutionSummary, PaginatedResponse

router = APIRouter(prefix="/flows/{flow_id}/executions", tags=["executions"])


@router.get("", response_model=PaginatedResponse[ExecutionSummary])
async def list_executions(
    flow_id: UUID,
    user: AuthenticatedUser,
    db: DbSession,
    page: int = 1,
    page_size: int = 50,
    status_filter: ExecutionStatus | None = None,
) -> PaginatedResponse[ExecutionSummary]:
    """List execution history for a flow."""
    # Verify flow ownership
    flow_query = select(FlowRecord).where(
        FlowRecord.id == flow_id,
        FlowRecord.user_id == user.id,
    )
    flow = (await db.execute(flow_query)).scalar_one_or_none()

    if not flow:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Flow not found",
        )

    offset = (page - 1) * page_size

    # Build query
    base_query = select(ExecutionRecord).where(ExecutionRecord.flow_id == flow_id)
    if status_filter:
        base_query = base_query.where(ExecutionRecord.status == status_filter)

    # Get total count
    count_query = select(func.count()).select_from(base_query.subquery())
    total = (await db.execute(count_query)).scalar() or 0

    # Get page
    query = base_query.order_by(ExecutionRecord.started_at.desc()).offset(offset).limit(page_size)
    result = await db.execute(query)
    executions = result.scalars().all()

    return PaginatedResponse(
        items=[ExecutionSummary.model_validate(e) for e in executions],
        total=total,
        page=page,
        page_size=page_size,
        has_more=offset + len(executions) < total,
    )


# =============================================================================
# Stats endpoint
# =============================================================================
# NOTE: registered before GET /{execution_id} so that "stats" is matched as a
# literal path segment instead of being parsed (and rejected) as a UUID.


@router.get("/stats", response_model=dict)
async def get_execution_stats(
    flow_id: UUID,
    user: AuthenticatedUser,
    db: DbSession,
) -> dict:
    """Get execution statistics for a flow."""
    # Verify flow ownership
    flow_query = select(FlowRecord).where(
        FlowRecord.id == flow_id,
        FlowRecord.user_id == user.id,
    )
    flow = (await db.execute(flow_query)).scalar_one_or_none()

    if not flow:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Flow not found",
        )

    # Calculate stats
    stats_query = select(
        func.count().label("total"),
        func.count().filter(ExecutionRecord.status == ExecutionStatus.SUCCESS).label("success"),
        func.count().filter(ExecutionRecord.status == ExecutionStatus.ERROR).label("error"),
        func.avg(ExecutionRecord.duration_ms).label("avg_duration_ms"),
        func.max(ExecutionRecord.started_at).label("last_executed_at"),
    ).where(ExecutionRecord.flow_id == flow_id)

    result = await db.execute(stats_query)
    row = result.one()

    return {
        "flowId": str(flow_id),
        "executionsTotal": row.total or 0,
        "executionsSuccess": row.success or 0,
        "executionsError": row.error or 0,
        "avgDurationMs": float(row.avg_duration_ms) if row.avg_duration_ms else 0,
        "lastExecutedAt": row.last_executed_at.isoformat() if row.last_executed_at else None,
    }


@router.get("/{execution_id}", response_model=Execution)
async def get_execution(
    flow_id: UUID,
    execution_id: UUID,
    user: AuthenticatedUser,
    db: DbSession,
) -> Execution:
    """Get details of a single execution."""
    # Verify flow ownership
    flow_query = select(FlowRecord).where(
        FlowRecord.id == flow_id,
        FlowRecord.user_id == user.id,
    )
    flow = (await db.execute(flow_query)).scalar_one_or_none()

    if not flow:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Flow not found",
        )

    # Get execution
    query = select(ExecutionRecord).where(
        ExecutionRecord.id == execution_id,
        ExecutionRecord.flow_id == flow_id,
    )
    result = await db.execute(query)
    execution = result.scalar_one_or_none()

    if not execution:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Execution not found",
        )

    return Execution.model_validate(execution)


@router.post("/run", response_model=Execution, status_code=status.HTTP_201_CREATED)
async def run_flow_manually(
    flow_id: UUID,
    user: AuthenticatedUser,
    db: DbSession,
    input_payload: str | None = None,
) -> Execution:
    """
    Manually trigger a flow execution.

    The flow must be in 'running' state.
    """
    # Verify flow ownership
    flow_query = select(FlowRecord).where(
        FlowRecord.id == flow_id,
        FlowRecord.user_id == user.id,
    )
    flow = (await db.execute(flow_query)).scalar_one_or_none()

    if not flow:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Flow not found",
        )

    if flow.status != FlowStatus.RUNNING:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail=f"Flow must be running to execute (current: {flow.status})",
        )

    # Create execution record
    execution = ExecutionRecord(
        flow_id=flow_id,
        trigger_type=TriggerType.MANUAL,
        status=ExecutionStatus.RUNNING,
        input_payload=input_payload,
    )
    db.add(execution)
    await db.flush()

    # TODO: Actually dispatch to the running container
    # For now, just return the execution record

    return Execution.model_validate(execution)
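The pagination bookkeeping in `list_executions` is plain arithmetic (1-indexed pages, `offset`, and a `has_more` flag computed from the count query). The same logic isolated as a stand-alone helper, with `n_items` standing in for `len(executions)` on a full or partial page:

```python
def paginate(total: int, page: int, page_size: int) -> dict:
    """Mirror of the route's bookkeeping: 1-indexed pages, has_more flag."""
    offset = (page - 1) * page_size
    # What len(items) would be for a real query page: the remainder, capped at page_size.
    n_items = max(0, min(page_size, total - offset))
    return {"offset": offset, "n_items": n_items, "has_more": offset + n_items < total}


print(paginate(total=120, page=2, page_size=50))
# prints: {'offset': 50, 'n_items': 50, 'has_more': True}
```

Note that `has_more` is derived from the items actually returned rather than from `page * page_size`, so it stays correct on a short final page.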
269 bloxserver/api/routes/flows.py Normal file
@@ -0,0 +1,269 @@
|
||||||
|
"""
|
||||||
|
Flow CRUD endpoints.
|
||||||
|
|
||||||
|
Flows are the core entity - a user's workflow definition.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from uuid import UUID
|
||||||
|
|
||||||
|
from fastapi import APIRouter, HTTPException, status
|
||||||
|
from sqlalchemy import func, select
|
||||||
|
|
||||||
|
from bloxserver.api.dependencies import AuthenticatedUser, DbSession
|
||||||
|
from bloxserver.api.models.tables import FlowRecord, Tier
|
||||||
|
from bloxserver.api.schemas import (
|
||||||
|
CreateFlowRequest,
|
||||||
|
Flow,
|
||||||
|
FlowSummary,
|
||||||
|
PaginatedResponse,
|
||||||
|
UpdateFlowRequest,
|
||||||
|
)
|
||||||
|
|
||||||
|
router = APIRouter(prefix="/flows", tags=["flows"])
|
||||||
|
|
||||||
|
# Default organism.yaml template for new flows
|
||||||
|
DEFAULT_ORGANISM_YAML = """organism:
|
||||||
|
name: my-flow
|
||||||
|
|
||||||
|
listeners:
|
||||||
|
- name: greeter
|
||||||
|
payload_class: handlers.hello.Greeting
|
||||||
|
handler: handlers.hello.handle_greeting
|
||||||
|
description: A friendly greeter agent
|
||||||
|
agent: true
|
||||||
|
peers: []
|
||||||
|
"""
|
||||||
|
|
||||||
|
# Tier limits
|
||||||
|
TIER_FLOW_LIMITS = {
|
||||||
|
Tier.FREE: 1,
|
||||||
|
Tier.PRO: 100, # Effectively unlimited for most users
|
||||||
|
Tier.ENTERPRISE: 1000,
|
||||||
|
Tier.HIGH_FREQUENCY: 1000,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@router.get("", response_model=PaginatedResponse[FlowSummary])
|
||||||
|
async def list_flows(
|
||||||
|
user: AuthenticatedUser,
|
||||||
|
db: DbSession,
|
||||||
|
page: int = 1,
|
||||||
|
page_size: int = 20,
|
||||||
|
) -> PaginatedResponse[FlowSummary]:
|
||||||
|
"""List all flows for the current user."""
|
||||||
|
offset = (page - 1) * page_size
|
||||||
|
|
||||||
|
# Get total count
|
||||||
|
count_query = select(func.count()).select_from(FlowRecord).where(
|
||||||
|
FlowRecord.user_id == user.id
|
||||||
|
)
|
||||||
|
total = (await db.execute(count_query)).scalar() or 0
|
||||||
|
|
||||||
|
# Get page of flows
|
||||||
|
query = (
|
||||||
|
select(FlowRecord)
|
||||||
|
.where(FlowRecord.user_id == user.id)
|
||||||
|
.order_by(FlowRecord.updated_at.desc())
|
||||||
|
.offset(offset)
|
||||||
|
.limit(page_size)
|
||||||
|
)
|
||||||
|
result = await db.execute(query)
|
||||||
|
flows = result.scalars().all()
|
||||||
|
|
||||||
|
return PaginatedResponse(
|
||||||
|
items=[FlowSummary.model_validate(f) for f in flows],
|
||||||
|
total=total,
|
||||||
|
page=page,
|
||||||
|
page_size=page_size,
|
||||||
|
has_more=offset + len(flows) < total,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@router.post("", response_model=Flow, status_code=status.HTTP_201_CREATED)
|
||||||
|
async def create_flow(
|
||||||
|
user: AuthenticatedUser,
|
||||||
|
db: DbSession,
|
||||||
|
request: CreateFlowRequest,
|
||||||
|
) -> Flow:
|
||||||
|
"""Create a new flow."""
|
||||||
|
# Check tier limits
|
||||||
|
count_query = select(func.count()).select_from(FlowRecord).where(
|
||||||
|
FlowRecord.user_id == user.id
|
||||||
|
)
|
||||||
|
current_count = (await db.execute(count_query)).scalar() or 0
|
||||||
|
limit = TIER_FLOW_LIMITS.get(user.user.tier, 1)
|
||||||
|
|
||||||
|
if current_count >= limit:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=status.HTTP_403_FORBIDDEN,
|
||||||
|
detail=f"Flow limit reached ({limit}). Upgrade to create more flows.",
|
||||||
|
)
|
||||||
|
|
||||||
|
# Create flow
|
||||||
|
flow = FlowRecord(
|
||||||
|
user_id=user.id,
|
||||||
|
name=request.name,
|
||||||
|
description=request.description,
|
||||||
|
organism_yaml=request.organism_yaml or DEFAULT_ORGANISM_YAML,
|
||||||
|
)
|
||||||
|
db.add(flow)
|
||||||
|
await db.flush()
|
||||||
|
|
||||||
|
return Flow.model_validate(flow)
|
||||||
|
|
||||||
|
|
||||||
|
@router.get("/{flow_id}", response_model=Flow)
|
||||||
|
async def get_flow(
|
||||||
|
flow_id: UUID,
|
||||||
|
user: AuthenticatedUser,
|
||||||
|
db: DbSession,
|
||||||
|
) -> Flow:
|
||||||
|
"""Get a single flow by ID."""
|
||||||
|
query = select(FlowRecord).where(
|
||||||
|
FlowRecord.id == flow_id,
|
||||||
|
FlowRecord.user_id == user.id,
|
||||||
|
)
|
||||||
|
result = await db.execute(query)
|
||||||
|
flow = result.scalar_one_or_none()
|
||||||
|
|
||||||
|
if not flow:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=status.HTTP_404_NOT_FOUND,
|
||||||
|
detail="Flow not found",
|
||||||
|
)
|
||||||
|
|
||||||
|
return Flow.model_validate(flow)
|
||||||
|
|
||||||
|
|
||||||
|
@router.patch("/{flow_id}", response_model=Flow)
|
||||||
|
async def update_flow(
|
||||||
|
flow_id: UUID,
|
||||||
|
user: AuthenticatedUser,
|
||||||
|
db: DbSession,
|
||||||
|
request: UpdateFlowRequest,
|
||||||
|
) -> Flow:
|
||||||
|
"""Update a flow."""
|
||||||
|
query = select(FlowRecord).where(
|
||||||
|
FlowRecord.id == flow_id,
|
||||||
|
FlowRecord.user_id == user.id,
|
||||||
|
)
|
||||||
|
result = await db.execute(query)
|
||||||
|
flow = result.scalar_one_or_none()
|
||||||
|
|
||||||
|
if not flow:
|
||||||
|
raise HTTPException(
|
||||||
|
status_code=status.HTTP_404_NOT_FOUND,
|
||||||
|
detail="Flow not found",
|
||||||
|
)
|
||||||
|
|
||||||
|
# Update fields that were provided
|
||||||
|
if request.name is not None:
|
||||||
|
flow.name = request.name
|
||||||
|
if request.description is not None:
|
||||||
|
flow.description = request.description
|
||||||
|
if request.organism_yaml is not None:
|
||||||
|
flow.organism_yaml = request.organism_yaml
|
||||||
|
if request.canvas_state is not None:
|
||||||
|
flow.canvas_state = request.canvas_state.model_dump()
|
||||||
|
|
||||||
|
await db.flush()
|
||||||
|
return Flow.model_validate(flow)
|
||||||
|
|
||||||
|
|
||||||
|
@router.delete("/{flow_id}", status_code=status.HTTP_204_NO_CONTENT)
async def delete_flow(
    flow_id: UUID,
    user: AuthenticatedUser,
    db: DbSession,
) -> None:
    """Delete a flow."""
    query = select(FlowRecord).where(
        FlowRecord.id == flow_id,
        FlowRecord.user_id == user.id,
    )
    result = await db.execute(query)
    flow = result.scalar_one_or_none()

    if not flow:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Flow not found",
        )

    await db.delete(flow)


# =============================================================================
# Flow Actions (Start/Stop)
# =============================================================================


@router.post("/{flow_id}/start", response_model=Flow)
async def start_flow(
    flow_id: UUID,
    user: AuthenticatedUser,
    db: DbSession,
) -> Flow:
    """Start a flow (deploy container)."""
    query = select(FlowRecord).where(
        FlowRecord.id == flow_id,
        FlowRecord.user_id == user.id,
    )
    result = await db.execute(query)
    flow = result.scalar_one_or_none()

    if not flow:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Flow not found",
        )

    if flow.status not in ("stopped", "error"):
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail=f"Cannot start flow in {flow.status} state",
        )

    # TODO: Actually start the container.
    # This is where we'd call the container orchestration layer.
    # For now, just update the status.
    flow.status = "starting"
    flow.error_message = None

    await db.flush()
    return Flow.model_validate(flow)


@router.post("/{flow_id}/stop", response_model=Flow)
async def stop_flow(
    flow_id: UUID,
    user: AuthenticatedUser,
    db: DbSession,
) -> Flow:
    """Stop a running flow."""
    query = select(FlowRecord).where(
        FlowRecord.id == flow_id,
        FlowRecord.user_id == user.id,
    )
    result = await db.execute(query)
    flow = result.scalar_one_or_none()

    if not flow:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Flow not found",
        )

    if flow.status not in ("running", "starting", "error"):
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail=f"Cannot stop flow in {flow.status} state",
        )

    # TODO: Actually stop the container.
    flow.status = "stopping"

    await db.flush()
    return Flow.model_validate(flow)
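The status guards in `start_flow`/`stop_flow` amount to a small state machine; a standalone sketch of those transitions (hypothetical helper, not part of the route code):

```python
# Allowed source states, mirroring the route guards above (assumption:
# these sets stay in sync with the handlers).
STARTABLE = {"stopped", "error"}
STOPPABLE = {"running", "starting", "error"}


def next_status(current: str, action: str) -> str:
    """Return the new flow status, or raise ValueError (mapped to HTTP 400)."""
    if action == "start":
        if current not in STARTABLE:
            raise ValueError(f"Cannot start flow in {current} state")
        return "starting"
    if action == "stop":
        if current not in STOPPABLE:
            raise ValueError(f"Cannot stop flow in {current} state")
        return "stopping"
    raise ValueError(f"Unknown action: {action}")
```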
77 bloxserver/api/routes/health.py Normal file
@@ -0,0 +1,77 @@
"""
Health check and status endpoints.
"""

from __future__ import annotations

from datetime import datetime

from fastapi import APIRouter, status
from fastapi.responses import JSONResponse
from sqlalchemy import text

from bloxserver.api.models.database import async_session_maker

router = APIRouter(tags=["health"])


@router.get("/health")
async def health_check() -> dict:
    """
    Basic health check.

    Returns 200 if the service is running.
    """
    return {
        "status": "healthy",
        "timestamp": datetime.utcnow().isoformat(),
        "service": "bloxserver-api",
    }


@router.get("/health/ready")
async def readiness_check() -> dict:
    """
    Readiness check - verifies database connectivity.

    Used by Kubernetes/load balancers to determine if the service
    is ready to receive traffic.
    """
    errors = []

    # Check database
    try:
        async with async_session_maker() as session:
            await session.execute(text("SELECT 1"))
    except Exception as e:
        errors.append(f"database: {e}")

    # TODO: Check Redis
    # TODO: Check other dependencies

    if errors:
        # Return 503 so load balancers actually take the pod out of
        # rotation; a 200 response with an "unhealthy" body would still
        # pass an HTTP probe.
        return JSONResponse(
            status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
            content={
                "status": "unhealthy",
                "timestamp": datetime.utcnow().isoformat(),
                "errors": errors,
            },
        )

    return {
        "status": "ready",
        "timestamp": datetime.utcnow().isoformat(),
        "checks": {
            "database": "ok",
        },
    }


@router.get("/health/live")
async def liveness_check() -> dict:
    """
    Liveness check - just confirms the process is running.

    If this fails, Kubernetes should restart the pod.
    """
    return {
        "status": "alive",
        "timestamp": datetime.utcnow().isoformat(),
    }
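Since the docstrings target Kubernetes probes, here is a hedged sketch of the matching pod-spec fragment. The probe paths come from the routes above and port 8000 from the compose file; the thresholds and everything else are illustrative assumptions:

```yaml
# Illustrative probe wiring only; not part of this commit.
livenessProbe:
  httpGet:
    path: /health/live
    port: 8000
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8000
  periodSeconds: 10
  failureThreshold: 3
```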
221 bloxserver/api/routes/triggers.py Normal file
@@ -0,0 +1,221 @@
"""
Trigger CRUD endpoints.

Triggers define how flows are started: webhook, schedule, or manual.
"""

from __future__ import annotations

import os
import secrets
from uuid import UUID

from fastapi import APIRouter, HTTPException, status
from sqlalchemy import select

from bloxserver.api.dependencies import AuthenticatedUser, DbSession
from bloxserver.api.models.tables import FlowRecord, TriggerRecord, TriggerType
from bloxserver.api.schemas import CreateTriggerRequest, Trigger

router = APIRouter(prefix="/flows/{flow_id}/triggers", tags=["triggers"])

# Base URL for webhooks (configured via environment)
WEBHOOK_BASE_URL = os.getenv("WEBHOOK_BASE_URL", "https://api.openblox.ai/webhooks")


def generate_webhook_token() -> str:
    """Generate a secure random token for webhook URLs."""
    return secrets.token_urlsafe(32)


@router.get("", response_model=list[Trigger])
async def list_triggers(
    flow_id: UUID,
    user: AuthenticatedUser,
    db: DbSession,
) -> list[Trigger]:
    """List all triggers for a flow."""
    # Verify flow ownership
    flow_query = select(FlowRecord).where(
        FlowRecord.id == flow_id,
        FlowRecord.user_id == user.id,
    )
    flow = (await db.execute(flow_query)).scalar_one_or_none()

    if not flow:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Flow not found",
        )

    # Get triggers
    query = select(TriggerRecord).where(TriggerRecord.flow_id == flow_id)
    result = await db.execute(query)
    triggers = result.scalars().all()

    return [Trigger.model_validate(t) for t in triggers]


@router.post("", response_model=Trigger, status_code=status.HTTP_201_CREATED)
async def create_trigger(
    flow_id: UUID,
    user: AuthenticatedUser,
    db: DbSession,
    request: CreateTriggerRequest,
) -> Trigger:
    """Create a new trigger for a flow."""
    # Verify flow ownership
    flow_query = select(FlowRecord).where(
        FlowRecord.id == flow_id,
        FlowRecord.user_id == user.id,
    )
    flow = (await db.execute(flow_query)).scalar_one_or_none()

    if not flow:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Flow not found",
        )

    # Create trigger
    trigger = TriggerRecord(
        flow_id=flow_id,
        type=TriggerType(request.type.value),
        name=request.name,
        config=request.config,
    )

    # Generate webhook URL for webhook triggers
    if request.type == TriggerType.WEBHOOK:
        trigger.webhook_token = generate_webhook_token()
        trigger.webhook_url = f"{WEBHOOK_BASE_URL}/{trigger.webhook_token}"

    db.add(trigger)
    await db.flush()

    return Trigger.model_validate(trigger)


@router.get("/{trigger_id}", response_model=Trigger)
async def get_trigger(
    flow_id: UUID,
    trigger_id: UUID,
    user: AuthenticatedUser,
    db: DbSession,
) -> Trigger:
    """Get a single trigger by ID."""
    # Verify flow ownership
    flow_query = select(FlowRecord).where(
        FlowRecord.id == flow_id,
        FlowRecord.user_id == user.id,
    )
    flow = (await db.execute(flow_query)).scalar_one_or_none()

    if not flow:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Flow not found",
        )

    # Get trigger
    query = select(TriggerRecord).where(
        TriggerRecord.id == trigger_id,
        TriggerRecord.flow_id == flow_id,
    )
    result = await db.execute(query)
    trigger = result.scalar_one_or_none()

    if not trigger:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Trigger not found",
        )

    return Trigger.model_validate(trigger)


@router.delete("/{trigger_id}", status_code=status.HTTP_204_NO_CONTENT)
async def delete_trigger(
    flow_id: UUID,
    trigger_id: UUID,
    user: AuthenticatedUser,
    db: DbSession,
) -> None:
    """Delete a trigger."""
    # Verify flow ownership
    flow_query = select(FlowRecord).where(
        FlowRecord.id == flow_id,
        FlowRecord.user_id == user.id,
    )
    flow = (await db.execute(flow_query)).scalar_one_or_none()

    if not flow:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Flow not found",
        )

    # Get and delete trigger
    query = select(TriggerRecord).where(
        TriggerRecord.id == trigger_id,
        TriggerRecord.flow_id == flow_id,
    )
    result = await db.execute(query)
    trigger = result.scalar_one_or_none()

    if not trigger:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Trigger not found",
        )

    await db.delete(trigger)


@router.post("/{trigger_id}/regenerate-token", response_model=Trigger)
async def regenerate_webhook_token(
    flow_id: UUID,
    trigger_id: UUID,
    user: AuthenticatedUser,
    db: DbSession,
) -> Trigger:
    """Regenerate the webhook token for a webhook trigger."""
    # Verify flow ownership
    flow_query = select(FlowRecord).where(
        FlowRecord.id == flow_id,
        FlowRecord.user_id == user.id,
    )
    flow = (await db.execute(flow_query)).scalar_one_or_none()

    if not flow:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Flow not found",
        )

    # Get trigger
    query = select(TriggerRecord).where(
        TriggerRecord.id == trigger_id,
        TriggerRecord.flow_id == flow_id,
    )
    result = await db.execute(query)
    trigger = result.scalar_one_or_none()

    if not trigger:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Trigger not found",
        )

    if trigger.type != TriggerType.WEBHOOK:
        raise HTTPException(
            status_code=status.HTTP_400_BAD_REQUEST,
            detail="Can only regenerate token for webhook triggers",
        )

    # Regenerate
    trigger.webhook_token = generate_webhook_token()
    trigger.webhook_url = f"{WEBHOOK_BASE_URL}/{trigger.webhook_token}"

    await db.flush()
    return Trigger.model_validate(trigger)
125 bloxserver/api/routes/webhooks.py Normal file
@@ -0,0 +1,125 @@
"""
Webhook trigger endpoint.

This handles incoming webhook requests that trigger flows.
"""

from __future__ import annotations

from datetime import datetime

from fastapi import APIRouter, HTTPException, Request, status
from sqlalchemy import select

from bloxserver.api.models.database import get_db_context
from bloxserver.api.models.tables import (
    ExecutionRecord,
    ExecutionStatus,
    FlowRecord,
    TriggerRecord,
    TriggerType,
)

router = APIRouter(prefix="/webhooks", tags=["webhooks"])


@router.post("/{webhook_token}")
async def handle_webhook(
    webhook_token: str,
    request: Request,
) -> dict:
    """
    Handle an incoming webhook request.

    This endpoint is public (no auth) - the token IS the authentication.
    """
    async with get_db_context() as db:
        # Look up trigger by token
        query = select(TriggerRecord).where(
            TriggerRecord.webhook_token == webhook_token,
            TriggerRecord.type == TriggerType.WEBHOOK,
        )
        result = await db.execute(query)
        trigger = result.scalar_one_or_none()

        if not trigger:
            raise HTTPException(
                status_code=status.HTTP_404_NOT_FOUND,
                detail="Webhook not found",
            )

        # Get the flow
        flow_query = select(FlowRecord).where(FlowRecord.id == trigger.flow_id)
        flow = (await db.execute(flow_query)).scalar_one_or_none()

        if not flow:
            raise HTTPException(
                status_code=status.HTTP_404_NOT_FOUND,
                detail="Flow not found",
            )

        if flow.status != "running":
            raise HTTPException(
                status_code=status.HTTP_400_BAD_REQUEST,
                detail=f"Flow is not running (status: {flow.status})",
            )

        # Get request body
        try:
            body = await request.body()
            input_payload = body.decode("utf-8") if body else None
        except Exception:
            input_payload = None

        # Create execution record
        execution = ExecutionRecord(
            flow_id=flow.id,
            trigger_id=trigger.id,
            trigger_type=TriggerType.WEBHOOK,
            status=ExecutionStatus.RUNNING,
            input_payload=input_payload,
        )
        db.add(execution)
        await db.commit()

        # TODO: Actually dispatch to the running container.
        # This would send the payload to the flow's container.

        return {
            "status": "accepted",
            "executionId": str(execution.id),
            "message": "Webhook received and execution started",
        }


@router.get("/{webhook_token}/test")
async def test_webhook(webhook_token: str) -> dict:
    """
    Test that a webhook token is valid.

    Returns info about the trigger without actually executing.
    """
    async with get_db_context() as db:
        query = select(TriggerRecord).where(
            TriggerRecord.webhook_token == webhook_token,
            TriggerRecord.type == TriggerType.WEBHOOK,
        )
        result = await db.execute(query)
        trigger = result.scalar_one_or_none()

        if not trigger:
            raise HTTPException(
                status_code=status.HTTP_404_NOT_FOUND,
                detail="Webhook not found",
            )

        # Get the flow
        flow_query = select(FlowRecord).where(FlowRecord.id == trigger.flow_id)
        flow = (await db.execute(flow_query)).scalar_one_or_none()

        return {
            "valid": True,
            "triggerName": trigger.name,
            "flowName": flow.name if flow else None,
            "flowStatus": flow.status.value if flow else None,
        }
322 bloxserver/api/schemas.py Normal file
@@ -0,0 +1,322 @@
"""
Pydantic schemas for API request/response validation.

These match the TypeScript types in types.ts for frontend compatibility.
Uses camelCase aliases for JSON serialization.
"""

from __future__ import annotations

from datetime import datetime
from enum import Enum
from typing import Any, Generic, Literal, TypeVar
from uuid import UUID

from pydantic import BaseModel, ConfigDict, Field


# =============================================================================
# Config for camelCase serialization
# =============================================================================


def to_camel(string: str) -> str:
    """Convert snake_case to camelCase."""
    components = string.split("_")
    return components[0] + "".join(x.title() for x in components[1:])


class CamelModel(BaseModel):
    """Base model with camelCase JSON serialization."""

    model_config = ConfigDict(
        alias_generator=to_camel,
        populate_by_name=True,
        from_attributes=True,
    )


# =============================================================================
# Common Types
# =============================================================================

T = TypeVar("T")


class PaginatedResponse(CamelModel, Generic[T]):
    """Paginated list response."""

    items: list[T]
    total: int
    page: int
    page_size: int
    has_more: bool


class ApiError(CamelModel):
    """API error response."""

    code: str
    message: str
    details: dict[str, Any] | None = None


# =============================================================================
# Enums
# =============================================================================


class Tier(str, Enum):
    FREE = "free"
    PRO = "pro"
    ENTERPRISE = "enterprise"
    HIGH_FREQUENCY = "high_frequency"


class FlowStatus(str, Enum):
    STOPPED = "stopped"
    STARTING = "starting"
    RUNNING = "running"
    STOPPING = "stopping"
    ERROR = "error"


class TriggerType(str, Enum):
    WEBHOOK = "webhook"
    SCHEDULE = "schedule"
    MANUAL = "manual"


class ExecutionStatus(str, Enum):
    RUNNING = "running"
    SUCCESS = "success"
    ERROR = "error"
    TIMEOUT = "timeout"


# =============================================================================
# User
# =============================================================================


class User(CamelModel):
    """User account (synced from Clerk)."""

    id: UUID
    clerk_id: str
    email: str
    name: str | None = None
    avatar_url: str | None = None
    tier: Tier = Tier.FREE
    created_at: datetime


# =============================================================================
# Canvas State (React Flow)
# =============================================================================


class CanvasNode(CamelModel):
    """A node in the React Flow canvas."""

    id: str
    type: str
    position: dict[str, float]
    data: dict[str, Any]


class CanvasEdge(CamelModel):
    """An edge connecting nodes in the canvas."""

    id: str
    source: str
    target: str
    source_handle: str | None = None
    target_handle: str | None = None


class CanvasState(CamelModel):
    """React Flow canvas state."""

    nodes: list[CanvasNode]
    edges: list[CanvasEdge]
    viewport: dict[str, float]


# =============================================================================
# Flows
# =============================================================================


class Flow(CamelModel):
    """A user's workflow/flow."""

    id: UUID
    user_id: UUID
    name: str
    description: str | None = None
    organism_yaml: str
    canvas_state: CanvasState | None = None
    status: FlowStatus = FlowStatus.STOPPED
    container_id: str | None = None
    error_message: str | None = None
    created_at: datetime
    updated_at: datetime


class FlowSummary(CamelModel):
    """Abbreviated flow for list views."""

    id: UUID
    name: str
    description: str | None = None
    status: FlowStatus
    updated_at: datetime


class CreateFlowRequest(CamelModel):
    """Request to create a new flow."""

    name: str = Field(min_length=1, max_length=100)
    description: str | None = Field(default=None, max_length=500)
    organism_yaml: str | None = None


class UpdateFlowRequest(CamelModel):
    """Request to update a flow."""

    name: str | None = Field(default=None, min_length=1, max_length=100)
    description: str | None = Field(default=None, max_length=500)
    organism_yaml: str | None = None
    canvas_state: CanvasState | None = None


# =============================================================================
# Triggers
# =============================================================================


class WebhookTriggerConfig(CamelModel):
    """Config for webhook triggers."""

    type: Literal["webhook"] = "webhook"


class ScheduleTriggerConfig(CamelModel):
    """Config for scheduled triggers."""

    type: Literal["schedule"] = "schedule"
    cron: str = Field(description="Cron expression")
    timezone: str = "UTC"


class ManualTriggerConfig(CamelModel):
    """Config for manual triggers."""

    type: Literal["manual"] = "manual"


TriggerConfig = WebhookTriggerConfig | ScheduleTriggerConfig | ManualTriggerConfig


class Trigger(CamelModel):
    """A trigger that can start a flow."""

    id: UUID
    flow_id: UUID
    type: TriggerType
    name: str
    config: dict[str, Any]
    webhook_token: str | None = None
    webhook_url: str | None = None
    created_at: datetime


class CreateTriggerRequest(CamelModel):
    """Request to create a trigger."""

    type: TriggerType
    name: str = Field(min_length=1, max_length=100)
    config: dict[str, Any]


# =============================================================================
# Executions
# =============================================================================


class Execution(CamelModel):
    """A single execution/run of a flow."""

    id: UUID
    flow_id: UUID
    trigger_id: UUID | None = None
    trigger_type: TriggerType
    status: ExecutionStatus
    started_at: datetime
    completed_at: datetime | None = None
    duration_ms: int | None = None
    error_message: str | None = None
    input_payload: str | None = None
    output_payload: str | None = None


class ExecutionSummary(CamelModel):
    """Abbreviated execution for list views."""

    id: UUID
    status: ExecutionStatus
    trigger_type: TriggerType
    started_at: datetime
    duration_ms: int | None = None


# =============================================================================
# Usage & Stats
# =============================================================================


class UsageDashboard(CamelModel):
    """Current usage for user dashboard."""

    period_start: datetime
    period_end: datetime | None
    runs_used: int
    runs_limit: int
    runs_percentage: float
    tokens_used: int
    estimated_overage: float
    days_remaining: int


class FlowStats(CamelModel):
    """Statistics for a single flow."""

    flow_id: UUID
    executions_total: int
    executions_success: int
    executions_error: int
    avg_duration_ms: float
    last_executed_at: datetime | None = None


# =============================================================================
# API Keys (BYOK)
# =============================================================================


class ApiKeyInfo(CamelModel):
    """Info about a stored API key (never exposes the key itself)."""

    provider: str
    key_hint: str | None  # Last few chars: "...abc123"
    is_valid: bool
    last_used_at: datetime | None
    created_at: datetime


class AddApiKeyRequest(CamelModel):
    """Request to add a user's API key."""

    provider: str = Field(description="Provider name: openai, anthropic, xai")
    api_key: str = Field(min_length=10, description="The API key")
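The `to_camel` alias generator in schemas.py is pure and easy to exercise in isolation (duplicated here so the sketch runs standalone):

```python
def to_camel(string: str) -> str:
    """Convert snake_case to camelCase (copy of schemas.to_camel)."""
    components = string.split("_")
    return components[0] + "".join(x.title() for x in components[1:])

print(to_camel("page_size"))        # pageSize
print(to_camel("avg_duration_ms"))  # avgDurationMs
```

With `populate_by_name=True`, `CamelModel` accepts either `page_size` or `pageSize` on input, while `model_dump(by_alias=True)` emits the camelCase form the frontend expects.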
72 bloxserver/docker-compose.yml Normal file
@@ -0,0 +1,72 @@
# BloxServer Development Docker Compose
# Run with: docker-compose up -d

version: '3.8'

services:
  # ==========================================================================
  # PostgreSQL Database
  # ==========================================================================
  postgres:
    image: postgres:16-alpine
    container_name: bloxserver-postgres
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: bloxserver
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

  # ==========================================================================
  # Redis (for caching, rate limiting, queues)
  # ==========================================================================
  redis:
    image: redis:7-alpine
    container_name: bloxserver-redis
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

  # ==========================================================================
  # BloxServer API
  # ==========================================================================
  api:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: bloxserver-api
    ports:
      - "8000:8000"
    environment:
      - ENV=development
      - DATABASE_URL=postgresql+asyncpg://postgres:postgres@postgres:5432/bloxserver
      - REDIS_URL=redis://redis:6379
      - AUTO_CREATE_TABLES=true
      - ENABLE_DOCS=true
      - CORS_ORIGINS=http://localhost:3000
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    volumes:
      # Mount source for hot reload in development
      - .:/app/bloxserver:ro
    command: uvicorn bloxserver.api.main:app --host 0.0.0.0 --port 8000 --reload

volumes:
  postgres_data:
  redis_data:
31 bloxserver/requirements.txt Normal file
@@ -0,0 +1,31 @@
# BloxServer API Dependencies

# Web framework
fastapi>=0.109.0
uvicorn[standard]>=0.27.0

# Database
sqlalchemy[asyncio]>=2.0.0
asyncpg>=0.29.0
alembic>=1.13.0

# Authentication (Clerk JWT validation)
pyjwt[crypto]>=2.8.0
httpx>=0.27.0

# Validation & serialization
pydantic>=2.5.0
pydantic-settings>=2.1.0

# Utilities
python-dotenv>=1.0.0
humps>=0.2.2

# Stripe billing
stripe>=8.0.0

# Redis (for caching/rate limiting)
redis>=5.0.0

# Cryptography (for API key encryption)
cryptography>=42.0.0
668 docs/bloxserver-billing.md Normal file
@@ -0,0 +1,668 @@
# BloxServer Billing Integration — Stripe

**Status:** Design
**Date:** January 2026

## Overview

BloxServer uses Stripe for subscription management, usage-based billing, and payment processing. This document specifies the integration architecture, webhook handlers, and usage tracking system.

## Pricing Tiers

| Tier | Price | Runs/Month | Features |
|------|-------|------------|----------|
| **Free** | $0 | 1,000 | 1 workflow, built-in tools, community support |
| **Pro** | $29 | 100,000 | Unlimited workflows, marketplace, WASM, project memory, priority support |
| **Enterprise** | Custom | Unlimited | SSO/SAML, SLA, dedicated support, private marketplace |

### Overage Pricing (Pro)

| Metric | Included | Overage Rate |
|--------|----------|--------------|
| Workflow runs | 100K/mo | $0.50 per 1K |
| Storage | 10 GB | $0.10 per GB |
| WASM execution | 1000 CPU-sec | $0.01 per CPU-sec |
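The run-overage row works out as follows; a minimal sketch (hypothetical helper, not part of the billing code):

```python
def run_overage_usd(runs_used: int, included: int = 100_000,
                    rate_per_1k_usd: float = 0.50) -> float:
    """Dollar cost of workflow runs beyond the Pro allowance."""
    overage_runs = max(0, runs_used - included)
    return overage_runs / 1_000 * rate_per_1k_usd

# e.g. 150K runs on Pro: 50K over the allowance -> 50 * $0.50 = $25.00
```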
## Stripe Product Structure

```
Products:
├── bloxserver_free
│   └── price_free_monthly ($0/month, metered runs)
├── bloxserver_pro
│   ├── price_pro_monthly ($29/month base)
│   ├── price_pro_runs_overage (metered, $0.50/1K)
│   └── price_pro_storage_overage (metered, $0.10/GB)
└── bloxserver_enterprise
    └── price_enterprise_custom (quoted per customer)
```

### Stripe Configuration

```python
# One-time setup (or via Stripe Dashboard)

# Free tier product
free_product = stripe.Product.create(
    name="BloxServer Free",
    description="Build AI agent swarms, visually",
)

free_price = stripe.Price.create(
    product=free_product.id,
    unit_amount=0,
    currency="usd",
    recurring={"interval": "month"},
    metadata={"tier": "free", "runs_included": "1000"},
)

# Pro tier product
pro_product = stripe.Product.create(
    name="BloxServer Pro",
    description="Unlimited workflows, marketplace access, custom WASM",
)

pro_base_price = stripe.Price.create(
    product=pro_product.id,
    unit_amount=2900,  # $29.00
    currency="usd",
    recurring={"interval": "month"},
    metadata={"tier": "pro", "runs_included": "100000"},
)

pro_runs_overage = stripe.Price.create(
    product=pro_product.id,
    currency="usd",
    recurring={
        "interval": "month",
        "usage_type": "metered",
        "aggregate_usage": "sum",
    },
    unit_amount_decimal="0.05",  # 0.05 cents = $0.0005 per run = $0.50 per 1K
    metadata={"type": "runs_overage"},
)
```
|
|
||||||
|
## Database Schema

```sql
-- Users table (synced from Clerk + Stripe)
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    clerk_id VARCHAR(255) UNIQUE NOT NULL,
    email VARCHAR(255) NOT NULL,
    name VARCHAR(255),

    -- Stripe fields
    stripe_customer_id VARCHAR(255) UNIQUE,
    stripe_subscription_id VARCHAR(255),
    stripe_subscription_item_id VARCHAR(255),  -- For usage reporting

    -- Billing state (cached from Stripe)
    tier VARCHAR(50) DEFAULT 'free',              -- free, pro, enterprise
    billing_status VARCHAR(50) DEFAULT 'active',  -- active, past_due, canceled
    trial_ends_at TIMESTAMPTZ,
    current_period_start TIMESTAMPTZ,
    current_period_end TIMESTAMPTZ,

    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- Usage tracking (local, for dashboard + Stripe sync)
CREATE TABLE usage_records (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID REFERENCES users(id),
    period_start DATE NOT NULL,  -- Billing period start

    -- Metrics
    workflow_runs INT DEFAULT 0,
    llm_tokens_in INT DEFAULT 0,
    llm_tokens_out INT DEFAULT 0,
    wasm_cpu_seconds DECIMAL(10,2) DEFAULT 0,
    storage_gb_hours DECIMAL(10,2) DEFAULT 0,

    -- Stripe sync state
    last_synced_at TIMESTAMPTZ,
    last_synced_runs INT DEFAULT 0,

    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW(),

    UNIQUE(user_id, period_start)
);

-- Stripe webhook events (idempotency)
CREATE TABLE stripe_events (
    event_id VARCHAR(255) PRIMARY KEY,
    event_type VARCHAR(100) NOT NULL,
    processed_at TIMESTAMPTZ DEFAULT NOW(),
    payload JSONB
);

-- Index for cleanup
CREATE INDEX idx_stripe_events_processed ON stripe_events(processed_at);
```
## Usage Tracking

### Real-Time Counting (Redis)

```python
# On every workflow execution
async def record_workflow_run(user_id: str):
    """Increment run counter in Redis."""
    key = f"usage:{user_id}:runs:{get_current_period()}"
    await redis.incr(key)
    await redis.expire(key, 86400 * 35)  # 35-day TTL

    # Track users with usage for batch sync
    await redis.sadd("users:with_usage", user_id)

async def record_llm_tokens(user_id: str, tokens_in: int, tokens_out: int):
    """Track LLM token usage."""
    period = get_current_period()
    await redis.incrby(f"usage:{user_id}:tokens_in:{period}", tokens_in)
    await redis.incrby(f"usage:{user_id}:tokens_out:{period}", tokens_out)
```
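The snippets in this document lean on a `get_current_period()` helper that is never defined. A minimal sketch, assuming calendar-month period keys; in production the period should instead be derived from the user's `current_period_start` cached from Stripe, since billing periods are anchored to the subscription date rather than the month boundary:

```python
from datetime import datetime, timezone

def get_current_period() -> str:
    """Calendar-month period key used in the Redis usage keys, e.g. '2026-01'."""
    now = datetime.now(timezone.utc)
    return f"{now.year:04d}-{now.month:02d}"
```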
### Periodic Sync to Stripe (Hourly)

```python
async def sync_usage_to_stripe():
    """Hourly job: push usage increments to Stripe."""

    user_ids = await redis.smembers("users:with_usage")

    for user_id in user_ids:
        user = await get_user(user_id)
        if not user.stripe_subscription_item_id:
            continue  # Free tier without Stripe subscription

        # Get usage since last sync
        period = get_current_period()
        runs_key = f"usage:{user_id}:runs:{period}"

        current_runs = int(await redis.get(runs_key) or 0)
        last_synced = await get_last_synced_runs(user_id, period)

        delta = current_runs - last_synced
        if delta <= 0:
            continue

        # Check if over included limit
        tier_limit = get_tier_runs_limit(user.tier)  # 1000 or 100000
        if current_runs <= tier_limit:
            # Still within included runs, just track locally
            await update_last_synced(user_id, period, current_runs)
            continue

        # Calculate overage to report
        overage_start = max(last_synced, tier_limit)
        overage_runs = current_runs - overage_start

        if overage_runs > 0:
            # Report to Stripe
            await stripe.SubscriptionItem.create_usage_record(
                user.stripe_subscription_item_id,
                quantity=overage_runs,
                timestamp=int(time.time()),
                action='increment'
            )

            await update_last_synced(user_id, period, current_runs)

    # Clear the tracking set (will rebuild next hour)
    await redis.delete("users:with_usage")
```
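The sync loop's overage arithmetic is easy to get wrong at the boundary where a user crosses the included limit mid-period. Extracted as a pure function (the name `overage_to_report` is illustrative, not part of the codebase):

```python
def overage_to_report(current_runs: int, last_synced: int, tier_limit: int) -> int:
    """How many overage runs to report to Stripe in this sync cycle."""
    if current_runs <= tier_limit:
        return 0  # Still within included runs
    # Only runs beyond both the included limit and what was already
    # reported in an earlier cycle count toward this report.
    overage_start = max(last_synced, tier_limit)
    return current_runs - overage_start
```

Note that a user crossing the limit mid-period (e.g. synced at 90K, now at 120K on a 100K plan) is only billed for the 20K runs past the limit, not the full 30K delta.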
### Dashboard Query

```python
async def get_usage_dashboard(user_id: str) -> UsageDashboard:
    """Get current usage for user dashboard."""
    user = await get_user(user_id)
    period = get_current_period()

    # Get real-time counts from Redis
    runs = int(await redis.get(f"usage:{user_id}:runs:{period}") or 0)
    tokens_in = int(await redis.get(f"usage:{user_id}:tokens_in:{period}") or 0)
    tokens_out = int(await redis.get(f"usage:{user_id}:tokens_out:{period}") or 0)

    tier_limit = get_tier_runs_limit(user.tier)

    return UsageDashboard(
        period_start=period,
        period_end=user.current_period_end,

        runs_used=runs,
        runs_limit=tier_limit,
        runs_percentage=min(100, (runs / tier_limit) * 100),

        tokens_used=tokens_in + tokens_out,

        estimated_overage=calculate_overage_cost(runs, tier_limit),

        days_remaining=(user.current_period_end - datetime.now()).days,
    )
```
## Subscription Lifecycle

### Signup Flow

```
User clicks "Start Free Trial"
        │
        ▼
┌───────────────────────────────────────────────────────────┐
│ 1. Create Stripe Customer                                 │
│                                                           │
│    customer = stripe.Customer.create(                     │
│        email=user.email,                                  │
│        metadata={"clerk_id": user.clerk_id}               │
│    )                                                      │
└───────────────────────────────────────────────────────────┘
        │
        ▼
┌───────────────────────────────────────────────────────────┐
│ 2. Create Checkout Session (hosted payment page)          │
│                                                           │
│    session = stripe.checkout.Session.create(              │
│        customer=customer.id,                              │
│        mode='subscription',                               │
│        line_items=[{                                      │
│            'price': 'price_pro_monthly',                  │
│            'quantity': 1                                  │
│        }, {                                               │
│            'price': 'price_pro_runs_overage',  # metered  │
│        }],                                                │
│        subscription_data={                                │
│            'trial_period_days': 14,                       │
│        },                                                 │
│        success_url='https://app.openblox.ai/welcome',     │
│        cancel_url='https://app.openblox.ai/pricing',      │
│    )                                                      │
│                                                           │
│    → Redirect user to session.url                         │
└───────────────────────────────────────────────────────────┘
        │
        ▼
┌───────────────────────────────────────────────────────────┐
│ 3. User enters payment details on Stripe Checkout         │
│                                                           │
│    Card validated but NOT charged (trial)                 │
└───────────────────────────────────────────────────────────┘
        │
        ▼
┌───────────────────────────────────────────────────────────┐
│ 4. Webhook: checkout.session.completed                    │
│                                                           │
│    → Update user with stripe_customer_id                  │
│    → Update user with stripe_subscription_id              │
│    → Set tier = 'pro'                                     │
│    → Set trial_ends_at                                    │
└───────────────────────────────────────────────────────────┘
```
### Trial End

```
Day 11 of 14-day trial
        │
        ▼
┌────────────────────────────────────────────────────────────────┐
│ Scheduled job: Trial ending soon emails                        │
│                                                                │
│   SELECT * FROM users                                          │
│   WHERE trial_ends_at BETWEEN NOW() AND NOW() + INTERVAL '3 days'│
│     AND billing_status = 'trialing'                            │
│                                                                │
│   → Send "Your trial ends in 3 days" email                     │
└────────────────────────────────────────────────────────────────┘
        │
        ▼
Day 14: Trial ends
        │
        ▼
┌────────────────────────────────────────────────────────────────┐
│ Stripe automatically:                                          │
│   1. Charges the card on file                                  │
│   2. Sends invoice.payment_succeeded webhook                   │
│                                                                │
│ Our webhook handler:                                           │
│   → Update billing_status = 'active'                           │
│   → Send "Welcome to Pro!" email                               │
└────────────────────────────────────────────────────────────────┘
```
### Cancellation

```python
# User clicks "Cancel subscription" in Customer Portal
# Stripe sends webhook

@webhook("customer.subscription.updated")
async def handle_subscription_updated(event):
    subscription = event.data.object
    user = await get_user_by_stripe_subscription(subscription.id)

    if subscription.cancel_at_period_end:
        # User requested cancellation (takes effect at period end)
        await send_email(user, "subscription_canceled", {
            "effective_date": subscription.current_period_end
        })
        await db.execute("""
            UPDATE users
            SET billing_status = 'canceling',
                updated_at = NOW()
            WHERE id = $1
        """, user.id)

@webhook("customer.subscription.deleted")
async def handle_subscription_deleted(event):
    subscription = event.data.object
    user = await get_user_by_stripe_subscription(subscription.id)

    # Subscription actually ended
    await db.execute("""
        UPDATE users
        SET tier = 'free',
            billing_status = 'canceled',
            stripe_subscription_id = NULL,
            stripe_subscription_item_id = NULL,
            updated_at = NOW()
        WHERE id = $1
    """, user.id)

    await send_email(user, "downgraded_to_free")
```
## Webhook Handlers

### Endpoint Setup

```python
from fastapi import FastAPI, Request, HTTPException
import stripe

app = FastAPI()

@app.post("/webhooks/stripe")
async def stripe_webhook(request: Request):
    payload = await request.body()
    sig_header = request.headers.get("stripe-signature")

    try:
        event = stripe.Webhook.construct_event(
            payload, sig_header, settings.STRIPE_WEBHOOK_SECRET
        )
    except ValueError:
        raise HTTPException(400, "Invalid payload")
    except stripe.error.SignatureVerificationError:
        raise HTTPException(400, "Invalid signature")

    # Idempotency check
    if await is_event_processed(event.id):
        return {"status": "already_processed"}

    # Route to handler
    handler = WEBHOOK_HANDLERS.get(event.type)
    if handler:
        await handler(event)
    else:
        logger.info(f"Unhandled webhook: {event.type}")

    # Mark processed
    await mark_event_processed(event)

    return {"status": "success"}
```
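`is_event_processed` / `mark_event_processed` are left undefined above; they map onto the `stripe_events` table from the schema. One caveat: the check → handle → mark sequence can double-process an event if Stripe retries it concurrently, so a first-write-wins insert (`INSERT ... ON CONFLICT DO NOTHING`) before handling is safer. The contract, shown with an in-memory stand-in (the class name `EventLedger` is illustrative):

```python
class EventLedger:
    """Idempotency ledger for webhook events.

    Production would back this with the stripe_events table, where
    'INSERT ... ON CONFLICT DO NOTHING' makes first-seen atomic; this
    in-memory version only demonstrates the contract.
    """

    def __init__(self):
        self._seen: set[str] = set()

    def mark_processed(self, event_id: str) -> bool:
        """Return True only on the first sighting of an event id."""
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        return True
```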
### Handler Registry

```python
WEBHOOK_HANDLERS = {
    # Checkout
    "checkout.session.completed": handle_checkout_completed,

    # Subscriptions
    "customer.subscription.created": handle_subscription_created,
    "customer.subscription.updated": handle_subscription_updated,
    "customer.subscription.deleted": handle_subscription_deleted,
    "customer.subscription.trial_will_end": handle_trial_ending,

    # Payments
    "invoice.payment_succeeded": handle_payment_succeeded,
    "invoice.payment_failed": handle_payment_failed,
    "invoice.upcoming": handle_invoice_upcoming,

    # Customer
    "customer.updated": handle_customer_updated,
}
```
### Key Handlers

```python
@webhook("checkout.session.completed")
async def handle_checkout_completed(event):
    """User completed checkout - provision their account."""
    session = event.data.object

    # Get or create user
    user = await get_user_by_clerk_id(session.client_reference_id)

    # Update with Stripe IDs
    subscription = await stripe.Subscription.retrieve(session.subscription)

    await db.execute("""
        UPDATE users SET
            stripe_customer_id = $1,
            stripe_subscription_id = $2,
            stripe_subscription_item_id = $3,
            tier = $4,
            billing_status = $5,
            trial_ends_at = $6,
            current_period_start = $7,
            current_period_end = $8,
            updated_at = NOW()
        WHERE id = $9
    """,
        session.customer,
        subscription.id,
        subscription['items'].data[0].id,  # First item, used for usage reporting
        'pro',
        subscription.status,  # 'trialing' or 'active'
        datetime.fromtimestamp(subscription.trial_end) if subscription.trial_end else None,
        datetime.fromtimestamp(subscription.current_period_start),
        datetime.fromtimestamp(subscription.current_period_end),
        user.id
    )


@webhook("invoice.payment_failed")
async def handle_payment_failed(event):
    """Payment failed - notify user, potentially downgrade."""
    invoice = event.data.object
    user = await get_user_by_stripe_customer(invoice.customer)

    attempt_count = invoice.attempt_count

    if attempt_count == 1:
        # First failure - soft warning
        await send_email(user, "payment_failed_soft", {
            "amount": invoice.amount_due / 100,
            "update_url": await get_customer_portal_url(user)
        })

    elif attempt_count == 2:
        # Second failure - stronger warning
        await send_email(user, "payment_failed_warning", {
            "amount": invoice.amount_due / 100,
            "days_until_downgrade": 3
        })

    else:
        # Final failure - downgrade
        await db.execute("""
            UPDATE users SET
                tier = 'free',
                billing_status = 'past_due',
                updated_at = NOW()
            WHERE id = $1
        """, user.id)

        await send_email(user, "downgraded_payment_failed")


@webhook("customer.subscription.trial_will_end")
async def handle_trial_ending(event):
    """Trial ending in 3 days - Stripe sends this automatically."""
    subscription = event.data.object
    user = await get_user_by_stripe_subscription(subscription.id)

    await send_email(user, "trial_ending", {
        "trial_end_date": datetime.fromtimestamp(subscription.trial_end),
        "amount": 29.00,  # Pro price
        "manage_url": await get_customer_portal_url(user)
    })
```
## Customer Portal

Stripe's hosted portal for self-service billing management.

```python
async def get_customer_portal_url(user: User) -> str:
    """Generate a portal session URL for the user."""
    session = await stripe.billing_portal.Session.create(
        customer=user.stripe_customer_id,
        return_url="https://app.openblox.ai/settings/billing"
    )
    return session.url
```

**Portal capabilities:**
- Update payment method
- View invoices and receipts
- Cancel subscription
- Upgrade/downgrade plan (if configured)
## Email Templates

| Trigger | Template | Content |
|---------|----------|---------|
| Trial started | `trial_started` | Welcome, trial ends on X |
| Trial ending (3 days) | `trial_ending` | Your trial ends soon, card will be charged |
| Trial converted | `trial_converted` | Welcome to Pro! |
| Payment succeeded | `payment_succeeded` | Receipt attached |
| Payment failed (1st) | `payment_failed_soft` | Please update your card |
| Payment failed (2nd) | `payment_failed_warning` | Service will be interrupted |
| Payment failed (final) | `downgraded_payment_failed` | You've been downgraded |
| Subscription canceled | `subscription_canceled` | Access until period end |
| Downgraded | `downgraded_to_free` | You're now on Free |
## Rate Limiting & Abuse Prevention

### Soft Limits (Warning)

```python
async def check_usage_limits(user_id: str) -> UsageLimitResult:
    """Check if user is approaching limits."""
    usage = await get_current_usage(user_id)
    user = await get_user(user_id)
    tier_limit = get_tier_runs_limit(user.tier)

    percentage = (usage.runs / tier_limit) * 100

    if percentage >= 100:
        return UsageLimitResult(
            allowed=True,  # Still allow, but warn
            warning="You've exceeded your included runs. Overage charges apply.",
            overage_rate="$0.50 per 1,000 runs"
        )
    elif percentage >= 80:
        return UsageLimitResult(
            allowed=True,
            warning=f"You've used {percentage:.0f}% of your monthly runs."
        )

    return UsageLimitResult(allowed=True)
```

### Hard Limits (Free Tier)

```python
async def enforce_free_tier_limits(user_id: str) -> bool:
    """Free tier has hard limits - no overage allowed."""
    user = await get_user(user_id)
    if user.tier != "free":
        return True  # Paid tiers have soft limits

    usage = await get_current_usage(user_id)
    if usage.runs >= 1000:
        raise UsageLimitExceeded(
            "You've reached the Free tier limit of 1,000 runs/month. "
            "Upgrade to Pro for unlimited workflows."
        )

    return True
```
## Testing

### Test Mode

Stripe provides test mode with test API keys and test card numbers.

```python
# .env
STRIPE_SECRET_KEY=sk_test_...  # Test mode
STRIPE_WEBHOOK_SECRET=whsec_...

# Test cards
# 4242424242424242 - Succeeds
# 4000000000000002 - Declined
# 4000002500003155 - Requires 3D Secure
```

### Webhook Testing

```bash
# Use Stripe CLI to forward webhooks locally
stripe listen --forward-to localhost:8000/webhooks/stripe

# Trigger test events
stripe trigger invoice.payment_succeeded
stripe trigger customer.subscription.trial_will_end
```
## Monitoring & Alerts

| Metric | Alert Threshold |
|--------|-----------------|
| Webhook processing time | > 5 seconds |
| Webhook failure rate | > 1% |
| Payment failure rate | > 5% |
| Usage sync lag | > 2 hours |
| Stripe API errors | Any 5xx |
## Security Checklist

- [ ] Webhook signature verification
- [ ] Idempotent event processing
- [ ] API keys in environment variables (never in code)
- [ ] Customer portal for sensitive operations (not custom UI)
- [ ] PCI compliance via Stripe Checkout (no card data touches our servers)
- [ ] Audit log for billing events

---
## References

- [Stripe Billing](https://stripe.com/docs/billing)
- [Stripe Webhooks](https://stripe.com/docs/webhooks)
- [Stripe Checkout](https://stripe.com/docs/payments/checkout)
- [Stripe Customer Portal](https://stripe.com/docs/billing/subscriptions/customer-portal)
- [Metered Billing](https://stripe.com/docs/billing/subscriptions/metered-billing)
961
docs/bloxserver-llm-layer.md
Normal file

@@ -0,0 +1,961 @@
# BloxServer LLM Abstraction Layer — Resilient Multi-Provider Architecture

**Status:** Design
**Date:** January 2026

## Overview

The LLM abstraction layer is the critical path for all AI operations in BloxServer. It must handle:

- **Viral growth**: 100 → 10,000 users overnight
- **Provider outages**: Single provider down ≠ platform down
- **Fair access**: Paid users prioritized, free users served fairly
- **Cost control**: Platform keys vs BYOK (Bring Your Own Key)
- **Low latency**: Sub-second for simple calls, reasonable for complex

This document specifies the defense-in-depth architecture that survives success.
## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                      LLM Abstraction Layer                      │
│                                                                 │
│  Request → [Rate Limit] → [Cache Check] → [Queue] → [Dispatch]  │
│                 │              │             │          │       │
│                 ▼              ▼             ▼          ▼       │
│             Per-user       Semantic      Priority    Provider   │
│             per-tier        cache         queues      pool +    │
│              limits      (30%+ hits)    (by tier)    failover   │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │ BYOK (Bring Your Own Key)                                   ││
│  │ Pro+ users with own API keys bypass platform limits         ││
│  └─────────────────────────────────────────────────────────────┘│
│                                                                 │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │ High Frequency Tier                                         ││
│  │ Dedicated capacity, custom SLA — contact sales              ││
│  └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘
```
## Tier Limits

| Tier | Price | Requests/min | Tokens/min | Concurrent | Latency SLA |
|------|-------|--------------|------------|------------|-------------|
| **Free** | $0 | 10 | 10,000 | 2 | Best effort |
| **Pro** | $29/mo | 60 | 100,000 | 10 | < 30s P95 |
| **Enterprise** | Custom | 300 | 500,000 | 50 | < 10s P95 |
| **High Frequency** | Custom | Custom | Custom | Dedicated | Custom SLA |
| **BYOK** (any tier) | — | Unlimited\* | Unlimited\* | 20 | User's provider |

\*BYOK users are limited only by their own provider's rate limits.
### High Frequency Tier

For users requiring:

- **Low latency**: Sub-second response times
- **High throughput**: Thousands of requests per minute
- **Guaranteed capacity**: Dedicated provider allocations
- **Custom models**: Fine-tuned or private deployments

**Use cases:**

- Real-time trading signals
- Live customer support at scale
- High-volume content generation
- Latency-sensitive applications

**Pricing:** Custom — based on capacity reservation, SLA requirements, and volume.

**Landing page CTA:**

```
┌─────────────────────────────────────────────────────────────┐
│                                                             │
│   Need High Frequency?                                      │
│                                                             │
│   Building something that needs thousands of requests per   │
│   minute with sub-second latency? Let's talk dedicated      │
│   capacity and custom SLAs.                                 │
│                                                             │
│   [Contact Sales →]                                         │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
## Layer 1: Intake Rate Limiting

First line of defense. Rejects requests before they consume resources.

### Implementation

```python
from dataclasses import dataclass
from enum import Enum
import time

class Tier(Enum):
    FREE = "free"
    PRO = "pro"
    ENTERPRISE = "enterprise"
    HIGH_FREQUENCY = "high_frequency"

@dataclass
class TierLimits:
    requests_per_minute: int
    tokens_per_minute: int
    max_concurrent: int

TIER_LIMITS = {
    Tier.FREE: TierLimits(10, 10_000, 2),
    Tier.PRO: TierLimits(60, 100_000, 10),
    Tier.ENTERPRISE: TierLimits(300, 500_000, 50),
    Tier.HIGH_FREQUENCY: TierLimits(10_000, 10_000_000, 500),  # Custom per customer
}

@dataclass
class RateLimitResult:
    allowed: bool
    use_user_key: bool = False
    retry_after: int | None = None
    reason: str | None = None
    concurrent_key: str | None = None

async def rate_limit_check(user: User, request: LLMRequest) -> RateLimitResult:
    """Check if user can make this request."""

    # BYOK users bypass platform limits
    if user.has_own_api_key(request.provider):
        return RateLimitResult(allowed=True, use_user_key=True)

    limits = TIER_LIMITS[user.tier]

    # Check requests per minute (sliding window)
    rpm_key = f"ratelimit:{user.id}:rpm"
    now = time.time()
    window_start = now - 60

    # Remove old entries, add new one, count
    pipe = redis.pipeline()
    pipe.zremrangebyscore(rpm_key, 0, window_start)
    pipe.zadd(rpm_key, {str(now): now})
    pipe.zcard(rpm_key)
    pipe.expire(rpm_key, 120)
    _, _, current_rpm, _ = await pipe.execute()

    if current_rpm > limits.requests_per_minute:
        # Retry once the oldest request in the window ages out
        oldest = await redis.zrange(rpm_key, 0, 0, withscores=True)
        retry_after = max(1, int(oldest[0][1] + 60 - now)) if oldest else 1
        return RateLimitResult(
            allowed=False,
            retry_after=retry_after,
            reason=f"Rate limit: {limits.requests_per_minute} requests/minute"
        )

    # Check concurrent requests
    concurrent_key = f"ratelimit:{user.id}:concurrent"
    current_concurrent = await redis.incr(concurrent_key)
    await redis.expire(concurrent_key, 300)  # 5 min TTL as safety

    if current_concurrent > limits.max_concurrent:
        await redis.decr(concurrent_key)
        return RateLimitResult(
            allowed=False,
            retry_after=1,
            reason=f"Max concurrent: {limits.max_concurrent} requests"
        )

    return RateLimitResult(allowed=True, concurrent_key=concurrent_key)

async def release_concurrent(concurrent_key: str):
    """Release concurrent slot after request completes."""
    if concurrent_key:
        await redis.decr(concurrent_key)
```
### Rate Limit Headers

Return standard headers so clients can self-regulate:

```python
async def rate_limit_headers(user: User) -> dict:
    limits = TIER_LIMITS[user.tier]
    current = await get_current_usage(user.id)

    return {
        "X-RateLimit-Limit": str(limits.requests_per_minute),
        "X-RateLimit-Remaining": str(max(0, limits.requests_per_minute - current.rpm)),
        "X-RateLimit-Reset": str(int(time.time()) + 60),
    }
```
## Layer 2: Semantic Cache
|
||||||
|
|
||||||
|
Identical requests return cached responses. Reduces load and cost.
|
||||||
|
|
||||||
|
### Cache Key Generation
|
||||||
|
|
||||||
|
```python
|
||||||
|
import hashlib
|
||||||
|
import json
|
||||||
|
|
||||||
|
def hash_request(request: LLMRequest) -> str:
|
||||||
|
"""Generate deterministic cache key for request."""
|
||||||
|
|
||||||
|
# Include all parameters that affect output
|
||||||
|
cache_input = {
|
||||||
|
"model": request.model,
|
||||||
|
"messages": [
|
||||||
|
{"role": m.role, "content": m.content}
|
||||||
|
for m in request.messages
|
||||||
|
],
|
||||||
|
"temperature": request.temperature,
|
||||||
|
"max_tokens": request.max_tokens,
|
||||||
|
"tools": request.tools, # Tool definitions matter
|
||||||
|
# Exclude: user_id, timestamps, request_id
|
||||||
|
}
|
||||||
|
|
||||||
|
serialized = json.dumps(cache_input, sort_keys=True)
|
||||||
|
return hashlib.sha256(serialized.encode()).hexdigest()[:32]
|
||||||
|
```
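
As a quick sanity check of the scheme above (a standalone sketch using a trimmed-down, hard-coded payload rather than a real `LLMRequest`), the key is stable regardless of dict ordering and changes whenever an included parameter changes:

```python
import hashlib
import json

def hash_payload(payload: dict) -> str:
    # Same recipe as hash_request: canonical JSON, truncated SHA-256
    serialized = json.dumps(payload, sort_keys=True)
    return hashlib.sha256(serialized.encode()).hexdigest()[:32]

a = {"model": "m", "messages": [{"role": "user", "content": "hi"}], "temperature": 0}
b = {"temperature": 0, "model": "m", "messages": [{"role": "user", "content": "hi"}]}

assert hash_payload(a) == hash_payload(b)                          # key order doesn't matter
assert hash_payload(a) != hash_payload({**a, "temperature": 0.5})  # parameters do
assert len(hash_payload(a)) == 32
```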

### Cache Logic

```python
@dataclass
class CachedResponse:
    response: LLMResponse
    cached_at: float
    hit_count: int

async def check_semantic_cache(request: LLMRequest) -> LLMResponse | None:
    """Check if we've seen this exact request before."""
    cache_key = f"llmcache:{hash_request(request)}"
    cached = await redis.get(cache_key)

    if cached:
        data = json.loads(cached)

        # Update hit count for analytics
        await redis.hincrby("llmcache:stats", "hits", 1)

        return LLMResponse(
            content=data["content"],
            model=data["model"],
            usage=data["usage"],
            cached=True,
        )

    await redis.hincrby("llmcache:stats", "misses", 1)
    return None

async def cache_response(request: LLMRequest, response: LLMResponse):
    """Cache response with a TTL based on determinism."""
    # Don't cache errors or empty responses
    if response.error or not response.content:
        return

    cache_key = f"llmcache:{hash_request(request)}"

    # TTL based on temperature (determinism)
    if request.temperature == 0:
        ttl = 86400  # 24 hours for deterministic requests
    elif request.temperature < 0.3:
        ttl = 3600  # 1 hour
    elif request.temperature < 0.7:
        ttl = 300  # 5 minutes
    else:
        return  # Don't cache high-temperature responses

    cache_data = {
        "content": response.content,
        "model": response.model,
        "usage": response.usage,
        "cached_at": time.time(),
    }

    await redis.setex(cache_key, ttl, json.dumps(cache_data))
```

### Expected Cache Performance

| Use Case | Temperature | Expected Hit Rate |
|----------|-------------|-------------------|
| Tool calls (same inputs) | 0 | 70-90% |
| Structured extraction | 0-0.3 | 50-70% |
| Agent reasoning | 0.5-0.7 | 20-40% |
| Creative content | 0.8-1.0 | ~0% |

**Aggregate impact:** 30-40% reduction in API calls for typical workloads.
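
The hit rate behind these numbers can be derived from the `llmcache:stats` counters incremented in the cache logic; a minimal pure helper (standalone sketch):

```python
def cache_hit_rate(hits: int, misses: int) -> float:
    """Fraction of cache lookups served from cache; 0.0 when there is no traffic yet."""
    total = hits + misses
    return hits / total if total else 0.0

assert cache_hit_rate(30, 70) == 0.3
assert cache_hit_rate(0, 0) == 0.0
```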

## Layer 3: Priority Queues

Paid users get priority. Free users are served fairly but can be shed under load.

### Queue Structure

```python
# Redis sorted set with composite score
# Score = (priority * 1B) + timestamp
# Lower score = higher priority + earlier arrival

QUEUE_PRIORITIES = {
    Tier.HIGH_FREQUENCY: 0,  # Highest priority (dedicated customers)
    Tier.ENTERPRISE: 1,
    Tier.PRO: 2,
    "trial": 2,  # Trials get Pro priority (first impression)
    Tier.FREE: 3,  # Lowest priority
}

@dataclass
class QueuedRequest:
    ticket_id: str
    user_id: str
    tier: str
    request: LLMRequest
    enqueued_at: float
    use_user_key: bool = False

async def enqueue_request(user: User, request: LLMRequest, use_user_key: bool) -> str:
    """Add request to priority queue, return ticket ID."""
    ticket_id = f"ticket:{uuid.uuid4().hex}"
    priority = QUEUE_PRIORITIES.get(user.tier, 3)

    # Composite score: priority (billions) + timestamp (seconds)
    score = priority * 1_000_000_000 + time.time()

    queued = QueuedRequest(
        ticket_id=ticket_id,
        user_id=str(user.id),
        tier=user.tier,
        request=request,
        enqueued_at=time.time(),
        use_user_key=use_user_key,
    )

    await redis.zadd("llm:queue", {json.dumps(asdict(queued)): score})

    # Set a result placeholder
    await redis.setex(f"llm:result:{ticket_id}", 300, "pending")

    return ticket_id
```
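
To see why the composite score yields tier-then-FIFO ordering (standalone sketch with hard-coded priorities and toy timestamps rather than real `time.time()` values):

```python
def score(priority: int, enqueued_at: float) -> float:
    # Same composite as enqueue_request: priority dominates, timestamp breaks ties
    return priority * 1_000_000_000 + enqueued_at

free_early = score(3, 100.0)   # free tier, arrived first
pro_late   = score(2, 900.0)   # pro tier, arrived later
pro_early  = score(2, 100.0)

# Lower score pops first (ZPOPMIN): pro beats free despite arriving later...
assert pro_late < free_early
# ...and within a tier, earlier arrivals pop first.
assert pro_early < pro_late
```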

### Queue Workers

```python
async def queue_worker():
    """Process requests from the queue."""
    while True:
        # Get highest priority item (lowest score)
        items = await redis.zpopmin("llm:queue", count=1)

        if not items:
            await asyncio.sleep(0.1)  # Brief pause if queue empty
            continue

        data, score = items[0]
        queued = QueuedRequest(**json.loads(data))

        try:
            # Select provider and execute
            response = await execute_llm_request(queued)

            # Store result
            await redis.setex(
                f"llm:result:{queued.ticket_id}",
                300,
                json.dumps({"status": "success", "response": asdict(response)})
            )

        except Exception as e:
            await redis.setex(
                f"llm:result:{queued.ticket_id}",
                300,
                json.dumps({"status": "error", "error": str(e)})
            )

async def wait_for_result(ticket_id: str, timeout: float = 120) -> LLMResponse:
    """Wait for a queued request to complete."""
    deadline = time.time() + timeout

    while time.time() < deadline:
        result = await redis.get(f"llm:result:{ticket_id}")

        if result and result != "pending":
            data = json.loads(result)
            if data["status"] == "success":
                return LLMResponse(**data["response"])
            else:
                raise LLMError(data["error"])

        await asyncio.sleep(0.1)

    raise RequestTimeout("Request timed out")
```

### Queue Health Monitoring

```python
@dataclass
class QueueHealth:
    size: int
    oldest_wait_seconds: float
    by_tier: dict[str, int]
    status: str  # healthy, degraded, critical

async def get_queue_health() -> QueueHealth:
    """Get queue metrics for monitoring and load shedding."""
    queue_size = await redis.zcard("llm:queue")

    # Get the oldest item's wait time from its payload. (The composite score
    # can't be unpacked with a modulo: unix timestamps are larger than the
    # 1B priority step, so we read enqueued_at directly.)
    oldest = await redis.zrange("llm:queue", 0, 0)
    if oldest:
        oldest_item = json.loads(oldest[0])
        wait_time = time.time() - oldest_item["enqueued_at"]
    else:
        wait_time = 0

    # Count by tier
    all_items = await redis.zrange("llm:queue", 0, -1)
    by_tier = {}
    for item in all_items:
        data = json.loads(item)
        tier = data.get("tier", "unknown")
        by_tier[tier] = by_tier.get(tier, 0) + 1

    # Determine status
    if queue_size < 500:
        status = "healthy"
    elif queue_size < 2000:
        status = "degraded"
    else:
        status = "critical"

    return QueueHealth(
        size=queue_size,
        oldest_wait_seconds=wait_time,
        by_tier=by_tier,
        status=status,
    )
```

## Layer 4: Multi-Provider Pool with Circuit Breakers

Never depend on a single provider.

### Provider Configuration

```python
@dataclass
class ProviderConfig:
    name: str
    base_url: str
    api_key_env: str
    models: list[str]
    max_concurrent: int
    priority: int  # Lower = preferred
    timeout: float = 60.0

PROVIDERS = {
    "anthropic": ProviderConfig(
        name="anthropic",
        base_url="https://api.anthropic.com/v1",
        api_key_env="ANTHROPIC_API_KEY",
        models=["claude-sonnet-4-20250514", "claude-opus-4-20250514", "claude-haiku-3"],
        max_concurrent=100,
        priority=1,
    ),
    "openai": ProviderConfig(
        name="openai",
        base_url="https://api.openai.com/v1",
        api_key_env="OPENAI_API_KEY",
        models=["gpt-4o", "gpt-4o-mini", "o1", "o3-mini"],
        max_concurrent=50,
        priority=2,
    ),
    "xai": ProviderConfig(
        name="xai",
        base_url="https://api.x.ai/v1",
        api_key_env="XAI_API_KEY",
        models=["grok-3", "grok-3-mini"],
        max_concurrent=50,
        priority=1,
    ),
    "together": ProviderConfig(
        name="together",
        base_url="https://api.together.xyz/v1",
        api_key_env="TOGETHER_API_KEY",
        models=["llama-3-70b", "mixtral-8x7b"],
        max_concurrent=100,
        priority=3,  # Fallback
    ),
}
```

### Circuit Breaker State

```python
@dataclass
class CircuitState:
    provider: str
    healthy: bool = True
    failures: int = 0
    successes: int = 0
    last_failure: float = 0
    circuit_open_until: float = 0
    current_load: int = 0

# In-memory state (could be Redis for distributed deployments)
CIRCUIT_STATES: dict[str, CircuitState] = {
    name: CircuitState(provider=name)
    for name in PROVIDERS
}

CIRCUIT_CONFIG = {
    "failure_threshold": 5,   # Failures before opening
    "success_threshold": 3,   # Successes before closing
    "open_duration": 30,      # Seconds circuit stays open
    "half_open_requests": 1,  # Requests allowed in half-open state
}

async def record_success(provider: str):
    """Record a successful request."""
    state = CIRCUIT_STATES[provider]
    state.successes += 1
    state.failures = 0

    if not state.healthy and state.successes >= CIRCUIT_CONFIG["success_threshold"]:
        state.healthy = True
        logger.info(f"Circuit closed for {provider}")

async def record_failure(provider: str, error: Exception):
    """Record a failed request, potentially opening the circuit."""
    state = CIRCUIT_STATES[provider]
    state.failures += 1
    state.successes = 0
    state.last_failure = time.time()

    if state.failures >= CIRCUIT_CONFIG["failure_threshold"]:
        state.healthy = False
        state.circuit_open_until = time.time() + CIRCUIT_CONFIG["open_duration"]
        logger.error(f"Circuit opened for {provider}: {error}")
        await alert_ops(f"LLM provider {provider} circuit opened")

def is_provider_available(provider: str) -> bool:
    """Check if provider can accept requests."""
    state = CIRCUIT_STATES[provider]
    config = PROVIDERS[provider]

    # Circuit open?
    if not state.healthy:
        if time.time() < state.circuit_open_until:
            return False
        # Half-open: allow limited requests through to probe

    # At capacity?
    if state.current_load >= config.max_concurrent:
        return False

    return True
```
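
The open/close thresholds above can be exercised in isolation (a standalone sketch re-implementing just the counters, not the real `CircuitState`/Redis wiring):

```python
FAILURE_THRESHOLD = 5
SUCCESS_THRESHOLD = 3

class MiniBreaker:
    def __init__(self):
        self.healthy, self.failures, self.successes = True, 0, 0

    def record_failure(self):
        self.failures += 1
        self.successes = 0
        if self.failures >= FAILURE_THRESHOLD:
            self.healthy = False   # open the circuit

    def record_success(self):
        self.successes += 1
        self.failures = 0
        if not self.healthy and self.successes >= SUCCESS_THRESHOLD:
            self.healthy = True    # close after enough probes succeed

b = MiniBreaker()
for _ in range(5):
    b.record_failure()
assert not b.healthy               # opened after 5 consecutive failures
for _ in range(3):
    b.record_success()
assert b.healthy                   # closed after 3 consecutive successes
```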

### Provider Selection

```python
def get_providers_for_model(model: str) -> list[str]:
    """Get providers that support this model."""
    return [
        name for name, config in PROVIDERS.items()
        if model in config.models or any(model.startswith(m.split("-")[0]) for m in config.models)
    ]

async def select_provider(request: LLMRequest, user_key: str | None = None) -> tuple[str, str]:
    """Select best available provider, return (provider_name, api_key)."""
    candidates = get_providers_for_model(request.model)

    if not candidates:
        raise UnsupportedModel(f"No provider supports model: {request.model}")

    # Filter to available providers
    available = [p for p in candidates if is_provider_available(p)]

    if not available:
        raise NoProvidersAvailable(
            "All providers for this model are currently unavailable. "
            "Please try again in a few seconds."
        )

    # Sort by priority, then by current load
    available.sort(key=lambda p: (
        PROVIDERS[p].priority,
        CIRCUIT_STATES[p].current_load / PROVIDERS[p].max_concurrent
    ))

    selected = available[0]

    # Determine API key
    if user_key:
        api_key = user_key
    else:
        api_key = os.environ[PROVIDERS[selected].api_key_env]

    return selected, api_key
```

## Layer 5: BYOK (Bring Your Own Key)

Pro+ users can add their own API keys to bypass platform limits.

### Database Schema

```sql
CREATE TABLE user_api_keys (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID REFERENCES users(id) ON DELETE CASCADE,
    provider VARCHAR(50) NOT NULL,
    encrypted_key BYTEA NOT NULL,
    key_hint VARCHAR(20), -- Trailing chars for display: "...abc123"
    is_valid BOOLEAN DEFAULT true,
    last_used_at TIMESTAMPTZ,
    last_error VARCHAR(255),
    created_at TIMESTAMPTZ DEFAULT NOW(),

    UNIQUE(user_id, provider)
);

CREATE INDEX idx_user_api_keys_user ON user_api_keys(user_id);
```

### Key Encryption

```python
from cryptography.fernet import Fernet

# Platform encryption key (from environment, rotated periodically)
ENCRYPTION_KEY = Fernet(os.environ["API_KEY_ENCRYPTION_KEY"])

def encrypt_api_key(key: str) -> bytes:
    """Encrypt user's API key for storage."""
    return ENCRYPTION_KEY.encrypt(key.encode())

def decrypt_api_key(encrypted: bytes) -> str:
    """Decrypt user's API key for use."""
    return ENCRYPTION_KEY.decrypt(encrypted).decode()

async def store_user_api_key(user_id: str, provider: str, api_key: str):
    """Store encrypted API key for user."""
    # Validate key format
    if not validate_key_format(provider, api_key):
        raise InvalidAPIKey(f"Invalid {provider} API key format")

    # Test the key
    if not await test_api_key(provider, api_key):
        raise InvalidAPIKey(f"API key validation failed for {provider}")

    encrypted = encrypt_api_key(api_key)
    key_hint = f"...{api_key[-6:]}"

    await db.execute("""
        INSERT INTO user_api_keys (user_id, provider, encrypted_key, key_hint)
        VALUES ($1, $2, $3, $4)
        ON CONFLICT (user_id, provider)
        DO UPDATE SET encrypted_key = $3, key_hint = $4, is_valid = true, last_error = NULL
    """, user_id, provider, encrypted, key_hint)

async def get_user_api_key(user_id: str, provider: str) -> str | None:
    """Get decrypted API key for user, if they have one."""
    row = await db.fetchrow("""
        SELECT encrypted_key, is_valid
        FROM user_api_keys
        WHERE user_id = $1 AND provider = $2
    """, user_id, provider)

    if not row or not row["is_valid"]:
        return None

    return decrypt_api_key(row["encrypted_key"])
```
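
`validate_key_format` is referenced but not shown; a minimal sketch, assuming conventional per-provider key prefixes (the exact prefixes are an assumption here, not guaranteed by this document, so the function stays permissive and defers to the live `test_api_key` check):

```python
# Hypothetical prefix table -- verify against each provider's current key format
KEY_PREFIXES = {
    "anthropic": ("sk-ant-",),
    "openai": ("sk-",),
    "xai": ("xai-",),
}

def validate_key_format(provider: str, api_key: str) -> bool:
    """Cheap structural check before the live test_api_key round-trip."""
    prefixes = KEY_PREFIXES.get(provider)
    if not prefixes:
        return True  # unknown provider: defer to the live check
    return api_key.startswith(prefixes) and len(api_key) >= 20

assert validate_key_format("anthropic", "sk-ant-" + "x" * 30)
assert not validate_key_format("anthropic", "sk-wrong-prefix-123456")
```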

### BYOK Request Flow

```python
async def execute_with_byok(user: User, request: LLMRequest) -> LLMResponse:
    """Execute request, preferring user's own key if available."""
    # Check for user's key
    provider = get_provider_for_model(request.model)
    user_key = await get_user_api_key(user.id, provider)

    if user_key:
        # Use user's key - bypass platform rate limits
        try:
            response = await call_provider_direct(request, user_key)

            # Update last used
            await db.execute("""
                UPDATE user_api_keys
                SET last_used_at = NOW(), last_error = NULL
                WHERE user_id = $1 AND provider = $2
            """, user.id, provider)

            return response

        except AuthenticationError:
            # Key is invalid - mark it and fall back to platform
            await db.execute("""
                UPDATE user_api_keys
                SET is_valid = false, last_error = 'Authentication failed'
                WHERE user_id = $1 AND provider = $2
            """, user.id, provider)

            # Notify user
            await send_notification(user, "api_key_invalid", {
                "provider": provider
            })

            # Fall through to platform key

    # Use platform key (with rate limiting)
    return await execute_with_platform_key(user, request)
```

## Layer 6: Backpressure & Graceful Degradation

When overwhelmed, fail gracefully and prioritize paid users.

### Load Shedding

```python
import random

async def should_shed_load(user: User, queue_health: QueueHealth) -> bool:
    """Determine if this request should be rejected to protect the system."""

    # High Frequency and Enterprise never shed
    if user.tier in [Tier.HIGH_FREQUENCY, Tier.ENTERPRISE]:
        return False

    # Pro sheds only under critical load, and mildly (fraction is illustrative)
    if user.tier == Tier.PRO:
        if queue_health.status != "critical":
            return False
        return random.random() < 0.25

    # Free tier sheds in degraded or critical
    if user.tier == Tier.FREE and queue_health.status in ["degraded", "critical"]:
        # Probabilistic shedding that ramps with queue size
        shed_probability = min(0.9, (queue_health.size - 500) / 2000)
        return random.random() < shed_probability

    return False
```
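
The free-tier shedding curve ramps linearly from 0 at the degraded threshold (queue depth 500) to its 0.9 cap at a depth of 2300 (standalone check of the formula):

```python
def shed_probability(queue_size: int) -> float:
    # Same formula as should_shed_load's free-tier branch
    return min(0.9, (queue_size - 500) / 2000)

assert shed_probability(500) == 0.0      # shedding starts at the degraded threshold
assert shed_probability(1500) == 0.5     # halfway up the ramp
assert shed_probability(2300) == 0.9     # capped: some free traffic always gets through
assert shed_probability(10_000) == 0.9
```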

### Graceful Error Messages

```python
class ServiceDegraded(Exception):
    """Raised when load shedding rejects a request."""

    def __init__(self, tier: str, queue_health: QueueHealth):
        if tier == Tier.FREE:
            message = (
                "We're experiencing high demand. Free tier requests are "
                "temporarily paused. Upgrade to Pro for priority access, "
                "or try again in a few minutes."
            )
            retry_after = 60
        else:
            message = (
                "High demand is causing delays. Your request has been queued. "
                "Expected wait time: ~{} seconds."
            ).format(int(queue_health.oldest_wait_seconds * 1.5))
            retry_after = 30

        self.message = message
        self.retry_after = retry_after
        super().__init__(message)
```

### Timeout Handling

```python
async def execute_with_timeout(request: LLMRequest, provider: str, api_key: str) -> LLMResponse:
    """Execute request with an appropriate timeout."""
    # Timeout based on expected response size
    if request.max_tokens and request.max_tokens > 2000:
        timeout = 120  # Long responses need more time
    else:
        timeout = 60

    try:
        async with asyncio.timeout(timeout):
            return await call_provider(request, provider, api_key)
    except asyncio.TimeoutError:
        await record_failure(provider, TimeoutError("Request timed out"))
        raise RequestTimeout(
            f"Request timed out after {timeout}s. "
            "Try reducing max_tokens or simplifying the prompt."
        )
```

## Main Entry Point

```python
async def handle_llm_request(user: User, request: LLMRequest) -> LLMResponse:
    """
    Main entry point for all LLM requests.
    Implements the full defense-in-depth stack.
    """
    concurrent_key = None

    try:
        # Layer 1: Rate limiting
        rate_result = await rate_limit_check(user, request)
        if not rate_result.allowed:
            raise RateLimitExceeded(
                message=rate_result.reason,
                retry_after=rate_result.retry_after
            )
        concurrent_key = rate_result.concurrent_key

        # Layer 2: Semantic cache
        cached = await check_semantic_cache(request)
        if cached:
            return cached

        # Layer 6: Load shedding based on queue health
        queue_health = await get_queue_health()
        if await should_shed_load(user, queue_health):
            raise ServiceDegraded(user.tier, queue_health)

        # Layer 3: Enqueue with priority
        ticket_id = await enqueue_request(user, request, rate_result.use_user_key)

        # Wait for a worker to execute it (Layers 4-5: provider selection, BYOK)
        response = await wait_for_result(ticket_id, timeout=120)

        # Populate the semantic cache for future hits
        await cache_response(request, response)

        return response

    finally:
        # Always release the concurrent slot
        if concurrent_key:
            await release_concurrent(concurrent_key)
```

## Monitoring & Alerts

### Key Metrics

| Metric | Source | Warning | Critical |
|--------|--------|---------|----------|
| Queue depth | Redis ZCARD | > 500 | > 2000 |
| P50 latency | Request timing | > 10s | > 30s |
| P99 latency | Request timing | > 60s | > 120s |
| Cache hit rate | Redis stats | < 25% | < 10% |
| Provider error rate | Circuit state | > 5% | > 20% |
| Circuit breaker open | Circuit state | Any | Multiple |
| Free tier rejection rate | Load shedding | > 20% | > 50% |

### Alerting

```python
# PagerDuty / Slack alerts
ALERTS = {
    "queue_critical": {
        "condition": lambda h: h.size > 2000,
        "severity": "critical",
        "message": "LLM queue depth critical: {size} requests backed up"
    },
    "provider_down": {
        "condition": lambda p: not p.healthy,
        "severity": "warning",
        "message": "Provider {name} circuit breaker open"
    },
    "all_providers_down": {
        "condition": lambda: all(not s.healthy for s in CIRCUIT_STATES.values()),
        "severity": "critical",
        "message": "ALL LLM providers are down!"
    },
}
```
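
A periodic evaluator can walk a table like this (standalone sketch, simplified to queue-depth conditions only; the real table above mixes per-provider and global conditions with different lambda signatures):

```python
def evaluate_queue_alerts(queue_size: int) -> list[str]:
    """Return the alert messages that currently fire for a given queue depth."""
    alerts = {
        "queue_critical": (lambda size: size > 2000,
                           "LLM queue depth critical: {size} requests backed up"),
    }
    fired = []
    for name, (condition, template) in alerts.items():
        if condition(queue_size):
            fired.append(template.format(size=queue_size))
    return fired

assert evaluate_queue_alerts(100) == []
assert evaluate_queue_alerts(2500) == ["LLM queue depth critical: 2500 requests backed up"]
```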

### Dashboard Queries

```sql
-- Requests per minute by tier
SELECT
    date_trunc('minute', created_at) as minute,
    tier,
    COUNT(*) as requests
FROM llm_requests
WHERE created_at > NOW() - INTERVAL '1 hour'
GROUP BY 1, 2
ORDER BY 1 DESC;

-- Error rate by provider
SELECT
    provider,
    COUNT(*) FILTER (WHERE status = 'error') * 100.0 / COUNT(*) as error_rate
FROM llm_requests
WHERE created_at > NOW() - INTERVAL '1 hour'
GROUP BY provider;

-- BYOK adoption
SELECT
    tier,
    COUNT(*) FILTER (WHERE used_user_key) * 100.0 / COUNT(*) as byok_percentage
FROM llm_requests
WHERE created_at > NOW() - INTERVAL '24 hours'
GROUP BY tier;
```

## Viral Day Playbook

What to do when that tweet hits:

### Hour 0-1: Detection
- Alert: Queue depth > 500
- Action: Monitor, no intervention needed

### Hour 1-2: Escalation
- Alert: Queue depth > 1000, latency spiking
- Action:
  - Verify all provider circuits are healthy
  - Check cache hit rate (should be climbing)
  - Prepare to enable aggressive load shedding

### Hour 2-4: Peak
- Alert: Queue depth > 2000, free tier rejections > 30%
- Action:
  - Enable aggressive load shedding for free tier
  - Send "high demand" email to free users with upgrade CTA
  - Monitor Pro/Enterprise latency (must stay < 30s)
  - Tweet acknowledgment: "We're experiencing high demand due to [reason]. Pro users unaffected."

### Hour 4-8: Stabilization
- Queue draining as cache warms and load shedding works
- Many users convert to Pro or add BYOK keys
- Circuits recovering as providers stabilize

### Post-Mortem
- Review metrics: peak queue, rejection rate, conversion rate
- Adjust tier limits if needed
- Consider adding provider capacity for sustained growth

---

## References

- [Stripe-style rate limiting](https://stripe.com/docs/rate-limits)
- [Circuit breaker pattern](https://martinfowler.com/bliki/CircuitBreaker.html)
- [Token bucket algorithm](https://en.wikipedia.org/wiki/Token_bucket)
- [BloxServer Billing](bloxserver-billing.md) — Tier definitions and pricing

---

`docs/librarian-architecture.md` (new file, 513 lines)

# Librarian Architecture — RLM-Powered Document Intelligence

**Status:** Design
**Date:** January 2026

## Overview

The Librarian is an agent that ingests, indexes, and queries large document collections using the **Recursive Language Model (RLM)** pattern. It can handle codebases, documentation, and structured data at scales far beyond LLM context windows (10M+ tokens).

Key insight from [MIT RLM research](https://arxiv.org/abs/...): Long contexts should be loaded as **variables in a REPL environment**, not fed directly to the neural network. The LLM writes code to examine, decompose, and recursively query chunks.

## Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                      RLM-Powered Librarian                      │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                    Ingestion Pipeline                     │  │
│  │                                                           │  │
│  │  Source → Detect Type → Select Chunker → Index → Store    │  │
│  └───────────────────────────────────────────────────────────┘  │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                Query Engine (RLM Pattern)                 │  │
│  │                                                           │  │
│  │  Query → Search → Filter → Recursive Sub-Query → Answer   │  │
│  └───────────────────────────────────────────────────────────┘  │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                      Storage Layer                        │  │
│  │                                                           │  │
│  │  eXist-db (XML) + Vector Embeddings + Dependency Graph    │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```

## The RLM Pattern

Traditional LLM usage stuffs entire documents into the prompt. This fails at scale:
- Context windows have hard limits (128K-1M tokens)
- Performance degrades with context length ("context rot")
- Cost scales linearly with input size

**RLM approach:**

1. **Load as Variable**: Documents become references, not inline content
2. **Programmatic Access**: LLM writes code to peek into chunks
3. **Recursive Sub-Queries**: `llm_query(chunk, question)` for focused analysis
4. **Aggregation**: Combine sub-query results into final answer

```python
# RLM-style pseudocode
async def handle_query(query: str, codebase: CodebaseRef):
    # 1. Search index for relevant chunks (not full content)
    hits = await search_index(codebase, query)

    # 2. Filter if too many results
    if len(hits) > 10:
        hits = await llm_filter(hits, query)  # LLM picks most relevant

    # 3. Recursive sub-queries on each chunk
    findings = []
    for hit in hits:
        chunk = await load_chunk(hit)
        result = await llm_query(chunk, f"Analyze this for: {query}")
        findings.append(result)

    # 4. Aggregate into final answer
    return await llm_synthesize(findings, query)
```

## Hybrid Chunking Architecture

Chunking is domain-specific. A C++ class should stay together; a legal clause shouldn't be split mid-sentence. We use a hybrid approach:

### Built-in Chunkers (Fast Path)

| Chunker | File Types | Strategy | Implementation |
|---------|------------|----------|----------------|
| **Code** | .c, .cpp, .py, .js, .rs, ... | AST-aware splitting | tree-sitter |
| **Markdown/Docs** | .md, .rst, .txt | Heading hierarchy | Custom parser |
| **Structured Data** | .json, .xml, .yaml | Schema-aware | lxml + json |
| **Plain Text** | emails, logs, notes | Semantic paragraphs | Sentence boundaries |

These cover ~90% of use cases with optimized, predictable behavior.

### WASM Factory (Fallback for Unknown Types)

For novel formats, the AI generates a custom chunker:

```
User uploads proprietary format
            │
            ▼
┌───────────────────────────────────────────────────────────┐
│  Step 1: Sample Analysis                                  │
│                                                           │
│  AI examines sample files:                                │
│  - Structure patterns                                     │
│  - Record boundaries                                      │
│  - Semantic units                                         │
└───────────────────────────────────────────────────────────┘
            │
            ▼
┌───────────────────────────────────────────────────────────┐
│  Step 2: Generate Chunker (Rust → WASM)                   │
│                                                           │
│  AI writes Rust code implementing the chunker interface   │
└───────────────────────────────────────────────────────────┘
            │
            ▼
┌───────────────────────────────────────────────────────────┐
│  Step 3: Compile & Validate                               │
│                                                           │
│  cargo build --target wasm32-wasi                         │
│  Test on sample files                                     │
│  AI reviews output quality                                │
└───────────────────────────────────────────────────────────┘
            │
            ▼
┌───────────────────────────────────────────────────────────┐
│  Step 4: Deploy                                           │
│                                                           │
│  Store in user's WASM modules                             │
│  Optional: publish to marketplace                         │
└───────────────────────────────────────────────────────────┘
```
|
||||||
|
|
||||||
|
### WASM Chunker Interface (WIT)

```wit
// chunker.wit
interface chunker {
    record chunk {
        id: string,
        content: string,
        metadata: list<tuple<string, string>>,
        parent-id: option<string>,
        children: list<string>,
    }

    record chunker-config {
        file-type: string,
        max-chunk-size: u32,
        preserve-context: bool,
        custom-params: list<tuple<string, string>>,
    }

    // Analyze sample data, return chunking config
    analyze: func(sample: string, file-type: string) -> chunker-config;

    // Chunk a file using the config
    chunk-file: func(content: string, config: chunker-config) -> list<chunk>;
}
```

## Ingestion Pipeline

### Step 1: Source Acquisition

```python
@dataclass
class IngestionSource:
    type: Literal["git", "upload", "url", "s3"]
    location: str
    filter: str | None = None  # e.g., "*.cpp", "docs/**/*.md"
```

Supported sources:
- **Git repository**: Clone and track branches
- **File upload**: Direct upload via UI
- **URL**: Fetch remote documents
- **S3/Cloud storage**: Enterprise integrations

### Step 2: Type Detection

```python
def detect_type(file_path: str, content: bytes) -> FileType:
    # 1. Check extension
    ext = Path(file_path).suffix.lower()
    if ext in CODE_EXTENSIONS:
        return FileType.CODE

    # 2. Check magic bytes
    if content.startswith(b'%PDF'):
        return FileType.PDF

    # 3. Content analysis
    if looks_like_markdown(content):
        return FileType.MARKDOWN

    return FileType.PLAIN_TEXT
```

### Step 3: Chunking

```python
def select_chunker(file_type: FileType, user_config: ChunkerConfig) -> Chunker:
    # User override
    if user_config.custom_wasm:
        return WasmChunker(user_config.custom_wasm)

    # Built-in chunkers
    match file_type:
        case FileType.CODE:
            return TreeSitterChunker(language=detect_language(file_type))
        case FileType.MARKDOWN:
            return MarkdownChunker()
        case FileType.JSON | FileType.XML | FileType.YAML:
            return StructuredDataChunker()
        case _:
            return PlainTextChunker()
```

### Step 4: Indexing

Each chunk is indexed in multiple ways:

| Index Type | Purpose | Implementation |
|------------|---------|----------------|
| **Full-text** | Keyword search | eXist-db Lucene |
| **Vector** | Semantic similarity | Embeddings (OpenAI/local) |
| **Graph** | Relationships | Class hierarchy, imports, references |
| **Metadata** | Filtering | File path, type, timestamp |

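A toy in-memory version makes the four-way write concrete. All names here are illustrative; the real indexes live in eXist-db and a vector store, not in Python dicts.

```python
from dataclasses import dataclass, field

@dataclass
class IndexBundle:
    """In-memory stand-ins for the four index types (illustrative names)."""
    fulltext: dict[str, set[str]] = field(default_factory=dict)    # term -> chunk ids
    vectors: dict[str, list[float]] = field(default_factory=dict)  # chunk id -> embedding
    graph: dict[str, set[str]] = field(default_factory=dict)       # chunk id -> referenced ids
    metadata: dict[str, dict[str, str]] = field(default_factory=dict)

def index_chunk(idx: IndexBundle, chunk_id: str, text: str,
                embedding: list[float], refs: set[str],
                meta: dict[str, str]) -> None:
    """Write one chunk into every index; each supports a different query mode."""
    for term in set(text.lower().split()):
        idx.fulltext.setdefault(term, set()).add(chunk_id)
    idx.vectors[chunk_id] = embedding
    idx.graph[chunk_id] = refs
    idx.metadata[chunk_id] = meta
```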
### Step 5: Storage

```xml
<!-- Chunk stored in eXist-db -->
<chunk xmlns="https://bloxserver.io/ns/librarian/v1">
  <id>opencascade:BRepBuilderAPI_MakeEdge:constructor_1</id>
  <source>
    <repo>opencascade</repo>
    <path>src/BRepBuilderAPI/BRepBuilderAPI_MakeEdge.cxx</path>
    <lines start="42" end="87"/>
  </source>
  <type>function</type>
  <metadata>
    <class>BRepBuilderAPI_MakeEdge</class>
    <visibility>public</visibility>
    <params>const TopoDS_Vertex&amp;, const TopoDS_Vertex&amp;</params>
  </metadata>
  <content><![CDATA[
BRepBuilderAPI_MakeEdge::BRepBuilderAPI_MakeEdge(
    const TopoDS_Vertex& V1,
    const TopoDS_Vertex& V2)
{
  // ... implementation
}
  ]]></content>
  <embedding>[0.023, -0.041, 0.089, ...]</embedding>
</chunk>
```

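A chunk record in this schema can be produced with the standard library. The sketch below covers only a few of the fields and escapes text instead of emitting CDATA (equivalent after parsing); the function name is an assumption.

```python
import xml.etree.ElementTree as ET

NS = "https://bloxserver.io/ns/librarian/v1"

def chunk_to_xml(chunk_id: str, repo: str, path: str,
                 start: int, end: int, content: str) -> bytes:
    """Serialize a (partial) chunk record in the schema shown above."""
    ET.register_namespace("", NS)  # emit the namespace as the default
    root = ET.Element(f"{{{NS}}}chunk")
    ET.SubElement(root, f"{{{NS}}}id").text = chunk_id
    source = ET.SubElement(root, f"{{{NS}}}source")
    ET.SubElement(source, f"{{{NS}}}repo").text = repo
    ET.SubElement(source, f"{{{NS}}}path").text = path
    ET.SubElement(source, f"{{{NS}}}lines", start=str(start), end=str(end))
    # ElementTree escapes special characters rather than wrapping in CDATA.
    ET.SubElement(root, f"{{{NS}}}content").text = content
    return ET.tostring(root, encoding="utf-8")
```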
## Query Engine

### Query Flow

```
User: "How does BRepBuilderAPI_MakeEdge handle degenerate curves?"
                             │
                             ▼
┌───────────────────────────────────────────────────────────┐
│ Step 1: Search                                            │
│                                                           │
│ - Vector search: find semantically similar chunks         │
│ - Keyword search: "BRepBuilderAPI_MakeEdge" + "degenerate"│
│ - Graph traversal: class hierarchy, method calls          │
│                                                           │
│ Result: 47 potentially relevant chunks                    │
└───────────────────────────────────────────────────────────┘
                             │
                             ▼
┌───────────────────────────────────────────────────────────┐
│ Step 2: Filter (LLM-assisted)                             │
│                                                           │
│ Too many chunks for direct analysis.                      │
│ LLM reviews summaries, picks top 8 most relevant.         │
│                                                           │
│ Selected:                                                 │
│ - BRepBuilderAPI_MakeEdge constructors (3 chunks)         │
│ - Edge validation methods (2 chunks)                      │
│ - Degenerate curve handling (2 chunks)                    │
│ - Error reporting (1 chunk)                               │
└───────────────────────────────────────────────────────────┘
                             │
                             ▼
┌───────────────────────────────────────────────────────────┐
│ Step 3: Recursive Sub-Queries                             │
│                                                           │
│ For each chunk, focused LLM query:                        │
│                                                           │
│ llm_query(chunk_1, "How does this handle degenerate...")  │
│ llm_query(chunk_2, "What validation happens here...")     │
│ llm_query(chunk_3, "What errors are raised for...")       │
│ ...                                                       │
│                                                           │
│ 8 parallel sub-queries → 8 focused findings               │
└───────────────────────────────────────────────────────────┘
                             │
                             ▼
┌───────────────────────────────────────────────────────────┐
│ Step 4: Synthesize                                        │
│                                                           │
│ LLM combines findings into coherent answer:               │
│                                                           │
│ "BRepBuilderAPI_MakeEdge handles degenerate curves by:    │
│  1. Checking curve bounds in the constructor...           │
│  2. Calling BRepCheck_Edge for validation...              │
│  3. Setting myError to BRepBuilderAPI_CurveTooSmall..."   │
└───────────────────────────────────────────────────────────┘
```

### Handler Implementation

```python
@xmlify
@dataclass
class LibrarianQuery:
    """Query the librarian for information."""
    collection: str           # Which indexed collection
    question: str             # Natural language question
    max_chunks: int = 10      # Limit for recursive queries
    include_sources: bool = True


@xmlify
@dataclass
class LibrarianResponse:
    """Response from librarian with sources."""
    answer: str
    sources: list[SourceReference]
    confidence: float


async def handle_librarian_query(
    payload: LibrarianQuery,
    metadata: HandlerMetadata
) -> HandlerResponse:
    """RLM-style query handler."""

    # 1. Search for relevant chunks
    hits = await search_collection(
        payload.collection,
        payload.question,
        limit=50  # Cast wide net
    )

    # 2. Filter if needed
    if len(hits) > payload.max_chunks:
        hits = await llm_filter_chunks(
            hits,
            payload.question,
            limit=payload.max_chunks
        )

    # 3. Recursive sub-queries
    findings = await asyncio.gather(*[
        llm_analyze_chunk(chunk, payload.question)
        for chunk in hits
    ])

    # 4. Synthesize answer
    answer = await llm_synthesize(findings, payload.question)

    # 5. Build response
    sources = [
        SourceReference(
            path=hit.source_path,
            lines=(hit.start_line, hit.end_line),
            relevance=hit.score
        )
        for hit in hits
    ]

    return HandlerResponse.respond(
        payload=LibrarianResponse(
            answer=answer,
            sources=sources if payload.include_sources else [],
            confidence=calculate_confidence(findings)
        )
    )
```

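The handler calls a `calculate_confidence` helper that is not defined in this document. One plausible sketch, assuming each finding carries a `relevance` score in [0, 1], averages per-chunk relevance and damps answers built from very few chunks:

```python
def calculate_confidence(findings: list[dict]) -> float:
    """Aggregate per-chunk findings into one score in [0, 1].

    Sketch only: assumes each finding is a dict with a 'relevance' key.
    An empty result set means nothing was found, so confidence is 0.
    """
    if not findings:
        return 0.0
    scores = [f.get("relevance", 0.0) for f in findings]
    # Mean relevance, damped when fewer than 3 chunks contributed.
    coverage = min(len(findings) / 3, 1.0)
    return round(sum(scores) / len(scores) * coverage, 3)
```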
## Storage Layer

### eXist-db (Primary Store)

XML-native database for chunk storage and XQuery retrieval.

**Why eXist-db:**
- Native XQuery for complex queries
- Full-text search with Lucene
- XML validation against schemas
- Transactional updates

**Collections structure:**
```
/db/librarian/
├── collections/
│   ├── {user_id}/
│   │   ├── {collection_id}/
│   │   │   ├── metadata.xml
│   │   │   ├── chunks/
│   │   │   │   ├── chunk_001.xml
│   │   │   │   ├── chunk_002.xml
│   │   │   │   └── ...
│   │   │   └── index/
│   │   │       └── embeddings.bin
```

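Documents can be written through eXist-db's REST interface with a plain HTTP PUT against a path under `/exist/rest`. The sketch below maps a chunk id to its document path in the hierarchy above; the helper names and the escaping choice are assumptions, not part of the eXist API.

```python
from urllib import parse, request

EXIST_BASE = "http://localhost:8080/exist/rest"  # assumed local eXist instance

def chunk_document_path(user_id: str, collection_id: str, chunk_id: str) -> str:
    """Map a chunk to its document path in the /db/librarian hierarchy."""
    safe_id = parse.quote(chunk_id, safe="")  # chunk ids contain ':' characters
    return f"/db/librarian/collections/{user_id}/{collection_id}/chunks/{safe_id}.xml"

def store_chunk(user_id: str, collection_id: str,
                chunk_id: str, xml: bytes) -> int:
    """PUT the chunk document via eXist's REST interface; returns HTTP status."""
    url = EXIST_BASE + chunk_document_path(user_id, collection_id, chunk_id)
    req = request.Request(url, data=xml, method="PUT",
                          headers={"Content-Type": "application/xml"})
    with request.urlopen(req) as resp:
        return resp.status
```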
### Vector Embeddings

For semantic search, chunks are embedded using:
- OpenAI `text-embedding-3-small` (cloud)
- Sentence Transformers (local/self-hosted)

Embeddings are stored alongside chunks, or in a dedicated vector DB (Qdrant/Pinecone) at scale.

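Without a dedicated vector DB, semantic search over stored embeddings reduces to cosine similarity plus a top-k sort. A minimal sketch, assuming embeddings are plain float lists:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec: list[float],
          chunks: list[tuple[str, list[float]]], k: int = 10) -> list[str]:
    """chunks: (chunk_id, embedding) pairs; returns best-matching ids first."""
    scored = [(cosine(query_vec, emb), cid) for cid, emb in chunks]
    scored.sort(reverse=True)
    return [cid for _, cid in scored[:k]]
```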
### Dependency Graph

For code collections, track relationships:
- **Class hierarchy**: inheritance, interfaces
- **Imports**: file dependencies
- **Call graph**: function → function references

Relationships are stored in eXist-db as XML, or in an external graph DB for complex traversals.

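Traversal over such a graph can be a bounded breadth-first walk that pulls in chunks related to a search hit. A sketch, with the graph shape assumed to be chunk id → referenced chunk ids:

```python
from collections import deque

def related_chunks(graph: dict[str, set[str]],
                   start: str, max_hops: int = 2) -> set[str]:
    """Collect chunks within max_hops of start via BFS over the dependency edges."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not expand past the hop budget
        for neighbor in graph.get(node, set()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return seen - {start}
```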
## Configuration

### organism.yaml

```yaml
listeners:
  - name: librarian
    handler: xml_pipeline.tools.librarian.handle_librarian_query
    payload_class: xml_pipeline.tools.librarian.LibrarianQuery
    description: Query indexed document collections
    agent: true
    peers: []  # Terminal handler
    config:
      exist_db:
        url: "http://localhost:8080/exist"
        user_env: EXIST_USER
        password_env: EXIST_PASSWORD
      embeddings:
        provider: openai  # or "local"
        model: text-embedding-3-small
      chunkers:
        code:
          max_chunk_size: 2000
          overlap: 200
        markdown:
          split_on_headings: true
          min_heading_level: 2
```

### Ingestion API

```python
# Ingest a git repository
await librarian.ingest(
    source=GitSource(
        url="https://github.com/Open-Cascade-SAS/OCCT",
        branch="master",
        filter="src/**/*.cxx"
    ),
    collection="opencascade",
    chunker_config=CodeChunkerConfig(
        language="cpp",
        max_chunk_size=2000
    )
)

# Query the collection
response = await librarian.query(
    collection="opencascade",
    question="How does BRepBuilderAPI_MakeEdge handle curves?"
)
```

## Scaling Considerations

| Scale | Storage | Search | Compute |
|-------|---------|--------|---------|
| Small (<10K chunks) | eXist-db local | In-DB Lucene | Single node |
| Medium (10K-1M) | eXist-db cluster | + Vector DB | Multi-worker |
| Large (1M+) | Sharded storage | Distributed search | GPU embeddings |

## Security

- **Collection isolation**: Users can only query their own collections
- **WASM sandbox**: Custom chunkers run in isolated WASM runtime
- **Rate limiting**: Prevent abuse of recursive queries
- **Audit logging**: Track all queries for compliance

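Rate limiting recursive queries can be as simple as a per-user token bucket, where each sub-query spends one token. A sketch; the capacity and refill numbers would be policy, not anything fixed here:

```python
import time

class TokenBucket:
    """Per-user token bucket; each recursive sub-query spends one token."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill = refill_per_sec
        self.last = time.monotonic()

    def allow(self, cost: int = 1) -> bool:
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```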
## Future Enhancements

1. **Incremental updates**: Re-index only changed files
2. **Cross-collection queries**: Search across multiple codebases
3. **Collaborative collections**: Shared team libraries
4. **Query caching**: Cache common sub-queries
5. **Streaming ingestion**: Real-time updates from git webhooks

---

## References

- [Recursive Language Models (MIT)](docs/mit-paper.pdf) — Foundational research on RLM pattern
- [tree-sitter](https://tree-sitter.github.io/) — AST-aware code parsing
- [eXist-db](http://exist-db.org/) — XML-native database
- [BloxServer Architecture](bloxserver-architecture.md) — Platform overview