xml-pipeline/xml_pipeline/message_bus/steps/envelope_validation.py
dullfig d53bc1dfbe Fix XSD path for installed package (v0.3.1)
Use __file__-based path resolution for envelope.xsd so the schema
loads correctly when xml-pipeline is installed via pip.

Also:
- Add build artifacts to .gitignore
- Bump version to 0.3.1

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 22:24:36 -08:00

61 lines
No EOL
2.4 KiB
Python

"""
envelope_validation.py — Validates the canonicalized <message> envelope against envelope.xsd.
After repair_step and c14n_step, we have a normalized envelope_tree.
This step enforces the outer <message> structure (thread, from, optional to, etc.)
using the strict envelope.xsd schema.
Failure here is serious — invalid envelope means the message is malformed at the protocol level,
so we set a clear error and let downstream steps handle it (typically route to system pipeline
for diagnostic <huh>).
Part of AgentServer v2.1 message pump.
"""
from pathlib import Path
from lxml import etree
from xml_pipeline.message_bus.message_state import MessageState
# Load envelope.xsd once at module import (startup time)
# Path is relative to this file: steps/ → message_bus/ → xml_pipeline/ → schema/
_SCHEMA_PATH = Path(__file__).parent.parent.parent / "schema" / "envelope.xsd"
_ENVELOPE_XSD = etree.XMLSchema(file=str(_SCHEMA_PATH))
async def envelope_validation_step(state: MessageState) -> MessageState:
"""
Validate the canonicalized envelope_tree against the fixed envelope.xsd schema.
Requirements:
- Must be a valid <message> with required <thread> and <from>
- Optional <to>, etc.
- Namespace must match https://xml-pipeline.org/ns/envelope/v1
On failure: sets state.error with schema validation details.
Downstream steps should short-circuit if error is set.
"""
if state.envelope_tree is None:
state.error = "envelope_validation_step: no envelope_tree (previous step failed)"
return state
try:
# lxml schema validation — raises XMLSchemaError on failure
_ENVELOPE_XSD.assertValid(state.envelope_tree)
# Optional extra checks (can be removed later if redundant)
if state.envelope_tree.tag != "{https://xml-pipeline.org/ns/envelope/v1}message":
raise ValueError("Root element is not <message> in expected namespace")
except etree.DocumentInvalid as exc:
# Schema violation — collect all error messages for diagnostics
error_lines = []
for error in _ENVELOPE_XSD.error_log:
error_lines.append(f"{error.level_name}: {error.message} (line {error.line})")
state.error = "envelope_validation_step: invalid envelope\n" + "\n".join(error_lines)
except Exception as exc: # pylint: disable=broad-except
state.error = f"envelope_validation_step failed: {exc}"
return state