- POST /api/v1/flows/{id}/analyze-edge endpoint spec
- Confidence levels: high (green), medium (yellow), low (red)
- Heuristic + LLM analysis paths
- Database schema for edge_mappings
- Sequencer integration notes
- Future enhancements roadmap
Co-authored-by: Dan
Edge Analysis API Specification
Status: Draft
Author: Donna (with Dan)
Date: 2026-01-26
Overview
When users connect nodes in the visual flow editor, an AI analyzes the schema compatibility and proposes field mappings. This provides immediate visual feedback (green/yellow/red) and reduces manual configuration for common cases.
User Experience
Visual Feedback
When a user draws a connection from Node A to Node B:
┌─────────────┐                           ┌─────────────┐
│   Node A    │ 🟢 ─────────────────────▶ │   Node B    │
│             │      High confidence      │             │
└─────────────┘                           └─────────────┘

┌─────────────┐                           ┌─────────────┐
│   Node C    │ 🟡 ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─▶ │   Node D    │
│             │     Review suggested      │             │
└─────────────┘                           └─────────────┘

┌─────────────┐                           ┌─────────────┐
│   Node E    │ 🔴 ─ ─ ─ ─ ─ ✕ ─ ─ ─ ─ ─▶ │   Node F    │
│             │   Manual mapping needed   │             │
└─────────────┘                           └─────────────┘
Confidence Levels
| Level | Color | Threshold | Meaning |
|---|---|---|---|
| HIGH | 🟢 Green | ≥ 0.8 | All required inputs mapped, types compatible |
| MEDIUM | 🟡 Yellow | ≥ 0.4 and < 0.8 | Some mappings uncertain or missing optional fields |
| LOW | 🔴 Red | < 0.4 | Cannot determine mapping, manual intervention required |
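The thresholds in the table can be expressed as a small classifier. This is a minimal sketch, not the backend's actual code: the function name `confidence_level` is an assumption, and it reads the MEDIUM band as everything from 0.4 up to, but excluding, 0.8 so that no score falls between bands.

```python
def confidence_level(score: float) -> str:
    """Map a confidence score in [0, 1] to 'high', 'medium', or 'low'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be within [0, 1], got {score}")
    if score >= 0.8:
        return "high"    # green
    if score >= 0.4:
        return "medium"  # yellow
    return "low"         # red


print(confidence_level(0.85))
print(confidence_level(0.45))
print(confidence_level(0.10))
```

Keeping the classification in one function means the frontend color and the stored `level` value cannot drift apart.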
Interaction Flow
- User drags connection from A output → B input
- Frontend calls POST /api/v1/flows/{id}/analyze-edge
- Backend analyzes schemas (cached if previously computed)
- Frontend renders line with confidence color
- User clicks line → mapping panel opens
- User can accept, modify, or manually define mappings
- Changes saved to flow's canvas_state
API Endpoint
POST /api/v1/flows/{flow_id}/analyze-edge
Analyze compatibility between two nodes and propose field mappings.
Request
{
"from_node": "calculator.add",
"to_node": "calculator.multiply",
"from_output_schema": "<optional: override if not using node's default>",
"to_input_schema": "<optional: override if not using node's default>"
}
Notes:
- If schemas are not provided, they are fetched from the node registry
- Schemas can be XSD strings or references to registered schemas
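As an illustration of the request shape, here is a client-side sketch using only Python's standard library. The base URL `https://example.invalid`, the flow ID `flow-123`, and the helper name `build_analyze_edge_request` are placeholders, not part of the spec. The request is only constructed here, never sent.

```python
import json
import urllib.request
from typing import Optional


def build_analyze_edge_request(
    base_url: str,
    flow_id: str,
    from_node: str,
    to_node: str,
    from_output_schema: Optional[str] = None,
    to_input_schema: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the analyze-edge endpoint."""
    body = {"from_node": from_node, "to_node": to_node}
    # Schema overrides are optional; leaving them out lets the backend
    # fall back to the node registry.
    if from_output_schema is not None:
        body["from_output_schema"] = from_output_schema
    if to_input_schema is not None:
        body["to_input_schema"] = to_input_schema
    return urllib.request.Request(
        url=f"{base_url}/api/v1/flows/{flow_id}/analyze-edge",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_analyze_edge_request(
    "https://example.invalid", "flow-123", "calculator.add", "calculator.multiply"
)
print(request.get_method(), request.full_url)
```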
Response
{
"edge_id": "add_to_multiply",
"confidence": 0.85,
"level": "high",
"proposed_mapping": {
"mappings": [
{
"from_field": "output.sum",
"to_field": "input.value",
"confidence": 0.95,
"reason": "Exact name match, compatible types (int → int)"
},
{
"from_field": "output.operands",
"to_field": "input.factors",
"confidence": 0.6,
"reason": "Semantic similarity, both arrays of numbers"
}
],
"unmapped_required": [
{
"field": "input.precision",
"type": "int",
"default": 2,
"suggestion": "Set constant value or map from upstream"
}
],
"unmapped_optional": [
{
"field": "input.label",
"type": "string"
}
]
},
"warnings": [
"input.precision has no source, using default value 2",
"output.metadata will be discarded (no matching input field)"
],
"errors": [],
"analysis_method": "llm",
"cached": false,
"analysis_time_ms": 245
}
Notes:
- analysis_method is "llm", or "heuristic" for simple cases
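One way a client might consume this response is to collect the target fields a user should look at before accepting the proposal. The sketch below is an assumption about client behavior, not something the spec prescribes: the helper name `fields_needing_review` and the 0.8 cutoff (borrowed from the HIGH threshold) are illustrative.

```python
from typing import Any, Dict, List


def fields_needing_review(response: Dict[str, Any], threshold: float = 0.8) -> List[str]:
    """Return target fields that are mapped below the threshold or not mapped at all."""
    proposed = response.get("proposed_mapping", {})
    uncertain = [
        m["to_field"]
        for m in proposed.get("mappings", [])
        if m["confidence"] < threshold
    ]
    missing = [u["field"] for u in proposed.get("unmapped_required", [])]
    return uncertain + missing


# Trimmed copy of the example response above.
example = {
    "proposed_mapping": {
        "mappings": [
            {"from_field": "output.sum", "to_field": "input.value", "confidence": 0.95},
            {"from_field": "output.operands", "to_field": "input.factors", "confidence": 0.6},
        ],
        "unmapped_required": [{"field": "input.precision", "type": "int", "default": 2}],
    }
}
print(fields_needing_review(example))  # ['input.factors', 'input.precision']
```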
Error Response
{
"error": "schema_not_found",
"message": "Node 'calculator.add' has no registered output schema",
"details": {
"node": "calculator.add"
}
}
GET /api/v1/flows/{flow_id}/edges
List all edges in a flow with their current mapping status.
Response
{
"edges": [
{
"id": "edge_1",
"from_node": "input",
"to_node": "calculator.add",
"confidence": 0.92,
"level": "high",
"mapping_status": "auto",
"last_analyzed": "2026-01-26T06:30:00Z"
},
{
"id": "edge_2",
"from_node": "calculator.add",
"to_node": "formatter",
"confidence": 0.45,
"level": "medium",
"mapping_status": "user_modified",
"last_analyzed": "2026-01-26T06:30:00Z"
}
]
}
PUT /api/v1/flows/{flow_id}/edges/{edge_id}/mapping
Save user-defined or modified mapping for an edge.
Request
{
"mappings": [
{
"from_field": "output.sum",
"to_field": "input.value"
},
{
"to_field": "input.factor",
"constant": 5
},
{
"to_field": "input.label",
"expression": "concat('Result: ', output.sum)"
}
],
"user_confirmed": true
}
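The example request shows three kinds of mapping entry: a source field, a constant, and an expression. A server-side check could look roughly like the sketch below. The rule that each entry carries exactly one source is an assumption inferred from the example, and `validate_mappings` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# The three ways an entry can supply a value, as seen in the example request.
SOURCE_KEYS = ("from_field", "constant", "expression")


def validate_mappings(mappings: List[Dict[str, Any]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for index, entry in enumerate(mappings):
        if not entry.get("to_field"):
            problems.append(f"mappings[{index}]: missing to_field")
        sources = [key for key in SOURCE_KEYS if key in entry]
        if len(sources) != 1:
            problems.append(
                f"mappings[{index}]: expected exactly one source key, got {sources}"
            )
    return problems


payload = {
    "mappings": [
        {"from_field": "output.sum", "to_field": "input.value"},
        {"to_field": "input.factor", "constant": 5},
        {"to_field": "input.label", "expression": "concat('Result: ', output.sum)"},
    ],
    "user_confirmed": True,
}
print(validate_mappings(payload["mappings"]))  # []
```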
Analysis Engine
Heuristic Analysis (Fast Path)
Used when schemas are simple and mapping is obvious:
- Exact name match: output.value → input.value (confidence: 0.95)
- Case-insensitive match: output.Value → input.value (confidence: 0.9)
- Common aliases: output.result → input.value (confidence: 0.7)
- Type compatibility: int → float OK, string → int NOT OK
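The rules above might be combined along these lines. This is a sketch under stated assumptions: the alias group and the type compatibility table are hypothetical (the spec gives only `result`/`value` and int → float as examples), and the function names are illustrative.

```python
from typing import Optional

# Hypothetical alias groups; the spec does not list the real ones.
ALIASES = [{"value", "result"}]

# Source type -> target types it may feed. Only int -> float comes from the
# spec; the identity entries are assumed.
COMPATIBLE_TYPES = {
    "int": {"int", "float"},
    "float": {"float"},
    "string": {"string"},
}


def leaf(field: str) -> str:
    """Return the last path segment, e.g. 'output.value' -> 'value'."""
    return field.rsplit(".", 1)[-1]


def name_confidence(from_field: str, to_field: str) -> float:
    """Score a pair of field names using the three name rules, best rule first."""
    a, b = leaf(from_field), leaf(to_field)
    if a == b:
        return 0.95  # exact name match
    if a.lower() == b.lower():
        return 0.9   # case-insensitive match
    if any(a.lower() in group and b.lower() in group for group in ALIASES):
        return 0.7   # common alias
    return 0.0


def heuristic_match(
    from_field: str, from_type: str, to_field: str, to_type: str
) -> Optional[float]:
    """Return a confidence for the pair, or None when the types are incompatible."""
    if to_type not in COMPATIBLE_TYPES.get(from_type, set()):
        return None
    return name_confidence(from_field, to_field)


print(heuristic_match("output.result", "int", "input.value", "float"))  # 0.7
```

Returning `None` for a type mismatch, rather than 0.0, keeps "these fields cannot be connected" distinct from "no name rule matched".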
LLM Analysis (Deep Path)
Used when heuristics produce low confidence:
System: You are analyzing data flow compatibility between two XML schemas.
Given:
- Source schema (output of previous step): {from_schema}
- Target schema (input of next step): {to_schema}
Propose a field mapping. For each target field, identify:
1. The best source field to map from (if any)
2. Confidence (0-1) in the mapping
3. Brief reason for the mapping
If a required target field cannot be mapped, flag it.
If source fields will be discarded, note them.
Respond in JSON format.
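A sketch of how the prompt could be filled in and the model's reply checked before it is trusted. The reply shape (an object with a `mappings` list) is an assumption, since the prompt only asks for "JSON format". The helper names are hypothetical, and no particular LLM client is implied, so no model is called here.

```python
import json
from typing import Any, Dict

PROMPT_TEMPLATE = """You are analyzing data flow compatibility between two XML schemas.
Given:
- Source schema (output of previous step): {from_schema}
- Target schema (input of next step): {to_schema}
Propose a field mapping. For each target field, identify:
1. The best source field to map from (if any)
2. Confidence (0-1) in the mapping
3. Brief reason for the mapping
If a required target field cannot be mapped, flag it.
If source fields will be discarded, note them.
Respond in JSON format."""


def build_prompt(from_schema: str, to_schema: str) -> str:
    """Substitute both schemas into the template."""
    return PROMPT_TEMPLATE.format(from_schema=from_schema, to_schema=to_schema)


def parse_reply(reply_text: str) -> Dict[str, Any]:
    """Parse the model's reply, rejecting anything that is not usable JSON."""
    try:
        reply = json.loads(reply_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"LLM reply is not valid JSON: {exc}") from exc
    if not isinstance(reply, dict) or not isinstance(reply.get("mappings"), list):
        raise ValueError("LLM reply must be an object with a 'mappings' list")
    for mapping in reply["mappings"]:
        # Clamp so an out-of-range answer such as 1.2 cannot skew the thresholds.
        mapping["confidence"] = min(1.0, max(0.0, float(mapping.get("confidence", 0.0))))
    return reply


prompt = build_prompt("<xs:schema id='a'/>", "<xs:schema id='b'/>")
reply = parse_reply(
    '{"mappings": [{"from_field": "output.sum", "to_field": "input.value", "confidence": 1.2}]}'
)
print(reply["mappings"][0]["confidence"])  # 1.0
```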
Caching Strategy
- Cache key: hash(from_schema) + hash(to_schema)
- TTL: 24 hours (schemas rarely change)
- Invalidate on: node schema update, user-initiated cache clear
- Store: Redis or in-memory LRU
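For the in-memory option, the strategy above could be sketched as follows. SHA-256 as the hash, the `:` separator, the class name `AnalysisCache`, and the default capacity are all assumptions; the spec fixes only the key composition and the 24-hour TTL.

```python
import hashlib
import time
from collections import OrderedDict
from typing import Any, Optional

TTL_SECONDS = 24 * 60 * 60  # 24 hours, per the spec


def cache_key(from_schema: str, to_schema: str) -> str:
    """Combine both schema hashes; order matters because A->B differs from B->A."""
    from_hash = hashlib.sha256(from_schema.encode("utf-8")).hexdigest()
    to_hash = hashlib.sha256(to_schema.encode("utf-8")).hexdigest()
    return f"{from_hash}:{to_hash}"


class AnalysisCache:
    """In-memory LRU cache with a per-entry TTL."""

    def __init__(self, max_entries: int = 1024, ttl: float = TTL_SECONDS) -> None:
        self._entries = OrderedDict()  # key -> (stored_at, value)
        self._max_entries = max_entries
        self._ttl = ttl

    def get(self, key: str, now: Optional[float] = None) -> Optional[Any]:
        now = time.time() if now is None else now
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if now - stored_at >= self._ttl:
            del self._entries[key]  # expired
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: Any, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._entries[key] = (now, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used

    def invalidate(self, key: str) -> None:
        """Drop one entry, e.g. when a user clears the cache for an edge."""
        self._entries.pop(key, None)


cache = AnalysisCache()
key = cache_key("<xs:schema id='a'/>", "<xs:schema id='b'/>")
cache.put(key, {"confidence": 0.85, "level": "high"})
print(cache.get(key))
```

Because the key is derived from schema content, an updated node schema produces a different key, so stale results are never returned for it. Explicit invalidation then mainly reclaims space before the TTL expires.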
Database Schema
Edge Mappings Table
CREATE TABLE edge_mappings (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
flow_id UUID NOT NULL REFERENCES flows(id) ON DELETE CASCADE,
-- Edge identification
from_node VARCHAR(100) NOT NULL,
to_node VARCHAR(100) NOT NULL,
-- Analysis results
confidence NUMERIC(3,2),
level VARCHAR(10), -- 'high', 'medium', 'low'
analysis_method VARCHAR(20), -- 'heuristic', 'llm'
-- The actual mapping (JSON)
proposed_mapping JSONB,
user_mapping JSONB, -- User overrides, if any
-- Status
user_confirmed BOOLEAN DEFAULT FALSE,
-- Timestamps
analyzed_at TIMESTAMP WITH TIME ZONE,
confirmed_at TIMESTAMP WITH TIME ZONE,
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
UNIQUE(flow_id, from_node, to_node)
);
CREATE INDEX idx_edge_mappings_flow ON edge_mappings(flow_id);
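In application code, a row of this table might be mirrored by a small data class. The sketch below is illustrative only: the class name `EdgeMappingRow` is hypothetical, timestamps are left out for brevity, and the precedence rule in `effective_mapping` (a user mapping, when present, replaces the proposal) is an assumption about what the sequencer code's `effective_mapping` means.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class EdgeMappingRow:
    """Application-side view of one edge_mappings row (timestamps omitted)."""

    flow_id: str
    from_node: str
    to_node: str
    confidence: Optional[float] = None
    level: Optional[str] = None            # 'high', 'medium', 'low'
    analysis_method: Optional[str] = None  # 'heuristic', 'llm'
    proposed_mapping: Optional[Dict[str, Any]] = None
    user_mapping: Optional[Dict[str, Any]] = None
    user_confirmed: bool = False

    @property
    def effective_mapping(self) -> List[Dict[str, Any]]:
        """Prefer the user's mapping; fall back to the proposed one."""
        chosen = self.user_mapping if self.user_mapping is not None else self.proposed_mapping
        return list((chosen or {}).get("mappings", []))


row = EdgeMappingRow(
    flow_id="flow-123",
    from_node="calculator.add",
    to_node="calculator.multiply",
    proposed_mapping={"mappings": [{"from_field": "output.sum", "to_field": "input.value"}]},
)
print(row.effective_mapping)
```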
Sequencer Integration
When a sequence is executed, the sequencer factory:
- Loads all edge mappings for the flow
- For each edge, generates a transformer function:
from typing import Callable


def generate_transformer(edge_mapping: "EdgeMapping") -> Callable[[str], str]:
    """
    Generate a function that transforms A's output to B's input.

    parse_xml, extract, evaluate, and serialize_xml are helpers assumed to be
    defined elsewhere; this document does not specify them.
    """
    def transform(source_xml: str) -> str:
        source = parse_xml(source_xml)
        target = {}
        for mapping in edge_mapping.effective_mapping:
            # The first available source wins: field, then constant, then expression.
            if mapping.from_field:
                target[mapping.to_field] = extract(source, mapping.from_field)
            elif mapping.constant is not None:
                target[mapping.to_field] = mapping.constant
            elif mapping.expression:
                target[mapping.to_field] = evaluate(mapping.expression, source)
        return serialize_xml(target, edge_mapping.to_schema)
    return transform
- The transformer is called between consecutive steps in the sequence
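To make the transformer idea concrete, here is a self-contained sketch built on `xml.etree.ElementTree`. It is one possible reading, not the sequencer's actual code: it assumes dotted paths such as `output.sum` name nested XML elements, it takes mappings as plain dicts, it skips expression mappings and schema validation, and `extract`, `serialize`, and `make_transformer` are hypothetical stand-ins for the helpers named above.

```python
import xml.etree.ElementTree as ET
from typing import Any, Callable, Dict, List


def extract(root: ET.Element, path: str) -> str:
    """Resolve 'output.sum' against <output><sum>..</sum></output>."""
    head, *rest = path.split(".")
    if root.tag != head:
        raise KeyError(f"root element is <{root.tag}>, expected <{head}>")
    node = root
    for name in rest:
        node = node.find(name)
        if node is None:
            raise KeyError(f"no element for path '{path}'")
    return node.text or ""


def serialize(target: Dict[str, Any]) -> str:
    """Build XML from dotted target paths that share one root, e.g. 'input.value'."""
    if not target:
        raise ValueError("nothing to serialize")
    root = ET.Element(next(iter(target)).split(".")[0])
    for path, value in target.items():
        node = root
        for name in path.split(".")[1:]:
            child = node.find(name)
            if child is None:
                child = ET.SubElement(node, name)
            node = child
        node.text = str(value)
    return ET.tostring(root, encoding="unicode")


def make_transformer(mappings: List[Dict[str, Any]]) -> Callable[[str], str]:
    """Return a function that turns the source step's XML into the target step's XML."""
    def transform(source_xml: str) -> str:
        source = ET.fromstring(source_xml)
        target: Dict[str, Any] = {}
        for mapping in mappings:
            if "from_field" in mapping:
                target[mapping["to_field"]] = extract(source, mapping["from_field"])
            elif "constant" in mapping:
                target[mapping["to_field"]] = mapping["constant"]
            # Expression mappings are intentionally left out of this sketch.
        return serialize(target)
    return transform


transform = make_transformer([
    {"from_field": "output.sum", "to_field": "input.value"},
    {"to_field": "input.factor", "constant": 5},
])
print(transform("<output><sum>7</sum></output>"))
# <input><value>7</value><factor>5</factor></input>
```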
Future Enhancements
v1.1 — Type Coercion
- Automatic int → string, date formatting, etc.
- Warnings when lossy conversion occurs
v1.2 — Expression Builder
- Visual expression editor for complex mappings
- Functions: concat(), format(), split(), lookup()
v1.3 — Learning from Corrections
- Track when users override AI suggestions
- Fine-tune confidence thresholds
- Eventually: personalized mapping suggestions
v2.0 — Multi-Output Nodes
- Some nodes produce multiple outputs
- UI shows multiple output ports
- User wires specific port to specific input
This spec is a living document. Update as implementation progresses.