# Edge Analysis API Specification

**Status:** Draft
**Author:** Donna (with Dan)
**Date:** 2026-01-26

## Overview

When users connect nodes in the visual flow editor, an AI analyzes the schema compatibility and proposes field mappings. This provides immediate visual feedback (green/yellow/red) and reduces manual configuration for common cases.

## User Experience

### Visual Feedback

When a user draws a connection from Node A to Node B:

```
┌─────────────┐                           ┌─────────────┐
│   Node A    │ 🟢 ────────────────▶      │   Node B    │
│             │    High confidence        │             │
└─────────────┘                           └─────────────┘

┌─────────────┐                           ┌─────────────┐
│   Node C    │ 🟡 ─ ─ ─ ─ ─ ─ ─ ─▶       │   Node D    │
│             │    Review suggested       │             │
└─────────────┘                           └─────────────┘

┌─────────────┐                           ┌─────────────┐
│   Node E    │ 🔴 ─ ─ ─ ✕ ─ ─ ─ ─▶       │   Node F    │
│             │    Manual mapping needed  │             │
└─────────────┘                           └─────────────┘
```

### Confidence Levels

| Level | Color | Threshold | Meaning |
|-------|-------|-----------|---------|
| HIGH | 🟢 Green | ≥ 0.8 | All required inputs mapped, types compatible |
| MEDIUM | 🟡 Yellow | ≥ 0.4 and < 0.8 | Some mappings uncertain, or optional fields unmapped |
| LOW | 🔴 Red | < 0.4 | Cannot determine mapping, manual intervention required |
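
The thresholds in the table can be expressed as a small classifier. This is a minimal sketch, not the actual implementation: the function name `confidence_level` is hypothetical, and treating the bands as half-open intervals (so 0.795 counts as medium) is an assumption.

```python
def confidence_level(confidence: float) -> str:
    """Map a 0-1 confidence score to 'high', 'medium', or 'low'."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    if confidence >= 0.8:
        return "high"    # green
    if confidence >= 0.4:
        return "medium"  # yellow
    return "low"         # red
```

Using half-open bands avoids a gap between the medium and high ranges for scores with more than two decimal places.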

### Interaction Flow

1. User drags a connection from A's output to B's input
2. Frontend calls `POST /api/v1/flows/{flow_id}/analyze-edge`
3. Backend analyzes schemas (cached if previously computed)
4. Frontend renders the line with the confidence color
5. User clicks the line and the mapping panel opens
6. User can accept, modify, or manually define mappings
7. Changes are saved to the flow's `canvas_state`

## API Endpoint

### POST /api/v1/flows/{flow_id}/analyze-edge

Analyze compatibility between two nodes and propose field mappings.

#### Request

```json
{
  "from_node": "calculator.add",
  "to_node": "calculator.multiply",
  "from_output_schema": "<optional: override if not using node's default>",
  "to_input_schema": "<optional: override if not using node's default>"
}
```

**Notes:**
- If schemas are not provided, they are fetched from the node registry
- Schemas can be XSD strings or references to registered schemas
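
A client might assemble the request as follows. This is a sketch under stated assumptions: the helper names `analyze_edge_path` and `build_analyze_edge_request` are hypothetical, and omitting the override keys entirely when no override is given (rather than sending `null`) is an assumption about how the backend treats missing fields.

```python
import json
from typing import Optional


def analyze_edge_path(flow_id: str) -> str:
    """Return the URL path of the analyze-edge endpoint for a flow."""
    return f"/api/v1/flows/{flow_id}/analyze-edge"


def build_analyze_edge_request(
    from_node: str,
    to_node: str,
    from_output_schema: Optional[str] = None,
    to_input_schema: Optional[str] = None,
) -> str:
    """Serialize the request body, leaving out schema overrides that are not set."""
    body = {"from_node": from_node, "to_node": to_node}
    if from_output_schema is not None:
        body["from_output_schema"] = from_output_schema
    if to_input_schema is not None:
        body["to_input_schema"] = to_input_schema
    return json.dumps(body)
```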

#### Response

```json
{
  "edge_id": "add_to_multiply",
  "confidence": 0.85,
  "level": "high",

  "proposed_mapping": {
    "mappings": [
      {
        "from_field": "output.sum",
        "to_field": "input.value",
        "confidence": 0.95,
        "reason": "Exact name match, compatible types (int → int)"
      },
      {
        "from_field": "output.operands",
        "to_field": "input.factors",
        "confidence": 0.6,
        "reason": "Semantic similarity, both arrays of numbers"
      }
    ],
    "unmapped_required": [
      {
        "field": "input.precision",
        "type": "int",
        "default": 2,
        "suggestion": "Set constant value or map from upstream"
      }
    ],
    "unmapped_optional": [
      {
        "field": "input.label",
        "type": "string"
      }
    ]
  },

  "warnings": [
    "input.precision has no source, using default value 2",
    "output.metadata will be discarded (no matching input field)"
  ],

  "errors": [],

  "analysis_method": "llm",
  "cached": false,
  "analysis_time_ms": 245
}
```

`analysis_method` is `"llm"`, or `"heuristic"` for simple cases.

#### Error Response

```json
{
  "error": "schema_not_found",
  "message": "Node 'calculator.add' has no registered output schema",
  "details": {
    "node": "calculator.add"
  }
}
```

### GET /api/v1/flows/{flow_id}/edges

List all edges in a flow with their current mapping status.

#### Response

```json
{
  "edges": [
    {
      "id": "edge_1",
      "from_node": "input",
      "to_node": "calculator.add",
      "confidence": 0.92,
      "level": "high",
      "mapping_status": "auto",
      "last_analyzed": "2026-01-26T06:30:00Z"
    },
    {
      "id": "edge_2",
      "from_node": "calculator.add",
      "to_node": "formatter",
      "confidence": 0.45,
      "level": "medium",
      "mapping_status": "user_modified",
      "last_analyzed": "2026-01-26T06:30:00Z"
    }
  ]
}
```

### PUT /api/v1/flows/{flow_id}/edges/{edge_id}/mapping

Save user-defined or modified mapping for an edge.

#### Request

```json
{
  "mappings": [
    {
      "from_field": "output.sum",
      "to_field": "input.value"
    },
    {
      "to_field": "input.factor",
      "constant": 5
    },
    {
      "to_field": "input.label",
      "expression": "concat('Result: ', output.sum)"
    }
  ],
  "user_confirmed": true
}
```
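
Each entry in the request above names a target field and one source for it: a source field, a constant, or an expression. A server-side check could look like the sketch below. The rule that exactly one of the three source keys must be present is an assumption inferred from the example, and `validate_mapping_entry` is a hypothetical name.

```python
SOURCE_KEYS = ("from_field", "constant", "expression")


def validate_mapping_entry(entry: dict) -> None:
    """Raise ValueError unless the entry has a to_field and exactly one source."""
    if not entry.get("to_field"):
        raise ValueError("mapping entry needs a non-empty 'to_field'")
    present = [key for key in SOURCE_KEYS if key in entry]
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one of {SOURCE_KEYS}, found {present or 'none'}"
        )
```

Checking key presence rather than truthiness keeps constants such as `0` or `""` valid.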

## Analysis Engine

### Heuristic Analysis (Fast Path)

Used when schemas are simple and the mapping is obvious:

1. **Exact name match**: `output.value` → `input.value` (confidence: 0.95)
2. **Case-insensitive match**: `output.Value` → `input.value` (confidence: 0.9)
3. **Common aliases**: `output.result` → `input.value` (confidence: 0.7)
4. **Type compatibility**: int → float is OK, string → int is NOT OK
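
The four rules above can be sketched as a single matching function. This is illustrative only: it compares leaf field names (without the `output.` / `input.` prefix), the `ALIASES` table holds just the one alias from the example, and the type rule covers only the two cases the list names. All names here are hypothetical.

```python
from typing import Optional

# Assumed alias table: source field name -> equivalent target field name.
ALIASES = {"result": "value"}


def types_compatible(src_type: str, dst_type: str) -> bool:
    """Identical types are fine; int may widen to float; nothing else is allowed."""
    return src_type == dst_type or (src_type, dst_type) == ("int", "float")


def match_confidence(
    src_name: str, src_type: str, dst_name: str, dst_type: str
) -> Optional[float]:
    """Return the heuristic confidence for a field pair, or None if no rule applies."""
    if not types_compatible(src_type, dst_type):
        return None
    if src_name == dst_name:
        return 0.95  # exact name match
    if src_name.lower() == dst_name.lower():
        return 0.9   # case-insensitive match
    if ALIASES.get(src_name.lower()) == dst_name.lower():
        return 0.7   # common alias
    return None
```

A `None` result would be the signal to fall through to the LLM path.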

### LLM Analysis (Deep Path)

Used when heuristics produce low confidence:

```
System: You are analyzing data flow compatibility between two XML schemas.

Given:
- Source schema (output of previous step): {from_schema}
- Target schema (input of next step): {to_schema}

Propose a field mapping. For each target field, identify:
1. The best source field to map from (if any)
2. Confidence (0-1) in the mapping
3. Brief reason for the mapping

If a required target field cannot be mapped, flag it.
If source fields will be discarded, note them.

Respond in JSON format.
```

### Caching Strategy

- Cache key: `hash(from_schema) + hash(to_schema)`
- TTL: 24 hours (schemas rarely change)
- Invalidate on: node schema update, user-initiated cache clear
- Store: Redis or in-memory LRU
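
A minimal in-memory sketch of this strategy follows. It makes several assumptions: SHA-256 as the hash, string concatenation for the `+` in the cache key, and a plain dict with TTL expiry instead of a true LRU or Redis. `cache_key` and `AnalysisCache` are hypothetical names.

```python
import hashlib
import time
from typing import Any, Callable, Dict, Optional, Tuple

TTL_SECONDS = 24 * 60 * 60  # 24 hours


def cache_key(from_schema: str, to_schema: str) -> str:
    """Concatenate the hex digests of both schemas; order matters (A→B != B→A)."""
    source = hashlib.sha256(from_schema.encode("utf-8")).hexdigest()
    target = hashlib.sha256(to_schema.encode("utf-8")).hexdigest()
    return source + target


class AnalysisCache:
    """Dict-backed cache with per-entry TTL; the clock is injectable for tests."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= TTL_SECONDS:
            del self._entries[key]  # expired
            return None
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry, e.g. after a node schema update."""
        self._entries.pop(key, None)
```

Because the key depends only on schema content, two edges between nodes with identical schemas would share one cached analysis.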

## Database Schema

### Edge Mappings Table

```sql
CREATE TABLE edge_mappings (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    flow_id UUID NOT NULL REFERENCES flows(id) ON DELETE CASCADE,

    -- Edge identification
    from_node VARCHAR(100) NOT NULL,
    to_node VARCHAR(100) NOT NULL,

    -- Analysis results
    confidence NUMERIC(3,2),
    level VARCHAR(10),            -- 'high', 'medium', 'low'
    analysis_method VARCHAR(20),  -- 'heuristic', 'llm'

    -- The actual mapping (JSON)
    proposed_mapping JSONB,
    user_mapping JSONB,           -- User overrides, if any

    -- Status
    user_confirmed BOOLEAN DEFAULT FALSE,

    -- Timestamps
    analyzed_at TIMESTAMP WITH TIME ZONE,
    confirmed_at TIMESTAMP WITH TIME ZONE,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),

    UNIQUE(flow_id, from_node, to_node)
);

CREATE INDEX idx_edge_mappings_flow ON edge_mappings(flow_id);
```

## Sequencer Integration

When a sequence is executed, the sequencer factory:

1. Loads all edge mappings for the flow
2. For each edge, generates a transformer function:

```python
from __future__ import annotations  # EdgeMapping is defined elsewhere

from typing import Callable

# parse_xml, extract, evaluate, and serialize_xml are helpers assumed to be
# provided elsewhere in the sequencer package.


def generate_transformer(edge_mapping: EdgeMapping) -> Callable[[str], str]:
    """
    Generate a function that transforms A's output to B's input.
    """
    def transform(source_xml: str) -> str:
        source = parse_xml(source_xml)
        target = {}

        for mapping in edge_mapping.effective_mapping:
            if mapping.from_field:
                # Copy a value from the source document
                target[mapping.to_field] = extract(source, mapping.from_field)
            elif mapping.constant is not None:
                # Use a literal value
                target[mapping.to_field] = mapping.constant
            elif mapping.expression:
                # Compute the value from the source document
                target[mapping.to_field] = evaluate(mapping.expression, source)
            else:
                raise ValueError(f"mapping for {mapping.to_field!r} has no source")

        return serialize_xml(target, edge_mapping.to_schema)

    return transform
```

3. The transformer is called between consecutive steps in the sequence

## LLM-Created Sequences (Runtime)

Agents can dynamically create sequences at runtime. The sequencer factory runs the same analysis but with stricter rules.

### Runtime Policy

| Confidence | Design-time (Canvas) | Run-time (LLM) |
|------------|---------------------|----------------|
| 🟢 High | Auto-wire, run | Auto-wire, run |
| 🟡 Medium | Show warning, let user decide | **Block**, request clarification |
| 🔴 Low | Show error, require manual mapping | **Block**, request clarification |

**Rationale:** Letting LLMs YOLO on uncertain mappings is risky. Better to ask for explicit instructions. The policy is conservative but flexible: an LLM can provide explicit instructions to turn yellow into green.
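
The policy table reduces to a lookup keyed by confidence level and mode. The sketch below is one way to encode it; the action strings and the names `POLICY` and `policy_action` are hypothetical, not part of the spec.

```python
# (level, mode) -> action; mirrors the Runtime Policy table row by row.
POLICY = {
    ("high", "design"): "auto_wire",
    ("medium", "design"): "warn_user",
    ("low", "design"): "require_manual",
    ("high", "runtime"): "auto_wire",
    ("medium", "runtime"): "block_and_clarify",
    ("low", "runtime"): "block_and_clarify",
}


def policy_action(level: str, mode: str) -> str:
    """Return the action for a confidence level in 'design' or 'runtime' mode."""
    try:
        return POLICY[(level, mode)]
    except KeyError:
        raise ValueError(f"unknown level/mode: {level!r}, {mode!r}") from None
```

At runtime only green edges proceed; yellow and red both end in the same clarification request.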

### LLM Clarification Flow

```
1. LLM requests sequence: A → B → C
2. Factory analyzes:
   - A → B: green ✓
   - B → C: yellow (ambiguous)
3. Factory responds with structured error:
   - Which edge failed
   - What B outputs
   - What C expects
   - Suggested resolutions
4. LLM provides hints:
   - Explicit mappings
   - Constants for missing fields
   - Fields to drop
5. Factory re-analyzes with hints
6. If green: run. If not: back to step 3.
```
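
Steps 2 through 6 form a loop, which might be driven as in the sketch below. The two callables stand in for the factory's analysis and the LLM's reply, and their signatures are assumptions. The `max_rounds` bound is also an addition for illustration: the flow as written has no upper limit on retries.

```python
from typing import Callable, Optional, Tuple

Hints = dict
# analyze(hints) returns (level, error); error is None when the level is "high".
Analyze = Callable[[Optional[Hints]], Tuple[str, Optional[str]]]
# request_hints(error) returns new hints, or None if the LLM gives up.
RequestHints = Callable[[Optional[str]], Optional[Hints]]


def wire_with_clarification(
    analyze: Analyze, request_hints: RequestHints, max_rounds: int = 3
) -> Hints:
    """Re-analyze an edge with LLM-provided hints until it is green."""
    hints: Optional[Hints] = None
    for _ in range(max_rounds):
        level, error = analyze(hints)
        if level == "high":
            return hints if hints is not None else {}
        hints = request_hints(error)  # step 3 (error out) and step 4 (hints in)
        if hints is None:
            break
    raise RuntimeError("edge could not be wired to green")
```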

### Edge Hints Payload

LLMs can provide explicit wiring instructions:

```xml
<CreateSequence>
  <steps>transformer, uploader</steps>
  <initial_payload>...</initial_payload>

  <edge_hints>
    <edge from="transformer" to="uploader">
      <map from="result" to="payload"/>
      <constant field="destination">/uploads/output.json</constant>
      <drop field="factor"/>
      <drop field="metadata"/>
    </edge>
  </edge_hints>
</CreateSequence>
```

### Hint Types

| Hint | Syntax | Effect |
|------|--------|--------|
| Map field | `<map from="X" to="Y"/>` | Wire source field to target field |
| Set constant | `<constant field="Y">value</constant>` | Set target field to literal value |
| Drop field | `<drop field="X"/>` | Explicitly ignore source field |
| Expression | `<expr field="Y">concat(X.a, X.b)</expr>` | Compute target from expression |
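
The four hint types can be read out of a `CreateSequence` payload with the standard library's `xml.etree.ElementTree`. This is a sketch: the function name `parse_edge_hints` and the shape of the returned dictionaries are assumptions, not a defined internal format.

```python
import xml.etree.ElementTree as ET

EXAMPLE = """<CreateSequence>
  <steps>transformer, uploader</steps>
  <edge_hints>
    <edge from="transformer" to="uploader">
      <map from="result" to="payload"/>
      <constant field="destination">/uploads/output.json</constant>
      <drop field="factor"/>
      <drop field="metadata"/>
    </edge>
  </edge_hints>
</CreateSequence>"""


def parse_edge_hints(xml_text: str) -> dict:
    """Return {(from_node, to_node): [hint, ...]} for every <edge> element."""
    root = ET.fromstring(xml_text)
    result = {}
    for edge in root.findall("./edge_hints/edge"):
        hints = []
        for child in edge:
            if child.tag == "map":
                hints.append({"type": "map", "from": child.get("from"), "to": child.get("to")})
            elif child.tag == "constant":
                hints.append({"type": "constant", "field": child.get("field"), "value": child.text or ""})
            elif child.tag == "drop":
                hints.append({"type": "drop", "field": child.get("field")})
            elif child.tag == "expr":
                hints.append({"type": "expr", "field": child.get("field"), "expression": child.text or ""})
            else:
                raise ValueError(f"unknown hint type: <{child.tag}>")
        result[(edge.get("from"), edge.get("to"))] = hints
    return result


hints = parse_edge_hints(EXAMPLE)
```

Rejecting unknown tags keeps a typo in a hint from being silently ignored, which fits the green-only runtime policy.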

### Error Response to LLM

When wiring fails:

```xml
<SequenceError>
  <code>wiring_failed</code>
  <edge from="transformer" to="uploader"/>

  <source_fields>
    <field name="result" type="string"/>
    <field name="factor" type="int"/>
    <field name="metadata" type="object"/>
  </source_fields>

  <target_fields>
    <field name="payload" type="string" required="true" mapped="result" confidence="0.85"/>
    <field name="destination" type="string" required="true" mapped="" confidence="0"/>
  </target_fields>

  <issues>
    <issue type="unmapped_required">destination has no source</issue>
    <issue type="unmapped_source">factor will be dropped</issue>
    <issue type="unmapped_source">metadata will be dropped</issue>
  </issues>

  <suggestion>Provide mapping for 'destination' or set as constant.</suggestion>
</SequenceError>
```

This gives the LLM enough information to either:
- Provide the missing hints
- Try a different sequence
- Ask the user for help
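
On the consuming side, an agent harness could extract the fields that still need hints from a `SequenceError`. The sketch below uses `xml.etree.ElementTree` and assumes that an empty `mapped` attribute means "no source", as in the example; `unmapped_required_fields` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import List

ERROR_XML = """<SequenceError>
  <code>wiring_failed</code>
  <edge from="transformer" to="uploader"/>
  <target_fields>
    <field name="payload" type="string" required="true" mapped="result" confidence="0.85"/>
    <field name="destination" type="string" required="true" mapped="" confidence="0"/>
  </target_fields>
</SequenceError>"""


def unmapped_required_fields(error_xml: str) -> List[str]:
    """Names of required target fields whose 'mapped' attribute is empty or absent."""
    root = ET.fromstring(error_xml)
    missing = []
    for field in root.findall("./target_fields/field"):
        if field.get("required") == "true" and not field.get("mapped"):
            missing.append(field.get("name"))
    return missing


missing = unmapped_required_fields(ERROR_XML)
```

For the example error, the only field left to resolve is `destination`, which matches the `<suggestion>` element.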

## Future Enhancements

### v1.1: Type Coercion
- Automatic int → string, date formatting, etc.
- Warnings when lossy conversion occurs

### v1.2: Expression Builder
- Visual expression editor for complex mappings
- Functions: `concat()`, `format()`, `split()`, `lookup()`

### v1.3: Learning from Corrections
- Track when users override AI suggestions
- Fine-tune confidence thresholds
- Eventually: personalized mapping suggestions

### v2.0: Multi-Output Nodes
- Some nodes produce multiple outputs
- UI shows multiple output ports
- User wires specific port to specific input

---

*This spec is a living document. Update as implementation progresses.*