K9-AIF Framework · Architecture Blueprint

From Run Scenario to Result: the complete runtime story

Every event flows through a governed, explainable pipeline: Kafka-routed, model-routed per task, zero-trust gated, PII-guarded, and immutably audited. This page shows exactly how, driven entirely from the framework's YAML configuration.

End-to-End Runtime Flow

Messaging · Kafka
🖥️ UI / API: Run Scenario click
app_backend: FastAPI · port 8000 · publishes event
eoc-events: inbound topic
EOCRouter: routes by event_type · deterministic table
eoc-claims · eoc-fraud · eoc-docs…
EOCOrchestrator: 7 domain squads dispatched here

Squad
🤖 Squad: agent pipeline (e.g. ClaimsProcessingSquad) loaded from squads.yaml + agent YAML files
⚖️ Zero Trust Gate: confidence ≥ 0.75 → approve, continue · confidence < 0.75 → HITL escalation · score > 0.85 → auto-deny
🧠 Domain Agent: Triage · Adjudication · Fraud · Extraction
⚡ Intelligent Model Router: task_type → model selection · cost + latency optimised
🤖 LLM Call: Ollama · Granite 3 or llama3.2:1b · routed per task type
🛡️ GuardAgent: granite3-guardian · PII + compliance · HARD GATE: no fallback, no bypass possible
📋 AuditAgent: PostgreSQL · immutable trail · every action recorded with prompt hash
🔗 GraphSyncAgent: Neo4j · entities · relationships · knowledge graph sync

Result
Kafka eoc-results: results topic
📡 SSE Stream: /events/stream · real-time push
🖥️ UI Dashboard: KafkaResult event · live trace display
✓ Governed end-to-end: Audited · PII-guarded · Zero-trust gated
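The EOCRouter step routes by event_type through a deterministic table. A minimal sketch of that idea follows. The event_type values (`claim_submitted`, `fraud_alert`, `document_uploaded`) and the names `ROUTING_TABLE`, `UnroutableEvent`, and `route` are hypothetical; only the topic names come from this page. The sketch returns a topic name and does not talk to Kafka.

```python
from typing import Mapping

# Hypothetical event_type values; the topic names are the ones shown in the flow.
ROUTING_TABLE: Mapping[str, str] = {
    "claim_submitted": "eoc-claims",
    "fraud_alert": "eoc-fraud",
    "document_uploaded": "eoc-docs",
}


class UnroutableEvent(ValueError):
    """Raised when an event_type has no entry in the routing table."""


def route(event: Mapping[str, object],
          table: Mapping[str, str] = ROUTING_TABLE) -> str:
    """Return the destination topic for an event read from eoc-events."""
    event_type = event.get("event_type")
    if not isinstance(event_type, str) or event_type not in table:
        # A table lookup either hits or fails loudly; nothing is guessed.
        raise UnroutableEvent(f"no route for event_type={event_type!r}")
    return table[event_type]
```

Because the lookup is a plain dictionary access, the same event_type always maps to the same topic, which is what makes the routing explainable after the fact.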
Intelligent Model Router — 4 Decision Rules
RULE 1 — checked first, always wins
Compliance Gate
When the task touches PII, policy, or compliance — Guardian is mandatory. No other model is acceptable, no fallback permitted.
Model selected: granite3-guardian
Task types: pii_detection · policy · guardrails · confidential
NO FALLBACK — EVER
RULE 2 — high-volume commodity tasks
Cost-Optimised
Chat, summarization, and customer interactions are frequent and don't need heavy reasoning. Route to the smallest capable model.
Model selected: llama3.2:1b
Task types: chat · summarization · customer_intent · general
Cost: minimal · Latency: realtime
RULE 3 — domain-specific tasks
Capability-Matched
Adjudication, fraud analysis, and extraction require a model with specific capabilities. The router looks up the catalog and selects the best match.
Model selected: granite3-dense:2b
Task types: adjudication · reasoning · fraud · extraction · ocr
Fallback to llama3.2:1b if capability unavailable
RULE 4 — nothing matched above
Catalog Default
If no rule matched, fall back to the default model configured in config.yaml. The router always resolves — it never leaves a task unhandled.
Model selected: llama3.2:1b
Task types: any unmatched task
Safe fallback — always resolves
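The four rules above can be read as an ordered chain of checks. The sketch below is one possible rendering, not the framework's router: the `Route` type, the `select_model` function, and the fixed task-type sets are assumptions made for illustration. In particular, Rule 3 is described as a catalog lookup, which is simplified here to a static set.

```python
from dataclasses import dataclass
from typing import Optional

GUARDIAN = "granite3-guardian"
SMALL = "llama3.2:1b"
DENSE = "granite3-dense:2b"

# Task-type tags as listed under each rule on this page.
COMPLIANCE_TASKS = {"pii_detection", "policy", "guardrails", "confidential"}
COMMODITY_TASKS = {"chat", "summarization", "customer_intent", "general"}
CAPABILITY_TASKS = {"adjudication", "reasoning", "fraud", "extraction", "ocr"}


@dataclass(frozen=True)
class Route:
    model: str
    fallback: Optional[str]  # None: no fallback model is configured
    rule: int                # which of the four rules matched


def select_model(task_type: str, default_model: str = SMALL) -> Route:
    """Apply the rules in order; the first match wins."""
    if task_type in COMPLIANCE_TASKS:      # Rule 1: Guardian, never a fallback
        return Route(GUARDIAN, None, 1)
    if task_type in COMMODITY_TASKS:       # Rule 2: smallest capable model
        return Route(SMALL, None, 2)
    if task_type in CAPABILITY_TASKS:      # Rule 3: capability match (simplified)
        return Route(DENSE, SMALL, 3)
    return Route(default_model, None, 4)   # Rule 4: catalog default
```

Checking the compliance set first is what guarantees Rule 1 always wins, even if a task type were accidentally listed in more than one set.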
Agent → Model Assignment (from agent YAML files)
| Agent (Squad) | Task Type | Model Chosen | Fallback | Guard Layer | Why this model? |
|---|---|---|---|---|---|
| ClaimsTriageAgent (ClaimsProcessingSquad) | reasoning | granite3-dense:2b | general | pre-guard | Priority scoring + completeness check needs reasoning |
| AdjudicationAgent (ClaimsProcessingSquad) | adjudication | granite3-dense:2b | general | pre + post | Domain-specific, high accuracy required |
| FraudDetectionAgent (RiskAssessmentSquad) | fraud / reasoning | granite3-dense:2b | general | pre + post | Multi-signal correlation: complex reasoning required |
| DocumentExtractorAgent (DocumentIntelligenceSquad) | extraction | granite3-dense:2b | general | none | Structured JSON output: extraction capability required |
| GuardAgent (all squads) | guardrails · pii_detection | granite3-guardian | NONE (hard requirement) | IS the guard | Compliance gate: purpose-built safety model, no substitution |
| AuditAgent (all squads) | general / audit | llama3.2:1b | none | | Structured record write: no LLM reasoning needed, minimal cost |
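The Guard Layer column in the table above says where a guard check wraps each agent: before the call, after it, both, or not at all. A minimal sketch of that wrapping follows. The `run_with_guards` function and the callable signatures are hypothetical, and `guard_fn` is only a stand-in for whatever GuardAgent does; the layer assignments are copied from the table.

```python
from typing import Callable, Dict, FrozenSet

# Guard layers per agent, copied from the Guard Layer column.
GUARD_LAYERS: Dict[str, FrozenSet[str]] = {
    "ClaimsTriageAgent": frozenset({"pre"}),
    "AdjudicationAgent": frozenset({"pre", "post"}),
    "FraudDetectionAgent": frozenset({"pre", "post"}),
    "DocumentExtractorAgent": frozenset(),
}


def run_with_guards(agent_name: str,
                    agent_fn: Callable[[str], str],
                    guard_fn: Callable[[str], str],
                    text: str) -> str:
    """Run agent_fn, applying guard_fn at the layers configured for the agent."""
    layers = GUARD_LAYERS.get(agent_name, frozenset())
    if "pre" in layers:
        text = guard_fn(text)        # guard the payload before the model sees it
    output = agent_fn(text)
    if "post" in layers:
        output = guard_fn(output)    # guard the model output before it moves on
    return output
```

Keeping the layer assignment in data rather than in each agent mirrors the page's claim that behaviour is driven from configuration.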
Zero Trust Execution Layer

K9 Zero Trust — Confidence-Based Control

Every agent decision is scored against three thresholds before any action is taken. No agent output is implicitly trusted — this gate runs on every orchestrator flow, for every event type, without exception.

Confidence scale (0.0 to 1.0), threshold markers: obligation · 0.60, approve · 0.75, deny · 0.85
Bands: deny / block · obligation · approve · high confidence
Approval Threshold: ≥ 0.75
When agent confidence is at or above this level, the decision is approved and the pipeline continues to GuardAgent.
HITL Escalation: < 0.75
When confidence is below the threshold, EscalationAgent creates a HITL ticket. The pipeline pauses and waits for a human operator's decision.
Auto-Deny: ≥ 0.85
Risk or fraud scores above 0.85 trigger automatic denial; no human review is needed at this confidence level.
GuardAgent — Hard Gate
Always runs
GuardAgent (granite3-guardian) runs on every flow regardless of confidence score. It is not a threshold check — it is a mandatory compliance checkpoint. No bypass, no fallback, no exceptions.
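The three outcomes described above (approve, HITL escalation, auto-deny) can be sketched as a single decision function. This is an illustration under stated assumptions, not the framework's gate: the `Decision` enum and `zero_trust_gate` are hypothetical names, the deny check is assumed to run before the approval check, and the 0.85 boundary is treated as inclusive. The 0.60 obligation threshold is left out because this page does not say what action it triggers.

```python
from enum import Enum
from typing import Optional

DENY_THRESHOLD = 0.85      # risk or fraud score at or above this: auto-deny
APPROVAL_THRESHOLD = 0.75  # agent confidence at or above this: approve


class Decision(Enum):
    APPROVE = "approve"
    ESCALATE_HITL = "escalate_hitl"
    AUTO_DENY = "auto_deny"


def zero_trust_gate(confidence: float,
                    risk_score: Optional[float] = None) -> Decision:
    """Score an agent decision against the deny and approval thresholds."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be within [0, 1]")
    # Assumption: a high risk score overrides a confident approval.
    if risk_score is not None and risk_score >= DENY_THRESHOLD:
        return Decision.AUTO_DENY
    if confidence >= APPROVAL_THRESHOLD:
        return Decision.APPROVE
    # Anything below the approval threshold waits for a human operator.
    return Decision.ESCALATE_HITL
```

GuardAgent is deliberately absent from this function: per the text above it is not a threshold check and runs regardless of what the gate returns.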
Governance Policies — config/governance.yaml
🔍
PII Guard
governance.yaml → policies.pii_guard
Detect and mask personally identifiable information before any payload reaches an LLM endpoint. Runs at pre-process and post-process on every guarded agent.
model: granite3-guardian
pre + post process: enabled
fields scanned: ssn · dob · bank · cc · phone · email
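To make the masking step concrete, here is a small regex-based stand-in. It is explicitly not how granite3-guardian works: the page assigns detection to that model, and the patterns below are simplistic assumptions that cover only three of the six scanned fields (ssn, email, phone) in common US formats. `PII_PATTERNS` and `mask_pii` are hypothetical names.

```python
import re
from typing import Dict, Pattern

# Simplistic illustrative patterns; a real guard would be far more thorough.
PII_PATTERNS: Dict[str, Pattern[str]] = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def mask_pii(text: str) -> str:
    """Replace each detected value with a bracketed field label."""
    for field, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{field.upper()}]", text)
    return text
```

Running the same function on the prompt before the LLM call and on the response afterwards corresponds to the pre + post process setting.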
Output Validation
governance.yaml → policies.output_validation
All LLM outputs are validated against a required field schema before any downstream action. Malformed outputs are rejected — not silently passed through.
adjudication requires: decision · confidence · rationale
fraud requires: risk_score · signals · recommendation
triage requires: priority · completeness · coverage
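The required-field check can be sketched in a few lines. The field lists are the ones given above; the names `REQUIRED_FIELDS`, `OutputValidationError`, and `validate_output` are hypothetical, and the sketch checks only for field presence, not types or value ranges.

```python
from typing import Any, Dict, Mapping, Tuple

REQUIRED_FIELDS: Dict[str, Tuple[str, ...]] = {
    "adjudication": ("decision", "confidence", "rationale"),
    "fraud": ("risk_score", "signals", "recommendation"),
    "triage": ("priority", "completeness", "coverage"),
}


class OutputValidationError(ValueError):
    """Raised when an LLM output does not satisfy its required-field schema."""


def validate_output(task: str, output: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return the output unchanged if valid; raise instead of passing it through."""
    required = REQUIRED_FIELDS.get(task)
    if required is None:
        raise OutputValidationError(f"no schema registered for task {task!r}")
    missing = [name for name in required if name not in output]
    if missing:
        raise OutputValidationError(
            f"{task} output missing fields: {', '.join(missing)}"
        )
    return output
```

Raising on failure, rather than returning a flag, makes it hard for a caller to forward a malformed output by accident.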
⚖️
Confidence Threshold
governance.yaml → policies.confidence_threshold
When an agent's confidence score falls below 0.75, the decision is automatically escalated to a human operator via HITL ticket. The pipeline pauses and waits.
threshold: 0.75
applies to: AdjudicationAgent · FraudDetectionAgent · ClaimsTriageAgent
📋
Audit All Actions
governance.yaml → policies.audit_all_actions
Every agent action produces an immutable AuditEntry in PostgreSQL — including prompt hash, response hash, model ID, confidence, and timestamp. Full compliance replay is always possible.
storage: PostgreSQL · immutable
hashed fields: prompt + response (SHA-256)
scope: every agent · every event
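The shape of an audit record can be sketched as follows. The `AuditEntry` name and the list of recorded items (prompt hash, response hash, model ID, confidence, timestamp) come from the description above; the exact field names, the `agent` field, and the `build_audit_entry` helper are assumptions. A frozen dataclass only makes the in-memory object read-only; immutability in PostgreSQL would have to be enforced by the database and is not shown.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def sha256_hex(text: str) -> str:
    """Hex-encoded SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class AuditEntry:
    agent: str
    model_id: str
    prompt_hash: str
    response_hash: str
    confidence: float
    timestamp: datetime


def build_audit_entry(agent: str, model_id: str, prompt: str,
                      response: str, confidence: float) -> AuditEntry:
    """Hash the prompt and response so the record holds no raw text."""
    return AuditEntry(
        agent=agent,
        model_id=model_id,
        prompt_hash=sha256_hex(prompt),
        response_hash=sha256_hex(response),
        confidence=confidence,
        timestamp=datetime.now(timezone.utc),
    )
```

Storing hashes instead of raw text lets a later replay confirm that a given prompt and response match the record without keeping the payload itself in the audit table.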
🔒
No Fallback for Compliance
governance.yaml → policies.no_fallback_for_compliance
PII detection, policy compliance, and output validation must use Guardian. If Guardian is unavailable the pipeline fails — a degraded model is not an acceptable substitute.
hard tasks: pii_detection · policy_compliance · output_validation
on unavailability: pipeline error, no degraded mode
🛡️
Zero Trust Execution
governance.yaml → policies.zero_trust
The K9 Zero Trust layer applies on every orchestrator flow. All decisions are scored against three thresholds — no agent output is implicitly trusted.
deny threshold: 0.85
approval threshold: 0.75
obligation threshold: 0.60