K9-AIF Framework · Architecture Blueprint
From Run Scenario to Result: the complete runtime story
Every event flows through a governed, explainable pipeline: Kafka-routed, model-routed per task,
zero-trust gated, PII-guarded, and immutably audited. This page shows exactly how,
driven entirely by the framework's YAML configuration.
End-to-End Runtime Flow
Messaging (Kafka)
🖥️ UI / API: Run Scenario click
› ⚡ app_backend: FastAPI · port 8000 · publishes event
› ⇄ EOCRouter: routes by event_type · deterministic table
› eoc-claims · eoc-fraud · eoc-docs…
› ◈ EOCOrchestrator: 7 domain squads dispatched here
Squad
🤖 Squad: agent pipeline (e.g. ClaimsProcessingSquad), loaded from squads.yaml + agent YAML files
⚖️ Zero Trust Gate: confidence ≥ 0.75 → approve, continue · confidence < 0.75 → HITL escalation
🧠 Domain Agent: Triage · Adjudication · Fraud · Extraction
⚡ Intelligent Model Router: task_type → model selection · cost + latency optimised
› 🤖 LLM Call: Ollama · Granite 3 or llama3.2:1b · routed per task type
› 🛡️ GuardAgent: granite3-guardian · PII + compliance · HARD GATE, no fallback, no bypass possible
› 📋 AuditAgent: PostgreSQL · immutable trail · every action recorded with prompt hash
› 🔗 GraphSyncAgent: Neo4j · entities, relationships · knowledge graph sync
Result
Kafka: eoc-results · results topic
› 📡 SSE Stream: /events/stream · real-time push
› 🖥️ UI Dashboard: KafkaResult event · live trace display
✓ Governed end-to-end: Audited · PII-guarded · Zero-trust gated
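The EOCRouter step in the flow above (routing by event_type through a deterministic table) can be sketched roughly as follows. This is a minimal illustration, not the framework's actual router: only the topic names eoc-claims, eoc-fraud, and eoc-docs come from this page, while the event type strings, the `ROUTES` name, and the `route` function are hypothetical.

```python
# Hypothetical routing table. The topic names appear on this page;
# the event_type keys are assumed purely for illustration.
ROUTES = {
    "claim_submitted": "eoc-claims",
    "fraud_alert": "eoc-fraud",
    "document_uploaded": "eoc-docs",
}


def route(event: dict) -> str:
    """Return the Kafka topic for an event using a plain table lookup.

    The lookup is deterministic: the same event_type always maps to the
    same topic, and an unknown event_type fails loudly instead of being
    guessed at.
    """
    event_type = event["event_type"]
    try:
        return ROUTES[event_type]
    except KeyError:
        raise ValueError(f"no route configured for event_type={event_type!r}") from None
```

A table lookup like this keeps routing explainable: the reason an event reached a given topic is always a single row in the table.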
Intelligent Model Router — 4 Decision Rules
RULE 1: checked first, always wins
Compliance Gate
When the task touches PII, policy, or compliance, Guardian is mandatory. No other model is acceptable and no fallback is permitted.
Model selected: granite3-guardian
Task types: pii_detection · policy · guardrails · confidential
NO FALLBACK, EVER
RULE 2: high-volume commodity tasks
Cost-Optimised
Chat, summarization, and customer interactions are frequent and don't need heavy reasoning. Route to the smallest capable model.
Model selected: llama3.2:1b
Task types: chat · summarization · customer_intent · general
Cost: minimal · Latency: realtime
RULE 3: domain-specific tasks
Capability-Matched
Adjudication, fraud analysis, and extraction require a model with specific capabilities. The router looks up the catalog and selects the best match.
Model selected: granite3-dense:2b
Task types: adjudication · reasoning · fraud · extraction · ocr
Fallback to llama3.2:1b if capability unavailable
RULE 4: nothing matched above
Catalog Default
If no rule matched, fall back to the default model configured in config.yaml. The router always resolves an unmatched task; it never leaves one unhandled.
Model selected: llama3.2:1b
Task types: any unmatched task
Safe fallback, always resolves
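The four rules above, evaluated in order, could look roughly like the sketch below. It is an illustration under stated assumptions rather than the framework's router: the `select_model` function, the `ModelUnavailable` exception, and the `available` set of model names are hypothetical, and the default model is hard-coded here although the page says it comes from config.yaml.

```python
GUARDIAN = "granite3-guardian"
SMALL = "llama3.2:1b"
DENSE = "granite3-dense:2b"
DEFAULT_MODEL = SMALL  # assumed stand-in for the default in config.yaml

# Task types listed under each rule on this page.
COMPLIANCE_TASKS = {"pii_detection", "policy", "guardrails", "confidential"}
COST_TASKS = {"chat", "summarization", "customer_intent", "general"}
CAPABILITY_TASKS = {"adjudication", "reasoning", "fraud", "extraction", "ocr"}


class ModelUnavailable(RuntimeError):
    """Raised when a compliance task cannot be served by Guardian."""


def select_model(task_type: str, available: set) -> str:
    """Pick a model name for task_type, checking the rules in order."""
    # Rule 1: compliance always wins and never falls back.
    if task_type in COMPLIANCE_TASKS:
        if GUARDIAN not in available:
            raise ModelUnavailable(f"{GUARDIAN} is required for {task_type}")
        return GUARDIAN
    # Rule 2: high-volume commodity tasks go to the smallest model.
    if task_type in COST_TASKS:
        return SMALL
    # Rule 3: capability-matched, with a fallback to the small model.
    if task_type in CAPABILITY_TASKS:
        return DENSE if DENSE in available else SMALL
    # Rule 4: anything unmatched resolves to the catalog default.
    return DEFAULT_MODEL
```

Checking the compliance rule first means that no later rule can route a PII or policy task to a cheaper model by accident.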
Agent → Model Assignment (from agent YAML files)
| Agent | Task Type | Model Chosen | Fallback | Guard Layer | Why this model? |
|---|---|---|---|---|---|
| ClaimsTriageAgent (ClaimsProcessingSquad) | reasoning | granite3-dense:2b | general | pre-guard | Priority scoring + completeness check needs reasoning |
| AdjudicationAgent (ClaimsProcessingSquad) | adjudication | granite3-dense:2b | general | pre + post | Domain-specific, high accuracy required |
| FraudDetectionAgent (RiskAssessmentSquad) | fraud / reasoning | granite3-dense:2b | general | pre + post | Multi-signal correlation: complex reasoning required |
| DocumentExtractorAgent (DocumentIntelligenceSquad) | extraction | granite3-dense:2b | general | none | Structured JSON output: extraction capability required |
| GuardAgent (all squads) | guardrails · pii_detection | granite3-guardian | NONE (hard requirement) | IS the guard | Compliance gate: purpose-built safety model, no substitution |
| AuditAgent (all squads) | general / audit | llama3.2:1b | - | none | Structured record write: no LLM reasoning needed, minimal cost |
Zero Trust Execution Layer
K9 Zero Trust — Confidence-Based Control
Every agent decision is scored against three thresholds before any action is taken.
No agent output is implicitly trusted — this gate runs on every orchestrator flow,
for every event type, without exception.
[Confidence scale from 0.0 to 1.0. Thresholds: obligation 0.60 · approve 0.75 · deny 0.85. Zones: deny / block · obligation · approve · high confidence]
≥ 0.75: Agent confidence is at or above this level. The decision is approved and the pipeline continues to GuardAgent.
< 0.75: Confidence is below the threshold. EscalationAgent creates a HITL ticket; the pipeline pauses and waits for a human operator decision.
≥ 0.85: Risk or fraud scores at or above 0.85 trigger automatic denial. No human review is needed at this confidence level.
Always runs: GuardAgent (granite3-guardian) runs on every flow regardless of confidence score. It is not a threshold check; it is a mandatory compliance checkpoint. No bypass, no fallback, no exceptions.
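The threshold behaviour described above can be sketched as a small decision function. This is a hedged illustration only: the `gate` function, its return strings, and the order in which the checks run are assumptions, and the 0.60 obligation threshold is left out because this page does not say what action it triggers.

```python
from typing import Optional

APPROVAL_THRESHOLD = 0.75  # from this page
DENY_THRESHOLD = 0.85      # from this page; applied to risk or fraud scores


def gate(confidence: float, risk_score: Optional[float] = None) -> str:
    """Return "deny", "approve", or "escalate" for one agent decision.

    Assumption: the risk check runs before the confidence check, so a
    high risk score is denied even when the agent is confident.
    """
    if risk_score is not None and risk_score >= DENY_THRESHOLD:
        return "deny"
    if confidence >= APPROVAL_THRESHOLD:
        return "approve"  # pipeline continues to GuardAgent
    return "escalate"     # EscalationAgent opens a HITL ticket
```

For example, `gate(0.74)` escalates, `gate(0.75)` approves, and `gate(0.9, risk_score=0.85)` denies. GuardAgent is deliberately absent from this function, since it runs on every flow regardless of the score.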
Governance Policies — config/governance.yaml
🔍
PII Guard
governance.yaml → policies.pii_guard
Detect and mask personally identifiable information before any payload reaches an LLM endpoint. Runs at pre-process and post-process on every guarded agent.
model: granite3-guardian
pre + post process: enabled
fields scanned: ssn · dob · bank · cc · phone · email
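As a rough picture of the masking step, the sketch below redacts the scanned fields from a payload by key name before it would be sent to an LLM endpoint. Note the swap: the framework uses the granite3-guardian model for detection, whereas this stand-in only matches dictionary keys. The `mask_pii` function and the `[REDACTED]` marker are hypothetical.

```python
# Field names taken from the "fields scanned" list on this page.
PII_FIELDS = {"ssn", "dob", "bank", "cc", "phone", "email"}
REDACTED = "[REDACTED]"  # assumed placeholder value


def mask_pii(payload):
    """Return a copy of payload with values of known PII keys masked.

    Key-name matching is a simplification; it does not detect PII that
    appears inside free text, which is what a guard model is for.
    """
    if isinstance(payload, dict):
        return {
            key: REDACTED if str(key).lower() in PII_FIELDS else mask_pii(value)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [mask_pii(item) for item in payload]
    return payload
```

Running the same function on the model's response would mirror the pre-process and post-process passes described above.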
✅
Output Validation
governance.yaml → policies.output_validation
All LLM outputs are validated against a required field schema before any downstream action. Malformed outputs are rejected — not silently passed through.
adjudication requires: decision · confidence · rationale
fraud requires: risk_score · signals · recommendation
triage requires: priority · completeness · coverage
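A required-field check of this kind can be sketched as follows. The field lists come from this page; the `validate_output` function and the `OutputValidationError` exception are hypothetical names, and the sketch checks only that the fields are present, not their types or values.

```python
# Required fields per task type, as listed on this page.
REQUIRED_FIELDS = {
    "adjudication": {"decision", "confidence", "rationale"},
    "fraud": {"risk_score", "signals", "recommendation"},
    "triage": {"priority", "completeness", "coverage"},
}


class OutputValidationError(ValueError):
    """Raised when an LLM output lacks a required field."""


def validate_output(task_type: str, output: dict) -> dict:
    """Return output unchanged if complete, otherwise raise.

    A task_type without a schema raises KeyError here; how the framework
    treats such tasks is not stated on this page.
    """
    required = REQUIRED_FIELDS[task_type]
    missing = sorted(required - set(output))
    if missing:
        raise OutputValidationError(f"{task_type} output missing fields: {missing}")
    return output
```

Raising an exception, instead of returning a partial result, matches the stated behaviour that malformed outputs are rejected and never passed through silently.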
⚖️
Confidence Threshold
governance.yaml → policies.confidence_threshold
When an agent's confidence score falls below 0.75, the decision is automatically escalated to a human operator via HITL ticket. The pipeline pauses and waits.
threshold: 0.75
applies to: AdjudicationAgent · FraudDetectionAgent · ClaimsTriageAgent
📋
Audit All Actions
governance.yaml → policies.audit_all_actions
Every agent action produces an immutable AuditEntry in PostgreSQL — including prompt hash, response hash, model ID, confidence, and timestamp. Full compliance replay is always possible.
storage: PostgreSQL · immutable
hashed fields: prompt + response (SHA-256)
scope: every agent · every event
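The audit record could be modelled along these lines. The field set follows the description above (prompt hash, response hash, model ID, confidence, timestamp); the `AuditEntry` layout, the `agent` field, and the `make_audit_entry` helper are assumptions. The PostgreSQL write is out of scope here, and the frozen dataclass only mirrors immutability in memory.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hex digest of a UTF-8 encoded string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned
class AuditEntry:
    agent: str
    model_id: str
    prompt_hash: str
    response_hash: str
    confidence: float
    timestamp: str


def make_audit_entry(agent: str, model_id: str, prompt: str,
                     response: str, confidence: float) -> AuditEntry:
    """Build an entry that stores hashes, not the raw prompt or response."""
    return AuditEntry(
        agent=agent,
        model_id=model_id,
        prompt_hash=sha256_hex(prompt),
        response_hash=sha256_hex(response),
        confidence=confidence,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
```

Storing hashes lets a later replay confirm that a given prompt and response are the ones that were audited, without keeping the raw text in the audit row.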
🔒
No Fallback for Compliance
governance.yaml → policies.no_fallback_for_compliance
PII detection, policy compliance, and output validation must use Guardian. If Guardian is unavailable the pipeline fails — a degraded model is not an acceptable substitute.
hard tasks: pii_detection · policy_compliance · output_validation
on unavailability: pipeline error, no degraded mode
🛡️
Zero Trust Execution
governance.yaml → policies.zero_trust
The K9 Zero Trust layer applies on every orchestrator flow. All decisions are scored against three thresholds — no agent output is implicitly trusted.
deny threshold: 0.85
approval threshold: 0.75
obligation threshold: 0.60