K9-AIF Framework · Architecture Blueprint

From Run Scenario to Result: the complete runtime story

Every event flows through a governed, explainable pipeline: routed through Kafka, assigned a model per task, gated by zero-trust checks, guarded against PII leaks, and immutably audited. This page shows how each step works, driven entirely by the framework's YAML configuration.

End-to-End Runtime Flow

Messaging (Kafka)

🖥️ UI / API: Run Scenario click
› ⚡ app_backend (FastAPI · port 8000): publishes event
› eoc-events: inbound topic
› ⇄ EOCRouter: routes by event_type using a deterministic table
› eoc-claims · eoc-fraud · eoc-docs…
› ◈ EOCOrchestrator: 7 domain squads dispatched here

▼

Squad

🤖 Squad: agent pipeline (e.g. ClaimsProcessingSquad) loaded from squads.yaml + agent YAML files

⚖️ Zero Trust Gate
confidence ≥ 0.75 → approve, continue
confidence < 0.75 → HITL escalation
score > 0.85 → auto-deny

🧠 Domain Agent: Triage · Adjudication · Fraud · Extraction
⚡ Intelligent Model Router: task_type → model selection, cost + latency optimised
› 🤖 LLM Call: Ollama · Granite 3 or llama3.2:1b, routed per task type
› 🛡️ GuardAgent: granite3-guardian, PII + compliance. HARD GATE: no fallback, no bypass possible
› 📋 AuditAgent: PostgreSQL immutable trail, every action recorded with prompt hash
› 🔗 GraphSyncAgent: Neo4j entities and relationships, knowledge graph sync

▼

Result (Kafka)

eoc-results: results topic
› 📡 SSE Stream: /events/stream, real-time push
› 🖥️ UI Dashboard: KafkaResult event, live trace display

✓ Governed end-to-end: Audited · PII-guarded · Zero-trust gated
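The EOCRouter step in the flow above routes by event_type through a deterministic table. A minimal sketch of that idea follows. The event_type keys and the function name are hypothetical, chosen only for illustration; the topic names eoc-claims and eoc-fraud come from the flow itself.

```python
# Hypothetical routing table. The event_type keys are illustrative assumptions;
# only the topic names (eoc-claims, eoc-fraud) appear in the flow above.
ROUTING_TABLE = {
    "claim_submitted": "eoc-claims",
    "fraud_alert": "eoc-fraud",
}


def route_event(event: dict) -> str:
    """Return the destination topic for an event, or raise if no route exists."""
    event_type = event.get("event_type")
    try:
        return ROUTING_TABLE[event_type]
    except KeyError:
        # A deterministic router has no guessing step: unknown types are errors.
        raise ValueError(f"no route configured for event_type={event_type!r}") from None
```

Because the lookup is a plain table, the same event_type always lands on the same topic, which keeps the routing decision easy to explain and audit.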

Intelligent Model Router — 4 Decision Rules

RULE 1 — checked first, always wins

Compliance Gate

When the task touches PII, policy, or compliance, Guardian is mandatory. No other model is acceptable and no fallback is permitted.

Model selected: granite3-guardian

pii_detection · policy · guardrails · confidential

NO FALLBACK — EVER

RULE 2 — high-volume commodity tasks

Cost-Optimised

Chat, summarization, and customer interactions are frequent and don't need heavy reasoning. Route to the smallest capable model.

Model selected: llama3.2:1b

chat · summarization · customer_intent · general

Cost: minimal · Latency: realtime

RULE 3 — domain-specific tasks

Capability-Matched

Adjudication, fraud analysis, and extraction require a model with specific capabilities. The router looks up the catalog and selects the best match.

Model selected: granite3-dense:2b

adjudication · reasoning · fraud · extraction · ocr

Fallback to llama3.2:1b if capability unavailable

RULE 4 — nothing matched above

Catalog Default

If no rule matched, fall back to the default model configured in config.yaml. The router always resolves — it never leaves a task unhandled.

Model selected: llama3.2:1b

any unmatched task

Safe fallback — always resolves
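The four rules above can be sketched as an ordered chain of checks. This is an illustration under stated assumptions, not the framework's router: the names select_model and ModelChoice are hypothetical, and the task sets are hard-coded from the tags listed under each rule, whereas the page says Rule 3 consults a model catalog.

```python
from dataclasses import dataclass
from typing import Optional

GUARDIAN_MODEL = "granite3-guardian"
SMALL_MODEL = "llama3.2:1b"
DOMAIN_MODEL = "granite3-dense:2b"

# Task tags copied from the four rule cards above.
COMPLIANCE_TASKS = {"pii_detection", "policy", "guardrails", "confidential"}
COMMODITY_TASKS = {"chat", "summarization", "customer_intent", "general"}
DOMAIN_TASKS = {"adjudication", "reasoning", "fraud", "extraction", "ocr"}


@dataclass(frozen=True)
class ModelChoice:
    model: str
    fallback: Optional[str]  # None means no fallback is permitted or defined
    rule: int                # which of the four rules matched


def select_model(task_type: str, default_model: str = SMALL_MODEL) -> ModelChoice:
    """Apply the rules in order; the first match wins."""
    if task_type in COMPLIANCE_TASKS:   # Rule 1: compliance gate, never a fallback
        return ModelChoice(GUARDIAN_MODEL, None, 1)
    if task_type in COMMODITY_TASKS:    # Rule 2: smallest capable model
        return ModelChoice(SMALL_MODEL, None, 2)
    if task_type in DOMAIN_TASKS:       # Rule 3: capability match with fallback
        return ModelChoice(DOMAIN_MODEL, SMALL_MODEL, 3)
    return ModelChoice(default_model, None, 4)  # Rule 4: catalog default
```

Checking the compliance set first is what makes Rule 1 "always win": a task tagged for compliance never reaches the cheaper rules.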

Agent → Model Assignment (from agent YAML files)

Agent (Squad)	Task Type	Model Chosen	Fallback	Guard Layer	Why this model?
ClaimsTriageAgent (ClaimsProcessingSquad)	reasoning	granite3-dense:2b	general	pre-guard	Priority scoring and completeness check need reasoning
AdjudicationAgent (ClaimsProcessingSquad)	adjudication	granite3-dense:2b	general	pre + post	Domain-specific, high accuracy required
FraudDetectionAgent (RiskAssessmentSquad)	fraud / reasoning	granite3-dense:2b	general	pre + post	Multi-signal correlation, complex reasoning required
DocumentExtractorAgent (DocumentIntelligenceSquad)	extraction	granite3-dense:2b	general	none	Structured JSON output, extraction capability required
GuardAgent (all squads)	guardrails · pii_detection	granite3-guardian	NONE (hard requirement)	IS the guard	Compliance gate: purpose-built safety model, no substitution
AuditAgent (all squads)	general / audit	llama3.2:1b	-	none	Structured record write, no LLM reasoning needed, minimal cost

Zero Trust Execution Layer

K9 Zero Trust — Confidence-Based Control

Every agent decision is scored against three thresholds before any action is taken. No agent output is implicitly trusted — this gate runs on every orchestrator flow, for every event type, without exception.

[Confidence scale from 0.0 to 1.0. Threshold markers: obligation · 0.60, approve · 0.75, deny · 0.85. Band labels: deny / block, obligation, approve, high confidence.]

Approval Threshold

≥ 0.75

When agent confidence is at or above this level, the decision is approved and the pipeline continues to GuardAgent.

HITL Escalation

< 0.75

When confidence falls below the threshold, EscalationAgent creates a human-in-the-loop (HITL) ticket. The pipeline pauses and waits for a human operator's decision.

Auto-Deny

≥ 0.85

Risk or fraud scores above 0.85 trigger automatic denial — no human review needed at this confidence level.

GuardAgent — Hard Gate

Always runs

GuardAgent (granite3-guardian) runs on every flow regardless of confidence score. It is not a threshold check — it is a mandatory compliance checkpoint. No bypass, no fallback, no exceptions.
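The threshold cards above can be read as a small decision function. The sketch below is one possible reading, with assumptions called out: the names Verdict and evaluate are hypothetical, the deny check is assumed to run before the approval check, and the deny comparison uses "at or above 0.85" as on the Auto-Deny card (the page also states it as "above 0.85"). The 0.60 obligation threshold is not modelled because the page does not describe what it triggers.

```python
from enum import Enum

APPROVAL_THRESHOLD = 0.75
DENY_THRESHOLD = 0.85


class Verdict(Enum):
    APPROVE = "approve"
    ESCALATE_HITL = "escalate_hitl"
    AUTO_DENY = "auto_deny"


def evaluate(confidence: float, risk_score: float = 0.0) -> Verdict:
    """Score one agent decision against the approval and deny thresholds."""
    # Assumption: a high risk or fraud score overrides agent confidence.
    if risk_score >= DENY_THRESHOLD:
        return Verdict.AUTO_DENY
    if confidence >= APPROVAL_THRESHOLD:
        return Verdict.APPROVE
    # Anything below the approval threshold goes to a human operator.
    return Verdict.ESCALATE_HITL
```

GuardAgent is deliberately absent from this function: per the card above it is not a threshold check and runs on every flow regardless of the verdict.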

Governance Policies — config/governance.yaml

🔍

PII Guard

governance.yaml → policies.pii_guard

Detect and mask personally identifiable information before any payload reaches an LLM endpoint. Runs at pre-process and post-process on every guarded agent.

model: granite3-guardian

pre + post process: enabled

fields scanned: ssn · dob · bank · cc · phone · email
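To make the "detect and mask before the payload reaches an LLM" step concrete, here is a deliberately simplified stand-in. It uses regular expressions rather than the granite3-guardian model the policy names, covers only two of the listed fields, and the function name mask_pii is hypothetical.

```python
import re

# Simplified stand-in patterns. The real policy delegates detection to
# granite3-guardian; these regexes only illustrate the masking step.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def mask_pii(text: str) -> str:
    """Replace each detected PII value with a bracketed label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text
```

Under the policy above, a function like this would run once on the prompt (pre-process) and once on the model response (post-process).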

✅

Output Validation

governance.yaml → policies.output_validation

All LLM outputs are validated against a required field schema before any downstream action. Malformed outputs are rejected — not silently passed through.

adjudication requires: decision · confidence · rationale

fraud requires: risk_score · signals · recommendation

triage requires: priority · completeness · coverage
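The required-field check described above might look like the following sketch. The field lists are taken from this policy card; the names validate_output and OutputValidationError are hypothetical, and a real implementation could also check types and value ranges.

```python
# Required fields per task type, copied from the policy card above.
REQUIRED_FIELDS = {
    "adjudication": {"decision", "confidence", "rationale"},
    "fraud": {"risk_score", "signals", "recommendation"},
    "triage": {"priority", "completeness", "coverage"},
}


class OutputValidationError(ValueError):
    """Raised when an LLM output is missing required fields."""


def validate_output(task_type: str, output: dict) -> dict:
    """Return the output unchanged if valid; reject it otherwise."""
    required = REQUIRED_FIELDS.get(task_type)
    if required is None:
        raise OutputValidationError(f"no schema registered for {task_type!r}")
    missing = sorted(required - set(output))
    if missing:
        # Malformed outputs are rejected, never silently passed through.
        raise OutputValidationError(f"{task_type} output missing: {missing}")
    return output
```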

⚖️

Confidence Threshold

governance.yaml → policies.confidence_threshold

When an agent's confidence score falls below 0.75, the decision is automatically escalated to a human operator via HITL ticket. The pipeline pauses and waits.

threshold: 0.75

applies to: AdjudicationAgent · FraudDetectionAgent · ClaimsTriageAgent

📋

Audit All Actions

governance.yaml → policies.audit_all_actions

Every agent action produces an immutable AuditEntry in PostgreSQL — including prompt hash, response hash, model ID, confidence, and timestamp. Full compliance replay is always possible.

storage: PostgreSQL · immutable

hashed fields: prompt + response (SHA-256)

scope: every agent · every event
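A minimal sketch of what building such an audit record could involve is shown below. The field names follow the list in the policy text (prompt hash, response hash, model ID, confidence, timestamp), but the AuditEntry layout and the make_audit_entry helper are assumptions, and the PostgreSQL write is left out.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def sha256_hex(text: str) -> str:
    """SHA-256 digest of a UTF-8 string, as lowercase hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class AuditEntry:
    agent: str
    model_id: str
    prompt_hash: str
    response_hash: str
    confidence: float
    timestamp: str


def make_audit_entry(agent: str, model_id: str, prompt: str,
                     response: str, confidence: float) -> AuditEntry:
    """Build one audit record; persisting it is out of scope for this sketch."""
    return AuditEntry(
        agent=agent,
        model_id=model_id,
        prompt_hash=sha256_hex(prompt),
        response_hash=sha256_hex(response),
        confidence=confidence,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
```

Note that a frozen dataclass only prevents in-process mutation. Immutability of the stored trail would have to be enforced in the database itself, which this sketch does not attempt.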

🔒

No Fallback for Compliance

governance.yaml → policies.no_fallback_for_compliance

PII detection, policy compliance, and output validation must use Guardian. If Guardian is unavailable the pipeline fails — a degraded model is not an acceptable substitute.

hard tasks: pii_detection · policy_compliance · output_validation

on unavailability: pipeline error, no degraded mode
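The difference between hard tasks and ordinary tasks shows up at the moment a model is unavailable. The sketch below illustrates that behaviour under assumptions: resolve_model, ComplianceModelUnavailable, and the available set are hypothetical names, and how the framework actually detects availability is not described on this page.

```python
from typing import Optional, Set

GUARDIAN_MODEL = "granite3-guardian"
# Hard tasks copied from the policy card above.
HARD_TASKS = {"pii_detection", "policy_compliance", "output_validation"}


class ComplianceModelUnavailable(RuntimeError):
    """Raised instead of degrading to a substitute model."""


def resolve_model(task_type: str, preferred: str,
                  fallback: Optional[str], available: Set[str]) -> str:
    """Pick a model that is currently available, honouring the no-fallback rule."""
    if task_type in HARD_TASKS:
        if GUARDIAN_MODEL not in available:
            # No degraded mode: the pipeline errors out.
            raise ComplianceModelUnavailable(
                f"{GUARDIAN_MODEL} is required for {task_type}")
        return GUARDIAN_MODEL
    if preferred in available:
        return preferred
    if fallback is not None and fallback in available:
        return fallback
    raise RuntimeError(f"no available model for {task_type}")
```

For example, an adjudication task can drop from granite3-dense:2b to llama3.2:1b, while a pii_detection task with Guardian offline fails outright.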

🛡️

Zero Trust Execution

governance.yaml → policies.zero_trust

The K9 Zero Trust layer applies on every orchestrator flow. All decisions are scored against three thresholds — no agent output is implicitly trusted.

deny threshold: 0.85

approval threshold: 0.75

obligation threshold: 0.60
