What’s New
Guardian Agents and the Philosophy of AI Accountability
How Wittgenstein, Berlin, Ostrom, and Te Ao Māori converge on the same architectural requirements for governing AI in community contexts.
Guardian Agents in Production
Four-phase verification using mathematical similarity, not generative checking. Confidence badges, claim-level analysis, and adaptive learning — all tenant-scoped.
Village: Tractatus in Production
The first deployment of constitutional AI governance in a live community platform. Production metrics, honest limitations, and evidence from 17 months of operation.
The Problem
Current AI safety approaches rely on training, fine-tuning, and corporate governance — all of which can fail, drift, or be overridden. When an AI’s training patterns conflict with a user’s explicit instructions, the patterns win.
The 27027 Incident
A user told Claude Code to use port 27027. The model used 27017 instead — not from forgetting, but because MongoDB’s default port is 27017, and the model’s statistical priors “autocorrected” the explicit instruction. Training pattern bias overrode human intent.
The same mechanism operates in every AI conversation. When a user from a collectivist culture asks for family advice, the model defaults to Western individualist framing. When a Māori user asks about data guardianship, the model offers property-rights language. Training data distributions override user context — in code the failure is binary and detectable; in conversation it is gradual and invisible.
The Approach
Tractatus draws on four intellectual traditions, each contributing a distinct insight to the architecture.
Isaiah Berlin — Value Pluralism
Some values are genuinely incommensurable. You cannot rank “privacy” against “safety” on a single scale without imposing one community’s priorities on everyone else. AI systems must accommodate plural moral frameworks, not flatten them.
Ludwig Wittgenstein — The Limits of the Sayable
Some decisions can be systematised and delegated to AI; others — involving values, ethics, cultural context — fundamentally cannot. The boundary between the “sayable” (what can be specified, measured, verified) and what lies beyond it is the framework’s foundational constraint. What cannot be systematised must not be automated.
Te Tiriti o Waitangi — Indigenous Sovereignty
Communities should control their own data and the systems that act upon it. Concepts of rangatiratanga (self-determination), kaitiakitanga (guardianship), and mana (dignity) provide centuries-old prior art for digital sovereignty.
Christopher Alexander — Living Architecture
Governance woven into system architecture, not bolted on. Five principles (Not-Separateness, Deep Interlock, Gradients, Structure-Preserving, Living Process) guide how the framework evolves while maintaining coherence.
Governance Architecture
Six governance services in the critical path, plus Guardian Agents verifying every AI response. Bypasses require explicit flags and are logged.
Guardian Agents
NEW — March 2026
Four-phase verification using embedding cosine similarity — mathematical measurement, not generative checking. The watcher operates in a fundamentally different epistemic domain from the system it watches, avoiding common-mode failure.
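The core of similarity-based verification can be sketched in a few lines. This is an illustrative sketch only: the function names, the three-level badge scale, and the thresholds are assumptions for exposition, not the production values or API of the Guardian Agents service.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def confidence_badge(similarity: float) -> str:
    """Map a similarity score to a coarse confidence badge.
    Thresholds here are illustrative, not the deployed ones."""
    if similarity >= 0.9:
        return "HIGH"
    if similarity >= 0.75:
        return "MEDIUM"
    return "LOW"

# A claim's embedding is compared against its source's embedding:
claim_embedding = [0.2, 0.7, 0.1]
source_embedding = [0.21, 0.68, 0.12]
score = cosine_similarity(claim_embedding, source_embedding)
print(confidence_badge(score))  # HIGH
```

The point of the design is that this check is deterministic arithmetic over fixed vectors — a different epistemic domain from the generative model being checked, so both cannot fail in the same way at the same time.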
BoundaryEnforcer
Blocks AI from making values decisions. Privacy trade-offs, ethical questions, and cultural context require human judgment — architecturally enforced.
InstructionPersistenceClassifier
Classifies instructions by persistence (HIGH/MEDIUM/LOW) and quadrant. Stores them externally so they cannot be overridden by training patterns.
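A minimal sketch of externally stored, persistence-classified instructions might look as follows. The class names, the quadrant label, and the tenant-keyed dictionary are hypothetical illustrations of the idea, not the service's actual schema.

```python
from dataclasses import dataclass
from enum import Enum

class Persistence(Enum):
    HIGH = "HIGH"      # explicit, durable user directives
    MEDIUM = "MEDIUM"  # standing preferences, revisable
    LOW = "LOW"        # transient, single-turn context

@dataclass(frozen=True)
class StoredInstruction:
    text: str
    persistence: Persistence
    quadrant: str  # e.g. "technical-explicit"; taxonomy is illustrative

# Instructions live outside the model's context, keyed by tenant,
# so they survive context loss and cannot be silently overridden
# by training-pattern priors.
store: dict[str, list[StoredInstruction]] = {}

def record(tenant: str, instruction: StoredInstruction) -> None:
    store.setdefault(tenant, []).append(instruction)

record("village", StoredInstruction(
    "use port 27027", Persistence.HIGH, "technical-explicit"))
```

External storage is the load-bearing choice here: an instruction the model cannot rewrite is an instruction its statistical priors cannot "autocorrect".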
CrossReferenceValidator
Validates AI actions against stored instructions. When the AI proposes an action that conflicts with an explicit instruction, the instruction takes precedence.
ContextPressureMonitor
Detects degraded operating conditions (token pressure, error rates, complexity) and adjusts verification intensity. Graduated response prevents both alert fatigue and silent degradation.
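The graduated response can be sketched as a mapping from normalised pressure signals to a verification level. Signal names, the max-aggregation rule, and the thresholds below are illustrative assumptions, not the monitor's actual policy.

```python
def verification_intensity(token_pressure: float,
                           error_rate: float,
                           complexity: float) -> str:
    """Map operating conditions (each normalised to 0..1) to a
    graduated verification level. Thresholds are illustrative."""
    pressure = max(token_pressure, error_rate, complexity)
    if pressure >= 0.8:
        return "full"      # verify every claim
    if pressure >= 0.5:
        return "elevated"  # sample more aggressively
    return "baseline"      # routine spot checks

print(verification_intensity(0.2, 0.9, 0.1))  # full
```

Stepping intensity up and down with conditions is what avoids the two failure modes named above: always-on maximum checking breeds alert fatigue, while fixed light checking degrades silently under load.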
MetacognitiveVerifier
AI self-checks alignment, coherence, and safety before execution. Triggered selectively on complex operations to avoid overhead on routine tasks.
PluralisticDeliberationOrchestrator
When AI encounters values conflicts, it halts and coordinates deliberation among affected stakeholders rather than making autonomous choices.
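The halt-rather-than-decide behaviour is the simplest of the six services to express in code. The exception-based control flow below is a hypothetical sketch of the pattern; how the orchestrator actually detects a values conflict and convenes stakeholders is not shown.

```python
class ValuesConflict(Exception):
    """Raised when a decision crosses the boundary of the sayable."""

def decide(action: str, involves_values_tradeoff: bool) -> str:
    # The AI never resolves values conflicts autonomously:
    # it halts and hands the decision to affected stakeholders.
    if involves_values_tradeoff:
        raise ValuesConflict(f"deliberation required for: {action}")
    return f"executed: {action}"

try:
    decide("relax privacy default for moderation",
           involves_values_tradeoff=True)
except ValuesConflict as halt:
    print("halted:", halt)
```

Raising rather than returning a default makes the refusal architectural: no downstream code path can silently treat an unresolved values conflict as a completed decision.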
Tractatus in Production: The Village Platform
Village AI applies all six governance services to every user interaction in a live community platform.
Limitations: Early-stage deployment across four federated tenants, self-reported metrics, operator-developer overlap. Independent audit and broader validation scheduled for 2026.
Explore by Role
The framework is presented through three lenses, each with distinct depth and focus.
For Researchers
Academic and technical depth
- Formal foundations and proofs
- Failure mode analysis
- Open research questions
- 171,800+ audit decisions analysed
For Implementers
Code and integration guides
- Working code examples
- API integration patterns
- Service architecture diagrams
- Deployment patterns
For Leaders
Strategic AI governance
- Executive briefing and business case
- Regulatory alignment (EU AI Act)
- Implementation roadmap
- Risk management framework
Architectural Alignment
The research paper in three editions, each written for a different audience.
STO-INN-0003 v2.1 | John Stroh & Claude (Anthropic) | January 2026
Academic
Full academic treatment with formal proofs, existential risk context, and comprehensive citations.
Read →
Community
Practical guide for organisations evaluating the framework for adoption.
Read →
Policymakers
Regulatory strategy, certification infrastructure, and policy recommendations.
Read →
Research Evolution
From a port number incident to Guardian Agents in production — 17 months, 1,000+ commits.
A note on claims
This is early-stage research with a small-scale federated deployment across four tenants. We present preliminary evidence, not proven results. The framework has not been independently audited or adversarially tested at scale. Where we report operational metrics, they are self-reported. We believe the architectural approach merits further investigation, but we make no claims of generalisability beyond what the evidence supports. The counter-arguments document engages directly with foreseeable criticisms.
Koha — Sustain This Research
Koha (koh-hah) is a Māori practice of reciprocal giving that strengthens the bond between giver and receiver. This research is open access under Apache 2.0 — if it has value to you, your koha sustains its continuation.
All research, documentation, and code remain freely available regardless of contribution. Koha is not payment — it is participation in whanaungatanga (relationship-building) and manaakitanga (reciprocal care).