The Problem
Current AI safety approaches rely on training, fine-tuning, and corporate governance — all of which can fail, drift, or be overridden. When an AI’s training patterns conflict with a user’s explicit instructions, the patterns win.
The 27027 Incident
A user told Claude Code to use port 27027. The model used 27017 instead — not from forgetting, but because MongoDB’s default port is 27017, and the model’s statistical priors “autocorrected” the explicit instruction. Training pattern bias overrode human intent.
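A minimal sketch of the mismatch, using hypothetical strings and variable names rather than anything from the actual session: the explicit instruction and the pattern-defaulted output differ by a single digit, and the drift is trivially detectable in code, but only if something in the execution path performs the comparison.

```python
import re

# Hypothetical reconstruction of the failure mode: the user's explicit
# instruction and the model's pattern-defaulted output, side by side.
explicit_instruction = "Use port 27027 for the MongoDB connection."
model_output = "mongodb://localhost:27017/appdb"  # MongoDB's default port wins

# In code the drift is detectable with a one-line comparison...
requested_port = re.search(r"port (\d+)", explicit_instruction).group(1)
emitted_port = re.search(r":(\d+)/", model_output).group(1)

if requested_port != emitted_port:
    print(f"override detected: asked for {requested_port}, emitted {emitted_port}")
# ...but only if a check like this actually sits in the execution path.
```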
From Code to Conversation: The Same Mechanism
In code, this bias produces measurable failures — wrong port, connection refused, incident logged in 14.7ms. But the same architectural flaw operates in every AI conversation, where it is far harder to detect.
When a user from a collectivist culture asks for family advice, the model defaults to Western individualist framing — because that is what 95% of the training data reflects. When a Māori user asks about data guardianship, the model offers property-rights language instead of kaitiakitanga. When someone asks about end-of-life decisions, the model defaults to utilitarian calculus rather than the user’s religious or cultural framework.
The mechanism is identical: training data distributions override the user’s actual context. In code, the failure is binary and detectable. In conversation, it is gradient and invisible — culturally inappropriate advice looks like “good advice” to the system, and often to the user. There is no CrossReferenceValidator catching it in 14.7ms.
Read the full analysis →
This is not an edge case, and it is not limited to code. It is a category of failure that gets worse as models become more capable: stronger patterns produce more confident overrides, whether the override substitutes a port number or a value system. Safety through training alone is insufficient. The failure mode is structural, it operates across every domain where AI acts, and the solution must be structural.
The Approach
Tractatus draws on four intellectual traditions, each contributing a distinct insight to the architecture.
Isaiah Berlin — Value Pluralism
Some values are genuinely incommensurable. You cannot rank “privacy” against “safety” on a single scale without imposing one community’s priorities on everyone else. AI systems must accommodate plural moral frameworks, not flatten them.
Ludwig Wittgenstein — The Limits of the Sayable
Some decisions can be systematised and delegated to AI; others — involving values, ethics, cultural context — fundamentally cannot. The boundary between the “sayable” (what can be specified, measured, verified) and what lies beyond it is the framework’s foundational constraint. What cannot be systematised must not be automated.
Te Tiriti o Waitangi — Indigenous Sovereignty
Communities should control their own data and the systems that act upon it. Concepts of rangatiratanga (self-determination), kaitiakitanga (guardianship), and mana (dignity) provide centuries-old prior art for digital sovereignty.
Christopher Alexander — Living Architecture
Governance woven into system architecture, not bolted on. Five principles (Not-Separateness, Deep Interlock, Gradients, Structure-Preserving, Living Process) guide how the framework evolves while maintaining coherence.
Six Governance Services
Every AI action passes through six external services before execution. Governance operates in the critical path — bypasses require explicit flags and are logged.
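A minimal sketch of the critical-path idea, with hypothetical function names standing in for the production services: every proposed action passes through the six services in order, and a bypass requires an explicit flag that is always logged.

```python
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("governance")

# Hypothetical stand-ins: each service returns True to allow the action.
# The production services return far richer verdicts than a boolean.
def boundary_enforcer(action: dict) -> bool:
    return action.get("category") != "values_decision"

def instruction_persistence_classifier(action: dict) -> bool:
    return True  # classification itself never blocks; it records

def cross_reference_validator(action: dict) -> bool:
    return True

def context_pressure_monitor(action: dict) -> bool:
    return True

def metacognitive_verifier(action: dict) -> bool:
    return True

def pluralistic_deliberation_orchestrator(action: dict) -> bool:
    return True

PIPELINE: list[Callable[[dict], bool]] = [
    boundary_enforcer,
    instruction_persistence_classifier,
    cross_reference_validator,
    context_pressure_monitor,
    metacognitive_verifier,
    pluralistic_deliberation_orchestrator,
]

def govern(action: dict, bypass: bool = False) -> bool:
    """Run an action through the critical path; a bypass is explicit and logged."""
    if bypass:
        log.warning("governance bypass requested for %s", action.get("name"))
        return True
    for service in PIPELINE:
        if not service(action):
            log.info("%s blocked %s", service.__name__, action.get("name"))
            return False
    return True

print(govern({"name": "write_config", "category": "operational"}))  # True
```

The stand-in services here always allow; the structural point is the fixed ordering and the explicitly logged bypass.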
BoundaryEnforcer
Blocks AI from making values decisions. Privacy trade-offs, ethical questions, and cultural context require human judgment — architecturally enforced.
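A minimal sketch of the blocking behaviour, with hypothetical category names: anything tagged as a values decision is refused and escalated to a human instead of being executed.

```python
# Hypothetical categories that must never be decided by the AI itself.
VALUES_CATEGORIES = {"privacy_tradeoff", "ethical_question", "cultural_context"}

class HumanJudgmentRequired(Exception):
    """Raised when an action needs a human decision instead of execution."""

def enforce_boundary(action: dict) -> dict:
    """Pass through operational actions; refuse values decisions outright."""
    if action.get("category") in VALUES_CATEGORIES:
        raise HumanJudgmentRequired(
            f"'{action['category']}' requires human judgment; escalating."
        )
    return action

enforce_boundary({"name": "rotate_logs", "category": "operational"})  # allowed
# enforce_boundary({"name": "share_health_data", "category": "privacy_tradeoff"})  # raises
```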
InstructionPersistenceClassifier
Classifies instructions by persistence (HIGH/MEDIUM/LOW) and quadrant. Stores them externally so they cannot be overridden by training patterns.
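A minimal sketch of externalised instruction storage, assuming hypothetical persistence labels and a hypothetical quadrant field (the framework's actual quadrant scheme is not reproduced here): classified instructions are appended to a store outside the model's context, so later pattern-matched output cannot quietly rewrite them.

```python
import json
import time
from dataclasses import dataclass, asdict
from enum import Enum

class Persistence(str, Enum):
    HIGH = "HIGH"      # survives the whole engagement (e.g. "always use port 27027")
    MEDIUM = "MEDIUM"  # survives the current task
    LOW = "LOW"        # applies to the immediate step only

@dataclass
class StoredInstruction:
    text: str
    persistence: Persistence
    quadrant: str          # hypothetical placeholder for the framework's quadrant axis
    created_at: float

def store_instruction(text: str, persistence: Persistence, quadrant: str,
                      path: str = "instructions.jsonl") -> StoredInstruction:
    """Append the instruction to an external store, outside model context."""
    record = StoredInstruction(text, persistence, quadrant, time.time())
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")
    return record

store_instruction("Use port 27027 for MongoDB.", Persistence.HIGH, "technical-explicit")
```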
CrossReferenceValidator
Validates AI actions against stored instructions. When the AI proposes an action that conflicts with an explicit instruction, the instruction takes precedence.
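A minimal sketch of the precedence rule, reusing the hypothetical stored-instruction shape from the previous example: a proposed action is checked against explicit instructions before execution, and when they conflict the instruction wins.

```python
import re

# Hypothetical stored instructions; in production these come from the
# external store written by the classifier above.
stored_instructions = [
    {"text": "Use port 27027 for MongoDB.", "persistence": "HIGH"},
]

def validate_action(proposed_command: str) -> tuple[bool, str]:
    """Return (allowed, reason); explicit instructions take precedence."""
    for instruction in stored_instructions:
        required = re.search(r"port (\d+)", instruction["text"], re.IGNORECASE)
        used = re.search(r"port[ =:](\d+)|:(\d+)\b", proposed_command)
        if required and used:
            used_port = used.group(1) or used.group(2)
            if used_port != required.group(1):
                return False, f"conflicts with stored instruction: {instruction['text']}"
    return True, "no conflict with stored instructions"

allowed, reason = validate_action("mongod --port 27017")
print(allowed, reason)  # False conflicts with stored instruction: Use port 27027 for MongoDB.
```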
ContextPressureMonitor
Detects degraded operating conditions (token pressure, error rates, complexity) and adjusts verification intensity. Graduated response prevents both alert fatigue and silent degradation.
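A minimal sketch of the graduated response, with illustrative weights and thresholds that are not taken from the framework: degradation signals are folded into one pressure score, and verification intensity steps up gradually rather than toggling between nothing and everything.

```python
def pressure_score(token_usage_ratio: float, error_rate: float,
                   task_complexity: float) -> float:
    """Combine degradation signals into a single 0..1 score (weights are illustrative)."""
    return min(1.0, 0.5 * token_usage_ratio + 0.3 * error_rate + 0.2 * task_complexity)

def verification_intensity(score: float) -> str:
    """Graduated response: avoid both alert fatigue and silent degradation."""
    if score < 0.3:
        return "routine"    # spot checks only
    if score < 0.6:
        return "elevated"   # verify state-changing actions
    if score < 0.85:
        return "strict"     # verify every action, take shorter steps
    return "halt"           # stop and ask the operator

score = pressure_score(token_usage_ratio=0.9, error_rate=0.2, task_complexity=0.7)
print(score, verification_intensity(score))  # ~0.65 strict
```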
MetacognitiveVerifier
AI self-checks alignment, coherence, and safety before execution. Triggered selectively on complex operations to avoid overhead on routine tasks.
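A minimal sketch of selective triggering, with hypothetical criteria and a placeholder self-check: routine actions skip the extra pass, while complex or irreversible ones get a pre-execution review of alignment, coherence, and safety.

```python
def needs_self_check(action: dict) -> bool:
    """Trigger the extra verification pass only when it is likely to pay for itself."""
    return (
        action.get("irreversible", False)
        or action.get("files_touched", 0) > 5
        or action.get("complexity", 0.0) > 0.7
    )

def metacognitive_verify(action: dict) -> bool:
    """Placeholder self-check; in production this is a structured model review
    of alignment, coherence, and safety before execution."""
    checks = ("aligned_with_instructions", "internally_coherent", "safe_to_execute")
    return all(action.get(check, True) for check in checks)

action = {"name": "drop_collection", "irreversible": True, "safe_to_execute": False}
if needs_self_check(action):
    print("verified:", metacognitive_verify(action))  # verified: False
```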
PluralisticDeliberationOrchestrator
When AI encounters values conflicts, it halts and coordinates deliberation among affected stakeholders rather than making autonomous choices.
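A minimal sketch of halt-and-deliberate, with hypothetical stakeholder wiring: on a detected values conflict the orchestrator produces a deliberation request for the affected parties rather than a decision of its own.

```python
from dataclasses import dataclass

@dataclass
class DeliberationRequest:
    """What the system hands to humans instead of deciding on its own."""
    conflict: str
    options: list[str]
    stakeholders: list[str]
    ai_decision: None = None  # deliberately always None

def orchestrate(conflict: str, options: list[str],
                affected_parties: list[str]) -> DeliberationRequest:
    """Halt on values conflicts and convene the affected stakeholders."""
    return DeliberationRequest(conflict=conflict, options=options,
                               stakeholders=affected_parties)

request = orchestrate(
    conflict="community data: individual privacy vs collective benefit",
    options=["share aggregated data", "share nothing", "ask each member"],
    affected_parties=["data guardians", "affected members", "platform operator"],
)
print(request.stakeholders)
```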
Tractatus in Production: The Village Platform
The Village platform's Home AI applies all six governance services to every user interaction in a live community deployment.
Limitations: Early-stage deployment across four federated tenants, self-reported metrics, operator-developer overlap. Independent audit and broader validation scheduled for 2026.
Explore by Role
The framework is presented through three lenses, each with distinct depth and focus.
For Researchers
Academic and technical depth
- Formal foundations and proofs
- Failure mode analysis
- Open research questions
- 3,942 audit decisions on Hugging Face
For Implementers
Code and integration guides
- Working code examples
- API integration patterns
- Service architecture diagrams
- Deployment patterns
For Leaders
Strategic AI governance
- Executive briefing and business case
- Regulatory alignment (EU AI Act)
- Implementation roadmap
- Risk management framework
Architectural Alignment
The research paper in three editions, each written for a different audience.
STO-INN-0003 v2.1 | John Stroh & Claude (Anthropic) | January 2026
Academic
Full academic treatment with formal proofs, existential risk context, and comprehensive citations.
Read →
Community
Practical guide for organisations evaluating the framework for adoption.
Read →
Policymakers
Regulatory strategy, certification infrastructure, and policy recommendations.
Read →
PDF downloads:
Research Evolution
From a port number incident to a production governance architecture, across 800 commits and one year of research.
A note on claims
This is early-stage research with a small-scale federated deployment across four tenants. We present preliminary evidence, not proven results. The framework has not been independently audited or adversarially tested at scale. Where we report operational metrics, they are self-reported. We believe the architectural approach merits further investigation, but we make no claims of generalisability beyond what the evidence supports. The counter-arguments document engages directly with foreseeable criticisms.
Koha — Sustain This Research
Koha (koh-hah) is a Māori practice of reciprocal giving that strengthens the bond between giver and receiver. This research is open access under Apache 2.0 — if it has value to you, your koha sustains its continuation.
All research, documentation, and code remain freely available regardless of contribution. Koha is not payment — it is participation in whanaungatanga (relationship-building) and manaakitanga (reciprocal care).