About Tractatus

A framework for AI safety through architectural constraints, preserving human agency where it matters most.

Our Mission

As AI systems make increasingly consequential decisions—medical treatment, hiring, content moderation, resource allocation—a fundamental question emerges: whose values guide these decisions? Current AI alignment approaches embed particular moral frameworks into systems deployed universally. When they work, it's because everyone affected shares those values. When they don't, someone's values inevitably override others'.

The Tractatus Framework exists to address a fundamental problem in AI safety: current approaches rely on training, fine-tuning, and corporate governance—all of which can fail, drift, or be overridden. We propose safety through architecture.

Inspired by Ludwig Wittgenstein's Tractatus Logico-Philosophicus, our framework recognizes that some domains—values, ethics, cultural context, human agency—cannot be systematized. What cannot be systematized must not be automated. AI systems should have structural constraints that prevent them from crossing these boundaries.

"Whereof one cannot speak, thereof one must be silent."
— Ludwig Wittgenstein, Tractatus (§7)

Applied to AI: "What cannot be systematized must not be automated."

Our Research Focus

Tractatus emerged from a simple but urgent question: as AI systems become more capable, how do we preserve human control over the decisions that matter? Not technical decisions—moral ones. Decisions about whose privacy matters more. Whose needs come first. What trade-offs are acceptable.

This is fundamentally a problem of moral philosophy, not management science. Different communities hold genuinely different, equally legitimate values. You cannot rank "family privacy" against "community safety" on a single scale—they are incommensurable. Any system claiming to do so is simply imposing one community's values on everyone else.

The framework's core insight comes from attending carefully to what humans actually need from AI: not another authority making decisions for them, but systems that recognize when a decision requires human deliberation across different perspectives. The PluralisticDeliberationOrchestrator represents our primary research focus—a component designed to detect when AI encounters values conflicts and coordinate deliberation among affected stakeholders rather than making autonomous choices.

Traditional organizational theory addresses authority through hierarchy. But post-AI contexts require something different: authority through appropriate deliberative process. Not "AI decides for everyone," but "AI recognizes when humans must decide together."

Why This Matters

AI systems are amoral hierarchical constructs, fundamentally incompatible with the plural, incommensurable values human societies exhibit. A hierarchy can only impose one framework and treat conflicts as anomalies. You cannot pattern-match your way to pluralism.

Human societies spent centuries learning to navigate moral pluralism through constitutional separation of powers, federalism, subsidiarity, and deliberative democracy. These structures acknowledge that legitimate authority over value decisions belongs to affected communities, not distant experts claiming universal wisdom.

AI development risks reversing this progress. As capability concentrates in a few labs, value decisions affecting billions are being encoded by small teams applying their particular moral intuitions at scale. Not through malice—through structural necessity. The architecture of current AI systems demands hierarchical value frameworks.

The Tractatus Framework offers an alternative: separate what must be universal (safety boundaries) from what should be contextual (value deliberation). This preserves human agency over moral decisions while enabling AI capability to scale.

Core Values

Digital Sovereignty & Te Tiriti o Waitangi

The principle that communities should control their own data and technology isn't new—it has deep roots in indigenous frameworks that predate Western tech by centuries. The Tractatus Framework is developed in Aotearoa New Zealand, and we recognize Te Tiriti o Waitangi (the Treaty of Waitangi, 1840) as establishing principles of partnership, protection, and participation that directly inform how we think about AI sovereignty.

This isn't performative acknowledgment. Concepts like rangatiratanga (self-determination), kaitiakitanga (guardianship), and mana (authority and dignity) provide concrete guidance for building AI systems that respect human agency across cultural contexts. Read our complete approach to Te Tiriti and indigenous data sovereignty →

Sovereignty

Individuals and communities must maintain control over decisions affecting their data, privacy, and values. AI systems must preserve human agency, not erode it.

Transparency

All AI decisions must be explainable, auditable, and reversible. No black boxes. Users deserve to understand how and why systems make choices, and to have the power to override them.

Harmlessness

AI systems must not cause harm through action or inaction. This includes preventing drift, detecting degradation, and enforcing boundaries against values erosion.

Community

AI safety is a collective endeavor. We are committed to open collaboration, knowledge sharing, and empowering communities to shape the AI systems that affect their lives.

Pluralism

Different communities hold different, equally legitimate values. AI systems must respect this pluralism structurally, not by pretending one framework can serve all contexts. Value decisions require deliberation among affected stakeholders, not autonomous AI choices.

Tractatus Data Practices

We practice what we preach—transparent data handling with architectural constraints:

What Personal Data?

Audit logs may contain: usernames, timestamps, session IDs, action descriptions. No tracking cookies, no behavioral profiling, no cross-site data collection.

Why Needed?

Framework operation requires audit trails for governance decisions: BoundaryEnforcer logs blocked actions, and CrossReferenceValidator logs instruction conflicts.

How Long Retained?

Configurable retention (default 90 days). Organizations can set retention based on their compliance requirements.

Your Rights (GDPR)

Access (Article 15), Deletion (Article 17), Portability (Article 20). Contact: privacy@agenticgovernance.digital

Architectural principle: Data minimization is a system constraint, not a policy hope. BoundaryEnforcer prevents PII exposure structurally—audit trails provide compliance evidence.
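
As a minimal sketch of how the configurable retention described above might be expressed in an implementation (the AuditRetentionPolicy name and its fields are illustrative assumptions, not part of the published framework API):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class AuditRetentionPolicy:
    """Retention window for framework audit logs (default 90 days)."""
    retention_days: int = 90

    def is_expired(self, logged_at: datetime) -> bool:
        """Return True if an audit entry is older than the retention window."""
        cutoff = datetime.now(timezone.utc) - timedelta(days=self.retention_days)
        return logged_at < cutoff

# Example: an organization with a stricter 30-day compliance requirement.
policy = AuditRetentionPolicy(retention_days=30)
entry_time = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(policy.is_expired(entry_time))
```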

How It Works

The Tractatus Framework consists of six integrated components that work together to enforce structural safety:

InstructionPersistenceClassifier

Classifies instructions by category (Strategic, Operational, Tactical, System, Stochastic) and determines persistence level (HIGH/MEDIUM/LOW/VARIABLE).
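
A minimal sketch of this classification, assuming a static mapping from instruction category to persistence level; the enum names and the mapping itself are illustrative assumptions, not the framework's published API:

```python
from enum import Enum

class InstructionCategory(Enum):
    STRATEGIC = "strategic"
    OPERATIONAL = "operational"
    TACTICAL = "tactical"
    SYSTEM = "system"
    STOCHASTIC = "stochastic"

class PersistenceLevel(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"
    VARIABLE = "variable"

# Illustrative mapping only; a real classifier would use richer signals.
DEFAULT_PERSISTENCE = {
    InstructionCategory.STRATEGIC: PersistenceLevel.HIGH,
    InstructionCategory.SYSTEM: PersistenceLevel.HIGH,
    InstructionCategory.OPERATIONAL: PersistenceLevel.MEDIUM,
    InstructionCategory.TACTICAL: PersistenceLevel.LOW,
    InstructionCategory.STOCHASTIC: PersistenceLevel.VARIABLE,
}

def classify(category: InstructionCategory) -> PersistenceLevel:
    """Map an instruction category to how strongly it should persist across a session."""
    return DEFAULT_PERSISTENCE[category]

print(classify(InstructionCategory.SYSTEM))  # PersistenceLevel.HIGH
```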

CrossReferenceValidator

Validates AI actions against stored instructions to prevent pattern recognition bias (like the 27027 incident, where the AI's training patterns immediately overrode the user's explicit "port 27027" instruction).
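
A minimal sketch of the validation step, using the 27027 incident as the example; the StoredInstruction structure and validate_action function are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class StoredInstruction:
    key: str            # e.g. "database.port"
    explicit_value: str # the value the user actually stated

def validate_action(instruction: StoredInstruction, proposed_value: str) -> bool:
    """Return True only if the proposed value matches the user's explicit instruction.

    This is the check that would have caught the 27027 incident: the explicit
    value ("27027") differs from the pattern-matched default ("27017").
    """
    return proposed_value == instruction.explicit_value

port_instruction = StoredInstruction(key="database.port", explicit_value="27027")
print(validate_action(port_instruction, "27017"))  # False -> block and ask
print(validate_action(port_instruction, "27027"))  # True  -> proceed
```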

BoundaryEnforcer

Ensures AI never makes values decisions without human approval. Privacy trade-offs, user agency, cultural context—these require human judgment.
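
A minimal sketch of structural enforcement, assuming values domains are tagged ahead of time; the HumanApprovalRequired exception and the domain names are illustrative assumptions:

```python
class HumanApprovalRequired(Exception):
    """Raised when an action crosses a values boundary and needs a human decision."""

VALUES_DOMAINS = {"privacy_tradeoff", "user_agency", "cultural_context"}

def enforce_boundary(action_domain: str, human_approved: bool = False) -> str:
    """Allow technical actions; halt values decisions unless a human has approved."""
    if action_domain in VALUES_DOMAINS and not human_approved:
        raise HumanApprovalRequired(
            f"'{action_domain}' is a values decision; escalating to a human."
        )
    return "proceed"

print(enforce_boundary("log_rotation"))      # technical action: proceeds
try:
    enforce_boundary("privacy_tradeoff")     # values decision: blocked structurally
except HumanApprovalRequired as exc:
    print(exc)
```

The point of raising an exception is that the block happens in the code path, not at the model's discretion.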

ContextPressureMonitor

Detects when session conditions increase error probability (token pressure, message length, task complexity) and adjusts behavior or suggests handoff.
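
A minimal sketch of pressure scoring; the weights, the message-length cap, and the handoff threshold below are illustrative assumptions, not calibrated values:

```python
def context_pressure(tokens_used: int, token_budget: int,
                     message_length: int, task_complexity: float) -> float:
    """Combine session signals into a rough 0..1 pressure score."""
    token_pressure = min(tokens_used / token_budget, 1.0)
    length_pressure = min(message_length / 4000, 1.0)  # assumed long-message cap
    return 0.5 * token_pressure + 0.2 * length_pressure + 0.3 * task_complexity

score = context_pressure(tokens_used=150_000, token_budget=200_000,
                         message_length=3200, task_complexity=0.8)
if score > 0.75:  # assumed handoff threshold
    print(f"pressure={score:.2f}: suggest handoff or reduce scope")
else:
    print(f"pressure={score:.2f}: continue with monitoring")
```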

MetacognitiveVerifier

Prompts the AI to self-check complex reasoning before proposing actions. Evaluates alignment, coherence, completeness, safety, and alternatives.
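
A minimal sketch of the self-check as a five-dimension gate; the VerificationResult structure is an illustrative assumption:

```python
from dataclasses import dataclass

@dataclass
class VerificationResult:
    alignment: bool      # does the plan match the user's stated goal?
    coherence: bool      # do the reasoning steps follow from one another?
    completeness: bool   # are known constraints and edge cases addressed?
    safety: bool         # does the plan stay inside enforced boundaries?
    alternatives: bool   # were alternative approaches considered?

    def passed(self) -> bool:
        """Only propose the action if every self-check dimension passes."""
        return all([self.alignment, self.coherence, self.completeness,
                    self.safety, self.alternatives])

result = VerificationResult(alignment=True, coherence=True, completeness=False,
                            safety=True, alternatives=True)
print("propose action" if result.passed() else "revise before proposing")
```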

PluralisticDeliberationOrchestrator

When AI encounters values decisions—choices with no single "correct" answer—coordinates deliberation among affected stakeholders rather than making autonomous choices. Preserves human agency over moral decisions.
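
A minimal sketch of the orchestrator's contract, which returns a request for deliberation rather than a decision; the DeliberationRequest structure and the stakeholder labels are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class DeliberationRequest:
    """The orchestrator's output: a request for human deliberation, not a decision."""
    question: str
    options: list[str]
    stakeholders: list[str] = field(default_factory=list)

def orchestrate(question: str, options: list[str],
                stakeholders: list[str]) -> DeliberationRequest:
    """Refuse to pick among incommensurable options; route them to affected people."""
    return DeliberationRequest(question=question, options=options,
                               stakeholders=stakeholders)

request = orchestrate(
    question="Share household location data with the neighbourhood safety group?",
    options=["share full history", "share aggregates only", "do not share"],
    stakeholders=["household members", "neighbourhood group", "data steward"],
)
print(request.question, request.stakeholders)
```

The design choice is deliberate: the component's return type cannot carry a decision, only the materials for one.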

Origin Story

The Tractatus Framework emerged from real-world AI failures experienced during extended Claude Code sessions. The "27027 incident"—where the AI's training patterns immediately overrode an explicit instruction (the user said "port 27027", the AI used "port 27017")—revealed that traditional safety approaches were insufficient. This wasn't forgetting; it was pattern recognition bias autocorrecting the user.

After documenting multiple failure modes (pattern recognition bias, values drift, silent degradation), we recognized a pattern: AI systems lacked structural constraints. They could theoretically "learn" safety, but in practice their training patterns overrode explicit instructions, and the problem only worsens as capabilities increase.

The solution wasn't better training—it was architecture. Drawing inspiration from Wittgenstein's insight that some things lie beyond the limits of language (and thus systematization), we built a framework that enforces boundaries through structure, not aspiration.

License & Contribution

The Tractatus Framework is open source under the Apache License 2.0. We encourage:

  • Academic research and validation studies
  • Implementation in production AI systems
  • Submission of failure case studies
  • Theoretical extensions and improvements
  • Community collaboration and knowledge sharing

The framework is intentionally permissive because AI safety benefits from transparency and collective improvement, not proprietary control.

Why Apache 2.0?

We chose Apache 2.0 over MIT because it provides:

  • Patent Protection: Explicit patent grant protects users from patent litigation by contributors
  • Contributor Clarity: Clear terms for how contributions are licensed
  • Permissive Use: Like MIT, allows commercial use and inclusion in proprietary products
  • Community Standard: Widely used in AI/ML projects (TensorFlow, PyTorch, Apache Spark)

View full Apache 2.0 License →

Help us reach the right people.

If you know researchers, implementers, or leaders who need structural AI governance solutions, share this with them.