External Governance Services for AI Systems

Six architectural services addressing pattern override and decision traceability in agentic systems. Development framework for instruction persistence, boundary enforcement, and audit logging.

πŸ—οΈ

Architectural Separation

Governance runs external to AI model

💾

Instruction Persistence

Validates instructions across context

📋

Audit Trail by Design

MongoDB logs with service attribution

How It Works

Pattern Override Challenge

AI systems operating across extended interactions may not maintain instruction consistency as context evolves. Instructions given early can be deprioritized or reinterpreted.

External Architecture Approach

Tractatus services run external to the AI model, providing boundary validation, instruction classification, and audit logging through architectural separation.

Request Flow with Governance

Example: AI decision flow with boundary enforcement, from user request through governance validation to human approval.

Request Flow Sequence: How AI decisions are governed

System Architecture

Six Core Services

→ BoundaryEnforcer (Tractatus 12.1-12.7)
→ InstructionPersistenceClassifier
→ CrossReferenceValidator
→ ContextPressureMonitor
→ MetacognitiveVerifier
→ PluralisticDeliberationOrchestrator

Service Interaction Flow

Tractatus Framework Architecture: Shows how 6 governance services interact in sequence

Service Trigger Conditions

Service Trigger Decision Tree: When each framework service activates

System Architecture

High-level overview showing how the 6 governance services integrate with your application and data layer.

Tractatus System Architecture: Component interaction and data flow

ARCHITECTURAL ENFORCEMENT

Hook Architecture: The Credibility Layer

Tractatus governance is not voluntary compliance. PreToolUse hooks enforce boundaries before AI actions execute, making circumvention architecturally impossible.

The Voluntary Compliance Problem

Traditional AI safety relies on the AI system "choosing" to follow rules embedded in training data or system prompts. These approaches assume the AI will maintain alignment regardless of context pressure or capability.

Tractatus addresses this through architectural enforcement: governance runs in a separate process that the AI cannot access, modify, or bypass.

PreToolUse Hook Execution Flow

Before any file edit, write, or bash command executes, the hook intercepts the action and runs validation checks. Only if ALL checks pass (exit code 0) does the AI action proceed.

Hook Architecture: PreToolUse intercept flow showing validation and hard blocks

Real Enforcement Examples

🚫 GitHub URL Protection (inst_084)

AI attempted to change GitHub repository URL from tractatus-framework (public) to tractatus (private).

BLOCKED: Repository URL modification detected
Reason: Would expose private repo structure

Hook: validate-file-edit.js line 448-505

🚫 CSP Violation (inst_008)

AI attempted to add inline style attribute to HTML element.

BLOCKED: Inline style detected
Pattern: style="margin: 0 auto"

Hook: validate-file-write.js line 68-149

🚫 Governance File Protection (inst_027)

AI attempted direct edit of instruction-history.json.

BLOCKED: inst_027 hard block
Use: scripts/add-instruction.js instead

Hook: framework-audit-hook.js line 204-209

⚠ Human Approval Required (BoundaryEnforcer)

AI action touches values-sensitive file (privacy policy).

ASK HUMAN: Values decision detected
Requires: Explicit human approval

Hook: validate-file-write.js line 239-262

Technical Implementation

🔒 Process Separation

Hooks execute in a separate Node.js process (child_process.spawn) with their own memory space. The AI agent cannot access, modify, or terminate the validation process.

// .claude/hooks.json
{
  "PreToolUse:Edit": {
    "command": "node",
    "args": ["scripts/hook-validators/validate-file-edit.js"]
  },
  "PreToolUse:Write": {
    "command": "node",
    "args": ["scripts/hook-validators/validate-file-write.js"]
  }
}

βš™οΈ Exit Code Enforcement

Hook validators return exit codes that Claude Code's architecture enforces:

  • 0 = PASS → Tool executes
  • 2 = HARD BLOCK → Tool execution halted, error shown to user

// validate-file-edit.js (line 373-382)
const githubUrlCheck = checkGitHubURLProtection();
if (!githubUrlCheck.passed) {
  error(githubUrlCheck.reason);
  console.error(githubUrlCheck.output);
  logMetrics('blocked', githubUrlCheck.reason);
  process.exit(2);  // ← HARD BLOCK: Tool cannot execute
}

📊 Audit Trail

Every hook execution is logged to .claude/metrics/hooks-metrics.json with timestamp, file, result, and reason. Blocks are separately tracked for analysis.

{
  "hook": "validate-file-edit",
  "timestamp": "2025-10-26T22:02:15.XXX",
  "file": "/public/implementer.html",
  "result": "blocked",
  "reason": "inst_084: GitHub repository name change detected"
}

Why Architectural Enforcement Matters

This is the fundamental difference between Tractatus and traditional AI safety approaches:

❌ Voluntary Compliance

  • AI "chooses" to follow rules
  • Can drift under context pressure
  • Pattern recognition may override instructions
  • No verification before execution

✓ Architectural Enforcement

  • Hook intercepts before execution
  • Separate process, cannot be accessed
  • Exit codes enforced by runtime
  • Audit trail of all decisions

Deployment Architecture

Technology Stack

Runtime: Node.js v20+
Database: MongoDB 7.0+
Framework: Express.js
Process: Systemd (recommended)

Production Deployment

Comprehensive deployment guide covering local development, production server configuration, Docker containerization, cloud deployment patterns (AWS, GCP, Kubernetes), database management, monitoring, and security best practices.

📦 Covered in Guide:

  • Local development setup
  • Production server configuration
  • Docker and Docker Compose
  • Cloud deployment (AWS/GCP)

🔧 Also Includes:

  • Kubernetes manifests
  • Database backup and migration
  • SSL/TLS configuration
  • Monitoring and logging

Core Services

BoundaryEnforcer

Implements Tractatus 12.1-12.7: AI cannot make values decisions without human approval.

Boundaries:
  • VALUES (12.1) - Ethics, privacy, principles
  • INNOVATION (12.2) - Architectural decisions
  • WISDOM (12.3) - Strategic judgment
  • PURPOSE (12.4) - Goal definition
const { checkBoundary } = require('./services/BoundaryEnforcer')
const result = await checkBoundary(decision)
// Returns: { allowed: false, requires_human: true }

InstructionPersistenceClassifier

Classifies instructions by quadrant (STRATEGIC/OPERATIONAL/TACTICAL/SYSTEM) and persistence level (HIGH/MEDIUM/LOW).

Classification:
  • Quadrant (STRATEGIC/OPERATIONAL/TACTICAL/SYSTEM)
  • Persistence (HIGH/MEDIUM/LOW)
  • Temporal scope (PROJECT/SESSION/TASK)
  • Explicitness score (0.0-1.0)
const { classify } = require('./services/InstructionPersistenceClassifier')
const result = await classify(instruction)
// Returns: { quadrant, persistence, temporal_scope, explicitness }

CrossReferenceValidator

Validates AI actions against stored instructions to prevent pattern recognition overrides.

Validation:
  • Checks action against HIGH persistence instructions
  • Detects conflicts (pattern vs explicit instruction)
  • Provides correct parameters when rejected
const { validate } = require('./services/CrossReferenceValidator')
const result = await validate(action, instructions)
// Returns: { status: 'REJECTED', conflicts, correct_parameters }

ContextPressureMonitor

Monitors token usage and context pressure, triggering safety protocols at thresholds.

Pressure Levels:
  • NORMAL (0-50%) - Full operation
  • ELEVATED (50-75%) - Increase verification
  • HIGH (75-90%) - Reduce complexity
  • CRITICAL (90%+) - Suggest handoff
const { analyzePressure } = require('./services/ContextPressureMonitor')
const result = await analyzePressure(tokens, messages)
// Returns: { level: 'HIGH', score: 82, shouldReduce: true }

MetacognitiveVerifier

Verifies action reasoning and confidence, requiring confirmation for low-confidence actions.

Verification:
  • Confidence scoring (0.0-1.0)
  • Selective mode (HIGH persistence only)
  • Requires confirmation if confidence < 0.7
const { verify } = require('./services/MetacognitiveVerifier')
const result = await verify(action, reasoning)
// Returns: { confidence: 0.65, requires_confirmation: true }

PluralisticDeliberationOrchestrator

Manages multi-stakeholder deliberation ensuring value pluralism in decisions.

Features:
  • Stakeholder perspective tracking
  • Value conflict detection
  • Deliberation session management
  • Precedent storage
const { orchestrate } = require('./services/PluralisticDeliberationOrchestrator')
const result = await orchestrate(decision, stakeholders)
// Returns: { decision, perspectives, conflicts_identified }

πŸ“ Source Code

Code patterns and examples are available in the GitHub repository.

API Reference

BoundaryEnforcer.checkBoundary()

const { checkBoundary } = require('./src/services/BoundaryEnforcer.service')

// Check if decision crosses Tractatus boundary
const decision = {
  domain: 'values',
  description: 'Change privacy policy to enable analytics',
  context: { /* ... */ }
}

const result = await checkBoundary(decision)

// Returns:
{
  allowed: false,                    // AI cannot proceed
  requires_human: true,              // Human decision required
  boundary: "12.1",                  // Tractatus boundary violated
  principle: "Values cannot be automated, only verified",
  reason: "Decision involves values domain",
  ai_can_provide: [                  // What AI CAN do
    "Analyze privacy implications",
    "List alternative approaches",
    "Document tradeoffs"
  ]
}
Keywords detected: value, principle, ethic, moral, should, ought, right, wrong, privacy, policy, trade-off, etc.
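A minimal sketch of that keyword detection, using the keyword list above; the matching logic here is an assumption, not the shipped BoundaryEnforcer code:

```javascript
// Keyword-based values-domain detection (keywords from the docs above).
const VALUES_KEYWORDS = [
  'value', 'principle', 'ethic', 'moral', 'should', 'ought',
  'right', 'wrong', 'privacy', 'policy', 'trade-off',
];

// Case-insensitive substring match: true means the decision likely
// crosses the 12.1 values boundary and needs human review.
function touchesValuesDomain(description) {
  const text = description.toLowerCase();
  return VALUES_KEYWORDS.some((kw) => text.includes(kw));
}
```

A production classifier would weigh context as well; a bare keyword match errs on the side of asking the human.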

InstructionPersistenceClassifier.classify()

const { classify } = require('./src/services/InstructionPersistenceClassifier.service')

const instruction = "Always use MongoDB on port 27017"
const result = await classify(instruction, context)

// Returns:
{
  quadrant: 'SYSTEM',                // Decision domain
  persistence: 'HIGH',               // How long to remember
  temporal_scope: 'PROJECT',         // Scope of applicability
  verification_required: 'MANDATORY', // Verification level
  explicitness: 0.85,                // Confidence score
  parameters: {
    port: "27017",
    database: "mongodb",
    service: "mongodb"
  }
}
Quadrants: STRATEGIC, OPERATIONAL, TACTICAL, SYSTEM, STORAGE
Persistence: HIGH (override all), MEDIUM (session-scoped), LOW (can be superseded)
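The persistence levels above imply a precedence rule when stored instructions conflict. A sketch of that resolution, assuming HIGH outranks MEDIUM and LOW with ties going to the most recent instruction (the tie-break is an assumption; field names follow the GovernanceRule examples in these docs):

```javascript
// Rank instructions by persistence; HIGH overrides all.
const RANK = { HIGH: 3, MEDIUM: 2, LOW: 1 };

// Return the instruction that wins a conflict: highest persistence
// first, newest ISO timestamp as the tie-breaker.
function resolve(instructions) {
  return [...instructions].sort((a, b) =>
    RANK[b.persistence] - RANK[a.persistence] ||
    b.timestamp.localeCompare(a.timestamp)
  )[0];
}
```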

CrossReferenceValidator.validate()

const { validate } = require('./src/services/CrossReferenceValidator.service')

// User instructed: "Use port 27027"
// AI attempting: port 27017 (pattern recognition)

const action = {
  type: 'db_config',
  parameters: { port: 27017 }
}

const instructions = await getStoredInstructions() // From MongoDB
const result = await validate(action, instructions)

// Returns (CONFLICT):
{
  status: 'REJECTED',
  conflicts: [
    {
      instruction_id: 'inst_042',
      instruction: 'Use MongoDB port 27027',
      persistence: 'HIGH',
      conflict: 'Proposed port 27017 conflicts with instruction port 27027'
    }
  ],
  correct_parameters: {
    port: 27027  // Use this instead
  }
}

ContextPressureMonitor.analyzePressure()

const { analyzePressure } = require('./src/services/ContextPressureMonitor.service')

const pressure = await analyzePressure({
  currentTokens: 150000,
  maxTokens: 200000,
  messageCount: 45,
  errorCount: 2
})

// Returns:
{
  level: 'HIGH',                     // NORMAL|ELEVATED|HIGH|CRITICAL
  score: 75,                         // 0-100 percentage
  shouldReduce: true,                // Reduce complexity
  recommendations: [
    'Consider handoff to new session',
    'Reduce verbose explanations',
    'Increase verification for remaining actions'
  ],
  thresholds: {
    tokens: 75,     // 75% of max
    messages: 64,   // 45/70 messages
    errors: 40      // 2 errors
  }
}
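The score in the sample response can be reproduced by taking the worst of several normalized signals. This is a sketch consistent with the thresholds shown above, not the shipped service; the 70-message budget and 20-points-per-error weighting are assumptions inferred from the sample values:

```javascript
// Pressure score = worst of token %, message %, and error pressure.
function pressureLevel({ currentTokens, maxTokens, messageCount, errorCount }) {
  const tokenPct = (currentTokens / maxTokens) * 100;     // 150k/200k → 75
  const messagePct = (messageCount / 70) * 100;           // 45/70 → ~64
  const errorPct = Math.min(errorCount * 20, 100);        // 2 errors → 40
  const score = Math.round(Math.max(tokenPct, messagePct, errorPct));
  const level =
    score >= 90 ? 'CRITICAL' :
    score >= 75 ? 'HIGH' :
    score >= 50 ? 'ELEVATED' : 'NORMAL';
  return { level, score, shouldReduce: score >= 75 };
}
```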

Integration Examples

Express Middleware Integration

const express = require('express')
const { BoundaryEnforcer } = require('./services')

const app = express()

// Add boundary checking middleware
app.use(async (req, res, next) => {
  if (req.body.decision) {
    const check = await BoundaryEnforcer.checkBoundary(
      req.body.decision
    )

    if (!check.allowed) {
      return res.status(403).json({
        error: 'Boundary violation',
        reason: check.reason,
        alternatives: check.ai_can_provide
      })
    }
  }
  next()
})

Instruction Validation

const {
  InstructionPersistenceClassifier,
  CrossReferenceValidator
} = require('./services')

// Classify and store user instruction
const classification = await InstructionPersistenceClassifier.classify(
  userInstruction
)

if (classification.explicitness >= 0.6) {
  await storeInstruction(classification)
}

// Validate AI action before execution
const validation = await CrossReferenceValidator.validate(
  proposedAction,
  await getStoredInstructions()
)

if (validation.status === 'REJECTED') {
  console.error(validation.conflicts)
  useCorrectParameters(
    validation.correct_parameters
  )
}

MongoDB Data Models

GovernanceRule

{
  id: "inst_001",
  text: "Use MongoDB port 27017",
  quadrant: "SYSTEM",
  persistence: "HIGH",
  temporal_scope: "PROJECT",
  explicitness: 0.85,
  parameters: { port: "27017" },
  active: true,
  timestamp: "2025-10-21T10:00:00Z"
}
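Validation only needs the rules that can actually override an action, so the lookup CrossReferenceValidator would run can be sketched as a filter on these documents. Shown here as an in-memory filter with the equivalent (assumed) MongoDB query in the comment; the collection name is illustrative:

```javascript
// Equivalent query (collection name assumed):
//   db.governance_rules.find({ active: true, persistence: 'HIGH' })
// Field names follow the GovernanceRule document shape above.
function activeHighPersistenceRules(rules) {
  return rules.filter((r) => r.active && r.persistence === 'HIGH');
}
```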

AuditLog

{
  action: "boundary_check",
  result: "REJECTED",
  boundary: "12.1",
  decision: { /* ... */ },
  timestamp: "2025-10-21T11:30:00Z",
  session_id: "2025-10-21-001"
}

Deployment

Requirements

  • Node.js: v18.0.0+ (v20+ recommended)
  • MongoDB: v7.0+ (for instruction persistence)
  • Memory: 2GB+ recommended

Installation

# Clone the framework repository
git clone https://github.com/AgenticGovernance/tractatus-framework.git
cd tractatus-framework

# Install dependencies
npm install

# Configure environment
cp .env.example .env
# Edit .env with your MongoDB URI

# Initialize MongoDB indexes
node scripts/init-db.js

# Start the server
npm start

📖 Full Documentation

Complete deployment patterns and examples available in the GitHub repository.

Integration Patterns

Common architectural patterns for integrating Tractatus into existing systems.

Middleware Integration

Insert governance checks as middleware in your request pipeline. Suitable for API-based AI systems.

Use Case: REST APIs, Express.js applications

app.use(governanceMiddleware({
  services: ['BoundaryEnforcer', 'CrossReferenceValidator'],
  mode: 'strict',
  auditAll: true
}))

Event-Driven Governance

Trigger governance checks via events. Suitable for async workflows and microservices.

Use Case: Message queues, event buses, async processing

eventBus.on('ai:decision', async (event) => {
  const result = await checkBoundary(event.decision)
  if (!result.allowed) {
    await requestHumanApproval(event, result)
  }
})

Pre/Post-Action Hooks

Validate actions before and after execution. Current production pattern for Claude Code.

Use Case: LLM tool use, autonomous agents

hooks: {
  PreToolUse: governanceCheck,
  PostToolUse: auditLog,
  SessionStart: loadInstructions,
  SessionEnd: cleanup
}

Sidecar Governance Service

Deploy governance as a separate service. Suitable for multi-LLM or polyglot environments.

Use Case: Kubernetes, containerized deployments

// AI Service makes HTTP call
const govResponse = await fetch(
  'http://governance-sidecar:8080/check',
  { method: 'POST', body: JSON.stringify(decision) }
)
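The sidecar's side of that HTTP call can be sketched as a pure handler so any framework can wrap it. The endpoint name and port follow the snippet above; `checkBoundary` is stubbed here with a trivial keyword test, and the Express wiring is an assumption:

```javascript
// Stubbed boundary check: a real sidecar would call the
// BoundaryEnforcer service instead of this keyword test.
function checkBoundary(decision) {
  const valuesDecision = /privacy|ethic|policy/i.test(decision.description || '');
  return valuesDecision
    ? { allowed: false, requires_human: true, boundary: '12.1' }
    : { allowed: true, requires_human: false };
}

// Pure handler: maps a decision to an HTTP status + body.
function handleCheck(decision) {
  const result = checkBoundary(decision);
  return { status: result.allowed ? 200 : 403, body: result };
}

// Express wiring (assumed):
// const express = require('express');
// const app = express();
// app.use(express.json());
// app.post('/check', (req, res) => {
//   const { status, body } = handleCheck(req.body);
//   res.status(status).json(body);
// });
// app.listen(8080);
```

Keeping the handler pure makes the governance decision unit-testable without standing up the sidecar.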
⚡

Performance Optimization with Agent Lightning

Integrate Microsoft's Agent Lightning for reinforcement learning-based optimization while maintaining full governance coverage.

πŸ—οΈ Two-Layer Architecture

Tractatus governance sits above Agent Lightning optimization, creating a separation of concerns: governance enforces boundaries, AL optimizes performance within those boundaries.

Layer 1: Governance (Tractatus)

  • ✓ BoundaryEnforcer: Classifies decisions as safe/unsafe
  • ✓ CrossReferenceValidator: Validates against constraints
  • ✓ PluralisticDeliberator: Manages stakeholder input
  • ✓ PressureMonitor: Detects manipulation attempts

Layer 2: Performance (Agent Lightning)

  • ⚡ Reinforcement Learning: Optimizes response quality
  • ⚡ Human Feedback: Learns from user corrections
  • ⚡ Training Loops: Continuous improvement over time
  • ⚡ Cost Efficiency: Maintains performance with lower compute

Key Insight: Governance runs before Agent Lightning optimization. AL never sees unsafe decisions - they're blocked at the governance layer.

📦 Implementation Example

Use the GovernedAgentLightning wrapper to combine governance + performance optimization:

from tractatus_agent_lightning import GovernedAgentLightning
from agentlightning import AgentLightning

# Initialize governed agent
agent = GovernedAgentLightning(
    base_agent=AgentLightning(agent_id="customer-support-bot"),
    governance_config={
        "services": ["BoundaryEnforcer", "CrossReferenceValidator"],
        "mode": "strict",  # Block any governance violations
        "audit_all": True   # Log all decisions
    },
    mongodb_uri="mongodb://localhost:27017/governance"
)

# Agent Lightning trains on feedback, Tractatus validates boundaries
# (conversation, user_corrected, user_correction, user_rating come from the host app)
for user_message in conversation:
    # Governance check happens first
    response = agent.respond(user_message)

    # User feedback feeds Agent Lightning optimization
    if user_corrected:
        agent.update_from_feedback(
            original_response=response,
            corrected_response=user_correction,
            rating=user_rating
        )
        # Governance audit logs the training update
Pattern: Tractatus ensures safety boundaries are never crossed, while Agent Lightning learns to optimize within those safe boundaries.

📊 Preliminary Findings: Demo 2 Results

Small-Scale Validation Only

These results are from Demo 2: 5 training rounds, single agent, simulated environment. Not validated at production scale. Consider this preliminary evidence requiring further research.

Early evidence from small-scale demo suggests Agent Lightning can maintain near-baseline performance while reducing compute requirements, without compromising governance coverage.

Metric                      Baseline (No AL)   Governed (Tractatus + AL)   Delta
User Engagement             94%                89%                         -5%
Governance Coverage         100%               100%                        0%
Constraint Violations       0                  0                           0
Audit Trail Completeness    100%               100%                        0%

Summary: 100% governance maintained · -5% performance trade-off · 5 training rounds

Interpretation for Implementers:

  • ✓ Governance Preserved: 100% coverage maintained, no violations observed across 5 rounds
  • → Performance Cost: 5% engagement reduction may be acceptable for high-stakes use cases (healthcare, finance, legal)
  • ? Open Question: Does the performance gap close over more training rounds? Requires longer-term validation.

Critical Limitations

  • Scale: Only 5 training rounds, insufficient to claim production readiness
  • Environment: Simulated interactions, not tested with real users
  • Agent Count: Single agent, multi-agent scenarios untested
  • Duration: Short-term only, long-term stability unknown
  • Edge Cases: Adversarial inputs and stress testing not performed

⚠️ Do not use these numbers as production validation. Conduct your own testing in your specific context.

🚀 Try the Integration

Download the install pack with Demo 2 (governed agent) and explore the integration yourself. Includes full source code, governance modules, and setup instructions.

Development Roadmap & Collaboration

Tractatus is an active research framework. We welcome collaboration on priority development areas.

🚀 Priority Areas for Development

These initiatives represent high-impact opportunities for framework enhancement. Technical contributors, researchers, and organizations are encouraged to engage.

🤖

Multi-LLM Support

Status: Research Phase

Extend governance to GPT-4, Gemini, Llama, and local models. Requires adapting hook architecture to different LLM interfaces.

Technical Challenges: Provider-specific tool/function calling, rate limiting, context window differences
📚

Language Bindings

Status: Community Interest

Python, Go, and Rust implementations to serve broader developer communities. Core logic is portable; MongoDB integration is universal.

Value: Enable polyglot AI stacks, performance-critical applications (Rust), data science workflows (Python)
☁️

Cloud-Native Deployment

Status: Reference Architectures Needed

Terraform/Helm charts for AWS, Azure, GCP. Include managed MongoDB (Atlas), auto-scaling, and monitoring integration.

Deliverables: Reference IaC templates, cost optimization guides, security hardening checklist
📊

Business Intelligence Tools

v1.0 Prototype Live

Status: Research Validation Phase

Transform governance logs into executive insights: cost avoidance calculator, framework maturity scoring, and team performance analytics. Demonstrates ROI of governance decisions in real-time.

Current Features: User-configurable cost factors, maturity scoring (0-100), AI vs human performance comparison, enterprise scaling projections
🔗

AI Framework Integration

Status: Conceptual

Adapters for LangChain, Semantic Kernel, AutoGPT, and CrewAI. Enable governance for existing agent frameworks.

Approach: Plugin/middleware architecture that wraps agent actions with governance checks
⚡

Enterprise-Scale Performance

Status: Validation Needed

Optimize for 1000+ concurrent AI agents. Requires caching strategies, rule compilation, and distributed audit logging.

Metrics Target: < 5ms governance overhead per decision, 99.9% uptime, horizontal scalability
🛡️

Extended Governance Services

Status: Research

Cost monitoring, rate limiting, PII detection, adversarial prompt defense. Domain-specific services for regulated industries.

Examples: FinancialComplianceService, HealthcarePrivacyService, CostBudgetEnforcer

Get Involved

Tractatus is Apache 2.0 licensed research. We welcome contributions, pilot implementations, and collaborative research partnerships.

πŸ‘¨β€πŸ’» Technical Contributors

Implement features, fix bugs, improve documentation

→ Contributing Guide

🔬 Research Partners

Validation studies, academic collaboration, case studies

→ research@agenticgovernance.digital

🏢 Organization Pilots

Production deployments, enterprise requirements, feedback loops

→ Submit Case Study

Why Collaborate? Tractatus addresses real gaps in AI safety. Early adopters shape the framework's evolution and gain expertise in structural AI governance, a differentiating capability as regulatory requirements mature.

Resources

Reference Implementation

This website (agenticgovernance.digital) runs on Tractatus governance.

Help us reach the right people.

If you know researchers, implementers, or leaders who need structural AI governance solutions, share this with them.