RouxYou
Sovereign AI Infrastructure

Your AI should
answer to you

RouxYou is a self-evolving, multi-agent AI system running entirely on consumer hardware. No cloud. No API keys. No data leaves your network.

Zero Cloud Dependencies
Local LLM Inference
Self-Mod With Rollback
Hybrid 3-Layer Memory System
Joint Intention Deliberation (Phase 40)

The orchestration layer is too important to rent

Every major AI agent framework sends your data to someone else's server. Your prompts, your files, your workflow — routed through third-party APIs, metered by the token, owned by the provider.

RouxYou runs the entire stack locally. A lightweight router classifies intent. A reasoning engine plans complex tasks. Workers execute without LLM overhead. A watchdog monitors the whole thing — and can't be modified by the system it watches.
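The routing step described above can be sketched in a few lines. This is an illustrative stand-in, not the real router: the actual system uses a ministral-3B model for classification, while this sketch substitutes a keyword heuristic, and the route names are assumptions.

```python
# Hypothetical sketch of the routing layer: classify an incoming request,
# then send it down a lane where only "complex" work pays the LLM cost.
ROUTES = {
    "simple": "worker",        # direct Python execution, no LLM call
    "complex": "coder",        # deep reasoning / code generation
    "chat": "orchestrator",    # conversational handling
}

def classify_intent(prompt: str) -> str:
    """Stand-in for the ministral-3B router: keyword heuristic only."""
    text = prompt.lower()
    if any(w in text for w in ("refactor", "write code", "plan")):
        return "complex"
    if any(w in text for w in ("list", "read", "search", "run")):
        return "simple"
    return "chat"

def route(prompt: str) -> str:
    return ROUTES[classify_intent(prompt)]

print(route("read the logs directory"))        # -> worker
print(route("plan a refactor of the gateway")) # -> coder
```

The design point is the split itself: a small, cheap classifier in front means most traffic never touches the 20B model.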

The system learns from its own task history through a three-layer memory architecture (semantic, episodic, core state), improves its own code through a blue-green deployment pipeline, and rolls back automatically when something breaks. Multi-agent deliberation (Phase 40) reaches consensus on a shared blackboard before execution. A silent drift verifier catches when retry loops satisfy surface constraints but drop semantic intent.

It proposes its own improvements, searches the web for better patterns, and learns from the outcomes. No fine-tuning. No retraining. Just hybrid memory, safe self-modification, and a feedback loop that compounds.
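The shared-blackboard consensus step can be sketched as follows. The agent names match the ones used elsewhere on this page; the data structure and method names are illustrative assumptions, not the real Phase 40 API.

```python
from dataclasses import dataclass, field

@dataclass
class Blackboard:
    """Hypothetical shared blackboard: agents post a plan and vote on it."""
    plan: str = ""
    votes: dict[str, bool] = field(default_factory=dict)

    def propose(self, agent: str, plan: str) -> None:
        self.plan = plan
        self.votes = {agent: True}   # proposer implicitly approves

    def endorse(self, agent: str, approve: bool) -> None:
        self.votes[agent] = approve

    def consensus(self, quorum: set[str]) -> bool:
        # Execution is gated on every quorum member having voted yes.
        return quorum <= self.votes.keys() and all(self.votes[a] for a in quorum)

board = Blackboard()
board.propose("planner", "split task into 3 worker steps")
board.endorse("coach", True)
board.endorse("coder", True)
if board.consensus({"planner", "coach", "coder"}):
    print("execute:", board.plan)
```

The key invariant is that no action executes until every deliberating agent has signed off on the same plan.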

:8000
Gateway
Reverse proxy and route table. Hot-swaps between production and staging during deploys.
:8001 — ministral-3B router
Orchestrator
Intent classification + Phase 40 joint-intention deliberation. Planner/Coach/Coder agents reach consensus on a shared blackboard before execution.
:8002 — GPT-OSS 20B (MoE)
Coder
Deep reasoning and code generation via gpt-oss:20b (3.6B active params, MoE). Plans complex tasks. Enriches proposals. Drives the Phase 40 deliberation pipeline.
:8003
Worker
20+ capabilities. Zero LLM overhead. Pure Python execution for filesystem, search, commands, vision, and deploy.
:8004 + :8011
Memory & RAG
Three-layer memory: LanceDB semantic (hybrid BM25+RRF+FlashRank rerank), SQLite episodic, SQLite core state. Embeddings via nomic-embed-text-v2-moe.
:8010 — immutable
Watchtower
Supervisor with human approval gate. Blue-green deploy pipeline. Cannot be modified by the system it watches.
:8012 — proposals
Cron & Coach
Heuristic observers. LLM Coach enrichment. Web research proposals. Memory decay. Scheduled automation.
:8013 + Discord
Messaging
Outbound webhooks (Discord/Slack). Inbound command bridge via Discord bot. Approve proposals, check status, submit tasks from your phone.
:8014 + :5100
Roux Voice
Moonshine v2 STT (245M, CPU, 9.6x realtime) + Kokoro v1.0 TTS (82M, CPU, 54 voices). VAD wake-word pipeline. Local voice I/O with zero cloud.
:8039 — verifier
Silent Drift Verifier
Catches when self-healing retry loops satisfy surface constraints but silently drop semantic intent. Claude Sonnet as the verifier-of-last-resort.
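The hybrid retrieval in the Memory & RAG card hinges on Reciprocal Rank Fusion (RRF), which merges the BM25 and vector rankings before FlashRank reranks the result. A minimal sketch of the standard RRF formula, with illustrative document IDs (the real service operates on LanceDB result sets):

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked doc-id lists into one: score(d) = sum over lists of 1/(k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits   = ["task-17", "task-03", "task-42"]
vector_hits = ["task-42", "task-17", "task-99"]
# task-17 ranks high in both lists, so it wins the fused ordering.
print(rrf_fuse([bm25_hits, vector_hits]))
```

RRF needs only ranks, not scores, which is why it can fuse a lexical ranker and a vector ranker whose raw scores live on incompatible scales.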
01
Fully local inference on consumer-grade hardware — no cloud APIs, no per-token costs, no data exfiltration
02
Multi-agent deliberation — planner, coach, and coder agents reach consensus on a shared blackboard before any action executes (Phase 40 joint-intention architecture)
03
Self-modification through blue-green deployment with anchor validation, health checks, automatic rollback, and a silent drift verifier that catches semantic degradation
04
Three-layer memory — LanceDB semantic search (BM25+RRF+FlashRank rerank), SQLite episodic outcomes, SQLite core state. Hybrid retrieval with temporal decay. No fine-tuning needed.
05
Self-improvement proposals — the system observes its own health, researches better patterns, and suggests changes to its operator through a three-tiered proposal pipeline
06
Voice + Discord bridge — local Moonshine STT + Kokoro TTS for voice I/O; Discord bot for remote task submission, proposal management, and status checks from your phone
Stage
LLM generates a code patch. Anchor validation confirms the patch targets actual file content, not hallucinated code.
Boot
Staging instance starts on an alternate port. Health endpoint confirms the service is alive and functional.
Approve
Human approval gate. The Watchtower presents the diff. Nothing deploys without operator consent.
Swap
Gateway routes flip to staging. Production archived for rollback. 60-second watchdog monitors for failures.
Guard
Three consecutive failures trigger automatic rollback. The system heals itself. No human intervention required.
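The Guard step's rollback rule can be sketched as a small polling loop. This is a minimal sketch under stated assumptions: `check_health` and `rollback` are hypothetical stand-ins for the real Watchtower calls, and the reset-on-success behavior is an assumption about how "three consecutive failures" is counted.

```python
import time

def watchdog(check_health, rollback, window_s=60, interval_s=5, max_failures=3):
    """Poll a health check for window_s; roll back after max_failures in a row."""
    failures = 0
    deadline = time.monotonic() + window_s
    while time.monotonic() < deadline:
        if check_health():
            failures = 0          # any success resets the failure streak
        else:
            failures += 1
            if failures >= max_failures:
                rollback()        # restore the archived production build
                return "rolled_back"
        time.sleep(interval_s)
    return "promoted"             # survived the watch window
```

The consecutive-failure threshold is what keeps a single transient hiccup from undoing an otherwise healthy deploy.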

The system suggests its own improvements

RouxYou doesn't wait to be told what's wrong. Three tiers of observation feed into a unified proposal pipeline. Every suggestion requires human approval. Every outcome is recorded in memory. The system learns from its own proposals.

Tier 1
Heuristic Observers
Six pure-Python observers. Zero LLM overhead. Health checks, memory pressure, stale tasks, codebase drift. Runs every 30 minutes.
Tier 2
LLM Coach
Local GPT-OSS 20B (MoE) enriches findings with root cause analysis, confidence scores, and cross-references with three-layer memory.
Tier 3
Web Research
Daily searches via self-hosted engine. Seven rotating topics. Only actionable findings above 0.6 relevance become proposals.
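The Tier 3 filter described above reduces to a threshold gate over research findings. The field names and proposal shape here are assumptions for illustration, not the real schema; only the 0.6 relevance cutoff comes from the description.

```python
RELEVANCE_THRESHOLD = 0.6  # from the Tier 3 description above

def to_proposals(findings: list[dict]) -> list[dict]:
    """Keep only actionable findings above the relevance cutoff."""
    return [
        {"title": f["title"], "source": f["url"], "tier": 3}
        for f in findings
        if f["relevance"] >= RELEVANCE_THRESHOLD and f.get("actionable")
    ]

findings = [
    {"title": "Faster BM25 tokenizer", "url": "https://example.com/a",
     "relevance": 0.82, "actionable": True},
    {"title": "General LLM news", "url": "https://example.com/b",
     "relevance": 0.41, "actionable": False},
]
print(to_proposals(findings))  # only the first finding becomes a proposal
```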
RouxYou exists because the operator believed that an AI system that manages your life should not report to someone else's server.
— Architecture Manifesto, Feb 2026