RouxYou
Sovereign AI Infrastructure

Your AI should
answer to you

RouxYou is a self-evolving, multi-agent AI system running entirely on consumer hardware. No cloud. No API keys. No data leaves your network.

Zero Cloud Dependencies
Local LLM Inference
Self-Mod With Rollback
Hybrid 3-Layer Memory System
Joint Intention Deliberation (Phase 40)

The orchestration layer is too important to rent

Every major AI agent framework sends your data to someone else's server. Your prompts, your files, your workflow — routed through third-party APIs, metered by the token, owned by the provider.

RouxYou runs the entire stack locally. A lightweight router classifies intent. A reasoning engine plans complex tasks. Workers execute without LLM overhead. A watchdog monitors the whole thing — and can't be modified by the system it watches.
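The routing step described above can be sketched in a few lines. This is an illustrative stand-in, not the real router: the actual system uses a ministral-3B model for classification, while this sketch substitutes a keyword heuristic, and the route names are assumptions.

```python
# Hypothetical sketch of the routing layer: classify an incoming request,
# then send it down a lane where only "complex" work pays the LLM cost.
ROUTES = {
    "simple": "worker",        # direct Python execution, no LLM call
    "complex": "coder",        # deep reasoning / code generation
    "chat": "orchestrator",    # conversational handling
}

def classify_intent(prompt: str) -> str:
    """Stand-in for the ministral-3B router: keyword heuristic only."""
    text = prompt.lower()
    if any(w in text for w in ("refactor", "write code", "plan")):
        return "complex"
    if any(w in text for w in ("list", "read", "search", "run")):
        return "simple"
    return "chat"

def route(prompt: str) -> str:
    return ROUTES[classify_intent(prompt)]

print(route("read the logs directory"))        # -> worker
print(route("plan a refactor of the gateway")) # -> coder
```

The design point is the split itself: a small, cheap classifier in front means most traffic never touches the 20B model.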

The system learns from its own task history through a three-layer memory architecture (semantic, episodic, core state), improves its own code through a blue-green deployment pipeline, and rolls back automatically when something breaks. Multi-agent deliberation (Phase 40) reaches consensus on a shared blackboard before execution. A silent drift verifier catches when retry loops satisfy surface constraints but drop semantic intent.

It proposes its own improvements, searches the web for better patterns, and learns from the outcomes. No fine-tuning. No retraining. Just hybrid memory, safe self-modification, and a feedback loop that compounds.
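The shared-blackboard consensus step can be sketched as follows. The agent names match the ones used elsewhere on this page; the data structure and method names are illustrative assumptions, not the real Phase 40 API.

```python
from dataclasses import dataclass, field

@dataclass
class Blackboard:
    """Hypothetical shared blackboard: agents post a plan and vote on it."""
    plan: str = ""
    votes: dict[str, bool] = field(default_factory=dict)

    def propose(self, agent: str, plan: str) -> None:
        self.plan = plan
        self.votes = {agent: True}   # proposer implicitly approves

    def endorse(self, agent: str, approve: bool) -> None:
        self.votes[agent] = approve

    def consensus(self, quorum: set[str]) -> bool:
        # Execution is gated on every quorum member having voted yes.
        return quorum <= self.votes.keys() and all(self.votes[a] for a in quorum)

board = Blackboard()
board.propose("planner", "split task into 3 worker steps")
board.endorse("coach", True)
board.endorse("coder", True)
if board.consensus({"planner", "coach", "coder"}):
    print("execute:", board.plan)
```

The key invariant is that no action executes until every deliberating agent has signed off on the same plan.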

:8000
Gateway
Reverse proxy and route table. Hot-swaps between production and staging during deploys.
:8001 — ministral-3B router
Orchestrator
Intent classification + Phase 40 joint-intention deliberation. Planner/Coach/Coder agents reach consensus on a shared blackboard before execution.
:8002 — GPT-OSS 20B (MoE)
Coder
Deep reasoning and code generation via gpt-oss:20b (3.6B active params, MoE). Plans complex tasks. Enriches proposals. Drives the Phase 40 deliberation pipeline.
:8003
Worker
20+ capabilities. Zero LLM overhead. Pure Python execution for filesystem, search, commands, vision, and deploy.
:8004 + :8011
Memory & RAG
Three-layer memory: LanceDB semantic (hybrid BM25+RRF+FlashRank rerank), SQLite episodic, SQLite core state. Embeddings via nomic-embed-text-v2-moe.
:8010 — immutable
Watchtower
Supervisor with human approval gate. Blue-green deploy pipeline. Cannot be modified by the system it watches.
:8012 — proposals
Cron & Coach
Heuristic observers. LLM Coach enrichment. Web research proposals. Memory decay. Scheduled automation.
:8013 + Discord
Messaging
Outbound webhooks (Discord/Slack). Inbound command bridge via Discord bot. Approve proposals, check status, submit tasks from your phone.
:8014 + :5100
Roux Voice
Moonshine v2 STT (245M, CPU, 9.6x realtime) + Kokoro v1.0 TTS (82M, CPU, 54 voices). VAD wake-word pipeline. Local voice I/O with zero cloud.
:8039 — verifier
Silent Drift Verifier
Catches when self-healing retry loops satisfy surface constraints but silently drop semantic intent. Claude Sonnet as the verifier-of-last-resort.
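The hybrid retrieval in the Memory & RAG card hinges on Reciprocal Rank Fusion (RRF), which merges the BM25 and vector rankings before FlashRank reranks the result. A minimal sketch of the standard RRF formula, with illustrative document IDs (the real service operates on LanceDB result sets):

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked doc-id lists into one: score(d) = sum over lists of 1/(k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits   = ["task-17", "task-03", "task-42"]
vector_hits = ["task-42", "task-17", "task-99"]
# task-17 ranks high in both lists, so it wins the fused ordering.
print(rrf_fuse([bm25_hits, vector_hits]))
```

RRF needs only ranks, not scores, which is why it can fuse a lexical ranker and a vector ranker whose raw scores live on incompatible scales.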
01
Fully local inference on consumer-grade hardware — no cloud APIs, no per-token costs, no data exfiltration
02
Multi-agent deliberation — planner, coach, and coder agents reach consensus on a shared blackboard before any action executes (Phase 40 joint-intention architecture)
03
Self-modification through blue-green deployment with anchor validation, health checks, automatic rollback, and a silent drift verifier that catches semantic degradation
04
Three-layer memory — LanceDB semantic search (BM25+RRF+FlashRank rerank), SQLite episodic outcomes, SQLite core state. Hybrid retrieval with temporal decay. No fine-tuning needed.
05
Self-improvement proposals — the system observes its own health, researches better patterns, and suggests changes to its operator through a three-tiered proposal pipeline
06
Voice + Discord bridge — local Moonshine STT + Kokoro TTS for voice I/O; Discord bot for remote task submission, proposal management, and status checks from your phone
Stage
LLM generates a code patch. Anchor validation confirms the patch targets actual file content, not hallucinated code.
Boot
Staging instance starts on an alternate port. Health endpoint confirms the service is alive and functional.
Approve
Human approval gate. The Watchtower presents the diff. Nothing deploys without operator consent.
Swap
Gateway routes flip to staging. Production archived for rollback. 60-second watchdog monitors for failures.
Guard
Three consecutive failures trigger automatic rollback. The system heals itself. No human intervention required.
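The Guard step's rollback rule can be sketched as a small polling loop. This is a minimal sketch under stated assumptions: `check_health` and `rollback` are hypothetical stand-ins for the real Watchtower calls, and the reset-on-success behavior is an assumption about how "three consecutive failures" is counted.

```python
import time

def watchdog(check_health, rollback, window_s=60, interval_s=5, max_failures=3):
    """Poll a health check for window_s; roll back after max_failures in a row."""
    failures = 0
    deadline = time.monotonic() + window_s
    while time.monotonic() < deadline:
        if check_health():
            failures = 0          # any success resets the failure streak
        else:
            failures += 1
            if failures >= max_failures:
                rollback()        # restore the archived production build
                return "rolled_back"
        time.sleep(interval_s)
    return "promoted"             # survived the watch window
```

The consecutive-failure threshold is what keeps a single transient hiccup from undoing an otherwise healthy deploy.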

The system suggests its own improvements

RouxYou doesn't wait to be told what's wrong. Three tiers of observation feed into a unified proposal pipeline. Every suggestion requires human approval. Every outcome is recorded in memory. The system learns from its own proposals.

Tier 1
Heuristic Observers
Six pure-Python observers. Zero LLM overhead. Health checks, memory pressure, stale tasks, codebase drift. Runs every 30 minutes.
Tier 2
LLM Coach
Local GPT-OSS 20B (MoE) enriches findings with root cause analysis, confidence scores, and cross-references with three-layer memory.
Tier 3
Web Research
Daily searches via self-hosted engine. Seven rotating topics. Only actionable findings above 0.6 relevance become proposals.
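The Tier 3 filter described above reduces to a threshold gate over research findings. The field names and proposal shape here are assumptions for illustration, not the real schema; only the 0.6 relevance cutoff comes from the description.

```python
RELEVANCE_THRESHOLD = 0.6  # from the Tier 3 description above

def to_proposals(findings: list[dict]) -> list[dict]:
    """Keep only actionable findings above the relevance cutoff."""
    return [
        {"title": f["title"], "source": f["url"], "tier": 3}
        for f in findings
        if f["relevance"] >= RELEVANCE_THRESHOLD and f.get("actionable")
    ]

findings = [
    {"title": "Faster BM25 tokenizer", "url": "https://example.com/a",
     "relevance": 0.82, "actionable": True},
    {"title": "General LLM news", "url": "https://example.com/b",
     "relevance": 0.41, "actionable": False},
]
print(to_proposals(findings))  # only the first finding becomes a proposal
```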
RouxYou exists because the operator believed that an AI system that manages your life should not report to someone else's server.
— Architecture Manifesto, Feb 2026