diff --git a/GHOST_SHELL_VISION.md b/GHOST_SHELL_VISION.md
index 34d86fe..3c99604 100644
--- a/GHOST_SHELL_VISION.md
+++ b/GHOST_SHELL_VISION.md
@@ -68,6 +68,29 @@ Closer to a lightweight Render, Railway, or Fly.io — but:
 
 ---
 
+## Self-Healing as a Core Design Principle
+
+Ghost Shell is being built by a solo operator. That constraint shapes every architectural decision in the platform and must be carried forward into Ghost Shell itself.
+
+A platform designed for solo operation must run without continuous human supervision. This means self-healing is not a feature — it is a foundational requirement of the architecture.
+
+**The three properties every layer must have:**
+
+1. **Self-diagnosing** — the system knows when something is wrong without a human watching
+2. **Self-recovering** — common failure modes are handled automatically
+3. **Loud escalation** — the system pages a human only when it genuinely cannot recover
+
+This is the same philosophy used in industrial control systems: design for the absence of the operator, not their presence. The operator is the exception handler, not the normal execution path.
+
+What this means for Ghost Shell specifically:
+- Agent supervision and crash recovery must be generic, not game-specific
+- Lifecycle state must always be recoverable from DB alone
+- Any component restart must trigger automatic re-registration and re-routing
+- Quota and resource enforcement must be automatic — unbounded consumption is a 3am problem
+- Notifications must surface failures without requiring dashboard monitoring
+
+---
+
 ## Strategic Direction
 
 **The honest answer as of April 2026: finish ZLH first, then decide.**
@@ -104,6 +127,7 @@ For Ghost Shell to remain viable, these rules apply to ZLH development today:
 2. **Keep the agent generic** — it runs any workload, not just game servers
 3. **Keep template definitions abstract** — game types are plugins, not hardcoded paths
 4. **Treat application layer as pluggable** — Layer 3 must be swappable
+5. **Build self-healing at every layer** — the system must operate without continuous supervision
 
 If Minecraft becomes embedded in lifecycle logic, Ghost Shell dies. If Minecraft is a plug-in workload definition, Ghost Shell lives.
 
@@ -118,9 +142,11 @@ These constraints should be checked in every architectural review.
 Phase 1 — ZLH as application layer (NOW)
  Build ZLH on top of what will become Ghost Shell.
  Do not abstract yet.
+ Close the self-healing gaps (pre-launch blockers).
 
 Phase 2 — Stabilize core (NEXT)
  ZLH reaches: stable, operational, revenue generating.
+ Runs without continuous supervision.
  Minimal rewrite pressure.
 
 Phase 3 — Extract core into Ghost Shell
@@ -138,6 +164,8 @@ Phase 4 — ZLH becomes one implementation of Ghost Shell
 
 ## Connection to Owner Background
 
-This architecture emerged naturally from controls engineering thinking applied to cloud infrastructure. The layer model above is structurally identical to how industrial control systems are organized (field devices → controllers → supervisory → operator interface). That is not accidental — it is trained instinct applied to a new domain.
+This architecture emerged naturally from controls engineering thinking applied to cloud infrastructure. The layer model above is structurally identical to how industrial control systems are organized (field devices → controllers → supervisory → operator interface). The self-healing requirement mirrors how industrial systems are designed: the operator is the exception handler, not the normal execution path.
+
+That is not accidental — it is trained instinct applied to a new domain.
 
 See `OWNER_PROFILE.md` for full context.
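
Reviewer note: the three properties this change introduces (self-diagnosing, self-recovering, loud escalation) could be sketched as a generic supervision pass along the following lines. This is only an illustration under assumptions, not code from ZLH or Ghost Shell: `Component`, `supervise`, `max_restarts`, the `escalate` callback, and the `FlakyService` stand-in are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Component:
    """Hypothetical supervised unit: any workload, not a specific game server."""
    name: str
    is_healthy: Callable[[], bool]  # self-diagnosing: a health probe
    restart: Callable[[], None]     # self-recovering: the automatic remedy


def supervise(component: Component,
              escalate: Callable[[str], None],
              max_restarts: int = 3) -> str:
    """Run one supervision pass; return 'healthy', 'recovered', or 'escalated'."""
    if component.is_healthy():
        return "healthy"
    for _ in range(max_restarts):
        component.restart()
        if component.is_healthy():
            return "recovered"
    # Loud escalation: reached only once automatic recovery is exhausted.
    escalate(f"{component.name}: still unhealthy after {max_restarts} restarts")
    return "escalated"


class FlakyService:
    """Illustrative stand-in that becomes healthy after N restarts."""

    def __init__(self, restarts_needed: int) -> None:
        self.restarts_needed = restarts_needed
        self.restarts = 0

    def is_healthy(self) -> bool:
        return self.restarts >= self.restarts_needed

    def restart(self) -> None:
        self.restarts += 1


if __name__ == "__main__":
    pages = []  # stands in for whatever notification channel is used
    svc = FlakyService(restarts_needed=2)
    agent = Component("agent", svc.is_healthy, svc.restart)
    print(supervise(agent, pages.append), pages)
```

The bounded restart budget is the design point: recovery is attempted automatically, but the pass always terminates, so the operator is paged only in the "cannot recover" case and never for failures the loop already handled.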