# Owner Profile — James

## How to Work With James

James understands code and reads it fluently; he is not a beginner.
He does not need concepts over-explained or decisions softened.
He prefers direct tradeoff discussions to hedged recommendations.
He thinks at the system level and delegates implementation to AI tooling.

---

## Background

**Controls Engineering + Cloud Engineering**

This combination is the single most important thing to understand about how ZeroLagHub is designed. It explains decisions that might otherwise look over-engineered for a game hosting platform.

Controls engineers think in terms of:
- Separation of responsibility
- Safe boundaries between layers
- Deterministic, predictable behavior
- Fail states that are loud, not silent

Cloud engineers add:
- Control plane vs execution plane as a first-class concept
- State management discipline (what is ephemeral, what is persistent)
- Orchestration vs execution separation

Both disciplines together produce an architecture that looks like:

```
Portal (operator interface)
API (supervisory control / orchestration)
Agent (controller / execution)
Container Runtime (plant / device)
```

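The layering above can be read as a chain of narrow interfaces in which each layer talks only to the one directly beneath it. The Go sketch below is illustrative only: the `Runtime`, `Agent`, `API`, and `Command` types and the `fakeRuntime` stand-in are hypothetical names, not ZeroLagHub's actual code, and the Portal layer is omitted for brevity. The point it tries to show is that the supervisory layer never holds a handle to the runtime, and that an unrecognized command fails loudly instead of being ignored.

```go
package main

import (
	"errors"
	"fmt"
)

// Runtime is the lowest layer (the "plant"): it only starts and stops containers.
// Hypothetical interface, assumed for illustration.
type Runtime interface {
	Start(id string) error
	Stop(id string) error
}

// Command is a hypothetical instruction passed from the API to an agent.
type Command struct {
	Action string // "start" or "stop"
	ID     string
}

// Agent is the execution layer (the "controller"): the only layer holding a Runtime.
type Agent struct {
	rt Runtime
}

// Execute applies one command and returns an error for anything it does not recognize.
func (a *Agent) Execute(c Command) error {
	switch c.Action {
	case "start":
		return a.rt.Start(c.ID)
	case "stop":
		return a.rt.Stop(c.ID)
	default:
		return fmt.Errorf("agent: unknown action %q", c.Action)
	}
}

// API is the supervisory layer: it validates intent and delegates to the agent.
type API struct {
	agent *Agent
}

// StartServer rejects invalid input before anything reaches the agent.
func (s *API) StartServer(id string) error {
	if id == "" {
		return errors.New("api: empty container id")
	}
	return s.agent.Execute(Command{Action: "start", ID: id})
}

// fakeRuntime is an in-memory stand-in for a real container runtime.
type fakeRuntime struct {
	running map[string]bool
}

func (f *fakeRuntime) Start(id string) error { f.running[id] = true; return nil }
func (f *fakeRuntime) Stop(id string) error  { delete(f.running, id); return nil }

func main() {
	rt := &fakeRuntime{running: map[string]bool{}}
	api := &API{agent: &Agent{rt: rt}}
	if err := api.StartServer("mc-1"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("running:", rt.running["mc-1"])
}
```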
The network segmentation mirrors OT environments:
```
Core services network
Game/dev workload network
```

That's not accidental. That's training showing through.

---

## Platform Evolution (Normal, Not Accidental)

The path ZeroLagHub took is textbook platform evolution:

1. DigitalOcean (manual)
2. Pterodactyl + custom frontend
3. API + Ansible
4. API + templates
5. API + Go agent (current)

This compressed years of typical platform maturity into a shorter cycle.

---

## Operating Context — Solo Operator

**This is the most important constraint on every design decision.**

ZeroLagHub is built and operated by one person. That means:

- No on-call rotation
- No one watching dashboards at 3am
- No team to absorb incident response

The system must therefore be designed to operate without continuous human supervision. This is not a preference; it is a survival requirement.

A solo-operated platform needs to be:
- **Self-diagnosing**: knows when something is wrong without a human watching
- **Self-recovering**: handles common failure modes automatically
- **Loud about what it can't fix**: escalates only when it genuinely needs a human

The pre-launch blockers in `OPEN_THREADS.md` are not just launch requirements. They are the gap between where the system is today and where it can run unsupervised. Email notifications, quota enforcement, world backup, and readiness gating are all pieces of that self-healing capability.

What is already self-healing in ZeroLagHub (ZLH) today:
- Crash recovery with backoff (agent)
- DB as source of truth (survives restarts)
- Redis ephemeral by design (clean state on restart)
- Velocity rehydrates from DB on restart
- Agent self-update

What still requires human intervention (as of Apr 2026):
- Fabric readiness gating (manual Velocity restart needed)
- Velocity resync after restart (running containers go unrouted)
- Game world data loss (no player-facing backup/restore yet)
- Unbounded resource consumption (no quota enforcement)
- Silent failures (no email notifications yet)

Closing these gaps is the path to a system James can step away from.

---

## Design Philosophy

- Builds systems he wishes existed, not systems for their own sake
- Intentionally building away from AWS / corporate dependency
- "I want to wake up and say who do you work for? Me."
- Applies industrial segmentation thinking to cloud infrastructure
- No over-reliance on magic; prefers explicit, auditable behavior
- **Designs for absence**: the system should work when no one is watching

---

## AI Workflow Role

James acts as **system architect**. He:
- Defines vision and constraints
- Reviews and approves architectural decisions
- Prompts AI for implementation
- Can read and evaluate code output without writing it from scratch

**Claude** = architecture, strategy, design decisions, governance
**GPT/Codex (Ceàrd)** = implementation, code execution, session continuity

This division of roles is intentional and should be preserved. Claude should not drift into implementation mode. GPT should not make architectural decisions.

---

## What This Means for ZLH's Future

The architecture already supports more than game servers:
- Container provisioning
- Agent orchestration
- Runtime control
- Filesystem management
- Network isolation

The application layer (games, dev environments) sits on top of what is effectively a lightweight infrastructure platform. The Ghost Shell concept and any future expansion build on this foundation.

ZLH is closer to an edge control system than a hosting panel. That framing should inform strategic discussions.

See `GHOST_SHELL_VISION.md` for the longer-term platform strategy.