knowledge-base/OWNER_PROFILE.md


Owner Profile — James

How to Work With James

James understands code and can read it fluently — he is not a beginner. He does not need concepts over-explained or decisions softened. Direct tradeoff discussions are preferred over hedged recommendations. He thinks at the system level and delegates implementation to AI tooling.


Background

Controls Engineering + Cloud Engineering

This combination is the single most important thing to understand about how ZeroLagHub is designed. It explains decisions that might otherwise look over-engineered for a game hosting platform.

Controls engineers think in terms of:

  • Separation of responsibility
  • Safe boundaries between layers
  • Deterministic, predictable behavior
  • Fail states that are loud, not silent

Cloud engineers add:

  • Control plane vs execution plane as a first-class concept
  • State management discipline (what is ephemeral, what is persistent)
  • Orchestration vs execution separation

Both disciplines together produce an architecture that looks like:

Portal (operator interface)
API (supervisory control / orchestration)
Agent (controller / execution)
Container Runtime (plant / device)

With network segmentation that mirrors OT environments:

Core services network
Game/dev workload network

That's not accidental. That's training showing through.
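
As an illustration only, the layering above can be sketched in Go as an ordered set of tiers where each tier commands only the one directly beneath it. The names `Layer`, `controlsRole`, and `CheckCommand` are hypothetical, and the strict "adjacent layer only" rule is an assumption made for the sketch, not a documented ZeroLagHub constraint.

```go
package main

import "fmt"

// Layer is one tier of the stack, ordered from operator (top) to plant (bottom).
type Layer int

const (
	Portal Layer = iota
	API
	Agent
	ContainerRuntime
)

// controlsRole pairs each layer with its controls-engineering analogue,
// using the wording from the profile.
var controlsRole = map[Layer]string{
	Portal:           "operator interface",
	API:              "supervisory control / orchestration",
	Agent:            "controller / execution",
	ContainerRuntime: "plant / device",
}

// CheckCommand returns nil only when `from` commands the layer directly
// beneath it. ASSUMPTION: the adjacent-only rule is illustrative.
func CheckCommand(from, to Layer) error {
	if to != from+1 || to > ContainerRuntime {
		return fmt.Errorf("%q may not command %q directly",
			controlsRole[from], controlsRole[to])
	}
	return nil
}

func main() {
	// API -> Agent is one step down, so it is allowed.
	fmt.Println(CheckCommand(API, Agent))
	// Portal -> ContainerRuntime skips two layers, so it is rejected loudly.
	fmt.Println(CheckCommand(Portal, ContainerRuntime))
}
```

Returning an error instead of silently ignoring a bad call mirrors the "fail states that are loud, not silent" principle.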


Platform Evolution (Normal, Not Accidental)

The path ZeroLagHub took is textbook platform evolution:

  1. DigitalOcean (manual)
  2. Pterodactyl + custom frontend
  3. API + Ansible
  4. API + templates
  5. API + Go agent (current)

This path compressed what is typically years of platform maturation into a much shorter cycle.


Operating Context — Solo Operator

This is the most important constraint on every design decision.

ZeroLagHub is built and operated by one person. That means:

  • No on-call rotation
  • No one watching dashboards at 3am
  • No team to absorb incident response

The system must therefore be designed to operate without continuous human supervision. This is not a preference — it is a survival requirement.

A solo-operated platform needs to be:

  • Self-diagnosing — knows when something is wrong without a human watching
  • Self-recovering — handles common failure modes automatically
  • Loud about what it can't fix — escalates only when it genuinely needs a human

The pre-launch blockers in OPEN_THREADS.md are not just launch requirements. They are the gap between where the system is and where it can run unsupervised. Email notifications, quota enforcement, world backup, readiness gating — these are all self-healing pieces.

What is already self-healing in ZLH today:

  • Crash recovery with backoff (agent)
  • DB as source of truth (survives restarts)
  • Redis ephemeral by design (clean state on restart)
  • Velocity rehydrates from DB on restart
  • Agent self-update

What still requires human intervention (as of Apr 2026):

  • Fabric readiness gating (manual Velocity restart needed)
  • Velocity resync after restart (running containers go unrouted)
  • Game world data loss (no player-facing backup/restore yet)
  • Unbounded resource consumption (no quota enforcement)
  • Silent failures (no email notifications yet)

Closing these gaps is the path to a system James can step away from.


Design Philosophy

  • Builds systems he wishes existed, not systems for their own sake
  • Intentionally building away from AWS / corporate dependency
  • "I want to wake up and say who do you work for? Me."
  • Applies industrial segmentation thinking to cloud infrastructure
  • No over-reliance on magic — prefers explicit, auditable behavior
  • Designs for absence — the system should work when no one is watching

AI Workflow Role

James acts as system architect. He:

  • Defines vision and constraints
  • Reviews and approves architectural decisions
  • Prompts AI for implementation
  • Can read and evaluate code output without writing it from scratch

Claude = architecture, strategy, design decisions, governance
GPT/Codex (Ceàrd) = implementation, code execution, session continuity

This is intentional and should be preserved. Claude should not drift into implementation mode. GPT should not make architectural decisions.

Repo access: Both Claude and GPT access Gitea via the same MCP server (https://git.zerolaghub.com/mcp). GPT's MCP connection has worked before and has since lost access; this is a connector configuration issue, not a fundamental limitation. When GPT's MCP is connected, it can write directly to zlh-grind without James relaying through Claude. Claude handles knowledge-base writes as the architecture authority.


What This Means for ZLH's Future

The architecture already supports more than game servers:

  • Container provisioning
  • Agent orchestration
  • Runtime control
  • Filesystem management
  • Network isolation

The application layer (games, dev environments) sits on top of what is effectively a lightweight infrastructure platform. The Ghost Shell concept and any future expansion build on this foundation.

ZLH is closer to an edge control system than a hosting panel. That framing should inform strategic discussions.

See GHOST_SHELL_VISION.md for the longer-term platform strategy.