# ZeroLagHub – Project Context

## What It Is

Game server hosting platform targeting modded, indie, and emerging games. Competitive advantages: LXC containers (20-30% perf over Docker), custom agent architecture, open-source stack, developer-to-player pipeline that turns mod developers into a distribution channel.

System posture: stable, controlled expansion phase.

---

## Naming Convention

- `zlh-*` = core infrastructure (DNS, monitoring, backup, routing, artifacts)
- `zpack-*` = game and dev server stack (portal, API, containers)

---

## Infrastructure (Proxmox)

### Active VMs

| VM | Name | Role |
|----|------|------|
| 104 | zlh-monitor | Prometheus/Grafana monitoring |
| 105 | zlh-router | Core services router |
| 300 | zlh-velocity | Minecraft Velocity proxy |
| 1001 | zlh-dns | Technitium DNS |
| 1002 | zlh-proxy | Traefik: core/frontend SSL termination (portal traffic) |
| 1003 | zlh-artifacts | Runtime binaries + Minecraft server jars (agent install source) |
| 1004 | zlh-zpack-proxy | Traefik: game/dev edge routing + dev IDE wildcard TLS |
| 1005 | zpack-api | Node.js API |
| 1006 | zlh-zpack-router | Game/dev router |
| 1100 | zpack-portal | Next.js frontend |
| 2001 | zlh-back | PBS backup + Backblaze B2 |

### Legacy / Reference Only (not active production)

| VM | Name | Notes |
|----|------|-------|
| 100 | zlh-panel | Old Pterodactyl panel, kept for reference |
| 101 | zlh-wings | Old Wings, kept for reference |
| 103 | zlh-api | Old API VM, kept for reference |
| 1000 | zlh-router | Not in use |

---

## Stack

**API (zpack-api, VM 1005):** Node.js ESM, Express 5, Prisma 6, MariaDB, Redis, BullMQ, JWT, Stripe, argon2, ssh2, WebSocket, http-proxy-middleware

**Portal (zpack-portal, VM 1100):** Next.js 15, TypeScript, TailwindCSS, Axios, WebSocket console. Sci-fi HUD aesthetic (steel textures, neon accents, beveled panels).

**Agent (zlh-agent):** Go 1.21, stdlib HTTP, creack/pty, gorilla/websocket. Runs inside every game/dev container.
Only process with direct filesystem access. Pulls runtimes + server jars from zlh-artifacts (VM 1003).

---

## Agent (Operational)

- HTTP server on :18888, internal only; API is the only caller
- Container types: `game` and `dev`
- Lifecycle: POST /config triggers async provision + start pipeline
- Filesystem: strict path allowlist for games, workspace-root sandbox for dev containers
- Upload transport: raw `http.request` piping (`req.pipe(proxyReq)`), never fetch()
- Console: PTY-backed WebSocket, one read loop per container
- Self-update: periodic check + apply
- Forge/Neoforge: automated 5-step post-install patch sequence
- Modrinth mod lifecycle: install/enable/disable/delete (fully operational)
- Provenance: `.zlh_metadata.json`; source is `null` if not set
- Status transport model: poll-based (`/status`), not push-based
- State transitions: `idle`, `installing`, `starting`, `running`, `stopping`, `crashed`, `error`
- Crash recovery: backoff 30s/60s/120s, resets if uptime ≥ 30s, `error` state after repeated failures
- Crash observability: exit code, signal, uptime, log tail, classification (oom/mod_error/missing_dep/nonzero/unexpected)
- Structured logging across provisioning, installs, file ops, control plane

---

## Dev Containers (Current State)

- supported runtimes: node, python, go, java, dotnet
- runtime installs are artifact-backed and idempotent
- runtime root: `/opt/zlh/runtimes//`
- dev identity: `dev:dev`
- workspace root: `/home/dev/workspace`
- shell env: `HOME`, `USER`, `LOGNAME`, `TERM` set correctly
- code-server install path: `/opt/zlh/services/code-server`
- code-server port: `6000`
- code-server lifecycle: `POST /dev/codeserver/start|stop|restart`
- code-server detection: `/proc/*/cmdline` scan
- agent port: `18888`

Code-server launch model:

- binds to `0.0.0.0`
- `--auth none`
- API/hosted flow handles auth and proxying

---

## Dev Container Access Model

### Browser IDE (Current Working Model)

```
Browser
  ↓
Traefik (dev-.zerolaghub.dev, 10.70.0.242)
  ↓
API (10.60.0.245:4000)
  ↓
container:6000
```

Working hosted flow:

1. frontend calls `POST /api/dev/:id/ide-token`
2. API returns `https://dev-.zerolaghub.dev/?token=...`
3. browser opens hosted URL
4. Traefik wildcard router forwards to API at `http://10.60.0.245:4000`
5. API validates token, sets HTTP-only IDE cookie, redirects to clean hosted URL
6. subsequent cookie-backed request proxied to container code-server
7. code-server redirects to `/?folder=/home/dev/workspace`
8. IDE loads successfully

Curl-verified response chain:

- `GET /?token=...` → `302` + `Set-Cookie`
- `GET /` with cookie → `302` to `/?folder=/home/dev/workspace`
- `GET /?folder=/home/dev/workspace` → `200` code-server HTML

### Traefik Role

- terminates TLS via wildcard cert `*.zerolaghub.dev` (Let's Encrypt DNS-01 via Cloudflare)
- matches `dev-*.zerolaghub.dev` via `HostRegexp`
- forwards to API at `http://10.60.0.245:4000`
- preserves original `Host` header (`passHostHeader: true`)
- does NOT route directly to containers

### API Role

- extracts vmid from `Host` header via `handleHostedProxy`
- validates short-lived IDE token
- sets HTTP-only `zlh_dev_ide_token` cookie
- redirects token URL to clean hostname URL
- proxies all live code-server HTTP + WebSocket traffic to correct container

### Local Developer Access (Future)

Headscale/Tailscale for SSH, VS Code Remote, local tools. Headscale server: `zlh-ctl` (status to be confirmed). Constraints: no exit nodes, `magic_dns: false`.

### Removed / No Longer Current

- path-based `/api/dev/:id/ide` as primary browser entry
- Caddy-hosted dev IDE edge
- per-container Traefik file creation from dev provisioning
- per-container Cloudflare/Technitium publish/unpublish from API for dev IDE access

`proxyClient.js` remains in repo; still used by game edge publish logic.

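
Sketch of the host-based vmid extraction described under API Role. This is not the actual `handleHostedProxy` code, which is not shown in these notes. `extractVmid` and `DEV_HOST_PATTERN` are hypothetical names, and the hostname shape `dev-<vmid>.zerolaghub.dev` with a numeric vmid is an assumption inferred from the `dev-*.zerolaghub.dev` Traefik match.

```javascript
// Hypothetical pattern: assumes hosted IDE names look like dev-<vmid>.zerolaghub.dev
// with a purely numeric vmid. Adjust if the real naming differs.
const DEV_HOST_PATTERN = /^dev-(\d+)\.zerolaghub\.dev$/i;

// Returns the vmid as a number, or null when the Host header is missing
// or is not a hosted dev IDE hostname.
function extractVmid(hostHeader) {
  if (typeof hostHeader !== "string") return null;
  // Drop an optional ":port" suffix before matching.
  const hostname = hostHeader.trim().replace(/:\d+$/, "");
  const match = DEV_HOST_PATTERN.exec(hostname);
  return match ? Number(match[1]) : null;
}

// 5001 is an example vmid, not a real container.
console.log(extractVmid("dev-5001.zerolaghub.dev")); // 5001
console.log(extractVmid("portal.zerolaghub.dev"));   // null
```

Because Traefik preserves the original `Host` header (`passHostHeader: true`), a helper like this could run on `req.headers.host` before the token and cookie checks. Returning `null` for anything that is not a dev hostname leaves the decision about how to handle such requests to the caller.
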

---

## API Routes - Dev IDE

```
POST /api/dev/:id/ide-token - generate short-lived IDE token + hosted URL
```

Hosted requests land on the API through Traefik using the dev hostname. API handles host-based vmid extraction, token bootstrap, cookie handoff, HTTP + WebSocket proxy to code-server.

---

## API / Frontend Status

- API polls agent `/status`
- API exposes polled state back to frontend via `GET /api/servers/:id/status`
- Portal no longer relies on stale DB-only state for console availability
- Game publish flow remains untouched

---

## Game Support

**Production:** Minecraft (vanilla/Fabric/Paper/Forge/Neoforge), Rust, Terraria, Project Zomboid

**In Pipeline:** Valheim, Palworld, Vintage Story, Core Keeper

---

## Developer-to-Player Pipeline (Revenue Model)

```
LXC Dev Environment ($15-40/mo)
  → Game/mod creation + testing
  → Testing servers (50% dev discount)
  → Player community referrals (25% player discount)
  → Developer revenue share (5-10% commission)
  → Viral growth
```

Revenue multiplier: 1 developer → ~10 players → $147.50/mo total.

---

## Open Threads

1. Verify full browser behavior + WebSocket under hosted wildcard flow
2. Confirm "Open IDE" button in portal uses hosted URL in production path
3. Confirm Headscale `zlh-ctl` VM status
4. Curated provenance: tracking install origin

---

## Repo Registry

| Repo | Purpose |
|------|---------|
| zlh-grind | Execution workspace / continuity / active constraints |
| zlh-docs | API/agent/portal reference docs (read from source) |
| zpack-api | API source (mirror) |
| zpack-portal | Portal source (mirror) |
| zlh-agent | Agent source |

All at `git.zerolaghub.com/jester/`

---

## Session Guidance

- zlh-grind is the execution continuity layer, not the architecture authority
- zlh-docs has full agent documentation (routes, filesystem rules, provisioning pipeline)
- Agent is the authority on filesystem enforcement; API must NOT duplicate filesystem logic
- Portal does not enforce real policy; agent enforces
- Portal never calls agents directly; all traffic through API
- Upload transport uses raw http.request piping, never fetch()
- VMs 100, 101, 103, 1000 are legacy/unused, not active production
- Do not mark unimplemented work as complete
- Game publish flow must never be modified by dev routing changes
- `proxyClient.js` must not be deleted; used by game edge publish path
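
Illustration of the crash-recovery policy listed under Agent (Operational): backoff 30s/60s/120s, reset when uptime ≥ 30s, `error` after repeated failures. The agent itself is written in Go and its code is not shown in these notes; this JavaScript sketch only models the policy. `onCrash`, `BACKOFF_SECONDS`, and `STABLE_UPTIME_SECONDS` are hypothetical names. Treating a fourth consecutive short-lived crash as the point where the container enters `error` is an assumption, since the notes do not state the threshold.

```javascript
// Backoff schedule from the agent notes: 30s, 60s, 120s.
const BACKOFF_SECONDS = [30, 60, 120];
// A run that lasted at least this long resets the consecutive-crash counter.
const STABLE_UPTIME_SECONDS = 30;

// Computes the next crash-tracking state after a process exit.
// Assumption: "repeated failures" means the schedule above is exhausted.
function onCrash(consecutiveCrashes, uptimeSeconds) {
  // A sufficiently long run counts as healthy, so the schedule starts over.
  const prior = uptimeSeconds >= STABLE_UPTIME_SECONDS ? 0 : consecutiveCrashes;
  if (prior >= BACKOFF_SECONDS.length) {
    return { state: "error", consecutiveCrashes: prior + 1, delaySeconds: null };
  }
  return {
    state: "crashed",
    consecutiveCrashes: prior + 1,
    delaySeconds: BACKOFF_SECONDS[prior],
  };
}

// Four quick crashes in a row (uptimes in seconds are made-up sample values).
let crashes = 0;
for (const uptime of [2, 4, 3, 1]) {
  const next = onCrash(crashes, uptime);
  console.log(next.state, next.delaySeconds);
  crashes = next.consecutiveCrashes;
}
// prints: crashed 30, crashed 60, crashed 120, error null
```

Keeping the policy as a pure function of the crash count and the last uptime makes the reset rule easy to check in isolation: `onCrash(3, 45)` returns a 30-second delay again because the 45-second run cleared the counter.
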