zlh-grind/PROJECT_CONTEXT.md

ZeroLagHub Project Context

What It Is

Game server hosting platform targeting modded, indie, and emerging games. Competitive advantages: LXC containers (20-30% better performance than Docker), a custom agent architecture, an open-source stack, and a developer-to-player pipeline that turns mod developers into a distribution channel.

System posture: stable, controlled expansion phase.


Naming Convention

  • zlh-* = core infrastructure (DNS, monitoring, backup, routing, artifacts)
  • zpack-* = game and dev server stack (portal, API, containers)

Infrastructure (Proxmox)

Active VMs

| VM | Name | Role |
| --- | --- | --- |
| 104 | zlh-monitor | Prometheus/Grafana monitoring |
| 105 | zlh-router | Core services router |
| 300 | zlh-velocity | Minecraft Velocity proxy |
| 1001 | zlh-dns | Technitium DNS |
| 1002 | zlh-proxy | Traefik: core/frontend SSL termination (portal traffic) |
| 1003 | zlh-artifacts | Runtime binaries + Minecraft server jars (agent install source) |
| 1004 | zlh-zpack-proxy | Traefik: game server traffic only |
| 1005 | zpack-api | Node.js API |
| 1006 | zlh-zpack-router | Game server router |
| 1100 | zpack-portal | Next.js frontend |
| 2001 | zlh-back | PBS backup + Backblaze B2 |

Legacy / Reference Only (not active production)

| VM | Name | Notes |
| --- | --- | --- |
| 100 | zlh-panel | Old Pterodactyl panel, kept for reference |
| 101 | zlh-wings | Old Wings, kept for reference |
| 103 | zlh-api | Old API VM, kept for reference |
| 1000 | zlh-router | Not in use |

Stack

API (zpack-api, VM 1005): Node.js ESM, Express 5, Prisma 6, MariaDB, Redis, BullMQ, JWT, Stripe, argon2, ssh2, WebSocket, http-proxy-middleware

Portal (zpack-portal, VM 1100): Next.js 15, TypeScript, TailwindCSS, Axios, WebSocket console. Sci-fi HUD aesthetic (steel textures, neon accents, beveled panels).

Agent (zlh-agent): Go 1.21, stdlib HTTP, creack/pty, gorilla/websocket. Runs inside every game/dev container. Only process with direct filesystem access. Pulls runtimes + server jars from zlh-artifacts (VM 1003).


Agent (Operational)

  • HTTP server on :18888, internal only — API is the only caller
  • Container types: game and dev
  • Lifecycle: POST /config triggers async provision + start pipeline
  • Filesystem: strict path allowlist for games, workspace-root sandbox for dev containers
  • Upload transport: raw http.request piping (req.pipe(proxyReq)), never fetch()
  • Console: PTY-backed WebSocket, one read loop per container
  • Self-update: periodic check + apply
  • Forge/NeoForge: automated 5-step post-install patch sequence
  • Modrinth mod lifecycle: install/enable/disable/delete — fully operational
  • Provenance: .zlh_metadata.json — source is null if not set
  • Status transport model: poll-based (/status), not push-based
  • Agent states: idle, installing, starting, running, stopping, crashed, error
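
The workspace-root sandbox for dev containers can be pictured as a path check. This is a minimal Python sketch for illustration only: the agent itself is written in Go, and its real enforcement lives in zlh-docs. The helper name `resolve_in_workspace` is hypothetical, it assumes requests carry paths relative to the workspace root, and it does lexical normalization only (symlinks are not resolved).

```python
import posixpath

WORKSPACE_ROOT = "/home/dev/workspace"  # workspace root listed in this document


def resolve_in_workspace(requested: str, root: str = WORKSPACE_ROOT) -> str:
    """Return the normalized absolute path, or raise if it escapes the root.

    Hypothetical helper: lexical check only, symlinks are not resolved.
    """
    # Join relative to the root, then collapse "." and ".." segments.
    # An absolute `requested` replaces the root entirely and is rejected below.
    candidate = posixpath.normpath(posixpath.join(root, requested))
    # Accept the root itself or anything strictly beneath it.
    if candidate != root and not candidate.startswith(root + "/"):
        raise PermissionError(f"path escapes workspace: {requested!r}")
    return candidate
```

Comparing against `root + "/"` rather than `root` alone keeps a sibling such as `/home/dev/workspace-evil` from passing the prefix test.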

Dev Containers (Current State)

  • supported runtimes: node, python, go, java, dotnet
  • runtime installs are artifact-backed and idempotent
  • runtime root: /opt/zlh/runtimes/<runtime>/<version>
  • dev identity: dev:dev
  • workspace root: /home/dev/workspace
  • code-server install path: /opt/zlh/services/code-server
  • code-server port: 6000
  • agent port: 18888
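
The runtime root layout above can be sketched as a small path builder. This is an illustrative Python sketch, not the agent's Go code: `runtime_root` is a hypothetical name, and the version pattern is an assumption (the real validation rule is not stated here).

```python
import posixpath
import re

SUPPORTED_RUNTIMES = {"node", "python", "go", "java", "dotnet"}  # from the list above
RUNTIME_BASE = "/opt/zlh/runtimes"

# Assumed version shape: starts alphanumeric, then letters, digits, ".", "_", "+", "-".
_VERSION_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._+-]*")


def runtime_root(runtime: str, version: str) -> str:
    """Build /opt/zlh/runtimes/<runtime>/<version> (hypothetical helper)."""
    if runtime not in SUPPORTED_RUNTIMES:
        raise ValueError(f"unsupported runtime: {runtime!r}")
    # Rejecting slashes and leading dots keeps the version inside its directory.
    if not _VERSION_RE.fullmatch(version):
        raise ValueError(f"invalid version: {version!r}")
    return posixpath.join(RUNTIME_BASE, runtime, version)
```

Because the path is fully determined by runtime and version, an installer could treat an existing directory at that path as "already installed", which is one way to get the idempotence described above.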

Confirmed:

  • code-server process launches and binds to 0.0.0.0:6000
  • frontend host/console state updates correctly via API status endpoint

Pending agent change: code-server must be relaunched with --auth none --base-path /api/dev/<vmid>/ide
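
As a rough sketch of the pending change, the launch arguments could be derived from the VMID as below. The flag names are copied verbatim from this document and have not been checked against the code-server CLI; `code_server_args` is a hypothetical helper, written in Python for illustration although the agent is Go.

```python
from typing import List


def code_server_args(vmid: int) -> List[str]:
    """Return the extra launch arguments named in this document (hypothetical helper)."""
    if vmid <= 0:
        raise ValueError(f"invalid vmid: {vmid!r}")
    # Base path mirrors the API proxy route /api/dev/<vmid>/ide.
    base_path = f"/api/dev/{vmid}/ide"
    # Flags as written in this document; unverified against code-server itself.
    return ["--auth", "none", "--base-path", base_path]
```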


Dev Container Access Model

Browser IDE (Implemented)

Browser
  ↓
Portal
  ↓
API proxy (/api/dev/:id/ide)
  ↓
container:6000

Portal calls POST /api/dev/:id/ide-token first, then opens the returned URL in a new tab. The token is short-lived (300 s) and signed by the API. The proxy accepts the token via an Authorization: Bearer header or a ?token= query parameter. WebSocket upgrades are validated with the same token.
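
The token flow can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the API's Node.js code: the stack lists JWT, but this document does not say whether IDE tokens are JWTs, so the sketch swaps in a plain HMAC-SHA256 format. The wire format `<id>.<expiry>.<signature>` and all function names are hypothetical.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional

TOKEN_TTL_SECONDS = 300  # lifetime stated in this document


def _sign(secret: bytes, payload: str) -> str:
    digest = hmac.new(secret, payload.encode(), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def issue_ide_token(secret: bytes, container_id: str, now: Optional[float] = None) -> str:
    """Return '<id>.<expiry>.<signature>'. Assumes the id contains no dots."""
    expires = int(now if now is not None else time.time()) + TOKEN_TTL_SECONDS
    payload = f"{container_id}.{expires}"
    return f"{payload}.{_sign(secret, payload)}"


def verify_ide_token(secret: bytes, token: str, container_id: str,
                     now: Optional[float] = None) -> bool:
    """Check container binding, signature, and expiry, in that order."""
    parts = token.split(".")
    if len(parts) != 3:
        return False
    token_id, expires_text, signature = parts
    if token_id != container_id:
        return False
    if not (expires_text.isascii() and expires_text.isdigit()):
        return False
    expected = _sign(secret, f"{token_id}.{expires_text}")
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return False
    current = now if now is not None else time.time()
    return current < int(expires_text)


def extract_token(headers: dict, query: dict) -> Optional[str]:
    """Prefer 'Authorization: Bearer', fall back to ?token=.

    Assumes header names are already lowercased by the caller.
    """
    auth = headers.get("authorization", "")
    if auth.lower().startswith("bearer "):
        return auth[7:].strip() or None
    return query.get("token")
```

Binding the container id into the signed payload means a token issued for one container is rejected when presented on another container's proxy route.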

Containers are never publicly exposed.

Local Developer Access (Future)

Headscale/Tailscale for SSH, VS Code Remote, local tools. Headscale server: zlh-ctl (status to be confirmed). Constraints: no exit nodes, magic_dns: false.

Removed

The DNS-per-container + Traefik dynamic routing approach was abandoned. Code removed from the API: devRouting.js, devDePublisher.js, and Traefik file writes. proxyClient.js is retained because the game edge publish path still uses it.


API Routes — Dev IDE

POST /api/dev/:id/ide-token   — generate short-lived IDE token
GET  /api/dev/:id/ide         — proxy to container:6000
GET  /api/dev/:id/ide/*       — proxy to container:6000
GET  /api/servers/:id/status  — expose polled agent state to frontend

API / Frontend Status

  • API polls agent /status
  • API exposes polled state back to frontend via GET /api/servers/:id/status
  • Portal no longer relies on stale DB-only state for console availability
  • Game publish flow remains untouched
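
The poll-then-expose flow above can be sketched as a small cache. This is a Python illustration only (the API is Node.js): `StatusCache`, the `fetch` callback standing in for an HTTP call to the agent's /status, and the 5-second freshness window are all assumptions not stated in this document.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Agent states listed in this document.
AGENT_STATES = {"idle", "installing", "starting", "running", "stopping", "crashed", "error"}


class StatusCache:
    """Keeps the last polled agent state per server (hypothetical helper)."""

    def __init__(self, fetch: Callable[[str], str], max_age: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # stand-in for a GET to the agent's /status
        self._max_age = max_age  # assumed freshness window, in seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def poll(self, server_id: str) -> str:
        """Ask the agent for its state and remember when it answered."""
        state = self._fetch(server_id)
        if state not in AGENT_STATES:
            raise ValueError(f"unknown agent state: {state!r}")
        self._entries[server_id] = (state, self._clock())
        return state

    def get(self, server_id: str) -> Optional[str]:
        """State to serve to the frontend, or None if never polled or stale."""
        entry = self._entries.get(server_id)
        if entry is None:
            return None
        state, polled_at = entry
        if self._clock() - polled_at > self._max_age:
            return None
        return state
```

Returning None for stale entries, rather than the last known value, is one way to keep the portal from showing an outdated console state.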

Game Support

Production: Minecraft (vanilla/Fabric/Paper/Forge/NeoForge), Rust, Terraria, Project Zomboid

In Pipeline: Valheim, Palworld, Vintage Story, Core Keeper


Developer-to-Player Pipeline (Revenue Model)

LXC Dev Environment ($15-40/mo)
  → Game/mod creation + testing
  → Testing servers (50% dev discount)
  → Player community referrals (25% player discount)
  → Developer revenue share (5-10% commission)
  → Viral growth

Revenue multiplier: 1 developer → ~10 players → $147.50/mo total.
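
This document does not state how the $147.50 figure breaks down. The sketch below shows one decomposition that happens to reproduce it, purely as an assumption: the developer pays the midpoint of the $15-40 range, and each of the 10 players nets $12.00/mo after discounts. Neither input is confirmed by the source, and the commission is left out.

```python
def monthly_total(dev_price: float, players: int, net_player_price: float) -> float:
    """Monthly revenue from one developer plus the players they bring in."""
    return dev_price + players * net_player_price


DEV_PRICE = (15 + 40) / 2   # assumed: midpoint of the $15-40 range
NET_PLAYER_PRICE = 12.00    # assumed: not stated in this document

print(monthly_total(DEV_PRICE, 10, NET_PLAYER_PRICE))  # prints 147.5
```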


Open Threads

  1. Agent: update code-server launch args (--auth none, --base-path /api/dev/<vmid>/ide)
  2. Portal: "Open IDE" button calling /api/dev/:id/ide-token
  3. Confirm Headscale zlh-ctl VM status
  4. Curated provenance — tracking install origin

Repo Registry

| Repo | Purpose |
| --- | --- |
| zlh-grind | Execution workspace / continuity / active constraints |
| zlh-docs | API/agent/portal reference docs (read from source) |
| zpack-api | API source (mirror) |
| zpack-portal | Portal source (mirror) |
| zlh-agent | Agent source |

All at git.zerolaghub.com/jester/<repo>


Session Guidance

  • zlh-grind is the execution continuity layer, not the architecture authority
  • zlh-docs has full agent documentation (routes, filesystem rules, provisioning pipeline)
  • Agent is the authority on filesystem enforcement — API must NOT duplicate filesystem logic
  • Portal does not enforce real policy — agent enforces
  • Portal never calls agents directly — all traffic through API
  • Upload transport uses raw http.request piping, never fetch()
  • VMs 100, 101, 103, 1000 are legacy/unused — not active production
  • Do not mark unimplemented work as complete
  • Game publish flow must never be modified by dev routing changes
  • proxyClient.js must not be deleted — used by game edge publish path