# ZeroLagHub — Master Bootstrap (April 2026)

**For**: Claude (strategic/architecture sessions)
**Last Updated**: April 30, 2026
**Supersedes**: Previous April 7, 2026 version
**Source refs**: zlh-grind/SCRATCH/handover-apr-2026.md, zlh-grind/OPEN_THREADS.md, zlh-grind/SCRATCH/billing-stripe-handover-apr11-2026.md

---

## AI Role Split

| AI | Role | Workspace |
|----|------|-----------|
| **Claude** | Architecture, strategy, design decisions, cross-cutting concerns | `jester/knowledge-base` |
| **GPT (Ceàrd)** | Implementation, code changes, verification, session continuity | `jester/zlh-grind` |

**Hard rule**: GPT must not make architecture decisions. If architectural interpretation is required, stop and defer to Claude or canonical docs.

**Claude's session start checklist**:
1. Read this file
2. Read `ZeroLagHub_Cross_Project_Tracker.md` (locked decisions + boundaries)
3. Reference `INFRASTRUCTURE.md` in `zlh-grind` for current IPs/VMs
4. Read `zlh-grind/OPEN_THREADS.md` for current active work

---

## What ZeroLagHub Is

Build-and-run platform for developers, modders, and game communities: browser-based dev environments, managed game servers, and a developer-to-player pipeline in one platform.

**Competitive advantages**:
- LXC containers (20-30% better performance than Docker) — system-container infrastructure, not Docker overhead
- Custom Go agent architecture — purpose-built, not adapted generic tooling
- Open-source stack (30-40% cost advantage)
- Developer-to-player pipeline: mod devs become a distribution channel

**System posture as of Apr 30, 2026**: Stable. Pre-launch validation phase. Most core workflows verified end-to-end.

---

## Infrastructure

**Single host**: GTHost Detroit Supermicro 2029TP-HTR, Xeon Gold 6152 22c/44t, 192GB DDR4, 2×1.92TB SSD, $99/mo, Proxmox VE 9.x

**Denver host**: Fully decommissioned April 2, 2026. OS wiped, disks stripped.
### Active VMs (all 9000s range — authoritative in zlh-grind/INFRASTRUCTURE.md)

| VM | Name | Role |
|----|------|------|
| 9001 | zlh-router | OPNsense, WAN 66.163.115.221 |
| 9002 | zpack-router | OPNsense, WAN 66.163.115.115 |
| 9010 | zpack-dns | Technitium DNS |
| 9011 | zlh-proxy | Caddy reverse proxy |
| 9012 | zpack-proxy | Traefik v3 — game/dev edge routing + wildcard TLS |
| 9014 | zlh-artifacts | Caddy file server — runtime binaries + server jars |
| 9015 | zpack-velocity | Velocity 3.5 — Minecraft proxy |
| 9017 | zlh-back | PBS backup + Backblaze B2 |
| 9020 | zpack-api | Node.js API, MariaDB, Redis |
| 9021 | zpack-portal | Next.js frontend |

**LXC containers**:
- Game containers: IDs 5000+
- Dev containers: IDs 6000+
- Base template: ID 820 (zlh-base)

**Legacy VMs (NOT active production)**: 100 (zlh-panel), 101 (zlh-wings), 103 (zlh-api), 1000 (zlh-router)

---

## Naming Conventions

- `zlh-*` = core infrastructure (DNS, monitoring, backup, routing, artifacts)
- `zpack-*` = game and dev server stack (portal, API, containers)

---

## Stack

**API (zpack-api, VM 9020)**: Node.js ESM, Express 5, Prisma 6, MariaDB, Redis, BullMQ, JWT, Stripe, argon2, ssh2, WebSocket, http-proxy-middleware

**Portal (zpack-portal, VM 9021)**: Next.js 15, TypeScript, TailwindCSS, Axios, WebSocket console. Hybrid marketing + SaaS structure with SEO landing pages. Pricing tiers: Starter / Pro / Performance.

**Agent (zlh-agent)**: Go 1.21, stdlib HTTP, creack/pty, gorilla/websocket. Runs inside every game/dev container. Only process with direct filesystem access. Pulls runtimes + server jars from zlh-artifacts (VM 9014).

---

## Architecture Rules (Locked)

These cannot be changed without an explicit "Revisit decision X" conversation with Claude.
| # | Decision |
|---|----------|
| DEC-001 | Templates + Agent hybrid (not templates-only or agent-only) |
| DEC-002 | API orchestrates, Agent executes (never reversed) |
| DEC-003 | API owns DNS (Agent never creates DNS records) |
| DEC-004 | Traefik + Velocity only (no HAProxy) |
| DEC-005 | MariaDB is source of truth (no flat files) |
| DEC-006 | Frontend → API only (no direct agent calls) |
| DEC-007 | Drift prevention mandatory |
| DEC-008 | Control plane uses IP env vars, not internal.zlh DNS in hot paths |
| DEC-009 | Redis never restored from backup — always start clean |
| DEC-010 | Vanilla Minecraft servers use Fabric loader + FabricProxy-Lite (not vanilla JAR) |
| DEC-011 | Velocity forwarding mode = `modern` (not `none` — breaks UUID stability) |

---

## Control Plane Architecture

```
Browser
  → Traefik (zpack-proxy, 10.70.0.242) — TLS termination, domain routing
  → API (10.60.0.245:4000) — orchestration, auth, state
  → Agent (:18888) — container execution (game + dev)
```

- Agent is an HTTP server on :18888, internal only. API is the only caller.
- Portal never calls agents directly.
- Service-to-service communication uses IP env vars, not DNS FQDNs.
- Upload transport: raw `http.request` piping (`req.pipe(proxyReq)`), never `fetch()`.

### Dev IDE Access (Browser — Current Working Model)

```
Browser → dev-.zerolaghub.dev → Traefik → API → container:6000
```

Flow: frontend calls `POST /api/dev/:id/ide-token` → API returns hosted URL → browser opens it → Traefik wildcard routes to API → API validates the token, sets an HTTP-only cookie, and proxies to code-server. Browser-verified end-to-end.
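The upload-transport rule above (raw `http.request` piping, never `fetch()`) can be sketched as follows. This is an illustrative sketch, not the actual API code — `pipeUpload`, `agentHost`, and `agentPort` are hypothetical names. The point of the rule: piping streams the client's upload straight through to the agent, whereas a `fetch()`-based forward would force buffering the whole body in API memory.

```typescript
import http from "node:http";
import type { IncomingMessage, ServerResponse } from "node:http";

// Hypothetical sketch of the raw-piping upload transport.
// Streams the inbound request body to the agent without buffering.
function pipeUpload(
  req: IncomingMessage,
  res: ServerResponse,
  agentHost: string,
  agentPort: number,
): void {
  const proxyReq = http.request(
    {
      host: agentHost,
      port: agentPort,
      path: req.url,
      method: req.method,
      headers: req.headers, // forward content-type/length untouched
      agent: false, // one-shot connection; a real proxy would reuse an agent
    },
    (proxyRes) => {
      // Mirror the agent's response back to the caller.
      res.writeHead(proxyRes.statusCode ?? 502, proxyRes.headers);
      proxyRes.pipe(res);
    },
  );
  proxyReq.on("error", () => {
    if (!res.headersSent) res.writeHead(502);
    res.end();
  });
  req.pipe(proxyReq); // the core rule: stream, never buffer
}
```

The same shape applies whether the upstream is a game or dev container agent; only the host/port resolution (IP env vars per DEC-008) differs.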
---

## Game Support

**Production**: Minecraft (Vanilla/Fabric/Paper/Forge/NeoForge), Rust, Terraria, Project Zomboid
**Pipeline**: Valheim, Palworld, Vintage Story, Core Keeper
**Target market**: modded/indie games, versus mainstream providers

### Minecraft Runtime Split (Verified)

- `vanilla` = Fabric-based internal profile with proxy/API/config injection (FabricProxy-Lite pre-seeded)
- `fabric` = plain Fabric jar delivery only
- Forge/NeoForge first-start flow avoids premature readiness gating, applies post-start property enforcement, and restarts through the readiness-aware path

---

## Developer-to-Player Revenue Pipeline

```
LXC Dev Environment ($15-40/mo)
  → Game/mod creation + testing
  → Testing servers (50% dev discount)
  → Player community referrals (25% player discount)
  → Developer revenue share (5-10% commission)
  → Viral growth
```

Revenue multiplier: 1 developer → ~10 players → $147.50/mo total from one developer acquisition.

---

## Current System State (April 30, 2026)

**System is stable and pre-launch validated** across core workflows.

### Resolved since last bootstrap (Apr 7)

- ✅ Password reset — 5-minute tokens, hashed at rest, single-use, old tokens invalidated on deploy
- ✅ Billing / Stripe — checkout flow works, customer creation works, Stripe sandbox confirmed, Portal billing UI working. **Remaining**: webhook delivery (Stripe cannot reach API — not yet publicly exposed). Once the webhook is live, billing is functionally complete.
- ✅ Vanilla/Fabric runtime split restored and validated
- ✅ Forge/NeoForge first-start flow correct end-to-end
- ✅ Delete/teardown lifecycle removes Velocity, Cloudflare, and Technitium records
- ✅ Portal consumes API-owned `connectable`/`connection` state — no longer infers Minecraft readiness itself
- ✅ Velocity proxy lifecycle callbacks live, with `registered_with_proxy` and `proxy_ping_ok` in API state
- ✅ Portal status labels fixed — non-connectable states no longer all show "Needs attention"
- ✅ Portal server creation redirects to `/servers` and tracks setup progress there
- ✅ Portal public marketing site: hybrid SEO structure, Starter/Pro/Performance pricing tiers, root metadata fixed, hero copy fixed, fake CLI line removed
- ✅ SEO landing pages added: `/minecraft-server-hosting`, `/modded-minecraft-hosting`, `/browser-dev-environment`
- ✅ Local Minecraft backup create/restore verified live end-to-end
- ✅ Restore creates an intentional pre-restore checkpoint; API starts the restore asynchronously
- ✅ Backup timestamps normalized; pre-restore checkpoints filtered from the default backup list
- ✅ Vanilla datapack upload works; direct vanilla `mods/` upload rejected by API
- ✅ NeoForge mod search/install/list works
- ✅ Agent-backed file edits create shadow copies for revert; API route/stream forwarding issues fixed
- ✅ Public exposure model in place: Portal public, control plane private
- ✅ Dev container creation succeeds; hosted IDE access verified (post-cleanup passes)
- ✅ Minecraft server creation succeeds across supported runtime variants

### Still Active: Fabric Readiness Gating (agent fix needed)

**Root cause**: Velocity registration fires before the Fabric server is ready to accept proxy traffic → "proxy starting" errors until a Velocity restart.

**Fix**: Gate Velocity registration behind TCP probe success (port 25565). The agent already runs the probe — registration must fire after the probe returns success, not at process start.

**Scope**: Agent only.
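The real fix lives in the Go agent, but the gating logic itself is small. A sketch in TypeScript for illustration only — `waitForTcpReady` and `registerWithVelocity` are hypothetical names, and the retry/timeout values are placeholder assumptions:

```typescript
import net from "node:net";

// Hypothetical sketch of the readiness gate: poll the Minecraft port until a
// TCP connect succeeds, and only then fire Velocity registration.
function waitForTcpReady(
  host: string,
  port: number,
  attempts = 30,
  delayMs = 1000,
): Promise<boolean> {
  return new Promise((resolve) => {
    const tryOnce = (left: number) => {
      const sock = net.connect({ host, port, timeout: 1000 });
      let settled = false;
      const fail = () => {
        if (settled) return;
        settled = true;
        sock.destroy();
        if (left <= 1) return resolve(false); // give up after the last attempt
        setTimeout(() => tryOnce(left - 1), delayMs);
      };
      sock.once("connect", () => {
        settled = true;
        sock.destroy();
        resolve(true); // server accepts connections → safe to register
      });
      sock.once("error", fail);
      sock.once("timeout", fail);
    };
    tryOnce(attempts);
  });
}

// Usage (illustrative):
// if (await waitForTcpReady(containerIp, 25565)) await registerWithVelocity(serverId);
```

The key ordering property: registration is a consequence of probe success, never of process start.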
No Velocity or FabricProxy-Lite changes are needed.

**Ref**: `zlh-grind/SCRATCH/session-stabilization-fabric-findings.md`

---

## Pre-Launch Blockers

In priority order:

1. **Billing webhook delivery** — Stripe cannot POST to `/api/billing/webhook` (API not publicly exposed). Checkout works and the customer/subscription are created in Stripe, but `subscriptionStatus` and `plan` in the DB are not updating. Fix: expose the webhook endpoint via a public domain with HTTPS, or use Stripe CLI forwarding for dev testing. Ref: `zlh-grind/SCRATCH/billing-stripe-handover-apr11-2026.md`
2. **Fabric readiness gating** — agent TCP probe gate (see above)
3. **User onboarding flow** — guided first-server creation after registration
4. **Usage limits / quota enforcement** — prevent unbounded server creation
5. **Email notifications** — crash, billing, provisioning complete
6. **Upload testing** — end-to-end verification in dev containers
7. **Stress testing** — k6 IDE session load + Minecraft bot + code-server memory baseline
8. **OPNsense audit** — both routers need systematic validation
9. **Service discovery migration** — remaining non-hot-path internal.zlh refs
10. **Provisioning validation** — single controlled creation; confirm 1 record / 1 job / 1 execution
11. **Final smoke test** — full lifecycle: create → ready → connectable → backup → restore → stop/start/restart → delete. Confirm Velocity unregister, Cloudflare cleanup, Technitium cleanup.
12. **Portal public-site QA** — desktop + mobile layouts, CTA routing, metadata verification. Mobile not yet optimized.
13. **Monitoring / observability** — normalize the game/dev Alloy label contract across API discovery, agent-written labels, Prometheus targets, and Grafana dashboards. Finish template cleanup (remove node-exporter, keep Alloy).
14.
**Billing Portal UI polish** — `trialing` state handling, upgrade/downgrade flow, plan limit gating in Portal

---

## Repo Registry

| Repo | Purpose |
|------|---------|
| `jester/knowledge-base` | Claude's home — architecture, strategy, canonical decisions |
| `jester/zlh-grind` | GPT's workspace — execution continuity, session handovers, debug notes |
| `jester/zlh-docs` | API/agent/portal operational reference docs |
| `jester/zpac-api` | API source |
| `jester/zpac-portal` | Portal source |
| `jester/zlh-agent` | Agent source (Go) |

---

## Session Guidance for Claude

- **Architecture decisions belong here** — if GPT hits an architectural question, it should stop and bring it to Claude
- `zlh-grind` is GPT's execution ledger, not an architecture source
- `zlh-grind/INFRASTRUCTURE.md` is authoritative for current IPs and VM inventory
- `zlh-grind/OPEN_THREADS.md` has the full active + outstanding work
- `zlh-grind/SCRATCH/` has detailed debug docs for specific topics
- Do not mark unimplemented work as complete
- The game publish flow must never be modified by dev routing changes
- Synthesis of Codex review outputs belongs here, not in Codex
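On pre-launch blocker 1 (billing webhook delivery): the verification side can be exercised locally before the endpoint is publicly exposed. Stripe signs each webhook with an HMAC-SHA256 over `timestamp.rawBody`, delivered in the `Stripe-Signature` header as `t=...,v1=...`. The official `stripe` package's `webhooks.constructEvent` does this check (plus timestamp-tolerance enforcement); the hand-rolled sketch below exists only to illustrate the scheme — `verifyStripeSignature` is a hypothetical name, and it assumes the raw, unparsed request body (i.e. `express.json()` must not run before it).

```typescript
import crypto from "node:crypto";

// Illustrative sketch of Stripe's webhook signature scheme.
// Not a replacement for stripe.webhooks.constructEvent in production.
function verifyStripeSignature(
  rawBody: string,
  sigHeader: string, // e.g. "t=1700000000,v1=<hex>"
  endpointSecret: string,
): boolean {
  const parts = new Map(
    sigHeader.split(",").map((kv) => kv.split("=", 2) as [string, string]),
  );
  const t = parts.get("t");
  const v1 = parts.get("v1");
  if (!t || !v1) return false;
  // Signed payload is "<timestamp>.<raw body>", HMAC'd with the endpoint secret.
  const expected = crypto
    .createHmac("sha256", endpointSecret)
    .update(`${t}.${rawBody}`)
    .digest("hex");
  const a = Buffer.from(expected, "hex");
  const b = Buffer.from(v1, "hex");
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

This also works with Stripe CLI forwarding (the dev-testing fix named in blocker 1), since the CLI re-signs events with a local endpoint secret.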