diff --git a/SCRATCH/handover-apr-2026.md b/SCRATCH/handover-apr-2026.md
new file mode 100644
index 0000000..bf8a2bb
--- /dev/null
+++ b/SCRATCH/handover-apr-2026.md
@@ -0,0 +1,148 @@
+# ZeroLagHub — Session Handover (Apr 7, 2026)
+
+## Platform Overview
+
+ZeroLagHub is a game server hosting platform targeting modded/indie Minecraft.
+Custom stack: Node.js API, Next.js portal, Go agent, Velocity proxy, LXC containers on Proxmox.
+Source of truth: `git.zerolaghub.com/jester/zlh-grind` (mirrored to GitHub `jester1181/`)
+
+Key docs in this repo:
+- `OPEN_THREADS.md` — all active and future work
+- `INFRASTRUCTURE.md` — server specs, VM inventory, IPs
+- `PROJECT_CONTEXT.md` — stack, architecture (note: has old VM IDs, INFRASTRUCTURE.md is authoritative)
+- `SCRATCH/` — session notes, debug docs, specs
+
+---
+
+## Infrastructure
+
+**Single host:** GTHost Detroit — Supermicro 2029TP-HTR, Xeon Gold 6152 22c/44t, 192GB DDR4, 2x1.92TB SSD, $99/mo, Proxmox VE 9.x
+
+**Denver host decommissioned** Apr 2, 2026 — OS wiped, disks striped.
+
+Key VMs (all 9000s range):
+- 9001 zlh-router — OPNsense, 10.60.0.254, WAN 66.163.115.221
+- 9002 zpack-router — OPNsense, 10.70.0.1, WAN 66.163.115.115
+- 9010 zpack-dns — Technitium DNS, 10.60.0.14
+- 9011 zlh-proxy — Caddy, 10.60.0.16
+- 9012 zpack-proxy — Traefik v3, 10.70.0.11
+- 9014 zlh-artifacts — Caddy file server, 10.60.0.17
+- 9015 zpack-velocity — Velocity 3.5, 10.70.0.10
+- 9017 zlh-back — PBS, 10.60.0.24 / 172.60.0.30
+- 9020 zpack-api — Node.js API, MariaDB, Redis, 10.60.0.18
+- 9021 zpack-portal — Next.js, 10.60.0.19
+
+Full IP table in `INFRASTRUCTURE.md`.
+
+---
+
+## Architecture
+
+**Control plane:** API ↔ Agent ↔ services via direct IP env vars (not DNS)
+- internal.zlh DNS exists but must NOT be used in service-to-service hot paths
+- See `SCRATCH/service-discovery.md` for spec
+
+**Data plane:** user/browser → Traefik (zpack-proxy) → domain-based routing
+
+**Game containers:** LXC, IDs 5000+
+**Dev containers:** LXC, IDs 6000+
+**Base template:** ID 820 (zlh-base)
+
+**Agent** runs inside every container on :18888. Only caller is the API.
+**API** is central authority for routing, console, orchestration.
+**Velocity** (ZpackVelocityBridge plugin) handles Minecraft player routing via HTTP bridge on :8081.
+
+---
+
+## Current System State
+
+System is **stable and deterministic** following stabilization work.
+
+Resolved this week:
+- ✅ Duplicate server creation — was frontend issue, not API
+- ✅ DB/Redis state drift from Denver→Detroit migration — Redis flushed, DB verified clean
+- ✅ Console routing — browser → API → agent
+- ✅ IP-based control plane — internal.zlh removed from hot paths
+- ✅ Velocity rehydration — uses DB + Redis, not Proxmox live state
+- ✅ Denver fully decommissioned
+
+---
+
+## Active Issues (Next Session Priority)
+
+### 1. Fabric Readiness Gating (Agent fix needed)
+**Root cause:** Velocity registration happens before Fabric server is fully ready to accept proxy traffic. Results in "proxy starting" errors until Velocity is restarted.
+
+**Fix:** Gate Velocity registration behind TCP probe success (port 25565). The agent already runs a TCP probe — Velocity registration just needs to happen after probe returns success, not at process start.
+
+**Does NOT require:** Changes to Velocity, FabricProxy-Lite version, or Fabric API version.
+
+See `SCRATCH/session-stabilization-fabric-findings.md`
+
+### 2. Fabric/Vanilla Stack (what's working)
+- Fabric server starts correctly with Fabric API + FabricProxy-Lite pre-seeded in mods/
+- Correct versions for 1.21.7: `fabric-api-0.129.0+1.21.7.jar` + `fabricproxy-lite-2.9.0.jar`
+- Agent normalizes "vanilla" game type → fabric profile, pulls Fabric loader as server.jar
+- Forwarding secret must be written to `config/FabricProxy-Lite.toml` at provisioning time
+- See `SCRATCH/minecraft-velocity-forwarding.md` for full per-game-type requirements
+
+### 3. Velocity Config (current state)
+```toml
+player-info-forwarding-mode = "modern"
+forwarding-secret-file = "forwarding.secret"
+```
+Lobby fallback removed. ZpackVelocityBridge handles all routing dynamically.
+
+---
+
+## Pre-Launch Checklist (from OPEN_THREADS)
+
+Blockers:
+- **Billing / Stripe** — cannot take money without this
+- **Game server world backup/restore** — trust-critical
+- **User onboarding flow** — guided first-server creation after register
+- **Fabric readiness gating** — agent-level fix
+- **Password reset flow** — verify wired up
+- **Usage limits / quota enforcement**
+- **Email notifications**
+- **Upload testing**
+- **Billing endpoints**
+- **Stress testing** — k6 + Minecraft bot + code-server memory baseline
+- **OPNsense audit**
+- **Service discovery migration** — remaining internal.zlh refs in hot paths
+- **Provisioning validation** — single controlled test post-cleanup
+
+---
+
+## Key Decisions Made
+
+- **Forwarding mode:** `modern` — `none` breaks player identity/UUID stability
+- **Vanilla servers:** Use Fabric loader + FabricProxy-Lite, not vanilla JAR
+- **Service discovery:** IP env vars, not DNS, for control plane
+- **Redis:** Never restore from backup — always start clean
+- **Database:** Source of truth for all state
+- **Hosting:** GTHost Detroit long-term (most cost-effective)
+- **Velocity plugin:** ZpackVelocityBridge — in-memory only, re-registration needed after restart
+
+---
+
+## Repo Registry
+
+All at `git.zerolaghub.com/jester/`:
+- `zlh-grind` — execution workspace, continuity, this file
+- `zlh-docs` — API/agent/portal reference docs
+- `zpac-api` — API source
+- `zpac-portal` — portal source
+- `zlh-agent` — agent source (Go)
+
+---
+
+## Session Guidance for Claude
+
+- Read `OPEN_THREADS.md` at session start for current state
+- Read `INFRASTRUCTURE.md` for IPs and VM inventory
+- `SCRATCH/` has detailed docs for specific topics
+- Agent is authority on filesystem — API must not duplicate filesystem logic
+- Portal never calls agents directly — all traffic through API
+- Game publish flow must never be modified by dev routing changes
+- Do not mark unimplemented work as complete
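
Note on Active Issue 1 (Fabric readiness gating): the fix described in the handover, registering with Velocity only after the TCP probe on port 25565 succeeds, could look roughly like the sketch below. It is not the agent's actual code. The real agent is written in Go, and Python is used here only to show the ordering. The names `probe_tcp` and `register_when_ready`, the `register` callback, and the retry defaults are all hypothetical.

```python
import socket
import time
from typing import Callable


def probe_tcp(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def register_when_ready(
    host: str,
    port: int,
    register: Callable[[], None],
    attempts: int = 30,
    interval: float = 1.0,
    probe: Callable[[str, int], bool] = probe_tcp,
) -> bool:
    """Call register() only after the probe succeeds.

    Returns True if registration ran, False if every attempt failed.
    `register` stands in for whatever call registers the server with
    Velocity (hypothetical here).
    """
    for attempt in range(attempts):
        if probe(host, port):
            register()  # registration happens strictly after probe success
            return True
        if attempt < attempts - 1:
            time.sleep(interval)  # wait before probing again
    return False
```

With this ordering, the registration callback never fires at process start. It runs once, on the first successful probe, and is skipped entirely if the server never opens its port within the attempt budget.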