diff --git a/ZeroLagHub_Master_Bootstrap_Apr2026.md b/ZeroLagHub_Master_Bootstrap_Apr2026.md
index 5d7eadc..863989a 100644
--- a/ZeroLagHub_Master_Bootstrap_Apr2026.md
+++ b/ZeroLagHub_Master_Bootstrap_Apr2026.md
@@ -1,9 +1,9 @@
 # ZeroLagHub — Master Bootstrap (April 2026)
 
 **For**: Claude (strategic/architecture sessions)
-**Last Updated**: April 7, 2026
-**Supersedes**: ZeroLagHub_Master_Bootstrap_Dec2025.md
-**Source refs**: zlh-grind/SCRATCH/handover-apr-2026.md, zlh-grind/PROJECT_CONTEXT.md, zlh-grind/OPEN_THREADS.md
+**Last Updated**: April 30, 2026
+**Supersedes**: Previous April 7, 2026 version
+**Source refs**: zlh-grind/SCRATCH/handover-apr-2026.md, zlh-grind/OPEN_THREADS.md, zlh-grind/SCRATCH/billing-stripe-handover-apr11-2026.md
 
 ---
 
@@ -20,20 +20,21 @@
 1. Read this file
 2. Read `ZeroLagHub_Cross_Project_Tracker.md` (locked decisions + boundaries)
 3. Reference `INFRASTRUCTURE.md` in `zlh-grind` for current IPs/VMs
+4. Read `zlh-grind/OPEN_THREADS.md` for current active work
 
 ---
 
 ## What ZeroLagHub Is
 
-Game server hosting platform targeting modded and indie Minecraft.
+Build-and-run platform for developers, modders, and game communities. Browser-based dev environments + managed game servers + developer-to-player pipeline in one platform.
 
 **Competitive advantages**:
-- LXC containers (20-30% performance over Docker)
+- LXC containers (20-30% performance over Docker) — system-container infrastructure, not Docker overhead
 - Custom Go agent architecture — purpose-built, not adapted generic tooling
 - Open-source stack (30-40% cost advantage)
 - Developer-to-player pipeline: mod devs become a distribution channel
 
-**System posture as of Apr 2026**: Stable. Controlled expansion phase. Detroit migration complete.
+**System posture as of Apr 30, 2026**: Stable. Pre-launch validation phase. Most core workflows verified end-to-end.
 
 ---
 
@@ -79,7 +80,7 @@ Supermicro 2029TP-HTR, Xeon Gold 6152 22c/44t, 192GB DDR4, 2×1.92TB SSD, $99/mo
 
 **API (zpack-api, VM 9020)**: Node.js ESM, Express 5, Prisma 6, MariaDB, Redis, BullMQ, JWT, Stripe, argon2, ssh2, WebSocket, http-proxy-middleware
 
-**Portal (zpack-portal, VM 9021)**: Next.js 15, TypeScript, TailwindCSS, Axios, WebSocket console. Sci-fi HUD aesthetic (steel textures, neon accents, beveled panels).
+**Portal (zpack-portal, VM 9021)**: Next.js 15, TypeScript, TailwindCSS, Axios, WebSocket console. Hybrid marketing + SaaS structure with SEO landing pages. Pricing tiers: Starter / Pro / Performance.
 
 **Agent (zlh-agent)**: Go 1.21, stdlib HTTP, creack/pty, gorilla/websocket. Runs inside every game/dev container. Only process with direct filesystem access. Pulls runtimes + server jars from zlh-artifacts (VM 9014).
 
@@ -135,6 +136,11 @@ Flow: frontend calls `POST /api/dev/:id/ide-token` → API returns hosted URL
 **Pipeline**: Valheim, Palworld, Vintage Story, Core Keeper
 **Target market**: Modded/indie games vs mainstream providers
 
+### Minecraft Runtime Split (Verified)
+- `vanilla` = Fabric-based internal profile with proxy/API/config injection (FabricProxy-Lite pre-seeded)
+- `fabric` = plain Fabric jar delivery only
+- Forge/NeoForge first-start flow avoids premature readiness gating, applies post-start property enforcement, restarts through readiness-aware path
+
 ---
 
 ## Developer-to-Player Revenue Pipeline
@@ -152,19 +158,34 @@ Revenue multiplier: 1 developer → ~10 players → $147.50/mo total from one de
 
 ---
 
-## Current System State (April 7, 2026)
+## Current System State (April 30, 2026)
 
-**System is stable and deterministic** following stabilization work this week.
+**System is stable and pre-launch validated** across core workflows.
 
-### Resolved
-- ✅ Duplicate server creation (was frontend issue, not API)
-- ✅ DB/Redis state drift from Denver→Detroit migration
-- ✅ Console routing (browser → API → agent)
-- ✅ IP-based control plane (internal.zlh removed from hot paths)
-- ✅ Velocity rehydration (uses DB + Redis, not Proxmox live state)
-- ✅ Denver fully decommissioned
+### Resolved since last bootstrap (Apr 7)
 
-### Active Issue: Fabric Readiness Gating (agent fix needed)
+- ✅ Password reset — 5-minute tokens, hashed at rest, single-use, old tokens invalidated on deploy
+- ✅ Billing / Stripe — checkout flow works, customer creation works, Stripe sandbox confirmed, Portal billing UI working. **Remaining**: webhook delivery (Stripe cannot reach API — not yet publicly exposed). Once webhook is live, billing is functionally complete.
+- ✅ Vanilla/Fabric runtime split restored and validated
+- ✅ Forge/NeoForge first-start flow correct end-to-end
+- ✅ Delete/teardown lifecycle removes Velocity, Cloudflare, and Technitium records
+- ✅ Portal consumes API-owned `connectable`/`connection` state — no longer infers Minecraft readiness itself
+- ✅ Velocity proxy lifecycle callbacks live with `registered_with_proxy` and `proxy_ping_ok` in API state
+- ✅ Portal status labels fixed — non-connectable states no longer all show "Needs attention"
+- ✅ Portal server creation redirects to `/servers` and tracks setup progress there
+- ✅ Portal public marketing site: hybrid SEO structure, Starter/Pro/Performance pricing tiers, root metadata fixed, hero copy fixed, fake CLI line removed
+- ✅ SEO landing pages added: `/minecraft-server-hosting`, `/modded-minecraft-hosting`, `/browser-dev-environment`
+- ✅ Local Minecraft backup create/restore verified live end-to-end
+- ✅ Restore creates intentional pre-restore checkpoint; API starts restore asynchronously
+- ✅ Backup timestamps normalized; pre-restore checkpoints filtered from default backup list
+- ✅ Vanilla datapack upload works; direct vanilla `mods/` upload rejected by API
+- ✅ NeoForge mod search/install/list works
+- ✅ Agent-backed file edits create shadow copies for revert; API route/stream forwarding issues fixed
+- ✅ Public exposure model in place: Portal public, control plane private
+- ✅ Dev container creation succeeds; hosted IDE access verified post-cleanup passes
+- ✅ Minecraft server creation succeeds across supported runtime variants
+
+### Still Active: Fabric Readiness Gating (agent fix needed)
 **Root cause**: Velocity registration fires before Fabric server is ready to accept proxy traffic → "proxy starting" errors until Velocity restart.
 **Fix**: Gate Velocity registration behind TCP probe success (port 25565). Agent already runs probe — registration must fire after probe returns success, not at process start.
 **Scope**: Agent only. No Velocity or FabricProxy-Lite changes needed.
@@ -176,19 +197,20 @@ Revenue multiplier: 1 developer → ~10 players → $147.50/mo total from one de
 
 In priority order:
 
-1. **Billing / Stripe** — cannot take payments, hard launch blocker
-2. **Game server world backup/restore** — trust-critical, player retention risk
+1. **Billing webhook delivery** — Stripe cannot POST to `/api/billing/webhook` (API not publicly exposed). Checkout works, customer/subscription created in Stripe, but `subscriptionStatus` and `plan` in DB not updating. Fix: expose webhook endpoint via public domain with HTTPS, or use Stripe CLI forwarding for dev testing. Ref: `zlh-grind/SCRATCH/billing-stripe-handover-apr11-2026.md`
+2. **Fabric readiness gating** — agent TCP probe gate (see above)
 3. **User onboarding flow** — guided first-server creation after registration
-4. **Fabric readiness gating** — agent TCP probe gate (see above)
-5. **Password reset flow** — verify fully wired
-6. **Usage limits / quota enforcement** — prevent unbounded server creation
-7. **Email notifications** — crash, billing, provisioning complete
-8. **Upload testing** — end-to-end in dev containers
-9. **Billing endpoints** — add back to API
-10. **Stress testing** — k6 IDE session load + Minecraft bot + code-server memory baseline
-11. **OPNsense audit** — both routers need systematic validation
-12. **Service discovery migration** — replace remaining internal.zlh refs in hot paths
-13. **Provisioning validation** — single controlled creation, confirm 1 record / 1 job / 1 execution
+4. **Usage limits / quota enforcement** — prevent unbounded server creation
+5. **Email notifications** — crash, billing, provisioning complete
+6. **Upload testing** — end-to-end verification in dev containers
+7. **Stress testing** — k6 IDE session load + Minecraft bot + code-server memory baseline
+8. **OPNsense audit** — both routers need systematic validation
+9. **Service discovery migration** — remaining non-hot-path internal.zlh refs
+10. **Provisioning validation** — single controlled creation, confirm 1 record / 1 job / 1 execution
+11. **Final smoke test** — full lifecycle: create → ready → connectable → backup → restore → stop/start/restart → delete. Confirm Velocity unregister, Cloudflare cleanup, Technitium cleanup.
+12. **Portal public-site QA** — desktop + mobile layouts, CTA routing, metadata verification. Mobile not yet optimized.
+13. **Monitoring / observability** — normalize game/dev Alloy label contract across API discovery, agent-written labels, Prometheus targets, and Grafana dashboards. Finish template cleanup (remove node-exporter, keep Alloy).
+14. **Billing Portal UI polish** — `trialing` state handling, upgrade/downgrade flow, plan limit gating in Portal
 
 ---
 
@@ -199,8 +221,8 @@ In priority order:
 | `jester/knowledge-base` | Claude's home — architecture, strategy, canonical decisions |
 | `jester/zlh-grind` | GPT's workspace — execution continuity, session handovers, debug notes |
 | `jester/zlh-docs` | API/agent/portal operational reference docs |
-| `jester/zpac-api` | API source (mirror) |
-| `jester/zpac-portal` | Portal source (mirror) |
+| `jester/zpac-api` | API source |
+| `jester/zpac-portal` | Portal source |
 | `jester/zlh-agent` | Agent source (Go) |
 
 ---
@@ -214,3 +236,4 @@ In priority order:
 - `zlh-grind/SCRATCH/` has detailed debug docs for specific topics
 - Do not mark unimplemented work as complete
 - Game publish flow must never be modified by dev routing changes
+- Synthesis of Codex review outputs belongs here, not in Codex