Open Threads — zlh-grind
This file tracks active but unfinished work.
Keep it short.
Agent (zlh-agent)
Dev Runtime System
Completed:
- catalog validation implemented
- runtime installs artifact-backed
- install guard implemented
- all installs now fetch from artifact server (no local artifact assumption)
Outstanding:
- runtime install verification improvements
- catalog hash validation
- runtime removal / upgrade handling
- runtime update process — in-place runtime version upgrade for dev containers
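One possible shape for the outstanding catalog hash validation, sketched in Python for illustration only. It assumes each catalog entry carries a SHA-256 hex digest for its artifact; that field and the `verify_artifact` name are hypothetical, not something the catalog is known to provide today.

```python
import hashlib
import hmac


def verify_artifact(path: str, expected_sha256: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` hashes to `expected_sha256` (hex).

    ASSUMPTION: the catalog supplies a SHA-256 hex digest per artifact.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Stream in chunks so large runtime tarballs are not loaded into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(digest.hexdigest(), expected_sha256.strip().lower())
```

A check like this would sit between the artifact download and the unpack step, so the install guard can refuse a mismatched file before anything touches the runtime directory.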
Dev Environment
Completed:
- dev user creation
- workspace root /home/dev/workspace
- console runs as dev user
- HOME, USER, LOGNAME, TERM env vars set correctly
Outstanding:
- PATH normalization
- shell profile consistency
- runtime PATH injection
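A minimal sketch of how the console environment and the outstanding PATH normalization might fit together, in Python for illustration only. HOME is inferred from the workspace path above; the `dev` login name, the TERM value, the base PATH, and the runtime bin directory in the usage line are assumptions.

```python
from typing import Dict, Iterable, List

# ASSUMPTION: a conventional Linux base PATH; the real container may differ.
BASE_PATH = ["/usr/local/sbin", "/usr/local/bin", "/usr/sbin", "/usr/bin", "/sbin", "/bin"]


def normalize_path(entries: Iterable[str]) -> List[str]:
    """Drop empty and duplicate PATH entries, keeping first-seen order."""
    seen, result = set(), []
    for entry in entries:
        if not entry:
            continue
        entry = entry.rstrip("/") or "/"  # treat "/bin/" and "/bin" as the same entry
        if entry not in seen:
            seen.add(entry)
            result.append(entry)
    return result


def build_dev_env(runtime_bins: Iterable[str] = ()) -> Dict[str, str]:
    """Environment for a console running as the dev user.

    Runtime bin directories go first so an installed runtime wins over
    any system copy of the same tool.
    """
    path = normalize_path([*runtime_bins, *BASE_PATH])
    return {
        "HOME": "/home/dev",
        "USER": "dev",
        "LOGNAME": "dev",
        "TERM": "xterm-256color",  # ASSUMPTION: the actual TERM value is not documented
        "PATH": ":".join(path),
    }
```

For example, `build_dev_env(["/opt/zlh/runtimes/node/bin"])` (a hypothetical install location) puts that directory at the front of PATH exactly once.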
Code Server Addon
Status: ✅ Installed, running, browser-verified end-to-end
Confirmed:
- pulled from artifact server (tar.gz)
- installed to /opt/zlh/services/code-server
- binds to 0.0.0.0:6000
- lifecycle endpoints: POST /dev/codeserver/start|stop|restart
- detection via /proc/*/cmdline scan
- full browser IDE loading confirmed at dev-6070.zerolaghub.dev
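The /proc/*/cmdline detection can be sketched roughly as follows. This is Python for illustration only, not the agent's code; the `find_pids` name and the substring match on `code-server` are assumptions about how a match is decided.

```python
from pathlib import Path
from typing import List


def find_pids(needle: str, proc_root: str = "/proc") -> List[int]:
    """Return PIDs whose command line contains `needle` in any argument."""
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/self
        try:
            raw = (entry / "cmdline").read_bytes()
        except OSError:
            continue  # process exited mid-scan or is not readable
        # cmdline is a NUL-separated argv list.
        argv = [arg.decode(errors="replace") for arg in raw.split(b"\x00") if arg]
        if any(needle in arg for arg in argv):
            pids.append(int(entry.name))
    return sorted(pids)
```

Calling `find_pids("code-server")` on a container returns an empty list when the service is stopped, which is enough for a running/stopped check without tracking a PID file.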
Game Server Supervision
Completed:
- crash recovery with backoff: 30s → 60s → 120s
- backoff resets if uptime ≥ 30s
- transitions to error state after repeated failures
- crash observability: time, exit code, signal, uptime, log tail, classification
- classifications: oom, mod_or_plugin_error, missing_dependency, nonzero_exit, unexpected_exit
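The backoff rules above can be sketched as a small state machine. This is Python for illustration only; the delays and the 30s reset come from the notes, while `MAX_FAILURES` is an assumption because the number of failures that triggers the error state is not recorded here.

```python
from dataclasses import dataclass
from typing import Optional

BACKOFF_STEPS = (30, 60, 120)  # seconds: 30s -> 60s -> 120s
STABLE_UPTIME = 30             # uptime (seconds) that resets the backoff
MAX_FAILURES = 3               # ASSUMPTION: real threshold is not documented


@dataclass
class Supervisor:
    failures: int = 0
    state: str = "running"

    def on_crash(self, uptime_s: float) -> Optional[int]:
        """Return the delay before the next restart, or None when giving up."""
        if uptime_s >= STABLE_UPTIME:
            self.failures = 0  # the server ran long enough: treat as a fresh crash
        if self.failures >= MAX_FAILURES:
            self.state = "error"
            return None
        delay = BACKOFF_STEPS[min(self.failures, len(BACKOFF_STEPS) - 1)]
        self.failures += 1
        self.state = "restarting"
        return delay
```

With these values, three quick crashes in a row yield delays of 30, 60, and 120 seconds, and a fourth moves the server to the error state; a crash after at least 30 seconds of uptime starts again at 30.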
Agent Future Work (priority order)
- Structured logging (slog) for Loki
- Dev container provisioningComplete state in /status
- Graceful shutdown verification (SIGTERM + wait for Minecraft)
- Process reattachment on agent restart
- SSH server install in dev container provisioning pipeline (for CF Tunnel SSH)
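For the graceful shutdown item, the SIGTERM-then-wait pattern looks roughly like this. It is a Python sketch for illustration only; the 30 second default timeout and the SIGKILL fallback are assumptions, not documented agent behavior.

```python
import subprocess


def graceful_stop(proc: subprocess.Popen, timeout_s: float = 30.0) -> bool:
    """Send SIGTERM and wait for exit; fall back to SIGKILL on timeout.

    Returns True if the process exited without needing SIGKILL.
    """
    if proc.poll() is not None:
        return True  # already exited
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL: the process did not shut down in time
        proc.wait()
        return False
```

Verification would then mean confirming that a Minecraft server stopped this way returns True within the timeout and leaves its world data saved.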
Dev IDE Access
Browser IDE ✅ Fully Working (browser-verified) — Zero Install
Browser → dev-<vmid>.zerolaghub.dev → Traefik → API → container:6000
This is the primary zero-install access method. VS Code in the browser, no client software required. Browser-verified and production ready.
Remaining Work
- confirm "Open IDE" button in portal uses hosted URL in production path
- reduce legacy /__ide/:id compatibility paths once portal button confirmed
- simplify and harden devProxy: remove stale path-based assumptions
Local Dev Access — SSH via CF Tunnel (Power Users Only)
Important: CF Tunnel SSH requires cloudflared installed on the
developer's local machine. This is not zero-install and is an optional
feature for power users who want local VS Code or terminal access.
The browser IDE remains the zero-install story for all developers.
Current state:
- ✅ CF Tunnel created and connected to bastion VM
- ✅ Cloudflare Zero Trust free plan active
- 🟡 Tunnel SSH hostname mapping not yet configured in Zero Trust dashboard
- 🟡 Bastion SSH proxy jump config not yet done
- 🟡 Dev container SSH server not yet verified
- 🟡 Portal SSH config snippet not yet built
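The portal SSH config snippet is not built yet; the sketch below shows one shape it could take, in Python for illustration only. The `ProxyCommand cloudflared access ssh --hostname %h` line is the client-side form Cloudflare documents for SSH over a tunnel. The `zlh-bastion` alias, the `dev` user, the tunnel hostname, and the container address are placeholders, since the Zero Trust hostname mapping and the bastion jump config are still open.

```python
def render_ssh_config(vmid: int, tunnel_host: str, container_addr: str, user: str = "dev") -> str:
    """Render an ~/.ssh/config fragment: laptop -> CF Tunnel -> bastion -> dev container.

    ASSUMPTION: all hostnames, aliases, and the user name are hypothetical.
    """
    return "\n".join([
        "# Requires cloudflared installed locally (power users only).",
        "Host zlh-bastion",
        f"    HostName {tunnel_host}",
        f"    User {user}",
        "    ProxyCommand cloudflared access ssh --hostname %h",
        "",
        f"Host dev-{vmid}",
        f"    HostName {container_addr}",
        f"    User {user}",
        "    ProxyJump zlh-bastion",
        "",
    ])
```

The portal could show the rendered text next to a clear note that it only works with cloudflared installed, keeping the browser IDE as the default path.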
API (zpack-api)
Completed:
- dev provisioning payload
- runtime/version fields
- enable_code_server flag
- GET /api/servers/:id/status - server status endpoint
- POST /api/dev/:id/ide-token - IDE token generation + hosted URL
- GET /api/dev/:id/ide - bootstrap route
- /__ide/:id/* - live tunnel proxy (HTTP + WS, target-bound)
- handleHostedProxy - host-based routing via Host header vmid extraction
- hosted flow browser-verified end-to-end
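The Host header vmid extraction used for hosted routing can be illustrated as below. This is a Python sketch, not the `handleHostedProxy` implementation; it assumes vmids are purely numeric (as in dev-6070) and that only the zerolaghub.dev zone is served.

```python
import re
from typing import Optional

# ASSUMPTION: vmid is a run of ASCII digits, e.g. dev-6070.zerolaghub.dev
HOST_RE = re.compile(r"dev-([0-9]+)\.zerolaghub\.dev")


def vmid_from_host(host_header: str) -> Optional[int]:
    """Extract the vmid from a Host header, or return None if it does not match."""
    host = host_header.strip().lower()
    host = host.split(":", 1)[0]  # drop an optional :port suffix
    match = HOST_RE.fullmatch(host)
    return int(match.group(1)) if match else None
```

Matching the whole host rather than searching inside it means look-alike names such as `dev-6070.zerolaghub.dev.example.com` are rejected instead of being routed to a container.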
Outstanding:
- Billing / Stripe integration — cannot take payments without this, launch blocker
- Password reset flow — verify it is fully wired up
- Usage limits / quota enforcement — prevent unbounded server creation per account
- simplify and harden host-native devProxy
- dev runtime catalog endpoint for portal
- Headscale auth key generation
Portal (zpack-portal)
Completed:
- dev runtime dropdown
- dotnet runtime support
- enable code-server checkbox
- dev file browser support
- ✅ Site copy/wording rewrite — landing, features, FAQ, about updated for new platform
- ✅ Pricing page updated — Vanilla/Modded/Heavy tiers, Minecraft only
Outstanding:
- confirm "Open IDE" button fully uses hosted URL flow
- SSH config snippet for power users (optional, clearly labeled as requiring cloudflared install)
- User onboarding flow — what happens after registration? Guided first-server creation needed
- Email notifications — server crashed, provisioning complete, billing receipts
- Remove testdameon binary from repo root (7 MB, should not be in repo)
Game Servers
Outstanding:
- Game server world backup / restore — player world data backup separate from PBS infrastructure backup. Trust-critical — players losing world data will kill retention.
- Game server subdomain: how do players connect? Verify IP vs subdomain (e.g. mc.zerolaghub.com style)
Infrastructure
Hosting — GTHost (Decision: Stay, Most Cost-Effective)
GTHost Detroit is the primary host. Decision made to stay on GTHost long-term:
- Bare metal, instant Proxmox install without workarounds
- Unmetered bandwidth
- Competitive pricing — most cost-effective option evaluated
- DigitalOcean: too expensive, no bare metal/Proxmox
- OVH: more expensive, overkill for current scale
- Hetzner: Proxmox install was painful historically
DDoS Protection (Resolved — Accepted Risk at Launch)
Investigation complete:
- GTHost Detroit page: no DDoS mention
- GTHost Chicago page: vague mention, no specs
- Cloudflare Spectrum: ~$30k/year, not viable
- Path.net: enterprise-focused, requires consult, not on our radar yet
- OPNsense provides basic rate limiting and firewall protection
- Cloudflare DNS (non-proxied) hides real IP from casual attackers
- Minecraft Java uses TCP, which is harder to flood volumetrically than UDP-based games
Decision: Accept DDoS risk at launch. Low threat profile for Minecraft-only soft launch with small user base. Revisit when revenue supports it or if an attack occurs.
Pre-Launch Checklist
Outstanding before launch:
- Billing / Stripe integration — launch blocker, cannot take money
- Game server world backup / restore — trust-critical
- User onboarding flow: guided first-server creation after registration
- Password reset flow — verify wired up
- Usage limits / quota enforcement — per account
- Game server subdomain — verify player connection method
- Email notifications — crashed, billing, provisioning
- Upload testing — test file upload flow end-to-end in dev containers
- Billing endpoints — add back to API
- Stress testing — k6 IDE session load test + Minecraft bot test
- OPNsense audit — both routers need systematic validation
- Dedicated host upgrade — evaluate GTHost Gold 6152, Detroit
- Trial period: $5/day up to 10 days, PBS restore approach
- Remove testdameon binary from zpack-portal repo root
Platform
Future work:
- CF Tunnel SSH completion — power user feature, requires client cloudflared install
- artifact version promotion
- runtime rollback support
- Cloudflare R2 for large artifact/mod file delivery at scale
- Admin panel — manage users/servers as operator
- Referral / dev pipeline reward system — revenue sharing for developers
- Uptime history — visible to users per server
- DDoS mitigation — revisit Path.net or similar when revenue supports it
Closed Threads
- ✅ PTY console (dev + game)
- ✅ Mod lifecycle
- ✅ Upload pipeline
- ✅ Runtime artifact installs
- ✅ Dev container filesystem model
- ✅ Code-server artifact fix
- ✅ API status endpoint for frontend agent-state consumption
- ✅ Game server crash recovery with backoff
- ✅ Crash observability (classification, log tail, exit metadata)
- ✅ Code-server lifecycle endpoints (start/stop/restart)
- ✅ Code-server process detection via /proc scan
- ✅ Dev IDE proxy — path-based browser IDE working end-to-end
- ✅ Hosted wildcard Traefik → API → container dev IDE flow — browser-verified
- ✅ Per-container dev IDE edge publish/unpublish removed from API
- ✅ Wildcard TLS cert *.zerolaghub.dev via Let's Encrypt + Cloudflare DNS-01
- ✅ Browser IDE fully loading at dev-<vmid>.zerolaghub.dev
- ✅ CF Tunnel created and connected to bastion VM
- ✅ Portal copy rewrite — landing, features, FAQ, about, pricing
- ✅ DDoS investigation — accepted risk at launch, revisit post-launch
- ✅ Hosting provider decision — GTHost Detroit, most cost-effective option