Open Threads – zlh-grind
This file tracks active but unfinished work.
Keep it short.
Agent (zlh-agent)
Dev Runtime System
Completed:
- catalog validation implemented
- runtime installs artifact-backed
- install guard implemented
- all installs now fetch from artifact server (no local artifact assumption)
Outstanding:
- runtime install verification improvements
- catalog hash validation
- runtime removal / upgrade handling
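One possible shape for the outstanding catalog hash validation, as a minimal sketch. It assumes the catalog would carry a SHA-256 hex digest per artifact; the algorithm, the function name, and the parameter names are illustrative and not taken from the agent.

```python
import hashlib
import hmac


def artifact_matches(path, expected_sha256, chunk_size=1 << 20):
    """Return True if the file at `path` hashes to `expected_sha256` (hex).

    Assumption: the catalog stores a SHA-256 hex digest for each artifact.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large runtime archives are not loaded into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    # Normalize the catalog value, then compare in constant time.
    return hmac.compare_digest(digest.hexdigest(), expected_sha256.strip().lower())
```

An install guard could refuse to unpack any artifact for which this returns False.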
Dev Environment
Completed:
- dev user creation
- workspace root /home/dev/workspace
- console runs as dev user
- HOME, USER, LOGNAME, TERM env vars set correctly
Outstanding:
- PATH normalization
- shell profile consistency
- runtime PATH injection
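A rough sketch of what PATH normalization plus runtime PATH injection could look like. The function name and the idea of prepending runtime directories are assumptions for illustration, not the agent's current behavior.

```python
def normalize_path(path_value, prepend=()):
    """Return a PATH string with injected entries first and duplicates removed.

    `prepend` holds hypothetical runtime bin directories to inject.
    """
    seen, result = set(), []
    for entry in [*prepend, *path_value.split(":")]:
        if not entry:
            continue  # an empty entry means "current directory"; drop it
        entry = entry.rstrip("/") or "/"  # treat /usr/bin/ and /usr/bin as equal
        if entry not in seen:
            seen.add(entry)
            result.append(entry)
    return ":".join(result)
```

Running the same function for the console and the shell profile would keep both views of PATH consistent.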
Code Server Addon
Status: ✅ Installed, running, browser-verified end-to-end
Confirmed:
- pulled from artifact server (tar.gz)
- installed to /opt/zlh/services/code-server
- binds to 0.0.0.0:6000
- lifecycle endpoints: POST /dev/codeserver/start|stop|restart
- detection via /proc/*/cmdline scan
- full browser IDE loading confirmed at dev-6070.zerolaghub.dev
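The /proc/*/cmdline detection can be sketched roughly as below. This is an illustration only: the function name, the substring match, and the `proc_root` parameter (added so the scan can be pointed at a test directory) are assumptions, and the agent's real matching rule may differ.

```python
from pathlib import Path


def find_pids(needle, proc_root="/proc"):
    """Return sorted PIDs whose command line contains `needle`."""
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # only numeric directories are processes
        try:
            raw = (entry / "cmdline").read_bytes()
        except OSError:
            continue  # process exited mid-scan or is not readable
        # Arguments in cmdline are separated by NUL bytes.
        args = raw.decode("utf-8", "replace").split("\0")
        if any(needle in arg for arg in args):
            pids.append(int(entry.name))
    return sorted(pids)
```

For example, `find_pids("code-server")` returning a non-empty list would mean the service is running.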
Game Server Supervision
Completed:
- crash recovery with backoff: 30s → 60s → 120s
- backoff resets if uptime ≥ 30s
- transitions to error state after repeated failures
- crash observability: time, exit code, signal, uptime, log tail, classification
- classifications: oom, mod_or_plugin_error, missing_dependency, nonzero_exit, unexpected_exit
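The backoff rules above can be read as a small state machine. The sketch below is one interpretation, not the agent's code: the class name is hypothetical, and it assumes the error state is entered on the crash after the 120s step has been used, since the exact failure threshold is not stated here.

```python
from dataclasses import dataclass

BACKOFF_STEPS = (30, 60, 120)  # restart delays in seconds
STABLE_UPTIME = 30             # uptime in seconds at which the backoff resets


@dataclass
class CrashBackoff:
    failures: int = 0  # consecutive crashes since the last stable run

    def on_exit(self, uptime_seconds):
        """Return the restart delay in seconds, or None for the error state."""
        if uptime_seconds >= STABLE_UPTIME:
            self.failures = 0  # the process ran long enough; start over
        if self.failures >= len(BACKOFF_STEPS):
            return None  # assumption: give up once every step has been used
        delay = BACKOFF_STEPS[self.failures]
        self.failures += 1
        return delay
```

Three quick crashes in a row yield delays of 30, 60, and 120 seconds; a fourth returns None, which the supervisor would map to the error state.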
Agent Future Work (priority order)
- Structured logging (slog) for Loki
- Dev container provisioningComplete state in /status
- Graceful shutdown verification (SIGTERM + wait for Minecraft)
- Process reattachment on agent restart
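For the graceful shutdown item, the SIGTERM-then-wait pattern might look like the sketch below. The 30 second default and the escalation to SIGKILL are assumptions; a Minecraft server may need longer to save the world.

```python
import subprocess


def stop_gracefully(proc, timeout=30.0):
    """Send SIGTERM and wait; return True if the process exited in time."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()  # assumption: escalate to SIGKILL after the timeout
        proc.wait()  # reap the process so it does not linger as a zombie
        return False
```

Verification would then amount to checking that this returns True for a running game server and that the exit is not classified as a crash.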
Dev IDE Access
Browser IDE ✅ Fully Working (browser-verified)
Browser → dev-<vmid>.zerolaghub.dev → Traefik → API → container:6000
Browser-verified: VS Code loads in browser at dev-6070.zerolaghub.dev/?folder=/home/dev/workspace
with workspace mounted, extensions panel visible, AI chat panel active.
Remaining Work
- confirm "Open IDE" button in portal uses hosted URL in production path
- reduce legacy /__ide/:id compatibility paths once portal button confirmed
- simplify and harden devProxy: remove stale path-based assumptions
Local Dev Access — SSH via CF Tunnel (Next Step)
Decision: Cloudflare Tunnel on bastion VM. Free tier covers up to 50 users. Same hostname as browser IDE — different protocols routed separately.
Developer one-time SSH config:
Host *.zerolaghub.dev
    ProxyCommand cloudflared access ssh --hostname %h
Outstanding:
- Install cloudflared on bastion VM
- Create CF Tunnel pointed at bastion SSH port
- Map *.zerolaghub.dev SSH through tunnel
- Portal SSH config snippet UI
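For the portal snippet UI, the text to display could be generated with something like this. It is a sketch only: the function name is hypothetical, and the output simply mirrors the one-time SSH config shown earlier in this section.

```python
def ssh_config_snippet(host_pattern="*.zerolaghub.dev"):
    """Return the SSH config block a developer pastes into ~/.ssh/config."""
    return (
        f"Host {host_pattern}\n"
        "    ProxyCommand cloudflared access ssh --hostname %h\n"
    )
```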
API (zpack-api)
Completed:
- dev provisioning payload
- runtime/version fields
- enable_code_server flag
- GET /api/servers/:id/status: server status endpoint
- POST /api/dev/:id/ide-token: IDE token generation + hosted URL
- GET /api/dev/:id/ide: bootstrap route
- /__ide/:id/*: live tunnel proxy (HTTP + WS, target-bound)
- handleHostedProxy: host-based routing via Host header vmid extraction
- hosted flow browser-verified end-to-end
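The Host header vmid extraction used for hosted routing can be illustrated as follows. This is not the handleHostedProxy implementation: the function name and the assumption that vmids are purely numeric are illustrative.

```python
import re

# Assumption: hosted IDE hostnames are dev-<vmid>.zerolaghub.dev, numeric vmid.
HOST_PATTERN = re.compile(r"dev-(\d+)\.zerolaghub\.dev")


def vmid_from_host(host_header):
    """Return the vmid from a Host header value, or None if it does not match."""
    host = host_header.strip().lower()
    if ":" in host:
        host = host.rsplit(":", 1)[0]  # drop an optional port such as :443
    # fullmatch rejects extra labels before or after the expected hostname.
    match = HOST_PATTERN.fullmatch(host)
    return int(match.group(1)) if match else None
```

Matching the whole hostname, rather than searching inside it, is part of hardening the proxy: a lookalike host should never resolve to a container.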
Outstanding:
- Billing endpoints — need to be added back
- simplify and harden host-native devProxy
- dev runtime catalog endpoint for portal
- Headscale auth key generation
Portal (zpack-portal)
Completed:
- dev runtime dropdown
- dotnet runtime support
- enable code-server checkbox
- dev file browser support
Outstanding:
- confirm "Open IDE" button fully uses hosted URL flow
- SSH config snippet for local VS Code / terminal access
- site copy/wording — needs rewriting for public audience
Pre-Launch Checklist
Outstanding before launch:
- Upload testing — test file upload flow end-to-end in dev containers
- Portal copy/wording — site needs rewriting for public audience
- Billing endpoints — add back to API
- Stress testing — k6 IDE session load test + Minecraft bot test
- See knowledge-base/operations/stress-testing.md
- OPNsense audit — both routers need systematic validation
- See knowledge-base/network/opnsense-checklist.md
- Dedicated host migration — evaluate GTHost upgrade (Gold 6152, Detroit)
- Trial period: $5/day up to 10 days, PBS restore approach
Platform
Future work:
- CF Tunnel SSH access (see Local Dev Access above)
- Tailscale dev access (alternative/complement to CF Tunnel)
- artifact version promotion
- runtime rollback support
- Cloudflare R2 for large artifact/mod file delivery at scale
Closed Threads
- ✅ PTY console (dev + game)
- ✅ Mod lifecycle
- ✅ Upload pipeline
- ✅ Runtime artifact installs
- ✅ Dev container filesystem model
- ✅ Code-server artifact fix
- ✅ API status endpoint for frontend agent-state consumption
- ✅ Game server crash recovery with backoff
- ✅ Crash observability (classification, log tail, exit metadata)
- ✅ Code-server lifecycle endpoints (start/stop/restart)
- ✅ Code-server process detection via /proc scan
- ✅ Dev IDE proxy — path-based browser IDE working end-to-end
- ✅ Hosted wildcard Traefik → API → container dev IDE flow — browser-verified
- ✅ Per-container dev IDE edge publish/unpublish removed from API
- ✅ Wildcard TLS cert *.zerolaghub.dev via Let's Encrypt + Cloudflare DNS-01
- ✅ Browser IDE fully loading at dev-.zerolaghub.dev