Open Threads zlh-grind

This file tracks active but unfinished work.

Keep it short.


Agent (zlh-agent)

Dev Runtime System

Completed:

  • catalog validation implemented
  • runtime installs artifact-backed
  • install guard implemented
  • all installs now fetch from artifact server (no local artifact assumption)

Outstanding:

  • runtime install verification improvements
  • catalog hash validation
  • runtime removal / upgrade handling
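
Sketch for the outstanding catalog hash validation item, assuming the catalog would carry a hex SHA-256 per artifact. That field and the names `sha256_of` and `verify_artifact` are hypothetical, not part of the current agent:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large runtime tarballs are never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Return True only when the downloaded artifact matches the catalog hash.

    Assumption: the catalog stores the hash as a hex string (any case).
    """
    return sha256_of(path) == expected_sha256.strip().lower()
```

An install guard could refuse to unpack any artifact for which `verify_artifact` returns False.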

Dev Environment

Completed:

  • dev user creation
  • workspace root /home/dev/workspace
  • console runs as dev user
  • HOME, USER, LOGNAME, TERM env vars set correctly

Outstanding:

  • PATH normalization
  • shell profile consistency
  • runtime PATH injection
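
One possible shape for PATH normalization plus runtime PATH injection, as a sketch only. `normalize_path` and the runtime bin directory used in the usage note are hypothetical:

```python
import os


def normalize_path(path_value: str, runtime_bins: list) -> str:
    """Prepend runtime bin dirs, then drop empty and duplicate entries.

    The first occurrence of each entry wins, so injected runtime
    directories take precedence over system ones.
    """
    seen = set()
    ordered = []
    for entry in [*runtime_bins, *path_value.split(os.pathsep)]:
        entry = entry.rstrip("/") or entry  # keep a bare "/" intact
        if entry and entry not in seen:
            seen.add(entry)
            ordered.append(entry)
    return os.pathsep.join(ordered)
```

Usage, with a made-up runtime directory: `normalize_path(os.environ.get("PATH", ""), ["/opt/zlh/runtimes/example/bin"])`.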

Code Server Addon

Status: Installed, running, and proxied through API

Confirmed:

  • pulled from artifact server (tar.gz)
  • installed to /opt/zlh/services/code-server
  • binds to 0.0.0.0:8080
  • lifecycle endpoints: POST /dev/codeserver/start|stop|restart
  • detection via /proc/*/cmdline scan
  • browser IDE fully working end-to-end via API proxy
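
The /proc/*/cmdline detection can be sketched as below. This is an illustration, not the agent's actual code; `find_pids` and the substring match on "code-server" are assumptions:

```python
from pathlib import Path


def find_pids(needle: str, proc_root: Path = Path("/proc")) -> list:
    """Return sorted PIDs whose command line contains `needle`."""
    pids = []
    for entry in proc_root.iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            raw = (entry / "cmdline").read_bytes()
        except OSError:
            continue  # process exited mid-scan or is not readable
        # Arguments in cmdline are NUL-separated.
        cmdline = raw.replace(b"\0", b" ").decode("utf-8", "replace").strip()
        if needle in cmdline:
            pids.append(int(entry.name))
    return sorted(pids)
```

`find_pids("code-server")` returning a non-empty list would mean the service is running. A substring match can also catch unrelated processes that mention the name in their arguments, so a stricter match on the install path may be preferable.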

Game Server Supervision

Completed:

  • crash recovery with backoff: 30s → 60s → 120s
  • backoff resets if uptime ≥ 30s
  • transitions to error state after repeated failures
  • crash observability: time, exit code, signal, uptime, log tail, classification
  • classifications: oom, mod_or_plugin_error, missing_dependency, nonzero_exit, unexpected_exit
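
The backoff rules above, as a small state-machine sketch. The 30s/60s/120s steps and the 30s reset come from the list; `max_failures` is an assumption, since the number of failures that triggers the error state is not recorded here:

```python
from dataclasses import dataclass
from typing import Optional

BACKOFF_STEPS = (30, 60, 120)  # seconds between restart attempts
STABLE_UPTIME = 30             # uptime in seconds that resets the backoff


@dataclass
class CrashBackoff:
    max_failures: int = 3  # assumption: real threshold not stated in this file
    failures: int = 0

    def on_exit(self, uptime_s: float) -> Optional[int]:
        """Return the restart delay in seconds, or None to enter the error state."""
        if uptime_s >= STABLE_UPTIME:
            self.failures = 0  # the process was stable, so start over
        self.failures += 1
        if self.failures > self.max_failures:
            return None
        step = min(self.failures, len(BACKOFF_STEPS)) - 1
        return BACKOFF_STEPS[step]
```

With the assumed threshold of 3, three quick crashes yield delays of 30, 60, and 120 seconds, and a fourth returns None.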

Agent Future Work (priority order)

  1. Structured logging (slog) for Loki
  2. Dev container provisioningComplete state in /status
  3. Graceful shutdown verification (SIGTERM + wait for Minecraft)
  4. Process reattachment on agent restart
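
For item 3, a minimal SIGTERM-then-wait sketch. The 30-second default and the SIGKILL fallback are assumptions, not verified agent behavior:

```python
import subprocess


def stop_gracefully(proc: subprocess.Popen, timeout_s: float = 30.0) -> int:
    """Send SIGTERM, wait up to timeout_s, then fall back to SIGKILL.

    Returns the process exit code (negative signal number on POSIX
    when the process was terminated by a signal).
    """
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()  # the process ignored or outlived SIGTERM
        return proc.wait()
```

Verification for Minecraft would then amount to checking that the wait returns before the timeout and that the world saved cleanly.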

Dev IDE Access

Browser IDE Working (path-based)

Browser → Portal → API (bootstrap) → /__ide/:id/* → container:8080

Working flow:

  1. frontend calls POST /api/dev/:id/ide-token
  2. API returns /api/dev/:id/ide?token=...
  3. frontend opens that URL in new tab
  4. bootstrap route validates token, sets HTTP-only IDE cookie, redirects to /__ide/:id/
  5. all live code-server HTTP + WS traffic proxied through /__ide/:id/*
  6. API proxies to http://<container-ip>:8080
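
Steps 1 and 4 (token issue and validation) could look like the sketch below. The token format is an assumption: this file does not say how zpack-api builds or checks IDE tokens, and `issue_token` / `validate_token` are hypothetical names. Cookie setting and the redirect are left out:

```python
import hashlib
import hmac
import time
from typing import Optional


def _sign(secret: bytes, payload: str) -> str:
    return hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(secret: bytes, dev_id: str, ttl_s: int = 60,
                now: Optional[float] = None) -> str:
    """Build a short-lived token bound to one dev container id."""
    expires = int((time.time() if now is None else now) + ttl_s)
    payload = f"{dev_id}:{expires}"
    return f"{payload}:{_sign(secret, payload)}"


def validate_token(secret: bytes, dev_id: str, token: str,
                   now: Optional[float] = None) -> bool:
    """Accept only an unexpired, untampered token issued for dev_id."""
    try:
        token_id, expires, sig = token.rsplit(":", 2)
        expires_at = int(expires)
    except ValueError:
        return False  # wrong number of fields or non-numeric expiry
    current = time.time() if now is None else now
    expected = _sign(secret, f"{token_id}:{expires}")
    signature_ok = hmac.compare_digest(sig.encode(), expected.encode())
    return signature_ok and token_id == dev_id and current < expires_at
```

Binding the token to the container id keeps a token issued for one dev container from opening another.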

Host-based IDE URL — Caddy edge (BLOCKED)

Goal: open IDE on dev-<vmid>.zerolaghub.dev instead of raw API IP.

Browser → dev-6070.zerolaghub.dev → Caddy → 127.0.0.1:4000 → API

State:

  • API env vars set: DEV_IDE_HOST_SUFFIX=zerolaghub.dev, DEV_IDE_RETURN_HOSTED_URL=true
  • API generating correct absolute URL: http://dev-6070.zerolaghub.dev/?token=...
  • Caddyfile block correct:
    http://dev-*.zerolaghub.dev {
        @dev host dev-*.zerolaghub.dev
        reverse_proxy @dev 127.0.0.1:4000
    }
    
  • auto_https off global option added
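
Host-based URL generation, sketched under the assumption that the host is always dev-<vmid> plus the DEV_IDE_HOST_SUFFIX value. `ide_url` and its `scheme` parameter are hypothetical:

```python
from urllib.parse import urlencode


def ide_url(vmid: int, token: str, host_suffix: str, scheme: str = "http") -> str:
    """Build the absolute IDE bootstrap URL for one dev container."""
    query = urlencode({"token": token})  # percent-encodes unsafe characters
    return f"{scheme}://dev-{vmid}.{host_suffix}/?{query}"
```

For example, `ide_url(6070, "abc", "zerolaghub.dev")` gives `http://dev-6070.zerolaghub.dev/?token=abc`.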

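Blocking issue: the browser forces zerolaghub.dev subdomains to HTTPS regardless of Caddy config. The entire .dev TLD is on the browser HSTS preload list, so this is not just a cached entry:

  • chrome://net-internals/#hsts only removes dynamic entries; deleting zerolaghub.dev and dev-6070.zerolaghub.dev there will not lift the preloaded .dev policy
  • the host-based URL most likely has to be served over HTTPS (TLS at Caddy) instead of plain http://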

Resume here next session.

Local Dev Access (Headscale/Tailscale — Future)

Outstanding:

  • confirm zlh-ctl Headscale server status
  • implement Tailscale addon install in agent
  • API auth key generation
  • portal setup instructions

Constraints: magic_dns: false, no exit nodes, no DNS takeover


API (zpack-api)

Completed:

  • dev provisioning payload
  • runtime/version fields
  • enable_code_server flag
  • GET /api/servers/:id/status — server status endpoint
  • POST /api/dev/:id/ide-token — IDE token generation
  • GET /api/dev/:id/ide — bootstrap route (validates token, sets cookie, redirects)
  • /__ide/:id/* — live tunnel proxy (HTTP + WS, target-bound)
  • dev routing experiment removed (devRouting.js, devDePublisher.js deleted)
  • host-based URL generation (DEV_IDE_HOST_SUFFIX, DEV_IDE_RETURN_HOSTED_URL)

Outstanding:

  • dev runtime catalog endpoint for portal
  • Headscale auth key generation

Portal (zpack-portal)

Completed:

  • dev runtime dropdown
  • dotnet runtime support
  • enable code-server checkbox
  • dev file browser support

Outstanding:

  • "Open IDE" button — calls POST /api/dev/:id/ide-token, opens returned URL in new tab
  • Headscale setup instructions

Platform

Future work:

  • Tailscale dev access
  • artifact version promotion
  • runtime rollback support

Closed Threads

  • PTY console (dev + game)
  • Mod lifecycle
  • Upload pipeline
  • Runtime artifact installs
  • Dev container filesystem model
  • Code-server artifact fix
  • API status endpoint for frontend agent-state consumption
  • Dev DNS/Traefik routing experiment — removed
  • Game server crash recovery with backoff
  • Crash observability (classification, log tail, exit metadata)
  • Code-server lifecycle endpoints (start/stop/restart)
  • Code-server process detection via /proc scan
  • Dev IDE proxy — browser IDE fully working end-to-end (path-based)