Open Threads (zlh-grind)

This file tracks active but unfinished work.

Keep it short.


Agent (zlh-agent)

Dev Runtime System

Completed:

  • catalog validation implemented
  • runtime installs artifact-backed
  • install guard implemented

Outstanding:

  • runtime install verification improvements
  • catalog hash validation
  • runtime removal / upgrade handling

Dev Environment

Completed:

  • dev user creation
  • workspace root /home/dev/workspace
  • console runs as dev user

Outstanding:

  • PATH normalization
  • shell profile consistency
  • runtime PATH injection
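
One possible shape for PATH normalization plus runtime PATH injection, as a sketch only. The normalizePath helper and the /opt/example/runtime/bin directory are hypothetical; the real runtime layout is not specified here.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// normalizePath (hypothetical helper) puts runtime bin directories ahead of
// the existing PATH entries, drops empty entries, and removes duplicates
// while keeping the first occurrence of each directory.
func normalizePath(current string, runtimeBins ...string) string {
	entries := append(append([]string{}, runtimeBins...), strings.Split(current, ":")...)
	seen := make(map[string]bool, len(entries))
	out := make([]string, 0, len(entries))
	for _, e := range entries {
		if e == "" {
			continue
		}
		e = path.Clean(e) // "/usr/bin/" and "/usr/bin" count as the same entry
		if seen[e] {
			continue
		}
		seen[e] = true
		out = append(out, e)
	}
	return strings.Join(out, ":")
}

func main() {
	// The runtime directory below is a placeholder, not a real install path.
	fmt.Println(normalizePath("/usr/bin:/bin::/usr/bin/", "/opt/example/runtime/bin"))
}
```

Applying the same function wherever the dev user's environment is built (console launch, shell profile generation) would be one way to keep the results consistent.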

Code Server Addon

Status: Install + launch operational inside dev containers

Confirmed:

  • compiled release artifact fixed on zlh-artifacts
  • install confirmed working
  • process confirmed running inside container
  • binds to 0.0.0.0:6000
  • launched from /opt/zlh/services/code-server
  • API now writes dev Traefik dynamic config during provisioning
  • API now uses proxy SSH service account (zlh) instead of personal user

Port: 6000

Routing model:

  • DNS: Cloudflare + Technitium
  • Proxy: Traefik dynamic file written by API during dev provisioning
  • Host format currently in use: dev-<vmid>.zerolaghub.dev

Outstanding:

  • finalize external browser reachability for code-server through Cloudflare → Traefik → container
  • remove manual proxy-file edits from the debugging path so the generated config is the sole source of truth
  • standardize hostname format everywhere (dev-<vmid> only)
  • add code-server launch link in portal
  • remove dynamic Traefik file on dev container deletion

Agent Future Work (priority order)

  1. Unified structured logging (slog) — Promtail/Loki needs structured fields
  2. Dev container /status — provisioningComplete + provisioningError fields
  3. Crash recovery with backoff — 30s/60s/120s, max 3 attempts, then error state
  4. Graceful shutdown verification — SIGTERM + wait before SIGKILL for Minecraft
  5. Agent restart/process reattachment — detect existing process on restart

API (zlh-api)

Completed:

  • dev provisioning payload
  • runtime/version fields
  • enable_code_server flag
  • dev-only routing hook added during provisioning
  • Technitium + Cloudflare dev DNS creation
  • remote Traefik dynamic file writing via proxy SSH
  • proxy SSH moved to service-user model (zlh)
  • server status endpoint added so frontend can consume agent state
  • frontend status and console availability now update correctly via API polling

Outstanding:

  • runtime validation endpoint
  • dev runtime catalog endpoint for portal
  • remove Traefik dynamic config on dev container deletion
  • domain / hostname normalization audit
  • proxy/TLS generation cleanup so manual edits are no longer needed

Portal (zlh-portal)

Completed:

  • dev runtime dropdown
  • dotnet runtime support
  • enable code-server checkbox
  • dev file browser support
  • frontend now consumes API-backed status correctly for host/console state

Outstanding:

  • runtime list driven from catalog API
  • dev port exposure UI
  • code-server launch link
  • clearer dev readiness states (installing, starting, running, error, etc.)
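
The readiness states might be derived from agent status along these lines. This is a sketch in Go for illustration only: provisioningComplete and provisioningError are the planned /status fields, while their types, the processRunning field, and the readiness function are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DevStatus is an assumed shape for the dev container /status payload.
type DevStatus struct {
	ProvisioningComplete bool   `json:"provisioningComplete"`
	ProvisioningError    string `json:"provisioningError"`
	ProcessRunning       bool   `json:"processRunning"` // hypothetical field
}

// readiness (hypothetical) collapses the status fields into one display state.
// An error wins over everything else; otherwise the states progress in order.
func readiness(s DevStatus) string {
	switch {
	case s.ProvisioningError != "":
		return "error"
	case !s.ProvisioningComplete:
		return "installing"
	case !s.ProcessRunning:
		return "starting"
	default:
		return "running"
	}
}

func main() {
	// Example payload; the values are made up for the demo.
	raw := []byte(`{"provisioningComplete": true, "provisioningError": "", "processRunning": false}`)
	var s DevStatus
	if err := json.Unmarshal(raw, &s); err != nil {
		fmt.Println("decode status:", err)
		return
	}
	fmt.Println(readiness(s))
}
```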

Artifact Server

Completed:

  • runtime artifacts hosted
  • devcontainer catalog
  • runtime archive structure
  • code-server compiled release artifact

Outstanding:

  • checksum publishing
  • artifact metadata support

Platform

Active thread:

  • complete external dev IDE access path end-to-end

Future work:

  • dev port routing
  • dev service detection
  • artifact version promotion
  • runtime rollback support

Closed Threads

  • Interactive PTY-backed console (dev + game)
  • WebSocket stability and PTY ownership
  • Customer isolation (API + frontend)
  • Agent update system (versioned, hash-verified)
  • Minecraft player presence (agent-sourced)
  • Game telemetry router separation (/api/game/*)
  • Agent Phase 1 mod management endpoints
  • Agent process metrics endpoint
  • Minecraft readiness probe + restart race mitigation
  • Modrinth resolver + full mod lifecycle
  • Direct runtime upload model (no staging, no symlinks)
  • .zlh_metadata.json provenance tracking
  • Raw http.request streaming in API upload proxy
  • Filesystem architecture docs consolidated
  • Upload transport timeout tuning
  • Dev container filesystem support (container-aware, /workspace root)
  • Code-server artifact fix — compiled release on zlh-artifacts
  • Dev routing hook added to provisioning without changing game publish flow
  • API status endpoint added for frontend agent-state consumption