Pivot dev access — abandon Traefik/DNS, adopt API proxy + Headscale
parent 88db6134f3
commit d128d92d15
173 OPEN_THREADS.md
@@ -40,9 +40,9 @@ Outstanding:
 
 ---
 
-### Code Server Addon
+## Code Server Addon
 
-Status: ✅ Install + launch operational inside dev containers
+Status: ✅ Installed and running inside dev containers
 
 Confirmed:
 
@@ -51,62 +51,114 @@ Confirmed:
 - process confirmed running inside container
 - binds to `0.0.0.0:6000`
 - launched from `/opt/zlh/services/code-server`
-- API now writes dev Traefik dynamic config during provisioning
-- API now uses proxy SSH service account (`zlh`) instead of personal user
 
 Port: `6000`
 
-Routing model:
+---
 
-- DNS: Cloudflare + Technitium
-- Proxy: Traefik dynamic file written by API during dev provisioning
-- Host format currently in use: `dev-<vmid>.zerolaghub.dev`
+### Access Model (Updated)
 
-Outstanding:
+The previous approach using:
 
-- finalize external browser reachability for code-server through Cloudflare → Traefik → container
-- remove manual proxy-file edits from debugging path and ensure generated config is the sole source
-- standardize hostname format everywhere (`dev-<vmid>` only)
-- add code-server launch link in portal
-- remove dynamic Traefik file on dev container deletion
+- Cloudflare DNS
+- Technitium DNS
+- Traefik dynamic config per container
+
+has been **abandoned**.
+
+Reason:
+
+- too many moving pieces
+- TLS and proxy complexity
+- per-container DNS automation
+- unnecessary exposure of internal dev services
 
 ---
 
-### Agent Future Work (priority order)
+### New Access Strategy
 
-1. Unified structured logging (slog) — Promtail/Loki needs structured fields
-2. Dev container /status — provisioningComplete + provisioningError fields
-3. Crash recovery with backoff — 30s/60s/120s, max 3 attempts, then error state
-4. Graceful shutdown verification — SIGTERM + wait before SIGKILL for Minecraft
-5. Agent restart/process reattachment — detect existing process on restart
+Dev containers will support **two access paths**.
+
+#### Path 1 — Browser IDE (Primary)
+
+```
+Browser
+↓
+Portal
+↓
+API proxy
+↓
+container:6000
+```
+
+URL format: `/dev/<vmid>/ide`
+
+Implementation requirements:
+
+- API proxy using `http-proxy-middleware`
+- WebSocket support (`ws: true`)
+- `server.on('upgrade', proxy.upgrade)`
+- code-server launch args: `--base-path /dev/<vmid>/ide --auth none`
+
+Authentication handled by portal JWT.
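Reviewer sketch for Path 1: the routing step of the API proxy, written as pure functions so it can be unit-tested without a running server. It only shows how a request path in the `/dev/<vmid>/ide` format could be mapped to a `container:6000` target. The names `parseIdeVmid`, `resolveIdeTarget` and `ContainerDirectory` are hypothetical, and treating `<vmid>` as a decimal integer is an assumption that the notes do not state.

```typescript
// Port taken from the notes: code-server binds to 0.0.0.0:6000.
const IDE_PORT = 6000;

// Assumption: <vmid> is a decimal integer. Matches /dev/<vmid>/ide and
// anything below it, but not look-alikes such as /dev/101/idea.
const IDE_ROUTE = /^\/dev\/(\d+)\/ide(?:\/.*)?$/;

/** Return the vmid for an IDE request URL, or null for any other URL. */
function parseIdeVmid(url: string): string | null {
  // Drop the query string before matching the path.
  const q = url.indexOf("?");
  const pathname = q === -1 ? url : url.slice(0, q);
  const match = IDE_ROUTE.exec(pathname);
  return match?.[1] ?? null;
}

// Hypothetical directory: vmid -> container address known to the API.
type ContainerDirectory = ReadonlyMap<string, string>;

/** Return the upstream base URL for an IDE request, or null if unknown. */
function resolveIdeTarget(
  url: string,
  containers: ContainerDirectory,
): string | null {
  const vmid = parseIdeVmid(url);
  if (vmid === null) return null;
  const host = containers.get(vmid);
  return host === undefined ? null : `http://${host}:${IDE_PORT}`;
}
```

Using the same resolver for ordinary requests and for the `upgrade` handler would keep HTTP and WebSocket traffic pointed at the same container.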
 
 ---
 
-## API (zlh-api)
+#### Path 2 — Local Dev Access (Advanced Users)
+
+Direct developer access via **Headscale/Tailscale**.
+
+Use cases:
+
+- SSH
+- VS Code Remote
+- local development tools
+
+Outstanding tasks:
+
+- confirm `zlh-ctl` Headscale server status
+- implement Tailscale addon install
+- API auth key generation
+- portal instructions
+
+Headscale constraints:
+
+- `magic_dns: false`
+- no exit nodes
+- no DNS takeover
+
+---
+
+## Agent Future Work (priority order)
+
+1. Structured logging (slog) for Loki
+2. Dev container provisioningComplete state
+3. Crash recovery backoff
+4. Graceful shutdown verification
+5. Process reattachment on agent restart
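Reviewer sketch for item 3: the shortened wording drops the schedule that the previous revision spelled out (30s/60s/120s, max 3 attempts, then error state). Below is one possible reading of that schedule as a pure decision function. It is written in TypeScript for illustration only; the agent may well be implemented in another language. `nextRecoveryStep` and `RecoveryDecision` are hypothetical names, and counting attempts from zero failed attempts is an assumption.

```typescript
// Schedule from the previous revision of the notes: 30s / 60s / 120s.
const BACKOFF_SECONDS: readonly number[] = [30, 60, 120];

type RecoveryDecision =
  | { action: "retry"; delaySeconds: number }
  | { action: "error" };

/**
 * Decide what to do after a crash, given how many recovery attempts
 * have already failed. Once the schedule is exhausted (3 attempts),
 * the container moves to the error state instead of retrying.
 */
function nextRecoveryStep(failedAttempts: number): RecoveryDecision {
  if (!Number.isInteger(failedAttempts) || failedAttempts < 0) {
    throw new RangeError("failedAttempts must be a non-negative integer");
  }
  const delaySeconds = BACKOFF_SECONDS[failedAttempts];
  if (delaySeconds === undefined) {
    return { action: "error" };
  }
  return { action: "retry", delaySeconds };
}
```

Keeping the schedule in one table makes the "max 3 attempts" limit follow from the data rather than from a separate counter.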
+
+---
+
+## API (zpack-api)
 
 Completed:
 
 - dev provisioning payload
 - runtime/version fields
 - enable_code_server flag
-- dev-only routing hook added during provisioning
-- Technitium + Cloudflare dev DNS creation
-- remote Traefik dynamic file writing via proxy SSH
-- proxy SSH moved to service-user model (`zlh`)
-- server status endpoint added so frontend can consume agent state
-- frontend status/console availability now update correctly via API polling model
+- API status endpoint for frontend state
 
 Outstanding:
 
-- runtime validation endpoint
-- dev runtime catalog endpoint for portal
-- remove Traefik dynamic config on dev container deletion
-- domain / hostname normalization audit
-- proxy/TLS generation cleanup so manual edits are no longer needed
+- `/dev/:id/ide` proxy route
+- websocket upgrade handling
+- ownership validation before proxy
+- Headscale auth key generation
+- dev runtime catalog endpoint
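Reviewer sketch for "ownership validation before proxy": a minimal decision function, assuming the portal JWT has already been verified elsewhere and carries the user id in `sub`, and that each dev container record stores an owner id. `PortalClaims`, `DevContainer` and `authorizeIdeProxy` are hypothetical; the notes do not define these shapes or the status codes used here.

```typescript
// Hypothetical shape of the verified portal JWT payload.
interface PortalClaims {
  sub: string;
}

// Hypothetical dev container record held by the API.
interface DevContainer {
  vmid: string;
  ownerId: string;
}

type ProxyDecision =
  | { allow: true }
  | { allow: false; status: 401 | 403 | 404 };

/** Decide whether a request may be proxied to the container's IDE. */
function authorizeIdeProxy(
  claims: PortalClaims | null,
  container: DevContainer | undefined,
): ProxyDecision {
  if (claims === null) return { allow: false, status: 401 }; // no valid token
  if (container === undefined) return { allow: false, status: 404 }; // unknown vmid
  if (container.ownerId !== claims.sub) return { allow: false, status: 403 }; // not the owner
  return { allow: true };
}
```

WebSocket upgrades handled through `server.on('upgrade', ...)` do not pass through Express-style middleware, so the same check would need to run in the upgrade handler as well as on the HTTP route.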
 
 ---
 
-## Portal (zlh-portal)
+## Portal (zpack-portal)
 
 Completed:
 
@@ -114,30 +166,12 @@ Completed:
 - dotnet runtime support
 - enable code-server checkbox
 - dev file browser support
-- frontend now consumes API-backed status correctly for host/console state
 
 Outstanding:
 
-- runtime list driven from catalog API
-- dev port exposure UI
-- code-server launch link
-- clearer dev readiness states (`installing`, `starting`, `running`, `error`, etc.)
+- "Open IDE" button
+- `/dev/<vmid>/ide` page
+- Headscale setup instructions
 
----
-
-## Artifact Server
-
-Completed:
-
-- runtime artifacts hosted
-- devcontainer catalog
-- runtime archive structure
-- code-server compiled release artifact ✅
-
-Outstanding:
-
-- checksum publishing
-- artifact metadata support
-
 ---
 
@@ -145,12 +179,11 @@ Outstanding:
 
 Active thread:
 
-- complete external dev IDE access path end-to-end
+- implement browser IDE proxy
 
 Future work:
 
-- dev port routing
-- dev service detection
+- Tailscale dev access
 - artifact version promotion
 - runtime rollback support
 
@@ -158,22 +191,10 @@ Future work:
 
 ## Closed Threads
 
-- ✅ Interactive PTY-backed console (dev + game)
-- ✅ WebSocket stability and PTY ownership
-- ✅ Customer isolation (API + frontend)
-- ✅ Agent update system (versioned, hash-verified)
-- ✅ Minecraft player presence (agent-sourced)
-- ✅ Game telemetry router separation (`/api/game/*`)
-- ✅ Agent Phase 1 mod management endpoints
-- ✅ Agent process metrics endpoint
-- ✅ Minecraft readiness probe + restart race mitigation
-- ✅ Modrinth resolver + full mod lifecycle
-- ✅ Direct runtime upload model (no staging, no symlinks)
-- ✅ `.zlh_metadata.json` provenance tracking
-- ✅ Raw `http.request` streaming in API upload proxy
-- ✅ Filesystem architecture docs consolidated
-- ✅ Upload transport timeout tuning
-- ✅ Dev container filesystem support (container-aware, /workspace root)
-- ✅ Code-server artifact fix — compiled release on zlh-artifacts
-- ✅ Dev routing hook added to provisioning without changing game publish flow
-- ✅ API status endpoint added for frontend agent-state consumption
+- ✅ PTY console (dev + game)
+- ✅ Mod lifecycle
+- ✅ Upload pipeline
+- ✅ Runtime artifact installs
+- ✅ Dev container filesystem model
+- ✅ Code-server artifact fix
+- ✅ API status endpoint for frontend agent-state consumption