Update OPEN_THREADS 2026-03-17 — base-path blocking issue, agent enhancements completed
parent 56178ead38
commit 8c8af5ff62
112
OPEN_THREADS.md
@@ -15,6 +15,7 @@ Completed:
 - catalog validation implemented
 - runtime installs artifact-backed
 - install guard implemented
+- all installs now fetch from artifact server (no local artifact assumption)
 
 Outstanding:
 
@@ -31,6 +32,7 @@ Completed:
 - dev user creation
 - workspace root `/home/dev/workspace`
 - console runs as dev user
+- `HOME`, `USER`, `LOGNAME`, `TERM` env vars set correctly
 
 Outstanding:
 
@@ -40,64 +42,85 @@ Outstanding:
 
 ---
 
-## Code Server Addon
+### Code Server Addon
 
-Status: ✅ Installed and running inside dev containers
+Status: ✅ Installed and running
 
 Confirmed:
 
-- compiled release artifact fixed on `zlh-artifacts`
-- install confirmed working
-- process confirmed running inside container
+- pulled from artifact server (tar.gz)
+- installed to `/opt/zlh/services/code-server`
 - binds to `0.0.0.0:6000`
-- launched from `/opt/zlh/services/code-server`
+- lifecycle endpoints: `POST /dev/codeserver/start|stop|restart`
+- detection via `/proc/*/cmdline` scan (no longer relies solely on PID file)
 
-Port: `6000`
+**BLOCKING — next task:**
 
-**Next session — agent change required:**
+code-server must launch with:
 
-code-server must be relaunched with:
-
 ```
+--bind-addr 0.0.0.0:6000
 --auth none
+--disable-telemetry
 --base-path /api/dev/<vmid>/ide
+/home/dev/workspace
 ```
 
-Reason: API token is now the sole auth mechanism. Password prompt must be removed. Base path required for correct asset loading through proxy.
+Without `--base-path`, WebSocket paths and static assets mismatch through
+the proxy. Result: IDE loads partially, WS closes with 1006, workspace
+shows `!` (not mounted), extension host fails to start.
 
+---
+
+### Game Server Supervision
+
+Completed:
+
+- crash recovery with backoff: 30s → 60s → 120s
+- backoff resets if uptime ≥ 30s
+- transitions to `error` state after repeated failures
+- crash observability: time, exit code, signal, uptime, log tail, classification
+- classifications: `oom`, `mod_or_plugin_error`, `missing_dependency`, `nonzero_exit`, `unexpected_exit`
+
+---
+
+### Agent Future Work (priority order)
+
+1. **Fix code-server `--base-path` launch arg** — unblocks IDE (IMMEDIATE)
+2. Structured logging (slog) for Loki
+3. Dev container `provisioningComplete` state in `/status`
+4. Graceful shutdown verification (SIGTERM + wait for Minecraft)
+5. Process reattachment on agent restart
+
 ---
 
 ## Dev IDE Access
 
-### Browser IDE (Implemented ✅)
+### Browser IDE
 
 ```
-Browser
-↓
-Portal
-↓
-API (/api/dev/:id/ide)
-↓
-container:6000
+Browser → Portal → API (/api/dev/:id/ide) → container:6000
 ```
 
-Implemented in API:
+API layer: ✅ complete
+Agent layer: ⚠️ blocked on `--base-path`
 
-- `src/routes/devProxy.js` — proxy route mounted in `src/app.js`
-- `GET /api/dev/:id/ide` and `GET /api/dev/:id/ide/*`
-- ownership verification before proxying
-- `ctype === "dev"` required
-- WebSocket support via `http-proxy-middleware` (`ws: true`)
-- `server.on('upgrade')` handler wired
+What is confirmed working:
 
-IDE token system implemented:
+- API auth ✅
+- Token flow ✅
+- Proxy routing ✅
+- WebSocket upgrade handler ✅
+- Upstream targeting ✅
+- code-server process running ✅
 
-- `POST /api/dev/:id/ide-token` — returns signed short-lived token
-- token payload: `sub`, `vmid`, `type: "dev-ide"`
-- default TTL: 300 seconds
-- env overrides: `API_AUTH_IDE_TTL_SECONDS`, `API_AUTH_IDE_SECRET`
-- proxy accepts `Authorization: Bearer` or `?token=<ide-token>`
-- WebSocket upgrades validate same token
+What is failing:
+
+- Workbench WebSocket session ❌
+- Filesystem provider initialization ❌
+- Extension host startup ❌
+
+Root cause: code-server launched without `--base-path /api/dev/<vmid>/ide`
 
 ### Local Dev Access (Headscale/Tailscale — Future)
 
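Note on the launch arguments in the hunk above: the blocking item is that the per-container base path has to be derived from the VM id at launch time. A minimal sketch of assembling that argument list follows. The function name `code_server_args` and the example vmid are hypothetical, and the flag names are copied from the notes as written, not checked against the code-server CLI.

```python
from typing import List


def code_server_args(
    vmid: int,
    workspace: str = "/home/dev/workspace",
    port: int = 6000,
) -> List[str]:
    """Build the argument list from the notes (hypothetical helper)."""
    # The proxy route in the notes is /api/dev/:id/ide, so the base path
    # must carry the same prefix for assets and WebSocket paths to line up.
    base_path = f"/api/dev/{vmid}/ide"
    return [
        "--bind-addr", f"0.0.0.0:{port}",
        "--auth", "none",
        "--disable-telemetry",
        "--base-path", base_path,
        workspace,  # positional: the folder to open
    ]


if __name__ == "__main__":
    # 101 is an illustrative vmid, not one taken from the notes.
    print(" ".join(code_server_args(101)))
```

Keeping the base path in one place makes it harder for the agent and the API proxy route to drift apart, which is the failure mode the notes describe.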
@@ -108,22 +131,7 @@ Outstanding:
 - API auth key generation
 - portal setup instructions
 
-Constraints:
+Constraints: `magic_dns: false`, no exit nodes, no DNS takeover
 
-- `magic_dns: false`
-- no exit nodes
-- no DNS takeover
-
----
-
-## Agent Future Work (priority order)
-
-1. Update code-server launch args (`--auth none`, `--base-path /api/dev/<vmid>/ide`)
-2. Structured logging (slog) for Loki
-3. Dev container provisioningComplete state
-4. Crash recovery backoff
-5. Graceful shutdown verification
-6. Process reattachment on agent restart
-
 ---
 
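Note on "Crash recovery backoff", which this hunk drops from the future-work list: the notes describe it elsewhere as delays of 30s → 60s → 120s, a reset when uptime is at least 30s, and a transition to `error` after repeated failures. The sketch below shows one way those three rules could combine. The class name `CrashBackoff` is hypothetical, and the point at which the `error` state is entered (after the delay schedule is exhausted) is an assumption, since the notes do not say how many failures count as "repeated".

```python
from typing import Optional, Sequence


class CrashBackoff:
    """Restart-delay policy sketch; not the agent's actual implementation."""

    def __init__(
        self,
        delays: Sequence[int] = (30, 60, 120),  # seconds, from the notes
        reset_uptime: int = 30,                 # seconds, from the notes
    ) -> None:
        self.delays = tuple(delays)
        self.reset_uptime = reset_uptime
        self.failures = 0

    def on_crash(self, uptime_seconds: float) -> Optional[int]:
        """Return the delay before the next restart, or None for `error`."""
        # A run that stayed up long enough counts as healthy: start over.
        if uptime_seconds >= self.reset_uptime:
            self.failures = 0
        self.failures += 1
        # Assumption: once every delay has been used, give up.
        if self.failures > len(self.delays):
            return None
        return self.delays[self.failures - 1]


if __name__ == "__main__":
    policy = CrashBackoff()
    for uptime in (2, 3, 45, 1):
        print(uptime, policy.on_crash(uptime))
```

Resetting on uptime rather than on a timer means a server that crashes once a day never escalates, while a crash loop reaches the `error` state quickly.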
@@ -136,7 +144,7 @@ Completed:
 - enable_code_server flag
 - `GET /api/servers/:id/status` — server status endpoint
 - `POST /api/dev/:id/ide-token` — IDE token generation
-- `GET /api/dev/:id/ide` — IDE proxy route with WebSocket support
+- `GET /api/dev/:id/ide` + `GET /api/dev/:id/ide/*` — IDE proxy with WebSocket
 - dev routing experiment removed (`devRouting.js`, `devDePublisher.js` deleted)
 
 Outstanding:
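Note on `POST /api/dev/:id/ide-token`: the notes describe a signed short-lived token carrying `sub`, `vmid` and `type: "dev-ide"`, with a default TTL of 300 seconds, but they do not state the token format. The sketch below uses a plain HMAC-SHA256 scheme as a stand-in to illustrate the issue/verify flow. The function names, the `exp` field and the `body.signature` layout are all assumptions, not the API's actual format.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, body: str) -> str:
    return _b64(hmac.new(secret, body.encode(), hashlib.sha256).digest())


def issue_ide_token(secret: bytes, sub: str, vmid: int,
                    ttl: int = 300, now: Optional[float] = None) -> str:
    """Issue a token with the payload fields named in the notes."""
    issued = int(time.time() if now is None else now)
    payload = {"sub": sub, "vmid": vmid, "type": "dev-ide",
               "exp": issued + ttl}  # `exp` is an assumed field name
    body = _b64(json.dumps(payload, sort_keys=True).encode())
    return f"{body}.{_sign(secret, body)}"


def verify_ide_token(secret: bytes, token: str, vmid: int,
                     now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if the token is valid for this vmid, else None."""
    try:
        body, sig = token.split(".")
    except ValueError:
        return None
    # Check the signature before trusting anything inside the body.
    if not hmac.compare_digest(sig, _sign(secret, body)):
        return None
    payload = json.loads(_unb64(body))
    current = time.time() if now is None else now
    if payload.get("type") != "dev-ide":
        return None
    if payload.get("vmid") != vmid or payload.get("exp", 0) <= current:
        return None
    return payload
```

Binding the token to a `vmid` means a token issued for one container cannot be replayed against another, and the same check can run on both the HTTP request and the WebSocket upgrade.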
@@ -183,3 +191,7 @@ Future work:
 - ✅ API status endpoint for frontend agent-state consumption
 - ✅ Dev IDE proxy implementation (API proxy + token system)
 - ✅ Dev DNS/Traefik routing experiment — removed
+- ✅ Game server crash recovery with backoff
+- ✅ Crash observability (classification, log tail, exit metadata)
+- ✅ Code-server lifecycle endpoints (start/stop/restart)
+- ✅ Code-server process detection via /proc scan
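Note on "process detection via /proc scan": on Linux, each `/proc/<pid>/cmdline` file holds the process arguments separated by NUL bytes, so a running process can be found without relying on a PID file. The sketch below shows the general technique. The function names and the substring match on `code-server` are assumptions; the notes do not say what the agent actually matches on.

```python
import os
from typing import List


def parse_cmdline(raw: bytes) -> List[str]:
    """Split a /proc/<pid>/cmdline payload into its arguments."""
    return [part.decode("utf-8", "replace")
            for part in raw.split(b"\0") if part]


def find_processes(marker: str, proc_root: str = "/proc") -> List[int]:
    """Return PIDs whose command line contains `marker` in any argument."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as fh:
                argv = parse_cmdline(fh.read())
        except OSError:
            continue  # process exited mid-scan or is not readable
        if any(marker in arg for arg in argv):
            pids.append(int(entry))
    return sorted(pids)


if __name__ == "__main__":
    print(find_processes("code-server"))
```

Taking `proc_root` as a parameter keeps the scan testable against a temporary directory instead of the live process table.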