Add session summary 2026-03-17 — IDE proxy complete, base-path blocking issue, agent enhancements
Parent: 525366c5df
Commit: 393911c443

Session_Summaries/2026-03-17_IDE-Proxy-Blocking-Issue.md (new file, 132 lines)

# 2026-03-17 – IDE proxy blocking issue + agent enhancements

## Summary

API proxy layer for dev IDE is complete. Agent has significant new
capabilities. One blocking issue remains before IDE is fully functional.

---

## What Was Completed

### API

- Dev routing removed: `devRouting.js`, `devDePublisher.js` deleted
- All hooks removed from provisioning and container routes
- No DNS records, no Traefik routes, no container exposure

- `GET /api/dev/:id/ide` + `GET /api/dev/:id/ide/*`: proxy to container:6000
- WebSocket: `server.on("upgrade")` wired, `ws: true` enabled
- Path rewrite: `/api/dev/:id/ide/...` → `/...`
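
The path rewrite above can be sketched as a pure function. This is an illustration of the logic only, in Python rather than the API's own Node.js code; the function name, the regex, and the assumption that `:id` is numeric are all hypothetical.

```python
import re
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Hypothetical pattern for the proxy prefix; ":id" is assumed to be numeric.
_IDE_PREFIX = re.compile(r"^/api/dev/(?P<vmid>\d+)/ide(?P<rest>/.*)?$")


def rewrite_ide_target(target: str) -> Optional[Tuple[str, str]]:
    """Map a request target under /api/dev/:id/ide to (vmid, upstream target).

    Returns None when the target is not an IDE proxy path.
    """
    parts = urlsplit(target)
    match = _IDE_PREFIX.match(parts.path)
    if match is None:
        return None
    # "/api/dev/101/ide" and "/api/dev/101/ide/" both map to the upstream root.
    upstream = match.group("rest") or "/"
    if parts.query:
        # Keep the query string so asset versions and similar parameters survive.
        upstream += "?" + parts.query
    return match.group("vmid"), upstream
```

For example, `rewrite_ide_target("/api/dev/101/ide/static/app.js?v=1")` yields `("101", "/static/app.js?v=1")`, while a path such as `/api/dev/101/idea` is rejected because the prefix must end at a segment boundary.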

- `POST /api/dev/:id/ide-token`: short-lived signed token (300s TTL)
- Token payload: `sub`, `vmid`, `type: "dev-ide"`
- Proxy accepts `Authorization: Bearer` or `?token=`
- WebSocket upgrades validate same token
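
A minimal sketch of the token flow described above, assuming an HMAC-SHA256 signature over a base64url-encoded JSON payload. The notes do not say which token format the API actually uses (it may well be a JWT library), so the wire format, the `exp` field, and all function names here are assumptions. Only the 300s TTL, the payload fields, and the two accepted token locations come from the notes.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional
from urllib.parse import parse_qs

TTL_SECONDS = 300  # TTL stated in the notes


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")


def _sign(secret: bytes, body: str) -> str:
    return _b64(hmac.new(secret, body.encode("utf-8"), hashlib.sha256).digest())


def issue_ide_token(secret: bytes, sub: str, vmid: int, now: Optional[float] = None) -> str:
    """Build "<payload>.<signature>" with the payload fields from the notes plus an expiry."""
    issued = int(time.time() if now is None else now)
    payload = {"sub": sub, "vmid": vmid, "type": "dev-ide", "exp": issued + TTL_SECONDS}
    body = _b64(json.dumps(payload, separators=(",", ":")).encode("utf-8"))
    return f"{body}.{_sign(secret, body)}"


def verify_ide_token(secret: bytes, token: str, vmid: int, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if signature, type, vmid and expiry all check out, else None."""
    body, sep, sig = token.partition(".")
    if not sep:
        return None
    # Constant-time comparison of the presented and expected signatures.
    if not hmac.compare_digest(sig.encode("utf-8"), _sign(secret, body).encode("utf-8")):
        return None
    try:
        payload = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    except ValueError:
        return None
    current = time.time() if now is None else now
    if payload.get("type") != "dev-ide" or payload.get("vmid") != vmid:
        return None
    if current >= payload.get("exp", 0):
        return None
    return payload


def extract_token(authorization: Optional[str], query: str) -> Optional[str]:
    """Prefer the Authorization: Bearer header, then fall back to ?token=."""
    if authorization and authorization.startswith("Bearer "):
        return authorization[len("Bearer "):].strip() or None
    values = parse_qs(query).get("token")
    return values[0] if values else None
```

The same `verify_ide_token` call would serve both plain HTTP requests and WebSocket upgrades, which matches the note that upgrades validate the same token. Checking `vmid` against the route parameter keeps a token issued for one container from opening another.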

- `GET /api/servers/:id/status`: reads Redis `agent:<vmid>`, returns agent state

- `proxyClient.js` intentionally untouched; game edge publish path depends on it

### Agent

- Full dev container filesystem support, workspace root `/home/dev/workspace`
- Shell runs as `dev` user, `HOME`/`USER`/`LOGNAME`/`TERM` set
- Dev provisioning creates `/home/dev` and `/home/dev/workspace` with correct ownership
- Catalog-driven runtime validation against `devcontainer/_catalog.json`
- All installs now fetch from artifact server; no local artifact assumption
- Dotnet installer fetched from artifact server, installed into runtime path

- code-server pulled from artifact server (tar.gz)
- Installed to `/opt/zlh/services/code-server`
- Lifecycle endpoints: `POST /dev/codeserver/start|stop|restart`
- Process detection via `/proc/*/cmdline` scan; no longer relies solely on PID file
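
The `/proc/*/cmdline` scan mentioned above could look roughly like this. It is a sketch, not the agent's code: the function name and the substring-matching rule are assumptions, and the `proc_root` parameter exists only so the function can be exercised against a fake directory tree.

```python
from pathlib import Path
from typing import List


def find_pids_by_cmdline(needle: str, proc_root: str = "/proc") -> List[int]:
    """Return PIDs whose command line contains `needle` in any argument."""
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as "self" or "uptime"
        try:
            raw = (entry / "cmdline").read_bytes()
        except OSError:
            continue  # process exited mid-scan, or the entry is not readable
        # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
        argv = [arg.decode("utf-8", "replace") for arg in raw.split(b"\x00") if arg]
        if any(needle in arg for arg in argv):
            pids.append(int(entry.name))
    return sorted(pids)
```

Calling `find_pids_by_cmdline("/opt/zlh/services/code-server")` would match any process launched from the install directory, which still works when the PID file is stale or missing.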

- Game server crash recovery with backoff: 30s → 60s → 120s
- Backoff resets if process uptime ≥ 30s
- Transitions to `error` state after repeated failures
- Crash observability: time, exit code, signal, uptime, log tail, classification
- Classifications: `oom`, `mod_or_plugin_error`, `missing_dependency`, `nonzero_exit`, `unexpected_exit`
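
The backoff rules above can be sketched as a small state holder. The delay steps, the 30s reset threshold, and the `error` state come from the notes. The point at which "repeated failures" becomes `error` is not stated, so this sketch assumes it happens on the crash after the 120s step has been used. The class name and the `running`/`restarting` state names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

BACKOFF_STEPS = (30, 60, 120)  # seconds, from the notes
STABLE_UPTIME = 30             # uptime in seconds that resets the backoff


@dataclass
class CrashBackoff:
    failures: int = 0
    state: str = "running"  # hypothetical state name

    def on_crash(self, uptime_seconds: float) -> Optional[int]:
        """Return the delay before the next restart, or None once in the error state."""
        if uptime_seconds >= STABLE_UPTIME:
            # The process ran long enough to count as healthy: start over at 30s.
            self.failures = 0
        if self.failures >= len(BACKOFF_STEPS):
            # Assumption: all steps exhausted means "repeated failures".
            self.state = "error"
            return None
        delay = BACKOFF_STEPS[self.failures]
        self.failures += 1
        self.state = "restarting"  # hypothetical state name
        return delay
```

Three fast crashes in a row return 30, 60 and 120; a fourth returns `None` and leaves the tracker in `error`. A crash after 30s or more of uptime restarts the sequence at 30.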

- Structured logging across provisioning, installs, file ops, control plane
- Installer failures now include output tail for debugging

- `/status` now exposes: `runtimeInstalled`, `devProvisioned`, `devReadyAt`,
  `workspaceRoot`, `serverRoot`, `codeServerInstalled`, `codeServerRunning`,
  `lastCrashClassification`

---

## Current Blocking Issue

### Symptom

Browser opens IDE URL. Loads partially. Popup: "An unexpected error occurred."

Console:
- `WebSocket close 1006`
- `No file system provider found`

Workspace shows `!` (not mounted). Extension host fails to start.

### What works

- API auth ✅
- Token validation ✅
- Proxy routing to container ✅
- WebSocket upgrade handler ✅
- code-server process running on :6000 ✅

### What fails

- Workbench WebSocket session ❌
- Filesystem provider initialization ❌
- Extension host startup ❌

### Root cause (high confidence)

code-server is running without `--base-path /api/dev/<vmid>/ide`.

It assumes it is served from `/`. All WebSocket connection paths, static
asset paths, and VS Code remote authority resolution are wrong when served
through a proxy subpath without this flag.
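
To illustrate the failure mode described above, standard URL resolution shows how a root-absolute reference escapes the proxy prefix while a relative one stays under it. The host name and asset path are made up for the example, and whether the workbench really emits root-absolute URLs is the diagnosis from these notes, not something this snippet verifies.

```python
from urllib.parse import urljoin

# Hypothetical page URL as the browser sees it through the API proxy.
page = "https://portal.example/api/dev/101/ide/"

# A root-absolute reference ignores the proxy prefix entirely.
root_absolute = urljoin(page, "/static/app.js")

# A relative reference resolves under the prefix, so the proxy still sees it.
relative = urljoin(page, "static/app.js")

# Without the trailing slash, the last path segment ("ide") is replaced.
no_slash = urljoin("https://portal.example/api/dev/101/ide", "static/app.js")

print(root_absolute)  # https://portal.example/static/app.js
print(relative)       # https://portal.example/api/dev/101/ide/static/app.js
print(no_slash)       # https://portal.example/api/dev/101/static/app.js
```

Requests of the first kind never reach `/api/dev/:id/ide/*`, so the proxy cannot forward them to the container. The third case suggests also checking that the IDE URL handed to the browser ends with a slash.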

---

## Required Fix

Update code-server launch in agent to:

```bash
code-server \
  --bind-addr 0.0.0.0:6000 \
  --auth none \
  --disable-telemetry \
  --base-path "/api/dev/${vmid}/ide" \
  /home/dev/workspace
```

The `vmid` must be injected at launch time from the agent's config.
This is a one-line change to the code-server launch script in the agent.
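
A sketch of how the agent might assemble that command with the `vmid` injected, shown in Python since the agent's language is not stated here. The function name, the numeric-`vmid` check, and the assumption that `code-server` is on `PATH` are hypothetical. The `--base-path` flag is copied from the command above and has not been checked against code-server's CLI in these notes, so it is worth confirming with `code-server --help` on the installed build.

```python
import re
from typing import List, Union

CODE_SERVER_BIN = "code-server"         # assumed to be resolvable on PATH
WORKSPACE_ROOT = "/home/dev/workspace"  # workspace root from the notes


def build_code_server_argv(vmid: Union[int, str]) -> List[str]:
    """Build the launch argv with the proxy prefix for this container."""
    vmid_text = str(vmid)
    # Assumption: vmids are numeric. Rejecting anything else keeps slashes
    # or ".." out of the base path.
    if not re.fullmatch(r"[0-9]+", vmid_text):
        raise ValueError(f"unexpected vmid: {vmid_text!r}")
    return [
        CODE_SERVER_BIN,
        "--bind-addr", "0.0.0.0:6000",
        "--auth", "none",
        "--disable-telemetry",
        # Flag as given in the notes above; unverified against the CLI.
        "--base-path", f"/api/dev/{vmid_text}/ide",
        WORKSPACE_ROOT,
    ]
```

Passing the result to a process spawner as an argument list, rather than building a shell string, avoids quoting problems if the config value is ever malformed.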

---

## Architectural State

```
User → Portal → API → Proxy → Container (code-server)
```

Containers never publicly exposed. No Traefik per dev container.
No DNS per dev container. API token is sole auth mechanism.

---

## Next Tasks

1. Fix `--base-path` in agent code-server launch (immediate priority)
2. Redeploy container, verify: no popup, no WS 1006, workspace loads, terminal works
3. Validate proxy consistency (HTTP + WS + static assets)
4. Portal "Open IDE" button implementation
5. Container hardening (non-root dev user, restrict /opt)
6. Headscale/Tailscale integration (future)