# Dev IDE Access Architecture

## Overview

Dev containers expose a browser-accessible VS Code IDE (code-server) via a host-based routing model. The API is the auth and proxy boundary. Traefik handles TLS and routing. Containers are never directly exposed.

---

## Architecture

```
Browser
  ↓ https://dev-.zerolaghub.dev
Traefik (zlh-zpack-proxy, 10.70.0.242)
  ↓ wildcard TLS + host routing → http://10.60.0.245:4000
API (zpack-api, 10.60.0.245:4000)
  ↓ token validation + cookie handoff
  ↓ HTTP + WebSocket proxy
Container code-server (:6000)
```

---

## Why API Proxy (Not Direct Traefik → Container)

Direct Traefik → container routing was tested and rejected for the following reasons:

1. **No auth boundary**: code-server runs with `--auth none`. Without the API in the path, any request reaching the container would have full IDE access.
2. **No ownership validation**: the API verifies the token against the database to confirm the requesting user owns the container. Traefik has no concept of this.
3. **Per-container routing complexity**: direct routing requires a Traefik dynamic config entry per container. The API proxy approach requires only one wildcard rule regardless of how many containers exist.

The API proxy adds one network hop but provides auth, ownership enforcement, and operational simplicity.

---

## Token Flow

### Step 1: Token generation

```
POST /api/dev/:id/ide-token
Authorization: Bearer
```

Response:

```json
{
  "token": "",
  "url": "https://dev-6070.zerolaghub.dev/?token=...",
  "expiresIn": 300
}
```

The IDE proxy token is short-lived (300s TTL), signed with a separate secret, and carries `{ sub, vmid, type: "dev-ide" }`. It is scoped to a specific container: a token for vmid 6070 cannot access vmid 6071.

### Step 2: Bootstrap

Browser navigates to `https://dev-6070.zerolaghub.dev/?token=...`

Traefik forwards to the API.
`handleHostedProxy` extracts the vmid from the `Host` header (`dev-6070` → `6070`), validates the token, confirms ownership against the database, then:

1. Sets `zlh_dev_ide_token` HTTP-only cookie (path: `/`, scoped to the hosted domain)
2. Redirects to the clean URL (token stripped from query string)

### Step 3: Live traffic

All subsequent requests carry the cookie. The API validates it on every request and proxies to `http://:6000`.

WebSocket upgrades are handled in `attachDevProxyServer` on the raw Node.js HTTP server. The upgrade event is intercepted before Express routing, the cookie/token is validated, and a target-bound WS proxy is built at upgrade time using the resolved container IP.

---

## Request Flow Detail

```
1. GET https://dev-6070.zerolaghub.dev/?token=
   → Traefik → API handleHostedProxy
   → validate token (vmid match + ownership)
   → Set-Cookie: zlh_dev_ide_token=; HttpOnly; Path=/
   → 302 → https://dev-6070.zerolaghub.dev/

2. GET https://dev-6070.zerolaghub.dev/
   → Traefik → API handleHostedProxy
   → validate cookie
   → proxy to http://10.100.x.x:6000/
   → code-server 302 → /?folder=/home/dev/workspace

3. GET https://dev-6070.zerolaghub.dev/?folder=/home/dev/workspace
   → Traefik → API handleHostedProxy
   → validate cookie
   → proxy to http://10.100.x.x:6000/?folder=/home/dev/workspace
   → 200 code-server HTML

4. WS wss://dev-6070.zerolaghub.dev/stable-/...
   → Traefik → API server upgrade handler (attachDevProxyServer)
   → validate cookie from WS request headers
   → build target-bound WS proxy → ws://10.100.x.x:6000/stable-/...
```

---

## Traefik Configuration

```yaml
http:
  routers:
    dev-ide:
      rule: "HostRegexp(`dev-{vmid:[0-9]+}.zerolaghub.dev`)"
      entryPoints:
        - websecure
      service: dev-ide-api
      tls:
        certResolver: zpackv2
        domains:
          - main: "zerolaghub.dev"
            sans:
              - "*.zerolaghub.dev"
  services:
    dev-ide-api:
      loadBalancer:
        passHostHeader: true
        servers:
          - url: "http://10.60.0.245:4000"
```

`passHostHeader: true` is critical. Without it, Express resolves relative redirects against the internal API IP, leaking it to the browser.

TLS: wildcard cert `*.zerolaghub.dev` issued via Let's Encrypt DNS-01 challenge through Cloudflare. certResolver `zpackv2` handles renewal automatically.

---

## API Key Files

- `src/routes/devProxy.js`: all IDE proxy logic
- `src/app.js`: mounts devProxy router + calls `attachDevProxyServer`
- `src/auth/tokens.js`: `signIdeProxyToken` / `verifyIdeProxyToken`

Key functions in `devProxy.js`:

- `handleHostedProxy`: entry point for all host-based requests
- `parseHostedVmid`: extracts vmid from `Host: dev-6070.zerolaghub.dev`
- `resolveDevTarget`: DB lookup confirming ownership + returning container IP
- `attachDevProxyServer`: raw HTTP upgrade handler for WebSockets
- `buildWsProxy`: builds a target-bound WS proxy instance at upgrade time

---

## WebSocket Critical Detail

The shared HTTP proxy instance (`ideTunnelProxy`) has `ws: false`. WebSocket upgrades are handled exclusively in `attachDevProxyServer`, which builds a new proxy instance per upgrade with the resolved container target hardcoded.

This was the fix for the `ECONNREFUSED 127.0.0.1:6000` bug: the shared proxy was falling back to `DEFAULT_IDE_TARGET` (localhost) instead of the actual container IP because per-request context was lost during the upgrade.
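
The per-upgrade pattern described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `devProxy.js`: the function names echo the ones listed in this document, but their bodies, the `makeUpgradeHandler` wrapper, the injected `deps` object, and the shape of the token claims (numeric `vmid`) are all assumptions. The point it shows is that the target is resolved from the upgrade request itself and passed straight into a freshly built proxy, so there is no shared default to fall back to.

```javascript
// Hypothetical sketch: names mirror this document, bodies are assumptions.
const HOSTED_RE = /^dev-(\d+)\.zerolaghub\.dev(?::\d+)?$/i;

// Extract the vmid from a Host header such as "dev-6070.zerolaghub.dev".
function parseHostedVmid(host) {
  const match = HOSTED_RE.exec(String(host || '').trim());
  return match ? Number(match[1]) : null;
}

// Read one cookie from the raw Cookie header. Upgrade requests never pass
// through Express middleware, so no cookie parser has run at this point.
function readCookie(header, name) {
  for (const part of String(header || '').split(';')) {
    const idx = part.indexOf('=');
    if (idx === -1) continue;
    if (part.slice(0, idx).trim() === name) {
      return decodeURIComponent(part.slice(idx + 1).trim());
    }
  }
  return null;
}

// deps is an assumed injection point:
//   verifyToken(token)        -> claims { sub, vmid, type } or null
//   resolveTarget(sub, vmid)  -> container URL string or null (ownership check)
//   buildWsProxy(target)      -> object exposing ws(req, socket, head)
function makeUpgradeHandler(deps) {
  return async function onUpgrade(req, socket, head) {
    const vmid = parseHostedVmid(req.headers.host);
    if (vmid === null) return false; // not a hosted IDE host; leave the socket alone
    try {
      const token = readCookie(req.headers.cookie, 'zlh_dev_ide_token');
      const claims = token ? await deps.verifyToken(token) : null;
      if (!claims || claims.type !== 'dev-ide' || claims.vmid !== vmid) {
        throw new Error('unauthorized');
      }
      const target = await deps.resolveTarget(claims.sub, vmid);
      if (!target) throw new Error('no target');
      // A new proxy is built for this upgrade only, bound to this target.
      deps.buildWsProxy(target).ws(req, socket, head);
      return true;
    } catch (err) {
      socket.destroy(); // never fall back to a default target
      return false;
    }
  };
}
```

Such a handler would be attached with `server.on('upgrade', makeUpgradeHandler(deps))` on the raw Node.js HTTP server. If the proxy library is `http-proxy` (an assumption; the document does not name it), `buildWsProxy` could be as small as `(target) => httpProxy.createProxyServer({ target, ws: true })`.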

---

## code-server Container Setup

- Install path: `/opt/zlh/services/code-server`
- Port: `6000` (bound to `0.0.0.0`)
- Launch flags: `--auth none --disable-telemetry`
- Auth: disabled at code-server level; the API handles all auth
- Workspace: `/home/dev/workspace`
- User: `dev`

Chrome blocks port 6000 as an unsafe port. This is internal only and never directly browser-accessible, so it is not an issue.

---

## Known Pitfalls

| Issue | Cause | Fix |
|-------|-------|-----|
| `ERR_CONNECTION_CLOSED` in browser | Traefik/API not reachable or wrong code deployed | Check connectivity + confirm deployed code has `handleHostedProxy` |
| 404 from API | `handleHostedProxy` not in running process | Restart API after deploy |
| Redirect loops to raw IP | `passHostHeader: true` missing in Traefik | Add it to loadBalancer config |
| WS 1006 disconnect | WS proxy falling back to `127.0.0.1` | Ensure `ws: false` on HTTP proxy, WS handled only in `attachDevProxyServer` |
| Cookie not set | Token expired (300s TTL) | Generate a fresh token |
| Wildcard cert fails to issue | Stale `_acme-challenge` TXT records in Cloudflare | Delete stale records manually, reload Traefik |

---

## Future: SSH Access via CF Tunnel

Planned addition: same hostname, SSH protocol routed through Cloudflare Tunnel:

```
Developer laptop
  ↓ ssh dev-6070.zerolaghub.dev
Cloudflare edge (CF Tunnel)
  ↓
Bastion VM
  ↓ SSH proxy jump
Dev container
```

Developer one-time SSH config:

```
Host *.zerolaghub.dev
  ProxyCommand cloudflared access ssh --hostname %h
```

CF Tunnel runs persistently on the bastion VM. Free tier covers up to 50 users. Browser IDE and SSH share the same hostname, with the two protocols routed separately.