Replace session summary — document routing pivot decision and removed code paths

jester 2026-03-15 23:00:42 +00:00
parent e6619c9a74
commit 77c0eeb1f5


@@ -1,43 +1,25 @@
# 2026-03-15 Dev routing + status exposure
# 2026-03-15 Dev routing pivot
## Summary
Work focused on the dev-container path for code-server and the frontend/API state gap.
Initial attempt exposed dev IDEs via Cloudflare DNS, Technitium DNS, and
Traefik dynamic routes. Each dev container received its own subdomain.
Example: `dev-6062.zerolaghub.dev`
---
## Completed
## What Was Confirmed Working
### API / routing
- Code-server artifact fixed — compiled release on `zlh-artifacts`
- Code-server installs and launches inside dev containers
- Process binds to `0.0.0.0:6000`
- Traefik loaded the dynamic config file
- Traefik router and service were created
- API can write remote Traefik config via SSH service account
- API status endpoint added — frontend host/console state now updates correctly
- Added dev-only routing hook during provisioning
- Reused existing Cloudflare + Technitium record creation for dev containers
- Added Traefik dynamic file creation for dev containers
- Kept existing game publish flow untouched
- Remote Traefik writes now use SSH service account model
- Added `zlh` SSH user path for proxy writes
### Agent / runtime validation
- Confirmed code-server artifact was the correct compiled release artifact
- Confirmed code-server installs correctly inside dev containers
- Confirmed code-server process launches successfully
- Confirmed code-server binds to `0.0.0.0:6000`
- Confirmed agent remains on `:18888`
### API / frontend status
- Added API endpoint(s) to expose polled agent status back to frontend
- Frontend host/console status now updates correctly from API state
- Console availability issue was frontend/API state exposure, not PTY transport
---
## Important confirmations
### Code-server process
Observed process:
Observed process shape:
```bash
/opt/zlh/services/code-server/lib/node /opt/zlh/services/code-server \
@@ -46,47 +28,79 @@ Observed process:
/home/dev/workspace
```
This resolved the earlier confusion: `ss` showed the process as `node` because
code-server runs on Node internally. The process is code-server, not a
conflicting Node runtime process.
### Proxy/service path
Confirmed working pieces:
- Traefik loads dev dynamic file
- Traefik router and service are created
- backend target resolves to container IP on `:6000`
- API can write remote Traefik config via SSH automation
- internal/private-network checks from proxy side pass as expected
Note: `ss` shows the process as `node`. This is expected, since code-server runs on Node internally.
---
## Current blocker
## What Failed
External browser access to:
External browser access to `https://dev-6062.zerolaghub.dev` remained broken.
```
https://dev-6062.zerolaghub.dev
```
Issues encountered:
is still not complete.
State at end of session:
- Traefik route exists
- backend service exists
- code-server runs inside container
- frontend/API status path is fixed
- browser still fails externally with connection-closed / blocked-origin behavior
This remains an explicit open thread and should not be treated as solved.
- TLS negotiation failures
- Traefik routing complexity
- DNS automation overhead
- per-container subdomain management
- debugging difficulty across Cloudflare → Traefik → container chain
---
## Notes
## Decision
- Hostname format must remain consistent (`dev-<vmid>`)
- Game publish flow must remain untouched
- Dev routing is additive only
- Proxy SSH must remain service-account based (`zlh`)
Traefik/DNS approach abandoned. Dev IDE routing moving to **API proxy architecture**.
New model:
```
Browser
Portal
API proxy (/dev/<vmid>/ide)
container:6000
```
Advantages:
- eliminates DNS automation
- removes Traefik dependency for dev containers
- simplifies provisioning
- portal JWT controls access
- no per-container TLS
Implementation requirements:
- `http-proxy-middleware` with `ws: true`
- `server.on('upgrade', proxy.upgrade)` — required for WebSocket
- code-server launch args: `--base-path /dev/<vmid>/ide --auth none`
- API verifies container ownership before proxying
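The ownership check and target resolution from the requirements above can be sketched as pure functions, kept separate from the `http-proxy-middleware` wiring. This is a minimal sketch under stated assumptions, not the API's actual code: `parseDevIdePath`, `OwnerLookup`, `authorizeDevProxy`, and `codeServerArgs` are hypothetical names, the vmid is assumed to be numeric (as in `dev-6062`), and the launch flags are copied verbatim from the list above rather than checked against the code-server CLI.

```typescript
// Sketch only: every name here is hypothetical and not taken from the API codebase.

// From the notes: code-server binds to 0.0.0.0:6000 inside the container.
const CODE_SERVER_PORT = 6000;

/** Extract the vmid from a path shaped like /dev/<vmid>/ide[/...]; null if it does not match. */
function parseDevIdePath(path: string): string | null {
  // Drop any query string before matching.
  const pathname = path.split("?", 1)[0] ?? "";
  // Assumption: vmids are numeric, as in the dev-6062 example.
  const match = /^\/dev\/(\d+)\/ide(?:\/.*)?$/.exec(pathname);
  return match?.[1] ?? null;
}

/** Hypothetical container record and lookup; the real data source is not specified in the notes. */
type ContainerRecord = { ownerId: string; ip: string };
type OwnerLookup = (vmid: string) => ContainerRecord | undefined;

type ProxyDecision =
  | { ok: true; target: string }
  | { ok: false; status: 403 | 404 };

/** Decide whether a user may be proxied to the IDE at this path, and to which backend. */
function authorizeDevProxy(userId: string, path: string, lookup: OwnerLookup): ProxyDecision {
  const vmid = parseDevIdePath(path);
  if (vmid === null) return { ok: false, status: 404 };

  const record = lookup(vmid);
  if (!record) return { ok: false, status: 404 };

  // Ownership must be verified before any traffic is forwarded.
  if (record.ownerId !== userId) return { ok: false, status: 403 };

  return { ok: true, target: `http://${record.ip}:${CODE_SERVER_PORT}` };
}

/** Build the launch arguments listed in the notes for a given vmid. */
function codeServerArgs(vmid: string): string[] {
  return ["--base-path", `/dev/${vmid}/ide`, "--auth", "none"];
}
```

In a real handler the same decision would likely need to run in two places: in the HTTP route before the request reaches the proxy, and in the `upgrade` handler before calling `proxy.upgrade`, because WebSocket upgrade requests do not pass through the normal middleware chain.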
---
## Code to Remove from API
These code paths are no longer part of the architecture:
- `createDevRouting()`
- proxy SSH writes for Traefik dynamic files
- Traefik dynamic file creation on provisioning
- Cloudflare/Technitium DNS record creation for dev containers
Game publish flow must remain untouched — only dev routing code is removed.
---
## Additional Dev Access Path
Headscale/Tailscale will be added as an advanced option for developers
who want to use their local environment (SSH, VS Code Remote, local tools).
Headscale server expected on `zlh-ctl` — status to be confirmed.
Constraints:
- no exit nodes
- `magic_dns: false`
- no DNS takeover on customer machine