Log code-server launch issue and port conflict finding 2026-03-15
parent b0e490bd66
commit 0b442db44b
@@ -41,36 +41,37 @@ Blocker:
 
 ## 2026-03-15
 
-Architecture review session. No code changes. Key decisions and findings:
+Architecture review session. Key decisions and findings:
 
 Dev container model:
 - 1 server / 1 container / 1 world confirmed as correct model
 - Dev containers: full R/W access under /home/dev/workspace, no allowlist
 - Multiverse/multi-world via plugins is customer-managed, not a platform concern
 - Port exposure (dev-<vmid>.zerolaghub.com) identified as next major dev feature — future work
-- Wildcard DNS pre-planning needed before port exposure implementation
 - dotnet SDK covers all C# game modding (Valheim, Core Keeper, Vintage Story, Rust/Oxide)
 - Code Server confirmed as correct browser IDE approach given single public IP constraint
+- Traefik dynamic file provider confirmed as correct routing approach — no plugin needed, no SRV records needed
 
 Agent review (zlh-agent commit 6019d0bc — 2026-03-15):
 - Catalog transition confirmed correct — ValidateRuntimeSelection gates all dev provisions
 - Scripts unchanged — embedded script execution via bash stdin pipe, no 126 risk from runtime installs
 - devcontainer/common.go is clean and complete
-- node/verify.go has hardcoded /opt/zlh/runtime/node/bin/node — wrong path, pre-existing issue, not a regression
-- node/python/go/java install packages still use old version-unaware marker pattern — pre-existing, not a regression from catalog work
-- node exporter baked into containers — disk/CPU/memory/network already covered by Prometheus
-- Promtail present but status unknown
+- node/verify.go has hardcoded /opt/zlh/runtime/node/bin/node — wrong path, pre-existing issue
+- node/python/go/java install packages still use old version-unaware marker pattern — pre-existing, not a regression
 
-Agent future work identified (not yet implemented, priority order):
-1. Unified structured logging (slog) — Promtail/Loki integration needs structured fields to be queryable
-2. Dev container status/readiness — /status needs provisioningComplete + provisioningError for dev containers
-3. Crash recovery with backoff — auto-restart on crash with increasing delay (30s/60s/120s), max 3 attempts, then error state
-4. Graceful shutdown verification — confirm SIGTERM + wait before SIGKILL for Minecraft world save safety
-5. Agent restart/process reattachment — detect existing process on agent restart, reattach rather than double-start
-6. Disk pressure warning in /status — agent-level signal before node exporter threshold alerts
+Agent future work (priority order):
+1. Unified structured logging (slog) — Promtail/Loki integration needs structured fields
+2. Dev container /status — provisioningComplete + provisioningError fields
+3. Crash recovery with backoff — 30s/60s/120s, max 3 attempts, then error state
+4. Graceful shutdown verification — SIGTERM + wait before SIGKILL for Minecraft world save safety
+5. Agent restart/process reattachment — detect existing process on restart
 
-Explicitly out of scope (not Pterodactyl):
-- User management, permissions, billing
-- Multi-container orchestration
-- Plugin/extension systems
-- Anything owned by API or portal
+Code-server routing:
+- Artifact fix confirmed working 2026-03-15
+- Binary confirmed present at /opt/zlh/services/code-server/bin/code-server
+- Root cause of ERR_CONNECTION_CLOSED identified: code-server is installed but never launched
+- Port conflict: Node runtime is binding 6000, code-server cannot share the port
+- Two fixes needed:
+1. Assign code-server a port that won't conflict with Node (6000 taken)
+2. Add launch step to addon install script — install != start, binary must be daemonized after provisioning
+- Suggested launch: nohup /opt/zlh/services/code-server/bin/code-server --bind-addr 0.0.0.0:<port> --auth none /home/dev/workspace > /opt/zlh-agent/logs/code-server.log 2>&1 &