Update PROJECT_CONTEXT.md with current infrastructure and Velocity plugin model
Parent: c5f4b9ddb5
Commit: 9beac4015a
@@ -3,7 +3,7 @@
|
||||
## What It Is
|
||||
Game server hosting platform targeting modded, indie, and emerging games.
|
||||
Competitive advantages: LXC containers (20-30% perf over Docker), custom
|
||||
agent architecture, open-source stack, developer-to-player pipeline that
|
||||
agent architecture, open-source stack, and a developer-to-player pipeline that
|
||||
turns mod developers into a distribution channel.
|
||||
|
||||
System posture: stable, controlled expansion phase.
|
||||
@@ -12,72 +12,79 @@ System posture: stable, controlled expansion phase.
|
||||
|
||||
## Naming Convention
|
||||
|
||||
- `zlh-*` = core infrastructure (DNS, monitoring, backup, routing, artifacts)
|
||||
- `zpack-*` = game and dev server stack (portal, API, containers)
|
||||
- `zlh-*` = core infrastructure (routing, monitoring, backup, artifacts, shared services)
|
||||
- `zpack-*` = game and dev stack (portal, API, containers, Velocity/game edge)
|
||||
|
||||
---
|
||||
|
||||
## Infrastructure (Proxmox)
|
||||
## Infrastructure (Current Reality)
|
||||
|
||||
### Active VMs
|
||||
### Active host / environment
|
||||
- Active dedicated host: **GTHost Detroit**
|
||||
- Denver host: **decommissioned Apr 2, 2026**
|
||||
- Active infra VM/LXC IDs are in the **9000s range**
|
||||
- Legacy IDs in the 100s / 300s / 1000s / 2000s are reference only and should **not** be treated as active production
|
||||
|
||||
| VM | Name | Role |
|
||||
|----|------|------|
|
||||
| 104 | zlh-monitor | Prometheus/Grafana monitoring |
|
||||
| 105 | zlh-router | Core services router |
|
||||
| 300 | zlh-velocity | Minecraft Velocity proxy |
|
||||
| 1001 | zlh-dns | Technitium DNS |
|
||||
| 1002 | zlh-proxy | Traefik — core/frontend SSL termination (portal traffic) |
|
||||
| 1003 | zlh-artifacts | Runtime binaries + Minecraft server jars (agent install source) |
|
||||
| 1004 | zlh-zpack-proxy | Traefik — game/dev edge routing + dev IDE wildcard TLS |
|
||||
| 1005 | zpack-api | Node.js API |
|
||||
| 1006 | zlh-zpack-router | Game/dev router |
|
||||
| 1100 | zpack-portal | Next.js frontend |
|
||||
| 2001 | zlh-back | PBS backup + Backblaze B2 |
|
||||
### Infrastructure source of truth
|
||||
- Authoritative VM/IP inventory: `INFRASTRUCTURE.md`
|
||||
- `PROJECT_CONTEXT.md` is a platform snapshot, **not** the authoritative VM/IP registry
|
||||
|
||||
### Legacy / Reference Only (not active production)
|
||||
### Active core infrastructure (high level)
|
||||
- `9001 zlh-router` — OPNsense core router
|
||||
- `9002 zpack-router` — OPNsense game/dev router
|
||||
- `9010 zpack-dns` — Technitium DNS
|
||||
- `9011 zlh-proxy` — core reverse proxy
|
||||
- `9012 zpack-proxy` — game/dev edge proxy
|
||||
- `9014 zlh-artifacts` — runtime, jar, and agent artifact server
|
||||
- `9015 zpack-velocity` — Velocity proxy
|
||||
- `9016 zlh-monitor` — Prometheus/Grafana
|
||||
- `9017 zlh-back` — Proxmox Backup Server
|
||||
- `9020 zpack-api` — API VM
|
||||
- `9021 zpack-portal` — portal VM
|
||||
|
||||
| VM | Name | Notes |
|
||||
|----|------|-------|
|
||||
| 100 | zlh-panel | Old Pterodactyl panel — kept for reference |
|
||||
| 101 | zlh-wings | Old Wings — kept for reference |
|
||||
| 103 | zlh-api | Old API VM — kept for reference |
|
||||
| 1000 | zlh-router | Not in use |
|
||||
### Service discovery note
|
||||
- `internal.zlh` is **not currently used** for hot-path runtime service discovery
|
||||
- Prefer explicit env-configured IPs / addresses in runtime-critical paths
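
A minimal sketch of the preference above, written in Python for brevity rather than in the platform's own Node.js or Go. The variable name `ZPACK_API_ADDR` and the helper `resolve_service_addr` are hypothetical. The only point illustrated is that a runtime-critical path reads an explicit `host:port` from the environment and fails loudly instead of falling back to an `internal.zlh` name.

```python
import os


def resolve_service_addr(env_var, environ=None):
    """Return (host, port) from an explicit env setting.

    Hypothetical helper: raises instead of falling back to DNS-based
    discovery, so a missing setting surfaces at startup, not on the hot path.
    """
    env = os.environ if environ is None else environ
    raw = env.get(env_var, "").strip()
    if not raw:
        raise RuntimeError(f"{env_var} is not set; refusing to guess an address")

    # Split on the last colon so the port is always the final component.
    host, sep, port = raw.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise RuntimeError(f"{env_var} must look like host:port, got {raw!r}")
    return host, int(port)
```

Passing `environ` explicitly keeps the helper testable without touching the real process environment.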
|
||||
|
||||
---
|
||||
|
||||
## Stack
|
||||
|
||||
**API (zpack-api, VM 1005):** Node.js ESM, Express 5, Prisma 6, MariaDB,
|
||||
Redis, BullMQ, JWT, Stripe, argon2, ssh2, WebSocket, http-proxy-middleware
|
||||
**API (`zpack-api`):** Node.js ESM, Express 5, Prisma 6, MariaDB, Redis,
|
||||
BullMQ, JWT, Stripe, argon2, ssh2, WebSocket, http-proxy-middleware
|
||||
|
||||
**Portal (zpack-portal, VM 1100):** Next.js 15, TypeScript, TailwindCSS,
|
||||
**Portal (`zpack-portal`):** Next.js 15, TypeScript, TailwindCSS,
|
||||
Axios, WebSocket console. Sci-fi HUD aesthetic (steel textures, neon
|
||||
accents, beveled panels).
|
||||
|
||||
**Agent (zlh-agent):** Go 1.21, stdlib HTTP, creack/pty, gorilla/websocket.
|
||||
**Agent (`zlh-agent`):** Go 1.21, stdlib HTTP, creack/pty, gorilla/websocket.
|
||||
Runs inside every game/dev container. Only process with direct filesystem
|
||||
access. Pulls runtimes + server jars from zlh-artifacts (VM 1003).
|
||||
access. Pulls runtimes + server jars from `zlh-artifacts`.
|
||||
|
||||
**Velocity plugin (`ZpackVelocityBridge`):** custom Velocity-side bridge that
|
||||
hydrates/registers backend servers for the proxy and exposes plugin-local
|
||||
register/unregister/status HTTP endpoints.
|
||||
|
||||
---
|
||||
|
||||
## Agent (Operational)
|
||||
|
||||
- HTTP server on :18888, internal only — API is the only caller
|
||||
- HTTP server on :18888, internal only — API is the only intended caller
|
||||
- Container types: `game` and `dev`
|
||||
- Lifecycle: POST /config triggers async provision + start pipeline
|
||||
- Lifecycle: `POST /config` triggers async provision + start pipeline
|
||||
- Filesystem: strict path allowlist for games, workspace-root sandbox for dev containers
|
||||
- Upload transport: raw `http.request` piping (`req.pipe(proxyReq)`), never fetch()
|
||||
- Upload transport: raw `http.request` piping (`req.pipe(proxyReq)`), never `fetch()`
|
||||
- Console: PTY-backed WebSocket, one read loop per container
|
||||
- Self-update: periodic check + apply
|
||||
- Forge/Neoforge: automated 5-step post-install patch sequence
|
||||
- Modrinth mod lifecycle: install/enable/disable/delete — fully operational
|
||||
- Forge/Neoforge: automated post-install patch sequence
|
||||
- Modrinth mod lifecycle: install/enable/disable/delete — operational
|
||||
- Provenance: `.zlh_metadata.json` — source is `null` if not set
|
||||
- Status transport model: poll-based (`/status`), not push-based
|
||||
- State transitions: `idle`, `installing`, `starting`, `running`, `stopping`, `crashed`, `error`
|
||||
- Crash recovery: backoff 30s/60s/120s, resets if uptime ≥ 30s, `error` state after repeated failures
|
||||
- Crash observability: exit code, signal, uptime, log tail, classification (oom/mod_error/missing_dep/nonzero/unexpected)
|
||||
- Structured logging across provisioning, installs, file ops, control plane
|
||||
- Crash observability: exit code, signal, uptime, log tail, classification
|
||||
- Real Minecraft readiness probing exists in `internal/minecraft/readiness.go`
|
||||
- Current open Minecraft issue is sequencing/use of readiness, not absence of readiness logic
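
The crash-recovery bullet above can be read as a small decision function. The sketch below is in Python for brevity (the agent itself is Go) and is not the agent's code. The 30s/60s/120s schedule and the 30-second uptime reset come from the list; the assumption that `error` is entered once the schedule is exhausted, and the name `next_crash_action`, are illustrative only.

```python
BACKOFF_SECONDS = (30, 60, 120)  # schedule taken from the list above
STABLE_UPTIME_SECONDS = 30       # uptime at or above this resets the counter


def next_crash_action(consecutive_crashes, uptime_seconds):
    """Decide what to do after a crash.

    consecutive_crashes is the count *before* this crash.
    Returns (action, delay_seconds, new_crash_count).
    """
    # A run that stayed up long enough counts as healthy, so start over.
    if uptime_seconds >= STABLE_UPTIME_SECONDS:
        consecutive_crashes = 0

    # Assumption: once every backoff step has been used, give up.
    if consecutive_crashes >= len(BACKOFF_SECONDS):
        return ("error", None, consecutive_crashes)

    delay = BACKOFF_SECONDS[consecutive_crashes]
    return ("restart", delay, consecutive_crashes + 1)
```

Under these assumptions, three quick crashes in a row wait 30, 60, and 120 seconds, and a fourth moves the container to `error`; any run of 30 seconds or more puts the next crash back at the 30-second step.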
|
||||
|
||||
---
|
||||
|
||||
@@ -96,7 +103,6 @@ access. Pulls runtimes + server jars from zlh-artifacts (VM 1003).
|
||||
- agent port: `18888`
|
||||
|
||||
Code-server launch model:
|
||||
|
||||
- binds to `0.0.0.0`
|
||||
- `--auth none`
|
||||
- API/hosted flow handles auth and proxying
|
||||
@@ -107,75 +113,50 @@ Code-server launch model:
|
||||
|
||||
### Browser IDE (Current Working Model)
|
||||
|
||||
```
|
||||
```text
|
||||
Browser
|
||||
↓
|
||||
Traefik (dev-<vmid>.zerolaghub.dev, 10.70.0.242)
|
||||
Traefik / hosted dev edge
|
||||
↓
|
||||
API (10.60.0.245:4000)
|
||||
API
|
||||
↓
|
||||
container:6000
|
||||
```
|
||||
|
||||
Working hosted flow:
|
||||
|
||||
1. frontend calls `POST /api/dev/:id/ide-token`
|
||||
2. API returns `https://dev-<vmid>.zerolaghub.dev/?token=...`
|
||||
2. API returns hosted IDE URL with short-lived token
|
||||
3. browser opens hosted URL
|
||||
4. Traefik wildcard router forwards to API at `http://10.60.0.245:4000`
|
||||
4. edge forwards to API
|
||||
5. API validates token, sets HTTP-only IDE cookie, redirects to clean hosted URL
|
||||
6. subsequent cookie-backed request proxied to container code-server
|
||||
6. subsequent cookie-backed requests proxy to container code-server
|
||||
7. code-server redirects to `/?folder=/home/dev/workspace`
|
||||
8. IDE loads successfully
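
Steps 5 and 6 of the flow above amount to a two-branch decision on each hosted request. The following is a rough Python sketch of that decision, not the API's actual handler (which is Node.js). The cookie name `zlh_dev_ide_token` appears elsewhere in this document; the function name, the `is_valid_token` callback, the cookie attributes, and the `401` fallback are assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlsplit

IDE_COOKIE = "zlh_dev_ide_token"


def handle_hosted_request(url, cookies, is_valid_token):
    """Return a dict describing how a hosted IDE request would be answered."""
    parts = urlsplit(url)
    token = parse_qs(parts.query).get("token", [None])[0]

    if token is not None:
        # Bootstrap request: validate, hand off to a cookie, strip the token.
        if not is_valid_token(token):
            return {"status": 401}  # assumed failure response
        clean_url = f"{parts.scheme}://{parts.netloc}{parts.path or '/'}"
        return {
            "status": 302,
            "location": clean_url,
            "set_cookie": f"{IDE_COOKIE}={token}; HttpOnly; Secure; Path=/",
        }

    # Follow-up request: only a valid cookie is proxied to code-server.
    if is_valid_token(cookies.get(IDE_COOKIE, "")):
        return {"status": "proxy"}
    return {"status": 401}  # assumed failure response
```

Redirecting to the clean URL keeps the token out of the address bar and browser history once the HTTP-only cookie has taken over.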
|
||||
|
||||
Curl-verified response chain:
|
||||
### Traefik / edge role
|
||||
- terminates TLS for hosted dev traffic
|
||||
- forwards hosted dev traffic to the API
|
||||
- preserves original `Host` header
|
||||
- does **not** route directly to containers for hosted IDE access
|
||||
|
||||
- `GET /?token=...` → `302` + `Set-Cookie`
|
||||
- `GET /` with cookie → `302` to `/?folder=/home/dev/workspace`
|
||||
- `GET /?folder=/home/dev/workspace` → `200` code-server HTML
|
||||
|
||||
### Traefik Role
|
||||
|
||||
- terminates TLS via wildcard cert `*.zerolaghub.dev` (Let's Encrypt DNS-01 via Cloudflare)
|
||||
- matches `dev-*.zerolaghub.dev` via `HostRegexp`
|
||||
- forwards to API at `http://10.60.0.245:4000`
|
||||
- preserves original `Host` header (`passHostHeader: true`)
|
||||
- does NOT route directly to containers
|
||||
|
||||
### API Role
|
||||
|
||||
- extracts vmid from `Host` header via `handleHostedProxy`
|
||||
### API role
|
||||
- extracts vmid from hosted request context
|
||||
- validates short-lived IDE token
|
||||
- sets HTTP-only `zlh_dev_ide_token` cookie
|
||||
- sets HTTP-only IDE cookie
|
||||
- redirects token URL to clean hostname URL
|
||||
- proxies all live code-server HTTP + WebSocket traffic to correct container
|
||||
|
||||
### Local Developer Access (Future)
|
||||
- proxies live code-server HTTP + WebSocket traffic to the correct container
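
As an illustration of the vmid extraction step, the sketch below assumes that the hosted request context is still the `dev-<vmid>.zerolaghub.dev` hostname preserved in the `Host` header, and that vmids are numeric. It is written in Python rather than the API's Node.js, and `vmid_from_host` is a hypothetical name.

```python
import re

# Anchored with fullmatch below so look-alike hosts such as
# dev-1.zerolaghub.dev.example.com are rejected.
_DEV_HOST = re.compile(r"dev-(\d+)\.zerolaghub\.dev")


def vmid_from_host(host_header):
    """Return the numeric vmid encoded in a hosted dev hostname, or None."""
    host = host_header.strip().lower()
    host = host.split(":", 1)[0]  # drop an optional :port suffix
    match = _DEV_HOST.fullmatch(host)
    return int(match.group(1)) if match else None
```

Returning `None` for anything that does not match lets the caller reject the request before any proxying decision is made.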
|
||||
|
||||
### Local developer access (future / separate track)
|
||||
Headscale/Tailscale for SSH, VS Code Remote, local tools.
|
||||
Headscale server: `zlh-ctl` (status to be confirmed).
|
||||
Constraints: no exit nodes, `magic_dns: false`.
|
||||
|
||||
### Removed / No Longer Current
|
||||
|
||||
- path-based `/api/dev/:id/ide` as primary browser entry
|
||||
- path-based `/api/dev/:id/ide` as the primary browser entry
|
||||
- Caddy-hosted dev IDE edge
|
||||
- per-container Traefik file creation from dev provisioning
|
||||
- per-container Cloudflare/Technitium publish/unpublish from API for dev IDE access
|
||||
- per-container Cloudflare/Technitium publish/unpublish for dev IDE browser access
|
||||
|
||||
`proxyClient.js` remains in repo — still used by game edge publish logic.
|
||||
|
||||
---
|
||||
|
||||
## API Routes — Dev IDE
|
||||
|
||||
```
|
||||
POST /api/dev/:id/ide-token — generate short-lived IDE token + hosted URL
|
||||
```
|
||||
|
||||
Hosted requests land on the API through Traefik using the dev hostname.
|
||||
API handles host-based vmid extraction, token bootstrap, cookie handoff,
|
||||
HTTP + WebSocket proxy to code-server.
|
||||
`proxyClient.js` remains in repo and is still used by game edge publish logic.
|
||||
|
||||
---
|
||||
|
||||
@@ -183,8 +164,32 @@ HTTP + WebSocket proxy to code-server.
|
||||
|
||||
- API polls agent `/status`
|
||||
- API exposes polled state back to frontend via `GET /api/servers/:id/status`
|
||||
- Portal uses the API-mediated hosted IDE flow
|
||||
- Portal uses the API websocket bridge for console access
|
||||
- Portal still has some migration debt through `src/lib/api/legacy.ts`
|
||||
- Portal no longer relies on stale DB-only state for console availability
|
||||
- Game publish flow remains untouched
|
||||
- Game publish flow remains untouched by dev routing work
|
||||
|
||||
---
|
||||
|
||||
## Velocity / Registration Model
|
||||
|
||||
### Current model
|
||||
- API is the source of backend inventory/state for Minecraft routing
|
||||
- Velocity plugin (`ZpackVelocityBridge`) is the component that actually registers/unregisters backends inside Velocity
|
||||
- Plugin supports startup rehydrate from API plus plugin-local webhook endpoints:
|
||||
- `POST /zpack/register`
|
||||
- `POST /zpack/unregister`
|
||||
- `GET /zpack/status`
|
||||
|
||||
### Important current finding
|
||||
- The likely current bug is **not** generic “Velocity broken”
|
||||
- The likely issue is that the API/plugin path can expose/register a backend while the game server is `running` but not actually `ready`
|
||||
- The plugin is the critical part of that registration path
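
One way to picture the fix implied by this finding is to gate registration on readiness rather than on the `running` state alone. The sketch below is illustrative Python, not the plugin's Java or the API's Node.js. The `ready` field, the record shape, and the function name `plan_registration` are assumptions; the two output lists correspond conceptually to the plugin's register and unregister calls.

```python
def plan_registration(inventory, registered):
    """Compare API inventory with what the proxy currently has registered.

    inventory:  list of dicts with "name", "state", and an assumed "ready" flag
    registered: set of backend names currently registered in the proxy
    Returns (to_register, to_unregister) as sorted lists of names.
    """
    # A backend is eligible only when it is running AND has passed readiness.
    eligible = {
        server["name"]
        for server in inventory
        if server["state"] == "running" and server.get("ready", False)
    }
    to_register = sorted(eligible - registered)
    to_unregister = sorted(registered - eligible)
    return to_register, to_unregister
```

With this gate, a server that is `running` but still loading stays out of the proxy until a later pass sees it as ready.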
|
||||
|
||||
### Important implementation note
|
||||
- Current plugin default endpoint behavior still references `zpack-api.internal.zlh` unless overridden
|
||||
- That default is stale relative to current hot-path architecture and should not be relied on long-term
|
||||
|
||||
---
|
||||
|
||||
@@ -199,7 +204,7 @@ Terraria, Project Zomboid
|
||||
|
||||
## Developer-to-Player Pipeline (Revenue Model)
|
||||
|
||||
```
|
||||
```text
|
||||
LXC Dev Environment ($15-40/mo)
|
||||
→ Game/mod creation + testing
|
||||
→ Testing servers (50% dev discount)
|
||||
@@ -212,12 +217,19 @@ Revenue multiplier: 1 developer → ~10 players → $147.50/mo total.
|
||||
|
||||
---
|
||||
|
||||
## Open Threads
|
||||
## Open Threads (High Level)
|
||||
|
||||
1. Verify full browser behavior + WebSocket under hosted wildcard flow
|
||||
2. Confirm "Open IDE" button in portal uses hosted URL in production path
|
||||
3. Confirm Headscale `zlh-ctl` VM status
|
||||
4. Curated provenance — tracking install origin
|
||||
1. Billing / Stripe completion
|
||||
2. Game server world backup / restore
|
||||
3. User onboarding flow
|
||||
4. Fabric readiness gating / Velocity exposure sequencing
|
||||
5. Password reset verification
|
||||
6. Usage limits / quota enforcement
|
||||
7. Email notifications
|
||||
8. Velocity resync / refresh behavior
|
||||
9. Upload testing, stress testing, OPNsense audit, provisioning validation
|
||||
|
||||
See `OPEN_THREADS.md` for active detail and priority order.
|
||||
|
||||
---
|
||||
|
||||
@@ -225,11 +237,13 @@ Revenue multiplier: 1 developer → ~10 players → $147.50/mo total.
|
||||
|
||||
| Repo | Purpose |
|
||||
|------|---------|
|
||||
| zlh-grind | Execution workspace / continuity / active constraints |
|
||||
| zlh-docs | API/agent/portal reference docs (read from source) |
|
||||
| zpack-api | API source (mirror) |
|
||||
| zpack-portal | Portal source (mirror) |
|
||||
| zlh-agent | Agent source |
|
||||
| `zlh-grind` | execution workspace / continuity / active constraints |
|
||||
| `knowledge-base` | canonical architecture / strategy / bootstrap |
|
||||
| `zlh-docs` | API/agent/portal reference docs |
|
||||
| `zpack-api` | API source |
|
||||
| `zpack-portal` | portal source |
|
||||
| `zlh-agent` | agent source |
|
||||
| `ZpackVelocityBridge` | Velocity plugin / backend registration layer |
|
||||
|
||||
All at `git.zerolaghub.com/jester/<repo>`
|
||||
|
||||
@@ -237,13 +251,13 @@ All at `git.zerolaghub.com/jester/<repo>`
|
||||
|
||||
## Session Guidance
|
||||
|
||||
- zlh-grind is the execution continuity layer, not the architecture authority
|
||||
- zlh-docs has full agent documentation (routes, filesystem rules, provisioning pipeline)
|
||||
- Agent is the authority on filesystem enforcement — API must NOT duplicate filesystem logic
|
||||
- `knowledge-base` is the architecture authority
|
||||
- `zlh-grind` is the execution continuity layer
|
||||
- `INFRASTRUCTURE.md` is the authoritative VM/IP inventory
|
||||
- Agent is the authority on filesystem enforcement — API must **not** duplicate filesystem logic
|
||||
- Portal does not enforce real policy — agent enforces
|
||||
- Portal never calls agents directly — all traffic through API
|
||||
- Upload transport uses raw http.request piping, never fetch()
|
||||
- VMs 100, 101, 103, 1000 are legacy/unused — not active production
|
||||
- Portal never calls agents directly — all traffic goes through API
|
||||
- Upload transport uses raw `http.request` piping, never `fetch()`
|
||||
- Do not mark unimplemented work as complete
|
||||
- Game publish flow must never be modified by dev routing changes
|
||||
- `proxyClient.js` must not be deleted — used by game edge publish path
|
||||
|
||||