# Open Threads — zlh-grind

This file tracks active but unfinished work.

Keep it short.

---

## Agent (zlh-agent)

### Dev Runtime System
Completed:
- catalog validation implemented
- runtime installs artifact-backed
- install guard implemented
- all installs now fetch from artifact server

Outstanding:
- runtime install verification improvements
- catalog hash validation
- runtime removal / upgrade handling
- runtime update process for dev containers

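For the outstanding catalog hash validation item, one possible shape is to stream each downloaded artifact through SHA-256 and compare the digest with the value recorded in the catalog. This is only a sketch: the function names are hypothetical, and it assumes the catalog stores a hex-encoded SHA-256 digest per artifact, which this file does not confirm.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// sha256Hex streams r through SHA-256 and returns the lowercase hex digest.
func sha256Hex(r io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// verifyArtifact compares the digest of r with the expected catalog value.
// Assumption: the catalog entry carries a hex-encoded SHA-256 digest.
func verifyArtifact(r io.Reader, wantHex string) error {
	got, err := sha256Hex(r)
	if err != nil {
		return fmt.Errorf("hashing artifact: %w", err)
	}
	if !strings.EqualFold(got, strings.TrimSpace(wantHex)) {
		return fmt.Errorf("hash mismatch: got %s, want %s", got, wantHex)
	}
	return nil
}

// verifyArtifactBytes is a convenience wrapper for in-memory data.
func verifyArtifactBytes(data []byte, wantHex string) error {
	return verifyArtifact(bytes.NewReader(data), wantHex)
}

func main() {
	data := []byte("example artifact contents")
	want, _ := sha256Hex(bytes.NewReader(data))

	fmt.Println(verifyArtifactBytes(data, want) == nil)               // digest matches
	fmt.Println(verifyArtifactBytes([]byte("tampered"), want) == nil) // digest differs
}
```

Streaming through `io.Copy` keeps memory use flat for large runtime archives; the install guard could refuse to unpack anything for which `verifyArtifact` returns an error.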
### Dev Environment
Completed:
- dev user creation
- workspace root `/home/dev/workspace`
- console runs as dev user
- `HOME`, `USER`, `LOGNAME`, `TERM` env vars set correctly

Outstanding:
- PATH normalization
- shell profile consistency
- runtime PATH injection

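The PATH normalization and runtime PATH injection items could be handled by one small helper that prepends runtime `bin` directories and drops empty or duplicate entries. The sketch below is an assumption about the desired behavior, not the agent's current logic; `normalizePath` and the `/opt/runtimes/dotnet/bin` directory are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizePath prepends runtime bin directories to an existing PATH value,
// dropping empty entries and duplicates while keeping first-seen order.
func normalizePath(runtimeDirs []string, existing string) string {
	seen := make(map[string]bool)
	var out []string
	add := func(dir string) {
		dir = strings.TrimSpace(dir)
		if dir == "" || seen[dir] {
			return
		}
		seen[dir] = true
		out = append(out, dir)
	}
	for _, d := range runtimeDirs {
		add(d)
	}
	for _, d := range strings.Split(existing, ":") {
		add(d)
	}
	return strings.Join(out, ":")
}

func main() {
	// Hypothetical runtime directory; the real install layout is not described here.
	fmt.Println(normalizePath(
		[]string{"/opt/runtimes/dotnet/bin"},
		"/usr/local/bin:/usr/bin::/usr/bin:/bin",
	))
}
```

Dropping empty entries is a deliberate choice: an empty PATH element means "current directory" to most shells, which is rarely wanted in a shared dev container.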
### Code Server Addon
Status: Installed, running, browser-verified end-to-end.

Outstanding:
- code-server memory baseline
- decide whether default dev RAM should increase or become tier-based
- k6 IDE session load test

### Game Server Supervision
Completed:
- crash recovery with backoff
- crash observability and classification

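For readers unfamiliar with the pattern, crash recovery with backoff usually means waiting longer before each successive restart, up to a cap. The sketch below shows capped exponential backoff under stated assumptions; the doubling policy, the 1 s base, and the 30 s cap are illustrative, not the agent's actual values.

```go
package main

import (
	"fmt"
	"time"
)

// restartDelay returns the wait before restart attempt n (0-based),
// doubling from base on each attempt and never exceeding max.
func restartDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

func main() {
	// Illustrative values only: 1s base, 30s cap.
	for attempt := 0; attempt < 6; attempt++ {
		fmt.Println(attempt, restartDelay(attempt, time.Second, 30*time.Second))
	}
}
```

A supervisor would typically reset the attempt counter once the server has stayed up for some minimum time, so a single crash after days of uptime is not penalized.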
### Fabric Readiness Gating
Status: startup rehydrate path fixed in plugin; happy-path and negative-path startup validation both passed.

Still outstanding:
- confirm whether any remaining agent-side registration path can surface a backend before readiness probe success
- see `SCRATCH/session-stabilization-fabric-findings.md`

### Agent Future Work
1. Structured logging (slog) for Loki
2. Dev container `provisioningComplete` state in `/status`
3. Graceful shutdown verification
4. Process reattachment on agent restart
5. SSH server install in dev container provisioning pipeline

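As a starting point for item 1, Go's standard `log/slog` package can emit one JSON object per log line, a shape that log shippers generally handle well. This is a minimal sketch assuming the agent is written in Go (suggested by the `slog` mention); the `component` attribute and the `newLogger` and `renderLine` helpers are hypothetical, and the Loki shipping side is not shown.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log/slog"
	"os"
)

// newLogger returns a JSON logger writing to w at Info level.
// The "component" attribute is a hypothetical label for filtering.
func newLogger(w io.Writer, component string) *slog.Logger {
	h := slog.NewJSONHandler(w, &slog.HandlerOptions{Level: slog.LevelInfo})
	return slog.New(h).With("component", component)
}

// renderLine logs one message into a buffer and returns the raw JSON line,
// which is handy for inspecting what a log shipper would receive.
func renderLine(component, msg string, args ...any) string {
	var buf bytes.Buffer
	newLogger(&buf, component).Info(msg, args...)
	return buf.String()
}

func main() {
	logger := newLogger(os.Stdout, "zlh-agent")
	logger.Info("runtime installed", "runtime", "dotnet")
	logger.Debug("suppressed because the handler level is Info")

	fmt.Print(renderLine("zlh-agent", "game server restarted", "attempt", 2))
}
```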
---

## Dev IDE Access

### Browser IDE
Status: fully working, browser-verified, zero-install.

Remaining:
- confirm "Open IDE" button in portal uses hosted URL in production path
- reduce legacy `/__ide/:id` compatibility paths once portal button confirmed
- simplify and harden `devProxy`

### Local Dev Access — SSH via CF Tunnel
Current state:
- tunnel created and connected to bastion VM
- Zero Trust free plan active
- SSH hostname mapping not yet configured
- bastion SSH proxy jump config not yet done
- dev container SSH server not yet verified
- portal SSH config snippet not yet built

---

## API (zpack-api)

Completed:
- dev provisioning payload
- runtime/version fields
- `enable_code_server` flag
- status endpoint
- IDE token generation + hosted URL
- bootstrap IDE route
- live tunnel proxy
- host-based routing for hosted IDE
- hosted flow browser-verified end-to-end
- backend lifecycle hardened
- duplicate server creation fixed
- console routing corrected
- control plane switched to IP-based service communication
- Velocity rehydration uses DB + Redis instead of Proxmox live state

Outstanding:
- Billing / Stripe integration
- password reset flow verification
- usage limits / quota enforcement
- simplify and harden host-native `devProxy`
- dev runtime catalog endpoint for portal
- Headscale auth key generation
- service discovery migration for remaining hot-path `internal.zlh` references

---

## Portal (zpack-portal)

Completed:
- dev runtime dropdown
- dotnet runtime support
- enable code-server checkbox
- dev file browser support
- site copy rewrite
- pricing page updated

Outstanding:
- confirm "Open IDE" button fully uses hosted URL flow
- SSH config snippet for power users
- user onboarding flow
- email notifications
- dashboard information architecture refresh: remove or rethink duplicate resource overview/cards that add no value beyond the dedicated servers page
- remove `testdaemon` binary from repo root

---

## Game Servers

Outstanding:
- game server world backup / restore
- game server subdomain / player connection method verification

---

## Velocity / ZpackVelocityBridge

Completed:
- startup rehydrate now requires `ready == true`
- stale default `zpack-api.internal.zlh` fallback removed
- explicit `ZPACK_REHYDRATE_ENDPOINT` env wiring validated
- happy-path player routing validated live after restart
- negative-path startup validation passed: no backend registered when rehydrate returned zero eligible servers

Outstanding:
- confirm whether any remaining agent-side registration path can surface a backend before readiness probe success
- see `SCRATCH/velocity-plugin.md`

Optional / Nonessential:
- `ZpackCommands` exist in code but are not currently needed operationally
- prefer plugin HTTP status endpoint and host metrics over in-proxy admin commands
- if metrics coverage is sufficient, command registration can remain omitted or the command class can be removed later

---

## Pre-Launch Checklist

Outstanding before launch:
- Billing / Stripe integration
- game server world backup / restore
- user onboarding flow
- password reset flow verification
- usage limits / quota enforcement
- game server subdomain verification
- email notifications
- upload testing
- billing endpoints
- stress testing: k6 IDE + Minecraft bot + code-server memory baseline
- OPNsense audit
- Fabric readiness gating full validation
- service discovery migration
- provisioning validation
- remove `testdaemon` from `zpack-portal`

---

## Platform

Future work:
- CF Tunnel SSH completion
- artifact version promotion
- runtime rollback support
- Cloudflare R2 for large artifact/mod file delivery
- admin panel
- referral / dev pipeline reward system
- uptime history
- revisit DDoS mitigation later if needed

---

## Closed Threads

- PTY console (dev + game)
- mod lifecycle
- upload pipeline
- runtime artifact installs
- dev container filesystem model
- code-server artifact fix
- API status endpoint for frontend agent-state consumption
- game server crash recovery with backoff
- crash observability
- code-server lifecycle endpoints
- code-server process detection
- dev IDE proxy
- hosted wildcard Traefik → API → container IDE flow
- per-container dev IDE edge publish/unpublish removed from API
- wildcard TLS cert `*.zerolaghub.dev`
- browser IDE fully loading at `dev-<vmid>.zerolaghub.dev`
- CF Tunnel created and connected to bastion VM
- portal copy rewrite
- DDoS investigation
- hosting provider decision: GTHost Detroit
- migration to Detroit complete
- system stabilization after migration
- IP-based control plane
- Velocity startup rehydrate fixed and validated on happy path