# Open Threads — zlh-grind
This file tracks **active cross-repo and platform-level work only**.
Repo-specific work belongs in:
- `Codex/API/OPEN_ITEMS.md`
- `Codex/Portal/OPEN_ITEMS.md`
- `Codex/Agent/OPEN_ITEMS.md`
Keep this file short.
---
## Cross-Repo Active
### Backup / restore UX and contract polish
- keep Portal aligned with async restore start + status polling
- keep restore wording/status transitions clear through completion and restart
- confirm checkpoint metadata presentation remains clean when exposed to Portal
- consider later hardening for automatic rollback from pre-restore checkpoint if restore apply/start fails after destructive replace
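
The async restore start + status polling flow above could be exercised from a client roughly as follows. This is a minimal sketch, not the Portal implementation: the status strings (`completed`, `failed`), the `get_status` callable, and the interval/timeout defaults are all hypothetical stand-ins for whatever the real API contract defines.

```python
import time
from typing import Callable

# Hypothetical terminal status names; the real restore contract may differ.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_restore(
    get_status: Callable[[], str],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll a restore until it reaches a terminal status or the timeout expires.

    `get_status` is an assumed callable that asks the API for the current
    restore status (for example by wrapping an HTTP GET).
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"restore still {status!r} after {timeout_s}s")
        # Non-terminal states (queued, restoring, restarting, ...) keep polling.
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting; the caller decides how to word each intermediate status for the user.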
### Dev access / IDE / SSH
- simplify and harden API `devProxy`
- complete SSH / CF tunnel access path across platform, API, Agent, and Portal UX
- add Portal SSH config snippet for power users
- resolve the dev console / shell workspace-boundary mismatch: live validation currently shows the hosted IDE and dev console working, but an interactive shell can still `cd ..` above `/home/dev/workspace`
- make docs and implementation agree on whether workspace scoping is file-API-only or true interactive-shell confinement
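
To make the file-API-only versus shell-confinement distinction concrete, here is a sketch of what file-API-level scoping typically looks like. It assumes a hypothetical helper `resolve_in_workspace` and is not taken from the API or Agent code. A check like this only guards paths that pass through it; it does nothing to stop an interactive shell from running `cd ..`, which would need OS-level confinement instead.

```python
from pathlib import Path

# Workspace root named in the item above.
WORKSPACE_ROOT = Path("/home/dev/workspace")


def resolve_in_workspace(requested: str, root: Path = WORKSPACE_ROOT) -> Path:
    """Resolve a requested path and reject anything outside the workspace root.

    Hypothetical helper: illustrates file-API scoping only.
    """
    root = root.resolve()
    # Joining an absolute `requested` discards `root`, so the containment
    # check below also catches inputs like "/etc/passwd".
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{requested!r} escapes the workspace root")
    return candidate
```

Resolving before comparing matters: a plain string-prefix check would accept `src/../../etc`, while the resolved path makes the escape visible.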
### Service discovery / launch validation
- service discovery migration for remaining hot-path references
- provisioning validation across current API/Agent/Portal assumptions
- Fabric / readiness / Velocity exposure final cross-component verification
- game server subdomain / player connection method verification
### Monitoring / observability
- normalize game/dev Alloy monitoring contract across API discovery, agent-written Alloy labels, Prometheus targets, and Grafana dashboards
- keep dynamic game/dev discovery on API -> sync script -> file_sd and verify automatic add/remove behavior for new containers
- finish game/dev template cleanup so Alloy is standard and `node-exporter` is removed from those templates
- keep OPNsense plugin and PBS monitoring as explicit platform exceptions while Linux-managed targets converge on Alloy
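
The API -> sync script -> file_sd path above could look roughly like this on the sync-script side. This is a sketch under assumptions: the input record fields (`name`, `host`, `port`, `kind`) and the label names are hypothetical, and only the output shape (a JSON list of `targets` / `labels` groups) follows the Prometheus file-based service discovery format.

```python
import json
import os
import tempfile


def render_file_sd(containers: list[dict]) -> str:
    """Render discovered containers as Prometheus file_sd JSON.

    Each input dict is assumed to carry `name`, `host`, `port`, and `kind`
    ("game" or "dev"); these field names are illustrative only.
    """
    groups = [
        {
            "targets": [f'{c["host"]}:{c["port"]}'],
            "labels": {"container": c["name"], "kind": c["kind"]},
        }
        # Sorting keeps the output stable, so unchanged discovery results
        # produce byte-identical files.
        for c in sorted(containers, key=lambda c: c["name"])
    ]
    return json.dumps(groups, indent=2, sort_keys=True)


def write_atomically(path: str, content: str) -> None:
    """Write via a temp file plus rename so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    os.replace(tmp_path, path)
```

Because the file is regenerated from the full discovery result on every run, a removed container drops out of the target list automatically, which is the add/remove behavior the item above asks to verify.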
### Notifications / launch polish
- email notifications across backend contract + Portal UX
- remove stray `testdameon` / `testdaemon` binary from Portal repo
---
## Platform / Infrastructure Active
- upload testing
- stress testing: k6 IDE load, Minecraft bot load, code-server memory baseline
- OPNsense audit
- billing endpoint/path cleanup verification
### Backup boundary
- Agent-owned backups are local, app-aware rollback backups for Minecraft worlds/config
- PBS / platform backup strategy is the durability / disaster-recovery layer
- do not track PBS/offsite durability work as agent implementation work unless that ownership changes
---
## Recently Verified / No Longer Considered Blocked
- local Minecraft backup create/restore works end-to-end on live validation
- restore creates intentional pre-restore checkpoint and API now starts restore asynchronously instead of holding the full request open
- backup timestamps are normalized and pre-restore checkpoints are filtered from the default backup list
- agent-backed file edits create shadow copies for revert and API route/stream forwarding issues were fixed
- vanilla / fabric runtime split is restored:
  - `vanilla` = Fabric-based internal profile with proxy/API/config injection
  - `fabric` = plain Fabric jar delivery only
- Forge / NeoForge first-start flow now avoids premature readiness gating, applies post-start property enforcement, and restarts through the readiness-aware path
- current validation indicates Minecraft server creation succeeds across supported runtime variants
- current validation indicates dev container creation succeeds and hosted IDE access still works after the latest API/Portal runtime and cleanup passes
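
As an illustration of the backup-list behavior verified above (normalized timestamps, pre-restore checkpoints hidden by default), a sketch might look like this. The record fields (`kind`, `created_at`), the `pre_restore_checkpoint` value, the UTC assumption for naive timestamps, and the newest-first ordering are all assumptions made for the example, not the Agent's actual schema.

```python
from datetime import datetime, timezone

# Hypothetical marker for checkpoints taken automatically before a restore.
CHECKPOINT_KIND = "pre_restore_checkpoint"


def normalize_timestamp(value: str) -> str:
    """Convert an ISO 8601 timestamp to UTC in `YYYY-MM-DDTHH:MM:SSZ` form."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def default_backup_list(backups: list[dict], include_checkpoints: bool = False) -> list[dict]:
    """Return backups newest first, hiding pre-restore checkpoints by default."""
    visible = [
        {**b, "created_at": normalize_timestamp(b["created_at"])}
        for b in backups
        if include_checkpoints or b.get("kind") != CHECKPOINT_KIND
    ]
    # Normalized UTC strings sort chronologically as plain text.
    return sorted(visible, key=lambda b: b["created_at"], reverse=True)
```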
---
## Platform Future
- CF Tunnel SSH completion beyond first working path
- artifact version promotion
- runtime rollback support
- Cloudflare R2 for large artifact/mod delivery
- admin panel
- referral / dev pipeline reward system
- uptime history
- revisit DDoS mitigation later if needed
---
## Cleaning Rule
- Root keeps only cross-repo/platform work
- Repo-specific items must be removed from root once they live only in one Codex tracker
- Completed items should be removed, not left in place as historical clutter
- Use `CURRENT_STATE.md` for durable implemented behavior
- Use `DECISIONS.md` for settled choices
- Re-open old items only when there is current evidence they are still unfinished or have regressed