# Open Threads — zlh-grind

This file tracks **active cross-repo and platform-level work only**.

Repo-specific work belongs in:

- `Codex/API/OPEN_ITEMS.md`
- `Codex/Portal/OPEN_ITEMS.md`
- `Codex/Agent/OPEN_ITEMS.md`
- `Codex/Monitoring/OPEN_ITEMS.md`

Keep this file short.

---

## Cross-Repo Active

### Final launch smoke test

- create Minecraft server
- confirm it reaches `Ready` / `connectable=true`
- verify public game hostname is shown only when connectable
- upload datapack on vanilla or install mod on supported modded runtime
- create backup
- restore backup
- stop/start/restart host lifecycle actions
- delete server
- confirm Velocity unregister, Cloudflare cleanup, and Technitium cleanup

### Portal public-site QA

- marketing site now has hybrid SaaS structure and SEO landing pages; verify visually before public push
- check desktop and mobile layouts for Home, Features, Pricing, FAQ, About, Support, and SEO landing pages
- confirm public CTAs route correctly to Register, Login, Pricing, Features, FAQ, and SEO pages
- confirm root metadata, page metadata, and hero copy reflect browser dev environments + managed server hosting
- mobile web is not yet considered optimized; perform a targeted usability pass before launch marketing traffic

### Backup / restore polish

- happy-path local Minecraft backup create/restore has been verified live
- API restore starts asynchronously and Portal polls restore status
- keep restore wording/status transitions clear through completion and restart
- confirm checkpoint metadata presentation remains clean when exposed to Portal
- later hardening: persist last restore failure/checkpoint state in Agent `/status`
- later hardening: automatic rollback from pre-restore checkpoint if restore apply/start fails after destructive replace

### Service discovery / launch validation

- service discovery migration audit for remaining non-launch hot-path references
- provisioning validation across current API/Agent/Portal assumptions
- keep public exposure model explicit:
  - Portal public
  - Minecraft game hostnames public as needed
  - API/control plane/internal bridge/agent/admin services private

### Monitoring / observability

- core lifecycle monitoring is launch-ready
- `/etc/zlh-monitor` is now the operational monitoring source of truth
- game/dev monitoring uses API discovery -> monitor sync -> file_sd for lifecycle inventory and add/remove validation
- container Alloy remote-write to Prometheus `10.60.0.25:9090` is the canonical game/dev metrics path
- `game-dev-alloy` scrape health also works because container Alloy now listens on `0.0.0.0:12345`
- remaining future work lives in `Codex/Monitoring/OPEN_ITEMS.md`: centralized logs/Loki and optional OPNsense router-only monitoring
- keep OPNsense plugin/PBS monitoring as explicit platform exceptions while Linux-managed game/dev targets converge on Alloy

### Notifications / launch polish

- email notifications across backend contract + Portal UX
- billing launch validation:
  - plan limit gating verified in Portal
  - still verify checkout/portal/webhook/upgrade-downgrade if Stripe is live

---

## Platform / Infrastructure Active

- stress testing: k6 IDE load, Minecraft bot load, code-server memory baseline
- OPNsense / public exposure audit
- billing endpoint/path cleanup verification

### Backup boundary

- Agent-owned backups are local, app-aware rollback backups for Minecraft worlds/config
- PBS / platform backup strategy is the durability / disaster-recovery layer
- do not track PBS/offsite durability work as agent implementation work unless that ownership changes

---

## Recently Verified / No Longer Considered Blocked

- password reset and logged-in change-password work end-to-end
- password reset tokens are 5-minute, hashed at rest, single-use, and old unused tokens are invalidated on deploy
- API-owned Minecraft connection state derives from agent readiness, edge/DNS state, Velocity registration, and backend ping
- Velocity proxy lifecycle callbacks are live with `registered_with_proxy` and `proxy_ping_ok` landing in API state
- Portal consumes API-owned `connectable` / `connection` state and no longer infers Minecraft readiness itself
- Portal server creation redirects to `/servers` and tracks setup progress there
- Portal status labels no longer treat all non-connectable states as `Needs attention`
- Portal public marketing site now has hybrid conversion + SEO structure
- Portal pricing tiers are now Starter / Pro / Performance workload tiers rather than Minecraft-only tier names
- Portal root metadata, homepage hero copy, and fake CLI line were corrected to match actual product capabilities
- SEO landing pages were added for Minecraft hosting, modded Minecraft hosting, and browser dev environments
- local Minecraft backup create/restore works end-to-end on live validation
- restore creates intentional pre-restore checkpoint and API now starts restore asynchronously instead of holding the full request open
- backup timestamps are normalized and pre-restore checkpoints are filtered from the default backup list
- agent-backed file edits create shadow copies for revert and API route/stream forwarding issues were fixed
- vanilla datapack upload works
- vanilla Mods UI is hidden and direct vanilla `mods/` upload is rejected by API
- NeoForge mod search/install/list works
- delete/teardown lifecycle removes Velocity, Cloudflare, and Technitium records
- public exposure model is in place: Portal public, control plane private
- vanilla / fabric runtime split is restored:
  - `vanilla` = Fabric-based internal profile with proxy/API/config injection
  - `fabric` = plain Fabric jar delivery only
- Forge / NeoForge first-start flow avoids premature readiness gating, applies post-start property enforcement, and restarts through the readiness-aware path
- current validation indicates Minecraft server creation succeeds across supported runtime variants
- current validation indicates dev container creation succeeds and hosted IDE access still works after the latest API/Portal runtime and cleanup passes

---

## Platform Future / Phase 2

- SSH / CF tunnel power-user access
- Portal SSH config snippets
- true interactive shell confinement / workspace-boundary decision
- dev-container backup ownership and user-facing restore contract
- likely direction for dev backups: LXC snapshot-based backup/restore instead of agent-managed dev backups
- artifact version promotion
- runtime rollback support
- Cloudflare R2 for large artifact/mod delivery
- admin panel
- referral / dev pipeline reward system
- uptime history
- revisit DDoS mitigation later if needed

---

## Cleaning Rule

- Root keeps only cross-repo/platform work
- Repo-specific items must be removed from root once they live only in one Codex tracker
- Completed items should be removed, not left in place as historical clutter
- Use `CURRENT_STATE.md` for durable implemented behavior
- Use `DECISIONS.md` for settled choices
- Re-open old items only when there is current evidence they are still unfinished or have regressed