API — Open Items

Only keep unfinished API work here.

Active

  • normalize backup response shape: define canonical success bodies for list/create/restore/delete and a stable error envelope that preserves agent details
  • service discovery migration: audit edge publish/DNS/Cloudflare/Technitium, Prometheus SD, dev IDE wildcard, and post-provision hot paths for direct host assumptions
  • provisioning validation follow-up where API behavior is involved
  • verify Portal compatibility after API JWT issuer/audience tightening, especially refresh flow and hosted IDE token flow
  • migrate Portal delete calls to DELETE /api/servers/:id; DELETE /api/containers/:vmid currently stays reachable by owning users for compatibility, alongside internal-token automation, and should return to internal-only once Portal is updated
  • verify canonical and compatibility file routes still behave identically across list/stat/read/download/delete/put/revert/upload paths after helper extraction
  • align merged live-status readiness fields so the Portal-facing agentReady field carries the same meaning as semantic /ready
  • decide whether the remaining Portal-side /api/agent/:serverId/:action bridge should be deleted outright or formally kept as compatibility-only Portal-owned behavior
  • live-verify Velocity bridge lifecycle callbacks after ZPACK_PROXY_STATUS_ENDPOINT is set: confirm registered_with_proxy, proxy_ping_ok, and proxy_ping_failed land in ContainerInstance.payload.proxy and surface through GET /api/servers/:id/status
  • Portal integration follow-up for host/LXC lifecycle state: consume GET /api/servers/:id/host/status, list-level hostStatus / powerState / hostOperation, and 202 host action responses for game and dev containers
  • remove or downgrade the temporary MaxListenersExceededWarning tracer in src/app.js after outbound Axios socket listener warnings are confirmed quiet in runtime logs
  • verify Proxmox node resolution against all active container ranges; recent local smoke checks showed some DB VMIDs not present in /cluster/resources or on the configured node
  • add minimal test/CI coverage for auth boundaries and teardown behavior:
    • normal users cannot reach admin-only routes
    • missing internal token fails closed for internal-only routes
    • owned-user delete archives then removes the active instance
    • non-owner delete fails
    • Stripe webhook raw-body handling remains intact
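The "missing internal token fails closed" case above could be pinned down with a guard along these lines. This is a minimal sketch, not the repo's actual middleware: the helper names, the `x-internal-token` header, and the 403 response body are all assumptions made for illustration.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: true only when a token is configured AND the presented
// value matches it. A missing or empty configured token is a denial, which is
// the fail-closed behavior the test item asks for.
function internalTokenMatches(configuredToken, presentedToken) {
  if (typeof configuredToken !== 'string' || configuredToken.length === 0) return false;
  if (typeof presentedToken !== 'string' || presentedToken.length === 0) return false;
  // Hash both sides so timingSafeEqual always receives equal-length buffers.
  const expected = crypto.createHash('sha256').update(configuredToken).digest();
  const actual = crypto.createHash('sha256').update(presentedToken).digest();
  return crypto.timingSafeEqual(expected, actual);
}

// Express-style middleware factory. The header name is an assumption.
function requireInternalToken(getConfiguredToken) {
  return function internalTokenGuard(req, res, next) {
    const presented = req.headers['x-internal-token'];
    if (!internalTokenMatches(getConfiguredToken(), presented)) {
      res.statusCode = 403;
      res.end(JSON.stringify({ error: 'forbidden' }));
      return;
    }
    next();
  };
}
```

A test for the fail-closed case would build the guard with a getter that returns undefined and assert that the response is 403 and next() is never called, even when the request carries a token.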

Cleanup / consolidation priorities

  • fold repeated ownership/auth/IP-guard patterns into small concrete helpers without hiding route intent
  • split oversized route/service files by responsibility without changing route contracts
  • keep backup/restore status shaping and async-dispatch logic explicit, but remove duplicated mapping/normalization paths where possible
  • keep stream-vs-JSON forwarding rules centralized in one place and avoid route-local reimplementation
  • keep legacy flows out of the live tree unless they are intentionally revived and revalidated against the current schema/contracts
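For the backup/restore shaping item, one way to remove duplicated mapping paths is a single pair of shaping helpers shared by list/create/restore/delete. The sketch below is illustrative only: the field names (`ok`, `operation`, `error.agent`), the 502 fallback, and the assumed shape of the agent error object are hypothetical, since the canonical shape is still an open item.

```javascript
// Hypothetical canonical success body for backup routes.
function toBackupSuccess(operation, data) {
  return { ok: true, operation, data };
}

// Hypothetical stable error envelope. `agentError` is assumed to look like
// { status, code, message, details }; any of those fields may be missing.
function toBackupErrorEnvelope(operation, agentError) {
  const source = agentError || {};
  return {
    ok: false,
    operation,
    error: {
      code: source.code || 'AGENT_ERROR',
      message: source.message || 'backup operation failed',
      // Fall back to 502 when the agent did not supply a usable status.
      status: Number.isInteger(source.status) ? source.status : 502,
      // Agent details are passed through unmodified so callers can inspect them.
      agent: source.details !== undefined ? source.details : null,
    },
  };
}
```

Keeping both helpers in one module would let every backup route call them directly instead of reshaping agent responses locally.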

Completed and moved out of active cleanup

  • Node/runtime pinning is no longer an open cleanup-only item; Node 24 pinning is now treated as current repo state
  • node-fetch removal and built-in fetch migration are no longer open items
  • initial file-proxy route deduplication has been completed; only compatibility verification and follow-on cleanup remain open
  • Prisma config migration is no longer an open item
  • baseline proxy cookie/log hardening is no longer an open item
  • worker-era provisioning and detached legacy port reservation have been removed from the live tree rather than treated as active API surfaces
  • initial control-plane hardening has landed: shared admin/internal-token middleware, admin-only audit/global instance list, internal-token guards on raw control-plane routes, and fail-closed monitoring/SD token behavior outside explicit development/test
  • teardown workflow has been extracted into a service and live-verified against one game VMID and one dev VMID; both archived with reason=user_delete and active rows were removed
  • repo hygiene pass removed checked-in key/token/artifact/legacy clutter and tightened ignore rules

Cleanup rules

  • prefer behavior-preserving folding over broad refactors
  • merge repeated flows, not concepts
  • keep helpers small and concrete
  • reduce route-local duplication before introducing new abstractions
  • treat security/runtime changes as contract-sensitive validation work once they affect auth, cookies, or route compatibility

Verify before re-opening

  • hosted IDE token + hosted URL flow
  • backup forwarding semantics
  • readiness polling/cache behavior
  • quota enforcement on create flow
  • restore async-start contract + status polling semantics
  • streamed file edit/revert forwarding through both canonical and compatibility routes
  • older-session re-login behavior after JWT tightening
  • Portal-side /api/agent bridge usage before deleting any remaining compatibility code around instance lookup assumptions

Not API-owned

  • agent-local backup implementation details
  • portal-only UX/polish
  • PBS / infra backup strategy