zlh-grind/Codex/API/CURRENT_STATE.md

API — Current State

This file records what is believed to be implemented now.

Runtime / dependency baseline

  • API is now tracked against Node 24 with repo-local pinning via package.json engines and .nvmrc.
  • Direct node-fetch dependency has been removed and API code now uses built-in global fetch.
  • Dependency / audit cleanup has been performed and the reported audit state is clean.
  • Prisma config has been migrated out of deprecated package.json#prisma into prisma.config.ts.
  • Prisma generate / validate checks reportedly pass on the current API baseline.

Route and lifecycle split

  • /api/instances and /api/containers are different operational surfaces and should not be treated as duplicates.
  • GET /api/instances lists ContainerInstance rows and is still used for lookup/discovery by some Portal-side code paths.
  • POST /api/instances is the active agent-driven provisioning entrypoint.
  • DELETE /api/containers/:vmid is the cleanup/delete/orphan-remediation route used for failed creates, agent failures, manual deletions, and other orphaned container cases.
  • Current delete cleanup has been updated to derive archive metadata from current instance fields (payload, engineType, allocatedPorts) instead of assuming the pre-v2 game / variant / ports fields are still present on ContainerInstance.
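
A minimal sketch of the delete-cleanup change, assuming a plain object shaped like a ContainerInstance row. The function name buildArchiveMetadata and the output field names are hypothetical; only payload, engineType, and allocatedPorts come from the notes above.

```javascript
// Hypothetical helper (not the API's actual code): derive archive metadata
// from current ContainerInstance fields instead of the pre-v2
// game / variant / ports fields.
function buildArchiveMetadata(instance) {
  return {
    vmid: instance.vmid,
    // engineType may be missing on a row left behind by a failed create.
    engineType: instance.engineType ?? null,
    // Tolerate a missing or malformed allocatedPorts value.
    allocatedPorts: Array.isArray(instance.allocatedPorts)
      ? instance.allocatedPorts
      : [],
    // Keep the request payload as-is so the archive reflects what was asked for.
    payload: instance.payload ?? {},
  };
}
```

The defaults matter mostly for orphan remediation, where the row being archived may be only partially populated.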

Readiness / agent state model

  • API is the heartbeat authority: it polls agents for state.
  • Agent does not push state to API.
  • API consumes:
    • /health for liveness
    • /ready for semantic readiness
    • /status for detailed state snapshot
  • Portal should rely on API-normalized state, not direct agent state.
  • Proxy lifecycle is now tracked separately from agent readiness under ContainerInstance.payload.proxy.
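
One way to picture "API-normalized state" is a small function that folds the three agent probes into a single object. This is a sketch under assumptions: normalizeAgentState and the fields live, ready, and detail are hypothetical names, not the API's real response shape.

```javascript
// Hypothetical normalizer: combine the results of /health, /ready, and
// /status into one API-owned state object for Portal.
function normalizeAgentState({ healthOk, semanticReady, statusBody }) {
  const live = healthOk === true;
  return {
    live,
    // An agent that failed the liveness probe is never reported as ready.
    ready: live && semanticReady === true,
    // The detailed snapshot is only passed along when the agent is live.
    detail: live ? (statusBody ?? null) : null,
  };
}
```

Proxy lifecycle is deliberately absent from this object, since the notes above track it separately under ContainerInstance.payload.proxy.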

Readiness cleanup already done

  • agentClient.js centralizes non-streaming agent transport.
  • getAgentReady() remains low-level transport.
  • isAgentReadyResult() is the shared semantic readiness helper.
  • assertAgentReady() uses semantic readiness.
  • Poller only caches ready: true when /ready returns semantic success.
  • Provisioning requires semantic readiness before success/persist/publish.
  • Timeout handling in agentClient.js has been modernized to AbortSignal.timeout(...).
  • A final alignment pass is still needed in merged live status so any Portal-facing agentReady field fully matches semantic /ready rather than looser cache presence.
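
The layering above (low-level transport, shared semantic helper, cache that only trusts semantic success) can be sketched as follows. The names fetchReady, isSemanticallyReady, and updateReadyCache are hypothetical stand-ins that loosely mirror getAgentReady() and isAgentReadyResult(); the assumption that /ready returns a JSON body with ready: true is also illustrative, not confirmed by these notes.

```javascript
// Low-level transport: returns what happened and makes no readiness decision.
async function fetchReady(baseUrl, { timeoutMs = 3000, fetchImpl = fetch } = {}) {
  try {
    const res = await fetchImpl(new URL("/ready", baseUrl), {
      // AbortSignal.timeout aborts the request with a TimeoutError.
      signal: AbortSignal.timeout(timeoutMs),
    });
    const body = await res.json().catch(() => null);
    return { ok: res.ok, status: res.status, body };
  } catch (err) {
    return { ok: false, status: 0, body: null, error: err.name };
  }
}

// Semantic readiness: an HTTP success alone is not enough.
// ASSUMPTION: the agent signals readiness with `{ ready: true }`.
function isSemanticallyReady(result) {
  return Boolean(result && result.ok && result.body && result.body.ready === true);
}

// Poller cache: ready is true only when the semantic check passes.
function updateReadyCache(cache, instanceId, result) {
  cache.set(instanceId, { ready: isSemanticallyReady(result), checkedAt: Date.now() });
}
```

Under this split, a Portal-facing agentReady field would read the cached ready value rather than checking whether a cache entry merely exists, which is the alignment gap the last bullet describes.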

Provisioning / post-provision flow

  • src/api/provisionAgent.js is the active provisioning path.
  • The older worker-based provisioning chain has been archived for reference and is no longer treated as a live API path.
  • Post-provision edge publishing is still active.
  • Post-provision behavior no longer depends on legacy port-allocation commits; it should be understood as request-driven / payload-driven.

Velocity / Minecraft edge lifecycle

  • Minecraft edge publish uses Velocity instead of Traefik TCP config.
  • API registers Minecraft backends with the Velocity bridge using POST /zpack/register and unregisters with POST /zpack/unregister.
  • Velocity registration verification now uses the bridge's real GET /zpack/status route instead of the nonexistent /zpack/list route.
  • /zpack/status is expected to expose a servers array containing { name, address, port } entries.
  • If /zpack/status exists but does not expose registered backends, API treats the register response as acknowledged but unverifiable instead of falsely failing the registration.
  • API exposes POST /internal/velocity/proxy-status for bridge lifecycle callbacks using the same hashed X-Zpack-Secret shared-secret auth style.
  • Accepted proxy lifecycle statuses are registered_with_proxy, proxy_ping_ok, and proxy_ping_failed.
  • Proxy lifecycle callbacks are stored under ContainerInstance.payload.proxy with source, status, server name, address, port, duplicate flag, timestamps, latency, detail, and last event time.
  • GET /api/servers/:id/status now includes the stored proxy object beside agent-derived live status.
  • The bridge should set ZPACK_PROXY_STATUS_ENDPOINT to the API internal callback URL, for example http://<api-host>:4000/internal/velocity/proxy-status.
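
The verification and callback rules above can be sketched as two small checks. The function names and the returned labels are hypothetical; the servers array shape and the three accepted statuses come from the notes above. Treating a missing servers array as "unverifiable" is one reading of "does not expose registered backends" and may not match the real code exactly.

```javascript
// Hypothetical classifier for the body of GET /zpack/status after a register call.
function classifyRegistration(statusBody, expected) {
  const servers = statusBody?.servers;
  // No servers array: the bridge cannot confirm or deny, so do not fail.
  if (!Array.isArray(servers)) return "acknowledged_unverifiable";
  const found = servers.some(
    (s) =>
      s.name === expected.name &&
      s.address === expected.address &&
      Number(s.port) === Number(expected.port),
  );
  return found ? "verified" : "not_found";
}

// Statuses accepted by POST /internal/velocity/proxy-status, per the list above.
const ACCEPTED_PROXY_STATUSES = new Set([
  "registered_with_proxy",
  "proxy_ping_ok",
  "proxy_ping_failed",
]);

function isAcceptedProxyStatus(status) {
  return ACCEPTED_PROXY_STATUSES.has(status);
}
```

Ports are compared numerically so that "25565" and 25565 are treated as the same backend.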

Backup support

  • API forwards game backup operations.
  • Current API route shape:
    • GET /api/game/servers/:id/backups
    • POST /api/game/servers/:id/backups
    • POST /api/game/servers/:id/backups/restore?id=<backup_id>
    • DELETE /api/game/servers/:id/backups/:backupId
  • Restore start is async at the API layer and Portal is expected to poll status rather than hold the restore POST open.
  • API forwards agent HTTP status codes for backup responses.
  • Successful backup responses currently pass through the agent body.
  • Non-OK backup responses currently use the shared agent response envelope: { error: <fallback>, details: <agent_body> }.
  • Backup response shape normalization remains open.
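
The current pass-through and envelope behavior can be summarized in a few lines. This is an illustrative sketch: toBackupResponse and the default fallback message are hypothetical, while the { error, details } envelope and the forwarded status code follow the bullets above.

```javascript
// Hypothetical mapper from an agent backup response to the API response.
function toBackupResponse(agentStatus, agentBody, fallback = "Backup request failed") {
  const ok = agentStatus >= 200 && agentStatus < 300;
  return {
    // The agent's HTTP status code is forwarded unchanged.
    status: agentStatus,
    // Success passes the agent body through; failure wraps it in the envelope.
    body: ok ? agentBody : { error: fallback, details: agentBody },
  };
}
```

The asymmetry between the two branches is the reason response shape normalization is still listed as open: a caller has to handle a raw agent body on success and a wrapped one on failure.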

File proxy / route compatibility

  • Duplicated game file proxy logic has been extracted into src/routes/helpers/gameFileProxy.js in the API repo.
  • Route compatibility is intentionally preserved between:
    • /api/game/servers/:id/files...
    • /api/servers/:id/files...
  • Streamed upload / download / file edit forwarding still exists outside the generic non-streaming agent helper path.
  • Compatibility between canonical game routes and legacy/compatibility server routes should be treated as part of the API contract.
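
To illustrate what "route compatibility" means in practice, the sketch below recognizes both prefixes and reduces them to the same server id and sub-path. The pattern and the name parseFileRoute are assumptions for illustration; the shared helper in gameFileProxy.js may be wired quite differently (for example, by mounting one handler on two routers).

```javascript
// Matches /api/game/servers/:id/files... and /api/servers/:id/files...
const FILE_ROUTE = /^\/api\/(?:game\/)?servers\/([^/]+)\/files(\/.*)?$/;

// Hypothetical parser: both route families yield the same result shape,
// so a single proxy implementation can serve either prefix.
function parseFileRoute(pathname) {
  const match = FILE_ROUTE.exec(pathname);
  if (!match) return null;
  return { serverId: match[1], rest: match[2] ?? "" };
}
```

A contract test that feeds the same sub-path through both prefixes and compares the results is a cheap way to keep the two surfaces from drifting apart.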

Agent contract alignment already done

  • /start, /stop, and /restart are forwarded as POST.
  • /console/command is forwarded as POST JSON.
  • /ready is part of poller/readiness logic.
  • There is no API-native /api/agent/:serverId/:action route in zpack-api.
  • The reviewed Portal repo contained a Portal-side bridge with that shape which calls /api/instances for IP lookup and then contacts the agent directly; static review did not find an obvious current frontend caller for that bridge.
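
The forwarding rules above amount to a small mapping from action to request. The sketch below is hypothetical (buildAgentRequest is not a function named in these notes, and the console command body shape is left to the caller), but it shows the two cases: bodyless POST for lifecycle actions and JSON POST for console commands.

```javascript
const LIFECYCLE_ACTIONS = new Set(["start", "stop", "restart"]);

// Hypothetical builder for the request the API forwards to an agent.
function buildAgentRequest(action, payload) {
  if (LIFECYCLE_ACTIONS.has(action)) {
    // Lifecycle actions are plain POSTs with no body.
    return { path: `/${action}`, init: { method: "POST" } };
  }
  if (action === "console/command") {
    // Console commands are sent as a JSON body.
    return {
      path: "/console/command",
      init: {
        method: "POST",
        headers: { "content-type": "application/json" },
        body: JSON.stringify(payload ?? {}),
      },
    };
  }
  throw new Error(`Unsupported agent action: ${action}`);
}
```

Rejecting unknown actions keeps this from turning into the generic /api/agent/:serverId/:action passthrough that the API intentionally does not expose.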

Billing / auth lifecycle

  • API issues access tokens and refresh tokens.
  • Password reset tokens are stored hashed and exchanged through API routes.
  • Stripe billing routes cover checkout, upgrade, downgrade, portal, and current billing state.
  • Stripe webhooks are mounted with raw body parsing before normal JSON middleware.
  • Billing scheduler starts in-process and performs limited reminder/reconciliation work.
  • Admin users are billing-exempt in billing flows.
  • JWT verification has reportedly been tightened to fixed algorithm plus issuer/audience separation for access, refresh, and IDE proxy tokens.
  • Pre-hardening tokens may no longer verify and a re-login may be required after this change.
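
To show why issuer/audience separation invalidates older tokens, here is a claims-only sketch. It does not verify signatures, and every value in TOKEN_PROFILES (the HS256 algorithm, the issuer string, the audience strings) is a placeholder assumption rather than the API's real configuration.

```javascript
// Hypothetical per-token-type profiles: one fixed algorithm, one issuer,
// and a distinct audience for each token kind.
const TOKEN_PROFILES = {
  access: { alg: "HS256", iss: "zpack-api", aud: "zpack:access" },
  refresh: { alg: "HS256", iss: "zpack-api", aud: "zpack:refresh" },
  ide: { alg: "HS256", iss: "zpack-api", aud: "zpack:ide-proxy" },
};

// Checks a decoded header and claim set against the profile for `kind`.
// Signature verification is intentionally out of scope for this sketch.
function claimsMatchProfile(kind, header, claims) {
  const profile = TOKEN_PROFILES[kind];
  if (!profile) return false;
  // The JWT `aud` claim may be a single string or an array of strings.
  const audiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  return (
    header.alg === profile.alg &&
    claims.iss === profile.iss &&
    audiences.includes(profile.aud)
  );
}
```

A token minted before the hardening would carry no iss or aud claim, so it fails this check even with a valid signature, which is consistent with the re-login note above. The same check stops a refresh token from being replayed as an access token.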

Hosted IDE proxy

  • POST /api/dev/:id/ide-token issues short-lived IDE proxy tokens.
  • IDE proxy supports both tunnel paths under /__ide/:id and hosted dev-<vmid>.<suffix> hosts.
  • Hosted IDE tokens can be delivered by query parameter and then persisted as the IDE proxy cookie.
  • Whether the API returns the hosted URL is controlled by DEV_IDE_RETURN_HOSTED_URL.
  • API exposes code-server controls for owned dev containers:
    • POST /api/dev/:id/codeserver/start -> Agent POST /dev/codeserver/start
    • POST /api/dev/:id/codeserver/stop -> Agent POST /dev/codeserver/stop
    • POST /api/dev/:id/codeserver/restart -> Agent POST /dev/codeserver/restart
  • IDE proxy cookie hardening is expected to include httpOnly, sameSite: "lax", and secure-cookie behavior tied to public HTTPS or explicit secure-cookie config.
  • Sensitive proxy logging has reportedly been reduced so cookies and forwarded header detail are not exposed in normal logs.
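
The cookie hardening expectation can be written down as a small options builder. The parameter names publicUrl and forceSecure are hypothetical stand-ins for whatever configuration the API actually reads; the three option keys follow the bullet above.

```javascript
// Hypothetical builder for IDE proxy cookie options.
function ideCookieOptions({ publicUrl, forceSecure } = {}) {
  const servedOverHttps =
    typeof publicUrl === "string" && publicUrl.startsWith("https://");
  return {
    httpOnly: true, // not readable from page scripts
    sameSite: "lax",
    // Secure when the public origin is HTTPS or when explicitly configured.
    secure: forceSecure === true || servedOverHttps,
  };
}
```

Tying secure to the public origin keeps plain-HTTP local development working while still marking the cookie secure behind public HTTPS.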

Legacy / archived behavior

  • Legacy port allocation / slot reservation is no longer part of the live route mounts and has been archived for reference.
  • Legacy worker provisioning, detached reconcile helpers, and explicit .old files have been moved under archive for reference rather than kept in the live tree.
  • Websocket console proxy wiring remains outside agentClient.js.
  • Raw streaming upload proxy behavior remains outside agentClient.js.

Node 24 cleanup already reflected in API repo

  • RegExp.escape(...) is used where host / suffix regex escaping was previously manual.
  • Selected built-in imports have been normalized to node: style.
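
As an example of the host / suffix escaping case, the sketch below builds a matcher for hosted dev-<vmid>.<suffix> hosts. The names buildDevHostMatcher and escapeRegex are hypothetical, the assumption that vmid is numeric is mine, and the manual fallback exists only so the snippet also runs on runtimes without RegExp.escape.

```javascript
// Prefer the built-in RegExp.escape; fall back to manual escaping
// on runtimes that do not provide it.
const escapeRegex =
  typeof RegExp.escape === "function"
    ? RegExp.escape
    : (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Hypothetical matcher for hosted IDE hosts of the form dev-<vmid>.<suffix>.
// ASSUMPTION: vmid is a run of digits.
function buildDevHostMatcher(suffix) {
  const pattern = new RegExp(`^dev-(\\d+)\\.${escapeRegex(suffix)}$`, "i");
  // Returns the vmid as a string, or null when the host does not match.
  return (host) => {
    const match = pattern.exec(host);
    return match ? match[1] : null;
  };
}
```

Escaping the suffix matters because an unescaped dot in a suffix such as ide.example.com would also match hosts like ideXexample.com.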