diff --git a/Codex/Agent/AGENT_TOKEN_FUTURE.md b/Codex/Agent/AGENT_TOKEN_FUTURE.md
new file mode 100644
index 0000000..da6da05
--- /dev/null
+++ b/Codex/Agent/AGENT_TOKEN_FUTURE.md
@@ -0,0 +1,76 @@
+# Agent Token Future State
+
+Static `ZLH_AGENT_TOKEN` is acceptable as a launch hardening layer, but it should not be treated as the final trust model.
+
+## Launch / short term
+
+For launch, keep the trust model simple and reliable:
+
+- API refuses to start in production without `ZLH_AGENT_TOKEN`.
+- Agent refuses protected requests in production without a configured `ZLH_AGENT_TOKEN`.
+- Portal never sees the Agent token.
+- API injects the token into every Agent request.
+- Protected Agent routes remain behind the token; only intentionally public endpoints such as `/health` and `/version` should be public.
+
+This closes the immediate API-to-Agent control-plane gap without adding clock, issuer, audience, or scope drift risk during final launch validation.
+
+## Longer-term target
+
+The better long-term pattern is short-lived signed API-to-Agent request tokens.
+
+In that model:
+
+1. API keeps a signing secret or private key.
+2. Agent knows only the verification secret or public key.
+3. API signs a short-lived token per Agent request, usually 30-120 seconds.
+4. Agent verifies issuer, audience, expiry, VMID, and scope before executing the request.
+
+Example claims:
+
+```json
+{
+  "iss": "zpack-api",
+  "aud": "zlh-agent",
+  "vmid": 5202,
+  "scope": "files:read",
+  "iat": 1714500000,
+  "exp": 1714500060,
+  "jti": "request-id"
+}
+```
+
+This limits blast radius if a token is exposed. A stolen short-lived `files:read` token for VMID `5202` cannot be reused later to stop another server, restore a backup, or mutate files on a different container.
+
+## Roadmap
+
+- Phase 1: static `ZLH_AGENT_TOKEN`, fail-closed in production.
+- Phase 2: token rotation and deployment validation so stale or missing tokens are caught before traffic.
+- Phase 3: short-lived API-signed request tokens for API-to-Agent calls.
+- Phase 4: optional mTLS between API and Agent for transport-level service identity.
+
+## JWT vs HMAC request signatures
+
+JWT is useful when explicit claims and public-key verification are desired.
+
+A compact HMAC request-signing scheme may be simpler for the Agent:
+
+```text
+X-ZLH-Agent-Timestamp:
+X-ZLH-Agent-Vmid:
+X-ZLH-Agent-Scope: lifecycle:stop
+X-ZLH-Agent-Signature: HMAC(secret, method + path + timestamp + vmid + scope + bodyHash)
+```
+
+Either is better than a forever bearer token once the launch system is stable.
+
+## Not a launch blocker
+
+Do not jump directly to short-lived signed tokens before launch unless required. They add operational complexity:
+
+- clock drift
+- issuer/audience mismatch
+- token scope bugs
+- signing-key/config drift
+- harder provisioning debugging
+
+For launch, static token plus fail-closed production behavior is the correct low-risk hardening step. For scale, short-lived signed API-to-Agent tokens are the cleaner end state.
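As a reference point for the HMAC request-signing option described under "JWT vs HMAC request signatures", the following is a minimal Python sketch of how the API side might sign a request and how the Agent side might verify it. It is illustrative only, not the project's implementation. The function names (`sign_request`, `verify_request`), the 60-second freshness window, the hex-encoded HMAC-SHA256 output, and the newline-separated canonical string are all assumptions; the document itself only specifies the four headers and the fields that go into the signature.

```python
import hashlib
import hmac
import time
from typing import Mapping, Optional

# Assumed freshness window; the document suggests 30-120 seconds for signed tokens.
MAX_SKEW_SECONDS = 60


def _canonical(method: str, path: str, timestamp: int, vmid: str, scope: str, body: bytes) -> bytes:
    """Build the byte string that gets signed (hypothetical canonical form)."""
    body_hash = hashlib.sha256(body).hexdigest()
    # Assumption: fields are joined with newlines rather than plain concatenation,
    # so that ("ab", "c") and ("a", "bc") cannot produce the same signed string.
    return "\n".join([method.upper(), path, str(timestamp), vmid, scope, body_hash]).encode()


def sign_request(secret: bytes, method: str, path: str, vmid: int, scope: str,
                 body: bytes = b"", now: Optional[float] = None) -> dict:
    """API side: return the X-ZLH-Agent-* headers for one request."""
    timestamp = int(time.time() if now is None else now)
    message = _canonical(method, path, timestamp, str(vmid), scope, body)
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return {
        "X-ZLH-Agent-Timestamp": str(timestamp),
        "X-ZLH-Agent-Vmid": str(vmid),
        "X-ZLH-Agent-Scope": scope,
        "X-ZLH-Agent-Signature": signature,
    }


def verify_request(secret: bytes, method: str, path: str, headers: Mapping[str, str],
                   body: bytes = b"", now: Optional[float] = None) -> bool:
    """Agent side: check freshness and signature; fail closed on any problem."""
    try:
        timestamp = int(headers["X-ZLH-Agent-Timestamp"])
        vmid = headers["X-ZLH-Agent-Vmid"]
        scope = headers["X-ZLH-Agent-Scope"]
        signature = headers["X-ZLH-Agent-Signature"]
    except (KeyError, ValueError):
        return False  # missing header or non-numeric timestamp

    current = time.time() if now is None else now
    if abs(current - timestamp) > MAX_SKEW_SECONDS:
        return False  # too old, or too far in the future (clock drift)

    message = _canonical(method, path, timestamp, vmid, scope, body)
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Because the VMID, scope, path, and body hash are all covered by the signature, changing any of them after signing makes verification fail. The sketch checks only freshness and integrity. The Agent would still need to confirm that the signed scope and VMID actually permit the route being called, and the header lookup here is case-sensitive, which a real HTTP framework would normally handle.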