Add Agent token future-state guidance
parent 32c281585f
commit 8e520d35cc

Codex/Agent/AGENT_TOKEN_FUTURE.md (normal file, 76 lines added)
@@ -0,0 +1,76 @@

# Agent Token Future State

Static `ZLH_AGENT_TOKEN` is acceptable as a launch hardening layer, but it should not be treated as the final trust model.

## Launch / short term

For launch, keep the trust model simple and reliable:

- API refuses to start in production without `ZLH_AGENT_TOKEN`.
- Agent refuses protected requests in production without a configured `ZLH_AGENT_TOKEN`.
- Portal never sees the Agent token.
- API injects the token into every Agent request.
- Protected Agent routes remain behind the token; only intentionally public endpoints such as `/health` and `/version` are reachable without it.

This closes the immediate API-to-Agent control-plane gap without adding clock-drift, issuer, audience, or scope-mismatch risk during final launch validation.

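The launch rules above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the `ZLH_ENV` variable, the `Authorization: Bearer` transport, and the function names are assumptions made for the example; only `ZLH_AGENT_TOKEN`, `/health`, and `/version` come from the text.

```python
import hmac
import os

# Intentionally public endpoints, as listed above.
PUBLIC_PATHS = {"/health", "/version"}


class MissingAgentToken(RuntimeError):
    """Raised when production starts without ZLH_AGENT_TOKEN."""


def load_agent_token(env=os.environ):
    """Return the configured token; refuse to continue in production without one."""
    token = env.get("ZLH_AGENT_TOKEN", "")
    # ZLH_ENV is a hypothetical name for the deployment-mode variable.
    if env.get("ZLH_ENV") == "production" and not token:
        raise MissingAgentToken("ZLH_AGENT_TOKEN is required in production")
    return token


def is_authorized(path, authorization_header, token, production=True):
    """Agent-side gate: public paths pass, everything else needs the token."""
    if path in PUBLIC_PATHS:
        return True
    if not token:
        # Fail closed in production; outside production the sketch allows the call.
        return not production
    expected = f"Bearer {token}"  # assumed header format
    presented = authorization_header or ""
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(presented.encode(), expected.encode())
```

The API side would call `load_agent_token()` once at startup and attach the result to every outgoing Agent request, so the Portal never handles the value.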
## Longer-term target

The better long-term pattern is short-lived signed API-to-Agent request tokens.

In that model:

1. API keeps a signing secret or private key.
2. Agent knows only the verification secret or public key.
3. API signs a short-lived token for each Agent request, typically valid for 30-120 seconds.
4. Agent verifies issuer, audience, expiry, VMID, and scope before executing the request.

Example claims:

```json
{
  "iss": "zpack-api",
  "aud": "zlh-agent",
  "vmid": 5202,
  "scope": "files:read",
  "iat": 1714500000,
  "exp": 1714500060,
  "jti": "request-id"
}
```

This limits blast radius if a token is exposed. A stolen short-lived `files:read` token for VMID `5202` cannot be reused later to stop another server, restore a backup, or mutate files on a different container.

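The four steps above can be sketched with the Python standard library alone. This is an illustration under stated assumptions: it uses a shared secret with HS256-style signing (the public-key variant described in step 2 would need a third-party JWT library), and the function names, the `leeway` parameter, and the default 60-second lifetime are hypothetical choices for the example.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_request_token(secret, vmid, scope, jti, ttl=60, now=None):
    """API side: build a compact HS256 token with the example claims."""
    now = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"iss": "zpack-api", "aud": "zlh-agent", "vmid": vmid,
              "scope": scope, "iat": now, "exp": now + ttl, "jti": jti}
    parts = [_b64url(json.dumps(p, separators=(",", ":")).encode())
             for p in (header, claims)]
    signing_input = ".".join(parts)
    sig = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + _b64url(sig)


def verify_request_token(secret, token, vmid, scope, now=None, leeway=5):
    """Agent side: return the claims, or raise ValueError on any failed check."""
    now = int(time.time() if now is None else now)
    header_b64, claims_b64, sig_b64 = token.split(".")
    expected = hmac.new(secret, f"{header_b64}.{claims_b64}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(claims_b64))
    if claims.get("iss") != "zpack-api" or claims.get("aud") != "zlh-agent":
        raise ValueError("wrong issuer or audience")
    # Small leeway (an assumption) tolerates minor clock drift between hosts.
    if now > claims.get("exp", 0) + leeway:
        raise ValueError("token expired")
    if claims.get("vmid") != vmid or claims.get("scope") != scope:
        raise ValueError("token not valid for this VMID or scope")
    return claims
```

The Agent passes the VMID and scope that the incoming route actually requires, so a `files:read` token presented to a stop endpoint fails the last check even while it is still unexpired.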
## Roadmap

- Phase 1: static `ZLH_AGENT_TOKEN`, fail-closed in production.
- Phase 2: token rotation and deployment validation so stale or missing tokens are caught before traffic.
- Phase 3: short-lived API-signed request tokens for API-to-Agent calls.
- Phase 4: optional mTLS between API and Agent for transport-level service identity.

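One possible shape for Phase 2 is sketched below. It assumes the Agent accepts both the current and the previous token during a rotation window, and that a pre-traffic check classifies the API's token as usable, missing, or stale. Both ideas and all names here are hypothetical; in a real deployment the second argument of `validate_deployment` would be replaced by a probe request to a protected Agent route, since the API should not know the Agent's accepted list.

```python
import hmac
from typing import Sequence


def token_matches(presented: str, accepted: Sequence[str]) -> bool:
    """True if the presented token equals any non-empty accepted token."""
    ok = False
    for candidate in accepted:
        # Compare against every candidate so timing does not reveal which one matched.
        if candidate and hmac.compare_digest(presented.encode(), candidate.encode()):
            ok = True
    return ok


def validate_deployment(api_token: str, agent_accepted: Sequence[str]) -> str:
    """Pre-traffic check: classify the API token as 'ok', 'missing', or 'stale'."""
    if not api_token:
        return "missing"
    return "ok" if token_matches(api_token, agent_accepted) else "stale"
```

Running a check like this as a deployment gate surfaces a mismatched token as a failed rollout instead of as rejected Agent calls under live traffic.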
## JWT vs HMAC request signatures

JWT is useful when explicit claims and public-key verification are desired.

A compact HMAC request-signing scheme may be simpler for the Agent:

```text
X-ZLH-Agent-Timestamp: <unix timestamp>
X-ZLH-Agent-Vmid: <vmid>
X-ZLH-Agent-Scope: lifecycle:stop
X-ZLH-Agent-Signature: HMAC(secret, method + path + timestamp + vmid + scope + bodyHash)
```

Either is better than a long-lived static bearer token once the launch system is stable.

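The header scheme above could be implemented roughly as follows. This is a sketch, not a specification: it assumes HMAC-SHA256 with hex output, a SHA-256 body hash, and newline-separated fields in the signed message (plain concatenation can let two different field combinations produce the same string). The `max_skew` window and the `/lifecycle/stop` path used in the usage note are also assumptions.

```python
import hashlib
import hmac
import time


def sign_request(secret, method, path, timestamp, vmid, scope, body=b""):
    """API side: compute the value for X-ZLH-Agent-Signature."""
    body_hash = hashlib.sha256(body).hexdigest()
    # Newline-joined fields keep the boundaries between values unambiguous.
    message = "\n".join(
        [method.upper(), path, str(timestamp), str(vmid), scope, body_hash]
    )
    return hmac.new(secret, message.encode(), hashlib.sha256).hexdigest()


def verify_request(secret, method, path, headers, body=b"", now=None, max_skew=60):
    """Agent side: recompute the signature and reject stale timestamps."""
    now = int(time.time() if now is None else now)
    try:
        timestamp = int(headers["X-ZLH-Agent-Timestamp"])
        vmid = headers["X-ZLH-Agent-Vmid"]
        scope = headers["X-ZLH-Agent-Scope"]
        signature = headers["X-ZLH-Agent-Signature"]
    except (KeyError, ValueError):
        return False  # missing or malformed headers fail closed
    if abs(now - timestamp) > max_skew:
        return False  # outside the accepted clock window
    expected = sign_request(secret, method, path, timestamp, vmid, scope, body)
    return hmac.compare_digest(expected.encode(), signature.encode())
```

For example, the API would call `sign_request(secret, "POST", "/lifecycle/stop", ts, 5202, "lifecycle:stop", body)` and send the result alongside the other three headers. After verification the Agent still has to confirm that the signed VMID and scope match the route being invoked; the signature only proves the headers were not altered.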
## Not a launch blocker

Do not jump directly to short-lived signed tokens before launch unless required. They add operational complexity:

- clock drift
- issuer/audience mismatch
- token scope bugs
- signing-key/config drift
- provisioning failures that are harder to debug

For launch, static token plus fail-closed production behavior is the correct low-risk hardening step. For scale, short-lived signed API-to-Agent tokens are the cleaner end state.