# Service Discovery — Env-Based Approach
## Problem with internal.zlh DNS
Service-to-service calls in the API and agent currently use internal.zlh FQDNs
(e.g. `zpack-velocity.internal.zlh:8081`). This introduces:
- DNS resolution overhead on every call that is not served from a resolver cache
- Silent failure if Technitium is down or slow
- Hard-to-diagnose timing issues during provisioning
- An extra dependency in the critical path for container creation

The provisioning and DB consistency issues may be partly caused by DNS resolution
delays or failures during rapid create/delete cycles.
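
Before committing to the change, the DNS hypothesis can be checked with a quick measurement. The sketch below times repeated lookups with `socket.getaddrinfo` and counts failures. The function names `time_resolution` and `summarize` are made up for this note, and the `resolver` parameter exists only so the timing loop can be exercised without a live DNS server.

```python
import socket
import statistics
import time
from typing import Callable, List, Optional


def time_resolution(
    host: str,
    port: int,
    attempts: int = 20,
    resolver: Callable[..., object] = socket.getaddrinfo,
) -> List[float]:
    """Return per-attempt resolution latency in milliseconds.

    A failed lookup is recorded as float("inf") so it stays visible
    in the results instead of being silently dropped.
    """
    samples: List[float] = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            resolver(host, port)
            samples.append((time.perf_counter() - start) * 1000.0)
        except OSError:  # socket.gaierror is a subclass of OSError
            samples.append(float("inf"))
    return samples


def summarize(samples: List[float]) -> dict:
    """Count failures and report median and worst successful lookup."""
    ok = [s for s in samples if s != float("inf")]
    median: Optional[float] = statistics.median(ok) if ok else None
    worst: Optional[float] = max(ok) if ok else None
    return {
        "attempts": len(samples),
        "failures": len(samples) - len(ok),
        "median_ms": median,
        "max_ms": worst,
    }
```

Running `summarize(time_resolution("zpack-velocity.internal.zlh", 8081))` from the API or agent VM during a create/delete burst would show whether lookups are slow or failing. Any local resolver cache on the VM will affect the numbers, so treat them as indicative only.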
## Recommended Approach — Env File
Replace internal.zlh FQDNs in service-to-service calls with env vars backed by IPs.
Each VM reads from its own `.env` or `services.env` file.
### Example services.env
```env
ZPACK_API_URL=http://10.60.0.18:4000
ZLH_ARTIFACTS_URL=http://10.60.0.17
ZPACK_VELOCITY_URL=http://10.70.0.10:8081
ZLH_PBS_URL=http://10.60.0.24
ZLH_MONITOR_URL=http://10.60.0.25
ZLH_DNS_URL=http://10.60.0.14
```
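
A minimal sketch of how a service could consume this file at startup, assuming plain `KEY=VALUE` lines with no quoting or variable expansion. The helper names `parse_env_text` and `load_service_urls` and the choice of required variables are illustrative, not existing code in any of the services. The process environment is allowed to override the file, and a missing or malformed URL fails at startup rather than during a provisioning call.

```python
import os
from typing import Dict, Mapping, Sequence
from urllib.parse import urlsplit

# Variable names taken from the example services.env above.
REQUIRED_VARS = ("ZPACK_API_URL", "ZLH_ARTIFACTS_URL", "ZPACK_VELOCITY_URL")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=VALUE line: {raw!r}")
        values[key.strip()] = value.strip()
    return values


def load_service_urls(
    file_values: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
    required: Sequence[str] = REQUIRED_VARS,
) -> Dict[str, str]:
    """Merge file values with the environment (environment wins) and
    fail fast if a required URL is missing or not an http(s) URL."""
    merged = dict(file_values)
    for name in required:
        if name in environ:
            merged[name] = environ[name]
    for name in required:
        url = merged.get(name)
        if not url:
            raise ValueError(f"{name} is not set")
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or parts.hostname is None:
            raise ValueError(f"{name} is not a valid http(s) URL: {url!r}")
    return merged
```

A service would call something like `load_service_urls(parse_env_text(open("services.env").read()))` once at startup and pass the resulting URLs to its HTTP clients.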
### Benefits
- Zero DNS resolution — direct IP, always predictable
- No dependency on Technitium for service calls
- If IPs ever change, update one env file per VM — no DNS record changes needed
- Easier to debug — no DNS layer to troubleshoot
- Faster provisioning — no resolution delay in the critical path
### Scope
Services that need to change:
| Service | Current | Change to |
|---------|---------|-----------|
| zpack-api | `zpack-velocity.internal.zlh:8081` | `ZPACK_VELOCITY_URL` env var |
| zpack-api | `zlh-artifacts.internal.zlh` | `ZLH_ARTIFACTS_URL` env var |
| zpack-portal | `zpack-api.internal.zlh` | `ZPACK_API_URL` env var (server-side only) |
| zlh-agent | artifact server FQDN | `ZLH_ARTIFACTS_URL` env var |
### What internal.zlh is still useful for
- Human navigation: SSH, browser access to admin tools, Proxmox
- Non-critical paths where DNS resolution timing doesn't matter
- Documentation and reference

internal.zlh should NOT be used in:
- Provisioning hot path
- Agent → artifact server calls
- API → Velocity calls
- Any call that happens during container create/delete
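
One way to keep FQDNs from creeping back into these paths is a startup check that every configured service URL has an IP literal as its host. This is a sketch only: `is_ip_backed` and `find_dns_backed` are hypothetical helpers, and it relies on nothing beyond the standard-library `ipaddress` and `urllib.parse` modules.

```python
import ipaddress
from typing import List, Mapping
from urllib.parse import urlsplit


def is_ip_backed(url: str) -> bool:
    """True if the URL's host is an IPv4/IPv6 literal rather than a DNS name."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def find_dns_backed(urls: Mapping[str, str]) -> List[str]:
    """Return the names of variables whose URL still needs DNS resolution."""
    return sorted(name for name, url in urls.items() if not is_ip_backed(url))


# Example: one entry migrated to an IP, one still on an internal.zlh FQDN.
offenders = find_dns_backed({
    "ZPACK_VELOCITY_URL": "http://10.70.0.10:8081",
    "ZLH_ARTIFACTS_URL": "http://zlh-artifacts.internal.zlh",
})
```

Here `offenders` contains only `ZLH_ARTIFACTS_URL`. A service could log the list or refuse to start when it is non-empty.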
## Current IPs (Detroit host)
See INFRASTRUCTURE.md for the full IP table. Key service addresses:
| Service | IP[:port] |
|---------|----|
| zpack-api | 10.60.0.18:4000 |
| zpack-portal | 10.60.0.19 |
| zlh-artifacts | 10.60.0.17 |
| zpack-velocity | 10.70.0.10:8081 |
| zlh-dns (Technitium) | 10.60.0.14 |
| zlh-proxy (Caddy) | 10.60.0.16 |
| zpack-proxy (Traefik) | 10.70.0.11 |
| zlh-monitor | 10.60.0.25 |
| zlh-back (PBS) | 10.60.0.24 |