Update monitoring launch posture
parent 5bd6ddd8d7
commit 2b3a552358
@@ -6,10 +6,10 @@ Monitoring implementation lives across infrastructure, API, Agent, Prometheus, G
 
 ## Files
 
-- `CURRENT_STATE.md` — observed monitoring state and launch-readiness posture
-- `CONTRACT.md` — intended monitoring/discovery/label contract
-- `OPEN_ITEMS.md` — active monitoring blockers and follow-up work
-- `VALIDATION.md` — smoke-test and acceptance checklist
+- `CURRENT_STATE.md` - observed monitoring state and launch-readiness posture
+- `CONTRACT.md` - intended monitoring/discovery/label contract
+- `OPEN_ITEMS.md` - active monitoring blockers and follow-up work
+- `VALIDATION.md` - smoke-test and acceptance checklist
 
 ## Ownership boundary
 
@@ -22,4 +22,17 @@ Monitoring implementation lives across infrastructure, API, Agent, Prometheus, G
 
 ## Current launch posture
 
-As of the latest monitoring audit, monitoring is **not launch-ready**. Core services are running, but public exposure, stale/failing discovery, missing dashboards, and missing centralized logs block launch-debug readiness.
+Core lifecycle monitoring is launch-ready for game/dev add-remove visibility and basic metrics debugging.
+
+The operational source of truth is `/etc/zlh-monitor`. Prometheus, Grafana provisioning, dashboard JSON, file_sd discovery output, and monitoring-host Alloy config should be treated from that layout rather than legacy default paths.
+
+Validated launch behavior includes:
+
+- API discovery -> monitor sync -> Prometheus file_sd for game/dev lifecycle inventory
+- new game/dev containers appear as `game-dev-alloy` targets
+- deleted game/dev containers disappear and file_sd can intentionally become `[]`
+- container Alloy remote-writes metrics to Prometheus at `10.60.0.25:9090`
+- container Alloy direct scrape health works on `0.0.0.0:12345`
+- Grafana dashboards are provisioned from `/etc/zlh-monitor/grafana/provisioning`
+
+Remaining future work is tracked in `OPEN_ITEMS.md`, especially centralized logs/Loki and optional OPNsense router-only monitoring. Those are known follow-ups, not blockers to the current core lifecycle monitoring path unless launch policy changes.
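
Review note: the added lines state that the file_sd discovery output may intentionally be `[]` once all game/dev containers are deleted. A reviewer who wants to spot-check that claim could use something like the following minimal sketch. It assumes the discovery output follows the standard Prometheus file_sd JSON shape (a list of groups, each with a `targets` list and optional `labels`). The function names `parse_file_sd` and `load_file_sd_targets` are hypothetical and are not part of this repository, and the path of the actual discovery file under `/etc/zlh-monitor` is not stated in the commit, so it is left to the caller.

```python
import json
from pathlib import Path


def parse_file_sd(text):
    """Validate file_sd JSON text and return a flat list of its targets.

    An empty top-level list ("[]") is treated as valid and returns [],
    matching the "no game/dev containers exist" case in the commit.
    """
    groups = json.loads(text)
    if not isinstance(groups, list):
        raise ValueError("file_sd content must be a JSON list of target groups")

    targets = []
    for index, group in enumerate(groups):
        if not isinstance(group, dict):
            raise ValueError(f"group {index} is not a JSON object")

        group_targets = group.get("targets")
        if not isinstance(group_targets, list) or not all(
            isinstance(target, str) for target in group_targets
        ):
            raise ValueError(f"group {index} needs a 'targets' list of strings")

        # Labels are optional; when present, keys and values must be strings.
        labels = group.get("labels", {})
        if not isinstance(labels, dict) or not all(
            isinstance(key, str) and isinstance(value, str)
            for key, value in labels.items()
        ):
            raise ValueError(f"group {index} has non-string labels")

        targets.extend(group_targets)
    return targets


def load_file_sd_targets(path):
    """Read a file_sd JSON file from disk and return its targets."""
    return parse_file_sd(Path(path).read_text(encoding="utf-8"))
```

Calling `load_file_sd_targets` on the discovery file before and after deleting a container should show the target list shrinking, and an empty list is a passing result rather than an error. The check only covers file shape; it does not confirm that Prometheus has reloaded the file or that the targets are healthy.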