Clarify centralized Loki log model

jester 2026-05-01 20:23:23 +00:00
parent 9020e35107
commit a78c663f7b


@@ -148,7 +148,8 @@ Agent/container-side monitoring responsibilities:
- keep local Alloy HTTP bound to loopback unless direct scrape health is intentionally adopted
- write or refresh Alloy config after `POST /config`
- keep labels minimal and consistent with dashboards/discovery
-- eventually emit structured logs suitable for Loki/Alloy log ingestion
+- eventually ship selected container logs through Alloy to centralized Loki
- do not run Loki itself inside each game/dev container
## Monitoring host responsibilities
@@ -162,6 +163,7 @@ The monitoring host must:
- alert on missing/stale remote-write metrics by VMID/type
- alert on API discovery endpoint failure
- store bearer tokens with tight permissions or systemd credentials
- host or reach the centralized Loki service once log collection is enabled
## Grafana responsibilities
@@ -173,11 +175,33 @@ Grafana must have:
- dev container dashboard installed
- dashboard queries aligned with the canonical label contract
- dashboards based on remote-write metric freshness for game/dev container health unless direct scrape health is intentionally enabled
- Loki datasource installed once centralized logs are enabled
Dashboard JSON existing on disk is not sufficient; dashboards must be visible in Grafana.
## Logs contract
Loki should be centralized, not installed per game/dev container.
Preferred launch direction:
```text
container/app logs
-> container Alloy log collection
-> centralized Loki
-> Grafana Explore / dashboard links
```
Container Alloy should tail only selected, useful logs and attach stable labels such as:
- `vmid`
- `instance`
- `container_type`
- `server_id` where safe/useful
- `log_source`, for example `agent`, `minecraft`, `codeserver`, `provisioning`
Do not turn every container into a Loki server. Loki belongs on the monitoring/logging side as shared infrastructure. Containers should run Alloy as the lightweight collector/shipper.
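
To make the label contract above concrete, here is a minimal Python sketch of the stream shape that Loki's HTTP push endpoint (`/loki/api/v1/push`) accepts. In the intended setup Alloy builds and sends these requests, so nothing here is meant to run in a container. The helper name `build_push_payload` and the rule of rejecting unlisted labels are assumptions for illustration, not part of the repository.

```python
import time

# Label keys copied from the list above. Rejecting anything else is an
# assumption made for this sketch, not a documented rule.
ALLOWED_LABELS = {"vmid", "instance", "container_type", "server_id", "log_source"}


def build_push_payload(labels, lines, now_ns=None):
    """Build a JSON-serializable body in the shape of a Loki push request."""
    unknown = set(labels) - ALLOWED_LABELS
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    # Loki expects each entry as [timestamp in nanoseconds as a string, log line].
    ts = str(now_ns if now_ns is not None else time.time_ns())
    return {
        "streams": [
            {
                "stream": {key: str(value) for key, value in labels.items()},
                "values": [[ts, line] for line in lines],
            }
        ]
    }
```

Keeping the label set small and fixed matters because every distinct label combination becomes its own stream in Loki; high-cardinality values belong in the log line, not in labels.
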
Launch-ready debugging requires centralized logs for at least:
- API application logs
@@ -189,7 +213,17 @@ Launch-ready debugging requires centralized logs for at least:
- DNS/edge publish and cleanup actions
- monitoring discovery sync
-Preferred direction: Loki or Alloy log ingestion. Until this exists, launch-debug readiness remains incomplete.
+Initial container log candidates:
- Agent service logs
- Minecraft/server stdout logs where useful
- code-server logs for dev containers
- provisioning/config application logs
- backup/restore operation logs
Avoid collecting noisy or sensitive logs by default. Review anything that may include customer secrets, tokens, console input, file contents, or environment variables before shipping to Loki.
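
As a rough illustration of the review step above, the following Python sketch masks obvious credentials in a log line before it would be shipped. The patterns and the `redact` helper are hypothetical; real token formats need their own patterns, and pattern-based masking is a backstop, not a substitute for excluding sensitive log sources.

```python
import re

# Hypothetical patterns for illustration; tune these to the token formats
# actually in use.
_PATTERNS = [
    # "Authorization: Bearer <value>" headers
    re.compile(r"(?i)(authorization:\s*bearer\s+)\S+"),
    # key=value or key: value pairs whose key ends in a sensitive word
    re.compile(r"(?i)((?:token|secret|password|api_key)\s*[=:]\s*)\S+"),
]


def redact(line):
    """Replace the value part of each matched credential with a marker."""
    for pattern in _PATTERNS:
        line = pattern.sub(r"\1[REDACTED]", line)
    return line
```

For example, `redact("login ok password=hunter2 user=bob")` keeps the key and the rest of the line but masks `hunter2`. Console input and file contents cannot be cleaned reliably this way, which is why those sources are better left out by default.
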
Until centralized Loki/Alloy log shipping exists, launch-debug readiness remains incomplete.
## Security contract
@@ -201,3 +235,4 @@ Required before launch:
- API monitoring endpoints fail closed without token
- no monitoring bearer token in world-readable files
- no token leakage in labels, target URLs, dashboards, or logs
- Loki, once added, is not publicly reachable and accepts writes only from intended Alloy collectors or trusted internal paths
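
One part of the Loki requirement above can be checked mechanically: whether the address Loki listens on is internal. The Python sketch below uses only the standard `ipaddress` module; the helper name `is_internal_listen_address` is an assumption, and it does not cover the second half of the requirement (restricting writes to intended Alloy collectors), which needs firewalling or authentication.

```python
import ipaddress


def is_internal_listen_address(host):
    """Return True only for loopback or private IP literals."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Hostnames are not resolved in this sketch; treat them as unverified.
        return False
    # 0.0.0.0 and :: bind every interface, so they never count as internal.
    if addr.is_unspecified:
        return False
    return addr.is_loopback or addr.is_private
```

A pre-launch check could run this against the configured listen address and fail when it returns `False`.
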