diff --git a/Session_Summaries/2026-03-10_Agent_Observability_Update.md b/Session_Summaries/2026-03-10_Agent_Observability_Update.md
new file mode 100644
index 0000000..58e42ed
--- /dev/null
+++ b/Session_Summaries/2026-03-10_Agent_Observability_Update.md
@@ -0,0 +1,57 @@
+# Agent Observability Update
+
+Date: 2026-03-10
+
+This session improved agent observability without altering crash semantics or restart behavior.
+
+---
+
+## Structured Logging
+
+Operational logs now use component tags:
+
+- `[process]`
+- `[http]`
+- `[files]`
+- `[console]`
+- `[state]`
+
+Each log entry includes the VMID when available.
+
+---
+
+## Crash Metadata
+
+The agent now records and exposes via `/status`:
+
+- `lastCrashTime`
+- `lastCrashExitCode`
+- `lastCrashSignal`
+- `lastCrashUptimeSeconds`
+- `lastCrashLogTail`
+
+---
+
+## Readiness Timeout
+
+A constant was introduced:
+
+```go
+const ReadinessTimeout = 60 * time.Second
+```
+
+This centralizes the readiness policy and allows tuning without searching for hardcoded values.
+
+---
+
+## Crash Semantics
+
+Crash classification unchanged:
+
+```
+unexpected exit → crashed
+intentional stop → idle
+readiness failure → error
+```
+
+This ensures crash metrics remain accurate and are not inflated by intentional operations.
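
Review note: the Crash Semantics mapping in this summary could be sketched roughly as follows. This is only an illustration, not the agent's actual code. The names `State`, `ExitEvent`, `Classify`, and `CountCrashes` are hypothetical and do not appear in the summary. The rule that an intentional stop takes precedence over a readiness failure is also an assumption.

```go
package main

import "fmt"

// State is a hypothetical type for the three outcomes named in the summary.
type State string

const (
	StateCrashed State = "crashed"
	StateIdle    State = "idle"
	StateError   State = "error"
)

// ExitEvent is a hypothetical record of why the managed process stopped.
type ExitEvent struct {
	Intentional     bool // the stop was requested on purpose
	ReadinessFailed bool // the process did not become ready in time
}

// Classify maps an exit event to a state:
//
//	intentional stop  -> idle
//	readiness failure -> error
//	anything else     -> crashed (unexpected exit)
//
// Assumption: when both flags are set, the intentional stop wins.
func Classify(ev ExitEvent) State {
	switch {
	case ev.Intentional:
		return StateIdle
	case ev.ReadinessFailed:
		return StateError
	default:
		return StateCrashed
	}
}

// CountCrashes counts only events classified as crashed, so intentional
// stops and readiness failures do not inflate the crash metric.
func CountCrashes(events []ExitEvent) int {
	n := 0
	for _, ev := range events {
		if Classify(ev) == StateCrashed {
			n++
		}
	}
	return n
}

func main() {
	events := []ExitEvent{
		{},                      // unexpected exit
		{Intentional: true},     // operator stop
		{ReadinessFailed: true}, // never became ready
	}
	for _, ev := range events {
		fmt.Printf("%+v -> %s\n", ev, Classify(ev))
	}
	fmt.Println("crashes counted:", CountCrashes(events))
}
```

Keeping the classification in one pure function would make the "crash metrics are not inflated" property easy to unit test, independent of how the process supervisor detects exits.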