Add agent observability session summary 2026-03-10
parent fa2aa7ae7d
commit 9f01809f1a
57 Session_Summaries/2026-03-10_Agent_Observability_Update.md (Normal file)
@@ -0,0 +1,57 @@
# Agent Observability Update
Date: 2026-03-10
This session improved agent observability without altering crash semantics or restart behavior.
---
## Structured Logging
Operational logs now use component tags:
- `[process]`
- `[http]`
- `[files]`
- `[console]`
- `[state]`
Each log entry includes the VMID when available.
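
As a minimal sketch of the tagging scheme above, the helper below builds a log line from a component tag, an optional VMID, and a message. The `formatLine` name, the `vmid=` field layout, and the convention that `0` means "no VMID known" are assumptions for illustration; the summary does not specify the agent's actual line format.

```go
package main

import "fmt"

// Component tags listed in the summary.
const (
	TagProcess = "process"
	TagHTTP    = "http"
	TagFiles   = "files"
	TagConsole = "console"
	TagState   = "state"
)

// formatLine is a hypothetical helper. It prefixes the message with the
// component tag and adds the VMID only when one is known (vmid > 0).
func formatLine(tag string, vmid int, msg string) string {
	if vmid > 0 {
		return fmt.Sprintf("[%s] vmid=%d %s", tag, vmid, msg)
	}
	return fmt.Sprintf("[%s] %s", tag, msg)
}

func main() {
	fmt.Println(formatLine(TagProcess, 101, "child started"))
	fmt.Println(formatLine(TagHTTP, 0, "listening"))
}
```

Keeping the tag first makes it straightforward to filter one component's output with a plain text search.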
---
## Crash Metadata
The agent now records the following crash metadata and exposes it via `/status`:
- `lastCrashTime`
- `lastCrashExitCode`
- `lastCrashSignal`
- `lastCrashUptimeSeconds`
- `lastCrashLogTail`
---
## Readiness Timeout
A single constant now defines the readiness timeout:
```go
const ReadinessTimeout = 60 * time.Second
```
This centralizes the readiness policy and allows tuning without searching for hardcoded values.
---
## Crash Semantics
Crash classification is unchanged:
```
unexpected exit → crashed
intentional stop → idle
readiness failure → error
```
This ensures crash metrics remain accurate and are not inflated by intentional operations.
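
The mapping above can be sketched as a small classification function. The `ExitCause` and `State` types, the `classify` and `countCrashes` names, and the counting helper are illustrative assumptions; only the three cause-to-state pairs come from the summary.

```go
package main

import "fmt"

// State is the agent state reported after the process exits.
type State string

const (
	StateCrashed State = "crashed"
	StateIdle    State = "idle"
	StateError   State = "error"
)

// ExitCause is a hypothetical enumeration of why the process stopped.
type ExitCause int

const (
	UnexpectedExit ExitCause = iota
	IntentionalStop
	ReadinessFailure
)

// classify applies the mapping from the summary.
func classify(c ExitCause) State {
	switch c {
	case IntentionalStop:
		return StateIdle
	case ReadinessFailure:
		return StateError
	default:
		return StateCrashed
	}
}

// countCrashes counts only exits classified as crashed, so intentional
// stops and readiness failures do not inflate the metric.
func countCrashes(causes []ExitCause) int {
	n := 0
	for _, c := range causes {
		if classify(c) == StateCrashed {
			n++
		}
	}
	return n
}

func main() {
	history := []ExitCause{IntentionalStop, UnexpectedExit, ReadinessFailure}
	for _, c := range history {
		fmt.Println(classify(c))
	}
	fmt.Println("crashes:", countCrashes(history))
}
```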