# Session Log – zlh-grind

Append-only execution log for GPT-assisted development work.
Do not rewrite or reorder past entries.

---

## 2025-12-14

- Goal: Stabilize zlh-agent provisioning pipeline for game + dev containers + addons without regressing game provisioning.
- Scope: Agent routing (ctype=dev), local artifacts strategy, addon skeleton, initial end-to-end tests.
- Adopted single base template approach (zlh-agent template) where agent installs roles at runtime.

---

## 2025-12-20

- Goal: Restore reliable end-to-end provisioning for **game + dev** without regressions; preserve the explicit orchestration steps/logging.

### Current Architecture Reality (IMPORTANT)

- We are on **single base template**: `AGENT_TEMPLATE_VMID` is the only template in `.env`.
- In this workflow, **any zlh-agent code change requires rebuilding + pushing the binary into the template container**.
- That effectively means: **agent change = new template promotion**.

### Key Findings / Fixes

- Game provisioning steps are functioning again and show clear step-by-step orchestration output (allocate VMID → clone → configure → start → IP → agent config → wait → DB save → edge publish).
- Dev provisioning failure was traced to a **wire contract mismatch** between API JSON and agent `state.Config` struct tags:
  - Agent expects: `container_type` (snake_case) on the JSON wire
  - API refactors were sending: `ctype` / `containerType` (camelCase) in some variants
  - Result: `cfg.ContainerType == ""` on decode → agent routes into game path → errors like:
    - `unsupported container identity (containerType="" game="")`

### Decision (to minimize churn)

- Because agent changes force a template promotion in this pipeline, the correct short-term move is:
  - **Update API to emit `container_type`** to match the existing agent contract
  - Avoid touching agent code until we intentionally rev the template

### Operational Guardrails

- Do NOT remove or "simplify" the explicit provisioning steps/logging; they are required for debugging and operator confidence.
- Treat `container_type` as the canonical wire key until the next planned template rev.

---

## 2025-12-21

- Goal: Restore reliable end-to-end provisioning for devcontainers and agent-managed installs.
- Observed repeated failures during devcontainer runtime installation (node, python, go, java).
- Initial assumption was installer regression; investigation showed installers were enforcing contract correctly.
- Root cause identified: agent was not exporting required runtime environment variables (notably `RUNTIME_VERSION`).

### Devcontainer provisioning investigation

- Deep dive on zlh-agent devcontainer provisioning flow.
- Confirmed that all devcontainer installers intentionally require `RUNTIME_VERSION` and fail fast if missing.
- Clarified that payload JSON is not read by installers; agent must project intent via environment variables.
- Verified that installer logic itself (artifact naming, extraction, symlink layout) was correct.

### Embedded installer execution findings

- Agent executes installers as **embedded scripts**, not filesystem paths.
- Identified critical requirement: shared installer logic (`common.sh`) and runtime installer must execute in the **same shell session**.
- Failure mode observed: `install_runtime: command not found` → caused by running runtime installer without `common.sh` loaded.
- Confirmed this explains missing runtime directories and lack of artifact downloads.

### Installer architecture changes

- Refactored installer model to:
  - `common.sh`: shared, strict, embedded-safe installation logic
  - per-runtime installers (`node`, `python`, `go`, `java`) as declarative descriptors only
- Established that runtime installers are intentionally minimal and declarative by design.
- Confirmed that this preserves existing runtime layout: `/opt/zlh/runtime///current`

### Artifact layout update

- Artifact naming and layout standardized to simplified form:
  - `node-24.tar.xz`
  - `python-3.12.tar.xz`
  - `go-1.22.tar.gz`
  - `jdk-21.tar.gz`
- Identified mismatch between runtime name and archive prefix (notably Java).
- Introduced `ARCHIVE_PREFIX` as a runtime-level variable to resolve naming cleanly.

### Final conclusions

- No regression in installer logic; failures were execution-order and environment-projection issues.
- Correct fix is agent-side:
  - concatenate `common.sh` + runtime installer into one bash invocation
  - inject `RUNTIME_VERSION` (and related vars) into environment
- Architecture now supports deterministic, artifact-driven, embedded-safe installs.

Status: **Root cause resolved; implementation pending agent patch & installer updates.**

---
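
### Sketch: agent-side fix (illustrative only)

A minimal Go sketch of the agent-side fix from the 2025-12-21 final conclusions: concatenate `common.sh` and the runtime installer into one bash invocation and inject `RUNTIME_VERSION` into the environment. This is not the actual zlh-agent code. The function names (`buildInstallScript`, `installerEnv`, `runInstaller`) and the placeholder script bodies are assumptions made up for illustration; only `common.sh`, `install_runtime`, `RUNTIME_VERSION`, and `ARCHIVE_PREFIX` come from the entries above.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// buildInstallScript joins the shared helpers and the runtime descriptor into
// a single script, so both are evaluated by the same bash process.
// (Hypothetical helper, not taken from the agent.)
func buildInstallScript(commonSh, runtimeSh string) string {
	return strings.TrimRight(commonSh, "\n") + "\n" +
		strings.TrimRight(runtimeSh, "\n") + "\n"
}

// installerEnv copies base and appends RUNTIME_VERSION, which the installers
// require and fail fast without. (Hypothetical helper.)
func installerEnv(base []string, version string) []string {
	env := append([]string{}, base...)
	return append(env, "RUNTIME_VERSION="+version)
}

// runInstaller feeds the combined script to one `bash -s` process on stdin,
// so no installer file has to exist on the filesystem.
func runInstaller(commonSh, runtimeSh string, env []string) ([]byte, error) {
	cmd := exec.Command("bash", "-s")
	cmd.Stdin = strings.NewReader(buildInstallScript(commonSh, runtimeSh))
	cmd.Env = env
	return cmd.CombinedOutput()
}

func main() {
	// Placeholder bodies: the real common.sh and descriptors are not
	// reproduced in this log, so these only mimic the described contract.
	commonSh := `set -euo pipefail
: "${RUNTIME_VERSION:?RUNTIME_VERSION is required}"
install_runtime() { echo "would fetch ${ARCHIVE_PREFIX}-${RUNTIME_VERSION}"; }`

	// Declarative descriptor: sets a runtime-level variable, then calls the
	// shared function defined by common.sh.
	javaSh := `ARCHIVE_PREFIX=jdk
install_runtime`

	out, err := runInstaller(commonSh, javaSh, installerEnv(os.Environ(), "21"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "installer failed:", err)
	}
	fmt.Print(string(out))
}
```

Running the descriptor on its own, without the `common.sh` text in front of it, is the case that would reproduce the `install_runtime: command not found` failure noted above. Dropping `RUNTIME_VERSION` from the environment makes the placeholder `common.sh` exit early, mirroring the fail-fast behaviour described for the real installers.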