Mod Deployment Safety Model
- Version: 2.0
- Updated: 2026-03-01
- Phase: Phase 1 (Active)
- Applies To: Game Containers
- Owner: Agent Layer
Goal
Allow user uploads and automated mod installs while:
- Preventing path escape
- Preserving runtime integrity
- Tracking provenance
- Avoiding hidden deployment layers
Deployment Types
| Type | Path | Extension |
|---|---|---|
| Mod (Modrinth) | mods/<file>.jar | .jar |
| Mod (user upload) | mods/<file>.jar | .jar |
| Datapack | world/datapacks/<file>.zip | .zip |
Direct Runtime Writes
Uploads and automated installs are written directly into runtime directories.
No:
- Staging
- Symlinking
- Delayed deployment
Atomic Write Process
- Create temp file in target directory
- Stream multipart body (or downloaded artifact) into temp file
- Enforce size limit while streaming
- os.Rename() temp → final filename
- Update .zlh_metadata.json
If overwrite=false and file exists → 409
Size Limits
Configured in agent:
| Type | Limit |
|---|---|
| Mods | 250MB |
| Datapacks | 100MB |
Limits are enforced by the agent while the body streams, not at the API layer.
Overwrite Behavior
- overwrite=false (default) → reject if file exists → 409
- overwrite=true → replace file, update uploaded_at in metadata
Agent-Side Guarantees
The agent guarantees:
- Final path is a file (not directory)
- Target is not a symlink
- Path resolves within runtime root
- Parent directory exists
- .zlh_metadata.json is hidden from file listing API
- .zlh-shadow is hidden from file listing API
Provenance Metadata
All user uploads are marked source: "user" in .zlh_metadata.json.
{
  "mods/sodium.jar": {
    "source": "user",
    "uploaded_at": "2026-03-01T22:37:01Z"
  }
}
Modrinth-installed mods do not currently write provenance. Future curated installs may optionally write source: "curated"; this is not currently implemented.
No automatic inference of source from filename or path.
Failure Categories
| Code | Meaning |
|---|---|
| 409 | File exists (overwrite=false) |
| 413 | Size limit exceeded |
| 403 | Path rejected by allowlist |
| 502 | API transport failure (not agent validation) |
API Transport Considerations
The API acts strictly as a streaming proxy for uploads.
req.pipe(proxyReq)
proxyRes.pipe(res)
- Uses raw http.request piping, not fetch
- Does not buffer file contents
- Does not re-validate upload policy
- Upload timeout must be substantially larger than normal routes
Mod Lifecycle (Current Phase 1 State)
Install (Modrinth):
Modrinth resolver → API → Agent → Verified download → <serverRoot>/mods
Install (User Upload):
Portal → API (streaming proxy) → Agent → Atomic write → <serverRoot>/mods
Enable/Disable:
Filesystem rename: .jar ↔ .jar.disabled
Delete:
Soft delete to <serverRoot>/mods-removed (no auto-purge, no retention policy)
Filesystem is canonical state. Agent cache invalidated after every mutation.
Restore of deleted mods handled manually via file browser (see OPEN_THREADS.md).
Self-Healing Model (Automated Installs)
Applies to Modrinth installs and dev artifact promotions. Does not apply to user uploads — users assume responsibility for files they upload directly.
Deployment States
IDLE → DEPLOYING → STABILIZING → STABLE → IDLE
↓
ROLLBACK_FILE → ROLLBACK_SNAPSHOT → FAILED_RECOVERY
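
The state diagram can be expressed as a transition table, sketched here in Go. The type and function names are illustrative. The sketch assumes the failure branch leaves from STABILIZING and that both rollback states can return to IDLE on success; the flattened diagram does not make the branch point explicit.

```go
package main

import "fmt"

// State is one deployment state from the diagram above.
type State string

const (
	Idle             State = "IDLE"
	Deploying        State = "DEPLOYING"
	Stabilizing      State = "STABILIZING"
	Stable           State = "STABLE"
	RollbackFile     State = "ROLLBACK_FILE"
	RollbackSnapshot State = "ROLLBACK_SNAPSHOT"
	FailedRecovery   State = "FAILED_RECOVERY"
)

// allowed lists the legal next states. FAILED_RECOVERY has no entry because
// automation stops there until an operator intervenes.
var allowed = map[State][]State{
	Idle:             {Deploying},
	Deploying:        {Stabilizing},
	Stabilizing:      {Stable, RollbackFile, RollbackSnapshot}, // assumed branch point
	Stable:           {Idle},
	RollbackFile:     {Idle, RollbackSnapshot},
	RollbackSnapshot: {Idle, FailedRecovery},
}

// transition returns the new state, or the unchanged state and an error if
// the move is not in the table.
func transition(from, to State) (State, error) {
	for _, next := range allowed[from] {
		if next == to {
			return to, nil
		}
	}
	return from, fmt.Errorf("illegal transition %s -> %s", from, to)
}
```

Keeping the table explicit makes illegal moves, such as leaving FAILED_RECOVERY automatically, fail loudly instead of silently corrupting the deployment state.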
Snapshot Scope
Included:
- mods/
- config/
- server.properties
Excluded:
- world/ (separate backup system)
- logs/
- Cache and temp files
Stabilization Window
Hardcoded in Phase 1: 3–5 minutes
Server is stable when:
- Readiness probe succeeds
- No crash restart during the window
- Server runs continuously for full window duration
Failure Triggers
| Trigger | Action |
|---|---|
| Early boot crash (exit within ~30s) | → ROLLBACK_FILE |
| Crash loop (≥3 crashes in window) | → ROLLBACK_SNAPSHOT |
| Readiness timeout (full window) | → ROLLBACK_SNAPSHOT |
Recovery
File rollback: Restore .jar.shadow → .jar, monitor again. If stable → IDLE. If not → escalate.
Snapshot restore: Extract .zlh_snapshots/deploy-<timestamp>.tar.gz, monitor. If stable → IDLE. If not → FAILED_RECOVERY.
Safety Constraints
- Only one file rollback attempt
- Only one snapshot restore attempt
- Both fail → FAILED_RECOVERY, stop automation, surface to API + UI
- World data never touched
- Only one active snapshot at a time
Metadata Tracking (Automated Installs)
Agent-level fields, cleared after stabilization success:
lastChangedMod
lastChangeTimestamp
lastChangeSource # modrinth | dev
deploymentState
crashCount
snapshotId
Why No Symlinks
Symlink-based deployment was rejected because it:
- Complicates mod loader behavior
- Breaks server-side loader compatibility
- Introduces unexpected runtime indirection
- Makes provenance ambiguous
Direct writes + metadata is simpler and safer.
Logging Requirements
Structured logs must emit:
- Deployment started
- Snapshot created
- Shadow created
- Stabilization started
- Crash detected
- File rollback triggered
- Snapshot restore triggered
- Deployment stabilized
- Recovery failed
- User upload received
- User upload rejected (with reason)
All events visible via agent logs and surfaced through API status endpoints.
Operational Guarantees
- A broken automated mod install will not permanently brick the server
- Most automated failures auto-recover without operator intervention
- User-uploaded files bypass self-healing (user responsibility)
- Deployment state will not accumulate disk artifacts
- The system remains deterministic and bounded
Non-Goals (Phase 1)
- No multi-mod version history
- No dependency resolution
- No world snapshotting
- No long-term snapshot retention
- No manual snapshot UI
- No automatic dependency analysis
- No virus/malware scanning of user uploads
- No retention policy for mods-removed/
Future Extensions (Phase 2+)
- Snapshot retention policies
- Manual snapshot restore via UI
- Multi-mod change grouping
- Deployment audit history
- Version diffing
- Provenance for Modrinth installs
- Retention automation for mods-removed/
Final Decision Lock
| Decision | Value |
|---|---|
| Upload model | Direct runtime write, atomic via os.Rename() |
| Staging | None |
| Symlinks | None |
| Shadow copies | Single, automated installs only |
| Snapshots | Single, temporary, automated installs only |
| Escalation model | File rollback → snapshot restore → FAILED_RECOVERY |
| World data | Excluded, never touched |
| Stabilization window | Fixed, hardcoded Phase 1 |
| Provenance | .zlh_metadata.json at runtime root |
| User upload self-healing | Not applicable — user responsibility |