diff --git a/SCRATCH/session-update-migration-stability.md b/SCRATCH/session-update-migration-stability.md
new file mode 100644
index 0000000..afe19d8
--- /dev/null
+++ b/SCRATCH/session-update-migration-stability.md
@@ -0,0 +1,92 @@
# ZeroLagHub – Session Update: Migration, Stability, and Next Steps

## Context

The system was migrated from Denver → Detroit (better CPU, same cost). Following the move, several non-deterministic issues appeared, including duplicate server creation and inconsistent lifecycle behavior. No API code changes were made during this period.

---

## Key Findings

The issues are not code-related; they stem from state inconsistency during and after the migration.

Likely causes:
- Redis queue replay or stale jobs
- Multiple workers briefly running in parallel (Denver + Detroit overlap)
- Partial or unsynchronized DB/Redis migration
- Internal domain routing introducing unintended request paths

---

## Architecture Adjustment

Going forward, all internal service communication will use IP-based environment variables:

```
ARTIFACT_SERVER=http://10.x.x.x:port
```

Domain-based routing is reserved strictly for external/user-facing traffic (Traefik).

This enforces a clean separation between:
- **Control plane**: API ↔ Agent ↔ services (IP env vars)
- **Data plane**: user/browser ↔ domain ↔ proxy (Traefik)

See `SCRATCH/service-discovery.md` for the full spec.

---

## Fabric / Vanilla Progress

The Fabric server now starts successfully with:
- Fabric API present in `/mods`
- `FabricProxy-Lite.toml` configured (including the forwarding secret)

Remaining issue:
- Initial startup requires a Velocity restart before the proxy recognizes the backend

Conclusions:
- The agent must pre-seed mods and config before the first run
- The Velocity/plugin behavior is likely tied to readiness timing (the server is not fully "ready" at registration)

---

## Next Session Plan (Priority Order)

### 1. Database Validation
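This validation step can be sketched in Python over rows fetched from the database. A sketch only: the record shape with `vmid` and `state` fields is an assumption about the actual schema, not the current codebase.

```python
# Sketch only: checks for duplicate VMIDs and stuck lifecycle states.
# The record shape ({"vmid": ..., "state": ...}) is an assumed schema.
from collections import Counter

STUCK_STATES = {"creating"}  # lifecycle states treated as "stuck"

def find_inconsistencies(rows):
    """Return (duplicate_vmids, stuck_rows) for a list of server records."""
    counts = Counter(r["vmid"] for r in rows)
    duplicates = sorted(v for v, n in counts.items() if n > 1)
    stuck = [r for r in rows if r["state"] in STUCK_STATES]
    return duplicates, stuck

# Example: a replayed job left a second record for VMID 102.
rows = [
    {"vmid": 101, "state": "running"},
    {"vmid": 102, "state": "creating"},
    {"vmid": 102, "state": "running"},
]
dups, stuck = find_inconsistencies(rows)
print(dups)   # [102]
print(stuck)  # [{'vmid': 102, 'state': 'creating'}]
```

An empty result for both lists would confirm the DB is clean and the PBS restore contingency is unnecessary.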
- Check for duplicate VMIDs or container records
- Verify lifecycle states are consistent (no records stuck in "creating")

### 2. Redis Reset
- Do NOT restore from backup
- Start clean: `FLUSHALL`
- Ensure no duplicate or replayed jobs

### 3. Worker Verification
- Confirm only one API/worker instance is running (Detroit only)
- Confirm the Denver host is fully decommissioned (it is; OS wiped Apr 2, 2026)

### 4. PBS Backup Contingency
- If the DB is inconsistent → restore MariaDB from the Denver snapshot
- Redis stays fresh (never restored)

### 5. Provisioning Validation
- Run a single controlled server creation
- Confirm exactly:
  - 1 DB record
  - 1 job
  - 1 agent execution

---

## Key Takeaway

This session confirmed the system is operating as a distributed, state-driven platform:

| Layer | Role |
|-------|------|
| Redis | Execution layer: ephemeral, never restore |
| Database | Source of truth: authoritative state |
| Routing | Strict boundaries: control plane vs data plane |

Stability depends on maintaining a single authoritative state and a clean execution flow across all components.
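The control-plane rule above (IP env vars for internal traffic, domains reserved for Traefik) can be enforced with a small startup guard. A minimal sketch, assuming a hypothetical `internal_url` helper; the variable value shown is an example, not a real address:

```python
# Sketch only: a guard that refuses domain-based values for internal
# (control-plane) service URLs. `internal_url` and the example address
# are illustrative, not part of the current codebase.
import ipaddress
import os
from urllib.parse import urlparse

def internal_url(var: str) -> str:
    """Read an internal service URL from an env var; reject hostnames."""
    url = os.environ[var]  # raises KeyError: fail fast if the var is unset
    ipaddress.ip_address(urlparse(url).hostname)  # ValueError for domains
    return url

os.environ["ARTIFACT_SERVER"] = "http://10.0.0.5:8080"  # example value
print(internal_url("ARTIFACT_SERVER"))  # http://10.0.0.5:8080
```

Running such a check at boot turns a misconfigured env var into an immediate startup failure instead of an unintended request path through domain routing.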