diff --git a/SCRATCH/debug-provisioning-consistency.md b/SCRATCH/debug-provisioning-consistency.md
new file mode 100644
index 0000000..98e6b72
--- /dev/null
+++ b/SCRATCH/debug-provisioning-consistency.md
@@ -0,0 +1,121 @@
+# ZeroLagHub – Debug Focus: Provisioning + Database Consistency
+
+## Context
+
+Recent changes implemented deterministic Fabric provisioning (mods + config pre-installed before first boot). This succeeded once but has not been reliable across subsequent server creations.
+
+There are ongoing issues with:
+
+- Duplicate server creation during a single request
+- Inconsistent or failed server termination
+- Potential mismatch between system state and database state
+
+---
+
+## Suspected Root Causes
+
+### 1. Database State Drift
+
+There may be inconsistencies between:
+
+- Database (`ContainerInstance`, status fields, VMID allocation)
+- Actual Proxmox state (container exists / does not exist)
+- Agent state (running, crashed, not initialized)
+
+Potential issues:
+
+- Multiple job executions writing to the DB
+- Missing transactional guarantees during create/delete
+- Lack of idempotency in provisioning flow
+- VMID or port allocation not properly locked
+
+---
+
+### 2. Job / Queue Behavior (BullMQ)
+
+Provisioning may be executing more than once due to:
+
+- Duplicate job enqueueing
+- Retry logic triggering unintended second runs
+- Missing job deduplication or idempotency keys
+
+---
+
+### 3. DNS-Based Routing Side Effects
+
+Current architecture relies on wildcard DNS:
+
+```
+dev-<vmid>.zerolaghub.dev → Traefik → API → container
+```
+
+This may be introducing timing-related issues:
+
+- Server considered "ready" before DNS propagation / routing works
+- API relying on hostname instead of internal state
+- Failure paths incorrectly handled when DNS resolution or routing is delayed
+
+This is especially problematic during:
+
+- First-time provisioning
+- Rapid create/delete cycles
+
+---
+
+## Areas to Review
+
+### Database Layer
+
+- `ContainerInstance` lifecycle: creation, status transitions, deletion
+- VMID allocation logic: ensure strict uniqueness, confirm locking or transactional safety
+- Port allocation: ensure no duplicate assignment
+- Verify no duplicate records created per request
+
+### Provisioning Flow
+
+- Ensure the create endpoint is idempotent and guarded against double execution
+- Confirm only one job is enqueued per request
+- Validate that job completion updates the DB exactly once
+
+### Termination Flow
+
+- Ensure delete removes the container in Proxmox, updates DB state correctly, and handles partial failures (e.g., container already gone)
+
+### Agent Interaction
+
+- Confirm the agent is not re-triggering provisioning or reporting stale state
+- Validate that `/status` output is consistent with the DB record
+
+### DNS / Routing (Secondary Concern)
+
+- Evaluate whether DNS-based routing is being used as a source of truth (it should not be)
+- Consider temporarily validating flows using direct IP instead of DNS
+
+---
+
+## Goal
+
+Restore deterministic lifecycle management:
+
+```
+Request → Single DB record → Single provision job → Single container → Clean termination
+```
+
+No duplicates, no drift, no reliance on DNS timing.
+
+---
+
+## Key Principle
+
+**The database must be the source of truth, not DNS or runtime reachability.**
+
+---
+
+## Next Step
+
+Full audit of:
+
+- DB writes during create/delete
+- Job queue behavior
+- VMID + resource allocation
+- State reconciliation between DB and Proxmox