Add stabilization session findings and Fabric readiness timing root cause
This commit is contained in:
parent 6d16e3e00e
commit f55dfac860

SCRATCH/session-stabilization-fabric-findings.md (new file, 72 lines)

@@ -0,0 +1,72 @@
# ZeroLagHub – System Stabilization & Fabric Proxy Findings
## Current State
Following API and frontend revisions, the system is now stable and deterministic:
- Backend lifecycle handling hardened — idempotent create/delete/start/stop
- Redis and database verified healthy
- Duplicate server creation traced to frontend (not API)
- Console routing corrected (browser → API → agent)
- Internal domain routing removed in favor of explicit IP-based communication
---
## Key Architectural Improvements
- Control plane now uses direct IP-based service communication
- Data plane remains domain-based (Traefik)
- Velocity rehydration now uses DB + Redis instead of Proxmox live state
- API now acts as the central authority for routing, console, and orchestration
---
## Fabric / Velocity Issue — Root Cause
The remaining issue with Fabric servers is NOT related to:
- Fabric API version
- FabricProxy-Lite version
- Mod installation

The issue is caused by **timing of server readiness vs Velocity registration**:
- Server is registered with Velocity before it is fully ready to accept proxy traffic
- This results in "proxy starting" errors until Velocity is restarted
- After restart, the server is already fully initialized and works correctly

**Conclusion:** The system lacks a reliable readiness signal for Fabric-based servers.

---
## Required Fix
Agent (or API) must delay Velocity registration until true readiness is confirmed.
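A minimal sketch of that ordering, with `start_server`, `probe_ready`, and `register_with_velocity` as hypothetical stand-ins for the agent's actual calls (timeouts are illustrative):

```python
import time

def start_and_register(server_id, start_server, probe_ready,
                       register_with_velocity,
                       timeout_s=120.0, interval_s=1.0):
    """Start a server, then register it with Velocity only once it is ready."""
    start_server(server_id)
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if probe_ready(server_id):
            # Readiness confirmed; only now expose the server via the proxy.
            register_with_velocity(server_id)
            return True
        time.sleep(interval_s)  # not ready yet; poll again shortly
    # Never became ready: leave it unregistered rather than route traffic to it.
    return False
```

If the probe never succeeds, the server simply stays out of Velocity's routing table, which avoids the "proxy starting" errors entirely.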
### Recommended approach
1. Pre-seed Fabric API and FabricProxy config before first run ✅ (already done)
2. Do not rely solely on "Done" log output
3. Introduce readiness gating — one of:
   - Short delay buffer after startup
   - TCP port readiness check (port 25565 accepting connections)
   - Log-based readiness confirmation (watch for a specific "Done" log line)

The TCP port check is the most reliable — if the port is accepting connections, the server is ready. This is already what the agent's probe does; the issue is likely that Velocity registration happens before the probe confirms readiness.
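A sketch of such a TCP readiness check (assumed names and timeouts, not the agent's actual probe):

```python
import socket
import time

def wait_for_tcp_ready(host, port=25565, timeout_s=120.0, interval_s=1.0):
    """Poll until the server's port accepts TCP connections, or give up."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A completed TCP handshake means the listener is up.
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            time.sleep(interval_s)  # connection refused or timed out; retry
    return False
```

A completed handshake only proves the socket is listening; if a stricter signal turns out to be necessary, this check could be combined with the log-based confirmation listed above.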
### Fix location
Agent-level orchestration change — do not register with Velocity until the readiness probe returns success.

Avoid modifying Velocity itself — the issue is orchestration timing, not proxy configuration.

---
## Summary
| Issue | Status |
|-------|--------|
| Duplicate server creation | ✅ Fixed — was frontend, not API |
| DB/Redis state drift | ✅ Resolved — verified healthy |
| Console routing | ✅ Fixed |
| Internal DNS timing | ✅ Resolved — replaced with IP env vars |
| Fabric readiness timing | 🔧 Pending — readiness gating needed in agent |