diff --git a/ZeroLagHub_Cross_Project_Tracker.md b/ZeroLagHub_Cross_Project_Tracker.md
index 729c765..e32cb78 100644
--- a/ZeroLagHub_Cross_Project_Tracker.md
+++ b/ZeroLagHub_Cross_Project_Tracker.md
@@ -1,7 +1,7 @@
 # 🛡️ ZeroLagHub Cross-Project Tracker & Drift Prevention System
 
-**Last Updated**: December 7, 2025
-**Version**: 1.0 (Canonical Architecture Enforcement)
+**Last Updated**: January 18, 2026
+**Version**: 1.1 (Canonical Architecture Enforcement + Git Status Update)
 **Status**: ACTIVE - Must Be Consulted Before All Code Changes
 
 ---
@@ -14,59 +14,105 @@ This document establishes **architectural boundaries** across the three major Ze
 
 ---
 
+## 🚨 CRITICAL STATUS UPDATE (January 18, 2026)
+
+### **Git Repository Status**
+
+**zlh-api Repository: 🔴 EMPTY**
+- **Created**: December 28, 2025
+- **Status**: No code pushed to git
+- **Impact**:
+  - Cannot verify December 20 DNS fix application
+  - No version control for API codebase
+  - No change tracking or collaboration possible
+- **Action Required**: IMMEDIATE - Push API codebase to git
+
+**zlh-grind Repository: ✅ CURRENT**
+- **Last Updated**: January 18, 2026
+- **Status**: Architectural guardrails established
+- **Recent Changes**:
+  - PORTAL_MIGRATION.md - Architectural boundaries added
+  - CONSTRAINTS.md - Network architecture rules added
+  - ANTI_DRIFT_GUARDRAIL.md - AI-specific guardrails added
+
+**knowledge-base Repository: ⚠️ OUTDATED**
+- **Last Updated**: December 7, 2025 (6+ weeks ago)
+- **Missing**: 4+ weeks of session summaries (Dec 20 - Jan 18)
+- **Status**: Being updated now (Jan 18, 2026)
+
+### **Outstanding Critical Issues**
+
+1. **DNS Fix Status Unknown** 🔴
+   - Fix identified: December 20, 2025
+   - Location: `provisionAgent.js` lines 46, 330-331, 402
+   - Status: Cannot verify if applied (API not in git)
+   - Impact: Servers may still be unreachable if not applied
+
+2. **API Codebase Not Version Controlled** 🔴
+   - Running in production but not in git
+   - Cannot track changes, review code, or collaborate
+   - Immediate risk to project continuity
+
+3. **Documentation Debt** 🟡
+   - 6 weeks without tracker updates
+   - 4 weeks of undocumented sessions
+   - Risk of losing institutional knowledge
+
+---
+
 ## 📊 The Three Systems (Canonical Ownership)
 
 ```
-┌─────────────────────────────────────────────────────────────┐
-│ NODE.JS API v2 │
-│ (Orchestration Engine) │
-│ │
+┌────────────────────────────────────────────────────────────┐
+│ NODE.JS API v2 │
+│ (Orchestration Engine) │
+│ │
 │ Owns: VMID allocation, port management, DNS publishing, │
 │ Velocity registration, job queue, database state │
-│ │
+│ │
 │ Speaks To: Proxmox, Cloudflare, Technitium, Velocity, │
 │ MariaDB, BullMQ, Go Agent (HTTP) │
-│ │
+│ │
 │ Never Touches: Container filesystem, game server files, │
-│ Java installation, artifact downloads │
-└─────────────────────┬───────────────────────────────────────┘
- │
- │ HTTP Contract
- │ POST /config, /start, /stop
- │ GET /status, /health
- │
- ▼
-┌─────────────────────────────────────────────────────────────┐
-│ GO AGENT │
-│ (Container-Internal Manager) │
-│ │
-│ Owns: Server installation, Java runtime, artifact │
+│ Java installation, artifact downloads │
+└────────────────────────────┬───────────────────────────────┘
+ │
+ │ HTTP Contract
+ │ POST /config, /start, /stop
+ │ GET /status, /health
+ │
+ ▼
+┌────────────────────────────────────────────────────────────┐
+│ GO AGENT │
+│ (Container-Internal Manager) │
+│ │
+│ Owns: Server installation, Java runtime, artifact │
 │ downloads, process management, READY detection, │
-│ filesystem layout, verification + self-repair │
-│ │
-│ Speaks To: Local filesystem, game server process, │
+│ filesystem layout, verification + self-repair │
+│ │
+│ Speaks To: Local filesystem, game server process, │
 │ API (status updates via HTTP polling) │
-│ │
-│ Never Touches: Proxmox, DNS, Cloudflare, Velocity, │
+│ │
+│ Never Touches: Proxmox, DNS, Cloudflare, Velocity, │
 │ port allocation, VMID selection │
-└─────────────────────┬───────────────────────────────────────┘
- │
- │ Status Polling
- │ Agent reports state
- │
- ▼
-┌─────────────────────────────────────────────────────────────┐
-│ NEXT.JS FRONTEND │
-│ (Customer + Admin UI) │
-│ │
+└────────────────────────────┬───────────────────────────────┘
+ │
+ │ Status Polling
+ │ Agent reports state
+ │
+ ▼
+┌────────────────────────────────────────────────────────────┐
+│ NEXT.JS FRONTEND │
+│ (Customer + Admin UI) │
+│ │
 │ Owns: User interaction, form validation, display logic, │
 │ client-side state, UI components │
-│ │
-│ Speaks To: API v2 only (REST + WebSocket when added) │
-│ │
-│ Never Touches: Proxmox, Go Agent, DNS, Velocity, │
+│ │
+│ Speaks To: API v2 only (REST + WebSocket when added) │
+│ │
+│ Never Touches: Proxmox, Go Agent, DNS, Velocity, │
 │ Cloudflare, direct container access │
-└─────────────────────────────────────────────────────────────┘
+└────────────────────────────────────────────────────────────┘
 ```
 
 ---
@@ -88,7 +134,7 @@ This document establishes **architectural boundaries** across the three major Ze
 
 ---
 
-## 🔄 API ↔ Agent Contract (IMMUTABLE)
+## 🔒 API ↔ Agent Contract (IMMUTABLE)
 
 ### **API → Agent Endpoints**
 
@@ -203,7 +249,78 @@ Frontend **NEVER**:
 
 ---
 
-## 🚨 Drift Detection Rules (ACTIVE ENFORCEMENT)
+## 🚨 NEW: Architectural Boundary Enforcement (January 18, 2026)
+
+### **Critical Rule: Frontend Cannot Call Agents**
+
+**The Hard Reality**:
+- Container IPs are internal-only (10.x network)
+- No network path exists from browser to container
+- Agents have no CORS headers (they're not web services)
+- Direct calls would fail at network layer
+
+**Why This Matters**:
+- AI tools may suggest "quick fixes" calling agents directly
+- Developers may try to add CORS to agents
+- Frontend shortcuts bypass security/auth/rate limiting
+- Breaks architectural isolation
+
+**Correct Flow**:
+```
+User Action → Frontend → API → Agent → Response
+```
+
+**Forbidden Flow**:
+```
+User Action → Frontend → Agent (FAILS - no network path)
+```
+
+### **Common Drift Patterns (Now Prevented)**
+
+**Pattern 1: AI Tool Suggests Direct Agent Call**
+```javascript
+// WRONG - AI suggestion
+async function getServerLogs(vmid) {
+  const agentURL = `http://10.200.0.${vmid}:8080/logs`;
+  return await fetch(agentURL); // FAILS - no route
+}
+
+// CORRECT - Architectural pattern
+async function getServerLogs(vmid) {
+  return await api.get(`/containers/${vmid}/logs`);
+}
+```
+
+**Pattern 2: Adding CORS to Agents**
+```go
+// WRONG - Never add this to agents
+func (a *Agent) enableCORS() {
+    a.router.Use(cors.New(cors.Config{
+        AllowOrigins: []string{"*"},
+    }))
+}
+```
+
+**Pattern 3: Exposing Agent Ports**
+```nginx
+# WRONG - Never proxy agent ports
+location /agent/ {
+  proxy_pass http://10.200.0.100:8080/;
+}
+```
+
+### **Enforcement Documentation**
+
+Three-layer defense established in `zlh-grind`:
+1. **PORTAL_MIGRATION.md** - High-level boundaries
+2. **CONSTRAINTS.md** - Hard technical rules
+3. **ANTI_DRIFT_GUARDRAIL.md** - AI-specific warnings
+
+**Rule**: When code conflicts with documentation, **documentation wins**.
+
+---
+
+## 🛡️ Drift Detection Rules (ACTIVE ENFORCEMENT)
 
 ### **Rule #1: Provisioning Ownership**
 
@@ -343,7 +460,7 @@ async function registerVelocity(vmid, hostname, internalIP) {
 
 ---
 
-## 🔄 Context Switching Safety Workflow
+## 🔍 Context Switching Safety Workflow
 
 ### **When Moving to Node.js API Work:**
 
@@ -396,7 +513,7 @@ async function registerVelocity(vmid, hostname, internalIP) {
 
 ---
 
-## 🚨 High-Risk Integration Zones (GUARDED)
+## 🛡️ High-Risk Integration Zones (GUARDED)
 
 These areas have historically caused drift across sessions:
 
@@ -442,7 +559,7 @@ These areas have historically caused drift across sessions:
 
 ---
 
-## 📋 Architecture Decision Log (LOCKED) ⭐ NEW
+## 📊 Architecture Decision Log (LOCKED) ⭐ NEW
 
 **Purpose**: Records finalized architectural decisions that **must not be re-litigated** unless explicitly requested.
 
@@ -545,7 +662,27 @@ These areas have historically caused drift across sessions:
 
 ---
 
-## 📋 Canonical Architecture Anchors (ABSOLUTE)
+### **DEC-008: Frontend-Agent Network Isolation (NEW - January 18, 2026)**
+
+**Decision**: Frontend can NEVER call agents directly
+
+**Rationale**:
+- Container IPs (10.x) are internal-only with no public routing
+- Agents have no CORS headers (not web services)
+- Direct calls would fail at network layer
+- API enforces auth, rate limits, access control
+
+**Technical Reality**:
+- No network path exists from browser to container
+- Even if CORS added, network routing blocks access
+- This is architectural fact, not policy choice
+
+**Applies To**: Frontend, All Documentation
+**Status**: ✅ LOCKED
+
+---
+
+## 📊 Canonical Architecture Anchors (ABSOLUTE)
 
 These rules are **immutable** unless explicitly changed with full system review:
 
@@ -580,6 +717,11 @@ These rules are **immutable** unless explicitly changed with full system review:
 ### **Anchor #10: Directory Structure**
 ✅ `/opt/zlh///world` is canonical game server path
 
+### **Anchor #11: Network Isolation (NEW)**
+✅ Container IPs (10.x) are internal-only
+✅ No network path from frontend to agents
+✅ API is the only bridge between public and internal networks
+
 ---
 
 ## 🛡️ Enforcement Policies (ACTIVE)
@@ -624,7 +766,7 @@ D) Cancel this change?
 ```
 
 ### **Step 5: ARCHITECTURE DECISION LOG SPECIAL RULE**
-If violation is against a **LOCKED** Architecture Decision (DEC-001 through DEC-007):
+If violation is against a **LOCKED** Architecture Decision (DEC-001 through DEC-008):
 
 **ADDITIONAL CHECK**:
 ```
@@ -646,87 +788,7 @@ Proceeding with this change would violate architectural governance.
 
 ---
 
-## 📊 Drift Detection Examples
-
-### **Example 1: Agent Port Allocation (VIOLATION)**
-
-**Proposed Change**:
-```go
-// Agent code proposal
-func (a *Agent) allocatePort() int {
-    // Find available port...
-    return port
-}
-```
-
-**Drift Warning**:
-```
-⚠️ DRIFT WARNING ⚠️
-
-Rule Violated: Rule #1 - Provisioning Ownership
-Violation: Agent attempting to allocate ports
-Correct Path: API allocates ports, agent receives them via /config
-
-Impact: Port collisions, database inconsistency, broken PortPool
-
-Recommendation: Remove port allocation from agent, ensure API sends ports in config payload
-```
-
----
-
-### **Example 2: Frontend Direct Agent Call (VIOLATION)**
-
-**Proposed Change**:
-```typescript
-// Frontend code proposal
-async function getServerLogs(vmid: number) {
-  const agentURL = `http://10.200.0.${vmid}:8080/logs`;
-  return await fetch(agentURL);
-}
-```
-
-**Drift Warning**:
-```
-⚠️ DRIFT WARNING ⚠️
-
-Rule Violated: Rule #3 - Direct Container Access
-Violation: Frontend bypassing API to talk to agent
-Correct Path: Frontend → API → Agent (API proxies logs)
-
-Impact: Broken frontend if container IP changes, no auth/rate limiting, security risk
-
-Recommendation: Add GET /api/containers/:vmid/logs endpoint that proxies to agent
-```
-
----
-
-### **Example 3: API Installing Java (VIOLATION)**
-
-**Proposed Change**:
-```javascript
-// API code proposal
-async function provisionServer(vmid, game, variant) {
-  await proxmox.exec(vmid, 'apt-get install openjdk-21-jdk');
-  await proxmox.exec(vmid, 'wget https://papermc.io/...');
-}
-```
-
-**Drift Warning**:
-```
-⚠️ DRIFT WARNING ⚠️
-
-Rule Violated: Rule #2 - Artifact Ownership
-Violation: API performing in-container installation
-Correct Path: API sends config to agent, agent handles installation
-
-Impact: Breaks agent self-repair, variant-specific logic duplicated, no verification system
-
-Recommendation: Remove installation logic from API, ensure agent receives proper config via POST /config
-```
-
----
-
-## 📁 Integration with Existing Documentation
+## 📚 Integration with Existing Documentation
 
 ### **Relationship to Master Bootstrap**
 - Master Bootstrap: Strategic overview and business model
@@ -745,7 +807,7 @@ Recommendation: Remove installation logic from API, ensure agent receives proper
 
 ---
 
-## 🔄 Architecture Amendment Process
+## 🔧 Architecture Amendment Process
 
 If legitimate need to change Canonical Architecture:
 
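**Reviewer sketch (not part of the diff):** the "Frontend Cannot Call Agents" rule added in this change could also be backed by a small automated scan of frontend source for container-internal URLs. The sketch below is illustrative only. `findDirectAgentCalls` and `INTERNAL_URL` are hypothetical names that do not appear in the tracker or in `zlh-grind`, and the pattern assumes direct agent calls show up as literal `10.x` addresses, as in the Pattern 1 example.

```javascript
// Hypothetical drift check (assumption: not part of the ZeroLagHub codebase).
// Flags source lines that embed a container-internal 10.x URL, which the
// tracker says frontend code must never call directly.
const INTERNAL_URL = /https?:\/\/10\.\d{1,3}\.\d{1,3}\./;

function findDirectAgentCalls(source) {
  const hits = [];
  source.split('\n').forEach((text, index) => {
    if (INTERNAL_URL.test(text)) {
      // Report 1-based line numbers so they match editor output.
      hits.push({ line: index + 1, text: text.trim() });
    }
  });
  return hits;
}

// Usage: the WRONG and CORRECT lines from Pattern 1, kept as plain strings.
const sample = [
  'const agentURL = `http://10.200.0.${vmid}:8080/logs`;',
  'return await api.get(`/containers/${vmid}/logs`);',
].join('\n');

console.log(findDirectAgentCalls(sample));
```

A text scan like this only catches literal addresses. It would complement, not replace, the network-level isolation recorded in DEC-008.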