# 🛡️ ZeroLagHub Drift Prevention - Quick Start Card
**Use This**: At the start of EVERY coding session (API, Agent, or Frontend)
---
## ⚡ 30-Second Architecture Check
### **Working on API?**
✅ Can allocate: Ports, VMIDs, DNS, Velocity
❌ Cannot: Install Java, download artifacts, exec in container
### **Working on Agent?**
✅ Can install: Java, artifacts, server files
❌ Cannot: Allocate ports, create DNS, call Proxmox, register Velocity
### **Working on Frontend?**
✅ Can: Display data, call API endpoints
❌ Cannot: Talk to Agent, Proxmox, DNS, Velocity, allocate anything
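
The three boundary checks above can also be written as a lookup table. This is a minimal sketch, not part of any ZeroLagHub codebase: the `System` enum, the action names, and `is_allowed` are hypothetical labels chosen for illustration, and only the allowed/forbidden split comes from this card.

```python
from enum import Enum


class System(Enum):
    API = "api"
    AGENT = "agent"
    FRONTEND = "frontend"


# Hypothetical action names; the grouping mirrors the "Can" lines of this card.
ALLOWED = {
    System.API: {"allocate_port", "allocate_vmid", "create_dns", "register_velocity"},
    System.AGENT: {"install_java", "download_artifact", "write_server_files"},
    System.FRONTEND: {"display_data", "call_api"},
}


def is_allowed(system: System, action: str) -> bool:
    """Return True only if the card lists the action under that system."""
    return action in ALLOWED[system]
```

Anything not listed is treated as forbidden, so `is_allowed(System.AGENT, "create_dns")` is false without needing a separate deny list.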
---
## 🔒 8 Locked Architectural Decisions (NEW) ⭐
**These decisions are FINAL**. They cannot be changed unless the user explicitly says "Revisit decision X":
1. **DEC-001**: Templates + Agent hybrid (not templates-only or agent-only)
2. **DEC-002**: API orchestrates, Agent executes (not reversed)
3. **DEC-003**: API owns DNS (Agent never creates DNS)
4. **DEC-004**: Traefik + Velocity only (no HAProxy)
5. **DEC-005**: MariaDB is source of truth (no flat files)
6. **DEC-006**: Frontend → API only (no direct agent calls)
7. **DEC-007**: Drift prevention mandatory (always enforced)
8. **DEC-008**: Frontend-Agent network isolation (NEW - Jan 18, 2026)
**If proposing a change that conflicts with DEC-001 through DEC-008**:
→ STOP → Consult Cross-Project Tracker → Request "Revisit decision X" from user
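
The "Revisit decision X" gate can be sketched as a small check. The decision IDs and summaries are copied from the list above; the function name `may_change` and the accepted phrasing are assumptions, since the card does not say whether X is the number or the full ID (this sketch accepts both).

```python
import re

# Decision IDs and summaries as listed on this card.
LOCKED_DECISIONS = {
    "DEC-001": "Templates + Agent hybrid",
    "DEC-002": "API orchestrates, Agent executes",
    "DEC-003": "API owns DNS",
    "DEC-004": "Traefik + Velocity only",
    "DEC-005": "MariaDB is source of truth",
    "DEC-006": "Frontend -> API only",
    "DEC-007": "Drift prevention mandatory",
    "DEC-008": "Frontend-Agent network isolation",
}

# Assumption: the user names the decision as "4" or "DEC-004".
_REVISIT = re.compile(r"revisit decision\s+(?:DEC-0*)?(\d+)", re.IGNORECASE)


def may_change(decision_id: str, user_message: str) -> bool:
    """True only if the message explicitly asks to revisit this locked decision."""
    if decision_id not in LOCKED_DECISIONS:
        raise KeyError(f"unknown decision: {decision_id}")
    wanted = int(decision_id.split("-")[1])
    return any(int(n) == wanted for n in _REVISIT.findall(user_message))
```

A proposal such as "Let's add HAProxy" does not unlock DEC-004 here; only the explicit phrase does.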
---
## 🚨 CRITICAL: Frontend Cannot Call Agents (DEC-008)
**Network Reality** (not policy, but actual network architecture):
- Container IPs are internal-only (10.x network)
- No public routing from browser to containers
- Agents have NO CORS headers (not web services)
- Direct calls FAIL at network layer
**The ONLY Valid Flow**:
```
User → Frontend → API → Agent → Response
```
**Forbidden Flow** (will fail):
```
User → Frontend → Agent (FAILS - no network path)
```
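
The two flows above can be checked mechanically by listing the permitted hops. This is an illustrative sketch; `ALLOWED_HOPS` and `first_invalid_hop` are hypothetical names, and the hop list is taken directly from the valid flow shown above.

```python
from typing import Optional, Sequence, Tuple

# The only hops permitted by the valid flow on this card.
ALLOWED_HOPS = {
    ("User", "Frontend"),
    ("Frontend", "API"),
    ("API", "Agent"),
}


def first_invalid_hop(path: Sequence[str]) -> Optional[Tuple[str, str]]:
    """Return the first hop that is not permitted, or None if the path is valid."""
    for hop in zip(path, path[1:]):
        if hop not in ALLOWED_HOPS:
            return hop
    return None
```

For the forbidden flow, `first_invalid_hop(["User", "Frontend", "Agent"])` points at the `("Frontend", "Agent")` hop, which is the hop with no network path.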
**Common AI Tool Mistakes**:
- ❌ Suggesting `fetch('http://10.200.0.X:8080/...')`
- ❌ Adding CORS headers to agents
- ❌ Exposing agent ports through proxy
- ❌ Any "quick fix" that bypasses API
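
A review script could flag the first mistake automatically. The sketch below scans frontend source text for URLs that point into the container network. It assumes containers sit in `10.200.0.0/24`, which is inferred from the `10.200.0.X` placeholder on this card and not confirmed elsewhere; `find_direct_agent_calls` is a hypothetical helper name.

```python
import ipaddress
import re
from typing import List, Tuple

# Assumption: container addresses fall in 10.200.0.0/24 (from "10.200.0.X").
CONTAINER_NET = ipaddress.ip_network("10.200.0.0/24")

# Matches the IPv4 host part of an http(s) URL.
URL_HOST = re.compile(r"https?://(\d{1,3}(?:\.\d{1,3}){3})")


def find_direct_agent_calls(source: str) -> List[Tuple[int, str]]:
    """Return (line number, host) for every URL that targets the container network."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for host in URL_HOST.findall(line):
            try:
                addr = ipaddress.ip_address(host)
            except ValueError:
                continue  # not a valid IPv4 address, e.g. 999.1.1.1
            if addr in CONTAINER_NET:
                hits.append((lineno, host))
    return hits
```

The check is deliberately narrow: it only catches literal container IPs, so a hostname or an environment variable that resolves to a container would still need human review.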
**Why This Matters**:
- AI tools WILL suggest direct agent calls
- Developers may try to "fix" the failure by adding CORS headers
- Shortcuts bypass auth, rate limits, and other security controls
- Breaks architectural isolation
**Enforcement**: When code conflicts with documentation, **documentation wins**.
**Full Details**: See `zlh-grind` repository:
- `PORTAL_MIGRATION.md` - Architectural boundaries
- `CONSTRAINTS.md` - Hard network rules
- `ANTI_DRIFT_GUARDRAIL.md` - AI-specific warnings
---
## 🛡️ AI-Specific Guardrails (NEW)
**When using AI coding tools (Codex, GPT, Claude)**:
### **Hard Rules**:
1. ❌ Explicitly forbid frontend → agent calls
2. ✅ Require API-only control paths
3. ❌ Reject changes that "just work" via shortcuts
4. ✅ Prefer deletion over convenience
### **If AI Introduces Direct Agent Calls**:
- The change is INVALID
- The prompt must be corrected
- The documentation is authoritative
### **"Documentation Wins" Rule**:
When behavior and documentation disagree:
> **Documentation takes precedence**
This prevents gradual architectural erosion from "convenient" changes.
---
## 🚨 Drift Detection Triggers
**STOP and consult full tracker if you hear:**
1. "Agent should allocate ports..."
2. "API should install Java inside container..."
3. "Frontend should call agent directly..."
4. "Agent should register with Velocity..."
5. "API should decide what Java version..."
6. "Frontend should manage DNS records..."
7. "Let's use templates only..." (violates DEC-001)
8. "Let's add HAProxy..." (violates DEC-004)
9. "Let's use flat files instead of DB..." (violates DEC-005)
10. "Frontend can just fetch from container IP..." (violates DEC-008)
**All of these are VIOLATIONS** → Consult [ZeroLagHub_Cross_Project_Tracker.md](computer:///mnt/user-data/outputs/ZeroLagHub_Cross_Project_Tracker.md)
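
Several of the triggers above can be approximated with simple pattern matching. This is a rough sketch under stated assumptions: the regular expressions and the `detect_drift` name are illustrative, they cover only some of the ten triggers, and keyword matching will miss paraphrases, so it complements the checklist and does not replace it.

```python
import re
from typing import List

# (pattern, reason) pairs derived from the trigger list on this card.
TRIGGERS = [
    (r"\bagent\b.*\ballocat\w*\s+ports?\b", "Ports are allocated by the API"),
    (r"\bapi\b.*\binstall\w*\s+java\b", "Java is installed by the Agent"),
    (r"\bagent\b.*\bregister\w*\b.*\bvelocity\b", "Velocity registration belongs to the API"),
    (r"\bfrontend\b.*\b(?:call|fetch)\w*\b.*\b(?:agent|container)\b", "DEC-008"),
    (r"\btemplates only\b", "DEC-001"),
    (r"\bhaproxy\b", "DEC-004"),
    (r"\bflat files?\b", "DEC-005"),
]

_COMPILED = [(re.compile(p, re.IGNORECASE), reason) for p, reason in TRIGGERS]


def detect_drift(statement: str) -> List[str]:
    """Return the reasons for every trigger pattern the statement matches."""
    return [reason for pattern, reason in _COMPILED if pattern.search(statement)]
```

An empty result means no known trigger matched, not that the statement is safe.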
---
## 📋 Pre-Coding Checklist
Before writing ANY code:
- [ ] Which system am I modifying? (API / Agent / Frontend)
- [ ] Does this change cross boundaries? (If yes → read full tracker)
- [ ] Am I adding external API calls to Agent? (If yes → VIOLATION)
- [ ] Am I adding container execution to API? (If yes → VIOLATION)
- [ ] Am I bypassing API in Frontend? (If yes → VIOLATION)
- [ ] Am I making Frontend call Agent directly? (If yes → VIOLATION - network won't allow it)
---
## 🎯 The Golden Rules
1. **API orchestrates** (allocates resources, publishes state)
2. **Agent executes** (installs, runs, monitors inside container)
3. **Frontend displays** (no direct infrastructure access)
**Anything else** → Drift → Consult full tracker
---
## 🔍 Network Architecture Reality
**Why Frontend → Agent Calls Fail**:
```
Browser (Public Internet)
    ↓
❌ NO ROUTE to 10.200.0.X (container network)
    ↓
Container (Internal 10.x network)
```
**The API is the bridge**:
```
Browser → API (10.60.0.245:4000) → Container (10.200.0.X)
        ✅ Public access          ✅ Internal access
```
**Key Facts**:
- Containers on isolated internal network
- No public IPs assigned to containers
- No firewall rules allowing browser → container
- This is infrastructure reality, not just policy
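
Python's standard `ipaddress` module shows why a container address is unreachable from the public internet: addresses in `10.0.0.0/8` are private (RFC 1918) and are not globally routable. The helper name `publicly_routable` and the sample address `10.200.0.5` are illustrative; the check classifies the address only and says nothing about firewall rules or actual routes.

```python
import ipaddress


def publicly_routable(ip: str) -> bool:
    """True if the address is in globally routable space (not private or reserved)."""
    return ipaddress.ip_address(ip).is_global


# A sample container address in the 10.200.0.X range from this card.
container_ip = "10.200.0.5"
container_is_private = ipaddress.ip_address(container_ip).is_private
```

Because the container range is private, a browser outside the internal network has no route to it, regardless of what the frontend code attempts.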
---
## 📖 Quick Reference
**Full Tracker**: [ZeroLagHub_Cross_Project_Tracker.md](computer:///mnt/user-data/outputs/ZeroLagHub_Cross_Project_Tracker.md)
**Architectural Guardrails** (zlh-grind repo):
- `PORTAL_MIGRATION.md` - High-level boundaries
- `CONSTRAINTS.md` - Hard technical rules
- `ANTI_DRIFT_GUARDRAIL.md` - AI tool warnings
**High-Risk Zones**:
- Forge/NeoForge installation
- Cloudflare SRV deletion
- Velocity registration order
- PortPool commit logic
- Agent READY detection
- IP selection logic
- Frontend-Agent communication (NEVER allowed)
---
## ✅ Session Start Command
```
Read ZeroLagHub Cross-Project Tracker Quick Start Card.
Activate drift detection.
Confirm which system I'm working on: [API / Agent / Frontend]
Verify no Frontend → Agent calls being introduced.
Proceed with architecture-aligned implementation only.
```
---
🛡️ **Drift prevention ACTIVE. Proceed with confidence.**