Add OPNsense validation checklist

jester 2026-03-24 23:10:12 +00:00
parent d47445c4d4
commit a673534200

@@ -0,0 +1,127 @@
# OPNsense Network Validation Checklist
## Overview
ZeroLagHub uses two OPNsense routers. This document tracks what must be
verified and tested before production launch.

Current state: both routers exist but have received only minimal configuration.
Work through this checklist systematically.

---
## Router Inventory
| Router | Role | Status |
|--------|------|--------|
| Primary OPNsense | Core network routing, VLANs, WireGuard | Needs audit |
| Secondary OPNsense | Failover / redundancy | Needs audit |
---
## VLAN Structure (Verify)
Expected VLANs (confirm these match the actual config):

| VLAN | Subnet | Purpose |
|------|--------|---------|
| Management | 10.60.0.0/24 | API, portal, core services |
| Game/Dev containers | 10.100.0.0/24 | LXC containers |
| Proxy/Edge | 10.70.0.0/24 | Traefik, zpack-proxy |
| WireGuard | 10.x.x.x/24 | Admin VPN access |

**Action:** Log into the OPNsense dashboard and confirm that the actual VLAN
assignments match the table above. Update the table if they differ.

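
While auditing, it can help to classify each address you come across against the expected table. The following is a minimal sketch, not part of the existing tooling: the function name `vlan_for_ip` is hypothetical, the prefixes are copied from the table above, and the WireGuard subnet is left out because the table does not specify it.

```bash
# Map an IPv4 address to the VLAN this checklist expects it to belong to.
# Prefixes come from the "Expected VLANs" table; anything else is "unknown".
vlan_for_ip() {
    case "$1" in
        10.60.0.*)  echo "Management" ;;
        10.100.0.*) echo "Game/Dev containers" ;;
        10.70.0.*)  echo "Proxy/Edge" ;;
        *)          echo "unknown" ;;
    esac
}
```

Feeding it the addresses shown on the OPNsense interface and lease pages (for example `vlan_for_ip 10.60.0.245`) gives a quick way to spot anything that prints `unknown` and therefore needs a closer look.
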
---
## Firewall Rules (Verify)
### Must be confirmed open:
- [ ] `10.60.0.245:4000` (API) reachable from `10.70.0.242` (Traefik)
- [ ] `10.100.0.x:6000` (dev containers) reachable from `10.60.0.245` (API)
- [ ] `10.100.0.x:18888` (agent) reachable from `10.60.0.245` (API)
- [ ] `10.70.0.242:443` (Traefik) reachable from internet (public IP)
- [ ] DNS resolution working from containers → `10.60.0.x` (Technitium)
### Must be confirmed blocked:
- [ ] Direct container access from internet (no public IP on container VMs)
- [ ] Cross-VLAN access that bypasses the API (containers must not be able to reach the API directly)
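
The open and blocked items above can be probed with a small TCP check run from the source host named in each item. This is a rough sketch under a few assumptions: the helper names (`port_open`, `expect_open`, `expect_blocked`) are hypothetical, the probing host has `bash` and coreutils `timeout`, and a refused connection and a timed-out one are both treated as "blocked".

```bash
# Return 0 if a TCP connection to host ($1) port ($2) succeeds within 3 seconds.
# The probe runs in bash because /dev/tcp is a bash feature; `timeout` turns
# silently dropped packets into a failure instead of a long hang.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Print PASS/FAIL for a rule that must allow traffic.
expect_open() {
    if port_open "$1" "$2"; then
        echo "PASS open $1:$2"
    else
        echo "FAIL open $1:$2"
        return 1
    fi
}

# Print PASS/FAIL for a rule that must block traffic.
expect_blocked() {
    if port_open "$1" "$2"; then
        echo "FAIL blocked $1:$2"
        return 1
    else
        echo "PASS blocked $1:$2"
    fi
}
```

For example, `expect_open 10.60.0.245 4000` run on the Traefik host covers the first open item, and `expect_blocked 10.60.0.245 4000` run inside a dev container covers the last blocked item. The internet-facing items need a vantage point outside the network.
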
---
## WireGuard (Verify)
- [ ] Admin WireGuard tunnel active and stable
- [ ] Peer configs for all admin machines documented
- [ ] Confirm the WireGuard tunnel comes back up on its own after a router restart
- [ ] Test: disconnect and reconnect from admin machine
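
One way to judge "active and stable" is to look at each peer's most recent handshake, which `wg show <interface> latest-handshakes` prints as a public key followed by a Unix timestamp. The filter below is a sketch only: `stale_peers` is a hypothetical helper, and the 180-second threshold and interface name `wg0` in the usage note are assumptions to adjust.

```bash
# Read `wg show <interface> latest-handshakes` output on stdin
# (one "<public-key> <unix-timestamp>" pair per line) and print every peer
# whose last handshake is older than max_age seconds ($1).
# A timestamp of 0 means the peer has never completed a handshake.
# $2 optionally overrides "now" (epoch seconds). Exits 1 if any peer is stale.
stale_peers() {
    max_age="$1"
    now="${2:-$(date +%s)}"
    awk -v max="$max_age" -v now="$now" '
        $2 == 0 || now - $2 > max { print $1; stale = 1 }
        END { exit stale }
    '
}
```

Typical use would be `wg show wg0 latest-handshakes | stale_peers 180`. Handshakes only happen when traffic flows, so an idle peer without a persistent keepalive will show up as stale even when nothing is wrong.
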
---
## Failover / Redundancy
- [ ] Confirm whether CARP/VRRP is configured between primary and secondary
- [ ] Test failover: shut down primary router, verify traffic continues
- [ ] Confirm secondary has identical VLAN and firewall config
- [ ] Document failover behavior — is it automatic or manual?
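
To put a number on "traffic continues" during the failover test, a probe can be repeated at a fixed interval while the primary is shut down. The loop below is a minimal sketch; `probe_loop` is a hypothetical helper, and the probe command, count, and interval are whatever fits the test.

```bash
# Run a probe command ($3...) `count` ($1) times, `interval` ($2) seconds apart.
# Prints a timestamped line for each failed probe and a summary at the end.
# Returns 0 only if every probe succeeded.
probe_loop() {
    count="$1"
    interval="$2"
    shift 2
    failed=0
    i=0
    while [ "$i" -lt "$count" ]; do
        if ! "$@" >/dev/null 2>&1; then
            failed=$((failed + 1))
            echo "$(date +%T) probe failed"
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "failed $failed of $count probes"
    [ "$failed" -eq 0 ]
}
```

For instance, `probe_loop 120 1 curl -sS --max-time 2 http://10.60.0.245:4000/api/health` on the Traefik host exercises a cross-VLAN path for about two minutes. The number and timing of failed probes indicate how long the outage lasted and whether recovery was automatic.
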
---
## DNS Resolution
- [ ] Containers resolve internal hostnames via Technitium (`zlh-dns`)
- [ ] `dev-*.zerolaghub.dev` resolves to Traefik IP (`10.70.0.242`) internally
- [ ] External DNS (`zerolaghub.dev`) resolves correctly via Cloudflare
- [ ] Reverse DNS for container IPs (nice to have, not blocking)
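
The internal-resolution items can be checked by comparing resolver output against the expected address. This sketch assumes `dig` is installed on the host being tested; `expect_ip` is a hypothetical helper, and `dev-example.zerolaghub.dev` in the usage note is a made-up name standing in for a real `dev-*` hostname.

```bash
# Read resolver output on stdin (one address per line, e.g. from `dig +short`)
# and succeed only if the expected address ($1) is among the answers.
expect_ip() {
    if grep -qxF -- "$1"; then
        echo "PASS resolves to $1"
    else
        echo "FAIL expected $1"
        return 1
    fi
}
```

Inside a container, `dig +short dev-example.zerolaghub.dev | expect_ip 10.70.0.242` checks that the name resolves to the Traefik IP through whatever resolver the container is configured to use.
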
---
## Routing Validation
Quick connectivity matrix to verify end-to-end:
```bash
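# Replace "x" with the last octet of the container under test.
# --max-time keeps a silently dropped connection from hanging the check.
# Any HTTP response (even an error page) shows the host is reachable.

# From API host (10.60.0.245)
curl --max-time 5 http://10.100.0.x:18888/status   # Agent reachable
curl --max-time 5 http://10.100.0.x:6000           # code-server reachable
curl --max-time 5 http://10.70.0.242               # Traefik reachable

# From Traefik host (10.70.0.242)
curl --max-time 5 http://10.60.0.245:4000/api/health  # API reachable

# From dev container
curl --max-time 5 http://10.60.0.251:8080   # Artifact server reachable
nslookup zerolaghub.dev                     # DNS working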
```
---
## Known Issues / History
- Routers have not been systematically audited
- Basic routing is confirmed working (cross-subnet curl tests pass)
- WireGuard access confirmed working for admin
- No formal failover test has been performed
---
## Action Items (Priority Order)
1. Audit both router configs — document actual VLAN and firewall rules
2. Run connectivity matrix above and confirm all pass
3. Test WireGuard reconnect after router restart
4. Test failover between primary and secondary
5. Document any gaps found and remediate
---
## Notes
The OPNsense dashboard is accessible via WireGuard. Do not expose the OPNsense
management interface to the internet.

Configuration backups: OPNsense has a built-in XML config export. Export the
config and store it in a secure location before making any changes.
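
Once an export has been downloaded, filing it consistently makes it easier to roll back after the audit. The helper below is only a sketch: `archive_config` and both paths in the usage note are hypothetical, and it assumes the export's root element is `<opnsense>`, which is worth confirming against a real export.

```bash
# Copy an exported OPNsense config ($1) into a backup directory ($2) under a
# UTC-timestamped name, restrict it to the owner, and print the new path.
# Files without an <opnsense> element are refused so that a truncated or
# mistaken download is not archived as a backup.
archive_config() {
    src="$1"
    dest_dir="$2"
    if ! grep -q '<opnsense' "$src"; then
        echo "not an OPNsense config: $src" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    dest="$dest_dir/config-$(date -u +%Y%m%dT%H%M%SZ).xml"
    cp "$src" "$dest" && chmod 600 "$dest" && echo "$dest"
}
```

Usage might look like `archive_config ~/Downloads/config-export.xml ~/opnsense-backups`. The export can contain password hashes and VPN keys, so the backup directory itself should also be access-restricted.
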