Launch blocker: implement billing enforcement and overdue server handling #14

Open
opened 2026-05-03 16:05:48 +00:00 by jester · 5 comments
Owner

Launch blocker

Implement billing enforcement for overdue / non-paid customer servers.

Context

Controller/reconciler is now running in dry-run while we observe recommendations and noise. The next launch item is billing enforcement so ZLH can safely handle overdue accounts without requiring James to manually monitor everything.

Required policy

Do not destroy customer servers immediately for non-payment.

Required enforcement ladder:

1. Initial warning
2. Grace-period warning / final notice
3. Block new backups / customer-triggered backup actions
4. Shutdown or suspend workload/server
5. Keep server/data retained for a defined retention window
6. Destroy/delete only after explicit retention policy and admin-safe workflow exist

Principles

- Never delete customer data immediately for missed payment
- Never auto-destroy as the first enforcement action
- Billing suspension must override normal auto-repair / auto-start behavior
- Controller must not restart or repair suspended servers back to running
- Portal must clearly show billing/suspended state
- Customer should know what happened and what to do next

Billing state model needed

Add/verify durable billing enforcement state per user/account and/or server:

billingStatus:
  active
  past_due
  overdue_warning
  suspended
  retained
  pending_deletion

serverBillingState:
  allowed
  backup_blocked
  suspended_shutdown
  retained

Suggested durable fields, exact model TBD:

userId/customerId
stripeCustomerId/subscriptionId if available
billingStatus
pastDueSince
warningSentAt
finalWarningSentAt
backupBlockedAt
suspendedAt
retentionUntil
lastBillingEventAt
lastEnforcementAction

Stripe / billing event handling

Webhook handling should update durable state for events such as:

invoice.payment_succeeded
invoice.payment_failed
customer.subscription.updated
customer.subscription.deleted

Expected behavior:

payment succeeded:
  clear past_due/suspended enforcement state where appropriate
  allow backups again
  allow manual/customer start again
  do not automatically start servers unless policy explicitly says so

payment failed / subscription past_due:
  set past_due / warning state
  send warning notification

subscription canceled/deleted or unpaid beyond grace period:
  move to suspended/retained state according to policy
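
The expected behavior above could be sketched as a pure state transition. This is an illustration only: `applyBillingEvent`, the normalized `event` shape (`type`, `receivedAt`), and the choice to move straight to `suspended` on `customer.subscription.deleted` are assumptions, since the exact model and policy are still TBD.

```javascript
// Hypothetical reducer; names and shapes are illustrative, not a final design.
function applyBillingEvent(state, event) {
  const at = event.receivedAt;
  switch (event.type) {
    case "invoice.payment_succeeded":
      // Clear enforcement state and re-allow customer actions.
      // No "start server" side effect: starting stays a manual action.
      return {
        ...state,
        billingStatus: "active",
        serverBillingState: "allowed",
        pastDueSince: null,
        warningSentAt: null,
        finalWarningSentAt: null,
        backupBlockedAt: null,
        suspendedAt: null,
        retentionUntil: null,
        lastBillingEventAt: at,
      };
    case "invoice.payment_failed":
      // Keep the earliest pastDueSince so repeated failures do not reset the clock.
      return {
        ...state,
        billingStatus: state.billingStatus === "active" ? "past_due" : state.billingStatus,
        pastDueSince: state.pastDueSince ?? at,
        lastBillingEventAt: at,
      };
    case "customer.subscription.deleted":
      // Assumption: cancellation suspends immediately; policy may differ.
      return {
        ...state,
        billingStatus: "suspended",
        serverBillingState: "suspended_shutdown",
        suspendedAt: state.suspendedAt ?? at,
        lastBillingEventAt: at,
      };
    default:
      // customer.subscription.updated and unknown events are left to policy.
      return state;
  }
}
```

Keeping the transition free of side effects means notifications and shutdowns can be enqueued separately by the worker, and the same function can be replayed in tests.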

Enforcement worker / controller behavior

Controller/reconciler should observe billing state and enqueue safe enforcement actions.

Possible queue/action names:

billing_enforcement
block_backups
shutdown_suspended_server
mark_server_retained
send_billing_warning

Initial safe launch actions:

- send warning/final notice
- block customer-triggered backups
- prevent new provisioning
- prevent start/restart actions for suspended accounts
- stop/shutdown workload or container for suspended servers

Dangerous actions not allowed automatically yet:

- delete container
- wipe data
- prune backups
- restore over current data

API gating requirements

API must enforce billing state consistently:

POST /api/instances:
  block if past grace / suspended / no active plan

server start/restart actions:
  block if suspended

backup create actions:
  block if backup_blocked or suspended

restore actions:
  decide policy; likely block when suspended except admin

console/file access:
  decide policy; likely read-only or blocked when suspended

Portal requirements

Portal must show clear customer-facing state:

Account past due
Payment required
Backups disabled due to billing status
Server suspended due to billing status
Data retained until <date>
Update payment method / contact support

Do not show suspended servers as generic broken/offline servers.

Controller interaction

Controller must respect billing state:

- Do not auto-start suspended servers
- Do not repair suspended servers into running/connectable state
- Do not edge-republish suspended/deleting/retained servers unless policy explicitly allows it
- Suspended + running should enqueue safe shutdown/suspend action
- Billing state should be included in desired vs observed evaluation

Notifications

Need customer-facing email and internal Discord notifications.

Discord channels:

#alerts-billing for webhook/enforcement failures
#audit-log for enforcement actions
#alerts-critical only for repeated billing webhook failures or data-risk states

Customer email events:

initial payment failed warning
final warning before suspension
server suspended notice
payment restored / access restored
retention deadline warning

Validation plan

1. Simulate payment_failed / past_due event.
2. Confirm billing state updates.
3. Confirm provisioning is blocked.
4. Confirm backups are blocked at the correct stage.
5. Confirm suspended running test server is shut down but not deleted.
6. Confirm controller does not auto-repair/start suspended server.
7. Simulate payment_succeeded.
8. Confirm access is restored and backups unblocked.
9. Confirm server is not automatically started unless explicitly intended.
10. Confirm Portal messaging is clear.

Launch expectation

This is a launch blocker. The minimum acceptable launch behavior is warning + backup block + shutdown/suspension + retention. Destroy/delete automation can wait until after launch and must require explicit retention policy plus admin review.

Author
Owner

Implementation update — billing enforcement pass

Billing enforcement has been implemented in zpack-api.

Created / changed

BillingEnforcementState
BillingEnforcementEvent
StripeEventLog
prisma/migrations/20260503143000_add_billing_enforcement/migration.sql
src/services/billingEnforcement.js
src/queues/billingEnforcement.js
src/services/emailNotifications.js
src/services/email.js
Stripe webhook handling
API billing gates
controller billing behavior
repair worker billing guards
.env.example

Billing schema added

Durable billing enforcement state now exists for:

billing statuses
server billing states
Stripe event idempotency
normalized billing enforcement audit events

Billing enforcement service

src/services/billingEnforcement.js now handles:

billing statuses
server billing states
Stripe event state transitions
idempotency helpers
API guard/assert helpers
due enforcement action calculation

Billing worker added

src/queues/billingEnforcement.js was added with queue name:

billing_enforcement

Script added:

npm run worker:billing

Safe actions implemented:

send warning
send final warning
block backups
suspend/shutdown
mark retained
restore access

Destructive actions are explicitly disabled.

Note: this creates a separate billing worker entrypoint. We should decide whether to keep it as a separate service or fold billing actions into the existing repair worker to avoid service sprawl.

Notifications

Added fail-soft customer notification wrapper:

src/services/emailNotifications.js

Added generic sendEmail export in:

src/services/email.js

Stripe webhook handling extended

Handled events now include:

invoice.payment_failed
invoice.paid
invoice.payment_succeeded
customer.subscription.updated
customer.subscription.deleted

Duplicate Stripe events are skipped through StripeEventLog.
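
For reviewers, the duplicate-skip idea can be sketched as follows. This is not the code in `zpack-api`: an in-memory `Map` stands in for the `StripeEventLog` table, and `createEventLog`, `recordOnce`, and `handleStripeEvent` are hypothetical names. A database-backed version would more likely rely on a unique constraint on the Stripe event id.

```javascript
// In-memory stand-in for StripeEventLog (assumption: the real table is keyed
// by Stripe event id and enforces uniqueness in the database).
function createEventLog() {
  const seen = new Map();
  return {
    // Returns true only the first time an event id is recorded.
    recordOnce(eventId, type) {
      if (seen.has(eventId)) return false;
      seen.set(eventId, { type, processedAt: new Date().toISOString() });
      return true;
    },
    count: () => seen.size,
  };
}

// Applies the state transition at most once per Stripe event id.
function handleStripeEvent(log, event, applyTransition) {
  if (!log.recordOnce(event.id, event.type)) {
    return { duplicate: true };
  }
  applyTransition(event);
  return { duplicate: false };
}
```

Replaying the same event id through `handleStripeEvent` returns `{ duplicate: true }` and leaves both the log count and the billing state untouched.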

API gates added

Billing gates were added for:

POST /api/instances
backup create / restore / delete
game and host start / restart
console command and websocket stream
file write / upload / revert / delete mutations

File read/list/download remains allowed while suspended/retained.
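
A rough sketch of how such a gate could be expressed, for discussion only. `checkBillingGate` and the action names are hypothetical. The `402 BILLING_SUSPENDED` response matches what validation later in this thread observed for suspended accounts; reusing it for `retained` and `pending_deletion`, the `BILLING_BACKUP_BLOCKED` code, and blocking only backup creation in the `backup_blocked` state are assumptions.

```javascript
// Mutating actions that are refused while the account is billing-blocked.
// Reads (file_read, file_list, file_download) and stop are intentionally absent.
const BLOCKED_WHEN_SUSPENDED = new Set([
  "provision", "start", "restart",
  "backup_create", "backup_restore", "backup_delete",
  "console_command", "console_stream",
  "file_write", "file_upload", "file_revert", "file_delete",
]);

const BLOCKED_STATUSES = new Set(["suspended", "retained", "pending_deletion"]);

// Returns { allowed: true } or a refusal with an HTTP status and machine-readable code.
function checkBillingGate(billing, action) {
  if (BLOCKED_STATUSES.has(billing.billingStatus) && BLOCKED_WHEN_SUSPENDED.has(action)) {
    return { allowed: false, status: 402, code: "BILLING_SUSPENDED" };
  }
  if (billing.serverBillingState === "backup_blocked" && action === "backup_create") {
    // Hypothetical code name; the real response code is not stated in this thread.
    return { allowed: false, status: 402, code: "BILLING_BACKUP_BLOCKED" };
  }
  return { allowed: true };
}
```

Routing every mutating route through one function like this keeps the gate list in a single place, which makes it easier to verify that reads and stop remain available while suspended.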

Controller behavior updated

billing state is checked before normal repair policy
suspended/retained/pending-deletion servers skip edge repair and live edge observation
running suspended servers enqueue billing-safe shutdown only
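
The ordering described above can be sketched like this. It is a simplified illustration, not the controller code: `planControllerActions`, the `observedState` field, and the injected `normalRepairPolicy` callback are hypothetical, and treating `retained` and `pending_deletion` the same as `suspended` for shutdown is an assumption.

```javascript
const BILLING_BLOCKED = new Set(["suspended", "retained", "pending_deletion"]);

// Billing state is evaluated first; normal repair policy only runs for servers
// that are not billing-blocked.
function planControllerActions(server, normalRepairPolicy) {
  if (BILLING_BLOCKED.has(server.billingStatus)) {
    // No edge observation or repair here. The only permitted action is a
    // billing-safe shutdown when the server is still running.
    return server.observedState === "running" ? ["shutdown_suspended_server"] : [];
  }
  return normalRepairPolicy(server);
}
```

Because the billing check returns before the repair policy is called, a suspended server can never receive `edge_republish` or similar repair actions from this path.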

Repair worker billing guard

Existing edge/DNS/Velocity repair jobs now skip if billing state is blocked.

Env defaults added

BILLING_WARNING_GRACE_HOURS=0
BILLING_FINAL_WARNING_AFTER_HOURS=72
BILLING_BACKUP_BLOCK_AFTER_HOURS=72
BILLING_SUSPEND_AFTER_HOURS=120
BILLING_RETENTION_DAYS=14
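
To make the timeline concrete, here is a minimal sketch of how these defaults could translate into due actions. It assumes every threshold is measured in hours from `pastDueSince`, which this thread does not state explicitly. `dueEnforcementActions`, `retentionUntil`, and the `send_final_warning` action name are hypothetical.

```javascript
const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// Mirrors the env defaults listed above.
const defaults = {
  warningGraceHours: 0,
  finalWarningAfterHours: 72,
  backupBlockAfterHours: 72,
  suspendAfterHours: 120,
  retentionDays: 14,
};

// Lists the enforcement actions that are due but not yet recorded on the state.
function dueEnforcementActions(state, now, cfg = defaults) {
  if (!state.pastDueSince) return [];
  const elapsedHours = (now.getTime() - new Date(state.pastDueSince).getTime()) / HOUR_MS;
  const due = [];
  if (elapsedHours >= cfg.warningGraceHours && !state.warningSentAt) due.push("send_billing_warning");
  if (elapsedHours >= cfg.finalWarningAfterHours && !state.finalWarningSentAt) due.push("send_final_warning");
  if (elapsedHours >= cfg.backupBlockAfterHours && !state.backupBlockedAt) due.push("block_backups");
  if (elapsedHours >= cfg.suspendAfterHours && !state.suspendedAt) due.push("shutdown_suspended_server");
  return due;
}

// Retention deadline as an ISO timestamp, counted from the suspension time.
function retentionUntil(suspendedAt, cfg = defaults) {
  return new Date(new Date(suspendedAt).getTime() + cfg.retentionDays * DAY_MS).toISOString();
}
```

With these defaults the first warning is due immediately, the final warning and backup block both become due at 72 hours, and suspension at 120 hours. A suspension at `2026-05-03T16:37:44.597Z` would give a retention deadline of `2026-05-17T16:37:44.597Z`.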

Validation completed

node --check passed for touched services/routes/controllers/workers
npx prisma validate passed
npm run prisma:generate passed
git diff --check passed
npm run worker:billing started cleanly and shut down under timeout
npm run worker:repair still starts cleanly
controller dry-run starts, but active controller lock is held so it exits with controller_lock_lost

Not completed yet

Live/simulated Stripe flows still need validation against test users:

payment failed
backup block
suspension/shutdown
payment restored
replay idempotency

Portal was not changed yet. API now returns billing state/codes for Portal to render suspended/retained states clearly.

Author
Owner

Validation update — billing enforcement simulated flow

Billing enforcement validation was run against a test user and disposable dev server.

Test subjects

User: testuser1@zerolaghub.com
userId: 4bc6e123-8ec9-4433-8a5f-2f7885a6f421
Stripe customer: cus_UJlUFsobDNZTZc
Test server: dev-6090
VMID: 6090
IP: 10.100.0.28

Preflight

npm run prisma:generate: pass
npx prisma validate: pass
node --check src/services/billingEnforcement.js: pass
node --check src/queues/billingEnforcement.js: pass
npm run worker:billing: pass

Migration was required on the deployed DB because the billing enforcement tables were not present. npx prisma migrate deploy applied successfully.

payment_failed simulation

event: evt_test_payment_failed_001
result: pass

State moved:

active -> past_due -> overdue_warning
pastDueSince set
warningSentAt set
lastStripeEventId set
StripeEventLog created and processed
BillingEnforcementEvent created for stripe_payment_failed and send_billing_warning
Email/Discord attempted fail-soft through worker path

Stripe replay idempotency

Replayed evt_test_payment_failed_001.

result: pass
StripeEventLog detected duplicate
StripeEventLog count stable at 1
BillingEnforcementEvent count stable at 2
billing state unchanged

Final warning / backup block

Timestamp advanced beyond 72 hours.

result: pass at state/service level
billingStatus: final_warning
serverBillingState: backup_blocked
finalWarningSentAt set
backupBlockedAt set
existing backups untouched

Backup mutation route validation was not fully applicable because the disposable server was a dev server; game backup route returned 404 for the dev VMID as expected.

Suspension

Timestamp advanced beyond 120 hours.

result: pass at worker/state level
billingStatus: suspended
serverBillingState: suspended_shutdown
suspendedAt set
retentionUntil: 2026-05-17T16:37:44.597Z

Worker recorded:

suspend_customer_servers
shutdown_suspended_server for VMID 6090

Data safety confirmed:

Container DB record remained
Proxmox container remained
VMID 6090 final state: stopped

API gate validation

result: fail/blocker in deployed HTTP process

The running API process appears stale and was not serving the new billing-gated code.

Evidence while DB state was suspended:

POST /api/servers/6090/host/start returned 202 Accepted
POST /api/instances returned old PLAN_LIMIT_REACHED response instead of billing JSON
console/file responses followed old behavior rather than billing gate responses

Disposable server was re-stopped afterward through the billing worker.

Controller no-repair while suspended

result: not completed

Reason:

Test server was dev, so game edge drift suppression was not applicable.
Controller dry-run could not acquire lock in earlier validation window.

Needs disposable game server or safe controller lock window.

Payment restored

event: evt_test_invoice_paid_001
result: pass

State restored:

billingStatus: active
serverBillingState: allowed
past due/block/suspend timestamps cleared
lastStripeEventId set
service guards allowed provisioning, backups, and start
server remained stopped and was not auto-started

Replay of evt_test_invoice_paid_001 passed: duplicate detected, event count stable, state remained active.

Destructive action rejection

Enqueued forbidden action:

delete_suspended_server

Result:

partial pass
server/container was not deleted
ContainerInstance for VMID 6090 remained

Failure against expectation:

No BillingEnforcementEvent row was recorded for the rejected destructive action.

Cleanup

Test user restored to billingStatus=active and serverBillingState=allowed
Test server VMID 6090 remains stopped
DB record intact
Manual billing worker stopped
No permanent billing worker service installed/enabled

Bugs / blockers

1. Running API process needs restart/redeploy before HTTP gate validation can pass.
2. Rejected destructive billing actions do not currently write a BillingEnforcementEvent.
3. Controller no-repair validation still needs a disposable game server or safe controller lock window.

Completion status

Billing enforcement is not complete by the requested criteria yet because API gating and controller no-repair validation did not pass end-to-end in the deployed process.

Author
Owner

Validation update — billing enforcement end-to-end pass

Billing enforcement validation has now completed across the main launch criteria.

Test subjects

Test user: testuser1@zerolaghub.com
User ID: 4bc6e123-8ec9-4433-8a5f-2f7885a6f421
Test game server: VMID 5211, hostname mc-vanilla-5211
Dev test server used earlier: VMID 6090

API restart / redeploy

The stale API process was replaced and the current API is now running from updated code.

npm start PID: 38169
health check: http://127.0.0.1:4000/api/health passed

Preflight

npm run prisma:generate: pass
npx prisma validate: pass
node --check for billing service / billing worker / controller / repair paths: pass
Prisma migration: applied successfully

payment_failed simulation

result: pass

Billing state moved into overdue flow:

pastDueSince set
warningSentAt set
lastStripeEventId set
StripeEventLog row created
BillingEnforcementEvent rows created
email/Discord paths fail-soft

Stripe replay idempotency

result: pass

Replay of evt_test_payment_failed_001 did not duplicate state changes, events, warnings, or queue actions.

Final warning / backup block

result: pass
billingStatus: final_warning
serverBillingState: backup_blocked
finalWarningSentAt set
backupBlockedAt set

Backup mutation gates are wired to block create/restore/delete/prune for customers. Full live route validation still needs a game backup route/test fixture with backups available.

Suspension

result: pass
billingStatus: suspended
serverBillingState: suspended_shutdown
suspendedAt set
retentionUntil set

Data safety confirmed:

No customer data deleted
No backups deleted
No DNS records deleted
No Velocity records deleted
No containers deleted

API gate validation while suspended

result: pass after API restart

Observed:

/api/announcements returned billing suspension announcement
POST /api/instances: 402 BILLING_SUSPENDED
POST /api/servers/6090/host/start: 402 BILLING_SUSPENDED
POST /api/servers/6090/host/restart: 402 BILLING_SUSPENDED
POST /api/servers/6090/host/stop: allowed, 202
POST /api/servers/6090/console/command: 402 BILLING_SUSPENDED
PUT /api/servers/6090/files: 402 BILLING_SUSPENDED
GET /api/servers/6090/files/list: not billing-blocked, but timed out waiting for dev agent

Announcements

result: fixed and validated

Authenticated /api/announcements now includes a billing suspension announcement for affected users.

Destructive action rejection

result: fixed and validated

delete_suspended_server was rejected safely and now writes a BillingEnforcementEvent.

Confirmed:

server/container remained intact
no backups deleted
no destructive action performed
audit event written
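
The reject-and-audit behavior could look roughly like the sketch below. It uses an allowlist, so anything not listed (including `delete_suspended_server`) is refused; the actual worker may use an explicit denylist instead. `processBillingJob`, the `handlers` map, and the audit entry shape are hypothetical, and a plain array stands in for `BillingEnforcementEvent` rows.

```javascript
// Actions the billing worker is allowed to execute automatically.
const SAFE_ACTIONS = new Set([
  "send_billing_warning",
  "block_backups",
  "shutdown_suspended_server",
  "mark_server_retained",
]);

// Executes allowlisted jobs and records an audit entry for every job,
// including rejected ones.
function processBillingJob(job, handlers, auditLog) {
  if (!SAFE_ACTIONS.has(job.action)) {
    auditLog.push({
      action: job.action,
      vmid: job.vmid,
      outcome: "rejected",
      reason: "action_not_allowlisted",
    });
    return { executed: false };
  }
  handlers[job.action](job);
  auditLog.push({ action: job.action, vmid: job.vmid, outcome: "executed" });
  return { executed: true };
}
```

Writing the audit entry on the rejection path is the part that was missing in the earlier validation run, so it is worth covering with a dedicated test.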

Controller no-repair while suspended

result: pass

Using suspended test user and game server VMID 5211:

controller dry-run logged billing guard protection
recommended only billing-safe shutdown_suspended_server in dry-run
did not run edge observation for 5211
did not recommend/enqueue edge_republish
did not recommend/enqueue dns_republish
did not recommend/enqueue velocity_reregister
recent RepairEvent rows for 5211 were empty

Payment restored simulation

result: pass

invoice.paid restored state:

billingStatus: active
serverBillingState: allowed
manual actions allowed again
server was not automatically started

Cleanup

testuser1 restored to active / allowed
temporary billing worker not running
API remains running
VMID 5211 was not stopped by dry-run controller test
VMID 6090 remains stopped from prior test cleanup

Bugs fixed during validation

Stale API process was serving old ungated routes until restart
Destructive billing action rejection did not always write an audit event
Billing announcements were missing from customer announcements route

Remaining TODOs

No Portal billing UI work yet
Do not install/enable permanent billing worker service yet
File read/list should be tested against a server with a responsive agent
Backup API mutation gates are implemented, but full live validation needs a game backup fixture with backups available

Current completion status

Billing enforcement now passes the core launch validation criteria for:

payment_failed simulation
Stripe replay idempotency
final warning / backup block state
suspension / shutdown safety
API gates while suspended
controller does not repair suspended game servers
payment restored flow
destructive action rejection
no destructive deletion

Remaining launch work is primarily Portal customer-facing billing UI and any final backup/file route fixture validation.

Author
Owner

Validation update — Portal billing announcement visible

Portal customer-facing billing messaging has been visually validated through the announcements system.

Current launch approach:

Billing UI for launch is handled through authenticated announcements.
Users see billing suspension/status messaging through Portal announcements rather than a dedicated billing UI panel.

This means the Portal billing messaging requirement is satisfied for launch as long as announcements remain prominent and tied to billing state.

Remaining billing follow-ups:

- Decide when/how to run billing worker permanently
- File read/list validation against a responsive agent
- Backup route live validation with a game backup fixture

Portal billing UI can remain a later enhancement unless announcements prove insufficient.
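
Because the launch approach depends on announcements staying tied to billing state, one way to keep that link explicit is to derive the billing announcement from `billingStatus` on every request. The sketch below is illustrative only: the `Announcement` shape, the message wording, and the `final_warning` entry are assumptions; validation only confirmed a suspension announcement on `/api/announcements`.

```typescript
// Hypothetical sketch of deriving a billing announcement from billing state.
type BillingStatus =
  | "active" | "past_due" | "overdue_warning" | "final_warning"
  | "suspended" | "retained" | "pending_deletion";

// Assumed announcement shape; the real route may return different fields.
interface Announcement {
  id: string;
  severity: "info" | "warning" | "critical";
  message: string;
}

// Placeholder wording. Statuses without an entry produce no billing announcement.
const BILLING_MESSAGES: Partial<Record<BillingStatus, Announcement>> = {
  final_warning: {
    id: "billing-final-warning",
    severity: "warning",
    message: "Payment is overdue. Backup actions are blocked until payment is received.",
  },
  suspended: {
    id: "billing-suspended",
    severity: "critical",
    message: "Your servers are suspended for non-payment. Your data is retained.",
  },
};

function billingAnnouncement(status: BillingStatus): Announcement | null {
  return BILLING_MESSAGES[status] ?? null;
}

// Put the billing announcement first so it stays prominent in the Portal.
function announcementsFor(status: BillingStatus, general: Announcement[]): Announcement[] {
  const billing = billingAnnouncement(status);
  return billing ? [billing, ...general] : general;
}
```

Computing the announcement from state rather than storing it means it disappears as soon as the account returns to `active`, without a separate cleanup step.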

Author
Owner

Billing worker is now installed and running under systemd as zpack-billing-worker.service.

Launch service set now includes API, provisioning worker, repair worker, controller, and billing worker.

Billing worker remains scoped to billing enforcement actions only: warnings, final warnings, backup block, suspension shutdown, retained marking, and restore access. Destructive actions remain rejected.

Launch guardrail: do not add more worker or systemd services before launch unless there is a strong safety-boundary reason.

Recommended final checks:

- Verify systemctl status and recent journald logs for zpack-billing-worker.service
- Add/verify queue staleness monitoring for billing_enforcement
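
The queue staleness check recommended above could be as simple as comparing the age of the oldest pending `billing_enforcement` job against a threshold. The following is a rough sketch under assumptions: the `QueuedJob` shape, the function names, and the 15-minute threshold in the usage note are hypothetical, and the real queue storage is not described in this issue.

```typescript
// Hypothetical sketch of a queue staleness check for billing_enforcement.
interface QueuedJob {
  id: string;
  queue: string;
  enqueuedAt: Date; // when the job was placed on the queue
}

// Age in milliseconds of the oldest pending job on the queue, or 0 if it is empty.
function oldestPendingAgeMs(jobs: QueuedJob[], queue: string, now: Date): number {
  const ages = jobs
    .filter((job) => job.queue === queue)
    .map((job) => now.getTime() - job.enqueuedAt.getTime());
  return ages.length === 0 ? 0 : Math.max(...ages);
}

// A queue is stale when its oldest pending job has waited longer than maxAgeMs.
function isQueueStale(jobs: QueuedJob[], queue: string, now: Date, maxAgeMs: number): boolean {
  return oldestPendingAgeMs(jobs, queue, now) > maxAgeMs;
}
```

For example, calling `isQueueStale(pendingJobs, "billing_enforcement", new Date(), 15 * 60 * 1000)` on a timer and alerting when it returns true would surface a stopped or stuck worker even if systemd still reports the unit as active.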

Reference: jester/zlh-grind#14