Skip to content

feat(payments): subscription billing rearchitecture + bindings v0.16.0#132

Merged
drewstone merged 1 commit into
mainfrom
fix/zero-operator-bill-livelock
May 13, 2026
Merged

feat(payments): subscription billing rearchitecture + bindings v0.16.0#132
drewstone merged 1 commit into
mainfrom
fix/zero-operator-bill-livelock

Conversation

@drewstone
Copy link
Copy Markdown
Contributor

Closes ten prior audit findings on the subscription billing path with one cohesive redesign. Substantive contract-behavior changes — Rust bindings, indexer, and dapp must update.

Bill amount

  • TWAP-fair, capped at nominal. Bill = nominalRate × cumDeltaPeriod / (baseline × interval), clamped to nominalRate. Operators ramping stake mid-period cannot inflate a customer's bill; ramp-down still reduces it. Kills both customer-overpayment surprise AND the rate-vs-amount termination livelock window.
  • Baseline pinned at activation (not first bill). Closes the stake-ramp attack window the previous lazy-init path opened.
  • Manager QoS hook computeBillAdjustmentBps(serviceId, periodStart, periodEnd) → uint16 can discount the bill (clamped `[0, 10_000]`), called via gas-capped staticcall. queryDeveloperPaymentAddress is now gas-capped too. Malicious managers cannot grief keepers.
  • Cursor commit deferred until after the escrow-balance check passes — a failed try-bill no longer leaves cursors advanced. Closes the free-period exploit on the batch path that the final security audit caught.
  • Dust bills skip cleanly via minBillAmount rather than reverting deep in _distributeBill. A manager hook returning a tiny qosBps can't brick a service.
  • Overflow reverts with BillingArithmeticOverflow instead of silently returning nominal.

Operator payout

  • Same weights drive bill amount AND payout share. weights[i] = cumDelta[i] × exposureBps[i]. Customer-fairness and operator-fairness coupled — an operator who ramps stake earns a larger slice of the SAME (capped) pool.
  • Zero-operator skip. Empty active set advances the cursor without billing, emits SubscriptionBillSkippedNoOperators, escrow untouched. No livelock when a Dynamic service loses all operators.
  • Staker pool refund-to-escrow when the fee distributor is unset or reverts (was: silent capture by treasury). ERC20 already-transferred failures surfaced via StakerShareRefundedToEscrow with reason.

Payment split

  • PaymentSplit gained a fifth field keeperBps. setPaymentSplit requires the 5-field sum equal 10_000. paymentSplit() returns a 5-tuple.
  • Default unchanged at 20/20/40/20/0 — admins opt in to a non-zero keeper rebate via setPaymentSplit per network policy.
  • Non-subscription distributions (PayOnce, RFQ, per-job) fold keeperBps into the operator pool so totals still sum to 10_000.

EventDriven funding

  • Upfront paymentAmount > 0 at request rejected with UpfrontPaymentNotAllowedForEventDriven. Customer funds never collected on a path that can't pay them out properly. Quote-create flow has the matching guard.

Activation paths

  • _handleInitialPayments (request flow) and _activateQuoteService (quote flow) both seed per-op TWAP cursors and pin subscriptionBaselineStake at activation. A bill against an unbaselined service reverts loudly with SubscriptionBaselineNotInitialized instead of silently re-seeding.

Per-service operator ceiling

  • ProtocolConfig.MAX_OPERATORS_PER_SERVICE = 64. maxOperators = 0 on a blueprint config is interpreted as "use protocol ceiling" — never "unlimited". Bounds every per-operator loop in the bill / distribute / terminate paths. Closes the storage-audit Critical that let a blueprint creator unbound the bill loop.

Events / errors / dead code

  • New events: SubscriptionBillSkippedNoOperators, SubscriptionBillAdjustedByManager, KeeperRebateAccrued, StakerShareRefundedToEscrow, SubscriptionBaselineInitialized. PaymentSplitUpdated extended with keeperBps.
  • New errors: BillingArithmeticOverflow, SubscriptionBaselineNotInitialized, UpfrontPaymentNotAllowedForEventDriven.
  • Removed: PaymentLib.calculateOperatorPayments, validatePaymentAmount, bpsShareRoundUp, divUp, _seedTwapCursorForOp, _aggregateOperatorStakeAtNow. Orphaned test/payments/EffectiveExposurePayments.t.sol deleted.

Audits

Four independent reviewers ran against this branch (security, architectural correctness, hardening/gas/code-quality, timing+storage). Their actionable findings:

# Severity Status
Security HIGH: cursor-mutation persistence across failed try-bill High Fixed (commit-deferred cursors)
Security MEDIUM: quote-path EventDriven bypass Medium Fixed (guard added in QuotesCreate)
Storage CRITICAL C-1: maxOperators = 0 unbounded Critical Fixed (MAX_OPERATORS_PER_SERVICE = 64)
10 architectural-correctness items All Fixed
Hardening: dead helpers / gas caps / naming Mostly Minor Fixed
Hardening: 1 Blocking (return-arity) Blocking Fixed
Timing H-1 (job RFQ requester binding) High Captured below
Timing H-2 (service-quote freshness gap) High Captured below
Timing H-3 (extendServiceFromQuotes cumulative TTL) High Captured below
Storage C-2 (O(N) stake reads per bill) Critical (perf) Captured below
Storage H-1 (ValidatorPodManager unbounded delegator slash) High Captured below
Storage H-3 (claimRewardsAll bricked by single bad token) High Captured below

Tests

  • test/security/SubscriptionBilling.t.sol (renamed from F5TWAPBilling.t.sol): 15 tests — 7 unit + 6 fuzz × 1000 runs. Covers bill cap, zero-op skip, keeper rebate, QoS waiver, EventDriven rejection, try-bill cursor invariant, overflow revert.
  • All PaymentSplit literals + destructures migrated across test/ and script/ for the 5-tuple.
  • Fixed two pre-existing test failures that reproduce on main, both rooted in the same via-IR optimizer quirk where the optimizer caches block.timestamp across vm.warp calls: test_GovernorSelfUpgrade / test_TokenUpgradeViaGovernance in Governance.t.sol, and testFuzz_DisputeWindow_Enforcement in SlashingFuzz.t.sol (1000 runs).
  • Full regression: 1,445 / 1,445 ✅ across 94 suites.

Bindings + docs

  • bindings v0.16.0 regenerated; CHANGELOG written. Will publish to crates.io after merge.
  • docs/PRICING.md rewritten end-to-end against the v0.16.0 source.
  • README.md updated for the 5-tuple split + payment-model summary.
  • docs/DEPLOYMENT_RUNBOOK.md + docs/full-deploy.md path corrections.
  • Stale AUDIT-*.md, pursue-launch-readiness.md, COMMIT_MSG.md removed.

Local testnet smoke

Deployed end-to-end against a fresh anvil and exercised: PaymentSplit 5-tuple return, Subscription escrow drain, operator pending rewards, EventDriven UpfrontPaymentNotAllowedForEventDriven revert at request. All ✅.

Flagged from smoke: TanglePaymentsFacet + TangleServicesFacet exceed EIP-170 (24,576 bytes) — anvil deploys only with --disable-code-size-limit. Will block mainnet/L2 deployment until the facets are split. Captured below.

Follow-up (separate PRs)

  • Storage C-2: O(1) _operatorStakeForAsset aggregate to remove the per-bill stake-derivation loop.
  • Storage H-1: ValidatorPodManager share-pool delegator slashing.
  • Storage H-3: try/catch on claimRewardsAll per-token transfer.
  • Timing H-1: bind requester on job-RFQ quote verification.
  • Timing H-2: enforce maxQuoteAge on service-creation quotes.
  • Timing H-3: cap cumulative TTL on extendServiceFromQuotes.
  • Contract-size: split TanglePaymentsFacet and TangleServicesFacet to meet EIP-170 before mainnet.

Test plan

  • forge build clean
  • forge test --no-match-path test/Integration.t.sol: 1445/1445 ✅
  • test/security/SubscriptionBilling.t.sol: 15/15 ✅ (incl. 6 × 1000 fuzz runs)
  • Pre-existing governance + dispute-window-fuzz failures fixed
  • Local testnet smoke: deploy + subscription bill + EventDriven reject ✅
  • Mainnet contract-size split (separate PR; captured above)
  • Publish tnt-core-bindings v0.16.0 to crates.io after merge
  • Downstream consumer updates (blueprint, dapp) for the 5-tuple PaymentSplit ABI break — separate PRs in those repos

Closes ten prior audit findings on the subscription billing path with one
cohesive redesign. Substantive contract behavior changes — downstream
consumers (Rust bindings v0.16.0, indexer, dapp) must update.

## What changed

**Bill amount**
- TWAP-fair from activation: `nominalRate × cumDeltaPeriod / (baseline ×
  interval)`, capped at `nominalRate`. Operators ramping stake mid-period
  cannot inflate a customer's bill; ramp-down still reduces it.
- Baseline pinned at activation (not first bill). Removes the stake-ramp
  attack window that the previous lazy-init path opened.
- Manager QoS hook `computeBillAdjustmentBps(serviceId, periodStart,
  periodEnd)` can discount the bill (clamped `[0, 10_000]`); called via
  gas-capped staticcall. `queryDeveloperPaymentAddress` is now gas-capped
  too. Malicious managers cannot grief keepers.
- Per-operator TWAP cursors are mutated only AFTER the escrow-balance check
  passes — a failed try-bill no longer leaves cursors advanced (closes the
  free-period exploit on the batch path).
- Dust bills (rounding to less than 1 wei per recipient) skip cleanly via
  `minBillAmount` rather than reverting deep in `_distributeBill`.
- Overflow in `twapBillAmount` now reverts with `BillingArithmeticOverflow`
  instead of silently returning nominal.

**Operator payout**
- Same `cumDelta × exposureBps` weights drive bill amount AND payout share —
  customer-fairness and operator-fairness coupled. An operator who ramped
  stake earns a larger slice of the SAME (capped) pool.
- Zero active operators: bill is skipped, cursor advances, escrow untouched,
  `SubscriptionBillSkippedNoOperators` emitted. No livelock when a Dynamic
  service loses all its operators.
- Staker pool refund-to-escrow when the fee distributor is unset or reverts
  (was: silent capture by treasury).
- `claimRewardsAll` semantics unchanged for the user; the keeper rebate
  uses the same pull-pattern mapping.

**Payment split**
- `PaymentSplit` gained a fifth field `keeperBps`. `setPaymentSplit` requires
  the five-field sum equal 10_000. `paymentSplit()` returns a 5-tuple.
- Default split unchanged at 20/20/40/20/0 — admins opt in to a non-zero
  keeper rebate via `setPaymentSplit` per network policy.
- Non-subscription distributions (PayOnce, RFQ, per-job) fold `keeperBps`
  into the operator pool so the total still sums to 10_000.

**EventDriven funding**
- Upfront `paymentAmount > 0` at request is rejected with
  `UpfrontPaymentNotAllowedForEventDriven`. Customer funds are never
  collected on a path that cannot pay them out properly. Quote-create flow
  has the matching guard.

**Activation paths**
- `_handleInitialPayments` (request flow) and `_activateQuoteService`
  (quote flow) both seed per-op TWAP cursors and pin `subscriptionBaselineStake`
  at activation. A bill against an unbaselined service reverts loudly with
  `SubscriptionBaselineNotInitialized` instead of silently re-seeding.

**Per-service operator ceiling**
- `ProtocolConfig.MAX_OPERATORS_PER_SERVICE = 64`. `maxOperators = 0` on a
  blueprint config is interpreted as "use the protocol ceiling" — never
  "unlimited". Bounds every per-operator loop in the bill / distribute /
  terminate paths. Hard fix for the storage-audit Critical that let a
  blueprint creator unbound the bill loop.

**Events**
- New: `SubscriptionBillSkippedNoOperators`, `SubscriptionBillAdjustedByManager`,
  `KeeperRebateAccrued`, `StakerShareRefundedToEscrow`,
  `SubscriptionBaselineInitialized`.
- `PaymentSplitUpdated` extended with `keeperBps`.

**Errors**
- New: `BillingArithmeticOverflow`, `SubscriptionBaselineNotInitialized`,
  `UpfrontPaymentNotAllowedForEventDriven`.

**Dead code removed**
- `PaymentLib.calculateOperatorPayments`, `validatePaymentAmount`,
  `bpsShareRoundUp`, `divUp` — superseded by inline per-weight distribution
  and `minBillAmount`. Orphaned test file deleted.
- `_seedTwapCursorForOp`, `_aggregateOperatorStakeAtNow` — unreachable
  helpers stripped.

## Audits

Four independent reviewers ran against this branch:
- Security: 0 Critical / 1 High (cursor-mutation persistence — fixed in this
  commit) / 1 Medium (quote-path EventDriven bypass — fixed) plus Lows
- Architectural correctness: all 10 prior audit findings verified fixed
- Hardening + gas + code quality: dead-code cleanup, gas captures, naming
- Timing intervals + storage / O(1) audits: 3 timing Highs and 2 storage
  Criticals filed; storage C-1 (`maxOperators = 0` unbounded) fixed inline,
  remaining items captured as follow-up issues.

## Tests

- New `test/security/SubscriptionBilling.t.sol` (renamed from F5TWAPBilling):
  15 tests — 7 unit + 6 fuzz (× 1000 runs each), covering bill cap, zero-op
  skip, keeper rebate, QoS waiver, EventDriven rejection, try-bill cursor
  invariant, and overflow revert.
- Updated all PaymentSplit literals + destructures across `test/` and
  `script/` for the 5-tuple.
- Fixed two pre-existing test failures rooted in the same via-IR optimizer
  quirk on `vm.warp` + `block.timestamp`: `test_GovernorSelfUpgrade` /
  `test_TokenUpgradeViaGovernance` (Governance), and
  `testFuzz_DisputeWindow_Enforcement` (SlashingFuzz, 1000 runs).
- Full regression: 1445 tests pass across 94 suites.

## Bindings + docs

- `bindings v0.16.0`: regenerated, CHANGELOG written.
- `docs/PRICING.md` rewritten end-to-end against v0.16.0 source.
- `README.md` updated for the 5-tuple split + new payment-model summary.
- `docs/DEPLOYMENT_RUNBOOK.md` + `docs/full-deploy.md` path corrections.
- Stale AUDIT-*.md / pursue-launch-readiness.md / COMMIT_MSG.md removed.

## Follow-up (NOT in this PR)

Captured for future PRs:
- Storage C-2: O(1) operator-asset stake aggregate (perf, not correctness).
- Storage H-1: ValidatorPodManager unbounded delegator array on slash.
- Storage H-3: `claimRewardsAll` try/catch on per-token transfer.
- Timing H-1: Job RFQ quote `requester` binding not verified.
- Timing H-2: Service-creation quote freshness gap.
- Timing H-3: `extendServiceFromQuotes` cumulative TTL cap.
- Contract-size: TanglePaymentsFacet + TangleServicesFacet exceed EIP-170
  (deploys on anvil require `--disable-code-size-limit`); needs facet split
  before mainnet.
@drewstone drewstone merged commit 8b2777b into main May 13, 2026
1 check failed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant