feat(buzz-acp): add goose usage adapter for NIP-AM turn metrics #1446

Draft
wpfleger96 wants to merge 3 commits into main from duncan/nip-am-goose-adapter

wpfleger96 commented Jul 1, 2026

Summary

Adds the Goose harness adapter for NIP-AM agent turn metrics (kind 44200). This is Task B of Phase 2; Task A (kind/relay plumbing) is #1445.

When a Goose agent session is active and the client advertises clientCapabilities._meta.goose.customNotifications: true, Goose emits a _goose/unstable/session/update notification at the end of every turn. This PR captures those notifications, computes per-turn deltas, and exposes the result via AcpClient::take_turn_usage() for TurnCompletionGuard (Phase 3).

Changes

crates/buzz-acp/src/goose_usage.rs (new)

Wire types:

  • GooseSessionUpdateNotification — top-level params with sessionId and update
  • GooseSessionUpdateVariant — discriminated union; UsageUpdate or Other (unknown variants ignored)
  • GooseUsageUpdatePayload: accumulatedInputTokens, accumulatedOutputTokens, accumulatedCost

Tracker state:

  • GooseUsageTracker — per-session baseline map + turn-scoped in-flight state
  • SessionState — cumulative snapshots and monotonic turn_seq per session
  • GooseTurnUsage — per-turn record returned to TurnCompletionGuard

Turn lifecycle:

  1. begin_turn(session_id) — marks the tracker in-flight before session/prompt is sent; clears leftover pending; ensures setup notifications (fired during session/new) update the baseline but do NOT produce a publishable record for the first real turn
  2. record(session_id, payload) — always advances the cumulative baseline; only sets pending when in_flight_session matches
  3. take() — drains pending and clears the in-flight marker
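
The three lifecycle steps above can be sketched as follows. This is a minimal illustration, not the code in goose_usage.rs: `Tracker`, `Snapshot`, and `Pending` are hypothetical stand-ins, and the real tracker also carries cost and a per-session `turn_seq`, which are omitted here.

```rust
use std::collections::HashMap;

/// Cumulative counters from one usage update (hypothetical, simplified shape).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Snapshot {
    input_tokens: u64,
    output_tokens: u64,
}

/// What a turn hands back: the baseline before the update and the update itself.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Pending {
    previous: Option<Snapshot>,
    current: Snapshot,
}

/// Simplified stand-in for the tracker: per-session baselines plus turn-scoped state.
#[derive(Default)]
struct Tracker {
    baselines: HashMap<String, Snapshot>,
    in_flight_session: Option<String>,
    pending: Option<Pending>,
}

impl Tracker {
    /// Step 1: called before session/prompt is sent. Marks the turn as
    /// in flight and drops any leftover pending record.
    fn begin_turn(&mut self, session_id: &str) {
        self.in_flight_session = Some(session_id.to_string());
        self.pending = None;
    }

    /// Step 2: always advances the baseline; stages a pending record only
    /// when the update belongs to the in-flight session.
    fn record(&mut self, session_id: &str, current: Snapshot) {
        let previous = self.baselines.insert(session_id.to_string(), current);
        if self.in_flight_session.as_deref() == Some(session_id) {
            self.pending = Some(Pending { previous, current });
        }
    }

    /// Step 3: drains the pending record and clears the in-flight marker.
    fn take(&mut self) -> Option<Pending> {
        self.in_flight_session = None;
        self.pending.take()
    }
}
```

Under these assumptions, a `record` call that arrives before any `begin_turn` (for example during session/new) leaves `take()` returning `None`, yet its snapshot becomes the `previous` baseline of the first real turn.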

Delta computation (NIP-AM compliant):

  • First turn (no baseline): delta_reliable: false, null turn fields, cumulative populated
  • Token counter decrease: delta_reliable: false, null turn fields (no negative deltas)
  • Cost counter decrease: delta_reliable: false, null ALL turn fields (not just cost)
  • Cost absent on either side: null cost, reliable tokens (not an error)
  • Session restart (new session_id): treated as first turn
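
A hedged sketch of how these rules could compose into a single function. The names `Cumulative`, `TurnDelta`, `UNRELIABLE`, and `compute_delta` are illustrative assumptions rather than the PR's actual types, and the cumulative fields and `turn_seq` that the real record carries are left out.

```rust
/// Session-cumulative counters from one usage update (illustrative field names).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Cumulative {
    input_tokens: u64,
    output_tokens: u64,
    cost: Option<f64>,
}

/// Per-turn result; `None` stands for a null turn field.
#[derive(Clone, Copy, Debug, PartialEq)]
struct TurnDelta {
    turn_input_tokens: Option<u64>,
    turn_output_tokens: Option<u64>,
    turn_cost: Option<f64>,
    delta_reliable: bool,
}

/// Shared result for every unreliable case: all turn fields null.
const UNRELIABLE: TurnDelta = TurnDelta {
    turn_input_tokens: None,
    turn_output_tokens: None,
    turn_cost: None,
    delta_reliable: false,
};

fn compute_delta(prev: Option<Cumulative>, cur: Cumulative) -> TurnDelta {
    // First turn, or a new session id with no stored baseline.
    let Some(prev) = prev else {
        return UNRELIABLE;
    };

    // checked_sub yields None when a counter went backwards, so a
    // negative delta can never be produced.
    let (Some(input), Some(output)) = (
        cur.input_tokens.checked_sub(prev.input_tokens),
        cur.output_tokens.checked_sub(prev.output_tokens),
    ) else {
        return UNRELIABLE;
    };

    let turn_cost = match (prev.cost, cur.cost) {
        // A cost decrease invalidates the whole turn, not just the cost field.
        (Some(p), Some(c)) if c < p => return UNRELIABLE,
        (Some(p), Some(c)) => Some(c - p),
        // Cost absent on either side: null cost, tokens stay reliable.
        _ => None,
    };

    TurnDelta {
        turn_input_tokens: Some(input),
        turn_output_tokens: Some(output),
        turn_cost,
        delta_reliable: true,
    }
}
```

Returning early on a cost decrease keeps the "null ALL turn fields" rule in one place, so the token deltas computed earlier are simply discarded.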

crates/buzz-acp/src/acp.rs

  • Advertise clientCapabilities._meta.goose.customNotifications: true in the initialize request
  • handle_goose_usage_update() — parses _goose/unstable/session/update notifications and calls goose_usage.record()
  • take_turn_usage() — public method for TurnCompletionGuard to drain the per-turn record
  • session_prompt_blocks_with_idle_timeout calls goose_usage.begin_turn(session_id) before sending the prompt

Delta computation edge cases

| Scenario | delta_reliable | turn fields |
| --- | --- | --- |
| First turn | false | all null |
| Token counter decrease | false | all null |
| Cost counter decrease | false | all null |
| Cost absent (either side) | true | tokens set, cost null |
| Normal turn | true | all set |

Related

…e 2 Task B)

Advertise `clientCapabilities._meta.goose.customNotifications: true` at
initialize so goose emits `_goose/unstable/session/update` notifications
carrying session-cumulative token counts at turn completion.

Add `GooseUsageTracker` (new `goose_usage.rs`) that:
- Deserializes the `_goose/unstable/session/update` wire payload
- Stores per-session cumulative state (`sessionId`, `turnSeq`, last snapshot)
- Computes per-turn deltas per NIP-AM rules: first-turn no-prior → null +
  deltaReliable:false; counter decrease → null + false; session restart
  (new sessionId) → treated as first turn
- Exposes a `GooseTurnUsage` record via `take()` for consumption by the
  TurnCompletionGuard emit hook (sequential next task)

Wire both dispatch arms (`read_until_response` and
`read_until_response_with_idle_timeout`) to handle the new method,
mirroring the existing `session/update` pattern. Non-goose harnesses are
unaffected: no capability advertised, no dispatch, no state kept.

References #1441 (NIP-AM spec)

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
npub1mn7jgtj4w2pd0g0zeuhxsa6jy6p0rewxz4kujt98my82ahfmp72sxjexk7 and others added 2 commits July 1, 2026 18:50
…unreliable gap

Two Thufir-flagged IMPORTANT fixes for PR #1446.

Turn scoping (setup usage misattributed to zero-update turn):
- Add in_flight_session: Option<String> field to GooseUsageTracker.
- Add begin_turn(session_id) method: sets in_flight_session and clears
  pending. Must be called before session/prompt is sent.
- record() now only sets pending when in_flight_session matches session_id.
  It ALWAYS updates the sessions baseline so the next real turn gets a
  correct delta even from setup notifications.
- take() clears in_flight_session after draining pending.
- Call goose_usage.begin_turn(session_id) at the top of
  session_prompt_blocks_with_idle_timeout, before sending the prompt.
- Setup notifications that arrive during session/new now correctly update
  the baseline without polluting the first real turn's pending record.
- New tests: setup_notification_before_begin_turn_returns_none (verifies
  baseline still feeds next delta), record_outside_in_flight_does_not_
  clobber_pending.

Cost counter decrease -> deltaReliable:false (Fix 2):
- When both snapshots have cost and current_cost < prev_cost, the computed
  delta would be negative — NIP-AM requires delta_reliable: false and all
  turn fields nulled (same as token-decrease path).
- The match arm now returns (None, false) for cost decrease; the outer
  if/else then overrides delta_reliable=false and nulls turn_input/output.
- Cost merely absent on either side stays as-is (null cost, reliable tokens).
- turn_seq still increments on cost-decrease turns (Thufir-endorsed).
- New tests: cost_decrease_sets_delta_unreliable_and_nulls_all_turn_fields,
  cost_absent_on_one_side_leaves_tokens_reliable.

Existing goose_usage unit tests and acp.rs integration tests updated to call
begin_turn() before record(), matching the real call flow.

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>
Pure formatting pass, no logic changes. Fixes the `just fmt-check` failure
in CI (Rust Lint job 84654119247) by wrapping long lines in acp.rs and
goose_usage.rs (record signature, assert! calls).

Co-authored-by: Will Pfleger <pfleger.will@gmail.com>
Signed-off-by: Will Pfleger <pfleger.will@gmail.com>