feat(gain): colony gain drift + savings_drift_report MCP tool #575
Merged
- Median tokens-per-call comparison across non-overlapping windows
- Classifies up_drift / down_drift / new_tool / gone / insufficient_data / stable
- No schema change: reads the existing mcp_metrics table

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Adds a long-run regression detector that flags tools whose median tokens-per-call has drifted up or down between a baseline window and a recent window. Pure read path against existing `mcp_metrics` — no schema change.
Closes `⏳ Long-run regression detector that flags when a tool's median tokens-per-call drifts up` under README §v0.x "Receipts and observability".
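The classification step can be sketched roughly as follows. This is an illustration only, not the code in this PR: the label names come from the commit message, but `MIN_SAMPLES`, `DRIFT_RATIO`, the function name `classify_drift`, and the order in which the labels are checked are all assumptions made for the example. It is written in Python purely to stay dependency-free.

```python
from statistics import median_low
from typing import Sequence

# Both thresholds are hypothetical; the PR does not state its actual values.
MIN_SAMPLES = 5      # fewest calls per window needed before comparing medians
DRIFT_RATIO = 0.20   # relative change in the median that counts as drift


def classify_drift(
    baseline: Sequence[float],
    recent: Sequence[float],
    min_samples: int = MIN_SAMPLES,
    drift_ratio: float = DRIFT_RATIO,
) -> str:
    """Label one tool given its tokens-per-call samples in two windows.

    The windows are assumed to be non-overlapping, as in the PR.
    The precedence of the checks below is an assumption.
    """
    if not baseline and not recent:
        return "insufficient_data"
    if not baseline:
        return "new_tool"   # seen only in the recent window
    if not recent:
        return "gone"       # seen only in the baseline window
    if len(baseline) < min_samples or len(recent) < min_samples:
        return "insufficient_data"

    base = median_low(baseline)
    cur = median_low(recent)
    if base == 0:
        # Avoid dividing by zero; any growth from zero is treated as upward drift.
        return "up_drift" if cur > 0 else "stable"

    change = (cur - base) / base
    if change > drift_ratio:
        return "up_drift"
    if change < -drift_ratio:
        return "down_drift"
    return "stable"
```

With these assumed thresholds, a tool whose median moves from 100 to 150 tokens per call is labelled `up_drift`, while a move from 100 to 110 stays `stable`.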
OpenSpec
`openspec/changes/gain-drift-detector-2026-05-16/CHANGE.md`
Design adaptation
The original plan computed the per-operation median with a correlated scalar subquery using `LIMIT 1 OFFSET (COUNT(*)-1)/2`. SQLite forbids references to an outer aggregate inside a scalar subquery's `OFFSET`, so the query now uses `ROW_NUMBER() OVER (PARTITION BY operation ORDER BY tpc)` with a CTE join. The semantics are the same, the query is cleaner, and it is supported by the bundled better-sqlite3 (verified on SQLite 3.49.2).
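A minimal sketch of the window-function median, assuming a simplified `mcp_metrics` table. The column names `ts` and `tokens`, the helper names, and the demo rows are hypothetical; only `operation`, `tpc`, and the `ROW_NUMBER()` expression come from this PR. The real query lives in `storage.ts` and runs through better-sqlite3; Python's standard-library `sqlite3` is used here only so the example runs without dependencies. Selecting row `(n - 1) / 2 + 1` reproduces the zero-based `OFFSET (COUNT(*)-1)/2` of the original plan, which is the lower median.

```python
import sqlite3

# Column names ts and tokens are assumptions; the real schema may differ.
MEDIAN_SQL = """
WITH per_call AS (
    SELECT operation, tokens AS tpc
    FROM mcp_metrics
    WHERE ts >= ? AND ts < ?
),
ranked AS (
    SELECT operation, tpc,
           ROW_NUMBER() OVER (PARTITION BY operation ORDER BY tpc) AS rn
    FROM per_call
),
counts AS (
    SELECT operation, COUNT(*) AS n
    FROM per_call
    GROUP BY operation
)
SELECT r.operation, c.n, r.tpc AS median_tpc
FROM ranked AS r
JOIN counts AS c ON c.operation = r.operation
WHERE r.rn = (c.n - 1) / 2 + 1   -- integer division, so this is the lower median
ORDER BY r.operation
"""


def median_tpc_by_operation(conn: sqlite3.Connection, start_ts: int, end_ts: int) -> dict:
    """Return {operation: (sample_count, lower_median_tpc)} for [start_ts, end_ts)."""
    rows = conn.execute(MEDIAN_SQL, (start_ts, end_ts)).fetchall()
    return {operation: (n, median) for operation, n, median in rows}


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory table with made-up rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mcp_metrics (operation TEXT, ts INTEGER, tokens INTEGER)"
    )
    conn.executemany(
        "INSERT INTO mcp_metrics VALUES (?, ?, ?)",
        [
            ("search", 1, 100), ("search", 2, 300), ("search", 3, 200),
            ("recall", 1, 50), ("recall", 2, 70), ("recall", 3, 90), ("recall", 4, 60),
            ("search", 10, 999),  # outside the [0, 10) window used in the usage note
        ],
    )
    return conn
```

Calling `median_tpc_by_operation(demo_connection(), 0, 10)` on the demo rows gives a median of 200 for `search` (three calls) and 60 for `recall` (four calls, lower median). Running the same query once per window yields the two medians that the drift comparison needs.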
Test plan
Merge order
Touches `packages/storage/src/storage.ts`. If #573 and #574 (scenarios) merge first, this should be third. The coach-mode PR also extends `storage.ts` independently — whichever lands second needs a trivial rebase.
🤖 Generated with Claude Code