perf(api): make play-count milestones incremental, single pass#896
Merged
Conversation
The p1/p2/p3 play milestones (250/1k/10k plays, verified artists only) each ran a non-incremental GROUP BY every tick: a Parallel Seq Scan of ~485k tracks producing ~1,867 rows in ~49s, times three tiers. That one processor was the dominant cost of IndexChallengesJob. Merge the three into a single PlayCountMilestonesProcessor that checkpoints plays.id and recomputes only the verified artists whose tracks were played since the checkpoint, deriving all three tiers in one pass. Tier eligibility gates on the live count reaching the previous tier's threshold, which is equivalent to Python's "previous milestone complete" check and removes the cross-tick cascade. Against prod (read-only EXPLAIN ANALYZE): the dirty scan is ~1.4s for a 200k-play catch-up window (far less per normal tick) and the bounded recompute is ~113ms for 50 users, versus 48.7s x 3 every tick before. Migration 0215 seeds the new checkpoint to max(plays.id) so prod starts "from now" rather than backfilling the whole plays table. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2 tasks
raymondjacobson
added a commit
that referenced
this pull request
Jun 2, 2026
The challenge loop runs ~20 processors serially per tick but logged only the total cycle duration, so the loop's ~150s/tick cost in prod couldn't be attributed to a specific processor. After #896 removed the heaviest (play-count milestones), the remaining cost is unidentified. Time each processor, log any over a 1s threshold with its challenge_id, and add the slowest processor + its duration to the cycle summary line. Sub-second processors stay quiet to avoid a line-per-tick per processor. Observability only — no change to processing behavior. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
raymondjacobson
added a commit
that referenced
this pull request
Jun 2, 2026
## Summary `IndexChallengesJob` runs ~20 challenge processors serially per tick but logged only the **total** cycle duration. In prod the cycle steadily takes ~150s/tick. After #896 made the play-count milestones incremental (was ~49s×3/tick → ~2s), the remaining ~150s is unattributed — and the loop gives us no way to see which processor owns it. This adds per-processor timing so the next bottleneck is found by measurement, not guessing: - Time each `runProcessor` call. Log any processor over a **1s** threshold with its `challenge_id` and duration. Sub-second processors stay quiet so the loop logs at most a few lines per tick. - Add `slowest_id` + `slowest_duration` to the existing "Reconciled challenges" cycle summary line. - Include `duration` on the existing `processor failed` error log too. Observability only — no change to processing behavior or transaction handling. ## Test plan - [x] `go build ./...`, `go vet ./jobs/` - [ ] After deploy: confirm the "slow challenge processor" lines identify which processor(s) dominate the ~150s cycle (suspected: the trending challenge processors) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
3 tasks
raymondjacobson
added a commit
that referenced
this pull request
Jun 2, 2026
## Summary `FirstWeeklyComment` (challenge `c`) was the dominant cost of `IndexChallengesJob` — ~100s per 30s tick in prod (whole loop ~120-133s, surfaced by the per-processor timing added in #897). Every tick it rescanned the **entire** `comments` table (~321k rows), re-derived `DISTINCT (user_id, ISO-year, ISO-week)` across all history, and re-upserted a `user_challenge` row for every (user, week) pair. This converts it to the same checkpoint + dirty-set pattern already used by `TrackUpload` (#875) and `PlayCountMilestones` (#896): checkpoint `comments.blocknumber` (monotonic, bumped on every insert/update/delete) and recompute only the users whose comments changed since the checkpoint. The existing `EXTRACT(ISOYEAR)/EXTRACT(WEEK)` logic and `is_delete=false AND is_visible=true` filter are preserved exactly for parity; the per-user recompute just adds `AND user_id = ANY($1)`. This is the last big full-rescan challenge processor. `ProfileVerified` (`v`, ~4-17s) and `Tastemaker` (`t`, ~8-9s) are secondary and out of scope. ## Changes - `jobs/challenges/first_weekly_comment.go` — `Reconcile` now `LoadChallenge` → `reconcileIncrementalUsers(commentCheckpoint, commentDirtySQL, recompute)`, mirroring `track_upload.go`. New const `commentCheckpoint = "challenges:c:last_blocknumber"`. - `ddl/migrations/0216_comments_blocknumber_idx.sql` — `CREATE INDEX CONCURRENTLY` on `comments(blocknumber)`. Unlike the other source tables, `comments` had no blocknumber index, so the dirty scan would otherwise be a 321k-row seq scan each tick. Mirrors #897's `user_events` index migration (0209). - `ddl/migrations/0217_seed_first_weekly_comment_checkpoint.sql` — seeds `challenges:c:last_blocknumber` to `max(comments.blocknumber)` so prod starts "from now" instead of re-deriving all history. Mirrors 0215 (`ON CONFLICT DO NOTHING`). - `first_weekly_comment_test.go` — seeded comments now carry a `blocknumber`; tests don't run migrations, so the checkpoint stays 0 and the dirty scan picks up all fixtures. Existing assertions unchanged. ## Test plan - [x] `go build ./...` - [x] `go vet ./jobs/...` - [x] `go test ./jobs/challenges/` (incl. `TestFirstWeeklyComment_OneRowPerUserPerWeek`) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The p1/p2/p3 play milestones (250/1k/10k plays, verified artists only) each ran a non-incremental
GROUP BYevery tick — a Parallel Seq Scan of ~485k tracks producing ~1,867 rows in ~49s, times three tiers. Measuringpg_stat_statements, this one processor was the dominant cost ofIndexChallengesJoband the reason a per-cadence "fast/medium/slow" job taxonomy looked tempting (see #889). Fix the processor instead and the single serial loop stays simple.NewPlayCount250/1000/10000Processorinto onePlayCountMilestonesProcessorthat checkpointsplays.idand recomputes only the verified artists whose tracks were played since the checkpoint (thereconcileIncrementalUsersdirty-set pattern), deriving all three tiers in a single pass.step_count, so this is equivalent to Python's "previous milestone complete" check — and it removes the old cross-tick cascade (p2 no longer has to wait a tick for p1 to commit).0215seeds the new checkpoint tomax(plays.id)so prod starts "from now" rather than backfilling the wholeplaystable (mirrors0208).IndexChallengesJobkeeps its single serial loop — three registrations collapse to one.aggregate_monthly_playshas no monotonic column (updated in place), soplays.idis the high-water mark; the dirty scan joins throughtracks/usersto surface only verified, non-deactivated artists.Measurements (read-only
EXPLAIN ANALYZEon the write primary)Test plan
go build ./...,go vet ./jobs/...go test ./jobs/challenges/— updatedplay_count_milestones_test.goto seedplaysrows (drive the dirty scan) and assert the merged single-pass tier gating: verified-only, p1→p2→p3 thresholds in one runIndexChallengesJob"Reconciled challenges" duration drops and milestoneuser_challengesrows keep landing🤖 Generated with Claude Code