Skip to content

perf(api): stop TrendingJob from continuously rewriting score tables#890

Merged
raymondjacobson merged 2 commits into
mainfrom
api/p1-trending-perf
Jun 1, 2026
Merged

perf(api): stop TrendingJob from continuously rewriting score tables#890
raymondjacobson merged 2 commits into
mainfrom
api/p1-trending-perf

Conversation

@raymondjacobson
Copy link
Copy Markdown
Member

Summary

TrendingJob was the dominant source of IO starvation on the indexer. It ran on a 10s schedule, but each run took minutes and did two expensive things on every cycle:

  1. Two non-concurrent REFRESH MATERIALIZED VIEW calls (aggregate_interval_plays, trending_params). A plain refresh takes an ACCESS EXCLUSIVE lock for the duration of the (multi-minute, plays-table-scanning) rebuild — blocking every reader, including the block-indexing loop's downstream queries.
  2. A full DELETE-then-INSERT of the ~16GB track_trending_scores table (and the analogous playlist table), rewriting ~12.7M rows × 6 indexes per cycle.

On a 10s schedule with multi-minute runtimes, this ran effectively continuously and IO-starved block indexing.

Changes

  • Concurrent matview refresh. Use REFRESH MATERIALIZED VIEW CONCURRENTLY, with a graceful fallback to a blocking refresh when the matview is unpopulated (fresh/test DB → SQLSTATE 0A000) or the required unique index isn't present yet (pre-migration window of a rolling deploy → SQLSTATE 55000). This avoids a hard ordering dependency on the migration.
  • Migration 0214 promotes the track_id indexes on aggregate_interval_plays and trending_params to UNIQUE (a prerequisite for CONCURRENTLY). Verified in prod that both matviews are exactly one row per track (COUNT(*) == COUNT(DISTINCT track_id)), so the unique constraint is valid. Statements are intentionally not wrapped in a transaction so each index build autocommits.
  • Upsert + prune instead of delete + insert. Scores are staged into an ON COMMIT DROP temp table, then merged with INSERT ... ON CONFLICT (...) DO UPDATE ... WHERE col IS DISTINCT FROM EXCLUDED.col (skip-unchanged), followed by an anti-join DELETE ... WHERE NOT EXISTS to prune rows that left the source set. This stops the per-cycle full-table churn that bloated the table and its indexes.
  • Hourly cadence. Schedule TrendingJob every 1h instead of 10s, matching discovery's effective cadence: apps' celery beat ticked index_trending every 10s but an internal gate (trending_refresh_seconds, default 3600) only recomputed once an hour. The vendored ETL port had dropped that gate.

Notes

Test plan

  • go test ./jobs/... — existing TestTrendingJob_PopulatesScores passes; new TestTrendingJob_PrunesStaleRows covers the anti-join prune (seed track, run, delete track, re-run, assert rows pruned).
  • go vet ./jobs/... clean; touched files gofmt-clean.
  • Watch indexer IO / block lag after deploy to confirm starvation is resolved.

🤖 Generated with Claude Code

raymondjacobson and others added 2 commits June 1, 2026 13:34
TrendingJob ran on a 10s schedule but each run took minutes: it did two
non-concurrent REFRESH MATERIALIZED VIEW (AccessExclusiveLock, blocking all
readers including the block-indexing loop) and then a full DELETE+INSERT of
the ~16GB track_trending_scores table plus 6 indexes. Effectively continuous,
it IO-starved the indexer.

Changes:
- Refresh matviews with REFRESH MATERIALIZED VIEW CONCURRENTLY, falling back
  to a blocking refresh if the matview is unpopulated (fresh/test DB) or the
  unique index isn't present yet (pre-migration rolling deploy).
- Migration 0214 promotes the track_id indexes on aggregate_interval_plays
  and trending_params to UNIQUE so CONCURRENTLY is possible (both matviews are
  one row per track).
- Rewrite the score recompute as upsert-into-staging-temp-table +
  ON CONFLICT DO UPDATE (skip-unchanged) + anti-join prune, instead of
  DELETE-then-INSERT of every row each cycle.
- Schedule TrendingJob hourly to match discovery's effective cadence
  (trending_refresh_seconds=3600); the vendored port had dropped that gate.

sql/01_schema.sql picks up the unique indexes on the next `make test-schema`
regen; the code falls back safely until then. Adds a prune-coverage test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Condense the package doc, refreshMatview doc, migration 0214 header, and the
indexer cadence comment to the essential rationale.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@raymondjacobson raymondjacobson merged commit b0bba05 into main Jun 1, 2026
5 checks passed
@raymondjacobson raymondjacobson deleted the api/p1-trending-perf branch June 1, 2026 22:13
raymondjacobson added a commit that referenced this pull request Jun 2, 2026
## Summary
- #890 swapped the trending score-table rewrite for a temp-table stage +
`INSERT ... ON CONFLICT DO UPDATE` (skip-unchanged) + anti-join prune,
to avoid rewriting the ~16GB `*_trending_scores` tables every run.
- In prod that upsert does one unique-index probe + `IS DISTINCT FROM`
comparison **per row over ~4M rows**. It took **40+ minutes** and never
finished inside the hourly window — so trending scores went stale
(observed 12–18h old) and `isRunning` blocked every subsequent tick.
- This reverts `computeTrendingTracks` / `computeTrendingPlaylists` to
the strategy discovery's Python ran for years: within the single
transaction already opened, `DELETE` the `(type, version)` slice then
repopulate with plain bulk `INSERT`s.

## Why this is correct
- **No empty-table window:** the `DELETE` and `INSERT`s are in one
transaction, so the `DELETE` isn't visible to readers until `COMMIT` —
exactly how Python's `update_track_score_query` behaved.
- **Scores unchanged:** the score expressions are byte-for-byte
identical to before (and to apps' `pnagD`/`AnlGe` strategies); only
bind-parameter numbering changed because `type`/`version`/`created_at`
moved back into the `SELECT`.
- **Concurrent matview refresh from #890 is kept** — that part was a
genuine improvement over Python (which refreshed non-concurrently).

## Tradeoff
The bulk rewrite produces more dead tuples per run than the
skip-unchanged upsert. Autovacuum handled this under Python for years.
The alternative — a job that never completes — is strictly worse.

## Evidence
- Live prod query showed the upsert `INSERT INTO track_trending_scores`
active for **2650s (44 min)** on `IO/DataFileRead`, not blocked, still
on step 1 of 4.
- `track_trending_scores` max `created_at` frozen 12–18h while the pod
had been up >1h and the matview refreshes succeeded.

## Test plan
- [x] `go build ./jobs/` clean
- [x] `gofmt` clean
- [x] `go test ./jobs/ -run Trending` passes, incl.
`TestTrendingJob_PrunesStaleRows` (deleted tracks still drop out, now
via the full DELETE)
- [ ] After deploy: confirm `max(created_at)` in `track_trending_scores`
advances each cycle and a full cycle completes in seconds

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant