feat(api): port repair_audio_analyses backfill job #900

Merged
raymondjacobson merged 2 commits into main from port-repair-audio-analyses on Jun 2, 2026

@raymondjacobson
Member

Summary

Adds a recurring job that backfills missing track bpm / musical_key from content-node audio analyses — a direct port of apps' discovery-provider repair_audio_analyses celery task (3-minute beat). This is the parity-correct backfill mechanism for tracks left without bpm/key (companion to the indexer fix in go-openaudio#334).

Each pass:

  • Selects up to 1000 current tracks missing a non-custom bpm or musical_key, with audio_analysis_error_count < 3, a streamable track_cid, non-podcast/audiobook genre, newest first.
  • Picks up to 5 random registered content-node endpoints via Eth.GetRegisteredEndpoints.
  • For each track, queries /uploads/{audio_upload_id} (modern) or /tracks/legacy/{track_cid}/analysis (legacy, when audio_upload_id is empty), falling back to the next node on transport/non-2xx error.
  • Fills in any missing and valid bpm / musical_key and syncs the stored error count, committing per track.

Validation mirrors apps: valid_bpm is a float in (0, 999); valid musical keys come from the flats-only MusicalKey enum. Both lowercase (mediorum) and capitalized (legacy python content node) JSON keys are accepted.

Changes

  • jobs/repair_audio_analyses.go: the job (query → node discovery → per-track fetch/apply).
  • jobs/musical_key.go: isValidMusicalKey + the enum map.
  • jobs/repair_audio_analyses_test.go: unit tests.
  • indexer/indexer.go: register in startParityJobs (every 3 min), passing the SDK for content-node discovery.

Test plan

  • go build ./jobs/... ./indexer/... + go vet ./jobs/ clean
  • Unit tests pass: bpm/key validation, lowercase+capitalized JSON parsing, modern/legacy endpoint parsing, non-2xx fallback, empty-results handling
  • Observe on staging: confirm tracks with NULL bpm/key get populated and error counts advance

🤖 Generated with Claude Code

raymondjacobson and others added 2 commits June 1, 2026 22:14
Adds a recurring job that backfills missing track bpm / musical_key from
content-node audio analyses, a direct port of apps' discovery-provider
repair_audio_analyses celery task (3-minute beat).

Each pass selects up to 1000 current tracks missing a non-custom bpm or
musical_key (error count < 3, streamable, non-podcast, newest first),
picks up to 5 random registered content nodes, and for each track queries
the modern uploads endpoint (when audio_upload_id is set) or the legacy
blob-analysis endpoint, falling back to the next node on error. It fills
in any missing+valid bpm / musical_key and syncs the stored error count,
committing per track.

Validation mirrors apps: valid_bpm is a float in (0, 999); valid musical
keys come from the flats-only MusicalKey enum. Both lowercase (mediorum)
and capitalized (legacy python content node) JSON keys are accepted.

Wired into startParityJobs with the SDK passed through for content-node
discovery via Eth.GetRegisteredEndpoints.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Advances github.com/OpenAudio/go-openaudio and .../pkg/etl from 5ed068b
to dca79f0 (go-openaudio#334), so the vendored ETL indexer persists
track bpm/musical_key/audio_upload_id. This is the prerequisite for the
repair_audio_analyses job in this PR: without it the indexer keeps
leaving these fields NULL, and the backfill has nothing to reconcile
against (no audio_upload_id to key the modern uploads endpoint on).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@raymondjacobson
Member Author

Bumped go-openaudio + pkg/etl from 5ed068b to dca79f0 to consume the now-merged indexer fix (OpenAudio/go-openaudio#334). The vendored ETL indexer now persists track bpm/musical_key/audio_upload_id, which is the prerequisite for this backfill job (it keys the modern uploads endpoint on audio_upload_id). go mod tidy, a full go build ./..., and the jobs tests all pass on the bumped deps.

@raymondjacobson raymondjacobson merged commit 7e7759d into main Jun 2, 2026
5 checks passed
@raymondjacobson raymondjacobson deleted the port-repair-audio-analyses branch June 2, 2026 05:49