RAGtime is a Django application for ingesting jazz-related podcast episodes. It extracts metadata, transcribes audio, identifies jazz entities, and powers Scott — a jazz-focused AI agent that answers questions strictly from ingested episode content, with references to specific episodes and timestamps.
- 🎙️ Episode Ingestion — Add podcast episodes by URL. RAGtime fetches episode details (title, description, date, image), downloads audio, and processes it through the pipeline.
- 🌍 Multilingual Transcription — Transcribes episodes using configurable backends (Whisper API by default) with segment- and word-level timestamps. Supports multiple languages (English, Spanish, German, Swedish, etc.).
- 🎯 Entity Extraction — Identifies jazz entities: musicians, musical groups, albums, music venues, recording sessions, record labels, years. Entities are resolved against the local MusicBrainz database (foreground, sub-millisecond) and Wikidata (background, throttled singleton enrichment).
- 🔍 Episode Indexing — Splits transcripts into segments and generates multilingual embeddings stored in Qdrant. Enables cross-language semantic search so Scott can retrieve relevant content regardless of the question's language.
- 🎷 Scott — Your Jazz AI — A conversational agent that answers questions strictly from ingested episode content. Scott responds in the user's language and provides references to specific episodes and timestamps. Responses stream in real time.
- 📊 AI Evaluation — Measures pipeline and Scott quality using RAGAS (faithfulness, answer relevancy, context precision/recall) with scores tracked in Langfuse.
RAGtime is under active development.
- Episode ingestion: episode submission by URL, episode-detail fetching, audio download, transcription, summarization, chunking, entity extraction with foreground resolution against a local MusicBrainz Postgres database and background Wikidata enrichment, and multilingual embeddings into Qdrant.
- Episode management UI: Django admin interface to view episode status and metadata and browse extracted entities.
- Configuration wizard: interactive `manage.py configure` command for all `RAGTIME_*` env vars.
- Telemetry: OpenTelemetry-based tracing for pipeline steps and LLM calls with optional collectors: console, Jaeger, and Langfuse.
- Agent-driven download: when the cheap `wget` path fails, a Pydantic AI agent with podcast-index lookup (fyyd, podcastindex.org) and Playwright browser automation discovers the audio URL from the publisher's page or interactive UI.
- Scott chatbot: strict-RAG conversational agent that answers questions only from ingested episode content, with citations and real-time streaming via AG-UI. React frontend built with assistant-ui and conversation history persisted in Django.
See `CHANGELOG.md` for the full list of implemented features, fixes, implementation plans, feature documentation, and session transcripts.
- AI evaluation: measure pipeline and Scott quality using RAGAS (faithfulness, answer relevancy, context precision/recall) with scores tracked in Langfuse. Enables regression testing across prompt and model changes.
Each step updates the episode's `status` field. A `post_save` signal enqueues a DBOS durable workflow on the `episode_pipeline` queue (default concurrency=4 via `RAGTIME_EPISODE_CONCURRENCY`) that sequences all steps with PostgreSQL-backed checkpointing — on crash or restart, the workflow resumes from the last completed step. The workflow exposes one `@DBOS.step()` per pipeline phase, so `dbos workflow steps <id>` (and the Episode admin's "View workflow steps" link) shows exactly which phase ran, its output, or the exception raised. Episodes that arrive while all worker slots are busy sit visibly in the `queued` state until DBOS picks them up.
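The resume-from-last-step behaviour can be pictured with a toy checkpointing loop (purely illustrative; DBOS persists step results in PostgreSQL, not in an in-memory set, and the function below is not RAGtime code):

```python
def run_pipeline(steps, checkpoints: set[str]) -> list[str]:
    """Run named steps in order, skipping any already checkpointed.

    Returns the names of the steps executed in this run.
    """
    executed = []
    for name, fn in steps:
        if name in checkpoints:
            continue  # completed before the crash; skipped on resume
        fn()
        checkpoints.add(name)   # checkpoint only after the step succeeds
        executed.append(name)
    return executed
```

On restart with the same checkpoint set, only the phases that never completed run again, which is the resume behaviour durable execution provides.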
| # | Step | Status | Description |
|---|---|---|---|
| 1 | 📥 Submit | `pending` | User submits an episode URL |
| ⏸ | ⏳ Queue | `queued` | Waiting for a pipeline worker slot |
| 2 | 🏷️ Fetch Details | `fetching_details` | Investigator agent extracts metadata and cross-links between canonical and aggregator pages (Apple Podcasts, fyyd) when the submitted URL alone is incomplete |
| 3 | ⬇️ Download | `downloading` | Download audio (cheap `wget` first; agent + podcast-index fallback) and extract duration |
| 4 | 🎙️ Transcribe | `transcribing` | Whisper API transcription with timestamps |
| 5 | 📝 Summarize | `summarizing` | LLM-generated episode summary |
| 6 | ✂️ Chunk | `chunking` | Split transcript into ~150-word chunks |
| 7 | 🔍 Extract | `extracting` | Named entity recognition per chunk |
| 8 | 🧩 Resolve | `resolving` | Entity linking + deduplication against the local MusicBrainz database |
| 9 | 📊 Embed | `embedding` | Multilingual embeddings into Qdrant |
| 10 | ✅ Ready | `ready` | Episode available for Scott to query |
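The ~150-word chunking in step 6 amounts to a simple fixed-size split of the transcript; a minimal sketch (not RAGtime's actual implementation, which may respect sentence or segment boundaries):

```python
def chunk_transcript(text: str, target_words: int = 150) -> list[str]:
    """Split a transcript into consecutive chunks of ~target_words words."""
    words = text.split()
    return [
        " ".join(words[i : i + target_words])
        for i in range(0, len(words), target_words)
    ]
```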
Wikidata IDs are filled in after `ready` by a separate background DBOS workflow on a singleton-concurrency queue (`wikidata_enrichment`, concurrency=1). It first tries MBID → Wikidata via MusicBrainz external links (local DB, no network) and falls back to the Wikidata API only when needed. Enrichment is per-entity and deduplicated globally — common names are enriched once across all episodes.
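The lookup order can be sketched as follows (function and parameter names here are invented; only the local-first, API-fallback ordering comes from the description above):

```python
from typing import Callable, Optional

def wikidata_id_for(
    mbid: str,
    local_links: dict[str, str],
    api_lookup: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Resolve an MBID to a Wikidata QID: consult the local MusicBrainz
    external-links table first (no network), then the throttled API."""
    qid = local_links.get(mbid)
    if qid is not None:
        return qid           # resolved from the local DB, no network call
    return api_lookup(mbid)  # fallback: throttled Wikidata API request
```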
See the full pipeline documentation for per-step details, the download-agent cascade, and entity types.
Detailed documentation lives in the `doc/` directory:
- Full pipeline documentation β per-step details, the download-agent cascade, entity types
- How Scott works β RAG architecture and query flow
- Telemetry (OpenTelemetry) β tracing setup, collectors (console, Jaeger, Langfuse)
- Architecture diagrams β processing pipeline diagram
- Feature documentation β per-feature docs with problem, changes, and verification
- Plans β implementation plans
- Session transcripts β planning and implementation session logs
- Python 3.13+
- uv
- Node.js (for the frontend dev server and build)
- Docker (for PostgreSQL and Qdrant)
- ffmpeg (for audio downsampling)
- wget (for audio downloading)
```
git clone <repo-url>
cd ragtime
uv sync   # Install dependencies
```

Optional dependency group:
| Extra | Install command | Description |
|---|---|---|
| `langfuse` | `uv sync --extra langfuse` | Langfuse collector for telemetry |
Launch the interactive setup wizard for all `RAGTIME_*` env vars:

```
uv run python manage.py configure
```

Alternatively, copy `.env.sample` to `.env` and fill in your values.
The service variables are read by `docker-compose.yml` when the containers start, so the values you set here flow straight through:

- `RAGTIME_DB_NAME`, `RAGTIME_DB_USER`, `RAGTIME_DB_PASSWORD`, `RAGTIME_DB_PORT` — Postgres (defaults: `ragtime` / port `5432`).
- `RAGTIME_QDRANT_PORT` — Qdrant published HTTP port (default: `6333`).
Defaults are used if the variables are unset, so a fresh clone runs with zero configuration.
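For example, a minimal `.env` that simply pins the documented defaults might look like this (values below are the defaults stated above; set `RAGTIME_DB_PASSWORD` to whatever you chose for Postgres):

```
RAGTIME_DB_NAME=ragtime
RAGTIME_DB_USER=ragtime
RAGTIME_DB_PORT=5432
RAGTIME_QDRANT_PORT=6333
```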
Start PostgreSQL and Qdrant, apply migrations, create an admin account, and start the application:
```
docker compose up -d                        # Start PostgreSQL and Qdrant (both read ports/creds from .env)
uv run python manage.py migrate
uv run python manage.py createsuperuser     # Create an admin user for the Django admin UI
uv run python manage.py load_entity_types   # Seed initial entity types
```

The foreground entity-resolution step queries a local MusicBrainz database to map extracted names to canonical MBIDs with sub-millisecond DB queries (instead of rate-limited Wikidata API calls). One-time import via musicbrainz-database-setup:
```
# Create an empty 'musicbrainz' database next to the 'ragtime' database.
docker compose exec postgres createdb -U ragtime musicbrainz

# One-shot import — runs the upstream CLI directly from GitHub via uvx,
# downloads the latest MusicBrainz dump, creates the schema, and streams
# COPY for every table. ~30+ minutes depending on disk speed. Resumable.
uvx --from git+https://github.com/rafacm/musicbrainz-database-setup \
  musicbrainz-database-setup run \
  --db postgresql://ragtime:ragtime@localhost:5432/musicbrainz \
  --modules core \
  --latest
```
--latestSee MusicBrainz database in the docs for why it's needed, configuration via RAGTIME_MUSICBRAINZ_*, optional modules, and tuning tips.
```
uv run uvicorn ragtime.asgi:application --host 127.0.0.1 --port 8000
```

The application runs under ASGI via Uvicorn. This is required because Scott's chat endpoint (`/chat/agent/`) uses HTTP+SSE streaming through an ASGI sub-app mounted in `ragtime/asgi.py`. All other routes (admin, episodes, pages) are served by the same process through Django's standard ASGI handler.

Note: `manage.py runserver` still works for non-Scott development (admin, episodes, ingestion pipeline) but does not load the ASGI dispatcher, so the chat endpoint will not function.
```
cd frontend && npm install   # First time only
cd frontend && npm run dev   # Vite dev server with HMR on port 5173
```

The Scott chat UI is a React application (assistant-ui + AG-UI) built with Vite. During development, Vite serves the frontend with hot module replacement. In production, run `npm run build`; the compiled assets are served by Django via django-vite.
The frontend communicates with the ASGI server over HTTP+SSE (AG-UI protocol), so both the Uvicorn server and the Vite dev server must be running to develop the chat UI.
RAGtime uses OpenTelemetry to trace pipeline steps and LLM calls. The quickest local setup is Jaeger:
```
docker run -d --name jaeger -p 4318:4318 -p 16686:16686 jaegertracing/all-in-one:latest
```

Then set `RAGTIME_OTEL_COLLECTORS=jaeger` in `.env`. Traces are viewable at http://localhost:16686. See Telemetry (OpenTelemetry) for all collector options (console, Jaeger, Langfuse).
To drop all data and start fresh:
```
uv run python manage.py dbreset            # Drop PostgreSQL DB (incl. DBOS tables) + Qdrant collection
uv run python manage.py migrate            # Recreate tables
uv run python manage.py load_entity_types  # Seed entity types
uv run python manage.py createsuperuser    # Recreate the admin account (interactive)
```

Or non-interactively:

```
DJANGO_SUPERUSER_PASSWORD=admin uv run python manage.py createsuperuser --username admin --email admin@example.com --noinput
```

- Runtime: Python 3.13
- Framework: Django 5.2
- Database: PostgreSQL 17 (via Docker Compose)
- Vector Store: Qdrant (via Docker Compose)
- Durable Workflows: DBOS Transact (PostgreSQL-backed durable execution)
- AI Agents: Pydantic AI (fetch-details agent, download agent)
- Transcription: Configurable — Whisper API (default), local Whisper, etc.
- LLM: Configurable — Claude (Anthropic), GPT (OpenAI), etc.
- Embeddings: Configurable — must support multilingual models for cross-language retrieval
- AI Evaluation: RAGAS + Langfuse
- Frontend: React 19 + assistant-ui + Tailwind CSS 4 via Vite + django-vite (Scott chat UI); Django templates + HTMX (other pages)
- Package Manager: uv
This project is licensed under the MIT License β see the LICENSE file for details.
