bigint · May 21, 2026
diff --git a/.env.example b/.env.example
@@ -24,7 +24,6 @@ BIGRAG_MASTER_KEY=
 
 # Override only when the backing services are not running on local defaults.
 # BIGRAG_DATABASE_URL=postgres://bigrag:bigrag@localhost:5432/bigrag?sslmode=disable
-# BIGRAG_QDRANT_URL=http://localhost:6333
 # BIGRAG_REDIS_URL=redis://localhost:6379/0
 
 # Optional Redis password for docker-compose production runs.

diff --git a/AGENTS.md b/AGENTS.md
@@ -2,12 +2,12 @@
 
 ## Project Structure
 
-- `api/` — Python/FastAPI backend (Docling ingestion + Qdrant/Turbopuffer vector DB)
+- `api/` — Python/FastAPI backend (Docling ingestion + Turbopuffer vector search)
 - `sdks/typescript/` — TypeScript SDK (`@bigrag/client`)
 - `sdks/python/` — Python SDK (`bigrag`)
 - `app/` — admin UI (Vite + TanStack Router + Tailwind v4 + Base UI, `@bigrag/app`)
 - `website/` — Documentation site (Next.js + Fumadocs, content in `website/content/docs/`)
-- `e2e/` — pytest + vitest end-to-end suites against a fake-OpenAI + real Postgres/Redis/Qdrant stack
+- `e2e/` — pytest + vitest end-to-end suites against fake OpenAI/Turbopuffer services plus real Postgres/Redis
 
 ## Style Guide
 
@@ -33,8 +33,8 @@ When adding new code, prefer the smallest meaningful module instead of dropping
 
 ## Tech Stack
 
-- **Backend**: Python 3.12+, FastAPI, SQLAlchemy 2 (async) + asyncpg, Alembic, qdrant-client, docling, openai, cohere, cryptography (Fernet for at-rest encryption of provider secrets), dramatiq (Redis broker)
-- **Vector DB**: Qdrant default; turbopuffer alternative (selected per collection)
+- **Backend**: Python 3.12+, FastAPI, SQLAlchemy 2 (async) + asyncpg, Alembic, docling, openai, cohere, cryptography (Fernet for at-rest encryption of provider secrets), dramatiq (Redis broker)
+- **Vector DB**: Turbopuffer
 - **Metadata DB**: PostgreSQL 17
 - **Ingestion**: Docling (PDF, DOCX, PPTX, XLSX, HTML, Markdown, images)
 - **Embedding**: OpenAI, Cohere, Voyage, and OpenAI-compatible providers
@@ -124,7 +124,7 @@ If a feature is removed, remove it from the docs too. Never leave stale referenc
 
 ```bash
 ./dev.sh            # starts infra + backend + worker
-./dev.sh --infra    # postgres + redis + qdrant only
+./dev.sh --infra    # postgres + redis only
 ./dev.sh --website  # docs site only
 pnpm dev:app        # admin UI on localhost:3000
 ```
@@ -133,7 +133,6 @@ pnpm dev:app        # admin UI on localhost:3000
 - Backend API: http://localhost:4000 (Swagger docs at /docs)
 - Postgres: localhost:5432
 - Redis: localhost:6379
-- Qdrant: localhost:6333
 
 `dev.sh` only tears down the docker stack that it started — if you ran `docker compose up` separately before invoking `dev.sh`, it leaves that running on exit.
 

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -8,7 +8,7 @@ Thank you for your interest in contributing to bigRAG. This guide will help you
 
 - **Python 3.12+** with [uv](https://docs.astral.sh/uv/)
 - **Node.js 20+** with [pnpm](https://pnpm.io/) (via corepack)
-- **Docker** and **Docker Compose** — for Postgres, Redis, Qdrant
+- **Docker** and **Docker Compose** — for Postgres and Redis
 
 ### Development Setup
 
@@ -31,7 +31,7 @@ Or manually:
 
 ```bash
 # Start infrastructure
-docker compose up postgres redis qdrant -d
+docker compose up postgres redis -d
 
 # Set up the Python backend
 cd api
@@ -60,7 +60,7 @@ bigrag/
 ├── sdks/python/           # Python SDK (bigrag)
 ├── app/                   # Admin UI (TanStack Router + React)
 ├── website/               # Docs site (Next.js + Fumadocs)
-├── docker-compose.yml     # Full stack (Postgres, Redis, Qdrant, API)
+├── docker-compose.yml     # Full stack (Postgres, Redis, API)
 ├── biome.jsonc            # Biome linting config for TypeScript
 ├── pnpm-workspace.yaml    # pnpm workspace config
 ├── dev.sh                 # One-command dev setup

diff --git a/README.md b/README.md
@@ -9,7 +9,7 @@ Open-source, self-hostable RAG platform. Upload documents, auto-chunk, embed, an
 - **Document ingestion** — PDF, DOCX, PPTX, HTML, Markdown, images, and more via [Docling](https://github.com/DS4SD/docling)
 - **Embedding providers** — OpenAI, OpenAI-compatible gateways, Cohere, and Voyage
 - **Embedding presets** — save named provider/model configs once, reuse across collections
-- **Vector search** — semantic, keyword, and hybrid search modes via [Qdrant](https://qdrant.io) or [turbopuffer](https://turbopuffer.com)
+- **Vector search** — semantic, keyword, and hybrid search modes via [Turbopuffer](https://turbopuffer.com)
 - **Reranking** — Cohere reranking for improved result relevance
 - **Multi-collection queries** — search across collections in a single request
 - **Generated chat** — stateless backend-grounded playground chat with streaming and citations
@@ -31,7 +31,7 @@ Open-source, self-hostable RAG platform. Upload documents, auto-chunk, embed, an
 docker compose up -d
 ```
 
-This starts bigRAG API, worker, admin UI, Postgres, Redis, and Qdrant (or turbopuffer per collection). Open http://localhost:3000 for the admin UI or http://localhost:4000/docs for the interactive API docs.
+This starts bigRAG API, worker, admin UI, Postgres, and Redis. Configure Turbopuffer before ingesting or querying collections. Open http://localhost:3000 for the admin UI or http://localhost:4000/docs for the interactive API docs.
 
 ```bash
 # Create a collection
@@ -52,7 +52,7 @@ curl -X POST http://localhost:4000/v1/collections/docs/query \
 ### Development
 
 ```bash
-./dev.sh  # starts Postgres, Redis, Qdrant, the API with hot reload, and the worker
+./dev.sh  # starts Postgres, Redis, the API with hot reload, and the worker
 ```
 
 ### Docker Images
@@ -90,7 +90,7 @@ graph TD
 
     Worker -->|parse| Docling[Docling<br/>PDF, DOCX, HTML, Images]
     Worker -->|embed| Embedding[Embedding provider<br/>OpenAI / compatible / Cohere / Voyage]
-    Worker -->|store vectors| Vectors[(Qdrant<br/>or turbopuffer)]
+    Worker -->|store vectors| Vectors[(Turbopuffer)]
 
     Query -->|search| Vectors
     Query -->|embed query| Embedding
@@ -288,8 +288,8 @@ Full-workspace keys expose 8 tools — `list_collections`, `get_collection`, `ge
 
 ## Configuration
 
-Most settings use the `BIGRAG_` prefix as environment variables, or configure via `bigrag.toml`.
-Backend logging defaults to `debug` / `text` for local development. Use `BIGRAG_LOG_LEVEL=info` and `BIGRAG_LOG_FORMAT=json` for production log collection.
+Bootstrap settings use the `BIGRAG_` prefix as environment variables, or configure via `bigrag.toml`.
+Backend logging defaults to `debug` / `text` for local development. Use `BIGRAG_LOG_LEVEL=info` and `BIGRAG_LOG_FORMAT=json` for production log collection. Configure Turbopuffer from the admin UI; it is stored in Postgres with the other instance settings.
 
 | Variable | Description | Default |
 |----------|-------------|---------|
@@ -303,13 +303,6 @@ Backend logging defaults to `debug` / `text` for local development. Use `BIGRAG_
 | `BIGRAG_DB_POOL_MIN` | Min Postgres pool size | `5` |
 | `BIGRAG_DB_POOL_MAX` | Max Postgres pool size | `50` |
 | `BIGRAG_MIGRATION_TIMEOUT_SECONDS` | Startup migration check timeout (`0` disables the timeout) | `60` |
-| `BIGRAG_QDRANT_URL` | Qdrant URL | `http://localhost:6333` |
-| `BIGRAG_QDRANT_CONNECT_TIMEOUT_SECONDS` | Qdrant startup connection timeout (`0` disables the timeout) | `10` |
-| `BIGRAG_QDRANT_REQUIRED` | Fail API startup if Qdrant cannot be reached | `false` |
-| `BIGRAG_QDRANT_SEARCH_EF` | Optional Qdrant HNSW search recall/latency tuning | — |
-| `BIGRAG_TURBOPUFFER_API_KEY` | turbopuffer API key (only required for collections using turbopuffer) | — |
-| `BIGRAG_TURBOPUFFER_REGION` | turbopuffer region | `aws-us-east-1` |
-| `BIGRAG_TURBOPUFFER_NAMESPACE_PREFIX` | Prefix prepended to turbopuffer namespace names | `bigrag_` |
 | `BIGRAG_REDIS_URL` | Redis URL | `redis://localhost:6379/0` |
 | `BIGRAG_ENV` | `dev` or `prod` (prod enables startup safety checks) | `dev` |
 | `BIGRAG_TRUSTED_PROXIES` | JSON array of trusted proxy CIDRs used to honor `X-Forwarded-For` for audit and access logs | `[]` |

diff --git a/api/alembic/versions/0001_initial_schema.py b/api/alembic/versions/0001_initial_schema.py
@@ -260,7 +260,6 @@ def upgrade() -> None:
         sa.Column("embedding_api_key", bigrag.services.crypto.EncryptedString(), nullable=True),
         sa.Column("embedding_base_url", sa.Text(), nullable=True),
         sa.Column("embedding_preset_id", sa.Uuid(), nullable=True),
-        sa.Column("vector_store_provider", sa.Text(), server_default="qdrant", nullable=False),
         sa.Column("dimension", sa.Integer(), server_default=sa.text("1536"), nullable=False),
         sa.Column("chunk_size", sa.Integer(), server_default=sa.text("512"), nullable=False),
         sa.Column("chunk_overlap", sa.Integer(), server_default=sa.text("50"), nullable=False),
@@ -283,7 +282,6 @@ def upgrade() -> None:
             server_default=sa.text("false"),
             nullable=False,
         ),
-        sa.Column("index_type", sa.Text(), server_default="HNSW", nullable=False),
         sa.Column("tenant_field", sa.Text(), nullable=True),
         sa.Column("metadata_schema", postgresql.JSONB(astext_type=sa.Text()), nullable=True),
         sa.Column(
@@ -317,66 +315,6 @@ def upgrade() -> None:
         unique=False,
     )
     op.create_index("idx_collections_name", "collections", ["name"], unique=False)
-    op.create_table(
-        "vector_migration_jobs",
-        sa.Column("id", sa.Uuid(), nullable=False),
-        sa.Column("collection_id", sa.Uuid(), nullable=True),
-        sa.Column("collection_name", sa.Text(), nullable=False),
-        sa.Column("source_provider", sa.Text(), nullable=False),
-        sa.Column("target_provider", sa.Text(), nullable=False),
-        sa.Column("status", sa.Text(), server_default="pending", nullable=False),
-        sa.Column("phase", sa.Text(), server_default="queued", nullable=False),
-        sa.Column("progress", sa.Double(), server_default=sa.text("0"), nullable=False),
-        sa.Column("copied_points", sa.Integer(), server_default=sa.text("0"), nullable=False),
-        sa.Column("total_points", sa.Integer(), nullable=True),
-        sa.Column(
-            "details",
-            postgresql.JSONB(astext_type=sa.Text()),
-            server_default=sa.text("'{}'::jsonb"),
-            nullable=False,
-        ),
-        sa.Column("error_message", sa.Text(), nullable=True),
-        sa.Column("created_by", sa.Uuid(), nullable=True),
-        sa.Column("started_at", sa.DateTime(timezone=True), nullable=True),
-        sa.Column("completed_at", sa.DateTime(timezone=True), nullable=True),
-        sa.Column(
-            "created_at",
-            sa.DateTime(timezone=True),
-            server_default=sa.text("now()"),
-            nullable=False,
-        ),
-        sa.Column(
-            "updated_at",
-            sa.DateTime(timezone=True),
-            server_default=sa.text("now()"),
-            nullable=False,
-        ),
-        sa.CheckConstraint(
-            "status IN ('pending', 'running', 'canceling', 'succeeded', 'failed')",
-            name="vector_migration_jobs_status_check",
-        ),
-        sa.ForeignKeyConstraint(["collection_id"], ["collections.id"], ondelete="SET NULL"),
-        sa.ForeignKeyConstraint(["created_by"], ["users.id"], ondelete="SET NULL"),
-        sa.PrimaryKeyConstraint("id"),
-    )
-    op.create_index(
-        "idx_vector_migration_jobs_collection",
-        "vector_migration_jobs",
-        ["collection_name"],
-        unique=False,
-    )
-    op.create_index(
-        "idx_vector_migration_jobs_created_at_id",
-        "vector_migration_jobs",
-        [sa.literal_column("created_at DESC"), sa.literal_column("id DESC")],
-        unique=False,
-    )
-    op.create_index(
-        "idx_vector_migration_jobs_status",
-        "vector_migration_jobs",
-        ["status"],
-        unique=False,
-    )
     op.create_table(
         "connector_accounts",
         sa.Column("id", sa.Uuid(), nullable=False),
@@ -1170,11 +1108,6 @@ def upgrade() -> None:
         "webhook_deliveries",
         "status IN ('pending', 'delivered', 'failed')",
     )
-    op.create_check_constraint(
-        "collections_vector_store_provider_check",
-        "collections",
-        "vector_store_provider IN ('qdrant', 'turbopuffer')",
-    )
     op.create_check_constraint(
         "embedding_presets_provider_check",
         "embedding_presets",
@@ -1266,10 +1199,6 @@ def downgrade() -> None:
     op.drop_table("connector_accounts")
     op.drop_index("idx_collections_name", table_name="collections")
     op.drop_index("idx_collections_created_at_id", table_name="collections")
-    op.drop_index("idx_vector_migration_jobs_status", table_name="vector_migration_jobs")
-    op.drop_index("idx_vector_migration_jobs_created_at_id", table_name="vector_migration_jobs")
-    op.drop_index("idx_vector_migration_jobs_collection", table_name="vector_migration_jobs")
-    op.drop_table("vector_migration_jobs")
     op.drop_table("collections")
     op.drop_index("idx_backup_jobs_status", table_name="backup_jobs")
     op.drop_index("idx_backup_jobs_created_at_id", table_name="backup_jobs")

diff --git a/api/bigrag/app_factory/lifespan.py b/api/bigrag/app_factory/lifespan.py
@@ -52,21 +52,16 @@ async def lifespan(app: FastAPI):
     runtime = await runtime_settings.get_values(
         [
             "ingestion_workers",
-            "qdrant_connect_timeout_seconds",
-            "qdrant_required",
-            "qdrant_search_ef",
-            "qdrant_url",
             "turbopuffer_api_key",
+            "turbopuffer_base_url",
             "turbopuffer_namespace_prefix",
             "turbopuffer_region",
         ]
     )
 
     vector_store.configure(
-        qdrant_url=runtime["qdrant_url"],
-        connect_timeout_seconds=runtime["qdrant_connect_timeout_seconds"],
-        search_ef=runtime["qdrant_search_ef"],
         turbopuffer_api_key=runtime["turbopuffer_api_key"],
+        turbopuffer_base_url=runtime["turbopuffer_base_url"],
         turbopuffer_region=runtime["turbopuffer_region"],
         turbopuffer_namespace_prefix=runtime["turbopuffer_namespace_prefix"],
     )
@@ -76,12 +71,10 @@ async def lifespan(app: FastAPI):
     except Exception as exc:
         logger.warning(
             "Vector store startup connection failed; API will start degraded",
-            provider=vector_store.provider,
+            provider="turbopuffer",
             error_type=exc.__class__.__name__,
             error=str(exc),
         )
-        if runtime["qdrant_required"]:
-            raise
     app.state.vector_store = vector_store
 
     storage = await init_storage_from_runtime(upload_dir=s.upload_dir)
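
Review note: with `qdrant_required` removed, the hunk above always logs a warning and lets the API start degraded when the vector store cannot be reached at startup. The sketch below illustrates that behaviour in isolation. It is not the project's code: `FakeVectorStore`, its `connect()` method, and `connect_or_degrade` are hypothetical names, and it uses the standard-library `logging` module rather than the keyword-argument logger seen in the diff.

```python
import asyncio
import logging

logger = logging.getLogger("bigrag.startup")


class FakeVectorStore:
    """Hypothetical stand-in for the real vector store client."""

    def __init__(self, reachable: bool) -> None:
        self.reachable = reachable

    async def connect(self) -> None:
        # Assumption: the real client raises when the service is unreachable.
        if not self.reachable:
            raise ConnectionError("turbopuffer unreachable")


async def connect_or_degrade(store: FakeVectorStore) -> bool:
    """Return True when connected, False when startup continues degraded."""
    try:
        await store.connect()
    except Exception as exc:
        # Mirror the diff: log the failure and keep going, never re-raise.
        logger.warning(
            "Vector store startup connection failed; API will start degraded "
            "(provider=%s, error_type=%s, error=%s)",
            "turbopuffer",
            exc.__class__.__name__,
            exc,
        )
        return False
    return True


if __name__ == "__main__":
    print(asyncio.run(connect_or_degrade(FakeVectorStore(reachable=False))))
```

One consequence worth confirming in review: there is no longer a way to make an unreachable vector store fail startup, so deployments that relied on `BIGRAG_QDRANT_REQUIRED=true` would need a health check or readiness probe instead.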

diff --git a/api/bigrag/app_factory/routers.py b/api/bigrag/app_factory/routers.py
@@ -15,7 +15,6 @@ def include_all_routers(app: FastAPI) -> None:
     from bigrag.routers.admin_realtime import router as admin_realtime_router
     from bigrag.routers.admin_settings import router as admin_settings_router
     from bigrag.routers.admin_users import router as admin_users_router
-    from bigrag.routers.admin_vector_migrations import router as admin_vector_migrations_router
     from bigrag.routers.admin_vector_storage import router as admin_vector_storage_router
     from bigrag.routers.analytics import router as analytics_router
     from bigrag.routers.auth import router as auth_router
@@ -51,7 +50,6 @@ def include_all_routers(app: FastAPI) -> None:
     app.include_router(admin_settings_router)
     app.include_router(admin_access_router)
     app.include_router(admin_vector_storage_router)
-    app.include_router(admin_vector_migrations_router)
     app.include_router(admin_realtime_router)
     app.include_router(mcp_servers_router)
     app.include_router(admin_audit_router)

diff --git a/api/bigrag/config.py b/api/bigrag/config.py
@@ -27,15 +27,6 @@ class Settings(BaseSettings):
     db_pool_max: int = 50
     migration_timeout_seconds: int = 60
 
-    qdrant_url: str = "http://localhost:6333"
-    qdrant_connect_timeout_seconds: int = 10
-    qdrant_required: bool = False
-    qdrant_prefer_grpc: bool = False
-    qdrant_grpc_port: int = 6334
-    turbopuffer_api_key: str | None = None
-    turbopuffer_region: str = "aws-us-east-1"
-    turbopuffer_namespace_prefix: str = "bigrag_"
-
     redis_url: str = "redis://localhost:6379/0"
 
     master_key: str | None = None

diff --git a/api/bigrag/db/models/__init__.py b/api/bigrag/db/models/__init__.py
@@ -19,7 +19,6 @@
 from bigrag.db.models.instance import InstanceSetting, MaintenanceLock
 from bigrag.db.models.observability import AccessLog, AuditLog, BackupJob, QueryLog
 from bigrag.db.models.preference import UserPreference
-from bigrag.db.models.vector_migration import VectorMigrationJob
 from bigrag.db.models.webhook import Webhook, WebhookDelivery
 
 __all__ = [
@@ -46,7 +45,6 @@
     "User",
     "UserSession",
     "UserPreference",
-    "VectorMigrationJob",
     "Webhook",
     "WebhookDelivery",
 ]
diff --git a/api/bigrag/db/models/collection.py b/api/bigrag/db/models/collection.py
@@ -14,10 +14,6 @@
 class Collection(Base):
     __tablename__ = "collections"
     __table_args__ = (
-        sa.CheckConstraint(
-            "vector_store_provider IN ('qdrant', 'turbopuffer')",
-            name="collections_vector_store_provider_check",
-        ),
         sa.Index("idx_collections_name", "name"),
         sa.Index("idx_collections_created_at_id", sa.desc("created_at"), sa.desc("id")),
     )
@@ -37,11 +33,6 @@ class Collection(Base):
         sa.ForeignKey("embedding_presets.id", ondelete="RESTRICT"),
         nullable=True,
     )
-    vector_store_provider: Mapped[str] = mapped_column(
-        sa.Text,
-        nullable=False,
-        server_default="qdrant",
-    )
     dimension: Mapped[int] = mapped_column(
         sa.Integer, nullable=False, server_default=sa.text("1536")
     )
@@ -75,7 +66,6 @@ class Collection(Base):
     multimodal_enrichment_enabled: Mapped[bool] = mapped_column(
         sa.Boolean, nullable=False, server_default=sa.false()
     )
-    index_type: Mapped[str] = mapped_column(sa.Text, nullable=False, server_default="HNSW")
     tenant_field: Mapped[str | None] = mapped_column(sa.Text)
     metadata_schema: Mapped[dict | None] = mapped_column(JSONB)
     meta: Mapped[dict] = mapped_column(