Production RAG Pipeline with Multi-Agent Orchestration

Document Q&A system: upload PDF/TXT documents and ask questions. Uses a multi-agent pipeline (router, retrieval, generator, evaluator) with LangGraph, LlamaIndex RAG, and Qdrant.

Architecture

flowchart TD
    UserQuery[User Query]
    Router[Router Agent]
    Retrieval[Retrieval Agent]
    Clarification[Clarification Agent]
    RAG[RAG Pipeline LlamaIndex]
    Generator[Answer Generation Agent]
    Evaluator[Evaluation Agent]
    Response[Response]

    UserQuery --> Router
    Router -->|"intent: answer"| Retrieval
    Router -->|"intent: clarify"| Clarification
    Clarification --> Response
    Retrieval --> RAG
    RAG --> Generator
    Generator --> Evaluator
    Evaluator --> Response

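Router: Classifies each query as answer or clarify (GPT-4o).
Retrieval: Fetches the top-k chunks from Qdrant via LlamaIndex.
Clarification: Responds directly when the router's intent is clarify, skipping retrieval.
Generator: Builds the answer from the retrieved context and the query (GPT-4o).
Evaluator: Assigns a confidence score and applies guardrails (PASS/FAIL).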
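
The control flow in the diagram can be sketched in plain Python. This is a minimal illustration, not the project's LangGraph graph: `PipelineState`, `run_pipeline`, the injected callables, the clarification text, and the `0.5` threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineState:
    """Hypothetical state object passed between the agents."""
    question: str
    intent: str = ""
    chunks: List[str] = field(default_factory=list)
    answer: str = ""
    confidence: float = 0.0
    verdict: str = ""
    clarification: bool = False


def run_pipeline(
    question: str,
    classify: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
    score: Callable[[str, List[str], str], float],
    threshold: float = 0.5,  # assumed guardrail cut-off, not from the project
) -> PipelineState:
    state = PipelineState(question=question)

    # Router: decide between the "answer" and "clarify" branches.
    state.intent = classify(question)
    if state.intent == "clarify":
        # Clarification branch goes straight to the response.
        state.clarification = True
        state.answer = "Could you clarify your question?"
        return state

    # Retrieval -> Generator -> Evaluator, as in the flowchart.
    state.chunks = retrieve(question)
    state.answer = generate(question, state.chunks)
    state.confidence = score(question, state.chunks, state.answer)
    state.verdict = "PASS" if state.confidence >= threshold else "FAIL"
    return state
```

Passing the agents in as callables keeps the sketch runnable with stubs; in the real system each callable would wrap an LLM or Qdrant call.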

Tech Stack

Layer	Technology
LLM	OpenAI GPT-4o
Orchestration	LangChain + LangGraph
Embeddings	OpenAI text-embedding-3-small
Vector DB	Qdrant (Docker)
RAG	LlamaIndex
Backend	FastAPI
Observability	LangSmith
Package manager	uv

Setup

Using Docker Compose

Copy env and set API keys:

cp .env.example .env
# Edit .env: OPENAI_API_KEY=sk-... and optionally LANGCHAIN_API_KEY=...

Start Qdrant and the app:
```
docker compose up --build
```
API: http://localhost:8000 (docs at http://localhost:8000/docs).

Local development (uv)

Install uv, then:
```
uv sync --extra dev
```
Run Qdrant (e.g. docker run -p 6333:6333 qdrant/qdrant:latest).

Set .env as above; then:

uv run uvicorn api.main:app --reload --port 8000

Environment (.env)

Variable	Description
`OPENAI_API_KEY`	Required for embeddings and GPT-4o.
`LANGCHAIN_API_KEY`	Optional; for LangSmith tracing.
`LANGCHAIN_TRACING_V2`	Set to `true` to enable LangSmith.
`LANGCHAIN_PROJECT`	Project name in LangSmith (e.g. `prod-rag-agent`).
`QDRANT_HOST`	`localhost` locally; `qdrant` in Docker.
`QDRANT_PORT`	`6333`.
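
For reference, the variables above could be read into a typed settings object along these lines. This is a sketch only: the repository's `config.py` may be organized differently, and `Settings` and `load_settings` are hypothetical names. It assumes the variables are already present in the process environment and does not parse the `.env` file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables in the table above."""
    openai_api_key: str
    qdrant_host: str = "localhost"
    qdrant_port: int = 6333
    langchain_api_key: Optional[str] = None
    langchain_tracing_v2: bool = False
    langchain_project: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # OPENAI_API_KEY is the only required variable.
    key = env.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is required")
    return Settings(
        openai_api_key=key,
        qdrant_host=env.get("QDRANT_HOST", "localhost"),
        qdrant_port=int(env.get("QDRANT_PORT", "6333")),
        # Empty strings are treated the same as unset.
        langchain_api_key=env.get("LANGCHAIN_API_KEY") or None,
        langchain_tracing_v2=env.get("LANGCHAIN_TRACING_V2", "").lower() == "true",
        langchain_project=env.get("LANGCHAIN_PROJECT") or None,
    )
```

Accepting any mapping instead of reading `os.environ` directly makes the loader easy to exercise in tests without touching the real environment.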

API

POST /upload — Upload a PDF or TXT file (multipart form file). Content is chunked, embedded, and stored in Qdrant.
POST /query — Body: {"question": "..."}. Returns answer, confidence, sources, latency_ms (and clarification: true when the router asks for clarification).
GET /health — Liveness check.
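
To give a feel for the chunking step that /upload performs before embedding, here is a minimal fixed-size splitter with overlap. It is an illustration under assumptions: the project most likely relies on LlamaIndex's own splitters, and `chunk_text` with its `chunk_size` and `overlap` defaults is hypothetical.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> List[str]:
    """Split text into character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap keeps sentences that straddle a boundary visible in both neighbouring chunks, at the cost of storing some text twice.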

Example flow

Upload a document:

curl -X POST http://localhost:8000/upload -F "file=@doc.pdf"

Ask a question:

curl -X POST http://localhost:8000/query -H "Content-Type: application/json" -d '{"question": "What is the main topic of the document?"}'

Example response:

{
  "answer": "The document describes...",
  "confidence": 0.92,
  "sources": [{"text_preview": "..."}],
  "latency_ms": 1850.5
}
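
The same query can be issued from Python with only the standard library. The sketch below builds the request and parses the documented response fields; `QueryResult`, `build_query_request`, `parse_query_response`, and `ask` are hypothetical helper names, and the default base URL assumes the local setup described above.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import List


@dataclass
class QueryResult:
    answer: str
    confidence: float
    sources: List[dict]
    latency_ms: float
    clarification: bool = False


def build_query_request(
    question: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    # Mirrors the curl call: POST /query with a JSON body.
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_query_response(raw: bytes) -> QueryResult:
    payload = json.loads(raw)
    return QueryResult(
        answer=payload["answer"],
        confidence=float(payload["confidence"]),
        sources=list(payload.get("sources", [])),
        latency_ms=float(payload["latency_ms"]),
        # Only present when the router asks for clarification.
        clarification=bool(payload.get("clarification", False)),
    )


def ask(question: str, base_url: str = "http://localhost:8000") -> QueryResult:
    # Requires the API to be running; not exercised without a server.
    with urllib.request.urlopen(build_query_request(question, base_url)) as resp:
        return parse_query_response(resp.read())
```

Keeping request construction and response parsing separate from the network call lets both be checked without a running server.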

Performance

Target: p95 response time < 3 seconds for a typical query (upload not included).
Achieved latency depends on document size, retrieval count, and OpenAI latency; latency_ms is returned on every /query response.
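
One way to check the target is to collect `latency_ms` values from a batch of /query responses and compute the 95th percentile. The sketch below uses the nearest-rank definition; the function names are hypothetical, and the 3000 ms default simply restates the target above.

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_latency_target(latencies_ms: Sequence[float], target_ms: float = 3000.0) -> bool:
    # True when the p95 of the collected latencies is under the target.
    return percentile_nearest_rank(latencies_ms, 95) < target_ms
```

With few samples the p95 is dominated by the slowest one or two requests, so a larger batch gives a more stable reading.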

Observability (LangSmith)

With LANGCHAIN_TRACING_V2=true and LANGCHAIN_API_KEY set, LangChain/LangGraph calls are traced automatically. View runs and traces in your LangSmith project (screenshot placeholder: add a screenshot of the LangSmith trace for a /query request).

Tests

Unit + integration: uv run pytest
Integration only: uv run pytest -m integration

No real API keys are required: OpenAI and Qdrant are mocked or run in-memory in tests.
