AI Backend

AI backend platform built with FastAPI, featuring RAG pipelines, vector search, streaming AI responses, and scalable async architecture.

Architecture Overview

┌─────────────────────────────────────────────────────────┐
│                      FastAPI App                        │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐               │
│  │  Routes  │  │   Auth   │  │  Health  │               │
│  └────┬─────┘  └────┬─────┘  └──────────┘               │
│       │             │                                   │
│  ┌────▼─────────────▼──────┐                            │
│  │      Service Layer      │                            │
│  │ AuthService │ AIService │                            │
│  │ FileService │ SubService│                            │
│  └────┬─────────────┬──────┘                            │
│       │             │                                   │
│  ┌────▼─────┐  ┌────▼──────────────────┐                │
│  │  Repos   │  │      AI Pipeline      │                │
│  │   (DB)   │  │  Embed → Index → RAG  │                │
│  └────┬─────┘  └────┬──────────────────┘                │
│       │             │                                   │
│  ┌────▼─────┐  ┌────▼────────┐   ┌──────────────┐       │
│  │PostgreSQL│  │   Qdrant    │   │  OpenAI API  │       │
│  └──────────┘  └─────────────┘   └──────────────┘       │
└────────┬────────────────────────────────────────────────┘
         │
    ┌────▼───────────────────────┐
    │   Redis + Celery Workers   │
    │  Embedding │ Indexing │ GC │
    └────────────────────────────┘

Tech Stack

| Layer | Technology |
| --- | --- |
| Framework | FastAPI + Uvicorn |
| Database | PostgreSQL 16 + SQLAlchemy 2.0 Async |
| Cache / Broker | Redis 7 |
| Vector DB | Qdrant |
| AI Provider | OpenAI (GPT-4o + text-embedding-3-small) |
| Background Tasks | Celery |
| Auth | JWT (access + refresh tokens) |
| Validation | Pydantic v2 |
| Migrations | Alembic |
| Testing | Pytest + pytest-asyncio + HTTPX |
| Containerization | Docker + Docker Compose |

Features

Authentication

JWT access tokens (30-minute expiry)
Refresh token rotation with secure hashing
bcrypt password hashing
Role-based access control (user / admin)

AI Capabilities

Document Chat: RAG-powered Q&A over uploaded documents
Resume Analyzer: structured resume feedback with optional job description matching
Code Review: security, performance, and quality analysis
Meeting Summarizer: transcript summarization with action items
Streaming Responses: real-time SSE token streaming for general chat and document chat

File Processing Pipeline

Upload → Validate → Store → Queue Celery Task
  → Extract Text (PDF/DOCX/TXT/Code)
  → Chunk Text (configurable window + overlap)
  → Generate Embeddings (OpenAI batch)
  → Index to Qdrant
  → Update File Status → Done
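
The chunking step can be sketched as a sliding window over words. This is an assumption about how the window and overlap interact, not the project's actual code; `chunk_words` is a hypothetical name, and the defaults mirror the documented `CHUNK_SIZE` and `CHUNK_OVERLAP` values.

```python
def chunk_words(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Split text into word windows where consecutive windows share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

With `chunk_size=4` and `overlap=1`, a ten-word text yields three chunks, each starting with the last word of the previous one.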

Vector Search (RAG)

Semantic similarity search via Qdrant cosine distance
Per-user + per-file metadata filtering
Configurable top-k retrieval with score threshold
Context injection into structured prompts
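
To make the retrieval steps concrete, here is a pure-Python sketch of filtered cosine search followed by prompt assembly. It deliberately does not use the Qdrant client; `Chunk`, `search`, and `build_prompt` are hypothetical names, and the prompt wording is an illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    user_id: str
    file_id: str
    text: str
    vector: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(chunks, query, *, user_id, file_id=None, top_k=5, score_threshold=0.0):
    """Return up to top_k (score, chunk) pairs, best first, restricted by metadata."""
    scored = []
    for chunk in chunks:
        if chunk.user_id != user_id:
            continue  # per-user isolation
        if file_id is not None and chunk.file_id != file_id:
            continue  # optional per-file filter
        score = cosine(query, chunk.vector)
        if score >= score_threshold:
            scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def build_prompt(question: str, hits) -> str:
    # Retrieved chunk texts are injected as context ahead of the question.
    context = "\n\n".join(chunk.text for _, chunk in hits)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
```

In the real service the filtering and scoring happen inside Qdrant; the sketch only shows the shape of the inputs and outputs.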

Subscription System

Free / Pro / Enterprise tiers
Per-month request and token quotas
Quota enforcement via dependency injection
Stripe-ready schema (customer_id, subscription_id columns)
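
A quota check of this kind might look like the sketch below. The per-tier limits are invented for illustration and are not the project's real numbers; `Plan`, `Usage`, and `enforce_quota` are hypothetical names. In the application, a function like this would typically be wrapped in a FastAPI dependency (via `Depends`) so it runs before each AI route handler.

```python
from dataclasses import dataclass


class QuotaExceeded(Exception):
    """Raised when a request would exceed the plan's monthly limits."""


@dataclass(frozen=True)
class Plan:
    max_requests: int
    max_tokens: int


# Illustrative numbers only; the real tier limits are not documented here.
PLANS = {
    "free": Plan(max_requests=100, max_tokens=50_000),
    "pro": Plan(max_requests=5_000, max_tokens=2_000_000),
    "enterprise": Plan(max_requests=100_000, max_tokens=50_000_000),
}


@dataclass
class Usage:
    requests: int = 0
    tokens: int = 0


def enforce_quota(tier: str, usage: Usage, requested_tokens: int = 0) -> None:
    """Raise QuotaExceeded if one more request would go over the tier's limits."""
    plan = PLANS[tier]
    if usage.requests + 1 > plan.max_requests:
        raise QuotaExceeded(f"{tier} plan request quota reached")
    if usage.tokens + requested_tokens > plan.max_tokens:
        raise QuotaExceeded(f"{tier} plan token quota reached")
```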

Background Processing (Celery)

embeddings queue — document processing & indexing
indexing queue — vector operations
cleanup queue — expired token purge, orphaned file cleanup
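
The queue split could be expressed as a routing table like the one below. The dict has the shape accepted by Celery's `task_routes` setting, which supports glob patterns; the task module paths are assumptions, and `queue_for` is a hypothetical helper that only imitates the lookup for demonstration.

```python
from fnmatch import fnmatch

# Queue names come from the list above; the module paths are assumed.
TASK_ROUTES = {
    "app.tasks.embeddings.*": {"queue": "embeddings"},
    "app.tasks.indexing.*": {"queue": "indexing"},
    "app.tasks.cleanup.*": {"queue": "cleanup"},
}


def queue_for(task_name: str, default: str = "celery") -> str:
    """Return the queue whose glob pattern matches task_name ("celery" is Celery's default queue)."""
    for pattern, route in TASK_ROUTES.items():
        if fnmatch(task_name, pattern):
            return route["queue"]
    return default
```

A worker can then be restricted to a single queue with Celery's `-Q` option.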

Project Structure

app/
├── api/v1/          # Thin route handlers
│   ├── auth.py
│   ├── ai.py
│   ├── files.py
│   ├── stream.py
│   └── subscriptions.py
├── ai/              # OpenAI integration & RAG pipeline
│   ├── client.py
│   ├── completions.py
│   ├── embeddings.py
│   └── pipeline.py
├── core/            # Security, exceptions, logging
├── db/              # SQLAlchemy engine & session
├── models/          # ORM models (7 tables)
├── schemas/         # Pydantic v2 request/response schemas
├── repositories/    # Data access layer (no business logic)
├── services/        # Business logic layer
├── tasks/           # Celery workers
├── vector/          # Qdrant client, indexer, retriever
├── streaming/       # SSE helpers
├── middleware/      # Request logging, rate limiting
├── dependencies/    # FastAPI DI (auth, db, quota)
├── utils/           # File extraction, text chunking
└── tests/           # pytest async test suite
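
The split between `repositories/` (data access only) and `services/` (business logic) can be pictured with the following sketch. All class and field names are hypothetical, and the in-memory repository stands in for the SQLAlchemy-backed one.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class FileRecord:
    id: int
    owner_id: str
    name: str
    status: str = "pending"


class FileRepository(Protocol):
    """Data access only: no filtering rules beyond the query itself."""

    def list_for_user(self, owner_id: str) -> list[FileRecord]: ...


class InMemoryFileRepository:
    def __init__(self, rows: list[FileRecord]) -> None:
        self._rows = list(rows)

    def list_for_user(self, owner_id: str) -> list[FileRecord]:
        return [row for row in self._rows if row.owner_id == owner_id]


class FileService:
    """Business rules live here; the repository is injected."""

    def __init__(self, repo: FileRepository) -> None:
        self._repo = repo

    def list_ready_files(self, owner_id: str) -> list[FileRecord]:
        # Rule: only files that finished processing can be used for document chat.
        return [f for f in self._repo.list_for_user(owner_id) if f.status == "done"]
```

Route handlers stay thin by delegating to the service, and tests can swap in the in-memory repository without a database.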

Quick Start

Using Docker (recommended)

cp .env.example .env
# Edit .env with your OPENAI_API_KEY and SECRET_KEY

docker compose up --build

The API will be available at http://localhost:8000.

Local Development

python -m venv venv
source venv/bin/activate          # Windows: venv\Scripts\activate
pip install -r requirements.txt

cp .env.example .env
# Fill in DATABASE_URL, REDIS_URL, OPENAI_API_KEY, SECRET_KEY

alembic upgrade head
uvicorn app.main:app --reload

API Reference

Authentication

POST /api/v1/auth/register    Register new user
POST /api/v1/auth/login       Obtain access + refresh tokens
POST /api/v1/auth/refresh     Rotate refresh token

Files

POST   /api/v1/files/upload   Upload document (PDF/DOCX/TXT/code)
GET    /api/v1/files          List user's files
DELETE /api/v1/files/{id}     Delete file + vectors

AI Endpoints

POST /api/v1/ai/chat              General AI chat
POST /api/v1/ai/document-chat     RAG chat over uploaded document
POST /api/v1/ai/resume-analyze    Resume analysis
POST /api/v1/ai/code-review       Code review
POST /api/v1/ai/meeting-summary   Meeting transcript summarizer

Streaming (SSE)

POST /api/v1/stream/chat              Streaming general chat
POST /api/v1/stream/document-chat     Streaming RAG document chat

SSE events: token, done, error
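
On the server side, the three event types could be framed as in this sketch. The payload shapes (`{"token": ...}`, `{"message": ...}`) are assumptions inferred from the client example in this README, and `sse_event` and `stream_tokens` are hypothetical helper names.

```python
import json
from typing import Iterable, Iterator


def sse_event(event: str, data: dict) -> str:
    """Format one Server-Sent Events frame; the blank line terminates the event."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield a token event per item, then done; emit error if the source fails."""
    try:
        for token in tokens:
            yield sse_event("token", {"token": token})
        yield sse_event("done", {})
    except Exception as exc:
        yield sse_event("error", {"message": str(exc)})
```

A generator like this can be passed to a streaming response with the `text/event-stream` media type.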

Subscriptions

GET  /api/v1/subscriptions/plans    Available plans
GET  /api/v1/subscriptions/me       Current subscription
POST /api/v1/subscriptions/upgrade  Upgrade tier
GET  /api/v1/subscriptions/usage    Current period usage

Health

GET /health    Service health check

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| `DATABASE_URL` | PostgreSQL async URL | required |
| `REDIS_URL` | Redis URL | required |
| `OPENAI_API_KEY` | OpenAI secret key | required |
| `SECRET_KEY` | JWT signing secret | required |
| `QDRANT_URL` | Qdrant HTTP URL | `http://localhost:6333` |
| `OPENAI_MODEL` | Chat model | `gpt-4o` |
| `OPENAI_EMBEDDING_MODEL` | Embedding model | `text-embedding-3-small` |
| `ACCESS_TOKEN_EXPIRE_MINUTES` | Access token TTL in minutes | `30` |
| `REFRESH_TOKEN_EXPIRE_DAYS` | Refresh token TTL in days | `7` |
| `MAX_FILE_SIZE` | Maximum upload size in bytes | `10485760` (10 MiB) |
| `CHUNK_SIZE` | Chunk size in words | `512` |
| `CHUNK_OVERLAP` | Overlap between consecutive chunks, in words | `50` |
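
For reference, the variables and defaults above could be loaded as in this standard-library sketch. The project may well use a Pydantic settings class instead; `Settings` and `load_settings` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("DATABASE_URL", "REDIS_URL", "OPENAI_API_KEY", "SECRET_KEY")


@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    openai_api_key: str
    secret_key: str
    qdrant_url: str
    openai_model: str
    openai_embedding_model: str
    access_token_expire_minutes: int
    refresh_token_expire_days: int
    max_file_size: int
    chunk_size: int
    chunk_overlap: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from env, failing fast when a required variable is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required environment variables: {', '.join(missing)}")
    return Settings(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        openai_api_key=env["OPENAI_API_KEY"],
        secret_key=env["SECRET_KEY"],
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
        openai_model=env.get("OPENAI_MODEL", "gpt-4o"),
        openai_embedding_model=env.get("OPENAI_EMBEDDING_MODEL", "text-embedding-3-small"),
        access_token_expire_minutes=int(env.get("ACCESS_TOKEN_EXPIRE_MINUTES", "30")),
        refresh_token_expire_days=int(env.get("REFRESH_TOKEN_EXPIRE_DAYS", "7")),
        max_file_size=int(env.get("MAX_FILE_SIZE", "10485760")),
        chunk_size=int(env.get("CHUNK_SIZE", "512")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "50")),
    )
```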

Running Tests

# Requires a test PostgreSQL database: ai_saas_test
pytest -v

Database Migrations

# Generate migration after model changes
alembic revision --autogenerate -m "description"

# Apply migrations
alembic upgrade head

# Roll back one
alembic downgrade -1

Streaming Example (JavaScript)

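// Runs as an ES module on Node 18+; `token` is an access token from /api/v1/auth/login.
const response = await fetch('http://localhost:8000/api/v1/stream/chat', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${token}`
  },
  body: JSON.stringify({ message: 'Explain async/await in Python' })
});
if (!response.ok) throw new Error(`Request failed: ${response.status}`);

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact across chunk boundaries
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop(); // keep the trailing partial line for the next chunk
  for (const line of lines) {
    if (line.startsWith('data:')) {
      const data = JSON.parse(line.slice(5));
      if (data.token) process.stdout.write(data.token);
    }
  }
}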
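
For Python clients, the `data:` lines can be consumed in a similar way. The sketch below takes any iterable of raw byte chunks (for example from an HTTP client's streaming response) and yields token strings; `iter_tokens` is a hypothetical helper, and the `{"token": ...}` payload shape is assumed from the JavaScript example.

```python
import codecs
import json
from typing import Iterable, Iterator


def iter_tokens(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield token strings from SSE byte chunks, tolerating lines split across chunks."""
    decoder = codecs.getincrementaldecoder("utf-8")()  # handles split multi-byte characters
    buffer = ""
    for chunk in chunks:
        buffer += decoder.decode(chunk)
        # Everything before the last newline is complete; the remainder waits for more data.
        *lines, buffer = buffer.split("\n")
        for line in lines:
            if line.startswith("data:"):
                payload = json.loads(line[5:])
                if "token" in payload:
                    yield payload["token"]
```

Network chunks do not align with event boundaries, which is why the remainder after the last newline is carried over to the next iteration.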
