Merged
37 changes: 33 additions & 4 deletions CLAUDE.md
@@ -8,24 +8,39 @@ Duckgres is a PostgreSQL wire protocol server backed by DuckDB. It allows any Po

## Architecture

Duckgres supports three run modes: `standalone` (default), `control-plane`, and `worker`.

```
PostgreSQL Client → TLS → Duckgres Server → DuckDB (per-user database)
Standalone: PostgreSQL Client → TLS → Duckgres Server → DuckDB (per-user database)
Control Plane: PostgreSQL Client → TLS → Control Plane → (FD pass) → Worker → DuckDB
```

### Key Components

- **main.go**: Entry point, configuration loading (CLI flags, env vars, YAML)
- **server/server.go**: Server struct, connection handling, graceful shutdown
- **main.go**: Entry point, configuration loading (CLI flags, env vars, YAML), mode routing
- **server/server.go**: Server struct, connection handling, graceful shutdown, `CreateDBConnection()` (standalone function)
- **server/conn.go**: Client connection handling, query execution, COPY protocol
- **server/protocol.go**: PostgreSQL wire protocol message encoding/decoding
- **server/exports.go**: Exported wrappers for protocol functions (used by control plane workers)
- **server/catalog.go**: pg_catalog compatibility views and macros initialization
- **server/types.go**: Type OID mapping between DuckDB and PostgreSQL
- **server/ratelimit.go**: Rate limiting for brute-force protection
- **server/certs.go**: Auto-generation of self-signed TLS certificates
- **server/parent.go**: Child process spawning for ProcessIsolation mode
- **server/worker.go**: Per-connection child worker (ProcessIsolation mode)
- **transpiler/**: AST-based SQL transpiler (PostgreSQL → DuckDB)
- `transpiler.go`: Main API, transform pipeline orchestration
- `config.go`: Configuration types (DuckLakeMode, ConvertPlaceholders)
- `transform/`: Individual transform implementations
- **controlplane/**: Multi-process control plane architecture
- `proto/worker.proto`: gRPC service definition (Configure, AcceptConnection, CancelQuery, Drain, Health, Shutdown)
- `proto/*.pb.go`: Generated gRPC/protobuf code
- `fdpass/fdpass.go`: Unix socket FD passing via SCM_RIGHTS
- `worker.go`: Long-lived worker process (gRPC server, FD receiver, session handler)
- `dbpool.go`: Per-session DuckDB database pool management
- `control.go`: Control plane main loop (TCP listener, rate limiting, connection routing)
- `pool.go`: Worker pool management (spawn, health check, least-connections routing, rolling update)
- `handover.go`: Graceful deployment (listener FD transfer between control planes)
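The `fdpass/fdpass.go` entry above refers to passing file descriptors over a Unix socket with `SCM_RIGHTS`. The following is a minimal, Unix-only sketch of that technique using only the Go standard library. It is not the code in this PR: `sendFD`, `recvFD`, `socketPair`, and `roundTrip` are hypothetical names chosen for illustration, and a pipe stands in for the client TCP connection.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"os"
	"syscall"
)

// sendFD (hypothetical helper) sends f's descriptor over conn via SCM_RIGHTS.
func sendFD(conn *net.UnixConn, f *os.File) error {
	rights := syscall.UnixRights(int(f.Fd()))
	// One byte of ordinary data accompanies the control message.
	_, _, err := conn.WriteMsgUnix([]byte{0}, rights, nil)
	return err
}

// recvFD (hypothetical helper) receives exactly one descriptor from conn.
func recvFD(conn *net.UnixConn) (*os.File, error) {
	data := make([]byte, 1)
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 32-bit descriptor
	_, oobn, _, _, err := conn.ReadMsgUnix(data, oob)
	if err != nil {
		return nil, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return nil, err
	}
	if len(msgs) != 1 {
		return nil, errors.New("expected exactly one control message")
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil {
		return nil, err
	}
	if len(fds) != 1 {
		return nil, errors.New("expected exactly one descriptor")
	}
	return os.NewFile(uintptr(fds[0]), "passed-fd"), nil
}

// socketPair returns two connected Unix-domain stream sockets.
func socketPair() (*net.UnixConn, *net.UnixConn, error) {
	fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return nil, nil, err
	}
	var conns [2]*net.UnixConn
	for i, fd := range fds {
		f := os.NewFile(uintptr(fd), "socketpair")
		c, err := net.FileConn(f) // FileConn duplicates the descriptor
		f.Close()
		if err != nil {
			return nil, nil, err
		}
		uc, ok := c.(*net.UnixConn)
		if !ok {
			return nil, nil, errors.New("not a Unix socket")
		}
		conns[i] = uc
	}
	return conns[0], conns[1], nil
}

// roundTrip passes the read end of a pipe across a socket pair, then reads
// msg back through the received descriptor.
func roundTrip(msg string) (string, error) {
	a, b, err := socketPair()
	if err != nil {
		return "", err
	}
	defer a.Close()
	defer b.Close()
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer w.Close()
	err = sendFD(a, r)
	r.Close() // the sender's copy is no longer needed once it is in flight
	if err != nil {
		return "", err
	}
	passed, err := recvFD(b)
	if err != nil {
		return "", err
	}
	defer passed.Close()
	if _, err := w.WriteString(msg); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(passed, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := roundTrip("hello from the control plane")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

In a real control plane the two ends of the socket would live in different processes; the socket pair here only keeps the sketch self-contained.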

## PostgreSQL Wire Protocol

@@ -74,10 +89,24 @@ Supports bulk data transfer:
- **COPY FROM STDIN**: Receives data from client, inserts row by row
- Supports CSV format with HEADER, DELIMITER, and NULL options

## Run Modes

- **standalone** (default): Single process that handles everything. Existing behavior is unchanged.
- **control-plane**: Multi-process. Accepts TCP connections, passes FDs to worker pool via Unix sockets.
- **worker**: Long-lived child process spawned by control plane. Handles TLS, auth, query execution via gRPC + FD passing.

Key CLI flags for control plane mode:
- `--mode control-plane|worker|standalone`
- `--worker-count N` (default 4)
- `--socket-dir /path` (Unix sockets for gRPC + FD passing)
- `--handover-socket /path` (graceful deployment between control planes)
- `--grpc-socket /path` (worker, set by control plane at spawn)
- `--fd-socket /path` (worker, set by control plane at spawn)

## Configuration

Configuration sources (highest to lowest priority):
1. CLI flags (`--port`, `--config`, etc.)
1. CLI flags (`--port`, `--config`, `--mode`, etc.)
2. Environment variables (`DUCKGRES_PORT`, etc.)
3. YAML config file
4. Built-in defaults
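The precedence order above can be sketched in Go as a chain of fallbacks for a single setting. This is an illustration only, not the loader in `main.go`: `resolvePort` is a hypothetical function, the default of 5432 is an assumed value, and skipping an unparsable environment value is a design choice made for this sketch.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// resolvePort (hypothetical) picks the port from the highest-priority source
// that provides one. flagPort is nil when --port was not passed, and yamlPort
// is nil when the YAML file does not set a port.
func resolvePort(flagPort *int, envValue string, yamlPort *int, defaultPort int) int {
	if flagPort != nil {
		return *flagPort // 1. CLI flag
	}
	if envValue != "" {
		// 2. Environment variable; an unparsable value falls through (assumption).
		if p, err := strconv.Atoi(envValue); err == nil {
			return p
		}
	}
	if yamlPort != nil {
		return *yamlPort // 3. YAML config file
	}
	return defaultPort // 4. Built-in default
}

func main() {
	yamlPort := 6543 // pretend the YAML file set this value
	port := resolvePort(nil, os.Getenv("DUCKGRES_PORT"), &yamlPort, 5432)
	fmt.Println("listening on port", port)
}
```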
89 changes: 80 additions & 9 deletions README.md
@@ -26,6 +26,8 @@ A PostgreSQL wire protocol compatible server backed by DuckDB. Connect with any
- [Rate Limiting](#rate-limiting)
- [Usage Examples](#usage-examples)
- [Architecture](#architecture)
- [Standalone Mode](#standalone-mode)
- [Control Plane Mode](#control-plane-mode)
- [Two-Tier Query Processing](#two-tier-query-processing)
- [Supported Features](#supported-features)
- [Limitations](#limitations)
@@ -45,6 +47,7 @@ A PostgreSQL wire protocol compatible server backed by DuckDB. Connect with any
- **DuckLake Integration**: Auto-attach DuckLake catalogs for lakehouse workflows
- **Rate Limiting**: Built-in protection against brute-force attacks
- **Graceful Shutdown**: Waits for in-flight queries before exiting
- **Control Plane Mode**: Multi-process architecture with long-lived workers, zero-downtime deployments, and rolling updates
- **Flexible Configuration**: YAML config files, environment variables, and CLI flags
- **Prometheus Metrics**: Built-in metrics endpoint for monitoring

@@ -177,12 +180,16 @@ export POSTHOG_HOST=eu.i.posthog.com
./duckgres --help

Options:
-config string Path to YAML config file
-host string Host to bind to
-port int Port to listen on
-data-dir string Directory for DuckDB files
-cert string TLS certificate file
-key string TLS private key file
-config string Path to YAML config file
-host string Host to bind to
-port int Port to listen on
-data-dir string Directory for DuckDB files
-cert string TLS certificate file
-key string TLS private key file
-mode string Run mode: standalone (default), control-plane, or worker
-worker-count int Number of worker processes (control-plane mode, default 4)
-socket-dir string Unix socket directory (control-plane mode)
-handover-socket string Handover socket for graceful deployment (control-plane mode)
```

## DuckDB Extensions
@@ -428,6 +435,12 @@ GROUP BY name;

## Architecture

Duckgres supports two user-facing run modes: **standalone** (single process, default) and **control-plane** (multi-process with worker pool). A third mode, `worker`, is used by the processes that the control plane spawns.

### Standalone Mode

The default mode runs everything in a single process:

```
┌─────────────────┐
│ PostgreSQL │
@@ -449,6 +462,64 @@ GROUP BY name;
└─────────────────┘
```

### Control Plane Mode

For production deployments, control-plane mode splits the server into a **control plane** (connection management, routing) and a pool of long-lived **worker processes** (query execution). This enables zero-downtime deployments and cross-session DuckDB cache reuse.

```
CONTROL PLANE (duckgres --mode control-plane)
┌──────────────────────────────────────────┐
PG Client ──TLS──>│ TCP Listener │
│ Rate Limiting │
│ Connection Router (least-connections) │
│ │ FD pass via Unix socket (SCM_RIGHTS) │
│ ▼ │
│ gRPC Client ─────────────────────────+ │
└──────────────────────────────────────────┘
gRPC (UDS)
WORKER POOL ▼
┌──────────────────────────────────────────┐
│ Worker 1 (duckgres --mode worker) │
│ gRPC Server (Configure, Health, Drain) │
│ FD Receiver (Unix socket) │
│ Shared DuckDB instance (long-lived) │
│ ├── Session 1 (goroutine) │
│ ├── Session 2 (goroutine) │
│ └── Session N ... │
├──────────────────────────────────────────┤
│ Worker 2 ... │
└──────────────────────────────────────────┘
```
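The least-connections routing shown in the diagram can be illustrated with a short Go sketch. It assumes a hypothetical `worker` struct and `pickLeastConnections` function; the actual logic in `controlplane/pool.go` may track more state and is not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// worker is a hypothetical stand-in for a pooled worker process.
type worker struct {
	id       int
	active   int  // sessions currently assigned to this worker
	draining bool // true while the worker is being replaced
}

var errNoWorkers = errors.New("no available workers")

// pickLeastConnections returns the non-draining worker with the fewest
// active sessions. Ties go to the worker that appears first in the pool.
func pickLeastConnections(pool []*worker) (*worker, error) {
	var best *worker
	for _, w := range pool {
		if w.draining {
			continue
		}
		if best == nil || w.active < best.active {
			best = w
		}
	}
	if best == nil {
		return nil, errNoWorkers
	}
	return best, nil
}

func main() {
	pool := []*worker{
		{id: 1, active: 3},
		{id: 2, active: 1},
		{id: 3, active: 0, draining: true},
	}
	w, err := pickLeastConnections(pool)
	if err != nil {
		fmt.Println(err)
		return
	}
	w.active++ // the new session now counts against this worker
	fmt.Printf("routed to worker %d (%d active sessions)\n", w.id, w.active)
}
```

Skipping draining workers is an assumption made here so that the sketch composes with rolling updates; a production router would also need locking around the counters.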

Start in control-plane mode:

```bash
# Start with 4 workers (default)
./duckgres --mode control-plane --port 5432 --worker-count 4

# Connect with psql (identical to standalone mode)
PGPASSWORD=postgres psql "host=localhost port=5432 user=postgres sslmode=require"
```

**Zero-downtime deployment** using the handover protocol:

```bash
# Start the first control plane with a handover socket
./duckgres --mode control-plane --port 5432 --handover-socket /var/run/duckgres/handover.sock

# Deploy a new version - it takes over the listener and workers without dropping connections
./duckgres-v2 --mode control-plane --port 5432 --handover-socket /var/run/duckgres/handover.sock
```

**Rolling worker updates** via signal:

```bash
# Replace workers one at a time (drains sessions before replacing each worker)
kill -USR2 <control-plane-pid>
```
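As a rough illustration of the signal-driven rolling update, the Go sketch below listens for `SIGUSR2` and then replaces workers one at a time, draining each before it is replaced. The `worker` type, `drain`, and `rollingUpdate` are hypothetical; real draining would wait for sessions to finish rather than reset a counter, and the process signals itself only to keep the example self-contained (Unix-only).

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// worker is a hypothetical handle to a worker process.
type worker struct {
	version  string
	sessions int
}

// drain stands in for waiting until all sessions on the worker have finished.
func (w *worker) drain() { w.sessions = 0 }

// rollingUpdate drains and replaces workers one at a time, so at most one
// worker is out of service at any moment. It returns the number replaced.
func rollingUpdate(pool []*worker, spawn func() *worker) int {
	replaced := 0
	for i, old := range pool {
		old.drain()       // let in-flight sessions finish first
		pool[i] = spawn() // then swap in a fresh worker
		replaced++
	}
	return replaced
}

func main() {
	pool := []*worker{{version: "v1", sessions: 2}, {version: "v1", sessions: 5}}

	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGUSR2)

	// Signal ourselves so the example runs without an external kill command.
	if err := syscall.Kill(os.Getpid(), syscall.SIGUSR2); err != nil {
		fmt.Println("kill failed:", err)
		return
	}

	<-sigs
	n := rollingUpdate(pool, func() *worker { return &worker{version: "v2"} })
	fmt.Printf("replaced %d workers; first worker is now %s\n", n, pool[0].version)
}
```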

## Two-Tier Query Processing

Duckgres uses a two-tier approach to handle both PostgreSQL and DuckDB-specific SQL syntax transparently:
@@ -509,9 +580,9 @@ The following DuckDB features work transparently through the fallback mechanism:

## Limitations

- **Single Process**: Each user's database is opened in the same process
- **No Replication**: Single-node only
- **Limited System Catalog**: Some `pg_*` system tables are not available
- **Single Node**: No built-in replication or clustering
- **Limited System Catalog**: Some `pg_*` system tables are stubs (return empty)
- **Type OID Mapping**: Incomplete (some types show as "unknown")

## Dependencies
