PostgreSQL Wire Protocol over QUIC

Status: Draft v0.1
Editors: @arkstack-dev
Repository: https://github.com/arkstack-dev/pg-quic
Last updated: 2026-05-24


Abstract

This document specifies a binding of the PostgreSQL frontend/backend wire protocol (commonly called "v3", as defined in the PostgreSQL documentation, chapter "Frontend/Backend Protocol") onto QUIC (RFC 9000) as transport.

The binding preserves the v3 message format byte-for-byte. It replaces the TCP+TLS transport assumed by the current protocol with a QUIC connection in which each bidirectional client-initiated stream carries exactly one PostgreSQL session. TLS 1.3 is provided by QUIC itself; the v3 SSLRequest handshake is not used.

The goal of this specification is interoperability: any client or server implementation that follows this document MUST be wire-compatible with any other, regardless of the underlying QUIC library, I/O model, or programming language.

This document does not specify any change to the v3 message format itself, nor to PostgreSQL server internals. It defines a transport binding only.


1. Motivation

The PostgreSQL v3 wire protocol has run over TCP (optionally wrapped in TLS) since 2003. That transport is robust and ubiquitous, but it imposes several costs that QUIC addresses directly.

1.1 Connection establishment cost

A modern PG connection over TLS requires:

  • TCP three-way handshake (1 RTT)
  • TLS 1.3 handshake (1 RTT; 0-RTT early data is possible on session resumption but is replayable)
  • PG StartupMessage + authentication exchange (1+ RTT depending on auth method)

QUIC folds the transport and TLS handshakes into a single 1-RTT exchange, reducing connect latency materially for short-lived clients, edge workers, and serverless functions.

1.2 Query cancellation

Today, cancelling an in-flight query requires the client to open a second TCP connection to the server and send a CancelRequest containing a secret key issued at session start. This is awkward to implement, racy in practice, and unreliable behind some NATs and load balancers, because the second connection may land on a different backend.

Over QUIC, cancellation maps naturally to STOP_SENDING on the session's stream with a dedicated application error code (see §7). No second connection, no secret key, no routing ambiguity. The BackendKeyData and CancelRequest mechanisms are removed entirely from this binding.

1.3 Connection migration

QUIC connections survive client IP and port changes (RFC 9000 §9). For long-lived database sessions from mobile clients, laptops moving between networks, or NAT rebinding events, this eliminates a class of spurious disconnects that today force application-level retry and reconnection logic.

1.4 Multiplexed sessions per connection

A single QUIC connection can carry many independent bidirectional streams without head-of-line blocking between them. Mapping one PG session per stream lets a client maintain hundreds of logical sessions over a single QUIC connection, amortizing handshake and keepalive costs, and reducing client-side file descriptor and ephemeral port pressure to a single UDP socket.

This binding specifies the wire mapping only; how a server allocates backends to streams is an implementation concern (see §9).

1.5 Non-goals

  • This document does not change the v3 message format.
  • This document does not require changes to the PostgreSQL server. A reference implementation can be built as a proxy that terminates QUIC and forwards v3 messages to a stock PostgreSQL server over TCP.
  • This document does not define an HTTP/3 binding. HTTP semantics (methods, headers, status codes) are not used. QUIC is used as a transport only.

2. Terminology

The key words MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY in this document are to be interpreted as described in RFC 2119 and RFC 8174 when, and only when, they appear in all capitals.

  • v3 protocol — The PostgreSQL frontend/backend wire protocol as defined in the PostgreSQL documentation.
  • Session — One complete v3 protocol exchange, from StartupMessage through connection termination.
  • Session stream — A QUIC bidirectional stream carrying exactly one session.
  • Endpoint — A QUIC client or server as defined in RFC 9000.
  • End-entity certificate — The leaf X.509 certificate presented by the server in the QUIC TLS 1.3 Certificate handshake message.

3. Stream model

3.1 Sessions and streams

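Each PostgreSQL session is carried on exactly one client-initiated bidirectional QUIC stream. A single QUIC connection MAY carry any number of concurrent session streams up to the bidirectional stream limit advertised by the server, which is set initially by the initial_max_streams_bidi transport parameter and can be raised later with MAX_STREAMS frames (RFC 9000 §4.6).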

The client opens a new bidirectional stream to begin a session. The first bytes written to that stream MUST be a v3 StartupMessage.
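As an illustration of the first bytes a client writes to a session stream, the following minimal sketch encodes a StartupMessage in the layout given by the v3 documentation (Int32 length, Int32 protocol version 196608, NUL-terminated name/value pairs, final NUL). The function name and the sample parameters are hypothetical and chosen only for this example.

```python
import struct

PROTOCOL_VERSION_3_0 = 196608  # (3 << 16) | 0, per the v3 protocol documentation


def build_startup_message(params: dict[str, str]) -> bytes:
    """Encode a v3 StartupMessage (hypothetical helper, not defined by this spec)."""
    body = struct.pack("!i", PROTOCOL_VERSION_3_0)
    for name, value in params.items():
        # Each parameter is a NUL-terminated name followed by a NUL-terminated value.
        body += name.encode("utf-8") + b"\x00" + value.encode("utf-8") + b"\x00"
    body += b"\x00"  # terminator after the last name/value pair
    # The length field counts itself (4 bytes) plus the body.
    return struct.pack("!i", len(body) + 4) + body


# Sample parameters are illustrative only.
msg = build_startup_message({"user": "alice", "database": "app"})
```

The resulting bytes are written unchanged as the first data on the newly opened stream; nothing QUIC-specific is added in front of them.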

3.2 Stream lifecycle

Clean close. A client terminating a session cleanly MUST send a v3 Terminate message followed by a QUIC STREAM frame with the FIN bit set on its sending half of the stream. The server, upon processing Terminate, MUST close its sending half with FIN. Sending FIN without a preceding Terminate is not a clean close and MUST be treated by the server as an abnormal close, equivalent to a TCP connection lost without a Terminate message under the existing protocol.

Abnormal close. Receipt of RESET_STREAM on either half of a session stream MUST be treated by the receiving endpoint as an unexpected disconnection. The server MUST roll back any in-progress transaction exactly as it would on sudden TCP disconnection.
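The two close rules above can be sketched as a small server-side classifier. This is an illustrative sketch only: the names StreamEnd and classify_close are hypothetical, and the caller is assumed to track the type byte of the last complete v3 message received on the stream.

```python
import struct
from enum import Enum

# v3 Terminate: type byte 'X', Int32 length 4, no payload.
TERMINATE = b"X" + struct.pack("!i", 4)


class StreamEnd(Enum):
    """How the client's sending half of the stream ended (hypothetical type)."""
    FIN = "FIN"
    RESET_STREAM = "RESET_STREAM"


def classify_close(last_message_type: bytes, end: StreamEnd) -> str:
    """Return "clean" only for Terminate followed by FIN, otherwise "abnormal"."""
    if end is StreamEnd.FIN and last_message_type == b"X":
        return "clean"
    # FIN without Terminate, or RESET_STREAM: the server rolls back any
    # in-progress transaction, as on a sudden TCP disconnection.
    return "abnormal"
```

For example, classify_close(b"X", StreamEnd.FIN) yields "clean", while a FIN that arrives after a Query message, or any RESET_STREAM, yields "abnormal".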

3.3 Server-initiated bidirectional streams

A server MUST NOT open bidirectional streams. A client receiving a server-initiated bidirectional stream MUST close the QUIC connection with the PG_PROTOCOL_VIOLATION application error code (see §7).

3.4 Unidirectional streams

Unidirectional streams are reserved for future use. v0.1 endpoints MUST NOT open unidirectional streams and MUST close the QUIC connection with PG_PROTOCOL_VIOLATION upon receiving one. Future revisions of this specification MAY define their use (for example, for asynchronous server notifications or WAL streaming) via a new ALPN token.


4. Framing

v3 messages are written to a session stream as a contiguous byte sequence, with no additional framing, length prefix, or delimiter introduced by this binding. The existing length field within each v3 message is enough for the receiver to demarcate messages.

Implementations MUST NOT assume that v3 message boundaries align with QUIC STREAM frame boundaries. A single v3 message MAY be split across multiple STREAM frames, and a single STREAM frame MAY contain multiple v3 messages or a partial message.
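A minimal sketch of a receiver that honours this rule is shown below. It buffers whatever bytes the QUIC library delivers and emits a message only once the v3 length field says it is complete. The class name V3Reader is hypothetical, and the sketch covers the client-to-server direction, where the first message (StartupMessage) has no type byte.

```python
import struct


class V3Reader:
    """Reassembles v3 messages from arbitrarily chunked stream data (sketch)."""

    def __init__(self) -> None:
        self._buf = bytearray()
        self._startup_seen = False

    def feed(self, chunk: bytes) -> list[tuple[bytes, bytes]]:
        """Append stream bytes and return every complete (type, payload) message."""
        self._buf += chunk
        out = []
        while True:
            # StartupMessage has no type byte; every later message has one.
            header = 5 if self._startup_seen else 4
            if len(self._buf) < header:
                break
            (length,) = struct.unpack_from("!i", self._buf, header - 4)
            if length < 4:
                raise ValueError("invalid v3 length field")
            total = (header - 4) + length  # the length field counts itself
            if len(self._buf) < total:
                break  # partial message: wait for more stream data
            msg_type = bytes(self._buf[:1]) if self._startup_seen else b""
            payload = bytes(self._buf[header:total])
            del self._buf[:total]
            self._startup_seen = True
            out.append((msg_type, payload))
        return out


# Usage: a StartupMessage and a simple Query, delivered one byte at a time.
startup = struct.pack("!ii", 9, 196608) + b"\x00"
query = b"Q" + struct.pack("!i", 13) + b"SELECT 1\x00"
reader = V3Reader()
messages = []
for i in range(len(startup + query)):
    messages += reader.feed((startup + query)[i:i + 1])
```

Feeding the same bytes in a single call produces the same two messages, which is the property the paragraph above requires: the result depends only on the byte sequence, not on how it was split into STREAM frames.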


5. TLS, ALPN, and connection establishment

5.1 TLS

QUIC mandates TLS 1.3 at the transport layer (RFC 9001). The v3 SSLRequest message MUST NOT be sent on a session stream. A server receiving SSLRequest on a session stream MUST close the QUIC connection with PG_PROTOCOL_VIOLATION.
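A server can recognise a stray SSLRequest from the first eight bytes of a stream. The sketch below assumes the request code 80877103 given in the v3 documentation; the function name is hypothetical.

```python
import struct

# SSLRequest code from the v3 documentation: 1234 in the high 16 bits,
# 5679 in the low 16 bits.
SSL_REQUEST_CODE = (1234 << 16) | 5679


def is_ssl_request(first_bytes: bytes) -> bool:
    """Return True if the stream starts with a v3 SSLRequest (sketch)."""
    if len(first_bytes) < 8:
        return False
    length, code = struct.unpack_from("!ii", first_bytes)
    return length == 8 and code == SSL_REQUEST_CODE
```

When this returns True, the server would close the QUIC connection with PG_PROTOCOL_VIOLATION rather than reply with the single-byte 'S' or 'N' response used over TCP.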

5.2 ALPN

Endpoints MUST negotiate the ALPN token pgsql/3 during the QUIC TLS handshake. Connections that do not negotiate this token MUST NOT carry sessions defined by this specification.

This token denotes version 3 of the PostgreSQL frontend/backend protocol over QUIC. Future protocol versions are expected to register distinct tokens (e.g., pgsql/4).

IANA registration: see §11.

5.3 0-RTT

Endpoints MUST NOT send or accept 0-RTT application data in v0.1. Clients MUST NOT write any v3 message, including StartupMessage, in 0-RTT packets. Servers MUST reject any 0-RTT data received on session streams.

Rationale: beyond classical replay concerns, accepting StartupMessage in 0-RTT would allow an attacker replaying captured 0-RTT packets to force the server to allocate per-session authentication state before the 1-RTT handshake completes and the client's address is validated, enabling resource-exhaustion attacks. Future revisions MAY relax this restriction for narrowly scoped idempotent operations.


6. Authentication

Authentication is performed at the v3 layer, inside the session stream, unchanged from the existing protocol. SCRAM-SHA-256 and other SASL mechanisms are supported as-is.

6.1 Channel binding

SCRAM-SHA-256-PLUS uses channel binding type tls-server-end-point as defined in RFC 5929. Over this binding, the channel binding material is the hash of the end-entity certificate from the QUIC TLS 1.3 handshake: the leaf certificate the server presents in the TLS 1.3 Certificate message, which is the same certificate a TLS stack exposes to the client via SSL_get_peer_certificate or the equivalent API of the underlying QUIC implementation. The hash function is selected as RFC 5929 §4.1 specifies: the hash used by the certificate's signature algorithm, with SHA-256 substituted when that hash is MD5 or SHA-1.

Endpoints MUST compute tls-server-end-point over this end-entity certificate. Endpoints MUST NOT use intermediate or root certificates from the chain for channel binding.
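A minimal sketch of the tls-server-end-point computation follows, using the hash-selection rule of RFC 5929 §4.1. It assumes the caller already has the DER encoding of the end-entity certificate and the name of the hash used by its signature algorithm; obtaining both from the QUIC library is outside this sketch, and the function name is hypothetical.

```python
import hashlib


def tls_server_end_point(cert_der: bytes, signature_hash: str) -> bytes:
    """Compute tls-server-end-point channel binding data (sketch).

    cert_der: DER bytes of the server's end-entity certificate.
    signature_hash: hash name of the certificate's signature algorithm,
        for example "sha256" or "sha384" (supplied by the caller).
    """
    name = signature_hash.lower()
    if name in ("md5", "sha1"):
        # RFC 5929 section 4.1: MD5 and SHA-1 are replaced by SHA-256.
        name = "sha256"
    return hashlib.new(name, cert_der).digest()
```

Both endpoints run the same computation over the same leaf certificate, so the SCRAM exchange fails if a different certificate was presented to the client than the one the server holds.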

The tls-exporter channel binding type defined in RFC 9266 MAY be supported by future revisions of this specification. v0.1 implementations are not required to support it.


7. Cancellation

To cancel an in-flight query, the client MUST send STOP_SENDING on its receive half of the session stream with application error code PG_CANCEL. The server, on receipt, MUST attempt to cancel the currently executing query for that session, exactly as it would in response to a v3 CancelRequest over a secondary connection under the existing protocol.

If the server receives STOP_SENDING with PG_CANCEL for a stream on which no query is currently executing — for example, because the query completed before the cancellation arrived, or the session is idle — the server MUST treat the cancellation as a no-op. No error is returned and the stream remains open. This resolves the race between query completion and cancellation without requiring any feedback mechanism.
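The no-op rule can be sketched as server-side handling of a STOP_SENDING event. The Session class and its fields are hypothetical stand-ins for per-stream server state, and the handling of error codes other than PG_CANCEL is an assumption of this sketch, since this section does not describe it.

```python
PG_CANCEL = 0x50470001  # value from the application error code table in §7.1


class Session:
    """Hypothetical per-stream server state; not part of the specification."""

    def __init__(self) -> None:
        self.query_running = False
        self.cancels_dispatched = 0

    def on_stop_sending(self, error_code: int) -> bool:
        """Return True if a cancel was dispatched, False for a no-op."""
        if error_code != PG_CANCEL:
            # Assumption: codes other than PG_CANCEL are ignored by this sketch.
            return False
        if not self.query_running:
            # Query already completed or session idle: no error, stream stays open.
            return False
        self.cancels_dispatched += 1  # stand-in for signalling the backend
        return True


session = Session()
idle_result = session.on_stop_sending(PG_CANCEL)     # arrives while idle
session.query_running = True
running_result = session.on_stop_sending(PG_CANCEL)  # arrives mid-query
```

Because a late cancellation is silently dropped, the client needs no acknowledgement: it learns the outcome from the normal v3 response, either the query's results or an ErrorResponse reporting the cancellation.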

The v3 CancelRequest message and BackendKeyData mechanism are not used in this binding. Servers MUST NOT send BackendKeyData; clients MUST ignore BackendKeyData if received (for forward compatibility with servers that send it inadvertently).

7.1 Application error codes

Application error codes in QUIC are 62-bit integers. This specification defines values in the range 0x50470000–0x5047FFFF (the high bytes spelling "PG" in ASCII). This specification does not define values outside this range.

Code                    Name                   Use
0x50470001              PG_CANCEL              STOP_SENDING to cancel a query (§7)
0x50470002              PG_PROTOCOL_VIOLATION  Connection close on protocol violation (§§3.3, 3.4, 5.1)
0x50470003              PG_SHUTDOWN            Server-initiated graceful shutdown
0x50470004–0x5047FFFF   (reserved)             Reserved for future revisions
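For implementers, the code assignments can be written down as constants with a range check. This is a sketch; the enum and helper names are hypothetical, while the numeric values are the ones assigned in this section.

```python
from enum import IntEnum


class PgQuicError(IntEnum):
    """Application error codes assigned in this section."""
    PG_CANCEL = 0x50470001
    PG_PROTOCOL_VIOLATION = 0x50470002
    PG_SHUTDOWN = 0x50470003


RANGE_START = 0x50470000
RANGE_END = 0x5047FFFF
MAX_QUIC_ERROR_CODE = (1 << 62) - 1  # QUIC application error codes are 62-bit


def in_spec_range(code: int) -> bool:
    """True if the code lies in the range administered by this specification."""
    return RANGE_START <= code <= RANGE_END


def ascii_prefix(code: int) -> str:
    """Decode the two high bytes of a 32-bit code as ASCII."""
    return (code >> 16).to_bytes(2, "big").decode("ascii")
```

Every assigned code satisfies in_spec_range, and ascii_prefix returns "PG" for any value in the range, which makes the codes easy to spot in packet traces.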

8. Connection migration

Connection migration follows RFC 9000 §9 without modification. Active sessions on migrated connections continue uninterrupted. This binding requires no additional action from either endpoint.


9. Server architecture (informative)

This section is non-normative. It exists to clarify what this specification does and does not require of server implementations.

A server implementation may allocate one PostgreSQL backend process per session stream. In this model, the resource cost of N streams equals the resource cost of N TCP connections under the existing protocol. The benefits to the client (faster handshake, migration, in-band cancellation, reduced client-side socket count) still apply.

A server implementation may instead pool backends and multiplex multiple session streams onto a smaller set of backends, similar to how connection poolers such as PgBouncer operate today. This specification neither mandates nor precludes such designs.

A proxy that terminates QUIC and forwards v3 messages over TCP to a stock PostgreSQL server suffices to deploy this specification without modification to PostgreSQL itself.


10. Security considerations

0-RTT. Forbidden in v0.1 (§5.3). The primary concern is not classical replay of authenticated operations — which SCRAM would refuse — but resource exhaustion against the server's authentication state allocator before 1-RTT confirmation of the client's address.

DoS via stream creation. A QUIC connection's stream concurrency is bounded by the bidirectional stream limit the server advertises, initially through initial_max_streams_bidi and subsequently through MAX_STREAMS frames. Servers SHOULD set this limit to a value appropriate for their backend allocation policy. Implementations using a one-backend-per-stream model SHOULD set it conservatively.

Channel binding. §6.1 ties tls-server-end-point to the QUIC TLS end-entity certificate. This preserves the authentication guarantees of the existing protocol unchanged; an attacker who could not impersonate the server under TCP+TLS cannot do so under this binding either.

Cancellation authorization. Under the existing protocol, cancellation authority is conveyed by a per-session secret key in BackendKeyData. Under this binding, cancellation authority is conveyed by control of the QUIC connection itself: STOP_SENDING frames travel inside packets protected by the connection's TLS 1.3 keys, so only an endpoint of that connection can send one on its streams. The security is equivalent in effect: in both cases, the ability to cancel derives from possession of a session-scoped capability that an attacker without access to the original session cannot forge.

Connection ID privacy. QUIC connection IDs may be visible to on-path observers and can correlate connection migration events. This is a property of QUIC itself, not introduced by this binding. However, operators should be aware that database session continuity across network changes becomes observable to on-path observers in a way TCP sessions are not (since TCP sessions simply break).


11. IANA considerations

11.1 ALPN token registration

This specification requests registration of the following ALPN token in the "TLS Application-Layer Protocol Negotiation (ALPN) Protocol IDs" registry:

  • Protocol: PostgreSQL Frontend/Backend Protocol v3 over QUIC
  • Identification Sequence: 0x70 0x67 0x73 0x71 0x6c 0x2f 0x33 ("pgsql/3")
  • Reference: This document
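As a cross-check of the identification sequence, the sketch below compares it with the ASCII token and shows how the token would appear in the ALPN extension's ProtocolNameList (RFC 7301), where each name carries a one-byte length and the list a two-byte length. The function name is hypothetical.

```python
ALPN_TOKEN = b"pgsql/3"
IDENTIFICATION_SEQUENCE = bytes([0x70, 0x67, 0x73, 0x71, 0x6C, 0x2F, 0x33])


def encode_protocol_name_list(tokens: list[bytes]) -> bytes:
    """Encode an ALPN ProtocolNameList as laid out in RFC 7301 (sketch)."""
    entries = b""
    for token in tokens:
        if not 1 <= len(token) <= 255:
            raise ValueError("ALPN protocol names are 1 to 255 bytes long")
        entries += bytes([len(token)]) + token  # one-byte length prefix per name
    return len(entries).to_bytes(2, "big") + entries  # two-byte list length


wire = encode_protocol_name_list([ALPN_TOKEN])
```

In practice the QUIC or TLS library performs this encoding; an application normally only passes the token string to the library's ALPN configuration.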

11.2 Application error codes

No IANA registry is requested for application error codes. The 0x5047xxxx range is administered by this specification and future revisions.


12. References

Normative

  • RFC 2119, RFC 8174 — Keywords
  • RFC 5929 — Channel Bindings for TLS
  • RFC 9000 — QUIC: A UDP-Based Multiplexed and Secure Transport
  • RFC 9001 — Using TLS to Secure QUIC
  • PostgreSQL Frontend/Backend Protocol documentation, current version

Informative

  • RFC 9114 — HTTP/3 (cited only to contrast; HTTP/3 is not used here)
  • RFC 9266 — Channel Bindings for TLS 1.3
  • PostgreSQL CancelRequest documentation
  • PgBouncer architecture documentation

Appendix A. Open questions

These need resolution before v1.0:

  1. Reference implementation status: at least one client and one server (proxy is acceptable) interoperating end-to-end. Required before submission to pgsql-hackers.
  2. Interoperability test vectors: canonical message sequences for handshake, cancellation, clean close, and abnormal close, to be maintained alongside the spec.
  3. Whether v0.2 should standardize a unidirectional stream use case (likely candidate: NOTIFY payload delivery without occupying the session stream's response slot).

Appendix B. Changelog

  • 2026-05-21 — v0.2: resolved §3.2 (lifecycle), §3.3 and §3.4 (stream policies), §5.2 (ALPN pgsql/3), §5.3 (0-RTT forbidden with rationale), §6.1 (channel binding bound to end-entity cert), §7, and §7.1 (cancellation semantics and error code table). Reorganized Appendix A.
  • 2026-05-21 — v0.1: initial skeleton.