pg_hardstorage is a PostgreSQL backup tool built around
continuous WAL streaming over the replication protocol. In
production you run two processes side by side: wal stream
continuously receives WAL from PG and commits every completed
16 MiB segment into the repo, and backup lays down a base
backup on a schedule (e.g. nightly) that the stream rolls
forward from. A daily backup plus the always-on stream gives you
point-in-time recovery (PITR) to any segment-aligned point.
It works against managed PG (RDS, Cloud SQL, Azure DB) the same as bare metal, deduplicates and encrypts content-addressed chunks, and restores with PITR. PG 15+, Apache 2.0.
This page gets you from zero to a running streamer, a first base backup, and a restored data dir in five minutes. After that, see the operator guide.
Releases ship as static linux/{amd64,arm64} and darwin/arm64
tarballs (Windows is CLI-only). Grab the matching one from
github.com/cybertec-postgresql/pg_hardstorage/releases,
verify the cosign signature, and drop the binary on your $PATH:
curl -LO https://github.com/cybertec-postgresql/pg_hardstorage/releases/download/v0.1.1/pg_hardstorage_linux_amd64.tar.gz
tar xzf pg_hardstorage_linux_amd64.tar.gz
sudo install -m 0755 pg_hardstorage /usr/local/bin/
pg_hardstorage version

sudo dpkg -i pg-hardstorage_0.1.1_amd64.deb

The package installs the binary at /usr/bin/pg_hardstorage, drops a
systemd unit at /lib/systemd/system/pg_hardstorage.service, and
creates /etc/pg_hardstorage/, /var/lib/pg_hardstorage/, and
/var/log/pg_hardstorage/ with mode 0750, owned by pg-hardstorage.
sudo rpm -i pg-hardstorage-0.1.1-1.x86_64.rpm

Same layout as the .deb.
docker pull ghcr.io/cybertec-postgresql/pg_hardstorage:v0.1.1
docker run --rm ghcr.io/cybertec-postgresql/pg_hardstorage:v0.1.1 version

The image is distroless. Mount a config dir at /etc/pg_hardstorage
and a state dir at /var/lib/pg_hardstorage; both must be writable by
UID 65532.
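As a rough sketch of the mounts described above, the helper below prints a docker run command line with both directories bind-mounted. The host paths in the usage comment are placeholders chosen for illustration, not defaults of the tool, and changing ownership to UID 65532 normally needs root.

```shell
# Sketch: print a `docker run` command that mounts config and state dirs.
# Host directory names are illustrative assumptions, not tool defaults.
IMAGE="ghcr.io/cybertec-postgresql/pg_hardstorage:v0.1.1"

docker_cmd() (
  conf_dir="$1"   # host dir mounted at /etc/pg_hardstorage
  state_dir="$2"  # host dir mounted at /var/lib/pg_hardstorage
  shift 2         # remaining arguments are passed to pg_hardstorage
  printf 'docker run --rm -v %s:/etc/pg_hardstorage -v %s:/var/lib/pg_hardstorage %s %s\n' \
    "$conf_dir" "$state_dir" "$IMAGE" "$*"
)

# Usage (as root, so that chown succeeds):
#   mkdir -p /srv/pgh/etc /srv/pgh/state
#   chown -R 65532 /srv/pgh/etc /srv/pgh/state
#   docker_cmd /srv/pgh/etc /srv/pgh/state version   # copy and run the printed line
```

Printing the command rather than running it keeps the helper safe to source from a provisioning script.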
git clone https://github.com/cybertec-postgresql/pg_hardstorage
cd pg_hardstorage
make # produces bin/pg_hardstorage
sudo install -m 0755 bin/pg_hardstorage /usr/local/bin/

Requires Go 1.26+. make test runs the full unit suite under the race
detector; make test-integration exercises a real PostgreSQL 17
container via testcontainers-go (needs Docker).
CREATE ROLE pgbackup REPLICATION LOGIN PASSWORD '<strong>';

Add a pg_hba.conf line that allows the agent host to replicate as
that role:
host replication pgbackup 10.0.0.5/32 scram-sha-256
Reload PG (SELECT pg_reload_conf()).
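The role and pg_hba.conf steps above can be scripted roughly as follows. This is a minimal sketch: hba_line is a hypothetical helper, and the usage comment assumes PGDATA points at a data directory you can write to, which is not the case on managed services.

```shell
# Sketch: render the pg_hba.conf entry for a replication role.
hba_line() (
  role="$1"   # replication role, e.g. pgbackup
  cidr="$2"   # address of the agent host, e.g. 10.0.0.5/32
  printf 'host replication %s %s scram-sha-256\n' "$role" "$cidr"
)

# Usage:
#   hba_line pgbackup 10.0.0.5/32 >> "$PGDATA/pg_hba.conf"
#   psql -c 'SELECT pg_reload_conf()'
#   # Confirm the role carries the REPLICATION attribute (prints t when set):
#   psql -Atc "SELECT rolreplication FROM pg_roles WHERE rolname = 'pgbackup'"
```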
pg_hardstorage repo init file:///srv/backups

The repo is a directory (or S3 bucket) that holds chunks, manifests,
and WAL. One repo can hold many deployments. repo init never
overwrites an existing repo: re-running it against the same URL
returns conflict.repo_exists (exit 7).
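In a provisioning script you may want to treat that exit code as "already initialized" rather than as a failure. The sketch below does this; ensure_repo and the PGH override variable are hypothetical names introduced for illustration, and only exit code 7 is taken from the text above.

```shell
# Sketch: run `repo init` and treat exit 7 (conflict.repo_exists) as success.
# PGH is a hypothetical override for the binary, useful when testing.
PGH="${PGH:-pg_hardstorage}"

ensure_repo() (
  rc=0
  "$PGH" repo init "$1" >/dev/null || rc=$?
  case "$rc" in
    0) echo "created" ;;
    7) echo "exists" ;;                        # repo was already initialized
    *) echo "repo init failed (exit $rc)" >&2
       return "$rc" ;;
  esac
)

# Usage:
#   ensure_repo file:///srv/backups
```

Any other non-zero exit code is passed through unchanged so the caller can still fail hard.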
S3 works the same way:
pg_hardstorage repo init 's3://acme-backups/?region=eu-central-1'

Other backends use the same shape; pick the URL scheme that matches your storage:
| Backend | Example URL |
|---|---|
| Local filesystem | file:///srv/backups |
| AWS S3 / MinIO / R2 / B2 | s3://acme-backups/?region=eu-central-1 |
| Google Cloud Storage | gcs://acme-backups/ |
| Azure Blob | azblob://account.blob.core.windows.net/container/ |
| Remote host via SSH (SFTP) | sftp://backup@nas.example.com/srv/backups |
| Remote host via SSH (ssh-exec) | scp://backup@nas.example.com/srv/backups |
sftp:// and scp:// both ride SSH; pick sftp:// by default and scp://
when the remote disables the SFTP subsystem. See "Add an SFTP
repository" and "Add an SCP repository" for the auth / known_hosts /
extras-map setup.
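If you template repo URLs in a wrapper script, a small scheme check can catch typos (gs:// instead of gcs://, for example) before the tool is invoked. The sketch below mirrors the table above; backend_for_url is a hypothetical helper, and the tool presumably performs its own validation as well.

```shell
# Sketch: map a repo URL to the backend named in the table above.
backend_for_url() (
  case "$1" in
    file://*)   echo "Local filesystem" ;;
    s3://*)     echo "AWS S3 / MinIO / R2 / B2" ;;
    gcs://*)    echo "Google Cloud Storage" ;;
    azblob://*) echo "Azure Blob" ;;
    sftp://*)   echo "Remote host via SSH (SFTP)" ;;
    scp://*)    echo "Remote host via SSH (ssh-exec)" ;;
    *)          echo "unknown repo URL scheme: $1" >&2
                return 1 ;;
  esac
)

# Usage:
#   backend_for_url 's3://acme-backups/?region=eu-central-1'
```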
wal stream runs an automatic preflight on every start, but you
can run it standalone first to confirm the source PostgreSQL
satisfies the replication requirements before you wire systemd:
pg_hardstorage wal preflight db1 \
--pg-connection 'postgres://pgbackup@db1.example.com/postgres'

Fatal findings (wal_level.too_low, max_replication_slots.full,
max_wal_senders.saturated, role.no_replication) make the
command exit non-zero with a suggestion: block on each finding.
Warnings (max_slot_wal_keep_size.set,
idle_replication_slot_timeout.set on PG 17+) surface but don't
block.
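To inspect the same settings by hand, the sketch below prints a handful of standard PostgreSQL queries that you can pipe into psql. The mapping from finding names to these queries is an assumption based on the names alone; the exact checks that preflight performs are not spelled out here.

```shell
# Sketch: SQL for inspecting the settings the fatal findings appear to refer to.
preflight_sql() {
  cat <<'SQL'
SHOW wal_level;                 -- replica or logical is needed for streaming
SHOW max_wal_senders;
SHOW max_replication_slots;
SELECT count(*) AS slots_in_use      FROM pg_replication_slots;
SELECT count(*) AS walsenders_in_use FROM pg_stat_replication;
SELECT rolreplication FROM pg_roles WHERE rolname = current_user;
SQL
}

# Usage:
#   preflight_sql | psql 'postgres://pgbackup@db1.example.com/postgres'
```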
With preflight clean, start the WAL streamer. It is the headline
feature of pg_hardstorage and the process you keep running 24/7:
pg_hardstorage wal stream db1 \
--pg-connection 'postgres://pgbackup@db1.example.com/postgres' \
--repo file:///srv/backups

The agent issues CREATE_REPLICATION_SLOT pg_hardstorage_db1 PHYSICAL
RESERVE_WAL if the slot is absent. RESERVE_WAL pins the slot's
restart_lsn at creation time, so PG retains WAL from that moment
on. Then it issues
START_REPLICATION SLOT pg_hardstorage_db1 PHYSICAL against the
slot. The stream is gap-free across agent restarts. Supervise it
with systemd (the package ships pg_hardstorage@<deployment>.service
for exactly this) or your container scheduler.
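The sketch below derives the slot and unit names for a deployment and prints a query for checking the slot on the PostgreSQL side. The naming patterns are inferred from the db1 example and the pg_hardstorage@<deployment>.service template mentioned above, so treat them as assumptions; the helper names are hypothetical.

```shell
# Sketch: derive per-deployment names (patterns inferred from the db1 example).
slot_name() ( printf 'pg_hardstorage_%s\n' "$1" )
unit_name() ( printf 'pg_hardstorage@%s.service\n' "$1" )

# Print a query showing whether the slot exists, is active, and where it is pinned.
slot_sql() (
  printf "SELECT slot_name, active, restart_lsn FROM pg_replication_slots WHERE slot_name = '%s';\n" \
    "$(slot_name "$1")"
)

# Usage:
#   sudo systemctl enable --now "$(unit_name db1)"
#   slot_sql db1 | psql 'postgres://pgbackup@db1.example.com/postgres'
```

A slot that exists but reports active = f while the streamer is stopped still holds WAL on the server, which is what makes the stream gap-free across restarts.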
--skip-preflight is the explicit override if you've already
audited PG; --no-slot is the explicit escape hatch for
archive-only deployments that guarantee WAL retention through
another mechanism (both emit loud warnings — using either is
deliberate).
Leave it running. The remaining steps run in a second terminal or under a separate scheduler.
With the streamer running concurrently, take a base backup. The
two processes share the repo URL but do not coordinate beyond
that — backup streams a BASE_BACKUP over its own replication
connection while wal stream keeps shipping WAL.
The wizard probes PG, generates a signing keypair and a KEK, writes
pg_hardstorage.yaml, and (by default) takes the first backup:
pg_hardstorage init \
--pg-connection 'postgres://pgbackup@db1.example.com/postgres' \
--repo file:///srv/backups \
--deployment db1 \
--yes

To take a backup later without going through the wizard (this is the command your scheduler runs nightly):
pg_hardstorage backup db1 \
--pg-connection 'postgres://pgbackup@db1.example.com/postgres' \
--repo file:///srv/backups

In production: schedule backup (cron / systemd timer / k8s
CronJob), supervise wal stream (systemd / k8s Deployment).
The base backup is the periodic anchor; the streamer is what
makes PITR byte-precise between anchors.
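For the cron route, a crontab entry could be generated along these lines. This is a sketch: cron_entry is a hypothetical helper, and the 04:00 slot is an arbitrary choice that cron interprets in the host's time zone.

```shell
# Sketch: print a crontab line that runs the nightly base backup at 04:00.
cron_entry() (
  deployment="$1"; conn="$2"; repo="$3"
  printf "0 4 * * * pg_hardstorage backup %s --pg-connection '%s' --repo %s\n" \
    "$deployment" "$conn" "$repo"
)

# Usage: append the line to the crontab of the user that runs the agent.
#   cron_entry db1 'postgres://pgbackup@db1.example.com/postgres' file:///srv/backups
```

Note that cron treats an unescaped % in a command as a newline, so connection strings containing % need escaping.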
pg_hardstorage restore db1 latest \
--target /var/lib/postgresql/restored \
--repo file:///srv/backups

PITR via natural-language time:
pg_hardstorage restore db1 latest \
--target /var/lib/postgresql/restored \
--repo file:///srv/backups \
--to "5 minutes ago"

Or to a specific LSN: --to-lsn 0/3000028. Or to a named restore
point: --to-name pre_release.
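The three target styles differ only in the flag, so a wrapper can select it from a target kind. The sketch below assembles the command line without running it; restore_cmd and the kind names (time, lsn, name) are hypothetical, and only the three flags come from the text above.

```shell
# Sketch: build a restore command line for one of the three PITR target styles.
restore_cmd() (
  deployment="$1"; target_dir="$2"; repo="$3"; kind="$4"; value="$5"
  case "$kind" in
    time) flag="--to" ;;        # e.g. "5 minutes ago"
    lsn)  flag="--to-lsn" ;;    # e.g. 0/3000028
    name) flag="--to-name" ;;   # e.g. pre_release
    *)    echo "unknown target kind: $kind" >&2
          return 2 ;;
  esac
  printf 'pg_hardstorage restore %s latest --target %s --repo %s %s "%s"\n' \
    "$deployment" "$target_dir" "$repo" "$flag" "$value"
)

# Usage:
#   restore_cmd db1 /var/lib/postgresql/restored file:///srv/backups name pre_release
```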
The restore writes a managed recovery.signal and a managed block in
postgresql.auto.conf whose restore_command invokes
pg_hardstorage wal fetch <deployment> %f %p --repo .... Start PG;
recovery proceeds.
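During recovery PostgreSQL substitutes %f with the name of the WAL segment it needs and %p with the path to write it to. To see which segment holds a given LSN, the sketch below applies the standard file-name arithmetic. It assumes the default 16 MiB segment size and takes the timeline as an explicit argument; wal_segment_for_lsn is a hypothetical helper name.

```shell
# Sketch: WAL segment file name for an LSN, assuming 16 MiB segments.
wal_segment_for_lsn() (
  tli="$1"                        # timeline ID, decimal
  lsn="$2"                        # LSN in hi/lo hex form, e.g. 0/3000028
  hi="${lsn%%/*}"                 # upper 32 bits (hex)
  lo="${lsn##*/}"                 # lower 32 bits (hex)
  seg=$(( 0x$lo / 0x1000000 ))    # 0x1000000 bytes = 16 MiB per segment
  printf '%08X%08X%08X\n' "$tli" "$(( 0x$hi ))" "$seg"
)

# Usage:
#   wal_segment_for_lsn 1 0/3000028   # prints 000000010000000000000003
```

On a live server, SELECT pg_walfile_name('0/3000028') gives the same answer for the current timeline.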
A pg_verifybackup check runs against the data dir before the restore
declares success. Skip it with --verify=skip only if you know what
you are doing. Exit code 9 means the verifier rejected the restored
data dir.
$ pg_hardstorage version
pg_hardstorage v0.1.1 (abc1234, built 2026-04-29T12:00:00Z)

doctor is the single-command "is anything wrong" check:
$ pg_hardstorage doctor
db1 — PG 17.2 — primary @ db1.example.com
✓ PostgreSQL reachable
✓ Replication slot 'pg_hardstorage_db1' active, lag 12s
✓ Last backup 47m ago
✓ Repository file:///srv/backups writable
✓ KMS keyring present (~/.config/pg_hardstorage/keyring)
✓ Schedule: next at 04:00 UTC
Summary: 1 healthy.

Any ✗ line carries a Suggested fix: block underneath. Run
pg_hardstorage doctor -o json for a machine-readable form.
doctor exits 0 when healthy and exits 10 with --exit-on-issues when
there are findings. Wire that into your alerting if you want a
hard fail signal.
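One way to wire this into alerting is a thin wrapper that turns the exit code into a status word. This is a sketch: doctor_status and the PGH override are hypothetical names, and any exit code other than 0 or 10 is reported as an error of the check itself.

```shell
# Sketch: classify the result of `doctor --exit-on-issues` for an alerting hook.
# PGH is a hypothetical override for the binary, useful when testing.
PGH="${PGH:-pg_hardstorage}"

doctor_status() (
  rc=0
  "$PGH" doctor --exit-on-issues -o json >/dev/null || rc=$?
  case "$rc" in
    0)  echo "healthy" ;;
    10) echo "findings" ;;       # doctor ran and reported issues
    *)  echo "error"             # the check itself could not complete
        return "$rc" ;;
  esac
)

# Usage:
#   [ "$(doctor_status)" = "healthy" ] || echo "pg_hardstorage needs attention"
```

Keeping "findings" and "error" distinct lets the alert say whether the backup system is unhealthy or the check simply could not run.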
- Operator guide — daily operations, retention, verification, encryption, sinks, troubleshooting pointers.
- Architecture — how the data plane is built, why it talks the replication protocol, what the on-disk layout looks like.
- Runbooks — copy-paste procedures for the seven scenarios that wake an on-call DBA at 3am.
- API reference — REST surface for the control plane.