feat: PostgreSQL graceful primary switchover (planned failover) #98

@renecannao

Summary

Implement planned/graceful primary switchover for PostgreSQL. Currently only unplanned failover (dead primary detection → promotion) is supported. This adds the ability to gracefully demote a running primary and promote a designated standby with zero data loss.

Motivation

Operators need planned switchover for maintenance windows, version upgrades, and host migrations. The MySQL path already supports this via graceful-master-takeover / graceful-master-takeover-auto — PostgreSQL should have parity.

Design

See docs/superpowers/specs/2026-04-18-postgresql-graceful-switchover-design.md for the full spec.

New instance operations (go/inst/instance_topology_postgresql.go):

  • PostgreSQLSetReadOnly — ALTER SYSTEM + pg_reload_conf + pg_terminate_backend
  • PostgreSQLGetCurrentWALLSN — pg_current_wal_lsn()
  • PostgreSQLWaitForStandbyLSN — poll pg_last_wal_replay_lsn() until caught up
  • PostgreSQLRepositionAsStandby — ALTER SYSTEM SET primary_conninfo for demoted primary

Switchover orchestration (go/logic/topology_recovery_postgresql.go):

  • PostgreSQLGracefulPrimarySwitchover — full flow: validate → pre-hooks → set read-only → wait catch-up → promote → reconfigure standbys → reposition demoted primary → post-hooks

CLI/API dispatch (go/logic/topology_recovery.go):

  • Provider check in GracefulMasterTakeover() dispatches to PostgreSQL path when cluster is PostgreSQL
  • Same commands/endpoints: graceful-master-takeover, graceful-master-takeover-auto
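A rough sketch of that dispatch, assuming the provider is known when the command is handled. The `providerType` constants, `takeoverFunc`, and `gracefulTakeover` are hypothetical stand-ins, not the actual signature of `GracefulMasterTakeover()`.

```go
package main

import "fmt"

// providerType is a hypothetical identifier for the database backend.
type providerType string

const (
	providerMySQL      providerType = "mysql"
	providerPostgreSQL providerType = "postgresql"
)

// takeoverFunc is a hypothetical shape for a provider-specific takeover:
// it receives the cluster name and returns the promoted instance.
type takeoverFunc func(clusterName string) (promoted string, err error)

// gracefulTakeover keeps one entry point for the shared commands and routes
// to a separate implementation per provider instead of branching inside the
// MySQL code path.
func gracefulTakeover(p providerType, cluster string, mysqlPath, postgresPath takeoverFunc) (string, error) {
	if cluster == "" {
		return "", fmt.Errorf("cluster name is required")
	}
	switch p {
	case providerPostgreSQL:
		return postgresPath(cluster)
	case providerMySQL:
		return mysqlPath(cluster)
	default:
		return "", fmt.Errorf("unsupported provider %q", p)
	}
}
```

Because the routing happens behind the existing entry point, the CLI commands and API endpoints stay the same for both providers.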

Key decisions:

  • Demoted primary restart is operator's responsibility via PostGracefulTakeoverProcesses hooks
  • No pg_rewind needed: the primary is set read-only and the designated standby catches up to its WAL position before promotion, so the timelines do not diverge
  • Separate implementation function (not branching in MySQL code) — follows existing pattern

Test Plan

  • Unit tests for nil/empty input validation on all 4 new instance operations
  • Functional test: graceful switchover against real PostgreSQL topology
  • Verify existing MySQL graceful takeover still works (no regression)
