Handle endpos pipeline shutdown cleanly and reset sequences on follow #40
Merged
In follow mode the receive, transform, and apply processes are connected
by Unix pipes. When the apply process reaches endpos it exits and closes
its read end of the pipe. Upstream processes that are still writing
trailing messages then hit EPIPE and exit non-zero, and the supervisor
ANDs every child's status, so a migration that completed correctly at
endpos was reported as a failure.
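The failure mode can be reproduced outside pgcopydb with a few lines of POSIX C. The following is a minimal sketch, not pgcopydb code: the helper name `errno_after_reader_closed` is hypothetical, and it ignores SIGPIPE so that the failed write surfaces as an errno value instead of terminating the process.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Writes to a pipe whose read end has already been closed and returns the
 * resulting errno (0 if the write unexpectedly succeeds, -1 if the pipe
 * could not be created). Hypothetical helper, for illustration only.
 */
int
errno_after_reader_closed(void)
{
	int fds[2];

	/* With the default SIGPIPE action the writer would be killed instead. */
	signal(SIGPIPE, SIG_IGN);

	if (pipe(fds) != 0)
	{
		return -1;
	}

	/* Stand-in for the downstream process exiting and closing its read end. */
	close(fds[0]);

	/* Stand-in for an upstream process still sending a trailing message. */
	const char msg[] = "trailing message\n";
	ssize_t written = write(fds[1], msg, sizeof(msg) - 1);
	int err = (written < 0) ? errno : 0;

	close(fds[1]);

	return err;
}
```

On a POSIX system the write fails with `EPIPE` because no reader remains, which is the condition the upstream processes run into after apply has exited.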
This adds two layers of handling:
- Child side: transform and receive treat an EPIPE on the downstream
pipe as a clean shutdown only when endpos has been durably reached
for the last message they processed. In every other case (endpos
unset, or not yet reached) a broken pipe is still a failure.
- Supervisor backstop: follow_wait_subprocesses declares overall
success when the apply process exited cleanly and endpos has been
durably applied (endpos <= replay_lsn), regardless of upstream
teardown noise. The apply process is authoritative, so this cannot
mask a genuine pre-endpos failure: if apply crashed or endpos was
not reached, the failure still propagates.
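The child-side rule in the first bullet can be sketched as a small classification function. This is an illustration under stated assumptions, not the code in this change: the `WriteOutcome` enum and `classify_downstream_write` are hypothetical names, LSNs are modelled as 64-bit integers, and an endpos of 0 is taken to mean "unset".

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome of a write to the downstream pipe. */
typedef enum
{
	WRITE_OK,
	WRITE_CLEAN_SHUTDOWN,
	WRITE_FAILED
} WriteOutcome;

/*
 * Classifies the result of a downstream write. write_errno is 0 when the
 * write succeeded; last_lsn is the LSN of the last message the child
 * processed. Assumption: endpos == 0 means no endpos was set.
 */
WriteOutcome
classify_downstream_write(int write_errno, uint64_t endpos, uint64_t last_lsn)
{
	if (write_errno == 0)
	{
		return WRITE_OK;
	}

	bool endpos_reached = (endpos != 0 && endpos <= last_lsn);

	/* A broken pipe is benign only once endpos has been reached. */
	if (write_errno == EPIPE && endpos_reached)
	{
		return WRITE_CLEAN_SHUTDOWN;
	}

	/* Any other error, or EPIPE before endpos, remains a failure. */
	return WRITE_FAILED;
}
```

Keeping the gate as a pure function of the error code and two LSNs makes each case (endpos unset, endpos not yet reached, non-EPIPE error) straightforward to cover in a unit test.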
The false failure also skipped the end-of-migration sequence reset.
follow_reset_sequences is now also run by the standalone "pgcopydb
follow" command once endpos is durably reached, so a resumed CDC run
that catches up to endpos updates target sequences to current source
values. Previously only "clone --follow" reset sequences, leaving
resume-after-crash with stale sequences.
Adds a deterministic regression test for the sequence reset performed by the standalone `pgcopydb follow` command when it reaches endpos (the path used by resume-cdc helpers, which previously did not reset sequences). The test clones pagila, advances rental_rental_id_seq on the source by inserting rows, sets endpos, and runs `pgcopydb follow --resume`. Because CDC replays the inserts with OVERRIDING SYSTEM VALUE (explicit ids that do not advance the target sequence), the target sequence only catches up to the source if follow_reset_sequences runs at endpos. The test asserts the target sequence advanced from the snapshot value to match the source. Verified the test fails (target stuck at the snapshot value) when the reset is removed, and passes with it in place.
Problem
In follow mode, the receive → transform → apply processes are connected by Unix pipes. When the apply process reaches `endpos` it exits and closes its read end of the pipe. Upstream processes still writing trailing messages then hit `EPIPE` and exit non-zero. Because the supervisor (`follow_wait_subprocesses`) ANDs every child's exit status, a migration that completed correctly at endpos was reported as a failure. A customer reported exactly this failure.
The false failure had a second, quieter consequence: it skipped the end-of-migration sequence reset, so target sequences were left at their initial base-copy values (logical decoding does not replicate sequences). The same gap existed for the standalone `pgcopydb follow` command used to resume CDC after a crash: it never reset sequences at all.
Fix
Two layers of defense, since this involves customer data:
- Child side: `transform` and `receive` treat an `EPIPE` on the downstream pipe as a clean shutdown only when `endpos` has been durably reached for the last message they processed. In every other case (endpos unset, or not yet reached) a broken pipe is still a failure.
- Supervisor backstop: `follow_wait_subprocesses` declares overall success when the apply process exited cleanly and endpos has been durably applied (`endpos <= replay_lsn`), regardless of upstream teardown noise.
The apply process is authoritative: it exits cleanly only after durably applying through endpos and syncing `replay_lsn`. So the backstop cannot mask a genuine pre-endpos failure. If apply crashed, or endpos was not reached, `success` is left false and the failure propagates so the operator can resume. The two layers are belt-and-suspenders: the child-side gate handles the common case locally, and the supervisor gate guarantees a completed migration is never reported as failed even if an upstream process exits non-zero for another teardown reason.
Sequences
`follow_reset_sequences` is now also run by the standalone `pgcopydb follow` command once endpos is durably reached. This mirrors what `clone --follow` already does at the end of its run, and makes a resumed CDC run (`pgcopydb follow --resume`) that catches up to endpos correctly update target sequences to current source values. The reset is gated on endpos being reached, so an interrupted continuous follow (no endpos, or stopped early by a signal) does not advance sequences ahead of the data actually applied.
Interaction with the reconnect/backoff flow
The downstream-`EPIPE` path is separate from the source-reconnect path: the existing exponential backoff loop fires only on source connection loss, while a broken downstream pipe is handled in its own branch. This change only affects how that downstream branch reports its outcome at endpos; it does not touch the reconnect window, backoff timing, or permissions-error handling.
Testing
Full CDC / follow / unit suites pass on PG18: `cdc-wal2json`, `cdc-test-decoding`, `follow-wal2json`, `follow-defer-indexes`, `follow-defer-validate-fks`, `cdc-endpos-between-transaction`, `endpos-in-multi-wal-txn`, `cdc-low-level`, `cdc-message-handling`, `follow-data-only`, `follow-9.6`, `follow-target-reconnect`, `follow-standby`, `cdc-filtering`, `unit`. All are green. `follow-target-reconnect` in particular confirms the reconnect/backoff behavior is unchanged.
Note
The endpos shutdown race is timing-dependent (pipe buffer fill, data volume), so it does not reproduce deterministically in CI: every test run completes via the normal clean-shutdown path. The change is verified not to regress any existing behavior; the correctness of the endpos handling rests on the gating analysis above (apply is authoritative; the override is strictly gated on `endpos <= replay_lsn`).
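The gating condition can be written down as a small decision function. This is a sketch under stated assumptions rather than the body of `follow_wait_subprocesses`: `parse_lsn` and `follow_run_succeeded` are hypothetical names, exit codes are passed as a plain array, and an endpos of 0 is taken to mean "unset". LSNs are parsed from PostgreSQL's textual `hi/lo` hexadecimal form so that the comparison is numeric.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parses PostgreSQL's textual LSN format, two hexadecimal numbers separated
 * by a slash (for example "0/1A2B3C4D"), into a 64-bit integer.
 */
bool
parse_lsn(const char *text, uint64_t *lsn)
{
	unsigned int hi = 0;
	unsigned int lo = 0;

	if (sscanf(text, "%X/%X", &hi, &lo) != 2)
	{
		return false;
	}

	*lsn = ((uint64_t) hi << 32) | lo;

	return true;
}

/*
 * Hypothetical supervisor decision. exit_codes[] holds one entry per child
 * process and apply_index selects the apply process among them.
 */
bool
follow_run_succeeded(const int *exit_codes, int count, int apply_index,
					 uint64_t endpos, uint64_t replay_lsn)
{
	bool all_clean = true;

	for (int i = 0; i < count; i++)
	{
		all_clean = all_clean && (exit_codes[i] == 0);
	}

	if (all_clean)
	{
		return true;
	}

	/* Backstop: trust apply, but only once endpos is durably applied. */
	bool apply_clean = (exit_codes[apply_index] == 0);
	bool endpos_applied = (endpos != 0 && endpos <= replay_lsn);

	return apply_clean && endpos_applied;
}
```

Comparing parsed integers rather than strings matters here: `0/A000000` sorts after `0/10000000` as text but is the smaller LSN numerically.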