Fix float8 precision loss during CDC replay #39
Merged
CDC replay formatted float8 values with "%f", truncating to 6 decimal places and silently corrupting values that need more precision (e.g. -216237.00000035969 became -216237.000000). Emit the shortest decimal string that round-trips to the same IEEE-754 double instead. This preserves full precision where needed while keeping clean values clean: 5.99 stays "5.99" rather than expanding to "5.9900000000000002". The latter matters because under REPLICA IDENTITY FULL the value is also a WHERE-clause key for replayed UPDATE/DELETE, and an over-precise literal fails to match the stored numeric value, silently affecting zero rows.
The existing test data only contained low-precision values (5.99, 11.95), which formatted identically under the old "%f" and the new shortest round-trip code — so the test passed either way and did not actually exercise the precision fix. Add a float_precision_table (double precision, REPLICA IDENTITY FULL) with values that need more than 6 fractional digits, including one too small for fixed notation (1e-20). With the old "%f" these collapsed to -216237.000000, 0.123457, and 0.000000; the regenerated golden fixtures capture the correct full-precision output. REPLICA IDENTITY FULL also exercises the value as a WHERE-clause key on UPDATE/DELETE.
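The collapse described above can be reproduced in isolation. The following is a minimal C sketch rather than the project's code: format_float8_fixed6 and survives_fixed6 are hypothetical helpers, and the only thing assumed about the old replay path is that it formatted with "%f".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not a pgcopydb function): format a double with "%f",
 * which always prints exactly 6 digits after the decimal point.
 */
int
format_float8_fixed6(double value, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%f", value);
}

/*
 * Hypothetical helper: true when the "%f" text parses back to the same
 * double, i.e. when nothing was lost by keeping only 6 fractional digits.
 */
bool
survives_fixed6(double value)
{
    char buf[512]; /* large enough for any finite double printed with "%f" */

    format_float8_fixed6(value, buf, sizeof(buf));
    return strtod(buf, NULL) == value;
}
```

With these helpers, 5.99 survives the round trip, while -216237.00000035969 and 1e-20 do not: the first parses back as -216237 and the second as 0.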
Problem
During CDC replay, float8 values were formatted with %f, which truncates to 6 decimal places. This silently corrupted any value needing more precision: for example, -216237.00000035969 was applied to the target as -216237.000000.
This is the issue reported in upstream dimitri/pgcopydb dimitri#968. A customer hit it on PG16 and worked around it with wal2json's --wal2json-numeric-as-string.
Fix
Emit the shortest decimal string that round-trips back to the same IEEE-754 double: try 15 significant digits, increasing to the 17 required for a guaranteed round-trip, stopping at the first precision that parses back to the same value.
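The loop described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code from the patch: the function name float8_to_shortest_string is hypothetical, the process is assumed to run in the "C" locale (so the decimal separator is "."), and non-finite values get no special handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (illustrative name, not taken from the patch): write
 * the first of the "%.15g", "%.16g", "%.17g" renderings of value that parses
 * back to the same double. Returns false if buf is too small.
 */
bool
float8_to_shortest_string(double value, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);

    for (int precision = 15; precision <= 17; precision++)
    {
        int len = snprintf(buf, size, "%.*g", precision, value);

        if (len < 0 || (size_t) len >= size)
        {
            return false; /* encoding error or truncated output */
        }

        /* 17 significant digits always round-trip; stop early if fewer do */
        if (precision == 17 || strtod(buf, NULL) == value)
        {
            return true;
        }
    }

    return false; /* not reached: the loop always returns at precision 17 */
}
```

Because "%g" drops trailing zeros, 5.99 comes out as "5.99" at the first attempt and 1e-20 comes out as "1e-20"; a value such as -216237.00000035969 keeps as many digits as it needs to parse back to the same double.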
Why not a fixed %.17g / %.17f?
A fixed high precision fixes the truncation but introduces a silent data-divergence regression. Under REPLICA IDENTITY FULL (tables without a primary key), the replayed value is used not only as an INSERT/UPDATE value but also as a WHERE-clause key for UPDATE/DELETE. A fixed %.17g expands 5.99 to "5.9900000000000002", which does not match the stored numeric value 5.99, so the replayed UPDATE/DELETE matches zero rows and the target silently diverges. (Upstream dimitri#968's %.17f has this same defect.)
The shortest-round-trip form keeps 5.99 as "5.99" (matches the stored value) while still emitting -216237.00000035969 in full when the precision is genuinely needed, avoiding both the truncation and the WHERE-match failure.
PostgreSQL accepts the %g output (including scientific notation for extreme magnitudes) as float input when the value is bound as a query parameter, which is how these values are applied.
Scope
This addresses float4/float8 columns. True arbitrary-precision numeric fidelity is a separate concern: wal2json coerces numeric to a C double upstream of this code, so --wal2json-numeric-as-string remains the correct option there.
Testing
The golden fixture tests/cdc-wal2json/000000010000000000000002.sql is updated to the corrected output (5.99/11.95/11.99).
Full CDC / follow / unit / base-clone suites pass on PG18. Notably, the data-applying follow-defer-indexes and follow-defer-validate-fks suites pass. These exercise REPLICA IDENTITY FULL replay and are what surface the WHERE-match regression that a naive fixed-precision fix would introduce (the text-diff cdc-wal2json suite alone does not catch it).
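The fixed-precision regression can also be shown without a database. The sketch below is only a stand-in: fixed17_preserves_text is a hypothetical helper, and comparing strings is a rough proxy for the exact decimal comparison that decides whether a replayed WHERE clause matches, not a model of how PostgreSQL compares values.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of pgcopydb): parse a decimal literal into a
 * double, print it back with a fixed "%.17g", and report whether the text is
 * unchanged. A false result means the replayed literal would carry digits
 * that the originally stored decimal value does not have.
 */
bool
fixed17_preserves_text(const char *literal)
{
    char buf[64];
    double value = strtod(literal, NULL);

    snprintf(buf, sizeof(buf), "%.17g", value);
    return strcmp(buf, literal) == 0;
}
```

Here fixed17_preserves_text("5.99") is false, because the fixed format prints "5.9900000000000002", while a value that is exactly representable in binary, such as "0.5", is printed back unchanged.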