
Fix float8 precision loss during CDC replay #39

Merged
teknogeek0 merged 3 commits into main from fix/cdc-float-precision
Jun 22, 2026

@teknogeek0

Collaborator

Problem

During CDC replay, float8 values were formatted with %f, which rounds to 6 digits after the decimal point. This silently corrupted any value needing more precision: for example, -216237.00000035969 was applied to the target as -216237.000000.
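The loss is easy to reproduce outside the project. The following is a minimal sketch, assuming plain `snprintf` in place of the project's own formatting helper; `format_float8_old` is a hypothetical name used only for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper: formats a double the way the old replay path is
 * described as doing, with a plain "%f" conversion.
 */
int
format_float8_old(char *out, size_t size, double value)
{
    /* "%f" always prints exactly 6 digits after the decimal point. */
    return snprintf(out, size, "%f", value);
}
```

Called with -216237.00000035969 this yields "-216237.000000", and with 1e-20 it yields "0.000000": in both cases the digits that distinguish the value are gone.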

This is the issue reported in upstream dimitri/pgcopydb dimitri#968. A customer hit it on PG16 and worked around it with wal2json's --wal2json-numeric-as-string.

Fix

Emit the shortest decimal string that round-trips back to the same IEEE-754 double: try 15 significant digits, increasing to the 17 required for a guaranteed round-trip, stopping at the first precision that parses back to the same value.

for (int precision = 15; precision <= 17; precision++)
{
    sformat(string, sizeof(string), "%.*g", precision, value->val.float8);
    if (strtod(string, NULL) == value->val.float8)
        break;
}
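For experimenting with the loop in isolation, here is a stand-alone sketch of the same idea. It assumes `snprintf` as a substitute for the project's `sformat` helper and takes a bare `double` instead of the `value->val.float8` field; the function name `format_float8_shortest` is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-alone variant of the loop above: write the shortest
 * "%g" rendering (15 to 17 significant digits) that parses back to the
 * same double. Returns the string length, or -1 if the buffer is too small.
 */
int
format_float8_shortest(char *out, size_t size, double value)
{
    int written = 0;

    for (int precision = 15; precision <= 17; precision++)
    {
        written = snprintf(out, size, "%.*g", precision, value);

        if (written < 0 || (size_t) written >= size)
        {
            return -1;
        }

        /* Stop at the first precision that round-trips exactly. */
        if (strtod(out, NULL) == value)
        {
            break;
        }
    }

    /* NaN never compares equal, so it simply falls through at 17 digits. */
    return written;
}
```

With this sketch 5.99 comes out as "5.99", while 0.30000000000000004 needs all 17 digits and is emitted as "0.30000000000000004".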

Why not a fixed %.17g / %.17f?

A fixed high precision fixes the truncation but introduces a silent data-divergence regression. Under REPLICA IDENTITY FULL (tables without a primary key), the replayed value is used not only as an INSERT/UPDATE value but also as a WHERE-clause key for UPDATE/DELETE. A fixed %.17g expands 5.99 to "5.9900000000000002", which does not match the stored numeric value 5.99, so the replayed UPDATE/DELETE matches zero rows and the target silently diverges. (Upstream dimitri#968's %.17f has this same defect.)

The shortest-round-trip form keeps 5.99 as "5.99" (matches the stored value) while still emitting -216237.00000035969 in full when the precision is genuinely needed — avoiding both the truncation and the WHERE-match failure.
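The WHERE-match failure can be illustrated with a small sketch. It assumes that a byte-wise string comparison is an acceptable stand-in for PostgreSQL comparing the replayed literal against a stored numeric value, which is a simplification; `literal_matches_stored` is a hypothetical helper, not part of the project.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical check: format `value` with `digits` significant digits and
 * report whether the resulting literal equals the stored decimal text.
 * String equality is only a stand-in for the database-side comparison.
 */
int
literal_matches_stored(double value, int digits, const char *stored_text)
{
    char literal[64];

    snprintf(literal, sizeof(literal), "%.*g", digits, value);

    return strcmp(literal, stored_text) == 0;
}
```

Here `literal_matches_stored(5.99, 17, "5.99")` returns 0 because the literal is "5.9900000000000002", whereas `literal_matches_stored(5.99, 15, "5.99")` returns 1.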

PostgreSQL accepts the %g output (including scientific notation for extreme magnitudes) as float input when the value is bound as a query parameter, which is how these values are applied.
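To see what the scientific-notation case looks like, the sketch below formats a value with 15 significant digits and reports whether %g chose exponent form. It only shows the C side; `float8_uses_exponent_form` is a hypothetical helper, and how the string is bound as a query parameter is outside this example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: render `value` with "%.15g" into `out` and return 1
 * if the result uses exponent notation, 0 otherwise.
 */
int
float8_uses_exponent_form(char *out, size_t size, double value)
{
    snprintf(out, size, "%.15g", value);

    /* "%g" switches to exponent form for very small or very large values. */
    return strchr(out, 'e') != NULL;
}
```

For 1e-20 the buffer holds "1e-20" and for 1e300 it holds "1e+300", while 5.99 stays in plain decimal form as "5.99".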

Scope

This addresses float4/float8 columns. True arbitrary-precision numeric fidelity is a separate concern — wal2json coerces numeric to a C double upstream of this code, so --wal2json-numeric-as-string remains the correct option there.
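A short sketch of why no float formatting can rescue numeric values once they have passed through a C double. It assumes the coercion behaves like `strtod`, which is an illustration rather than a description of wal2json's internals; `coerce_numeric_via_double` is a hypothetical name.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical illustration: parse a numeric literal into a double, then
 * print it back with the maximum useful precision (17 significant digits).
 */
int
coerce_numeric_via_double(const char *numeric_text, char *out, size_t size)
{
    double coerced = strtod(numeric_text, NULL);

    /* Digits beyond what a double can hold are already lost here. */
    return snprintf(out, size, "%.17g", coerced);
}
```

Feeding it a 25-digit value such as "0.1234567890123456789012345" returns a shorter string, so the original numeric text cannot be recovered downstream.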

Testing

The golden fixture tests/cdc-wal2json/000000010000000000000002.sql is updated to the corrected output (5.99 / 11.95 / 11.99).

Full CDC / follow / unit / base-clone suites pass on PG18. Notably the data-applying follow-defer-indexes and follow-defer-validate-fks suites pass — these exercise REPLICA IDENTITY FULL replay and are what surface the WHERE-match regression that a naive fixed-precision fix would introduce (the text-diff cdc-wal2json suite alone does not catch it).

CDC replay formatted float8 values with "%f", truncating to 6 decimal
places and silently corrupting values that need more precision (e.g.
-216237.00000035969 became -216237.000000).

Emit the shortest decimal string that round-trips to the same IEEE-754
double instead. This preserves full precision where needed while keeping
clean values clean: 5.99 stays "5.99" rather than expanding to
"5.9900000000000002". The latter matters because under REPLICA IDENTITY
FULL the value is also a WHERE-clause key for replayed UPDATE/DELETE, and
an over-precise literal fails to match the stored numeric value, silently
affecting zero rows.

The existing test data only contained low-precision values (5.99, 11.95),
which formatted identically under the old "%f" and the new shortest
round-trip code — so the test passed either way and did not actually
exercise the precision fix.

Add a float_precision_table (double precision, REPLICA IDENTITY FULL)
with values that need more than 6 fractional digits, including one too
small for fixed notation (1e-20). With the old "%f" these collapsed to
-216237.000000, 0.123457, and 0.000000; the regenerated golden fixtures
capture the correct full-precision output. REPLICA IDENTITY FULL also
exercises the value as a WHERE-clause key on UPDATE/DELETE.
@teknogeek0 teknogeek0 merged commit d071163 into main Jun 22, 2026
87 checks passed

1 participant