Fix float8 precision loss during CDC replay by teknogeek0 · Pull Request #39 · planetscale/pgcopydb

teknogeek0 · 2026-06-22T13:43:09Z

Problem

During CDC replay, float8 values were formatted with %f, which always emits exactly 6 digits after the decimal point. This silently corrupted any value needing more precision: for example, -216237.00000035969 was applied to the target as -216237.000000.
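The precision loss can be reproduced in isolation. The following is a minimal sketch, assuming plain snprintf in place of pgcopydb's own formatting wrapper; the helper name format_with_percent_f is hypothetical and not taken from the codebase.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from pgcopydb): format a double with a bare
 * "%f" conversion, as the old replay code did.
 */
static void
format_with_percent_f(double value, char *out, size_t size)
{
    /* "%f" defaults to exactly 6 digits after the decimal point. */
    snprintf(out, size, "%f", value);
}
```

Calling it with -216237.00000035969 yields the string "-216237.000000", which no longer parses back to the original double.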

This is the issue reported upstream as dimitri/pgcopydb#968. A customer hit it on PG16 and worked around it with the --wal2json-numeric-as-string option.

Fix

Emit the shortest decimal string that round-trips back to the same IEEE-754 double: try 15 significant digits, increasing to the 17 required for a guaranteed round-trip, stopping at the first precision that parses back to the same value.

for (int precision = 15; precision <= 17; precision++)
{
    sformat(string, sizeof(string), "%.*g", precision, value->val.float8);
    if (strtod(string, NULL) == value->val.float8)
        break;
}
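For experimenting outside the pgcopydb tree, the same loop can be wrapped in a standalone function. This is a sketch under assumptions: snprintf stands in for sformat, and the name format_float8_shortest is made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical standalone version of the loop above. Writes into `out`
 * the first of the 15-, 16-, and 17-significant-digit "%g" renderings
 * of `value` that parses back to exactly the same double.
 */
static void
format_float8_shortest(double value, char *out, size_t size)
{
    for (int precision = 15; precision <= 17; precision++)
    {
        snprintf(out, size, "%.*g", precision, value);

        /* Stop as soon as the text round-trips to the same value. */
        if (strtod(out, NULL) == value)
            break;
    }
}
```

With this helper, 5.99 and 11.95 come out as "5.99" and "11.95", while -216237.00000035969 gets as many digits as it needs to parse back to the same double.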

Why not a fixed `%.17g` / `%.17f`?

A fixed high precision fixes the truncation but introduces a silent data-divergence regression. Under REPLICA IDENTITY FULL (tables without a primary key), the replayed value is used not only as an INSERT/UPDATE value but also as a WHERE-clause key for UPDATE/DELETE. A fixed %.17g expands 5.99 to "5.9900000000000002", which does not match the stored numeric value 5.99, so the replayed UPDATE/DELETE matches zero rows and the target silently diverges. (Upstream dimitri#968's %.17f has this same defect.)

The shortest-round-trip form keeps 5.99 as "5.99" (matches the stored value) while still emitting -216237.00000035969 in full when the precision is genuinely needed — avoiding both the truncation and the WHERE-match failure.
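The expansion described above is straightforward to observe. A small sketch, assuming plain snprintf; format_fixed_17g is a hypothetical name used only for this illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper: the "naive" fix, always printing 17 significant
 * digits regardless of whether the value needs them.
 */
static void
format_fixed_17g(double value, char *out, size_t size)
{
    snprintf(out, size, "%.17g", value);
}
```

For 5.99 this produces "5.9900000000000002", the literal that fails to match a stored 5.99 when it is used as a WHERE-clause key.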

PostgreSQL accepts the %g output (including scientific notation for extreme magnitudes) as float input when the value is bound as a query parameter, which is how these values are applied.

Scope

This addresses float4/float8 columns. True arbitrary-precision numeric fidelity is a separate concern — wal2json coerces numeric to a C double upstream of this code, so --wal2json-numeric-as-string remains the correct option there.
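The limit on numeric fidelity can be sketched as follows. This is an illustration only, assuming a plain textual comparison (which would also reject harmless differences such as trailing zeros); survives_double is a hypothetical name.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical check: parse a decimal string into a C double, print it
 * back with the shortest round-tripping "%g" precision in 15..17, and
 * report whether the text is unchanged. Returns 1 if so, 0 otherwise.
 */
static int
survives_double(const char *numeric_text)
{
    char buf[64];
    double d = strtod(numeric_text, NULL);

    for (int precision = 15; precision <= 17; precision++)
    {
        snprintf(buf, sizeof(buf), "%.*g", precision, d);
        if (strtod(buf, NULL) == d)
            break;
    }

    return strcmp(buf, numeric_text) == 0;
}
```

A short value such as "5.99" survives, but a 20-digit value such as "0.12345678901234567890" cannot, because a double carries at most 17 significant decimal digits. No formatting choice made after the coercion can recover the lost digits.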

Testing

The golden fixture tests/cdc-wal2json/000000010000000000000002.sql is updated to the corrected output (5.99 / 11.95 / 11.99).

Full CDC / follow / unit / base-clone suites pass on PG18. Notably the data-applying follow-defer-indexes and follow-defer-validate-fks suites pass — these exercise REPLICA IDENTITY FULL replay and are what surface the WHERE-match regression that a naive fixed-precision fix would introduce (the text-diff cdc-wal2json suite alone does not catch it).

CDC replay formatted float8 values with "%f", truncating to 6 decimal places and silently corrupting values that need more precision (e.g. -216237.00000035969 became -216237.000000). Emit the shortest decimal string that round-trips to the same IEEE-754 double instead. This preserves full precision where needed while keeping clean values clean: 5.99 stays "5.99" rather than expanding to "5.9900000000000002". The latter matters because under REPLICA IDENTITY FULL the value is also a WHERE-clause key for replayed UPDATE/DELETE, and an over-precise literal fails to match the stored numeric value, silently affecting zero rows.

The existing test data only contained low-precision values (5.99, 11.95), which formatted identically under the old "%f" and the new shortest round-trip code — so the test passed either way and did not actually exercise the precision fix. Add a float_precision_table (double precision, REPLICA IDENTITY FULL) with values that need more than 6 fractional digits, including one too small for fixed notation (1e-20). With the old "%f" these collapsed to -216237.000000, 0.123457, and 0.000000; the regenerated golden fixtures capture the correct full-precision output. REPLICA IDENTITY FULL also exercises the value as a WHERE-clause key on UPDATE/DELETE.

teknogeek0 added 3 commits June 22, 2026 09:42

Trim float formatting comment to what the code needs

37ee31b

teknogeek0 merged commit d071163 into main Jun 22, 2026
87 checks passed

teknogeek0 mentioned this pull request Jun 22, 2026

Fix float8 precision loss during CDC replay teknogeek0/pgcopydb#65

Merged

teknogeek0 merged 3 commits into main from fix/cdc-float-precision
