Skip to content

Conversation

@ti-chi-bot
Copy link
Member

@ti-chi-bot ti-chi-bot commented Jan 29, 2026

This is an automated cherry-pick of #10682

What problem does this PR solve?

Issue Number: close #10680

Problem Summary:

DROP DATABASE IF EXISTS db_1483421321;
CREATE DATABASE db_1483421321;
USE db_1483421321;
CREATE TABLE t0(c0 TINYINT, handle INT NOT NULL AUTO_INCREMENT, PRIMARY KEY(handle));
ALTER TABLE t0 SET TIFLASH REPLICA 1;
-- this SLEEP will make sure the case always reproducable, it not, make it longer
select sleep(5);
SELECT /*+ read_from_storage(tiflash[t0]) */ * FROM t0 order by 1;
SELECT /*+ read_from_storage(tikv[t0]) */ * FROM t0 order by 1;

ALTER TABLE t0 ADD COLUMN c1 DECIMAL NULL;
ALTER TABLE t0 ADD COLUMN c2 FLOAT NULL DEFAULT 0.0795439693286002;
ALTER TABLE t0 ADD COLUMN c5 DECIMAL NOT NULL;
ALTER TABLE t0 ADD COLUMN c4 DECIMAL DEFAULT 247262911;
ALTER TABLE t0 MODIFY COLUMN c2 FLOAT NOT NULL;
ALTER TABLE t0 MODIFY COLUMN c5 DECIMAL NULL;
ALTER TABLE t0 MODIFY COLUMN c1 BIGINT DEFAULT -56083770 NOT NULL;
-- first insert
INSERT IGNORE INTO t0 (c1, c2, c5, c4) VALUES (-2051270087, 0.44141045099028775, 0.0, 15523);
UPDATE t0 SET c5 = 870337888;

ALTER TABLE t0 MODIFY COLUMN c1 BIGINT NULL;
-- second insert
INSERT INTO t0 (c2, c5, c4) VALUES (0.004406799693866592, 0.8752311290235516, 14652);

---- TiFlash data inconsistent with TiKV after modifying a column from NOT NULL to NULL
-- in tiflash, handle=2, c1=0
SELECT /*+ read_from_storage(tiflash[t0]) */ * FROM t0 order by 1;
+--------+--------+-------------+--------------+-----------+-------+
| c0     | handle | c1          | c2           | c5        | c4    |
+--------+--------+-------------+--------------+-----------+-------+
| <null> | 1      | -2051270087 | 0.44141045   | 870337888 | 15523 |
| <null> | 2      | 0           | 0.0044067996 | 1         | 14652 |
+--------+--------+-------------+--------------+-----------+-------+

-- in tikv, handle=2, c1=null
SELECT /*+ read_from_storage(tikv[t0]) */ * FROM t0 order by 1;
+--------+--------+-------------+--------------+-----------+-------+
| c0     | handle | c1          | c2           | c5        | c4    |
+--------+--------+-------------+--------------+-----------+-------+
| <null> | 1      | -2051270087 | 0.44141045   | 870337888 | 15523 |
| <null> | 2      | <null>      | 0.0044067996 | 1         | 14652 |
+--------+--------+-------------+--------------+-----------+-------+

Root cause

  • TiDB can omit a column in the row value when the column is NULL and both DefaultValue and OriginDefaultValue are nil (see TiDB CanSkip).
  • If TiFlash is still on the old schema (column is NOT NULL), addDefaultValueToColumnIfPossible currently accepts the missing column and fills defaultValueToField().
  • When origin_default_value is empty, defaultValueToField() for NOT NULL falls back to GenDefaultField (zero), so TiFlash returns 0 while TiKV (new schema) returns NULL.

What is changed and how it works?

ddl: Fix default value filling with finer granularity
- Tightens addDefaultValueToColumnIfPossible
  - For NOT NULL missing columns, force_decode=false now returns false unless there is an origin default.
  - This forces a schema sync instead of silently filling 0.
  - force_decode=true still fills a default value (best-effort).

With old schema (NOT NULL, no origin default), missing column now triggers schema sync. After schema sync, column becomes nullable, so missing column can be decode and stored in tiflash with NULL, matching TiKV.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Fixed a TiFlash/TiKV mismatch after altering a column from NOT NULL to NULL

Summary by CodeRabbit

  • New Features

    • Improved handling of column origin/default values during row decoding to reduce unnecessary schema-syncs and better fill missing values.
  • Bug Fixes

    • More robust NOT NULL vs nullable handling to avoid decoding failures and data mismatches during nullability changes.
  • Tests

    • Added unit and integration tests validating default-value resolution, origin-default parsing, and nullability transition scenarios.
  • Chores

    • Formatting tool now reports clang-format version before running.

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot ti-chi-bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. type/cherry-pick-for-release-8.5 This PR is cherry-picked to release-8.5 from a source PR. labels Jan 29, 2026
@ti-chi-bot
Copy link
Member Author

@JaySon-Huang This PR has conflicts, I have hold it.
Please resolve them or ask others to resolve them, then comment /unhold to remove the hold label.

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jan 29, 2026

@ti-chi-bot: ## If you want to know how to resolve it, please read the guide in TiDB Dev Guide.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@coderabbitai
Copy link

coderabbitai bot commented Jan 29, 2026

📝 Walkthrough

Walkthrough

Adds default-filling logic during row decoding (origin/default/NULL) via a new helper integrated into V1/V2 RowCodec paths, updates ColumnInfo with origin-default accessors and parsing, adds tests (unit and fullstack) covering nullability/default scenarios, and minor RegionBlockReader comments and formatting helper output.

Changes

Cohort / File(s) Summary
RowCodec decoding changes
dbms/src/TiDB/Decode/RowCodec.cpp
Add addDefaultValueToColumnIfPossible; use it in V2 and V1 decoding to fill missing columns or signal schema-sync; adjust extra-column (force_decode) handling.
ColumnInfo origin-default API
dbms/src/TiDB/Schema/TiDB.h, dbms/src/TiDB/Schema/TiDB.cpp
Add hasOriginDefaultValue() and defaultValueToField() to detect/convert origin_default/origin_default_bit into Field with type-specific parsing and fallback.
Region block reader & tests
dbms/src/Storages/KVStore/Decode/RegionBlockReader.cpp, dbms/src/Storages/KVStore/tests/gtest_region_block_reader.cpp
RegionBlockReader.cpp: inline clarifying comments. gtest_region_block_reader.cpp: new ReadFromRegionDefaultValue unit test exercising default-filling and nullability across states and force_decode modes.
Schema tests & test utils
dbms/src/TiDB/Schema/tests/gtest_table_info.cpp, dbms/src/TiDB/tests/RowCodecTestUtils.h
Extend table-info tests to assert origin_default parsing; test helper now marks columns StatePublic.
Fullstack nullability scenario
tests/fullstack-test2/ddl/alter_column_nullable.test
Add end-to-end test reproducing issue #10680 covering ALTERs flipping nullability and TiFlash vs TiKV verification.
Formatting helper
format-diff.py
Add get_clang_format_version() and print clang-format version when formatting files.

Sequence Diagram(s)

sequenceDiagram
    participant Reader as RegionBlockReader
    participant Decoder as RowCodec
    participant Schema as ColumnInfo
    participant Block as Block
    participant Sync as SchemaSync

    Reader->>Decoder: request decode rows
    Decoder->>Schema: query column metadata (hasOriginDefaultValue / defaultValueToField)
    Decoder->>Block: append decoded datums
    alt missing column detected
        Decoder->>Schema: addDefaultValueToColumnIfPossible(column_info, block, pos, ...)
        Schema-->>Decoder: Field / GenDefault / failure
        alt success
            Decoder->>Block: insert default or NULL
        else failure
            Decoder->>Sync: return schema-sync signal (false)
        end
    end
    Decoder-->>Reader: decoded Block or schema-sync result
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

approved, lgtm, needs-cherry-pick-release-7.5, needs-cherry-pick-release-8.1, needs-cherry-pick-release-8.5

Suggested reviewers

  • JinheLin
  • kolafish
  • gengliqi

Poem

🐰 I hopped through rows where defaults hide,
Nibbled blanks and filled them with pride.
Origin crumbs, NULLs, and generated grace,
I stitched each column back in place. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: fixing default value filling with finer granularity to address TiFlash/TiKV data mismatch issues.
Description check ✅ Passed The description includes issue number, problem summary with SQL reproducer, root cause analysis, what changed, test coverage (unit and integration), and release note.
Linked Issues check ✅ Passed The PR directly addresses issue #10680 by tightening addDefaultValueToColumnIfPossible to force schema sync for NOT NULL missing columns without origin defaults, preventing silent zero-filling and ensuring NULL consistency between TiFlash and TiKV.
Out of Scope Changes check ✅ Passed All changes are directly related to fixing the TiFlash/TiKV data mismatch: row codec refactoring, ColumnInfo helper methods, test cases, and integration test for issue #10680; format-diff.py changes are minor tooling updates.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Important

Action Needed: IP Allowlist Update

If your organization protects your Git platform with IP whitelisting, please add the new CodeRabbit IP address to your allowlist:

  • 136.113.208.247/32 (new)
  • 34.170.211.100/32
  • 35.222.179.152/32

Reviews will stop working after February 8, 2026 if the new IP is not added to your allowlist.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@dbms/src/TiDB/Decode/RowCodec.cpp`:
- Around line 422-428: Resolve the git conflict markers in RowCodec.cpp by
removing the conflict markers (<<<<<<<, =======, >>>>>>>) and the duplicated
verbose comment, and keep the concise incoming comment "fallthrough for
best-effort fill when force_decode == true" so the code compiles; locate the
comment area in the RowCodec code path that handles non-clustered index /
pk_is_handle handling (around the logic that falls through to fill default
values) and replace the conflicted block with the single-line incoming comment.

In `@dbms/src/TiDB/Schema/tests/gtest_table_info.cpp`:
- Around line 144-174: The file contains unresolved Git conflict markers
(<<<<<<< HEAD, =======, >>>>>>>) that break compilation; remove these markers
and keep the intended new ParseCase tests that validate origin_default parsing
(the two ParseCase blocks that construct JSON and assert on TableInfo,
getColumnInfo(3), hasOriginDefaultValue(), and defaultValueToField()), ensuring
the duplicated/old chunk is not left behind and the test still compiles and
links with the TableInfo-related helpers (TableInfo, getColumnInfo,
defaultValueToField).
🧹 Nitpick comments (2)
dbms/src/Storages/KVStore/tests/gtest_region_block_reader.cpp (2)

818-818: Minor typo in comment.

"thrid" should be "third".

📝 Suggested fix
-            // the thrid elem is filled wih origin_default
+            // the third elem is filled with origin_default

844-844: Same typo in comment.

📝 Suggested fix
-            // the thrid elem is filled wih origin_default
+            // the third elem is filled with origin_default

@ti-chi-bot ti-chi-bot bot added cherry-pick-approved Cherry pick PR approved by release team. and removed do-not-merge/cherry-pick-not-approved labels Jan 29, 2026
Signed-off-by: JaySon-Huang <tshent@qq.com>
Signed-off-by: JaySon-Huang <tshent@qq.com>
Signed-off-by: JaySon-Huang <tshent@qq.com>
Signed-off-by: JaySon-Huang <tshent@qq.com>
@JaySon-Huang
Copy link
Contributor

/unhold

@ti-chi-bot ti-chi-bot bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 4, 2026
@ti-chi-bot ti-chi-bot bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved labels Feb 4, 2026
@ti-chi-bot ti-chi-bot bot added the lgtm label Feb 4, 2026
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Feb 4, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CalvinNeo, JaySon-Huang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [CalvinNeo,JaySon-Huang]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot removed the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Feb 4, 2026
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Feb 4, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-02-04 03:58:52.843087747 +0000 UTC m=+239403.944486467: ☑️ agreed by JaySon-Huang.
  • 2026-02-04 04:00:44.145466999 +0000 UTC m=+239515.246865718: ☑️ agreed by CalvinNeo.

@ti-chi-bot ti-chi-bot bot merged commit 62de9af into pingcap:release-8.5 Feb 4, 2026
5 checks passed
@ti-chi-bot ti-chi-bot bot deleted the cherry-pick-10682-to-release-8.5 branch February 4, 2026 04:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved cherry-pick-approved Cherry pick PR approved by release team. lgtm release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. type/cherry-pick-for-release-8.5 This PR is cherry-picked to release-8.5 from a source PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants