
redo dispatcher can be recreated with startTs = MaxUint64 and get stuck in Initializing in fail_over_ddl_mix #4703

@lidezhu

What did you do?

https://prow.tidb.net/jenkins/job/pingcap/job/ticdc/job/pull_cdc_mysql_integration_heavy/1499/
In the CI run above, a redo dispatcher is recreated with startTs = 18446744073709551615 (MaxUint64, printed as -1 in some log fields). The region stream then initializes normally, but every later resolved-ts update is treated as a fallback because lastResolvedTs is already MaxUint64. As a result, the dispatcher stays in Initializing, the schema store waits forever, and redo meta stops advancing.
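
To illustrate why no resolved-ts update can ever be accepted once lastResolvedTs is MaxUint64, here is a minimal sketch of a monotonic resolved-ts guard. The `subscription` type and `advance` method are hypothetical names for illustration, and the exact comparison used in region_event_handler.go is an assumption; this is not the actual TiCDC code.

```go
package main

import (
	"fmt"
	"math"
)

// subscription is a hypothetical stand-in for the per-span state that
// tracks the last resolved ts seen by the subscription client.
type subscription struct {
	lastResolvedTs uint64
}

// advance applies a monotonic guard (assumed behavior): an update is
// accepted only if it moves the resolved ts forward; anything else is
// treated as a fallback and dropped.
func (s *subscription) advance(resolvedTs uint64) bool {
	if resolvedTs <= s.lastResolvedTs {
		return false // reported as "resolvedTs is fallen back"
	}
	s.lastResolvedTs = resolvedTs
	return true
}

func main() {
	// Seeded from startTs = MaxUint64: no uint64 value is larger,
	// so every real resolved ts is rejected.
	stuck := &subscription{lastResolvedTs: math.MaxUint64}
	fmt.Println(stuck.advance(465356635391131654))

	// Seeded from the original start ts in the log: the same update
	// is accepted.
	healthy := &subscription{lastResolvedTs: 465356632664834061}
	fmt.Println(healthy.advance(465356635391131654))
}
```

Under this assumption the first call returns false and the second returns true, which matches the "fallen back" log entry for subscriptionID=5.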

This is the key sequence from the logs:

[2026/04/03 13:38:02.638 +08:00] [DEBUG] [replication_span.go:305] ["clamp dispatcher start ts to committed checkpoint"] [changefeedID=default/test] [dispatcherID=1753181377339796546314423708754558046615] [tableID=116] [operatorType=O_Add] [originalStartTs=465356632664834061] [committedCheckpointTs=18446744073709551615] [finalStartTs=18446744073709551615] 
[2026/04/03 13:38:02.639 +08:00] [INFO] [dispatcher_manager_redo.go:185] ["new redo dispatcher created"] [changefeedID=default/test] [dispatcherID=1753181377339796546314423708754558046615] [tableSpan="tableID: 116, startKey: 7480000000000000ff745f720000000000fa, endKey: 7480000000000000ff745f730000000000fa, keyspaceID: 0"] [startTs=-1] [skipDMLAsStartTs=false] 
[2026/04/03 13:38:02.640 +08:00] [INFO] [subscription_client.go:377] ["subscribes span done"] [subscriptionID=5] [tableID=116] [startTs=18446744073709551615] [startKey=7480000000000000FF745F720000000000FA] [endKey=7480000000000000FF745F730000000000FA] 
[2026/04/03 13:38:02.644 +08:00] [DEBUG] [region_event_handler.go:280] ["region is initialized"] [tableID=116] [regionID=192] [requestID=5] [startKey=7480000000000000FF745F720000000000FA] [endKey=7480000000000000FF745F730000000000FA] 
[2026/04/03 13:38:04.577 +08:00] [DEBUG] [region_event_handler.go:358] ["The resolvedTs is fallen back in subscription client"] [subscriptionID=5] [regionID=192] [resolvedTs=465356635391131654] [lastResolvedTs=18446744073709551615]
[2026/04/03 13:38:05.777 +08:00] [DEBUG] [dispatcher_manager_helper.go:56] ["dispatcher is initializing"] [changefeedID=default/test] [dispatcherID=1753181377339796546314423708754558046615] [tableSpan="tableID: 116, startKey: 7480000000000000ff745f720000000000fa, endKey: 7480000000000000ff745f730000000000fa, keyspaceID: 0"] [componentStatus=Initializing] 
[2026/04/03 13:38:07.640 +08:00] [INFO] [schema_store.go:224] ["wait resolved ts slow"] [tableID=116] [ts=18446744073709551615] [resolvedTS=465356636177563664] [time=5.000066471s] 
[2026/04/03 13:38:25.235 +08:00] [DEBUG] [meta.go:368] ["Redo meta has not changed for a long time, table trigger redo dispatcher may be stuck"] [keyspace=default] [changefeed=test] [lastFlushTime=20.19930556s] [meta="{\"CheckpointTs\":465356632520654870,\"ResolvedTs\":465356632520654871,\"Version\":1}"]
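
The first two log lines can be reproduced with a small sketch. It assumes the clamp in replication_span.go raises the start ts to the committed checkpoint when the checkpoint is larger, which is consistent with the logged values but not confirmed against the source; `clampStartTs` is a hypothetical name. It also shows why the same value is logged as startTs=-1 when it is formatted as a signed 64-bit integer.

```go
package main

import (
	"fmt"
	"math"
)

// clampStartTs is a hypothetical helper: it returns the larger of the
// original start ts and the committed checkpoint ts (assumed behavior).
func clampStartTs(originalStartTs, committedCheckpointTs uint64) uint64 {
	if committedCheckpointTs > originalStartTs {
		return committedCheckpointTs
	}
	return originalStartTs
}

func main() {
	// Values taken from the "clamp dispatcher start ts" log line.
	final := clampStartTs(465356632664834061, math.MaxUint64)

	fmt.Println(final)        // unsigned view: 18446744073709551615
	fmt.Println(int64(final)) // signed view of the same bits: -1
}
```

If the committed checkpoint can legitimately be MaxUint64 at this point (for example as an "unset" sentinel, which is a guess), a clamp of this shape would pass it straight through as the dispatcher start ts.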

What did you expect to see?

No response

What did you see instead?

None

Versions of the cluster

Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

(paste TiDB cluster version here)

Upstream TiKV version (execute tikv-server --version):

(paste TiKV version here)

TiCDC version (execute cdc version):

(paste TiCDC version here)
