Determine latest Dag version by version_number, not created_at #68389
Open
ephraimbuddy wants to merge 2 commits into
Pull request overview
This PR fixes nondeterministic “latest DagVersion” selection by ordering on version_number (monotonic/unique per dag_id) instead of created_at, preventing occasional collisions with the (dag_id, version_number) unique constraint when timestamps tie or clock skew occurs.
Changes:
- Update `DagVersion` “latest” selection logic to order by `version_number` (including `get_latest_version`, `get_version`, and the internal select used by `write_dag`).
- Update bulk prefetch logic in `SerializedDagModel._prefetch_dag_write_metadata` to align “latest” selection with DagVersion’s `version_number`.
- Add regression tests that seed DagVersions where `created_at` ordering disagrees with `version_number` ordering.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `airflow-core/src/airflow/models/dag_version.py` | Switches “latest” DagVersion selection to deterministic `version_number DESC` ordering. |
| `airflow-core/src/airflow/models/serialized_dag.py` | Aligns bulk prefetch window ordering with DagVersion selection by joining and ordering on `DagVersion.version_number`. |
| `airflow-core/tests/unit/models/test_dag_version.py` | Adds regression coverage for inverted `created_at` vs `version_number` and verifies `write_dag` increments from max version. |
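The bulk prefetch change can be illustrated with a small sketch. It is not Airflow's code: the two tables below are hypothetical, heavily simplified stand-ins, and `build_demo_db` and `latest_hash_per_dag` are names invented for this example. It uses the standard-library `sqlite3` module and assumes an SQLite build with window function support (3.25 or newer). The point is only that the per-Dag window is ordered by the joined `version_number`, so a row with a later `created_at` but a lower version does not win.

```python
import sqlite3


def build_demo_db():
    """Create hypothetical, simplified tables (not Airflow's real schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dag_version (
            id INTEGER PRIMARY KEY,
            dag_id TEXT NOT NULL,
            version_number INTEGER NOT NULL,
            UNIQUE (dag_id, version_number)
        );
        CREATE TABLE serialized_dag (
            id INTEGER PRIMARY KEY,
            dag_id TEXT NOT NULL,
            dag_version_id INTEGER NOT NULL REFERENCES dag_version (id),
            dag_hash TEXT NOT NULL,
            created_at TEXT NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dag_version (id, dag_id, version_number) VALUES (?, ?, ?)",
        [(1, "dag_a", 1), (2, "dag_a", 2), (3, "dag_b", 1)],
    )
    # For dag_a, the row tied to version 1 has the LATER timestamp,
    # so created_at ordering and version_number ordering disagree.
    conn.executemany(
        "INSERT INTO serialized_dag (dag_id, dag_version_id, dag_hash, created_at) "
        "VALUES (?, ?, ?, ?)",
        [
            ("dag_a", 1, "hash-a1", "2025-01-01T00:00:05"),
            ("dag_a", 2, "hash-a2", "2025-01-01T00:00:01"),
            ("dag_b", 3, "hash-b1", "2025-01-01T00:00:02"),
        ],
    )
    return conn


def latest_hash_per_dag(conn):
    """Pick one serialized row per dag_id, ranked by the joined version_number."""
    rows = conn.execute(
        """
        SELECT dag_id, dag_hash
        FROM (
            SELECT sd.dag_id AS dag_id,
                   sd.dag_hash AS dag_hash,
                   ROW_NUMBER() OVER (
                       PARTITION BY sd.dag_id
                       ORDER BY dv.version_number DESC
                   ) AS rn
            FROM serialized_dag AS sd
            JOIN dag_version AS dv ON dv.id = sd.dag_version_id
        ) AS ranked
        WHERE rn = 1
        """
    ).fetchall()
    return dict(rows)
```

Calling `latest_hash_per_dag(build_demo_db())` selects the `dag_a` row attached to version 2, even though the version 1 row has the later timestamp.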
1c31871 to 2e34957
The latest DagVersion was selected by created_at DESC, which is not deterministic when two versions share a timestamp (or under clock skew). write_dag derives the next version_number from that row, so picking a non-max row collided with the (dag_id, version_number) unique constraint. Order by the monotonic, unique version_number instead, consistently across get_latest_version/write_dag, get_version, and the bulk prefetch.
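The failure mode described above can be reproduced in miniature. The following is a sketch under stated assumptions, not Airflow's implementation: it uses the standard-library `sqlite3` module, a hypothetical one-table schema, and invented helper names (`make_db`, `latest_version_number`, `write_next_version`). Clock skew is simulated by giving version 2 an earlier timestamp than version 1, which makes the outcome deterministic for the demonstration.

```python
import sqlite3


def make_db():
    """Build a hypothetical, simplified dag_version table with skewed timestamps."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE dag_version (
            id INTEGER PRIMARY KEY,
            dag_id TEXT NOT NULL,
            version_number INTEGER NOT NULL,
            created_at TEXT NOT NULL,
            UNIQUE (dag_id, version_number)
        )
        """
    )
    # Version 2 carries an EARLIER timestamp than version 1 (simulated clock skew).
    conn.executemany(
        "INSERT INTO dag_version (dag_id, version_number, created_at) VALUES (?, ?, ?)",
        [
            ("example_dag", 1, "2025-01-01T00:00:05"),
            ("example_dag", 2, "2025-01-01T00:00:01"),
        ],
    )
    return conn


def latest_version_number(conn, dag_id, order_by):
    """Return the version_number of the 'latest' row under the given ordering."""
    # Only two known column names are accepted, so interpolating is safe here.
    if order_by not in {"created_at", "version_number"}:
        raise ValueError(f"unsupported ordering column: {order_by}")
    row = conn.execute(
        f"SELECT version_number FROM dag_version "
        f"WHERE dag_id = ? ORDER BY {order_by} DESC LIMIT 1",
        (dag_id,),
    ).fetchone()
    return row[0] if row else 0


def write_next_version(conn, dag_id, order_by):
    """Derive the next version_number from the 'latest' row and insert it."""
    next_number = latest_version_number(conn, dag_id, order_by) + 1
    conn.execute(
        "INSERT INTO dag_version (dag_id, version_number, created_at) VALUES (?, ?, ?)",
        (dag_id, next_number, "2025-01-01T00:00:10"),
    )
    return next_number


if __name__ == "__main__":
    conn = make_db()
    try:
        write_next_version(conn, "example_dag", order_by="created_at")
    except sqlite3.IntegrityError as exc:
        # created_at ordering picked version 1, so the insert reused version 2.
        print("collision:", exc)
    print("next version:", write_next_version(conn, "example_dag", order_by="version_number"))
```

With `created_at` ordering the helper treats version 1 as latest and tries to insert version 2 again, which violates the unique constraint. With `version_number` ordering it inserts version 3.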
2e34957 to 204c9e2
Was generative AI tooling used to co-author this PR?
Claude opus 4.8