Skip to content

Core: Disallow setting main branch ref to a tag#16753

Open
rahulsmahadev wants to merge 2 commits into
apache:mainfrom
rahulsmahadev:core-disallow-tag-main-ref
Open

Core: Disallow setting main branch ref to a tag#16753
rahulsmahadev wants to merge 2 commits into
apache:mainfrom
rahulsmahadev:core-disallow-tag-main-ref

Conversation

@rahulsmahadev

Copy link
Copy Markdown
Contributor

Summary

TableMetadata.Builder#setRef accepts a tag ref named main: the main-branch special case sets currentSnapshotId and appends to the snapshot log without checking the ref type, and validateRefs only verifies that main's snapshot id matches current-snapshot-id — never its type. The spec requires main to be a branch (format/spec.md, refs: "There is always a main branch reference pointing to the current-snapshot-id").

Once such metadata is committed the table can no longer be written: every subsequent commit fails in setBranchSnapshotInternal with Cannot update branch: main is a tag — which also shows the builder already assumes main is a branch elsewhere.

This is reachable on any table whose metadata has snapshots but no main ref yet:

  • newAppend().stageOnly().commit() (WAP) followed by manageSnapshots().createTag("main", stagedSnapshotId).commit()
  • after CREATE OR REPLACE (replacement metadata drops refs but keeps snapshots)
  • via REST set-snapshot-ref updates applied through TableMetadata.Builder (update requirements assert ref snapshot ids, not ref types)

This change rejects the tag in Builder#setRef with a ValidationException before any builder state is mutated. The check is deliberately not added to validateRefs so that already-written metadata files containing this corruption can still be read (and repaired by committing a branch ref for main).

Test plan

  • TestTableMetadata#testSetRefRejectsTagForMainBranch — builder-level: setRef("main", tag) throws; tag refs under any other name still work
  • TestSnapshotManager#testCreateTagNamedMainFails — end-to-end repro via ManageSnapshots on a staged-only table; also asserts the failed commit did not create the main ref
  • Ran :iceberg-core:test --tests TestTableMetadata --tests TestSnapshotManager plus spotless/checkstyle locally

This pull request and its description were written by Isaac.

The main-branch special case in TableMetadata.Builder#setRef sets
currentSnapshotId and appends to the snapshot log without checking the
ref type, so a tag named main can be committed (for example
createTag("main", id) on a table whose only snapshots are staged). The
spec requires main to always be a branch, and once such metadata is
written every subsequent write fails with "Cannot update branch: main
is a tag".

Reject the tag in setRef before any builder state is mutated.

Co-authored-by: Isaac
@github-actions github-actions Bot added the core label Jun 10, 2026
@rahulsmahadev

Copy link
Copy Markdown
Contributor Author

@szehon-ho can you help take an initial look and see if this change is meaningful

Comment on lines +1332 to +1334
"Cannot set %s to a tag: %s must be a branch",
SnapshotRef.MAIN_BRANCH,
SnapshotRef.MAIN_BRANCH);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: the message repeats main twice (it expands to Cannot set main to a tag: main must be a branch). Could drop the second arg to read a bit cleaner, while still making clear that main can be set, just as a branch:

Suggested change
"Cannot set %s to a tag: %s must be a branch",
SnapshotRef.MAIN_BRANCH,
SnapshotRef.MAIN_BRANCH);
"Cannot set %s to a tag, it must be a branch",
SnapshotRef.MAIN_BRANCH);

Non-blocking, and it keeps the same Cannot set %s to ... shape as the sibling unknown-snapshot check just above.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, done

@rahulsmahadev rahulsmahadev requested a review from szehon-ho June 12, 2026 05:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants