Skip to content

fix: serialize version metadata through JNI and correct row-ID lookup#6465

Open
ivscheianu wants to merge 19 commits intolance-format:mainfrom
ivscheianu:stable-row-id
Open

fix: serialize version metadata through JNI and correct row-ID lookup#6465
ivscheianu wants to merge 19 commits intolance-format:mainfrom
ivscheianu:stable-row-id

Conversation

@ivscheianu
Copy link
Copy Markdown
Contributor

@ivscheianu ivscheianu commented Apr 9, 2026

Fixes #6464

Summary

Two bugs in the stable row ID + CDF version tracking pipeline caused _row_created_at_version to report incorrect values after updates.

Bug 1: JNI serialization gap for version metadata

FragmentMetadata round-trips through JNI (Rust -> Java -> Rust) lost created_at_version_meta and last_updated_at_version_meta. The JNI layer serialized RowIdMeta but hard-coded None for both version metadata fields on the Java->Rust path, so any version information attached by lance-core was silently dropped when the connector returned fragments.

Fix: Add VersionMeta.java (mirrors RowIdMeta), extend FragmentMetadata with two new VersionMeta fields, and update lance-jni/src/fragment.rs to serialize/deserialize RowDatasetVersionMeta via JSON through the new Java type. Also extracted a generic extract_nullable_field<T>() helper to reduce duplicated nullable-getter boilerplate.

Bug 2: Incorrect row-ID lookup in Operation::Update

When preserving created_at versions for updated rows, the code assumed row_id >> 32 yields the original fragment ID. This is only true for unstable row addresses. Stable row IDs are sequential integers unrelated to
fragment layout. Any dataset with stable row IDs would look up the wrong fragment and fall back to a default version of 1.

Fix: Build a reverse index (stable row ID -> (fragment ref, offset)) from each existing fragment's RowIdMeta sequence, then look up each new fragment's row IDs in this index to find the correct original fragment and row offset.

Additional improvements

  • RowIdMeta.fromRowIds(long[]) — static factory that encodes stable row IDs into the inline protobuf/JSON format expected by lance-core, centralizing logic that was previously hand-rolled in the connector.

Companion PR

This PR is paired with lance-format/lance-spark#407 which implements the native UPDATE path in the Spark connector: capturing _rowid during update(), attaching RowIdMeta to new fragments, and consolidating deletion handling on the driver.

@ivscheianu ivscheianu marked this pull request as ready for review April 9, 2026 10:45
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 9, 2026

Codecov Report

❌ Patch coverage is 96.73367% with 13 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance/src/dataset/transaction.rs 96.70% 12 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

let total_slots = stats.max - stats.min + 1;

let range_with_holes = 24 + 4 * n_holes as usize;
let range_with_holes = 24usize.saturating_add(4usize.saturating_mul(n_holes as usize));
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

created #6515 for a long-term fix, will try to take care of the impl in a separate PR

}

impl IntoJava for &RowDatasetVersionMeta {
fn into_java<'a>(self, env: &mut JNIEnv<'a>) -> Result<JObject<'a>> {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit pick... Let's use 'local for the lifetime, similar to the other Fragment.into_java().

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'a is actually the dominant convention across all the other IntoJava impls in this file (DataFile, DeletionFile, RowIdMeta, FragmentMergeResult, etc.). Fragment is actually the outlier. My main concern was about consistency, happy to change it if you think it's the appropriate move


/// Fallback version for rows whose original creation version cannot be determined.
/// Version 1 is the initial dataset version in the Lance format.
const UNKNOWN_CREATED_AT_VERSION: u64 = 1;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to rename it to DEFAULT_VERSION.

Suggested change
const UNKNOWN_CREATED_AT_VERSION: u64 = 1;
const DEFAULT_VERSION: u64 = 1;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to keep UNKNOWN_CREATED_AT_VERSION because the name captures why this value is used (fallback when the original creation version can't be determined), not just what it is. DEFAULT_VERSION would be a little ambiguous at the call sites since a reader would have to trace back to the constant definition to understand the intent

Comment thread rust/lance/src/dataset/transaction.rs Outdated
Comment on lines +93 to +94
frag_indices.sort_by_key(|&i| existing_fragments[i].id);
for &i in &frag_indices {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that we do to iterations over the existing_fragments. Can this be improved?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated, thanks!

None => None,
};

if let Some(row_ids) = row_ids {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be moved in the match above. This will also make code more readable and move the else branch in the None case.

Copy link
Copy Markdown
Contributor Author

@ivscheianu ivscheianu Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Folding the if let Some(row_ids) into the Inline arm would require duplicating the fallback build_version_meta logic across three places: the Inline/Err sub-arm, the External arm, and the None arm. The current structure keeps that fallback in one else branch, which is the only reason the intermediate row_ids: Option<_> variable exists. Agreed it's not the most elegant shape, but I think it's less duplication than the alternative


fragment.last_updated_at_version_meta = build_version_meta(fragment, new_version);
} else {
let version_meta = build_version_meta(fragment, new_version);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be moved in the None case.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

related to the above #6465 (comment)

Comment thread rust/lance/src/dataset/transaction.rs Outdated
let inherited = current_manifest
.map(|m| m.uses_stable_row_ids())
.unwrap_or(false);
let stable_row_ids = config.use_stable_row_ids || inherited;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick... the name of the variable should let us understand that it is a boolean, so use_stable_row_ids would be better.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed 👍🏻

Comment thread rust/lance/src/dataset.rs Outdated
// build_manifest may have already set FLAG_STABLE_ROW_IDS on the manifest.
// Preserve it here so this second apply_feature_flags call does not clear it
// when config.use_stable_row_ids is false (the ManifestWriteConfig default).
let stable_row_ids = config.use_stable_row_ids || manifest.uses_stable_row_ids();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto... use_stable_row_ids to show that is a boolean.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed 👍🏻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working java

Projects

None yet

Development

Successfully merging this pull request may close these issues.

JNI loses version metadata and row-ID lookup is incorrect for stable row IDs during updates

2 participants