Skip to content

Move partial-upsert resolveComparisonTies method from base class to implementations#18640

Open
deepthi912 wants to merge 2 commits into
apache:masterfrom
deepthi912:partial-upserts-resolveTies
Open

Move partial-upsert resolveComparisonTies method from base class to implementations#18640
deepthi912 wants to merge 2 commits into
apache:masterfrom
deepthi912:partial-upserts-resolveTies

Conversation

@deepthi912
Copy link
Copy Markdown
Collaborator

@deepthi912 deepthi912 commented May 31, 2026

Remove the logic of comparison ties to implementations and remove unnecessary methods from the base class.

…lasses

The base addOrReplaceSegment wrapper in BasePartitionUpsertMetadataManager
existed only to call resolveComparisonTies for partial-upsert before delegating
to the subclass-specific doAddOrReplaceSegment. This made the abstract
doAddOrReplaceSegment signature redundant with its only caller.

Inline the wrapper: callers now invoke doAddOrReplaceSegment directly, and the
resolveComparisonTies call moves into ConcurrentMapPartitionUpsertMetadataManager
and ConcurrentMapPartitionUpsertMetadataManagerForConsistentDeletes (the only
two non-abstract subclasses). Removes one virtual call per segment add/replace
and eliminates a layer of indirection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@deepthi912 deepthi912 added the upsert Related to upsert functionality label May 31, 2026
…nager

Line 661 was 121 chars; wrap argument list so each line stays ≤120.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@deepthi912 deepthi912 added the refactor Code restructuring without changing behavior label May 31, 2026
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 31, 2026

Codecov Report

❌ Patch coverage is 62.50000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.39%. Comparing base (5a6cd7b) to head (3cbd93a).

Files with missing lines Patch % Lines
...tionUpsertMetadataManagerForConsistentDeletes.java 33.33% 1 Missing and 1 partial ⚠️
...cal/upsert/BasePartitionUpsertMetadataManager.java 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##             master   #18640   +/-   ##
=========================================
  Coverage     64.38%   64.39%           
  Complexity     1282     1282           
=========================================
  Files          3359     3359           
  Lines        207825   207824    -1     
  Branches      32447    32447           
=========================================
+ Hits         133818   133825    +7     
+ Misses        63244    63237    -7     
+ Partials      10763    10762    -1     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-21 64.39% <62.50%> (+<0.01%) ⬆️
temurin 64.39% <62.50%> (+<0.01%) ⬆️
unittests 64.39% <62.50%> (+<0.01%) ⬆️
unittests1 56.78% <0.00%> (+<0.01%) ⬆️
unittests2 37.15% <62.50%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@deepthi912 deepthi912 requested a review from KKcorps June 1, 2026 17:29
Comment on lines +74 to +76
if (_partialUpsertHandler != null) {
recordInfoIterator = resolveComparisonTies(recordInfoIterator, _hashFunction);
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this method resolveComparisonTies can we try to not construct the entire Map in-memory here just to get the iterator?

We have something similar in BasePartitionDedupMetadataManager#addOrReplaceSegment, and it's creating the iterator properly via some readers without allocating the memory all at once.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay it seems we need to do a deduplication here on all records from recordInfoIterator, so a huge on-heap allocation is inevitable.

Copy link
Copy Markdown
Collaborator

@J-HowHuang J-HowHuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

refactor Code restructuring without changing behavior upsert Related to upsert functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants