Skip to content

Conversation

@manishkr
Copy link
Contributor

@manishkr manishkr commented Dec 23, 2025

Which issue does this PR close?

Rationale for this change

To consider offset in slicing of RunArray.

What changes are included in this PR?

  1. Considered offset in slicing of RunArray.
  2. Enhanced RunArray slice API.

Are these changes tested?

yes

Are there any user-facing changes?

Yes, extended API to access RunArray slices directly than getting it from index.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Dec 23, 2025
@manishkr
Copy link
Contributor Author

manishkr commented Dec 23, 2025

Looking into test(intermittent) case failure.

@manishkr manishkr force-pushed the 9018-fix-runarray-kernel branch from b17d380 to d9e54fb Compare December 24, 2025 12:31
@manishkr
Copy link
Contributor Author

Need help on.
Archery test With other arrows
The hosted runner lost communication with the server. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.

@Jefffrey
Copy link
Contributor

Need help on. Archery test With other arrows The hosted runner lost communication with the server. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.

Can ignore, it's an unrelated issue: #9024

Copy link
Contributor

@Jefffrey Jefffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up. Left some comments, also changed PR body to not close original issue as I still need to check what other kernels/code might need fixing.

.take(len)
.map(move |run_end| {
let value = *run_end - offset;
value.min(length) + adjustment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this min necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is essential for handling sliced RunArray, where the last physical run extends beyond the logical slice boundary. Added test case with analysis of this case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a comment to that effect here? Something along the lines of:

min required for cases where slice ends in the middle of the run at the end

@manishkr manishkr force-pushed the 9018-fix-runarray-kernel branch from d9e54fb to 6fb02cd Compare December 27, 2025 06:55
@manishkr
Copy link
Contributor Author

Thanks for picking this up. Left some comments, also changed PR body to not close original issue as I still need to check what other kernels/code might need fixing.

Thanks for reviewing and feedback. If you permit, once this PR is good to go, can take this.

.take(len)
.map(move |run_end| {
let value = *run_end - offset;
value.min(length) + adjustment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a comment to that effect here? Something along the lines of:

min required for cases where slice ends in the middle of the run at the end

Comment on lines +1900 to +1910
// Without min(length), its run end would be 5 - 0 = 5.
// With min(length), it becomes min(5 - 0, 3) = 3.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments are nice, but they are quite separated from the implementation so they are a bit confusing (if someone were to look at this test, how would they understand what min(length) is referring to here?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Added more explanation note, also added comment in code as suggested above.

let filter_values = predicate.filter.values();
let run_ends = run_ends.inner();

let pred: BooleanArray = BooleanBuffer::collect_bool(run_ends.len(), |i| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still thinking about this a bit; it seems odd that we still iterate through all of the run_ends including the values before/after the slice. I wonder if there is a better way to handle it, or maybe I misunderstand how this logic is working 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for keeping performance aspect in mind. Changed the run for physical length only.

Comment on lines 192 to 203
/// Similar to [`values`] but accounts for logical slicing, returning only the values
/// that are part of the logical slice of this buffer.
///
/// [`values`]: Self::values
pub fn values_slice(&self) -> &[E] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for this docstring we also need a clear warning note that the first & last run values may be inaccurate as they don't take into account the logical slicing. I wonder if there is a better way to handle this without too much overhead 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Added warning but not sure logical run ends API(logical_run_ends) is needed or not, but added it. Please feel free to suggest to remove it.

@manishkr manishkr force-pushed the 9018-fix-runarray-kernel branch from 6fb02cd to 57e6a3c Compare December 30, 2025 07:36
@manishkr
Copy link
Contributor Author

manishkr commented Jan 5, 2026

@Jefffrey Please help to close this. Thanks

@Jefffrey
Copy link
Contributor

Jefffrey commented Jan 5, 2026

I'll try take another look soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants