Skip to content

Reduce duplicate copies in pack encoding#137

Merged
genedna merged 4 commits into
libra-tools:mainfrom
cheng20050714:optimize-pack-encode-copies
Jun 29, 2026
Merged

Reduce duplicate copies in pack encoding#137
genedna merged 4 commits into
libra-tools:mainfrom
cheng20050714:optimize-pack-encode-copies

Conversation

@cheng20050714

Copy link
Copy Markdown
Contributor

Summary

This PR reduces unnecessary allocations and buffer copies in the pack encode path while keeping the existing public API and pack output semantics unchanged.

The main change is to preserve ownership of encoded object buffers once they have already been produced as Vec<u8>. Previously, some encode paths converted those owned buffers back into borrowed slices and then cloned them again before sending them through the output channel. This PR moves those owned buffers directly into the write path after updating the pack hash and offset.

Changes

  • Preallocate encode result collections when the final number of entries is known.
  • Avoid cloning encoded object bytes in try_as_offset_delta.
  • Drain grouped encode results in inner_encode so IndexEntry values can be moved instead of cloned.
  • Replace the borrowed write helper with an owned-buffer write helper for pack object data.
  • Apply the owned write path to both inner_encode and parallel_encode.
  • Preallocate parallel_encode index entries using self.object_number.
  • Remove small header/trailer buffer copies where the bytes can be moved after hashing.
  • Add focused regression tests for delta-window encoding, empty buckets, and parallel no-delta encoding.

Correctness

The optimization is limited to ownership and allocation behavior. It does not change pack object ordering, delta selection, encoded object contents, public APIs, or index generation semantics.

Important invariants are preserved:

  • IndexEntry.offset is still assigned before each object is written.
  • inner_offset is still increased by the exact encoded object byte length.
  • The pack hash is updated with the same bytes that are sent to the output channel.
  • inner_encode still writes objects in commit, tree, blob, tag order.
  • parallel_encode still writes each Rayon-collected batch in deterministic collection order.

Tests

cargo clippy --all-targets --all-features -- -D warnings
cargo +nightly fmt --all --check
cargo fmt --check
cargo test --lib test_try_as_offset_delta_keeps_one_result_per_input
cargo test --lib test_try_as_offset_delta_accepts_empty_bucket
cargo test --lib test_delta_window_encode_after_copy_optimization_roundtrips
cargo test --lib test_parallel_encode_after_owned_write_roundtrips
cargo test --lib test_pack_encoder

cargo test --lib test_pack_encoder passed with 14 passed; 0 failed.

Notes for Reviewers

This PR intentionally avoids broader strategy changes such as parameterizing window_size, changing delta-base selection, or modifying pack index generation. Those would affect behavior outside the scope of this allocation-focused cleanup.

Copilot AI review requested due to automatic review settings May 30, 2026 11:26

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Reduces unnecessary buffer cloning/allocations in the pack encoder by moving owned Vec<u8> buffers through the write/send pipeline, pre-sizing collections, and adds tests covering the refactored paths.

Changes:

  • Replaces write_all_and_update(&[u8]) with write_owned_and_update(Vec<u8>) and avoids clone()/to_vec() at several call sites.
  • Pre-allocates idx_entries and per-bucket result vectors with known capacities and drains intermediate vectors.
  • Adds unit/async tests for try_as_offset_delta and end-to-end encode roundtrips.

Comment on lines +887 to +890
let encoder = PackEncoder::new(contents.len(), 4, tx);
encoder.encode_async(entry_rx).await.unwrap();

for content in contents {
Comment on lines +910 to +922
#[tokio::test]
async fn test_parallel_encode_after_owned_write_roundtrips() {
let _guard = set_hash_kind_for_test(HashKind::Sha1);
let contents = vec![
"parallel alpha",
"parallel beta",
"parallel gamma",
"parallel delta",
];
let (tx, mut rx) = mpsc::channel(16);
let (entry_tx, entry_rx) = mpsc::channel::<MetaAttached<Entry, EntryMeta>>(16);
let encoder = PackEncoder::new(contents.len(), 0, tx);
encoder.encode_async(entry_rx).await.unwrap();
cheng20050714 and others added 3 commits May 31, 2026 23:23
Replace write_all_and_update(&[u8]) with write_owned_and_update(Vec<u8>)
so encoded object bytes are moved directly into the output channel instead
of being copied via to_vec(). This removes the last per-object full-buffer
copy in the encode hot path.

Also:
- Move header and trailer bytes instead of cloning in both inner_encode
  and parallel_encode
- Pre-allocate idx_entries in parallel_encode with Vec::with_capacity
- Use tuple destructuring in write loops for clarity

New test: test_parallel_encode_after_owned_write_roundtrips covers the
parallel (window_size=0) path with owned write semantics.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: cheng <hecheng07142005@163.com>
@cheng20050714 cheng20050714 force-pushed the optimize-pack-encode-copies branch from cfffdd9 to 670cad1 Compare May 31, 2026 15:25
Copilot AI review requested due to automatic review settings June 2, 2026 02:07

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comment on lines 462 to +465
// Hash signature
let hash_result = self.inner_hash.clone().finalize();
self.final_hash = Some(ObjectHash::from_bytes(&hash_result).unwrap());
self.send_data(hash_result.to_vec()).await;
self.send_data(hash_result).await;
Comment on lines 673 to +676
// hash signature
let hash_result = self.inner_hash.clone().finalize();
self.final_hash = Some(ObjectHash::from_bytes(&hash_result).unwrap());
self.send_data(hash_result.to_vec()).await;
self.send_data(hash_result).await;
@cheng20050714

Copy link
Copy Markdown
Contributor Author

Updated the branch to address the CI failures.

Changes in the latest commit:

  • Fixed Clippy warnings reported by the CI's newer toolchain.
  • Fixed a concurrent pack fixture download/removal race that could make full cargo test fail with No such file or directory.

Re-ran locally:

  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo +nightly fmt --all --check
  • cargo test

All passed.

@cheng20050714

Copy link
Copy Markdown
Contributor Author

All checks are passing now. I also verified the branch locally with:

  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo +nightly fmt --all --check
  • cargo test

Thanks for reviewing when you have time.😃

@genedna genedna merged commit b67aab1 into libra-tools:main Jun 29, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants