Reduce duplicate copies in pack encoding#137
Merged
genedna merged 4 commits intoJun 29, 2026
Merged
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Reduces unnecessary buffer cloning/allocations in the pack encoder by moving owned Vec<u8> buffers through the write/send pipeline, pre-sizing collections, and adds tests covering the refactored paths.
Changes:
- Replaces
write_all_and_update(&[u8])withwrite_owned_and_update(Vec<u8>)and avoidsclone()/to_vec()at several call sites. - Pre-allocates
idx_entriesand per-bucket result vectors with known capacities and drains intermediate vectors. - Adds unit/async tests for
try_as_offset_deltaand end-to-end encode roundtrips.
Comment on lines
+887
to
+890
| let encoder = PackEncoder::new(contents.len(), 4, tx); | ||
| encoder.encode_async(entry_rx).await.unwrap(); | ||
|
|
||
| for content in contents { |
Comment on lines
+910
to
+922
| #[tokio::test] | ||
| async fn test_parallel_encode_after_owned_write_roundtrips() { | ||
| let _guard = set_hash_kind_for_test(HashKind::Sha1); | ||
| let contents = vec![ | ||
| "parallel alpha", | ||
| "parallel beta", | ||
| "parallel gamma", | ||
| "parallel delta", | ||
| ]; | ||
| let (tx, mut rx) = mpsc::channel(16); | ||
| let (entry_tx, entry_rx) = mpsc::channel::<MetaAttached<Entry, EntryMeta>>(16); | ||
| let encoder = PackEncoder::new(contents.len(), 0, tx); | ||
| encoder.encode_async(entry_rx).await.unwrap(); |
Replace write_all_and_update(&[u8]) with write_owned_and_update(Vec<u8>) so encoded object bytes are moved directly into the output channel instead of being copied via to_vec(). This removes the last per-object full-buffer copy in the encode hot path. Also: - Move header and trailer bytes instead of cloning in both inner_encode and parallel_encode - Pre-allocate idx_entries in parallel_encode with Vec::with_capacity - Use tuple destructuring in write loops for clarity New test: test_parallel_encode_after_owned_write_roundtrips covers the parallel (window_size=0) path with owned write semantics. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: cheng <hecheng07142005@163.com>
cfffdd9 to
670cad1
Compare
Comment on lines
462
to
+465
| // Hash signature | ||
| let hash_result = self.inner_hash.clone().finalize(); | ||
| self.final_hash = Some(ObjectHash::from_bytes(&hash_result).unwrap()); | ||
| self.send_data(hash_result.to_vec()).await; | ||
| self.send_data(hash_result).await; |
Comment on lines
673
to
+676
| // hash signature | ||
| let hash_result = self.inner_hash.clone().finalize(); | ||
| self.final_hash = Some(ObjectHash::from_bytes(&hash_result).unwrap()); | ||
| self.send_data(hash_result.to_vec()).await; | ||
| self.send_data(hash_result).await; |
Contributor
Author
|
Updated the branch to address the CI failures. Changes in the latest commit:
Re-ran locally:
All passed. |
Contributor
Author
|
All checks are passing now. I also verified the branch locally with:
Thanks for reviewing when you have time.😃 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR reduces unnecessary allocations and buffer copies in the pack encode path while keeping the existing public API and pack output semantics unchanged.
The main change is to preserve ownership of encoded object buffers once they have already been produced as
Vec<u8>. Previously, some encode paths converted those owned buffers back into borrowed slices and then cloned them again before sending them through the output channel. This PR moves those owned buffers directly into the write path after updating the pack hash and offset.Changes
try_as_offset_delta.inner_encodesoIndexEntryvalues can be moved instead of cloned.inner_encodeandparallel_encode.parallel_encodeindex entries usingself.object_number.Correctness
The optimization is limited to ownership and allocation behavior. It does not change pack object ordering, delta selection, encoded object contents, public APIs, or index generation semantics.
Important invariants are preserved:
IndexEntry.offsetis still assigned before each object is written.inner_offsetis still increased by the exact encoded object byte length.inner_encodestill writes objects in commit, tree, blob, tag order.parallel_encodestill writes each Rayon-collected batch in deterministic collection order.Tests
cargo test --lib test_pack_encoderpassed with14 passed; 0 failed.Notes for Reviewers
This PR intentionally avoids broader strategy changes such as parameterizing
window_size, changing delta-base selection, or modifying pack index generation. Those would affect behavior outside the scope of this allocation-focused cleanup.