[codex] fix pull restore performance #422
Draft
genedna wants to merge 1 commit into
Signed-off-by: Eli Ma <eli@patch.sh>
Pull request overview
This PR addresses a libra pull fast-forward performance regression where the post-fetch restore path could recursively enumerate the entire working tree and hash large, unrelated untracked artifacts (e.g., target/, web/node_modules/), causing apparent hangs in large repos.
Changes:
- Reworks restore candidate collection to consider only (a) target-tree paths and (b) index-tracked paths, both filtered by the user pathspec, avoiding filesystem-wide expansion.
- Switches ordinary-file blob hashing to a streaming “Git object” hash computation (instead of reading full files into memory).
- Reuses the shared internal::merge_base implementation for pull/merge LCA computation and adds a regression test that simulates an unreadable untracked artifact.
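The candidate collection described in the first bullet can be sketched roughly as below. This is a minimal illustration, not the PR's code: `collect_restore_candidates` and `matches_pathspec` are hypothetical names, the path lists stand in for whatever the target tree and index actually expose, and pathspec matching is reduced to a component-wise prefix test. The point it shows is that the candidate set is built only from known paths, so the filesystem is never walked.

```rust
use std::collections::BTreeSet;
use std::path::{Path, PathBuf};

/// Hypothetical helper: builds the restore candidate set as the union of
/// (a) paths in the target tree and (b) paths tracked in the index,
/// keeping only paths matched by at least one pathspec.
/// No directory listing happens here, so untracked artifacts are never seen.
fn collect_restore_candidates(
    target_tree: &[PathBuf],
    index_tracked: &[PathBuf],
    pathspecs: &[PathBuf],
) -> BTreeSet<PathBuf> {
    target_tree
        .iter()
        .chain(index_tracked)
        .filter(|p| pathspecs.iter().any(|spec| matches_pathspec(p, spec)))
        .cloned()
        .collect()
}

/// Assumed, simplified pathspec rule: "." matches everything, otherwise the
/// pathspec must be a component-wise prefix of the path.
fn matches_pathspec(path: &Path, spec: &Path) -> bool {
    spec == Path::new(".") || path.starts_with(spec)
}
```

With a whole-worktree pathspec of `.`, the result is bounded by the size of the target tree plus the index, regardless of how much build output sits in directories such as target/.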
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/command/pull_test.rs | Adds a regression test ensuring fast-forward pull restore doesn’t scan unreadable untracked artifacts. |
| tests/command/index_pack_test.rs | Refactors test pack index parsing to be chunk-based and avoid unwrap() on slice conversions. |
| src/internal/tui/markdown_render.rs | Uses std::mem::take to avoid drain+collect churn when wrapping segments. |
| src/command/verify_pack_index_v2.rs | Refactors offset parsing to chunk-based iteration with explicit truncation checks. |
| src/command/verify_pack_index_common.rs | Refactors fanout parsing to chunk-based iteration with explicit truncation checks. |
| src/command/restore.rs | Core fix: collects restore worktree paths from target+tracked sets rather than expanding the filesystem pathspec. |
| src/command/mod.rs | Implements streaming file blob hashing for ordinary files; preserves LFS pointer hashing behavior. |
| src/command/merge.rs | Uses shared merge_base logic for LCA in pull merge flows and improves error context. |
| src/command/for_each_ref.rs | Minor refactor to simplify refname:rstrip parsing logic. |
| src/command/diff.rs | Uses std::mem::take for trailing -- pathspec handling. |
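The streaming blob hashing noted for src/command/mod.rs can be sketched as follows. This is an illustration under stated assumptions rather than the PR's implementation: it assumes SHA-1 object IDs and the standard Git blob framing (`blob <len>\0` followed by the content), `hash_blob_streaming` is a hypothetical name, and the small inline `Sha1` type exists only to keep the example dependency-free (a real codebase would use a vetted hashing crate). The file content passes through a fixed 8 KiB buffer, so memory use does not grow with file size.

```rust
use std::io::{self, Read};

/// Minimal streaming SHA-1 state, included only so the sketch is self-contained.
struct Sha1 { state: [u32; 5], block: [u8; 64], block_len: usize, total_len: u64 }

impl Sha1 {
    fn new() -> Self {
        Sha1 {
            state: [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0],
            block: [0; 64],
            block_len: 0,
            total_len: 0,
        }
    }

    /// Feeds bytes in, compressing each time a full 64-byte block is buffered.
    fn update(&mut self, mut data: &[u8]) {
        self.total_len += data.len() as u64;
        while !data.is_empty() {
            let take = (64 - self.block_len).min(data.len());
            self.block[self.block_len..self.block_len + take].copy_from_slice(&data[..take]);
            self.block_len += take;
            data = &data[take..];
            if self.block_len == 64 {
                self.compress();
                self.block_len = 0;
            }
        }
    }

    fn compress(&mut self) {
        let mut w = [0u32; 80];
        for (i, c) in self.block.chunks_exact(4).enumerate() {
            w[i] = u32::from_be_bytes([c[0], c[1], c[2], c[3]]);
        }
        for i in 16..80 {
            w[i] = (w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16]).rotate_left(1);
        }
        let [mut a, mut b, mut c, mut d, mut e] = self.state;
        for (i, &wi) in w.iter().enumerate() {
            let (f, k) = match i / 20 {
                0 => ((b & c) | (!b & d), 0x5A827999u32),
                1 => (b ^ c ^ d, 0x6ED9EBA1),
                2 => ((b & c) | (b & d) | (c & d), 0x8F1BBCDC),
                _ => (b ^ c ^ d, 0xCA62C1D6),
            };
            let t = a.rotate_left(5).wrapping_add(f).wrapping_add(e)
                .wrapping_add(k).wrapping_add(wi);
            e = d;
            d = c;
            c = b.rotate_left(30);
            b = a;
            a = t;
        }
        for (s, v) in self.state.iter_mut().zip([a, b, c, d, e]) {
            *s = s.wrapping_add(v);
        }
    }

    /// Applies SHA-1 padding and returns the digest as lowercase hex.
    fn finalize(mut self) -> String {
        let bit_len = self.total_len.wrapping_mul(8); // captured before padding
        self.update(&[0x80]);
        while self.block_len != 56 {
            self.update(&[0]);
        }
        self.update(&bit_len.to_be_bytes());
        self.state.iter().map(|w| format!("{w:08x}")).collect()
    }
}

/// Hypothetical helper: hashes `reader` as a Git blob without loading it fully.
/// `len` must be known up front (e.g. from file metadata) because the header
/// precedes the content; a mismatch is reported as an error.
fn hash_blob_streaming<R: Read>(mut reader: R, len: u64) -> io::Result<String> {
    let mut hasher = Sha1::new();
    hasher.update(format!("blob {len}\0").as_bytes());
    let mut buf = [0u8; 8192];
    let mut seen = 0u64;
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break;
        }
        seen += n as u64;
        hasher.update(&buf[..n]);
    }
    if seen != len {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "size changed while hashing"));
    }
    Ok(hasher.finalize())
}
```

Because the length appears in the header, a file that changes size mid-read cannot yield a valid object ID, which is why the sketch treats a mismatch as an error instead of returning a hash.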
Summary
Fixes a `libra pull` fast-forward performance path where restore expanded the entire working tree and hashed unrelated untracked artifacts such as `target/` and `web/node_modules/` after fetch completed.
Root cause
`pull` fast-forward updated HEAD and then invoked restore with the whole worktree as the pathspec. Restore expanded that pathspec by recursively listing the filesystem, then computed blob hashes before checking whether paths were tracked or present in the target tree. In large worktrees this could scan and read hundreds of GB of build output and look like a hang.
Changes
- [...] `internal::merge_base` graph implementation for pull/merge LCA computation.
- [...] `-D warnings` on the current toolchain.
Validation
- `cargo +nightly fmt --all --check`
- `LIBRA_SKIP_WEB_BUILD=1 cargo check`
- `LIBRA_SKIP_WEB_BUILD=1 cargo clippy --all-targets --all-features -- -D warnings`
- `LIBRA_SKIP_WEB_BUILD=1 cargo test --test command_test test_pull_fast_forward_skips_untracked_artifacts_during_restore -- --nocapture`
- `LIBRA_SKIP_WEB_BUILD=1 cargo test --test command_test test_pull_fast_forward_updates_head_from_tracking_remote -- --nocapture`
- `LIBRA_SKIP_WEB_BUILD=1 cargo test --test command_test test_pull_ff_only_fast_forward_updates_head_from_tracking_remote -- --nocapture`
- `LIBRA_SKIP_WEB_BUILD=1 cargo test --test command_test test_pull_diverged_remote_creates_three_way_merge -- --nocapture`
- `LIBRA_SKIP_WEB_BUILD=1 cargo test --test command_test pull_test -- --nocapture`
- `LIBRA_SKIP_WEB_BUILD=1 cargo test --test command_test restore_test -- --nocapture`
- `LIBRA_SKIP_WEB_BUILD=1 cargo test --test command_test index_pack_test -- --nocapture`