[codex] fix pull restore performance #422

Draft
genedna wants to merge 1 commit into main from codex/pull-restore-performance

genedna commented Jul 4, 2026

Summary

Fixes a performance problem in the libra pull fast-forward path: after fetch completed, restore expanded the entire working tree and hashed unrelated untracked artifacts such as target/ and web/node_modules/.

Root cause

pull fast-forward updated HEAD and then invoked restore with the whole worktree as the pathspec. Restore expanded that pathspec by recursively listing the filesystem, then computed blob hashes before checking whether paths were tracked or present in the target tree. In large worktrees this could scan and read hundreds of GB of build output and look like a hang.

Changes

  • Limit worktree restore candidates to the union of target-tree paths and index-tracked paths, both filtered by the requested pathspec.
  • Skip hashing paths that are neither target paths nor tracked paths.
  • Compute ordinary file blob hashes with streaming Git-object hashing instead of reading whole files into memory.
  • Use the shared internal::merge_base graph implementation to compute the lowest common ancestor (LCA) for pull/merge.
  • Add a pull regression test with an ignored, unreadable untracked build artifact.
  • Apply small clippy cleanups required for -D warnings on the current toolchain.
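
The first two changes can be illustrated with a minimal sketch. It assumes a simple prefix-style pathspec; the names restore_candidates and matches_pathspec are hypothetical and are not taken from the libra codebase. The point is that candidates come from the target tree and the index only, so the filesystem is never listed and untracked directories are never opened or hashed.

```rust
use std::collections::BTreeSet;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not libra's API): treat the pathspec as a directory prefix.
/// An empty pathspec or "." stands for the whole worktree.
fn matches_pathspec(path: &Path, pathspec: &Path) -> bool {
    pathspec.as_os_str().is_empty() || pathspec == Path::new(".") || path.starts_with(pathspec)
}

/// Candidates = (target-tree paths UNION index-tracked paths), filtered by the pathspec.
/// Nothing here touches the filesystem, so untracked build output is never visited.
fn restore_candidates(
    target_tree: &[PathBuf],
    index_tracked: &[PathBuf],
    pathspec: &Path,
) -> BTreeSet<PathBuf> {
    target_tree
        .iter()
        .chain(index_tracked.iter())
        .filter(|p| matches_pathspec(p, pathspec))
        .cloned()
        .collect()
}
```

With this shape, a path such as target/debug/big.bin only becomes a candidate if it is in the target tree or the index; otherwise it is never considered, and therefore never hashed.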

Validation

  • cargo +nightly fmt --all --check
  • LIBRA_SKIP_WEB_BUILD=1 cargo check
  • LIBRA_SKIP_WEB_BUILD=1 cargo clippy --all-targets --all-features -- -D warnings
  • LIBRA_SKIP_WEB_BUILD=1 cargo test --test command_test test_pull_fast_forward_skips_untracked_artifacts_during_restore -- --nocapture
  • LIBRA_SKIP_WEB_BUILD=1 cargo test --test command_test test_pull_fast_forward_updates_head_from_tracking_remote -- --nocapture
  • LIBRA_SKIP_WEB_BUILD=1 cargo test --test command_test test_pull_ff_only_fast_forward_updates_head_from_tracking_remote -- --nocapture
  • LIBRA_SKIP_WEB_BUILD=1 cargo test --test command_test test_pull_diverged_remote_creates_three_way_merge -- --nocapture
  • LIBRA_SKIP_WEB_BUILD=1 cargo test --test command_test pull_test -- --nocapture
  • LIBRA_SKIP_WEB_BUILD=1 cargo test --test command_test restore_test -- --nocapture
  • LIBRA_SKIP_WEB_BUILD=1 cargo test --test command_test index_pack_test -- --nocapture

Signed-off-by: Eli Ma <eli@patch.sh>
Copilot AI review requested due to automatic review settings July 4, 2026 16:36

Copilot AI left a comment

Pull request overview

This PR addresses a libra pull fast-forward performance regression where the post-fetch restore path could recursively enumerate the entire working tree and hash large, unrelated untracked artifacts (e.g., target/, web/node_modules/), causing apparent hangs in large repos.

Changes:

  • Reworks restore candidate collection to consider only (a) target-tree paths and (b) index-tracked paths, both filtered by the user pathspec, avoiding filesystem-wide expansion.
  • Switches ordinary-file blob hashing to a streaming “Git object” hash computation (instead of reading full files into memory).
  • Reuses the shared internal::merge_base implementation for pull/merge LCA computation and adds a regression test that simulates an unreadable untracked artifact.
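
As a rough sketch of what streaming Git-object hashing looks like: a blob's object ID is the hash of the header "blob <len>\0" followed by the file content, so the content can be fed to an incremental hasher in fixed-size chunks. The HashSink trait and stream_blob function below are hypothetical stand-ins, not libra's actual types; a real implementation would plug in a SHA-1 (or SHA-256) hasher where this sketch uses a byte-recording test double.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for an incremental hasher; not libra's API.
trait HashSink {
    fn update(&mut self, bytes: &[u8]);
}

/// Feed a Git blob object ("blob <len>\0" + content) into `sink` chunk by chunk,
/// so memory use is bounded by the buffer size rather than the file size.
fn stream_blob<R: Read, S: HashSink>(mut reader: R, len: u64, sink: &mut S) -> io::Result<()> {
    sink.update(format!("blob {len}\0").as_bytes());
    let mut buf = [0u8; 8192];
    let mut seen = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        sink.update(&buf[..n]);
        seen += n as u64;
    }
    // The header already committed to `len`; a mismatch would yield a wrong object ID.
    if seen != len {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "input length differs from the declared blob length",
        ));
    }
    Ok(())
}

/// Test double: records exactly the bytes a real hasher would consume.
impl HashSink for Vec<u8> {
    fn update(&mut self, bytes: &[u8]) {
        self.extend_from_slice(bytes);
    }
}
```

In practice the length would typically come from file metadata before the file is opened, which is why the sketch checks the declared length against the number of bytes actually read.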

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/command/pull_test.rs | Adds a regression test ensuring fast-forward pull restore doesn’t scan unreadable untracked artifacts. |
| tests/command/index_pack_test.rs | Refactors test pack index parsing to be chunk-based and avoid unwrap() on slice conversions. |
| src/internal/tui/markdown_render.rs | Uses std::mem::take to avoid drain+collect churn when wrapping segments. |
| src/command/verify_pack_index_v2.rs | Refactors offset parsing to chunk-based iteration with explicit truncation checks. |
| src/command/verify_pack_index_common.rs | Refactors fanout parsing to chunk-based iteration with explicit truncation checks. |
| src/command/restore.rs | Core fix: collects restore worktree paths from target+tracked sets rather than expanding the filesystem pathspec. |
| src/command/mod.rs | Implements streaming file blob hashing for ordinary files; preserves LFS pointer hashing behavior. |
| src/command/merge.rs | Uses shared merge_base logic for LCA in pull merge flows and improves error context. |
| src/command/for_each_ref.rs | Minor refactor to simplify refname:rstrip parsing logic. |
| src/command/diff.rs | Uses std::mem::take for trailing -- pathspec handling. |
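
For readers unfamiliar with the chunk-based parsing style mentioned for the pack index files, here is a minimal sketch. It relies on the documented pack index layout, where the fanout table is 256 big-endian 4-byte cumulative counts. The function name parse_fanout and the error messages are assumptions for illustration, not the code in this PR.

```rust
/// Parse the 256-entry fanout table (4-byte big-endian cumulative counts).
/// Hypothetical helper; assumes `bytes` starts at the first fanout entry.
fn parse_fanout(bytes: &[u8]) -> Result<Vec<u32>, String> {
    const ENTRIES: usize = 256;
    let needed = ENTRIES * 4;

    // Explicit truncation check instead of slicing and panicking on short input.
    let table = bytes.get(..needed).ok_or_else(|| {
        format!("fanout table truncated: need {needed} bytes, got {}", bytes.len())
    })?;

    // chunks_exact(4) yields only full 4-byte chunks, so no unwrap() on
    // slice-to-array conversions is needed.
    let fanout: Vec<u32> = table
        .chunks_exact(4)
        .map(|c| u32::from_be_bytes([c[0], c[1], c[2], c[3]]))
        .collect();

    // Entries are cumulative object counts, so they must never decrease.
    if fanout.windows(2).any(|w| w[0] > w[1]) {
        return Err("fanout table is not monotonically non-decreasing".to_string());
    }
    Ok(fanout)
}
```

The design choice is that malformed or short input produces an error value rather than a panic, which matters for a verification command that is expected to be run on possibly corrupt files.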
