Skip to content

feat(pack): add PackStats and support callback closure in Pack::decode#128

Closed
yichYICH wants to merge 4 commits into
libra-tools:mainfrom
yichYICH:main
Closed

feat(pack): add PackStats and support callback closure in Pack::decode#128
yichYICH wants to merge 4 commits into
libra-tools:mainfrom
yichYICH:main

Conversation

@yichYICH

Copy link
Copy Markdown

Summary

This PR adds a Pack Decode statistics utility for collecting object counts and type distribution from a .pack file. The implementation reuses the existing Pack::decode pipeline and collects statistics through its decoded-object callback, avoiding duplication of the core decode flow.

Technical Details

  • Added PackStats with counts for total objects, commits, trees, blobs, tags, and delta-encoded entries.
  • Added decode_pack_stats(path) as a simple public helper and Pack::decode_stats(path) as the implementation entry point.
  • Reused Pack::decode for pack header validation, object decoding, zlib decompression, delta reconstruction, trailer hash validation, cache handling, and waitlist processing.
  • Counted delta entries using entry.meta.is_delta, while non-delta entries are counted by entry.inner.obj_type.
  • Improved pack cache temp-directory error messages and adjusted test temp paths to use target/test-cache/... for stable Windows/local test execution.
  • Added integration tests in tests/pack-stats.rs for normal packs, delta-heavy packs, missing files, and malformed pack headers.

Verification

Ran the following locally:

  • cargo build
  • cargo fmt --check
  • cargo clippy --all-targets -- -D warnings
  • cargo test --test pack-stats
  • cargo test

All checks passed.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a lightweight pack-decoding statistics facility to the git-internal pack decoder, reusing the existing Pack::decode pipeline and collecting counts via the decoded-entry callback.

Changes:

  • Introduces PackStats plus decode_pack_stats(path) and Pack::decode_stats(path) to return object-type and delta counts for a .pack file.
  • Updates pack encode/decode/cache tests to use target/test-cache/... temp paths and improves cache temp-dir creation/removal diagnostics.
  • Adds integration tests covering small packs, delta-heavy packs, missing files, and invalid headers; simplifies rustfmt.toml config.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tests/pack-stats.rs New integration tests validating decode_pack_stats counts and error behavior.
src/internal/pack/mod.rs Adds PackStats and public decode_pack_stats helper.
src/internal/pack/decode.rs Adds Pack::decode_stats and wires stats collection through the decode callback; updates decode tests tmp paths.
src/internal/pack/encode.rs Adjusts encode tests to use stable temp cache paths under target/test-cache.
src/internal/pack/cache.rs Improves cache-dir creation error messaging and makes temp-dir removal non-panicking; updates cache tests tmp paths.
rustfmt.toml Removes unstable/import-grouping settings, leaving only reorder_imports.
Comments suppressed due to low confidence (1)

src/internal/pack/decode.rs:586

  • decode_stats is a public API but the callback uses lock().expect("pack stats mutex poisoned"), which can panic and bypass the Result error path. Consider handling PoisonError (e.g., unwrap_or_else(|e| e.into_inner())) or using a non-poisoning mechanism so decode_stats reliably returns GitError instead of panicking.
            move |entry| {
                let mut stats = stats_for_callback
                    .lock()
                    .expect("pack stats mutex poisoned");
                stats.record(entry.inner.obj_type, entry.meta.is_delta.unwrap_or(false));
            },

Comment thread src/internal/pack/decode.rs Outdated
Comment on lines +573 to +593
let file = fs::File::open(path)?;
let mut reader = io::BufReader::new(file);
let mut pack = Pack::new(None, Some(100 * 1024 * 1024), None, true);
let stats = Arc::new(Mutex::new(PackStats::default()));
let stats_for_callback = stats.clone();

pack.decode(
&mut reader,
move |entry| {
let mut stats = stats_for_callback
.lock()
.expect("pack stats mutex poisoned");
stats.record(entry.inner.obj_type, entry.meta.is_delta.unwrap_or(false));
},
None::<fn(ObjectHash)>,
)?;

let stats = stats
.lock()
.map_err(|_| GitError::CustomError("pack stats mutex poisoned".to_string()))?;
Ok(*stats)
Comment thread src/internal/pack/decode.rs Outdated
Comment on lines +572 to +576
pub fn decode_stats(path: impl AsRef<Path>) -> Result<PackStats, GitError> {
let file = fs::File::open(path)?;
let mut reader = io::BufReader::new(file);
let mut pack = Pack::new(None, Some(100 * 1024 * 1024), None, true);
let stats = Arc::new(Mutex::new(PackStats::default()));
Comment thread src/internal/pack/mod.rs

/// Decode a pack file and return object count statistics.
///
/// This uses the current thread-local hash kind, just like [`Pack::decode`].

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 293b027d00

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/internal/pack/decode.rs Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5f1da420b6

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/internal/pack/decode.rs
@genedna

genedna commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

该 PR 基于 2026-05-25 前后的旧代码,当前 main 已经包含后续合入的 pack stats、decode 快路径、unbounded cache、并行/模块化 encode、Rabin/delta search 等优化。\n\nReview 意见分析:\n- Mutex 统计锁竞争:原始意见合理,PR 后续提交已改为 AtomicUsize。\n- decode stats 的内存上限/tmp 目录副作用:原始意见合理,PR 后续提交已增加 DecodeStatsOptions 并补充文档。\n- cache dir / thread_num panic:原始意见合理,PR 后续提交做了部分修复。\n\n但与当前 main 合并时产生核心冲突:#128 仍基于旧的 src/internal/pack/encode.rs,当前 main 已删除该文件并使用 src/internal/pack/encode/ 模块目录;直接合入会删除/回退当前的 encode 模块化和性能优化。同时,当前 main 的 PackStats 位于 src/internal/pack/stats.rs,使用专用 pack scanner 做统计,避免 #128 方案中完整 Pack::decode + callback 的对象构建/delta 重建开销。\n\n因此合入 #128 会带来明确的 encode/decode 性能回退风险,按维护规则不合入,直接关闭。

@genedna genedna closed this Jun 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants