Skip to content

Add check for lfs pointer files#4107

Open
rasapala wants to merge 48 commits intomainfrom
fix_pull_all_dl_failed
Open

Add check for lfs pointer files#4107
rasapala wants to merge 48 commits intomainfrom
fix_pull_all_dl_failed

Conversation

@rasapala
Copy link
Copy Markdown
Collaborator

@rasapala rasapala commented Apr 1, 2026

🛠 Summary

CVS-CVS-184757

Libgit2 Lfs-filter.c patch changes:

  1. Added a process-wide cancellation mechanism with an exported global flag and shutdown probe path, so host applications can request LFS aborts reliably: lfs_filter.c:59, lfs_filter.c:73.
  2. Improved cancellation responsiveness by checking cancel state in progress callbacks, before/after curl operations, during resume attempts, and even during retry sleep intervals: lfs_filter.c:1063, lfs_filter.c:1349, lfs_filter.c:1424, lfs_filter.c:1452.
  3. Switched logging to an error-focused logger and changed log file lifecycle so lfs_error.txt is created/truncated only when an actual error is emitted: lfs_filter.c:71, lfs_filter.c:127.
  4. Extended cancellation coverage to both the batch metadata request and object download request (both now wired to progress callbacks and explicit abort guards): lfs_filter.c:1593, lfs_filter.c:1659.
  5. Kept resume behavior but made it safer and more interruptible, including repeated cancel checks around resume retries and fallback logic: lfs_filter.c:1414, lfs_filter.c:1687.

Ovms Libgit2.cpp Changes:

  1. End-to-end cancellation propagation was added from server shutdown signaling into clone and LFS download paths, so Ctrl+C can abort both clone and ongoing LFS transfers.
  2. Resume logic was redesigned to reliably re-trigger LFS smudge after interruption by repairing index entries and re-checking out specific paths.
  3. A shared shutdown state module was introduced to decouple pull module cancellation checks from server internals and satisfy Bazel dependency boundaries.
  4. New status codes and tests were added for cancellation and resume-after-cancel behavior.
  5. Third-party libgit2 integration was updated significantly (new upstream commit + larger LFS patch), including Windows linkage/packaging adjustments.

🧪 Checklist

  • Unit tests added.
  • The documentation updated.
  • Change follows security best practices.
    ``

Copilot AI review requested due to automatic review settings April 1, 2026 11:57
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a safeguard to detect Git LFS pointer files left behind after Hugging Face model downloads (clone or resume) and surfaces a dedicated status code when this occurs.

Changes:

  • Introduces a new StatusCode for LFS download failures.
  • Adds status message mapping for the new status code.
  • Extends HfDownloader::downloadModel() to scan for LFS pointer files after clone/resume and fail fast if any are found.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
src/status.hpp Adds a new status code for LFS download failure.
src/status.cpp Adds a human-readable message for the new status code.
src/pull_module/libgit2.cpp Detects remaining LFS pointer files after clone/resume and returns the new failure status.

Comment thread src/status.hpp Outdated
Comment thread src/pull_module/libgit2.cpp Outdated
Comment thread src/pull_module/libgit2.cpp Outdated
Comment thread src/pull_module/libgit2.cpp Outdated
@rasapala rasapala requested review from atobiszei and dtrawins April 1, 2026 12:54
Comment thread src/pull_module/libgit2.cpp Outdated
Comment thread src/pull_module/libgit2.cpp Outdated
@atobiszei atobiszei self-requested a review April 28, 2026 12:19
Comment on lines +71 to +78
/**
* Builds the side-marker path used to track interrupted LFS operations.
* The marker lives next to the repository directory as <repo>.lfswip.
*
* @param repositoryPath Absolute path to the model repository directory.
* @return Filesystem path to the .lfswip marker associated with repositoryPath.
* @note Pure path computation; does not access filesystem or network.
*/
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need so much comments for those short functions? I would skip them, naming convention is good so they are self-explanatory I think, and it just bloats the code.

Comment thread src/pull_module/libgit2.cpp Outdated
* @return -1 if cancellation requested (aborts transfer), 0 otherwise.
* @note Connects to internet during clone operations; works on the clone/fetch domain.
*/
int cloneTransferProgressCb(const git_indexer_progress* stats, void* payload) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cb->Callback?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we don't do a lot in those callbacks maybe name those (
checkIfCancelCloneTransferProgressCallback or sth like that?

Comment thread src/pull_module/libgit2.cpp Outdated
(void)str;
(void)len;
(void)payload;
if (libgit2::isCloneCancellationRequestedFromServer()) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used in several places. Use macro to shorten the code?
#define STOP_IF_OVMS_CANCELLED or sth like that?

Comment thread src/pull_module/libgit2.cpp Outdated
}

/**
* Callback for sideband progress messages during git operations (clone, fetch).
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here comments again - we don't even use those methods for anythin usefull, so this comments only bloats the code.

Comment thread src/pull_module/libgit2.cpp Outdated
* @note Works on specific git repository location; reads and deletes lfs_error.txt.
* @note Part of the git LFS domain: detects LFS download failures recorded by patched libgit2.
*/
bool hasLfsErrorFileAndLogContent(const std::string& repositoryRootPath) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool hasLfsErrorFileAndLogContent(const std::string& repositoryRootPath) {
bool ifHasLfsErrorFileLogContentAndRemove(const std::string& repositoryRootPath) {

or sth like that?
It not only checks but removes as well.

Comment thread src/pull_module/libgit2.cpp
std::cout << " Resuming " << p.string() << "...\n";
std::string path = p.string();
resumeLfsDownloadForFile(repoGuard.get(), path.c_str());
Status resumeExistingRepository(git_repository* repo,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice - clear logic

Comment thread src/pull_module/libgit2.cpp Outdated
}

auto status = IModelDownloader::checkIfOverwriteAndRemove();
auto normalizeIndexStatus = normalizeIndexToHead(repo);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

normalizeIndex->restoreGitIndex?

@atobiszei atobiszei self-requested a review April 30, 2026 11:48
Comment on lines +420 to +421
// ResumeAfterShutdownRequestAndRerun
TEST_F(HfPull, ResumeShutdown) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// ResumeAfterShutdownRequestAndRerun
TEST_F(HfPull, ResumeShutdown) {
TEST_F(HfPull,ResumeAfterShutdownRequestAndRerun) {

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot have long names in unit tests for lib git 2 clone because on windows we have a path max that can be exceeded.

EXPECT_EQ(preservedDigestBefore, preservedDigestAfter);
}

// PullAfterUserEditedTrackedFileDoesNotOverwriteIt
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put that in test name

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cant

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

third_party/libgit2/lfs.patch:435

  • The new lfs_filter.c being introduced by this patch starts with a UTF-8 BOM and the license block lines are prefixed with +/ instead of the usual *. The BOM (the invisible character before /*) can break compilation on some toolchains or cause lint/style checks to fail. Please remove the BOM and normalize the comment to a standard C block comment format.
+/*
+/ Copyright 2025 Intel Corporation
+/
+/ Licensed under the Apache License, Version 2.0 (the "License");
+/ you may not use this file except in compliance with the License.

Comment thread src/test/pull_hf_model_test.cpp
Comment thread src/test/pull_hf_model_test.cpp Outdated
Comment thread src/server.cpp
@dkalinowski dkalinowski requested a review from atobiszei April 30, 2026 13:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants