download_model.sh: auto-resume interrupted downloads#225
Open
datag00n wants to merge 1 commit into
Open
Conversation
The downloader already supported manual resume (curl -C -): when a transfer stopped, the user had to re-run the command. For the ~81 GB Flash GGUF, a single network blip means babysitting the download. Make resume automatic: run curl inside a bounded retry loop that re-invokes it with -C - after any network failure, plus curl's own --retry for transient HTTP errors (e.g. Hugging Face 429 rate limits). - Fast-fail on curl exit 22 (HTTP 4xx, e.g. a missing HF token) instead of retrying pointlessly. - Stop on Ctrl-C / signals instead of fighting the user's interrupt; the partial .part file is kept either way. - Bound the retries: DS4_MAX_RETRIES (default 100) and DS4_RETRY_DELAY (default 5s), documented in the usage text. Tested on macOS / Apple M5 Max / Metal: ./download_model.sh q2-imatrix logged "Download interrupted (attempt 1/100, curl exit 56)", resumed automatically, and completed the ~81 GB q2-imatrix GGUF. No inference backend touched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
download_model.shalready resumed viacurl -C -, but only manually — if the transfer dropped you had to notice and re-run the command. For the ~81 GB Flash GGUF, one network blip means babysitting the download. This makes resume automatic.Changes (download_model.sh only)
curlruns inside a bounded retry loop that re-invokes it with-C -after any network failure.--retry 5 --retry-delay 5on curl — absorbs transient HTTP errors (notably Hugging Face 429 rate limiting) within one invocation.curlexit 22 (401/403/404, e.g. a missing HF token) stops immediately with a clear hint instead of retrying pointlessly.trap … INTmakes Ctrl-C stop the script rather than being swallowed by the retry loop; the partial.partfile is kept so a re-run still resumes.DS4_MAX_RETRIES(default 100) andDS4_RETRY_DELAY(default 5s), documented in the usage text.A fully successful download behaves exactly as before.
Testing
macOS / Apple M5 Max / Metal:
During the run the connection dropped (
curlexit 56). The script loggedDownload interrupted (attempt 1/100, curl exit 56), resumed automatically, and completed the ~81 GBq2-imatrixGGUF.sh -n download_model.shpasses;--helpoutput verified.No inference backend is touched, so the CONTRIBUTING.md correctness/speed regression suites don't apply. Diff: +57 / −5, one file.