Skip to content

download_model.sh: auto-resume interrupted downloads#225

Open
datag00n wants to merge 1 commit into
antirez:mainfrom
datag00n:download-auto-resume
Open

download_model.sh: auto-resume interrupted downloads#225
datag00n wants to merge 1 commit into
antirez:mainfrom
datag00n:download-auto-resume

Conversation

@datag00n
Copy link
Copy Markdown

@datag00n datag00n commented May 22, 2026

Summary

download_model.sh already resumed via curl -C -, but only manually — if the transfer dropped you had to notice and re-run the command. For the ~81 GB Flash GGUF, one network blip means babysitting the download. This makes resume automatic.

Changes (download_model.sh only)

  • Automatic resumecurl runs inside a bounded retry loop that re-invokes it with -C - after any network failure.
  • --retry 5 --retry-delay 5 on curl — absorbs transient HTTP errors (notably Hugging Face 429 rate limiting) within one invocation.
  • Fast-fail on HTTP errorscurl exit 22 (401/403/404, e.g. a missing HF token) stops immediately with a clear hint instead of retrying pointlessly.
  • Clean interrupt — a trap … INT makes Ctrl-C stop the script rather than being swallowed by the retry loop; the partial .part file is kept so a re-run still resumes.
  • BoundedDS4_MAX_RETRIES (default 100) and DS4_RETRY_DELAY (default 5s), documented in the usage text.

A fully successful download behaves exactly as before.

Testing

macOS / Apple M5 Max / Metal:

./download_model.sh q2-imatrix

During the run the connection dropped (curl exit 56). The script logged Download interrupted (attempt 1/100, curl exit 56), resumed automatically, and completed the ~81 GB q2-imatrix GGUF. sh -n download_model.sh passes; --help output verified.

No inference backend is touched, so the CONTRIBUTING.md correctness/speed regression suites don't apply. Diff: +57 / −5, one file.

The downloader already supported manual resume (curl -C -): when a
transfer stopped, the user had to re-run the command. For the ~81 GB
Flash GGUF, a single network blip means babysitting the download.

Make resume automatic: run curl inside a bounded retry loop that
re-invokes it with -C - after any network failure, plus curl's own
--retry for transient HTTP errors (e.g. Hugging Face 429 rate limits).

- Fast-fail on curl exit 22 (HTTP 4xx, e.g. a missing HF token) instead
  of retrying pointlessly.
- Stop on Ctrl-C / signals instead of fighting the user's interrupt;
  the partial .part file is kept either way.
- Bound the retries: DS4_MAX_RETRIES (default 100) and DS4_RETRY_DELAY
  (default 5s), documented in the usage text.

Tested on macOS / Apple M5 Max / Metal: ./download_model.sh q2-imatrix
logged "Download interrupted (attempt 1/100, curl exit 56)", resumed
automatically, and completed the ~81 GB q2-imatrix GGUF. No inference
backend touched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant