Use multi-part upload on zarr chunks larger than 5GB#1839

Open
jjnesbitt wants to merge 18 commits into master from zarr-multipart-upload
Conversation

@jjnesbitt
Member

Depends on and adds support for dandi/dandi-archive#2784

Comment thread dandi/files/zarr.py Fixed
@codecov

codecov Bot commented Apr 22, 2026

Codecov Report

❌ Patch coverage is 82.25806% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.27%. Comparing base (43f02a3) to head (f1f48da).
⚠️ Report is 1 commit behind head on master.

Files with missing lines Patch % Lines
dandi/files/bases.py 83.63% 9 Missing ⚠️
dandi/dandiapi.py 50.00% 5 Missing ⚠️
dandi/files/zarr.py 78.26% 5 Missing ⚠️
dandi/consts.py 0.00% 1 Missing ⚠️
dandi/support/digests.py 80.00% 1 Missing ⚠️
dandi/tests/test_files.py 96.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1839      +/-   ##
==========================================
+ Coverage   76.26%   76.27%   +0.01%     
==========================================
  Files          87       87              
  Lines       12512    12575      +63     
==========================================
+ Hits         9542     9592      +50     
- Misses       2970     2983      +13     
Flag Coverage Δ
unittests 76.27% <82.25%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.


@jjnesbitt jjnesbitt force-pushed the zarr-multipart-upload branch from b427dc7 to 0e505bd on April 23, 2026 21:10
Comment thread dandi/files/zarr.py Fixed
Comment thread dandi/files/zarr.py Outdated
Comment thread dandi/files/zarr.py Dismissed
@jjnesbitt jjnesbitt force-pushed the zarr-multipart-upload branch from c1843cd to 367bd24 on April 24, 2026 16:20
@jjnesbitt jjnesbitt marked this pull request as ready for review April 24, 2026 21:04
@jjnesbitt jjnesbitt requested a review from yarikoptic April 24, 2026 21:05
@jjnesbitt
Member Author

jjnesbitt commented Apr 24, 2026

@yarikoptic I believe this is in a good state.

Regarding the test failures, it's worth noting that tests against the API are almost guaranteed to fail, as they blindly pull the dandiarchive/dandiarchive-api image from Docker Hub, and the required changes to make this work are contained in dandi/dandi-archive#2784. There is also a new release of RabbitMQ that breaks celery, which means that any tests depending on the API updating the status of things behind the scenes via celery (a good amount) are broken. We're looking into it.

@yarikoptic
Member

Thank you @jjnesbitt! I meanwhile submitted

to mitigate the CI failures. Claude identified the same "cause" and provided some of its "aiplanation", but it could be slop, so beware.

@yarikoptic yarikoptic added the minor (Increment the minor version when merged) and zarr labels Apr 25, 2026
@yarikoptic yarikoptic force-pushed the zarr-multipart-upload branch from 367bd24 to 83db4f4 on April 25, 2026 12:03
@yarikoptic
Member

@jjnesbitt FYI - I have rebased on the now-green master

@jjnesbitt jjnesbitt force-pushed the zarr-multipart-upload branch from 83db4f4 to 7121c31 on April 27, 2026 17:16
@jjnesbitt
Member Author

@yarikoptic It would be good to get your review/approval on this PR. The only remaining test failure is the following:

FAILED dandi/tests/test_files.py::test_upload_zarr_large_chunks - dandi.exceptions.HTTP404Error: Error 404 while sending POST request to http://localhost:8000/api/uploads/zarr/initialize/: <!DOCTYPE html>

This is due to the fact that the version of the API that the tests are being run against doesn't contain the new URLs required for this change.

@yarikoptic
Member

re tests:

FAILED dandi/tests/test_files.py::test_upload_zarr_large_chunks - dandi.exceptions.HTTP404Error: Error 404 while sending POST request to http://localhost:8000/api/uploads/zarr/initialize/: <!DOCTYPE html>

This is due to the fact that the version of the API that the tests are being run against doesn't contain the new URLs required for this change.

So let's make such test(s) conditional and skip when run against a released/older version of the archive. Is there anything in /server-info which would indicate that this feature is present?

@jjnesbitt
Member Author

So let's make such test(s) conditional and skip when run against a released/older version of the archive. Is there anything in /server-info which would indicate that this feature is present?

We could do that, but it would achieve very little: as soon as the feature is merged to master and the new image is built, the tests will immediately be re-enabled. There's nothing at the /server-info endpoint at the moment to indicate the change.

Since the API change is backwards compatible, I suggest we simply merge the API change once it's ready (hopefully soon), at which point the tests will use the new image, and things are back to normal.

@yarikoptic
Member

re server-capability conditioning -- ideally the client would be aware of that at run time too (not only in the tests), so it could refuse/error when trying to upload 5TB files, as I implemented a while back in #1827 and will be merging shortly, instead of hitting a non-existent endpoint at EMBER and LINC (if they keep it afloat). I will leave it to you to decide how the client could discover the server capability here:

  • could be just version-based, if you are positive that the next minor release would have it
  • I thought I filed an issue proposing to add some 'features: [ ]' listing to /api/info, but I fail to find it
  • could probe whether the expected endpoint exists (once per instance) and, if not, assume the feature is absent.

@yarikoptic
Member

Sorry @jjnesbitt, I will not be merging a PR with totally red CI; it is out of the question.

@yarikoptic
Member

Ideally it should actually test that the server works, with at least some TEMP commit which would somehow provision the container with that new mighty version of the server from the PR. But I guess that is for the future, to do something like I did for bids-examples, where in the PR description we can point to prospective PRs for bids-specification and bids-validator to test against: https://github.com/bids-standard/bids-examples/blob/master/.github/workflows/validate_datasets.yml#L35

@jjnesbitt
Member Author

Sorry @jjnesbitt , I will not be merging a PR with totally Red CI, it is out of question.

That is not what I was suggesting. Once the API PR is merged, this CI will become green, because it will be using the latest image of dandiarchive/dandiarchive-api.

@yarikoptic
Member

ok, let's wait then.

Member

@yarikoptic yarikoptic left a comment


Some minor ones about imports, but also a question to clarify why not to use a smaller threshold.

Comment thread dandi/tests/test_files.py Outdated
Comment thread dandi/files/zarr.py
@jjnesbitt jjnesbitt force-pushed the zarr-multipart-upload branch from 9f79117 to 3be1ccd on April 27, 2026 21:44
@jjnesbitt jjnesbitt changed the title Use multi-part upload on zarr chunks larger than 5GB Use multi-part upload on zarr chunks larger than 100MB Apr 28, 2026
@jjnesbitt
Member Author

To avoid complicated conditioning of this feature on the archive's capability (which downstream archives may quickly gain anyway), we could do the following:

  1. Merge and release this change
  2. In the prod instance, set cli-minimal-version to the newly released version
  3. In the downstream archives, add the newly released version to cli-bad-versions, so that anyone with a CLI version that is newer than what is supported gets a CLI error.

This removes any conditional logic based on archive capabilities in the CLI (which should be discouraged anyway) and puts the compatibility check solely on the archive side. Once LINC/EMBER merge in the supported change, the CLI version can be removed from cli-bad-versions, and things are back to normal.

@yarikoptic Do you have objections to this?

Comment thread dandi/support/digests.py Outdated
@yarikoptic
Member

yes, I have objections, as

  • I think the conditioning should not be complicated; let me know if I should take a stab at it,
  • again -- it seems this would require red CI to be merged/released, as the current CI is red (didn't check details)?
  • we do have other instances running, so I do not want to break new client release functioning with them ...

But before we jump either way, I think we still need to elaborate on the overall details of the change in behavior! Thanks for quickly addressing the question of diverging checksum/etag'ing -- I left a comment there, as I think the multipart upload etag computation must be consistent with how we compute it for regular blobs, shouldn't it?

And I (at least) still do not have a complete picture how it

  • would all be consistent with existing zarrs in the archive (which already have >100MB single-part uploads),
  • zarr-checksum library behavior (it does not do any multipart splitting, and we do not want the library and the uploader/archive producing different checksums), etc.

Potential "resolutions" to consider:

  • switch back to the 5GB limit for deciding to go multipart with zarr blobs -- even though I was the one who asked/reminded about the AWS recommendation on when to switch to multipart uploads.
    • pros: existing zarrs whose large single-part uploads succeeded stay consistent; huge parts can still be finalized via multipart; if we fix up the zarr-checksum library to behave similarly etc., we have a consistent state
    • cons: single-part uploads are more fragile etc.
  • switch to this new logic of a 100MB limit for going multipart with zarrs
    • pros: consistent with blobs
    • cons: we need to figure out how to make it consistent with the library etc., e.g.
      • it could be some new form of checksum (e.g. zarr-checksum-v2, or zarr-checksum-mp100 where mp100 signals multipart past 100M, or zarr-dandietag-checksum to signal that individual blobs follow dandietag); we would then improve the digester to understand both, keeping the old zarr-checksum for existing zarrs and zarr-checksum-v2 for new uploads. This will require the zarr-checksum library to get support for both, I guess
      • ... ???...

or, once again my question -- am I overcomplicating?
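For concreteness, the etag divergence under discussion comes down to how S3-style multipart ETags differ from a plain MD5. A minimal sketch (the function name is hypothetical and this is an illustration of the S3 scheme, not the actual dandi-etag code, although dandi-etag for blobs follows this construction):

```python
import hashlib

def multipart_etag(data: bytes, part_size: int) -> str:
    """S3-style ETag: for a multipart upload it is the MD5 of the
    concatenated *binary* MD5 digests of each part, suffixed with the
    part count; for a single part it is the plain MD5 of the content."""
    parts = [data[i:i + part_size] for i in range(0, len(data), part_size)]
    if len(parts) <= 1:
        return hashlib.md5(data).hexdigest()
    combined = hashlib.md5(b"".join(hashlib.md5(p).digest() for p in parts))
    return f"{combined.hexdigest()}-{len(parts)}"
```

So the same bytes get a different ETag depending on whether they were uploaded in one part or several, which is why the multipart threshold choice leaks into the zarr checksum.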

@jjnesbitt
Member Author

  • again -- it seems would require red CI to be merged /released as current CI is red (didn't check details)?

And I'll reiterate again: the archive PR is not yet merged, which is why the CI is red. Once that happens, it will be green.

  • we do have other instances running so I do not want to break new client release functioning with them ...

What other clients besides LINC/EMBER? Why is simply excluding the new CLI version not sufficient to support these archives?

  • I think conditioning should not be complicated, let me know if I should take stab at that,

The CLI code for zarr upload is extremely convoluted as it is, adding this conditional logic further exacerbates that. However, if you're really determined to implement this conditional logic then I'll leave it to you.

But before we jump any ways, I think we still need to elaborate on overall details of the change in behaviors! Thanks for quickly addressing the question of diverging checksum/etag'ing -- I left a comment there as I think multipart upload etag compute must be consistent with how we compute for regular blobs, shouldn'it?

And I (at least) still do not have a complete picture how it

  • would all be consistent with existing zarrs in the archive (already having >100MB single part uploads),
  • zarr-checksum library behavior (does not do any multipart splitting, and we do not want different checksums by the library and uploader/archive ) etc.

The zarr-checksum library doesn't need to know specifics of how the digests are computed. As it stands now, we simply pass it the items containing the path and digest, so we control what digests it uses. There should be no issue here.

Potential "resolutions" to consider:

  • switch back to 5GB limit for deciding going multipart with zarr blobs -- even though I was the one who asked/reminded about AWS recommendation on when to switch to multipart uploads.

I think short term this is probably the best solution, as it immediately unblocks people trying to upload zarrs with parts over this limit. The underlying issue should still be re-addressed, but we can hold off until there's not a funding gap looming.

Comment thread dandi/support/digests.py Outdated
@yarikoptic
Member

  • again -- it seems would require red CI to be merged /released as current CI is red (didn't check details)?

And I'll again reiterate, the archive PR is not yet merged, which is why the CI is red. Once that happens, it will be green.

great -- when it gets green we can merge! I feel like I'm in Groundhog Day ;) I guess we are missing each other's point on this one.

  • we do have other instances running so I do not want to break new client release functioning with them ...

What other clients besides LINC/EMBER? Why is simply excluding the new CLI version not sufficient to support these archives?

because it would require THEM to exclude it whenever WE introduce breakage, so I would prefer US to avoid breakage that relies on clients to mitigate, and then re-mitigate whenever we update or release yet another version.

Long term -- it is the client/server couple's job, imho, to agree on acceptable interfaces to be used.

  • I think conditioning should not be complicated, let me know if I should take stab at that,

The CLI code for zarr upload is extremely convoluted as it is, adding this conditional logic further exacerbates that. However, if you're really determined to implement this conditional logic then I'll leave it to you.

gotcha, I will look into that... tomorrow morning.

My only remaining question is about the absent testing against the WiP archive, I guess. Did you have a chance to test it against some instance? What was its /info output? I wonder what the best indicator would be that an instance supports this -- probing the new interface?

But before we jump any ways, I think we still need to elaborate on overall details of the change in behaviors! Thanks for quickly addressing the question of diverging checksum/etag'ing -- I left a comment there as I think multipart upload etag compute must be consistent with how we compute for regular blobs, shouldn'it?
And I (at least) still do not have a complete picture how it

  • would all be consistent with existing zarrs in the archive (already having >100MB single part uploads),
  • zarr-checksum library behavior (does not do any multipart splitting, and we do not want different checksums by the library and uploader/archive ) etc.

The zarr-checksum library doesn't need to know specifics of how the digests are computed. As it stands now, we simply pass it the items containing the path and digest, so we control what digests it uses. There should be no issue here.

doesn't it provide the checksum computation across the file tree?:
https://github.com/dandi/zarr_checksum/blob/73bfdf56e822065623b282049e7c70783d87072c/zarr_checksum/generators.py#L108

Potential "resolutions" to consider:

  • switch back to 5GB limit for deciding going multipart with zarr blobs -- even though I was the one who asked/reminded about AWS recommendation on when to switch to multipart uploads.

I think short term this is probably the best solution, as it immediately unblocks people trying to upload zarrs with parts over this limit. The underlying issue should still be re-addressed, but we can hold off until there's not a funding gap looming.

agreed! let's stick to the 5GB limit as the decision point for now

@jjnesbitt jjnesbitt force-pushed the zarr-multipart-upload branch from c349a4c to 6d99670 on April 29, 2026 15:30
@jjnesbitt jjnesbitt changed the title Use multi-part upload on zarr chunks larger than 100MB Use multi-part upload on zarr chunks larger than 5GB Apr 29, 2026
@jjnesbitt
Member Author

My only remaining question is about the absent testing against the WiP archive, I guess. Did you have a chance to test it against some instance? What was its /info output? I wonder what the best indicator would be that an instance supports this -- probing the new interface?

I've done this both by running the integration tests with this branch of the CLI code (a successful run), and by running the corresponding branch of the API locally and pointing the tests at it, instead of using the docker-compose pytest fixture.

As far as what test can be done to check for server compatibility, the existence of the /uploads/zarr/initialize/ endpoint is the simplest answer.
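That probe-once idea can be sketched as follows (the endpoint path constant and helper names are assumptions for illustration, not the actual DandiAPIClient API): POST an empty body once; a 404 means the route, and hence the feature, is absent, while any other status (400 for the bogus payload, 401/403 for auth) means it exists.

```python
from urllib import error, request

ZARR_MP_INITIALIZE = "/api/uploads/zarr/initialize/"  # assumed path

def status_implies_support(status: int) -> bool:
    # Only 404 signals a missing route; 400/401/403 mean it is routed.
    return status != 404

def supports_zarr_multipart_upload(api_url: str) -> bool:
    """Probe the endpoint once with an empty POST body."""
    req = request.Request(
        api_url.rstrip("/") + ZARR_MP_INITIALIZE, data=b"", method="POST"
    )
    try:
        with request.urlopen(req) as resp:
            status = resp.status
    except error.HTTPError as e:
        status = e.code
    return status_implies_support(status)
```

In practice the result would be cached per client instance so the probe runs at most once.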

doesn't it provide the checksum computation across the file tree?: dandi/zarr_checksum@73bfdf5/zarr_checksum/generators.py#L108

What you've linked is one of the "default" file generators included in the zarr-checksum package. We do not use that generator; instead we directly provide the files and digests used to compute the zarr checksum.

agreeed! let's stick to 5GB limit as decision point for now

I've switched this back.

yarikoptic and others added 4 commits April 29, 2026 19:51
…_upload->singlepart_items

The split of a batch into "items above the 5GB threshold" and "items at
or below" is really a split between items routed through the multipart
upload endpoint and items uploaded with a single signed PUT.  Naming
them after the upload mechanism rather than the size makes the
subsequent capability/gating logic read more naturally.

Pure rename, no behavior change.

Co-Authored-By: Claude Code 2.1.123 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dandi-archive#2784 (still open) introduces the
``/uploads/zarr/initialize/`` endpoint required to upload zarr chunks
larger than 5 GiB.  Until that lands and is deployed, posting to the
endpoint returns 404 and the multi-part code path here would surface
that as an opaque ``HTTP404Error`` from inside ``_multipart_upload``.

Add a lazy ``DandiAPIClient.supports_zarr_multipart_upload`` cached
property that probes the endpoint once with an empty POST body: 404
means the server lacks the feature, anything else (400 for the bogus
payload, 401/403 for auth) means it is present.

In ``ZarrAsset.iter_upload``, right after the batch is split into
``multipart_items`` and ``singlepart_items``, raise an ``UploadError``
when there are multipart items and the server does not support them.
The error reports the count and aggregate size of the oversized chunks
plus the path of the largest one, mirroring the wording proposed in
#1827.

Marked with a TODO: the check (and the cached property) can be removed
once all supported servers ship dandi-archive#2784 (> 0.23.0).

Co-Authored-By: Claude Code 2.1.123 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…umulative

``_multipart_upload`` (used here to upload zarr chunks larger than 5
GiB) yields progress dicts whose ``current`` is the number of bytes
uploaded *within the chunk currently being transferred*.  The
surrounding ``ZarrAsset.iter_upload`` loop reports ``current`` as the
number of bytes uploaded *across the whole zarr*.  Yielded as-is, the
inner per-file values caused ``current`` to jump backwards each time a
new multipart chunk started, breaking progress bars driven from that
field.

Wrap the inner generator: when an "uploading" status carries a
``current`` field, translate it by the cumulative bytes uploaded so
far before re-yielding; pass other status dicts through unchanged.
The trailing post-loop yield is dropped — the last per-part
translated yield already reports the full cumulative size for that
chunk.

Co-Authored-By: Claude Code 2.1.123 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>
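The translation that commit describes can be sketched as a small wrapper around the inner generator (the status-dict field names follow the commit message; this is an illustration, not the actual iter_upload code):

```python
from typing import Iterator

def translate_progress(
    inner: Iterator[dict], bytes_already_uploaded: int
) -> Iterator[dict]:
    """Re-yield status dicts from a per-file multipart upload, shifting
    'current' (bytes uploaded within this file) by the bytes uploaded
    across the whole zarr before this file started, so the reported
    progress never jumps backwards when a new chunk begins."""
    for status in inner:
        if status.get("status") == "uploading" and "current" in status:
            status = {**status, "current": status["current"] + bytes_already_uploaded}
        yield status
```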
The two ``ai_generated`` tests added in 9bc2976 patch
``ZARR_LARGE_CHUNK_THRESHOLD`` down so chunks are routed through
``_multipart_upload`` against whatever server backs ``new_dandiset``.
The local docker dandi-archive image does not yet ship
dandi-archive#2784, so the new ``supports_zarr_multipart_upload``
capability check raises ``UploadError`` before ``_multipart_upload``
is ever called and the tests fail without exercising what they claim
to.

Probe the live client for capability and ``pytest.skip`` only when
the endpoint is missing — keeps the tests effective on setups whose
server already carries dandi-archive#2784.

Co-Authored-By: Claude Code 2.1.123 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>
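The capability-gated skip could be factored as a tiny helper (the attribute name mirrors the cached property described above, but helper and message are assumptions):

```python
import pytest

def require_zarr_multipart(client) -> None:
    """Skip -- rather than fail -- a test whose backing server does not
    ship the multipart zarr endpoint (defaults to unsupported when the
    client predates the capability attribute)."""
    if not getattr(client, "supports_zarr_multipart_upload", False):
        pytest.skip("server lacks /uploads/zarr/initialize/ (dandi-archive#2784)")
```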
@yarikoptic
Member

Zarr multipart upload tune-ups: condition on server capability, fix progress reporting, harmonize variable names