
[TEST] CI: cross-repo PR matrix vs dandi-archive#2784 #1851

Draft
yarikoptic wants to merge 16 commits into master from
enh-tests-PR-dandiarchive

Conversation

@yarikoptic
Member

@yarikoptic yarikoptic commented Apr 30, 2026

Summary

Trial run of the new cross-repo PR matrix CI feature (see commit
CI: matrix runs against cross-repo dandi-archive / dandi-schema PRs).
The marker line above should be parsed by the new
discover-cross-repo-prs job; the build-archive-image job should
then build the dandi-archive Docker image from the head of
dandi/dandi-archive#2784 and the test-cross-repo job should run the
--dandi-api test suite against it.

This PR also stacks on top of #1839 (zarr multipart upload), which is
the change we ultimately want to validate against #2784 once the
cross-repo plumbing is confirmed working.

Why a separate PR

To exercise the new workflow against a real upstream change before we
merge it and trust it to gate real PRs.

TODOs before anything lands on master

  • check that the workflow does not actually do anything when the
    corresponding marker phrase/pointers are absent from the description
  • add pointer to dandi-archive#2784 (Add support for zarr multi-part uploads)
  • Confirm the cross-repo flow runs green here. Inspect the
    discover-cross-repo-prs, build-archive-image, and
    test-cross-repo job logs.
  • Strip this PR down to just the .github/ changes (drop the
    zarr-multipart commits from #1839, Use multi-part upload on zarr
    chunks larger than 5GB) and re-target master as the real "land the
    cross-repo feature" PR. Or: cherry-pick the two CI commits onto a
    fresh branch off master and open that as the merge candidate.
    • that would also remove the lengthy .specify/specs/cross-repo-pr-matrix.md
  • Drop the if: false commit (342fa181, CI [drop-before-merge]:
    temporarily disable the existing test matrix) before merging, so the
    regular test matrix runs again.
  • Once merged on master, update the body of #1839 to add
    dandi-archive PR: https://github.com/dandi/dandi-archive/pull/2784
    so #1839 itself gets cross-repo coverage against #2784.

Design notes

See .specify/specs/cross-repo-pr-matrix.md for the full design,
including how the dandi-archive Docker image override works (build
locally from dev/django-public.Dockerfile, docker image save,
upload as artifact, docker image load in the test job, with
DANDI_TESTS_PULL_DOCKER_COMPOSE=0 so docker compose pull doesn't
clobber it) and how dandischema is overridden in both the runner venv
and inside the dandi-archive container build.

Test plan

  • discover-cross-repo-prs parses dandi-archive PR: … from this
    body and outputs archive_repo=dandi/dandi-archive,
    archive_ref=<sha of #2784 head>, should_run=true.
  • build-archive-image succeeds and uploads
    dandiarchive-api-cross-repo artifact.
  • test-cross-repo loads the artifact, shows the right image SHA
    under "Show cross-repo overrides", and runs pytest --dandi-api
    against it.
  • (Negative) The legacy test job is skipped (if: false).
  • Re-test on a PR with no marker → all three new jobs are
    skipped (should_run=false short-circuits the gating).
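The discovery step boils down to a regex over the PR body. A minimal
sketch of that parsing (the marker line format is taken from this PR;
the function name and return shape are hypothetical, and the real job
additionally resolves each PR number to a head SHA via the GitHub API):

```python
import re

# Marker lines look like "dandi-archive PR: dandi/dandi-archive#2784"
# or "dandi-schema PR: #321" (bare number = default repo in the dandi org).
MARKER = re.compile(
    r"^(?P<proj>dandi-archive|dandi-schema)\s+PR:\s+"
    r"(?:(?P<repo>[\w.-]+/[\w.-]+))?#(?P<num>\d+)\s*$",
    re.MULTILINE,
)

def discover_cross_repo_prs(body: str) -> dict:
    """Map project name -> (repo, PR number) parsed from a PR description."""
    out = {}
    for m in MARKER.finditer(body):
        repo = m.group("repo") or f"dandi/{m.group('proj')}"
        out[m.group("proj")] = (repo, int(m.group("num")))
    return out
```

With this sketch, a body containing no marker yields an empty dict,
which is what would drive `should_run=false` in the gating.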

jjnesbitt and others added 14 commits April 29, 2026 11:26
…_upload->singlepart_items

The split of a batch into "items above the 5GB threshold" and "items at
or below" is really a split between items routed through the multipart
upload endpoint and items uploaded with a single signed PUT.  Naming
them after the upload mechanism rather than the size makes the
subsequent capability/gating logic read more naturally.

Pure rename, no behavior change.

Co-Authored-By: Claude Code 2.1.123 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>
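After the rename, the split reads as a partition by upload mechanism.
A minimal sketch, assuming `(path, size)` item tuples (the threshold
value is the 5 GiB limit quoted elsewhere in this PR; the item layout
and function name are assumptions):

```python
# Items strictly above the threshold go through the multipart upload
# endpoint; items at or below use a single signed PUT.
ZARR_LARGE_CHUNK_THRESHOLD = 5 * 1024**3  # 5368709120 bytes

def split_batch(batch):
    """Partition (path, size) items by upload mechanism, not by size label."""
    multipart_items = [i for i in batch if i[1] > ZARR_LARGE_CHUNK_THRESHOLD]
    singlepart_items = [i for i in batch if i[1] <= ZARR_LARGE_CHUNK_THRESHOLD]
    return multipart_items, singlepart_items
```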
dandi-archive#2784 (still open) introduces the
``/uploads/zarr/initialize/`` endpoint required to upload zarr chunks
larger than 5 GiB.  Until that lands and is deployed, posting to the
endpoint returns 404 and the multi-part code path here would surface
that as an opaque ``HTTP404Error`` from inside ``_multipart_upload``.

Add a lazy ``DandiAPIClient.supports_zarr_multipart_upload`` cached
property that probes the endpoint once with an empty POST body: 404
means the server lacks the feature, anything else (400 for the bogus
payload, 401/403 for auth) means it is present.

In ``ZarrAsset.iter_upload``, right after the batch is split into
``multipart_items`` and ``singlepart_items``, raise an ``UploadError``
when there are multipart items and the server does not support them.
The error reports the count and aggregate size of the oversized chunks
plus the path of the largest one, mirroring the wording proposed in
#1827.

Marked with a TODO: the check (and the cached property) can be removed
once all supported servers ship dandi-archive#2784 (> 0.23.0).

Co-Authored-By: Claude Code 2.1.123 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>
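The probe described above reduces to interpreting one status code. A
sketch of that logic, with the POST machinery abstracted away (the
`_post` callable and class name are assumptions; only the property
name comes from the commit):

```python
from functools import cached_property

class DandiAPIClientSketch:
    """Sketch of the capability probe; `post` stands in for the real
    client's HTTP layer and returns a status code."""

    def __init__(self, post):
        self._post = post

    @cached_property
    def supports_zarr_multipart_upload(self) -> bool:
        # One empty POST to /uploads/zarr/initialize/: 404 means the
        # server predates dandi-archive#2784; anything else (400 for
        # the bogus payload, 401/403 for auth) means the endpoint exists.
        return self._post("/uploads/zarr/initialize/", json={}) != 404
```

`cached_property` ensures the endpoint is probed at most once per
client instance.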
…umulative

``_multipart_upload`` (used here to upload zarr chunks larger than 5
GiB) yields progress dicts whose ``current`` is the number of bytes
uploaded *within the chunk currently being transferred*.  The
surrounding ``ZarrAsset.iter_upload`` loop reports ``current`` as the
number of bytes uploaded *across the whole zarr*.  Yielded as-is, the
inner per-file values caused ``current`` to jump backwards each time a
new multipart chunk started, breaking progress bars driven from that
field.

Wrap the inner generator: when an "uploading" status carries a
``current`` field, translate it by the cumulative bytes uploaded so
far before re-yielding; pass other status dicts through unchanged.
The trailing post-loop yield is dropped — the last per-part
translated yield already reports the full cumulative size for that
chunk.

Co-Authored-By: Claude Code 2.1.123 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>
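The wrapper amounts to shifting `current` by the bytes already
uploaded before the current chunk. A sketch under the dict shape the
commit message describes (`status`/`current` keys; the function name
is hypothetical):

```python
def translate_progress(inner, base):
    """Re-yield progress dicts from a per-file generator, shifting the
    per-file 'current' byte count by `base` bytes already uploaded for
    earlier chunks; other status dicts pass through unchanged."""
    for status in inner:
        if status.get("status") == "uploading" and "current" in status:
            status = {**status, "current": base + status["current"]}
        yield status
```

Because the last "uploading" yield of a part already carries the full
cumulative size, no trailing post-loop yield is needed.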
The two ``ai_generated`` tests added in 9bc2976 patch
``ZARR_LARGE_CHUNK_THRESHOLD`` down so chunks are routed through
``_multipart_upload`` against whatever server backs ``new_dandiset``.
The local docker dandi-archive image does not yet ship
dandi-archive#2784, so the new ``supports_zarr_multipart_upload``
capability check raises ``UploadError`` before ``_multipart_upload``
is ever called and the tests fail without exercising what they claim
to.

Probe the live client for capability and ``pytest.skip`` only when
the endpoint is missing — keeps the tests effective on setups whose
server already carries dandi-archive#2784.

Co-Authored-By: Claude Code 2.1.123 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@yarikoptic yarikoptic force-pushed the enh-tests-PR-dandiarchive branch from 5594b9d to be2b5d3 Compare April 30, 2026 21:00
@yarikoptic yarikoptic added the tests Add or improve existing tests label Apr 30, 2026
@yarikoptic yarikoptic force-pushed the enh-tests-PR-dandiarchive branch from be2b5d3 to e5625f7 Compare April 30, 2026 21:16
@yarikoptic
Member Author

yarikoptic commented Apr 30, 2026

woohoo -- testing failed with

=========================== short test summary info ============================
FAILED dandi/tests/test_files.py::test_upload_zarr_large_chunks - requests.exceptions.HTTPError: Error 400 while sending POST request to http://localhost:8000/api/uploads/zarr/initialize/: ["Zarr files below 5368709120 bytes in size must be uploaded via single part upload."]
FAILED dandi/tests/test_files.py::test_upload_zarr_mixed_chunks - requests.exceptions.HTTPError: Error 400 while sending POST request to http://localhost:8000/api/uploads/zarr/initialize/: ["Zarr files below 5368709120 bytes in size must be uploaded via single part upload."]

and this smells like it is coming from the server, so it indicates that we did pick up the server changes from the PR!

edit/note: that PR in dandi-archive was already merged! so we would get these errors even if testing against the master version of the image

yarikoptic and others added 2 commits April 30, 2026 17:43
Add a standalone GitHub Actions workflow,
`.github/workflows/cross-repo-pr-tests.yml`, that runs an extra
integration matrix when a dandi-cli PR description includes lines such
as

    dandi-archive PR: dandi/dandi-archive#2784
    dandi-schema  PR: #321

The workflow has three jobs:

* `discover-cross-repo-prs` parses the PR body and resolves each marker
  to repo+sha (handles forks via `gh api`).
* `build-archive-image` checks out the dandi-archive PR head (or master
  if only a schema PR is given), optionally rewrites the pinned
  `dandischema` line in `pyproject.toml` to a git URL pointing at the
  schema PR head, builds with `dev/django-public.Dockerfile`, and
  exports the image as a workflow artifact.
* `test-cross-repo` loads that image, force-installs `dandischema`
  from the PR head into the runner venv, sets
  `DANDI_TESTS_PULL_DOCKER_COMPOSE=0` so docker compose uses the
  loaded image instead of pulling Docker Hub, and runs the
  `--dandi-api` test suite.

Triggered on `pull_request: types: [opened, edited, synchronize,
reopened]` so adding or removing a marker re-fires *only* this
workflow, not the full test matrix in `run-tests.yml`. A per-PR
`concurrency:` group cancels older in-flight runs when the body or
commits change.

A PR template advertises the marker convention; the design notes
under `.specify/specs/cross-repo-pr-matrix.md` cover the rationale
and edge cases (forks, `pull_request` vs `pull_request_target`,
schema-version pin coordination, future generalization to an
organization-level reusable workflow).

Pattern adapted from bids-standard/bids-examples
`.github/workflows/validate_datasets.yml`.

Co-Authored-By: Claude Code 2.1.114 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… jobs

Set `if: false` on the top-level job of `run-tests.yml`, `lint.yml`,
`typing.yml`, `codeql.yml`, and `docs.yml` so we don't burn CI minutes
on the full PR check suite while iterating on the new
`cross-repo-pr-tests.yml` workflow. The cross-repo workflow itself
is unaffected.

This commit is intended to be dropped before this PR (or its successor)
lands on master.

Co-Authored-By: Claude Code 2.1.114 / Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@yarikoptic yarikoptic force-pushed the enh-tests-PR-dandiarchive branch from e5625f7 to 39f51d9 Compare April 30, 2026 21:59
