ci: trim PR/merge-queue test time #5062

Draft
pietern wants to merge 2 commits into main from test-time-debug

@pietern pietern commented Apr 22, 2026

Summary

Two low-risk cuts targeting the dominant costs in PR test wall-clock.

1. Drop job_pydabs_1000_tasks from migrate + continue_293 invariant tests

The 1000-task scale case is still exercised by no_drift; running the same scale in two additional variants added ~3 min per linux/direct job without incremental coverage. See acceptance/bundle/invariant/{migrate,continue_293}/test.toml.

2. Slim install-pythons in the CI setup-build-environment action

  • New install-pythons-test Makefile target installs only Python 3.10 (the default for acceptance bundle tests; 3.13 already comes from actions/setup-python).
  • libs/patchwheel/patch_test.go now skips versions not available via uv python find, so CI can install a subset without failing.
  • make integration still pulls in the full install-pythons (3.9–3.13) — no change to integration runs.

Expected impact

  • linux/direct ~10 min → ~3–4 min
  • windows setup ~3 min lighter (Python install)

Test plan

  • make test passes locally
  • patchwheel unit test skips missing versions cleanly
  • Invariant tests show no_drift/pydabs_1000_tasks still running; migrate/pydabs_1000_tasks + continue_293/pydabs_1000_tasks no longer in the matrix
  • Monitor PR CI times on this PR vs baseline — want linux/direct under 5 min

Not included in this PR

  • Terraform-on-Windows speedups (Dev Drive / TF_PLUGIN_CACHE_DIR). The repo already uses a filesystem_mirror for the databricks provider, so the easy win of avoiding registry downloads is already in place. Deeper wins (GHA Dev Drive for .terraform/ small-file I/O, plugin cache hardlinks on NTFS) belong in a separate PR: research says the C: drive on windows-*-large runners gets ~4k IOPS vs ~127k on a Dev Drive, and this is the real Windows gap.

This pull request and its description were written by Isaac.

Two low-risk cuts targeting the dominant costs in PR test wall-clock:

1. Exclude `job_pydabs_1000_tasks` from the `migrate` and `continue_293`
   invariant tests. The 1000-task scale case is still exercised by `no_drift`;
   running the same scale in two additional variants added ~3 minutes per
   linux/direct test job without incremental coverage.

2. Add `install-pythons-test` Makefile target that installs only the Python
   version actually needed by unit + acceptance tests (3.10), and switch the
   setup-build-environment composite action to use it. The `patchwheel` unit
   test now skips versions not available via `uv python find`, so CI can
   install a subset without failing. Integration runs keep using
   `install-pythons` for the full 3.9-3.13 matrix.

Expected impact: linux/direct ~10min -> ~3-4min, windows/* ~3min off setup.

Co-authored-by: Isaac
Comment thread libs/patchwheel/patch_test.go Outdated
// Skip if the interpreter is not available via uv. CI only installs
// a subset of versions to keep setup fast; the full matrix is
// exercised in integration and scheduled runs.
if err := exec.Command("uv", "python", "find", py).Run(); err != nil {

Dynamically skipping like this isn't great: if setup breaks and nobody notices, we could end up running fewer tests than we think.

Better to add an explicit env var, TEST_PYTHON_VERSIONS=3.10,3.11, that is initialized differently in different contexts.
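
A minimal sketch of the explicit-list approach, in Go to match patch_test.go. Only the variable name TEST_PYTHON_VERSIONS comes from the comment above; the helper name `parsePythonVersions` and its error wording are hypothetical, not part of the repository. The idea is that an empty or malformed list is an error rather than a reason to skip:

```go
package main

import (
	"fmt"
	"strings"
)

// parsePythonVersions (hypothetical helper) splits a comma-separated list
// such as "3.10,3.11" into individual versions. An empty value or an empty
// entry is reported as an error, so a broken CI setup fails loudly instead
// of silently shrinking the set of versions under test.
func parsePythonVersions(raw string) ([]string, error) {
	if strings.TrimSpace(raw) == "" {
		return nil, fmt.Errorf("TEST_PYTHON_VERSIONS is empty or unset")
	}
	var versions []string
	for _, part := range strings.Split(raw, ",") {
		v := strings.TrimSpace(part)
		if v == "" {
			return nil, fmt.Errorf("empty entry in TEST_PYTHON_VERSIONS=%q", raw)
		}
		versions = append(versions, v)
	}
	return versions, nil
}
```

A test could call this on `os.Getenv("TEST_PYTHON_VERSIONS")` and use `t.Fatalf` (not a skip) when a listed version cannot be located with `uv python find`, so that each context states up front which interpreters it expects.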


# The 1000-task scale case is covered by no_drift. Running it here adds ~1.5 min
# per variant (deploy + migrate + plan at 1000 tasks) without incremental coverage.
EnvMatrixExclude.no_pydabs_1000_tasks = ["INPUT_CONFIG=job_pydabs_1000_tasks.yml.tmpl"]

How about dropping them from local tests but keeping them on cloud (which is async)?

It's important that migration works for all edge cases.

The test server (libs/testserver/jobs.go:531) maps a task's spark_version
to a Python version (DBR 12 -> 3.9, 13 -> 3.10, 15 -> 3.11, >=16 -> 3.12)
and calls `uv venv --python <ver>` at runtime. Acceptance tests using DBR
12.2 (integration_whl/wrapper*) or EnvMatrix on UV_PYTHON (templates/
default-python/integration_classic) therefore require those versions to
be installed.
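
The mapping described above can be sketched roughly as follows. This is an illustration only, not the code in libs/testserver/jobs.go: the function name `pythonForSparkVersion` is hypothetical, and it assumes spark_version strings start with the DBR major version (for example "12.2.x-scala2.12"). Only the four cases listed above are encoded; anything else is reported as unmapped.

```go
package main

import (
	"strconv"
	"strings"
)

// pythonForSparkVersion (hypothetical name) returns the Python version that
// corresponds to a spark_version string. Versions not covered by the
// mapping in the commit message return ok=false.
func pythonForSparkVersion(sparkVersion string) (version string, ok bool) {
	// Assumption: the DBR major version is the text before the first dot.
	majorStr, _, _ := strings.Cut(sparkVersion, ".")
	major, err := strconv.Atoi(majorStr)
	if err != nil {
		return "", false
	}
	switch {
	case major == 12:
		return "3.9", true
	case major == 13:
		return "3.10", true
	case major == 15:
		return "3.11", true
	case major >= 16:
		return "3.12", true
	}
	return "", false
}
```

With a mapping like this, a test pinned to DBR 12.2 resolves to Python 3.9, which is why installing only 3.10 in CI breaks the acceptance tests named above.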

Measured on the prior CI run: installing only 3.10 took ~6.7s on Windows,
vs an estimated 30-60s for all five. Not worth breaking tests. The real
Windows setup cost is Setup Go (2m49s cache restore), which is a
different problem.

Keeping the pydabs_1000_tasks invariant exclusion from the prior commit;
linux/direct is now ~4m18s including reruns, under the <5 min target.

Co-authored-by: Isaac