ci: cross-platform test matrix with git version testing#1034
ci: cross-platform test matrix with git version testing#1034clemlesne wants to merge 11 commits intosourcegraph:mainfrom
Conversation
…sting Add build-git + test job split to test across all 4 supported platforms (x86_64-linux, aarch64-linux, aarch64-darwin, x86_64-darwin) and multiple git versions (2.34.1, 2.39.5, 2.47.3, 2.53.0) to catch regressions like the cat-file --filter breakage on older git (issue sourcegraph#1029).
60f13ce to
a7cb7fa
Compare
Each version maps to a real distro's stock git package: - 2.34.1: Ubuntu 22.04 LTS - 2.39.5: Debian 12, reported failing in sourcegraph#1029 - 2.47.3: Debian 13 + Alpine 3.21, also failing in sourcegraph#1029 - 2.53.0: latest stable, first with cat-file --batch --filter= (added in 2.50.0)
Define runners and git-versions once in an init job, then reference via fromJSON() in both build-git and test matrices. Avoids duplication and keeps version justification comments in a single place.
Build universal-ctags from source alongside git, using the same pattern (cache + artifact + per-runner build). Two versions: - p5.9.20210829.0: Ubuntu 22.04/24.04 LTS default (2021 snapshot) - v6.2.1: latest stable, matches Homebrew and Alpine Edge Uses --program-prefix=universal- and --enable-json to match the existing install-ctags-alpine.sh flags. Test matrix is now 8 runners × 4 git versions × 2 ctags versions = 64 combinations.
a7cb7fa to
965d839
Compare
gettext is keg-only on Homebrew, so libintl.h is not in the default include path. Add it to brew install and CFLAGS/LDFLAGS.
The verify step runs unconditionally but dynamically-linked binaries need their runtime deps present. Removing the cache-miss condition from dep install steps ensures dylibs are always available. brew/apt install are idempotent and fast when packages are already present.
ee2a502 to
36d7705
Compare
Git's config.mak.uname already auto-detects Homebrew prefix and finds gettext, libiconv, openssl paths. Manual CFLAGS/LDFLAGS were redundant and could conflict with the build system's own detection.
0c55791 to
00ddf88
Compare
This reverts commit 00ddf88. Git's config.mak.uname Homebrew auto-detection for gettext was added after git 2.34.1, so older versions fail with 'libintl.h' not found on macOS. Restore explicit CFLAGS/LDFLAGS to support all git versions in the matrix.
affb64f to
3069942
Compare
Git's Makefile sets CFLAGS = -g -O2 -Wall which overrides the env var. CPPFLAGS is preserved via ALL_CFLAGS = $(CPPFLAGS) $(CFLAGS), so keg-only Homebrew include paths (gettext, openssl, etc.) are actually picked up by the compiler.
9e2fa73 to
21b8da5
Compare
|
Sorry @keegancsmith need to get your approval each time I fix the Actions. This is a security policy you enforced on the repo config. Achieving a standard cross-version git build on both Linux and macOS is a bit tricky but I'll solve that. |
Git's Makefile overrides both CFLAGS and LDFLAGS env vars with its own assignments. Only command-line arguments to make take precedence over Makefile assignments. Previous fix solved the compile step (CPPFLAGS for headers) but the linker still couldn't find libintl because LDFLAGS was being overridden.
97165ef to
430511c
Compare
Shell quoting with make command-line variables is fragile — spaces in CPPFLAGS/LDFLAGS caused make to interpret paths as options. Git's Makefile auto-includes config.mak, so writing the paths there avoids all quoting issues. Uses += to append to existing flags. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
430511c to
030ba7e
Compare
CI matrix results — first run analysisThe CI is failing, and that's the point. Before this PR, the CI ran on a single Alpine Edge container with bleeding-edge git and ctags. It passed — but #1029 proved that was a false green. Users on Debian 12, Debian 13, and Ubuntu LTS hit crashes that the CI never caught. This matrix now tests 8 runners × 4 git versions × 2 ctags versions = 64 combinations. The first run surfaced two real compatibility bugs that were invisible before: Run: https://github.com/sourcegraph/zoekt/actions/runs/2390370041448/48 builds pass. 8/64 tests pass (git 2.53.0 + ctags v6.2.1, all 8 runners).
Proposal: merge now, fix forwardI'd recommend merging the matrix as-is, then pushing targeted fix PRs:
Why merge first, fix after:
The matrix is doing what it was designed to do. The sooner it's in main, the sooner every PR gets the coverage that would have caught #1029. |
keegancsmith
left a comment
There was a problem hiding this comment.
I think this is great that it tests so much, but its a huge hammer for most CI runs on this project. Here is my stance:
- ctags having different output affecting test fixtures is not a bug. We make our fixtures match the version of ctags used.
- Older git versions: now that we are shelling out to git we should just make it detect if there is an old git and fail gracefully.
There was a problem hiding this comment.
Thanks for the review @keegancsmith! I appreciate you taking the time. Let me share some context that I think makes a strong case for this matrix — happy to discuss and find the right scope together.
On graceful fallback for older git:
Fully agree that runtime detection + graceful fallback is the right approach (and PR #1037 is heading in that direction). The question is: how do we validate that the fallback actually works?
The --filter regression (#1029) is a useful case study here. PR #1026 added --filter=blob:limit= which requires git >= 2.50.0, but the CI only tested on Alpine Edge (git 2.53+), so it passed. Three days later, @MarcWort reported that many repos showed 0 bytes on Debian 12 (git 2.39.5) — a silent failure, not a graceful one. The existing CI couldn't have caught this.
A matrix that exercises older git versions would let us verify that fallback paths like #1037 work correctly — and prevent the next version-dependent change from slipping through the same way.
The distro landscape is worth considering:
Today, no major LTS/stable distribution ships git >= 2.50.0:
| Distribution | git version | --filter support |
|---|---|---|
| Ubuntu 22.04 LTS | 2.34.1 | No |
| Ubuntu 24.04 LTS | 2.43.0 | No |
| Debian 12 (Bookworm) | 2.39.5 | No |
| Debian 13 (Trixie) | 2.47.3 | No |
| Alpine 3.21 | 2.47.3 | No |
(Sources: packages.ubuntu.com, tracker.debian.org, pkgs.alpinelinux.org)
Ubuntu 22.04 and 24.04 also ship ctags 5.9 (a 2021 snapshot). That's a pretty large user surface to cover — a matrix helps us understand what works for them.
On ctags version differences:
Fair point that fixtures should match the ctags version. What I'm seeing is that TestRanking fails on ctags 5.9 (the version shipped by Ubuntu 22.04/24.04 and Debian 12 — the two most common server distros). We could handle this with version-gated assertions or by documenting a minimum supported version — either way, a matrix helps us know where the boundaries are.
On CI cost:
The numbers from the first run:
| Platform | Test jobs | Avg duration |
|---|---|---|
| Linux (x86 + ARM) | 32 | 2:49 |
| macOS Apple Silicon | 16 | 2:37 |
| macOS Intel | 16 | 4:49 |
The 31-minute total wall-clock is dominated by macOS runner queuing (GitHub only provides ~2-3 macOS runners in parallel, so jobs wait up to 27 min). With full parallelism the whole workflow would finish in under 10 min for Linux, or ~12 min including macOS Intel. Build-git compiles in ~2 min on Ubuntu and the cache further speeds up subsequent runs.
If the macOS matrix feels too large, I'm happy to trim it (e.g. fewer git versions on Intel, or drop Intel entirely). But the Linux matrix (32 jobs, all under 3:30) is the critical coverage — that's what would have caught #1029.
Finding the right scope together:
I don't think it has to be all-or-nothing. We could start with a smaller matrix (e.g. Linux-only, 2-3 git versions) and expand from there. The goal is just to have at least some coverage for the git/ctags versions that real users are running on. What would feel like the right balance to you?
|
One more data point on CI cost — the 31-minute wall-clock is actually a documented GitHub Actions platform limit, not a matrix design issue. GitHub caps concurrent macOS jobs at 5 per organization on Free/Pro/Team plans, shared across all repos and all runner labels (Intel + Apple Silicon). That's why the 32 Linux jobs all started within seconds of each other, while macOS jobs serialized in batches with up to 27 min of queue time.
If we trim the macOS matrix (e.g. 2 git versions instead of 4), we halve the macOS queue time. And the Linux matrix (32 jobs, all under 3:30 each) is unaffected — that's the critical coverage that would have caught #1029. |
|
Thanks for the detailed response. I think we should set a minimum supported git version. Indeed, we may need a matrix here since we otherwise won't be able to test the --filter flag since our minimum supported version should match what is shipped in the latest LTS of debian and ubuntu. So I would recommend just two things in the matrix, the minimum supported version of git and whatever alpine latest (or whatever we are running?) has. I'm not that interested in testing darwin vs linux here, just stick to linux. In general I prefer faster times. Additionally we don't have any darwin vs linux specific code. Personally I develop mostly on linux, but am often on my mbp. For ctags, I still disagree here. The different versions break tests in a way which doesn't affect users, only the fixtures We should just stick to the version of universal-ctags we used to use. |
Summary
testjob with three decoupled jobs:build-git,build-ctags, andtestx86_64-linux,aarch64-linux,aarch64-darwin,x86_64-darwin(8 runners)2.34.1,2.39.5,2.47.3,2.53.0p5.9.20210829.0,v6.2.1initjob, consumed viafromJSON()— single source of truthMotivation
Issue #1029 revealed that the CI only tested on Alpine Edge (bleeding-edge git + ctags), masking a compatibility regression with
git cat-file --batch --filter=on older git versions (Debian 12/13, Ubuntu 22.04). More broadly, the CI only covered a single platform/arch despite zoekt officially supporting four.This PR strengthens the QA posture by testing across the full platform matrix and multiple dependency versions, catching system-level integration issues before they reach users.
Design
initbuild-gitbuild-ctags--program-prefix=universal- --enable-jsonmatching existinginstall-ctags-alpine.shtestgo test ./...OS-specific steps use
if: runner.os == 'Linux'/if: runner.os == 'macOS'conditionals.Git versions
--batch --filter=2.34.11:2.34.1-1ubuntu12.39.51:2.39.5-0+deb12u3, reported failing in #10292.47.31:2.47.3-0+deb13u1, also failing in #10292.53.0Universal-ctags versions
p5.9.20210829.05.9.20210829.0-1(2021 snapshot)v6.2.1Test plan
build-gitjobs compile successfullybuild-ctagsjobs compile successfullytestjobs on git 2.53.0 pass —--filteris supportedtestjobs on git 2.34.1 / 2.39.5 / 2.47.3 surface expectedcat-filefailures — confirms the CI would have caught suddenly no content in repos #1029universal-ctagsbinaries with+interactiveRelated to #1029
🤖 Generated with Claude Code