Skip to content

ci: cross-platform test matrix with git version testing#1034

Open
clemlesne wants to merge 11 commits intosourcegraph:mainfrom
clemlesne:ci/cross-platform-test-matrix
Open

ci: cross-platform test matrix with git version testing#1034
clemlesne wants to merge 11 commits intosourcegraph:mainfrom
clemlesne:ci/cross-platform-test-matrix

Conversation

@clemlesne
Copy link
Copy Markdown
Contributor

@clemlesne clemlesne commented Mar 31, 2026

Summary

  • Replace the single Alpine-based test job with three decoupled jobs: build-git, build-ctags, and test
  • Test across all 4 officially supported platforms: x86_64-linux, aarch64-linux, aarch64-darwin, x86_64-darwin (8 runners)
  • Test against multiple git versions built from kernel.org tarballs: 2.34.1, 2.39.5, 2.47.3, 2.53.0
  • Test against multiple universal-ctags versions built from source: p5.9.20210829.0, v6.2.1
  • Centralize all matrix variables in an init job, consumed via fromJSON() — single source of truth
  • Pin GitHub Actions to latest semver tags

Motivation

Issue #1029 revealed that the CI only tested on Alpine Edge (bleeding-edge git + ctags), masking a compatibility regression with git cat-file --batch --filter= on older git versions (Debian 12/13, Ubuntu 22.04). More broadly, the CI only covered a single platform/arch despite zoekt officially supporting four.

This PR strengthens the QA posture by testing across the full platform matrix and multiple dependency versions, catching system-level integration issues before they reach users.

Design

Job Purpose
init Defines runner, git-version, and ctags-version lists as JSON outputs — single source of truth
build-git 8 runners × 4 git versions = 32 builds. Cached across runs, uploaded as artifacts
build-ctags 8 runners × 2 ctags versions = 16 builds. Uses --program-prefix=universal- --enable-json matching existing install-ctags-alpine.sh
test 8 runners × 4 git × 2 ctags = 64 combinations. Downloads both artifacts, runs go test ./...

OS-specific steps use if: runner.os == 'Linux' / if: runner.os == 'macOS' conditionals.

Git versions

Version Source --batch --filter=
2.34.1 Ubuntu 22.04 LTS (Jammy) — 1:2.34.1-1ubuntu1 No
2.39.5 Debian 12 (Bookworm) — 1:2.39.5-0+deb12u3, reported failing in #1029 No
2.47.3 Debian 13 (Trixie) + Alpine 3.21 — 1:2.47.3-0+deb13u1, also failing in #1029 No
2.53.0 Latest stable (2026-02-02) Yes (added in 2.50.0)

Universal-ctags versions

Version Source
p5.9.20210829.0 Ubuntu 22.04/24.04 LTS — 5.9.20210829.0-1 (2021 snapshot)
v6.2.1 Latest stable, matches Homebrew and Alpine Edge

Test plan

  • All 32 build-git jobs compile successfully
  • All 16 build-ctags jobs compile successfully
  • test jobs on git 2.53.0 pass — --filter is supported
  • test jobs on git 2.34.1 / 2.39.5 / 2.47.3 surface expected cat-file failures — confirms the CI would have caught suddenly no content in repos #1029
  • Both ctags versions produce valid universal-ctags binaries with +interactive
  • macOS builds use correct brew prefix (ARM vs Intel)

Related to #1029

🤖 Generated with Claude Code

…sting

Add build-git + test job split to test across all 4 supported platforms
(x86_64-linux, aarch64-linux, aarch64-darwin, x86_64-darwin) and multiple
git versions (2.34.1, 2.39.5, 2.47.3, 2.53.0) to catch regressions like
the cat-file --filter breakage on older git (issue sourcegraph#1029).
@clemlesne clemlesne marked this pull request as ready for review March 31, 2026 11:26
@clemlesne clemlesne force-pushed the ci/cross-platform-test-matrix branch 3 times, most recently from 60f13ce to a7cb7fa Compare March 31, 2026 11:53
Each version maps to a real distro's stock git package:
- 2.34.1: Ubuntu 22.04 LTS
- 2.39.5: Debian 12, reported failing in sourcegraph#1029
- 2.47.3: Debian 13 + Alpine 3.21, also failing in sourcegraph#1029
- 2.53.0: latest stable, first with cat-file --batch --filter= (added in 2.50.0)
Define runners and git-versions once in an init job, then reference
via fromJSON() in both build-git and test matrices. Avoids duplication
and keeps version justification comments in a single place.
Build universal-ctags from source alongside git, using the same
pattern (cache + artifact + per-runner build). Two versions:
- p5.9.20210829.0: Ubuntu 22.04/24.04 LTS default (2021 snapshot)
- v6.2.1: latest stable, matches Homebrew and Alpine Edge

Uses --program-prefix=universal- and --enable-json to match the
existing install-ctags-alpine.sh flags. Test matrix is now
8 runners × 4 git versions × 2 ctags versions = 64 combinations.
@clemlesne clemlesne force-pushed the ci/cross-platform-test-matrix branch from a7cb7fa to 965d839 Compare March 31, 2026 11:55
gettext is keg-only on Homebrew, so libintl.h is not in the default
include path. Add it to brew install and CFLAGS/LDFLAGS.
@keegancsmith keegancsmith self-requested a review March 31, 2026 13:15
The verify step runs unconditionally but dynamically-linked binaries
need their runtime deps present. Removing the cache-miss condition
from dep install steps ensures dylibs are always available. brew/apt
install are idempotent and fast when packages are already present.
@clemlesne clemlesne force-pushed the ci/cross-platform-test-matrix branch from ee2a502 to 36d7705 Compare March 31, 2026 13:18
Git's config.mak.uname already auto-detects Homebrew prefix and
finds gettext, libiconv, openssl paths. Manual CFLAGS/LDFLAGS were
redundant and could conflict with the build system's own detection.
@clemlesne clemlesne force-pushed the ci/cross-platform-test-matrix branch from 0c55791 to 00ddf88 Compare March 31, 2026 13:18
This reverts commit 00ddf88.

Git's config.mak.uname Homebrew auto-detection for gettext was added
after git 2.34.1, so older versions fail with 'libintl.h' not found
on macOS. Restore explicit CFLAGS/LDFLAGS to support all git versions
in the matrix.
@clemlesne clemlesne force-pushed the ci/cross-platform-test-matrix branch from affb64f to 3069942 Compare March 31, 2026 13:26
Git's Makefile sets CFLAGS = -g -O2 -Wall which overrides the env
var. CPPFLAGS is preserved via ALL_CFLAGS = $(CPPFLAGS) $(CFLAGS),
so keg-only Homebrew include paths (gettext, openssl, etc.) are
actually picked up by the compiler.
@clemlesne clemlesne force-pushed the ci/cross-platform-test-matrix branch from 9e2fa73 to 21b8da5 Compare April 1, 2026 09:07
@clemlesne
Copy link
Copy Markdown
Contributor Author

clemlesne commented Apr 1, 2026

Sorry @keegancsmith need to get your approval each time I fix the Actions. This is a security policy you enforced on the repo config. Achieving a standard cross-version git build on both Linux and macOS is a bit tricky but I'll solve that.

Git's Makefile overrides both CFLAGS and LDFLAGS env vars with its
own assignments. Only command-line arguments to make take precedence
over Makefile assignments. Previous fix solved the compile step
(CPPFLAGS for headers) but the linker still couldn't find libintl
because LDFLAGS was being overridden.
@clemlesne clemlesne force-pushed the ci/cross-platform-test-matrix branch 2 times, most recently from 97165ef to 430511c Compare April 2, 2026 13:41
Shell quoting with make command-line variables is fragile — spaces in
CPPFLAGS/LDFLAGS caused make to interpret paths as options. Git's
Makefile auto-includes config.mak, so writing the paths there avoids
all quoting issues. Uses += to append to existing flags.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@clemlesne clemlesne force-pushed the ci/cross-platform-test-matrix branch from 430511c to 030ba7e Compare April 2, 2026 13:48
@clemlesne
Copy link
Copy Markdown
Contributor Author

clemlesne commented Apr 3, 2026

CI matrix results — first run analysis

The CI is failing, and that's the point.

Before this PR, the CI ran on a single Alpine Edge container with bleeding-edge git and ctags. It passed — but #1029 proved that was a false green. Users on Debian 12, Debian 13, and Ubuntu LTS hit crashes that the CI never caught.

This matrix now tests 8 runners × 4 git versions × 2 ctags versions = 64 combinations. The first run surfaced two real compatibility bugs that were invisible before:

Run: https://github.com/sourcegraph/zoekt/actions/runs/23903700414

48/48 builds pass. 8/64 tests pass (git 2.53.0 + ctags v6.2.1, all 8 runners).

Root cause Affected Failing packages Details
git cat-file --batch --filter= unsupported on git < 2.50.0 48/64 combos (git 2.34.1, 2.39.5, 2.47.3) gitindex, cmd/zoekt-sourcegraph-indexserver SetDefaults() always sets SizeMax=2MBcatfileFilterSpec() always generates --filter=blob:limit=2097153 → EOF on git < 2.50.0. This is exactly #1029.
ctags p5.9 (Ubuntu 22.04/24.04 stock) produces different symbols than v6.2.1 8/64 combos (all ctags p5.9, even with git 2.53.0) internal/e2e (TestRanking) Fewer symbols extracted → different ranking. E.g. for bytes_buffer, ctags 5.9 misses the symbol match and an assembly file outranks the correct Go source.

Proposal: merge now, fix forward

I'd recommend merging the matrix as-is, then pushing targeted fix PRs:

  1. git compat: detect git version at runtime and skip --filter on < 2.50.0 (or graceful fallback)
  2. ctags compat: gate TestRanking on ctags version, or make assertions version-tolerant

Why merge first, fix after:

  • The CI will fail for all other PRs — that's intentional. It forces the fixes to be the immediate priority and prevents merging new code that could introduce additional compat regressions.
  • If we merge fixes before merging the matrix, we're validating those fixes against the old single-config CI, which is exactly the false-green situation that caused suddenly no content in repos #1029. We'd risk bad integrations without full matrix coverage at merge time.
  • Merging the matrix first ensures every subsequent PR — including the fixes — is validated against the full compatibility surface before it lands.

The matrix is doing what it was designed to do. The sooner it's in main, the sooner every PR gets the coverage that would have caught #1029.

Copy link
Copy Markdown
Member

@keegancsmith keegancsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is great that it tests so much, but its a huge hammer for most CI runs on this project. Here is my stance:

  • ctags having different output affecting test fixtures is not a bug. We make our fixtures match the version of ctags used.
  • Older git versions: now that we are shelling out to git we should just make it detect if there is an old git and fail gracefully.

Copy link
Copy Markdown
Contributor Author

@clemlesne clemlesne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review @keegancsmith! I appreciate you taking the time. Let me share some context that I think makes a strong case for this matrix — happy to discuss and find the right scope together.

On graceful fallback for older git:

Fully agree that runtime detection + graceful fallback is the right approach (and PR #1037 is heading in that direction). The question is: how do we validate that the fallback actually works?

The --filter regression (#1029) is a useful case study here. PR #1026 added --filter=blob:limit= which requires git >= 2.50.0, but the CI only tested on Alpine Edge (git 2.53+), so it passed. Three days later, @MarcWort reported that many repos showed 0 bytes on Debian 12 (git 2.39.5) — a silent failure, not a graceful one. The existing CI couldn't have caught this.

A matrix that exercises older git versions would let us verify that fallback paths like #1037 work correctly — and prevent the next version-dependent change from slipping through the same way.

The distro landscape is worth considering:

Today, no major LTS/stable distribution ships git >= 2.50.0:

Distribution git version --filter support
Ubuntu 22.04 LTS 2.34.1 No
Ubuntu 24.04 LTS 2.43.0 No
Debian 12 (Bookworm) 2.39.5 No
Debian 13 (Trixie) 2.47.3 No
Alpine 3.21 2.47.3 No

(Sources: packages.ubuntu.com, tracker.debian.org, pkgs.alpinelinux.org)

Ubuntu 22.04 and 24.04 also ship ctags 5.9 (a 2021 snapshot). That's a pretty large user surface to cover — a matrix helps us understand what works for them.

On ctags version differences:

Fair point that fixtures should match the ctags version. What I'm seeing is that TestRanking fails on ctags 5.9 (the version shipped by Ubuntu 22.04/24.04 and Debian 12 — the two most common server distros). We could handle this with version-gated assertions or by documenting a minimum supported version — either way, a matrix helps us know where the boundaries are.

On CI cost:

The numbers from the first run:

Platform Test jobs Avg duration
Linux (x86 + ARM) 32 2:49
macOS Apple Silicon 16 2:37
macOS Intel 16 4:49

The 31-minute total wall-clock is dominated by macOS runner queuing (GitHub only provides ~2-3 macOS runners in parallel, so jobs wait up to 27 min). With full parallelism the whole workflow would finish in under 10 min for Linux, or ~12 min including macOS Intel. Build-git compiles in ~2 min on Ubuntu and the cache further speeds up subsequent runs.

If the macOS matrix feels too large, I'm happy to trim it (e.g. fewer git versions on Intel, or drop Intel entirely). But the Linux matrix (32 jobs, all under 3:30) is the critical coverage — that's what would have caught #1029.

Finding the right scope together:

I don't think it has to be all-or-nothing. We could start with a smaller matrix (e.g. Linux-only, 2-3 git versions) and expand from there. The goal is just to have at least some coverage for the git/ctags versions that real users are running on. What would feel like the right balance to you?

@clemlesne
Copy link
Copy Markdown
Contributor Author

One more data point on CI cost — the 31-minute wall-clock is actually a documented GitHub Actions platform limit, not a matrix design issue.

GitHub caps concurrent macOS jobs at 5 per organization on Free/Pro/Team plans, shared across all repos and all runner labels (Intel + Apple Silicon). That's why the 32 Linux jobs all started within seconds of each other, while macOS jobs serialized in batches with up to 27 min of queue time.

Plan Linux/Windows concurrent macOS concurrent
Free 20 5
Pro 40 5
Team 60 5
Enterprise 500 50

If we trim the macOS matrix (e.g. 2 git versions instead of 4), we halve the macOS queue time. And the Linux matrix (32 jobs, all under 3:30 each) is unaffected — that's the critical coverage that would have caught #1029.

@keegancsmith
Copy link
Copy Markdown
Member

Thanks for the detailed response. I think we should set a minimum supported git version. Indeed, we may need a matrix here since we otherwise won't be able to test the --filter flag since our minimum supported version should match what is shipped in the latest LTS of debian and ubuntu. So I would recommend just two things in the matrix, the minimum supported version of git and whatever alpine latest (or whatever we are running?) has.

I'm not that interested in testing darwin vs linux here, just stick to linux. In general I prefer faster times. Additionally we don't have any darwin vs linux specific code. Personally I develop mostly on linux, but am often on my mbp.

For ctags, I still disagree here. The different versions break tests in a way which doesn't affect users, only the fixtures We should just stick to the version of universal-ctags we used to use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants