diff --git a/.agents/ci-caching.md b/.agents/ci-caching.md
index c1127b65c9d0..1c8c65470ede 100644
--- a/.agents/ci-caching.md
+++ b/.agents/ci-caching.md
@@ -101,6 +101,161 @@ For ccache, the workflow exports `CMAKE_ARGS=… -DCMAKE_C_COMPILER_LAUNCHER=cca
 GitHub Actions caches are limited to 10 GB per repo. Steady-state worst case: ~800 MB Go cache + ~2 GB brew Cellar + up to 2 GB ccache + ~1.5 GB × 5 python backends. If the cap is hit, prefer collapsing the per-backend Python keys into a shared `pyenv-darwin-shared-` key (accepts more cross-backend churn for a smaller footprint) before reducing other caches.
+## Layered base images (`localai-base`)
+
+The registry-backed BuildKit cache deduplicates **within** a matrix entry's
+cache tag, but each matrix entry has its own tag — so the same `apt-get`,
+GPU SDK install, and language toolchain bootstrap is repeated under N
+different cache tags across the backend matrix. The `localai-base` images
+factor that shared work out of the per-backend builds.
+
+### How it fits together
+
+```
+.github/backend-matrix.yaml        # raw matrix data (linux + darwin)
+  │
+  ▼
+backend.yml / backend_pr.yml
+  ├── derive-bases / generate-matrix
+  │     scripts/changed-backends.js
+  │     reads .github/backend-matrix.yaml
+  │     (PR mode also reads changed files)
+  │     emits:
+  │       - matrix (annotated with base-image-prebuilt)
+  │       - matrix-darwin
+  │       - bases-matrix (deduplicated by tag-stem)
+  │
+  ├── build-bases (matrix: bases-matrix)
+  │     uses base_images.yml
+  │     FROM .docker/bases/Dockerfile.<lang>
+  │     pushes quay.io/go-skynet/localai-base:<stem>[-pr]
+  │
+  └── backend-jobs (matrix: matrix; needs build-bases)
+        uses backend_build.yml
+        FROM ${BASE_IMAGE_PREBUILT}
+        i.e. quay.io/go-skynet/localai-base:<stem>[-pr]
+        only the backend source COPY + `make` remain.
+```
+
+The base image is **always** built before backends consume it, in the same
+workflow run.
+There is no cross-workflow dependency, no chicken-and-egg
+on first push, and no manual matrix to keep in sync — adding a backend
+matrix entry is just an edit to `.github/backend-matrix.yaml`.
+
+### Tag scheme
+
+`<stem>` is computed by `tagStem()` in `scripts/changed-backends.js` from
+the (lang, build-type, ubuntu, cuda, base-image) tuple. Arch is
+intentionally NOT in the stem — bases are built multi-arch when any
+consumer needs multi-arch, and single-arch otherwise (the `platforms`
+field on each base entry is the union of its consumers' platforms).
+
+| Build-type | Stem template |
+|---|---|
+| `''` (CPU) | `<lang>-cpu-<ubuntu>[-<base-slug>]` |
+| `cublas` / `l4t` | `<lang>-<build-type>-<ubuntu>-cuda<major>.<minor>[-<base-slug>]` |
+| anything else (vulkan, hipblas, intel, sycl_*) | `<lang>-<build-type>-<ubuntu>[-<base-slug>]` |
+
+The base-image slug is empty for the default `ubuntu:24.04` and a short
+parseable suffix otherwise (`jetpack-r36.4.0`, `rocm-7.2.1`,
+`oneapi-2025.3.2`, etc.).
+
+| Event | Pushed tag |
+|---|---|
+| `push` (master/tag) | `:<stem>` |
+| `pull_request` | `:<stem>-pr` |
+
+The cache for the base build itself lives at
+`quay.io/go-skynet/ci-cache:base-<stem>` (`mode=max,ignore-error=true`),
+parallel to the per-matrix-entry caches.
+
+The script also runs a collision check across consumers of each stem: if
+two consumers map to the same stem but disagree on `base-image` or
+`skip-drivers` (and skip-drivers is meaningful for that build-type), the
+script fails loudly. Resolve by encoding the differing input in
+`tagStem()` rather than letting the dedup silently pick a winner.
+
+### PR testability
+
+PRs run the same pipeline as master: derive bases → build bases (tagged
+`-pr`) → run filtered backend matrix consuming those `-pr` tags.
+End-to-end validation always lives within the PR.
+
+For PRs that only change `.docker/bases/Dockerfile.<lang>` (no backend
+source touched), `changed-backends.js` adds one canary backend matrix
+entry per (lang × build-type × arch × cuda × ubuntu) tuple to the filtered
+matrix so each base flavour gets exercised.
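The stem scheme described above can be sketched in code. This is a minimal illustration only, with hypothetical helper and field names (`baseSlug`, the entry object shape); the real `tagStem()` in `scripts/changed-backends.js` is the source of truth.

```javascript
// Illustrative sketch of the stem computation; NOT the real implementation.
function tagStem(entry) {
  const lang = entry.lang;                 // e.g. "python"
  const ubuntu = entry["ubuntu-version"];  // e.g. "2404"
  const bt = entry["build-type"];
  // Base-image slug: empty for the default ubuntu:24.04, a short
  // parseable suffix otherwise (e.g. "jetpack-r36.4.0").
  const slug =
    entry["base-image"] === "ubuntu:24.04" ? "" : baseSlug(entry["base-image"]);
  const tail = slug ? `-${slug}` : "";
  if (bt === "") return `${lang}-cpu-${ubuntu}${tail}`;
  if (bt === "cublas" || bt === "l4t")
    return (
      `${lang}-${bt}-${ubuntu}` +
      `-cuda${entry["cuda-major-version"]}.${entry["cuda-minor-version"]}${tail}`
    );
  return `${lang}-${bt}-${ubuntu}${tail}`;
}

// Hypothetical slug helper, matching the examples in the table above:
// "nvcr.io/nvidia/l4t-jetpack:r36.4.0" -> "jetpack-r36.4.0"
function baseSlug(image) {
  const [name, tag] = image.split("/").pop().split(":");
  return `${name.split("-").pop()}-${tag}`;
}
```

Note that because arch is deliberately absent from the tuple, two consumers that differ only in `platforms` collapse to one stem, which is exactly what the dedup and the collision check rely on.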
+
+### Existing language tiers
+
+| Tier (lang) | Recipe | Consumer Dockerfile(s) | Distinct stems |
+|---|---|---|---|
+| `python` | `.docker/bases/Dockerfile.python` | `backend/Dockerfile.python` | 9 |
+| `golang` | `.docker/bases/Dockerfile.golang` | `backend/Dockerfile.golang` | 8 |
+| `cpp` | `.docker/bases/Dockerfile.cpp` (apt + GPU + protoc + cmake + GRPC) | `backend/Dockerfile.{llama-cpp,ik-llama-cpp,turboquant}` | 8 |
+| `rust` | `.docker/bases/Dockerfile.rust` | `backend/Dockerfile.rust` | 1 |
+
+The C++ trio share a single `cpp` base because they only differ in their
+per-backend `make` targets. `langOf()` in `scripts/changed-backends.js`
+remaps `Dockerfile.{llama-cpp,ik-llama-cpp,turboquant}` → `cpp` so dedup
+works across the trio. If a future C++ consumer needs a *different* base
+(e.g. without GRPC, or with a different protoc version), give it its own
+`Dockerfile.<lang>` recipe and remove it from the cpp remap.
+
+### Adding a new (accel × arch × cuda × lang) flavour
+
+Just add the matrix entry to `.github/backend-matrix.yaml` for the new
+flavour. The bases matrix and the per-entry `base-image-prebuilt` are
+derived automatically by `scripts/changed-backends.js`. Nothing else to
+change.
+
+### Adding a new language tier
+
+1. Create `.docker/bases/Dockerfile.<lang>` mirroring an existing tier
+   (apt + accel install + lang-specific toolchain).
+2. Slim `backend/Dockerfile.<lang>` to `FROM ${BASE_IMAGE_PREBUILT}` plus
+   the per-backend source COPY + build (no inline accel install).
+3. Add the new recipe to `baseTriggerFiles` in
+   `scripts/changed-backends.js` so PRs touching it fan out to canaries.
+4. Add `<lang>: (item) => item.dockerfile.endsWith("<lang>")` to
+   `langTriggerSelector` in the same file.
+5. Add a `LOCAL_BASE_<LANG>_TAG`, a `docker-build-<lang>-base` target,
+   and a clause in `local-base-tag` / `local-base-target` in `Makefile`.
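The script-side hooks from steps 3–4 might look like the sketch below. This is a hypothetical shape, using `zig` as a stand-in for the new language tier; the actual structure of `baseTriggerFiles` and `langTriggerSelector` in `scripts/changed-backends.js` is authoritative.

```javascript
// Hypothetical shapes of the two hooks from steps 3-4 (illustrative only).

// Step 3: recipes whose modification fans out to canary matrix entries.
const baseTriggerFiles = [
  ".docker/bases/Dockerfile.python",
  ".docker/bases/Dockerfile.golang",
  ".docker/bases/Dockerfile.cpp",
  ".docker/bases/Dockerfile.rust",
  ".docker/bases/Dockerfile.zig", // new tier's recipe
];

// Step 4: maps a lang to a predicate selecting its consumer matrix entries.
const langTriggerSelector = {
  zig: (item) => item.dockerfile.endsWith("zig"),
};
```

An `endsWith`-style predicate matches the one-consumer-Dockerfile-per-tier convention; a tier with several consumer Dockerfiles (like `cpp`) would need a broader predicate plus the `langOf()` remap described above.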
+
+The `langsWithBase` set in `scripts/changed-backends.js` is auto-detected
+from the `.docker/bases/` directory at script startup, so step 1 alone is
+enough for the script to start emitting bases (and annotating matrix
+entries with `base-image-prebuilt`) for that lang. Steps 3–5 plug it
+into the canary fan-out and the local-build path.
+
+### Why not just rely on `mode=max` cache?
+
+`mode=max` deduplicates at the layer level, but each matrix entry has its
+own cache tag (namespaced by `tag-suffix`). A change that invalidates the
+GPU SDK layer in one backend does not invalidate it in any other; each
+entry pays the full cost on its next rebuild. The shared base image is
+built once per (accel × arch × cuda × lang), then pulled by every backend
+that consumes it — that's the actual cross-matrix dedup.
+
+### Local builds
+
+All `backend/Dockerfile.{python,golang,cpp,rust}` consumers require
+`BASE_IMAGE_PREBUILT` (no inline fallback). The Makefile wires the right
+`docker-build-<lang>-base` as a prerequisite for each backend's
+`docker-build-<backend>` target, so:
+
+```bash
+# Build any backend; the matching base is built first if needed.
+make docker-build-vllm BUILD_TYPE=cublas CUDA_MAJOR_VERSION=12 CUDA_MINOR_VERSION=8
+make docker-build-llama-cpp BUILD_TYPE=cublas CUDA_MAJOR_VERSION=13 CUDA_MINOR_VERSION=0
+make docker-build-rerankers  # python
+make docker-build-kokoros    # rust
+```
+
+Or build a base directly: `make docker-build-{python,golang,cpp,rust}-base
+BUILD_TYPE=...`. Or pull a pre-built one from quay if it exists for your
+target tuple.
+
 ## Touching the cache pipeline
 
 When changing `image_build.yml`, `backend_build.yml`, or any of the `backend/Dockerfile.*` files:
@@ -109,3 +264,4 @@ When changing `image_build.yml`, `backend_build.yml`, or any of the `backend/Doc
 2. **Keep `tag-suffix` unique per matrix entry** — it's the cache namespace. Two matrix entries sharing a tag-suffix would clobber each other's cache.
 3. 
**Keep `cache-to` gated on `github.event_name != 'pull_request'`** — PRs must not write. 4. **Keep `ignore-error=true` on `cache-to`** — quay registry hiccups must not fail builds. +5. **`tagStem()` in `scripts/changed-backends.js` is the single source of truth for base image tags.** The matrix entries are annotated with `base-image-prebuilt` in the same script run; backend-jobs reads the value as-is. There's no parallel YAML expression to keep in sync. Adding a new dimension to the stem (e.g. a slug for a new base-image variant) is a script change only. diff --git a/.docker/bases/Dockerfile.cpp b/.docker/bases/Dockerfile.cpp new file mode 100644 index 000000000000..e7ab763bb3d5 --- /dev/null +++ b/.docker/bases/Dockerfile.cpp @@ -0,0 +1,259 @@ +# Shared C++ + accelerator base image for the llama-cpp / ik-llama-cpp / +# turboquant trio. They differ only in their Makefile targets at build +# time; the apt + GPU SDK + protoc + cmake + GRPC install is identical. +# +# Built once per (build-type, arch, ubuntu-version, cuda-version) combination +# by .github/workflows/base_images.yml and pushed to +# quay.io/go-skynet/localai-base:[-pr]. Consumed by +# backend/Dockerfile.{llama-cpp,ik-llama-cpp,turboquant} via the +# BASE_IMAGE_PREBUILT build-arg. See .agents/ci-caching.md. 
+ +ARG BASE_IMAGE=ubuntu:24.04 +ARG APT_MIRROR="" +ARG APT_PORTS_MIRROR="" + +FROM ${BASE_IMAGE} AS grpc + +ARG GRPC_MAKEFLAGS="-j4 -Otarget" +ARG GRPC_VERSION=v1.65.0 +ARG CMAKE_FROM_SOURCE=false +# CUDA Toolkit 13.x compatibility: CMake 3.31.9+ fixes toolchain detection/arch table issues +ARG CMAKE_VERSION=3.31.10 +ARG APT_MIRROR +ARG APT_PORTS_MIRROR + +ENV MAKEFLAGS=${GRPC_MAKEFLAGS} + +WORKDIR /build + +RUN --mount=type=bind,source=.docker/apt-mirror.sh,target=/usr/local/sbin/apt-mirror \ + APT_MIRROR="${APT_MIRROR}" APT_PORTS_MIRROR="${APT_PORTS_MIRROR}" sh /usr/local/sbin/apt-mirror && \ + apt-get update && \ + apt-get install -y --no-install-recommends \ + ca-certificates \ + build-essential curl libssl-dev \ + git wget && \ + apt-get clean && \ + rm -rf /var/lib/apt/lists/* + +RUN </dev/null || ls /opt/rocm*/lib64/rocblas/library/Kernels* 2>/dev/null) | grep -oP 'gfx[0-9a-z+-]+' | sort -u || \ + echo "WARNING: No rocBLAS kernel data found" \ + ; fi + +# Install protoc (the version in 22.04 is too old, and grpc's bundled protoc +# would pull in a newer absl that breaks stablediffusion). +RUN <[-pr]. Consumed by +# backend/Dockerfile.golang via the BASE_IMAGE_PREBUILT build-arg. +# +# Mirrors the GPU stack stanzas in Dockerfile.python; the language-specific +# tail at the bottom installs Go + grpc tooling. See .agents/ci-caching.md. 
+ +ARG BASE_IMAGE=ubuntu:24.04 +ARG APT_MIRROR="" +ARG APT_PORTS_MIRROR="" + +FROM ${BASE_IMAGE} + +ARG BUILD_TYPE +ENV BUILD_TYPE=${BUILD_TYPE} +ARG CUDA_MAJOR_VERSION +ARG CUDA_MINOR_VERSION +ARG SKIP_DRIVERS=false +ENV CUDA_MAJOR_VERSION=${CUDA_MAJOR_VERSION} +ENV CUDA_MINOR_VERSION=${CUDA_MINOR_VERSION} +ENV DEBIAN_FRONTEND=noninteractive +ARG TARGETARCH +ARG TARGETVARIANT +ARG GO_VERSION=1.25.4 +ARG UBUNTU_VERSION=2404 +ARG APT_MIRROR +ARG APT_PORTS_MIRROR + +LABEL org.opencontainers.image.source="https://github.com/mudler/LocalAI" +LABEL org.opencontainers.image.description="LocalAI Go+accelerator base image" +LABEL org.localai.base.lang="golang" + +RUN --mount=type=bind,source=.docker/apt-mirror.sh,target=/usr/local/sbin/apt-mirror \ + APT_MIRROR="${APT_MIRROR}" APT_PORTS_MIRROR="${APT_PORTS_MIRROR}" sh /usr/local/sbin/apt-mirror && \ + apt-get update && \ + apt-get install -y --no-install-recommends \ + build-essential \ + gcc-14 g++-14 \ + git ccache \ + ca-certificates \ + make cmake wget libopenblas-dev \ + curl unzip \ + libssl-dev && \ + update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-14 100 \ + --slave /usr/bin/g++ g++ /usr/bin/g++-14 \ + --slave /usr/bin/gcov gcov /usr/bin/gcov-14 && \ + apt-get clean && \ + rm -rf /var/lib/apt/lists/* + +# Cuda +ENV PATH=/usr/local/cuda/bin:${PATH} + +# HipBLAS requirements +ENV PATH=/opt/rocm/bin:${PATH} + +# Vulkan requirements +RUN <[-pr]. Consumed by +# backend/Dockerfile.python via the BASE_IMAGE_PREBUILT build-arg. +# +# Keep the install steps below in lock-step with backend/Dockerfile.python's +# accel-inline stage until the inline fallback is removed. See +# .agents/ci-caching.md for the migration plan. 
+ +ARG BASE_IMAGE=ubuntu:24.04 +ARG APT_MIRROR="" +ARG APT_PORTS_MIRROR="" + +FROM ${BASE_IMAGE} + +ARG BUILD_TYPE +ENV BUILD_TYPE=${BUILD_TYPE} +ARG CUDA_MAJOR_VERSION +ARG CUDA_MINOR_VERSION +ARG SKIP_DRIVERS=false +ENV CUDA_MAJOR_VERSION=${CUDA_MAJOR_VERSION} +ENV CUDA_MINOR_VERSION=${CUDA_MINOR_VERSION} +ENV DEBIAN_FRONTEND=noninteractive +ARG TARGETARCH +ARG TARGETVARIANT +ARG UBUNTU_VERSION=2404 +ARG APT_MIRROR +ARG APT_PORTS_MIRROR + +LABEL org.opencontainers.image.source="https://github.com/mudler/LocalAI" +LABEL org.opencontainers.image.description="LocalAI Python+accelerator base image" +LABEL org.localai.base.lang="python" + +RUN --mount=type=bind,source=.docker/apt-mirror.sh,target=/usr/local/sbin/apt-mirror \ + APT_MIRROR="${APT_MIRROR}" APT_PORTS_MIRROR="${APT_PORTS_MIRROR}" sh /usr/local/sbin/apt-mirror && \ + apt-get update && \ + apt-get install -y --no-install-recommends \ + build-essential \ + ccache \ + ca-certificates \ + espeak-ng \ + curl \ + libssl-dev \ + git wget \ + git-lfs \ + unzip clang \ + upx-ucl \ + curl python3-pip \ + python-is-python3 \ + python3-dev llvm \ + libnuma1 libgomp1 \ + python3-venv make cmake && \ + apt-get clean && \ + rm -rf /var/lib/apt/lists/* + +RUN <[-pr]. The current +# rust matrix is CPU-only, so this base skips the GPU SDK stanzas; if a +# future rust backend needs cublas/rocm/etc., promote this recipe to mirror +# Dockerfile.python's GPU stack. See .agents/ci-caching.md. 
+ +ARG BASE_IMAGE=ubuntu:24.04 +ARG APT_MIRROR="" +ARG APT_PORTS_MIRROR="" + +FROM ${BASE_IMAGE} + +ENV DEBIAN_FRONTEND=noninteractive +ARG TARGETARCH +ARG TARGETVARIANT +ARG UBUNTU_VERSION=2404 +ARG APT_MIRROR +ARG APT_PORTS_MIRROR + +LABEL org.opencontainers.image.source="https://github.com/mudler/LocalAI" +LABEL org.opencontainers.image.description="LocalAI Rust base image" +LABEL org.localai.base.lang="rust" + +RUN --mount=type=bind,source=.docker/apt-mirror.sh,target=/usr/local/sbin/apt-mirror \ + APT_MIRROR="${APT_MIRROR}" APT_PORTS_MIRROR="${APT_PORTS_MIRROR}" sh /usr/local/sbin/apt-mirror && \ + apt-get update && \ + apt-get install -y --no-install-recommends \ + build-essential \ + git ccache \ + ca-certificates \ + make cmake wget \ + curl unzip \ + clang \ + pkg-config \ + libssl-dev \ + espeak-ng libespeak-ng-dev \ + libsonic-dev libpcaudio-dev \ + libopus-dev \ + protobuf-compiler && \ + apt-get clean && \ + rm -rf /var/lib/apt/lists/* + +# Install Rust +RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y +ENV PATH="/root/.cargo/bin:${PATH}" diff --git a/.github/backend-matrix.yaml b/.github/backend-matrix.yaml new file mode 100644 index 000000000000..07de4d55b495 --- /dev/null +++ b/.github/backend-matrix.yaml @@ -0,0 +1,3164 @@ +# Backend build matrix data, consumed by: +# - .github/workflows/backend.yml (master push) +# - .github/workflows/backend_pr.yml (PR filtering) +# - scripts/changed-backends.js (matrix derivation) +# Edit this file to add/remove/modify backend matrix entries; the rest of +# the build pipeline (base image derivation, build-bases job, fromJSON +# wiring) picks up the change automatically. 
+ +linux: + - build-type: l4t + cuda-major-version: "12" + cuda-minor-version: "0" + platforms: linux/arm64 + tag-latest: auto + tag-suffix: "-nvidia-l4t-diffusers" + runs-on: ubuntu-24.04-arm + base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0 + skip-drivers: "true" + backend: diffusers + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2204" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-vllm" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: vllm + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-sglang" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: sglang + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-diffusers" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: diffusers + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-chatterbox" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: chatterbox + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-moonshine" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: moonshine + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: 
"" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-tinygrad" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: tinygrad + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-whisperx" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: whisperx + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-faster-whisper" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: faster-whisper + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-ace-step" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: ace-step + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-trl" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: trl + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-llama-cpp-quantization" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: llama-cpp-quantization + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + 
cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-mlx" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: mlx + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-mlx-vlm" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: mlx-vlm + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-mlx-audio" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: mlx-audio + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-mlx-distributed" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: mlx-distributed + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-vibevoice" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: vibevoice + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-qwen-asr" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: qwen-asr + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + 
platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-nemo" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: nemo + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-qwen-tts" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: qwen-tts + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-fish-speech" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: fish-speech + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-faster-qwen3-tts" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: faster-qwen3-tts + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-voxcpm" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: voxcpm + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-pocket-tts" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: pocket-tts + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - 
build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "0" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-rerankers" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: rerankers + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-llama-cpp" + runs-on: bigger-runner + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: llama-cpp + dockerfile: ./backend/Dockerfile.llama-cpp + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-turboquant" + runs-on: bigger-runner + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: turboquant + dockerfile: ./backend/Dockerfile.turboquant + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-vllm" + runs-on: arc-runner-set + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: vllm + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-vllm-omni" + runs-on: arc-runner-set + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: vllm-omni + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-sglang" + runs-on: arc-runner-set + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: sglang + dockerfile: 
./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-transformers" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: transformers + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-diffusers" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: diffusers + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-ace-step" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: ace-step + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-trl" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: trl + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-kokoro" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: kokoro + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-faster-whisper" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + 
skip-drivers: "false" + backend: faster-whisper + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-whisperx" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: whisperx + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "9" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-coqui" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: coqui + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-outetts" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: outetts + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-chatterbox" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: chatterbox + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-moonshine" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: moonshine + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-mlx" + 
runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: mlx + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-mlx-vlm" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: mlx-vlm + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-mlx-audio" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: mlx-audio + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-mlx-distributed" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: mlx-distributed + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-stablediffusion-ggml" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: stablediffusion-ggml + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-sam3-cpp" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: sam3-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + 
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-12-whisper"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: whisper
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "12"
+    cuda-minor-version: "8"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-12-acestep-cpp"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: acestep-cpp
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "12"
+    cuda-minor-version: "8"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-12-qwen3-tts-cpp"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: qwen3-tts-cpp
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "12"
+    cuda-minor-version: "8"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-12-vibevoice-cpp"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: vibevoice-cpp
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "12"
+    cuda-minor-version: "8"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-12-rfdetr"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: rfdetr
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "12"
+    cuda-minor-version: "8"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-12-insightface"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: insightface
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "12"
+    cuda-minor-version: "8"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-12-speaker-recognition"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: speaker-recognition
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "12"
+    cuda-minor-version: "8"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-12-neutts"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: neutts
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-rerankers"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: rerankers
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-vibevoice"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: vibevoice
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-qwen-asr"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: qwen-asr
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-nemo"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: nemo
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-qwen-tts"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: qwen-tts
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-fish-speech"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: fish-speech
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-faster-qwen3-tts"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: faster-qwen3-tts
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-voxcpm"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: voxcpm
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-pocket-tts"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: pocket-tts
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-llama-cpp"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: llama-cpp
+    dockerfile: ./backend/Dockerfile.llama-cpp
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-turboquant"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: turboquant
+    dockerfile: ./backend/Dockerfile.turboquant
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    skip-drivers: "false"
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-llama-cpp"
+    base-image: ubuntu:24.04
+    runs-on: ubuntu-24.04-arm
+    ubuntu-version: "2404"
+    backend: llama-cpp
+    dockerfile: ./backend/Dockerfile.llama-cpp
+    context: ./
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    skip-drivers: "false"
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-turboquant"
+    base-image: ubuntu:24.04
+    runs-on: ubuntu-24.04-arm
+    ubuntu-version: "2404"
+    backend: turboquant
+    dockerfile: ./backend/Dockerfile.turboquant
+    context: ./
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-vllm"
+    runs-on: arc-runner-set
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: vllm
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-vllm-omni"
+    runs-on: arc-runner-set
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: vllm-omni
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-transformers"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: transformers
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-diffusers"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: diffusers
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-ace-step"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: ace-step
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-trl"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: trl
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-vibevoice"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: vibevoice
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-qwen-asr"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: qwen-asr
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-qwen-tts"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: qwen-tts
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-fish-speech"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: fish-speech
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-faster-qwen3-tts"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: faster-qwen3-tts
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-pocket-tts"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: pocket-tts
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-chatterbox"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: chatterbox
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-diffusers"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: diffusers
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-vllm"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: vllm
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-vllm-omni"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: vllm-omni
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-sglang"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: sglang
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-mlx"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: mlx
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-mlx-vlm"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: mlx-vlm
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-mlx-audio"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: mlx-audio
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-mlx-distributed"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: mlx-distributed
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-whisperx"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: whisperx
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: l4t
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-faster-whisper"
+    runs-on: ubuntu-24.04-arm
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    ubuntu-version: "2404"
+    backend: faster-whisper
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-kokoro"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: kokoro
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-faster-whisper"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: faster-whisper
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-whisperx"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: whisperx
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-chatterbox"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: chatterbox
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-moonshine"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: moonshine
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-mlx"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: mlx
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-mlx-vlm"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: mlx-vlm
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-mlx-audio"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: mlx-audio
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-mlx-distributed"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: mlx-distributed
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-stablediffusion-ggml"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: stablediffusion-ggml
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    skip-drivers: "false"
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-stablediffusion-ggml"
+    base-image: ubuntu:24.04
+    ubuntu-version: "2404"
+    runs-on: ubuntu-24.04-arm
+    backend: stablediffusion-ggml
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-sam3-cpp"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: sam3-cpp
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    skip-drivers: "false"
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-sam3-cpp"
+    base-image: ubuntu:24.04
+    ubuntu-version: "2404"
+    runs-on: ubuntu-24.04-arm
+    backend: sam3-cpp
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-whisper"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: whisper
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    skip-drivers: "false"
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-whisper"
+    base-image: ubuntu:24.04
+    ubuntu-version: "2404"
+    runs-on: ubuntu-24.04-arm
+    backend: whisper
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-acestep-cpp"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: acestep-cpp
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-qwen3-tts-cpp"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: qwen3-tts-cpp
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-vibevoice-cpp"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: vibevoice-cpp
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    skip-drivers: "false"
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-acestep-cpp"
+    base-image: ubuntu:24.04
+    ubuntu-version: "2404"
+    runs-on: ubuntu-24.04-arm
+    backend: acestep-cpp
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    skip-drivers: "false"
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-qwen3-tts-cpp"
+    base-image: ubuntu:24.04
+    ubuntu-version: "2404"
+    runs-on: ubuntu-24.04-arm
+    backend: qwen3-tts-cpp
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    skip-drivers: "false"
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-cuda-13-arm64-vibevoice-cpp"
+    base-image: ubuntu:24.04
+    ubuntu-version: "2404"
+    runs-on: ubuntu-24.04-arm
+    backend: vibevoice-cpp
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+  - build-type: cublas
+    cuda-major-version: "13"
+    cuda-minor-version: "0"
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-nvidia-cuda-13-rfdetr"
+    runs-on: ubuntu-latest
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: rfdetr
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-rerankers"
+    runs-on: ubuntu-latest
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: rerankers
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-llama-cpp"
+    runs-on: ubuntu-latest
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: llama-cpp
+    dockerfile: ./backend/Dockerfile.llama-cpp
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-turboquant"
+    runs-on: ubuntu-latest
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: turboquant
+    dockerfile: ./backend/Dockerfile.turboquant
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-vllm"
+    runs-on: arc-runner-set
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: vllm
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-vllm-omni"
+    runs-on: arc-runner-set
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: vllm-omni
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-sglang"
+    runs-on: arc-runner-set
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: sglang
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-transformers"
+    runs-on: arc-runner-set
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: transformers
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-diffusers"
+    runs-on: arc-runner-set
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: diffusers
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-ace-step"
+    runs-on: arc-runner-set
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: ace-step
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-kokoro"
+    runs-on: arc-runner-set
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: kokoro
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-vibevoice"
+    runs-on: arc-runner-set
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: vibevoice
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-qwen-asr"
+    runs-on: arc-runner-set
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: qwen-asr
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-nemo"
+    runs-on: arc-runner-set
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: nemo
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-qwen-tts"
+    runs-on: arc-runner-set
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: qwen-tts
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-fish-speech"
+    runs-on: arc-runner-set
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: fish-speech
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-voxcpm"
+    runs-on: arc-runner-set
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: voxcpm
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-pocket-tts"
+    runs-on: arc-runner-set
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: pocket-tts
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-faster-whisper"
+    runs-on: bigger-runner
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: faster-whisper
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: hipblas
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-rocm-hipblas-coqui"
+    runs-on: bigger-runner
+    base-image: rocm/dev-ubuntu-24.04:7.2.1
+    skip-drivers: "false"
+    backend: coqui
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: intel
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-intel-rerankers"
+    runs-on: ubuntu-latest
+    base-image: intel/oneapi-basekit:2025.3.2-0-devel-ubuntu24.04
+    skip-drivers: "false"
+    backend: rerankers
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: sycl_f32
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-intel-sycl-f32-llama-cpp"
+    runs-on: ubuntu-latest
+    base-image: intel/oneapi-basekit:2025.3.2-0-devel-ubuntu24.04
+    skip-drivers: "false"
+    backend: llama-cpp
+    dockerfile: ./backend/Dockerfile.llama-cpp
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: sycl_f32
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-intel-sycl-f32-turboquant"
+    runs-on: ubuntu-latest
+    base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04
+    skip-drivers: "false"
+    backend: turboquant
+    dockerfile: ./backend/Dockerfile.turboquant
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: sycl_f16
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-intel-sycl-f16-llama-cpp"
+    runs-on: ubuntu-latest
+    base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04
+    skip-drivers: "false"
+    backend: llama-cpp
+    dockerfile: ./backend/Dockerfile.llama-cpp
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: sycl_f16
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-intel-sycl-f16-turboquant"
+    runs-on: ubuntu-latest
+    base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04
+    skip-drivers: "false"
+    backend: turboquant
+    dockerfile: ./backend/Dockerfile.turboquant
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: intel
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-intel-vllm"
+    runs-on: arc-runner-set
+    base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04
+    skip-drivers: "false"
+    backend: vllm
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: intel
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-intel-sglang"
+    runs-on: arc-runner-set
+    base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04
+    skip-drivers: "false"
+    backend: sglang
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: intel
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-intel-transformers"
+    runs-on: ubuntu-latest
+    base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04
+    skip-drivers: "false"
+    backend: transformers
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: intel
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-intel-diffusers"
+    runs-on: ubuntu-latest
+    base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04
+    skip-drivers: "false"
+    backend: diffusers
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: intel
+    cuda-major-version: ""
+    cuda-minor-version: ""
+    platforms: linux/amd64
+    tag-latest: auto
+    tag-suffix: "-gpu-intel-ace-step"
+    runs-on: ubuntu-latest
+    base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04
+    skip-drivers: "false"
+    backend: ace-step
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2404"
+  - build-type: l4t
+    cuda-major-version: "12"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-vibevoice"
+    runs-on: ubuntu-24.04-arm
+    base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0
+    skip-drivers: "true"
+    backend: vibevoice
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2204"
+  - build-type: l4t
+    cuda-major-version: "12"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-qwen-asr"
+    runs-on: ubuntu-24.04-arm
+    base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0
+    skip-drivers: "true"
+    backend: qwen-asr
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2204"
+  - build-type: l4t
+    cuda-major-version: "12"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-qwen-tts"
+    runs-on: ubuntu-24.04-arm
+    base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0
+    skip-drivers: "true"
+    backend: qwen-tts
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2204"
+  - build-type: l4t
+    cuda-major-version: "12"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-fish-speech"
+    runs-on: ubuntu-24.04-arm
+    base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0
+    skip-drivers: "true"
+    backend: fish-speech
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2204"
+  - build-type: l4t
+    cuda-major-version: "12"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-faster-qwen3-tts"
+    runs-on: ubuntu-24.04-arm
+    base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0
+    skip-drivers: "true"
+    backend: faster-qwen3-tts
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2204"
+  - build-type: l4t
+    cuda-major-version: "12"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-pocket-tts"
+    runs-on: ubuntu-24.04-arm
+    base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0
+    skip-drivers: "true"
+    backend: pocket-tts
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2204"
+  - build-type: l4t
+    cuda-major-version: "12"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-kokoro"
+    runs-on: ubuntu-24.04-arm
+    base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0
+    skip-drivers: "true"
+    backend: kokoro
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2204"
+  - build-type: l4t
+    cuda-major-version: "12"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-mlx"
+    runs-on: ubuntu-24.04-arm
+    base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0
+    skip-drivers: "true"
+    backend: mlx
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2204"
+  - build-type: l4t
+    cuda-major-version: "12"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-mlx-vlm"
+    runs-on: ubuntu-24.04-arm
+    base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0
+    skip-drivers: "true"
+    backend: mlx-vlm
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2204"
+  - build-type: l4t
+    cuda-major-version: "12"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-mlx-audio"
+    runs-on: ubuntu-24.04-arm
+    base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0
+    skip-drivers: "true"
+    backend: mlx-audio
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2204"
+  - build-type: l4t
+    cuda-major-version: "12"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-mlx-distributed"
+    runs-on: ubuntu-24.04-arm
+    base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0
+    skip-drivers: "true"
+    backend: mlx-distributed
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2204"
+  - build-type: l4t
+    cuda-major-version: "12"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-whisperx"
+    runs-on: ubuntu-24.04-arm
+    base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0
+    skip-drivers: "true"
+    backend: whisperx
+    dockerfile: ./backend/Dockerfile.python
+    context: ./
+    ubuntu-version: "2204"
+  - build-type: l4t
+    cuda-major-version: "12"
+    cuda-minor-version: "0"
+    platforms: linux/arm64
+    tag-latest: auto
+    tag-suffix: "-nvidia-l4t-faster-whisper"
+    runs-on: ubuntu-24.04-arm
+    base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0
+    skip-drivers: "true"
backend: faster-whisper + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2204" + - build-type: intel + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-kokoro" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: kokoro + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: intel + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-faster-whisper" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: faster-whisper + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: intel + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-vibevoice" + runs-on: arc-runner-set + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: vibevoice + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: intel + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-qwen-asr" + runs-on: arc-runner-set + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: qwen-asr + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: intel + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-nemo" + runs-on: arc-runner-set + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: nemo + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: intel + cuda-major-version: "" + 
cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-qwen-tts" + runs-on: arc-runner-set + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: qwen-tts + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: intel + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-fish-speech" + runs-on: arc-runner-set + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: fish-speech + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: intel + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-voxcpm" + runs-on: arc-runner-set + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: voxcpm + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: intel + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-pocket-tts" + runs-on: arc-runner-set + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: pocket-tts + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: intel + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-coqui" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: coqui + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-piper" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + 
skip-drivers: "false" + backend: piper + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-llama-cpp" + runs-on: bigger-runner + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: llama-cpp + dockerfile: ./backend/Dockerfile.llama-cpp + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-turboquant" + runs-on: bigger-runner + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: turboquant + dockerfile: ./backend/Dockerfile.turboquant + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-ik-llama-cpp" + runs-on: bigger-runner + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: ik-llama-cpp + dockerfile: ./backend/Dockerfile.ik-llama-cpp + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "0" + platforms: linux/arm64 + skip-drivers: "false" + tag-latest: auto + tag-suffix: "-nvidia-l4t-arm64-llama-cpp" + base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0 + runs-on: ubuntu-24.04-arm + backend: llama-cpp + dockerfile: ./backend/Dockerfile.llama-cpp + context: ./ + ubuntu-version: "2204" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "0" + platforms: linux/arm64 + skip-drivers: "false" + tag-latest: auto + tag-suffix: "-nvidia-l4t-arm64-turboquant" + base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0 + runs-on: ubuntu-24.04-arm + backend: turboquant + dockerfile: ./backend/Dockerfile.turboquant + context: ./ + ubuntu-version: "2204" + - build-type: vulkan + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: 
auto + tag-suffix: "-gpu-vulkan-llama-cpp" + runs-on: bigger-runner + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: llama-cpp + dockerfile: ./backend/Dockerfile.llama-cpp + context: ./ + ubuntu-version: "2404" + - build-type: vulkan + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-gpu-vulkan-turboquant" + runs-on: bigger-runner + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: turboquant + dockerfile: ./backend/Dockerfile.turboquant + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-stablediffusion-ggml" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: stablediffusion-ggml + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-sam3-cpp" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: sam3-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: sycl_f32 + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-sycl-f32-sam3-cpp" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: sam3-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: sycl_f16 + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-sycl-f16-sam3-cpp" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: sam3-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - 
build-type: vulkan + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-gpu-vulkan-sam3-cpp" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: sam3-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: sycl_f32 + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-sycl-f32-stablediffusion-ggml" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: stablediffusion-ggml + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: sycl_f16 + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-sycl-f16-stablediffusion-ggml" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: stablediffusion-ggml + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: vulkan + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-gpu-vulkan-stablediffusion-ggml" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: stablediffusion-ggml + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "0" + platforms: linux/arm64 + skip-drivers: "false" + tag-latest: auto + tag-suffix: "-nvidia-l4t-arm64-stablediffusion-ggml" + base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0 + runs-on: ubuntu-24.04-arm + backend: stablediffusion-ggml + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2204" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "0" + platforms: linux/arm64 
+ skip-drivers: "false" + tag-latest: auto + tag-suffix: "-nvidia-l4t-arm64-sam3-cpp" + base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0 + runs-on: ubuntu-24.04-arm + backend: sam3-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2204" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-whisper" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: whisper + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: sycl_f32 + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-sycl-f32-whisper" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: whisper + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: sycl_f16 + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-sycl-f16-whisper" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: whisper + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: vulkan + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-gpu-vulkan-whisper" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: whisper + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "0" + platforms: linux/arm64 + skip-drivers: "false" + tag-latest: auto + tag-suffix: "-nvidia-l4t-arm64-whisper" + base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0 + runs-on: ubuntu-24.04-arm + backend: whisper + dockerfile: 
./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2204" + - build-type: hipblas + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-rocm-hipblas-whisper" + base-image: rocm/dev-ubuntu-24.04:7.2.1 + runs-on: ubuntu-latest + skip-drivers: "false" + backend: whisper + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-acestep-cpp" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: acestep-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: sycl_f32 + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-sycl-f32-acestep-cpp" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: acestep-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: sycl_f16 + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-sycl-f16-acestep-cpp" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: acestep-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: vulkan + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-gpu-vulkan-acestep-cpp" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: acestep-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "0" + platforms: linux/arm64 + skip-drivers: "false" + 
tag-latest: auto + tag-suffix: "-nvidia-l4t-arm64-acestep-cpp" + base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0 + runs-on: ubuntu-24.04-arm + backend: acestep-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2204" + - build-type: hipblas + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-rocm-hipblas-acestep-cpp" + base-image: rocm/dev-ubuntu-24.04:7.2.1 + runs-on: ubuntu-latest + skip-drivers: "false" + backend: acestep-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-qwen3-tts-cpp" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: qwen3-tts-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: sycl_f32 + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-sycl-f32-qwen3-tts-cpp" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: qwen3-tts-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: sycl_f16 + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-sycl-f16-qwen3-tts-cpp" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: qwen3-tts-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: vulkan + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-gpu-vulkan-qwen3-tts-cpp" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: qwen3-tts-cpp + 
dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "0" + platforms: linux/arm64 + skip-drivers: "false" + tag-latest: auto + tag-suffix: "-nvidia-l4t-arm64-qwen3-tts-cpp" + base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0 + runs-on: ubuntu-24.04-arm + backend: qwen3-tts-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2204" + - build-type: hipblas + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-rocm-hipblas-qwen3-tts-cpp" + base-image: rocm/dev-ubuntu-24.04:6.4.4 + runs-on: ubuntu-latest + skip-drivers: "false" + backend: qwen3-tts-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-vibevoice-cpp" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: vibevoice-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-localvqe" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: localvqe + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: sycl_f32 + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-sycl-f32-vibevoice-cpp" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: vibevoice-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: sycl_f16 + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + 
tag-suffix: "-gpu-intel-sycl-f16-vibevoice-cpp" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: vibevoice-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: vulkan + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-gpu-vulkan-vibevoice-cpp" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: vibevoice-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: vulkan + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-gpu-vulkan-localvqe" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: localvqe + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "0" + platforms: linux/arm64 + skip-drivers: "false" + tag-latest: auto + tag-suffix: "-nvidia-l4t-arm64-vibevoice-cpp" + base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0 + runs-on: ubuntu-24.04-arm + backend: vibevoice-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2204" + - build-type: hipblas + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-rocm-hipblas-vibevoice-cpp" + base-image: rocm/dev-ubuntu-24.04:6.4.4 + runs-on: ubuntu-latest + skip-drivers: "false" + backend: vibevoice-cpp + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-voxtral" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: voxtral + dockerfile: ./backend/Dockerfile.golang + 
context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-opus" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: opus + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-silero-vad" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: silero-vad + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-kokoros" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: kokoros + dockerfile: ./backend/Dockerfile.rust + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-local-store" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: local-store + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-rfdetr" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: rfdetr + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-insightface" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: insightface + dockerfile: ./backend/Dockerfile.python + context: ./ + 
ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-speaker-recognition" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: speaker-recognition + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: intel + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-intel-rfdetr" + runs-on: ubuntu-latest + base-image: intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04 + skip-drivers: "false" + backend: rfdetr + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: l4t + cuda-major-version: "12" + cuda-minor-version: "0" + platforms: linux/arm64 + skip-drivers: "true" + tag-latest: auto + tag-suffix: "-nvidia-l4t-arm64-rfdetr" + base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0 + runs-on: ubuntu-24.04-arm + backend: rfdetr + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2204" + - build-type: l4t + cuda-major-version: "12" + cuda-minor-version: "0" + platforms: linux/arm64 + skip-drivers: "true" + tag-latest: auto + tag-suffix: "-nvidia-l4t-arm64-chatterbox" + base-image: nvcr.io/nvidia/l4t-jetpack:r36.4.0 + runs-on: ubuntu-24.04-arm + backend: chatterbox + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2204" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-kitten-tts" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: kitten-tts + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-neutts" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + 
skip-drivers: "false" + backend: neutts + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: hipblas + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-rocm-hipblas-neutts" + runs-on: arc-runner-set + base-image: rocm/dev-ubuntu-24.04:7.2.1 + skip-drivers: "false" + backend: neutts + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-vibevoice" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: vibevoice + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-qwen-asr" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: qwen-asr + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-nemo" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: nemo + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-qwen-tts" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: qwen-tts + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-fish-speech" + runs-on: ubuntu-latest + base-image: 
ubuntu:24.04 + skip-drivers: "false" + backend: fish-speech + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-voxcpm" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: voxcpm + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-pocket-tts" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: pocket-tts + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-cpu-outetts" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "true" + backend: outetts + dockerfile: ./backend/Dockerfile.python + context: ./ + ubuntu-version: "2404" + - build-type: "" + cuda-major-version: "" + cuda-minor-version: "" + platforms: linux/amd64,linux/arm64 + tag-latest: auto + tag-suffix: "-cpu-sherpa-onnx" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: sherpa-onnx + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "12" + cuda-minor-version: "8" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-12-sherpa-onnx" + runs-on: ubuntu-latest + base-image: ubuntu:24.04 + skip-drivers: "false" + backend: sherpa-onnx + dockerfile: ./backend/Dockerfile.golang + context: ./ + ubuntu-version: "2404" + - build-type: cublas + cuda-major-version: "13" + cuda-minor-version: "0" + platforms: linux/amd64 + tag-latest: auto + tag-suffix: "-gpu-nvidia-cuda-13-sherpa-onnx" + runs-on: ubuntu-latest + 
+    base-image: ubuntu:24.04
+    skip-drivers: "false"
+    backend: sherpa-onnx
+    dockerfile: ./backend/Dockerfile.golang
+    context: ./
+    ubuntu-version: "2404"
+darwin:
+  - backend: diffusers
+    tag-suffix: "-metal-darwin-arm64-diffusers"
+    build-type: mps
+  - backend: ace-step
+    tag-suffix: "-metal-darwin-arm64-ace-step"
+    build-type: mps
+  - backend: mlx
+    tag-suffix: "-metal-darwin-arm64-mlx"
+    build-type: mps
+  - backend: chatterbox
+    tag-suffix: "-metal-darwin-arm64-chatterbox"
+    build-type: mps
+  - backend: mlx-vlm
+    tag-suffix: "-metal-darwin-arm64-mlx-vlm"
+    build-type: mps
+  - backend: mlx-audio
+    tag-suffix: "-metal-darwin-arm64-mlx-audio"
+    build-type: mps
+  - backend: mlx-distributed
+    tag-suffix: "-metal-darwin-arm64-mlx-distributed"
+    build-type: mps
+  - backend: stablediffusion-ggml
+    tag-suffix: "-metal-darwin-arm64-stablediffusion-ggml"
+    build-type: metal
+    lang: go
+  - backend: whisper
+    tag-suffix: "-metal-darwin-arm64-whisper"
+    build-type: metal
+    lang: go
+  - backend: acestep-cpp
+    tag-suffix: "-metal-darwin-arm64-acestep-cpp"
+    build-type: metal
+    lang: go
+  - backend: qwen3-tts-cpp
+    tag-suffix: "-metal-darwin-arm64-qwen3-tts-cpp"
+    build-type: metal
+    lang: go
+  - backend: vibevoice-cpp
+    tag-suffix: "-metal-darwin-arm64-vibevoice-cpp"
+    build-type: metal
+    lang: go
+  - backend: voxtral
+    tag-suffix: "-metal-darwin-arm64-voxtral"
+    build-type: metal
+    lang: go
+  - backend: vibevoice
+    tag-suffix: "-metal-darwin-arm64-vibevoice"
+    build-type: mps
+  - backend: qwen-asr
+    tag-suffix: "-metal-darwin-arm64-qwen-asr"
+    build-type: mps
+  - backend: nemo
+    tag-suffix: "-metal-darwin-arm64-nemo"
+    build-type: mps
+  - backend: qwen-tts
+    tag-suffix: "-metal-darwin-arm64-qwen-tts"
+    build-type: mps
+  - backend: fish-speech
+    tag-suffix: "-metal-darwin-arm64-fish-speech"
+    build-type: mps
+  - backend: voxcpm
+    tag-suffix: "-metal-darwin-arm64-voxcpm"
+    build-type: mps
+  - backend: pocket-tts
+    tag-suffix: "-metal-darwin-arm64-pocket-tts"
+    build-type: mps
+  - backend: moonshine
+    tag-suffix: "-metal-darwin-arm64-moonshine"
+    build-type: mps
+  - backend: whisperx
+    tag-suffix: "-metal-darwin-arm64-whisperx"
+    build-type: mps
+  - backend: rerankers
+    tag-suffix: "-metal-darwin-arm64-rerankers"
+    build-type: mps
+  - backend: transformers
+    tag-suffix: "-metal-darwin-arm64-transformers"
+    build-type: mps
+  - backend: kokoro
+    tag-suffix: "-metal-darwin-arm64-kokoro"
+    build-type: mps
+  - backend: faster-whisper
+    tag-suffix: "-metal-darwin-arm64-faster-whisper"
+    build-type: mps
+  - backend: coqui
+    tag-suffix: "-metal-darwin-arm64-coqui"
+    build-type: mps
+  - backend: rfdetr
+    tag-suffix: "-metal-darwin-arm64-rfdetr"
+    build-type: mps
+  - backend: kitten-tts
+    tag-suffix: "-metal-darwin-arm64-kitten-tts"
+    build-type: mps
+  - backend: piper
+    tag-suffix: "-metal-darwin-arm64-piper"
+    build-type: metal
+    lang: go
+  - backend: opus
+    tag-suffix: "-metal-darwin-arm64-opus"
+    build-type: metal
+    lang: go
+  - backend: silero-vad
+    tag-suffix: "-metal-darwin-arm64-silero-vad"
+    build-type: metal
+    lang: go
+  - backend: local-store
+    tag-suffix: "-metal-darwin-arm64-local-store"
+    build-type: metal
+    lang: go
+  - backend: llama-cpp-quantization
+    tag-suffix: "-metal-darwin-arm64-llama-cpp-quantization"
+    build-type: mps
diff --git a/.github/workflows/backend.yml b/.github/workflows/backend.yml
index 2242be0f79ba..d15604ea6312 100644
--- a/.github/workflows/backend.yml
+++ b/.github/workflows/backend.yml
@@ -13,8 +13,55 @@ concurrency:
   cancel-in-progress: true
 jobs:
-  backend-jobs:
+  derive-bases:
     if: github.repository == 'mudler/LocalAI'
+    runs-on: ubuntu-latest
+    outputs:
+      matrix: ${{ steps.derive.outputs.matrix }}
+      matrix-darwin: ${{ steps.derive.outputs.matrix-darwin }}
+      bases-matrix: ${{ steps.derive.outputs.bases-matrix }}
+      has-backends: ${{ steps.derive.outputs.has-backends }}
+      has-backends-darwin: ${{ steps.derive.outputs.has-backends-darwin }}
+      has-bases: ${{ steps.derive.outputs.has-bases }}
+    steps:
+      - uses: actions/checkout@v6
+      - uses: oven-sh/setup-bun@v2
+      - run: |
+          bun add js-yaml
+          bun add @octokit/core
+      - id: derive
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          GITHUB_EVENT_PATH: ${{ github.event_path }}
+        run: bun run scripts/changed-backends.js
+
+  build-bases:
+    needs: derive-bases
+    if: needs.derive-bases.outputs.has-bases == 'true'
+    strategy:
+      fail-fast: false
+      matrix: ${{ fromJSON(needs.derive-bases.outputs.bases-matrix) }}
+    uses: ./.github/workflows/base_images.yml
+    with:
+      lang: ${{ matrix.lang }}
+      base-image: ${{ matrix.base-image }}
+      build-type: ${{ matrix.build-type }}
+      cuda-major-version: ${{ matrix.cuda-major-version }}
+      cuda-minor-version: ${{ matrix.cuda-minor-version }}
+      ubuntu-version: ${{ matrix.ubuntu-version }}
+      platforms: ${{ matrix.platforms }}
+      runs-on: ${{ matrix.runs-on }}
+      tag-stem: ${{ matrix.tag-stem }}
+      skip-drivers: ${{ matrix.skip-drivers }}
+    secrets:
+      quayUsername: ${{ secrets.LOCALAI_REGISTRY_USERNAME }}
+      quayPassword: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }}
+
+  backend-jobs:
+    if: |
+      always() && github.repository == 'mudler/LocalAI' &&
+      (needs.build-bases.result == 'success' || needs.build-bases.result == 'skipped')
+    needs: [derive-bases, build-bases]
     uses: ./.github/workflows/backend_build.yml
     with:
       tag-latest: ${{ matrix.tag-latest }}
@@ -31,6 +78,10 @@ jobs:
       context: ${{ matrix.context }}
       ubuntu-version: ${{ matrix.ubuntu-version }}
       amdgpu-targets: ${{ matrix.amdgpu-targets || 'gfx908,gfx90a,gfx942,gfx950,gfx1030,gfx1100,gfx1101,gfx1102,gfx1151,gfx1200,gfx1201' }}
+      # Set by scripts/changed-backends.js for langs that have a
+      # .docker/bases/Dockerfile. recipe; '' otherwise (those run
+      # the inline bootstrap in their own Dockerfile).
+ base-image-prebuilt: ${{ matrix.base-image-prebuilt || '' }} secrets: dockerUsername: ${{ secrets.DOCKERHUB_USERNAME }} dockerPassword: ${{ secrets.DOCKERHUB_PASSWORD }} @@ -38,3214 +89,14 @@ jobs: quayPassword: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }} strategy: fail-fast: false - #max-parallel: ${{ github.event_name != 'pull_request' && 6 || 4 }} - matrix: - include: - - build-type: 'l4t' - cuda-major-version: "12" - cuda-minor-version: "0" - platforms: 'linux/arm64' - tag-latest: 'auto' - tag-suffix: '-nvidia-l4t-diffusers' - runs-on: 'ubuntu-24.04-arm' - base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0" - skip-drivers: 'true' - backend: "diffusers" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2204' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-cpu-vllm' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "vllm" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-cpu-sglang' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "sglang" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-cpu-diffusers' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "diffusers" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-cpu-chatterbox' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "chatterbox" - 
dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-cpu-moonshine' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "moonshine" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - # tinygrad ships a single image — its CPU device uses bundled - # libLLVM, and its CUDA / HIP / Metal devices dlopen the host - # driver libraries at runtime via tinygrad's ctypes autogen - # wrappers. There is no toolkit-version split because tinygrad - # generates kernels itself (PTX renderer for CUDA) and never - # links against cuDNN/cuBLAS/torch. - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-tinygrad' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "tinygrad" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-whisperx' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "whisperx" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-faster-whisper' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "faster-whisper" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-cpu-ace-step' - runs-on: 'ubuntu-latest' - base-image: 
"ubuntu:24.04" - skip-drivers: 'true' - backend: "ace-step" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-cpu-trl' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "trl" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-llama-cpp-quantization' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "llama-cpp-quantization" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-cpu-mlx' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "mlx" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-cpu-mlx-vlm' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "mlx-vlm" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-cpu-mlx-audio' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "mlx-audio" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-cpu-mlx-distributed' - runs-on: 
'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "mlx-distributed" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - # CUDA 12 builds - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-vibevoice' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "vibevoice" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-qwen-asr' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "qwen-asr" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-nemo' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "nemo" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-qwen-tts' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "qwen-tts" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-fish-speech' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "fish-speech" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - 
build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-faster-qwen3-tts' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "faster-qwen3-tts" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-voxcpm' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "voxcpm" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-pocket-tts' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "pocket-tts" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-rerankers' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "rerankers" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-llama-cpp' - runs-on: 'bigger-runner' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "llama-cpp" - dockerfile: "./backend/Dockerfile.llama-cpp" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-turboquant' - runs-on: 
'bigger-runner' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "turboquant" - dockerfile: "./backend/Dockerfile.turboquant" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-vllm' - runs-on: 'arc-runner-set' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "vllm" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-vllm-omni' - runs-on: 'arc-runner-set' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "vllm-omni" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-sglang' - runs-on: 'arc-runner-set' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "sglang" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-transformers' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "transformers" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-diffusers' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "diffusers" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' 
- cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-ace-step' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "ace-step" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-trl' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "trl" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-kokoro' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "kokoro" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-faster-whisper' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "faster-whisper" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-whisperx' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "whisperx" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "9" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-coqui' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - 
skip-drivers: 'false' - backend: "coqui" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-outetts' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "outetts" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-chatterbox' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "chatterbox" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-moonshine' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "moonshine" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-mlx' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "mlx" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-mlx-vlm' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "mlx-vlm" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 
'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-mlx-audio' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "mlx-audio" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-mlx-distributed' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "mlx-distributed" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-stablediffusion-ggml' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "stablediffusion-ggml" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-sam3-cpp' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "sam3-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-whisper' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "whisper" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-acestep-cpp' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - 
backend: "acestep-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-qwen3-tts-cpp' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "qwen3-tts-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-vibevoice-cpp' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "vibevoice-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-rfdetr' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "rfdetr" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-insightface' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "insightface" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-speaker-recognition' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "speaker-recognition" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - 
cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-neutts' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "neutts" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - # cuda 13 - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-rerankers' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "rerankers" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-vibevoice' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "vibevoice" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-qwen-asr' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "qwen-asr" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-nemo' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "nemo" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-qwen-tts' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - 
backend: "qwen-tts" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-fish-speech' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "fish-speech" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-faster-qwen3-tts' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "faster-qwen3-tts" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-voxcpm' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "voxcpm" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-pocket-tts' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "pocket-tts" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-llama-cpp' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "llama-cpp" - dockerfile: "./backend/Dockerfile.llama-cpp" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" 
- platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-turboquant' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "turboquant" - dockerfile: "./backend/Dockerfile.turboquant" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/arm64' - skip-drivers: 'false' - tag-latest: 'auto' - tag-suffix: '-nvidia-l4t-cuda-13-arm64-llama-cpp' - base-image: "ubuntu:24.04" - runs-on: 'ubuntu-24.04-arm' - ubuntu-version: '2404' - backend: "llama-cpp" - dockerfile: "./backend/Dockerfile.llama-cpp" - context: "./" - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/arm64' - skip-drivers: 'false' - tag-latest: 'auto' - tag-suffix: '-nvidia-l4t-cuda-13-arm64-turboquant' - base-image: "ubuntu:24.04" - runs-on: 'ubuntu-24.04-arm' - ubuntu-version: '2404' - backend: "turboquant" - dockerfile: "./backend/Dockerfile.turboquant" - context: "./" - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-vllm' - runs-on: 'arc-runner-set' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "vllm" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-vllm-omni' - runs-on: 'arc-runner-set' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "vllm-omni" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-transformers' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 
'false' - backend: "transformers" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-diffusers' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "diffusers" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-ace-step' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "ace-step" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-13-trl' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "trl" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'l4t' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/arm64' - tag-latest: 'auto' - tag-suffix: '-nvidia-l4t-cuda-13-arm64-vibevoice' - runs-on: 'ubuntu-24.04-arm' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - ubuntu-version: '2404' - backend: "vibevoice" - dockerfile: "./backend/Dockerfile.python" - context: "./" - - build-type: 'l4t' - cuda-major-version: "13" - cuda-minor-version: "0" - platforms: 'linux/arm64' - tag-latest: 'auto' - tag-suffix: '-nvidia-l4t-cuda-13-arm64-qwen-asr' - runs-on: 'ubuntu-24.04-arm' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - ubuntu-version: '2404' - backend: "qwen-asr" - dockerfile: "./backend/Dockerfile.python" - context: "./" - - build-type: 'l4t' - cuda-major-version: "13" - cuda-minor-version: "0" - 
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-qwen-tts'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "qwen-tts"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'l4t'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-fish-speech'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "fish-speech"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'l4t'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-faster-qwen3-tts'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "faster-qwen3-tts"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'l4t'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-pocket-tts'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "pocket-tts"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'l4t'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-chatterbox'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "chatterbox"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'l4t'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-diffusers'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "diffusers"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'l4t'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-vllm'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "vllm"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'l4t'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-vllm-omni'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "vllm-omni"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'l4t'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-sglang'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "sglang"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'l4t'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-mlx'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "mlx"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'l4t'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-mlx-vlm'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "mlx-vlm"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'l4t'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-mlx-audio'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "mlx-audio"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'l4t'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-mlx-distributed'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "mlx-distributed"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'l4t'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-whisperx'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "whisperx"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'l4t'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-faster-whisper'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  ubuntu-version: '2404'
  backend: "faster-whisper"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-kokoro'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "kokoro"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-faster-whisper'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "faster-whisper"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-whisperx'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "whisperx"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-chatterbox'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "chatterbox"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-moonshine'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "moonshine"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-mlx'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "mlx"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-mlx-vlm'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "mlx-vlm"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-mlx-audio'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "mlx-audio"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-mlx-distributed'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "mlx-distributed"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-stablediffusion-ggml'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "stablediffusion-ggml"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  skip-drivers: 'false'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-stablediffusion-ggml'
  base-image: "ubuntu:24.04"
  ubuntu-version: '2404'
  runs-on: 'ubuntu-24.04-arm'
  backend: "stablediffusion-ggml"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-sam3-cpp'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "sam3-cpp"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  skip-drivers: 'false'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-sam3-cpp'
  base-image: "ubuntu:24.04"
  ubuntu-version: '2404'
  runs-on: 'ubuntu-24.04-arm'
  backend: "sam3-cpp"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-whisper'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "whisper"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  skip-drivers: 'false'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-whisper'
  base-image: "ubuntu:24.04"
  ubuntu-version: '2404'
  runs-on: 'ubuntu-24.04-arm'
  backend: "whisper"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-acestep-cpp'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "acestep-cpp"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-qwen3-tts-cpp'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "qwen3-tts-cpp"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-vibevoice-cpp'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "vibevoice-cpp"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  skip-drivers: 'false'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-acestep-cpp'
  base-image: "ubuntu:24.04"
  ubuntu-version: '2404'
  runs-on: 'ubuntu-24.04-arm'
  backend: "acestep-cpp"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  skip-drivers: 'false'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-qwen3-tts-cpp'
  base-image: "ubuntu:24.04"
  ubuntu-version: '2404'
  runs-on: 'ubuntu-24.04-arm'
  backend: "qwen3-tts-cpp"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  skip-drivers: 'false'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-cuda-13-arm64-vibevoice-cpp'
  base-image: "ubuntu:24.04"
  ubuntu-version: '2404'
  runs-on: 'ubuntu-24.04-arm'
  backend: "vibevoice-cpp"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
- build-type: 'cublas'
  cuda-major-version: "13"
  cuda-minor-version: "0"
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-nvidia-cuda-13-rfdetr'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "rfdetr"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
# hipblas builds
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-rerankers'
  runs-on: 'ubuntu-latest'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "rerankers"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-llama-cpp'
  runs-on: 'ubuntu-latest'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "llama-cpp"
  dockerfile: "./backend/Dockerfile.llama-cpp"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-turboquant'
  runs-on: 'ubuntu-latest'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "turboquant"
  dockerfile: "./backend/Dockerfile.turboquant"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-vllm'
  runs-on: 'arc-runner-set'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "vllm"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-vllm-omni'
  runs-on: 'arc-runner-set'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "vllm-omni"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-sglang'
  runs-on: 'arc-runner-set'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "sglang"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-transformers'
  runs-on: 'arc-runner-set'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "transformers"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-diffusers'
  runs-on: 'arc-runner-set'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "diffusers"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-ace-step'
  runs-on: 'arc-runner-set'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "ace-step"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
# ROCm additional backends
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-kokoro'
  runs-on: 'arc-runner-set'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "kokoro"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-vibevoice'
  runs-on: 'arc-runner-set'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "vibevoice"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-qwen-asr'
  runs-on: 'arc-runner-set'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "qwen-asr"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-nemo'
  runs-on: 'arc-runner-set'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "nemo"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-qwen-tts'
  runs-on: 'arc-runner-set'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "qwen-tts"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-fish-speech'
  runs-on: 'arc-runner-set'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "fish-speech"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-voxcpm'
  runs-on: 'arc-runner-set'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "voxcpm"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-pocket-tts'
  runs-on: 'arc-runner-set'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "pocket-tts"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-faster-whisper'
  runs-on: 'bigger-runner'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "faster-whisper"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'hipblas'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-rocm-hipblas-coqui'
  runs-on: 'bigger-runner'
  base-image: "rocm/dev-ubuntu-24.04:7.2.1"
  skip-drivers: 'false'
  backend: "coqui"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
# sycl builds
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-rerankers'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.2-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "rerankers"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'sycl_f32'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-sycl-f32-llama-cpp'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.2-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "llama-cpp"
  dockerfile: "./backend/Dockerfile.llama-cpp"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'sycl_f32'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-sycl-f32-turboquant'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "turboquant"
  dockerfile: "./backend/Dockerfile.turboquant"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'sycl_f16'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-sycl-f16-llama-cpp'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "llama-cpp"
  dockerfile: "./backend/Dockerfile.llama-cpp"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'sycl_f16'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-sycl-f16-turboquant'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "turboquant"
  dockerfile: "./backend/Dockerfile.turboquant"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-vllm'
  runs-on: 'arc-runner-set'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "vllm"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-sglang'
  runs-on: 'arc-runner-set'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "sglang"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-transformers'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "transformers"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-diffusers'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "diffusers"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-ace-step'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "ace-step"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'l4t'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-vibevoice'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  skip-drivers: 'true'
  backend: "vibevoice"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'l4t'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-qwen-asr'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  skip-drivers: 'true'
  backend: "qwen-asr"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'l4t'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-qwen-tts'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  skip-drivers: 'true'
  backend: "qwen-tts"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'l4t'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-fish-speech'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  skip-drivers: 'true'
  backend: "fish-speech"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'l4t'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-faster-qwen3-tts'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  skip-drivers: 'true'
  backend: "faster-qwen3-tts"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'l4t'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-pocket-tts'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  skip-drivers: 'true'
  backend: "pocket-tts"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'l4t'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-kokoro'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  skip-drivers: 'true'
  backend: "kokoro"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'l4t'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-mlx'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  skip-drivers: 'true'
  backend: "mlx"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'l4t'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-mlx-vlm'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  skip-drivers: 'true'
  backend: "mlx-vlm"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'l4t'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-mlx-audio'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  skip-drivers: 'true'
  backend: "mlx-audio"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'l4t'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-mlx-distributed'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  skip-drivers: 'true'
  backend: "mlx-distributed"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'l4t'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-whisperx'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  skip-drivers: 'true'
  backend: "whisperx"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'l4t'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-faster-whisper'
  runs-on: 'ubuntu-24.04-arm'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  skip-drivers: 'true'
  backend: "faster-whisper"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2204'
# SYCL additional backends
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-kokoro'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "kokoro"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-faster-whisper'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "faster-whisper"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-vibevoice'
  runs-on: 'arc-runner-set'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "vibevoice"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-qwen-asr'
  runs-on: 'arc-runner-set'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "qwen-asr"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-nemo'
  runs-on: 'arc-runner-set'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "nemo"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-qwen-tts'
  runs-on: 'arc-runner-set'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "qwen-tts"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-fish-speech'
  runs-on: 'arc-runner-set'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "fish-speech"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-voxcpm'
  runs-on: 'arc-runner-set'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "voxcpm"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-pocket-tts'
  runs-on: 'arc-runner-set'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "pocket-tts"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'intel'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-coqui'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "coqui"
  dockerfile: "./backend/Dockerfile.python"
  context: "./"
  ubuntu-version: '2404'
# piper
- build-type: ''
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64,linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-piper'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "piper"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
- build-type: ''
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64,linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-cpu-llama-cpp'
  runs-on: 'bigger-runner'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "llama-cpp"
  dockerfile: "./backend/Dockerfile.llama-cpp"
  context: "./"
  ubuntu-version: '2404'
- build-type: ''
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64,linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-cpu-turboquant'
  runs-on: 'bigger-runner'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "turboquant"
  dockerfile: "./backend/Dockerfile.turboquant"
  context: "./"
  ubuntu-version: '2404'
- build-type: ''
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-cpu-ik-llama-cpp'
  runs-on: 'bigger-runner'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "ik-llama-cpp"
  dockerfile: "./backend/Dockerfile.ik-llama-cpp"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  skip-drivers: 'false'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-arm64-llama-cpp'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  runs-on: 'ubuntu-24.04-arm'
  backend: "llama-cpp"
  dockerfile: "./backend/Dockerfile.llama-cpp"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'cublas'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  skip-drivers: 'false'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-arm64-turboquant'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  runs-on: 'ubuntu-24.04-arm'
  backend: "turboquant"
  dockerfile: "./backend/Dockerfile.turboquant"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'vulkan'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64,linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-vulkan-llama-cpp'
  runs-on: 'bigger-runner'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "llama-cpp"
  dockerfile: "./backend/Dockerfile.llama-cpp"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'vulkan'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64,linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-vulkan-turboquant'
  runs-on: 'bigger-runner'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "turboquant"
  dockerfile: "./backend/Dockerfile.turboquant"
  context: "./"
  ubuntu-version: '2404'
# Stablediffusion-ggml
- build-type: ''
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-cpu-stablediffusion-ggml'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "stablediffusion-ggml"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
# sam3-cpp
- build-type: ''
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-cpu-sam3-cpp'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "sam3-cpp"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'sycl_f32'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-sycl-f32-sam3-cpp'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "sam3-cpp"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'sycl_f16'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-sycl-f16-sam3-cpp'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "sam3-cpp"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'vulkan'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64,linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-vulkan-sam3-cpp'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "sam3-cpp"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'sycl_f32'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-sycl-f32-stablediffusion-ggml'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "stablediffusion-ggml"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'sycl_f16'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-intel-sycl-f16-stablediffusion-ggml'
  runs-on: 'ubuntu-latest'
  base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04"
  skip-drivers: 'false'
  backend: "stablediffusion-ggml"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'vulkan'
  cuda-major-version: ""
  cuda-minor-version: ""
  platforms: 'linux/amd64,linux/arm64'
  tag-latest: 'auto'
  tag-suffix: '-gpu-vulkan-stablediffusion-ggml'
  runs-on: 'ubuntu-latest'
  base-image: "ubuntu:24.04"
  skip-drivers: 'false'
  backend: "stablediffusion-ggml"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2404'
- build-type: 'cublas'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  skip-drivers: 'false'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-arm64-stablediffusion-ggml'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
  runs-on: 'ubuntu-24.04-arm'
  backend: "stablediffusion-ggml"
  dockerfile: "./backend/Dockerfile.golang"
  context: "./"
  ubuntu-version: '2204'
- build-type: 'cublas'
  cuda-major-version: "12"
  cuda-minor-version: "0"
  platforms: 'linux/arm64'
  skip-drivers: 'false'
  tag-latest: 'auto'
  tag-suffix: '-nvidia-l4t-arm64-sam3-cpp'
  base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
runs-on: 'ubuntu-24.04-arm' - backend: "sam3-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2204' - # whisper - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-whisper' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "whisper" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'sycl_f32' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-intel-sycl-f32-whisper' - runs-on: 'ubuntu-latest' - base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04" - skip-drivers: 'false' - backend: "whisper" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'sycl_f16' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-intel-sycl-f16-whisper' - runs-on: 'ubuntu-latest' - base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04" - skip-drivers: 'false' - backend: "whisper" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'vulkan' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-gpu-vulkan-whisper' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "whisper" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "0" - platforms: 'linux/arm64' - skip-drivers: 'false' - tag-latest: 'auto' - tag-suffix: '-nvidia-l4t-arm64-whisper' - base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0" - runs-on: 'ubuntu-24.04-arm' - backend: "whisper" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - 
ubuntu-version: '2204' - - build-type: 'hipblas' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-rocm-hipblas-whisper' - base-image: "rocm/dev-ubuntu-24.04:7.2.1" - runs-on: 'ubuntu-latest' - skip-drivers: 'false' - backend: "whisper" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - # acestep-cpp - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-acestep-cpp' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "acestep-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'sycl_f32' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-intel-sycl-f32-acestep-cpp' - runs-on: 'ubuntu-latest' - base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04" - skip-drivers: 'false' - backend: "acestep-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'sycl_f16' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-intel-sycl-f16-acestep-cpp' - runs-on: 'ubuntu-latest' - base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04" - skip-drivers: 'false' - backend: "acestep-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'vulkan' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-gpu-vulkan-acestep-cpp' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "acestep-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: 
"0" - platforms: 'linux/arm64' - skip-drivers: 'false' - tag-latest: 'auto' - tag-suffix: '-nvidia-l4t-arm64-acestep-cpp' - base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0" - runs-on: 'ubuntu-24.04-arm' - backend: "acestep-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2204' - - build-type: 'hipblas' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-rocm-hipblas-acestep-cpp' - base-image: "rocm/dev-ubuntu-24.04:7.2.1" - runs-on: 'ubuntu-latest' - skip-drivers: 'false' - backend: "acestep-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - # qwen3-tts-cpp - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-qwen3-tts-cpp' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "qwen3-tts-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'sycl_f32' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-intel-sycl-f32-qwen3-tts-cpp' - runs-on: 'ubuntu-latest' - base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04" - skip-drivers: 'false' - backend: "qwen3-tts-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'sycl_f16' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-intel-sycl-f16-qwen3-tts-cpp' - runs-on: 'ubuntu-latest' - base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04" - skip-drivers: 'false' - backend: "qwen3-tts-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'vulkan' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 
'auto' - tag-suffix: '-gpu-vulkan-qwen3-tts-cpp' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "qwen3-tts-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "0" - platforms: 'linux/arm64' - skip-drivers: 'false' - tag-latest: 'auto' - tag-suffix: '-nvidia-l4t-arm64-qwen3-tts-cpp' - base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0" - runs-on: 'ubuntu-24.04-arm' - backend: "qwen3-tts-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2204' - - build-type: 'hipblas' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-rocm-hipblas-qwen3-tts-cpp' - base-image: "rocm/dev-ubuntu-24.04:6.4.4" - runs-on: 'ubuntu-latest' - skip-drivers: 'false' - backend: "qwen3-tts-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - # vibevoice-cpp - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-vibevoice-cpp' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "vibevoice-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-localvqe' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "localvqe" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'sycl_f32' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-intel-sycl-f32-vibevoice-cpp' - runs-on: 'ubuntu-latest' - base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04" 
- skip-drivers: 'false' - backend: "vibevoice-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'sycl_f16' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-intel-sycl-f16-vibevoice-cpp' - runs-on: 'ubuntu-latest' - base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04" - skip-drivers: 'false' - backend: "vibevoice-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'vulkan' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-gpu-vulkan-vibevoice-cpp' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "vibevoice-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'vulkan' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-gpu-vulkan-localvqe' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "localvqe" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "0" - platforms: 'linux/arm64' - skip-drivers: 'false' - tag-latest: 'auto' - tag-suffix: '-nvidia-l4t-arm64-vibevoice-cpp' - base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0" - runs-on: 'ubuntu-24.04-arm' - backend: "vibevoice-cpp" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2204' - - build-type: 'hipblas' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-rocm-hipblas-vibevoice-cpp' - base-image: "rocm/dev-ubuntu-24.04:6.4.4" - runs-on: 'ubuntu-latest' - skip-drivers: 'false' - backend: "vibevoice-cpp" - dockerfile: "./backend/Dockerfile.golang" - 
context: "./" - ubuntu-version: '2404' - # voxtral - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-voxtral' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "voxtral" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - #opus - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-opus' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "opus" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - #silero-vad - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-silero-vad' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "silero-vad" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - # kokoros (Rust TTS) - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-cpu-kokoros' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "kokoros" - dockerfile: "./backend/Dockerfile.rust" - context: "./" - ubuntu-version: '2404' - # local-store - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-local-store' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "local-store" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - # rfdetr - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-rfdetr' - 
runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "rfdetr" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - # insightface (face recognition) - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-insightface' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "insightface" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - # speaker-recognition (voice/speaker biometrics) - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-speaker-recognition' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "speaker-recognition" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'intel' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-intel-rfdetr' - runs-on: 'ubuntu-latest' - base-image: "intel/oneapi-basekit:2025.3.0-0-devel-ubuntu24.04" - skip-drivers: 'false' - backend: "rfdetr" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'l4t' - cuda-major-version: "12" - cuda-minor-version: "0" - platforms: 'linux/arm64' - skip-drivers: 'true' - tag-latest: 'auto' - tag-suffix: '-nvidia-l4t-arm64-rfdetr' - base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0" - runs-on: 'ubuntu-24.04-arm' - backend: "rfdetr" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2204' - - build-type: 'l4t' - cuda-major-version: "12" - cuda-minor-version: "0" - platforms: 'linux/arm64' - skip-drivers: 'true' - tag-latest: 'auto' - tag-suffix: '-nvidia-l4t-arm64-chatterbox' - base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0" - runs-on: 
'ubuntu-24.04-arm' - backend: "chatterbox" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2204' - # runs out of space on the runner - # - build-type: 'hipblas' - # cuda-major-version: "" - # cuda-minor-version: "" - # platforms: 'linux/amd64' - # tag-latest: 'auto' - # tag-suffix: '-gpu-hipblas-rfdetr' - # base-image: "rocm/dev-ubuntu-24.04:7.2.1" - # runs-on: 'ubuntu-latest' - # skip-drivers: 'false' - # backend: "rfdetr" - # dockerfile: "./backend/Dockerfile.python" - # context: "./" - # kitten-tts - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-kitten-tts' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "kitten-tts" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - # neutts - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-neutts' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "neutts" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: 'hipblas' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-rocm-hipblas-neutts' - runs-on: 'arc-runner-set' - base-image: "rocm/dev-ubuntu-24.04:7.2.1" - skip-drivers: 'false' - backend: "neutts" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-vibevoice' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "vibevoice" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - 
cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-qwen-asr' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "qwen-asr" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-nemo' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "nemo" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-qwen-tts' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "qwen-tts" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-fish-speech' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "fish-speech" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-cpu-voxcpm' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "voxcpm" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-pocket-tts' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "pocket-tts" - dockerfile: "./backend/Dockerfile.python" - context: "./" - 
ubuntu-version: '2404' - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-cpu-outetts' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'true' - backend: "outetts" - dockerfile: "./backend/Dockerfile.python" - context: "./" - ubuntu-version: '2404' - # sherpa-onnx CPU - - build-type: '' - cuda-major-version: "" - cuda-minor-version: "" - platforms: 'linux/amd64,linux/arm64' - tag-latest: 'auto' - tag-suffix: '-cpu-sherpa-onnx' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "sherpa-onnx" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - # sherpa-onnx CUDA 12 - - build-type: 'cublas' - cuda-major-version: "12" - cuda-minor-version: "8" - platforms: 'linux/amd64' - tag-latest: 'auto' - tag-suffix: '-gpu-nvidia-cuda-12-sherpa-onnx' - runs-on: 'ubuntu-latest' - base-image: "ubuntu:24.04" - skip-drivers: 'false' - backend: "sherpa-onnx" - dockerfile: "./backend/Dockerfile.golang" - context: "./" - ubuntu-version: '2404' - # sherpa-onnx CUDA 13 — requires onnxruntime 1.24.x+ for the - # gpu_cuda13 tarball; sherpa-onnx SHERPA_COMMIT pins to v1.12.39. 
-        - build-type: 'cublas'
-          cuda-major-version: "13"
-          cuda-minor-version: "0"
-          platforms: 'linux/amd64'
-          tag-latest: 'auto'
-          tag-suffix: '-gpu-nvidia-cuda-13-sherpa-onnx'
-          runs-on: 'ubuntu-latest'
-          base-image: "ubuntu:24.04"
-          skip-drivers: 'false'
-          backend: "sherpa-onnx"
-          dockerfile: "./backend/Dockerfile.golang"
-          context: "./"
-          ubuntu-version: '2404'
+      matrix: ${{ fromJSON(needs.derive-bases.outputs.matrix) }}

   backend-jobs-darwin:
+    if: github.repository == 'mudler/LocalAI'
+    needs: derive-bases
     uses: ./.github/workflows/backend_build_darwin.yml
     strategy:
-      matrix:
-        include:
-          - backend: "diffusers"
-            tag-suffix: "-metal-darwin-arm64-diffusers"
-            build-type: "mps"
-          - backend: "ace-step"
-            tag-suffix: "-metal-darwin-arm64-ace-step"
-            build-type: "mps"
-          - backend: "mlx"
-            tag-suffix: "-metal-darwin-arm64-mlx"
-            build-type: "mps"
-          - backend: "chatterbox"
-            tag-suffix: "-metal-darwin-arm64-chatterbox"
-            build-type: "mps"
-          - backend: "mlx-vlm"
-            tag-suffix: "-metal-darwin-arm64-mlx-vlm"
-            build-type: "mps"
-          - backend: "mlx-audio"
-            tag-suffix: "-metal-darwin-arm64-mlx-audio"
-            build-type: "mps"
-          - backend: "mlx-distributed"
-            tag-suffix: "-metal-darwin-arm64-mlx-distributed"
-            build-type: "mps"
-          - backend: "stablediffusion-ggml"
-            tag-suffix: "-metal-darwin-arm64-stablediffusion-ggml"
-            build-type: "metal"
-            lang: "go"
-          - backend: "whisper"
-            tag-suffix: "-metal-darwin-arm64-whisper"
-            build-type: "metal"
-            lang: "go"
-          - backend: "acestep-cpp"
-            tag-suffix: "-metal-darwin-arm64-acestep-cpp"
-            build-type: "metal"
-            lang: "go"
-          - backend: "qwen3-tts-cpp"
-            tag-suffix: "-metal-darwin-arm64-qwen3-tts-cpp"
-            build-type: "metal"
-            lang: "go"
-          - backend: "vibevoice-cpp"
-            tag-suffix: "-metal-darwin-arm64-vibevoice-cpp"
-            build-type: "metal"
-            lang: "go"
-          - backend: "voxtral"
-            tag-suffix: "-metal-darwin-arm64-voxtral"
-            build-type: "metal"
-            lang: "go"
-          - backend: "vibevoice"
-            tag-suffix: "-metal-darwin-arm64-vibevoice"
-            build-type: "mps"
-          - backend: "qwen-asr"
-            tag-suffix: "-metal-darwin-arm64-qwen-asr"
-            build-type: "mps"
-          - backend: "nemo"
-            tag-suffix: "-metal-darwin-arm64-nemo"
-            build-type: "mps"
-          - backend: "qwen-tts"
-            tag-suffix: "-metal-darwin-arm64-qwen-tts"
-            build-type: "mps"
-          - backend: "fish-speech"
-            tag-suffix: "-metal-darwin-arm64-fish-speech"
-            build-type: "mps"
-          - backend: "voxcpm"
-            tag-suffix: "-metal-darwin-arm64-voxcpm"
-            build-type: "mps"
-          - backend: "pocket-tts"
-            tag-suffix: "-metal-darwin-arm64-pocket-tts"
-            build-type: "mps"
-          - backend: "moonshine"
-            tag-suffix: "-metal-darwin-arm64-moonshine"
-            build-type: "mps"
-          - backend: "whisperx"
-            tag-suffix: "-metal-darwin-arm64-whisperx"
-            build-type: "mps"
-          - backend: "rerankers"
-            tag-suffix: "-metal-darwin-arm64-rerankers"
-            build-type: "mps"
-          - backend: "transformers"
-            tag-suffix: "-metal-darwin-arm64-transformers"
-            build-type: "mps"
-          - backend: "kokoro"
-            tag-suffix: "-metal-darwin-arm64-kokoro"
-            build-type: "mps"
-          - backend: "faster-whisper"
-            tag-suffix: "-metal-darwin-arm64-faster-whisper"
-            build-type: "mps"
-          - backend: "coqui"
-            tag-suffix: "-metal-darwin-arm64-coqui"
-            build-type: "mps"
-          - backend: "rfdetr"
-            tag-suffix: "-metal-darwin-arm64-rfdetr"
-            build-type: "mps"
-          - backend: "kitten-tts"
-            tag-suffix: "-metal-darwin-arm64-kitten-tts"
-            build-type: "mps"
-          - backend: "piper"
-            tag-suffix: "-metal-darwin-arm64-piper"
-            build-type: "metal"
-            lang: "go"
-          - backend: "opus"
-            tag-suffix: "-metal-darwin-arm64-opus"
-            build-type: "metal"
-            lang: "go"
-          - backend: "silero-vad"
-            tag-suffix: "-metal-darwin-arm64-silero-vad"
-            build-type: "metal"
-            lang: "go"
-          - backend: "local-store"
-            tag-suffix: "-metal-darwin-arm64-local-store"
-            build-type: "metal"
-            lang: "go"
-          - backend: "llama-cpp-quantization"
-            tag-suffix: "-metal-darwin-arm64-llama-cpp-quantization"
-            build-type: "mps"
+      fail-fast: false
+      matrix: ${{ fromJSON(needs.derive-bases.outputs.matrix-darwin) }}
     with:
       backend: ${{ matrix.backend }}
       build-type: ${{ matrix.build-type }}
diff --git a/.github/workflows/backend_build.yml b/.github/workflows/backend_build.yml
index a7f6a8a5efd1..5f68af3b0f95 100644
--- a/.github/workflows/backend_build.yml
+++ b/.github/workflows/backend_build.yml
@@ -63,6 +63,16 @@ on:
         required: false
         default: ''
         type: string
+      base-image-prebuilt:
+        description: |
+          Optional reference to a prebuilt accel/lang base image
+          (quay.io/go-skynet/localai-base:<tag-stem>). When set, the backend
+          Dockerfile FROMs this image instead of running the inline
+          bootstrap. See .github/workflows/base_images.yml and
+          .agents/ci-caching.md.
+        required: false
+        default: ''
+        type: string
     secrets:
       dockerUsername:
         required: false
@@ -228,6 +238,7 @@ jobs:
             APT_MIRROR=${{ steps.apt_mirror.outputs.effective-mirror }}
             APT_PORTS_MIRROR=${{ steps.apt_mirror.outputs.effective-ports-mirror }}
             DEPS_REFRESH=${{ steps.deps_refresh.outputs.key }}
+            BASE_IMAGE_PREBUILT=${{ inputs.base-image-prebuilt }}
           context: ${{ inputs.context }}
           file: ${{ inputs.dockerfile }}
           cache-from: type=registry,ref=quay.io/go-skynet/ci-cache:cache${{ inputs.tag-suffix }}
@@ -254,6 +265,7 @@ jobs:
             APT_MIRROR=${{ steps.apt_mirror.outputs.effective-mirror }}
             APT_PORTS_MIRROR=${{ steps.apt_mirror.outputs.effective-ports-mirror }}
            DEPS_REFRESH=${{ steps.deps_refresh.outputs.key }}
+            BASE_IMAGE_PREBUILT=${{ inputs.base-image-prebuilt }}
           context: ${{ inputs.context }}
           file: ${{ inputs.dockerfile }}
           cache-from: type=registry,ref=quay.io/go-skynet/ci-cache:cache${{ inputs.tag-suffix }}
diff --git a/.github/workflows/backend_pr.yml b/.github/workflows/backend_pr.yml
index 5a557b38bbb2..5ad3fb7ad079 100644
--- a/.github/workflows/backend_pr.yml
+++ b/.github/workflows/backend_pr.yml
@@ -13,8 +13,10 @@ jobs:
     outputs:
       matrix: ${{ steps.set-matrix.outputs.matrix }}
       matrix-darwin: ${{ steps.set-matrix.outputs.matrix-darwin }}
+      bases-matrix: ${{ steps.set-matrix.outputs.bases-matrix }}
       has-backends: ${{ steps.set-matrix.outputs.has-backends }}
       has-backends-darwin: ${{ steps.set-matrix.outputs.has-backends-darwin }}
+      has-bases: ${{ steps.set-matrix.outputs.has-bases }}
     steps:
       - name: Checkout repository
         uses: actions/checkout@v6
@@ -27,7 +29,8 @@ jobs:
           bun add js-yaml
           bun add @octokit/core

-      # filters the matrix in backend.yml
+      # Filters the matrix from backend.yml against this PR's changed files
+      # AND derives the deduplicated bases-matrix consumed by build-bases.
       - name: Filter matrix for changed backends
         id: set-matrix
         env:
@@ -35,10 +38,34 @@ jobs:
           GITHUB_EVENT_PATH: ${{ github.event_path }}
         run: bun run scripts/changed-backends.js

-  backend-jobs:
+  build-bases:
     needs: generate-matrix
+    if: needs.generate-matrix.outputs.has-bases == 'true'
+    strategy:
+      fail-fast: false
+      matrix: ${{ fromJSON(needs.generate-matrix.outputs.bases-matrix) }}
+    uses: ./.github/workflows/base_images.yml
+    with:
+      lang: ${{ matrix.lang }}
+      base-image: ${{ matrix.base-image }}
+      build-type: ${{ matrix.build-type }}
+      cuda-major-version: ${{ matrix.cuda-major-version }}
+      cuda-minor-version: ${{ matrix.cuda-minor-version }}
+      ubuntu-version: ${{ matrix.ubuntu-version }}
+      platforms: ${{ matrix.platforms }}
+      runs-on: ${{ matrix.runs-on }}
+      tag-stem: ${{ matrix.tag-stem }}
+      skip-drivers: ${{ matrix.skip-drivers }}
+    secrets:
+      quayUsername: ${{ secrets.LOCALAI_REGISTRY_USERNAME }}
+      quayPassword: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }}
+
+  backend-jobs:
+    needs: [generate-matrix, build-bases]
     uses: ./.github/workflows/backend_build.yml
-    if: needs.generate-matrix.outputs.has-backends == 'true'
+    if: |
+      always() && needs.generate-matrix.outputs.has-backends == 'true' &&
+      (needs.build-bases.result == 'success' || needs.build-bases.result == 'skipped')
     with:
       tag-latest: ${{ matrix.tag-latest }}
       tag-suffix: ${{ matrix.tag-suffix }}
@@ -54,12 +81,17 @@ jobs:
       context: ${{ matrix.context }}
       ubuntu-version: ${{ matrix.ubuntu-version }}
       amdgpu-targets: ${{ matrix.amdgpu-targets || 'gfx908,gfx90a,gfx942,gfx950,gfx1030,gfx1100,gfx1101,gfx1102,gfx1151,gfx1200,gfx1201' }}
+      # The script annotates each filtered Python entry with the prebuilt
+      # base ref it should consume; non-Python entries get '' and run their
+      # own inline bootstrap.
+      base-image-prebuilt: ${{ matrix.base-image-prebuilt || '' }}
     secrets:
       quayUsername: ${{ secrets.LOCALAI_REGISTRY_USERNAME }}
       quayPassword: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }}
     strategy:
       fail-fast: true
       matrix: ${{ fromJson(needs.generate-matrix.outputs.matrix) }}
+
   backend-jobs-darwin:
     needs: generate-matrix
     uses: ./.github/workflows/backend_build_darwin.yml
diff --git a/.github/workflows/base_images.yml b/.github/workflows/base_images.yml
new file mode 100644
index 000000000000..141c7411b875
--- /dev/null
+++ b/.github/workflows/base_images.yml
@@ -0,0 +1,145 @@
+---
+name: 'build base image (reusable)'
+
+# Builds and pushes one (lang, accel, arch, ubuntu, cuda) base image flavour
+# to quay.io/go-skynet/localai-base. Consumed by backend builds via the
+# BASE_IMAGE_PREBUILT build-arg. PR builds tag with `-pr${PR_NUMBER}` so the
+# same PR's backend matrix can opt-in to the freshly-built base; master
+# builds overwrite the unsuffixed tag for downstream consumption. See
+# .agents/ci-caching.md for the full tagging scheme.
+
+on:
+  workflow_call:
+    inputs:
+      lang:
+        description: 'Language toolchain (matches .docker/bases/Dockerfile.<lang>)'
+        required: true
+        type: string
+      base-image:
+        description: 'Upstream base image (ubuntu:24.04, rocm/dev-ubuntu-24.04:..., etc.)'
+        required: true
+        type: string
+      build-type:
+        description: 'BUILD_TYPE: empty for CPU, cublas, hipblas, vulkan, l4t, ...'
+ default: '' + type: string + cuda-major-version: + description: 'CUDA major version (only meaningful for cublas/l4t)' + default: '12' + type: string + cuda-minor-version: + description: 'CUDA minor version' + default: '9' + type: string + ubuntu-version: + description: 'Ubuntu version code (2204, 2404)' + default: '2404' + type: string + platforms: + description: 'Single platform per call (linux/amd64 or linux/arm64)' + required: true + type: string + runs-on: + description: 'Runner label' + required: true + type: string + tag-stem: + description: 'Stable portion of the image tag (e.g. python-cpu-amd64-2404)' + required: true + type: string + skip-drivers: + description: 'Pass-through to the base Dockerfile' + default: 'false' + type: string + secrets: + quayUsername: + required: false + quayPassword: + required: false + outputs: + image-ref: + description: 'Full image reference of the built base' + value: ${{ jobs.base-build.outputs.image-ref }} + +jobs: + base-build: + runs-on: ${{ inputs.runs-on }} + env: + quay_username: ${{ secrets.quayUsername }} + outputs: + image-ref: ${{ steps.compute_ref.outputs.ref }} + steps: + - name: Checkout + uses: actions/checkout@v6 + + - name: Configure apt mirror on runner + id: apt_mirror + uses: ./.github/actions/configure-apt-mirror + + - name: Free Disk Space (Ubuntu) + if: inputs.runs-on == 'ubuntu-latest' + uses: jlumbroso/free-disk-space@main + with: + tool-cache: true + android: true + dotnet: true + haskell: true + large-packages: true + docker-images: true + swap-storage: true + + - name: Compute image ref + id: compute_ref + run: | + stem='${{ inputs.tag-stem }}' + if [ "${{ github.event_name }}" = "pull_request" ]; then + tag="${stem}-pr${{ github.event.number }}" + else + tag="${stem}" + fi + echo "tag=${tag}" >> "$GITHUB_OUTPUT" + echo "ref=quay.io/go-skynet/localai-base:${tag}" >> "$GITHUB_OUTPUT" + + - name: Set up QEMU + uses: docker/setup-qemu-action@master + with: + platforms: all + + - name: Set up Docker 
Buildx + id: buildx + uses: docker/setup-buildx-action@master + + - name: Login to Quay.io + if: ${{ env.quay_username != '' }} + uses: docker/login-action@v4 + with: + registry: quay.io + username: ${{ secrets.quayUsername }} + password: ${{ secrets.quayPassword }} + + - name: Build and push base image + uses: docker/build-push-action@v7 + with: + builder: ${{ steps.buildx.outputs.name }} + context: . + file: ./.docker/bases/Dockerfile.${{ inputs.lang }} + build-args: | + BUILD_TYPE=${{ inputs.build-type }} + CUDA_MAJOR_VERSION=${{ inputs.cuda-major-version }} + CUDA_MINOR_VERSION=${{ inputs.cuda-minor-version }} + BASE_IMAGE=${{ inputs.base-image }} + UBUNTU_VERSION=${{ inputs.ubuntu-version }} + SKIP_DRIVERS=${{ inputs.skip-drivers }} + APT_MIRROR=${{ steps.apt_mirror.outputs.effective-mirror }} + APT_PORTS_MIRROR=${{ steps.apt_mirror.outputs.effective-ports-mirror }} + platforms: ${{ inputs.platforms }} + # Push on PRs as well (if creds present) so the PR's backend matrix + # can opt-in to the freshly-built base via -pr${N} tag. + push: ${{ env.quay_username != '' }} + tags: ${{ steps.compute_ref.outputs.ref }} + cache-from: type=registry,ref=quay.io/go-skynet/ci-cache:base-${{ inputs.tag-stem }} + cache-to: type=registry,ref=quay.io/go-skynet/ci-cache:base-${{ inputs.tag-stem }},mode=max,ignore-error=true + + - name: job summary + run: | + echo "Built base image: ${{ steps.compute_ref.outputs.ref }}" >> "$GITHUB_STEP_SUMMARY" diff --git a/Makefile b/Makefile index 1de03999d555..ac9cd09491ac 100644 --- a/Makefile +++ b/Makefile @@ -1072,6 +1072,90 @@ BACKEND_KOKOROS = kokoros|rust|.|false|true # C++ backends (Go wrapper with purego) BACKEND_SAM3_CPP = sam3-cpp|golang|.|false|true +# Tag stem for the local prebuilt base images. Mirrors tagStem() in +# scripts/changed-backends.js and the inline expression in +# .github/workflows/backend.yml, so a `make docker-build-X` produces the +# same FROM ref shape that CI uses. 
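The workflow's `Compute image ref` step and the Makefile's local tags both reproduce one ref shape. The CI-side half of that shape can be sketched as follows (the function name is mine, for illustration; the tag rule is taken from the `compute_ref` step above):

```javascript
// Sketch of the ref computed by base_images.yml's "Compute image ref"
// step: bare stem on push, stem + "-pr<N>" on pull_request.
// Function name is illustrative, not from the repo.
function ciBaseRef(stem, eventName, prNumber) {
  const suffix = eventName === "pull_request" ? `-pr${prNumber}` : "";
  return `quay.io/go-skynet/localai-base:${stem}${suffix}`;
}
```

So a PR #123 consuming the python CPU base resolves `quay.io/go-skynet/localai-base:python-cpu-2404-pr123`, while a push to master overwrites the unsuffixed tag.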
+LOCAL_BASE_BUILD_TYPE := $(or $(BUILD_TYPE),cpu) +LOCAL_BASE_UBUNTU_VERSION := $(or $(UBUNTU_VERSION),2404) +LOCAL_BASE_CUDA_SUFFIX := $(if $(filter cublas l4t,$(BUILD_TYPE)),-cuda$(CUDA_MAJOR_VERSION).$(CUDA_MINOR_VERSION)) +LOCAL_BASE_PYTHON_TAG := localai-base:python-$(LOCAL_BASE_BUILD_TYPE)-$(LOCAL_BASE_UBUNTU_VERSION)$(LOCAL_BASE_CUDA_SUFFIX) +LOCAL_BASE_GOLANG_TAG := localai-base:golang-$(LOCAL_BASE_BUILD_TYPE)-$(LOCAL_BASE_UBUNTU_VERSION)$(LOCAL_BASE_CUDA_SUFFIX) +LOCAL_BASE_CPP_TAG := localai-base:cpp-$(LOCAL_BASE_BUILD_TYPE)-$(LOCAL_BASE_UBUNTU_VERSION)$(LOCAL_BASE_CUDA_SUFFIX) +LOCAL_BASE_RUST_TAG := localai-base:rust-$(LOCAL_BASE_BUILD_TYPE)-$(LOCAL_BASE_UBUNTU_VERSION) + +# Per-(lang) base image build targets. Each backend's docker-build-X target +# depends on the matching base via generate-docker-build-target below. +# PHONY so docker handles its own layer caching. +.PHONY: docker-build-python-base docker-build-golang-base docker-build-cpp-base docker-build-rust-base + +docker-build-python-base: + docker build \ + --build-arg BUILD_TYPE=$(BUILD_TYPE) \ + --build-arg BASE_IMAGE=$(or $(BASE_IMAGE),ubuntu:24.04) \ + --build-arg CUDA_MAJOR_VERSION=$(CUDA_MAJOR_VERSION) \ + --build-arg CUDA_MINOR_VERSION=$(CUDA_MINOR_VERSION) \ + --build-arg UBUNTU_VERSION=$(LOCAL_BASE_UBUNTU_VERSION) \ + --build-arg APT_MIRROR=$(APT_MIRROR) \ + --build-arg APT_PORTS_MIRROR=$(APT_PORTS_MIRROR) \ + $(if $(SKIP_DRIVERS),--build-arg SKIP_DRIVERS=$(SKIP_DRIVERS)) \ + -t $(LOCAL_BASE_PYTHON_TAG) \ + -f .docker/bases/Dockerfile.python \ + . 
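The `LOCAL_BASE_*` variables above compose tags like `localai-base:python-cublas-2404-cuda12.9`. A condensed sketch of the same composition (mirroring the make variables, with an illustrative function name; note the rust tag omits the cuda suffix just as `LOCAL_BASE_RUST_TAG` does):

```javascript
// Mirror of the LOCAL_BASE_*_TAG composition in the Makefile:
// localai-base:<lang>-<buildType|cpu>-<ubuntu>[-cuda<maj>.<min>]
// Function name is illustrative, not a repo identifier.
function localBaseTag(lang, buildType, ubuntu, cudaMajor, cudaMinor) {
  const bt = buildType || "cpu"; // $(or $(BUILD_TYPE),cpu)
  // Only cublas/l4t carry a cuda suffix; the rust tag never does.
  const cuda =
    lang !== "rust" && ["cublas", "l4t"].includes(buildType)
      ? `-cuda${cudaMajor}.${cudaMinor}`
      : "";
  return `localai-base:${lang}-${bt}-${ubuntu}${cuda}`;
}
```

For example, an unset `BUILD_TYPE` yields `localai-base:python-cpu-2404`, matching the CI stem shape minus the registry prefix.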
+ +docker-build-golang-base: + docker build \ + --build-arg BUILD_TYPE=$(BUILD_TYPE) \ + --build-arg BASE_IMAGE=$(or $(BASE_IMAGE),ubuntu:24.04) \ + --build-arg CUDA_MAJOR_VERSION=$(CUDA_MAJOR_VERSION) \ + --build-arg CUDA_MINOR_VERSION=$(CUDA_MINOR_VERSION) \ + --build-arg UBUNTU_VERSION=$(LOCAL_BASE_UBUNTU_VERSION) \ + --build-arg APT_MIRROR=$(APT_MIRROR) \ + --build-arg APT_PORTS_MIRROR=$(APT_PORTS_MIRROR) \ + $(if $(SKIP_DRIVERS),--build-arg SKIP_DRIVERS=$(SKIP_DRIVERS)) \ + -t $(LOCAL_BASE_GOLANG_TAG) \ + -f .docker/bases/Dockerfile.golang \ + . + +docker-build-cpp-base: + docker build \ + --build-arg BUILD_TYPE=$(BUILD_TYPE) \ + --build-arg BASE_IMAGE=$(or $(BASE_IMAGE),ubuntu:24.04) \ + --build-arg CUDA_MAJOR_VERSION=$(CUDA_MAJOR_VERSION) \ + --build-arg CUDA_MINOR_VERSION=$(CUDA_MINOR_VERSION) \ + --build-arg UBUNTU_VERSION=$(LOCAL_BASE_UBUNTU_VERSION) \ + --build-arg APT_MIRROR=$(APT_MIRROR) \ + --build-arg APT_PORTS_MIRROR=$(APT_PORTS_MIRROR) \ + $(if $(SKIP_DRIVERS),--build-arg SKIP_DRIVERS=$(SKIP_DRIVERS)) \ + -t $(LOCAL_BASE_CPP_TAG) \ + -f .docker/bases/Dockerfile.cpp \ + . + +docker-build-rust-base: + docker build \ + --build-arg BASE_IMAGE=$(or $(BASE_IMAGE),ubuntu:24.04) \ + --build-arg UBUNTU_VERSION=$(LOCAL_BASE_UBUNTU_VERSION) \ + --build-arg APT_MIRROR=$(APT_MIRROR) \ + --build-arg APT_PORTS_MIRROR=$(APT_PORTS_MIRROR) \ + -t $(LOCAL_BASE_RUST_TAG) \ + -f .docker/bases/Dockerfile.rust \ + . + +# Map a consumer dockerfile-type to the base-image tag it should consume. +# Mirrors langOf() in scripts/changed-backends.js: the C++ trio +# (llama-cpp/ik-llama-cpp/turboquant) all consume the shared cpp base. 
+local-base-tag = $(strip \ + $(if $(filter python,$(1)),$(LOCAL_BASE_PYTHON_TAG), \ + $(if $(filter golang,$(1)),$(LOCAL_BASE_GOLANG_TAG), \ + $(if $(filter llama-cpp ik-llama-cpp turboquant,$(1)),$(LOCAL_BASE_CPP_TAG), \ + $(if $(filter rust,$(1)),$(LOCAL_BASE_RUST_TAG)))))) + +local-base-target = $(strip \ + $(if $(filter python,$(1)),docker-build-python-base, \ + $(if $(filter golang,$(1)),docker-build-golang-base, \ + $(if $(filter llama-cpp ik-llama-cpp turboquant,$(1)),docker-build-cpp-base, \ + $(if $(filter rust,$(1)),docker-build-rust-base))))) + # Helper function to build docker image for a backend # Usage: $(call docker-build-backend,BACKEND_NAME,DOCKERFILE_TYPE,BUILD_CONTEXT,PROGRESS_FLAG,NEEDS_BACKEND_ARG) define docker-build-backend @@ -1084,15 +1168,18 @@ define docker-build-backend --build-arg UBUNTU_CODENAME=$(UBUNTU_CODENAME) \ --build-arg APT_MIRROR=$(APT_MIRROR) \ --build-arg APT_PORTS_MIRROR=$(APT_PORTS_MIRROR) \ + $(if $(call local-base-tag,$(2)),--build-arg BASE_IMAGE_PREBUILT=$(call local-base-tag,$(2))) \ $(if $(FROM_SOURCE),--build-arg FROM_SOURCE=$(FROM_SOURCE)) \ $(if $(AMDGPU_TARGETS),--build-arg AMDGPU_TARGETS=$(AMDGPU_TARGETS)) \ $(if $(filter true,$(5)),--build-arg BACKEND=$(1)) \ -t local-ai-backend:$(1) -f backend/Dockerfile.$(2) $(3) endef -# Generate docker-build targets from backend definitions +# Generate docker-build targets from backend definitions. Each consumer +# gets the matching layered base as a prerequisite so the FROM in the +# slimmed Dockerfile resolves locally. The map lives in local-base-target. 
define generate-docker-build-target -docker-build-$(word 1,$(subst |, ,$(1))): +docker-build-$(word 1,$(subst |, ,$(1))): $(call local-base-target,$(word 2,$(subst |, ,$(1)))) $$(call docker-build-backend,$(word 1,$(subst |, ,$(1))),$(word 2,$(subst |, ,$(1))),$(word 3,$(subst |, ,$(1))),$(word 4,$(subst |, ,$(1))),$(word 5,$(subst |, ,$(1)))) endef diff --git a/backend/Dockerfile.golang b/backend/Dockerfile.golang index 4d0980a81e37..f01e86e56608 100644 --- a/backend/Dockerfile.golang +++ b/backend/Dockerfile.golang @@ -1,202 +1,37 @@ -ARG BASE_IMAGE=ubuntu:24.04 -ARG APT_MIRROR="" -ARG APT_PORTS_MIRROR="" +# Builds a single Go backend on top of the shared +# .docker/bases/Dockerfile.golang base. The base bakes in apt + GPU SDK + +# Go toolchain + protoc + grpc tooling, so this stage only carries the +# per-backend opus-dev install + COPY + `make build`. +# +# CI orchestration (.github/workflows/backend.yml + backend_pr.yml) builds +# the right base flavour automatically via scripts/changed-backends.js +# and passes BASE_IMAGE_PREBUILT here. For local builds, run: +# make backend-image-base LANG=golang BUILD_TYPE=<...> +# make backend-image BACKEND=<...> BUILD_TYPE=<...> +# See .agents/ci-caching.md. 
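The effect of `generate-docker-build-target` plus `local-base-target` on a `BACKEND_* = name|dockerfile|context|progress|needs-backend` tuple can be sketched outside make (helper names are mine; the tuple shapes are taken from the `BACKEND_KOKOROS` / `BACKEND_SAM3_CPP` definitions above):

```javascript
// Sketch of what generate-docker-build-target expands a BACKEND_* tuple
// into: a docker-build-<name> target whose prerequisite is the matching
// docker-build-<lang>-base target. Helper names are illustrative.
const CPP_TRIO = new Set(["llama-cpp", "ik-llama-cpp", "turboquant"]);

function baseTargetFor(dockerfileType) {
  if (dockerfileType === "python") return "docker-build-python-base";
  if (dockerfileType === "golang") return "docker-build-golang-base";
  if (CPP_TRIO.has(dockerfileType)) return "docker-build-cpp-base";
  if (dockerfileType === "rust") return "docker-build-rust-base";
  return ""; // no prebuilt base; the consumer Dockerfile bootstraps inline
}

// Split on "|" the way the Makefile's $(word n,$(subst |, ,...)) does.
function expandBackendTuple(tuple) {
  const [name, dockerfileType] = tuple.split("|");
  return { target: `docker-build-${name}`, prereq: baseTargetFor(dockerfileType) };
}
```

So `sam3-cpp|golang|.|false|true` expands to a `docker-build-sam3-cpp` target depending on `docker-build-golang-base`, which is why a bare `make docker-build-sam3-cpp` builds the base first.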
+ +ARG BASE_IMAGE_PREBUILT + +FROM ${BASE_IMAGE_PREBUILT} AS builder -FROM ${BASE_IMAGE} AS builder ARG BACKEND=rerankers ARG BUILD_TYPE ENV BUILD_TYPE=${BUILD_TYPE} ARG CUDA_MAJOR_VERSION ARG CUDA_MINOR_VERSION -ARG SKIP_DRIVERS=false ENV CUDA_MAJOR_VERSION=${CUDA_MAJOR_VERSION} ENV CUDA_MINOR_VERSION=${CUDA_MINOR_VERSION} -ENV DEBIAN_FRONTEND=noninteractive ARG TARGETARCH ARG TARGETVARIANT -ARG GO_VERSION=1.25.4 -ARG UBUNTU_VERSION=2404 ARG AMDGPU_TARGETS ENV AMDGPU_TARGETS=${AMDGPU_TARGETS} -ARG APT_MIRROR -ARG APT_PORTS_MIRROR - -RUN --mount=type=bind,source=.docker/apt-mirror.sh,target=/usr/local/sbin/apt-mirror \ - APT_MIRROR="${APT_MIRROR}" APT_PORTS_MIRROR="${APT_PORTS_MIRROR}" sh /usr/local/sbin/apt-mirror && \ - apt-get update && \ - apt-get install -y --no-install-recommends \ - build-essential \ - gcc-14 g++-14 \ - git ccache \ - ca-certificates \ - make cmake wget libopenblas-dev \ - curl unzip \ - libssl-dev && \ - update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-14 100 \ - --slave /usr/bin/g++ g++ /usr/bin/g++-14 \ - --slave /usr/bin/gcov gcov /usr/bin/gcov-14 && \ - apt-get clean && \ - rm -rf /var/lib/apt/lists/* - - -# Cuda -ENV PATH=/usr/local/cuda/bin:${PATH} - -# HipBLAS requirements -ENV PATH=/opt/rocm/bin:${PATH} - - -# Vulkan requirements -RUN </dev/null || ls /opt/rocm*/lib64/rocblas/library/Kernels* 2>/dev/null) | grep -oP 'gfx[0-9a-z+-]+' | sort -u || \ - echo "WARNING: No rocBLAS kernel data found" \ - ; fi - -RUN echo "TARGETARCH: $TARGETARCH" - -# We need protoc installed, and the version in 22.04 is too old. We will create one as part installing the GRPC build below -# but that will also being in a newer version of absl which stablediffusion cannot compile with. This version of protoc is only -# here so that we can generate the grpc code for the stablediffusion build -RUN < # build the base +# make backend-image BACKEND=<...> BUILD_TYPE=<...> +# See .agents/ci-caching.md. 
+ +ARG BASE_IMAGE_PREBUILT + +FROM ${BASE_IMAGE_PREBUILT} AS builder -FROM ${BASE_IMAGE} AS builder ARG BACKEND=rerankers ARG BUILD_TYPE ENV BUILD_TYPE=${BUILD_TYPE} ARG CUDA_MAJOR_VERSION ARG CUDA_MINOR_VERSION -ARG SKIP_DRIVERS=false ENV CUDA_MAJOR_VERSION=${CUDA_MAJOR_VERSION} ENV CUDA_MINOR_VERSION=${CUDA_MINOR_VERSION} -ENV DEBIAN_FRONTEND=noninteractive -ARG TARGETARCH -ARG TARGETVARIANT -ARG UBUNTU_VERSION=2404 -ARG APT_MIRROR -ARG APT_PORTS_MIRROR - -RUN --mount=type=bind,source=.docker/apt-mirror.sh,target=/usr/local/sbin/apt-mirror \ - APT_MIRROR="${APT_MIRROR}" APT_PORTS_MIRROR="${APT_PORTS_MIRROR}" sh /usr/local/sbin/apt-mirror && \ - apt-get update && \ - apt-get install -y --no-install-recommends \ - build-essential \ - ccache \ - ca-certificates \ - espeak-ng \ - curl \ - libssl-dev \ - git wget \ - git-lfs \ - unzip clang \ - upx-ucl \ - curl python3-pip \ - python-is-python3 \ - python3-dev llvm \ - libnuma1 libgomp1 \ - python3-venv make cmake && \ - apt-get clean && \ - rm -rf /var/lib/apt/lists/* - -RUN </dev/null || ls /opt/rocm*/lib64/rocblas/library/Kernels* 2>/dev/null) | grep -oP 'gfx[0-9a-z+-]+' | sort -u || \ - echo "WARNING: No rocBLAS kernel data found" \ - ; fi - -RUN echo "TARGETARCH: $TARGETARCH" - -# We need protoc installed, and the version in 22.04 is too old. We will create one as part installing the GRPC build below -# but that will also being in a newer version of absl which stablediffusion cannot compile with. This version of protoc is only -# here so that we can generate the grpc code for the stablediffusion build -RUN <=true/false: per-backend booleans for test-extra.yml. +// +// On PR events the matrix is filtered to backends whose source dirs +// changed; if .docker/bases/Dockerfile. (or its workflow scaffolding) +// changed, a canary entry per (lang × build-type × arch × cuda × ubuntu) +// is added so the prebuilt-base path gets exercised end-to-end before +// merge. See .agents/ci-caching.md. 
+ import fs from "fs"; import yaml from "js-yaml"; import { Octokit } from "@octokit/core"; -// Load backend.yml and parse matrix.include -const backendYml = yaml.load(fs.readFileSync(".github/workflows/backend.yml", "utf8")); -const jobs = backendYml.jobs; -const backendJobs = jobs["backend-jobs"]; -const backendJobsDarwin = jobs["backend-jobs-darwin"]; -const includes = backendJobs.strategy.matrix.include; -const includesDarwin = backendJobsDarwin.strategy.matrix.include; +// Backend matrix lives in a sibling data file so the workflow can switch +// to fromJSON without needing two copies of the same matrix. See +// .github/backend-matrix.yaml. +const matrixData = yaml.load(fs.readFileSync(".github/backend-matrix.yaml", "utf8")); +const includes = matrixData.linux; +const includesDarwin = matrixData.darwin; const eventPath = process.env.GITHUB_EVENT_PATH; const event = JSON.parse(fs.readFileSync(eventPath, "utf8")); +const isPR = !!event.pull_request; +const prNumber = isPR ? event.pull_request.number : null; + +// Langs with a prebuilt base recipe under .docker/bases/Dockerfile.. +// Discovered at runtime so adding a new language tier (e.g. golang) only +// requires creating that file + slimming the consumer Dockerfile; no +// orchestration changes needed. +const baseRecipeDir = ".docker/bases"; +const langsWithBase = new Set( + fs.existsSync(baseRecipeDir) + ? fs.readdirSync(baseRecipeDir) + .filter(f => f.startsWith("Dockerfile.")) + .map(f => f.slice("Dockerfile.".length)) + : [] +); + +// Files that, when changed in a PR, should fan out to canary backend +// matrix entries for the affected lang. Keeps PR validation honest when a +// PR only touches base scaffolding. Per-lang recipe paths +// (.docker/bases/Dockerfile.) trigger only their own lang; the +// shared scaffolding entries trigger every lang. 
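The trigger rule described above — a per-lang recipe fans out only to its own lang, shared scaffolding fans out to every lang — can be condensed into a small sketch (a reimplementation of the rule, not the script's exact code; the file lists mirror `baseTriggerFiles`):

```javascript
// Which base langs does a changed file fan canaries out to?
// Per-lang recipes (.docker/bases/Dockerfile.<lang>) trigger only that
// lang; shared scaffolding triggers every lang; anything else is left
// to the ordinary source-dir filter. Condensed sketch of the rule in
// scripts/changed-backends.js, not its exact code.
const ALL_LANGS = ["cpp", "golang", "python", "rust"];
const RECIPE_PREFIX = ".docker/bases/Dockerfile.";
const SHARED_SCAFFOLDING = new Set([
  ".docker/apt-mirror.sh",
  ".github/workflows/base_images.yml",
  ".github/actions/configure-apt-mirror/action.yml",
  "scripts/changed-backends.js",
]);

function langsTriggeredBy(changedFiles) {
  const langs = new Set();
  for (const f of changedFiles) {
    if (f.startsWith(RECIPE_PREFIX)) {
      const lang = f.slice(RECIPE_PREFIX.length);
      if (ALL_LANGS.includes(lang)) langs.add(lang); // per-lang recipe
    } else if (SHARED_SCAFFOLDING.has(f)) {
      for (const l of ALL_LANGS) langs.add(l); // scaffolding: all langs
    }
  }
  return [...langs].sort();
}
```

A PR touching only `.docker/bases/Dockerfile.python` therefore gets python canaries only, while a PR touching `.docker/apt-mirror.sh` gets canaries for all four base langs.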
+const baseTriggerFiles = new Set([ + ".docker/bases/Dockerfile.python", + ".docker/bases/Dockerfile.golang", + ".docker/bases/Dockerfile.cpp", + ".docker/bases/Dockerfile.rust", + ".docker/apt-mirror.sh", + ".github/workflows/base_images.yml", + ".github/actions/configure-apt-mirror/action.yml", + "scripts/changed-backends.js", +]); +// Maps a base lang back to the consumer Dockerfiles that build on top of +// it. The cpp base is shared by the llama-cpp / ik-llama-cpp / turboquant +// trio; everything else is 1:1 with the file suffix. +const langTriggerSelector = { + python: (item) => item.dockerfile && item.dockerfile.endsWith("python"), + golang: (item) => item.dockerfile && item.dockerfile.endsWith("golang"), + rust: (item) => item.dockerfile && item.dockerfile.endsWith("rust"), + cpp: (item) => + !!item.dockerfile && /Dockerfile\.(llama-cpp|ik-llama-cpp|turboquant)$/.test(item.dockerfile), +}; + +// ---------- helpers ---------- + +function langOf(item) { + if (!item.dockerfile) return null; + // dockerfile is like "./backend/Dockerfile.python" + const m = item.dockerfile.match(/Dockerfile\.([\w-]+)$/); + if (!m) return null; + // The C++ trio (llama-cpp, ik-llama-cpp, turboquant) consume a shared + // cpp base image — they only differ in their per-backend make targets. + if (m[1] === "llama-cpp" || m[1] === "ik-llama-cpp" || m[1] === "turboquant") { + return "cpp"; + } + return m[1]; +} -// Infer backend path function inferBackendPath(item) { if (item.dockerfile.endsWith("python")) { return `backend/python/${item.backend}/`; @@ -42,61 +114,196 @@ function inferBackendPathDarwin(item) { if (!item.lang) { return `backend/python/${item.backend}/`; } - return `backend/${item.lang}/${item.backend}/`; } -// Build a deduplicated map of backend name -> path prefix from all matrix entries +function platformsOf(item) { + // matrix.platforms can be "linux/amd64", "linux/arm64", or + // "linux/amd64,linux/arm64". Always return a normalized array. 
+ if (!item.platforms) return ["linux/amd64"]; + return item.platforms.split(",").map(p => p.trim()).filter(Boolean); +} + +// Slug a base image reference for inclusion in a tag-stem. Returns "" for +// the default ubuntu:24.04 (which is the implicit BASE_IMAGE) so that case +// keeps a clean stem. Other base images get a short, parseable suffix. +function baseImageSlug(img) { + if (!img || img === "ubuntu:24.04") return ""; + if (img.includes("l4t-jetpack")) { + const m = img.match(/r\d+(?:\.\d+)+/); + return `jetpack-${m ? m[0] : "x"}`; + } + if (img.includes("rocm/dev-ubuntu")) { + const m = img.match(/:([\d.]+)/); + return `rocm-${m ? m[1] : "x"}`; + } + if (img.includes("intel/oneapi-basekit")) { + const m = img.match(/:([\d.]+)/); + return `oneapi-${m ? m[1] : "x"}`; + } + return img.replace(/.*\//, "").replace(/:/g, "-").replace(/[^A-Za-z0-9.-]/g, ""); +} + +// Tag stem for the prebuilt base. Arch is intentionally NOT in the stem: +// the base is built multi-arch when any consumer needs multi-arch, and +// single-arch otherwise. +function tagStem(item) { + const lang = langOf(item); + if (!lang || !langsWithBase.has(lang)) return null; + const ubuntu = item["ubuntu-version"] || "2404"; + const buildType = item["build-type"] || "cpu"; + let stem = `${lang}-${buildType}-${ubuntu}`; + if (buildType === "cublas" || buildType === "l4t") { + stem += `-cuda${item["cuda-major-version"]}.${item["cuda-minor-version"]}`; + } + const slug = baseImageSlug(item["base-image"]); + if (slug) stem += `-${slug}`; + return stem; +} + +function prebuiltRef(stem) { + if (!stem) return ""; + const suffix = isPR ? `-pr${prNumber}` : ""; + return `quay.io/go-skynet/localai-base:${stem}${suffix}`; +} + +// Build-types that actually exercise the SKIP_DRIVERS branch in the base +// Dockerfile. For everything else (cpu, intel, sycl_*, mps, metal), +// skip-drivers is a no-op and disagreeing values across consumers are +// safe to merge. 
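A few worked stems help make the scheme concrete. This is a condensed mirror of `baseImageSlug()` + `tagStem()` with illustrative names; the entry shapes are assumptions about what `.github/backend-matrix.yaml` contains, not copied matrix data:

```javascript
// Condensed mirror of baseImageSlug()/tagStem() for worked examples.
// Names and entry shapes are illustrative.
function slug(img) {
  if (!img || img === "ubuntu:24.04") return ""; // default base: clean stem
  if (img.includes("l4t-jetpack")) {
    const m = img.match(/r\d+(?:\.\d+)+/);
    return `jetpack-${m ? m[0] : "x"}`;
  }
  const m = img.match(/:([\d.]+)/);
  const ver = m ? m[1] : "x";
  if (img.includes("rocm/dev-ubuntu")) return `rocm-${ver}`;
  if (img.includes("intel/oneapi-basekit")) return `oneapi-${ver}`;
  return img.replace(/.*\//, "").replace(/:/g, "-");
}

function stem(lang, entry) {
  const bt = entry["build-type"] || "cpu";
  let s = `${lang}-${bt}-${entry["ubuntu-version"] || "2404"}`;
  if (bt === "cublas" || bt === "l4t") {
    s += `-cuda${entry["cuda-major-version"]}.${entry["cuda-minor-version"]}`;
  }
  const sl = slug(entry["base-image"]);
  return sl ? `${s}-${sl}` : s;
}
```

So a python cublas 12.9 entry stems to `python-cublas-2404-cuda12.9`, a cpp hipblas entry on a rocm base to `cpp-hipblas-2404-rocm-7.2.1`, and a plain golang CPU entry to `golang-cpu-2404` — with no arch component in any of them.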
+const driverBuildTypes = new Set(["vulkan", "cublas", "l4t", "clblas", "hipblas"]); + +function effectiveSkipDrivers(item) { + if (!driverBuildTypes.has(item["build-type"] || "")) return "false"; + return String(item["skip-drivers"] ?? "false"); +} + +// Build a base entry consumed by base_images.yml. Platforms is the union +// across all consumers of this stem (multi-arch when any consumer needs +// it). runs-on is derived from the platforms: arm-native when arm64 is +// the only arch, ubuntu-latest (with QEMU) otherwise. +function baseEntryFor(stem, items) { + const first = items[0]; + const platformSet = new Set(); + for (const it of items) for (const p of platformsOf(it)) platformSet.add(p); + const platforms = [...platformSet].sort().join(","); + const armOnly = platforms === "linux/arm64"; + return { + "tag-stem": stem, + lang: langOf(first), + "base-image": first["base-image"], + "build-type": first["build-type"] || "", + "cuda-major-version": String(first["cuda-major-version"] ?? ""), + "cuda-minor-version": String(first["cuda-minor-version"] ?? ""), + "ubuntu-version": String(first["ubuntu-version"] ?? "2404"), + platforms, + "runs-on": armOnly ? "ubuntu-24.04-arm" : "ubuntu-latest", + "skip-drivers": effectiveSkipDrivers(first), + }; +} + +function dedupBases(items) { + // Group consumers by tag-stem. + const groups = new Map(); + for (const item of items) { + const stem = tagStem(item); + if (!stem) continue; + if (!groups.has(stem)) groups.set(stem, []); + groups.get(stem).push(item); + } + // Inputs that MUST agree across all consumers of a stem. If they don't, + // the script picks one arbitrarily and the others get a wrong base — fail + // loudly so the matrix is reconciled. 
+ const collisionChecks = [ + ["base-image", (it) => it["base-image"]], + ["skip-drivers", effectiveSkipDrivers], + ]; + const out = []; + for (const [stem, consumers] of groups) { + for (const [name, getter] of collisionChecks) { + const v0 = getter(consumers[0]); + for (const c of consumers.slice(1)) { + const v = getter(c); + if (v !== v0) { + throw new Error( + `Tag-stem collision for ${stem}: ${name} differs ` + + `(${JSON.stringify(v0)} for ${consumers[0]["tag-suffix"]} vs ` + + `${JSON.stringify(v)} for ${c["tag-suffix"]}). ` + + `Disambiguate by encoding ${name} in tagStem(), or reconcile the matrix entries.`, + ); + } + } + } + out.push(baseEntryFor(stem, consumers)); + } + return out; +} + +// Annotate a backend matrix entry with `base-image-prebuilt` for langs +// with a prebuilt base recipe; leave others untouched (their Dockerfile +// runs the inline bootstrap). +function annotate(item) { + const stem = tagStem(item); + if (!stem) return item; + return { ...item, "base-image-prebuilt": prebuiltRef(stem) }; +} + +// Build the deduplicated list of backend names → path prefixes from all +// matrix entries (linux + darwin). Used for per-backend boolean outputs +// consumed by test-extra.yml. 
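The dedup-plus-collision behaviour can be illustrated on a toy pair of consumers sharing a stem (a condensed sketch under assumed entry shapes; the real code groups full matrix entries and also checks `base-image`):

```javascript
// Toy illustration of stem dedup: platforms union across consumers,
// runner derived from the union, and a loud failure when consumers of
// one stem disagree on skip-drivers. Condensed sketch, not repo code.
function dedupToy(consumers) {
  const platforms = new Set();
  const skip = new Set();
  for (const c of consumers) {
    for (const p of c.platforms.split(",")) platforms.add(p.trim());
    skip.add(c["skip-drivers"]);
  }
  if (skip.size > 1) {
    throw new Error("Tag-stem collision: skip-drivers differs across consumers");
  }
  const joined = [...platforms].sort().join(",");
  return {
    platforms: joined,
    // arm-native runner only when arm64 is the sole arch; QEMU otherwise
    "runs-on": joined === "linux/arm64" ? "ubuntu-24.04-arm" : "ubuntu-latest",
  };
}
```

Two consumers on `linux/amd64` and `linux/arm64` yield one multi-arch base on `ubuntu-latest`; a disagreement on `skip-drivers` throws instead of silently picking a winner.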
function getAllBackendPaths() { const paths = new Map(); for (const item of includes) { const p = inferBackendPath(item); - if (p && !paths.has(item.backend)) { - paths.set(item.backend, p); - } + if (p && !paths.has(item.backend)) paths.set(item.backend, p); } for (const item of includesDarwin) { const p = inferBackendPathDarwin(item); - if (p && !paths.has(item.backend)) { - paths.set(item.backend, p); - } + if (p && !paths.has(item.backend)) paths.set(item.backend, p); } return paths; } const allBackendPaths = getAllBackendPaths(); -// Non-PR events: output run-all=true and all backends as true -if (!event.pull_request) { - fs.appendFileSync(process.env.GITHUB_OUTPUT, `run-all=true\n`); - fs.appendFileSync(process.env.GITHUB_OUTPUT, `has-backends=true\n`); - fs.appendFileSync(process.env.GITHUB_OUTPUT, `has-backends-darwin=true\n`); - fs.appendFileSync(process.env.GITHUB_OUTPUT, `matrix=${JSON.stringify({ include: includes })}\n`); - fs.appendFileSync(process.env.GITHUB_OUTPUT, `matrix-darwin=${JSON.stringify({ include: includesDarwin })}\n`); +function writeOutput(key, value) { + fs.appendFileSync(process.env.GITHUB_OUTPUT, `${key}=${value}\n`); +} + +function emit(filtered, filteredDarwin, runAll) { + const annotated = filtered.map(annotate); + const bases = dedupBases(filtered); + writeOutput("run-all", runAll); + writeOutput("has-backends", annotated.length > 0 ? "true" : "false"); + writeOutput("has-backends-darwin", filteredDarwin.length > 0 ? "true" : "false"); + writeOutput("has-bases", bases.length > 0 ? 
"true" : "false"); + writeOutput("matrix", JSON.stringify({ include: annotated })); + writeOutput("matrix-darwin", JSON.stringify({ include: filteredDarwin })); + writeOutput("bases-matrix", JSON.stringify({ include: bases })); +} + +// ---------- master mode (push events) ---------- + +if (!isPR) { + emit(includes, includesDarwin, "true"); for (const backend of allBackendPaths.keys()) { - fs.appendFileSync(process.env.GITHUB_OUTPUT, `${backend}=true\n`); + writeOutput(backend, "true"); } process.exit(0); } -// PR context -const prNumber = event.pull_request.number; +// ---------- PR mode ---------- + const repo = event.repository.name; const owner = event.repository.owner.login; - -const token = process.env.GITHUB_TOKEN; -const octokit = new Octokit({ auth: token }); +const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN }); async function getChangedFiles() { let files = []; let page = 1; while (true) { - const res = await octokit.request('GET /repos/{owner}/{repo}/pulls/{pull_number}/files', { - owner, - repo, - pull_number: prNumber, - per_page: 100, - page + const res = await octokit.request("GET /repos/{owner}/{repo}/pulls/{pull_number}/files", { + owner, repo, pull_number: prNumber, per_page: 100, page, }); files = files.concat(res.data.map(f => f.filename)); if (res.data.length < 100) break; @@ -107,35 +314,55 @@ async function getChangedFiles() { (async () => { const changedFiles = await getChangedFiles(); - console.log("Changed files:", changedFiles); - const filtered = includes.filter(item => { - const backendPath = inferBackendPath(item); - if (!backendPath) return false; - return changedFiles.some(file => file.startsWith(backendPath)); - }); + // Source-driven filter: backend dir touched. 
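`emit()` lands in `$GITHUB_OUTPUT` as plain `key=value` lines, one per output, with the matrices serialized to single-line JSON. A sketch of the resulting lines for a one-entry filter (shape inferred from the code above, not captured from a real run):

```javascript
// Sketch of the key=value lines emit() appends to $GITHUB_OUTPUT.
// Shape inferred from the code, not captured from a real run; the
// single-line JSON.stringify output is what makes the flat key=value
// form of GITHUB_OUTPUT safe here.
function emitLines(filtered, filteredDarwin, runAll) {
  return [
    `run-all=${runAll}`,
    `has-backends=${filtered.length > 0 ? "true" : "false"}`,
    `has-backends-darwin=${filteredDarwin.length > 0 ? "true" : "false"}`,
    `matrix=${JSON.stringify({ include: filtered })}`,
    `matrix-darwin=${JSON.stringify({ include: filteredDarwin })}`,
  ];
}
```

The workflow side then round-trips the `matrix` line through `fromJson(needs.generate-matrix.outputs.matrix)`.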
+ const sourceTriggered = new Set(); + for (const item of includes) { + const p = inferBackendPath(item); + if (p && changedFiles.some(f => f.startsWith(p))) { + sourceTriggered.add(item); + } + } - const filteredDarwin = includesDarwin.filter(item => { - const backendPath = inferBackendPathDarwin(item); - return changedFiles.some(file => file.startsWith(backendPath)); - }) + // Base-driven filter: any matrix entry whose lang has a prebuilt base + // recipe AND that recipe (or its scaffolding) was touched. We want one + // canary per (lang × build-type × arch × cuda × ubuntu) so all bases get + // exercised, not 234 entries. + const baseTriggered = new Set(); + const baseTriggerHits = new Set(changedFiles.filter(f => baseTriggerFiles.has(f))); + if (baseTriggerHits.size > 0) { + const seenStems = new Set(); + for (const item of includes) { + const stem = tagStem(item); + if (!stem) continue; + const select = langTriggerSelector[langOf(item)]; + if (select && !select(item)) continue; + // Only canary entries for langs whose recipe/scaffolding actually changed. + const hits = [...baseTriggerHits]; + const recipePath = `.docker/bases/Dockerfile.${langOf(item)}`; + const langTouched = + hits.includes(recipePath) || + // any non-recipe trigger touches all langs + hits.some(h => h !== recipePath && !h.startsWith(".docker/bases/Dockerfile.")); + if (!langTouched) continue; + if (seenStems.has(stem)) continue; + seenStems.add(stem); + baseTriggered.add(item); + } + } - console.log("Filtered files:", filtered); - console.log("Filtered files Darwin:", filteredDarwin); + const filtered = includes.filter(item => sourceTriggered.has(item) || baseTriggered.has(item)); + const filteredDarwin = includesDarwin.filter(item => { + const p = inferBackendPathDarwin(item); + return changedFiles.some(f => f.startsWith(p)); + }); - const hasBackends = filtered.length > 0 ? 'true' : 'false'; - const hasBackendsDarwin = filteredDarwin.length > 0 ? 
'true' : 'false'; - console.log("Has backends?:", hasBackends); - console.log("Has Darwin backends?:", hasBackendsDarwin); + console.log("Filtered linux:", filtered.length, "(source:", sourceTriggered.size, "base canaries:", baseTriggered.size, ")"); + console.log("Filtered darwin:", filteredDarwin.length); - fs.appendFileSync(process.env.GITHUB_OUTPUT, `run-all=false\n`); - fs.appendFileSync(process.env.GITHUB_OUTPUT, `has-backends=${hasBackends}\n`); - fs.appendFileSync(process.env.GITHUB_OUTPUT, `has-backends-darwin=${hasBackendsDarwin}\n`); - fs.appendFileSync(process.env.GITHUB_OUTPUT, `matrix=${JSON.stringify({ include: filtered })}\n`); - fs.appendFileSync(process.env.GITHUB_OUTPUT, `matrix-darwin=${JSON.stringify({ include: filteredDarwin })}\n`); + emit(filtered, filteredDarwin, "false"); - // Per-backend boolean outputs for (const [backend, pathPrefix] of allBackendPaths) { let changed = changedFiles.some(file => file.startsWith(pathPrefix)); // turboquant reuses backend/cpp/llama-cpp sources via a thin wrapper; @@ -143,6 +370,6 @@ async function getChangedFiles() { if (backend === "turboquant" && !changed) { changed = changedFiles.some(file => file.startsWith("backend/cpp/llama-cpp/")); } - fs.appendFileSync(process.env.GITHUB_OUTPUT, `${backend}=${changed ? 'true' : 'false'}\n`); + writeOutput(backend, changed ? "true" : "false"); } })();
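The per-backend booleans at the tail of the script, including the turboquant fallback onto the llama-cpp sources it wraps, reduce to a prefix test. A condensed sketch of that final rule:

```javascript
// Condensed sketch of the per-backend changed flag: prefix match on the
// backend's source dir, with turboquant also watching the
// backend/cpp/llama-cpp sources its thin wrapper reuses. Mirrors the
// tail of scripts/changed-backends.js.
function backendChanged(backend, pathPrefix, changedFiles) {
  let changed = changedFiles.some(f => f.startsWith(pathPrefix));
  if (backend === "turboquant" && !changed) {
    changed = changedFiles.some(f => f.startsWith("backend/cpp/llama-cpp/"));
  }
  return changed;
}
```

This is why a PR that only edits `backend/cpp/llama-cpp/` still flips the `turboquant=true` output for test-extra.yml.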