Goal
Drive the wall-clock time to compile the full FastLED examples/ set on CI as low as possible, for two distinct regimes:
- Cold — first time a runner sees this cache key. No fbuild cache, no toolchain on disk, no pip cache. This is what every new board / new OS / first PR after a cache bust pays.
- Warm (group rebuild) — same graph re-built with a populated fbuild cache. This is the common path on master after #2319 landed in the FastLED repo.
We want both numbers low, and we want to understand where the time goes — not just trust that it feels fast.
Non-goals
- Not trying to speed up a single example compile. The unit under test is the whole example list for one board.
- Not building hardware-specific fast paths. Improvements must come from fbuild / cache / CI orchestration, not from dropping examples.
Approach
A dedicated orphan branch in this repo — bench/fastled-examples — containing only:
- .github/workflows/benchmark.yml — a single self-contained workflow (workflow_dispatch) that:
  - installs fbuild via the local setup action,
  - clones FastLED at a pinned SHA,
  - compiles all examples for a chosen board (uno by default, overridable),
  - prints per-phase wall-clock timings to the job summary,
  - uploads raw timing data as an artifact.
- README.md documenting the branch purpose.
Two invocations per change = one number for each regime:
- Run 1 with cache-version bumped / fresh key → cold.
- Run 2 immediately after → warm.
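One simple way to force the cold regime is to fold a manually bumped cache-version input into the cache key, so that a fresh value guarantees a miss. A minimal sketch — the input names CACHE_VERSION and BOARD are assumptions, not the workflow's actual inputs:

```shell
# Hypothetical cache-key construction; bumping CACHE_VERSION forces a cold run.
CACHE_VERSION="${CACHE_VERSION:-1}"   # workflow input (name illustrative)
BOARD="${BOARD:-uno}"                 # board under test
KEY="fbuild-${BOARD}-v${CACHE_VERSION}"
echo "cache key: ${KEY}"
```

The same key fed to actions/cache on the immediately following run then yields the warm number.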
Phases we want to measure
- actions/setup-python
- pip install fbuild
- actions/cache restore (fbuild cache)
- git clone FastLED (+ LFS if relevant)
- fbuild daemon startup
- toolchain download / materialization (first compile)
- per-example compile time (or at least aggregate compile phase)
- actions/cache save
- job teardown
Each phase gets its own ::group:: + date +%s.%N delta so we can eyeball it in logs and parse it from artifacts.
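A minimal sketch of such a timing wrapper (the phase function name and TSV output format are illustrative, not the actual workflow's code; requires GNU date for %N):

```shell
#!/usr/bin/env bash
# Hypothetical phase-timing helper: wraps each phase in a ::group:: block
# and appends a "name<TAB>seconds" line for later parsing from the artifact.
set -euo pipefail

TIMINGS_FILE="${TIMINGS_FILE:-timings.tsv}"

phase() {
  local name="$1"; shift
  echo "::group::${name}"
  local start end
  start=$(date +%s.%N)
  "$@"                       # run the phase's command
  end=$(date +%s.%N)
  echo "::endgroup::"
  awk -v n="$name" -v s="$start" -v e="$end" \
    'BEGIN { printf "%s\t%.3f\n", n, e - s }' >> "$TIMINGS_FILE"
}

# Usage example; a real run would wrap the clone, compile, etc.
phase "sleep-demo" sleep 0.1
cat "$TIMINGS_FILE"
```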
Tracking
- [ ] bench/fastled-examples with initial workflow + README

Baselines and per-iteration results will be posted as comments on this issue.
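Assuming the raw timing artifact is a TSV of name<TAB>seconds lines (an assumption about the format), it can be reduced to a markdown table ready to paste into an issue comment:

```shell
# Hypothetical artifact summarizer: sum per-phase seconds into a markdown table.
# Sample data stands in for a downloaded artifact; row order is unspecified.
printf 'clone\t4.20\ncompile\t120.50\ncompile\t3.10\n' > timings.tsv
echo '| phase | total (s) |'
echo '|-------|-----------|'
awk -F'\t' '{ t[$1] += $2 } END { for (p in t) printf "| %s | %.2f |\n", p, t[p] }' timings.tsv
```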
Dimensions to vary later
- Board: uno, esp32dev, esp32s3, teensy41 (different toolchain sizes)
- OS: ubuntu-latest vs ubuntu-24.04 vs self-hosted
- Parallelism: --parallel vs --no-parallel
- fbuild daemon: cold-spawn vs pre-spawned
- pip: uv pip vs pip vs pre-built wheel cache
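These axes compose multiplicatively, so a later iteration would likely sweep them rather than hand-dispatch each combination — sketched here as a plain loop over the board axis (dispatch mechanics omitted):

```shell
# Hypothetical sweep over the board dimension; OS, parallelism, etc. would nest similarly.
boards="uno esp32dev esp32s3 teensy41"
for board in $boards; do
  echo "would benchmark board=${board}"
done
```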
Related
- build_template.yml