fuzz: improve CI iteration strategy, add corpus minimization and summary reporting#4497
joostjager wants to merge 2 commits into lightningdevkit:main
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main    #4497      +/-   ##
==========================================
+ Coverage   86.18%   86.19%   +0.01%
==========================================
  Files         160      160
  Lines      107536   107536
  Branches   107536   107536
==========================================
+ Hits        92680    92695      +15
+ Misses      12231    12215      -16
- Partials     2625     2626       +1
The sanity check (cargo test on fuzz targets) doesn't use the restored corpus and was blocking the actual fuzz run. Move it to a separate fuzz_sanity job so both run in parallel. AI tools were used in preparing this commit.
    - name: Run fuzzers
      run: cd fuzz && ./ci-fuzz.sh && cd ..
      env:
        FUZZ_MINIMIZE: ${{ contains(github.event.pull_request.labels.*.name, 'fuzz-minimize') }}
Why bother if we do this on main every time anyway?
I added it just for testing. It could also be useful when a fuzz-affecting refactor PR stays open for a long time and accumulates a large corpus.
I can also remove the label trigger. Do you prefer that?
Replace the fixed 30s run_time with iteration counts scaled to 8x corpus size (plus a 1000 baseline) with a 10-minute hard cap per target. This ensures the full corpus is replayed with room for mutations, while small targets finish quickly.

On main (and on PRs with the fuzz-minimize label), run honggfuzz corpus minimization after each target to prune inputs that don't contribute unique coverage, keeping the cache size manageable.

Print a summary table at the end with per-target stats: iterations, corpus sizes before/after fuzzing and minimization, and run times.

Other changes:
- Use -q (quiet) to suppress per-iteration status output
- Set 3s per-input timeout (-t 3) for all targets
- Pass FUZZ_MINIMIZE env var from PR label in workflow

AI tools were used in preparing this commit.
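The iteration budget described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from ci-fuzz.sh: the function name `iteration_budget`, the example corpus path, and the exact combination of flags are assumptions. It relies on honggfuzz's `-N` (iteration count) and `--run_time` (seconds) options to express the "8x corpus plus 1000, capped at 10 minutes" rule.

```shell
#!/usr/bin/env bash
# Hypothetical helper: the name and layout are illustrative, not taken from ci-fuzz.sh.
# Prints the iteration budget for a corpus directory: 8x the number of
# corpus files plus a baseline of 1000.
iteration_budget() {
    local corpus_dir="$1"
    local count
    # A missing directory counts as an empty corpus.
    count=$(find "$corpus_dir" -type f 2>/dev/null | wc -l)
    echo $(( count * 8 + 1000 ))
}

# Assumed corpus location for an example target; adjust to the real layout.
CORPUS_DIR="${CORPUS_DIR:-hfuzz_workspace/example_target/input}"
ITERATIONS=$(iteration_budget "$CORPUS_DIR")

# -N bounds the number of iterations; --run_time 600 acts as the 10-minute hard cap.
HFUZZ_RUN_ARGS="-q -n8 -t 3 -N $ITERATIONS --run_time 600"
echo "$HFUZZ_RUN_ARGS"
```

With this shape, a target with an empty corpus still gets 1000 iterations, while a large corpus is bounded by the time cap rather than by the iteration count.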
    if [ "$GITHUB_REF" = "refs/heads/main" ] || [ "$FUZZ_MINIMIZE" = "true" ]; then
        HFUZZ_RUN_ARGS="-M -q -n8 -t 3"
        export HFUZZ_RUN_ARGS
        MIN_START=$(date +%s)
        cargo --color always hfuzz run $FILE
        MIN_END=$(date +%s)
        MIN_TIME=$((MIN_END - MIN_START))
        MIN_CORPUS_COUNT=$(find "$CORPUS_DIR" -type f 2>/dev/null | wc -l)
The crash check (lines 55-61) runs only after the fuzz step, but not after the minimization step here. Minimization still executes every corpus input through the target binary. If a crash occurs during minimization (e.g., a newly-mutated input from the fuzz step that was added to the corpus but didn't crash during fuzzing due to timing, or a non-deterministic crash), the HONGGFUZZ.REPORT.TXT would be written but never checked.
Consider adding a crash check after the minimization run as well:
        HFUZZ_RUN_ARGS="-M -q -n8 -t 3"
        export HFUZZ_RUN_ARGS
        MIN_START=$(date +%s)
        cargo --color always hfuzz run $FILE
        MIN_END=$(date +%s)
        MIN_TIME=$((MIN_END - MIN_START))
        MIN_CORPUS_COUNT=$(find "$CORPUS_DIR" -type f 2>/dev/null | wc -l)
        if [ -f hfuzz_workspace/$FILE/HONGGFUZZ.REPORT.TXT ]; then
            cat hfuzz_workspace/$FILE/HONGGFUZZ.REPORT.TXT
            for CASE in hfuzz_workspace/$FILE/SIG*; do
                cat $CASE | xxd -p
            done
            exit 1
        fi
Fuzzing has become increasingly important with the recent wave of changes: async persist, channel manager refactors, splicing, and zero-fee channels. These are complex state machine changes where the fuzzer is one of our best tools for catching edge cases.
This PR gives the fuzz CI some overdue attention. The main goals are visibility into what the fuzzer is actually doing, and making the iteration budget more meaningful:
fuzz-minimize label.
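For the visibility goal, the per-target summary table mentioned in the commit message could be collected and printed along these lines. This is only a sketch under assumptions: the function names `record_target` and `print_summary`, the column set, and the sample values are hypothetical and not taken from ci-fuzz.sh.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: accumulate one row per fuzz target, print a table at the end.
SUMMARY_ROWS=()

# record_target NAME ITERATIONS CORPUS_BEFORE CORPUS_AFTER SECONDS
record_target() {
    SUMMARY_ROWS+=("$1|$2|$3|$4|$5")
}

print_summary() {
    local row name iters before after secs
    printf '%-28s %10s %8s %8s %8s\n' "target" "iters" "before" "after" "secs"
    for row in "${SUMMARY_ROWS[@]}"; do
        # Split the stored row back into its fields.
        IFS='|' read -r name iters before after secs <<< "$row"
        printf '%-28s %10s %8s %8s %8s\n' "$name" "$iters" "$before" "$after" "$secs"
    done
}

# Illustrative values only; a real script would fill these in per target.
record_target "example_target" 1024 3 5 12
print_summary
```

Keeping the rows in an array and printing once at the end means the table is not interleaved with fuzzer output from individual targets.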