docs/docs-cn: simplify pull_verify pipeline#4521
docs/docs-cn: simplify pull_verify pipeline#4521qiancai wants to merge 2 commits intoPingCAP-QE:mainfrom
Conversation
|
Hi @qiancai. Thanks for your PR. I'm waiting for a PingCAP-QE member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Code Review
This pull request refactors the pull_verify.groovy pipelines for both the docs-cn and docs repositories, transitioning from a matrix-based execution to a sequential stage-based model. The new structure includes explicit stages for checking out code, preparing check scripts via wget, and a comprehensive verification stage that runs multiple Python-based checks and markdownlint. Feedback suggests that for better CI reliability and performance, the check scripts should be baked into the Docker image rather than being downloaded and managed at runtime.
| cp -r ../check-scripts/* ./ | ||
|
|
||
| failed_checks=() | ||
|
|
||
| run_check() { | ||
| local name="\$1" | ||
| shift | ||
|
|
||
| echo "==> Running \${name}" | ||
| if "\$@" "\${diff_docs_files[@]}"; then | ||
| echo "==> \${name}: PASS" | ||
| else | ||
| echo "==> \${name}: FAIL" | ||
| failed_checks+=("\${name}") | ||
| fi | ||
| echo | ||
| } | ||
|
|
||
| run_check 'check-file-encoding' python3 ./check-file-encoding.py | ||
| run_check 'check-conflicts' python3 ./check-conflicts.py | ||
| run_check 'check-control-char' python3 ./check-control-char.py | ||
| run_check 'check-tags' python3 ./check-tags.py | ||
| run_check 'check-manual-line-breaks' python3 ./check-manual-line-breaks.py |
There was a problem hiding this comment.
Instead of managing check scripts at runtime by copying them or referencing sibling directories, these scripts should be baked into the Docker image. This approach improves CI reliability and performance by ensuring a consistent environment and reducing runtime overhead, consistent with repository practices.
failed_checks=()
run_check() {
local name="\$1"
shift
echo "==> Running \${name}"
if "\$@" "\${diff_docs_files[@]}"; then
echo "==> \${name}: PASS"
else
echo "==> \${name}: FAIL"
failed_checks+=("\${name}")
fi
echo
}
run_check 'check-file-encoding' python3 /usr/local/bin/check-file-encoding.py
run_check 'check-conflicts' python3 /usr/local/bin/check-conflicts.py
run_check 'check-control-char' python3 /usr/local/bin/check-control-char.py
run_check 'check-tags' python3 /usr/local/bin/check-tags.py
run_check 'check-manual-line-breaks' python3 /usr/local/bin/check-manual-line-breaks.py
References
- For better CI performance and reliability, bake dependencies into the Docker image instead of installing them at runtime in CI scripts.
| cp -r ../check-scripts/* ./ | ||
|
|
||
| failed_checks=() | ||
|
|
||
| run_check() { | ||
| local name="\$1" | ||
| shift | ||
|
|
||
| echo "==> Running \${name}" | ||
| if "\$@" "\${diff_docs_files[@]}"; then | ||
| echo "==> \${name}: PASS" | ||
| else | ||
| echo "==> \${name}: FAIL" | ||
| failed_checks+=("\${name}") | ||
| fi | ||
| echo | ||
| } | ||
|
|
||
| run_check 'check-file-encoding' python3 ./check-file-encoding.py | ||
| run_check 'check-conflicts' python3 ./check-conflicts.py | ||
| run_check 'check-control-char' python3 ./check-control-char.py | ||
| run_check 'check-tags' python3 ./check-tags.py | ||
| run_check 'check-manual-line-breaks' python3 ./check-manual-line-breaks.py |
There was a problem hiding this comment.
Instead of managing check scripts at runtime by copying them or referencing sibling directories, these scripts should be baked into the Docker image. This approach improves CI reliability and performance by ensuring a consistent environment and reducing runtime overhead, consistent with repository practices.
failed_checks=()
run_check() {
local name="\$1"
shift
echo "==> Running \${name}"
if "\$@" "\${diff_docs_files[@]}"; then
echo "==> \${name}: PASS"
else
echo "==> \${name}: FAIL"
failed_checks+=("\${name}")
fi
echo
}
run_check 'check-file-encoding' python3 /usr/local/bin/check-file-encoding.py
run_check 'check-conflicts' python3 /usr/local/bin/check-conflicts.py
run_check 'check-control-char' python3 /usr/local/bin/check-control-char.py
run_check 'check-tags' python3 /usr/local/bin/check-tags.py
run_check 'check-manual-line-breaks' python3 /usr/local/bin/check-manual-line-breaks.py
References
- For better CI performance and reliability, bake dependencies into the Docker image instead of installing them at runtime in CI scripts.
|
/ok-to-test |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: wuhuizuo The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/hold |
|
New changes are detected. LGTM label has been removed. |
What problem does this PR solve?
The current
pingcap/docsandpingcap/docs-cnpull_verifypipelines use a 6-leg matrix where each leg spins up its own pod and repeats the same heavy setup work:.mdfilesIn practice, the bottleneck is usually pod readiness and repeated setup, not the checks themselves.
A concrete example is
pingcap/docsPR #22748:pull_verifybuild #19043 was aborted after about 15m35sAnother recent example is
pingcap/docsPR #22753, where successful build #19024 took about 20m44s.For these small docs-only PRs, the actual checks are cheap because they only run against files from
git diff-tree; the repeated matrix setup dominates the runtime.What is changed?
This PR simplifies both
pingcap/docsandpingcap/docs-cnpull_verifypipelines from a 6-leg matrix into a single-pod sequential flow:Checkoutprow.checkoutRefsWithCacheLock(REFS)only oncePreparepingcap/docs/masteronly onceVerify.mdfilesThis keeps the existing validation scope, required context, and trigger model unchanged. The main difference is that we remove repeated pod scheduling, repeated checkout/cache restore, and repeated helper-script preparation.
Expected impact
Based on recent
pull_verifyruns, the current matrix-based flow commonly spends around 14-21 minutes even on relatively small docs PRs.With this change, the expected runtime for typical small or medium docs PRs should drop to roughly 2-5 minutes:
The exact savings still depend on Kubernetes scheduling and network conditions, but this should remove the largest fixed-cost duplication in the current design.
Validation
JENKINS_URL=https://do.pingcap.net/jenkins .ci/verify-jenkins-pipeline-file.sh pipelines/pingcap/docs/latest/pull_verify.groovyJENKINS_URL=https://do.pingcap.net/jenkins .ci/verify-jenkins-pipeline-file.sh pipelines/pingcap/docs-cn/latest/pull_verify.groovy