Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
44 changes: 44 additions & 0 deletions .github/workflows/docs-preview-cleanup.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
name: Docs Preview Cleanup

# Deletes Cloudflare Pages preview deployments for a PR when it closes.
# Runs as pull_request_target so secrets are available for fork PRs; it never
# checks out PR code, so there is no untrusted-code execution risk.

on:
pull_request_target: # zizmor: ignore[dangerous-triggers] never checks out PR code
types: [closed]

permissions: {}

jobs:
cleanup:
runs-on: ubuntu-latest
steps:
- name: Delete preview deployments for this PR
env:
CF_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }}
CF_ACCOUNT_ID: ${{ secrets.CLOUDFLARE_ACCOUNT_ID }}
CF_PROJECT: ${{ vars.CLOUDFLARE_PAGES_PROJECT }}
BRANCH: pr-${{ github.event.pull_request.number }}
run: |
set -euo pipefail
if [ -z "$CF_API_TOKEN" ] || [ -z "$CF_ACCOUNT_ID" ] || [ -z "$CF_PROJECT" ]; then
echo "Cloudflare credentials/project not configured; skipping cleanup."
exit 0
fi
base="https://api.cloudflare.com/client/v4/accounts/$CF_ACCOUNT_ID/pages/projects/$CF_PROJECT/deployments"
# Collect matching ids across all pages first, then delete — deleting
# mid-pagination would shift later pages and skip entries.
ids=""
for page in $(seq 1 200); do
resp=$(curl -fsS -H "Authorization: Bearer $CF_API_TOKEN" "$base?env=preview&per_page=25&page=$page")
ids="$ids $(jq -r --arg b "$BRANCH" '.result[]? | select(.deployment_trigger.metadata.branch == $b) | .id' <<<"$resp")"
[ "$(jq '.result | length' <<<"$resp")" -lt 25 ] && break
done
deleted=0
for id in $ids; do
echo "Deleting deployment $id"
curl -fsS -X DELETE -H "Authorization: Bearer $CF_API_TOKEN" "$base/$id?force=true" > /dev/null
deleted=$((deleted + 1))
done

Check warning on line 43 in .github/workflows/docs-preview-cleanup.yml

View check run for this annotation

Claude / Claude Code Review

Cleanup loop aborts on first failed DELETE, leaving remaining deployments undeleted

In the cleanup loop, each `curl -fsS -X DELETE` runs under `set -euo pipefail` with no per-id error tolerance, so a single failed DELETE (transient 5xx, rate limit, or an already-deleted deployment returning 404) aborts the step and skips every remaining id — leaving the rest of the PR's preview deployments stale, which is the leak this workflow exists to prevent. A one-line `|| echo "failed to delete $id"` per id (optionally tracking failures and exiting non-zero at the end) keeps the batch res
Comment on lines +38 to +43

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 In the cleanup loop, each curl -fsS -X DELETE runs under set -euo pipefail with no per-id error tolerance, so a single failed DELETE (transient 5xx, rate limit, or an already-deleted deployment returning 404) aborts the step and skips every remaining id — leaving the rest of the PR's preview deployments stale, which is the leak this workflow exists to prevent. A one-line || echo "failed to delete $id" per id (optionally tracking failures and exiting non-zero at the end) keeps the batch resilient while still surfacing the error.

Extended reasoning...

The bug. The delete loop in docs-preview-cleanup.yml runs:

for id in $ids; do
  echo "Deleting deployment $id"
  curl -fsS -X DELETE -H "Authorization: Bearer $CF_API_TOKEN" "$base/$id?force=true" > /dev/null
  deleted=$((deleted + 1))
done

under set -euo pipefail. Because of -f, curl exits non-zero on any HTTP ≥ 400 (and on transient network errors), and set -e then aborts the whole step mid-loop. Every id after the failing one in $ids is never deleted.

Concrete walk-through. Suppose a long-lived PR accumulated 8 preview deployments and $ids collects all 8. (1) The PR closes and the workflow fires once. (2) DELETE succeeds for ids 1–2. (3) The DELETE for id 3 hits a transient Cloudflare 5xx or a 429 — curl -f exits non-zero. (4) set -e terminates the step immediately; ids 4–8 are never attempted. (5) The workflow only triggers on pull_request_target: closed, so there is no automatic retry — the remaining deployments stay live on Cloudflare indefinitely unless someone notices the red run and manually re-runs it from the Actions UI.

Why nothing else prevents it. The script already shows care about partial-iteration hazards (the collect-then-delete comment to avoid pagination shifts, the 200-page cap raised after review), but the delete phase has no equivalent protection: there is no || true, no retry, and no per-id error accounting.

Addressing the counterargument. It's fair that fail-fast with a visible red run is a defensible design and that a manual re-run will re-list and delete the leftovers, so the original "no retry path" framing overstates it slightly — recovery exists, it's just manual and easy to miss on a closed PR. It's also true that the trigger needs an actual API failure on a specific DELETE, which is uncommon, and the consequence is the same low-impact class already discussed on the pagination-cap thread (a stale preview; nothing breaks). That's why this is a nit rather than a blocker. Still, a workflow whose entire purpose is unattended cleanup ideally shouldn't let one bad response abandon the rest of the batch when the hardening costs one line.

Fix. Tolerate per-id failures while still reporting them, e.g.:

failed=0
for id in $ids; do
  echo "Deleting deployment $id"
  if curl -fsS -X DELETE -H "Authorization: Bearer $CF_API_TOKEN" "$base/$id?force=true" > /dev/null; then
    deleted=$((deleted + 1))
  else
    echo "::warning::failed to delete deployment $id"
    failed=$((failed + 1))
  fi
done
echo "Deleted $deleted deployment(s) for $BRANCH ($failed failed)."
[ "$failed" -eq 0 ]

This deletes everything it can, still fails the run (visibly) if anything was left behind, and a manual re-run then only has the genuinely stuck ids to deal with.

echo "Deleted $deleted deployment(s) for $BRANCH."
231 changes: 231 additions & 0 deletions .github/workflows/docs-preview.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,231 @@
name: Docs Preview

# Builds the mkdocs site for a PR and deploys it to Cloudflare Pages.
#
# Security: mkdocs executes Python from the PR (mkdocstrings imports src/mcp,
# `!!python/name:` directives). The build is gated by `authorize` (admin sender
# for auto-preview, admin/maintainer commenter for /preview-docs) and isolated
# from Cloudflare secrets — `build` runs PR code with no secrets and hands the
# static site to `deploy` via an artifact, so PR code never shares a runner
# with the Cloudflare token.
#
# Required configuration:
# - secrets.CLOUDFLARE_API_TOKEN (scope: Account → Cloudflare Pages → Edit)
# - secrets.CLOUDFLARE_ACCOUNT_ID
# - vars.CLOUDFLARE_PAGES_PROJECT (existing Pages project, e.g. mcp-python-sdk-docs)

on:
pull_request_target: # zizmor: ignore[dangerous-triggers] build is permission-gated and secret-isolated; see header comment
types: [opened, reopened, synchronize]
paths:
- docs/**
- docs_src/**
- mkdocs.yml
- pyproject.toml
issue_comment:
types: [created]

permissions: {}

concurrency:
# Workflow-level concurrency is evaluated when the run is queued — before any
# job-level `if:` — so an unrelated PR comment would otherwise cancel an
# in-flight build. Only runs that actually produce a preview share a group;
# everything else falls through to a unique run_id group.
group: >-
docs-preview-pr-${{
github.event_name == 'pull_request_target' && github.event.pull_request.number
|| (github.event.issue.pull_request && startsWith(github.event.comment.body, '/preview-docs') && github.event.issue.number)
Comment thread
localden marked this conversation as resolved.
|| github.run_id
}}
cancel-in-progress: true
Comment thread
claude[bot] marked this conversation as resolved.

jobs:
authorize:
if: >-
github.event_name == 'pull_request_target' ||
(github.event.issue.pull_request && startsWith(github.event.comment.body, '/preview-docs'))
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: read
outputs:
authorized: ${{ steps.check.outputs.authorized }}
pr_number: ${{ steps.check.outputs.pr_number }}
head_sha: ${{ steps.check.outputs.head_sha }}
slash_attempt: ${{ steps.check.outputs.slash_attempt }}
steps:
- name: Determine authorization
id: check
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8.0.0
with:
script: |
const { owner, repo } = context.repo;

async function permissionFor(username) {
const { data } = await github.rest.repos.getCollaboratorPermissionLevel({ owner, repo, username });
Comment thread
localden marked this conversation as resolved.
return { level: data.permission, role: data.role_name };
}

let authorized = false;
let prNumber = '';
let headSha = '';
let slashAttempt = false;

if (context.eventName === 'pull_request_target') {
// Gate on the *sender* (whoever caused this run — on synchronize that
// is the pusher), not the PR author, so a non-admin pushing to an
// admin-opened branch does not get an automatic build.
const actor = context.payload.sender.login;
prNumber = String(context.payload.pull_request.number);
headSha = context.payload.pull_request.head.sha;
const perm = await permissionFor(actor);
authorized = perm.level === 'admin';
core.info(`pull_request_target by ${actor} (level=${perm.level}, role=${perm.role}) → authorized=${authorized}`);
} else {
// issue_comment: the job-level `if:` already guarantees this is a PR
// comment starting with /preview-docs.
slashAttempt = true;
const actor = context.payload.comment.user.login;
prNumber = String(context.payload.issue.number);
const perm = await permissionFor(actor);
authorized = perm.level === 'admin' || perm.role === 'maintain';
if (authorized) {
const { data: pr } = await github.rest.pulls.get({ owner, repo, pull_number: Number(prNumber) });
if (pr.state !== 'open') {
authorized = false;
core.info(`PR #${prNumber} is ${pr.state}; refusing to preview.`);
} else {
headSha = pr.head.sha;
}
}
core.info(`/preview-docs by ${actor} (level=${perm.level}, role=${perm.role}) → authorized=${authorized}`);
}

core.setOutput('authorized', String(authorized));
core.setOutput('pr_number', prNumber);
core.setOutput('head_sha', headSha);
core.setOutput('slash_attempt', String(slashAttempt));

build:
needs: authorize
if: needs.authorize.outputs.authorized == 'true'
runs-on: ubuntu-latest
permissions:
contents: read
steps:
- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
ref: ${{ needs.authorize.outputs.head_sha }}
persist-credentials: false

- name: Install uv
uses: astral-sh/setup-uv@fac544c07dec837d0ccb6301d7b5580bf5edae39 # v8.2.0
with:
# pull_request_target runs share the base-branch Actions cache; saving
# a cache populated while untrusted PR code ran would let it poison
# later trusted workflows. Mirrors publish-pypi.yml.
enable-cache: false
version: 0.9.5

- run: uv sync --frozen --group docs
- run: uv run --frozen --no-sync mkdocs build

Comment thread
claude[bot] marked this conversation as resolved.
- uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
with:
name: site
path: site/
retention-days: 1

deploy:
needs: [authorize, build]
if: needs.authorize.outputs.authorized == 'true'
runs-on: ubuntu-latest
permissions: {}
outputs:
deployment_url: ${{ steps.wrangler.outputs.deployment-url }}
alias_url: ${{ steps.wrangler.outputs.pages-deployment-alias-url }}
steps:
- uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
with:
name: site
path: site

- name: Deploy to Cloudflare Pages
id: wrangler
uses: cloudflare/wrangler-action@ebbaa1584979971c8614a24965b4405ff95890e0 # v4.0.0
with:
apiToken: ${{ secrets.CLOUDFLARE_API_TOKEN }}
accountId: ${{ secrets.CLOUDFLARE_ACCOUNT_ID }}
packageManager: npm
command: >-
pages deploy ./site
--project-name=${{ vars.CLOUDFLARE_PAGES_PROJECT }}
--branch=pr-${{ needs.authorize.outputs.pr_number }}
--commit-hash=${{ needs.authorize.outputs.head_sha }}
--commit-dirty=true

comment:
needs: [authorize, build, deploy]
if: >-
always() &&
needs.deploy.result != 'cancelled' &&
(needs.authorize.outputs.authorized == 'true' || needs.authorize.outputs.slash_attempt == 'true')
runs-on: ubuntu-latest
permissions:
pull-requests: write
steps:
- name: Post or update preview comment
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8.0.0
env:
AUTHORIZED: ${{ needs.authorize.outputs.authorized }}
PR_NUMBER: ${{ needs.authorize.outputs.pr_number }}
HEAD_SHA: ${{ needs.authorize.outputs.head_sha }}
DEPLOY_RESULT: ${{ needs.deploy.result }}
DEPLOYMENT_URL: ${{ needs.deploy.outputs.deployment_url }}
ALIAS_URL: ${{ needs.deploy.outputs.alias_url }}
RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
with:
script: |
const { owner, repo } = context.repo;
const env = process.env;
const issue_number = Number(env.PR_NUMBER);
const marker = '<!-- docs-preview -->';

async function upsert(body) {
const comments = await github.paginate(github.rest.issues.listComments, { owner, repo, issue_number, per_page: 100 });
const existing = comments.find(c => c.user?.login === 'github-actions[bot]' && c.body?.includes(marker));
if (existing) {
await github.rest.issues.updateComment({ owner, repo, comment_id: existing.id, body });
} else {
await github.rest.issues.createComment({ owner, repo, issue_number, body });
}
}

Comment thread
localden marked this conversation as resolved.
if (env.AUTHORIZED !== 'true') {
await github.rest.issues.createComment({
owner, repo, issue_number,
body: `@${context.actor} — only repository admins or maintainers can run \`/preview-docs\` (and the PR must be open).`,
});
return;
}

if (env.DEPLOY_RESULT !== 'success') {
await upsert(
`${marker}\n### 📚 Documentation preview\n\n` +
`❌ Preview build **failed** for \`${env.HEAD_SHA.slice(0, 7)}\` — [workflow logs](${env.RUN_URL}).`
);
return;
}

const previewUrl = env.ALIAS_URL || env.DEPLOYMENT_URL;
const ts = new Date().toISOString().replace('T', ' ').replace(/\.\d+Z$/, ' UTC');
await upsert(
`${marker}\n### 📚 Documentation preview\n\n` +
`| | |\n|---|---|\n` +
`| **Preview** | ${previewUrl} |\n` +
`| **Deployment** | ${env.DEPLOYMENT_URL} |\n` +
`| **Commit** | \`${env.HEAD_SHA.slice(0, 7)}\` |\n` +
`| **Triggered by** | @${context.actor} |\n` +
`| **Updated** | ${ts} |\n`
);
Loading