1 change: 1 addition & 0 deletions README.md
@@ -47,6 +47,7 @@ correctness of the action.

- [ASF Infrastructure Pelican Action](/pelican/README.md): Generate and publish project websites with GitHub Actions
- [Stash Action](/stash/README.md): Manage large build caches
- [ASF Allowlist Check](/allowlist-check/README.md): Verify workflow action refs are on the ASF allowlist

## Management of Organization-wide GitHub Actions Allow List

115 changes: 115 additions & 0 deletions allowlist-check/README.md
@@ -0,0 +1,115 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# ASF Allowlist Check

A composite GitHub Action that verifies all `uses:` refs in a project's workflow files are on the ASF Infrastructure [approved allowlist](../approved_patterns.yml). Catches violations **before merge**, preventing the silent CI failures that occur when an action is not on the org-level allowlist (see [#574](https://github.com/apache/infrastructure-actions/issues/574)).

## Why

When a GitHub Actions workflow references an action that isn't on the ASF org-level allowlist, the CI job silently fails with "Startup failure" — no logs, no notifications, and the PR may appear green because no checks ran. This action catches those problems at PR time with a clear error message.

## Usage

Add a workflow file to your project (e.g., `.github/workflows/asf-allowlist-check.yml`):

```yaml
name: "ASF Allowlist Check"

on:
  pull_request:
    paths:
      - ".github/**"
  push:
    branches:
      - main
    paths:
      - ".github/**"

permissions:
  contents: read

jobs:
  asf-allowlist-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          persist-credentials: false
      - uses: apache/infrastructure-actions/allowlist-check@main
```

That's it — two steps. The `actions/checkout` step checks out your repo so `.github/` is available to scan, then the allowlist check runs against those files.

## Inputs

| Input | Required | Default | Description |
|---|---|---|---|
| `scan-glob` | No | `.github/**/*.yml` | Glob pattern for YAML files to scan for action refs |

### Custom scan glob

To scan only workflow files (excluding other YAML under `.github/`):

```yaml
- uses: apache/infrastructure-actions/allowlist-check@main
  with:
    scan-glob: ".github/workflows/*.yml"
```
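
As a rough illustration of how the two patterns differ, the sketch below expands them with Python's `glob.glob(..., recursive=True)`, which is the call the check script uses. The helper names and the demo file tree are hypothetical and exist only for this example. Note that neither pattern picks up files with a `.yaml` extension.

```python
import glob
import os
import tempfile


def matched_files(root: str, pattern: str) -> list[str]:
    """Return paths under root that match pattern, relative to root, sorted."""
    # glob.escape protects the temp directory name; only `pattern` is a glob.
    hits = glob.glob(os.path.join(glob.escape(root), pattern), recursive=True)
    return sorted(os.path.relpath(hit, root).replace(os.sep, "/") for hit in hits)


def make_demo_tree() -> str:
    """Create a throwaway tree with a few empty YAML files (hypothetical layout)."""
    root = tempfile.mkdtemp()
    for rel in (
        ".github/dependabot.yml",
        ".github/workflows/ci.yml",
        ".github/workflows/release.yaml",
        ".github/actions/build/action.yml",
    ):
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    return root


demo_root = make_demo_tree()
# Default pattern: every *.yml file anywhere under .github/
default_hits = matched_files(demo_root, ".github/**/*.yml")
# Narrower pattern: only *.yml files directly inside .github/workflows/
workflow_hits = matched_files(demo_root, ".github/workflows/*.yml")
print(default_hits)
print(workflow_hits)
```

If a project names its workflows `*.yaml`, a pattern such as `.github/**/*.yaml` would presumably be needed instead, since a single `scan-glob` value holds one pattern.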

## What it checks

The action scans all matching YAML files for `uses:` keys and validates each action ref against the [approved_patterns.yml](../approved_patterns.yml) allowlist.
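
The scan can be pictured as a recursive walk over the parsed YAML tree. The following is a minimal sketch that mirrors the approach in `check_asf_allowlist.py` but runs on a hand-written dictionary instead of a file parsed by ruyaml; the `workflow` data and the function name `find_uses` are made up for illustration.

```python
from typing import Any, Iterator


def find_uses(node: Any) -> Iterator[str]:
    """Yield every string stored under a `uses` key, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "uses" and isinstance(value, str):
                yield value
            else:
                # Keep descending: steps live under jobs.<id>.steps[*]
                yield from find_uses(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_uses(item)
    # Scalars (strings, numbers, None) contain no refs and yield nothing.


# Hypothetical parsed workflow, standing in for the YAML loader's output.
workflow = {
    "jobs": {
        "build": {
            "steps": [
                {"uses": "actions/checkout@v4"},
                {"name": "Lint", "uses": "some-org/some-action@v1"},
                {"run": "make test"},
            ]
        },
        "reuse": {"uses": "./.github/workflows/shared.yml"},
    }
}

refs = list(find_uses(workflow))
print(refs)
```

Because the walk looks at every mapping, job-level `uses:` entries (reusable workflows) are collected along with step-level ones.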

### Automatically allowed

Actions from these GitHub organizations are implicitly trusted and don't need to be in the allowlist:
- `actions/*` — GitHub's official actions
- `github/*` — GitHub's own actions
- `apache/*` — ASF's own actions
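
The decision for a single ref can be sketched as follows: trusted owners pass immediately, and everything else is compared against the allowlist patterns with `fnmatch`. This follows the logic of the check script, but the entries in `demo_allowlist` are illustrative assumptions rather than real allowlist contents (only the `golangci/*@*` form is taken from the script's docstring).

```python
import fnmatch

TRUSTED_OWNERS = {"actions", "github", "apache"}


def is_allowed(action_ref: str, allowlist: list[str]) -> bool:
    """Return True if the ref's owner is trusted or the ref matches a pattern."""
    owner = action_ref.split("/")[0]
    if owner in TRUSTED_OWNERS:
        return True
    # In fnmatch, "*" also matches "/", so "golangci/*@*" covers any repo
    # (and any sub-path) under the golangci owner.
    return any(fnmatch.fnmatch(action_ref, pattern) for pattern in allowlist)


# Hypothetical allowlist entries, for illustration only.
demo_allowlist = [
    "some-org/some-action@*",   # any ref of one action
    "golangci/*@*",             # any repo under an owner, any ref
    "other-org/tool@0123abc",   # one exact ref
]

for ref in (
    "actions/checkout@v4",
    "golangci/golangci-lint-action@v6",
    "other-org/tool@0123abc",
    "other-org/tool@ffff000",
):
    print(ref, is_allowed(ref, demo_allowlist))
```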

### Skipped

- **Local refs** (`./`) — paths within the same repo are not subject to the org allowlist
- **Docker refs** (`docker://`) — container actions pulled directly from a registry
- **Empty YAML files** — skipped
- **Malformed YAML files** are not skipped: the action reports the parse error and fails
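
A small sketch of the skip rule for local and Docker refs, assuming the refs have already been extracted as `(filepath, ref)` pairs. The function name `group_refs` and the sample data are hypothetical; the prefix tuple matches the one in the check script.

```python
SKIPPED_PREFIXES = ("./", "docker://")


def group_refs(found: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each non-skipped ref to the files that use it."""
    grouped: dict[str, list[str]] = {}
    for filepath, ref in found:
        # str.startswith accepts a tuple and checks each prefix in turn.
        if ref.startswith(SKIPPED_PREFIXES):
            continue
        grouped.setdefault(ref, []).append(filepath)
    return grouped


# Hypothetical scan result, for illustration only.
found = [
    (".github/workflows/ci.yml", "actions/checkout@v4"),
    (".github/workflows/ci.yml", "./.github/actions/build"),
    (".github/workflows/ci.yml", "docker://alpine:3.19"),
    (".github/workflows/release.yml", "actions/checkout@v4"),
]

grouped = group_refs(found)
print(grouped)
```

Grouping by ref means each ref is checked once, while the list of files is kept so that every occurrence can still be reported.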

### Violation output

When violations are found, the action fails with exit code 1 and prints:

```
::error::Found 2 action ref(s) not on the ASF allowlist:
::error file=.github/workflows/ci.yml::some-org/some-action@v1 is not on the ASF allowlist
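::error file=.github/workflows/release.yml::other-org/other-action@abc123 is not on the ASF allowlist
::error::To resolve, open a PR in apache/infrastructure-actions to add the action or version to the allowlist: https://github.com/apache/infrastructure-actions#adding-a-new-action-to-the-allow-list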
```

To resolve a violation, open a PR in this repo to [add the action](../README.md#adding-a-new-action-to-the-allow-list) or [add a new version](../README.md#adding-a-new-version-to-the-allow-list) to the allowlist.

When all refs pass:

```
All 15 unique action refs are on the ASF allowlist
```
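
The two outcomes above could be produced by a function along these lines. This is a simplified sketch, not the script's actual code: `render_report` is a hypothetical helper that returns the lines and exit code instead of printing and exiting, and it leaves out the resolution hint that the script also prints on failure.

```python
def render_report(
    violations: list[tuple[str, str]], unique_ref_count: int
) -> tuple[list[str], int]:
    """Build the output lines and exit code for a finished scan.

    violations holds one (filepath, ref) pair per offending occurrence.
    """
    if not violations:
        message = f"All {unique_ref_count} unique action refs are on the ASF allowlist"
        return [message], 0
    # `::error ...::` is a GitHub Actions workflow command; the file=
    # parameter attaches the message to that file in the run summary.
    lines = [f"::error::Found {len(violations)} action ref(s) not on the ASF allowlist:"]
    for filepath, ref in violations:
        lines.append(f"::error file={filepath}::{ref} is not on the ASF allowlist")
    return lines, 1


lines, exit_code = render_report(
    [(".github/workflows/ci.yml", "some-org/some-action@v1")], unique_ref_count=3
)
print("\n".join(lines))
print("exit code:", exit_code)
```

The count in the first line is per occurrence, so a ref used in two files is counted twice.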

## Dependencies

- Python 3 (pre-installed on GitHub-hosted runners)
- ruyaml (installed automatically by the action)
41 changes: 41 additions & 0 deletions allowlist-check/action.yml
@@ -0,0 +1,41 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

name: "ASF Allowlist Check"
description: >
  Verify that all GitHub Actions uses: refs in the caller's workflow files
  are on the ASF Infrastructure approved allowlist. Fails with a clear error
  listing any refs that are not allowlisted.
author: kevinjqliu

inputs:
  scan-glob:
    description: "Glob pattern for YAML files to scan for action refs"
    required: false
    default: ".github/**/*.yml"

runs:
  using: composite
  steps:
    - name: Install ruyaml
      shell: bash
      run: pip install ruyaml
    - name: Verify all action refs are allowlisted
      shell: bash
      run: python3 "${{ github.action_path }}/check_asf_allowlist.py" "${{ github.action_path }}/../approved_patterns.yml"
      env:
        GITHUB_YAML_GLOB: ${{ inputs.scan-glob }}
184 changes: 184 additions & 0 deletions allowlist-check/check_asf_allowlist.py
@@ -0,0 +1,184 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

"""Check that all GitHub Actions uses: refs are on the ASF allowlist.

Usage:
    python3 check_asf_allowlist.py <allowlist_path>

The allowlist is the approved_patterns.yml file located at the root of
this repository (../approved_patterns.yml relative to this script).

The glob pattern for YAML files to scan can be overridden via the
GITHUB_YAML_GLOB environment variable (default: .github/**/*.yml).

Exits with code 1 if any action ref is not allowlisted.
"""

import fnmatch
import glob
import os
import sys
from typing import Any, Generator

import ruyaml

# actions/*, github/*, apache/* are implicitly trusted by GitHub/ASF
# See ../README.md ("Management of Organization-wide GitHub Actions Allow List")
TRUSTED_OWNERS = {"actions", "github", "apache"}

# Default glob pattern for YAML files to scan for action refs
DEFAULT_GITHUB_YAML_GLOB = ".github/**/*.yml"

# Prefixes that indicate local or non-GitHub refs (not subject to allowlist)
# ./ — local composite actions within the same repo
# docker:// — container actions pulled directly from a registry
SKIPPED_PREFIXES = ("./", "docker://")

# YAML key that references a GitHub Action
USES_KEY = "uses"


def find_action_refs(node: Any) -> Generator[str, None, None]:
    """Recursively find all `uses:` values from a parsed YAML tree.

    Args:
        node: A parsed YAML node (any type returned by ruyaml)

    Yields:
        str: Each `uses:` string value found in the tree
    """
    if isinstance(node, dict):
        for key, value in node.items():
            if key == USES_KEY and isinstance(value, str):
                yield value
            else:
                yield from find_action_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_action_refs(item)


def collect_action_refs(
    scan_glob: str = DEFAULT_GITHUB_YAML_GLOB,
) -> dict[str, list[str]]:
    """Collect all third-party action refs from YAML files.

    Skips local (./) and Docker (docker://) refs, as these are not
    subject to the org-level allowlist.

    Args:
        scan_glob: Glob pattern for files to scan.

    Returns:
        dict: Mapping of each action ref to the list of file paths that use it.
    """

    action_refs = {}
    for filepath in sorted(glob.glob(scan_glob, recursive=True)):
        try:
            yaml = ruyaml.YAML()
            with open(filepath) as f:
                content = yaml.load(f)
        except ruyaml.YAMLError as exc:
            print(f"::error file={filepath}::Failed to parse YAML: {exc}")
            sys.exit(1)
        if not content:
            continue
        for ref in find_action_refs(content):
            if ref.startswith(SKIPPED_PREFIXES):
                continue
            action_refs.setdefault(ref, []).append(filepath)
    return action_refs


def load_allowlist(allowlist_path: str) -> list[str]:
    """Load the ASF approved_patterns.yml file.

    The file is a flat YAML list of entries like:
        - owner/action@<sha>   (exact SHA match)
        - owner/action@*       (any ref allowed)
        - golangci/*@*         (any repo under owner, any ref)

    In Python's fnmatch.fnmatch, "*" also matches "/" (unlike shell globs),
    so these patterns work directly without transformation.

    Args:
        allowlist_path: Path to the approved_patterns.yml file

    Returns:
        list[str]: List of allowlist patterns (empty list if file is empty)
    """
    yaml = ruyaml.YAML()
    with open(allowlist_path) as f:
        result = yaml.load(f)
    return result if result else []


def is_allowed(action_ref: str, allowlist: list[str]) -> bool:
    """Check whether a single action ref is allowed.

    An action ref is allowed if its owner is in TRUSTED_OWNERS or it
    matches any pattern in the allowlist via fnmatch.

    Args:
        action_ref: The action reference string (e.g., "owner/action@ref")
        allowlist: List of allowlist patterns to match against

    Returns:
        bool: True if the action ref is allowed
    """
    owner = action_ref.split("/")[0]
    if owner in TRUSTED_OWNERS:
        return True
    return any(fnmatch.fnmatch(action_ref, pattern) for pattern in allowlist)


def main():
    if len(sys.argv) != 2:
        print(f"Usage: {sys.argv[0]} <allowlist_path>", file=sys.stderr)
        sys.exit(2)

    allowlist_path = sys.argv[1]
    allowlist = load_allowlist(allowlist_path)
    scan_glob = os.environ.get("GITHUB_YAML_GLOB", DEFAULT_GITHUB_YAML_GLOB)
    action_refs = collect_action_refs(scan_glob)

    violations = []
    for action_ref, filepaths in sorted(action_refs.items()):
        if not is_allowed(action_ref, allowlist):
            for filepath in filepaths:
                violations.append((filepath, action_ref))

    if violations:
        print(
            f"::error::Found {len(violations)} action ref(s) not on the ASF allowlist:"
        )
        for filepath, action_ref in violations:
            print(f"::error file={filepath}::{action_ref} is not on the ASF allowlist")
        print(
            "::error::To resolve, open a PR in apache/infrastructure-actions to add"
            " the action or version to the allowlist:"
            " https://github.com/apache/infrastructure-actions#adding-a-new-action-to-the-allow-list"
        )
        sys.exit(1)
    else:
        print(f"All {len(action_refs)} unique action refs are on the ASF allowlist")


if __name__ == "__main__":
    main()