Merged
92 changes: 67 additions & 25 deletions AGENTS.md
@@ -1,10 +1,19 @@
# CLAUDE.md
# AGENTS.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

github_linter is a Python tool for auditing GitHub repositories at scale. It scans repositories for common configuration issues, missing files, and standardization opportunities across multiple repos.
github_linter is a Python tool for managing GitHub repositories in bulk. Its main job is to inspect repositories for policy or configuration drift, report what is wrong, and optionally apply repository-side fixes such as creating or updating files, workflows, and branch protection.

The project is built around a modular "test then fix" model:

- A module can define one or more `check_*` functions that inspect a repository and report problems.
- The same module can define one or more `fix_*` functions that make corrective changes when `--fix` is enabled.
- Modules can be run all at once, filtered by module name, or filtered down to specific checks/fixes.
- Modules can also declare language requirements so Python-only or Terraform-only checks do not run on unrelated repositories.

In practice, the tool is closer to a repository configuration manager than a passive linter. It can validate state, report drift through warnings and errors, and push changes back to GitHub when fixes are enabled.
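
As a rough illustration of the "test then fix" model described above, a module might look like the sketch below. `FakeRepo`, `check_has_readme`, and `fix_has_readme` are hypothetical stand-ins invented for this example, not the real `RepoLinter` API; only `CATEGORY`, `LANGUAGES`, `DEFAULT_CONFIG`, and the `check_*`/`fix_*` prefixes come from this document.

```python
"""Hypothetical module sketch; not part of github_linter."""

from __future__ import annotations

from dataclasses import dataclass, field

CATEGORY = "readme"
LANGUAGES = ["all"]
DEFAULT_CONFIG: dict[str, str] = {"filename": "README.md"}


@dataclass
class FakeRepo:
    """Stand-in for RepoLinter: keeps files in memory and records results."""

    files: dict[str, str] = field(default_factory=dict)
    errors: list[tuple[str, str]] = field(default_factory=list)
    fixes: list[tuple[str, str]] = field(default_factory=list)

    def error(self, category: str, message: str) -> None:
        self.errors.append((category, message))

    def fix(self, category: str, message: str) -> None:
        self.fixes.append((category, message))


def check_has_readme(repo: FakeRepo) -> None:
    """Report a problem when the expected file is missing."""
    filename = DEFAULT_CONFIG["filename"]
    if filename not in repo.files:
        repo.error(CATEGORY, f"Missing {filename}")


def fix_has_readme(repo: FakeRepo) -> None:
    """Create the file only when it is missing, then record the fix."""
    filename = DEFAULT_CONFIG["filename"]
    if filename in repo.files:
        return
    repo.files[filename] = "# Placeholder\n"
    repo.fix(CATEGORY, f"Created {filename}")
```

The fix repeats the check's condition before writing, so running it twice does not record a second fix. The `python` module added in this change follows the same pattern with `_has_pytest_test`.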

**MANDATORY** You are not finished with a task until `just check` passes without warnings or errors.

@@ -37,28 +46,53 @@ github_linter is a Python tool for auditing GitHub repositories at scale. It sca

## Architecture

### Core Components
### Core flow

1. `github_linter/__main__.py` is the CLI entrypoint.
   - Parses repo, owner, module, check, and `--fix` flags.
   - Loads the available modules from `github_linter.tests`.
   - Selects repositories via the GitHub API.

2. `GithubLinter` in `github_linter/__init__.py` is the top-level orchestrator.
   - Loads config from `github_linter.json` or `~/.config/github_linter.json`.
   - Authenticates with both PyGithub and `github3.py`.
   - Builds the repo list, applies module selection, handles rate limiting, and stores the final report.

3. `RepoLinter` in `github_linter/repolinter.py` is the per-repository execution context.
   - Wraps one GitHub repository.
   - Caches file lookups.
   - Merges module default config into runtime config.
   - Records `errors`, `warnings`, and `fixes`.
   - Exposes helper methods for reading repo files, checking languages, and writing fixes back to GitHub.

4. Modules under `github_linter/tests/` provide the actual repository rules.
   - They are imported in `github_linter/tests/__init__.py`.
   - Each module declares `CATEGORY`, `LANGUAGES`, and `DEFAULT_CONFIG`.
   - Each module contributes `check_*` and optional `fix_*` functions.

1. **GithubLinter** (`github_linter/__init__.py`) - Main orchestrator that:
   - Handles GitHub authentication (via environment variable `GITHUB_TOKEN` or config file)
   - Manages rate limiting
   - Coordinates module execution across repositories
   - Generates reports
5. File and workflow templates used by fixes live under `github_linter/fixes/`.
   - Fix functions typically read these templates and commit them into the target repository with `RepoLinter.create_or_update_file()`.

2. **RepoLinter** (`github_linter/repolinter.py`) - Per-repository handler that:
   - Manages file caching for the repository
   - Runs test modules against the repository
   - Tracks errors, warnings, and fixes
   - Provides utility methods for checking files and languages
   - Handles file creation/updates with protected branch awareness
### Execution model

3. **Test Modules** (`github_linter/tests/`) - Pluggable modules that check specific aspects:
   - Each module must define `CATEGORY`, `LANGUAGES`, and `DEFAULT_CONFIG`
   - Functions starting with `check_` are automatically discovered and run
   - Functions starting with `fix_` are run when `--fix` flag is used
   - Modules are loaded dynamically in `tests/__init__.py`
For each selected repository, the CLI creates a `RepoLinter` and runs each enabled module through `RepoLinter.run_module()`.

### Available Test Modules
- Module config defaults are merged in before execution.
- Language filtering happens before a module runs.
- Every `check_*` function in the module is executed first.
- If `--fix` is enabled, every `fix_*` function is then executed as part of the same module pass.
- Check/fix execution can be narrowed with `--module` and `--check`.
- Skip exceptions, raised for example on archived, private, or protected repositories, are treated as normal control flow and swallowed by the runner.

This means a typical workflow is:

1. Run checks across all or some repositories.
2. Review the report.
3. Re-run with `--fix` to apply the module fixes that correspond to the same problem space.

The code does not enforce one exact `check_*` to `fix_*` pairing by name. Instead, checks and fixes are grouped by module and category, so a module usually contains the validation and remediation logic for the same repository concern.
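
The execution order above can be sketched as a small prefix-based dispatcher. This is a minimal illustration, not the body of `RepoLinter.run_module()`: `SkipRepo`, `functions_with_prefix`, the sorted call order, and the choice to skip only the raising function are all assumptions made for the example.

```python
"""Sketch of prefix-based check/fix dispatch. All names here are illustrative."""

from types import SimpleNamespace
from typing import Callable, List


class SkipRepo(Exception):
    """Hypothetical skip signal, e.g. raised for an archived repository."""


def functions_with_prefix(module: object, prefix: str) -> List[Callable]:
    """Return the module's callables whose names start with prefix, sorted by name."""
    names = sorted(name for name in dir(module) if name.startswith(prefix))
    return [getattr(module, name) for name in names if callable(getattr(module, name))]


def run_module(module: object, repo: object, fix: bool = False) -> List[str]:
    """Run every check_*, then every fix_* when fix is True; return names that completed."""
    completed: List[str] = []
    prefixes = ["check_", "fix_"] if fix else ["check_"]
    for prefix in prefixes:
        for func in functions_with_prefix(module, prefix):
            try:
                func(repo)
            except SkipRepo:
                # A skip is normal control flow, so it is swallowed here.
                continue
            completed.append(func.__name__)
    return completed


def check_example(log: List[str]) -> None:
    log.append("check_example")


def check_skipped(log: List[str]) -> None:
    raise SkipRepo("archived")


def fix_example(log: List[str]) -> None:
    log.append("fix_example")


# A namespace stands in for an imported module object.
demo_module = SimpleNamespace(
    check_example=check_example,
    check_skipped=check_skipped,
    fix_example=fix_example,
)
```

Calling `run_module(demo_module, [])` runs only the checks, while `run_module(demo_module, [], fix=True)` runs the checks and then the fixes in the same pass. Nothing ties `fix_example` to `check_example` by name, which mirrors the module-level grouping described above.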

### Available Modules

- `branch_protection` - Validates and configures branch protection on default branches
- `codeowners` - Validates CODEOWNERS files
@@ -72,13 +106,21 @@ github_linter is a Python tool for auditing GitHub repositories at scale. It sca
- `security_md` - Checks for SECURITY.md
- `terraform` - Checks Terraform provider configurations

### Module Language Filtering
### Repository selection and filtering

- Repositories are selected from CLI flags and/or `linter.owner_list` in config.
- If no owner is supplied, the current authenticated user is used.
- `--module` limits which modules are enabled.
- `--check` filters the `check_*` and `fix_*` function names within enabled modules.
- `--list-repos` prints the resolved repo set without running modules.

### Module language filtering

Modules declare which languages they apply to via the `LANGUAGES` attribute:

- Use `["all"]` for modules that apply to all repositories
- Use specific languages (e.g., `["python"]`, `["rust"]`) to run only on repos with those languages
- Language detection is based on GitHub's automatic language detection
- Use `["all"]` for modules that apply to every repository.
- Use specific languages such as `["python"]` or `["terraform"]` to restrict execution.
- Language detection comes from GitHub's repository language API, not local file inspection.
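
The filtering rule can be sketched as a set intersection. This assumes the repository languages arrive as a mapping of language name to byte count, which is the shape GitHub's languages endpoint returns; the function name `module_applies` and the case-insensitive comparison are assumptions for illustration, not confirmed details of `RepoLinter`.

```python
from typing import Iterable, Mapping


def module_applies(module_languages: Iterable[str], repo_languages: Mapping[str, int]) -> bool:
    """Return True when a module's LANGUAGES list matches the repository."""
    wanted = {language.lower() for language in module_languages}
    if "all" in wanted:
        # Language-agnostic modules run on every repository.
        return True
    # Assumption: names are compared case-insensitively ("Python" vs "python").
    detected = {language.lower() for language in repo_languages}
    return bool(wanted & detected)
```

For example, `module_applies(["python"], {"Python": 1200, "Shell": 40})` is true, while the same repository would not run a module declaring `["terraform"]`.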

### Configuration

@@ -87,7 +129,7 @@ Configuration file locations (in priority order):
1. `./github_linter.json` (local directory)
2. `~/.config/github_linter.json` (user config)

Each module can define `DEFAULT_CONFIG` which gets merged with user configuration.
Each module can define `DEFAULT_CONFIG`, which is merged into the active config before the module runs. That lets modules ship sane defaults while still allowing overrides in the JSON config file.
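
One way to picture the merge is defaults first, user values on top. The sketch below assumes a recursive merge in which values from the JSON config win; whether the project merges nested keys this way is not stated here, and `merge_module_defaults` and the `example_module` keys are hypothetical.

```python
from typing import Any, Dict


def merge_module_defaults(defaults: Dict[str, Any], user_config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict with defaults overridden by user-supplied values."""
    merged = dict(defaults)
    for key, value in user_config.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Assumption: nested sections are merged key by key, not replaced whole.
            merged[key] = merge_module_defaults(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical module defaults and a partial user override.
module_defaults = {"example_module": {"enabled": True, "threshold": 1}}
user_overrides = {"example_module": {"threshold": 2}}
active_config = merge_module_defaults(module_defaults, user_overrides)
```

Here `active_config` keeps the default `enabled` value and takes `threshold` from the user config, and `module_defaults` itself is left unmodified.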

#### Branch Protection Configuration

8 changes: 8 additions & 0 deletions github_linter/fixes/python/placeholder_test_nothing.py
@@ -0,0 +1,8 @@
"""doesn't test anything"""

import pytest


def test_nothing() -> None:
    """doesn't test anything"""
    pytest.skip("This is just a placeholder")
1 change: 1 addition & 0 deletions github_linter/tests/__init__.py
@@ -17,6 +17,7 @@
    homebrew,  # noqa: F401
    issues,  # noqa: F401
    mkdocs,  # noqa: F401
    python,  # noqa: F401
    pyproject,  # noqa: F401
    security_md,  # noqa: F401
    terraform,  # noqa: F401
85 changes: 85 additions & 0 deletions github_linter/tests/python.py
@@ -0,0 +1,85 @@
"""Python-specific checks and fixes"""

from pathlib import PurePosixPath

from github.GithubException import GithubException
from loguru import logger

from ..repolinter import RepoLinter
from ..utils import get_fix_file_path

CATEGORY = "python"
LANGUAGES = ["python"]
DEFAULT_CONFIG = {}

PLACEHOLDER_TEST_PATH = "tests/test_nothing.py"
PLACEHOLDER_TEMPLATE_PATH = "placeholder_test_nothing.py"


def _is_pytest_test_path(path: str) -> bool:
    """Return True when the path is a pytest-style test under tests/."""

    path_parts = PurePosixPath(path).parts
    if not path_parts or path_parts[0] != "tests":
        return False

    filename = path_parts[-1]
    return filename.startswith("test") and filename.endswith(".py")


def _has_pytest_test(repo: RepoLinter) -> bool:
    """Check the repository tree for at least one pytest-style test file."""

    default_branch = repo.repository.get_branch(repo.repository.default_branch)
    tree = repo.repository.get_git_tree(default_branch.commit.sha, recursive=True)

    for tree_item in tree.tree:
        if getattr(tree_item, "type", None) != "blob":
            continue
        if _is_pytest_test_path(getattr(tree_item, "path", "")):
            return True

    return False


def check_has_a_pytest_test(repo: RepoLinter) -> None:
    """Ensure Python repositories contain at least one pytest-style test file."""

    repo.skip_on_archived()

    try:
        if _has_pytest_test(repo):
            return
    except GithubException as exc:
        logger.error("Failed to inspect repository tree for {}: {}", repo.repository.full_name, exc)
        repo.error(CATEGORY, "Failed to inspect repository tree for pytest tests.")
        return

    repo.error(CATEGORY, "Missing pytest tests. Expected at least one Python file matching tests/test*.py.")


def fix_has_a_pytest_test(repo: RepoLinter) -> None:
    """Create a placeholder pytest file when the repository has no tests."""

    repo.skip_on_archived()

    try:
        if _has_pytest_test(repo):
            return
    except GithubException as exc:
        logger.error("Failed to inspect repository tree for {}: {}", repo.repository.full_name, exc)
        repo.error(CATEGORY, "Failed to inspect repository tree for pytest tests.")
        return

    placeholder_file = get_fix_file_path(CATEGORY, PLACEHOLDER_TEMPLATE_PATH)
    commit_url = repo.create_or_update_file(
        filepath=PLACEHOLDER_TEST_PATH,
        newfile=placeholder_file,
        oldfile=None,
        message="github_linter: add placeholder pytest test",
    )

    if commit_url:
        repo.fix(CATEGORY, f"Created placeholder pytest test: {commit_url}")
    else:
        repo.error(CATEGORY, "Failed to create placeholder pytest test.")
105 changes: 105 additions & 0 deletions tests/test_python.py
@@ -0,0 +1,105 @@
"""Tests for the python module"""

from unittest.mock import Mock

from github_linter.repolinter import RepoLinter
from github_linter.tests.python import (
    CATEGORY,
    PLACEHOLDER_TEST_PATH,
    PLACEHOLDER_TEMPLATE_PATH,
    _has_pytest_test,
    check_has_a_pytest_test,
    fix_has_a_pytest_test,
)


def create_tree_entry(path: str, item_type: str = "blob") -> Mock:
    """Create a git tree entry mock."""

    entry = Mock()
    entry.path = path
    entry.type = item_type
    return entry


def create_repo_with_tree(*paths: str) -> Mock:
    """Create a RepoLinter mock with a git tree."""

    mock_repo = Mock(spec=RepoLinter)
    mock_repo.repository = Mock()
    mock_repo.repository.full_name = "test/repo"
    mock_repo.repository.default_branch = "main"
    mock_repo.repository.get_branch.return_value = Mock(commit=Mock(sha="deadbeef"))
    mock_repo.repository.get_git_tree.return_value = Mock(tree=[create_tree_entry(path) for path in paths])
    return mock_repo


def test_has_pytest_test_matches_top_level_test_file() -> None:
    """A tests/test*.py file should satisfy the check."""

    mock_repo = create_repo_with_tree("tests/test_example.py")

    assert _has_pytest_test(mock_repo)


def test_has_pytest_test_matches_nested_test_file() -> None:
    """Nested tests should also satisfy the check."""

    mock_repo = create_repo_with_tree("tests/unit/test_example.py")

    assert _has_pytest_test(mock_repo)


def test_has_pytest_test_rejects_non_matching_paths() -> None:
    """Non-test files should not satisfy the check."""

    mock_repo = create_repo_with_tree("tests/example.py", "src/test_example.py", "tests/test_example.txt")

    assert not _has_pytest_test(mock_repo)


def test_check_has_a_pytest_test_reports_missing_tests() -> None:
    """The check should report an error when no pytest tests exist."""

    mock_repo = create_repo_with_tree("README.md", "src/app.py")

    check_has_a_pytest_test(mock_repo)

    mock_repo.error.assert_called_once_with(CATEGORY, "Missing pytest tests. Expected at least one Python file matching tests/test*.py.")


def test_check_has_a_pytest_test_accepts_existing_test() -> None:
    """The check should not report errors when a test exists."""

    mock_repo = create_repo_with_tree("tests/test_example.py")

    check_has_a_pytest_test(mock_repo)

    mock_repo.error.assert_not_called()


def test_fix_has_a_pytest_test_creates_placeholder() -> None:
    """The fix should create a placeholder test file when none exist."""

    mock_repo = create_repo_with_tree("README.md", "src/app.py")
    mock_repo.create_or_update_file.return_value = "https://example.com/commit"

    fix_has_a_pytest_test(mock_repo)

    mock_repo.create_or_update_file.assert_called_once()
    create_call = mock_repo.create_or_update_file.call_args.kwargs
    assert create_call["filepath"] == PLACEHOLDER_TEST_PATH
    assert str(create_call["newfile"]).endswith(f"github_linter/fixes/python/{PLACEHOLDER_TEMPLATE_PATH}")
    assert create_call["oldfile"] is None
    mock_repo.fix.assert_called_once_with(CATEGORY, "Created placeholder pytest test: https://example.com/commit")


def test_fix_has_a_pytest_test_skips_when_test_exists() -> None:
    """The fix should do nothing when pytest tests already exist."""

    mock_repo = create_repo_with_tree("tests/test_example.py")

    fix_has_a_pytest_test(mock_repo)

    mock_repo.create_or_update_file.assert_not_called()
    mock_repo.fix.assert_not_called()