Summary
Build a GitHub Actions workflow with a custom TypeScript action that automatically analyzes PR diffs and identifies which documentation needs updating -- both in-repo (Azure/azure-dev) and in the external docs repo (MicrosoftDocs/azure-dev-docs-pr).
Implementation: PR #6927
Flow
flowchart TD
A["PR Event: opened/synchronize/closed"] --> B{Event Type?}
B -->|opened / synchronize| C["Fetch PR Diff via API"]
B -->|closed + merged| SKIP["Skip: PRs already exist"]
B -->|closed + not merged| Z["Close doc PRs, clean branches"]
D["Manual Trigger"] --> E{Mode?}
E -->|single| C
E -->|all_open| F["Enumerate open PRs"]
E -->|list| G["Parse PR numbers"]
F --> C
G --> C
C --> H["Classify changes"]
H --> I["Build docs inventory"]
I --> J["AI Analysis via GPT-4o"]
J --> K{Docs impacted?}
K -->|No| L["Post: no doc changes needed"]
K -->|Yes| M["Generate doc proposals"]
M --> N{"In-repo docs?"}
N -->|Yes| O["Branch: docs/pr-N in azure-dev"]
O --> P["Create/update PR"]
N -->|No| Q{"External docs?"}
P --> Q
Q -->|Yes| R["Mint token via OIDC"]
R --> R2["Branch: docs/pr-N in docs repo"]
R2 --> S["Create/update docs PR"]
Q -->|No| T["Update tracking comment"]
S --> T
L --> U["Done"]
T --> U
Z --> U
SKIP --> U
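The routing at the top of the flow can be sketched as a pure function. This is a minimal illustration, not the action's actual code: the `RunEvent` and `Plan` types, `planRun`, and `parsePrList` are hypothetical names, and treating `reopened` like `opened` is an assumption (the flowchart only labels `opened / synchronize`).

```typescript
// Hypothetical event model; the real action's types may differ.
type RunEvent =
  | { kind: "pull_request_target"; action: "opened" | "synchronize" | "reopened" | "closed"; merged: boolean; number: number }
  | { kind: "workflow_dispatch"; mode: "single" | "all_open" | "list"; prInput: string; openPrs: number[] };

type Plan =
  | { step: "analyze"; prs: number[] }
  | { step: "skip"; reason: string }
  | { step: "cleanup"; pr: number };

// Parse "12, 34 56" into unique positive PR numbers, ignoring junk tokens.
function parsePrList(input: string): number[] {
  const seen = new Set<number>();
  for (const part of input.split(/[\s,]+/)) {
    if (/^\d+$/.test(part) && Number(part) > 0) seen.add(Number(part));
  }
  return Array.from(seen);
}

// Decide what a run should do, mirroring the branches of the flowchart.
function planRun(event: RunEvent): Plan {
  if (event.kind === "pull_request_target") {
    if (event.action !== "closed") return { step: "analyze", prs: [event.number] };
    return event.merged
      ? { step: "skip", reason: "merged: companion doc PRs already exist" }
      : { step: "cleanup", pr: event.number };
  }
  switch (event.mode) {
    case "single":
      return { step: "analyze", prs: parsePrList(event.prInput).slice(0, 1) };
    case "all_open":
      return { step: "analyze", prs: event.openPrs };
    case "list":
      return { step: "analyze", prs: parsePrList(event.prInput) };
  }
}
```

Keeping this decision free of API calls makes each branch of the flow easy to unit test.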
Security Architecture
flowchart LR
subgraph "Fork PR Security"
FP["Fork PR"] --> PRT["PR target trigger"]
PRT --> MAIN["Runs from main"]
MAIN --> SAFE["Fork can't modify workflow"]
end
subgraph "OIDC + Key Vault Signing"
OIDC["OIDC Token"] --> AZ["azure/login"]
AZ --> KV["Key Vault Sign"]
KV --> JWT["Signed JWT"]
JWT --> TOKEN["Install Token"]
TOKEN --> WRITE["Write to docs repo"]
end
subgraph "Data Flow (API only)"
API["GitHub REST API"] --> DIFF["Read PR diff"]
API --> DOCS["Read doc inventory"]
API --> NEVER["NEVER checkout or execute PR code"]
end
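The "API only" data flow can be illustrated with a sketch that reads a PR diff through the GitHub REST API rather than checking out the PR branch. The helper names (`buildDiffRequest`, `fetchPrDiff`) and the injected `HttpGet` type are hypothetical; the `application/vnd.github.diff` media type on the pulls endpoint is part of GitHub's documented REST API.

```typescript
interface DiffRequest {
  url: string;
  headers: Record<string, string>;
}

// Minimal HTTP abstraction so the sketch can be tested without network access.
type HttpGet = (
  url: string,
  headers: Record<string, string>
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>;

// Build the request for a PR's unified diff. No PR code is cloned or executed.
function buildDiffRequest(owner: string, repo: string, prNumber: number, token: string): DiffRequest {
  if (!Number.isInteger(prNumber) || prNumber <= 0) {
    throw new Error(`invalid PR number: ${prNumber}`);
  }
  return {
    url: `https://api.github.com/repos/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}/pulls/${prNumber}`,
    headers: {
      Accept: "application/vnd.github.diff", // ask for the diff instead of JSON
      Authorization: `Bearer ${token}`,
    },
  };
}

// Fetch the diff text using whatever HTTP client the caller injects.
async function fetchPrDiff(get: HttpGet, req: DiffRequest): Promise<string> {
  const res = await get(req.url, req.headers);
  if (!res.ok) throw new Error(`GitHub API returned ${res.status}`);
  return res.text();
}
```

In a real run the injected `get` would wrap `fetch` or an Octokit client; the diff is then treated strictly as untrusted text.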
Problem
When code changes land in Azure/azure-dev, documentation in two locations may need updating:
- In-repo docs -- markdown files within Azure/azure-dev (e.g., cli/azd/docs/, READMEs, etc.)
- External docs -- MicrosoftDocs/azure-dev-docs-pr (the public-facing Learn documentation)
There is no automated system to detect which docs are impacted by a code PR, propose updates, or track the relationship between code PRs and doc PRs.
Proposed Solution
Workflow Triggers
- pull_request_target: [opened, synchronize, reopened, closed] targeting main -- uses pull_request_target instead of pull_request so that the workflow definition always runs from main and a fork PR cannot modify it to exfiltrate secrets
- workflow_dispatch for manual/batch runs (single PR, all open PRs, or a list)
Authentication
| Layer | Method | Purpose |
|---|---|---|
| In-repo operations | GITHUB_TOKEN | Read PR diff, create doc PRs in azure-dev, post comments |
| Azure login | OIDC federated credentials | azure/login@v2 exchanges GitHub OIDC token for Azure access |
| JWT signing | Azure Key Vault | az keyvault key sign signs a GitHub App JWT (RSA key is non-exportable) |
| Cross-repo writes | GitHub App installation token | Short-lived token scoped to MicrosoftDocs/azure-dev-docs-pr |
Key security properties:
- No secrets stored in GitHub -- OIDC is fully keyless (federated credential binding)
- Private key never leaves Key Vault -- signing happens server-side via az keyvault key sign
- Short-lived tokens -- GitHub App installation tokens expire in 1 hour
- Scoped access -- token only grants access to repos where the App is installed
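To make the signing split concrete, here is a rough sketch of how a GitHub App JWT can be assembled when the private key lives in Key Vault: the runner builds the unsigned `header.payload` string and its SHA-256 digest, Key Vault signs the digest, and the runner appends the returned signature. The function names are hypothetical, the 60-second backdating of `iat` follows GitHub's general guidance for clock drift, and the exact encoding the Key Vault CLI expects for the digest is an assumption to confirm against its documentation.

```typescript
import { createHash } from "node:crypto";

// Encode a JSON object as an unpadded base64url JWT segment.
function encodeSegment(value: object): string {
  return Buffer.from(JSON.stringify(value), "utf8").toString("base64url");
}

// Build the unsigned "header.payload" part of a GitHub App JWT (RS256).
function buildSigningInput(appId: string, nowSeconds: number): string {
  const header = { alg: "RS256", typ: "JWT" };
  const payload = {
    iat: nowSeconds - 60,  // backdated to tolerate clock drift
    exp: nowSeconds + 540, // GitHub caps App JWT lifetime at 10 minutes
    iss: appId,
  };
  return `${encodeSegment(header)}.${encodeSegment(payload)}`;
}

// SHA-256 digest of the signing input; this is the value sent to Key Vault.
function digestForSigning(signingInput: string): Buffer {
  return createHash("sha256").update(signingInput, "utf8").digest();
}

// Append the raw RSA signature returned by the remote signer.
function assembleJwt(signingInput: string, signature: Buffer): string {
  return `${signingInput}.${signature.toString("base64url")}`;
}
```

The resulting JWT would then be exchanged for an installation token through the GitHub Apps API; only the digest and the signature ever cross the Key Vault boundary.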
Core Behavior
- Diff analysis -- Extract and classify PR changes (API, behavior, config, feature, deprecation, bug fix)
- Doc inventory -- Build manifest of all docs in both repos (via git.getTree + git.getBlob for efficiency, with sanitizeText() on all extracted content)
- AI-powered impact mapping -- Use the GitHub Models API (openai/gpt-4o) to determine which docs are impacted, with comprehensive output validation (repo format regex, path traversal blocking, unknown repo rejection, impact count cap)
- Companion doc PRs -- Create/update PRs in both repos with branch naming docs/pr-{source-pr-number}
- Tracking comment -- Maintain a comment on the source PR linking to all companion doc PRs (with author verification to prevent spoofing and multi-layer markdown injection prevention)
- Cleanup -- Auto-close companion doc PRs when the source PR is closed without merge
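The output validation listed under AI-powered impact mapping might look roughly like the following. This is a sketch under assumptions: the `Impact` shape, `validateImpacts`, `isSafePath`, and the specific regex are illustrative, while the two repository names and the cap of 15 impacts come from this issue.

```typescript
interface Impact {
  repo: string;
  path: string;
  reason: string;
}

const KNOWN_REPOS = new Set(["Azure/azure-dev", "MicrosoftDocs/azure-dev-docs-pr"]);
const REPO_FORMAT = /^[A-Za-z0-9_.-]+\/[A-Za-z0-9_.-]+$/; // owner/name
const MAX_IMPACTS = 15;

// Reject absolute paths, backslashes, NUL bytes, and "." / ".." segments.
function isSafePath(path: string): boolean {
  if (path.length === 0 || path.startsWith("/") || path.includes("\\") || path.includes("\0")) {
    return false;
  }
  return path.split("/").every((seg) => seg !== "" && seg !== "." && seg !== "..");
}

// Keep only well-formed impacts for known repos, capped at MAX_IMPACTS.
function validateImpacts(raw: unknown): Impact[] {
  if (!Array.isArray(raw)) return [];
  const valid: Impact[] = [];
  for (const item of raw) {
    if (typeof item !== "object" || item === null) continue;
    const { repo, path, reason } = item as Record<string, unknown>;
    if (typeof repo !== "string" || typeof path !== "string" || typeof reason !== "string") continue;
    if (!REPO_FORMAT.test(repo) || !KNOWN_REPOS.has(repo)) continue;
    if (!isSafePath(path)) continue;
    valid.push({ repo, path, reason });
    if (valid.length === MAX_IMPACTS) break;
  }
  return valid;
}
```

Treating the model response as `unknown` and rebuilding each record from checked fields means nothing the model emits is trusted by default.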
Key Design Decisions
- AI backend: GitHub Models API with openai/gpt-4o
- Branch naming: docs/pr-{N} for deterministic 1:1 mapping
- Rebase-aware: Respects human edits on doc PRs (never force-pushes)
- Auth: OIDC + Key Vault signing (no secrets stored in GitHub, private key never on runner)
- Trigger: pull_request_target -- workflow code runs from main, preventing fork secret exfiltration
- Graceful degradation: Without cross-repo token, still scans docs and reports impacts (just can't create PRs)
- Architecture: 12 focused source modules, all under 200 lines
- Injection prevention: 5-layer defense -- sanitizePlainText() on AI output, sanitizeText() on doc manifest input, escapeTableCell() on tracking comments (strips HTML/markdown), sanitizeForMarkdown() on PR bodies, output length caps (MAX_REASON=200, MAX_SUMMARY=500, MAX_IMPACTS=15)
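As an illustration of two of the injection-prevention layers, the sketch below shows one plausible shape for `escapeTableCell()` combined with a length cap. The function bodies are assumptions, not the action's actual implementation; only the function name, the "strips HTML/markdown" behavior, and `MAX_REASON=200` come from this issue. `truncate` and `safeReasonCell` are hypothetical helpers.

```typescript
const MAX_REASON = 200;

// Cap text length, marking truncation with a trailing "...".
function truncate(text: string, max: number): string {
  return text.length <= max ? text : text.slice(0, max - 3) + "...";
}

// Make untrusted text safe to place inside a markdown table cell.
function escapeTableCell(text: string): string {
  return text
    .replace(/<[^>]*>/g, "")               // drop HTML tags
    .replace(/[\r\n]+/g, " ")              // keep the cell on a single row
    .replace(/[|`*_\[\]()<>!#~\\]/g, "")   // strip markdown control characters
    .replace(/\s+/g, " ")                  // collapse leftover whitespace
    .trim();
}

// Sanitize first, then cap, so the cap applies to what is actually rendered.
function safeReasonCell(reason: string): string {
  return truncate(escapeTableCell(reason), MAX_REASON);
}
```

Stripping instead of escaping loses some fidelity in the displayed reason, but it avoids depending on how a given markdown renderer treats backslash escapes.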
Infrastructure (managed by EngSys)
| Component | Value | Purpose |
|---|---|---|
| GitHub Environment | AzureSDKEngKeyVault | OIDC federated credential binding |
| Azure Key Vault | azuresdkengkeyvault | Hosts the non-exportable RSA signing key |
| Key Vault Key | azure-sdk-automation | RSA key used to sign GitHub App JWTs |
| GitHub App ID | 1086291 | Azure SDK Automation GitHub App |
Tasks
- Scaffold the action project
- Implement diff extraction and change classification
- Implement docs inventory builder (both repos)
- Implement GitHub Models AI integration for analysis
- Implement PR manager (create/update branches, PRs, rebase logic)
- Implement tracking comment manager
- Implement main entry point with event handling and manual mode
- Create workflow definition (doc-monitor.yml)
- Integrate OIDC + Key Vault signing (eng/common/actions/login-to-github)
- MQ code review -- 12 findings fixed (security, logic, performance, type safety)
- Red team security assessment and hardening (11 findings: 7 code-fixed, 3 admin-tracked, 1 low-risk accepted)
- Add unit tests for pure functions
- End-to-end validation (requires EngSys infrastructure setup)