Auto-triage: flag area:UI PRs missing screenshots/demo from new contributors#64516
Auto-triage: flag area:UI PRs missing screenshots/demo from new contributors#64516shivaam wants to merge 5 commits intoapache:mainfrom
Conversation
pierrejeambrun
left a comment
There was a problem hiding this comment.
That's interesting. Code looks good beside one comment.
Would love to hear what @potiuk thinks about this one. (personally I think it's a good idea)
| elif not has_ci_failures and not has_conflicts and not has_unresolved_comments: | ||
| # Only soft violations (e.g. missing UI demo) — just comment, don't convert to draft | ||
| action = TriageAction.COMMENT |
There was a problem hiding this comment.
I don't think that's relevant. What's the goal of this?
There was a problem hiding this comment.
I will take a look but not earlier than tomorrow - have some things to fix on main first
There was a problem hiding this comment.
Yeah. This one is not necessary - also "Draft" is a suggestion - can be changed to comment by maintainer..
There was a problem hiding this comment.
Thanks for the feedback. I removed the soft-violations branch in the latest commit.
One thing to note: with --answer-triage y, there's no way to differentiate per-check — a PR with only a missing screenshot gets drafted the same as one with failing CI. That's fine for now since maintainers mostly run interactively, but worth keeping in mind.
For the future, I'm thinking about a plugin-based approach where each check defines its own suggested default action alongside its violations:
# Each check returns its own suggested action
assess_pr_checks() → ("draft", violations)
assess_pr_ui_demo() → ("comment", violations)
assess_pr_conflicts() → ("draft", violations)
Then _compute_default_action just aggregates by picking the most severe: draft > comment > rerun. No conditional chain needed — it scales to any number of checks without touching the aggregation logic.
Add a deterministic check to `breeze pr auto-triage` that flags `area:UI` PRs from non-collaborator contributors when their PR description lacks screenshots or demo videos. The check runs as part of the existing deterministic checks in `_assess_pr_deterministic()` alongside CI failure, merge conflict, and unresolved comment checks. PRs flagged only for missing UI demo receive a comment (not draft conversion), using a new soft-violation branch in `_compute_default_action`. Detection patterns cover GitHub drag-and-drop uploads (`<img>` tags, `user-attachments/assets/` URLs), markdown image syntax, and direct media file URLs (png, jpg, gif, mp4, etc).
d2fa110 to
b1b90fa
Compare
|
I just rebased the branch to rerun the CI |
|
Very interesting. Yes sometimes its not needed but I have found myself commenting a bunch to ask for a screenshot. |
potiuk
left a comment
There was a problem hiding this comment.
Looks good besides the one comment.
Per reviewer feedback from @pierrejeambrun and @potiuk: the separate elif branch for soft-only violations (e.g. missing UI demo) defaulting to COMMENT is unnecessary — maintainers can override DRAFT to COMMENT interactively. A future enhancement will let each check define its own suggested action severity.
There was a problem hiding this comment.
Pull request overview
Adds a deterministic breeze pr auto-triage rule to warn on area:UI PRs from non-collaborators when the PR body lacks screenshots or demo evidence.
Changes:
- Add
_has_demo_evidence()+assess_pr_ui_demo()deterministic assessment inutils/github.py - Integrate the new assessment into deterministic triage and review-mode aggregation in
pr_commands.py - Add focused pytest coverage for demo detection, assessment behavior, and default-action handling
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| dev/breeze/tests/test_github_ui_demo.py | Adds tests for demo-evidence detection, UI demo assessment, and expected triage actions |
| dev/breeze/src/airflow_breeze/utils/github.py | Implements demo-evidence detection and a new deterministic UI-demo assessment producing a warning violation |
| dev/breeze/src/airflow_breeze/commands/pr_commands.py | Wires the new deterministic assessment into triage flows and merges violations/summaries |
| def test_soft_violation_only_suggests_draft(self): | ||
| """A PR with only a soft violation (no CI failures, conflicts, or comments) gets DRAFT.""" | ||
| pr = self._make_pr() | ||
| assessment = self._make_assessment("PR #123 changes UI code but has no screenshots or demo.") | ||
| action, reason = _compute_default_action(pr, assessment, author_flagged_count={}) | ||
| assert action == TriageAction.DRAFT | ||
| assert "draft" in reason |
There was a problem hiding this comment.
The PR description states UI-demo-only violations should default to a COMMENT action (soft violation), but this test asserts DRAFT. The current _compute_default_action logic shown in context also falls through to DRAFT when there are no CI failures/conflicts/unresolved comments, so either (a) the intended behavior (COMMENT for UI-demo-only warnings) isn’t implemented yet, or (b) the PR description is out of date. To align behavior with the PR description, add an explicit branch in _compute_default_action that selects TriageAction.COMMENT when the assessment contains only warning-level violations (specifically the UI demo warning) and no hard signals are present; then update this test (and its docstring) to expect COMMENT and a reason containing 'comment'.
| @@ -6961,6 +6963,9 @@ def _review_ready_prs_review_mode( | |||
| if comments_assessment: | |||
| violations.extend(comments_assessment.violations) | |||
| summaries.append(comments_assessment.summary) | |||
| if ui_demo_assessment: | |||
| violations.extend(ui_demo_assessment.violations) | |||
| summaries.append(ui_demo_assessment.summary) | |||
|
|
|||
| assessment = PRAssessment( | |||
| should_flag=True, | |||
There was a problem hiding this comment.
This block manually merges assessments, but _assess_pr_deterministic() uses _merge_pr_assessments(...) for the same purpose. Duplicating merge logic in two places increases the chance of drift (e.g., future fields on PRAssessment, ordering, or new deterministic checks). Consider reusing _merge_pr_assessments(conflict_assessment, ci_assessment, comments_assessment, ui_demo_assessment) here as well, then using the merged result directly.
| _DEMO_EVIDENCE_PATTERNS = [ | ||
| r"<img\s", # HTML <img> tag (GitHub drag-and-drop screenshots) | ||
| r"https://github\.com/user-attachments/assets/", # GitHub-uploaded assets (images & videos) | ||
| r"!\[", # Markdown image syntax (rare but valid) | ||
| r"https?://\S+\.(?:png|jpg|jpeg|gif|webp|mp4|mov|webm)", # Direct media file URLs | ||
| ] |
There was a problem hiding this comment.
_has_demo_evidence() runs regex compilation on every call via re.search(pattern, ...). Since this can execute across many PRs in a single triage run, consider pre-compiling these patterns once at module import time (e.g., store re.Pattern objects with re.IGNORECASE) and then calling .search(body) in the loop. This keeps the behavior the same but reduces repeated regex overhead.
| def _has_demo_evidence(body: str | None) -> bool: | ||
| """Check if PR body contains screenshots, images, or video links.""" | ||
| if not body: | ||
| return False | ||
| for pattern in _DEMO_EVIDENCE_PATTERNS: | ||
| if re.search(pattern, body, re.IGNORECASE): | ||
| return True | ||
| return False |
There was a problem hiding this comment.
_has_demo_evidence() runs regex compilation on every call via re.search(pattern, ...). Since this can execute across many PRs in a single triage run, consider pre-compiling these patterns once at module import time (e.g., store re.Pattern objects with re.IGNORECASE) and then calling .search(body) in the loop. This keeps the behavior the same but reduces repeated regex overhead.
Add a deterministic check to
breeze pr auto-triagethat flagsarea:UIPRs from non-collaborator contributors when their PR description lacks screenshots or demo videos. So owners, members, and committers / collaborators are excluded from this check. The rationale: having one PR merged doesn't mean they know the project's conventions around UI demos.There are some PR - #62885 that do not need a UI mock but it is not simple to exclude them.
Comment posted on the PR:
[WARNING] Missing UI demo: This PR changes UI code but the description does not include screenshots or a demo video.
For UI changes, please add screenshots or a short screen recording to the PR description so reviewers can verify the visual changes. You can paste images directly into the GitHub PR description or drag-and-drop a screen recording. See pull request guidelines.
What this does
When triaging PRs, the tool now checks if a PR:
area:UIlabelIf all three conditions are met, the PR is flagged with a
[WARNING] Missing UI demoviolation. The suggested action is comment (not draft conversion) — this is a soft violation handled by a new branch in_compute_default_action.How it works
assess_pr_ui_demo()inutils/github.py— new deterministic check function following the same pattern asassess_pr_checks,assess_pr_conflicts, andassess_pr_unresolved_comments<img>tags,user-attachments/assets/URLs (images & videos), markdownsyntax, and direct media file URLs (.png,.jpg,.gif,.mp4,.mov,.webm, etc.)_assess_pr_deterministic()so the check runs everywhere: main triage flow, TUI, pagination batches, and single-PR mode_compute_default_actionsoft-violation branch ensures UI-demo-only flags getCOMMENTinstead of falling through toDRAFTDry-run validation against all open
area:UIPRsRan
breeze pr auto-triage --label "area:UI" --llm-use api --dry-run --answer-triage s --authors all:area:UIPRs assessedFalse negative (marked as Area:UI, but dont need a UI mock)
True negatives (correctly NOT flagged)
user-attachments/assets/URL → passed all checks<img>tags in body → not flaggedTrue positives (correctly flagged)
Known limitations
Tests
Tests covering:
_has_demo_evidencehelper (empty/None body, text-only, all media patterns, false positive resistance)assess_pr_ui_demo(every exit path: no label, collaborator associations, demo present/absent)_compute_default_actionsoft-violation branch (comment for soft-only, draft when combined with hard issues)Was generative AI tooling used to co-author this PR?
Generated-by: Claude Opus 4.6 following the guidelines