Auto-triage: flag area:UI PRs missing screenshots/demo from new contributors by shivaam · Pull Request #64516 · apache/airflow

shivaam · 2026-03-31T01:50:19Z

Add a deterministic check to breeze pr auto-triage that flags area:UI PRs from non-collaborator contributors when their PR description lacks screenshots or demo videos. So owners, members, and committers / collaborators are excluded from this check. The rationale: having one PR merged doesn't mean they know the project's conventions around UI demos.

There are some PR - #62885 that do not need a UI mock but it is not simple to exclude them.

Comment posted on the PR:

[WARNING] Missing UI demo: This PR changes UI code but the description does not include screenshots or a demo video.
For UI changes, please add screenshots or a short screen recording to the PR description so reviewers can verify the visual changes. You can paste images directly into the GitHub PR description or drag-and-drop a screen recording. See pull request guidelines.

What this does

When triaging PRs, the tool now checks if a PR:

Has the area:UI label
Is authored by a non-collaborator (not COLLABORATOR/MEMBER/OWNER)
Lacks demo evidence in the PR body (screenshots, videos, or image links)

If all three conditions are met, the PR is flagged with a [WARNING] Missing UI demo violation. The suggested action is comment (not draft conversion) — this is a soft violation handled by a new branch in _compute_default_action.

How it works

assess_pr_ui_demo() in utils/github.py — new deterministic check function following the same pattern as assess_pr_checks, assess_pr_conflicts, and assess_pr_unresolved_comments
Detection patterns cover: GitHub drag-and-drop <img> tags, user-attachments/assets/ URLs (images & videos), markdown ![alt](url) syntax, and direct media file URLs (.png, .jpg, .gif, .mp4, .mov, .webm, etc.)
Integrated into _assess_pr_deterministic() so the check runs everywhere: main triage flow, TUI, pagination batches, and single-PR mode
_compute_default_action soft-violation branch ensures UI-demo-only flags get COMMENT instead of falling through to DRAFT

Dry-run validation against all open `area:UI` PRs

Ran breeze pr auto-triage --label "area:UI" --llm-use api --dry-run --answer-triage s --authors all:

Metric	Count
Total `area:UI` PRs assessed	35
Flagged for Missing UI demo	32
Flagged for other issues only (CI, conflicts, comments)	3
Passed all checks (had video demo)	1
False positives	0
False negatives	0

False negative (marked as Area:UI, but dont need a UI mock)

PR Complete Portuguese (pt) UI translation to 90%+ coverage #62885 — Complete Portuguese (pt) UI translation to 90%+ coverage

True negatives (correctly NOT flagged)

PR add bulk delete for dagruns #60718 — has video demo via user-attachments/assets/ URL → passed all checks
PR Update Dutch translations (2026-02-20) #62256, feat: Prettify JSON input in variable and minify on submit #62882, Feat: Adding task-level filters to DAG view in graph tab #64271 — have <img> tags in body → not flagged

True positives (correctly flagged)

PR Feat: Add multi-select to Dag Runs bulk actions (#63854) #64010 — Feat: Dag Runs UI now has multi-select bulk actions
PR Fix Gantt view still visible when time range is outside dagrun window #64147 #64179 — Fix Gantt view still visible when time range is outside dagrun window
PR Add DagRunType for operator #63733 — Add (UI feature, contributor: henrywoo)
PR UI: Fix silent errors and layout shift in import error components #64267 — UI: (code change, contributor: vamsikrishna)

Known limitations

Grace period interaction: When CI failures are within the grace window and are the only hard issue, the grace-period skip logic does not check for UI demo violations. A PR with grace-period CI + missing UI demo could be silently skipped. This is a pre-existing structural pattern in the grace period logic, not introduced by this change.

Tests

Tests covering:

_has_demo_evidence helper (empty/None body, text-only, all media patterns, false positive resistance)
assess_pr_ui_demo (every exit path: no label, collaborator associations, demo present/absent)
_compute_default_action soft-violation branch (comment for soft-only, draft when combined with hard issues)

Was generative AI tooling used to co-author this PR?

Yes — Claude Opus 4.6

Generated-by: Claude Opus 4.6 following the guidelines

pierrejeambrun

That's interesting. Code looks good beside one comment.

Would love to hear what @potiuk thinks about this one. (personally I think it's a good idea)

pierrejeambrun · 2026-03-31T14:04:41Z

dev/breeze/src/airflow_breeze/commands/pr_commands.py

+    elif not has_ci_failures and not has_conflicts and not has_unresolved_comments:
+        # Only soft violations (e.g. missing UI demo) — just comment, don't convert to draft
+        action = TriageAction.COMMENT


I don't think that's relevant. What's the goal of this?

I will take a look but not earlier than tomorrow - have some things to fix on main first

Yeah. This one is not necessary - also "Draft" is a suggestion - can be changed to comment by maintainer..

Thanks for the feedback. I removed the soft-violations branch in the latest commit.

One thing to note: with --answer-triage y, there's no way to differentiate per-check — a PR with only a missing screenshot gets drafted the same as one with failing CI. That's fine for now since maintainers mostly run interactively, but worth keeping in mind.

For the future, I'm thinking about a plugin-based approach where each check defines its own suggested default action alongside its violations:

# Each check returns its own suggested action assess_pr_checks() → ("draft", violations) assess_pr_ui_demo() → ("comment", violations) assess_pr_conflicts() → ("draft", violations)

Then _compute_default_action just aggregates by picking the most severe: draft > comment > rerun. No conditional chain needed — it scales to any number of checks without touching the aggregation logic.

Add a deterministic check to `breeze pr auto-triage` that flags `area:UI` PRs from non-collaborator contributors when their PR description lacks screenshots or demo videos. The check runs as part of the existing deterministic checks in `_assess_pr_deterministic()` alongside CI failure, merge conflict, and unresolved comment checks. PRs flagged only for missing UI demo receive a comment (not draft conversion), using a new soft-violation branch in `_compute_default_action`. Detection patterns cover GitHub drag-and-drop uploads (`<img>` tags, `user-attachments/assets/` URLs), markdown image syntax, and direct media file URLs (png, jpg, gif, mp4, etc).

pierrejeambrun · 2026-03-31T14:12:18Z

I just rebased the branch to rerun the CI

bbovenzi · 2026-03-31T19:07:01Z

Very interesting. Yes sometimes its not needed but I have found myself commenting a bunch to ask for a screenshot.

potiuk

Looks good besides the one comment.

@pierrejeambrun

Per reviewer feedback from @pierrejeambrun and @potiuk: the separate elif branch for soft-only violations (e.g. missing UI demo) defaulting to COMMENT is unnecessary — maintainers can override DRAFT to COMMENT interactively. A future enhancement will let each check define its own suggested action severity.

Copilot

Pull request overview

Adds a deterministic breeze pr auto-triage rule to warn on area:UI PRs from non-collaborators when the PR body lacks screenshots or demo evidence.

Changes:

Add _has_demo_evidence() + assess_pr_ui_demo() deterministic assessment in utils/github.py
Integrate the new assessment into deterministic triage and review-mode aggregation in pr_commands.py
Add focused pytest coverage for demo detection, assessment behavior, and default-action handling

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File	Description
dev/breeze/tests/test_github_ui_demo.py	Adds tests for demo-evidence detection, UI demo assessment, and expected triage actions
dev/breeze/src/airflow_breeze/utils/github.py	Implements demo-evidence detection and a new deterministic UI-demo assessment producing a warning violation
dev/breeze/src/airflow_breeze/commands/pr_commands.py	Wires the new deterministic assessment into triage flows and merges violations/summaries

Copilot · 2026-04-02T00:44:08Z

dev/breeze/tests/test_github_ui_demo.py

+    def test_soft_violation_only_suggests_draft(self):
+        """A PR with only a soft violation (no CI failures, conflicts, or comments) gets DRAFT."""
+        pr = self._make_pr()
+        assessment = self._make_assessment("PR #123 changes UI code but has no screenshots or demo.")
+        action, reason = _compute_default_action(pr, assessment, author_flagged_count={})
+        assert action == TriageAction.DRAFT
+        assert "draft" in reason


The PR description states UI-demo-only violations should default to a COMMENT action (soft violation), but this test asserts DRAFT. The current _compute_default_action logic shown in context also falls through to DRAFT when there are no CI failures/conflicts/unresolved comments, so either (a) the intended behavior (COMMENT for UI-demo-only warnings) isn’t implemented yet, or (b) the PR description is out of date. To align behavior with the PR description, add an explicit branch in _compute_default_action that selects TriageAction.COMMENT when the assessment contains only warning-level violations (specifically the UI demo warning) and no hard signals are present; then update this test (and its docstring) to expect COMMENT and a reason containing 'comment'.

Copilot · 2026-04-02T00:44:08Z

dev/breeze/src/airflow_breeze/commands/pr_commands.py

@@ -6961,6 +6963,9 @@ def _review_ready_prs_review_mode(
                if comments_assessment:
                    violations.extend(comments_assessment.violations)
                    summaries.append(comments_assessment.summary)
+                if ui_demo_assessment:
+                    violations.extend(ui_demo_assessment.violations)
+                    summaries.append(ui_demo_assessment.summary)

                assessment = PRAssessment(
                    should_flag=True,


This block manually merges assessments, but _assess_pr_deterministic() uses _merge_pr_assessments(...) for the same purpose. Duplicating merge logic in two places increases the chance of drift (e.g., future fields on PRAssessment, ordering, or new deterministic checks). Consider reusing _merge_pr_assessments(conflict_assessment, ci_assessment, comments_assessment, ui_demo_assessment) here as well, then using the merged result directly.

Copilot · 2026-04-02T00:44:08Z

dev/breeze/src/airflow_breeze/utils/github.py

+_DEMO_EVIDENCE_PATTERNS = [
+    r"<img\s",  # HTML <img> tag (GitHub drag-and-drop screenshots)
+    r"https://github\.com/user-attachments/assets/",  # GitHub-uploaded assets (images & videos)
+    r"!\[",  # Markdown image syntax (rare but valid)
+    r"https?://\S+\.(?:png|jpg|jpeg|gif|webp|mp4|mov|webm)",  # Direct media file URLs
+]


_has_demo_evidence() runs regex compilation on every call via re.search(pattern, ...). Since this can execute across many PRs in a single triage run, consider pre-compiling these patterns once at module import time (e.g., store re.Pattern objects with re.IGNORECASE) and then calling .search(body) in the loop. This keeps the behavior the same but reduces repeated regex overhead.

Copilot · 2026-04-02T00:44:09Z

dev/breeze/src/airflow_breeze/utils/github.py

+def _has_demo_evidence(body: str | None) -> bool:
+    """Check if PR body contains screenshots, images, or video links."""
+    if not body:
+        return False
+    for pattern in _DEMO_EVIDENCE_PATTERNS:
+        if re.search(pattern, body, re.IGNORECASE):
+            return True
+    return False


_has_demo_evidence() runs regex compilation on every call via re.search(pattern, ...). Since this can execute across many PRs in a single triage run, consider pre-compiling these patterns once at module import time (e.g., store re.Pattern objects with re.IGNORECASE) and then calling .search(body) in the loop. This keeps the behavior the same but reduces repeated regex overhead.

boring-cyborg bot added area:dev-tools backport-to-v3-2-test Mark PR with this label to backport to v3-2-test branch labels Mar 31, 2026

shivaam changed the title ~~Add missing UI demo/screenshot check to auto-triage~~ Auto-triage: flag area:UI PRs missing screenshots/demo from new contributors Mar 31, 2026

shivaam marked this pull request as ready for review March 31, 2026 01:59

shivaam requested review from amoghrajesh, ashb, bugraoz93, choo121600, ephraimbuddy, gopidesupavan, jason810496, jedcunningham, jscheffl and potiuk as code owners March 31, 2026 01:59

pierrejeambrun reviewed Mar 31, 2026

View reviewed changes

shivaam added 2 commits March 31, 2026 16:12

Clarify that demo should be in PR description, not comments

b1b90fa

pierrejeambrun force-pushed the claude/ui-demo-check branch from d2fa110 to b1b90fa Compare March 31, 2026 14:12

pierrejeambrun requested a review from vatsrahul1001 as a code owner March 31, 2026 14:12

Merge branch 'main' into claude/ui-demo-check

c057732

potiuk approved these changes Apr 1, 2026

View reviewed changes

shivaam added 2 commits April 1, 2026 08:58

Merge branch 'main' into claude/ui-demo-check

014f1fd

kaxil requested a review from Copilot April 2, 2026 00:41

Copilot started reviewing on behalf of kaxil April 2, 2026 00:42 View session

Copilot AI reviewed Apr 2, 2026

View reviewed changes

Conversation

shivaam commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this does

How it works

Dry-run validation against all open area:UI PRs

False negative (marked as Area:UI, but dont need a UI mock)

True negatives (correctly NOT flagged)

True positives (correctly flagged)

Known limitations

Tests

Was generative AI tooling used to co-author this PR?

Uh oh!

pierrejeambrun left a comment

Choose a reason for hiding this comment

Uh oh!

pierrejeambrun Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

potiuk Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

potiuk Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

shivaam Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

pierrejeambrun commented Mar 31, 2026

Uh oh!

bbovenzi commented Mar 31, 2026

Uh oh!

potiuk left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

shivaam commented Mar 31, 2026 •

edited

Loading

Dry-run validation against all open `area:UI` PRs