perf: run benchmarks daily and deduplicate regression issues by Mossaka · Pull Request #1871 · github/gh-aw-firewall

Mossaka · 2026-04-09T23:02:22Z

Summary

- Change benchmark schedule from weekly (Mondays) to daily at 06:00 UTC for faster regression detection
- Deduplicate regression issues: before creating a new issue, check for an existing open issue with performance + needs-investigation labels. If one exists, comment on it with updated data instead of creating duplicates.

Test plan

- Verify cron syntax is valid for daily runs
- Verify dedup logic: when an open regression issue exists, new regressions add a comment instead of a new issue
- Verify first regression (no existing issue) still creates a new issue
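
The two dedup cases in the test plan can be exercised offline with a stubbed client. The sketch below is not the workflow's actual script: `reportRegression` and `makeFakeGithub` are hypothetical names, and it assumes the workflow talks to the API through github-script-style `github.rest.issues` calls (`listForRepo`, `createComment`, `create`).

```javascript
// Hypothetical sketch of the "comment vs. create" decision.
// `github` is any object shaped like Octokit's `rest.issues` API.
async function reportRegression(github, repo, { title, body, summary }) {
  const existing = await github.rest.issues.listForRepo({
    owner: repo.owner,
    repo: repo.repo,
    state: "open",
    labels: "performance,needs-investigation",
    per_page: 5,
  });

  // listForRepo also returns pull requests; keep real issues only.
  const issues = existing.data.filter((item) => !item.pull_request);

  if (issues.length > 0) {
    // An open regression issue exists: add a comment instead of a new issue.
    await github.rest.issues.createComment({
      owner: repo.owner,
      repo: repo.repo,
      issue_number: issues[0].number,
      body: summary,
    });
    return { action: "commented", number: issues[0].number };
  }

  // First regression: no open issue yet, so create one.
  const created = await github.rest.issues.create({
    owner: repo.owner,
    repo: repo.repo,
    title,
    body,
    labels: ["performance", "needs-investigation"],
  });
  return { action: "created", number: created.data.number };
}

// In-memory stand-in for the API that records which calls were made.
function makeFakeGithub(openItems) {
  const calls = [];
  const issues = {
    listForRepo: async () => ({ data: openItems }),
    createComment: async (params) => {
      calls.push(["createComment", params.issue_number]);
      return { data: {} };
    },
    create: async (params) => {
      calls.push(["create", params.title]);
      return { data: { number: 1 } };
    },
  };
  return { github: { rest: { issues } }, calls };
}
```

Calling `reportRegression` with `makeFakeGithub([{ number: 7 }])` should record a single `createComment` call, and calling it with `makeFakeGithub([])` should record a single `create` call.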

Closes #1865
Closes #1868

🤖 Generated with Claude Code

- Change cron schedule from weekly (Mondays) to daily at 06:00 UTC
- When a regression is detected, check for an existing open issue with performance+needs-investigation labels before creating a new one. If one exists, comment on it with updated data instead of duplicating.

Closes #1865
Closes #1868

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-09T23:03:19Z

✅ Coverage Check Passed

Overall Coverage

Metric	Base	PR	Delta
Lines	85.85%	85.95%	📈 +0.10%
Statements	85.76%	85.85%	📈 +0.09%
Functions	87.54%	87.54%	➡️ +0.00%
Branches	78.56%	78.61%	📈 +0.05%

📁 Per-file Coverage Changes (1 files)

File	Lines (Before → After)	Statements (Before → After)
`src/docker-manager.ts`	86.3% → 86.6% (+0.36%)	85.9% → 86.2% (+0.35%)

Coverage comparison generated by scripts/ci/compare-coverage.ts

Copilot

Pull request overview

Updates the performance monitoring workflow to detect regressions sooner and reduce issue spam by reusing an existing open regression issue when possible.

Changes:

- Switch the benchmark schedule from weekly to daily at 06:00 UTC.
- Add deduplication logic: if an open issue exists with performance + needs-investigation, comment on it instead of creating a new issue.

File	Description
.github/workflows/performance-monitor.yml	Changes cron cadence to daily and adds “comment vs create issue” dedup logic for regression reporting.

Copilot's findings

Files reviewed: 1/1 changed files
Comments generated: 2

Copilot · 2026-04-09T23:04:27Z

.github/workflows/performance-monitor.yml

+
+              if (existing.data.length > 0) {
+                // Comment on existing issue instead of creating a duplicate
+                const issueNumber = existing.data[0].number;


issues.listForRepo returns both Issues and Pull Requests. With the current logic, an open PR that happens to have the performance + needs-investigation labels could be selected and then the workflow would comment on the PR instead of the regression issue. Filter existing.data to exclude items with the pull_request field (or switch to a search that targets issues only) before deciding whether an existing regression issue exists.

Suggested change

-              if (existing.data.length > 0) {
-                // Comment on existing issue instead of creating a duplicate
-                const issueNumber = existing.data[0].number;
+              const existingIssues = existing.data.filter(item => !item.pull_request);
+              if (existingIssues.length > 0) {
+                // Comment on existing issue instead of creating a duplicate
+                const issueNumber = existingIssues[0].number;

Addressed in a6bde85: added existing.data.filter(item => !item.pull_request) to exclude PRs, and bumped per_page from 1 to 5 to account for the filtering.
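
To see why the page size matters once PRs are filtered out, here is a small simulation. It assumes the filter runs on a single page of results, as in the suggested change; `firstIssueNumber` and the sample items are made up for illustration.

```javascript
// Hypothetical helper: simulate fetching one page of `perPage` items and
// then dropping pull requests, which carry a `pull_request` field.
function firstIssueNumber(openItems, perPage) {
  const page = openItems.slice(0, perPage);
  const issuesOnly = page.filter((item) => !item.pull_request);
  return issuesOnly.length > 0 ? issuesOnly[0].number : null;
}

// Invented sample: a labeled PR sorts ahead of the real regression issue.
const openItems = [
  { number: 101, pull_request: {} },
  { number: 99 },
];

console.log(firstIssueNumber(openItems, 1)); // null: the only fetched item was a PR
console.log(firstIssueNumber(openItems, 5)); // 99
```

With a page size of 1 the filter can leave nothing, so a duplicate issue would be created even though issue 99 is open. A page size of 5 reduces that risk but does not remove it if five or more labeled PRs sort ahead of the issue.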

Copilot · 2026-04-09T23:04:28Z

.github/workflows/performance-monitor.yml

+                  repo: context.repo.repo,
+                  issue_number: issueNumber,
+                  body: `## Updated Regression Data\n\n**Run:** [${context.runId}](${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId})\n\n${body}`,
+                });


The comment body currently embeds the full body payload (including the raw JSON details/code block). On daily runs this can create very large comments and may hit GitHub's comment size limits, causing the workflow to fail when trying to comment. Consider making the update comment a shorter summary (e.g., run link + regression list + key stats) and link to the workflow artifacts for full JSON instead of duplicating the entire report each time.

Addressed in a6bde85: trimmed the update comment to a short summary (date, run link, commit SHA, regression list) with a link to the workflow run for full results and artifacts. No more raw JSON in comments.
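
A short update comment along those lines might be assembled as follows. The function name, field names, and wording are assumptions for illustration; only the set of fields (date, run link, commit SHA, regression list, link to the run for full results) comes from the reply above.

```javascript
// Hypothetical builder for the trimmed update comment.
// It deliberately omits the raw JSON report and links to the run instead.
function buildUpdateComment({ date, runId, runUrl, sha, regressions }) {
  const lines = [
    "## Updated Regression Data",
    "",
    `**Date:** ${date}`,
    `**Run:** [${runId}](${runUrl})`,
    `**Commit:** ${sha}`,
    "",
    "**Regressions:**",
    ...regressions.map((r) => `- ${r}`),
    "",
    `See the [workflow run](${runUrl}) for full results and artifacts.`,
  ];
  return lines.join("\n");
}

// Placeholder values, not taken from a real run.
const comment = buildUpdateComment({
  date: "2026-04-09",
  runId: 123,
  runUrl: "https://github.com/OWNER/REPO/actions/runs/123",
  sha: "abc1234",
  regressions: ["example_metric: 10ms -> 15ms"],
});
console.log(comment);
```

Because the comment size now grows only with the number of regressions rather than with the full report, daily updates stay well below the size of the original payload.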

Address Copilot review feedback:

- Filter out pull requests from issues.listForRepo results since the API returns both issues and PRs; an open PR with matching labels could be mistakenly selected instead of a regression issue.
- Reduce update comment size to a short summary (date, run link, commit, regression list) with a link to full artifacts, avoiding potential GitHub comment size limits on daily runs.
- Bump per_page from 1 to 5 to account for PRs being filtered out.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-09T23:13:34Z

Smoke Test Results

✅ GitHub MCP: "perf: run benchmarks daily and deduplicate regression issues" / "perf: bump benchmark iterations from 5 to 30 and wire up workflow input"
✅ Playwright: github.com title contains "GitHub"
✅ File write: /tmp/gh-aw/agent/smoke-test-claude-24217646404.txt created
✅ Bash: file contents verified

Overall: PASS

💥 [THE END] — Illustrated by Smoke Claude

github-actions · 2026-04-09T23:14:13Z

🏗️ Build Test Suite Results

Ecosystem	Project	Build/Install	Tests	Status
Bun	elysia	✅	1/1 passed	✅ PASS
Bun	hono	✅	1/1 passed	✅ PASS
C++	fmt	✅	N/A	✅ PASS
C++	json	✅	N/A	✅ PASS
Deno	oak	N/A	1/1 passed	✅ PASS
Deno	std	N/A	1/1 passed	✅ PASS
.NET	hello-world	✅	N/A	✅ PASS
.NET	json-parse	✅	N/A	✅ PASS
Go	color	✅	1/1 passed	✅ PASS
Go	env	✅	1/1 passed	✅ PASS
Go	uuid	✅	1/1 passed	✅ PASS
Java	gson	✅	1/1 passed	✅ PASS
Java	caffeine	✅	1/1 passed	✅ PASS
Node.js	clsx	✅	All passed	✅ PASS
Node.js	execa	✅	All passed	✅ PASS
Node.js	p-limit	✅	All passed	✅ PASS
Rust	fd	✅	1/1 passed	✅ PASS
Rust	zoxide	✅	1/1 passed	✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for issue #1871 · ● 432.9K

Copilot AI review requested due to automatic review settings April 9, 2026 23:02

Copilot started reviewing on behalf of Mossaka April 9, 2026 23:03

Copilot AI reviewed Apr 9, 2026

Mossaka merged commit 7258a16 into main Apr 9, 2026
38 of 40 checks passed

Mossaka deleted the feat/1865-daily-benchmarks branch April 9, 2026 23:09

github-actions bot added the smoke-claude label Apr 9, 2026

github-actions bot added the build-test label Apr 9, 2026

This was referenced Apr 9, 2026

perf: use --build-local in benchmarks to test source code changes #1872

Merged

fix: validate AWF_BENCHMARK_ITERATIONS env var input #1873

Merged

feat: add historical benchmark storage and trend reporting #1874

Merged

Mossaka merged 2 commits into main from feat/1865-daily-benchmarks

2 participants