feat(test): add it.isKnownFlake for opt-in stress-testing flaky Jest fixes #111860

Open
JoshuaKGoldberg wants to merge 9 commits into master from fix/flaky-test-rerun-infra

@JoshuaKGoldberg JoshuaKGoldberg commented Mar 30, 2026

It's hard to determine when a known flaky Jest test is no longer flaky. You basically just have to keep running it repeatedly. But re-running all tests takes a long time and bogs down our GHA workflows.

This PR adds plumbing that lets us specifically re-run known flaky tests 50x in a single run:

  1. I added an opt-in `Frontend: Rerun Flaky Tests` label, as seen on this PR
  2. When that label is present, the frontend workflow sets a RERUN_KNOWN_FLAKY_TESTS process env var
  3. Tests defined with the new it.isKnownFlake will define the same test 50x with incremented counter names when that var is present
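
The third step could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code in this PR: `makeIsKnownFlake`, `TestApi`, and `DEFAULT_RERUNS` are hypothetical names, and the `describe`/`it` functions are injected so the helper can be exercised without Jest. The naming scheme imitates the example CI failure line in this description.

```typescript
// Hypothetical sketch; names are illustrative and not taken from the PR's diff.
type TestFn = () => void | Promise<void>;

// Minimal subset of Jest's globals, injected so the helper can run without Jest.
interface TestApi {
  describe: (name: string, body: () => void) => void;
  it: (name: string, fn: TestFn) => void;
}

const DEFAULT_RERUNS = 50;

function makeIsKnownFlake(
  api: TestApi,
  rerun: boolean,
  runs: number = DEFAULT_RERUNS
) {
  return function isKnownFlake(name: string, fn: TestFn): void {
    if (!rerun) {
      // Normal mode: register the test once, like a plain it().
      api.it(name, fn);
      return;
    }
    // Rerun mode: one describe block holding `runs` copies of the test,
    // each with an incremented counter in its name.
    api.describe(`[flaky rerun x${runs}] ${name}`, () => {
      for (let run = 1; run <= runs; run++) {
        api.it(`run ${run}/${runs}`, fn);
      }
    });
  };
}
```

In a real Jest setup file, the result would presumably be attached to the global `it` (with a matching type declaration) and passed Jest's own `describe` and `it`; how this PR actually wires that up is not shown here.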

Example CI failure (search the logs for `flaky rerun`):

  ● EventGroupingInfo › [flaky rerun x50] fetches and renders grouping info for errors › run 2/50

I added it to the following tests that have failed >=2x on master over the last month:

| File | Test | CI failures (30d) | Ticket |
| --- | --- | --- | --- |
| eventReplay/index.spec.tsx | render replay inline onboarding | 6 | REPLAY-879 |
| stackTrace.spec.tsx | URL link in tooltip | 5 | ENG-7192 |
| resultsSearchQueryBuilder.spec.tsx | has: dropdown + normal tags (2 tests) | 5 | ENG-7201 |
| metricsTab.spec.tsx | toggle query builder sidebar | 4 | ENG-7202 |
| customerDetails.spec.tsx | disabled without billing.admin | 4 | ENG-7203 |
| eventsSearchBar.spec.tsx | has: dropdown | 3 | DAIN-1271 |
| trace.spec.tsx | arrowup+shift scroll (was it.skip) | 3 | BROWSE-411 |
| allMonitors.spec.tsx | select all query results | 2 | ENG-7204 |
| spansSearchBar.spec.tsx | onSearch correct query | 2 | ENG-7205 |
| react-native/metrics.spec.tsx | onboarding content | 2 | ENG-7206 |
| useReplaysFromIssue.spec.tsx | fetch replay ids | 2 | ENG-7207 |
| spanEvidencePreview.spec.tsx | error on request fail | 2 | ENG-7208 |
| groupingInfoSection.spec.tsx | render grouping info | 2 | ENG-7209 |
| timeSince.spec.tsx | respects timezone in tooltip | 1 | ENG-7211 |
| versionHoverCard.spec.tsx | renders | 1 | ENG-7212 |

Made with Cursor

@JoshuaKGoldberg JoshuaKGoldberg added the Frontend: Rerun Flaky Tests (not yet in use) Known flaky tests should be run many times, just to be safe. label Mar 30, 2026
@github-actions github-actions bot added the Scope: Frontend Automatically applied to PRs that change frontend components label Mar 30, 2026
@JoshuaKGoldberg JoshuaKGoldberg force-pushed the fix/flaky-test-rerun-infra branch from f42d3ea to 59dbb34 Compare March 31, 2026 13:12
@JoshuaKGoldberg JoshuaKGoldberg changed the title feat(test): Add itRepeatsWhenFlaky infra for stress-testing flaky Jest fixes feat(test): Add it.knownFlake for stress-testing flaky Jest fixes Mar 31, 2026
@JoshuaKGoldberg JoshuaKGoldberg force-pushed the fix/flaky-test-rerun-infra branch from 59dbb34 to ac09db1 Compare March 31, 2026 13:30
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Mar 31, 2026
@github-actions github-actions bot commented

🚨 Warning: This pull request contains Frontend and Backend changes!

It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent, they must be separated into two pull requests and be made forward or backwards compatible, such that the Backend or Frontend can be safely deployed independently.

Have questions? Please ask in the #discuss-dev-infra channel.

@JoshuaKGoldberg JoshuaKGoldberg force-pushed the fix/flaky-test-rerun-infra branch from ac09db1 to 93d1cbe Compare March 31, 2026 13:31
@JoshuaKGoldberg JoshuaKGoldberg requested review from a team and removed request for a team March 31, 2026 13:32
@JoshuaKGoldberg JoshuaKGoldberg force-pushed the fix/flaky-test-rerun-infra branch 2 times, most recently from 5abee83 to 206555a Compare March 31, 2026 13:34
Adds infrastructure for validating flaky test fixes in CI:

- `itRepeatsWhenFlaky()`: a test wrapper in tests/js/sentry-test/ that
  runs a test 50x when the RERUN_KNOWN_FLAKY_TESTS env var is set,
  otherwise runs once as a normal it()
- CI wiring: frontend.yml sets RERUN_KNOWN_FLAKY_TESTS=true when the PR
  has the "Frontend: Rerun Flaky Tests" label
- ESLint: configured jest/no-standalone-expect to recognize
  itRepeatsWhenFlaky as a test block

Wraps the known flaky tests in 13 spec files (identified from 30 days of
CI failures on master) with itRepeatsWhenFlaky so fixes can be stress-tested:
- eventReplay/index.spec.tsx (6 occ)
- stackTrace.spec.tsx (5 occ)
- resultsSearchQueryBuilder.spec.tsx (5 occ, 2 tests)
- metricsTab.spec.tsx (4 occ)
- customerDetails.spec.tsx (4 occ)
- eventsSearchBar.spec.tsx (3 occ)
- trace.spec.tsx (3 occ, previously skipped)
- allMonitors.spec.tsx (2 occ)
- spansSearchBar.spec.tsx (2 occ)
- react-native/metrics.spec.tsx (2 occ)
- useReplaysFromIssue.spec.tsx (2 occ)
- spanEvidencePreview.spec.tsx (2 occ)
- groupingInfoSection.spec.tsx (2 occ)

Made-with: Cursor
@JoshuaKGoldberg JoshuaKGoldberg force-pushed the fix/flaky-test-rerun-infra branch from 206555a to 6515f63 Compare March 31, 2026 13:34
@JoshuaKGoldberg JoshuaKGoldberg removed the Scope: Backend Automatically applied to PRs that change backend components label Mar 31, 2026
@JoshuaKGoldberg JoshuaKGoldberg changed the title feat(test): Add it.knownFlake for stress-testing flaky Jest fixes feat(test): add it.knownFlake for opt-in stress-testing flaky Jest fixes Mar 31, 2026
…Flake

Mark flaky tests for rerun infrastructure:
- TimeSince › respects timezone in tooltip (ENG-7211)
- VersionHoverCard › renders (ENG-7212)

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
Made-with: Cursor
@JoshuaKGoldberg JoshuaKGoldberg marked this pull request as ready for review April 2, 2026 17:30
@JoshuaKGoldberg JoshuaKGoldberg requested review from a team as code owners April 2, 2026 17:30
@JoshuaKGoldberg JoshuaKGoldberg changed the title feat(test): add it.knownFlake for opt-in stress-testing flaky Jest fixes feat(test): add it.isKnownFlake for opt-in stress-testing flaky Jest fixes Apr 2, 2026
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

```yaml
DEBUG_PRINT_LIMIT: 0
# When the "Frontend: Rerun Flaky Tests" label is on the PR,
# tests wrapped with it.isKnownFlake() run 50x to validate fixes.
RERUN_KNOWN_FLAKY_TESTS: "${{ contains(github.event.pull_request.labels.*.name, 'Frontend: Rerun Flaky Tests') }}"
```
@JoshuaKGoldberg JoshuaKGoldberg commented

'Frontend: Rerun Flaky Tests'

Frontend TSC suggested adding a prefix to the label, as we do that a lot for categorized ones. This is not so much a "trigger" (though that could be a nice feature to look into) as a "modifier". Sent: https://github.com/getsentry/getsentry/pull/19741. Once that goes in I'll update here.
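
One detail of the workflow line under review may be worth a sketch. As far as I understand GitHub Actions expressions, a boolean interpolated into a string becomes `true` or `false`, so `RERUN_KNOWN_FLAKY_TESTS` would always be set in this workflow, and a plain truthiness check on the test side would treat `"false"` as enabled. The helper below is a hypothetical illustration of a strict comparison; `shouldRerunKnownFlakes` is not a name from this PR, and the environment is passed in as a plain object so the snippet is self-contained.

```typescript
// Hypothetical helper; assumes the workflow sets the variable to the
// string "true" or "false" rather than leaving it unset.
function shouldRerunKnownFlakes(
  env: Record<string, string | undefined>
): boolean {
  // Compare against the exact string: "false" is a non-empty string,
  // so a truthiness check would wrongly enable reruns on every PR.
  return env["RERUN_KNOWN_FLAKY_TESTS"] === "true";
}
```

In a Jest setup file this would typically be called with `process.env`.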

JoshuaKGoldberg added a commit that referenced this pull request Apr 3, 2026
Previously, `groupingInfoSection.spec.tsx` was testing the
`GroupingInfoSection` component, but all that component does is combine
the `InterimSection` component with a lazy-loaded `GroupingInfo`
component. That lazy-loading can take 100-150ms or more locally, and might
be the cause of the flakiness in CI.

This PR changes the test to:

* Directly test `GroupingInfo`, bypassing the lazy-load
* Delete the specific flaky test that was checking for the component
being loaded after click (since that's covered by `FoldSection`'s unit
tests, as rendered by `InterimSection`)

Fixes ENG-7209

~Note that CI Jest tests are failing because the
https://github.com/getsentry/sentry/labels/Frontend%3A%20Rerun%20Flaky%20Tests
label is causing _other_, still-flaky tests to be run.~ Rebased on
`master` so the `it.isKnownFlake` (#111860) addition isn't here yet.

Made with [Cursor](https://cursor.com)

Co-authored-by: @nikkikapadia
@JoshuaKGoldberg JoshuaKGoldberg commented

I want to merge this PR, but the frontend / Jest tests keep showing flaky test failures, which I suppose is good 🙂. The latest run's failures are at https://github.com/getsentry/sentry/actions/runs/23950050162/job/69855165158?pr=111860.

I'll work on getting granular PRs rebased on master & merged before this PR.
