Add Azure DevOps flaky-history annotations and quarantine awareness #8298

Draft
Evangelink wants to merge 1 commit into main from dev/amauryleve/azdo-flaky-history

Conversation

@Evangelink
Member

Part 2 of the brainstorm in #5951 — adds opt-in flaky-test history annotations and quarantine awareness to Microsoft.Testing.Extensions.AzureDevOpsReport. Decorates AzDO log issues with historical flake context and lets known-noisy failures be downgraded so PR gates aren't blocked.

One of three PRs derived from the #5951 brainstorm. The others:

See issue comment for the broader plan.

Why

Today every failure in an AzDO log looks equally bad. There's no way for the build to say "this test failed 4/20 times in the last 14 days — known noise" vs "this test had zero failures in 14 days and just broke — likely regression". Teams that have moved to MTP currently mark failures as warnings manually with a separate task or live with red PR checks for known-flaky tests.

What

Three new opt-in CLI options on Microsoft.Testing.Extensions.AzureDevOpsReport:

| Option | Type | Purpose |
| --- | --- | --- |
| `--report-azdo-flaky-history <days>` | int (1–90) | Queries AzDO REST history for the last N days and annotates failures with [flaky: failed K/N in last Md] or [REGRESSION] (the latter only when there are ≥5 prior samples). |
| `--report-azdo-quarantine-file <path>` | string | Path to a text file (one FQN/glob per line, # comments allowed) listing tests considered quarantined. Their failures are demoted to warning and tagged [quarantined]; emits ##vso[build.addbuildtag]has-quarantined-test-failure exactly once. |
| `--report-azdo-demote-known-flaky` | zero-arity | Auto-demotes to warning any failure whose flake rate in the window is ≥25%. Off by default (annotate-only). Requires --report-azdo-flaky-history. |

All three are opt-in. When the required AzDO environment variables (SYSTEM_ACCESSTOKEN, SYSTEM_COLLECTIONURI, SYSTEM_TEAMPROJECT, BUILD_DEFINITIONID) are missing, the extension logs a warning and does nothing.
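The annotation and demotion rules described by the three options can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the extension's C# code: the `FlakyStats` field names, the `annotate` function, and the precedence of quarantine over history are assumptions; only the thresholds (≥5 samples, ≥25% flake rate) and the annotation texts come from the description above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MIN_SAMPLES_FOR_REGRESSION = 5  # from the PR: [REGRESSION] needs >= 5 prior samples
DEMOTE_FLAKE_RATE = 0.25        # from the PR: demote when flake rate >= 25%


@dataclass(frozen=True)
class FlakyStats:
    """Pass/fail counts for one test in the history window (hypothetical shape)."""
    failed: int
    passed: int

    @property
    def total(self) -> int:
        return self.failed + self.passed

    @property
    def failure_rate(self) -> float:
        return self.failed / self.total if self.total else 0.0


def annotate(stats: Optional[FlakyStats], days: int,
             quarantined: bool = False,
             demote_known_flaky: bool = False) -> Tuple[str, str]:
    """Return (severity, suffix) for a test that just failed."""
    if quarantined:
        # Assumption: quarantine wins over any history-based annotation.
        return "warning", "[quarantined]"
    if stats is None or stats.total == 0:
        return "error", ""  # no history available: plain error
    if stats.failed == 0:
        # Never failed before: flag as regression only with enough samples.
        if stats.total >= MIN_SAMPLES_FOR_REGRESSION:
            return "error", "[REGRESSION]"
        return "error", ""
    suffix = f"[flaky: failed {stats.failed}/{stats.total} in last {days}d]"
    if demote_known_flaky and stats.failure_rate >= DEMOTE_FLAKE_RATE:
        return "warning", suffix
    return "error", suffix
```

With the example from the motivation, a test that failed 4 of its last 20 runs is annotated but stays an error even with demotion enabled, because 20% is below the 25% threshold; 5 of 20 would be demoted.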

How it works

  • Auth: Authorization: Basic base64(":<SYSTEM_ACCESSTOKEN>"). No Microsoft.TeamFoundationServer.Client dependency; just HttpClient plus a source-generated JsonSerializerContext, so it stays AOT-safe.
  • History query: GET {project}/_apis/test/Runs?definitions={pipelineDefinitionId}&minLastUpdatedDate=…&maxLastUpdatedDate=…&automated=true&$top=200, paginated with $skip up to MaxRunsToInspect = 200. Per run, GET …/results?api-version=7.1&outcomes=Failed,Passed, paged with a continuation token. Results are aggregated into Dictionary<automatedTestName, FlakyStats>.
  • Bounded session start: the history load has a wall-clock budget (default 30 s). If it is exceeded, the extension logs an info message and degrades to empty stats; tests start immediately.
  • Resilience: REST calls retry on transient errors (3 attempts, exponential backoff; 429 honors Retry-After). Response bodies are truncated to 500 chars in error messages. All callbacks catch everything except OperationCanceledException; history/quarantine failures never fail the test run.
  • Quarantine file: line-based (# comments), with case-sensitive ordinal matching against the FQN. Glob patterns (*, ?) are compiled to a single alternation regex. Capped at 10 000 patterns / 4 KB per pattern, with a logged warning.
  • Build tag: ##vso[build.addbuildtag]has-quarantined-test-failure, guarded by Interlocked.Exchange so concurrent failure events emit it exactly once.
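To make the quarantine-file bullet concrete, here is a rough Python sketch of compiling glob lines into one alternation regex. It is illustrative only: the function names are hypothetical, comments are assumed to be whole-line only, "4 KB" is read as UTF-8 bytes, and over-cap patterns are simply skipped here (the PR only says a warning is logged).

```python
import re

MAX_PATTERNS = 10_000         # cap stated in the PR
MAX_PATTERN_BYTES = 4 * 1024  # "4 KB per pattern", read as UTF-8 bytes (assumption)


def glob_to_regex(pattern: str) -> str:
    """Translate a glob into a regex fragment: * and ? are wildcards, the rest is literal."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def compile_quarantine(text: str):
    """Parse quarantine-file text into a single compiled regex, or None if no patterns."""
    fragments = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if len(line.encode("utf-8")) > MAX_PATTERN_BYTES:
            continue  # over the per-pattern cap: skipped in this sketch
        if len(fragments) >= MAX_PATTERNS:
            break     # over the pattern-count cap: ignore the rest
        fragments.append(f"(?:{glob_to_regex(line)})")
    if not fragments:
        return None
    # No re.IGNORECASE: matching stays case-sensitive.
    return re.compile("|".join(fragments), re.DOTALL)


def is_quarantined(matcher, fqn: str) -> bool:
    """True when the whole fully qualified name matches any pattern."""
    return matcher is not None and matcher.fullmatch(fqn) is not None
```

A file containing `MyApp.Tests.NetworkTests.*` (a made-up name) would then quarantine every test in that class, while the same name in different casing would not match.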

Highlights from the expert-reviewer round

Implementation went through one full round of expert-reviewer. Critical issues addressed:

  • C1 (feature was broken): the runs query was using buildIds=<pipelineDefinitionId> (the wrong AzDO parameter — buildIds filters by individual build run id, not pipeline). Switched to definitions=<pipelineDefinitionId> so the query actually returns data. Unit test now asserts the URL contains definitions=.
  • C2: history load was synchronous & serial — up to 25 000 sequential HTTP requests blocking session start. Bounded with a 30 s wall-clock budget + cancellation; degrades to empty stats on exceed.
  • C3: $top=501 was silently capped server-side. Now properly pages runs via $skip until MaxRunsToInspect is reached.
  • C4: [REGRESSION] previously fired on a single prior sample, generating false positives everywhere. Now requires TotalCount >= 5 (configurable via MinSamplesForRegressionAnnotation).
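The C1 and C3 fixes can be illustrated with a small Python sketch of URL composition and `$skip` paging. It is a hedged illustration, not the client's code: `runs_url`, `page_requests`, the page size of 100, and the sample organization, project, and dates are all hypothetical; the query parameter names and `MaxRunsToInspect = 200` are taken from the description.

```python
from urllib.parse import quote, urlencode

MAX_RUNS_TO_INSPECT = 200  # named in the PR
PAGE_SIZE = 100            # hypothetical page size for this sketch


def runs_url(collection_uri, project, definition_id, min_date, max_date, top, skip):
    """Compose the test-runs query URL for one page."""
    query = urlencode(
        {
            "definitions": definition_id,  # pipeline definition id (not buildIds)
            "minLastUpdatedDate": min_date,
            "maxLastUpdatedDate": max_date,
            "automated": "true",
            "$top": top,
            "$skip": skip,
        },
        safe="$:",  # keep "$" in parameter names and ":" in timestamps readable
    )
    return f"{collection_uri.rstrip('/')}/{quote(project)}/_apis/test/Runs?{query}"


def page_requests(max_runs=MAX_RUNS_TO_INSPECT, page_size=PAGE_SIZE):
    """Yield ($top, $skip) pairs that together cover at most max_runs runs."""
    skip = 0
    while skip < max_runs:
        top = min(page_size, max_runs - skip)
        yield top, skip
        skip += top
    # A real loop would also stop early once a page returns fewer than $top runs.
```

A test like the one mentioned for C1 would then assert that the composed URL contains `definitions=` and never `buildIds`.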

Major items also addressed: User-Agent/Accept headers added (prevents AzDO WAF-side 403s), source-generated JsonSerializerContext (AOT-safe), error-body truncation, inner-exception preservation on retry exhaustion, quarantine pattern caps, ordinal-case-sensitive matching, IAzureDevOpsHistoryService interface + proper DI registration, guard-clause CLI validation, acceptance tests for invalid-value error paths.
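The retry behavior (3 attempts, exponential backoff, Retry-After on 429, truncated error bodies, preserved inner exception) might look roughly like this in Python. The names, the 1-second backoff base, and the set of status codes treated as transient are assumptions made for the sketch; only the numeric form of Retry-After is handled.

```python
import time

MAX_ATTEMPTS = 3  # from the PR
TRANSIENT_STATUSES = {408, 429, 500, 502, 503, 504}  # assumption for this sketch


class RetryExhausted(Exception):
    """Raised when every attempt failed; __cause__ holds the last error."""


def retry_delay(attempt, retry_after=None, base=1.0):
    """Seconds to wait before the next attempt."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form of Retry-After is not handled here
    return base * (2 ** attempt)  # exponential backoff: 1 s, 2 s, 4 s, ...


def send_with_retry(send, sleep=time.sleep, max_attempts=MAX_ATTEMPTS):
    """Call send() -> (status, headers, body) until it succeeds or attempts run out."""
    last_error = None
    for attempt in range(max_attempts):
        retry_after = None
        try:
            status, headers, body = send()
        except OSError as exc:  # network-level failure
            last_error = exc
        else:
            if status not in TRANSIENT_STATUSES:
                return status, body
            # Truncate the body so error messages stay short.
            last_error = RuntimeError(f"HTTP {status}: {body[:500]}")
            if status == 429:
                retry_after = headers.get("Retry-After")
        if attempt < max_attempts - 1:
            sleep(retry_delay(attempt, retry_after))
    # "from last_error" keeps the inner exception attached.
    raise RetryExhausted(f"gave up after {max_attempts} attempts") from last_error
```

Injecting `sleep` keeps the helper testable without real waiting, which is one plausible way to unit-test the 429 path.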

Tests

546 unit tests pass. New coverage:

  • AzureDevOpsHistoryServiceTests.cs — history aggregation, paging, time budget, regression boundary (4 vs 5 samples), quarantine tag-emitted-once.
  • AzureDevOpsHistoryClientTests.cs — URL composition (definitions= parameter), User-Agent/Accept headers, 429 retry honoring Retry-After.
  • AzureDevOpsCommandLineProviderTests.cs — cross-product validation (e.g. --demote-known-flaky without --flaky-history, --quarantine-file without --report-azdo).
  • AzureDevOpsCommandLineTests.cs — acceptance-style coverage.

HelpInfoAllExtensionsTests expectations updated for the new options (both --help and --info blocks, alphabetical order preserved).

Build status (local)

  • .\.dotnet\dotnet.exe build src\Platform\Microsoft.Testing.Extensions.AzureDevOpsReport\Microsoft.Testing.Extensions.AzureDevOpsReport.csproj -c Debug: 0 warnings, 0 errors.
  • .\.dotnet\dotnet.exe test test\UnitTests\Microsoft.Testing.Extensions.UnitTests\Microsoft.Testing.Extensions.UnitTests.csproj: 546/546 passed.
  • .\build.cmd -pack: 0 warnings, 0 errors.

Out of scope (deliberate)

Checklist

  • Critical & Major review findings addressed
  • Localized via resx + xlf (regenerated with /t:UpdateXlf, not hand-edited)
  • Help/info acceptance test expectations updated
  • No new public API (besides one internal IAzureDevOpsHistoryService)
  • .\build.cmd green (0 warnings, 0 errors)
  • PR feedback addressed

Refs #5951

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 16, 2026 19:53
Contributor

Copilot AI left a comment

Pull request overview

Adds opt-in Azure DevOps flaky-history annotations and quarantine awareness to the Microsoft.Testing.Extensions.AzureDevOpsReport extension. Failures can now be annotated with historical flake context, demoted from error to warning when known-flaky, and demoted with a [quarantined] tag when listed in a quarantine file (which also emits a one-shot ##vso[build.addbuildtag]has-quarantined-test-failure).

Changes:

  • New CLI options --report-azdo-flaky-history, --report-azdo-quarantine-file, --report-azdo-demote-known-flaky with cross-option validation in AzureDevOpsCommandLineProvider.
  • New AzureDevOpsHistoryService + AzureDevOpsHistoryClient (AOT-safe JsonSerializerContext) that queries the AzDO REST Runs/Results APIs under a 30 s wall-clock budget, with retries, 429 Retry-After honoring, paging caps, and a regression-annotation min-sample threshold.
  • AzureDevOpsReporter now annotates errors with [flaky: failed K/N in last Md] / [REGRESSION] / [quarantined], and demotes severity per the quarantine file and known-flaky rule.
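The bounded history load listed in the changes can be sketched in Python as a worker with a timeout and a cooperative cancellation flag. This is only an analogy for the idea of a wall-clock budget that degrades to empty stats; `load_with_budget` and the loader signature are hypothetical, and the real extension uses .NET cancellation rather than threads and events.

```python
import concurrent.futures
import threading


def load_with_budget(load, budget_seconds=30.0):
    """Run load(cancel_event) on a worker thread; return {} if it overruns or fails."""
    cancel = threading.Event()
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(load, cancel)
    try:
        return future.result(timeout=budget_seconds)
    except concurrent.futures.TimeoutError:
        cancel.set()  # ask the loader to stop at its next checkpoint
        return {}     # degrade to empty stats so tests can start
    except Exception:
        return {}     # a history failure must not fail the test run
    finally:
        pool.shutdown(wait=False)  # do not block session start on the worker
```

The loader is expected to check the event between HTTP calls; a loader that ignores it keeps running in the background, but the caller has already moved on with empty stats.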
Show a summary per file

| File | Description |
| --- | --- |
| src/.../AzureDevOpsCommandLineOptions.cs | Adds 3 new option name constants. |
| src/.../AzureDevOpsCommandLineProvider.cs | Registers new options and adds cross-option validation. |
| src/.../AzureDevOpsExtensions.cs | Wires AzureDevOpsHistoryService as data consumer + session lifetime handler. |
| src/.../AzureDevOpsHistoryClient.cs | New REST client (auth, paging, retries, AOT JSON). |
| src/.../AzureDevOpsHistoryClientJsonContext.cs | Source-generated JSON context for DTOs. |
| src/.../AzureDevOpsHistoryService.cs | Loads/aggregates flaky stats with a bounded budget; exposes TryGetStats/IsLikelyFlaky. |
| src/.../AzureDevOpsReporter.cs | Adds annotation suffix building, severity demotion, one-shot quarantine build tag. |
| src/.../FlakyStats.cs | Struct holding pass/fail counts and failure rate. |
| src/.../IAzureDevOpsHistoryService.cs | Internal abstraction over the history service. |
| src/.../QuarantineFile.cs | Parses quarantine file (globs, # comments, caps) into regex matchers. |
| src/.../Microsoft.Testing.Extensions.AzureDevOpsReport.csproj | Adds System.Text.Json dependency and DynamicProxyGenAssembly2 IVT for Moq. |
| Directory.Packages.props | Pins System.Text.Json version. |
| src/.../Resources/AzureDevOpsResources.resx | New strings for options, warnings, and annotation templates; fixes prior Eanble/AzureDev Ops typos. |
| src/.../Resources/xlf/*.xlf (12 locales) | Regenerated XLFs for new strings; Description/OptionDescription flipped to needs-review-translation after the English typo fix. |
| test/.../AzureDevOpsHistoryClientTests.cs | Asserts URL composition (definitions=), headers, and run-paging behavior. |
| test/.../AzureDevOpsHistoryServiceTests.cs | Covers aggregation, paging, time-budget timeout, regression threshold, demote, quarantine tag-once. |
| test/.../AzureDevOpsCommandLineProviderTests.cs | Validates cross-option error messages. |
| test/.../AzureDevOpsCommandLineTests.cs | Acceptance-style test for invalid CLI argument errors. |
| test/.../HelpInfoAllExtensionsTests.cs | Updates --help / --info expectations for new options. |

Copilot's findings

  • Files reviewed: 31/31 changed files
  • Comments generated: 0

3 participants