Rename run.py → setup_helix.py and externalize test.py args by davidnguyen-tech · Pull Request #5181 · dotnet/performance

davidnguyen-tech · 2026-03-30T10:43:02Z

Summary

Refactors the MAUI Android inner loop scenario to align with the repo convention where .proj files call test.py directly.

Changes

- Renamed run.py → setup_helix.py and removed the run_test() function that spawned test.py as a subprocess with hardcoded args
- Updated maui_scenarios_android_innerloop.proj to chain setup_helix.py && test.py explicitly, with all measurement args visible in the .proj
- Deleted the test-prototype.py exploration artifact

Why

Every other scenario in this repo has the .proj call test.py directly. This scenario diverged by using run.py as an intermediary that hardcoded the test.py arguments. The setup (SDK, workloads, ADB) is genuinely needed but belongs in a separate script chained before test.py, not wrapping it.

Wire the mauiandroidinnerloop scenario into the Helix-based CI pipeline. This measures first-deploy and incremental-deploy times for MAUI Android apps using dotnet build -t:Install.

Changes:
- Add maui_scenarios_android_innerloop.proj for Helix workitems
- Add 4 inner loop job entries to sdk-perf-jobs.yml
- Add pre.py/test.py/post.py scenario scripts
- Add AndroidInnerLoopParser for binlog parsing
- Update shared const.py and runner.py for ANDROIDINNERLOOP test type
- Fix csproj TFM targeting and NuGet package locality in pre.py
- Add build-server shutdown to post.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Remove Release job entries — inner loop measures developer inner loop which is always Debug. Remove EmbedAssembliesIntoApk=true to keep FastDev enabled (the default for Debug builds). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

pre.py uses PreCommands() which requires a subcommand argument. Pass 'default -f $(PERFLAB_Framework)' so argparse doesn't fail. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Don't ship NuGet packages in the Helix payload: the .packages/ directory was ~1-2GB, causing OutOfMemoryException during ZIP creation. Instead:
- Skip restore during template creation (no_restore=True)
- Copy merged NuGet.config to app directory for Helix access
- Add dotnet restore step in Helix PreCommands
- Packages are restored on-demand on the Helix machine

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The dotnet/packs directory (~3GB with MAUI workloads) caused OutOfMemoryException when the Helix SDK tried to ZIP the correlation payload in memory. Fix: Remove packs from correlation staging (like existing MAUI scenarios) and install the MAUI workload on the Helix machine using the rollback file from pre.py. This keeps the payload small while ensuring the build has everything it needs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace --skip-sign-check (nonexistent flag) with --skip-manifest-update to avoid downloading manifest updates on Helix. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add echo statements, file existence checks, and --verbosity detailed to diagnose Helix workitem failures. Also use full paths with HELIX_WORKITEM_ROOT for all file references. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace full inner loop test with minimal diagnostic that runs dotnet --info and dir commands to verify infrastructure. We can't access Helix console logs, so this helps isolate what works. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Infrastructure diagnostic passed. Now testing full flow with workload install, restore, and test.py. Each step has echo markers to identify failure point. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add performance/src/scenarios to PYTHONPATH so Python can find the shared/ module directory on Helix. Also remove --verbosity detailed from workload install and restore (debug-only, creates noise). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Add maui_scenarios_android_innerloop to --runtime-flavor in() check in run-performance-job.yml so RuntimeFlavor env var gets set
- Add get_run_configurations() handler in run_performance_job.py for telemetry config (CodegenType, RuntimeType, BuildConfig)
- Add inner loop run_kind to binlog copy condition

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

HelixPreCommands (via machine-setup.cmd and run_performance_job.py) already sets DOTNET_ROOT, PATH, and PYTHONPATH. The workitem PreCommands were redundantly setting these with raw semicolons in values, which the Helix SDK interprets as command separators, corrupting the command chain. Simplified PreCommands to only set NUGET_PACKAGES (the one env var not already configured by HelixPreCommands). Also consolidated duplicate get_run_configurations() blocks for android and android_innerloop. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Redirect all Command output to %HELIX_WORKITEM_UPLOAD_ROOT%\output.log so we can see what fails on the Helix machine. Added env var diagnostics (ANDROID_HOME, ADB path, etc.) to identify missing dependencies. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

$(BuildConfig) is overridden by send_to_helix.py to the artifact config string (e.g., x64_main_maui_scenarios_android_innerloop), which causes dotnet build -c x64_main_... to fail. All inner loop jobs use Debug. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The (echo ... && ...) > output.log 2>&1 grouping breaks when %PATH% contains (x86) because the parentheses corrupt the batch group. Reverted to clean command chain without output capture. Kept -c Debug. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Batch (...) grouping breaks when %PATH% contains (x86). Using a wrapper script with setlocal enabledelayedexpansion avoids this. Captures diagnostics (ANDROID_HOME, ADB, DOTNET_ROOT) and per-step output to %HELIX_WORKITEM_UPLOAD_ROOT%\output.log with exit codes per step for failure isolation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…llback-file

The newer .NET SDK (11.0.100-preview.3) rejects using both flags together. --from-rollback-file already pins versions, making --skip-manifest-update redundant.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

NETSDK1226: Prune Package data not found for .NETCoreApp 11.0. This is expected with preview SDKs. Adding the property allows restore to proceed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

DOTNET_ROOT points to dotnet-cli/ (set by machine-setup.cmd) but run.cmd was using %HELIX_CORRELATION_PAYLOAD%\dotnet\dotnet which is a different SDK. Workload was installed into the wrong SDK, causing NETSDK1139 (android platform not recognized) when test.py used the correct SDK. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Helix machines cannot reach NuGet certificate revocation servers, causing NU3018 errors during workload install. Set NUGET_CERT_REVOCATION_MODE=offline to use local CRL cache. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

dotnet workload install ignores DOTNET_NUGET_SIGNATURE_VERIFICATION env var. Add signatureValidationMode=accept to NuGet.config both in pre.py (build machine) and run.cmd (Helix machine) to handle CI-signed packages. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

ci_setup.py installs .NET 11 SDK into dotnet/ directory but DOTNET_ROOT points to dotnet-cli/ which has .NET 8.0.100. Override DOTNET_ROOT to use the correct SDK for workload install and build operations. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

NETSDK1226 occurs during dotnet build -t:Install on .NET 11 preview. Already fixed for restore step but also needed for the build command that runs through test.py. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The Windows.11.Amd64.Pixel.Perf Helix queue has Pixel 8 devices but ANDROID_HOME is not set and ADB is not on PATH. Existing scenarios use XHarness which bundles its own ADB, but our inner loop scenario calls dotnet build -t:Install directly, which requires ANDROID_HOME. Create a minimal fake Android SDK directory structure at the workitem root, copy all ADB files from XHarness's bundled location into platform-tools/, set ANDROID_HOME, and add platform-tools to PATH. This runs before any dotnet commands so MSBuild picks it up from the environment. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
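The fake Android SDK layout described in this commit can be sketched in Python for illustration. The real setup lives in run.cmd; the function names, parameters, and the "android-sdk" directory name below are assumptions, and only platform-tools, ANDROID_HOME, and PATH come from the commit message.

```python
import os
import shutil
from pathlib import Path


def create_minimal_android_home(workitem_root, adb_source_dir):
    """Create <workitem_root>/android-sdk/platform-tools and copy ADB files into it.

    'android-sdk' is a hypothetical directory name; the commit only says the
    fake SDK is created at the workitem root.
    """
    android_home = Path(workitem_root) / "android-sdk"
    platform_tools = android_home / "platform-tools"
    platform_tools.mkdir(parents=True, exist_ok=True)

    # Copy every file that ships next to adb (adb.exe plus its DLLs on Windows).
    for entry in Path(adb_source_dir).iterdir():
        if entry.is_file():
            shutil.copy2(entry, platform_tools / entry.name)

    return android_home


def apply_android_env(env, android_home):
    """Return a copy of env with ANDROID_HOME set and platform-tools first on PATH."""
    updated = dict(env)
    platform_tools = str(Path(android_home) / "platform-tools")
    updated["ANDROID_HOME"] = str(android_home)
    updated["PATH"] = platform_tools + os.pathsep + updated.get("PATH", "")
    return updated
```

Returning a new environment mapping instead of mutating os.environ keeps the sketch easy to test; a real script would pass the result to the subprocess that runs the dotnet commands.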

dotnet workload install maui-android does NOT install Java/OpenJDK, so dotnet build -t:Install fails with XA5300 on Helix machines. Add a multi-tier discovery strategy:
1. Search common Windows JDK paths (Microsoft, Oracle, Eclipse Adoptium)
2. Fall back to `where java` and derive JAVA_HOME from the exe path
3. If still not found, log diagnostic info (dir listings) so the XA5300 failure is debuggable from the uploaded output.log

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
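The three-tier discovery order can be sketched in Python as follows. This is not the run.cmd implementation: find_java_home, its parameters, and the idea of passing the candidate directories in as a list are assumptions made for illustration. shutil.which plays the role of `where java`.

```python
import os
import shutil
from pathlib import Path


def find_java_home(candidate_roots, which=shutil.which):
    """Return a JAVA_HOME path, or None if no JDK is found.

    candidate_roots: directories that may contain JDK installs (illustrative;
    not the script's exact list of Microsoft/Oracle/Adoptium paths).
    """
    java_exe = "java.exe" if os.name == "nt" else "java"

    # Tier 1: look for <root>/<jdk-dir>/bin/java under each common location.
    for root in candidate_roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue
        for jdk_dir in sorted(root_path.iterdir()):
            if (jdk_dir / "bin" / java_exe).is_file():
                return str(jdk_dir)

    # Tier 2: fall back to a PATH lookup and derive JAVA_HOME from .../bin/java.
    found = which("java")
    if found:
        return str(Path(found).resolve().parent.parent)

    # Tier 3: nothing found; the caller is expected to log diagnostics.
    return None
```

A caller would treat a None result as the signal to write directory listings to output.log before letting the build fail with XA5300.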

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The Windows.11.Amd64.Pixel.Perf Helix machines do not have Chocolatey installed. Replace the choco install fallback with a direct download of Microsoft OpenJDK 17 using curl.exe (available on Windows 10+) and PowerShell Expand-Archive for extraction. The common-path search is preserved as a fast path for machines that already have a JDK. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The build fails with XA5205 (Cannot find aapt2.exe) because our fake ANDROID_HOME only has platform-tools/adb.exe. MSBuild searches for aapt2.exe in ANDROID_HOME/build-tools/<version>/. Add a two-strategy approach right after the ANDROID_HOME setup:
1. Try copying aapt2 from the MAUI workload pack (Microsoft.Android.Sdk.Windows/*/tools/), which avoids a 57MB download
2. Fall back to downloading build-tools_r35.0.0-windows.zip from Google

If both fail, log a warning and let the build fail with XA5205 so we get diagnostic info.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Bash equivalent of run.cmd for the Ubuntu.2204.Amd64.Android.29 Helix queue which has the Android emulator, SDK, ADB, and Java pre-installed. Much simpler than the Windows device track — no ANDROID_HOME setup, Java download, or build-tools download needed. Steps: verify emulator boot → install maui-android workload → restore → test.py Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The aapt2/build-tools setup block was running before the workload install step, but Strategy 1 (copy from workload pack Microsoft.Android.Sdk.Windows) needs the pack to exist first — it's only created by 'dotnet workload install maui-android'. Move the block to after Step 1 succeeds. Also fix the Google download URL for build-tools: the version format in the URL is 'r35', not 'r35.0.0'. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…i-android'

'dotnet workload install maui-android' fails because the maui-android workload has dependencies on iOS/MacCatalyst packages that don't exist in the configured NuGet feeds. 'dotnet workload restore' on the csproj only installs workloads needed by the project's target frameworks (net11.0-android), avoiding the missing iOS/MacCatalyst packages. This also removes the separate non-fatal workload restore call that followed, since workload restore is now the primary (and only) command.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The pinned 'workload install maui-android --from-rollback-file' fails because iOS/MacCatalyst dependency packages at the pinned version don't exist in the NuGet feeds. Fix: run 'workload restore' first to install all project-required workloads (including iOS/MacCatalyst) at whatever version IS available, then run 'workload install' to pin maui-android to the exact rollback version we want to test. Workload restore failure is non-fatal (logged as warning); workload install failure remains fatal. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…endency commands

The MAUI template csproj has multi-platform TargetFrameworks (ios, maccatalyst, android, etc.). Without an explicit override, workload restore demands the ios workload and dotnet restore fails with NETSDK1147. restore_packages() already handled this correctly.

Add -p:TargetFrameworks={ctx['framework']} to:
- workload restore in install_workload()
- dotnet restore in install_android_dependencies()
- dotnet msbuild -t:InstallAndroidDependencies

Also add a note in pre.py that the csproj TFM rewrite is now largely redundant since all commands override TargetFrameworks via properties.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The dotnet msbuild -t:InstallAndroidDependencies command fails with MSB4057 ('target does not exist') because TargetFrameworks (plural) triggers an outer/dispatch build that never imports the Android SDK targets. The InstallAndroidDependencies target is only defined in the inner build where TargetFramework (singular) is set, which causes the Android SDK to be imported. Change TargetFrameworks to TargetFramework for the msbuild invocation only. The restore command keeps TargetFrameworks (plural) since restore handles the outer build correctly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
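The plural/singular split can be made concrete with a small sketch of the two command lines. The helper names and the example csproj path are hypothetical; only the dotnet subcommands, the InstallAndroidDependencies target, and the two property names come from the commit message.

```python
def restore_args(csproj, framework):
    """dotnet restore keeps TargetFrameworks (plural): restore handles the outer build."""
    return ["dotnet", "restore", csproj, f"-p:TargetFrameworks={framework}"]


def install_android_dependencies_args(csproj, framework):
    """The msbuild call uses TargetFramework (singular) so the inner build runs
    and the Android SDK targets, including InstallAndroidDependencies, are imported."""
    return [
        "dotnet", "msbuild", csproj,
        "-t:InstallAndroidDependencies",
        f"-p:TargetFramework={framework}",
    ]


if __name__ == "__main__":
    # Hypothetical project path and framework value, for illustration only.
    print(" ".join(restore_args("app/MauiApp.csproj", "net11.0-android")))
    print(" ".join(install_android_dependencies_args("app/MauiApp.csproj", "net11.0-android")))
```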

…tore now succeeds

The SkipResolvePackageAssets=true property was originally added as a safety net to avoid NETSDK1004 when restore might fail and leave an incomplete project.assets.json. Now that restore succeeds (thanks to AllowMissingPrunePackageData + TargetFrameworks override), SkipResolvePackageAssets blocks MSBuild from loading the targeting pack resolved during restore, causing NETSDK1127 ('targeting pack Microsoft.NETCore.App is not installed'). Removing it lets the msbuild invocation use the fully resolved assets from the successful restore.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The dotnet build command in runner.py doesn't receive the TargetFrameworks override, causing its implicit restore to evaluate all TFMs in the csproj (including net11.0-ios) and fail with NETSDK1147. Adding TargetFrameworks to PERFLAB_MSBUILD_ARGS ensures the build only evaluates the android TFM. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Move AllowMissingPrunePackageData and UseSharedCompilation into the csproj via pre.py property injection instead of scattering them across .proj, run.py, and runner.py. Remove the TFM rewrite logic from pre.py — TargetFrameworks override handles this. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Loop measure_startup() N times per deploy to reduce noise in startup timing. Default 10 iterations, configurable via StartupIterations property in the .proj file. Refactor measure_startup() to encapsulate its command-building logic. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace 10x startup-only loop with N full incremental iterations (edit → build+deploy → startup). Each iteration captures a binlog and measures startup once. Build metrics and startup times are collected into results arrays for statistical analysis. Default 10 iterations, configurable via InnerLoopIterations in .proj. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
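The shape of the new loop (one edit, one build+deploy, one startup measurement per iteration, with results gathered into arrays) might look roughly like the sketch below. The function name and the three injected callables are hypothetical stand-ins, not the runner.py code.

```python
def run_incremental_iterations(iterations, edit_source, build_and_deploy, measure_startup):
    """Run N edit -> build+deploy -> startup cycles and collect per-iteration results.

    The three callables are hypothetical stand-ins for the real steps:
      edit_source(i)       modifies the app source for iteration i
      build_and_deploy(i)  returns build metrics (a dict) for iteration i
      measure_startup()    returns one startup time in milliseconds
    """
    build_results = []
    startup_results = []
    for i in range(1, iterations + 1):
        edit_source(i)
        build_results.append(build_and_deploy(i))
        startup_results.append(measure_startup())
    return {"builds": build_results, "startups_ms": startup_results}
```

Keeping the per-iteration values in parallel lists leaves the statistical aggregation to a later step, which matches the commit's description of collecting results arrays for analysis.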

Move the ~280-line ANDROIDINNERLOOP elif branch from runner.py into a dedicated AndroidInnerLoopHelper class in shared/androidinnerloop.py. The new class follows the same pattern as SODWrapper and DevicePowerConsumptionHelper: stateless instantiation with a run() method that accepts all needed parameters explicitly. The two nested functions (measure_startup, merge_build_and_startup) become private methods on the class. The runner.py elif branch is replaced with a thin 14-line dispatcher that creates the helper and forwards self.* attributes as explicit keyword arguments. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Move 'import upload' and 'from performance.constants import UPLOAD_CONTAINER, UPLOAD_STORAGE_URI, UPLOAD_QUEUE' from top-level into the run() method, right before they are used. This prevents a crash during pre.py when azure.storage.blob is not installed — the import chain pre.py → test.py → runner.py → androidinnerloop.py → upload → azure.storage.blob would fail at module load time. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
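The deferred-import pattern behind this fix can be shown with a minimal sketch. The run() signature, the upload_module_name parameter, and the uploader.upload(results) call are assumptions for illustration; they are not the signatures used in the repo.

```python
import importlib


def run(results, upload_module_name=None):
    """Return results; import the upload module only when an upload is requested.

    Because the import happens inside the function, merely importing this
    module never pulls in the uploader's dependencies.
    """
    if upload_module_name is not None:
        # Deferred import: fails here, at call time, rather than at module load.
        uploader = importlib.import_module(upload_module_name)
        uploader.upload(results)  # hypothetical call signature
    return results
```

With this layout, a script that imports the module but never uploads (as pre.py does through the import chain described above) is unaffected by a missing optional dependency.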

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Revert the extraction into androidinnerloophelper.py and keep the logic inline in runner.py, but improve readability by:
- Extracting _measure_startup(), _merge_build_and_startup(), and _run_incremental_iteration() as module-level helper functions
- Adding section comments (Validate inputs, First build + deploy, Resolve activity name, Incremental loop, Aggregate, Cleanup)
- Reducing the elif branch from ~295 lines to ~176 lines

No behavioral changes: pure readability refactoring.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Move measure_startup and merge_build_and_startup back inside the ANDROIDINNERLOOP elif block as nested functions, where they were originally. They are only used within that block and by _run_incremental_iteration (which now receives measure_startup as a parameter). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Instead of toggling between original/modified file contents on odd/even iterations (which allows MSBuild to cache previously-compiled states), each incremental iteration now appends one '!' character to the 'Hello, World!' string. This guarantees every iteration produces genuinely new, never-before-compiled content.

Changes:
- runner.py: Replace toggle logic with re.sub append in run_incremental_iteration; remove editsrc/original_content/modified_content parameters; remove --edit-src argparser entry
- pre.py: Remove modified MainPage.xaml.cs creation (no longer needed)
- run.py: Remove --edit-src argument from test command

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
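A minimal sketch of the append-one-character edit, assuming the string appears in the source file as a double-quoted literal. The function name and the exact regular expression are assumptions; the commit only states that re.sub is used to append a '!'.

```python
import re


def append_exclamation(source):
    """Append one '!' to the "Hello, World!" literal so each iteration is new content."""
    # Group 1 captures the opening quote, the text, and any '!' already present;
    # group 2 captures the closing quote.
    pattern = r'("Hello, World!*)(")'
    new_source, count = re.subn(pattern, r'\1!\2', source, count=1)
    if count == 0:
        raise ValueError('expected "Hello, World!" literal not found')
    return new_source
```

Applying the function repeatedly grows the literal by one character per iteration, so no two iterations ever compile identical source.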

- Use dotnet new maui -sc (sample content) for a more realistic app
- Toggle both MainPage.xaml.cs and MainPage.xaml each iteration, exercising both Csc and XamlC compiler paths
- Replace silent fallback modifications with explicit, verifiable edits: .xaml.cs adds a Debug.WriteLine line, .xaml changes a label text
- Fail fast if expected source patterns are not found
- Support multiple edit-src/edit-dest file pairs (semicolon-separated)
- Add content hash and length logging for toggle verification

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
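Three of the items in this commit (semicolon-separated file pairs, fail-fast edits, and hash/length logging) can be sketched together. All function names below are hypothetical, and the fingerprint format is an assumption rather than the log format the scripts actually emit.

```python
import hashlib


def parse_edit_pairs(edit_src, edit_dest):
    """Split semicolon-separated source and destination lists into (src, dest) pairs."""
    sources = [s for s in edit_src.split(";") if s]
    dests = [d for d in edit_dest.split(";") if d]
    if len(sources) != len(dests):
        raise ValueError("edit-src and edit-dest must list the same number of files")
    return list(zip(sources, dests))


def apply_edit(content, old, new):
    """Replace the first occurrence of old with new; fail fast if old is missing."""
    if old not in content:
        raise ValueError(f"expected source pattern not found: {old!r}")
    return content.replace(old, new, 1)


def fingerprint(content):
    """Return a short hash plus length, suitable for logging to verify an edit took effect."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()[:12]
    return f"sha256={digest} len={len(content)}"
```

Logging fingerprint() before and after each edit gives a cheap way to confirm that the file on disk really changed between iterations.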

…d-innerloop

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The csproj injection in pre.py was skipped because the MAUI template already sets SupportedOSPlatformVersion conditionally. MSBuild /p: properties take highest precedence and override project-level values. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Remove the run_test() function that spawned test.py as a subprocess with hardcoded arguments. The .proj file will be updated in a subsequent commit to chain setup_helix.py && test.py explicitly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Move the hardcoded test.py arguments from the deleted run_test() function into the .proj Command line. Both Windows and Linux HelixWorkItem entries now explicitly show all measurement args. Delete test-prototype.py exploration artifact. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

davidnguyen-tech and others added 30 commits March 21, 2026 11:21

Fix pre.py invocation: add required subcommand

bbecc8d

Fix workload install: remove invalid --skip-sign-check

519472c

Test: add workload install + restore + test.py back

9573714

Fix restore: AllowMissingPrunePackageData for .NET 11 preview

e905a02

Fix NuGet cert revocation check on Helix

a8fd45b

Install OpenJDK via Chocolatey when not found on Helix

4190ec5

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

davidnguyen-tech and others added 28 commits March 25, 2026 16:34

Remove old redundant scripts

cb52b2d

Rename androidinnerloop.py to androidinnerloophelper.py

493dce0

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Move _run_incremental_iteration into ANDROIDINNERLOOP block

82e04e8

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Simplify ANDROIDINNERLOOP cleanup logic

0dcb821

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Keep binlogs in TRACEDIR for Helix artifact upload

5c50790

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Merge remote-tracking branch 'origin/main' into nguyendav/maui-androi…

73be5a4

…d-innerloop

Remove obvious comments from copytree calls

8e38712

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Set SupportedOSPlatformVersion to 23 for Android deps

e1ade86

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Link tracking issue

9d43ba5

Remove redundant comment

ba19610

davidnguyen-tech closed this Mar 31, 2026

davidnguyen-tech deleted the nguyendav/maui-android-innerloop branch March 31, 2026 09:07

davidnguyen-tech wants to merge 117 commits into main from nguyendav/maui-android-innerloop
