
Rename run.py → setup_helix.py and externalize test.py args #5181

Closed
davidnguyen-tech wants to merge 117 commits into main from nguyendav/maui-android-innerloop

Conversation

@davidnguyen-tech
Member

Summary

Refactors the MAUI Android inner loop scenario to align with the repo convention where .proj files call test.py directly.

Changes

  • Renamed run.py → setup_helix.py and removed the run_test() function that spawned test.py as a subprocess with hardcoded args
  • Updated maui_scenarios_android_innerloop.proj to chain setup_helix.py && test.py explicitly, with all measurement args visible in the .proj
  • Deleted test-prototype.py exploration artifact

Why

Every other scenario in this repo has the .proj call test.py directly. This scenario diverged by using run.py as an intermediary that hardcoded the test.py arguments. The setup (SDK, workloads, ADB) is genuinely needed but belongs in a separate script chained before test.py, not wrapping it.
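
A minimal sketch of the setup-only shape described above, under stated assumptions: `run_setup` and the step callables are hypothetical names, not the actual contents of setup_helix.py. The point is only that the setup script returns an exit code and never spawns test.py itself, so a `setup_helix.py && test.py` chain in the .proj runs test.py only when setup succeeds.

```python
from typing import Callable, Sequence


def run_setup(steps: Sequence[Callable[[], bool]]) -> int:
    """Run each setup step in order and return a process exit code.

    Hypothetical helper: each step returns True on success. The first
    failing step stops the sequence and yields a non-zero code, which
    makes the `&&` in the .proj command skip test.py.
    """
    for step in steps:
        if not step():
            return 1
    return 0


# In a script, the result would be passed to sys.exit(), for example:
#   sys.exit(run_setup([install_sdk, install_workloads, check_adb]))
# where the three step names are placeholders for the real setup work.
```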

davidnguyen-tech and others added 30 commits March 21, 2026 11:21
Wire the mauiandroidinnerloop scenario into the Helix-based CI pipeline.
This measures first-deploy and incremental-deploy times for MAUI Android
apps using dotnet build -t:Install.

Changes:
- Add maui_scenarios_android_innerloop.proj for Helix workitems
- Add 4 inner loop job entries to sdk-perf-jobs.yml
- Add pre.py/test.py/post.py scenario scripts
- Add AndroidInnerLoopParser for binlog parsing
- Update shared const.py and runner.py for ANDROIDINNERLOOP test type
- Fix csproj TFM targeting and NuGet package locality in pre.py
- Add build-server shutdown to post.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove Release job entries — inner loop measures developer inner loop
which is always Debug. Remove EmbedAssembliesIntoApk=true to keep
FastDev enabled (the default for Debug builds).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
pre.py uses PreCommands() which requires a subcommand argument.
Pass 'default -f $(PERFLAB_Framework)' so argparse doesn't fail.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Don't ship NuGet packages in the Helix payload — the .packages/
directory was ~1-2GB causing OutOfMemoryException during ZIP creation.

Instead:
- Skip restore during template creation (no_restore=True)
- Copy merged NuGet.config to app directory for Helix access
- Add dotnet restore step in Helix PreCommands
- Packages are restored on-demand on the Helix machine

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The dotnet/packs directory (~3GB with MAUI workloads) caused
OutOfMemoryException when the Helix SDK tried to ZIP the correlation
payload in memory.

Fix: Remove packs from correlation staging (like existing MAUI
scenarios) and install the MAUI workload on the Helix machine using
the rollback file from pre.py. This keeps the payload small while
ensuring the build has everything it needs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace --skip-sign-check (nonexistent flag) with --skip-manifest-update
to avoid downloading manifest updates on Helix.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add echo statements, file existence checks, and --verbosity detailed
to diagnose Helix workitem failures. Also use full paths with
HELIX_WORKITEM_ROOT for all file references.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace full inner loop test with minimal diagnostic that runs
dotnet --info and dir commands to verify infrastructure. We can't
access Helix console logs, so this helps isolate what works.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Infrastructure diagnostic passed. Now testing full flow with
workload install, restore, and test.py. Each step has echo
markers to identify failure point.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add performance/src/scenarios to PYTHONPATH so Python can find the
shared/ module directory on Helix. Also remove --verbosity detailed
from workload install and restore (debug-only, creates noise).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add maui_scenarios_android_innerloop to --runtime-flavor in() check
  in run-performance-job.yml so RuntimeFlavor env var gets set
- Add get_run_configurations() handler in run_performance_job.py for
  telemetry config (CodegenType, RuntimeType, BuildConfig)
- Add inner loop run_kind to binlog copy condition

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
HelixPreCommands (via machine-setup.cmd and run_performance_job.py) already
sets DOTNET_ROOT, PATH, and PYTHONPATH. The workitem PreCommands were
redundantly setting these with raw semicolons in values, which the Helix SDK
interprets as command separators, corrupting the command chain.

Simplified PreCommands to only set NUGET_PACKAGES (the one env var not
already configured by HelixPreCommands). Also consolidated duplicate
get_run_configurations() blocks for android and android_innerloop.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Redirect all Command output to %HELIX_WORKITEM_UPLOAD_ROOT%\output.log
so we can see what fails on the Helix machine. Added env var diagnostics
(ANDROID_HOME, ADB path, etc.) to identify missing dependencies.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
$(BuildConfig) is overridden by send_to_helix.py to the artifact config
string (e.g., x64_main_maui_scenarios_android_innerloop), which causes
dotnet build -c x64_main_... to fail. All inner loop jobs use Debug.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The (echo ... && ...) > output.log 2>&1 grouping breaks when %PATH%
contains (x86) because the parentheses corrupt the batch group.
Reverted to clean command chain without output capture. Kept -c Debug.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Batch (...) grouping breaks when %PATH% contains (x86). Using a
wrapper script with setlocal enabledelayedexpansion avoids this.
Captures diagnostics (ANDROID_HOME, ADB, DOTNET_ROOT) and per-step
output to %HELIX_WORKITEM_UPLOAD_ROOT%\output.log with exit codes
per step for failure isolation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…llback-file

The newer .NET SDK (11.0.100-preview.3) rejects using both flags together.
--from-rollback-file already pins versions, making --skip-manifest-update
redundant.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
NETSDK1226: Prune Package data not found for .NETCoreApp 11.0.
This is expected with preview SDKs. Adding the property allows
restore to proceed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
DOTNET_ROOT points to dotnet-cli/ (set by machine-setup.cmd) but
run.cmd was using %HELIX_CORRELATION_PAYLOAD%\dotnet\dotnet which
is a different SDK. Workload was installed into the wrong SDK,
causing NETSDK1139 (android platform not recognized) when test.py
used the correct SDK.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Helix machines cannot reach NuGet certificate revocation servers,
causing NU3018 errors during workload install. Set
NUGET_CERT_REVOCATION_MODE=offline to use local CRL cache.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
dotnet workload install ignores DOTNET_NUGET_SIGNATURE_VERIFICATION env var.
Add signatureValidationMode=accept to NuGet.config both in pre.py (build
machine) and run.cmd (Helix machine) to handle CI-signed packages.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ci_setup.py installs .NET 11 SDK into dotnet/ directory but DOTNET_ROOT
points to dotnet-cli/ which has .NET 8.0.100. Override DOTNET_ROOT to
use the correct SDK for workload install and build operations.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
NETSDK1226 occurs during dotnet build -t:Install on .NET 11 preview.
Already fixed for restore step but also needed for the build command
that runs through test.py.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Windows.11.Amd64.Pixel.Perf Helix queue has Pixel 8 devices but
ANDROID_HOME is not set and ADB is not on PATH. Existing scenarios use
XHarness which bundles its own ADB, but our inner loop scenario calls
dotnet build -t:Install directly, which requires ANDROID_HOME.

Create a minimal fake Android SDK directory structure at the workitem
root, copy all ADB files from XHarness's bundled location into
platform-tools/, set ANDROID_HOME, and add platform-tools to PATH.
This runs before any dotnet commands so MSBuild picks it up from the
environment.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
dotnet workload install maui-android does NOT install Java/OpenJDK, so
dotnet build -t:Install fails with XA5300 on Helix machines. Add a
multi-tier discovery strategy:

1. Search common Windows JDK paths (Microsoft, Oracle, Eclipse Adoptium)
2. Fall back to `where java` and derive JAVA_HOME from the exe path
3. If still not found, log diagnostic info (dir listings) so the XA5300
   failure is debuggable from the uploaded output.log

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
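
The first two discovery tiers from the commit above could be sketched in Python roughly as follows. This is an illustration, not the script from the PR: the directory list in `COMMON_JDK_ROOTS`, the function name `find_java_home`, and the `java_exe` override (added only to make the fallback easy to exercise) are all assumptions.

```python
import shutil
from pathlib import Path
from typing import Iterable, Optional

# Assumed install roots for Microsoft, Oracle, and Eclipse Adoptium JDKs.
COMMON_JDK_ROOTS = [
    r"C:\Program Files\Microsoft",
    r"C:\Program Files\Java",
    r"C:\Program Files\Eclipse Adoptium",
]


def find_java_home(roots: Iterable[str] = COMMON_JDK_ROOTS,
                   java_exe: Optional[str] = None) -> Optional[str]:
    """Return a JAVA_HOME candidate, or None if nothing is found."""
    # Tier 1: look for <root>/<jdk-dir>/bin/java(.exe) under known roots.
    for root in roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue
        for candidate in sorted(root_path.iterdir(), reverse=True):
            bin_dir = candidate / "bin"
            if (bin_dir / "java.exe").is_file() or (bin_dir / "java").is_file():
                return str(candidate)

    # Tier 2: fall back to whatever `java` is on PATH and derive
    # JAVA_HOME from it (<JAVA_HOME>/bin/java is two levels down).
    exe = java_exe or shutil.which("java")
    if exe:
        return str(Path(exe).resolve().parent.parent)

    # Tier 3 (logging diagnostics) is left to the caller.
    return None
```

A caller would set `JAVA_HOME` in the environment when the function returns a path, and emit the directory listings described in step 3 when it returns `None`.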
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Windows.11.Amd64.Pixel.Perf Helix machines do not have Chocolatey
installed. Replace the choco install fallback with a direct download of
Microsoft OpenJDK 17 using curl.exe (available on Windows 10+) and
PowerShell Expand-Archive for extraction. The common-path search is
preserved as a fast path for machines that already have a JDK.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The build fails with XA5205 (Cannot find aapt2.exe) because our fake
ANDROID_HOME only has platform-tools/adb.exe. MSBuild searches for
aapt2.exe in ANDROID_HOME/build-tools/<version>/.

Add a two-strategy approach right after the ANDROID_HOME setup:
1. Try copying aapt2 from the MAUI workload pack
   (Microsoft.Android.Sdk.Windows/*/tools/) — avoids a 57MB download
2. Fall back to downloading build-tools_r35.0.0-windows.zip from Google

If both fail, log a warning and let the build fail with XA5205 so we
get diagnostic info.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Bash equivalent of run.cmd for the Ubuntu.2204.Amd64.Android.29 Helix queue
which has the Android emulator, SDK, ADB, and Java pre-installed. Much simpler
than the Windows device track — no ANDROID_HOME setup, Java download, or
build-tools download needed.

Steps: verify emulator boot → install maui-android workload → restore → test.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The aapt2/build-tools setup block was running before the workload install
step, but Strategy 1 (copy from workload pack Microsoft.Android.Sdk.Windows)
needs the pack to exist first — it's only created by 'dotnet workload install
maui-android'. Move the block to after Step 1 succeeds.

Also fix the Google download URL for build-tools: the version format in the
URL is 'r35', not 'r35.0.0'.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
davidnguyen-tech and others added 28 commits March 25, 2026 16:34
…i-android'

'dotnet workload install maui-android' fails because the maui-android
workload has dependencies on iOS/MacCatalyst packages that don't exist
in the configured NuGet feeds. 'dotnet workload restore' on the csproj
only installs workloads needed by the project's target frameworks
(net11.0-android), avoiding the missing iOS/MacCatalyst packages.

This also removes the separate non-fatal workload restore call that
followed, since workload restore is now the primary (and only) command.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The pinned 'workload install maui-android --from-rollback-file' fails
because iOS/MacCatalyst dependency packages at the pinned version don't
exist in the NuGet feeds.

Fix: run 'workload restore' first to install all project-required
workloads (including iOS/MacCatalyst) at whatever version IS available,
then run 'workload install' to pin maui-android to the exact rollback
version we want to test. Workload restore failure is non-fatal (logged
as warning); workload install failure remains fatal.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…endency commands

The MAUI template csproj has multi-platform TargetFrameworks (ios,
maccatalyst, android, etc.). Without an explicit override, workload
restore demands the ios workload and dotnet restore fails with
NETSDK1147. restore_packages() already handled this correctly.

Add -p:TargetFrameworks={ctx['framework']} to:
- workload restore in install_workload()
- dotnet restore in install_android_dependencies()
- dotnet msbuild -t:InstallAndroidDependencies

Also add a note in pre.py that the csproj TFM rewrite is now largely
redundant since all commands override TargetFrameworks via properties.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The dotnet msbuild -t:InstallAndroidDependencies command fails with
MSB4057 ('target does not exist') because TargetFrameworks (plural)
triggers an outer/dispatch build that never imports the Android SDK
targets. The InstallAndroidDependencies target is only defined in the
inner build where TargetFramework (singular) is set, which causes the
Android SDK to be imported.

Change TargetFrameworks to TargetFramework for the msbuild invocation
only. The restore command keeps TargetFrameworks (plural) since restore
handles the outer build correctly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
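
To make the plural/singular distinction in the commit above concrete, here is a simplified sketch of how the two command lines differ. The helper names are hypothetical and the real invocations may carry additional properties; only the `TargetFrameworks` versus `TargetFramework` choice is taken from the commit message.

```python
from typing import List


def restore_cmd(csproj: str, framework: str) -> List[str]:
    """dotnet restore keeps the plural property (outer build is fine)."""
    return ["dotnet", "restore", csproj, f"-p:TargetFrameworks={framework}"]


def install_android_deps_cmd(csproj: str, framework: str) -> List[str]:
    """The msbuild target needs the singular property so the inner build
    runs and the Android SDK targets are imported."""
    return [
        "dotnet", "msbuild", csproj,
        "-t:InstallAndroidDependencies",
        f"-p:TargetFramework={framework}",
    ]


# Example with an assumed project name:
print(" ".join(install_android_deps_cmd("App.csproj", "net11.0-android")))
```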
…tore now succeeds

The SkipResolvePackageAssets=true property was originally added as a
safety net to avoid NETSDK1004 when restore might fail and leave an
incomplete project.assets.json.

Now that restore succeeds (thanks to AllowMissingPrunePackageData +
TargetFrameworks override), SkipResolvePackageAssets blocks MSBuild
from loading the targeting pack resolved during restore, causing
NETSDK1127 ('targeting pack Microsoft.NETCore.App is not installed').

Removing it lets the msbuild invocation use the fully resolved assets
from the successful restore.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The dotnet build command in runner.py doesn't receive the
TargetFrameworks override, causing its implicit restore to evaluate
all TFMs in the csproj (including net11.0-ios) and fail with
NETSDK1147. Adding TargetFrameworks to PERFLAB_MSBUILD_ARGS ensures
the build only evaluates the android TFM.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move AllowMissingPrunePackageData and UseSharedCompilation into
the csproj via pre.py property injection instead of scattering
them across .proj, run.py, and runner.py. Remove the TFM rewrite
logic from pre.py — TargetFrameworks override handles this.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Loop measure_startup() N times per deploy to reduce noise in
startup timing. Default 10 iterations, configurable via
StartupIterations property in the .proj file. Refactor
measure_startup() to encapsulate its command-building logic.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace 10x startup-only loop with N full incremental iterations
(edit → build+deploy → startup). Each iteration captures a binlog
and measures startup once. Build metrics and startup times are
collected into results arrays for statistical analysis. Default
10 iterations, configurable via InnerLoopIterations in .proj.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
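
The loop shape described in the commit above (edit, then build and deploy, then one startup measurement per iteration) might look like the sketch below. The function name, the callable parameters, and the result keys are assumptions for illustration; the actual runner.py code is not shown in this conversation.

```python
from typing import Callable, Dict, List


def run_incremental_loop(
    iterations: int,
    edit: Callable[[int], None],
    build_and_deploy: Callable[[int], float],
    measure_startup: Callable[[], float],
) -> Dict[str, List[float]]:
    """Run N full incremental iterations and collect per-iteration timings."""
    results: Dict[str, List[float]] = {"build_ms": [], "startup_ms": []}
    for i in range(1, iterations + 1):
        edit(i)                                            # change the source
        results["build_ms"].append(build_and_deploy(i))    # one binlog per iteration
        results["startup_ms"].append(measure_startup())    # startup measured once
    return results
```

Keeping the three steps as injected callables is a choice made here so the loop can be exercised without a device; the arrays it returns are what a later aggregation step would consume.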
Move the ~280-line ANDROIDINNERLOOP elif branch from runner.py into a
dedicated AndroidInnerLoopHelper class in shared/androidinnerloop.py.

The new class follows the same pattern as SODWrapper and
DevicePowerConsumptionHelper: stateless instantiation with a run()
method that accepts all needed parameters explicitly. The two nested
functions (measure_startup, merge_build_and_startup) become private
methods on the class.

The runner.py elif branch is replaced with a thin 14-line dispatcher
that creates the helper and forwards self.* attributes as explicit
keyword arguments.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move 'import upload' and 'from performance.constants import UPLOAD_CONTAINER,
UPLOAD_STORAGE_URI, UPLOAD_QUEUE' from top-level into the run() method, right
before they are used. This prevents a crash during pre.py when azure.storage.blob
is not installed — the import chain pre.py → test.py → runner.py →
androidinnerloop.py → upload → azure.storage.blob would fail at module load time.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
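
The deferred-import technique from the commit above can be illustrated with a small stand-alone sketch. The function below is hypothetical: it uses `importlib.import_module` with a module-name parameter purely so the behaviour can be demonstrated without the repo's `upload` module, whereas the commit describes a plain `import upload` statement moved inside `run()`.

```python
import importlib
from types import ModuleType
from typing import Optional


def run(upload_results: bool, upload_module: str = "upload") -> Optional[ModuleType]:
    """Do the measurement work; import the upload module only when needed."""
    # ... measurement work would happen here ...
    if not upload_results:
        # Nothing was imported, so a missing optional dependency
        # cannot break callers that never upload.
        return None
    # Deferred import: fails here, at the point of use, not at module load.
    return importlib.import_module(upload_module)
```

With this shape, merely importing the module that defines `run` no longer pulls in the upload dependency chain.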
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Revert the extraction into androidinnerloophelper.py and keep the logic
inline in runner.py, but improve readability by:

- Extracting _measure_startup(), _merge_build_and_startup(), and
  _run_incremental_iteration() as module-level helper functions
- Adding section comments (Validate inputs, First build + deploy,
  Resolve activity name, Incremental loop, Aggregate, Cleanup)
- Reducing the elif branch from ~295 lines to ~176 lines

No behavioral changes — pure readability refactoring.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move measure_startup and merge_build_and_startup back inside the
ANDROIDINNERLOOP elif block as nested functions, where they were
originally. They are only used within that block and by
_run_incremental_iteration (which now receives measure_startup
as a parameter).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instead of toggling between original/modified file contents on odd/even
iterations (which allows MSBuild to cache previously-compiled states),
each incremental iteration now appends one '!' character to the
'Hello, World!' string. This guarantees every iteration produces
genuinely new, never-before-compiled content.

Changes:
- runner.py: Replace toggle logic with re.sub append in
  run_incremental_iteration; remove editsrc/original_content/
  modified_content parameters; remove --edit-src argparser entry
- pre.py: Remove modified MainPage.xaml.cs creation (no longer needed)
- run.py: Remove --edit-src argument from test command

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
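
A minimal sketch of the append-one-character edit described in the commit above, assuming the source contains the literal `Hello, World!`. The function name and the choice to raise when the pattern is missing are assumptions, not the code in runner.py.

```python
import re


def append_bang(source: str) -> str:
    """Append one '!' to the first "Hello, World!..." occurrence.

    `!+` matches however many '!' earlier iterations added, so each call
    yields content that has never been compiled before.
    """
    new_source, count = re.subn(r"(Hello, World!+)", r"\1!", source, count=1)
    if count == 0:
        raise ValueError("expected 'Hello, World!' pattern not found")
    return new_source


# Each iteration feeds the previous output back in:
text = 'label.Text = "Hello, World!";'
for _ in range(3):
    text = append_bang(text)
print(text)
```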
- Use dotnet new maui -sc (sample content) for a more realistic app
- Toggle both MainPage.xaml.cs and MainPage.xaml each iteration,
  exercising both Csc and XamlC compiler paths
- Replace silent fallback modifications with explicit, verifiable edits:
  .xaml.cs adds a Debug.WriteLine line, .xaml changes a label text
- Fail fast if expected source patterns are not found
- Support multiple edit-src/edit-dest file pairs (semicolon-separated)
- Add content hash and length logging for toggle verification

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
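
One way the semicolon-separated edit-src/edit-dest pairs mentioned in the commit above could be parsed is sketched below. The function name and the strictness about mismatched counts are assumptions; the commit message only states that multiple pairs are supported and that missing patterns fail fast.

```python
from typing import List, Tuple


def parse_edit_pairs(edit_src: str, edit_dest: str) -> List[Tuple[str, str]]:
    """Split two semicolon-separated lists into (src, dest) pairs."""
    srcs = [s.strip() for s in edit_src.split(";") if s.strip()]
    dests = [d.strip() for d in edit_dest.split(";") if d.strip()]
    if len(srcs) != len(dests):
        # Fail fast rather than silently dropping an unmatched file.
        raise ValueError(
            f"edit-src has {len(srcs)} entries but edit-dest has {len(dests)}"
        )
    return list(zip(srcs, dests))
```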
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The csproj injection in pre.py was skipped because the MAUI template
already sets SupportedOSPlatformVersion conditionally. MSBuild /p:
properties take highest precedence and override project-level values.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove the run_test() function that spawned test.py as a subprocess
with hardcoded arguments. The .proj file will be updated in a
subsequent commit to chain setup_helix.py && test.py explicitly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move the hardcoded test.py arguments from the deleted run_test()
function into the .proj Command line. Both Windows and Linux
HelixWorkItem entries now explicitly show all measurement args.
Delete test-prototype.py exploration artifact.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@davidnguyen-tech davidnguyen-tech deleted the nguyendav/maui-android-innerloop branch March 31, 2026 09:07