Skip to content

NXP backend: Improve test result gathering#20024

Open
roman-janik-nxp wants to merge 2 commits into
pytorch:mainfrom
nxp-upstream:feature/nxg11066/EIEX-925-Improve-test-result-gathering
Open

NXP backend: Improve test result gathering#20024
roman-janik-nxp wants to merge 2 commits into
pytorch:mainfrom
nxp-upstream:feature/nxg11066/EIEX-925-Improve-test-result-gathering

Conversation

@roman-janik-nxp

@roman-janik-nxp roman-janik-nxp commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator

Summary

Feature for creating improved test results for reporting when test log-level is set to DEBUG. Adds IR models to .outputs/test_dir, result tensor differences, text variant of input and result tensors, summary with test info.

Test plan

Test provided. All tests that use NSYS.

cc @robert-kalmar

@roman-janik-nxp roman-janik-nxp added module: nxp Issues related to NXP Neutron NPU delegation and code under backends/nxp/ release notes: nxp Changes to the NXP Neutron backend delegate labels Jun 4, 2026
@pytorch-bot

pytorch-bot Bot commented Jun 4, 2026

Copy link
Copy Markdown

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20024

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 27 Pending

As of commit 0b19317 with merge base 06143cb (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 4, 2026
@linux-foundation-easycla

linux-foundation-easycla Bot commented Jun 4, 2026

Copy link
Copy Markdown

CLA Not Signed

@github-actions github-actions Bot added ciflow/trunk module: arm Issues related to arm backend labels Jun 4, 2026
@roman-janik-nxp

Copy link
Copy Markdown
Collaborator Author

@novak-vaclav

Comment thread backends/nxp/tests/model_output_comparator.py
Comment thread backends/nxp/tests/nsys_testing.py Outdated
"tag1_neutron.et.tflite"
), "Converted Neutron model not found in working directory, export in NeutronBackend failed."
shutil.copy("tag1_pure.et.tflite", test_dir)
shutil.copy("tag1_neutron.et.tflite", test_dir)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work when there are more delegated partitions? The file names suggest it only works with 1 such partition.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I counted with single partition, as multiple partitions means there is a problem. But we also need to take these cases into account. As Robert pointed out the copy-solution unfortunate, so I changed it to pass test_dir to Neutron backend via CompileSpec and store all partitions directly there.

Comment thread backends/nxp/tests/nsys_testing.py Outdated
Comment thread backends/nxp/tests/model_output_comparator.py Outdated
Comment thread backends/nxp/tests/generic_tests/test_debug_results.py
Comment thread backends/nxp/tests/generic_tests/test_debug_results.py
Comment thread backends/nxp/tests/utils.py Outdated
Comment thread backends/nxp/tests/nsys_testing.py Outdated
remove_quant_io_ops=remove_quant_io_ops,
)

# Copy converter Neutron model and Neutron IR model to test_dir

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a quite impractical workaround for the code in executorch/backends/nxp/nxp_backend.py::NeutronBackend::preprocess() which stores the debug files into current working directory.

Now you have:

  • logic in lower_run_compare, which knows where to properly store the test files - derived from test ID from request
  • logic in NeutronBackend::preprocess storing them into current working directory, what e.g. Arm handles with passing the intermediates folder in compile_spec, what is acceptable, see https://github.com/pytorch/executorch/blob/main/backends/arm/scripts/aot_arm_compiler.py#L511
  • logic in this function (_run_delegated_executorch_program) which stores the pytorch/executorch visualized graphs (.json files) into the test_dir what is kind of OK as it is called from lower_run_compare but unflexible:

Unflexible because: the folder name for intermediate artefacts is created in lower_run_compare based on request fixture. The lower_run_compare is quite low level function to know about request [ https://en.wikipedia.org/wiki/Principle_of_least_privilege ] . Why not to move the test_dir name creation logic into test itself and pass it into utility functions. Such a change also opens the possibility to store intermediate artefacts also from the tests not calling the lower_run_compare , like assert_not_delegated tests.

And as Martin pointed out, you must rely on specific tag name to avoid mistakenly copying garbage.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the copy-solution is unfortunate, so I changed it to pass test_dir to Neutron backend via CompileSpec and store all partitions directly there. Now the storing logic is in one place. I made the test_dir param inside our pipeline optional, so the default store dir is still cwd.
I also wouldn't move the test_name extraction outside of lower_run_compare as I consider it as the one compact API for NSYS test. Other type of tests typically don't generate such test results and the test_name can still be be extracted from request in those tests.

Comment thread backends/nxp/tests/nsys_testing.py
Comment thread backends/nxp/tests/nsys_testing.py Outdated
Comment thread backends/nxp/tests/nsys_testing.py
Comment thread backends/nxp/tests/generic_tests/test_quantized_input_data.py
Comment thread backends/nxp/tests/generic_tests/test_debug_results.py
Comment thread backends/nxp/tests/model_output_comparator.py
Comment thread backends/nxp/tests/model_output_comparator.py Outdated
Comment thread backends/nxp/tests/nsys_testing.py Outdated
Comment thread backends/nxp/tests/nsys_testing.py Outdated
Comment thread backends/nxp/tests/nsys_testing.py
@roman-janik-nxp roman-janik-nxp force-pushed the feature/nxg11066/EIEX-925-Improve-test-result-gathering branch from ceed080 to 9b79072 Compare June 9, 2026 11:30
@novak-vaclav

Copy link
Copy Markdown
Contributor

Provided additional information to some of my comments. Otherwise LGTM

@roman-janik-nxp roman-janik-nxp force-pushed the feature/nxg11066/EIEX-925-Improve-test-result-gathering branch 3 times, most recently from c4223bb to 1f7d775 Compare June 11, 2026 11:17
@roman-janik-nxp roman-janik-nxp marked this pull request as ready for review June 11, 2026 14:38
Comment thread backends/nxp/tests/generic_tests/test_debug_results.py
)
def test__basic_nsys_inference_qat(self, x_input_shape, mocker):
x_input_spec = ModelInputSpec(x_input_shape)
def test__basic_nsys_inference_qat(self, mocker, request):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return back the variants for the test.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed it because there is no need for them anymore. I added them and marked as xfail, because I wasn't sure if there is an error in QAT. Turns out it's not. The shapes are the same for PTQ test. The QAT behaves the same as PTQ in the context of conversion and precision. To be in line with other op tests, I intentionally kept only one shape. I consulted it with Martin and he agreed.

marks=pytest.mark.xfail(reason="AIR-14602: incorrect results"),
),
],
)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep these tests

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed it because there is no need for them anymore. I added them and marked as xfail, because I wasn't sure if there is an error in QAT. Turns out it's not. The shapes are the same for PTQ test. The QAT behaves the same as PTQ in the context of conversion and precision. To be in line with other op tests, I intentionally kept only one shape. I consulted it with Martin and he agreed.

Comment thread backends/nxp/neutron_partitioner.py
"""Generate compile spec for Neutron NPU

:param config: Neutron accelerator configuration, e.g. "imxrt700"
:param test_dir: Test directory to store test related files.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intermediates_dir ==> In the nxp_backend you are not storing test related data but intermediates from the nxp_backend conversion flow.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intermediate models are related to the test. They are stored to the test directory where other test related data are store as well (results, datasets, etc.) Don't see a point in different naming.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The intermediate models are related to the test."
==> Not related just with test. Consider it from the user perspective. He uses the eIQ Neutron backend and wants to keep the intermediate results. By using this item in the conversion config he can do it. This can be even enabled in the aot_example. He is not testing anything.

"They are stored to the test directory where other test related data are store as well (results, datasets, etc.)"
==> They are stored to the specified destination, what is context related. The test case set its to the test directory for obvious reasons. The aot_example can set it to some tmp folder, and user in its own conversion pipeline can set it differently. From the perspective of the nxp_backend, there is no relation to tests.
Test use this conversion config entry to collect all the artefacts in one test specific directory.

@@ -230,14 +241,16 @@ def preprocess( # noqa C901

# Dump the tflite file if logging level is enabled
if logging.root.isEnabledFor(logging.DEBUG):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As now we already introduce the test_dir or intermediates_dir, we can replace the logic to check if this *_dir is set or None, instead of limit the functionality to DEBUG logging level only.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't understand. The functionality is (and was before) limited to DEBUG logging level only. The logic to check is test_dir is set is because now the backend can be run without the use of lower_run_compare(), e.g. from aot_run_example.py

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was limited to DEBUG level only as there was no other means to enable/disable it. Now you introduced a specific entry in the conversion_config, so you can use that. And not limit the intermediates dump only to DEBUG log level.

@roman-janik-nxp roman-janik-nxp force-pushed the feature/nxg11066/EIEX-925-Improve-test-result-gathering branch from 1f7d775 to c8e76a9 Compare June 15, 2026 15:34
@roman-janik-nxp roman-janik-nxp force-pushed the feature/nxg11066/EIEX-925-Improve-test-result-gathering branch from c8e76a9 to 0b19317 Compare June 15, 2026 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. module: arm Issues related to arm backend module: nxp Issues related to NXP Neutron NPU delegation and code under backends/nxp/ release notes: nxp Changes to the NXP Neutron backend delegate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants