NXP backend: Improve test result gathering#20024
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20024
Note: Links to docs will display an error until the docs builds have been completed. ❌ 4 New Failures, 27 PendingAs of commit 0b19317 with merge base 06143cb ( NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
|
| "tag1_neutron.et.tflite" | ||
| ), "Converted Neutron model not found in working directory, export in NeutronBackend failed." | ||
| shutil.copy("tag1_pure.et.tflite", test_dir) | ||
| shutil.copy("tag1_neutron.et.tflite", test_dir) |
There was a problem hiding this comment.
How does this work when there are more delegated partitions? The file names suggest it only works with 1 such partition.
There was a problem hiding this comment.
I counted with single partition, as multiple partitions means there is a problem. But we also need to take these cases into account. As Robert pointed out the copy-solution unfortunate, so I changed it to pass test_dir to Neutron backend via CompileSpec and store all partitions directly there.
| remove_quant_io_ops=remove_quant_io_ops, | ||
| ) | ||
|
|
||
| # Copy converter Neutron model and Neutron IR model to test_dir |
There was a problem hiding this comment.
This is a quite impractical workaround for the code in executorch/backends/nxp/nxp_backend.py::NeutronBackend::preprocess() which stores the debug files into current working directory.
Now you have:
- logic in
lower_run_compare, which knows where to properly store the test files - derived from test ID fromrequest - logic in
NeutronBackend::preprocessstoring them into current working directory, what e.g. Arm handles with passing the intermediates folder incompile_spec, what is acceptable, see https://github.com/pytorch/executorch/blob/main/backends/arm/scripts/aot_arm_compiler.py#L511 - logic in this function (
_run_delegated_executorch_program) which stores the pytorch/executorch visualized graphs (.json files) into thetest_dirwhat is kind of OK as it is called fromlower_run_comparebut unflexible:
Unflexible because: the folder name for intermediate artefacts is created in lower_run_compare based on request fixture. The lower_run_compare is quite low level function to know about request [ https://en.wikipedia.org/wiki/Principle_of_least_privilege ] . Why not to move the test_dir name creation logic into test itself and pass it into utility functions. Such a change also opens the possibility to store intermediate artefacts also from the tests not calling the lower_run_compare , like assert_not_delegated tests.
And as Martin pointed out, you must rely on specific tag name to avoid mistakenly copying garbage.
There was a problem hiding this comment.
I agree the copy-solution is unfortunate, so I changed it to pass test_dir to Neutron backend via CompileSpec and store all partitions directly there. Now the storing logic is in one place. I made the test_dir param inside our pipeline optional, so the default store dir is still cwd.
I also wouldn't move the test_name extraction outside of lower_run_compare as I consider it as the one compact API for NSYS test. Other type of tests typically don't generate such test results and the test_name can still be be extracted from request in those tests.
ceed080 to
9b79072
Compare
|
Provided additional information to some of my comments. Otherwise LGTM |
c4223bb to
1f7d775
Compare
| ) | ||
| def test__basic_nsys_inference_qat(self, x_input_shape, mocker): | ||
| x_input_spec = ModelInputSpec(x_input_shape) | ||
| def test__basic_nsys_inference_qat(self, mocker, request): |
There was a problem hiding this comment.
Return back the variants for the test.
There was a problem hiding this comment.
I removed it because there is no need for them anymore. I added them and marked as xfail, because I wasn't sure if there is an error in QAT. Turns out it's not. The shapes are the same for PTQ test. The QAT behaves the same as PTQ in the context of conversion and precision. To be in line with other op tests, I intentionally kept only one shape. I consulted it with Martin and he agreed.
| marks=pytest.mark.xfail(reason="AIR-14602: incorrect results"), | ||
| ), | ||
| ], | ||
| ) |
There was a problem hiding this comment.
I removed it because there is no need for them anymore. I added them and marked as xfail, because I wasn't sure if there is an error in QAT. Turns out it's not. The shapes are the same for PTQ test. The QAT behaves the same as PTQ in the context of conversion and precision. To be in line with other op tests, I intentionally kept only one shape. I consulted it with Martin and he agreed.
| """Generate compile spec for Neutron NPU | ||
|
|
||
| :param config: Neutron accelerator configuration, e.g. "imxrt700" | ||
| :param test_dir: Test directory to store test related files. |
There was a problem hiding this comment.
intermediates_dir ==> In the nxp_backend you are not storing test related data but intermediates from the nxp_backend conversion flow.
There was a problem hiding this comment.
The intermediate models are related to the test. They are stored to the test directory where other test related data are store as well (results, datasets, etc.) Don't see a point in different naming.
There was a problem hiding this comment.
"The intermediate models are related to the test."
==> Not related just with test. Consider it from the user perspective. He uses the eIQ Neutron backend and wants to keep the intermediate results. By using this item in the conversion config he can do it. This can be even enabled in the aot_example. He is not testing anything.
"They are stored to the test directory where other test related data are store as well (results, datasets, etc.)"
==> They are stored to the specified destination, what is context related. The test case set its to the test directory for obvious reasons. The aot_example can set it to some tmp folder, and user in its own conversion pipeline can set it differently. From the perspective of the nxp_backend, there is no relation to tests.
Test use this conversion config entry to collect all the artefacts in one test specific directory.
| @@ -230,14 +241,16 @@ def preprocess( # noqa C901 | |||
|
|
|||
| # Dump the tflite file if logging level is enabled | |||
| if logging.root.isEnabledFor(logging.DEBUG): | |||
There was a problem hiding this comment.
As now we already introduce the test_dir or intermediates_dir, we can replace the logic to check if this *_dir is set or None, instead of limit the functionality to DEBUG logging level only.
There was a problem hiding this comment.
Don't understand. The functionality is (and was before) limited to DEBUG logging level only. The logic to check is test_dir is set is because now the backend can be run without the use of lower_run_compare(), e.g. from aot_run_example.py
There was a problem hiding this comment.
It was limited to DEBUG level only as there was no other means to enable/disable it. Now you introduced a specific entry in the conversion_config, so you can use that. And not limit the intermediates dump only to DEBUG log level.
1f7d775 to
c8e76a9
Compare
c8e76a9 to
0b19317
Compare
Summary
Feature for creating improved test results for reporting when test log-level is set to DEBUG. Adds IR models to .outputs/test_dir, result tensor differences, text variant of input and result tensors, summary with test info.
Test plan
Test provided. All tests that use NSYS.
cc @robert-kalmar