Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10918
Note: Links to docs will display an error until the docs builds have been completed. ❌ 2 New Failures, 1 Pending, 1 Unrelated FailureAs of commit 6a12fa8 with merge base d0464f8 ( NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following job failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
26e9d75 to
fb6ae9e
Compare
8899a8e to
4b3b4ab
Compare
ac72cc1 to
4ccdcc6
Compare
| bash backends/qualcomm/scripts/build.sh --skip_aarch64 --job_number 2 --release || \ | ||
| bash backends/qualcomm/scripts/build.sh --skip_aarch64 --job_number 2 --release --no_clean |
There was a problem hiding this comment.
Is this change related?
There was a problem hiding this comment.
Yes, see the comment right above — this was the way the race was mitigated before in #7570
| if(CMAKE_BUILD_TYPE MATCHES "Debug") | ||
| set(FLATCC_LIB flatccrt_d) | ||
| else() | ||
| set(FLATCC_LIB flatccrt) | ||
| endif() | ||
|
|
There was a problem hiding this comment.
Have you confirmed that debug build will also build flatccrt?
There was a problem hiding this comment.
Yes, it seems to have both libs
### Summary Seems like there is a race in building `libflatccrt.a`. This issue has existed for a while: pytorch#7300. It was temporarily mitigated in pytorch#7570 by just reducing the parallelism. In this diff I attempt to fix it. This is just my assumption of what is wrong. Given flatccrt builds a debug version with a `_d` suffix, if the target isn't depended on (i.e. some target don't use the conditional target name) then the order of how the lib is built causes a race. So for now, always use the non-debug version. Given it's a race, I was never able to repro the issue locally — I can't guarantee this is the problem. However, it seems my recent changes in pytorch#10855 has increased the frequency of the problem in CI. ### Test plan CI cc @larryliu0820
|
@jathu, looks like this causing issue to build qnn backend in debug mode. Looks like the script to build qnn sdk only covers release mode though |
Summary
Seems like there is a race in building
libflatccrt.a. This issue has existed for a while: #7300. It was temporarily mitigated in #7570 by just reducing the parallelism.In this diff I attempt to fix it. This is just my assumption of what is wrong. Given flatccrt builds a debug version with a
_dsuffix, if the target isn't depended on (i.e. some target don't use the conditional target name) then the order of how the lib is built causes a race. So for now, always use the non-debug version.Given it's a race, I was never able to repro the issue locally — I can't guarantee this is the problem. However, it seems my recent changes in #10855 has increased the frequency of the problem in CI.
Test plan
CI
cc @larryliu0820