Use rocm 7.1.1 over 7.2.1 #78

Merged
PatrickRMiles merged 2 commits into LBANN:main from michaelmckinsey1:rocm-ver
Jun 18, 2026

@michaelmckinsey1

Collaborator

ROCm 7.2.1 has a regression that prevents us from strong scaling: runs fail with the error `No suitable algorithm was found to execute the required convolution`. We think this will be fixed in 7.14, but we need to test once that release is available.
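One way a job script could guard against accidentally running on the affected release is sketched below. This is only an illustration, not part of this PR: the helper name `rocm_version_is_affected` and the `ROCM_VERSION` environment variable are hypothetical, and how the installed ROCm version is actually discovered depends on the system.

```shell
#!/bin/bash
# Hypothetical helper (not from this PR): succeeds if the given version
# string is the ROCm release with the known convolution regression.
rocm_version_is_affected() {
  # Strip any build suffix such as "-42" so "7.2.1-42" compares as "7.2.1".
  local version="${1%%-*}"
  [ "$version" = "7.2.1" ]
}

# ROCM_VERSION is an assumed variable; set it however your site reports
# the loaded ROCm version.
if rocm_version_is_affected "${ROCM_VERSION:-unknown}"; then
  echo "warning: ROCm 7.2.1 has a known convolution regression; use 7.1.1" >&2
fi
```

The check only warns rather than aborting, so it can sit at the top of a job script without changing behavior on unaffected versions.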

@michaelmckinsey1 michaelmckinsey1 self-assigned this Jun 11, 2026
@michaelmckinsey1 michaelmckinsey1 changed the title from "Use 7.1.1 over 7.2.1" to "Use rocm 7.1.1 over 7.2.1" Jun 11, 2026
Comment thread pyproject.toml
Comment thread scripts/scaffold-tuolumne-torchpypi.job Outdated
Comment on lines +18 to +23
# # Disable direct naive convolution benchmarking (naive_conv_ab_nonpacked_fwd_ndhwc_half_double_half.kd)
# export MIOPEN_DEBUG_CONV_DIRECT_NAIVE_CONV_FWD=0
# # Disable naive_conv_ab_nonpacked_bwd_ndhwc_half_double_half.kd
# export MIOPEN_DEBUG_CONV_DIRECT_NAIVE_CONV_BWD=0
# # Disable naive_conv_ab_nonpacked_wrw_ndhwc_half_double_half.kd
# export MIOPEN_DEBUG_CONV_DIRECT_NAIVE_CONV_WRW=0

Collaborator

Remind me why we no longer want to disable these naive kernels for 7.1.1?

@michaelmckinsey1 michaelmckinsey1 Jun 18, 2026

Collaborator Author

`export MIOPEN_DEBUG_CONV_DIRECT=0` covers all three of these options.
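As a rough sketch of what the reply describes, the three per-kernel switches in the commented-out lines can be replaced by the single algorithm-level switch. The surrounding job script is not shown in this thread, so the placement and the echo line are assumptions for illustration only.

```shell
#!/bin/bash
# Disable MIOpen's direct convolution algorithm as a whole. Per the reply
# above, this covers the three naive-kernel switches
# (MIOPEN_DEBUG_CONV_DIRECT_NAIVE_CONV_FWD, _BWD and _WRW), so they no
# longer need to be set individually.
export MIOPEN_DEBUG_CONV_DIRECT=0

# Illustrative only: record the setting in the job log for later debugging.
echo "MIOPEN_DEBUG_CONV_DIRECT=${MIOPEN_DEBUG_CONV_DIRECT}"
```

Using the broader switch keeps the script shorter, at the cost of also disabling any non-naive direct solvers that the individual variables would have left enabled.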

@PatrickRMiles PatrickRMiles merged commit fff0d49 into LBANN:main Jun 18, 2026
1 check passed