Add generate_identity_sequences helper for common pattern #3588
Closed
tenpercent wants to merge 1 commit into tenpercent/old-ck-pack-rewrites
This adds an optimized helper for the common generate_tuple pattern:
generate_tuple([](auto i) { return Sequence<i.value>{}; }, N)
The new generate_identity_sequences<N>() function creates
Tuple<Sequence<0>, Sequence<1>, ..., Sequence<N-1>> without
requiring lambda instantiation at each call site.
Updated 21 call sites across threadwise_tensor_slice_transfer,
wrapper utilities, and layout files to use the new helper.
Build time improvement: ~1.1% wall-clock (18.3s -> 18.1s)
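A minimal sketch of how such a helper could be written, for readers unfamiliar with the pattern. The `Sequence` and `Tuple` types below are simplified stand-ins, and `detail::make_identity_sequences` is a hypothetical name; none of this is taken from the actual CK headers, which may implement the helper differently.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// Simplified stand-ins for CK's Sequence and Tuple (assumptions for this sketch).
template <int... Is>
struct Sequence
{
};

template <typename... Ts>
using Tuple = std::tuple<Ts...>;

namespace detail {
// Hypothetical helper: expands the index pack 0..N-1 into
// Tuple<Sequence<0>, ..., Sequence<N-1>>.
template <int... Is>
constexpr auto make_identity_sequences(std::integer_sequence<int, Is...>)
{
    return Tuple<Sequence<Is>...>{};
}
} // namespace detail

// No lambda is involved, so every call with the same N refers to the same
// function template specialization.
template <int N>
constexpr auto generate_identity_sequences()
{
    return detail::make_identity_sequences(std::make_integer_sequence<int, N>{});
}

// Compile-time checks of the resulting types.
static_assert(std::is_same_v<decltype(generate_identity_sequences<3>()),
                             Tuple<Sequence<0>, Sequence<1>, Sequence<2>>>);
static_assert(std::is_same_v<decltype(generate_identity_sequences<0>()), Tuple<>>);
```

Because the result depends only on `N`, two call sites that ask for the same `N` share one instantiation, whereas the lambda form introduces a new closure type at each site.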
shumway pushed a commit to ROCm/rocm-libraries that referenced this pull request on Feb 28, 2026
… functors (#4828)

## Summary
- Add `generate_identity_sequences<N>()` helper that returns `Tuple<Sequence<0>, Sequence<1>, ..., Sequence<N-1>>`
- Replace lambdas with named functors in `transform_tensor_descriptor`
- Add `unpack_and_merge_sequences` helper functor
- Reduces `transform_tensor_descriptor` instantiations from 388 to 32 (92% reduction)

## Motivation
Multiple call sites use `generate_tuple([](auto i) { return Sequence<i>{}; }, Number<N>{})` pattern. A named helper reduces lambda instantiations.

Additionally, each lambda in `transform_tensor_descriptor` creates a unique closure type, causing the function to be instantiated separately for every call site. Named functors share a single type, so the compiler reuses the same instantiation.

## Changes
### Part 1: generate_identity_sequences helper
- Replaces common lambda pattern for generating identity sequences
- Each lambda expression creates a unique closure type, causing separate template instantiations at every call site
- Named helper shares a single type across all uses

### Part 2: Named functors in transform_tensor_descriptor
- Add `unpack_and_merge_sequences` helper to replace lambda in `GetNumOfHiddenDimension`
- Use `generate_identity_sequences` in `matrix_padder.hpp`

## Test Plan
- [x] Added 7 unit tests:
  - 4 tests for `generate_identity_sequences`
  - 3 tests for `unpack_and_merge_sequences`
- [ ] Waiting for full CI

## Related PRs
This PR merges the functionality from:
- ROCm/composable_kernel#3588 (generate_identity_sequences helper)
- ROCm/composable_kernel#3589 (Named functors in transform_tensor_descriptor)

Part of PR stack for issue #4229 (Reduce CK/CKTile Build Times)

**Note:** This PR supersedes #4283, ROCm/composable_kernel#3588 and ROCm/composable_kernel#3589, which can be closed once this is merged.

---
🔁 Imported from [ROCm/composable_kernel#3628](ROCm/composable_kernel#3628)
🧑💻 Originally authored by @tenpercent
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
assistant-librarian bot pushed a commit that referenced this pull request on Feb 28, 2026
Add generate_identity_sequences helper and replace lambdas
with named functors (#4828)
## Summary
- Add `generate_identity_sequences<N>()` helper that returns
`Tuple<Sequence<0>, Sequence<1>, ..., Sequence<N-1>>`
- Replace lambdas with named functors in `transform_tensor_descriptor`
- Add `unpack_and_merge_sequences` helper functor
- Reduces `transform_tensor_descriptor` instantiations from 388 to 32
(92% reduction)
## Motivation
Multiple call sites use `generate_tuple([](auto i) { return
Sequence<i>{}; }, Number<N>{})` pattern. A named helper reduces lambda
instantiations.
Additionally, each lambda in `transform_tensor_descriptor` creates a
unique closure type, causing the function to be instantiated separately
for every call site. Named functors share a single type, so the compiler
reuses the same instantiation.
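The closure-type argument can be checked with a small standalone sketch. Here `apply_to_one` is a hypothetical stand-in for a function template that takes a callable (it is not `transform_tensor_descriptor` itself), and `add_one` is an illustrative named functor.

```cpp
#include <type_traits>

// Hypothetical stand-in for a function template parameterized on a callable:
// the compiler emits one instantiation per distinct type F.
template <typename F>
constexpr int apply_to_one(F f)
{
    return f(1);
}

// Two textually identical lambdas still have two distinct closure types,
// so apply_to_one would be instantiated once for each of them.
constexpr auto lambda_a = [](int x) { return x + 1; };
constexpr auto lambda_b = [](int x) { return x + 1; };
static_assert(!std::is_same_v<decltype(lambda_a), decltype(lambda_b)>);

// A named functor has exactly one type, no matter how many call sites use it.
struct add_one
{
    constexpr int operator()(int x) const { return x + 1; }
};

constexpr auto site_1 = add_one{};
constexpr auto site_2 = add_one{};
static_assert(std::is_same_v<decltype(site_1), decltype(site_2)>);

// All three callables behave identically; only the number of distinct types,
// and therefore the number of instantiations, differs.
static_assert(apply_to_one(lambda_a) == 2);
static_assert(apply_to_one(lambda_b) == 2);
static_assert(apply_to_one(site_1) == 2);
```

In this sketch the two lambdas force two instantiations of `apply_to_one`, while both `add_one` call sites reuse `apply_to_one<add_one>`.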
## Changes
### Part 1: generate_identity_sequences helper
- Replaces common lambda pattern for generating identity sequences
- Each lambda expression creates a unique closure type, causing separate
template instantiations at every call site
- Named helper shares a single type across all uses
### Part 2: Named functors in transform_tensor_descriptor
- Add `unpack_and_merge_sequences` helper to replace lambda in
`GetNumOfHiddenDimension`
- Use `generate_identity_sequences` in `matrix_padder.hpp`
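As an illustration of the Part 2 idea, the following sketch shows what a functor like `unpack_and_merge_sequences` might look like. The `Sequence`, `Tuple`, `merge_sequences`, and `unpack` definitions here are simplified assumptions written for this example, not the CK implementations, and the real functor's signature may differ.

```cpp
#include <type_traits>

// Simplified stand-ins for CK types (assumptions for this sketch).
template <int... Is>
struct Sequence
{
};

template <typename... Ts>
struct Tuple
{
};

// Simplified merge_sequences: concatenates any number of Sequences.
constexpr Sequence<> merge_sequences() { return {}; }

template <int... Xs>
constexpr Sequence<Xs...> merge_sequences(Sequence<Xs...>)
{
    return {};
}

template <int... Xs, int... Ys, typename... Rest>
constexpr auto merge_sequences(Sequence<Xs...>, Sequence<Ys...>, Rest... rest)
{
    // Fold the first two sequences together, then recurse on the remainder.
    return merge_sequences(Sequence<Xs..., Ys...>{}, rest...);
}

// Named functor replacing a lambda of the form
// [](auto... xs) { return merge_sequences(xs...); }.
struct unpack_and_merge_sequences
{
    template <typename... Seqs>
    constexpr auto operator()(Seqs... seqs) const
    {
        return merge_sequences(seqs...);
    }
};

// Simplified unpack: calls f with one value per tuple element type.
template <typename F, typename... Ts>
constexpr auto unpack(F f, Tuple<Ts...>)
{
    return f(Ts{}...);
}

using Ids = Tuple<Sequence<0>, Sequence<1, 2>, Sequence<3>>;
static_assert(std::is_same_v<decltype(unpack(unpack_and_merge_sequences{}, Ids{})),
                             Sequence<0, 1, 2, 3>>);
```

Since `unpack_and_merge_sequences` is a single named type, every caller that passes it to `unpack` with the same tuple type reuses one instantiation.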
## Test Plan
- [x] Added 7 unit tests:
- 4 tests for `generate_identity_sequences`
- 3 tests for `unpack_and_merge_sequences`
- [ ] Waiting for full CI
## Related PRs
This PR merges the functionality from:
- #3588 (generate_identity_sequences helper)
- #3589 (Named functors in
transform_tensor_descriptor)
Part of PR stack for issue #4229 (Reduce CK/CKTile Build Times)
**Note:** This PR supersedes #4283, #3588 and
#3589, which can be closed once this is merged.
## Summary
- `generate_identity_sequences<N>()` helper that returns `Tuple<Sequence<0>, Sequence<1>, ..., Sequence<N-1>>`

## Motivation
Multiple call sites use `generate_tuple([](auto i) { return Sequence<i>{}; }, Number<N>{})` pattern. A named helper reduces lambda instantiations.

## Why It Works
Each lambda expression creates a unique closure type. When passed to template functions like `generate_tuple`, this causes separate template instantiations at every call site. A named helper shares a single type across all uses.

## Test Plan

## PR Stack
__make_integer_seq
Tracking issue: #3575