[libcu++] Make argument namespace and wrappers construction public#9251
[libcu++] Make argument namespace and wrappers construction public#9251pciolkosz wants to merge 4 commits into
Conversation
davebayer
left a comment
There was a problem hiding this comment.
I'm sorry, but I'll block this PR. We said there would be a discussion about the naming. I'm still strongly against having a separate namespace for these types.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (1)
SummaryThis pull request makes the user-facing CUDA argument wrapper API public by removing leading underscores from names and introducing the cuda::argument header. Wrapper and bounds types, their deduction guides, factory helpers, and the public header were renamed/moved from the internal cuda::argument namespace to cuda::argument. Internal inspection helpers and trait machinery remain underscored for now (exposed under cuda::argument::* where needed). Key Changes
Codebase UpdatesAll usages across CUB, benchmarks, kernels, and tests were updated to use the new public names and header. Notable updated files include:
Search confirms many test and example files now include <cuda/argument> and refer to cuda::argument::{constant, immediate, deferred, static_bounds, runtime_bounds, bounds}, while trait helpers remain used via cuda::argument::__*. API/ABI Impact
Review Notes
suggestion: Walkthrough This PR replaces internal cuda::__argument wrapper/bounds/traits names with public cuda::argument equivalents, adds a public forwarding header Changes CUDA argument API public-API migration
Assessment against linked issues
Possibly related PRs
Suggested labels
Suggested reviewers
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 Infer (1.2.0)libcudacxx/test/libcudacxx/cuda/argument/deferred_sequence_scalar.fail.cpplibcudacxx/test/libcudacxx/cuda/argument/deferred_sequence_scalar.fail.cpp:11:10: fatal error: 'cuda/argument' file not found ... [truncated 1187 characters] ... m" "/usr/local/include" "-internal-isystem" libcudacxx/test/libcudacxx/cuda/argument/deferred_sequence_scalar_traits.fail.cpplibcudacxx/test/libcudacxx/cuda/argument/deferred_sequence_scalar_traits.fail.cpp:11:10: fatal error: 'cuda/argument' file not found ... [truncated 1208 characters] ... r/local/include" "-internal-isystem" Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4424dd10-8027-4d66-97f4-e3d43a7b6cf4
📒 Files selected for processing (20)
cub/benchmarks/bench/segmented_topk/fixed/keys.cucub/benchmarks/bench/segmented_topk/variable/keys.cucub/cub/agent/agent_batched_topk.cuhcub/cub/detail/segmented_params.cuhcub/cub/device/dispatch/dispatch_batched_topk.cuhcub/cub/device/dispatch/kernels/kernel_batched_topk.cuhcub/test/catch2_test_device_segmented_topk_keys.cucub/test/catch2_test_device_segmented_topk_pairs.culibcudacxx/include/cuda/__argument/argument.hlibcudacxx/include/cuda/__argument/argument_bounds.hlibcudacxx/include/cuda/argumentlibcudacxx/include/cuda/std/__internal/namespaces.hlibcudacxx/test/libcudacxx/cuda/argument/argument_bounds.pass.cpplibcudacxx/test/libcudacxx/cuda/argument/argument_traits.pass.cpplibcudacxx/test/libcudacxx/cuda/argument/deferred_argument.pass.cpplibcudacxx/test/libcudacxx/cuda/argument/dynamic_argument.pass.cpplibcudacxx/test/libcudacxx/cuda/argument/static_argument.pass.cpplibcudacxx/test/libcudacxx/cuda/argument/static_bounds_conversion.fail.cpplibcudacxx/test/libcudacxx/cuda/argument/static_bounds_type_mismatch.fail.cpplibcudacxx/test/libcudacxx/cuda/argument/usage_example.pass.cpp
Tagging @gevtushenko |
6002bc6 to
3210e3d
Compare
This comment has been minimized.
This comment has been minimized.
| auto segment_sizes = ::cuda::__argument::__constant<MaxSegmentSize>{}; | ||
| auto k = ::cuda::__argument::__constant<MaxNumSelected>{}; | ||
| auto select_direction = ::cuda::__argument::__constant<cub::detail::topk::select::max>{}; | ||
| auto segment_sizes = ::cuda::argument::constant<MaxSegmentSize>{}; |
There was a problem hiding this comment.
C++26 has constant_wrapper. It's hard to back-port fully to earlier C++ versions (though there are proposals pending review that might fix this), but for now, it's how C++ spells "compile-time constant." Does this cuda::argument::constant do the same thing? If so, then should we consider just making this an alias to constant_wrapper (or a hypothetical cuda::std::constant_wrapper)?
|
|
||
| using segment_size_val_t = typename ::cuda::__argument::__traits<SegmentSizeParameterT>::element_type; | ||
| using num_segments_val_t = typename ::cuda::__argument::__traits<NumSegmentsParameterT>::element_type; | ||
| using segment_size_val_t = typename ::cuda::argument::__traits<SegmentSizeParameterT>::element_type; |
There was a problem hiding this comment.
Is this library usable without exposing __traits to the public API? If not, then should we consider waiting until __traits is ready?
There was a problem hiding this comment.
The main user facing part is the construction of the wrappers, while our APIs will use the traits to examine them. Long term we want to expose the __traits I think, but we don't have to right now so keeping them private leaves us room to change them easily and make public later
There was a problem hiding this comment.
Thanks for explaining! That makes sense.
| //! @brief Wraps a compile-time constant argument value. | ||
| template <auto _Value> | ||
| struct __constant | ||
| struct constant |
There was a problem hiding this comment.
This looks more or less like constant_wrapper. Should we consider spelling it cuda::std::constant_wrapper?
There was a problem hiding this comment.
We considered that design, cuda::std::constant_wrapper is difficult to back port to older c++ standards and the timeline probably doesn't leave enough room to implement it. I think its fine to just interoperate with constant wrapper later on
| static_assert(__valid_static_bounds_v<__element_type, _StaticBounds>, | ||
| "static argument bounds cannot be represented by the element type"); | ||
|
|
||
| _Arg __arg_; |
There was a problem hiding this comment.
This is a public member. That means it's possible to set the value without validating the bounds. Should we instead consider making the value private, so that every constructed immediate instance has been validated?
There was a problem hiding this comment.
I could make it a private member and add an underscored __access structure that is a friend or I can just underscore the variable and have less code this way. I don't mind either way, but we need a way to access it at the end of the day.
But I think this question does apply to __unwrap, which currently has two version, const and not-const reference. I could see the non-const reference version being not necessary, but it's also not public yet, so we don't have to resolve it right now. I might as well remove it and we can add it when needed
There was a problem hiding this comment.
@pciolkosz Thanks for answering! I don't quite understand the argument why validation is separate from construction.
There was a problem hiding this comment.
Also: Why can't users access the value (by value)? Isn't this just a validated value? Why wouldn't this class just have a value() member?
| _Arg __arg_; | ||
| __runtime_bounds<__element_type> __runtime_bounds_{}; | ||
| runtime_bounds<__element_type> __runtime_bounds_{}; |
There was a problem hiding this comment.
These are public members. That means it's possible to set them without validation. Should we instead consider making the members private, so that every constructed immediate_sequence instance has been validated?
|
|
||
| template <class _ElementType, auto _Lowest, auto _Highest> | ||
| inline constexpr bool __valid_static_bounds_v<_ElementType, __static_bounds<_Lowest, _Highest>> = | ||
| inline constexpr bool __valid_static_bounds_v<_ElementType, static_bounds<_Lowest, _Highest>> = |
There was a problem hiding this comment.
There are runtime_bounds and static_bounds. Do we also need a representation of half-run-time, half-static bounds?
submdspan represents bounds generally as any type for which structured binding into two elements, both convertible to index_type, is well-formed. This would let users pass in bounds however they like. It also would permit a unified representation of any combination of run-time and static bounds.
There was a problem hiding this comment.
Also, the template parameters' types are unconstrained here. What are the actual requirements on these types?
There was a problem hiding this comment.
You can tighten the static bounds with runtime bounds, so in this case you can do static_bounds<0, 1000> and runtime_bounds(100, 1000) which effectively would be a half-run-time, half-static bounds. It avoids overcomplicating the input types at the expense of this case being a bit more verbose, but I think it should be rare anyway.
Also, the template parameters' types are unconstrained here. What are the actual requirements on these types?
Good point, I added type validation
| @@ -410,7 +410,7 @@ | |||
| "static argument bounds cannot be represented by the element type"); | |||
|
|
|||
| _Arg __arg_; | |||
There was a problem hiding this comment.
Please see my above comments on public members and bounds checking; thanks!
🥳 CI Workflow Results🟩 Finished in 4h 05m: Pass: 100%/340 | Total: 5d 20h | Max: 2h 07m | Hits: 94%/499980See results here. |
Closes #9254
This PR removes underscores from the
cuda::argumentnamespace and the user facing argument wrappers and bound objects construction.Traits and other utility functions remain underscored, those are meant mostly to be used inside CCCL APIs to examine the argument wrappers passed by the user, but they will most likely be public later on, once we are more confident with the design.