intrinsic-test: sve support #2160

Draft
davidtwco wants to merge 32 commits into rust-lang:main from davidtwco:intrinsic-test-sve

davidtwco commented Jun 15, 2026

Member

This needs rust-lang/rust#157915 to be in nightly for CI to pass - it is already approved and in the bors queue.

This patch contains all the changes necessary for intrinsic-test to generate tests for the SVE intrinsics:

  • Some of the commits are just general refactors to the tool, motivated by the commits that follow and that I felt were hard to justify on their own
  • Some of the commits are small changes I made along the way (e.g. forwarding args from intrinsic-test.sh to cargo test)
    • I could split these out if we wanted
  • Some of the commits modify the definitions of intrinsics so that they pass tests - this only happened for a handful of intrinsics and the changes required to their definitions were very minor
    • I could split these out if we wanted
  • Some of the commits update arm_intrinsics.json to add some additional constraints necessary to generate the right tests (we're making sure that these additional constraints are reflected in the source of truth we use to generate that file internally)
    • I could split these out if we wanted
  • Any remaining commits are changes to make the tool generate the correct tests, primarily:
    • Adding the arm_sve.h headers and appropriate target feature flags for C and Rust
    • Using the svld1 intrinsic when loading an argument of scalable vector type
      • There is a special case for svbool_t arguments, which have no svld1 intrinsic: we load a svint8_t and use svcmpne_n_s8 against zero to produce a svbool_t from it
    • Generating test arrays for intrinsics that take boolean values and svbool_t values (though the earlier commits use the same svptrue_$ty predicate for these arguments initially)
    • Supporting calls to enum-typed const generic arguments (such as svpattern), using simple const functions that map integers from intrinsic-test constraints to the appropriate variant
    • Generating a local containing a predicate that is used by all intrinsics which require a predicate (just svptrue_$ty() to enable every lane)
    • Using svcmpeq_$ty and svptest_any to compare the Rust and C results instead of assert_eq!
      • There is a special case for tuples of vectors, where a svcmpeq_$ty and svptest_any block is generated for each vector in the tuple, with svgetN calls to extract the appropriate vector from the result tuples
      • There is also a special case for intrinsics that return svbool_t, where there is no equivalent svcmpeq_$ty call, so sveor_b_z (exclusive OR) and !svptest_any are used
      • There is yet another special case for scalable vectors of floats, where the pre-existing NanEqF* type cannot be used, so a (rust == c) || (isnan(rust) && isnan(c)) check is performed instead, via a handful of SVE intrinsics
  • I haven't been able to test this on x86 to check that I haven't broken the test generation, but I don't think I've made any changes to the generated output that weren't in the Arm-specific parts
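
The float comparison described in the list above, `(rust == c) || (isnan(rust) && isnan(c))`, can be modelled lane by lane in plain Rust. This is only a scalar sketch of the logic, not the generated test code: the function name `lanes_match` and the use of slices in place of scalable vectors are assumptions made for illustration, and the real tests perform the check with SVE intrinsics.

```rust
/// Scalar model of the per-lane check `(rust == c) || (isnan(rust) && isnan(c))`.
/// `lanes_match` is a hypothetical name; slices stand in for scalable vectors.
fn lanes_match(rust: &[f32], c: &[f32]) -> bool {
    rust.len() == c.len()
        && rust
            .iter()
            .zip(c)
            // Plain equality fails for NaN, so two NaN lanes are accepted explicitly.
            .all(|(r, c)| r == c || (r.is_nan() && c.is_nan()))
}
```

With this model, two results that both contain NaN in the same lane compare as matching, while a NaN paired with an ordinary number does not.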

This macro isn't necessary and just makes the generated code being
written harder to read compared to multi-line strings.
Replacing `iter_specializations` (which repeatedly invokes a callback)
with an iterator implementation allows `Itertools::format_with` to be
used more broadly, which in turn allows disparate string interpolation
to be combined and hopefully provide greater context to the reader.
This isn't strictly necessary but these type names were longer than they
needed to be.
Updates `get_load_function` to return `svld{n}_{ty}` when loading a
scalable vector type. Callers of `get_load_function` will still need
to be updated to pass the predicate arguments to these load
functions.
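
As a rough illustration of the `svld{n}_{ty}` naming, the sketch below builds such a name from a vector count and an element type suffix. The helper `load_function_name` and its parameters are hypothetical; the actual signature of `get_load_function` is not shown in this description.

```rust
/// Hypothetical helper, not the tool's `get_load_function`: builds a name
/// following the `svld{n}_{ty}` pattern from a vector count and type suffix.
fn load_function_name(vectors: u32, elem_ty: &str) -> String {
    format!("svld{vectors}_{elem_ty}")
}
```

For example, a single vector of signed 8-bit elements would give `svld1_s8` under this scheme.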
Various SVE intrinsics are not yet implemented in stdarch, but are
present in the `arm_intrinsics.json` and so should be skipped.
@davidtwco davidtwco force-pushed the intrinsic-test-sve branch 2 times, most recently from c29cf85 to 6561db8 Compare June 15, 2026 16:17
@davidtwco davidtwco force-pushed the intrinsic-test-sve branch from 6561db8 to 47d9041 Compare June 15, 2026 16:22
davidtwco added 20 commits June 15, 2026 16:31
Updates the headers used by generated C code and the target feature flags
passed to the C compiler to enable SVE.
Some SVE intrinsics take booleans as arguments, so there is a need to
support generating a test value array for booleans.
Constraints that correspond to enum types - such as `svpattern` and
`svprfop` - need to be converted to the enum type in order to be used in
a generic instantiation - so introduce a const function for both types
that provides this mapping.
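
A minimal sketch of such a mapping function follows. The enum `Pattern`, its variants, the integer values, and the function name are all stand-ins invented for illustration; they are not the real `svpattern` definition or its encoding.

```rust
/// Hypothetical stand-in for an enum such as `svpattern`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Pattern {
    Pow2,
    Vl1,
    Vl2,
    All,
}

/// Maps an integer taken from a test constraint to an enum variant.
/// Being a `const fn`, it can be evaluated where a constant is required.
/// The integer values here are illustrative only.
const fn pattern_from_constraint(value: i64) -> Pattern {
    match value {
        0 => Pattern::Pow2,
        1 => Pattern::Vl1,
        2 => Pattern::Vl2,
        _ => Pattern::All,
    }
}

/// The mapping evaluated at compile time.
const FIRST: Pattern = pattern_from_constraint(0);
```

Because the mapping is a `const fn`, its result can be computed at compile time, which is what a generic instantiation needs.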
A small refactoring to make the type printing logic slightly cleaner and
with greater code re-use.
Predicate arguments of type `svbool_t` do not need test value arrays to
be generated as the same enable-all-lanes predicate will be passed to all
invocations of the intrinsic under test. There is no `svld1` equivalent
for `svbool_t` that could be used even if there were test values to
use.
Introduces a per-architecture abstraction over how intrinsic results are
compared, so that later commits can implement Arm-specific comparison
logic for SVE.
Refactoring to enable access to architecture-specific behaviour that isn't
associated with either the return or argument types.
Support defining a local variable containing the predicate that will be
used with all subsequent scalable vector intrinsics.
Implementation of `get_comparison_function` and `get_predicate_function`
for SVE which uses the relevant SVE intrinsics.
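
The PR description notes that `svbool_t` results are compared with `sveor_b_z` and `!svptest_any`. A scalar model of that idea, assuming boolean slices in place of predicate registers and a hypothetical function name `predicates_match`, might look like this:

```rust
/// Scalar model of comparing two predicate results: XOR the lanes under a
/// governing predicate (inactive lanes contribute `false`), then report a
/// match only if no lane is set. Slices are assumed to have equal length.
fn predicates_match(governing: &[bool], rust: &[bool], c: &[bool]) -> bool {
    !governing
        .iter()
        .zip(rust.iter().zip(c))
        // A lane differs when it is active and the two results disagree.
        .any(|(&g, (&r, &c))| g && (r ^ c))
}
```

With an all-true governing predicate, as the description says the tests use, every lane takes part in the comparison.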
Refactoring to enable access to architecture-specific behaviour that isn't
associated with the specific argument type.
Instead of assuming that any scalable boolean argument is a predicate,
handle predicates specifically and generate test values for `svbool_t`
values.
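
The PR description explains that `svbool_t` values are produced by loading a `svint8_t` and comparing against zero with `svcmpne_n_s8`. The sketch below models that conversion on ordinary slices; `predicate_lanes_from_bytes` is a hypothetical name and the code is not what the tool generates.

```rust
/// Scalar model of deriving predicate lanes from byte test values:
/// a lane is active exactly when the corresponding byte is non-zero.
fn predicate_lanes_from_bytes(bytes: &[i8]) -> Vec<bool> {
    bytes.iter().map(|&b| b != 0).collect()
}
```

This lets an ordinary array of integer test values double as a source of predicate test values.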
All of the generated output is run through rustfmt so these aren't
necessary.
These intrinsics need `Arguments_Preparation` added so that the
intrinsic-test tool knows to generate const arguments.
These intrinsics need `Arguments_Preparation` added so that the
intrinsic-test tool knows to generate const arguments.
Enables generation of tests for SVE intrinsics leveraging the changes
from the previous commits.
There doesn't need to be so many or other modules with the values.
SVE isn't a baseline target feature for `aarch64-unknown-linux-gnu` but
should be enabled when running tests.
SVE intrinsics aren't available on big endian.
Like with non-SVE test generation, comparing float results in
scalable vectors needs special handling.
The output of these cannot be compared.
Forward additional arguments passed to `intrinsic-test.sh` on to `cargo test` so that
`--no-fail-fast` or a specific test name can be passed.
Clang uses the `llvm.aarch64.sve.rev.bN` intrinsic for `svrev` with
`b16`, `b32` and `b64`. This required small generator changes so it knew
a bool-to-bool conversion was a no-op and a new blanket identity impl of
`SveInto` so the calls generated compile.
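
To illustrate what a blanket identity impl does, here is a sketch using a hypothetical trait `ConvertInto`. The real `SveInto` trait's definition is not shown in this description, so the trait shape, the `Pred` type, and the `generated_call` function below are all assumptions.

```rust
/// Hypothetical conversion trait standing in for `SveInto`.
trait ConvertInto<T> {
    fn convert_into(self) -> T;
}

/// Blanket identity impl: converting any type to itself is a no-op.
impl<T> ConvertInto<T> for T {
    fn convert_into(self) -> T {
        self
    }
}

/// Hypothetical stand-in for a predicate type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Pred(u8);

/// Models generated code that always inserts a conversion call,
/// even when the source and target types are the same.
fn generated_call<T>(value: T) -> T {
    ConvertInto::<T>::convert_into(value)
}
```

With the identity impl in place, generated code that unconditionally emits a conversion still compiles when no conversion is needed.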
Clang uses the `llvm.aarch64.sve.zip.bN` intrinsic for `svzip` with
`b16`, `b32` and `b64` and the `llvm.aarch64.sve.uzp.bN` intrinsic for
`svuzp` with the same types.
`sveorv` intrinsics trigger a miscompile in LLVM where the call to the
Rust intrinsic is optimised out and replaced with a zero, which is
incorrect.
These tests require that we generate test arrays with values that are
valid when cast to a pointer, which we don't currently support.
@davidtwco davidtwco force-pushed the intrinsic-test-sve branch from 47d9041 to 0a396d6 Compare June 15, 2026 16:32
Only include the `arm_sve.h` header if SVE is available - avoiding the
include on 32-bit or big-endian, etc.