intrinsic-test: sve support #2160
Draft
davidtwco wants to merge 32 commits into
This macro isn't necessary and just makes the generated code being written harder to read compared to multi-line strings.
Replacing `iter_specializations` (which repeatedly invokes a callback) with an iterator implementation allows `Itertools::format_with` to be used more broadly, which in turn allows disparate string interpolation to be combined and hopefully provide greater context to the reader.
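As a rough illustration of the shape of this refactoring (not the code from this PR): a callback-driven walker can only hand each item to a closure, whereas an iterator can be chained into a single formatting step. The sketch below uses only the standard library. `Specialization`, `specializations` and `render_calls` are hypothetical names, and collecting into a `Vec` and calling `join` stands in for `Itertools::format_with`.

```rust
/// Hypothetical stand-in for one specialization of an intrinsic test.
#[derive(Debug, Clone, PartialEq)]
pub struct Specialization {
    pub name: String,
    pub imm: i64,
}

/// Iterator-returning replacement for a callback-style walker: callers can
/// chain `map`, `filter`, or a formatting adaptor instead of passing a closure.
pub fn specializations<'a>(
    name: &'a str,
    imms: &'a [i64],
) -> impl Iterator<Item = Specialization> + 'a {
    imms.iter().map(move |&imm| Specialization {
        name: name.to_string(),
        imm,
    })
}

/// Renders one call per specialization in a single interpolation step.
/// (Assumption: the generated text here is invented for illustration.)
pub fn render_calls(name: &str, imms: &[i64]) -> String {
    specializations(name, imms)
        .map(|s| format!("{}::<{}>(a);", s.name, s.imm))
        .collect::<Vec<_>>()
        .join("\n")
}
```

With the iterator form, the per-item text and the surrounding template can live in the same expression, which is the readability gain the commit message describes.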
This isn't strictly necessary but these type names were longer than they needed to be.
Updates `get_load_function` to return `svld{n}_{ty}` when loading a
scalable vector type. Callers of `get_load_function` still need to be
updated to pass the predicate arguments to these load functions.
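A minimal sketch of the `svld{n}_{ty}` naming scheme, assuming a simplified type description. `ArgType` and `load_function_name` are hypothetical and are not the tool's real types or the real `get_load_function` signature; non-scalable types are deliberately left out of scope.

```rust
/// Hypothetical, simplified description of an argument type.
pub struct ArgType {
    /// Element type suffix, e.g. "s8" or "f32".
    pub suffix: &'static str,
    /// Number of vectors in the tuple (1 for a plain vector).
    pub tuple_len: u32,
    /// Whether this is a scalable (SVE) vector type.
    pub scalable: bool,
}

/// Returns the `svld{n}_{ty}` name for scalable vector types, or `None`
/// when the type is not scalable or the tuple length is outside 1..=4.
pub fn load_function_name(ty: &ArgType) -> Option<String> {
    if ty.scalable && (1..=4).contains(&ty.tuple_len) {
        Some(format!("svld{}_{}", ty.tuple_len, ty.suffix))
    } else {
        None
    }
}
```

For example, a single `s8` vector maps to `svld1_s8` and a two-vector `f32` tuple maps to `svld2_f32`. The predicate argument these loads take is not modelled here.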
Various SVE intrinsics are not yet implemented in stdarch but are present in `arm_intrinsics.json`, so they should be skipped.
c29cf85 to 6561db8
6561db8 to 47d9041
Updates the headers used by generated C code and the target feature flags passed to the C compiler to enable SVE.
Some SVE intrinsics take booleans as arguments, so there is a need to support generating a test value array for booleans.
Constraints that correspond to enum types - such as `svpattern` and `svprfop` - need to be converted to the enum type in order to be used in a generic instantiation - so introduce a const function for both types that provides this mapping.
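A sketch of what such a const mapping function might look like. `Pattern` is a local stand-in for `svpattern` with only a few variants, `pattern_from_constraint` is a hypothetical name, and the integer values are assumptions based on the ACLE encoding rather than values taken from this PR.

```rust
/// Local stand-in for the `svpattern` enum (only a few variants shown).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Pattern {
    Pow2,
    Vl1,
    Vl2,
    All,
}

/// Maps an integer constraint value to the matching enum variant.
/// Being `const`, it can be evaluated where a constant is required.
/// Assumed encoding: 0 = POW2, 1 = VL1, 2 = VL2, 31 = ALL.
pub const fn pattern_from_constraint(value: i64) -> Pattern {
    match value {
        0 => Pattern::Pow2,
        1 => Pattern::Vl1,
        2 => Pattern::Vl2,
        31 => Pattern::All,
        _ => panic!("constraint value has no matching pattern variant"),
    }
}

/// Example of evaluating the mapping at compile time.
pub const ALL_LANES: Pattern = pattern_from_constraint(31);
```

Because the function is `const`, an out-of-range value used in a const context is rejected at compile time rather than at test run time.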
A small refactoring that makes the type printing logic slightly cleaner and increases code re-use.
Predicate arguments of type `svbool_t` do not need test value arrays to be generated as the same enable-all-lanes predicate will be passed to all invocations of the intrinsic under test. There is no `svld1` equivalent for `svbool_t` that could be used even if there were test values to use.
Introduces a per-architecture abstraction over how intrinsic results are compared, so that later commits can implement Arm-specific comparison logic for SVE.
Refactoring to enable access to architecture-specific behaviour that isn't associated with either the return type or the argument types.
Support defining a local variable containing the predicate that will be used with all subsequent scalable vector intrinsics.
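A minimal sketch of how a generator could emit that local variable, assuming the ACLE-style `svptrue_b{8,16,32,64}` names. `predicate_decl` and the variable name `pg` are hypothetical and are not taken from the tool.

```rust
/// Builds the line declaring an all-lanes-enabled predicate for elements of
/// `element_bits` width. The variable name `pg` is illustrative only.
/// Returns `None` for widths that have no `svptrue_bN` form.
pub fn predicate_decl(element_bits: u32) -> Option<String> {
    match element_bits {
        8 | 16 | 32 | 64 => Some(format!("let pg = svptrue_b{element_bits}();")),
        _ => None,
    }
}
```

Emitting the declaration once, ahead of the intrinsic calls, lets every subsequent scalable vector intrinsic in the generated test refer to the same predicate.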
Implementation of `get_comparison_function` and `get_predicate_function` for SVE which uses the relevant SVE intrinsics.
Refactoring to enable access to architecture-specific behaviour that isn't associated with the specific argument type.
Instead of assuming that any scalable boolean argument is a predicate, handle predicates specifically and generate test values for `svbool_t` values.
All of the generated output is run through rustfmt so these aren't necessary.
These intrinsics need `Arguments_Preparation` added so that the intrinsic-test tool knows to generate const arguments.
These intrinsics need `Arguments_Preparation` added so that the intrinsic-test tool knows to generate const arguments.
Enables generation of tests for SVE intrinsics leveraging the changes from the previous commits.
There don't need to be so many, or other modules with the values.
SVE isn't a baseline target feature for `aarch64-unknown-linux-gnu` but should be enabled when running tests.
SVE intrinsics aren't available on big-endian targets.
Like with non-SVE test generation, comparing float results in scalable vectors needs special handling.
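The special handling is needed because NaN never compares equal to itself, so two results that both contain NaN in the same lane would otherwise be reported as different. The scalar model below illustrates the lane-wise rule `(rust == c) || (isnan(rust) && isnan(c))` on plain slices. It is not the SVE implementation, and `lanes_match` is a hypothetical name.

```rust
/// Scalar model of a NaN-aware lane-wise comparison: two lanes match when
/// they are equal, or when both are NaN.
pub fn lanes_match(rust: &[f32], c: &[f32]) -> bool {
    rust.len() == c.len()
        && rust
            .iter()
            .zip(c.iter())
            .all(|(&r, &expected)| r == expected || (r.is_nan() && expected.is_nan()))
}
```

A plain `==` over the same data would fail whenever both sides produce NaN in a lane, even though the Rust and C results agree.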
The output of these cannot be compared.
Forward additional arguments passed to `intrinsic-test.sh` on to `cargo test` so that `--no-fail-fast` or a specific test name can be passed.
Clang uses the `llvm.aarch64.sve.rev.bN` intrinsic for `svrev` with `b16`, `b32` and `b64`. This required small generator changes so that it knows a bool-to-bool conversion is a no-op, and a new blanket identity impl of `SveInto` so that the generated calls compile.
Clang uses the `llvm.aarch64.sve.zip.bN` intrinsic for `svzip` with `b16`, `b32` and `b64` and the `llvm.aarch64.sve.uzp.bN` intrinsic for `svuzp` with the same types.
`sveorv` intrinsics trigger a miscompile in LLVM where the call to the Rust intrinsic is optimised out and replaced with a zero, which is incorrect.
These tests require that we generate test arrays with values that are valid when cast to a pointer, which we don't currently support.
47d9041 to 0a396d6
Only include the `arm_sve.h` header if SVE is available - avoiding the include on 32-bit or big-endian, etc.
This needs rust-lang/rust#157915 to be in nightly for CI to pass - it is already approved and in the bors queue.
This patch contains all the changes necessary for `intrinsic-test` to generate tests for the SVE intrinsics:

- … `populate_random` and related code #2126, intrinsic-test: remove `arm::argument` module #2127)
- … `intrinsic-test.sh` to `cargo test`)
- … `arm_intrinsics.json` to add some additional constraints necessary to generate the right tests (we're making sure that these addl. constraints are reflected in the source of truth we use to generate that file internally)
- … `arm_sve.h` headers and appropriate target feature flags for C and Rust
- … `svld1` intrinsic when loading an argument of scalable vector type
- … `svbool_t` arguments which do not have an `svld1` intrinsic, we load a `svint8_t` and use `svcmpne_n_s8` against zero to produce a `svbool_t` from it
- … `svbool_t` values (though the earlier commits use the same `svptrue_$ty` predicate for these arguments initially)
- … `svpattern`), using simple const functions that map integers from intrinsic-test constraints to the appropriate variant
- … `svptrue_$ty()` to enable every lane)
- … `svcmpeq_$ty` and `svptest_any` to compare the Rust and C results instead of `assert_eq!`
- … `svcmpeq_$ty` and `svptest_any` blocks are generated for each vector in the tuple, with `svgetN` calls to extract the appropriate vector from the result tuples
- … `svbool_t`, where there is no equivalent `svcmpeq_$ty` call, so `sveor_b_z` (exclusive OR) and `!svptest_any` is used
- … `NanEqF*` type cannot be used, so a `(rust == c) || (isnan(rust) && isnan(c))` check is performed instead, via a handful of SVE intrinsics
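The two `svbool_t` techniques described above can be modelled on ordinary slices, one `bool` per lane. This is a scalar sketch only: `predicate_from_i8` and `predicates_match` are hypothetical names, the governing predicate is assumed to enable every lane, and no SVE intrinsics are called.

```rust
/// Models loading `svint8_t` test values and comparing each lane against
/// zero (the role `svcmpne_n_s8` plays) to obtain one boolean per lane.
pub fn predicate_from_i8(lanes: &[i8]) -> Vec<bool> {
    lanes.iter().map(|&v| v != 0).collect()
}

/// Models comparing two predicates: XOR the lanes (the role of `sveor_b_z`)
/// and report a match when no lane is set in the result (`!svptest_any`).
pub fn predicates_match(rust: &[bool], c: &[bool]) -> bool {
    rust.len() == c.len() && !rust.iter().zip(c.iter()).any(|(&a, &b)| a ^ b)
}
```

The XOR of two identical predicates has no active lanes, so "no lane is set" is equivalent to "every lane agrees", which is why an equality intrinsic for `svbool_t` is not needed.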