Description
I was enabling Miri tests in my SIMD wrapper crate and ran into UB for the vld3 intrinsics.
The error below is for vld3q_f32 which returns a float32x4x3_t.
test aarch64::tests::test_vld3q_f32 ... error: Undefined Behavior: memory access failed: attempting to access 64 bytes, but got alloc751281 which is only 48 bytes from the end of the allocation
--> \.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\..\..\stdarch\crates\core_arch\src\macros.rs:250:20
|
250 | let w: W = ptr::read_unaligned($ptr as *const W);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Undefined Behavior occurred here
|
::: \.rustup\toolchains\nightly-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\..\..\stdarch\crates\core_arch\src\arm_shared\neon\generated.rs:22080:5
|
22080 | crate::core_arch::macros::deinterleaving_load!(f32, 4, 3, a)
| ------------------------------------------------------------ in this macro invocation
|
This comment on a private load function in std::simd warns about the interaction between repr(simd) and read_unaligned:
/// This function is necessary since `repr(simd)` has padding for non-power-of-2 vectors (at the time of writing).
/// With padding, `read_unaligned` will read past the end of an array of N elements.
core_arch's Simd is repr(simd), which pads the type's size up to the next power of two, so Simd<f32, 12> grows from 48 to 64 bytes.
stdarch/crates/core_arch/src/simd.rs
Lines 39 to 41 in 949ae81
#[repr(simd)]
#[derive(Copy)]
pub(crate) struct Simd<T: SimdElement, const N: usize>([T; N]);
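To make the arithmetic concrete, here is a minimal sketch that assumes the padding rule described above (size rounded up to the next power of two). The helpers `padded_size` and `overread` are hypothetical names used only for illustration; they are not part of stdarch.

```rust
use std::mem::size_of;

/// Hypothetical helper: size in bytes of the combined vector used by a 3-way
/// deinterleaving load, ASSUMING the size is rounded up to a power of two.
fn padded_size<T>(lanes: usize) -> usize {
    (lanes * 3 * size_of::<T>()).next_power_of_two()
}

/// Hypothetical helper: number of bytes read past the end of the
/// `[T; lanes * 3]` source under that same assumption.
fn overread<T>(lanes: usize) -> usize {
    padded_size::<T>(lanes) - lanes * 3 * size_of::<T>()
}
```

For vld3q_f32 this gives `padded_size::<f32>(4) == 64` and `overread::<f32>(4) == 16`, which matches the 64-byte access against a 48-byte allocation in the Miri error. Because three times a power of two is never itself a power of two, the same reasoning would apply to the other three-vector loads that go through this macro arm.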
Thus the type W here, Simd<f32, 12>, is actually 64 bytes, so read_unaligned reads 16 bytes past the end of the data and triggers UB in Miri when a pointer to a [f32; 12] is passed.
stdarch/crates/core_arch/src/macros.rs
Lines 253 to 268 in 949ae81
($elem:ty, $lanes:literal, 3, $ptr:expr) => {{
    use $crate::core_arch::macros::deinterleave_mask;
    use $crate::core_arch::simd::Simd;
    use $crate::{mem::transmute, ptr};
    type V = Simd<$elem, $lanes>;
    type W = Simd<$elem, { $lanes * 3 }>;
    let w: W = ptr::read_unaligned($ptr as *const W);
    let v0: V = simd_shuffle!(w, w, deinterleave_mask::<$lanes, 3, 0>());
    let v1: V = simd_shuffle!(w, w, deinterleave_mask::<$lanes, 3, 1>());
    let v2: V = simd_shuffle!(w, w, deinterleave_mask::<$lanes, 3, 2>());
    transmute((v0, v1, v2))
}};
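One possible way to avoid the over-read is to copy exactly the unpadded number of bytes into an uninitialized value of the padded type instead of calling read_unaligned on it. The sketch below is an illustration on stable Rust, not a proposed patch: `Padded` is a hypothetical stand-in for Simd<f32, 12> that uses `align(64)` purely to force a 64-byte size, and `load_padded` is a hypothetical helper name.

```rust
use std::mem::{size_of, MaybeUninit};
use std::ptr;

/// Hypothetical stand-in for `Simd<f32, 12>`. Stable Rust has no `repr(simd)`,
/// so `align(64)` is used here only to pad the size from 48 to 64 bytes.
#[repr(C, align(64))]
#[derive(Clone, Copy)]
struct Padded([f32; 12]);

// Compile-time checks that the stand-in reproduces the size mismatch.
const _: () = assert!(size_of::<[f32; 12]>() == 48);
const _: () = assert!(size_of::<Padded>() == 64);

/// Loads 12 floats from `src` without touching memory past those 48 bytes.
///
/// # Safety
/// `src` must be valid for reading 12 consecutive `f32` values.
/// It does not need to be aligned for `Padded`.
unsafe fn load_padded(src: *const f32) -> Padded {
    let mut tmp = MaybeUninit::<Padded>::uninit();
    unsafe {
        // Copy only the 12 elements; the trailing 16 padding bytes of `tmp`
        // stay uninitialized, which is allowed for padding.
        ptr::copy_nonoverlapping(src, tmp.as_mut_ptr().cast::<f32>(), 12);
        tmp.assume_init()
    }
}
```

Calling `unsafe { load_padded(data.as_ptr()) }` on a `[f32; 12]` reads 48 bytes, whereas `ptr::read_unaligned(data.as_ptr() as *const Padded)` would read 64, which is the access pattern Miri rejects above. Whether the same approach suits the real macro depends on how it interacts with `simd_shuffle!` and code generation, which this sketch does not address.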