use intrinsics::simd for aarch64 deinterleaving loads #2025

Merged
folkertdev merged 5 commits into rust-lang:main from folkertdev:arm-deinterleave-load
Feb 18, 2026
@folkertdev folkertdev commented Feb 14, 2026

Hitting llvm/llvm-project#181514 for some ld2 cases.

@folkertdev folkertdev force-pushed the arm-deinterleave-load branch 6 times, most recently from 2f748fb to 70ca2a6 Compare February 14, 2026 22:58
@folkertdev (Contributor, Author)

There is also an issue with ld2 on aarch64_be:

 ---- core_arch::aarch64::neon::generated::assert_vld2q_p64_ld2 stdout ----
disassembly for stdarch_test_shim_vld2q_p64_ld2: 
	 0: add x9, x0, #0x10
	 1: ld1 {v0.2d}, [x0]
	 2: ld1 {v1.2d}, [x9]
	 3: add x9, x8, #0x10
	 4: zip1 v2.2d, v1.2d, v0.2d
	 5: zip2 v0.2d, v1.2d, v0.2d
	 6: st1 {v2.2d}, [x8]
	 7: st1 {v0.2d}, [x9]
	 8: ret

thread 'core_arch::aarch64::neon::generated::assert_vld2q_p64_ld2' (1746) panicked at crates/stdarch-test/src/lib.rs:204:9:
failed to find instruction `ld2` in the disassembly
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- core_arch::aarch64::neon::generated::assert_vld2q_u64_ld2 stdout ----
disassembly for stdarch_test_shim_vld2q_u64_ld2: 
	 0: add x9, x0, #0x10
	 1: ld1 {v0.2d}, [x0]
	 2: ld1 {v1.2d}, [x9]
	 3: add x9, x8, #0x10
	 4: zip1 v2.2d, v1.2d, v0.2d
	 5: zip2 v0.2d, v1.2d, v0.2d
	 6: st1 {v2.2d}, [x8]
	 7: st1 {v0.2d}, [x9]
	 8: ret

thread 'core_arch::aarch64::neon::generated::assert_vld2q_u64_ld2' (1748) panicked at crates/stdarch-test/src/lib.rs:204:9:
failed to find instruction `ld2` in the disassembly


failures:
    core_arch::aarch64::neon::generated::assert_vld2q_p64_ld2
    core_arch::aarch64::neon::generated::assert_vld2q_u64_ld2

Probably another missed optimization?
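
For readers unfamiliar with the instructions in the disassembly above, here is a minimal scalar sketch of why two `ld1` loads followed by `zip1`/`zip2` produce the same values as a single `ld2` for vectors of two 64-bit lanes. The function names (`ld2_model`, `ld1_zip_model`, `zip1`, `zip2`) are hypothetical and written only for this illustration; they are not stdarch or `intrinsics::simd` APIs. The model uses little-endian lane numbering and does not attempt to reproduce the operand order seen on aarch64_be.

```rust
/// Scalar model of a two-way deinterleaving load of four u64 values
/// (the values an `ld2 {v0.2d, v1.2d}` is expected to produce).
/// Assumption: little-endian lane numbering; big-endian is not modeled.
pub fn ld2_model(mem: &[u64; 4]) -> ([u64; 2], [u64; 2]) {
    // Even-indexed elements go to the first vector, odd-indexed to the second.
    ([mem[0], mem[2]], [mem[1], mem[3]])
}

/// Model of `zip1` for two-lane vectors: lane 0 of each input.
pub fn zip1(a: [u64; 2], b: [u64; 2]) -> [u64; 2] {
    [a[0], b[0]]
}

/// Model of `zip2` for two-lane vectors: lane 1 of each input.
pub fn zip2(a: [u64; 2], b: [u64; 2]) -> [u64; 2] {
    [a[1], b[1]]
}

/// Model of the sequence in the failing tests: two contiguous loads
/// (`ld1`) followed by `zip1` and `zip2`.
pub fn ld1_zip_model(mem: &[u64; 4]) -> ([u64; 2], [u64; 2]) {
    let v0 = [mem[0], mem[1]]; // first contiguous load
    let v1 = [mem[2], mem[3]]; // second contiguous load, 16 bytes further
    (zip1(v0, v1), zip2(v0, v1))
}
```

Under these assumptions both models return the same pair of vectors, so the generated code is functionally equivalent; the test fails only because it requires the `ld2` mnemonic to appear in the disassembly.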

@folkertdev folkertdev force-pushed the arm-deinterleave-load branch 2 times, most recently from f7f53ec to ca268d2 Compare February 14, 2026 23:57
@folkertdev (Contributor, Author)

So, I'm skipping ld2 for neon for now: it runs into weird issues that I can't even really reproduce locally. I'd expect ld4 to be by far the most common anyway.
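
As an illustration of what an N-way deinterleaving load such as `ld4` computes, here is a rough scalar sketch. The `deinterleave` function is hypothetical and is not part of stdarch or `intrinsics::simd`. The RGBA use case in the comments is an assumption about typical usage, not something stated in this PR.

```rust
/// Scalar model of an N-way deinterleaving load over bytes.
/// Element `i` of the input lands in output vector `i % N`, lane `i / N`.
/// With N = 4 this mirrors the values an `ld4` is expected to produce.
pub fn deinterleave<const N: usize, const LANES: usize>(mem: &[u8]) -> [[u8; LANES]; N] {
    assert_eq!(mem.len(), N * LANES, "input must hold exactly N * LANES bytes");
    let mut out = [[0u8; LANES]; N];
    for (i, &byte) in mem.iter().enumerate() {
        out[i % N][i / N] = byte;
    }
    out
}
```

For example, calling `deinterleave::<4, 2>` on two interleaved RGBA pixels splits them into one two-lane vector per channel.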

r? sayantn
cc @adamgemmell

@folkertdev folkertdev marked this pull request as ready for review February 15, 2026 00:12
@folkertdev folkertdev force-pushed the arm-deinterleave-load branch from ca268d2 to 8ba8329 Compare February 18, 2026 09:03
@folkertdev folkertdev enabled auto-merge February 18, 2026 09:04
@folkertdev folkertdev added this pull request to the merge queue Feb 18, 2026
Merged via the queue into rust-lang:main with commit a894c23 Feb 18, 2026
77 checks passed