Skip to content

Comments

x86: use simd::intrinsics for saturating packs#2033

Merged
folkertdev merged 4 commits intorust-lang:mainfrom
okaneco:saturating_packs
Feb 19, 2026
Merged

x86: use simd::intrinsics for saturating packs#2033
folkertdev merged 4 commits intorust-lang:mainfrom
okaneco:saturating_packs

Conversation

@okaneco
Copy link
Contributor

@okaneco okaneco commented Feb 19, 2026

  • Use simd::intrinsics for sse2, sse41, avx2, avx512bw

All but one of the implementations make use of simd_shuffle. Some avx512 intrinsics call the lower-target-feature intrinsics but with additional masking capability which caused some trial and error figuring out how to make the optimizer happy. Saturating packing instructions are essentially shuffles and LLVM can recognize a lot of these patterns by now.

Combined with masked stores, instruction tests routinely failed unless using shuffles which is probably the lack of being able to see through the clamping and truncating as in the truncating conversion stores issue. This same strategy could probably be used to get more of the saturating masked truncation instructions to pass.

_mm_packs_epi32 was the single case that failed to optimize at all unless I wrote it without a shuffle.

Use intrinsics for `sse2`, `sse41`, `avx2`, `avx512bw`

The majority of implementations make use of `simd_shuffle` since that
optimized through to the avx512 intrinsics that made use of the lower
target feature intrinsics. Combined with masked stores, instruction
tests would fail presumably due to the casting and clamping that
the compiler couldn't see through. This is a known weakness as seen
in the other masked stores like the truncating conversion stores.
@rustbot
Copy link
Collaborator

rustbot commented Feb 19, 2026

r? @folkertdev

rustbot has assigned @folkertdev.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: @Amanieu, @folkertdev, @sayantn
  • @Amanieu, @folkertdev, @sayantn expanded to Amanieu, folkertdev, sayantn
  • Random selection from Amanieu, folkertdev, sayantn

Copy link
Contributor

@folkertdev folkertdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat, this looks good to me

cc @sayantn if you have thoughts, otherwise I'll just merge this tomorrow

@sayantn
Copy link
Contributor

sayantn commented Feb 19, 2026

It lgtm too, just one point - all the intrinsics can now be marked const (just remember to mark the corresponding tests const too)

@folkertdev
Copy link
Contributor

That's better as a follow-up I think

@folkertdev folkertdev added this pull request to the merge queue Feb 19, 2026
Merged via the queue into rust-lang:main with commit 16ab994 Feb 19, 2026
77 checks passed
@okaneco okaneco deleted the saturating_packs branch February 19, 2026 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants