x86: use `simd::intrinsics` for saturating packs by okaneco · Pull Request #2033 · rust-lang/stdarch

okaneco · 2026-02-19T14:01:10Z

Use simd::intrinsics for sse2, sse41, avx2, avx512bw

All but one of the implementations use `simd_shuffle`. Some avx512 intrinsics call the lower-target-feature intrinsics and add masking on top, and it took some trial and error to find a form that kept the optimizer happy. Saturating pack instructions are essentially shuffles, and LLVM can recognize a lot of these patterns by now.
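
As a rough illustration of why a saturating pack reads as a shuffle, here is a scalar sketch of the 256-bit signed pack from `i32` to `i16` (the shape of `_mm256_packs_epi32`). It is not the code from this PR: the function name `packs_epi32_256_model` and the index table `IDX` are made up for this example, and plain arrays stand in for the `simd_shuffle` call. The index table assumes the usual AVX2 behavior of packing within each 128-bit lane.

```rust
/// Scalar model of a 256-bit signed saturating pack (i32 -> i16).
/// Hypothetical helper for illustration only, not stdarch code.
fn packs_epi32_256_model(a: [i32; 8], b: [i32; 8]) -> [i16; 16] {
    // Indices into concat(a, b). Each 128-bit lane of the result takes
    // four elements from `a`, then four from `b`, from the same lane.
    const IDX: [usize; 16] = [0, 1, 2, 3, 8, 9, 10, 11, 4, 5, 6, 7, 12, 13, 14, 15];

    // Build concat(a, b) so the index table can address both inputs.
    let mut concat = [0i32; 16];
    concat[..8].copy_from_slice(&a);
    concat[8..].copy_from_slice(&b);

    // Shuffle, then clamp to the i16 range and truncate.
    let mut out = [0i16; 16];
    for (o, &i) in out.iter_mut().zip(IDX.iter()) {
        *o = concat[i].clamp(i32::from(i16::MIN), i32::from(i16::MAX)) as i16;
    }
    out
}
```

Because the clamp is elementwise, it can be applied before or after the index table without changing the result, so the only cross-element work in the model is the shuffle.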

Combined with masked stores, instruction tests routinely failed unless the implementation used shuffles. That is probably because the optimizer cannot see through the clamping and truncating, as in the truncating conversion stores issue. The same strategy could probably be used to get more of the saturating masked truncation instructions to pass.
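
The masking layered on top of the lower-target-feature intrinsics can be modeled in scalar code as well. The following is a sketch, assuming merge-masking semantics in the style of `_mm_mask_packs_epi32`: result lane `i` comes from the packed value when bit `i` of `k` is set, and from `src` otherwise. Both function names are hypothetical and none of this is taken from the PR's diff.

```rust
/// Scalar model of a 128-bit signed saturating pack (i32 -> i16).
/// Hypothetical helper for illustration only.
fn packs_epi32_128_model(a: [i32; 4], b: [i32; 4]) -> [i16; 8] {
    let (lo, hi) = (i32::from(i16::MIN), i32::from(i16::MAX));
    let mut out = [0i16; 8];
    for i in 0..4 {
        // Low half comes from `a`, high half from `b`, each clamped to i16.
        out[i] = a[i].clamp(lo, hi) as i16;
        out[i + 4] = b[i].clamp(lo, hi) as i16;
    }
    out
}

/// Merge-masked variant: keep `src` in every lane whose mask bit is clear.
/// Assumed semantics, modeled on the avx512bw masked pack intrinsics.
fn mask_packs_epi32_128_model(src: [i16; 8], k: u8, a: [i32; 4], b: [i32; 4]) -> [i16; 8] {
    let packed = packs_epi32_128_model(a, b);
    let mut out = src;
    for i in 0..8 {
        if (k >> i) & 1 == 1 {
            out[i] = packed[i];
        }
    }
    out
}
```

In this model the masked form is just the unmasked pack followed by a per-lane select, which is the structure the optimizer has to recognize to emit a single masked instruction.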

`_mm_packs_epi32` was the only case that failed to optimize at all unless I wrote it without a shuffle.

Use intrinsics for `sse2`, `sse41`, `avx2`, `avx512bw`

The majority of implementations use `simd_shuffle`, since that form optimized through to the avx512 intrinsics that build on the lower-target-feature intrinsics. Combined with masked stores, instruction tests would fail, presumably because the compiler couldn't see through the casting and clamping. This is a known weakness, also seen in other masked stores such as the truncating conversion stores.

rustbot · 2026-02-19T14:01:15Z

r? @folkertdev

rustbot has assigned @folkertdev.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

Owners of files modified in this PR: @Amanieu, @folkertdev, @sayantn
@Amanieu, @folkertdev, @sayantn expanded to Amanieu, folkertdev, sayantn
Random selection from Amanieu, folkertdev, sayantn

folkertdev

Neat, this looks good to me

cc @sayantn if you have thoughts, otherwise I'll just merge this tomorrow

sayantn · 2026-02-19T15:33:36Z

It lgtm too, just one point - all the intrinsics can now be marked const (just remember to mark the corresponding tests const too)

folkertdev · 2026-02-19T15:43:52Z

That's better as a follow-up I think

okaneco added 4 commits February 19, 2026 08:42

Use intrinsics for sse41

31ce954

Use intrinsics for avx2

348737d

Use intrinsics for avx512bw

56d4241

rustbot assigned folkertdev Feb 19, 2026

folkertdev approved these changes Feb 19, 2026


folkertdev added this pull request to the merge queue Feb 19, 2026

Merged via the queue into rust-lang:main with commit 16ab994 Feb 19, 2026
77 checks passed

okaneco deleted the saturating_packs branch February 19, 2026 16:03

okaneco mentioned this pull request Feb 20, 2026

x86: Followup to add const for pack intrinsics and tests #2035

Open

folkertdev merged 4 commits into rust-lang:main from okaneco:saturating_packs
