refactor!: split_first_chunk to help optimizer remove unreachable panic#65

Merged
Noratrieb merged 2 commits into rust-lang:master from morrisonlevi:split_first_chunk
Mar 25, 2026
Conversation

@morrisonlevi (Contributor) commented Mar 24, 2026

This does two things:

  1. Bumps the MSRV to 1.77, which may or may not count as a BC break depending on how you look at it; that's why I used ! in the title. The MSRV bump is needed for split_first_chunk.
  2. Optimizes the hash_bytes path to remove a panic branch.

I'm trying to write a no-panic hash table, tried using rustc-hash, and discovered this. The difficulty of writing no-panic code is that it depends on the optimizer: Rust 1.77 doesn't emit a panic in the generated code, whereas Rust 1.92 does (aarch64-apple-darwin). The code on this branch optimizes cleanly under both Rust versions.
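To illustrate the idea (this is a simplified sketch, not the actual hash_bytes code from this PR): indexing a slice with a fixed range forces the compiler to keep a bounds-check panic path unless it can prove the length, while split_first_chunk (stable since Rust 1.77) fuses the length check and the access into one Option, so no panic path exists in the first place.

```rust
// Before: `bytes[..8]` can panic, and the optimizer has to prove
// `bytes.len() >= 8` from the loop condition to remove that branch.
fn sum_words_indexing(mut bytes: &[u8]) -> u64 {
    let mut acc = 0u64;
    while bytes.len() >= 8 {
        acc = acc.wrapping_add(u64::from_le_bytes(bytes[..8].try_into().unwrap()));
        bytes = &bytes[8..];
    }
    acc
}

// After: `split_first_chunk::<8>` returns `Option<(&[u8; 8], &[u8])>`,
// so the length check and the access are a single fallible operation
// with no panicking path at all.
fn sum_words_chunked(mut bytes: &[u8]) -> u64 {
    let mut acc = 0u64;
    while let Some((chunk, rest)) = bytes.split_first_chunk::<8>() {
        acc = acc.wrapping_add(u64::from_le_bytes(*chunk));
        bytes = rest;
    }
    acc
}
```

Both functions compute the same sum; the second is simply easier for the optimizer to reason about.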

Here you can see the diff in assembly and see that the bl with slice_index_fail is gone on the right (and is just generally nicer):

aarch64-apple-darwin side-by-side assembly
        movk    x13, #10655, lsl #16                          |         ldp     x8, x14, [x13], #16
        movk    x13, #14370, lsl #32                          |         eor     x8, x8, x9
        movk    x13, #41993, lsl #48                          |         eor     x9, x14, x11
LBB0_4:                                                       |         mul     x14, x9, x8
        mov     x14, x9                                       < 
        add     x9, x11, #8                                   < 
        cmp     x9, x2                                        < 
        b.hi    LBB0_13                                       < 
        add     x11, x11, #16                                 < 
        ldp     x9, x15, [x12, #-8]                           < 
        eor     x8, x9, x8                                    < 
        eor     x9, x15, x13                                  < 
        mul     x15, x9, x8                                   < 
        umulh   x8, x9, x8                                              umulh   x8, x9, x8
        eor     x9, x8, x15                                   |         eor     x8, x8, x14
        add     x12, x12, #16                                 |         mov     x9, x12
        mov     x8, x14                                       |         cmp     x10, #15
        cmp     x11, x10                                      |         b.hi    LBB0_5
        b.lo    LBB0_4                                        |         add     x9, x0, x1
        add     x8, x0, x10                                   |         ldp     x10, x11, [x9, #-16]
        ldp     x10, x11, [x8]                                |         eor     x9, x10, x12
        eor     x8, x10, x14                                  |         eor     x8, x11, x8
        eor     x9, x11, x9                                   |         mul     x10, x9, x8
        b       LBB0_12                                       |         umulh   x8, x9, x8
                                                              >         eor     x8, x8, x10
                                                              >         eor     x0, x1, x8
                                                              >         ret
LBB0_7:                                                         LBB0_7:
        cmp     x2, #3                                        |         cmp     x1, #3
        b.ls    LBB0_9                                                  b.ls    LBB0_9
        ldr     w10, [x0]                                               ldr     w10, [x0]
        add     x11, x0, x2                                   |         add     x11, x0, x1
        ldur    w11, [x11, #-4]                                         ldur    w11, [x11, #-4]
        eor     x8, x10, x8                                   |         eor     x9, x10, x9
        eor     x9, x11, x9                                   |         eor     x8, x11, x8
        b       LBB0_12                                       |         mul     x10, x9, x8
                                                              >         umulh   x8, x9, x8
                                                              >         eor     x8, x8, x10
                                                              >         eor     x0, x1, x8
                                                              >         ret
LBB0_9:                                                         LBB0_9:
        cbz     x2, LBB0_12                                   |         cbz     x1, LBB0_3
        lsr     x10, x2, #1                                   |         lsr     x10, x1, #1
        ldrb    w10, [x0, x10]                                          ldrb    w10, [x0, x10]
        ldrb    w11, [x0]                                               ldrb    w11, [x0]
        add     x12, x0, x2                                   |         add     x12, x0, x1
        ldurb   w12, [x12, #-1]                                         ldurb   w12, [x12, #-1]
        eor     x8, x11, x8                                   |         eor     x9, x11, x9
        orr     x10, x10, x12, lsl #8                                   orr     x10, x10, x12, lsl #8
LBB0_11:                                                      |         eor     x8, x10, x8
        eor     x9, x10, x9                                   |         mul     x10, x9, x8
LBB0_12:                                                      |         umulh   x8, x9, x8
        mul     x10, x8, x9                                   < 
        umulh   x8, x8, x9                                    < 
        eor     x8, x8, x10                                             eor     x8, x8, x10
        eor     x0, x2, x8                                    |         eor     x0, x1, x8
        .cfi_def_cfa wsp, 16                                  < 
        ldp     x29, x30, [sp], #16                           < 
        .cfi_def_cfa_offset 0                                 < 
        .cfi_restore w30                                      < 
        .cfi_restore w29                                      < 
        ret                                                             ret
LBB0_13:                                                      < 
        .cfi_restore_state                                    < 
        add     x8, x2, #8                                    < 
        and     x0, x8, #0xfffffffffffffff0                   < 
Ltmp0:                                                        < 
Lloh0:                                                        < 
        adrp    x3, l_anon.e881f07e3afd45838d38e6d12340fccf.1 < 
Lloh1:                                                        < 
        add     x3, x3, l_anon.e881f07e3afd45838d38e6d12340fc < 
        orr     x1, x0, #0x8                                  < 
        bl      __ZN4core5slice5index16slice_index_fail17h548 < 
Ltmp1:                                                        < 
        brk     #0x1                                          < 
LBB0_15:                                                      < 
Ltmp2:                                                        < 
        bl      __ZN4core9panicking19panic_cannot_unwind17he0 < 
        .loh AdrpAdd    Lloh0, Lloh1                          < 
Lfunc_end0:                                                   < 
        .cfi_endproc                                                    .cfi_endproc
        .section        __TEXT,__gcc_except_tab               < 
        .p2align        2, 0x0                                < 
GCC_except_table0:                                            < 
Lexception0:                                                  < 
        .byte   255                                           < 
        .byte   155                                           < 
        .uleb128 Lttbase0-Lttbaseref0                         < 
Lttbaseref0:                                                  < 
        .byte   1                                             < 
        .uleb128 Lcst_end0-Lcst_begin0                        < 
Lcst_begin0:                                                  < 
        .uleb128 Ltmp0-Lfunc_begin0                           < 
        .uleb128 Ltmp1-Ltmp0                                  < 
        .uleb128 Ltmp2-Lfunc_begin0                           < 
        .byte   1                                             < 
Lcst_end0:                                                    < 
        .byte   127                                           < 
        .byte   0                                             < 
        .p2align        2, 0x0                                < 
Lttbase0:                                                     < 
        .byte   0                                             < 
        .p2align        2, 0x0                                < 
                                                              < 
        .section        __TEXT,__cstring,cstring_literals     < 
l_anon.e881f07e3afd45838d38e6d12340fccf.0:                    < 
        .asciz  "src/lib.rs"                                  < 
                                                              < 
        .section        __DATA,__const                        < 
        .p2align        3, 0x0                                < 
l_anon.e881f07e3afd45838d38e6d12340fccf.1:                    < 
        .quad   l_anon.e881f07e3afd45838d38e6d12340fccf.0     < 
        .asciz  "\n\000\000\000\000\000\000\0009\001\000\000- < 
                                                                
.subsections_via_symbols                                        .subsections_via_symbols

This can be solved in other ways that do not require an MSRV bump, but I could not personally find one that worked while avoiding unsafe, which seems to also be a goal here.

I used an extern "C" stub like this to be able to see the assembly:

// Exported with a stable C ABI symbol so the optimized assembly for
// hash_bytes is easy to find and inspect in the output.
#[no_mangle]
pub unsafe extern "C" fn hash_bytes_probe(ptr: *const u8, len: usize) -> u64 {
    // SAFETY: the caller must pass a valid pointer/length pair for an
    // initialized byte buffer that outlives this call.
    let bytes = unsafe { core::slice::from_raw_parts(ptr, len) };
    hash_bytes(bytes)
}

To be clear, I'm not actually interested in performance here. I need the optimizer to do well so that it eliminates a panic condition.
@morrisonlevi morrisonlevi marked this pull request as ready for review March 25, 2026 00:08
@Noratrieb (Member) commented:

Thanks, bumping the MSRV to that old version is no problem. I'm gonna review it later. cc @orlp you might also be interested

@orlp (Contributor) left a comment:

Seems fine to me, I just don't understand why the old code got pessimized.

@Noratrieb Noratrieb merged commit 140e525 into rust-lang:master Mar 25, 2026
10 checks passed
@Noratrieb (Member) commented:
#67
