Fix SIGILL crash on CPUs without AVX-512 #6
Open
GavinPalmer1984 wants to merge 1 commit into computerex:main
The #pragma GCC target directives in simd_dot.c, simd_qq_dot.c, and cpool.c forced AVX-512 instruction generation regardless of the build machine's capabilities. This overrode the -march=native flag already present in CGo CFLAGS, causing GCC to auto-vectorize with AVX-512 instructions that crash (SIGILL) on CPUs without AVX-512 support.

Remove the target pragmas entirely and rely on -march=native, which already detects the CPU at compile time. This means:
- AVX-512 machines: GCC auto-vectorizes with AVX-512 as before
- AVX2-only machines: GCC correctly limits to AVX2/FMA/F16C
- No performance regression on capable hardware

Fixes computerex#4

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
6d64a74 to da97d6a
Summary
- `#pragma GCC target(...)` in `quant/simd_dot.c`, `quant/simd_qq_dot.c`, and `quant/cpool.c` forced AVX-512 instruction generation regardless of the build machine
- This overrode the `-march=native` flag already present in CGo CFLAGS, causing GCC's auto-vectorizer to emit EVEX-prefixed AVX-512 instructions
- The result is a `SIGILL` crash on CPUs without AVX-512 (Intel pre-11th gen, AMD pre-Zen 4)

Fix
Remove the `#pragma GCC target(...)` directives entirely and rely on `-march=native` (already in `#cgo CFLAGS`), which detects CPU capabilities at compile time.

The pragmas were originally added to "avoid CGo flag restrictions", but `-march=native` already handles this correctly.

Test plan
- (`-march=native` enables the same features)

Fixes #4
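For reviewers, here is a minimal sketch of what a kernel looks like once the target pragmas are gone. It is not the project's code: `dot_i8` and `compiled_simd_level` are hypothetical names, and the int8 element type is an assumption. Whether the loop is actually auto-vectorized depends on the GCC version and optimization level. The point is only that the instruction set is now decided by the command-line flags, which the predefined macros `__AVX512F__` and `__AVX2__` reflect.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dot-product kernel; NOT the code in
 * quant/simd_dot.c. There is deliberately no `#pragma GCC target(...)`
 * here, so the compiler may only use instructions permitted by the
 * command-line flags (for example -march=native from #cgo CFLAGS). */
int32_t dot_i8(const int8_t *a, const int8_t *b, size_t n) {
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        /* Widen before multiplying so the product cannot overflow int8. */
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
}

/* Reports the widest x86 SIMD extension this translation unit was
 * compiled for, based on macros GCC predefines from the -m/-march flags.
 * Hypothetical helper, intended only as a build-time sanity check. */
const char *compiled_simd_level(void) {
#if defined(__AVX512F__)
    return "avx512f";
#elif defined(__AVX2__)
    return "avx2";
#else
    return "baseline";
#endif
}
```

On an AVX2-only build machine, `compiled_simd_level()` compiled with `-march=native` should not report `avx512f`. The same information is available without compiling anything via `gcc -march=native -dM -E - < /dev/null | grep AVX512`. Note that `-march=native` ties the binary to the build machine's CPU, so a binary built on an AVX-512 host can still fail on a host without it.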
🤖 Generated with Claude Code