Aarch64 asm: Have software fallback and CPU id checks#10754

Open
SparkiDev wants to merge 1 commit into wolfSSL:master from SparkiDev:arm64_asm_c_fallback

Conversation

@SparkiDev SparkiDev commented Jun 22, 2026

Contributor

Description

cpuid.h: added the CPUID_ASIMD flag and the IS_AARCH64_ASIMD() macro (NEON detection).
cpuid.c: added NEON/ASIMD detection; fixed FreeBSD/OpenBSD to use HWCAP_*.
sha256.c: runtime dispatch SHA256-crypto → NEON → software.
sha512.c: replaced the #error with the same crypto → NEON → software dispatch.
chacha.c: added AArch64 runtime fallback to C.
poly1305.c: added AArch64 runtime fallback to C.
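
For readers unfamiliar with the crypto → NEON → software ordering, the selection logic can be sketched roughly as below. This is an illustration only, not the PR's code: the `DEMO_*` flags, the `demo_impl` enum and `demo_select_sha256()` are hypothetical names, and the flags are passed in as an argument rather than read from the CPU.

```c
#include <stdint.h>

/* Hypothetical feature bits; the real flags are defined in cpuid.h. */
#define DEMO_CPUID_SHA256 0x1u
#define DEMO_CPUID_ASIMD  0x2u

typedef enum {
    DEMO_IMPL_SOFTWARE = 0,  /* plain C, always available */
    DEMO_IMPL_NEON     = 1,  /* NEON/ASIMD assembly */
    DEMO_IMPL_CRYPTO   = 2   /* SHA-256 crypto extension assembly */
} demo_impl;

/* Pick the fastest implementation the CPU reports:
 * crypto extensions first, then NEON, then the software fallback. */
static demo_impl demo_select_sha256(uint32_t cpuid_flags)
{
    if (cpuid_flags & DEMO_CPUID_SHA256)
        return DEMO_IMPL_CRYPTO;
    if (cpuid_flags & DEMO_CPUID_ASIMD)
        return DEMO_IMPL_NEON;
    return DEMO_IMPL_SOFTWARE;
}
```

Checking the most capable feature first means a CPU that reports both flags still takes the crypto path, and a CPU that reports neither lands on the C code instead of failing at build time.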

Fixes
test_tls.c: don't memcpy into buffer if length is too long.
sha256.c: even if data is not NULL, return immediately when length is 0.
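
Both fixes are the same kind of guard: check the length before touching the buffer. A minimal sketch of that pattern, assuming a hypothetical `demo_ctx` with a fixed-size buffer (none of these names come from wolfSSL):

```c
#include <stddef.h>
#include <string.h>

#define DEMO_BUF_SZ 64

typedef struct {
    unsigned char buf[DEMO_BUF_SZ];
    size_t used;
} demo_ctx;

/* Returns 0 on success, -1 on bad arguments or if data would overflow buf. */
static int demo_update(demo_ctx* ctx, const unsigned char* data, size_t len)
{
    if (ctx == NULL)
        return -1;
    /* Zero length is a no-op regardless of whether data is NULL. */
    if (len == 0)
        return 0;
    if (data == NULL)
        return -1;
    /* Refuse to copy when the length exceeds the remaining space. */
    if (len > DEMO_BUF_SZ - ctx->used)
        return -1;
    memcpy(ctx->buf + ctx->used, data, len);
    ctx->used += len;
    return 0;
}
```

Writing the bound as `len > DEMO_BUF_SZ - ctx->used` avoids the unsigned overflow that `ctx->used + len > DEMO_BUF_SZ` could hit for very large `len`.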

Testing

Regression tested on AArch64 platforms.

@SparkiDev SparkiDev self-assigned this Jun 22, 2026
github-actions Bot commented Jun 22, 2026

@wolfSSL-Fenrir-bot wolfSSL-Fenrir-bot left a comment

Fenrir Automated Review — PR #10754

Scan targets checked: wolfcrypt-bugs, wolfcrypt-rs-bugs, wolfcrypt-src, wolfssl-bugs, wolfssl-src

Findings: 1
1 finding(s) posted as inline comments (see file-level comments below)

This review was generated automatically by Fenrir. Findings are non-blocking.

Comment thread wolfcrypt/src/sha256.c Outdated
@SparkiDev SparkiDev force-pushed the arm64_asm_c_fallback branch from 030ceff to d92491c Compare June 23, 2026 03:03
SparkiDev commented Jun 23, 2026

Contributor Author

Jenkins: retest this please

Checkout failed - agent went down
FIPS failure and again
Agent offline
curl failed to get file

@dgarske dgarske self-assigned this Jun 25, 2026
@dgarske dgarske removed the request for review from wolfSSL-Bot June 25, 2026 17:29
Comment thread wolfcrypt/src/cpuid.c Outdated
#define CPUID_AARCH64_FEAT_SHA3 ((word64)1 << 32)
#define CPUID_AARCH64_FEAT_SM3 ((word64)1 << 36)
#define CPUID_AARCH64_FEAT_SM4 ((word64)1 << 40)
#define CPUID_AARCH64_FEAT_ASMID ((word64)0xf << 20)

Member

This should be ASIMD

Contributor Author

fixed
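
As background on why this define is a 4-bit mask while the neighbouring ones are single bits: in the Arm architecture, the AdvSIMD field of ID_AA64PFR0_EL1 occupies bits [23:20], and the value 0xf means Advanced SIMD is not implemented. A sketch of a check built on such a mask follows. It assumes the mask is meant for that register field; `demo_has_asimd()` and the `DEMO_*` names are hypothetical, and the register value is passed in rather than read with `mrs`.

```c
#include <stdint.h>

/* Assumed: mask for the AdvSIMD field, bits [23:20] of ID_AA64PFR0_EL1. */
#define DEMO_FEAT_ASIMD_MASK  ((uint64_t)0xf << 20)
#define DEMO_FEAT_ASIMD_SHIFT 20

/* Returns 1 when the field reports Advanced SIMD, 0 when it reads 0xf
 * (the architectural encoding for "not implemented"). */
static int demo_has_asimd(uint64_t pfr0)
{
    uint64_t field = (pfr0 & DEMO_FEAT_ASIMD_MASK) >> DEMO_FEAT_ASIMD_SHIFT;
    return field != 0xf;
}
```

Note the inverted sense compared with a single feature bit: a field of zero means the feature is present.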

Comment thread wolfcrypt/src/poly1305.c Outdated

/* Dispatch each Poly1305 operation to the NEON assembly or the C
* implementation, choosing at runtime when both are available. */
void poly1305_blocks_aarch64(Poly1305* ctx, const unsigned char* m, size_t bytes)

Member

Consider making these static

Contributor Author

fixed

Comment thread wolfcrypt/src/poly1305.c Outdated
#endif
}

void poly1305_block_aarch64(Poly1305* ctx, const unsigned char* m)

Member

Consider making these static

Contributor Author

fixed

Comment thread wolfcrypt/src/poly1305.c
poly1305_arm32_blocks_16(ctx, m, bytes, 1);
return 0;
/* r &= 0xffffffc0ffffffc0ffffffc0fffffff */
ctx->r[0] = (U8TO32(key + 0) ) & 0x3ffffff;

Member

Could/should these also use the POLY1305_CTX_ remap macros?

Contributor Author

These are not used with the ARM64 fallback, and I don't want a C fallback if it can be helped. That is, I also want to add assembly implementations that use only base instructions.
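
For context on the lines under review: the `& 0x3ffffff` load is the first step of clamping the Poly1305 key `r` while splitting it into five 26-bit limbs. A standalone sketch of the full clamp, using the limb masks from the widely used 32-bit reference layout, is below. `demo_u8to32()` stands in for `U8TO32` and `demo_poly1305_clamp_r()` is a hypothetical name; this is not the code in poly1305.c.

```c
#include <stdint.h>

/* Load 4 bytes as a little-endian 32-bit word (stand-in for U8TO32). */
static uint32_t demo_u8to32(const unsigned char* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Split the first 16 key bytes into five 26-bit limbs and apply the clamp
 * r &= 0x0ffffffc0ffffffc0ffffffc0fffffff limb by limb. */
static void demo_poly1305_clamp_r(uint32_t r[5], const unsigned char key[16])
{
    r[0] = (demo_u8to32(key +  0)     ) & 0x3ffffff;
    r[1] = (demo_u8to32(key +  3) >> 2) & 0x3ffff03;
    r[2] = (demo_u8to32(key +  6) >> 4) & 0x3ffc0ff;
    r[3] = (demo_u8to32(key +  9) >> 6) & 0x3f03fff;
    r[4] = (demo_u8to32(key + 12) >> 8) & 0x00fffff;
}
```

The overlapping 4-byte loads at offsets 0, 3, 6, 9 and 12 are what let each limb be extracted with a single shift and mask.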

@dgarske dgarske assigned SparkiDev and unassigned dgarske Jun 30, 2026
@SparkiDev SparkiDev force-pushed the arm64_asm_c_fallback branch from d92491c to 6315f95 Compare June 30, 2026 23:42
SparkiDev commented Jul 1, 2026

Contributor Author

Jenkins: retest this please

makedist check timed out.
valgrind failed
FIPS failed

4 participants