wolfCrypt on TI C2000 C28x (LAUNCHXL-F28P55X) by dgarske · Pull Request #10724 · wolfSSL/wolfssl

dgarske · 2026-06-18T00:15:18Z

wolfCrypt: CHAR_BIT != 8 (16-bit byte) support for TI C2000 C28x

Companion example PR (wolfssl-examples): wolfSSL/wolfssl-examples#576

Summary

Adds WOLFSSL_WIDE_BYTE support so wolfCrypt builds and runs correctly on word-addressed targets where CHAR_BIT != 8 - specifically the TI C2000 C28x DSP family, where a C char/unsigned char (wolfSSL's byte) is 16 bits and is the smallest addressable unit. All changes are gated and are a no-op on normal 8-bit-byte targets.

The work was validated end-to-end on a TI LAUNCHXL-F28P55X (TMS320F28P550SJ, C28x, 150 MHz) using the bare-metal example added in the companion wolfssl-examples PR. Every algorithm below passes known-answer tests on hardware, and the standard host wolfcrypt_test continues to pass (no 8-bit regression).

Validated algorithms (on C28x hardware)

SHA-1; SHA-224/256, SHA-384/512, SHA-512/224, SHA-512/256
SHA3-224/256/384/512, SHAKE128/256 (with a 32-bit split Keccak permutation for WC_16BIT_CPU that emits native instructions instead of compiler 64-bit helper calls - ~53% faster SHAKE/SHA3 on this target)
ML-DSA-44/65/87 (Dilithium) verify and full keygen/sign/verify; ML-KEM-512/768/1024 (FIPS 203)
AES-128/192/256 CBC/CTR/CFB/OFB/GCM/XTS; AES-CMAC, AES-CCM, AES-GMAC, AES-SIV, AES-EAX
HMAC + HKDF; ChaCha20-Poly1305; Poly1305
X25519 + Ed25519; X448 + Ed448 (CURVE448_SMALL/ED448_SMALL byte backend); ECDSA + ECDH (SECP256R1, SP math)
RSA-2048 PKCS#1 v1.5 sign and verify; DH FFDHE-2048 (SP math)

What the `CHAR_BIT != 8` fixes address

All behind WOLFSSL_WIDE_BYTE (auto-enabled for CHAR_BIT != 8 and known 16-bit-char TI toolchain macros), each a no-op on 8-bit targets:

Byte/word aliasing. Serializing a word32/word64 by casting to byte* moves addressable cells, not octets. Replaced with explicit shift-based octet I/O via shared helpers in misc.c (WordsFromBytesBE32/BytesFromWordsBE32, BytesFromWordsLE32, the 64-bit variants, octet-correct readUnalignedWord32/readUnalignedWord64). sp_int.c sp_read_unsigned_bin uses an endian-/CHAR_BIT-agnostic shift loop for its leftover bytes (a 3-byte RSA exponent previously loaded as 1 instead of 65537).
(byte)x does not truncate to an octet (it keeps 16 bits). Such casts are now masked with WC_OCTET(x) = (byte)((x) & 0xFF), which is used across the ML-KEM/ML-DSA encoders, the SP *_to_bin serializers, AES GETBYTE, base64, the DRBG, and the Curve448/Ed448 CURVE448_SMALL byte-array field backend (whose carry-store (word8) casts must be masked before the next limb re-reads them).
Integer promotion. 1U << n is 16-bit on C28x (use 1UL); a bit width written sizeof(t) * 8 is wrong when CHAR_BIT != 8 (use CHAR_BIT * sizeof(t)); byte operands promote to a 16-bit int.
sizeof counts cells, not octets. For example, CHACHA_CHUNK_BYTES must be 16 * 4, not 16 * sizeof(word32), which evaluates to 32 on C28x, halving the ChaCha block and desyncing the counter.
xorbuf word stride. A mismatch between WOLFSSL_WORD_SIZE_LOG2 and sizeof(word) left half of each buffer un-XORed on a 16-bit-cell target; corrected for the WC_16BIT_CPU word16 path.
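
The first two items can be illustrated with a minimal host-side sketch. This is not the wolfSSL code: `cell_t`, `OCTET`, `store_be32`, `load_be32`, and `read_unsigned_be` are hypothetical names, and `cell_t` is a `uint16_t` standing in for the 16-bit C28x `unsigned char` so that the behaviour can be observed on an ordinary 8-bit-byte host.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-bit addressable cell (the C28x "byte"). */
typedef uint16_t cell_t;

/* Same idea as the WC_OCTET(x) mask quoted above: keep only the low 8 bits. */
#define OCTET(x) ((cell_t)((x) & 0xFFu))

/* Store a 32-bit word as four big-endian octets, one octet per cell.
 * Shifts and masks are used instead of a pointer cast, so the result does
 * not depend on CHAR_BIT or on the target's endianness. */
static void store_be32(cell_t out[4], uint32_t w)
{
    out[0] = OCTET(w >> 24);
    out[1] = OCTET(w >> 16);
    out[2] = OCTET(w >> 8);
    out[3] = OCTET(w);
}

/* Load four big-endian octets back into a 32-bit word. */
static uint32_t load_be32(const cell_t in[4])
{
    return ((uint32_t)OCTET(in[0]) << 24) |
           ((uint32_t)OCTET(in[1]) << 16) |
           ((uint32_t)OCTET(in[2]) << 8)  |
            (uint32_t)OCTET(in[3]);
}

/* Shift-loop read of a short big-endian integer (len <= 4), in the spirit
 * of the "leftover bytes" fix described for sp_read_unsigned_bin. */
static uint32_t read_unsigned_be(const cell_t *in, size_t len)
{
    uint32_t v = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        v = (v << 8) | (uint32_t)OCTET(in[i]);
    }
    return v;
}
```

With this sketch, the three octets `01 00 01` read back as 65537, the RSA exponent case mentioned above, and a cell holding `0x1234` contributes only `0x34` once masked.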

It also adds WOLFSSL_MLDSA_VERIFY_SMALLEST_MEM (streams the signature z vector per-row), which combined with WOLFSSL_MLDSA_ASSIGN_KEY brings ML-DSA-87 verify to ~10.8 KB RAM with zero heap.

Commit layout

wolfcrypt: add WOLFSSL_WIDE_BYTE support for CHAR_BIT != 8 targets (TI C2000 C28x) - core types, misc octet helpers, base64, DRBG
sha: octet-correct SHA-1/SHA-2 byte I/O and 32-bit split Keccak permutation for CHAR_BIT != 8
aes/chacha: octet-correct block, key, keystream and XTS-tweak I/O for CHAR_BIT != 8
mldsa/mlkem: correct ML-DSA and ML-KEM on CHAR_BIT != 8; add WOLFSSL_MLDSA_VERIFY_SMALLEST_MEM
ecc/25519/448/sp: octet-correct X25519/Ed25519/X448/Ed448 and SP byte<->mp conversion for CHAR_BIT != 8
test/benchmark/ci: CHAR_BIT != 8 test vectors, NO_MALLOC benchmark, TI C2000 compile CI and docs

Footprint (measured on F28P55X, cl2000 25.11.0; octets, KB = 1024 octets)

Code size is the per-object .text size from the linker map, converted to octets (16-bit words x 2). Each build enables a single parameter set: ML-DSA-87 only, ML-KEM-1024 only.

| Item | Size |
| --- | --- |
| ML-DSA-87 signature / public key / private key | 4627 / 2592 / 7488 B |
| ML-KEM-1024 ciphertext / public key / private key | 1568 / 1568 / 3168 B |
| ML-DSA sign+verify code (`wc_mldsa.obj`) | ~22.4 KB |
| ML-KEM make+enc+dec code (`wc_mlkem` + `wc_mlkem_poly`) | ~2.9 + 12.7 KB |
| SHA-3/SHAKE code (`WOLFSSL_SHA3_SMALL` / split-64 fast path) | ~5.3 / 15.6 KB |

RAM per operation, measured on hardware (heap high-water via wolfSSL_SetAllocators, stack via paint/scan):

| Operation | RAM |
| --- | --- |
| ML-DSA-87 verify | ~10.8 KB with `WOLFSSL_MLDSA_ASSIGN_KEY` (zero heap); ~15.9 KB copying the public key into the key struct |
| ML-DSA-87 sign / keygen | ~31.6 / 28.2 KB peak heap (small-mem signer) |
| ML-KEM-1024 make / encapsulate / decapsulate | ~2 / 6 / 9 KB transient heap |
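
The "paint/scan" stack measurement mentioned above is a common bare-metal technique: fill the stack region with a known pattern, run the operation, then count how much of the pattern was overwritten. The following is a minimal host-side sketch, not the code from the example project. A plain array stands in for the stack region, the names and the pattern value are hypothetical, and the stack is assumed to grow from `region[0]` toward higher addresses.

```c
#include <stddef.h>
#include <stdint.h>

#define PAINT_PATTERN 0xA5A5u  /* hypothetical fill value */

/* Paint: fill the whole region with the pattern before the operation runs. */
static void stack_paint(uint16_t *region, size_t words)
{
    size_t i;

    for (i = 0; i < words; i++) {
        region[i] = PAINT_PATTERN;
    }
}

/* Scan: walk back from the far end of the region until the first word that
 * no longer holds the pattern. Everything below that point counts as used.
 * Assumes the stack grows from region[0] toward higher indices. */
static size_t stack_used_words(const uint16_t *region, size_t words)
{
    size_t i = words;

    while (i > 0 && region[i - 1] == PAINT_PATTERN) {
        i--;
    }
    return i;
}
```

The result is a high-water mark in words. It can undercount slightly if the deepest stack words happen to hold the pattern value, so it is best read as an estimate.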

Testing

Host: ./configure --enable-dilithium --enable-experimental --enable-shake256 --enable-shake128 && make && ./wolfcrypt/test/testwolfcrypt - passes (RSA, ECC, ML-DSA, ML-KEM, SHA-2/3, all crypto). No behavior change on 8-bit-byte targets.
Hardware: on the LAUNCHXL-F28P55X, KATs for every algorithm listed above pass, and wolfcrypt_test crypto passes.
CI: IDE/C2000/compile.sh runs cl2000 --compile_only over the CHAR_BIT != 8 wolfCrypt subset (SHA-1/2/3, AES + modes, ChaCha/Poly1305, X25519/Ed25519, X448/Ed448, ML-DSA verify, SP-ECC); .github/workflows/ti-c2000-compile.yml runs it on PRs (fetches/caches the TI C2000 code generation tools, with optional SHA-256 pinning of the installer).

Benchmarks (F28P55X @ 150 MHz)

| Primitive | Throughput |
| --- | --- |
| SHA-256 | ~284 KiB/s |
| SHA-384 / SHA-512 | ~166 KiB/s |
| SHA3-224 / 256 / 384 / 512 | ~279 / 264 / 206 / 146 KiB/s |
| SHAKE128 / SHAKE256 | ~319 / 264 KiB/s |
| RNG (Hash-DRBG) | ~122 KiB/s |

ML-DSA-87: verify ~225 ms/op (~10.8 KB RAM, zero heap); keygen and signing also run (SIGN=1).

Notes

wolfcrypt/src/sp_c32.c is generated. The & 0xFF octet masks added to its sp_*_to_bin_* serializers are also applied in the SP generator templates (kept in sync so a regeneration preserves them).
Documentation: IDE/C2000/README.md describes the support, the build options, and the benchmark/footprint results; the full bare-metal example (with KATs, benchmark, linker scripts, and per-algorithm make toggles) is in wolfssl-examples at embedded/ti-c2000-f28p55x/.

Copilot

Pull request overview

This PR adds and CI-guards a bare-metal wolfCrypt port for TI C2000 C28x targets where CHAR_BIT == 16, introducing gated fixes so hashing, DRBG, ML-DSA verify, and SP-math ECC work correctly when a C “byte” is wider than 8 bits.

Changes:

Introduces WOLFSSL_NO_OCTET_BYTE detection and uses octet-wise load/store paths to avoid invalid byte/word aliasing on CHAR_BIT != 8 targets (SHA-256/512 family, SHA-3/SHAKE, Base64 CT decode, DRBG helpers, rotate helpers).
Adds “smallest memory” ML-DSA verify mode that streams z per polynomial to reduce pinned RAM in wc_MlDsaKey.
Adds TI C2000 compile-only guard scripts plus a GitHub Actions workflow that downloads the TI CGT and compiles a scoped subset.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.


| File | Description |
| --- | --- |
| wolfssl/wolfcrypt/wc_port.h | Makes atomic arg type selection robust for 16-bit `int` by also checking `UINT_MAX`. |
| wolfssl/wolfcrypt/wc_mldsa.h | Adds `WOLFSSL_MLDSA_VERIFY_SMALLEST_MEM` struct layout variant for reduced verify RAM. |
| wolfssl/wolfcrypt/types.h | Adds `WOLFSSL_NO_OCTET_BYTE` auto-detection; adjusts `WC_16BIT_CPU` 64-bit availability behavior. |
| wolfssl/wolfcrypt/sp_int.h | Adds support for `unsigned char` being 16-bit (no native 8-bit type). |
| wolfssl/wolfcrypt/settings.h | Requires explicit opt-in for SP math on 16-bit-`int` CPUs via `WOLFSSL_SP_ALLOW_16BIT_CPU`. |
| wolfssl/wolfcrypt/dilithium.h | Adds smallest-mem verify gating and defaults slow Montgomery reduction macros on `WC_16BIT_CPU`. |
| wolfcrypt/test/test.c | Switches large-digest constants from C strings to `byte[]` to avoid `CHAR_BIT!=8` pitfalls. |
| wolfcrypt/src/wc_port.c | Fixes init-state static assert to use `CHAR_BIT` instead of hardcoded 8. |
| wolfcrypt/src/wc_mldsa.c | Adds octet-masking for packed bytes and fixes integer-promotion/sign issues on 16-bit `int`; adds streaming `z` verify path. |
| wolfcrypt/src/sha512.c | Adds octet-wise word load/store and corrects length carry/length placement for `CHAR_BIT!=8`. |
| wolfcrypt/src/sha3.c | Forces bytewise Keccak absorb/squeeze for `WOLFSSL_NO_OCTET_BYTE` and adds squeeze helper. |
| wolfcrypt/src/sha256.c | Adds octet-wise word load/store and corrects length carry/length placement for `CHAR_BIT!=8`. |
| wolfcrypt/src/random.c | Fixes DRBG serialization/addition helpers for non-8-bit “byte” targets. |
| wolfcrypt/src/misc.c | Fixes rotate helpers to use `CHAR_BIT`-based bit width when needed. |
| wolfcrypt/src/coding.c | Ensures Base64 CT decode returns `0xFF` for invalid chars even when `byte` is wider than 8 bits. |
| wolfcrypt/benchmark/benchmark.c | Adds static buffers for `WOLFSSL_NO_MALLOC` benchmarking and adjusts frees/allocations accordingly. |
| scripts/ti-c2000/user_settings.h | Adds minimal CI-only config for cl2000 compile-guard. |
| scripts/ti-c2000/compile.sh | Adds compile-only script to build a scoped source set with TI cl2000. |
| .github/workflows/ti-c2000-compile.yml | Adds CI workflow to download/cache TI CGT and run the compile-only guard. |
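
The `CHAR_BIT`-based bit width used in the rotate helper and static-assert fixes can be sketched as follows. This is an illustrative sketch rather than the wolfSSL implementation: `BITS_OF`, `rotl32`, and `bit_mask` are hypothetical names. The point is that `CHAR_BIT * sizeof(t)` gives 32 for a 32-bit type on both an 8-bit-byte host (8 x 4) and a 16-bit-byte target (16 x 2), whereas `8 * sizeof(t)` would give 16 on the latter.

```c
#include <limits.h>
#include <stdint.h>

/* Bit width of a type: CHAR_BIT * sizeof, not 8 * sizeof. */
#define BITS_OF(t) (CHAR_BIT * sizeof(t))

/* Rotate a 32-bit value left by n bits. The width comes from BITS_OF so it
 * stays 32 whether sizeof(uint32_t) is 4 (8-bit bytes) or 2 (16-bit bytes). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    const unsigned bits = (unsigned)BITS_OF(uint32_t);

    n %= bits;
    if (n == 0) {
        return x;  /* avoid shifting by the full width, which is undefined */
    }
    return (x << n) | (x >> (bits - n));
}

/* Single-bit mask for n < 32. The constant is 1UL, not 1U, so the shift is
 * done in a type of at least 32 bits even where int is only 16 bits wide. */
static uint32_t bit_mask(unsigned n)
{
    return (uint32_t)(1UL << n);
}
```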


Copilot

Pull request overview

Copilot reviewed 30 out of 30 changed files in this pull request and generated 2 comments.

…I C2000 C28x) - core types, misc octet helpers, base64, DRBG

…tation for CHAR_BIT != 8

… CHAR_BIT != 8

…MLDSA_VERIFY_SMALLEST_MEM

…<->mp conversion for CHAR_BIT != 8

Curve448/Ed448 build with the CURVE448_SMALL / ED448_SMALL byte-array field backend (the default fe_448 backend needs __uint128_t for the sc448 mod-order arithmetic, which the C28x toolchain lacks). The SMALL fe448 carry-stores wrote each limb through a (word8) cast that does not truncate to an octet when a C byte is wider than 8 bits, so the next carry re-read saw a corrupted limb; mask each carry-store with WC_OCTET (a no-op on the usual 8-bit-byte targets).
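
The carry-store failure described in this commit message can be reproduced on a host with a small sketch. It is not the fe448 code: `cell_t` is a `uint16_t` standing in for the 16-bit C28x byte, and `add_small` with its `mask` flag is a hypothetical helper that adds a small value into a little-endian array of octet limbs, either with or without the octet mask on each store.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-bit addressable cell (the C28x "byte"). */
typedef uint16_t cell_t;

/* Add a small value into n little-endian octet limbs, propagating the carry.
 * With mask != 0 each store keeps only the low 8 bits (the WC_OCTET idea).
 * With mask == 0 the store mimics a bare narrowing cast on a wide-byte
 * target, where bits 8..15 survive in the limb. */
static void add_small(cell_t *limb, size_t n, unsigned add, int mask)
{
    uint32_t carry = add;
    size_t i;

    for (i = 0; i < n; i++) {
        carry += limb[i];  /* re-read the limb stored earlier */
        limb[i] = mask ? (cell_t)(carry & 0xFFu) : (cell_t)carry;
        carry >>= 8;       /* pass the overflow on to the next limb */
    }
}
```

Adding 1 to the limbs `{0xFF, 0x00}` gives `{0x00, 0x01}` with the mask. Without it, the first limb is left holding `0x100`, which is no longer a valid octet and corrupts any later arithmetic that re-reads it.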

…I C2000 compile CI and docs

github-actions · 2026-06-30T16:07:02Z

retest this please

github-actions · 2026-06-30T16:12:55Z

MemBrowse Memory Report

gcc-arm-cortex-m4

FLASH: .text -64 B (-0.0%, 198,988 B / 262,144 B, total: 76% used)

gcc-arm-cortex-m7

FLASH: .text -64 B (-0.0%, 198,988 B / 262,144 B, total: 76% used)
No memory changes detected for:
gcc-arm-cortex-m0plus
gcc-arm-cortex-m3
gcc-arm-cortex-m4-baremetal
gcc-arm-cortex-m4-crypto-only
gcc-arm-cortex-m4-dtls13
gcc-arm-cortex-m4-min-ecc
gcc-arm-cortex-m4-openssl-compat
gcc-arm-cortex-m4-pkcs7
gcc-arm-cortex-m4-pq
gcc-arm-cortex-m4-rsa-only
gcc-arm-cortex-m4-sp-math
gcc-arm-cortex-m4-tls12
gcc-arm-cortex-m4-tls13
gcc-arm-cortex-m7-pq
gcc-arm-cortex-m7-tls13
linuxkm-pie
linuxkm-standard
stm32-sim-stm32h753

dgarske self-assigned this Jun 18, 2026

Copilot AI review requested due to automatic review settings June 18, 2026 00:15

Copilot started reviewing on behalf of dgarske June 18, 2026 00:15 View session

Copilot AI reviewed Jun 18, 2026


Comment thread wolfssl/wolfcrypt/types.h Outdated

Comment thread wolfcrypt/benchmark/benchmark.c

dgarske force-pushed the ti_c25 branch 4 times, most recently from 39c343a to afaf660 Compare June 24, 2026 22:28

dgarske requested a review from Copilot June 24, 2026 22:30

Copilot started reviewing on behalf of dgarske June 24, 2026 22:30 View session

Copilot AI reviewed Jun 24, 2026


Comment thread wolfssl/wolfcrypt/types.h

Comment thread wolfcrypt/src/sha3.c

dgarske force-pushed the ti_c25 branch 3 times, most recently from 853641c to 0f8a445 Compare June 26, 2026 04:40

dgarske added 6 commits June 30, 2026 08:55

wolfcrypt: add WOLFSSL_WIDE_BYTE support for CHAR_BIT != 8 targets (T…

19c8674

…I C2000 C28x) - core types, misc octet helpers, base64, DRBG

sha: octet-correct SHA-1/SHA-2 byte I/O and 32-bit split Keccak permu…

5de73d8

…tation for CHAR_BIT != 8

aes/chacha: octet-correct block, key, keystream and XTS-tweak I/O for…

7fb4975

… CHAR_BIT != 8

mldsa/mlkem: correct ML-DSA and ML-KEM on CHAR_BIT != 8; add WOLFSSL_…

89773bc

…MLDSA_VERIFY_SMALLEST_MEM

test/benchmark/ci: CHAR_BIT != 8 test vectors, NO_MALLOC benchmark, T…

b5dfdc2

…I C2000 compile CI and docs

dgarske force-pushed the ti_c25 branch from 0f8a445 to b5dfdc2 Compare June 30, 2026 16:06

dgarske marked this pull request as ready for review June 30, 2026 16:06

dgarske assigned wolfSSL-Bot and unassigned dgarske Jun 30, 2026

dgarske requested a review from SparkiDev June 30, 2026 16:07

dgarske wants to merge 6 commits into wolfSSL:master from dgarske:ti_c25

3 participants
