Module
skainet-compile-hlo / ConstantOperationsConverter
Problem
All weight tensors are emitted as inline stablehlo.constant dense<[[...]]> literals. For Whisper-tiny.en, this produces a 151 MB MLIR file where most of the content is floating-point numbers in text form.
Impact
- Extremely slow to parse (minutes for iree-compile to read).
- SSA type tracking in post-processors fails on multi-megabyte lines.
- Not scalable to larger models — Whisper small/medium would be gigabytes of text MLIR.
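As a back-of-envelope check on the size blow-up (a sketch with illustrative assumptions, not measured data): an f32 printed inside a dense<...> literal costs roughly 10-15 characters of text, versus 4 bytes in a binary resource, so text MLIR is several times larger before any parsing cost is considered.

```python
# Rough text-vs-binary sizing for inline weight literals.
# chars_per_float = 13 is an assumed average width for a printed f32
# such as "-0.123456789" plus a separator; real output varies.

def text_literal_mb(num_floats: int, chars_per_float: int = 13) -> float:
    """Approximate size of an inline dense<...> literal, in MB of text."""
    return num_floats * chars_per_float / 1e6

def binary_mb(num_floats: int) -> float:
    """Size of the same floats stored as raw f32 binary, in MB."""
    return num_floats * 4 / 1e6

if __name__ == "__main__":
    n = 10_000_000  # hypothetical parameter count, for illustration only
    print(f"text:   ~{text_literal_mb(n):.0f} MB")  # ~130 MB
    print(f"binary: ~{binary_mb(n):.0f} MB")        # ~40 MB
```

The exact character width depends on the printer's float formatting, but the 3x-plus expansion factor holds for any plausible width, which is consistent with a tiny model producing a 151 MB file.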
Suggested fix options
- External weight files: emit stablehlo.constant with a reference to an external binary file (IREE supports #util.byte_pattern or resource loading).
- Splat constants for zeros: when VoidTensorOps produces zero tensors, emit a dense<0.0> splat instead of spelling out every element.
- Separate weight serialization: emit the MLIR structure with placeholder constants and load weights at compile time via IREE's parameter mechanism.
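The first two options might look roughly like this in the emitted IR (a sketch, not verified against the current converter; encoder_weight_0 is a hypothetical resource key):

```mlir
// Splat constant: text size is constant regardless of tensor size.
%zeros = stablehlo.constant dense<0.0> : tensor<1x384xf32>

// Resource-backed constant: the literal is replaced by a key into a
// binary blob carried outside the textual op (MLIR dialect resources).
%w = stablehlo.constant dense_resource<encoder_weight_0> : tensor<384x384xf32>

// Compare the current output, which inlines every element as text:
// %w = stablehlo.constant dense<[[0.0117, -0.0231, ...], ...]> : tensor<384x384xf32>
```

The dense_resource form is what the iree-import-onnx path effectively relies on today, which is why the ONNX route scales past tiny models while the native DSL path does not.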
Context
Filed from skainet-whisper IREE GPU bring-up (2026-04-18) on branch feature/iree-vulkan-gpu targeting SKaiNET 0.18.0. The ONNX path (ONNX → iree-import-onnx → MLIR → iree-compile → VMFB) works on device today because iree-import-onnx uses external resources for weights. The native SKaiNET DSL path will need an equivalent mechanism to be scalable beyond tiny models.
Test to reproduce:
./gradlew :SKaiNET-voice:testDebugUnitTest --tests "*WhisperHloExportTest*"
# produces SKaiNET-voice/build/iree/encoder_skainet.mlir
ls -lh SKaiNET-voice/build/iree/encoder_skainet.mlir