Problem
The Antora docs describe the compilation chain as targeting XLA (docs/modules/ROOT/pages/tutorials/hlo-getting-started.adoc), but the real target is IREE — and the recent #520–#525 work has changed how weights, type annotations, and module headers are emitted. A beginner reading the docs today gets an outdated mental model and no runnable end-to-end example.
docs/modules/ROOT/pages/reference/architecture.adoc is 11 lines pointing at the SVG diagram and explicitly inviting a deeper write-up. Good place to add the new content.
What's stale / missing
In hlo-getting-started.adoc
"Hardware Target Compilation via XLA" section should be rewritten for IREE. Mention iree-compile, VMFB output, and target backends like vulkan-spirv, llvm-cpu, cuda. The prerequisites section currently installs XLA / CUDA / ROCm toolchains — it should point at IREE instead.
The pipeline diagram should show StableHLO MLIR → iree-compile → .vmfb + (optional) .irpa → iree-run-module.
The page shows emitted syntax such as contracting_dims = [3] x [0] and (inputType) -> outputType, but doesn't show the newer constant-materialization paths.
Missing sections entirely
Weight externalization (PRs B–E; Design: externalize weights via IREE parameter archive (supersedes #519) #523): the ConstantMaterializationPolicy seam, .irpa sidecar packaging, the #flow.parameter.named<"scope"::"key"> MLIR idiom, and when to stay inline vs. go external. Include a reference example of the emitted MLIR with both paths side by side.
skainet-io-iree-params: document IrpaWriter, how to pair it with StableHloConverter.convert(...).externalParameters, and the iree-compile --iree-opt-import-parameters=... CLI invocation.
What the converter guarantees about emitted syntax: (inputType) -> outputType signatures; reshape pulls its source type from the SSA value map. Worth a short section because it's a contract callers depend on.
In reference/architecture.adoc
Layering: skainet-lang → skainet-compile-dag → skainet-compile-hlo → skainet-io-iree-params → iree-compile → runtime.
Where TensorSpec / TensorEncoding / BufferHandle fit (they are the shared vocabulary across DSL, compile, and IO).
The policy seam and why the converter never writes numerical bytes.
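The side-by-side example called for above might look like the following sketch. The shapes, the @linear_inline / @w names, and the "model"::"w" scope/key pair are illustrative choices, not taken from the repo:

```mlir
// Inline path (ConstantMaterializationPolicy = InlineAlways):
// the weight bytes live in the .mlir text itself.
func.func @linear_inline(%x: tensor<1x2xf32>) -> tensor<1x2xf32> {
  %w = stablehlo.constant dense<[[1.0, 0.0], [0.0, 1.0]]> : tensor<2x2xf32>
  // ... dot / add ops consuming %w ...
}

// External path (ExternalAlways): the converter emits only a named
// reference; the actual bytes ship in the .irpa sidecar.
util.global private @w = #flow.parameter.named<"model"::"w"> : tensor<2x2xf32>
```

The contrast makes the policy seam concrete: the graph is identical, only the materialization of the constant changes.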
Beginner tutorial — proposed structure
Split hlo-getting-started.adoc or add a new tutorials/first-model-to-iree.adoc. Staircase from minimum to meaningful:
Stage 1 — element-wise `y = 2x + 1`
Smallest possible graph (one input, two ops, one output). No weights.
Show: the Kotlin DSL, the emitted StableHLO text, the iree-compile --iree-hal-target-backends=llvm-cpu invocation, an iree-run-module call with a hand-crafted input, and the expected output. A reader can reproduce this in under 5 minutes on a laptop.
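Hand-written StableHLO for the Stage 1 graph (a sketch: the tensor<4xf32> shape and the @main name are assumptions, not actual converter output):

```mlir
func.func @main(%x: tensor<4xf32>) -> tensor<4xf32> {
  %two = stablehlo.constant dense<2.0> : tensor<4xf32>
  %one = stablehlo.constant dense<1.0> : tensor<4xf32>
  %scaled = stablehlo.multiply %x, %two : tensor<4xf32>
  %y = stablehlo.add %scaled, %one : tensor<4xf32>
  return %y : tensor<4xf32>
}
```

Compiling this with iree-compile --iree-hal-target-backends=llvm-cpu and running iree-run-module --function=main --input="4xf32=0 1 2 3" should map each element through 2x + 1, i.e. 1 3 5 7.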
Stage 2 — linear layer `y = Wx + b` with tiny inline weights
A 2×2 weight matrix and a size-2 bias. Introduces parameter/weight ops but keeps them inline (the policy defaults to InlineAlways). The reader sees a real stablehlo.constant dense<[[...]]> in the emitted MLIR and can check the output by hand: [1, 2] × [[1, 0], [0, 1]] + [0, 0] = [1, 2].
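The Stage 2 numbers can be verified with plain Kotlin, no SKaiNET or IREE involved. This is just the y = Wx + b arithmetic the tutorial expects, with the identity weights and zero bias from the example:

```kotlin
// Reference arithmetic for the Stage 2 linear layer: y = Wx + b.
fun matVec(w: Array<FloatArray>, x: FloatArray, b: FloatArray): FloatArray =
    FloatArray(w.size) { i ->
        var acc = b[i]                       // start from the bias term
        for (j in x.indices) acc += w[i][j] * x[j]
        acc
    }

fun main() {
    val w = arrayOf(floatArrayOf(1f, 0f), floatArrayOf(0f, 1f)) // identity weights
    val b = floatArrayOf(0f, 0f)                                 // zero bias
    val y = matVec(w, floatArrayOf(1f, 2f), b)
    println(y.toList()) // [1.0, 2.0]
}
```

Because W is the identity and b is zero, the input passes through unchanged, which is exactly why it makes a good smoke test for the compiled module.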
Stage 3 — same linear layer with external weights
Same Kotlin DSL, with a single-line change to createBasic(ExternalAlways()). Shows how externalParameters comes back populated, how to write it with IrpaWriter, and how to invoke iree-compile --iree-opt-import-parameters=my.irpa. Same runtime behavior, but the .mlir text no longer contains weight values.
Builds intuition for why this matters at scale (Whisper-tiny: 151 MB → <1 MB text).
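A pseudocode-level sketch of the Stage 3 wiring. IrpaWriter, StableHloConverter, createBasic, ExternalAlways, and externalParameters are named in this issue, but the remaining names (mlirText, write, the model argument) are placeholders, not the verified SKaiNET API:

```kotlin
import java.io.File

// Hypothetical wiring -- anything not quoted in the issue text is assumed.
fun exportExternal(model: Any) {
    val converter = StableHloConverter.createBasic(ExternalAlways())
    val result = converter.convert(model)

    // The .mlir text now references parameters instead of embedding bytes.
    File("model.mlir").writeText(result.mlirText)

    // Pack the actual weight values into the .irpa sidecar.
    IrpaWriter().write(File("model.irpa"), result.externalParameters)

    // Then compile with the archive:
    //   iree-compile model.mlir --iree-opt-import-parameters=model.irpa \
    //     --iree-hal-target-backends=llvm-cpu -o model.vmfb
}
```

The tutorial should pin down the real signatures; the point here is only the shape of the flow: convert, write text, write archive, compile.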
Stage 4 (optional, later) — rgb2grayscale revisited
The existing example, but end-to-end runnable against IREE on CPU. Keep it as the "second example" for readers who want something slightly more meaningful than a scalar op.
Out of scope for this issue
Mobile deployment tutorial (separate issue when the Android runtime story is ready).
Training tutorials — SKaiNET is inference-first today.
Full Diátaxis reorganization. This issue refreshes three specific pages; broader IA work is a separate thread.
Suggested commit cadence
Three small PRs beat one big one here:
Refresh hlo-getting-started.adoc — rewrite XLA references to IREE, fix pipeline diagrams, update prerequisites. No new content.
Add beginner tutorial — tutorials/first-model-to-iree.adoc with Stages 1–3 above. New page, linked from the tutorials index and from hlo-getting-started.adoc.
Expand reference/architecture.adoc — layering, shared vocabulary, policy seam. Link to Design: externalize weights via IREE parameter archive (supersedes #519) #523 for the weight-externalization rationale.
Context / references
The #520–#525 converter work (including #flow.parameter.named emission).
feature/iree-vulkan-gpu branch in skainet-whisper (out-of-repo; MagentaTV One target: Amlogic Mali Valhall, Android 14).