
Docs: refresh DSL→DAG→StableHLO→IREE pipeline + beginner tutorial #526

@michalharakal

Description

Problem

The Antora docs describe the compilation chain as targeting XLA (docs/modules/ROOT/pages/tutorials/hlo-getting-started.adoc), but the real target is IREE, and the recent #520 and #525 work has changed how weights, type annotations, and module headers are emitted. A beginner reading the docs today gets an outdated mental model and no runnable end-to-end example.

docs/modules/ROOT/pages/reference/architecture.adoc is currently 11 lines that point at the SVG diagram and explicitly invite a deeper write-up, which makes it a good place to add the new content.

What's stale / missing

In hlo-getting-started.adoc

  • The "Hardware Target Compilation via XLA" section should be rewritten for IREE: mention iree-compile, the VMFB output format, and target backends such as vulkan-spirv, llvm-cpu, and cuda. The prerequisites section currently installs XLA / CUDA / ROCm toolchains; it should point at IREE instead.
  • Pipeline diagram stops at "XLA Compiler → Hardware Executables". Extend: StableHLO MLIR → iree-compile → .vmfb + (optional) .irpa → iree-run-module.
  • The rgb2grayscale StableHLO example (line ~189) is accurate after #520 (Fix stablehlo.transpose and stablehlo.dot_general MLIR emission): contracting_dims = [3] x [0] and the (inputType) -> outputType signature are correct, but it doesn't show the newer constant-materialization paths.
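One possible shape for the extended diagram (the backend name is just an example; any IREE target backend fits):

```
DSL (Kotlin) → DAG → StableHLO MLIR (.mlir)
                          │ iree-compile --iree-hal-target-backends=llvm-cpu
                          ▼
               .vmfb module (+ optional .irpa parameter archive)
                          │ iree-run-module
                          ▼
                       results
```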

Missing sections entirely

In reference/architecture.adoc

  • Almost empty. Load-bearing content from this issue should land here:
    • The layering: skainet-lang → skainet-compile-dag → skainet-compile-hlo → skainet-io-iree-params → iree-compile → runtime.
    • Where TensorSpec / TensorEncoding / BufferHandle fit (they are the shared vocabulary across DSL, compile, and IO).
    • The policy seam and why the converter never writes numerical bytes.
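To make the "shared vocabulary" and "policy seam" bullets concrete, the page could carry a small sketch like the one below. This is a hypothetical Kotlin illustration only: the real TensorSpec / TensorEncoding / BufferHandle definitions in the skainet modules will differ. The point it demonstrates is the separation of roles — specs describe tensors, handles name byte buffers, and nothing at the converter layer ever carries the numerical payload.

```kotlin
// HYPOTHETICAL sketch: illustrates the roles only; the actual types in
// the skainet-compile-* / skainet-io-iree-params modules are different.
enum class TensorEncoding { F32, F16, I8 }

// Shape + encoding: the shared vocabulary across DSL, compile, and IO layers.
data class TensorSpec(val shape: List<Int>, val encoding: TensorEncoding) {
    val elementCount: Long
        get() = shape.fold(1L) { acc, dim -> acc * dim }
}

// A handle names a weight buffer without carrying its bytes, which is
// why the converter can emit MLIR without ever writing numerical data.
data class BufferHandle(val key: String, val byteLength: Long)

fun main() {
    val w = TensorSpec(shape = listOf(2, 2), encoding = TensorEncoding.F32)
    println(w.elementCount) // 4
    println(BufferHandle("model::w", 16))
}
```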

Beginner tutorial — proposed structure

Split hlo-getting-started.adoc or add a new tutorials/first-model-to-iree.adoc. Staircase from minimum to meaningful:

Stage 1 — element-wise `y = 2x + 1`

  • Smallest possible graph (one input, two ops, one output). No weights.
  • Show: DSL Kotlin, the emitted StableHLO text, iree-compile --iree-hal-target-backends=llvm-cpu invocation, iree-run-module call with a hand-crafted input, expected output. Reader can reproduce in under 5 minutes on a laptop.
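For Stage 1, the emitted StableHLO could look roughly like this. This is a hand-written sketch, not actual compiler output; the real emission may name values, pick shapes, and materialize constants differently:

```mlir
// Sketch of y = 2x + 1 over a rank-1 f32 tensor (illustrative only).
func.func @main(%x: tensor<4xf32>) -> tensor<4xf32> {
  %two = stablehlo.constant dense<2.0> : tensor<4xf32>
  %one = stablehlo.constant dense<1.0> : tensor<4xf32>
  %scaled = stablehlo.multiply %x, %two : tensor<4xf32>
  %y = stablehlo.add %scaled, %one : tensor<4xf32>
  return %y : tensor<4xf32>
}
```

Compiled with `iree-compile --iree-hal-target-backends=llvm-cpu model.mlir -o model.vmfb` and run with `iree-run-module --module=model.vmfb --function=main --input="4xf32=1 2 3 4"`, this should produce 3 5 7 9.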

Stage 2 — linear layer `y = Wx + b` with tiny inline weights

  • 2×2 weight matrix, size-2 bias. Introduces parameter/weight ops but keeps them inline (policy defaults to InlineAlways). Reader sees a real stablehlo.constant dense<[[...]]> in the emitted MLIR.
  • Hand-verify the math: [1, 2] × [[1, 0], [0, 1]] + [0, 0] = [1, 2].
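The hand-verification can be mirrored in a few lines of plain Kotlin (no skainet APIs, just the arithmetic) so readers can sanity-check the numbers against what iree-run-module prints:

```kotlin
// Plain-Kotlin check of the Stage 2 math: y = Wx + b, where W is the
// 2x2 identity matrix and b is a zero bias, so y must equal x.
fun linear(x: FloatArray, w: Array<FloatArray>, b: FloatArray): FloatArray =
    FloatArray(b.size) { j ->
        b[j] + x.indices.sumOf { i -> (x[i] * w[i][j]).toDouble() }.toFloat()
    }

fun main() {
    val x = floatArrayOf(1f, 2f)
    val w = arrayOf(floatArrayOf(1f, 0f), floatArrayOf(0f, 1f))
    val b = floatArrayOf(0f, 0f)
    println(linear(x, w, b).toList()) // [1.0, 2.0]
}
```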

Stage 3 — same linear layer with external weights

  • Same Kotlin DSL, with a single-line change to createBasic(ExternalAlways()). Shows how externalParameters comes back populated, how to write it out with IrpaWriter, and how to invoke iree-compile --iree-opt-import-parameters=my.irpa. Same runtime behavior, but the .mlir text no longer contains weight values.
  • Builds intuition for why this matters at scale (Whisper-tiny: 151 MB → <1 MB text).
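The tutorial could contrast the Stage 2 and Stage 3 MLIR directly. After externalization, the weight appears as a named parameter reference instead of inline data; the attribute syntax below is an assumption based on IREE's stream parameter representation and may not match what skainet actually emits:

```mlir
// Assumed shape of an externalized weight (IREE parameter reference).
// The inline `stablehlo.constant dense<[[...]]>` from Stage 2 is gone;
// the bytes now live in the .irpa archive under "model"::"w".
util.global private @w = #stream.parameter.named<"model"::"w"> : tensor<2x2xf32>
```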

Stage 4 (optional, later) — rgb2grayscale revisited

  • The existing example, but end-to-end runnable against IREE on CPU. Keep it as the "second example" for readers who want something slightly more meaningful than a scalar op.

Out of scope for this issue

  • Mobile deployment tutorial (separate issue when the Android runtime story is ready).
  • Training tutorials — SKaiNET is inference-first today.
  • Full Diátaxis reorganization. This issue refreshes three specific pages; broader IA work is a separate thread.

Suggested commit cadence

Three small PRs beat one big one here:

  1. Refresh hlo-getting-started.adoc — rewrite XLA references to IREE, fix pipeline diagrams, update prerequisites. No new content.
  2. Add the beginner tutorial (tutorials/first-model-to-iree.adoc) with Stages 1–3 above. New page, linked from the tutorials index and from hlo-getting-started.adoc.
  3. Flesh out reference/architecture.adoc — layering, shared vocabulary, policy seam. Link to #523 (Design: externalize weights via IREE parameter archive, supersedes #519) for the weight-externalization rationale.

Context / references
