From 4a7456241a2b3d16ffcc6c7c367e99b7ddc5c459 Mon Sep 17 00:00:00 2001 From: Michal Harakal Date: Mon, 13 Apr 2026 16:02:38 +0200 Subject: [PATCH 1/4] Define component attributes for operator-design article (#496 step 1) Adds an `asciidoc.attributes` block to `docs/antora.yml` defining the four attributes `operator-design.adoc` references but nothing declares: framework_name = SKaiNET ksp_version = 2.2.21-2.0.5 dokka_version = 2.1.0 asciidoctorj_version = 3.0.0 Antora treats component-level attributes as defaults for every page in the component, so the eight `{FRAMEWORK_NAME}` / `{KSP_VERSION}` / `{DOKKA_VERSION}` / `{ASCIIDOCTORJ_VERSION}` references across lines 1, 8, 30, 78, 176, 177, 178, 215 of `operator-design.adoc` now resolve to real values instead of falling back to the literal attribute-name placeholder and producing a warning. Net warning count dropped from 13 to 7. The remaining 7 are the six pandoc section-level artifacts in `skainet-for-ai.adoc` and `arduino-c-codegen.adoc` (commit 2) plus the kroki mermaid 400 on the large `hlo-getting-started.adoc` diagram (commit 4). First step of the Antora migration polish pass. See #496. Co-Authored-By: Claude Opus 4.6 (1M context) --- docs/antora.yml | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/docs/antora.yml b/docs/antora.yml index 05bf9566..947dcb18 100644 --- a/docs/antora.yml +++ b/docs/antora.yml @@ -3,3 +3,15 @@ title: SKaiNET version: ~ nav: - modules/ROOT/nav.adoc + +# Component-level attributes flow to every page. Defined here so the +# operator-design article (and any future page) can reference them +# without each page declaring its own attributes block. If you need +# to override a value on a per-page basis, declare it above the +# first section heading on that page. 
+asciidoc: + attributes: + framework_name: SKaiNET + ksp_version: 2.2.21-2.0.5 + dokka_version: 2.1.0 + asciidoctorj_version: 3.0.0 From 7a459bf33814a0713348560e9f9b1b08f88bf406 Mon Sep 17 00:00:00 2001 From: Michal Harakal Date: Mon, 13 Apr 2026 16:17:33 +0200 Subject: [PATCH 2/4] Strip pandoc anchors + promote pandoc heading levels (#496 step 2) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Clears 6 section-title-out-of-sequence warnings across `skainet-for-ai.adoc` (5 occurrences) and `arduino-c-codegen.adoc` (1 occurrence) that were left over from the pandoc markdown -> asciidoc conversion in #494. Two interacting issues: 1. Pandoc generated 20 anchor lines of the form `[[1-tape-based-tracing]]`, `[[2-type-safe-tensor-creation-dsl]]` etc. These are standalone block anchors sitting ABOVE their section heading. In this position Asciidoctor treats them as bibliographic block markers that bind to the next block — which prevents the following `==` / `===` from registering as a section-opening heading, so the parser's section-level counter drifts and every subsequent nested heading trips the "expected level N, got level N+1" validator. The anchors are all the auto-generated slug form of the heading text they precede. Asciidoctor auto-generates equivalent id-from-title anchors for every heading. Deleting these 20 anchors sacrifices nothing — the id format is the same, the #fragment URLs stay stable. 2. Pandoc converts markdown `#` to asciidoc `==` rather than the more idiomatic `=` (page title). That made every converted page "off by one" with no level-0 title. Promoting every heading by one level (removing one `=`) fixes this: the page now starts with `= Title` and section levels cascade naturally from there. Applied via `sed -E -i '' 's/^=(=+ )/\1/'` on the two affected files — matches `^=` followed by one-or-more additional `=` followed by a space, preserves block delimiters like a bare `====$` that aren't headings. 
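As a quick sanity check, the same expression can be exercised on a pipe (the `-i ''` in-place flag dropped; the sample lines are hypothetical but mirror the pandoc output shape):

```shell
# Each heading loses exactly one '='; the bare '====' block delimiter
# is untouched because the regex requires a space after the run of '='.
demoted=$(printf '== Page Title\n=== Section\n====\nexample block\n====\n' \
  | sed -E 's/^=(=+ )/\1/')
printf '%s\n' "$demoted"
# -> = Page Title
#    == Section
#    ====
#    example block
#    ====
```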
Applied only to files that were flagged; the rest of the migration's converted files had clean hierarchies already. Net: warning count drops from 7 to 1. The remaining warning is the kroki mermaid 400 on the large diagram in `hlo-getting-started.adoc` which commit 4 will handle. Second step of the Antora migration polish pass. See #496. Co-Authored-By: Claude Opus 4.6 (1M context) --- .../explanation/perf/java-25-cpu-backend.adoc | 1 - .../pages/explanation/skainet-for-ai.adoc | 31 +++++++------------ .../ROOT/pages/how-to/arduino-c-codegen.adoc | 27 +++++++--------- .../pages/how-to/java-model-training.adoc | 2 -- .../pages/tutorials/hlo-getting-started.adoc | 3 -- .../pages/tutorials/java-getting-started.adoc | 2 -- 6 files changed, 23 insertions(+), 43 deletions(-) diff --git a/docs/modules/ROOT/pages/explanation/perf/java-25-cpu-backend.adoc b/docs/modules/ROOT/pages/explanation/perf/java-25-cpu-backend.adoc index 2b74c01c..c3183e2d 100644 --- a/docs/modules/ROOT/pages/explanation/perf/java-25-cpu-backend.adoc +++ b/docs/modules/ROOT/pages/explanation/perf/java-25-cpu-backend.adoc @@ -21,7 +21,6 @@ Required flags remain: --enable-preview --add-modules jdk.incubator.vector .... -[[jit--c2-improvements-mapped-to-skainet-ops]] ===== JIT / C2 improvements mapped to SKaiNET ops These are automatic — the JIT produces better native code for existing bytecode. 
diff --git a/docs/modules/ROOT/pages/explanation/skainet-for-ai.adoc b/docs/modules/ROOT/pages/explanation/skainet-for-ai.adoc index 102aa5ac..365a6a3b 100644 --- a/docs/modules/ROOT/pages/explanation/skainet-for-ai.adoc +++ b/docs/modules/ROOT/pages/explanation/skainet-for-ai.adoc @@ -1,10 +1,8 @@ -[[skainet-core-technology-tensor--data-guide]] -== SKaiNET Core Technology: Tensor & Data Guide += SKaiNET Core Technology: Tensor & Data Guide This document provides technical instructions for AI agents and developers on using SKaiNET's Tensor and Data API as a modern, type-safe replacement for NDArray or Python's NumPy library. -[[1-fundamental-architecture-tensor-composition]] -=== 1. Fundamental Architecture: Tensor Composition +== 1. Fundamental Architecture: Tensor Composition Unlike traditional libraries where a Tensor is a monolithic object, SKaiNET adopts a *compositional architecture*. A `Tensor++<++T, V++>++` is composed of two primary components: @@ -24,12 +22,11 @@ interface Tensor { } ---- -[[2-type-safe-tensor-creation-dsl]] -=== 2. Type-Safe Tensor Creation (DSL) +== 2. Type-Safe Tensor Creation (DSL) SKaiNET provides a powerful Type-Safe DSL for tensor creation. It ensures that the data provided matches the specified `DType` at compile-time (or through the DSL's internal validation). -==== Creation with `ExecutionContext` +=== Creation with `ExecutionContext` Tensors are always created within an `ExecutionContext`, which provides the necessary `TensorOps` and `TensorDataFactory`. @@ -41,7 +38,7 @@ val ones = ctx.ones(Shape(1, 10), Int32::class) val full = ctx.full(Shape(5, 5), FP32::class, 42.0f) ---- -==== Expressive Tensor DSL +=== Expressive Tensor DSL For more complex initializations, use the `tensor` DSL: @@ -66,17 +63,16 @@ val customInit = tensor(ctx, Int32::class) { } ---- -[[3-slicing-dsl-api]] -=== 3. Slicing DSL API +== 3. 
Slicing DSL API SKaiNET offers a sophisticated Slicing DSL that allows for creating views or copies of tensor segments with high precision and readability. -==== `sliceView` vs `sliceCopy` +=== `sliceView` vs `sliceCopy` * *`sliceView`*: Creates a `TensorView`, which is a window into the original data (no data copying). * *`sliceCopy`*: Creates a new `Tensor` with a copy of the sliced data. -==== Slicing DSL Syntax +=== Slicing DSL Syntax The `SegmentBuilder` provides several ways to define slices for each dimension: @@ -98,8 +94,7 @@ val view = source.sliceView { } ---- -[[4-core-operations-tensorops]] -=== 4. Core Operations (`TensorOps`) +== 4. Core Operations (`TensorOps`) All mathematical operations are dispatched through the `TensorOps` interface. SKaiNET supports: @@ -109,7 +104,7 @@ All mathematical operations are dispatched through the `TensorOps` interface. SK * *Reductions*: `sum`, `mean`, `variance`. * *Shape Ops*: `reshape`, `flatten`, `concat`, `squeeze`, `unsqueeze`. -==== Operator Overloading +=== Operator Overloading When a tensor is "bound" to ops (e.g., via `OpsBoundTensor`), you can use standard Kotlin operators: @@ -119,8 +114,7 @@ val c = a + b // Calls ops.add(a, b) val d = a * 10 // Calls ops.mulScalar(a, 10) ---- -[[5-summary-table-skainet-vs-numpy]] -=== 5. Summary Table: SKaiNET vs NumPy +== 5. Summary Table: SKaiNET vs NumPy [cols="<,<,<",options="header",] |=== @@ -133,8 +127,7 @@ val d = a * 10 // Calls ops.mulScalar(a, 10) |*Reshape* |`a.reshape(new++_++shape)` |`ctx.ops.reshape(a, Shape(new++_++shape))` |=== -[[6-best-practices-for-ai-integration]] -=== 6. Best Practices for AI Integration +== 6. Best Practices for AI Integration [arabic] . *Context Awareness*: Always pass the `ExecutionContext` to functions that create or manipulate tensors. 
diff --git a/docs/modules/ROOT/pages/how-to/arduino-c-codegen.adoc b/docs/modules/ROOT/pages/how-to/arduino-c-codegen.adoc index 7ef1165c..feb0bf13 100644 --- a/docs/modules/ROOT/pages/how-to/arduino-c-codegen.adoc +++ b/docs/modules/ROOT/pages/how-to/arduino-c-codegen.adoc @@ -1,12 +1,12 @@ -== Arduino C Code Generation += Arduino C Code Generation SKaiNET provides a specialized compiler backend for exporting trained neural networks to highly optimized, standalone C99 code suitable for microcontrollers like Arduino. -=== Overview +== Overview The Arduino C code generation process transforms a high-level Kotlin model into a memory-efficient C implementation. It prioritizes static memory allocation, minimal overhead, and numerical consistency with the original model. -==== Codegen Pipeline +=== Codegen Pipeline [mermaid] ---- @@ -21,18 +21,16 @@ graph TD H --> I[Generated .h/.c files] ---- -=== Technical Deep Dive +== Technical Deep Dive -[[1-tape-based-tracing]] -==== 1. Tape-based Tracing +=== 1. Tape-based Tracing Instead of static analysis of the Kotlin code, SKaiNET uses a dynamic tracing mechanism. When you call `exportToArduinoLibrary`, the framework executes a single forward pass of your model using a specialized `RecordingContext`. * Every operation (Dense, ReLU, etc.) is recorded onto an *Execution Tape*. * This approach handles Kotlin's language features (loops, conditionals) naturally, as it only records the actual operations that were executed. -[[2-compute-graph-construction]] -==== 2. Compute Graph Construction +=== 2. Compute Graph Construction The execution tape is converted into a directed acyclic graph (DAG) called `ComputeGraph`. @@ -40,12 +38,11 @@ The execution tape is converted into a directed acyclic graph (DAG) called `Comp * Edges represent data flow (Tensors). * During this phase, the compiler performs *Shape Inference* to ensure every tensor has a fixed, known size. -[[3-static-memory-management]] -==== 3. Static Memory Management +=== 3. 
Static Memory Management Microcontrollers typically have very limited RAM and lack robust heap management. SKaiNET uses a *Ping-Pong Buffer Strategy* to eliminate dynamic memory allocation (`malloc`/`free`) during inference. -===== Ping-Pong Buffer Strategy +==== Ping-Pong Buffer Strategy The compiler calculates the maximum size required for any intermediate tensor in the graph and allocates exactly two static buffers of that size. @@ -66,8 +63,7 @@ sequenceDiagram * *Buffer Reuse*: Instead of allocating space for every layer's output, buffers are reused. * *Direct Output Optimization*: The first layer reads from the input pointer, and the last layer writes directly to the output pointer, avoiding unnecessary copies. -[[4-code-generation-emission]] -==== 4. Code Generation (Emission) +=== 4. Code Generation (Emission) The `CCodeGenerator` emits C99-compatible code using templates. @@ -80,15 +76,14 @@ The `CCodeGenerator` emits C99-compatible code using templates. int model_inference(const float* input, float* output); ---- -[[5-validation]] -==== 5. Validation +=== 5. Validation The generator performs post-generation validation: * *Static Allocation Check*: Ensures no dynamic allocation is present in the generated source. * *Buffer Alternation Check*: Verifies that the ping-pong strategy is correctly implemented without data races or overwrites. -=== Performance and Constraints +== Performance and Constraints * *Floating Point*: Currently optimized for `FP32`. * *Supported Ops*: `Dense`, `ReLU`, `Sigmoid`, `Tanh`, `Add`, `MatMul`. 
diff --git a/docs/modules/ROOT/pages/how-to/java-model-training.adoc b/docs/modules/ROOT/pages/how-to/java-model-training.adoc index 2abf7d17..ddb82976 100644 --- a/docs/modules/ROOT/pages/how-to/java-model-training.adoc +++ b/docs/modules/ROOT/pages/how-to/java-model-training.adoc @@ -173,7 +173,6 @@ float loss = loop.step(inputBatch, targetBatch); System.out.printf("Step loss: %.4f%n", loss); ---- -[[full-training-with-train]] ==== Full Training with `.train()` `train()` accepts a `Supplier` that produces an `Iterator` of `(input, target)` pairs for each epoch: @@ -194,7 +193,6 @@ System.out.printf("Trained %d epochs, final loss: %.4f%n", Each call to the supplier should return a fresh iterator over the training batches for that epoch. This allows reshuffling between epochs. -[[async-training-with-trainasync]] ==== Async Training with `.trainAsync()` `trainAsync()` runs the training loop on a virtual thread and returns a `CompletableFuture++<++TrainingResult++>++`: diff --git a/docs/modules/ROOT/pages/tutorials/hlo-getting-started.adoc b/docs/modules/ROOT/pages/tutorials/hlo-getting-started.adoc index d7d47a92..65e70b84 100644 --- a/docs/modules/ROOT/pages/tutorials/hlo-getting-started.adoc +++ b/docs/modules/ROOT/pages/tutorials/hlo-getting-started.adoc @@ -98,7 +98,6 @@ flowchart LR === Building Blocks -[[1-hlo-converters]] ==== 1. HLO Converters Converters transform SKaiNET operations into StableHLO operations: @@ -109,7 +108,6 @@ Converters transform SKaiNET operations into StableHLO operations: * *NeuralNetOperationsConverter*: High-level NN operations * *ConstantOperationsConverter*: Constant value operations -[[2-type-system]] ==== 2. Type System HLO uses a strict type system for tensors: @@ -123,7 +121,6 @@ Tensor // Batch, Channel, Height, Width tensor<1x3x224x224xf32> // StableHLO representation ---- -[[3-optimization-framework]] ==== 3. 
Optimization Framework The optimization pipeline includes: diff --git a/docs/modules/ROOT/pages/tutorials/java-getting-started.adoc b/docs/modules/ROOT/pages/tutorials/java-getting-started.adoc index 003a6d46..becdecee 100644 --- a/docs/modules/ROOT/pages/tutorials/java-getting-started.adoc +++ b/docs/modules/ROOT/pages/tutorials/java-getting-started.adoc @@ -21,7 +21,6 @@ For Maven Surefire / exec-maven-plugin, add them to `++<++jvmArgs++>++`. For Gra === Maven Setup -[[1-import-the-bom]] ==== 1. Import the BOM The `skainet-bom` manages all SKaiNET module versions so you never have to keep them in sync manually. Add it to your `++<++dependencyManagement++>++` section: @@ -80,7 +79,6 @@ The `skainet-bom` manages all SKaiNET module versions so you never have to keep ---- -[[2-add-more-modules-as-needed]] ==== 2. Add More Modules as Needed Because the BOM is imported, you can add any module without repeating the version: From ca06cf515ef1251113df2556a953334057ef979f Mon Sep 17 00:00:00 2001 From: Michal Harakal Date: Mon, 13 Apr 2026 16:19:11 +0200 Subject: [PATCH 3/4] Fix CI permission failure on bundleDokkaIntoSite (#496 step 3) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit First run of the #494 docs.yml workflow on CI failed with: > Task :bundleDokkaIntoSite FAILED > Failed to create directory '/home/runner/work/SKaiNET/SKaiNET/docs/build/site/api' Root cause: the Antora step ran the node:20-alpine container as root (the default), so `docs/build/site/` and everything under it was owned by root. The subsequent Gradle `bundleDokkaIntoSite` step runs on the runner host as the `runner` user — which cannot create a subdirectory inside a root-owned tree. Two coupled fixes, both necessary: 1. `.github/workflows/docs.yml`: add `--user $(id -u):$(id -g)` to the `docker run` invocation. The container process now writes as the runner user and everything under `docs/build/site/` is owned correctly when Gradle takes over. 2. 
`docs/antora-playbook.yml`: add a `runtime.cache_dir: ./.cache/antora` setting. Without --user the default $HOME/.cache/antora resolution worked; with --user the container process has no matching passwd entry and $HOME falls back to `/`, so Antora would fail with `Failed to create content cache directory /.cache/antora; EACCES: permission denied`. Pointing cache_dir at a path under the mounted workspace makes it writable by the non-root user. The `.cache/` path is already gitignored via the pre-staged `## antora` section in the repo root .gitignore, so the cache never gets committed. Verified end-to-end locally with the CI flow: rm -rf docs/build/site docs/.cache docker run --rm --user "$(id -u):$(id -g)" \ -v "$PWD:/antora" -w /antora \ skainet-antora:local docs/antora-playbook.yml ./gradlew --no-daemon bundleDokkaIntoSite docs/build/site/ owned by $USER:$GROUP, api/ subtree populated with the Dokka aggregate. Third step of the Antora migration polish pass. This commit is independent of the earlier warning-clearance work — it unblocks CI regardless of what other polish happens next. See #496. Co-Authored-By: Claude Opus 4.6 (1M context) --- .github/workflows/docs.yml | 8 ++++++++ docs/antora-playbook.yml | 10 ++++++++++ 2 files changed, 18 insertions(+) diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml index 32a7ed1e..56426f25 100644 --- a/.github/workflows/docs.yml +++ b/.github/workflows/docs.yml @@ -87,8 +87,16 @@ jobs: cache-to: type=gha,mode=max - name: Build Antora site + # Run the container as the runner user (not root) so the + # files under docs/build/site/ are owned by the same user + # that the subsequent Gradle `bundleDokkaIntoSite` step runs + # as. Without this the Copy task fails with + # "Failed to create directory docs/build/site/api" because + # the Antora container otherwise writes the site tree as + # root and Gradle running as runner can't mkdir inside it. 
run: | docker run --rm \ + --user "$(id -u):$(id -g)" \ -v "${{ github.workspace }}:/antora" \ --workdir /antora/docs \ skainet-antora:local \ diff --git a/docs/antora-playbook.yml b/docs/antora-playbook.yml index 4c7b9bca..42efb873 100644 --- a/docs/antora-playbook.yml +++ b/docs/antora-playbook.yml @@ -2,6 +2,16 @@ site: title: SKaiNET start_page: skainet::index.adoc +# Keep Antora's content cache inside the project tree so the +# container can be run as a non-root user (via `docker run --user +# $(id -u):$(id -g)`). Without this, Antora defaults to +# `$HOME/.cache/antora` which is unwritable when the container +# process has no matching passwd entry and $HOME falls back to `/`. +# The `.cache/` path is already gitignored via the pre-staged +# `## antora` section in the repo root .gitignore. +runtime: + cache_dir: ./.cache/antora + content: sources: - url: /antora From 724f72bdae4637dc90aaec701d2934c97bea9b05 Mon Sep 17 00:00:00 2001 From: Michal Harakal Date: Mon, 13 Apr 2026 16:28:28 +0200 Subject: [PATCH 4/4] Drop kroki, render mermaid locally via mmdc (#496 step 4) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replaces the asciidoctor-kroki dependency with a small local Asciidoctor block processor that invokes the @mermaid-js/mermaid-cli binary baked into the Antora Docker image directly. Eliminates the last build warning AND removes the build-time network dependency on kroki.io entirely. ## Why asciidoctor-kroki sends the diagram source to kroki.io (by default via GET with the source encoded into the URL). The GET path has a 4 KB URL length limit, so larger diagrams come back with HTTP 400 and the block is silently dropped. Switching the extension to POST did not help — kroki.io also rejected the content for a different reason, with an empty response body and no diagnostic. 
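For context, kroki's GET scheme packs the entire diagram source into the URL path as base64url-encoded, deflate-compressed text, so URL length grows with diagram size. A rough sketch of the encoding (gzip standing in for raw zlib deflate, and padding kept, so real kroki URLs differ slightly; diagram content hypothetical):

```shell
# Compress then base64url-encode, the way kroki's GET endpoint expects.
# The encoded blob rides in the URL path, which is where the ~4 KB GET
# length cap bites for large diagrams.
src='graph TD; A-->B;'
enc=$(printf '%s' "$src" | gzip -cn | base64 | tr -d '\n' | tr '+/' '-_')
printf 'encoded length: %s\n' "${#enc}"
# Round-trip to confirm the encoding is lossless:
dec=$(printf '%s' "$enc" | tr '-_' '+/' | base64 -d | gunzip -c)
[ "$dec" = "$src" ] && echo roundtrip-ok
```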
Round-tripping each mermaid block through the network on every build was already a sore point; finding that the only path to reliable rendering was "give up on the external service entirely" made the decision clear. The new pipeline is purely local. For every `[mermaid]` block, the extension: 1. writes the source to /tmp/skainet-mm-*/in.mmd 2. execs mmdc -i in.mmd -o out.svg 3. reads out.svg back 4. emits it as a `pass` block (inline SVG) mermaid-cli was already in the image from day one for the asciidoctor-kroki "local fetch" path. Removing kroki and wiring mermaid-cli directly via a 70-line extension leaves a strictly smaller build dependency tree and is strictly more reliable: no network, no rate limits, no URL length caps, no flakes on CI, deterministic outputs. ## Changes 1. `docs/.docker/Dockerfile`: - Drop `asciidoctor-kroki@0.18` from the npm install list. - `COPY local-mermaid-extension.js /opt/antora/` so the playbook can reference it by absolute path without any volume-mount gymnastics at run time. - Update the image description label. 2. `docs/.docker/local-mermaid-extension.js` (new): Asciidoctor.js block processor mirroring the shape used by asciidoctor-kroki (same onContext / process / createBlock pattern) but dispatching to /opt/antora/node_modules/.bin/mmdc via child_process.execSync with the Puppeteer config the image already writes at /opt/antora/puppeteer-config.json. Renders to a temp dir, reads the SVG, returns it inline via a `pass` block. Cleans the temp dir in a finally. On render failure emits a literal block containing the original mermaid source + the stderr from mmdc and logs a warning, matching the degradation style of the upstream kroki extension. 3. `docs/antora-playbook.yml`: - Swap `asciidoctor-kroki` extension for `/opt/antora/local-mermaid-extension.js`. - Drop the `kroki-fetch-diagram` and `kroki-http-method` attributes — both dead code now. 4. 
`docs/modules/ROOT/pages/tutorials/hlo-getting-started.adoc`: The first render against real mermaid-cli surfaced a previously-hidden authoring bug: one of the `sequenceDiagram` participants was aliased as `Opt`, and `opt` is a mermaid sequenceDiagram keyword (for optional blocks). Mermaid's parser matches keywords case-insensitively and was treating `Opt` as the start of an opt-block, producing: Parse error on line 12: ...HLO->>Opt: Unoptimized IR Expecting '+', '-', '()', 'ACTOR', got 'opt' Rename the alias to `Optimizer` and drop the `as` clause. Kroki had been silently rejecting this diagram for a different reason the whole time; local rendering surfaced the actual bug. ## Verification docker build --no-cache -t skainet-antora:local docs/.docker rm -rf docs/build/site docs/.cache docker run --rm --user "$(id -u):$(id -g)" \ -v "$PWD:/antora" -w /antora \ skainet-antora:local docs/antora-playbook.yml grep -c "<svg" # 3 (one inline SVG per [mermaid] block, all three diagrams) ./gradlew --no-daemon bundleDokkaIntoSite ls docs/build/site/api # full Dokka aggregate present Antora warnings + errors on the full build: 0 + 0. Down from the 13 warnings the Antora migration landed with in #494. Fourth and final step of the Antora migration polish pass. See #496. 
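For reference, a minimal sketch of the keyword collision and the fix (participants other than `Opt`/`Optimizer` are illustrative, not taken from the real diagram):

```mermaid
sequenceDiagram
    participant HLO as StableHLO IR
    %% Fails: mermaid matches the "opt" block keyword case-insensitively,
    %% so an alias named "Opt" derails the parse at its first use:
    %% participant Opt as Optimizer
    %% Fix: use the full name, no "as" clause needed:
    participant Optimizer
    HLO->>Optimizer: Unoptimized IR
    Optimizer->>HLO: Optimized IR
```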
Co-Authored-By: Claude Opus 4.6 (1M context) --- docs/.docker/Dockerfile | 21 +++-- docs/.docker/local-mermaid-extension.js | 91 +++++++++++++++++++ docs/antora-playbook.yml | 13 +-- .../pages/tutorials/hlo-getting-started.adoc | 10 +- 4 files changed, 118 insertions(+), 17 deletions(-) create mode 100644 docs/.docker/local-mermaid-extension.js diff --git a/docs/.docker/Dockerfile b/docs/.docker/Dockerfile index 67c21ba6..fecaca3c 100644 --- a/docs/.docker/Dockerfile +++ b/docs/.docker/Dockerfile @@ -1,8 +1,8 @@ FROM node:20-alpine LABEL org.opencontainers.image.title="SKaiNET Antora" \ - org.opencontainers.image.description="Antora site generator with built-in Mermaid rendering" \ - org.opencontainers.image.source="https://github.com/SKaiNET-developers/SKaiNET-transformers" + org.opencontainers.image.description="Antora site generator with direct local Mermaid rendering (no Kroki round trip)" \ + org.opencontainers.image.source="https://github.com/SKaiNET-developers/SKaiNET" # Chromium for mermaid-cli (puppeteer) RUN apk add --no-cache chromium font-noto @@ -10,25 +10,34 @@ RUN apk add --no-cache chromium font-noto ENV PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium-browser \ PUPPETEER_SKIP_DOWNLOAD=true -# Install Antora + extensions to /opt/antora (not /antora which gets volume-mounted) +# Install Antora + mermaid-cli into /opt/antora (not /antora which gets +# volume-mounted at run time). asciidoctor-kroki is intentionally NOT +# installed — it depends on a Kroki HTTP server (kroki.io or local) +# which returns 400 for large diagrams when using GET and has no +# offline fallback. We render mermaid directly via mermaid-cli through +# the local-mermaid-extension.js asciidoctor block processor. 
WORKDIR /opt/antora RUN npm init -y && npm i --save-exact \ @antora/cli@3.1 \ @antora/site-generator@3.1 \ - asciidoctor-kroki@0.18 \ @mermaid-js/mermaid-cli@11 \ && npm cache clean --force # Make installed modules visible when workdir is the mounted project ENV NODE_PATH=/opt/antora/node_modules -# Mermaid-cli config +# Mermaid-cli config — used by the local-mermaid-extension to drive +# Puppeteer against the pre-installed Alpine Chromium. RUN echo '{ \ "executablePath": "/usr/bin/chromium-browser", \ "args": ["--no-sandbox", "--disable-gpu", "--disable-dev-shm-usage"] \ }' > /opt/antora/puppeteer-config.json -# Verify mermaid works +# Bake the local mermaid extension in at an absolute path so the +# Antora playbook can reference it without any volume-mount gymnastics. +COPY local-mermaid-extension.js /opt/antora/local-mermaid-extension.js + +# Verify mermaid-cli works end to end at image build time. RUN echo 'graph TD; A-->B;' > /tmp/test.mmd \ && npx mmdc -i /tmp/test.mmd -o /tmp/test.svg -p /opt/antora/puppeteer-config.json \ && rm /tmp/test.mmd /tmp/test.svg diff --git a/docs/.docker/local-mermaid-extension.js b/docs/.docker/local-mermaid-extension.js new file mode 100644 index 00000000..35b4c776 --- /dev/null +++ b/docs/.docker/local-mermaid-extension.js @@ -0,0 +1,91 @@ +'use strict' + +/* + * Local mermaid block processor for Asciidoctor.js. + * + * Replaces the asciidoctor-kroki dependency on kroki.io (and its + * GET URL length limit / 400 rejections on large diagrams) with a + * direct invocation of `mmdc` — the @mermaid-js/mermaid-cli binary + * that the SKaiNET Antora Docker image already bakes in for its + * Chromium-backed Puppeteer rendering path. + * + * The extension is registered via the Antora playbook's + * `asciidoc.extensions` list and gets passed the Asciidoctor.js + * `registry` object. For every `[mermaid]\n----\n...\n----` block + * in any page, we: + * + * 1. write the source to a temp file + * 2. 
exec `mmdc -i in.mmd -o out.svg -p puppeteer-config.json` + * (synchronous — Antora processes one page at a time and the + * mermaid-cli call is fast enough that sync is fine) + * 3. read the produced SVG + * 4. inline it via a `pass` block so Asciidoctor emits the raw + * SVG markup straight into the HTML output + * + * On render failure we fall back to a literal block containing + * the original source plus the error message, matching the + * degradation mode asciidoctor-kroki uses. + */ + +const { execSync } = require('child_process') +const { mkdtempSync, writeFileSync, readFileSync, rmSync } = require('fs') +const { tmpdir } = require('os') +const { join } = require('path') + +// Absolute paths baked into /opt/antora at image build time. +// These have to match the Dockerfile that installs mermaid-cli and +// writes the puppeteer config. +const MMDC_BIN = '/opt/antora/node_modules/.bin/mmdc' +const PUPPETEER_CONFIG = '/opt/antora/puppeteer-config.json' + +function renderMermaidToSvg (source) { + const dir = mkdtempSync(join(tmpdir(), 'skainet-mm-')) + const inputPath = join(dir, 'in.mmd') + const outputPath = join(dir, 'out.svg') + writeFileSync(inputPath, source, 'utf8') + try { + execSync( + `${MMDC_BIN} -i ${inputPath} -o ${outputPath} -p ${PUPPETEER_CONFIG} --quiet`, + { stdio: ['ignore', 'ignore', 'pipe'] } + ) + return readFileSync(outputPath, 'utf8') + } finally { + try { rmSync(dir, { recursive: true, force: true }) } catch (_) { /* noop */ } + } +} + +function mermaidBlockFactory () { + return function () { + const self = this + self.named('mermaid') + self.onContext(['listing', 'literal']) + self.process((parent, reader, attrs) => { + const source = reader.$read() + try { + const svg = renderMermaidToSvg(source) + return self.createBlock(parent, 'pass', svg, attrs) + } catch (err) { + const logger = parent.getDocument().getLogger() + logger.warn(`local-mermaid-extension: failed to render block — ${err.message}`) + const role = attrs.role + attrs.role 
= role ? `${role} mermaid-error` : 'mermaid-error' + return self.createBlock( + parent, + 'literal', + `Error rendering mermaid diagram:\n${err.message}\n\n${source}`, + attrs + ) + } + }) + } +} + +module.exports.register = function register (registry) { + if (typeof registry.register === 'function') { + registry.register(function () { + this.block('mermaid', mermaidBlockFactory()) + }) + } else if (typeof registry.block === 'function') { + registry.block('mermaid', mermaidBlockFactory()) + } +} diff --git a/docs/antora-playbook.yml b/docs/antora-playbook.yml index 42efb873..5c9493cf 100644 --- a/docs/antora-playbook.yml +++ b/docs/antora-playbook.yml @@ -20,12 +20,13 @@ content: asciidoc: extensions: - - asciidoctor-kroki - attributes: - # Use local mermaid-cli via Kroki (no external server needed when - # built with the custom Docker image in docs/.docker/Dockerfile — - # copied verbatim from SKaiNET-transformers). - kroki-fetch-diagram: true + # Local mermaid block processor — renders every `[mermaid]` block + # inline by invoking the @mermaid-js/mermaid-cli binary baked into + # the Docker image at /opt/antora/node_modules/.bin/mmdc. Replaces + # asciidoctor-kroki so builds don't depend on kroki.io at all, + # which eliminates the GET-URL length limit (4 KB) that was + # rejecting the large diagrams in hlo-getting-started.adoc. 
+ - /opt/antora/local-mermaid-extension.js ui: bundle: diff --git a/docs/modules/ROOT/pages/tutorials/hlo-getting-started.adoc b/docs/modules/ROOT/pages/tutorials/hlo-getting-started.adoc index 65e70b84..cad26ee6 100644 --- a/docs/modules/ROOT/pages/tutorials/hlo-getting-started.adoc +++ b/docs/modules/ROOT/pages/tutorials/hlo-getting-started.adoc @@ -168,15 +168,15 @@ sequenceDiagram participant DAG as Compute Graph participant Conv as HLO Converter participant HLO as StableHLO IR - participant Opt as Optimizer - + participant Optimizer + DSL->>DAG: rgb2GrayScaleMatMul() DAG->>Conv: MatMul + Transpose ops Conv->>HLO: stablehlo.dot_general Conv->>HLO: stablehlo.transpose - HLO->>Opt: Unoptimized IR - Opt->>HLO: Optimized IR - + HLO->>Optimizer: Unoptimized IR + Optimizer->>HLO: Optimized IR + Note over Conv,HLO: Type inference:
tensor → tensor ----