Polish Java / JVM consumption surface for 0.19.0 (#400) #498
Merged
michalharakal merged 5 commits into develop (Apr 13, 2026)
Conversation
The neutral backend API module landed in #470 as the integration seam for future backends (IREE, Metal, NPU, the NNAPI-Amlogic sibling repo), but it was never added to the BOM's version-alignment constraints. Java / JVM consumers depending on the BOM therefore got no pinned version for skainet-backend-api: anyone referencing the module from a Maven / Gradle project had to either spell out the version manually or forgo the BOM for that coordinate.
Adding the missing `api(project(":skainet-backends:skainet-backend-api"))` constraint groups it with skainet-backend-cpu under the backend section. BOM still builds clean.
First of five commits polishing the Java / JVM consumption story for the upcoming 0.19.0 release. See #400.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
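For context, the constraint block in a Gradle BOM of this shape presumably looks roughly like the sketch below. This is illustrative, not the actual `skainet-bom/build.gradle.kts`; only the backend-api line is quoted from this commit, and the surrounding entries and plugin setup are assumptions.

```kotlin
// Illustrative sketch of a java-platform BOM build script.
plugins {
    `java-platform`
}

dependencies {
    constraints {
        // backends
        api(project(":skainet-backends:skainet-backend-cpu"))
        // The constraint added by this commit:
        api(project(":skainet-backends:skainet-backend-api"))
    }
}
```

With this in place, a consumer importing the BOM gets the module's version pinned automatically instead of declaring it by hand.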
Adds `@JvmStatic` to every factory method on the
`StableHloConverterFactory` object (`createBasic`, `createExtended`,
`createFast`, `createCustom`) plus `@JvmOverloads` on `createCustom`
so every parameter default generates a separate JVM overload.
Before, Java call sites had to go through the Kotlin singleton marker:
```java
var converter = StableHloConverterFactory.INSTANCE.createExtended();
```
After, Java callers can use the idiomatic static form:
```java
var converter = StableHloConverterFactory.createExtended();
```
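On the Kotlin side, the change presumably has this shape. A hedged sketch only: method bodies and `createCustom`'s parameter names are illustrative, not the actual skainet source.

```kotlin
object StableHloConverterFactory {
    @JvmStatic fun createBasic(): StableHloConverter = TODO()
    @JvmStatic fun createExtended(): StableHloConverter = TODO()
    @JvmStatic fun createFast(): StableHloConverter = TODO()

    // @JvmOverloads makes each defaulted parameter generate a separate
    // JVM overload, so Java callers can omit trailing arguments.
    @JvmStatic
    @JvmOverloads
    fun createCustom(
        strict: Boolean = false,   // illustrative parameter
        opsetVersion: Int = 1,     // illustrative parameter
    ): StableHloConverter = TODO()
}
```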
The `@JvmStatic` annotation lives in `commonMain` — Kotlin 1.9+
accepts JVM-specific annotations in common code and treats them
as no-ops on non-JVM targets. Verified across all Kotlin
Multiplatform targets (jvmTest, wasmJsTest, wasmJsBrowserTest,
wasmWasiTest, wasmWasiNodeTest, macosArm64Test,
iosSimulatorArm64Test) — zero regressions.
Second of five commits polishing the Java / JVM consumption
story for the upcoming 0.19.0 release. See #400.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds `@JvmStatic` to both factory entry points on the
`TokenizerFactory` object (`fromGguf(Map)`, `fromTokenizerJson(String)`).
Same motivation as StableHloConverterFactory in the previous commit. Without the annotation, Java consumers had to navigate through the Kotlin object's `INSTANCE` marker:
```java
var tokenizer = TokenizerFactory.INSTANCE.fromGguf(ggufFields);
```
With the annotation they get the idiomatic static form:
```java
var tokenizer = TokenizerFactory.fromGguf(ggufFields);
```
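The annotated object presumably follows the same pattern as the converter factory. A sketch with signatures approximated from the names in this commit message; the exact parameter and return types are assumptions:

```kotlin
object TokenizerFactory {
    // Builds a tokenizer from parsed GGUF metadata fields.
    @JvmStatic
    fun fromGguf(fields: Map<String, Any>): Tokenizer = TODO()

    // Builds a tokenizer from a HuggingFace-style tokenizer.json payload.
    @JvmStatic
    fun fromTokenizerJson(json: String): Tokenizer = TODO()
}
```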
The factory is the canonical entry point for the new Qwen
byte-level BPE + SentencePiece tokenizers that landed in #463
and #464, so this is a meaningful win for Java consumers of
the upcoming 0.19.0 release — they get Qwen / Llama / Gemma /
TinyLlama tokenization without any Kotlin-specific interop
glue.
Verified across jvmTest, compileKotlinWasmJs, and macosArm64Test
for skainet-io-core — no regressions.
Third of five commits polishing the Java / JVM consumption
story for the upcoming 0.19.0 release. See #400.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds `@file:JvmName("TensorSpecs")` to
`skainet-lang-core/.../tensor/ops/TensorSpecEncoding.kt`.
The file declares three top-level extension functions used to
read and write the `TensorEncoding` metadata that #469 plumbed
onto `TensorSpec`: `tensorEncoding`, `withTensorEncoding`, and
`inferTensorEncoding`. Top-level extensions in Kotlin compile to
static methods on a synthetic class named after the source file
— by default `TensorSpecEncodingKt`. Java call sites ended up
looking like:
```java
TensorEncoding encoding =
    TensorSpecEncodingKt.getTensorEncoding(spec);
TensorSpec annotated =
    TensorSpecEncodingKt.withTensorEncoding(spec, TensorEncoding.Q8_0.INSTANCE);
TensorEncoding data =
    TensorSpecEncodingKt.inferTensorEncoding(tensorData);
```
With `@file:JvmName("TensorSpecs")` they become:
```java
TensorEncoding encoding = TensorSpecs.getTensorEncoding(spec);
TensorSpec annotated =
    TensorSpecs.withTensorEncoding(spec, TensorEncoding.Q8_0.INSTANCE);
TensorEncoding data = TensorSpecs.inferTensorEncoding(tensorData);
```
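The mechanics: a `@file:JvmName` annotation at the very top of the file renames the synthetic facade class that Kotlin generates for top-level declarations. A sketch of the file's shape, with bodies elided and declarations approximated from the call sites above (the package line and exact signatures are not reproduced from the actual source):

```kotlin
@file:JvmName("TensorSpecs")
// package declaration omitted

// Seen from Java as static TensorSpecs.getTensorEncoding(TensorSpec).
val TensorSpec.tensorEncoding: TensorEncoding?
    get() = TODO()

// Seen from Java as static TensorSpecs.withTensorEncoding(TensorSpec, TensorEncoding).
fun TensorSpec.withTensorEncoding(encoding: TensorEncoding): TensorSpec = TODO()
```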
Kotlin call sites are unaffected (they see the
top-level extension syntax either way): `spec.tensorEncoding`
and `spec.withTensorEncoding(TensorEncoding.Q8_0)` still work
unchanged. This is a pure JVM-side binary-name change.
Verified with jvmTest, compileKotlinWasmJs, macosArm64Test on
skainet-lang-core — no regressions.
Fourth of five commits polishing the Java / JVM consumption
story for the upcoming 0.19.0 release. See #400.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New JUnit5 Java test in skainet-test-java exercising each of the
three Kotlin surfaces polished in the earlier commits of this
branch, proving they are first-class for Java consumers:
- StableHloConverterFactory.createBasic/Extended/Fast() — must
be reachable via the idiomatic `Factory.create*()` static
form, never through `Factory.INSTANCE.create*()`. The test
is effectively a compile-time smoke check: if someone drops
the @JvmStatic annotations it fails to compile before any
assertion runs.
- TokenizerFactory.fromGguf(Map) / fromTokenizerJson(String) —
same pattern. Passing empty inputs exercises the error
path (UnsupportedTokenizerException), which is the cleanest
way to prove static dispatch without needing a real GGUF
fixture in the test classpath.
- TensorSpecs (the new JvmName-bound class for
TensorSpecEncoding.kt): getTensorEncoding / withTensorEncoding
called via `TensorSpecs.<name>(spec, ...)` in Java syntax.
Verifies the round-trip of TensorEncoding.Q8_0.INSTANCE and
confirms withTensorEncoding does not mutate the source spec.
Adds skainet-compile-hlo and skainet-io-core to the Java test
module's `testImplementation` classpath so the new test can
reference the factories + encoding helpers. Existing Java tests
(SKaiNETTest, ModelBuilderTest, TensorJavaOpsTest) are untouched.
Verified: `./gradlew :skainet-test:skainet-test-java:test` green
— all 3 pre-existing tests plus the 4 new tests in
ReleaseApiJavaTest.
Fifth and final commit polishing the Java / JVM consumption
story for the upcoming 0.19.0 release. See #400.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Refs #400.
Summary
Makes the three new Kotlin surfaces from 0.19.0 (`StableHloConverterFactory`, `TokenizerFactory`, the `TensorEncoding` metadata helpers on `TensorSpec`) cleanly callable from Java, adds `skainet-backend-api` to the BOM so Java consumers get pinned versions for every module, and extends `skainet-test-java` with a `ReleaseApiJavaTest` that exercises each entry point from real Java code — so any future regression surfaces at compile time instead of at runtime in a downstream consumer's project.
The gap before this PR
`SKaiNET` (the pre-existing `sk.ainet.java.SKaiNET` facade in `skainet-backend-cpu/jvmMain`) was already annotated with `@JvmStatic` / `@JvmOverloads`, and `skainet-test-java` already had three Java test files exercising `SKaiNET.context()`, `tensor()`, `zeros()`, model building, and tensor ops. But none of the 0.19.0 additions had gotten the same treatment.
The five commits
1. `1253b42f` — BOM adds skainet-backend-api
`skainet-bom/build.gradle.kts` gains one `api(project(":skainet-backends:skainet-backend-api"))` constraint, grouped with `skainet-backend-cpu` under the backend section. BOM still builds clean.
2. `25be9dc7` — @JvmStatic on StableHloConverterFactory
Every factory method gets `@JvmStatic`, and `createCustom` gets `@JvmOverloads` so every parameter default generates a JVM overload. The annotations live in `commonMain` — Kotlin 1.9+ accepts JVM-specific annotations in common code and treats them as no-ops on non-JVM targets. Verified across jvmTest, wasmJsTest, wasmJsBrowserTest, wasmWasiTest, wasmWasiNodeTest, macosArm64Test, and iosSimulatorArm64Test.
3. `1ebd21b4` — @JvmStatic on TokenizerFactory
Same treatment for both factory entry points (`fromGguf(Map)`, `fromTokenizerJson(String)`). This is the canonical entry for the new Qwen byte-level BPE + SentencePiece tokenizers from #463 / #464, so it's a meaningful Java-side win: Qwen / LLaMA / Gemma / TinyLlama tokenization without Kotlin-specific interop glue.
4. `76cfea29` — @file:JvmName("TensorSpecs") on TensorSpecEncoding.kt
Java callers now see `TensorSpecs.getTensorEncoding(spec)` / `TensorSpecs.withTensorEncoding(spec, TensorEncoding.Q8_0.INSTANCE)` / `TensorSpecs.inferTensorEncoding(tensorData)`, matching the name used in the Kotlin extension-syntax receiver. Kotlin call sites stay unchanged (they go through the extension syntax either way). Pure JVM-side binary-name change.
5. `d6e5f226` — ReleaseApiJavaTest
New JUnit5 Java test covering all three surfaces:
```java
// 1. Converter factory via idiomatic static form
StableHloConverter converter = StableHloConverterFactory.createExtended();
// 2. Tokenizer factory via idiomatic static form (error-path
// invocation — cleanest way to prove static dispatch without a
// real GGUF fixture in the test classpath)
assertThrows(UnsupportedTokenizerException.class,
() -> TokenizerFactory.fromGguf(Collections.emptyMap()));
// 3. TensorSpecs facade for the TensorEncoding helpers
TensorSpec annotated = TensorSpecs.withTensorEncoding(
bare, TensorEncoding.Q8_0.INSTANCE);
assertSame(TensorEncoding.Q8_0.INSTANCE,
TensorSpecs.getTensorEncoding(annotated));
```
`skainet-test-java/build.gradle.kts` gains `skainet-compile-hlo` and `skainet-io-core` on the test classpath so the new test can reference the factories and encoding helpers.
Test plan
Net effect for Java consumers of 0.19.0
A Java app pulling in the 0.19.0 BOM can now:
```java
// Execute tensors (pre-existing)
var ctx = SKaiNET.context();
var t = SKaiNET.zeros(ctx, new int[]{2, 3}, DType.fp32());
// Export via StableHLO (0.19.0)
var converter = StableHloConverterFactory.createExtended();
var module = converter.convert(graph, "main");
// Tokenize for Qwen / LLaMA / Gemma / TinyLlama (0.19.0)
var tokenizer = TokenizerFactory.fromGguf(ggufFields);
// Read / write TensorEncoding metadata (0.19.0)
var encoding = TensorSpecs.getTensorEncoding(spec);
var annotated = TensorSpecs.withTensorEncoding(
spec, TensorEncoding.Q8_0.INSTANCE);
```
No `.INSTANCE.` noise, no `TensorSpecEncodingKt` naming, no manual version pins for `skainet-backend-api`.
🤖 Generated with Claude Code