HLO: IrpaWriter accepts FileBacked mmap handles on JVM/Android (PR E of #523) #529

Merged
michalharakal merged 1 commit into develop from feature/hlo-iree-params-mmap
Apr 19, 2026

@michalharakal
Contributor

Summary

  • Wires BufferHandle.FileBacked into IrpaWriter. JVM and Android use FileChannel.map for zero-copy mmap transfer to the .irpa storage segment. Other targets throw with a pointer to follow-up work.
  • The full ingestion pipeline is now zero-copy end-to-end: source file (GGUF / safetensors) → FileBacked handle (already produced by the existing loadTensorStorageMapped methods) → IrpaWriter → .irpa bytes.

What's here

  • expect fun writeFileBackedBytes in commonMain.
  • JVM and Android actuals: RandomAccessFile + FileChannel.map + chunked copy into the kotlinx.io Sink. Kept as separate files until this module adopts a hierarchical jvmAndroidMain source set.
  • Native / wasmJs / wasmWasi actuals: explicit NotImplementedError with a pointer to the #523 follow-up (Design: externalize weights via IREE parameter archive, supersedes #519). Callers on those targets should resolve the handle to Owned / Borrowed before writing.
  • IrpaWriter.writeBufferHandle dispatches FileBacked to the new seam.
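The JVM/Android transfer described above can be sketched roughly as follows. This is a minimal illustration, not the merged code: `FileBackedRef` and its field names are hypothetical stand-ins for `BufferHandle.FileBacked`, a plain `OutputStream` stands in for the kotlinx.io Sink, and the chunk size is an arbitrary choice. Size guards are left out here for brevity.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.File
import java.io.OutputStream
import java.io.RandomAccessFile
import java.nio.channels.FileChannel

// Hypothetical stand-in for BufferHandle.FileBacked; the real class may differ.
data class FileBackedRef(val path: String, val fileOffset: Long, val sizeInBytes: Long)

// Maps the declared file range read-only and copies it to `out` in fixed-size
// chunks. OutputStream is an assumption standing in for the kotlinx.io Sink.
fun writeFileBackedBytesSketch(
    ref: FileBackedRef,
    out: OutputStream,
    chunkSize: Int = 64 * 1024,
) {
    RandomAccessFile(ref.path, "r").use { raf ->
        val mapped = raf.channel.map(
            FileChannel.MapMode.READ_ONLY,
            ref.fileOffset,
            ref.sizeInBytes,
        )
        val chunk = ByteArray(chunkSize)
        while (mapped.hasRemaining()) {
            // Copy at most one chunk per iteration so the staging buffer stays small.
            val n = minOf(chunk.size, mapped.remaining())
            mapped.get(chunk, 0, n)
            out.write(chunk, 0, n)
        }
    }
}
```

The only heap allocation is the small reusable staging array; the tensor bytes themselves are read through the mapping rather than loaded into one large array.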

Guardrails

  • Rejects FileBacked.sizeInBytes > Int.MAX_VALUE: FileChannel.map can't return a single region larger than Int.MAX_VALUE bytes (~2 GiB). Multi-window streaming is a separate follow-up once a real model hits the limit.
  • Zero-byte FileBacked is a no-op and does not open the source file.
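The two guardrails could look something like the sketch below. `MappedRange`, `shouldMap`, and the diagnostic wording are illustrative assumptions, not names taken from the PR; the point is only the ordering, with the zero-length check first so the file path is never touched.

```kotlin
// Hypothetical stand-in for BufferHandle.FileBacked; field names are assumed.
data class MappedRange(val path: String, val fileOffset: Long, val sizeInBytes: Long)

// Returns false for the zero-length no-op (caller skips opening the file),
// throws for ranges a single FileChannel.map call cannot cover,
// and returns true when the caller should go on to map the range.
fun shouldMap(range: MappedRange): Boolean {
    if (range.sizeInBytes == 0L) return false
    require(range.sizeInBytes <= Int.MAX_VALUE.toLong()) {
        "FileBacked range of ${range.sizeInBytes} bytes exceeds the " +
            "single-mapping limit of ${Int.MAX_VALUE} bytes"
    }
    return true
}
```

Because the check runs before any I/O, a zero-length range that points at a missing file passes through without error.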

What's NOT here

  • No loader changes: both skainet-io-gguf and skainet-io-safetensors already expose loadTensorStorageMapped(filePath) that returns BufferHandle.FileBacked. The agent exploration showed this path existed; PR E only had to wire the consumer side.
  • No multi-window mmap for >2 GiB tensors: guarded with a clear diagnostic, deferred.
  • No Native mmap via cinterop: JVM/Android are the targets that matter for the current whisper deployment; native is a cleanup item.

Test plan

  • FileBackedIrpaRoundTripTest.testFileBackedEntryBytesLandInStorageSegment — writes a known byte pattern to a temp file at a non-aligned offset, constructs a FileBacked ref over a sub-range, runs through IrpaWriter, verifies the .irpa storage segment contains source bytes verbatim.
  • testFileBackedRejectsOversizedMap — pins the Int.MAX_VALUE guard.
  • testFileBackedZeroLengthIsNoOp — pins that 0-byte refs don't open the file (uses a non-existent path to prove it).
  • ./gradlew allTests green — 1047 tasks across jvm / android / ios / macos / linux / wasmJs / wasmWasi.
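As a rough picture of the round-trip fixture, the sketch below builds a temp file whose payload starts at a non-aligned offset and reads a sub-range back through a read-only mapping. It uses only the JDK and does not involve IrpaWriter; `makeFixture` and `readMapped` are hypothetical helper names, and the actual test may be structured differently.

```kotlin
import java.io.File
import java.io.RandomAccessFile
import java.nio.channels.FileChannel

// Builds a temp file with `prefix` filler bytes followed by a known payload,
// so the payload begins at a deliberately non-aligned offset.
fun makeFixture(prefix: Int, payload: ByteArray): File {
    val file = File.createTempFile("filebacked-fixture", ".bin")
    file.deleteOnExit()
    file.writeBytes(ByteArray(prefix) { 0x7F.toByte() } + payload)
    return file
}

// Reads the bytes in [offset, offset + size) through a read-only mapping.
fun readMapped(file: File, offset: Long, size: Int): ByteArray =
    RandomAccessFile(file, "r").use { raf ->
        val buf = raf.channel.map(FileChannel.MapMode.READ_ONLY, offset, size.toLong())
        ByteArray(size).also { buf.get(it) }
    }
```

A check in the spirit of the first test would then compare `readMapped(file, prefix + k, n)` against `payload.copyOfRange(k, k + n)`.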

Part of #523. With this, PRs A–C + E are complete on the SKaiNET side. PR D (skainet-whisper caller wiring to flip policy → ExternalAlwaysIrpaWriter) lives in the skainet-whisper repo and is the remaining step to actually close #519 end-to-end.

🤖 Generated with Claude Code

… of #523)

The gguf and safetensors loaders already produce
`BufferHandle.FileBacked` via their `loadTensorStorageMapped`
methods, but IrpaWriter previously rejected that variant with a
placeholder throw pointing at PR E. This wires the real transfer
path: JVM and Android actuals mmap the declared file range via
`RandomAccessFile` + `FileChannel.map` and blit it into the sink
with no intermediate heap copy. The full ingestion pipeline now
runs source file → FileBacked handle → `.irpa` storage segment
with zero-copy semantics end-to-end.

### What's wired

- `writeFileBackedBytes` — `expect` in commonMain, `actual` per
  target.
- JVM / Android — functional mmap. Android's Dalvik/ART exposes the
  same FileChannel API as desktop JVM, so the actuals are byte-for-
  byte identical (kept as separate files until this module adopts a
  hierarchical jvmAndroidMain source set).
- Native (ios*, macos*, linux*), wasmJs, wasmWasi — explicit
  NotImplementedError with a pointer to #523. Browser wasm has no
  filesystem; Native mmap via cinterop is follow-up work. Callers
  on those targets should resolve to Owned/Borrowed first.

### Guardrails

- Rejects FileBacked sizes > Int.MAX_VALUE with a clear diagnostic.
  FileChannel.map cannot return a single region that large; multi-
  window streaming is a follow-up once a real model hits the limit.
- Zero-byte FileBacked is a no-op — does not open the source file.

### Tests

- `FileBackedIrpaRoundTripTest.testFileBackedEntryBytesLandInStorageSegment` —
  writes a known byte pattern to a temp file at a non-aligned
  offset, constructs a FileBacked ref over a sub-range, runs through
  IrpaWriter, verifies the .irpa storage segment contains the
  source bytes verbatim.
- `testFileBackedRejectsOversizedMap` — pins the Int.MAX_VALUE
  guard.
- `testFileBackedZeroLengthIsNoOp` — pins that a 0-byte ref does not
  open the file (references a non-existent path).
- `./gradlew allTests` green across all 1047 tasks.

Part of #523.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@michalharakal merged commit 17fbd3a into develop on Apr 19, 2026
4 checks passed
@michalharakal deleted the feature/hlo-iree-params-mmap branch on April 19, 2026 12:33

Development

Successfully merging this pull request may close these issues.

HLO: inline dense constants for large tensors — scale / compile-time blocker
