Add developer and user guides for JIT #1876

rapids-bot[bot] merged 19 commits into rapidsai:release/26.04 from
Conversation
Co-authored-by: Kyle Edwards <kyedwards@nvidia.com>
> 1. In-memory cache is valid for the lifetime of the process.
> 2. On-disk cache is valid until a CUDA driver upgrade is performed. It is stored in the user's home directory under the path ``~/.nv/ComputeCache/``, and can be portably shared between machines in network or cloud storage.
>
> Thus, JIT compilation is a one-time cost, and you can expect no loss in real performance after the first compilation. We recommend that you run a "warmup" to trigger the JIT compilation before actual usage.
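The warmup recommendation above can be made concrete with a small shell sketch (the `warmup.py` name is hypothetical; `CUDA_CACHE_PATH` is the standard CUDA environment variable controlling where the JIT cache is written, defaulting to `~/.nv/ComputeCache/`):

```shell
# Point the JIT cache at a persistent location (standard CUDA env var;
# if unset, the cache goes to ~/.nv/ComputeCache/).
export CUDA_CACHE_PATH=/opt/cuvs-jit-cache

# Hypothetical warmup: exercise each kernel variant you plan to serve
# once, so the compiled kernels land in the on-disk cache up front.
python warmup.py
```

After this runs once, subsequent processes pick up the cached kernels and skip compilation entirely, until a CUDA driver upgrade invalidates the cache.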
Do you want to make it super mega obvious to people who deploy services based on cuvs in containers that "We really really strongly recommend you make sure the cache is stored in a persistent location so that containers don't have to warm up the cache after each restart"
Is it possible to include something that warms up the cache in my Dockerfile? So that the cache is built into the image?
I am not sure if I'd make the connection from reading the current docs, hence wondering if a really explicit "hit people over the head with it" call out would be useful.
I think that's a great idea. Let me add some phrasing to convey that very clearly.
> Is it possible to include something that warms up the cache in my Dockerfile? So that the cache is built into the image?
You mean automatically?
Sounds good now. Let's see if people get it, if not can always tune this later.
Wasn't thinking of something automatic, more a command I can include in my Dockerfile as a RUN command
If you see the link that I added now, you can control where the cache is written with an environment variable. I'm hoping docker savvy users can now figure out the volume mount and environment variable connection.
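To spell out the volume-mount and environment-variable connection for Docker users, here is a sketch of a Dockerfile fragment (the cache path and `warmup.py` script are hypothetical; `CUDA_CACHE_PATH` is the standard CUDA environment variable for relocating the JIT cache):

```dockerfile
# Keep the JIT cache at a fixed, well-known path inside the container.
ENV CUDA_CACHE_PATH=/opt/cuvs-jit-cache

# Warm the cache at image-build time so the compiled kernels ship
# inside the image. This requires a GPU-enabled build environment
# (e.g. a CI runner with the NVIDIA Container Toolkit).
RUN python warmup.py
```

If no GPU is available at build time, the alternative is to mount a persistent volume at `/opt/cuvs-jit-cache` and run the warmup once on first start, so that container restarts reuse the cache instead of recompiling.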
cjnolet left a comment
Approving so I don't block this, but I think you should address the comments about the links. Those can help users help themselves by reading more in the official documentation, so that they have the base-level knowledge to follow your new guides.
> ## Using Just-in-Time Link-Time Optimization
>
> cuVS is moving to using link-time optimization for new kernels, and this requires some changes to the way kernels are written. Instead of compiling all kernel variants at build time (which leads to binary size explosion), JIT LTO compiles kernel fragments separately and links them together at runtime based on the specific configuration needed.
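As a rough illustration of the build-side half of this (a sketch only, not cuVS's actual build commands; the file name is made up), kernel fragments are compiled to LTO-IR with nvcc's `-dlto` flag instead of to final machine code, deferring code generation for the requested variant to link time:

```shell
# Compile a kernel fragment to LTO-IR rather than final SASS.
# -dlto : emit link-time-optimization intermediate representation
# -rdc  : relocatable device code, required for separate compilation
nvcc -dlto -rdc=true -c distance_fragment.cu -o distance_fragment.o
```

At runtime, the library can then hand only the fragments needed for a given configuration to the JIT linker (NVIDIA's nvJitLink library), which finalizes that one variant; the linked result is what ends up in the JIT cache described above.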
Can we link somewhere in the CUDA docs in this paragraph? Maybe for "link-time optimization"?
Also, can you provide an ever-so-brief summary of the perf implications? Maybe link to the CUDA docs where appropriate to set expectations?
First-run perf implications are very kernel- and hardware-dependent; the CUDA docs make no guarantees about that.
Co-authored-by: Corey J. Nolet <cjnolet@gmail.com>
/merge
closes #1522, closes #1523