
Add developer and user guides for JIT #1876

Merged
rapids-bot[bot] merged 19 commits into rapidsai:release/26.04 from divyegala:jit-doc
Mar 19, 2026

Conversation

@divyegala
Member

@divyegala divyegala commented Mar 4, 2026

closes #1522 closes #1523

@divyegala divyegala self-assigned this Mar 4, 2026
@divyegala divyegala requested a review from a team as a code owner March 4, 2026 18:38
@divyegala divyegala added the labels doc (Improvements or additions to documentation), improvement (Improves an existing functionality), and non-breaking (Introduces a non-breaking change) Mar 4, 2026
@divyegala divyegala changed the title Add docs for JIT Add developer and user guides for JIT Mar 4, 2026
@divyegala divyegala removed the improvement Improves an existing functionality label Mar 4, 2026
1. In-memory cache is valid for the lifetime of the process.
2. On-disk cache is valid until a CUDA driver upgrade is performed. This is stored in the user's home directory under the path ``~/.nv/ComputeCache/``, and can be portably shared between machines in network or cloud storage.

Thus, the JIT compilation is a one-time cost and you can expect no loss in real performance after the first compilation. We recommend that you run a "warmup" to trigger the JIT compilation before the actual usage.
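
For containerized deployments, the on-disk cache location can be redirected with the standard CUDA environment variable ``CUDA_CACHE_PATH`` so the warmed-up cache survives container restarts. A minimal sketch (the path is illustrative, and the variable must be set before the first JIT compilation occurs):

```python
import os
import pathlib

# Sketch: point the CUDA compute cache at a persistent (e.g. volume-mounted)
# directory before any JIT compilation happens, so a container restart does
# not throw away the warmed-up cache. The path below is illustrative.
cache_dir = pathlib.Path("/tmp/persistent-nv-cache")
cache_dir.mkdir(parents=True, exist_ok=True)
os.environ["CUDA_CACHE_PATH"] = str(cache_dir)

# A warmup step would follow here: run one small query through the library
# so the JIT-compiled kernels are written into cache_dir on first use.
print("JIT cache directory:", os.environ["CUDA_CACHE_PATH"])
```

In a Dockerfile, the same idea is an ``ENV CUDA_CACHE_PATH=...`` line plus a ``RUN`` command that executes a small warmup workload, baking the compiled kernels into the image layer.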
Member


Do you want to make it super mega obvious to people who deploy services based on cuvs in containers that "We really really strongly recommend you make sure the cache is stored in a persistent location so that containers don't have to warm up the cache after each restart"

Is it possible to include something that warms up the cache in my Dockerfile? So that the cache is built into the image?

I am not sure if I'd make the connection from reading the current docs, hence wondering if a really explicit "hit people over the head with it" call out would be useful.

Member Author


I think that's a great idea. Let me add some phrasing to convey that very clearly.

Is it possible to include something that warms up the cache in my Dockerfile? So that the cache is built into the image?

You mean automatically?

Member Author


How does it read now?

Member


Sounds good now. Let's see if people get it, if not can always tune this later.

Wasn't thinking of something automatic, more a command I can include in my Dockerfile as a RUN command

Member Author


If you see the link that I added now, you can control where the cache is written with an environment variable. I'm hoping docker savvy users can now figure out the volume mount and environment variable connection.

Member


Works for me

Member

@cjnolet cjnolet left a comment


Approving so I don't block this, but I think you should address the comments about the links. Those links help users help themselves by reading more in the official documentation, so that they have the base-level knowledge to follow your new guides.


## Using Just-in-Time Link-Time Optimization

cuVS is moving to using link-time optimization for new kernels, and this requires some changes to the way kernels are written. Instead of compiling all kernel variants at build time (which leads to binary size explosion), JIT LTO compiles kernel fragments separately and links them together at runtime based on the specific configuration needed.
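
As a conceptual sketch only (the names below are illustrative, not the cuVS or nvJitLink API), the runtime flow amounts to a cached compile-and-link step keyed by the kernel configuration:

```python
import functools

# Illustrative model of JIT-LTO dispatch: kernel fragments are selected and
# linked the first time a given configuration is requested; later calls with
# the same configuration hit the process-lifetime in-memory cache.

@functools.lru_cache(maxsize=None)  # in-memory cache, valid for the process
def link_kernel(dtype: str, metric: str) -> str:
    # Stand-in for the runtime linker: combine precompiled LTO fragments
    # for just this configuration instead of prebuilding every variant.
    fragments = [f"dispatch_{dtype}", f"distance_{metric}"]
    return "+".join(fragments)  # pretend this is the linked kernel

def search(dtype: str, metric: str) -> str:
    kernel = link_kernel(dtype, metric)  # first call pays the JIT cost once
    return f"launch {kernel}"
```

Only the configurations actually used ever get compiled and linked, which is what keeps binary size down relative to prebuilding every variant at build time.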
Member


Can we link somewhere in the CUDA docs in this paragraph? Maybe for "link-time optimization"?

Member


Also, can you provide an ever so brief summary of the perf implications? Maybe link to the cuda docs where appropriate for expectations?

Member Author


First-run perf implications are very kernel- and hardware-dependent; the CUDA docs make no guarantees about that.

@divyegala divyegala changed the base branch from main to release/26.04 March 12, 2026 18:51
@divyegala
Member Author

/merge

@rapids-bot rapids-bot bot merged commit 70dc032 into rapidsai:release/26.04 Mar 19, 2026
49 checks passed

Labels

doc: Improvements or additions to documentation
non-breaking: Introduces a non-breaking change

Development

Successfully merging this pull request may close these issues.

Document limitations of JIT-LTO such as warmup time
Document how to use JIT-LTO in the DEVELOPER_GUIDE

4 participants