docs: overhaul installation instructions around uv + bring-your-own Python/PyTorch/CUDA #15769
Open
pzelasko wants to merge 4 commits into
…ersions

Harmonize and correct installation docs across README, CLAUDE.md, and the Sphinx install page, and fix stale package-metadata URLs.

- Lead with uv + cu13 as the recommended install; pip is a documented fallback.
- Emphasize bring-your-own Python (>=3.10) / PyTorch (>=2.6) / CUDA: nemo-toolkit only pins torch>=2.6, so a pre-installed PyTorch is kept, not replaced.
- Frame the uv.lock/container combo (Python 3.13, PyTorch 2.12, CUDA 12.6/13.2) as the actively-supported stack, not a hard requirement.
- Document the compiled / compiled-a100 extras (source-built GPU kernels for SpeechLM2 / Automodel: Transformer Engine, FlashAttention, Mamba, grouped-GEMM, DeepEP), including the H100+ vs A100 split and that they build via the Dockerfile.
- Fix broken commands: GPU pip install now shows the required --extra-index-url; test/docs are PEP 735 groups (--group), not extras.
- Correct the Python floor (3.10), torch version (2.12), and clone URL (NVIDIA-NeMo/NeMo); add an NGC container placeholder pending the image.
- Update stale repo URLs to NVIDIA-NeMo/NeMo in pyproject.toml and package_info.py.

Validated installability in Docker (py3.10/3.11/3.12; preinstalled torch 2.6/2.8/official cu124 kept; default + cu13 GPU paths resolve and import).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
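The "pre-installed PyTorch is kept" point in this commit depends on the existing torch already satisfying the `torch>=2.6` floor. The following is a minimal sketch of such a check, not code from the PR: the helper names `release_tuple` and `meets_torch_floor` are hypothetical, and pre-release suffixes are deliberately ignored, which is looser than a full PEP 440 comparison.

```python
import re


def release_tuple(version: str) -> tuple[int, ...]:
    """Return the numeric release part of a version string.

    A local build tag such as "+cu132" is dropped, and anything after the
    leading dotted digits (for example a pre-release suffix) is ignored.
    """
    public = version.split("+", 1)[0]  # strip a local tag like "+cu132"
    match = re.match(r"\d+(?:\.\d+)*", public)  # leading dotted digits only
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_torch_floor(version: str, floor: tuple[int, ...] = (2, 6)) -> bool:
    """True if `version` is at least `floor` (default mirrors torch>=2.6)."""
    # Tuples compare element by element, so (2, 12, 0) >= (2, 6) holds
    # because 12 > 6, while (2, 5, 1) >= (2, 6) does not.
    return release_tuple(version) >= floor
```

In an environment that already has PyTorch, passing `torch.__version__` to `meets_torch_floor` gives a quick indication of whether the existing build should be left in place by a bring-your-own install.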
Incorporate the useful parts of a parallel install-docs review and apply a broader consistency pass:

- Distinguish uv sync --locked (exact supported baseline; add --python 3.13) from uv pip / pip (bring-your-own), with a warning not to use uv sync --locked for BYO. Offer uv pip alongside pip for the fallback path.
- Clarify A100: works with BOTH CUDA 12 and CUDA 13; CUDA 13 (default base image) recommended, CUDA 12 base offered only as a convenience.
- Broaden PyTorch targets to CPU/CUDA/ROCm/Apple Silicon; note cu12/cu13 also add the matching CUDA Python deps (cuda-python, numba-cuda).
- Route scattered pages to the canonical install guide via :ref:`installation` (g2p, magpietts-finetuning, nemo_forced_aligner) and modernize index.rst / speechlm2/intro.rst snippets; add a docker run example and a lighter import-only verify step.
- Align docs build with CI (uv sync --locked --group docs; uv run make linkcheck); prune the now-fixed nemo_forced_aligner entry from the broken-links list.
- Normalize stale install references in the model-card template, NFA tool docs, and runtime error messages (nemo-toolkit name; NVIDIA-NeMo/NeMo clone URL).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
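The "lighter import-only verify step" mentioned in this commit can be pictured roughly as below. This is a hedged sketch rather than the snippet added to the docs: the function name `import_check` is hypothetical, and `nemo.collections.asr` is used in the usage note only because it is the module this PR's validation reports importing.

```python
import importlib


def import_check(module_names):
    """Try to import each module and report the outcome.

    Returns a dict mapping each name to None on success, or to a short
    "ErrorType: message" string if the import raised.
    """
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # broad on purpose: report any import-time failure
            results[name] = f"{type(exc).__name__}: {exc}"
        else:
            results[name] = None
    return results
```

For example, `import_check(["torch", "nemo.collections.asr"])` would show which of the two imports fails in a given environment without loading a model or touching a GPU.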
…drop torchvision mention

- PyTorch target wording: "CPU, CUDA, etc." (drop explicit ROCm / Apple Silicon).
- compiled-a100: note the patched A100 DeepEP is auto-built/installed by the Dockerfile when the CUDA 12 base image is selected.
- Remove the stray torchvision mention from the conda tip.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Collaborator
Author
/ok to test e4ed7e7
nithinraok reviewed Jun 9, 2026
nithinraok (Member) left a comment:
Waiting for doc build for final view.
Collaborator
Author
/ok to test e62b47f
chtruong814 approved these changes Jun 9, 2026
What & why

Harmonizes and corrects the NeMo Speech installation documentation now that the repo standardizes on uv (committed `uv.lock`) with a fresh Dockerfile. The previous docs were inconsistent and in places broken (wrong Python/PyTorch versions, a GPU `pip` command that can't resolve, `.[test]` which isn't an extra, stale clone URLs).

Two framing principles, per maintainer guidance:

- uv + `cu13` is the lead recommendation; `pip` is a documented fallback.
- `nemo-toolkit` only requires `torch>=2.6`, so a pre-installed PyTorch is kept, not replaced. The `uv.lock`/container combo (Python 3.13, PyTorch 2.12, CUDA 12.6/13.2) is the actively-supported baseline, not a hard requirement.

Key fixes

- Corrected versions: Python floor 3.10 (… `uv.lock` enforce 3.10), PyTorch 2.12 actual (cu126/cu132).
- GPU `pip` install now shows the required `--extra-index-url https://download.pytorch.org/whl/{cu126,cu132}` (pip doesn't read `[tool.uv]` indexes).
- `test`/`docs` are PEP 735 `--group`s, not extras (`.[test]` removed).
- `uv sync --locked` (exact baseline) vs `uv pip`/`pip` (BYO) distinction, with a warning not to use `uv sync --locked` for BYO.
- Clone and repo URLs now point to `NVIDIA-NeMo/NeMo` (docs, package metadata, error messages); fixed stale `[project.urls]` and `package_info.py`.
- `compiled`/`compiled-a100` extras documented as the optional SpeechLM2/Automodel accelerated-backend kernels (Automodel runs fine without them); H100+ vs A100 split; built via the Dockerfile.
- Routed scattered pages (`g2p`, `magpietts-finetuning`, `nemo_forced_aligner`, `index`, `speechlm2/intro`) to the canonical guide via :ref:`installation`; aligned the docs build with CI (`uv sync --locked`); normalized the model-card template + NFA tool refs.

Validation

Installability verified in resource-capped Docker builds (`--cpus=6 --memory=10g`):

- … `pytorch:2.6-cuda12.4` image: pre-installed torch kept in every case, `import nemo.collections.asr` OK.
- … `[asr,cu13] --extra-index-url …/cu132` GPU command (resolves `torch 2.12.0+cu132`): both import OK.
- `black`/`isort` pass on changed Python files; RST validated via docutils parse.

Pending

The NGC container pull command is a clearly-marked "Coming soon" placeholder in README and the install page, to be filled in when the image is published.

🤖 Generated with Claude Code