Slurp train conf nemo2#15772

Open
diarray-hub wants to merge 46 commits into NVIDIA-NeMo:main from diarray-hub:slurp-train-conf-nemo2

@diarray-hub

Important

The Update branch button must only be pressed on very rare occasions.
An outdated branch is never blocking the merge of a PR.
Please reach out to the automation team before pressing that button.

What does this PR do?

Fix AttributeError: 'MultiLayerPerceptron' object has no attribute 'mlp' when training SLUIntentSlotBPEModel

Collection: nemo.collections.asr / nemo_asr

Changelog

In blob/stable/examples/slu/speech_intent_slot/configs/conformer_transformer_large_bpe.yaml, line 133:
_target_: nemo.collections.common.parts.MultiLayerPerceptron -> nemo.collections.asr.parts.submodules.token_classifier.TokenClassifier
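For context, the key edited in this changelog entry is presumably the Hydra-style `_target_` entry: a dotted import path naming the class that gets built from the config block. The sketch below is a simplified stand-in for that mechanism, not NeMo or Hydra code. The helper names `resolve_target` and `build_from_config` are hypothetical, and standard-library classes stand in for `MultiLayerPerceptron` and `TokenClassifier` so the example runs without NeMo installed.

```python
import importlib


def resolve_target(dotted_path: str):
    """Import and return the object named by a dotted path like 'pkg.module.Class'."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected a dotted path, got {dotted_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


def build_from_config(cfg: dict):
    """Instantiate cfg['_target_'], passing the remaining keys as keyword arguments."""
    kwargs = {key: value for key, value in cfg.items() if key != "_target_"}
    return resolve_target(cfg["_target_"])(**kwargs)


# Stand-in config: the two collections classes play the roles of the old and
# new classifier classes (an assumption made purely for illustration).
old_cfg = {"_target_": "collections.OrderedDict", "hidden": 4, "classes": 2}

# Changing only the target string changes which class is constructed;
# every other key in the block is passed through unchanged.
new_cfg = {**old_cfg, "_target_": "collections.Counter"}

old_obj = build_from_config(old_cfg)
new_obj = build_from_config(new_cfg)
print(type(old_obj).__name__, type(new_obj).__name__)
```

Only the class path differs between the two configs, which is why a one-line change in the YAML file is enough to alter the object the model receives.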

Usage

  • The training script is used exactly as before; this change only transitions the config to what NeMo 2 expects, i.e. a more generic token classifier object.

PR Type:

  • Bugfix

Additional Information

ericharper and others added 30 commits May 13, 2024 18:12
Signed-off-by: eharper <eharper@nvidia.com>
* Enable CUDA graphs only for transcription. Sync streams before capture.

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: artbataev <artbataev@users.noreply.github.com>

---------

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: artbataev <artbataev@users.noreply.github.com>
Co-authored-by: artbataev <artbataev@users.noreply.github.com>
…IA-NeMo#9178)

* Update PTQ to use nvidia-modelopt

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Restore PTQ tests

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Update docs

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Comment on apply_rope_fusion

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Support for calibration PP > 1

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Apply isort and black reformatting

Signed-off-by: janekl <janekl@users.noreply.github.com>

* Fix cicd-main.yml indent

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Set data/tensor parallel groups

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Install only torch dependecies

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Follow up on recent modelopt changes

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Model support matrix

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Apply isort and black reformatting

Signed-off-by: janekl <janekl@users.noreply.github.com>

* Rename PTQ script as it should be model-agnostic

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Remove unused import

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Update setup instructions

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

---------

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
Signed-off-by: janekl <janekl@users.noreply.github.com>
Co-authored-by: janekl <janekl@users.noreply.github.com>
Signed-off-by: eharper <eharper@nvidia.com>
Signed-off-by: eharper <eharper@nvidia.com>
* rename paths2audiofiles to audio

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* update transcribe to audio

Signed-off-by: Nithin Rao Koluguri <nithinraok>

---------

Signed-off-by: Nithin Rao Koluguri <nithinraok>
Co-authored-by: Nithin Rao Koluguri <nithinraok>
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
…#9201) (NVIDIA-NeMo#9235)

* Support dataloader as input to `audio` for transcription

Signed-off-by: smajumdar <titu1994@gmail.com>

* Apply isort and black reformatting

Signed-off-by: titu1994 <titu1994@users.noreply.github.com>

* Support dataloader as input to `audio` for transcription

Signed-off-by: smajumdar <titu1994@gmail.com>

* Update transcribe signatures

Signed-off-by: smajumdar <titu1994@gmail.com>

* Apply isort and black reformatting

Signed-off-by: titu1994 <titu1994@users.noreply.github.com>

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: titu1994 <titu1994@users.noreply.github.com>
(cherry picked from commit 67401ed)
* revert rope fusion defaults

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

---------

Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: cuichenx <cuichenx@users.noreply.github.com>
* dist adam transpose fix

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>

---------

Signed-off-by: dimapihtar <dpihtar@gmail.com>
Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>
Co-authored-by: dimapihtar <dimapihtar@users.noreply.github.com>
Signed-off-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
* Lazily warn about using greedy strategy instead of greedy_batch
strategy.

Previously, the warning would often run spuriously, since several
existing code paths simply call "change_decoding_strategy()" after
having first initialized a Module, rather than changing the config
before initializing the Module. This can be confusing.

The only problem I can see with this is that using logging inside a
forward() method might interfere with some compiler toolkits like
Torchscript or thunder.compile. Presumably it would be easy to add a
conditional statement to avoid this statement in a compiler context if
necessary.

Signed-off-by: Daniel Galvez <dgalvez@nvidia.com>
* sum-reudce grad_norm in DP+CP domain

Signed-off-by: Sangkug Lym <slym@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: pablo-garay <pablo-garay@users.noreply.github.com>

---------

Signed-off-by: Sangkug Lym <slym@nvidia.com>
Signed-off-by: pablo-garay <pablo-garay@users.noreply.github.com>
Co-authored-by: Pablo Garay <palenq@gmail.com>
Co-authored-by: pablo-garay <pablo-garay@users.noreply.github.com>
* fix t5 g2p model

Signed-off-by: Jason <jasoli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: blisc <blisc@users.noreply.github.com>

---------

Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: blisc <blisc@users.noreply.github.com>
Co-authored-by: blisc <blisc@users.noreply.github.com>
* Remove config aligner - no longer needed after TRT-LLM 0.9 update

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Change default export precision to bf16 (more frequent)

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Specify gpt_attention_plugin

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

---------

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
Co-authored-by: Pablo Garay <palenq@gmail.com>
* update branch

Signed-off-by: eharper <eharper@nvidia.com>

* pin

Signed-off-by: eharper <eharper@nvidia.com>

---------

Signed-off-by: eharper <eharper@nvidia.com>
…r::forward (NVIDIA-NeMo#9246)

* Accept None as an argument to decoder_lengths in GreedyBatchedCTCInfer::forward

GreedyCTCInfer::forward already allowed for this, so they did not
implement the exact same interface. Now, they do.

Also warn about not passing in the decoder_lengths argument. It is
likely an error on the user's part not to pass it in explicitly.

Signed-off-by: Daniel Galvez <dgalvez@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: titu1994 <titu1994@users.noreply.github.com>

* Log warning only once for sanity.

Signed-off-by: Daniel Galvez <dgalvez@nvidia.com>

---------

Signed-off-by: Daniel Galvez <dgalvez@nvidia.com>
Signed-off-by: titu1994 <titu1994@users.noreply.github.com>
Co-authored-by: titu1994 <titu1994@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Signed-off-by: Nithin Rao Koluguri <nithinraok>
Co-authored-by: Nithin Rao Koluguri <nithinraok>
* Add trtllm checkpoint

* Change model config

* fix no query_group

* Using build API

* Change export to new API

* Update generate API

* Fix runtime config

* Fix for llama

* Fix for ptuning

* Fix TP issue

* Change TP rank for building weight dict

* Add lora config

* add prompt embedding table config

* Fix PP isue

* PP layers fix

* Fix no prompt task ids

* Add bos for Gemma

* Add multi block mode

* Embedding and layernorm for PP

* MPI multiprocess support for multinode

* Only output text on first rank

* Change to ModelRunnerCpp

* Add falcon

* Add rotary_pct default value

* Falcon fix

* Add MOE config

* Fix MOE weight dict

* Clean code

* Add rotary_base

* Fix MOE config

* Fix falcon new architecture

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix Gemma 7B

* Add rotary_scaling

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

---------

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: abharwani <abharwani@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Onur Yilmaz <35306097+oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
…o#9210)

* Add params like max_num_tokens and opt_num_tokens

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* remove padding param added

* update params like max_num_token

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* remove context context_fmha param for now

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* add params like max num token to the script

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

---------

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: Pablo Garay <palenq@gmail.com>
* update launcher name and fix mm circular import

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

---------

Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: cuichenx <cuichenx@users.noreply.github.com>
* Remove .nemo instead of renaming

Signed-off-by: Mikołaj Błaż <mblaz@nvidia.com>

* add ignore_errors=True flag

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Revert "Remove .nemo instead of renaming"

This reverts commit b836410.

Signed-off-by: Mikołaj Błaż <mblaz@nvidia.com>

* Remove backup .nemo after success

Signed-off-by: Mikołaj Błaż <mblaz@nvidia.com>

* Update tests

Signed-off-by: Mikołaj Błaż <mblaz@nvidia.com>

* Backup .nemo imediately before save_to

Signed-off-by: Mikołaj Błaż <mblaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: mikolajblaz <mikolajblaz@users.noreply.github.com>

* Fix CTC import

Signed-off-by: Mikołaj Błaż <mblaz@nvidia.com>

---------

Signed-off-by: Mikołaj Błaż <mblaz@nvidia.com>
Signed-off-by: dimapihtar <dpihtar@gmail.com>
Signed-off-by: mikolajblaz <mikolajblaz@users.noreply.github.com>
Co-authored-by: dimapihtar <dpihtar@gmail.com>
* neva media_type + text generation default fix

Signed-off-by: paul-gibbons <paul@gibbonspaul.com>

* Apply isort and black reformatting

Signed-off-by: paul-gibbons <paul-gibbons@users.noreply.github.com>

* repeat image_folder

Signed-off-by: paul-gibbons <paul@gibbonspaul.com>

---------

Signed-off-by: paul-gibbons <paul@gibbonspaul.com>
Signed-off-by: paul-gibbons <paul-gibbons@users.noreply.github.com>
Co-authored-by: paul-gibbons <paul-gibbons@users.noreply.github.com>
* fix lora and ptuning and isort/black

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* remove raise error when multiple config files

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* fix script issues

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

---------

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
* add check if num_layers % pp == 0

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>

* move num_layers / pp check to build_transformer_config

Signed-off-by: dimapihtar <dpihtar@gmail.com>

---------

Signed-off-by: dimapihtar <dpihtar@gmail.com>
Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>
Co-authored-by: dimapihtar <dimapihtar@users.noreply.github.com>
* Added the BOS token for Llama, Mistral and Mixtral.

Signed-off-by: Alexey Panteleev <alpanteleev@nvidia.com>

* Don't load an existing TRT-LLM model before export to speed up the export process and avoid possible contamination from previous runs.

Signed-off-by: Alexey Panteleev <alpanteleev@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: apanteleev <apanteleev@users.noreply.github.com>

---------

Signed-off-by: Alexey Panteleev <alpanteleev@nvidia.com>
Signed-off-by: apanteleev <apanteleev@users.noreply.github.com>
Co-authored-by: apanteleev <apanteleev@users.noreply.github.com>
Co-authored-by: Onur Yilmaz <35306097+oyilmaz-nvidia@users.noreply.github.com>
titu1994 and others added 16 commits May 23, 2024 19:04
Signed-off-by: smajumdar <titu1994@gmail.com>
* Guard cuda memory allocator update

Signed-off-by: smajumdar <titu1994@gmail.com>

* Apply isort and black reformatting

Signed-off-by: titu1994 <titu1994@users.noreply.github.com>

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: titu1994 <titu1994@users.noreply.github.com>
Signed-off-by: Nithin Rao Koluguri <nithinraok>
Co-authored-by: Nithin Rao Koluguri <nithinraok>
* add deprecation warnings for non-mcore models

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* change warning default time

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove unused import

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>

* remove deprecated tests

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* set mcore_gpt to True

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* set mcore_bert to True

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove deprecated tests

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove deprecated unit tests

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* add deprecation warning

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>

* remove deprecated playbook

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Apply isort and black reformatting

Signed-off-by: pablo-garay <pablo-garay@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>

* remove deprecated tutorial

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* turn off FA for Bert

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* turn of FA for Bert

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* change mcore commit

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>

* adjustments

* update TE commit

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* fix mcore precision issue

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* change precision for bert

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* change precision for fine-tuning

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* turn off fused attention for bert

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* fix bert test

Signed-off-by: dimapihtar <dpihtar@gmail.com>

---------

Signed-off-by: dimapihtar <dpihtar@gmail.com>
Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>
Signed-off-by: pablo-garay <pablo-garay@users.noreply.github.com>
Co-authored-by: dimapihtar <dpihtar@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: dimapihtar <dimapihtar@users.noreply.github.com>
Co-authored-by: pablo-garay <pablo-garay@users.noreply.github.com>
Co-authored-by: Dmytro Pykhtar <37850217+dimapihtar@users.noreply.github.com>
* move pooler under post_process

Signed-off-by: Dmytro Pykhtar <dpykhtar@login-eos01.eos.clusters.nvidia.com>

* move pooler under post_process

Signed-off-by: Dmytro Pykhtar <dpykhtar@login-eos01.eos.clusters.nvidia.com>

* change pp size to 2 for bert pp test

Signed-off-by: Dmytro Pykhtar <dpykhtar@login-eos01.eos.clusters.nvidia.com>

* change precision for gpt mock data generation test

Signed-off-by: Dmytro Pykhtar <dpykhtar@login-eos01.eos.clusters.nvidia.com>

---------

Signed-off-by: Dmytro Pykhtar <dpykhtar@login-eos01.eos.clusters.nvidia.com>
Co-authored-by: Dmytro Pykhtar <dpykhtar@login-eos01.eos.clusters.nvidia.com>
* Re-enable cuda graphs in training modes.

"global" capture mode was sporadically crashing because of pinning
host memory in other threads spawned by the data loader when
num_workers > 0.

Add relevant changs to TDT cuda graphs decoding as well.

I didn't test the TDT change because I'm not sure how. But it seems low risk.

Signed-off-by: Daniel Galvez <dgalvez@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: galv <galv@users.noreply.github.com>

---------

Signed-off-by: Daniel Galvez <dgalvez@nvidia.com>
Signed-off-by: galv <galv@users.noreply.github.com>
…ariable seq (NVIDIA-NeMo#9259)

* add stable training fix and contrastive loss update for variable seq length input

Signed-off-by: Nithin Rao Koluguri <nithinraok>

* Apply isort and black reformatting

Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>

* replace remove_bias with use_bias

Signed-off-by: Nithin Rao Koluguri <nithinraok>

---------

Signed-off-by: Nithin Rao Koluguri <nithinraok>
Signed-off-by: nithinraok <nithinraok@users.noreply.github.com>
Co-authored-by: Nithin Rao Koluguri <nithinraok>
Co-authored-by: nithinraok <nithinraok@users.noreply.github.com>
* add deprecation note

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>

---------

Signed-off-by: Dmytro Pykhtar <dpykhtar@nvidia.com>
Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>
Co-authored-by: dimapihtar <dimapihtar@users.noreply.github.com>
Co-authored-by: Pablo Garay <palenq@gmail.com>
…o#9204)

* Fix incorrect checkpoint removal logic (NVIDIA-NeMo#9192)

* Fix incorrect if logic

Signed-off-by: Mikołaj Błaż <mblaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: mikolajblaz <mikolajblaz@users.noreply.github.com>

---------

Signed-off-by: Mikołaj Błaż <mblaz@nvidia.com>
Signed-off-by: mikolajblaz <mikolajblaz@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: mikolajblaz <mikolajblaz@users.noreply.github.com>

---------

Signed-off-by: Mikołaj Błaż <mblaz@nvidia.com>
Signed-off-by: mikolajblaz <mikolajblaz@users.noreply.github.com>
Co-authored-by: Pablo Garay <palenq@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Ali Taghibakhshi <71892896+JRD971000@users.noreply.github.com>
…o#9347) (NVIDIA-NeMo#9350)

* Fix GreedyBatchedCTCInfer regression from GreedyCTCInfer. (NVIDIA-NeMo#9347)

* Fix GreedyBatchedCTCInfer regression from GreedyCTCInfer.

decoder_lengths is allowed to be on CPU even when decoder_output is on
GPU. This matches the behavior of GreedyCTCInfer. Even though that
behavior is unintentional, there is code depending on that behavior,
including our jupyter notebooks.

Signed-off-by: Daniel Galvez <dgalvez@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: titu1994 <titu1994@users.noreply.github.com>

---------

Signed-off-by: Daniel Galvez <dgalvez@nvidia.com>
Signed-off-by: titu1994 <titu1994@users.noreply.github.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: titu1994 <titu1994@users.noreply.github.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
(cherry picked from commit aed9d07)

* Add Packaging to install documentation

Signed-off-by: smajumdar <titu1994@gmail.com>

* Mark confidence tests as please fix me

Signed-off-by: smajumdar <titu1994@gmail.com>

---------

Signed-off-by: smajumdar <titu1994@gmail.com>
Co-authored-by: Daniel Galvez <galv@users.noreply.github.com>
Co-authored-by: Pablo Garay <palenq@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
…VIDIA-NeMo#9380)

* Fixed clustering diarizer to load MSDD to GPU by default if cuda on

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Fixed clustering diarizer to load MSDD to GPU by default if cuda on

Signed-off-by: Taejin Park <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

---------

Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Co-authored-by: tango4j <tango4j@users.noreply.github.com>
* fix fp16 precision issue by disabling enable_autocast

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* revert config

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* add fp16 precision test

Signed-off-by: dimapihtar <dpihtar@gmail.com>

---------

Signed-off-by: dimapihtar <dpihtar@gmail.com>
@diarray-hub diarray-hub requested a review from a team as a code owner June 9, 2026 13:09
@copy-pr-bot

copy-pr-bot Bot commented Jun 9, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.
