Hi!
Thanks for the open-source work.
I tried training an LLVC model on a custom voice, starting from a pretrained model from weights.gg, but ran into several discrepancies that led to a failure to reproduce the results:
- The number of epochs is hardcoded to 10000, as opposed to the 53 mentioned in the paper.
- Training takes more than 1 hour per 250 global steps on an H100 instance, and roughly 7-8 hours per 250 global steps on an A100 instance, using the default arguments in experiments/llvc/config.json with only log_interval and checkpoint_interval altered (a sketch of those overrides follows this list). This is far off from the paper's reported training time of 3 days on an RTX 3090 for 500,000 steps. Any suggestions on how to fix this?

- Is there a way to efficiently finetune a new voice model from the pretrained G_500000.pth checkpoint in a parameter-efficient manner, so that the result can be open-sourced? (A freezing sketch of what I have in mind follows this list.)
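
For reference, this is roughly how I edited the config before training. It is a minimal sketch assuming flat top-level keys named log_interval and checkpoint_interval in experiments/llvc/config.json; if the shipped config nests these differently, the paths would need adjusting.

```python
import json

# Load the experiment config shipped with the repo.
with open("experiments/llvc/config.json") as f:
    cfg = json.load(f)

# The only values I changed from the defaults (the key names are my assumption
# about the config schema; everything else was left untouched).
cfg["log_interval"] = 250
cfg["checkpoint_interval"] = 250

with open("experiments/llvc/config.json", "w") as f:
    json.dump(cfg, f, indent=2)
```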
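
As for the finetuning question, here is the kind of approach I mean: load G_500000.pth, freeze the bulk of the generator, and only update a small head for the new voice. This is only a sketch under my own assumptions (a placeholder stand-in for the real generator class and checkpoint layout), not a claim about how this repo's code is organized.

```python
import torch
import torch.nn as nn

# Placeholder stand-in for the LLVC generator; in practice this would be the
# actual model class from this repo (its name and structure are assumptions).
net_g = nn.Sequential(
    nn.Conv1d(1, 32, kernel_size=5, padding=2),  # "encoder"-like stack (placeholder)
    nn.Conv1d(32, 1, kernel_size=5, padding=2),  # "decoder"-like head (placeholder)
)

# Load the released checkpoint; the "model" key is an assumption about its layout.
ckpt = torch.load("G_500000.pth", map_location="cpu")
state = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt
net_g.load_state_dict(state, strict=False)

# Freeze everything, then unfreeze only the final block, so only a small
# fraction of the parameters is updated for the new voice.
for p in net_g.parameters():
    p.requires_grad = False
for p in net_g[-1].parameters():
    p.requires_grad = True

optimizer = torch.optim.AdamW(
    [p for p in net_g.parameters() if p.requires_grad], lr=1e-4
)
```

Something along these lines (or a LoRA-style adapter on the convolutions) is what I was hoping could be supported or documented.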