Issue in reproducing results from the paper #7

@tripathiarpan20

Hi!
Thanks for the open-source work.

I tried training an LLVC model on a custom voice using a pretrained model from weights.gg. However, several discrepancies prevented me from reproducing the results:

  • The number of epochs is hardcoded to 10000, as opposed to the 53 mentioned in the paper.
  • Training takes more than 1 hour for just 250 global steps on an H100 instance, and ~7-8 hours per 250 global steps on an A100 instance (using the default arguments in experiments/llvc/config.json, with only log_interval and checkpoint_interval altered). This is far slower than the paper's reported training time of 3 days on an RTX 3090 (for 500000 steps). Any suggestions for fixing this?
    [image]
  • Is there a parameter-efficient way to fine-tune a new voice model from the pretrained G_500000.pth checkpoint, in a form that can be open-sourced?
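
To put numbers on the gap, here is a quick back-of-the-envelope calculation using only the figures quoted above. It is a rough sketch: the helper name `seconds_per_step` is hypothetical (not part of the LLVC repo), the paper's "3 days" is assumed to be wall-clock time for all 500000 steps, and the steps-per-epoch value is merely what 53 epochs and 500000 steps would imply together.

```python
def seconds_per_step(total_hours: float, steps: int) -> float:
    """Average wall-clock seconds per global step (hypothetical helper)."""
    return total_hours * 3600.0 / steps


# Figures quoted in this issue; nothing here is measured by the script itself.
paper = seconds_per_step(3 * 24, 500_000)  # paper: 3 days on an RTX 3090
h100 = seconds_per_step(1, 250)            # observed: >1 hour per 250 steps (lower bound)
a100_low = seconds_per_step(7, 250)        # observed: ~7 hours per 250 steps
a100_high = seconds_per_step(8, 250)       # observed: ~8 hours per 250 steps

# Steps per epoch implied by 53 epochs covering 500000 steps (an inference,
# assuming both numbers describe the same training run).
implied_steps_per_epoch = 500_000 / 53

print(f"paper (RTX 3090): {paper:.4f} s/step")
print(f"H100 (observed):  >= {h100:.1f} s/step ({h100 / paper:.1f}x slower)")
print(f"A100 (observed):  {a100_low:.1f}-{a100_high:.1f} s/step")
print(f"implied steps per epoch: {implied_steps_per_epoch:.0f}")
```

In other words, the paper's figures work out to roughly 0.52 s per step, while I am seeing at least 14.4 s per step on the H100 and about 100-115 s per step on the A100, so the slowdown is well over an order of magnitude rather than a small hardware difference.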
