discriminator weights #14

@xm1988but

Hi there, I just read your fantastic paper and tried your open-source model. The results seem quite promising. Congrats!

I've seen a zero-shot VC speaker-similarity score of 0.78, which is close to or even better than fine-tuned versions of previous technologies. Now I am trying to fine-tune on my own data. Based on some short experiments, fine-tuning only the DiT module may not be sufficient to push the similarity score above 0.8.

Perhaps this is because I haven't fine-tuned the discriminator or the Vocos decoder. However, it seems the discriminator weights are not public yet. Is that true, or did I miss something? It would be awesome if you could share those weights.

BTW, what threshold would you recommend for selecting a suitable prompt utterance from the same speaker? Currently I am using 0.8. Any suggestions?
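
To make the thresholding question concrete, here is a minimal sketch of the kind of filter being described. It assumes the score is the cosine similarity between fixed-length speaker embeddings, which the issue does not actually state, and the embeddings are assumed to come from some external speaker-verification model that is not shown. The names `cosine_similarity` and `select_prompts` are hypothetical helpers, not part of the repository.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_prompts(
    reference: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # the value mentioned in the question; an assumption, not a recommendation
) -> List[str]:
    """Return candidate names whose similarity to the reference is >= threshold, best first."""
    scored = [(name, cosine_similarity(reference, emb)) for name, emb in candidates.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in kept]


if __name__ == "__main__":
    # Toy 2-D vectors stand in for real speaker embeddings.
    target = [1.0, 0.0]
    pool = {"utt_a": [1.0, 0.0], "utt_b": [1.0, 1.0], "utt_c": [0.9, 0.1]}
    print(select_prompts(target, pool))
```

With the toy vectors above, `utt_b` falls below 0.8 and is dropped, while the other two are kept in descending order of similarity. Raising the threshold trades a smaller prompt pool for closer speaker matches.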

Many thanks!
