Code repository for the paper: link
Manh Luong and Viet Anh Tran, under review at INTERSPEECH 2021.
We use the VCTK corpus to train and evaluate our proposed model; the VCTK dataset can be found at this link.
The pretrained model can be downloaded at this link.
Wavenet Vocoder: link
- Python 3.6 or newer.
- PyTorch 1.4 or newer.
- librosa.
- tensorboardX.
- wavenet_vocoder, installed via: pip install wavenet_vocoder
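For convenience, the Python dependencies can be installed in one shot. This is a minimal sketch; the version pins below are assumptions rather than repo requirements, and the exact PyTorch build you need depends on your CUDA setup:

# Version pins are assumptions; adjust the PyTorch build to your CUDA version.
pip install "torch>=1.4" librosa tensorboardX wavenet_vocoder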
- Download and uncompress the VCTK dataset.
- Move the extracted dataset into [home directory].
- Run the command: export HOME=[home directory]
- Run the command: bash preprocessing.sh (the full sequence is sketched below)
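For reference, here are the preprocessing steps combined into one sketch, assuming the dataset root /path/to/home (a hypothetical path; substitute your own):

# Hypothetical path; substitute your own dataset root.
export HOME=/path/to/home     # the scripts read the dataset from $HOME
mv VCTK-Corpus "$HOME"/       # place the uncompressed corpus there
bash preprocessing.sh         # run the repo's preprocessing script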
To train the model, run the following command:
bash training.sh
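Since tensorboardX is listed as a dependency, training can presumably be monitored with TensorBoard. The log directory below is a guess; check training.sh for where summaries are actually written:

# Log directory is an assumption; see training.sh for the actual path.
tensorboard --logdir ./results/logs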
To convert voice from a source speaker to a target speaker using the pretrained model, run the following commands:
- cd [Disentangled-VAE directory]
- mkdir ./results/checkpoints
- cp [your downloaded checkpoint] ./results/checkpoints/
- Download the pretrained Wavenet_vocoder model.
- cp [downloaded Wavenet_Vocoder]/checkpoint_step001000000_ema.pth [Disentangled-VAE directory]
- Edit the two variables src_spk and trg_spk in conversion.sh to your source and target speaker, respectively.
- Run the command: bash conversion.sh (the full sequence is sketched below)
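Putting the conversion steps together, a minimal sketch with hypothetical paths and example VCTK speaker IDs (p225/p226 are illustrative, not prescribed). The sed line assumes conversion.sh assigns src_spk=... and trg_spk=... as plain shell variables; edit the file by hand if the format differs:

# Hypothetical paths and speaker IDs; substitute your own.
cd /path/to/Disentangled-VAE
mkdir -p ./results/checkpoints
cp /path/to/downloaded_checkpoint.pth ./results/checkpoints/
cp /path/to/Wavenet_Vocoder/checkpoint_step001000000_ema.pth .
# Assumes src_spk=.../trg_spk=... assignments exist in conversion.sh.
sed -i 's/^src_spk=.*/src_spk=p225/; s/^trg_spk=.*/trg_spk=p226/' conversion.sh
bash conversion.sh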