This is the codebase for the paper Enforcing Paraphrase Generation via Controllable Latent Diffusion.
Your own dataset should be placed in the datasets directory and split into train, valid, and test subsets.
Each subset should be in CSV format with src and tgt as the column headers.
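The layout described above can be produced with a short script. This is a minimal sketch, not part of the repository: the function names, the 80/10/10 split, and the `<subset>.csv` file names are assumptions made for illustration, so check the names your config file expects.

```python
import csv
import random
from pathlib import Path


def split_pairs(pairs, valid_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle (src, tgt) pairs and split them into train/valid/test lists."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n_valid = int(len(pairs) * valid_frac)
    n_test = int(len(pairs) * test_frac)
    return {
        "valid": pairs[:n_valid],
        "test": pairs[n_valid:n_valid + n_test],
        "train": pairs[n_valid + n_test:],
    }


def write_splits(splits, out_dir):
    """Write one CSV per subset, each starting with the src,tgt header row."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, rows in splits.items():
        # ASSUMPTION: files are named <subset>.csv; the repository may expect other names.
        with open(out_dir / f"{name}.csv", "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["src", "tgt"])
            writer.writerows(rows)
```

For example, `write_splits(split_pairs(my_pairs), "datasets/my_dataset")` would write the three files under a hypothetical `datasets/my_dataset` directory, where `my_pairs` is a list of (source sentence, paraphrase) tuples.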
For training, use main.py with the following arguments:
--config: the path to your YAML config file, which should be placed in the conf directory
--mode: either train or resume
--ckpt: required only in resume mode
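As an illustration of how these arguments fit together, the sketch below assembles a main.py invocation as an argument list. The helper `build_train_command` and the example paths are hypothetical; only the flag names and the two mode values come from the description above.

```python
import sys


def build_train_command(config, mode="train", ckpt=None):
    """Assemble the argument list for main.py from the flags described above."""
    if mode not in ("train", "resume"):
        raise ValueError("mode must be 'train' or 'resume'")
    if mode == "resume" and ckpt is None:
        raise ValueError("--ckpt is required in resume mode")
    cmd = [sys.executable, "main.py", "--config", config, "--mode", mode]
    if mode == "resume":
        # --ckpt is only passed when resuming.
        cmd += ["--ckpt", ckpt]
    return cmd


# Hypothetical paths, for illustration only.
fresh = build_train_command("conf/my_config.yaml")
resumed = build_train_command("conf/my_config.yaml", mode="resume", ckpt="path/to/checkpoint")
```

Either list can then be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the run.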
For inference, use seq2seq.py with the following arguments:
--ckpt_dir: the checkpoint directory
--config: use the same config file as in training; you can find it at <SAVE_PATH>/conf.yaml
Use controlnet_train.py
--ckpt: refers to the original ldp checkpoint path
--ldp: refers to the original ldp checkpoint path
--ckpt_dir: the checkpoint directory
If you find the code helpful, please cite:
@article{zou2024enforcing,
title={Enforcing Paraphrase Generation via Controllable Latent Diffusion},
author={Zou, Wei and Zhuang, Ziyuan and Huang, Shujian and Liu, Jia and Chen, Jiajun},
journal={arXiv preprint arXiv:2404.08938},
year={2024}
}