Support for AudioLDM2

There seems to be a working implementation of [AudioLDM2](https://github.com/haoheliu/AudioLDM2).

I understand you have already mentioned that you will implement Vocos and AudioCraft, but it seems to me that AudioLDM2 produces better outputs.

Could you please have a look? :)