Latent Terrain: Coordinates-to-Latents Generator for Neural Audio Autoencoders

New documentation page: https://jasper-zheng.github.io/nn_terrain/

Latent terrain is a coordinates-to-latents mapping model for neural audio autoencoders (such as RAVE, Music2Latent). A terrain is a surface map for the autoencoder's latent space, taking coordinates in a control space as inputs, and producing continuous latent vectors in real-time.
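
To make the coordinates-to-latents idea concrete, here is a minimal, untrained sketch in plain Python. It is not the nn.terrain~ implementation: the class name `ToyTerrain`, the layer sizes, the random weights, and the 8-dimensional latent size are all assumptions chosen for illustration. It only shows the general shape of the mapping, where a 2-D control coordinate is encoded with Fourier features (Tancik et al., 2020) and passed through a small network to produce a latent vector.

```python
import math
import random


def fourier_features(xy, freqs):
    """Encode one (x, y) coordinate as sin/cos features (Tancik et al., 2020)."""
    feats = []
    for fx, fy in freqs:
        phase = 2.0 * math.pi * (fx * xy[0] + fy * xy[1])
        feats.extend((math.sin(phase), math.cos(phase)))
    return feats


def dense(inputs, weights):
    """Apply a weight matrix stored as a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, inputs)) for row in weights]


class ToyTerrain:
    """Hypothetical stand-in for a terrain: 2-D coordinates -> latent vector.

    Weights are random (untrained); a real terrain would be fitted so that
    chosen coordinates map to latents produced by the autoencoder's encoder.
    """

    def __init__(self, latent_dim=8, n_freqs=16, hidden=32, scale=2.0, seed=0):
        rng = random.Random(seed)
        self.freqs = [(rng.gauss(0, scale), rng.gauss(0, scale)) for _ in range(n_freqs)]
        s1 = 1.0 / math.sqrt(2 * n_freqs)
        s2 = 1.0 / math.sqrt(hidden)
        self.w1 = [[rng.gauss(0, s1) for _ in range(2 * n_freqs)] for _ in range(hidden)]
        self.w2 = [[rng.gauss(0, s2) for _ in range(hidden)] for _ in range(latent_dim)]

    def __call__(self, xy):
        hidden = [math.tanh(h) for h in dense(fourier_features(xy, self.freqs), self.w1)]
        return dense(hidden, self.w2)


terrain = ToyTerrain()
# Walk diagonally across the control space and collect one latent per step.
path = [(t / 4, t / 4) for t in range(5)]
latents = [terrain(p) for p in path]
print(len(latents), len(latents[0]))  # 5 8
```

Because every operation in the sketch is smooth, nearby coordinates yield nearby latent vectors, which is the property that makes a terrain navigable from an XY-pad or a stylus.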

Latent terrain aims to open up the creative possibilities of latent space navigation, allowing one to adapt an autoencoder to easier-to-navigate interfaces (such as gestural controllers, stylus and tablets, XY-pads, and more), and build new musical instruments that compose and interact with AI audio generators.

An example latent space walk with Music2Latent:

terrain-walk.mp4

For all documentation, including installation and build instructions, please see our new web documentation: https://jasper-zheng.github.io/nn_terrain/

Change Logs

Oct. 2025 v1.5.6.1 and v1.6.0.1

nn.terrain~:
- Terrain training moved to multi-threading; the train and plot_interval messages no longer block the main thread.
- The last argument of plot_interval now defines which latent dimension to sample; a new attribute, plot_multi_channel, has been added for coloured plots.
nn.terrain.encode:
- Added support for autoencoders with stereo I/O channels (i.e., stable-audio-open-1.0 is now supported);
- Default encoder_batch_size changed from 64 to 16;
- Fixed the faulty argument list.
Cleaned up the codebase: Fourier-CPPN, Dataset classes moved to backend.cpp.

Aug. 2025 v1.5.6 and v1.6.0

[BREAKING CHANGE] Changed the order of arguments for the plot_interval method in nn.terrain~ to align with the value_boundaries attribute in nn.terrain.gui.
[BREAKING CHANGE] All attribute names updated.
Mouse behaviours in nn.terrain.gui updated: the output coordinates dictionary will be automatically updated whenever a "mouse up" is performed.
Fixed the playhead updates in the play mode in nn.terrain.gui.
[HELP FILES] Added a JS script for patch cords scripting.

May 2025 v1.5.6-beta and v1.6.0-beta

The first release.

Build Instructions

Please refer to https://jasper-zheng.github.io/nn_terrain/compile/.

TODOs

[✕︎] Load and run inference with a scripted mapping model exported by TorchScript.
[✔︎] Display terrain visualisation.
- [✔︎] Greyscale (one-channel)
- [✔︎] Multi-channel (supported, but not yet documented)
[✔︎] Interactive training of terrain models in Max MSP.
[✔︎] Customised configuration of Fourier-CPPNs (Tancik et al., 2020).
[✔︎] Example patches, tutorials...
[✕︎] PureData

Get in touch

Hi, this is Shuoyang (Jasper). nn.terrain~ is part of my ongoing PhD work on Discovering Musical Affordances in Neural Audio Synthesis, supervised by Anna Xambó Sedó and Nick Bryan-Kinns, and part of the work has been (will be) on putting AI audio generators into the hands of composers/musicians.

Therefore, I would love to have you involved: if you have any feedback, a feature request, or a demo, a device, or anything else made with nn.terrain, I would love to hear about it. If you would like to collaborate on anything, please leave a message in this feedback form.

Acknowledgements

Shuoyang Zheng, the author of this work, is supported by UK Research and Innovation [EP/S022694/1].
This is built on top of acids-ircam's nn_tilde, with a lot of reused code, including the CMakeLists templates, backend.cpp, circular_buffer.h, and the model performing loop in nn.terrain_tilde.cpp.
Caillon, A., Esling, P., 2022. Streamable Neural Audio Synthesis With Non-Causal Convolutions. https://doi.org/10.48550/arXiv.2204.07064
Tancik, M., Srinivasan, P.P., Mildenhall, B., Fridovich-Keil, S., Raghavan, N., Singhal, U., Ramamoorthi, R., Barron, J.T., Ng, R., 2020. Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains. NeurIPS.
Vigliensoni, G., Fiebrink, R., 2023. Steering latent audio models through interactive machine learning, in: Proceedings of the 14th International Conference on Computational Creativity.
