New documentation page: https://jasper-zheng.github.io/nn_terrain/
Latent terrain is a coordinates-to-latents mapping model for neural audio autoencoders (such as RAVE, Music2Latent). A terrain is a surface map for the autoencoder's latent space, taking coordinates in a control space as inputs, and producing continuous latent vectors in real-time.
Latent terrain aims to open up the creative possibilities of latent space navigation, allowing one to adapt an autoencoder to easier-to-navigate interfaces (such as gestural controllers, stylus and tablets, XY-pads, and more), and build new musical instruments that compose and interact with AI audio generators.
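To make the mapping concrete, here is a minimal PyTorch sketch of the idea (an illustration, not the `nn.terrain~` implementation): a small network maps 2D control coordinates to latent vectors, which a pretrained autoencoder decoder then turns into audio. The decoder call is shown as a comment and assumes a TorchScript RAVE-style export exposing `decode()` on latents shaped `(batch, dims, time)`; the file name and latent size are hypothetical.

```python
import math
import torch
import torch.nn as nn

LATENT_DIM = 8  # assumption: depends on the autoencoder export

# A toy "terrain": any smooth function from 2D coordinates to latent vectors.
# nn.terrain~ trains a Fourier-CPPN for this; a plain MLP is used here for brevity.
terrain = nn.Sequential(
    nn.Linear(2, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, LATENT_DIM),
)

# A path through the control space, e.g. coordinates streamed from an XY-pad.
t = torch.linspace(0, 1, steps=256)
coords = torch.stack([t, torch.sin(2 * math.pi * t)], dim=-1)  # (256, 2)

with torch.no_grad():
    z = terrain(coords)       # (256, LATENT_DIM) latent trajectory
    z = z.T.unsqueeze(0)      # (1, LATENT_DIM, 256) for a RAVE-style decoder

# Hypothetical decoding step, assuming a TorchScript RAVE export on disk:
# rave = torch.jit.load("rave_model.ts")
# audio = rave.decode(z)      # (1, 1, 256 * compression_ratio)
```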
An example latent space walk with Music2Latent:
terrain-walk.mp4
For all documentation, installation, and build instructions, please see our new web documentation: https://jasper-zheng.github.io/nn_terrain/
Oct. 2025 v1.5.6.1 and v1.6.0.1
`nn.terrain~`:
- Terrain training moved to its own thread: the `train` and `plot_interval` messages won't block the main thread anymore.
- The last argument in `plot_interval` now defines which latent dimension to sample; a new attribute `plot_multi_channel` is added for coloured plots.

`nn.terrain.encode`:
- Supported autoencoders with stereo IO channels (i.e., supported stable-audio-open-1.0).
- Default `encoder_batch_size` changed from 64 to 16.
- Fixed the faulty argument list.

- Cleaned up the codebase: the Fourier-CPPN and Dataset classes moved to `backend.cpp`.
Aug. 2025 v1.5.6 and v1.6.0
- [BREAKING CHANGE] Changed the order of arguments for the `plot_interval` method in `nn.terrain~`, to align with the `value_boundaries` attribute in `nn.terrain.gui`.
- [BREAKING CHANGE] All attribute names updated.
- Mouse behaviours in `nn.terrain.gui` updated: the output coordinates dictionary is now automatically updated whenever a "mouse up" is performed.
- Fixed the playhead updates in the `play` mode in `nn.terrain.gui`.
- [HELP FILES] Added a JS script for patch-cord scripting.
May. 2025 v1.5.6-beta and v1.6.0-beta
- The first release.
To compile from source, please refer to https://jasper-zheng.github.io/nn_terrain/compile/.
- [✕︎] Load and run inference on scripted mapping models exported by TorchScript.
- [✔︎] Display terrain visualisation.
- [✔︎] Greyscale (one-channel)
- [✔︎] Multi-channel (supported, but no documentation at the moment)
- [✔︎] Interactive training of terrain models in Max MSP.
- [✔︎] Customised configuration of Fourier-CPPNs (Tancik et al., 2020); see the sketch after this list.
- [✔︎] Example patches, tutorials...
- [✕︎] PureData
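For reference, the Fourier-CPPN named above combines the Fourier feature mapping of Tancik et al. (2020), γ(v) = [cos(2πBv), sin(2πBv)] with a fixed random matrix B, with a small MLP that outputs latent vectors. The sketch below is a hedged illustration of that recipe plus a TorchScript export (relevant to the scripted-mapping-model item above); the layer sizes, parameter names, training loop, and output file name are assumptions, not the external's actual configuration.

```python
import math
import torch
import torch.nn as nn

class FourierCPPN(nn.Module):
    """Illustrative Fourier-feature CPPN (after Tancik et al., 2020).

    Coordinates v are lifted to [cos(2*pi*B v), sin(2*pi*B v)] with a fixed
    random Gaussian matrix B, then mapped to latent vectors by a small MLP.
    Names and defaults here are assumptions, not nn.terrain~'s actual config.
    """
    def __init__(self, in_dim=2, latent_dim=8, n_features=64, scale=10.0, hidden=128):
        super().__init__()
        # Fixed (non-trainable) random frequencies; `scale` controls how
        # high-frequency (i.e. how "bumpy") the learned terrain can be.
        self.register_buffer("B", torch.randn(in_dim, n_features) * scale)
        self.mlp = nn.Sequential(
            nn.Linear(2 * n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, coords):                      # coords: (batch, in_dim)
        proj = 2 * math.pi * coords @ self.B        # (batch, n_features)
        feats = torch.cat([proj.cos(), proj.sin()], dim=-1)
        return self.mlp(feats)                      # (batch, latent_dim)

model = FourierCPPN()

# Fit the terrain to a few (coordinate, latent) anchor pairs -- the kind of
# interactive training nn.terrain~ performs inside Max, sketched offline here
# with random placeholder data.
coords = torch.rand(16, 2)                          # control-space anchors
targets = torch.randn(16, 8)                        # latents encoded from audio
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(coords), targets)
    loss.backward()
    opt.step()

# Export as TorchScript so a scripted mapping model could be loaded elsewhere.
torch.jit.script(model).save("terrain_mapping.ts")
```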
Hi, this is Shuoyang (Jasper). nn.terrain~ is part of my ongoing PhD work on Discovering Musical Affordances in Neural Audio Synthesis, supervised by Anna Xambó Sedó and Nick Bryan-Kinns, and part of this work has been (and will be) about putting AI audio generators into the hands of composers and musicians.
Therefore, I would love to have you involved: if you have any feedback, a feature request, a demo, a device, or anything made with nn.terrain, I would love to hear about it. If you would like to collaborate on anything, please leave a message in this feedback form.
- Shuoyang Zheng, the author of this work, is supported by UK Research and Innovation [EP/S022694/1].
- This is built on top of acids-ircam's nn_tilde, with a lot of reused code, including the CMakeLists templates, `backend.cpp`, `circular_buffer.h`, and the model-performing loop in `nn.terrain_tilde.cpp`.
- Caillon, A., Esling, P., 2022. Streamable Neural Audio Synthesis With Non-Causal Convolutions. https://doi.org/10.48550/arXiv.2204.07064
- Tancik, M., Srinivasan, P.P., Mildenhall, B., Fridovich-Keil, S., Raghavan, N., Singhal, U., Ramamoorthi, R., Barron, J.T., Ng, R., 2020. Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains. NeurIPS.
- Vigliensoni, G., Fiebrink, R., 2023. Steering Latent Audio Models through Interactive Machine Learning, in: Proceedings of the 14th International Conference on Computational Creativity.