
Differentiable grouped feedback delay networks for late reverberation modelling in coupled spaces

orchidas/DiffGFDN

Data-driven 6DoF late reverberation rendering in coupled spaces for Augmented Reality

The goal of this work is to learn spatially-dynamic late reverberation properties in a complex space from a measured set of RIRs / SRIRs, and to render dynamic late reverberation as the user moves around the space. Models are trained on a finite set of measurements and extrapolate late reverberation behaviour to any location in the space. To set up the repository, follow the instructions in CONTRIBUTING.md.

Data-driven late reverberation interpolation in coupled spaces using the Common Slopes model

From a set of Spatial Room Impulse Responses (SRIRs), encoded in ambisonics and measured at several locations in the space, machine learning is used to generalise the late reverberation to any location in the space.

To do this, the Common Slopes (CS) model is leveraged. MLPs are trained in octave bands to learn the weights of the decay kernels, known as the CS amplitudes, in the spherical harmonics domain. White noise, shaped in octave bands by the predicted CS parameters, is used to synthesise the direction-dependent late reverberation tail. For 6DoF rendering, the MLPs update the CS amplitudes as the listener moves, a new reverberation tail is synthesised, and time-varying convolution is performed between the input signal and the synthesised late tail.
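
The sketch below only illustrates the idea and is not the model in this repository: a small PyTorch MLP that maps a 3D receiver position to CS amplitudes for each spherical harmonic channel of one octave band. The hidden sizes, the number of common slopes, and the number of SH channels (9, i.e. 2nd-order ambisonics) are assumptions.

# Minimal sketch (assumed architecture, not the repository's MLP): map a
# receiver position to Common Slopes amplitudes for one octave band.
import torch
import torch.nn as nn


class CSAmplitudeMLP(nn.Module):
    def __init__(self, num_slopes: int = 3, num_sh_channels: int = 9,
                 hidden_size: int = 128):
        super().__init__()
        self.num_slopes = num_slopes
        self.num_sh_channels = num_sh_channels
        self.net = nn.Sequential(
            nn.Linear(3, hidden_size), nn.ReLU(),
            nn.Linear(hidden_size, hidden_size), nn.ReLU(),
            # one amplitude per (common slope, SH channel) pair
            nn.Linear(hidden_size, num_slopes * num_sh_channels),
        )

    def forward(self, xyz: torch.Tensor) -> torch.Tensor:
        # xyz: (batch, 3) receiver positions -> (batch, num_slopes, num_sh_channels)
        amps = self.net(xyz)
        return amps.view(-1, self.num_slopes, self.num_sh_channels)


# Hypothetical usage: predict CS amplitudes at an unmeasured receiver position.
model = CSAmplitudeMLP()
cs_amplitudes = model(torch.tensor([[1.5, 2.0, 1.2]]))  # shape (1, 3, 9)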

Dataset

We have tested on the dataset published here, which contains three coupled rooms simulated with Treble's hybrid solver and provides 2nd-order ambisonic SRIRs at 838 receiver locations for a single source location. To parse the dataset and save the SRIRs and CS parameters in octave bands, run python3 src/convert_mat_to_pkl_ambi.py.

Training

The scripts for training this model are in the src/spatial_sampling folder.

  • To run training on the three coupled room dataset, run the script src/run_spatial_sampling_test.py. The MLPs are trained on one frequency band at a time; an example config file is available here. TL;DR: to run training, run python3 src/run_spatial_sampling_test.py -c <config_path>
  • Once trained, you can run inference with python3 src/run_spatial_sampling_test.py -c <config_path> --infer, which will plot the results.
  • To generate synthetic SRIR tails once all octave bands have been trained, you can use the functions in src/spatial_sampling/inference.py. An example of how to use this script to generate binaural sound examples for moving listeners is provided in this notebook. A rough sketch of the underlying noise-shaping step follows this list.
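
The sketch below is not the code in src/spatial_sampling/inference.py; it only illustrates the shaping step under assumed decay times and amplitudes: white noise for one octave band and SH channel is multiplied by the square root of the CS-amplitude-weighted sum of exponential decay kernels.

# Rough sketch of CS-based tail synthesis (assumed values, illustrative only).
import numpy as np


def synthesise_band_tail(cs_amplitudes, decay_times_s, fs=48000, length_s=2.0):
    # cs_amplitudes: (num_slopes,) CS amplitudes for one octave band and SH channel
    # decay_times_s: (num_slopes,) decay time (T60) of each common slope, in seconds
    t = np.arange(int(length_s * fs)) / fs
    # decay kernels: exponential energy envelopes, one per common slope
    kernels = np.exp(-np.log(1e6) * t[None, :] / decay_times_s[:, None])
    # the energy envelope at this position is the amplitude-weighted sum of kernels
    energy_envelope = cs_amplitudes @ kernels
    amplitude_envelope = np.sqrt(np.maximum(energy_envelope, 0.0))
    noise = np.random.randn(len(t))
    return amplitude_envelope * noise  # octave-band filtering of the noise would follow


tail = synthesise_band_tail(np.array([0.8, 0.3, 0.1]), np.array([0.4, 1.2, 2.5]))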

Differentiable Grouped Feedback Delay Networks for data-driven late reverberation rendering in coupled spaces

The Grouped Feedback Delay Network (GFDN) renders multi-slope late reverberation in coupled spaces. In this work, the parameters of the GFDN are learned to model multi-slope reverberation in a complex space from a set of measured Room Impulse Responses (RIRs).

A dataset of RIRs measured in a coupled space, along with the corresponding source and receiver positions, is used to train the DiffGFDN. Once trained, the RIR at a new (unmeasured) position can be extrapolated with the DiffGFDN. More powerfully, the late reverberation of the entire space is parameterised by a very efficient network, which is ideal for real-time rendering as the source or listener moves. This not only reduces the memory needed to store measured RIRs, but is also faster than convolution with long reverberation tails.
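
For intuition, here is a minimal grouped-FDN sketch; the delay lengths, decay times and coupling angle are assumptions, not the learned DiffGFDN parameters. Two groups of delay lines with different per-line attenuation are mixed by an orthogonal feedback matrix with weak inter-group coupling, so the impulse response exhibits a two-slope decay.

# Illustrative grouped-FDN sketch (assumed sizes and gains, not DiffGFDN itself).
import numpy as np
from scipy.linalg import block_diag


def grouped_fdn_ir(length=48000, fs=48000):
    # two groups of two delay lines each; group 1 decays fast, group 2 slowly
    delays = np.array([1009, 1171, 1307, 1487])   # delay lengths in samples (assumed)
    t60 = np.array([0.5, 0.5, 2.0, 2.0])          # per-line decay times in seconds (assumed)
    gains = 10.0 ** (-3.0 * delays / (fs * t60))  # attenuation per pass through each line

    # block-diagonal orthogonal mixing within each group ...
    hadamard2 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    A = block_diag(hadamard2, hadamard2)
    # ... plus a small rotation between the groups (keeps A orthogonal)
    theta = 0.2
    c, s = np.cos(theta), np.sin(theta)
    R = np.eye(4)
    R[np.ix_([0, 2], [0, 2])] = [[c, -s], [s, c]]
    R[np.ix_([1, 3], [1, 3])] = [[c, -s], [s, c]]
    A = R @ A

    buffers = [np.zeros(d) for d in delays]
    ir = np.zeros(length)
    for n in range(length):
        outs = np.array([buf[n % len(buf)] for buf in buffers])
        ir[n] = outs.sum()                         # unity output gains
        feedback = A @ (gains * outs)              # absorb, then mix
        x = 1.0 if n == 0 else 0.0                 # unit impulse drives all lines
        for i, buf in enumerate(buffers):
            buf[n % len(buf)] = x + feedback[i]
    return ir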

Dataset

We use omni-directional and spatial RIRs from the same three-coupled-room dataset. The mat files are converted to pickle files (for faster loading) and filtered in octave bands by running python3 src/convert_mat_to_pkl.py and python3 src/convert_mat_to_pkl_ambi.py, respectively.
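
For reference, octave-band filtering of an RIR can be sketched as below. This is not the conversion script itself; the centre frequencies and filter order are assumptions.

# Hedged sketch: band-pass an RIR into octave bands with zero-phase Butterworth
# filters and pickle the result (assumed centre frequencies and filter order).
import pickle
import numpy as np
from scipy.signal import butter, sosfiltfilt


def filter_octave_bands(rir, fs=48000, centres=(125, 250, 500, 1000, 2000, 4000)):
    bands = {}
    for fc in centres:
        lo, hi = fc / np.sqrt(2), fc * np.sqrt(2)  # one-octave band edges
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        bands[fc] = sosfiltfilt(sos, rir)
    return bands


rir = np.random.randn(48000) * np.exp(-np.arange(48000) / 8000)  # toy decaying RIR
with open("rir_octave_bands.pkl", "wb") as f:
    pickle.dump(filter_octave_bands(rir), f)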

To use an open-source dataset:

  • Upcoming, not implemented yet!

Training

  • Omnidirectional DiffGFDN
    • To train a single full-band GFDN on a grid of receiver positions, create a config file (example here), then run python3 src/run_model.py -c <config_file_path>.
    • To run training with one frequency-independent omni-directional DiffGFDN per octave band, create a config file for each band and run the training for each config file. Alternatively, run python3 src/run_subband_training_treble.py --freqs <list_of_octave_frequencies>
    • To only run inference on the trained parallel octave-band GFDNs, run python3 src/run_subband_training_treble.py. This will save the synthesised RIRs as a pkl file.
  • Directional DiffGFDN
    • To run training on a single frequency band, create a config file (example here).
    • After training all frequency bands, inference can be run with infer_all_octave_bands_directional_fdn in src/diff_gfdn/inference.py. A short sketch for checking the multi-slope decay of the synthesised RIRs follows this list.
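
One simple way to inspect whether a synthesised RIR reproduces the multi-slope decay of the coupled rooms (not part of the repository scripts) is Schroeder backward integration of its energy:

# Compute the energy decay curve (EDC) of an RIR via Schroeder backward integration.
import numpy as np


def energy_decay_curve_db(rir):
    edc = np.cumsum(rir[::-1] ** 2)[::-1]  # backward integration of energy
    edc /= edc[0]                          # normalise to 0 dB at t = 0
    return 10 * np.log10(np.maximum(edc, 1e-12))


# e.g. plot energy_decay_curve_db(synth_rir) against the measured RIR's EDC;
# a coupled-room response shows a bend where the decay changes slope.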

Publications

  • Neural-network based interpolation of late reverberation in coupled spaces using the common slopes model - Das, Dal Santo, Schlecht and Cvetkovic, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2025, link.
  • Differentiable Grouped Feedback Delay Networks: Learning from measured Room Impulse Responses for spatially dynamic late reverberation rendering - Das, Dal Santo, Schlecht and Cvetkovic, submitted to IEEE Transactions on Audio, Speech and Language Processing (TASLP), 2025, link.
  • Differentiable Grouped Feedback Delay Networks for Learning Position and Direction-Dependent Late Reverberation - Das, Schlecht, Dal Santo and Cvetkovic, submitted to IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2026.

Sound examples

Mono sound examples of the DiffGFDN are available here. Binaural sound examples of the DiffGFDN are available here. Binaural sound examples of convolution-based directional rendering are available here.