Malte Mosbach*, Jan Niklas Ewertz*, Angel Villar-Corrales, Sven Behnke
Slot-Attention for Object-centric Latent Dynamics (SOLD) is a model-based reinforcement learning algorithm whose world model operates on a structured, object-centric latent representation.
Start by installing the multi-object-fetch environment suite. Then add the SOLD dependencies to the conda environment by running:
```bash
conda env update -n mof -f apptainer/environment.yml
```

Alternatively, we provide an Apptainer build file to simplify installation.
To build the .sif image, run:
```bash
cd apptainer && apptainer build sold.sif multi_object_fetch.def
```

To start a training run inside the container:
```bash
apptainer run --nv ../sold.sif python train_sold.py
```

> **Note**
> If you're on a SLURM cluster, you can submit training jobs using this container with the provided run script: `sbatch slurm.sh train_sold.py`.
The training routine consists of two stages: pre-training a SAVi model and training a SOLD model on top of it.
The SAVi models (or autoencoders in general) are pre-trained on static datasets of random trajectories. Such a dataset can be generated using the following script:
```bash
python generate_dataset.py experiment=my_dataset env.name=ReachRed_0to4Distractors_Dense-v1
```

To train a SAVi model, specify the dataset to train on and model parameters, such as the number of slots, in `train_autoencoder.yaml`.
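The exact schema is defined by the config files in the repository; the following is only an illustrative sketch of the kinds of settings involved (all key names and values here are hypothetical, not the actual `train_autoencoder.yaml` schema):

```yaml
# Hypothetical sketch of train_autoencoder.yaml -- key names are illustrative,
# not the repository's actual schema.
dataset: my_dataset     # the pre-generated dataset to train on
model:
  num_slots: 7          # e.g., one slot per object plus one for the background
  slot_dim: 128         # dimensionality of each slot representation
training:
  batch_size: 32
  num_iterations: 100000
```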
Then run:

```bash
python train_autoencoder.py experiment=my_savi_model
```
Good SAVi models should learn to split the scene into meaningful objects and keep slots assigned to the same object over time. Examples of SAVi models pre-trained for a reaching and a picking task are shown below.

To train SOLD, a checkpoint path to the pre-trained SAVi model is required, which can be specified in the `train_sold.yaml` configuration file.
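For instance, the relevant entry might look like the following sketch (the key name is hypothetical; consult `train_sold.yaml` for the actual schema):

```yaml
# Hypothetical sketch -- the actual key name in train_sold.yaml may differ.
savi_checkpoint: PATH_TO_SAVI_CHECKPOINT  # e.g., a checkpoint produced by train_autoencoder.py
```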
Then, to start the training, run:
```bash
python train_sold.py
```

All results are stored in the `experiments` directory.
When training a SOLD model, you can inspect several visualizations to monitor training progress. The `dynamics_prediction` plot highlights the differences between the ground-truth and predicted future states, and shows the forward prediction of each slot. In addition, visualizations of `actor_attention` or `reward_predictor_attention`, as shown below, can be used to understand what the model attends to when predicting the current reward, i.e., which elements of the scene the model considers reward-predictive.

For further evaluation of a trained model, or of a set of models in a directory, run:
```bash
python evaluate_sold.py checkpoint_path=PATH_TO_CHECKPOINT(S)
```

This will log performance metrics and visualizations for the given checkpoints.
Pre-trained SAVi and SOLD models are available in the `checkpoints` directory.
The SAVi checkpoints can be used to begin training SOLD models right away.
Each checkpoint also includes corresponding TensorBoard logs, allowing you to visualize the expected training dynamics:
```bash
tensorboard --logdir checkpoints
```

If you find this work useful, please consider citing our paper as follows:
```bibtex
@inproceedings{sold2025mosbach,
  title={SOLD: Slot Object-Centric Latent Dynamics Models for Relational Manipulation Learning from Pixels},
  author={Malte Mosbach and Jan Niklas Ewertz and Angel Villar-Corrales and Sven Behnke},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2025}
}
```