NEWS: Our paper was accepted to the Reinforcement Learning Conference (RLC) 2025! We are excited to meet you in Edmonton 😊
This repo is heavily based on the code from SFL, JaxUED, and XLand-MiniGrid. Many thanks to these authors!
We test in the following environments:
- MiniGrid: 2D maze navigation.
- XLand-MiniGrid: a 2D "maze navigation" version of XLand.
- Craftax: A JAX reimplementation of Crafter, plus additional features, world generation, and achievements.
We have a directory for each environment, and each contains a Makefile: run `make build` to build the Docker image, then `make run` to start a Docker container. Inside the container, you can run any of the files to launch their respective algorithms.
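As a quick sketch, the per-environment workflow might look like the following. The directory name and the training script name here are illustrative assumptions, not taken from the repo; check the actual layout for the real names.

```shell
# Illustrative workflow; "minigrid" and "train.py" are placeholder names --
# substitute the actual environment directory and algorithm file.
cd minigrid      # one directory per environment
make build       # build the Docker image
make run         # start a Docker container
# inside the container, launch one of the algorithm files, e.g.:
# python train.py
```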
In essence, we use gradients to learn a PLR distribution. This allows us to obtain convergence guarantees, and ultimately leads to superior performance when the method is extended to its practical variant.
To recreate these results, look in each environment's `deploy` directory. After running the corresponding experiments, first generate the levels with `...generate_levels.py` (not needed for Craftax), then evaluate the trained runs with `...rollout.py`, and finally produce the plots with `...analyse.py`.
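The three steps above can be sketched as a single shell session. The path prefixes are elided ("...") in the text above, so the bare script paths below are assumptions; see each environment's `deploy` directory for the actual locations.

```shell
# Illustrative reproduction pipeline for one environment; the script paths
# are placeholders for the elided "..." prefixes in the README text.
python generate_levels.py   # generate evaluation levels (skip for Craftax)
python rollout.py           # evaluate the trained runs
python analyse.py           # produce the plots
```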
If you use our implementation or our method in your work, please cite us!
```bibtex
@inproceedings{
monette2025an,
title={An Optimisation Framework for Unsupervised Environment Design},
author={Nathan Monette and Alistair Letcher and Michael Beukman and Matthew Thomas Jackson and Alexander Rutherford and Alexander David Goldie and Jakob Nicolaus Foerster},
booktitle={Reinforcement Learning Conference},
year={2025},
url={https://openreview.net/forum?id=WnknYUybWX}
}
```