# An Optimisation Framework for Unsupervised Environment Design

NEWS: Our paper was accepted at the Reinforcement Learning Conference 2025! We are excited to meet you in Edmonton 😊

This repo is heavily based on the code from SFL, JaxUED, and XLand-Minigrid. Many thanks to these authors!

## Environments

We test in the following environments:

1. Minigrid: 2D maze navigation.
2. XLand-Minigrid: a 2D "maze" navigation version of XLand.
3. Craftax: a JAX reimplementation of Crafter, plus additional features, world generation, and achievements.

Each environment has its own directory containing a Makefile: run `make build` to build the Docker image, and `make run` to start a Docker container.

You can then run any of the files to launch their respective algorithms; see the sketch below.
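
For instance, a minimal usage sketch, assuming a `minigrid/` environment directory and a `train.py` entry point (both names are hypothetical; check the actual repository layout):

```bash
cd minigrid       # hypothetical directory name; use the actual environment folder
make build        # build the Docker image
make run          # start a Docker container
# inside the container, launch one of the algorithm files, e.g.:
python train.py   # hypothetical file name
```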

## Paper TL;DR

In essence, we use gradients to learn a PLR (Prioritized Level Replay) sampling distribution. This allows us to obtain convergence guarantees, and ultimately leads to superior performance when the method is extended to its practical variant. We evaluate our method using both the $\alpha$-CVaR evaluation protocol and the holdout level sets from each test environment. We also propose a new score function, generalised learnability, and show that it emphasises levels of intermediate difficulty (a principle also emphasised in SFL). The sketch below illustrates the core idea.
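
As a rough, self-contained illustration (a sketch of the general principle, not the paper's exact update rule or score function), the snippet below performs one projected-gradient step on a sampling distribution over a small level buffer, using SFL-style learnability $p(1-p)$ as the per-level score:

```python
# Illustrative sketch only -- not the paper's exact algorithm.
# Idea: treat the PLR sampling distribution over a level buffer as a
# parameter and update it by projected gradient ascent on a score,
# rather than by rank-based replay heuristics.
import jax.numpy as jnp

def learnability(success_prob: jnp.ndarray) -> jnp.ndarray:
    """SFL-style learnability p * (1 - p): peaks at intermediate difficulty."""
    return success_prob * (1.0 - success_prob)

def project_to_simplex(v: jnp.ndarray) -> jnp.ndarray:
    """Euclidean projection onto the probability simplex."""
    u = jnp.sort(v)[::-1]                      # sort descending
    css = jnp.cumsum(u)
    ks = jnp.arange(1, v.shape[0] + 1)
    rho = jnp.max(jnp.where(u + (1.0 - css) / ks > 0, ks, 0))
    theta = (css[rho - 1] - 1.0) / rho
    return jnp.maximum(v - theta, 0.0)

def update_distribution(probs, scores, lr=0.1):
    # The gradient of the expected score E_{i ~ probs}[scores_i] w.r.t.
    # probs is simply `scores`; ascend, then project back onto the simplex.
    return project_to_simplex(probs + lr * scores)

# Toy usage: five buffered levels with estimated success probabilities.
p_success = jnp.array([0.05, 0.3, 0.5, 0.7, 0.95])
probs = jnp.full(5, 0.2)
probs = update_distribution(probs, learnability(p_success))
print(probs)  # mass shifts toward intermediate-difficulty levels
```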

## $\alpha$-CVaR Evaluation

The $\alpha$-CVaR evaluation tests policies on their worst-case levels: it reports the average return over the hardest $\alpha$-fraction of evaluation levels. Our method obtains very strong performance on Minigrid in this regime.
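
For reference, a minimal sketch of what this protocol computes (function and variable names are ours, not the repository's):

```python
import numpy as np

def alpha_cvar(per_level_returns: np.ndarray, alpha: float = 0.1) -> float:
    """Mean return over the worst alpha-fraction of evaluation levels."""
    k = max(1, int(np.ceil(alpha * len(per_level_returns))))
    worst = np.sort(per_level_returns)[:k]  # lowest returns = hardest levels
    return float(worst.mean())

# e.g. the CVaR at alpha = 0.1 over returns from 1000 held-out levels:
# score = alpha_cvar(per_level_returns, alpha=0.1)
```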

To reproduce these results, look in each environment's deploy directory. After running the corresponding experiments, first generate the evaluation levels with `...generate_levels.py` (not needed for Craftax), then evaluate the relevant runs with `...rollout.py`, and finally produce the plots with `...analyse.py`.
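
Sketched as placeholder commands (substitute the actual script paths from each environment's deploy directory):

```bash
python <env>/deploy/generate_levels.py  # 1. generate evaluation levels (skip for Craftax)
python <env>/deploy/rollout.py          # 2. roll out the trained policies on those levels
python <env>/deploy/analyse.py          # 3. produce the plots
```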

## BibTeX citation

If you use our implementation or our method in your work, please cite us!

```bibtex
@inproceedings{monette2025an,
  title={An Optimisation Framework for Unsupervised Environment Design},
  author={Nathan Monette and Alistair Letcher and Michael Beukman and Matthew Thomas Jackson and Alexander Rutherford and Alexander David Goldie and Jakob Nicolaus Foerster},
  booktitle={Reinforcement Learning Conference},
  year={2025},
  url={https://openreview.net/forum?id=WnknYUybWX}
}
```
