# An Optimisation Framework for Unsupervised Environment Design

NEWS: Our paper was accepted at the Reinforcement Learning Conference 2025! We are excited to meet you in Edmonton 😊

This repo is heavily based on the code from SFL, JaxUED, and XLand-Minigrid. Many thanks to these authors!

## Environments

We test in the following environments:

1. Minigrid: 2D maze navigation.
2. XLand-Minigrid: a 2D "maze" navigation version of XLand.
3. Craftax: a JAX reimplementation of Crafter, plus additional features, world generation, and achievements.

Each environment has its own directory containing a Makefile: run `make build` to build the Docker image, and `make run` to start a Docker container.

You can then run any of the files to launch their respective algorithms; see the sketch below.
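
For instance, a minimal usage sketch, assuming a `minigrid/` environment directory and a `train.py` entry point (both names are hypothetical; check the actual repository layout):

```bash
cd minigrid       # hypothetical directory name; use the actual environment folder
make build        # build the Docker image
make run          # start a Docker container
# inside the container, launch one of the algorithm files, e.g.:
python train.py   # hypothetical file name
```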

## Paper TL;DR

In essence, we use gradients to learn a PLR (Prioritized Level Replay) sampling distribution. This allows us to obtain convergence guarantees, and ultimately leads to superior performance when the method is extended to its practical variant. We evaluate our method using both the $\alpha$-CVaR evaluation protocol and the holdout level sets from each test environment. We also propose a new score function, generalised learnability, and show that it emphasises levels of intermediate difficulty (a principle also emphasised in SFL). The sketch below illustrates the core idea.
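
As a rough, self-contained illustration (a sketch of the general principle, not the paper's exact update rule or score function), the snippet below performs one projected-gradient step on a sampling distribution over a small level buffer, using SFL-style learnability $p(1-p)$ as the per-level score:

```python
# Illustrative sketch only -- not the paper's exact algorithm.
# Idea: treat the PLR sampling distribution over a level buffer as a
# parameter and update it by projected gradient ascent on a score,
# rather than by rank-based replay heuristics.
import jax.numpy as jnp

def learnability(success_prob: jnp.ndarray) -> jnp.ndarray:
    """SFL-style learnability p * (1 - p): peaks at intermediate difficulty."""
    return success_prob * (1.0 - success_prob)

def project_to_simplex(v: jnp.ndarray) -> jnp.ndarray:
    """Euclidean projection onto the probability simplex."""
    u = jnp.sort(v)[::-1]                      # sort descending
    css = jnp.cumsum(u)
    ks = jnp.arange(1, v.shape[0] + 1)
    rho = jnp.max(jnp.where(u + (1.0 - css) / ks > 0, ks, 0))
    theta = (css[rho - 1] - 1.0) / rho
    return jnp.maximum(v - theta, 0.0)

def update_distribution(probs, scores, lr=0.1):
    # The gradient of the expected score E_{i ~ probs}[scores_i] w.r.t.
    # probs is simply `scores`; ascend, then project back onto the simplex.
    return project_to_simplex(probs + lr * scores)

# Toy usage: five buffered levels with estimated success probabilities.
p_success = jnp.array([0.05, 0.3, 0.5, 0.7, 0.95])
probs = jnp.full(5, 0.2)
probs = update_distribution(probs, learnability(p_success))
print(probs)  # mass shifts toward intermediate-difficulty levels
```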

## $\alpha$-CVaR Evaluation

The $\alpha$-CVaR evaluation tests policies on their worst-case levels: it reports the average return over the hardest $\alpha$-fraction of evaluation levels. Our method obtains very strong performance on Minigrid in this regime.
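
For reference, a minimal sketch of what this protocol computes (function and variable names are ours, not the repository's):

```python
import numpy as np

def alpha_cvar(per_level_returns: np.ndarray, alpha: float = 0.1) -> float:
    """Mean return over the worst alpha-fraction of evaluation levels."""
    k = max(1, int(np.ceil(alpha * len(per_level_returns))))
    worst = np.sort(per_level_returns)[:k]  # lowest returns = hardest levels
    return float(worst.mean())

# e.g. the CVaR at alpha = 0.1 over returns from 1000 held-out levels:
# score = alpha_cvar(per_level_returns, alpha=0.1)
```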

To reproduce these results, look in each environment's deploy directory. After running the corresponding experiments, first generate the evaluation levels with `...generate_levels.py` (not needed for Craftax), then evaluate the relevant runs with `...rollout.py`, and finally produce the plots with `...analyse.py`.
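
Sketched as placeholder commands (substitute the actual script paths from each environment's deploy directory):

```bash
python <env>/deploy/generate_levels.py  # 1. generate evaluation levels (skip for Craftax)
python <env>/deploy/rollout.py          # 2. roll out the trained policies on those levels
python <env>/deploy/analyse.py          # 3. produce the plots
```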

## BibTeX citation

If you use our implementation or our method in your work, please cite us!

```bibtex
@inproceedings{monette2025an,
  title={An Optimisation Framework for Unsupervised Environment Design},
  author={Nathan Monette and Alistair Letcher and Michael Beukman and Matthew Thomas Jackson and Alexander Rutherford and Alexander David Goldie and Jakob Nicolaus Foerster},
  booktitle={Reinforcement Learning Conference},
  year={2025},
  url={https://openreview.net/forum?id=WnknYUybWX}
}
```
