This repository provides the necessary code to reproduce all experiments of the paper "A Dual Perspective of Reinforcement Learning for Imposing Policy Constraints" [1].
An example PyTorch implementation of the DualCRL algorithm is provided, with support for any combination of entropy regularization, value constraints, density constraints, and transition constraints.
Custom constrained setups are defined for the CliffWalking-v0 and Pendulum-v1 Gymnasium environments.
Further details are provided in the paper's Experiments section.
Illustrative videos: `learned_values_q.mp4` (learned values) and `reward_modifiers_db.mp4` (reward modifiers).
- Clone this repository.

  ```shell
  git clone https://github.com/dcbr/dualcrl
  cd dualcrl
  ```
- Install the required packages, as specified in `environment.yml`. This can easily be done by creating a virtual environment (using e.g. conda or venv).

  ```shell
  conda env create -f environment.yml
  ```
- Activate the virtual environment, using e.g. `conda activate dualcrl`. Afterwards, you can run the main script with suitable arguments to train the models or analyze their performance. For example, run

  ```shell
  python main.py --mode train --job cliffwalk
  ```

  to train on the cliff walking environment (with additional policy constraints).
To reproduce all results of Section VI, first train on all jobs with `python main.py --mode train --job [JOB] --uid paper`, followed by the analysis `python main.py --mode analyze --job [JOB] --uid paper`. Beware that this might take a while to complete, depending on your hardware!
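As a convenience, a minimal shell loop along the following lines could run both passes in sequence, assuming the two predefined jobs listed in the parameter table below (`cliffwalk` and `pendulum`):

```shell
# Train and then analyze each predefined job, under the shared 'paper' identifier.
for job in cliffwalk pendulum; do
    python main.py --mode train --job "$job" --uid paper
    python main.py --mode analyze --job "$job" --uid paper
done
```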
A summary of the script's most relevant parameters is given below. Check `python main.py --help` for a full overview of the supported parameters.
| Parameter | Supported values | Description |
|---|---|---|
| `--mode` | `train`, `analyze` | Run mode. Either train models, or analyze (and summarize) the results. |
| `--job` | `cliffwalk`, `pendulum` | Job to run. The job file defines the environment and constraints to train on. |
| `--uid` | Any value | Unique identifier for a job run. |
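For instance, to train the pendulum job under a custom run identifier and analyze that same run afterwards (the identifier `myrun` below is an arbitrary example value):

```shell
python main.py --mode train --job pendulum --uid myrun
python main.py --mode analyze --job pendulum --uid myrun
```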
[1] De Cooman, B., Suykens, J.: A Dual Perspective of Reinforcement Learning for Imposing Policy Constraints. Accepted for publication in IEEE Transactions on Artificial Intelligence. DOI: 10.1109/TAI.2025.3564898, arXiv: 2404.16468.