We are the ByteDance Seed team.
We are delighted to introduce FlowRL, a new approach to online reinforcement learning that combines a flow-based policy representation with Wasserstein-2-regularized optimization, yielding a promising framework for uniting generative policies with reinforcement learning.
- [2025/06/10] 🔥 We release the PyTorch version of the code.
- [2025/09/18] 🎉 Our paper has been accepted to NeurIPS 2025.
FlowRL is an actor-critic framework that represents the policy with a flow-based generative model and optimizes it under Wasserstein-2 (W2) regularization. By implicitly constraining the current policy toward the optimal behavior policy via the W2 distance, FlowRL achieves superior performance on challenging benchmarks such as DMControl (Dog and Humanoid domains) and HumanoidBench.
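For intuition only, here is a minimal sketch of what "flow-based policy representation" means in practice: the policy turns Gaussian noise into an action by integrating a state-conditioned velocity field. All names, network sizes, and the Euler step count below are our assumptions, not FlowRL's actual implementation, and the critic and the W2-regularized update are omitted.

```python
import torch
import torch.nn as nn


class FlowPolicy(nn.Module):
    """Toy flow-based policy: an action is produced by Euler-integrating a
    state-conditioned velocity field from Gaussian noise to the action space.
    The architecture, step count, and names here are illustrative assumptions."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256, steps: int = 10):
        super().__init__()
        self.action_dim = action_dim
        self.steps = steps
        self.velocity = nn.Sequential(
            nn.Linear(state_dim + action_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # Start from Gaussian noise and follow the learned velocity field over t in [0, 1].
        a = torch.randn(state.shape[0], self.action_dim, device=state.device)
        dt = 1.0 / self.steps
        for i in range(self.steps):
            t = torch.full((state.shape[0], 1), i * dt, device=state.device)
            a = a + dt * self.velocity(torch.cat([state, a, t], dim=-1))
        return torch.tanh(a)  # squash to a bounded action range


# Example: sample actions for a batch of 4 states (17-D observations, 6-D actions).
policy = FlowPolicy(state_dim=17, action_dim=6)
actions = policy(torch.randn(4, 17))  # shape: (4, 6)
```

In FlowRL itself, such a generator is trained inside an actor-critic loop, with the W2 distance implicitly constraining the current policy toward the optimal behavior policy; see the paper for the exact objective.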
- Setup Conda Environment: Create an environment with

  ```bash
  conda create -n flowrl python=3.11
  ```
- Clone this Repository:

  ```bash
  git clone https://github.com/bytedance/FlowRL.git
  cd FlowRL
  ```

- Install FlowRL Dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Training Examples:

  - Run a single training instance (see the note after this list for another example):

    ```bash
    python3 main.py --domain dog --task run
    ```
  - Run parallel training:

    ```bash
    bash scripts/train_parallel.sh
    ```
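Assuming `main.py` follows DMControl's usual domain/task naming (an assumption on our part; check the script's argument parser for the values it actually accepts), other environments should launch the same way, for example `python3 main.py --domain humanoid --task walk`.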
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
- TODO: Release the JAX version of the source code.
If you find FlowRL useful for your research and applications, please consider giving us a star ⭐ or citing us with:
```bibtex
@article{lv2025flow,
  title={Flow-Based Policy for Online Reinforcement Learning},
  author={Lv, Lei and Li, Yunfei and Luo, Yu and Sun, Fuchun and Kong, Tao and Xu, Jiafeng and Ma, Xiao},
  journal={arXiv preprint arXiv:2506.12811},
  year={2025}
}
```

About ByteDance Seed Team
Founded in 2023, the ByteDance Seed Team is dedicated to crafting the industry's most advanced AI foundation models. The team aspires to become a world-class research group and to make significant contributions to the advancement of science and society.