
bytedance/FlowRL


👋 Hi, everyone!
We are the ByteDance Seed team.

Flow-based Policy for Online Reinforcement Learning

We are delighted to introduce FlowRL, a new approach to online reinforcement learning that combines a flow-based policy representation with Wasserstein-2-regularized optimization. The result is a promising framework for bringing generative policies into reinforcement learning.

News

  • [2025/06/10] 🔥 We release the PyTorch version of the code.
  • [2025/09/18] 🎉 Our paper has been accepted to NeurIPS 2025.

Introduction

FlowRL is an actor-critic framework that uses a flow-based policy representation together with Wasserstein-2-regularized optimization. By implicitly constraining the current policy to the optimal behavioral policy via the Wasserstein-2 (W2) distance, FlowRL achieves superior performance on challenging benchmarks such as DM_Control (Dog and Humanoid domains) and Humanoid_Bench.
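To give a rough feel for what "flow-based policy" means, here is a minimal sketch of how an action can be sampled by integrating a velocity field from noise. It is not taken from this repository: `toy_velocity` is a hypothetical stand-in for a learned, state-conditioned network, the Euler integrator and the step count are assumptions, and the toy action dimension simply equals the state dimension.

```python
import random


def toy_velocity(state, action, t):
    """Hypothetical velocity field (stand-in for a learned network).

    It pulls the action toward a state-dependent target; t is unused
    in this toy field but kept to show the usual (state, action, time) inputs.
    """
    target = [0.5 * s for s in state]
    return [tg - a for tg, a in zip(target, action)]


def sample_action(state, velocity_fn, num_steps=10, rng=random):
    """Sample an action by Euler-integrating da/dt = v(s, a, t) from t=0 to t=1."""
    # Start from Gaussian noise, one component per action dimension.
    action = [rng.gauss(0.0, 1.0) for _ in state]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity_fn(state, action, t)
        action = [a + dt * vi for a, vi in zip(action, v)]
    return action


if __name__ == "__main__":
    print(sample_action([1.0, -2.0], toy_velocity, rng=random.Random(0)))
```

In an actual actor-critic setup the velocity field would be a trained network and the critic would guide its updates; the sketch only shows the sampling path from noise to action.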

Getting Started

  1. Set Up Conda Environment: Create and activate an environment with

    conda create -n flowrl python=3.11
    conda activate flowrl
  2. Clone this Repository:

    git clone https://github.com/bytedance/FlowRL.git
    cd FlowRL
  3. Install FlowRL Dependencies:

    pip install -r requirements.txt
  4. Training Examples:

    • Run a single training instance:

      python3 main.py --domain dog --task run
    • Run parallel training:

      bash scripts/train_parallel.sh
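If you want to launch several single-instance runs yourself, the following is a minimal sketch of a launcher built around the `main.py` command shown above. It does not reproduce what `scripts/train_parallel.sh` does; the helper names are hypothetical, and the `walk` task is an assumed example value (only `--domain dog --task run` appears in this README). By default it only prints the commands.

```python
import shlex
import subprocess


def build_commands(domain, tasks):
    """Build one `python3 main.py` command per task for the given domain."""
    return [
        ["python3", "main.py", "--domain", domain, "--task", task]
        for task in tasks
    ]


def launch(commands, dry_run=True):
    """Print the commands (dry run) or start them all and wait for completion."""
    if dry_run:
        for cmd in commands:
            print(shlex.join(cmd))
        return []
    # Start every run first so they execute concurrently, then wait for each.
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # "walk" is an assumed task name; substitute tasks your setup supports.
    launch(build_commands("dog", ["run", "walk"]), dry_run=True)
```

Run it from the repository root and pass `dry_run=False` once the printed commands look right; each run shares the machine's resources, so keep the task list short on a single GPU.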

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

TODO

  • Release JAX version source code

Citation

If you find FlowRL useful for your research and applications, please consider giving us a star ⭐ or citing us using:

@article{lv2025flow,
  title={Flow-Based Policy for Online Reinforcement Learning},
  author={Lv, Lei and Li, Yunfei and Luo, Yu and Sun, Fuchun and Kong, Tao and Xu, Jiafeng and Ma, Xiao},
  journal={arXiv preprint arXiv:2506.12811},
  year={2025}
}

Founded in 2023, ByteDance Seed Team is dedicated to crafting the industry's most advanced AI foundation models. The team aspires to become a world-class research team and make significant contributions to the advancement of science and society.

About

Official implementation of "Flow Based Policy for Online Reinforcement Learning"
