PSA: This repository is in maintenance mode. No new features will be added, but bugfixes and contributions are welcome. Please create a pull request with any fixes you have!
# Dream to Control: Learning Behaviors by Latent Imagination
- Paper: https://arxiv.org/abs/1912.01603
- Project Website: https://danijar.com/project/dreamer/
- TensorFlow 2 implementation: https://github.com/danijar/dreamer
- TensorFlow 1 implementation: https://github.com/google-research/dreamer
| Task | Average Return @ 1M Steps | Dreamer Paper @ 1M Steps |
|---|---|---|
| Acrobot Swingup | 69.54 | ~300 |
| Cartpole Balance | 877.5 | ~990 |
| Cartpole Balance Sparse | 814 | ~900 |
| Cartpole Swingup | 633.6 | ~800 |
| Cup Catch | 885.1 | ~990 |
| Finger Turn Hard | 212.8 | ~550 |
| Hopper Hop | 219 | ~250 |
| Hopper Stand | 511.6 | ~990 |
| Pendulum Swingup | 724.9 | ~760 |
| Quadruped Run | 112.4 | ~450 |
| Quadruped Walk | 52.82 | ~650 |
| Reacher Easy | 962.8 | ~950 |
| Walker Stand | 956.8 | ~990 |
Table 1. Dreamer PyTorch vs. Paper Implementation
- 1 random seed for PyTorch, 5 for the paper
- Code @ commit `ccea6ae`
- Approximately 37 hours for 1M steps on a P100, 20 hours for 1M steps on a V100
## Installation

- Install Python 3.11
- Install Python Poetry
```bash
# clone the repo with the rlpyt submodule
git clone --recurse-submodules https://github.com/juliusfrost/dreamer-pytorch.git
cd dreamer-pytorch

# Windows
cd setup/windows_cu118
# Linux
cd setup/linux_cu118

# install with poetry
poetry install

# or install with pip
pip install -r requirements.txt
```

## Running Experiments

To run experiments on Atari, run `python main.py`, and add any extra arguments you would like.
For example, to run with a single GPU, set `--cuda-idx 0`.
To run experiments on DeepMind Control, run `python main_dmc.py`. You can also set any extra arguments here.
Experiments will automatically be stored in `data/local/yyyymmdd/run_#`.

You can use TensorBoard to keep track of your experiment:

```bash
tensorboard --logdir=data
```
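The `data/local/yyyymmdd/run_#` naming scheme can be sketched with a few lines of standard-library Python. This is a hypothetical helper for illustration only; the repository's actual logging (handled via rlpyt) constructs its own paths.

```python
import datetime
import os

def make_run_dir(base="data/local", run_id=0):
    """Build a dated run path like data/local/yyyymmdd/run_0.

    Hypothetical helper for illustration; rlpyt's logging context
    manages the real experiment directories.
    """
    date = datetime.datetime.now().strftime("%Y%m%d")
    return os.path.join(base, date, f"run_{run_id}")
```

Pointing TensorBoard at the top-level `data` directory picks up every dated run underneath it.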
If you have trouble reproducing any results, please raise a GitHub issue with your logs and results. Otherwise, if you have success, please share your trained model weights with us and with the broader community!
## Testing

To run tests:

```bash
pytest tests
```

If you want additional code coverage information:

```bash
pytest tests --cov=dreamer
```

## Directory Structure

- `main.py`: run Atari experiment
- `main_dmc.py`: run DeepMind Control experiment
- `dreamer/`: Dreamer code
  - `agents/`: agent code used in sampling
    - `atari_dreamer_agent.py`: Atari agent
    - `dmc_dreamer_agent.py`: DeepMind Control agent
    - `dreamer_agent.py`: basic sampling agent, exploration, contains shared methods
  - `algos/`: algorithm-specific code
    - `dreamer_algo.py`: optimization algorithm, loss functions, hyperparameters
    - `replay.py`: replay buffer
  - `envs/`: environment-specific code
    - `action_repeat.py`: action repeat wrapper, ported from TF2 Dreamer
    - `atari.py`: Atari environments, ported from TF2 Dreamer
    - `dmc.py`: DeepMind Control Suite environment, ported from TF2 Dreamer
    - `env.py`: base classes for environments
    - `modified_atari.py`: unused Atari environment from rlpyt
    - `normalize_actions.py`: normalize-actions wrapper, ported from TF2 Dreamer
    - `one_hot.py`: one-hot action wrapper, ported from TF2 Dreamer
    - `time_limit.py`: time-limit wrapper, ported from TF2 Dreamer
    - `wrapper.py`: base environment wrapper class
  - `experiments/`: currently not used
  - `models/`: all models used in the agent
    - `action.py`: action model
    - `agent.py`: summarizes all models for the agent module
    - `dense.py`: dense fully connected models, used for the reward, value, and discount models
    - `distribution.py`: distributions, tanh bijector
    - `observation.py`: observation model
    - `rnns.py`: recurrent state-space model
  - `utils/`: utility functions
    - `logging.py`: logging videos
    - `module.py`: freezing parameters
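To illustrate what the environment wrappers above do, here is a minimal sketch of an action-repeat wrapper. The Gym-style `step`/`reset` interface is an assumption; the repository's actual `dreamer/envs/action_repeat.py` may differ in detail.

```python
class ActionRepeat:
    """Repeat each action `amount` times, summing the rewards received.

    Minimal sketch assuming a Gym-style env whose step() returns
    (obs, reward, done, info); the real wrapper may differ.
    """

    def __init__(self, env, amount):
        assert amount >= 1
        self.env = env
        self.amount = amount

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.amount):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:
                break  # stop repeating once the episode ends
        return obs, total_reward, done, info
```

Action repeat lowers the effective control frequency, which both speeds up data collection and shortens the horizon over which credit must be assigned.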