Provides lower-variance TD updates through a first-order Taylor expansion of expected TD updates. This work was presented at NeurIPS 2023, and the corresponding paper is available here.
The project uses OpenAI Gym environments (e.g. HalfCheetah-v2), which depend
on MuJoCo 1.50 specifically (through mujoco-py). In addition, MuJoCo apparently
must be installed under ~/.mujoco/mjpro150 (even though the
MUJOCO_PY_MUJOCO_PATH environment variable suggests otherwise).
To install it (on Linux), run:
mkdir -p ~/.mujoco && cd ~/.mujoco
curl -LO https://roboti.us/download/mjpro150_linux.zip
unzip mjpro150_linux.zip && rm mjpro150_linux.zip

To keep things versioned and segregated from the rest of the system, we should
use a virtual environment. We will use a conda virtual environment called
taylorrl for this project.
conda create [-p /optional/prefix] -n taylorrl

We will also need to set some environment variables. A convenient way to manage per-project environment variables and other shell configurations is to use direnv (highly recommended!).
In the project's .envrc file, you can begin by activating the conda environment:
conda activate taylorrl

Before installing the Python dependencies (i.e. linking mujoco-py against the
dynamic library we just downloaded in the previous section), we need to set the
linker path so that ld knows where to find MuJoCo. In .envrc:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mjpro150/bin

I have also found that the linker cannot find the GLIBCXX_3.4.29 symbol, which
can be rectified by preloading the OS's libstdc++.so file. Add the
following to .envrc (change the location to point to your libstdc++.so):
export LD_PRELOAD=/usr/lib64/libstdc++.so.6

In order to render things, you will also need GLEW. Using Anaconda:
conda install -c conda-forge glew mesalib patchelf gxx gcc
conda install -c anaconda mesa-libgl-cos6-x86_64 swig
conda install -c menpo glfw3

Now, in .envrc:
export LD_PRELOAD=</path/to/conda/env/>lib/libGLEW.so:$LD_PRELOAD

We will begin with PyTorch and CUDA libraries, since the CUDA libraries are added to the linker path and may be used by other packages later on.
conda install python=3.9
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

We can now go ahead and install OpenAI Gym (and the corresponding environments):
pip install gym[all]

And any other Python packages required:
pip install dotmap sacred

The underlying structure of the code is based on (but not forked from) MAGE: Model-based Action-Gradient-Estimator Policy Optimization (paper).
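
Once the installation steps above are done, the setup can be sanity-checked from Python before launching any experiment. The following is only a minimal sketch, not part of the project: the helper names check_setup and missing_packages are hypothetical, and the checks simply mirror the paths and environment variables described in this README (MuJoCo under ~/.mujoco/mjpro150, its bin directory on LD_LIBRARY_PATH, and existing files in LD_PRELOAD).

```python
import importlib.util
import os
from pathlib import Path


def check_setup(env=None, home=None):
    """Return a list of problems found; an empty list means every check passed.

    `env` and `home` are hypothetical parameters added so the checks can be
    pointed at a different environment mapping or home directory.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    problems = []

    # MuJoCo 1.50 is expected under ~/.mujoco/mjpro150 (see the install step).
    mujoco_bin = home / ".mujoco" / "mjpro150" / "bin"
    if not mujoco_bin.is_dir():
        problems.append(f"missing MuJoCo directory: {mujoco_bin}")

    # The linker path should contain the MuJoCo bin directory.
    ld_paths = [Path(p) for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    if mujoco_bin not in ld_paths:
        problems.append(f"LD_LIBRARY_PATH does not contain {mujoco_bin}")

    # Every library listed in LD_PRELOAD should exist on disk.
    for lib in (p for p in env.get("LD_PRELOAD", "").split(":") if p):
        if not Path(lib).is_file():
            problems.append(f"LD_PRELOAD entry not found: {lib}")

    return problems


def missing_packages(names=("torch", "gym", "mujoco_py", "dotmap", "sacred")):
    """Return the packages in `names` that this interpreter cannot locate."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    for problem in check_setup():
        print("PROBLEM:", problem)
    for name in missing_packages():
        print("NOT INSTALLED:", name)
```

Running the script inside the project directory (so that direnv has loaded .envrc) should print nothing if the paths and packages are in place. It only checks that files and packages can be located; it does not verify that mujoco-py actually builds or that rendering works.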