This repository is cloned from openai/baselines and modified for our research. Do not open pull requests against the original repository.
Create a trained model with the following command. Use --logdir_tf to specify the directory in which the TensorFlow model is saved.
Example:
python -m baselines.her.experiment.train \
--env GraspBlock-v0 \
--num_cpu 1 \
--n_epochs 100 \
--logdir_tf < directory path to save the TensorFlow model >
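The following command loads the trained model and writes actions and related data to the specified directory. Use --logdir_tf to specify the trained model, and --logdir_aq to specify the directory where actions, Q-values, and other outputs are written.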
python -m baselines.her.experiment.test \
--env GraspBlock-v0 \
--num_cpu 1 --n_epochs 5 \
--logdir_tf < path to saved model > \
--logdir_aq < path to save actions etc... >
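Because the test step loads the model that the training step saved, both commands need the same --logdir_tf value. A minimal sketch of a wrapper that builds both command lines from one model directory follows. The helper name `build_commands` and the example paths are hypothetical, and only flags that appear in the commands above are used.

```python
import shlex
import sys


def build_commands(model_dir, output_dir, env="GraspBlock-v0",
                   train_epochs=100, test_epochs=5):
    """Return (train_cmd, test_cmd) as argument lists sharing one model directory.

    Hypothetical helper: it only uses the flags shown in this README.
    """
    train_cmd = [
        sys.executable, "-m", "baselines.her.experiment.train",
        "--env", env,
        "--num_cpu", "1",
        "--n_epochs", str(train_epochs),
        "--logdir_tf", model_dir,   # where the TensorFlow model is saved
    ]
    test_cmd = [
        sys.executable, "-m", "baselines.her.experiment.test",
        "--env", env,
        "--num_cpu", "1",
        "--n_epochs", str(test_epochs),
        "--logdir_tf", model_dir,   # same directory: the model to load
        "--logdir_aq", output_dir,  # where actions, Q-values, etc. are written
    ]
    return train_cmd, test_cmd


if __name__ == "__main__":
    # Example paths are placeholders, not paths used by this repository.
    train_cmd, test_cmd = build_commands("/tmp/her_model", "/tmp/her_actions")
    for cmd in (train_cmd, test_cmd):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The sketch only prints the commands. To run them in order, each list can be passed to `subprocess.run(cmd, check=True)`.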
The log file contains the following items.
- goal/desired: desired goal (g)
- goal/achieved: achieved goal (ag)
- observation: observation (o)
- action: action, shape=[EpisodeNo, Batch, Sequence, env.action_space]
- Qvalue: Q-value, shape=[EpisodeNo, Batch, Sequence, env.action_space]
- fc: intermediate output of the Critic Network (fc2), shape=[EpisodeNo, Batch, Sequence, n_unit(=256)]
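The four-dimensional shapes above can be indexed as sketched below. The arrays are synthetic stand-ins filled with zeros, because the on-disk format of the log is not described in this section. The dimension sizes (2 episodes, 1 batch, 50 steps, 4 action dimensions) are arbitrary assumptions; only the axis order and n_unit=256 come from the list above.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_episodes, n_batch, n_steps, action_dim, n_unit = 2, 1, 50, 4, 256

# Synthetic stand-ins that follow the documented axis order:
# [EpisodeNo, Batch, Sequence, feature]
action = np.zeros((n_episodes, n_batch, n_steps, action_dim))
qvalue = np.zeros((n_episodes, n_batch, n_steps, action_dim))
fc = np.zeros((n_episodes, n_batch, n_steps, n_unit))


def step_record(episode, batch, step):
    """Collect the logged quantities for one time step of one rollout."""
    return {
        "action": action[episode, batch, step],
        "Qvalue": qvalue[episode, batch, step],
        "fc": fc[episode, batch, step],
    }


record = step_record(episode=0, batch=0, step=10)
print(record["action"].shape)  # one vector of length action_dim
print(record["fc"].shape)      # one vector of length n_unit
```

Indexing the first three axes selects a single time step, which leaves one feature vector per logged quantity.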
TBA
From the general python package sanity perspective, it is a good idea to use virtual environments (virtualenvs) to make sure packages from different projects do not interfere with each other. You can install virtualenv (which is itself a pip package) via
pip install virtualenv
Virtualenvs are essentially folders that have copies of python executable and all python packages. To create a virtualenv called venv with python3, one runs
virtualenv /path/to/venv --python=python3
To activate a virtualenv:
. /path/to/venv/bin/activate
More thorough tutorial on virtualenvs and options can be found here
-
Clone the repo and cd into it:
git clone https://github.com/openai/baselines.git
cd baselines
-
If you don't have TensorFlow installed already, install your favourite flavor of TensorFlow. In most cases,
pip install tensorflow-gpu # if you have a CUDA-compatible gpu and proper drivers
or
pip install tensorflow
should be sufficient. Refer to TensorFlow installation guide for more details.
-
Install baselines package
pip install -e .
-
Install the custom environment (gym-grasp)
cd gym-grasp
pip install -e .
Some of the baselines examples use the MuJoCo (multi-joint dynamics in contact) physics simulator, which is proprietary and requires binaries and a license (a temporary 30-day license can be obtained from www.mujoco.org). Instructions on setting up MuJoCo can be found here
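To confirm that the install steps above took effect inside the active virtualenv, a quick import check can help. The sketch below uses only the Python standard library. The helper name `missing_packages` is hypothetical, and the list covers only `tensorflow` and `baselines`, because the import name of the gym-grasp package is not stated in this section.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Package names taken from the install steps above; extend the list as needed.
required = ["tensorflow", "baselines"]
missing = missing_packages(required)
print("missing:", ", ".join(missing) if missing else "none")
```

`find_spec` only locates a package without importing it, so the check stays fast and does not load TensorFlow.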