```r
devtools::install_github("smilesun/rlR")
```

or

```r
devtools::install_github("smilesun/rlR", dependencies = TRUE)
```

rlR uses TensorFlow as the backend for its neural network function approximators, so a Python dependency is needed.
To run the examples, you need the Python packages numpy-1.14.5, tensorflow-1.8.0, keras-2.1.6, and gym-0.10.5 installed under the same Python path. This Python path can be your system default or a virtual environment (either a plain Python virtualenv or an Anaconda environment). Other package versions might also work but are not tested.
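If you want to pin exactly the tested versions yourself, a minimal sketch using reticulate follows; the environment name and the pip-style version pins are assumptions for illustration, not an rlR requirement:

```r
# Sketch: pin the tested package versions via pip into the
# 'r-tensorflow' environment (the name rlR uses by default).
reticulate::py_install(
  packages = c("numpy==1.14.5", "tensorflow==1.8.0",
               "keras==2.1.6", "gym==0.10.5"),
  envname = "r-tensorflow",
  pip = TRUE
)
```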
To list all Python paths available to you, run the following in an R session:

```r
reticulate::py_discover_config()
```

Check which Python is your system default:

```r
Sys.which("python")
```

If you want to use a Python path other than the system default, run the following (replace '/usr/bin/python' with the Python path you want) before doing anything else with reticulate:

```r
reticulate::use_python("/usr/bin/python", required = TRUE)
```

"Note that you can only load one Python interpreter per R session so the use_python call only applies before you actually initialize the interpreter." This means that if you change your mind, you have to close the current R session and open a new one.
Confirm that the first path printed by the following is the one you wanted:

```r
reticulate::py_config()
```

It is not recommended to mix things up with the system Python, so by default the rlR facility installs the dependencies into a virtual environment named 'r-tensorflow', created either as a system virtualenv or as an Anaconda environment.
For Unix users

- Ensure that you have either of the following available:
  - a Python virtual environment: `pip install virtualenv`
  - Anaconda
- Install the dependencies through
  - if you have Python virtualenv available:

    ```r
    rlR::installDep2SysVirtualEnv(gpu = FALSE)
    ```

  - if you have Anaconda available:

    ```r
    rlR::installDepConda(conda_path = "auto", gpu = FALSE)
    ```
For Windows users

- Ensure that you have Anaconda available.
- Install the dependencies through

  ```r
  rlR::installDepConda(gpu = FALSE)
  ```
If you want GPU support, simply set the `gpu` argument to `TRUE` in the function call.
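For example, the Anaconda variant with GPU support:

```r
# Same call as above, but installing the GPU build of TensorFlow.
rlR::installDepConda(conda_path = "auto", gpu = TRUE)
```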
You can also install the Python dependencies without the rlR facility functions. For example, you can activate the Anaconda virtual environment 'r-tensorflow' and install the packages yourself:

```bash
source activate r-tensorflow
pip install gym
pip install cmake
pip install gym[atari]
```
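Afterwards you can check from an R session that the modules are importable; a small sketch, assuming you use the conda environment 'r-tensorflow':

```r
# Point reticulate at the environment, then probe for the modules.
reticulate::use_condaenv("r-tensorflow", required = TRUE)
reticulate::py_module_available("tensorflow")  # should be TRUE
reticulate::py_module_available("gym")         # should be TRUE
```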
```r
library(rlR)
listGymEnvs()[1L:10L]
```

```
## [1] "DoubleDunk-ramDeterministic-v4" "DoubleDunk-ramDeterministic-v0"
## [3] "Robotank-ram-v0"                "CartPole-v0"
## [5] "CartPole-v1"                    "Asteroids-ramDeterministic-v4"
## [7] "Pooyan-ram-v4"                  "Gopher-ram-v0"
## [9] "HandManipulateBlock-v0"         "Pooyan-ram-v0"
```
```r
env = makeGymEnv("CartPole-v1")
env$overview()
```

```
##
## action cnt: 2
## state original dim: 4
## discrete action
```
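The same call works for any name returned by listGymEnvs(); for example, a quick sketch with the other CartPole variant:

```r
# Another environment, just to show the pattern.
env2 = makeGymEnv("CartPole-v0")
env2$overview()
```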
```r
listAvailAgent(env)
```

```
## $AgentDQN
## [1] "Deep Q learning"
##
## $AgentFDQN
## [1] "Frozen Target Deep Q Learning"
##
## $AgentDDQN
## [1] "Double Deep QLearning"
##
## $AgentPG
## [1] "Policy Gradient Monte Carlo"
##
## $AgentPGBaseline
## [1] "Policy Gradient with Baseline"
##
## $AgentActorCritic
## [1] "Actor Critic Method"
```
```r
options(width = 1000)
listAvailConf()[, .(name, note)]
```

```
##     name                     note
##  1: render                   Whether to show rendering video or not
##  2: log                      Whether to log important information on drive
##  3: console                  Whether to enable debug info output to console
##  4: agent.gamma              The discount factor in reinforcement learning
##  5: agent.flag.reset.net     Whether to reset the neural network
##  6: agent.lr.decay           The decay factor of the learning rate at each step
##  7: agent.lr                 learning rate for the agent
##  8: agent.store.model        whether to store the model of the agent or not
##  9: agent.update.target.freq How often should the target network be set
## 10: agent.start.learn        after how many transitions should replay begin
## 11: agent.clip.td            whether to clip TD error
## 12: policy.maxEpsilon        The maximum epsilon exploration rate
## 13: policy.minEpsilon        The minimum epsilon exploration rate
## 14: policy.decay.rate        the decay rate
## 15: policy.decay.type        the way to decay epsion, can be decay_geo, decay_exp, decay_linear
## 16: policy.aneal.steps       how many steps needed to decay from maximum epsilon to minmum epsilon, only valid when policy.decay.type = 'decay_linear'
## 17: policy.softmax.magnify   <NA>
## 18: replay.batchsize         how many samples to take from replay memory each time
## 19: replay.memname           The type of replay memory
## 20: replay.mem.size          The size of the replay memory
## 21: replay.epochs            How many gradient decent epochs to carry out for one replay
## 22: replay.freq              how many steps to wait until one replay
##     name                     note
```
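Since listAvailConf() returns a data.table, you can also filter it programmatically; for example, a sketch listing only the replay-related options:

```r
# Filter the configuration table for replay-related entries.
tbl = listAvailConf()
tbl[grepl("^replay", name), .(name, note)]
```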
```r
conf = getDefaultConf("AgentDQN")
conf$show()
```

```
##                                            value
## render                                     FALSE
## log                                        FALSE
## console                                    FALSE
## agent.gamma                                 0.99
## agent.flag.reset.net                        TRUE
## agent.lr.decay                 0.999000499833375
## agent.lr                                   0.001
## agent.store.model                          FALSE
## agent.update.target.freq                     200
## agent.start.learn                             64
## agent.clip.td                              FALSE
## policy.maxEpsilon                              1
## policy.minEpsilon                           0.01
## policy.decay.rate              0.999000499833375
## policy.decay.type                      decay_geo
## policy.aneal.steps                         1e+06
## policy.softmax.magnify                         1
## replay.batchsize                              64
## replay.memname                           Uniform
## replay.mem.size                            20000
## replay.epochs                                  1
## replay.freq                                    1
## policy.name                          ProbEpsilon
```
```r
# Rendering and console output are disabled here because this file is
# generated by R Markdown and we do not want extra output in the document.
conf$set(render = FALSE, console = FALSE)
agent = makeAgent("AgentDQN", env, conf)
ptmi = proc.time()
perf = agent$learn(200L)
proc.time() - ptmi
agent$plotPerf()
```
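When running interactively rather than knitting, you can flip the same switches back on to watch the agent; a short sketch:

```r
# Interactive variant: show the rendering video and console diagnostics.
conf$set(render = TRUE, console = TRUE)
agent = makeAgent("AgentDQN", env, conf)
agent$learn(10L)  # a few episodes are enough for a quick look
```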