
Tensorforce 0.6.3


@AlexKuhnle released this 22 Mar 21:45 · 90 commits to master since this release
Agents:
  • New agent argument tracking and corresponding function tracked_tensors() to track and retrieve the current values of predefined tensors, analogous to summarizer for TensorBoard summaries (see the sketch after this list)
  • New experimental values trace_decay and gae_decay for the Tensorforce agent argument reward_estimation, soon for other agent types as well
  • New options "early" and "late" for the value estimate_advantage of the Tensorforce agent argument reward_estimation
  • Changed default value for Agent.act() argument deterministic from False to True
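
A minimal sketch of the new tracking mechanism and the changed deterministic default; the Gym CartPole environment and PPO agent are placeholders, and tracking='all' follows the documented option to track all predefined tensors:

```python
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')

# New tracking argument: record the predefined tensors for later retrieval
agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    tracking='all'
)

states = environment.reset()
# Agent.act() is now deterministic by default; pass deterministic=False to sample
actions = agent.act(states=states, independent=True)

# Retrieve the current values of the tracked tensors as a name-to-value dict
print(agent.tracked_tensors())

agent.close()
environment.close()
```
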
Networks:
  • New network type KerasNetwork (specification key: keras) as wrapper for networks specified as Keras model
  • Passing a Keras model class/object as policy/network argument is automatically interpreted as KerasNetwork (see the sketch below)
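
A minimal sketch of both ways to use a Keras model, assuming a Gym CartPole environment; the model architecture is illustrative only, and the model parameter of the "keras" specification follows the KerasNetwork wrapper described above:

```python
import tensorflow as tf
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')

# An arbitrary Keras model; the architecture is illustrative only
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu')
])

# Explicit specification via the new "keras" key ...
agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    network=dict(type='keras', model=model)
)
agent.close()

# ... or pass the model directly, which is interpreted as a KerasNetwork
agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    network=model
)
```
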
Distributions:
  • Replaced Gaussian distribution argument global_stddev by new argument stddev_mode, with global_stddev=False corresponding to stddev_mode='predicted'
  • New Categorical distribution argument temperature_mode=None (see the sketch below)
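
A sketch of how the changed arguments might appear in a policy specification, assuming a continuous-control Gym task; the distributions layout (keyed by action type) follows the usual policy specification, but treat the details as assumptions:

```python
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='Pendulum-v1')

agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    policy=dict(
        network='auto',
        distributions=dict(
            # stddev_mode='predicted' replaces the old global_stddev=False
            float=dict(type='gaussian', stddev_mode='predicted')
            # for discrete actions: int=dict(type='categorical', temperature_mode=None)
        )
    )
)
```
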
Layers:
  • New option for Function layer argument function to pass a string function expression with argument "x", e.g. "(x+1.0)/2.0" (see the sketch below)
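
A sketch of a layered network using the new string-expression option, assuming a Gym CartPole environment; the dense layer is just a placeholder:

```python
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')

# Function layer given as a string expression in "x", here rescaling
# tanh activations from [-1, 1] to [0, 1]
network = [
    dict(type='dense', size=64, activation='tanh'),
    dict(type='function', function='(x+1.0)/2.0')
]

agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10, network=network
)
```
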
Summarizer:
  • New summary episode-length recorded as part of summary label "reward"
Environments:
  • Support for vectorized parallel environments via new function Environment.is_vectorizable() and new argument num_parallel for Environment.reset()
    • See tensorforce/environments/cartpole.py for a vectorizable environment example
    • Runner uses vectorized parallelism by default if num_parallel > 1, remote=None, and the environment supports vectorization
    • See examples/act_observe_vectorized.py for more details on act-observe interaction
  • New extended and vectorizable custom CartPole environment via key custom_cartpole (work in progress)
  • New environment argument reward_shaping to provide a simple way to modify/shape the rewards of an environment, specified either as a callable or a string function expression (see the sketch below)
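
A sketch combining the new vectorization support with reward_shaping, using the work-in-progress custom_cartpole environment; the variable name "reward" in the string expression is an assumption based on the notes above:

```python
from tensorforce import Environment, Runner

# reward_shaping as a string function expression (equivalently, a callable)
environment = Environment.create(
    environment='custom_cartpole', max_episode_timesteps=500,
    reward_shaping='reward / 100.0'
)
assert environment.is_vectorizable()

# Vectorized parallelism applies, since num_parallel > 1, remote=None,
# and the environment supports vectorization
runner = Runner(
    agent=dict(agent='ppo', batch_size=10),
    environment=environment, num_parallel=4
)
runner.run(num_episodes=200)
runner.close()
```
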
run.py script:
  • New option for the command line arguments --checkpoints and --summaries to add a comma-separated checkpoint/summary filename in addition to the directory
  • Added episode lengths to the logging plot alongside episode returns
Bugfixes:
  • Temporal horizon handling of RNN layers
  • Critical bugfix for late horizon value prediction (including DQN variants and DPG agent) in combination with baseline RNN
  • GPU problems with scatter operations