Agents:
New agent argument tracking and corresponding function tracked_tensors() to track and retrieve the current value of predefined tensors, similar to summarizer for TensorBoard summaries (see the sketch below)
New experimental values trace_decay and gae_decay for Tensorforce agent argument reward_estimation, soon to be available for other agent types as well
New options "early" and "late" for value estimate_advantage of Tensorforce agent argument reward_estimation
Changed default value for Agent.act() argument deterministic from False to True
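
A minimal sketch of the new tracking argument and tracked_tensors(), assuming a Gym CartPole environment and the PPO agent; the 'all' value and the set of tracked tensor names are assumptions depending on the agent configuration.

```python
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')
agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    tracking='all'  # track all predefined tensors (assumed value)
)

# One act-observe step, then read back the currently tracked values
states = environment.reset()
actions = agent.act(states=states)
states, terminal, reward = environment.execute(actions=actions)
agent.observe(terminal=terminal, reward=reward)

# Dictionary mapping tracked tensor names to their current values
print(agent.tracked_tensors())

agent.close()
environment.close()
```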
Networks:
New network type KerasNetwork (specification key: keras) as wrapper for networks specified as Keras model
Passing a Keras model class/object as policy/network argument is automatically interpreted as KerasNetwork
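
A minimal sketch of passing a Keras model class directly as the network argument, assuming tf.keras and a simple dense model; the architecture is purely illustrative.

```python
import tensorflow as tf
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')

class PolicyNetwork(tf.keras.Model):
    """Simple dense model mapping the state input to an embedding."""

    def __init__(self):
        super().__init__()
        self.dense0 = tf.keras.layers.Dense(64, activation='relu')
        self.dense1 = tf.keras.layers.Dense(64, activation='relu')

    def call(self, inputs):
        return self.dense1(self.dense0(inputs))

agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    network=PolicyNetwork  # interpreted as KerasNetwork (specification key: keras)
)
```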
Distributions:
Changed Gaussian distribution argument global_stddev=False to stddev_mode='predicted'
New Categorical distribution argument temperature_mode=None
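
A hedged sketch of the renamed Gaussian argument inside a policy specification; the per-action-type distributions mapping reflects the usual policy spec, and the environment level depends on the installed Gym version.

```python
from tensorforce import Agent, Environment

# 'Pendulum-v1' in recent Gym versions, 'Pendulum-v0' in older ones
environment = Environment.create(environment='gym', level='Pendulum-v1')

agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    policy=dict(
        network='auto',
        distributions=dict(
            # formerly global_stddev=False, now stddev_mode='predicted'
            float=dict(type='gaussian', stddev_mode='predicted')
            # for discrete actions, Categorical now accepts temperature_mode
            # (default None, i.e. no temperature scaling)
        )
    )
)
```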
Layers:
New option for Function layer argument function to pass string function expression with argument "x", e.g. "(x+1.0)/2.0"
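
A minimal sketch of the string-expression option, assuming the Function layer's specification key is function; the surrounding dense layers are illustrative.

```python
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')

# Network containing a function layer given as a string expression over the
# implicit argument "x" (the preceding layer's output)
agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    network=[
        dict(type='dense', size=64),
        dict(type='function', function='(x+1.0)/2.0'),
        dict(type='dense', size=64)
    ]
)
```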
Summarizer:
New summary episode-length recorded as part of summary label "reward"
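
A brief sketch of a summarizer configuration under which the new episode-length summary is written; the labels argument name and directory are assumptions based on this release line's summarizer interface.

```python
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')

# With the 'reward' label enabled, episode-length is recorded alongside the
# episode reward summaries and can be inspected in TensorBoard
agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    summarizer=dict(directory='summaries', labels=['reward'])
)
```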
Environments:
Support for vectorized parallel environments via new function Environment.is_vectorizable() and new argument num_parallel for Environment.reset()
See tensorforce/environments/cartpole.py for a vectorizable environment example
Runner uses vectorized parallelism by default if num_parallel > 1, remote=None, and the environment supports vectorization (see the sketch after this section)
See examples/act_observe_vectorized.py for more details on act-observe interaction
New extended and vectorizable custom CartPole environment via key custom_cartpole (work in progress)
New environment argument reward_shaping to provide a simple way to modify/shape rewards of an environment, can be specified either as callable or string function expression
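
A hedged sketch combining vectorized parallelism and reward shaping, using the new custom_cartpole key; the 'reward' variable in the shaping expression and the parallel_interactions value are assumptions, not confirmed API details.

```python
from tensorforce import Agent, Environment, Runner

# The reward_shaping value is an illustrative assumption: a string expression
# over the environment reward (a callable can be passed instead)
environment = Environment.create(
    environment='custom_cartpole', max_episode_timesteps=500,
    reward_shaping='reward / 2.0'
)

agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    parallel_interactions=4  # assumed to be at least num_parallel
)

# num_parallel > 1 with remote=None uses vectorized parallelism, since the
# custom CartPole environment is vectorizable
runner = Runner(agent=agent, environment=environment, num_parallel=4)
runner.run(num_episodes=100)
runner.close()
```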
run.py script:
New option for the command line arguments --checkpoints and --summaries to specify a comma-separated checkpoint/summary filename in addition to the directory
Added episode lengths to the logging plot alongside episode returns
Bugfixes:
Temporal horizon handling of RNN layers
Critical bugfix for late horizon value prediction (including DQN variants and DPG agent) in combination with baseline RNN