
Tensorforce 0.6.3


@AlexKuhnle released this 22 Mar 21:45 · 90 commits to master since this release
Agents:
  • New agent argument tracking and corresponding function tracked_tensors() to track and retrieve the current values of predefined tensors, analogous to summarizer for TensorBoard summaries (see the sketch after this list)
  • New experimental values trace_decay and gae_decay for the Tensorforce agent argument reward_estimation, soon for other agent types as well
  • New options "early" and "late" for the value estimate_advantage of the Tensorforce agent argument reward_estimation
  • Changed default value for Agent.act() argument deterministic from False to True
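
A minimal sketch of the new tracking mechanism and the changed deterministic default; the Gym CartPole environment and PPO agent are placeholders, and tracking='all' follows the documented option to track all predefined tensors:

```python
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')

# New tracking argument: record the predefined tensors for later retrieval
agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    tracking='all'
)

states = environment.reset()
# Agent.act() is now deterministic by default; pass deterministic=False to sample
actions = agent.act(states=states, independent=True)

# Retrieve the current values of the tracked tensors as a name-to-value dict
print(agent.tracked_tensors())

agent.close()
environment.close()
```
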
Networks:
  • New network type KerasNetwork (specification key: keras) as wrapper for networks specified as Keras model
  • Passing a Keras model class/object as policy/network argument is automatically interpreted as KerasNetwork (see the sketch below)
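
A minimal sketch of both ways to use a Keras model, assuming a Gym CartPole environment; the model architecture is illustrative only, and the model parameter of the "keras" specification follows the KerasNetwork wrapper described above:

```python
import tensorflow as tf
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')

# An arbitrary Keras model; the architecture is illustrative only
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu')
])

# Explicit specification via the new "keras" key ...
agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    network=dict(type='keras', model=model)
)
agent.close()

# ... or pass the model directly, which is interpreted as a KerasNetwork
agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    network=model
)
```
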
Distributions:
  • Replaced Gaussian distribution argument global_stddev by new argument stddev_mode, with global_stddev=False corresponding to stddev_mode='predicted'
  • New Categorical distribution argument temperature_mode=None (see the sketch below)
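
A sketch of how the changed arguments might appear in a policy specification, assuming a continuous-control Gym task; the distributions layout (keyed by action type) follows the usual policy specification, but treat the details as assumptions:

```python
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='Pendulum-v1')

agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    policy=dict(
        network='auto',
        distributions=dict(
            # stddev_mode='predicted' replaces the old global_stddev=False
            float=dict(type='gaussian', stddev_mode='predicted')
            # for discrete actions: int=dict(type='categorical', temperature_mode=None)
        )
    )
)
```
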
Layers:
  • New option for Function layer argument function to pass a string function expression with argument "x", e.g. "(x+1.0)/2.0" (see the sketch below)
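
A sketch of a layered network using the new string-expression option, assuming a Gym CartPole environment; the dense layer is just a placeholder:

```python
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')

# Function layer given as a string expression in "x", here rescaling
# tanh activations from [-1, 1] to [0, 1]
network = [
    dict(type='dense', size=64, activation='tanh'),
    dict(type='function', function='(x+1.0)/2.0')
]

agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10, network=network
)
```
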
Summarizer:
  • New summary episode-length recorded as part of summary label "reward"
Environments:
  • Support for vectorized parallel environments via new function Environment.is_vectorizable() and new argument num_parallel for Environment.reset()
    • See tensorforce/environments/cartpole.py for a vectorizable environment example
    • Runner uses vectorized parallelism by default if num_parallel > 1, remote=None, and the environment supports vectorization
    • See examples/act_observe_vectorized.py for more details on act-observe interaction
  • New extended and vectorizable custom CartPole environment via key custom_cartpole (work in progress)
  • New environment argument reward_shaping to provide a simple way to modify/shape the rewards of an environment, specified either as a callable or a string function expression (see the sketch below)
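
A sketch combining the new vectorization support with reward_shaping, using the work-in-progress custom_cartpole environment; the variable name "reward" in the string expression is an assumption based on the notes above:

```python
from tensorforce import Environment, Runner

# reward_shaping as a string function expression (equivalently, a callable)
environment = Environment.create(
    environment='custom_cartpole', max_episode_timesteps=500,
    reward_shaping='reward / 100.0'
)
assert environment.is_vectorizable()

# Vectorized parallelism applies, since num_parallel > 1, remote=None,
# and the environment supports vectorization
runner = Runner(
    agent=dict(agent='ppo', batch_size=10),
    environment=environment, num_parallel=4
)
runner.run(num_episodes=200)
runner.close()
```
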
run.py script:
  • New option for the command line arguments --checkpoints and --summaries to add a comma-separated checkpoint/summary filename in addition to the directory
  • Added episode lengths to the logging plot alongside episode returns
Bugfixes:
  • Temporal horizon handling of RNN layers
  • Critical bugfix for late horizon value prediction (including DQN variants and DPG agent) in combination with baseline RNN
  • GPU problems with scatter operations