Agents:
Agent argument update_frequency / update[frequency] now supports float values > 0.0, which specify the update frequency relative to the batch size (see the example at the end of this section)
Changed default value for argument update_frequency from 1.0 to 0.25 for DQN, DoubleDQN, DuelingDQN agents
New arguments return_processing and advantage_processing (where applicable) for all agent sub-types
New function Agent.get_specification() which returns the agent specification as a dictionary
New function Agent.get_architecture() which returns a string representation of the network layer architecture
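A minimal sketch of how these agent changes fit together, assuming a Gym-based environment; the clipping spec passed to return_processing is an assumption based on Tensorforce's preprocessing layers, not part of this changelog:

```python
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')

agent = Agent.create(
    agent='dqn',
    environment=environment,
    memory=10000,
    batch_size=32,
    # Float values are now interpreted relative to batch_size, so
    # update_frequency=0.25 means an update every 0.25 * batch_size timesteps.
    update_frequency=0.25,
    # New return_processing argument; the clipping values are placeholders.
    return_processing=dict(type='clipping', lower=-1.0, upper=1.0),
)

# New introspection helpers:
print(agent.get_specification())  # agent specification as a dictionary
print(agent.get_architecture())   # string summary of the network layers

agent.close()
environment.close()
```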
Modules:
Improved and simplified module specification, for instance: network=my_module instead of network=my_module.TestNetwork, or environment=envs.custom_env instead of environment=envs.custom_env.CustomEnvironment (the module file needs to be in the same directory or a sub-directory)
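For illustration, the shortened specification in code form; my_module, TestNetwork, envs.custom_env and CustomEnvironment are the placeholder names from the item above:

```python
from tensorforce import Agent, Environment

# Previously, the class inside the module had to be named explicitly:
#   environment = Environment.create(environment='envs.custom_env.CustomEnvironment')
#   agent = Agent.create(agent='ppo', network='my_module.TestNetwork', ...)

# Now the module name alone suffices, provided the module file is located
# in the same directory or a sub-directory:
environment = Environment.create(environment='envs.custom_env')
agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    network='my_module',
)
```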
Networks:
New argument single_output=True for some policy types; if set to False, additional network outputs can be specified for some or all actions via registered tensors
KerasNetwork argument model now supports arbitrary functions as long as they return a tf.keras.Model
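A sketch of the extended model argument, assuming the keras specification key; the network layout is a placeholder:

```python
import tensorflow as tf
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')

# Any function returning a tf.keras.Model can now be passed as model:
def build_model():
    return tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(64, activation='relu'),
    ])

agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10,
    network=dict(type='keras', model=build_model),
)
```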
Layers:
New layer type SelfAttention (specification key: self_attention)
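A network specification using the new layer might look as follows; the size argument and the surrounding layer are placeholder choices, so check the layer documentation for the exact signature:

```python
# Layer-spec list with the new self_attention key; self-attention
# generally expects sequence-shaped (rank >= 2) input.
network = [
    dict(type='self_attention', size=64),
    dict(type='dense', size=64),
]
```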
Parameters:
Support for tracking non-constant parameter values
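For instance, a linearly decaying exploration parameter whose current value can now be tracked over time; the tracking mechanism itself (e.g. via the summarizer) is assumed here, not spelled out in this changelog:

```python
from tensorforce import Agent, Environment

environment = Environment.create(environment='gym', level='CartPole-v1')

# Non-constant parameter: exploration decays linearly over 10000 timesteps.
agent = Agent.create(
    agent='dqn', environment=environment, memory=10000, batch_size=32,
    exploration=dict(
        type='linear', unit='timesteps', num_steps=10000,
        initial_value=0.1, final_value=0.01,
    ),
)
```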
Runner:
Renamed attribute episode_rewards to episode_returns, and the TQDM status label reward to return
Extended argument agent to support Agent.load() keyword arguments, so that an existing agent can be loaded instead of creating a new one
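A sketch of both Runner changes, with placeholder checkpoint paths; the dict keys correspond to Agent.load() keyword arguments:

```python
from tensorforce import Runner

# Load a previously saved agent instead of creating a new one:
runner = Runner(
    agent=dict(directory='checkpoints', filename='agent'),
    environment=dict(environment='gym', level='CartPole-v1'),
)
runner.run(num_episodes=100)
print(runner.episode_returns)  # formerly runner.episode_rewards
runner.close()
```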
Examples:
Added action_masking.py example script to illustrate an environment implementation with built-in action masking.
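A condensed version of the pattern the example script illustrates; the <action>_mask state-naming convention follows Tensorforce's action-masking support, while shapes and values here are placeholders:

```python
import numpy as np
from tensorforce import Environment

class MaskedEnvironment(Environment):
    """Toy environment that masks out invalid discrete actions."""

    def __init__(self):
        super().__init__()

    def states(self):
        return dict(type='float', shape=(4,))

    def actions(self):
        return dict(type='int', shape=(), num_values=3)

    def reset(self):
        state = np.random.random(size=(4,))
        # Boolean mask named '<action-name>_mask' ('action_mask' for the
        # default single action); True marks currently valid actions.
        return dict(state=state, action_mask=np.array([True, True, False]))

    def execute(self, actions):
        states = dict(
            state=np.random.random(size=(4,)),
            action_mask=np.array([True, False, True]),
        )
        terminal = False
        reward = float(np.random.random())
        return states, terminal, reward
```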
Bugfixes:
Customized device placement was not applied to most tensors