Advantage Actor-Critic

Minimal TensorFlow implementation of the Advantage Actor-Critic model for Atari games.

As an alternative to the asynchronous implementation (A3C), researchers found that you can write a synchronous, deterministic implementation that waits for each actor to finish its segment of experience before performing an update, averaging over all of the actors. One advantage of this method is that it makes more effective use of GPUs, which perform best with large batch sizes. This algorithm is naturally called A2C, short for advantage actor-critic.
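The synchronous step described above can be sketched in plain Python. This is a minimal illustration, not the code in this repository: the names `discounted_returns`, `build_batch`, and the segment dictionary keys are hypothetical, and the critic values are hard-coded rather than produced by a network. It assumes every actor has finished a fixed-length segment, computes n-step returns per actor, and merges everything into one batch whose advantages would then be averaged in the loss.

```python
def discounted_returns(rewards, dones, bootstrap_value, gamma=0.99):
    """n-step returns for one actor's segment, computed backwards in time."""
    returns = []
    running = bootstrap_value  # critic's estimate for the state after the segment
    for reward, done in zip(reversed(rewards), reversed(dones)):
        # An episode end stops value from later steps flowing backwards.
        running = reward + gamma * running * (0.0 if done else 1.0)
        returns.append(running)
    returns.reverse()
    return returns


def build_batch(segments, gamma=0.99):
    """Merge every actor's finished segment into one flat batch (hypothetical helper)."""
    batch_returns, batch_advantages = [], []
    for seg in segments:
        rets = discounted_returns(
            seg["rewards"], seg["dones"], seg["bootstrap_value"], gamma
        )
        batch_returns.extend(rets)
        # Advantage = n-step return minus the critic's value estimate.
        batch_advantages.extend(r - v for r, v in zip(rets, seg["values"]))
    return batch_returns, batch_advantages


# Two actors, three steps each; the values stand in for critic outputs.
segments = [
    {"rewards": [1.0, 0.0, 1.0], "dones": [False, False, False],
     "values": [0.5, 0.5, 0.5], "bootstrap_value": 2.0},
    {"rewards": [0.0, 1.0, 0.0], "dones": [False, True, False],
     "values": [0.0, 0.0, 0.0], "bootstrap_value": 1.0},
]
returns, advantages = build_batch(segments, gamma=0.5)
mean_advantage = sum(advantages) / len(advantages)
```

Because the update only happens once all segments are in, the batch size is the number of actors times the segment length, which is what lets a single GPU pass cover every actor's experience.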

The Gym environment wrappers are taken from OpenAI Baselines.


ppyht2/tf-a2c

Folders and files

Latest commit

History

Repository files navigation

Advantage Actor-Critic

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages