Releases: enpasos/muzero

v0.7.0

15 Dec 09:45

The agent learns to play a perfect game in the tic-tac-toe integration test. Perfect means that every possible decision in the decision tree is correct and remains stable from epoch to epoch, so the agent cannot be exploited in any way. It also goes beyond non-exploitability: actions that are rewarded equally under optimal play are selected with the same probability. The agent therefore does not specialize in one line of play but remains broadly positioned.
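To make the criterion concrete, here is a minimal Python sketch of a reference policy that such a test could compare against. It is an illustration only and not the project's code: the names `value`, `place`, and `reference_policy` and the board encoding (a tuple of nine cells, `1` and `-1` for the two players, `0` for empty) are assumptions made for this example. It computes the game-theoretic value of each legal move by exhaustive minimax and spreads the probability uniformly over all moves that reach the best value.

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board, cells indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 1 or -1 if that player has three in a row, else 0."""
    for a, b, c in LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0


def place(board, move, player):
    """Return a new board with `player` placed on cell `move`."""
    return board[:move] + (player,) + board[move + 1:]


@lru_cache(maxsize=None)
def value(board, player):
    """Minimax value (+1 win, 0 draw, -1 loss) for `player`, who moves next."""
    w = winner(board)
    if w != 0:
        return 1 if w == player else -1
    moves = [i for i, cell in enumerate(board) if cell == 0]
    if not moves:
        return 0  # board full, no winner: draw
    return max(-value(place(board, m, player), -player) for m in moves)


def reference_policy(board, player):
    """Uniform distribution over the optimal moves, zero for all others."""
    moves = [i for i, cell in enumerate(board) if cell == 0]
    values = {m: -value(place(board, m, player), -player) for m in moves}
    best = max(values.values())
    optimal = [m for m in moves if values[m] == best]
    return {m: (1.0 / len(optimal) if m in optimal else 0.0) for m in moves}


if __name__ == "__main__":
    empty = (0,) * 9
    print(reference_policy(empty, 1))
```

On the empty board every opening move leads to a draw under optimal play, so this reference assigns each of the nine moves the same probability. In a position with a single winning move, all probability mass goes to that move. A learned policy could be compared against this distribution node by node, which is one way to read "correct" and "broadly positioned" above.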

This release uses the latest versions of the most important libraries: PyTorch 2.1.1, Java JDK 21, Spring Boot 3.2.0, and Gradle 8.5. DJL is used at version 0.26.0-SNAPSHOT.

The model is fully encapsulated and stable.

The code still needs to be refactored and cleaned up.

v0.5.0

09 Mar 10:37

What's Changed

This is the first bundled release.