Commit 84da970

Add minigo (tensorflow#3955)
* Add minigo
* Fix comments and make python version compatible
1 parent 6ff0a53 commit 84da970

19 files changed (+3721, -0 lines)

research/minigo/README.md

Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
# MiniGo
This is a simplified implementation of MiniGo, based on the code provided by the authors: [MiniGo](https://github.com/tensorflow/minigo).

MiniGo is a minimalist Go engine modeled after AlphaGo Zero and built on MuGo. The current implementation consists of three main modules: the DualNet model, Monte Carlo Tree Search (MCTS), and Go domain knowledge. The **model** part is currently our focus.

This implementation retains model training and validation, and also provides evaluation between two Go models.

## DualNet Model
The input to the neural network is a [board_size * board_size * 17] image stack
comprising 17 binary feature planes. Eight feature planes consist of binary values
indicating the presence of the current player's stones; a further eight feature
planes represent the corresponding features for the opponent's stones. The final
feature plane represents the color to play, and has a constant value of either 1
if black is to play or 0 if white is to play. See `features.py` for more details.
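
As a rough illustration of the input layout described above, the following sketch builds the 17-plane stack with plain Python lists. It is not the code in `features.py`: the names `build_features` and `stone_plane`, the `BLACK`/`WHITE`/`EMPTY` constants, and the board representation are hypothetical. It also assumes that the eight planes per player hold the eight most recent board positions (as in AlphaGo Zero), zero-padded when fewer positions exist, and that the planes are ordered as in the paragraph above.

```python
BLACK, WHITE, EMPTY = 1, -1, 0  # hypothetical stone encoding


def stone_plane(board, color):
    """Binary plane: 1 where `board` holds a stone of `color`, else 0."""
    return [[1 if cell == color else 0 for cell in row] for row in board]


def build_features(history, to_play, board_size):
    """Returns a board_size x board_size x 17 nested list of 0/1 values.

    history: board positions, most recent first; each is a list of rows.
    Assumption: the 8 planes per player are the 8 most recent positions.
    """
    empty = [[EMPTY] * board_size for _ in range(board_size)]
    recent = (list(history) + [empty] * 8)[:8]  # pad or truncate to 8
    planes = [stone_plane(b, to_play) for b in recent]    # planes 0-7
    planes += [stone_plane(b, -to_play) for b in recent]  # planes 8-15
    color_value = 1 if to_play == BLACK else 0
    planes.append([[color_value] * board_size for _ in range(board_size)])
    # Transpose from [plane][row][col] to [row][col][plane].
    return [[[p[r][c] for p in planes] for c in range(board_size)]
            for r in range(board_size)]


board = [[EMPTY] * 9 for _ in range(9)]
board[2][3] = BLACK
features = build_features([board], to_play=WHITE, board_size=9)
print(len(features), len(features[0]), len(features[0][0]))  # 9 9 17
```

Here white is to play, so the black stone at row 2, column 3 appears in the first opponent plane (index 8) and the color plane (index 16) is all zeros.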

In the MiniGo implementation, the input features are processed by a residual tower
that consists of a single convolutional block followed by either 9 or 19
residual blocks.
The convolutional block applies the following modules:
1. A convolution of num_filter filters of kernel size 3 x 3 with stride 1
2. Batch normalization
3. A rectifier non-linearity

Each residual block applies the following modules sequentially to its input:
1. A convolution of num_filter filters of kernel size 3 x 3 with stride 1
2. Batch normalization
3. A rectifier non-linearity
4. A convolution of num_filter filters of kernel size 3 x 3 with stride 1
5. Batch normalization
6. A skip connection that adds the input to the block
7. A rectifier non-linearity

Note: num_filter is 128 for the 19 x 19 board size, and 32 for the 9 x 9 board size.

The output of the residual tower is passed into two separate "heads" for
computing the policy and value respectively. The policy head applies the
following modules:
1. A convolution of 2 filters of kernel size 1 x 1 with stride 1
2. Batch normalization
3. A rectifier non-linearity
4. A fully connected linear layer that outputs a vector of size (board_size * board_size + 1) corresponding to logit probabilities for all intersections and the pass move

The value head applies the following modules:
1. A convolution of 1 filter of kernel size 1 x 1 with stride 1
2. Batch normalization
3. A rectifier non-linearity
4. A fully connected linear layer to a hidden layer of size 256 for the 19 x 19
   board size and 64 for the 9 x 9 board size
5. A rectifier non-linearity
6. A fully connected linear layer to a scalar
7. A tanh non-linearity outputting a scalar in the range [-1, 1]

The overall network depth, in the 10 or 20 block network, is 19 or 39
parameterized layers respectively for the residual tower, plus an additional 2
layers for the policy head and 3 layers for the value head.
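
The layer counts above follow from simple arithmetic: one convolution in the initial block plus two convolutions per residual block. The sketch below reproduces those numbers and the policy output size; the function and constant names are hypothetical and do not come from the model code.

```python
POLICY_HEAD_LAYERS = 2  # 1 x 1 convolution + fully connected layer
VALUE_HEAD_LAYERS = 3   # 1 x 1 convolution + two fully connected layers

# Widths quoted in the text above, keyed by board size.
HYPERPARAMS = {
    9: {"num_filter": 32, "value_fc_width": 64},
    19: {"num_filter": 128, "value_fc_width": 256},
}


def tower_depth(num_residual_blocks):
    """Parameterized (convolutional) layers in the residual tower."""
    conv_block_layers = 1          # the single initial convolution
    layers_per_residual_block = 2  # two convolutions per residual block
    return conv_block_layers + layers_per_residual_block * num_residual_blocks


def policy_output_size(board_size):
    """One logit per intersection plus one for the pass move."""
    return board_size * board_size + 1


for blocks in (9, 19):
    print(blocks + 1, "blocks ->", tower_depth(blocks), "tower layers")
# 10 blocks -> 19 tower layers
# 20 blocks -> 39 tower layers

print(policy_output_size(9), policy_output_size(19))  # 82 362
```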

## Getting Started
Please follow the [instructions](https://github.com/tensorflow/minigo/blob/master/README.md#getting-started) in the original MiniGo repo to set up the environment.

## Training Model
One iteration of reinforcement learning consists of the following steps:
- Bootstrap: initializes a random model
- Selfplay: plays games with the latest model, producing data used for training
- Gather: groups games played with the same model into larger files of tfexamples
- Train: trains a new model with the selfplay results from the most recent N
  generations
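
The "most recent N generations" rule in the Train step can be pictured as a sliding window over model generation numbers. The helper below is only a sketch of that idea; `training_window` is a hypothetical name, and how `minigo.py` actually selects its training data is not shown here.

```python
def training_window(model_generations, n):
    """Returns the n most recent generation numbers, oldest first.

    Fewer than n values are returned while fewer generations exist.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return sorted(model_generations)[-n:]


print(training_window([0, 1, 2, 3, 4, 5], 3))  # [3, 4, 5]
print(training_window([0, 1], 5))              # [0, 1]
```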

Run `minigo.py`:
```
python minigo.py
```

## Validating Model
Run `minigo.py` with the `--validation` argument:
```
python minigo.py --validation
```
The `--validation` argument generates a holdout dataset for model validation.

## Evaluating MiniGo Models
Run `minigo.py` with the `--evaluation` argument:
```
python minigo.py --evaluation
```
The `--evaluation` argument invokes an evaluation between the latest model and the current best model.

## Testing Pipeline
As the whole RL pipeline may take hours to train even for a 9x9 board size, we provide a dummy model with a `--debug` mode for testing purposes.

Run `minigo.py` with the `--debug` argument:
```
python minigo.py --debug
```
The `--debug` argument runs the pipeline with a dummy model for testing purposes.

Validation and evaluation can also be tested with the dummy model by combining their corresponding arguments with `--debug`.
To test validation, run the following command:
```
python minigo.py --debug --validation
```
To test evaluation, run the following command:
```
python minigo.py --debug --evaluation
```
To test both validation and evaluation, run the following command:
```
python minigo.py --debug --validation --evaluation
```
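
The three flags are independent switches that can be combined freely. One way such boolean flags could be parsed is with the standard `argparse` module, as sketched below. This is an assumption for illustration only: the real argument handling in `minigo.py` is not shown here, and `build_parser` and the help strings are hypothetical.

```python
import argparse


def build_parser():
    """Hypothetical parser for the three boolean flags named above."""
    parser = argparse.ArgumentParser(description="MiniGo flag parsing sketch")
    parser.add_argument("--validation", action="store_true",
                        help="generate a holdout dataset for validation")
    parser.add_argument("--evaluation", action="store_true",
                        help="evaluate the latest model against the best one")
    parser.add_argument("--debug", action="store_true",
                        help="run with a dummy model for testing")
    return parser


# Equivalent to: python minigo.py --debug --validation
args = build_parser().parse_args(["--debug", "--validation"])
print(args.debug, args.validation, args.evaluation)  # True True False
```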

## MCTS and Go features (TODO)
Clean up the MCTS and Go features code.

research/minigo/__init__.py

Whitespace-only changes.

research/minigo/coords.py

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Logic for dealing with coordinates.

This introduces some helpers and terminology that are used throughout MiniGo.

MiniGo Coordinate: This is a tuple of the form (row, column) that is indexed
    starting out at (0, 0) from the upper-left.
Flattened Coordinate: this is a number ranging from 0 - N^2 (so N^2+1
    possible values). The extra value N^2 is used to mark a 'pass' move.
SGF Coordinate: Coordinate used for SGF serialization format. Coordinates use
    two-letter pairs having the form (column, row) indexed from the upper-left
    where 0, 0 = 'aa'.
KGS Coordinate: Human-readable coordinate string indexed from bottom left, with
    the first character a capital letter for the column and the second a number
    from 1-19 for the row. Note that KGS chooses to skip the letter 'I' due to
    its similarity with 'l' (lowercase 'L').
PYGTP Coordinate: Tuple coordinate indexed starting at 1,1 from bottom-left
    in the format (column, row)

So, for a 19x19,

Coord Type      upper_left      upper_right     pass
-------------------------------------------------------
minigo coord    (0, 0)          (0, 18)         None
flat            0               18              361
SGF             'aa'            'sa'            ''
KGS             'A19'           'T19'           'pass'
pygtp           (1, 19)         (19, 19)        (0, 0)
"""

import gtp

# We provide more than 19 entries here in case of boards larger than 19 x 19.
_SGF_COLUMNS = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
_KGS_COLUMNS = 'ABCDEFGHJKLMNOPQRSTUVWXYZ'


def from_flat(board_size, flat):
    """Converts from a flattened coordinate to a MiniGo coordinate."""
    if flat == board_size * board_size:
        return None
    return divmod(flat, board_size)


def to_flat(board_size, coord):
    """Converts from a MiniGo coordinate to a flattened coordinate."""
    if coord is None:
        return board_size * board_size
    return board_size * coord[0] + coord[1]


def from_sgf(sgfc):
    """Converts from an SGF coordinate to a MiniGo coordinate."""
    if not sgfc:
        return None
    return _SGF_COLUMNS.index(sgfc[1]), _SGF_COLUMNS.index(sgfc[0])


def to_sgf(coord):
    """Converts from a MiniGo coordinate to an SGF coordinate."""
    if coord is None:
        return ''
    return _SGF_COLUMNS[coord[1]] + _SGF_COLUMNS[coord[0]]


def from_kgs(board_size, kgsc):
    """Converts from a KGS coordinate to a MiniGo coordinate."""
    if kgsc == 'pass':
        return None
    kgsc = kgsc.upper()
    col = _KGS_COLUMNS.index(kgsc[0])
    row_from_bottom = int(kgsc[1:])
    return board_size - row_from_bottom, col


def to_kgs(board_size, coord):
    """Converts from a MiniGo coordinate to a KGS coordinate."""
    if coord is None:
        return 'pass'
    y, x = coord
    return '{}{}'.format(_KGS_COLUMNS[x], board_size - y)


def from_pygtp(board_size, pygtpc):
    """Converts from a pygtp coordinate to a MiniGo coordinate."""
    # GTP has a notion of both a Pass and a Resign, both of which are mapped to
    # None, so the conversion is not precisely bijective.
    if pygtpc in (gtp.PASS, gtp.RESIGN):
        return None
    return board_size - pygtpc[1], pygtpc[0] - 1


def to_pygtp(board_size, coord):
    """Converts from a MiniGo coordinate to a pygtp coordinate."""
    if coord is None:
        return gtp.PASS
    return coord[1] + 1, board_size - coord[0]
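
The conversions in `coords.py` can be checked against the 19x19 table in the module docstring. The sketch below is not part of the commit: it restates four of the helpers locally, copied from the module, so that it runs without the `gtp` dependency, and then verifies the upper-right corner, the pass move, and a flat round trip over the whole board.

```python
# Local copies of helpers from coords.py, so this runs without `gtp`.
_KGS_COLUMNS = 'ABCDEFGHJKLMNOPQRSTUVWXYZ'


def from_flat(board_size, flat):
    if flat == board_size * board_size:
        return None
    return divmod(flat, board_size)


def to_flat(board_size, coord):
    if coord is None:
        return board_size * board_size
    return board_size * coord[0] + coord[1]


def from_kgs(board_size, kgsc):
    if kgsc == 'pass':
        return None
    kgsc = kgsc.upper()
    col = _KGS_COLUMNS.index(kgsc[0])
    return board_size - int(kgsc[1:]), col


def to_kgs(board_size, coord):
    if coord is None:
        return 'pass'
    y, x = coord
    return '{}{}'.format(_KGS_COLUMNS[x], board_size - y)


N = 19
upper_right = (0, 18)
print(to_flat(N, upper_right), to_kgs(N, upper_right))  # 18 T19
print(to_flat(N, None), to_kgs(N, None))                # 361 pass

# Every board point and the pass move survive a flat round trip.
points = [(r, c) for r in range(N) for c in range(N)] + [None]
round_trip_ok = all(from_flat(N, to_flat(N, p)) == p for p in points)
print(round_trip_ok)  # True
```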
