Combinatorial Games: From Theoretical Solving to AI Algorithms

Eric Duchêne⋆
1 Combinatorial games
1.1 Introduction
Playing combinatorial games is a common activity for the general public. Indeed, the games of Go, Chess or Checkers are rather familiar to all of us. However, the underlying mathematical theory that enables one to compute the winner of a given game, or more generally, to build a sequence of winning moves, is rather recent. It was settled by Berlekamp, Conway and Guy only in the late 70s [2], [8]. The current section will present the highlights of this beautiful theory.
In order to avoid any confusion, first note that combinatorial game theory (here shortened as CGT) is very different from the so-called "economic" game theory introduced by Von Neumann and Morgenstern. I often consider that a preliminary activity to tackle CGT issues is the reading of Siegel's book [31], which gives a strong and formal background on CGT. Strictly speaking, a combinatorial game must satisfy the criteria given in Definition 1 below.
⋆ Supported by the ANR-14-CE25-0006 project of the French National Research Agency and the CNRS PICS-07315 project.
Definition 1 (Combinatorial game). A combinatorial game is a game satisfying the following criteria:
– There are exactly two players, usually called Left and Right, who alternate moves.
– Both players have perfect information on the game position, and no chance device is involved.
– The number of game positions is finite, and the game always ends.
– There is no draw: under the normal play convention, the player who makes the last move wins (equivalently, the first player unable to move loses).
Examples of such games are Nim [6] or Domineering [20]. In the first one, game positions are tuples of non-negative integers $(a_1, \ldots, a_n)$. A move consists of strictly decreasing exactly one of the values $a_i$ for some $1 \leq i \leq n$, provided the resulting position remains valid. The first player unable to move loses. In other words, reaching the position $(0, \ldots, 0)$ is a winning move. The game Domineering is played on a rectangular grid. The two players alternately place a domino on the grid under the following condition: Left must place his dominoes vertically and Right horizontally. Once again, the first player unable to place a domino loses. Figure 1 illustrates a position for this game, where Left started and wins, since Right cannot place any additional horizontal domino.
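To make these rules concrete, here is a minimal Python sketch of Domineering move generation. The representation of a position as a set of free (row, column) cells, as well as the helper names, are illustrative assumptions and not taken from the text.

    def left_moves(free):
        # Left places dominoes vertically: pairs of free cells stacked in a column.
        return [((r, c), (r + 1, c)) for (r, c) in free if (r + 1, c) in free]

    def right_moves(free):
        # Right places dominoes horizontally: pairs of free cells within a row.
        return [((r, c), (r, c + 1)) for (r, c) in free if (r, c + 1) in free]

    def play(free, move):
        # Placing a domino simply removes its two cells from the free set.
        return free - set(move)

    # Example: on an empty 2x2 grid, each player has two available moves.
    grid = frozenset({(0, 0), (0, 1), (1, 0), (1, 1)})
    assert len(left_moves(grid)) == 2 and len(right_moves(grid)) == 2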
Definition 2 (Game tree). Given a game G with starting position S, the game tree associated to (G, S) is a semi-ordered rooted tree defined as follows:
– The root vertex corresponds to the starting position S.
– All the game positions reachable by Left (resp. Right) in a single move from S are set as left (resp. right) children of the root.
– The subtrees rooted at these children are built recursively in the same way.
Figure 2 gives an example of such a game tree for Domineering (the starting position is the one shown in the figure). For convenience, only the top three levels of the tree are depicted; there is one additional level when fully expanded.
Now, playing any game on its game tree consists of alternately moving a token from the root towards a leaf. Each player must follow an edge corresponding to his direction (i.e., solid edges for Left and dashed ones for Right). In the normal play convention, the first player who moves the token onto a leaf of the tree is the winner. We will see later on that this tree representation is very useful, both to compute exact and approximate strategies.
In view of Definition 1, one can remark that the specified conditions are too strong to cover some of the well-known abstract 2-player games. For example, Chess and Checkers may have draw outcomes, which is not allowed in a combinatorial game. This is due to the fact that some game positions can be visited several times during the play. Such games are called loopy. In games like Go, Dots and Boxes or Othello, the winner is determined by a score and not according to the player making the last move. However, such games remain very close to combinatorial games. Some keys can be found in the literature to deal with their resolution ([31], chap. 6 for loopy games, and [24] for an overview of scoring game theory). In addition, first attempts to build an "absolute" theory that would cover normal and misère play conventions, loopy games and scoring games have recently been made [23]. Note that the concepts and issues introduced in the current survey also make sense in this extended framework.
More precisely, given a game and one of its positions, the main questions raised by CGT are the following:
– Problem 1: Can one determine the winner of the game, i.e., compute its outcome?
– Problem 2: Can one compute the value of a given game position?
– Problem 3: Can one provide a winning strategy, i.e., a sequence of optimal moves for the winner whatever his opponent's moves are?
For each of the above questions, I will give some elements of answer relative to the known theory.
The first problem is the determination of the winner of a given game, also called its outcome. In a strict combinatorial game (i.e., a game satisfying the conditions of Definition 1), there are only four possible outcomes [31]:
– L: Left wins, whoever plays first.
– R: Right wins, whoever plays first.
– N: the first player (the Next one to play) wins.
– P: the second player (the Previous one) wins.
This property can be easily deduced from the game tree, by labeling the vertices from the leaves to the root. Consequently, such an algorithm allows one to compute the outcome of a game in time polynomial in the size of the tree. Yet, a game position often has a far smaller input size than the size of its corresponding game tree. For example, a position $(a_1, \ldots, a_n)$ of Nim has an input size $O(\sum_{i=1}^{n} \log_2(a_i))$, which is far smaller than the number of positions in the game tree. Hence, computing the whole game tree is generally not the right key to determine effectively the answer to Problem 1.
Note that for loopy games, the outcome Draw is added to the list of the
possible outcomes.
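As an illustration, here is a minimal Python sketch of this leaf-to-root labeling under the normal play convention, written on top of the hypothetical Domineering helpers sketched above (positions must be hashable, e.g., frozensets, for the memoization to work).

    from functools import lru_cache

    def left_options(pos):
        return [play(pos, m) for m in left_moves(pos)]

    def right_options(pos):
        return [play(pos, m) for m in right_moves(pos)]

    @lru_cache(maxsize=None)
    def left_wins_playing_first(pos):
        # A player with no available move loses: any() on an empty list is False.
        return any(not right_wins_playing_first(p) for p in left_options(pos))

    @lru_cache(maxsize=None)
    def right_wins_playing_first(pos):
        return any(not left_wins_playing_first(p) for p in right_options(pos))

    def outcome(pos):
        lw, rw = left_wins_playing_first(pos), right_wins_playing_first(pos)
        if lw and rw:
            return "N"  # the first player wins
        if lw:
            return "L"  # Left wins, whoever plays first
        if rw:
            return "R"  # Right wins, whoever plays first
        return "P"      # the second player wins

    # On the 2x2 grid above, the first player to move wins: outcome(grid) == "N".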
The concept of game value was first defined by Conway in [8]. In his theory, each game position is assigned a numeric value among the set of surreal numbers. Roughly speaking, it corresponds to the number of moves ahead that Left has over his opponent. For instance, a Domineering position has value $-2$ when Right can place two more dominoes than Left before being blocked. A more formal definition can be found in [31]. Just note that Conway's values are defined recursively and can also be computed from the game tree.
Knowing the value of a game allows one to deduce its outcome. For example, all games having a strictly positive value have outcome L, and all games having value zero have outcome P. Moreover, this knowledge is even more paramount when the game splits into sums: a game G can then be considered as a set of independent smaller games whose values allow one to compute the overall value of G. Consider the example depicted by Fig. 3: this game position can be considered as a sum of smaller independent positions. In the case of Nim, Bouton [6] showed that a winning strategy consists of always moving to a position $(a'_1, \ldots, a'_n)$ whose bitwise sum $a'_1 \oplus \cdots \oplus a'_n$ equals 0 (meaning that it will be losing for the other player).
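The following Python sketch illustrates Bouton's strategy; the function name and the returned (heap, new value) convention are choices made for the example.

    from functools import reduce
    from operator import xor

    def nim_winning_move(heaps):
        s = reduce(xor, heaps, 0)  # bitwise sum of the heap sizes
        if s == 0:
            return None  # P-position: every move gives the win away
        for i, a in enumerate(heaps):
            if a ^ s < a:
                # Decreasing heap i to a ^ s makes the overall bitwise sum vanish.
                return (i, a ^ s)

    # Example: from (10, 7, 5), whose bitwise sum is 8, move the first heap to 2.
    assert nim_winning_move((10, 7, 5)) == (0, 2)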
A game G is in general considered as tractable when the following properties hold:
– Problem 1 and Problem 3 can be solved in polynomial time for any starting position S of G.
– Winning strategies in G can be consumed in at most an exponential number of moves.
– These two properties remain valid for any sum of two game positions of G.
Table 2. Complexity of some well-known games.

    Game          Complexity
    Tic Tac Toe   PSPACE-complete
    Othello       PSPACE-complete
    Hex           PSPACE-complete
    Amazons       PSPACE-complete
    Checkers      EXPTIME-complete
    Chess         EXPTIME-complete
    Go            EXPTIME-complete
These complexity results are generally stated for generalized versions of the games, played on boards of arbitrary size or on graphs. In 2009, Hearn and Demaine wrote a rich book about the complexity of many combinatorial games and puzzles [16]. While this list confirms that games belong to the decision problems of highest complexity, some of them admit a lower one. The game of Nim is one of them, and is luckily not the only one. For example, many games played on tuples of integers admit a polynomial winning strategy derived from tools arising from arithmetic, algebra or combinatorics on words; see the recent survey [11], which summarizes some of these games. Moreover, some games on graphs that are PSPACE-complete in general have a more affordable complexity on particular families of graphs. For example, Node Kayles is proved to be polynomial on paths and cographs [4]. This is also the case for Geography played on undirected graphs [19]. Finally, note that the complexity of Domineering is still an open problem.
A natural question arises when reading the above table: what makes a game harder than another one? While there is obviously no universal answer, Fraenkel suggests several relevant criteria in [17].
– The average branching factor, i.e., the average number of available moves from a position (around 35 for Chess and 250 for the game of Go).
– The total number of game positions ($10^{18}$ for Checkers, $10^{171}$ for the game of Go).
– The existence of cycles. In other words, loopy games are harder than non-loopy ones.
– Impartial or partizan. A game is said to be impartial if both players always have the same available moves. This implies that the game tree is symmetric. Nim is an example of an impartial game, whereas Domineering and all the games mentioned in Table 2 are not: such games are called partizan. Impartial games are in general easier to solve, since their Conway values are more "controlled".
– The fact that the game can be decomposed into sums of smaller independent games (as is the case for Domineering).
– The number of final positions.
[Figure 4: a MiniMax game tree with leaf evaluations, on which alpha-beta pruning operates.]
The principle of alpha-beta pruning is the following: if, after having explored some of the branches, it turns out that the overall value of the root is at least v, then one can prune all the unexplored branches whose values are guaranteed to be less than v. The ordering of the branches in the game tree then turns out to be paramount, as it can considerably increase the efficiency of the algorithm. In addition to this technique, one can also mention the use of transposition tables (adjoined to alpha-beta pruning) to speed up the search in the game tree.
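A compact Python sketch of MiniMax search with alpha-beta pruning is given below; the helpers children and evaluate are assumed to be supplied by the game at hand, and Left is taken as the maximizing player.

    import math

    def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True):
        kids = children(node)      # assumed move generator
        if not kids:
            return evaluate(node)  # assumed leaf evaluation
        if maximizing:
            value = -math.inf
            for child in kids:     # exploring good branches first prunes more
                value = max(value, alphabeta(child, alpha, beta, False))
                alpha = max(alpha, value)
                if alpha >= beta:  # the minimizer will avoid this branch: prune
                    break
            return value
        value = math.inf
        for child in kids:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value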
The combination of both MiniMax and Monte Carlo methods is called MCTS, which stands for Monte Carlo Tree Search. Since its introduction, it has been the object of much research on AI for games. This success is mainly explained by the significant improvements made by the computer Go programs that use this technique. Moreover, it has also shown very good performance on problems for which other techniques performed poorly (e.g., some problems in combinatorial optimization, puzzles, multi-player games, scheduling, operations research...). Another great advantage of MCTS is that no strong expert knowledge is needed to implement a good algorithm. Hence it can be considered for problems for which humans do not have a strong background. In addition, MCTS can be stopped at any time to provide the current best solution, and the tree built so far can be reused for the next step.
In what follows, we give the necessary information to understand the essence of MCTS applied to games. For additional material, the reader may refer to the more exhaustive survey [7].
The basic MCTS algorithm consists of progressively building the game tree, guided by the results of its previous explorations. Unlike the standard MiniMax algorithm, the tree is built in an asymmetric manner: the in-depth search is considered only for the most promising branches, which are chosen according to a tuned selection policy. This policy relies on the values of the nodes of the tree. Roughly speaking, the value of a node $v_i$ corresponds to the percentage of winning random simulations when $v_i$ is played. Of course, this value becomes more and more accurate as the tree grows.
[Figure 5: the four stages of an MCTS iteration (descent, growth, rollout and update), illustrated on a game tree whose nodes carry winning percentages; the rollout ends in a final position.]
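The following Python sketch outlines one iteration of these four stages. The Node class and the helpers children, random_playout and select_child are assumptions made for the example (a UCB-based select_child is sketched further below), and a full implementation would also flip the reward according to the player to move.

    import random

    class Node:
        def __init__(self, state, parent=None):
            self.state, self.parent = state, parent
            self.children, self.wins, self.visits = [], 0, 0

    def mcts_iteration(root):
        node = root
        # Descent: follow the selection policy down the explored part of the tree.
        while node.children:
            node = select_child(node)
        # Growth: attach the children of the selected node to the tree.
        node.children = [Node(s, parent=node) for s in children(node.state)]
        if node.children:
            node = random.choice(node.children)
        # Rollout: play the game to its end with random moves.
        reward = random_playout(node.state)  # 1 for a won simulation, else 0
        # Update: back up the result along the path to the root.
        while node is not None:
            node.visits += 1
            node.wins += reward
            node = node.parent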
The many enhancements that have been proposed for MCTS can be classified according to the stage they impact. Table 3 summarizes the most important enhancements brought to MCTS.
Table 3. Most important enhancements brought to MCTS.

    Stage     Improvement
    Descent   UCT (2006) [22]
    Descent   RAVE (2007) [15]
    Descent   Criticality (2009) [10]
    Growth    FPU (2007) [35]
    Rollout   Pool-RAVE (2011) [26]
    Rollout   NST (2012) [33]
    Rollout   BHRF (2016) [14]
    Update    Fuego reward (2010) [13]
One of the most important features of the algorithm is the node selection policy during the descent. At each step of this stage, MCTS chooses the node that maximizes (or minimizes, according to whether it is Left or Right's turn) some quantity. A formula that is frequently used is called Upper Confidence Bounds (UCB). It associates to each node $v_i$ of the tree the following value:
$$V(v_i) + C \times \sqrt{\frac{\ln N}{n_i}},$$
where $V(v_i)$ is the percentage of winning simulations involving $v_i$, $n_i$ is the total number of simulations involving $v_i$, $N$ is the number of times its parent has been visited, and $C$ is a tunable parameter. This formula is well known in the context of bandit problems (choose sequentially amongst $n$ actions the best one in order to maximize the cumulative reward). In particular, it allows one to deal with the exploration-exploitation dilemma, i.e., to find a balance between exploring unvisited nodes and reinforcing the statistics of the best ones. The combination of MCTS and UCB is called UCT [22].
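In Python, a UCB-based descent step could look as follows (the value $C = \sqrt{2}$ is a common default, and the Node fields are those of the sketch above):

    import math

    def select_child(node, C=math.sqrt(2)):
        def ucb(child):
            if child.visits == 0:
                return math.inf  # always try unvisited children first
            # V(v_i) + C * sqrt(ln N / n_i), N being the parent's visit count
            return (child.wins / child.visits
                    + C * math.sqrt(math.log(node.visits) / child.visits))
        return max(node.children, key=ucb)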
A second common enhancement of MCTS during the descent is the RAVE estimator (Rapid Action-Value Estimator [15]). It consists of considering each move of the rollout as being as important as the first move. In other words, the moves visited during the rollout stage will also affect the values of the same moves in the tree. On Fig. 5, imagine the move E3 is played during the simulation depicted with dashed lines. RAVE will then modify the UCB value of the node E3 of the tree (the RAVE formula will not be given here).
MCTS has also been widely studied in order to increase the quality of the random simulations. A first way to mimic the strategy of a good player is to consider evaluation functions based on expert knowledge. In [34], moves are categorized according to several criteria: location on the board, capturing or blocking potential, and proximity to the last move. The approach is then to evaluate the probability that a move belonging to a category will be played by a real player. This probability is determined by analyzing a huge sample of real games played by either humans or computers. Of course, this strategy is fully specific to the game on which MCTS is applied. More generic approaches have been considered, such as NST [33], BHRF [14] or Pool-RAVE [26]. In the first two, good sequences of moves are kept in memory: indeed, it is rather frequent that, given successive attacking moves of a player, there is a usual sequence of answers by the opponent to defend himself. In the last one, the random rollout policy is biased by the values in the game tree, i.e., good moves visited in the tree are likely to be played during a simulation.
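As an illustration of this last idea, here is a small Python sketch of a rollout move choice biased in the spirit of Pool-RAVE; the pool of good moves, the probability p and the helper legal_moves are assumptions made for the example.

    import random

    def biased_rollout_move(state, pool, p=0.5):
        # With probability p, play one of the best moves recorded in the tree
        # (the "pool") when legal; otherwise fall back to a uniform random move.
        moves = legal_moves(state)  # assumed move generator
        good = [m for m in moves if m in pool]
        if good and random.random() < p:
            return random.choice(good)
        return random.choice(moves)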
4 Perspectives
One may wonder whether deep learning approaches such as AlphaGo [32] make the above techniques obsolete. This is not the case: the neural network approach proposed by Google requires a wide set of expert knowledge, together with a large amount of computing power over a long period of time. However, there are some games for which neither is available. In particular, General Game Playing is a real challenge for AI algorithms, as the rules of the game are given at most 20 minutes before running the program. Supervised learning techniques like those of AlphaGo are thus almost impossible to set up, and standard MCTS enhancements are currently the most effective ones for this kind of problem. In addition, one can also look into adapting MCTS to problems of higher uncertainty, such as multi-player games or games having randomness in their rules (the use of dice, for example). First steps have already been made in that direction [36].
References
1. L. V. Allis, Searching for Solutions in Games and Artificial Intelligence, PhD thesis, University of Limburg, Maastricht, The Netherlands (1994).
2. E. Berlekamp, J. H. Conway, and R. K. Guy, Winning ways for your mathematical
plays, Vol. 1, Second edition. A K Peters, Ltd., Natick, MA (2001).
3. A. Bernstein and M. Roberts, Computer v. Chess-Player, Scientific American 198 (1958), 96–105.
4. H. L. Bodlaender and D. Kratsch, Kayles and nimbers, J. Algorithms 43 (2002), 106–119.
5. E. Bonnet and A. Saffidine, Complexité des Jeux (in French), Bulletin de la ROADEF 31 (2014), 9–12.
6. C. L. Bouton, Nim, a game with a complete mathematical theory, Annals of Math.
3 (1905), 35–39.
7. C. Browne, E. Powley, D. Whitehouse, S. Lucas, P. I. Cowling, P. Rohlfshagen, S.
Tavener, D. Perez, S. Samothrakis and S. Colton, A Survey of Monte Carlo Tree
Search Methods, IEEE Transactions on computational intelligence and AI in games
4 (1) (2012), 1–43.
8. J. H. Conway, On Numbers and Games, Academic Press Inc. (1976).
9. R. Coulom, Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, in Proc. 5th Int. Conf. Comput. and Games, Turin, Italy (2006), 72–83.
10. R. Coulom, Criticality: a Monte-Carlo Heuristic for Go Programs, invited talk at the University of Electro-Communications, Tokyo, Japan (2009).
11. E. Duchêne, A.S. Fraenkel, V. Gurvich, N.B. Ho, C. Kimberling and U. Larsson,
Wythoff Wisdom, Preprint.
12. D. J. Edwards and T. P. Hart, The α-β heuristic, Artificial Intelligence Project RLE and MIT Computation Center, Memo 30 (1963).
13. M. Enzenberger, M. Müller, B. Arneson and R. Segal, Fuego: an open-source framework for board games and Go engine based on Monte Carlo tree search, IEEE Trans. Comput. Intell. AI Games 2(4) (2010), 259–270.
14. A. Fabbri, F. Armetta, E. Duchêne and S. Hassas, A Self-Acquiring Knowledge Process for MCTS, International Journal on Artificial Intelligence Tools 25(1) (2016).
15. S. Gelly, D. Silver, Combining online and offline knowledge in UCT, in Proceedings
of the International Conference on Machine Learning (ICML), ed. by Z. Ghahramani,
ACM, New York (2007), 273-280.
16. R. A. Hearn and E. D. Demaine, Games, Puzzles, and Computation, A K Peters
(2009).
17. A.S. Fraenkel, Nim is easy, chess is hard - but why??, J. Internat. Computer Games
Assoc. 29 (2006), 203–206.
18. A.S. Fraenkel, Complexity, appeal and challenges of combinatorial games, Theo-
retical Computer Science 313 (2004), 393–415.
19. A.S. Fraenkel and S. Simonson, Geography, Theoretical Computer Science 110 (1993), 197–214.
20. M. Gardner, Mathematical Games: Cram, crosscram and quadraphage: new games
having elusive winning strategies, Scientific American 230 (1974), 106–108.
21. A. Junghanns and J. Schaeffer, Sokoban: Enhancing general single-agent search methods using domain knowledge, Artificial Intelligence 129(1) (2001), 219–251.
22. L. Kocsis and C. Szepesvári, Bandit based Monte-Carlo planning, Lecture Notes in Artificial Intelligence 4212, Springer, Berlin (2006), 282–293.
23. U. Larsson, R.J. Nowakowski and C. Santos, Absolute Combinatorial Game Theory,
arXiv:1606.01975 (2016).
24. U. Larsson, R.J. Nowakowski and C. Santos, When waiting moves you in scoring
combinatorial games, arXiv:1505.01907 (2015).
25. G. Renault and S. Schmidt, On the complexity of the misère version of three games played on graphs, preprint.
26. A. Rimmel, F. Teytaud and O. Teytaud, Biasing Monte-Carlo simulations through
RAVE values, International Conference on Computers and Games (2011), 59–68.
27. L. Rougetet, Combinatorial games and machines, in: A Bridge between Conceptual Frameworks: Sciences, Society and Technology Studies (R. Pisano, ed.), Springer, Dordrecht (2015), 475–494.
28. T. J. Schaefer, On the complexity of some two-person perfect-information games, J. Comput. System Sci. 16 (1978), 185–225.
29. J. Schaeffer, N. Burch, Y. Björnsson, A. Kishimoto, M. Müller, R. Lake, P. Lu and S. Sutphen, Checkers Is Solved, Science 317(5844) (2007), 1518–1522.
30. C. Shannon, Programming a computer for playing chess, Philosophical Magazine Series 7 41(314) (1950), 256–275.
31. A. N. Siegel, Combinatorial Game Theory, Graduate Studies in Mathematics 146, American Mathematical Society (2013).
32. D. Silver, A. Huang, C.J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J.
Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot et al., Mastering the
game of Go with deep neural networks and tree search, Nature 529(7587) (2016),
484–489.
33. M.J.W. Tak, M.H.M. Winands and Y. Björnsson, N-grams and the last-good-reply policy applied in general game playing, IEEE Trans. Comput. Intell. AI Games 4(2)
(2012), 73-83.
34. Y. Tsuruoka, D. Yokoyama and T. Chikayama, Game-tree search algorithm based
on realization probability, ICGA J. 25(3) (2002), 132–144.
35. Y. Wang and S. Gelly, Modifications of UCT and sequence-like simulations for Monte-Carlo Go, IEEE Symposium on Computational Intelligence and Games, Honolulu, Hawaii (2007), 175–182.
36. M. Winands, Monte-Carlo Tree Search in Board Games, Handbook of Digital
Games and Entertainment Technologies, Springer (2015), 1–30.