Introduction
After learning the MiniMax algorithm, I decided to practice it on Tic Tac Toe. The algorithm is simple to implement. However, it took me much more time than I expected. So, I would like to share what I have learned here.
MiniMax algorithm with alpha beta pruning
The shortest description of MiniMax that I can find is from Wikipedia. I implemented it this way:
public int MiniMaxShortVersion(int depth, int alpha, int beta, out Board childWithMax)
{
childWithMax = null;
if (depth == 0 || IsTerminalNode())
{
RecursiveScore = m_Score;
return m_TurnForPlayerX ? m_Score : -m_Score;
}
foreach (Board cur in GetChildren())
{
Board dummy;
int score = -cur.MiniMaxShortVersion(depth - 1, -beta, -alpha, out dummy);
if (alpha < score)
{
alpha = score;
childWithMax = cur;
if (alpha >= beta)
{
break;
}
}
}
RecursiveScore = alpha;
return alpha;
}
MiniMaxShortVersion(depth, int.MinValue + 1, int.MaxValue - 1, out ret);
The description on Wikipedia is very short and elegant, but there are some things to watch out:
First, we cannot use int.MinValue
and int.MaxValue
as -/+infinity. -int.MaxValue
is not a valid integer. So, I used int.MinValue+1
and int.MaxValue-1
instead.
Second, the terminal nodes include nodes with no successors, and game-over nodes (winner already decided). In my initial implementation, I forgot the game-over nodes.
Finally, the description on Wikipeida seems to indicate that the node type (min/max) doesn't matter. However, the heuristic value needs to change the sign based on whether the node is a min or max node. In the longer version descriptions, the heuristic score doesn't have to change, which makes the coding much easier. In order to find where the mistake is, I had to implement the longer version:
public int MiniMax(int depth, bool needMax, int alpha, int beta, out Board childWithMax)
{
childWithMax = null;
System.Diagnostics.Debug.Assert(m_TurnForPlayerX == needMax);
if (depth == 0 || IsTerminalNode())
{
RecursiveScore = m_Score;
return m_Score;
}
foreach (Board cur in GetChildren())
{
Board dummy;
int score = cur.MiniMax(depth - 1, !needMax, alpha, beta, out dummy);
if (!needMax)
{
if (beta > score)
{
beta = score;
childWithMax = cur;
if (alpha >= beta)
{
break;
}
}
}
else
{
if (alpha < score)
{
alpha = score;
childWithMax = cur;
if (alpha >= beta)
{
break;
}
}
}
}
RecursiveScore = needMax ? alpha : beta;
return RecursiveScore;
}
The second version takes a few more minutes to type. But it saves me a couple hours from debugging. It has another advantage: when its RecurisveScore
is positive, Player X is winning. When the RecurisveScore
is negative, Player O is winning. The shorter version on Wikipedia requires more logic.
Heuristic score in Tic Tac Toe
The Minimax algorithm can be applied to many games. Its implementation doesn't change for a different game. However, the challenge part is to define the heuristic score. I found that it is not easy even for a game as simple as Tic Tac Toe. There are totally 8 rows in a Tic Tac Toe board. The rules to calculate the score are:
- For each row, if there are both X and O, then the score for the row is 0.
- If the whole row is empty, then the score is 1.
- If there is only one X, then the score is 10.
- If there are two Xs, then the score is 100.
- If there are 3 Xs, then the score is 1000, and the winner is Player X.
- For Player O, the score is negative.
- Player X tries to maximize the score.
- Player O tries to minimize the score.
- If the current turn is for Player X, then the score of Player X has more advantage. I gave the advantage rate as 3.
The source code for computing the heuristic score is:
int GetScoreForOneLine(GridEntry[] values)
{
int countX=0, countO=0;
foreach (GridEntry v in values)
{
if (v == GridEntry.PlayerX)
countX++;
else if (v == GridEntry.PlayerO)
countO++;
}
if (countO == 3 || countX == 3)
{
GameOver = true;
}
int advantage = 1;
if (countO == 0)
{
if (m_TurnForPlayerX)
advantage = 3;
return (int)System.Math.Pow(10, countX)*advantage;
}
else if (countX == 0)
{
if (!m_TurnForPlayerX)
advantage = 3;
return -(int)System.Math.Pow(10, countO) * advantage;
}
return 0;
}
void ComputeScore()
{
int ret = 0;
int[,] lines = { { 0, 1, 2 }, { 3, 4, 5 },
{ 6, 7, 8 }, { 0, 3, 6 },
{ 1, 4, 7 }, { 2, 5, 8 },
{ 0, 4, 8 }, { 2, 4, 6 }
};
for (int i = lines.GetLowerBound(0); i <= lines.GetUpperBound(0); i++)
{
ret += GetScoreForOneLine(new GridEntry[]
{ m_Values[lines[i, 0]],
m_Values[lines[i, 1]],
m_Values[lines[i, 2]] });
}
m_Score = ret;
}
Interesting Tic Tac Toe move
Because Player X always makes the first move, it is more difficult for Player O to win. The program prints all the winning moves for Player O. I found these moves for Player O are interesting:
[1]Winner is PlayerO:
OXX OXX OXX OXX OXX OXX
--- => O-- => O-- => OO- => OO- => OOO
--- --- X-- X-- X-X X-X
[2]Winner is PlayerO:
OX- OX- OX- OX- OX- OXO
--X => --X => X-X => XOX => XOX => XOX
--- O-- O-- O-- O-X O-X
[3]Winner is PlayerO:
X-- X-- X-X X-X
XOX => XOX => XOX => XOX
--O O-O O-O OOO
[4]Winner is PlayerO:
OX- OX- OXX OXX
X-O => X-O => X-O => XOO
X-- X-O X-O X-O
[5]Winner is PlayerO:
XXO XXO XXO XXO
X-- => X-- => XX- => XX-
-O- OO- OO- OOO
The program can also print out all the winning moves for Player X. The rules for Player O to defend are:
- If Player X makes the first move at the corner, then Player O has to take the center.
- If Player X makes the first move at the center, then Player O has take the corners.
- If Player X makes the first move at the top edge, then Player O cannot take the side edges or unconnected corners.
Two of the interesting moves for Player X are:
[1]Winner is PlayerX:
X-- X-- X-- X-X XOX XOX
--- => --- => O-- => O-- => O-- => OX-
--O X-O X-O X-O X-O X-O
[2]Winner is PlayerX:
X-O X-O X-O X-O X-O X-O
--- => X-- => X-- => XX- => XXO => XXO
--- --- O-- O-- O-- O-X
Similar Tic Tac Toe boards
When I printed out the Tic Tac Toe moves, I found that many moves were redundant because they were very similar. I defined a class Transform
to find out all the similar Tic Tac Toe boards.
class Transform
{
const int Size = 3;
delegate Point TransformFunc(Point p);
public static Point Rotate90Degree(Point p)
{
return new Point(Size - p.y -1, p.x);
}
public static Point MirrorX(Point p)
{
return new Point(Size - p.x -1, p.y);
}
public static Point MirrorY(Point p)
{
return new Point(p.x, Size - p.y -1);
}
List<TransformFunc> actions = new List<TransformFunc>();
public Point ActOn(Point p)
{
foreach (TransformFunc f in actions)
{
if(f!=null)
p = f(p);
}
return p;
}
Transform(TransformFunc op, TransformFunc[] ops)
{
if(op!=null)
actions.Add(op);
if (ops!=null && ops.Length > 0)
actions.AddRange(ops);
}
public static List<Transform> s_transforms = new List<Transform>();
static Transform()
{
for (int i = 0; i < 4; i++)
{
TransformFunc[] ops =
Enumerable.Repeat<TransformFunc>(Rotate90Degree, i).ToArray();
s_transforms.Add(new Transform(null, ops));
s_transforms.Add(new Transform(MirrorX, ops));
s_transforms.Add(new Transform(MirrorY, ops));
}
}
}
When a board can be transformed into another board, they are considered similar.
Conclusion
In this article, I implemented Tic Tac Toe in two versions of the MiniMax algorithm. One is shorter but more difficult to debug. The longer version takes more time to type in, but saves a lot of trouble when debugging.