Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

Learning Connect Four

0.00/5 (No votes)
22 Jul 2004 2  
A Connect Four Game that learns from experience.

Sample Image - ConnectFourGamePiccy.png

Introduction

First of all, it should be noted right from the start that the fact that this is a connect four game is pretty much incidental to the project. There were two distinct reasons behind the creation of this project, the first being that I wanted a better board control than the one that I had been using in previous applications, and I needed a framework to develop that within, without other projects becoming sidetracked by the process of developing the board. Also, I needed to be able to develop the board control so that I could use it in different applications, and so I wanted to try and avoid the dangers of connecting it too strongly to any particular project. My current personal feeling on this is that the board control still needs another project to get it where I want it to be, but it does the job required of it here.

The second reason for the project is to do with the reading I have been doing about Artificial Intelligence and the idea of learning. The argument I read somewhere goes something along the lines that if we ever want to develop computer software that is intelligent then the program must have access to the full knowledge of the human race in order to be able to work anything out. This strikes me as wrong on so many levels, it's kind of hard to know where to start. Firstly, though there is no one on the planet that either has access to the full knowledge of the human race, or would be capable of remembering more than a fraction of it, even if they did somehow manage to pump it into their brains without having a total nervous breakdown. Then there is the point of heuristics and making decisions with incomplete information, let alone taking into account personal and political bias.

But this is distracting from this particular project, so let's get back to it. The question that arises from the idea of a computer having complete access to sum of human knowledge is one thing, but just say for one second that you have written a computer program that is the most intelligent piece of software on the planet. The only problem is that it knows absolutely nothing. How do you get the sum of all human knowledge into the computer in the first place? Forget the sum of all human knowledge for a second, how do you get any information into the program? The most obvious answer to this is with a database of some sort, and the only problem with databases is that they are initially empty. Databases are also used for specific purposes, containing data that is pertinent to the task or business at hand. It would be quite a daunting task to even conceive of a general database that could be used for all things. Finally, though we get to the point where even if we had such a database, we would still need to put the information into it, which brings me to the main thread of thinking that led to this project.

Given that we have the imaginary intelligent piece of software and the imaginary all knowing database schema to feed it information, who is going to either write the program/s to fill the database, or sit there typing in the information. I can assure you, I wouldn't. I find database programming tedious at the best of times, the last thing I would be interested in doing would be entering information into a database for the rest of my life because even though I am personally far from knowing the sum of all human knowledge, it would probably take the rest of my life just to type in the small amount of knowledge I do have into a database in a form where a program could meaningfully extract the information. This brings me to the obvious conclusion that if a piece of software is going to learn the sum of all human knowledge, then it is just going to have to do it on its own.

A child can learn on its own, so why not a piece of software? Rather than hand the software a database of information to use, surely it makes more sense to develop a way of allowing the computer to learn for itself, than to just chuck information at it and hope that it knows how to deal with it. Which rather conveniently brings us to patterns.

Memory Pattern Theory

Everything has patterns. The room you are sitting in at the moment follows exactly the same patterns as all the other rooms you have ever been in your life. The question is "How do you know you're in a room?" What is it about your current environment that lets you know you are in a room, and the answer is that the pattern of your environment fulfils your criteria for the accepted definition of a room. That is, it has an entry way, walls, usually four or more, a floor, and a roof of some description. There are usually windows of some sort but these aren't necessary. We can get more involved when we start talking about types of rooms but they follow the same basic pattern with added extras such as shelves, and books for a library, or a stove for a kitchen, or a Bath or shower or both for the bathroom.

The question then is how do we apply the idea of patterns to a form that the software can use in such a way that it can learn about its environment, which brings us back to the Connect Four program in that the formations of the pieces represent the patterns of the game. If you think about it on a very basic level, the game of Connect Four is simply a collection of color patterns that the players respond to in order to make their moves. Certain patterns mean certain things within the game, and the players make their decisions based on these patterns, especially if they have seen those patterns before and they have caused the game to be either won or lost.

A game of Connect Four takes place on at least three levels. There is the level of memory. The 'have I seen this pattern before' level that the player knows could be a threat or not. There is the level of threat in which the player recognizes a pattern of three pieces of the same color that could result in either the player or their opponent winning the game, and there is the safe level in which the current move will neither win nor lose the game.

Learning Connect Four

The Connect Four game uses four distinct types of patterns to determine a move. These are Aggressive, Defensive, Warning, and Tactical.

  • Aggressive:- An Aggressive Pattern is where the software either knows or thinks that it can make a move that will win the game.
  • Defensive:- A Defensive Pattern is where the software either knows or thinks that a certain move is needed to be made in order to avoid losing the game.
  • Warning:- A Warning Pattern is where the software either knows or thinks that a certain pattern within the board can result in the opponent player winning the game.
  • Tactical:- A Tactical Pattern is where the software either knows or thinks that a certain pattern within the board can result in the software winning the game.

Every time that the software makes a move, it takes a snapshot of the board. At this point, it doesn't matter what color the pieces are or which player they belong to. Once the pieces within the board have been determined then the computer starts searching through the pieces looking for patterns, where a pattern is defined as two or more pieces of the same color. Naturally, the color of the pieces determines what type of pattern the group of pieces is eventually defined as, so a group of either two or three pieces of computers color will be defined as being an aggressive pattern that can be used by the software to win the game. Although, at this stage, the code is simply concerned with determining valid patterns of whatever color.

An example of the code finding a pattern is:

/// check top

/// 

patternSquareInfo = connectFourGame.GetSquareInfo( 
    GetIdentifierAbove( square.Identifier ) );
    
if( patternSquareInfo != null )
{
    bool bTwoPieces = false;
    bool bThreePieces = false;

    if( patternSquareInfo.IsOccupied == true )
    {
        nextInfo = connectFourGame.GetSquareInfo( 
            GetIdentifierAbove( patternSquareInfo.SquareIdentifier ) );
        if( nextInfo != null && nextInfo.IsOccupied == false )
        {
            /// allow this square

            /// 

            bTwoPieces = true;
        }
        else 
        {
            if( nextInfo != null )
            {
                bTwoPieces = true;
                bThreePieces = true;
            }
        }
    }

    if( bTwoPieces == true )
    {
        ConnectFourPiece abovePiece = new 
            ConnectFourPiece( false, patternSquareInfo.SquareIdentifier );

        if( piece.IsPieceRed == patternSquareInfo.IsRed )
        {
            abovePiece.IsPieceRed = patternSquareInfo.IsRed;
            abovePiece.Position = PIECEPOSITION.ABOVE;
            abovePiece.Level = 1;
                            
            if( connectFourGame.PlayerIsRed == patternSquareInfo.IsRed )
                abovePiece .IsEnemy = false;
            else
                abovePiece.IsEnemy = true;

            pattern.AddGamePiece( abovePiece );

            if( bThreePieces == false )
            {
                if( patternCollection.IsIn( pattern ) == false )
                    patternCollection.AddPattern( pattern );
            }
        }
    }

    if( bThreePieces == true )
    {
        patternSquareInfo = nextInfo;
        nextInfo = connectFourGame.GetSquareInfo( 
            GetIdentifierAbove( patternSquareInfo.SquareIdentifier ) );
        if( nextInfo != null && nextInfo.IsOccupied == false )
        {
            ConnectFourPiece abovePiece = new 
                ConnectFourPiece( false, patternSquareInfo.SquareIdentifier );

            if( piece.IsPieceRed == patternSquareInfo.IsRed )
            {
                abovePiece.IsPieceRed = patternSquareInfo.IsRed;
                abovePiece.Position = PIECEPOSITION.ABOVE;
                abovePiece.Level = 2;

                if( connectFourGame.PlayerIsRed == patternSquareInfo.IsRed )
                    abovePiece.IsEnemy = false;
                else
                    abovePiece.IsEnemy = true;

                pattern.AddGamePiece( abovePiece );

                if( patternCollection.IsIn( pattern ) == false )
                    patternCollection.AddPattern( pattern );
            }
        }
    }

    pattern.Clear();
    pattern.AddGamePiece( piece );
}

The code starts with the current piece information that is contained in the square, and in the code example, it checks the board to see if there is a piece on the square above it. If there is, it then checks the square above that to see if the square above is a valid square and if it also contains a piece. If the square above the original piece is valid then this is a two piece pattern, and as long as the colors match, then the piece is stored, but the pattern is not added to the collection unless the square above it is not equal to null.

If the square above that is valid then this is a three piece pattern, and the remaining piece is added as long as the colors are the same and the square above is valid and unoccupied. This is done because if the pattern contains three pieces then the next piece to be added to that pattern is going to be a winning move for someone, and if there is no where to move to, then it is simply not relevant as a pattern.

We are now in the position where we have a snapshot of the board to work on with all the patterns that the board contains being held within the patternCollection object which is just a collection of Connect Four Patterns. The next step is to determine what we are going to do with the patterns.

The following code is a snippet from the code that finds the defensive patterns.

for( int i=0; i<patternCollection.Count; i++ )
{
    holdingPattern = patternCollection.GetPattern( i );

    if( historicalPatternCollection.IsIn( holdingPattern ) == true )
    {
        holdingPattern = 
            ( ConnectFourPattern )historicalPatternCollection.GetPattern( 
            holdingPattern );

        if( holdingPattern.NumberOfTimesSeenInLosingGame > 0 
            && IsValidPattern( holdingPattern ) == true )
        {
            holdingPattern.Weighting = nMemoryDefensiveWeight;
            defensivePatterns.AddPattern( holdingPattern );
        }
        else
        {
            if( connectFourGame.GetSquareInfo( 
                holdingPattern.GetStartsWith() ).IsRed 
                != connectFourGame.PlayerIsRed )
                defensivePatterns.AddPattern( holdingPattern );
        }
    }
    else /// determine if this is a defensive pattern

    {
        patternSquareInfo = connectFourGame.GetSquareInfo( 
            holdingPattern.GetStartsWith() );

        /// if the pattern square colour is NOT the same as the player colour

        /// it's a pattern that needs defending against

        /// 

        if( patternSquareInfo.IsRed != connectFourGame.PlayerIsRed )
        {
            leftSquareInfo = connectFourGame.GetSquareInfo( 
                GetIdentifierToLeft( patternSquareInfo.SquareIdentifier ) );
            leftSquareNextInfo = connectFourGame.GetSquareInfo( 
                GetIdentifierToLeft( 
                GetIdentifierToLeft( patternSquareInfo.SquareIdentifier ) ) );
            rightSquareInfo = connectFourGame.GetSquareInfo( 
                GetIdentifierToRight( patternSquareInfo.SquareIdentifier ) );
            rightSquareNextInfo = connectFourGame.GetSquareInfo( 
                GetIdentifierToRight( 
                GetIdentifierToRight( patternSquareInfo.SquareIdentifier ) ) );

            if( leftSquareNextInfo != null )
                nextSquareInfo = connectFourGame.GetSquareInfo( 
                GetIdentifierToLeft( leftSquareNextInfo.SquareIdentifier ) );
            else
                nextSquareInfo = null;

            if( leftSquareInfo != null && leftSquareNextInfo != null )
            {
                if( leftSquareInfo.IsOccupied == true 
                    && leftSquareInfo.IsRed != connectFourGame.PlayerIsRed ) 
                {
                    if( leftSquareNextInfo.IsOccupied == false )
                        defensivePatterns.AddPattern( holdingPattern );
                    else
                    {
                        if( nextSquareInfo != null 
                        && nextSquareInfo.IsOccupied == false
                        && leftSquareNextInfo.IsRed 
                        != connectFourGame.PlayerIsRed )
                        {
                        defensivePatterns.AddPattern( holdingPattern );
                        }
                    }
                }
            }

The code starts by cycling through the patterns that were collected from the snapshot, and then checks to see if the pattern is a historical pattern. That is, if the code has encountered that specific pattern in previous games. This is determined by the code saving all the patterns collected from the last snapshot of the board at the end of each game. If the pattern is a pattern that has been seen before, and has been part of a losing game, then the weighting for the pattern is given a memory defensive weighting. (The way weights are used in the game is discussed below.) The pattern is then added to the defensive patterns list. If, however, the pattern has been seen before but has not been part of a losing game, then the pattern is still added to the defensive patterns list, as long as the color criteria matches, but it does not get given any special weighting.

The bit that generates a lot of code is when the pattern has not been seen before, and we have to work it out the hard way. This is done by getting the squares along the axis that we want to check, and then testing to see if the squares along the axis in the case of the defensive pattern contain the player's pieces in such a formation that could lead them to win the game.

This procedure is repeated for all the pattern types whenever the computer makes a move in order for the computer to build a complete picture of the patterns within the current snapshot. Once the patterns within the snapshot have been processed then the computer reaches this code:

if( aggressivePatterns.Count == 0 && defensivePatterns.Count == 0 
    && warningPatterns.Count == 0 && tacticalPatterns.Count == 0 )
{
    MakeRandomMove();
    return;
}

Which basically means that if no patterns have been found within the current snapshot then just make a random move. Note though that as there needs to be at least two adjoining squares of the same color to form a pattern, this will not just be called for the first move. Once we have the separate pattern collections, then we get to the process patterns function which is used to deal with all four of the pattern collections.

The Process Patterns function through the patterns in the collections, and attempts to assign weights to individual patterns in order to determine how important it is to react to the pattern.

The function starts by:

if( pattern.NumberOfTimesSeen > 1 || pattern.Weighting == nAggressiveWeight )
{

    /// ensures that a pattern of three pieces will have a higher priority

    /// than a pattern with two pieces

    /// 


    if( pattern.Count == 3 )
    {
        pattern.Weighting = patternWeight * 2;

        /// A winning pattern can be ignored 

        /// if a defensive pattern from memory is used 

        /// and the winning pattern has not been seen before. 

        /// 

 
        if( patternWeight == nAggressiveWeight )
            pattern.Weighting += nMemoryAggressiveWeight;
    }
    else 
        pattern.Weighting = 0;

    /// give added weight if pattern is part of an ending move

    /// 


    if( pattern.IsEndingPattern == true )
        pattern.Weighting += 2;

    if( pattern.ResponsePresent == true )
    {
        if( pattern.NumberOfTimesSeenInWinningGame 
            > pattern.NumberOfTimesSeenInLosingGame )
        {
            pattern.Weighting += pattern.NumberOfTimesSeenInWinningGame * 3;
            bFindMove = false;
        }
        else
            bFindMove = true;
    }

    /// try to increase the warning level in an attempt to detect 

    /// two way tricks

    /// 


    if( patternWeight == nWarningWeight 
        && pattern.NumberOfTimesSeenInLosingGame 
            > pattern.NumberOfTimesSeenInWinningGame )
        pattern.Weighting += pattern.NumberOfTimesSeenInLosingGame;

It starts by checking that the pattern has been seen more than once unless it is an aggressive pattern in which it is allowed through at all times. This is done to reduce the occurrences of the program missing winning moves that it hasn't seen before, as no one really wants to play a game where they get the feeling that their opponent is letting them win. It then checks to see if the pattern has a count of three. This is done so that patterns containing three pieces can be given a higher weighting than patterns with only two pieces, and an aggressive pattern of three pieces that has been seen before being given the highest rating of all.

We then check to see if the pattern is recognized as a pattern that has previously ended a game of Connect Four, and if it has, it is given a slightly increased weighting before checking if the pattern already has a programmed response from a previous game. If there is a response from a previous game, and if it is a winning response, then the weighting for the pattern is increased, so we recognize that we need to find a move to make.

We then check to see if the pattern is a warning and try to increase the odds for preventing two way tricks. A two way trick being where the opponent has two pieces next to each other with both squares on either side being blank, which means that if the opponent places a piece on either end of the two pieces, they are going to win the game, and nothing can be done about it. Once this is done, the function then checks to see if there is a need to find a move. If there is, it checks all the available options and assigns weights accordingly.

When all the patterns have been processed, the code then extracts the pattern with the highest weighting and attempts to make the move, though at this point, a move can still be blocked if it is not to the computer's advantage to make it, in that it will give the player a chance to win the game. It is because of these conditions that the computer will only make a few attempts to make a move before deciding to just move randomly.

Remembering the Moves

The historical patterns for the game are loaded when the program is initialized and saved at the end of every game. At the end of each game, the patterns collected by the last snapshot are compared to the patterns in the historical pattern collection. If the pattern exists, it is updated, and if it doesn't, it is added to the collection before the entire historical pattern collection is saved to disk.

Here is an example of the saved pattern data:

<ConnectFourPattern>
    <BasicGamePatternSet>
        <PatternID>0</PatternID>
        <BasicGamePiece>
            <PieceID>0</PieceID>
            <IsStartForPattern>True</IsStartForPattern>
            <Square>CF</Square>
            <IsEnemy>False</IsEnemy>
            <PiecePosition>START</PiecePosition>
            <Level>0</Level>
        </BasicGamePiece>
        <BasicGamePiece>
            <PieceID>1</PieceID>
            <IsStartForPattern>False</IsStartForPattern>
            <Square>DF</Square>
            <IsEnemy>False</IsEnemy>
            <PiecePosition>RIGHT</PiecePosition>
            <Level>1</Level>
        </BasicGamePiece>
        <NumberOfTimesSeen>80</NumberOfTimesSeen>
        <NumberOfTimesSeenInWinningGame>38</NumberOfTimesSeenInWinningGame>
        <NumberOfTimesSeenInLosingGame>42</NumberOfTimesSeenInLosingGame>
        <EndingPattern>False</EndingPattern>
        <Weighting>103</Weighting>
        <Response>
            <ResponsePresent>0</ResponsePresent>
        </Response>
    </BasicGamePatternSet>
</ConnectFourPattern>

It should be noted that the IDs for both the pattern and the pieces are not used in the current program, and will probably be removed completely from the next program in the series.

Weighting in Connect Four

There are a number of predefined weight constants within the code that help the code decide exactly what it is going to do about a certain pattern. These are:

const int nAggressiveWeight = 10;
const int nDefensiveWeight = 7;
const int nTacticalWeight = 3;
const int nWarningWeight = 5;
const int nMemoryAggressiveWeight = 105;
const int nMemoryDefensiveWeight = 104;
const int nMemoryTacticalWeight = 103;
const int nMemoryWarningWeight = 102;

These weights are split into two categories; these being the standard weighting for each pattern type, and the standard weighting for the type if it comes from memory. The idea here is that when a pattern is recognized from memory, then it is much more likely to be used than a weight from a pattern that the program hasn't seen before. When the code is deciding what patterns to use, it assigns the initial weights.

holdingPattern = ( ConnectFourPattern )historicalPatternCollection.GetPattern( 
    holdingPattern );


if( holdingPattern.NumberOfTimesSeenInLosingGame > 0 
    && IsValidPattern( holdingPattern ) == true )

{
    holdingPattern.Weighting = nMemoryDefensiveWeight;
    defensivePatterns.AddPattern( holdingPattern );
}
else
{
    if( connectFourGame.GetSquareInfo( holdingPattern.GetStartsWith() ).IsRed 
        != connectFourGame.PlayerIsRed )
        defensivePatterns.AddPattern( holdingPattern );
}

The above code shows the loop that is used when finding defensive patterns. At this point, we know that the pattern is in the historical patterns, and if it is a pattern that is commonly seen in a losing game, then we assign it the memory weight for a defensive pattern. If, however, the pattern is not usually seen in a losing game, it is left to the later code to assign a weight to it.

This later code is in the Process Patterns function where the code calculates which end of the pattern to place the response piece. This is done using simple weights where each side is given a weight that is incremented and decremented according to the availability of the squares, with the winning position being awarded the passed in weight. It should be noted that the Process Patterns function processes all patterns and takes the weight for each pattern type as the passed in parameter.

A Question of Balance

The trick to the learning part of the program is not only to get the program to improve as it goes along, but to appear as if it is improving as well. For this reason, the learning does not take place instantly. The code needs to see a move at least twice before it will start to react to it. This means that when playing the game at the start, it is possible for the player to deliberately catch out the computer with what should be a stupidly obvious winning move, and pull off the move a couple of times before the computer recognizes it. For example, say you were doing a straight run across the bottom of the board; at first, the computer should let you get away with it. But once it spots the move and remembers it, the program should trap that move every time.

Although, it should be pointed out that some of the code works in the more traditional sense of calculating the best move immediately. This is done because if the game was to rely entirely on what it remembered, it would be too easy to beat, and I personally don't think that the impression of the learning would be as effective. There is also some higher level code that is executed before the computer is allowed to move, this is done to prevent really obvious mistakes.

This effectively means that the code runs on three separate levels, these being the initial and basic level of things that it remembers. The middle deterministic level that a normal Connect Four game would work on calculates a move directly based on the current status of the board, and the upper control level that can veto any move if it fails to fulfill its rules by making a move that will, for example, allow the player to win with their next move.

The software does have a tactics pattern although this is probably under-used within the program as the question arose a couple of times in the development of just how intelligent do you need a Connect Four game to be. I've tried to reach a compromise between playability and complication within the code, as given the nature of a Connect Four game, I'm not entirely convinced that it is possible to write a killer game that would win every time, and even if you could, it wouldn't be any fun to play. So, for this reason, the tactics aren't a major part of this code though there are some ideas about using tactics floating about for future use. Therefore, any seemingly brilliant moves that catch the player out by getting them to block a winning move that allows the software to use the piece the player just dropped to win the game, are purely incidental.

A Note on the References

The books listed below are the books that I have read since I started researching artificial intelligence. All of them have influenced my thinking to a greater or lesser degree, and where I knowingly use something from a book, I will make it clear in the article that uses it. The fact that a specific book or books are not mentioned in the text does not mean it has not had an influence as they are all part of the core research.

References

  • Tom Archer (2001) Inside C#, Microsoft Press.
  • Jeffery Richter (2002) Applied Microsoft .NET Framework Programming, Microsoft Press.
  • Charles Peltzold (2002) Programming Microsoft Windows With C#, Microsoft Press.
  • Robinson et al (2001) Professional C#, Wrox.
  • William R. Staneck (1997) Web Publishing Unleashed Professional Reference Edition, Sams.net.
  • Robert Callan, (1999) The Essence Of Neural Networks, Prentice Hall.
  • Timothy Masters (1993) Practical Neural Network Recipes In C++, Morgan Kaufmann (Academic Press).
  • Melanie Mitchell (1999) An Introduction To Genetic Algorithms, MIT Press.
  • Joey Rogers (1997) Object-Oriented Neural Networks in C++, Academic Press.
  • Simon Haykin (1999) Neural Networks A Comprehensive Foundation, Prentice Hall.
  • Bernd Oestereich (2002) Developing Software With UML Object-Orientated Analysis And Design In Practice, Addison Wesley.
  • R Beale & T Jackson (1990) Neural Computing An Introduction, Institute Of Physics Publishing.
  • Bart Kosko (1994) Fuzzy Thinking, Flamingo.
  • Buckley & Eslami (2002) An Introduction To Fuzzy Logic And Fuzzy Sets, Physica-Verlag.
  • Steven Johnson (2001) Emergence, The Penguin Press.
  • John H. Holland (1998) Emergence From Chaos To Order, Oxford University Press.
  • Earl Cox (1999) The Fuzzy Systems Handbook, AP Professional.
  • Mark Ward (1999) Virtual Organisms, Pan.
  • Bonabeau, Dorigo, Theraulaz (1999) Swarm Intelligence From Natural To Artificial Systems, Oxford University Press.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here