I'm writing a Connect4 game with an AI opponent using adversarial search techniques and I have somewhat run into a wall. I feel that I'm not far from a solution but that there's perhaps a problem where I'm switching perspectives (as in: the perspective of which participant I'm basing my evaluation scores on), missing a minus sign somewhere or something like that. The problem is either that in the variations that I've tried that the AI chooses not to block the player when the player has three-in-a-row but otherwise the AI plays a perfect game, or that he prefers to block the player, even if he has the chance to win the game. It also seems to matter whether or not the search depth is an even or an uneven number, as the AI is pants-on-head retarded at a 6-ply search, which is pretty telling that something is wrong. Search The algorithm used is negamax with alpha-beta pruning and is implemented as follows: <pre class="prettyprint"><code>private int Negamax(int depth, int alpha, int beta, Player player) { Player winner; if (Evaluator.IsLeafNode(game, out winner)) { return winner == player ? (10000 / depth) : (-10000 / depth); } if (depth == Constants.RecursionDepth) { return Evaluator.Evaluate(game, depth, player); } foreach (var move in moves) { int row; if (board.DoMove(move, player, out row)) { var value = -Negamax(depth + 1, -beta, -alpha, (Player)1 - (int)player); board.UndoMove(move, row, player); if (value > alpha) { alpha = value; if (player == Player.AI) { bestColumn = move; } } if (alpha >= beta) { return alpha; } } } return alpha; } </code></pre> I don't suspect that the problem is in this function, but it might be. Evaluation I've based the evaluation function off of the fact that there are only 69 possible ways to get four-in-a-row on a 7x6 board. I have a look-up table of about 350 items that contains the hardcoded information for every column and row which win-combinations the row+column is a part of. For example, for row 0 and column 0, the table looks like this: <pre class="prettyprint"><code>//c1r1 table[0][0] = new int[3]; table[0][0][0] = 21; table[0][0][1] = 27; table[0][0][2] = 61; </code></pre> This means that column 0, row 0 is part of win-combination 21, 27 and 61. I have a second table, that contains for both players how many stones it has in each of the win-combinations. When I do a move then I do the following: <pre class="prettyprint"><code>public bool DoMove(int column, Player p, out int row) { row = moves[column]; if (row >= 0) { Cells[column + row * Constants.Columns] = p; moves[column]--; var combinations = this.Game.PlayerCombinations[p]; foreach (int i in TerminalPositionsTable.Get(column,row)) { combinations[i]++; } return true; } else { return false; } } </code></pre> The opposite is of course being done for <code>UndoMove</code>. So after doing a move on column 0, row 0 by <code>Player.Human</code>, the table will be filled with a value of 1 at index 21, 27 and 61. If I do another move in a cell that is also part of win-combination 27, then the player combination table gets incremented at index 27 to 2. I hope I have made that clear, as it's used in the evaluation function to very quickly determine how close a player is to scoring four-in-a-row. The evaluation function, where I suspect the problem lies, is as follows: <pre class="prettyprint"><code>public static int Evaluate(Game game, int depth, Player player) { var combinations = game.PlayerCombinations[player]; int score = 0; for (int i = 0; i < combinations.Length; i++) { switch (combinations[i]) { case 1: score += 1; break; case 2: score += 5; break; case 3: score += 15; break; } } return score; } </code></pre> So I simply loop through the 69 possible win-combinations and add an amount to the score based on whether it's a single stone, two-in-a-row or three. The part I'm still confused about in this whole adversarial search is whether I should care which player is doing a move? I mean, should I pass in the player like I do here, or should I always evaluate the board from the perspective of the AI player? I've tried many combinations of <code>aiScore - humanScore</code>, or just always look from the perspective of <code>Player.AI</code>, and such. But I've hit a dead end and every combination I've tried was pretty flawed. So: <ol> <li>Is the logic of my evaluation solid in its basis?</li> <li>When should I 'switch perspective'? </li> </ol> Any help would be much appreciated. Update I've implemented Brennan's suggestions below, and while it has definitely much improved, for some reason it doesn't block three-in-a-rows on any column but the two left and right-most, and only when the search-depth is uneven. The AI is unbeatable at even search depths, but only until depth 8 and up. Then it refuses to block again. This is pretty telling that I'm probably very close, but still have some crucial flaw. Perhaps this has to do with me setting the column the AI should drop a stone in as Brennan commented, but I don't know when else to set it. Setting it only at depth 0 doesn't work. Update 2 Edited the code as it looks like now with Brennan's changes. Update 3 Created a Github repo with the full code. If you don't know how to work Git, just download a zip file from here. It's a .NET 4.0 project, and running it will create log files of the negamax algorithm in your documents/logs directory. The solution also contains a test project, that contains a test for every board column whether or not the AI chooses to block the player when the player has three-in-a-row there.

This stuff makes my brain hurt, so I'm not positive that this answer is correct, but here goes. In negamax, the score is always evaluated relative to the player currently on move. If it's white's move, then a high score is good for white. If it's black's move, then a high score is good for black. So if you have a leaf node, whether the score is +inf or -inf is determined not by whether the node is a win for white or black, but whether it's a win for the player you're currently evaluating. Replace this: <pre class="prettyprint"><code>return winner == Player.AI ? (10000 / depth) : (-10000 / depth); </code></pre> with this: <pre class="prettyprint"><code>return winner == player ? (10000 / depth) : (-10000 / depth); </code></pre> There is a similar problem in your evaluation function. Replace this: <pre class="prettyprint"><code>return player == Player.AI ? score : -score; </code></pre> with this: <pre class="prettyprint"><code>return score; </code></pre> Again, I'm not sure this is right. But I hope you try those two changes and let me know if it works. I'm very curious!

Adverserial search troubles

Tags:

c#

.net

artificial-intelligence

I'm writing a Connect4 game with an AI opponent using adversarial search techniques and I have somewhat run into a wall. I feel that I'm not far from a solution but that there's perhaps a problem where I'm switching perspectives (as in: the perspective of which participant I'm basing my evaluation scores on), missing a minus sign somewhere or something like that.

The problem is either that in the variations that I've tried that the AI chooses not to block the player when the player has three-in-a-row but otherwise the AI plays a perfect game, or that he prefers to block the player, even if he has the chance to win the game. It also seems to matter whether or not the search depth is an even or an uneven number, as the AI is pants-on-head retarded at a 6-ply search, which is pretty telling that something is wrong.

Search

The algorithm used is negamax with alpha-beta pruning and is implemented as follows:

private int Negamax(int depth, int alpha, int beta, Player player)
{
  Player winner;
  if (Evaluator.IsLeafNode(game, out winner))
  {
    return winner == player ? (10000 / depth) : (-10000 / depth);
  }

  if (depth == Constants.RecursionDepth)
  {
    return Evaluator.Evaluate(game, depth, player);
  }

  foreach (var move in moves)
  {
    int row;

    if (board.DoMove(move, player, out row))
    {
      var value = -Negamax(depth + 1, -beta, -alpha, (Player)1 - (int)player);

      board.UndoMove(move, row, player);

      if (value > alpha)
      {
        alpha = value;
        if (player == Player.AI)
        {
          bestColumn = move;
        }
      }

      if (alpha >= beta)
      {
        return alpha;
      }

    }
  }
  return alpha;
}

I don't suspect that the problem is in this function, but it might be.

Evaluation

I've based the evaluation function off of the fact that there are only 69 possible ways to get four-in-a-row on a 7x6 board. I have a look-up table of about 350 items that contains the hardcoded information for every column and row which win-combinations the row+column is a part of. For example, for row 0 and column 0, the table looks like this:

//c1r1
table[0][0] = new int[3];
table[0][0][0] = 21;
table[0][0][1] = 27;
table[0][0][2] = 61;

This means that column 0, row 0 is part of win-combination 21, 27 and 61.

I have a second table, that contains for both players how many stones it has in each of the win-combinations. When I do a move then I do the following:

public bool DoMove(int column, Player p, out int row)
{
  row = moves[column];

  if (row >= 0)
  {
    Cells[column + row * Constants.Columns] = p;

    moves[column]--;

    var combinations = this.Game.PlayerCombinations[p];

    foreach (int i in TerminalPositionsTable.Get(column,row))
    {
      combinations[i]++;
    }

    return true;
  }
  else
  {
    return false;
  }
}

The opposite is of course being done for UndoMove.

So after doing a move on column 0, row 0 by Player.Human, the table will be filled with a value of 1 at index 21, 27 and 61. If I do another move in a cell that is also part of win-combination 27, then the player combination table gets incremented at index 27 to 2.

I hope I have made that clear, as it's used in the evaluation function to very quickly determine how close a player is to scoring four-in-a-row.

The evaluation function, where I suspect the problem lies, is as follows:

public static int Evaluate(Game game, int depth, Player player)
{
  var combinations = game.PlayerCombinations[player];

  int score = 0;

  for (int i = 0; i < combinations.Length; i++)
  {
    switch (combinations[i])
    {
      case 1:
        score += 1;
        break;
      case 2:
        score += 5;
        break;
      case 3:
        score += 15;
        break;
    }
  }

  return score;
}

So I simply loop through the 69 possible win-combinations and add an amount to the score based on whether it's a single stone, two-in-a-row or three.

The part I'm still confused about in this whole adversarial search is whether I should care which player is doing a move? I mean, should I pass in the player like I do here, or should I always evaluate the board from the perspective of the AI player? I've tried many combinations of aiScore - humanScore, or just always look from the perspective of Player.AI, and such. But I've hit a dead end and every combination I've tried was pretty flawed.

So:

Is the logic of my evaluation solid in its basis?
When should I 'switch perspective'?

Any help would be much appreciated.

Update

I've implemented Brennan's suggestions below, and while it has definitely much improved, for some reason it doesn't block three-in-a-rows on any column but the two left and right-most, and only when the search-depth is uneven. The AI is unbeatable at even search depths, but only until depth 8 and up. Then it refuses to block again. This is pretty telling that I'm probably very close, but still have some crucial flaw.

Perhaps this has to do with me setting the column the AI should drop a stone in as Brennan commented, but I don't know when else to set it. Setting it only at depth 0 doesn't work.

Update 2

Edited the code as it looks like now with Brennan's changes.

Update 3

Created a Github repo with the full code. If you don't know how to work Git, just download a zip file from here.

It's a .NET 4.0 project, and running it will create log files of the negamax algorithm in your documents/logs directory. The solution also contains a test project, that contains a test for every board column whether or not the AI chooses to block the player when the player has three-in-a-row there.

573

asked Jul 03 '10 00:07

JulianR

1 Answers

This stuff makes my brain hurt, so I'm not positive that this answer is correct, but here goes.

In negamax, the score is always evaluated relative to the player currently on move. If it's white's move, then a high score is good for white. If it's black's move, then a high score is good for black. So if you have a leaf node, whether the score is +inf or -inf is determined not by whether the node is a win for white or black, but whether it's a win for the player you're currently evaluating. Replace this:

return winner == Player.AI ? (10000 / depth) : (-10000 / depth);

with this:

return winner == player ? (10000 / depth) : (-10000 / depth);

There is a similar problem in your evaluation function. Replace this:

return player == Player.AI ? score : -score;

with this:

return score;

Again, I'm not sure this is right. But I hope you try those two changes and let me know if it works. I'm very curious!

191

answered Sep 23 '22 13:09

Brennan Vincent

Related questions
                            
                                XmlSerializer constructor error with class derived from a base class
                            
                                Find substring ignoring specified characters
                            
                                NCommon - resources, documentation
                            
                                Is there a better way to count the messages in an Message Queue (MSMQ)? [duplicate]
                            
                                Limiting what a .net plugin can access
                            
                                How to receive EMails in C# application?
                            
                                Platform configuration for projects in VS 2010
                            
                                Can I get the method local variables through a stack trace in C#?
                            
                                C# copy file to another directory using other domain/username/password
                            
                                C# How to generate code from code
                            
                                C# Excel Interop : Excel process remains in memory until parent form closed
                            
                                Cost to GC of using weak references in C#?
                            
                                C# RSA public Modulus/Exponent? (Bad Data)
                            
                                Amazon (AWS) - The request must contain the parameter Signature
                            
                                NavigationWindow Click Sound
                            
                                Adding a directory temporarily to Windows 7's DLL search paths
                            
                                Strategies for testing reactive, asynchronous code
                            
                                How to mimic Picasa's rendering of reduced quality image to speed up drawing
                            
                                Efficient foreach child evaluation in parallel
                            
                                Custom Data Annotations IsValid is never invoked. (ASP.NET MVC 2 .NET 4)

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With