Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Adverserial search troubles

I'm writing a Connect4 game with an AI opponent using adversarial search techniques and I have somewhat run into a wall. I feel that I'm not far from a solution but that there's perhaps a problem where I'm switching perspectives (as in: the perspective of which participant I'm basing my evaluation scores on), missing a minus sign somewhere or something like that.

The problem is either that in the variations that I've tried that the AI chooses not to block the player when the player has three-in-a-row but otherwise the AI plays a perfect game, or that he prefers to block the player, even if he has the chance to win the game. It also seems to matter whether or not the search depth is an even or an uneven number, as the AI is pants-on-head retarded at a 6-ply search, which is pretty telling that something is wrong.

Search

The algorithm used is negamax with alpha-beta pruning and is implemented as follows:

private int Negamax(int depth, int alpha, int beta, Player player)
{
  Player winner;
  if (Evaluator.IsLeafNode(game, out winner))
  {
    return winner == player ? (10000 / depth) : (-10000 / depth);
  }

  if (depth == Constants.RecursionDepth)
  {
    return Evaluator.Evaluate(game, depth, player);
  }

  foreach (var move in moves)
  {
    int row;

    if (board.DoMove(move, player, out row))
    {
      var value = -Negamax(depth + 1, -beta, -alpha, (Player)1 - (int)player);

      board.UndoMove(move, row, player);

      if (value > alpha)
      {
        alpha = value;
        if (player == Player.AI)
        {
          bestColumn = move;
        }
      }

      if (alpha >= beta)
      {
        return alpha;
      }

    }
  }
  return alpha;
}

I don't suspect that the problem is in this function, but it might be.

Evaluation

I've based the evaluation function off of the fact that there are only 69 possible ways to get four-in-a-row on a 7x6 board. I have a look-up table of about 350 items that contains the hardcoded information for every column and row which win-combinations the row+column is a part of. For example, for row 0 and column 0, the table looks like this:

//c1r1
table[0][0] = new int[3];
table[0][0][0] = 21;
table[0][0][1] = 27;
table[0][0][2] = 61;

This means that column 0, row 0 is part of win-combination 21, 27 and 61.

I have a second table, that contains for both players how many stones it has in each of the win-combinations. When I do a move then I do the following:

public bool DoMove(int column, Player p, out int row)
{
  row = moves[column];

  if (row >= 0)
  {
    Cells[column + row * Constants.Columns] = p;

    moves[column]--;

    var combinations = this.Game.PlayerCombinations[p];

    foreach (int i in TerminalPositionsTable.Get(column,row))
    {
      combinations[i]++;
    }

    return true;
  }
  else
  {
    return false;
  }
}

The opposite is of course being done for UndoMove.

So after doing a move on column 0, row 0 by Player.Human, the table will be filled with a value of 1 at index 21, 27 and 61. If I do another move in a cell that is also part of win-combination 27, then the player combination table gets incremented at index 27 to 2.

I hope I have made that clear, as it's used in the evaluation function to very quickly determine how close a player is to scoring four-in-a-row.

The evaluation function, where I suspect the problem lies, is as follows:

public static int Evaluate(Game game, int depth, Player player)
{
  var combinations = game.PlayerCombinations[player];

  int score = 0;

  for (int i = 0; i < combinations.Length; i++)
  {
    switch (combinations[i])
    {
      case 1:
        score += 1;
        break;
      case 2:
        score += 5;
        break;
      case 3:
        score += 15;
        break;
    }
  }

  return score;
}

So I simply loop through the 69 possible win-combinations and add an amount to the score based on whether it's a single stone, two-in-a-row or three.

The part I'm still confused about in this whole adversarial search is whether I should care which player is doing a move? I mean, should I pass in the player like I do here, or should I always evaluate the board from the perspective of the AI player? I've tried many combinations of aiScore - humanScore, or just always look from the perspective of Player.AI, and such. But I've hit a dead end and every combination I've tried was pretty flawed.

So:

  1. Is the logic of my evaluation solid in its basis?
  2. When should I 'switch perspective'?

Any help would be much appreciated.

Update

I've implemented Brennan's suggestions below, and while it has definitely much improved, for some reason it doesn't block three-in-a-rows on any column but the two left and right-most, and only when the search-depth is uneven. The AI is unbeatable at even search depths, but only until depth 8 and up. Then it refuses to block again. This is pretty telling that I'm probably very close, but still have some crucial flaw.

Perhaps this has to do with me setting the column the AI should drop a stone in as Brennan commented, but I don't know when else to set it. Setting it only at depth 0 doesn't work.

Update 2

Edited the code as it looks like now with Brennan's changes.

Update 3

Created a Github repo with the full code. If you don't know how to work Git, just download a zip file from here.

It's a .NET 4.0 project, and running it will create log files of the negamax algorithm in your documents/logs directory. The solution also contains a test project, that contains a test for every board column whether or not the AI chooses to block the player when the player has three-in-a-row there.

like image 573
JulianR Avatar asked Jul 03 '10 00:07

JulianR


People also ask

What are adversarial search problems?

In artificial intelligence, deep learning, machine learning, and computer vision, adversarial search is basically a kind of search in which one can trace the movement of an enemy or opponent. The step which arises the problem for the user or the user or the agent doesn't want that specific step task to be carried out.

Why game playing problems are considered adversarial search problems?

Each agent needs to consider the action of other agent and effect of that action on their performance. So, Searches in which two or more players with conflicting goals are trying to explore the same search space for the solution, are called adversarial searches, often known as Games.

What is adversarial search technique?

AI Adversarial search: Adversarial search is a game-playing technique where the agents are surrounded by a competitive environment. A conflicting goal is given to the agents (multiagent). These agents compete with one another and try to defeat one another in order to win the game.


1 Answers

This stuff makes my brain hurt, so I'm not positive that this answer is correct, but here goes.

In negamax, the score is always evaluated relative to the player currently on move. If it's white's move, then a high score is good for white. If it's black's move, then a high score is good for black. So if you have a leaf node, whether the score is +inf or -inf is determined not by whether the node is a win for white or black, but whether it's a win for the player you're currently evaluating. Replace this:

return winner == Player.AI ? (10000 / depth) : (-10000 / depth);

with this:

return winner == player ? (10000 / depth) : (-10000 / depth);

There is a similar problem in your evaluation function. Replace this:

return player == Player.AI ? score : -score;

with this:

return score;

Again, I'm not sure this is right. But I hope you try those two changes and let me know if it works. I'm very curious!

like image 191
Brennan Vincent Avatar answered Sep 23 '22 13:09

Brennan Vincent