Local Search
C463 / B551 Artificial Intelligence
Adversarial Search
Adversarial Search
- In particular for games involving two opposing players that
attempt to achieve a mutually exclusive goal.
- It can have other applications (economics).
- They qualify as competitive multi-agent environments.
- The players take turns to make a move.
- The environment is strategic, often of perfect information.
Game Problem Setup
- Initial state - positions of every piece involved on both sides
and which player has the next move.
- Successor function - identifies the legal moves from any state
- Terminal test - knowing when one of the player won the game or
when there is a draw.
- Utility (payoff) function - an evaluation of the terminal states,
like the score.
- A heuristic estimating the payoff for an intermediate state is
also often used when the game tree is too large to compute completely.
Minmax
- A strategy assuming that each player is
choosing the next move to maximize their utility function.
- Minmax(state n):
- utility(n) if n is a terminal state
- max{ minmax( successor(n))} if n is a state where the current
player has the next move,
- min{ minmax( successor(n))} if n is a state where the opponent has
the next move.
Minmax
- The minmax algorithm with 1 level aims to maximize the utility or
estimated payoff of all the legal moves available to the player.
- For 2 levels, it assumes that the opponent will optimize their
payoff from the move after the next, so the player minimizes the
utility for the opponent. Each odd level in the tree minimizes its
descendants, while each even level maximizes them. If the tree is
very big, it may not be practical to expand it to the terminal
states. In that case we can choose a maximal expansion depth.
Example: Sticks Game
The players have piles of sticks. They can remove any number of sticks
from any one pile. The one removing the last stick loses.
Minmax Algorithm
; Lisp version
def minmax(n, what):
if n is a terminal node:
return its score
else:
compute the score of all of
its children using 1-what
if what == 1:
return the max score of the children
else:
return the min score of the children
# Python version
(defun minmax (n, what)
(if (terminalp n) (score n)
(let ((scores nil))
(dolist (s (expand n))
(push (cons s
(minmax s (- 1 what)))
scores))
(if (= what 1) (max scores)
(min scores)))))
Alpha-Beta
- Idea: pruning some of the subtrees once we know enough about them
to decide that they don't contribute to the decision for the next
move.
- It uses a global best move in the tree for each side (alpha = us,
beta = opponent).
- For alpha (us) the value starts with the -infinity and increases
over time as we find better moves.
- For beta (the opponent) the value starts as +infinity and
decreases over time as the opponent find moves that disadvantage us
better.
Alpha-Beta
- ALPHA value of a node
It is a value never greater than the true score of this
node. Initially it is the score of that node, if the node is a leaf,
otherwise it is -infinity. Then at a MAX node it is set to the largest
Beta score of its successors, and at a MIN node to the alpha value of
its predecessor.
- BETA value of a node
It is a value never smaller than the true score of this
node. Initially it is the score of that node, if the node is a leaf,
otherwise it is +infinity. Then at a MIN node it is set to the
smallest of the Alpha scores of its successors, and at a MAX node to
the beta value of its predecessor.
- For a terminal node alpha = beta = utility score.
Alpha-Beta for a MAX node
def MAX_AB (N, A, B):
if N is a leaf:
return its score;
else:
alpha = -infinity
for each successor S of N :
val = MIN_AB(S, max(A, alpha), B)
if val > alpha:
alpha = val
if alpha >= B: ]: #prune the subtree
return alpha
return alpha
Alpha-Beta for a MIN node
def MIN_AB (N, A, B):
if N is a leaf :
return its score
else :
beta = +infinity
for each successor S of N:
val = MAX_AB(S, A, min(B, beta))
if val < beta:
beta = val
if A >= beta: #prune the subtree
return beta
return beta
Example:
+infinity = 20, -infinity = -20
Pruning a min subtree
Applet:
http://www.ocf.berkeley.edu/~yosenl/extras/alphabeta/alphabeta.html
->
As 3 < 7, we know the opponent will choose something
equal or smaller, so we can discard the rest of the tree
because we already have a better move (7).
Pruning a max subtree
When we find 8 at the bottom, the parent was (-15, -3) and 8 replaces
-15. As 8 > -3 we update alpha of the parent (the red node above), and
it becomes (8, -3) before. Since that is the opponent, it will notice
that we got a better move and not give us a chance to take it. So the
opponent discards that node, which means that we don't have to
evaluate the rest of the subtree. Since this was the last green child
at this level, we can finalize the red node above with alpha = beta =
-3.
Cutoff Methods
- Sometimes we know a global minimum and maximum.
- If the value of beta reaches the global minimum, there's no use
searching forward.
- The same thing for alpha, if its value reaches the maximum, we can
prune the rest of the subtrees.
- For example, in a win-lose situation, when we find a path from
which we can win, there's no point looking at alternatives. We can
also assume that if the opponent finds a state where we lose, there's
no point looking for other states.
Cutoff
Games Involving Chance
- The successors of a state n can be a list of states depending on a
probabilistic event, like throwing a die, or drawing another card.
- Suppose the successors are {s1, s2, ... sk} where each of them can
occur with a probability pi, and resulting in an estimated score vi.
- Then the score returned by this node will be
score(n) = Sum ( pi * vi)
the sum of the individual scores weighted by their probabilities.
- Sometimes the probabilities are not completely known (card games)
and they are estimated.
Othello
- An 8x8 board. Each player has 32 pieces with one face white and
one black.
- A piece will capture (reverse) all the pieces of the opponent that
are on a line (including diagonal) between the new piece and another
one of the player's color.
- Only moves capturing at least one opponent piece are legal.
- The game ends when no more moves are possible.
- The player with the higher number of pieces wins.
Legal Moves
->
Heuristic Payoff
- Tic-Tac-Toe: one could assign a value of 4 to the center,
of 2 to the corners and of 1 to the rest of the board.
- Othello:
- the number of captured pieces in one move
- corners are good
- squares next to corners are not good,
- borders are better than interior squares