Home > theory > Bimatrix games

Bimatrix games

This week in Israel is the Workshop on Innovations in Algorithmic Game Theory, which has been a collection of amazing talks and I thought of blogging about something new I learned here. First, Paul Goldberg gave an amazing talk about the Lemke-Howson algorithm and Homotopy Methods. Later, during the poster session Jugal Garg presented the impressive work on an exact algorithm for finding Nash equilibria in rank 1 bimatrix games. Both have in common the use of homotopy, which I found I quite cool idea – and has been present in Game Theory for a while, but I didn’t know about. They also have to do with improving the understanding we have on the Lemke-Howson Algorithm – a (potentially exponential time) algorithm to find Nash equilibrium in 2-person bi-matrix games, but that seems to work pretty well in practice. As always, I though that blogging about it would be a good idea in order to understand those concepts well.

Bimatrix Game and How to compute equilibrium

Bimatrix game is the simplest class of games studied. Basically it is a 2 player game, with n strategies for player 1 and m strategies for player 2 which is represented by a pair of n \times m matrices (A,B). Let x \in \mathbb{R}_+^n, \Vert x \Vert_1 = 1 represent a probability distribution that player 1 assigns to his strategies and y \in \mathbb{R}_+^m, \Vert y \Vert_1 = 1 be the same for player 2. This way, the players experience utilities:

[; u_1 (x,y) = x^t A y,  u_2(x,y) = x^t B y;]

The best understood class of those games is the one where A+B = 0, called zero-sum games. For this class, computing a Nash equilibrium is very easy and it is given by the famous min-max theorem: player 1 finds x maximizing \min_j (x^t A e_j) where e_j is the j-th unit vector. Similarly player 2 finds y maximizing \min_i (e_i^t B y). Then the pair of strategies obtained is a Nash equilibrium – and this verifying that is not hard.

When A+B \neq 0, the problems gets a lot more complicated. Proving that equilibrium exist can be done using various fixed point theorems, as Brouwer or Kakutani. There is a very simple exponential time algorithm for finding it and the key observation is the following, if (x,y) is a Nash equilibrium then:

\forall i: x_i = 0 \text{ or } \forall i': e_i^t A y \geq e_{i'}^t A y

\forall j: y_j = 0 \text{ or } \forall j': x^t B e_j \geq x^t B e_{j'}

which means that each strategy for player i is either in the support or is a best response. Proving that is trivial (if some strategy is in the support and is not a best response, then reducing the probability we play it is an improving deviation). Therefore if we just guess the support of x and the support of y we just need to find some strategies with this support satisfying the inequalities above. This can be done using a simple LP. This is clearly not very efficient, since it involves solving 2^{n+m} LPs. A still exponential, but a lot more practical method, is:

Lemke-Howson Algorithm

A good overview of the L-H algorithm can be found in those lecture notes or in a more detailed version in third chapter of the AGT book (a pdf  can be found in Tim’s website). Here I’ll present a quick overview. The main idea is to define the best-response polytopes P and Q:

[; P = \{(x,v) \in \mathbb{R}^{n+1} \vert x_i \geq 0, x^t B e_j \leq v; \textbf{1}^t x = 1 \};]

[; Q = \{(y,u) \in \mathbb{R}^{m+1} \vert e_i^t A y \leq u, y_j \geq 0; \textbf{1}^t y = 1 \} ;]

The intuition is that a point (x,v) \in P represents the fact that the payoff v of player 2 when player 1 plays x is at least v. We could re-write as v \geq \max_j x^t B e_j.

Each of the polytopes has m+n inequalities and 1 equality. Given x we define L(x) as the indices of the tight inequalities in P and similarly we define L(y) as the indices of the tight inequalities in Q. The theorem in the previous section can be rephrased as:

(x,y) is Nash equilibrium iff L(x) \cup L(y) = \{1, 2, \hdots, n+m\}

So we need to look at points in the polytope below that are fully-labeled. And the way that this is done in L-H is quite ingenious – it is a similar idea that is used by the Simplex Method – walking through the vertices of the polytope looking for the desired vertex. In order to do it, let’s define another polytope:

[;P' = \{x \in \mathbb{R}^n \vert x_i \geq 0, x^t B e_j \leq 1\} ;]

[;Q' = \{y \in \mathbb{R}^m \vert e_i^t A y \leq 1, y_j \geq 0\} ;]

Note that there is a clear bijection between P' \setminus 0 \leftrightarrow P and Q' \setminus 0 \leftrightarrow Q. This is a projective transformation given by x \mapsto (\frac{x}{\Vert x \Vert_1},\Vert x \Vert_1). Notice that the labels are preserved by this transformation and the vertex and edges of the polytope are mapped almost 1-1 (except the vertex 0). Notice a couple of details:

1. A vertex in x \in P' corresponds to a point with n labels. A vertex in y \in Q'' corresponds to a point with m labels.

2. The point (0,0) corresponds to a fully-labeled point of P' \times Q', unfortunately this is the only fully-labeled point that doesn’t correspond to a Nash equilibrium.

3. By taking an edge of the polytope (which is define by n-1 labels in P' and m-1 labels in Q'. We can move between two nodes that are almost fully labeled.

The idea is to consider the set of nodes that are almost fully-labeled: fix some label k and consider all the nodes such that L(x) \cup L(y) \supseteq \{1 .. k-1, k+1 .. n+m\}. Those points are either (0,0), a Nash equilibrium or they have one duplicated label, since \vert L(x) \vert + \vert L(y) \vert = n+m. An edge is composed of n+m-1 labels (no duplicated labels). So, one almost fully-labeled point that is not a Nash or (0,0) is only connected to two other vertices via an edge (which correspond to dropping one of the duplicated labels and following the corresponding edge).

This gives us a topology of the space of Nash equilibria. This tells us a couple of facts: (1) we can find a Nash equilibrium by starting in (0,0) and following the L-H path. In the end of the path, we must find a Nash equilibrium. (ii) If the polytope is not degenerate, then there is an odd number of Nash equilibria and they are connected by L-H edges in the following way:

where the blue dots are the Nash equilibria, the white dots are the almost fully-labeled points and the edges are the L-H edges. The number of Nash equilibria are odd by a simple parity argument.

The classic way in which simplex-like methods walk through the vertices of a polytope is by pivoting. The same way we can implement Lemke-Howson. For an explanation on the implementation, please look at the chapter 3 of the AGT book.

One can ask if we could modify the L-H path following to go through the path faster. Recently, Goldberg, Papadimitriou and Savani proved that finding the Nash equilibrium that L-H outputs is PSPACE-complete. So, in principle, finding this specific equilibrium, seems harder than finding any Nash, which is PPAD-complete.

My current plan is on some other blog posts on Bimatrix games. I wanted to discuss two things: first the homotopy methods and homotopy fixed-point-theorems (which is the heart of the two papers mentioned above) and about new techniques for solving zero-matrix where the strategy space is exponential.

  1. No comments yet.
  1. No trackbacks yet.