
Richman games

1995


Games of No Chance
MSRI Publications
Volume 29, 1996

Richman Games

ANDREW J. LAZARUS, DANIEL E. LOEB, JAMES G. PROPP, AND DANIEL ULLMAN

Dedicated to David Richman, 1956–1991

Abstract. A Richman game is a combinatorial game in which, rather than alternating moves, the two players bid for the privilege of making the next move. We find optimal strategies for both the case where a player knows how much money his or her opponent has and the case where the player does not.

1991 Mathematics Subject Classification: 90D05.
Key words and phrases: combinatorial game theory, impartial games.
Loeb was partially supported by URA CNRS 1304, EC grant CHRX-CT93-0400, the PRC Maths-Info, and NATO CRG 930554.

1. Introduction

There are two game theories. The first is now sometimes referred to as matrix game theory and is the subject of the famous von Neumann and Morgenstern treatise [1944]. In matrix games, two players make simultaneous moves, and a payment is made from one player to the other depending on the chosen moves. Optimal strategies often involve randomness and concealment of information.

The other game theory is the combinatorial theory of Winning Ways [Berlekamp et al. 1982], with origins in the work of Sprague [1935–36] and Grundy [1939], and largely expanded upon by Conway [1976]. In combinatorial games, two players move alternately. We may assume that each move consists of sliding a token from one vertex to another along an arc in a directed graph. A player who cannot move loses. There is no hidden information, and there exist deterministic optimal strategies.

In the late 1980s, David Richman suggested a class of games that share some aspects of both sorts of game theory. Here is the set-up: The game is played by two players (Mr. Blue and Ms. Red), each of whom has some money. There is an underlying combinatorial game in which a token rests on a vertex of some finite directed graph.
There are two special vertices, denoted b and r; Blue's goal is to bring the token to b, and Red's goal is to bring the token to r. The two players repeatedly bid for the right to make the next move. One way to execute this bidding process is for each player to write secretly on a card a nonnegative real number no larger than the number of dollars he or she has; the two cards are then revealed simultaneously. Whoever bids higher pays the amount of the bid to the opponent and moves the token from the vertex it currently occupies along an arc of the directed graph to a successor vertex. Should the two bids be equal, the tie is broken by a toss of a coin. The game ends when one player moves the token to one of the distinguished vertices. The sole objective of each player is to make the token reach the appropriate vertex: at the game's end, money loses all value. The game is a draw if neither distinguished vertex is ever reached.

Note that with these rules (compare with [Berlekamp 1996]), there is never a reason for a negative bid: since all successor vertices are available to both players, it cannot be preferable to have the opponent move next. That is to say, there is no reason to part with money for the chance that your opponent will carry out through negligence a move that you yourself could perform through astuteness.

A winning strategy is a policy for bidding and moving that guarantees a player the victory, given fixed initial data. (These initial data include where the token is, how much money the player has, and possibly how much money the player's opponent has.) In Section 2, we explain how to find winning strategies for Richman games.
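The rules in the preceding paragraphs amount to a simple referee loop, which can be sketched in Python. The dictionary encoding of the digraph and the strategy-callback interface below are our own illustrative assumptions, not anything prescribed by the paper:

```python
import random

def play_richman(succ, token, blue, red, blue_money, red_money, rounds=10_000):
    """Referee loop for the bidding rules described above.

    `succ` maps each vertex to its list of successors; 'b' and 'r' are the
    distinguished vertices.  `blue` and `red` are strategy callbacks (a
    hypothetical interface): given (token, own_money) they return
    (bid, move).  A draw is declared if neither target is reached.
    """
    for _ in range(rounds):
        if token in ('b', 'r'):
            break
        b_bid, b_move = blue(token, blue_money)
        r_bid, r_move = red(token, red_money)
        # A bid is nonnegative and may not exceed the bidder's stack.
        assert 0 <= b_bid <= blue_money and 0 <= r_bid <= red_money
        # Higher bidder pays the bid to the opponent and moves the token;
        # equal bids are settled by a coin toss.
        if b_bid > r_bid or (b_bid == r_bid and random.random() < 0.5):
            blue_money, red_money = blue_money - b_bid, red_money + b_bid
            token = b_move
        else:
            red_money, blue_money = red_money - r_bid, blue_money + r_bid
            token = r_move
        assert token in succ
    return {'b': 'blue', 'r': 'red'}.get(token, 'draw')

# One-move example: from v, either player can move straight to b or to r.
one_move = {'v': ['b', 'r'], 'b': [], 'r': []}
result = play_richman(one_move, 'v',
                      lambda t, m: (m, 'b'),   # Blue: bid everything, go to b
                      lambda t, m: (0, 'r'),   # Red: bid nothing
                      0.6, 0.4)
```

On this one-move graph, a Blue who bids his entire stack wins outright whenever he holds strictly more than half the money, in line with the critical-ratio results proved below.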
In particular, we prove the following facts, which might seem surprising:

• Given a starting vertex v, there exists a critical ratio R(v) such that Blue has a winning strategy if Blue's share of the money, expressed as a fraction of the total money supply, is greater than R(v), and Red has a winning strategy if Blue's share of the money is less than R(v). (This is not so surprising in the case of acyclic games, but for games in general, one might have supposed it possible that, for a whole range of initial conditions, play might go on forever.)

• There exists a strategy such that if a player has more than R(v) and applies the strategy, the player will win with probability 1, without needing to know how much money the opponent has.

In proving these assertions, it will emerge that a critical (and in many cases optimal) bid for Blue is R(v) − R(u) times the total money supply, where v is the current vertex and u is a successor of v for which R(u) is as small as possible. A player who cannot bid this amount has already lost, in the sense that there is no winning strategy for that player. On the other hand, a player who has a winning strategy of any kind and bids R(v) − R(u) will still have a winning strategy one move later, regardless of who wins the bid, as long as the player is careful to move to u if he or she does win the bid.

It follows that we may think of R(v) − R(u) as the "fair price" that Blue should be willing to pay for the privilege of trading the position v for the position u. Thus we may define 1 − R(v) as the Richman value of the position v, so that the fair price of a move exactly equals the difference in values of the two positions. However, it is more convenient to work with R(v) than with 1 − R(v). We call R(v) the Richman cost of the position v.
We will see that for all v other than the distinguished vertices b and r, R(v) is the average of R(u) and R(w), where u and w are successors of v in the digraph that minimize and maximize R(·), respectively. In the case where the digraph underlying the game is acyclic, this averaging property makes it easy to compute the Richman costs of all the positions, beginning with the positions b and r and working backwards. If the digraph contains cycles, it is not so easy to work out precise Richman costs. We defer most of our examples to another paper [Lazarus et al.], in which we also consider infinite digraphs and discuss the complexity of the computation of Richman costs.

2. The Richman Cost Function

Henceforth, D will denote a finite directed graph (V, E) with a distinguished blue vertex b and a distinguished red vertex r such that from every vertex there is a path to at least one of the distinguished vertices. For v ∈ V, let S(v) denote the set of successors of v in D, that is, S(v) = {w ∈ V : (v, w) ∈ E}. Given any function f : V → [0, 1], we define

f⁺(v) = max_{w ∈ S(v)} f(w)   and   f⁻(v) = min_{u ∈ S(v)} f(u).

The key to playing the Richman game on D is to attribute costs to the vertices of D such that the cost of every vertex (except r and b) is the average of the lowest and highest costs of its successors. Thus, a function R : V → [0, 1] is called a Richman cost function if R(b) = 0, R(r) = 1, and for every other v ∈ V we have

R(v) = (R⁺(v) + R⁻(v))/2.

(Note that Richman costs are a curious sort of variant on harmonic functions on Markov chains [Woess 1994], where instead of averaging over all the successor values, we average only over the two extreme values.) The relations R⁺(v) ≥ R(v) ≥ R⁻(v) and R⁺(v) + R⁻(v) = 2R(v) will be much used in what follows.

Theorem 2.1. The digraph D has a Richman cost function R(v).

Proof. We introduce a function R(v, t), whose game-theoretic significance will be made clearer in Theorem 2.2.
Let R(b, t) = 0 and R(r, t) = 1 for all t ∈ N. For v ∉ {b, r}, define R(v, 0) = 1 and

R(v, t) = (R⁺(v, t − 1) + R⁻(v, t − 1))/2   for t > 0.

It is easy to see that R(v, 1) ≤ R(v, 0) for all v, and a simple induction shows that R(v, t + 1) ≤ R(v, t) for all v and all t ≥ 0. Therefore R(v, t) is weakly decreasing and bounded below by zero as t → ∞, hence convergent. It is also evident that the function v ↦ lim_{t→∞} R(v, t) satisfies the definition of a Richman cost function. ∎

Alternate proof. Identify functions f : V(D) → [0, 1] with points in the |V(D)|-dimensional cube Q = [0, 1]^|V(D)|. Given f ∈ Q, define g ∈ Q by g(b) = 0, g(r) = 1, and, for every other v ∈ V, g(v) = (f⁺(v) + f⁻(v))/2. The map f ↦ g is clearly a continuous map from Q into Q, and so by the Brouwer fixed point theorem it has a fixed point. This fixed point is a Richman cost function. ∎

This Richman cost function does indeed govern the winning strategy, as we now prove.

Theorem 2.2. Suppose Blue and Red play the Richman game on the digraph D with the token initially located at vertex v. If Blue's share of the total money exceeds R(v) = lim_{t→∞} R(v, t), Blue has a winning strategy. Indeed, if his share of the money exceeds R(v, t), his victory requires at most t moves.

Proof. Without loss of generality, money may be scaled so that the total supply is one dollar. Whenever Blue has over R(v) dollars, he must have over R(v, t) dollars for some t. We prove the claim by induction on t. At t = 0, Blue has over R(v, 0) dollars only if v = b, in which case he has already won. Now assume the claim is true for t − 1, and let Blue have more than R(v, t) dollars. There exist neighbors u and w of v such that R(u, t − 1) = R⁻(v, t − 1) and R(w, t − 1) = R⁺(v, t − 1), so that R(v, t) = (R(w, t − 1) + R(u, t − 1))/2. Blue can bid (R(w, t − 1) − R(u, t − 1))/2 dollars.
If Blue wins the bid at v, then he moves to u and forces a win in at most t − 1 moves (by the induction hypothesis), since he has more than (R(w, t − 1) + R(u, t − 1))/2 − (R(w, t − 1) − R(u, t − 1))/2 = R(u, t − 1) dollars left. If Blue loses the bid, then Red will move to some z, but Blue now has over (R(w, t − 1) + R(u, t − 1))/2 + (R(w, t − 1) − R(u, t − 1))/2 = R(w, t − 1) ≥ R(z, t − 1) dollars, and again wins by the induction hypothesis. ∎

One can define another function R′(v, t) with R′(b, t) = 0 and R′(r, t) = 1 for all t ∈ N, R′(v, 0) = 0, and

R′(v, t) = (R′⁺(v, t − 1) + R′⁻(v, t − 1))/2   for v ∉ {b, r} and t > 0.

By an argument similar to the proof of Theorem 2.2, this also converges to a Richman cost function R′(v) ≤ R(v) (with R(v) defined as in the proof of Theorem 2.1). Thus, R′(v, t) indicates how much money Blue needs to prevent Red from forcing a win from v in t or fewer moves, so R′(v) indicates how much money Blue needs to prevent Red from forcing a win in any length of time.

For certain infinite digraphs, it can be shown [Lazarus et al.] that R′(v) is strictly less than R(v). When Blue's share of the money supply lies strictly between R′(v) and R(v), each player can prevent the other from winning. Thus, optimal play leads to a draw. Nevertheless, in this paper, we assume that D is finite, and we can conclude that there is a unique Richman cost function, with R′(v) = R(v).

Theorem 2.3. The Richman cost function of the digraph D is unique.

The proof of Theorem 2.3 requires the following definition and technical lemma. An edge (v, u) is said to be an edge of steepest descent if R(u) = R⁻(v). Let v̄ be the transitive closure of v under the steepest-descent relation. That is, w ∈ v̄ if there exists a path v = v₀, v₁, v₂, . . . , vₖ = w such that (vᵢ, vᵢ₊₁) is an edge of steepest descent for i = 0, 1, . . . , k − 1.

Lemma 2.4. Let R be any Richman cost function of the digraph D. If R(z) < 1, then z̄ contains b.

Proof.
Suppose R(z) < 1. Choose v ∈ z̄ such that R(v) = min_{u ∈ z̄} R(u). Such a v must exist because D (and hence z̄) is finite. If v = b, we're done. Otherwise, assume v ≠ b, and let u be any successor of v. The definition of v implies R⁻(v) = R(v), which forces R⁺(v) = R(v). Since R(u) lies between R⁻(v) and R⁺(v), R(u) = R(v) = R⁻(v). Hence (v, u) is an edge of steepest descent, so u ∈ z̄. Moreover, u satisfies the same defining property that v did (it minimizes R(·) in the set z̄), so the same argument shows that for any successor w of u, R(w) = R(u) and w ∈ z̄. Repeating this, we see that for any point w that may be reached from v, R(w) = R(v) and w ∈ z̄. On the other hand, R(r) is not equal to R(v) (since R(v) ≤ R(z) < 1 = R(r)), so r cannot be reached from v. Therefore b can be reached from v, so we must have b ∈ z̄. ∎

Proof of Theorem 2.3. Suppose that R₁ and R₂ are Richman cost functions of D. Choose v such that R₁ − R₂ is maximized at v; such a v exists since D is finite. Let M = R₁(v) − R₂(v). Choose u₁, w₁, u₂, w₂ (all successors of v) such that Rᵢ⁻(v) = Rᵢ(uᵢ) and Rᵢ⁺(v) = Rᵢ(wᵢ). Since R₁(u₁) ≤ R₁(u₂), we have

R₁(u₁) − R₂(u₂) ≤ R₁(u₂) − R₂(u₂) ≤ M.   (2.1)

(The latter inequality follows from the definition of M.) Similarly, R₂(w₂) ≥ R₂(w₁), so

R₁(w₁) − R₂(w₂) ≤ R₁(w₁) − R₂(w₁) ≤ M.   (2.2)

Adding (2.1) and (2.2), we have (R₁(u₁) + R₁(w₁)) − (R₂(u₂) + R₂(w₂)) ≤ 2M. The left side is 2R₁(v) − 2R₂(v) = 2M, so equality must hold in (2.1). In particular, R₁(u₂) − R₂(u₂) = M; i.e., u₂ satisfies the hypothesis on v. Since u₂ was any vertex with R₂(u₂) = R₂⁻(v), induction shows that R₁(u) − R₂(u) = M for all u ∈ v̄, where descent is measured with respect to R₂. Since R₁(b) − R₂(b) = 0 and b ∈ v̄, we have M = 0, so R₁ − R₂ ≤ 0 everywhere. That is, R₁ ≤ R₂. The same argument applied to R₂ − R₁ shows the opposite inequality, so R₁ = R₂. ∎
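Both monotone iterations, R(v, t) started at 1 and R′(v, t) started at 0, are easy to run numerically, and on a small cyclic digraph one can watch them squeeze onto the same limit, as Theorems 2.1 and 2.3 predict. A minimal sketch, using a digraph of our own (interior vertices x and y, not an example from the paper):

```python
def iterate_costs(succ, start, t=200):
    """Value iteration from the proof of Theorem 2.1: R(b,.) = 0 and
    R(r,.) = 1 always; every other vertex starts at `start` (1 gives the
    decreasing sequence R(v,t), 0 the increasing sequence R'(v,t)), and
    each step replaces the value by the average of the extreme
    successor values."""
    R = {v: 0.0 if v == 'b' else 1.0 if v == 'r' else start for v in succ}
    for _ in range(t):
        R = {v: R[v] if v in ('b', 'r')
             else 0.5 * (min(R[w] for w in succ[v]) + max(R[w] for w in succ[v]))
             for v in succ}
    return R

# A small cyclic digraph: x may move to b or to y; y may move to r or
# back to x.
succ = {'b': [], 'r': [], 'x': ['b', 'y'], 'y': ['r', 'x']}
upper = iterate_costs(succ, start=1.0)  # limit of R(v, t)
lower = iterate_costs(succ, start=0.0)  # limit of R'(v, t)
```

Both limits agree, with R(x) = 1/3 and R(y) = 2/3, which indeed satisfy the averaging property: 2·(1/3) = 0 + 2/3 and 2·(2/3) = 1/3 + 1.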
The uniqueness of the Richman cost function implies in particular that the function R′ defined after the proof of Theorem 2.2 coincides with the function R constructed in the first proof of Theorem 2.1. From this we deduce the following:

Corollary 2.5. Suppose Blue and Red play the Richman game on the digraph D with the token initially located at vertex v. If Blue's share of the total money supply is less than R(v) = lim_{t→∞} R(v, t), then Red has a winning strategy. ∎

It is also possible to reverse the order of proof, and to derive Theorem 2.3 from Corollary 2.5. For, if there were two Richman cost functions R₁ and R₂, with R₁(v) < R₂(v), say, then by taking a situation in which Blue's share of the money was strictly between R₁(v) and R₂(v), we would find that both Blue and Red had winning strategies, which is clearly absurd.

Theorem 2.2 and Corollary 2.5 do not cover the critical case where Blue has exactly R(v) dollars. In this case, with both players using optimal strategy, the outcome of the game depends on the outcomes of the coin tosses used to resolve tied bids. Note, however, that in all other cases, the deterministic strategy outlined in the proof of Theorem 2.2 works even if the player with the winning strategy concedes all ties and reveals his intended bid and intended move before the bidding.

Summarizing Theorem 2.2 and Corollary 2.5, we may say that if Blue's share of the total money supply is less than R(v), Red has a winning strategy, and if it is greater, Blue has a winning strategy (see [Lazarus et al.] for a fuller discussion).

3. Other Interpretations

Suppose the right to move the token is decided on each turn by the toss of a fair coin. Then induction on t shows that the probability that Red can win from the position v in at most t moves is equal to R(v, t), as defined in the previous section. Taking t to infinity, we see that R(v) is equal to the probability that Red can force a win against optimal play by Blue.
That is to say, if both players play optimally, R(v) is the chance that Red will win. The uniqueness of the Richman cost function tells us that 1 − R(v) must be the chance that Blue will win. The probability of a draw is therefore zero.

If we further stipulate that the moves themselves must be random, in the sense that the player whose turn it is to move must choose uniformly at random from the finitely many legal options, we do not really have a game-like situation anymore; rather, we are performing a random walk on a directed graph with two absorbing vertices, and we are trying to determine the probabilities of absorption at these two vertices. In this case, the relevant probability function is just the harmonic function on the digraph D (or, more properly speaking, the harmonic function for the associated Markov chain [Woess 1994]).

Another interpretation of the Richman cost, brought to our attention by Noam Elkies, comes from a problem about making bets. Suppose you wish to bet (at even odds) that a certain baseball team will win the World Series, but that your bookie only lets you make even-odds bets on the outcomes of individual games. Here we assume that the winner of a World Series is the first of two teams to win four games. To analyze this problem, we create a directed graph whose vertices correspond to the different possible combinations of cumulative scores in a World Series, with two special terminal vertices (blue and red) corresponding to victory for the two respective teams. Assume that your initial amount of money is $500, and that you want to end up with either $0 or $1000, according to whether the blue team or the red team wins the Series.
Then it is easy to see that the Richman cost at a vertex tells exactly how much money you want to have left if the corresponding state of affairs transpires, and that the amount you should bet on any particular game is $1000 times the common value of R(v) − R(u) and R(w) − R(v), where v is the current position, u is the successor position in which Blue wins the next game, and w is the successor position in which Red wins the next game.

4. Incomplete Knowledge

Surprisingly, it is often possible to implement a winning strategy without knowing how much money one's opponent has. Define Blue's safety ratio at v as the fraction of the total money that he has in his possession, divided by R(v) (the fraction that he needs in order to win). Note that Blue will not know the value of his safety ratio, since we are assuming that he has no idea how much money Red has.

Theorem 4.1. Suppose Blue has a safety ratio strictly greater than 1. Then Blue has a strategy that wins with probability 1 and does not require knowledge of Red's money supply. If, moreover, the digraph D is acyclic, his strategy wins regardless of tiebreaks; that is, "with probability 1" can be replaced by "definitely".

Proof. Here is Blue's strategy: When the token is at vertex v, and he has B dollars, he should act as if his safety ratio is 1; that is, he should play as if Red has R_crit dollars, where B/(B + R_crit) = R(v), so that the total amount of money is B + R_crit = B/R(v) dollars. He should accordingly bid

X = ((R(v) − R⁻(v))/R(v)) B

dollars. Suppose Blue wins (by outbidding or by tiebreak) and moves to u along an edge of steepest descent. Then Blue's safety ratio changes from

(B/(B + R)) / R(v)   to   ((B − X)/(B + R)) / R(u),

where R is the actual amount of money that Red has. However, these two safety ratios are actually equal, since

(B − X)/B = 1 − X/B = 1 − (R(v) − R(u))/R(v) = R(u)/R(v).

Now suppose instead that Red wins the bid (by outbidding or by tiebreak) and moves to z.
Then Blue's safety ratio changes from

(B/(B + R)) / R(v)   to   ((B + Y)/(B + R)) / R(z),

with Y ≥ X. Note that the new safety ratio is greater than or equal to

((B + X)/(B + R)) / R(w),

where R(w) = R⁺(v). But this lower bound on the new safety ratio is equal to the old safety ratio, since

(B + X)/B = 1 + X/B = 1 + (R(w) − R(v))/R(v) = R(w)/R(v).

In either case, the safety ratio is nondecreasing, and in particular must stay greater than 1. On the other hand, if Blue were to eventually lose the game, his safety ratio at that moment would have to be at most 1, since his fraction of the total money supply cannot be greater than R(r) = 1. Consequently, our assumption that Blue's safety ratio started out being greater than 1 implies that Blue can never lose. In an acyclic digraph, infinite play is impossible, so the game must terminate at b with a victory for Blue.

In the case where cycles are possible, suppose first that at some stage Red outbids Blue by εB > 0 and gets to make the next move, say from v to w. If Blue was in a favorable situation at v, the total amount of money that the two players have between them must be less than B/R(v). On the other hand, after the payoff by Red, Blue has

B + X + εB = (1 + (R(v) − R(u))/R(v) + ε)B = ((2R(v) − R(u))/R(v) + ε)B = (R⁺(v)/R(v) + ε)B ≥ (R(w)/R(v) + ε)B,

where u is again a steepest-descent successor of v, so that Blue's total share of the money must be more than R(w) + εR(v). Blue can do this calculation as well as we can; he then knows that if he had been in a winning position to begin with, his current share of the total money must exceed R(w) + εR(v). Now, R(w) + εR(v) is greater than R(w, t) for some t, so Blue can win in t moves. Thus, Red loses to Blue's strategy if she ever bids more than he does. Hence, if she hopes to avoid losing, she must rely entirely on tiebreaking. Let N be the length of the longest path of steepest descent in the directed graph D. Then Blue will win the game when he wins N consecutive tiebreaks (if not earlier). ∎
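For a concrete check of the proof just given, the money-blind strategy can be simulated on a small acyclic digraph: Blue bids ((R(v) − R⁻(v))/R(v))B, moves along steepest descent when he wins, and never consults Red's stack. Everything below (the digraph, the cost table, and the aggressively all-in Red) is our own illustration:

```python
def play_blind_blue(succ, R, v, B, red_money, red_strategy):
    """Blue's money-blind strategy from the proof of Theorem 4.1: at v,
    bid B*(R(v) - R^-(v))/R(v) and, on winning the bid, move along an
    edge of steepest descent.  Blue never reads red_money.  Ties go to
    Red, which the acyclic case of the theorem says Blue can afford."""
    while v not in ('b', 'r'):
        u = min(succ[v], key=lambda w: R[w])       # steepest descent
        blue_bid = B * (R[v] - R[u]) / R[v]
        red_bid, red_move = red_strategy(v, red_money)
        if blue_bid > red_bid:                     # Blue pays Red and moves
            B, red_money, v = B - blue_bid, red_money + blue_bid, u
        else:                                      # Red pays Blue and moves
            red_money, B, v = red_money - red_bid, B + red_bid, red_move
    return v

# Acyclic example: the costs satisfy the averaging property,
# R(u) = (0 + 1)/2 and R(v) = (1/2 + 1)/2.
dag = {'b': [], 'r': [], 'u': ['b', 'r'], 'v': ['u', 'r']}
cost = {'b': 0.0, 'r': 1.0, 'u': 0.5, 'v': 0.75}
# Red bids her whole stack every round and pushes toward r (not claimed
# to be her optimal play, just an aggressive opponent).
outcome = play_blind_blue(dag, cost, 'v', 0.8, 0.2, lambda w, m: (m, 'r'))
```

With total money 1 and R(v) = 3/4, a Blue holding 0.8 (safety ratio above 1) wins even though every tie goes to Red, while a Blue holding 0.7 loses to this same opponent.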
When D has cycles, Blue may need to rely on tiebreaks in order to win, as in the case of the Richman game played on the digraph pictured in Figure 1.

[Figure 1. The digraph D and its Richman costs: 0 at b, 1 at r, and 1/2 at v and at the other interior vertices. The drawing itself is not reproduced here.]

Suppose that the token is at vertex v, that Blue has B dollars, and that Red has R dollars. Clearly, Blue knows he can win the game if B > R. But without knowing R, it would be imprudent for him to bid any positive amount εB for fear that Red actually started with (1 − ε)B dollars; for if that were the case, and his bid were to prevail, the token would move to a vertex where the Richman cost is 1/2 and Blue would have less money than Red. Such a situation will lead to a win for Red with probability 1 if she follows the strategy outlined in Theorem 2.2.

5. Rationality

For every vertex v of the digraph D (other than b and r), let v⁺ and v⁻ denote successors of v for which R(v⁺) = R⁺(v) and R(v⁻) = R⁻(v). Then we have R(b) = 0, R(r) = 1, and 2R(v) = R(v⁺) + R(v⁻) for v ≠ b, r. We can view this as a linear program. By Theorem 2.3, this system must have a unique solution. Since all coefficients are rational, we see that Richman costs are always rational numbers.

The linear programming approach also gives us a conceptually simple (though computationally dreadful) way to calculate Richman costs. If we augment our program by adding additional conditions of the form R(v⁻) ≤ R(w) and R(v⁺) ≥ R(w), where v ranges over the vertices of D other than b and r and where, for each v, w ranges over all the successors of v, then we are effectively adding in the constraint that the edges from v to v⁻ and v⁺ are indeed edges of steepest descent and ascent, respectively.
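One way to make the augmented program concrete is to guess, for each interior vertex, which successors play the roles of v⁻ and v⁺, solve the linear system 2R(v) = R(v⁻) + R(v⁺) exactly over the rationals, and accept a solution only if the guessed successors really are extreme. The sketch below, entirely our own illustration, does this by brute force and exhibits the promised rationality:

```python
from fractions import Fraction
from itertools import product

def solve(A, rhs):
    """Tiny exact Gaussian elimination over the rationals."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            raise ValueError('singular system')      # bad guess: skip it
        M[c], M[p] = M[p], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                M[r] = [a - M[r][c] * b for a, b in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]

def exact_richman_costs(succ):
    """Guess (v-, v+) for every interior vertex, solve the rational
    linear system 2R(v) = R(v-) + R(v+) with R(b) = 0 and R(r) = 1, and
    keep a solution whose guessed successors really are the extremes.
    Exponential in the number of vertices, as the text warns."""
    inner = [v for v in succ if v not in ('b', 'r')]
    idx = {v: i for i, v in enumerate(inner)}
    def val(R, w):
        return Fraction(0) if w == 'b' else Fraction(1) if w == 'r' else R[idx[w]]
    for guess in product(*(list(product(succ[v], repeat=2)) for v in inner)):
        A = [[Fraction(0)] * len(inner) for _ in inner]
        rhs = [Fraction(0) for _ in inner]
        for i, v in enumerate(inner):
            A[i][i] += 2                 # 2 R(v) ...
            for w in guess[i]:           # ... minus R(v-) and R(v+)
                if w == 'r':
                    rhs[i] += 1
                elif w != 'b':
                    A[i][idx[w]] -= 1
        try:
            R = solve(A, rhs)
        except ValueError:
            continue
        if all(val(R, guess[i][0]) == min(val(R, w) for w in succ[v]) and
               val(R, guess[i][1]) == max(val(R, w) for w in succ[v])
               for i, v in enumerate(inner)) and all(0 <= x <= 1 for x in R):
            return {v: R[idx[v]] for v in inner}

# Two interior vertices on a cycle: x -> {b, y}, y -> {r, x}.
costs = exact_richman_costs({'b': [], 'r': [], 'x': ['b', 'y'], 'y': ['r', 'x']})
```

On this two-vertex cycle the unique verified solution is R(x) = 1/3 and R(y) = 2/3, exact rational numbers as Section 5 guarantees.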
The uniqueness of Richman costs tells us that if we let the mappings v ↦ v⁻ and v ↦ v⁺ range over all possibilities (subject to the constraint that both v⁻ and v⁺ must be successors of v), the resulting linear programs (which typically will have no solutions at all) will have only solutions that correspond to the genuine Richman cost functions. Hence, in theory one could try all the finitely many possibilities for v ↦ v⁻ and v ↦ v⁺ and solve the associated linear programs until one found one with a solution. However, the amount of time such an approach would take increases exponentially with the size of the directed graph. In [Lazarus et al.], we discuss other approaches.

We will also discuss in [Lazarus et al.], among other things, a variant of Richman games, which we call "Poorman" games, in which the winning bid is paid to a third party (auctioneer or bank) rather than to one's opponent. The whole theory carries through largely unchanged, except that Poorman costs are typically irrational.

References

[Berlekamp 1996] E. R. Berlekamp, "An economist's view of combinatorial games", pp. 365–405 in this volume.

[Berlekamp et al. 1982] E. R. Berlekamp, J. H. Conway, and R. K. Guy, Winning Ways for Your Mathematical Plays, Academic Press, London, 1982.

[Conway 1976] J. H. Conway, On Numbers and Games, Academic Press, London, 1976.

[Grundy 1939] P. M. Grundy, "Mathematics and games", Eureka 2 (1939), 6–8. Reprinted in Eureka 27 (1964), 9–11.

[Lazarus et al.] A. J. Lazarus, D. E. Loeb, J. G. Propp, and D. Ullman, "Combinatorial games under auction play", submitted to Games and Economic Behavior.

[von Neumann and Morgenstern 1944] J. von Neumann and O. Morgenstern, Theory of Games and Economic Behavior, Wiley, New York, 1944.

[Sprague 1935–36] R. Sprague, "Über mathematische Kampfspiele", Tôhoku Math. J. 41 (1935–36), 438–444.

[Woess 1994] W. Woess, "Random walks on infinite graphs and groups: a survey on selected topics", Bull. London Math. Soc.
26 (1994), 1–60.

Andrew J. Lazarus
2745 Elmwood Avenue
Berkeley, CA 94705
[email protected]

Daniel E. Loeb
LaBRI
Université de Bordeaux I
33405 Talence Cedex, France
[email protected]
http://www.labri.u-bordeaux.fr/~loeb

James G. Propp
Department of Mathematics
Massachusetts Institute of Technology
Cambridge, MA 02139-4307
[email protected]
http://www-math.mit.edu/~propp

Daniel Ullman
Department of Mathematics
The George Washington University
Funger Hall 428V
2201 G Street, NW
Washington, DC 20052-0001
[email protected]
http://gwis2.circ.gwu.edu/~dullman