Let the games begin and go on

2018, International Journal of Business and Systems Research

Int. J. Business and Systems Research, Vol. 12, No. 1, 2018

Maria Cristina P. Matos
Instituto Politécnico de Viseu (IPV), ESTV, CI&DETS, Viseu, Portugal

Manuel Alberto M. Ferreira and José António Filipe*
Instituto Universitário de Lisboa, BRU-IUL, ISTAR-IUL, Av. das Forças Armadas, 1649-026, Lisboa, Portugal
*Corresponding author

Abstract: Real life is a bigger game in which what a player does early on can affect what others choose to do later on. In particular, we can strive to explain how cooperative behaviour can be established as a result of rational behaviour. When engaged in a repeated situation, players must consider not only their short-term gains but also their long-term payoffs. The general idea of repeated games is that players may be able to deter another player from exploiting his short-term advantage by threatening a punishment that reduces his long-term payoff. The aim of this paper is to present and discuss dynamic game theory. There are three basic reasons, which are not mutually exclusive, to study what happens in repeated games. First, it provides a pleasant and very interesting theory, and it has the advantage of making us more humble in our predictions. Second, many of the most interesting economic interactions are repeated often and can incorporate phenomena which we believe are important but which are not captured when we restrict our attention to static games. Finally, economics, and equilibrium-based theories more generally, do best when analysing routinised interactions.

Keywords: dynamic games; code form game; repeated game.

Reference to this paper should be made as follows: Matos, M.C.P., Ferreira, M.A.M. and Filipe, J.A. (2018) ‘Let the games begin and go on’, Int. J. Business and Systems Research, Vol. 12, No. 1, pp.43–52.

Copyright © 2018 Inderscience Enterprises Ltd.

Biographical notes: Maria Cristina P. Matos graduated in Pure Mathematics at the University of Coimbra. She holds a Master in Management and a PhD in Quantitative Methods from ISCTE-Lisbon University Institute. She is an Adjunct Professor at IPV – Viseu Polytechnic Institute. Her research interests are mathematics, game theory, and applications to management, economics, marketing and finance. She has published more than 40 papers in scientific journals and conference proceedings, and one book chapter. She has presented 16 communications at international scientific conferences.

Manuel Alberto M. Ferreira is an Electrotechnic Engineer and received his Master in Applied Mathematics at the Lisbon Technical University, and his PhD in Management-Quantitative Methods and Habilitation in Quantitative Methods at the ISCTE-Lisbon University Institute. He is the former Chairman of the Board of Directors and Vice-President of ISCTE-Lisbon University Institute. He is a Full Professor at the ISCTE-Lisbon University Institute. He is the Director in the Department of Mathematics at the ISTA-School of Technology and Architecture. His research interests are mathematics, statistics, stochastic processes (queues and applied probabilities), game theory, Bayesian statistics (application to forensic identification), and applications to economics, management, business, marketing, finance and social problems in general. He has published more than 370 papers in scientific journals and conference proceedings, and 35 book chapters. He has presented 163 communications at international scientific conferences.

José António Filipe is an Assistant Professor at the Instituto Universitário de Lisboa (ISCTE-IUL). He has his Habilitation in Quantitative Methods, and received his PhD in Quantitative Methods, Master in Management and Graduation in Economics.
His current research interests include, among others, mathematics and statistics, multi-criteria decision-making methods, chaos theory, game theory, stochastic processes (queues and applied probabilities), Bayesian statistics (application to forensic identification), and applications to economics, management, business, marketing, finance and social problems in general. He has published more than 200 papers in scientific journals and conference proceedings, and 18 book chapters. He has presented about 100 communications at international scientific conferences and other scientific events.

1 Introduction

In a contentious atmosphere, players, who are not always human or aware of what they are doing, compete to achieve their objectives. In a game, the parties’ interests are opposed, so each player develops strategies to maximise gains or minimise losses; that is, each player looks for a strategy that reaches a certain objective against other players who are also trying to optimise their own position. The final outcome depends on the set of strategies adopted by all participants. In short, a game is any situation governed by rules, with well-defined outcomes, that is characterised by strategic interdependence.

Game theory is a discipline that yields a wide range of results for classifying, formalising and solving distinct interaction situations. For this reason, it is often used to study competition in oligopolistic markets, where competition is characterised by strategic interdependence. Starting from precisely this assumption, we present and analyse the ‘Prisoner’s dilemma’ game to show how game theory can be used to model, and establish results for, situations that arise in economic theory.
The ‘Prisoner’s dilemma’ has attracted the attention of researchers because it depicts a contradictory situation incisively: in seeking the best for itself, each economic agent produces an outcome that is not optimal from the point of view of the agents as a whole. In other words, this game demonstrates that, under certain circumstances, the pursuit of personal interest leads economic agents to outcomes that are inefficient in the Pareto sense. Thus, a concerted action may lead all agents to more favourable outcomes.

The two companies in a cartel face an analogous situation. By cooperating, each can obtain half of the monopoly profit, but if both have to decide independently on quantities or prices, each will find it less favourable to cooperate, regardless of the competitor’s decision. So, acting in their own self-interest, both companies compete and obtain lower profits than they would through cooperation. In the context of a duopoly, this idea holds for competitive variables in general, namely production quantities, prices, advertising expenditure, and research and development.

Nevertheless, in a repeated interaction, the expectation of future encounters makes cooperation more attractive. The end of the interaction should not be known beforehand, i.e., there should always be some probability of a further play. Company heads are often heard to say: ‘things change’. Therein lies the challenge.

If the business relationship between the economic agents occurs only once, knowledge of what the other agent does is irrelevant: whatever his strategy, the best response is not to cooperate. If the relationship is to last, it is an entirely different matter. Each agent’s decision depends on the durability of the interaction and on the decisions taken at earlier stages. It cannot, however, be conditioned on future actions, and no company knows its competitor’s line of reasoning.
That is, in a game whose one-shot outcome is not cooperative, cooperation may be a subgame-perfect Nash equilibrium of the infinitely repeated game. This happens in the ‘Prisoner’s dilemma’. Thus, we can expect companies to cooperate in a duopoly with an infinite horizon but, theoretically, not with a finite one. Even though a non-cooperative game repeated any finite number of times is still a non-cooperative game, in practice companies often try to find ways to cooperate.

2 ‘Prisoner’s dilemma’

Let us now analyse the ‘Prisoner’s dilemma’ game. This example of a non-cooperative game highlights the rationality required when two individuals are in a position where the decision of each depends on the decision of the other.

2.1 Symmetrical game with two players

Two individuals, player 1 and player 2, who are suspected of a crime, are arrested, see Straffin (1980). The problem for the police is that, although both are assumed to be involved, there is no proof, so a confession is required. The suspects are held in separate, distant cells with no communication between them, and each is told the rules:

• If neither confesses, both will be charged with a lesser offence, which carries a symbolic sentence of only one month in jail.
• If both confess, thereby admitting participation in the crime, both will be sentenced to four years in jail.
• Finally, if one confesses and the other does not, the one who confesses will be released immediately, and the other will receive the maximum penalty under the law: five years in jail.

The strategies in this case are ‘confess’ or ‘not confess’. The payoffs are the sentences. Figure 1 shows the game in code form, see Matos and Ferreira (2006) and Matos (2009).

Figure 1 ‘Prisoner’s dilemma’ code form

Consider player 1. Confessing (C) is better if player 2 stays quiet (PC), as player 1 will be freed immediately for confessing.
Confessing is also better if player 2 confesses, because prison is then inevitable and player 1 obtains a shorter sentence for having confessed. We may conclude, after considering each of player 2’s choices, that confessing is strictly better for player 1. Consequently, confessing is a dominant strategy for him. The same analysis for player 2 shows that confessing is also a dominant strategy for him. Since each suspect has a dominant strategy, confessing, the only outcome of the game for rational suspects is (C, C). Note that this analysis only requires that each player is rational and knows the deal being offered to him by the authorities, i.e., his own possible payoffs. A suspect does not need to know what the other suspect has been promised, nor whether the other suspect is rational.

2.2 ‘Prisoner’s dilemma’ in two stages

In a repeated game, players have the possibility of establishing a reputation for cooperating, which may lead the other player to act accordingly. The whole strategy depends on whether the game is repeated a finite or an infinite number of times. The process of strategic interaction builds a history between the players, and each player assesses this history of behaviour to decide how to proceed in the game. Meanwhile, even though a player knows the decisions taken in previous stages, he may have to decide without knowing the other players’ choices in the current stage.

Repeated games illustrate how cooperation can be induced even when players would gain significantly at each stage by not cooperating. The condition under which repetition leads to a valuable relationship between players is credibility. This can be seen by analysing the ‘Prisoner’s dilemma’ game repeated in two stages, as presented in code form in Figure 2.
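Before turning to the two-stage game, the one-shot analysis of Section 2.1 can be checked mechanically. The following Python sketch is only an illustration, not part of the code form representation: it encodes the sentences from the rules in months, treats the payoff as the negative of the jail time (an assumption made so that higher is better), and uses hypothetical function names to test for strictly dominant actions and pure-strategy Nash equilibria.

```python
from itertools import product

ACTIONS = ("C", "PC")  # C = confess, PC = not confess (stay quiet)

# Jail time in months for (player 1, player 2), taken from the rules of Section 2.1.
SENTENCE = {
    ("PC", "PC"): (1, 1),    # neither confesses: one month each
    ("C", "C"): (48, 48),    # both confess: four years each
    ("C", "PC"): (0, 60),    # only player 1 confesses: released / five years
    ("PC", "C"): (60, 0),    # only player 2 confesses
}


def payoff(profile, player):
    """Assumed payoff: negative jail time, so that a higher payoff is better."""
    return -SENTENCE[profile][player]


def with_action(profile, player, action):
    """Return the profile with `player`'s action replaced by `action`."""
    changed = list(profile)
    changed[player] = action
    return tuple(changed)


def strictly_dominant(player):
    """Actions that beat every alternative whatever the other player does."""
    dominant = []
    for a in ACTIONS:
        alternatives = [b for b in ACTIONS if b != a]
        if all(
            payoff(with_action(p, player, a), player)
            > payoff(with_action(p, player, b), player)
            for p in product(ACTIONS, repeat=2)
            for b in alternatives
        ):
            dominant.append(a)
    return dominant


def nash_equilibria():
    """Profiles from which no player gains by a unilateral change of action."""
    equilibria = []
    for p in product(ACTIONS, repeat=2):
        if all(
            payoff(p, i) >= payoff(with_action(p, i, a), i)
            for i in (0, 1)
            for a in ACTIONS
        ):
            equilibria.append(p)
    return equilibria


print(strictly_dominant(0), strictly_dominant(1), nash_equilibria())
```

Under these assumptions the check reports ‘confess’ as the only strictly dominant action for each player and (C, C) as the only pure-strategy equilibrium, in line with the argument of Section 2.1.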
Figure 2 ‘Prisoner’s dilemma’ in code form repeated in two stages

Suppose that both players decide simultaneously on two occasions, observing the result of the first decision before deciding the second time. Suppose also that the payoff of the complete game is the sum of the payoffs of each stage (meaning there is no discounting). In the previous section, we saw that when the ‘Prisoner’s dilemma’ game is played once there is only one equilibrium, in which each player confesses.

Using backward induction, we can easily see that the only subgame-perfect equilibrium of the two-stage ‘Prisoner’s dilemma’ consists of both prisoners confessing in every sub-game. That is, in the two-stage ‘Prisoner’s dilemma’ the equilibrium of the second stage is independent of the result of the first stage. Each of the four second-stage sub-games has only one equilibrium and, since a subgame-perfect equilibrium must induce an equilibrium in every sub-game, each player confesses in each of them. This leads us back to the first stage of the game, where there is once again only one equilibrium, (C, C).

Each player chooses his strategy considering the consequences that this choice will have on the rest of the game. In the game above, there is no incentive to cooperate in the second stage, as no future relationship will follow. Cooperation in the first stage will not lead to cooperation in the second stage. Lack of credibility keeps the prisoners from achieving a better result than the stage equilibrium. One prisoner’s promise not to confess in the second period, which is what would be needed to obtain higher payoffs, does not pass the credibility filter.

• Is a cooperative solution possible in a non-cooperative game?
• Are there ways to implement cooperation in lasting relationships?

Let us answer these questions in the following section.

2.3 The ‘prisoner’s dilemma’ in infinite stages

A finite game seems a rather unrealistic way to represent lasting strategic interactions; the study of infinitely repeated games is more realistic. Just as in games repeated a finite number of times, the central issue in infinitely repeated games is that credible threats and promises about future behaviour may influence present behaviour.

Let us consider the ‘Prisoner’s dilemma’ repeated infinitely, see von Neumann and Morgenstern (1967) and Jorgensen et al. (2007). Suppose that, for each stage t, the results of the previous t – 1 plays of the game are observed before stage t begins. Let us begin by redefining payoffs for this type of game.

Assessing payoffs in games repeated an infinite number of times presents some difficulties. As we know, a euro in one hundred years will not be worth what it is worth today. To deal with the infinite time horizon, future payoffs are discounted with respect to the present. Thus, for a fixed discount factor $\delta \in (0, 1)$, the present value of the infinite sequence of stage payoffs $\pi_1, \pi_2, \pi_3, \ldots$ is

$$\pi_1 + \delta \pi_2 + \delta^2 \pi_3 + \cdots = \sum_{t \geq 1} \delta^{t-1} \pi_t .$$

In our example, suppose each player’s discount factor is δ, and that each player’s payoff in the repeated game is the present value of his stage payoffs. In the one-shot ‘Prisoner’s dilemma’ each player chooses his dominant strategy, which is ‘to confess’. Even when the game is repeated finitely, because the stage game has a single Nash equilibrium, the only subgame-perfect equilibrium is for both players to confess in every period. Nevertheless, when players are patient enough, cooperation, that is, keeping quiet in every stage, can be sustained as a subgame-perfect equilibrium of the infinitely repeated game.
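As a numerical illustration of the discounting rule introduced in this section, the short Python sketch below computes the present value of a finite list of stage payoffs and compares it with the closed form π/(1 − δ) for a constant payoff π. The function names and the sample values are assumptions chosen for the example; a long finite list is used to approximate the infinite sum.

```python
def present_value(payoffs, delta):
    """Discounted sum pi_1 + delta*pi_2 + delta^2*pi_3 + ... of a finite payoff list."""
    if not (1 > delta >= 0):
        raise ValueError("delta must lie in [0, 1)")
    # enumerate starts at 0, so the first payoff is undiscounted (delta ** 0 == 1).
    return sum(pi * delta ** t for t, pi in enumerate(payoffs))


def constant_stream_value(pi, delta):
    """Closed form pi / (1 - delta) for the same payoff pi received at every stage."""
    if not (1 > delta >= 0):
        raise ValueError("delta must lie in [0, 1)")
    return pi / (1 - delta)


# Three stages with payoff 1 and delta = 0.5: 1 + 0.5 + 0.25.
print(present_value([1, 1, 1], 0.5))

# A long constant stream approaches the closed-form value 1 / (1 - 0.9).
print(present_value([1] * 2000, 0.9), constant_stream_value(1, 0.9))
```

The closer δ is to 1, the more weight future stages carry, which is the sense in which a player with a high δ is ‘patient’.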
We begin by showing that cooperation is a Nash equilibrium of the repeated game, and then that it is a subgame-perfect equilibrium.

When an infinitely repeated game is played, each player i has a strategy for the repeated game, $s_i$, which is a sequence of history-dependent stage-game strategies $s_i^t$, that is, $s_i = (s_i^0, s_i^1, \ldots)$. The n-tuple of individual strategies of the repeated game is the strategy profile of the repeated game, $s = (s_1, \ldots, s_n)$.

The repeated-game strategies that are sufficient to lead to cooperation have the following form: player i begins the infinitely repeated game by cooperating and keeps cooperating at each subsequent stage only if both players have cooperated in every previous period. In this ‘trigger strategy’, player i cooperates until the other player stops cooperating, which triggers non-cooperation in all future moves. If both players adopt this strategy, the outcome of every stage of the infinitely repeated game will be (PC, PC).

More precisely, player i’s strategy in the repeated game, $s_i = (s_i^0, s_i^1, \ldots)$, is the sequence of history-dependent stage-game strategies such that, at period t and after history $h^t$,

$$s_i^t(h^t) = \begin{cases} PC, & t = 0 \ \vee \ h^t = ((PC, PC)^t) \\ C, & \text{otherwise.} \end{cases}$$

Note that the history $h^t$ is the sequence of action profiles played in periods 0, 1, 2, …, t – 1, and $((PC, PC)^t)$ is only a way of writing ((PC, PC), (PC, PC), …, (PC, PC)), where (PC, PC) is repeated t times. We can simplify the previous definition for later use: $h^0$ is the null history, and we adopt the convention that in $h^0$ the profile (PC, PC) is played 0 times. Thus, t = 0 implies $h^t = ((PC, PC)^t)$. Therefore

$$s_i^t(h^t) = \begin{cases} PC, & h^t = ((PC, PC)^t) \\ C, & \text{otherwise.} \end{cases}$$

Let us prove, then, that for sufficiently patient players the strategy profile $s = (s_1, s_2)$ is a Nash equilibrium of the repeated game. Both players stay quiet at t = 0, so at t = 1 the history is $h^1 = ((PC, PC))$ and both keep quiet. Consequently, at t = 2 the history is $h^2 = ((PC, PC), (PC, PC))$, so both stay quiet, and so on. The path associated with s is therefore the infinite sequence of cooperative action profiles ((PC, PC), (PC, PC), …).

From here on, stage payoffs are measured as utilities, for which higher is better, rather than as jail terms: 4 for each player when both stay quiet, 5 for a player who confesses while the other stays quiet, and 1 for each player when both confess. The payoff of the repeated game along the cooperative path is then straightforward: each player receives 4 at every stage.

Can player j gain by deviating from this repeated-game strategy, given that player i follows it faithfully? Let t be the period in which player j deviates for the first time. Given that player i will confess forever after period t, the best response for player j is to confess in every period once the outcome of a stage has differed from (PC, PC). If player j confesses in period t, he receives a payoff of 5 in that period. This deviation triggers player i’s non-cooperation in every future move, so player j’s payoff at each future stage is 1. Because

$$1 \times \delta + 1 \times \delta^2 + \cdots = \frac{\delta}{1 - \delta},$$

the present value of this payoff sequence is

$$5 + 1 \times \delta + 1 \times \delta^2 + \cdots = 5 + \frac{\delta}{1 - \delta}.$$

Alternatively, staying quiet provides a payoff of 4 at this stage and leads to exactly the same choice between confessing and staying quiet at the following stage. Let V be the present value of the infinite sequence of payoffs player j receives when he makes this choice optimally. If staying quiet is optimal, then $V = 4 + \delta V$, since staying quiet leads to the same decision in the following period, and therefore

$$V = \frac{4}{1 - \delta}.$$

If confessing is optimal, then, as we have seen before,

$$V = 5 + \frac{\delta}{1 - \delta}.$$

Thus, staying quiet is optimal if and only if

$$\frac{4}{1 - \delta} \geq 5 + \frac{\delta}{1 - \delta} \iff 4 \geq 5 - 4\delta \iff \delta \geq \frac{1}{4},$$

where the middle step multiplies both sides by $1 - \delta$, which is positive.
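The comparison above can be checked numerically. The following Python sketch is an illustration rather than part of the original analysis: it simulates the trigger strategy against itself and against a hypothetical player who deviates once, using the stage payoffs 4, 5 and 1 that appear in the derivation. The payoff 0 for a player who stays quiet while the other confesses, the function names and the truncation of the infinite horizon at 500 periods are all assumptions made for the example.

```python
# Stage payoffs as utilities (higher is better), as used in the derivation.
# The value 0 for staying quiet while the rival confesses is an assumption:
# the derivation never uses it.
STAGE = {
    ("PC", "PC"): (4, 4),
    ("C", "PC"): (5, 0),
    ("PC", "C"): (0, 5),
    ("C", "C"): (1, 1),
}


def trigger(history):
    """Trigger strategy: stay quiet (PC) while every past profile is (PC, PC)."""
    return "PC" if all(p == ("PC", "PC") for p in history) else "C"


def deviate_at(t_dev):
    """Hypothetical deviator: plays the trigger strategy but confesses in period t_dev."""
    def strategy(history):
        return "C" if len(history) == t_dev else trigger(history)
    return strategy


def discounted_payoffs(strategy_1, strategy_2, delta, periods=500):
    """Present value for both players over a long finite horizon (approximates infinity)."""
    history = []
    total = [0.0, 0.0]
    for t in range(periods):
        # Both players move simultaneously, seeing only the past history.
        profile = (strategy_1(history), strategy_2(history))
        u = STAGE[profile]
        total[0] += delta ** t * u[0]
        total[1] += delta ** t * u[1]
        history.append(profile)
    return tuple(total)


for delta in (0.2, 0.25, 0.5):
    cooperate = discounted_payoffs(trigger, trigger, delta)[0]
    deviate = discounted_payoffs(deviate_at(0), trigger, delta)[0]
    print(delta, round(cooperate, 4), round(deviate, 4))
```

Under these assumptions, at δ = 0.5 the cooperative path is worth 8 and the deviation 6, at δ = 0.25 the two values coincide, and at δ = 0.2 deviating pays more (5.25 against 5), consistent with the threshold δ ≥ 1/4.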
Thus, cooperating is a Nash equilibrium of the repeated game as long as players are patient enough, that is, as long as $\delta \geq \frac{1}{4}$.

The following section presents a real-life example of a repeated ‘Prisoner’s dilemma’ in which the above concepts are illustrated.

3 Price leadership in the breakfast cereal industry

Companies that have competed daily for a century approximate the case of infinite repetition. Each time the economic fundamentals change, the scope for profit and the set of equilibria of the infinitely repeated game also change. The problem of coordinating among an infinite number of equilibria becomes extremely complex. Let us see how the breakfast cereal industry adapted to the challenge posed by the infinite repetition of the game, see Gardner (1996).

Breakfast cereals were invented in the USA in 1890 by advocates of a healthy diet. Two of the inventors, Kellogg and Post, gave their names to the companies they founded. Two other companies, General Mills and Ralston-Purina, also gained important market shares in this industry. These four companies were rivals throughout the 20th century. As large companies with strong profits and bright futures, they had reason to think that they would be competing in this market indefinitely. They could therefore act as if they were playing an infinitely repeated game, one equilibrium of which corresponds to a monopolistic price policy.

In real life, there is a restriction on the Folk Theorem for infinitely repeated games that has not yet been mentioned. Since 1890, a federal law in the USA has forbidden “monopolizing, the intention of monopolising and the conspiracy to monopolize” a market. Government agencies took it upon themselves to ensure compliance with this competition legislation. Normally, the US government does not try to show that companies have monopolised a market. Instead, it tries to show that companies have conspired to monopolise the market. The behaviour underlying the Folk Theorem, which involves no conspiracy of any kind, could hardly be considered a violation of competition law. Nevertheless, companies with long time horizons move along the thin line that separates legal from illegal practices.

Suppose the main companies in an industry reach a solution of an infinitely repeated game that yields monopoly profits. Each time one of the economic parameters changes, such as costs or consumer preferences, the equilibria of the repeated game also change. Companies need some mechanism to move from the current equilibrium to the new one. If this mechanism fails, they may end up at the equilibrium of the stage game, with reduced profits for everyone.

The solution to this problem adopted in the breakfast cereal industry is called price leadership. Under price leadership, one company, the price leader, takes charge of the industry’s price policy. Each time a change in one of the economic parameters requires a change in price policy, the price leader carries it out. The other members of the industry rely on the price leader to move to the right prices, so that the industry reaps the highest possible profits.

Throughout most of the 20th century, the price leader in the breakfast cereal industry was Kellogg’s, which was also number one in market share with over 40% of total sales. Given inflation in the USA, particularly after World War II, most price changes were upwards. From 1950 to 1972, 99% of all price changes were price increases. A large proportion of all price increases, 80% between 1965 and 1970, were led by Kellogg’s. Normally the other companies in the sector followed the leader quickly.
Even when other companies did not follow its lead, Kellogg’s would not reverse its price increases. Instead, it spent more on advertising and waited for the rest of the industry to adjust to the new price. Price leadership helped the breakfast cereal sector enjoy very high profit margins, well above the average rate of return on assets.

A government agency, always on the lookout for signs of conspiracy, filed a lawsuit against the breakfast cereal companies. Admitting that it had no proof of a conspiracy, the agency argued that, through their behaviour, the breakfast cereal companies had in practice a shared monopoly and should therefore submit to the agency’s rulings. The agency’s idea of a shared monopoly, if correct, confirmed that the companies in this sector, by following the price leader, had effectively achieved and maintained the maximum-profit solution of their repeated game.

The case dragged on for several years, with all kinds of legal manoeuvres over who should be the judge. It also became highly politicised. During the 1980 presidential campaign, Ronald Reagan wrote to Kellogg’s expressing his concern about the situation. At the same time, labour organisations, fearing job losses, put strong pressure on President Carter not to pursue Kellogg’s. The agency realised that, even if it could prove the existence of a shared monopoly, it would not win the case in court. The judge closed the case against the breakfast cereal companies in 1981. Kellogg’s and its price followers went on enjoying impressive profits.

4 Conclusions

Most strategic games consist of repeated interactions between players. In this type of game, the first step is to recognise the possibility of cooperating. Repeated games have payoffs that create tension for players between competing and cooperating.
In repeated games, players interact repeatedly, so current behaviour can be conditioned on past behaviour. This allows each player to be punished or rewarded and, in the end, allows players to achieve higher payoffs, including escaping from the prisoner’s dilemma. If the ‘Prisoner’s dilemma’ is played only once, this tension produces the competitive payoff. Players may want to cooperate so as to achieve a higher collective payoff, but the temptation to compete is irresistible. Nevertheless, as we have seen, when this game is played repeatedly, cooperation can be sustained, so players achieve higher payoffs.

In a game where players are patient enough, a trigger strategy may be used to enforce cooperation. However, it may take some time until the players reach a tacit agreement on how to collaborate. This agreement can be reached as soon as they realise that they are all strategic players whose goals cannot be attained immediately, since achieving these goals requires sacrificing something. One must also consider that cooperation allows higher payoffs to be reached, whereas abandoning it brings punishment.

References

Gardner, S. (1996) Juegos para Empresarios y Economistas, edited by A. Bosch, Cambridge University Press, Barcelona.
Jorgensen, S., Quincampoix, M. and Vincente, T.L. (2007) Advances in Dynamic Game Theory, Birkhauser, Boston.
Matos, M.C.P. (2009) Jogos na Forma Codificada – Outra Forma de Representação dos Jogos, PhD thesis, ISCTE-IUL, Lisboa.
Matos, M.C.P. and Ferreira, M.A.M. (2006) ‘Game representation-code form’, in Namatame, A., Kaizouji, T. and Aruka, Y. (Eds.): Economics and Heterogeneous Interacting Agents, Lecture Notes in Economics and Mathematical Systems, Vol. 567, pp.321–334 [online] http://dx.doi.org/10.1007/3-540-28727-2_22 (accessed 20 January 2016).
Straffin, P. (1980) ‘The prisoner’s dilemma’, UMAP Journal, Vol. 1, pp.101–103.
von Neumann, J. and Morgenstern, O. (1967) Theory of Games and Economic Behaviour, John Wiley & Sons, Inc., New York.