Abstract: Game theory is an application of expected value and rational actor theory. It can answer big questions: Will the US win a war against China? It’s only a matter of gathering the information required to validate your assumptions. But game theory can also help answer mundane questions, such as how best to gather information for evaluations.
This past summer, the Department of the Army (DA) adopted changes to the Officer Evaluation Report. Whenever changes to the officer evaluation report (OER) system are discussed, one option always gathers adherents, particularly, it seems, in SOCOM, where it is held with almost religious devotion: peer evaluations. The basic idea is to allow officers to self-select their most capable. The most recent round of revisions has produced some positive changes, most notably more descriptive sections on an officer’s performance and potential and fewer ‘box checks’, but the idea of ‘peer evaluations’ was not adopted. An application of mathematical logic illustrates why peer evaluations fail to give the commander the information he needs. Game theory, which looks at the strategic interactions of the players’ available moves, takes into account preferred outcomes. In particular, game theory can illustrate outcomes that might not have been intended by the players. The success of each actor in the game depends on how his or her own actions interact with those of the competitors. A simple matrix format shows the interaction between the players’ preferred outcomes. Observers can then use the game to more easily conceptualize the effects of proposed policies.
Game theory provides two models. One is the zero-sum game, in which the only outcomes are a win or a loss. The zero-sum model was popular during the early Cold War to model US-Soviet nuclear exchanges, so it is easy to see the connection between that type of total war and a total game with only two outcomes. Economists developed the other model in an attempt to understand human interaction in terms of maximizing value. They understood that people possess varying levels of information and will, to some degree, cooperate with each other. This is the model we will use, and it is similar to the classic ‘prisoner’s dilemma’ which Axelrod made famous in his book, The Evolution of Cooperation (Axelrod, Robert, The Evolution of Cooperation, Basic Books, New York, 1985). In the prisoner’s dilemma, the police promise freedom to the prisoner who cooperates with them and ‘snitches’ on the other guy.
Imagine two officers, Lieutenants Anderson and Brown, both up for their annual evaluation at the same time. The commander does not have enough information to determine which of these junior officers should receive the coveted Above Center of Mass check, which is restricted by regulation to only 49% of the commander’s rated population. The commander discusses his problem with his XO, who visits the officers and asks each of them to write a recommendation of the other. If both officers write exceptional ratings of the other, they are ‘cooperating’ with the commander (who secretly thinks both officers deserve the best rating). If Lieutenants Anderson and Brown rate each other as substandard, then the brigade commander would rate both as center of mass, a case in which each junior officer receives less than the superior rating. If the lieutenants cooperate and write glowing reports of each other, one will still lose, because the commander can only rate one as exceptional. However, if Lieutenant Anderson rates Lieutenant Brown as substandard, then Anderson will receive the better rating. If LT Brown rates Anderson as substandard, then Brown will receive the better rating. What do the officers do? We can use the following model to find out.
The officers have two broad strategies: they can cooperate and write generally good reviews of each other, or they can adopt a negative strategy in which they ‘trash’ the other officer. In theory, each strategy can have unlimited variations, and game theory is often explored using complex computer programs which run almost unlimited iterations over minute variations on strategies. In order to simplify this demonstration, we limit the officers to four types of evaluations:
-Write a superior peer evaluation
-Write an excellent peer evaluation
-Write a generic peer evaluation
-Write a bad peer evaluation
It is important to remember that Lieutenants Anderson and Brown understand that the commander is limited to giving only one of them a ‘top-block’ OER. In this case, each officer prefers to maximize his own outcome. In other words, each officer wants to receive the superior rating from the commander for himself. Now, we can assign relative values to the strategies, 1 through 4 in the order listed, so that writing a superior peer evaluation is valued at 1 and writing a bad one at 4.
The strategies are valued this way because if Anderson writes a superior evaluation for Brown, then the commander would give Brown the top-block, which is the worst outcome for Anderson. Each officer is rational, so each officer’s expected value for each strategy would be identical. There are three rules to our game, which reflect real-world constraints on the system. One, the commander can only give one officer a top block; there are no tricks to get around this. Two, our two lieutenants understand the officer evaluation process. Three, each officer will seek to maximize his own evaluation. These three rules force the model to be rational. This may surprise those who criticize game theory as too math-centric or too abstract to be understood, but these rules reflect the real world to a surprisingly large extent.
At this point, we can put the strategies into the matrix and observe the interactions. In our matrix, the strategies are grouped under ‘Go Positive’, which reflects generally positive recommendations, and ‘Go Negative’, which reflects generally negative recommendations. Anderson and Brown don’t know what the other will finally write, and neither one will get the chance to read what the other turns in to the XO.
However, the observer can easily see which strategy the officers will adopt. Follow the arrows in figure 1: both officers would like to go positive, but neither can be sure that the other will honor ‘the code’, and so they move to higher valued strategies. Once the movement begins, each officer is confronted with the same dilemma: the other officer will go to his dominant position. So each officer must go to his own dominant position, as shown by the highest value strategy, which is to go negative on the other.
Figure 1. Game matrix showing how the lieutenants flow away from 'losing' strategies towards 'winning' strategies in order to protect themselves.
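The flow toward ‘Go Negative’ described above can be checked with a short Python sketch. It uses the 1 through 4 strategy values from the text; the function names are hypothetical, and it assumes, as a simplification, that each officer's value depends only on the review he writes himself. It is an illustration of a best-response check under those assumptions, not a reproduction of the matrix in figure 1.

```python
from itertools import product

# Strategy values taken from the text: 1 (superior review) through 4 (bad review).
STRATEGIES = ["superior", "excellent", "generic", "bad"]
VALUE = {"superior": 1, "excellent": 2, "generic": 3, "bad": 4}


def payoff(own, other):
    """Value to an officer of writing `own` when his peer writes `other`.

    Assumption for this sketch: the value depends only on the officer's
    own review, so `other` is accepted but not used.
    """
    return VALUE[own]


def best_responses(other):
    """Return the set of reviews that maximize an officer's value against `other`."""
    best = max(payoff(s, other) for s in STRATEGIES)
    return {s for s in STRATEGIES if payoff(s, other) == best}


def pure_nash_equilibria():
    """List the (Anderson, Brown) pairs where each review is a best response to the other."""
    return [
        (a, b)
        for a, b in product(STRATEGIES, repeat=2)
        if a in best_responses(b) and b in best_responses(a)
    ]


print(pure_nash_equilibria())  # [('bad', 'bad')]
```

Under these assumptions the only resting point is the pair in which both lieutenants write bad reviews, which matches the dominant position the arrows lead to.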
This is reminiscent of the old Cold War zero-sum game, in which the only recourse available to either country was to go all out, all at once. This became known as MAD, mutually assured destruction, when neither country could benefit unilaterally from a change in strategies. Our game, because its Nash Equilibrium is its saddle point, appears to be a zero-sum game. However, this is not the case, since zero-sum games preclude any cooperation in the form of additional information or any further interaction between the players.
This dominant position, effectively a stalemate, is known as a ‘Nash equilibrium’. It is a point in the model from which neither lieutenant will move (in this case down) without additional incentives. These incentives are called ‘side payments’. Side payments are values added to lower ranked strategies in order to make the actor willingly choose that lower valued course of action. In international relations, the United States often uses side payments, in the form of loans through the IMF, the Export-Import Bank, or favored status in world organizations, to get one or more countries to move away from a strategy which they see as maximum value, but which is detrimental to US interests. In our example, side payments could take the form of reminders from the XO of the other officer’s achievements, or promises of more important jobs in the organization hierarchy. With the restriction to 49% in the current OER system, officers are already familiar with the concept of ‘bill-paying.’ These side payments enter the game as information and provide bias, moving the players up or down on their scale. In figure 2, we can see the dominant position each lieutenant occupies on the line of the Pareto optimal: if one lieutenant can gain a position that is further to the right, further up, or both, at a reciprocal cost to the other lieutenant’s position, then that officer would win. The two lieutenants could even work together to give each other the same ‘generic’ peer evaluation. It’s implausible they would do this, but still rational within the bounds of the game.
Figure 2. Graphing the strategies shows the lieutenants' dominant position, or Nash Equilibrium, which lies on the Pareto Optimal line, and the area into which side payments would move the lieutenants.
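The effect of a side payment can be sketched in Python using the 1 through 4 strategy values from the text. The function names and the size of the example payment are hypothetical, and the sketch assumes that payments come in whole-number increments and that each officer's value depends only on his own review.

```python
from itertools import product

# Strategy values taken from the text: 1 (superior review) through 4 (bad review).
BASE_VALUE = {"superior": 1, "excellent": 2, "generic": 3, "bad": 4}


def equilibria(side_payments=None):
    """Return the (Anderson, Brown) resting points after side payments are added.

    Assumption: each officer's value depends only on his own review, so the
    equilibrium is simply every pairing of the top-valued strategies.
    """
    side_payments = side_payments or {}
    value = {s: v + side_payments.get(s, 0) for s, v in BASE_VALUE.items()}
    best = max(value.values())
    dominant = [s for s, v in value.items() if v == best]
    return list(product(dominant, repeat=2))


def min_side_payment(strategy):
    """Smallest whole-number payment that makes `strategy` strictly preferred."""
    best = max(BASE_VALUE.values())
    if BASE_VALUE[strategy] == best:
        return 0  # already the highest valued strategy
    return best - BASE_VALUE[strategy] + 1


print(equilibria())                    # [('bad', 'bad')]
print(min_side_payment("generic"))     # 2
print(equilibria({"generic": 2}))      # [('generic', 'generic')]
```

In this sketch a payment of 2 on the generic review is enough to pull both lieutenants off the bad review, while moving them all the way to a superior review would take a payment of 4.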
Game theory provides a framework on which to develop policy options: it gives broad right and left limits for success and, in this case, failure. The greatest problem with ‘peer evaluations’ revealed here is the internal, value-maximizing impulse in human nature. Unfortunately, officers who follow the rules of rationality have no internal motivation to cooperate. Any activity introduced to mitigate this bias only further unbalances the system. Still, the Army has the problem of determining which officers are actually doing the best work. If the officers cooperate fully, they would only rate each other with generic recommendations, throwing the issue back on the commander. Additional rules can be instituted, taking the form of side payments and meant to influence the officers to either inflate or deflate their recommendations according to some predetermined system. When this happens, the commander is not receiving an unbiased report, and so is still making the decision based on his original observations. Even more ominously, mathematicians have produced an ‘evolutionary’ model of the prisoner’s dilemma, in which the game is played over infinite iterations in a set population. Jonathan Bendor and Piotr Swistak found that the most successful strategies replicate faster because they survive. In our case, officers who are successful based on their peer recommendations will continue to use that strategy. The problems that would ensue from basing promotions on a system that rewards inflated evaluations are then fairly obvious. Bendor and Swistak also found that players would begin switching from lower valued strategies to the higher valued strategies that are successful. It’s an interesting application of Bayes’ Theorem. In our model, we would soon have all officers switching to the most destructive peer evaluations, regardless of the truth. Only a strong central authority, in this case the commander, could reintroduce stability.
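The drift toward the most destructive evaluations can be illustrated with a simple discrete replicator update in Python. This is a generic textbook-style sketch, not Bendor and Swistak's model: it assumes that a strategy's share of the officer population grows in proportion to its 1 through 4 value from the text, and the function names and the even starting split are hypothetical.

```python
# Strategy values taken from the text, used here as a stand-in for "fitness".
FITNESS = {"superior": 1, "excellent": 2, "generic": 3, "bad": 4}


def replicator_step(shares):
    """Advance one generation: each share grows in proportion to its fitness."""
    mean_fitness = sum(shares[s] * FITNESS[s] for s in shares)
    return {s: shares[s] * FITNESS[s] / mean_fitness for s in shares}


def simulate(generations, shares=None):
    """Run the update repeatedly, starting from an even split by default."""
    if shares is None:
        shares = {s: 1 / len(FITNESS) for s in FITNESS}
    for _ in range(generations):
        shares = replicator_step(shares)
    return shares


final = simulate(20)
for strategy, share in final.items():
    print(f"{strategy:>9}: {share:.4f}")
```

Starting from an even split, more than 99% of the population in this sketch is writing bad reviews after 20 generations, which is the pattern the paragraph above warns about.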
If the officers work together and write ‘generic’ evaluations, then the commander still has to decide which gets the better rating. In any case, the burden still falls to the commander, even though the ‘peer evaluation’ system was set up to help the commander with his judgment. (Bendor, Jonathan and Piotr Swistak, “The Evolutionary Stability of Cooperation,” The American Political Science Review, Vol 91, No 2, (Jun 1997), pp 290-307. Also by the same, “The Controversy about the Evolution of Cooperation and the Evolutionary Roots of Social Institutions,” in Gasparski, Wojciech et al (eds), Social Agency, New Brunswick, N.J.: Transaction Publishers, 1996.)
Still, the process of applying game theory to evaluations has some merit. It has shown that ‘peer evaluations’ are not a viable method, and so it allows decision makers to concentrate on other, more viable alternatives. If the commander needs more information, he should ask not peers, but other groups. For example, company commanders, generally captains, can provide information on lieutenants, even the ones not in their own rating chain. The staff can provide reports on company commanders. As for the current regulations, removing the 49% restriction would change the values of rated officers’ strategies. Put in place during the last OER revision in 1997, the restriction was meant to stop the inflation of every officer’s evaluation to ‘superior’ status. This may now be resolved by allowing commanders with small populations to break the 49% rule. Still, considered separately, ‘peer evaluations’ would not give the commander the kind of information he needs to select ‘superior’ rated officers.
So why game theory? It can answer big questions: Will the US win a war against China? It’s only a matter of gathering the information required to validate your assumptions. But game theory can also help answer mundane questions, such as how best to gather information for evaluations.