Leadership and Military Effectiveness
Article in Foreign Policy Analysis · August 2017

DOI: 10.1093/fpa/orx003

Foreign Policy Analysis (2018) 14, 490–511
Leadership and Military Effectiveness
DAN REITER AND WILLIAM A. WAGSTAFF
Emory University
What determines military effectiveness? Though political scientists have
studied the sources of military effectiveness, they have generally ignored
the role of military leadership, a factor that historians have emphasized
as crucial for effectiveness. This article presents the first rigorous
examination of the proposition that militaries improve effectiveness by
replacing low-performing leaders. The article tests three theories describ-
ing how militaries promote and demote leaders: (1) military leaders are
promoted and demoted on the basis of combat performance; (2) politi-
cal leaders fearful of coups do not demote low-performing military
leaders, as a coup-proofing tactic; and (3) military leaders that belong
to powerful interpersonal networks are less likely to be demoted and
more likely to be promoted. Hypotheses are tested using new data on
all American and German generals holding combat commands in the
North African, Italian, and West European theaters in World War II
and new data on the monthly combat performance of American and
German divisions in these theaters. Analysis reveals that both armies
replaced low-performing generals (coup-proofing motives did not
prevent Hitler from demoting low performers) and that interpersonal
networks in the US army did not block demotion of low performers.
Also, the replacement of low-performing generals improved combat
effectiveness in both armies.
Historians have long emphasized the importance of leadership in determining
military effectiveness. In striking contrast, though political scientists have for
decades studied many possible sources of military effectiveness including technol-
ogy, regime type, strategy, military-industrial power, and others, they have almost
completely neglected military leadership as a possible determinant of military
effectiveness.
This article presents the first rigorous social scientific treatment of the causes
and consequences of quality military leadership. It asks three related questions:
Do militaries replace poorly performing leaders? Are there conditions that might
impede a military from replacing poorly performing leaders? When poorly per-
forming leaders are replaced, does military effectiveness improve? The article pro-
poses that militaries replace poorly performing leaders and that replacing
poorly performing leaders improves military effectiveness. It also considers
whether interpersonal networks among generals or coup-proofing incentives
that prioritize political loyalty over competence might prevent poorly performing
leaders from being replaced. The focus on individual military leaders here
complements other international relations (IR) scholarship that focuses on political
leaders (e.g., Goemans 2000; Rosen 2005; Bueno de Mesquita et al. 2003; Gelpi
and Feaver 2005; Wolford 2007; Chiozza and Goemans 2011; Horowitz, Stam,
and Ellis 2015).
Dr. Dan Reiter is the Samuel Candler Dobbs Professor of Political Science at Emory University.
William A. Wagstaff is a Ph.D. candidate in Political Science at Emory University and an Assistant Professor with
the Blue Horizons Program at the US Air Force’s Center for Strategy and Technology. His dissertation explores the
connections between military leadership and combat effectiveness. The views expressed in this article are those of
the authors and do not necessarily reflect the official policy or position of the Air Force, the Department of
Defense, or the US government. Opinions, conclusions, and recommendations expressed or implied within are
solely those of the author and do not necessarily represent the views of the Air University, the Department of
Defense, or any other US government agency.
Reiter, Dan and William A. Wagstaff. (2017) Leadership and Military Effectiveness. Foreign Policy Analysis, doi: 10.1093/fpa/orx003
© The Author (2017). Published by Oxford University Press on behalf of the International Studies Association.
All rights reserved. For permissions, please e-mail: [email protected]
We test our ideas on military leadership on the American and German
armies in World War II. We use two new data sets. The first is a data set of the
command tenures of all 320 American and German division-leading generals
who led troops (infantry, airborne, or armored) in battle in North Africa, Italy,
and Western Europe, from 1941 to 1945. The second data set records the
monthly combat performance of all American and German divisions in these
three theaters.1 We test whether generals leading low-performing divisions
were more likely to be replaced on the German and/or the American side, and
then also whether a division’s performance improved after its general was
replaced. We also test two theoretical challenges to the proposition that
low-performing generals get replaced: first, that interpersonal networks prevent
low-performing generals from being replaced and, second, that in civilian dictatorships
like Nazi Germany, because generals are promoted for their political loyalty to
the dictator, generals are unlikely to be removed even after having performed
poorly in combat. Our focus on a single war follows the lead taken by other
conflict scholars—especially scholars of intrastate conflict—of analyzing more
fine-grained microdata from a single conflict. The microdata approach
improves data quality and internal validity (Verwimp, Justino, and Brück 2009;
for an example of a quantitative study of interstate war using microdata, see
Allen and Vincent 2011).
Our results are illuminating. In both the German and American armies,
low-performing generals were more likely to be replaced, and replacement of low-
performing generals significantly boosted military effectiveness. The tendency of
the German army, serving under the civilian dictator Adolf Hitler, to replace low-
performing generals provides evidence against the coup-proofing hypothesis that
dictators always prioritize political loyalty over competence. The finding that
American generals who were members of powerful interpersonal networks were
not less likely to be replaced provides evidence against the hypothesis that such
networks can distort the relationship between performance and leadership
removal.
The article makes three important contributions to IR. First, it improves our
understanding of the sources of military effectiveness. The article provides some
of the first rigorous evidence demonstrating that militaries improve effectiveness
by replacing poorly performing leaders, and more generally that quality leader-
ship boosts effectiveness. Second, the article improves our understanding of
how military organizations behave. The results here support a more Weberian
view of militaries as functionalist organizations, pushing back against perspec-
tives that militaries are always hidebound organizations slow to adapt, impeded
by factors such as interpersonal networks and coup-proofing dynamics. Third,
the article contributes to an ongoing empirical debate about the sources of
American and German military effectiveness in World War II, providing support
for the argument that one reason American forces fought well in World War II
was the willingness of the American high command to dismiss low-performing
generals.
1 As we discuss in greater detail below, our novel measure of combat performance contains division-specific two-month
moving averages of division performance. We use territorial objectives (e.g., capturing or holding territory
versus losing territory) to measure division performance (see Gartner and Myers 1995, 377).
The remainder of this article contains five sections. The first section develops
the theory that military organizations change leadership as a means of improving
performance. The second section presents hypotheses. The third section presents
the research design. The fourth section presents results. The final section
concludes.
Military Leadership and Military Effectiveness

Especially since the end of the Cold War, IR scholars have become increasingly in-
terested in understanding the sources of military effectiveness and, relatedly, the
determinants of war outcomes. Scholars have examined a number of different
possible determinants of military effectiveness, including regime type, technology,
terrain, ethnicity, civil-military relations, military strategy, alliances, economic
might, nationalism, and others (Stam 1996; Rosen 1996; Brooks 1998; Quinlivan
1999; Reiter and Stam 2002; Biddle 2004; Biddle and Long 2004; Brooks and
Stanley 2007; Desch 2008; Beckley 2010; Grauer and Horowitz 2012; Castillo 2014;
Talmadge 2015; Reiter 2017).
However, there is almost no IR work on the impact of military leadership on
military effectiveness, or on the determinants of the promotion and demotion of
military leaders, especially in wartime.2 This lacuna is especially striking given the
great emphasis that military historians and other observers have placed on mili-
tary leadership as a critical determinant of military effectiveness for millennia.
The ancient Chinese observer Sun Tzu (1963, 115) warned that poor-performing
generals will cause “the ruin of the army.” Machiavelli (1999) placed great empha-
sis on quality military leadership. Napoleon himself once declared, “[t]here are
no bad regiments; there are only bad colonels” (quoted in Farwell 2001, 206).
Historians have long stressed that outstanding military leaders, from Lord Nelson
to Robert E. Lee to Erwin Rommel to David Petraeus, have made key contribu-
tions to victory in conflict types ranging from counterinsurgency to naval warfare
to high-intensity land war. A standard interpretation of the American Civil War is
that the Union won because President Lincoln replaced the timid General
George McClellan with more ruthless, effective leaders such as Ulysses Grant,
William Sherman, and Philip Sheridan (McPherson 1988).
Military leadership is important for all forms of combat. We propose that effec-
tive military leaders exhibit one or more of the following three qualities. First, ef-
fective military leaders inspire their soldiers to fight and die for their country and
their unit. Though nationalism and other factors have been found to play roles in
motivating soldiers to fight (Castillo 2014), leaders themselves can directly
encourage soldiers to fight and sacrifice themselves. Different leaders employ dif-
ferent strategies for motivating their soldiers, including making competent deci-
sions, applying brutal discipline, and/or conveying concern for the well-being of
their troops. During the American Civil War, for example, General Thomas
“Stonewall” Jackson used strict discipline and consistently successful decision-
making to motivate his troops (McPherson 1988, esp. 456–457).
2 Reiter and Stam (2002, 79) present evidence using Historical Evaluation and Research Organization (HERO)
data connecting leadership quality with combat success. Grauer’s (2016) theory touches on effective military
leadership, but focuses more on organizational structure and information flows. Rosen (1991, esp. 20–21) pro-
posed that higher-ranking officers seeking to innovate may make promotion and demotion decisions on the ba-
sis of whether or not the lower-ranking officers support the proposed innovation, but focused his argument on
peacetime. Moore and Trout (1978) present a theory of promotion in the US military. Avant (2007, 85) devel-
oped a theory connecting political institutions and military effectiveness, listing in a table “officer selection” as
one of the intermediate variables between civilian institutions and effectiveness. The coup-proofing literature
also links military leader quality to combat outcomes, proposing that dictators (at least sometimes) coup-proof
their armies, have lower-quality military leaders, and in turn experience lower effectiveness (Quinlivan 1999;
Brooks 1998; Pilster and Böhmelt 2011; Talmadge 2015). This article tests the coup-proofing thesis that dictators
prioritize political loyalty over competence in their officers, as described below.
Second, quality leaders are masters of the craft of combat. They can accomplish
a wide variety of tasks, including securing logistical support, appropriately deploy-
ing troops, achieving concealment and protection using terrain and human-made
structures, and employing weapons technology to maximum effect. These skills
improve combat effectiveness and affect battle outcomes. For example, during
the 1950 Chinese attack on American forces at Chosin during the Korean War,
American Marine units outperformed American Army units because of superior
leadership of the former in areas such as communications, reconnaissance, and
logistics (Ricks 2012, chapter 11). Masters of combat also provide more strategic
and doctrinal options. Mearsheimer (1983) and Stam (1996) described the operational
superiority of blitzkrieg or maneuver strategies in comparison to attrition
strategies. However, such strategies require quality military leaders who are able
to identify and effectively exploit emerging battlefield opportunities. Biddle
(2004) made a similar point regarding what he described as the modern system of
force employment. Certainly, German adoption of the tremendously successful
blitzkrieg strategy presumed a high level of confidence in the quality of combat
commanders (Van Creveld 1982, 36).
Third, leaders choose subordinates, and quality leaders are more likely to
choose quality subordinates, who in turn bolster performance. Indeed, the single
most important contribution General George C. Marshall made to the American
war effort in World War II may have been his identification and promotion of
Dwight D. Eisenhower (Ricks 2012, chapter 2). Generals’ selection of their com-
mand staffs is especially important for modern armies (Van Creveld 1985).3
If leader quality affects military effectiveness, what factors determine the pro-
motion and demotion of different kinds of military leaders? A straightforward per-
spective is that the state and the high military command retain competent leaders
and dismiss ineffective leaders. This is a centuries-old idea, stressed by none other
than Machiavelli (1999, chapter 12) in his manual on successful statecraft, The
Prince: “the republic has to send its citizens, and when one is sent who does not
turn out satisfactorily, it ought to recall him, and when one is worthy, to hold him
by the laws so that he does not leave his command.” It also echoes Weber’s (1978,
esp. 217–218) vision of a rational bureaucracy in which individuals advance on
the basis of merit. More recent work in political science and, more broadly, the so-
cial sciences has explored how organizational leaders are selected and whether
leadership quality affects organizational performance. Sociologists and scholars of
business have explored whether and how leadership affects corporate perform-
ance and whether corporate leaders are selected on the basis of competence
(Lieberson and O’Connor 1972; Tarakci, Greer, and Groenen 2016). There is a
long-standing literature in American politics on legislator effectiveness (Volden
and Wiseman 2014) and whether presidents choose leaders of federal agencies on
the basis of merit or political considerations (Hollibaugh, Horton, and Lewis
2014).
The proposition that militaries retain, promote, and demote leaders on the ba-
sis of combat performance is not well-developed within modern scholarship.
There is only limited empirical work in political science as to whether militaries
promote officers on the basis of aptitude and competence, though scholars have
occasionally discussed specific episodes of commander (non)replacement.4
3 One potential caveat to the importance of military leaders for military effectiveness is that military leaders have
different degrees of flexibility, that is, authorized decision-making autonomy, across time and space (Grauer
2016). Postindependence Arab armies, such as the Egyptian army during the 1973 Yom Kippur War, generally af-
ford their generals and lower-ranking officers relatively little wartime decision-making autonomy (Brooks 1998;
Pollack 2002). That is, higher leader quality may have less impact on performance in militaries in which leaders
have less autonomy.
4 For example, Posen (1993, 97) noted that Prussia’s 1806 defeats at Jena and Auerstadt pushed its high military
command to dismiss officers responsible for the defeat and make officer recruitment more systematic.
Some work is critical of the idea that military organizations retain quality leaders
and remove poor-performing leaders. One body of scholarship posits that militar-
ies are hidebound institutions resistant to change, likely to suffer from organiza-
tional biases as they process information and make decisions (Posen 1984; Snyder
1984; Van Evera 1999). As the next section describes, this article considers two
dynamics, interpersonal networks and coup-proofing, that might prevent
militaries from replacing low-performing leaders.
A different question is, in the face of poor performance, why would a military
react by replacing its leadership, as opposed to attempting a different kind of
change? Indeed, a number of works have explored other types of reactions mili-
taries might have in an attempt to improve performance, such as changing mili-
tary strategy (Gartner 1997; Biddle 2004) or adopting new technologies
(Horowitz 2010). In the pursuit of better performance, there are a few reasons
why a military organization might choose to replace leadership rather than
change strategy or technology. Changing leaders offers the possibility of a rapid
improvement in performance, as implementing a new strategy or new technology
can take more time. Financially, changing leadership is relatively cheap, as it
does not require buying new equipment or revamping training programs.
Relatedly, a state may wish to change its technology, but might not have access to
the technology it desires, cost notwithstanding. Changing leaders often does not
require militaries to overturn long-held beliefs about military strategy, beliefs that
may shade into organizational culture and entrenched standard operating
procedures. Last, changing leaders is a more selective approach to improving per-
formance. It permits an organization to attempt to improve the performance of
low-performing units, such as individual divisions, and leave alone high-perform-
ing units. In contrast, it is more difficult to make such discriminate changes by
changing strategy or technology.
Hypotheses
We propose that militaries see effective military leadership as a key determinant
of success, and accordingly militaries make demotion and promotion decisions in
wartime on the basis of combat outcomes.
Hypothesis 1a: In wartime, military leaders are more likely to lose their command if their
units experience poor combat outcomes.
Hypothesis 1b: In wartime, military leaders are less likely to be promoted if their units experi-
ence poor combat outcomes.
We also present two alternatives to Hypotheses 1a and 1b. The first focuses on
regime politics. Scholars have proposed that political leaders who fear being
deposed in a coup d’état, such as civilian dictators, sometimes take preventive
measures to reduce the coup threat, a technique known as coup-proofing (schol-
ars disagree as to whether or not [certain forms of] coup-proofing actually pre-
vents coups; see Powell 2012, Rwengabo 2013, Albrecht 2015, and Harkness
2016). A central coup-proofing measure is the promotion and retention of mili-
tary officers on the basis of political reliability, as manifested by factors such as fa-
milial ties, ethnic kinship, or ideological affinity with the political leader, rather
than by professional competence. In one of the classic works on civil-military rela-
tions, Janowitz (1960, 353), speaking of the American context, commented that
controlling the system of promotions is “a crucial lever of civilian control.” The
selection of military officers on the basis of political reliability rather than compe-
tence has been hypothesized to reduce military effectiveness because it lowers
leader quality (Biddle and Zirkle 1996; Quinlivan 1999; Talmadge 2015). Scholars
have proposed that nondemocracies, Arab states, and states under lower levels of
external threat are especially likely to engage in coup-proofing (Quinlivan 1999;
Reiter and Stam 2002; Pollack 2002; Pilster and Böhmelt 2012; Talmadge 2015).
We compare the leadership and combat experiences of a democratic military,
World War II America, and a dictatorial military, World War II Germany. A coup-
proofing hypothesis would forecast that a civilian dictator like Adolf Hitler should
be strongly motivated to coup-proof his regime, given Germany’s recent history of
coups and coup attempts, signs of dissatisfaction with the Nazi regime within the
German military, and Hitler’s awareness of latent threats in the German military
to his rule (Fest 1994). Coup-proofing theory would predict a divergence between
the American and German experiences. The United States ought to retain gener-
als on the basis of combat performance, but, according to coup-proofing theory,
coup-proofing considerations should attenuate the connection between combat
performance and career outcomes of military leaders in a dictatorship such as Nazi
Germany, as officers appointed for political reliability should retain their commands
even when their units perform poorly.5
Hypothesis 2: In wartime, dictators should be less likely to relieve low-performing generals as
compared with democratically elected civilian leaders.
A second alternative perspective is that interpersonal networks affect organizational
leadership decisions. Some propose that organizations include interpersonal
networks and that members of powerful networks are likely to enjoy faster
professional advancement (Burt 1992). Moore and Trout (1978) presented
a complex “visibility” theory of military promotion, emphasizing that,
though talent helps put individuals into the pools of individuals eligible for pro-
motion, personal networks are much more important in the promotion process
than talent. They also proposed that personal networks become increasingly im-
portant in higher ranks of seniority. Other scholarship has provided empirical
evidence that personal networks affect the leadership selection and promotion
process in American and other militaries (Segal 1967; Peck 1994; Kim and Crabb
2014). That said, some have presented evidence that in the US military the value
of such networks does not affect promotion patterns (Schwind and Laurence
2006) or has varied across time (Evans 1992).
This interpersonal networks perspective suggests:
Hypothesis 3: In wartime, military leaders who are members of powerful interpersonal net-
works are more likely to be retained and promoted than leaders who are not members of such
networks.
Thus far we have discussed the determinants of leadership turnover. The
remaining hypothesis examines how leadership turnover affects combat performance.
We note that Hypotheses 1a and 1b assume that organizations believe that
replacing low-performing leaders will improve military organizational performance.
We test this assumption. Doing so directly is difficult, as the best
measure of leader effectiveness is of course combat effectiveness, and for falsifiability
reasons we wish to avoid measuring the independent variable with the dependent
variable. We therefore take a more indirect approach. If the assumption in
Hypotheses 1a and 1b is correct, then the performance of a military organization
should improve if the organization’s leader has been replaced. One should note
the assumption that the new, replacement leader will be of higher quality than
the removed leader, an idea that was also explored in a formal model of civilian
leadership replacement (Meirowitz and Tucker 2013).
5 Very occasionally, IR scholars have observed in passing that Hitler appears not to have engaged in coup-proofing
(e.g., Castillo 2014, 56–57; Talmadge 2015, 31). This article presents the first rigorous empirical test of
the assertion that Hitler did not engage in at least one form of coup-proofing, namely making military leadership
demotion and promotion decisions on the basis of political reliability rather than performance.
Hypothesis 4: In wartime, when a military unit’s leader is replaced following poor combat per-
formance of the unit, the unit’s combat performance should improve.
Research Design
We test our hypotheses by collecting and analyzing new data on the command
tenure experiences of generals leading individual ground divisions in the
American and German armies fighting in the North African, Italian, and West
European theaters in World War II in the years 1941–1945. We focus on lower-
ranking generals (hereafter referred to simply as generals) who each led a single
division (very roughly ten thousand to thirty thousand soldiers). Divisions are
larger than regiments, battalions, brigades, companies, platoons, and squads and
are smaller than corps and armies. For World War II, divisions are preferable to
using smaller units, because there are fewer missing data for division-level perfor-
mance than there are for the performance of smaller units.6 There is a sample
size advantage in looking at divisions instead of larger units, as there were hundreds
of divisions in World War II, whereas corps numbered in the low dozens
and armies were fewer still. We focus on generals commanding
ground troops rather than naval admirals or air force generals.
We focus on World War II for several reasons. First, its scope and extensive his-
toriography provide data that are both comprehensive and high quality, relative
to other wars. Second, the length of the war facilitates testing our hypothesis that
past poor performance increases the likelihood of a general being replaced. In
shorter wars, such as the Six Day War or the 1991 Gulf War, combat does not last
long enough for a military to react to poor combat performance by replacing
leaders of poor-performing units. Third, World War II has been central to the
study of military effectiveness (e.g., Mearsheimer 1983; Biddle 2004; Castillo
2014). Relatedly, we contribute to the long-standing debate comparing American
and German military performance during World War II (see Van Creveld 1982,
Brown 1986, and Overy 1995), and specifically the debate about American
military leadership in World War II. Analyzing a small number of World War II
generals, Ricks (2012) speculated that the United States fought well in World War
II because it replaced low-performing generals with better-performing generals. It
is important to test the Ricks thesis rigorously, as he did not, and it does not re-
flect a consensus among historians. For example, Blumenson disagreed with the
view that the worst American generals were removed, observing instead that most
American decisions to relieve generals in World War II were “unwarranted if not
altogether unjustified” (quoted in Ricks 2012, 110). Fourth, because Germany was a
civilian dictatorship and the United States was not, this comparison provides the
variation in the independent variable needed to test the coup-proofing hypothesis (H2).
We do not include analysis of the German-Soviet World War II campaign.
Doing so would introduce substantial heterogeneity in both data missingness and
data quality across the Soviet and other campaigns (as described below, we code
combat outcomes for the North African, Italian, and European campaigns from a
single source, and that source does not cover the Soviet campaign). Excluding
the Eastern Front also allows us to maintain data homogeneity, as virtually all
combat in our data set is between German and American forces.
6 We drew this conclusion after inspecting combat records available at the National Archives II in College Park,
Maryland.
Focusing on leadership and combat dynamics in a single war provides advantages
over conducting empirical analysis on a data set of several wars. The sample
is large enough to permit quantitative analysis (hundreds of generals and thou-
sands of division-months), but limited enough to permit us to account for context
and to collect more fine-grained, higher-quality data. Context is especially critical
when assessing combat performance, as the meaning of success in combat can
vary substantially across wars, from killing the enemy to capturing territory to se-
curing control of the population (Reiter 2009, chapter 4). In these World War II
campaigns, combat success is generally about capturing territory, as Allied armies
were attempting to secure territory en route to intermediate objectives (such as
the capture of Rome and Paris), ultimately culminating with the capture of Berlin
(see Blanken and Lepore 2014). German forces were also attempting to control
territory to prevent Allied advances.
To test Hypotheses 1, 2, and 3, we use Cox event history models of each gener-
al’s command of a particular division. The “failure” event is the end of the gener-
al’s command of the division. A command demotion occurs any time a division
commander is relieved of command and given either a lower (smaller) combat
command, no command, or command of a military school. It is important here to
note two events that we do not count as command demotions. First, if a general
transitions to the general staff of a corps or army, this is not a demotion.
Transitioning to the staff of a larger unit signifies recognition of promise by a
higher-ranking officer and, potentially, signals being groomed for higher command.
Second, we do not code a general as being demoted if he was a temporary
acting commander who returned to his previous, lower-level command once a permanent
division commander was designated, since such appointments carry the
assumption that the appointee will return to a lower command. We employ competing
risks models to model separately the factors that make demotion more likely
and promotion more likely (Box-Steffensmeier and Jones 2004). These models
are appropriate in that they allow modeling different ways in which command tenures
end, and through “right-censoring” they provide a means of handling
command tenures whose end we cannot observe, whether because of transfer to
the Eastern Front or the end of the war.
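For readers less familiar with event history models, the competing risks setup described above is often written in a cause-specific Cox form. The expression below is the standard textbook formulation, not an equation printed in this article, and the symbols are notational assumptions introduced here for illustration.

```latex
h_k\bigl(t \mid \mathbf{x}_i(t)\bigr)
  = h_{0k}(t)\,\exp\bigl(\boldsymbol{\beta}_k^{\top}\mathbf{x}_i(t)\bigr),
  \qquad k \in \{\text{demotion},\ \text{promotion}\}
```

Here h_k is the hazard that general i's command of a division ends through outcome k at time t, h_{0k} is an unspecified baseline hazard, and x_i(t) collects covariates such as the division's recent combat performance. Under this cause-specific reading, a tenure that ends through the other outcome, or through a right-censoring event, is treated as censored when estimating the coefficients for outcome k. Other competing risks estimators, such as subdistribution hazard models, would be written differently.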
Figure 1 displays the number of generals demoted for each year the United
States and Germany participated in the war (for more detailed information on all
generals in the data set, see our Supplementary Codebook).
A general is coded as being promoted if he receives a higher combat command
(such as a corps) or is attached to the general staff of a higher command. Other
means of a general’s command ending include loss of command due to poor
health, injury, death, transfer to the Eastern Front, administrative dissolution of a
division, or end of the war. We treat those outcomes as right-censored, meaning
that we allow for the possibility that such generals would remain at risk of “failure”
in the absence of such events. Figure 2 displays the number of generals promoted
each year the United States and Germany participated in the war.
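The coding rules for how a command ends can be summarized as a simple lookup. The Python sketch below is illustrative only: the reason labels and the function name `code_command_end` are hypothetical names introduced here, not the authors' coding procedure (which is documented in their Supplementary Codebook). Temporary acting commanders are omitted because the text says only that their return to a lower command is not a demotion.

```python
# Illustrative sketch only: the reason labels and function name are
# hypothetical, chosen to mirror the coding rules described in the text.
DEMOTION = "demotion"
PROMOTION = "promotion"
CENSORED = "right-censored"

OUTCOME_BY_REASON = {
    # Demotion: relieved and given a lower command, no command, or a school.
    "lower_combat_command": DEMOTION,
    "no_command": DEMOTION,
    "military_school_command": DEMOTION,
    # Promotion: higher combat command or staff of a higher command.
    "higher_combat_command": PROMOTION,
    "staff_of_higher_command": PROMOTION,
    # All other endings are treated as right-censored.
    "poor_health": CENSORED,
    "injury": CENSORED,
    "death": CENSORED,
    "transfer_to_eastern_front": CENSORED,
    "division_dissolved_administratively": CENSORED,
    "end_of_war": CENSORED,
}


def code_command_end(reason: str) -> str:
    """Return the outcome category for the way a command ended."""
    if reason not in OUTCOME_BY_REASON:
        raise ValueError(f"unrecognized reason: {reason!r}")
    return OUTCOME_BY_REASON[reason]


print(code_command_end("staff_of_higher_command"))    # promotion
print(code_command_end("transfer_to_eastern_front"))  # right-censored
```

In a competing-risks setup of this kind, the demotion model would treat promotion as the competing event and vice versa, while right-censored spells contribute information only up to the month in which observation stops.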
One of our independent variables is combat performance of the division com-
manded by the general. Coding combat performance is difficult. There are per-
haps three different approaches to coding combat performance: coding territory
lost or gained, coding (balance of) casualties (Biddle 2004; Beckley 2010;
Cochran and Long forthcoming; Weisiger 2016), and a subjective coding of suc-
cess. Coding combat performance simply on the basis of territory lost or gained
can be potentially misleading. If a division on the offensive gains no territory it
should be viewed as unsuccessful, whereas a division on the defensive that gains
no territory but loses no territory should be viewed as successful.
Figure 1. Demotions per year
Figure 2. Promotions per year
Casualty data pose severe missing data problems, at least for the division level
of analysis. Our exploration of primary sources in the US National Archives
indicates that for American division-level casualties on a monthly basis in
World War II, about 40 percent of observations would be missing. Missing data
rates for German divisions would likely be even higher. Missing data issues aside,
there are conceptual problems with using casualty data, as in some circumstances
military forces may be willing to suffer casualties, or may need to suffer casualties,
in order to accomplish goals such as seizing or preventing the seizure of territory
(Reiter 2009, chapter 4; Grauer 2016). For example, in the first phase of the
highly successful 2007 change in American military strategy in the Iraq War, US
combat casualties went up as American forces took the necessary steps of increas-
ing their patrols among the population (Ricks 2009, 238).
Rather than focus on territory or casualties, we collected new data allowing us to
code each division’s combat performance on a monthly basis, accounting for that
division’s combat mission during that month. We drew these data from several sour-
ces, primarily the Green Book US government accounts of combat in World War II.7
The advantage of using the Green Book series is that it provides deep descriptions
of all American combat in the three theaters we examine. Further, because the
Green Books are produced as a single series by the US government, we can be more
confident in the consistency of the volumes’ treatment of combat across the course
of the war. Some might be concerned about an official publication like the Green
Books providing a pro-US tilt, but historians generally view the publications as being
reasonably unbiased, as they “set standards of military history research and writing
and have become basic documents of the American participation in World War II
. . . Overall, the official U.S. military histories, whether written in house, like those of
the army, or by outside contractors, have withstood the initial fastidious skepticism
of academics, and remain basic sources for the history of the United States war
effort” (Schliffer 2001, 233–34). If these publications did provide a pro-US tilt, any
introduced bias would be relatively limited, as our hypotheses do not test whether or
not American forces fought better than German forces, but rather whether variance
in combat performance within a division affected leadership turnover and whether
leadership turnover improves a division’s combat performance over time.
We use a three-category variable, 1/0/–1, of monthly combat performance
(the combat performance independent variable is the two-month moving aver-
age8 of each division’s monthly combat performance coding). A division receives
a 1 in a particular month for its combat performance if it enjoyed success in
achieving its combat goals. For divisions tasked with launching offensives, a divi-
sion is considered successful if it conquered net additional territory during that
month. For divisions tasked with defense, a division is coded 1 if the division suc-
cessfully blocked the adversary from gaining consequential territory, even if the
defending division itself did not capture any territory. For example, the German
29th Panzer Grenadier Division helped successfully contain Allied forces around
Anzio in early 1944 and received a 1 for those months.
A division gets coded as –1 in a month if it failed to achieve its combat goals.
Units that fail to achieve planned offensive objectives, such as the German 11th
Panzer Division failing at Remagen in March 1945 and the American 36th Infantry
Division failing at the Gari (Rapido) River in January 1944, are given codings of
–1. If a division is destroyed, such as the German 84th Division in Italy in March
1944, or dissolved due to poor performance, such as the German 92nd Infantry
Division in June 1944, it receives a –1.9 If a division suffered defeat and engaged
in a disorderly retreat, in the sense that the division suffered heavy casualties as it
retreated (such as the 1st SS Panzer Division retreating from the Falaise Pocket in
August 1944), it received a –1.
7 Sources for coding data on combat outcomes and the command tenures and outcomes of generals include the
following: Armed Forces Information School (1950); Blumenson (1993a, 1993b); Clarke and Smith (1993); Cole
(1993a, 1993b); Fisher (1993); Garland and Smythe (1993); Harrison (1993); Howe (1993); and MacDonald
(1993a, 1993b). For German combat outcomes, we rely upon Mitchum (2007).
8 The results are robust to using a three-month moving average (see Table A1, Supplementary Appendix). Using
three-month moving averages reduces the number of observations available to analyze. Using moving averages
also has the benefit of accounting for the commander’s overall performance. For example, a commander that
has a successful month and then an unsuccessful month is coded as having average performance in the two-
month moving average. Outlying performance is discounted further in the three-month moving averages where,
for example, two successful months followed by an unsuccessful month yield a coding that indicates that the
commander is relatively successful.
9 Note that we do not code commanders of divisions that are destroyed, surrender, or disband as receiving a demotion
despite not receiving a subsequent command. This provides a more conservative test of the hypotheses. The results
are robust to dropping the division-months in which the division was destroyed, surrendered, or disbanded.
A third possible coding is 0. A division gets a 0 if it does not fight in a particular
month. A unit engaging in orderly retreat, that is conceding territory without suf-
fering significant casualties, gets a 0. If a division surrendered without destruction
of its unit, a more common outcome at the very end of the war, it gets coded as 0.
It also gets coded as 0 if it fights but has mixed success in a month, such as a suc-
cessful defense and a failed offensive, or a successful defense followed later in the
month by retreat.10 Our Supplementary Codebook describes how we code each
individual observation.
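The moving-average step applied to the monthly 1/0/–1 codes can be sketched as follows. This is a minimal illustration, assuming a trailing window in which months without enough history are dropped; the text does not state how the first months of a series are handled, and the function name `moving_average` is introduced here for illustration.

```python
def moving_average(codes, window=2):
    """Trailing moving average of monthly combat codes (-1, 0, or 1).

    Assumption: one value is returned per month from the `window`-th month
    onward, so longer windows yield fewer usable observations.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(codes[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(codes))
    ]


# A successful month followed by an unsuccessful month averages to 0.
print(moving_average([1, -1]))               # [0.0]
# Two successes followed by a failure remain positive over three months.
print(moving_average([1, 1, -1], window=3))  # one value, approximately 0.33
```

The two calls reproduce the examples given in the article's note on moving averages: a success followed by a failure counts as average performance, while a single failure after two successes is discounted in the three-month window.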
The approach of focusing on the success or failure of each individual division
in each individual month enjoys some advantages over coding battle outcomes
(for quantitative studies using battle outcome data, see Reiter and Stam 2002,
chapter 3; Rotte and Schmidt 2003; Biddle and Long 2004; Ramsay 2008; and
Pilster and Böhmelt 2011).11 The battle outcomes approach means creating lists
of battles likely of asymmetric size and/or duration (Reiter 2009, chapter 4). The
only publicly available quantitative data set of battle outcomes for all battles going
back in time has been the Historical Evaluation and Research Organization
(HERO) battle data set. A number of observers have described flaws in the HERO
data (e.g., Brooks 2003; Desch 2008).12
The division-month approach avoids these problems by applying consistent
temporal and spatial limits. The division is a common organizational unit in mili-
tary commands of very roughly consistent size. Each division’s performance is
evaluated on the basis of a single and consistent measure of time, the month. It
also permits evaluating the performance of a division when it is fighting but not
involved in a major battle.
Testing Hypothesis 2 is empirically straightforward, as it simply forecasts differ-
ent relationships between combat outcomes and command tenure in the
American and German armies. Hypothesis 3 requires new data on the interper-
sonal relationships of military leaders. Perhaps one of the strongest interpersonal
networks within militaries is the set of officers who attended military academies.
We collected new data on the military academy attendance of all the American
generals in our data set. We coded a dichotomous variable 1 if the general
attended a military academy, and 0 otherwise. Almost all American generals in
the data set attending military academies attended West Point (one attended the
Virginia Military Institute). Proxying personal networks for the German sample
with academy attendance is inappropriate as the Germans did not have a stan-
dardized system of academy education during World War II as the Americans did.
We do include a variable coding whether a German general is a member of the
Schutzstaffel (SS). However, we recognize that SS generals may be less likely to be
demoted either because of network factors, coup-proofing factors (Hitler relied
on the SS to maintain internal political control), or combat performance factors
(SS divisions fought well).13 We coded a dichotomous variable 1 if the general
commanded an SS division, and 0 otherwise.
10 While admittedly blunt, the –1/0/1 coding scheme allows us to sidestep the issue of ordering monthly out-
comes that represent mixed performance. It would be difficult, for example, to distinguish a priori between a
month with a successful defense and failed offensive versus a month with a successful defense with a later disas-
trous retreat. We acknowledge also that any ordinal scheme of this nature requires arbitrary cutoffs. Having
three distinct values reduces the number of these difficult decisions we must make as compared to when there
are four or more values. The current coding scheme allows for this ambiguity while also identifying clearly
good (1) and poor (–1) performance. While casualty data are generally problematic, future work could utilize
primary source material to understand better how commanders viewed the tradeoff between territorial advance-
ment and the perceived costs to that advancement.
11 The month is the most appropriate time unit, as after-action reports are generally passed to higher-level com-
mands each month, rather than smaller or larger time units.
12 Biddle and Long (2004) and Cochran and Long (forthcoming) use a revised version of the HERO data set, but
their revisions strive to correct errors in coding outcomes and other issues and do not strive to address some of
the definitional and conceptual issues described here.
Results
The data set contains 1,703 general-month observations for 320 generals—91
American and 229 German—commanding 195 infantry, armored, and airborne divi-
sions—65 American and 130 German—in the North African, Italian, and Western
European theaters in April 1941–May 1945.14 The first division-month with combat
occurs in April 1941, and the last occurs in May 1945. Event history analysis evaluates
the factors that affect the duration of a certain spell, and for this study that spell
is the duration of a general’s command. Generals can have multiple commands
within the war (that is, multiple spells). We measure command duration in months,
and in our data set command duration ranges from one to twenty-five months, with
an average of 5.2 months. Table 1 provides summary statistics of the data.
In the data set a total of twenty-one generals—eight American and thirteen
German—lost their command through demotion. Eighteen German generals
ended their combat commands through promotion, and thirteen American gen-
erals ended their combat commands through promotion.
We analyze the full data set, and then the American and German generals in
separate models. We use robust standard errors, clustering on the general. All
results are robust to clustering on division (see Supplementary Appendix). We
use the Efron method for breaking ties. We exclude general-months in which the
general’s division did not fight or has not fought in the last two months.
Analysis of Schoenfeld residuals reveals that the German generals data set
contains a nonproportional hazard for the combat outcomes and the military
academy independent variables (Box-Steffensmeier, Reiter, and Zorn 2003). We
accordingly include time interactions for the Germany subsample.
The results for analysis of the full data set, the American generals subsample, and
the German generals subsample are displayed in Table 2. Note that we analyze sepa-
rate models for our two primary failure events, promotion and demotion. In the
promotion models, we set command demotion as the competing risk and vice versa.
Table 2 demonstrates that better combat performance significantly reduces the
chance that a general will be demoted, but has no significant effect on the chance
that a general will be promoted, across the full, American, and German samples.
This provides support for Hypothesis 1a, but not 1b—performance affects demo-
tion but not promotion. The null result for promotion may be because both mili-
taries wish to remove ineffective generals from their commands, but when a
general demonstrates competence, it may be more useful to the operational effort
to allow him to keep his command rather than promote him; perhaps combat-
proven generals are more greatly needed in battle rather than on a general staff.
13 A further issue is that SS membership is likely correlated with both personal networks (being a ranking mem-
ber of the Nazi party) and political loyalty.
14 Note that for 1941–1942, we only include German combat operations in North Africa; we exclude combat oper-
ations in Greece, Yugoslavia, and the Soviet Union.
Table 1. Summary statistics of combat performance and academy attendance
Variable      Statistic   Full Sample   US        Germany
Performance   Mean        0.359         0.666     0.137
              SD          0.536         0.409     0.507
              Min/Max     –1/1          –0.5/1    –1/1
              N           1508          631       887
Academy       Mean        –             0.560     –
              SD          –             0.5       –
              Min/Max     –             0/1       –
              N           –             693       –
SS            Mean        –             –         0.060
              SD          –             –         0.237
              Min/Max     –             –         0/1
              N           –             –         1007
Table 2. Cox models of combat command duration, US and Germany, 1941–1945
                        Model 1        Model 2        Model 3        Model 4        Model 5        Model 6
Sample                  Entire sample  Entire sample  US only        US only        Germany only   Germany only
“Failure” event         Demotion       Promotion      Demotion       Promotion      Demotion       Promotion
Combat performance      0.380**        0.886          0.143***       1.274          0.010***       1.159
                        (0.149)        (0.387)        (0.084)        (1.613)        (0.009)        (0.690)
Combat x time           –              –              –              –              2.08***        –
                                                                                    (0.376)
Military academy        –              –              0.884          0.850          –              –
                                                      (0.645)        (0.453)
SS                      –              –              –              –              0.000***       1.965
                                                                                    (0.000)        (1.928)
Observations            1218           1218           488            488            729            729
Log pseudo-likelihood   –64.33         –87.84         –19.96         –22.15         –29.15         –52.07
Notes: (1) Robust standard errors reported, clustered on command. (2) Hazard rates reported.
(3) Efron methods used for ties. (4) Statistical significance: *p<0.1, **p<0.05, ***p<0.01. (5) All signifi-
cance tests one-tailed.
The results also indicate that attending a military academy has no significant
effect on a general’s likelihood of promotion or demotion in the US subsample,
providing evidence against Hypothesis 3. Using an alternative measure of
interpersonal network, graduation from any college as a proxy for membership
in the network of the upper socioeconomic class (note that only 10 percent
of Americans held a college degree in 1920, around when World
War II American generals would have attended college), was also
insignificant.15 Recall that we do not estimate a model with military academy
attendance for the Germany sample because of the different nature of German military
academies. Instead we estimate models with a variable indicating whether the gen-
eral served in the SS. We find very strong evidence, both substantively and statisti-
cally, that SS membership significantly reduces the likelihood of demotion. This
lower likelihood of demotion could be because there was an interpersonal network
protecting SS generals, Hitler used the SS to safeguard his regime and for coup-
proofing reasons might have been hesitant to demote an SS general,16 or because
SS divisions enjoyed higher levels of combat performance. Unfortunately, we do
not currently have the data to conclude with confidence exactly why SS generals
were less likely to be demoted, in part because of the low number of SS generals.
We do not find a significant effect of SS membership on promotion.
We ran a number of robustness checks. First, we treat as censored the nine gen-
erals that were wounded prior to being demoted, to account for recuperation.
Second, we treat as censored three generals that historians indicate may have
been removed due to troubled relationships with superiors rather than combat
performance. Third, we drop those observations in which divisions are destroyed,
surrender, or disband. The results hold across these modifications.
15 See Table A5 (Supplementary Appendix).
16 However, though the SS protected Hitler’s regime especially in the 1930s, the SS military leadership grew to be
increasingly disenchanted with Hitler as the war unfolded. Some SS generals eventually planned to ignore
some of Hitler’s military orders and even cooperated with efforts to overthrow Hitler (Ripley 2004, 335–36).
One possible concern is that generals might be removed only after fighting has
ceased and not while fighting is still ongoing. However, the data do not indicate
such a relationship. There is clear evidence of this trend in only four (of the
twenty-one) cases: Jay Mackelvie, who commanded the American 90th Infantry
Division; Werner Goeritz, who commanded the German 92nd Infantry Division;
Erwin Sander, who commanded the German 245th Infantry Division; and
Eberhard von Schuckmann, who commanded the German 352nd Infantry (later
Volksgrenadier) Division. In only these examples is it the case that divisions took
a break from the front during the change of command.
An additional concern is that our monadic division-month approach does not ac-
count for which divisions are fighting each other (that is, we use division-monads
rather than division-dyads), and a division’s performance may be affected by the
quality of its opponent as well as the quality of its leader. Though using division-
dyads is conceptually appealing, collecting accurate data on division-dyads would
be extraordinarily difficult. Combat often means a jumble of one army’s divisions
fighting a jumble of another army’s divisions, making it difficult to discern discrete
dyads of exactly who is fighting whom. Relatedly, a single division may be spread
piecemeal across a front, with different elements of that one division fighting sev-
eral enemy divisions. The Battle of the Bulge, for example, involved large numbers
of American and German troops. Even attempting to code a small portion of this
battle, such as the Battle of St. Vith, would be problematic, as elements of three
American divisions fought elements of four German divisions.
The short temporal window addresses this problem at least in part by increasing
the likelihood that both the removed general and his replacement will face the same
adversary. We also take other steps. We control for the front and month of the war
(Supplementary Table A6, see Supplementary Appendix). This will capture whether
any month or front was particularly difficult, perhaps because a division is facing a
particularly tough opponent (e.g., facing German divisions commanded by General
Rommel in North Africa in 1942–1943). The results are robust to the inclusion of
these controls. Unfortunately, our sample size does not allow including dummies for
each month of the war, which would more flexibly capture monthly variation.
Finally, some may be concerned about potential bias from excluding the Eastern
Front in the analysis. It could be the case that underperforming German generals
were punished with a transfer or that competent German generals were rushed to the
Eastern Front as it crumbled. The data do not bear this out: there is no difference between
the performance of German generals who were transferred and those who were not.
The coefficients presented in Table 2 are difficult to interpret directly because
they present the subhazard ratios estimated in each model. Numbers less than one
(and bounded by zero) indicate that increases in the independent variable make
the failure outcome less likely. In Model 1, for example, the outcome of interest is
command demotion. The coefficient on combat outcome is less than one, which
indicates that as combat performance improves, the commander is less likely to suf-
fer demotion. To assess how different levels of combat performance influence the
probability of command demotion over time, we plot the cumulative incidence
functions17 over the number of analysis months and present the results in Figures 3–5.
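A short worked calculation may help in reading the ratios. The sketch below assumes the standard proportional-hazards interpretation, in which a k-unit increase in a covariate multiplies the (sub)hazard by the ratio raised to the power k; the function names are introduced here for illustration. It uses the Model 1 estimate for combat performance (0.380) from Table 2. It does not apply directly to Model 5, where the time interaction means the ratio changes over the course of a command.

```python
def hazard_multiplier(ratio: float, units: float = 1.0) -> float:
    """Factor by which the (sub)hazard is multiplied for a `units` increase."""
    return ratio ** units


def percent_change(ratio: float, units: float = 1.0) -> float:
    """Percent change in the (sub)hazard for a `units` increase."""
    return (hazard_multiplier(ratio, units) - 1.0) * 100.0


MODEL_1_COMBAT = 0.380  # combat performance, Model 1 (demotion), Table 2

# One-unit improvement in the moving-average performance score.
print(round(percent_change(MODEL_1_COMBAT), 1))           # -62.0
# Moving from consistently poor (-1) to consistently good (1): two units.
print(round(hazard_multiplier(MODEL_1_COMBAT, 2), 4))      # 0.1444
```

Read this way, a one-point improvement in performance is associated with a demotion subhazard roughly 62 percent lower, holding the other terms in the model fixed.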
17 Cumulative incidence functions display the probability that the individual “fails” (here, is demoted) in time pe-
riod t conditional on surviving to period t. We exclude the Germany sample. STATA 14 required us to manipu-
late the data and run a Cox proportional hazard model in order to plot the cumulative hazard functions. We
manipulated the data in a way such that the cumulative hazard functions for Germany are functionally equiva-
lent to the cumulative incidence functions for the full and American samples.
Figure 3. Competing-risks regression (full sample)
Figure 4. Competing-risks regression (USA)
Figure 3 examines the effect of different combat outcomes over time on the
likelihood of suffering a command demotion. Notably, poor performance
(combat outcome = –1) is always associated with the greatest likelihood of demotion.
As performance improves, the likelihood of demotion decreases. Good
performance (combat outcome = 1) is associated with the smallest likelihood of
demotion. Figures 4 and 5 describe the effects of combat performance on the
American and German samples, respectively. Analysis indicates that there is no
statistically significant difference between the effect of combat performance on
the likelihood of demotion in the American versus German generals, providing
evidence against Hypothesis 2, though American generals appear more likely to
be demoted regardless of performance.18
Figure 5. Cox PH regression (Germany)
It is possible that the relationship between combat performance and leadership
tenure may have changed as the war endured. Specifically, it may be that in the
later years of the war as the German war effort was collapsing, German army
underperformance would be more likely to be forgiven, and the relationship be-
tween poor combat performance and leader demotion might have weakened.19
To examine this possibility, we reran three variants of Model 5 from Table 2: (1)
just observations from 1944 and 1945, (2) just observations from 1944, and (3)
just observations from 1945. We find that the results hold in these subsamples,
casting doubt on the speculation that the combat-command tenure relationship
changed in the German army as the war endured.
Some might argue that selection effects contaminate our ability to draw
inferences about the effects of military academy attendance on likelihood of pro-
motion or demotion. The assignment of the “treatment” of military academy
attendance is not random, and several possible dynamics could be at play. For ex-
ample, individuals with more intrinsic aptitude might be more likely to attend
military academies. We do not have the data to model factors that affect decisions
to attend military academies, nor do we have the data to permit a matching strat-
egy to get at this threat to causal inference. One possible solution to this threat to
inference is to evaluate the interactive effect between combat performance and
military academy attendance. That is, conditional on combat performance, does
military academy attendance affect the likelihoods of promotion and demotion?
Put differently, does military academy attendance insulate a poorly performing
general from demotion, and does military academy attendance mean that a high-
performing general enjoys an especially high likelihood of promotion?
18 Recent work (Reiter 2016) points out that there are a number of coup-proofing strategies available to leaders
and not all of them entail diminishing military effectiveness. This null finding adds further credence to such
claims and points to the importance of exploring more carefully the relationship between coup-proofing and
subsequent combat performance in other settings.
19 Note that this is different from the possibility of a nonproportional hazard, in which the relationship between
combat and leadership demotion changes as the number of months of the general’s tenure increases.
We analyzed models including an interaction of combat outcome by military academy
in the American subsample. The results fail to find that military academy attendance
influences the effect of combat outcome on demotion and promotion.
Thus far, we have examined the determinants of command demotion and pro-
motion, demonstrating that poor command performance increases the likelihood
of being removed from command. We next test Hypothesis 4, asking, if a low-per-
forming commanding general is replaced, does the performance of the division,
under the command of the new general, improve? We explore this possibility in
two empirical tests, using regression. In the first, we compare two sets of divisions:
divisions that are led by generals who replaced a general who was removed for low
performance, and divisions that are led by generals who replaced a general who
was removed for some other reason. The dependent variable is the change in the
division’s performance from just before the general’s removal (whether for low
performance or some other reason) to four months after the transition (allowing
time for the new general to settle into command). The dependent variable ranges
from –1 to 2.20 If Hypothesis 4 is correct, then divisions that experienced a transi-
tion in command because of low performance should see a bigger increase in
combat performance from before to after the transition, as compared with divi-
sions that experienced a transition for some other reason. Table 3 presents this
analysis, with the independent variable being a dichotomous variable coded 1 if a
general was replaced following low performance.
That variable is positively signed and statistically significant, providing support
for Hypothesis 4. Replacement moves combat performance a full point on the
three-point scale of –1 to 2.
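The structure of this first test can be sketched in a few lines. With a single 0/1 regressor, ordinary least squares reduces to a comparison of group means: the constant is the mean change in the comparison group and the coefficient is the difference between the two group means. The data below are invented toy values for illustration only and are not drawn from the article's data set; the function name `dummy_regression` is likewise an assumption.

```python
from statistics import mean


def dummy_regression(changes, replaced_for_low_performance):
    """OLS of performance change on one 0/1 dummy, via group means.

    Returns (constant, coefficient): the mean change when the dummy is 0,
    and the difference in mean change between dummy = 1 and dummy = 0.
    """
    pairs = list(zip(changes, replaced_for_low_performance))
    treated = [c for c, d in pairs if d == 1]
    control = [c for c, d in pairs if d == 0]
    if not treated or not control:
        raise ValueError("both groups must be non-empty")
    constant = mean(control)
    return constant, mean(treated) - constant


# Invented toy data (NOT from the article):
# (performance before removal, performance four months after, dummy).
toy = [(-1, 1, 1), (-1, 0, 1), (0, 1, 0), (1, 1, 0), (0, 0, 0)]
changes = [after - before for before, after, _ in toy]
dummies = [d for _, _, d in toy]

constant, coefficient = dummy_regression(changes, dummies)
print(constant, coefficient)
```

A positive coefficient in such a setup indicates a larger improvement after replacements that followed low performance than after replacements made for other reasons.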
One possible concern about the test described in Table 3 is that the observed
improvement in performance when a low-performing general gets replaced could
be regression to the mean and that the observed correlation between the replace-
ment of a low-performing general and an improvement in performance is spurious
rather than causal (Kahneman 2011). We assess the possibility that the observed im-
provement in performance following replacement of a low-performing general is
spurious by comparing two groups. The first group includes divisions four months
after a general was replaced following low performance. The second group includes
divisions some four months after a low level of performance was observed (that is,
performance at the same low level as those divisions that experienced commander
replacement), but only those divisions that did not experience replacement. If the
observed increase in performance reported in Table 3 is spurious, then we should
see no difference between the two groups, as a division’s performance four months
after poor performance will be the same whether or not its general was replaced.
But, if replacing a poor-performing general does cause an improvement in performance,
then we should observe a difference between the two groups. Table 4
describes this analysis. The positive and significant coefficient for the replacement
variable suggests that we can be more confident that the observed relationship in
Table 3 is not spurious. Instead, the replacement of generals did lead to higher per-
formance, as forecasted in Hypothesis 4.
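The regression-to-the-mean concern can be illustrated with a purely synthetic simulation. The sketch below assumes, for illustration only, that each division's monthly code is an independent uniform draw from –1, 0, and 1, so that commanders have no effect at all; this is not a model of the article's data. Divisions selected for an initial poor month still "improve" on average at follow-up, which is why a comparison group of poorly performing divisions that kept their commanders is needed.

```python
import random


def mean_follow_up_after_poor_month(n_divisions=10_000, seed=0):
    """Mean later code among divisions whose first code was -1.

    Assumption: both codes are independent uniform draws from (-1, 0, 1),
    i.e., pure noise with no commander or replacement effect.
    """
    rng = random.Random(seed)
    codes = (-1, 0, 1)
    follow_ups = []
    for _ in range(n_divisions):
        first = rng.choice(codes)
        later = rng.choice(codes)  # independent of the first draw
        if first == -1:
            follow_ups.append(later)
    return sum(follow_ups) / len(follow_ups)


# Selected divisions start at -1 but drift back toward the overall mean of 0.
print(mean_follow_up_after_poor_month())
```

Because the simulated improvement arises without any replacement, only a difference between replaced and non-replaced divisions at the same initial performance level, as in Table 4, speaks to the effect of replacement itself.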
The results are robust to including a number of control variables
(Supplementary Tables A7 and A8, see Supplementary Appendix). We rerun the
regressions in Tables 3 and 4 with a host of biographical information, including
whether or not the commander attended a military academy, commanded a bat-
talion, or commanded a regiment. We also control for the month of the war and
the front in which the division is fighting. The results are robust to these
specifications.
20 A value of 2 means that the division transitions from poor performance (combat outcome = –1) to good
performance (combat outcome = 1). While –2 is also theoretically possible, there is no instance where the division
transitions from good performance to poor performance.
Table 3. Regression analysis of the effects of general replacement on combat performance: comparing
generals of similar command tenure
                                         Model 7
Replacement following low performance    1.01***
                                         (0.11)
Constant                                 0.99***
                                         (0.11)
Observations                             69
R squared                                0.02
Notes: (1) Significance tests are one-tailed. (2) Both coefficients significant at the 0.001 level. (3) Robust
standard errors clustered on general reported in parentheses.
Table 4. Regression analysis of the effects of general replacement on combat performance: comparing
divisions after poor performance is observed
                Model 8
Replacement     0.48***
                (0.181)
Constant        0.27***
                (0.086)
Observations    71
R squared       0.05
Notes: (1) Significance tests are one-tailed. (2) Statistical significance: *p<0.1, **p<0.05, ***p<0.01.
(3) Robust standard errors clustered on general reported in parentheses.
Finally, one may be concerned that the improvement in combat performance is
instead the result of the division receiving replacement troops, equipment, or
rest. We examined what divisions tended to be doing during the month their
commanders lost command. In only four cases do the divisions appear to be not
fighting when their commander is relieved (Jay Mackelvie, Werner Goeritz, Erwin
Sander, and Eberhard von Schuckmann). Two of these are not included in the
analysis because the division dissolved after the commander’s relief (Goeritz) or the relief
occurred just before the war’s end (Sander). While it is certainly possible for a division to receive
replacements while fighting, we, unfortunately, do not have sufficient data to ex-
amine these possibilities in greater detail.
Conclusions
This article has presented rigorous empirical evidence for the importance of a
previously underappreciated determinant of military effectiveness, military leader-
ship. In our empirical sample, militaries improved combat performance by replacing
poorly performing generals. The findings also indicate that interpersonal networks
did not prevent the American military from replacing underperforming generals,
and political loyalty considerations did not prevent the German military from replacing
underperforming generals. These findings provide a more comprehensive portrait
of how militaries try to improve their performance: they replace leaders as well
as change strategy and adopt new technology. The article also provides some support
for a more Weberian view of military organizations, at least for those militaries ana-
lyzed here, in contrast to the perspectives provided by some theories of military
organizations portraying them as dysfunctional, hidebound entities.
Our article is only a first step in understanding the causes and effects of
military leadership. We present three suggestions for future research. First, as
noted, militaries can improve effectiveness by replacing leaders, changing
strategy, and/or adopting new technology. This article focused on changing
leadership, and future work can develop and test theory that considers how
organizations choose among the options of changing leadership, strategy, and/or
technology as means of improving effectiveness.
Second, future work should explore cases beyond the United States and
Germany in World War II. This is not just a matter of exploring the external
validity of this article’s findings. Examining other cases would improve our
understanding of the scope conditions of our theory, in particular by exploring
conditions in which militaries might not replace low-performing leaders.
Dictators facing lower levels of external threat in relation to internal threat
(Talmadge 2015) or lacking possession of other coup-proofing tools like bribery
or indoctrination (Reiter 2016) might be less motivated to replace
low-performing military leaders. Careerism and other trends even within
democratic militaries like that of the United States can make it more difficult
to replace even low-performing leaders (Ricks 2012). Future work needs to
develop further theoretical expectations as to why and when militaries might be
less likely to replace low-performing leaders and then test these expectations
on appropriate empirical domains.
Third, future work can consider more carefully the interplay between different
levels of leadership within military organizations. The performance of
low-ranking generals is affected by leadership at higher and lower ranks, as
higher-ranking generals make decisions about the deployment of individual
divisions, and lower-ranking officers must implement the generals’ orders.
Future work can explore the interplay between these leadership levels. One
promising avenue is the exploration of information flows between levels of the
organizational hierarchy and how parochial incentives can distort these flows
and, in turn, the accuracy of inferences the high command tries to make about
whether or not to replace lower-ranking officers (see Wagstaff 2016).
Supplemental Information
Supplemental information is available at the Foreign Policy Analysis data archive.
References
ALBRECHT, HOLGER. 2015. “The Myth of Coup-Proofing: Risk and Instances of Military Coups d’état in
the Middle East and North Africa, 1950-2013.” Armed Forces & Society 41 (4): 659–87.
ALLEN, SUSAN HANNAH, AND TIFFINY VINCENT. 2011. “Bombing to Bargain? The Air War for Kosovo.”
Foreign Policy Analysis 7 (1): 1–26.
ARMED FORCES INFORMATION SCHOOL. 1950. The Army Almanac: A Book of Facts Concerning the Army of the
United States. Washington: US Government Printing Office.
AVANT, DEBORAH. 2007. “Political Institutions and Military Effectiveness: Contemporary United States
and United Kingdom.” In Creating Military Power: The Sources of Military Effectiveness, edited by Risa
A. Brooks and Elizabeth A. Stanley, 80–104. Stanford, CA: Stanford University Press.
BECKLEY, MICHAEL C. 2010. “Economic Development and Military Effectiveness.” Journal of Strategic
Studies 33 (1): 43–79.
BIDDLE, STEPHEN. 2004. Military Power: Explaining Victory and Defeat in Modern Battle. Princeton, NJ:
Princeton University Press.
BIDDLE, STEPHEN, AND ROBERT ZIRKLE. 1996. “Technology, Civil-Military Relations, and Warfare in the
Developing World.” Journal of Strategic Studies 19 (2): 171–212.
BIDDLE, STEPHEN, AND STEPHEN LONG. 2004. “Democracy and Military Effectiveness: A Deeper Look.”
Journal of Conflict Resolution 48 (4): 525–46.
BLANKEN, LEO J., AND JASON J. LEPORE. 2014. “Performance Measurement in Military Operations:
Information Versus Incentives.” Defence and Peace Economics 26 (5): 516–35.
BLUMENSON, MARTIN. 1993a. Breakout and Pursuit. Washington: US Army.
———. 1993b. Salerno to Cassino. Washington: US Army.
BOX-STEFFENSMEIER, JANET M., AND BRADFORD S. JONES. 2004. Event History Modeling: A Guide for Social
Scientists. Cambridge, MA: Cambridge University Press.
BOX-STEFFENSMEIER, JANET M., DAN REITER, AND CHRISTOPHER ZORN. 2003. “Nonproportional Hazards and
Event History Analysis in International Relations.” Journal of Conflict Resolution 47 (1): 33–53.
BROOKS, RISA A. 1998. Political-Military Relations and the Stability of Arab Regimes. New York: Oxford
University Press.
———. 2003. “Making Military Might: Why Do States Fail and Succeed? A Review Essay.” International
Security 28 (2): 149–91.
BROOKS, RISA A., AND ELIZABETH STANLEY, eds. 2007. Creating Military Power: The Sources of Military
Effectiveness. Stanford, CA: Stanford University Press.
BROWN, JOHN SLOAN. 1986. “Colonel Trevor N. Dupuy and the Mythos of Wehrmacht Superiority: A
Reconsideration.” Military Affairs 50 (1): 16–20.
BUENO DE MESQUITA, BRUCE, ALASTAIR SMITH, RANDOLPH M. SIVERSON, AND JAMES D. MORROW. 2003. The
Logic of Political Survival. Cambridge, MA: MIT Press.
BURT, RONALD S. 1992. Structural Holes: The Social Structure of Competition. Cambridge, MA: Harvard
University Press.
CASTILLO, JASEN. 2014. Endurance and War: The National Sources of Military Cohesion. Stanford, CA:
Stanford University Press.
CHIOZZA, GIACOMO, AND H. E. GOEMANS. 2011. Leaders and International Conflict. Cambridge, MA:
Cambridge University Press.
CLARKE, JEFFREY, AND ROBERT ROSS SMITH. 1993. Riviera to the Rhine. Washington: US Army.
COCHRAN, KATHERINE MCNABB, AND STEPHEN B. LONG. Forthcoming. “Measuring Military Effectiveness:
Loss Exchange Ratios for Multilateral Interstate Wars, 1816–1990.” International Interactions.
COLE, HUGH. 1993a. The Ardennes: Battle of the Bulge. Washington: US Army.
———. 1993b. The Lorraine Campaign. Washington: US Army.
DESCH, MICHAEL C. 2008. Power and Military Effectiveness: The Fallacy of Democratic Triumphalism.
Baltimore, MD: Johns Hopkins University Press.
EVANS, DAVID. 1992. “A New Way to Train Military Officers.” Baltimore Sun, February 18.
FARWELL, BYRON. 2001. The Encyclopedia of Nineteenth-Century Land Warfare: An Illustrated World View.
New York: Norton.
FEST, JOACHIM. 1994. Plotting Hitler’s Death: The Story of German Resistance. Translated by Bruce Little.
New York: Metropolitan Books.
FISHER, ERNEST, JR. 1993. Cassino to the Alps. Washington: US Army.
GARLAND, ALBERT N., AND HOWARD SMYTHE. 1993. The Last Offensive. Washington: US Army.
GARTNER, SCOTT SIGMUND. 1997. Strategic Assessment in War. New Haven, CT: Yale University Press.
GARTNER, SCOTT SIGMUND, AND MARISSA EDSON MYERS. 1995. “Body Counts and ‘Success’ in the Vietnam
and Korean Wars.” Journal of Interdisciplinary History 25 (3): 377–95.
GELPI, CHRISTOPHER, AND PETER D. FEAVER. 2005. Choosing Your Battles: American Civil-Military Relations
and the Use of Force. Princeton, NJ: Princeton University Press.
GOEMANS, H. E. 2000. War and Punishment: The Causes of War Termination and the First World War.
Princeton, NJ: Princeton University Press.
GRAUER, RYAN. 2016. Commanding Military Power. Cambridge, MA: Cambridge University Press.
GRAUER, RYAN, AND MICHAEL C. HOROWITZ. 2012. “What Determines Military Victory? Testing the
Modern System.” Security Studies 21 (1): 83–112.
HARKNESS, KRISTIN A. 2016. “The Ethnic Army and the State: Explaining Coup Traps and the
Difficulty of Democratization in Africa.” Journal of Conflict Resolution 60 (4): 587–616.
HARRISON, GORDON. 1993. Cross-Channel Attack. Washington: US Army.
HOLLIBAUGH, GARY E., JR., GABRIEL HORTON, AND DAVID E. LEWIS. 2014. “Presidents and Patronage.”
American Journal of Political Science 58 (4): 1024–42.
HOROWITZ, MICHAEL C. 2010. The Diffusion of Military Power: Causes and Consequences for International
Politics. Princeton, NJ: Princeton University Press.
HOROWITZ, MICHAEL C., ALLAN C. STAM, AND CALI M. ELLIS. 2015. Why Leaders Fight. Cambridge, MA:
HOWE, GEORGE F. 1993. Northwest Africa: Seizing the Initiative in the West. Washington: US Army.
JANOWITZ, MORRIS. 1960. The Professional Soldier: A Social and Political Portrait. Glencoe, IL: Free Press.
KAHNEMAN, DANIEL. 2011. Thinking, Fast and Slow. New York: Farrar, Straus and Giroux.
KIM, INSOO, AND TYLER CRABB. 2014. “Collective Identity and Promotion Prospects in the South Korean
Army.” Armed Forces & Society 40 (2): 295–309.
LIEBERSON, STANLEY, AND JAMES F. O’CONNOR. 1972. “Leadership and Organizational Performance:
A Study of Large Corporations.” American Sociological Review 37 (2): 117–30.
MACDONALD, CHARLES. 1993a. The Last Offensive. Washington: US Army.
———. 1993b. The Siegfried Line Campaign. Washington: US Army.
MACHIAVELLI, NICCOLÒ. 1999. The Prince. New York: Signet.
MCPHERSON, JAMES M. 1988. Battle Cry of Freedom: The Civil War Era. New York: Ballantine Books.
MEARSHEIMER, JOHN J. 1983. Conventional Deterrence. Ithaca, NY: Cornell University Press.
MEIROWITZ, ADAM, AND JOSHUA A. TUCKER. 2013. “People Power or a One-Shot Deal? A Dynamic Model
of Protest.” American Journal of Political Science 57 (2): 478–90.
MITCHUM, SAMUEL W., JR. 2007. German Order of Battle, 3 vols. Mechanicsburg, PA: Stackpole Books.
MOORE, DAVID W., AND B. THOMAS TROUT. 1978. “Military Advancement: The Visibility Theory of
Promotion.” American Political Science Review 72 (2): 452–68.
OVERY, RICHARD. 1995. Why the Allies Won. New York: Norton.
PECK, B. MITCHELL. 1994. “Assessing the Career Mobility of U.S. Army Officers: 1950-1974.” Armed
Forces & Society 20 (2): 217–37.
PILSTER, ULRICH, AND TOBIAS BÖHMELT. 2011. “Coup-Proofing and Military Effectiveness in Interstate
Wars.” Conflict Management and Peace Science 28 (4): 331–50.
———. 2012. “Do Democracies Engage in Less Coup-Proofing? On the Relationship between Regime
Type and Civil-Military Relations.” Foreign Policy Analysis 8 (4): 355–72.
POLLACK, KENNETH M. 2002. Arabs at War: Military Effectiveness, 1948–1991. Lincoln, NE: University of
Nebraska.
POSEN, BARRY R. 1984. The Sources of Military Doctrine: France, Germany, and Britain Between the World
Wars. Ithaca, NY: Cornell University Press.
———. 1993. “Nationalism, the Mass Army, and Military Power.” International Security 18 (2): 80–124.
POWELL, JONATHAN. 2012. “Determinants of the Attempting and Outcome of Coups d’état.” Journal of
Conflict Resolution 56 (6): 1017–40.
QUINLIVAN, JAMES T. 1999. “Coup-Proofing: Its Practice and Consequences in the Middle East.”
International Security 24 (2): 131–65.
RAMSAY, KRISTOPHER W. 2008. “Settling it on the Field: Battlefield Events and War Termination.”
Journal of Conflict Resolution 52 (6): 850–79.
REITER, DAN. 2009. How Wars End. Princeton, NJ: Princeton University Press.
———. 2016. “Choosing the Tools of Coup-Proofing: The Puzzle of Nazi Germany.” Unpublished
manuscript, Emory University.
———. ed. 2017. The Sword’s Other Edge: Tradeoffs in the Pursuit of Military Effectiveness. Cambridge:
REITER, DAN, AND ALLAN C. STAM. 2002. Democracies at War. Princeton, NJ: Princeton University Press.
RICKS, THOMAS E. 2009. The Gamble: General Petraeus and the American Military Adventure in Iraq. New
York: Penguin.
———. 2012. The Generals: American Military Command from World War II to Today. New York: Penguin.
RIPLEY, TIM. 2004. Hitler’s Praetorians: The History of the Waffen-SS, 1925-1945. Staplehurst, UK:
Spellmount.
ROSEN, STEPHEN PETER. 1991. Winning the Next War: Innovation and the Modern Military. Ithaca, NY:
Cornell University Press.
———. 1996. Societies and Military Power: India and Its Armies. Ithaca, NY: Cornell University Press.
———. 2005. War and Human Nature. Princeton, NJ: Princeton University Press.
ROTTE, RALPH, AND CHRISTOPH SCHMIDT. 2003. “On the Production of Victory: Empirical Determinants
of Battlefield Success in Modern War.” Defence and Peace Economics 14 (3): 175–92.
RWENGABO, SABASTIANO. 2013. “Regime Stability in Post-1986 Uganda: Counting the Benefits of Coup-
Proofing.” Armed Forces & Society 39 (3): 531–59.
SCHLIFFER, JOHN. 2001. “History Program, U.S. Army Military.” In World War II in the Pacific: An
Encyclopedia, edited by Stanley Sandler, 232–34. New York: Garland.
SCHWIND, DAVID A., AND JANICE H. LAURENCE. 2006. “Raising the Flag: Promotion to Admiral in the
United States Navy.” Military Psychology 18 (Suppl): S83–S101.
SEGAL, DAVID R. 1967. “Selective Promotion in Officer Cohorts.” Sociological Quarterly 8 (2): 199–205.
SNYDER, JACK. 1984. The Ideology of the Offensive: Military Decision-Making and the Disasters of 1914. Ithaca,
NY: Cornell University Press.
STAM, ALLAN C. III. 1996. Win, Lose, or Draw: Domestic Politics and the Crucible of War. Ann Arbor:
University of Michigan Press.
SUN TZU. 1963. The Art of War. Translated by Samuel B. Griffith. London: Oxford University Press.
TALMADGE, CAITLIN. 2015. The Dictator’s Army: Battlefield Effectiveness in Authoritarian Regimes. Ithaca, NY:
Cornell University Press.
TARAKCI, MURAT, LINDRED L. GREER, AND PATRICK J. F. GROENEN. 2016. “When Does Power Disparity Help
or Hurt Group Performance?” Journal of Applied Psychology 101 (3): 415–29.
VAN CREVELD, MARTIN. 1982. Fighting Power: German and U.S. Army Performance, 1939–1945. Westport,
CT: Greenwood Press.
———. 1985. Command in War. Cambridge, MA: Harvard University Press.
VAN EVERA, STEPHEN. 1999. Causes of War: Power and the Roots of Conflict. Ithaca, NY: Cornell University
Press.
VERWIMP, PHILIP, PATRICIA JUSTINO, AND TILMAN BRÜCK. 2009. “The Analysis of Conflict: A Micro-Level
Perspective.” Journal of Peace Research 46 (3): 307–14.
VOLDEN, CRAIG, AND ALAN E. WISEMAN. 2014. Legislative Effectiveness in the United States Congress.
Cambridge, MA: Cambridge University Press.
WAGSTAFF, WILLIAM A. 2016. “Organizing Evaluation: Assessing Combat Leadership Quality.” Presented
at the Annual Meeting of the Peace Science Society (International), Oxford, MS.
WEBER, MAX. 1978. Economy and Society. Edited by Guenther Roth and Claus Wittich. Berkeley: University
of California Press.
WEISIGER, ALEX. 2016. “Learning from the Battlefield: Information, Domestic Politics, and Interstate
War Duration.” International Organization 70 (2): 347–75.
WOLFORD, SCOTT. 2007. “The Turnover Trap: New Leaders, Reputation, and International Conflict.”
American Journal of Political Science 51 (4): 772–88.