Chapter 33
PROBABILITY IN ETHICS
David McCarthy
Ethics is mainly about what we ought to do, and about when one situation is better than
another. But facing uncertainty about the consequences of our actions, and about how
situations will evolve, is an all-pervasive feature of our condition. Should this not be a central
topic in ethical theory?
Probability is by far the best-known tool for thinking about uncertainty, a well-known
aphorism telling us that it is the very guide to life. But despite important exceptions, it is
easy to get the impression that mainstream moral philosophy has not been much concerned
with probability.
This reflects what seems to be a natural division of labour. The most fundamental
questions for ethical theory seem to arise in the absence of uncertainty. For example, it
seems hard to believe that the questions of whether it is better to give priority to the worse
off, and of whether we ought to favour our nearest and dearest, have anything to do with
uncertainty. Many influential discussions of these topics never mention uncertainty.
Of course, once answers to these fundamental questions are in, we can try to extend them
to cases involving uncertainty. But ethical theorists may seem well advised to hand this task
over to others, given how mathematical the various disciplines concerned with probability
have become. Technically and philosophically interesting as it may be, the extension of
central ethical ideas to problems involving probability seems to be outside the main business
of ethical theory.
This chapter will argue for the opposite view. The major ethical problems to do with
probability involve very little mathematics to appreciate; many topics which do not seem to
have anything to do with probability are arguably all about probability; and thinking about
various problems to do with probability can help us solve analogous problems which do not
involve probability, sometimes even revealing that popular positions about such problems
are incoherent.
Almost every topic discussed here could easily be given its own survey article, and
an adequate bibliography would exceed the space allotted for the whole chapter. Positive
positions are often argued for sketchily, many important positions on each topic are
neglected, and some major topics are not discussed at all.
Instead, the goal is to offer enough breadth to illustrate some ways in which questions
about probability run systematically throughout ethical theory, while in places going into
enough depth to articulate some surprising and potentially important applications. In brief,
what follows is much less a survey of or an argument for particular positions than a plea for
ethical theory to take probability more seriously.
I said that ethics is largely about what we ought to do, and when one situation is better
than another. Some say that rationality is about these things as well. Given that theories of
rationality in the face of uncertainty are highly developed, it might be thought that an appeal
to these theories of rationality straightforwardly solves ethical problems about probability.
This line of thought is importantly mistaken. First, Hume famously claimed that it is not
irrational for an agent to prefer the destruction of the whole world to the scratching of his
finger. Nor would it be irrational for the agent to bring about the destruction to avoid the
scratching. But the destruction is neither better than the scratching, nor better for the agent.
And the agent surely ought not to bring about the destruction. On at least one widely-held
view, therefore, ethics and rationality are not about the same things.
Secondly, it is undeniable that contemporary theories of rationality are an indispensable
resource for thinking about ethics and probability. However, whether and how to apply
these theories to ethics is far from straightforward, and will be one of the principal concerns
of this chapter. Furthermore, in my view, at least, appeals to rationality are almost always
epiphenomenal. For example, suppose we have a convincing argument for the claim that
rational preferences have such and such a structure. We could then try to claim that an
evaluative relation like betterness has to have that structure on the grounds that a rational
agent can surely prefer what’s better to what’s worse. However, it is almost always less
committal and more direct just to modify the original argument to make it apply directly to
the structure of the evaluative relation. Claims about rationality often have historical priority
over parallel claims about ethics, but I believe they do not have any kind of important
conceptual priority.
The chapter starts with four sections which discuss which probabilities are relevant to
ethics, establish terminology, and rehearse expected utility theory. It then turns to the
evaluative question of when one situation is better than another, focusing on the question of
when one distribution of goods is better than another. Sections . and . discuss popular
but I think inadequate approaches to this question. These serve as a backdrop to a hugely
important theorem due to Harsanyi () introduced in section .. Sections . to
. discuss such things as the relationship between Harsanyi’s theorem and utilitarianism;
criticisms of Harsanyi’s premises and the relationship of these criticisms to other distributive
views such as egalitarianism, the priority view, and concerns with fairness; the extension
of Harsanyi’s theorem to problems of population size; incommensurability; continuity;
non-expected utility theory; evaluative measurement; and the question of what Harsanyi’s
theorem really shows about aggregation. These sections also list various open problems and
directions for further work. All of these topics have to do with probability.
One of the benefits of thinking about Harsanyi’s theorem is the way it helps us organize
our thinking about all sorts of fundamental evaluative questions. Section . will suggest
that thinking about decision theory can have the same value in thinking about fundamental
normative questions, questions about what we ought to do. With particular focus on
probability, the remaining sections illustrate by discussing what are arguably the three most
important kinds of normative theories: act consequentialism, rule consequentialism or
contractualism, and deontology (these will be defined in section .).
The discussion aims to be self-contained. For those with a background in ethics who
would like to know more about how probability is involved, the chapter keeps technicalities
to a minimum. But the topic just cannot be addressed without a certain amount of rigor,
and passing acquaintance with expected utility theory and decision theory will be helpful,
though not strictly necessary. For those who know about probability and would like to
see how it applies to ethics, the chapter gives brief guides to the relevant ethical debates.
Such readers will recognize occasional allusions to relatively sophisticated ideas to do with
probability. For one thing is clear: the questions about probability which ethics raises are
profound, and are surely best addressed by combining expertise.
33.1 Probabilities
One difficulty in thinking about probability in ethics is assessing when ethicists need to
be involved. Suppose we are told that some action will benefit many but involves a small
probability of harming a few. We might think it the job of epistemologists, metaphysicians or
philosophers of science to tell us what kind of judgment ‘the probability is small’ expresses,
what laws probabilities obey, and what makes such a judgment correct. Ethicists need only
ask whether we ought to perform the action given that the probability of harm is small, and
need not be involved any earlier. However, the division of labour is unlikely to be so neat.
There are many conceptions of probability (see e.g. Hájek, , for a survey). This raises
the question of which conception is most relevant to ethics, or whether different conceptions
are appropriate in different ethical contexts. One of the most basic distinctions is between
subjective and objective conceptions of probability, and this distinction will enable us to
illustrate many of the issues.
The best-known subjective conception claims that the preferences of an ideally rational
agent between uncertain prospects must satisfy various structural conditions (Ramsey,
; Savage, ). Suppose the agent also has a rich set of preferences. Then Ramsey and
Savage showed that there exists a unique function on events satisfying the usual probability
axioms (call it her subjective probability function) and a function on outcomes (her utility
function) such that: the agent weakly prefers one prospect to another if and only if the
former has at least as great expected utility, as calculated by those functions.
Perhaps the most prominent objective conception of probability in the contemporary
debate is the best-system analysis pioneered by Lewis. The original best-system analysis
of the laws of nature of Lewis () says that the laws are the theorems of the best
systematization of the world: the true theory which does best in terms of simplicity and
strength (or informativeness). To allow probabilistic laws in, Lewis () introduced the
idea of fit. The more likely the actual world is by the lights of the theory, the better the fit of
that theory. Theories are now judged according to how well they do in terms of simplicity,
strength, and fit. If some of the laws of the best theory are probabilistic, those are what
determine the objective probabilities.
Suppose we have to choose between subjective and objective conceptions for use
in ethics, understood along the lines just sketched. Which conception should it be?
[Footnote: Neither the Ramsey-Savage story about subjective probabilities nor Lewis’s version of best-system
analysis has a hegemony. For surveys of alternative views about subjective probability, see Gilboa (),
and for alternative best-system analyses, see Schwarz in this volume ().]
Perhaps it depends on context: for example, subjective probabilities may be appropriate for
agent-evaluation (blame, responsibility etc.), but inappropriate in other contexts. But let us
fix the context by focussing on the most basic normative question of what we ought to do.
Each conception has features we might find appealing. Objective probabilities seem in
some important sense to trump subjective probabilities. This is reflected in the popular
view that when an agent has beliefs about objective probabilities, rationality requires her
to conform her subjective probabilities to those beliefs. This is the basic idea behind the
so-called Principal Principle of Lewis (). But if objective probabilities do indeed trump
subjective probabilities, it may seem that what we ought to do depends on the objective
probabilities, not our subjective probabilities.
On the other hand, objective probabilities may be disappointingly sparse or epistemically
inaccessible. For example, best-systems analyses may make good sense of the objective
probability of radium atoms decaying or coins landing on heads. But it is much less clear
what best-systems analyses have to say about the objective probability of events like a run
on a particular bank next year, one-off macro events involving chaotic systems. Such events
may fail to have reasonably determinate objective probabilities (compare Hoefer, ), and
even if they do, the epistemology may be too difficult for the objective probabilities to be
usefully action guiding.
So perhaps we should instead say that what we ought to do depends at least in part on our
subjective probabilities. One option is to use subjective probabilities exclusively; another
is to use objective probabilities where available, and subjective probabilities to fill in the
gaps. But every view which makes significant use of subjective probabilities faces at least
two major problems.
First, the Ramsey-Savage story about subjective probabilities is a chapter in the Humean
story about rationality. But just as the Humean story refuses to condemn the preference
for the destruction of the whole world over the scratching of a finger, the Ramsey-Savage
story does not condemn subjective probabilities which, to most people, are just as crazy.
For example, provided her preferences are appropriately structured, there is nothing in the
Ramsey-Savage story to condemn someone who thinks it highly likely the world will come
to an end before teatime. Such subjective probabilities will seem to many too irrational to
have any bearing on what we ought to do. But it is a major challenge to articulate a principled
account of which subjective probabilities should be excluded.
Secondly, as soon as we allow in subjective probabilities, we face questions of whose
and how. Whose subjective probabilities count in determining whether an agent ought to
perform some action – the agent’s, those of her potential victims or beneficiaries, everyone’s?
If the subjective probabilities of at least two people are relevant, how should they be used? At
least if we switch to the problem of evaluating the uncertain prospects which actions result
in, this is a long-standing problem in welfare economics. The so-called ex post approach
recommends first aggregating the separate subjective probability functions into a single
social probability function, then using this social probability function to evaluate uncertain
prospects. The ex ante approach gives the separate subjective probability functions a direct
evaluative role, at least in a special case. Just to give one version, ex ante Pareto says: if for
each individual i, an uncertain prospect P is better for i than another uncertain prospect
P′ relative to i’s own subjective probability function, then P is better than P′. Both the ex
post and ex ante approaches look appealing, but they are extremely difficult to combine
consistently. For example, given weak assumptions, there will be prospects P and P′ such
that ex ante Pareto has the apparent pathology of implying that P is better than P′ despite
the fact that P′ is guaranteed to produce a better outcome than P. But ex post approaches
will adopt principles which from the outset say that in such cases, P′ is better than P.
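The apparent pathology can be made concrete with a small numerical sketch. Everything in it is a hypothetical illustration: the two individuals (`ann`, `bob`), the two states, the utilities, and in particular the assumption that outcomes are ranked by total utility, which is only one of many ways an ex post approach might rank them.

```python
# Hypothetical two-person, two-state illustration (all numbers are assumptions).
STATES = ("s1", "s2")

# Each individual's own subjective probability function over the states.
beliefs = {
    "ann": {"s1": 0.9, "s2": 0.1},
    "bob": {"s1": 0.1, "s2": 0.9},
}

# A prospect maps each state to the utility each individual receives in it.
P = {
    "s1": {"ann": 10, "bob": -12},
    "s2": {"ann": -12, "bob": 10},
}
P_prime = {
    "s1": {"ann": 0, "bob": 0},
    "s2": {"ann": 0, "bob": 0},
}


def expected_utility(prospect, person):
    """Expected utility of a prospect for a person, by that person's own beliefs."""
    return sum(beliefs[person][s] * prospect[s][person] for s in STATES)


def ex_ante_pareto_prefers(first, second):
    """True if each individual's own expectation ranks `first` strictly above `second`."""
    return all(
        expected_utility(first, i) > expected_utility(second, i) for i in beliefs
    )


def outcome_dominates(first, second):
    """True if `first` has a higher utility total than `second` in every state.

    Ranking outcomes by total utility is an assumption made only for this sketch.
    """
    return all(
        sum(first[s].values()) > sum(second[s].values()) for s in STATES
    )


print(ex_ante_pareto_prefers(P, P_prime))  # each person expects to gain from P
print(outcome_dominates(P_prime, P))       # yet P' does better whichever state obtains
```

Because the two individuals disagree sharply about which state is likely, each expects to come out ahead under P, even though P yields a lower total than P′ in both states.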
Now it is not my goal to try to answer any of the large questions raised in this section.
My claim is rather that they are questions with which ethicists must engage, and that one’s
answers to these questions may depend on one’s more general ethical views. To illustrate,
suppose one sees ethics as being primarily about coordinating action to achieve good
outcomes, and one is prepared to tolerate a significant amount of indeterminacy in one’s
normative theory. Then one may be tempted to claim that the probabilities which are
relevant to ethics are the objective probabilities alone. By contrast, suppose one instead sees
ethics as being about trying to achieve some sort of fair compromise between agents with
diverse beliefs and goals. Then it may seem tempting to allow in subjective probabilities
no matter how irrational, and to follow the ex ante approach. On this picture, individual
autonomy is central, and it may seem more important to respect the notion of unanimity
built into ex ante Pareto than to try to avoid the apparent pathology which comes with it.
There are, of course, many other options, but the important point is that which probabilities
are relevant to ethics, and how, is itself a fundamental ethical question.
33.2 Outcomes
Some writers, however, think that probabilities are never relevant to what we ought to
do. A parallel view applies to the question of when one uncertain prospect is better than
another. Jackson () illustrates with the following. A doctor has to choose between three
treatments for a patient with a minor complaint. Drug A would partially cure the complaint.
One of drugs B and C would completely cure the patient while the other would kill him, but
the doctor cannot tell which is which.
The obvious view, as Jackson notes, is that the doctor ought to give the patient drug A.
This verdict would be delivered by any broadly decision-theoretic account. Along similar
lines, the prospect associated with giving the patient drug A is better than the prospects
associated with drugs B and C. Call any view which assesses actions and prospects involving
uncertainty along broadly decision-theoretic lines probability-based.
But there is a different view: if drug B would cure the patient, the doctor ought to give the
patient drug B; similarly for drug C. Likewise, if drug B would cure, the prospect associated
with giving drug B is better than the other prospects. Call such views, positions which assess
actions and prospects in terms of what their consequences would be, outcome-based.
As the drug example shows, an objection to outcome-based views is that they make the
truth about what we ought to do too epistemically inaccessible, or provide poor guides to
action. But there are at least two interesting arguments for outcome-based views.
First, transposing an argument due to Thomson () to the present example, suppose
the pharmacist walks in and, knowing full well that drug B would cure the patient, says to
[Footnote: The large literature on this topic is rather technical, but Broome (, ch. ) provides a good
introduction and philosophical discussion. Mongin () contains a very general set of results.]
the doctor: “You ought to use drug B”. The pharmacist seems right. But doesn’t that imply
an outcome-based view?
In response, consider the case where the pharmacist says: “Drug B would cure. So you
ought to use drug B”. By the time the pharmacist has finished the first sentence, the doctor
has new evidence, and should update her probabilities accordingly. There is then no
clash between a probability-based view and the truth of the pharmacist’s second sentence.
Likewise, I think that in the actual case something like “Drug B would cure” is implied when
the pharmacist just says: “You ought to use drug B”. What is implied by the pharmacist’s
normative assertion impacts upon the probabilities the doctor should have, making the
literal construal of the normative assertion true (McCarthy, ).
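The contrast between the two kinds of view, and the effect of the pharmacist’s testimony, can be sketched numerically. The drug example supplies no numbers, so the utilities, the even split of the doctor’s initial probabilities, and all names below are hypothetical choices made purely for illustration.

```python
# All utilities and probabilities are hypothetical; the example in the text gives none.
UTILITY = {"partial_cure": 0.9, "full_cure": 1.0, "death": -100.0}

# What each drug would bring about in each of the two possible states.
RESULT = {
    "A": {"B_cures": "partial_cure", "C_cures": "partial_cure"},
    "B": {"B_cures": "full_cure", "C_cures": "death"},
    "C": {"B_cures": "death", "C_cures": "full_cure"},
}


def expected_utility(drug, probabilities):
    """Probability-weighted utility of giving `drug`, given probabilities over states."""
    return sum(p * UTILITY[RESULT[drug][state]] for state, p in probabilities.items())


def probability_based_choice(probabilities):
    """The drug with the greatest expected utility relative to `probabilities`."""
    return max(RESULT, key=lambda drug: expected_utility(drug, probabilities))


def outcome_based_choice(actual_state):
    """The drug whose actual consequence would be best in `actual_state`."""
    return max(RESULT, key=lambda drug: UTILITY[RESULT[drug][actual_state]])


before_testimony = {"B_cures": 0.5, "C_cures": 0.5}
after_testimony = {"B_cures": 1.0, "C_cures": 0.0}

print(probability_based_choice(before_testimony))  # A
print(outcome_based_choice("B_cures"))             # B
print(probability_based_choice(after_testimony))   # B
```

On these assumed numbers, the probability-based verdict moves from A to B once the doctor’s probabilities reflect the pharmacist’s evidence, so the pharmacist’s “You ought to use drug B” need not conflict with a probability-based view.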
Secondly, advocates of probability-based views have to say which probability functions
are relevant to what we ought to do. But there are many candidates, e.g. the probabilities of
this agent or that agent, at this time or that time. Jackson concludes that we have to recognize
the existence of “an annoying profusion ... of a whole range of oughts” (Jackson, , p.
).
But this seems dissatisfying. When we ask ourselves or others what we ought to do, we
don’t want to learn that some oughts recommend this while others recommend that. We
want to know what we ought to do, full stop. But if there is only one ought, we need to privilege
one probability function. The function of an omniscient agent may seem to be the only
distinguished choice, so we end up with an outcome-based view.
In response, just because it is not obvious which probability function is privileged, it
does not follow that no function (or reasonably narrow class of functions) is privileged. In
the previous section we saw that if we adopt a probability-based view, a variety of fairly
fundamental ethical factors and disputes bears upon the question of which probabilities
are relevant to ethics. The complexity of this topic explains why it is not obvious which
probability function is privileged, but the fact that the problem is complex hardly entails
that some outcome-based view wins by default. Outcome-based views have to be assessed
in terms of various ethical desiderata just as much as probability-based views do, and they do
quite badly in terms of desiderata such as the idea that an ethical theory should be suitably
action-guiding.
It is also worth noting that outcome-based views may result in large-scale indeterminacy.
The drug example stipulated that various counterfactuals relating actions to outcomes are
true. But an increasingly popular view claims that most counterfactuals are false (see e.g.
Hájek, ). In particular, it will often be the case that for some potential action A there is
no outcome O such that the counterfactual: “If A were performed, O would result”, is true.
On this view about counterfactuals, the facts on which outcome-based views have to call are
much sparser than might have appeared, with the result that there is a lot more evaluative
and normative indeterminacy on outcome-based views than we might have hoped. This
may further undercut the appeal of outcome-based views.
In what follows, I will assume that some probability-based view is correct. But it is a
major question which conception of probability is relevant to ethics, so ethicists need to be
involved with questions about probability early on. In light of the difficulties of aggregating
probability functions alluded to in the previous section, ethicists also need to be prepared
for the possibility that the eventual input into ethics is going to be messier than a single
probability function which satisfies the usual axioms.
33.3 Terminology
However, to simplify I henceforth assume that probabilities are supplied and satisfy the
usual axioms. To reflect this I will often speak of risk rather than probability or uncertainty.
A lottery over a nonempty set of world histories (past, present and future) assigns positive
probabilities to finitely many of the histories with the probabilities all summing to one (these
are sometimes known as lotteries with finite support). I will often write lotteries in the form
[p_1, h_1; ...; p_m, h_m] where the h_j’s are the histories which could result from the lottery and
the p_j’s their probabilities.
The betterness relation holds between two lotteries just in case the first is at least as good
as the second. An individual i’s individual betterness relation holds between two lotteries
L_1 and L_2 just in case: i exists in every history which could result from the lotteries, and
L_1 is at least as good for i as L_2. By identifying histories with lotteries in which the history
gets probability one, and restricting the betterness and individual betterness relations to
such lotteries, we obtain relations between histories. I will refer to these relations as risk-free
versions of the originals. For example, the risk-free betterness relation holds between two
histories just in case the first is at least as good as the second.
There are many views about when one history is better for someone than another, or in
a more suggestive phrase, about what makes someone’s life go best (Parfit, , Appendix
I). On one popular classiication, the three main views are that having a good life is a matter
of: (i) having good-quality experiences; (ii) satisfying one’s preferences or desires; or (iii)
attaining what are said to be objective goods, such as deep knowledge or close personal
relationships. However, some philosophers think that when doing ethics, we should not
be in the business of making fine-grained comparisons between different people’s lives,
but should make interpersonal comparisons only in terms of such things as the resources,
freedoms, or opportunities people enjoy (see e.g. Rawls, ; Sen, ). Which of these
views is correct will not matter in what follows, but it will be important that the discussion
can accommodate any of them.
We will be talking a lot about the betterness relation. Not everyone thinks that this is
a useful way of looking at ethics (see e.g. Foot, ; Thomson, ). But in response,
talking about betterness can be seen as a harmless organizing tool (see e.g. Broome, ),
and is popular enough for us to be able to cover many major positions. For example,
consequentialism (on a probability-based interpretation) is the view that lotteries can be
ranked in terms of betterness, and that betterness somehow determines normativity.
For example, act consequentialism says that we always ought to bring about the best
available lottery, whereas rule consequentialism says that we always ought to act according
to the rule such that, if everyone acted in accord with it (or on a different version,
accepted it), the best available lottery would be realized. Contractualism tends to be framed
not in terms of betterness, but in terms of an ideal social contract. However, when it
comes to the assessment of different social contracts, contractualists are concerned with
[Footnote: As far as I can see, there is no universally accepted account of consequentialism, so I am only trying
to convey the rough idea rather than provide a precise definition. In addition, the way moral philosophers
use the term ‘consequentialism’ should not be confused with an important decision-theoretic idea which
also goes by the name of ‘consequentialism’ (see e.g. Hammond, ).]
competing sets of principles or rules (see e.g. Scanlon, ), so at the concrete level of
normative theorizing, it is often hard to tell the difference between contractualism and rule
consequentialism. Finally, deontology is often characterized as the position that some acts
are wrong even when they would have the best available consequences, such as killing one
innocent person to prevent five innocent people from being killed.
33.4 Expected utility theory
This chapter expresses the view that whatever one ultimately makes of expected utility
theory and decision theory, looking at basic evaluative and normative questions through
the frameworks they provide is extremely useful. This section therefore provides a quick
rehearsal, first of the terminology of expected utility theory, and then of its most basic result.
It takes X to be some fixed nonempty set. In applications, X will usually be a set of histories,
or more colloquially, outcomes.
A preorder on X is a binary relation R on X which is reflexive (∀x ∈ X, xRx) and transitive
(∀x, y, z ∈ X, xRy & yRz ⇒ xRz). It is complete if for all x, y ∈ X, either xRy or yRx. It is
incomplete just in case it is not complete. An ordering of X is just a complete preorder of
X. If L and M are lotteries over X, then for all α ∈ (0, 1), αL + (1 − α)M is the so-called
compound lottery in which each member x of X has probability αp + (1 − α)q where p is x’s
probability under L and q is its probability under M. Suppose that ≿ is an ordering on X.
Then a real-valued function f is said to represent the ordering just in case: for every x and y
in X, x ≿ y if and only if f(x) ≥ f(y).
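The definition of a compound lottery can be sketched in a few lines of code. This is only an illustration under stated assumptions: lotteries are stored as dictionaries from outcomes to exact probabilities, and the helper names `make_lottery` and `mix`, together with the outcome labels, are hypothetical.

```python
from fractions import Fraction


def make_lottery(pairs):
    """Build a lottery with finite support from (probability, outcome) pairs."""
    lottery = {}
    for p, x in pairs:
        p = Fraction(p)
        if p <= 0:
            raise ValueError("every listed outcome needs positive probability")
        lottery[x] = lottery.get(x, Fraction(0)) + p
    if sum(lottery.values()) != 1:
        raise ValueError("probabilities must sum to one")
    return lottery


def mix(alpha, L, M):
    """Compound lottery alpha*L + (1 - alpha)*M, for alpha strictly between 0 and 1."""
    alpha = Fraction(alpha)
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Each outcome x gets probability alpha*p + (1 - alpha)*q, where p and q are
    # its probabilities under L and M (zero if x is outside a lottery's support).
    return {
        x: alpha * L.get(x, 0) + (1 - alpha) * M.get(x, 0)
        for x in set(L) | set(M)
    }


L = make_lottery([(Fraction(1, 2), "h1"), (Fraction(1, 2), "h2")])
M = make_lottery([(1, "h3")])
compound = mix(Fraction(1, 4), L, M)
print(compound)
```

Exact fractions are used so that the check that probabilities sum to one is not disturbed by floating-point rounding.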
Suppose that ≿ is a binary relation on lotteries over X. Here are the three expected utility
axioms.
Ordering. ≿ is a complete preorder.
Strong Independence. For all lotteries L, M and N, and α ∈ (0, 1): L ≿ M if and only if
αL + (1 − α)N ≿ αM + (1 − α)N.
The rough idea of Strong Independence is that the “addition” of the same lottery N to either
side of L ≿ M should make no difference: the added N’s will cancel out. Strong Independence
is sometimes explained by imagining that the compound lotteries will be realized by first
tossing a biased coin, where heads has a probability of α and tails a probability of 1 − α,
then running whichever lottery results. For example, suppose you strictly prefer L to M,
and you now have to decide between αL + (1 − α)N and αM + (1 − α)N. If the coin lands
on tails, you will face N in either case, so in that scenario there is nothing to choose between
the two compound lotteries. But if the coin lands on heads, you will face L or M, and will
therefore prefer to have chosen αL + (1 − α)N to αM + (1 − α)N. Since heads has a positive
probability, you should therefore strictly prefer αL + (1 − α)N to αM + (1 − α)N prior
to the coin being tossed. Or at least that is one of the typical ways of motivating Strong
Independence. The example has focused on preference relations, but it can clearly be applied
directly and without any discussion of rationality to a variety of evaluative comparatives,
such as betterness and individual betterness relations.
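The “cancelling out” idea can be checked directly in one special case. As a sketch, assume (purely for illustration, and not as a premise of the axiom) that lotteries are ranked by the expected value U of some real-valued function u on X, where L(x) denotes the probability that lottery L assigns to x. The symbols U and L(x) are notation introduced only for this sketch, and all sums are finite because lotteries have finite support.

```latex
\begin{align*}
U(L) &= \sum_{x \in X} L(x)\,u(x), \\
U\bigl(\alpha L + (1-\alpha)N\bigr)
  &= \sum_{x \in X} \bigl[\alpha L(x) + (1-\alpha)N(x)\bigr]\,u(x)
   = \alpha\,U(L) + (1-\alpha)\,U(N), \\
U\bigl(\alpha L + (1-\alpha)N\bigr) - U\bigl(\alpha M + (1-\alpha)N\bigr)
  &= \alpha\,\bigl[U(L) - U(M)\bigr].
\end{align*}
```

Since α is positive, the last difference has the same sign as U(L) − U(M): the common term (1 − α)U(N) drops out, so any relation ranked in this way satisfies Strong Independence.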
Continuity. For all lotteries L, M and N such that L ≻ M ≻ N, there exist α, β ∈ (0, 1) such
that M ≻ αL + (1 − α)N and βL + (1 − β)N ≻ M.
To illustrate, suppose you strictly prefer a large prize to a middling prize, and strictly prefer
the middling prize to a small prize. Then if your preferences are continuous, there will be
some lottery which almost guarantees you the large prize with a tiny chance of the small
prize (one in a billion, say) which you will strictly prefer to getting the middling prize for
certain. And you will strictly prefer the middling prize for certain to some lottery which
almost guarantees you the small prize with a tiny chance of the large prize. As the example is meant to suggest, many
people think that Continuity is a plausible requirement on various evaluative comparatives.
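For a relation that happens to rank lotteries by expected value, the mixing weights that Continuity requires can be found explicitly. The following sketch assumes, only for illustration, that each lottery is ranked by the expected value U of some function u, with U(L) > U(M) > U(N). The symbols U and t are introduced here and are not part of the axiom.

```latex
\begin{align*}
t &= \frac{U(M) - U(N)}{U(L) - U(N)} \in (0, 1), \\
\beta\,U(L) + (1-\beta)\,U(N) > U(M) &\iff \beta > t, \\
U(M) > \alpha\,U(L) + (1-\alpha)\,U(N) &\iff \alpha < t.
\end{align*}
```

So any α in (0, t) and any β in (t, 1) will do. The closer U(M) is to U(L), the closer t is to one, and the smaller the chance of N that can be tolerated before the mixture stops being preferred to M.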
A binary relation ≿ on lotteries over X satisfies the expected utility axioms just in case
it satisfies Ordering, Strong Independence, and Continuity. Here is the most basic result of
expected utility theory, due to von Neumann and Morgenstern (), but anticipated in a
deeper way by Ramsey ().
Theorem (von Neumann and Morgenstern) Let X be a nonempty set, and ≿ be a binary
relation on lotteries on X which satisfies the expected utility axioms. Then there exists a
real-valued function u on X such that
(i) For all lotteries L_1 = [p_1, x_1; ...; p_m, x_m] and L_2 = [q_1, y_1; ...; q_n, y_n],
L_1 ≿ L_2 ⇐⇒ p_1 u(x_1) + ··· + p_m u(x_m) ≥ q_1 u(y_1) + ··· + q_n u(y_n)
(ii) Any function v satisfies (i) when substituted for u if and only if there exist real numbers
a > 0 and b such that v = au + b.
Roughly speaking, (i) says that there is a function u (often referred to as a “vNM utility
function”) such that L_1 ≿ L_2 if and only if the expected value of u associated with L_1 is at
least as great as the expected value of u associated with L_2. The expected value of u associated
with a lottery is obtained by applying u to each of the lottery’s possible outcomes, weighting
the result by the probability of those outcomes, then adding all those numbers up. In such
circumstances, I will say that the ordering is represented by the expected value of u. (ii)
says that the function u is unique up to choice of zero and unit, or in fancier terminology,
unique up to positive affine transformation. For an analogy, Fahrenheit and Centigrade
measure temperature in essentially the same way, except that they use different zeros and
units. Overall, the main message is that if an ordering of lotteries satisfies the expected utility
axioms, it can be represented by the expected value of some function which is more or less
unique.
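Both parts of the theorem can be illustrated with a short sketch. The outcomes, utility numbers, and lotteries below are hypothetical, as are the function names. The transformation v = (9/5)u + 32 is chosen to echo the temperature analogy, and w is an assumed function that orders the outcomes as u does but is not a positive affine transformation of it.

```python
from fractions import Fraction
from itertools import permutations


def expected_value(u, lottery):
    """Probability-weighted sum of u over a lottery of (probability, outcome) pairs."""
    return sum(p * u[x] for p, x in lottery)


def weakly_better(u, first, second):
    """True if the expected value of u under `first` is at least that under `second`."""
    return expected_value(u, first) >= expected_value(u, second)


def positive_affine(u, a, b):
    """Return v = a*u + b, requiring a > 0."""
    if a <= 0:
        raise ValueError("a must be positive")
    return {x: a * value + b for x, value in u.items()}


# Hypothetical utility function on three outcomes.
u = {"x1": Fraction(0), "x2": Fraction(10), "x3": Fraction(25)}
v = positive_affine(u, Fraction(9, 5), 32)  # new zero and unit, same structure
w = {"x1": Fraction(0), "x2": Fraction(3), "x3": Fraction(5)}  # same order, not affine in u

half = Fraction(1, 2)
lotteries = [
    [(Fraction(1), "x2")],
    [(half, "x1"), (half, "x3")],
    [(Fraction(1, 4), "x1"), (Fraction(3, 4), "x2")],
]


def same_ranking(f, g):
    """True if f and g rank every ordered pair of the sample lotteries alike."""
    return all(
        weakly_better(f, a, b) == weakly_better(g, a, b)
        for a, b in permutations(lotteries, 2)
    )


print(same_ranking(u, v))  # True: the ranking survives a change of zero and unit
print(same_ranking(u, w))  # False: w reverses the ranking of the first two lotteries
```

The second result shows why uniqueness holds only up to positive affine transformation: w agrees with u about which sure outcome is better, yet it disagrees about the even-chance lottery over x1 and x3 compared with x2 for certain.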
The literature on expected utility theory is vast. It has been applied to all sorts of topics,
and has received a great deal of defense, criticism, and mathematical elaboration. Beyond
a few remarks, this chapter will assume some sort of familiarity with the defense, but will
rehearse many of the criticisms, particularly as they apply to ethics. We now need to ask:
When is one lottery better than another? Which lotteries ought we to bring about? We begin
with the first question.
[Footnote: L ≻ M is defined as L ≿ M and not M ≿ L. L ∼ M is defined as L ≿ M and M ≿ L.]
[Footnote: At varying levels of philosophical and mathematical ambition, personal favourites include Fishburn
(), Resnik (), Kreps (), Broome (), Hammond (), Ok () and Gilboa ().
In this volume, see Buchak ().]
33.5 Expected goodness
.............................................................................................................................................................................
Some philosophers imply that if we know when one history is better than another, the
question of when one lottery is better than another is straightforward. For example, Parfit
(, p. ) and Scheffler (, p. , note ) start their discussions of consequentialism
by assuming only
(1) The risk-free betterness relation is an ordering.
To cover risky cases, they think that we need to appeal only to expected utility theory. In
particular, they think we just need to add
(2) One lottery is at least as good as another if and only if its expected goodness is at least
as great.
In other words, the betterness relation is represented by the expected value of goodness.
Parfit and Scheffler are not claiming that it is obvious when one history is better than another.
Rather, they are claiming that once we have an ordering of histories in terms of betterness,
(2) then tells us how to order lotteries in terms of betterness.
Now Parfit and Scheffler are quite brief about this and their real concerns lie elsewhere.
But this sort of claim is commonly made, and it is important to realize that it contains
a serious mistake. The basic difficulty is that (2) presupposes the existence of goodness
measures, measures of how good histories are, and various problems arise depending on
where we think these measures are coming from.
First, provided certain technical conditions are met, (1) guarantees that the risk-free
betterness relation can be represented by some function. To deal with the possibility that
there may be more than one such function, we might treat the set of all goodness measures
as the set of all of the functions which represent the risk-free betterness relation. It would
then be natural to interpret (2) as saying: $L_1 \succsim L_2$ if and only if the expected goodness of
$L_1$ is at least as great as the expected goodness of $L_2$ according to every goodness measure.
Unfortunately, however, this approach leads to massive indeterminacy. An example will
illustrate. Suppose there are exactly three histories x, y and z, ordered $x \succ y \succ z$ by
the risk-free betterness relation. Let L be the lottery $[\tfrac{1}{2}, x; \tfrac{1}{2}, z]$ and let us consider how it
compares with y. Consider two functions u and v defined so that u(x) = v(x) = 1 and
u(z) = v(z) = 0, with u(y) greater than 1/2 and v(y) less than 1/2. Both of these functions
represent the risk-free betterness
relation, and therefore count as goodness measures on the current proposal. But according
to u, the expected goodness of L is less than that of y, and according to v, the expected
goodness of L is greater than that of y. The current proposal therefore leaves L and y
unranked, and it only takes a bit more work to show that this will be true of almost every
pair of lotteries. So interpreting (2) along these lines does almost nothing to cover risky
cases.
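The indeterminacy can be checked with a short sketch. The particular values chosen for y (9/10 under u and 1/10 under v) are assumptions made for illustration, as are the helper names; any values on either side of 1/2 would produce the same result.

```python
from fractions import Fraction

def expected_value(lottery, f):
    """Expected value of the measure f for a lottery [(probability, history), ...]."""
    return sum(p * f[h] for p, h in lottery)

def represents_order(f):
    """True if f respects the risk-free ordering x > y > z."""
    return f["x"] > f["y"] > f["z"]

def ranked_by_all(l1, l2, measures):
    """l1 counts as at least as good as l2 only if every goodness measure agrees."""
    return all(expected_value(l1, f) >= expected_value(l2, f) for f in measures)

half = Fraction(1, 2)
L = [(half, "x"), (half, "z")]
sure_y = [(Fraction(1), "y")]

# Two hypothetical goodness measures that both represent x > y > z.
u = {"x": Fraction(1), "y": Fraction(9, 10), "z": Fraction(0)}
v = {"x": Fraction(1), "y": Fraction(1, 10), "z": Fraction(0)}

L_at_least_as_good = ranked_by_all(L, sure_y, [u, v])
y_at_least_as_good = ranked_by_all(sure_y, L, [u, v])
# Both come out False: the proposal leaves L and y unranked.
```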
(The result that such a representing function exists goes back to Cantor; for details, see any
reasonably advanced book on utility theory, such as Kreps () or Ok ().)
Secondly, to get around this problem we might hope to narrow down all of the functions
which represent the risk-free betterness relation to (essentially) a single function to be used
as a goodness measure. This line of thought is tacitly quite common, and what tends
to happen is that one of the functions which represents the risk-free betterness relation
seems quite simple or natural, and it is taken to be the goodness measure. An old idea
will illustrate. According to this idea, each “just noticeable difference” between outcomes
is given the same magnitude of goodness, so that the difference in goodness between the
best outcome and the second best outcome is equal to the difference in goodness between
the second best outcome and the third best outcome, and so on. In the toy example of the
previous paragraph, this would be done by a function w where w(x) = 1, w(y) = 0.5, and
w(z) = 0. Using (2) would then provide a ranking of all lotteries in terms of betterness. For
example, L and y would turn out to be equally good. However, this proposal is ethically
entirely arbitrary, and it is easy to invent circumstances in which the method delivers
implausible conclusions. To illustrate, let us apply the same idea to individual betterness
relations. Consider a wine connoisseur who is able to discriminate among a vast number
of wines, and let us take her ordering of wines as given. Let $a^+$ be the outcome in which
she gets the best possible wine, $a$ the next wine down, $r$ some rough house wine, and $r^+$ the
next one up. The current method would regard the two lotteries $[\tfrac{1}{2}, a^+; \tfrac{1}{2}, r]$ and $[\tfrac{1}{2}, a; \tfrac{1}{2}, r^+]$
as equally good. But our connoisseur might regard experiencing the best possible wine as
worth risking a lot for, and improving a rough house wine as hardly worth anything, leading
her to conclude that the first lottery is better. Yet the current method woodenly regards the
two lotteries as equally good.
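A minimal sketch of the equal-step method follows. The six-wine list and the function name `equal_step_measure` are assumptions for illustration; the method assigns equally spaced values from 1 (best) down to 0 (worst), so swapping one step at the top for one step at the bottom never changes an even-chance lottery's expected value.

```python
from fractions import Fraction

def equal_step_measure(ranked_best_to_worst):
    """Assign equally spaced values, from 1 for the best outcome to 0 for the worst."""
    n = len(ranked_best_to_worst)
    return {h: Fraction(n - 1 - i, n - 1) for i, h in enumerate(ranked_best_to_worst)}

def expected_value(lottery, f):
    """Expected value of the measure f for a lottery [(probability, outcome), ...]."""
    return sum(p * f[h] for p, h in lottery)

half = Fraction(1, 2)

# Toy example: three histories ordered x > y > z.
w = equal_step_measure(["x", "y", "z"])
L = [(half, "x"), (half, "z")]

# Hypothetical wine ranking; "m1" and "m2" stand in for the many wines in between.
w_wine = equal_step_measure(["a+", "a", "m1", "m2", "r+", "r"])
first = [(half, "a+"), (half, "r")]
second = [(half, "a"), (half, "r+")]

tie_in_toy_example = expected_value(L, w) == w["y"]
tie_between_wine_lotteries = expected_value(first, w_wine) == expected_value(second, w_wine)
```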
Thirdly, one might approach the problem from a different direction. Suppose we start
with a claim which is presupposed by (2), namely
Social EUT: The betterness relation satisfies the expected utility axioms.
Now by the vNM theorem, Social EUT implies
(3) For some real-valued function on histories f, the betterness relation is represented by
the expected value of f.
We might then define f as a goodness measure (along with its positive affine
transformations). It follows that (2) now gives us the right results: one lottery is better than another
just in case its expected goodness is greater. Unfortunately, however, just as the first method
yielded almost complete indeterminacy, this method is almost completely uninformative.
In almost all cases, it provides us with no concrete method of ranking lotteries. For example,
in the toy example used to show why the first method leads to indeterminacy, it is consistent
with the present method that L is better than y, that L and y are equally good, and that L is
worse than y.
(Three notes on the second approach. On narrowing the representing functions down to a
single function: more precisely, to a set of functions which are all related by positive affine
transformation. The vNM theorem tells us that these will all be equivalent when it comes to
ordering lotteries in terms of expected goodness. On how common the approach is: for example,
McCarthy () argues that this approach is common in accounts of the priority view
and leads to unsatisfactory definitions of it. On just noticeable differences: the basic idea
goes back to Edgeworth (). For criticism and defense see e.g. Vickrey () and
Ng () respectively.)
We have now looked at three ways of trying to fill in the story gestured towards by Parfit,
Scheffler, and many others, the story which thinks that once we are given the risk-free
betterness relation, we need only to appeal to expected utility theory to cover risky cases.
Each attempt to say where goodness measures are coming from leads to a problem. The first
leads to indeterminacy, the second to arbitrariness, and the third to uninformativeness.
Now expected utility theory does indeed turn out to be a powerful tool for thinking about
evaluative questions about risk, and even questions which do not seem to be about risk. But
the story has to be more sophisticated than anything we have so far seen.
33.6 Veils of ignorance
.............................................................................................................................................................................
To simplify, I will from now on assume that in evaluating lotteries, we are only concerned
with the ethics of distribution, and in addition, not concerned with rights or responsibilities.
In particular, I will assume: if $h_1$ and $h_2$ contain the same population and for each member
i, $h_1$ is exactly as good for i as $h_2$, then $h_1$ and $h_2$ are equally good.
The best-known strategy for augmenting an appeal to expected utility theory is to use a
so-called veil of ignorance, made famous but used in different ways by Harsanyi () and
Rawls ().
Assume a fixed population $1, \ldots, n$. Harsanyi’s presentation of his argument tacitly
identifies individual betterness relations with individual preference relations. But there
are objections to that identification, and following Broome () we can avoid them by
restating Harsanyi’s argument in terms of individual betterness relations. This enables us to
leave it open whether the content of individual betterness relations has to do with preference
satisfaction, the quality of experience, achievements, or some other account. Harsanyi’s
argument then begins with
Individual EUT: Individual betterness relations satisfy the expected utility axioms.
Assume also that interpersonal comparisons are unproblematic in that
Interpersonal Completeness: For all individuals i and j and histories $h_1$ and $h_2$, either $h_1$ is
at least as good for i as $h_2$ is for j, or vice versa.
Together Individual EUT and Interpersonal Completeness imply that there are real-valued
functions $u_1, \ldots, u_n$ on histories such that (i) for each individual i, i’s individual betterness
relation is represented by the expected value of $u_i$, and (ii) for all individuals i and j, $h_1$ is at
least as good for i as $h_2$ is for j if and only if $u_i(h_1) \geq u_j(h_2)$. From now on, $u_1, \ldots, u_n$ will
always be such functions, but their existence presupposes Individual EUT and Interpersonal
Completeness. I will sometimes call them utility functions.
(Some of the arguments which follow make slightly stronger assumptions about interpersonal
comparisons than I have made explicit. The point of these is to make various impartiality assumptions
have an effect, and also to guarantee that the functions $u_1, \ldots, u_n$ are essentially unique, in that if some
other set of functions $v_1, \ldots, v_n$ plays their role, there are real numbers a > 0 and b such that for all i,
$v_i = a u_i + b$. But I will suppress this slightly technical issue. For full details, see e.g. Broome (, p. ).)
Harsanyi () took ethics to be impartial. But how should this be modeled, or made
more concrete? This is where Harsanyi appeals to a veil of ignorance. Choosing under the
equiprobability assumption is understood as choosing between two social situations on the
assumption that one is equally likely to turn out to be each member of the population. Then
Harsanyi took the idea that ethics is impartial to be well-modeled by
Veil of Harsanyi: One lottery is at least as good as another if and only if it would be weakly
preferred by every self-interested and rational person choosing under the equiprobability
assumption.
I will skip the formal details, but from Individual EUT, Interpersonal Completeness and
Veil of Harsanyi, Harsanyi gave a simple argument for
Sum: The betterness relation is represented by the expected value of the function
$u_1 + \cdots + u_n$.
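The step from the veil to Sum can be sketched numerically. This is only an illustration of the idea, not Harsanyi's formal argument: it assumes that the person behind the veil maximizes expected utility, treats a history as a tuple of interpersonally comparable utilities, and uses hypothetical numbers and function names.

```python
from fractions import Fraction

# A history is a tuple (u_1(h), ..., u_n(h)); a lottery is a list of
# (probability, history) pairs over a fixed population.

def veil_value(lottery):
    """Expected utility for someone equally likely to be each member of the population."""
    return sum(p * (sum(h) / Fraction(len(h))) for p, h in lottery)

def sum_value(lottery):
    """Expected value of u_1 + ... + u_n, as in Sum."""
    return sum(p * sum(h) for p, h in lottery)

half = Fraction(1, 2)
lottery_1 = [(half, (4, 0, 2)), (half, (1, 1, 1))]
lottery_2 = [(Fraction(1), (2, 1, 1))]

# For a fixed population of size n, veil_value is sum_value divided by n,
# so the two always rank lotteries in the same way.
agree = (veil_value(lottery_1) >= veil_value(lottery_2)) == (
    sum_value(lottery_1) >= sum_value(lottery_2)
)
```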
Rawls () agrees with Harsanyi that ethics is impartial, and that a veil of ignorance is a
good way of modeling impartiality. To focus on their treatment of veils, we will ignore other
differences, such as the different ways in which they understand interpersonal comparisons.
With those aside, Rawls can be taken as agreeing with Individual EUT and Interpersonal
Completeness. But his interpretation of the veil differs. Choosing under the uncertainty
assumption is understood as choosing between two social situations on the assumption that
one will turn out to be one of the members of the population, but with complete uncertainty
about who that will be. Then Rawls took the idea that ethics is impartial to be well-modeled
by
Veil of Rawls: One history is at least as good as another if and only if it would be weakly
preferred by every self-interested and rational person choosing under the uncertainty
assumption.
Rawls then argued that Individual EUT, Interpersonal Completeness, and Veil of Rawls
would result in
Maximin: One history is better than another if and only if the former is better for the worst
off.
Many commentators have thought Rawls should instead have concluded with
Leximin: One history is better than another if and only if it is better for the worst off, or
equally good for the worst off and better for the second worst off, and so on.
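The difference between Maximin, Leximin, and Sum can be seen in a small sketch. It assumes a fixed population size and interpersonally comparable utilities; the numbers and function names are hypothetical.

```python
def maximin_better(h1, h2):
    """h1 is better than h2 iff the worst off in h1 is better off than the worst off in h2."""
    return min(h1) > min(h2)

def leximin_better(h1, h2):
    """Compare the worst off, then the second worst off, and so on (same population size)."""
    return sorted(h1) > sorted(h2)  # Python compares lists lexicographically

def sum_better(h1, h2):
    """h1 is better than h2 iff it has the greater sum of utilities."""
    return sum(h1) > sum(h2)

# Each history is a tuple of utilities, one per person.
h1, h2 = (1, 5, 9), (1, 3, 20)

# Maximin ranks neither history above the other, because the worst off are equally well off.
# Leximin breaks the tie in favour of h1 by looking at the second worst off.
# Sum favours h2.
```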
These arguments raise three basic questions: (i) What does rational choice under the
uncertainty assumption really require? (ii) Given that one is going to model impartiality via
some sort of veil of ignorance, is the uncertainty assumption a better way of doing it than
the equiprobability assumption? (iii) Is modeling impartiality via a veil of ignorance a good
idea anyway?
Briefly, (i) seems to be unclear. For example, suppose the Ramsey-Savage story is
right about rational choice under conditions of uncertainty. For the agent behind the
veil to lack implicit subjective probabilities of any degree of determinateness – and thus
to model complete uncertainty – that story implies that her preferences are incomplete.
At best, maximin (or leximin) would then seem to be but one rationally permissible
choice among many, whereas Rawls needs it to be rationally required (see Angner ()
for further discussion). For (ii), the equiprobability assumption seems at first glance a
reasonable attempt at giving impartiality a concrete and reasonably clear interpretation.
Moreover, given the difficulties in understanding what rationality in conditions of complete
uncertainty requires, it is hard to see what motivates shifting to the uncertainty assumption,
aside from a question-begging attempt to avoid Sum. I will return to some of these issues,
but the most fundamental question is (iii), and a later result of Harsanyi’s seems to show
that the use of veils of ignorance was never a good idea in the first place.
33.7 Harsanyi’s theorem
.............................................................................................................................................................................
To present Harsanyi’s result we need to state two more premises. We continue to assume a
fixed population. The first premise expresses a kind of impartiality.
Impartiality: For all histories $h_1$ and $h_2$, if there is some permutation $\pi$ of the population
such that for each individual i, $h_1$ is exactly as good for i as $h_2$ is for $\pi(i)$, then $h_1$ and $h_2$ are
equally good.
The second premise is a so-called Pareto assumption.
Pareto: (i) If two lotteries are equally good for each member of the population, they are
equally good. (ii) If one lottery is at least as good for every member of the population and
better for some members, then it is better.
This is Harsanyi’s theorem. For an accessible proof, see e.g. Resnik ().
Theorem (Harsanyi): Assume a constant population. Then Individual EUT, Interpersonal
Completeness, Social EUT, Impartiality, and Pareto jointly imply Sum.
To recap what Sum says, the conclusion of the theorem says that one lottery is better than
another just in case it has a greater sum of individual expected utilities. This implies that
one history is better than another just in case it has a greater sum of individual utilities.
However, in its classical form, utilitarianism is usually defined as the claim that one history
is better than another just in case it has a greater sum of individual goodness. This raises the
disputed question of what Sum has to do with utilitarianism, and thus whether Harsanyi’s
premises imply utilitarianism. Roughly speaking, Harsanyi’s premises imply the classical
version of utilitarianism just in case individual utilities are measures of individual goodness.
Simplifying somewhat, Sen () and Weymark () denied that the two should be
identified, whereas along with e.g. Harsanyi (b), Broome (), and Hammond
(), I believe that they should be identified. I will say more about this in section .,
but the most important claim is that it does not really matter who is right. The conclusion
of Harsanyi’s theorem appears to tell us exactly what the content of the betterness relation
is, and what name we should give to that conclusion is of much less importance.
In my view, it is hard to exaggerate the importance of Harsanyi’s result. I will assume
enough familiarity with expected utility theory, references to which were provided earlier,
to see the prima facie case for Individual EUT and Social EUT. The rough idea is that the
prima facie case for rational preference relations satisfying the expected utility axioms can
be modified to apply directly to evaluative relations like individual betterness relations and
the betterness relation. The prima facie case for the other premises is fairly natural as well.
The best way to explore this further will be to look at criticisms of the premises. We will do
that shortly, but first I want to consider how Harsanyi’s theorem improves on what we have
seen so far.
The popular appeal to expected utility theory sketched in section 33.5 suffered from
telling us little of any use about the betterness relation. But if we take individual betterness
relations as given, and accept the premises of Harsanyi’s theorem, the theorem shows that
the content of the betterness relation is completely determined.
Consider now veil of ignorance arguments. Both Harsanyi’s and Rawls’s accept Individual
EUT and Interpersonal Completeness. That leaves Harsanyi’s veil argument with Veil of
Harsanyi and Rawls’s with Veil of Rawls, while Harsanyi’s theorem is left with Social EUT,
Impartiality, and Pareto.
Harsanyi’s veil argument works by assuming that the person behind the veil is rational,
and therefore has preferences which satisfy the expected utility axioms. Given that, Veil of
Harsanyi yields Social EUT, and also, obviously, Impartiality and Pareto. So Harsanyi’s veil
argument enjoys no advantage over his theorem, and the theorem simply bypasses worries
about veil arguments expressed by e.g. Scanlon ().
The comparison with Rawls is less clear. When discussing the veil, Rawls usually
considers only the problem of ranking different histories. But someone behind the veil
could also try to rank different lotteries (thus facing two forms of ignorance: uncertainty
behind the veil, and risk beyond the veil). So we can ask what she thinks about Social EUT,
Impartiality, and Pareto. It would be surprising if the uncertainty assumption led her to
reject any of these claims, and thence Sum. But since Rawls is so plainly opposed to Sum,
I think this suggests that aspects of his informal reasoning have not been fully captured
in what seems to be his formal model. Sections . and . will discuss two major
Rawlsian worries about some of Harsanyi’s premises. But to foreshadow, these worries can
be expressed directly as criticisms of the premises of Harsanyi’s theorem, and appealing to
the veil does not seem to add anything.
Finally, we will see in section . that there is at least one major view about the ethics
of distribution which is impartial but is immediately ruled out by the adoption of a veil
of ignorance, whether Harsanyi’s or Rawls’s. So much the worse for the veil as a model of
impartiality. Thus in my view, the veil turns out to be just an unhelpful distraction, and the
proper focus of attention for the ethics of distribution should be Harsanyi’s theorem.
33.8 Variable populations
.............................................................................................................................................................................
Before looking at various worries about and alternatives to the premises of Harsanyi’s
theorem, it is worth mentioning a way in which it can be extended. Problems where the
population can vary are difficult. But we do not need to add much to the premises of
Harsanyi’s theorem to make progress.
The following says that only the kinds of lives people are living matter, not the identities
of those people.
Anonymity: For all histories $h_1$ and $h_2$ containing finite populations of the same size, if
there is a mapping $\rho$ from the population of $h_1$ onto the population of $h_2$ such that for
every member i of the population of $h_1$, $h_1$ is exactly as good for i as $h_2$ is for $\rho(i)$, then $h_1$
and $h_2$ are equally good.
This premise makes the nonidentity problem discussed by Parfit () rather trivial: if
no one else will be affected, and a woman has to choose between having one of two different
children, Anonymity plus Pareto implies that it would be better if she had the child whose
life would be better.
Let U be the function defined on histories such that for any history h with population
$1, \ldots, n$,
$$U(h) := u_1(h) + u_2(h) + \cdots + u_n(h).$$
Then the premises of Harsanyi’s theorem, but with Impartiality replaced by the stronger
Anonymity, jointly imply
Same Number Claim: Assume that all histories contain populations of the same size. Then
the risk-free betterness relation is represented by U.
Turning to comparisons between populations of different sizes, I will outline an approach
due to Broome () and Blackorby, Bossert, and Donaldson (). I lack the space to
discuss the details, but the crucial step is to argue for the
Neutral existence claim: There exists a life l such that in every situation, provided no one
already existing is affected, (i) it is better to create an extra life which is better than l; (ii) it is
worse to create an extra life which is worse than l; (iii) it is a matter of indifference to create
an extra life which is exactly as good as l.
Call such a life a neutral existence. Given a parameter v, let V be the function defined on
histories such that for each history h with population $1, \ldots, n$,
$$V(h) := (u_1(h) - v) + (u_2(h) - v) + \cdots + (u_n(h) - v).$$
Some simple algebra shows that the same number and neutral existence claims together
imply the
Variable number claim: Assume that all histories contain finite populations. Then the
risk-free betterness relation is represented by V, where v is the utility level of a neutral
existence.
The value of v makes no difference to same number problems. For when comparing two
histories with populations of the same size using V, the subtracted v’s cancel out. In variable
number problems, the presence of v in the definition of V means that ignoring effects on
other people’s lives, someone’s existence makes a positive contribution towards goodness if
and only if her life is better than a neutral existence.
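The roles of U, V, and v can be illustrated with a short sketch. The histories are represented as tuples of utilities, and all numbers are hypothetical; the sketch simply evaluates the two definitions above.

```python
def U(history):
    """Sum of individual utilities; a history is a tuple (u_1(h), ..., u_n(h))."""
    return sum(history)

def V(history, v):
    """Sum of each person's utility minus the utility level v of a neutral existence."""
    return sum(u_i - v for u_i in history)

# Same number: the subtracted v's cancel, so the gap between two histories
# is the same for every v and equals U(h1) - U(h2).
h1, h2 = (5, 3), (4, 2)
same_number_gaps = {V(h1, v) - V(h2, v) for v in (0, 1, 10)}

# Variable number: add a person at utility 2, leaving everyone else unaffected.
h_small, h_large = (5, 3), (5, 3, 2)
better_when_v_is_1 = V(h_large, 1) > V(h_small, 1)  # 2 is above the neutral level
worse_when_v_is_3 = V(h_large, 3) < V(h_small, 3)   # 2 is below the neutral level
```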
Nothing so far said tells us what the value of v is, however. Setting it will involve further
ethical issues, and is difficult to do in a way which respects common intuitions (Broome,
). For example, setting it low leads to the conclusion that a large number of people
(e.g. a billion) all with extremely good lives is worse than an extremely large number (e.g.
a billion billion) all with lives which may seem hardly worth living. Parfit () evidently
did not think much of this idea when he famously called it “the repugnant conclusion.” On
the other hand, setting the value of v high makes it bad to create someone who would have
an intuitively good life, and that may seem implausible too.
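The arithmetic behind both horns of this dilemma can be sketched as follows. The utility levels (100 for an extremely good life, 1 for a life hardly worth living, 40 for an intuitively good life) and the two candidate values of v are assumptions chosen only to make the pattern visible.

```python
def V_uniform(size, utility, v):
    """V for a history in which `size` people all have the same utility level."""
    return size * (utility - v)

BILLION = 10**9

# Low neutral level (v = 0): a billion billion lives at utility 1
# come out better than a billion lives at utility 100.
excellent_lives = V_uniform(BILLION, 100, 0)
barely_worth_living = V_uniform(BILLION * BILLION, 1, 0)
repugnant_ranking = barely_worth_living > excellent_lives

# High neutral level (v = 50): creating one extra life at utility 40
# makes a negative contribution to V.
good_life_counts_as_bad = V_uniform(1, 40, 50) < 0
```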
When we ethicists first start to think seriously about probability, it may seem like a
bane for us, vastly expanding the complexity of questions we have to address. But it may
now look like a blessing. The problem of aggregating individual well-being to form an
overall judgment about when one history is better than another seems difficult. Yet without
appearing to make any assumptions about aggregation, and instead by largely appealing
to expected utility theory, which is all about probability, Harsanyi’s theorem seems to
provide a solution. Section . will provide a closer look at the question of whether the
theorem really does solve the “problem of aggregation.” But we first examine criticisms of
and alternatives to Harsanyi’s premises which are also about probability.
33.9 Equality and fairness
.............................................................................................................................................................................
The additive form of the conclusion of Harsanyi’s theorem will make some suspect that
its premises conflict with the idea that in the distribution of goods, equality and fairness
matter. But where, if anywhere, is the tension? Assume a population of two people, A and B,
and consider the following lotteries, which combine examples due to Diamond () and
Myerson ().
[Tables of the lotteries $L_E$, $L_F$ and $L_U$: each has a row for A and a row for B, and a column for heads and a column for tails.]
Anyone who thinks that equality is valuable should think that $L_E$ is better than $L_F$. For
while $L_E$ and $L_F$ are equally good for each person, $L_E$ has in its favour that it guarantees
equality of outcome while $L_F$ guarantees inequality (Myerson, ). But Pareto implies
that $L_E$ and $L_F$ are equally good, so it is inconsistent with the idea that equality is valuable.
Anyone who thinks that fairness is valuable should think that $L_F$ is better than $L_U$. For
while Impartiality implies that the outcomes under $L_F$ and $L_U$ are equally good, $L_F$ has in
its favour that it distributes the chances fairly (Diamond, ).
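The structure of the three lotteries can be made concrete with a sketch. The utility values 1 and 0 and the fair coin are assumptions made here for illustration, chosen to fit the description in the text: one lottery guarantees equal outcomes, one guarantees unequal outcomes with equal chances, and one gives the same person the better outcome however the coin lands.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Each outcome is (utility of A, utility of B); the values are assumed.
L_E = {"heads": (1, 1), "tails": (0, 0)}  # equality of outcome guaranteed
L_F = {"heads": (1, 0), "tails": (0, 1)}  # inequality guaranteed, chances shared
L_U = {"heads": (1, 0), "tails": (1, 0)}  # A is favoured whatever happens

def individual_expectations(lottery):
    """Expected utility of A and of B under a fair coin toss."""
    return tuple(sum(half * outcome[i] for outcome in lottery.values()) for i in range(2))

def expected_sum(lottery):
    """Expected value of the sum of utilities, as in Sum."""
    return sum(half * sum(outcome) for outcome in lottery.values())

# L_E and L_F are equally good for each person, so Pareto ranks them as equally good.
# All three lotteries have the same expected sum, so Sum cannot separate them.
```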
Diamond’s example leads to the first of a series of challenges to the assumptions about
expected utility in Harsanyi’s premises. By Impartiality, all of the outcomes under $L_F$ and $L_U$
are equally good. Strong Independence of the betterness relation then implies that $L_F$ and $L_U$
are equally good. Hence the assumption that the betterness relation satisfies the expected
utility axioms, in particular Strong Independence, clashes with the idea that fairness is
valuable.
(Proof: for all lotteries L and M, write $L \succsim M$ for “L is at least as good as M”. Write $[a, b]$
for the outcome in which A’s utility is $a$ and B’s is $b$, and suppose $L_U$ yields $[a, b]$ on both heads
and tails while $L_F$ yields $[a, b]$ on one and $[b, a]$ on the other. By Impartiality, $[a, b] \sim [b, a]$.
Strong Independence for $\succsim$ then implies
$L_U = \tfrac{1}{2}[a, b] + \tfrac{1}{2}[a, b] \sim \tfrac{1}{2}[a, b] + \tfrac{1}{2}[b, a] = L_F$, as required.)
I think that Myerson’s and Diamond’s examples lie at the heart of concerns with equality
and fairness. It is difficult to argue for this in a short space, though section . will
say more. But suppose it is correct. How could the examples be generalized into full-blown
theories about what it is for equality or fairness to be valuable?
I will just illustrate an approach for the case of equality. Suppose we are given a preorder
$\succsim_e$ on histories such that $h_1 \succsim_e h_2$ if and only if $h_1$ is uncontroversially (among egalitarians)
at least as good in terms of equality as $h_2$. My own account of the extension of $\succsim_e$ is in
McCarthy (). But to give two simple cases, every equal distribution is going to be
uncontroversially better in terms of equality than every unequal distribution, and all equal
distributions are going to be uncontroversially equally good in terms of equality. Consider
Equality-neutral Pareto: Assume a fixed population. For all lotteries $L_1 = [p_1, h_1; \ldots; p_m, h_m]$
and $L_2 = [p_1, k_1; \ldots; p_m, k_m]$: (i) if $L_1$ is exactly as good as $L_2$ for all individuals, and $h_j \sim_e k_j$
for all j, then $L_1$ and $L_2$ are equally good; and (ii) if $L_1$ is at least as good as $L_2$ for all
individuals and better for some individual, and $h_j \succsim_e k_j$ for all j, then $L_1$ is better than $L_2$.
Equality principle: Assume a fixed population. For all lotteries $L_1 = [p_1, h_1; \ldots; p_m, h_m]$
and $L_2 = [p_1, k_1; \ldots; p_m, k_m]$: if $L_1$ is at least as good as $L_2$ for all individuals, $h_j \succsim_e k_j$
for all j and $h_j \succ_e k_j$ for some j, then $L_1$ is better than $L_2$.
McCarthy () argues that together, these principles are the core of egalitarianism.
Equality-neutral Pareto is a weakening of Pareto, designed to avoid clashes with examples
like Myerson’s. The equality principle is designed to generalize the idea that equality is
valuable, as illustrated by Myerson’s example. Thus we obtain a very general egalitarian
theory by starting with Harsanyi’s premises, weakening Pareto to its equality-neutral cousin,
then adding the equality principle.
Notice that the equality principle is inconsistent with the adoption of either Harsanyi’s
or Rawls’s veil of ignorance. But it can easily be shown to be consistent with the notion
of impartiality captured by Impartiality. So if it was meant only to model impartiality, the
adoption of a veil of ignorance is too strong.
The characterization of the idea that equality is valuable via the equality principle exploits
natural dominance ideas. Roughly speaking, suppose that each part of some object x is at
least as good with respect to some value V as the corresponding part of object y. Then x
is said to weakly dominate y in terms of the value V. If x weakly dominates y, but y does
not weakly dominate x, then x strictly dominates y. Thus the equality principle says that
if $L_1$ weakly dominates $L_2$ in terms of well-being, and strictly dominates $L_2$ in terms of
equality, then $L_1$ is better than $L_2$. I lack the space to discuss the details, but I believe
that the way to characterize the idea that fairness is valuable is to develop dominance
ideas in a way suggested by Diamond’s example. However, while the apparent similarities
between Diamond’s and Myerson’s examples suggest parallels, it appears that there are subtle
asymmetries between concerns with equality and concerns with fairness (McCarthy and
Thomas, ).
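The two dominance notions can be written down generically. In this sketch the objects are tuples of parts, and `at_least_as_good` is a caller-supplied comparison for the value in question; both the representation and the names are assumptions, and the numeric example is hypothetical.

```python
def weakly_dominates(x, y, at_least_as_good):
    """Each part of x is at least as good as the corresponding part of y."""
    return len(x) == len(y) and all(at_least_as_good(a, b) for a, b in zip(x, y))

def strictly_dominates(x, y, at_least_as_good):
    """x weakly dominates y, but y does not weakly dominate x."""
    return weakly_dominates(x, y, at_least_as_good) and not weakly_dominates(
        y, x, at_least_as_good
    )

def numeric_geq(a, b):
    """Example comparison: parts are numbers, and larger is at least as good."""
    return a >= b

x, y = (3, 2, 2), (3, 1, 2)
```

Passing the comparison as an argument leaves room for a preorder that does not rank every pair, in which case two objects may fail to dominate each other in either direction.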
(Describing Diamond’s example as being about fairness is not quite right. In my view it is
better to say that Myerson’s example is about equality of
outcome, and Diamond’s is about equality of prospects, not fairness. But here I stick with the more usual
terminology. For reasons for not talking about fairness, see McCarthy ().)
33.10 Priority
.............................................................................................................................................................................
Parfit () argued that what he called the priority view is an important alternative
to egalitarianism, sharing many of its apparent virtues but avoiding what he called the
leveling-down objection. He summarized it via the slogan that “benefiting the worse off
matters more”, but commentators have been divided over whether he managed to articulate
a genuine alternative to egalitarianism.
A puzzle about making sense of the priority view is that its distinctive feature is advertised
as an intrapersonal phenomenon: what is bad about people being worse off is that they
are worse off than they might have been (Parfit, , p. ). This has suggested to
commentators that according to the priority view, it matters more to benefit
someone the worse off she is even when no others are around at all (Rabinowicz, ).
But in cases where only one person is around and risk is not involved, the priority view, like
any other sane view, will accept that one history is better than another if and only if it is
better for the sole person.
Matters are different, however, when risk is involved. Several commentators have thought
that the priority view should be formulated in a way which makes it have distinctive
consequences in one-person cases involving risk (Rabinowicz, ; McCarthy, ;
Otsuka and Voorhoeve, ). I am inclined to go further and say that the key idea behind
the priority view receives its clearest and most fundamental expression in such cases.
To illustrate, suppose A is the only person around, and compare a history h with a lottery L
over two other histories, with the utility levels, supplied by $u_A$, chosen so that L and h are
equally good for A. Pareto then implies that they are equally good. But I believe that the priority view
should be understood as saying that h is better than L.
More generally, I believe that the key idea of the priority view is what I call the
Priority principle: Assume a fixed population. Suppose histories $h_1$, $h_2$ and $h_3$ each contain
perfect equality. Then (i) $h_1$ is at least as good as $h_2$ if and only if $h_1$ is at least as good for
each individual as $h_2$; and (ii) if for each individual i, $h_1$ is better for i than $h_2$, $h_2$ is better
for i than $h_3$, and $h_2$ is exactly as good for i as $L = [\tfrac{1}{2}, h_1; \tfrac{1}{2}, h_3]$, then $h_2$ is better than L.
Notice that this is inconsistent with equality-neutral Pareto. Some writers find it absurd
that in one-person worlds, the betterness relation and the sole person’s individual betterness
relation could diverge (e.g. Otsuka and Voorhoeve, ), as the priority principle implies.
Rabinowicz () regards this claim as acceptable, while Parfit (), for example, offers
a defense.
But rather than discuss possible defenses of the priority principle, I will note a less
discussed objection to the priority view. The priority view can be formulated by starting with
Harsanyi’s premises, weakening Pareto far enough to accommodate the priority principle,
then adding the priority principle (McCarthy, forthcoming a). But when this is done, any
account of the extension of the betterness relation which is consistent with the Harsanyi
premises turns out to be consistent with the priority view premises, and vice versa. But the
priority view has a more complicated way of describing the betterness relation, because
of the less simple relationship it posits between betterness and individual betterness in
one-person worlds. So the objection is that the priority view fails to provide a reasonable
alternative to the Harsanyi premises, not because of any ethically absurd implications, but
because of the theoretical vice of needless complexity (cf. Harsanyi, b; Broome, ;
McCarthy, , forthcoming a).
33.11 Continuity
.............................................................................................................................................................................
Continuity is seldom discussed. When it is mentioned, it is often said just to be a technical
assumption. But when the claim is that the betterness relation or individual betterness
relations satisfy Continuity, this is a clear mistake.
To illustrate, let a be a very good life, a+ a slightly better life, and z an extremely bad
life, such as being in severe pain or enslaved for a long time. The claim that individual
betterness relations satisfy Continuity implies that there is a gamble which would almost
guarantee an individual a+ with a small chance of z which is better for the individual than
having a for certain. But regardless of what one thinks about this case, it is not a technical
assumption to claim that the risk is worth it. It is a substantive evaluative judgment, and
different views about it are reasonable. For what it is worth, I believe that many of Rawls’s
informal remarks about his veil of ignorance would have been more naturally modeled by
denying that individual betterness relations satisfy Continuity because of this kind of case
than by his actual model.
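
The commitment described above can be made concrete with a minimal numerical sketch. The utility numbers for a, a+ and z are purely illustrative assumptions, as are the function names; nothing here is taken from the chapter. Under expected utility with any finite utility for z, there is always some small enough chance of z at which the gamble counts as better than a for certain. A lexicographic rule, one simple way of denying Continuity, never accepts the gamble.

```python
from fractions import Fraction

# Hypothetical utilities (illustrative assumptions, not from the chapter).
U_A = Fraction(100)        # a: a very good life
U_A_PLUS = Fraction(101)   # a+: a slightly better life
U_Z = Fraction(-10_000)    # z: an extremely bad life


def expected_utility(p_z):
    """Expected utility of the gamble: z with chance p_z, otherwise a+."""
    return p_z * U_Z + (1 - p_z) * U_A_PLUS


def continuity_threshold():
    """Chance of z at which the gamble is exactly as good as a for certain."""
    return (U_A_PLUS - U_A) / (U_A_PLUS - U_Z)


def eu_prefers_gamble(p_z):
    """Expected-utility verdict: is the gamble better than a for certain?"""
    return expected_utility(p_z) > U_A


def lexical_prefers_gamble(p_z):
    """A non-continuous (lexicographic) verdict: any chance of z is decisive."""
    if p_z > 0:
        return False
    return U_A_PLUS > U_A
```

With these assumed numbers the threshold is 1/10101: below it, the expected-utility rule says the risk is worth it, while the lexicographic rule rejects the gamble at every positive chance of z. Which rule is right is exactly the substantive evaluative question at issue.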
It is clear that Continuity is something ethicists should pay attention to. The good news is
that the result of weakening the expected utility axioms by dropping Continuity is formally
well understood, thanks to results by Hausner () and others.
But there are several pieces of bad news. First, the general statement of Hausner’s result is
quite mathematically complex and not easy to speak about informally. Secondly, it is time to
stop speaking of the continuity axiom. There are several EUT-style continuity axioms (see
e.g. Hammond, ), and it is far from clear what the ethical grounds for adopting one
but not another might be. Thirdly, speaking loosely, Continuity failures occur when one
lottery in some sense has “infinitesimal” value compared with another. But such cases pose
a challenge to standard treatments of probability as well, and this needs to be incorporated
into the analysis. In summary, perhaps in the end ethicists can safely ignore Continuity.
But it would be better to know that than to hope for it, and the work needed to arrive at such
a conclusion appears to be substantial.
33.12 Incommensurability
.............................................................................................................................................................................
One of the major contributions of the contractualist literature has been to force us to take
seriously difficulties with evaluative comparisons of different kinds of goods. But part of
As an analogy, consider again the best-system analysis of laws. Suppose someone offers some
account of the laws of the world which captures all relevant facts. But this account is more complex
than some other account which also captures all relevant facts. On the best-system analysis, the more
complex account is mistaken about what the laws are, despite getting the relevant facts right. McCarthy
(forthcoming a) argues that the priority view is mistaken on similar grounds.
For an accessible account of how the challenge applies to Savage’s treatment of subjective probability,
and a sketch of mathematically sophisticated responses, see Gilboa () pp. –.
For recent work in this direction, see Jensen ().
the assumption that the betterness relation and individual betterness relations satisfy the
expected utility axioms is that these relations are complete. But from the perspective of
difficulties with evaluative comparisons, such completeness assumptions look far from
obvious. They may seem particularly implausible if we adopt the popular view that the
basis for such things as interpersonal comparisons should be as neutral as possible between
competing substantive views about what a good life is, as argued, for example, in Rawls
().
One response would be to adopt something like resources, freedoms, or opportunities
as the basis for interpersonal and intrapersonal comparisons (see e.g. Rawls, ; Sen,
). However, the premises of Harsanyi’s theorem are silent on the content of individual
betterness relations, so there is no obvious reason why the theorem cannot be run when
their content is understood in terms of resources and so on. Nevertheless, even resources
have their own problems to do with comparability because of the different nature of different
kinds of resources. So this response is a diversion, and we should turn directly to Harsanyi’s
premises to see what can be done about difficulties with comparability.
The most immediately tempting response is simply to drop the completeness assumptions. This means that the various evaluative relations featuring in the theorem become
preorders which are not assumed to be complete. A large advantage of working with
preorders is that mathematically speaking, they are relatively tractable. For example, a
corollary of Szpilrajn’s theorem is that a preorder is identical to the intersection of all of the
complete preorders which extend it. This has the advantage that in thinking about preorders
one can often work with complete preorders anyway.
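
For a small finite case, the corollary can be checked by brute force. The sketch below is only an illustration under stated assumptions: the three items and the particular incomplete preorder are hypothetical, and "extends" is read as preserving both weak and strict comparisons. It enumerates every complete preorder on the items, keeps those that extend the given preorder, and intersects them.

```python
from itertools import product


def complete_preorders(items):
    """Yield each complete preorder on items as a frozenset of pairs (x, y),
    read as 'x is at least as good as y'."""
    seen = set()
    for ranks in product(range(len(items)), repeat=len(items)):
        rank = dict(zip(items, ranks))
        relation = frozenset(
            (x, y) for x in items for y in items if rank[x] >= rank[y]
        )
        if relation not in seen:  # different rankings can induce the same relation
            seen.add(relation)
            yield relation


def extends(bigger, smaller):
    """True if bigger keeps every weak and every strict comparison of smaller."""
    if not smaller <= bigger:
        return False
    strict = {(x, y) for (x, y) in smaller if (y, x) not in smaller}
    return all((y, x) not in bigger for (x, y) in strict)


def intersection_of_complete_extensions(items, preorder):
    """Return the intersection of all complete extensions, and how many there are."""
    extensions = [r for r in complete_preorders(items) if extends(r, preorder)]
    return frozenset.intersection(*extensions), len(extensions)


ITEMS = ("a", "b", "c")
# Hypothetical incomplete preorder: a is better than b; c is comparable to neither.
PREORDER = frozenset({("a", "a"), ("b", "b"), ("c", "c"), ("a", "b")})

common, count = intersection_of_complete_extensions(ITEMS, PREORDER)
```

Here `common` coincides with `PREORDER`: a comparison holds in the incomplete preorder just in case it holds in every complete extension.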
This corollary is strikingly parallel to the supervaluationist treatment of vague predicates:
a sentence involving a vague predicate is true if it is true on all admissible sharpenings of
the predicate, false if it is false on all admissible sharpenings, and neither true nor false
otherwise.
But this should suggest caution: if a natural response to difficulties to do with comparability is to shift to preorders, the response looks like one of the classic candidates for a
solution to the problem of vagueness. But supervaluationist approaches have been heavily
criticized (see e.g. Williamson, ). Furthermore, perhaps the parallel suggests that the
basic problem with comparing different kinds of goods is one of vagueness. In fact, cases
in which evaluative comparisons look extremely difficult seem to lend themselves to sorites
paradoxes, one of the hallmarks of vagueness.
In one way this is good news: there is a vast amount of work on vagueness, so ethicists
have plenty of material to borrow from. Since the topic is probability, it is worth mentioning
that some treatments of vagueness are probabilistic, and that an extensive literature takes
this approach to vague comparatives; see e.g. Fishburn () for a survey. In another way
it’s bad news: perhaps the main reason why there is so much literature on vagueness is the
almost complete lack of consensus.
Perhaps we ethicists should just shelve the problem of how best to model difficulties to do
with evaluative comparisons until there is more convergence in the literature on vagueness.
However, in the absence of such convergence, it may still be possible to achieve some
kind of stability result: show that the solutions to a class of interesting ethical problems
which involve goods which are difficult to compare are insensitive to the resolution of more
general problems about vagueness. For example, Broome () takes this approach in his
discussion of the neutral level for existence. In section . I will suggest that the same can
be done for the question of what Harsanyi’s theorem really shows.
33.13 Non-expected utility theory
.............................................................................................................................................................................
The backbone of Harsanyi’s theorem is expected utility theory, but we have seen a number of
ways in which the claim that various evaluative relations satisfy the expected utility axioms
can be criticized. The axioms so far criticized are Strong Independence, Ordering (insofar
as completeness was criticized), and Continuity. Some writers even go so far as to criticize
transitivity (see e.g. Temkin, ).
These criticisms are directly based either on distributive intuitions (Strong Independence,
Continuity), or on the nature of goods being distributed (Ordering). But a serious question
about the expected utility axioms arises from a different direction.
Since Allais () and Ellsberg (), it has appeared to many that individual
preference relations violate the expected utility axioms in fairly systematic ways. The attempt
to describe these violations has led to a huge body of work developing alternatives to the
expected utility axioms (for surveys see e.g. Schmidt, ; Sugden, ; Gilboa, ;
Wakker, ).
This project has been accompanied by two broad views. One is that the alternative axioms
simply help us catalogue human irrationality, which might of course be very important in
various descriptive and explanatory contexts. The other, often prompted by the fact that the
violations are often stable under criticism, is that the support the alternative axioms tacitly
enjoy genuinely threatens the picture of rationality provided by expected utility theory.
Now these are views about rationality, whereas we have been interested in such things
as betterness and betterness for people. But the development of non-expected utility theory
suggests that it would be interesting to modify distributive theories which to varying extents
involve the expected utility axioms by weakening those axioms and then adding some of the
non-expected utility axioms.
If the application of the non-expected utility axioms to such things as individual
betterness relations turns out to be reasonably well motivated, the result should be an
expanded account of reasonable distributive theories. But even if those axioms are not well
motivated when applied to evaluative relations, this project would still be worth pursuing.
If a class of popular distributive intuitions turns out to be generated by such an application
of non-expected utility theory, we would in effect have an important error theory.
33.14 Evaluative measures
.............................................................................................................................................................................
Discussions of the ethics of distribution commonly assume the existence of quantitative
measures of various evaluative properties, then use these measures to formulate various
apparently natural ideas. For example, individual goodness measures, quantitative measures
of how good histories are for individuals, are often taken to exist. Then assuming a constant
population 1, . . . , n, it is often claimed that
(U) According to utilitarianism, two histories are equally good if they contain the same sum
of individual goodness.
(E) According to egalitarianism, an equal distribution is better than an unequal distribution
of the same sum of individual goodness.
(P) According to the priority view, it is better to give a unit of individual goodness to a
worse-off person than to a better-off person.
These claims tacitly assume that talk of units of individual goodness is well-defined. They
are often taken to be (at least partial) definitions of the distributive theories in question,
making what seems natural or appealing about the theories in question transparent. For
more detail, McCarthy () examines the role of evaluative measurement in common
understandings of the priority view.
However, there are serious difficulties with this kind of approach to the ethics of
distribution. I will mention just one specific problem.
The only obvious fact about individual goodness measures is that they have to represent risk-free individual betterness relations. But this only makes individual goodness
measures unique up to increasing transformation. For units of individual goodness
to be well-defined, however, individual goodness measures must be unique up to positive affine
transformation. So to make them well-defined it looks as if we need to make an arbitrary
choice of measure (Broome, , p. ). But this will make the theories partially defined
by (U), (E), and (P) rest on an arbitrary choice, and fail to vindicate the idea that they are
the fundamental theories about the ethics of distribution we take them to be.
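
A small numerical sketch may help show why uniqueness up to increasing transformation is not enough. All numbers and function names below are hypothetical. The three measures order individual levels identically, so each represents the same risk-free individual betterness relation, but they disagree about whether two distributions contain "the same sum of individual goodness", which is the notion claim (U) relies on.

```python
def identity(x):
    """A baseline individual goodness measure (assumed for illustration)."""
    return x


def affine(x):
    """A positive affine transformation of the baseline: 2x + 3."""
    return 2 * x + 3


def squared(x):
    """Strictly increasing on non-negative levels, but not affine."""
    return x * x


def same_ranking(levels, measure):
    """Does measure order the given levels exactly as the baseline does?"""
    return sorted(levels) == sorted(levels, key=measure)


def same_sum(dist_1, dist_2, measure):
    """Do two distributions contain the same sum of goodness under measure?"""
    return sum(map(measure, dist_1)) == sum(map(measure, dist_2))


UNEQUAL = (1, 3)  # hypothetical two-person distribution
EQUAL = (2, 2)    # hypothetical equal distribution
```

Under `identity` and `affine` the two distributions have equal sums; under `squared` they do not, even though `squared` ranks individual levels in the same way. Talk of sums and units therefore survives positive affine transformations but not arbitrary increasing ones.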
More generally, taking the existence of quantitative evaluative measures as given, then
using them to theorize about the ethics of distribution, is strongly at variance with standard
views about measurement in the physical and social sciences. There, quantitative measures
are seen as emerging as canonical descriptions of qualitatively described prior structures
(see e.g. Krantz, Luce, Suppes and Tversky, ; Narens, ; Roberts, ). My own
view is that we should treat evaluative measurement along the same lines.
By itself, this does not begin to settle what we should say about individual goodness
measures. But individual goodness measures turn out to be well-enough defined for talk
of units, sums, and so on to make sense, at least given certain background assumptions.
I can only sketch this view, but in more detail, sections . and . point to a
characterization of egalitarianism and the priority view in terms of primitive qualitative
relations (betterness, individual betterness). Similarly, I think the premises of Harsanyi’s
theorem should be understood as characterizing utilitarianism. Now (U), (E), and (P) are
close to platitudinous. But given these characterizations of utilitarianism, egalitarianism
and prioritarianism, this means that we can treat (U), (E), and (P) as implicit definitions of
individual goodness measures. The result is that individual goodness measures turn out to
be the positive affine transformations of u1, . . . , un, or what Broome () calls Bernoulli’s
hypothesis. For details, see, for example, McCarthy ().
The background assumptions are that individual betterness relations satisfy the expected
utility axioms and that interpersonal comparisons are unproblematic. But what if these fail?
I will not pursue this, for I think the most important lesson about evaluative measures
is not that they are arguably well-defined, but that it does not much matter. We can and
I.e. if some function f represents the risk-free betterness relation, and g is some strictly increasing
function on the reals (x < y ⇒ g(x) < g(y)), then g ◦ f also represents the risk-free betterness relation.
should theorize about the questions which really matter in the ethics of distribution without
using evaluative measures. By focusing instead on comparatives and various claims about
probability, none of the distributive views we have been discussing presuppose the existence
of evaluative measures, the preeminent example, of course, being Harsanyi’s.
33.15 Aggregation
.............................................................................................................................................................................
But this raises the question of what Harsanyi’s theorem really shows. Ethicists often talk
about the “problem of aggregation”. What they typically have in mind is the task of somehow
combining an assessment of what things are like for each individual in a particular situation
to form some sort of overall judgment of the situation which enables us to make an
evaluative comparison with other situations.
Supposing the premises of Harsanyi’s theorem are correct, it is tempting to think that
Harsanyi’s theorem solves the problem of aggregation. I believe this was Harsanyi’s view, and
I think it is popular among welfare economists. Harsanyi did not use the terms ‘individual
betterness relation’ and ‘betterness relation’, and I stress that the following passage is mine,
not his. But I think the following captures the spirit of his view (see especially Harsanyi,
a).
Determining the content of individual preference relations (despite filtering out various
irrationalities, excluding such things as sadistic preferences, and requiring preferences to
be rich enough to enable interpersonal comparisons) is basically a psychological matter
(Harsanyi, a). It does not involve any significant evaluative or aggregative assumptions.
But we should identify individual betterness relations with individual preference relations.
Given the truth of Harsanyi’s premises, Harsanyi’s theorem then explicitly determines the
extension of the betterness relation. Problem of aggregation solved.
This position underplays the role of evaluative assumptions in determining the content
of individual betterness relations in at least two ways. First, determining the content of
individual preference relations may well involve prior evaluative assumptions because of
the role of such assumptions in popular accounts of radical interpretation (see e.g. Lewis,
). Secondly, even when they are restricted to histories, identifying individual betterness
relations with individual preference relations is highly controversial. It is a major evaluative
question whether to understand the content of risk-free individual betterness relations in
terms of preferences, the quality of the individual’s experiences, her achievements, or some
combination thereof.
But suppose that evaluative question has been settled, and that Harsanyi’s premises are
true. The theorem certainly shows that figuring out the content of the betterness relation is
no harder than determining the content of individual betterness relations. But what exactly
does it show about the problem of aggregation?
First, it is a vast exaggeration to say that the theorem solves the problem of aggregation.
Problems of aggregation arise whenever we have to make some sort of assessment of a whole
based on an assessment of its parts. But figuring out the content of individual betterness
relations involves major questions of aggregation. Even in the case in which all outcomes
are equally likely, to assess whether facing some lottery is better for someone than some
particular outcome, we will have to assess what each of the possible outcomes of the lottery
is like for her, then somehow aggregate to reach an overall assessment of the lottery.
This problem is complicated and is, in my view, much neglected. Like the accounts of many
economists, Harsanyi’s own account tacitly appeals to the individual’s preferences. But this should not
seem very appealing to those of us who think that preference satisfaction accounts are
mistaken even for the question of when one outcome is better for an individual than another.
Secondly, there is no logical reason why we cannot use the theorem to deduce the content
of individual betterness relations from the content of the betterness relation, in particular
from judgments about when one history is better than another. In cases where we are very
confident about the latter, this will even seem appealing. I am afraid I lack the space to
discuss this, but I think this idea provides a natural way of interpreting various contractualist
comments about veil of ignorance arguments (see e.g. Scanlon, ; Nagel, ), in
particular leading to an interesting case for rejecting the claim that individual betterness
relations satisfy Continuity.
More generally, if its premises are true, Harsanyi’s theorem teaches us that determining
the content of the betterness relation is easier than we may have thought. But the flipside
is that determining the content of individual betterness relations is harder than many of us
have assumed.
33.16 Summary on evaluation
.............................................................................................................................................................................
When thinking about the ethics of distribution, it may seem that the real evaluative
questions are about when one history is better than another, or better for some individual.
Factoring in probability may then seem like a basically technical exercise, not one ethicists
need be much concerned with.
Almost every topic discussed could easily have its own survey article. I have had to
omit many important positions, and give only sketchy defenses of positive positions.
Nevertheless, I have tried to make the case for the opposite view. Not only are there
very important ethical issues about how to rank lotteries, but these issues directly bear on
questions about when one history is better than another. I will end the evaluative discussion
with two opinions.
First, if I am right, almost every major position on the ethics of distribution is essentially
to do with probability. For example, assuming a constant population 1, . . . , n, concerns with
fairness, equality, and giving priority to the worse-off as characterized in sections . and
. can each be shown to be consistent with the popular idea that the risk-free betterness
relation is represented by w ◦ u1 + · · · + w ◦ un for some strictly increasing and strictly
concave function w. These views come apart only when probability is introduced. So one
aspect of the importance of probability is the increase in expressive power its introduction
provides: it allows us to draw distinctions which are difficult or impossible to draw in a
risk-free framework.
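
The point that views can agree on every risk-free comparison yet diverge once lotteries are introduced can be illustrated with a rough sketch. The two rules below are hypothetical stand-ins chosen for simplicity, not the characterizations given in the chapter. Both reduce to the sum of w(u_i) on histories, with w assumed to be the square root. One applies w to each outcome and then takes expectations ("ex post"); the other takes each individual's expectation first and then applies w ("ex ante"). A lottery is represented as a list of (probability, history) pairs, and a history as a tuple of individual utility levels.

```python
import math


def w(x):
    """A strictly increasing, strictly concave function, assumed for illustration."""
    return math.sqrt(x)


def history_value(history):
    """Risk-free value of a history: the sum of w(u_i) over individuals."""
    return sum(w(u) for u in history)


def ex_post_value(lottery):
    """Expectation of the sum of w(u_i): transform outcomes, then average."""
    return sum(p * history_value(h) for p, h in lottery)


def ex_ante_value(lottery):
    """Sum of w applied to each individual's expected u_i: average, then transform."""
    n = len(lottery[0][1])
    return sum(w(sum(p * h[i] for p, h in lottery)) for i in range(n))


# Hypothetical one-person case with equal expected utility for the sole person.
SURE = [(1.0, (2,))]
GAMBLE = [(0.5, (4,)), (0.5, (0,))]
```

On any single history the two rules agree with `history_value`. On the one-person pair above, the ex ante rule treats `SURE` and `GAMBLE` as equally good, while the ex post rule ranks `SURE` higher, which mirrors the kind of one-person divergence discussed earlier in connection with the priority view.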
Secondly, I think the various challenges to Harsanyi’s premises stemming from appeals to
equality, fairness, priority, and non-expected utility theory fail. To be sure, there is at least
a reasonable case for rejecting Continuity, and Ordering (at least, the completeness part of
it) is under serious threat. Nevertheless, we can drop Continuity and under many ways of
modeling difficulties to do with comparability, what I take to be the core lesson of Harsanyi’s
theorem remains stable: determining the content of individual betterness relations and
determining the content of the betterness relation are just different descriptions of the
same problem. This may help. Our initial judgments about individual betterness and about
betterness may be in tension with each other, and we may be more confident about some
judgments than others. Harmonizing these judgments in an attempt to achieve reflective
equilibrium may increase our confidence in the result.
33.17 Ought
.............................................................................................................................................................................
Expected utility theory has turned out to be hugely important for developing a taxonomy of
answers to the fundamental evaluative question: when is one history or lottery better than
another? I have not emphasized this, but I think that the clarity of this taxonomy is also
extremely helpful for assessing which answer is correct. In the remaining space I have room
for only one suggestion which, though hardly very original, is that the same turns out to be
true for decision theory and the fundamental normative question: what ought we to do?
One immediate disclaimer is needed. Expected utility theory is usually understood
as a theory about the structure of the preferences of ideally rational agents. But this
chapter has discussed the application of expected utility theory to understanding evaluative
comparatives without having to say anything about rationality. Rather, many of the ideas
and criticisms of expected utility theory are directly applicable to questions about evaluative
comparatives.
Similarly, decision theory is usually understood as an account of ideally rational action,
and it is typically assumed that the rationality of an action depends in some way upon the
agent’s preferences. However, we can apply many ideas from decision theory directly to
questions about the fundamental normative question without having to presuppose some
grand connection between rationality and ethics. For example, it is a serious mistake to
think that decision theory is going to be important to ethics only if ethics is somehow about
preference satisfaction, or if we hitch ourselves to the unlikely project of deriving ethics
from rationality. Thus the discussion of decision theory in what follows is only meant to
draw parallels between questions about ethics and questions about rationality. Because the
debates about rationality are often better developed, these parallels may be illuminating.
With no attempt at exhaustiveness, the sequel will look briefly at three examples, with
particular emphasis on probability.
33.18 Act consequentialism
.............................................................................................................................................................................
Given some account of betterness, the most obvious ethical theory is act consequentialism:
what we ought to do is to bring about the best available lottery. If we assume for simplicity
that the betterness relation satisies the expected utility axioms, act consequentialism then
In fact, this is true even if we weaken some of the EUT ideas in Harsanyi’s framework and add
various well-known non-EUT ideas. This is further pursued in McCarthy, Mikkola, and Thomas ()
and McCarthy (forthcoming b).
implies that there is some value function such that we ought to perform the action with
the greatest expected value. Thus act consequentialism is the ethical theory which most
obviously parallels decision theory.
Act consequentialism is also one of the most criticized theories, one standard criticism
being that it has implausible implications. For example, assuming an impartial method
of valuation, Williams () argued that act consequentialism undermines the partiality
which for many people makes life worth living: devotion to personal projects and particular
people, often friends and family. But this raises the question of what act consequentialism
really requires in the first place.
Taking for granted a probability-based view which uses subjective probabilities, or
at least, probabilities which are relative to the evidence available to the agent, Jackson
() famously argued that because of facts about each individual’s probabilities, act
consequentialism will typically not require each agent to promote general well-being and
pursue whichever projects are the most impartially valuable. Rather, it will require a typical
agent – Alice, let’s call her – to promote the well-being of the relatively small group of
people Alice knows and cares about, and to adopt and then pursue projects in which
Alice takes a natural interest. This does not amount to a rejection of impartial valuation,
but instead reflects facts about each agent’s limited information, the costs of deliberation
and of acquiring new information, the complexity of the interpersonal and intrapersonal
coordination problems she faces, the effects her actions will have on the expectations others
will have of her future behaviour, her motivational strengths, and so on. Such facts will be
encoded in the agent’s probabilities, and will therefore affect which of her acts will maximize
expected value. Very often, Jackson argued, such acts will favour her nearest and dearest.
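
The structure of this argument can be sketched with a toy calculation. Everything below is a hypothetical illustration: the act labels, the impartial values, and the probabilities are invented, and an act that fails is assumed to produce zero value. The value function is the same for every agent; only the agent's probabilities differ.

```python
def expected_value(act, value, probability):
    """Expected impartial value of an act, using the agent's own probability
    that the act succeeds (a failed act is assumed to be worth 0)."""
    return probability[act] * value[act]


def best_act(value, probability):
    """The act with the greatest expected value for this agent."""
    return max(value, key=lambda act: expected_value(act, value, probability))


# Impartial value of each act if it succeeds (hypothetical numbers).
VALUE = {"help distant strangers": 10, "help friends and family": 6}

# Alice knows little about how to help strangers effectively (hypothetical).
ALICE = {"help distant strangers": 0.3, "help friends and family": 0.9}

# A hypothetical agent who is equally well placed to succeed at both.
OTHER_AGENT = {"help distant strangers": 0.9, "help friends and family": 0.9}
```

With the same impartial valuation, `best_act(VALUE, ALICE)` favours friends and family while `best_act(VALUE, OTHER_AGENT)` favours distant strangers. The difference comes entirely from the probabilities, which is the shape of the response to Williams described above.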
Jackson’s argument was offered as a response to Williams, but it offers a much more
general lesson. Understanding what act consequentialism implies is going to require
sophisticated thinking about probability. The huge complexity of this problem stands in
sharp contrast to the occasional complaint that act consequentialism is simple-minded.
33.19 Rule consequentialism
.............................................................................................................................................................................
Many writers, however, prefer rule consequentialism (or contractualism: at the normative
level, these views are often very similar). On the one hand, rule consequentialism seems to
fit better with common opinion about what we ought to do than act consequentialism (it is
said to secure rights etc.). On the other, it seems to avoid the obscurities of deontology by
resting its account of what we ought to do on an appeal to what is good for people. But how
is this achieved?
Harsanyi’s writings on rule utilitarianism offer a relatively clear answer. Simplifying
slightly, Harsanyi () claims that each member of a society of act utilitarians will always
maximize the sum of expected individual utilities where the calculation is based on her
subjective probabilities of what the other members are going to do. Each member of a
society of rule utilitarians is committed to and thus will always act upon the rule R which
is such that if everyone acts according to R expected utility will be greater than if everyone
acts according to some other rule (I ignore the possibility that two rules could be tied).
Harsanyi claims that rule utilitarianism will lead to “incomparably superior” overall results
in comparison with act utilitarianism because of its superiority in two kinds of scenarios:
(i) in certain simultaneous coordination games (e.g. choosing whether to vote), and (ii)
in certain sequential games (typically involving choices about respecting rights, keeping
promises etc.). This superiority is despite the fact that R will sometimes tell agents to
perform actions which they are certain will produce suboptimal results, where optimality
is understood in terms of maximizing the sum of expected utilities. This last feature leads
many to suspect that there is something unstable about rule utilitarianism, but Harsanyi
claims that these superior overall results imply that rule utilitarianism is correct.
It would take a separate article even to outline the important issues here, and I merely
want to make three points to illustrate the potential value of looking at this style of argument
through the lens of contemporary debates about decision theory. To do that, I will assume
for the sake of argument (though this is far from obvious) that Harsanyi is right about the
superior overall results of rule utilitarianism in comparison with act utilitarianism.
First, Harsanyi stresses that the rule utilitarians take themselves to be facing a problem
involving complete probabilistic dependence: each will commit to (and thus act on) rule R if
and only if all commit to R. In this respect, rule utilitarians are like clones in the well-known
case of clones playing a prisoner’s dilemma. It is this probabilistic dependence which leads
to rule utilitarianism’s superior performance in the coordination games. However, in these
coordination games, there is causal independence between the actions of each player. But
“probabilistic dependence yet causal independence” takes us to a crucial issue in decision
theory. Very roughly, so-called evidential decision theory assesses (the rationality of) actions
in terms of how likely good outcomes are conditional upon the actions being performed. By
contrast, causal decision theory assesses actions in terms of their causal tendency to produce
good outcomes. The classic case in which the two come apart is Newcomb’s problem.
However, for those of us who think that Newcomb’s problem teaches us to be causal decision
theorists (see e.g. Joyce, ), probabilistic dependence is a red herring when there is causal
independence, as there plainly is in Harsanyi’s simultaneous coordination games. So we
may think that Harsanyi has tacitly built something like evidential decision theory into rule
utilitarianism, and so much the worse for rule utilitarianism.
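The contrast between the two decision theories in the clone case can be made concrete with a minimal sketch. The payoff numbers and function names below are hypothetical, chosen only for illustration; nothing here is drawn from Harsanyi's own models. The evidential calculation conditions on the agent's own act, so that a clone's act matches it with some probability; the causal calculation holds the probability of the other's act fixed whatever the agent does.

```python
# Hypothetical prisoner's dilemma payoffs to "me": (my_act, other_act) -> utility.
# "C" = cooperate (act on rule R), "D" = defect. The numbers are illustrative only.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def evidential_eu(act, match_prob):
    """Expected utility conditional on performing `act`, assuming the other
    player performs the same act with probability `match_prob`."""
    other_diff = "D" if act == "C" else "C"
    return (match_prob * PAYOFF[(act, act)]
            + (1 - match_prob) * PAYOFF[(act, other_diff)])


def causal_eu(act, prob_other_cooperates):
    """Expected utility when the other's act is causally independent of mine,
    so its probability is held fixed whichever act I choose."""
    return (prob_other_cooperates * PAYOFF[(act, "C")]
            + (1 - prob_other_cooperates) * PAYOFF[(act, "D")])


def best_act(eu, param):
    """Return the act with the higher expected utility under `eu`."""
    return max(("C", "D"), key=lambda act: eu(act, param))


# Complete probabilistic dependence (clones): evidential reasoning cooperates.
clone_choice = best_act(evidential_eu, 1.0)

# Causal independence: defection comes out ahead for each fixed probability tried.
causal_choices = [best_act(causal_eu, r) for r in (0.0, 0.25, 0.5, 1.0)]
```

With these assumed payoffs, complete dependence makes cooperation look best on the evidential calculation, while defection dominates on the causal one, which is the structure of the worry about rule utilitarianism raised above.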
Secondly, the success of rule utilitarianism in various sequential games stems from the
rule utilitarians’ commitment to the rule R even in contexts in which acting on R leads to
suboptimal results. The conclusion that, in virtue of this success, rule utilitarianism is right
about what we ought to do parallels a revision to standard decision theory later urged by
Gauthier () and McClennen (). This revision claims that if it is rational at time t to
become committed to performing some action at a later time t′ which is obviously irrational
when considered in isolation, it is rational to commit to the action and then later perform
that action. But those of us who take the toxin puzzle of Kavka () to dramatize why this
revision is mistaken may think that rule utilitarianism is making the same kind of mistake.
Thirdly, Harsanyi’s characterization of act versus rule utilitarianism parallels the influential
distinction in von Neumann and Morgenstern () between games against nature
and games against other people. Each act utilitarian will have probabilities about a number
of relevant variables, and will maximize expected value accordingly. The fact that some
of these variables are the behaviour of other people who, like herself, are act utilitarians
is neither here nor there; the decision theoretic model still applies. But when an agent
is in a situation in which the outcome depends in part on the behaviour of agents just
like her, von Neumann and Morgenstern argued that decision theory is inappropriate. The
problem of self-reference embedded into such situations requires the different tools of game
theory, and Harsanyi’s rule utilitarians reason along similar lines. Perhaps von Neumann
and Morgenstern’s argument could be used to bolster Harsanyi’s approach. Alternatively,
those of us who are convinced by Skyrms () in thinking that problems of self-reference
can and should be handled without having to abandon decision theory may think this points
to a further difficulty for rule utilitarianism.
Of course, the fact that Harsanyi focussed on rule utilitarianism rather than rule
consequentialism has been inessential to the discussion. These crude and preliminary
remarks are meant only to suggest the value of looking at the foundations of rule
consequentialism through the lens of parallel and often much more extensive debates about
decision theory.
33.20 Deontology
.............................................................................................................................................................................
Those with strong deontological intuitions may reject rule consequentialism, either because
they are not convinced that it is a stable alternative to act consequentialism, or because
its conclusions are not deontological enough. But we may now seem to have reached the
limits of the usefulness of thinking about decision theory. Very roughly, anything like a
decision theoretic approach to deontology looks like the wrong model: the former is all
about weighing goods against evils, and the latter thinks there are circumstances in which
such weighing is illegitimate, or counts for nothing. Nevertheless, one lesson from thinking
about probability is that weighing is not so easy to avoid.
In trying to characterize a deontological view, there seem to be two basic options. What
I will call agent-centered views typically prohibit actions which would involve the agent’s
mental states bearing some kind of inappropriate relation to the outcome. The most obvious
example is the so-called principle of double effect, which in its simplest form prohibits
bringing about intended harm, but permits certain otherwise identical cases of bringing
about merely foreseen harm. What I will call causal structure views typically prohibit actions
which stand in some kind of inappropriate causal relation to the outcome. For example, in
the famous trolley problem, an out-of-control trolley is going to kill five people who are stuck
on the track, but a bystander can switch the trolley to a sidetrack where it will kill one person.
Many people who have strong deontological intuitions think it is permissible to switch the
trolley. But in most cases, they think that killing one to save five is impermissible, as in the
variant where the bystander can push a fat man off a bridge to stop the trolley (Thomson,
), killing him but saving the five. Causal structure theorists think the intentions of the
bystander are irrelevant, and search for differences in the causal structure of the cases to
explain the difference in permissibility.
Many deontologists have not had much sympathy for agent-centered views, and have
preferred some kind of causal structure view (e.g. Kamm, ). But here is what I believe is
a relatively neglected problem about such views. If the inappropriate causal relation is between the
action and the outcome – as in, e.g., the fat man variant but not the trolley problem itself –
then prima facie, there are going to be actions which bring about the following lotteries:
some benefit occurs with nonzero probability p, some inappropriate causal structure obtains
with probability 1 − p. For example, driving a truck across the bridge will either miss the fat
man and deliver aid elsewhere, or else hit him and topple him off the bridge, stopping the
trolley and saving the five.
What should causal structure deontologists say about such actions? There are at least
five responses: (i) All such actions are impermissible. Objection: this leads to an intolerably
restrictive view. (ii) Such actions are impermissible if and only if they turn out to result in
the inappropriate causal structure. Objection: similar to the objections to outcome-based
views in section .. (iii) Actions which lead to the inappropriate causal structure with
probability one are impermissible, all others are permissible. Objection: it is not credible
that there should be such a gulf between probability one and probabilities just less than one.
(iv) Actions performed by agents whose reasons for performing them include the benefits
resulting from the inappropriate structure are impermissible. Objection: this collapses
causal structure views into agent-centered views. (v) Actions are impermissible if and only
if the probability of the inappropriate causal structure exceeds some intermediate threshold.
Objection: this seems to be the most
principled response for a causal structure view, but it suggests the acceptability of weighing
the alleged badness of the causal structure against the production of benefits. This seems to
fit poorly with the guiding deontological image of the inappropriateness of weighing when
inappropriate causal structures are concerned.
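The purely probabilistic responses, (i), (iii) and (v), can be set side by side in a minimal sketch. The function names, the sample probabilities and the threshold of 0.5 are all hypothetical, introduced only for illustration. Responses (ii) and (iv) are left out because they turn on the actual outcome and on the agent's reasons respectively, not on the probability alone.

```python
def permissible_i(prob_structure):
    """Response (i): any chance of the inappropriate causal structure
    makes the action impermissible."""
    return prob_structure == 0.0


def permissible_iii(prob_structure):
    """Response (iii): only actions leading to the inappropriate causal
    structure with probability one are impermissible."""
    return prob_structure < 1.0


def permissible_v(prob_structure, threshold):
    """Response (v): impermissible if and only if the probability of the
    inappropriate causal structure exceeds `threshold`."""
    return prob_structure <= threshold


# Hypothetical probabilities of hitting the man when driving across the bridge.
careful_drive = 0.001
reckless_drive = 0.999

verdicts = {
    "careful": (permissible_i(careful_drive),
                permissible_iii(careful_drive),
                permissible_v(careful_drive, 0.5)),
    "reckless": (permissible_i(reckless_drive),
                 permissible_iii(reckless_drive),
                 permissible_v(reckless_drive, 0.5)),
}
```

On these assumed numbers, (i) forbids even the careful drive, (iii) permits even the reckless one, and only (v) separates them, at the cost of introducing a threshold that looks like the result of weighing.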
Perhaps this kind of case points towards a serious problem for causal structure views; see
further Jackson and Smith (). Or it may provide an opportunity for causal structure
theorists to refine their views. Either way, thinking about probability and deontology seems
helpful.
Acknowledgments
.............................................................................................................................................................................
Thanks to Alan Hájek and Kalle Mikkola for very helpful comments. Support was
partially provided by a grant from the Research Grants Council of the Hong Kong Special
Administrative Region, China (HKU H).
References
Allais, M. () Le comportement de l’homme rationnel devant le risque, critique des
postulats et axiomes de l’école américaine. Econometrica. . pp. –.
Angner, E. () Revisiting Rawls: A Theory of Justice in the light of Levi’s theory of decision.
Theoria. . pp. –.
Blackorby, C., Bossert, W., and Donaldson, D. () Intertemporal population ethics:
critical-level utilitarian principles. Econometrica. . pp. –.
Broome, J. () Weighing Goods. Cambridge, MA: Blackwell.
Broome, J. () Weighing Lives. Oxford: Oxford University Press.
Buchak, L. () Decision theory. In Hájek, A. and Hitchcock, C. (eds.). The Oxford
Handbook of Philosophy and Probability. Oxford: Oxford University Press.
Diamond, P. () Cardinal welfare, individualistic ethics, and interpersonal comparisons of
utility: comment. Journal of Political Economy. . pp. –.
Edgeworth, F. () Mathematical Psychics. London: Kegan Paul.
Ellsberg, D. () Risk, ambiguity and the Savage axioms. Quarterly Journal of Economics.
. pp. –.
Fishburn, P. () Utility Theory for Decision Making. New York, NY: Wiley.
Fishburn, P. () Stochastic utility. In Barberá, S., Hammond, P., and Seidl, C. (eds.)
Handbook of Utility Theory. Vol. . Dordrecht: Kluwer.
Foot, P. () Utilitarianism and the virtues. Mind. . pp. –.
Gauthier, D. () Assure and threaten. Ethics. . pp. –.
Gilboa, I. () Theory of Decision under Uncertainty. Cambridge: Cambridge University
Press.
Hájek, A. () Interpretations of probability. In Zalta, E. N. (ed.) The Stanford Encyclopedia of Philosophy. (Winter) [Online] Available from: http://plato.stanford.edu/archives/
win/entries/probability-interpret.
Hájek, A. () Most Counterfactuals are False. Manuscript.
Hammond, P. () Interpersonal comparisons of utility: why and how they are and should
be made. In Elster, J. and Roemer, J. (eds.). Interpersonal Comparisons of Well-Being.
pp. –. Cambridge: Cambridge University Press.
Hammond, P. () Objective expected utility. In Barberá, S., Hammond, P. and Seidl, C.
(eds.) Handbook of Utility Theory. Vol. . pp. –. Dordrecht: Kluwer.
Harsanyi, J. () Cardinal utility in welfare economics and in the theory of risk-taking.
Journal of Political Economy. . pp. –.
Harsanyi, J. () Cardinal welfare, individualistic ethics, and interpersonal comparisons of
utility. Journal of Political Economy. . pp. –.
Harsanyi, J. (a) Morality and the theory of rational behavior. Social Research. .
pp. –.
Harsanyi, J. (b) Nonlinear social welfare functions: a rejoinder to Professor Sen. In Butts,
R. and Hintikka, J. (eds.). Foundational Issues in the Special Sciences. pp. –. Dordrecht:
Reidel.
Harsanyi, J. () Rule utilitarianism, rights, obligations and the theory of rational behavior.
Theory and Decision. . pp. –.
Hausner, M. () Multidimensional utilities. In Thrall, R., Coombs, C., and Davis, R. (eds.)
Decision Processes. pp. –. New York, NY: John Wiley & Sons.
Hoefer, C. () The third way on objective probability: a sceptic’s guide to objective chance.
Mind. . pp. –.
Jackson, F. () Decision-theoretic consequentialism and the nearest and dearest objection.
Ethics. . pp. –.
Jackson, F. and Smith, M. () Absolutist moral theories and uncertainty. Journal of
Philosophy. . pp. –.
Jensen, K. () Unacceptable risks and the continuity axiom. Economics and Philosophy. .
pp. –.
Joyce, J. () The Foundations of Causal Decision Theory. Cambridge: Cambridge University
Press.
Kamm, F. () Morality, Mortality. Vol. . New York, NY: Oxford University Press.
Kavka, G. () The toxin puzzle. Analysis. . pp. –.
Krantz, D., Luce, R. D., Suppes, P., and Tversky, A. () Foundations of Measurement. Vol. .
New York, NY: Academic Press.
Kreps, D. () Notes on the Theory of Choice. Underground Classics in Economics. Boulder,
CO: Westview Press.
Lewis, D. () Counterfactuals. Oxford: Blackwell.
Lewis, D. () Radical interpretation. Synthese. . pp. –.
Lewis, D. () A subjectivist’s guide to objective chance. In Jeffrey, R. (ed.). Studies in
Inductive Logic and Probability. Vol. . pp. –. Berkeley, CA: University of California
Press.
Lewis, D. () Humean supervenience debugged. Mind. . pp. –.
McCarthy, D. () Actions, beliefs and consequences. Philosophical Studies. . pp. –.
McCarthy, D. () Utilitarianism and prioritarianism II. Economics and Philosophy. .
pp. –.
McCarthy, D. () Risk-free approaches to the priority view. Erkenntnis. . pp. –.
McCarthy, D. () Distributive equality. Mind. . pp. –.
McCarthy, D. (forthcoming a) The priority view. Economics and Philosophy.
McCarthy, D. (forthcoming b) The Structure of Good. Oxford: Oxford University Press.
McCarthy, D., Mikkola, K., and Thomas, T. () Utilitarianism with and without expected
utility. MPRA Paper No. https://mpra.ub.uni-muenchen.de//.
McCarthy, D. and Thomas, T. () Egalitarianism with risk. Manuscript.
McClennen, E. () Rationality and Dynamic Choice. Cambridge University Press.
Mongin, P. () Consistent Bayesian aggregation. Journal of Economic Theory. . pp. –.
Myerson, R. () Utilitarianism, egalitarianism, and the timing effect in social choice
problems. Econometrica. . pp. –.
Nagel, T. () The Possibility of Altruism. Princeton, NJ: Princeton University Press.
Narens, L. () Introduction to the Theories of Measurement and Meaningfulness and the Use
of Symmetry in Science. Mahwah, NJ: Lawrence Erlbaum Associates.
Ng, Y. () Bentham or Bergson? Finite sensibility, utility functions, and social welfare
functions. Review of Economic Studies. . pp. –.
Ok, E. () Real Analysis with Economic Applications. Princeton, NJ: Princeton University
Press.
Otsuka, M. and Voorhoeve, A. () Why it matters that some are worse than others: an
argument against the priority view. Philosophy and Public Affairs. . pp. –.
Parfit, D. () Reasons and Persons. Oxford: Clarendon Press.
Parfit, D. () Equality or priority? In Clayton, M. and Williams, A. (eds.). The Ideal of
Equality. pp. –. Basingstoke: Macmillan.
Parfit, D. () Another defense of the priority view. Utilitas. . pp. –.
Rabinowicz, W. () Prioritarianism for prospects. Utilitas. . pp. –.
Ramsey, F. () Truth and probability. In Ramsey, F. and Braithwaite, R. (ed.). Foundations
of Mathematics and other Essays. pp. –. London: Kegan, Paul, Trench, Trubner, & Co.
Rawls, J. () A Theory of Justice. Cambridge, MA: Harvard University Press.
Rawls, J. () Social unity and primary goods. In Sen, A. and Williams, B. (eds.)
Utilitarianism and Beyond. Cambridge: Cambridge University Press.
Resnik, M. () Choices: An Introduction to Decision Theory. Minneapolis, MN: University
of Minnesota Press.
Roberts, F. () Measurement Theory. Cambridge: Cambridge University Press.
Savage, L. () The Foundations of Statistics. New York, NY: John Wiley.
Scanlon, T. () Contractualism and utilitarianism. In Sen, A. and Williams, B. (eds.)
Utilitarianism and Beyond. Cambridge: Cambridge University Press.
Scheffler, S. () The Rejection of Consequentialism. Oxford: Oxford University Press.
Schmidt, U. () Alternatives to expected utility: formal theories. In Barberá, S., Hammond,
P., and Seidl, C. (eds.) Handbook of Utility Theory. Vol. . pp. –. Dordrecht: Kluwer.
Schwarz, W. () Best system approaches to chance. In Hájek, A. and Hitchcock, C. (eds.)
The Oxford Handbook of Philosophy and Probability. Oxford: Oxford University Press.
Sen, A. () Welfare inequalities and Rawlsian axiomatics. Theory and Decision. .
pp. –.
Sen, A. () Well-being, agency and freedom. Journal of Philosophy. . pp. –.
Skyrms, B. () The Dynamics of Rational Deliberation. Cambridge, MA: Harvard University
Press.
Sugden, R. () Alternatives to expected utility: foundations. In Barberá, S., Hammond, P.
and Seidl, C. (eds.). Handbook of Utility Theory. Vol. . pp. –. Dordrecht: Kluwer.
Temkin, L. () Rethinking the Good: Moral Ideals and the Nature of Practical Reasoning.
Oxford: Oxford University Press.
Thomson, J. () Killing, letting die, and the trolley problem. The Monist. . pp. –.
Thomson, J. () Imposing risks. In Parent, W. (ed.) Rights, Restitution, and Risk.
Cambridge, MA: Harvard University Press.
Thomson, J. () Goodness and Advice. Princeton, NJ: Princeton University Press.
Vickrey, W. () Utility, strategy, and social decision rules. The Quarterly Journal of
Economics. . pp. –.
von Neumann, J. and Morgenstern, O. () Theory of Games and Economic Behavior.
Princeton, NJ: Princeton University Press.
Wakker, P. () Prospect Theory: For Risk and Ambiguity. Cambridge: Cambridge University
Press.
Weymark, J. () A reconsideration of the Harsanyi-Sen debate on utilitarianism. In Elster,
J. and Roemer, J. (eds.). Interpersonal Comparisons of Well-Being. Cambridge: Cambridge
University Press.
Williams, B. () A critique of utilitarianism. In Smart, J. and Williams, B. (eds.)
Utilitarianism: For and Against. Cambridge: Cambridge University Press.
Williamson, T. () Vagueness. New York, NY: Routledge.