Probability in ethics

2015, The Oxford Handbook of Philosophy and Probability, A. Hajek and C. Hitchcock eds.

Abstract

The article is a plea for ethicists to regard probability as one of their most important concerns. It outlines a series of topics of central importance in ethical theory in which probability is implicated, often in a surprisingly deep way, and lists a number of open problems. Topics covered include: interpretations of probability in ethical contexts; the evaluative and normative significance of risk or uncertainty; uses and abuses of expected utility theory; veils of ignorance; Harsanyi's aggregation theorem; population size problems; equality; fairness; giving priority to the worse off; continuity; incommensurability; nonexpected utility theory; evaluative measurement; aggregation; causal and evidential decision theory; act consequentialism; rule consequentialism; and deontology.

Chapter 33

PROBABILITY IN ETHICS

David McCarthy

Ethics is mainly about what we ought to do, and about when one situation is better than another. But facing uncertainty about the consequences of our actions, and about how situations will evolve, is an all-pervasive feature of our condition. Should this not be a central topic in ethical theory?

Probability is by far the best-known tool for thinking about uncertainty, a well-known aphorism telling us that it is the very guide to life. But despite important exceptions, it is easy to get the impression that mainstream moral philosophy has not been much concerned with probability.

This reflects what seems to be a natural division of labour. The most fundamental questions for ethical theory seem to arise in the absence of uncertainty. For example, it seems hard to believe that the questions of whether it is better to give priority to the worse off, and of whether we ought to favour our nearest and dearest, have anything to do with uncertainty. Many influential discussions of these topics never mention uncertainty. Of course, once answers to these fundamental questions are in, we can try to extend them to cases involving uncertainty. But ethical theorists may seem well advised to hand this task over to others, given how mathematical the various disciplines concerned with probability have become. Technically and philosophically interesting as it may be, the extension of central ethical ideas to problems involving probability seems to be outside the main business of ethical theory. This chapter will argue for the opposite view.
The major ethical problems to do with probability involve very little mathematics to appreciate; many topics which do not seem to have anything to do with probability are arguably all about probability; and thinking about various problems to do with probability can help us solve analogous problems which do not involve probability, sometimes even revealing that popular positions about such problems are incoherent.

Almost every topic discussed here could easily be given its own survey article, and an adequate bibliography would exceed the space allotted for the whole chapter. Positive positions are often argued for sketchily, many important positions on each topic are neglected, and some major topics are not discussed at all. Instead, the goal is to offer enough breadth to illustrate some ways in which questions about probability run systematically throughout ethical theory, while in places going into enough depth to articulate some surprising and potentially important applications. In brief, what follows is much less a survey of or an argument for particular positions than a plea for ethical theory to take probability more seriously.

I said that ethics is largely about what we ought to do, and when one situation is better than another. Some say that rationality is about these things as well. Given that theories of rationality in the face of uncertainty are highly developed, it might be thought that an appeal to these theories of rationality straightforwardly solves ethical problems about probability. This line of thought is importantly mistaken.

First, Hume famously claimed that it is not irrational for an agent to prefer the destruction of the whole world to the scratching of his finger. Nor would it be irrational for the agent to bring about the destruction to avoid the scratching. But the destruction is neither better than the scratching, nor better for the agent. And the agent surely ought not to bring about the destruction.
On at least one widely-held view, therefore, ethics and rationality are not about the same things.

Secondly, it is undeniable that contemporary theories of rationality are an indispensable resource for thinking about ethics and probability. However, whether and how to apply these theories to ethics is far from straightforward, and will be one of the principal concerns of this chapter. Furthermore, in my view, at least, appeals to rationality are almost always epiphenomenal. For example, suppose we have a convincing argument for the claim that rational preferences have such and such a structure. We could then try to claim that an evaluative relation like betterness has to have that structure on the grounds that a rational agent can surely prefer what’s better to what’s worse. However, it is almost always less committal and more direct just to modify the original argument to make it apply directly to the structure of the evaluative relation. Claims about rationality often have historical priority over parallel claims about ethics, but I believe they do not have any kind of important conceptual priority.

The chapter starts with four sections which discuss which probabilities are relevant to ethics, establish terminology, and rehearse expected utility theory. It then turns to the evaluative question of when one situation is better than another, focusing on the question of when one distribution of goods is better than another. Sections . and . discuss popular but I think inadequate approaches to this question. These serve as a backdrop to a hugely important theorem due to Harsanyi () introduced in section ..
Sections . to . discuss such things as the relationship between Harsanyi’s theorem and utilitarianism; criticisms of Harsanyi’s premises and the relationship of these criticisms to other distributive views such as egalitarianism, the priority view, and concerns with fairness; the extension of Harsanyi’s theorem to problems of population size; incommensurability; continuity; non-expected utility theory; evaluative measurement; and the question of what Harsanyi’s theorem really shows about aggregation. hese sections also list various open problems and directions for further work. All of these topics have to do with probability. One of the beneits of thinking about Harsanyi’s theorem is the way it helps us organize our thinking about all sorts of fundamental evaluative questions. Section . will suggest that thinking about decision theory can have the same value in thinking about fundamental normative questions, questions about what we ought to do. With particular focus on probability, the remaining sections illustrate by discussing what are arguably the three most important kinds of normative theories: act consequentialism, rule consequentialism or contractualism, and deontology (these will be deined in section .). he discussion aims to be self-contained. For those with a background in ethics who would like to know more about how probability is involved, the chapter keeps technicalities probability in ethics 707 to a minimum. But the topic just cannot be addressed without a certain amount of rigor, and passing acquaintance with expected utility theory and decision theory will be helpful, though not strictly necessary. For those who know about probability and would like to see how it applies to ethics, the chapter gives brief guides to the relevant ethical debates. Such readers will recognize occasional allusions to relatively sophisticated ideas to do with probability. 
For one thing is clear: the questions about probability which ethics raises are profound, and are surely best addressed by combining expertise.

33.1 Probabilities

One difficulty in thinking about probability in ethics is assessing when ethicists need to be involved. Suppose we are told that some action will benefit many but involves a small probability of harming a few. We might think it the job of epistemologists, metaphysicians or philosophers of science to tell us what kind of judgment ‘the probability is small’ expresses, what laws probabilities obey, and what makes such a judgment correct. Ethicists need only ask whether we ought to perform the action given that the probability of harm is small, and need not be involved any earlier.

However, the division of labour is unlikely to be so neat. There are many conceptions of probability (see e.g. Hájek, , for a survey). This raises the question of which conception is most relevant to ethics, or whether different conceptions are appropriate in different ethical contexts. One of the most basic distinctions is between subjective and objective conceptions of probability, and this distinction will enable us to illustrate many of the issues.

The best-known subjective conception claims that the preferences of an ideally rational agent between uncertain prospects must satisfy various structural conditions (Ramsey, ; Savage, ). Suppose the agent also has a rich set of preferences. Then Ramsey and Savage showed that there exists a unique function on events satisfying the usual probability axioms (call it her subjective probability function) and a function on outcomes (her utility function) such that: the agent weakly prefers one prospect to another if and only if the former has at least as great expected utility, as calculated by those functions.
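The form of the Ramsey-Savage representation can be made concrete with a small sketch. It does not derive the two functions from preferences, which is the hard part of the theorem; it only shows how weak preference is read off once a subjective probability function and a utility function are given. The events, outcomes, and numbers are hypothetical and chosen purely for illustration.

```python
# A prospect maps each event (state of the world) to the outcome it yields.
# All names and numbers below are illustrative assumptions, not from the text.

def expected_utility(prospect, prob, utility):
    """Probability-weighted sum of the utilities of a prospect's outcomes."""
    return sum(prob[event] * utility[outcome]
               for event, outcome in prospect.items())

def weakly_prefers(a, b, prob, utility):
    """The agent weakly prefers a to b iff a has at least as great expected utility."""
    return expected_utility(a, prob, utility) >= expected_utility(b, prob, utility)

# Hypothetical subjective probability function over two exclusive events.
prob = {"rain": 0.3, "dry": 0.7}

# Hypothetical utility function over outcomes.
utility = {"soaked": 0.0, "carrying_umbrella": 0.8, "unencumbered": 1.0}

take_umbrella = {"rain": "carrying_umbrella", "dry": "carrying_umbrella"}
leave_umbrella = {"rain": "soaked", "dry": "unencumbered"}

prefers_taking = weakly_prefers(take_umbrella, leave_umbrella, prob, utility)
prefers_leaving = weakly_prefers(leave_umbrella, take_umbrella, prob, utility)
```

On these assumed numbers the agent weakly prefers taking the umbrella and does not weakly prefer leaving it. An agent with the same utilities but a much lower subjective probability of rain would rank the prospects the other way, which is why the choice of probability function matters for what follows.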
Perhaps the most prominent objective conception of probability in the contemporary debate is the best-system analysis pioneered by Lewis. The original best-system analysis of the laws of nature of Lewis () says that the laws are the theorems of the best systematization of the world: the true theory which does best in terms of simplicity and strength (or informativeness). To allow probabilistic laws in, Lewis () introduced the idea of fit. The more likely the actual world is by the lights of the theory, the better the fit of that theory. Theories are now judged according to how well they do in terms of simplicity, strength, and fit. If some of the laws of the best theory are probabilistic, those are what determine the objective probabilities.

Suppose we have to choose between subjective and objective conceptions for use in ethics, understood along the lines just sketched. Which conception should it be? Perhaps it depends on context: for example, subjective probabilities may be appropriate for agent-evaluation (blame, responsibility etc.), but inappropriate in other contexts. But let us fix the context by focussing on the most basic normative question of what we ought to do. Each conception has features we might find appealing.

[Footnote: Neither the Ramsey-Savage story about subjective probabilities nor Lewis’s version of best-system analysis has a hegemony. For surveys of alternative views about subjective probability, see Gilboa (), and for alternative best-system analyses, see Schwarz in this volume ().]

Objective probabilities seem in some important sense to trump subjective probabilities. This is reflected in the popular view that when an agent has beliefs about objective probabilities, rationality requires her to conform her subjective probabilities to those beliefs. This is the basic idea behind the so-called Principal Principle of Lewis ().
But if objective probabilities do indeed trump subjective probabilities, it may seem that what we ought to do depends on the objective probabilities, not our subjective probabilities.

On the other hand, objective probabilities may be disappointingly sparse or epistemically inaccessible. For example, best-systems analyses may make good sense of the objective probability of radium atoms decaying or coins landing on heads. But it is much less clear what best-systems analyses have to say about the objective probability of events like a run on a particular bank next year, one-off macro events involving chaotic systems. Such events may fail to have reasonably determinate objective probabilities (compare Hoefer, ), and even if they do, the epistemology may be too difficult for the objective probabilities to be usefully action guiding. So perhaps we should instead say that what we ought to do depends at least in part on our subjective probabilities. One option is to use subjective probabilities exclusively; another is to use objective probabilities where available, and subjective probabilities to fill in the gaps.

But every view which makes significant use of subjective probabilities faces at least two major problems. First, the Ramsey-Savage story about subjective probabilities is a chapter in the Humean story about rationality. But just as the Humean story refuses to condemn the preference for the destruction of the whole world over the scratching of a finger, the Ramsey-Savage story does not condemn subjective probabilities which, to most people, are just as crazy. For example, provided her preferences are appropriately structured, there is nothing in the Ramsey-Savage story to condemn someone who thinks it highly likely the world will come to an end before teatime. Such subjective probabilities will seem to many too irrational to have any bearing on what we ought to do.
But it is a major challenge to articulate a principled account of which subjective probabilities should be excluded.

Secondly, as soon as we allow in subjective probabilities, we face questions of whose and how. Whose subjective probabilities count in determining whether an agent ought to perform some action – the agent’s, those of her potential victims or beneficiaries, everyone’s? If the subjective probabilities of at least two people are relevant, how should they be used? At least if we switch to the problem of evaluating the uncertain prospects which actions result in, this is a long-standing problem in welfare economics. The so-called ex post approach recommends first aggregating the separate subjective probability functions into a single social probability function, then using this social probability function to evaluate uncertain prospects. The ex ante approach gives the separate subjective probability functions a direct evaluative role, at least in a special case. Just to give one version, ex ante Pareto says: if for each individual i, an uncertain prospect P is better for i than another uncertain prospect P′ relative to i’s own subjective probability function, then P is better than P′. Both the ex post and ex ante approaches look appealing, but they are extremely difficult to combine consistently. For example, given weak assumptions, there will be prospects P and P′ such that ex ante Pareto has the apparent pathology of implying that P is better than P′ despite the fact that P′ is guaranteed to produce a better outcome than P. But ex post approaches will adopt principles which from the outset say that in such cases, P′ is better than P.

Now it is not my goal to try to answer any of the large questions raised in this section. My claim is rather that they are questions with which ethicists must engage, and that one’s answers to these questions may depend on one’s more general ethical views.
To illustrate, suppose one sees ethics as being primarily about coordinating action to achieve good outcomes, and one is prepared to tolerate a significant amount of indeterminacy in one’s normative theory. Then one may be tempted to claim that the probabilities which are relevant to ethics are the objective probabilities alone. By contrast, suppose one instead sees ethics as being about trying to achieve some sort of fair compromise between agents with diverse beliefs and goals. Then it may seem tempting to allow in subjective probabilities no matter how irrational, and to follow the ex ante approach. On this picture, individual autonomy is central, and it may seem more important to respect the notion of unanimity built into ex ante Pareto than to try to avoid the apparent pathology which comes with it. There are, of course, many other options, but the important point is that which probabilities are relevant to ethics, and how, is itself a fundamental ethical question.

33.2 Outcomes

Some writers, however, think that probabilities are never relevant to what we ought to do. A parallel view applies to the question of when one uncertain prospect is better than another. Jackson () illustrates with the following. A doctor has to choose between three treatments for a patient with a minor complaint. Drug A would partially cure the complaint. One of drugs B and C would completely cure the patient while the other would kill him, but the doctor cannot tell which is which. The obvious view, as Jackson notes, is that the doctor ought to give the patient drug A. This verdict would be delivered by any broadly decision-theoretic account. Along similar lines, the prospect associated with giving the patient drug A is better than the prospects associated with drugs B and C.
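One way to see why a broadly decision-theoretic account favours drug A is to put numbers on the drug example. The sketch below does this under assumptions that are not in the text: the doctor is taken to assign probability one half to each of the two hypotheses about which drug cures, and the utilities attached to a partial cure, a complete cure, and death are made up for illustration.

```python
# Hypothetical utilities for the three possible outcomes (assumptions).
utility = {"partial_cure": 0.9, "full_cure": 1.0, "death": -100.0}

# Assumed probability, from the doctor's point of view, that B is the curing drug.
p_b_cures = 0.5

# Each prospect is a list of (probability, outcome) pairs.
prospects = {
    "A": [(1.0, "partial_cure")],
    "B": [(p_b_cures, "full_cure"), (1 - p_b_cures, "death")],
    "C": [(1 - p_b_cures, "full_cure"), (p_b_cures, "death")],
}

def expected_utility(prospect):
    """Probability-weighted sum of outcome utilities."""
    return sum(p * utility[outcome] for p, outcome in prospect)

# The probability-based verdict: choose the prospect with greatest expected utility.
best_drug = max(prospects, key=lambda drug: expected_utility(prospects[drug]))
```

With these numbers drug A comes out best. The verdict is robust to the choice of numbers as long as death is much worse than a partial cure is short of a complete one, which is the feature of the case doing the work.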
Call any view which assesses actions and prospects involving uncertainty along broadly decision-theoretic lines probability-based. But there is a different view: if drug B would cure the patient, the doctor ought to give the patient drug B; similarly for drug C. Likewise, if drug B would cure, the prospect associated with giving drug B is better than the other prospects. Call such views, positions which assess actions and prospects in terms of what their consequences would be, outcome-based.

As the drug example shows, an objection to outcome-based views is that they make the truth about what we ought to do too epistemically inaccessible, or provide poor guides to action. But there are at least two interesting arguments for outcome-based views.

First, transposing an argument due to Thomson () to the present example, suppose the pharmacist walks in and, knowing full well that drug B would cure the patient, says to the doctor: “You ought to use drug B”. The pharmacist seems right. But doesn’t that imply an outcome-based view? In response, consider the case where the pharmacist says: “Drug B would cure. So you ought to use drug B”. By the time the pharmacist has finished the first sentence, the doctor has new evidence, and should update her probabilities accordingly. There is then no clash between a probability-based view and the truth of the pharmacist’s second sentence. Likewise, I think that in the actual case something like “Drug B would cure” is implied when the pharmacist just says: “You ought to use drug B”. What is implied by the pharmacist’s normative assertion impacts upon the probabilities the doctor should have, making the literal construal of the normative assertion true (McCarthy, ).

[Footnote to the discussion of the ex ante and ex post approaches in section 33.1: The large literature on this topic is rather technical, but Broome (, ch. ) provides a good introduction and philosophical discussion. Mongin () contains a very general set of results.]
Secondly, advocates of probability-based views have to say which probability functions are relevant to what we ought to do. But there are many candidates, e.g. the probabilities of this agent or that agent, at this time or that time. Jackson concludes that we have to recognize the existence of “an annoying profusion ... of a whole range of oughts” (Jackson, , p. ). But this seems dissatisfying. When we ask ourselves or others what we ought to do, we don’t want to learn that some oughts recommend this while others recommend that. We want to know what we ought to do full stop. But if there is only one ought, we need to privilege one probability function. The function of an omniscient agent may seem to be the only distinguished choice, so we end up with an outcome-based view.

In response, just because it is not obvious which probability function is privileged, it does not follow that no function (or reasonably narrow class of functions) is privileged. In the previous section we saw that if we adopt a probability-based view, a variety of fairly fundamental ethical factors and disputes bears upon the question of which probabilities are relevant to ethics. The complexity of this topic explains why it is not obvious which probability function is privileged, but the fact that the problem is complex hardly entails that some outcome-based view wins by default. Outcome-based views have to be assessed in terms of various ethical desiderata just as much as probability-based views do, and they do quite badly in terms of desiderata such as the idea that an ethical theory should be suitably action-guiding.

It is also worth noting that outcome-based views may result in large-scale indeterminacy. The drug example stipulated that various counterfactuals relating actions to outcomes are true. But an increasingly popular view claims that most counterfactuals are false (see e.g. Hájek, ).
In particular, it will often be the case that for some potential action A there is no outcome O such that the counterfactual: “If A were performed, O would result”, is true. On this view about counterfactuals, the facts on which outcome-based views have to call are much sparser than might have appeared, with the result that there is a lot more evaluative and normative indeterminacy on outcome-based views than we might have hoped. This may further undercut the appeal of outcome-based views.

In what follows, I will assume that some probability-based view is correct. But it is a major question which conception of probability is relevant to ethics, so ethicists need to be involved with questions about probability early on. In light of the difficulties of aggregating probability functions alluded to in the previous section, ethicists also need to be prepared for the possibility that the eventual input into ethics is going to be messier than a single probability function which satisfies the usual axioms.

33.3 Terminology

However, to simplify I henceforth assume that probabilities are supplied and satisfy the usual axioms. To reflect this I will often speak of risk rather than probability or uncertainty.

A lottery over a nonempty set of world histories (past, present and future) assigns positive probabilities to finitely many of the histories with the probabilities all summing to one (these are sometimes known as lotteries with finite support). I will often write lotteries in the form $[p_1, h_1; \ldots; p_m, h_m]$ where the $h_j$’s are the histories which could result from the lottery and the $p_j$’s their probabilities. The betterness relation holds between two lotteries just in case the first is at least as good as the second.
An individual i’s individual betterness relation holds between two lotteries $L_1$ and $L_2$ just in case: i exists in every history which could result from the lotteries, and $L_1$ is at least as good for i as $L_2$. By identifying histories with lotteries in which the history gets probability one, and restricting the betterness and individual betterness relations to such lotteries, we obtain relations between histories. I will refer to these relations as risk-free versions of the originals. For example, the risk-free betterness relation holds between two histories just in case the first is at least as good as the second.

There are many views about when one history is better for someone than another, or in a more suggestive phrase, about what makes someone’s life go best (Parfit, , Appendix I). On one popular classification, the three main views are that having a good life is a matter of: (i) having good-quality experiences; (ii) satisfying one’s preferences or desires; or (iii) attaining what are said to be objective goods, such as deep knowledge or close personal relationships. However, some philosophers think that when doing ethics, we should not be in the business of making fine-grained comparisons between different people’s lives, but should make interpersonal comparisons only in terms of such things as the resources, freedoms, or opportunities people enjoy (see e.g. Rawls, ; Sen, ). Which of these views is correct will not matter in what follows, but it will be important that the discussion can accommodate any of them.

We will be talking a lot about the betterness relation. Not everyone thinks that this is a useful way of looking at ethics (see e.g. Foot, ; Thomson, ). But in response, talking about betterness can be seen as a harmless organizing tool (see e.g. Broome, ), and is popular enough for us to be able to cover many major positions.
For example, consequentialism (on a probability-based interpretation) is the view that lotteries can be ranked in terms of betterness, and that betterness somehow determines normativity. For example, act consequentialism says that we always ought to bring about the best available lottery, whereas rule consequentialism says that we always ought to act according to the rule such that, if everyone acted in accord with it (or on a different version, accepted it), the best available lottery would be realized. Contractualism tends to be framed not in terms of betterness, but in terms of an ideal social contract. However, when it comes to the assessment of different social contracts, contractualists are concerned with competing sets of principles or rules (see e.g. Scanlon, ), so at the concrete level of normative theorizing, it is often hard to tell the difference between contractualism and rule consequentialism. Finally, deontology is often characterized as the position that some acts are wrong even when they would have the best available consequences, such as killing one innocent person to prevent five innocent people from being killed.

[Footnote: As far as I can see, there is no universally accepted account of consequentialism, so I am only trying to convey the rough idea rather than provide a precise definition. In addition, the way moral philosophers use the term ‘consequentialism’ should not be confused with an important decision-theoretic idea which also goes by the name of ‘consequentialism’ (see e.g. Hammond, ).]

33.4 Expected utility theory

This chapter expresses the view that whatever one ultimately makes of expected utility theory and decision theory, looking at basic evaluative and normative questions through the frameworks they provide is extremely useful.
This section therefore provides a quick rehearsal, first of the terminology of expected utility theory, and then of its most basic result. It takes $X$ to be some fixed nonempty set. In applications, $X$ will usually be a set of histories, or more colloquially, outcomes.

A preorder on $X$ is a binary relation $R$ on $X$ which is reflexive ($\forall x \in X,\ xRx$) and transitive ($\forall x, y, z \in X,\ xRy \ \&\ yRz \Rightarrow xRz$). It is complete if for all $x, y \in X$, either $xRy$ or $yRx$. It is incomplete just in case it is not complete. An ordering of $X$ is just a complete preorder of $X$. If $L$ and $M$ are lotteries over $X$, then for all $\alpha \in (0, 1)$, $\alpha L + (1 - \alpha)M$ is the so-called compound lottery in which each member $x$ of $X$ has probability $\alpha p + (1 - \alpha)q$ where $p$ is $x$’s probability under $L$ and $q$ is its probability under $M$. Suppose that $\succsim$ is an ordering on $X$. Then a real-valued function $f$ is said to represent the ordering just in case: for every $x$ and $y$ in $X$, $x \succsim y$ if and only if $f(x) \geq f(y)$.

Suppose that $\succsim$ is a binary relation on lotteries over $X$. Here are the three expected utility axioms.

Ordering. $\succsim$ is a complete preorder.

Strong Independence. For all lotteries $L$, $M$ and $N$, and $\alpha \in (0, 1)$: $L \succsim M$ if and only if $\alpha L + (1 - \alpha)N \succsim \alpha M + (1 - \alpha)N$.

The rough idea of Strong Independence is that the “addition” of the same lottery $N$ to either side of $L \succsim M$ should make no difference: the added $N$’s will cancel out. Strong Independence is sometimes explained by imagining that the compound lotteries will be realized by first tossing a biased coin, where heads has a probability of $\alpha$ and tails a probability of $1 - \alpha$, then running whichever lottery results. For example, suppose you strictly prefer $L$ to $M$, and you now have to decide between $\alpha L + (1 - \alpha)N$ and $\alpha M + (1 - \alpha)N$. If the coin lands on tails, you will face $N$ in either case, so in that scenario there is nothing to choose between the two compound lotteries. But if the coin lands on heads, you will face $L$ or $M$, and will therefore prefer to have chosen $\alpha L + (1 - \alpha)N$ to $\alpha M + (1 - \alpha)N$.
Since heads has a positive probability, you should therefore strictly prefer $\alpha L + (1 - \alpha)N$ to $\alpha M + (1 - \alpha)N$ prior to the coin being tossed. Or at least that is one of the typical ways of motivating Strong Independence. The example has focused on preference relations, but it can clearly be applied directly and without any discussion of rationality to a variety of evaluative comparatives, such as betterness and individual betterness relations.

Continuity. For all lotteries $L$, $M$ and $N$ such that $L \succ M \succ N$, there exist $\alpha, \beta \in (0, 1)$ such that $M \succ \alpha L + (1 - \alpha)N$ and $\beta L + (1 - \beta)N \succ M$.

To illustrate, suppose you strictly prefer some prize a to a prize b, and strictly prefer b to a prize c. Then if your preferences are continuous, there will be some lottery which almost guarantees you a with a tiny chance of c (one in a billion, say) which you will strictly prefer to getting b for certain. And you will strictly prefer b for certain to some lottery which almost guarantees you c with a tiny chance of a. As the example is meant to suggest, many people think that Continuity is a plausible requirement on various evaluative comparatives.

A binary relation $\succsim$ on lotteries over $X$ satisfies the expected utility axioms just in case it satisfies Ordering, Strong Independence, and Continuity. Here is the most basic result of expected utility theory, due to von Neumann and Morgenstern (), but anticipated in a deeper way by Ramsey ().

Theorem (von Neumann and Morgenstern) Let $X$ be a nonempty set, and $\succsim$ be a binary relation on lotteries on $X$ which satisfies the expected utility axioms. Then there exists a real-valued function $u$ on $X$ such that

(i) For all lotteries $L_1 = [p_1, x_1; \ldots; p_m, x_m]$ and $L_2 = [q_1, y_1; \ldots; q_n, y_n]$, $L_1 \succsim L_2 \iff p_1 u(x_1) + \cdots + p_m u(x_m) \geq q_1 u(y_1) + \cdots + q_n u(y_n)$.

(ii) Any function $v$ satisfies (i) when substituted for $u$ if and only if there exist real numbers $a > 0$ and $b$ such that $v = au + b$.
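The definitions in this section can be illustrated with a short sketch. It goes in the easy direction only: it starts from a utility function, defines the ordering of lotteries by expected utility, and then checks one instance of Strong Independence and of the invariance described in part (ii) of the theorem. It is not a proof of the theorem. The outcomes, utilities, and mixing weight are hypothetical, and exact fractions are used so that the comparisons are not disturbed by rounding.

```python
from fractions import Fraction

# A lottery is a dict mapping outcomes to positive probabilities that sum to one.

def mix(alpha, L, M):
    """Compound lottery alpha*L + (1 - alpha)*M."""
    outcomes = set(L) | set(M)
    return {x: alpha * L.get(x, 0) + (1 - alpha) * M.get(x, 0) for x in outcomes}

def expected_utility(L, u):
    """Apply u to each outcome, weight by its probability, and add up."""
    return sum(p * u[x] for x, p in L.items())

def at_least_as_good(L, M, u):
    """The ordering of lotteries represented by the expected value of u."""
    return expected_utility(L, u) >= expected_utility(M, u)

# Hypothetical utility function and lotteries over three outcomes.
u = {"a": Fraction(10), "b": Fraction(4), "c": Fraction(0)}
L = {"a": Fraction(1, 2), "c": Fraction(1, 2)}
M = {"b": Fraction(1)}
N = {"c": Fraction(1)}
alpha = Fraction(1, 4)

# Strong Independence: mixing both sides with the same N leaves the ranking unchanged.
independence_holds = (
    at_least_as_good(L, M, u)
    == at_least_as_good(mix(alpha, L, N), mix(alpha, M, N), u)
)

# Part (ii): a positive affine transformation v = a*u + b ranks lotteries the same way.
v = {x: 3 * u[x] + 7 for x in u}
same_ranking = at_least_as_good(L, M, u) == at_least_as_good(L, M, v)
```

Independence holds here because expected utility is linear in the mixing weight: the expected utility of the compound lottery is `alpha` times that of `L` plus `1 - alpha` times that of `N`, so the common `N` term cancels on both sides of the comparison.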
Roughly speaking, (i) says that there is a function $u$ (often referred to as a “vNM utility function”) such that $L_1 \succsim L_2$ if and only if the expected value of $u$ associated with $L_1$ is at least as great as the expected value of $u$ associated with $L_2$. The expected value of $u$ associated with a lottery is obtained by applying $u$ to each of the lottery’s possible outcomes, weighting the result by the probability of those outcomes, then adding all those numbers up. In such circumstances, I will say that the ordering $\succsim$ is represented by the expected value of $u$. (ii) says that the function $u$ is unique up to choice of zero and unit, or in fancier terminology, unique up to positive affine transformation. For an analogy, Fahrenheit and Centigrade measure temperature in essentially the same way, except that they use different zeros and units. Overall, the main message is that if an ordering of lotteries satisfies the expected utility axioms, it can be represented by the expected value of some function which is more or less unique.

The literature on expected utility theory is vast. It has been applied to all sorts of topics, and has received a great deal of defense, criticism, and mathematical elaboration. Beyond a few remarks, this chapter will assume some sort of familiarity with the defense, but will rehearse many of the criticisms, particularly as they apply to ethics.

We now need to ask: When is one lottery better than another? Which lotteries ought we to bring about? We begin with the first question.

[Footnote: $L \succ M$ is defined as $L \succsim M$ and not $M \succsim L$. $L \sim M$ is defined as $L \succsim M$ and $M \succsim L$.]

[Footnote on the literature on expected utility theory: At varying levels of philosophical and mathematical ambition, personal favourites include Fishburn (), Resnik (), Kreps (), Broome (), Hammond (), Ok () and Gilboa (). In this volume, see Buchak ().]
33.5 Expected goodness

Some philosophers imply that if we know when one history is better than another, the question of when one lottery is better than another is straightforward. For example, Parfit (, p. ) and Scheffler (, p. , note ) start their discussions of consequentialism only by assuming

(1) The risk-free betterness relation is an ordering.

To cover risky cases, they think that we need to appeal only to expected utility theory. In particular, they think we just need to add

(2) One lottery is at least as good as another if and only if its expected goodness is at least as great.

In other words, the betterness relation is represented by the expected value of goodness. Parfit and Scheffler are not claiming that it is obvious when one history is better than another. Rather, they are claiming that once we have an ordering of histories in terms of betterness, (2) then tells us how to order lotteries in terms of betterness. Now Parfit and Scheffler are quite brief about this and their real concerns lie elsewhere. But this sort of claim is commonly made, and it is important to realize that it contains a serious mistake.

The basic difficulty is that (2) presupposes the existence of goodness measures, measures of how good histories are, and various problems arise depending on where we think these measures are coming from.

First, provided certain technical conditions are met, (1) guarantees that the risk-free betterness relation can be represented by some function. To deal with the possibility that there may be more than one such function, we might treat the set of all goodness measures as the set of all of the functions which represent the risk-free betterness relation.
It would then be natural to interpret (2) as saying: L ≿ L′ if and only if the expected goodness of L is at least as great as the expected goodness of L′ according to every goodness measure. Unfortunately, however, this approach leads to massive indeterminacy. An example will illustrate. Suppose there are exactly three histories x, y and z, ordered x ≻ y ≻ z by the risk-free betterness relation. Let L be the lottery [½, x; ½, z] and let us consider how it compares with y. Consider the two functions u and v defined by u(x) = v(x) = , u(y) = ., v(y) = ., and u(z) = v(z) = . Both of these functions represent the risk-free betterness relation, and therefore count as goodness measures on the current proposal. But according to u, the expected goodness of L is less than that of y, and according to v, the expected goodness of L is greater than that of y. The current proposal therefore leaves L and y unranked, and it only takes a bit more work to show that this will be true of almost every pair of lotteries. So interpreting (2) along these lines does almost nothing to cover risky cases.

Note (on the representability of the risk-free betterness relation): The result goes back to Cantor; for details, see any reasonably advanced book on utility theory, such as Kreps () or Ok ().

Secondly, to get around this problem we might hope to narrow down all of the functions which represent the risk-free betterness relation to (essentially) a single function to be used as a goodness measure. This line of thought is tacitly quite common, and what tends to happen is that one of the functions which represents the risk-free betterness relation seems quite simple or natural, and it is taken to be the goodness measure. An old idea will illustrate.
According to this idea, each “just noticeable difference” between outcomes is given the same magnitude of goodness, so that the difference in goodness between the best outcome and the second best outcome is equal to the difference in goodness between the second best outcome and the third best outcome, and so on. In the toy example of the previous paragraph, this would be done by a function w where w(x) = , w(y) = ., and w(z) = . Using (2) would then provide a ranking of all lotteries in terms of betterness. For example, L and y would turn out to be equally good.

However, this proposal is ethically entirely arbitrary, and it is easy to invent circumstances in which the method delivers implausible conclusions. To illustrate, let us apply the same idea to individual betterness relations. Consider a wine connoisseur who is able to discriminate among a vast number of wines, and let us take her ordering of wines as given. Let a⁺ be the outcome in which she gets the best possible wine, a the next wine down, r some rough house wine, and r⁺ the next one up. The current method would regard the two lotteries [½, a⁺; ½, r] and [½, a; ½, r⁺] as equally good. But our connoisseur might regard experiencing the best possible wine as worth risking a lot for, and improving a rough house wine as hardly worth anything, leading her to conclude that the first lottery is better. The current method nevertheless woodenly regards the two lotteries as equally good.

Thirdly, one might approach the problem from a different direction. Suppose we start with a claim which is presupposed by (2), namely

Social EUT The betterness relation satisfies the expected utility axioms.

Now by the vNM theorem, Social EUT implies

(3) For some real-valued function f on histories, the betterness relation is represented by the expected value of f.

We might then define f as a goodness measure (along with its positive affine transformations).
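Before weighing this third proposal, it may help to make the toy example with histories x, y and z concrete. The sketch below is only illustrative: the numerical values assigned by `u`, `v` and `w` are assumptions chosen to fit the verbal description (all three respect x ≻ y ≻ z, and `w` spaces the histories equally), and they should not be read as the chapter's own figures.

```python
from fractions import Fraction as F

def expected_goodness(lottery, g):
    """Expected value of the goodness measure g for [(probability, history), ...]."""
    return sum(p * g[h] for p, h in lottery)

# Three candidate goodness measures (assumed values). Each represents x > y > z.
u = {"x": F(1), "y": F(3, 4), "z": F(0)}
v = {"x": F(1), "y": F(1, 4), "z": F(0)}
w = {"x": F(1), "y": F(1, 2), "z": F(0)}  # equal steps, as on the "just noticeable difference" idea

def represents_risk_free_order(g):
    return g["x"] > g["y"] > g["z"]

L = [(F(1, 2), "x"), (F(1, 2), "z")]  # even chance of the best and the worst history
Y = [(F(1), "y")]                      # the middle history for certain

def verdict(g):
    """Compare L with y by expected goodness under the measure g."""
    l_val, y_val = expected_goodness(L, g), expected_goodness(Y, g)
    if l_val > y_val:
        return "L better"
    if l_val < y_val:
        return "y better"
    return "equally good"

verdicts = {name: verdict(g) for name, g in (("u", u), ("v", v), ("w", w))}

# First proposal: L is at least as good as y only if every measure says so, and vice versa.
measures = [u, v]
L_at_least_as_good = all(expected_goodness(L, g) >= expected_goodness(Y, g) for g in measures)
y_at_least_as_good = all(expected_goodness(Y, g) >= expected_goodness(L, g) for g in measures)
unranked = not L_at_least_as_good and not y_at_least_as_good

print(verdicts, unranked)
```

Under these assumptions the three measures deliver the three possible verdicts on L versus y, and quantifying over u and v together leaves the pair unranked. Since all three functions represent the same risk-free ordering, the ordering alone cannot select among the verdicts.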
With f so defined, (2) now gives us the right results: one lottery is better than another just in case its expected goodness is greater. Unfortunately, however, just as the first method yielded almost complete indeterminacy, this method is almost completely uninformative. In almost all cases, it provides us with no concrete method of ranking lotteries. For example, in the toy example used to show why the first method leads to indeterminacy, it is consistent with the present method that L is better than y, that L and y are equally good, and that L is worse than y.

We have now looked at three ways of trying to fill in the story gestured towards by Parfit, Scheffler, and many others, the story according to which, once we are given the risk-free betterness relation, we need only appeal to expected utility theory to cover risky cases. Each attempt to say where goodness measures are coming from leads to a problem. The first leads to indeterminacy, the second to arbitrariness, and the third to uninformativeness. Now expected utility theory does indeed turn out to be a powerful tool for thinking about evaluative questions about risk, and even questions which do not seem to be about risk. But the story has to be more sophisticated than anything we have so far seen.

Note (on narrowing the representing functions down to essentially a single function): More precisely, to a set of functions which are all related by positive affine transformation. The vNM theorem tells us that these will all be equivalent when it comes to ordering lotteries in terms of expected goodness.

Note (on taking a simple or natural function to be the goodness measure): For example, McCarthy () argues that this approach is common in accounts of the priority view and leads to unsatisfactory definitions of it.

Note (on just noticeable differences): The basic idea goes back to Edgeworth (). For criticism and defense see e.g. Vickrey () and Ng () respectively.

33.6 Veils of ignorance
To simplify, I will from now on assume that in evaluating lotteries, we are only concerned with the ethics of distribution, and in addition, not concerned with rights or responsibilities. In particular, I will assume: if h and h′ contain the same population and for each member i, h is exactly as good for i as h′, then h and h′ are equally good.

The best-known strategy for augmenting an appeal to expected utility theory is to use a so-called veil of ignorance, made famous but used in different ways by Harsanyi () and Rawls (). Assume a fixed population 1, …, n. Harsanyi’s presentation of his argument tacitly identifies individual betterness relations with individual preference relations. But there are objections to that identification, and following Broome () we can avoid them by restating Harsanyi’s argument in terms of individual betterness relations. This enables us to leave it open whether the content of individual betterness relations has to do with preference satisfaction, the quality of experience, achievements, or some other account. Harsanyi’s argument then begins with

Individual EUT Individual betterness relations satisfy the expected utility axioms.

Assume also that interpersonal comparisons are unproblematic in that

Interpersonal Completeness For all individuals i and j and histories h and h′, either h is at least as good for i as h′ is for j, or vice versa.

Together Individual EUT and Interpersonal Completeness imply that there are real-valued functions $u_1, \dots, u_n$ on histories such that (i) for each individual i, i’s individual betterness relation is represented by the expected value of $u_i$, and (ii) for all individuals i and j, h is at least as good for i as h′ is for j if and only if $u_i(h) \ge u_j(h')$. From now on, $u_1, \dots, u_n$ will always be such functions, but their existence presupposes Individual EUT and Interpersonal Completeness. I will sometimes call them utility functions.
Harsanyi () took ethics to be impartial. But how should this be modeled, or made more concrete? This is where Harsanyi appeals to a veil of ignorance. Choosing under the equiprobability assumption is understood as choosing between two social situations on the assumption that one is equally likely to turn out to be each member of the population. Then Harsanyi took the idea that ethics is impartial to be well-modeled by

Veil of Harsanyi One lottery is at least as good as another if and only if it would be weakly preferred by every self-interested and rational person choosing under the equiprobability assumption.

I will skip the formal details, but from Individual EUT, Interpersonal Completeness and Veil of Harsanyi, Harsanyi gave a simple argument for

Sum The betterness relation is represented by the expected value of the function $u_1 + \cdots + u_n$.

Note (on interpersonal comparisons): Some of the arguments which follow make slightly stronger assumptions about interpersonal comparisons than I have made explicit. The point of these is to make various impartiality assumptions have an effect, and also to guarantee that the functions $u_1, \dots, u_n$ are essentially unique, in that if some other set of functions $v_1, \dots, v_n$ plays their role, there are real numbers a > 0 and b such that for all i, $v_i = a u_i + b$. But I will suppress this slightly technical issue. For full details, see e.g. Broome (, p. ).

Rawls () agrees with Harsanyi that ethics is impartial, and that a veil of ignorance is a good way of modeling impartiality. To focus on their treatment of veils, we will ignore other differences, such as the different ways in which they understand interpersonal comparisons. With those aside, Rawls can be taken as agreeing with Individual EUT and Interpersonal Completeness. But his interpretation of the veil differs.
Choosing under the uncertainty assumption is understood as choosing between two social situations on the assumption that one will turn out to be one of the members of the population, but with complete uncertainty about who that will be. Then Rawls took the idea that ethics is impartial to be well-modeled by

Veil of Rawls One history is at least as good as another if and only if it would be weakly preferred by every self-interested and rational person choosing under the uncertainty assumption.

Rawls then argued that Individual EUT, Interpersonal Completeness, and Veil of Rawls would result in

Maximin One history is better than another if and only if the former is better for the worst off.

Many commentators have thought Rawls should instead have concluded with

Leximin One history is better than another if and only if it is better for the worst off, or equally good for the worst off and better for the second worst off, and so on.

These arguments raise three basic questions: (i) What does rational choice under the uncertainty assumption really require? (ii) Given that one is going to model impartiality via some sort of veil of ignorance, is the uncertainty assumption a better way of doing it than the equiprobability assumption? (iii) Is modeling impartiality via a veil of ignorance a good idea anyway?

Briefly, (i) seems to be unclear. For example, suppose the Ramsey-Savage story is right about rational choice under conditions of uncertainty. For the agent behind the veil to lack implicit subjective probabilities of any degree of determinateness (and thus to model complete uncertainty), that story implies that her preferences are incomplete. At best, maximin (or leximin) would then seem to be but one rationally permissible choice among many, whereas Rawls needs it to be rationally required (see Angner () for further discussion).
For (ii), the equiprobability assumption seems at first glance a reasonable attempt at giving impartiality a concrete and reasonably clear interpretation. Moreover, given the difficulties in understanding what rationality in conditions of complete uncertainty requires, it is hard to see what motivates shifting to the uncertainty assumption, aside from a question-begging attempt to avoid Sum. I will return to some of these issues, but the most fundamental question is (iii), and a later result of Harsanyi’s seems to show that the use of veils of ignorance was never a good idea in the first place.

33.7 Harsanyi’s theorem

To present Harsanyi’s result we need to state two more premises. We continue to assume a fixed population. The first premise expresses a kind of impartiality.

Impartiality For all histories h and h′, if there is some permutation π of the population such that for each individual i, h is exactly as good for i as h′ is for π(i), then h and h′ are equally good.

The second premise is a so-called Pareto assumption.

Pareto (i) If two lotteries are equally good for each member of the population, they are equally good. (ii) If one lottery is at least as good for every member of the population and better for some members, then it is better.

This is Harsanyi’s theorem. For an accessible proof, see e.g. Resnik ().

Theorem (Harsanyi) Assume a constant population. Then Individual EUT, Interpersonal Completeness, Social EUT, Impartiality, and Pareto jointly imply Sum.

To recap what Sum says, the conclusion of the theorem says that one lottery is better than another just in case it has a greater sum of individual expected utilities. This implies that one history is better than another just in case it has a greater sum of individual utilities.
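The content of Sum, and how it can come apart from Maximin and Leximin, can be seen in a short sketch. It assumes, purely for illustration, a population of three people, histories written as tuples of individual utilities, and made-up utility values; the function names are hypothetical.

```python
from fractions import Fraction as F

# A history is a tuple (u_1(h), ..., u_n(h)) of utilities for a fixed population.
# A lottery is a list of (probability, history) pairs.

def total(history):
    return sum(history)

def expected_total(lottery):
    """Expected value of u_1 + ... + u_n, the quantity by which Sum ranks lotteries."""
    return sum(p * total(h) for p, h in lottery)

def individual_expectation(lottery, i):
    """Expected utility of the lottery for individual i."""
    return sum(p * h[i] for p, h in lottery)

def certain(history):
    """The degenerate lottery that yields the history for sure."""
    return [(F(1), history)]

def sum_better(l1, l2):
    return expected_total(l1) > expected_total(l2)

def maximin_better(h1, h2):
    """Better for the worst off."""
    return min(h1) > min(h2)

def leximin_better(h1, h2):
    """Compare the worst off, then the second worst off, and so on."""
    return sorted(h1) > sorted(h2)  # lists compare lexicographically

equal = (5, 5, 5)    # assumed utilities
unequal = (4, 9, 9)
third = (4, 5, 5)
mix = [(F(1, 2), equal), (F(1, 2), unequal)]

print(sum_better(certain(unequal), certain(equal)))   # Sum favours the larger total
print(maximin_better(equal, unequal))                 # Maximin favours the better-off worst person
print(maximin_better(unequal, third), leximin_better(unequal, third))
print(expected_total(mix) == sum(individual_expectation(mix, i) for i in range(3)))
```

On these numbers Sum prefers the unequal history while Maximin and Leximin prefer the equal one, and Leximin breaks a tie that Maximin leaves open. The last line checks the equivalence used in the recap: the expected sum of utilities equals the sum of individual expected utilities.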
However, in its classical form, utilitarianism is usually defined as the claim that one history is better than another just in case it has a greater sum of individual goodness. This raises the disputed question of what Sum has to do with utilitarianism, and thus whether Harsanyi’s premises imply utilitarianism. Roughly speaking, Harsanyi’s premises imply the classical version of utilitarianism just in case individual utilities are measures of individual goodness. Simplifying somewhat, Sen () and Weymark () denied that the two should be identified, whereas along with e.g. Harsanyi (b), Broome (), and Hammond (), I believe that they should be identified. I will say more about this in section ., but the most important claim is that it does not really matter who is right. The conclusion of Harsanyi’s theorem appears to tell us exactly what the content of the betterness relation is, and what name we should give to that conclusion is of much less importance.

In my view, it is hard to exaggerate the importance of Harsanyi’s result. I will assume enough familiarity with expected utility theory, references to which were provided earlier, to see the prima facie case for Individual EUT and Social EUT. The rough idea is that the prima facie case for rational preference relations satisfying the expected utility axioms can be modified to apply directly to evaluative relations like individual betterness relations and the betterness relation. The prima facie case for the other premises is fairly natural as well. The best way to explore this further will be to look at criticisms of the premises. We will do that shortly, but first I want to consider how Harsanyi’s theorem improves on what we have seen so far.

The popular appeal to expected utility theory sketched in section 33.5 suffered from telling us little of any use about the betterness relation.
But if we take individual betterness relations as given, and accept the premises of Harsanyi’s theorem, the theorem shows that the content of the betterness relation is completely determined.

Consider now veil of ignorance arguments. Both Harsanyi’s and Rawls’s arguments accept Individual EUT and Interpersonal Completeness. That leaves Harsanyi’s veil argument with Veil of Harsanyi and Rawls’s with Veil of Rawls, while Harsanyi’s theorem is left with Social EUT, Impartiality, and Pareto.

Harsanyi’s veil argument works by assuming that the person behind the veil is rational, and therefore has preferences which satisfy the expected utility axioms. Given that, Veil of Harsanyi yields Social EUT, and also, obviously, Impartiality and Pareto. So Harsanyi’s veil argument enjoys no advantage over his theorem, and the theorem simply bypasses worries about veil arguments expressed by e.g. Scanlon ().

The comparison with Rawls is less clear. When discussing the veil, Rawls usually considers only the problem of ranking different histories. But someone behind the veil could also try to rank different lotteries (thus facing two forms of ignorance: uncertainty behind the veil, and risk beyond the veil). So we can ask what she thinks about Social EUT, Impartiality, and Pareto. It would be surprising if the uncertainty assumption led her to reject any of these claims, and thence Sum. But since Rawls is so plainly opposed to Sum, I think this suggests that aspects of his informal reasoning have not been fully captured in what seems to be his formal model. Sections . and . will discuss two major Rawlsian worries about some of Harsanyi’s premises. But to foreshadow, these worries can be expressed directly as criticisms of the premises of Harsanyi’s theorem, and appealing to the veil does not seem to add anything.
Finally, we will see in section 33.9 that there is at least one major view about the ethics of distribution which is impartial but is immediately ruled out by the adoption of a veil of ignorance, whether Harsanyi’s or Rawls’s. So much the worse for the veil as a model of impartiality. Thus in my view, the veil turns out to be just an unhelpful distraction, and the proper focus of attention for the ethics of distribution should be Harsanyi’s theorem.

33.8 Variable populations

Before looking at various worries about and alternatives to the premises of Harsanyi’s theorem, it is worth mentioning a way in which it can be extended. Problems where the population can vary are difficult. But we do not need to add much to the premises of Harsanyi’s theorem to make progress.

The following says that only the kinds of lives people are living matter, not the identities of those people.

Anonymity For all histories h and h′ containing finite populations of the same size, if there is a mapping ρ from the population of h onto the population of h′ such that for every member i of the population of h, h is exactly as good for i as h′ is for ρ(i), then h and h′ are equally good.

This premise makes the nonidentity problem discussed by Parfit () rather trivial: if no one else will be affected, and a woman has to choose between having one of two different children, Anonymity plus Pareto implies that it would be better if she had the child whose life would be better.

Let U be the function defined on histories such that for any history h with population 1, …, n,

$$U(h) := u_1(h) + u_2(h) + \cdots + u_n(h).$$

Then the premises of Harsanyi’s theorem, but with Impartiality replaced by the stronger Anonymity, jointly imply

Same Number Claim Assume that all histories contain populations of the same size.
Then the risk-free betterness relation is represented by U.

Turning to comparisons between populations of different sizes, I will outline an approach due to Broome () and Blackorby, Bossert, and Donaldson (). I lack the space to discuss the details, but the crucial step is to argue for the

Neutral existence claim There exists a life l such that in every situation, provided no one already existing is affected, (i) it is better to create an extra life which is better than l; (ii) it is worse to create an extra life which is worse than l; (iii) it is a matter of indifference to create an extra life which is exactly as good as l.

Call such a life a neutral existence. Given a parameter v, let V be the function defined on histories such that for each history h with population 1, …, n,

$$V(h) := (u_1(h) - v) + (u_2(h) - v) + \cdots + (u_n(h) - v).$$

Some simple algebra shows that the same number and neutral existence claims together imply the

Variable number claim Assume that all histories contain finite populations. Then the risk-free betterness relation is represented by V, where v is the utility level of a neutral existence.

The value of v makes no difference to same number problems. For when comparing two histories with populations of the same size using V, the subtracted v’s cancel out. In variable number problems, the presence of v in the definition of V means that, ignoring effects on other people’s lives, someone’s existence makes a positive contribution towards goodness if and only if her life is better than a neutral existence.

Nothing so far said tells us what the value of v is, however. Setting it will involve further ethical issues, and is difficult to do in a way which respects common intuitions (Broome, ). For example, setting it low leads to the conclusion that a large number of people (e.g. a billion) all with extremely good lives is worse than an extremely large number (e.g.
a billion billion) all with lives which may seem hardly worth living. Parfit () evidently did not think much of this idea when he famously called it “the repugnant conclusion.” On the other hand, setting the value of v high makes it bad to create someone who would have an intuitively good life, and that may seem implausible too.

When we ethicists first start to think seriously about probability, it may seem like a bane for us, vastly expanding the complexity of questions we have to address. But it may now look like a blessing. The problem of aggregating individual well-being to form an overall judgment about when one history is better than another seems difficult. Yet without appearing to make any assumptions about aggregation, and instead by largely appealing to expected utility theory, which is all about probability, Harsanyi’s theorem seems to provide a solution. Section . will provide a closer look at the question of whether the theorem really does solve the “problem of aggregation.” But we first examine criticisms of and alternatives to Harsanyi’s premises which are also about probability.

33.9 Equality and fairness

The additive form of the conclusion of Harsanyi’s theorem will make some suspect that its premises conflict with the idea that in the distribution of goods, equality and fairness matter. But where, if anywhere, is the tension? Assume a population of two people, A and B, and consider the following lotteries, which combine examples due to Diamond () and Myerson ().

[Tables: lotteries $L_E$, $L_F$ and $L_U$, each giving the outcomes for A and B under heads and under tails.]

Anyone who thinks that equality is valuable should think that $L_E$ is better than $L_F$.
For while $L_E$ and $L_F$ are equally good for each person, $L_E$ has in its favour that it guarantees equality of outcome while $L_F$ guarantees inequality (Myerson, ). But Pareto implies that $L_E$ and $L_F$ are equally good, so it is inconsistent with the idea that equality is valuable.

Anyone who thinks that fairness is valuable should think that $L_F$ is better than $L_U$. For while Impartiality implies that the outcomes under $L_F$ and $L_U$ are equally good, $L_F$ has in its favour that it distributes the chances fairly (Diamond, ).

Diamond’s example leads to the first of a series of challenges to the assumptions about expected utility in Harsanyi’s premises. By Impartiality, all of the outcomes under $L_F$ and $L_U$ are equally good. Strong Independence of the betterness relation then implies that $L_F$ and $L_U$ are equally good. Hence the assumption that the betterness relation satisfies the expected utility axioms, in particular Strong Independence, clashes with the idea that fairness is valuable.

Note: Proof: for all lotteries L and M, write L ≿ M for “L is at least as good as M”. By Impartiality, [, ] ∼ [, ]. Strong Independence for ≿ then implies $L_U$ = ½[, ] + ½[, ] ∼ ½[, ] + ½[, ] = $L_F$, as required.

I think that Myerson’s and Diamond’s examples lie at the heart of concerns with equality and fairness. It is difficult to argue for this in a short space, though section . will say more. But suppose it is correct. How could the examples be generalized into full-blown theories about what it is for equality or fairness to be valuable?

I will just illustrate an approach for the case of equality. Suppose we are given a preorder $\succsim_e$ on histories such that $h \succsim_e h'$ if and only if h is uncontroversially (among egalitarians) at least as good in terms of equality as h′. My own account of the extension of $\succsim_e$ is in McCarthy ().
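The structural claims made about the three coin-toss lotteries can be checked mechanically. In the sketch below the payoffs 1 and 0 are an assumption: they are the values commonly used to present Diamond's and Myerson's examples, chosen here only so that the lotteries have the properties described in the text, and the function names are hypothetical.

```python
from fractions import Fraction as F

HALF = F(1, 2)  # a fair coin is assumed

# Each lottery maps the coin outcome to (utility of A, utility of B). Assumed payoffs.
LE = {"heads": (1, 1), "tails": (0, 0)}  # equality of outcome guaranteed
LF = {"heads": (1, 0), "tails": (0, 1)}  # inequality guaranteed, chances shared
LU = {"heads": (1, 0), "tails": (1, 0)}  # A gets the good whatever happens

def individual_expectations(lottery):
    """Expected utility of the lottery for A and for B."""
    return tuple(sum(HALF * outcome[i] for outcome in lottery.values()) for i in range(2))

def expected_sum(lottery):
    """Expected value of u_A + u_B, the quantity ranked by Sum."""
    return sum(HALF * sum(outcome) for outcome in lottery.values())

def guarantees_equality(lottery):
    return all(a == b for a, b in lottery.values())

def same_outcomes_up_to_permutation(l1, l2):
    """True if, state by state, one outcome is a permutation of the other."""
    return all(sorted(l1[s]) == sorted(l2[s]) for s in l1)

print(individual_expectations(LE), individual_expectations(LF), individual_expectations(LU))
print(guarantees_equality(LE), guarantees_equality(LF))
print(same_outcomes_up_to_permutation(LF, LU))
print(expected_sum(LE), expected_sum(LF), expected_sum(LU))
```

With these payoffs, the first two lotteries give each person the same expectation, so Pareto treats them as equally good even though only the first guarantees equality. The second and third differ in how chances are distributed, yet state by state their outcomes are permutations of one another, which is what Impartiality and Strong Independence exploit. Sum assigns all three the same value.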
To give two simple cases of the equality ordering $\succsim_e$, every equal distribution is going to be uncontroversially better in terms of equality than every unequal distribution, and all equal distributions are going to be uncontroversially equally good in terms of equality. Consider

Equality-neutral Pareto Assume a fixed population. For all lotteries $L = [p_1, h_1; \dots; p_m, h_m]$ and $L' = [p_1, k_1; \dots; p_m, k_m]$: (i) if L is exactly as good as L′ for all individuals, and $h_j \sim_e k_j$ for all j, then L and L′ are equally good; and (ii) if L is at least as good as L′ for all individuals and better for some individual, and $h_j \succsim_e k_j$ for all j, then L is better than L′.

Equality principle Assume a fixed population. For all lotteries $L = [p_1, h_1; \dots; p_m, h_m]$ and $L' = [p_1, k_1; \dots; p_m, k_m]$: if L is at least as good as L′ for all individuals, $h_j \succsim_e k_j$ for all j and $h_j \succ_e k_j$ for some j, then L is better than L′.

McCarthy () argues that together, these principles are the core of egalitarianism. Equality-neutral Pareto is a weakening of Pareto, designed to avoid clashes with examples like Myerson’s. The equality principle is designed to generalize the idea that equality is valuable, as illustrated by Myerson’s example. Thus we obtain a very general egalitarian theory by starting with Harsanyi’s premises, weakening Pareto to its equality-neutral cousin, then adding the equality principle.

Notice that the equality principle is inconsistent with the adoption of either Harsanyi’s or Rawls’s veil of ignorance. But it can easily be shown to be consistent with the notion of impartiality captured by Impartiality. So if it was meant only to model impartiality, the adoption of a veil of ignorance is too strong.

The characterization of the idea that equality is valuable via the equality principle exploits natural dominance ideas. Roughly speaking, suppose that each part of some object x is at least as good with respect to some value V as the corresponding part of object y.
Then x is said to weakly dominate y in terms of the value V. If x weakly dominates y, but y does not weakly dominate x, then x strictly dominates y. Thus the equality principle says that if L weakly dominates L′ in terms of well-being, and strictly dominates L′ in terms of equality, then L is better than L′.

I lack the space to discuss the details, but I believe that the way to characterize the idea that fairness is valuable is to develop dominance ideas in a way suggested by Diamond’s example. However, while the apparent similarities between Diamond’s and Myerson’s examples suggest parallels, it appears that there are subtle asymmetries between concerns with equality and concerns with fairness (McCarthy and Thomas, ).

Note (on the terminology of equality and fairness): This is not quite right. In my view it is better to say that Myerson’s example is about equality of outcome, and Diamond’s is about equality of prospects, not fairness. But here I stick with the more usual terminology. For reasons for not talking about fairness, see McCarthy ().

33.10 Priority

Parfit () argued that what he called the priority view is an important alternative to egalitarianism, sharing many of its apparent virtues but avoiding what he called the leveling-down objection. He summarized it via the slogan that “benefiting the worse off matters more”, but commentators have been divided over whether he managed to articulate a genuine alternative to egalitarianism.

A puzzle about making sense of the priority view is that its distinctive feature is advertised as an intrapersonal phenomenon: what is bad about people being worse off is that they are worse off than they might have been (Parfit, , p. ).
This has suggested to commentators that according to the priority view, it matters more to benefit someone the worse off she is even when no others are around at all (Rabinowicz, ). But in cases where only one person is around and risk is not involved, the priority view, like any other sane view, will accept that one history is better than another if and only if it is better for the sole person.

Matters are different, however, when risk is involved. Several commentators have thought that the priority view should be formulated in a way which makes it have distinctive consequences in one-person cases involving risk (Rabinowicz, ; McCarthy, ; Otsuka and Voorhoeve, ). I am inclined to go further and say that the key idea behind the priority view receives its clearest and most fundamental expression in such cases.

To illustrate, suppose A is the only person around, and compare the history h = [] with the lottery L = [½, ; ½, ], with the numbers supplied by $u_A$. Because L and h are equally good for A, Pareto implies that they are equally good. But I believe that the priority view should be understood as saying that h is better than L. More generally, I believe that the key idea of the priority view is what I call the

Priority principle Assume a fixed population. Suppose histories $h_1$, $h_2$ and $h_3$ each contain perfect equality. Then (i) $h_1$ is at least as good as $h_2$ if and only if $h_1$ is at least as good for each individual as $h_2$; and (ii) if for each individual i, $h_1$ is better for i than $h_2$, $h_2$ is better for i than $h_3$, and $h_2$ is exactly as good for i as $L = [\tfrac{1}{2}, h_1; \tfrac{1}{2}, h_3]$, then $h_2$ is better than L.

Notice that this is inconsistent with equality-neutral Pareto. Some writers find it absurd that in one-person worlds, the betterness relation and the sole person’s individual betterness relation could diverge (e.g. Otsuka and Voorhoeve, ), as the priority principle implies.
Rabinowicz () regards this claim as acceptable, while Parit (), for example, ofers a defense. But rather than discuss possible defenses of the priority principle, I will note a less discussed objection to the priority view. he priority view can be formulated by starting with Harsanyi’s premises, weakening Pareto far enough to accommodate the priority principle, then adding the priority principle (McCarthy, forthcoming a). But when this is done, any account of the extension of the betterness relation which is consistent with the Harsanyi premises turns out to be consistent with the priority view premises, and vice versa. But the priority view has a more complicated way of describing the betterness relation, because of the less simple relationship it posits between betterness and individual betterness in one-person worlds. So the objection is that the priority view fails to provide a reasonable alternative to the Harsanyi premises, not because of any ethically absurd implications, but 724 david mccarthy because of the theoretical vice of needless complexity (cf. Harsanyi, b; Broome, ; McCarthy, , forthcoming a). 33.11 Continuity ............................................................................................................................................................................. Continuity is seldom discussed. When it is mentioned, it is oten said just to be a technical assumption. But when the claim is that the betterness relation or individual betterness relations satisfy Continuity, this is a clear mistake. To illustrate, let a be a very good life, a+ a slightly better life, and z an extremely bad life, such as being in severe pain or enslaved for a long time. he claim that individual betterness relations satisfy Continuity implies that there is a gamble which would almost guarantee an individual a+ with a small chance of z which is better for the individual than having a for certain. 
But regardless of what one thinks about this case, it is not a technical assumption to claim that the risk is worth it. It is a substantive evaluative judgment, and different views about it are reasonable. For what it is worth, I believe that many of Rawls’s informal remarks about his veil of ignorance would have been more naturally modeled by denying that individual betterness relations satisfy Continuity because of this kind of case than by his actual model. It is clear that Continuity is something ethicists should pay attention to. The good news is that the result of weakening the expected utility axioms by dropping Continuity is formally well understood, thanks to results by Hausner () and others. But there are several pieces of bad news. First, the general statement of Hausner’s result is quite mathematically complex and not easy to speak about informally. Secondly, it is time to stop speaking of the continuity axiom. There are several EUT-style continuity axioms (see e.g. Hammond, ), and it is far from clear what the ethical grounds for adopting one but not another might be. Thirdly, speaking loosely, Continuity failures occur when one lottery in some sense has “infinitesimal” value compared with another. But such cases pose a challenge to standard treatments of probability as well, and this needs to be incorporated into the analysis. In summary, perhaps in the end ethicists can safely ignore Continuity. But it would be better to know that than to hope for it, and the work needed to arrive at such a conclusion appears to be substantial.

33.12 Incommensurability

One of the major contributions of the contractualist literature has been to force us to take seriously difficulties with evaluative comparisons of different kinds of goods.
Note [on the objection from needless complexity]: As an analogy, consider again the best-system analysis of laws. Suppose someone offers some account of the laws of the world which captures all relevant facts. But this account is more complex than some other account which also captures all relevant facts. On the best-system analysis, the more complex account is mistaken about what the laws are, despite getting the relevant facts right. McCarthy (forthcoming a) argues that the priority view is mistaken on similar grounds.

Note [on Continuity failures and standard treatments of probability]: For an accessible account of how the challenge applies to Savage’s treatment of subjective probability, and a sketch of mathematically sophisticated responses, see Gilboa () pp. –.

Note [on the work needed on Continuity]: For recent work in this direction, see Jensen ().

But part of the assumption that the betterness relation and individual betterness relations satisfy the expected utility axioms is that these relations are complete. But from the perspective of difficulties with evaluative comparisons, such completeness assumptions look far from obvious. They may seem particularly implausible if we adopt the popular view that the basis for such things as interpersonal comparisons should be as neutral as possible between competing substantive views about what a good life is, as argued, for example, in Rawls (). One response would be to adopt something like resources, freedoms, or opportunities as the basis for interpersonal and intrapersonal comparisons (see e.g. Rawls, ; Sen, ). However, the premises of Harsanyi’s theorem are silent on the content of individual betterness relations, so there is no obvious reason why the theorem cannot be run when their content is understood in terms of resources and so on. Nevertheless, even resources have their own problems to do with comparability because of the different nature of different kinds of resources.
So this response is a diversion, and we should turn directly to Harsanyi’s premises to see what can be done about difficulties with comparability. The most immediately tempting response is simply to drop the completeness assumptions. This means that the various evaluative relations featuring in the theorem become preorders which are not assumed to be complete. A large advantage of working with preorders is that mathematically speaking, they are relatively tractable. For example, a corollary of Szpilrajn’s theorem is that a preorder is identical to the intersection of all of the complete preorders which extend it. This has the advantage that in thinking about preorders one can often work with complete preorders anyway. This corollary is strikingly parallel to the supervaluationist treatment of vague predicates: a sentence involving a vague predicate is true if it is true on all admissible sharpenings of the predicate, false if it is false on all admissible sharpenings, and neither true nor false otherwise. But this should suggest caution: if a natural response to difficulties to do with comparability is to shift to preorders, the response looks like one of the classic candidates for a solution to the problem of vagueness. But supervaluationist approaches have been heavily criticized (see e.g. Williamson, ). Furthermore, perhaps the parallel suggests that the basic problem with comparing different kinds of goods is one of vagueness. In fact, cases in which evaluative comparisons look extremely difficult seem to lend themselves to sorites paradoxes, one of the hallmarks of vagueness. In one way this is good news: there is a vast amount of work on vagueness, so ethicists have plenty of material to borrow from. Since the topic is probability, it is worth mentioning that some treatments of vagueness are probabilistic, and that an extensive literature takes this approach to vague comparatives; see e.g. Fishburn () for a survey.
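The corollary of Szpilrajn’s theorem mentioned above can be checked by brute force on a very small example. The sketch below is only illustrative: the three items and the particular incomplete preorder are hypothetical, and a complete preorder is taken to extend a preorder when it contains it and keeps each of its strict comparisons strict.

```python
from itertools import product

def weak_orders(items):
    """All complete preorders on items, each as a frozenset of pairs (x, y)
    read as 'x is at least as good as y'."""
    orders = set()
    for ranks in product(range(len(items)), repeat=len(items)):
        rank = dict(zip(items, ranks))
        orders.add(frozenset(
            (x, y) for x in items for y in items if rank[x] >= rank[y]
        ))
    return orders

def strict_part(rel):
    """Pairs (x, y) where x is at least as good as y but not vice versa."""
    return {(x, y) for (x, y) in rel if (y, x) not in rel}

def extends(big, small):
    """True if big contains small and keeps small's strict comparisons strict."""
    return small <= big and strict_part(small) <= strict_part(big)

items = ["a", "b", "c"]
# Hypothetical incomplete preorder: a is strictly better than b,
# while c is comparable only to itself.
preorder = frozenset({("a", "a"), ("b", "b"), ("c", "c"), ("a", "b")})

extensions = [r for r in weak_orders(items) if extends(r, preorder)]
intersection = frozenset.intersection(*extensions)

print(len(extensions))           # number of complete extensions
print(intersection == preorder)  # the intersection recovers the preorder
```

Each complete extension plays the role of one admissible sharpening, and a comparison holds in the original preorder exactly when it holds in every sharpening, which is the structural parallel with supervaluationism.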
In another way it’s bad news: perhaps the main reason why there is so much literature on vagueness is the almost complete lack of consensus. Perhaps we ethicists should just shelve the problem of how best to model difficulties to do with evaluative comparisons until there is more convergence in the literature on vagueness. However, in the absence of such convergence, it may still be possible to achieve some kind of stability result: show that the solutions to a class of interesting ethical problems which involve goods which are difficult to compare are insensitive to the resolution of more general problems about vagueness. For example, Broome () takes this approach in his discussion of the neutral level for existence. In section . I will suggest that the same can be done for the question of what Harsanyi’s theorem really shows.

33.13 Non-expected utility theory

The backbone of Harsanyi’s theorem is expected utility theory, but we have seen a number of ways in which the claim that various evaluative relations satisfy the expected utility axioms can be criticized. The axioms so far criticized are Strong Independence, Ordering (insofar as completeness was criticized), and Continuity. Some writers even go so far as to criticize transitivity (see e.g. Temkin, ). These criticisms are directly based either on distributive intuitions (Strong Independence, Continuity), or on the nature of goods being distributed (Ordering). But a serious question about the expected utility axioms arises from a different direction. Since Allais () and Ellsberg (), it has appeared to many that individual preference relations violate the expected utility axioms in fairly systematic ways.
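As a reminder of what such a violation looks like, the sketch below works through the standard textbook version of the Allais choices. The prizes and probabilities are the usual textbook ones rather than figures from this chapter, and the three utility functions are arbitrary examples.

```python
import math

def expected_utility(lottery, u):
    """Expected utility of a lottery given as (probability, prize) pairs."""
    return sum(p * u(x) for p, x in lottery)

# Standard textbook Allais lotteries; prizes in millions (not from the chapter).
L1A = [(1.00, 1)]
L1B = [(0.89, 1), (0.10, 5), (0.01, 0)]
L2A = [(0.11, 1), (0.89, 0)]
L2B = [(0.10, 5), (0.90, 0)]

def gaps(u):
    """Expected utility advantage of A over B in each of the two choices."""
    first = expected_utility(L1A, u) - expected_utility(L1B, u)
    second = expected_utility(L2A, u) - expected_utility(L2B, u)
    return first, second

# Arbitrary example utility functions.
for u in (lambda x: x, math.sqrt, math.log1p):
    first, second = gaps(u)
    print(round(first, 6), round(second, 6))
```

Whatever utility function is used, the two gaps coincide, so an expected utility maximizer who prefers the first option in the first choice must also prefer the first option in the second. The commonly reported pattern of choosing the sure prize in the first choice and the riskier lottery in the second therefore cannot be represented by any expected utility function.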
The attempt to describe these violations has led to a huge body of work developing alternatives to the expected utility axioms (for surveys see e.g. Schmidt, ; Sugden, ; Gilboa, ; Wakker, ). This project has been accompanied by two broad views. One is that the alternative axioms simply help us catalogue human irrationality, which might of course be very important in various descriptive and explanatory contexts. The other, often prompted by the fact that the violations are often stable under criticism, is that the support the alternative axioms tacitly enjoy genuinely threatens the picture of rationality provided by expected utility theory. Now these are views about rationality, whereas we have been interested in such things as betterness and betterness for people. But the development of non-expected utility theory suggests that it would be interesting to modify distributive theories which to varying extents involve the expected utility axioms by weakening those axioms and then adding some of the non-expected utility axioms. If the application of the non-expected utility axioms to such things as individual betterness relations turns out to be reasonably well motivated, the result should be an expanded account of reasonable distributive theories. But even if those axioms are not well motivated when applied to evaluative relations, this project would still be worth pursuing. If a class of popular distributive intuitions turns out to be generated by such an application of non-expected utility theory, we would in effect have an important error theory.

33.14 Evaluative measures

Discussions of the ethics of distribution commonly assume the existence of quantitative measures of various evaluative properties, then use these measures to formulate various apparently natural ideas.
For example, individual goodness measures, quantitative measures of how good histories are for individuals, are often taken to exist. Then assuming a constant population 1, . . . , n, it is often claimed that

(U) According to utilitarianism, two histories are equally good if they contain the same sum of individual goodness.
(E) According to egalitarianism, an equal distribution is better than an unequal distribution of the same sum of individual goodness.
(P) According to the priority view, it is better to give a unit of individual goodness to a worse-off person than to a better-off person.

These claims tacitly assume that talk of units of individual goodness is well-defined. They are often taken to be (at least partial) definitions of the distributive theories in question, making what seems natural or appealing about the theories in question transparent. For more detail, McCarthy () examines the role of evaluative measurement in common understandings of the priority view. However, there are serious difficulties with this kind of approach to the ethics of distribution. I will mention just one specific problem. The only obvious fact about individual goodness measures is that they have to represent risk-free individual betterness relations. But this only makes individual goodness measures unique up to increasing transformation. But for units of individual goodness to be well-defined, individual goodness measures must be unique up to positive affine transformation. So to make them well-defined it looks as if we need to make an arbitrary choice of measure (Broome, , p. ). But this will make the theories partially defined by (U), (E), and (P) rest on an arbitrary choice, and fail to vindicate the idea that they are the fundamental theories about the ethics of distribution we take them to be.
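The difference between the two kinds of uniqueness can be seen in a small numerical sketch. The two-person histories and the particular transformations below are hypothetical; the point is only that a strictly increasing transformation preserves each individual’s risk-free ranking while changing which histories have equal sums, whereas a positive affine transformation preserves both.

```python
import math

def total(history, measure):
    """Sum of individual goodness in a history under a given measure."""
    return sum(measure(level) for level in history)

# Two hypothetical two-person histories, described by an initial measure.
h1 = (1.0, 9.0)
h2 = (5.0, 5.0)

identity = lambda x: x
increasing = math.sqrt            # strictly increasing, but not affine
affine = lambda x: 2.0 * x + 3.0  # positive affine transformation

# All three measures rank the levels 1 < 5 < 9 in the same order.
levels = [1.0, 5.0, 9.0]
same_order = all(
    sorted(levels, key=m) == levels for m in (identity, increasing, affine)
)

print(same_order)
print(total(h1, identity) == total(h2, identity))  # equal sums
print(total(h1, affine) == total(h2, affine))      # still equal
print(total(h1, increasing), total(h2, increasing))  # no longer equal
```

So a claim such as (U), which turns on sameness of sums, delivers different verdicts under measures that agree on every risk-free individual comparison, unless something further fixes the measure up to positive affine transformation.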
More generally, taking the existence of quantitative evaluative measures as given, then using them to theorize about the ethics of distribution, is strongly at variance with standard views about measurement in the physical and social sciences. There, quantitative measures are seen as emerging as canonical descriptions of qualitatively described prior structures (see e.g. Krantz, Luce, Suppes and Tversky, ; Narens, ; Roberts, ). My own view is that we should treat evaluative measurement along the same lines. By itself, this does not begin to settle what we should say about individual goodness measures. But individual goodness measures turn out to be well-enough defined for talk of units, sums, and so on to make sense, at least given certain background assumptions. I can only sketch this view, but in more detail, sections . and . point to a characterization of egalitarianism and the priority view in terms of primitive qualitative relations (betterness, individual betterness). Similarly, I think the premises of Harsanyi’s theorem should be understood as characterizing utilitarianism. Now (U), (E), and (P) are close to platitudinous. But given these characterizations of utilitarianism, egalitarianism and prioritarianism, this means that we can treat (U), (E), and (P) as implicit definitions of individual goodness measures. The result is that individual goodness measures turn out to be the positive affine transformations of u_1, . . . , u_n, or what Broome () calls Bernoulli’s hypothesis. For details, see, for example, McCarthy (). The background assumptions are that individual betterness relations satisfy the expected utility axioms and that interpersonal comparisons are unproblematic. But what if these fail? I will not pursue this, for I think the most important lesson about evaluative measures is not that they are arguably well-defined, but that it does not much matter. We can and should theorize about the questions which really matter in the ethics of distribution without using evaluative measures. By focusing instead on comparatives and various claims about probability, none of the distributive views we have been discussing presuppose the existence of evaluative measures, the preeminent example, of course, being Harsanyi’s.

Note [on uniqueness up to increasing transformation]: I.e. if some function f represents the risk-free betterness relation, and g is some strictly increasing function on the reals (x < y ⇒ g(x) < g(y)), then g ◦ f also represents the risk-free betterness relation.

33.15 Aggregation

But this raises the question of what Harsanyi’s theorem really shows. Ethicists often talk about the “problem of aggregation”. What they typically have in mind is the task of somehow combining an assessment of what things are like for each individual in a particular situation to form some sort of overall judgment of the situation which enables us to make an evaluative comparison with other situations. Supposing the premises of Harsanyi’s theorem are correct, it is tempting to think that Harsanyi’s theorem solves the problem of aggregation. I believe this was Harsanyi’s view, and I think it is popular among welfare economists. Harsanyi did not use the terms ‘individual betterness relation’ and ‘betterness relation’, and I stress that the following passage is mine, not his. But I think the following captures the spirit of his view (see especially Harsanyi, a). Determining the content of individual preference relations (despite filtering out various irrationalities, excluding such things as sadistic preferences, and requiring preferences to be rich enough to enable interpersonal comparisons) is basically a psychological matter (Harsanyi, a).
It does not involve any significant evaluative or aggregative assumptions. But we should identify individual betterness relations with individual preference relations. Given the truth of Harsanyi’s premises, Harsanyi’s theorem then explicitly determines the extension of the betterness relation. Problem of aggregation solved. This position underplays the role of evaluative assumptions in determining the content of individual betterness relations in at least two ways. First, determining the content of individual preference relations may well involve prior evaluative assumptions because of the role of such assumptions in popular accounts of radical interpretation (see e.g. Lewis, ). Secondly, even when they are restricted to histories, identifying individual betterness relations with individual preference relations is highly controversial. It is a major evaluative question whether to understand the content of risk-free individual betterness relations in terms of preferences, the quality of the individual’s experiences, her achievements, or some combination thereof. But suppose that evaluative question has been settled, and that Harsanyi’s premises are true. The theorem certainly shows that figuring out the content of the betterness relation is no harder than determining the content of individual betterness relations. But what exactly does it show about the problem of aggregation? First, it is a vast exaggeration to say that the theorem solves the problem of aggregation. Problems of aggregation arise whenever we have to make some sort of assessment of a whole based on an assessment of its parts. But figuring out the content of individual betterness relations involves major questions of aggregation.
Even in the case in which all outcomes are equally likely, to assess whether facing some lottery is better for someone than some particular outcome, we will have to assess what each of the possible outcomes of the lottery are like for her, then somehow aggregate to reach an overall assessment of the lottery. This problem is complicated and is, in my view, much neglected. Like many economists, Harsanyi’s own account tacitly appeals to the individual’s preferences. But this should not seem very appealing to those of us who think that preference satisfaction accounts are mistaken even for the question of when one outcome is better for an individual than another. Secondly, there is no logical reason why we cannot use the theorem to deduce the content of individual betterness relations from the content of the betterness relation, in particular from judgments about when one history is better than another. In cases where we are very confident about the latter, this will even seem appealing. I am afraid I lack the space to discuss this, but I think this idea provides a natural way of interpreting various contractualist comments about veil of ignorance arguments (see e.g. Scanlon, ; Nagel, ), in particular leading to an interesting case for rejecting the claim that individual betterness relations satisfy Continuity. More generally, if its premises are true, Harsanyi’s theorem teaches us that determining the content of the betterness relation is easier than we may have thought. But the flipside is that determining the content of individual betterness relations is harder than many of us have assumed.

33.16 Summary on evaluation
When thinking about the ethics of distribution, it may seem that the real evaluative questions are about when one history is better than another, or better for some individual. Factoring in probability may then seem like a basically technical exercise, not one ethicists need be much concerned with. Almost every topic discussed could easily have its own survey article. I have had to omit many important positions, and give only sketchy defenses of positive positions. Nevertheless, I have tried to make the case for the opposite view. Not only are there very important ethical issues about how to rank lotteries, but these issues directly bear on questions about when one history is better than another. I will end the evaluative discussion with two opinions. First, if I am right, almost every major position on the ethics of distribution is essentially to do with probability. For example, assuming a constant population 1, . . . , n, concerns with fairness, equality, and giving priority to the worse-off as characterized in sections . and . can each be shown to be consistent with the popular idea that the risk-free betterness relation is represented by w ◦ u_1 + · · · + w ◦ u_n for some strictly increasing and strictly concave function w. These views come apart only when probability is introduced. So one aspect of the importance of probability is the increase in expressive power its introduction provides: it allows us to draw distinctions which are difficult or impossible to draw in a risk-free framework. Secondly, I think the various challenges to Harsanyi’s premises stemming from appeals to equality, fairness, priority, and non-expected utility theory fail. To be sure, there is at least a reasonable case for rejecting Continuity, and Ordering (at least, the completeness part of it) is under serious threat.
Nevertheless, we can drop Continuity and under many ways of modeling difficulties to do with comparability, what I take to be the core lesson of Harsanyi’s theorem remains stable: determining the content of individual betterness relations and determining the content of the betterness relation are just different descriptions of the same problem. This may help. Our initial judgments about individual betterness and about betterness may be in tension with each other, and we may be more confident about some judgments than others. Harmonizing these judgments in an attempt to achieve reflective equilibrium may increase our confidence in the result.

33.17 Ought

Expected utility theory has turned out to be hugely important for developing a taxonomy of answers to the fundamental evaluative question: when is one history or lottery better than another? I have not emphasized this, but I also think that the clarity of this taxonomy is extremely helpful for assessing which answer is correct. In the remaining space I have room for only one suggestion which, though hardly very original, is that the same turns out to be true for decision theory and the fundamental normative question: what ought we to do? One immediate disclaimer is needed. Expected utility theory is usually understood as a theory about the structure of the preferences of ideally rational agents. But this chapter has discussed the application of expected utility theory to understanding evaluative comparatives without having to say anything about rationality. Rather, many of the ideas and criticisms of expected utility theory are directly applicable to questions about evaluative comparatives.
Similarly, decision theory is usually understood as an account of ideally rational action, and it is typically assumed that the rationality of an action depends in some way upon the agent’s preferences. However, we can apply many ideas from decision theory directly to questions about the fundamental normative question without having to presuppose some grand connection between rationality and ethics. For example, it is a serious mistake to think that decision theory is going to be important to ethics only if ethics is somehow about preference satisfaction, or if we hitch ourselves to the unlikely project of deriving ethics from rationality. Thus the discussion of decision theory in what follows is only meant to draw parallels between questions about ethics and questions about rationality. Because the debates about rationality are often better developed, these parallels may be illuminating. With no attempt at exhaustiveness, the sequel will look briefly at three examples, with particular emphasis on probability.

Note: In fact, this [the stability of the core lesson of Harsanyi’s theorem] is true even if we weaken some of the EUT ideas in Harsanyi’s framework and add various well-known nonEUT ideas. This is further pursued in McCarthy, Mikkola, and Thomas () and McCarthy (forthcoming b).

33.18 Act consequentialism

Given some account of betterness, the most obvious ethical theory is act consequentialism: what we ought to do is to bring about the best available lottery. If we assume for simplicity that the betterness relation satisfies the expected utility axioms, act consequentialism then implies that there is some value function such that we ought to perform the action with the greatest expected value. Thus act consequentialism is the ethical theory which most obviously parallels decision theory.
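The structure of this claim can be shown in a minimal sketch. The two actions, their outcome values and the probabilities are all hypothetical placeholders; the only thing taken from the text is the rule of performing the action with the greatest expected value.

```python
def expected_value(lottery):
    """Expected value of a lottery given as (probability, value) pairs."""
    return sum(p * v for p, v in lottery)

def best_action(actions):
    """Name of the action whose lottery has the greatest expected value."""
    return max(actions, key=lambda name: expected_value(actions[name]))

# Hypothetical actions, each associated with a lottery over outcome values.
# The probabilities stand in for those relative to the agent's evidence.
actions = {
    "keep_promise": [(0.9, 10.0), (0.1, -5.0)],
    "break_promise": [(0.5, 12.0), (0.5, -4.0)],
}

print(best_action(actions))
```

Nothing in the sketch says where the value function or the probabilities come from, and the ethical work lies in those two inputs rather than in the maximization step.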
Act consequentialism is also one of the most criticized theories, one standard criticism being that it has implausible implications. For example, assuming an impartial method of valuation, Williams () argued that act consequentialism undermines the partiality which for many people makes life worth living: devotion to personal projects and particular people, often friends and family. But this raises the question of what act consequentialism really requires in the first place. Taking for granted a probability-based view which uses subjective probabilities, or at least, probabilities which are relative to the evidence available to the agent, Jackson () famously argued that because of facts about each individual’s probabilities, act consequentialism will typically not require each agent to promote general well-being and pursue whichever projects are the most impartially valuable. Rather, it will require a typical agent – Alice, let’s call her – to promote the well-being of the relatively small group of people Alice knows and cares about, and to adopt and then pursue projects in which Alice takes a natural interest. This does not amount to a rejection of impartial valuation, but instead reflects facts about each agent’s limited information, the costs of deliberation and of acquiring new information, the complexity of the interpersonal and intrapersonal coordination problems she faces, the effects her actions will have on the expectations others will have of her future behaviour, her motivational strengths, and so on. Such facts will be encoded in the agent’s probabilities, and will therefore affect which of her acts will maximize expected value. Very often, Jackson argued, such acts will favour her nearest and dearest. Jackson’s argument was offered as a response to Williams, but it offers a much more general lesson. Understanding what act consequentialism implies is going to require sophisticated thinking about probability.
The huge complexity of this problem stands in sharp contrast to the occasional complaint that act consequentialism is simple-minded.

33.19 Rule consequentialism

Many writers, however, prefer rule consequentialism (or contractualism: at the normative level, these views are often very similar). On the one hand, rule consequentialism seems to fit better with common opinion about what we ought to do than act consequentialism (it is said to secure rights etc.). On the other, it seems to avoid the obscurities of deontology by resting its account of what we ought to do on an appeal to what is good for people. But how is this achieved? Harsanyi’s writings on rule utilitarianism offer a relatively clear answer. Simplifying slightly, Harsanyi () claims that each member of a society of act utilitarians will always maximize the sum of expected individual utilities where the calculation is based on her subjective probabilities of what the other members are going to do. Each member of a society of rule utilitarians is committed to and thus will always act upon the rule R which is such that if everyone acts according to R expected utility will be greater than if everyone acts according to some other rule (I ignore the possibility that two rules could be tied). Harsanyi claims that rule utilitarianism will lead to “incomparably superior” overall results in comparison with act utilitarianism because of its superiority in two kinds of scenarios: (i) in certain simultaneous coordination games (e.g. choosing whether to vote), and (ii) in certain sequential games (typically involving choices about respecting rights, keeping promises etc.).
This superiority is despite the fact that R will sometimes tell agents to perform actions which they are certain will produce suboptimal results, where optimality is understood in terms of maximizing the sum of expected utilities. This last feature leads many to suspect that there is something unstable about rule utilitarianism, but Harsanyi claims that these superior overall results imply that rule utilitarianism is correct. It would take a separate article even to outline the important issues here, and I merely want to make three points to illustrate the potential value of looking at this style of argument through the lens of contemporary debates about decision theory. To do that, I will assume for the sake of argument (though this is far from obvious) that Harsanyi is right about the superior overall results of rule utilitarianism in comparison with act utilitarianism. First, Harsanyi stresses that the rule utilitarians take themselves to be facing a problem involving complete probabilistic dependence: each will commit to (and thus act on) rule R if and only if all commit to R. In this respect, rule utilitarians are like clones in the well-known case of clones playing a prisoner’s dilemma. It is this probabilistic dependence which leads to rule utilitarianism’s superior performance in the coordination games. However, in these coordination games, there is causal independence between the actions of each player. But “probabilistic dependence yet causal independence” takes us to a crucial issue in decision theory. Very roughly, so-called evidential decision theory assesses (the rationality of) actions in terms of how likely good outcomes are conditional upon the actions being performed. By contrast, causal decision theory assesses actions in terms of their causal tendency to produce good outcomes. The classic case in which the two come apart is Newcomb’s problem.
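For readers who want the contrast in numbers, here is a rough sketch of Newcomb’s problem using the usual textbook payoffs and an assumed predictor accuracy of 0.99, none of which comes from the chapter. The causal calculation is simplified to holding fixed an unconditional probability that the opaque box is full, which is one common way of presenting the causal decision theorist’s reasoning.

```python
MILLION, THOUSAND = 1_000_000, 1_000

def payoff(act, box_full):
    """Money received: the opaque box holds a million if full; two-boxing
    adds the thousand in the transparent box."""
    base = MILLION if box_full else 0
    return base + (THOUSAND if act == "two-box" else 0)

def evidential_value(act, accuracy):
    """Expected payoff using the probability that the box is full
    conditional on the act, given the predictor's assumed accuracy."""
    p_full = accuracy if act == "one-box" else 1 - accuracy
    return p_full * payoff(act, True) + (1 - p_full) * payoff(act, False)

def causal_value(act, p_full):
    """Expected payoff holding fixed a probability that the box is full,
    since the act cannot now causally affect the contents."""
    return p_full * payoff(act, True) + (1 - p_full) * payoff(act, False)

ACCURACY = 0.99  # assumed predictor reliability
print(evidential_value("one-box", ACCURACY), evidential_value("two-box", ACCURACY))
for p in (0.0, 0.5, 1.0):
    print(p, causal_value("one-box", p), causal_value("two-box", p))
```

On the evidential calculation one-boxing comes out far ahead, while on the causal calculation two-boxing is ahead by the same fixed amount whatever the probability that the box is full. This is the structure of probabilistic dependence combined with causal independence described above.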
However, for those of us who think that Newcomb’s problem teaches us to be causal decision theorists (see e.g. Joyce, ), probabilistic dependence is a red herring when there is causal independence, as there plainly is in Harsanyi’s simultaneous coordination games. So we may think that Harsanyi has tacitly built something like evidential decision theory into rule utilitarianism, and so much the worse for rule utilitarianism.

Secondly, the success of rule utilitarianism in various sequential games stems from the rule utilitarians’ commitment to the rule R even in contexts in which acting on R leads to suboptimal results. The conclusion that, in virtue of this success, rule utilitarianism is right about what we ought to do is parallel to a revision to standard decision theory later urged by Gauthier () and McClennen (). This revision claims that if it is rational at time t to become committed to performing some action at a later time t′ which is obviously irrational when considered in isolation, it is rational to commit to the action and then later perform that action. But those of us who take the toxin puzzle of Kavka () to dramatize why this revision is mistaken may think that rule utilitarianism is making the same kind of mistake.

Thirdly, Harsanyi’s characterization of act versus rule utilitarianism parallels the influential distinction in von Neumann and Morgenstern () between games against nature and games against other people. Each act utilitarian will have probabilities about a number of relevant variables, and will maximize expected value accordingly. The fact that some of these variables are the behaviour of other people who like herself are act utilitarians is neither here nor there; the decision theoretic model still applies. But when an agent is in a situation in which the outcome depends in part on the behaviour of agents just like her, von Neumann and Morgenstern argued that decision theory is inappropriate.
The problem of self-reference embedded into such situations requires the different tools of game theory, and Harsanyi’s rule utilitarians reason along similar lines. Perhaps von Neumann and Morgenstern’s argument could be used to bolster Harsanyi’s approach. Alternatively, those of us who are convinced by Skyrms () in thinking that problems of self-reference can and should be handled without having to abandon decision theory may think this points to a further difficulty for rule utilitarianism.

Of course, the fact that Harsanyi focussed on rule utilitarianism rather than rule consequentialism has been inessential to the discussion. These crude and preliminary remarks are meant only to suggest the value of looking at the foundations of rule consequentialism through the lens of parallel and often much more extensive debates about decision theory.

33.20 Deontology

Those with strong deontological intuitions may reject rule consequentialism, either because they are not convinced that it is a stable alternative to act consequentialism, or because its conclusions are not deontological enough. But we may now seem to have reached the limits of the usefulness of thinking about decision theory. Very roughly, anything like a decision theoretic approach to deontology looks like the wrong model: the former is all about weighing goods against evils, and the latter thinks there are circumstances in which such weighing is illegitimate, or counts for nothing. Nevertheless, one lesson from thinking about probability is that weighing is not so easy to avoid.

In trying to characterize a deontological view, there seem to be two basic options. What I will call agent-centered views typically prohibit actions which would involve the agent’s mental states bearing some kind of inappropriate relation to the outcome.
The most obvious example is the so-called principle of double effect, which in its simplest form prohibits bringing about intended harm, but permits certain otherwise identical cases of bringing about merely foreseen harm. What I will call causal structure views typically prohibit actions which stand in some kind of inappropriate causal relation to the outcome. For example, in the famous trolley problem, an out-of-control trolley is going to kill five people who are stuck on the track, but a bystander can switch the trolley to a sidetrack where it will kill one person. Many people who have strong deontological intuitions think it is permissible to switch the trolley. But in most cases, they think that killing one to save five is impermissible, as in the variant where the bystander can push a fat man off a bridge to stop the trolley (Thomson, ), killing him but saving the five. Causal structure theorists think the intentions of the bystander are irrelevant, and search for differences in the causal structure of the cases to explain the difference in permissibility.

Many deontologists have not had much sympathy for agent-centered views, and have preferred some kind of causal structure view (e.g. Kamm, ). But here is what I believe is a relatively neglected problem about such views. If the inappropriate causal relation is between the action and the outcome (as in, e.g., the fat man variant but not the trolley problem itself) then prima facie, there are going to be actions which bring about the following lotteries: some benefit occurs with nonzero probability p, some inappropriate causal structure obtains with probability 1 − p. For example, driving a truck across the bridge will either miss the fat man and deliver aid elsewhere, or else hit him and topple him off the bridge, stopping the trolley and saving the five. What should causal structure deontologists say about such actions? There are at least five responses:

(i) All such actions are impermissible.
Objection: this leads to an intolerably restrictive view.
(ii) Such actions are impermissible if and only if they turn out to result in the inappropriate causal structure. Objection: similar to the objections to outcome-based views in section ..
(iii) Actions which lead to the inappropriate causal structure with probability one are impermissible, all others are permissible. Objection: it is not credible that there should be such a gulf between probability one and probabilities just less than one.
(iv) Actions performed by agents whose reasons for performing them include the benefits resulting from the inappropriate structure are impermissible. Objection: this collapses causal structure views into agent-centered views.
(v) Actions are impermissible if and only if the probability of the inappropriate causal structure exceeds some intermediate threshold. Objection: this seems to be the most principled response for a causal structure view, but it suggests the acceptability of weighing the alleged badness of the causal structure against the production of benefits. This seems to fit poorly with the guiding deontological image of the inappropriateness of weighing when inappropriate causal structures are concerned.

Perhaps this kind of case points towards a serious problem for causal structure views; see further Jackson and Smith (). Or it may provide an opportunity for causal structure theorists to refine their views. Either way, thinking about probability and deontology seems helpful.

Acknowledgments

Thanks to Alan Hájek and Kalle Mikkola for very helpful comments. Support was partially provided by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (HKU H).

References

Allais, M.
() Le comportement de l’homme rationnel devant le risque, critique des postulats et axiomes de l’école américaine. Econometrica. . pp. –.
Angner, E. () Revisiting Rawls: A Theory of Justice in the light of Levi’s theory of decision. Theoria. . pp. –.
Blackorby, C., Bossert, W., and Donaldson, D. () Intertemporal population ethics: critical-level utilitarian principles. Econometrica. . pp. –.
Broome, J. () Weighing Goods. Cambridge, MA: Blackwell.
Broome, J. () Weighing Lives. Oxford: Oxford University Press.
Buchak, L. () Decision theory. In Hájek, A. and Hitchcock, C. (eds.) The Oxford Handbook of Philosophy and Probability. Oxford: Oxford University Press.
Diamond, P. () Cardinal welfare, individualistic ethics, and interpersonal comparisons of utility: comment. Journal of Political Economy. . pp. –.
Edgeworth, F. () Mathematical Psychics. London: Kegan Paul.
Ellsberg, D. () Risk, ambiguity and the Savage axioms. Quarterly Journal of Economics. . pp. –.
Fishburn, P. () Utility Theory for Decision Making. New York, NY: Wiley.
Fishburn, P. () Stochastic utility. In Barberá, S., Hammond, P., and Seidl, C. (eds.) Handbook of Utility Theory. Vol. . Dordrecht: Kluwer.
Foot, P. () Utilitarianism and the virtues. Mind. . pp. –.
Gauthier, D. () Assure and threaten. Ethics. . pp. –.
Gilboa, I. () Theory of Decision under Uncertainty. Cambridge: Cambridge University Press.
Hájek, A. () Interpretations of probability. In Zalta, E. N. (ed.) The Stanford Encyclopedia of Philosophy. (Winter) [Online] Available from: http://plato.stanford.edu/archives/ win/entries/probability-interpret.
Hájek, A. () Most Counterfactuals are False. Manuscript.
Hammond, P. () Interpersonal comparisons of utility: why and how they are and should be made. In Elster, J. and Roemer, J. (eds.) Interpersonal Comparisons of Well-Being. pp. –.
Cambridge: Cambridge University Press.
Hammond, P. () Objective expected utility. In Barberá, S., Hammond, P., and Seidl, C. (eds.) Handbook of Utility Theory. Vol. . pp. –. Dordrecht: Kluwer.
Harsanyi, J. () Cardinal utility in welfare economics and in the theory of risk-taking. Journal of Political Economy. . pp. –.
Harsanyi, J. () Cardinal welfare, individualistic ethics, and interpersonal comparisons of utility. Journal of Political Economy. . pp. –.
Harsanyi, J. (a) Morality and the theory of rational behavior. Social Research. . pp. –.
Harsanyi, J. (b) Nonlinear social welfare functions: a rejoinder to Professor Sen. In Butts, R. and Hintikka, J. (eds.) Foundational Issues in the Special Sciences. pp. –. Dordrecht: Reidel.
Harsanyi, J. () Rule utilitarianism, rights, obligations and the theory of rational behavior. Theory and Decision. . pp. –.
Hausner, M. () Multidimensional utilities. In Thrall, R., Coombs, C., and Davis, R. (eds.) Decision Processes. pp. –. New York, NY: John Wiley & Sons.
Hoefer, C. () The third way on objective probability: a sceptic’s guide to objective chance. Mind. . pp. –.
Jackson, F. () Decision-theoretic consequentialism and the nearest and dearest objection. Ethics. . pp. –.
Jackson, F. and Smith, M. () Absolutist moral theories and uncertainty. Journal of Philosophy. . pp. –.
Jensen, K. () Unacceptable risks and the continuity axiom. Economics and Philosophy. . pp. –.
Joyce, J. () The Foundations of Causal Decision Theory. Cambridge: Cambridge University Press.
Kamm, F. () Morality, Mortality. Vol. . New York, NY: Oxford University Press.
Kavka, G. () The toxin puzzle. Analysis. . pp. –.
Krantz, D., Luce, R. D., Suppes, P., and Tversky, A. () Foundations of Measurement. Vol. . New York, NY: Academic Press.
Kreps, D. () Notes on the Theory of Choice. Underground Classics in Economics.
Boulder, CO: Westview Press.
Lewis, D. () Counterfactuals. Oxford: Blackwell.
Lewis, D. () Radical interpretation. Synthese. . pp. –.
Lewis, D. () A subjectivist’s guide to objective chance. In Jeffrey, R. (ed.) Studies in Inductive Logic and Probability. Vol. . pp. –. Berkeley, CA: University of California Press.
Lewis, D. () Humean supervenience debugged. Mind. . pp. –.
McCarthy, D. () Actions, beliefs and consequences. Philosophical Studies. . pp. –.
McCarthy, D. () Utilitarianism and prioritarianism II. Economics and Philosophy. . pp. –.
McCarthy, D. () Risk-free approaches to the priority view. Erkenntnis. . pp. –.
McCarthy, D. () Distributive equality. Mind. . pp. –.
McCarthy, D. (forthcoming a) The priority view. Economics and Philosophy.
McCarthy, D. (forthcoming b) The Structure of Good. Oxford: Oxford University Press.
McCarthy, D., Mikkola, K., and Thomas, T. () Utilitarianism with and without expected utility. MPRA Paper No.  https://mpra.ub.uni-muenchen.de//.
McCarthy, D. and Thomas, T. () Egalitarianism with risk. Manuscript.
McClennen, E. () Rationality and Dynamic Choice. Cambridge University Press.
Mongin, P. () Consistent Bayesian aggregation. Journal of Economic Theory. . pp. –.
Myerson, R. () Utilitarianism, egalitarianism, and the timing effect in social choice problems. Econometrica. . pp. –.
Nagel, T. () The Possibility of Altruism. Princeton, NJ: Princeton University Press.
Narens, L. () Introduction to the Theories of Measurement and Meaningfulness and the Use of Symmetry in Science. Mahwah, NJ: Lawrence Erlbaum Associates.
Ng, Y. () Bentham or Bergson? Finite sensibility, utility functions, and social welfare functions. Review of Economic Studies. . pp. –.
Ok, E. () Real Analysis with Economic Applications. Princeton, NJ: Princeton University Press.
Otsuka, M. and Voorhoeve, A.
() Why it matters that some are worse off than others: an argument against the priority view. Philosophy and Public Affairs. . pp. –.
Parfit, D. () Reasons and Persons. Oxford: Clarendon Press.
Parfit, D. () Equality or priority? In Clayton, M. and Williams, A. (eds.) The Ideal of Equality. pp. –. Basingstoke: Macmillan.
Parfit, D. () Another defense of the priority view. Utilitas. . pp. –.
Rabinowicz, W. () Prioritarianism for prospects. Utilitas. . pp. –.
Ramsey, F. () Truth and probability. In Ramsey, F. and Braithwaite, R. (ed.) Foundations of Mathematics and other Essays. pp. –. London: Kegan, Paul, Trench, Trubner, & Co.
Rawls, J. () A Theory of Justice. Cambridge, MA: Harvard University Press.
Rawls, J. () Social unity and primary goods. In Sen, A. and Williams, B. (eds.) Utilitarianism and Beyond. Cambridge: Cambridge University Press.
Resnik, M. () Choices: An Introduction to Decision Theory. Minneapolis, MN: University of Minnesota Press.
Roberts, F. () Measurement Theory. Cambridge: Cambridge University Press.
Savage, L. () The Foundations of Statistics. New York, NY: John Wiley.
Scanlon, T. () Contractualism and utilitarianism. In Sen, A. and Williams, B. (eds.) Utilitarianism and Beyond. Cambridge: Cambridge University Press.
Scheffler, S. () The Rejection of Consequentialism. Oxford: Oxford University Press.
Schmidt, U. () Alternatives to expected utility: formal theories. In Barberá, S., Hammond, P., and Seidl, C. (eds.) Handbook of Utility Theory. Vol. . pp. –. Dordrecht: Kluwer.
Schwarz, W. () Best system approaches to chance. In Hájek, A. and Hitchcock, C. (eds.) The Oxford Handbook of Philosophy and Probability. Oxford: Oxford University Press.
Sen, A. () Welfare inequalities and Rawlsian axiomatics. Theory and Decision. . pp. –.
Sen, A. () Well-being, agency and freedom. Journal of Philosophy. . pp. –.
Skyrms, B.
() The Dynamics of Rational Deliberation. Cambridge, MA: Harvard University Press.
Sugden, R. () Alternatives to expected utility: foundations. In Barberá, S., Hammond, P., and Seidl, C. (eds.) Handbook of Utility Theory. Vol. . pp. –. Dordrecht: Kluwer.
Temkin, L. () Rethinking the Good: Moral Ideals and the Nature of Practical Reasoning. Oxford: Oxford University Press.
Thomson, J. () Killing, letting die, and the trolley problem. The Monist. . pp. –.
Thomson, J. () Imposing risks. In Parent, W. (ed.) Rights, Restitution, and Risk. Cambridge, MA: Harvard University Press.
Thomson, J. () Goodness and Advice. Princeton, NJ: Princeton University Press.
Vickrey, W. () Utility, strategy, and social decision rules. The Quarterly Journal of Economics. . pp. –.
von Neumann, J. and Morgenstern, O. () Theory of Games and Economic Behavior. Princeton, NJ: Princeton University Press.
Wakker, P. () Prospect Theory: For Risk and Ambiguity. Cambridge: Cambridge University Press.
Weymark, J. () A reconsideration of the Harsanyi-Sen debate on utilitarianism. In Elster, J. and Roemer, J. (eds.) Interpersonal Comparisons of Well-Being. Cambridge: Cambridge University Press.
Williams, B. () A critique of utilitarianism. In Smart, J. and Williams, B. (eds.) Utilitarianism: For and Against. Cambridge: Cambridge University Press.
Williamson, T. () Vagueness. New York, NY: Routledge.