(Theodore Sider) Logic For Philosophy

Logic for Philosophy
Theodore Sider
May ,
Preface
This book is an introduction to logic for students of contemporary philosophy.
It covers i) basic approaches to logic, including proof theory and especially
model theory, ii) extensions of standard logic (such as modal logic) that are
important in philosophy, and iii) some elementary philosophy of logic. It prepares
students to read the logically sophisticated articles in today's philosophy
journals, and helps them resist bullying by symbol-mongerers. In short, it
teaches the logic you need to know in order to be a contemporary philosopher.
For better or for worse (I think better), the last century-or-so's developments
in logic are part of the shared knowledge base of philosophers, and inform nearly
every area of philosophy. Logic is part of our shared language and inheritance.
The standard philosophy curriculum therefore includes a healthy dose of logic.
This is a good thing. But in many cases only a single advanced logic course
is required, which becomes the de facto sole exposure to advanced logic for
many undergraduate philosophy majors and beginning graduate students. And
this one course is often an intensive survey of metalogic (for example, one
based on the excellent Boolos et al. ().) I do believe in the value of such a
course, especially for students who take multiple logic courses or specialize in
technical areas of philosophy. But for students taking only a single course, that
course should not, I think, be a course in metalogic. The standard metalogic
course is too mathematically demanding for the average philosophy student,
and omits material that the average student ought to know. If there can be only
one, let it be a crash course in logic literacy.
Logic literacy includes knowing what metalogic is all about. And you can't
really learn about anything in logic without getting your hands dirty and doing
it. So this book does contain some metalogic (e.g., soundness and completeness
proofs in propositional logic and propositional modal logic). But it doesn't
cover the central metalogical results one normally covers in a mathematical
logic course: soundness and completeness in predicate logic, computability,
Gödel's incompleteness theorems, and so on.
I have decided to be very sloppy about use and mention. When such issues
matter I draw attention to them; but where they do not I do not.
Solutions to exercises marked with a single asterisk (*) are included in
Appendix A. Exercises marked with a double asterisk (**) tend to be more
difficult, and have hints in Appendix A.
I drew heavily from the following sources, which would be good for supplemental
reading: Bencivenga () (free logic); Boolos et al. (, chapter )
(metalogic, second-order logic); Cresswell () (two-dimensional modal
logic); Davies and Humberstone () (two-dimensional modal logic); Gamut
(a,b) (descriptions, λ-abstraction, multi-valued, modal, and tense logic);
Hilpinen () (deontic logic); Hughes and Cresswell () (modal logic, from which
I borrowed particularly heavily, and tense logic); Kripke () (intuitionistic
logic); Lemmon () (sequents in propositional logic); Lewis (a)
(counterfactuals); Mendelson () (propositional and predicate logic,
metalogic); Meyer () (epistemic logic); Priest () (intuitionistic and
paraconsistent logic); Stalnaker () (λ-abstraction); Westerståhl () (generalized
quantifiers).
Another important source, particularly for chapters 6 and 8, was Ed Gettier's
modal logic class at the University of Massachusetts. The first incarnation
of this work grew out of my notes from this course. I am grateful to Ed for his
wonderful class, and for getting me interested in logic.
I am also deeply grateful for feedback from many students, colleagues,
and referees. In particular, Marcello Antosh, Josh Armstrong, Dean Chap-
man, Tony Dardis, Justin Clarke-Doane, Mihailis Diamantis, Mike Fara, Gabe
Greenberg, Angela Harper, John Hawthorne, Paul Hovda, Phil Kremer, Sami
Laine, Gregory Lavers, Brandon Look, Stephen McLeod, Kevin Moore, Alex
Morgan, Tore Fjetland Øgaard, Nick Riggle, Jeff Russell, Brock Sides, Jason
Turner, Crystal Tychonievich, Jennifer Wang, Brian Weatherson, Evan
Williams, Xing Taotao, Seth Yalcin, Zanja Yudell, Richard Zach, and especially
Agustín Rayo: thank you.
Contents
Preface
1 What is Logic?
  1.1 Logical consequence and logical truth
  1.2 Formalization
  1.3 Metalogic
  Exercises 1.1–1.2
  1.4 Application
  1.5 The nature of logical consequence
  Exercise 1.3
  1.6 Logical constants
  1.7 Extensions, deviations, variations
  1.8 Set theory
  Exercises 1.4–1.5
2 Propositional Logic
  2.1 Grammar of PL
  2.2 The semantic approach to logic
  2.3 Semantics of propositional logic
  Exercise 2.1
  2.4 Validity and invalidity in PL
  Exercise 2.2
    2.4.1 Schemas, validity, and invalidity
  2.5 Sequent proofs in PL
    2.5.1 Sequents
    2.5.2 Rules
    2.5.3 Sequent proofs
    2.5.4 Example sequent proofs
    Exercise 2.3
  2.6 Axiomatic proofs in PL
  Exercise 2.4
  2.7 Soundness of PL and proof by induction
  Exercises 2.5–2.10
  2.8 PL proofs and the deduction theorem
  Exercises 2.11–2.12
  2.9 Completeness of PL
    2.9.1 Maximal consistent sets of wffs
    2.9.2 Maximal consistent extensions
    2.9.3 Features of maximal consistent sets
    2.9.4 The proof
3 Beyond Standard Propositional Logic
  3.1 Alternate connectives
    3.1.1 Symbolizing truth functions in propositional logic
    3.1.2 Sheffer stroke
    3.1.3 Inadequate connective sets
    Exercises 3.1–3.3
  3.2 Polish notation
  Exercise 3.4
  3.3 Nonclassical propositional logics
  3.4 Three-valued logic
    3.4.1 Łukasiewicz's system
    Exercises 3.5–3.6
    3.4.2 Kleene's tables
    Exercises 3.7–3.9
    3.4.3 Determinacy
    3.4.4 Priest's logic of paradox
    Exercises 3.10–3.11
    3.4.5 Supervaluationism
    Exercises 3.12–3.16
  3.5 Intuitionistic propositional logic: proof theory
  Exercise 3.17
4 Predicate Logic
  4.1 Grammar of predicate logic
  4.2 Semantics of predicate logic
  Exercise 4.1
  4.3 Establishing validity and invalidity
  Exercises 4.2–4.3
  4.4 Axiomatic proofs in PC
  Exercise 4.4
  4.5 Metalogic of PC
  Exercise 4.5
5 Beyond Standard Predicate Logic
  5.1 Identity
    5.1.1 Grammar for the identity sign
    5.1.2 Semantics for the identity sign
    5.1.3 Symbolizations with the identity sign
    Exercises 5.1–5.2
  5.2 Function symbols
  Exercise 5.3
    5.2.1 Grammar for function symbols
    5.2.2 Semantics for function symbols
    Exercise 5.4
  5.3 Definite descriptions
    5.3.1 Grammar for ι
    5.3.2 Semantics for ι
    Exercises 5.5–5.6
    5.3.3 Elimination of function symbols and descriptions
    Exercises 5.7–5.8
  5.4 Further quantifiers
    5.4.1 Generalized monadic quantifiers
    Exercise 5.9
    5.4.2 Generalized binary quantifiers
    Exercise 5.10
    5.4.3 Second-order logic
    Exercise 5.11
  5.5 Complex Predicates
  Exercises 5.12–5.13
  5.6 Free Logic
    5.6.1 Semantics for free logic
    Exercises 5.14–5.15
    5.6.2 Proof theory for free logic
6 Propositional Modal Logic
  6.1 Grammar of MPL
  6.2 Symbolizations in MPL
  6.3 Semantics for MPL
    6.3.1 Kripke models
    Exercise 6.1
    6.3.2 Semantic validity proofs
    Exercise 6.2
    6.3.3 Countermodels
    Exercise 6.3
  6.4 Axiomatic systems of MPL
    6.4.1 System K
    Exercises 6.4–6.5
    6.4.2 System D
    Exercise 6.6
    6.4.3 System T
    Exercise 6.7
    6.4.4 System B
    Exercise 6.8
    6.4.5 System S4
    Exercise 6.9
    6.4.6 System S5
    Exercise 6.10
    6.4.7 Substitution of equivalents and modal reduction
    Exercise 6.11
  6.5 Soundness in MPL
  Exercises 6.12–6.13
    6.5.1 Soundness of K
    6.5.2 Soundness of T
    6.5.3 Soundness of B
    Exercises 6.14–6.15
  6.6 Completeness in MPL
    6.6.1 Definition of canonical models
    6.6.2 Facts about maximal consistent sets
    Exercise 6.16
    6.6.3 “Mesh”
    Exercise 6.17
    6.6.4 Truth and membership in canonical models
    6.6.5 Completeness of systems of MPL
    Exercises 6.18–6.20
7 Beyond Standard MPL
  7.1 Deontic logic
  Exercises 7.1–7.2
  7.2 Epistemic logic
  Exercise 7.3
  7.3 Propositional tense logic
    7.3.1 The metaphysics of time
    7.3.2 Tense operators
    7.3.3 Kripke-style semantics for tense logic
    Exercises 7.4–7.5
    7.3.4 Formal constraints on ≤
    Exercise 7.6
  7.4 Intuitionistic propositional logic: semantics
    7.4.1 Proof stages
    Exercises 7.7–7.8
    7.4.2 Examples
    Exercises 7.9–7.10
    7.4.3 Soundness
    Exercises 7.11–7.13
8 Counterfactuals
  8.1 Natural language counterfactuals
    8.1.1 Antecedents and consequents
    8.1.2 Can be contingent
    8.1.3 No augmentation
    8.1.4 No contraposition
    8.1.5 Some implications
    8.1.6 Context dependence
  8.2 The Lewis/Stalnaker theory
  8.3 Stalnaker's system (SC)
    8.3.1 Syntax of SC
    8.3.2 Semantics of SC
    Exercise 8.1
  8.4 Validity proofs in SC
  Exercise 8.2
  8.5 Countermodels in SC
  Exercises 8.3–8.4
  8.6 Logical Features of SC
    8.6.1 No exportation
    8.6.2 No importation
    8.6.3 No transitivity
    8.6.4 No transposition
  8.7 Lewis's criticisms of Stalnaker's theory
  8.8 Lewis's system
  Exercises 8.5–8.6
  8.9 The problem of disjunctive antecedents
9 Quantified Modal Logic
  9.1 Grammar of QML
  9.2 De re and de dicto
  9.3 A simple semantics for QML
  9.4 Countermodels and validity proofs in SQML
  Exercise 9.1
  9.5 Philosophical questions about SQML
    9.5.1 The necessity of identity
    9.5.2 The necessity of existence
    Exercise 9.2
    9.5.3 Necessary existence defended
  9.6 Variable domains
    9.6.1 Contingent existence vindicated
    Exercises 9.3–9.4
    9.6.2 Increasing, decreasing domains
    Exercise 9.5
    9.6.3 Strong and weak necessity
    9.6.4 Actualist and possibilist quantification
  9.7 Axioms for SQML
  Exercise 9.6
10 Two-dimensional modal logic
  10.1 Actuality
    10.1.1 Kripke models with designated worlds
    Exercise 10.1
    10.1.2 Semantics for @
    10.1.3 Establishing validity and invalidity
  10.2 ×
    10.2.1 Two-dimensional semantics for ×
    Exercise 10.2
  10.3 Fixedly
  Exercises 10.3–10.5
  10.4 Necessity and a priority
  Exercises 10.6–10.9
A Answers and Hints
References
Index
Chapter 1
What is Logic?
Since you are reading this book, you probably know some logic already.
You probably know how to translate English sentences into symbolic
notation, into propositional logic:
English                                      Propositional logic
Either violets are blue or I need glasses    V∨N
If snow is white then grass is not green     S→∼G
and into predicate logic:
English                                      Predicate logic
If Grant is male then someone is male        Mg→∃xMx
Any friend of Barry is either insane or
friends with everyone                        ∀x[Fxb→(Ix∨∀yFxy)]
You are probably also familiar with some techniques for evaluating arguments
written out in symbolic notation. You have probably encountered truth tables,
and some form of proof theory (perhaps a natural deduction system; perhaps
truth trees.) You may have even encountered some elementary model theory.
In short: you have taken an introductory course in symbolic logic.
What you already possess is: literacy in elementary logic. What you will
get out of this book is: literacy in the rest of logic that philosophers tend to
presuppose, plus a deeper grasp of what logic is all about.
So what is logic all about?

1.1 Logical consequence and logical truth
Logic is about many things, but most centrally it is about logical consequence. The
statement ‘someone is male’ is a logical consequence of the statement ‘Grant is
male’. If Grant is male, then it logically follows that someone is male. Put another
way: the statement ‘Grant is male’ logically implies the statement ‘someone
is male’. Likewise, the statement ‘Grant is male’ is a logical consequence of
the statements ‘It's not the case that Leisel is male’ and ‘Either Leisel is male
or Grant is male’ (taken together). The first statement follows from the latter
two statements; they logically imply it. Put another way: the argument whose
premises are the latter two statements, and whose conclusion is the former
statement, is a logically correct one.[1]
So far we've just given synonyms. The following slogan advances us a bit
further: logical consequence is truth-preservation by virtue of form. To unpack a
bit: for φ to be a logical consequence of ψ, it is not enough that we all know
that φ is true if ψ is. We all know that an apple will fall if it is dropped, but
the relationship between falling and dropping does not hold by virtue of logic.
Why not? For one thing, ‘by virtue of logic’ requires the presence of some sort
of necessary connection, a connection that is absent in the case of the dropped
apple (since it would be possible, in some sense, for a dropped apple not to
fall). For another, it requires the relationship to hold by virtue of the forms
of the statements involved, whereas the relationship between ‘the apple was
dropped’ and ‘the apple fell’ holds by virtue of the contents of these statements
and not their form. (By contrast, the inference from ‘It's not the case that Leisel
is male’ and ‘Either Leisel is male or Grant is male’ to ‘Grant is male’ is said
to hold in virtue of form, since any argument of the form ‘it's not the case that
φ; either φ or ψ; therefore ψ’ is logically correct.) As we'll see shortly, there
are many open philosophical questions in this vicinity, but perhaps we have
enough of an intuitive fix on the concept of logical consequence to go on with,
at least for the moment.
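One rough way to picture truth-preservation by virtue of form is to check an argument form against every combination of truth values for its schematic letters. The sketch below is only an illustration under that assumption; it anticipates the truth-table method and does not settle the philosophical questions about necessity and form. The function name `truth_preserving` and the use of Python booleans are choices made for this example, not the book's notation. It tests the form "not φ; either φ or ψ; therefore ψ" and, for contrast, the form "either φ or ψ; therefore φ".

```python
from itertools import product

def truth_preserving(premises, conclusion, num_letters):
    """Return True if no assignment of truth values makes every premise
    true while making the conclusion false."""
    for values in product([True, False], repeat=num_letters):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

# Form: "it's not the case that phi; either phi or psi; therefore psi"
disjunctive_syllogism = truth_preserving(
    premises=[lambda phi, psi: not phi, lambda phi, psi: phi or psi],
    conclusion=lambda phi, psi: psi,
    num_letters=2,
)

# A form that fails the check: "either phi or psi; therefore phi"
bad_form = truth_preserving(
    premises=[lambda phi, psi: phi or psi],
    conclusion=lambda phi, psi: phi,
    num_letters=2,
)

print(disjunctive_syllogism)  # True
print(bad_form)               # False
```

The second form fails because the assignment on which φ is false and ψ is true makes the premise true and the conclusion false.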
A related concept is that of a logical truth. Just as logical consequence is
truth-preservation by virtue of form, logical truth is truth by virtue of form.
Examples might include: ‘it's not the case that snow is white and also not
white’, ‘All fish are fish’, and ‘If Grant is male then someone is male’. As with
logical consequence, logical truth is thought to require some sort of necessity
and to hold by virtue of form, not content. It is plausible that logical truth
and logical consequence are related thus: a logical truth is a sentence that is a
logical consequence of the empty set of premises. One can infer a logical truth
by using logic alone, without the help of any premises.

[1] The word ‘valid’ is sometimes used for logically correct arguments, but I will reserve that
word for a different concept: that of a logical truth, under the semantic conception.

A central goal of logic, then, is to study logical truth and logical consequence.
But the contemporary method for doing so is somewhat indirect. As we will
see in the next section, instead of formulating claims about logical consequence
and logical truth themselves, modern logicians develop formal models of how
those concepts behave.
1.2 Formalization
Modern logic is called ‘mathematical’ or ‘symbolic’ logic, because its method
is the mathematical study of formal languages. Modern logicians use the tools
of mathematics (especially, the tools of very abstract mathematics, such as set
theory) to treat sentences and other parts of language as mathematical objects.
They define up formal languages, define up sentences of the languages, define
up properties of the sentences, and study those properties. Mathematical logic
was originally developed to study mathematical reasoning, but its techniques
are now applied to reasoning of all kinds.
Take propositional logic, the topic of chapter 2. Here our goal is to shed
light on the logical behavior of ‘and’, ‘or’, and so on. But rather than studying
those words directly, we will develop a certain formal language, the language
of propositional logic. The sentences of this language look like this:
P
(Q→R)∨(Q→∼S)
P↔(P∧Q)
Symbols like ∧ and ∨ represent natural language logical words like ‘and’ and
‘or’; and the sentence letters P, Q, … represent declarative natural language
sentences. We will then go on to define (as always, in a mathematically rigorous
way) various concepts that apply to the sentences in this formal language. We
will define the notion of a tautology (‘all Trues in the truth table’), for example,
and the notion of a provable formula (we will do this using a system of deduction
with rules of inference; but one could use truth trees, or some other method).
These defined concepts are ‘formalized versions’ of the concepts of logical
consequence and logical truth.
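To make the idea of treating sentences as mathematical objects a little more concrete, here is a minimal sketch in Python. It is not the official definition given in chapter 2. It assumes a representation chosen for this example: a sentence letter is a string such as "P", and a compound formula is a tuple whose first item is one of the labels "not", "and", "or", "if", "iff" (ASCII stand-ins for the connective symbols). The function `is_tautology` then checks the ‘all Trues in the truth table’ condition.

```python
from itertools import product

# A formula is either a sentence letter (a string such as "P") or a tuple
# whose first item names a connective: ("not", A), ("and", A, B),
# ("or", A, B), ("if", A, B), ("iff", A, B). This encoding is an assumption
# of the sketch, not the book's notation.

def letters(formula):
    """Collect the sentence letters occurring in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(letters(part) for part in formula[1:]))

def value(formula, assignment):
    """Truth value of a formula, given truth values for its letters."""
    if isinstance(formula, str):
        return assignment[formula]
    connective, *parts = formula
    vals = [value(part, assignment) for part in parts]
    if connective == "not":
        return not vals[0]
    if connective == "and":
        return vals[0] and vals[1]
    if connective == "or":
        return vals[0] or vals[1]
    if connective == "if":
        return (not vals[0]) or vals[1]
    if connective == "iff":
        return vals[0] == vals[1]
    raise ValueError(f"unknown connective: {connective}")

def is_tautology(formula):
    """True if the formula is true on every row of its truth table."""
    names = sorted(letters(formula))
    rows = product([True, False], repeat=len(names))
    return all(value(formula, dict(zip(names, row))) for row in rows)

print(is_tautology(("if", "P", "P")))           # True
print(is_tautology(("or", "P", ("not", "P"))))  # True
print(is_tautology(("and", "P", "Q")))          # False
```

Note that `is_tautology` is a stipulated property of uninterpreted data structures. Nothing in the code gives "P" a meaning.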
Formalized logical consequence and logical truth should be distinguished
from the real things. The formal sentence P→P is a tautology, but since it is
uninterpreted, we probably shouldn't call it a logical truth. Rather, it represents
logical truths like ‘If snow is white then snow is white’. A logical truth ought
at least to be true, after all, and P→P isn't true, since it doesn't even have
a meaning: what's the meaning of ‘P’? (Caveat: one might give meanings to
formal sentences, by translation into natural language (‘let P mean that snow
is white; let ∧ mean and’), or perhaps by some direct method if no natural
language translation is available. And we may indeed speak of logical truth and
logical consequence for interpreted formal sentences.)
Why are formal languages called ‘formal’? (They're also sometimes called
‘artificial’ languages.) Because their properties are mathematically stipulated,
rather than being pre-existent in flesh-and-blood linguistic populations. We
stipulatively define a formal language's grammar. (Natural languages like English
also have grammars, which can be studied using mathematical techniques.
But these grammars are much more complicated, and are discovered rather than
stipulated.) And we must stipulatively define any properties of the symbolic
sentences that we want to study, for example, the property of being a tautology.
(Sentences of natural languages already have meanings, truth values, and so
on; we don't get to stipulate these.) Further, formal languages often contain
abstractions, like the sentence letters P, Q, … of propositional logic. A given
formal language is designed to represent the logical behavior of a select few
natural language words; when we use it we abstract away from all other features
of natural language sentences. Propositional logic, for example, represents the
logical behavior of ‘and’, ‘or’, and a few other words. When a sentence contains
none of these words of interest, we represent it with one of the sentence letters
P, Q, …, indicating that we are ignoring its internal structure.
1.3 Metalogic
There are many reasons to formalize: to clarify meaning, to speak more
concisely, and so on. But one of the most powerful reasons is to do metalogic.
In introductory logic one learns to use certain logical systems: how to
construct truth tables, derivations and truth trees, and the rest. But logicians
do not develop systems only to sit around all day using them. As soon as a
logician develops a new system, she begins to ask questions about that system.
For an analogy, imagine people who make up new games for a living. If they
invent a new version of chess, they might spend some time actually playing
it. But if they are like logicians, they will quickly tire of this and start asking
questions about the game. ‘Is the average length of this new game longer than
the average length of a game of standard chess?’ ‘Is there any strategy that
guarantees victory?’ Analogously, logicians ask questions about logical systems.
‘What formulas can be proven in such and such a system?’ ‘Can you prove
the same things in this system as in system X?’ ‘Can a computer program be
written to determine whether a given formula is provable in this system?’ The
study of such questions about formal systems is called ‘metalogic’.
The best way to definitively answer metalogical questions is to use the
methods of mathematics. And to use the methods of mathematics, we need
to have rigorous definitions of the crucial terms that are in play. For example,
in chapter 2 we will mathematically demonstrate that every formula that is
provable (in a certain formal system) is a tautology. But doing so requires
carefully defining the crucial terms: ‘formula’, ‘provable’, and ‘tautology’; and
the best way to do this is to formalize. We treat the languages of logic as
mathematical objects so that we can mathematically demonstrate facts about
them.
Metalogic is a fascinating and complex subject; and other things being
equal, it's good to know as much about it as you can. Now, other things are
rarely equal; and the premise of this book is that if push sadly comes to shove,
limited classroom time should be devoted to achieving logic literacy rather
than a full study of metalogic in all its glory. But still, logic literacy does require
understanding metalogic: understanding what it is, what it accomplishes, and
how one goes about doing it. So we will be doing a decent amount of metalogic
in this book. But not too much, and not the harder bits.
Much of metalogic consists of proving things about formal systems. And
sometimes, those formal systems themselves concern proof. For example, as I
said a moment ago, we will prove in chapter 2 that every provable formula is a
tautology. If this seems dizzying, keep in mind that ‘proof’ here is being used
in two different senses. There are metalogic proofs, and there are proofs in formal
systems. Metalogic proofs are phrased in natural language (perhaps augmented
with mathematical vocabulary), and employ informal (though rigorous!) reasoning
of the sort one would encounter in a mathematics book. The chapter 2
argument that every provable formula is a tautology will be a metalogic proof.
Proofs in formal systems, on the other hand, are phrased using sentences of
formal languages, and proceed according to prescribed formal rules. ‘Provable’
in the statement ‘every provable formula is a tautology’ signifies proof in a
certain formal system (one that we will introduce in chapter 2), not metalogic
proof.
Logicians often distinguish the object language from the metalanguage.
The object language is the language that's being studied. One example is the
language of propositional logic. Its sentences look like this:
P∧Q
∼(P∨Q)↔R
The metalanguage is the language we use to talk about the object language.
In the case of the present book, the metalanguage is English. Here are some
example sentences of the metalanguage:
‘P∧Q’ is a formal sentence with three symbols
Every sentence of propositional logic has the same num-
ber of left parentheses as right parentheses
Every provable formula is a tautology
Thus, we formulate metalogical claims about an object language in the meta-
language, and prove such claims by reasoning in the metalanguage.
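The division of labor between object language and metalanguage can be loosely pictured in a program: object-language sentences are data (here, strings), and claims about them are made in the surrounding language (here, Python, standing in for English). This is a sketch under stated assumptions. The ASCII characters `&`, `v`, `>` and `~` are stand-ins invented for this example, and the function names are hypothetical.

```python
def symbol_count(sentence):
    """Count the symbols in an object-language sentence (spaces ignored)."""
    return len(sentence.replace(" ", ""))

def parentheses_balanced(sentence):
    """Does the sentence have as many left parentheses as right ones?"""
    return sentence.count("(") == sentence.count(")")

# Object-language sentences are quoted: the program talks about them.
object_sentences = ["P&Q", "(P&Q)>R", "~(PvQ)"]

print(symbol_count("P&Q"))                                      # 3
print(all(parentheses_balanced(s) for s in object_sentences))  # True
```

Checking a handful of sentences this way is not a metalogic proof. The general claim that every sentence has matching numbers of parentheses needs an argument covering all sentences of the language, which is the kind of reasoning carried out in the metalanguage.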
Using the metalanguage to make statements about words can sometimes
be tricky to do properly. In an effort to make a statement about the name of
the United States’s most excellent city, suppose I say:
() Philadelphia is made up of twelve letters
Sentence () does not at all capture my intention. It says that a certain city is
made up of twelve letters. But cities aren’t made up of letters; they’re made up
of things like buildings, streets, and people. The problem with sentence () is
that its subject is the word ‘Philadelphia’. The word ‘Philadelphia’ refers to
the city, Philadelphia; thus, sentence () says something about that city. But I
intended to say something about the word that names that city, not about the
city itself. What I should have said is this:
() ‘Philadelphia’ is made up of twelve letters
The subject of sentence () is the following expression:
‘Philadelphia’
That is, the subject of sentence () is the result of enclosing the word ‘Philadelphia’
in quotation marks; the subject is not the word ‘Philadelphia’ itself. So ()
says something about the word ‘Philadelphia’, not the city Philadelphia, which
is what I intended.
The moral is that if we want to talk about a word or other linguistic item,
we need to refer to it correctly. We cannot just use that word (as in ()), for
then that word refers to its referent (a city, in the case of ()). We must instead
mention the word; that is, we must use some expression that refers to the
word itself, not an expression that refers to the word’s referent. And the most
common device for doing this is to enclose the word in quotation marks (as in
()).
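Programming languages draw a similar line between a name that stands for a thing and a quoted string that stands for a word. The sketch below is only an analogy. The class `City` and the variable names are hypothetical stand-ins introduced for illustration.

```python
class City:
    """Hypothetical stand-in for a city: a thing, not a word."""


philadelphia = City()            # used: the name stands for the city object
name_of_city = "Philadelphia"    # mentioned: the quotation marks give us the word

letter_count = len(name_of_city)  # a fact about the word

try:
    len(philadelphia)             # the city itself has no letters to count
    city_has_length = True
except TypeError:
    city_has_length = False

print(letter_count, city_has_length)
```

Asking for the length of the unquoted name fails for much the same reason that the unquoted sentence about Philadelphia misfires: the question is being put to the city rather than to its name.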
However: having made such a big deal about this issue, I propose henceforth
to ignore it. Zealous care about use and mention would result in an ugly
proliferation of quotation marks. So, instead of writing things strictly correctly:
The formula ‘P→P’ is a tautology
I will mostly write somewhat naughty things instead:
The formula P→P is a tautology
Now that you’re clued into the distinction between use and mention, you’ll be
able to detect where I’ve been sloppy in this way.²
² Cartwright (, Appendix) has interesting exercises for learning more about use and
mention.
Exercise 1.1 For each of the following, i) is it a sentence of the
object language or the metalanguage? ii) is it true?
a)* ‘P∨∼P’ is a logical truth.
b)* (P∨Q)→(Q∨P)
c)* ‘Frank and Joe are brothers’ logically implies ‘Frank and Joe
are siblings’.
Exercise 1.2 Each of the following sentences confuses use and
mention. In each case, fill in quotation marks to fix the problem.
a)* Attorney and lawyer are synonyms.
b)* If S₁ is an English sentence and S₂ is another English sentence,
then the string S₁ and S₂ is also an English sentence.
1.4 Application
The modern method for studying logical consequence, then, is to construct
formalized versions of the concepts of logical consequence and logical truth
(concepts applying to sentences in formal languages) and to mathematically
study how those concepts behave. But what does the construction of such
formalized concepts establish? After all, some formalized constructions shed
no light at all on logical consequence. Imagine defining up a formal proof
system that includes a rule of inference allowing one to infer ∼P from P.
One could define the rules of such a system in a perfectly precise way and
investigate its mathematical properties, but doing so wouldn’t shed light on the
intuitive notion of logical consequence that was introduced in section 1.1, that
is, on “genuine” logical consequence, as I will call it, to distinguish it from the
various formalized notions we could stipulatively define. It would be ridiculous to
claim, for example, that the existence of this system shows that ‘Snow is not
white’ follows from ‘Snow is white’.
Thus, the mathematical existence and coherence of a formal system must be
distinguished from its value in representing genuine logical consequence and
logical truth. To be sure, logicians use formal systems of various sorts for many
purposes that have nothing to do with reasoning at all: for studying syntax,
computer programming, electric circuits, and many other phenomena. But
one core, central goal of logic is indeed to study genuine logical consequence.
What, exactly, might it mean to say that a formal system “represents” or
“models” or “sheds light on” genuine logical consequence? How are formal
systems to be applied? Here’s an oversimplified account of one such claim.
Suppose we have developed a certain formal system for constructing proofs
of symbolic sentences of propositional logic. And suppose we have specified
some translation scheme from English into the language of propositional logic.
This translation scheme would translate the English word ‘and’ into the logical
expression ‘∧’, ‘or’ into ‘∨’, and so on. We might then say that the formal
system accurately represents the logical behavior of ‘and’, ‘or’, and the rest in
the following sense: one English sentence is a logical consequence of some
other English sentences in virtue of ‘and’, ‘or’, etc., if and only if one can prove
the translation of the former English sentence from the translations of the
latter English sentences in the formal system.
The question of whether a given formal system represents genuine logical
consequence is a philosophical one, because the question of what is a genuine
logical consequence of what is a philosophical question. This book won’t spend
much time on such questions. My main goal is to introduce the formalisms
that are ubiquitous in philosophy, so that you will have the tools to address the
philosophical questions yourself. Still, we’ll dip into such questions from time
to time, since they affect our choices of which logical systems to study.
1.5 The nature of logical consequence
I have characterized “genuine” logical consequence intuitively, and
distinguished it from the formal notions we introduce in mathematical logic to
represent it. But what is genuine logical consequence? What is its nature?
The question here is analogous to questions like ‘what is knowledge?’ and
‘what is the good life?’. It’s a philosophical question, to be answered using
the methods of philosophy. (This is not to deny that formal results from
mathematical logic bear on the question.) Like any philosophical question, it is
debatable how we should go about answering it. Do we use conceptual analysis
to explore the nuances of our ordinary concept? Do we seek rational insight
into the nature of objective reality behind our ordinary concept? Do we jettison
ambiguous and vague ordinary concepts in favor of shiny new replacements?
All this is up for grabs.
It’s important to see that there really is an open philosophical question here.
This is sometimes obscured by the fact that terms like ‘logical consequence’ and
‘logical truth’ are often stipulatively defined in logic books. The open question
does not concern such stipulated notions, of course; it concerns the notion
of logical consequence that the stipulative definitions are trying to represent.
The question is also obscured by the fact that one conception of the nature of
logical consequence, the model-theoretic one, is so dominant that one can
forget that there are alternatives.³
This is not a book on the philosophy of logic, so after this section we won’t
spend more time on the question of the nature of genuine logical consequence.
But perhaps a quick survey of some competing philosophical answers to the
question, just to convey their flavor, is in order.
The most popular answer is the semantic, or model-theoretic one. What’s most
familiar here is its implementation for formal languages. Under this approach,
one chooses a formal language, defines a notion of model (or interpretation)
for the chosen language, defines a notion of truth-in-a-model for sentences of
the language, and then finally represents logical consequence for the chosen
language as truth-preservation in models (φ is represented as being a logical
consequence of ψ₁, ψ₂, . . . if and only if φ is true in any model in which each of
ψ₁, ψ₂, . . . is true.)
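For propositional logic the truth-preservation idea can be sketched in a few lines. The sketch assumes that a model is simply an assignment of truth values to sentence letters and that each formula is given as a Python function from such an assignment to a truth value. Both choices, and the name `is_consequence`, are illustrative assumptions rather than the official definitions.

```python
from itertools import product


def is_consequence(premises, conclusion, letters):
    """True iff the conclusion is true in every model in which all premises are true."""
    for values in product([True, False], repeat=len(letters)):
        model = dict(zip(letters, values))   # one assignment of truth values
        if all(p(model) for p in premises) and not conclusion(model):
            return False                     # countermodel: premises true, conclusion false
    return True


# Formulas as functions from a model to a truth value (an assumed encoding).
p_and_q = lambda m: m["P"] and m["Q"]
p_or_q = lambda m: m["P"] or m["Q"]
p = lambda m: m["P"]

from_conjunction = is_consequence([p_and_q], p, ["P", "Q"])
from_disjunction = is_consequence([p_or_q], p, ["P", "Q"])
print(from_conjunction, from_disjunction)
```

The second check fails because the assignment making P false and Q true is a model of the premise but not of the conclusion.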
Now, as stated, this isn’t a theory of genuine logical consequence. It’s only a
way of representing logical consequence using formal languages. What theory
of genuine logical consequence lies behind it? Perhaps one like this: “φ is
a logical consequence of ψ₁, ψ₂ . . . if and only if the meanings of the logical
expressions in φ and ψ₁, ψ₂ . . . guarantee that φ is true whenever ψ₁, ψ₂ . . .
are all true.” (Nonlogical expressions are expressions other than ‘and’, ‘or’,
‘not’, ‘some’, and so on; more on this below.) To its credit, this theory of
genuine consequence seems to mesh with the model-theoretic formal method
for representing consequence; for since (as we’ll see in section 2.2) everything
other than the meanings of the logical expressions is allowed to vary between
models, truth-preservation in all models seems to indicate that the meanings
of the logical expressions “guarantee” truth-preservation. But on the other
hand, what does that mean exactly? What does it mean to say that meanings
“guarantee” a certain outcome? The “theory” is unclear. Perhaps, instead,
there isn’t really a semantic/model-theoretic theory of the nature of logical
consequence at all, but rather a preference for a certain approach to formalizing
or representing logical consequence.
³ See Etchemendy (, chapter ).
A second answer to the question about the nature of logical consequence is
a proof-theoretic one, according to which logical consequence is more a matter
of provability than of truth-preservation. As with the semantic account, there is
a question of whether we have here a proper theory about the nature of logical
consequence (in which case we must ask: what is provability? by which rules?
and in which language?) or whether we have merely a preference for a certain
approach to formalizing logical consequence. In the latter case, the approach
to formalization is one in which we define up a relation of provability between
sentences of formal languages. We do this, roughly speaking, by defining
certain acceptable “transitions” between sentences of formal languages, and
then saying that a sentence φ is provable from sentences ψ₁, ψ₂, . . . if and only
if there is some way of moving by acceptable transitions from ψ₁, ψ₂, . . . to φ.
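The idea of moving by acceptable transitions can be sketched with a toy system. The two rules used here (conjunction elimination and modus ponens) and the tuple encoding of formulas are assumptions chosen for brevity; they are not the proof system developed later in the book.

```python
def derivable(premises):
    """Everything reachable from the premises by the two toy transitions."""
    derived = set(premises)
    while True:
        new = set()
        for f in derived:
            if isinstance(f, tuple) and f[0] == "and":
                new.update((f[1], f[2]))      # from (A and B), move to A and to B
            if isinstance(f, tuple) and f[0] == "if" and f[1] in derived:
                new.add(f[2])                 # from (if A then B) and A, move to B
        if new <= derived:
            return derived                    # no transition adds anything new
        derived |= new


def provable(conclusion, premises):
    """True iff the conclusion can be reached from the premises."""
    return conclusion in derivable(premises)


# Atomic sentences are strings; compound ones are tuples (an assumed encoding).
premises = [("and", "P", "Q"), ("if", "P", "R")]
print(provable("R", premises), provable("S", premises))
```

The search halts because both rules only ever produce subformulas of formulas already derived, and there are finitely many of those.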
The semantic and proof-theoretic approaches are the main two sources
of inspiration for formal logic, and certainly for the systems we will discuss
in this book. But there are alternate philosophical conceptions of logical
consequence that are worth briefly mentioning. There is the view of W. V. O.
Quine: φ is a logical consequence of ψ₁, ψ₂ . . . iff there is no way to (uniformly)
substitute expressions for nonlogical expressions in φ and ψ₁, ψ₂ . . . so that
ψ₁, ψ₂ . . . all become true but φ does not.⁴ There is a modal account: φ is a
logical consequence of ψ₁, ψ₂ . . . iff it is not possible for ψ₁, ψ₂ . . . to all be true
without φ being true (under some suitable notion of possibility).⁵ And there
is a primitivist account, according to which logical consequence is a primitive
notion.
Exercise 1.3* Let sentence S₁ be ‘There exists an x such that x
and x are identical’, and let S₂ be ‘There exists an x such that there
exists a y such that x and y are not identical’. Does S₁ logically
imply S₂ according to the modal criterion? According to Quine’s
criterion?
⁴ Quine (); p. in Quine ().
⁵ Perhaps semantic/model-theoretic formalisms can be regarded as being inspired by the
modal account.
1.6 Logical constants
It’s natural to think of logic as having something to do with “form”. (Recall
the slogans of section 1.1.) The idea can be illustrated by seeing how it clashes
with the modal conception of logical consequence from the previous section.
Since it is impossible to be a bachelor without being unmarried, the modal
account says that ‘Grant is a bachelor’ logically implies ‘Grant is unmarried’.
But this seems wrong. Perhaps the first sentence “analytically” or “conceptually”
implies the second sentence, but the implication doesn’t seem logical. And it’s
natural to put this by saying that, whatever exactly logical implication amounts
to, logical implications must at least hold by virtue of form.⁶
But what does that mean? Consider an implication that, one is inclined to
say, does hold by virtue of form; the implication from ‘Leisel is a swimmer and
Leisel is famous’ to ‘Leisel is a swimmer’. This holds by virtue of form, one
might think, because i) it has the form ‘φ and ψ; so, φ’; and ii) for any pair of
sentences of this form, the first logically implies the second. But the defender
of the modal conception of logical consequence could say the following:
The inference from ‘Grant is a bachelor’ to ‘Grant is unmarried’
also holds in virtue of form. For: i) it has the form ‘α is a bachelor;
so, α is unmarried’; and ii) for any pair of sentences of this form,
the first sentence logically implies the second (since it’s impossible
for the first to be true while the second is false.)
What’s wrong with saying this? We normally think of the “forms” of inferences
as being things like ‘φ and ψ; so, φ’, and not things like ‘α is a bachelor; so,
α is unmarried’, but why not?
When we assign a form to an inference, we focus on some phrases while
ignoring others. The phrases we ignore disappear into the schematic letters
(α, φ, and ψ in the previous paragraph); the phrases on which we focus remain
(‘and’, ‘bachelor’, ‘unmarried’). Now, logicians do not focus on just any old
phrases. They focus on ‘and’, ‘or’, ‘not’, ‘if…then’, and so on, in propositional
logic; on ‘all’ and ‘some’ in addition in predicate logic; and on a few others.
But they do not focus on ‘bachelor’ and ‘unmarried’. Call the words on which
logicians focus (the words they leave intact when constructing forms, and the
words for which they introduce special symbolic correlates, such as ∧, ∨, and
∀) the logical constants. (These are what I was calling “logical expressions” in
the previous section.)
⁶ A hybrid of the modal and Quinean accounts of logical consequence respects this: φ is a
logical consequence of ψ₁, ψ₂ . . . iff it’s impossible for ψ′₁, ψ′₂ . . . to be true while φ′ is false, for
any φ′ and ψ′₁, ψ′₂ . . . that result from φ and ψ₁, ψ₂ . . . by uniform substitution for nonlogical
expressions.
We can speak of natural language logical constants (‘and’, ‘or’, ‘all’, ‘some’)
as well as symbolic logical constants (∧, ∨, ∀, ∃). The symbolic logical
constants get special treatment in formal systems. For example, in proof
systems for propositional logic there are special rules governing ∧; and these
rules differ from the rules governing ∨. This reflects the fact that ∧ and ∨ have
fixed interpretations in propositional logic. Unlike P, Q, and so on, which
are not symbolic logical constants, and which do not fixedly represent any
particular natural language sentences, ∧ and ∨ fixedly represent ‘and’ and ‘or’.
In terms of the notion of a logical constant, then, we can say why the
inference from ‘Grant is a bachelor’ to ‘Grant is unmarried’ is not a logical
one. When we say that logical implications hold by virtue of form, we mean
that they hold by virtue of logical form; and the form ‘α is a bachelor; so,
α is unmarried’ is not a logical form. A logical form must consist exclusively of
logical constants (plus punctuation and schematic variables); and the fact is that
logicians do not treat ‘bachelor’ and ‘unmarried’ as logical constants.
But this just pushes the question back: why don’t they? What’s so special
about ‘and’, ‘or’, ‘all’, and ‘some’? Just as the meaning of ‘and’ guarantees that
whenever ‘Leisel is a swimmer and Leisel is famous’ is true, ‘Leisel is a swimmer’
is true as well, so, the meanings of ‘bachelor’ and ‘unmarried’ guarantee that
whenever ‘Grant is a bachelor’ is true, ‘Grant is unmarried’ is true as well. Why
not expand logic beyond propositional and predicate logic to include the logic
of bachelorhood and unmarriage?
On the one hand there’s no formal obstacle to doing just that. We could
develop mathematical models of the inferential behavior of ‘bachelor’ and
‘unmarried’, by analogy to our models of the behavior of the usual logical
constants. To our predicate logic containing the special symbols ∧, ∨, ∀, ∃,
and the rest, we could add the special predicates B (for ‘bachelor’) and U (for
‘unmarried’). To our derivation systems, in addition to rules like ∧-elimination
(which lets us infer φ (and also ψ) from φ∧ψ) we could add a rule that lets
us infer Uα from Bα. But on the other hand, there are, intuitively, significant
differences between the expressions usually regarded as logical constants and
words like ‘bachelor’ and ‘unmarried’. The question of what, exactly, these
differences amount to is a philosophical question in its own right.⁷
⁷ See MacFarlane () for a survey of the issues here.
1.7 Extensions, deviations, variations
“Standard logic” is what is usually studied in introductory logic courses. It
includes propositional logic (logical constants: ∼, ∧, ∨, →, ↔), and predicate
logic (logical constants: ∀, ∃, variables). In this book we’ll consider various
modifications of standard logic. Following Gamut (a, pp. -), it is
helpful to distinguish three sorts: extensions, deviations, and variations.
In an extension we add to standard logic. We add new symbolic logical
constants (for example, the □ of modal logic), and new cases of logical consequence
and logical truth that we can model using the new logical constants. We do
this in order to represent more facets of the notion of logical consequence.
We extend propositional logic, after all, to get predicate logic. Propositional
logic is great as far as it goes, but it cannot represent the logical implication of
‘someone is male’ by ‘Grant is male’. That is why we add quantifiers, variables,
predicates, and so on, to propositional logic (new symbols), and add means
to deal with these new symbols in semantics and proof theory (new cases of
logical consequence and logical truth we model), to obtain predicate logic.
As we saw in the previous section, logicians don’t treat just any old words as
logical constants. They never treat ‘bachelor’ as a logical constant, for example.
But many logicians do allow some expansion of the usual list familiar from
propositional and predicate logic. Many consider modal logic, for example,
in which one treats ‘necessarily’ as a logical constant (symbolized by the new
symbol □) to be part of logic.
In a deviation we retain the usual set of logical constants, but change what
we say about them. We keep standard logic’s symbols, but alter its proof theory
and semantics, thereby offering a different model of logical consequence and
logical truth.
Why do this? Perhaps because we think that standard logic is wrong. For
example, the standard semantics for propositional logic counts the sentence
P∨∼P as a tautology. But some philosophers resist the idea that natural
language sentences like the following are logically true:
Either I am tall or I am not tall
Either there will be a sea battle tomorrow or there will
not be a sea battle tomorrow
If these philosophers are right, then the standard notion of a tautology is an
imperfect model of genuine logical truth, and we need a better model.
Variations also change standard logic, but here the changes are, roughly
speaking, merely notational; they leave the content of standard logic
unaltered. For example, in Polish notation, instead of writing P→(Q∧R), we write
→P∧QR; binary connectives go in front of the sentences they connect rather
than between them.
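That the difference is merely notational can be seen by printing one and the same formula in both styles. In this sketch a formula is either a sentence letter (a string) or a triple of a binary connective and two subformulas. The encoding and the function names `infix` and `polish` are assumptions made for illustration.

```python
def infix(formula, top=True):
    """Write a formula with binary connectives between their arguments."""
    if isinstance(formula, str):
        return formula                        # a sentence letter
    connective, left, right = formula
    body = infix(left, False) + connective + infix(right, False)
    return body if top else "(" + body + ")"  # inner subformulas need parentheses


def polish(formula):
    """Write the same formula with each connective in front of its arguments."""
    if isinstance(formula, str):
        return formula
    connective, left, right = formula
    return connective + polish(left) + polish(right)


example = ("→", "P", ("∧", "Q", "R"))   # an illustrative formula
print(infix(example))
print(polish(example))
```

Both functions walk the same underlying structure, and the Polish rendering needs no parentheses, since the position of each connective already fixes the grouping.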
1.8 Set theory
I said earlier that modern logic uses mathematical techniques to study formal
languages. The mathematical techniques in question are those of set theory.
Only the most elementary set-theoretic concepts and assumptions will be
needed, and you may already be familiar with them; but nevertheless, here is a
brief overview.
Sets have members. Consider the set, A, of even integers between 2 and 6. 2
is a member of A, 4 is a member of A, 6 is a member of A; and nothing else is a
member of A. We use the expression ‘∈’ for membership; thus, we can say:
2∈A, 4∈A, and 6∈A. We often name a set by putting names of its members
between braces: ‘{2, 4, 6}’ is another name of A.
We can also speak of sets with infinitely many members. Consider N, the set
of natural numbers. Each natural number is a member of N; thus, 0∈N, 1∈N,
and so on. We can informally name this set with the brace notation as well:
‘{0, 1, 2, 3, . . . }’, so long as it is clear which continued series the ellipsis signifies.
The members of a set need not be mathematical entities; anything can be a
member of a set.⁸ Sets can contain people, or cities, or (to draw nearer to our
intended purpose) sentences and other linguistic entities.
There is also the empty set, ∅. This is the one set with no members. That
is, for each object u, u is not a member of ∅ (i.e.: for each u, u∉∅.)
Though the notion of a set is an intuitive one, the Russell Paradox (discovered
by Bertrand Russell) shows that it must be employed with care. Let R be
the set of all and only those sets that are not members of themselves. That is,
R is the set of non-self-members. Russell asks the following question: is R a
member of itself? There are two possibilities:
R∉R. Thus, R is a non-self-member. But R was said to be the set of all
non-self-members, and so we’d have R∈R. Contradiction.
R∈R. So R is not a non-self-member. R, by definition, contains only
non-self-members. So R∉R. Contradiction.
Thus, each possibility leads to a contradiction. But there are no remaining
possibilities: either R is a member of itself or it isn’t! So it looks like the very
idea of sets is paradoxical.
⁸ Well, some axiomatic set theories bar certain very large collections from being members
of sets. This issue won’t be relevant here.
Since Russell’s time, set theorists have developed theories of sets that avoid
Russell’s paradox (as well as other related paradoxes). They do this chiefly by
imposing rigid restrictions on when sets exist. So far we have been blithely
assuming that there exist various sets: the set N, sets containing people, cities,
and sentences, Russell’s set R. That got us into trouble. So what we want is
a theory of when sets exist that blocks the Russell paradox by saying that set
R simply doesn’t exist (for then Russell’s argument falls apart), but which says
that the sets we need to do mathematics and metalogic do exist. The details of
set theory are beyond the scope of this book. Here, we will help ourselves to
intuitively “safe” sets, sets that aren’t anything like the Russell set. We’ll leave
the task of what “safe” amounts to, exactly, to the set theorists.
Various other useful set-theoretic notions can be defined in terms of the
notion of membership. Set A is a subset of set B (‘A⊆B’) when every member
of A is a member of B. The intersection of A and B (‘A∩B’) is the set that
contains all and only those things that are members of both A and B; the union
of A and B (‘A∪B’) is the set containing all and only those things that are
members of either A or B (or both⁹).
Suppose we want to refer to the set of the so-and-sos, that is, the set
containing all and only objects, u, that satisfy the condition ‘so-and-so’. We’ll
do this with the term ‘{u: u is a so-and-so}’. Thus, we could write: N = {u :
u is a natural number}. And we could restate the definitions of ∩ and ∪ from
the previous paragraph as follows:
A∩B = {u : u∈A and u∈B}
A∪B = {u : u∈A or u∈B}
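These notions line up closely with Python's built-in sets, which makes for a quick way to experiment. In the sketch below the second set `B` and the finite `universe` that the set-builder terms range over are assumptions added for illustration; genuine set-builder notation is not restricted to a finite stock of objects.

```python
A = {2, 4, 6}              # the set A of even integers between 2 and 6
B = {4, 6, 8}              # a second set, assumed here for illustration
universe = range(0, 10)    # assumed finite stock of objects to draw u from

is_member = 2 in A                     # membership
is_subset = {2, 4} <= A                # every member of {2, 4} is a member of A

# Set-builder terms, written as comprehensions over the assumed universe.
intersection = {u for u in universe if u in A and u in B}
union = {u for u in universe if u in A or u in B}

empty = set()                          # the empty set has no members
print(is_member, is_subset, intersection, union, len(empty))
```

The comprehensions agree with the built-in operators `A & B` and `A | B`, which compute intersection and union directly.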
Sets have members, but they don’t contain them in any particular order.
For example, the set containing me and Barack Obama doesn’t have a “first”
member. ‘{Ted, Obama}’ and ‘{Obama, Ted}’ are two different names for
the same set: the set containing just Obama and me. (This follows from
the “criterion of identity” for sets: sets are identical if and only if they have
exactly the same members.) But sometimes we need to talk about set-like things
containing objects in a particular order. For this purpose we use ordered sets.¹⁰
Two-membered ordered sets are called ordered pairs. To name the ordered
pair of Obama and Ted, we use: ⟨Obama, Ted⟩. Here, the order is significant;
⟨Obama, Ted⟩ and ⟨Ted, Obama⟩ are not the same ordered pair. The
three-membered ordered set of u, v, and w (in that order) is written: ⟨u, v, w⟩; and
similarly for ordered sets of any finite size. An n-membered ordered set is called
an n-tuple. (For the sake of convenience, let’s define the 1-tuple ⟨u⟩ to be just
the object u itself.)
⁹ In this book I always use ‘or’ in its inclusive sense.
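The contrast between sets and ordered sets shows up directly in Python, where sets ignore order and tuples respect it. The sketch also encodes an ordered pair as a set of sets, in the style of the trick for defining ordered sets in terms of sets. The function name `set_pair` and the use of `frozenset` (needed so that sets can be members of sets) are assumptions of this illustration.

```python
# Sets: listing the members in a different order names the same set.
same_set = {"Ted", "Obama"} == {"Obama", "Ted"}

# Ordered pairs (tuples): the order is significant.
same_pair = ("Obama", "Ted") == ("Ted", "Obama")


def set_pair(u, v):
    """Encode the ordered pair of u and v as the set {{u}, {u, v}}."""
    return frozenset({frozenset({u}), frozenset({u, v})})


encoded_differ = set_pair("Obama", "Ted") != set_pair("Ted", "Obama")
print(same_set, same_pair, encoded_differ)
```

The encoded pairs differ even though they are built purely from unordered sets, because the first member is the one that appears in both inner sets.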
A further concept we’ll need is that of a relation. A relation is just a feature
of multiple objects taken together. The taller-than relation is one example:
when one person is taller than another, that’s a feature of those two objects
taken together. Another example is the less-than relation for numbers. When
one number is less than another, that’s a feature of those two numbers taken
together.
“Binary” relations apply to two objects at a time. The taller-than and
less-than relations are binary relations, or “two-place” relations as we might say.
We can also speak of three-place relations, four-place relations, and so on. An
example of a three-place relation would be the betweenness relation for numbers:
the relation that holds among 2, 5, and 23 (in that order), for example.
We can use ordered sets to give an official definition of what a relation is.
Definition of relation: An n-place relation is a set of n-tuples.
So a binary (two-place) relation is a set of ordered pairs. For example, the
taller-than relation may be taken to be the set of ordered pairs ⟨u, v⟩ such that
u is a taller person than v. The less-than relation for positive integers is the set
of ordered pairs ⟨m, n⟩ such that m is a positive integer less than n, another
positive integer. That is, it is the following set:
{⟨1, 2⟩, ⟨1, 3⟩, ⟨1, 4⟩ . . . ⟨2, 3⟩, ⟨2, 4⟩ . . . }
¹⁰ There’s a trick for defining ordered sets in terms of sets. First, define the ordered pair
⟨u, v⟩ as the set {{u}, {u, v}}. (We can recover the information that u is intended to be the first
member because u “appears twice”.) Then define the n-tuple ⟨u₁ . . . uₙ⟩ as the ordered pair
⟨u₁, ⟨u₂ . . . uₙ⟩⟩, for each n ≥ 3. But henceforth I’ll ignore this trick and just speak of ordered
sets without worrying about how they’re defined.
When ⟨u, v⟩ is a member of relation R, we say, equivalently, that u and v “stand
in” R, or R “holds between” u and v, or that u “bears” R to v. Most simply,
we write “Ruv”.¹¹
Some more definitions:
Definition of domain, range, over: Let R be any binary relation and A be
any set.
The domain of R (“dom(R)”) is the set {u: for some v, Ruv}
The range of R (“ran(R)”) is the set {u: for some v, Rvu}
R is over A iff dom(R)⊆A and ran(R)⊆A
In other words, the domain of R is the set of all things that bear R to something;
the range is the set of all things that something bears R to; and R is over A iff
the members of the “tuples” in R are all drawn from A.
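Treating a binary relation as a set of ordered pairs makes these definitions easy to compute for finite cases. The sketch restricts the less-than relation to the integers 1 through 4, an assumption made so that the set is finite; the function names mirror dom, ran, and over.

```python
# The less-than relation, restricted (by assumption) to the integers 1 through 4.
less_than = {(m, n) for m in range(1, 5) for n in range(1, 5) if m < n}


def dom(R):
    """The set of things that bear R to something."""
    return {u for (u, v) in R}


def ran(R):
    """The set of things that something bears R to."""
    return {v for (u, v) in R}


def is_over(R, A):
    """R is over A iff its domain and range are both subsets of A."""
    return dom(R) <= A and ran(R) <= A


print(dom(less_than), ran(less_than))
print(is_over(less_than, {1, 2, 3, 4}), is_over(less_than, {1, 2, 3}))
```

Here 4 is missing from the domain because, within this restricted relation, it is less than nothing, and 1 is missing from the range because nothing is less than it.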
Binary relations come in different kinds, depending on the patterns in which
they hold:
Definition of kinds of binary relations: Let R be any binary relation over
some set A.
R is serial (in A) iff for every u∈A, there is some v∈A such that Ruv.
R is reflexive (in A) iff for every u∈A, Ruu
R is symmetric iff for all u, v, if Ruv then Rvu
R is transitive iff for any u, v, w, if Ruv and Rvw then Ruw
R is an equivalence relation (in A) iff R is symmetric, transitive, and
reflexive (in A)
R is total (in A) iff for every u, v∈A, Ruv
Notice that we relativize some of these relation types to a given set A. We do
this in the case of reflexivity, for example, because the alternative would be
to say that a relation is reflexive simpliciter if everything bears R to itself; but
that would require the domain and range of any reflexive relation to be the set
of absolutely all objects. It’s better to introduce the notion of being reflexive
relative to a set, which is applicable to relations with smaller domains. (I will
sometimes omit the qualifier ‘in A’ when it is clear which set that is.) Why
don’t symmetry and transitivity have to be relativized to a set? Because they
only say what must happen if R holds among certain things. Symmetry, for
example, says merely that if R holds between u and v, then it must also hold
between v and u, and so we can say that a relation is symmetric absolutely,
without implying that everything is in its domain.
¹¹ This notation is like that of predicate logic; but here I’m speaking the metalanguage, not
displaying sentences of a formalized language.
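For finite relations each of these kinds can be checked mechanically. In the sketch below a relation is a set of ordered pairs, and the properties relativized to a set take that set as an extra argument, whereas symmetry and transitivity do not. The two sample relations over {1, 2, 3} are assumptions chosen for illustration.

```python
def is_serial(R, A):
    return all(any((u, v) in R for v in A) for u in A)


def is_reflexive(R, A):
    return all((u, u) in R for u in A)


def is_symmetric(R):
    return all((v, u) in R for (u, v) in R)


def is_transitive(R):
    return all((u, w) in R for (u, v) in R for (x, w) in R if v == x)


def is_equivalence(R, A):
    return is_symmetric(R) and is_transitive(R) and is_reflexive(R, A)


def is_total(R, A):
    return all((u, v) in R for u in A for v in A)


A = {1, 2, 3}
same_parity = {(u, v) for u in A for v in A if (u - v) % 2 == 0}
less_than = {(u, v) for u in A for v in A if u < v}

print(is_equivalence(same_parity, A), is_total(same_parity, A))
print(is_transitive(less_than), is_serial(less_than, A))
```

The less-than relation fails to be serial in A only because 3 bears it to nothing in A, which shows why seriality has to be stated relative to a set.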
We’ll also need the concept of a function. A function “takes in” an object or
objects (in a certain order), and spits out a further object. For example, the
addition function takes in two numbers, and spits out their sum. As with sets,
ordered sets, and relations, functions are not limited to mathematical entities:
they can take in and spit out any objects whatsoever. We can speak of the
father-of function, for example, which takes in a person, and spits out the father
of that person. (The more common way of putting this is: the function maps
the person to his or her father.) And later in this book we will be considering
functions that take in and spit out linguistic entities.
Some functions must take in more than one object before they are ready to
spit out something. For example, you need to give the addition function two
numbers in order to get it to spit out something; for this reason it is called a
two-place function. The father-of function, on the other hand, needs to be given
only one object, so it is a one-place function. Let’s simplify this by thinking
of an n-place function as simply being a one-place function that takes in only
n-tuples. Thus, if you give the addition function the ordered pair ⟨2, 5⟩, it spits
out 7.
The objects that a function takes in are called its arguments, and the objects
it spits out are called its values. If u is an argument of f we write “f(u)” for
the value of function f as applied to the argument u. f(u) is the object that
f spits out, if you feed it u. For example, where f is the father-of function,
since Ron is my father we can write: f(Ted) = Ron. When f is an n-place
function (i.e., its arguments are n-tuples), instead of writing f(⟨u₁, . . . , uₙ⟩)
we write simply f(u₁, . . . , uₙ). So where a is the addition function, we can write:
a(2, 3) = 5. The domain of a function is the set of its arguments, and its range is
the set of its values. If u is not in function f’s domain (i.e., u is not one of f’s
arguments), then f is undefined for u. The father-of function, for example, is
undefined for numbers (since numbers have no fathers). These concepts may
be pictured for (a part of) the father-of function thus:
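[Figure: arrows from members of the domain to members of the range]
Jenna Bush → George W. Bush
Barbara Bush → George W. Bush
George W. Bush → George H. W. Bush
Chelsea Clinton → Bill Clinton
Shown outside the domain: 17, Massachusetts, Cygnus X-
Shown outside the range: Chelsea Clinton, Cygnus X-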
The number 17 and the state of Massachusetts are excluded from the domain
because, being a number and a political entity, they don’t have fathers. Chelsea
Clinton and Cygnus X- are excluded from the range because, being a woman
and a black hole, they aren’t fathers of anyone. 17 and Massachusetts aren’t in
the range either; and Cygnus X- isn’t in the domain. But Chelsea Clinton is
in the domain, since she has a father.
It’s part of the definition of a function that a function can never map an
argument to two distinct values. That is, f(u) cannot be equal both to v and
also to v′ when v and v′ are two different objects. That is, a function always
has a unique value, given any argument for which the function is defined. (So
there is no such function as the parent-of function; people typically have more
than one parent.) Functions are allowed to map two distinct arguments to the
same value. (The father-of function is an example; two people can have the
same father.) But if a given function happens never to do this, then it is called
one-to-one. That is, a (one-place) function f is one-to-one iff for any u and v
in its domain, if u ≠ v then f(u) ≠ f(v). (The function of natural numbers f
defined by the equation f(n) = n + 1 is an example.) This all may be pictured
as follows:
[Figure: three arrow diagrams, labeled “Not a function”, “One-to-one function”,
and “Function that’s not one-to-one”]
As with the notion of a relation, we can use ordered sets to give official
definitions of function and related notions:
Definition of function-theoretic notions:
· A function is a set of ordered pairs, f, obeying the condition that if ⟨u, v⟩
  and ⟨u, w⟩ are both members of f, then v = w
· When ⟨u, v⟩ ∈ f, we say that u is an argument of f, v is a value of f, and
  that f maps u to v; and we write: "f(u) = v"
· The domain of a function is the set of its arguments; its range is the set
  of its values
· A function is n-place when every member of its domain is an n-tuple
Thus, a function is just a certain kind of binary relation: one that never relates
a single thing u to two distinct objects v and w. (Notice that the definition
of "domain" and "range" for functions yields the same results as the definition
given earlier for relations.)
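The official definitions can be tried out on small finite examples. The following is only a sketch in Python: the helper names (`is_function`, `domain`, `range_of`, `is_one_to_one`) are illustrative assumptions, not notation from the text, and the extra pair added to build `parent_of` is supplied just for the example.

```python
def is_function(pairs):
    """True iff the set of ordered pairs never pairs one argument with two values."""
    seen = {}
    for u, v in pairs:
        if u in seen and seen[u] != v:
            return False  # <u, v> and <u, w> with v != w
        seen[u] = v
    return True

def domain(pairs):
    """The set of arguments (first members of the pairs)."""
    return {u for u, _ in pairs}

def range_of(pairs):
    """The set of values (second members of the pairs)."""
    return {v for _, v in pairs}

def is_one_to_one(pairs):
    """True iff pairs is a function that never maps two arguments to one value."""
    values = [v for _, v in set(pairs)]
    return is_function(pairs) and len(values) == len(set(values))

# A part of the father-of function, as a set of ordered pairs.
father_of = {
    ("Jenna Bush", "George W. Bush"),
    ("Barbara Bush", "George W. Bush"),
    ("George W. Bush", "George H. W. Bush"),
    ("Chelsea Clinton", "Bill Clinton"),
}
# Adding a second parent for one person (illustrative) destroys functionhood.
parent_of = father_of | {("Chelsea Clinton", "Hillary Clinton")}
# A finite fragment of f(n) = n + 1.
successor = {(n, n + 1) for n in range(10)}

print(is_function(father_of), is_one_to_one(father_of))  # a function, not one-to-one
print(is_function(parent_of))                            # not a function
print(is_one_to_one(successor))                          # one-to-one
```

Only finite sets of pairs can be checked this way; the definitions themselves apply equally to infinite functions.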
The topic of infinity is perhaps set theory's most fascinating part. And one
of the most fascinating things about infinity is the matter of sizes of infinity.
Compare the set N of natural numbers and the set E of even natural numbers
({0, 2, 4, 6, ...}). Which set is bigger, that is, which has more members? You might
think that N has got to be bigger, since it contains all the members of E and
then the odd natural numbers in addition. But in fact these sets have the same
size. For we can line up their members as follows:
N:  0  1  2  3  4  5   ...
E:  0  2  4  6  8  10  ...
If two sets can be lined up in this way, then they have the same size. Indeed,
this is how set theorists define "same size". Or rather, they give a precise
definition of sameness of size (they call it "equinumerosity", or sameness of
"cardinality") which captures this intuitive idea:
Definition of equinumerosity: Sets A and B are equinumerous iff there
exists some one-to-one function whose domain is A and whose range is B
Intuitively: sets are equinumerous when each member of either set can be
associated with a unique member of the other set. You can line their members
up.
The picture in which the members of N and the members of E were lined
up is actually a picture of a function: the function that maps each member of N
to the member of E immediately below it in the picture. Mathematically, this
function, f, may be defined thus:
f(n) = 2n (for any n ∈ N)
This function is one-to-one (since if two natural numbers are distinct then
doubling each results in two distinct numbers). So N and E are equinumerous.
It's quite surprising that a set can be equinumerous with a mere subset of itself.
But that's how it goes with infinity.
Even more surprising is the fact that the rational numbers are equinumerous
with the natural numbers. A (nonnegative) rational number is a number that
can be written as a fraction n/m where n and m are natural numbers and m ≠ 0.
To show that N is equinumerous with the set Q of rational numbers, we must
find a one-to-one function whose domain is N and whose range is Q. At first this
seems impossible, since the rationals are "dense" (between every two fractions
there is another fraction) whereas the naturals are not. But we must simply be
clever in our search for an appropriate one-to-one function.
Each rational number is represented in the following grid:
                        denominators
                  1      2      3      4      5     ...
             0   0/1    0/2    0/3    0/4    0/5    ...
             1   1/1    1/2    1/3    1/4    1/5    ...
numerators   2   2/1    2/2   [2/3]   2/4    2/5    ...
             3   3/1    3/2    3/3    3/4    3/5    ...
             4   4/1    4/2    4/3    4/4    4/5    ...
            ...  ...    ...    ...    ...    ...
Any rational number n/m can be found in the row for n and the column for
m. For example, 2/3 (marked above) is in the row for 2 (the third row, since
the first row is for 0) and the column for 3 (the third column). Indeed, every
rational number appears multiple times in the grid (infinitely many times, in
fact). For example, the rational number 1/2, which occurs in the second row,
second column, is the same as the rational number 2/4, which occurs in the third
row, fourth column. (It's also the same as 3/6, 4/8, 5/10, ....)
Our goal is to find a way to line up the naturals with the rationals, that is, to find
a one-to-one function, f, with domain N and range Q. Since each rational
number appears in the grid, all we need to do is go through all of the (infinitely
many!) points on the grid, one by one, and count off a corresponding natural
number for each; we'll then let our function f map the natural numbers we
count off to the rational numbers that appear at the corresponding points on
the grid. Let's start at the top left of the grid, and count off the first natural
number, 0. So we'll have f map 0 to the rational number at the top left of the
grid, namely, 0/1. That is, f(0) = 0/1. We can depict this by labeling 0/1 with the
natural number we counted off, 0:
                           denominators
                  1          2          3          4          5        ...
             0   0/1 (0)    0/2        0/3        0/4        0/5       ...
             1   1/1        1/2        1/3        1/4        1/5       ...
numerators   2   2/1        2/2        2/3        2/4        2/5       ...
             3   3/1        3/2        3/3        3/4        3/5       ...
             4   4/1        4/2        4/3        4/4        4/5       ...
            ...  ...        ...        ...        ...        ...
Next, ignoring a certain wrinkle which I'll get to in a moment, let's count off
natural numbers for the rationals in the uppermost "ring" around the top left
of the grid, in counterclockwise order, beginning at the left:
                           denominators
                  1          2          3          4          5        ...
             0   0/1 (0)    0/2 (3)    0/3        0/4        0/5       ...
             1   1/1 (1)    1/2 (2)    1/3        1/4        1/5       ...
numerators   2   2/1        2/2        2/3        2/4        2/5       ...
             3   3/1        3/2        3/3        3/4        3/5       ...
             4   4/1        4/2        4/3        4/4        4/5       ...
            ...  ...        ...        ...        ...        ...
Then (continuing to ignore the wrinkle) let's count off the next ring of numbers,
again in counterclockwise order beginning at the left:
                           denominators
                  1          2          3          4          5        ...
             0   0/1 (0)    0/2 (3)    0/3 (8)    0/4        0/5       ...
             1   1/1 (1)    1/2 (2)    1/3 (7)    1/4        1/5       ...
numerators   2   2/1 (4)    2/2 (5)    2/3 (6)    2/4        2/5       ...
             3   3/1        3/2        3/3        3/4        3/5       ...
             4   4/1        4/2        4/3        4/4        4/5       ...
            ...  ...        ...        ...        ...        ...
And so on infinitely. For each new ring, we begin at the left, and move through
the ring counterclockwise, continuing to count off natural numbers.
Every point on the grid will eventually be reached by one of these increasingly
large (but always finite) rings. Since every rational number appears on
the grid, every rational number eventually gets labeled with a natural number.
So the range of our function f is the entirety of Q! There are two tricks that
make this work. First, even though the rational numbers are dense, they can
be laid out in a discrete grid. Second, even though the grid is two-dimensional
and the natural numbers are only one-dimensional, there is a way to cover the
whole grid with naturals since there is a "one-dimensional" path that covers
the entire grid: the path along the expanding rings.
The wrinkle is that this procedure, as we've laid it out so far, doesn't deliver
a one-to-one function, because rational numbers appear multiple times in the
grid. For example, given our definition, f maps 0 to 0/1 and 3 to 0/2. But 0/2 is the
same rational number as 0/1 (namely, 0), so f isn't one-to-one. (f also maps 8
to 0; and it maps both 1 and 5 to 1, etc.) But it's easy to modify the procedure
to fix this problem. In our trek through the rings, whenever we hit a rational
number that we've already encountered, let's now simply skip it, and go on to
the next rational number on the trek. Thus, the new diagram looks as follows
(the skipped rational numbers are struck out):
                           denominators
                  1          2          3          4          5        ...
             0   0/1 (0)    0/2 (x)    0/3 (x)    0/4 (x)    0/5 (x)   ...
             1   1/1 (1)    1/2 (2)    1/3 (5)    1/4 (9)    1/5 (15)  ...
numerators   2   2/1 (3)    2/2 (x)    2/3 (4)    2/4 (x)    2/5 (14)  ...
             3   3/1 (6)    3/2 (7)    3/3 (x)    3/4 (8)    3/5 (13)  ...
             4   4/1 (10)   4/2 (x)    4/3 (11)   4/4 (x)    4/5 (12)  ...
            ...  ...        ...        ...        ...        ...
(x) marks a struck-out (skipped) entry.
We've now got our desired function f: it is the function that maps each natural
number to the rational number in the grid labeled by that natural number.
(Notice, incidentally, that f could be displayed in this way instead:
n:     0    1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   ...
f(n):  0    1    1/2  2    2/3  1/3  3    3/2  3/4  1/4  4    4/3  4/5  3/5  2/5  1/5  ...
This is just a different picture of the same function.) Since each rational number
is labeled by some natural number, f's range is Q. f's domain is clearly N. And
f is clearly one-to-one (since our procedure skips previously encountered
rational numbers). So f is our desired function; N and Q are the same size.
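The ring-by-ring trek, with skipping, can be sketched as a short Python generator. This is an illustration under stated assumptions: the names `ring_points` and `enumerate_rationals` are hypothetical, and Python's standard `fractions.Fraction` is used so that, for instance, 0/2 and 0/1 count as the same rational number.

```python
from fractions import Fraction
from itertools import islice

def ring_points(k):
    """Grid points (numerator, denominator) on ring k, counterclockwise from the left.

    Ring k runs along the row for numerator k, from column 1 to column k + 1,
    and then up column k + 1 to the row for numerator 0.
    """
    for m in range(1, k + 2):
        yield (k, m)
    for n in range(k - 1, -1, -1):
        yield (n, k + 1)

def enumerate_rationals():
    """Yield f(0), f(1), f(2), ...: the nonnegative rationals, each exactly once."""
    seen = set()
    k = 0
    while True:
        for n, m in ring_points(k):
            q = Fraction(n, m)
            if q not in seen:  # skip rationals already encountered
                seen.add(q)
                yield q
        k += 1

first_16 = list(islice(enumerate_rationals(), 16))
print(" ".join(str(q) for q in first_16))
```

The first sixteen values agree with the labels (0) through (15) in the final grid. The generator never repeats a value, which is the one-to-one property, and any given n/m is reached no later than the ring on which it sits.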
If even a dense set like Q is no bigger than N, are all infinite sets the same
size? The answer is in fact no. Some infinite sets are bigger than N; there are
different sizes of infinity.
One such set is the set of real numbers. Real numbers are numbers that can
be represented by decimals. All rational numbers are real numbers; and their
decimal representations either terminate or eventually repeat in some infinitely
recurring pattern. (For example, 1/3 has the repeating decimal representation
0.3333...; 7/4 has the terminating decimal representation 1.75.) But some real
numbers are not rational numbers. These are the real numbers with decimal
representations that never repeat. One example is the real number π, whose
decimal representation begins: 3.14159....
We'll prove that there are more real than natural numbers by proving that
there are more real numbers between 0 and 1 than there are natural numbers.
Let R be the set of real numbers in this interval. Now, consider the function
f which maps each natural number n to 1/(n+2). This is a one-to-one function
whose domain is N and whose range is {1/2, 1/3, 1/4, ...}. But this latter set is a subset
of R. So R is at least as big as N. So all we need to do is show that R is not the
same size as N. And we can do this by showing that the assumption that N and
R are the same size would lead to a contradiction.
So, suppose that N and R are equinumerous. Given the definition of
equinumerosity, there must exist some one-to-one function, f, whose domain is N
and whose range is R. We can represent f on a grid as follows:
f(0) = 0 . a_{0,0}  a_{0,1}  a_{0,2}  ...
f(1) = 0 . a_{1,0}  a_{1,1}  a_{1,2}  ...
f(2) = 0 . a_{2,0}  a_{2,1}  a_{2,2}  ...
  ...        ...      ...      ...
The grid represents the real numbers in the range of f by their decimal
representations.[12] The a's are the digits in these decimal representations. For any
natural number i, f(i) is represented as the decimal 0.a_{i,0}a_{i,1}a_{i,2}.... Thus a_{i,j}
is the (j+1)st digit in the decimal representation of f(i). Consider f(2), for
example. If f(2) happens to be the real number 0.2562894..., then a_{2,0} = 2,
a_{2,1} = 5, a_{2,2} = 6, a_{2,3} = 2, and so on.
[12] If a decimal representation terminates, we can think of it as nevertheless being infinite:
there are infinitely many zeros after the termination point.
The right hand part of the grid (everything except the column beginning
with "f(0) =") is a list of real numbers. The first real number on this list is
0.a_{0,0}a_{0,1}a_{0,2}..., the second is 0.a_{1,0}a_{1,1}a_{1,2}..., the third is
0.a_{2,0}a_{2,1}a_{2,2}..., and so on. The real numbers in this list, in fact, comprise
the range of f. But we have supposed, remember, that the range of f is the entirety
of R. Thus, we have an important consequence of our supposition: this list is a
complete list of R. That is, every member of R occurs somewhere on the list, as the
decimal 0.a_{i,0}a_{i,1}a_{i,2}..., for some natural number i.
But in fact, we can show that this can't be a complete list of R, by showing
that there is at least one real number between 0 and 1 that does not appear on
the list. We're going to do this in a crafty way: we'll look at the grid above,
and construct our real number as a function of the grid in such a way that it's
guaranteed not to be anywhere on the list.
I'll call the real number I'm after "d"; to specify d, I'm going to specify its
decimal representation 0.d_0 d_1 d_2 .... Here is my definition of the j-th digit in
this decimal representation:
    d_j = 6 if a_{j,j} = 5
          5 otherwise
The "a_{j,j}"s refer to the grid depicting f above; thus, what real number d we
have defined depends on the nature of the grid, and thus on the nature of the
function f.
To get a handle on what's going on here, think about it geometrically.
Consider the digits on the diagonal of the grid, bracketed below:
f(0) = 0 . [a_{0,0}]  a_{0,1}   a_{0,2}   ...
f(1) = 0 .  a_{1,0}  [a_{1,1}]  a_{1,2}   ...
f(2) = 0 .  a_{2,0}   a_{2,1}  [a_{2,2}]  ...
  ...         ...       ...       ...
To these diagonal digits, there corresponds a real number: 0.a_{0,0}a_{1,1}a_{2,2}.... Call
this real number a. What we did to arrive at our number d (so-called because we
are giving a "diagonal argument") was to begin with a's decimal representation
and change each of its digits. We changed each of its digits to 5, except when
the digit was already 5, in which case we changed it to 6.
We now approach the punch line. d's definition ensures that it cannot
be anywhere on the list. Let f(i) be any member of the list. We can prove
that d and f(i) are not the same number. If they were, then their decimal
representations 0.d_0 d_1 d_2 ... and 0.a_{i,0}a_{i,1}a_{i,2}... would also be the same. So each
digit d_j in d's decimal representation would equal its corresponding digit a_{i,j} in
f(i)'s decimal representation. But this can't be. There is one place in particular
where the digits must differ: the i-th place. d_i is defined to be 6 if a_{i,i} is 5, and
defined to be 5 if a_{i,i} is not 5. Thus, d_i is not the same digit as a_{i,i}. So d's decimal
representation differs in at least one place from f(i)'s decimal representation;
so d is different from f(i). But f(i) was an arbitrarily chosen member of the
list. Thus we have our conclusion: d isn't anywhere on the list. But d is a real
number between 0 and 1. So if our initial assumption that the range of f is all
of R were correct, d would have to be on the list. So that initial assumption
was false, and we've completed our argument: it's impossible for there to be
a one-to-one function whose domain is N and whose range is all of R. Even
though N and R are both infinite sets, R is a bigger infinite set.
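The construction of d can be tried on a finite stand-in for the list. The sketch below is only an illustration: a real list would be infinite, and the five digit strings in `rows` are made-up examples (the row for f(2) borrows the digits 2562894 from the example above). The name `diagonal_digits` is likewise an assumption.

```python
def diagonal_digits(rows):
    """Return the first len(rows) digits of d.

    rows[i] holds the digits of f(i) after the decimal point, as a string.
    d_j is 6 if the diagonal digit a_{j,j} is 5, and 5 otherwise.
    """
    return "".join("6" if rows[j][j] == "5" else "5" for j in range(len(rows)))

# A made-up finite fragment of a list f(0), f(1), f(2), ...
rows = [
    "1415926",  # f(0) = 0.1415926...
    "7500000",  # f(1) = 0.7500000...
    "2562894",  # f(2) = 0.2562894...
    "3333333",  # f(3) = 0.3333333...
    "0000500",  # f(4) = 0.0000500...
]

d = diagonal_digits(rows)
print("d = 0." + d + "...")

# d differs from each listed number at that number's own diagonal place.
for i, row in enumerate(rows):
    assert d[i] != row[i]
```

However the rows are chosen, the final loop cannot fail, since each digit of d is picked precisely so as to differ from the corresponding diagonal digit.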
To grasp the argument's final phase, think again in geometric terms. If
d were on the list, its decimal representation would intersect the diagonal.
Suppose, for instance, that d were f(3):
    f(0) = 0 . [a_{0,0}]  a_{0,1}   a_{0,2}   a_{0,3}   a_{0,4}   ...
    f(1) = 0 .  a_{1,0}  [a_{1,1}]  a_{1,2}   a_{1,3}   a_{1,4}   ...
    f(2) = 0 .  a_{2,0}   a_{2,1}  [a_{2,2}]  a_{2,3}   a_{2,4}   ...
d = f(3) = 0 .  a_{3,0}   a_{3,1}   a_{3,2}  [a_{3,3}]  a_{3,4}   ...
    f(4) = 0 .  a_{4,0}   a_{4,1}   a_{4,2}   a_{4,3}  [a_{4,4}]  ...
      ...         ...       ...       ...       ...       ...
Then, given d's definition, its decimal representation would be guaranteed to
differ from the diagonal series in its fourth digit, the point of intersection.
It's natural to voice the following misgiving about the argument: "if d was
left off the list, then why can't you just add it in? You could add it in at the
beginning, bumping all the remaining members of the list down one slot to
make room for it":
initial list     make room for d     new list
   f(0)                                 d
   f(1)               f(0)             f(0)
   f(2)               f(1)             f(1)
   ...                f(2)             f(2)
                      ...              ...
Natural as it is, the misgiving is misguided. It's true that, given any list, one
could add d to that list using the method described. But this fact is irrelevant
to the argument. The argument wasn't that there is some unlistable real number,
d (some real number d that is somehow prevented from occurring in the
range of any one-to-one function whose domain is N). That would be absurd.
The argument was rather that no one list can be complete; any list (i.e., any
one-to-one function whose domain is N) will leave out some real numbers.
The left-out real numbers can appear on other lists, but that's beside the point.
Compare: if a thousand people show up to eat at a small restaurant, many
people will be left out. That's not to say that any individual person is incapable
of entering; it's just to say that not everyone can enter at once. No matter who
enters, others will be left out in the cold.
Exercise 1.4* For any set, A, the powerset of A is defined as the
set of all A's subsets. Write out the definition of the powerset of A
in the "{u : ...}" notation. Write out the powerset of {2, 4, 6} in the
braces notation (the one where you list each member of the set).
Exercise 1.5* Is N equinumerous with the set Z of all integers,
negative, positive, and zero: {..., −3, −2, −1, 0, 1, 2, 3, ...}?
Chapter 2
Propositional Logic
We begin with the simplest logic commonly studied: propositional logic.
Despite its simplicity, it has great power and beauty.
2.1 Grammar of PL
We're going to approach propositional logic by studying a formal language.
And the first step in the study of a formal language is always to rigorously define
the language's grammar.
If all you want to do is to use and understand the language of logic, you
needn't be so careful about grammar. For even without a precisely formulated
grammar, you can intuitively recognize that things like this make sense:
P→Q
R→(∼S→P)
whereas things like this do not:
PQR
(PQ(
P ⊕ Q
But to make any headway in metalogic, we will need more than an intuitive
understanding of what makes sense and what does not. We will need a precise
definition that has the consequence that only the strings of symbols in the first
group "make sense".
Grammatical strings of symbols (i.e., ones that make sense) are called
well-formed formulas, or "formulas" or "wffs" for short. We define these by
first carefully defining exactly which symbols are allowed to occur in wffs (the
"primitive vocabulary"), and second, carefully defining exactly which strings of
these symbols count as wffs. Here is the official definition; I'll explain what it
means in a moment:
Primitive vocabulary:
· Connectives:[1] →, ∼
· Sentence letters: P, Q, R, ..., with or without numerical subscripts
· Parentheses: ( , )
Definition of wff:
i) Every sentence letter is a PL-wff
ii) If φ and ψ are PL-wffs then (φ→ψ) and ∼φ are also PL-wffs
iii) Only strings that can be shown to be PL-wffs using i) and ii) are PL-wffs
(We allow numerical subscripts on sentence letters so that we don't run out
when constructing increasingly complex formulas. Since P_1, P_2, P_3, ... are all
sentence letters, we have infinitely many to choose from.)
We will be discussing a number of different logical systems throughout this
book, with differing grammars. What we have defined here is the notion of
a wff for one particular language, the language of PL. So strictly, we should
speak of PL-wffs, as the official definition does. But usually I'll just say "wff" if
there is no danger of ambiguity.
Here is how the definition works. Its core is clauses i) and ii) (they're
sometimes called the formation rules). Clause i) says that if you write down a
sentence letter on its own, that counts as a wff. So, for example, the sentence
letter P, all by itself, is a wff. (So is Q, so is P_147, and so on. Sentence letters are
often called "atomic" wffs, because they're not made up of smaller wffs.) Next,
clause ii) tells us how to build complex wffs from smaller wffs. It tells us that
we can do this in two ways. First, it says that if we already have a wff, then we
can put a ∼ in front of it to get another wff. (The resulting wff is often called a
"negation".) For example, since P is a wff (we just used clause i) to establish this),
then ∼P is also a wff. Second, clause ii) says that if we already have two wffs,
then we can put an → between them, enclose the whole thing in parentheses,
and we get another wff. (The resulting wff is often called a "conditional", whose
"antecedent" is the wff before the → and whose "consequent" is the wff after
the →.) For example, since we know that Q is a wff (clause i)), and that ∼P
is a wff (we just showed this a moment ago), we know that (Q→∼P) is also
a wff. This process can continue. For example, we could put an → between
the wff we just constructed and R (which we know to be a wff from clause i))
to construct another wff: ((Q→∼P)→R). By iterating this procedure, we can
demonstrate the wffhood of arbitrarily complex strings.
[1] Some books use ⊃ instead of →, or ¬ instead of ∼. Other common symbols include & or ·
for conjunction, | for disjunction, and ≡ for the biconditional.
Why the Greek letters in clause ii)? Well, it wouldn't be right to phrase it,
for example, in the following way: "if P and Q are wffs, then ∼P and (P→Q)
are also wffs". That would be too narrow, for it would apply only in the case of
the sentence letters P and Q. It wouldn't apply to any other sentence letters (it
wouldn't tell us that ∼R is a wff, for example), nor would it allow us to construct
negations and conditionals from complex wffs (it wouldn't tell us that (P→∼Q)
is a wff). We want to say that for any wff (not just P), if you put a ∼ in front
of it you get another wff; and for any two wffs (not just P and Q), if you put
an → between them (and enclose the result in parentheses) you get another
wff. That's why we use the metalinguistic variables "φ" and "ψ".[2] The practice
of using variables to express generality is familiar; we can say, for example,
"for any integer n, if n is even, then n + 2 is even as well". Just as "n" here
is a variable for numbers, metalinguistic variables are variables for linguistic
items. (We call them metalinguistic because they are variables we use in our
metalanguage, in order to talk generally about the object language, which is in
this case the formal language of propositional logic.)
What's the point of clause iii)? Clauses i) and ii) provide only sufficient
conditions for being a wff, and therefore do not on their own exclude nonsense
combinations of primitive vocabulary like "PQR", or even strings like "P ⊕ Q"
that include disallowed symbols. Clause iii) rules these strings out, since there
is no way to build up either of these strings from clauses i) and ii), in the way
that we built up the wff (∼P→(P→Q)).
Notice an interesting feature of this definition: the very expression we
are trying to define, "wff", appears on the right hand side of clause ii) of the
definition. In a sense, we are using the expression "wff" in its own definition. But
this "circularity" is benign, because the definition is recursive. A recursive (or
"inductive") definition of a concept F contains a circular-seeming clause, often
called the "inductive" clause, which specifies that if such-and-such objects are
F, then so-and-so objects are also F. But a recursive definition also contains
a "base clause", which specifies noncircularly that certain objects are F. Even
though the inductive clause rests the status of certain objects as being Fs on
whether certain other objects are Fs (whose status as Fs might in turn depend
on the status of still other objects), this eventually traces back to the base
clause, which secures F-hood all on its own. Thus, recursive definitions are
anchored by their base clauses; that's what distinguishes them from viciously
circular definitions. In the definition of wffs, clause i) is the base, and clause ii)
is the inductive clause. The wffhood of the string of symbols ((P→Q)→∼R),
for example, rests on the wffhood of (P→Q) and of ∼R by clause ii); and the
wffhood of these, in turn, rests on the wffhood of P, Q and R, again by clause
ii). But the wffhood of P, Q, and R doesn't rest on the wffhood of anything
else; clause i) specifies directly that all sentence letters are wffs.
[2] Strictly speaking clause iii) ought to be phrased using corner quotes; see exercise 1.2b.
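The recursive definition translates almost clause by clause into a recursive checker. The following Python sketch rests on some labeled assumptions: `~` and `->` are ASCII stand-ins for the two primitive connectives, a sentence letter is taken to be one capital letter optionally followed by digits (so `P147` plays the role of P with subscript 147), and the function name `is_wff` is made up for the example.

```python
import re

# Assumed concrete syntax for sentence letters: a capital letter plus optional digits.
SENTENCE_LETTER = re.compile(r"[A-Z][0-9]*")

def is_wff(s):
    """True iff s can be built up using clauses i) and ii)."""
    s = s.replace(" ", "")
    # Clause i) (base clause): every sentence letter is a wff.
    if SENTENCE_LETTER.fullmatch(s):
        return True
    # Clause ii), negation: "~" followed by a wff.
    if s.startswith("~"):
        return is_wff(s[1:])
    # Clause ii), conditional: "(" wff "->" wff ")".
    if s.startswith("(") and s.endswith(")"):
        inner = s[1:-1]
        depth = 0
        for i, ch in enumerate(inner):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif depth == 0 and inner.startswith("->", i):
                # The first arrow outside all inner parentheses must be the main one.
                return is_wff(inner[:i]) and is_wff(inner[i + 2:])
        return False
    # Clause iii): nothing else is a wff.
    return False

for s in ["P", "~P", "(Q->~P)", "((Q->~P)->R)", "P->Q", "PQR", "(PQ("]:
    print(s, is_wff(s))
```

The recursion bottoms out in the base clause, mirroring how the wffhood of a complex string rests on that of its parts. Note that `P->Q` is rejected, since the official definition requires outer parentheses.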
What happened to ∧, ∨, and ↔? The only connectives in our primitive
vocabulary are → and ∼; expressions like P∧Q, P∨Q, and P↔Q therefore do
not officially count as wffs. But we can still use ∧, ∨, and ↔ unofficially, since
we can define those connectives in terms of ∼ and →:
Definitions of ∧, ∨, and ↔:
· "φ∧ψ" is short for "∼(φ→∼ψ)"
· "φ∨ψ" is short for "∼φ→ψ"
· "φ↔ψ" is short for "(φ→ψ) ∧ (ψ→φ)" (which is in turn short for
  "∼((φ→ψ) → ∼(ψ→φ))")
So, whenever we subsequently write down an expression that includes one of
the defined connectives, we can regard it as being short for an expression that
includes only the official connectives, ∼ and →. (Why did we choose these
particular definitions? We'll show below that they generate the usual truth
conditions for ∧, ∨, and ↔.)
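The claim about truth conditions can be spot-checked by brute force over the four combinations of truth values. This Python sketch assumes the standard truth functions for ∼ and → (the ones tabulated in section 2.3); the function names `neg`, `arrow`, `conj`, `disj`, and `bicond` are illustrative only.

```python
from itertools import product

def neg(p):
    """Truth function assumed for ∼."""
    return not p

def arrow(p, q):
    """Truth function assumed for →: false only when p is true and q is false."""
    return (not p) or q

def conj(p, q):
    """φ∧ψ, unpacked as ∼(φ→∼ψ)."""
    return neg(arrow(p, neg(q)))

def disj(p, q):
    """φ∨ψ, unpacked as ∼φ→ψ."""
    return arrow(neg(p), q)

def bicond(p, q):
    """φ↔ψ, unpacked as (φ→ψ)∧(ψ→φ), with ∧ unpacked in turn."""
    return conj(arrow(p, q), arrow(q, p))

# Compare each defined connective with Python's own and / or / ==.
for p, q in product([True, False], repeat=2):
    assert conj(p, q) == (p and q)
    assert disj(p, q) == (p or q)
    assert bicond(p, q) == (p == q)
print("the defined connectives have the usual truth conditions")
```

This is a check on the four cases, not a replacement for the argument promised in the text.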
Our choice to begin with → and ∼ as our official connectives was somewhat
arbitrary. We could have started with ∼ and ∧, and defined the others as follows:
· "φ∨ψ" is short for "∼(∼φ∧∼ψ)"
· "φ→ψ" is short for "∼(φ∧∼ψ)"
· "φ↔ψ" is short for "(φ→ψ) ∧ (ψ→φ)"
And other alternate choices are possible. (Why did we choose only a small
number of primitive connectives, rather than including all of the usual connectives?
Because, as we will see, it makes metalogic easier.)
The definition of wff requires conditionals to have outer parentheses. P→Q,
for example, is officially not a wff; one must write (P→Q). But informally,
I'll often omit those outer parentheses. Similarly, I'll sometimes write square
brackets instead of the official round ones (for example, "[(P→Q)→R]→P")
to improve readability.
2.2 The semantic approach to logic
In the next section I will introduce a "semantics" for propositional logic, and
formal representations of logical truth and logical consequence of the semantic
(model-theoretic) variety (recall section 1.5).
On the semantic conception, logical consequence amounts to:
truth-preservation in virtue of the meanings of the logical constants. This slogan isn't
perfectly clear, but it does lead to a clearer thought: suppose we keep the
meanings of an argument's logical constants fixed, but vary everything else. If
the argument remains truth-preserving no matter how we vary everything else,
then it would seem to preserve truth in virtue of the meanings of its logical
constants. But what is to be included in "everything else"?
Here is an attractive picture of truth and meaning. The truth of a sentence
is determined by two factors, meaning and the world. A sentence's meaning
determines the conditions under which it's true: the ways the world would have
to be, in order for that sentence to be true. If the world is one of the ways picked
out by the sentence's truth conditions, then the sentence is true; otherwise, not.
Furthermore, a sentence's meaning is typically determined by the meanings of
its parts, both its logical constants and its nonlogical expressions. So: three
elements determine whether a sentence is true: the world, the meanings of its
nonlogical expressions, and the meanings of its logical constants.[3]
Now we can say what "everything else" means. Since we're holding
constant the third element (the meanings of logical constants), varying everything
else means varying the first two elements. The clearer thought about logical
consequence, then, is that if an argument remains truth-preserving no matter
how we vary i) the world, and ii) the meanings of nonlogical expressions, then
its premises logically imply its conclusion.
[3] And also a fourth element: its syntax. We hold this constant as well.
To turn this clearer, but still not perfectly clear, thought into a formal
approach, we need to do two things. First, we need mathematical representations
(I'll call them configurations) of variations of types i) and ii). A configuration
is a mathematical representation, both of the world and of the meanings of
nonlogical expressions. Second, we need to define the conditions under which
a sentence of the formal language in question is true in one of these
configurations. When we've done both things, we'll have a semantics for our formal
language.
One thing such a semantics is good for is giving a formalization, of the
semantic variety, of the notions of logical consequence and logical truth. This
formalization represents one formula as being a logical consequence of others
iff it is true in any configuration in which the latter formulas are true, and
represents a formula as being a logical truth iff it is true in all configurations.
But a semantics for a formal language is good for something else as well.
Defining configurations, and truth-in-a-configuration, can shed light on
meaning in natural and other interpreted languages.
Philosophers disagree over how to understand the notion of meaning in
general. But meaning surely has something to do with truth conditions, as in the
attractive picture above. If so, a formal semantics can shed light on meaning, if
the ways in which configurations render formal sentences true and false are
parallel to the ways in which the real world plus the meanings of words render
corresponding interpreted sentences true and false. Expressions in formal
languages are typically intended to represent bits of interpreted languages. The
PL logical constant ∼, for example, represents the English logical constant "not";
the sentence letters represent English declarative sentences, and so on. Part of
specifying a configuration will be specifying what the nonlogical expressions
mean in that configuration. And the definition of truth-in-a-configuration will
be constructed so that the contributions of the symbolic logical constants to
truth-conditions will mirror the contributions to truth conditions of the logical
constants that they represent.
2.3 Semantics of propositional logic
Our semantics for propositional logic is really just a more rigorous version
of the method of truth tables from introductory logic books. What a truth
table does is depict how the truth value of a given formula is determined by the
truth values of its sentence letters, for each possible combination of truth values
for its sentence letters. To do this nonpictorially, we need to define a notion
corresponding to "a possible combination of truth values for sentence letters":
Definition of interpretation: A PL-interpretation is a function ℐ, that
assigns to each sentence letter either 1 or 0
The numbers 1 and 0 are our truth values. (Sometimes the letters "T" and
"F" are used instead.) So an interpretation assigns truth values to sentence
letters. Instead of saying "let P be false, and Q be true", we can say: let ℐ be
an interpretation such that ℐ(P) = 0 and ℐ(Q) = 1. (As with the notion of a
wff, we will have different definitions of interpretations for different logical
systems, so strictly we must speak of PL-interpretations. But usually it will be
fine to speak simply of interpretations when it's clear which system is at issue.)
An interpretation assigns a truth value to each of the infinitely many sentence
letters. To picture one such interpretation we could begin as follows:
ℐ(P) = 1
ℐ(Q) = 1
ℐ(R) = 0
ℐ(P_1) = 0
ℐ(P_2) = 1
but since there are infinitely many sentence letters, the picture could not be
completed. And this is just one interpretation among infinitely many; any other
combination of assigned 1s and 0s to the infinitely many sentence letters counts
as a new interpretation.
Once we settle what truth values a given interpretation assigns to the
sentence letters, the truth values of complex sentences containing those sentence
letters are thereby fixed. The usual, informal, method for showing exactly how
those truth values are fixed is by giving truth tables for each connective. The
standard truth tables for the → and ∼ are the following:[4]
    → | 1  0          ∼ |
    --+------         --+---
    1 | 1  0          1 | 0
    0 | 1  1          0 | 1
What we will do, instead, is write out a formal definition of a function (the
valuation function) that assigns truth values to complex sentences as a function
of the truth values of their sentence letters, i.e., as a function of a given
interpretation ℐ. But the idea is the same as the truth tables: truth tables are
really just pictures of the definition of a valuation function.
Definition of valuation: For any PL-interpretation, ℐ, the PL-valuation
for ℐ, V_ℐ, is defined as the function that assigns to each wff either 1 or 0, and
which is such that, for any sentence letter α and any wffs φ and ψ:
V_ℐ(α) = ℐ(α)
V_ℐ(φ→ψ) = 1 iff either V_ℐ(φ) = 0 or V_ℐ(ψ) = 1
V_ℐ(∼φ) = 1 iff V_ℐ(φ) = 0
Intuitively: we begin by choosing an interpretation function, which fixes the
truth values for sentence letters. Then the valuation function assigns
corresponding truth values to complex sentences depending on what connectives
they're built up from: a negation is true iff the negated formula is false, and a
conditional is true when its antecedent is false or its consequent is true.
We have here another recursive definition: the valuation function's values
for complex formulas are determined by its values for smaller formulas; and this
procedure bottoms out in the values for sentence letters, which are determined
directly by the interpretation function ℐ.
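The recursive character of the valuation function can be seen in a small Python sketch. Several things here are assumptions made for the illustration: wffs are represented as nested tuples (`("not", φ)` for a negation, `("if", φ, ψ)` for a conditional, a plain string for a sentence letter), an interpretation is a dictionary covering only the finitely many letters that actually occur, and the name `valuation` is made up.

```python
def valuation(interp, wff):
    """Return the truth value (1 or 0) of wff under the interpretation interp."""
    # Base clause: a sentence letter gets whatever the interpretation assigns.
    if isinstance(wff, str):
        return interp[wff]
    # Negation: 1 iff the negated formula gets 0.
    if wff[0] == "not":
        return 1 if valuation(interp, wff[1]) == 0 else 0
    # Conditional: 1 iff the antecedent gets 0 or the consequent gets 1.
    if wff[0] == "if":
        _, phi, psi = wff
        return 1 if valuation(interp, phi) == 0 or valuation(interp, psi) == 1 else 0
    raise ValueError(f"not a wff in the assumed tuple format: {wff!r}")

# A finite fragment of the interpretation pictured in the text.
interp = {"P": 1, "Q": 1, "R": 0}

print(valuation(interp, ("if", "P", "R")))                        # P→R
print(valuation(interp, ("not", "R")))                            # ∼R
print(valuation(interp, ("if", ("if", "Q", ("not", "P")), "R")))  # (Q→∼P)→R
```

Each recursive call is made on a strictly smaller formula, so the computation always bottoms out in the values the interpretation assigns to sentence letters.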
Notice how the definition of the valuation function contains the English
logical connectives "either…or", and "iff". I used these English connectives
rather than the logical connectives ∨ and ↔, because at that point I was not
writing down wffs of the language of study (in this case, the language of
propositional logic). I was rather using sentences of English (our metalanguage, the
informal language we're using to discuss the formal language of propositional
logic) to construct my definition of the valuation function. My definition
needed to employ the logical notions of disjunction and biconditionalization,
the English words for which are "either…or" and "iff".
[4] The → table, for example, shows what truth value φ→ψ takes on depending on the truth
values of its parts. Rows correspond to truth values for φ, columns to truth values for ψ. Thus,
to ascertain the truth value of φ→ψ when φ is 1 and ψ is 0, we look in the 1 row and the 0
column. The listed value there is 0: the conditional is false in this case. The ∼ table has only
one "input-column" and one "result-column" because ∼ is a one-place connective.
One might again worry that something circular is going on. We defined
the symbols for disjunction and biconditionalization, $\vee$ and $\leftrightarrow$, in terms of
$\sim$ and $\to$ in section 2.1, and now we've defined the valuation function in
terms of disjunction and biconditionalization. So haven't we given a circular
definition of disjunction and biconditionalization? No. When we define the
valuation function, we're not trying to define logical concepts such as negation,
conjunction, disjunction, conditionalization, biconditionalization, and
so on, at all. Reductive definition of these very basic concepts is probably
impossible (though one can define some of them in terms of the others). What
we are doing is starting with the assumption that we already understand the
logical concepts, and then using those concepts to provide a semantics for a
formal language. This can be put in terms of object- and meta-language: we use
metalanguage connectives, such as "iff" and "or", which we simply take ourselves
to understand, to provide a semantics for the object language connectives
$\sim$ and $\to$.
An elementary fact will be important in what follows: for every wff $\phi$ and
every PL-interpretation $\mathcal{I}$, $V_{\mathcal{I}}(\phi)$ is either 0 or 1, but not both.[5] Equivalently:
a formula has one of the truth values iff it lacks the other. That this is a fact
is built into the definition of the valuation function for PL. First of all, $V_{\mathcal{I}}$ is
defined as a function, and so it can't assign both the number 0 and the number 1
to a wff. And second, $V_{\mathcal{I}}$ is defined as a function that assigns either 1 or 0 to each
wff (thus, in the case of the second and third clauses, if a complex wff fails the
condition for getting assigned 1, it automatically gets assigned 0).
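
[5] This fact won't hold for all the valuation functions we'll consider in this book; in chapter
3 we will consider "trivalent" semantic systems in which some formulas are assigned neither 1
nor 0.

Back to the definition of the valuation function. The definition applies only
to official wffs, which can contain only the primitive connectives $\to$ and $\sim$. But
sentences containing $\wedge$, $\vee$, and $\leftrightarrow$ are abbreviations for official wffs, and are
therefore indirectly governed by the definition. In fact, given the abbreviations
defined in section 2.1, we can show that the definition assigns the intuitively
correct truth values to sentences containing $\wedge$, $\vee$, and $\leftrightarrow$. In particular, we can
show that for any PL-interpretation $\mathcal{I}$, and any wffs $\phi$ and $\psi$,
\[
\begin{aligned}
V_{\mathcal{I}}(\phi \wedge \psi) &= 1 \text{ iff } V_{\mathcal{I}}(\phi) = 1 \text{ and } V_{\mathcal{I}}(\psi) = 1 \\
V_{\mathcal{I}}(\phi \vee \psi) &= 1 \text{ iff either } V_{\mathcal{I}}(\phi) = 1 \text{ or } V_{\mathcal{I}}(\psi) = 1 \\
V_{\mathcal{I}}(\phi \leftrightarrow \psi) &= 1 \text{ iff } V_{\mathcal{I}}(\phi) = V_{\mathcal{I}}(\psi)
\end{aligned}
\]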
I'll show that the first statement is true here; the others are exercises for the
reader. I'll write out this proof in excessive detail, to make it clear exactly how
the reasoning works.
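Example 2.1: Proof that $\wedge$ gets the right truth condition. We are to show that
for any wffs $\phi$ and $\psi$, and any PL-interpretation $\mathcal{I}$, $V_{\mathcal{I}}(\phi\wedge\psi) = 1$ iff
$V_{\mathcal{I}}(\phi) = 1$ and $V_{\mathcal{I}}(\psi) = 1$. So, let $\phi$ and $\psi$ be any wffs, and let $\mathcal{I}$ be any
PL-interpretation; we must show that: $V_{\mathcal{I}}(\phi\wedge\psi) = 1$ iff $V_{\mathcal{I}}(\phi) = 1$ and
$V_{\mathcal{I}}(\psi) = 1$. The expression $\phi\wedge\psi$ is an abbreviation for the expression
$\sim(\phi\to\sim\psi)$. So what we must show is this: $V_{\mathcal{I}}(\sim(\phi\to\sim\psi)) = 1$ iff
$V_{\mathcal{I}}(\phi) = 1$ and $V_{\mathcal{I}}(\psi) = 1$.
Now, in order to show that a statement A holds iff a statement B holds,
we must first show that if A holds, then B holds; then we must show that if B
holds then A holds. So, first we must establish that if $V_{\mathcal{I}}(\sim(\phi\to\sim\psi)) = 1$, then
$V_{\mathcal{I}}(\phi) = 1$ and $V_{\mathcal{I}}(\psi) = 1$. So, we begin by assuming that $V_{\mathcal{I}}(\sim(\phi\to\sim\psi)) = 1$,
and we then attempt to show that $V_{\mathcal{I}}(\phi) = 1$ and $V_{\mathcal{I}}(\psi) = 1$. Well, since
$V_{\mathcal{I}}(\sim(\phi\to\sim\psi)) = 1$, by definition of the valuation function, clause for $\sim$, we
know that $V_{\mathcal{I}}(\phi\to\sim\psi) = 0$. Now, we earlier noted the principle that a wff
has one of the two truth values iff it lacks the other; thus, $V_{\mathcal{I}}(\phi\to\sim\psi)$ is not 1.
(Henceforth I won't mention it when I make use of this principle.) But then,
by the clause in the definition of $V_{\mathcal{I}}$ for the $\to$, we know that it's not the case
that: either $V_{\mathcal{I}}(\phi) = 0$ or $V_{\mathcal{I}}(\sim\psi) = 1$. So, $V_{\mathcal{I}}(\phi) = 1$ and $V_{\mathcal{I}}(\sim\psi) = 0$. From
the latter, by the clause for $\sim$, we know that $V_{\mathcal{I}}(\psi) = 1$. So now we have what
we wanted: $V_{\mathcal{I}}(\phi) = 1$ and $V_{\mathcal{I}}(\psi) = 1$.
Next we must show that if $V_{\mathcal{I}}(\phi) = 1$ and $V_{\mathcal{I}}(\psi) = 1$, then
$V_{\mathcal{I}}(\sim(\phi\to\sim\psi)) = 1$. This is sort of like undoing the previous half. Suppose that
$V_{\mathcal{I}}(\phi) = 1$ and $V_{\mathcal{I}}(\psi) = 1$. Since $V_{\mathcal{I}}(\psi) = 1$, by the clause for $\sim$,
$V_{\mathcal{I}}(\sim\psi) = 0$; but now since $V_{\mathcal{I}}(\phi) = 1$ and $V_{\mathcal{I}}(\sim\psi) = 0$, by the clause for $\to$
we know that $V_{\mathcal{I}}(\phi\to\sim\psi) = 0$; then by the clause for $\sim$, we know that
$V_{\mathcal{I}}(\sim(\phi\to\sim\psi)) = 1$, which is what we were trying to show. $\blacksquare$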
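The truth condition for conjunction can also be spot-checked mechanically. The sketch below is only a sanity check, not a replacement for the metalogic proof: it unpacks the abbreviation as $\sim(\phi\to\sim\psi)$ and runs through the four combinations of values for two sentence letters. The tuple encoding and the helper names `valuation`, `conj`, and `check_conjunction` are assumptions made for illustration.

```python
from itertools import product

def valuation(interp, wff):
    """V_I(wff) for wffs built from sentence letters, ("not", A) and ("->", A, B)."""
    if isinstance(wff, str):
        return interp[wff]
    if wff[0] == "not":
        return 1 - valuation(interp, wff[1])
    return 1 if valuation(interp, wff[1]) == 0 or valuation(interp, wff[2]) == 1 else 0

def conj(phi, psi):
    """Unpack the abbreviation: the conjunction of phi and psi is ~(phi -> ~psi)."""
    return ("not", ("->", phi, ("not", psi)))

def check_conjunction():
    """True iff V(P & Q) = 1 exactly when V(P) = 1 and V(Q) = 1, in all four cases."""
    for p, q in product((1, 0), repeat=2):
        interp = {"P": p, "Q": q}
        left = valuation(interp, conj("P", "Q")) == 1
        right = valuation(interp, "P") == 1 and valuation(interp, "Q") == 1
        if left != right:
            return False
    return True

print(check_conjunction())  # prints True
```

The check uses the sentence letters P and Q in place of arbitrary wffs. That suffices here only because the value of the unpacked formula depends on nothing but the values of its two parts, which is what the proof above establishes in general.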
Example 2.1 is the first of many metalogic proofs we will be constructing in this
book. (The symbol $\blacksquare$ marks the end of such a proof.) It is an informal argument,
phrased in the metalanguage, which establishes a fact about a formal language.
As noted in section 1.3, metalogic proofs must be distinguished from proofs
in formal systems: from the derivations and truth trees of introductory logic,
and from the axiomatic and sequent proofs we will introduce below. Although
there are no explicit guidelines for how to present metalogic proofs, they are
generally given in a style that is common within mathematics. Constructing
such proofs can at first be difficult. I offer the following pointers. First, keep in
mind exactly what you are trying to prove. (In your first few proofs, it might
be a good idea to begin by writing down: "what I am trying to prove is...".)
Second, keep in mind the definitions of all the relevant technical terms (the
definition of $\phi\wedge\psi$, for instance). Third, keep in mind exactly what you are
given. (In the preceding, for example, the important bit of information you are
given is the definition of the valuation function; that definition tells you the
conditions under which valuation functions assign 1s and 0s to negations and
conditionals.) Fourth, keep in mind the canonical methods for establishing
claims of various forms. (For example, if you want to show that a certain claim
holds for every two wffs, begin with "let $\phi$ and $\psi$ be any wffs"; show that the
claim holds for $\phi$ and $\psi$; and conclude that the claim holds for all pairs of
wffs. If you want to establish something of the form "if A, then B", begin by
saying "suppose A", go on to reason your way to "B", and conclude: "and so, if
A then B." Often it can be helpful to reason by reductio ad absurdum: assume
the opposite of the assertion you are trying to prove, reason your way to a
contradiction, and conclude that the assertion is true since its opposite leads to
contradiction.) Fifth: practice, practice, practice. As we progress, I'll gradually
speed up the presentation of such proofs, omitting more and more details when
they seem obvious. You should feel free to do the same; but it may be best
to begin by constructing proofs very deliberately, so that later on you know
exactly what details you are omitting.
Let's reflect on what we've done so far. We have defined the notion of a
PL-interpretation, which assigns 1s and 0s to sentence letters of the formal language
of propositional logic. And we have also defined, for any PL-interpretation, a
corresponding PL-valuation function, which extends the interpretation's
assignment of 1s and 0s to complex wffs of PL. Note that we have been informally
speaking of these assignments as assignments of truth values. That's because
the assignment of 1s and 0s to complex wffs mirrors the way complex natural
language sentences get their truth values, as a function of the truth values of
their parts. For example, the $\sim$ of propositional logic is supposed to represent
the English phrase "it is not the case that". Accordingly, just as an English
sentence "It is not the case that $\phi$" is true iff $\phi$ is false, one of our valuation
functions assigns 1 to $\sim\phi$ iff it assigns 0 to $\phi$. But strictly, it's probably best not
to think of wffs of our formal language as genuinely having truth values. They
don't genuinely have meanings after all. Our assignments of 1 and 0 represent
the having of truth values.
A semantics for a formal language, recall, defines two things: configurations
and truth-in-a-configuration. In the propositional logic semantics we have
laid out, the configurations are the interpretation functions. A configuration is
supposed to represent a way for the world to be, plus the meanings of nonlogical
expressions. The only nonlogical expressions in PL are the sentence letters;
and, for the purposes of PL anyway, their meanings can be represented simply
as truth-values. And once we've specified a truth-value for each sentence letter,
we've already represented the world as much as we can in PL. Thus,
PL-interpretations are appropriate configurations. As for truth-in-a-configuration,
this is accomplished by the valuation functions. For any PL-interpretation,
its corresponding valuation function specifies, for each complex wff, what
truth value that wff has in that interpretation. Thus, for each wff ($\phi$) and
each configuration ($\mathcal{I}$), we have specified the truth value of that wff in that
configuration ($V_{\mathcal{I}}(\phi)$).
Onward. We are now in a position to define the semantic versions of the
notions of logical truth and logical consequence for propositional logic. The
semantic notion of a logical truth is that of a valid formula:
Definition of validity: A wff $\phi$ is PL-valid iff for every PL-interpretation,
$\mathcal{I}$, $V_{\mathcal{I}}(\phi) = 1$
We write "$\vDash_{PL} \phi$" for "$\phi$ is PL-valid". (When it's obvious which system
we're talking about, we'll omit the subscript on $\vDash$.) The valid formulas of
propositional logic are also called tautologies.
As for logical consequence, the semantic version of this notion is that of a
single formula's being a semantic consequence of a set of formulas:
Definition of semantic consequence: A wff $\phi$ is a PL-semantic consequence
of a set of wffs $\Gamma$ iff for every PL-interpretation, $\mathcal{I}$, if $V_{\mathcal{I}}(\gamma) = 1$ for each
$\gamma$ such that $\gamma \in \Gamma$, then $V_{\mathcal{I}}(\phi) = 1$
That is, $\phi$ is a PL-semantic consequence of $\Gamma$ iff $\phi$ is true whenever each
member of $\Gamma$ is true. We write "$\Gamma \vDash_{PL} \phi$" for "$\phi$ is a PL-semantic consequence
of $\Gamma$". (As usual we'll often omit the "PL" subscript; and further, let's improve
readability by writing "$\phi_1, \ldots, \phi_n \vDash \psi$" instead of "$\{\phi_1, \ldots, \phi_n\} \vDash \psi$". That is,
let's drop the set braces when it's convenient to do so.)
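Both definitions quantify over every PL-interpretation, but for a given finite list of wffs only the values assigned to the letters occurring in them matter. Taking that fact for granted (it is not proved here), the two notions can be sketched in Python as follows. The tuple encoding and the names `letters`, `interpretations`, `is_valid`, and `is_consequence` are illustrative assumptions, and the sketch handles only finite premise sets.

```python
from itertools import product

def valuation(interp, wff):
    """V_I(wff) for wffs built from sentence letters, ("not", A) and ("->", A, B)."""
    if isinstance(wff, str):
        return interp[wff]
    if wff[0] == "not":
        return 1 - valuation(interp, wff[1])
    return 1 if valuation(interp, wff[1]) == 0 or valuation(interp, wff[2]) == 1 else 0

def letters(wff):
    """The set of sentence letters occurring in wff."""
    if isinstance(wff, str):
        return {wff}
    return set().union(*(letters(part) for part in wff[1:]))

def interpretations(wffs):
    """Every assignment of 1s and 0s to the letters occurring in the given wffs."""
    names = sorted(set().union(*(letters(w) for w in wffs)))
    for values in product((1, 0), repeat=len(names)):
        yield dict(zip(names, values))

def is_valid(wff):
    """PL-validity: value 1 under every interpretation."""
    return all(valuation(i, wff) == 1 for i in interpretations([wff]))

def is_consequence(premises, wff):
    """PL-semantic consequence: wff gets 1 whenever every premise gets 1."""
    return all(valuation(i, wff) == 1
               for i in interpretations(list(premises) + [wff])
               if all(valuation(i, g) == 1 for g in premises))

print(is_valid(("->", "P", "P")))                    # prints True
print(is_valid(("->", "P", "Q")))                    # prints False
print(is_consequence(["P", ("->", "P", "Q")], "Q"))  # prints True
```

With an empty premise list, `is_consequence` agrees with `is_valid`, which matches the way validity is the special case of consequence from no premises.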
A related concept is that of semantic equivalence. Formulas $\phi$ and $\psi$ are said to
be (PL-) semantically equivalent iff each (PL-) semantically implies the other.
For example, $\phi\to\psi$ and $\sim\psi\to\sim\phi$ are semantically equivalent. Notice that
we could just as well have worded the definition thus: semantically equivalent
formulas are those that have exactly the same truth value in every interpretation.
Thus, there is a sense in which semantically equivalent formulas "say the same
thing": they have the same truth-conditional content.
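The second wording of the definition (same truth value in every interpretation) lends itself to a direct check. The following Python sketch is illustrative only; the tuple encoding and the names `letters` and `equivalent` are assumptions, and the check again relies on the fact that only the letters occurring in the two formulas matter.

```python
from itertools import product

def valuation(interp, wff):
    """V_I(wff) for wffs built from sentence letters, ("not", A) and ("->", A, B)."""
    if isinstance(wff, str):
        return interp[wff]
    if wff[0] == "not":
        return 1 - valuation(interp, wff[1])
    return 1 if valuation(interp, wff[1]) == 0 or valuation(interp, wff[2]) == 1 else 0

def letters(wff):
    """The set of sentence letters occurring in wff."""
    if isinstance(wff, str):
        return {wff}
    return set().union(*(letters(part) for part in wff[1:]))

def equivalent(phi, psi):
    """True iff phi and psi receive the same value under every interpretation."""
    names = sorted(letters(phi) | letters(psi))
    for values in product((1, 0), repeat=len(names)):
        interp = dict(zip(names, values))
        if valuation(interp, phi) != valuation(interp, psi):
            return False
    return True

conditional = ("->", "P", "Q")
contrapositive = ("->", ("not", "Q"), ("not", "P"))
converse = ("->", "Q", "P")
print(equivalent(conditional, contrapositive))  # prints True
print(equivalent(conditional, converse))        # prints False
```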
Just as it's probably best not to think of sentences of our formal language
as genuinely having truth values, it's probably best not to think of them as
genuinely being logically true or genuinely standing in the relation of logical
consequence. The notions we have just defined, of PL-validity and
PL-semantic-consequence, are just formal representations of logical truth and
logical consequence (semantically conceived). Indeed, the definitions we have
given are best thought of as representing, rather than really being, a semantics.
Further, when we get to formal provability, the definitions we will give are
probably best thought of as representing facts about provability, rather than
themselves defining a kind of provability. But forgive me if I sometimes speak
loosely as if formal sentences really do have these features, rather than just
representing them.
By the way, we can now appreciate why it was important to set up our
grammar so carefully. The valuation function assigns truth values to complex
formulas based on their form. One clause in its definition kicks in for atomic
wffs, another clause kicks in for wffs of the form $\sim\phi$, and a third kicks in for
wffs of the form $\phi\to\psi$. This works only if each wff has exactly one of these
three forms; only a precise definition of wff guarantees this.
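Exercise 2.1 Given the definitions of the defined symbols $\vee$ and
$\leftrightarrow$, show that for any PL-interpretation, $\mathcal{I}$, and any wffs $\phi$ and $\psi$,
\[
\begin{aligned}
V_{\mathcal{I}}(\phi \vee \psi) &= 1 \text{ iff either } V_{\mathcal{I}}(\phi) = 1 \text{ or } V_{\mathcal{I}}(\psi) = 1 \\
V_{\mathcal{I}}(\phi \leftrightarrow \psi) &= 1 \text{ iff } V_{\mathcal{I}}(\phi) = V_{\mathcal{I}}(\psi)
\end{aligned}
\]
2.4 Establishing validity and invalidity in PL
Now that we have set up a semantics, we can establish semantic facts about
particular wffs. For example:
Example 2.2: Proof that $\vDash_{PL} (P\to Q)\to(\sim Q\to\sim P)$. To show a wff to be
PL-valid, we must show that it is true in every PL-interpretation. So, let $\mathcal{I}$ be any
PL-interpretation, and suppose for reductio that $V_{\mathcal{I}}((P\to Q)\to(\sim Q\to\sim P)) = 0$.
This assumption leads to a contradiction, as the following argument shows:
i) $V_{\mathcal{I}}((P\to Q)\to(\sim Q\to\sim P)) = 0$ (reductio assumption)
ii) So, by the definition of a valuation function, clause for the $\to$, $V_{\mathcal{I}}(P\to Q) = 1$ and
iii) $V_{\mathcal{I}}(\sim Q\to\sim P) = 0$
iv) Given iii), again by the clause for the $\to$, $V_{\mathcal{I}}(\sim Q) = 1$ and
v) $V_{\mathcal{I}}(\sim P) = 0$
vi) Given iv), by the clause for the $\sim$, $V_{\mathcal{I}}(Q) = 0$.
vii) Similarly, v) tells us that $V_{\mathcal{I}}(P) = 1$.
viii) From vii) and vi), by the clause for the $\to$ we know that $V_{\mathcal{I}}(P\to Q) = 0$,
which contradicts line ii).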
Here again we have given a metalogic proof: an informal mathematical
argument establishing a fact about one of our formal languages. (The conclusion
of the argument was not sufficiently impressive to merit the $\blacksquare$ flourish at the
end.) There is nothing special about the form that this argument took. One
could just as well have established the fact that $\vDash_{PL} (P\to Q)\to(\sim Q\to\sim P)$ by
constructing a truth table, as one does in introductory textbooks, for such a
construction is in effect a pictorial metalogic proof that a certain formula is
PL-valid.
Arguments establishing facts of semantic consequence are parallel (in this
example we will proceed more briskly):
Example 2.3: Proof that $P\to(Q\to R) \vDash Q\to(P\to R)$. We must show that in
any PL-interpretation in which $P\to(Q\to R)$ is true, $Q\to(P\to R)$ is true as well.
Let $\mathcal{I}$ be any PL-interpretation; we then reason as follows:
i) Suppose for reductio that $V_{\mathcal{I}}(P\to(Q\to R)) = 1$ but
ii) $V_{\mathcal{I}}(Q\to(P\to R)) = 0$. (From now on we'll omit the subscripted $\mathcal{I}$.)
iii) line ii) tells us that $V(Q) = 1$ and $V(P\to R) = 0$, and hence that $V(R) = 0$.
So $V(Q\to R) = 0$.
iv) Since $V(P\to R) = 0$ (line iii)), $V(P) = 1$. So then, by iii), $V(P\to(Q\to R)) = 0$.
This contradicts i).
One can also establish facts of invalidity and failures of semantic
consequence:
Example 2.4: Proof that $\nvDash ((P\wedge R)\to Q)\to(R\to Q)$. To be valid is to be true
in all interpretations; so to be invalid (i.e., not valid) is to be false in at least one
interpretation. So all we must do is find one interpretation in which this wff
is false. Let $\mathcal{I}$ be an interpretation such that $\mathcal{I}(R) = 1$ and $\mathcal{I}(P) = \mathcal{I}(Q) = 0$.
Then $V_{\mathcal{I}}(P\wedge R) = 0$ (example 2.1), so $V_{\mathcal{I}}((P\wedge R)\to Q) = 1$. But since $V_{\mathcal{I}}(R) = 1$
and $V_{\mathcal{I}}(Q) = 0$, $V_{\mathcal{I}}(R\to Q) = 0$. So $V_{\mathcal{I}}(((P\wedge R)\to Q)\to(R\to Q)) = 0$.
Example 2.5: Proof that $P\to R \nvDash (P\vee Q)\to R$. Consider a PL-interpretation
in which $P$ and $R$ are false, and in which $Q$ is true. $P\to R$ is then true (since its
antecedent is false), but $P\vee Q$ is true (since $Q$ is true; see exercise 2.1) while $R$
is false, so $(P\vee Q)\to R$ is false.
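Counterexamples of this kind can also be found by exhaustive search. The sketch below lists every assignment to the relevant letters under which the premises get 1 and the conclusion gets 0. It assumes the abbreviations $\phi\wedge\psi$ for $\sim(\phi\to\sim\psi)$ and $\phi\vee\psi$ for $\sim\phi\to\psi$; the tuple encoding and the names `conj`, `disj`, and `countermodels` are likewise illustrative assumptions.

```python
from itertools import product

def valuation(interp, wff):
    """V_I(wff) for wffs built from sentence letters, ("not", A) and ("->", A, B)."""
    if isinstance(wff, str):
        return interp[wff]
    if wff[0] == "not":
        return 1 - valuation(interp, wff[1])
    return 1 if valuation(interp, wff[1]) == 0 or valuation(interp, wff[2]) == 1 else 0

def letters(wff):
    """The set of sentence letters occurring in wff."""
    if isinstance(wff, str):
        return {wff}
    return set().union(*(letters(part) for part in wff[1:]))

def conj(phi, psi):
    return ("not", ("->", phi, ("not", psi)))   # assumed abbreviation for conjunction

def disj(phi, psi):
    return ("->", ("not", phi), psi)            # assumed abbreviation for disjunction

def countermodels(premises, wff):
    """Assignments making every premise 1 and wff 0 (none exist iff consequence holds)."""
    names = sorted(set().union(letters(wff), *(letters(g) for g in premises)))
    found = []
    for values in product((1, 0), repeat=len(names)):
        interp = dict(zip(names, values))
        if all(valuation(interp, g) == 1 for g in premises) and valuation(interp, wff) == 0:
            found.append(interp)
    return found

example_2_4 = ("->", ("->", conj("P", "R"), "Q"), ("->", "R", "Q"))
example_2_5 = ("->", disj("P", "Q"), "R")
print(countermodels([], example_2_4))
print(countermodels([("->", "P", "R")], example_2_5))
```

The interpretation used in example 2.4 (R true, P and Q false) appears in the first list, and the one used in example 2.5 (Q true, P and R false) appears in the second.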
I'll end this section by noting a certain fact about validity in propositional
logic: it is mechanically "decidable". That is, a computer program could be
written that is capable of telling, for any given formula, whether or not that
formula is valid. The program would simply construct a complete truth table
for the formula in question. To give a rigorous proof of this fact would take us
too far afield, since we would need to give a rigorous definition of what counts
as a computer program, but the point is intuitively clear.
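For concreteness, here is one way such a program might look. This Python sketch builds the complete truth table for a formula and declares the formula valid just in case every row ends in 1. It is an illustration of the idea rather than a rigorous treatment of decidability; the tuple encoding and the names `truth_table` and `decide_validity` are assumptions.

```python
from itertools import product

def valuation(interp, wff):
    """V_I(wff) for wffs built from sentence letters, ("not", A) and ("->", A, B)."""
    if isinstance(wff, str):
        return interp[wff]
    if wff[0] == "not":
        return 1 - valuation(interp, wff[1])
    return 1 if valuation(interp, wff[1]) == 0 or valuation(interp, wff[2]) == 1 else 0

def letters(wff):
    """The set of sentence letters occurring in wff."""
    if isinstance(wff, str):
        return {wff}
    return set().union(*(letters(part) for part in wff[1:]))

def truth_table(wff):
    """One row per assignment to the letters of wff: (letter values, value of wff)."""
    names = sorted(letters(wff))
    rows = []
    for values in product((1, 0), repeat=len(names)):
        rows.append((values, valuation(dict(zip(names, values)), wff)))
    return names, rows

def decide_validity(wff):
    """Valid iff the final column of the complete truth table contains only 1s."""
    _, rows = truth_table(wff)
    return all(value == 1 for _, value in rows)

contraposition = ("->", ("->", "P", "Q"), ("->", ("not", "Q"), ("not", "P")))
names, rows = truth_table(contraposition)
print(" ".join(names), "| value")
for values, value in rows:
    print(" ".join(str(v) for v in values), "|", value)
print(decide_validity(contraposition))  # prints True
```

A formula with n distinct sentence letters has a table of 2^n rows, so the procedure always terminates, though the table grows quickly with n.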
Exercise 2.2 Establish each of the following facts:
a) $\vDash [P\wedge(Q\vee R)] \to [(P\wedge Q)\vee(P\wedge R)]$
b) $(P\leftrightarrow Q) \vee (R\leftrightarrow S) \nvDash P\vee R$
c) $\sim(P\wedge Q)$ and $\sim P\vee\sim Q$ are semantically equivalent.
2.4.1 Schemas, validity, and invalidity
In example 2.2 of the previous section we showed a particular wff to be valid:
$(P\to Q)\to(\sim Q\to\sim P)$. But the proof of this fact depended only on the fact
that the wff had the form $(\phi\to\psi)\to(\sim\psi\to\sim\phi)$. We could just as easily have
argued that any wff of that form is valid, simply by replacing each reference
to $P$ in the argument with a reference to $\phi$, and each reference to $Q$ with a
reference to $\psi$. The conclusion of this argument would be: "for any wffs $\phi$
and $\psi$, $\vDash (\phi\to\psi)\to(\sim\psi\to\sim\phi)$". This conclusion is more general than, and
so more useful than, the conclusion of example 2.2. Similarly, instead of
showing particular wffs to semantically imply one another (as in example 2.3),
we can show types of wffs to semantically imply one another (we can show,
for example, that $\phi\to(\psi\to\chi) \vDash \psi\to(\phi\to\chi)$, for any wffs $\phi$, $\psi$, and $\chi$). And
instead of showing particular wffs to be semantically equivalent, we can show
types of wffs to be semantically equivalent.
It's tempting to think of general proofs of this sort as establishing facts about
schemas: strings like "$(\phi\to\psi)\to(\sim\psi\to\sim\phi)$". Once the proof of example 2.2 has
been appropriately generalized, it's tempting to think of it as showing that the
schema $(\phi\to\psi)\to(\sim\psi\to\sim\phi)$ is valid. But strictly speaking such talk is incorrect
since the notion of validity does not apply to schemas. Validity is defined in
terms of truth in interpretations, and truth in interpretations is defined only for
wffs. And schemas are not wffs, since schemas contain metalinguistic variables
like $\phi$, $\psi$, and $\chi$, which are not part of the primitive vocabulary of the language
of propositional logic. Rather, schemas are "blueprints", which become wffs
when we substitute particular wffs in for the metalinguistic variables.
Now, a schema can have a property that's closely related to validity. The
schema $(\phi\to\psi)\to(\sim\psi\to\sim\phi)$ has the following feature: all of its instances (that
is, all formulas resulting from replacing $\phi$ and $\psi$ in the schema with wffs) are
valid. So one can informally speak of schemas as being valid when they have
this closely related property. But we must take great care when speaking of the
invalidity of schemas. One might think to say that the schema $\phi\to\psi$ is "invalid".
But what would that mean? If it means that every instance of the schema is
invalid, then the statement would be wrong. The wffs $P\to P$ and $P\to(Q\to Q)$,
for example, are instances of $\phi\to\psi$, but each is valid. What's true about the
schema $\phi\to\psi$ is that some of its instances are invalid (for example $P\to Q$).
So when dealing with schemas, it will often be of interest to ascertain
whether each instance of the schema is valid; it will rarely (if ever) be of interest
to ascertain whether each instance of the schema is invalid.
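The contrast between a schema and its instances can be made concrete. In the Python sketch below, a schema is a tuple containing placeholder strings, and `instantiate` substitutes wffs for the placeholders. The convention that lowercase strings such as "phi" are metalinguistic variables while uppercase strings are sentence letters, and all the helper names, are assumptions made for illustration.

```python
from itertools import product

def valuation(interp, wff):
    """V_I(wff) for wffs built from sentence letters, ("not", A) and ("->", A, B)."""
    if isinstance(wff, str):
        return interp[wff]
    if wff[0] == "not":
        return 1 - valuation(interp, wff[1])
    return 1 if valuation(interp, wff[1]) == 0 or valuation(interp, wff[2]) == 1 else 0

def letters(wff):
    if isinstance(wff, str):
        return {wff}
    return set().union(*(letters(part) for part in wff[1:]))

def is_valid(wff):
    names = sorted(letters(wff))
    return all(valuation(dict(zip(names, values)), wff) == 1
               for values in product((1, 0), repeat=len(names)))

def instantiate(schema, substitution):
    """Replace each metalinguistic variable in schema by the wff assigned to it."""
    if isinstance(schema, str):
        return substitution.get(schema, schema)
    return (schema[0],) + tuple(instantiate(part, substitution) for part in schema[1:])

conditional_schema = ("->", "phi", "psi")
instances = [
    instantiate(conditional_schema, {"phi": "P", "psi": "P"}),              # P -> P
    instantiate(conditional_schema, {"phi": "P", "psi": ("->", "Q", "Q")}), # P -> (Q -> Q)
    instantiate(conditional_schema, {"phi": "P", "psi": "Q"}),              # P -> Q
]
print([is_valid(wff) for wff in instances])  # prints [True, True, False]
```

Validity is computed only for the instances, never for the schema itself, which mirrors the point that schemas are not wffs. Checking a handful of instances cannot show that every instance of a schema is valid; that requires a general argument of the kind described above.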
2.5 Sequent proofs in propositional logic
The definitions of the previous section were inspired by the semantic
conception of logical truth and logical consequence. An alternate conception is
proof-theoretic. On this conception, the logical consequences of a set are
those statements that can be proved if one takes the members of the set as
premises; and a logical truth is a sentence that can be proved without using any
premises at all. A proof procedure is a method of reasoning one's way, step by
step, according to mechanical rules, from some premises to a conclusion. The
formal systems inspired by this conception introduce mathematical models of
proof procedures, which apply to sentences of formal languages.
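There are different methods for defining what a proof procedure is. One is
the method of natural deduction. This method is popular in introductory logic
textbooks, since it allows reasoning with assumptions. For example, in order
to prove a conditional, one assumes its antecedent for the sake of conditional
proof, and goes on to establish its consequent on that basis. Natural deduction
proofs often look like this:
\[
\begin{array}{lll}
1. & P\to(Q\to R) & \\
2. & \quad P\wedge Q & \\
3. & \quad P & 2,\ \wedge\text{E} \\
4. & \quad Q & 2,\ \wedge\text{E} \\
5. & \quad Q\to R & 1, 3,\ \to\text{E} \\
6. & \quad R & 4, 5,\ \to\text{E} \\
7. & (P\wedge Q)\to R & 2\text{--}6,\ \to\text{I}
\end{array}
\]
or like this:
\[
\begin{array}{lll}
1. & P\to(Q\to R) & \text{Pr.} \\
2. & \text{show } (P\wedge Q)\to R & \text{CD} \\
3. & \quad P\wedge Q & \text{As.} \\
4. & \quad \text{show } R & \text{DD} \\
5. & \quad\quad P & 3,\ \wedge\text{E} \\
6. & \quad\quad Q & 3,\ \wedge\text{E} \\
7. & \quad\quad Q\to R & 1, 5,\ \to\text{E} \\
8. & \quad\quad R & 6, 7,\ \to\text{E}
\end{array}
\]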
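The system we will examine in this section is a bit different. Our "sequent
proofs" will look different from natural deduction proofs:
\[
\begin{array}{lll}
1. & P\to(Q\to R) \Rightarrow P\to(Q\to R) & \text{RA} \\
2. & P\wedge Q \Rightarrow P\wedge Q & \text{RA} \\
3. & P\wedge Q \Rightarrow P & 2,\ \wedge\text{E} \\
4. & P\wedge Q \Rightarrow Q & 2,\ \wedge\text{E} \\
5. & P\to(Q\to R),\ P\wedge Q \Rightarrow Q\to R & 1, 3,\ \to\text{E} \\
6. & P\to(Q\to R),\ P\wedge Q \Rightarrow R & 4, 5,\ \to\text{E} \\
7. & P\to(Q\to R) \Rightarrow (P\wedge Q)\to R & 6,\ \to\text{I}
\end{array}
\]
Nevertheless, the underlying idea is quite similar. As we will see, sequent proofs
also let us reason with assumptions.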
2.5.1 Sequents
How does everyday reasoning work? In its simplest form, one reasons in a
step-by-step fashion from premises to a conclusion, each step being sanctioned
by a rule of inference. For example, suppose that you begin with the premise
$P\wedge(P\to Q)$. You already know this premise to be true, or you are supposing
it to be true for the sake of argument. You can then reason your way to the
conclusion that $Q$ is also true, as follows:
\[
\begin{array}{lll}
1. & P\wedge(P\to Q) & \text{premise} \\
2. & P & \text{from line 1} \\
3. & P\to Q & \text{from line 1} \\
4. & Q & \text{from lines 2 and 3}
\end{array}
\]
In this kind of proof, each step is a tiny, indisputably correct, logical inference.
Consider the moves from 1 to 2 and from 1 to 3, for example. These are
indisputably correct because a conjunctive statement clearly logically implies
each of its conjuncts. Likewise for the move from 2 and 3 to 4: it is clear that a
conditional statement together with its antecedent imply its consequent. Proof
systems consist in part of simple rules of inference, which allow one to infer
further formulas from formulas already contained in the proof. One example
of a rule of inference (the one used to derive lines 2 and 3 in the above example)
might be stated thus: "from a conjunctive statement one may infer either of
the conjuncts".
In addition to rules of inference, ordinary reasoning employs a further
technique: the use of assumptions. In order to establish a conditional claim "if A
then B", one would ordinarily i) assume A, ii) reason one's way to B, and then
iii) on that basis conclude that the conditional claim "if A then B" is true. Once
the assumption of A is shown to lead to B, the conditional claim "if A then B"
may be concluded. Another example: to establish a claim of the form "not-A",
one would ordinarily i) assume A, ii) reason one's way to a contradiction, and
iii) on that basis conclude that "not-A" is true. Once the assumption of A is
shown to lead to a contradiction, "not-A" may be concluded. The first sort of
reasoning is called conditional proof, the second, reductio ad absurdum.
When you reason with assumptions, you write down sentences that you don't
know to be true. Suppose you write down the sentence "Jones is a bachelor" as
an assumption for a conditional proof, with the goal of using it to prove the
statement "Jones is male" and thus to conclude that the conditional "if Jones is a
bachelor then Jones is male" is true. In this context, you do not know "Jones is
a bachelor" to be true. You're merely assuming it for the sake of establishing
the conditional. Outside of this conditional proof, the assumption need not
hold. Once you've established the conditional, you stop assuming that Jones is
a bachelor. To model this sort of reasoning formally, we need a way to keep
track of how the conclusions we establish depend on the assumptions we have
made. Natural deduction systems in introductory textbooks tend to do this
geometrically (by placement on the page), with special markers (e.g., "show"),
and by drawing lines or boxes around parts of the proof once the assumptions
that led to those parts are no longer operative. We will do it differently: we
will keep track of the dependence of conclusions on assumptions by writing
down explicitly, for each conclusion, which assumptions it depends on. We will
do this using what are known as sequents.[6]
A sequent looks like this:
\[
\Gamma \Rightarrow \phi
\]
$\Gamma$ is a set of formulas, called the premises of the sequent.[7] $\phi$ is a single formula,
called the conclusion of the sequent. "$\Rightarrow$" is a sign that goes between the sequent's
premises and its conclusion, to indicate that the whole thing is a sequent. Think
intuitively of a sequent as meaning that its conclusion is a logical consequence
of its premises.

[6] The method of sequents (as well as the method of natural deduction) was invented by
Gerhard Gentzen ().

[7] For reasons I won't go into, multiple formulas are sometimes allowed on the right hand
side of a sequent. Also, the premises of a sequent are usually taken to be an ordered sequence (or
some other ordered structure) of wffs rather than a set of wffs. This is to allow for nonstandard
logics in which order and repetition of premises can affect the correctness of arguments. To
recover logics in which order and repetition do not matter, one must then introduce "structural"
rules of inference, for example a rule allowing one to infer $\phi, \psi \Rightarrow \chi$ from $\psi, \phi \Rightarrow \chi$ and a rule
allowing one to infer $\phi, \phi \Rightarrow \psi$ from $\phi \Rightarrow \psi$. In the sequent systems we'll be discussing, order
and repetition of premises don't matter, and so I'll just treat premises as sets. See Restall ()
for more on sequent proof systems and structural rules.

In the proof system that I am about to introduce, one constructs proofs
out of sequents, rather than out of wffs. The lines of a sequent proof are
sequents; the conclusion of a sequent proof is a sequent; and the rules of
inference in sequent proofs let us infer new sequents from earlier sequents in a
proof. Reasoning with sequents might initially seem weird. For example, one
normally infers formulas from formulas; what does it mean to infer sequents
from sequents? Well, think of it this way. Call a natural language sequent one
in which $\phi$ and the members of $\Gamma$ are natural language sentences; and call a
natural language sequent logically correct iff $\phi$ is a (genuine) logical consequence
of the members of $\Gamma$. Natural language sequent proofs can then be thought of
as attempts to show that natural language sequents are logically correct, and
thus, as attempts to establish that some sentences are logical consequences of
others. On this conception, a good natural language sequent rule ought to
preserve logical correctness. That is, if the rule lets us infer a new sequent from
some old sequents, then if the old sequents are logically correct, so must be
the new sequent. Natural language sequent proofs, thus understood, let us
establish new cases of logical consequence on the basis of old cases of logical
consequence: we reason about logical consequence. The symbolic sequent
proof system we are about to define can be thought of as modeling this sort of
reasoning.
We have seen how to think of reasoning with sequents as reasoning about
logical consequence. But notice that this is, in effect, reasoning with
assumptions. For whenever one makes some assumptions $\Gamma$, and on that basis
establishes $\phi$, $\phi$ will be a logical consequence of $\Gamma$ if the reasoning is any good.
Assumptions that lead to a conclusion are just statements that logically imply
that conclusion. So, one can think of reasoning to $\phi$ on the basis of assumptions
$\Gamma$ as a sequent proof of the sequent $\Gamma \Rightarrow \phi$.
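2.5.2 Rules
The first step in developing our system is to write down sequent rules. A
sequent rule is a permission to move from certain sequents to another sequent.
Our first rule will be "$\wedge$ introduction", or "$\wedge$I" for short:[8]
\[
\frac{\Gamma \Rightarrow \phi \qquad \Delta \Rightarrow \psi}{\Gamma, \Delta \Rightarrow \phi\wedge\psi}\ \wedge\text{I}
\]
Above the line go the "from" sequents; below the line goes the "to" sequent.
(The comma between $\Gamma$ and $\Delta$ in the "to" sequent simply means that the
premises of this sequent are all the members of $\Gamma$ plus all the members of $\Delta$.
Strictly speaking we should write this in set-theoretic notation: $\Gamma\cup\Delta \Rightarrow \phi\wedge\psi$.)
Thus, $\wedge$I permits us to move from the sequents $\Gamma \Rightarrow \phi$ and $\Delta \Rightarrow \psi$ to the
sequent $\Gamma, \Delta \Rightarrow \phi\wedge\psi$. We say that the "to" sequent ($\Gamma, \Delta \Rightarrow \phi\wedge\psi$ in this case)
follows from the "from" sequents (in this case $\Gamma \Rightarrow \phi$ and $\Delta \Rightarrow \psi$) via the rule
(in this case, $\wedge$I).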

[8] We have rules for $\wedge$ and $\vee$, even though they're not grammatically primitive connectives.

Remember that our sequent rules are supposed to represent natural language
sequent rules that preserve logical correctness. So intuitively, our rules ought to
have the following feature: if all of the "from" sequents are (represent) logically
correct sequents, then the "to" sequent is guaranteed to be (represent) a logically
correct sequent. Intuitively, $\wedge$I has this feature. For if some assumptions $\Gamma$
logically imply $\phi$, and some assumptions $\Delta$ logically imply $\psi$, then (since $\phi\wedge\psi$
intuitively follows from $\phi$ and $\psi$ taken together) the conclusion $\phi\wedge\psi$ should
indeed logically follow from all the assumptions together, the ones in $\Gamma$ and
the ones in $\Delta$.
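Our next sequent rule is $\wedge$E:
\[
\frac{\Gamma \Rightarrow \phi\wedge\psi}{\Gamma \Rightarrow \phi \qquad \Gamma \Rightarrow \psi}\ \wedge\text{E}
\]
This has two forms. The first lets one move from the sequent $\Gamma \Rightarrow \phi\wedge\psi$
to the sequent $\Gamma \Rightarrow \phi$; the second lets one move from $\Gamma \Rightarrow \phi\wedge\psi$ to $\Gamma \Rightarrow \psi$.
Again, each appears to preserve logical correctness. If the members of $\Gamma$
imply the conjunction $\phi\wedge\psi$, then (since $\phi\wedge\psi$ intuitively implies both $\phi$ and
$\psi$ individually) it must be that the members of $\Gamma$ imply $\phi$, and they must also
imply $\psi$.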
The rule $\wedge$I is known as an introduction rule for $\wedge$, since it allows us to move
to a sequent of the form $\Gamma \Rightarrow \phi\wedge\psi$. Likewise, the rule $\wedge$E is known as an
elimination rule for $\wedge$, since it allows us to move from a sequent of that form.
In fact our sequent system contains introduction and elimination rules for the
other connectives as well: $\sim$, $\vee$, and $\to$ (let's forget the $\leftrightarrow$ here). We'll present
those rules in turn.
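First $\vee$I and $\vee$E:
\[
\frac{\Gamma \Rightarrow \phi}{\Gamma \Rightarrow \phi\vee\psi \qquad \Gamma \Rightarrow \psi\vee\phi}\ \vee\text{I}
\qquad\qquad
\frac{\Gamma \Rightarrow \phi\vee\psi \qquad \Delta_1, \phi \Rightarrow \chi \qquad \Delta_2, \psi \Rightarrow \chi}{\Gamma, \Delta_1, \Delta_2 \Rightarrow \chi}\ \vee\text{E}
\]
$\vee$E embodies reasoning by separation of cases. Here, intuitively, is why it is a
good sequent rule. Suppose we know that the three from-sequents of $\vee$E are
logically correct. We can then give an intuitive argument that the to-sequent
$\Gamma, \Delta_1, \Delta_2 \Rightarrow \chi$ is also logically correct; that is, that $\chi$ is a logical consequence
of the formulas in $\Gamma$, $\Delta_1$, and $\Delta_2$. Suppose the formulas in $\Gamma$, $\Delta_1$, and $\Delta_2$ are all
true. The first from-sequent tells us that the disjunction $\phi\vee\psi$ is true. So either
$\phi$ or $\psi$ is true. Now, if $\phi$ is true then the second from-sequent tells us that $\chi$ is
true. And if $\psi$ is true then the third from-sequent tells us that $\chi$ is again true.
Either way, we learn that $\chi$ is true (there's the separation of cases reasoning).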
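Next, we have double negation:
\[
\frac{\Gamma \Rightarrow \sim\sim\phi}{\Gamma \Rightarrow \phi}\ \text{DN}
\]
In connection with negation, we also have the rule of reductio ad absurdum:
\[
\frac{\Gamma, \phi \Rightarrow \psi\wedge\sim\psi}{\Gamma \Rightarrow \sim\phi}\ \text{RAA}
\]
That is, if $\phi$ (along with perhaps some other assumptions, $\Gamma$) leads to a
contradiction, we can conclude that $\sim\phi$ is true (given the assumptions in $\Gamma$). RAA
and DN together are our introduction and elimination rules for $\sim$.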
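And finally we have $\to$I and $\to$E:
\[
\frac{\Gamma, \phi \Rightarrow \psi}{\Gamma \Rightarrow \phi\to\psi}\ \to\text{I}
\qquad\qquad
\frac{\Gamma \Rightarrow \phi\to\psi \qquad \Delta \Rightarrow \phi}{\Gamma, \Delta \Rightarrow \psi}\ \to\text{E}
\]
$\to$E is perfectly straightforward; it's just the familiar rule of modus ponens.
$\to$I is the principle of conditional proof. Suppose you can get to $\psi$ on the
assumption that $\phi$ (plus perhaps some other assumptions $\Gamma$). Then, you should
be able to conclude that the conditional $\phi\to\psi$ is true (assuming the formulas
in $\Gamma$). Put another way: if you want to establish the conditional $\phi\to\psi$, all you
need to do is assume that $\phi$ is true, and reason your way to $\psi$.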
We add, finally, one more sequent rule, the rule of assumptions:
\[
\frac{\qquad}{\phi \Rightarrow \phi}\ \text{RA}
\]
This is the one sequent rule that requires no "from" sequents (there are no
sequents above the line). The rule permits us to move from no sequents at
all to a sequent of the form $\phi \Rightarrow \phi$. (Strictly, this sequent should be written
"$\{\phi\} \Rightarrow \phi$".) Intuitively, any such sequent is logically correct since any statement
logically implies itself.
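One way to get a feel for the rules is to treat each as a function from sequents to a sequent. The following Python sketch does this under several assumptions of its own: a sequent is a pair of a frozenset of premises and a conclusion, formulas are nested tuples with "and", "or", "not", and "->" as tags, and the function names are invented for the illustration. As a simplification, RAA, conditional introduction, and disjunction elimination always remove the discharged formula from the premise set.

```python
def is_form(wff, op):
    """True iff wff is a compound formula whose main connective is op."""
    return isinstance(wff, tuple) and wff[0] == op

def RA(phi):
    return (frozenset([phi]), phi)                      # phi => phi

def and_I(s1, s2):
    (gamma, phi), (delta, psi) = s1, s2
    return (gamma | delta, ("and", phi, psi))

def and_E(s, side):
    gamma, wff = s
    if not is_form(wff, "and"):
        raise ValueError("and_E needs a conjunction as conclusion")
    return (gamma, wff[1] if side == "left" else wff[2])

def or_I(s, other, side):
    gamma, phi = s
    return (gamma, ("or", phi, other) if side == "left" else ("or", other, phi))

def or_E(s1, s2, s3):
    (gamma, wff), (delta1, chi1), (delta2, chi2) = s1, s2, s3
    if not is_form(wff, "or") or chi1 != chi2 or wff[1] not in delta1 or wff[2] not in delta2:
        raise ValueError("or_E does not apply")
    return (gamma | (delta1 - {wff[1]}) | (delta2 - {wff[2]}), chi1)

def DN(s):
    gamma, wff = s
    if not (is_form(wff, "not") and is_form(wff[1], "not")):
        raise ValueError("DN needs a doubly negated conclusion")
    return (gamma, wff[1][1])

def RAA(s, phi):
    gamma, wff = s
    if phi not in gamma or not (is_form(wff, "and") and wff[2] == ("not", wff[1])):
        raise ValueError("RAA needs a contradiction depending on phi")
    return (gamma - {phi}, ("not", phi))

def cond_I(s, phi):
    gamma, psi = s
    if phi not in gamma:
        raise ValueError("cond_I discharges an assumption that must be present")
    return (gamma - {phi}, ("->", phi, psi))

def cond_E(s1, s2):
    (gamma, wff), (delta, phi) = s1, s2
    if not is_form(wff, "->") or wff[1] != phi:
        raise ValueError("cond_E needs a conditional and its antecedent")
    return (gamma | delta, wff[2])

# The seven-line sequent proof displayed earlier in section 2.5, replayed step by step.
A = ("->", "P", ("->", "Q", "R"))
PQ = ("and", "P", "Q")
line1, line2 = RA(A), RA(PQ)
line3, line4 = and_E(line2, "left"), and_E(line2, "right")
line5 = cond_E(line1, line3)
line6 = cond_E(line5, line4)
line7 = cond_I(line6, PQ)
print(line7 == (frozenset([A]), ("->", PQ, "R")))  # prints True
```

Using sets for the premises means that order and repetition of premises make no difference, in keeping with the decision to treat premises as sets.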
2.5.3 Sequent proofs
We have assembled all the sequent rules. Now we'll see how to construct
sequent proofs with them.
Definition of sequent proof: A sequent proof is a series of sequents, each
of which is either of the form $\phi \Rightarrow \phi$, or follows from earlier sequents in the
series by some sequent rule.
So, for example, the following is a sequent proof:
\[
\begin{array}{lll}
1. & P\wedge Q \Rightarrow P\wedge Q & \text{RA} \\
2. & P\wedge Q \Rightarrow P & 1,\ \wedge\text{E} \\
3. & P\wedge Q \Rightarrow Q & 1,\ \wedge\text{E} \\
4. & P\wedge Q \Rightarrow Q\wedge P & 2, 3,\ \wedge\text{I}
\end{array}
\]
Though it isn't strictly required, we write a line number to the left of each
sequent in the series, and to the right of each line we write the sequent rule
that justifies it, together with the line or lines (if any) that contained the "from"
sequents required by the sequent rule in question. (The rule of assumptions
requires no "from" sequents, recall.)
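Because the definition is so explicit, whether an annotated series of sequents is a sequent proof can be checked line by line. The Python sketch below does this for a deliberately small fragment: only RA and the two conjunction rules are recognized, and any other rule name is rejected. The representation of sequents as (frozenset, formula) pairs, the rule labels "RA", "&E", and "&I", and the function names are all assumptions of the sketch.

```python
def follows(sequent, rule, cited):
    """Does sequent follow from the cited sequents by the named rule?"""
    premises, conclusion = sequent
    if rule == "RA":
        # phi => phi, with no "from" sequents.
        return not cited and premises == frozenset([conclusion])
    if rule == "&E":
        if len(cited) != 1:
            return False
        gamma, wff = cited[0]
        return (premises == gamma and isinstance(wff, tuple) and wff[0] == "and"
                and conclusion in (wff[1], wff[2]))
    if rule == "&I":
        if len(cited) != 2:
            return False
        (gamma, phi), (delta, psi) = cited
        # Accept the cited lines in either order.
        return premises == gamma | delta and conclusion in (("and", phi, psi), ("and", psi, phi))
    return False  # rule not covered by this sketch

def is_sequent_proof(lines):
    """lines: list of (sequent, rule, refs), where refs are 1-based earlier line numbers."""
    for number, (sequent, rule, refs) in enumerate(lines, start=1):
        if any(not (1 <= ref < number) for ref in refs):
            return False
        if not follows(sequent, rule, [lines[ref - 1][0] for ref in refs]):
            return False
    return True

PQ = ("and", "P", "Q")
proof = [
    ((frozenset([PQ]), PQ), "RA", []),
    ((frozenset([PQ]), "P"), "&E", [1]),
    ((frozenset([PQ]), "Q"), "&E", [1]),
    ((frozenset([PQ]), ("and", "Q", "P")), "&I", [2, 3]),
]
print(is_sequent_proof(proof))  # prints True
```

Any initial segment of `proof` also passes the check, since each line cites only earlier lines.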
To reiterate a distinction I've been making, it's important to distinguish
sequent proofs from metalogic proofs. Sequent proofs (and also the axiomatic
proofs we will introduce in section 2.6) are proofs in formal systems. They
consist of wffs in a formal language (plus the sequent sign, $\Rightarrow$), and are structured
according to a carefully formulated definition (the definition of a sequent proof).
Moreover, only the system's official rules of inference may be used. Metalogic
proofs are very different. Recall the argument I gave in section 2.3 that any
PL-valuation assigns 1 to $\phi\wedge\psi$ iff it assigns 1 to $\phi$ and 1 to $\psi$. The sentences in the
argument were sentences of English, and the argument used informal reasoning.
"Informal" means merely that the reasoning doesn't follow a formally stipulated
set of rules; it doesn't imply lack of rigor. The argument conforms to the
standards of good argumentation that generally prevail in mathematics.
Next we introduce the notion of a "provable sequent":
Definition of provable sequent: A provable sequent is a sequent that is the
last line of some sequent proof.
So, for example, the sequent proof given above establishes that P∧Q ⇒ Q∧P
is a provable sequent. We call a sequent proof whose last line is Γ ⇒ φ a
sequent proof of Γ ⇒ φ.
Note that it would be equivalent to define a provable sequent as any line
in any sequent proof, because at any point in a sequent proof one may simply
stop adding lines; the proof up until that point counts as a legal sequent proof.
The definitions we have given in this section give us a formal model (of the
proof-theoretic variety) of the core logical notions, as applied to propositional
logic. The formal model of φ being a logical consequence of the formulas in
set Γ is: the sequent Γ ⇒ φ is a provable sequent. The formal model of φ being
a logical truth is: the sequent ∅ ⇒ φ is a provable sequent (∅ is the empty set).
2.5.4 Example sequent proofs
Let's explore how to construct sequent proofs. (You may find this initially
awkward, but a little experimentation will show that the techniques familiar
from proof systems in introductory textbooks will work here.)
Example 2.6: Let's return to the sequent proof of P∧Q ⇒ Q∧P:
1. P∧Q ⇒ P∧Q        RA
2. P∧Q ⇒ P          1, ∧E
3. P∧Q ⇒ Q          1, ∧E
4. P∧Q ⇒ Q∧P        2, 3, ∧I
Notice the strategy. We're trying to prove the sequent P∧Q ⇒ Q∧P. The
premise of this sequent is P∧Q, so our first step is to use the rule of assumptions
to introduce this wff into our proof (line 1). We now have a sequent with a
conjunction as its conclusion, but its conjuncts are in the wrong order (we want
Q∧P, not P∧Q). So first we take the conjuncts apart using ∧E (lines 2 and 3),
and then put them back together in the other order (line 4).
Example 2.7: Next an example to illustrate conditional proof. Let's construct
a sequent proof of P→Q, Q→R ⇒ P→R:
1. P→Q ⇒ P→Q              RA
2. Q→R ⇒ Q→R              RA
3. P ⇒ P                  RA (for conditional proof)
4. P→Q, P ⇒ Q             1, 3, →E
5. P→Q, Q→R, P ⇒ R        2, 4, →E
6. P→Q, Q→R ⇒ P→R         5, →I
Here we are trying to establish a sequent whose premises are P→Q and Q→R,
so we start by using RA to get these two wffs into the proof. Then, since
the conclusion of the sequent we're after is a conditional (P→R), we use RA
to introduce its antecedent (P), and our goal then is to get a sequent whose
conclusion is the conditional's consequent (R). (To prove a conditional you
assume the antecedent and then try to establish the consequent.) When we
achieve this goal in line 5, we've shown that R follows from various assumptions,
including P. The rule →I (in essence, the principle of conditional proof) then
lets us conclude that the conditional P→R follows from those other assumptions
alone, without the help of P.
Notice how dependencies sometimes get added and sometimes get subtracted
when we use sequent rules. The sequent on line 5 has P among its
premises, but when we use →I to move to line 6, P is no longer present as a
premise. Whereas the conclusion of line 5 (R) depends on P, the conclusion
of line 6 (P→R) does not. A dependency is subtracted. (In compensation, the
conclusion weakens, from R to P→R.) But the move from 1 and 3 to 4 adds
dependencies: the conclusion of line 4 depends on the premises from lines 1
and 3 taken together. (The rule →E requires this.)
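The way dependencies are added and subtracted can be watched directly in a small Python sketch of Example 2.7. The encoding is an assumption made for illustration: sentence letters are strings, φ→ψ is the tuple ("imp", φ, ψ), and a sequent is a pair of a frozenset of premises and a conclusion. The names RA, imp_E, and imp_I are hypothetical, and the rules are coded as "from Γ ⇒ φ→ψ and Δ ⇒ φ infer Γ,Δ ⇒ ψ" and "from Γ,φ ⇒ ψ infer Γ ⇒ φ→ψ".

```python
# Assumed encoding: ("imp", a, b) is the conditional a→b; a sequent is
# a pair (frozenset_of_premises, conclusion).

def RA(phi):
    """Rule of assumptions: infer {phi} => phi from nothing."""
    return (frozenset([phi]), phi)

def imp_E(cond_seq, ant_seq):
    """→E: from Gamma => a→b and Delta => a, infer Gamma, Delta => b."""
    (gamma, cond), (delta, phi) = cond_seq, ant_seq
    assert isinstance(cond, tuple) and cond[0] == "imp" and cond[1] == phi
    return (gamma | delta, cond[2])          # dependencies are pooled

def imp_I(seq, phi):
    """→I: from Gamma, phi => b infer Gamma => phi→b."""
    premises, b = seq
    assert phi in premises, "phi must be among the premises"
    return (premises - {phi}, ("imp", phi, b))   # the dependency on phi is dropped

PQ, QR = ("imp", "P", "Q"), ("imp", "Q", "R")
line1, line2, line3 = RA(PQ), RA(QR), RA("P")
line4 = imp_E(line1, line3)    # P→Q, P => Q
line5 = imp_E(line2, line4)    # P→Q, Q→R, P => R
line6 = imp_I(line5, "P")      # P→Q, Q→R => P→R
print("P" in line5[0], "P" in line6[0])   # True False
```

Set union in imp_E is what makes line 4 depend on lines 1 and 3 taken together, and set difference in imp_I is what removes P from the premises of line 6.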
Example 2.8: Next a "DeMorgan" sequent, ∼(P∨Q) ⇒ ∼P∧∼Q:
1. ∼(P∨Q) ⇒ ∼(P∨Q)                  RA
2. P ⇒ P                            RA (for reductio)
3. P ⇒ P∨Q                          2, ∨I
4. ∼(P∨Q), P ⇒ (P∨Q)∧∼(P∨Q)         1, 3, ∧I
5. ∼(P∨Q) ⇒ ∼P                      4, RAA
6. Q ⇒ Q                            RA (for reductio)
7. Q ⇒ P∨Q                          6, ∨I
8. ∼(P∨Q), Q ⇒ (P∨Q)∧∼(P∨Q)         1, 7, ∧I
9. ∼(P∨Q) ⇒ ∼Q                      8, RAA
10. ∼(P∨Q) ⇒ ∼P∧∼Q                  5, 9, ∧I
The main strategies at work here are two. First, in order to establish a
conjunction (such as ∼P∧∼Q) you independently establish the conjuncts and then
put them together using ∧I. Second, in order to establish a negation (such as ∼P),
you use reductio ad absurdum.
Example 2.9: Next let's establish ∅ ⇒ P∨∼P:
1. ∼(P∨∼P) ⇒ ∼(P∨∼P)                    RA (for reductio)
2. P ⇒ P                                RA (for reductio)
3. P ⇒ P∨∼P                             2, ∨I
4. ∼(P∨∼P), P ⇒ (P∨∼P)∧∼(P∨∼P)          1, 3, ∧I
5. ∼(P∨∼P) ⇒ ∼P                         4, RAA
6. ∼(P∨∼P) ⇒ P∨∼P                       5, ∨I
7. ∼(P∨∼P) ⇒ (P∨∼P)∧∼(P∨∼P)             1, 6, ∧I
8. ∅ ⇒ ∼∼(P∨∼P)                         7, RAA
9. ∅ ⇒ P∨∼P                             8, DN
Here my overall goal was to assume ∼(P∨∼P) and then derive a contradiction.
And my route to the contradiction was to first establish ∼P (by a reductio
argument, in lines 2-5), and then to get my contradiction from that.
Example 2.10: Finally, let's establish a sequent corresponding to a way that
∨E is sometimes formulated: P∨Q, ∼P ⇒ Q:
1. P∨Q ⇒ P∨Q                  RA
2. ∼P ⇒ ∼P                    RA
3. Q ⇒ Q                      RA (for use with ∨E)
4. P ⇒ P                      RA (for use with ∨E)
5. ∼Q ⇒ ∼Q                    RA (for reductio)
6. ∼P, P ⇒ P∧∼P               2, 4, ∧I
7. ∼P, P, ∼Q ⇒ (P∧∼P)∧∼Q      5, 6, ∧I
8. ∼P, P, ∼Q ⇒ P∧∼P           7, ∧E
9. ∼P, P ⇒ ∼∼Q                8, RAA
10. ∼P, P ⇒ Q                 9, DN
11. P∨Q, ∼P ⇒ Q               1, 3, 10, ∨E
The basic idea of this proof was to use ∨E on line 1 to get Q. That called, in
turn, for showing that each disjunct of P∨Q leads to Q. Showing that Q leads
to Q is easy; that was line 3. Showing that P leads to Q took lines 4-10; line 10
states the result of that reasoning, namely that Q follows from P (given also
the other premise of the whole argument, ∼P). I began at line 4 by assuming
P. Then my strategy was to establish Q by reductio, so I assumed ∼Q in
line 5, and then got a contradiction in line 6. But there was a minor hitch. I
wanted next to use RAA to conclude ∼∼Q. But look carefully at how RAA is
formulated. It says that if we have Γ, φ ⇒ ψ∧∼ψ, we can conclude Γ ⇒ ∼φ.
So to use RAA to infer Γ ⇒ ∼φ, Γ together with φ must imply a contradiction.
So in the present case, in order to finish the reductio argument and conclude
∼∼Q, the contradiction P∧∼P needed to depend on the reductio assumption
∼Q. But on line 6, the contradiction depended only on ∼P and P. To get
around this, I used a little trick in lines 7 and 8. I used ∧I to pop ∼Q onto the
end of the contradiction (thus adding a dependency on ∼Q), and then I used
∧E to pop it off (retaining the dependency). One can always use this trick to
add a dependency, that is, to add any desired wff to the premises of a sequent.[9]
(If the wff you want to add isn't in the proof already, just use RA to get it in there.)
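The dependency-adding trick of lines 7 and 8 is a general recipe, which the following Python sketch packages as a single helper. The encoding (sentence letters as strings, ("and", a, b) and ("not", a) as tuples, a sequent as a pair of a frozenset and a conclusion) and the name add_dependency are assumptions made for illustration only.

```python
# Assumed encoding: ("and", a, b) is a conjunction, ("not", a) a negation;
# a sequent is a pair (frozenset_of_premises, conclusion).

def RA(phi):
    return (frozenset([phi]), phi)

def and_I(seq1, seq2):
    """From Gamma => a and Delta => b infer Gamma, Delta => a^b."""
    (gamma, a), (delta, b) = seq1, seq2
    return (gamma | delta, ("and", a, b))

def and_E(seq, side):
    """From Gamma => a^b infer Gamma => a (side 0) or Gamma => b (side 1)."""
    gamma, concl = seq
    assert isinstance(concl, tuple) and concl[0] == "and"
    return (gamma, concl[1 + side])

def add_dependency(seq, extra):
    """Turn Gamma => phi into Gamma, extra => phi."""
    padded = and_I(seq, RA(extra))   # pop extra onto the end of the conclusion
    return and_E(padded, 0)          # pop it off again, keeping the dependency

not_P, not_Q = ("not", "P"), ("not", "Q")
line6 = (frozenset([not_P, "P"]), ("and", "P", not_P))   # ∼P, P => P^∼P
line8 = add_dependency(line6, not_Q)                     # ∼P, P, ∼Q => P^∼P
print(not_Q in line6[0], not_Q in line8[0])              # False True
```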
Exercise 2.3 Prove the following sequents:
a) P→(Q→R) ⇒ (Q∧∼R)→∼P
b) P, Q, R ⇒ P
c) P→Q, R→Q ⇒ (P∨R)→Q
2.6 Axiomatic proofs in propositional logic
In this section we consider a different approach to proof theory, the axiomatic
approach. Sequent proofs are comparatively easy to construct; that is their great
advantage. Axiomatic (or Hilbert-style) systems offer different advantages.
Like sequent proofs, axiomatic proofs consist of step-by-step reasoning in
which each step is sanctioned by a rule of inference. But axiomatic systems do
not allow reasoning with assumptions, and therefore do not allow conditional
proof or reductio ad absurdum; and they have very few rules of inference.
Although these differences make axiomatic proofs much harder to construct,
there is a compensatory advantage in metalogic: in many cases it is easier to
prove things about axiomatic systems.
[9] Adding arbitrary dependencies is not allowed in relevance logic, where a sequent is provable
only when all of its premises are, intuitively, relevant to its conclusion. Relevance logicians
modify various rules of standard logic, including the rule of ∧E.
Let's first think about axiomatic systems informally. An axiomatic proof
will be defined as a series of formulas (not sequents; we no longer need them
since we're not reasoning with assumptions anymore), the last of which is the
conclusion of the proof. Each line in the proof must be justified in one of two
ways: it may be inferred by a rule of inference from earlier lines in the proof,
or it may be an axiom. An axiom is a certain kind of formula, a formula that
one is allowed to enter into a proof without any further justification. Axioms
are the "starting points" of proofs, the foundation on which proofs rest. Since
axioms are to play this role, the axioms in a good axiomatic system ought to
represent indisputable logical truths. (For example, "P→P" would be a good
axiom, since sentences like "if it is raining then it is raining" and "if snow is
white then snow is white" are obviously logical truths. But we won't choose
this particular axiom; we'll choose other axioms from which it may be proved.)
Similarly, a rule of inference in a good axiomatic system ought to represent an
argument form in which the premises clearly logically imply the conclusion.
Actually we'll employ a slightly more general notion of a proof: a proof
from a given set of wffs Γ. A proof from Γ will be allowed to contain members
of Γ, in addition to axioms and wffs that follow from earlier lines by a rule.
Think of the members of Γ as premises, which in the context of a proof from
Γ are temporarily treated as axioms, in that they are allowed to be entered into
the proof without any justification. (Premises are a bit like the assumptions in
sequent proofs, but they're not the same: a proof of φ from set of premises Γ
cannot contain any further assumptions beyond those in Γ. You can't just
assume a formula for the sake of conditional proof or reductio; there simply is
no conditional proof or proof by reductio in an axiomatic system.) The intuitive
point of a proof from Γ is to demonstrate its conclusion on the assumption that
the members of Γ are true, in contrast to a proof simpliciter (i.e. a proof in the
sense of the previous paragraph), whose point is to demonstrate its conclusion
unconditionally. (Note that we can regard a proof simpliciter as a proof from
the empty set ∅.)
Formally, to apply the axiomatic method, we must choose i) a set of rules,
and ii) a set of axioms. In choosing a set of axioms, we simply choose any set
of wffs, although as we saw, in a good axiomatic system the axioms should
represent logical truths. A rule is simply a permission to infer one sort of
sentence from other sentences. For example, the rule modus ponens can be
stated thus: "From φ→ψ and φ you may infer ψ", and pictured as follows:

    φ→ψ     φ
    -----------  MP
         ψ

(There typically are very few rules, often just modus ponens. Modus ponens
corresponds to the sequent rule →E.) Given any chosen axioms and rules, we
can define the following concepts:
Definition of axiomatic proof from a set: Where Γ is a set of wffs and φ is
a wff, an axiomatic proof of φ from Γ is a finite sequence of wffs whose last line is
φ, in which each line either i) is an axiom, ii) is a member of Γ, or iii) follows
from earlier wffs in the sequence via a rule.
Definition of axiomatic proof: An axiomatic proof of φ is an axiomatic proof
of φ from ∅ (i.e., a finite sequence of wffs whose last line is φ, in which each
line either i) is an axiom, or ii) follows from earlier wffs in the sequence via a
rule.)
It is common to write "Γ ⊢ φ" to mean that φ is provable from Γ, i.e., that
there exists some axiomatic proof of φ from Γ. We also write "⊢ φ" to mean
that ∅ ⊢ φ, i.e. that φ is provable, i.e., that there exists some axiomatic proof
of φ from no premises at all. (Formulas provable from no premises at all are
often called theorems.) This notation can be used for any axiomatic system,
i.e. any choice of axioms and rules. The symbol ⊢ may be subscripted with
the name of the system in question. Thus, for our axiom system for PL below,
we may write: ⊢_PL. (We'll omit this subscript when it's clear which axiomatic
system is being discussed.)
Here is an axiomatic system for propositional logic:
Axiomatic system for PL:
Rule: modus ponens
Axioms: The result of substituting wffs for φ, ψ, and χ in any of the
following schemas is an axiom:
    φ→(ψ→φ)                          (PL1)
    (φ→(ψ→χ))→((φ→ψ)→(φ→χ))          (PL2)
    (∼ψ→∼φ)→((∼ψ→φ)→ψ)               (PL3)
Thus, a PL-theorem is any formula that is the last of a sequence of formulas,
each of which is either a PL1, PL2, or PL3 axiom, or follows from earlier
formulas in the sequence by modus ponens. And a formula is PL-provable from
some set Γ if it is the last of a sequence of formulas, each of which is either a
member of Γ, a PL1, PL2, or PL3 axiom, or follows from earlier formulas in
the sequence by modus ponens.
The axiom "schemas" PL1-PL3 are not themselves axioms. They are,
rather, "recipes" for constructing axioms. Take PL1, for example:
    φ→(ψ→φ)
This string of symbols isn't itself an axiom because it isn't a wff; it isn't a wff
because it contains Greek letters, which aren't allowed in wffs (since they're
not on the list of PL primitive vocabulary). φ and ψ are variables of our
metalanguage; you only get an axiom when you replace these variables with
wffs. P→(Q→P), for example, is an axiom (well, officially it requires outer
parentheses). It results from PL1 by replacing φ with P and ψ with Q. (Note:
since you can put in any wff for these variables, and there are infinitely many
wffs, there are infinitely many axioms.)
A few points of clarification about how to construct axioms from schemas.
First point: you can stick in the same wff for two different Greek letters. Thus
you can let both φ and ψ in PL1 be P, and construct the axiom P→(P→P).
(But of course, you don't have to stick in the same thing for φ as for ψ.)
Second point: you can stick in complex formulas for the Greek letters. Thus,
(P→Q)→(∼(R→S)→(P→Q)) is an axiom (I put in P→Q for φ and ∼(R→S)
for ψ in PL1). Third point: within a single axiom, you can't substitute different
wffs for a single Greek letter. For example, P→(Q→R) is not an axiom; you
can't let the first φ in PL1 be P and the second φ be R. Final point: even
though you can't substitute different wffs for a single Greek letter within a
single axiom, you can let a Greek letter become one wff when making one axiom,
and let it become a different wff when making another axiom; and you can use
each of these axioms within a single axiomatic proof. For example, each of the
following is an instance of PL1; you could use both within a single axiomatic
proof:
    P→(Q→P)
    ∼P→((Q→R)→∼P)
In the first case, I made φ be P and ψ be Q; in the second case I made φ be ∼P
and ψ be Q→R. This is fine because I kept φ and ψ constant within each axiom.
(The type of symbol replacement described in this paragraph is sometimes
called uniform substitution.)
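Uniform substitution is easy to state as a program. The Python sketch below is only an illustration under an assumed encoding: in a schema, strings such as "phi" and "psi" stand for metalinguistic variables; in a wff, strings are sentence letters; and ("imp", a, b) and ("not", a) stand for a→b and ∼a. The names substitute, match, and is_instance are hypothetical.

```python
# Assumed encoding: ("imp", a, b) is a→b, ("not", a) is ∼a.
# In a schema, bare strings are metavariables; in a wff, they are sentence letters.
PL1 = ("imp", "phi", ("imp", "psi", "phi"))

def substitute(schema, assignment):
    """Uniformly replace each metavariable in schema by its assigned wff."""
    if isinstance(schema, str):
        return assignment[schema]
    op, *parts = schema
    return (op, *(substitute(part, assignment) for part in parts))

def match(schema, wff, binding):
    """Extend binding so that substituting it into schema yields wff, if possible."""
    if isinstance(schema, str):                  # a metavariable
        if schema in binding:
            return binding[schema] == wff        # same Greek letter, same wff
        binding[schema] = wff
        return True
    return (isinstance(wff, tuple) and len(wff) == len(schema)
            and wff[0] == schema[0]
            and all(match(s, w, binding) for s, w in zip(schema[1:], wff[1:])))

def is_instance(schema, wff):
    return match(schema, wff, {})

first = substitute(PL1, {"phi": "P", "psi": "Q"})                   # P→(Q→P)
second = substitute(PL1, {"phi": ("not", "P"),
                          "psi": ("imp", "Q", "R")})                # ∼P→((Q→R)→∼P)
not_axiom = ("imp", "P", ("imp", "Q", "R"))                         # P→(Q→R)

print(is_instance(PL1, first), is_instance(PL1, second), is_instance(PL1, not_axiom))
# True True False
```

The check `binding[schema] == wff` is where uniformity is enforced: once φ has been matched with P, it cannot also be matched with R.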
Thus, we have developed another formalism that is inspired by the
proof-theoretic conception of the core logical notions. The PL-theorems represent
the logical truths, and PL-provability represents logical consequence.
Axiomatic proofs are much harder to construct than sequent proofs. Some
are easy, of course. Here is a proof of (P→Q)→(P→P):
1. P→(Q→P)                          PL1
2. (P→(Q→P))→((P→Q)→(P→P))          PL2
3. (P→Q)→(P→P)                      1, 2, MP
The existence of this proof shows that (P→Q)→(P→P) is a theorem. (The
line numbering and explanations of how the lines were obtained aren't required,
but they make the proofs easier to read.)
Building on the previous proof, we can construct a proof of P→P from
{P→Q}. (In a proof from a set, when we write down a member of the set we'll
annotate it "premise".)
1. P→(Q→P)                          PL1
2. (P→(Q→P))→((P→Q)→(P→P))          PL2
3. (P→Q)→(P→P)                      1, 2, MP
4. P→Q                              premise
5. P→P                              3, 4, MP
Thus, we have shown that {P→Q} ⊢ P→P. (Let's continue with our practice
of dropping the set-braces in such statements. In this streamlined notation,
what we just showed is: P→Q ⊢ P→P.)
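Because the definition of an axiomatic proof from a set is purely mechanical, it can be turned into a checker. The following Python sketch assumes the same illustrative encoding as before (strings for sentence letters and, inside schemas, for metavariables; ("imp", a, b) for a→b; ("not", a) for ∼a). The function names are hypothetical, and the checker does not look at the annotations in the right-hand column; it only asks whether each line is an axiom, a premise, or obtainable from two earlier lines by modus ponens.

```python
# Assumed encoding: ("imp", a, b) is a→b, ("not", a) is ∼a.
def imp(a, b):
    return ("imp", a, b)

PL1 = imp("phi", imp("psi", "phi"))
PL2 = imp(imp("phi", imp("psi", "chi")), imp(imp("phi", "psi"), imp("phi", "chi")))
PL3 = imp(imp(("not", "psi"), ("not", "phi")), imp(imp(("not", "psi"), "phi"), "psi"))

def match(schema, wff, binding):
    """True if wff results from schema by a uniform substitution extending binding."""
    if isinstance(schema, str):
        if schema in binding:
            return binding[schema] == wff
        binding[schema] = wff
        return True
    return (isinstance(wff, tuple) and len(wff) == len(schema)
            and wff[0] == schema[0]
            and all(match(s, w, binding) for s, w in zip(schema[1:], wff[1:])))

def is_axiom(wff):
    return any(match(schema, wff, {}) for schema in (PL1, PL2, PL3))

def is_proof_from(premises, lines):
    """Each line must be an axiom, a premise, or follow from earlier lines by MP."""
    for i, line in enumerate(lines):
        earlier = lines[:i]
        by_mp = any(imp(a, line) in earlier for a in earlier)
        if not (is_axiom(line) or line in premises or by_mp):
            return False
    return True

l1 = imp("P", imp("Q", "P"))                          # PL1
l2 = imp(l1, imp(imp("P", "Q"), imp("P", "P")))       # PL2
l3 = imp(imp("P", "Q"), imp("P", "P"))                # 1, 2, MP
l4 = imp("P", "Q")                                    # premise
l5 = imp("P", "P")                                    # 3, 4, MP

print(is_proof_from({l4}, [l1, l2, l3, l4, l5]))      # True
print(is_proof_from(set(), [l1, l2, l3, l4, l5]))     # False: line 4 needs the premise
```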
The next example is a little harder: (R→P)→(R→(Q→P))
1. [R→(P→(Q→P))]→[(R→P)→(R→(Q→P))]      PL2
2. P→(Q→P)                              PL1
3. [P→(Q→P)]→[R→(P→(Q→P))]              PL1
4. R→(P→(Q→P))                          2, 3, MP
5. (R→P)→(R→(Q→P))                      1, 4, MP
Here's how I approached this problem. What I was trying to prove, namely
(R→P)→(R→(Q→P)), is a conditional whose antecedent and consequent both
begin: (R→. That looks like the consequent of PL2. So I wrote out an instance
of PL2 whose consequent was the formula I was trying to prove; that gave me
line 1 of the proof. Then I tried to figure out a way to get the antecedent of
line 1; namely, R→(P→(Q→P)). And that turned out to be pretty easy. The
consequent of this formula, P→(Q→P), is an axiom (line 2 of the proof). And
if you can get a formula φ, then you choose anything you like (say, R) and
then get R→φ, by using PL1 and modus ponens; that's what I did in lines 3
and 4.
As you can see, the proofs are getting harder. And they get harder still.
Fortunately, we will be able to develop some machinery to make them easier;
but that will need to wait for a couple of sections.
Exercise 2.4 Establish each of the following facts. For these problems,
do not use the "toolkit" assembled below; construct the
axiomatic proofs "from scratch". However, you may use a fact you
prove in an earlier problem in later problems.
a) ⊢ P→P
b) ⊢ (∼P→P)→P
c) ∼∼P ⊢ P
2.7 Soundness of PL and proof by induction
Note: the next three sections are more difficult than the preceding sections,
and may be skipped without much loss. If you decide to work through the
more difficult sections dealing with metalogic later in the book (for example
sections 6.5 and 6.6), you might first return to these sections.
In this chapter we have taken both a proof-theoretic and a semantic approach
to propositional logic. In each case, we introduced formal notions of logical
truth and logical consequence. For the semantic approach, these notions
involved truth in PL-interpretations. For the proof-theoretic approach, we
considered two formal definitions, one involving sequent proofs, the other
involving axiomatic proofs.
An embarrassment of riches! We have multiple formal accounts of our
logical notions. But in fact, it can be shown that all three of our definitions yield
exactly the same results. Here I'll prove this just for the notion of a theorem
(last line of an axiomatic proof) and the notion of a valid formula (true in all
PL-interpretations). I'll do this by proving the following two statements:
Soundness of PL: Every PL-theorem is PL-valid
Completeness of PL: Every PL-valid wff is a PL-theorem
Soundness is pretty easy to prove; we'll do that in a moment. Completeness is
harder; we'll prove that in section 2.9. Soundness and completeness together
tell us that PL-validity and PL-theoremhood exactly coincide.
But first a short detour: we need to introduce a method of proof that
is ubiquitous throughout metalogic (as well as mathematics generally), the
method of induction. The basic idea, in its simplest form, is this. Suppose we
have infinitely many objects lined up like this:
[Figure: a row of dots extending indefinitely to the right.]
And suppose we want to show that each of these objects has a certain property.
How to do it?
The method of induction directs us to proceed in two steps. First, show
that the first object has the property:
[Figure: the same row of dots, with the first dot circled.]
This is called the "base case" of the inductive proof. Next, show that quite
generally, whenever one object in the line has the property, then the next must
have the property as well. This is called the "inductive step" of the proof. The
method of induction then says: if you've established those two things, you can
go ahead and conclude that all the objects in the line have the property. Why
is this conclusion justified? Well, since the first object has the property, the
second object must have the property as well, given the inductive step:
[Figure: the same row of dots, with the first two dots circled.]
But then another application of the inductive step tells us that the third object
has the property as well:
[Figure: the same row of dots, with the first three dots circled.]
And so on; all objects in the line have the property:
[Figure: the same row of dots, with every dot circled.]
That is how induction works when applied to objects lined up in the manner
depicted: there is a first object in line; after each object there is exactly one
further object; and each object appears some finite number of jumps after the
first object. Induction can also be applied to objects structured in different
ways. Consider, for example, the following infinite grid of objects:
[Figure: an infinite grid of dots, three dots per level, continuing upward
indefinitely; lines join each pair of dots on one level to a dot on the next level up.]
At the bottom of this grid there are three dots. Every pair of these three dots
combines to produce one new dot. (For example, the leftmost dot on the second
from the bottom level is produced by the leftmost two dots on the bottom
level.) The resulting three dots (formed from the three pairs drawn from the
three dots on the bottom level) form the second level of the grid. These three
dots on the second level produce the third level in the same way, and so on.
Suppose, now, that one could prove that the bottom three dots have some
property:
[Figure: the same grid, with the three dots on the bottom level circled.]
(This is the "base case".) And suppose further that one could prove that
whenever two dots with the property combine, the resulting dot also has the property
("inductive step"). Then, just as in the previous example, induction allows us
to conclude that all the dots in the grid have the property. Given the base case
and the inductive step, we know that the dots on the second level of the grid
have the property:
[Figure: the same grid, with the dots on the bottom two levels circled.]
But then, given the inductive step, we know that the dots on the third level
have the property. And so on, for all the other levels.
In general, induction is a method for proving that each member of a certain
collection of objects has a property. It works when (but only when) each object
in the collection results from some "starting objects" by a finite number of
iterations of some "operations". In the base case one proves that the starting
objects have the property; in the induction step one proves that the operations
preserve the property, in the sense that whenever one of the operations is applied
to some objects with the property, the resulting new object has the property as
well; and finally one concludes that all objects have the property.
This idea manifests itself in logic in a few ways. One is in a style of proof
sometimes called "induction on formula construction" (or: induction "on the
number of connectives" of the formula). Suppose we want to establish that
absolutely every wff has a certain property, p. The method of proof by induction
on formula construction tells us to first establish the following two claims:
b) every atomic wff (i.e. every sentence letter) has property p
i) for any wffs φ and ψ, if both φ and ψ have property p, then the wffs ∼φ
and φ→ψ also have property p
Once these are established, proof by induction allows us to conclude that every
wff has property p. Why is this conclusion justified? Recall the definition
of a wff from section 2.1: each wff is built up from atomic wffs by repeated
application of clause ii): "if φ and ψ are wffs then ∼φ and φ→ψ are also wffs".
So each wff is the culmination of a finite process that starts with atomic wffs and
continues by building conditionals and negations from wffs formed in previous
steps of the process. But claim b) (the base case) shows that the starting points
of this process all have property p. And claim i) (the induction step) shows that
the subsequent steps in this process preserve property p: if the formulas one has
built up so far have property p, then the next formula in the process (built up of
previous formulas using either → or ∼) is guaranteed to also have p. So all wffs
have property p. In terms of the general idea of inductive proof, the atomic
wffs are our "starting objects" (like the bottom three dots in the grid), and the
rules of grammar for ∼ and → which generate complex wffs from simpler wffs
are the "operations".
Here is a simple example of proof by induction on formula construction:
Proof that every wff contains a finite number of sentence letters. We are trying to
prove a statement of the form: every wff has property p. The property p
in this case is having a finite number of different sentence letters. Our proof has
two separate steps:
base case: here we must show that every atomic sentence has the property.
This is obvious: atomic sentences are just sentence letters, and each of them
contains one sentence letter, and thus finitely many different sentence letters.
induction step: here we must show that if wffs φ and ψ have property p, then
so do ∼φ and φ→ψ. So we begin by assuming:
    formulas φ and ψ each have finitely many different sentence letters (ih)
This assumption is often called the "inductive hypothesis". And we must go on
to show that both ∼φ and φ→ψ have finitely many different sentence letters.
This, too, is easy. ∼φ has as many different sentence letters as does φ; since ih)
tells us that φ has finitely many, then so does ∼φ. As for φ→ψ, it has, at most,
n+m sentence letters, where n and m are the number of different sentence
letters in φ and ψ, respectively; ih) tells us that n and m are finite, and so n+m
is finite as well.
We've shown that every atomic formula has the property having a finite
number of different sentence letters; and we've shown that the property is inherited
by complex formulas built according to the formation rules. But every wff is
either atomic, or built from atomics by a finite series of applications of the
formation rules. Therefore, by induction, every wff has the property.
A different form of inductive proof is called for in the following proof of
soundness:
Proof of soundness for PL. Unlike the previous inductive proof, here we are not
trying to prove something of the form "Every wff has property p". Instead,
we're trying to prove something of the form "Every theorem has property p".
Nevertheless we can still use induction, only we need to use induction of a
slightly different sort from induction on formula construction. Consider: a
theorem is any line of a proof. And every line of every proof is the culmination
of a finite series of wffs, in which each member is either an axiom, or follows
from earlier lines by modus ponens. So the conditions are right for an inductive
proof. The "starting points" are the axioms; and the "operation" is the inference
of a new line from earlier lines using modus ponens. If we can show that the
starting points (axioms) have the property of validity, and that the operation
(modus ponens) preserves the property of validity, then we can conclude that
every wff in every proof (i.e., every theorem) has the property of validity.
This sort of inductive proof is called induction "on the proof of a formula" (or
induction "on the length of the formula's proof").
base case: here we need to show that every PL-axiom is valid. This is
tedious but straightforward. Take PL1, for example. Suppose for reductio
that some instance of PL1 is invalid, i.e., for some PL-interpretation I,
V_I(φ→(ψ→φ)) = 0. Thus, V_I(φ) = 1 and V_I(ψ→φ) = 0. Given the latter,
V_I(φ) = 0: contradiction. Analogous proofs can be given that instances of
PL2 and PL3 are also valid (exercise 2.5).
induction step: here we begin by assuming that every line in a proof up to
a certain point is valid (the "inductive hypothesis"); we then show that if one
adds another line that follows from earlier lines by the rule modus ponens, that
line must be valid too. I.e., we're trying to show that "modus ponens preserves
validity". So, assume the inductive hypothesis: that all the earlier lines in the
proof are valid. And now, consider the result of applying modus ponens. That
means that the new line we've added to the proof is some formula ψ, which
we've inferred from two earlier lines that have the forms φ→ψ and φ. We
must show that ψ is a valid formula, i.e., is true in every interpretation. So
let I be any interpretation. By the inductive hypothesis, all earlier lines in
the proof are valid, and hence both φ→ψ and φ are valid. Thus, V_I(φ) = 1
and V_I(φ→ψ) = 1. But if V_I(φ) = 1 then V_I(ψ) can't be 0, for if it were, then
V_I(φ→ψ) would be 0, and it isn't. Thus, V_I(ψ) = 1.
(If our system had included rules other than modus ponens, we would have
needed to show that they too preserve validity. The paucity of rules in axiomatic
systems makes the construction of proofs within those systems a real pain in
the neck, but now we see how it makes metalogical life easier.)
We've shown that the axioms are valid, and that modus ponens preserves
validity. All theorems are generated from the axioms via modus ponens in a
finite series of steps. So, by induction, every theorem is valid.
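Both halves of the soundness proof can be spot-checked by brute force, since the truth value of an instance of a schema in an interpretation depends only on the truth values of the wffs put in for φ, ψ, and χ. The Python sketch below runs through all eight combinations of those values. It is a sanity check under an assumed tuple encoding, with hypothetical names, and is no substitute for the arguments themselves.

```python
from itertools import product

# Assumed encoding: ("imp", a, b) is a→b, ("not", a) is ∼a; the strings
# "phi", "psi", "chi" stand for the metavariables φ, ψ, χ.
def imp(a, b):
    return ("imp", a, b)

PL1 = imp("phi", imp("psi", "phi"))
PL2 = imp(imp("phi", imp("psi", "chi")), imp(imp("phi", "psi"), imp("phi", "chi")))
PL3 = imp(imp(("not", "psi"), ("not", "phi")), imp(imp(("not", "psi"), "phi"), "psi"))

def value(schema, v):
    """Truth value (0 or 1) of a schema, given values v for its metavariables."""
    if isinstance(schema, str):
        return v[schema]
    if schema[0] == "not":
        return 1 - value(schema[1], v)
    # a conditional is 0 exactly when its antecedent is 1 and its consequent is 0
    return 0 if value(schema[1], v) == 1 and value(schema[2], v) == 0 else 1

rows = [dict(zip(("phi", "psi", "chi"), bits)) for bits in product((0, 1), repeat=3)]

# Base case: each schema gets value 1 whatever the values of φ, ψ, and χ.
axioms_ok = all(value(s, row) == 1 for s in (PL1, PL2, PL3) for row in rows)

# Induction step: whenever φ and φ→ψ are both 1, ψ is 1 as well.
mp_ok = all(row["psi"] == 1 for row in rows
            if row["phi"] == 1 and value(imp("phi", "psi"), row) == 1)

print(axioms_ok, mp_ok)   # True True
```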
One nice thing about soundness is that it lets us establish facts of
unprovability. Soundness says "if ⊢ φ then ⊨ φ". Equivalently, it says: "if ⊭ φ then
⊬ φ". So, to show that something isn't a theorem, it suffices to show that it
isn't valid. Consider, for example, the formula (P→Q)→(Q→P). There exist
PL-interpretations in which the formula is false, namely, PL-interpretations in
which P is 0 and Q is 1. So, (P→Q)→(Q→P) is not valid (since it's not true
in all PL-interpretations). But then soundness tells us that it isn't a theorem
either. In general: given soundness, in order to show that a formula isn't a
theorem, all you need to do is find an interpretation in which it isn't true.
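Finding such an interpretation is a finite search, since only the sentence letters in the formula matter. Here is a minimal Python sketch of that search, assuming the same illustrative encoding (strings for sentence letters, ("not", a) and ("imp", a, b) for ∼a and a→b); the names letters, value, and countermodels are hypothetical.

```python
from itertools import product

# Assumed encoding: sentence letters are strings, ("not", a) is ∼a,
# ("imp", a, b) is a→b. An interpretation is a dict from letters to 0 or 1.

def letters(wff):
    if isinstance(wff, str):
        return {wff}
    return set().union(*(letters(part) for part in wff[1:]))

def value(wff, interp):
    if isinstance(wff, str):
        return interp[wff]
    if wff[0] == "not":
        return 1 - value(wff[1], interp)
    return 0 if value(wff[1], interp) == 1 and value(wff[2], interp) == 0 else 1

def countermodels(wff):
    """All assignments to the letters of wff under which wff is false."""
    names = sorted(letters(wff))
    rows = (dict(zip(names, bits)) for bits in product((0, 1), repeat=len(names)))
    return [row for row in rows if value(wff, row) == 0]

wff = ("imp", ("imp", "P", "Q"), ("imp", "Q", "P"))   # (P→Q)→(Q→P)
print(countermodels(wff))                             # [{'P': 0, 'Q': 1}]
```

A non-empty result shows that the formula is not valid, and so, by soundness, not a theorem. An empty result shows only that the formula is valid; that it is therefore a theorem is what completeness, not soundness, would tell us.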
Before we leave this section, let me reiterate the distinction between the
two types of induction most commonly used in metalogic. Induction on the
proof of a formula (the type of induction used to establish soundness) is used
when one is establishing a fact of the form: every theorem has a certain property
p. Here the base case consists of showing that the axioms have the property p,
and the inductive step consists of showing that the rules of inference preserve
p; i.e., in the case of modus ponens: that if φ and φ→ψ both have property
p then so does ψ. (Induction on proofs can also be used to show that all wffs
provable from a given set Γ have a given property; in that case the base case
would also need to include a demonstration that all members of Γ have the
property.) Induction on formula construction (the type of induction used to
show that all formulas have finitely many sentence letters), on the other hand,
is used when one is trying to establish a fact of the form: every formula has
a certain property p. Here the base case consists of showing that all sentence
letters have property p; and the inductive step consists of showing that the
rules of formation preserve p; i.e., that if φ and ψ both have property p, then
both (φ→ψ) and ∼φ also will have property p.
If you're ever proving something by induction, it's important to identify
what sort of inductive proof you're constructing. What are the entities you're
dealing with? What is the property p? What are the starting points, and what
are the operations generating new entities from the starting points? If you're
trying to construct an inductive proof and get stuck, you should return to these
questions and make sure you're clear about their answers.
Exercise 2.5 Finish the soundness proof by showing that all
instances of axiom schemas PL2 and PL3 are valid.
Exercise 2.6 Consider the following (strange) system of
propositional logic. The definition of wffs is the same as for standard
propositional logic, and the rules of inference are the same (just one
rule: modus ponens); but the axioms are different. For any wffs φ
and ψ, the following are axioms:
    (φ→ψ)→(ψ→φ)
Establish the following two facts about this system: (a) every
theorem of this system has an even number of "∼"s; (b) soundness is
false for this system, i.e., some theorems are not valid formulas.
Exercise 2.7 Show by induction that the truth value of a wff
depends only on the truth values of its sentence letters. That is,
show that for any wff φ and any PL-interpretations I and I', if
I(α) = I'(α) for each sentence letter α in φ, then V_I(φ) = V_I'(φ).
Exercise 2.8** Suppose that a wff φ has no repetitions of sentence
letters (i.e., each sentence letter occurs at most once in φ). Show
that φ is not PL-valid.
Exercise 2.9 Prove "strong soundness": for any set of formulas, Γ,
and any formula φ, if Γ ⊢ φ then Γ ⊨ φ (i.e., if φ is provable from
Γ then φ is a semantic consequence of Γ).
Exercise 2.10** Prove the soundness of the sequent calculus. That
is, show that if Γ ⇒ φ is a provable sequent, then Γ ⊨ φ. (No need
to go through each and every detail of the proof once it becomes
repetitive.)
2.8 PL proofs and the deduction theorem
Before attempting to prove completeness we need to get better at establishing
theoremhood. And the way to do that is to assemble a "toolkit": a collection
of techniques for doing bits of proofs, techniques that are applicable in a wide
range of situations. These techniques will both save time and make proofs
easier to construct.
To assemble the toolkit, we'll need to change our focus from constructing
proofs to constructing proof schemas. Recall the proof of the formula
(R→P)→(R→(Q→P)) from section 2.6:
1. [R→(P→(Q→P))]→[(R→P)→(R→(Q→P))]      PL2
2. P→(Q→P)                              PL1
3. [P→(Q→P)]→[R→(P→(Q→P))]              PL1
4. R→(P→(Q→P))                          2, 3, MP
5. (R→P)→(R→(Q→P))                      1, 4, MP
Consider the result of replacing the sentence letters P, Q, and R in this proof
with metalinguistic variables ψ, χ, and φ, respectively:
1. [φ→(ψ→(χ→ψ))]→[(φ→ψ)→(φ→(χ→ψ))]      PL2
2. ψ→(χ→ψ)                              PL1
3. [ψ→(χ→ψ)]→[φ→(ψ→(χ→ψ))]              PL1
4. φ→(ψ→(χ→ψ))                          2, 3, MP
5. (φ→ψ)→(φ→(χ→ψ))                      1, 4, MP
Given our official definition, this does not count as a proof: proofs must be made
up of wffs, and the symbols φ, ψ, and χ can't occur in wffs. But it becomes
a proof if we substitute in wffs for φ, ψ, and χ. (As with the construction of
axioms, the substitution must be uniform. Uniform throughout the proof,
in fact: each Greek letter must be changed to the same wff throughout the
proof.) So let's call it a proof schema: a proof schema of the wff schema
(φ→ψ)→(φ→(χ→ψ)) (call this latter schema "weakening the consequent").
The existence of this proof schema shows that each instance of weakening the
consequent is a theorem.
A proof schema is more useful than a proof because it shows that any instance of a certain schema can be proved. Suppose you're laboring away on a proof, and you find that you need $(P \to \sim P) \to [P \to ((R \to R) \to \sim P)]$ to complete the proof. This wff is an instance of weakening the consequent. So you know that you can construct a five-line proof of it anytime you like, by beginning with the proof schema of weakening the consequent, and substituting $P$ for $\phi$, $\sim P$ for $\psi$, and $R \to R$ for $\chi$. Instead of actually inserting those five lines into your proof, why not instead just write down the line:
i. $(P \to \sim P) \to [P \to ((R \to R) \to \sim P)]$   weakening the consequent
? You know that you could always replace this line, if you wanted to, with the five-line proof.
Citing previously proved theorem schemas saves time and writing. Let's introduce another time-saving practice: that of doing two or more steps at once. We'll allow ourselves to do this, and annotate in some perspicuous way, when it's reasonably obvious what the skipped steps are. For example, let's rewrite the proof of the weakening-the-consequent schema thus:
1. $\psi \to (\chi \to \psi)$   PL1
2. $\phi \to (\psi \to (\chi \to \psi))$   PL1, 1, MP
3. $(\phi \to \psi) \to (\phi \to (\chi \to \psi))$   PL2, 2, MP
So the first tools in our toolkit are the weakening the consequent schema and doing multiple steps at once. Once the kit is full, we'll try to reduce a given problem to a few chunks, each of which can be accomplished by citing a tool from the kit.
Notice that as soon as we start using the toolkit, the proofs we construct cease to be official proofs: not every line will be either an axiom or premise or follow from earlier lines by MP. They will instead be informal proofs, or proof sketches. A proof sketch is in essence a metalogic proof to the effect that there exists some proof or other of the desired type. It is a blueprint that an ambitious reader could always use to construct an official proof, by filling in the details.
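We're now ready to make a more significant addition to our toolkit. Suppose we already have $\phi \to \psi$ and $\phi \to (\psi \to \chi)$. The following technique then shows us how to move to $\phi \to \chi$. Let's call it the "MP technique", since it lets us do modus ponens "within the consequent of the conditional $\phi \to$":
1. $\phi \to \psi$
2. $\phi \to (\psi \to \chi)$
3. $(\phi \to (\psi \to \chi)) \to ((\phi \to \psi) \to (\phi \to \chi))$   PL2
4. $(\phi \to \psi) \to (\phi \to \chi)$   2, 3, MP
5. $\phi \to \chi$   1, 4, MP
In effect we have given a metalogic proof of the following fact: for any wffs $\phi$, $\psi$, and $\chi$: $\phi \to \psi,\ \phi \to (\psi \to \chi) \vdash \phi \to \chi$.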
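Let's add a "meta-tool" to the kit:
Cut: If $\Gamma_1 \vdash \delta_1, \ldots, \Gamma_n \vdash \delta_n$, and $\Sigma, \delta_1 \ldots \delta_n \vdash \phi$, then $\Gamma_1 \ldots, \Gamma_n, \Sigma \vdash \phi$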
Think of Cut as saying that one can "cut out the middleman". Suppose $\Gamma_1 \ldots \Gamma_n$ lead to some intermediate conclusions, $\delta_1 \ldots \delta_n$ (the middleman). And suppose one can go from those intermediate conclusions to some ultimate conclusion $\phi$ (perhaps with the help of some auxiliary premises $\Sigma$). Then, Cut says, you can go directly from $\Gamma_1 \ldots \Gamma_n$ to the ultimate conclusion $\phi$ (with the help of $\Sigma$ if needed). I call this a meta-tool because it facilitates use of other tools in the kit. For example, suppose you know that $\Gamma_1 \vdash P \to Q$ and $\Gamma_2 \vdash P \to (Q \to R)$. We know from the MP technique that $P \to Q,\ P \to (Q \to R) \vdash P \to R$. Cut then tells us that $\Gamma_1, \Gamma_2 \vdash P \to R$ ($\delta_1$ is $P \to Q$, $\delta_2$ is $P \to (Q \to R)$; $\Sigma$ is null in this case).
Proof of Cut. We are given that there exists a proof $A_i$ of $\delta_i$ from $\Gamma_i$, for $i = 1 \ldots n$, and that there exists a proof $B$ of $\phi$ from $\Sigma, \delta_1 \ldots \delta_n$. Let $C$ be the result of concatenating all these proofs, in that order. That is, $C$ begins with a first phase, consisting of the formulas of proof $A_1$, followed by the formulas of proof $A_2$, and so on, finishing with the formulas of proof $A_n$. Then, in the second phase, $C$ concludes with the formulas of proof $B$. The last formula of $C$ is the last formula of $B$, namely, $\phi$. So all we need to show is that $C$ counts as a proof from $\Gamma_1 \ldots, \Gamma_n, \Sigma$, that is, that each line of $C$ is either an axiom, a member of $\Gamma_1$, or of $\Gamma_2$, ..., or of $\Gamma_n$, or of $\Sigma$, or follows from earlier lines in $C$ by MP. For short, we must show that each line of $C$ is "legit".
Clearly, each line $j$ of the first phase of $C$ is legit: $j$ is from one of the $A_i$ segments; $A_i$ is a proof from $\Gamma_i$; so the formula on line $j$ is either an axiom, a member of $\Gamma_i$, or follows from earlier lines in that $A_i$ segment by MP.
Consider, finally, the second phase of $C$, namely, the $B$ portion. Since $B$ is a proof from $\Sigma, \delta_1 \ldots \delta_n$, the formula on any line $j$ here is either i) an axiom, ii) a member of $\Sigma$, iii) one of the $\delta_i$s, or iv) follows from earlier lines of the $B$ portion by MP. Line $j$ is clearly legit in cases i), ii), and iv). In case iii), the formula on line $j$ is some $\delta_i$. But $\delta_i$ also occurred in the first phase of $C$, as the last line, $k$, of the $A_i$ portion. So $\delta_i$ is either an axiom, or a member of $\Gamma_i$, or follows from earlier lines in the $A_i$ portion (which are before $k$) by MP. In either of the first two cases, line $j$ is legit; and it's also legit in the last case because lines before $k$ in $C$ are also lines before $j$.
We're now ready for the most important addition to our toolkit: the deduction theorem. As you have been learning (perhaps to your dismay), constructing axiomatic proofs is much harder than constructing sequent proofs. It's hard to prove things when you're not allowed to reason with assumptions! Nevertheless, one can prove a metalogical theorem about our axiomatic system that is closely related to one method of reasoning with assumptions, namely conditional proof:
Deduction theorem for PL: If $\Gamma, \phi \vdash_{PL} \psi$, then $\Gamma \vdash_{PL} \phi \to \psi$
That is: whenever there exists a proof from ($\Gamma$ and) $\{\phi\}$ to $\psi$, then there also exists a proof of $\phi \to \psi$ (from $\Gamma$).
Suppose we want to prove $\phi \to \psi$. Our axiomatic system does not allow us to assume $\phi$ in a conditional proof of $\phi \to \psi$. But once we've proved the deduction theorem, we'll be able to do the next best thing. Suppose we succeed in constructing a proof of $\psi$ from $\{\phi\}$. That is, we write down a proof in which each line is either i) a member of $\{\phi\}$ (that is, $\phi$ itself), or ii) an axiom, or iii) follows from earlier lines in the proof by modus ponens. The deduction theorem then lets us conclude that some proof of $\phi \to \psi$ exists. We won't have constructed such a proof ourselves; we only constructed the proof from $\phi$ to $\psi$. Nevertheless the deduction theorem assures us that it exists. More generally, whenever we can construct a proof of $\psi$ from $\phi$ plus some other premises (the formulas in some set $\Gamma$), then the deduction theorem assures us that some proof of $\phi \to \psi$ from those other premises also exists.
Proof of deduction theorem. Suppose $\Gamma \cup \{\phi\} \vdash \psi$. Thus there exists some proof, $A$, from $\Gamma \cup \{\phi\}$ to $\psi$. Each line $\alpha_i$ of $A$ is either a member of $\Gamma \cup \{\phi\}$, an axiom, or follows from earlier lines in the proof by MP; the last line of $A$ is $\psi$. Our strategy will be to establish that:
for each $\alpha_i$ in proof $A$, $\Gamma \vdash \phi \to \alpha_i$   (*)
We already know that each line of proof $A$ is provable from $\Gamma \cup \{\phi\}$; what (*) says is that if you stick "$\phi \to$" in front of any of those lines, the result is provable from $\Gamma$ all by itself. Once we succeed in establishing (*) then we will have proved the deduction theorem. For since the last line of proof $A$ is $\psi$, (*) tells us that $\phi \to \psi$ is provable from $\Gamma$.
(*) says that each line of proof $A$ has a certain property, namely, the property of: being provable from $\Gamma$ when prefixed with "$\phi \to$". Just as in the proof of soundness, this calls for the method of proof by induction, and in particular, induction on $\alpha_i$'s proof. Here goes.
What we're going to do is show that whenever a line is added to proof $A$, then it has the property (provided, that is, that all earlier lines in the proof have the property). There are three cases in which a line $\alpha_i$ could have been added to proof $A$. The first case is where $\alpha_i$ is an axiom. We must show that $\alpha_i$ has the property, that is, show that $\Gamma \vdash \phi \to \alpha_i$. Well, consider this:
1. $\alpha_i$   axiom
2. $\phi \to \alpha_i$   PL1, 1, MP
This is a proof (sketch) of $\phi \to \alpha_i$ from $\Gamma$. It's true that we didn't actually use any members of $\Gamma$ in the proof, but that's OK. If you look back at the definition of a proof from a set, you'll see that this counts officially as a proof from $\Gamma$.
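The second case in which a line $\alpha_i$ could have been added to proof $A$ is where $\alpha_i$ is a member of $\Gamma \cup \{\phi\}$. This subdivides into two subcases. The first is where $\alpha_i$ is $\phi$ itself. Here, $\phi \to \alpha_i$ is $\phi \to \phi$, which can be proved from no premises at all using the method of exercise 2.4a; so $\Gamma \vdash \phi \to \phi$. The second subcase is where $\alpha_i \in \Gamma$. But here we can prove $\phi \to \alpha_i$ from $\Gamma$ as follows:
1. $\alpha_i$   premise
2. $\phi \to \alpha_i$   PL1, 1, MP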
The first two cases were "base" cases of our inductive proof, because we didn't need to assume anything about earlier lines in proof $A$. The third case in which a line $\alpha_i$ could have been added to proof $A$ leads us to the inductive part of our proof: the case in which $\alpha_i$ follows from two earlier lines of the proof by MP. Here we simply assume that those earlier lines of the proof have the property we're interested in (this assumption is the inductive hypothesis; the property, recall, is: being provable from $\Gamma$ when prefixed with "$\phi \to$") and we show that $\alpha_i$ has the property as well.
So: we're considering the case where $\alpha_i$ follows from earlier lines in the proof by modus ponens. That means that the earlier lines have to have the forms $\chi \to \alpha_i$ and $\chi$. Furthermore, the inductive hypothesis tells us that the result of prefixing either of these earlier lines with "$\phi \to$" is provable from $\Gamma$. Thus, $\Gamma \vdash \phi \to (\chi \to \alpha_i)$ and $\Gamma \vdash \phi \to \chi$. But then, given the MP technique and Cut, $\Gamma \vdash \phi \to \alpha_i$.
Thus, in all three cases, whenever $\alpha_i$ was added to proof $A$, there always existed some proof of $\phi \to \alpha_i$ from $\Gamma$. By induction, (*) is established; and this in turn completes the proof of the deduction theorem.
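The proof just given is constructive: it tells us, line by line, how to convert a proof of $\psi$ from $\Gamma \cup \{\phi\}$ into a proof of $\phi \to \psi$ from $\Gamma$. The following Python sketch carries out that conversion. It is only an illustration under stated assumptions: wffs are encoded as nested tuples, the input proof must come annotated with justifications, and every name (`deduce`, `is_proof`, and so on) is hypothetical.

```python
def imp(a, b):
    return ("->", a, b)

def neg(a):
    return ("~", a)

def match(schema, wff, env):
    # Strings in a schema are schematic variables, bound uniformly via env.
    if isinstance(schema, str):
        return env.setdefault(schema, wff) == wff
    if isinstance(wff, str) or schema[0] != wff[0]:
        return False
    return all(match(s, w, env) for s, w in zip(schema[1:], wff[1:]))

AXIOMS = (
    imp("a", imp("b", "a")),                                           # PL1
    imp(imp("a", imp("b", "c")), imp(imp("a", "b"), imp("a", "c"))),   # PL2
    imp(imp(neg("b"), neg("a")), imp(imp(neg("b"), "a"), "b")),        # PL3
)

def is_proof(lines, premises=()):
    # Each line must be an axiom, a premise, or follow from earlier lines by MP.
    for k, wff in enumerate(lines):
        earlier = lines[:k]
        if not (any(match(ax, wff, {}) for ax in AXIOMS) or wff in premises
                or any(imp(a, wff) in earlier for a in earlier)):
            return False
    return True

def deduce(proof, phi):
    """proof: list of (wff, why); why is "axiom", "premise" (a member of Gamma),
    "hyp" (the wff is phi itself), or ("mp", i, j), meaning the wff follows from
    line i and the conditional on line j (0-based indices).
    Returns the lines of a proof of  phi -> (last wff)  from Gamma alone."""
    out = []
    for wff, why in proof:
        goal = imp(phi, wff)
        if why == "hyp":                                # phi -> phi, from PL2 and PL1
            a = imp(phi, imp(imp(phi, phi), phi))       # PL1
            b = imp(phi, imp(phi, phi))                 # PL1
            out += [imp(a, imp(b, goal)), a, imp(b, goal), b, goal]
        elif why in ("axiom", "premise"):
            out += [wff, imp(wff, goal), goal]          # the line, PL1, MP
        else:                                           # MP: the "MP technique"
            _, i, _ = why
            chi = proof[i][0]
            step = imp(imp(phi, chi), goal)
            out += [imp(imp(phi, imp(chi, wff)), step), step, goal]
    return out

P, Q, R = "P", "Q", "R"
gamma = [imp(P, Q), imp(Q, R)]
proof = [(imp(P, Q), "premise"), (imp(Q, R), "premise"), (P, "hyp"),
         (Q, ("mp", 2, 0)), (R, ("mp", 3, 1))]
result = deduce(proof, P)
```

In this example `result` ends with the encoding of $P \to R$ and passes `is_proof(result, gamma)`, so it is an official proof from $\{P \to Q, Q \to R\}$ that never uses $P$ as a premise.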
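Once we've got the deduction theorem for PL in our toolkit, we can really get going. For we can now, in effect, use conditional proof. As an illustration, I'll show how to use the deduction theorem to establish that: $\phi \to \psi,\ \psi \to \chi \vdash \phi \to \chi$. That is: conditionals are transitive (a useful addition to the toolkit). Consider the following proof schema:
1. $\phi \to \psi$   premise
2. $\psi \to \chi$   premise
3. $\phi$   premise
4. $\psi$   1, 3, MP
5. $\chi$   2, 4, MP
This is a proof of $\chi$ from the set $\{\phi \to \psi, \psi \to \chi, \phi\}$. Thus, $\phi \to \psi,\ \psi \to \chi,\ \phi \vdash \chi$. The deduction theorem then tells us that $\phi \to \psi,\ \psi \to \chi \vdash \phi \to \chi$. That's all it takes! This is much easier than constructing from scratch a proof of $\phi \to \chi$ from $\phi \to \psi$ and $\psi \to \chi$.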
Let's call this last addition to the toolkit, the fact that $\phi \to \psi,\ \psi \to \chi \vdash \phi \to \chi$, "transitivity". (As with the MP technique, it's a metalogical theorem.)
The transitivity schema tells us that certain wffs are provable from certain other wffs. It does not tell us that certain wffs are theorems. That is, it's not a theorem schema. However, there is a theorem schema corresponding to transitivity: $(\phi \to \psi) \to [(\psi \to \chi) \to (\phi \to \chi)]$. The theoremhood of this schema follows immediately from the transitivity schema via two applications of the deduction theorem. In general, if the toolkit includes a "provability-from" schema $\phi_1 \ldots \phi_n \vdash \psi$ rather than the corresponding theorem schema $\vdash \phi_1 \to (\phi_2 \to \ldots (\phi_n \to \psi))$, one can always infer the existence of the latter, if one wants it, by using the deduction theorem repeatedly.
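Example 2.11: More additions to the toolkit:
$\sim\psi \to \sim\phi \vdash \phi \to \psi$ ("contraposition 1"):
The following proof shows that $\sim\psi \to \sim\phi,\ \phi \vdash \psi$:
1. $\sim\psi \to \sim\phi$   premise
2. $\phi$   premise
3. $\sim\psi \to \phi$   PL1, 2, MP
4. $\psi$   PL3, 1, MP, 3, MP
The desired result then follows by the deduction theorem.
$\phi \to \psi \vdash \sim\psi \to \sim\phi$ ("contraposition 2"):
1. $\phi \to \psi$   premise
2. $\psi \to \sim\sim\psi$   exercise 2.11d
3. $\sim\sim\phi \to \phi$   exercise 2.11c
4. $\sim\sim\phi \to \sim\sim\psi$   3, 1, 2, transitivity
5. $\sim\psi \to \sim\phi$   4, contraposition 1
$\phi,\ \sim\phi \vdash \psi$ ("ex falso quodlibet"):
1. $\phi$   premise
2. $\sim\phi$   premise
3. $\sim\psi \to \phi$   PL1, 1, MP
4. $\sim\psi \to \sim\phi$   PL1, 2, MP
5. $\psi$   PL3, 4, MP, 3, MP
$\sim(\phi \to \psi) \vdash \phi$ and $\sim(\phi \to \psi) \vdash \sim\psi$ ("negated conditional")
To demonstrate the first: by two applications of the deduction theorem to ex falso quodlibet, we know that $\vdash \sim\phi \to (\phi \to \psi)$. So, begin a proof with a proof of this wff, and then continue as follows:
1. $\sim\phi \to (\phi \to \psi)$
2. $\sim(\phi \to \psi) \to \sim\sim\phi$   1, contraposition 2
3. $\sim(\phi \to \psi)$   premise
4. $\sim\sim\phi$   2, 3, MP
5. $\phi$   4, exercise 2.4c
As for the second:
1. $\psi \to (\phi \to \psi)$   PL1
2. $\sim(\phi \to \psi) \to \sim\psi$   1, contraposition 2
3. $\sim(\phi \to \psi)$   premise
4. $\sim\psi$   2, 3, MP
$\phi \to \psi,\ \sim\phi \to \psi \vdash \psi$ ("excluded middle MP")
1. $\phi \to \psi$   premise
2. $\sim\phi \to \psi$   premise
3. $\sim\psi \to \sim\phi$   1, contraposition 2
4. $\sim\psi \to \sim\sim\phi$   2, contraposition 2
5. $\sim\psi \to \phi$   4, exercise 2.11c, transitivity
6. $\psi$   PL3, 3, MP, 5, MP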
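Exercise 2.11 Establish each of the following. You may use the toolkit, including the deduction theorem.
a) $\vdash \phi \to [(\phi \to \psi) \to \psi]$
b) $\vdash [\phi \to (\psi \to \chi)] \to [\psi \to (\phi \to \chi)]$ ("permutation")
c) $\vdash \sim\sim\phi \to \phi$ ("double-negation elimination")
d) $\vdash \phi \to \sim\sim\phi$ ("double-negation introduction")
Exercise 2.12 (Long.) Establish the axiomatic correctness of the rules of inference from our sequent system. For example, in the case of $\wedge$I, show that $\phi, \psi \vdash \phi \wedge \psi$, i.e., give an axiomatic proof of $\sim(\phi \to \sim\psi)$ from $\{\phi, \psi\}$. You may use the toolkit.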
2.9 Completeness of PL
We're finally ready for the completeness proof. We will give what is known as a "Henkin-proof", after Leon Henkin, who used similar methods to demonstrate completeness for (nonmodal) predicate logic. Most of the proof will consist of assembling various pieces: various definitions and facts. The point of these pieces will become apparent at the end, when we put them all together.
2.9.1 Maximal consistent sets of wffs
Let $\bot$ abbreviate $\sim(P \to P)$. (The idea of $\bot$ is that it stands for a generic contradiction. The choice of $\sim(P \to P)$ was arbitrary; all that matters is that $\bot$ is the negation of a theorem.) Here are the central definitions we'll need:
Definition of consistency and maximality:
A set of wffs, $\Gamma$, is inconsistent iff $\Gamma \vdash \bot$. $\Gamma$ is consistent iff it is not inconsistent
A set of wffs, $\Gamma$, is maximal iff for every wff $\phi$, either $\phi$ or $\sim\phi$ (or perhaps both) is a member of $\Gamma$
Intuitively: a maximal set is so large that it contains each formula or its negation; and a consistent set is one from which you can't prove a contradiction. Note the following lemmas:
Lemma 2.1 For any set of wffs $\Gamma$ and wff $\phi$, if $\phi$ is provable from $\Gamma$ then $\phi$ is provable from some finite subset of $\Gamma$. That is, if $\Gamma \vdash \phi$ then $\gamma_1 \ldots \gamma_n \vdash \phi$ for some $\gamma_1 \ldots \gamma_n \in \Gamma$ (or else $\vdash \phi$)
Proof. If $\Gamma \vdash \phi$ then there is some proof, $A$, of $\phi$ from $\Gamma$. Like every proof, $A$ is a finite series of wffs. Thus, only finitely many of $\Gamma$'s members can have occurred as lines in $A$. Let $\gamma_1 \ldots \gamma_n$ be those members of $\Gamma$. (If no member of $\Gamma$ occurs in $A$ then $A$ proves $\phi$ from no premises at all, in which case $\vdash \phi$.) In addition to counting as a proof of $\phi$ from $\Gamma$, proof $A$ is also a proof of $\phi$ from $\{\gamma_1 \ldots \gamma_n\}$. Thus, $\gamma_1 \ldots \gamma_n \vdash \phi$.
Lemma 2.2 For any set of wffs $\Gamma$, if $\Gamma \vdash \phi$ and $\Gamma \vdash \sim\phi$ for some $\phi$ then $\Gamma$ is inconsistent
Proof. Follows immediately from ex falso quodlibet (example 2.11) and Cut.
Note that the first lemma tells us that a set is inconsistent iff some finite subset of that set is inconsistent.
2.9.2 Maximal consistent extensions
Suppose we begin with a consistent set $\Delta$ that isn't maximal: for at least one wff $\phi$, $\Delta$ contains neither $\phi$ nor $\sim\phi$. Is there some way of adding wffs to $\Delta$ to make it maximal, without destroying its consistency? That is, is $\Delta$ guaranteed to have some maximal consistent "extension"? The following theorem tells us that the answer is yes:
Theorem 2.3 If $\Delta$ is a consistent set of wffs, then there exists some maximal consistent set of wffs, $\Gamma$, such that $\Delta \subseteq \Gamma$
Proof of Theorem 2.3. In outline, we're going to build up $\Gamma$ as follows. We're going to start by dumping all the formulas in $\Delta$ into $\Gamma$. Then we will go through all the wffs, $\phi_1, \phi_2, \ldots$, one at a time. For each wff, we're going to dump either it or its negation into $\Gamma$, depending on which choice would be consistent. After we're done, our set $\Gamma$ will obviously be maximal; it will obviously contain $\Delta$ as a subset; and, we'll show, it will also be consistent.
So, let $\phi_1, \phi_2, \ldots$ be a list (an infinite list, of course) of all the wffs.[10] To construct $\Gamma$, our strategy is to start with $\Delta$, and then go through this list one-by-one, at each point adding either $\phi_i$ or $\sim\phi_i$. Here's how we do this more carefully. We first define an infinite sequence of sets, $\Gamma_0, \Gamma_1, \ldots$:
$\Gamma_0 = \Delta$
$\Gamma_{n+1} = \Gamma_n \cup \{\phi_{n+1}\}$ if $\Gamma_n \cup \{\phi_{n+1}\}$ is consistent
$\Gamma_{n+1} = \Gamma_n \cup \{\sim\phi_{n+1}\}$ if $\Gamma_n \cup \{\phi_{n+1}\}$ is not consistent
This definition is recursive, notice. We begin with a noncircular definition of the first member of the sequence of sets, $\Gamma_0$, and after that, we define each subsequent member $\Gamma_{n+1}$ in terms of the previous member $\Gamma_n$: we add $\phi_{n+1}$ to $\Gamma_n$ if the result of doing so would be consistent; otherwise we add $\sim\phi_{n+1}$.
[10] We need to be sure that there is some way of arranging all the wffs into such a list. Here is one method. First, begin with a list of the primitive expressions of the language. In the case of PL this can be done as follows:
$($   $)$   $\sim$   $\to$   $P_1$   $P_2$   ...
1   2   3   4   5   6   ...
(For simplicity, get rid of all the sentence letters except for $P_1, P_2, \ldots$.) Since we'll need to refer to what position an expression has in this list, the positions of the expressions are listed underneath those expressions. (E.g., the position of the $\sim$ is 3.) Now, where $\phi$ is any wff, call the rating of $\phi$ the sum of the positions of the occurrences of its primitive expressions. (The rating for the wff $(P_1 \to P_1)$, for example, is 1 + 5 + 4 + 5 + 2 = 17.) We can now construct the listing of all the wffs of PL by an infinite series of stages: stage 1, stage 2, etc. In stage $n$, we append to our growing list all the wffs of rating $n$, in alphabetical order. The notion of alphabetical order here is the usual one, given the ordering of the primitive expressions laid out above. (E.g., just as "and" comes before "dna" in alphabetical order, since "a" precedes "d" in the usual ordering of the English alphabet, $(P_1 \to P_2)$ comes before $(P_2 \to P_1)$ in alphabetical order since $P_1$ comes before $P_2$ in the ordering of the alphabet of PL. Note that each of these wffs is inserted into the list in stage 18, since each has rating 18.) In stages 1 through 4 no wffs are added at all, since every wff must have at least one sentence letter and $P_1$ is the sentence letter with the smallest position. In stage 5 there is one wff: $P_1$. Thus, the first member of our list of wffs is $P_1$. In stage 6 there is one wff: $P_2$, so $P_2$ is the second member of the list. In every subsequent stage there are only finitely many wffs; so each stage adds finitely many wffs to the list; each wff gets added at some stage; so each wff eventually gets added after some finite amount of time to this list.
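The rating-based listing of wffs described in the footnote can be computed directly. The following Python sketch is one way to do it, offered only as an illustration: the ASCII spellings `~` and `->`, the encoding of wffs as tuples of primitive symbols, and the function names are assumptions of the sketch. It relies on the facts that the rating of $\sim\phi$ is 3 plus the rating of $\phi$, and that the rating of $(\phi \to \psi)$ is 7 plus the ratings of $\phi$ and $\psi$.

```python
from functools import lru_cache

FIXED = {"(": 1, ")": 2, "~": 3, "->": 4}       # positions of the non-letter symbols

def position(symbol):
    """Position of a primitive expression; the sentence letter Pk sits at 4 + k."""
    return FIXED[symbol] if symbol in FIXED else 4 + int(symbol[1:])

@lru_cache(maxsize=None)
def wffs_of_rating(n):
    """All wffs of rating n, each a tuple of symbols, in alphabetical order."""
    found = []
    if n >= 5:
        found.append((f"P{n - 4}",))            # the sentence letter with position n
    if n > 3:
        found += [("~",) + w for w in wffs_of_rating(n - 3)]
    for r in range(5, n - 11):                  # (phi -> psi): 7 + r + s = n, with r, s >= 5
        for left in wffs_of_rating(r):
            for right in wffs_of_rating(n - 7 - r):
                found.append(("(",) + left + ("->",) + right + (")",))
    # Alphabetical order: compare wffs symbol by symbol, by position.
    return tuple(sorted(found, key=lambda w: [position(s) for s in w]))

def listing(max_rating):
    """The list phi_1, phi_2, ... up to and including the given stage."""
    return ["".join(w) for n in range(1, max_rating + 1) for w in wffs_of_rating(n)]
```

For instance, `listing(8)` begins with `P1`, `P2`, `P3`; stage 8 then contributes `~P1` followed by `P4`, since `~` precedes every sentence letter in the alphabet.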
Next let's prove that each member in this sequence (that is, each $\Gamma_i$) is a consistent set. We do this inductively, by first showing that $\Gamma_0$ is consistent, and then showing that if $\Gamma_n$ is consistent, then so will be $\Gamma_{n+1}$. This is a different sort of inductive proof from what we've seen so far, neither an induction on formula construction nor on formula proof. Nevertheless we have the required structure for proof by induction: each of the objects of interest (the $\Gamma_i$s) is generated from a starting point ($\Gamma_0$) by a finite series of operations (the operation taking us from $\Gamma_n$ to $\Gamma_{n+1}$).
Base case: obviously, $\Gamma_0$ is consistent, since $\Delta$ was stipulated to be consistent.
Inductive step: we suppose that $\Gamma_n$ is consistent (inductive hypothesis), and then show that $\Gamma_{n+1}$ is consistent. Look at the definition of $\Gamma_{n+1}$. What $\Gamma_{n+1}$ gets defined as depends on whether $\Gamma_n \cup \{\phi_{n+1}\}$ is consistent. If $\Gamma_n \cup \{\phi_{n+1}\}$ is consistent, then $\Gamma_{n+1}$ gets defined as that very set $\Gamma_n \cup \{\phi_{n+1}\}$. So of course $\Gamma_{n+1}$ is consistent in that case.
The remaining possibility is that $\Gamma_n \cup \{\phi_{n+1}\}$ is inconsistent. In that case, $\Gamma_{n+1}$ gets defined as $\Gamma_n \cup \{\sim\phi_{n+1}\}$. So we must show that in this case, $\Gamma_n \cup \{\sim\phi_{n+1}\}$ is consistent. Suppose for reductio that it isn't. Then $\bot$ is provable from $\Gamma_n \cup \{\sim\phi_{n+1}\}$, and so given lemma 2.1 is provable from some finite subset of this set; and the finite subset must contain $\sim\phi_{n+1}$ since $\Gamma_n$ was consistent. Letting $\psi_1 \ldots \psi_m$ be the remaining members of the finite subset, we have, then: $\psi_1 \ldots \psi_m, \sim\phi_{n+1} \vdash \bot$, from which we get $\psi_1 \ldots \psi_m \vdash \sim\phi_{n+1} \to \bot$ by the deduction theorem. Since $\Gamma_n \cup \{\phi_{n+1}\}$ is inconsistent, similar reasoning tells us that $\chi_1 \ldots \chi_p \vdash \phi_{n+1} \to \bot$, for some $\chi_1 \ldots \chi_p \in \Gamma_n$. It then follows by "excluded middle MP" (example 2.11) and Cut that $\psi_1 \ldots \psi_m, \chi_1 \ldots \chi_p \vdash \bot$. Since $\psi_1 \ldots \psi_m, \chi_1 \ldots \chi_p$ are all members of $\Gamma_n$, this contradicts the fact that $\Gamma_n$ is consistent.
We have shown that all the sets in our sequence $\Gamma_i$ are consistent. Let us now define $\Gamma$ to be the union of all the sets in the infinite sequence, i.e., $\{\phi : \phi \in \Gamma_i \text{ for some } i\}$. We must now show that $\Gamma$ is the set we're after: that i) $\Delta \subseteq \Gamma$, ii) $\Gamma$ is maximal, and iii) $\Gamma$ is consistent.
Any member of $\Delta$ is a member of $\Gamma_0$ (since $\Gamma_0$ was defined as $\Delta$), hence is a member of one of the $\Gamma_i$s, and hence is a member of $\Gamma$. So $\Delta \subseteq \Gamma$.
Any wff is in the list of all the wffs somewhere, i.e., it is $\phi_i$ for some $i$. But by definition of $\Gamma_i$, either $\phi_i$ or $\sim\phi_i$ is a member of $\Gamma_i$; and so one of these is a member of $\Gamma$. $\Gamma$ is therefore maximal.
Suppose for reductio that $\Gamma$ is inconsistent. Given lemma 2.1, there exist $\psi_1 \ldots \psi_m \in \Gamma$ such that $\psi_1 \ldots \psi_m \vdash \bot$. By definition of $\Gamma$, each $\psi_i \in \Gamma_{j_i}$, for some $j_i$. Let $k$ be the largest of $j_1 \ldots j_m$. Given the way the $\Gamma_0, \Gamma_1, \ldots$ series is constructed, each set in the series is a subset of all subsequent ones. Thus, each of $\psi_1 \ldots \psi_m$ is a member of $\Gamma_k$, and thus $\Gamma_k$ is inconsistent. But we showed that each member of the series $\Gamma_0, \Gamma_1, \ldots$ is consistent.
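The step-by-step construction in this proof can be imitated on a finite scale. The Python sketch below is a loose illustration, not the construction itself: it runs through only a finite list of wffs, and it uses truth-table satisfiability as a stand-in for consistency (an assumption of the sketch; that the two notions agree for PL is exactly what soundness and completeness together establish). The tuple encoding and all function names are hypothetical.

```python
from itertools import product

def letters(w):
    """The sentence letters of a wff: a string, ("~", A), or ("->", A, B)."""
    return {w} if isinstance(w, str) else set().union(*(letters(x) for x in w[1:]))

def value(w, i):
    """Truth value (0 or 1) of w in the interpretation i, a dict from letters to 0/1."""
    if isinstance(w, str):
        return i[w]
    if w[0] == "~":
        return 1 - value(w[1], i)
    return max(1 - value(w[1], i), value(w[2], i))

def satisfiable(wffs):
    """Stand-in for consistency: some interpretation makes every member true."""
    names = sorted(set().union(*(letters(w) for w in wffs)))
    return any(all(value(w, dict(zip(names, vals))) for w in wffs)
               for vals in product((0, 1), repeat=len(names)))

def extend(delta, listing):
    """Add each listed wff if the result stays satisfiable, else add its negation."""
    gamma = list(delta)
    for phi in listing:
        gamma.append(phi if satisfiable(gamma + [phi]) else ("~", phi))
    return gamma

delta = [("->", "P", "Q"), ("~", "Q")]
gamma = extend(delta, ["P", "R"])
```

Starting from $\{P \to Q, \sim Q\}$, the sketch cannot add $P$ (that would be unsatisfiable), so it adds $\sim P$ instead, and then adds $R$.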
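2.9.3 Features of maximal consistent sets
Next we'll establish two facts about maximal consistent sets that we'll need for the completeness proof:
Lemma 2.4 Where $\Gamma$ is any maximal consistent set of wffs:
2.4a for any wff $\phi$, exactly one of $\phi$, $\sim\phi$ is a member of $\Gamma$
2.4b $\phi \to \psi \in \Gamma$ iff either $\phi \notin \Gamma$ or $\psi \in \Gamma$
Proof of Lemma 2.4a. Since $\Gamma$ is maximal it must contain at least one of $\phi$ or $\sim\phi$. But it cannot contain both; otherwise each would be provable from $\Gamma$, whence by lemma 2.2, $\Gamma$ would be inconsistent.
Proof of Lemma 2.4b. Suppose first that $\phi \to \psi$ is in $\Gamma$, and suppose for reductio that $\phi$ is in $\Gamma$ but $\psi$ is not. Then we can prove $\psi$ from $\Gamma$ (begin with $\phi$ and $\phi \to \psi$ as premises, and then use MP). But since $\psi \notin \Gamma$ and $\Gamma$ is maximal, $\sim\psi$ is in $\Gamma$, and hence is provable from $\Gamma$. Given lemma 2.2, this contradicts $\Gamma$'s consistency.
Suppose for the other direction that either $\phi \notin \Gamma$ or $\psi \in \Gamma$, and suppose for reductio that $\phi \to \psi \notin \Gamma$. Since $\Gamma$ is maximal, $\sim(\phi \to \psi) \in \Gamma$. Then $\Gamma \vdash \sim(\phi \to \psi)$, and so by "negated conditional" (example 2.11) and Cut, $\Gamma \vdash \phi$ and $\Gamma \vdash \sim\psi$. Now, if $\phi \notin \Gamma$ then $\sim\phi \in \Gamma$ and so $\Gamma \vdash \sim\phi$; and if on the other hand $\psi \in \Gamma$ then $\Gamma \vdash \psi$. Each possibility contradicts $\Gamma$'s consistency, given lemma 2.2.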
2.9.4 The proof
Now it's time to put together all the pieces that we've assembled.
Proof of PL completeness. Completeness says: if $\vDash \phi$ then $\vdash \phi$. We'll prove this by proving the equivalent statement: if $\nvdash \phi$ then $\nvDash \phi$. So, suppose that $\nvdash \phi$. We must construct some PL-interpretation in which $\phi$ isn't true.
Since $\nvdash \phi$, $\{\sim\phi\}$ must be consistent. For suppose otherwise. Then $\sim\phi \vdash \bot$; so $\vdash \sim\phi \to \bot$ by the deduction theorem. That is, given the definition of $\bot$: $\vdash \sim\phi \to \sim(P \to P)$. Then by contraposition 1 (example 2.11), $\vdash (P \to P) \to \phi$. But $\vdash P \to P$ (exercise 2.4a), and so $\vdash \phi$: contradiction.
Since $\{\sim\phi\}$ is consistent, theorem 2.3 tells us that it is a subset of some maximal consistent set of wffs $\Gamma$. Next, let's use $\Gamma$ to construct a somewhat odd PL-interpretation. This PL-interpretation decides whether a sentence letter is true or false by looking to see whether that sentence letter is a member of $\Gamma$. What we will do next is show that all formulas, not just sentence letters, are true in this odd interpretation iff they are members of $\Gamma$.
So, let $\mathcal{I}$ be the PL-interpretation in which for any sentence letter $\alpha$, $\mathcal{I}(\alpha) = 1$ iff $\alpha \in \Gamma$. We must show that:
for every wff $\phi$, $V_{\mathcal{I}}(\phi) = 1$ iff $\phi \in \Gamma$   (*)
We do this by induction on formula construction. The base case, that the assertion holds for sentence letters, follows immediately from the definition of $\mathcal{I}$. Next we make the inductive hypothesis (ih): that wffs $\phi$ and $\psi$ are true in $\mathcal{I}$ iff they are members of $\Gamma$, and we show that the same is true of $\sim\phi$ and $\phi \to \psi$.
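First, $\sim$: we must show that $V_{\mathcal{I}}(\sim\phi) = 1$ iff $\sim\phi \in \Gamma$:[11]
$V_{\mathcal{I}}(\sim\phi) = 1$ iff $V_{\mathcal{I}}(\phi) = 0$   (truth cond. for $\sim$)
iff $\phi \notin \Gamma$   (ih)
iff $\sim\phi \in \Gamma$   (lemma 2.4a)
Next, $\to$: we must show that $V_{\mathcal{I}}(\phi \to \psi) = 1$ iff $\phi \to \psi \in \Gamma$:
$V_{\mathcal{I}}(\phi \to \psi) = 1$ iff either $V_{\mathcal{I}}(\phi) = 0$ or $V_{\mathcal{I}}(\psi) = 1$   (truth cond. for $\to$)
iff either $\phi \notin \Gamma$ or $\psi \in \Gamma$   (ih)
iff $\phi \to \psi \in \Gamma$   (lemma 2.4b)
The inductive proof of (*) is complete. But now, since $\{\sim\phi\} \subseteq \Gamma$, $\sim\phi \in \Gamma$, and so by lemma 2.4a, $\phi \notin \Gamma$. Thus, by (*), $\phi$ is not true in $\mathcal{I}$. So we have succeeded in constructing an interpretation in which $\phi$ isn't true.
[11] Here we continue to use the fact that a formula has one truth value iff it lacks the other.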
Chapter 3
Beyond Standard Propositional Logic
As promised, we will study more than the standard logical systems familiar from introductory textbooks. In this chapter we'll examine some variations and deviations from standard propositional logic. (In later chapters we will discuss several extensions of standard propositional logic.)
In this chapter, let's treat all connectives as primitive unless otherwise specified. (So, for example, our recursive definition of a wff now has a clause saying that if $\phi$ and $\psi$ are wffs, then so are $(\phi \wedge \psi)$, $(\phi \vee \psi)$, and $(\phi \leftrightarrow \psi)$, and our official definition of a PL-valuation now contains the semantic clauses for the $\wedge$, $\vee$, and $\leftrightarrow$ that were derived in chapter 2.) The main reason for doing this is that in some nonstandard logics, the definitions of the defined connectives given in section 2.1 are inappropriate.
3.1 Alternate connectives
3.1.1 Symbolizing truth functions in propositional logic
Standard propositional logic is in a sense "expressively complete". To get at this idea, let's introduce the idea of a truth function. A truth function is a function that maps truth values (i.e., 0s and 1s) to truth values. For example:
f(1) = 0
f(0) = 1
f is a one-place function because it takes only one truth value as input. We have an English name for this truth function: "negation"; and we have a symbol for it: $\sim$. Consider next the two-place conjunction truth function:
g(1, 1) = 1
g(1, 0) = 0
g(0, 1) = 0
g(0, 0) = 0
We have a symbol for this truth function as well: $\wedge$.
The language of propositional logic we have been using doesn't have a symbol for every truth function. It has no symbol for the "not-both" truth function, for example:[1]
h(1, 1) = 0
h(1, 0) = 1
h(0, 1) = 1
h(0, 0) = 1
But in a sense that I'll introduce in a moment, we can "symbolize" this truth function using a complex sentence: $\sim(P \wedge Q)$. In fact, we can symbolize (in this sense) any truth function (of any finite number of places) using just $\wedge$, $\vee$, and $\sim$.
Proof that every truth function can be symbolized using just $\wedge$, $\vee$, and $\sim$. We need to define what it means to say that a wff "symbolizes" a truth function. The rough idea is that the wff has the right truth table. Here's a precise definition:
Definition of symbolizing: Wff $\phi$ symbolizes $n$-place truth function $f$ iff $\phi$ contains the sentence letters $P_1 \ldots P_n$ and no others, and for any PL-interpretation $\mathcal{I}$, $V_{\mathcal{I}}(\phi) = f(\mathcal{I}(P_1) \ldots \mathcal{I}(P_n))$.
The sentence letters $P_1 \ldots P_n$ represent the $n$ inputs to the truth-function $f$. (The choice of these letters (and this order) is arbitrary; but given the choice, $\sim(P \wedge Q)$ doesn't officially symbolize not-both; we must instead use $\sim(P_1 \wedge P_2)$.)
[1] Though we'll consider below the addition of a symbol, |, for this truth function.
Now let's prove that for every truth function, there exists some wff containing no connectives other than $\wedge$, $\vee$, and $\sim$ that symbolizes the truth function. I'll do this informally. Let's begin with an example. Suppose we want to symbolize the following three-place truth-function:
i(1, 1, 1) = 0
i(1, 1, 0) = 1
i(1, 0, 1) = 0
i(1, 0, 0) = 1
i(0, 1, 1) = 0
i(0, 1, 0) = 0
i(0, 0, 1) = 1
i(0, 0, 0) = 0
We must construct a sentence whose truth value is the same as the output of function $i$, whenever the sentence letters $P_1$, $P_2$, and $P_3$ are given $i$'s inputs. Now, if we ignore everything but the numbers in the above "picture" of function $i$, we can think of it as a kind of truth table for the sentence we're after. The first column of numbers represents the truth values of $P_1$, the second column, the truth values of $P_2$, and the third column, the truth values of $P_3$; and the far right column represents the truth values that the desired formula should have. Each row represents a possible combination of truth values for these sentence letters. Thus, the second row ("i(1, 1, 0) = 1") is the combination where $P_1$ is 1, $P_2$ is 1, and $P_3$ is 0; the fact that the fourth column in this row is 1 indicates that the desired formula should be true here.
Since function $i$ returns the value 1 in just three cases (rows two, four, and seven), the sentence we're after should be true in exactly those three cases. Now, we can construct a sentence that is true in the case of row two (i.e. when $P_1$, $P_2$, and $P_3$ are 1, 1, and 0, respectively) and false otherwise: $P_1 \wedge P_2 \wedge \sim P_3$. And we can do the same for rows four and seven: $P_1 \wedge \sim P_2 \wedge \sim P_3$ and $\sim P_1 \wedge \sim P_2 \wedge P_3$. But then we can simply disjoin these three sentences to get the sentence we want:
$(P_1 \wedge P_2 \wedge \sim P_3) \vee (P_1 \wedge \sim P_2 \wedge \sim P_3) \vee (\sim P_1 \wedge \sim P_2 \wedge P_3)$
(Strictly speaking the three-way conjunctions, and the three-way disjunction, need parentheses. But it doesn't matter where they're added since conjunction and disjunction are associative. That is, $\phi \wedge (\psi \wedge \chi)$ and $(\phi \wedge \psi) \wedge \chi$ are semantically equivalent, as are $\phi \vee (\psi \vee \chi)$ and $(\phi \vee \psi) \vee \chi$.)
This strategy is in fact purely general. Any $n$-place truth function, $f$, can be represented by a chart like the one above. Each row in the chart consists of a certain combination of $n$ truth values, followed by the truth value returned by $f$ for those $n$ inputs. For each such row, construct a conjunction whose $i$th conjunct is $P_i$ if the $i$th truth value in the row is 1, and $\sim P_i$ if the $i$th truth value in the row is 0. Notice that the conjunction just constructed is true if and only if its sentence letters have the truth values corresponding to the row in question. The desired formula is then simply the disjunction of all and only the conjunctions for rows where the function $f$ returns the value 1.[2] Since the conjunction for a given row is true iff its sentence letters have the truth values corresponding to that row, the resulting disjunction is true iff its sentence letters have truth values corresponding to one of the rows where $f$ returns the value true, which is what we want.
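The general strategy is easy to mechanize. Here is a short Python sketch of it, intended only as an illustration: the ASCII spellings (`~`, `&`, `v`), the function names, and the trick of checking the result by translating it into a Python boolean expression are all assumptions of the sketch.

```python
from itertools import product

def dnf(f, n):
    """A wff over P1..Pn (as a string) that symbolizes the n-place truth function f."""
    rows = [vals for vals in product((1, 0), repeat=n) if f(*vals) == 1]
    if not rows:
        # Special case: f returns 0 everywhere, so use an always-false formula.
        return " & ".join(["P1", "~P1"] + [f"P{k}" for k in range(2, n + 1)])

    def conjunction(vals):
        # i-th conjunct is Pi if the i-th value in the row is 1, and ~Pi if it is 0.
        return " & ".join(("" if v else "~") + f"P{k}" for k, v in enumerate(vals, 1))

    return " v ".join(f"({conjunction(vals)})" for vals in rows)

def evaluate(formula, vals):
    """Truth value of a formula produced by dnf, given truth values for P1..Pn."""
    expr = formula.replace("~", " not ").replace("&", " and ").replace("v", " or ")
    env = {f"P{k}": bool(v) for k, v in enumerate(vals, 1)}
    return int(eval(expr, {}, env))

# The three-place truth function i from the example above.
I_TABLE = {(1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 0, (1, 0, 0): 1,
           (0, 1, 1): 0, (0, 1, 0): 0, (0, 0, 1): 1, (0, 0, 0): 0}

def i(*vals):
    return I_TABLE[vals]

formula = dnf(i, 3)
```

For the function $i$, `formula` is `(P1 & P2 & ~P3) v (P1 & ~P2 & ~P3) v (~P1 & ~P2 & P3)`, matching the sentence constructed by hand above.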
Say that a set of connectives is adequate iff all truth functions can be symbolized using sentences containing no connectives not in that set. What we just showed was that the set $\{\wedge, \vee, \sim\}$ is adequate. We can now use this fact to prove that other sets of connectives are adequate. Take $\{\wedge, \sim\}$, for example. Where $f$ is any truth function, we must find some wff $\chi$ that symbolizes $f$ whose only connectives are $\wedge$ and $\sim$. Since $\{\wedge, \vee, \sim\}$ is adequate, some sentence $\chi'$ containing only $\wedge$, $\vee$, and $\sim$ symbolizes $f$. But it's easy to see that any wff of the form $\phi \vee \psi$ is (PL-) semantically equivalent to $\sim(\sim\phi \wedge \sim\psi)$; so we can obtain our desired $\chi$ by replacing all wffs in $\chi'$ of the form $\phi \vee \psi$ with $\sim(\sim\phi \wedge \sim\psi)$.[3]
Similar arguments can be given to show that other connective sets are adequate as well. For example, the $\wedge$ can be eliminated in favor of the $\to$ and the $\sim$ (since $\phi \wedge \psi$ is semantically equivalent to $\sim(\phi \to \sim\psi)$); therefore, since $\{\wedge, \sim\}$ is adequate, $\{\to, \sim\}$ is also adequate.
[2] Special case: if there are no such rows (i.e., if the function returns 0 for all inputs) then let the formula simply be any always-false formula containing $P_1 \ldots P_n$, for example $P_1 \wedge \sim P_1 \wedge P_2 \wedge P_3 \wedge \cdots \wedge P_n$.
[3] Here I'm using the obvious fact that semantically equivalent wffs represent the same truth-functions, and also the slightly less obvious but still obvious fact that substituting semantically equivalent wffs inside a wff $\alpha$ results in a wff that is semantically equivalent to $\alpha$.
3.1.2 Sheffer stroke
All of the adequate connective sets we've seen so far contain more than one connective. But consider next a new connective, called the "Sheffer stroke": |. $\phi | \psi$ means that not both $\phi$ and $\psi$ are true; thus, its truth table is:
|   1  0
1   0  1
0   1  1
In fact, | is an adequate connective all on its own; one can symbolize all the truth functions using just |! (One other binary connective is adequate all on its own; see exercise 3.2.)
Proof that $\{|\}$ is an adequate connective set. $\psi | \psi$ is semantically equivalent to $\sim\psi$. Furthermore, $\psi \to \chi$ is semantically equivalent to $\sim(\psi \wedge \sim\chi)$, and thus to $\psi | \sim\chi$, and thus to $\psi | (\chi | \chi)$. So: take any truth function, $f$. We showed earlier that $\{\sim, \to\}$ is adequate; so some sentence $\phi$ containing just $\to$ and $\sim$ symbolizes $f$. Replace each occurrence of $\psi \to \chi$ in $\phi$ with $\psi | (\chi | \chi)$, and each occurrence of $\sim\psi$ with $\psi | \psi$; the resulting wff contains only | and symbolizes $f$.
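The replacement procedure in this proof is a simple recursion on the structure of a wff. The following Python sketch illustrates it under assumed conventions: wffs are nested tuples tagged `"~"`, `"->"`, or `"|"`, and the function names are hypothetical. It also checks by truth table that an example wff and its stroke-only translation agree.

```python
from itertools import product

def to_stroke(w):
    """Translate a wff built from ~ and -> into one whose only connective is |."""
    if isinstance(w, str):
        return w
    if w[0] == "~":
        a = to_stroke(w[1])
        return ("|", a, a)                       # ~psi  becomes  psi|psi
    a, b = to_stroke(w[1]), to_stroke(w[2])
    return ("|", a, ("|", b, b))                 # psi->chi  becomes  psi|(chi|chi)

def value(w, i):
    """Truth value of w in interpretation i (a dict from sentence letters to 0/1)."""
    if isinstance(w, str):
        return i[w]
    if w[0] == "~":
        return 1 - value(w[1], i)
    x, y = value(w[1], i), value(w[2], i)
    return max(1 - x, y) if w[0] == "->" else 1 - x * y     # "|" is not-both

def connectives(w):
    return set() if isinstance(w, str) else {w[0]}.union(*(connectives(x) for x in w[1:]))

original = ("->", ("~", "P"), "Q")               # ~P -> Q
translated = to_stroke(original)
same = all(value(original, {"P": p, "Q": q}) == value(translated, {"P": p, "Q": q})
           for p, q in product((0, 1), repeat=2))
```

Here `translated` encodes $(P|P)|(Q|Q)$, and `same` records that it has the same truth table as $\sim P \to Q$.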
3.1.3 Inadequate connective sets
Can we show that certain sets of connectives are not adequate?
We can quickly answer yes, for a trivial reason. The set $\{\sim\}$ isn't adequate, for the simple reason that, since $\sim$ is a one-place connective, no sentence with more than one sentence letter can be built using just $\sim$. So there's no hope of symbolizing $n$-place truth functions, for $n > 1$, using just the $\sim$.
More interestingly, we can show that there are inadequate connective sets containing two-place connectives. One example is $\{\wedge, \to\}$.
Proof that {∧, →} is not an adequate set of connectives. Suppose for reductio that
the set is adequate. Then there exists some wff, φ, containing just the sentence
letter P₁ and no connectives other than ∧ and → that symbolizes the negation
truth function. But there can be no such wff φ. For φ would have to be false
whenever P₁ is true, whereas we can prove the following by induction:
    Each wff φ whose only sentence letter is P₁, and which
    contains no connectives other than ∧ and →, is true in
    any PL-interpretation in which P₁ is true.
Base case: if φ has no connectives then φ is just the sentence letter P₁ itself,
in which case it's clearly true in any PL-interpretation in which P₁ is true.
Next we assume the inductive hypothesis, that wffs φ and ψ are true in any
PL-interpretation in which P₁ is true; we must now show that φ∧ψ and φ→ψ
are true in any such PL-interpretation. But this follows immediately from the
truth tables for ∧ and →.
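The inductive claim can also be illustrated by brute force. The sketch below assumes the connective set is {∧, →} and represents each one-place truth function as a pair (value when P₁ is 1, value when P₁ is 0); this representation and the names used are illustrative assumptions. It closes the function symbolized by P₁ under the two connectives and confirms that negation never appears.

```python
def conj(a, b):
    """Classical conjunction."""
    return 1 if (a == 1 and b == 1) else 0

def cond(a, b):
    """Classical conditional."""
    return 1 if (a == 0 or b == 1) else 0

identity = (1, 0)   # the truth function symbolized by P1 itself
negation = (0, 1)   # the truth function we are trying (and failing) to reach

# Close {identity} under pointwise ∧ and → until nothing new turns up.
reachable = {identity}
while True:
    new = set(reachable)
    for f in reachable:
        for g in reachable:
            new.add(tuple(conj(x, y) for x, y in zip(f, g)))
            new.add(tuple(cond(x, y) for x, y in zip(f, g)))
    if new == reachable:
        break
    reachable = new

print(sorted(reachable))
print(negation in reachable)
```

Every reachable function has 1 in its first position, which is the content of the induction: any such wff is true whenever P₁ is true.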
Exercise 3.1 For each of the following two truth functions, i) find
a sentence (with just ∼, ∧, ∨, →, ↔) that symbolizes it; and ii) find
a sentence containing just the Sheffer stroke that symbolizes it. You
may save time by making abbreviations and saying things like "make
such-and-such substitutions throughout".
    f(1, 1) = 1    g(1, 1, 0) = 0
    f(1, 0) = 0    g(0, 0, 1) = 0
    f(0, 1) = 0    g(x, y, z) = 1 otherwise
    f(0, 0) = 1
Exercise 3.2 Show that all truth functions can be symbolized using
just ↓ (nor). φ↓ψ is 1 when both φ and ψ are 0, and 0 otherwise.
Exercise 3.3 Can all the truth functions be symbolized using just
the following connective? (Give a proof to justify your answer.)
    %   1  0
    1   0  1
    0   1  0
3.2 Polish notation
Reformulating standard logic using the Sheffer stroke is a mere variation
(section 1.7) of standard logic, since in a sense it's a mere notational change.
Another variation is Polish notation. In Polish notation, the connectives all
go before the sentences they connect. Instead of writing P→Q, we write →PQ.
Instead of writing P∨Q we write ∨PQ. Formally, we redefine the wffs as
follows:
Definition of wffs for Polish notation:
    sentence letters are wffs
    if φ and ψ are wffs, then so are: ∼φ, →φψ, ∧φψ, ∨φψ, and ↔φψ
What's the point? This notation eliminates the need for parentheses. With the
usual notation, in which we put the connectives between the sentences they
connect, we need parentheses to distinguish, e.g.:
    (P∧Q)→R
    P∧(Q→R)
But with Polish notation, these are distinguished without parentheses:
    →∧PQR
    ∧P→QR
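A short Python sketch may help show why Polish notation needs no parentheses: a string can be read back into its structure in a single left-to-right pass. The nested-tuple representation of wffs, the function names, and the restriction to one-character sentence letters are all assumptions made for this illustration.

```python
BINARY = {"∧", "∨", "→", "↔"}

def to_polish(wff):
    """Write a wff (a letter, or a tuple (connective, part, ...)) in Polish notation."""
    if isinstance(wff, str):
        return wff
    op, *parts = wff
    return op + "".join(to_polish(p) for p in parts)

def parse_polish(s):
    """Read one wff off the front of s; return (wff, unread remainder)."""
    head, rest = s[0], s[1:]
    if head == "∼":
        part, rest = parse_polish(rest)
        return ("∼", part), rest
    if head in BINARY:
        left, rest = parse_polish(rest)
        right, rest = parse_polish(rest)
        return (head, left, right), rest
    return head, rest  # anything else is treated as a sentence letter

first = ("→", ("∧", "P", "Q"), "R")    # (P∧Q)→R
second = ("∧", "P", ("→", "Q", "R"))   # P∧(Q→R)

print(to_polish(first))
print(to_polish(second))
print(parse_polish("→∧PQR"))
```

Because each connective announces in advance how many wffs follow it, the parser never has to guess where a subformula ends.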
Exercise 3.4 Translate each of the following into Polish notation:
a) P↔∼P
b) (P→(Q→(R→(∼∼S∨T))))
c) [(P∧∼Q)∨(∼P∧Q)]↔∼[(P∨∼Q)∧(∼P∨Q)]
3.3 Nonclassical propositional logics
In the rest of this chapter we will examine certain deviations from standard
propositional logic. These are often called "nonclassical" logics, "classical"
logic being the standard type of propositional and predicate logic studied in
introductory courses and presented here in chapters 2 and 4.⁴ These nonclassical
logics use the standard language of logic, but they offer different semantics
and/or proof theories.
⁴ Extensions to standard propositional logic, such as modal logic, are also
sometimes called nonclassical; but by "nonclassical" I'll have in mind just
deviations.
There are many reasons to get interested in nonclassical logic, but one
exciting one is the belief that classical logic is wrong: that it provides an
inadequate model of (genuine) logical truth and logical consequence. For example,
every wff of the form φ∨∼φ is PL-valid and a PL-theorem. But mathematical
intuitionists (section 3.5) claim that for certain mathematical statements φ, the
sentence "either φ or it is not the case that φ" is not even one we are entitled
to assert, let alone a logical truth. As elsewhere in this book, our primary
concern is to understand how formalisms work, rather than to evaluate
philosophical claims about genuine logical truth and logical consequence.
However, to explain why nonclassical formalisms have been developed, and to give
them some context, in each case we'll dip briefly into the relevant philosophical
issues.
In principle, a critic of classical logic could claim either that classical logic
recognizes too many logical consequences (or logical truths), or that it
recognizes too few. But in practice, the latter is rare. In nearly every case, the
nonclassicalist's concern is to scale back classical logic's set of logical truths
or logical consequences. Intuitionists and many other nonclassical logicians want
to remove φ∨∼φ, the so-called law of the excluded middle, from the set of
logical truths; paraconsistent logicians (section 3.4.4) want to remove ex falso
quodlibet (φ; ∼φ; therefore, ψ) from the set of logical implications; and so on.
As with classical logic, one can approach a given nonclassical logic in various
ways. One can take a proof-theoretic approach (using axioms, sequents, or
some other proof system). Or one can take a semantic approach. I'll take
different approaches to different logics, depending on which approach seems
most natural.
Nonclassical logic can seem dizzying. It challenges assumptions that we
normally regard as utterly unproblematic, assumptions we normally make
without even noticing, assumptions that form the very bedrock of rational
thought. Can these assumptions sensibly be questioned? Some nonclassical
logicians even say that there are true contradictions! (See section 3.4.4.) If
even the law of noncontradiction is up for grabs, one might worry, how is
argumentation possible at all?
My own view is that even the most radical challenges to classical logic can
coherently be entertained, and need not amount to intellectual suicide. But if
you're more philosophically conservative, fear not: from a formal point of view
there's nothing at all dizzying about nonclassical logic. In the previous chapter
we gave various mathematical definitions: of the notion of a PL-interpretation,
the notion of a sequent proof, and so on. Formally speaking, nonclassical logics
result simply from giving different definitions. As we'll see, these different
definitions are easy to give and to understand. Furthermore, when I give the
definitions and reason about them, I will myself be assuming classical logic in
the metalanguage. For example, even when we discuss the formalism accepted
by the defenders of true contradictions, I won't myself accept any true
contradictions. I will reason normally in the course of developing a formal system
which represents abnormal patterns of inference, much as a sane psychologist
might develop a model of insanity. Thus, even if there's something
philosophically perplexing about the claims about (genuine) logical consequence
made by nonclassical logicians, there's nothing mathematically perplexing about
the formal systems that represent those claims.
3.4 Three-valued logic
For our first foray into nonclassical logic, we will take a semantic approach.
Various logicians have considered adding a third truth value to the usual two. In
these new systems, in addition to truth (1) and falsity (0), we have a third
truth-value, #. The third truth value is (in most cases anyway) supposed to
represent sentences that are neither true nor false, but rather have some other
status. This other status could be taken in various ways, depending on the
intended application, for example: "meaningless", "undefined", or "indeterminate".
Classical logic is bivalent: there are exactly two truth values, and each
formula is assigned exactly one of them in any interpretation. So, admitting a
third truth value is one way to deny bivalence. There are others. One could
admit four, five, or even infinitely many truth values. Or, one could stick with
two truth values but allow formulas to have both truth values, or to lack both.
(Some would argue that there's no real difference between allowing formulas
to lack both of two truth values, and admitting a third truth value thought
of as meaning "neither true nor false".) Here we will only discuss trivalent
systems: systems in which each formula has exactly one of three truth values.
Why introduce a third truth value? Various philosophical reasons have been
given. One concerns vagueness. Donald Trump is rich. Pete the peasant is not.
Somewhere in the middle there are people who are hard to classify. Perhaps
middling Mary, who has $,, is an example. Is she rich? She is on the
borderline. It is hard to admit either that she is rich, or that she is not rich. (If
you think $, clearly makes you rich, choose a somewhat smaller amount
for the example; if you think it clearly doesn't, choose a larger amount.) So
there's pressure to say that the statement "Mary is rich" can be neither true nor
false.
Others say we need a third truth value for statements about the future. If
it is in some sense "not yet determined" whether there will be a sea battle
tomorrow, then, it has been argued, the sentence:
    There will be a sea battle tomorrow
is neither true nor false. In general, this viewpoint says, statements about
the future are neither true nor false if there is nothing about the present that
determines their truth value one way or the other.⁵
Yet another case in which some have claimed that bivalence fails concerns
failed presupposition. Consider this sentence:
    Ted stopped beating his dog
In fact, I've never beaten a dog. I've never beaten anything. I don't even have a
dog. So is it true that I stopped beating my dog? Obviously not. But on the
other hand, is this statement false? Certainly no one would want to assert its
negation: "Ted has not stopped beating his dog". "Ted stopped beating his dog"
presupposes that I was beating a dog in the past; since this presupposition is
false, the sentence does not rise to the level of truth or falsity.
For a final challenge to bivalence, consider the sentence:
    Sherlock Holmes has a mole on his left leg
"Sherlock Holmes" doesn't refer to a real entity. Further, Sir Arthur Conan
Doyle does not specify in his Sherlock Holmes stories whether Holmes has such
a mole. For either of these reasons, one might argue, the displayed sentence is
neither true nor false.
It's an open question whether any of these arguments against bivalence is
any good. Moreover, powerful arguments can be given against the idea that
some sentences are neither true nor false. First, it is natural to identify the
falsity of a sentence with the truth of its negation. So, if we say that "Mary is
rich" is neither true nor false, i.e., not true and not false, we must also say that:
    "Mary is rich" is not true, and "Mary is not rich" is not true
⁵ There is an alternate view that upholds the "open future" without denying
bivalence. According to this view, both "There will be a sea battle tomorrow" and
"There will fail to be a sea battle tomorrow" are false. Thus, the defender of this
position denies that "It will be the case tomorrow that not-φ" and "not: it will be
the case tomorrow that φ" are equivalent. See Prior (, chapter X).
Second, the notion of truth is often thought to be "transparent", in that for
any (meaningful) sentence φ, φ and "'φ' is true" are interchangeable, even
when (nonquotationally) embedded inside other expressions. So in particular,
"'φ' is not true" (i.e., "not: 'φ' is true") implies "not-φ". Thus, the previously
displayed sentence commits us to saying:
    not: Mary is rich, and not: Mary is not rich
Saying that "Mary is rich" is neither true nor false would therefore seem to
commit us to a contradiction!
So there is controversy about whether some sentences are neither true nor
false. But rather than spending more time on such philosophical questions, let's
now concentrate on a certain sort of formalism that is intended to represent
the failure of bivalence. The idea is simple: give three-valued truth-tables for
the connectives of propositional logic. The classical truth tables give you the
truth values of complex formulas based on whether their constituent sentences
are true or false (1 or 0), whereas the new truth tables will take into account
new cases: cases where sentences are #.
3.4.1 Łukasiewicz's system
Here is one set of three-valued truth tables, due to Jan Łukasiewicz (who also
invented the Polish notation of section 3.2):
    ∼
    1   0
    0   1
    #   #

    ∧   1  0  #
    1   1  0  #
    0   0  0  0
    #   #  0  #

    ∨   1  0  #
    1   1  1  1
    0   1  0  #
    #   1  #  #

    →   1  0  #
    1   1  0  #
    0   1  1  1
    #   1  #  1
(In our discussion of three-valued logic, let φ↔ψ abbreviate (φ→ψ)∧(ψ→φ)
as before.) Using these truth tables, one can calculate truth values of wholes
based on truth values of parts.
Example 3.1: Where P is 1, Q is 0 and R is #, calculate the truth value of
(P∨Q)→∼(R→Q). First, what is R→Q? The truth table for → tells us that
#→0 is #. So, since the negation of a # is #, ∼(R→Q) is # as well. Next, P∨Q:
that's 1∨0, i.e., 1. Finally, the whole thing: 1→#, i.e., #.
We can formalize this a bit more by defining new interpretation and valuation
functions:
Definition of trivalent interpretation: A trivalent interpretation is a function
that assigns to each sentence letter exactly one of the values: 1, 0, #.
Definition of valuation: For any trivalent interpretation, ℐ, the
Łukasiewicz-valuation for ℐ, ŁV_ℐ, is defined as the function that assigns to each
wff either 1, 0, or #, and which is such that, for any wffs φ and ψ,
    ŁV_ℐ(φ) = ℐ(φ) if φ is a sentence letter

    ŁV_ℐ(φ∧ψ) = 1 if ŁV_ℐ(φ) = 1 and ŁV_ℐ(ψ) = 1
              = 0 if ŁV_ℐ(φ) = 0 or ŁV_ℐ(ψ) = 0
              = # otherwise

    ŁV_ℐ(φ∨ψ) = 1 if ŁV_ℐ(φ) = 1 or ŁV_ℐ(ψ) = 1
              = 0 if ŁV_ℐ(φ) = 0 and ŁV_ℐ(ψ) = 0
              = # otherwise

    ŁV_ℐ(φ→ψ) = 1 if ŁV_ℐ(φ) = 0, or ŁV_ℐ(ψ) = 1, or ŁV_ℐ(φ) = ŁV_ℐ(ψ) = #
              = 0 if ŁV_ℐ(φ) = 1 and ŁV_ℐ(ψ) = 0
              = # otherwise

    ŁV_ℐ(∼φ) = 1 if ŁV_ℐ(φ) = 0
             = 0 if ŁV_ℐ(φ) = 1
             = # otherwise
Let's define validity and semantic consequence for Łukasiewicz's system much
like we did for standard PL:
Definitions of validity and semantic consequence:
    φ is Łukasiewicz-valid ("⊨_Ł φ") iff for every trivalent interpretation ℐ,
    ŁV_ℐ(φ) = 1
    φ is a Łukasiewicz-semantic-consequence of Γ ("Γ ⊨_Ł φ") iff for every
    trivalent interpretation, ℐ, if ŁV_ℐ(γ) = 1 for each γ∈Γ, then ŁV_ℐ(φ) = 1
Example 3.2: Is P∨∼P Łukasiewicz-valid? Answer: no, it isn't. Suppose P
is #. Then ∼P is #; but then the whole thing is # (since #∨# is #).
Example 3.3: Is P→P Łukasiewicz-valid? Answer: yes. P could be either 1,
0 or #. From the truth table for →, we see that P→P is 1 in all three cases.
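The valuation clauses and the definition of validity translate fairly directly into code. Here is a minimal Python sketch, under these illustrative assumptions: wffs are nested tuples such as ("→", "P", "Q"), sentence letters are strings, the third value is the string "#", and the function names are hypothetical. It reproduces the results of Examples 3.1 through 3.3.

```python
from itertools import product

VALUES = (1, 0, "#")

def l_neg(a):
    return {1: 0, 0: 1, "#": "#"}[a]

def l_and(a, b):
    if a == 1 and b == 1:
        return 1
    if a == 0 or b == 0:
        return 0
    return "#"

def l_or(a, b):
    if a == 1 or b == 1:
        return 1
    if a == 0 and b == 0:
        return 0
    return "#"

def l_imp(a, b):
    # Łukasiewicz's conditional: note that #→# is 1.
    if a == 0 or b == 1 or (a == "#" and b == "#"):
        return 1
    if a == 1 and b == 0:
        return 0
    return "#"

OPS = {"∼": l_neg, "∧": l_and, "∨": l_or, "→": l_imp}

def valuation(wff, interp):
    """Value of wff under a trivalent interpretation (dict: letter -> value)."""
    if isinstance(wff, str):
        return interp[wff]
    op, *parts = wff
    return OPS[op](*(valuation(p, interp) for p in parts))

def letters(wff):
    """Set of sentence letters occurring in wff."""
    if isinstance(wff, str):
        return {wff}
    return set().union(*(letters(p) for p in wff[1:]))

def is_valid(wff):
    """True iff wff is 1 in every trivalent interpretation of its letters."""
    names = sorted(letters(wff))
    return all(valuation(wff, dict(zip(names, vals))) == 1
               for vals in product(VALUES, repeat=len(names)))

example_3_1 = ("→", ("∨", "P", "Q"), ("∼", ("→", "R", "Q")))
print(valuation(example_3_1, {"P": 1, "Q": 0, "R": "#"}))
print(is_valid(("∨", "P", ("∼", "P"))))
print(is_valid(("→", "P", "P")))
```

Since a wff with n sentence letters has 3ⁿ trivalent interpretations, this exhaustive check is practical only for small formulas.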
Notice that even if a formula can never be false, it doesn't follow that the
formula is valid: perhaps the formula is sometimes #. "Valid" (under this
definition) means always true; it does not mean never false. (Similarly, the notion
of semantic consequence that we defined is that of truth-preservation, not
nonfalsity-preservation.)
One could define validity differently, as meaning never-false (rather than
always-true). (And one could define semantic consequence as
nonfalsity-preservation.) Such definitions would generate a very different system;
they would generate a very different range of valid formulas and semantic
consequences. This illustrates an important fact. Once one chooses to introduce
extra truth values (and extra truth tables based on them), one then faces a
second choice: how should validity and semantic consequence be understood?
New theories of the nature of validity and semantic consequence do not result
solely from the first choice, only from a combination of the two choices.
There is a helpful terminology for talking about the second of these choices.
Consider any semantics that employs some set 𝒱 of truth values. (In standard
logic 𝒱 = {1, 0}; in our trivalent systems 𝒱 = {1, 0, #}.) We can select some
subset of 𝒱 and call the members of that subset the designated truth values. Once
the designated values have been selected, we can then say: a valid formula is
one that has a designated truth value in every interpretation; and Γ semantically
implies φ iff φ has a designated truth value in every interpretation in which
each γ∈Γ has a designated truth value. Our definition of Łukasiewicz-validity
(as meaning always-true) takes 1 to be the sole designated value; defining "valid"
to mean never-false would amount to taking both 1 and # as designated.
Now is perhaps as good a time as any to make a general point about
semantic definitions of logical truth and logical consequence. In this section we
used a three-valued semantics to define a certain property of wffs
(Łukasiewicz-validity) and a certain relation between sets of wffs and wffs
(Łukasiewicz-semantic-consequence). It would be possible to sharply distinguish
the semantic means from the resulting end. Imagine a philosopher who says the
following:
    The three-valued Łukasiewicz semantics does not represent the
    real semantics of natural language, since no (meaningful) natural
    language sentences are neither true nor false. (I accept the argument
    at the end of section 3.4: the claim that a sentence is neither true
    nor false would lead to a contradiction.) Nevertheless, I do think
    that Łukasiewicz-validity and Łukasiewicz-semantic-consequence
    do a pretty good job of modeling genuine logical truth and logical
    consequence. If you ignore the internal workings of the definitions,
    and focus just on their outputs (that is, if you focus just on which
    wffs count as Łukasiewicz-valid and which sets of wffs
    Łukasiewicz-semantically-imply which other wffs) you get the right results.
    For example, P→P is Łukasiewicz-valid whereas P∨∼P is not; and
    sure enough, on my view, "if there will be a sea battle tomorrow
    then there will be a sea battle tomorrow" is a logical truth whereas
    "either there will be a sea battle tomorrow or there won't" is not.
There may well be tensions within such a position, but it is, at least on its face, a
position someone might take. The moral is that the properties and relations we
define using a formal semantics have a life of their own beyond the semantics.
Exercise 3.5 We noted that it seems in-principle possible for a
formula to be "never-false", given the Łukasiewicz tables, without
being "always-true". Give an example of such a formula.
Exercise 3.6 Show that no wff φ whose sentence letters are just
P and Q and which has no connectives other than ∧, ∨, and ∼ has
the same Łukasiewicz truth table as P→Q (i.e., that for no such
φ is ŁV_ℐ(φ) = ŁV_ℐ(P→Q) for each trivalent interpretation ℐ).
3.4.2 Kleene's tables
Łukasiewicz's tables are not the only three-valued truth-tables one can give.
Stephen C. Kleene gave three-valued tables that are just like Łukasiewicz's
except for the following different table for the →:⁶
    →   1  0  #
    1   1  0  #
    0   1  1  1
    #   1  #  #
As in the previous section, we could write out a corresponding definition of
a Kleene valuation function KV_ℐ, relative to a trivalent assignment ℐ. But
let's not bother. To define Kleene-validity and Kleene-semantic-consequence
("⊨_K"), we continue to take 1 as the sole designated value; thus we have:
⊨_K φ iff KV_ℐ(φ) = 1 for all trivalent interpretations ℐ; and Γ ⊨_K φ iff
KV_ℐ(φ) = 1 for each trivalent interpretation ℐ in which KV_ℐ(γ) = 1 for all
γ∈Γ.
Here is the intuitive idea behind the Kleene tables. Let's call the truth values
0 and 1 the "classical" truth values. If the immediate parts of a complex formula
have only classical truth values, then the truth value of the whole formula is
just the classical truth value determined by the classical truth values of those
parts. But if some of those parts are #, then we must consider the result of
turning each # into one of the classical truth values. If the entire formula would
sometimes be 1 and sometimes be 0 after doing this, then the entire formula
is #. But if the entire formula always takes the same truth value, X, no matter
which classical truth value any #s are turned into, then the entire formula gets
this truth value X. Intuitively: if there is enough information in the classical
truth values of a formula's immediate parts to settle on one particular classical
truth value, then that truth value is the formula's truth value.
Take Kleene's truth table for φ→ψ, for example. When φ is 0 and ψ is #, the
table says that φ→ψ is 1, because the false antecedent is classically sufficient
to make φ→ψ true, no matter what classical truth value we convert ψ to. On
the other hand, when φ is 1 and ψ is #, then φ→ψ is #. For what classical truth
value we substitute in for ψ's # affects the truth value of φ→ψ. If the # becomes
a 0 then φ→ψ is 0; but if the # becomes a 1 then φ→ψ is 1.
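The recipe just described can be run mechanically: start from the classical tables, try every way of turning each # among the immediate parts into a classical value, and see whether the outcome is settled. The Python sketch below does this for the binary connectives; the names (CLASSICAL, kleene) and the use of the string "#" for the third value are assumptions of the illustration.

```python
from itertools import product

VALUES = (1, 0, "#")

# Classical truth functions on the values 1 and 0.
CLASSICAL = {
    "∧": lambda a, b: a & b,
    "∨": lambda a, b: a | b,
    "→": lambda a, b: (1 - a) | b,
}

def kleene(op, a, b):
    """Value of a binary connective, obtained by trying every way of
    turning each # among the two immediate parts into 1 or 0."""
    choices_a = (1, 0) if a == "#" else (a,)
    choices_b = (1, 0) if b == "#" else (b,)
    results = {CLASSICAL[op](x, y) for x, y in product(choices_a, choices_b)}
    # A unique outcome is the value; a split outcome means #.
    return results.pop() if len(results) == 1 else "#"

for a in VALUES:
    print(a, [kleene("→", a, b) for b in VALUES])
```

The printed rows match Kleene's table for the →. Note that the two parts are varied independently of one another, even when they share sentence letters; that is why #→# comes out # on this approach.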
Let me mention two important differences between the Łukasiewicz and
Kleene systems. First, unlike Łukasiewicz's system, Kleene's system makes the
formula P→P invalid. (This might be regarded as an advantage for Łukasiewicz.)
The reason is that in Kleene's system, #→# is #; thus, P→P isn't true in all
valuations (it is # in the valuation where P is #). In fact, it's easy to show that
there are no valid formulas in Kleene's system (exercise 3.7). Nevertheless,
there are cases of semantic consequence. For example, P∧Q ⊨_K P, since the
only way for P∧Q to be 1 is for both P and Q to be 1.
Second, in Kleene's system, → is interdefinable with the ∼ and ∨, in that
φ→ψ has exactly the same truth table as ∼φ∨ψ. (Look at the truth tables to
verify that this is true.) That's not true in Łukasiewicz's system (exercise 3.6).
⁶ These are sometimes called Kleene's "strong" tables. Kleene also gave another
set of tables known as his "weak" tables, which assign # whenever any constituent
formula is # (and are classical otherwise). Perhaps # in the weak tables can be
thought of as representing "nonsense": any nonsense in a part of a sentence is
infectious, making the entire sentence nonsense.
Exercise 3.7* Show that there are no Kleene-valid wffs.
Exercise 3.8** Say that one trivalent interpretation 𝒥 refines another,
ℐ, iff for any sentence letter α, if ℐ(α) = 1 then 𝒥(α) = 1,
and if ℐ(α) = 0 then 𝒥(α) = 0. That is, 𝒥 preserves all of ℐ's
classical values (though it may assign some additional classical values,
in cases where ℐ assigns #). Show that refining a trivalent
interpretation preserves classical values for all wffs, given the Kleene tables.
That is, if 𝒥 refines ℐ then for every wff, φ, if KV_ℐ(φ) = 1 then
KV_𝒥(φ) = 1, and if KV_ℐ(φ) = 0 then KV_𝒥(φ) = 0.
Exercise 3.9 Show that the claim in exercise 3.8 does not hold if
you valuate using Łukasiewicz's tables rather than Kleene's.
3.4.3 Determinacy
As we saw at the beginning of section 3.4, one potential application of
three-valued logic is to vagueness. Here we think of 1 as representing definite
truth ("Donald Trump is rich"), 0 as representing definite falsehood ("Pete the
peasant is rich"), and # as representing indeterminacy ("Middling Mary is rich").
1, 0, and # are values that are possessed by sentences (relative to three-valued
interpretations). To attribute one of these values to a sentence is thus to say
something about that sentence. So these values represent statements about
determinacy that we make in the metalanguage, by quoting sentences and
attributing determinacy-statuses to them:
    "Donald Trump is rich" is determinately true
    "Pete the peasant is rich" is determinately false
    "Middling Mary is rich" is indeterminate
But we can speak of determinacy directly, in the object language, without
quoting sentences, by using the adverb "definitely":
    Donald Trump is definitely rich
    Pete the peasant is definitely not rich
    Middling Mary is indefinitely rich (she's neither
    definitely rich nor definitely not rich)
How might we represent this use of "definitely" within logic?
We could add a new symbol to the language of propositional logic. The
usual choice is a one-place sentence operator, △. We read "△φ" as meaning
"definitely, φ" (or: "determinately, φ"). (Being a one-place sentence operator, △
has the same grammar as ∼; it's governed in the definition of a wff by the clause
that if φ is a wff then so is △φ. A corresponding operator for indefiniteness
could be defined in terms of △: "▽φ" is short for "∼△φ∧∼△∼φ".)
The next question is how to treat △ semantically. It's easy to see how to
extend the systems of Łukasiewicz and Kleene to cover △; we simply adopt the
following new truth table:
    △
    1   1
    0   0
    #   0
Thus, △φ is 1 whenever φ is 1, and is 0 otherwise. (And ▽φ is 1 when φ is #;
0 otherwise.)
This approach to the semantics of △ has an apparently serious shortcoming:
△φ can never have the value #. This is a shortcoming because some statements
about determinacy seem themselves to be indeterminate. Donald Trump is
definitely rich; but if in a fit of philanthropy he started giving money away,
one dollar at a time, eventually it would become unclear whether he was still
definitely rich. Letting R symbolize "Philanthropic Trump is rich", it's natural
to think that △R should here be #.
Higher-order vagueness is vagueness in whether there's vagueness. The
shortcoming of our three-valued approach to △ is in essence that it doesn't
allow for higher-order vagueness. This deficiency comes out in other ways as
well. For example, it's natural to describe philanthropic Trump as being an
indefinite case of definite richness: he's neither definitely definitely rich nor
definitely not definitely rich. But ∼△△R∧∼△∼△R (i.e., ▽△R) comes out 0
no matter what value R has (on all three systems), given the above truth table for
△. Our semantics does a bad job with △s embedded within △s. Furthermore,
△R∨∼△R comes out 1 no matter what value R has, whereas, one might think,
"Philanthropic Trump is either definitely rich or not definitely rich" is neither
true nor false.
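These two claims about embedded △s are easy to confirm by running through the three possible values of R. The following Python sketch uses the tables for ∼, ∧ and ∨ given earlier (on which Łukasiewicz and Kleene agree) together with the table for △; the function names definitely and indefinitely are illustrative stand-ins for △ and ▽.

```python
VALUES = (1, 0, "#")

def neg(a):
    return {1: 0, 0: 1, "#": "#"}[a]

def conj(a, b):
    if a == 1 and b == 1:
        return 1
    if a == 0 or b == 0:
        return 0
    return "#"

def disj(a, b):
    if a == 1 or b == 1:
        return 1
    if a == 0 and b == 0:
        return 0
    return "#"

def definitely(a):
    """Truth table for △: 1 when the argument is 1, and 0 otherwise."""
    return 1 if a == 1 else 0

def indefinitely(a):
    """▽φ, defined as ∼△φ ∧ ∼△∼φ."""
    return conj(neg(definitely(a)), neg(definitely(neg(a))))

# ▽△R for each value of R, then △R∨∼△R for each value of R.
print([indefinitely(definitely(r)) for r in VALUES])
print([disj(definitely(r), neg(definitely(r))) for r in VALUES])
```

The first list is all 0s and the second all 1s: since △ only ever outputs a classical value, anything built on top of △R behaves classically.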
The root of these problems is that the approach to vagueness that we have
taken in the last three sections only lets us represent three states for a given
sentence letter: definite truth (1), definite falsity (0), and indeterminacy (#); this
leaves out states distinctive of higher-order vagueness such as definite definite
falsity, indefinite definite falsity, and so on. More sophisticated approaches
to vagueness and the logic of △ than those we will consider in this book do a
better job of allowing for higher-order vagueness.⁷
3.4.4 Priest's logic of paradox
Suppose we keep Kleene's tables, but take both # and 1 to be designated truth
values. Thus, we call a wff valid iff it is either 1 or # in every trivalent
interpretation; and we say that a set of wffs Γ semantically implies wff φ iff φ is
either 1 or # in every trivalent interpretation in which each member of Γ is either
1 or #. The resulting logic is Graham Priest's () LP. The official definitions:
    φ is LP-valid ("⊨_LP φ") iff KV_ℐ(φ) ≠ 0 for each trivalent interpretation ℐ
    φ is an LP-semantic-consequence of Γ ("Γ ⊨_LP φ") iff for every trivalent
    interpretation, ℐ, if KV_ℐ(γ) ≠ 0 for each γ∈Γ, then KV_ℐ(φ) ≠ 0
"LP" stands for "the logic of paradox". Priest chose this name because of
the philosophical interpretation he gave to #. For Priest, # represents the
state of being both true and false (a truth-value "glut"), rather than the state of
being neither true nor false (a truth-value "gap"). Correspondingly, he takes 1 to
represent true and only true, and 0 to represent false and only false.
⁷ See for example Fine (); Williamson (b).
For Priest, LP is not an idle formal game, since according to him, some
natural language sentences really are both true and false. (This position is
known as dialetheism.) Consider, for example, the liar sentence "this sentence
is false". The liar sentence presents a challenging paradox to everyone. Is it
true? Well, if so, then since what it says is that it is false, it must be false as
well. Is it false? Well, if so, then since what it says is that it is false, it must then
be true as well. We've shown that in each alternative (the alternative that the
liar sentence is true and the alternative that the liar sentence is false) the liar
sentence comes out both true and false. These are the only alternatives; hence,
the formula is both true and false. That's the liar paradox. Most people conclude
that something has gone wrong along the way, whereas Priest embraces the
paradoxical conclusion.
It's natural for a dialetheist like Priest to embrace a logic like LP. For it's
natural to think of logical consequence as truth preservation; LP represents
logical consequence as the preservation of either 1 or #; and in LP, a formula is
thought of as being true iff it is either 1 or # (in the latter case the formula is
false as well). Further, a look at the Kleene tables shows that their assignments
to # seem, intuitively, to mesh with Priest's "both true and false" interpretation.
Further, Priest embraces some contradictions. That is, for some sentences
φ, he accepts both φ and also "not-φ".⁸ But in standard propositional logic,
everything follows from a contradiction, via the principle of ex falso quodlibet:
φ, ∼φ ⊨_PL ψ. Priest does not of course want to have to accept every sentence
ψ, and so he needs a logic that does not let you infer any old sentence from a
contradiction. That is, he needs a paraconsistent logic. But LP is a paraconsistent
logic (there are others). For it's easy to check that P, ∼P ⊭_LP Q. In a trivalent
interpretation in which P is # and Q is 0, both P and ∼P are #, but Q is 0. So
in this trivalent interpretation, the premises (P and ∼P) have designated values
whereas the conclusion (Q) does not.
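The countermodel to ex falso quodlibet can be found by exhaustive search. The Python sketch below evaluates wffs with the Kleene tables and treats the set of designated values as a parameter, so the same search can be run for LP (1 and # designated) and for Kleene consequence (only 1 designated). The tuple representation of wffs and the function names are assumptions of the sketch.

```python
from itertools import product

VALUES = (1, 0, "#")

def k_neg(a):
    return {1: 0, 0: 1, "#": "#"}[a]

def k_or(a, b):
    if a == 1 or b == 1:
        return 1
    if a == 0 and b == 0:
        return 0
    return "#"

def k_and(a, b):
    if a == 1 and b == 1:
        return 1
    if a == 0 or b == 0:
        return 0
    return "#"

def k_imp(a, b):
    # Kleene's conditional has the same table as ∼φ∨ψ.
    return k_or(k_neg(a), b)

OPS = {"∼": k_neg, "∧": k_and, "∨": k_or, "→": k_imp}

def kv(wff, interp):
    """Kleene valuation of wff under a trivalent interpretation (a dict)."""
    if isinstance(wff, str):
        return interp[wff]
    op, *parts = wff
    return OPS[op](*(kv(p, interp) for p in parts))

def letters(wff):
    if isinstance(wff, str):
        return {wff}
    return set().union(*(letters(p) for p in wff[1:]))

def counterexamples(premises, conclusion, designated):
    """Interpretations where every premise is designated but the conclusion is not."""
    names = sorted(set().union(*(letters(w) for w in premises + [conclusion])))
    found = []
    for vals in product(VALUES, repeat=len(names)):
        interp = dict(zip(names, vals))
        if (all(kv(p, interp) in designated for p in premises)
                and kv(conclusion, interp) not in designated):
            found.append(interp)
    return found

premises = ["P", ("∼", "P")]
lp_result = counterexamples(premises, "Q", designated={1, "#"})
kleene_result = counterexamples(premises, "Q", designated={1})
print(lp_result)
print(kleene_result)
```

With 1 and # designated, the search returns exactly the interpretation described above (P is #, Q is 0). With only 1 designated it returns nothing, since no interpretation makes both P and ∼P 1.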
Ex falso quodlibet is not the only classical inference that fails in LP. Modus
ponens is another (exercise 3.10d). So LP's relation of logical consequence
differs drastically from that of classical logic. However, LP generates precisely
the same results as classical propositional logic when it comes to the validity of
individual formulas (exercise 3.11).
⁸ Accepting "Sentence φ is both true and false" is not exactly the same as
accepting both φ and "not-φ"; but the former leads to the latter given the
principles about truth and negation described at the end of section 3.4.
Exercise 3.10 Demonstrate each of the following.
a) P∧Q ⊨_LP Q∧P
b) P→(Q→R) ⊨_LP Q→(P→R)
c) ∼(P∧Q) ⊨_LP ∼P∨∼Q
d) P, P→Q ⊭_LP Q
e) ∼P, P∨Q ⊭_LP Q
Exercise 3.11** Show that a formula is PL-valid iff it is LP-valid.
3.4.5 Supervaluationism
Recall the guiding thought behind the Kleene tables: if a formula's classical
truth values fix a particular truth value, then that is the value that the formula
takes on. There is a way to take this idea a step further, which results in a new
and interesting way of thinking about three-valued logic.
According to the Kleene tables, we get a classical truth value for φ∘ψ,
where ∘ is any connective, only when we have enough classical information
in the truth values of φ and ψ to fix a classical truth value for φ∘ψ. Consider
∧ for example: if either φ or ψ is false, then since the falsehood of a conjunct
is classically sufficient for the falsehood of the whole conjunction, the entire
formula is false. But if, on the other hand, both φ and ψ are #, then neither φ
nor ψ has a classical truth value, we do not have enough classical information
to settle on a classical truth value for φ∧ψ, and so the whole formula is #.
But now consider a special case of the situation in the previous paragraph:
let φ be P, ψ be ∼P, and consider a trivalent interpretation ℐ in which P is
#. According to the Kleene tables, the conjunction P∧∼P is #, since it is the
conjunction of two formulas that are #. But there is a way of thinking about
truth values of complex sentences according to which the truth value ought
to be 0, not #. Consider changing ℐ's assignment to P from # to a classical
truth value. No matter which classical value we choose, the whole sentence
P∧∼P would then become 0. If we changed ℐ to make P 0, then P∧∼P would
be 0∧∼0, that is, 0; and if we made P 1 then P∧∼P would be 1∧∼1, 0 again.
P∧∼P becomes false no matter what classical truth value we give to its sentence
letter P; isn't that a reason to think that, contrary to what Kleene says, P∧∼P
is false?
The general thought here is this: suppose a sentence φ contains some
sentence letters P₁…Pₙ that are #. If φ would be false no matter how we assign
classical truth values to P₁…Pₙ (that is, no matter how we precisified φ), then
φ is in fact false. Further, if φ would be true no matter how we precisified it,
then φ is in fact true. But if precisifying φ would sometimes make it true and
sometimes make it false, then φ in fact is #.
The idea here can be thought of as an extension of the idea behind the
Kleene tables. Consider a formula φ∘ψ, where ∘ is any connective. If there is
enough classical information in the truth values of φ and ψ to fix on a particular
classical truth value, then the Kleene tables assign φ∘ψ that truth value. Our
new idea goes further, and says: if there is enough classical information within
φ and ψ to fix a particular classical truth value, then φ∘ψ gets that truth value.
Information "within" φ and ψ includes, not only the truth values of φ and ψ,
but also a certain sort of information about sentence letters that occur in both
φ and ψ. For example, in P∧∼P, when P is #, there is insufficient classical
information in the truth values of P and of ∼P to settle on a truth value for
the whole formula P∧∼P (since each is #). But when we look inside P and ∼P,
we get more classical information: we can use the fact that P occurs in each
to reason as we did above: whenever we turn P to 0, we turn ∼P to 1, and so
P∧∼P becomes 0; and whenever we turn P to 1 we turn ∼P to 0, and so again,
P∧∼P becomes 0.
This new idea, that a formula has a classical truth value iff every way of
precisifying it results in that truth value, is known as supervaluationism. Let us
lay out this idea formally.
Where $\mathcal{I}$ is a trivalent interpretation and $\mathcal{C}$ is a PL-interpretation (i.e., a
bivalent interpretation in the sense of section 2.3), say that $\mathcal{C}$ is a precisification of
$\mathcal{I}$ iff: whenever $\mathcal{I}$ assigns a classical truth value (i.e., 1 or 0) to a sentence letter,
$\mathcal{C}$ assigns that sentence letter the same classical value. Thus, precisifications
of $\mathcal{I}$ agree with what $\mathcal{I}$ says about the classical truth values, but in addition
(being PL-interpretations) they also assign classical truth values to sentence
letters to which $\mathcal{I}$ assigns #. Each precisification of $\mathcal{I}$ "decides" each of $\mathcal{I}$'s
#s in some way or other; different precisifications decide those #s in different
ways.
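The definition of a precisification can be made concrete for a finite stock of sentence letters. The following Python sketch is an illustration only: representing interpretations as dictionaries, writing # as the string "#", and restricting attention to a finite list of letters are all assumptions made here (a genuine interpretation assigns a value to every sentence letter), and the name `precisifications` is hypothetical.

```python
from itertools import product

def precisifications(trivalent, letters):
    """Yield each PL-interpretation (dict: letter -> 0 or 1) that agrees with
    `trivalent` wherever `trivalent` assigns a classical value.
    Assumption: only the finitely many letters in `letters` are considered."""
    gaps = [p for p in letters if trivalent[p] == "#"]
    for choice in product((0, 1), repeat=len(gaps)):
        # Keep the classical assignments fixed ...
        precis = {p: trivalent[p] for p in letters if trivalent[p] != "#"}
        # ... and decide each # in one particular way.
        precis.update(zip(gaps, choice))
        yield precis

# P and Q are #, R is 1: there are four ways to decide the two #s.
example = {"P": "#", "Q": "#", "R": 1}
for c in precisifications(example, ["P", "Q", "R"]):
    print(c)
```

Each dictionary printed assigns 1 to R, as the trivalent interpretation does, and the four dictionaries differ only in how they decide P and Q.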
We can now say how the supervaluationist assigns truth values to complex
formulas relative to a given trivalent interpretation.
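Definition of supervaluation: When $\phi$ is any wff and $\mathcal{I}$ is a trivalent
interpretation, the supervaluation of $\phi$ relative to $\mathcal{I}$, is the function $SV_{\mathcal{I}}(\phi)$ which
assigns 0, 1, or # to each wff as follows:
$$
SV_{\mathcal{I}}(\phi) =
\begin{cases}
1 & \text{if } V_{\mathcal{C}}(\phi) = 1 \text{ for every precisification, } \mathcal{C}, \text{ of } \mathcal{I} \\
0 & \text{if } V_{\mathcal{C}}(\phi) = 0 \text{ for every precisification, } \mathcal{C}, \text{ of } \mathcal{I} \\
\# & \text{otherwise}
\end{cases}
$$
Here $V_{\mathcal{C}}$ is the valuation for PL-interpretation $\mathcal{C}$, as defined in section 2.3.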
When $SV_{\mathcal{I}}(\phi) = 1$, we say that $\phi$ is supertrue in $\mathcal{I}$; when $SV_{\mathcal{I}}(\phi) = 0$, we
say that $\phi$ is superfalse in $\mathcal{I}$. Supervaluational notions of validity and semantic
consequence may be defined thus:
$\phi$ is supervaluationally valid ("$\vDash_S \phi$") iff $\phi$ is supertrue in every trivalent
interpretation
$\phi$ is a supervaluational semantic consequence of $\Gamma$ ("$\Gamma \vDash_S \phi$") iff $\phi$ is
supertrue in each trivalent interpretation in which every member of $\Gamma$ is
supertrue
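Example 3.4: Let $\mathcal{I}$ be a trivalent interpretation where $\mathcal{I}(P) = \mathcal{I}(Q) = \#$.
What is $SV_{\mathcal{I}}(P \wedge Q)$? Answer: #. Let $\mathcal{C}$ and $\mathcal{C}'$ be functions defined as follows,
where $\alpha$ is any sentence letter:
$$
\mathcal{C}(\alpha) =
\begin{cases}
1 & \text{if } \mathcal{I}(\alpha) = \# \\
\mathcal{I}(\alpha) & \text{if } \mathcal{I}(\alpha) = \text{either 1 or 0}
\end{cases}
\qquad
\mathcal{C}'(\alpha) =
\begin{cases}
0 & \text{if } \mathcal{I}(\alpha) = \# \\
\mathcal{I}(\alpha) & \text{if } \mathcal{I}(\alpha) = \text{either 1 or 0}
\end{cases}
$$
$\mathcal{C}$ and $\mathcal{C}'$ always assign either 1 or 0; and they agree with $\mathcal{I}$ whenever the
latter assigns a classical value. So each is a precisification of $\mathcal{I}$. Since $\mathcal{C}(P) =
\mathcal{C}(Q) = 1$, $V_{\mathcal{C}}(P \wedge Q) = 1$. Since $\mathcal{C}'(P) = \mathcal{C}'(Q) = 0$, $V_{\mathcal{C}'}(P \wedge Q) = 0$. So $P \wedge Q$
is 1 on some precisifications of $\mathcal{I}$ and 0 on others.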
Example 3.5: Where $\mathcal{I}$ is the same trivalent interpretation considered in
example 3.4, what is $SV_{\mathcal{I}}(P \wedge \sim P)$? Answer: 0. (A different result, notice,
from that delivered by the Kleene and Łukasiewicz tables.) For let $\mathcal{C}$ be
any precisification of $\mathcal{I}$. $\mathcal{C}$ is a PL-interpretation, and $P \wedge \sim P$ is 0 in each
PL-interpretation. So $P \wedge \sim P$ is superfalse in $\mathcal{I}$.
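Examples 3.4 and 3.5 can be checked mechanically. The sketch below is one possible rendering of the definition of $SV_{\mathcal{I}}$, under stated assumptions: formulas are encoded as nested tuples, # is the string "#", and only the sentence letters occurring in the formula are precisified (harmless, since a classical valuation depends only on those letters). All function names are hypothetical.

```python
from itertools import product

# Formulas: a sentence letter is a string such as "P"; compound formulas are
# ("not", f), ("and", f, g), ("or", f, g), ("imp", f, g).

def letters(f):
    """Set of sentence letters occurring in formula f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(letters(sub) for sub in f[1:]))

def classical_value(f, c):
    """Classical value of f in the PL-interpretation c (dict: letter -> 0/1)."""
    if isinstance(f, str):
        return c[f]
    if f[0] == "not":
        return 1 - classical_value(f[1], c)
    a, b = classical_value(f[1], c), classical_value(f[2], c)
    if f[0] == "and":
        return min(a, b)
    if f[0] == "or":
        return max(a, b)
    if f[0] == "imp":
        return max(1 - a, b)
    raise ValueError(f"unknown connective: {f[0]}")

def supervaluation(f, trivalent):
    """1 or 0 if every precisification gives f that value, otherwise "#"."""
    ls = sorted(letters(f))
    gaps = [p for p in ls if trivalent[p] == "#"]
    seen = set()
    for choice in product((0, 1), repeat=len(gaps)):
        c = {p: trivalent[p] for p in ls if trivalent[p] != "#"}
        c.update(zip(gaps, choice))
        seen.add(classical_value(f, c))
    return seen.pop() if len(seen) == 1 else "#"

def supervalid(f):
    """Supertrue in every trivalent interpretation of the letters of f."""
    ls = sorted(letters(f))
    return all(supervaluation(f, dict(zip(ls, vals))) == 1
               for vals in product((0, 1, "#"), repeat=len(ls)))

I = {"P": "#", "Q": "#"}
print(supervaluation(("and", "P", "Q"), I))            # example 3.4: #
print(supervaluation(("and", "P", ("not", "P")), I))   # example 3.5: 0
print(supervalid(("or", "P", ("not", "P"))))           # True
```

The brute-force enumeration is exponential in the number of letters assigned #, so this is suitable only for small illustrations.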
Supervaluation is a formalism, a way of assigning 1s, 0s, and #s to wffs of
the language of PL relative to trivalent interpretations. While this formalism
can be applied in many ways (not all of them involving vagueness), the following
philosophical idea is often associated with it. For any vague, interpreted
language, we can consider various sharpenings: ways of making its vague terms
precise without disturbing their determinate semantic features. For example, to
sharpen the vague term "rich", we go through everyone who is on the borderline
of being rich and arbitrarily classify each one either as being rich or as not being
rich; but we must continue to classify all the definitely rich people as being rich
and all the definitely not rich people as being not rich. Some sentences come
out true on some sharpenings and false on others. For example, since Middling
Mary is a borderline case of being rich, we are free to sharpen "rich" so that
"Mary is rich" comes out true, and we are free to sharpen "rich" so that "Mary is
rich" comes out false. But since Donald Trump is definitely rich, we are not free
to sharpen "rich" so that "Trump is rich" comes out false; "Trump is rich" is true
on all sharpenings. Also, the disjunction "Middling Mary is either rich or not
rich" comes out true on all sharpenings, even though Mary is in the borderline,
since each sharpening will count one or the other of its disjuncts, and hence the
whole disjunction, as being true. And still other sentences come out false on all
sharpenings, for instance "Pete the peasant is rich" and "Mary is both rich and
not rich". The philosophical idea is this: truth is truth-on-all-sharpenings, and
falsity is falsity-on-all-sharpenings. "Trump is rich" is true, because it is true on
all sharpenings; "Pete is rich" is false, because it is false on all sharpenings; "Mary
is rich" is neither true nor false, because it is neither true on all sharpenings
nor false on all sharpenings. Supertruth relative to trivalent interpretations is a
good formal model of truth-in-all-sharpenings, and hence of truth itself; so
supervaluational validity and semantic consequence are good formal models of
(genuine) logical truth and logical consequence.[9]
[9] See Fine () for a fuller presentation and Williamson (, chapter ) for a critique.
Some supervaluationists do not identify truth with truth-on-all-sharpenings; see McGee and
McLaughlin ().
Let's close by noticing two important facts about supervaluationism. The
first is that the supervaluation functions $SV_{\mathcal{I}}$ are not in general truth-functional.
To say that a valuation function is truth-functional is to say that the value it
assigns to any complex wff is a function of the values it assigns to that wff's
immediate constituents. Now, the valuation functions associated with the
Łukasiewicz and Kleene tables are truth-functional. (This is trivial: what a
truth table is, is a specification of how the values of a certain sort of complex
wff depend on the values of that wff's parts.) But not so for supervaluations.
Examples 3.4 and 3.5 show that if trivalent interpretation $\mathcal{I}$ assigns # to both
$P$ and $Q$, then $SV_{\mathcal{I}}(P \wedge Q) = \#$ whereas $SV_{\mathcal{I}}(P \wedge \sim P) = 0$. But $SV_{\mathcal{I}}(\sim P)$ is
obviously # (the precisifications of $\mathcal{I}$ considered in example 3.4 show this). So
$P \wedge Q$ and $P \wedge \sim P$ are both conjunctions, each of whose conjuncts is # in $SV_{\mathcal{I}}$,
and yet they are assigned different values by $SV_{\mathcal{I}}$. So $SV_{\mathcal{I}}$ isn't truth-functional;
the values it assigns to conjunctions aren't a function of the values it assigns to
their conjuncts. (Similar arguments can be made for other connectives as well;
see for example exercise 3.12.)
The second important fact about supervaluationism is this: even though
supervaluations are three-valued, there is a sense in which supervaluationism
preserves classical logic. For example, every tautology (PL-valid formula) turns
out to be supervaluationally valid. Let $\phi$ be a tautology; and consider any
trivalent interpretation $\mathcal{I}$, and any precisification $\mathcal{C}$ of $\mathcal{I}$. Precisifications are
PL-interpretations; so, since $\phi$ is a tautology, $\phi$ is true in $\mathcal{C}$. So $\phi$ is supertrue
in $\mathcal{I}$. $\mathcal{I}$ was arbitrarily chosen, so $\vDash_S \phi$. Similarly, any PL-consequence of a
set is also a supervaluational consequence of that set (exercise 3.13).
So in a sense, supervaluationism preserves classical logic. However, when we
add the operator $\triangle$ for determinacy, and extend the supervaluational semantics
in a natural way to handle $\triangle$, there's a sense in which classical logic is violated.
The details of this semantics and argument can be found in Williamson (,
section .); here I will argue informally. Specifically, I'll argue for two claims
with respect to English, assuming that truth is truth-on-all-sharpenings, and
then I'll draw a conclusion about supervaluationism.
Assume that truth is truth-on-all-sharpenings. Claim 1: any English sentence
$\phi$ logically implies "definitely, $\phi$". Argument: assume $\phi$ is true. Then $\phi$
is true on all sharpenings. But then, surely, "definitely, $\phi$" is true. Claim
2: the sentence "if Middling Mary is rich, then Middling Mary is definitely
rich" is not true, and so is not a logical truth. Argument: on some sharpenings,
the antecedent of this conditional is true while its consequent is false (assume
Mary is a definite case of indefinite richness; so the consequent is false on all
sharpenings).
Given claims 1 and 2, if a supervaluational semantics for $\triangle$ is to model
English, it must have these two features: $P \vDash \triangle P$ and $\nvDash P \to \triangle P$. But it is a
law of classical logic that whenever $\phi \vDash_{PL} \psi$, it's also true that $\vDash_{PL} \phi \to \psi$. So
the classical law of "conditional proof" (compare the deduction theorem) fails
supervaluationally. Analogous arguments can be made for other classical laws.
For example, contraposition and reductio hold for classical logic:
If $\phi \vDash_{PL} \psi$ then $\sim\psi \vDash_{PL} \sim\phi$
If $\phi \vDash_{PL} \psi$ and $\phi \vDash_{PL} \sim\psi$ then $\vDash_{PL} \sim\phi$
But they too can be argued to fail, given a supervaluational semantics for
$\triangle$ (exercise 3.16). These discrepancies with classical logic involve, in effect,
laws about sequent validity, that is, reasoning with assumptions. When it comes
to reasoning with assumptions, then, a supervaluational logic for $\triangle$ will be
nonclassical, if it is inspired by the identification of truth with truth-on-all-sharpenings.
Exercise 3.12 Show that supervaluations aren't truth-functional
with respect to conditionals. That is, find a trivalent interpretation,
$\mathcal{I}$, and wffs $\phi_1$, $\phi_2$, $\psi_1$, and $\psi_2$, such that $SV_{\mathcal{I}}(\phi_1) = SV_{\mathcal{I}}(\phi_2)$ and
$SV_{\mathcal{I}}(\psi_1) = SV_{\mathcal{I}}(\psi_2)$, but $SV_{\mathcal{I}}(\phi_1 \to \psi_1) \neq SV_{\mathcal{I}}(\phi_2 \to \psi_2)$.
Exercise 3.13 Show that if $\Gamma \vDash_{PL} \phi$ then $\Gamma \vDash_S \phi$.
Exercise 3.14 Show that if a formula is true in a trivalent interpretation
given the Kleene truth tables, then it is supertrue in that
interpretation.
Exercise 3.15** Our definition of supervaluational semantic consequence
is sometimes called the "global" definition. An alternate
definition, sometimes called the "local" definition, says that $\phi$ is a
supervaluational semantic consequence of $\Gamma$ iff for every trivalent
interpretation, $\mathcal{I}$, and every precisification, $\mathcal{C}$, of $\mathcal{I}$, if $V_{\mathcal{C}}(\gamma) = 1$
for each $\gamma \in \Gamma$, then $V_{\mathcal{C}}(\phi) = 1$. Show that the global and local
definitions are equivalent. (Equivalent, that is, before $\triangle$ is introduced.
Under some supervaluational semantics for $\triangle$, the global and local
definitions are not equivalent.)
Exercise 3.16* Argue on intuitive grounds that a supervaluational
semantics for $\triangle$ should violate contraposition and reductio.
3.5 Intuitionistic propositional logic: proof theory
Intuitionism is a philosophy of mathematics according to which there are no
mind-independent mathematical facts. Rather, mathematical facts and entities
are mental constructs that owe their existence to the activities of mathematicians
constructing proofs.
In addition to espousing this constructivist philosophy of mathematics,
intuitionists also rejected classical logic, in favor of a new nonclassical logic
now known as "intuitionistic logic". This logic rejects various classical laws,
most notoriously the law of the excluded middle, which says that each statement
of the form "$\phi$ or not-$\phi$" is a logical truth, and double-negation elimination,
which says that a statement of the form "not-not-$\phi$" logically implies the
statement $\phi$. Intuitionistic logic has been highly influential within philosophy
in a way that transcends its connection with constructivist mathematics, in
large part because it is often regarded as a logic appropriate to "anti-realism".
While intuitionistic logic itself will be our main focus, let me first say a bit
about why mathematical intuitionists are drawn to it. Consider the decimal
expansion of $\pi$: 3.14159... Little is known about the patterns occurring in it.
We do not know, for example, whether the sequence 0123456789 eventually
appears. It has not been observed in the trillion or so digits to which $\pi$ has so
far been expanded; but no one has proved that it cannot appear. Now, from
a mathematical realist (platonist) point of view, we should say nevertheless
that: either this sequence eventually appears or it does not. That is, where $P$
is the statement "The sequence 0123456789 occurs somewhere in the decimal
expansion of $\pi$", we should accept this instance of the law of the excluded
middle: "$P$ or not-$P$". Mathematical reality includes a certain infinite object,
the decimal expansion of $\pi$, which either contains or fails to contain the
sequence 0123456789. But facts about infinite totalities of this sort are precisely
what intuitionists reject. According to intuitionists, there are no "completed
infinities". In the case of $\pi$, we have the potential to construct longer and
longer initial segments of its decimal expansion, but we should not think of
the entire infinite expansion as "already existing". As a result, according to
intuitionists, until we either observe the sequence 0123456789 (thus proving $P$)
or show that it cannot appear (thus proving $\sim P$), we cannot assume that "$P$ or
not-$P$" is true. To assume this would be to assume that facts about $\pi$'s decimal
expansion are "already out there", independently of our constructing proofs.
But these vague thoughts are not an argument. And turning them into
an argument is not straightforward. For example, we cannot formulate the
intuitionist's challenge to "$P$ or not-$P$" as follows: "since mathematical truth is
constituted by proof, and we have no proof of either disjunct, neither disjunct is
true, and so the disjunction is not true." This challenge leads to a three-valued
approach to propositional logic (if neither $P$ nor not-$P$ is true then $P$ is neither
true nor false) whereas intuitionistic logic is not a three-valued approach. It is
not based on constructing truth tables of any sort, and it embraces a different set
of logical truths and logical consequences from all the three-valued approaches
we have considered so far (see exercises 3.17 and 7.10).
What then is the intuitionist's complaint about "$P$ or not-$P$", if not that
its disjuncts are untrue? Here is one thought.[10] Intuitionist philosophy of
mathematics requires acceptance of the following two conditionals:
If $P$ then it is provable that $P$
If not-$P$, then it is provable that not-$P$
So if we were entitled to assume "$P$ or not-$P$", we could infer that: it is provable
that $P$ or it is provable that not-$P$. But we're not entitled to this conclusion.
We don't have any guarantee that our methods of proof are powerful enough
to settle the question of whether $P$ is true.[11] Conclusion: we are not entitled
to assume "$P$ or not-$P$", so it's not a logical truth.
So: intuitionists are unwilling to accept "$P$ or not-$P$". Interestingly, they
do not accept its denial "not: $P$ or not-$P$", since they accept the denial of this
denial: "not-not: $P$ or not-$P$". Why? Consider the following argument.[12]
Assume for reductio: not: $P$ or not-$P$. Now, if $P$ were
true, then we would have "$P$ or not-$P$", contradicting
the assumption. So not-$P$ must be true. But from
not-$P$ it follows that $P$ or not-$P$: contradiction. So,
not-not: $P$ or not-$P$.
The reasoning in this argument is hard to resist (in essence it uses only reductio
ad absurdum and disjunction-introduction) and is accepted by intuitionists. So
even intuitionists have reason to accept that "not-not: $P$ or not-$P$" is a logical
truth. Since intuitionists reject double-negation elimination, this is consistent
with their refusal to accept "$P$ or not-$P$".[13]
In the classical semantics for propositional logic, $\phi \vee \sim\phi$ is of course
assigned the truth value 1 no matter what truth value $\phi$ is assigned, and $\phi$ is
assigned 1 whenever $\sim\sim\phi$ is. But this does not faze the intuitionist, since
classical semantics is by her lights based on a mistaken picture: the picture
of mathematical statements being statements about independently-existing
mathematical reality (such as the infinite decimal expansion of $\pi$), and thus as
being appropriately represented as having truth values (either 1 or 0) depending
on the nature of this reality.
[10] Here I follow Wright (, ). For some other thoughts on this matter, see the
works by Brouwer, Heyting and Dummett in Benacerraf and Putnam ().
[11] Beware: the intuitionist will not say "it is not provable that $P$ nor is it provable that
not-$P$"; that would lead, via the two conditionals, to a contradiction: not-$P$ and not-not-$P$.
[12] Compare the first lines of example 2.9.
[13] To get more of a feel for the intuitionist's rejection of double-negation elimination, suppose
we could show that the assumption of not-$P$ (that 0123456789 never occurs) leads to a
contradiction. This would establish not-not-$P$, but it would not establish $P$. To establish $P$,
we would need to construct enough of $\pi$'s decimal expansion to observe 0123456789. (Relatedly,
intuitionistic predicate logic (which we won't consider further in this book) rejects the inference
from "not everything is $F$" to "something is not-$F$". To prove the former one must merely
show that "everything is $F$" leads to contradiction; to prove the latter one must prove an
instance: some particular sentence of the form "$a$ is not-$F$".)
So much for philosophical justification; now on to the logic itself. I'm
going to approach this proof-theoretically, with sequents. (A semantics will
have to wait until section 7.4.) Two simple modifications to the sequent proof
system of section 2.5 generate a proof system for intuitionistic propositional
logic. First, we need to split up the double-negation rule, DN, into two halves,
double-negation introduction and double-negation elimination:
$$
\frac{\Gamma \Rightarrow \sim\sim\phi}{\Gamma \Rightarrow \phi}\ \text{DNE}
\qquad
\frac{\Gamma \Rightarrow \phi}{\Gamma \Rightarrow \sim\sim\phi}\ \text{DNI}
$$
In the classical system of section 2.5 we were allowed to use both DNE and
DNI; but in the intuitionist system, only DNI is allowed. Second, to make up
for the dropped rule DNE, our intuitionist system adds the rule "ex falso":
$$
\frac{\Gamma \Rightarrow \phi \wedge \sim\phi}{\Gamma \Rightarrow \psi}\ \text{EF}
$$
In the move from our old classical sequent system to the new intuitionist
system, the only rule we have added was EF. And any use of EF can be replicated
in the old system: simply use RAA and then DNE. That means that every
sequent proof in the new system can be replicated in the old system; every
intuitionistically provable sequent is also classically provable.
Notice how dropping DNE blocks proofs of various classical theorems the
intuitionist wants to avoid. The proof of $\varnothing \Rightarrow P \vee \sim P$ (example 2.9), for instance,
used DNE. Of course, for all we've said so far, there might be some other way
to prove this sequent. Only when we have a semantics for intuitionistic logic,
and a soundness proof relative to that semantics, can we show that this sequent
cannot be proven without DNE (section 7.4).
It is interesting to note that even though intuitionists reject the inference
from $\sim\sim P$ to $P$, they accept the inference from $\sim\sim\sim P$ to $\sim P$, since its proof
only requires the half of DN that they accept, namely the inference from $P$ to
$\sim\sim P$:
1. $\sim\sim\sim P \Rightarrow \sim\sim\sim P$    RA
2. $P \Rightarrow P$    RA
3. $P \Rightarrow \sim\sim P$    2, DNI
4. $\sim\sim\sim P, P \Rightarrow \sim\sim P \wedge \sim\sim\sim P$    1, 3, $\wedge$I
5. $\sim\sim\sim P \Rightarrow \sim P$    4, RAA
Note that you can't use this sort of proof to establish $\sim\sim P \Rightarrow P$. Given the way
RAA is stated, its application always results in a formula beginning with $\sim$.
Exercise 3.17* Show that our intuitionistic proof system generates
a different logic from the three-valued systems of Łukasiewicz,
Kleene, and Priest. For each of those three-valued systems $S_3$, find
an intuitionistically provable sequent $\Gamma \Rightarrow \phi$ such that $\Gamma \nvDash_{S_3} \phi$ (if
your chosen $\Gamma$ is the empty set this means showing that $\nvDash_{S_3} \phi$.)
Chapter 4
Predicate Logic
Let's now turn from propositional logic to predicate logic, or the "predicate
calculus" (PC), as it is sometimes called: the logic of "all" and
"some". As with propositional logic, we're going to formalize predicate logic.
We'll first do grammar, then semantics, then proof theory.
4.1 Grammar of predicate logic
As before, we start by specifying the primitive vocabulary: the symbols that
may be used in (well-formed) formulas of predicate logic. Then we define the
formulas as strings of primitive vocabulary that have the right form.
Connectives: $\to$, $\sim$, $\forall$
variables $x, y \dots$, with or without subscripts
for each $n > 0$, n-place predicates $F, G \dots$, with or without subscripts
individual constants (names) $a, b \dots$, with or without subscripts
parentheses
No symbol of one type is a symbol of any other type. Let's call any variable or
constant a term.
Note how we allow subscripts on predicates, variables, and names, just as
we allowed subscripts on sentence letters in propositional logic. We do this so
that we'll never run out of vocabulary when constructing increasingly complex
sentences, such as $\forall x \forall y \forall z \forall x_{259} \forall y_{47}(Rxyzx_{259} \to R_3 x y_{47})$.
Definition of wff:
i) if $\Pi$ is an n-place predicate and $\alpha_1 \dots \alpha_n$ are terms, then $\Pi\alpha_1 \dots \alpha_n$ is a
PC-wff
ii) if $\phi$ and $\psi$ are PC-wffs, and $\alpha$ is a variable, then $\sim\phi$, $(\phi \to \psi)$, and
$\forall\alpha\phi$ are PC-wffs
iii) Only strings that can be shown to be PC-wffs using i) and ii) are PC-wffs
We'll call wffs generated by clause i) "atomic" formulas.
$\forall$ is called the "universal quantifier". Read "$\forall x \dots$" as saying "everything $x$
is such that ...". So "$\forall x Fx$" is read as "everything is $F$", "$\sim\forall x(Fx \to Gx)$" as
"not all $F$s are $G$s", and so on.
Notice that in addition to familiar-looking wffs such as $Fa$ and $\forall x \forall y Rxy$,
our definition also counts the following as wffs:
$Fx$
$\forall x Rxy$
What is distinctive about such wffs is that they contain variables that don't
"belong" to any quantifier in the formula. In the first formula, for example, the
variable $x$ doesn't belong to any quantifier; and in the second formula, whereas
the second $x$ belongs to the quantifier $\forall x$, the variable $y$ doesn't belong to any
quantifier. Variables that don't belong to quantifiers are called free; variables
that do belong to quantifiers are called bound.
More carefully: we must speak of variables as being free or bound in given
formulas (since $x$ is free in $Fx$ but bound in $\forall x Fx$). Still more carefully, we must
speak of individual occurrences of variables being free or bound (in formulas).
For example, in the formula $Fx \to \forall x Fx$, the first occurrence of $x$ is free (in
the whole formula) whereas the third is bound. (We also count the second
occurrence of $x$, within the quantifier $\forall x$ itself, as being bound.) Even more
carefully: we may define the notions as follows.
Definition of free and bound variables: An occurrence of variable $\alpha$ in wff
$\phi$ is bound in $\phi$ iff that occurrence is within an occurrence of some wff of the
form $\forall\alpha\psi$ within $\phi$. Otherwise the occurrence is free in $\phi$.
When a formula has no free occurrences of variables, we'll say that it is a closed
formula, or sentence; otherwise it is an open formula.
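The distinction between open and closed formulas lends itself to a short recursive computation. The sketch below is illustrative only and rests on assumptions not found in the text: formulas are encoded as nested tuples, terms are strings, and a term counts as a variable exactly when it begins with x, y, or z (so subscripted variables are written "x1", "y47", and so on). The function names are hypothetical.

```python
# Formula encoding (an assumption of this sketch):
#   ("atom", "R", ("x", "y"))   the atomic wff Rxy
#   ("not", f)                  negation
#   ("imp", f, g)               conditional
#   ("all", "x", f)             universal quantification over x

def is_variable(term):
    """Assumed convention: variables begin with x, y, or z; names do not."""
    return term[0] in "xyz"

def free_variables(f):
    """Set of variables that have at least one free occurrence in f."""
    kind = f[0]
    if kind == "atom":
        return {t for t in f[2] if is_variable(t)}
    if kind == "not":
        return free_variables(f[1])
    if kind == "imp":
        return free_variables(f[1]) | free_variables(f[2])
    if kind == "all":
        # Occurrences of the quantified variable inside the scope are bound.
        return free_variables(f[2]) - {f[1]}
    raise ValueError(f"unknown formula kind: {kind}")

def is_closed(f):
    """A closed formula (sentence) has no free occurrences of variables."""
    return not free_variables(f)

Fx = ("atom", "F", ("x",))
all_x_Fx = ("all", "x", Fx)
print(free_variables(("imp", Fx, all_x_Fx)))                    # {'x'}
print(free_variables(("all", "x", ("atom", "R", ("x", "y")))))  # {'y'}
print(is_closed(all_x_Fx))                                      # True
```

Note that this tracks which variables are free in a formula, not individual occurrences; the first example contains both a free and a bound occurrence of x, and so x is reported as free in it.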
Our concern is normally with closed formulas, since it is those formulas that
represent quantificational statements of everyday language. A statement with
free variables, by contrast, is "semantically incomplete", intuitively speaking.
Nevertheless, open formulas are useful for certain purposes, especially in proof
theory (section 4.4).
We have the same defined connectives: $\wedge$, $\vee$, $\leftrightarrow$. We also add the following
definition of the existential quantifier:
Definition of $\exists$: "$\exists\alpha\phi$" is short for "$\sim\forall\alpha\sim\phi$" (where $\alpha$ is a variable and $\phi$ is
a wff)
This is an intuitively correct definition, given that $\exists$ is supposed to represent
"some": there are some pigs if and only if not everything is a non-pig.
4.2 Semantics of predicate logic
Recall from section 2.2 the semantic approach to logic, in which we i) define
configurations, which are mathematical representations of ways for the world
to be, and of the meanings of nonlogical expressions; and ii) define the notion of
truth for formulas in these configurations. We thereby shed light on meaning,
and we are thereby able to define formal analogs of the notions of logical truth
and logical consequence.
In propositional logic, the configurations were assignments of truth values
to atomic wffs. This strategy breaks down in predicate logic, for various reasons.
First, atomic wffs now include formulas with free variables, and we shouldn't
assign truth values to such wffs. A variable like $x$ doesn't stand for any fixed
thing; variables are rather used to express generality when combined with
quantifiers, as in sentences like $\forall x Fx$ and $\forall x(Fx \to Gx)$. But when a variable
is not combined with a quantifier, as in wffs like $Fx$ and $Rxy$, the result is,
intuitively, semantically incomplete, and not the kind of linguistic entity that is
capable of truth or falsity. Second, configurations generally assign meanings
to the smallest meaningful bits of language, so as to enable the calculation
of truth values of complex sentences. In propositional logic, sentence letters
were the smallest meaningful bits of language, and so it was appropriate for
the configurations there to assign semantic values to them (and truth values
are appropriate semantic values for sentence letters). But here in predicate
logic, the smallest meaningful bits of language are the names and predicates,
for example $a$, $b$, $F$, and $R$, so the configurations here ought to assign semantic
values to names and predicates, so as to enable the calculation of truth values of
complex sentences like $Fa$, $Rab$, and $\forall x Fx$. But truth values are not appropriate
semantic values for names and predicates.
As a first step towards solving these problems, let's begin by adopting a new
conception of a configuration, that of a model:
Definition of model: A PC-model is an ordered pair $\langle \mathcal{D}, \mathcal{I} \rangle$ such that:
$\mathcal{D}$ is a non-empty set ("the domain")
$\mathcal{I}$ is a function ("the interpretation function") obeying the following
constraints:
if $\alpha$ is a constant then $\mathcal{I}(\alpha) \in \mathcal{D}$
if $\Pi$ is an n-place predicate, then $\mathcal{I}(\Pi)$ is an n-place relation over $\mathcal{D}$
(Recall the notion of a relation from section 1.8.)
A configuration is supposed to represent a way for the world to be, as well
as meanings for nonlogical expressions. The part of a model that represents
a way for the world to be is its domain, $\mathcal{D}$, which contains, intuitively, the
individuals that exist in the configuration.[1] The part of a model that represents
the meanings of nonlogical expressions is its interpretation function, $\mathcal{I}$, which
tells us what names and predicates mean in the configuration. $\mathcal{I}$ assigns to
each name a member of the domain: its referent. For example, if the domain
is the set of persons, then $\mathcal{I}$ might assign me to the name "$a$". An n-place
predicate gets assigned an n-place relation over $\mathcal{D}$, that is, a set of n-tuples
drawn from $\mathcal{D}$. This set is called the extension of the predicate in the model.
Think of the extension of a predicate as the set of tuples to which the predicate
applies. One-place predicates get assigned sets of 1-tuples of $\mathcal{D}$, that is, sets
of members of $\mathcal{D}$. If the extension of "$F$" is the set of males, then "$F$" might be
thought of as symbolizing "is male". Two-place predicates get assigned binary
relations over the domain. If a two-place predicate "$R$" is assigned the set of
ordered pairs of persons $\langle u, v \rangle$ such that $u$ is taller than $v$, we might think of
"$R$" as symbolizing "is taller than". Similarly, three-place predicates get assigned
sets of ordered triples, and so on.
[1] There's more to the world than which objects exist; there are also the features those objects
have. Predicate logic models blur their representation of this second aspect of the world with
their representation of the meanings of predicates (much as PL-interpretations blur their
representation of the world with their representation of the meanings of sentence letters.)
Relative to any PC-model $\langle \mathcal{D}, \mathcal{I} \rangle$, we want to define what it is for wffs to
be true in that model. But we'll need some apparatus first. It's pretty easy to
see what truth value a sentence like $Fa$ should have. $\mathcal{I}$ assigns a member of
the domain to $a$; call that member $u$. $\mathcal{I}$ also assigns a subset of the domain to
$F$; let's call that subset $S$. The sentence $Fa$ should be true iff $u \in S$, that is,
iff the referent of $a$ is a member of the extension of $F$. That is, $Fa$ should be
true iff $\mathcal{I}(a) \in \mathcal{I}(F)$. Similarly, $Rab$ should be true iff $\langle \mathcal{I}(a), \mathcal{I}(b) \rangle \in \mathcal{I}(R)$.
Similarly for other atomic wffs without free variables.
As before, we can give recursive clauses for the truth values of negations
and conditionals. $\phi \to \psi$, for example, will be true iff either $\phi$ is false or $\psi$ is
true.
But we encounter a problem when we try to specify the truth value of $\forall x Fx$.
It should, intuitively, be true if and only if "$Fx$" is true, no matter what we put
in place of "$x$". But what does "no matter what we put in place of '$x$'" mean?
Does it mean "no matter what name (constant) we put in place of '$x$'"? No,
because we don't want to assume that we've got a name for everything in the
domain ($Fx$ might be true for all the objects we have names for, but false for
one of the nameless things). Does it mean, "no matter what object from the
domain we put in place of '$x$'"? No; objects from the domain needn't be part of
our primitive vocabulary, so the result of replacing "$x$" with an object from the
domain won't in general be a wff.
The way forward here is due to Alfred Tarski. First step: we let the variables
refer to certain things in the domain temporarily. Second step: we show how to
compute the truth value of a formula like $Fx$, relative to a temporary referent
of the variable $x$. Third step: we say that $\forall x Fx$ is true iff for all objects $u$ in
the domain $\mathcal{D}$, $Fx$ is true when $x$ temporarily refers to $u$.
We implement this idea of temporary reference with the idea of a variable
assignment (Tarski did it a bit differently):
Definition of variable assignment: $g$ is a variable assignment for model
$\langle \mathcal{D}, \mathcal{I} \rangle$ iff $g$ is a function that assigns to each variable some object in $\mathcal{D}$.
When $g(x) = u$, think of $u$ as the object to which the variable $x$ temporarily
refers. Notice that a variable assignment assigns a value to each of the
infinitely many variables that are allowed to occur in predicate logic wffs.
We do this because we need to be ready to evaluate any formula for a truth
value, no matter what variables it contains. When we evaluate the formula
$Fxy \to Gzx_1y_{47}x_{191}$, for example, we'll need temporary referents for all its
variables: $x$, $y$, $z$, $x_1$, $y_{47}$, $x_{191}$. Other formulas contain other variables. So we take
the safe course and assign temporary referents to all variables.
We need a further bit of notation. Let $u$ be some object in $\mathcal{D}$, let $g$ be some
variable assignment, and let $\alpha$ be a variable. We then define "$g^{\alpha}_{u}$" to be the
variable assignment that is just like $g$, except that it assigns $u$ to $\alpha$. (If $g$ already
assigns $u$ to $\alpha$ then $g^{\alpha}_{u}$ will be the same function as $g$.) Note the following
important fact about variable assignments: $g^{\alpha}_{u}$, when applied to $\alpha$, must give
the value $u$. (Work through the definitions to see that this is so.) That is:
$$
g^{\alpha}_{u}(\alpha) = u
$$
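The operation that produces the modified assignment from $g$ can be pictured as copying a lookup table and overwriting one entry. The following sketch assumes, purely for illustration, that a variable assignment is represented by a Python dictionary covering only the finitely many variables of interest (a genuine variable assignment covers all infinitely many variables); the name `variant` is hypothetical.

```python
def variant(g, var, u):
    """Return the assignment that is just like g except that it maps var to u.
    The original assignment g is left unchanged."""
    h = dict(g)   # copy, so that g itself is not modified
    h[var] = u
    return h

g = {"x": "Ann", "y": "Bob"}
h = variant(g, "x", "Cal")
print(h)   # {'x': 'Cal', 'y': 'Bob'}
print(g)   # {'x': 'Ann', 'y': 'Bob'}
```

Returning a fresh dictionary rather than mutating `g` mirrors the definition, on which the modified assignment is a different function from $g$ unless $g$ already assigns $u$ to the variable.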
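One more bit of apparatus. Given any model $\mathcal{M}$ $(= \langle \mathcal{D}, \mathcal{I} \rangle)$, any variable
assignment, $g$, and any term (i.e., variable or name) $\alpha$, we define the denotation
of $\alpha$, relative to $\mathcal{M}$ and $g$, "$[\alpha]_{\mathcal{M},g}$", as follows:
$$
[\alpha]_{\mathcal{M},g} =
\begin{cases}
\mathcal{I}(\alpha) & \text{if } \alpha \text{ is a constant} \\
g(\alpha) & \text{if } \alpha \text{ is a variable}
\end{cases}
$$
The subscripts $\mathcal{M}$ and $g$ on $[\ ]$ indicate that denotations are assigned relative
to a model ($\mathcal{M}$), and relative to a variable assignment ($g$).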
Now we are ready to define truth in a model. That is, we're ready to define
the valuation function for a given model, $\mathcal{M}$. The valuation function will assign
truth values to formulas relative to variable assignments. This relativization is
crucial to Tarski's strategy. The second step of that strategy, recall, was to show
how to compute truth values of sentences relative to choices of temporary
referents for their variables, i.e., relative to variable assignments.
Definition of valuation: The PC-valuation function, $V_{\mathcal{M},g}$, for PC-model
$\mathcal{M}$ $(= \langle \mathcal{D}, \mathcal{I} \rangle)$ and variable assignment $g$, is defined as the function that assigns
to each wff either 0 or 1 subject to the following constraints:
i) for any n-place predicate $\Pi$ and any terms $\alpha_1 \dots \alpha_n$, $V_{\mathcal{M},g}(\Pi\alpha_1 \dots \alpha_n) = 1$
iff $\langle [\alpha_1]_{\mathcal{M},g} \dots [\alpha_n]_{\mathcal{M},g} \rangle \in \mathcal{I}(\Pi)$
ii) for any wffs $\phi$, $\psi$, and any variable $\alpha$:
$V_{\mathcal{M},g}(\sim\phi) = 1$ iff $V_{\mathcal{M},g}(\phi) = 0$
$V_{\mathcal{M},g}(\phi \to \psi) = 1$ iff either $V_{\mathcal{M},g}(\phi) = 0$ or $V_{\mathcal{M},g}(\psi) = 1$
$V_{\mathcal{M},g}(\forall\alpha\phi) = 1$ iff for every $u \in \mathcal{D}$, $V_{\mathcal{M},g^{\alpha}_{u}}(\phi) = 1$
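For a finite domain, the clauses of this definition can be run directly. The sketch below is a toy illustration under several assumptions that are not part of the text: the domain is a finite Python set, the interpretation function is a dictionary mapping names to objects and predicates to sets of tuples (1-tuples for one-place predicates), formulas are nested tuples, terms beginning with x, y, or z count as variables, and a variable assignment is a dictionary covering at least the free variables of the formula. All names are hypothetical.

```python
def is_variable(term):
    """Assumed convention: variables begin with x, y, or z; names do not."""
    return term[0] in "xyz"

def denotation(term, interp, g):
    """Denotation of a term: g's value for a variable, interp's for a name."""
    return g[term] if is_variable(term) else interp[term]

def value(f, domain, interp, g):
    """Truth value (0 or 1) of formula f in the model relative to g."""
    kind = f[0]
    if kind == "atom":                      # ("atom", predicate, terms)
        _, pred, terms = f
        tup = tuple(denotation(t, interp, g) for t in terms)
        return 1 if tup in interp[pred] else 0
    if kind == "not":                       # ("not", f1)
        return 1 - value(f[1], domain, interp, g)
    if kind == "imp":                       # ("imp", f1, f2)
        antecedent = value(f[1], domain, interp, g)
        consequent = value(f[2], domain, interp, g)
        return 1 if antecedent == 0 or consequent == 1 else 0
    if kind == "all":                       # ("all", variable, body)
        _, var, body = f
        # Re-evaluate the body with var temporarily referring to each u.
        return 1 if all(value(body, domain, interp, {**g, var: u}) == 1
                        for u in domain) else 0
    raise ValueError(f"unknown formula kind: {kind}")

domain = {1, 2, 3}
interp = {"a": 1, "b": 2, "F": {(1,), (2,)}, "R": {(1, 2), (2, 3)}}
g = {"x": 3, "y": 1}

Fx = ("atom", "F", ("x",))
print(value(("atom", "F", ("a",)), domain, interp, g))      # 1
print(value(("atom", "R", ("a", "b")), domain, interp, g))  # 1
print(value(Fx, domain, interp, g))                         # 0, since g(x) = 3
print(value(("all", "x", Fx), domain, interp, g))           # 0, since 3 is not F
```

The quantifier clause only works by enumeration because the domain here is finite; the definition itself places no such restriction on domains.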
The valuation functions of propositional logic defined a kind of relativized
truth: truth relative to a PL-interpretation. Predicate logic valuation functions
are relativized to variable assignments as well as to interpretations (which
are now models), and so define a doubly relativized kind of truth; think of
$V_{\mathcal{M},g}(\phi) = 1$ as meaning that $\phi$ is true in $\mathcal{M}$ relative to $g$. But we'd also like a
singly relativized notion of truth that is relativized only to models, not variable
assignments. (We want this because we want to define, e.g., a valid formula as
one that is true in all models.) How are we to define such a notion? Consider
an example. What must be true in order for the formula $\forall x Fx$ to be true in
some model $\mathcal{M}$ $(= \langle \mathcal{D}, \mathcal{I} \rangle)$, relative to some variable assignment $g$? Working
through our various definitions:
$$
\begin{aligned}
V_{\mathcal{M},g}(\forall x Fx) = 1 \ &\text{iff for every } u \in \mathcal{D},\ V_{\mathcal{M},g^{x}_{u}}(Fx) = 1 && \text{(truth condition for } \forall\text{)} \\
&\text{iff for every } u \in \mathcal{D},\ [x]_{\mathcal{M},g^{x}_{u}} \in \mathcal{I}(F) && \text{(t.c. for atomics)} \\
&\text{iff for every } u \in \mathcal{D},\ g^{x}_{u}(x) \in \mathcal{I}(F) && \text{(def of denotation)} \\
&\text{iff for every } u \in \mathcal{D},\ u \in \mathcal{I}(F) && \text{(def of } g^{x}_{u}\text{)}
\end{aligned}
$$
Notice how, by the end, the function $g$ with which we began has dropped
out. The values that $g$ assigns, as a result, do not affect whether $\forall xFx$ is true
relative to $g$ in this model. In fact, this happens for every formula which, like
$\forall xFx$, lacks free variables: whether the formula is true in a model relative to
variable assignment $g$ does not depend at all on $g$ (exercise 4.1). So we might
as well define the singly relativized notion of truth thus:
Definition of truth in a model: $\phi$ is true in PC-model $\mathcal{M}$ iff $V_{\mathcal{M},g}(\phi) = 1$,
for each variable assignment $g$ for $\mathcal{M}$
(So as far as closed formulas are concerned, we would have gotten the same
result if we had required truth relative to some variable assignment.)
What about formulas with free variables, such as $Fx$? These aren't generally
the formulas we're interested in; but nevertheless, what does our definition of
singly relativized truth say about them? It's fairly easy to see that these formulas
turn out true in a model iff they are true for all values of their variables in that
model's domain. Thus, a formula with free variables is true in a model iff its
"universal closure", the result of prefixing the formula with universal quantifiers
for each of its free variables, is true in that model. For example, $Fx$ is true in a
model iff $\forall xFx$ is true in that model.
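The two facts just stated (a closed formula's value does not vary with the assignment, and an open formula is true in a model iff its universal closure is) can be checked on a small finite model. This is a sketch under assumptions: the tuple encoding of formulas, the two-variable stock `VARIABLES`, and the helper names are hypothetical conveniences, and a finite check of one model illustrates the claims rather than proving them.

```python
from itertools import product

VARIABLES = ("x", "y")  # the only variables this sketch knows about


def valuation(phi, domain, interp, g):
    """V_{M,g}(phi) for atoms, ~, -> and forall, over a finite model."""
    op = phi[0]
    if op == "atom":
        _, pred, terms = phi
        referents = tuple(g[t] if t in VARIABLES else interp[t] for t in terms)
        return 1 if referents in interp[pred] else 0
    if op == "not":
        return 1 - valuation(phi[1], domain, interp, g)
    if op == "imp":
        return max(1 - valuation(phi[1], domain, interp, g),
                   valuation(phi[2], domain, interp, g))
    if op == "all":
        _, var, body = phi
        return min(valuation(body, domain, interp, {**g, var: u}) for u in domain)
    raise ValueError(f"unknown operator: {op!r}")


def assignments(domain):
    """Every variable assignment for the model, restricted to VARIABLES."""
    for values in product(sorted(domain), repeat=len(VARIABLES)):
        yield dict(zip(VARIABLES, values))


def true_in_model(phi, domain, interp):
    """Singly relativized truth: value 1 relative to every assignment."""
    return all(valuation(phi, domain, interp, g) == 1 for g in assignments(domain))


domain = {"u", "v"}
interp = {"F": {("u",)}}  # only u is F
Fx = ("atom", "F", ("x",))
closure = ("all", "x", Fx)

# The open formula's value varies with g; the closed formula's does not.
print(sorted({valuation(Fx, domain, interp, g) for g in assignments(domain)}))       # [0, 1]
print(sorted({valuation(closure, domain, interp, g) for g in assignments(domain)}))  # [0]
print(true_in_model(Fx, domain, interp) == true_in_model(closure, domain, interp))   # True
```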
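Next, we can give definitions of validity and consequence:
Definition of validity: $\phi$ is PC-valid ("$\vDash_{\mathrm{PC}} \phi$") iff $\phi$ is true in all PC-models
Definition of semantic consequence: $\phi$ is a PC-semantic consequence of set
of wffs $\Gamma$ ("$\Gamma \vDash_{\mathrm{PC}} \phi$") iff for every PC-model $\mathcal{M}$ and every variable assignment
$g$ for $\mathcal{M}$, if $V_{\mathcal{M},g}(\gamma) = 1$ for each $\gamma \in \Gamma$, then $V_{\mathcal{M},g}(\phi) = 1$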
Note: exercise 4.1 tells us that if a closed formula is true in a model relative
to one variable assignment, then it's true relative to every variable assignment.
Thus, when $\phi$ and the members of $\Gamma$ are all closed formulas, an equivalent
definition of semantic consequence would be this: if every member of $\Gamma$ is true in
$\mathcal{M}$, then so is $\phi$.
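Since predicate logic valuation functions treat the propositional connectives
$\to$ and $\sim$ in the same way as propositional logic valuations do, they also treat
the defined connectives $\wedge$, $\vee$, and $\leftrightarrow$ in the same way:
   $V_{\mathcal{M},g}(\phi \wedge \psi) = 1$ iff $V_{\mathcal{M},g}(\phi) = 1$ and $V_{\mathcal{M},g}(\psi) = 1$
   $V_{\mathcal{M},g}(\phi \vee \psi) = 1$ iff $V_{\mathcal{M},g}(\phi) = 1$ or $V_{\mathcal{M},g}(\psi) = 1$
   $V_{\mathcal{M},g}(\phi \leftrightarrow \psi) = 1$ iff $V_{\mathcal{M},g}(\phi) = V_{\mathcal{M},g}(\psi)$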
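Moreover, we can also prove that $\exists$ gets the correct truth condition:
Example 4.1: Let's show that
   $V_{\mathcal{M},g}(\exists\alpha\phi) = 1$ iff there is some $u \in \mathcal{D}$ such that $V_{\mathcal{M},g^{\alpha}_{u}}(\phi) = 1$
The definition of $\exists\alpha\phi$ is: $\sim\forall\alpha\sim\phi$. So, we must show that for any model, $\mathcal{M}$
($= \langle \mathcal{D}, \mathcal{I} \rangle$), and any variable assignment $g$ for $\mathcal{M}$, $V_{\mathcal{M},g}(\sim\forall\alpha\sim\phi) = 1$ iff there
is some $u \in \mathcal{D}$ such that $V_{\mathcal{M},g^{\alpha}_{u}}(\phi) = 1$. (I'll sometimes stop writing the subscript
$\mathcal{M}$ in order to reduce clutter. It should be obvious from the context what the
relevant model is.) Here's the argument:
   $V_{g}(\sim\forall\alpha\sim\phi) = 1$ iff $V_{g}(\forall\alpha\sim\phi) = 0$   (t.c. for $\sim$)
      iff for some $u \in \mathcal{D}$, $V_{g^{\alpha}_{u}}(\sim\phi) = 0$   (t.c. for $\forall$)
      iff for some $u \in \mathcal{D}$, $V_{g^{\alpha}_{u}}(\phi) = 1$   (t.c. for $\sim$)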
Exercise 4.1** Show that if $\phi$ has no free variables, then for any
model $\mathcal{M}$ and variable assignments $g$ and $h$ for $\mathcal{M}$, $V_{\mathcal{M},g}(\phi) = V_{\mathcal{M},h}(\phi)$
4.3 Establishing validity and invalidity
Given our definitions, we can establish that particular formulas are valid.
Example 4.2: Show that $\forall xFx \to Fa$ is valid. That is, show that this formula
is true relative to any model and any variable assignment for that model:
i) Suppose otherwise; then $V_{\mathcal{M},g}(\forall xFx \to Fa) = 0$, for some model $\mathcal{M} = \langle \mathcal{D}, \mathcal{I} \rangle$
   and variable assignment $g$ for $\mathcal{M}$. So (dropping the $\mathcal{M}$ subscript
   henceforth) $V_{g}(\forall xFx) = 1$ and $V_{g}(Fa) = 0$.
ii) Given the latter, $[a]_{g} \notin \mathcal{I}(F)$. But $[a]_{g} = \mathcal{I}(a)$; so $\mathcal{I}(a) \notin \mathcal{I}(F)$.
iii) Given the former, for any $u \in \mathcal{D}$, $V_{g^{x}_{u}}(Fx) = 1$. But $\mathcal{I}(a) \in \mathcal{D}$, so
   $V_{g^{x}_{\mathcal{I}(a)}}(Fx) = 1$. So, by the truth condition for atomics, $[x]_{g^{x}_{\mathcal{I}(a)}} \in \mathcal{I}(F)$.
   But $[x]_{g^{x}_{\mathcal{I}(a)}} = g^{x}_{\mathcal{I}(a)}(x) = \mathcal{I}(a)$. Thus, $\mathcal{I}(a) \in \mathcal{I}(F)$, contradicting ii).
The claim in step iii) that $\mathcal{I}(a) \in \mathcal{D}$ comes from the definition of an interpretation
function: the interpretation of a name is always a member of the domain.
Notice that "$\mathcal{I}(a)$" is a term of our metalanguage; that's why, when I learn that
"for any $u \in \mathcal{D}$ ..." in step iii), I can set $u$ equal to $\mathcal{I}(a)$.
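Example 4.3: Show that $\vDash \forall x\forall yRxy \to \forall xRxx$ (moving more quickly now):
i) Suppose for reductio that $V_{g}(\forall x\forall yRxy \to \forall xRxx) = 0$ (for some assignment
   $g$ in some model). Then $V_{g}(\forall x\forall yRxy) = 1$ and $V_{g}(\forall xRxx) = 0$.
ii) $V_{g}(\forall xRxx) = 0$. So for some $v \in \mathcal{D}$, $V_{g^{x}_{v}}(Rxx) = 0$. Call one such $v$
   "$u$". So we have: $V_{g^{x}_{u}}(Rxx) = 0$.
iii) Given ii), $\langle [x]_{g^{x}_{u}}, [x]_{g^{x}_{u}} \rangle \notin \mathcal{I}(R)$. $[x]_{g^{x}_{u}}$ is $g^{x}_{u}(x)$, i.e., $u$. So $\langle u, u \rangle \notin \mathcal{I}(R)$
iv) Given i), for every member of $\mathcal{D}$, and so for $u$ in particular, $V_{g^{x}_{u}}(\forall yRxy) = 1$.
   So for every member of $\mathcal{D}$, and so for $u$ in particular, $V_{g^{xy}_{uu}}(Rxy) = 1$.
   So $\langle [x]_{g^{xy}_{uu}}, [y]_{g^{xy}_{uu}} \rangle \in \mathcal{I}(R)$. But $[x]_{g^{xy}_{uu}}$ and $[y]_{g^{xy}_{uu}}$ are each just $u$. Hence
   $\langle u, u \rangle \in \mathcal{I}(R)$, contradicting iii).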
Line ii) of example 4.3 illustrates an elementary inferential practice that is
ubiquitous in mathematical reasoning. Suppose you learn that there exists some
object of a certain type, $T$. Immediately afterwards you should give one of
these objects of type $T$ a name. Say: "call one such object '$u$'". Then continue
your proof, using the name $u$.[2]
Once this practice becomes familiar, I'll streamline proofs by no longer
explicitly saying "call one such object $u$". Instead, after writing down an initial
line of the form "there exists some $u$ of type $T$", I'll subsequently use "$u$" as a
name of one such object. But strictly one ought always to say "call one of the
objects of type $T$ '$u$'", to mark this change in how "$u$" is being used, since in
the initial line "$u$" is not a name, but is rather a bound metalanguage variable
(bound to the metalanguage quantifier "there is some"). (A common mistake to
avoid: using an expression like "$u$" initially as a metalanguage variable, but then
drifting into using it as if it's a name, where it isn't clear which object it names.)
This practice needs to be employed with care. Suppose you introduce "$u$"
as a name for some object of type $T$, and suppose that later in the same proof,
you learn that there exists an object of a certain other type $T'$. You cannot then
introduce the same name "$u$" for some object of type $T'$: what if nothing is
both of type $T$ and of type $T'$? You must instead give the new object a new
name: "$v$", say.
The practice of introducing a name for an object of a certain type is for use
with existentially quantified statements of the metalanguage: statements of
the form "there exists some object of such and such type". It's not for use with
universally quantified statements; if you learn that every object is of a certain
type, it's usually not a good idea to say: "call one such object '$u$'". Instead,
wait. Wait until some particular object or objects of interest have emerged
in the proof; until, for example, you've learned some existentially quantified
statements, and have introduced corresponding names. Only then should you
use the universally quantified statement: you can now apply it to the objects
of interest. For example, if you introduced a name "$u$", you could use a universally
quantified statement "everything is of type $T$" to infer that $u$ is of type $T$.
(Compare line iv) in example 4.3.) In general: deal with existentially quantified
metalanguage statements first, and universally quantified metalanguage statements
later. (Note that statements of the form $V_{g}(\forall\alpha\phi) = 1$ and $V_{g}(\exists\alpha\phi) = 0$
imply universally quantified metalanguage statements, whereas statements of
the form $V_{g}(\exists\alpha\phi) = 1$ and $V_{g}(\forall\alpha\phi) = 0$ imply existentially quantified metalanguage
statements. So deal with the latter first.)
[2] You haven't really attached the name "$u$" to any particular one of the objects of type $T$.
But this doesn't matter, so long as you only use the name $u$ to derive conclusions that could
be derived for any object of type $T$. The practice I'm describing is often called the rule of
"existential elimination" in introductory logic texts.
We've seen how to establish that particular formulas are valid. How do we
show that a formula is invalid? All we must do is exhibit a single model in which
the formula is false. (A valid formula must be true in all models; therefore, it
only takes one model in which a formula is false to make that formula invalid.)
Example 4.4: Show that the formula $(\exists xFx \wedge \exists xGx) \to \exists x(Fx \wedge Gx)$ isn't
valid. We need to find a model in which this formula is false. My model
will contain letters in its domain:
   $\mathcal{D} = \{u, v\}$
   $\mathcal{I}(F) = \{u\}$
   $\mathcal{I}(G) = \{v\}$
It is intuitively clear that the formula is false in this model. In this model,
something is $F$ (namely, u), and something is $G$ (namely, v), but nothing in the
model's domain is both $F$ and $G$.
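The intuitive check in Example 4.4 can also be carried out mechanically. A minimal sketch, assuming we simply represent the domain and the two extensions as Python sets and read each quantifier as a loop over the domain (the variable names are illustrative only):

```python
# The model from Example 4.4.
domain = {"u", "v"}
F = {"u"}
G = {"v"}

some_F = any(d in F for d in domain)                # exists x Fx
some_G = any(d in G for d in domain)                # exists x Gx
some_both = any(d in F and d in G for d in domain)  # exists x (Fx & Gx)

# A conditional is false exactly when its antecedent is true and its consequent false.
antecedent = some_F and some_G
formula_true = (not antecedent) or some_both
print(antecedent, some_both, formula_true)  # True False False
```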
Example 4.5: Show that $\forall x\exists yRxy \nvDash \exists y\forall xRxy$. We must show that the first
formula does not semantically imply the second. So we must come up with a
model and variable assignment in which the first formula is true and the second
is false. (Since these formulas are closed, as noted above it won't matter which
variable assignment we choose; so all we need is a model in which the premise
is true and the conclusion is false.) It helps to think about natural language
sentences that these formulas might represent. If $R$ symbolizes "respects",
then the first formula says that "everyone respects someone or other", and
the second says that "there is someone whom everyone respects". Clearly, the
first can be true while the second is false: suppose that each person respects a
different person, so that no one person is respected by everyone. A simple case
of this occurs when there are just two people, each of whom respects the other,
but neither of whom respects him/herself:
   [Figure: diagram of the two-person case just described]
Here is a model based on this idea:
   $\mathcal{D} = \{u, v\}$
   $\mathcal{I}(R) = \{\langle u, v \rangle, \langle v, u \rangle\}$
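For small domains, countermodels like the one in Example 4.5 can be found by exhaustive search. The sketch below assumes domains of the form `range(size)` and enumerates every two-place relation over them; the function names are hypothetical, and the search is practical only for very small sizes, since a domain of n objects has 2^(n*n) relations.

```python
from itertools import chain, combinations, product


def premise(domain, R):
    """forall x exists y Rxy"""
    return all(any((x, y) in R for y in domain) for x in domain)


def conclusion(domain, R):
    """exists y forall x Rxy"""
    return any(all((x, y) in R for x in domain) for y in domain)


def countermodels(size):
    """Yield each relation R on range(size) making the premise true and the conclusion false."""
    domain = range(size)
    pairs = list(product(domain, repeat=2))
    subsets = chain.from_iterable(
        combinations(pairs, k) for k in range(len(pairs) + 1)
    )
    for subset in subsets:
        R = set(subset)
        if premise(domain, R) and not conclusion(domain, R):
            yield R


found = list(countermodels(2))
# The model of Example 4.5, with 0 and 1 standing in for u and v:
print({(0, 1), (1, 0)} in found)  # True
print(list(countermodels(1)))     # []
```

The empty result for size 1 shows that no one-element model separates the two formulas, so a countermodel needs at least two objects, as in the example.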
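Exercise 4.2 Show that:
a) $\vDash \forall x(Fx \to (Fx \vee Gx))$
b) $\vDash \forall x(Fx \wedge Gx) \to (\forall xFx \wedge \forall xGx)$
c) $\forall x(Fx \to Gx), \forall x(Gx \to Hx) \vDash \forall x(Fx \to Hx)$
d) $\vDash \exists x\forall yRxy \to \forall y\exists xRxy$
Exercise 4.3 Show that:
a) $\nvDash \forall x(Fx \to Gx) \to \forall x(Gx \to Fx)$
b) $\nvDash \forall x(Fx \vee \sim Gx) \to (\forall xFx \vee \sim\exists xGx)$
c) $Rab \nvDash \exists xRxx$
d)** $Fx \nvDash \forall xFx$
e) $\forall x\forall y\forall z[(Rxy \wedge Ryz) \to Rxz], \forall x\exists yRxy \nvDash \exists xRxx$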
4.4 Axiomatic proofs in PC
Let's turn now to proof theory for predicate logic. One can construct natural
deduction, sequent, or axiomatic systems of proof for predicate logic, just as
with propositional logic. (And there are other approaches as well.) Although
axiomatic proofs are less intuitive than the others, we'll take the axiomatic
approach since this will be convenient for use with modal logic later on.
We'll continue to use section 2.6's definitions of the key concepts of the
axiomatic approach: a proof from a set of wffs $\Gamma$ is defined as a sequence of wffs,
each of which is either a member of $\Gamma$, an axiom, or follows from earlier lines
in the proof by a rule; $\phi$ is provable from $\Gamma$ iff $\phi$ is the last line of a proof from
$\Gamma$; $\phi$ is a theorem iff $\phi$ is provable from the empty set, i.e. provable using only
the axioms and rules. Once we have given appropriate axioms and rules for
predicate logic, we will have defined provability in predicate logic ("$\Gamma \vdash_{\mathrm{PC}} \phi$" and
"$\vdash_{\mathrm{PC}} \phi$").
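Our axioms and rules for predicate logic will include our axioms and rules
for propositional logic, plus additional ones dealing with quantifiers:[3]
Axiomatic system for PC:
Rules: modus ponens, plus universal generalization (UG): from $\phi$, infer $\forall\alpha\phi$
Axioms: all instances of PL1-PL3, plus:
   $\forall\alpha\phi \to \phi(\beta/\alpha)$   (PC1)
   $\forall\alpha(\phi \to \psi) \to (\phi \to \forall\alpha\psi)$   (PC2)
where:
   $\phi$, $\psi$, and $\chi$ are any PC-wffs, $\alpha$ is any variable, and $\beta$ is any term
   $\phi(\beta/\alpha)$ results from $\phi$ by "correct substitution" of $\beta$ for $\alpha$ (see
   below)
   in PC2, no occurrences of variable $\alpha$ may be free in $\phi$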
Let's examine the new predicate logic axioms and rule. The rule UG is
based on the idea that proving an arbitrary instance of a universal generalization
suffices to prove that universal generalization. To prove that every $F$ is an $F$,
for example, one picks an arbitrary object, $x$, proves that $Fx \to Fx$, and then
concludes by UG that $\forall x(Fx \to Fx)$. (See also example 4.6.)
Axiomatic proof systems tend to handle inferences using free variables a bit
unsteadily. (It's easier with natural deduction and sequent systems to smooth
out the wrinkles.) For example, our system allows the following proof of $\forall xFx$
from $Fx$:
   1. $Fx$   premise
   2. $\forall xFx$   1, UG
Hence, $Fx \vdash \forall xFx$. Since $\nvdash Fx \to \forall xFx$ (I won't prove this here, but it's true),
and since $Fx \nvDash \forall xFx$ (exercise 4.3d), it follows that unless they are restricted
in certain ways, the deduction theorem (section 2.9) and a generalized version
of soundness ("$\Gamma \vDash \phi$ whenever $\Gamma \vdash \phi$"; compare exercise 2.9) both fail for
our axiomatic system. (The needed restrictions are of a sort familiar from
introductory logic books, which require variables used in connection with UG
to be "new" to proofs.) Let's not worry about this glitch; our interest will be
solely in theoremhood, and in inferences $\Gamma \vdash \phi$ where $\phi$ and all the members of
$\Gamma$ are closed wffs; and UG doesn't lead to bad results in those cases.[4]
[3] See Mendelson (, ).
PC1 embodies the familiar principle of substitution (often called "universal
instantiation"), which yields axioms like $\forall xFx \to Fa$ (and $\forall xFx \to Fb$,
$\forall xFx \to Fx$, etc.) To construct an instance of PC1, you: i) begin with $\forall\alpha\phi$, ii)
strip off the quantifier $\forall\alpha$ to get $\phi$, iii) choose a term (variable or constant) $\beta$,
called the "instantial term", iv) change the $\alpha$s in $\phi$ to $\beta$s to arrive at $\phi(\beta/\alpha)$,
and then v) write down the conditional $\forall\alpha\phi \to \phi(\beta/\alpha)$. But steps iii) and iv)
need to be restricted. First, only the $\alpha$s that are free in $\phi$ are to be changed
in step iv). For example, if $\phi$ is $Fx \to \forall xRxx$ and the instantial term is $a$, you
only change the first $x$ to $a$. (Thus, the resulting axiom is $\forall x(Fx \to \forall xRxx) \to
(Fa \to \forall xRxx)$. It's not $\forall x(Fx \to \forall xRxx) \to (Fa \to \forall aRaa)$; that's not even a
wff.) Second, all free occurrences of $\alpha$ in $\phi$ must be changed to the instantial
term. ($\forall xRxx \to Rxa$ is not an instance of PC1.) Third, if the instantial term is
a variable, none of the occurrences of that variable that would result from the
substitution can be bound in the axiom. For example, $\forall x\exists yRxy \to \exists yRyy$ isn't
an instance of PC1 (even after $\exists$ is replaced with its definition). You can't choose
$y$ as the instantial term here, since the occurrence of $y$ that would result from
the substitution in the consequent (the underlined one: $\forall x\exists yRxy \to \exists yR\underline{y}y$)
would be bound in the would-be axiom, not free. (This wff shouldn't count as
an axiom; it would symbolize, for example, the sentence "If everyone respects
someone (or other) then someone respects him or her self", which isn't a logical
truth.) "Correct substitutions" are those that meet these three restrictions.
The importance of PC2 will be illustrated in the examples below.
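The three restrictions on correct substitution can be made concrete in code. What follows is a sketch under stated assumptions rather than an official algorithm from the text: formulas are encoded as nested tuples, the functions `substitute` and `pc1_instance` are hypothetical names, and the check for the third restriction simply tracks which quantifiers we are inside while descending through $\phi$.

```python
# Formulas are nested tuples (an encoding assumed for this sketch):
#   ("atom", "R", ("x", "y")), ("not", phi), ("imp", phi, psi), ("all", "x", phi)


def substitute(phi, alpha, beta, bound=frozenset()):
    """Return phi(beta/alpha), replacing exactly the free occurrences of alpha.

    `bound` collects the variables whose quantifiers we are currently inside.
    Raises ValueError if an inserted occurrence of beta would end up bound.
    """
    op = phi[0]
    if op == "atom":
        _, pred, terms = phi
        if alpha in terms and beta in bound:
            raise ValueError(f"{beta} would be captured by a quantifier")
        return ("atom", pred, tuple(beta if t == alpha else t for t in terms))
    if op == "not":
        return ("not", substitute(phi[1], alpha, beta, bound))
    if op == "imp":
        return ("imp",
                substitute(phi[1], alpha, beta, bound),
                substitute(phi[2], alpha, beta, bound))
    if op == "all":
        _, var, body = phi
        if var == alpha:
            return phi  # occurrences of alpha below here are bound: leave them
        return ("all", var, substitute(body, alpha, beta, bound | {var}))
    raise ValueError(f"unknown operator: {op!r}")


def pc1_instance(alpha, phi, beta):
    """Build the conditional  forall alpha phi -> phi(beta/alpha)."""
    return ("imp", ("all", alpha, phi), substitute(phi, alpha, beta))


Fx = ("atom", "F", ("x",))
all_Rxx = ("all", "x", ("atom", "R", ("x", "x")))
# phi is Fx -> forall x Rxx; only the first x is free, so only it changes.
print(substitute(("imp", Fx, all_Rxx), "x", "a"))

# phi is exists y Rxy, i.e. ~forall y ~Rxy; y is not a legal instantial term.
exists_y_Rxy = ("not", ("all", "y", ("not", ("atom", "R", ("x", "y")))))
try:
    pc1_instance("x", exists_y_Rxy, "y")
except ValueError as err:
    print("rejected:", err)
```

Replacing every free occurrence in one pass covers the first two restrictions together; the `bound` set handles the third.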
As we saw in section 2.6, constructing axiomatic proofs in propositional
logic can be tedious. We paid our dues in that section, so now let's give
ourselves a break. Suppose, for example, that we want to get the formula
$(\forall xFx \to \forall xGx) \to (\forall xFx \to \forall xFx)$ into one of our predicate logic proofs. Recall
from section 2.6 that we were able to construct an axiomatic proof in
propositional logic of $(P \to Q) \to (P \to P)$. But if we take that proof and change
each $P$ to $\forall xFx$ and each $Q$ to $\forall xGx$, the result is a legal predicate logic proof
of $(\forall xFx \to \forall xGx) \to (\forall xFx \to \forall xFx)$, since our predicate logic axiomatic system
includes the axioms and rules of propositional logic. Instead of actually
inserting this proof of $(\forall xFx \to \forall xGx) \to (\forall xFx \to \forall xFx)$ into our predicate
logic proof, let's allow ourselves to write merely:
   $i$. $(\forall xFx \to \forall xGx) \to (\forall xFx \to \forall xFx)$   PL
In essence, writing "PL" means: "I could prove this line using just PL1-PL3
and MP if I wanted to".
[4] A similar issue will be raised by modal logic's rule of necessitation.
Since our focus in this section is on predicate rather than propositional
logic, let's be quite liberal about when this time-saving expedient may be used:
let's allow it for any formula that is a "PC-tautology". By this I mean the
following. Suppose that $\phi$ is a tautology, i.e., a valid wff of propositional logic.
And suppose that there is some way of uniformly substituting predicate logic
formulas for $\phi$'s sentence letters to obtain a predicate-logic formula $\psi$. In such
a case, we'll say that $\psi$ is a PC-tautology. For example, in the previous paragraph,
$(\forall xFx \to \forall xGx) \to (\forall xFx \to \forall xFx)$ is a PC-tautology, resulting from the
tautology $(P \to Q) \to (P \to P)$. (I call $\psi$ a "PC-tautology" rather than a tautology
full-stop because tautologies have to be propositional logic wffs, whereas $\psi$ is a
predicate logic wff.) Breezily writing "PL" beside any such $\psi$ is justified because
i) our PL-axiomatic system is complete (section 2.9), so $\phi$ has a PL-proof,
and ii) that proof can be converted into a PC-proof of $\psi$ as in the previous
paragraph.
Furthermore, suppose in some PC-proof we have some formulas $\phi_1 \ldots \phi_n$ on
separate lines. And suppose that formula $\psi$ is a "PC-tautological consequence"
of formulas $\phi_1 \ldots \phi_n$, in the sense that the formula
   $(\phi_1 \to (\phi_2 \to \cdots (\phi_n \to \psi) \cdots ))$
is a PC-tautology. Then, let's allow ourselves to enter $\psi$ into our proof, annotating
"PL" and referencing the lines on which $\phi_1 \ldots \phi_n$ occurred. This too
is a harmless shortcut, for since $(\phi_1 \to (\phi_2 \to \cdots (\phi_n \to \psi) \cdots ))$ is a PC-tautology, we
know that a proof of it exists, which we could insert and then use modus ponens
$n$ times from the lines containing $\phi_1 \ldots \phi_n$ to obtain $\psi$ by more legitimate
means.
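When annotating "PL", how do we figure out whether something is a tautology?
Any way we like: with truth tables, natural deduction derivations,
memory, whatever. For future reference, table 4.1 lists some helpful tautologies.
Henceforth, when I annotate a line "PL" I will sometimes refer
parenthetically to one or more of the tautologies in this table, to clarify how I
obtained the line. (The line won't always come exactly or solely from the cited
tautology; my goal here is to make proofs easier to understand, not to introduce
a rigorous convention.)
Table 4.1: Some tautologies
   $\phi \leftrightarrow \sim\sim\phi$   (double negation)
   $(\phi \to \psi) \leftrightarrow (\sim\psi \to \sim\phi)$   (contraposition)
   $((\phi \to \psi) \wedge (\psi \to \chi)) \to (\phi \to \chi)$   (syllogism)
   $(\phi \to (\psi \to \chi)) \leftrightarrow ((\phi \wedge \psi) \to \chi)$   (import/export)
   $(\phi \to (\psi \to \chi)) \leftrightarrow (\psi \to (\phi \to \chi))$   (permutation)
   $((\phi \to \psi) \wedge (\phi \to \chi)) \leftrightarrow (\phi \to (\psi \wedge \chi))$   (composition)
   $((\phi \to \chi) \wedge (\psi \to \chi)) \leftrightarrow ((\phi \vee \psi) \to \chi)$   (dilemma)
   $((\phi \to \psi) \wedge (\psi \to \phi)) \leftrightarrow (\phi \leftrightarrow \psi)$   (biconditional)
   $(\sim\phi \to \psi) \leftrightarrow (\phi \vee \psi)$   (disjunction)
   $(\phi \to \sim\psi) \leftrightarrow \sim(\phi \wedge \psi)$   (negated conjunction)
Also, notice this fact about propositional logic: if
$\phi \leftrightarrow \psi$ is a tautology, then the result of substituting $\phi$ for $\psi$ in any tautology is
itself a tautology.[5] This fact makes table 4.1 all the more useful. For example,
since $(P \to Q) \leftrightarrow (\sim Q \to \sim P)$ is a tautology (contraposition), we can substitute
$\sim Q \to \sim P$ for $P \to Q$ in the tautology $((P \to R) \wedge (R \to Q)) \to (P \to Q)$ (syllogism)
to conclude that $((P \to R) \wedge (R \to Q)) \to (\sim Q \to \sim P)$ is also a tautology.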
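Of the ways of checking for tautologies mentioned above, truth tables are the easiest to mechanize. A minimal sketch, assuming formulas are written as Python functions of their sentence letters (the helper names `implies` and `is_tautology` are illustrative, not from the text):

```python
from itertools import product


def implies(p, q):
    """Material conditional on truth values."""
    return (not p) or q


def is_tautology(formula, num_letters):
    """True iff `formula` is true on every row of its truth table."""
    return all(formula(*row) for row in product([False, True], repeat=num_letters))


# (P -> Q) <-> (~Q -> ~P)   contraposition
contraposition = lambda p, q: implies(p, q) == implies(not q, not p)
# ((P -> R) & (R -> Q)) -> (P -> Q)   syllogism
syllogism = lambda p, q, r: implies(implies(p, r) and implies(r, q), implies(p, q))
# Syllogism with ~Q -> ~P substituted for P -> Q
substituted = lambda p, q, r: implies(implies(p, r) and implies(r, q),
                                      implies(not q, not p))

print(is_tautology(contraposition, 2), is_tautology(syllogism, 3),
      is_tautology(substituted, 3))                 # True True True
print(is_tautology(lambda p, q: implies(p, q), 2))  # False
```

The check examines 2^n rows for n sentence letters, which is fine for the handful of letters that arise when annotating "PL".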
And while we're on the topic of shortcuts, let's also continue in the practice
of doing two or more steps at once, as in section 2.8. (As noted in that section,
whenever we use any of these shortcuts, we are constructing proof sketches
rather than official proofs.)
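Example 4.6: As our first example, let's show that $\forall xFx, \forall x(Fx \to Gx) \vdash_{\mathrm{PC}} \forall xGx$:
   1. $\forall xFx$   Premise
   2. $\forall x(Fx \to Gx)$   Premise
   3. $\forall xFx \to Fx$   PC1
   4. $Fx$   1, 3, MP
   5. $Fx \to Gx$   PC1, 2, MP
   6. $Gx$   4, 5, MP
   7. $\forall xGx$   6, UG
[5] See note 3.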
This proof illustrates the main method for proving universally quantified
formulas: to prove $\forall x\phi$, first prove $\phi$; and then use UG. Here we wanted to
prove $\forall xGx$, so we first proved $Gx$ (line 6) and then used UG. To do this,
notice, we must include formulas with free variables in our proofs. We must
use free variables as instantial terms when using PC1 (lines 3 and 5), we must
apply propositional logic's axioms and rules to formulas with free variables
(lines 4-6), and we must apply UG to such formulas (line 7). This may seem
odd. What does a formula with a free variable mean? Well, intuitively, think of
a free variable as denoting some particular but unspecified object. Thus, think
of line 3, $\forall xFx \to Fx$ (in which the final occurrence of $x$ is free), as saying "if
everything is $F$, then this particular object is $F$". And think of the whole proof as
follows. Since we want to prove $\forall xGx$, we choose an arbitrary object, $x$, and
try to show that $x$ is $G$. Once we do so (line 6), we can conclude that everything
is $G$ because $x$ was arbitrarily chosen.[6]
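Example 4.7: Let's show that $\vdash_{\mathrm{PC}} \forall x\forall yRxy \to \forall y\forall xRxy$ (this will illustrate
the need for PC2):
   1. $\forall x\forall yRxy \to \forall yRxy$   PC1
   2. $\forall yRxy \to Rxy$   PC1
   3. $\forall x\forall yRxy \to Rxy$   1, 2, PL (syllogism)
   4. $\forall x(\forall x\forall yRxy \to Rxy)$   3, UG
   5. $\forall x\forall yRxy \to \forall xRxy$   4, PC2, MP
   6. $\forall x\forall yRxy \to \forall y\forall xRxy$   5, UG, PC2, MP
[6] If any of the premises contained free occurrences of $x$ then $x$ wouldn't really have been
arbitrarily chosen. Such cases are precisely the ones where UG gets restricted in introductory
books; but as I said, I'm not worrying here about this glitch.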
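Example 4.8: A theorem schema that will be useful is the following:
   $\vdash_{\mathrm{PC}} \forall\alpha(\phi \to \psi) \to (\forall\alpha\phi \to \forall\alpha\psi)$   (Distribution)
Any instance of Distribution can be established as follows:
   1. $\forall\alpha(\phi \to \psi) \to (\phi \to \psi)$   PC1
   2. $\forall\alpha\phi \to \phi$   PC1
   3. $\forall\alpha(\phi \to \psi) \to (\forall\alpha\phi \to \psi)$   1, 2, PL (see below)
   4. $\forall\alpha(\forall\alpha(\phi \to \psi) \to (\forall\alpha\phi \to \psi))$   3, UG
   5. $\forall\alpha(\phi \to \psi) \to \forall\alpha(\forall\alpha\phi \to \psi)$   PC2, 4, MP
   6. $\forall\alpha(\forall\alpha\phi \to \psi) \to (\forall\alpha\phi \to \forall\alpha\psi)$   PC2
   7. $\forall\alpha(\phi \to \psi) \to (\forall\alpha\phi \to \forall\alpha\psi)$   5, 6, PL (syllogism)
(Line 3 is via the tautology $(P \to (Q \to R)) \to ((S \to Q) \to (P \to (S \to R)))$. Note that
steps 1 and 2 are legal instances of PC1, regardless of what $\phi$ and $\psi$ look like.
In step 1, for example, we strip off the $\forall\alpha$ from $\forall\alpha(\phi \to \psi)$, and leave $\phi \to \psi$ alone. If
you go back and look at the restrictions on PC1, you will see that since
no occurrences of $\alpha$ within $\phi \to \psi$ are changed, those restrictions are satisfied.
And notice further why the uses of PC2 are correct. Line 6, for example, is
a legal instance of PC2 because the variable $\alpha$ is not free in $\forall\alpha\phi$: any free
occurrences of $\alpha$ in $\phi$ get bound to the quantifier $\forall\alpha$.)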
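Example 4.9: One thing distribution is good for is proving wffs of the form
$\forall x\phi \to \forall x\psi$ where $\phi \to \psi$ is provable. For example:
   1. $(Fx \wedge Gx) \to Fx$   PL
   2. $\forall x((Fx \wedge Gx) \to Fx)$   1, UG
   3. $\forall x(Fx \wedge Gx) \to \forall xFx$   Distribution, 2, MP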
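Example 4.10: Show that $\exists x\forall yRxy \vdash_{\mathrm{PC}} \forall y\exists xRxy$. Given the definition of
$\exists$ this means showing that $\sim\forall x\sim\forall yRxy \vdash_{\mathrm{PC}} \forall y\sim\forall x\sim Rxy$:
   1. $\sim\forall x\sim\forall yRxy$   premise
   2. $\forall yRxy \to Rxy$   PC1
   3. $\sim Rxy \to \sim\forall yRxy$   2, PL (contraposition)
   4. $\forall x(\sim Rxy \to \sim\forall yRxy)$   3, UG
   5. $\forall x\sim Rxy \to \forall x\sim\forall yRxy$   Distribution, 4, MP
   6. $\sim\forall x\sim Rxy$   1, 5, PL (contraposition)
   7. $\forall y\sim\forall x\sim Rxy$   6, UG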
My approach to this problem was to work my way backwards. (This approach
is often helpful.) I set myself an initial goal, and then thought about how to
reach that goal. Whatever I would need to reach that initial goal became my
new goal. Then I thought about how to reach this new goal. I continued in this
way until I got a goal I knew how to reach. In this case, this thought process
went as follows:
   goal 1: get $\forall y\sim\forall x\sim Rxy$ (since this is the conclusion of the argument)
   goal 2: get $\sim\forall x\sim Rxy$ (since then I can get goal 1 by UG)
   goal 3: get $\forall x\sim Rxy \to \forall x\sim\forall yRxy$ (since then I can get goal 2 from the
   argument's premise and propositional logic)
   goal 4: get $\sim Rxy \to \sim\forall yRxy$ (since then I can get goal 3 by UG and
   distribution)
Once I had written down goal 4, I had something I knew how to achieve, so
then I started work on the actual proof. I then worked backwards toward the
ultimate goal: goal 1. Notice in particular goal 3. Something like this strategy
is often needed in connection with negation. I figured that at some point I
would need to use the argument's premise, which was a negation. And a natural
way to use a negation, $\sim\phi$, is to attempt to prove some conditional $\psi \to \phi$, and
then conclude $\sim\psi$ by modus tollens. This is what happened in goal 3.
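Exercise 4.4 Construct axiomatic proofs to establish each of the
following facts. You may use the various shortcuts introduced in
this chapter; and you may use the principle of Distribution.
a) $\forall x(Fx \to Gx), \forall x(Gx \to Hx) \vdash_{\mathrm{PC}} \forall x(Fx \to Hx)$
b) $\vdash_{\mathrm{PC}} Fa \to \exists xFx$
c) $\vdash_{\mathrm{PC}} \forall xRax \to \forall x\exists yRyx$
d) $\exists xRax, \forall y(Ray \to \forall zRzy) \vdash_{\mathrm{PC}} \exists x\forall zRzx$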
4.5 Metalogic of PC
We have given a semantics and a proof theory for predicate logic. Mathematical
logicians have proved fascinating metalogical results about this semantics and
proof theory. Although the raison d'être of this book is to not focus on these
matters in detail, the results are important to appreciate. I'll state (informally
and without proof) and comment on some of the most significant results.[7]
Needless to say, our discussion will only scratch the surface.
Soundness and Completeness. When $\phi$ and the members of $\Gamma$ are all sentences (wffs
without free variables [8]), then it can be shown that $\Gamma \vdash_{\mathrm{PC}} \phi$ iff $\Gamma \vDash_{\mathrm{PC}} \phi$. For
predicate logic (closed, first-order wffs), provability and semantic consequence
coincide. Thus, one can establish facts of the form $\Gamma \nvdash \phi$ by exhibiting a model
in which all members of $\Gamma$ are true and $\phi$ is false, and then citing soundness;
and one can establish facts of the form $\Gamma \vdash \phi$ while avoiding the agonies of
axiomatic proofs by reasoning directly about models to conclusions about
semantic consequence, and then citing completeness.
Compactness. Say that a set of sentences is satisfiable iff there is some model
in which each of its members is true. It can be shown that if each finite subset
of a set $\Gamma$ of sentences is satisfiable, then $\Gamma$ itself must be satisfiable. This result,
known as compactness, is intuitively surprising because it holds even in the
case where $\Gamma$ contains infinitely many sentences. One might have thought that
there could be some contradiction latent within some infinite set $\Gamma$, preventing
it from being satisfiable, but which only emerges when you consider all of its
infinitely many members together: a contradiction which does not emerge,
that is, if you consider only finite subsets of $\Gamma$. Compactness says that this can
never happen.
Compactness is a sign of a kind of expressive weakness in (first-order)
predicate logic. The weakness pertains to infinity: intuitively speaking, you
can't say anything in predicate logic whose logical significance would emerge
only in connection with infinitely many other sentences. For example, after
we add the identity sign to predicate logic in section 5.1, we will show how to
symbolize the sentences "there are at least two $F$s", "there are at least three
$F$s", and so on. Call these symbolizations $\exists_2 xFx$, $\exists_3 xFx$, .... These symbolize
the various numeric claims in the sense that $\exists_n xFx$ is true in a model iff the
extension of $F$ in that model has at least $n$ members. Given compactness, there
is no way to symbolize, in this same sense of "symbolize", "there are finitely
many $F$s". For if there existed a sentence, $\phi$, that is true in a given model iff
the extension of $F$ in that model is finite, then the following infinite set would
violate compactness: $\{\phi, \exists_2 xFx, \exists_3 xFx, \ldots\}$ (exercise 4.5).
[7] See, for example, Boolos et al. () or Mendelson () for the details.
[8] Other axiomatic systems for predicate logic can be given which are sound and complete
even for inferences involving free variables.
Undecidability says roughly that there is no mechanical procedure for deciding
whether a given sentence of predicate logic is valid. Intuitively, this means
that there is no way to write a computer program that will tell you whether an
arbitrary sentence is valid or invalid, in the sense that:
i) You feed the program sentences; it can give answers of the form "valid"
   or "invalid"
ii) It never answers incorrectly. That is, if it says "valid" then the sentence
   is indeed valid; if it says "invalid" then the sentence is indeed invalid
iii) If you feed it a valid sentence it eventually answers "valid"
iv) If you feed it an invalid sentence it eventually answers "invalid"
The intuitive idea of a "mechanical procedure" needs to be precisely defined,
of course. But, it turns out, all reasonable ways of defining it are equivalent.
(One common definition is that of a "Turing Machine".) So the upshot is: on
any reasonable construal of "mechanical procedure", there's no mechanical
procedure for figuring out whether an arbitrary sentence is PC-valid. (Given
soundness and completeness, it follows that there's no mechanical procedure to
figure out whether an arbitrary sentence is a PC-theorem.) There are, it turns
out, mechanical "positive" tests for validity, in the sense of computer programs
satisfying i)-iii). Such a program would be guaranteed to correctly classify any
valid formula as such. But if you fed it an invalid formula, it might just go on
churning away forever, never delivering an answer.
Gödel's incompleteness theorem. One can write down axioms for predicate logic
from which one can prove all and only the valid sentences of predicate logic. (That is
what the soundness and completeness theorems say.) This axiomatic approach
has been attempted in other areas as well. Euclid, for example, attempted to
write down axioms for plane geometry. The intent was that one could prove
all and only the truths of plane geometry using his axioms. What Kurt Gödel
showed is that this axiomatic approach will not work for the truths of arithmetic.
Arithmetic is the theory of multiplication and addition over natural numbers.
One can represent statements of arithmetic using the language of predicate
logic.[9] Can we write down axioms for arithmetic? That is, are there axioms
from which one can prove all and only the truths of arithmetic? In a trivial sense
there are: we could just say "let each truth of arithmetic be an axiom". But such
an "axiomatic system" would be useless; there would be no way of telling what
counts as an axiom! Gödel's (first) incompleteness theorem tells us that there
is no set $S$ of axioms such that i) there is a mechanical procedure for telling
what counts as a member of $S$, and ii) one can prove all and only the truths
of arithmetic from $S$. (It can also be shown that there exists no mechanical
procedure for figuring out whether an arbitrary sentence of arithmetic is true.)
Exercise 4.5* Show that the set $\{\phi, \exists_2 xFx, \exists_3 xFx, \ldots\}$ mentioned
above would violate compactness.
[9] Including identity; see section 5.1.
Chapter 5
Beyond Standard Predicate Logic
Standard predicate logic is powerful. It can be used to model the logical
structure of a significant portion of natural language. Still, it isn't perfect.
In this chapter we consider some of its limitations, and in each case we'll discuss
additions to predicate logic to make up for the deficits.[1]
5.1 Identity
How might we symbolize "Only Ted is happy" using predicate logic? "$Ht$" gets
half of it right (we've said that Ted is happy) but we've left out the "only" part.
We can't say $Ht \wedge \sim\exists xHx$, because that's a logical falsehood: if the first part,
"Ted is happy", is true, then the second part, "it's not the case that someone is
happy" can't be right, since Ted is a someone, and we just said that he's happy.
What we want to add to $Ht$ is that it's not the case that someone else is happy.
But how to say "someone else"?
"Someone else" means: someone not identical to. So we need a predicate for
identity. Now, we could simply choose some two-place predicate to symbolize
"is identical to": $I$, say. Then we could symbolize "Only Ted is happy" as
meaning $Ht \wedge \sim\exists x(Hx \wedge \sim Ixt)$. But treating "is identical to" as just another
predicate sells it short. For surely it's a logical truth that everything is self-identical,
whereas the sentence $\forall xIxx$ is not PC-valid.
In order to recognize distinctive logical truths issuing from the meaning of
"is identical to", we must treat that predicate as a logical constant (recall section
1.6). To mark this special status, we'll symbolize identity with a symbol unlike
other predicates: "=". And we'll write it between its two arguments rather than
before them: we write $\alpha = \beta$ rather than $= \alpha\beta$. We can now symbolize "Only
Ted is happy" thus: $Ht \wedge \sim\exists x(Hx \wedge \sim x = t)$.
[1] Actually "standard" predicate logic is often taken to already include the identity sign, and
sometimes function symbols as well.
5.1.1 Grammar for the identity sign
We first need to expand our grammar of predicate logic to allow for the new
symbol "=". Two changes are needed. First, we need to add "=" to the primitive
vocabulary of predicate logic. Then we need to add the following clause to the
definition of a well-formed formula:
   If $\alpha$ and $\beta$ are terms, then $\alpha = \beta$ is a wff
I'm now using the symbol "=" as the object-language symbol for identity.
But I've also been using "=" as the metalanguage symbol for identity, for instance
when I write things like "$\mathcal{I}(P) = 1$". This shouldn't generally cause confusion,
but if there's a danger of misunderstanding, I'll clarify by writing things like:
"$\mathcal{I}(P)$ = (i.e., is the same object as) 1", to make clear that it's the metalanguage's
identity predicate I'm using.
5.1.2 Semantics for the identity sign
This is easy. We keep the notion of a PC-model from the last chapter, and
simply add a clause to the definition of a valuation function telling it what truth
values to give to sentences containing the = sign. Here is the clause:
   $V_{\mathcal{M},g}(\alpha = \beta) = 1$ iff: $[\alpha]_{\mathcal{M},g}$ = (i.e., is the same object as) $[\beta]_{\mathcal{M},g}$
That is, the wff $\alpha = \beta$ is true iff the terms $\alpha$ and $\beta$ refer to the same object.
Example 5.1: Show that the formula $\forall x\exists y\, x = y$ is valid. Let $g$ be any variable
assignment for any model, and suppose for reductio that $V_{g}(\forall x\exists y\, x = y) = 0$.
Given the clause for $\forall$, we know that for some object in the domain, call it "$u$",
$V_{g^{x}_{u}}(\exists y\, x = y) = 0$. Given the clause for $\exists$, for every member of the domain, and
so for $u$ in particular, $V_{g^{xy}_{uu}}(x = y) = 0$. So, given the clause for "=", $[x]_{g^{xy}_{uu}}$ is not
the same object as $[y]_{g^{xy}_{uu}}$. But $[x]_{g^{xy}_{uu}}$ and $[y]_{g^{xy}_{uu}}$ are the same object. $[x]_{g^{xy}_{uu}}$ is
$g^{xy}_{uu}(x)$, i.e., $u$; and $[y]_{g^{xy}_{uu}}$ is $g^{xy}_{uu}(y)$, i.e., $u$.
5.1.3 Symbolizations with the identity sign
Why do we ever add anything to our list of logical constants? Why not stick
with the tried and true logical constants of propositional and predicate logic?
We generally add a logical constant when it has a distinctive inferential and
semantic role, and when it has very general application: when, that is, it occurs
in a wide range of linguistic contexts. We studied the distinctive semantic role
of "=" in the previous section. In this section, we'll have a quick look at some
linguistic contexts that can be symbolized using "=".
The most obvious sentences that may be symbolized with "=" are those
that explicitly concern identity, such as "Mark Twain is identical to Samuel
Clemens":
$t = c$
and "Every man fails to be identical to George Sand":
$\forall x(Mx \to \sim x = s)$
(It will be convenient to abbreviate $\sim \alpha = \beta$ as $\alpha \neq \beta$. Thus, the second
symbolization can be rewritten as: $\forall x(Mx \to x \neq s)$.) But many other sentences involve
the concept of identity in subtler ways.
For example, there are sentences involving "only", as the example "Only Ted
is happy" illustrated. Next, consider "Every lawyer hates every other lawyer".
The "other" signifies nonidentity; we have, therefore:
$\forall x(Lx \to \forall y[(Ly \land x \neq y) \to Hxy])$
Another interesting class of sentences concerns number. We cannot symbolize
"There are at least two dinosaurs" as: "$\exists x \exists y(Dx \land Dy)$", since this would be
true even if there were only one dinosaur: x and y could be assigned the same
dinosaur. The identity sign to the rescue:
$\exists x \exists y(Dx \land Dy \land x \neq y)$
This says that there are two different objects, x and y, each of which is a
dinosaur. To say "There are at least three dinosaurs" we say:
$\exists x \exists y \exists z(Dx \land Dy \land Dz \land x \neq y \land x \neq z \land y \neq z)$
Indeed, for any n, one can construct a sentence $\phi_n$ that symbolizes "there are
at least n Fs":
$\phi_n$: $\exists x_1 \ldots \exists x_n (Fx_1 \land \cdots \land Fx_n \land \delta)$
where $\delta$ is the conjunction of all sentences "$x_i \neq x_j$" where i and j are integers
between 1 and n (inclusive) and i < j. (The sentence $\delta$ says in effect that no
two of the variables $x_1 \ldots x_n$ stand for the same object.)
Since we can construct each $\phi_n$, we can symbolize other sentences involving
number as well. To say that there are at most n Fs, we write: $\sim\phi_{n+1}$. To say
that there are between n and m Fs (where m > n), we write: $\phi_n \land \sim\phi_{m+1}$. To
say that there are exactly n Fs, we write: $\phi_n \land \sim\phi_{n+1}$.
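The recipe for the "at least n" sentences is mechanical enough to automate. Here is a minimal sketch that builds them as plain strings; the function names, the variable naming scheme (x1, x2, ...), and the flat string output are assumptions made for illustration, not part of the official grammar.

```python
from itertools import combinations


def at_least(n, pred="F"):
    """Build the sentence 'there are at least n Fs' as a string."""
    xs = [f"x{i}" for i in range(1, n + 1)]
    prefix = "".join(f"∃{x}" for x in xs)
    conjuncts = [f"{pred}{x}" for x in xs]
    # The non-identity conjunction: one conjunct for each pair i < j.
    conjuncts += [f"{a}≠{b}" for a, b in combinations(xs, 2)]
    return f"{prefix}({' ∧ '.join(conjuncts)})"


def at_most(n, pred="F"):
    """'At most n Fs' is the negation of 'at least n+1 Fs'."""
    return f"∼{at_least(n + 1, pred)}"


def exactly(n, pred="F"):
    """'Exactly n Fs' conjoins 'at least n' with 'at most n'."""
    return f"{at_least(n, pred)} ∧ {at_most(n, pred)}"


two_dinosaurs = at_least(2, "D")
```

Since the non-identity part has one conjunct per pair of variables, the sentence for n has n(n-1)/2 of them, which is why these symbolizations grow quickly.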
These methods for constructing sentences involving number will always
work; but one can often construct shorter numerical symbolizations by other
methods. For example, to say "there are exactly two dinosaurs", instead of
saying "there are at least two dinosaurs, and it's not the case that there are at
least three dinosaurs", we could say instead:
$\exists x \exists y(Dx \land Dy \land x \neq y \land \forall z[Dz \to (z = x \lor z = y)])$
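Exercise 5.1 Demonstrate each of the following:
a) $Fab \vDash \forall x(x = a \to Fxb)$
b) $\exists x \exists y \exists z(Fx \land Fy \land Fz \land x \neq y \land x \neq z \land y \neq z)$,
$\forall x(Fx \to (Gx \land Hx)) \vDash \exists x \exists y \exists z(Gx \land Gy \land Gz \land x \neq y \land x \neq z \land y \neq z)$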
Exercise 5.2 Symbolize each of the following, using predicate
logic with identity.
a) Everyone who loves someone else loves everyone
b) The only truly great player who plays in the NBA is Allen
Iverson
c) If a person shares a solitary confinement cell with a guard,
then they are the only people in the cell
d) There are at least five dinosaurs (What is the shortest
symbolization you can find?)
5.2 Function symbols
A singular term, such as "Ted", "New York City", "George W. Bush's father",
or "the sum of 1 and 2", is a term that purports to refer to a single entity.
Notice that some of these have semantically significant structure. "George
W. Bush's father", for example, means what it does because of the meaning
of "George W. Bush" and the meaning of "father" (and the meaning of the
possessive construction). But standard predicate logic's only (constant) singular
terms are its names: a, b, c, ..., which do not have semantically significant parts.
Thus, using predicate logic's names to symbolize semantically complex English
singular terms leads to an inadequate representation.
Suppose, for example, that we give the following symbolizations:
"3 is the sum of 1 and 2": $a = b$
"George W. Bush's father was a politician": $Pc$
By symbolizing "the sum of 1 and 2" as simply "b", the first symbolization ignores
the fact that "1", "2", and "sum" are semantically significant constituents of "the
sum of 1 and 2"; and by symbolizing "George W. Bush's father" as "c", we ignore
the semantically significant occurrences of "George W. Bush" and "father". This
is a bad idea. We ought, rather, to produce symbolizations of these terms
that take account of their semantic complexity. The symbolizations ought to
account for the distinctive logical behavior of sentences containing the complex
terms. For example, the sentence "George W. Bush's father was a politician"
logically implies the sentence "Someone's father was a politician". This ought
to be reflected in the symbolizations; the first sentence's symbolization ought
to semantically imply the second sentence's symbolization.
One way of doing this is via an extension of predicate logic: we add function
symbols to its primitive vocabulary. Think of "George W. Bush's father" as
the result of plugging "George W. Bush" into the blank in "___'s father". "___'s
father" is an English function symbol. Function symbols are like predicates
in some ways. The predicate "___ is happy" has a blank in it, in which you can
put a name. "___'s father" is similar in that you can put a name into its blank.
But there is a difference: when you put a name into the blank of "___ is happy",
you get a complete sentence, such as "Ted is happy", whereas when you put a
name into the blank of "___'s father", you get a noun phrase, such as "George
W. Bush's father".
Corresponding to English function symbols, we'll add logical function
symbols. We'll symbolize "___'s father" as f( ). We can put names into the
blank here. Thus, we'll symbolize "George W. Bush's father" as "f(a)", where
"a" symbolizes "George W. Bush".
This story needs to be revised in two ways. First, what goes into the blank
doesn't have to be a name: it could be something that itself contains a function
symbol. E.g., in English you can say: "George W. Bush's father's father". We'd
symbolize this as: $f(f(a))$. Second, just as we have multi-place predicates, we
have multi-place function symbols. "The sum of 1 and 2" contains the function
symbol "the sum of ___ and ___". When you fill in the blanks with the names "1"
and "2", you get the noun phrase "the sum of 1 and 2". So, we symbolize this
using the two-place function symbol, "s( , )". If we let "a" symbolize "1" and
"b" symbolize "2", then "the sum of 1 and 2" becomes: $s(a, b)$.
The result of plugging names into function symbols in English is a noun
phrase. Noun phrases combine with predicates to form complete sentences.
Function symbols function analogously in logic. Once you combine a function
symbol with a name, you can take the whole thing, apply a predicate to it, and
get a complete sentence. Thus, the sentence "George W. Bush's father was a
politician" becomes:
$Pf(a)$
And "3 is the sum of 1 and 2" becomes:
$c = s(a, b)$
(here "c" symbolizes "3"). We can put variables into the blanks of function
symbols, too. Thus, we can symbolize "Someone's father was a politician" as
$\exists x\, Pf(x)$
Example 5.2: Symbolize the following sentences using predicate logic with
identity and function symbols:
Everyone loves his or her father
$\forall x\, Lxf(x)$
No one's father is also his or her mother
$\sim\exists x\, f(x) = m(x)$
No one is his or her own father
$\sim\exists x\, x = f(x)$
A person's maternal grandfather hates that person's paternal
grandmother
$\forall x\, H f(m(x))\, m(f(x))$
Every even number is the sum of two prime numbers
$\forall x(Ex \to \exists y \exists z(Py \land Pz \land x = s(y, z)))$
Exercise 5.3 Symbolize each of the following, using predicate
logic with identity and function symbols.
a) The product of an even number and an odd number is an
even number.
b) If the square of a number that is divisible by each smaller
number is odd, then that number is greater than all numbers. (I
know, the sentence is silly.)
5.2.1 Grammar for function symbols
We need to update our grammar to allow for function symbols. First, we need
to add function symbols to our primitive vocabulary:
for each n > 0, n-place function symbols f, g, ..., with or without subscripts
The definition of a wff, actually, stays the same. What needs to change is the
definition of a "term". Before, terms were just names or variables. Now, we
need to allow for f(a), f(f(a)), etc., to be terms. This is done by the following
recursive definition of a term: [2]
Definition of terms:
names and variables are terms
if f is an n-place function symbol and $\alpha_1 \ldots \alpha_n$ are terms, then $f(\alpha_1 \ldots \alpha_n)$
is a term
Only strings that can be shown to be terms by the preceding clauses are
terms
[2] Complex terms formed from function symbols with more than one place do not, officially,
contain commas. But to improve readability I will write, for example, f(x, y) instead of f(xy).
5.2.2 Semantics for function symbols
We now need to update our definition of a PC-model by saying what the
interpretation of a function symbol is. That's easy: the interpretation of an
n-place function symbol ought to be an n-place function defined on the model's
domain, i.e., a rule that maps any n members of the model's domain to another
member of the model's domain. For example, in a model in which the domain
is a set of people and the one-place function symbol f( ) is to represent "___'s
father", the interpretation of f will be the function that assigns to any member
of the domain that object's father. So we must add to our definition of a model
the following clause (call the new models "PC+FS-models", for "predicate
calculus plus function symbols"):
If f is an n-place function symbol, then $\mathcal{I}(f)$ is an n-place (total) function
defined on $\mathcal{D}$.
Calling the function a "total" function "defined on $\mathcal{D}$" means that the function
must have a well-defined output (which is a member of $\mathcal{D}$) whenever it is given
as inputs any n members of $\mathcal{D}$. So if, for example, $\mathcal{D}$ contains both numbers
and people, $\mathcal{I}(f)$ could not be the father-of function, since that function is
undefined for numbers.
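The definition of the valuation function stays the same; all we need to do is
update the definition of denotation to accommodate our new complex terms:
Definition of denotation: For any model $\mathcal{M}$ (= $\langle \mathcal{D}, \mathcal{I} \rangle$), variable assignment
g for $\mathcal{M}$, and term $\alpha$, $[\alpha]_{\mathcal{M},g}$ is defined as follows:
$$[\alpha]_{\mathcal{M},g} = \begin{cases} \mathcal{I}(\alpha) & \text{if } \alpha \text{ is a constant} \\ g(\alpha) & \text{if } \alpha \text{ is a variable} \\ \mathcal{I}(f)([\alpha_1]_{\mathcal{M},g} \ldots [\alpha_n]_{\mathcal{M},g}) & \text{if } \alpha \text{ is a complex term } f(\alpha_1 \ldots \alpha_n) \end{cases}$$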
Note the recursive nature of this definition: the denotation of a complex term
is defined in terms of the denotations of its smaller parts. Let's think carefully
about what the final clause says. It says that, in order to calculate the denotation
of the complex term $f(\alpha_1 \ldots \alpha_n)$ (relative to assignment g), we must first figure
out what $\mathcal{I}(f)$ is, that is, what the interpretation function $\mathcal{I}$ assigns to the
function symbol f. This object, the new definition of a model tells us, is an
n-place function on the domain. We then take this function, $\mathcal{I}(f)$, and apply
it to n arguments: namely, the denotations (relative to g) of the terms $\alpha_1 \ldots \alpha_n$.
The result is our desired denotation of $f(\alpha_1 \ldots \alpha_n)$.
It may help to think about a simple case. Suppose that f is a one-place
function symbol; suppose our domain consists of the set of natural numbers;
suppose that the name a denotes the number 3 in this model (i.e., $\mathcal{I}(a) = 3$),
and suppose that f denotes the successor function (i.e., $\mathcal{I}(f)$ is the function,
successor, that assigns to any natural number n the number n + 1.) In that case,
the definition tells us that:
$[f(a)]_g = \mathcal{I}(f)([a]_g)$
$= \mathcal{I}(f)(\mathcal{I}(a))$
$= \text{successor}(3)$
$= 4$
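The recursive character of the definition of denotation is easy to see when it is written as a recursive procedure. The following is a sketch only, and its encoding is an assumption of mine: simple terms are strings, complex terms are tuples whose first member is the function symbol, the interpretation function is a dict mapping names to objects and function symbols to Python functions, and the variable assignment g is a dict.

```python
def denotation(term, interp, g):
    """Recursively compute the denotation of a term.

    Simple terms are strings (names or variables); complex terms are tuples
    of the form ("f", arg1, ..., argn).
    """
    if isinstance(term, tuple):
        symbol, *args = term
        # Apply the interpretation of f to the denotations of the smaller terms.
        return interp[symbol](*(denotation(a, interp, g) for a in args))
    if term in g:                 # a variable: consult the assignment
        return g[term]
    return interp[term]           # a name: consult the interpretation


# The simple case from the text: "a" denotes 3 and "f" denotes successor.
# The two-place "s" (sum) is added here as a further illustration.
interp = {"a": 3, "f": lambda n: n + 1, "s": lambda m, n: m + n}

result = denotation(("f", "a"), interp, {})
nested = denotation(("f", ("f", "x")), interp, {"x": 10})
summed = denotation(("s", "a", ("f", "a")), interp, {})
```

Here `result` corresponds to the calculation of $[f(a)]_g$ just given; `nested` shows that the same clause handles a function symbol applied to a term that itself contains a function symbol.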
Example 5.3: Here's a sample metalanguage argument that makes use of
the new definitions. As mentioned earlier, "George W. Bush's father was a
politician" logically implies "Someone's father was a politician". Let's show that
these sentences' symbolizations stand in the relation of semantic implication.
That is, let's show that $Pf(c) \vDash \exists x\, Pf(x)$:
i) Suppose for reductio that for some model and variable assignment g,
$V_g(Pf(c)) = 1$, but
ii) $V_g(\exists x\, Pf(x)) = 0$
iii) By line i), $V_g(Pf(c)) = 1$, and so $[f(c)]_g \in \mathcal{I}(P)$. $[f(c)]_g$ is just $\mathcal{I}(f)([c]_g)$,
and $[c]_g$ is just $\mathcal{I}(c)$. So $\mathcal{I}(f)(\mathcal{I}(c)) \in \mathcal{I}(P)$.
iv) By ii), for every member of $\mathcal{D}$, and so for $\mathcal{I}(c)$ in particular, $V_{g^{x}_{\mathcal{I}(c)}}(Pf(x)) = 0$.
So $[f(x)]_{g^{x}_{\mathcal{I}(c)}} \notin \mathcal{I}(P)$. But $[f(x)]_{g^{x}_{\mathcal{I}(c)}} = \mathcal{I}(f)([x]_{g^{x}_{\mathcal{I}(c)}})$, and
$[x]_{g^{x}_{\mathcal{I}(c)}} = g^{x}_{\mathcal{I}(c)}(x) = \mathcal{I}(c)$. So $\mathcal{I}(f)(\mathcal{I}(c)) \notin \mathcal{I}(P)$, which contradicts line iii).
Exercise 5.4 Demonstrate each of the following:
a) $\vDash \forall x\, Fx \to Ff(a)$
b) $\{\forall x\, f(x) \neq x\} \nvDash \exists x \exists y(f(x) = y \land f(y) = x)$
5.3 Definite descriptions
Our logic has gotten more powerful with the addition of function symbols,
but it still isn't perfect. Function symbols let us "break up" certain complex
singular terms, e.g., "Bush's father". But there are others we still can't break
up, e.g., "The black cat". Even with function symbols, the only candidate for
a direct symbolization of this phrase into the language of predicate logic is a
simple name, "a" for example. But this symbolization ignores the fact that "the
black cat" contains "black" and "cat" as semantically significant constituents.
It therefore fails to provide a good model of this term's distinctively logical
behavior. For example, "The black cat is happy" logically implies "Some cat
is happy". But the simple-minded symbolization of the first sentence, $Ha$,
obviously does not semantically imply $\exists x(Cx \land Hx)$.
One response is to introduce another extension of predicate logic. We
introduce a new symbol, $\iota$, to stand for "the". The grammatical function of
"the" in English is to turn predicates into noun phrases. "Black cat" is a predicate
of English; "the black cat" is a noun phrase that refers to the thing that satisfies
the predicate "black cat". Similarly, in logic, given a predicate F, we'll let $\iota x\, Fx$
be a term that means: the thing that is F.
We'll want to let $\iota x$ attach to complex wffs, not just simple predicates. To
symbolize "the black cat", i.e., the thing that is both black and a cat, we want
to write: $\iota x(Bx \land Cx)$. In fact, we'll let $\iota x$ attach to wffs with arbitrary complexity.
To symbolize "the fireman who saved someone", we'll write: $\iota x(Fx \land \exists y\, Sxy)$.
5.3.1 Grammar for $\iota$
To the primitive vocabulary of the previous section, we add one further
expression: $\iota$. And we revise our definition of terms and wffs, as follows:
Definition of terms and wffs:
i) names and variables are terms
ii) if $\phi$ is a wff and $\alpha$ is a variable then $\iota\alpha\phi$ is a term
iii) if f is an n-place function symbol, and $\alpha_1 \ldots \alpha_n$ are terms, then $f(\alpha_1 \ldots \alpha_n)$
is a term
iv) if $\Pi$ is an n-place predicate and $\alpha_1 \ldots \alpha_n$ are terms, then $\Pi\alpha_1 \ldots \alpha_n$ is a wff
v) If $\alpha$ and $\beta$ are terms, then $\alpha = \beta$ is a wff
vi) if $\phi$, $\psi$ are wffs, and $\alpha$ is a variable, then $\sim\phi$, $(\phi \to \psi)$, and $\forall\alpha\phi$ are wffs
vii) Only strings that can be shown to be terms or wffs using i)-vi) are terms
or wffs
Notice how we needed to combine the recursive definitions of term and wff
into a single recursive definition of wffs and terms together. The reason is that
we need the notion of a wff to define what counts as a term containing the $\iota$
operator (clause ii); but we need the notion of a term to define what counts as
a wff (clause iv). The way we accomplish this is not circular. The reason it isn't
is that we can always decide, using these rules, whether a given string counts as
a wff or term by looking at whether smaller strings count as wffs or terms. And
the smallest strings are said to be wffs or terms in non-circular ways.
5.3.2 Semantics for $\iota$
We need to update the definition of denotation so that $\iota x\phi$ will denote the one
and only thing in the domain that is $\phi$. But there's a snag. What if there is
no such thing as "the one and only thing in the domain that is $\phi$"? Suppose
that "K" symbolizes "king of" and "a" symbolizes "USA". Then what should
"$\iota x\, Kxa$" denote? It is trying to denote the king of the USA, but there is no
such thing. Further, what if more than one thing satisfies the predicate? What
should "the daughter of George W. Bush" denote, given that Bush has more
than one daughter? In short, what do we say about "empty descriptions"?
One approach is to say that every atomic sentence with an empty description
is false. [3] To implement this thought, we keep the definition of a PC+FS model
from before, but rework the definition of truth in a model as follows:
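Definition of denotation and valuation: The denotation and valuation
functions, $[\ ]_{\mathcal{M},g}$ and $V_{\mathcal{M},g}$, for PC+FS-model $\mathcal{M}$ (= $\langle \mathcal{D}, \mathcal{I} \rangle$) and variable
assignment g, are defined as the functions that satisfy the following constraints:
i) $V_{\mathcal{M},g}$ assigns to each wff either 0 or 1
ii) For any term $\alpha$,
$$[\alpha]_{\mathcal{M},g} = \begin{cases} \mathcal{I}(\alpha) & \text{if } \alpha \text{ is a constant} \\ g(\alpha) & \text{if } \alpha \text{ is a variable} \\ \mathcal{I}(f)([\alpha_1]_{\mathcal{M},g} \ldots [\alpha_n]_{\mathcal{M},g}) & \text{if } \alpha \text{ has the form } f(\alpha_1 \ldots \alpha_n) \text{ and } [\alpha_1]_{\mathcal{M},g} \ldots [\alpha_n]_{\mathcal{M},g} \text{ are all defined} \\ \text{undefined} & \text{if } \alpha \text{ has the form } f(\alpha_1 \ldots \alpha_n) \text{ and not all of } [\alpha_1]_{\mathcal{M},g} \ldots [\alpha_n]_{\mathcal{M},g} \text{ are defined} \\ \text{the } u \in \mathcal{D} \text{ such that } V_{\mathcal{M},g^{\beta}_{u}}(\phi) = 1 & \text{if } \alpha \text{ has the form } \iota\beta\phi \text{ and there is a unique such } u \\ \text{undefined} & \text{if } \alpha \text{ has the form } \iota\beta\phi \text{ and there is no unique such } u \end{cases}$$
iii) for any n-place predicate $\Pi$ and any terms $\alpha_1 \ldots \alpha_n$, $V_{\mathcal{M},g}(\Pi\alpha_1 \ldots \alpha_n) = 1$
iff $[\alpha_1]_{\mathcal{M},g} \ldots [\alpha_n]_{\mathcal{M},g}$ are all defined and $\langle [\alpha_1]_{\mathcal{M},g} \ldots [\alpha_n]_{\mathcal{M},g} \rangle \in \mathcal{I}(\Pi)$
iv) $V_{\mathcal{M},g}(\alpha = \beta) = 1$ iff: $[\alpha]_{\mathcal{M},g}$ and $[\beta]_{\mathcal{M},g}$ are each defined and are the same
object
v) for any wffs $\phi$, $\psi$, and any variable $\alpha$:
$V_{\mathcal{M},g}(\sim\phi) = 1$ iff $V_{\mathcal{M},g}(\phi) = 0$
$V_{\mathcal{M},g}(\phi \to \psi) = 1$ iff either $V_{\mathcal{M},g}(\phi) = 0$ or $V_{\mathcal{M},g}(\psi) = 1$
$V_{\mathcal{M},g}(\forall\alpha\phi) = 1$ iff for every $u \in \mathcal{D}$, $V_{\mathcal{M},g^{\alpha}_{u}}(\phi) = 1$
[3] An alternate approach would appeal to three-valued logic. We could treat atomic sentences
with empty descriptions as being neither true nor false, i.e., #. We would then need to
update the other semantic clauses to allow for #s, using one of the three-valued approaches to
propositional logic from chapter 3.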
As with the grammar, we need to mix together the definition of denotation
and the definition of the valuation function. The reason is that we need to
define the denotations of definite descriptions using the valuation function (in
clause ii), but we need to define the valuation function using the concept of
denotation (in clauses iii and iv). As before, this is not circular.
Notice that the denotation of a term can now be undefined. This means
simply that there is no such thing as the denotation of such a term (put another
way: such a term is not in the domain of the denotation function.) The initial
source of this status is the sixth case of clause ii): empty definite descriptions.
But then the undefined status is inherited by complex terms formed from such
terms using function symbols, via the fourth case of clause ii). And then, finally,
clauses iii) and iv) ensure that atomic and identity sentences containing such
terms all turn out false.
Note a consequence of this last feature of the semantics. There are now
two ways that an atomic sentence can be false (similar remarks apply to identity
sentences). There is the old way: the tuple of the denotations of the terms
can fail to be in the predicate's extension. But now there is a new way: one
of the terms might have an undefined denotation. So you have to be careful
when constructing validity proofs. Suppose, for example, that you learn that
$V_g(F\alpha) = 0$ for some term $\alpha$. You can't immediately conclude that $[\alpha]_g \notin \mathcal{I}(F)$,
since $[\alpha]_g$ might not even be defined. To conclude this, you must first show
that $[\alpha]_g$ is defined.
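The two ways for an atomic sentence to be false can be seen concretely in a small evaluator. This is a sketch under several assumptions of my own: terms and wffs are nested tuples, an undefined denotation is represented by Python's None (so None must never itself be a member of the domain), and the example model, with its two-membered domain and the predicates K ("king of") and B ("bald"), is made up for illustration.

```python
def denote(term, domain, interp, g):
    """Return the denotation of a term, or None when it is undefined.

    Assumes None is never itself a member of the domain.
    """
    if isinstance(term, str):                    # variable or name
        return g[term] if term in g else interp[term]
    if term[0] == "iota":                        # ("iota", var, wff)
        _, var, body = term
        witnesses = [u for u in domain
                     if value(body, domain, interp, {**g, var: u})]
        return witnesses[0] if len(witnesses) == 1 else None
    symbol, *args = term                         # ("f", arg1, ..., argn)
    vals = [denote(a, domain, interp, g) for a in args]
    return None if None in vals else interp[symbol](*vals)


def value(wff, domain, interp, g):
    """Truth value of a wff; atomic and identity wffs with an undefined term are false."""
    op = wff[0]
    if op == "atom":                             # ("atom", "B", (term, ...))
        vals = tuple(denote(t, domain, interp, g) for t in wff[2])
        return None not in vals and vals in interp[wff[1]]
    if op == "eq":                               # ("eq", term, term)
        left = denote(wff[1], domain, interp, g)
        right = denote(wff[2], domain, interp, g)
        return left is not None and right is not None and left == right
    if op == "not":
        return not value(wff[1], domain, interp, g)
    if op == "imp":
        return (not value(wff[1], domain, interp, g)) or value(wff[2], domain, interp, g)
    if op == "forall":                           # ("forall", var, body)
        return all(value(wff[2], domain, interp, {**g, wff[1]: u}) for u in domain)
    raise ValueError(f"unknown operator: {op}")


domain = {"usa", "liz"}
no_king = {"a": "usa", "K": set(), "B": {("liz",)}}
one_king = {"a": "usa", "K": {("liz", "usa")}, "B": {("liz",)}}

king = ("iota", "x", ("atom", "K", ("x", "a")))  # "the king of the USA"
bald_king = ("atom", "B", (king,))               # "the king of the USA is bald"

empty_case = value(bald_king, domain, no_king, {})
proper_case = value(bald_king, domain, one_king, {})
```

In the `no_king` model the atomic sentence is false in the new way: the description has no denotation at all, so the question whether its denotation is in the extension of B never arises. Note that `denote` and `value` call one another, mirroring the way the definitions of denotation and valuation had to be given together.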
Example 5.4: Show that $\vDash G\iota x Fx \to \exists x(Fx \land Gx)$:
i) Suppose for reductio that in some model, and some assignment g in that
model, $V_g(G\iota x Fx \to \exists x(Fx \land Gx)) = 0$. So, $V_g(G\iota x Fx) = 1$ and
ii) $V_g(\exists x(Fx \land Gx)) = 0$.
iii) By i), via the clause for atomics in the definition of truth in a model,
$[\iota x Fx]_g$ is both defined and a member of $\mathcal{I}(G)$.
iv) Since $[\iota x Fx]_g$ is defined, the definition of denotation for $\iota$ terms tells us
that $[\iota x Fx]_g$ is the unique $u \in \mathcal{D}$ such that $V_{g^{x}_{u}}(Fx) = 1$. Call this object
(i.e., $[\iota x Fx]_g$) henceforth: "u".
v) Given ii), for every member of $\mathcal{D}$, and so for u in particular, $V_{g^{x}_{u}}(Fx \land Gx) = 0$.
So either $V_{g^{x}_{u}}(Fx) = 0$ or $V_{g^{x}_{u}}(Gx) = 0$. Since $V_{g^{x}_{u}}(Fx) = 1$ (line iv)),
$V_{g^{x}_{u}}(Gx) = 0$.
vi) Since $V_{g^{x}_{u}}(Gx) = 0$, given the definition of truth for atomics, either $[x]_{g^{x}_{u}}$
is undefined or else it is defined and is not a member of $\mathcal{I}(G)$. But it is
defined: the definition of denotation (second case) defines it as $g^{x}_{u}(x)$, i.e.,
u. So $u \notin \mathcal{I}(G)$, contradicting iii).
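Exercise 5.5 Establish the following:
a)** $\vDash \forall x\, Lx\iota y Fxy \to \forall x \exists y\, Lxy$
b) $F\iota x \forall y\, Lxy \vDash \forall x \forall y((\forall z\, Lxz \land \forall z\, Lyz) \to x = y)$
c) $\nvDash G\iota x Fx \to F\iota x Gx$
Exercise 5.6* Show that the denotation of any term is either
undefined or a member of $\mathcal{D}$.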
5.3.3 Elimination of function symbols and descriptions
In a sense, we don't really need function symbols or the $\iota$. Let's return to
the English singular term "the black cat". Introducing the $\iota$ gave us a way
to symbolize this singular term in a way that takes into account its semantic
structure (namely: $\iota x(Bx \land Cx)$.) But even without the $\iota$, there is a way to
symbolize whole sentences containing "the black cat", using just standard predicate
logic plus identity. We could, for example, symbolize "The black cat is happy" as:
$\exists x[(Bx \land Cx) \land \forall y[(By \land Cy) \to y = x] \land Hx]$
That is, "there is something such that: i) it is a black cat, ii) nothing else is a
black cat, and iii) it is happy".
This method for symbolizing sentences containing "the" is called "Russell's
theory of descriptions", in honor of its inventor Bertrand Russell (). The
general idea is to symbolize: "the $\phi$ is $\psi$" as $\exists x[\phi(x) \land \forall y(\phi(y) \to x = y) \land \psi(x)]$.
This method can be iterated so as to apply to sentences with two or more
definite descriptions, such as "The 8-foot tall man drove the 20-foot long
limousine", which becomes, letting "E" stand for "is eight feet tall" and "T" stand
for "is twenty feet long":
$\exists x[Ex \land Mx \land \forall z([Ez \land Mz] \to x = z) \land \exists y[Ty \land Ly \land \forall z([Tz \land Lz] \to y = z) \land Dxy]]$
An interesting question arises with negations of sentences involving definite
descriptions, when we use Russell's method. Consider "The president is not
bald". Does this mean "The president is such that he's non-bald", which is
symbolized as follows:
$\exists x[Px \land \forall y(Py \to x = y) \land \sim Bx]$
? Or does it mean "It is not the case that the President is bald", which is
symbolized thus:
$\sim\exists x[Px \land \forall y(Py \to x = y) \land Bx]$
? According to Russell, the original sentence is simply ambiguous. Symbolizing
it the first way is called "giving the description wide scope (relative to the $\sim$)",
since the $\sim$ is in the scope of the $\exists$. (That is, the $\sim$ is "inside" the $\exists$; i.e., the
formula has the form $\exists x\phi$, and the $\sim$ is part of the $\phi$.) Symbolizing it in the
second way is called "giving the description narrow scope (relative to the $\sim$)",
because the $\exists$ is in the scope of the $\sim$ (the formula has the form $\sim\psi$, and the
$\exists$ is part of the $\psi$). These two symbolizations differ in meaning. The first says
that there really is a unique president, and adds that he is not bald. So the first
implies that there's a unique president. The second merely denies that: there
is a unique president who is bald. That doesn't imply that there's a unique
president. It would be true if there's a unique president who is not bald, but it
would also be true in two other cases: the case in which there are no presidents
at all, and the case in which there is more than one president.
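The difference between the two readings shows up when they are evaluated in small models. The sketch below hard-codes the two Russellian symbolizations as Python functions over a finite domain; the two-membered domain and the choice to represent the extensions of P ("is president") and B ("is bald") as sets are assumptions made for illustration.

```python
def unique_president(x, domain, P):
    """Px and, for all y, (Py -> x = y)."""
    return x in P and all(y not in P or x == y for y in domain)


def wide_scope(domain, P, B):
    """Exists x [Px and all y (Py -> x = y) and not Bx]."""
    return any(unique_president(x, domain, P) and x not in B for x in domain)


def narrow_scope(domain, P, B):
    """Not exists x [Px and all y (Py -> x = y) and Bx]."""
    return not any(unique_president(x, domain, P) and x in B for x in domain)


domain = {"a", "b"}
cases = {
    "no president":            (set(), set()),
    "one non-bald president":  ({"a"}, set()),
    "one bald president":      ({"a"}, {"a"}),
    "two presidents":          ({"a", "b"}, set()),
}
readings = {
    name: (wide_scope(domain, P, B), narrow_scope(domain, P, B))
    for name, (P, B) in cases.items()
}
```

Each entry of `readings` pairs the wide-scope value with the narrow-scope value. The two agree when there is exactly one president, and come apart in exactly the two cases mentioned above: no presidents, and more than one.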
A similar issue arises with the sentence "The round square does not exist".
We might think to symbolize it:
$\exists x[Rx \land Sx \land \forall y([Ry \land Sy] \to x = y) \land \sim Ex]$
letting "E" stand for "exists". In other words, we might give the description
wide scope. But this symbolization says something very odd: that there is
a certain round square that doesn't exist. This corresponds to reading the
sentence as saying "The thing that is a round square is such that it does not
exist". But that isn't the most natural way to read the sentence. The sentence
would usually be interpreted to mean: "It is not true that the round square
exists", that is, as the negation of "the round square exists":
$\sim\exists x[Rx \land Sx \land \forall y([Ry \land Sy] \to x = y) \land Ex]$
with the $\sim$ "out in front". Here we've given the description narrow scope.
If we are willing to use Russell's method for translating definite descriptions,
we can drop $\iota$ from our language. We would, in effect, not be treating
"the F" as a syntactic unit. We would instead be symbolizing sentences that
contain "the F" with wffs that contain no correlative term. "The black cat is
happy" gets symbolized as $\exists x[(Bx \land Cx) \land \forall y[(By \land Cy) \to y = x] \land Hx]$. See?
No term corresponds to "the black cat". The only terms in the symbolization
are variables.
In fact, once we use Russell's method, we can get rid of function symbols too.
Given function symbols, we treated "father" as a function symbol, symbolized
it with "f", and symbolized the sentence "George W. Bush's father was a
politician" as $Pf(b)$. But instead, we could treat "father of" as a two-place
predicate, F, and regard the whole sentence as meaning: "The father of George
W. Bush was a politician." Given the $\iota$, this could be symbolized as:
$P\iota x Fxb$
But given Russell's method, we can symbolize the whole thing without using
either function symbols or the $\iota$:
$\exists x(Fxb \land \forall y(Fyb \to y = x) \land Px)$
We can get rid of all function symbols this way, if we want. Here's the method:
Take any n-place function symbol f
Introduce a corresponding n+1-place predicate R
In any sentence containing the term "$f(\alpha_1 \ldots \alpha_n)$", replace each occurrence
of this term with "the x such that $R(x, \alpha_1 \ldots \alpha_n)$".
Finally, symbolize the resulting sentence using Russell's theory of
descriptions
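The method just listed can be sanity-checked on a finite model: if the new predicate R is interpreted as the graph of the function that f denoted, the Russellian sentence should receive the same truth value as the original one. The sketch below does this for a one-place function; the five-membered domain, the particular function, and the helper names are all assumptions of mine, and agreement on one model is of course not a proof.

```python
from itertools import combinations


def graph(f, domain):
    """The two-place relation for a one-place function f: all pairs (f(u), u)."""
    return {(f(u), u) for u in domain}


def with_function_symbol(P, f, b):
    """Truth value of P f(b)."""
    return f(b) in P


def with_russell(P, R, b, domain):
    """Truth value of: exists x (Rxb and all y (Ryb -> y = x) and Px)."""
    return any(
        (x, b) in R
        and all((y, b) not in R or y == x for y in domain)
        and x in P
        for x in domain
    )


domain = range(5)
f = lambda u: (u + 1) % 5          # a total one-place function on the domain
R = graph(f, domain)

# Compare the two symbolizations for every extension of P and every choice of b.
subsets = [set(c) for k in range(6) for c in combinations(domain, k)]
agree = all(
    with_function_symbol(P, f, b) == with_russell(P, R, b, domain)
    for P in subsets
    for b in domain
)
```

The agreement depends on f being total: because every member of the domain has exactly one f-value, the uniqueness clause in the Russellian sentence is always satisfied.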
For example, let's go back to: "Every even number is the sum of two prime
numbers". Instead of introducing a function symbol s(x, y) for "the sum of x
and y", let's introduce a predicate letter R(z, x, y) for "z is a sum of x and y".
We then use Russell's method to symbolize the whole sentence thus:
$\forall x(Ex \to \exists y \exists z[Py \land Pz \land \exists w(Rwyz \land \forall w_1(Rw_1yz \to w_1 = w) \land x = w)])$
The end of the formula (beginning with $\exists w$) says "the sum of y and z is
identical to x", that is, that there exists some w such that w is a sum of y
and z, and there is no sum of y and z other than w, and w = x.
Exercise 5.7 Symbolize each of the following, using predicate
logic with identity, function symbols, and the $\iota$ operator. (Do not
eliminate descriptions using Russell's method.)
a) If a person commits a crime, then the judge that sentences
him/her wears a wig.
b) The tallest spy is a spy. (Use a two-place predicate to
symbolize "is taller than".)
Exercise 5.8 For the sentence "The ten-feet-tall man is not happy",
first symbolize with the $\iota$ operator. Then symbolize two readings
using Russell's method. Explain the intuitive difference between those
two readings. Which gives truth conditions like the $\iota$ symbolization?
5.4 Further quantifiers
Predicate logic, with its quantifiers $\forall$ and $\exists$, can symbolize a great many
sentences of natural language. But not all. For instance, it can be shown that there
is no way to symbolize the following sentences using just predicate logic:
Most things are massive
Most men are brutes
There are infinitely many numbers
Some critics admire only one another
Like those sentences that are representable in standard logic, these sentences
involve quantificational notions: most things, some critics, and so on. In this
section we introduce a broader conception of what a quantifier is, and new
quantifiers that allow us to symbolize these sentences.
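5.4.1 Generalized monadic quantifiers
We will generalize the idea behind the standard quantifiers $\exists$ and $\forall$ in two ways.
To approach the first, let's introduce the following bit of terminology. For any
PC-model, $\mathcal{M}$ (= $\langle \mathcal{D}, \mathcal{I} \rangle$), and wff, $\phi$, let's introduce the name "$\phi^{\mathcal{M},g,\alpha}$" for
(roughly speaking) the set of members of $\mathcal{M}$'s domain of which $\phi$ is true:
Definition: $\phi^{\mathcal{M},g,\alpha} = \{u : u \in \mathcal{D} \text{ and } V_{\mathcal{M},g^{\alpha}_{u}}(\phi) = 1\}$
Thus, if we begin with any variable assignment g, then $\phi^{\mathcal{M},g,\alpha}$ is the set of
things u in $\mathcal{D}$ such that $\phi$ is true, relative to variable assignment $g^{\alpha}_{u}$. Now,
recall the truth conditions in a PC-model, $\mathcal{M}$, with domain $\mathcal{D}$, for $\forall$ and $\exists$:
$V_{\mathcal{M},g}(\forall\alpha\phi) = 1$ iff for every $u \in \mathcal{D}$, $V_{\mathcal{M},g^{\alpha}_{u}}(\phi) = 1$
$V_{\mathcal{M},g}(\exists\alpha\phi) = 1$ iff for some $u \in \mathcal{D}$, $V_{\mathcal{M},g^{\alpha}_{u}}(\phi) = 1$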
Given our new terminology, we can write equivalent truth conditions as follows:
$V_{\mathcal{M},g}(\forall\alpha\phi) = 1$ iff $\phi^{\mathcal{M},g,\alpha} = \mathcal{D}$
$V_{\mathcal{M},g}(\exists\alpha\phi) = 1$ iff $\phi^{\mathcal{M},g,\alpha} \neq \emptyset$
But if we can rewrite the truth conditions for the familiar quantifiers $\forall$ and
$\exists$ in this way, as conditions on $\phi^{\mathcal{M},g,\alpha}$, then why not introduce new symbols
of the same grammatical type as $\forall$ and $\exists$, whose semantics is parallel to $\forall$ and
$\exists$ except in laying down different conditions on $\phi^{\mathcal{M},g,\alpha}$? These would be new
kinds of quantifiers. For instance, for any integer n, we could introduce a
quantifier $\exists_n$ such that $\exists_n\alpha\phi$ means: "there are at least n $\phi$s". The definitions of
a wff, and of truth in a model, would be updated with the following clauses:
if $\alpha$ is a variable and $\phi$ is a wff, then $\exists_n\alpha\phi$ is a wff
$V_{\mathcal{M},g}(\exists_n\alpha\phi) = 1$ iff $|\phi^{\mathcal{M},g,\alpha}| \geq n$
The expression $|A|$ stands for the "cardinality" of set A, i.e., the number of
members of A. So the truth condition says that $\exists_n\alpha\phi$ is true iff $\phi^{\mathcal{M},g,\alpha}$ has at
least n members.
Now, the introduction of the symbols $\exists_n$ does not increase the expressive
power of predicate logic, for as we saw in section 5.1.3, we can symbolize
"there are at least n Fs" using just standard predicate logic (plus "="). The
new notation is merely a space-saver. But other such additions are not mere
space-savers. For example, by analogy with the symbols $\exists_n$, we can introduce a
symbol $\exists_\infty$, meaning "there are infinitely many":
if $\alpha$ is a variable and $\phi$ is a wff, then "$\exists_\infty\alpha\phi$" is a wff
$V_{\mathcal{M},g}(\exists_\infty\alpha\phi) = 1$ iff $|\phi^{\mathcal{M},g,\alpha}|$ is infinite
As it turns out, the addition of $\exists_\infty$ genuinely enhances predicate logic: no
sentence of standard (first-order) predicate logic has the same truth condition
as does $\exists_\infty x\, Fx$. [4] One can then use this new generalized quantifier to symbolize
new English sentences. For example, "The number of fish that have escaped
some predator is infinite" could be symbolized thus: $\exists_\infty x(Fx \land \exists y(Py \land Exy))$.
And "for every number, there are infinitely many greater numbers" could be
symbolized thus: $\forall x(Nx \to \exists_\infty y(Ny \land Gyx))$.
Another generalized quantifier that is not symbolizable using standard
predicate logic is most:
If $\alpha$ is a variable and $\phi$ is a wff, then "most $\alpha\phi$" is a wff
$V_{\mathcal{M},g}(\text{most}\ \alpha\phi) = 1$ iff $|\phi^{\mathcal{M},g,\alpha}| > |\mathcal{D} - \phi^{\mathcal{M},g,\alpha}|$
The minus-sign in the second clause is the symbol for set-theoretic difference:
A − B is the set of things that are in A but not in B. Thus, the definition says
that most $\alpha\phi$ is true iff more things in the domain $\mathcal{D}$ are $\phi$ than are not $\phi$.
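For finite domains, the idea that a monadic quantifier lays down a condition on the set of things of which the open sentence is true can be written out directly. In this sketch an open sentence is modeled as a Python function from domain members to truth values (an assumption of mine, as are the function names); the "infinitely many" quantifier is omitted, since no finite domain could make it true.

```python
def extension(phi, domain):
    """The set of members of the domain of which the open sentence phi is true."""
    return {u for u in domain if phi(u)}


def forall(phi, domain):
    """True iff the extension of phi is the whole domain."""
    return extension(phi, domain) == set(domain)


def exists(phi, domain):
    """True iff the extension of phi is nonempty."""
    return extension(phi, domain) != set()


def at_least(n, phi, domain):
    """True iff the extension of phi has at least n members."""
    return len(extension(phi, domain)) >= n


def most(phi, domain):
    """True iff more members of the domain are phi than are not phi."""
    ext = extension(phi, domain)
    return len(ext) > len(set(domain) - ext)


domain = range(10)
is_even = lambda u: u % 2 == 0
below_six = lambda u: u < 6

most_even = most(is_even, domain)          # five evens against five odds
most_below_six = most(below_six, domain)   # six against four
```

Each quantifier is a one-line condition on `extension(phi, domain)`; only the condition changes from one quantifier to the next.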
One could add all sorts of additional quantifiers Q in this way. Each
would be, grammatically, just like $\forall$ and $\exists$, in that each would combine with
a variable, $\alpha$, and then attach to a sentence $\phi$, to form a new sentence $Q\alpha\phi$.
Each of these new quantifiers, Q, would be associated with a relation between
sets, $R_Q$, such that $Q\alpha\phi$ would be true in a PC-model, $\mathcal{M}$, with domain $\mathcal{D}$,
relative to variable assignment g, iff $\phi^{\mathcal{M},g,\alpha}$ bears $R_Q$ to $\mathcal{D}$.
If such an added symbol Q is to count as a quantifier in any intuitive sense,
then the relation $R_Q$ can't be just any relation between sets. It should be a
relation concerning the relative "quantities" of its relata. It shouldn't, for
instance, "concern particular objects" in the way that the following symbol,
$\exists_{\text{Ted-loved}}$, concerns particular objects:
$V_{\mathcal{M},g}(\exists_{\text{Ted-loved}}\alpha\phi) = 1$ iff $\phi^{\mathcal{M},g,\alpha} \cap \{u : u \in \mathcal{D} \text{ and Ted loves } u\} \neq \emptyset$
So we should require the following of $R_Q$: if a subset X of some set D bears
$R_Q$ to D, and f is a one-to-one function with domain D and range D′, then
f[X] must bear $R_Q$ to D′. (f[X] is the image of X under function f, i.e.,
$\{u : u \in D' \text{ and } u = f(v), \text{ for some } v \in X\}$. It is the subset of D′ onto which
f "projects" X.)
[4] I won't prove this; but see note 4.5.
Exercise 5.9 Let the quantifier $\exists_{\text{prime}}$ mean "there are a prime
number of". Using the notation of generalized quantifiers, write
out the semantics of this quantifier.
5.4.2 Generalized binary quantifiers
We have seen how the standard quantifiers $\forall$ and $\exists$ can be generalized in
one way: syntactically similar symbols may be introduced and associated with
different semantic conditions of quantity. Our second way of generalizing the
standard quantifiers is to allow two-place, or binary quantifiers. $\forall$ and $\exists$ are
monadic in that $\forall\alpha$ and $\exists\alpha$ attach to a single open sentence $\phi$. Compare the
natural language monadic quantifiers "everything" and "something":
Everything is material
Something is spiritual
Here, the predicates (verb phrases) "is material" and "is spiritual" correspond
to the open sentences of logic; it is to these that "everything" and "something"
attach.
But in fact, monadic quantifiers in natural language are atypical. "Every"
and "some" typically occur as follows:
Every student is happy
Some fish are tasty
The quantifiers "every" and "some" attach to two predicates. In the first sentence,
"every" attaches to "[is a] student" and "is happy"; in the second, "some" attaches
to "[is a] fish" and "[is] tasty". In these sentences, we may think of "every" and
"some" as binary quantifiers. (Indeed, one might think of "everything" and
"something" as the result of applying the binary quantifiers "every" and "some"
to the predicate "is a thing".) A logical notation with a parallel structure can be
introduced, in which $\forall$ and $\exists$ attach to two open sentences. In this notation we
symbolize "every $\phi$ is a $\psi$" as $(\forall\alpha{:}\phi)\psi$, and "some $\phi$ is a $\psi$" as $(\exists\alpha{:}\phi)\psi$. The
grammar and semantic clauses for these binary quantifiers are as follows:
if $\phi$ and $\psi$ are wffs and $\alpha$ is a variable, then $(\forall\alpha{:}\phi)\psi$ and $(\exists\alpha{:}\phi)\psi$ are
wffs
$V_{\mathcal{M},g}((\forall\alpha{:}\phi)\psi) = 1$ iff $\phi^{\mathcal{M},g,\alpha} \subseteq \psi^{\mathcal{M},g,\alpha}$
$V_{\mathcal{M},g}((\exists\alpha{:}\phi)\psi) = 1$ iff $\phi^{\mathcal{M},g,\alpha} \cap \psi^{\mathcal{M},g,\alpha} \neq \emptyset$
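A further important binary quantifier is the:
if $\phi$ and $\psi$ are wffs and $\alpha$ is a variable, then $(\text{the}\ \alpha{:}\phi)\psi$ is a wff
$V_{\mathcal{M},g}((\text{the}\ \alpha{:}\phi)\psi) = 1$ iff $|\phi^{\mathcal{M},g,\alpha}| = 1$ and $\phi^{\mathcal{M},g,\alpha} \subseteq \psi^{\mathcal{M},g,\alpha}$
That is, $(\text{the}\ \alpha{:}\phi)\psi$ is true iff i) there is exactly one $\phi$, and ii) every $\phi$ is a
$\psi$. This truth condition, notice, is exactly the truth condition for Russell's
symbolization of "the $\phi$ is a $\psi$"; hence the name the.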
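As with the introduction of the monadic quantifiers $\exists_n$, the introduction of
the binary existential and universal quantifiers, and of "the", does not increase the
expressive power of first-order logic, for the same effect can be achieved with
monadic quantifiers. $(\forall\alpha{:}\phi)\psi$, $(\exists\alpha{:}\phi)\psi$, and $(\text{the}\,\alpha{:}\phi)\psi$ become, respectively:
    $\forall\alpha(\phi \rightarrow \psi)$
    $\exists\alpha(\phi \wedge \psi)$
    $\exists\alpha(\phi \wedge \forall\beta(\phi(\beta/\alpha) \rightarrow \beta=\alpha) \wedge \psi)$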
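But, as with the monadic quantifiers $\exists_\infty$ and most, there are binary quantifiers
that genuinely increase expressive power. For example, most occurrences of
"most" in English are binary, as in:
    Most fish swim
To symbolize such sentences, we can introduce a binary quantifier $\text{most}^2$. We
read the sentence $(\text{most}^2\,\alpha{:}\phi)\psi$ as "most $\phi$s are $\psi$s". The semantic clause for
$\text{most}^2$ is:
    $V_{\mathcal{M},g}((\text{most}^2\,\alpha{:}\phi)\psi) = 1$ iff $|\phi^{\mathcal{M},g,\alpha} \cap \psi^{\mathcal{M},g,\alpha}| > |\phi^{\mathcal{M},g,\alpha} - \psi^{\mathcal{M},g,\alpha}|$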
The binary $\text{most}^2$ increases our expressive power, even relative to the monadic
most: not every sentence expressible with the former is equivalent to a sentence
expressible with the latter.[5] One can then use this binary quantifier to symbolize
more complex sentences. For example, "Most people who love someone are
loved by someone" could be symbolized as: $(\text{most}^2\,x : \exists y Lxy)\exists y Lyx$.
[5] See Westerståhl () for this and related results cited in this chapter.
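The four binary quantifiers above can be read as relations between two sets, which makes them easy to try out on a small finite model. The following is only a sketch under that reading: the function names, the toy domain, and the "loves" relation are invented for illustration and are not part of the text's formal apparatus.

```python
# Binary generalized quantifiers over a finite domain, treated as relations
# between two extensions (Python sets). All names and data are illustrative.

def every(phi_ext, psi_ext):
    """(every a: phi) psi: the phi-set is a subset of the psi-set."""
    return phi_ext <= psi_ext

def some(phi_ext, psi_ext):
    """(some a: phi) psi: the two sets overlap."""
    return bool(phi_ext & psi_ext)

def the(phi_ext, psi_ext):
    """(the a: phi) psi: exactly one phi, and every phi is a psi."""
    return len(phi_ext) == 1 and phi_ext <= psi_ext

def most2(phi_ext, psi_ext):
    """(most2 a: phi) psi: more phis that are psis than phis that are not."""
    return len(phi_ext & psi_ext) > len(phi_ext - psi_ext)

# Toy model for "Most people who love someone are loved by someone".
domain = {"ann", "bob", "cat", "dan"}
loves = {("ann", "bob"), ("bob", "ann"), ("cat", "dan")}

# Extensions of the open sentences "x loves someone" and "someone loves x".
lovers = {x for x in domain if any((x, y) in loves for y in domain)}
loved = {x for x in domain if any((y, x) in loves for y in domain)}

result = most2(lovers, loved)
```

In this toy model three individuals love someone, and two of those three are loved by someone, so the sentence comes out true.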
Exercise 5.10 Symbolize the following sentence:
The number of people multiplied by the
number of cats that bite at least one dog is
.
You may invent any generalized quantifiers you need, provided you
write out their semantics.
5.4.3 Second-order logic
All the predicate logic we have considered so far is known as first-order. We'll
now briefly look at second-order predicate logic, a powerful extension to
first-order predicate logic. The distinction has to do with how variables behave, and
has syntactic and semantic aspects.
The syntactic aspect concerns the grammar of variables. All the variables
in first-order logic are grammatical terms. That is, they behave grammatically
like names: to produce a wff you must combine them with a predicate, not just
other terms. But in second-order logic, variables can occupy predicate position,
resulting in well-formed formulas like the following:
    $\exists X\, Xa$
    $\exists X \exists y\, Xy$
Here the variable $X$ occupies predicate position. Predicate variables, like the
normal predicates of standard first-order logic, can be one-place, two-place,
three-place, etc. Thus, to our primitive vocabulary we must add, for each n,
n-place predicate variables $X, Y, \dots$; and we must add the following clause to
the definition of a wff:
    If $\pi$ is an n-place predicate variable and $\alpha_1 \dots \alpha_n$ are terms, then $\pi\alpha_1 \dots \alpha_n$ is a wff
The semantic aspect concerns the interpretation of variables. In first-order
logic, a variable assignment assigns to each variable a member of the domain. A
variable assignment in second-order logic assigns to each standard (first-order)
variable a member of the domain, as before, but assigns to each n-place
predicate variable a set of n-tuples drawn from the domain. (This is what one
would expect: the semantic value of an n-place predicate is its extension, a set of
n-tuples, and variable assignments assign temporary semantic values.) Then,
the following clauses must be added to the definition of truth in a PC-model:
    If $\pi$ is an n-place predicate variable and $\alpha_1 \dots \alpha_n$ are terms, then
    $V_{\mathcal{M},g}(\pi\alpha_1 \dots \alpha_n) = 1$ iff $\langle [\alpha_1]_{\mathcal{M},g}, \dots, [\alpha_n]_{\mathcal{M},g} \rangle \in g(\pi)$
    If $\pi$ is a predicate variable and $\phi$ is a wff, then $V_{\mathcal{M},g}(\forall\pi\phi) = 1$ iff for
    every set $U$ of n-tuples from $\mathcal{D}$, $V_{\mathcal{M},g^{\pi}_{U}}(\phi) = 1$
    (where $g^{\pi}_{U}$ is the variable assignment just like $g$ except in assigning $U$ to $\pi$)
Notice that, as with the generalized monadic quantifiers, no alteration to the
definition of a PC-model is needed. All we need to do is change the grammar and
the definition of the valuation function.
The metalogical properties of second-order logic are dramatically different
from those of first-order logic that we briefly mentioned in section 4.5. For
instance, second-order logic is "incomplete" in the sense that there are no
axioms from which one can prove all and only the second-order valid sentences.
(Unless, that is, one resorts to cheap tricks like saying "let every valid wff
be an axiom". This trick is cheap because there would be no mechanical
procedure for telling what an axiom is.[6]) Moreover, the compactness theorem
fails for second-order logic. Further, one can write down a single second-order
sentence whose second-order semantic consequences are all and only
the truths of arithmetic. (This is cold comfort given the incompleteness of
second-order logic: there is no complete axiomatic system we can use to draw
out the consequences of this arithmetic "axiom".)
Second-order logic also differs expressively from first-order logic; the
addition of the second-order quantifiers and variables lets us, in a sense, say
new things that we couldn't say using first-order logic. For example, in
second-order logic we can state the two principles that are sometimes collectively called
"Leibniz's Law":
    $\forall x \forall y (x=y \rightarrow \forall X (Xx \leftrightarrow Xy))$    (indiscernibility of identicals)
    $\forall x \forall y (\forall X (Xx \leftrightarrow Xy) \rightarrow x=y)$    (identity of indiscernibles)
[6] For a rigorous statement and proof of this and other metalogical results about second-order
logic, see, e.g., Boolos et al. (, chapter ).
The indiscernibility of identicals says, intuitively, that identical objects have
exactly the same properties; the identity of indiscernibles says that objects with
exactly the same properties are identical. Given our definitions, each is a logical
truth (exercise 5.11).[7] This might seem like an unwanted result. The identity of
indiscernibles isn't necessarily true, it might be thought; there could exist two
distinct objects that are nevertheless exactly alike (perfectly alike marbles, say,
made by the same factory). But in fact nothing is amiss here. The identity of
indiscernibles is necessarily true, provided we construe "property" very broadly,
so that "being a member of such-and-such set" counts as a property. Under this construal,
there just couldn't be two marbles, A and B, with exactly the same properties,
since if $A \neq B$ then A would have the property of being a member of the set
$\{A\}$ whereas B would not. If we want to say that two marbles could have the
same properties, we must construe "property" more restrictively, perhaps as
meaning qualitative property.[8] It was the broad conception of property that I
had in mind when I wrote above that the identity of indiscernibles "says" that
objects with exactly the same properties are identical, since the second-order
variable $X$ ranges over all the subsets of the domain (in the semantics I gave
above, anyway), not just those picked out by some qualitative property.
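The role of the broad conception of property can be seen concretely on a small finite domain, where the second-order variable can be made to range over every subset by brute force. This is a sketch only; the three-element domain and the helper names are assumptions made for illustration, and a finite check of this kind is of course no substitute for the general argument.

```python
from itertools import chain, combinations

def subsets(domain):
    """All subsets of a finite domain: the range of a one-place predicate variable."""
    items = list(domain)
    return [set(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def indiscernible(u, v, domain):
    """For every X: Xu iff Xv, with X ranging over every subset of the domain."""
    return all((u in X) == (v in X) for X in subsets(domain))

domain = {"A", "B", "C"}

# Identity of indiscernibles: indiscernible objects are identical.
identity_of_indiscernibles = all(
    (not indiscernible(u, v, domain)) or u == v
    for u in domain for v in domain)

# Indiscernibility of identicals: identical objects are indiscernible.
indiscernibility_of_identicals = all(
    (u != v) or indiscernible(u, v, domain)
    for u in domain for v in domain)

# Subsets that tell A and B apart; the singleton {"A"} is among them.
separating = [X for X in subsets(domain) if ("A" in X) != ("B" in X)]
```

Because the singleton of each object is always among the subsets, any two distinct members of the domain are told apart by some value of the predicate variable.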
The increased expressive power of second-order logic can be illustrated by
the "Geach-Kaplan sentence":[9]
    Some critics admire only one another    (GK)
On one reading, anyway, this sentence says that there is a (nonempty) group
of critics in which members admire only other members. Suppose we want to
symbolize (GK) as some formal sentence $\phi$. What must $\phi$ be like? First, $\phi$ must
contain a one-place predicate symbolizing "critic" and a two-place predicate
symbolizing "admires". Let these be $C$ and $A$, respectively. Second, $\phi$ must
have the right truth condition; $\phi$ must be true in an arbitrary model $\langle\mathcal{D},\mathcal{I}\rangle$ iff:
    $\mathcal{I}(C)$ has some nonempty subset $E$, such that whenever $\langle u, v \rangle \in \mathcal{I}(A)$
    and $u \in E$, then $v \in E$ and $v \neq u$    (*)
[7] Relatedly, one can now define "$\alpha=\beta$" as $\forall X(X\alpha \leftrightarrow X\beta)$.
[8] See Lewis (, section .) on different conceptions of properties.
[9] The sentence and its significance were discovered by Peter Geach and David Kaplan. See
Boolos ().
Now, it can be shown that no sentence of first-order logic has this truth
condition. That is, for no sentence $\phi$ of first-order logic containing $A$ and $C$
is (*) true of every model $\langle\mathcal{D},\mathcal{I}\rangle$. However, there is a sentence of second-order
logic with this truth condition; namely:
    $\exists X[\exists x Xx \wedge \forall x(Xx \rightarrow Cx) \wedge \forall x \forall y([Xx \wedge Axy] \rightarrow [Xy \wedge x \neq y])]$    (GK$^2$)
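On a finite model, condition (*) can be checked directly by searching through the nonempty subsets of the critics, which mirrors what the second-order quantifier in the Geach-Kaplan symbolization ranges over. The sketch below assumes a model is given simply as a set of critics and a set of admirer/admired pairs; the function names are hypothetical.

```python
from itertools import combinations

def nonempty_subsets(s):
    """Yield every nonempty subset of a finite set."""
    items = list(s)
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            yield set(combo)

def gk_true(critics, admires):
    """Condition (*): some nonempty subset E of the critics is such that
    whenever <u, v> is in the admires relation and u is in E,
    v is in E and v is distinct from u."""
    for E in nonempty_subsets(critics):
        if all(v in E and v != u for (u, v) in admires if u in E):
            return True
    return False

# Two critics who admire only each other: (*) holds.
mutual = gk_true({"a", "b"}, {("a", "b"), ("b", "a")})

# A lone critic who admires only herself: (*) fails.
narcissist = gk_true({"a"}, {("a", "a")})

# A lone critic who admires only a non-critic c: (*) fails.
outsider = gk_true({"a"}, {("a", "c")})
```

Note that, as (*) is stated, a critic who admires no one at all makes the condition vacuously true, since the singleton containing that critic serves as E.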
So in a sense, you need to use second-order logic if you want to symbolize the
Geach-Kaplan sentence. But we have to be careful with this talk of symbolizing,
since there is another sense of "symbolize" on which the Geach-Kaplan sentence
can be symbolized in first-order logic after all. Suppose we use a two-place
predicate $M$ for set-membership:[10]
    $\exists z[\exists x Mxz \wedge \forall x(Mxz \rightarrow Cx) \wedge \forall x \forall y([Mxz \wedge Axy] \rightarrow [Myz \wedge x \neq y])]$    (GK$^1$)
(GK$^1$) doesn't symbolize (GK) in the sense of being true in exactly those models
that satisfy (*); correspondingly, it isn't true in exactly the same models as (GK$^2$).
For even though we said that $M$ is to be a predicate "for" set-membership,
there's nothing in the definition of a model that reflects this, and so there are
models in which $M$ doesn't mean set-membership; and in such models, (GK$^1$)
and (GK$^2$) needn't have the same truth value. But if we restrict our attention
to models $\langle\mathcal{D},\mathcal{I}\rangle$ in which $M$ does mean set-membership (restricted to the
model's domain, of course; that is, $\mathcal{I}(M) = \{\langle u, v \rangle : u, v \in \mathcal{D} \text{ and } u \in v\}$), and
in which each subset of $\mathcal{I}(C)$ is a member of $\mathcal{D}$, then (GK$^1$) will indeed be true
iff (GK$^2$) is (and iff the model satisfies (*)). In essence, the difference between
(GK$^1$) and (GK$^2$) is that it is hard-wired into the definition of truth in a model
that second-order predications $X\alpha$ express set-membership, whereas this is
not hard-wired into the definition of the first-order predication $M\alpha\beta$.[11]
Exercise 5.11 Show that the indiscernibility of identicals and the
identity of indiscernibles are both true under every variable
assignment in every model.
5.5 Complex Predicates
In section 5.3 we introduced the $\iota$ symbol, which allowed us to create complex
terms from sentences. In this section we'll introduce something analogous:
complex predicates. In particular, we'll introduce the means for taking a
sentence, $\phi$, and creating a corresponding complex predicate that means "is such
that $\phi$".
[10] One can in the same sense symbolize the identity of indiscernibles and the indiscernibility
of identicals using first-order sentences and the predicate $M$.
[11] For more on second-order logic, see Boolos (, , ).
The means is a new symbol, $\lambda$, with the following grammar:
    if $\alpha$ is a variable and $\phi$ is a wff then $\lambda\alpha\phi$ is a one-place predicate
Think of $\lambda\alpha\phi$ as meaning "is an $\alpha$ such that $\phi$". Such predicates are often
called $\lambda$-abstracts ("lambda-abstracts").
We now have two kinds of predicates: simple predicates (like $F$, $G$, $R$, and so
on), which are part of the primitive vocabulary, and complex predicates formed
by $\lambda$-abstraction. As a result, the class of atomic wffs now includes wffs like the
following (in addition to wffs like $Fa$, $Gy$, and $Ryb$):
    $\lambda x Fx(a)$                "a is such that: it is F"
    $\lambda x {\sim}Gx(y)$          "y is such that: it is not G"
    $\lambda x \forall y Ryx(b)$     "b is such that: everyone respects her/him"
(Officially these wffs do not contain parentheses; I added them for readability.)
I call these atomic, even though the latter two contain $\sim$ and $\forall$, because each
is formed by attaching a predicate (albeit a complex one) to a term.
As for semantics, in any model $\mathcal{M}$ ($= \langle\mathcal{D},\mathcal{I}\rangle$), what should the meaning
of $\lambda\alpha\phi$ be? Since it's a one-place predicate, its meaning should be the same
kind of animal as the meaning of a simple one-place predicate like $F$: a set of
members of $\mathcal{D}$. Which set? Roughly: the set of members of $\mathcal{D}$ for which $\phi$ is
true. More precisely (using the notation of section 5.4.1): the set $\phi^{\mathcal{M},g,\alpha}$ (i.e.,
$\{u : u \in \mathcal{D} \text{ and } V_{\mathcal{M},g^{\alpha}_{u}}(\phi) = 1\}$). So the meaning of $\lambda x {\sim}Fx$, for example, will be
the set of members of the domain that are not in the extension of $F$. This talk
of "the meaning" of $\lambda$-abstracts is incorporated into the semantics officially
as a new clause in the definition of the valuation function governing atomic
sentences containing $\lambda$-abstracts:
    for any wff $\phi$, variable $\alpha$, and term $\beta$, $V_{\mathcal{M},g}(\lambda\alpha\phi\,\beta) = 1$ iff $[\beta]_{\mathcal{M},g} \in \phi^{\mathcal{M},g,\alpha}$
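The clause for $\lambda$-abstracts can be mimicked on a finite model by computing the extension of the abstract as a set and then testing membership. The following is a minimal sketch, assuming a three-element domain and made-up extensions for $F$, $G$ and $R$; a Python function stands in for the open sentence $\phi$, and none of the names come from the text.

```python
# A tiny model: a domain plus extensions for the predicates F, G and R.
# All of this data is invented for illustration.
domain = {1, 2, 3}
F = {1, 2}
G = {2, 3}
R = {(1, 1), (2, 3)}
names = {"a": 2}  # assumed interpretation of the constant a

def abstract(condition):
    """Extension of a lambda-abstract: the members u of the domain at which
    the open sentence holds when its variable is assigned u.
    `condition` plays the role of the open sentence."""
    return {u for u in domain if condition(u)}

# lambda x (Fx and Gx) applied to a, compared with Fa and Ga.
lam_FG = abstract(lambda u: u in F and u in G)
lhs = names["a"] in lam_FG
rhs = names["a"] in F and names["a"] in G

# lambda x not-Fx: the members of the domain not in the extension of F.
lam_notF = abstract(lambda u: u not in F)

# lambda x Rxx: a one-place predicate built from a two-place one.
lam_Rxx = abstract(lambda u: (u, u) in R)
```

Applying the abstract to a term and evaluating the corresponding plain formula give the same truth value here, as the semantics leads one to expect.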
The $\lambda$-abstracts are semantically superfluous (given our current setup,
anyway). For example, $\lambda x(Fx \wedge Gx)(a)$ is true in a model iff $Fa \wedge Ga$ is true in that
model, $\lambda x Rxx(y)$ is true in a model under a variable assignment iff $Ryy$ is true
in that model under that assignment, and so on. So what is their point?
For one thing, even though $\lambda x(Fx \wedge Gx)(a)$ and $Fa \wedge Ga$ are semantically
equivalent, they are grammatically different. The former has a subject-predicate
form, whereas the latter is a conjunction. Likewise, $\lambda x Rxx(y)$ is a one-place
predication, whereas $Ryy$ is a two-place predication. Such grammatical
differences are important in some theoretical contexts, such as in empirical linguistics
when semantics must be integrated with natural language syntax. We might
prefer $\lambda x(Fx \wedge Gx)(a)$ as the symbolization of "John is cold and hungry", for
example, since it treats "is cold and hungry" as a single predicate. And we
might prefer to symbolize "No self-respecting Philadelphian is a Yankees fan"
as $\sim\exists x(\lambda y(Ryy \wedge Py)(x) \wedge Yx)$ since this treats "self-respecting Philadelphian"
as a single one-place predicate.[12] For another case of this sort, consider the
symbolization of natural language definite descriptions.[13] The semantics of
section 5.3 treated atomic sentences containing $\iota$ terms (terms of the form
$\iota\alpha\phi$) as "existence-entailing", that is, as being true only if the contained $\iota$ terms are
non-empty. But sometimes we want existence-entailing sentences containing
$\iota$ terms even when those sentences aren't atomic. Suppose, for example, that
we want to symbolize a reading of "The King of the USA is not bald" that is
existence-entailing. (Imagine the sentence uttered by someone who believes
that there is a King of the USA; intuitively, the person is trying to say that the
King of the USA is nonbald.) This reading of the sentence is false since the
USA has no king. So it can't be symbolized as $\sim B\iota x Kxu$: the atomic sentence
$B\iota x Kxu$ is false since $\iota x Kxu$ is empty, and thus the whole sentence is true. We
could always give up on using the $\iota$, and use Russell's wide-scope symbolization
instead:
    $\exists x(Kxu \wedge \forall y(Kyu \rightarrow y=x) \wedge {\sim}Bx)$
This generates the right truth conditions. But "The King of the USA" functions
syntactically in English as a singular term, whereas the Russellian symbolization
contains no corresponding syntactic unit. Lambda abstraction lets us capture
the correct truth conditions[14] while continuing to symbolize "The King of the
USA" with an $\iota$ term, thus treating it as a syntactic unit:
    $\lambda x {\sim}Bx(\iota x Kxu)$
[12] See Gamut (b, section ..).
[13] Compare Stalnaker ().
[14] Assuming we update the semantics of section 5.3.2 in the obvious way, treating atomic
sentences with $\lambda$-abstract predicates as false when they contain terms with undefined denotations.
The difference between a sentence of the form $\lambda x {\sim}Fx(\alpha)$ ("$\alpha$ is non-F"), on the
one hand, and the sentences $\sim\lambda x Fx(\alpha)$ and $\sim F\alpha$ ("it's not the case that $\alpha$ is F"),
on the other, is often called the difference between "internal" and "external"
negation.
The kind of $\lambda$-abstraction we have been discussing is a special case of a
much more general and powerful tool, of particular interest in linguistics.[15]
For just a taste of the possibilities, consider the sentences:
    John crossed the street without looking
    Crossing the street without looking is dangerous.
It's natural to regard "crossed the street" and "looking" in the first sentence as
predicates, generating the symbolization: $Cj \wedge {\sim}Lj$. And it would be strange
to treat "crossed the street" and "looking" as meaning something different in
the second sentence. But the second sentence doesn't seem to be claiming that
people who cross the street without looking are dangerous. Rather, it seems to
be saying that crossing the street without looking in general (the activity, or feature,
or property) is dangerous. So how do we represent the second sentence?
One possibility is to use $\lambda$-abstraction, together with second-order predicates. A
second-order predicate attaches to an ordinary (first-order) predicate to form a
sentence. Thus, "walking is dangerous" might be symbolized by attaching a
second-order predicate $D^2$ to the first-order predicate $W$: $D^2(W)$. So, we could
symbolize the second displayed sentence above by attaching $D^2$ to a $\lambda$-abstract:
    $D^2(\lambda x(Cx \wedge {\sim}Lx))$
As a final example, we might additionally bring in second-order quantification
to symbolize "If John crossed the street without looking, and crossing the street
without looking is dangerous, then John did something dangerous":
    $(Cj \wedge {\sim}Lj \wedge D^2(\lambda x(Cx \wedge {\sim}Lx))) \rightarrow \exists X(D^2(X) \wedge Xj)$
[15] See for example Dowty et al. (); Gamut (b); Heim and Kratzer ().
Exercise 5.12 Symbolize the following sentences, sticking as close
to the English syntax as possible:
    a) Any friend of Barry is either insane or friends with everyone
    b) If a man is from Philadelphia, then insulting him is foolish
Exercise 5.13 Show that $\lambda x \forall y Ryx(a)$ and $\forall x Rxa$ are semantically
equivalent (true in the same models).
5.6 Free Logic
So far we have considered extensions of standard predicate logic. Let's finish
this chapter with a brief discussion of a variation: free logic. In standard
predicate logic, it is assumed that individual constants denote existing entities.
In each model, the interpretation function assigns to each individual constant
some member of the domain. But some natural language names, for example
"Pegasus", "Santa Claus", and "Sherlock Holmes", seem not to denote existing
entities. Call such names "empty names".
Standard predicate logic does not capture the logic of empty names,
according to the advocates of free logic. Consider, for example, the sentence "Sherlock
Holmes exists". This sentence seems false. But it's natural to symbolize it as
$\exists x\, x=a$ (to say that something exists is to say that something is identical to it),
and $\exists x\, x=a$ is a valid sentence of standard predicate logic. (In any model, the
name "a" must denote some member $u$ of the model's domain. But then, where
$g$ is any variable assignment for this model, the open sentence $x=a$ is true with
respect to $g^{x}_{u}$. So, $\exists x\, x=a$ is true with respect to $g$, and so is true in the model.)
In essence: standard predicate logic assumes that all names are nonempty.
How to respond to this apparent discrepancy? The free logicians propose
to alter the semantics and proof theory of predicate logic so as to allow empty
names.
In addition to assuming that names are nonempty, standard predicate logic
also assumes that something exists. For example, the sentence $\exists x(Fx \vee {\sim}Fx)$
is valid in standard predicate logic. The definition of a model in standard
predicate logic requires that the domain be nonempty; as a result this formula
comes out valid. This too might be regarded as objectionable. Other things
being equal, it would be good to have a logic that recognizes the possibility of
there existing nothing at all.
One could admit empty names without admitting the logical possibility
of there existing nothing. Nevertheless, it's natural to follow up the former
with the latter. There's a barrier to the latter: if nothing exists then what do
empty names denote? So if we're in the business of figuring out how to admit
empty names anyway, why not simultaneously figure out how to recognize the
possibility of nothing? Logics allowing the possibility of nothing existing are
sometimes called "inclusive".
5.6.1 Semantics for free logic
There are various ways to implement a semantics for (inclusive) free logic. The
most straightforward introduces, in addition to the normal domain over which
quantifiers range, a further "outer" domain. Think of the normal domain (now
called the "inner" domain) as containing the existent entities; think of the
outer domain as containing the nonexistent ones, such as Pegasus, Santa Claus,
and Sherlock Holmes. Here are the definitions (the language in question is
assumed to be the language of predicate logic plus identity):
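Definition of model: A FPC-model ("F" for "free") is an ordered triple
$\langle\mathcal{D},\mathcal{D}',\mathcal{I}\rangle$ such that
    $\mathcal{D}$ is a set ("the inner domain")
    $\mathcal{D}'$ is a set ("the outer domain")
    $\mathcal{D}$ and $\mathcal{D}'$ share no member in common, and while either one of them
    may be empty, their union must be nonempty
    $\mathcal{I}$ is a function obeying the following constraints
        if $\alpha$ is a constant then $\mathcal{I}(\alpha)$ is a member of $\mathcal{D} \cup \mathcal{D}'$
        if $\Pi$ is an n-place predicate then $\mathcal{I}(\Pi)$ is a set of n-tuples of members
        of $\mathcal{D}$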
Definition of variable assignment: A variable assignment for a FPC-model,
$\langle\mathcal{D},\mathcal{D}',\mathcal{I}\rangle$, is a function that assigns to each variable some member of
$\mathcal{D} \cup \mathcal{D}'$
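Definition of valuation: The FPC-valuation function, $V_{\mathcal{M},g}$, for FPC-model
$\mathcal{M}$ ($= \langle\mathcal{D},\mathcal{D}',\mathcal{I}\rangle$) and variable assignment $g$, is defined as the function that
assigns to each wff either 0 or 1 subject to the following constraints:
    for any n-place predicate $\Pi$ and any terms $\alpha_1 \dots \alpha_n$, $V_{\mathcal{M},g}(\Pi\alpha_1 \dots \alpha_n) = 1$
    iff $\langle [\alpha_1]_{\mathcal{M},g}, \dots, [\alpha_n]_{\mathcal{M},g} \rangle \in \mathcal{I}(\Pi)$
    $V_{\mathcal{M},g}(\alpha=\beta) = 1$ iff: $[\alpha]_{\mathcal{M},g}$ = (i.e., is the same object as) $[\beta]_{\mathcal{M},g}$
    for any wffs $\phi$, $\psi$, and any variable $\alpha$:
        $V_{\mathcal{M},g}({\sim}\phi) = 1$ iff $V_{\mathcal{M},g}(\phi) = 0$
        $V_{\mathcal{M},g}(\phi \rightarrow \psi) = 1$ iff either $V_{\mathcal{M},g}(\phi) = 0$ or $V_{\mathcal{M},g}(\psi) = 1$
        $V_{\mathcal{M},g}(\forall\alpha\phi) = 1$ iff for each $u \in \mathcal{D}$, $V_{\mathcal{M},g^{\alpha}_{u}}(\phi) = 1$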
The definition of denotation, $[\alpha]_{\mathcal{M},g}$, is unchanged, as are the definitions of
truth in a model, validity, and semantic consequence.
Let me make several comments about these definitions. First, few
philosophers (even among the free logicians) believe in such things as nonexistent
entities. Now, even if these philosophers are right, there's nothing wrong
with FPC-models as formal constructions. Accepting the existence of
FPC-models doesn't commit you to real live nonexistent objects. We call $\mathcal{D}'$ the
"outer domain" for the sake of vividness, and it is a convenient heuristic to
call its members "nonexistent objects", but nowhere do the formal definitions
require its members really to be nonexistent. Its members can be any sorts of
existent entities one likes. There is, however, a genuine worry about the FPC
semantics. If the philosophical opponents of nonexistent objects are right, then
the structure of FPC-models doesn't match the structure of the real world;
so why should FPC-validity and FPC-semantic consequence shed any light
on genuine validity and logical consequence? The question is legitimate and
pressing. Nevertheless, let's stick to our inner/outer domain approach. For one
thing, it's an approach that many free logicians have taken; and for another, it's
the most straightforward, formally speaking.[16]
Second, the definition of the valuation function says that $\forall\alpha\phi$ is true if and
only if $\phi$ is true for each object of the inner domain. (Similarly, the obvious
derived clause for the $\exists$ says that $\exists\alpha\phi$ is true iff $\phi$ is true for some object in
the inner domain.) The quantifiers range only over the inner domain, not the
outer. As a result, no sentence of the form $\exists\alpha\phi$ turns out valid (example 5.5).
Thus, $\exists x(Fx \vee {\sim}Fx)$ turns out invalid. Which is what we wanted: if it's logically
possible that there be nothing, then it shouldn't be a logical truth that there is
something that is either green or not green.
[16] Another approach is to stick to a single domain, allow that domain to sometimes be empty,
and allow the interpretation function to be partial, so that $\mathcal{I}(\alpha)$ is undefined for some names
$\alpha$. But a formal obstacle looms: no variable assignments will exist if the domain is empty; how
then will truth in such models be defined? Williamson (a) discusses some of these issues.
Third, notice that the definition of a model does not require the denotation
of a constant to be a member of the inner domain (though it must be a member
either of the inner or outer domain). This gives us another thing we wanted out
of free logic: individual constants don't need to denote what one usually thinks
of as existing objects, i.e., objects in the range of the quantifiers. Now, the fact
noted in the previous paragraph already showed that $\exists x\, x=a$ is not valid (since
it has the form $\exists\alpha\phi$). But something stronger is true: $\exists x\, x=a$ doesn't even
follow from $\exists x\, x=x$, which says in effect that something exists (example 5.6).
This too is what we wanted: it shouldn't follow (according to the defenders of
free logic) from the fact that something exists that Sherlock Holmes exists.
Fourth, notice that the definition of a model requires the extension of a
predicate to be a set of tuples drawn from the inner domain.[17] As a result,
formulas of the form $\Pi\alpha_1 \dots \alpha_n$ are false (relative to a variable assignment)
whenever any of the $\alpha_i$s fail to denote anything in the inner domain (relative
to that variable assignment). Informally: atomic formulas containing "empty
terms" are always false. Free logics with this feature are often called "negative"
free logics. This is not the only alternative. Positive free logics allow some
atomic formulas containing empty terms to be true. And neutral free logics say
that all such formulas are neither true nor false.[18] Though we won't pursue
any of these alternatives in detail, note some possible strategies: for positive
free logic, we might modify our current definitions to allow the extensions of
predicates to be tuples drawn from all of $\mathcal{D} \cup \mathcal{D}'$; and for neutral free logic, one
might make use of strategies for multi-valued logic discussed in section 3.4.
Some examples:
[17] The identity predicate is a kind of exception. Though the interpretation function does
not assign values to the identity predicate, the valuation function counts $\alpha=\beta$ as being true
whenever $\alpha$ and $\beta$ denote the same thing, even if that thing is in the outer domain. Thus the
identity sign is in effect treated as if its extension is $\{\langle u, u \rangle\}$, for all $u \in \mathcal{D} \cup \mathcal{D}'$.
[18] Exception: neutral free logics that treat "exists" as a primitive predicate (rather than defining
"$\alpha$ exists" as $\exists x\, x=\alpha$) sometimes allow "$\alpha$ exists" to be false, rather than lacking in truth-value,
when $\alpha$ fails to denote an existing entity.
Example 5.5: Show that $\nvDash_{\text{FPC}} \exists\alpha\phi$, for any variable $\alpha$ and any wff $\phi$.
Consider any model in which the inner domain is empty, and let $g$ be any variable
assignment in this model. (Since the inner domain is empty, $g$ assigns only
members of the outer domain.) The derived truth condition for the $\exists$ then says
that $V_g(\exists\alpha\phi) = 1$ iff there is some $u$ in the inner domain such that $V_{g^{\alpha}_{u}}(\phi) = 1$.
But there is no such $u$ since the inner domain is empty. So $V_g(\exists\alpha\phi) = 0$ for
this model; and so $\exists\alpha\phi$ is invalid.
Example 5.6: Show that $\exists x\, x=x \nvDash_{\text{FPC}} \exists x\, x=a$. Consider a model with
a nonempty inner domain, but in which the constant "a" denotes something
in the outer domain. Where $g$ is any variable assignment, note first that
$V_g(\exists x\, x=x) = 1$. For $V_g(\exists x\, x=x) = 1$ iff for some $u \in \mathcal{D}$, $V_{g^{x}_{u}}(x=x) = 1$. But
$\mathcal{D}$ is nonempty, so we can let $u$ be any member of $\mathcal{D}$. And note second that
$V_g(\exists x\, x=a) = 0$. For $V_g(\exists x\, x=a) = 1$ iff for some $u \in \mathcal{D}$, $V_{g^{x}_{u}}(x=a) = 1$, which
holds iff for some $u \in \mathcal{D}$, $[x]_{g^{x}_{u}} = [a]_{g^{x}_{u}}$, i.e. iff for some $u \in \mathcal{D}$, $u = \mathcal{I}(a)$. But
there is no such $u$, since $\mathcal{I}(a) \notin \mathcal{D}$.
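The countermodels in examples 5.5 and 5.6 are small enough to compute with. The sketch below assumes a model is given by an inner domain, an outer domain, and a dictionary interpreting the constant "a"; the object names `u1` and `o1` and the helper functions are made up for illustration, and only the derived clause for the existential quantifier is implemented.

```python
def denote(term, g, interp):
    """Denotation of a term: variables are looked up in the assignment g,
    constants in the interpretation."""
    return g[term] if term in g else interp[term]

def exists(var, body, g, inner):
    """Derived clause for the existential quantifier: true iff the body holds
    for some u in the INNER domain, under g modified to assign u to var."""
    return any(body({**g, var: u}) for u in inner)

# A model in the style of example 5.6: nonempty inner domain, and the
# constant a denoting a member of the outer domain.
inner, outer = {"u1"}, {"o1"}
interp = {"a": "o1"}
g = {"x": "o1"}  # assignments may draw on the outer domain too

x_eq_x = lambda h: denote("x", h, interp) == denote("x", h, interp)
x_eq_a = lambda h: denote("x", h, interp) == denote("a", h, interp)

premise = exists("x", x_eq_x, g, inner)      # "something is self-identical"
conclusion = exists("x", x_eq_a, g, inner)   # "something is identical to a"

# A model in the style of example 5.5: the inner domain is empty.
empty_inner_result = exists("x", x_eq_x, g, set())
```

The premise comes out true and the conclusion false in the first model, while in the model with an empty inner domain even the claim that something is self-identical comes out false.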
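Exercise 5.14 Show that $\nvDash_{\text{FPC}} \forall x Fx \rightarrow Fa$.
Exercise 5.15 Show $\vDash_{\text{FPC}} \forall x Fx \rightarrow (\exists y\, y=a \rightarrow Fa)$.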
5.6.2 Proof theory for free logic
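Here we will be brief. How would the free logician view the axioms and rules
of predicate logic from section 4.4?
    $\dfrac{\phi}{\forall\alpha\phi}$    UG
    $\forall\alpha\phi \rightarrow \phi(\beta/\alpha)$    (PC1)
    $\forall\alpha(\phi \rightarrow \psi) \rightarrow (\phi \rightarrow \forall\alpha\psi)$    (PC2)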
UG and PC2 seem unobjectionable, but the free logician will reject PC1.
She will not accept that $\forall x Fx \rightarrow Fa$, for example, is a logical truth: if "a" is an
empty name then $Fa$ will be false even if all existing things are $F$. (Compare
exercise 5.14.) To make things even more vivid, consider another instance of
PC1: $\forall x \exists y\, y=x \rightarrow \exists y\, y=a$ ("if everything exists, then a exists"). This the free
logician will clearly reject. For, since she thinks that both the existential and the
universal quantifier range only over the existent entities, she thinks that the
antecedent $\forall x \exists y\, y=x$ is a logical truth. For every existent thing, there is some
existent thing to which it is identical. But she thinks that the consequent might
be false: there will be no existent thing identical to a, if "a" is an empty name.
If PC1 is to be rejected, what should be put in its place? One possibility is:
    $\forall\alpha\phi \rightarrow (\exists\gamma\, \gamma=\beta \rightarrow \phi(\beta/\alpha))$    (PC1$'$)
That is: if everything is $\phi$, then if $\beta$ exists, $\beta$ must be $\phi$ as well. The principle
of "universal instantiation" has been restricted to existing entities; the free
logician will accept this restricted principle. (Compare exercise 5.15.)
Chapter 6
Propositional Modal Logic
Modal logic is the logic of necessity and possibility. In it we treat "modal"
words like "necessary", "possible", "can", and "must" as logical constants.
Our new symbols for these words are called "modal operators":
    $\Box\phi$: "It is necessary that $\phi$" (or: "Necessarily, $\phi$", "It must be that $\phi$")
    $\Diamond\phi$: "It is possible that $\phi$" (or: "Possibly, $\phi$", "It could be that $\phi$", "It can be
    that $\phi$", "It might be that $\phi$", "It might have been that $\phi$")
It helps to think of modality in terms of possible worlds. A possible world is a
complete and possible scenario. Calling a scenario "possible" means simply that
it's possible in the broadest sense for the scenario to happen. This requirement
disqualifies scenarios in which, for example, it is both raining and also not
raining (at the same time and place); such a thing couldn't happen, and so
doesn't happen in any possible world. But within this limit, we can imagine all
sorts of possible worlds: possible worlds with talking donkeys, possible worlds
in which I am ten feet tall, and so on. "Complete" means simply that no detail is
left out; possible worlds are completely specific scenarios. There is no possible
world in which I am "somewhere between ten and eleven feet tall" without
being some particular height.[1] Likewise, in any possible world in which I am
exactly ten feet, six inches tall (say), I must have some particular weight, must
live in some particular place, and so on. One of these possible worlds is the
actual world; this is the complete and possible scenario that in fact obtains.
The rest of them are merely possible: they do not obtain, but would have
obtained if things had gone differently. In terms of possible worlds, we can
think of our modal operators thus:
    "$\Box\phi$" is true iff $\phi$ is true in all possible worlds
    "$\Diamond\phi$" is true iff $\phi$ is true in at least one possible world
It is necessarily true that all bachelors are male; in every possible world, every
bachelor is male. There might have existed a talking donkey; some possible
world contains a talking donkey.
[1] This is not to say that possible worlds exclude vagueness.
Possible worlds provide, at the very least, a vivid way to think about necessity
and possibility. How much more they provide is an open philosophical question.
Some maintain that possible worlds are the key to the metaphysics of modality,
that what it is for a proposition to be necessarily true is for it to be true in all
possible worlds.[2] Whether this view is defensible is a question beyond the
scope of this book; what is important for present purposes is that we distinguish
possible worlds as a vivid heuristic from possible worlds as a concern in serious
metaphysics.
Natural language modal words are semantically flexible in a systematic way.
For example, suppose I say that I can't attend a certain conference in Cleveland.
What is the force of "can't" here? Probably I'm saying that my attending the
conference is inconsistent with honoring other commitments I've made at
that time. But notice that another sentence I might utter is: "I could attend
the conference; but I would have to cancel my class, and I don't want to do
that." Now I've said that I can attend the conference; have I contradicted my
earlier assertion that I cannot attend the conference? No: what I mean now is
perhaps that I have the means to get to Cleveland on that date. I have shifted
what I mean by "can".
In fact, there is quite a wide range of things one can mean by words for
possibility:
    I can come to the party, but I can't stay late. ("can" = "is
    not inconvenient")
    Humans can travel to the moon, but not Mars. ("can" = "is
    achievable with current technology")
    It's possible to move almost as fast as the speed of light, but
    not to travel faster than light. ("possible" = "is consistent
    with the laws of nature")
    Objects could have traveled faster than the speed of light (if
    the laws of nature had been different), but no matter what
    the laws had been, nothing could have traveled faster than
    itself. ("could" = "metaphysical possibility")
    You may borrow but you may not steal. ("may" = "morally
    acceptable")
    It might rain tomorrow. ("might" = "epistemic possibility")
[2] Sider () presents an overview of this topic.
For any strength of possibility, there is a corresponding strength of necessity,
since "necessarily $\phi$" is equivalent to "not-possibly-not-$\phi$". (Similarly, "possibly
$\phi$" is equivalent to "not-necessarily-not-$\phi$".) So we have a range of strengths
of necessity as well: natural necessity (guaranteed by the laws of nature), moral
or "deontic" necessity (required by morality), epistemic necessity ("known to
be true") and so on.
Some sorts of necessity imply truth; those that do are called "alethic"
necessities. For example, if P is known then P is true; if it is naturally necessary that
massive particles attract one another, then massive particles do in fact attract
one another. Epistemic and natural necessity are alethic. Deontic necessity, on
the other hand, is not alethic; we do not always do what is morally required.
As we saw, we can think of the □ and the ◇ as quantifiers over possible
worlds (the former a universal quantifier, the latter an existential quantifier).
This idea can accommodate the fact that necessity and possibility come in
different strengths: those different strengths result from different restrictions
on the quantifiers over possible worlds. Thus, natural possibility is truth in
some possible world that obeys the actual world's laws; deontic possibility is
truth in some possible world in which nothing morally forbidden occurs; and
so on.³
3
This raises a question, though: to what strength of "necessary" and "possible" does the
notion of possible world itself correspond? Is there some special, strictest notion of necessity,
which can be thought of as truth in absolutely all possible worlds? Or do we simply have
different notions of possible world corresponding to different strengths of necessity?
6.1 Grammar of MPL
Our first topic in modal logic is the addition of the □ and the ◇ to propositional
logic; the result is modal propositional logic ("MPL"). A further step will be modal
predicate logic (chapter 9).
We need a new language: the language of MPL. The grammar of this
language is just like the grammar of propositional logic, except that we add the
□ as a new one-place sentence connective:
Sentence letters: P, Q, R, . . . , with or without numerical subscripts
Connectives: →, ∼, □
Parentheses: (, )
Definition of wff:
Sentence letters are wffs
If φ and ψ are wffs then φ→ψ, ∼φ, and □φ are also wffs
Only strings that can be shown to be wffs using the preceding clauses are
wffs
The □ is the only new primitive connective. But just as we were able to
define ∧, ∨, and ↔, we can define new nonprimitive modal connectives:
◇φ ("Possibly φ") is short for ∼□∼φ
φ⥽ψ ("φ strictly implies ψ") is short for □(φ→ψ)
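The grammar above can be mirrored in a few lines of Python. This is only a sketch under stated assumptions: the nested-tuple encoding, the function names, and `is_wff` are illustrative inventions, and the expansions used for ∧, ∨, and ↔ are one standard choice assumed here rather than quoted from the earlier chapter.

```python
# Wffs as nested tuples over the three primitives; sentence letters are strings.
# The encoding and all names here are assumptions made for illustration.
def Not(p): return ("not", p)
def Imp(p, q): return ("imp", p, q)
def Box(p): return ("box", p)

# Nonprimitive connectives expand into primitives.
# The expansions for And, Or, Iff are an assumed (standard) choice.
def And(p, q): return Not(Imp(p, Not(q)))
def Or(p, q): return Imp(Not(p), q)
def Iff(p, q): return And(Imp(p, q), Imp(q, p))
def Dia(p): return Not(Box(Not(p)))       # "possibly p" is short for not-box-not-p
def Strict(p, q): return Box(Imp(p, q))   # strict implication is box of the conditional

def is_wff(x):
    """Check the three clauses of the definition of wff."""
    if isinstance(x, str):
        return len(x) > 0                  # sentence letter
    if isinstance(x, tuple) and len(x) == 2 and x[0] in ("not", "box"):
        return is_wff(x[1])
    if isinstance(x, tuple) and len(x) == 3 and x[0] == "imp":
        return is_wff(x[1]) and is_wff(x[2])
    return False                           # nothing else is a wff
```

Because the defined connectives are ordinary functions that return primitive structure, any later semantics only has to handle three cases.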
6.2 Symbolizations in MPL
Modal logic allows us to symbolize a number of sentences we couldn't symbolize
before. The most obvious cases are sentences that overtly involve "necessarily",
"possibly", or equivalent expressions:
Necessarily, if snow is white, then snow is white or grass is green
    □[S→(S∨G)]
I'll go if I must
    □G→G
It is possible that Bush will lose the election
    ◇L
Snow might have been either green or blue
    ◇(G∨B)
If snow could have been green, then grass could have been white
    ◇G→◇W
"Impossible" and related expressions signify the lack of possibility:
It is impossible for snow to be both white and not white
    ∼◇(W∧∼W)
If grass cannot be clever then snow cannot be furry
    ∼◇C→∼◇F
God's being merciful is inconsistent with your imperfection being incompatible with your going to heaven
    ∼◇(M∧∼◇(I∧H))
As for the strict conditional, it arguably does a decent job of representing
certain English conditional constructions:
Snow is a necessary condition for skiing
    ∼W⥽∼K
Food and water are required for survival
    ∼(F∧W)⥽∼S
Thunder implies lightning
    T⥽L
Once we add modal operators, we can make an important distinction involving
modal conditionals in natural language. Consider the sentence "if Jones
is a bachelor, then he must be unmarried". The surface grammar misleadingly
suggests the symbolization:
    B→□U
But suppose that Jones is in fact a bachelor. It would then follow from this
symbolization that the proposition that Jones is unmarried is necessarily true.
But nothing we have said suggests that Jones is necessarily a bachelor. Surely
Jones could have been married! In fact, one would normally not use the sentence
"if Jones is a bachelor, then he must be unmarried" to mean that if Jones is in fact
a bachelor, then the following is a necessary truth: Jones is unmarried. Rather,
one would mean: necessarily, if Jones is a bachelor then Jones is unmarried:
    □(B→U)
It is the relationship between Jones's being a bachelor and his being unmarried
that is necessary. Think of this in terms of possible worlds: the first symbolization
says that if Jones is a bachelor in the actual world, then Jones is unmarried
in every possible world (which is absurd); whereas the second one says that in
each possible world, w, if Jones is a bachelor in w, then Jones is unmarried in
w (which is quite sensible). The distinction between φ→□ψ and □(φ→ψ) is
called the distinction between the "necessity of the consequent" (first sentence)
and the "necessity of the consequence" (second sentence). It is important to
keep the distinction in mind, because English surface structure
is misleading.
One final point: when representing English sentences using the □ and
the ◇, keep in mind that these expressions can be used to express different
strengths of necessity and possibility. (One could introduce different symbols
for the different sorts; we'll do a bit of this in chapter 7.)
6.3 Semantics for MPL
As usual, we'll consider semantics first. We'll show how to construct mathematical
configurations in a way that's appropriate to modal logic, and show
how to define truth for formulas of MPL within these configurations. Ideally,
we'd like the assignment of truth values to wffs to mirror the way that natural
language modal statements are made true by the real world, so that we can
shed light on the meanings of natural language modal words, and in order to
provide plausible semantic models of the notions of logical truth and logical
consequence.
In constructing a semantics for MPL, we face two main challenges, one
philosophical, the other technical. The philosophical challenge is simply that
it isn't wholly clear which formulas of MPL are indeed logical truths. It's hard
to construct an engine to spit out logical truths if you don't know which logical
truths you want it to spit out. With a few exceptions, there is widespread
agreement over which formulas of nonmodal propositional and predicate logic
are logical truths. But for modal logic this is less clear, especially for sentences
that contain iterations of modal operators. Is □P→□□P a logical truth? It's
hard to say.
A quick peek at the history of modal logic is in order. Modal logic arose
from dissatisfaction with the material conditional of standard propositional
logic. In standard logic, φ→ψ is true whenever φ is false or ψ is true; but in
expressing the conditionality of ψ on φ, we sometimes want to require a tighter
relationship: we want it not to be a mere accident that either φ is false or ψ
is true. To express this tighter relationship, C. I. Lewis introduced the strict
conditional φ⥽ψ, which he defined, as above, as □(φ→ψ).⁴ Thus defined,
φ⥽ψ isn't automatically true just because φ is false or ψ is true. It must be
necessarily true that either φ is false or ψ is true.
Lewis then asked: what principles govern this new symbol □? Certain
principles seemed clearly appropriate, for instance: □(φ→ψ)→(□φ→□ψ).
Others were less clear. Is □φ→□□φ a logical truth? What about ◇□φ→φ?
Lewis's solution to this problem was not to choose. Instead, he formulated
several different modal systems. He did this axiomatically, by formulating different
systems that differed from one another by containing different axioms and
hence different theorems.
We will follow Lewis's approach, and construct several different modal
systems. Unlike Lewis, we'll do this semantically at first (the semantics for
modal logic we will study was published by Saul Kripke in the s, long
after Lewis was writing), by constructing different definitions of a model for
modal logic. The definitions will differ from one another in ways that result
in different sets of valid formulas. In section 6.4 we'll study Lewis's axiomatic
systems, and in sections 6.5 and 6.6 we'll discuss the relationship between the
semantics and the axiom systems.
Formulating multiple systems does not answer the philosophical question
of which formulas of modal logic are logically true; it merely postpones it.
The question re-arises when we want to apply Lewis's systems; when we ask
which system is the correct system, i.e., which one correctly mirrors the logical
properties of the English words "possibly" and "necessarily"? (Note that since
there are different sorts of necessity and possibility, different systems might
correctly represent different sorts.) But I'll mostly ignore such philosophical
questions here.
4
See Lewis (); Lewis and Langford ().
The technical challenge to constructing a semantics for MPL is that the
modal operators □ and ◇ are not truth functional. A sentential connective is
truth-functional, recall, iff whenever it combines with sentences to form a new
sentence, the truth value of the resulting sentence is determined by the truth
values of the component sentences. For example, "it is not the case that" is
truth-functional, since the truth value of "it is not the case that φ" is determined by
the truth value of φ (the latter is true iff the former is not true). But "necessarily"
is not truth-functional. If I tell you that φ is true, you won't yet have enough
information to determine whether "Necessarily φ" is true or false, since you
won't know whether φ is necessarily true or merely contingently true. Here's
another way to put the point: even though the sentences "If Ted is a philosopher
then Ted is a philosopher" and "Ted is a philosopher" have the same truth value,
if you prefix each with "Necessarily" (intended to mean metaphysical necessity,
say), you get sentences with different truth values. Hence, the truth value of
"Necessarily φ" is not a function of the truth value of φ. Similarly, "possibly"
isn't truth-functional either: "I might have been six feet tall" is true, whereas "I
might have been a round square" is false, despite the sad fact that "I am six feet
tall" and "I am a round square" have the same truth value.
Since the □ and the ◇ are supposed to represent "necessarily" and "possibly",
and since the latter aren't truth-functional, we can't do modal semantics with
truth tables. For the method of truth tables assumes truth-functionality. Truth
tables are just pictures of truth functions: they specify what truth value a
complex sentence has as a function of what truth values its parts have. Our
challenge is clear: we need a semantics for the □ and the ◇ other than the
method of truth tables.
6.3.1 Kripke models
Our approach will be that of possible-worlds semantics. The intuitive idea is to
count □φ as being true iff φ is true in all possible worlds, and ◇φ as being true
iff φ is true in some possible worlds. More carefully: we are going to develop
models for modal propositional logic. These models will contain objects we
will call "possible worlds". And formulas are going to be true or false "in" (or
"at") these worlds. That is, we are going to assign truth values to formulas in
these models relative to possible worlds, rather than absolutely. Truth values of
propositional-logic compound formulas (that is, negations and conditionals)
will be determined by truth tables within each world; ∼φ, for example, will be
true at a world iff φ is false at that world. But the truth value of □φ at a world
won't be determined by the truth value of φ at that world; the truth value of
φ at other worlds will also be relevant.
Specifically, □φ will count as true at a world iff φ is true at every world
that is "accessible" from the first world. What does "accessible" mean? Each
model will come equipped with a binary relation, ℛ, over the set of possible
worlds; we will say that world v is "accessible from" world w when ℛwv. The
intuitive idea is that ℛwv if and only if v is possible relative to w. That is, if you
live in world w, then from your perspective, the events in world v are possible.
The idea that what is possible might vary depending on what possible
world you live in might at first seem strange, but it isn't really. "It is physically
impossible to travel faster than the speed of light" is true in the actual world,
but false in worlds where the laws of nature allow faster-than-light travel.
On to the semantics. We first define a generic notion of an MPL model,
which we'll then use to give a semantics for different modal systems:
Definition of model: An MPL-model is an ordered triple, ⟨𝒲, ℛ, ℐ⟩, where:
𝒲 is a non-empty set of objects ("possible worlds")
ℛ is a binary relation over 𝒲 ("accessibility relation")
ℐ is a two-place function that assigns 0 or 1 to each sentence letter,
relative to ("at", or "in") each world; that is, for any sentence letter α,
and any w ∈ 𝒲, ℐ(α, w) is either 0 or 1. ("interpretation function")
Each MPL-model contains a set 𝒲 of possible worlds, and an accessibility
relation ℛ over 𝒲. ⟨𝒲, ℛ⟩ is sometimes called the model's frame. Think of
the frame as giving the "structure" of the model's space of possible worlds: it
says how many worlds there are, and which worlds are accessible from which.
In addition to a frame, each model also contains an interpretation function ℐ,
which assigns truth values to sentence letters in worlds.
MPL-models are the configurations for propositional modal logic (recall
section 2.2). A configuration is supposed to represent both a way for the
world to be, and also the meanings of nonlogical expressions. In MPL-models,
the former is represented by the frame. (When we say that a configuration
represents "the world", we don't just mean the actual world. "The world"
signifies, rather, reality, which is here thought of as including the entire space
of possible worlds.) The latter is represented by the interpretation function.
(Recall that in propositional logic, the meaning of a sentence letter was a mere
truth value. The meaning is now richer: a truth value for each possible world.)
A model's interpretation function assigns truth values only to sentence
letters. But the sum total of all the truth values of sentence letters in worlds,
together with the frame, determines the truth values of all complex wffs, again
relative to worlds. It is the job of the model's valuation function to specify
exactly how these truth values get determined:
Definition of valuation: Where ℳ (= ⟨𝒲, ℛ, ℐ⟩) is any MPL-model, the
valuation for ℳ, V_ℳ, is defined as the two-place function that assigns either
0 or 1 to each wff relative to each member of 𝒲, subject to the following
constraints, where α is any sentence letter, φ and ψ are any wffs, and w is any
member of 𝒲:
    V_ℳ(α, w) = ℐ(α, w)
    V_ℳ(∼φ, w) = 1 iff V_ℳ(φ, w) = 0
    V_ℳ(φ→ψ, w) = 1 iff either V_ℳ(φ, w) = 0 or V_ℳ(ψ, w) = 1
    V_ℳ(□φ, w) = 1 iff for each v ∈ 𝒲, if ℛwv, then V_ℳ(φ, v) = 1
What about truth values for complex formulas containing ∧, ∨, ↔, ◇, and ⥽?
Given the definition of these defined connectives in terms of the primitive
connectives, it is easy to prove that the following derived conditions hold:
    V_ℳ(φ∧ψ, w) = 1 iff V_ℳ(φ, w) = 1 and V_ℳ(ψ, w) = 1
    V_ℳ(φ∨ψ, w) = 1 iff V_ℳ(φ, w) = 1 or V_ℳ(ψ, w) = 1
    V_ℳ(φ↔ψ, w) = 1 iff V_ℳ(φ, w) = V_ℳ(ψ, w)
    V_ℳ(◇φ, w) = 1 iff for some v ∈ 𝒲, ℛwv and V_ℳ(φ, v) = 1
    V_ℳ(φ⥽ψ, w) = 1 iff for each v ∈ 𝒲, if ℛwv then either V_ℳ(φ, v) = 0 or V_ℳ(ψ, v) = 1
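The valuation clauses translate almost line for line into a recursive function. What follows is a minimal sketch, not an official implementation: the tuple encoding of wffs, the representation of a model as a Python triple `(W, R, I)`, and the two-world example model are all assumptions made for illustration.

```python
# Wffs are nested tuples over "not", "imp", "box"; sentence letters are strings.
# A model is a triple (W, R, I): a set of worlds, a set of ordered pairs,
# and a dict from (letter, world) to 0 or 1. All of this is an assumed encoding.
def Not(p): return ("not", p)
def Imp(p, q): return ("imp", p, q)
def Box(p): return ("box", p)
def Dia(p): return Not(Box(Not(p)))      # possibility, defined from the box

def V(model, f, w):
    """Truth value (0 or 1) of wff f at world w, following the four clauses."""
    W, R, I = model
    if isinstance(f, str):                # sentence letter: look it up in I
        return I[(f, w)]
    if f[0] == "not":
        return 1 - V(model, f[1], w)
    if f[0] == "imp":
        return 1 if V(model, f[1], w) == 0 or V(model, f[2], w) == 1 else 0
    if f[0] == "box":                     # true at every world accessible from w
        return 1 if all(V(model, f[1], v) == 1 for v in W if (w, v) in R) else 0
    raise ValueError(f"not a wff: {f!r}")

def dia_derived(model, f, w):
    """The derived clause for possibility, stated directly."""
    W, R, _ = model
    return 1 if any((w, v) in R and V(model, f, v) == 1 for v in W) else 0

# Hypothetical example: w sees v, v sees nothing; P is true only at v.
W = {"w", "v"}
R = {("w", "v")}
I = {("P", "w"): 0, ("P", "v"): 1}
M = (W, R, I)

# The defined and the derived readings of possibility agree at every world.
agree = all(V(M, Dia("P"), x) == dia_derived(M, "P", x) for x in W)
```

In this example model the box of P is true at v even though P has no accessible witnesses there: a world that sees nothing makes every boxed formula vacuously true, and every diamond formula false.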
So far, we have introduced a generic notion of an MPL model, and have
defined the notion of a wff's being true at a world in an MPL model. But
remember C. I. Lewis's plight: it wasn't clear which modal formulas ought to
count as logical truths. His response, and our response, is to construct different
modal systems, in which different formulas count as logical truths. The systems
we will discuss are named: K, D, T, B, S4, S5. Here in our discussion of
semantics, we will come up with different definitions of what counts as a model,
one for each system: K, D, T, B, S4, S5. As a result, different formulas will
come out valid in the different systems. For example, the formula □P→□□P
is going to come out valid in S4 and S5, but not in the other systems.
The models for the different systems differ according to the formal properties
of their accessibility relations. (Formal properties of relations were
discussed in section 1.8.) For example, we will define a model for system T
("T-model") as any MPL model whose accessibility relation is reflexive (in 𝒲,
the set of worlds in that model). Here is the definition:
Definition of model for modal systems: An "S-model", for any of our
systems S, is defined as an MPL-model ⟨𝒲, ℛ, ℐ⟩ whose accessibility relation
ℛ has the formal feature given for system S in the following chart:
    System   accessibility relation must be
    K        no requirement
    D        serial (in 𝒲)
    T        reflexive (in 𝒲)
    B        reflexive (in 𝒲) and symmetric
    S4       reflexive (in 𝒲) and transitive
    S5       reflexive (in 𝒲), symmetric, and transitive
Thus, any MPL-model counts as a K-model, whereas the requirements for the
other systems are more stringent.
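The chart can be read as a recipe for classifying a frame. Here is a small sketch of that recipe; the function names and the example relations over the worlds 1, 2, 3 are hypothetical, chosen only to exercise each row of the chart.

```python
# W is a set of worlds, R a set of ordered pairs over W (an assumed encoding).
def is_serial(W, R):
    return all(any((w, v) in R for v in W) for w in W)

def is_reflexive(W, R):
    return all((w, w) in R for w in W)

def is_symmetric(R):
    return all((v, w) in R for (w, v) in R)

def is_transitive(R):
    return all((w, u) in R for (w, v) in R for (x, u) in R if v == x)

def systems_for(W, R):
    """Names of the systems whose requirement on the relation is met."""
    refl, sym, trans = is_reflexive(W, R), is_symmetric(R), is_transitive(R)
    checks = [
        ("K", True),                       # no requirement
        ("D", is_serial(W, R)),
        ("T", refl),
        ("B", refl and sym),
        ("S4", refl and trans),
        ("S5", refl and sym and trans),
    ]
    return [name for name, ok in checks if ok]

# Hypothetical example relations over three worlds.
W = {1, 2, 3}
identity = {(w, w) for w in W}
chain = identity | {(1, 2), (2, 3)}        # reflexive only
both_ways = chain | {(2, 1), (3, 2)}       # reflexive and symmetric
closed = chain | {(1, 3)}                  # reflexive and transitive
total = {(w, v) for w in W for v in W}     # every world sees every world
```

Since the requirement concerns only the frame, any model built on `chain` is a K-, D-, and T-model whatever its interpretation function assigns.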
Our next task is to define validity and semantic consequence for the various
systems. A slight wrinkle arises: we can't just define validity as "truth in all
models", since formulas aren't simply true or false in MPL-models; they're
true or false in various worlds in these models. Instead, we first define a notion
of being valid in an MPL model:
Definition of validity in an MPL model: An MPL-wff φ is valid in MPL-model
ℳ (= ⟨𝒲, ℛ, ℐ⟩) iff for every w ∈ 𝒲, V_ℳ(φ, w) = 1
Finally we can give the desired definitions:
Definition of validity and semantic consequence:
An MPL-wff is valid in system S (where S is either K, D, T, B, S4, or S5)
iff it is valid in every S-model
MPL-wff φ is a semantic consequence in system S of set of MPL-wffs Γ
iff for every S-model ℳ (= ⟨𝒲, ℛ, ℐ⟩) and each w ∈ 𝒲, if V_ℳ(γ, w) = 1 for
each γ ∈ Γ, then V_ℳ(φ, w) = 1
As before, we'll use the ⊨ notation for validity and semantic consequence.
But since we have many modal systems, if we claim that a formula is valid, we'll
need to indicate which system we're talking about. Let's do that by subscripting
⊨ with the name of the system; e.g., "⊨_T φ" means that φ is T-valid.
It's important to get clear on the status of possible-worlds lingo here. Where
⟨𝒲, ℛ, ℐ⟩ is an MPL-model, we call the members of 𝒲 "worlds", and we call
ℛ the "accessibility" relation. This is certainly a vivid way to talk about these
models. But officially, 𝒲 is nothing but a nonempty set, any old nonempty
set. Its members needn't be the kinds of things metaphysicians call possible
worlds. They can be numbers, people, bananas: whatever you like. Similarly
for ℛ and ℐ. The former is just defined to be any old binary relation on 𝒲;
the latter is just defined to be any old function mapping each pair of a sentence
letter and a member of 𝒲 to either 1 or 0. Neither needs to have anything to
do with the metaphysics of modality. Officially, then, the possible-worlds talk
we use to describe our models is just talk, not heavy-duty metaphysics.
Still, models are usually intended to depict some aspect of the real world.
The usual intention is that wffs get their truth values within models in a parallel
fashion to how natural language sentences are made true by the real world. So
if natural language modal sentences aren't made true by anything like possible
worlds, then possible worlds semantics would be less valuable than, say, the usual
semantics for nonmodal propositional and predicate logic. To be sure, possible
worlds semantics would still be useful for various purely formal purposes. For
example, given the soundness proofs we will give in section 6.5, the semantics
could still be used to establish facts about unprovability in the axiomatic systems
to be introduced in section 6.4. But it would be hard to see why possible worlds
models would shed any light on the meanings of English modal words, or
why truth-in-all-possible-worlds-models would be a good way of modeling
(genuine) logical truth for modal statements.
On the other hand, if English modal sentences are made true by facts about
possible worlds, then the semantics takes on a greater importance. Perhaps
then we can, for example, decide what the right logic is, for a given strength of
necessity, by reflecting on the formal properties of the accessibility relation
(the real accessibility relation, over real possible worlds, not the relation ℛ over
the members of 𝒲 in our models). Suppose we're considering some strength,
M, of modality. A (real) possible world v is M-accessible from another world,
w, iff what happens in v counts as being M-possible, from the point of view
of w. Perhaps we can figure out the logic of M-necessity and M-possibility by
investigating the formal properties of M-accessibility.
Consider deontic necessity and possibility, for example: a proposition is
deontically necessary iff it ought to be the case; a proposition is deontically
possible iff it is morally acceptable that it be the case. The relation of deontic
accessibility seems not to be reflexive: in an imperfect world like our own, many
things that ought not to be true are nevertheless true. Thus, a world can fail to
be deontically accessible relative to itself. (As we will see, this corresponds to
the fact that deontic necessity is non-alethic; it does not imply truth.) On the
other hand, one might argue, deontic accessibility is serial, since surely there
must always be some deontically accessible world, some world in which what
occurs is morally acceptable. (To deny this would be to admit that everything
could be forbidden.) So, perhaps system D gives the logic of deontic necessity
and possibility (see also section 7.1).
To take one other example: some have argued that the relation of metaphysical
accessibility (the relation relevant to metaphysical necessity and possibility)
is a total relation: every world is metaphysically possible relative to every other.⁵
What modal logic would result from requiring ℛ to be a total (in 𝒲) relation?
The answer is: S5. That is, you get the same valid formulas whether you require
ℛ to be a total relation or an equivalence relation (see exercise 6.1). So, if the
(real) metaphysical accessibility relation is a total relation, the correct logic for
metaphysical necessity is S5. But others have argued that metaphysical accessibility
is intransitive.⁶ Perhaps one possible world is metaphysically accessible
from another only if the individuals in the latter world aren't too different from
how they are in the former world, that is, only if such differences are below a certain
threshold. In that case, it might be argued, a world in which I'm a frog is not
metaphysically accessible from the actual world: any world in which I'm that
drastically different from my actual, human, self, just isn't metaphysically possible,
relative to actuality. But perhaps a world, w, in which I'm a human-frog
hybrid is accessible from the actual world (the difference between a human
and a frog-human hybrid is below the threshold); and perhaps the frog world
is accessible from w (since the difference between a frog-human hybrid and
a frog is also below the threshold). If so, then metaphysical accessibility is
intransitive. Metaphysical accessibility is clearly reflexive. So perhaps the logic
of metaphysical possibility is given by system B or system T.
5
See Lewis (, ).
6
Compare Salmon ().
Exercise 6.1** Let O be the modal system given by the requirement
that ℛ must be total (in 𝒲). Show that ⊨_O φ iff ⊨_S5 φ.
6.3.2 Semantic validity proofs
Given our definitions, we can now show particular formulas to be valid in
various systems.
Example 6.1: The wff □(P∨∼P) is K-valid. To show this, we must show
that the wff is valid in all MPL-models, since validity-in-all-MPL-models is
the definition of K-validity. Being valid in a model means being true at every
world in the model. So, consider any MPL-model ⟨𝒲, ℛ, ℐ⟩, and let w be any
world in 𝒲. We must show that V_ℳ(□(P∨∼P), w) = 1. (As before, I'll start to
omit the subscript ℳ on V_ℳ when it's clear which model we're talking about.)
i) Suppose for reductio that V(□(P∨∼P), w) = 0
ii) So, by the truth condition for □ in the definition of the valuation function,
there is some world, v, such that ℛwv and V(P∨∼P, v) = 0
iii) Given the (derived) truth condition for ∨, V(P, v) = 0 and V(∼P, v) = 0
iv) Since V(∼P, v) = 0, given the truth condition for ∼, V(P, v) = 1. But
that's impossible; V(P, v) can't be both 0 and 1.
Thus, ⊨_K □(P∨∼P).
Note that similar reasoning would establish ⊨_K □φ, for any tautology φ.
For within any world, the truth values of complex statements of propositional
logic are determined by the truth values of their constituents in that world by
the usual truth tables. So if φ is a tautology, it will be true in any world in any
model; hence □φ will turn out true in any world in any model.
Example 6.2: ⊨_T (◇□(P→Q)∧□P)→◇Q. Let w be any world
in any T-model ℳ; we must show that V_ℳ((◇□(P→Q)∧□P)→◇Q, w) = 1:
i) Suppose for reductio that V((◇□(P→Q)∧□P)→◇Q, w) = 0.
ii) So V(◇□(P→Q)∧□P, w) = 1 and
iii) V(◇Q, w) = 0. So Q is false in every world accessible from w.
iv) From ii), ◇□(P→Q) is true at w, and so V(□(P→Q), v) = 1, for some
world, call it v, such that ℛwv.
v) From ii), V(□P, w) = 1. So, by the truth condition for the □, P is true
in every world accessible from w; since ℛwv, it follows that V(P, v) = 1.
But V(Q, v) = 0 given iii). So V(P→Q, v) = 0.
vi) From iv), P→Q is true in every world accessible from v; since ℳ is a
T-model, ℛ is reflexive; so ℛvv; so V(P→Q, v) = 1, contradicting v).
The last example showed that the formula (◇□(P→Q)∧□P)→◇Q is valid
in T. Suppose we wanted to show that it is also valid in S4. What more would
we have to do? Nothing! To be S4-valid is to be valid in every S4-model. But
a quick look at the definitions shows that every S4-model is a T-model. So,
since we already know that the formula is valid in all T-models, we may
conclude that it must be valid in all S4-models without doing a separate proof:
[Figure: the S4 models drawn as a region inside the T models. Captions: "The S4 models are a subset of the T models." "So if a formula is valid in all T models, it's automatically valid in all S4 models."]
Think of it another way. A proof that a wff is S4-valid may use the information
that the accessibility relation is both transitive and reflexive. But it doesn't need
to. So the T-validity proof in example 6.2 also counts as an S4-validity proof.
(It also counts as a B-validity proof and an S5-validity proof.) But it doesn't
count as a K-validity proof, since it assumes in line vi) that ℛ is reflexive. To be
K-valid, a wff must be valid in all models, whereas the proof in example 6.2 only
establishes validity in all reflexive models. (In fact (◇□(P→Q)∧□P)→◇Q
isn't K-valid, as we'll be able to demonstrate shortly.)
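The contrast between T and K for this formula can be explored by brute force over very small models. The sketch below is an illustration only: the encoding of wffs and models, the helper names, and the bound of two worlds are all assumptions, and finding no falsifying reflexive model of that size is merely consistent with the T-validity proof, not a substitute for it.

```python
from itertools import product

# Assumed encoding: wffs are nested tuples, models are triples (W, R, I).
def Not(p): return ("not", p)
def Imp(p, q): return ("imp", p, q)
def Box(p): return ("box", p)
def And(p, q): return Not(Imp(p, Not(q)))   # assumed expansion of conjunction
def Dia(p): return Not(Box(Not(p)))

def V(model, f, w):
    W, R, I = model
    if isinstance(f, str):
        return I[(f, w)]
    if f[0] == "not":
        return 1 - V(model, f[1], w)
    if f[0] == "imp":
        return 1 if V(model, f[1], w) == 0 or V(model, f[2], w) == 1 else 0
    return 1 if all(V(model, f[1], v) == 1 for v in W if (w, v) in R) else 0

def models(max_worlds, letters):
    """Every model with 1..max_worlds worlds over the given sentence letters."""
    for n in range(1, max_worlds + 1):
        W = list(range(n))
        pairs = [(w, v) for w in W for v in W]
        keys = [(a, w) for a in letters for w in W]
        for rbits in product([0, 1], repeat=len(pairs)):
            R = {p for p, bit in zip(pairs, rbits) if bit}
            for ibits in product([0, 1], repeat=len(keys)):
                yield W, R, dict(zip(keys, ibits))

def find_countermodel(f, letters, max_worlds=2, require=lambda W, R: True):
    """Return (W, R, I, w) with f false at w, or None if the search finds none."""
    for W, R, I in models(max_worlds, letters):
        if require(W, R):
            for w in W:
                if V((W, R, I), f, w) == 0:
                    return W, R, I, w
    return None

def reflexive(W, R):
    return all((w, w) in R for w in W)

F = Imp(And(Dia(Box(Imp("P", "Q"))), Box("P")), Dia("Q"))
in_T = find_countermodel(F, ["P", "Q"], 2, reflexive)   # restricted to T-models
in_K = find_countermodel(F, ["P", "Q"], 2)              # any MPL-model
```

The unrestricted search succeeds because a world that sees nothing makes every boxed formula true, which is exactly the situation reflexivity rules out in step vi) of the proof.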
Consider the following diagram of systems:
          S5
         ↗  ↖
       S4    B
         ↖  ↗
          T
          ↑
          D
          ↑
          K
An arrow from one system to another indicates that validity in the first system
implies validity in the second system. For example, all D-valid wffs are also
T-valid. For if a wff is valid in all D-models, then, since every T-model is also a
D-model (reflexivity implies seriality), it must be valid in all T-models as well.
S5 is the strongest system, since it has the most valid formulas. That's
because it has the fewest models: it's easy to be S5-valid since there are so
few potentially falsifying models. K is the weakest system (fewest validities)
since it has the most potentially falsifying models. The other systems are
intermediate.
Notice that the diagram isn't linear. Both B and S4 are stronger than T:
each contains all the T-valid formulas and more besides. And S5 is stronger
than both B and S4. But (as we will see below) neither B nor S4 is stronger
than the other (nor are they equally strong): some B-valid wffs aren't S4-valid,
and some S4-valid wffs aren't B-valid. (The definitions of B and S4 hint at this.
B requires symmetry but not transitivity, whereas S4 requires transitivity but
not symmetry, so some B-models aren't S4-models, and some S4-models aren't
B-models.)
Suppose you're given a formula, and for each system in which it is valid,
you want to give a semantic proof of its validity. This needn't require multiple
semantic proofs. As we saw with example 6.2, to prove that a wff is valid in a
number of systems, it suffices to give a validity proof in the weakest of those
systems, since that very proof will automatically be a proof that it is valid in
all stronger systems. For example, a K-validity proof is itself a validity proof
for D, T, B, S4, and S5. But there is an exception. Suppose a wff is not valid
in T, but you've given a semantic proof of its validity in B. This proof also
shows that the wff is S5-valid, since every S5 model is a B-model. But you can't
yet conclude that the wff is S4-valid, since not every S4-model is a B-model.
Another semantic proof may be needed: of the formula's S4-validity. (Of course,
the formula may not be S4-valid.) So: when a wff is valid in both B and S4, but
not in T, two semantic proofs of its validity are needed.
We are now in a position to do validity proofs. But as we'll see in the next
section, it's often easier to do proofs of validity when one has failed to construct
a counter-model for a formula.
Exercise 6.2 Use validity proofs to demonstrate the following:
a) ⊨_D [□P∧□(∼P∨Q)]→◇Q
b) ⊨_S4 ◇◇(P∧Q)→◇Q
6.3.3 Countermodels
We have a definition of validity for the various systems, and we've shown how
to establish validity of particular formulas. (We have also defined semantic
consequence for these systems, but our focus will be on validity.) Now we'll see
how to establish invalidity. We establish that a formula is invalid by constructing
a countermodel for it: a model containing a world in which the formula is
false. (Since validity means truth in every world in every model, the existence
of a single countermodel establishes invalidity.)
I'm going to describe a helpful graphical procedure, introduced by Hughes
and Cresswell (), for constructing countermodels. Now, it's always an
option to bypass the graphical procedure and directly intuit what a countermodel
might look like. But the graphical procedure makes things a lot easier,
especially with more complicated formulas.
I'll illustrate the procedure by using it to show that the wff ◇P→□P is not
K-valid. To be K-valid, a wff must be valid in all MPL-models, so all we must
do is find one MPL-model in which ◇P→□P is false in some world.
Place the formula in a box
We begin by drawing a box, which represents some chosen world in the model
we're in the process of pictorially constructing. The goal is to make the formula
false in this world. In these examples I'll always call this first world "r":
   +-----------+
   |  ◇P → □P  |
   +-----------+
         r
Now, since the box represents a world, we should have some way of representing
the accessibility relation. What worlds are accessible from r; what worlds does
r "see"? Well, to represent one world (box) seeing another, we'll draw an arrow
from the first to the second. But in this case we don't need to draw any arrows.
We're only trying to show that ◇P→□P is K-invalid, and the accessibility
relation for system K doesn't even need to be serial: no world needs to see any
worlds at all. So, we'll forget about arrows for the time being.
Make the formula false in the world
We'll indicate a formula's truth value by writing that truth value above the
formula's major connective. (The "major connective" of a wff is the last connective
that was added when the wff was formed via the rules of grammar.⁷
Thus, the major connective of P→□Q is the →, and the major connective of
□(P→□Q) is the leftmost □.) So to indicate that ◇P→□P is to be false in this
model, we'll put a 0 above its arrow:
   +-----------+
   |     0     |
   |  ◇P → □P  |
   +-----------+
         r
Enter forced truth values
Assigning a truth value to a formula sometimes forces us to assign truth values
to other formulas in the same world. For example, if we make a conjunction
true in a world then we must make each of its conjuncts true at that world; and
if we make a conditional false at a world, we must make its antecedent true and
its consequent false at that world. In the current example, since we've made
◇P→□P false in r, we've got to make ◇P true at r (indicated on the diagram
by a 1 over its major connective, the ◇), and we've got to make its consequent
□P false at r:
   +-----------+
   |  1  0 0   |
   |  ◇P → □P  |
   +-----------+
         r
7
In talking about major connectives, let's treat nonprimitive connectives as if they were
primitive. Thus, the major connective of □P∧∼Q is the ∧.
Enter asterisks
When we assign a truth value to a modal formula, we thereby commit ourselves
to assigning certain other truth values to various formulas at various worlds.
For example, when we make ◇P true at r, we commit ourselves to making P
true at some world that r sees. To remind ourselves of this commitment, we'll
put an asterisk (*) below ◇P. An asterisk below indicates a commitment to there
being some world of a certain sort. Similarly, since □P is false at r, this means
that P must be false in some world r sees (if it were true in all such worlds then
□P would be true at r). We again have a commitment to there being some
world of a certain sort, so we enter an asterisk below □P as well:
   +-----------+
   |  1  0 0   |
   |  ◇P → □P  |
   |  *    *   |
   +-----------+
         r
Discharge bottom asterisks
The next step is to fulfill the commitments we incurred when we added the
bottom asterisks. For each, we need to add a world to the diagram. The first
asterisk requires us to add a world in which P is true; the second requires us to
add a world in which P is false. We do this as follows:
[Diagram: world r as before (◇P true, ◇P→□P false, □P false, with both bottom
asterisks), with an arrow from r to a new world a in which P has value 1, and
an arrow from r to a new world b in which P has value 0]
The official model
We now have a diagram of a K-model containing a world in which ◇P→□P
is false. But we need to produce an official model, according to the official
definition of a model. A model is an ordered triple ⟨W, R, I⟩, so we must
specify the model's three members.
The set of worlds, W, is simply the set of worlds I invoked:
W = {r, a, b}
What are r, a, and b? Let's just take them to be the letters 'r', 'a', and 'b'. No
reason not to: the members of W, recall, can be any things whatsoever.
Next, the accessibility relation. This is represented on the diagram by the
arrows. In our model, there is an arrow from r to a, an arrow from r to b, and
no other arrows. Thus, the diagram represents that r sees a, that r sees b, and
that there are no further cases of seeing. Now, remember that the accessibility
relation, like all relations, is a set of ordered pairs. So, we simply write out this
set:
R = {⟨r, a⟩, ⟨r, b⟩}
That is, we write out the set of all ordered pairs ⟨w1, w2⟩ such that w1 sees w2.
Finally, we need to specify the interpretation function, I, which assigns
truth values to sentence letters at worlds. In our model, I must assign 1 to
P at world a, and 0 to P at world b. Now, our official definition requires an
interpretation to assign a truth value to each of the infinitely many sentence
letters at each world; but so long as P is true at world a and false at world b,
it doesn't matter what other truth values I assigns. So let's just (arbitrarily)
choose to make all other sentence letters false at all worlds in the model. We
have, then:
I(P, a) = 1
I(P, b) = 0
I(α, w) = 0 for all other sentence letters α and worlds w
That's it; we're done. We have produced a model in which ◇P→□P is
false at some world; hence this formula is not valid in all models; and hence it's
not K-valid:
⊭_K ◇P→□P.
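The official model can also be checked mechanically. What follows is a minimal sketch, not part of the method described here: it assumes formulas are encoded as nested Python tuples, the accessibility relation as a set of pairs, and the interpretation as the set of (sentence letter, world) pairs assigned 1. The names holds, W, R, and I are illustrative choices.

```python
def holds(f, w, R, I):
    """Return True iff formula f is true at world w.

    R is a set of (u, v) pairs meaning "u sees v"; I is the set of
    (sentence letter, world) pairs assigned 1 (everything else gets 0).
    The tuple encoding of formulas is an assumption of this sketch.
    """
    op = f[0]
    if op == "atom":
        return (f[1], w) in I
    if op == "not":
        return not holds(f[1], w, R, I)
    if op == "imp":
        return (not holds(f[1], w, R, I)) or holds(f[2], w, R, I)
    seen = [v for (u, v) in R if u == w]  # the worlds that w sees
    if op == "box":
        return all(holds(f[1], v, R, I) for v in seen)
    if op == "dia":
        return any(holds(f[1], v, R, I) for v in seen)
    raise ValueError(f"unknown operator: {op}")


P = ("atom", "P")
formula = ("imp", ("dia", P), ("box", P))  # diamond P -> box P

W = {"r", "a", "b"}
R = {("r", "a"), ("r", "b")}
I = {("P", "a")}  # P is 1 at a; all other letters are 0 everywhere

value_at_r = holds(formula, "r", R, I)
print(value_at_r)
```

Under these assumptions the antecedent comes out true at r (r sees a, where P is true) and the consequent false (r sees b, where P is false), so the conditional is false at r, as the diagram requires.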
Check the model
At the end of this process, it's a good idea to double-check that your model is
correct. This involves various things. First, make sure that you've succeeded in
producing the correct kind of model. For example, if you're trying to produce
a T-model, make sure that the accessibility relation you've written down is
reflexive. (In our case, we were only trying to construct a K-model, and so for
us this step is trivial.) Second, make sure that the formula in question really
does come out false at one of the worlds in your model.
Simplifying models
Sometimes a model can be simplified. In the countermodel for ◇P→□P, we
needn't have used three worlds. We added world a because the truth of ◇P
called for a world that r sees in which P is true. But we needn't have made
that a new world; we could have made P true in r and made r see itself. (We
couldn't have done that for both asterisks; that would have made P both true
and false at r.) So, we could make this one simplification:
[Diagram: world r containing ◇P→□P, with ◇P true, P true, the → false, and □P
false; r sees itself and sees world b, in which P has value 0]
Official model:
W = {r, b}
R = {⟨r, r⟩, ⟨r, b⟩}
I(P, r) = 1, all others 0
Adapting models to different systems
We have shown that ◇P→□P is not K-valid. Next let's show that this formula
isn't D-valid, that is, that it is false in some world of some model with a serial
accessibility relation. The model we just constructed won't do, since its
accessibility relation isn't serial; world b doesn't see any world. But we can
easily change that:
[Diagram: as before, except that world b now also sees itself]
Official model:
W = {r, b}
R = {⟨r, r⟩, ⟨r, b⟩, ⟨b, b⟩}
That was easy: adding the fact that b sees itself didn't require changing
anything else in the model.
Suppose we want now to show that ◇P→□P isn't T-valid. What more
must we do? Nothing! The model we just displayed is a T-model, in addition
to being a D-model, since its accessibility relation is reflexive. In fact, its
accessibility relation is also transitive, so it's also an S4-model. What about B?
It's easy to make the accessibility relation symmetric:
[Diagram: as before, except that world b now also sees world r]
Official model:
W = {r, b}
R = {⟨r, r⟩, ⟨r, b⟩, ⟨b, b⟩, ⟨b, r⟩}
So we've established B-invalidity as well. In fact, the model just displayed is
an S5 model since its accessibility relation is an equivalence relation. And so,
since any S5 model is also a K, D, T, B, and S4 model, this one model shows
that ◇P→□P is not valid in any of our systems. So we have established that:
⊭_{K,D,T,B,S4,S5} ◇P→□P.
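The question of which systems a given model counts as a model for can be answered by testing its accessibility relation for the relevant properties. Here is a hedged sketch of such a test, applied to the three two-world relations just discussed; the function name model_kinds and the set-of-pairs encoding are assumptions made for illustration.

```python
def model_kinds(W, R):
    """List the systems whose frame conditions the relation R satisfies on W.

    Conditions assumed: D = serial, T = reflexive, B = reflexive and
    symmetric, S4 = reflexive and transitive, S5 = all three.
    """
    serial = all(any((w, v) in R for v in W) for w in W)
    reflexive = all((w, w) in R for w in W)
    symmetric = all((v, u) in R for (u, v) in R)
    transitive = all((u, x) in R
                     for (u, v) in R for (y, x) in R if v == y)
    kinds = ["K"]  # every model is a K-model
    if serial:
        kinds.append("D")
    if reflexive:
        kinds.append("T")
    if reflexive and symmetric:
        kinds.append("B")
    if reflexive and transitive:
        kinds.append("S4")
    if reflexive and symmetric and transitive:
        kinds.append("S5")
    return kinds


W = {"r", "b"}
simplified = {("r", "r"), ("r", "b")}   # the simplified K-model
with_b_loop = simplified | {("b", "b")}  # b now sees itself
with_b_to_r = with_b_loop | {("b", "r")}  # b now also sees r

for R in (simplified, with_b_loop, with_b_to_r):
    print(model_kinds(W, R))
```

On this encoding the first relation qualifies only for K, the second for K, D, T, and S4, and the third for all six systems, matching the discussion above.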
In this case it wouldn't have been hard to move straight to the final S5
model, right from the start. But in more difficult cases, it's best to proceed
slowly, as I did here. Try first for a countermodel in K. Then build the model
up gradually, trying to make its accessibility relation satisfy the requirements of
stronger systems. When you get a countermodel in a stronger system (a system
with more requirements on its models), that very countermodel will establish
invalidity in all weaker systems. Keep in mind the diagram of systems:
[Diagram of systems, with arrows K → D → T, T → S4, T → B, S4 → S5, and
B → S5]
An arrow from one system to another, recall, indicates that validity in the first
system implies validity in the second. The arrows also indicate facts about
invalidity, but in reverse: when an arrow points from one system to another,
then invalidity in the second system implies invalidity in the first. For example,
if a wff is invalid in T, then it is invalid in D. (That's because every T-model is
a D-model; a countermodel in T is therefore a countermodel in D.)
When our task is to discover the systems in which a given formula is invalid,
usually only one countermodel will be needed: a countermodel in the strongest
system in which the formula is invalid. But there is an exception involving B
and S4. Suppose a given formula is valid in S5, but we discover a model showing
that it isn't valid in B. That model is automatically a T, D, and K-model, so
we know that the formula isn't T, D, or K-valid. But we don't yet know about
S4-validity. If the formula is S4-invalid, then we will need to produce a second
countermodel, an S4 countermodel. (Notice that the B-model couldn't already
be an S4-model. If it were, then its accessibility relation would be reflexive,
symmetric, and transitive, and so it would be an S5 model, contradicting the
fact that the formula was S5-valid.)
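The way invalidity flows backward along the arrows can be sketched as a small reachability computation. This is only an illustration: the dictionary ARROWS encodes the diagram of systems as described above (K → D → T, T → S4, T → B, S4 → S5, B → S5), and the function names are hypothetical.

```python
# ARROWS[x] lists the systems that x points to in the diagram of systems:
# validity in x implies validity in each of them.
ARROWS = {
    "K": ["D"],
    "D": ["T"],
    "T": ["S4", "B"],
    "S4": ["S5"],
    "B": ["S5"],
    "S5": [],
}


def reaches(x, target):
    """True iff there is a (possibly empty) chain of arrows from x to target."""
    return x == target or any(reaches(y, target) for y in ARROWS[x])


def invalid_in(system):
    """Systems in which a formula is invalid, given a countermodel in system."""
    return sorted(x for x in ARROWS if reaches(x, system))


print(invalid_in("T"))
print(invalid_in("B"))
print(invalid_in("S5"))
```

A countermodel in B yields invalidity in B, T, D, and K but says nothing about S4, which is the exception noted above; a countermodel in S5 settles all six systems at once.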
So far we have the following steps for constructing countermodels:
1. Place the formula in a box and make it false
2. Enter forced truth values
3. Enter asterisks
4. Discharge bottom asterisks
5. The official model
We need to add to this list.
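Top asterisks
Let's try to get a countermodel for ◇□P→□◇P in all the systems in which it
is invalid. A cautious beginning would be to try for a K-model. After the first
few steps, we have:
[Diagram: world r containing ◇□P→□◇P, with 1 above the antecedent's ◇, 0 above
the →, and 0 above the consequent's □, and with bottom asterisks below that ◇
and that □; an arrow from r to world a, in which □P has value 1, and an arrow
from r to world b, in which ◇P has value 0]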
At this point we have a true □ (in world a) and a false ◇ (in world b). Like
true ◇s and false □s, these generate commitments pertaining to other worlds.
But unlike true ◇s and false □s, they don't commit us to the existence of some
accessible world of a certain type; they carry commitments for every accessible
world. The true □P in world a, for example, requires us to make P true in every
world accessible from a. Similarly, the falsity of ◇P in world b commits us to
making P false in every world accessible from b. We indicate such commitments,
universal rather than existential, by putting asterisks above the relevant modal
operators:
[Diagram: as before, with a top asterisk above the □ of □P in world a and a top
asterisk above the ◇ of ◇P in world b]
Now, how can we honor these commitments; how must we "discharge" these
asterisks? In this case, when trying to construct a K-model, we don't need to do
anything. Since world a, for example, doesn't see any world, P is automatically
true in every world it sees; the statement "for every world, w, if Raw then
V(P, w) = 1" is vacuously true. Same goes for b: P is automatically false in all
worlds it sees. So, we've got a K-model in which ◇□P→□◇P is false.
Now let's turn the model into a D-model. Every world must now see at
least one world. Let's try:
[Diagram: as before, with an arrow from a to a new world c, in which P has
value 1, and an arrow from b to a new world d, in which P has value 0; worlds c
and d each see themselves]
I added worlds c and d, so that a and b would each see at least one world.
(Further, worlds c and d each had to see a world, to keep the relation serial. I
could have added new worlds e and f seen by c and d, but e and f would have
needed to see some worlds. So I just let c and d see themselves.) But once c
and d were added, discharging the upper asterisks in worlds a and b required
making P true in c and false in d (since a sees c and b sees d).
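Let's now try for a T-model. Worlds a and b must now see themselves. But
then we no longer need worlds c and d, since they were added just to make the
relation serial. So we can simplify:
[Diagram: world r containing ◇□P→□◇P (antecedent true, conditional false,
consequent false), seeing itself, a, and b; world a, seeing itself, in which □P
and P both have value 1; world b, seeing itself, in which ◇P and P both have
value 0]
Official model:
W = {r, a, b}
R = {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨r, a⟩, ⟨r, b⟩}
I(P, a) = 1, all others 0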
When you add arrows, you need to make sure that all top asterisks are
discharged. In this case this required nothing of world r, since there were no top
asterisks there. There were top asterisks in worlds a and b; these I discharged
by making P be true in a and false in b.
Notice that I could have moved straight to this T-model (which is itself a
D-model) rather than first going through the earlier mere D-model. However,
this won't always be possible; sometimes you'll be able to get a D-model, but
no T-model.
At this point let's verify that our model does indeed assign the value 0 to
our formula ◇□P→□◇P. First notice that □P is true in a (since a only sees
one world, itself, and P is true there). But r sees a. So ◇□P is true at r. Now,
consider b. b sees only one world, itself; and P is false there. So ◇P must also
be false there. But r sees b. So □◇P is false at r. But now, the antecedent of
◇□P→□◇P is true, while its consequent is false, at r. So that conditional is
false at r. Which is what we wanted.
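The same verification can be carried out by a short program. The sketch below assumes a tuple encoding of formulas, a set-of-pairs encoding of the accessibility relation, and an interpretation given as the set of (letter, world) pairs assigned 1; all names are illustrative rather than part of the official definitions.

```python
def holds(f, w, R, I):
    """Return True iff formula f is true at world w in the model given by
    R (a set of (u, v) pairs, "u sees v") and I (the set of
    (sentence letter, world) pairs assigned 1)."""
    op = f[0]
    if op == "atom":
        return (f[1], w) in I
    if op == "imp":
        return (not holds(f[1], w, R, I)) or holds(f[2], w, R, I)
    seen = [v for (u, v) in R if u == w]  # the worlds that w sees
    if op == "box":
        return all(holds(f[1], v, R, I) for v in seen)
    if op == "dia":
        return any(holds(f[1], v, R, I) for v in seen)
    raise ValueError(f"unknown operator: {op}")


P = ("atom", "P")
antecedent = ("dia", ("box", P))   # diamond box P
consequent = ("box", ("dia", P))   # box diamond P
formula = ("imp", antecedent, consequent)

# The T-model from the text: every world sees itself, and r sees a and b.
R = {("r", "r"), ("a", "a"), ("b", "b"), ("r", "a"), ("r", "b")}
I = {("P", "a")}

checks = {
    "box P at a": holds(("box", P), "a", R, I),
    "diamond P at b": holds(("dia", P), "b", R, I),
    "antecedent at r": holds(antecedent, "r", R, I),
    "consequent at r": holds(consequent, "r", R, I),
    "formula at r": holds(formula, "r", R, I),
}
for label, value in checks.items():
    print(label, value)
```

Each entry in checks corresponds to one sentence of the verification above, so a mismatch would point to the step at which a hand check had gone wrong.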
Onward. Our model is not a B-model since r sees a and b but they don't
see r back. Suppose we try to make a and b see r:
[Diagram: the T-model as before, except that the arrows between r and a and
between r and b now run in both directions; □P and P have value 1 in a, and ◇P
and P have value 0 in b]
We must now make sure that all top asterisks are discharged. Since a now sees
r, P must be true at r. But b sees r too, so P must be false at r. Since P can't be
both true and false at r, we're stuck. We have failed to construct a B-model in
which this formula is false.
Our failure to construct a B-countermodel suggests that it may be impossible
to do so. We can prove that this is impossible by showing that the formula is
true in every world of every B-model, that is, that the formula is B-valid. Let
M = ⟨W, R, I⟩ be any model in which R is reflexive and symmetric, and
consider any w ∈ W; we must show that V_M(◇□P→□◇P, w) = 1:
i) Suppose for reductio that V(◇□P→□◇P, w) = 0. Then V(◇□P, w) = 1
and V(□◇P, w) = 0.
ii) Given the former, for some v, Rwv and V(□P, v) = 1.
iii) Given the latter, for some u, Rwu and V(◇P, u) = 0.
iv) From ii), P is true at every world accessible from v; by symmetry, Rvw;
so V(P, w) = 1.
v) From iii), P is false at every world accessible from u; by symmetry, Ruw;
so V(P, w) = 0, contradicting iv).
Just as we suspected: the formula is indeed B-valid; no wonder we failed to
come up with a B-countermodel!
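A brute-force search over small models gives a useful sanity check on results like this one. The sketch below is an assumption-laden illustration: it only examines models with at most three worlds and a single sentence letter P, so finding no countermodel is evidence, not a proof; the reductio above is what establishes B-validity. The encoding and all names are hypothetical.

```python
from itertools import product


def holds(f, w, R, I):
    """Truth of formula f at world w; R is a set of (u, v) pairs and I is
    the set of (sentence letter, world) pairs assigned 1."""
    op = f[0]
    if op == "atom":
        return (f[1], w) in I
    if op == "imp":
        return (not holds(f[1], w, R, I)) or holds(f[2], w, R, I)
    seen = [v for (u, v) in R if u == w]
    if op == "box":
        return all(holds(f[1], v, R, I) for v in seen)
    if op == "dia":
        return any(holds(f[1], v, R, I) for v in seen)
    raise ValueError(f"unknown operator: {op}")


def reflexive(W, R):
    return all((w, w) in R for w in W)


def reflexive_symmetric(W, R):
    return reflexive(W, R) and all((v, u) in R for (u, v) in R)


def find_countermodel(formula, max_worlds, frame_ok):
    """Search every model with up to max_worlds worlds (letter P only) whose
    relation satisfies frame_ok; return (W, R, I, world) or None."""
    for n in range(1, max_worlds + 1):
        W = list(range(n))
        pairs = [(u, v) for u in W for v in W]
        for bits in product([False, True], repeat=len(pairs)):
            R = {p for p, keep in zip(pairs, bits) if keep}
            if not frame_ok(W, R):
                continue
            for vals in product([False, True], repeat=n):
                I = {("P", w) for w in W if vals[w]}
                for w in W:
                    if not holds(formula, w, R, I):
                        return W, R, I, w
    return None


P = ("atom", "P")
formula = ("imp", ("dia", ("box", P)), ("box", ("dia", P)))

in_T = find_countermodel(formula, 3, reflexive)
in_B = find_countermodel(formula, 3, reflexive_symmetric)
print(in_T is not None, in_B is None)
```

The search finds a reflexive countermodel but none that is both reflexive and symmetric, in agreement with the T-countermodel and the B-validity proof.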
Might there be an S5 countermodel? No: the B-validity proof we just
constructed also shows that the formula is S5-valid. What about an S4
countermodel? The existence of the B-validity proof doesn't tell us one way or the
other. Remember the diagram: validity in S4 doesn't imply validity in B, nor
does validity in B imply validity in S4. So we must either try to come up with
an S4-model, or try to construct an S4 semantic validity proof. Usually it's best
to try for a model. In the present case this is easy: the T-model we gave earlier
is itself an S4-model. Thus, on the basis of that model, we can conclude that
⊭_{K,D,T,S4} ◇□P→□◇P.
We have accomplished our task. We gave an S4 countermodel, which is a
countermodel for each system in which ◇□P→□◇P is invalid. And we gave
a validity proof in B, which is a validity proof for each system in which the
formula is valid.
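Example 6.3: Determine in which systems ◇□P→◇□◇□P is valid and in
which systems it is invalid. We can get a T-model as follows:
[Diagram: world r containing ◇□P→◇□◇□P, with the antecedent true and the
conditional and its consequent false; world a, in which □P is true and □◇□P is
false; world b, in which ◇□P is false and P is true; world c, in which P is
false. Arrows run from r to a, from r to b, from a to b, and from b to c, and
every world sees itself. Annotations on the diagram: "I discharged the second
bottom asterisk in r by letting r see b"; "Notice how commitments to truth
values for different formulas are recorded by placing the formulas side by side
in the box"]
Official model:
W = {r, a, b, c}
R = {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨c, c⟩, ⟨r, a⟩, ⟨r, b⟩, ⟨a, b⟩, ⟨b, c⟩}
I(P, a) = I(P, b) = 1, all others 0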
Now consider what happens when we try to turn this model into a B-model.
World b must see back to world a. But then the false ◇□P in b conflicts with
the true □P in a. So it's time for a validity proof. In constructing this validity
proof, we can be guided by our failed attempt to construct a countermodel
(assuming all of our choices in constructing that countermodel were forced).
In the following proof that the formula is B-valid, I use variables for worlds
that match up with the attempted countermodel above:
i) Suppose for reductio that V(◇□P→◇□◇□P, r) = 0, in some world r in
some B-model ⟨W, R, I⟩. So V(◇□P, r) = 1 and V(◇□◇□P, r) = 0.
ii) Given the former, for some world a, Rra and V(□P, a) = 1.
iii) Given the latter, since Rra, V(□◇□P, a) = 0. So for some b, Rab and
V(◇□P, b) = 0. By symmetry, Rba; so V(□P, a) = 0, contradicting ii).
We now have a T-model for the formula, and a proof that it is B-valid. The
B-validity proof shows the formula to be S5-valid; the T-model shows it to be
K- and D-invalid. We don't yet know about S4. So let's return to the T-model
above and try to make its accessibility relation transitive. World a must then
see world c, which is impossible since □P is true in a and P is false in c. So
we're ready for an S4-validity proof (the proof looks like the B-validity proof at
first, but then diverges):
i) Suppose for reductio that V(◇□P→◇□◇□P, r) = 0, in some world r in
some S4-model ⟨W, R, I⟩. So V(◇□P, r) = 1 and V(◇□◇□P, r) = 0.
ii) Given the former, for some world a, Rra and V(□P, a) = 1.
iii) Given the latter, since Rra, V(□◇□P, a) = 0. So for some b, Rab and
V(◇□P, b) = 0. By reflexivity, Rbb, so V(□P, b) = 0. So for some world
c, Rbc and V(P, c) = 0.
iv) Since Rab and Rbc, by transitivity we have Rac. So, given ii), V(P, c) =
1, contradicting iii).
Daggers
If we make a conditional false, we're forced to enter certain truth values for its
components: 1 for the antecedent, 0 for the consequent. Similarly, making a
conjunction true forces us to make its conjuncts true, making a disjunction false
forces us to make its disjuncts false, and making a negation either true or false
forces us to give the negated formula the opposite truth value. But consider
making a disjunction true. Here we have a choice; we can make either disjunct
true (or both). We similarly have a choice for how to make a conditional true,
or a conjunction false, or a biconditional either true or false.
When one faces choices like these, it's best to delay making the choice
as long as possible. After all, some other part of the model might force you
to make one choice rather than the other. If you investigate the rest of the
countermodel, and nothing has forced your hand, you may need then to make
a guess: try one of the truth-value combinations open to you, and see whether
you can finish the countermodel. If not, go back and try another combination.
To remind ourselves of these choices, we will place a dagger (†) underneath
the major connective of the formula in question. Consider, as an example,
constructing a countermodel for the formula ◇(◇P∨□Q)→(◇P∨Q). Throwing
caution to the wind and going straight for a T-model, we have after a few steps:
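[Diagram: world r containing ◇(◇P∨□Q)→(◇P∨Q), with ◇(◇P∨□Q) true and the
conditional, ◇P∨Q, ◇P, and Q all false; r sees itself and world a, in which
◇P∨□Q is true (with a dagger below its ∨) and P is false; a sees itself]
We still have to decide how to make ◇P∨□Q true in world a: which disjunct
to make true? Well, making □Q true won't require adding another world to
the model, so let's do that. We have, then, a T-model:
[Diagram: as before, except that in world a the disjunct □Q and the sentence
letter Q now have value 1, so that ◇P∨□Q is true there, while P remains false]
Official model:
W = {r, a}
R = {⟨r, r⟩, ⟨a, a⟩, ⟨r, a⟩}
I(Q, a) = 1, all else 0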
Next let's try to upgrade this to a B-model. We can't simply leave everything
as-is while letting world a see back to world r, since □Q is true in a and Q is
false in r. But there's another possibility. We weren't forced to discharge the
dagger in world a by making □Q true. So let's explore the other possibility;
let's make ◇P true:
[Diagram: world r as before; in world a, ◇P and ◇P∨□Q are true and P is false,
with a bottom asterisk below the ◇ of ◇P; arrows run in both directions between
r and a and between a and a new world b, in which P has value 1; every world
sees itself]
Official model:
W = {r, a, b}
R = {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨r, a⟩, ⟨a, r⟩, ⟨a, b⟩, ⟨b, a⟩}
I(P, b) = 1, all else 0
What about an S4-model? We can't just add the arrows demanded by
transitivity to our B-model, since ◇P is false in world r and P is true in world
b. What we can do instead is revisit the choice of which disjunct of ◇P∨□Q to
make true. Instead of making ◇P true, we can make □Q true, as we did when
we constructed our T-model. In fact, that T-model is already an S4-model.
So, we have countermodels in both S4 and B. The first resulted from
one choice for discharging the dagger in world a, the second from the other
choice. An S5-model, though, looks impossible. When we made the left
disjunct of ◇P∨□Q true we couldn't make the accessibility relation transitive,
and when we made the right disjunct true we couldn't make the accessibility
relation symmetric. So apparently we can't make the accessibility relation both
transitive and symmetric. Here is an S5-validity proof, based on this line of
thought. Note the "separation of cases" reasoning:
i) Suppose for reductio that V(◇(◇P∨□Q)→(◇P∨Q), r) = 0, for some
world r in some S5-model. Then V(◇(◇P∨□Q), r) = 1 and ...
ii) ... V(◇P∨Q, r) = 0. So V(◇P, r) = 0 and V(Q, r) = 0.
iii) Given i), for some world a, Rra and V(◇P∨□Q, a) = 1. So, either
V(◇P, a) = 1 or V(□Q, a) = 1.
iv) The first possibility leads to a contradiction. For if V(◇P, a) = 1, then
for some b, Rab and V(P, b) = 1. But then given transitivity, Rrb, and
so, given V(◇P, r) = 0 (line ii)), V(P, b) = 0, contradicting V(P, b) = 1.
v) So does the second. For symmetry yields Rar, so if V(□Q, a) = 1 then
V(Q, r) = 1, contradicting ii).
vi) Either way we have a contradiction.
So we have demonstrated that
⊨_{S5} ◇(◇P∨□Q)→(◇P∨Q).
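The two countermodels obtained from the two ways of discharging the dagger can be compared side by side in code. The following sketch, under the same illustrative assumptions as any such encoding (tuples for formulas, sets of pairs for relations, hypothetical function names), evaluates the formula at r in each model and tests each relation for symmetry and transitivity.

```python
def holds(f, w, R, I):
    """Truth of formula f at world w; R is a set of (u, v) pairs and I is
    the set of (sentence letter, world) pairs assigned 1."""
    op = f[0]
    if op == "atom":
        return (f[1], w) in I
    if op == "imp":
        return (not holds(f[1], w, R, I)) or holds(f[2], w, R, I)
    if op == "or":
        return holds(f[1], w, R, I) or holds(f[2], w, R, I)
    seen = [v for (u, v) in R if u == w]
    if op == "box":
        return all(holds(f[1], v, R, I) for v in seen)
    if op == "dia":
        return any(holds(f[1], v, R, I) for v in seen)
    raise ValueError(f"unknown operator: {op}")


def symmetric(R):
    return all((v, u) in R for (u, v) in R)


def transitive(R):
    return all((u, x) in R for (u, v) in R for (y, x) in R if v == y)


P, Q = ("atom", "P"), ("atom", "Q")
formula = ("imp",
           ("dia", ("or", ("dia", P), ("box", Q))),
           ("or", ("dia", P), Q))

# Right disjunct (box Q) made true in a: the T-model that is also an S4-model.
R_s4 = {("r", "r"), ("a", "a"), ("r", "a")}
I_s4 = {("Q", "a")}

# Left disjunct (diamond P) made true in a: the B-model.
R_b = {("r", "r"), ("a", "a"), ("b", "b"),
       ("r", "a"), ("a", "r"), ("a", "b"), ("b", "a")}
I_b = {("P", "b")}

print(holds(formula, "r", R_s4, I_s4), symmetric(R_s4), transitive(R_s4))
print(holds(formula, "r", R_b, I_b), symmetric(R_b), transitive(R_b))
```

The first relation is transitive but not symmetric and the second symmetric but not transitive, which is the pattern the separation-of-cases proof exploits.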
Summary of steps
Here, then, is a final list of the steps for constructing countermodels:
1. Place the formula in a box and make it false
2. Enter forced truth values
3. Enter daggers, and after all forced moves are over...
4. ...enter asterisks
5. Discharge asterisks (hint: do bottom asterisks first)
6. Back to step 2 if not finished
7. The official model
Exercise 6.3 For each of the following wffs, give a countermodel
for every system in which it is not valid, and give a semantic validity
proof for every system in which it is valid. When you use a single
countermodel or validity proof for multiple systems, indicate which
systems it is good for.
a)* 2[P3(QR)]3[Q(2P3R)]
b) 3(P3Q)(23P32Q)
c) 2(P3Q)(2P3Q)
d)* 2(PQ)2(2P2Q)
e) 2(PQ)22(3P3Q)
f) 2(2PQ)2(2P2Q)
g)* 332P2P
h) 33P23P
i) 2[2(P2P)2P](32P2P)
6.4 Axiomatic systems of MPL
Let's turn next to proof theory. In one respect the proof-theoretic approach to
logic is particularly attractive in the case of modal logic. Model-theoretic
approaches are most attractive when they are "realistic", that is, when
truth-in-a-model parallels real truth in the real world. But possible-worlds
models are realistic only if a possible-worlds metaphysics of modality is
correct. Proof theory, on the other hand, has the virtue of caution, since its
attraction does not rely on assumptions about semantics. Opponents of
possible-worlds metaphysics can always retreat to proof theory and characterize
the inferential roles of modal expressions directly.
Our approach to proof theory will be axiomatic: we'll write down axioms,
which are sentences of propositional modal logic that seem clearly to be logical
truths, and we'll write down rules of inference, which say which sentences can
be logically inferred from which other sentences.
We'll continue to follow C. I. Lewis in constructing multiple modal systems,
since it's so unclear which sentences of MPL are logical truths. We'll
formulate multiple axiomatic systems, which differ from one another by containing
different axioms (and so, different theorems). In fact, we'll give these systems
the same names as the systems we investigated semantically: K, D, T, B, S4,
and S5. (Thus we will subscript the symbol for theoremhood with the names
of systems; ⊢_K φ, for example, will mean that φ is a theorem of system K.) Our
re-use of the system names will be justified in sections 6.5 and 6.6, where we
will establish soundness and completeness for each system, thereby showing
that in each system, exactly the same formulas are provable as are valid.
6.4.1 System K
Our first system, K, is the weakest system: the system with the fewest theorems.
Axiomatic system K:
Rules: modus ponens and necessitation:
  From φ→ψ and φ, infer ψ   (MP)
  From φ, infer □φ   (NEC)
Axioms: for any MPL-wffs φ, ψ, and χ, the following are axioms:
  φ→(ψ→φ)   (PL1)
  (φ→(ψ→χ))→((φ→ψ)→(φ→χ))   (PL2)
  (∼ψ→∼φ)→((∼ψ→φ)→ψ)   (PL3)
  □(φ→ψ)→(□φ→□ψ)   (K)
System K (like all the modal systems we'll study) is an extension of
propositional logic, in the sense that it includes all of the theorems of
propositional logic, but then adds more theorems. It includes all of
propositional logic because it contains all of the propositional logic rules and
axioms; it adds theorems by adding a new rule of inference (NEC), and a new
axiom schema (the K-schema), as well as adding new wffs (wffs containing the □)
to the stock of wffs that can occur in the PL axioms.
If you've been paying attention, the rule NEC (for "necessitation") ought
to strike you as being, well, wrong. It says that if you have a formula φ on a line,
then you may infer the formula □φ. But can't a sentence be true without being
necessarily true? Yes; but so long as we're careful how we use our axiomatic
system, this fact won't get us into trouble. Recall the distinction from section
2.6 between a proof (simpliciter) and a proof from a set Γ. In a proof, each
line must be either i) an axiom or ii) a wff that follows from earlier lines in the
proof by a rule; in a proof from Γ a line may also be iii) a member of Γ (i.e.,
a "premise"). A theorem is defined as the last line of any proof. So every line
of every proof is a theorem. So whenever one uses necessitation in a proof (a
proof simpliciter, that is) one is applying it to a theorem. And necessitation does
seem appropriate when applied to theorems: if φ is a theorem, then □φ ought
also to be a theorem. Think of it another way. The worry about necessitation
is that it doesn't preserve truth: its premise can be true when its conclusion
is false. But necessitation does preserve logical truth. So if we're thinking of
our axiomatic definition of theoremhood as being a (proof-theoretic) way to
represent logical truth, there seems to be no trouble with its use of necessitation.
So: we don't get into trouble with NEC if we only consider proofs of
theorems. But we do get into trouble if we consider proofs from premises.
Consider the following:
1. P    premise
2. □P    1, NEC
This is a proof of □P from {P}. Thus, P ⊢_K □P (given the way our definitions
are set up). But it's easy to construct a model showing that P ⊭_K □P. Thus,
we have a failure of the generalized version of soundness, according to which
Γ ⊨_K φ whenever Γ ⊢_K φ. What's more, even though P ⊢_K □P, it's not the
case that ⊢_K P→□P. (We'll be able to demonstrate this once we've proved
soundness for K.) Thus, the deduction theorem (section 2.9) fails for our
axiomatic system K, and indeed for all the axiomatic modal systems we will
consider. So we cannot do anything like conditional proof in these systems: we
cannot show that a conditional is a theorem by assuming its antecedent and
proving its consequent on that basis.[8]
8
Compare also the failure of conditional proof given a supervaluational semantics for Z
discussed at the end of section 3.4.5.
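The model alluded to above is easy to write down. As a hedged illustration (the two-world model, the tuple encoding, and the names used are all choices made for this sketch), here is a model in which P is true at a world where □P is false, so that P→□P is false there:

```python
def holds(f, w, R, I):
    """Truth of formula f at world w; R is a set of (u, v) pairs and I is
    the set of (sentence letter, world) pairs assigned 1."""
    op = f[0]
    if op == "atom":
        return (f[1], w) in I
    if op == "imp":
        return (not holds(f[1], w, R, I)) or holds(f[2], w, R, I)
    if op == "box":
        return all(holds(f[1], v, R, I) for (u, v) in R if u == w)
    raise ValueError(f"unknown operator: {op}")


P = ("atom", "P")

# A two-world K-model: w sees v, P is true at w and false at v.
R = {("w", "v")}
I = {("P", "w")}

premise_true = holds(P, "w", R, I)
conclusion_true = holds(("box", P), "w", R, I)
conditional_true = holds(("imp", P, ("box", P)), "w", R, I)
print(premise_true, conclusion_true, conditional_true)
```

Since the premise is true and the conclusion false at the same world, necessitation applied to a premise fails to preserve truth; necessitation applied to theorems faces no such difficulty.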
These problems aren't insurmountable. One can develop more complex
definitions of provability from premises that lack these negative consequences.[9]
But for our purposes, it will be simpler to sidestep rather than solve the
problems, by staying away from proofs from premises. Our axiomatic system delivers
bad results when it comes to proofs from premises, so we won't think of that
aspect of the system as representing logical consequence.
Let's investigate some proof techniques. The simplest consists of first
proving something from the PL axioms, and then necessitating it.
Example 6.4: Proof of □((P→Q)→(P→P)):
1. P→(Q→P)    PL1
2. (P→(Q→P))→((P→Q)→(P→P))    PL2
3. (P→Q)→(P→P)    1, 2, MP
4. □((P→Q)→(P→P))    3, NEC
To save on sweat, tears, and ink, let's reinstate the time-saving shortcuts
introduced in sections 2.8 and 4.4. Whenever φ is an "MPL-tautology" (i.e.,
results from some tautology (PL-valid wff) by uniform substitution of MPL-wffs
for sentence letters) we allow ourselves to simply write down φ in an
axiomatic proof, with the annotation "PL". (Since our PL axioms and rule
are complete, and are included here in K, we know we could always insert an
official K-proof of φ.) Thus, the previous proof could be shortened to:[10]
1. (P→Q)→(P→P)    PL
2. □((P→Q)→(P→P))    1, NEC
And we allow ourselves to move directly from some wffs φ_1 ... φ_n to any
"MPL-tautological consequence" of those wffs. That is, if we already have
φ_1 ... φ_n, then we may write ψ, annotating the line numbers of φ_1 ... φ_n and
"PL", if the conditional φ_1→(φ_2→···(φ_n→ψ)) is an MPL-tautology. (As in
section 4.4, after writing "PL" I will sometimes cite one of the tautologies
from table 4.1 to clarify what I've done.) And we allow ourselves to perform
multiple steps at once, if it's obvious what's going on.
9
See, for example, Garson ().
10
Here the formula annotated "PL" is in fact a genuine tautology, but in other cases it won't
be. The MPL-tautology □P→□P comes from the tautology P→P by uniformly substituting
□P for P, but it isn't itself a tautology because it isn't a PL-wff: the □ isn't part of the
primitive vocabulary of propositional logic.
Back to investigating what we can do in K. In K, tautologies are
necessary: the strategy of example 6.4 can be used to prove □φ whenever φ is an
MPL-tautology. The next example illustrates a related fact about K: in it,
contradictions are impossible.
Example 6.5: Proof of ∼◇(P∧∼P), i.e., of ∼∼□∼(P∧∼P):
1. ∼(P∧∼P)    PL
2. □∼(P∧∼P)    1, NEC
3. ∼∼□∼(P∧∼P)    2, PL
So far we have only used necessitation and the PL axioms. What about the
K-axioms □(φ→ψ)→(□φ→□ψ)? Their point is to enable "distribution of the
□ over the →". That is, if you ever have the formula □(φ→ψ), then you can
always move to □φ→□ψ as follows:
i.   □(φ→ψ)
i+1. □(φ→ψ)→(□φ→□ψ)    K
i+2. □φ→□ψ    i, i+1, MP
Distribution of the □ over the →, plus the rule of necessitation, combine
to give us a powerful technique for proving wffs of the form □φ→□ψ. First
prove φ→ψ (this technique works only if you can do this); then necessitate it
to get □(φ→ψ); then distribute the □ over the arrow to get □φ→□ψ. This is
one of the core K-techniques, and is featured in the next example.
Example 6.6: Proof of □(P∧Q)→(□P∧□Q):
1. (P∧Q)→P    PL
2. □[(P∧Q)→P]    1, NEC
3. □[(P∧Q)→P]→[□(P∧Q)→□P]    K
4. □(P∧Q)→□P    2, 3, MP
5. □(P∧Q)→□Q    Insert steps similar to 1-4
6. □(P∧Q)→(□P∧□Q)    4, 5, PL (composition)
Next, let's consider how to prove (□P∨□Q)→□(P∨Q). Here we run into
problems. We must prove a conditional whose antecedent is a disjunction of
two □s. But the modal techniques we've developed so far don't deliver results
of this form. They only show us how to put □s in front of theorems, and how to
distribute □s over →s, and so only deliver results of the form □φ and □φ→□ψ.
And since we're working in an axiomatic system, we cannot use proof strategies
like conditional proof and reductio ad absurdum. To overcome these problems,
I'll use our modal techniques to prove two conditionals, □P→□(P∨Q) and
□Q→□(P∨Q), from which the desired result follows by PL.
Example 6.7: Proof of (□P∨□Q)→□(P∨Q):
1. P→(P∨Q)    PL
2. □(P→(P∨Q))    1, NEC
3. □P→□(P∨Q)    K, 2, MP
4. Q→(P∨Q)    PL
5. □Q→□(P∨Q)    4, NEC, K, MP
6. (□P∨□Q)→□(P∨Q)    3, 5, PL (dilemma)
In general: if the modal techniques don't deliver the result you're after, look
for one or more modal formulas that they do deliver which, by PL, imply the
desired result. (Again, remember to consult table 4.1.) Assemble the modal
formulas using the modal techniques, and then write down your desired result,
annotating PL.
The next example illustrates our next modal technique: combining two □s
to get a single □.
Example 6.8: Proof of (□P∧□Q)→□(P∧Q):
1. P→(Q→(P∧Q))    PL
2. □P→□(Q→(P∧Q))    1, NEC, K, MP
3. □(Q→(P∧Q))→[□Q→□(P∧Q)]    K
4. □P→[□Q→□(P∧Q)]    2, 3, PL (syllogism)
5. (□P∧□Q)→□(P∧Q)    4, PL (import/export)
(Step 4 is unnecessary since you could go straight from 2 and 3 to 5 by
propositional logic; I put it in for perspicuity.)
In general, whenever φ_1→(φ_2→···(φ_n→ψ)) is provable you can use the
technique of example 6.8 to prove □φ_1→(□φ_2→···(□φ_n→□ψ)). Thus you
can move from □φ_1 ... □φ_n to □ψ in any such case. Roughly speaking: you
can combine several □s to get a further □, provided you can prove the inside
of the further □ from the insides of the former □s. First prove the conditional
φ_1→(φ_2→···(φ_n→ψ)); then necessitate it to get □[φ_1→(φ_2→···(φ_n→ψ))];
then distribute the □ over the arrows repeatedly using K-axioms and PL to get
□φ_1→(□φ_2→···(□φ_n→□ψ)).
Onward. The next example illustrates one way to prove formulas with
"nested" modal operators:
Example 6.9: Proof of □□(P∧Q)→□□P:
1. (P∧Q)→P    PL
2. □(P∧Q)→□P    1, NEC, K, MP
3. □[□(P∧Q)→□P]    2, NEC
4. □□(P∧Q)→□□P    K, 3, MP
Notice in line 3 that we necessitated something that was not a PL theorem.
That's ok; the rule of necessitation applies to all K-theorems, even those whose
proofs were distinctively modal. Notice also how this proof contains two
instances of the technique of example 6.6. This technique involves obtaining
a conditional, necessitating it, and then distributing the □ over the →. We
did this first using the conditional (P∧Q)→P; that led us to a conditional,
□(P∧Q)→□P. Then we started the technique over again, using this as our
initial conditional.
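The necessitate-then-distribute technique is mechanical enough to be written as a small function. The sketch below is an illustration only: formulas are assumed to be nested tuples, and the helper names (imp, box, conj, k_axiom, mp, nec, box_distribute) are hypothetical. It does not check that its input really is a theorem; it merely builds the lines that the technique licenses once a conditional has been proved.

```python
def imp(a, b):
    return ("imp", a, b)


def box(a):
    return ("box", a)


def conj(a, b):
    return ("and", a, b)


def k_axiom(a, b):
    """The instance box(a -> b) -> (box a -> box b) of the K-schema."""
    return imp(box(imp(a, b)), imp(box(a), box(b)))


def mp(conditional, antecedent):
    """Modus ponens: from a conditional and its antecedent, the consequent."""
    if conditional[0] != "imp" or conditional[1] != antecedent:
        raise ValueError("MP does not apply to these lines")
    return conditional[2]


def nec(theorem):
    """Necessitation: from a theorem, its necessitation."""
    return box(theorem)


def box_distribute(conditional_theorem):
    """From a proved conditional a -> b, obtain box a -> box b via NEC, K, MP."""
    _, a, b = conditional_theorem
    necessitated = nec(conditional_theorem)  # box(a -> b)
    k_instance = k_axiom(a, b)               # box(a -> b) -> (box a -> box b)
    return mp(k_instance, necessitated)      # box a -> box b


P, Q = ("atom", "P"), ("atom", "Q")
start = imp(conj(P, Q), P)     # (P and Q) -> P, an MPL-tautology
once = box_distribute(start)   # box(P and Q) -> box P
twice = box_distribute(once)   # box box(P and Q) -> box box P
print(once)
print(twice)
```

Applying box_distribute once reproduces lines 1 to 4 of example 6.6, and applying it a second time to the result reproduces example 6.9.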
So far we have no techniques for dealing with the ◇, other than eliminating
it by definition. It will be convenient to derive some shortcuts involving the ◇:
some theorems that we may subsequently cite in proofs. The most important
is an analog of the K axiom:
□(φ→ψ)→(◇φ→◇ψ)    (K◇)
By definition of the ◇, this is an abbreviation of □(φ→ψ)→(∼□∼φ→∼□∼ψ).
How to prove it? None of our modal techniques delivers a wff of this form.
But notice that this wff follows by PL from □(φ→ψ)→(□∼ψ→□∼φ). And
this latter wff looks like the result of necessitating an MPL-tautology and then
distributing the □ over the → a couple of times, which is just the kind of thing
we know how to do in K. So, any instance of K◇ may be proved as follows:
1. (φ→ψ)→(∼ψ→∼φ)    PL (contraposition)
2. □(φ→ψ)→□(∼ψ→∼φ)    1, NEC, K, MP
3. □(∼ψ→∼φ)→(□∼ψ→□∼φ)    K
4. □(φ→ψ)→(□∼ψ→□∼φ)    2, 3, PL (syllogism)
5. □(φ→ψ)→(∼□∼φ→∼□∼ψ)    4, PL (contraposition)
The next example illustrates the importance of K◇:
Example 6.10: Proof of □P→(◇Q→◇(P∧Q)):
1. P→[Q→(P∧Q)]    PL
2. □P→□[Q→(P∧Q)]    1, NEC, K, MP
3. □[Q→(P∧Q)]→[◇Q→◇(P∧Q)]    K◇
4. □P→[◇Q→◇(P∧Q)]    2, 3, PL (syllogism)
In general, K$\Diamond$ lets us construct proofs of the following sort. Suppose we wish to prove a formula of the form:
$O_1\phi_1\to(O_2\phi_2\to\cdots\to(O_n\phi_n\to\Diamond\psi)\cdots)$
where the $O_i$s are modal operators, all but one of which are $\Box$s. (Thus, the remaining $O_i$ is $\Diamond$.) The technique is like that of example 6.8. First prove a nested conditional, the antecedents of which are the $\phi_i$s, and the consequent of which is $\psi$ (the technique works only when this can be done); then necessitate it; then repeatedly distribute the $\Box$ over the $\to$s, once using K$\Diamond$, the rest of the times using K. But there is one catch. We need to use K$\Diamond$ last, after all the uses of K. This in turn requires that the final antecedent in the initial nested conditional must be whichever of the $\phi_i$s we want to end up underneath the $\Diamond$. For instance, suppose that $O_2$ is $\Diamond$. Thus, what we are trying to prove is:
$\Box\phi_1\to(\Diamond\phi_2\to(\Box\phi_3\to\cdots\to(\Box\phi_n\to\Diamond\psi)\cdots))$
In this case, the conditional to use would be:
$\phi_1\to(\phi_n\to(\phi_3\to\cdots\to(\phi_{n-1}\to(\phi_2\to\psi))\cdots))$
In other words, one must swap $\phi_n$ with $\phi_2$. The end result will therefore have the modal statements out of order:
$\Box\phi_1\to(\Box\phi_n\to(\Box\phi_3\to\cdots\to(\Box\phi_{n-1}\to(\Diamond\phi_2\to\Diamond\psi))\cdots))$
But that's not a problem since this implies our desired result by PL. (Recall that $\alpha\to(\beta\to\gamma)$ is logically equivalent in PL to $\beta\to(\alpha\to\gamma)$.)
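To make the ordering constraint concrete, here is a sketch of the schematic proof for the smallest interesting case, n = 3 with the diamond on the second antecedent. It assumes, as the text requires, that the reordered nested conditional in line 1 is an MPL-tautology; the line numbering and layout are illustrative only.

```latex
\begin{align*}
1.\;& \phi_1\to(\phi_3\to(\phi_2\to\psi)) && \text{PL (assumed provable)}\\
2.\;& \Box[\phi_1\to(\phi_3\to(\phi_2\to\psi))] && 1,\ \text{NEC}\\
3.\;& \Box\phi_1\to\Box(\phi_3\to(\phi_2\to\psi)) && 2,\ \text{K, MP}\\
4.\;& \Box(\phi_3\to(\phi_2\to\psi))\to(\Box\phi_3\to\Box(\phi_2\to\psi)) && \text{K}\\
5.\;& \Box(\phi_2\to\psi)\to(\Diamond\phi_2\to\Diamond\psi) && \text{K}\Diamond\\
6.\;& \Box\phi_1\to(\Box\phi_3\to(\Diamond\phi_2\to\Diamond\psi)) && 3,\ 4,\ 5,\ \text{PL}\\
7.\;& \Box\phi_1\to(\Diamond\phi_2\to(\Box\phi_3\to\Diamond\psi)) && 6,\ \text{PL (permutation)}
\end{align*}
```

K$\Diamond$ is used exactly once, at line 5, and only after every use of K; the permutation in line 7 restores the intended order of antecedents.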
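Why do we need to save K$\Diamond$ for last? The strategy of successively distributing the box over all the nested conditionals comes to a halt as soon as the K$\Diamond$ theorem is used. Suppose, for example, that we attempted to prove $\Diamond P\to(\Box Q\to\Diamond(P\wedge Q))$ as follows:
1. $P\to(Q\to(P\wedge Q))$   PL
2. $\Box[P\to(Q\to(P\wedge Q))]$   1, NEC
3. $\Diamond P\to\Diamond(Q\to(P\wedge Q))$   K$\Diamond$, 2, MP
4. ?
Now we're stuck. We need $\Diamond(Q\to(P\wedge Q))\to(\Box Q\to\Diamond(P\wedge Q))$ to finish the proof; but neither K nor K$\Diamond$ gets us this. We must start over, beginning with a different conditional:
Example 6.11: Proof of $\Diamond P\to(\Box Q\to\Diamond(P\wedge Q))$:
1. $Q\to(P\to(P\wedge Q))$   PL
2. $\Box(Q\to(P\to(P\wedge Q)))$   1, NEC
3. $\Box Q\to\Box(P\to(P\wedge Q))$   K, 2, MP
4. $\Box(P\to(P\wedge Q))\to(\Diamond P\to\Diamond(P\wedge Q))$   K$\Diamond$
5. $\Box Q\to(\Diamond P\to\Diamond(P\wedge Q))$   3, 4, PL (syllogism)
6. $\Diamond P\to(\Box Q\to\Diamond(P\wedge Q))$   5, PL (permutation)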
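Let's derive another helpful shortcut involving the $\Diamond$, the following "modal negation" (MN) theorem schemas:
$\vdash_K\ \sim\Box\phi\to\Diamond\sim\phi$      $\vdash_K\ \Diamond\sim\phi\to\sim\Box\phi$      (MN)
$\vdash_K\ \sim\Diamond\phi\to\Box\sim\phi$      $\vdash_K\ \Box\sim\phi\to\sim\Diamond\phi$
I'll prove one of these; the rest can be proved as exercises.
Example 6.12: Proof of $\sim\Box\phi\to\Diamond\sim\phi$, i.e. $\sim\Box\phi\to\sim\Box\sim\sim\phi$ (for any $\phi$):
1. $\sim\sim\phi\to\phi$   PL
2. $\Box\sim\sim\phi\to\Box\phi$   1, NEC, K, MP
3. $\sim\Box\phi\to\sim\Box\sim\sim\phi$   2, PL (contraposition)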
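The MN theorems let us "move" $\sim$s through strings of $\Box$s and $\Diamond$s.
Example 6.13: Show that $\vdash_K \Box\Diamond\Box\sim P\to\sim\Diamond\Box\Diamond P$:
1. $\Box\sim P\to\sim\Diamond P$   MN
2. $\Diamond\Box\sim P\to\Diamond\sim\Diamond P$   1, NEC, K$\Diamond$, MP
3. $\Diamond\sim\Diamond P\to\sim\Box\Diamond P$   MN
4. $\Diamond\Box\sim P\to\sim\Box\Diamond P$   2, 3, PL (syllogism)
5. $\Box\Diamond\Box\sim P\to\Box\sim\Box\Diamond P$   4, NEC, K, MP
6. $\Box\sim\Box\Diamond P\to\sim\Diamond\Box\Diamond P$   MN
7. $\Box\Diamond\Box\sim P\to\sim\Diamond\Box\Diamond P$   5, 6, PL (syllogism)
It's important to note, by the way, that this proof can't be shortened as follows:
1. $\Box\Diamond\Box\sim P\to\Box\Diamond\sim\Diamond P$   MN
2. $\Box\Diamond\sim\Diamond P\to\Box\sim\Box\Diamond P$   MN
3. $\Box\sim\Box\Diamond P\to\sim\Diamond\Box\Diamond P$   MN
4. $\Box\Diamond\Box\sim P\to\sim\Diamond\Box\Diamond P$   1, 2, 3, PL
Steps 1 and 2 of the latter proof are mistaken. The MN theorems say only that particular wffs are provable, whereas steps 1 and 2 attempt to apply MN to the insides of complex wffs.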
K is a very weak system. In it you can't prove anything interesting about iterated modalities, that is, sentences with strings of multiple modal operators. You can't even prove that necessity implies possibility. (We'll be able to establish facts of unprovability after section 6.5.) So it's unclear whether K represents any sort of necessity. Still, there's a point to K. K gives a minimal proof theory for the $\Box$: if $\Box$ is to represent any sort of necessity at all, it must obey at least K's axioms and rules. For on any sense of necessity, surely logical truths must be necessary; and surely, if both a conditional and its antecedent are necessary then its consequent must be necessary as well. (Think of the latter in terms of possible worlds: if $\phi\to\psi$ is true in all accessible worlds, and $\phi$ is true in all accessible worlds, then by modus ponens within each accessible world, $\psi$ must be true in all accessible worlds.)
So even if K doesn't itself represent any sort of necessity, K is well-suited to be the proof-theoretic basis for all the other systems we'll study. Each of those other systems will result from adding appropriate axioms to K. For example, to get system T we'll add each instance of $\Box\phi\to\phi$; and to get S4 we'll additionally add each instance of $\Box\phi\to\Box\Box\phi$. Thus, each of our systems will be extensions of K: every theorem of K is also a theorem of all the other systems (since each system differs from K only by containing additional axioms).
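The possible-worlds gloss on the K axiom can be checked mechanically on small models. The following is a minimal sketch, not part of the text: the tuple encoding of formulas and the function names are assumptions made for illustration, and exhausting all two-world models is a sanity check rather than a proof of validity. It confirms that an instance of the K schema holds everywhere in these models, whereas "necessity implies possibility" fails at a world that sees no worlds.

```python
from itertools import product

# Hypothetical encoding of MPL wffs as nested tuples:
#   "P"            sentence letter
#   ("not", f)     negation
#   ("imp", f, g)  conditional
#   ("box", f)     necessity
def dia(f):
    # The diamond is defined, as in the text, as not-box-not.
    return ("not", ("box", ("not", f)))

def true_at(f, w, access, val):
    """Truth of f at world w. access is a set of (w, v) pairs;
    val maps (letter, world) pairs to booleans."""
    if isinstance(f, str):
        return val[(f, w)]
    if f[0] == "not":
        return not true_at(f[1], w, access, val)
    if f[0] == "imp":
        return (not true_at(f[1], w, access, val)) or true_at(f[2], w, access, val)
    if f[0] == "box":
        return all(true_at(f[1], v, access, val) for (u, v) in access if u == w)
    raise ValueError(f)

def valid_in_all_small_models(f, letters, n_worlds=2):
    """True iff f is true at every world of every model with n_worlds worlds."""
    worlds = range(n_worlds)
    pairs = [(u, v) for u in worlds for v in worlds]
    keys = [(p, w) for p in letters for w in worlds]
    for bits in product([False, True], repeat=len(pairs)):
        access = {pr for pr, b in zip(pairs, bits) if b}
        for vals in product([False, True], repeat=len(keys)):
            val = dict(zip(keys, vals))
            if not all(true_at(f, w, access, val) for w in worlds):
                return False
    return True

k_instance = ("imp", ("box", ("imp", "P", "Q")),
              ("imp", ("box", "P"), ("box", "Q")))
box_implies_dia = ("imp", ("box", "P"), dia("P"))

print(valid_in_all_small_models(k_instance, ["P", "Q"]))   # True
print(valid_in_all_small_models(box_implies_dia, ["P"]))   # False
```

The second result is only a semantic fact; turning it into a fact about unprovability in K requires the soundness theorem.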
Exercise 6.4 Prove the remaining MN theorems.
Exercise 6.5 Give axiomatic proofs in K of the following wffs:
a)* 3(PQ)(3P3Q)
b) 2P2(PQ)
c)* 3(QR)2(QR)
d)** 2(PQ)(2P2Q)
e) [2(PQ) 2(PQ)] 3P
f) (2P2Q)2(PQ)
g)* 3(PQ)(2P3Q)
h) 3P(2Q3Q)
i) 332(PQ)223P
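6.4.2 System D
To get D we add to K a new axiom saying that "what's necessary is possible":
Axiomatic system D:
Rules: MP, NEC
Axioms: the PL1, PL2, PL3, and K schemas, plus the D-schema:
$\Box\phi\to\Diamond\phi$ (D)
In D it can be proved that tautologies are possible and contradictions are not necessary, as the next example and exercise 6.6a illustrate.
Example: show that $\vdash_D \Diamond(P\vee\sim P)$.
1. $P\vee\sim P$   PL
2. $\Box(P\vee\sim P)$   1, NEC
3. $\Box(P\vee\sim P)\to\Diamond(P\vee\sim P)$   D
4. $\Diamond(P\vee\sim P)$   2, 3, MP
One more example: show that $\vdash_D \Box\Box P\to\Box\Diamond P$.
1. $\Box P\to\Diamond P$   D
2. $\Box(\Box P\to\Diamond P)$   1, NEC
3. $\Box\Box P\to\Box\Diamond P$   2, K, MP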
Like K, system D is very weak. As we will see later, $\Box P\to P$ isn't a D-theorem. This is not a problem if the $\Box$ is to be given a deontic sense, since as we noted earlier, some things that ought to be, aren't. But anything that is metaphysically, naturally, or technologically necessary, for example, must be true. (If something is true in all metaphysically possible worlds, or all naturally possible worlds, or all technologically possible worlds, then surely it must be true in the actual world, and so must be plain old true.) So any system aspiring to represent these further sorts of necessity will need new axioms.
Exercise 6.6 Give axiomatic proofs in D of the following wffs:
a) 2(PP)
b) (2P2P)
c) 2[2(PQ) 2(PQ)]
6.4.3 System T
Here we drop the D-schema, and add all instances of the T-schema:
Axiomatic system T:
Rules: MP, NEC
Axioms: the PL1, PL2, PL3, and K schemas, plus the T-schema:
$\Box\phi\to\phi$ (T)
In section 6.4.1 we proved a theorem schema, K$\Diamond$, which was the analog for the $\Diamond$ of the K-axiom schema. Let's do the same thing here; let's prove a theorem schema T$\Diamond$, which is the analog for the $\Diamond$ of the T axiom schema:
$\phi\to\Diamond\phi$ (T$\Diamond$)
For any wff $\phi$, the following is a proof of $\phi\to\sim\Box\sim\phi$, i.e., $\phi\to\Diamond\phi$.
1. $\Box\sim\phi\to\sim\phi$   T
2. $\phi\to\sim\Box\sim\phi$   1, PL
So let's allow ourselves to write down instances of T$\Diamond$ in proofs.
Notice that instances of the D-axioms are now theorems ($\Box\phi\to\phi$ is a T axiom; $\phi\to\Diamond\phi$ is an instance of T$\Diamond$; $\Box\phi\to\Diamond\phi$ then follows by PL). Thus, T is an extension of D: every theorem of D remains a theorem of T.
Exercise 6.7 Give axiomatic proofs in T of the following wffs:
a) 32P3(PQ)
b)** [2P32(PQ)]3Q
c) 3(P2Q)(2P3Q)
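6.4.4 System B
We turn now to systems that say something distinctive about iterated modalities.
Axiomatic system B:
Rules: MP, NEC
Axioms: the PL1, PL2, PL3, K, and T schemas, plus the B-schema:
$\Diamond\Box\phi\to\phi$ (B)
Since we retain the T axiom schema, B is an extension of T (and hence of D, and K, of course, as well).
As with K and T, we can establish a theorem schema that is the analog for the $\Diamond$ of B's characteristic axiom schema.
$\phi\to\Box\Diamond\phi$ (B$\Diamond$)
For any $\phi$, we can prove $\phi\to\Box\Diamond\phi$ (i.e., $\phi\to\Box\sim\Box\sim\phi$, given the definition of the $\Diamond$) as follows:
1. $\sim\Box\sim\Box\sim\phi\to\sim\phi$   B (given the def of $\Diamond$)
2. $\phi\to\Box\sim\Box\sim\phi$   1, PL
Example: show that $\vdash_B [\Box P\wedge\Box\Diamond\Box(P\to Q)]\to\Box Q$.
1. $\Diamond\Box(P\to Q)\to(P\to Q)$   B
2. $\Box\Diamond\Box(P\to Q)\to\Box(P\to Q)$   1, NEC, K, MP
3. $\Box(P\to Q)\to(\Box P\to\Box Q)$   K
4. $\Box\Diamond\Box(P\to Q)\to(\Box P\to\Box Q)$   2, 3, PL (syllogism)
5. $[\Box P\wedge\Box\Diamond\Box(P\to Q)]\to\Box Q$   4, PL (import/export)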
Exercise 6.8 Give axiomatic proofs in B of the following wffs:
a) 32P3232P
b)** 22(P2P)2(P2P)
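6.4.5 System S4
S4 takes a different stand from B on iterated modalities:
Axiomatic system S4:
Rules: MP, NEC
Axioms: the PL1, PL2, PL3, K, and T schemas, plus the S4-schema:
$\Box\phi\to\Box\Box\phi$ (S4)
Both B and S4 are extensions of T; but neither is an extension of the other. (The nonlinearity here mirrors the nonlinearity of the diagram of semantic systems in section 6.3.2.) S4 contains the S4-schema but not the B-schema, whereas B contains the B-schema but not the S4-schema. As a result, some B-theorems are unprovable in S4, and some S4-theorems are unprovable in B.
As before, we have a theorem schema that is the analog for the $\Diamond$ of the S4 axiom schema:
$\Diamond\Diamond\phi\to\Diamond\phi$ (S4$\Diamond$)
I'll prove it by proving its definitional equivalent, $\sim\Box\sim\sim\Box\sim\phi\to\sim\Box\sim\phi$:
1. $\Box\sim\phi\to\Box\Box\sim\phi$   S4
2. $\Box\sim\phi\to\sim\sim\Box\sim\phi$   PL
3. $\Box\Box\sim\phi\to\Box\sim\sim\Box\sim\phi$   2, NEC, K, MP
4. $\Box\sim\phi\to\Box\sim\sim\Box\sim\phi$   1, 3, PL (syllogism)
5. $\sim\Box\sim\sim\Box\sim\phi\to\sim\Box\sim\phi$   4, PL (contraposition)
Example: show that $\vdash_{S4} (\Diamond P\wedge\Box Q)\to\Diamond(P\wedge\Box Q)$. This problem is reasonably difficult. Here's my approach. We know from example 6.10 how to prove things of the form $\Box\phi\to(\Diamond\psi\to\Diamond\chi)$, provided we can prove the conditional $\phi\to(\psi\to\chi)$. Now, this technique won't help directly with the formula we're after, since we can't prove the conditional $Q\to(P\to(P\wedge\Box Q))$. But we can use this technique to prove something related to the formula we're after: $\Box\Box Q\to(\Diamond P\to\Diamond(P\wedge\Box Q))$ (since the conditional $\Box Q\to(P\to(P\wedge\Box Q))$ is an MPL-tautology). This thought inspires the following proof:
1. $\Box Q\to(P\to(P\wedge\Box Q))$   PL
2. $\Box\Box Q\to\Box(P\to(P\wedge\Box Q))$   1, NEC, K, MP
3. $\Box(P\to(P\wedge\Box Q))\to(\Diamond P\to\Diamond(P\wedge\Box Q))$   K$\Diamond$
4. $\Box\Box Q\to(\Diamond P\to\Diamond(P\wedge\Box Q))$   2, 3, PL (syllogism)
5. $\Box Q\to\Box\Box Q$   S4
6. $(\Diamond P\wedge\Box Q)\to\Diamond(P\wedge\Box Q)$   4, 5, PL (syll., import-export)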
Exercise 6.9 Give axiomatic proofs in S4 of the following wffs:
a) 2P232P
b) 2323P23P
c) 32P3232P
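6.4.6 System S5
Our final system, S5, takes the strongest stand on iterated modalities. It results from adding to T the S5 schema:
Axiomatic system S5:
Rules: MP, NEC
Axioms: the PL1, PL2, PL3, K, and T schemas, plus the S5-schema:
$\Diamond\Box\phi\to\Box\phi$ (S5)
The analog of the S5-schema for the $\Diamond$ is:
$\Diamond\phi\to\Box\Diamond\phi$ (S5$\Diamond$)
We can prove $\Diamond\phi\to\Box\Diamond\phi$, i.e., $\sim\Box\sim\phi\to\Box\sim\Box\sim\phi$, as follows:
1. $\sim\Box\sim\Box\sim\phi\to\Box\sim\phi$   S5 (def of $\Diamond$)
2. $\sim\Box\sim\phi\to\Box\sim\Box\sim\phi$   1, PL
Notice that we didn't include the B and S4 schemas as axiom schemas of S5. Nevertheless, all their instances are theorems of S5 (so we can still appeal to them in proofs). Any instance of the B schema, $\Diamond\Box\phi\to\phi$, follows immediately via PL from an S5 axiom $\Diamond\Box\phi\to\Box\phi$ and a T axiom $\Box\phi\to\phi$. As for the S4 schema, the following proof uses B$\Diamond$, which is a theorem of B and hence of S5.
1. $\Box\phi\to\Box\Diamond\Box\phi$   B$\Diamond$
2. $\Diamond\Box\phi\to\Box\phi$   S5
3. $\Box\Diamond\Box\phi\to\Box\Box\phi$   2, NEC, K, MP
4. $\Box\phi\to\Box\Box\phi$   1, 3, PL (syllogism)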
Exercise 6.10 Give axiomatic proofs in S5 of the following wffs:
a) (2P3Q)2(P3Q)
b) 3(P3Q)(3P3Q)
c)** 2(2P2Q) 2(2Q2P)
d) 2[2(3PQ)2(P2Q)]
6.4.7 Substitution of equivalents and modal reduction
Let's conclude our discussion of provability in modal logic by proving two simple meta-theorems. The first, substitution of equivalents, says roughly that you can substitute provably equivalent wffs within complex wffs. More carefully: call two wffs "$\alpha/\beta$ variants" iff they differ only in that in zero or more places, wff $\alpha$ occurs in one where wff $\beta$ occurs in the other. Thus, you can turn one into the other by changing (zero or more) $\alpha$s to $\beta$s or $\beta$s to $\alpha$s. (For example, $P\to(Q\to P)$ and $\sim S\to(Q\to\sim S)$ are $P/\sim S$ variants, as are $P\to(Q\to P)$ and $\sim S\to(Q\to P)$.)
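The definition of variants is recursive in spirit, and a short sketch may help fix ideas. The encoding below (sentence letters as strings, complex wffs as tuples headed by "not", "imp" or "box") and the name `is_variant` are assumptions for illustration, not anything from the text. Two wffs count as variants when they are identical, when one is the first wff and the other the second, or when they share a main connective and their corresponding parts are variants.

```python
def is_variant(f, g, a, b):
    """True iff f and g differ only in that, in zero or more places,
    wff a occurs in one where wff b occurs in the other."""
    if f == g:
        return True                      # zero places changed
    if (f == a and g == b) or (f == b and g == a):
        return True                      # the whole wff was swapped
    # Otherwise any differences must lie inside matching parts.
    if (isinstance(f, tuple) and isinstance(g, tuple)
            and f[0] == g[0] and len(f) == len(g)):
        return all(is_variant(x, y, a, b) for x, y in zip(f[1:], g[1:]))
    return False

alpha = "P"
beta = ("not", "S")
f = ("imp", "P", ("imp", "Q", "P"))
g_all = ("imp", beta, ("imp", "Q", beta))   # both occurrences changed
g_one = ("imp", beta, ("imp", "Q", "P"))    # only the first changed

print(is_variant(f, g_all, alpha, beta))                          # True
print(is_variant(f, g_one, alpha, beta))                          # True
print(is_variant(f, ("imp", "R", ("imp", "Q", "P")), alpha, beta))  # False
```

The three clauses of the function line up with the case split used in the inductive proof of substitution of equivalents.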
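Substitution of equivalents: Where S is any of our modal systems, if $\vdash_S \alpha\leftrightarrow\beta$, then $\vdash_S \chi\leftrightarrow\chi'$ for any $\alpha/\beta$ variants $\chi$ and $\chi'$
Proof. Suppose $\vdash_S \alpha\leftrightarrow\beta$. I'll argue by induction that the following holds for any wff, $\chi$:
$\vdash_S \chi\leftrightarrow\chi'$, for any $\alpha/\beta$ variant $\chi'$ of $\chi$
Base case: here $\chi$ is a sentence letter. Let $\chi'$ be any $\alpha/\beta$ variant of $\chi$. If $\chi$ is neither $\alpha$ nor $\beta$ then $\chi'$ is just $\chi$ itself. If on the other hand $\chi$ is either $\alpha$ or $\beta$ then $\chi'$ is either $\alpha$ or $\beta$. Either way, we have one of the following cases: $\chi'=\chi$, or $\chi=\alpha$ and $\chi'=\beta$, or $\chi=\beta$ and $\chi'=\alpha$. Since $\vdash_S \alpha\leftrightarrow\beta$ and S includes PL, $\vdash_S (\chi\leftrightarrow\chi')$ in each case.
Induction case: Now we assume the inductive hypothesis, that wffs $\chi_1$ and $\chi_2$ obey the theorem:
$\vdash_S \chi_1\leftrightarrow\chi_1'$, for any $\alpha/\beta$ variant $\chi_1'$ of $\chi_1$
$\vdash_S \chi_2\leftrightarrow\chi_2'$, for any $\alpha/\beta$ variant $\chi_2'$ of $\chi_2$
We must show that the theorem holds for $\sim\chi_1$, $\chi_1\to\chi_2$, and $\Box\chi_1$.
Take the first case. We must show that the theorem holds for $\sim\chi_1$, i.e., that $\vdash_S \sim\chi_1\leftrightarrow\phi$, for any $\alpha/\beta$ variant $\phi$ of $\sim\chi_1$. Suppose first that $\phi$ has the form $\sim\chi_1'$, where $\chi_1'$ is an $\alpha/\beta$ variant of $\chi_1$. By the inductive hypothesis, $\vdash_S \chi_1\leftrightarrow\chi_1'$; since S includes PL, $\vdash_S \sim\chi_1\leftrightarrow\sim\chi_1'$, i.e., $\vdash_S \sim\chi_1\leftrightarrow\phi$. If, on the other hand, $\phi$ does not have the form $\sim\chi_1'$ for some $\alpha/\beta$ variant $\chi_1'$ of $\chi_1$, then $\phi$ must result from changing the whole of $\sim\chi_1$ from $\alpha$ to $\beta$ or from $\beta$ to $\alpha$. Thus, each of $\sim\chi_1$ and $\phi$ must be either $\alpha$ or $\beta$. But then, as in the base case, $\vdash_S \sim\chi_1\leftrightarrow\phi$.
I leave the remaining cases as an exercise.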
The following examples illustrate the power of substitution of equivalents. First, in our discussion of K we proved the following two theorems:
$\Box(P\wedge Q)\to(\Box P\wedge\Box Q)$
$(\Box P\wedge\Box Q)\to\Box(P\wedge Q)$
Hence (by PL), $\Box(P\wedge Q)\leftrightarrow(\Box P\wedge\Box Q)$ is a K-theorem. Given substitution of equivalents, whenever we prove a theorem in which the formula $\Box(P\wedge Q)$ occurs as a subformula, we can infer that the result of changing $\Box(P\wedge Q)$ to $\Box P\wedge\Box Q$ is also a K-theorem, without having to do a separate proof.
Second, given the modal negation theorems, we know that all instances of the following schemas are theorems of K (and hence of every other system):
$\Box\sim\phi\leftrightarrow\sim\Diamond\phi$      $\Diamond\sim\phi\leftrightarrow\sim\Box\phi$
Call these the "duals equivalences". [11] Given the duals equivalences, we can swap $\sim\Diamond\phi$ and $\Box\sim\phi$, or $\sim\Box\phi$ and $\Diamond\sim\phi$, within any theorem, and the result will also be a theorem. So we can easily move a $\sim$ through a series of modal operators. For example, it's easy to show that each of the following is a theorem of each system S:
$\Diamond\Diamond\Box\sim\phi\leftrightarrow\Diamond\Diamond\Box\sim\phi$ (1)
$\Diamond\Diamond\sim\Diamond\phi\leftrightarrow\Diamond\Diamond\Box\sim\phi$ (2)
$\Diamond\sim\Box\Diamond\phi\leftrightarrow\Diamond\Diamond\Box\sim\phi$ (3)
$\sim\Box\Box\Diamond\phi\leftrightarrow\Diamond\Diamond\Box\sim\phi$ (4)
(1) is a theorem of S, since it has the form $\psi\leftrightarrow\psi$. (2) is the result of changing $\Box\sim\phi$ on the left of (1) to $\sim\Diamond\phi$. Since (1) is a theorem of S, (2) is also a theorem of S, by substitution of equivalents via a duals equivalence. We then obtain (3) by changing $\Diamond\sim\Diamond\phi$ in (2) to $\sim\Box\Diamond\phi$; by substitution of equivalents via a duals equivalence, this too is a theorem of S. Finally, (4) follows from (3) and a duals equivalence by PL, so it too is a theorem of S. (Note how much easier this is than example 6.13!)
[11] Given the duals equivalences, $\Box$ relates to $\Diamond$ the way $\forall$ relates to $\exists$ (since $\forall x\sim\phi\leftrightarrow\sim\exists x\phi$ and $\exists x\sim\phi\leftrightarrow\sim\forall x\phi$ are logical truths). This shared relationship is called "duality"; $\Box$ and $\Diamond$ are said to be duals, as are $\forall$ and $\exists$. The duality of $\Box$ and $\Diamond$ would be neatly explained by a metaphysics according to which necessity just is truth in all worlds and possibility just is truth in some worlds!
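Our second meta-theorem concerns only system S5: [12]
Modal reduction theorem for S5: Where $O_1\ldots O_n$ are modal operators and $\phi$ is a wff:
$\vdash_{S5} O_1\ldots O_n\phi\leftrightarrow O_n\phi$
Intuitively: a string of modal operators always boils down to the innermost operator. For example, $\Box\Box\Diamond\Box\Diamond\Box\Box\Diamond\Box\Diamond\Box\Diamond\phi$ boils down to $\Diamond\phi$; that is, the following is a theorem of S5: $\Box\Box\Diamond\Box\Diamond\Box\Box\Diamond\Box\Diamond\Box\Diamond\phi\leftrightarrow\Diamond\phi$.
Proof. The following equivalences are all theorems of S5:
$\Diamond\Box\phi\leftrightarrow\Box\phi$ (a)
$\Box\Box\phi\leftrightarrow\Box\phi$ (b)
$\Box\Diamond\phi\leftrightarrow\Diamond\phi$ (c)
$\Diamond\Diamond\phi\leftrightarrow\Diamond\phi$ (d)
The left-to-right direction of (a) is just S5; the right-to-left is T$\Diamond$; (b) is T and S4; (c) is T and S5$\Diamond$; and (d) is S4$\Diamond$ and T$\Diamond$. Now consider $O_1O_2\ldots O_n\phi$. Depending on which two modal operators $O_1$ and $O_2$ are, one of (a)-(d) tells us that $\vdash_{S5} O_1O_2\ldots O_n\phi\leftrightarrow O_2\ldots O_n\phi$. Repeating this process $n-1$ times, we have $\vdash_{S5} O_1\ldots O_n\phi\leftrightarrow O_n\phi$. (It is straightforward to convert this argument into a more rigorous inductive proof.)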
Exercise 6.11 Finish the proof of substitution of equivalents.
[12] The modal reduction formula, the duals equivalences, and substitution of equivalents together let us reduce strings of operators that include $\sim$s as well as modal operators. Simply use the duals equivalences to drive any $\sim$s in the string to the far right hand side, then use the modal reduction theorem to eliminate all but the innermost modal operator.
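The two-stage recipe just described (drive the negations to the right, then keep only the innermost operator) is easy to mechanize on operator prefixes. The sketch below is illustrative only: writing a prefix as a string over '~', 'B' (box) and 'D' (diamond), and the function names, are assumptions rather than notation from the text. Cancelling a double negation relies on PL together with substitution of equivalents.

```python
def drive_negations_right(prefix):
    """Push every '~' in a prefix over '~', 'B' (box), 'D' (diamond)
    to the far right, using the duals equivalences
    ~B = D~ and ~D = B~ (valid in every system)."""
    flip = {"B": "D", "D": "B"}
    negated = False          # parity of the negations passed so far
    ops = []
    for c in prefix:
        if c == "~":
            negated = not negated
        elif c in flip:
            # A pending negation flips the operator it moves across.
            ops.append(flip[c] if negated else c)
        else:
            raise ValueError(f"unexpected symbol: {c!r}")
    return "".join(ops) + ("~" if negated else "")

def reduce_s5(prefix):
    """S5-equivalent prefix with at most one modal operator."""
    ops = drive_negations_right(prefix)
    modal = ops.rstrip("~")
    tail = ops[len(modal):]
    # Modal reduction theorem: only the innermost operator survives.
    return (modal[-1] if modal else "") + tail

print(drive_negations_right("~BBD"))   # DDB~
print(reduce_s5("~BBD"))               # B~
print(reduce_s5("BBDBDBBDBDBD"))       # D
```

Only `reduce_s5` depends on S5; `drive_negations_right` uses nothing beyond the duals equivalences.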
6.5 Soundness in MPL
We have defined twelve logical systems: six semantic systems and six axiomatic systems. But each semantic system was paired with an axiomatic system of the same name. The time has come to justify this nomenclature. In this section and the next, we'll show that for each semantic system, exactly the same wffs are counted valid in that system as are counted theorems by the axiomatic system of the same name. That is, for each of our systems, S (for S = K, D, T, B, S4, and S5), we will prove soundness and completeness:
S-soundness: every S-theorem is S-valid
S-completeness: every S-valid formula is an S-theorem
Our study of modal logic has been in reverse historical order. We began with semantics, because that is the more intuitive approach. Historically (as we noted earlier), the axiomatic systems came first, in the work of C. I. Lewis. Given the uncertainty over which axioms to choose, modal logic was in disarray. The discovery by the teenaged Saul Kripke in the late 1950s of the possible-worlds semantics we studied in section 6.3, and of the correspondence between simple constraints (reflexivity, transitivity, and so on) on the accessibility relation in his models and Lewis's axiomatic systems, transformed modal logic.
The soundness and completeness theorems have practical as well as theoretical value. First, once we've proved soundness, we'll have a method for showing that formulas are not theorems. We already know from section 6.3.3 how to establish invalidity (by constructing countermodels), and the soundness theorem tells us that an invalid wff is not a theorem. Second, once we've proved completeness, if we want to know that a given formula is a theorem, rather than constructing an axiomatic proof we can instead construct a semantic validity proof, which is much easier.
Let's begin with soundness. We're going to prove a general theorem, which we'll use in several soundness proofs. First we'll need a piece of terminology. Where $\Gamma$ is any set of modal wffs, let's call "K+$\Gamma$" the axiomatic system that consists of the same rules of inference as K (MP and NEC), and which has as axioms the axioms of K (instances of the K- and PL- schemas), plus the members of $\Gamma$. Here, then, is the theorem:
Theorem 6.1 If $\Gamma$ is any set of modal wffs and $\mathcal{M}$ is an MPL-model in which each wff in $\Gamma$ is valid, then every theorem of K+$\Gamma$ is valid in $\mathcal{M}$.
Modal systems of the form K+$\Gamma$ are commonly called normal. Normal modal systems contain all the K-theorems, plus possibly more. What Theorem 6.1 gives us is a method for constructing a soundness proof for any normal system. Since all the systems we have studied here (K, D, etc.) are normal, this method is sufficiently general for us. Here's how the method works for system T. System T has the same rules of inference as K, and its axioms are all the axioms of K, plus the instances of the T-schema. In the "K+$\Gamma$" notation, therefore, T = K + $\{\Box\phi\to\phi : \phi$ is an MPL wff$\}$. To establish soundness for T, all we need to do is show that every instance of the T-schema is valid in all reflexive models; for we may then conclude by Theorem 6.1 that every theorem of T is valid in all reflexive models. This method can be applied to each of our systems: for each system, S, to establish S's soundness it will suffice to show that S's extra-K axioms are valid in all S-models.
Theorem 6.1 follows from two lemmas we will need to prove:
Lemma 6.2 All instances of the PL and K axiom schemas are valid in all MPL-models
Lemma 6.3 For every MPL-model, $\mathcal{M}$, MP and NEC preserve validity in $\mathcal{M}$
Proof of Theorem 6.1 from the lemmas. Assume that every wff in $\Gamma$ is valid in a given MPL-model $\mathcal{M}$. Any K+$\Gamma$-proof is a series of wffs in which each line is either an axiom of K+$\Gamma$, or follows from earlier lines in the proof by MP or NEC. Now, axioms of K+$\Gamma$ are either PL axioms, K axioms, or members of $\Gamma$. By Lemma 6.2, PL and K axioms are valid in all MPL-models, and so are valid in $\mathcal{M}$; and members of $\Gamma$ are valid in $\mathcal{M}$ by hypothesis. So all axioms in the proof are valid in $\mathcal{M}$. Moreover, by Lemma 6.3, MP and NEC preserve validity in $\mathcal{M}$. Therefore, by induction, every line in every K+$\Gamma$-proof is valid in $\mathcal{M}$. Hence every theorem of K+$\Gamma$ is valid in $\mathcal{M}$.
We now need to prove the lemmas. I'll prove half of Lemma 6.2, and leave the other half as an exercise.
Proof that PL axioms are valid in all MPL-models. From our proof of soundness for PL (section 2.7), we know that the PL truth tables generate the value 1 for each PL axiom, no matter what truth value its immediate constituents have. But here in MPL, the truth values of conditionals and negations are determined at a given world by the truth values at that world of their immediate constituents via the PL truth tables. So any PL axiom must have truth value 1 at any world, regardless of what truth values its immediate constituents have. PL-axioms, therefore, are true at every world in every model, and so are valid in every model. I'll leave the proof that every K axiom is valid in every MPL-model as an exercise.
Exercise 6.12 Show that every K-axiom is valid in every MPL-model.
Exercise 6.13 Prove Lemma 6.3, i.e., that for any MPL-model $\mathcal{M}$, if the inputs to either MP or NEC are valid in $\mathcal{M}$, then that rule's output is also valid in $\mathcal{M}$.
6.5.1 Soundness of K
We can now construct soundness proofs for the individual systems. I'll do this for some of the systems, and leave the verification of soundness for the other systems as exercises.
First K. In the "K+$\Gamma$" notation, K is just K+$\emptyset$, and so it follows immediately from Theorem 6.1 that every theorem of K is valid in every MPL-model. So K is sound.
6.5.2 Soundness of T
T is K+$\Gamma$, where $\Gamma$ is the set of all instances of the T-schema. So, given Theorem 6.1, to show that every theorem of T is valid in all T-models, it suffices to show that all instances of the T-schema are valid in all T-models. Assume for reductio that $V(\Box\phi\to\phi, w)=0$ for some world $w$ in some T-model (i.e., some model with a reflexive accessibility relation). So $V(\Box\phi, w)=1$ and $V(\phi, w)=0$. By reflexivity, $\mathcal{R}ww$, and so $V(\phi, w)=1$; contradiction.
6.5.3 Soundness of B
B is K+$\Gamma$, where $\Gamma$ is the set of all instances of the T- and B- schemas. Given Theorem 6.1, it suffices to show that every instance of the B-schema and every instance of the T-schema is valid in every B-model. Let $\mathcal{M}$ be any B-model and $w$ be any world in that model; we must show that all instances of the T- and B-schemas are true at $w$ in $\mathcal{M}$. The proof of the previous section shows that the T-axioms are true at $w$ (since $\mathcal{M}$'s accessibility relation is reflexive). Now for the B-axioms. Assume for reductio that $V(\Diamond\Box\phi\to\phi, w)=0$. So $V(\Diamond\Box\phi, w)=1$ and $V(\phi, w)=0$. Given the former, $V(\Box\phi, v)=1$, for some $v$ such that $\mathcal{R}wv$; by symmetry, $\mathcal{R}vw$; so $V(\phi, w)=1$, contradicting the latter.
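The two reductio arguments above can be cross-checked by brute force on small frames. This is a sketch under stated assumptions, not a replacement for the proofs: it identifies the truth set of $\phi$ with an arbitrary set of worlds (so every instance of a schema is covered on a given frame), it only looks at frames with three worlds, and all function names are hypothetical.

```python
from itertools import combinations, product

def subsets(worlds):
    ws = list(worlds)
    return [frozenset(c) for r in range(len(ws) + 1)
            for c in combinations(ws, r)]

def box(X, worlds, R):
    # Worlds all of whose accessible worlds lie in X.
    return frozenset(w for w in worlds
                     if all(v in X for v in worlds if (w, v) in R))

def dia(X, worlds, R):
    # Worlds that can see at least one world in X.
    return frozenset(w for w in worlds
                     if any(v in X for v in worlds if (w, v) in R))

def frames(n):
    """Every accessibility relation on the worlds 0..n-1."""
    pairs = [(u, v) for u in range(n) for v in range(n)]
    for bits in product([False, True], repeat=len(pairs)):
        yield frozenset(p for p, b in zip(pairs, bits) if b)

def reflexive(worlds, R):
    return all((w, w) in R for w in worlds)

def symmetric(worlds, R):
    return all((v, u) in R for (u, v) in R)

def t_schema_holds(worlds, R):
    # box(phi) -> phi is true everywhere iff box(X) is a subset of X.
    return all(box(X, worlds, R).issubset(X) for X in subsets(worlds))

def b_schema_holds(worlds, R):
    # dia(box(phi)) -> phi is true everywhere iff dia(box(X)) is a subset of X.
    return all(dia(box(X, worlds, R), worlds, R).issubset(X)
               for X in subsets(worlds))

worlds = range(3)
print(all(t_schema_holds(worlds, R) for R in frames(3) if reflexive(worlds, R)))  # True
print(all(b_schema_holds(worlds, R) for R in frames(3) if symmetric(worlds, R)))  # True
print(all(t_schema_holds(worlds, R) for R in frames(3)))                          # False
```

The last line shows why the reflexivity hypothesis matters: on the empty relation, every wff is vacuously necessary, so the T-schema fails.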
Exercise 6.14 Prove soundness for systems D, S4, and S5.
Exercise 6.15 Consider the system K5 that results from adding to K all instances of the S5 schema (i.e., S5 minus the T schema). Let K5 models be understood as MPL models whose accessibility relation is euclidean: for any worlds $w$, $u$, $v$, if $\mathcal{R}wu$ and $\mathcal{R}wv$ then $\mathcal{R}uv$. Establish soundness for K5.
6.6 Completeness in MPL
Next, completeness: for each system, we'll show that every valid formula is a theorem. As with soundness, most of the work will go into developing some general-purpose machinery. At the end we'll use the machinery to construct completeness proofs for each system. (As in section 2.9, we'll be constructing a proof of the Henkin variety.)
For each of our systems, we're going to show how to construct a certain special model, the canonical model for that system. The canonical model for a system, S, will be shown to have the following feature:
If a formula is valid in the canonical model for S, then it is a theorem of S
This sufficient condition for theoremhood can then be used to give completeness proofs, as the following example brings out. Suppose we can demonstrate that the accessibility relation in the canonical model for T is reflexive. Then, since T-valid formulas are by definition true in every world in every model with a reflexive accessibility relation, we know that every T-valid formula is valid in the canonical model for T. But then the displayed statement above tells us that every T-valid formula is a theorem of T. So we would have established completeness for T.
The trick for constructing canonical models will be to let the worlds in these models be sets of formulas (remember, worlds are allowed to be anything we like). In particular, a world will be the set of formulas true at that world. Working out this idea will occupy us for some time.
6.6.1 Definition of canonical models
If we want to use sets of wffs as the worlds in canonical models, and if a world is to be the set of wffs true at that world, then we can't use just any old set of wffs. It's part of the definition of a valuation function that for any wff $\phi$ and any world $w$, either $\phi$ or $\sim\phi$ is true at $w$. That means that any set of wffs that we're going to call a world had better contain either $\phi$ or $\sim\phi$. Moreover, we'd better not let such a set contain both $\phi$ and $\sim\phi$, since a wff can't be both true and false at a world. This suggests that we might try using the maximal consistent sets of wffs introduced in section 2.9.1.
As before, a maximal set is defined as one that contains, for each wff (now: each MPL-wff), either it or its negation. But the definition of consistency needs to be modified a bit. Consistency was defined in section 2.9.1 in terms of provability in PL; here we will define a notion of S-consistency, in terms of provability in system S, for each of our modal systems. Further, the section 2.9.1 definition made use of the notion of provability from a set of premises; but we've been avoiding speaking of provability from premise sets in modal logic since the rule of necessitation is appropriate only when applied to theorems. What I'll do is introduce a new notion of provability from a set, and in terms of this new notion retain the earlier definition of consistency:
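New definition of S-provability-from: A wff $\phi$ is provable in system S from a set $\Gamma$ ("$\Gamma\vdash_S\phi$") iff for some $\gamma_1\ldots\gamma_n\in\Gamma$, $\vdash_S (\gamma_1\wedge\cdots\wedge\gamma_n)\to\phi$
Definition of S-consistency: A set of wffs $\Gamma$ is S-inconsistent iff $\Gamma\vdash_S\bot$. $\Gamma$ is S-consistent iff it is not S-inconsistent
In the definition of S-provability from, understand "$(\gamma_1\wedge\cdots\wedge\gamma_n)\to\phi$" to be $\gamma_1\to\phi$ if $n=1$ and $\phi$ if $n=0$ (the latter case is for when $\Gamma$ is empty; thus, $\emptyset\vdash_S\phi$ iff $\vdash_S\phi$). $\bot$, remember, is defined as the wff $\sim(P\to P)$.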
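The conventions for n = 0 and n = 1 can be made explicit by writing out the wff whose theoremhood the definition asks about. In this small sketch the tuple encoding, the "and" constructor (conjunction is a defined connective in the text), the left-nested grouping of conjuncts, and the name `provability_conditional` are all assumptions for illustration.

```python
def provability_conditional(gammas, phi):
    """The wff that must be an S-theorem for phi to be provable from
    premises gammas (a list), following the n = 0 and n = 1 conventions."""
    if not gammas:
        return phi                       # n = 0: just phi itself
    conj = gammas[0]
    for g in gammas[1:]:
        conj = ("and", conj, g)          # grouping chosen arbitrarily
    return ("imp", conj, phi)            # n = 1 gives gamma_1 -> phi

print(provability_conditional([], "P"))
print(provability_conditional(["A"], "P"))
print(provability_conditional(["A", "B", "C"], "P"))
```

Because provability-from is defined in terms of a single theorem of S, the rule of necessitation is never applied to a premise, which is what lets the deduction theorem survive.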
Given these definitions, we can now define canonical models. It may not be fully clear at this point why the definition is phrased as it is. For now, take it on faith that the definition will get us where we want to go.
Definition of canonical model: The canonical model for system S is the MPL-model $\langle\mathcal{W},\mathcal{R},\mathcal{I}\rangle$, where:
$\mathcal{W}$ is the set of all maximal S-consistent sets of wffs
$\mathcal{R}ww'$ iff $\Box^-(w)\subseteq w'$
$\mathcal{I}(\alpha, w)=1$ iff $\alpha\in w$, for each sentence letter $\alpha$ and each $w\in\mathcal{W}$
$\Box^-(\Delta)$ is defined as the set of wffs $\phi$ such that $\Box\phi$ is a member of $\Delta$
Let's think for a bit about this definition. As promised, we have defined the members of $\mathcal{W}$ to be maximal S-consistent sets of wffs. And note that all maximal S-consistent sets of wffs are included in $\mathcal{W}$.
Accessibility is defined using the "$\Box^-$" notation. Think of this operation as "stripping off the boxes": to arrive at $\Box^-(\Delta)$ (the "box-strip" of set $\Delta$), begin with set $\Delta$, discard any formula that doesn't begin with a $\Box$, line up the remaining formulas, and then strip one $\Box$ off of the front of each. For example, the box-strip of set $\{P\to Q, \Box\sim R, \sim\Box Q, \Box\Box(P\to\Box P)\}$ is the set $\{\sim R, \Box(P\to\Box P)\}$. The definition of accessibility, therefore, says that $\mathcal{R}ww'$ iff for each wff $\Box\phi$ that is a member of $w$, the wff $\phi$ is a member of $w'$.
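The box-strip operation and the accessibility test it induces can be sketched in a few lines. The encoding is an assumption for illustration (letters as strings; complex wffs as tuples headed by "not", "imp" or "box"), as are the names `box_strip` and `accessible`. Genuine worlds of a canonical model are infinite maximal consistent sets, so the finite sets here only illustrate the definitions.

```python
def box_strip(delta):
    """The set of wffs phi such that box-phi is a member of delta."""
    return {f[1] for f in delta if isinstance(f, tuple) and f[0] == "box"}

def accessible(w, w_prime):
    """w_prime is accessible from w iff the box-strip of w is a subset of it."""
    return box_strip(w).issubset(w_prime)

delta = {
    ("imp", "P", "Q"),                          # P -> Q: discarded
    ("box", ("not", "R")),                      # box ~R: yields ~R
    ("not", ("box", "Q")),                      # ~box Q: discarded
    ("box", ("box", ("imp", "P", ("box", "P")))),
}
stripped = box_strip(delta)
print(stripped == {("not", "R"), ("box", ("imp", "P", ("box", "P")))})  # True
print(accessible(delta, {("not", "R"), ("box", ("imp", "P", ("box", "P"))), "Q"}))  # True
print(accessible(delta, {("not", "R")}))                                # False
```

Only a single box is stripped from each formula, which is why the doubly boxed member contributes a formula that still begins with a box.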
The definition of accessibility in the canonical model says nothing about formal properties like transitivity, reflexivity, and so on. As a result, it is not true by definition that the canonical model for S is an S-model. T-models, for example, must have reflexive accessibility relations, whereas the definition of the accessibility relation in the canonical model for T says nothing about reflexivity. As we will soon see, for each of the systems S that we have introduced in this book, the canonical model for S turns out to be an S-model. But this fact must be proven; it's not built into the definition of a canonical model.
An atomic wff (sentence letter) is defined to be true at a world in the canonical model iff it is a member of that world. Thus, for atomic wffs, truth and membership coincide. What we really need to know, however, is that truth and membership coincide for all wffs, including complex wffs. Proving this is the biggest part of establishing completeness, and will take a while.
6.6.2 Facts about maximal consistent sets
In section 2.9 we proved various results about maximal consistent sets of PL-wffs, where "consistency" was defined in terms of provability in PL. Here, we're going to need to know, among other things, that analogous results hold for maximal S-consistent sets of MPL-wffs:
Theorem 6.4 If $\Delta$ is an S-consistent set of MPL-wffs, then there exists some maximal S-consistent set of MPL-wffs, $\Gamma$, such that $\Delta\subseteq\Gamma$
Lemma 6.5 Where $\Gamma$ is any maximal S-consistent set of MPL-wffs:
6.5a for any MPL-wff $\phi$, exactly one of $\phi$, $\sim\phi$ is a member of $\Gamma$
6.5b $\phi\to\psi\in\Gamma$ iff either $\phi\notin\Gamma$ or $\psi\in\Gamma$
Proof. A look back at the proofs of theorem 2.3 and lemma 2.4 reveals that the only features of the relation of provability-in-PL-from-a-set on which they depend are the following:
if $\Gamma\vdash_{PL}\phi$ then $\gamma_1\ldots\gamma_n\vdash_{PL}\phi$, for some $\gamma_1\ldots\gamma_n\in\Gamma$ (or else $\vdash_{PL}\phi$) (lemma 2.1)
"Excluded middle MP": $\phi\to\psi,\ \sim\phi\to\psi\vdash_{PL}\psi$
"ex falso quodlibet": $\phi,\ \sim\phi\vdash_{PL}\psi$
modus ponens: $\phi,\ \phi\to\psi\vdash_{PL}\psi$
"negated conditional": $\sim(\phi\to\psi)\vdash_{PL}\phi$ and $\sim(\phi\to\psi)\vdash_{PL}\sim\psi$
if $\phi\in\Gamma$ then $\Gamma\vdash_{PL}\phi$
Cut for PL
The deduction theorem for PL
(I invite the reader to go back and verify this.) So if the relation of provability-from-a-set in modal system S also has these features, then one can give exactly analogous proofs of theorem 6.4 and lemma 6.5. And this is indeed the case, as may easily be verified, since each modal system is an axiomatic proof system whose axioms include the PL axiom schemas and whose rules include MP. The one sticking point is the deduction theorem. As we pointed out in section 6.4.1, the deduction theorem fails for our modal systems if provability-from-a-set is understood in the usual way. But we are not understanding provability-from-a-set in the usual way; and given our new definition of provability-from-a-set, the deduction theorem holds:
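Deduction theorem for MPL: For each of our modal systems S (and given our new definition of provability from a set), if $\Gamma\cup\{\phi\}\vdash_S\psi$ then $\Gamma\vdash_S\phi\to\psi$
Proof. Suppose $\Gamma\cup\{\phi\}\vdash_S\psi$. So for some $\alpha_1\ldots\alpha_n$, $\vdash_S (\alpha_1\wedge\cdots\wedge\alpha_n)\to\psi$, where perhaps one of the $\alpha_i$s is $\phi$ and the others are members of $\Gamma$. If $\phi$ is one of the $\alpha_i$s, say $\alpha_k$, then $(\alpha_1\wedge\cdots\wedge\alpha_{k-1}\wedge\alpha_{k+1}\wedge\cdots\wedge\alpha_n)\to(\phi\to\psi)$ is an MPL-tautological consequence of $(\alpha_1\wedge\cdots\wedge\alpha_n)\to\psi$, and so is a theorem of S, whence $\Gamma\vdash_S\phi\to\psi$. And if none of the $\alpha_i$s is $\phi$ then each is in $\Gamma$; but $(\alpha_1\wedge\cdots\wedge\alpha_n)\to(\phi\to\psi)$ is an MPL-tautological consequence of $(\alpha_1\wedge\cdots\wedge\alpha_n)\to\psi$, whence again $\Gamma\vdash_S\phi\to\psi$.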
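Before we end this section, it will be convenient to establish two further sub-lemmas of Lemma 6.5:
6.5c if $\vdash_S\phi$ then $\phi\in\Gamma$
6.5d if $\vdash_S\phi\to\psi$ and $\phi\in\Gamma$ then $\psi\in\Gamma$
Proof. For 6.5c, if $\vdash_S\phi$ then $\vdash_S(\sim\phi\to\bot)$ since S includes PL. Since $\Gamma$ is S-consistent, $\sim\phi\notin\Gamma$; and so, since $\Gamma$ is maximal, $\phi\in\Gamma$. For 6.5d, use lemmas 6.5c and 6.5b.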
Exercise 6.16 (Long.) Show that the relation of provability-from-a-set defined in this section does indeed have the listed features. (As elsewhere in this chapter, you may simply assume the completeness of the PL axioms, and hence that any MPL-tautology is a theorem of each system S.)
6.6.3 "Mesh"
In addition to Theorem 6.4 and Lemma 6.5, we'll also need one further fact about maximal S-consistent sets that is specific to modal systems. Our ultimate goal, remember, is to show that in canonical models, a wff is true at a world iff it is a member of that world. If we're going to be able to show this, we'd better be able to show things like this:
($\Box$) If $\Box\phi$ is a member of world $w$, then $\phi$ is a member of every world accessible from $w$
($\Diamond$) If $\Diamond\phi$ is a member of world $w$, then $\phi$ is a member of some world accessible from $w$
We'll need to be able to show ($\Box$) and ($\Diamond$) because it's part of the definition of truth in any MPL-model (whether canonical or not) that $\Box\phi$ is true at $w$ iff $\phi$ is true at each world accessible from $w$, and that $\Diamond\phi$ is true at $w$ iff $\phi$ is true at some world accessible from $w$. Think of it this way: ($\Box$) and ($\Diamond$) say that the modal statements that are members of a world $w$ in a canonical model "mesh" with the members of accessible worlds. This sort of mesh had better hold if truth and membership are going to coincide.
($\Box$) we know to be true straightaway, since it follows from the definition of the accessibility relation in canonical models. The definition of the canonical model for S, recall, stipulated that $w'$ is accessible from $w$ iff for each wff $\Box\phi$ in $w$, the wff $\phi$ is a member of $w'$. ($\Diamond$), on the other hand, doesn't follow immediately from our definitions; we'll need to prove it. Actually, it will be convenient to prove something slightly different which involves only the $\Box$:
Lemma 6.6 If $\Delta$ is a maximal S-consistent set of wffs containing $\sim\Box\phi$, then there exists a maximal S-consistent set of wffs $\Gamma$ such that $\Box^-(\Delta)\subseteq\Gamma$ and $\sim\phi\in\Gamma$
(Given the definition of accessibility in the canonical model and the definition of the $\Diamond$ in terms of the $\Box$, Lemma 6.6 basically amounts to ($\Diamond$).)
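Proof of Lemma 6.6. Let $\Delta$ be as described. The first step is to show that the set $\Box^-(\Delta)\cup\{\sim\phi\}$ is S-consistent. Suppose for reductio that it isn't, and hence that $\Box^-(\Delta)\cup\{\sim\phi\}\vdash_S\bot$. By the MPL deduction theorem, $\Box^-(\Delta)\vdash_S\sim\phi\to\bot$. So for some $\psi_1\ldots\psi_n\in\Box^-(\Delta)$, we have: $\vdash_S(\psi_1\wedge\cdots\wedge\psi_n)\to(\sim\phi\to\bot)$. [13] Next, begin a proof in S with a proof of this wff, and then continue as follows:
i. $(\psi_1\wedge\cdots\wedge\psi_n)\to(\sim\phi\to\bot)$
i+1. $\psi_1\to(\psi_2\to\cdots(\psi_n\to\phi))$   i, PL (recall the definition of $\wedge$)
i+2. $\Box(\psi_1\to(\psi_2\to\cdots(\psi_n\to\phi)))$   i+1, NEC
i+3. $\Box\psi_1\to(\Box\psi_2\to\cdots(\Box\psi_n\to\Box\phi))$   i+2, K, PL ($\times n$)
i+4. $(\Box\psi_1\wedge\cdots\wedge\Box\psi_n\wedge\sim\Box\phi)\to\bot$   i+3, PL
[13] If $\Box^-(\Delta)$ is empty then this means $\vdash_S\sim\phi\to\bot$, and the argument runs much as in the text: by PL, $\vdash_S\phi$, so by NEC, $\vdash_S\Box\phi$, so by PL, $\vdash_S\sim\Box\phi\to\bot$, contradicting $\Delta$'s S-consistency.
Given this proof, $\vdash_S(\Box\psi_1\wedge\cdots\wedge\Box\psi_n\wedge\sim\Box\phi)\to\bot$. But since $\Box\psi_1\ldots\Box\psi_n$, and $\sim\Box\phi$ are all in $\Delta$, this contradicts $\Delta$'s S-consistency ($\Box\psi_1\ldots\Box\psi_n$ are members of $\Delta$ because $\psi_1\ldots\psi_n$ are members of $\Box^-(\Delta)$).
We've shown that $\Box^-(\Delta)\cup\{\sim\phi\}$ is S-consistent. It therefore has a maximal S-consistent extension, $\Gamma$, by Theorem 6.4. Since $\Box^-(\Delta)\cup\{\sim\phi\}\subseteq\Gamma$, we know that $\Box^-(\Delta)\subseteq\Gamma$ and that $\sim\phi\in\Gamma$. $\Gamma$ is therefore our desired set.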
Exercise 6.17 Where S is any of our modal systems, show that if $\Delta$ is an S-consistent set of wffs containing the formula $\Diamond\phi$, then $\Box^-(\Delta)\cup\{\phi\}$ is also S-consistent. You may appeal to lemmas and theorems proved so far.
6.6.4 Truth and mcmbcrship in canonical modcls
We're now in a position to put all of our lemmas to work, and prove that
canonical models have the property that I promised they would have: the wffs
true at a world are exactly the members of that world:
Theorem 6.7 Where $\mathcal{M}$ ($= \langle\mathcal{W}, \mathcal{R}, \mathcal{I}\rangle$) is the canonical model for any normal
modal system, S, for any wff $\phi$ and any $w \in \mathcal{W}$, $V_{\mathcal{M}}(\phi, w) = 1$ iff $\phi \in w$
Proof of Theorem 6.7. We'll use induction. The base case is when $\phi$ has zero
connectives, i.e., $\phi$ is a sentence letter. In that case, the result is immediate:
by the definition of the canonical model, $\mathcal{I}(\phi, w) = 1$ iff $\phi \in w$; but by the
definition of the valuation function, $V_{\mathcal{M}}(\phi, w) = 1$ iff $\mathcal{I}(\phi, w) = 1$.
Now the inductive step. We assume the inductive hypothesis (ih), that the
result holds for $\phi$ and $\psi$, and show that it must then hold for $\sim\phi$, $\phi\to\psi$, and
$\Box\phi$ as well. The proofs of the first two facts make use of lemmas 6.5a and 6.5b,
and are parallel to the proofs of the analogous facts in section 2.9.4. Finally,
$\Box$: we must show that $\Box\phi$ is true at $w$ iff $\Box\phi \in w$. First the forwards direction.
Assume $\Box\phi$ is true at $w$; then $\phi$ is true at every $w' \in \mathcal{W}$ such that $\mathcal{R}ww'$. By
the (ih), we have (+) $\phi$ is a member of every such $w'$. Now suppose for reductio
that $\Box\phi \notin w$; since $w$ is maximal, $\sim\Box\phi \in w$. Since $w$ is maximal S-consistent,
by Lemma 6.6, we know that there exists some maximal S-consistent set $\Gamma$ such
that $\Box^{-}(w) \subseteq \Gamma$ and $\sim\phi \in \Gamma$. By definition of $\mathcal{W}$, $\Gamma \in \mathcal{W}$; by definition of $\mathcal{R}$,
$\mathcal{R}w\Gamma$; and so by (+) $\Gamma$ contains $\phi$. But $\Gamma$ also contains $\sim\phi$, which contradicts
its S-consistency given 6.5a.
Now the backwards direction. Assume $\Box\phi \in w$. Then by definition of $\mathcal{R}$,
for every $w'$ such that $\mathcal{R}ww'$, $\phi \in w'$. By the (ih), $\phi$ is true at every such world;
hence by the truth condition for $\Box$, $\Box\phi$ is true at $w$.
What was the point of proving theorem 6.7? The whole idea of a canonical
model was that a formula is to be valid in the canonical model for S iff it is a
theorem of S. This fact follows fairly immediately from Theorem 6.7:
Corollary 6.8 $\phi$ is valid in the canonical model for S iff $\vdash_S \phi$
Proof of Corollary 6.8. Let $\langle\mathcal{W}, \mathcal{R}, \mathcal{I}\rangle$ be the canonical model for S. Suppose
$\vdash_S \phi$. Then, by lemma 6.5c, $\phi$ is a member of every maximal S-consistent
set, and hence $\phi \in w$, for every $w \in \mathcal{W}$. By theorem 6.7, $\phi$ is true in every
$w \in \mathcal{W}$, and so is valid in this model. Now for the other direction: suppose $\nvdash_S \phi$.
Then $\{\sim\phi\}$ is S-consistent (if it weren't then $\{\sim\phi\} \vdash_S \bot$, and hence $\vdash_S \sim\phi\to\bot$,
and hence, given the definition of $\bot$, $\vdash_S \phi$). So, by theorem 6.4, $\{\sim\phi\}$ has a
maximal consistent extension; thus, $\sim\phi \in w$ for some $w \in \mathcal{W}$; by theorem 6.7,
$\sim\phi$ is therefore true at $w$, and so $\phi$ is not true at $w$, and hence $\phi$ is not valid
in this model.
So, we've gotten where we wanted to go: we've shown that every system
has a canonical model, and that a wff is valid in the canonical model iff it
is a theorem of the system. In the next section we'll use this fact to prove
completeness for our various systems.
6.6.5 Completeness of systems of MPL
I'll run through the completeness proofs for K, D, and B, leaving the remainder
as exercises.
First, K. Any K-valid wff is valid in all MPL-models, and so is valid in the
canonical model for K, and so, by corollary 6.8, is a theorem of K.
For any other system, S, all we need to do to prove S-completeness is to
show that the canonical model for S is an S-model. That is, we must show
that the accessibility relation in the canonical model for S satisfies the formal
constraint for system S (seriality for D, reflexivity for T and so on).
For D, first let's show that in the canonical model for D, the accessibility
relation, $\mathcal{R}$, is serial. Let $w$ be any world in that model. Example 6.14 showed
that $\Diamond(P\to P)$ is a theorem of D, so it's a member of $w$ by lemma 6.5c, and so
is true at $w$ by theorem 6.7. Thus, by the truth condition for $\Diamond$, there must be
some world accessible to $w$ in which $P\to P$ is true; and hence there must be
some world accessible to $w$.
Now for D's completeness. Let $\phi$ be D-valid. $\phi$ is then valid in all D-models,
i.e., all models with a serial accessibility relation. But we just showed
that the canonical model for D has a serial accessibility relation. $\phi$ is therefore
valid in that model, and hence by corollary 6.8, $\vdash_D \phi$.
Next, B. We must show that the accessibility relation in the canonical model
for B is reflexive and symmetric (as with D, B's completeness then follows from
corollary 6.8). Reflexivity may be proved just as it is proved in the proof of
T's completeness (exercise 6.18). As for symmetry: in the canonical model for
B, suppose that $\mathcal{R}wv$. We must show that $\mathcal{R}vw$, that is, that for any $\psi$, if
$\Box\psi \in v$ then $\psi \in w$. Suppose $\Box\psi \in v$. By theorem 6.7, $\Box\psi$ is true at $v$; since
$\mathcal{R}wv$, by the truth condition for $\Diamond$, $\Diamond\Box\psi$ is true at $w$, and hence is a member
of $w$ by theorem 6.7. Since $\vdash_B \Diamond\Box\psi\to\psi$, by lemma 6.5d, $\psi \in w$.
Exercise 6.18 Prove completeness for T, S4, and S5.
Exercise 6.19 Prove completeness for K5 (see exercise 6.15).
Exercise 6.20 Consider the system that results from adding to K
every axiom of the form $\Diamond\phi\to\Box\phi$. Let the models for this system
be defined as those whose accessibility relation meets the following
condition: every world can see at most one world. Prove completeness
for this (strange) system.
Chapter 7
Beyond Standard Propositional
Modal Logic
Kripke's possible worlds semantics has proved itself useful in many areas.
In this chapter we will briefly examine its use in deontic, epistemic, tense,
and intuitionistic logic.
7.1 Deontic logic
Deontic logic is the study of the logic of normative notions. Let's introduce
operators O and M, for, roughly speaking, "ought" and "may". Grammatically,
these are one-place sentence operators (like $\Box$ and $\sim$): each combines with a
single wff to form another wff. Thus, we can write $OP$, $MQ\to OR$, and so on.
One can read $O\phi$ and $M\phi$ as saying "Agent S ought to see to it that $\phi$" and
"Agent S may see to it that $\phi$", respectively, for some fixed agent S. Or, one can
read them as saying "it ought to be the case that $\phi$" and "it is acceptable for it
to be the case that $\phi$". Either way, the formalism is the same.
It's plausible to define M as $\sim O\sim$, thus enabling us to take O as the sole
new bit of primitive vocabulary. The definition of a wff for deontic logic is thus
like that of nonmodal propositional logic, with the following added clause:
If $\phi$ is a wff, then so is $O\phi$
For semantics, we use possible worlds. In fact, we'll use the very same
apparatus as for modal logic: MPL-models, truth relative to worlds in these
models, and so on. O replaces the $\Box$, and behaves exactly analogously: $O\phi$ says
that $\phi$ is true in all accessible possible worlds. Thus, its truth condition is:
$V(O\phi, w) = 1$ iff $V(\phi, v) = 1$ for each $v \in \mathcal{W}$ such that $\mathcal{R}wv$
The derived condition for M is then:
$V(M\phi, w) = 1$ iff $V(\phi, v) = 1$ for some $v \in \mathcal{W}$ such that $\mathcal{R}wv$
The clauses for atomics, $\sim$ and $\to$, and the definitions of validity and semantic
consequence, remain unchanged.
Indeed, this just is modal logic. Nothing in the formalism has changed;
we're just conceiving of accessibility in a certain way. We now think of $v$ as
being accessible from $w$ if the goings-on in $v$ are permitted, given the operative
norms in $w$ (or: given the norms binding agent S in $w$). That is, $\mathcal{R}wv$ iff
everything that, in $w$, ought to be true is in fact true in $v$ (thus, $v$ violates
nothing that in $w$ is mandatory). We think of $\mathcal{R}$ as being a relation of deontic
accessibility. When we conceptualize modal logic in this way, we write O
instead of $\Box$ and M instead of $\Diamond$.
If we're thinking of $\mathcal{R}$ in this way, what formal properties should it be
required to have? One simple and common answer is that the only required
property is seriality. Seriality does seem right to require: there must always
be some possibility that morality permits; from every world there is at least
one deontically accessible world. Note that reflexivity in particular would
be inappropriate to impose. Things that morally ought to be, nevertheless
sometimes are not.
If seriality is the sole constraint on $\mathcal{R}$, the resulting logic for O is the
modal logic D. Logic D, recall, builds on the modal system K by validating in
addition all instances of $\Box\phi\to\Diamond\phi$, or $O\phi\to M\phi$ in the present context. These
do indeed seem like logical truths: whatever is obligatory is permissible. The
characteristic features of K also seem welcome: if $\phi$ is valid, so is $O\phi$ (recall
the rule NEC); and every instance of the K-schema is valid (O distributes over
$\to$). Further, since accessibility need not be reflexive, some instances of the
T-schema $O\phi\to\phi$ turn out invalid, which is what we want (deontic necessity
isn't alethic).
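The contrast between the D-schema and the T-schema can be checked on a small model. The following Python sketch is only an illustration: the tuple encoding of formulas, the function name `holds`, and the two-world model are assumptions made for this example, not part of the formalism above.

```python
# Formulas are nested tuples: ("atom", "P"), ("not", f), ("imp", f, g), ("O", f).
# This encoding and all names below are illustrative assumptions.

def M(f):
    """'May', defined as not-O-not."""
    return ("not", ("O", ("not", f)))

def holds(f, w, R, I):
    """Truth of f at world w. R maps each world to the set of worlds it sees;
    I maps each world to the set of sentence letters true there."""
    op = f[0]
    if op == "atom":
        return f[1] in I[w]
    if op == "not":
        return not holds(f[1], w, R, I)
    if op == "imp":
        return (not holds(f[1], w, R, I)) or holds(f[2], w, R, I)
    if op == "O":  # same clause as the box: true at every accessible world
        return all(holds(f[1], v, R, I) for v in R[w])
    raise ValueError(f"unknown operator: {op}")

# A serial but non-reflexive model: w sees only v, and v sees itself.
R = {"w": {"v"}, "v": {"v"}}
I = {"w": set(), "v": {"P"}}

P = ("atom", "P")
d_instance = ("imp", ("O", P), M(P))  # OP -> MP
t_instance = ("imp", ("O", P), P)     # OP -> P

d_everywhere = all(holds(d_instance, x, R, I) for x in R)
t_at_w = holds(t_instance, "w", R, I)
print(d_everywhere, t_at_w)
```

In this model P is obligatory at w but not true there, so the T-instance fails at w, while the D-instance holds at both worlds. A single model cannot establish validity, of course; it only illustrates why seriality without reflexivity gives the intended verdicts.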
Formally speaking, there is no difference whatsoever between this semantics
for deontic logic and the semantics for the modal system D. Reconceptualizing
the accessibility relation has no effect on the definition of a model or the
valuation function. But suppose you took possible worlds semantics seriously,
as being more than a mere formal semantics for formal languages; suppose
you took it to give real truth conditions in terms of real possible worlds and
real accessibility for natural language modal and deontic talk. Then you would
take the truth conditions for "necessarily" and "possibly" to differ from the truth
conditions for "ought" and "may", since their accessibility relations would be
different relations. The accessibility relation in the semantics of "ought" and
"may" would be a real relation of deontic accessibility (we wouldn't just be
"thinking of it" as being such a relation), whereas the accessibility relation for
"necessarily" and "possibly" would have nothing to do with normativity.
This is a mere beginning for deontic logic. Should we impose further
constraints on the models? For example, is the principle (U) (for "utopia")
$O(O\phi\to\phi)$ a valid principle of deontic logic? (This principle says that it ought
to be the case that everything that ought to be true is true.) If so, we should find
a corresponding condition to impose on the deontic accessibility relation, and
impose it. And is our operator O adequate to represent all deontic reasoning?
For example, how can we represent the apparently true sentence "if you kill the
victim, you ought to kill him quickly" using O? The obvious candidates are:
$K\to OQ$
$O(K\to Q)$
But neither seems right. Against the first: suppose that you do in fact kill the
victim. Then it would follow from the first that one of your obligations is to do
the following: kill the victim quickly. But surely that's wrong; you ought not to
kill the victim at all! Against the second: if it's the right representation of "if
you kill the victim, you ought to kill him quickly", then the right representation
of "if you kill the victim, you ought to kill him slowly" should be $O(K\to S)$.
But $O(K\to S)$ follows from $O\sim K$ (given just a K modal logic for O), and "you
ought not to kill the victim" certainly does not imply "if you kill the victim,
you ought to kill him slowly".$^{1}$
1. See Feldman () for more on this last issue.
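The claim that the "slowly" conditional follows from the obligation not to kill, given only a K-style logic for O, can be spelled out as a short axiomatic derivation. The following is a sketch in the style of the axiomatic proofs of chapter 6, with NEC and the K-schema restated for O; the line numbering is added here for illustration. Note that the letter K on lines 1 to 4 is the sentence letter for "you kill the victim", not the name of the system.

```latex
\begin{array}{lll}
1. & \sim K \to (K \to S) & \text{PL} \\
2. & O(\sim K \to (K \to S)) & 1,\ \text{NEC} \\
3. & O(\sim K \to (K \to S)) \to (O{\sim}K \to O(K \to S)) & \text{K-schema} \\
4. & O{\sim}K \to O(K \to S) & 2,\ 3,\ \text{MP}
\end{array}
```

Nothing in the derivation depends on what S says, which is why the same reasoning would deliver an obligation to kill quickly, slowly, or in any other manner.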
Exercise 7.1* Find a condition on accessibility that validates every
instance of (U).
Exercise 7.2* Let X be the axiomatic system that results from
modal system D by adding as additional axioms all instances of
(U). Show that X is sound and complete with respect to a Kripke
semantics in which the accessibility relation is required to be serial
and also to obey the condition you came up with in exercise 7.1.
7.2 Epistemic logic
In deontic logic we took the $\Box$ of modal logic and gave it a deontic reading. In
epistemic logic we give it an epistemic reading; we treat it as meaning "it is
known (perhaps by a fixed agent S) to be the case that". Under this reading,
we write it: K. Thus, $K\phi$ means that $\phi$ is known. ($\sim K\sim$ can be thought of as
a kind of epistemic possibility: "as far as what is known is concerned, $\phi$ might
be true".)
As with deontic logic, we do semantics with Kripke models, conceptualized
in a certain way. Formally, this is just modal logic: we still treat $K\phi$ as true at
$w$ iff $\phi$ is true at every accessible world. But now we think of the accessibility
relation as epistemic accessibility: $\mathcal{R}wv$ iff everything known in $w$ is true in
$v$.
The constraints on the formal properties of epistemic accessibility must
clearly be different from those on deontic accessibility. For one thing, epistemic
accessibility should be required to be reflexive: since knowledge implies truth,
we want $K\phi\to\phi$ to be a valid principle of epistemic logic. Whether further
constraints are appropriate is debatable. Do we want K to obey an S5 modal
logic? The analogs for K of the characteristic axioms of S4 and S5 are controversial,
but do have some plausibility. The S4 axiom for K is also known as the
"KK" principle, or the principle of "positive introspection": $K\phi\to KK\phi$. From
the S5 axiom schema we get the so-called principle of "negative introspection":
$\sim K\phi\to K\sim K\phi$. These schemas (as well as the T axiom schema) are all validated
if we require the relation of epistemic accessibility to be an equivalence relation.
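The link between these principles and the shape of the accessibility relation can be explored by brute force on small frames. The sketch below is an illustration under stated assumptions: the tuple encoding, the helper `valid_on_frame`, and both example frames are invented for this purpose, and only instances built from a single sentence letter P are checked, so a positive result is evidence about those instances rather than a proof about the schemas.

```python
from itertools import product

# Formulas over one sentence letter: ("atom",), ("not", f), ("imp", f, g), ("K", f).

def holds(f, w, R, true_at):
    """Truth of f at w; R maps a world to the worlds it sees,
    true_at is the set of worlds where the letter P is true."""
    op = f[0]
    if op == "atom":
        return w in true_at
    if op == "not":
        return not holds(f[1], w, R, true_at)
    if op == "imp":
        return (not holds(f[1], w, R, true_at)) or holds(f[2], w, R, true_at)
    if op == "K":  # known: true at every epistemically accessible world
        return all(holds(f[1], v, R, true_at) for v in R[w])
    raise ValueError(f"unknown operator: {op}")

def valid_on_frame(f, R):
    """True iff f holds at every world under every valuation of P on this frame."""
    worlds = sorted(R)
    for bits in product([False, True], repeat=len(worlds)):
        true_at = {w for w, on in zip(worlds, bits) if on}
        if not all(holds(f, w, R, true_at) for w in worlds):
            return False
    return True

P = ("atom",)
t_axiom = ("imp", ("K", P), P)                                   # KP -> P
positive = ("imp", ("K", P), ("K", ("K", P)))                    # KP -> KKP
negative = ("imp", ("not", ("K", P)), ("K", ("not", ("K", P))))  # ~KP -> K~KP

equivalence = {0: {0, 1}, 1: {0, 1}, 2: {2}}     # classes {0, 1} and {2}
reflexive_only = {0: {0, 1}, 1: {1, 2}, 2: {2}}  # reflexive but not transitive

for name, frame in [("equivalence", equivalence), ("reflexive only", reflexive_only)]:
    print(name, [valid_on_frame(f, frame) for f in (t_axiom, positive, negative)])
```

On the equivalence frame all three instances come out valid. On the merely reflexive frame the T-instance survives but both introspection instances fail, which matches the point that reflexivity alone secures only factivity.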
Whether the introspection principles are correct is a disputed question
among epistemologists. It goes without saying that epistemic logic cannot
hope to resolve this question on its own. The question is a philosophical one,
about the nature of knowledge. One can develop formal systems in which these
principles are valid, and formal systems in which they are not; it is up to the
epistemologists to tell us which of these formal systems best models the actual
logic of knowledge.
Regardless of what constraints we place on accessibility, the mere use of
Kripke semantics gives K at least the features from system K. Some of these
features are apparently objectionable. For example, if $\phi$ in fact logically implies
$\psi$, then our system says that $K\phi$ logically implies $K\psi$ (see exercise 7.3). That
is, we know all the logical consequences of our knowledge. That seems wrong;
can't I be unaware of subtle or complex consequences of what I know? But
perhaps epistemic logic can be regarded as a useful idealization.
In addition to a logic of knowledge, we can develop a logic of belief, based
on a new one-place sentence operator B. As before, the models are Kripke
models, only now we think of $\mathcal{R}$ as a relation of doxastic accessibility: $\mathcal{R}wv$
iff everything believed in $w$ is true in $v$. Unlike epistemic accessibility, doxastic
accessibility shouldn't be required to be reflexive (since belief is not factive); we
don't want the T-principle $BP\to P$ to be valid. Nor do we want the B-principle
$\sim B\sim BP\to P$ to be valid: just because I don't believe that I don't believe P, it
doesn't follow that P is true. As before, there is controversy over introspection:
over whether $B\phi\to BB\phi$ and $\sim B\phi\to B\sim B\phi$ should be validated. If they should,
then doxastic accessibility must be required to be transitive and also euclidean: if
$\mathcal{R}wv$ and $\mathcal{R}wu$ then $\mathcal{R}vu$. (We know from chapter 6 that transitivity validates
the S4 schema, and if you did exercise 6.15 you showed that euclideanness
validates the S5 schema.) This generates the modal logic K45, in which the K,
S4, and S5 axioms are valid, but not the T or B axioms.
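A two-world frame is enough to see how transitivity and euclideanness can hold without reflexivity. The following Python sketch is illustrative only: the encoding, the helper names, and the frame are assumptions made for this example, and the check covers only instances built from one sentence letter P.

```python
from itertools import product

def is_reflexive(R):
    return all(w in R[w] for w in R)

def is_transitive(R):
    # if Rwv and Rvu then Rwu
    return all(u in R[w] for w in R for v in R[w] for u in R[v])

def is_euclidean(R):
    # if Rwv and Rwu then Rvu
    return all(u in R[v] for w in R for v in R[w] for u in R[w])

def holds(f, w, R, true_at):
    """Formulas over one letter: ("atom",), ("not", f), ("imp", f, g), ("B", f)."""
    op = f[0]
    if op == "atom":
        return w in true_at
    if op == "not":
        return not holds(f[1], w, R, true_at)
    if op == "imp":
        return (not holds(f[1], w, R, true_at)) or holds(f[2], w, R, true_at)
    if op == "B":  # believed: true at every doxastically accessible world
        return all(holds(f[1], v, R, true_at) for v in R[w])
    raise ValueError(f"unknown operator: {op}")

def valid_on_frame(f, R):
    worlds = sorted(R)
    for bits in product([False, True], repeat=len(worlds)):
        true_at = {w for w, on in zip(worlds, bits) if on}
        if not all(holds(f, w, R, true_at) for w in worlds):
            return False
    return True

P = ("atom",)
t_principle = ("imp", ("B", P), P)                                      # BP -> P
b_principle = ("imp", ("not", ("B", ("not", ("B", P)))), P)             # ~B~BP -> P
positive = ("imp", ("B", P), ("B", ("B", P)))                           # BP -> BBP
negative = ("imp", ("not", ("B", P)), ("B", ("not", ("B", P))))         # ~BP -> B~BP

frame = {0: {1}, 1: {1}}  # world 0 sees only world 1; world 1 sees itself

print(is_reflexive(frame), is_transitive(frame), is_euclidean(frame))
print([valid_on_frame(f, frame) for f in (t_principle, b_principle, positive, negative)])
```

The frame is transitive and euclidean but not reflexive. On it the two introspection instances are valid while the T- and B-principle instances fail (take P true at world 1 only and evaluate at world 0).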
Exercise 7.3 Show that knowledge is closed under entailment in
our epistemic logic. That is, show that if $\phi \vDash \psi$ then $K\phi \vDash K\psi$. (For
this problem it does not matter which constraints on accessibility
are assumed.)
7.3 Propositional tense logic
7.3.1 The metaphysics of time
A logical treatment of the full range of things we say and think must cover
temporal discourse. Some philosophers, however, think that this demands
nothing beyond standard predicate logic. This was the view of many early
logicians, notably Quine.$^{2}$ Here are some examples of how Quine would
regiment temporal sentences in predicate logic:
Everyone who is now an adult was once a child
$\forall x(Axn \to \exists t[Etn \wedge Cxt])$
A dinosaur once trampled a mammal
$\exists x\exists y\exists t(Etn \wedge Dx \wedge My \wedge Txyt)$
Here, $n$ (for "now") is a name of the present time (Quine treats moments of
time as entities). $E$ is a predicate for the earlier-than relation over moments
of time. Thus, "$Etn$" means that moment $t$ is earlier than the present moment;
"$\exists t(Etn \wedge \phi(t))$" means that $\phi(t)$ is true at some moment $t$ in the past, and so on.
To every predicate that can hold temporarily, Quine adds in a new argument
place for the time at which the predicate is satisfied. Thus, instead of saying
"$Cx$" ("x is a child") he says "$Cxt$": "x is a child at t". Finally, the quantifier $\exists x$
is atemporal, ranging over all objects at all times. Thus, Quine is willing to say
that there is a thing, x, that is a dinosaur, and which, at some previous time,
trampled a mammal.
So: we can use Quine's strategy to represent temporal notions using standard
predicate logic. But Quine's strategy presupposes a metaphysics of time that
some philosophers reject. First, Quine assumes that there exist past objects.
His symbolization of the presumably true sentence "A dinosaur once trampled
a mammal" says that there is such a thing as a dinosaur. Quine's view is that time
is "space-like". Past objects are as real as present ones, they're just temporally
distant, just as spatially distant objects are just as real as the ones around here.
(Defenders of this metaphysics usually say that future objects exist as well.)
Second, Quine presupposes a distinctive metaphysics of change. Quine would
describe my change from childhood to adulthood thus: $Cap \wedge Aan$, where
$a$ names me, $n$ again names the present moment, and $p$ names some past
moment at which I was a child. Note the symmetry between the past state of
my childhood, $Cap$, and the current state of my adulthood, $Aan$. Tenselessly
speaking, the states are on a par; there's nothing metaphysically special about
either. Some conclude that Quine's approach leaves no room for genuine
change. His approach, they say, assimilates change too closely to variation
across space: compare my being a child-at-$p$ and an adult-at-$n$ with the USA
being mountainous-in-the-west and flat-in-the-middle.
2. See, for example, Quine (b).
Arthur Prior (; ) and others reject Quine's picture of time. According
to Prior, rather than reducing notions of past, present, and future to
notions about what is true at times, we must instead include certain special
temporal expressions (sentential tense operators) in our most basic languages,
and develop an account of their logic. Thus he initiated the study of tense logic.
One of Prior's tense operators was P, symbolizing "it was the case that".
Grammatically, P attaches to a complete sentence and forms another complete
sentence. Thus, if R symbolizes "it is raining", then PR symbolizes "it was
raining". If a sentence letter occurs by itself, outside of the scope of all temporal
operators, then for Prior it is to be read as present-tensed. Thus, it was
appropriate to let R symbolize "It is raining", i.e., it is now raining.
Suppose we symbolize "there exists a dinosaur" as $\exists xDx$. Prior would then
symbolize "There once existed a dinosaur" as:
$P\exists xDx$
And according to Prior, $P\exists xDx$ is not to be analyzed as saying that there exist
dinosaurs located in the past. For him, there is no further analysis of $P\exists xDx$.
Prior's attitude toward P is like nearly everyone's attitude toward $\sim$. Nearly
everyone agrees that $\sim$ is not further analyzable (for example, no one thinks
that $\sim\exists xUx$, "there are no unicorns", is to be analyzed as saying that there
exist unreal unicorns). Further, for Prior there is an asymmetry between past
and future events that allows the possibility of genuine change. He represents
the fact that I was a child thus: $PCa$, and the fact that I'm now an adult thus:
$Aa$. Only statements about the present can be made unqualifiedly, without
tense operators. Note also that Prior does away with Quine's relativization of
temporary predicates to times. For Prior, the sentence $Aa$ ("Ted is an adult") is
a complete statement, but nevertheless can alter its truth value.
7.3.2 Tense operators
One can study various tense operators. Here is one group:
$G\phi$: it is, and is always going to be the case that $\phi$
$H\phi$: it is, and always has been the case that $\phi$
$F\phi$: it either is, or will at some point in the future be the case that, $\phi$
$P\phi$: it either is, or was at some point in the past the case that, $\phi$
Grammatically, we can take G and H as primitive, governed by the following
clause in the definition of a wff:
If $\phi$ is a wff then so are $G\phi$ and $H\phi$
Then we can define F and P:
"$F\phi$" is short for "$\sim G\sim\phi$"
"$P\phi$" is short for "$\sim H\sim\phi$"
One could also define further tense operators, for example A and S, for "always"
and "sometimes", in terms of G and H:
"$A\phi$" is short for "$H\phi\wedge G\phi$"
"$S\phi$" is short for "$P\phi\vee F\phi$" (i.e., "$\sim H\sim\phi\vee\sim G\sim\phi$")
Other tense operators are not definable in terms of G and H. Metrical tense
operators, for example, concern what happened or will happen at specific
temporal distances in the past or future:
$P^{x}\phi$: it was the case x minutes ago that $\phi$
$F^{x}\phi$: it will be the case in x minutes that $\phi$
We will not consider metrical tense operators further.
The (nonmetrical) tense operators, as interpreted above, "include the
present moment". For example, if $G\phi$ is now true, then $\phi$ must now be true.
One could specify an alternate interpretation on which they do not include the
present moment:
$G\phi$: it is always going to be the case that $\phi$
$H\phi$: it always has been the case that $\phi$
$F\phi$: it will at some point in the future be the case that $\phi$
$P\phi$: it was at some point in the past the case that $\phi$
Whether we take the tense operators as including the present moment will
affect what kind of logic we develop. For example, $G\phi$ and $H\phi$ should imply $\phi$
if G and H are interpreted as including the present moment, but not otherwise.
7.3.3 Kripke-style semantics for tense logic
As with deontic and epistemic logic our semantic approach is to use Kripke
models, conceived in a certain way. But our new conception is drastically
different from our earlier conceptions. Now we think of the members of $\mathcal{W}$
as times rather than as possible worlds, we think of the accessibility relation as
a temporal ordering relation, and we think of the interpretation function as
assigning truth values to sentence letters at times.
(A Priorean faces hard philosophical questions about the use of such a
semantics, since according to him, the semantics doesn't accurately model the
metaphysics of time. The questions are like those questions that confront
someone who uses possible worlds semantics for modal logic but rejects a
possible worlds metaphysics of modality.)
This reconceptualization requires no change to the definition of an MPL-model.
But to mark the change in thinking, let's change our notation. Since
we're thinking of $\mathcal{W}$ as the set of times, let's rename it $\mathcal{T}$, and let's use variables
like $t$, $t'$, etc., for its members. And since we're thinking of accessibility as a
relation of temporal ordering (the at-least-as-early-as relation over times, in
particular) let's rename it too: $\leq$. (If we were interpreting the tense operators
as not including the present moment, then we would think of the temporal
ordering relation as the strictly-earlier-than relation, and would write it $<$.)
Thus, instead of writing $\mathcal{R}ww'$, we write: $t \leq t'$.
We need to update the definition of the valuation function. The clauses
for atomics, $\sim$, and $\to$ remain the same; but in place of the $\Box$ we now have
two $\Box$-like operators, G and H, which "look" at different directions along the
accessibility relation, so to speak. Here are their semantic clauses:
$V_{\mathcal{M}}(G\phi, t) = 1$ iff for every $t'$ such that $t \leq t'$, $V_{\mathcal{M}}(\phi, t') = 1$
$V_{\mathcal{M}}(H\phi, t) = 1$ iff for every $t'$ such that $t' \leq t$, $V_{\mathcal{M}}(\phi, t') = 1$
F and P are then governed by the following derived clauses:
$V_{\mathcal{M}}(F\phi, t) = 1$ iff for some $t'$ such that $t \leq t'$, $V_{\mathcal{M}}(\phi, t') = 1$
$V_{\mathcal{M}}(P\phi, t) = 1$ iff for some $t'$ such that $t' \leq t$, $V_{\mathcal{M}}(\phi, t') = 1$
Call an MPL-model, thought of in this way, a "PTL-model" (for "Priorean
Tense Logic"). And say that a wff of tense logic is PTL-valid iff it is true
at every time in every PTL-model. Given our discussion of system K from
chapter 6, we already know a lot about PTL-validity. The truth condition
for the G is the same as the truth condition for the $\Box$ in MPL. So if you take
a K-valid wff of MPL and change all the $\Box$s to Gs, you get a PTL-valid wff
of tense logic. (For example, since $\Box(P\wedge Q)\to\Box P$ is K-valid, $G(P\wedge Q)\to GP$
is PTL-valid.) Similarly, replacing $\Box$s with Hs in a K-valid wff results in a
PTL-valid wff (exercise 7.5). But in other cases, PTL-validity depends on the
interaction between different tense operators; this has no direct analog in MPL.
For example, $\phi\to GP\phi$ and $\phi\to HF\phi$ are both PTL-valid:
$\vDash_{PTL} \phi\to GP\phi$ and $\vDash_{PTL} \phi\to HF\phi$
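The interaction between G and P (and between H and F) can be explored with a small evaluator for the semantic clauses above. This Python sketch is an illustration under stated assumptions: the tuple encoding, the helper names (`Pa` is used for the past operator to avoid confusion with a sentence letter), and the exhaustive search over models with at most three times and one sentence letter Q are all choices made for this example. A bounded search of this kind is evidence, not a proof of PTL-validity.

```python
from itertools import product

# Formulas over one sentence letter Q:
# ("atom",), ("not", f), ("imp", f, g), ("G", f), ("H", f).

def F(f):
    """F is short for not-G-not."""
    return ("not", ("G", ("not", f)))

def Pa(f):
    """The past operator P, short for not-H-not."""
    return ("not", ("H", ("not", f)))

def holds(f, t, times, le, true_at):
    """Truth of f at time t; le is a set of pairs (a, b) meaning a <= b."""
    op = f[0]
    if op == "atom":
        return t in true_at
    if op == "not":
        return not holds(f[1], t, times, le, true_at)
    if op == "imp":
        return (not holds(f[1], t, times, le, true_at)) or holds(f[2], t, times, le, true_at)
    if op == "G":  # every u with t <= u
        return all(holds(f[1], u, times, le, true_at) for u in times if (t, u) in le)
    if op == "H":  # every u with u <= t
        return all(holds(f[1], u, times, le, true_at) for u in times if (u, t) in le)
    raise ValueError(f"unknown operator: {op}")

def true_in_all_models(f, n):
    """Check f at every time of every model with n times: all relations, all valuations."""
    times = list(range(n))
    pairs = [(a, b) for a in times for b in times]
    for rel_bits in product([False, True], repeat=len(pairs)):
        le = {p for p, keep in zip(pairs, rel_bits) if keep}
        for val_bits in product([False, True], repeat=n):
            true_at = {t for t, on in zip(times, val_bits) if on}
            if not all(holds(f, t, times, le, true_at) for t in times):
                return False
    return True

Q = ("atom",)
q_gpq = ("imp", Q, ("G", Pa(Q)))  # Q -> GPQ
q_hfq = ("imp", Q, ("H", F(Q)))   # Q -> HFQ
gq_q = ("imp", ("G", Q), Q)       # GQ -> Q

print(all(true_in_all_models(q_gpq, n) for n in (1, 2, 3)))
print(all(true_in_all_models(q_hfq, n) for n in (1, 2, 3)))
print(true_in_all_models(gq_q, 1))
```

The first two checks succeed on every model searched, while the third fails already with a single time that is not related to itself, since no constraint on the ordering has yet been imposed.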
Exercise 7.5* Show that replacing $\Box$s with Hs in a K-valid formula
of MPL results in a PTL-valid formula.
7.3.4 Formal constraints on $\leq$
PTL-validity is not a good model for logical truth in tense logic. We have
so far placed no constraints on the formal properties of the relation $\leq$ in a
PTL-model. That means that there are PTL-models in which the $\leq$ looks
nothing like a temporal ordering. We don't normally think that time could
consist of a number of temporally disconnected points, for example, or of many
points each of which is at-least-as-early-as all of the rest, and so on, but there
are PTL-models answering to these strange descriptions. PTL-validity, as I
defined it, requires truth at every time in every PTL-model, even these strange
models. This means that many tense-logical statements that ought, intuitively,
to count as logical truths, are in fact not PTL-valid.
The formula $GP\to GGP$ is an example. It is PTL-invalid, for consider a
model with three times, $t_1$, $t_2$, and $t_3$, where $t_1 \leq t_2$, $t_2 \leq t_3$, and $t_1 \nleq t_3$, and in
which P is true at $t_1$ and $t_2$, but not at $t_3$:
[Diagram: $t_1$ ($P$) $\longrightarrow$ $t_2$ ($P$) $\longrightarrow$ $t_3$ ($\sim P$), with arrows for the $\leq$ relation]
In this model, $GP\to GGP$ is false at time $t_1$. But $GP\to GGP$ is, intuitively, a
logical truth. If it is and will always be raining, then surely it must also be
true that: it is and always will be the case that: it is and always will be raining.
The problem, of course, is that the $\leq$ relation in the model we considered is
intransitive, whereas, one normally assumes, the at-least-as-early-as relation
must be transitive.
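The three-time countermodel just described can be run directly. The sketch below is illustrative: the encoding, the function names, and the use of a transitive closure to repair the relation are assumptions made for this example rather than anything in the text.

```python
from itertools import combinations

def holds(f, t, le, true_at):
    """Formulas: ("atom",), ("imp", f, g), ("G", f). le is a set of pairs (a, b)
    meaning a <= b; true_at is the set of times at which P is true."""
    op = f[0]
    if op == "atom":
        return t in true_at
    if op == "imp":
        return (not holds(f[1], t, le, true_at)) or holds(f[2], t, le, true_at)
    if op == "G":  # every b with t <= b
        return all(holds(f[1], b, le, true_at) for (a, b) in le if a == t)
    raise ValueError(f"unknown operator: {op}")

def transitive_closure(rel):
    """Smallest transitive relation containing rel."""
    closure = set(rel)
    while True:
        extra = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if extra <= closure:
            return closure
        closure |= extra

P = ("atom",)
formula = ("imp", ("G", P), ("G", ("G", P)))  # GP -> GGP

times = [1, 2, 3]
le = {(1, 2), (2, 3)}  # t1 <= t2 and t2 <= t3, but not t1 <= t3
p_times = {1, 2}       # P true at t1 and t2 only

false_at_t1 = not holds(formula, 1, le, p_times)

# Repair the relation by adding t1 <= t3, then try every valuation of P.
le_closed = transitive_closure(le)
valuations = [set(c) for r in range(len(times) + 1) for c in combinations(times, r)]
valid_when_transitive = all(holds(formula, t, le_closed, v) for v in valuations for t in times)

print(false_at_t1, sorted(le_closed), valid_when_transitive)
```

In the original model GP holds at $t_1$ (its only successor is $t_2$) but GP fails at $t_2$, so GGP fails at $t_1$. Once the missing pair is added, no valuation of P on this frame falsifies the formula at any time.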
More interesting notions of validity result from restricting the class of
models in the definition of validity to those whose $\leq$ relations satisfy certain
formal constraints. We might require $\leq$ to be transitive, for example. Under
this definition, every instance of the "S4" schemas is valid:
$G\phi\to GG\phi$
$H\phi\to HH\phi$
Reflexivity is also natural to require, since that validates the "T-schemas"
$G\phi\to\phi$ and $H\phi\to\phi$. (Assuming, that is, that we're construing the tense operators
as including the present moment. If we construed them as not including
the present moment, and thought of accessibility as meaning "strictly earlier
than", then it would be natural to require that no time be accessible from itself.)
One might also require "connectivity" of some sort:
Definition of kinds of connectivity: Let R be any binary relation over A.
  R is strongly connected in A iff for every $u, v \in A$, either $Ruv$ or $Rvu$
  R is weakly connected iff for every $u, v, v'$, IF: either $Ruv$ and $Ruv'$, or
  $Rvu$ and $Rv'u$, THEN: either $Rvv'$ or $Rv'v$
Thus we might require that the $\leq$ relation be strongly connected (in $\mathcal{T}$), or,
alternatively, merely weakly connected. This would be to disallow "incomparable"
pairs of times: pairs of times neither of which bears the $\leq$ relation
to the other. The stronger requirement disallows all incomparable pairs; the
weaker requirement merely disallows incomparable pairs when each member
of the pair is after or before some one time. Thus, the weaker requirement
disallows "branches" in the temporal order but allows distinct timelines wholly
disconnected from one another, whereas the stronger requirement insures that
all times are part of a single non-branching structure. Each sort validates every
instance of the following schemas (exercise 7.6):
$G(G\phi\to\psi) \vee G(G\psi\to\phi)$
$H(H\phi\to\psi) \vee H(H\psi\to\phi)$
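The two connectivity conditions can be checked mechanically on finite relations, which makes the difference between them easy to see. The following Python sketch is illustrative: the function names and the three example orderings (a single line, two disconnected lines, and a fork) are assumptions made for this example.

```python
def strongly_connected(A, R):
    """For every u, v in A: Ruv or Rvu. R is a set of ordered pairs."""
    return all((u, v) in R or (v, u) in R for u in A for v in A)

def weakly_connected(A, R):
    """For every u, v, v2: if (Ruv and Ruv2) or (Rvu and Rv2u), then Rvv2 or Rv2v."""
    for u in A:
        for v in A:
            for v2 in A:
                both_after_u = (u, v) in R and (u, v2) in R
                both_before_u = (v, u) in R and (v2, u) in R
                comparable = (v, v2) in R or (v2, v) in R
                if (both_after_u or both_before_u) and not comparable:
                    return False
    return True

# A single line 0 <= 1 <= 2 (reflexive and transitive).
line_A = {0, 1, 2}
line_R = {(a, b) for a in line_A for b in line_A if a <= b}

# Two disconnected lines: 0 <= 1 and 2 <= 3.
two_A = {0, 1, 2, 3}
two_R = {(0, 0), (0, 1), (1, 1), (2, 2), (2, 3), (3, 3)}

# A fork: 0 is before both 1 and 2, which are incomparable.
fork_A = {0, 1, 2}
fork_R = {(0, 0), (1, 1), (2, 2), (0, 1), (0, 2)}

for name, A, R in [("line", line_A, line_R), ("two lines", two_A, two_R), ("fork", fork_A, fork_R)]:
    print(name, strongly_connected(A, R), weakly_connected(A, R))
```

The single line satisfies both conditions, the pair of disconnected lines is weakly but not strongly connected, and the fork satisfies neither, matching the description of what each requirement rules out.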
There are other constraints one might impose, for example anti-symmetry
(no distinct times bear $\leq$ to each other), density (between any two times there is
another time), or eternality (there exists neither a first nor a last time). Sometimes
a constraint validates an interesting schema, sometimes it doesn't. Some
constraints are more philosophically controversial than others.
Symmetry clearly should not be imposed. Obviously if one time is at least
as early as another, then the second time needn't be at least as early as the first.
Moreover, imposing symmetry would validate the "B" schemas $FG\phi\to\phi$ and
$PH\phi\to\phi$; but these clearly ought not to be validated. Take the first: it doesn't
follow from "it will be the case that it is always going to be the case that I'm dead" that
I'm (now) dead.
* * *
We have briefly examined Kripke semantics for deontic, epistemic and
doxastic, and tense operators. Another interesting project in this vicinity is to
explore connections between these and other operators. We might introduce a
single language containing deontic, epistemic, and doxastic operators, as well as
a $\Box$ standing for some further sort of necessity (metaphysical necessity, say). A
natural semantics for this language would be a Kripke semantics with multiple
accessibility relations, one for each of the operators. This leads to interesting
questions about how these operators logically relate to one another. Does
knowledge imply belief? That is, is $K\phi\to B\phi$ valid? If so, we should require
that if one world is doxastically accessible from another then it is epistemically
accessible from it as well. Similarly, if metaphysical necessity implies knowledge
then we must validate $\Box\phi\to K\phi$, and so epistemic accessibility must be required
to imply metaphysical accessibility (the kind of accessibility associated with the
$\Box$). Adding in tense operators generates a further dimension of complexity,
since the models must now incorporate a set $\mathcal{T}$ of times in addition to the set
$\mathcal{W}$ of worlds, and formulas must be evaluated for truth at world-time pairs.
We have considered only the semantic approach to deontic, epistemic and
doxastic, and tense logic. What of a proof-theoretic approach? Since we have
been treating these logics as modal logics, it should be no surprise that axiom
systems similar to those of section 6.4 can be developed for them. Moreover,
the techniques developed in sections 6.5-6.6 can be used to give soundness
and completeness proofs for many of these axiomatic systems, relative to the
possible-worlds semantics that we have developed.
Exercise 7.6 Show that all instances of $G(G\phi\to\psi) \vee G(G\psi\to\phi)$
and $H(H\phi\to\psi) \vee H(H\psi\to\phi)$ turn out valid if $\leq$ is required to be
connected (either weakly or strongly).
7.4 Intuitionistic propositional logic: semantics
7.4.1 Proof stages
Intuitionists, recall, reject the law of the excluded middle and double-negation
elimination. In section 3.5 we developed a proof-theory for intuitionistic
propositional logic by beginning with the classical sequent calculus and then
dropping double-negation elimination while adding ex falso. In this section we
will develop a semantics for intuitionistic logic due to Saul Kripke.
The semantics is again of the possible-worlds variety. Formally speaking,
the models will be just MPL-models, the only difference being a different
definition of the valuation function. But informally, we think of these models
differently. We now think of members of $\mathcal{W}$ as stages in the construction
of proofs, rather than as possible worlds, and we think of 1 and 0 as "proof
statuses", rather than truth values. That is, we think of $V(\phi, w) = 1$ as meaning
that formula $\phi$ has been proved at stage $w$, and of $V(\phi, w) = 0$ as meaning that
formula $\phi$ has not yet been proved at stage $w$.
Let's treat $\wedge$ and $\vee$, in addition to $\to$ and $\sim$, as primitive connectives. And
to emphasize the different way we are regarding the "worlds", we rename $\mathcal{W}$
as $\mathcal{S}$, for "proof stages", and we use the variables $s$, $s'$, etc., for its members. Here
is the semantics:
Definition of model: An I-model is a triple $\langle\mathcal{S}, \mathcal{R}, \mathcal{I}\rangle$, such that:
  $\mathcal{S}$ is a non-empty set ("proof stages")
  $\mathcal{R}$ is a reflexive and transitive binary relation over $\mathcal{S}$ ("accessibility")
  $\mathcal{I}$ is a two-place function that assigns 0 or 1 to each sentence letter,
  relative to each member of $\mathcal{S}$ ("interpretation function")
  for any sentence letter $\alpha$, if $\mathcal{I}(\alpha, s) = 1$ and $\mathcal{R}ss'$ then $\mathcal{I}(\alpha, s') = 1$
  ("heredity condition")
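Definition of valuation: Where $\mathcal{M}$ ($= \langle\mathcal{S}, \mathcal{R}, \mathcal{I}\rangle$) is any I-model, the
I-valuation for $\mathcal{M}$, $IV_{\mathcal{M}}$, is defined as the two-place function that assigns either
0 or 1 to each wff relative to each member of $\mathcal{S}$, subject to the following
constraints, for any sentence letter $\alpha$, any wffs $\phi$ and $\psi$, and any $s \in \mathcal{S}$:
  $IV_{\mathcal{M}}(\alpha, s) = \mathcal{I}(\alpha, s)$
  $IV_{\mathcal{M}}(\phi\wedge\psi, s) = 1$ iff $IV_{\mathcal{M}}(\phi, s) = 1$ and $IV_{\mathcal{M}}(\psi, s) = 1$
  $IV_{\mathcal{M}}(\phi\vee\psi, s) = 1$ iff $IV_{\mathcal{M}}(\phi, s) = 1$ or $IV_{\mathcal{M}}(\psi, s) = 1$
  $IV_{\mathcal{M}}(\sim\phi, s) = 1$ iff for every $s'$ such that $\mathcal{R}ss'$, $IV_{\mathcal{M}}(\phi, s') = 0$
  $IV_{\mathcal{M}}(\phi\to\psi, s) = 1$ iff for every $s'$ such that $\mathcal{R}ss'$, either $IV_{\mathcal{M}}(\phi, s') = 0$
  or $IV_{\mathcal{M}}(\psi, s') = 1$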
Note that the valuation conditions for the $\to$ and the $\sim$ at stage $s$ no longer
depend exclusively on what $s$ is like; they are sensitive to what happens at stages
accessible from $s$. Unlike the $\wedge$ and the $\vee$, $\to$ and $\sim$ are not truth functional
(relative to a stage); they behave like modal operators.
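The clauses of the I-valuation translate directly into a small evaluator, which makes it easy to see how excluded middle and double-negation elimination can fail at a stage. The following Python sketch is illustrative only: the tuple encoding, the function names, and the two-stage model are assumptions made for this example, and the proof statuses 1 and 0 are represented as True and False.

```python
def iv(f, s, R, I):
    """I-valuation of f at stage s. R maps a stage to the set of stages accessible
    from it; I maps a stage to the set of sentence letters proved there."""
    op = f[0]
    if op == "atom":
        return f[1] in I[s]
    if op == "and":
        return iv(f[1], s, R, I) and iv(f[2], s, R, I)
    if op == "or":
        return iv(f[1], s, R, I) or iv(f[2], s, R, I)
    if op == "not":  # unproved at every accessible stage
        return all(not iv(f[1], s2, R, I) for s2 in R[s])
    if op == "imp":  # at every accessible stage: antecedent unproved or consequent proved
        return all((not iv(f[1], s2, R, I)) or iv(f[2], s2, R, I) for s2 in R[s])
    raise ValueError(f"unknown operator: {op}")

def is_i_model(R, I):
    """Check reflexivity, transitivity, and the heredity condition."""
    reflexive = all(s in R[s] for s in R)
    transitive = all(u in R[s] for s in R for t in R[s] for u in R[t])
    heredity = all(I[s] <= I[t] for s in R for t in R[s])
    return reflexive and transitive and heredity

# Two stages: nothing is proved at s0; P has been proved at the later stage s1.
R = {"s0": {"s0", "s1"}, "s1": {"s1"}}
I = {"s0": set(), "s1": {"P"}}

P = ("atom", "P")
excluded_middle = ("or", P, ("not", P))
double_negation = ("not", ("not", P))
dne = ("imp", double_negation, P)

print(is_i_model(R, I))
print(iv(excluded_middle, "s0", R, I), iv(double_negation, "s0", R, I), iv(dne, "s0", R, I))
```

At s0 neither P nor its negation has status 1: P is not yet proved, and it cannot be refuted because it gets proved at the accessible stage s1. The double negation of P does have status 1 at s0, so double-negation elimination has status 0 there as well.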
While it can be helpful to think informally of these models in terms of
proof stages, this should be taken with more than the usual grain of salt.
Intuitionists about mathematics would regard the real existence of a space of all
possible future proof stages as clashing with their anti-platonistic philosophy
of mathematics. Further, intuitionists don't regard mathematical statements
(for example, those of arithmetic) as being about proofs. Finally, not everyone
who employs intuitionistic logic is an intuitionist about mathematics. Officially,
then, the semantics is nothing more than a formal tool, useful for establishing
metalogical facts about section 3.5's proof theory (for example soundness and
completeness; see below).
Nevertheless, the proof-stage heuristic is vivid, so long as it isn't taken too
seriously. In its terms, let's think a bit more about these models. Think of $\mathcal{S}$
as including all possible stages in the construction of mathematical proofs.
Each stage $s$ is associated with a certain collection $\mathrm{Pr}_s$ of proofs: those proofs
you would have come up with, if you were to arrive at that stage. When IV
assigns 1 to a formula at stage $s$, that means that the formula is proved by some
member of $\mathrm{Pr}_s$; that is, the formula is proven as of the stage. 0 means that none of the
proofs in $\mathrm{Pr}_s$ proves the formula. (0 does not mean that the formula is disproven;
perhaps it will be proven in some future stage.)
The holding of the accessibility relation represents which stages are left
open, given what you know at your current stage. We can think of $\mathcal{R}ss'$ as
meaning: if you're in stage $s$, then for all you know, you might subsequently be
in stage $s'$. That is, if you know of the proofs in $\mathrm{Pr}_s$, then for all you know, you
might later come to possess $\mathrm{Pr}_{s'}$ as your set of proofs. At any point in time there
are a number of stages accessible to you; the fewer proofs you've accumulated
so far, the more accessible stages there are. As you accumulate more proofs,
you move into one of these accessible stages.
Given this understanding of accessibility, reflexivity and transitivity are
obviously correct to impose, as is the heredity condition, since (on the somewhat
idealized conception of proof we are operating with) one does not lose proved
information when constructing further proofs. But the accessibility relation will
not in general be symmetric. Suppose that at stage $s$, you don't know whether
you're going to be able to prove $P$. There is an accessible stage $s'$ where you
prove $P$ ($P$ is 1 there), and there are accessible stages (in addition to your own
stage) where you don't prove $P$ ($P$ is 0 there). Now suppose you do in fact prove
$P$, and so you reach stage $s'$. Stage $s$ is then no longer accessible. For now you
have a proof of $P$; and you know that you never lose proved information; so you
know from your $s'$ vantage point that you'll never again be in stage $s$.
Let's look at the conditions for the connectives $\wedge$, $\vee$, $\sim$, and $\to$, in the
definition of IV. Remember that we are thinking of $\mathrm{IV}(\phi, s) = 1$ intuitively as
meaning that $\phi$ is proven at $s$. So, the condition for the $\wedge$, for example, says
that we've proved a conjunction, $\phi \wedge \psi$, at some stage, if and only if we have
proved $\phi$ at that stage and also have proved $\psi$ at that stage. In fact, this is a very
natural thing to say, since it is natural to take a proof of a conjunction $\phi \wedge \psi$ as
consisting of two components, a proof of $\phi$ and a proof of $\psi$. Thus, a natural
conception of what a proof of a conjunction requires meshes with the clause
for $\wedge$ in the definition of IV. The clauses for the other connectives also mesh
with natural conceptions of the natures of proofs involving those connectives:
· a proof of $\phi \vee \psi$ is a proof of $\phi$ or a proof of $\psi$
· a proof of $\sim\phi$ is a construction for turning any proof of $\phi$ into a proof of
a contradiction
· a proof of $\phi \to \psi$ is a construction for turning any proof of $\phi$ into a proof
of $\psi$
Let's look more closely at the truth conditions for $\sim$ and $\to$. A proof of $\sim\phi$,
according to the above conception, is a construction for turning a proof of $\phi$
into a proof of a contradiction. So if you've proved $\sim\phi$ at stage $s$ ($\mathrm{IV}(\sim\phi, s) = 1$),
then at $s$ you have such a construction, so you can rule out future stages in which
you prove $\phi$ (provided you know that your methods of proof are consistent).
And if you have not proved $\sim\phi$ at $s$ ($\mathrm{IV}(\sim\phi, s) = 0$), then you don't then have
any such construction, and so for all you know, you will one day prove $\phi$. So
since $\mathcal{S}$ includes all possible stages in the development of proofs (and by
these we'd better mean all epistemically possible stages, relative to any stage in
$\mathcal{S}$), there must be some $s'$ in which you prove $\phi$, as the valuation
condition for $\sim$ says. As for $\to$: if you have a method for converting any proof
of $\phi$ into a proof of $\psi$, then at no stage in the future could you have a proof of
$\phi$ without having a proof of $\psi$. Conversely, if you lack such a method, then for
all you know, one day you will have a proof of $\phi$ but no proof of $\psi$.
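We can now define intuitionist validity and semantic consequence in the
obvious way:
· $\phi$ is I-valid ("$\vDash_I \phi$") iff $\mathrm{IV}_{\mathcal{M}}(\phi, s) = 1$ for each stage $s$ in each intuitionist
model $\mathcal{M}$.
· $\phi$ is an I-semantic-consequence of $\Gamma$ ("$\Gamma \vDash_I \phi$") iff for every intuitionist
model $\mathcal{M}$ and every stage $s$ in $\mathcal{M}$, if $\mathrm{IV}_{\mathcal{M}}(\gamma, s) = 1$ for each $\gamma \in \Gamma$, then
$\mathrm{IV}_{\mathcal{M}}(\phi, s) = 1$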
I
iff
I
.
Exercise 7.8* Show that intuitionist consequence implies classical
consequence. That is, show that if $\Gamma \vDash_I \phi$ then $\Gamma \vDash_{PL} \phi$.
7.4.2 Examples
Given the semantics just introduced, it's straightforward to demonstrate facts
about validity and semantic consequence.
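Example 7.1: Show that $Q \vDash_I P \to Q$. (I'll omit the qualifier $I$ from
now on.) Take any model and any stage $s$; assume for reductio that $\mathrm{IV}(Q, s) = 1$ and
$\mathrm{IV}(P \to Q, s) = 0$. Thus, for some $s'$, $\mathcal{R}ss'$ and $\mathrm{IV}(P, s') = 1$ and $\mathrm{IV}(Q, s') = 0$.
But this violates heredity, since $\mathrm{IV}(Q, s) = 1$ and $\mathcal{R}ss'$.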
Example 7.2: Show that $P \to Q \vDash \sim Q \to \sim P$. Suppose $\mathrm{IV}(P \to Q, s) = 1$ and
$\mathrm{IV}(\sim Q \to \sim P, s) = 0$. Given the latter, there's some stage $s'$ such that $\mathcal{R}ss'$ and
$\mathrm{IV}(\sim Q, s') = 1$ and $\mathrm{IV}(\sim P, s') = 0$. Given the latter, for some $s''$, $\mathcal{R}s's''$ and
$\mathrm{IV}(P, s'') = 1$. Given the former, $\mathrm{IV}(Q, s'') = 0$. Given transitivity, $\mathcal{R}ss''$. Given
the truth of $P \to Q$ at $s$, either $\mathrm{IV}(P, s'') = 0$ or $\mathrm{IV}(Q, s'') = 1$. Contradiction.
(Thus, what I called contraposition in chapter 2 is intuitionistically correct.
But contraposition is not; see exercise 7.9d.)
It's also straightforward to use the techniques of section 6.3.3 to construct
countermodels.
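Example 7.3: Show that $\nvDash P \vee \sim P$. Here's a model in which $P \vee \sim P$ is
valuated as 0 in stage r:
[Diagram: stage r, at which $P$, $\sim P$, and $P \vee \sim P$ are all 0 (asterisk under $\sim P$), sees stage a, at which $P$ is 1 (asterisk over $P$).]
The official model:
$\mathcal{S} = \{r, a\}$
$\mathcal{R} = \{\langle r, r \rangle, \langle a, a \rangle, \langle r, a \rangle\}$
$\mathcal{I}(P, a) = 1$, all other atomics 0 everywhere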
(I'll skip the official models from now on.) As in section 6.3, we use asterisks
to remind ourselves of commitments that concern other worlds/stages.
The asterisk is under $\sim P$ in stage r because a negation with value 0 carries a
commitment to including some stage at which the negated formula is 1. The
asterisk is over the $P$ in stage a because of the heredity condition: a sentence
letter with value 1 commits us to making that letter 1 in every accessible stage.
(Likewise, negations and conditionals valuated as 1 generate top-asterisks, and
conditionals valuated as 0 generate bottom-asterisks.)
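Example 7.4: Show that $\sim\sim P \nvDash P$. Here is a countermodel:
[Diagram: stage r, at which $\sim\sim P$ is 1 and both $\sim P$ and $P$ are 0, sees stage a, at which $P$ is 1 and $\sim P$ is 0.]
Note: since $\sim\sim P$ is 1 at r, that means that $\sim P$ must be 0 at every stage
that r sees. Now, $\mathcal{R}rr$, so $\sim P$ must be 0 at r. So r must see some stage in
which $P$ is 1. Stage a takes care of that.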
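The countermodels of examples 7.3 and 7.4 can be checked mechanically against the valuation clauses. The following is a minimal sketch, assuming a tuple encoding of formulas and a hypothetical helper `iv`; it uses the official two-stage model given for example 7.3, which also serves for example 7.4.

```python
def iv(f, s, R, I):
    """I-valuation of a formula (nested tuples) at stage s."""
    op = f[0]
    acc = [t for (u, t) in R if u == s]   # stages that s sees
    if op == "atom":
        return I.get((f[1], s), 0)        # unlisted atomics are 0 everywhere
    if op == "and":
        return min(iv(f[1], s, R, I), iv(f[2], s, R, I))
    if op == "or":
        return max(iv(f[1], s, R, I), iv(f[2], s, R, I))
    if op == "not":
        return int(all(iv(f[1], t, R, I) == 0 for t in acc))
    if op == "imp":
        return int(all(iv(f[1], t, R, I) == 0 or iv(f[2], t, R, I) == 1
                       for t in acc))
    raise ValueError(f"unknown connective: {op}")

# Stages r and a; r sees a; P is 1 only at a.
R = {("r", "r"), ("a", "a"), ("r", "a")}
I = {("P", "a"): 1}

P = ("atom", "P")
excluded_middle = ("or", P, ("not", P))
double_negation = ("not", ("not", P))

print(iv(excluded_middle, "r", R, I))                   # 0: example 7.3
print(iv(double_negation, "r", R, I), iv(P, "r", R, I)) # 1 0: example 7.4
```

A check of this kind confirms that a proposed countermodel does what it should; it does not replace the reasoning that finds the countermodel in the first place.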
Exercise 7.9 Establish the following facts.
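a) $\sim(P \wedge Q) \nvDash_I \sim P \vee \sim Q$
b) $\sim P \vee \sim Q \vDash_I \sim(P \wedge Q)$
c) $P \to (Q \vee R) \nvDash_I (P \to Q) \vee (P \to R)$
d) $\sim P \to \sim Q \nvDash_I Q \to P$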
Exercise 7.10* Bolster the conclusion of exercise 3.17 by finding,
for the systems of Łukasiewicz, Kleene, and Priest, and also
supervaluationism, a $\Gamma$ and $\phi$ such that $\Gamma$ semantically implies $\phi$
according to the system but not according to intuitionist semantics.
7.4.3 Soundness
Recall our proof system for intuitionism from section 3.5. What I'd like to
do next is show that that proof system is sound, relative to our semantics for
intuitionism. One can prove completeness as well, but we won't do that here.[3]
First we'll need to prove an intermediate theorem:
Generalized heredity: The heredity condition holds for all formulas. That
is, for any wff $\phi$, whether atomic or no, and any stage, $s$, in any intuitionist
model, if $\mathrm{IV}(\phi, s) = 1$ and $\mathcal{R}ss'$ then $\mathrm{IV}(\phi, s') = 1$.
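Proof. The proof is by induction. The base case is just the official heredity
condition. Next we make the inductive hypothesis (ih): heredity is true for
formulas $\phi$ and $\psi$; we must now show that heredity also holds for $\sim\phi$, $\phi \to \psi$,
$\phi \wedge \psi$, and $\phi \vee \psi$. I'll do this for $\phi \wedge \psi$, and leave the rest as exercises.
$\wedge$: Suppose for reductio that $\mathrm{IV}(\phi \wedge \psi, s) = 1$, $\mathcal{R}ss'$, and $\mathrm{IV}(\phi \wedge \psi, s') = 0$.
Given the former, $\mathrm{IV}(\phi, s) = 1$ and $\mathrm{IV}(\psi, s) = 1$. By (ih), $\mathrm{IV}(\phi, s') = 1$ and
$\mathrm{IV}(\psi, s') = 1$, so $\mathrm{IV}(\phi \wedge \psi, s') = 1$: contradiction.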
Now for soundness. What does soundness mean in the present context?
The proof system in section 3.5 is a proof system for sequents, not individual
formulas. So first, we need a notion of intuitionist validity for sequents.
Definition of sequent I-validity: Sequent $\Gamma \Rightarrow \phi$ is intuitionistically valid
("I-valid") iff $\Gamma \vDash_I \phi$
We can now formulate soundness:
Soundness for intuitionism: Every intuitionistically provable sequent is I-valid
Proof. This will be an inductive proof. Since a provable sequent is the last
sequent in any proof, all we need to show is that every sequent in any proof
is I-valid. And to do that, all we need to show is that the rule of assumptions
generates I-valid sequents (base case), and all the other rules preserve I-validity
(induction step). For any set, $\Gamma$, I-model $\mathcal{M}$, and stage $s$, let's write "$\mathrm{IV}_{\mathcal{M}}(\Gamma, s) = 1$"
to mean that $\mathrm{IV}_{\mathcal{M}}(\gamma, s) = 1$ for each $\gamma \in \Gamma$.
Base case: the rule of assumptions generates sequents of the form $\phi \Rightarrow \phi$,
which are clearly I-valid.
Induction step: we show that the other sequent rules from section 3.5
preserve I-validity.
[3] See Kripke (); Priest (, section .), although their proof systems are of the truth
tree variety.
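$\wedge$I: Here we assume that the inputs to $\wedge$I are I-valid, and show that its
output is I-valid. That is, we assume that $\Gamma \Rightarrow \phi$ and $\Delta \Rightarrow \psi$ are I-valid
sequents, and we must show that it follows that $\Gamma, \Delta \Rightarrow \phi \wedge \psi$ is also I-valid.
So suppose otherwise for reductio. Then $\mathrm{IV}(\Gamma \cup \Delta, s) = 1$ and $\mathrm{IV}(\phi \wedge \psi, s) = 0$,
for some stage $s$ in some I-model. Since $\Gamma \Rightarrow \phi$ is I-valid, $\mathrm{IV}(\phi, s) = 1$ (we
know that $\mathrm{IV}(\Gamma \cup \Delta, s) = 1$, i.e., all members of $\Gamma \cup \Delta$ are 1 at $s$ in the model
we're discussing; so all members of $\Gamma$ are 1 at $s$ in this model, so since $\Gamma \Rightarrow \phi$ is
I-valid, $\phi$ is 1 at $s$ in this model.) Similarly, since $\Delta \Rightarrow \psi$ is I-valid, $\mathrm{IV}(\psi, s) = 1$.
Contradiction.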
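$\vee$E: Assume that i) $\Gamma \Rightarrow \phi \vee \psi$, ii) $\Delta_1, \phi \Rightarrow \chi$, and iii) $\Delta_2, \psi \Rightarrow \chi$ are all
I-valid, and suppose for reductio that $\Gamma, \Delta_1, \Delta_2 \Rightarrow \chi$ is I-invalid. So
$\mathrm{IV}(\Gamma \cup \Delta_1 \cup \Delta_2, s) = 1$ but $\mathrm{IV}(\chi, s) = 0$, for some stage $s$ in some model. Since sequent
i) is I-valid, $\mathrm{IV}(\phi \vee \psi, s) = 1$, so either $\phi$ or $\psi$ is 1 at $s$. If the former then
by the I-validity of ii), $\mathrm{IV}(\chi, s) = 1$; if the latter then by the I-validity of iii),
$\mathrm{IV}(\chi, s) = 1$. Either way, we have a contradiction.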
I leave the proof that the remaining rules preserve I-validity as an exercise.
I can now justify an assertion I made, but did not prove, in section 3.5. I
asserted there that the sequent $\varnothing \Rightarrow P \vee \sim P$ is not intuitionistically provable.
Given the soundness proof, to demonstrate that a sequent is not intuitionistically
provable, it suffices to show that its premises do not I-semantically-imply
its conclusion. But in example 7.3 we showed that $\nvDash P \vee \sim P$, which is equivalent
to saying that $\varnothing \nvDash P \vee \sim P$.
Similarly, we showed in example 7.4 that $\sim\sim P \nvDash P$. Thus, by the soundness
theorem, the sequent $\sim\sim P \Rightarrow P$ isn't provable. (Recall how, in constructing
our proof system for intuitionism in section 3.5, we dropped the rule of
double-negation elimination.)
Exercise 7.11 Of the PL-axiom schemas (section 2.6), which are
intuitionistically acceptable (i.e., which have only I-valid instances)?
Exercise 7.12 Complete the proof of generalized heredity.
Exercise 7.13 Complete the soundness proof by showing that $\wedge$E,
$\vee$I, DNI, RAA, $\to$I, $\to$E, and EF preserve I-validity.
Chapter 8
Counterfactuals
There are certain conditionals in natural language that are not
well-represented either by propositional logic's material conditional or by
modal logic's strict conditional. In this chapter we consider counterfactual
conditionals: conditionals that (loosely speaking) have the form:
If it had been that $\phi$, it would have been that $\psi$
For instance:
If I had struck this match, it would have lit
The counterfactuals that we typically utter have false antecedents (hence
the name), and are phrased in the subjunctive mood. It is common to distinguish
counterfactuals from conditionals phrased in the indicative mood. A
famous example illustrates the apparent semantic difference: the counterfactual
conditional "If Oswald hadn't shot Kennedy, someone else would have"
is false (assuming that certain conspiracy theories are false and Oswald was
acting alone); but the indicative conditional "If Oswald didn't shoot Kennedy
then someone else did" is true (we know that someone shot Kennedy, so if it
wasn't Oswald, it must have been someone else). The semantics of indicative
conditionals is an important topic in its own right (since they too seem not to
be well-represented by the material or strict conditional), but we won't take up
that topic here.[1]
We symbolize the counterfactual with antecedent $\phi$ and consequent $\psi$ thus:
$\phi \mathrel{\Box\!\!\to} \psi$. What should the logic of this new connective be?
[1] For a good overview see Edgington ().
8.1 Natural language counterfactuals
Well, let's have a look at how natural language counterfactuals behave. Our
survey will provide guidance for our main task: developing a semantics for $\mathrel{\Box\!\!\to}$.
As we'll see, counterfactuals behave very differently from both material and
strict conditionals.
8.1.1 Antecedents and consequents
Our system for counterfactuals should have the following features:
$\sim P \nvDash P \mathrel{\Box\!\!\to} Q$
$Q \nvDash P \mathrel{\Box\!\!\to} Q$
For consider: I did not strike the match; but it doesn't logically follow that
if I had struck the match, it would have turned into a feather. So if $\mathrel{\Box\!\!\to}$ is
to represent "if it had been that..., it would have been that...", $\sim P$ should
not semantically imply $P \mathrel{\Box\!\!\to} Q$. Similarly, George W. Bush (somehow) won
the United States presidential election, but it doesn't follow that if the
newspapers had discovered beforehand that Bush had an affair with Al Gore,
he would still have won. So our semantics had better not count $P \mathrel{\Box\!\!\to} Q$ as a
semantic consequence of $Q$ either.
(Relatedly, counterfactuals aren't truth-functional. For example, the
counterfactuals "If I had struck the match, it would have turned into a feather" and
"If I had struck the match, it would have lit" both have false antecedents and
false consequents; but they differ in truth value.)
Like counterfactuals, strict conditionals are not in general implied by the
falsity of their antecedents or the truth of their consequents (in any modal
system). The material conditional, however, is implied by the truth of its
consequent or the falsity of its antecedent (and it's truth-functional). We have
our first logical difference between counterfactual and material conditionals.
8.1.2 Can be contingent
In the actual world, since there was no conspiracy, it's not true that if Oswald
hadn't shot Kennedy, someone else would have. But in a possible world in
which there is a conspiracy and Oswald has a backup, it presumably is true
that if Oswald hadn't shot Kennedy, someone else would have. Thus, our
logic should allow counterfactuals to be contingent statements. Just because a
counterfactual is true, it should not follow logically that it is necessarily true;
and just because a counterfactual is false, it should not follow logically that it
is necessarily false. That is, our semantics for $\mathrel{\Box\!\!\to}$ should have the following
features:
$P \mathrel{\Box\!\!\to} Q \nvDash \Box(P \mathrel{\Box\!\!\to} Q)$
$\sim(P \mathrel{\Box\!\!\to} Q) \nvDash \Box\sim(P \mathrel{\Box\!\!\to} Q)$
This places an obstacle (though not the most important one) to using the
strict conditional to represent natural language counterfactuals. Given the
definition of $\phi \strictif \psi$ as $\Box(\phi \to \psi)$, it's easy to check that
$\phi \strictif \psi \vDash_{S4} \Box(\phi \strictif \psi)$ and
$\sim(\phi \strictif \psi) \vDash_{S5} \Box\sim(\phi \strictif \psi)$. So if the logic of the $\Box$ is at least as strong as S4, we
have a logical mismatch between counterfactuals and the $\strictif$.
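The check can be spelled out as follows. This is a sketch that assumes the standard characteristic principles of S4 ($\Box\chi \to \Box\Box\chi$) and S5 ($\Diamond\chi \to \Box\Diamond\chi$), and it writes the strict conditional with `\strictif`, a symbol supplied by the txfonts package:

```latex
\begin{align*}
\phi \strictif \psi
  &\;=\; \Box(\phi \to \psi)
     && \text{definition of } \strictif \\
  &\;\vDash_{S4}\; \Box\Box(\phi \to \psi)
     && \Box\chi \to \Box\Box\chi \\
  &\;=\; \Box(\phi \strictif \psi) \\[6pt]
\sim(\phi \strictif \psi)
  &\;=\; \sim\Box(\phi \to \psi)
   \;\equiv\; \Diamond\sim(\phi \to \psi)
     && \text{duality of } \Box \text{ and } \Diamond \\
  &\;\vDash_{S5}\; \Box\Diamond\sim(\phi \to \psi)
     && \Diamond\chi \to \Box\Diamond\chi \\
  &\;\equiv\; \Box\sim\Box(\phi \to \psi)
   \;=\; \Box\sim(\phi \strictif \psi)
\end{align*}
```

So a true strict conditional is necessarily true once the 4 principle is available, and a false one is necessarily false once the 5 principle is available, which is exactly what natural language counterfactuals should not validate.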
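8.1.3 No augmentation
The $\to$ and the $\strictif$ (in all systems) obey the argument form augmentation:
$$\frac{\phi \to \psi}{(\phi \wedge \chi) \to \psi} \qquad \frac{\phi \strictif \psi}{(\phi \wedge \chi) \strictif \psi}$$
That is, $\phi \to \psi \vDash_{PL} (\phi \wedge \chi) \to \psi$ and $\phi \strictif \psi \vDash_K (\phi \wedge \chi) \strictif \psi$. However, natural
language counterfactuals famously do not obey augmentation. Consider:
If I had struck the match, it would have lit.
Therefore, if I had struck the match and had been in
outer space, it would have lit.
The premise is true and the conclusion is false. We have another desideratum
for our semantics for counterfactuals: it should turn out that
$P \mathrel{\Box\!\!\to} Q \nvDash (P \wedge R) \mathrel{\Box\!\!\to} Q$.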
8.1.4 No contraposition
$\to$ and $\strictif$ obey contraposition:
$$\frac{\phi \to \psi}{\sim\psi \to \sim\phi} \qquad \frac{\phi \strictif \psi}{\sim\psi \strictif \sim\phi}$$
But counterfactuals do not. Suppose I'm on a firing squad that executes a victim.
The only chance the victim had for survival was for all of our guns to jam; but
unfortunately for him, none of the guns jammed. Now consider:
If my gun had jammed, the victim would (still) have died.
Therefore, if the victim had not died, my gun would
not have jammed
The premise is true (there's no reason to suppose that the other guns would
have jammed if mine had) but the conclusion is false (if the victim had not died,
all of the squad's guns would have jammed).
8.1.5 Some implications
Here is an argument form that intuitively should hold for the $\mathrel{\Box\!\!\to}$:
$$\frac{\phi \mathrel{\Box\!\!\to} \psi}{\phi \to \psi}$$
The counterfactual conditional should imply the material conditional.[2] We
can argue for this contrapositively: if the material conditional $\phi \to \psi$ is false,
then $\phi$ is true and $\psi$ is false. But surely a counterfactual with a true antecedent
and false consequent is false.
Also, the strict conditional arguably should imply the counterfactual:
$$\frac{\phi \strictif \psi}{\phi \mathrel{\Box\!\!\to} \psi}$$
For if $\phi$ entails (necessitates) $\psi$, then, it seems, if $\phi$ had been true, $\psi$ would
have to have been true as well.
8.1.6 Context dependence
Years ago, a few of us were at a restaurant in NY: Red Smith, Frank
Graham, Allie Reynolds, Yogi [Berra] and me. At about . p.m., Ted
[Williams] walked in helped by a cane. Graham asked us what we thought
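[2] $\mathrel{\Box\!\!\to}$ will then obey modus ponens and modus tollens since $\to$ obeys both. That is, we'll
have $\phi, \phi \mathrel{\Box\!\!\to} \psi \vDash \psi$ and $\sim\psi, \phi \mathrel{\Box\!\!\to} \psi \vDash \sim\phi$.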
Ted would hit if he were playing today. Allie said, due to the better
equipment probably about .. Red Smith said. About .. I said,
due to the lack of really great pitching about .. Yogi said, .. We
all jumped up and I said, "You're nuts, Yogi! Ted's lifetime average is .."
"Yeah," said Yogi, "but he is years old."
Buzzie Bavasi, baseball executive.
Who was right? If Ted Williams had played at the time the story was told,
would he or wouldn't he have hit over .?
Clearly, there's no single correct answer. The first respondents were imagining
Williams playing as a young man. Understood that way, the answer is, no
doubt: yes, he would have hit over .. But Berra took the question a different
way: he was imagining Williams hitting as he was then: a year old man.
Berra took the others off guard, by deliberately (? this is Yogi Berra we're
talking about) shifting how the question was construed, but he didn't make a
semantic mistake in so doing. It's perfectly legitimate, in other circumstances
anyway, to take the question in Berra's way. (Imagine Williams muttering to
himself at the time: "These punks nowadays! If I were playing today, I'd still
hit over .!") Counterfactual sentences can be interpreted in different ways
depending on the conversational context in which they are uttered.
Another example:
If Syracuse had been located in Louisiana, Syracuse
winters would have been warm.
True or false? It might seem true: Louisiana is in the south. But wait: perhaps
Louisiana would have included Syracuse by having its borders extend north to
Syracuse's actual latitude.
Would Syracuse have been warm in the winter? Would Williams have hit
over .? No single answer is correct, once and for all. Which answer is
correct depends on the linguistic context. Whether a counterfactual is true or
whether it is false depends in part on what the speaker means to be saying, and
on what her audience takes her to be saying, when she utters the counterfactual.
When we consider the counterfactual hypothesis that Syracuse is located in
Louisiana, we imagine reality having been different in certain respects from
actuality. In particular, we imagine Syracuse having been in Louisiana. But we
don't imagine reality having been different in any old way: we don't imagine
Syracuse and Louisiana both being located in China. We hold certain things
constant (Syracuse and Louisiana not being in China) while varying others.
The question then arises: what parts of reality, exactly, do we hold constant?
In the Syracuse-Louisiana case, we seem to have at least two choices. Do we
hold constant the location of Syracuse, or do we hold constant the borders of
Louisiana? The truth value of the counterfactual depends on which choice we
make.
What determines which things are to be held constant, when we evaluate
the truth value of a counterfactual? In large part: the context of utterance of
the counterfactual. Suppose I am in the middle of the following conversation:
"Syracuse restaurants struggle to survive because the climate there is so bad:
no one wants to go out to eat in the winter. If Syracuse had been located in
Louisiana, its restaurants would have done much better." In such a context,
an utterance of the counterfactual "If Syracuse had been located in Louisiana,
Syracuse winters would have been warm" would be regarded as true. But if
this counterfactual were uttered in the midst of the following conversation, it
would be regarded as false: "You know, Louisiana is statistically the warmest
state in the country. Good thing Syracuse isn't located in Louisiana, because
that would have ruined the statistic."
Does just saying a sentence, intending it to be true, make it true? Well,
sort of! When a sentence has a meaning that is partly determined by context,
then when a person utters that sentence with the intention of saying something
true, that tends to create a context in which the sentence is true. In ordinary
circumstances, if you look at your kitchen table and say "that table is flat", you
would say something true. But suppose a scientist walked into your kitchen
and said: "you know, macroscopic objects are far from being flat. Take that
table, for instance. It isn't flat at all; when viewed under a microscope, it can
be seen to have a very irregular surface." You'd take the scientist to be saying
something true as well. Indeed, you'd go along with her and say yourself: "that
table is not flat". (In saying this you wouldn't take yourself to be contradicting
your earlier utterance of "that table is flat". You meant something different
earlier.) The term "flat" can mean different things depending on how strict the
standards are for counting as "flat". What the standards are depends on the
conversational context, and when the scientist made her speech, you and she
adopted standards under which what she said came out true.[3]
[3] See Lewis ().
8.2 The Lewis/Stalnaker theory
What do counterfactuals mean? What are their truth conditions? David Lewis
(a) and Robert Stalnaker () give versions of the following answer. To
determine whether a counterfactual $P \mathrel{\Box\!\!\to} Q$ is true, we must consider all the
possible worlds in which $P$ is true, and find the one that is most similar to the
actual world. $P \mathrel{\Box\!\!\to} Q$ is true in the actual world if and only if $Q$ is true in that
most similar world. Consider Lewis's example:
If kangaroos had no tails, they would topple over.
When we consider the possible world that would be actual if kangaroos had no
tails, we do not depart gratuitously from actuality. We do not consider a world
in which kangaroos have wings, or crutches. We do not consider a world with
different laws of nature, in which there is no gravity. We keep the kangaroos
and the laws of nature as similar as we can to how they actually are (while still
removing the tails). It seems that the kangaroos would then fall over.
In the previous section we saw how one and the same counterfactual sen-
tence can have different truth values in different contexts. On the Lewis-
Stalnaker view, this context dependence results from the fact that the similarity
relation mentioned in the truth conditions for counterfactuals varies from
context to context.
To clarify this point, let's think generally about similarity. Things can be
similar in certain respects but not in others. A blue square is similar to a blue
circle in respect of color, not in respect of shape. What happens when you
compare objects that differ in multiple respects? Is a blue square more like a
blue circle or a red square? There's clearly no once-and-for-all answer. If we
grant more importance to similarity in color than to similarity in shape, then the
blue square is more like the blue circle; if we grant more importance to shape
then the blue square is more like the red square. Put another way: a similarity
relation that weights shape more heavily counts the red square as being more
similar, whereas a similarity relation that weights color more heavily counts
the blue circle as being more similar. The multiplicity of similarity relations
only increases when we move to possible world similarity. When comparing
entire possible worlds, there are a vast number of respects of similarity, and so
there is room for many, many similarity relations, differing from one another
over the relative weights assigned to different respects of comparison.
Return, now, to the example of context dependence from the previous
section:
If Syracuse had been located in Louisiana, Syracuse
winters would have been warm.
When we affirm this counterfactual, according to Lewis and Stalnaker we
are using a similarity relation that weights heavily Louisiana's actual borders.
Under this similarity relation, the possible world most similar to actuality is
one in which Syracuse has moved south. When we reject the counterfactual,
we are using a similarity relation that weights Syracuse's actual location more
heavily; under this similarity relation, the most similar world is one in which
Louisiana extends north.
8.3 Stalnaker's system
Lewis and Stalnaker give different semantic systems for counterfactuals. Each
system is based on the intuitive idea described in the previous section, but the
systems differ over details. I'll begin with Stalnaker's system (since it's simpler).[4]
8.3.1 Syntax of SC
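The primitive vocabulary of SC is that of propositional modal logic, plus the
connective $\mathrel{\Box\!\!\to}$. Here's the grammar:
Definition of wff:
· Sentence letters are wffs
· if $\phi$, $\psi$ are wffs then $(\phi \to \psi)$, $\sim\phi$, $\Box\phi$, and $(\phi \mathrel{\Box\!\!\to} \psi)$ are wffs
· nothing else is a wff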
8.3.2 Semantics of SC
Where $R$ is a three-place relation, let's abbreviate "$Rxyz$" as "$R_z xy$". And,
where $u$ is any object, let "$R_u$" be the two-place relation that holds between
objects $x$ and $y$ iff $R_u xy$. (Think of $R_u$ as the two-place relation that results
from plugging up one place of the three-place relation $R$ with object $u$.)
Here are the definitions of an SC-model and its valuation function
(SC-validity and SC-semantic consequence are then defined in the usual way):
[4] See Stalnaker (). The version of the theory I present here is slightly different from
Stalnaker's original version; see Lewis (a, p. ).
Definition of model: An SC-model, $\mathcal{M}$, is an ordered triple $\langle \mathcal{W}, \preceq, \mathcal{I} \rangle$,
where:
· $\mathcal{W}$ is a nonempty set ("worlds")
· $\mathcal{I}$ is a two-place function that assigns either 0 or 1 to each sentence letter
relative to each $w \in \mathcal{W}$ ("interpretation function")
· $\preceq$ is a three-place relation over $\mathcal{W}$ ("nearness relation")
· The valuation function $V_{\mathcal{M}}$ for $\mathcal{M}$ (see below) and $\preceq$ satisfy the following
conditions:
  · for any $w$, $\preceq_w$ is strongly connected in $\mathcal{W}$
  · for any $w$, $\preceq_w$ is transitive
  · for any $w$, $\preceq_w$ is anti-symmetric
  · for any $x$, $y$, $x \preceq_x y$ ("Base")
  · for any SC-wff, $\phi$, provided $V_{\mathcal{M}}(\phi, v) = 1$ for at least one $v \in \mathcal{W}$,
  then for every $z$, there's some $w$ such that $V_{\mathcal{M}}(\phi, w) = 1$, and such
  that for any $x$, if $V_{\mathcal{M}}(\phi, x) = 1$ then $w \preceq_z x$ ("Limit")
(A binary relation $R$ is strongly connected in set $A$ iff for each $u, v \in A$, either
$Ruv$ or $Rvu$, and anti-symmetric iff $u = v$ whenever both $Ruv$ and $Rvu$.)
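Definition of valuation: Where $\mathcal{M}$ ($= \langle \mathcal{W}, \preceq, \mathcal{I} \rangle$) is any SC-model, the
SC-valuation for $\mathcal{M}$, $V_{\mathcal{M}}$, is defined as the two-place function that assigns
either 0 or 1 to each SC-wff relative to each member of $\mathcal{W}$, subject to the
following constraints, where $\alpha$ is any sentence letter, $\phi$ and $\psi$ are any wffs, and
$w$ is any member of $\mathcal{W}$:
$V_{\mathcal{M}}(\alpha, w) = \mathcal{I}(\alpha, w)$
$V_{\mathcal{M}}(\sim\phi, w) = 1$ iff $V_{\mathcal{M}}(\phi, w) = 0$
$V_{\mathcal{M}}(\phi \to \psi, w) = 1$ iff either $V_{\mathcal{M}}(\phi, w) = 0$ or $V_{\mathcal{M}}(\psi, w) = 1$
$V_{\mathcal{M}}(\Box\phi, w) = 1$ iff for any $v$, $V_{\mathcal{M}}(\phi, v) = 1$
$V_{\mathcal{M}}(\phi \mathrel{\Box\!\!\to} \psi, w) = 1$ iff for any $x$, IF [$V_{\mathcal{M}}(\phi, x) = 1$ and for any $y$ such that
$V_{\mathcal{M}}(\phi, y) = 1$, $x \preceq_w y$] THEN $V_{\mathcal{M}}(\psi, x) = 1$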
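The clause for the counterfactual is easier to absorb when run on a small model. The sketch below rests on several assumptions of its own: formulas are nested tuples, each world's nearness relation is given as a ranking list with the world itself first (which secures strong connectivity, transitivity, anti-symmetry, and base, while finiteness secures limit), and the names `near`, `v`, `order`, and the three worlds are hypothetical. The conjunction clause is the derived one, included for convenience.

```python
def near(order, x, y, w):
    """x is at least as near to w as y is, according to w's ranking list."""
    return order[w].index(x) <= order[w].index(y)

def v(f, w, W, order, I):
    """SC-valuation of formula f at world w."""
    op = f[0]
    if op == "atom":
        return I[(f[1], w)]
    if op == "not":
        return 1 - v(f[1], w, W, order, I)
    if op == "and":   # derived connective
        return min(v(f[1], w, W, order, I), v(f[2], w, W, order, I))
    if op == "imp":
        return int(v(f[1], w, W, order, I) == 0 or v(f[2], w, W, order, I) == 1)
    if op == "box":   # true at all worlds; no accessibility relation
        return int(all(v(f[1], u, W, order, I) == 1 for u in W))
    if op == "cf":    # counterfactual: consequent true at nearest antecedent-worlds
        ante = [x for x in W if v(f[1], x, W, order, I) == 1]
        nearest = [x for x in ante if all(near(order, x, y, w) for y in ante)]
        return int(all(v(f[2], x, W, order, I) == 1 for x in nearest))
    raise ValueError(f"unknown connective: {op}")

# Three worlds for the match example: S = struck, O = in outer space, L = lit.
W = ["actual", "strike", "space"]
order = {"actual": ["actual", "strike", "space"],
         "strike": ["strike", "actual", "space"],
         "space":  ["space", "strike", "actual"]}
truths = {"actual": set(), "strike": {"S", "L"}, "space": {"S", "O"}}
I = {(p, w): int(p in truths[w]) for p in "SOL" for w in W}

S, O, L = ("atom", "S"), ("atom", "O"), ("atom", "L")
print(v(("cf", S, L), "actual", W, order, I))              # 1
print(v(("cf", ("and", S, O), L), "actual", W, order, I))  # 0
```

In this model the nearest struck-match world is one where the match lights, while the nearest world where it is struck in outer space is one where it does not, so augmentation fails as section 8.1.3 requires.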
Phew! Let's look into what this means.
First, notice that much here is the same as with MPL. A model still has a set
of worlds, and an interpretation function that assigns truth values to sentence
letters relative to worlds. As before, a valuation function then assigns truth
values to complex wffs relative to worlds. The propositional connectives $\to$
and $\sim$ have their usual truth conditions (so the derived clauses for $\wedge$, $\vee$, and
$\leftrightarrow$ remain the same.)
What happened to the accessibility relation? I've dropped it for simplicity's
sake. The truth condition for $\Box\phi$ is now just that $\phi$ is true at all worlds. (This
is equivalent to including an accessibility relation but requiring it to be total,
which generates an S5 modal logic for the $\Box$ as noted in section 6.3.1.) The
derived truth condition for the $\Diamond$ is then:
$V_{\mathcal{M}}(\Diamond\phi, w) = 1$ iff for some $v$, $V_{\mathcal{M}}(\phi, v) = 1$
Next, what about this nearness relation $\preceq$? Think of $x \preceq_z y$ as meaning that
possible world $x$ is at least as similar to ("near to") world $z$ as is world $y$; thus,
think of $\preceq$ as the similarity relation between possible worlds that we talked
about before. To decide whether $x \preceq_w y$, place yourself in possible world $w$,
and ask which possible world is more similar to yours, $x$ or $y$.
(An option I won't pursue would be to represent context dependence by
introducing a set $\mathcal{C}$ of contexts of utterance and multiple nearness relations
$\preceq_1$, $\preceq_2$, ... into the models. We could then relativize truth values to contexts
(members of $\mathcal{C}$), allowing which nearness relation determines the truth conditions
of $\mathrel{\Box\!\!\to}$ to depend on the context.)
I say we can think of $\preceq$ as a similarity relation, but take this with a grain
of salt. As I keep emphasizing, model theory isn't metaphysics. Just as our
definitions allow the members of $\mathcal{W}$ to be any old things, so, $\preceq$ is allowed to
be any old relation over $\mathcal{W}$. Just as the members of $\mathcal{W}$ could be fish, so the $\preceq$
relation could be any old relation over fish. (But as before, if the truth conditions
for natural language counterfactuals have nothing to do with real possible
worlds and similarity then the interest of our semantics is diminished, since the
models won't be modeling the semantics of natural language counterfactuals.)
The constraints on the formal properties of $\preceq$ (some of them, anyway)
seem plausible if $\preceq$ is to be thought of as a similarity relation. Strong connectivity
says that any two worlds can be compared for similarity to a given world.
Transitivity has a transparent meaning. Anti-symmetry prohibits ties: it says
that two distinct worlds cannot each be at least as close to a given world $w$ as
the other. The "base" constraint says that every world is at least as close to
itself as is every other. (Given anti-symmetry, each world must then be closer to
itself than any other world is, where "$x$ is closer to $w$ than $y$ is" ($x \prec_w y$) means
that $x \preceq_w y$ and not: $y \preceq_w x$.) Finally, the "limit" assumption says that there's
always a closest $\phi$-world. That is, no matter what world $w$ you're in, for
any wff $\phi$ there will always be some world $x$ in which $\phi$ is true that is at least
as close to your world as is any other $\phi$ world (unless $\phi$ isn't true in any worlds
at all). The limit assumption prohibits the following: there are no closest
$\phi$-worlds, only an infinite sequence of closer and closer $\phi$-worlds. (Notice that
the limit assumption automatically holds whenever there are only finitely many
worlds. So when we start constructing countermodels, if they're finite then we
won't need to separately verify that they satisfy the limit assumption.) Some of
these assumptions have been challenged, especially anti-symmetry and limit.
We will consider these challenges below.
Note how the limit assumption refers to the valuation function. (MPL
models, by contrast, are defined without reference to the valuation function.)
The limit assumption is a constraint that relates the nearness relation to the
truth values of all formulas, complex or otherwise: it says that any formula
that is true somewhere is true in some closest-to-w world. (See exercise 8.1.)
Exercise 8.1* Could we have stated an "official" limit assumption
just for atomics, and then proved a derived limit assumption for
complex wffs (as with heredity in the semantics for intuitionistic
logic)?
8.4 Validity proofs in SC
Here are some examples of semantic validity proofs in Stalnaker's system.
$\vDash_{SC} (P \land Q) \to (P \boxright Q)$. Where
$\langle \mathcal{W}, \preceq, \mathcal{I} \rangle$ is any SC-model and $r$ is
any world in $\mathcal{W}$:
i) Suppose for reductio that $V((P \land Q) \to (P \boxright Q), r) = 0$. Then
$V(P \land Q, r) = 1$ and
ii) $V(P \boxright Q, r) = 0$. Now, the truth condition for $\boxright$ says
that $P \boxright Q$ is true at $r$ iff $Q$ is true at every closest-to-$r$
$P$-world. So since $P \boxright Q$ is false at $r$, there must be a
closest-to-$r$ $P$-world at which $Q$ is false; that is, there is some world
$a$ such that:
a) $V(P, a) = 1$
b) for any $x$, if $V(P, x) = 1$ then $a \preceq_r x$
c) $V(Q, a) = 0$
iii) From line i), $V(P, r) = 1$. So given b), $a \preceq_r r$. By "base",
$r \preceq_r a$. So, by anti-symmetry, $r = a$. Since $V(Q, r) = 1$ by i), we
have $V(Q, a) = 1$, contradicting c).
$\vDash_{SC} [(P \boxright Q) \land ((P \land Q) \boxright R)] \to [P \boxright R]$.
(This formula is worth taking note of, because it is valid despite its
similarity to the invalid formula
$[(P \boxright Q) \land (Q \boxright R)] \to [P \boxright R]$; see below):
i) Suppose for reductio that the formula is false at some world $r$ in some
SC-model. Then (given the truth conditions for $\to$ and $\land$)
$V(P \boxright Q, r) = 1$ and
ii) $V((P \land Q) \boxright R, r) = 1$ but
iii) $V(P \boxright R, r) = 0$. So some $a$ is a nearest-to-$r$ $P$ world, and
$V(R, a) = 0$.
iv) By i), $Q$ is true in all nearest-to-$r$ $P$ worlds, and so $V(Q, a) = 1$.
v) Note now that $a$ is a nearest-to-$r$ $P \land Q$ world:
a) By lines iii) and iv), $V(P \land Q, a) = 1$.
b) If $V(P \land Q, x) = 1$ then $a \preceq_r x$, for any world $x$. For
$V(P, x) = 1$ since $V(P \land Q, x) = 1$; but then by iii), $a \preceq_r x$.
(Remember: "$a$ is a nearest-to-$r$ $P$ world" means: $V(P, a) = 1$, and for
any $x$, if $V(P, x) = 1$ then $a \preceq_r x$.)
vi) So by ii) and v), $V(R, a) = 1$, contradicting iii).
Exercise 8.2 Show that the counterfactual is intermediate in
strength between the strict and material conditionals; i.e., that:
a) $\phi \strictif \psi \vDash_{SC} \phi \boxright \psi$
b) $\phi \boxright \psi \vDash_{SC} \phi \to \psi$
8.5 Countermodels in SC
In this section we'll learn how to construct countermodels in SC. Along the
way we'll also look at how to decide whether a given formula is SC-valid or
SC-invalid. As with plain old modal logic, the best strategy is to attempt to
come up with a countermodel. If you fail, you can use your failed attempt to
guide the construction of a validity proof.
We can use diagrams like those from section 6.3.3 to represent
SC-countermodels. The diagrams will be a little different though. They will
still contain boxes (rounded now, to distinguish them from the old
countermodels) in which we put formulas; and we again indicate truth values of
formulas with small numbers above the formulas. But since there is no
accessibility relation, we don't need the arrows between the boxes. And since
we need to represent the nearness relation, we will arrange the boxes
vertically. At the bottom goes a box for the world, $r$, of our model in which
we're trying to make a given formula false. We string the other worlds in the
diagram above this bottom world $r$: the further away a world is from $r$ in
the $\preceq_r$ ordering, the further above $r$ we place it in the diagram.
Thus, a countermodel for the formula $\sim P \to (P \boxright Q)$ might look as
follows:
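[Diagram: the view from $r$, nearest world at the bottom]
  world b:  P = 1, Q = 1
  world a:  P = 1, Q = 0
  world r:  P = 0; $\sim P$ = 1, $P \boxright Q$ = 0, $\sim P \to (P \boxright Q)$ = 0
  "no P" zone: from r up to (but not including) a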
In this diagram, the world we're primarily focusing on is the bottom world,
world $r$. The nearest world to $r$ is world $r$ itself. The next nearest world
to $r$ is the next world moving up from the bottom: world $a$. The furthest
world from $r$ is world $b$. Notice that $P$ is false at world $r$, and true at
worlds $a$ and $b$. Thus, $a$ is the nearest world to $r$ in which $P$ is true.
Since $Q$ is false at world $a$, that makes the counterfactual $P \boxright Q$
false at world $r$. Since $\sim P$ is true and $P \boxright Q$ is false at $r$,
the material conditional $\sim P \to (P \boxright Q)$ is false at $r$, as
desired. (World $b$ isn't needed in this countermodel; I included it merely for
illustration.) The "no P" sign to the left of worlds $a$ and $r$ is a reminder
to ourselves in case we want to add further worlds to the diagram later on. It
reminds us not to put any $P$ worlds between $a$ and $r$. Otherwise world $a$
would no longer be the nearest $P$ world.
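The reasoning just given can be replayed mechanically. Here is a minimal sketch of an evaluator for finite SC-models, under assumptions of my own: formulas are nested tuples, the nearness relation for a world is stored as a list of all worlds from nearest to farthest (which builds in anti-symmetry, base and strong connectivity), and the names `V`, `"near"` and `"true"` are hypothetical.

```python
def V(f, w, M):
    """Truth value (True/False) of formula f at world w in model M.

    A string is a sentence letter; compound formulas are tuples:
    ("not", f), ("and", f, g), ("or", f, g), ("imp", f, g), ("cf", f, g),
    where "cf" stands for the counterfactual conditional.
    """
    if isinstance(f, str):
        return (f, w) in M["true"]
    op = f[0]
    if op == "not":
        return not V(f[1], w, M)
    if op == "and":
        return V(f[1], w, M) and V(f[2], w, M)
    if op == "or":
        return V(f[1], w, M) or V(f[2], w, M)
    if op == "imp":
        return (not V(f[1], w, M)) or V(f[2], w, M)
    if op == "cf":
        # M["near"][w] lists every world, nearest to w first.
        for x in M["near"][w]:
            if V(f[1], x, M):          # x is the nearest antecedent-world
                return V(f[2], x, M)
        return True                    # no antecedent-world: vacuously true
    raise ValueError(f"unknown connective {op!r}")

# The countermodel from the diagram: r is nearest to itself, then a, then b.
M = {
    "near": {"r": ["r", "a", "b"]},
    "true": {("P", "a"), ("P", "b"), ("Q", "b")},
}
formula = ("imp", ("not", "P"), ("cf", "P", "Q"))
print(V(("cf", "P", "Q"), "r", M))  # False: Q fails at a, the nearest P world
print(V(formula, "r", M))           # False: the formula is falsified at r
```

Only the view from $r$ is stored, because this formula never asks for a counterfactual to be evaluated at $a$ or $b$.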
What strategy should one use for constructing SC-countermodels? As we
saw in section 6.3.3, a good policy is to make "forced" moves first. For
example, if you are committed to making a material conditional false at a
world, go ahead and make its antecedent true and consequent false in that
world, right away. A false counterfactual also forces certain moves. It follows
from the truth condition for the $\boxright$ that if $\phi \boxright \psi$ is
false at world $w$, then there exists a nearest-to-$w$ $\phi$ world at which
$\psi$ is false. So if you put a 0 overtop of a counterfactual
$\phi \boxright \psi$ in some world $w$, it's good to do the following two
things right away. First, add a nearest-to-$w$ world in which $\phi$ is true
(if such a world isn't already present in your diagram). And second, make
$\psi$ false there.
True counterfactuals don't force your hand quite so much, since there are
two ways for a counterfactual to be true. If $\phi \boxright \psi$ is true at
$w$, then $\psi$ must be true at every nearest-to-$w$ $\phi$ world. This could
happen, not only if there exists a nearest-to-$w$ $\phi$ world in which $\psi$
is true, but also if there are no nearest-to-$w$ $\phi$ worlds. In the latter
case we say that $\phi \boxright \psi$ is vacuously true at $w$. A
counterfactual can be vacuously true only when its antecedent is necessarily
false, since the limit assumption guarantees that if there is at least one
$\phi$ world, then there is a nearest $\phi$ world. So: if you want to make a
counterfactual true at a world, it's a good idea to wait until you've been
forced to make its antecedent true in at least one world. Only when this has
happened, thus closing off the possibility of making the counterfactual
vacuously true, should you add a nearest world in which its antecedent is true,
and make its consequent true at that nearest antecedent-world. These strategies
are illustrated by the following example.
Example 8.3: Show that
$[(P \boxright Q) \land (Q \boxright R)] \to (P \boxright R)$ is SC-invalid. We
begin as follows:
[Diagram: the view from $r$]
  world r:  $P \boxright Q$ = 1, $Q \boxright R$ = 1, their conjunction = 1,
            $P \boxright R$ = 0, whole formula = 0
In keeping with the advice to make forced moves first, let's deal with the false
counterfactual before dealing with the true counterfactuals; let's make
$P \boxright R$ false in $r$. This means adding a nearest-to-$r$ $P$ world in
which $R$ is false. At this point, nothing prevents us from making this world
$r$ itself, but that might collide with other things we do later, so I'll make
this nearest-to-$r$ $P$ world a distinct world from $r$:
[Diagram: the view from $r$, nearest world at the bottom]
  world a:  P = 1, R = 0, Q = 1
  world r:  P = 0; $P \boxright Q$ = 1, $Q \boxright R$ = 1, their conjunction = 1,
            $P \boxright R$ = 0, whole formula = 0
  "no P" zone: from r up to (but not including) a
"No P" reminds me not to add any $P$-worlds between $a$ and $r$. Since world $r$
is in the "no P" zone, I made $P$ false there.
Notice that I made $Q$ true in $a$. I did this because $P \boxright Q$ is true
in $r$. $P \boxright Q$ says that $Q$ is true in the nearest-to-$r$ $P$ world,
and $a$ is the nearest-to-$r$ $P$ world. In general, whenever you add a new
world to one of these diagrams, you should go back to all the counterfactuals
in the bottom world and see whether they require their consequents to have
certain truth values in the new world.
We now have to make the final counterfactual $Q \boxright R$ true. There are
two ways this could happen: our model might contain no $Q$ worlds at all (the
vacuous case), or it might contain a nearest-to-$r$ $Q$ world in which $R$ is
true. $Q$ is already true in at least one world (world $a$), so the vacuous
case is ruled out. So we must include a nearest-to-$r$ $Q$ world, call it $b$,
and make $R$ true there. Where will we put this new world $b$? There are three
possibilities. World $b$ could be farther away from, identical to, or closer to
$r$ than $a$. (These are the only three possibilities, given anti-symmetry.)
Let's try the first possibility:
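[Diagram: the view from $r$, nearest world at the bottom]
  world b:  Q = 1, R = 1
  world a:  P = 1, R = 0, Q = 1
  world r:  P = 0, Q = 0; $P \boxright Q$ = 1, $Q \boxright R$ = 1,
            their conjunction = 1, $P \boxright R$ = 0, whole formula = 0
  "no Q" zone: from r up to (but not including) b
  "no P" zone: from r up to (but not including) a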
This doesn't work, because world $a$ is in the "no Q" zone, but $Q$ is true at
world $a$. Put another way: in this diagram, $b$ isn't the nearest-to-$r$ $Q$
world; world $a$ is. And so, since $R$ is false at world $a$, the
counterfactual $Q \boxright R$ would come out false at world $r$, whereas we
want it to be true.
Likewise, we can't make world $b$ be identical to world $a$, since we need to
make $R$ true in $b$ and $R$ is already false in $a$.
But the final possibility works; we can let world $b$ be closer to $r$ than $a$:
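[Diagram: the view from $r$, nearest world at the bottom]
  world a:  P = 1, R = 0, Q = 1
  world b:  Q = 1, R = 1, P = 0
  world r:  P = 0, Q = 0; $P \boxright Q$ = 1, $Q \boxright R$ = 1,
            their conjunction = 1, $P \boxright R$ = 0, whole formula = 0
  "no P" zone: from r up to (but not including) a
  "no Q" zone: from r up to (but not including) b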
(I made $P$ false in $b$ since $b$ is in the "no P" zone.) Here's the official
model:
$\mathcal{W} = \{r, a, b\}$
$\preceq_r = \{\langle b, a \rangle, \ldots\}$
$\mathcal{I}(P, a) = \mathcal{I}(Q, a) = \mathcal{I}(Q, b) = \mathcal{I}(R, b) = 1$, all others 0
In this official model I left out a lot in the description of the similarity
relation. First, I left out some of the elements of $\preceq_r$. Fully written
out, it would be:
$\preceq_r = \{\langle b, a \rangle, \langle r, b \rangle, \langle r, a \rangle, \langle r, r \rangle, \langle a, a \rangle, \langle b, b \rangle\}$
My policy will be to leave out an ordered pair when you could work out
from the definition of a model that the pair must be present. Thus I left
out $\langle r, b \rangle$ and $\langle r, a \rangle$ because the base
condition requires them ($\langle r, a \rangle$ is also required by
transitivity given the presence of $\langle r, b \rangle$ and
$\langle b, a \rangle$), and I left out $\langle r, r \rangle$,
$\langle a, a \rangle$, and $\langle b, b \rangle$ because they're needed to
make $\preceq_r$ reflexive. (Why must it be reflexive? Because reflexivity
comes from strong connectivity. Let $w$ and $x$ be any members of
$\mathcal{W}$; we get "$x \preceq_w x$ or $x \preceq_w x$" from strong
connectivity of $\preceq_w$, and hence $x \preceq_w x$.) Second, to fully
specify this model, strictly speaking it isn't enough to specify just
$\preceq_r$. We'd need to specify the rest of $\preceq$ by writing out
$\preceq_a$ and $\preceq_b$. But in this case, it doesn't matter what
$\preceq_a$ and $\preceq_b$ are like, so I omitted them. (In some later
problems we'll need to specify more of $\preceq$ than just $\preceq_r$.)
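Both points of this example can be checked by machine: that the abbreviated $\preceq_r$ expands to the six pairs listed, and that the model falsifies the formula at $r$. The sketch below assumes a tie-free nearness relation stored as a nearest-first list; the names `V`, `pairs_from_ranking`, `"near"` and `"true"` are hypothetical.

```python
def V(f, w, M):
    """Truth value of f at w. A string is a sentence letter; compound
    formulas are tuples ("and", f, g), ("imp", f, g) or ("cf", f, g)."""
    if isinstance(f, str):
        return (f, w) in M["true"]
    op, left, right = f
    if op == "and":
        return V(left, w, M) and V(right, w, M)
    if op == "imp":
        return (not V(left, w, M)) or V(right, w, M)
    if op == "cf":
        for x in M["near"][w]:         # nearest world first
            if V(left, x, M):          # nearest antecedent-world found
                return V(right, x, M)
        return True                    # vacuous case
    raise ValueError(op)

def pairs_from_ranking(ranking):
    """All pairs (x, y) with x at least as close as y, for a tie-free ranking."""
    return {(x, y) for i, x in enumerate(ranking) for y in ranking[i:]}

# Official model of Example 8.3: from r's point of view, r, then b, then a.
M = {"near": {"r": ["r", "b", "a"]},
     "true": {("P", "a"), ("Q", "a"), ("Q", "b"), ("R", "b")}}

full_r = pairs_from_ranking(M["near"]["r"])
print(sorted(full_r))  # the six pairs of the fully written out relation

transitivity = ("imp",
                ("and", ("cf", "P", "Q"), ("cf", "Q", "R")),
                ("cf", "P", "R"))
print(V(transitivity, "r", M))  # False
```

The ranking list carries exactly the information in the pair $\langle b, a \rangle$ plus the pairs forced by base and reflexivity, which is why the abbreviated description of the model suffices.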
Example 8.4: Is $(P \boxright R) \to ((P \land Q) \boxright R)$ valid or
invalid? (This formula corresponds to augmentation (section 8.1.3).) As always,
we begin by trying for a countermodel. In this case we succeed:
[Diagram: the view from $r$, nearest world at the bottom]
  world a:  P = 1, Q = 1, $P \land Q$ = 1, R = 0
  world b:  P = 1, R = 1, Q = 0
  world r:  P = 0; $P \boxright R$ = 1, $(P \land Q) \boxright R$ = 0, whole formula = 0
  "no $P \land Q$" zone: from r up to (but not including) a
  "no P" zone: from r up to (but not including) b
I began with the false: $(P \land Q) \boxright R$. This forced the existence of
a nearest $P \land Q$ world (world $a$), in which $R$ was false. But since
$P \land Q$ was true there, $P$ was true there; this ruled out the true
$P \boxright R$ in $r$ being vacuously true. So I was forced to include a
nearest $P$ world, $b$, and make $R$ true in it. It couldn't be farther out
than $a$, since $P$ is true in $a$. It couldn't be $a$, since $R$ was already
false there. So I had to put it nearer than $a$. Notice that I had to make $Q$
false at $b$. Why? Well, it was in the "no $P \land Q$" zone, and I had made
$P$ true in it. Here's the official model:
$\mathcal{W} = \{r, a, b\}$
$\preceq_r = \{\langle b, a \rangle, \ldots\}$
$\mathcal{I}(P, a) = \mathcal{I}(Q, a) = \mathcal{I}(P, b) = \mathcal{I}(R, b) = 1$, all else 0
Example 8.5: Determine whether
$\vDash_{SC} \Diamond P \to [(P \boxright Q) \to \sim(P \boxright \sim Q)]$. An
attempt to find a countermodel fails at the following point:
[Diagram: the view from $r$, at the point where the attempt fails]
  world a:  P = 1, Q = 1 and Q = 0 (contradiction)
  world r:  P = 0; $\Diamond P$ = 1, $P \boxright Q$ = 1, $P \boxright \sim Q$ = 1,
            $\sim(P \boxright \sim Q)$ = 0,
            $(P \boxright Q) \to \sim(P \boxright \sim Q)$ = 0, whole formula = 0
  "no P" zone: from r up to (but not including) a
At world $a$, I've got $Q$ being both true and false. A word about how I got to
that point. I noticed that I had to make two counterfactuals true:
$P \boxright Q$ and $P \boxright \sim Q$. Now, this isn't a contradiction all
by itself. Since counterfactuals are vacuously true if their antecedents are
impossible, I could have made both counterfactuals true if I could have made
$P$ impossible. But this route was closed to me since $\Diamond P$ is true in
$r$. The limit assumption forced me to include a closest $P$ world; and then
the two true counterfactuals created the contradiction.
This reasoning is embodied in the following semantic validity proof:
i) Suppose for reductio that the formula is false in some world $r$ in some
model. Then $V(\Diamond P, r) = 1$ and
ii) $V(P \boxright Q, r) = 1$ and
iii) $V(\sim(P \boxright \sim Q), r) = 0$. So $V(P \boxright \sim Q, r) = 1$.
iv) Given i), $P$ is true at some world, so by the limit assumption there is
some closest-to-$r$ $P$ world. Call one such world $a$. Then by ii),
$V(Q, a) = 1$, but by iii), $V(\sim Q, a) = 1$, and so $V(Q, a) = 0$;
contradiction.
Note the use of the limit assumption. It's needed to establish that there is a
nearest $\phi$-world in cases where we couldn't infer this otherwise.
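The "try for a countermodel first" strategy can be imitated by brute force over small finite models. The following is only a sketch under stated assumptions: worlds are the integers below `n`, each nearness relation is a tie-free ranking with the world itself first, $\Diamond$ is read as truth at some world of the model, and the names `V`, `sc_models` and `find_countermodel` are hypothetical. A search bounded at three worlds can refute a formula, but finding nothing does not establish validity; that is what the semantic proof above is for.

```python
from itertools import permutations, product

def V(f, w, M):
    """Truth value of f at w; formulas are strings or tuples as documented below."""
    if isinstance(f, str):
        return (f, w) in M["true"]
    op = f[0]
    if op == "not":
        return not V(f[1], w, M)
    if op == "dia":                      # possibility: true at some world
        return any(V(f[1], x, M) for x in M["worlds"])
    if op == "and":
        return V(f[1], w, M) and V(f[2], w, M)
    if op == "imp":
        return (not V(f[1], w, M)) or V(f[2], w, M)
    if op == "cf":                       # counterfactual conditional
        for x in M["near"][w]:           # nearest world first
            if V(f[1], x, M):
                return V(f[2], x, M)
        return True
    raise ValueError(op)

def sc_models(n, letters):
    """Yield every n-world model with tie-free rankings (each world nearest to itself)."""
    worlds = list(range(n))
    orders_for = [[[w] + list(p)
                   for p in permutations([x for x in worlds if x != w])]
                  for w in worlds]
    facts = [(s, w) for s in letters for w in worlds]
    for orders in product(*orders_for):
        near = dict(zip(worlds, orders))
        for bits in product([False, True], repeat=len(facts)):
            true = {fact for fact, bit in zip(facts, bits) if bit}
            yield {"worlds": worlds, "near": near, "true": true}

def find_countermodel(f, letters, max_worlds=3):
    """Return (model, world) falsifying f, or None if the bounded search fails."""
    for n in range(1, max_worlds + 1):
        for M in sc_models(n, letters):
            for w in M["worlds"]:
                if not V(f, w, M):
                    return M, w
    return None

ex85 = ("imp", ("dia", "P"),
        ("imp", ("cf", "P", "Q"), ("not", ("cf", "P", ("not", "Q")))))
trans = ("imp", ("and", ("cf", "P", "Q"), ("cf", "Q", "R")), ("cf", "P", "R"))
print(find_countermodel(ex85, ["P", "Q"]))                    # None
print(find_countermodel(trans, ["P", "Q", "R"]) is not None)  # True
```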
Example 8.6: Show that
$[P \boxright (Q \boxright R)] \to [(P \land Q) \boxright R]$ is SC-invalid. The
antecedent contains a nested counterfactual, which, as we'll see, calls for
something new.
We begin our countermodel by making the formula false in $r$, which means
making the antecedent true and the consequent false. Since the consequent is
a false counterfactual, we're forced to create a nearest $P \land Q$ world in
which $R$ is false:
[Diagram: the view from $r$, nearest world at the bottom]
  world a:  P = 1, Q = 1, $P \land Q$ = 1, R = 0
  world r:  $P \boxright (Q \boxright R)$ = 1, $(P \land Q) \boxright R$ = 0, whole formula = 0
  "no $P \land Q$" zone: from r up to (but not including) a
Next we must make $P \boxright (Q \boxright R)$ true. We can't make it
vacuously true, because we've already got a $P$-world in the model: $a$. So,
we've got to create a nearest-to-$r$ $P$ world. Could it be farther away than
$a$? No, because $a$ would be a closer $P$ world. Could it be $a$? No, because
we've got to make $Q \boxright R$ true in the closest $P$ world, and since $Q$
is true but $R$ is false in $a$, $Q \boxright R$ is already false in $a$. So,
we do it as follows:
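[Diagram: the view from $r$, nearest world at the bottom]
  world a:  P = 1, Q = 1, $P \land Q$ = 1, R = 0
  world b:  P = 1, Q = 0, $Q \boxright R$ = 1
  world r:  P = 0; $P \boxright (Q \boxright R)$ = 1, $(P \land Q) \boxright R$ = 0,
            whole formula = 0
  "no $P \land Q$" zone: from r up to (but not including) a
  "no P" zone: from r up to (but not including) b
(I made $Q$ false at $b$ because $b$ is in the "no $P \land Q$" zone and $P$ is
true at $b$.)
Now we must make $Q \boxright R$ true at $b$. This requires some thought. So far
the diagram represents "the view from $r$". That is, it represents how near the
worlds in the model are to $r$. That is, it represents the $\preceq_r$ relation.
But the truth value of $Q \boxright R$ at $b$ depends on "the view from $b$",
that is, on the $\preceq_b$ relation. So we need to depict $\preceq_b$ with a
new diagram, in which $b$ is the bottom world: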
[Diagram: the view from $b$, nearest world at the bottom]
  world c:  Q = 1, R = 1
  world b:  P = 1, Q = 0, $Q \boxright R$ = 1
  "no Q" zone: from b up to (but not including) c
I created a nearest-to-$b$ $Q$ world, $c$, and made $R$ true there. Notice that
I kept the old truth values of $b$ from the other diagram. This is because this
new diagram is a diagram of the same worlds as the old diagram; the difference
is that the new diagram represents the $\preceq_b$ nearness relation, whereas
the old one represented a different relation: $\preceq_r$. Now, this diagram
isn't finished. The diagram is that of the $\preceq_b$ relation, and that
relation relates all the worlds in the model (given strong connectivity). So,
worlds $r$ and $a$ have to show up somewhere here. The safest place to put them
is far away from $b$, to avoid conflict with the "no Q" zone. Thus, the final
appearance of this part of the diagram is as follows:
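[Diagram: the view from $b$, nearest world at the bottom]
  world r:  (truth values as in the view from r)
  world a:  (truth values as in the view from r)
  world c:  Q = 1, R = 1
  world b:  P = 1, Q = 0, $Q \boxright R$ = 1
  "no Q" zone: from b up to (but not including) c
The old truth values from worlds $r$ and $a$ are still in effect (remember that
this diagram represents the same worlds as the earlier diagram of the view from
$r$), but I left them out since they're already specified in that earlier
diagram.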
The order of the worlds in the $r$-diagram does not in any way affect the order
of the worlds in the $b$-diagram. The nearness relations in the two diagrams
are completely independent, because the definition of an SC-model does not
constrain the relationship between $\preceq_i$ and $\preceq_j$ when $i \neq j$.
This might seem unintuitive. The definition allows two halves of a model to
look as follows:
The view from r     The view from a
       c                   r
       b                   b
       a                   c
       r                   a
It might, for example, seem odd that in the view from $r$, $b$ is physically
closer to $a$ than $c$ is, whereas in the view from $a$, $c$ is closer to $a$
than $b$ is. But remember that in any diagram, only some of the features are
intended to be genuinely representative. I've constructed these diagrams from
ink, but I don't mean to be saying that the worlds in the model are made of
ink. This feature of the diagram (that it's made of ink) isn't intended to
convey information about the model. Analogously, the fact that $b$ is
physically closer to $a$ than to $c$ in the view from $r$ is not intended to
convey the information that, in the model, $b \preceq_a c$. In fact, the
diagram of the view from $r$ is only intended to convey information about
$\preceq_r$; it doesn't carry any information about $\preceq_a$, $\preceq_b$,
or $\preceq_c$.
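Back to the countermodel. The initial diagram, of the view from $r$, must be
updated to include world $c$. It's safest to put $c$ far from $r$ to avoid
collisions:
[Diagram: the view from $r$, nearest world at the bottom]
  world c:  (truth values as in the view from b)
  world a:  P = 1, Q = 1, $P \land Q$ = 1, R = 0
  world b:  P = 1, Q = 0, $Q \boxright R$ = 1
  world r:  P = 0; $P \boxright (Q \boxright R)$ = 1, $(P \land Q) \boxright R$ = 0,
            whole formula = 0
  "no $P \land Q$" zone: from r up to (but not including) a
  "no P" zone: from r up to (but not including) b
Again, I haven't re-written the truth values in world $c$, because they're
already in the other diagram. The official model:
$\mathcal{W} = \{r, a, b, c\}$
$\preceq_r = \{\langle b, a \rangle, \langle a, c \rangle, \ldots\}$
$\preceq_b = \{\langle c, a \rangle, \langle a, r \rangle, \ldots\}$
$\mathcal{I}(P, a) = \mathcal{I}(Q, a) = \mathcal{I}(P, b) = \mathcal{I}(Q, c) = \mathcal{I}(R, c) = 1$, all else 0
As before, I didn't write out all of $\preceq$. I left out those bits that
follow automatically (given the definition of a model) from what I wrote out,
and I didn't specify $\preceq_a$ and $\preceq_c$ since they don't matter here.
But I did specify both $\preceq_r$ and $\preceq_b$, since the falsity of the
formula $[P \boxright (Q \boxright R)] \to [(P \land Q) \boxright R]$ at world
$r$ depended on $\preceq_r$ and $\preceq_b$ being as described.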
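The dependence on two different views can be seen by evaluating the model mechanically. This sketch assumes nearest-first ranking lists for $\preceq_r$ and $\preceq_b$ only; the names `V`, `M` and `M_alt` are hypothetical, and `M_alt` is an illustrative variant of my own in which only the view from $b$ is changed.

```python
def V(f, w, M):
    """Truth value of f at w. A string is a sentence letter; compound
    formulas are tuples ("and", f, g), ("imp", f, g) or ("cf", f, g)."""
    if isinstance(f, str):
        return (f, w) in M["true"]
    op, left, right = f
    if op == "and":
        return V(left, w, M) and V(right, w, M)
    if op == "imp":
        return (not V(left, w, M)) or V(right, w, M)
    if op == "cf":
        # A nested counterfactual is evaluated with the ranking of the
        # world it is evaluated at, not the ranking of the outer world.
        for x in M["near"][w]:
            if V(left, x, M):
                return V(right, x, M)
        return True
    raise ValueError(op)

facts = {("P", "a"), ("Q", "a"), ("P", "b"), ("Q", "c"), ("R", "c")}

# Official model of Example 8.6: the views from r and from b.
M = {"near": {"r": ["r", "b", "a", "c"], "b": ["b", "c", "a", "r"]},
     "true": facts}

importation = ("imp",
               ("cf", "P", ("cf", "Q", "R")),
               ("cf", ("and", "P", "Q"), "R"))
print(V(importation, "r", M))      # False: the countermodel works

# Same worlds and valuation, but from b's point of view a is now nearer than c.
M_alt = {"near": {"r": ["r", "b", "a", "c"], "b": ["b", "a", "c", "r"]},
         "true": facts}
print(V(importation, "r", M_alt))  # True: the formula is no longer falsified at r
```

In `M_alt` the nearest $Q$ world to $b$ is $a$, where $R$ is false, so the nested counterfactual fails at $b$ and the antecedent of the formula is no longer true at $r$.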
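Exercise 8.3 Establish each of the following facts.
a) $Q \nvDash_{SC} P \boxright Q$
b) $P \boxright Q \nvDash_{SC} \sim Q \boxright \sim P$
c) $(P \lor Q) \boxright R \nvDash_{SC} P \boxright R$
d) $(P \land Q) \boxright R \nvDash_{SC} P \boxright (Q \boxright R)$
e)* $P \boxright (Q \boxright R) \nvDash_{SC} Q \boxright (P \boxright R)$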
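Exercise 8.4 Determine whether the following wffs are SC-valid
or invalid. Give a falsifying model for every invalid wff, and a
semantic validity proof for every valid wff.
a) $\Diamond P \to [\sim(P \boxright Q) \leftrightarrow (P \boxright \sim Q)]$
b) $[P \boxright (Q \to R)] \to [(P \land Q) \boxright R]$
c) $(P \boxright \sim Q) \lor [((P \land Q) \boxright R) \leftrightarrow (P \boxright (Q \to R))]$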
8.6 Logical Features of SC
Does Stalnaker's semantics for $\boxright$ match the logical features of natural
language counterfactuals that we discussed in section 8.1? Yes.
We wanted counterfactuals to differ from material conditionals by not
following from the falsity of their antecedents or the truth of their
consequents. The Stalnaker system delivers these results. In world $r$ in the
first model of section 8.5, $\sim P$ is true but $P \boxright Q$ is false; so
$\sim P \nvDash_{SC} P \boxright Q$. And the second result is demonstrated in
exercise 8.3a.
We wanted counterfactuals to differ from strict conditionals by being capable
of contingency. The Stalnaker semantics also delivers this result, because
different worlds can have different similarity metrics. For example: consider a
model with worlds $r$ and $a$, in which $Q$ is true in the nearest-to-$r$ $P$
world, but in which $Q$ is false at the nearest-to-$a$ $P$ world.
$P \boxright Q$ is true at $r$ and false at $a$, whence $\Box(P \boxright Q)$
is false at $r$. So $P \boxright Q \nvDash_{SC} \Box(P \boxright Q)$.
We wanted augmentation to fail for counterfactuals. Stalnaker delivers
again: the model of example 8.4 shows that
$P \boxright Q \nvDash_{SC} (P \land R) \boxright Q$.
We wanted contraposition to fail for counterfactuals. Exercise 8.3b shows
that this result holds too.
We wanted the counterfactual conditional to be intermediate in strength
between the strict and material conditionals; see exercises 8.2a and 8.2b.
So: the SC-semantics reproduces the logical features of natural language
counterfactuals discussed in section 8.1. In the next few sections I'll discuss
some further logical features of the SC-semantics, and compare them with the
logical features of the $\to$, the $\strictif$, and natural language
counterfactuals.
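8.6.1 No exportation
The $\to$ obeys exportation:
$$\frac{(\phi \land \psi) \to \chi}{\phi \to (\psi \to \chi)}$$
But the $\strictif$ doesn't in any system;
$(P \land Q) \strictif R \nvDash_{S} P \strictif (Q \strictif R)$. Nor does the
$\boxright$ (exercise 8.3d.)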
Do natural language counterfactuals obey exportation? Here is an argument
that they do not. The following is true:
If Bill had married Laura and Hillary, he would have
been a bigamist.
But one can argue that the following is false:
If Bill had married Laura, then it would have been the
case that if he had married Hillary, he would have been
a bigamist.
Suppose Bill had married Laura. Would it then have been true that: if he had
married Hillary, he would have been a bigamist? Well, let's ask for comparison:
what would the world have been like, had George W. Bush married Hillary
Rodham Clinton? Would Bush have been a bigamist? Here the natural answer
is no. George W. Bush is in fact married to Laura Bush; but when imagining him
married to Hillary Rodham Clinton, we don't hold constant his actual marriage.
We imagine him being married to Hillary instead. If this is true for Bush, then
one might think it's also true for Bill in the counterfactual circumstance in
which he's married to Laura: it would then have been true of him that, if he
had married Hillary, he wouldn't have still been married to Laura, and hence
would not have been a bigamist.
It's unclear whether this is a good argument, though, since it assumes that
ordinary standards for evaluating unembedded counterfactuals ("If George had
married Hillary, he would have been a bigamist") apply to counterfactuals
embedded within other counterfactuals ("If Bill had married Hillary, he would
have been a bigamist" as embedded within "If Bill had married Laura then...").
Contrary to the assumption, it seems most natural to evaluate the consequent
of an embedded counterfactual by holding its antecedent constant.
So the argument is questionable. But a defender of the SC semantics might
argue that the second displayed counterfactual above has a reading on which it
is false (recall the context-dependence of counterfactuals), and hence that we
need a semantics that allows for the failure of exportation.
8.6.2 No importation
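Importation holds for $\to$, and for $\strictif$ in T and stronger systems:
$$\frac{\phi \to (\psi \to \chi)}{(\phi \land \psi) \to \chi} \qquad \frac{\phi \strictif (\psi \strictif \chi)}{(\phi \land \psi) \strictif \chi}$$
but not for the $\boxright$ (see example 8.6).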
The status of importation for natural language counterfactuals is similar to
that of exportation. One can argue that the following is true, at least on one
reading:
If Bill had married Laura, then it would have been the
case that if he had married Hillary, he would have been
happy.
without the result of importing being true:
If Bill had married Laura and Hillary, he would have
been happy
(If he had married both he would have become a public spectacle.)
8.6.3 No transitivity
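Material and strict conditionals are transitive, in that the following
implications hold (in all systems):
$$\frac{\phi \to \psi \qquad \psi \to \chi}{\phi \to \chi} \qquad \frac{\phi \strictif \psi \qquad \psi \strictif \chi}{\phi \strictif \chi}$$
But the model in example 8.3 shows that
$P \boxright Q, Q \boxright R \nvDash_{SC} P \boxright R$. Stalnaker's
$\boxright$ is intransitive.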
Natural language counterfactuals also seem intransitive. I am the oldest
child in my family; my brother Mike is the second-oldest. So the following two
counterfactuals seem true:
5
If I hadn't been born, Mike would have been my parents'
oldest child.
If my parents had never met, I wouldn't have been born.
But the result of applying transitivity is false:
If my parents had never met, Mike would have been
their oldest child.
8.6.4 No transposition
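Transposition governs the $\to$:
$$\frac{\phi \to (\psi \to \chi)}{\psi \to (\phi \to \chi)}$$
but not the $\strictif$ (in any of our modal systems);
$P \strictif (Q \strictif R) \nvDash_{S} Q \strictif (P \strictif R)$. Nor
does it govern the $\boxright$ (see exercise 8.3e).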
5
They sound less clearly true if you read them in reverse order: "If my parents
had never met, I wouldn't have been born; If I hadn't been born, Mike would
have been my parents' oldest child." It's natural in this case to interpret the
second counterfactual by holding constant the antecedent of the first. This
fact, together with what we observed about embedded counterfactuals in section
8.6.1, suggests a systematic dependence of the interpretation of
counterfactuals on their immediate linguistic context. See von Fintel () for a
"dynamic" semantics for counterfactuals, which more accurately models this
feature of their use, and also makes sense of how hard it is to hear the
readings argued for in sections 8.6.1, 8.6.2, and 8.6.3.
The status of transposition for natural language counterfactuals is similar
to that of importation and exportation. If we can ignore the effects of
embedding on the evaluation of counterfactuals, then we have the following
counterexample to transposition. It is true that:
If Bill Clinton had married Laura Bush, then it would
have been the case that: if he had married Hillary Rodham,
he'd have been married to a Democrat.
But it is not true that:
If Bill Clinton had married Hillary Rodham, then it
would have been the case that: if he had married Laura
Bush, he'd have been married to a Democrat.
8.7 Lewis's criticisms of Stalnaker's theory
As I mentioned earlier, David Lewis also defends a similarity-based theory of
counterfactuals. Lewis's system is in many ways similar to Stalnaker's. But
there are two points of detail where Lewis and Stalnaker disagree.
6
First, Lewis challenges Stalnaker's assumption of anti-symmetry. Ties in
similarity are generally possible, so why couldn't two possible worlds be
exactly similar to a given world? The challenge is most straightforward if
Stalnaker intends to be giving truth conditions rather than merely doing model
theory, for then Stalnaker would be assuming anti-symmetry for a real
similarity relation: the similarity relation used to give the truth conditions
for natural language counterfactuals. But even if Stalnaker is not doing this,
the objection may yet have bite, to the extent that the semantics of natural
language conditionals is like similarity-theoretic semantics.
The validity of certain wffs depends on whether you require anti-symmetry.
According to Stalnaker, all instances of the following two schemas are valid:
$(\phi \boxright \psi) \lor (\phi \boxright \sim\psi)$ (Conditional excluded middle)
$[\phi \boxright (\psi \lor \chi)] \to [(\phi \boxright \psi) \lor (\phi \boxright \chi)]$ (distribution)
But Lewis challenges each verdict. Take the first one, for example. Suppose
you gave up anti-symmetry, thereby allowing ties. Then the following would
be a countermodel for an instance of conditional excluded middle:
6
See Lewis (a, section .). For an interesting response see Stalnaker ().
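[Diagram: the view from $r$; worlds a and b are drawn side by side, at the same
distance from r]
  world a:  P = 1, Q = 0
  world b:  P = 1, Q = 1, $\sim Q$ = 0
  world r:  $P \boxright Q$ = 0, $P \boxright \sim Q$ = 0,
            $(P \boxright Q) \lor (P \boxright \sim Q)$ = 0
  zone label on the diagram: "no Q"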
Here worlds $a$ and $b$ are tied for similarity to $r$. Remember that
$\phi \boxright \psi$ is true only if $\psi$ is true in all the nearest $\phi$
worlds. So since $Q$ is not true in all the nearest-to-$r$ $P$ worlds (though
it's true in one of them), $P \boxright Q$ is false at $r$. Similarly for
$P \boxright \sim Q$. A similar model shows that distribution fails if
anti-symmetry is not required (see exercise 8.5.)
So, should we give up conditional excluded middle? Lewis concedes that
the principle is initially plausible. An equivalent formulation of conditional
excluded middle is: $\sim(\phi \boxright \psi) \to (\phi \boxright \sim\psi)$.
Now, whenever $\phi$ is possibly true, the converse of this conditional, namely
$(\phi \boxright \sim\psi) \to \sim(\phi \boxright \psi)$, is agreed by
everyone to be true. So if conditional excluded middle is valid, then whenever
$\phi$ is possibly true, $\sim(\phi \boxright \psi)$ and
$\phi \boxright \sim\psi$ are equivalent to each other. And we do normally
treat them as being indistinguishable. We normally don't distinguish between
"it's not true that if she had played, she would have won" and "if she had
played, she would have failed to win" (which does "if she had played, she
wouldn't have won" mean?).
And take distribution. If someone says: "if I had been a baseball player, I
would have been either a third-baseman or a shortstop", it might seem natural
to reply with a question: "well, which would you have been?". This reply
presupposes that either "if you had been a baseball player, you would have been
a third-baseman" or "if you had been a baseball player, you would have been a
shortstop" must be true.
So there's some intuitive plausibility to both conditional excluded middle
and distribution. But Lewis says two things. The first is "metaphysical": if
we're going to accept the similarity analysis, we've got to give them up,
because ties in similarity just are possible. The second is "purely semantic":
the intuitions aren't completely compelling. About the coin-flipping case,
Lewis denies that if the coin had been flipped, it would have come up heads,
and he also denies that if the coin had been flipped, it would have come up
tails. Rather, he says, if it had been flipped, it might have come up heads.
And if it had been flipped, it might have come up tails. But neither outcome is
such that it would have resulted, had the coin been flipped. Concerning
conditional excluded middle, Lewis says:
It is not the case that if Bizet and Verdi were compatriots, Bizet would be
Italian; and it is not the case that if Bizet and Verdi were compatriots, Bizet
would not be Italian; nevertheless, if Bizet and Verdi were compatriots,
Bizet either would or would not be Italian. (Lewis, a, p. )
If Bizet and Verdi were compatriots, Bizet might be Italian, but it's not the
case that if they were compatriots, he would be Italian.
Lewis has a related objection to Stalnaker's semantics. Consider English
conditionals of the form "if it had been that $\phi$, then it might have been
that $\psi$" (I used such conditionals in the previous paragraph). Lewis calls
these conditionals "might-counterfactuals" (to distinguish from the
"would-counterfactuals" that we have mostly been discussing in this chapter).
He symbolizes them as $\phi \Diamondright \psi$, which he defines thus:
"$\phi \Diamondright \psi$" is short for "$\sim(\phi \boxright \sim\psi)$"
But, Lewis argues, this definition of $\Diamondright$ doesn't work in
Stalnaker's system. Since conditional excluded middle is valid in Stalnaker's
system, $\phi \Diamondright \psi$ would always imply $\phi \boxright \psi$. But
we treat the might-counterfactual as being weaker than the would-counterfactual;
we often use it when we're unwilling to utter the would-counterfactual. So,
Lewis's definition of $\Diamondright$ doesn't work in Stalnaker's system.
Moreover, there doesn't seem to be any other plausible definition.
Lewis also objects to Stalnaker's limit assumption. His example: the
displayed line is less than one inch long.
Now, the following counterfactual is clearly false:
If the line had been longer than one inch, it would have
been one hundred miles long.
But if we use Stalnaker's truth conditions as truth conditions for natural
language counterfactuals, and take our intuitive judgments of similarity at
face value, we seem to get the result that it is true! For there doesn't seem
to be a closest world in which the line is more than one inch long. For every
world in which the line is, say, $1 + k$ inches long, there seems to be a world
that is more similar to the actual world: an otherwise similar world in which
the line is $1 + \frac{k}{2}$ inches long.
8.8 Lewis's system
In light of the criticisms of the previous section, Lewis proposes a new
similarity-based semantics for counterfactuals. I'll call it LC, for
Lewis-conditionals.
7
Lewis's semantics: LC-models and their valuation functions $LV$ are defined
as in Stalnaker's semantics except that:
• antisymmetry and limit are not assumed
• the base condition is changed to read: for any $x$, $y$, if $y \preceq_x x$
  then $x = y$
• the truth condition for the $\boxright$ is changed to this:
  $LV_{\mathcal{M}}(\phi \boxright \psi, w) = 1$ iff EITHER $\phi$ is true at no
  worlds, OR: there is some world, $x$, such that $LV_{\mathcal{M}}(\phi, x) = 1$
  and for all $y$, if $y \preceq_w x$ then $LV_{\mathcal{M}}(\phi \to \psi, y) = 1$
Note that the non-anti-symmetric model of the previous section counts
as an LC-model in which $(P \boxright Q) \lor (P \boxright \sim Q)$ is false at
world $r$. So Lewis's semantics invalidates conditional excluded middle. (It
also invalidates distribution.)
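Lewis's truth condition can be tried out on the tied model directly. The sketch below makes assumptions of its own: each view $\preceq_w$ is stored as a dictionary of numeric ranks (equal ranks encode ties, and `rank[y] <= rank[x]` encodes $y \preceq_w x$), and the names `LV`, `"rank"`, `tied` and `untied` are hypothetical.

```python
def LV(f, w, M):
    """Lewis-style truth value of f at w in a finite model M.

    A string is a sentence letter; compound formulas are tuples
    ("not", f), ("or", f, g), ("cf", f, g).
    """
    if isinstance(f, str):
        return (f, w) in M["true"]
    op = f[0]
    if op == "not":
        return not LV(f[1], w, M)
    if op == "or":
        return LV(f[1], w, M) or LV(f[2], w, M)
    if op == "cf":
        rank = M["rank"][w]
        ante_worlds = [x for x in rank if LV(f[1], x, M)]
        if not ante_worlds:
            return True                # EITHER: antecedent true at no worlds
        # OR: some antecedent-world x such that the material conditional
        # holds at every world at least as close to w as x is.
        return any(
            all((not LV(f[1], y, M)) or LV(f[2], y, M)
                for y in rank if rank[y] <= rank[x])
            for x in ante_worlds)
    raise ValueError(op)

facts = {("P", "a"), ("P", "b"), ("Q", "b")}
cem = ("or", ("cf", "P", "Q"), ("cf", "P", ("not", "Q")))

# a and b are tied for similarity to r; r is strictly closest to itself.
tied = {"rank": {"r": {"r": 0, "a": 1, "b": 1}}, "true": facts}
print(LV(cem, "r", tied))    # False: conditional excluded middle fails

# Breaking the tie (b strictly nearer than a) restores the instance.
untied = {"rank": {"r": {"r": 0, "b": 1, "a": 2}}, "true": facts}
print(LV(cem, "r", untied))  # True
```

Numeric ranks are a convenience for finite models only; they cannot represent the infinite descending sequences that motivate dropping the limit assumption.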
Why the new base condition? Stalnaker's base condition said that each
world is at least as close to itself as any other is; Lewis's makes the
stronger claim that each world is closer to itself than any other is. Lewis
needs the stronger claim in order to ensure that
$\phi, \phi \boxright \psi \vDash_{LC} \psi$. (Stalnaker could get by with the
weaker claim since it plus anti-symmetry entails the stronger claim, but Lewis
doesn't assume anti-symmetry.)
Why the new truth condition for the $\boxright$? The limit assumption is now
allowed to fail; but as we saw with the nearly one-inch line, Stalnaker's truth
condition yields unwanted vacuous truths when the limit assumption fails.
8
Lewis's new truth condition is designed to avoid this. Let's think about what it
says. First, there's the vacuous case: if $\phi$ is necessarily false then
$\phi \boxright \psi$ comes out true. But if $\phi$ is possibly true, then
$\phi \boxright \psi$ is true at $w$ iff there exists some $\phi$ world with
the following feature: no matter how much closer to $w$ you go, you never find
a $\phi$ world where $\psi$ is false. (If there is a nearest-to-$w$ $\phi$
world, then $\phi \boxright \psi$ is true at $w$ iff $\psi$ is true in all the
nearest-to-$w$ $\phi$ worlds.) To see why this avoids vacuity, think for the
moment of Lewis's semantics as providing truth-conditions for natural-language
counterfactuals, and recall the sentence:
7
See Lewis (a, pp. -). I have simplified Lewis's system.
8
Actually, dropping the limit assumption doesn't affect which wffs are valid
(Lewis, b, p. ). The issue of the limit assumption is an issue about the truth
conditions of the counterfactual, not its logic.
If the line had been longer than one inch, it would have
been one hundred miles long.
There's no nearest world in which the line is longer than one inch, only an
infinite series of worlds in which the line has lengths closer and closer to
one inch. But this doesn't make the counterfactual true. Since its antecedent
is possibly true, the only way for the counterfactual to be true, given Lewis's
truth condition, is for there to be some world, $x$, at which the antecedent is
true, and such that the material conditional (antecedent $\to$ consequent) is
true at every world at least as similar to the actual world as is $x$. Since
the "at least as similar as" relation is reflexive, this can be rewritten thus:
for some world, $x$, the antecedent and consequent are both true at $x$, and
in all worlds that are at least as similar to the actual world as is $x$, the
antecedent is never true while the consequent is false
So, is there any such world, $x$? No. For let $x$ be any world in which the
antecedent and consequent are both true. Since the line is one hundred miles
long in $x$, we can find a world that is more similar to the actual world than
$x$ in which the antecedent is true but the consequent is false: just choose a
world just like $x$ but in which the line is only, say, two inches long.
Lets see howLewis handles a true counterfactual when the limit assumption
is false:
If I had been taller than six feet, I would have been
shorter than nine feet
(I am, in fact, shorter than six feet.) Again, there is no nearest world in which
the antecedent is true. But now we can find our world x: simply take x to be
a world just like the actual world but in which I am, say, six-feet-one. The
antecedent and consequent are both true in x. And any world that is at least as
similar to the actual world as x must surely be one in which I'm less than nine
feet tall. So in no such world will the antecedent ("I'm taller than six feet") be
true while the consequent ("I'm shorter than nine feet") is false.
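Recall Lewis's definition of the "might-counterfactual":
$\phi \mathrel{\Diamond\!\!\to} \psi$ is short for $\sim(\phi \mathrel{\Box\!\!\to} \sim\psi)$
From this we may obtain a derived clause for the truth conditions of
$\phi \mathrel{\Diamond\!\!\to} \psi$:
$LV_{\mathcal{M}}(\phi \mathrel{\Diamond\!\!\to} \psi, w) = 1$ iff for some x, $LV_{\mathcal{M}}(\phi, x) = 1$, and for any x, if
$LV_{\mathcal{M}}(\phi, x) = 1$ then for some y, $y \preceq_w x$ and $LV_{\mathcal{M}}(\phi \land \psi, y) = 1$
That is, $\phi \mathrel{\Diamond\!\!\to} \psi$ is true at w iff $\phi$ is possible, and for any $\phi$ world, there's a
world as close or closer to w in which $\phi$ and $\psi$ are both true. (In cases where
there is a nearest $\phi$ world, this means that $\psi$ must be true in at least one of the
nearest $\phi$ worlds.)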
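The two truth conditions can be tried out on a small finite model. The following Python sketch is only an illustration under stated assumptions: each world is identified with my height there in inches, similarity to the actual world is measured by difference in height, and the names ACTUAL, WORLDS, would and might are hypothetical. A finite model always satisfies the limit assumption, so it cannot exhibit the infinite descending series discussed above; it only shows how the clauses are evaluated.

```python
# Hypothetical finite model: each world is identified with my height there, in inches.
ACTUAL = 71                       # assumed actual height (under six feet)
WORLDS = [71, 73, 78, 84, 114]    # assumed heights in the worlds of the model


def at_least_as_close(y, x):
    """y is at least as similar to the actual world as x (assumed measure)."""
    return abs(y - ACTUAL) <= abs(x - ACTUAL)


def would(phi, psi):
    """Lewis-style clause for the would-counterfactual, evaluated at the actual world."""
    phi_worlds = [x for x in WORLDS if phi(x)]
    if not phi_worlds:            # vacuous case: the antecedent is true nowhere
        return True
    # Some phi-world x such that (phi -> psi) holds at every world at least as close as x.
    return any(
        all((not phi(y)) or psi(y) for y in WORLDS if at_least_as_close(y, x))
        for x in phi_worlds
    )


def might(phi, psi):
    """Derived clause for the might-counterfactual, evaluated at the actual world."""
    phi_worlds = [x for x in WORLDS if phi(x)]
    return bool(phi_worlds) and all(
        any(at_least_as_close(y, x) and phi(y) and psi(y) for y in WORLDS)
        for x in phi_worlds
    )


def taller_than_six(h):
    return h > 72


def shorter_than_nine(h):
    return h < 108


true_case = would(taller_than_six, shorter_than_nine)
false_case = would(taller_than_six, lambda h: h >= 108)
```

In this model the derived clause for the might-counterfactual agrees with its definition as the negation of a would-counterfactual with negated consequent, which can be checked by comparing might(p, q) with not would(p, lambda h: not q(h)).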
LC
[()][()()]
Exercise 8.6** Show that every LC-valid wff is SC-valid.
8.9 The problem of disjunctive antecedents
Let's end by briefly discussing a criticism that has been raised against both
Lewis's and Stalnaker's systems.
9
In neither system does $(P \lor Q) \mathrel{\Box\!\!\to} R$ semantically
imply $P \mathrel{\Box\!\!\to} R$ (exercises 8.3c, 8.6). But shouldn't this implication hold?
Imagine a conversation between Butch Cassidy and the Sundance Kid in heaven,
after having been surrounded and killed by the Bolivian army. They say:
If we had surrendered or tried to run away, we would
have been shot.
Intuitively, if this is true, so is this:
If we had surrendered, we would have been shot.
In general, we normally conclude from "If P or Q had been the case, then R
would have been the case" that "if P had been the case, R would have been
the case". If Butch Cassidy and the Sundance Kid could have survived by
surrendering, they certainly would not say to each other "If we had surrendered
or tried to run away, we would have been shot".
Is this a problem for Lewis and Stalnaker? Some say yes, but others reply as
follows. One must take great care in translating from natural language into logic.
For example, no one would criticize the law of double-negation elimination on
9
For references, see the bibliography of Lewis ().
the grounds that "There ain't no cake" doesn't imply that there is some cake.
10
And "or" behaves in notoriously peculiar ways in similar contexts.
11
Consider:
You are permitted to stay or go.
One can argue that this does not have the form:
You are permitted to do the action: (Stay $\lor$ Go)
After all, suppose that you are permitted to stay, but not to go. If you stay,
you can't help doing the following act: staying-or-going. So, surely, you're
permitted to do that. So, the second sentence is true. But the first isn't; if
someone uttered it to you when you were in jail, they'd be lying to you! It really
means: "You are permitted to stay and you are permitted to go". Similarly, "If
either P or Q were true then R would be true" seems usually to mean "If P
were true then R would be true, and if Q were true then R would be true".
We can't just expect natural language to translate directly into our logical
language; sometimes the surface structure of natural language is misleading.
Or so the reply goes. But it would be nice to have an explanation of why "or"
functions in this way.
10
The example is adapted from Loewer ().
11
This behavior is sometimes thought to threaten the deontic logic of section 7.1.
Chapter 9
Quantified Modal Logic
Quantified modal logic is what you get when you combine modal logic
with predicate logic. With it we can represent natural language sentences
such as:
Necessarily, all bachelors are male: $\Box\forall x(Bx \to Mx)$
Some male could have been female: $\exists x(Mx \land \Diamond Fx)$
Ferris could have been a walrus: $\Diamond Wb$
9.1 Grammar of QML
The language of quantified modal logic, or QML, is exactly what you'd expect:
that of plain old predicate logic, but with the $\Box$ added. Thus, the one new
clause to the definition of a wff says that if $\phi$ is a wff, then so is $\Box\phi$. (We retain
the old definitions of $\Diamond$ and of the other defined symbols.) You get a different grammar for
QML depending on what version of predicate logic grammar you begin with.
To keep things simple, let's consider a stripped-down version of predicate logic:
no function symbols, and no definite description operator. But let's include the
identity sign =.
9.2 De re and de dicto
Like any logical extension, QML increases our powers of analysis. Way back in
propositional logic, we were able to analyze a certain level of structure, structure
in terms of "and", "or", "not", and so on. The move to predicate logic then let us
analyze quantificational structure; and the move to modal propositional logic
let us analyze modal structure. Moving to QML lets us do all three at once, as
with:
It's not possible for something to create itself
whose tripartite propositional, predicate, and modal structure is revealed in its
QML symbolization:
$\sim\Diamond\exists x\, Cxx$
This deeper level of analysis reveals some new logical features. One example
is the famous distinction between de re and de dicto modal statements.
Consider:
Some rich person might have been poor
$\exists x(Rx \land \Diamond Px)$
It might have been the case that some rich person is
poor
$\Diamond\exists x(Rx \land Px)$
The first sentence asserts the existence of someone who is in fact rich, but
who might have been poor. This seems true, in contrast to the absurd second
sentence, which says that the following state of affairs is possible: someone
is both rich and poor. The second sentence is called "de dicto" because the
modality is attributed to a sentence (dictum): the modal operator $\Diamond$ attaches to
the closed sentence $\exists x(Rx \land Px)$. The first sentence is called "de re" because
the modality is attributed to an object (res): the $\Diamond$ attaches to a sentence with a
free variable, Px, and thus can be thought of as attributing a modal property,
the property of possibly being poor, to an object u when x is assigned the value u.
Modal propositional logic alone does not reveal this distinction. Given only
a Q to stand for "some rich person is poor", we can write only $\Diamond Q$, which
represents only the absurd second sentence. To represent the first sentence we
need to put the $\Diamond$ "inside" the Q, so to speak, as we can when we further analyze
Q as $\exists x(Rx \land Px)$ using predicate logic.
A further example of the de re/de dicto distinction:
Every bachelor is such that he is necessarily unmarried
$\forall x(Bx \to \Box Ux)$
It is necessary that all bachelors are unmarried
$\Box\forall x(Bx \to Ux)$
It's helpful to think about the difference between these two statements in
terms of possible worlds. The second, de dicto, sentence makes the true claim
that in any possible world, anyone that is in that world a bachelor is, in that
world, unmarried. The first, de re, sentence makes the false claim that if any
object, u, is a bachelor in the actual world, then that object u is necessarily
unmarried, i.e., the object u is unmarried in all possible worlds.
What do the following English sentences mean?
All bachelors are necessarily unmarried
Bachelors must necessarily be unmarried
Surface grammar suggests that they would mean the de re claim that each
bachelor is such that he is necessarily unmarried. But in fact, it's very natural
to hear these sentences as making the de dicto claim that it's necessary that all
bachelors are unmarried.
The de re/de dicto distinction also emerges with definite descriptions. This
may be illustrated by using Russell's theory of descriptions (section 5.3.3). Recall
how Russell's method generated two possible symbolizations for sentences
containing definite descriptions and negations, depending on whether the
definite description is given wide or narrow scope relative to the negation
operator. A similar phenomenon arises with sentences containing definite
descriptions and modal operators. There are two symbolizations of "The
number of the planets is necessarily odd" (letting "Nx" mean that x numbers
the planets, i.e., x is a number that corresponds to how many planets there
are):
$\Box\exists x(Nx \land \forall y(Ny \to x=y) \land Ox)$
$\exists x(Nx \land \forall y(Ny \to x=y) \land \Box Ox)$
The first, in which the description has narrower scope than the $\Box$, is de dicto;
it says that it's necessary that: one and only one thing numbers the planets, and
that thing is odd. This claim is false, since there could have been two planets,
or four planets, or six, etc. The second, in which the description takes wider
scope, is de re; it says that (in fact) there is one and only one thing that numbers
the planets, and that that thing is necessarily odd. That's true, I suppose: the
number nine (the thing that in fact numbers the planets; let's count Pluto as a
planet) is necessarily odd.
Natural language sentences containing both definite descriptions and modal
operators can perhaps be heard as expressing either de re or de dicto claims.
"The number of the planets is necessarily odd" sounds (or can sound) de re;
"The American president is necessarily an American citizen" sounds (or can
sound) de dicto.
The de re/de dicto distinction is often extended in the following way: a
sentence is said to be de re if it contains some formula of the form $\Box\phi$ or $\Diamond\phi$
in which $\phi$ contains a name or a free variable (free in $\phi$, that is); otherwise
the sentence is de dicto. For example, $\Diamond Wb$ and $\exists x\Box Fx$ are de re, whereas
$\Box\forall x(Bx \to Ux)$ and $\Diamond\exists x(Fx \land Gx)$ are de dicto.
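The extended criterion is purely syntactic, so it can be sketched as a small program. The following Python sketch is illustrative only: the tuple encoding of formulas, the set VARIABLES (every other term counts as a name), and the function names are assumptions made for the example.

```python
# Formulas as nested tuples (an assumed encoding):
#   ("atom", "F", ("x",))   ("not", f)   ("and", f, g)   ("or", f, g)   ("imp", f, g)
#   ("all", "x", f)   ("ex", "x", f)   ("box", f)   ("dia", f)
VARIABLES = {"x", "y", "z"}   # assumption: any other term is a name


def terms_info(f):
    """Return (free variables, names) occurring in formula f."""
    op = f[0]
    if op == "atom":
        terms = f[2]
        return ({t for t in terms if t in VARIABLES},
                {t for t in terms if t not in VARIABLES})
    if op in ("not", "box", "dia"):
        return terms_info(f[1])
    if op in ("and", "or", "imp"):
        free1, names1 = terms_info(f[1])
        free2, names2 = terms_info(f[2])
        return (free1 | free2, names1 | names2)
    if op in ("all", "ex"):
        free, names = terms_info(f[2])
        return (free - {f[1]}, names)    # the quantifier binds its variable
    raise ValueError(f"unknown operator: {op}")


def subformulas(f):
    """Yield f and all of its subformulas."""
    yield f
    op = f[0]
    if op in ("not", "box", "dia"):
        yield from subformulas(f[1])
    elif op in ("and", "or", "imp"):
        yield from subformulas(f[1])
        yield from subformulas(f[2])
    elif op in ("all", "ex"):
        yield from subformulas(f[2])


def is_de_re(f):
    """True iff some modal operator governs a formula with a name or a free variable."""
    for s in subformulas(f):
        if s[0] in ("box", "dia"):
            free, names = terms_info(s[1])
            if free or names:
                return True
    return False


possibly_Wb = ("dia", ("atom", "W", ("b",)))
some_necessarily_F = ("ex", "x", ("box", ("atom", "F", ("x",))))
nec_all_bachelors = ("box", ("all", "x", ("imp", ("atom", "B", ("x",)), ("atom", "U", ("x",)))))
poss_some_F_and_G = ("dia", ("ex", "x", ("and", ("atom", "F", ("x",)), ("atom", "G", ("x",)))))
```

On this encoding the first two sample formulas are classified as de re and the last two as de dicto, matching the classification in the text.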
De re modality is sometimes thought to be especially philosophically
problematic. Consider again the de re sentence "Each bachelor is such that:
necessarily, he is unmarried": $\forall x(Bx \to \Box Ux)$. To evaluate whether this sentence is
true, we must go through each object, x, that is a bachelor in the actual world,
and decide whether $\Box Ux$ is true. Take some particular bachelor, John, who is,
let us say, the only child of certain parents. We must go through all the possible
worlds and ask whether John is unmarried in all those worlds. But how do we
locate John in other possible worlds? In worlds in which John is not too different
from the way he is in the actual world, it will be easy. But consider a world in
which his parents' only son is physically and psychologically very different. Is
this son John? If his parents have two sons, which (if either) is John? What
if their only child is female? And anyway, how are we figuring out who his
parents are? This is the so-called "problem of trans-world identification". (It is
analogous in some ways to the problem of how to re-identify individuals over
time.) What to say about it (and even, whether there really is a problem) is up
for grabs in the philosophy of modality.
1
The problem (if it is a problem) is thought not to arise with de dicto
modal sentences, for the evaluation of such sentences does not require taking
an individual from one possible world and reidentifying it in another world.
Return to the de dicto sentence "necessarily, all bachelors are unmarried":
$\Box\forall x(Bx \to Ux)$. Here, we take the sentence "all bachelors are unmarried"
around to the different worlds, rather than an individual like John. All we
need to do, in any world w, is find all the people that in w are bachelors,
and see whether they are all unmarried. We have the descriptive predicate
1
See, for starters: Quine (c); Kripke (, ); Lewis (, chapter ).
"bachelor" to help us find the relevant individuals in w; we don't need to do
anything like identify which individual in w is John.
9.3 A simple semantics for QML
Let's begin with a very simple semantics, SQML (for "simple QML"). It's simple
in two ways. First, there is no accessibility relation. $\Box\phi$ will be said to be true
iff $\phi$ is true in all worlds in the model. In effect, each world is accessible from
every other (and hence the underlying propositional modal logic is S5). Second,
it will be a "constant domain" semantics. (We'll discuss what this means, and
more complex semantical treatments of QML, below.)
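Definition of model: An SQML-model is an ordered triple $\langle \mathcal{W}, \mathcal{D}, \mathcal{I} \rangle$, where:
  $\mathcal{W}$ is a nonempty set (possible worlds)
  $\mathcal{D}$ is a nonempty set (domain)
  $\mathcal{I}$ is a function such that: (interpretation function)
    if $\alpha$ is a constant then $\mathcal{I}(\alpha) \in \mathcal{D}$
    if $\Pi^n$ is an n-place predicate then $\mathcal{I}(\Pi^n)$ is a set of n+1-tuples
    $\langle u_1, \dots, u_n, w \rangle$, where $u_1, \dots, u_n$ are members of $\mathcal{D}$, and $w \in \mathcal{W}$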
Recall that modal propositional logic models took the interpretations from
nonmodal propositional logic (functions assigning truth values to sentence
letters) and relativized them to possible worlds. We have something similar
here: we relativize the interpretation of predicates to possible worlds. The
interpretation of a two-place predicate, for example, was in nonmodal predicate
logic a set of ordered pairs of members of the domain; now it is a set of ordered
triples, two members of which are in the domain, and one member of which
is a possible world. When $\langle u_1, u_2, w \rangle$ is in the interpretation of a two-place
predicate R, that represents R's applying to $u_1$ and $u_2$ in possible world w. This
relativization makes intuitive sense: a predicate can apply to some objects in
one possible world but fail to apply to those same objects in some other possible
world.
These predicate-interpretations are known as "intensions". The name emphasizes
the analogy with extensions, which are the interpretations of predicates
in nonmodal predicate logic. The analogy is this: the intension $\mathcal{I}(\Pi)$ of an
n-place predicate $\Pi$ can be thought of as determining an extension within each
possible world, as follows: the extension of $\Pi$ in world w is the set of n-tuples
$\langle u_1, \dots, u_n \rangle$ such that $\langle u_1, \dots, u_n, w \rangle \in \mathcal{I}(\Pi)$.
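The way an intension fixes an extension at each world can be sketched in a few lines of Python. This is only an illustration: representing an intension as a set of tuples whose last member is a world, and the sample predicate R with its worlds "w1" and "w2", are assumptions made for the example.

```python
def extension(intension, w):
    """Extension at world w: the n-tuples <u1,...,un> with <u1,...,un,w> in the intension."""
    return {t[:-1] for t in intension if t[-1] == w}


# Hypothetical intension of a two-place predicate R over worlds "w1" and "w2".
R = {("u1", "u2", "w1"), ("u2", "u1", "w2"), ("u1", "u1", "w2")}

R_at_w1 = extension(R, "w1")
R_at_w2 = extension(R, "w2")
```

Here R applies to the pair of u1 and u2 in w1 but not in w2, which is the kind of variation across worlds that the relativization is meant to capture.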
Unlike the interpretations of predicates, the interpretations of constants are
not relativized in any way to possible worlds. The interpretation function $\mathcal{I}$
simply assigns a member of the domain to a name. This reflects the common
belief that natural language proper names (which constants are intended to
represent) are rigid designators, i.e., terms that have the same denotation relative
to every possible world (see Kripke ().) We'll discuss the significance of
this feature of our semantics below.
Recall from section 2.2 that a semantics for a formal language defines both a
set of configurations, and truth-in-a-configuration. The configurations here are
SQML-models. A configuration must represent both a way for the world to be,
and the meanings of nonlogical expressions. An SQML-model's set of worlds
and domain represent the world (i.e., reality); and its interpretation function
represents the meanings of nonlogical expressions (by assigning denotations to
names and intensions to predicates). Notice that intensions are a richer sort of
meaning than the extensions of nonmodal predicate logic.
As for truth-in-a-configuration, this is the job of the valuation function
for an SQML-model. To define this, we begin by keeping the definition of a
variable assignment from nonmodal predicate logic (section 4.2). Our variable
assignments therefore assign members of the domain to variables absolutely,
rather than relative to worlds. (This is an appropriate choice given our choice to
assign constants absolute semantic values.) But the valuation function will now
relativize truth values to possible worlds (as well as to variable assignments).
After all, the sentence Fa, if it represents "Ted is tall", should vary in truth
value from world to world.
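Definition of valuation: The valuation function $V_{\mathcal{M},g}$, for SQML-model
$\mathcal{M}$ (= $\langle \mathcal{W}, \mathcal{D}, \mathcal{I} \rangle$) and variable assignment g is defined as the function that
assigns either 0 or 1 to each wff relative to each member of $\mathcal{W}$, subject to the
following constraints:
  for any terms $\alpha$, $\beta$, $V_{\mathcal{M},g}(\alpha=\beta, w) = 1$ iff $[\alpha]_{\mathcal{M},g} = [\beta]_{\mathcal{M},g}$
  for any n-place predicate, $\Pi$, and any terms $\alpha_1, \dots, \alpha_n$,
  $V_{\mathcal{M},g}(\Pi\alpha_1 \dots \alpha_n, w) = 1$ iff $\langle [\alpha_1]_{\mathcal{M},g}, \dots, [\alpha_n]_{\mathcal{M},g}, w \rangle \in \mathcal{I}(\Pi)$
  for any wffs $\phi$, $\psi$, and variable, $\alpha$,
    $V_{\mathcal{M},g}(\sim\phi, w) = 1$ iff $V_{\mathcal{M},g}(\phi, w) = 0$
    $V_{\mathcal{M},g}(\phi \to \psi, w) = 1$ iff either $V_{\mathcal{M},g}(\phi, w) = 0$ or $V_{\mathcal{M},g}(\psi, w) = 1$
    $V_{\mathcal{M},g}(\forall\alpha\phi, w) = 1$ iff for every $u \in \mathcal{D}$, $V_{\mathcal{M},g^{\alpha}_{u}}(\phi, w) = 1$
    $V_{\mathcal{M},g}(\Box\phi, w) = 1$ iff for every $v \in \mathcal{W}$, $V_{\mathcal{M},g}(\phi, v) = 1$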
The derived clauses are what you'd expect, including the following one for
$\Diamond$:
$V_{\mathcal{M},g}(\Diamond\phi, w) = 1$ iff for some $v \in \mathcal{W}$, $V_{\mathcal{M},g}(\phi, v) = 1$
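Finally, we have:
  $\phi$ is valid in $\mathcal{M}$ (= $\langle \mathcal{W}, \mathcal{D}, \mathcal{I} \rangle$) iff for every variable assignment, g, and
  every $w \in \mathcal{W}$, $V_{\mathcal{M},g}(\phi, w) = 1$
  $\phi$ is SQML-valid ("$\vDash_{SQML} \phi$") iff $\phi$ is valid in all SQML models.
  $\Gamma$ SQML-semantically-implies $\phi$ ("$\Gamma \vDash_{SQML} \phi$") iff for every SQML-model
  $\mathcal{M}$ (= $\langle \mathcal{W}, \mathcal{D}, \mathcal{I} \rangle$), every $w \in \mathcal{W}$, and every variable assignment g
  for $\mathcal{M}$, if $V_{\mathcal{M},g}(\gamma, w) = 1$ for each $\gamma \in \Gamma$, then $V_{\mathcal{M},g}(\phi, w) = 1$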
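The clauses of the valuation function translate almost line by line into a recursive evaluator. The following Python sketch is one way to do it under stated assumptions: formulas are nested tuples, a model is a triple (W, D, I) with I a dictionary, variable assignments are partial dictionaries (so the sketch is meant for closed formulas), and all function names are illustrative rather than anything defined in the text.

```python
# Primitive operators, mirroring the clauses of the valuation function.
def Not(f): return ("not", f)
def Imp(f, h): return ("imp", f, h)
def All(x, f): return ("all", x, f)
def Box(f): return ("box", f)
def Atom(pred, *terms): return ("atom", pred, terms)
def Eq(s, t): return ("eq", s, t)

# Defined symbols, introduced in the usual way.
def And(f, h): return Not(Imp(f, Not(h)))
def Ex(x, f): return Not(All(x, Not(f)))
def Dia(f): return Not(Box(Not(f)))


def denote(term, I, g):
    """Variables are looked up in g, names in I; neither lookup mentions a world."""
    return g[term] if term in g else I[term]


def V(f, w, model, g):
    """Truth value of formula f at world w, relative to assignment g."""
    W, D, I = model
    op = f[0]
    if op == "eq":
        return denote(f[1], I, g) == denote(f[2], I, g)
    if op == "atom":
        return (tuple(denote(t, I, g) for t in f[2]) + (w,)) in I[f[1]]
    if op == "not":
        return not V(f[1], w, model, g)
    if op == "imp":
        return (not V(f[1], w, model, g)) or V(f[2], w, model, g)
    if op == "all":
        return all(V(f[2], w, model, {**g, f[1]: u}) for u in D)
    if op == "box":
        return all(V(f[1], v, model, g) for v in W)   # no accessibility relation
    raise ValueError(f"unknown operator: {op}")


def valid_in(f, model):
    """Validity in a model, for closed formulas (the empty assignment suffices)."""
    W, D, I = model
    return all(V(f, w, model, {}) for w in W)


# A hypothetical model: two worlds, two objects, one name, one predicate.
M = ({"r", "s"}, {"u", "v"}, {"a": "u", "F": {("u", "r"), ("v", "s")}})
sample = Imp(Dia(Ex("x", And(Eq("x", "a"), Box(Atom("F", "x"))))), Atom("F", "a"))
sample_valid_in_M = valid_in(sample, M)
```

The evaluator only checks truth in a given finite model; SQML-validity quantifies over all models, so a check of this kind can refute validity but cannot by itself establish it.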
9.4 Countermodels and validity proofs in SQML
As before, we want to come up with countermodels for invalid formulas, and
validity proofs for valid ones. Validity proofs introduce nothing new.
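$\vDash_{SQML} \Diamond\exists x(x=a \land \Box Fx) \to Fa$:
i) suppose for reductio that the wff is false in some world r in some model, under
some variable assignment g. Then $V_g(\Diamond\exists x(x=a \land \Box Fx), r) = 1$ and
ii) $V_g(Fa, r) = 0$
iii) From i), for some $w \in \mathcal{W}$, $V_g(\exists x(x=a \land \Box Fx), w) = 1$. So for some $u \in \mathcal{D}$,
$V_{g^x_u}(x=a \land \Box Fx, w) = 1$. So $V_{g^x_u}(x=a, w) = 1$ and
iv) $V_{g^x_u}(\Box Fx, w) = 1$. So $V_{g^x_u}(Fx, r) = 1$, and so $\langle [x]_{g^x_u}, r \rangle \in \mathcal{I}(F)$; that is,
$\langle u, r \rangle \in \mathcal{I}(F)$.
v) from iii), $[x]_{g^x_u} = [a]_{g^x_u}$; so $u = \mathcal{I}(a)$. So by iv), $\langle \mathcal{I}(a), r \rangle \in \mathcal{I}(F)$,
contradicting line ii).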
As for countermodels, we can use the pictorial method of section 6.3.3,
asterisks and all, with a few changes. First, we no longer need the arrows
between worlds since we've dropped the accessibility relation. Second, we have
predicates and names instead of sentence letters; how to deal with this? Let's
take an example.
$\nvDash_{SQML} (\Diamond Fa \land \Diamond Ga) \to \Diamond(Fa \land Ga)$. We begin as
follows:
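[Diagram: world r, with $(\Diamond Fa \land \Diamond Ga) \to \Diamond(Fa \land Ga)$ assigned $\Diamond Fa$ = 1, $\land$ = 1, $\Diamond Ga$ = 1, $\to$ = 0, $\Diamond(Fa \land Ga)$ = 0; the two true diamonds carry understars and the false diamond an overstar.]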
The understars make us create two new worlds:
[Diagram: world r as before, plus two new worlds: world a, in which Fa = 1, and world b, in which Ga = 1.]
In each world we must then discharge the overstar from the false diamond in r:
[Diagram: in world r, $Fa \land Ga$ = 0 with Fa = 0; in world a, Fa = 1 and $Fa \land Ga$ = 0 with Ga = 0; in world b, Ga = 1 and $Fa \land Ga$ = 0 with Fa = 0.]
(I had to make either Fa or Ga false in r; I chose Fa arbitrarily.)
So far I've placed 1s and 0s above atomic formulas to indicate the truth
values I want them to have. But to get them to have these truth values, I need
to construct the model's domain and interpretation function accordingly. Let's
use letters like 'u' and 'v' as the members of the domain in our models. Now, if
we let the name a refer to (the letter) 'u', and let the extension of the predicate
F in world r be {} (the empty set), then the truth value of Fa in world r will be
0, since the denotation of a isn't in the extension of F at world r. Likewise, we
need to put 'u' in the extension of F (but not in the extension of G) in world a,
and put 'u' in the extension of G (but not in the extension of F) in world b. All
this may be indicated on the diagram as follows:
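[Diagram: the same truth values as before, with extensions written inside each world (r: F: {}; a: F: {u}, G: {}; b: F: {}, G: {u}) and, outside the worlds, the domain $\mathcal{D}$: {u} and the referent of the name, a: u.]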
Within each world I specified the extension of each predicate. But the
specification of the referent of the name a does not go within any world. This
is because names, unlike predicates, get assigned semantic values absolutely
in a model, not relative to worlds. (Likewise the specification of the domain
doesn't go within any world.) Time for the official model:
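$\mathcal{W} = \{r, a, b\}$
$\mathcal{D} = \{u\}$
$\mathcal{I}(a) = u$
$\mathcal{I}(F) = \{\langle u, a \rangle\}$
$\mathcal{I}(G) = \{\langle u, b \rangle\}$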
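The official model can be checked mechanically. The following is a minimal Python sketch, assuming worlds and objects are encoded as strings and intensions as sets of (object, world) pairs; the helper names are illustrative.

```python
W = {"r", "a", "b"}          # worlds of the official model
D = {"u"}                    # domain
referent_of_a = "u"          # I(a), assigned absolutely rather than per world
F = {("u", "a")}             # I(F)
G = {("u", "b")}             # I(G)


def Fa(w):
    return (referent_of_a, w) in F


def Ga(w):
    return (referent_of_a, w) in G


def possibly(condition):
    """A diamond formula is true iff its scope is true at some world in W."""
    return any(condition(v) for v in W)


antecedent = possibly(Fa) and possibly(Ga)
consequent = possibly(lambda v: Fa(v) and Ga(v))
conditional_at_r = (not antecedent) or consequent
```

Because there is no accessibility relation, the truth values of the diamond formulas do not depend on the world of evaluation, so the conditional comes out false at r (and indeed at every world of the model).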
What about formulas with quantifiers?
$\nvDash_{SQML} \Box\exists xFx \to \exists x\Box Fx$. We begin thus:
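[Diagram: world r, with $\Box\exists xFx \to \exists x\Box Fx$ assigned $\Box\exists xFx$ = 1 (overstar on the $\Box$), $\exists xFx$ = 1 (underplus), $\to$ = 0, and $\exists x\Box Fx$ = 0 (overplus).]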
The overstar above the $\Box$ in the antecedent must be discharged in r itself
(remember: no accessibility relation). That gives us a true existential. Now,
a true existential is a bit like a true $\Diamond$: the true $\exists xFx$ means that there must
be some object u from the domain that's in the extension of F in r. I'll put
a + under true $\exists$s and false $\forall$s, to indicate a commitment to some instance of
some sort or other. Analogously, I'll indicate a commitment to all instances of
a given type (which would arise from a true $\forall$ or a false $\exists$) with a + above the
connective in question.
OK, how do we make $\exists xFx$ true in r? By making Fx true for some value of
x. Let's put the letter 'u' in the domain, and make Fx true when u is assigned to
x. We'll indicate this by writing $F\frac{u}{x}$ in the diagram, and putting a 1 overtop of
it. ($F\frac{u}{x}$ isn't a formula of our language; I'm just using it and related expressions
in these diagrams to indicate truth values for open sentences relative to variable
assignments.) And to make Fx true when u is assigned to x, we put u in the
extension of F at r:
[Diagram: world r as before, now also with $F\frac{u}{x}$ = 1 and the extension F: {u}; domain $\mathcal{D}$: {u}.]
Good. Now to attend to the overplus, the + sign overtop the false $\exists x\Box Fx$. It
requires $\Box Fx$ to be false for every object in the domain. So far there's only one
object in our domain, u, so we've got to make $\Box Fx$ false, when u is assigned to
the variable x. We'll indicate this on the diagram by putting a 0 overtop of
$\Box F\frac{u}{x}$:
[Diagram: world r as before, now also with $\Box F\frac{u}{x}$ = 0 (understar); F: {u} in r; domain $\mathcal{D}$: {u}.]
Now we have an understar, so we need a new world. And we'll need then to
discharge the overstar from the antecedent. We get:
[Diagram: world r as before (F: {u}); new world a with $F\frac{u}{x}$ = 0, $\exists xFx$ = 1 (underplus), $F\frac{v}{x}$ = 1, and F: {v}; domain $\mathcal{D}$: {u, v}.]
Why the v? Well, I had to make Fx false in a, with u assigned to x. That meant
keeping u out of the extension of F at a. Easy enough, right? Just make F's
extension {}? Well, no. Because of the true $\Box$ in r, I've got to make $\exists xFx$ true
in a, and so something had to be in F's extension in a. It couldn't be u, so I added
a new object, v, to the domain, and put it in F's extension in a.
But adding v to the domain of the model adds a complication, given the
overplus in r. Since $\exists x\Box Fx$ is false in r, $\Box Fx$ must be false in r for every
member of the domain, and hence for v (as well as for u). That requires another
understar, and so a new world:
[Diagram: world r now also has $\Box F\frac{v}{x}$ = 0 (F: {u}); world a as before (F: {v}); new world b with $F\frac{v}{x}$ = 0, $\exists xFx$ = 1 (underplus), $F\frac{u}{x}$ = 1, and F: {u}; domain $\mathcal{D}$: {u, v}.]
(Well, we didn't really need a new world; we could have discharged the understar
on r.) The official model:
$\mathcal{W} = \{r, a, b\}$
$\mathcal{D} = \{u, v\}$
$\mathcal{I}(F) = \{\langle u, r \rangle, \langle u, b \rangle, \langle v, a \rangle\}$
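As a quick check, the official model can be evaluated directly. The sketch below is a hedged illustration in Python, with strings for worlds and objects and a set of (object, world) pairs for the intension of F; the function names are assumptions made for the example. It also tries the two-world variant hinted at in the parenthetical remark, where the last understar is discharged on r.

```python
def box_exists_F(worlds, domain, F):
    """Necessarily, something is F: at every world, some object is in F's extension there."""
    return all(any((u, w) in F for u in domain) for w in worlds)


def exists_box_F(worlds, domain, F):
    """Something is necessarily F: some object is in F's extension at every world."""
    return any(all((u, w) in F for w in worlds) for u in domain)


# The official three-world model.
W = ["r", "a", "b"]
D = ["u", "v"]
F = {("u", "r"), ("u", "b"), ("v", "a")}
antecedent = box_exists_F(W, D, F)
consequent = exists_box_F(W, D, F)

# Assumed two-world variant: worlds r and a only.
W2 = ["r", "a"]
F2 = {("u", "r"), ("v", "a")}
antecedent2 = box_exists_F(W2, D, F2)
consequent2 = exists_box_F(W2, D, F2)
```

In both models the antecedent is true and the consequent false, so the conditional is false at r.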
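Exercise 9.1 For each formula, give a validity proof if the wff is
SQML-valid, and a countermodel if it is invalid.
a)* $(\Box\forall x(Fx \to Gx) \land \Diamond\exists xFx) \to \Diamond\exists xGx$
b) $\Diamond\forall xFx \to \exists x\Diamond Fx$
c)* $\exists x\Diamond Rax \to \Diamond\Box\exists x\exists yRxy$
d) $\Box\forall x(Fx \to Gx) \to (\forall x\Box Fx \to \Box\forall xGx)$
e) $\exists x(Nx \land \forall y(Ny \to y=x) \land \Box Ox) \to$
$\Box\exists x(Nx \land \forall y(Ny \to y=x) \land Ox)$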
9.5 Philosophical questions about SQML
Our semantics for quantified modal logic faces philosophical challenges. In
each case we will be able to locate a particular feature of the SQML semantics
that gives rise to the alleged problem. In response, one can stick with SQML
and give it a philosophical defense, or one can look for a new semantics.
9.5.1 The necessity of identity
Let's try to come up with a countermodel for the following formula:
$\forall x\forall y(x=y \to \Box(x=y))$
When we try to make the formula false by putting a 0 over the initial $\forall$, we
get an underplus. So we've got to make the inside part, $\forall y(x=y \to \Box x=y)$, false
for some value of x. We do this by putting some object u in the domain, and
letting that be the value of x for which $\forall y(x=y \to \Box x=y)$ is false. We get:
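[Diagram: world r, with $\forall x\forall y(x=y \to \Box x=y)$ = 0 (underplus) and $\forall y(\frac{u}{x}=y \to \Box(\frac{u}{x}=y))$ = 0 (underplus); domain $\mathcal{D}$: {u}.]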
Now we need to do the same thing for our new false universal: $\forall y(x=y \to \Box x=y)$.
For some value of y, the inside conditional has to be false. But then the
antecedent must be true, so the value for y has to be u again. We get:
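[Diagram: world r as before, now also with $\frac{u}{x}=\frac{u}{y} \to \Box(\frac{u}{x}=\frac{u}{y})$ assigned antecedent = 1, $\to$ = 0, and $\Box(\frac{u}{x}=\frac{u}{y})$ = 0 (understar); domain $\mathcal{D}$: {u}.]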
The understar now calls for a new world in which x=y is false, when both x
and y are assigned u. But there can be no such world! An identity sentence is
true (at any world) if the denotations of the terms are identical. Our attempt to
find a countermodel has failed; time for a validity proof:
i) suppose for reductio that $V_g(\forall x\forall y(x=y \to \Box x=y), r) = 0$ (for some r and
g in some SQML model). Then $V_{g^x_u}(\forall y(x=y \to \Box x=y), r) = 0$, for some
$u \in \mathcal{D}$. So $V_{g^{xy}_{uv}}(x=y \to \Box x=y, r) = 0$, for some $v \in \mathcal{D}$. So $V_{g^{xy}_{uv}}(x=y, r) = 1$
(hence $[x]_{g^{xy}_{uv}} = [y]_{g^{xy}_{uv}}$) and
ii) $V_{g^{xy}_{uv}}(\Box x=y, r) = 0$. So $V_{g^{xy}_{uv}}(x=y, w) = 0$ for some $w \in \mathcal{W}$. So
$[x]_{g^{xy}_{uv}} \neq [y]_{g^{xy}_{uv}}$, contradicting i).
Notice in this proof how the world at which an identity sentence is evaluated
doesn't affect its truth condition. The truth condition for an identity sentence
is simply that the terms (absolutely) denote the same thing.
2
We can think of $\forall x\forall y(x=y \to \Box(x=y))$ as expressing "the necessity of
identity": it says that whenever objects are identical, they're necessarily identical.
This claim is philosophically controversial. On the one hand it can seem
obviously correct. If x = y then x and y are one and the same thing, so a world
in which x is distinct from y would have to be a world in which x was distinct
from x; and how could that be? On the other hand, it was a great discovery that
Hesperus = Phosphorus. Surely, it could have turned out the other way; surely,
Hesperus might have turned out to be distinct from Phosphorus. But isn't this
a counterexample to the necessity of identity?
3
It's worth noting why $\forall x\forall y(x=y \to \Box(x=y))$ turns out SQML-valid. It was
our definition of variable assignments. Our variable assignments assign members
of the domain to variables absolutely, rather than relative to worlds. (Similarly:
since the interpretation function $\mathcal{I}$ assigns referents to names absolutely,
$a=b \to \Box a=b$ turns out valid.) One could instead define variable assignments
as functions that assign members of the domain to variables relative to worlds.
Given appropriate adjustments to the valuation function, this would invalidate
the necessity of identity.
4
(Similarly, one could make $\mathcal{I}$ assign denotations to
names relative to worlds, thus invalidating $a=b \to \Box a=b$.)
2
A note about variables. In validity proofs, I'm using italicized "$u$" and "$v$" as variables to
range over objects in the domain of the model I'm considering. So, a sentence like "$u = v$"
might be true, just as the sentence "x=y" of our object language can be true. But when I'm
doing countermodels, I'm using upright roman letters '$\mathrm{u}$' and '$\mathrm{v}$' as themselves being members
of the domain, not as variables ranging over members of the domain. Since the letters '$\mathrm{u}$' and
'$\mathrm{v}$' are different letters, they are different members of the domain. Thus, in a countermodel
with letters in the domain, if the denotation of a name "a" is the letter '$\mathrm{u}$', and the denotation
of the name "b" is the letter '$\mathrm{v}$', then the sentence "a=b" has got to be false, since $\mathrm{u} \neq \mathrm{v}$. If
I were using "$u$" and "$v$" as variables ranging over members of the domain, then the sentence
"$u = v$" might be true! This just goes to show that it's important to distinguish between the
sentence "$u = v$" and the sentence "$\mathrm{u} = \mathrm{v}$". The first could be true, depending on what "$u$" and
"$v$" currently refer to, but the second one is just plain false, since '$\mathrm{u}$' and '$\mathrm{v}$' are different letters.
3
The classic discussion of this example is in Kripke ().
4
See Gibbard ().
9.5.2 The necessity of existence
Another (in)famous valid formula of SQML is the "Barcan Formula" (named
after Ruth Barcan Marcus):
$\forall x\Box Fx \to \Box\forall xFx$
(Call the schema $\forall\alpha\Box\phi \to \Box\forall\alpha\phi$ the "Barcan schema".) An attempt to produce
a countermodel leads us to the following stage:
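[Diagram: world r, with $\forall x\Box Fx \to \Box\forall xFx$ assigned $\forall x\Box Fx$ = 1 (overplus), $\to$ = 0, $\Box\forall xFx$ = 0 (understar); world a, with $\forall xFx$ = 0 (underplus), $F\frac{u}{x}$ = 0, and F: {}; domain $\mathcal{D}$: {u}.]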
When you have a choice between discharging over-things and under-things,
whether plusses or stars, always do the under-things first. In this case, this
means discharging the understar and ignoring the overplus for the moment.
Discharging the understar gave us world a, in which we made a universal false.
This gave an underplus, and forced us to make an instance false. So I put object
u in our domain, and kept it out of the extension of F in a. This makes Fx false
in a, when x is assigned u.
But now we must discharge the overplus in r; we must make $\Box Fx$ true for
every member of the domain, including u, which is now in the domain. But
then this requires Fx to be true, when u is assigned to x, in a:
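[Diagram: world r now also has $\Box F\frac{u}{x}$ = 1 (with F: {u} in r); world a has $\forall xFx$ = 0, $F\frac{u}{x}$ = 0, and also $F\frac{u}{x}$ = 1, so no extension for F in a can be consistently specified (F: {?}); domain $\mathcal{D}$: {u}.]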
So, we failed to get a countermodel. Time for a validity proof. In fact, let's
show that every instance of the Barcan Schema is valid:
i) suppose for reductio that $V_g(\forall\alpha\Box\phi \to \Box\forall\alpha\phi, r) = 0$ (for any r and g in
any model). Then $V_g(\forall\alpha\Box\phi, r) = 1$ and
ii) $V_g(\Box\forall\alpha\phi, r) = 0$. So for some world w, $V_g(\forall\alpha\phi, w) = 0$; and so for
some u in the domain, $V_{g^{\alpha}_{u}}(\phi, w) = 0$.
iii) Given i), $V_{g^{\alpha}_{u}}(\Box\phi, r) = 1$; and so $V_{g^{\alpha}_{u}}(\phi, w) = 1$, contradicting ii).
The fact that the Barcan formula is SQML-valid is often regarded as a defect
of SQML. To see why, we need to pause for a moment, and reflect on the
intuitive significance of the relative order of quantifiers and modal operators.
The point is perhaps clearest when we consider the following equivalent of the
Barcan formula:
$\Diamond\exists xFx \to \exists x\Diamond Fx$
The consequent of this conditional is existential in form. That is, its major
connective is $\exists x$. Like any existential sentence it says that there exists something
of a certain sort, namely, something that could have been F. In contrast, the
form of the antecedent is modal, not existential. Its major connective is $\Diamond$, not
$\exists x$. What it says is that it would be possible for something to be the case (namely,
for there to exist an F). It does not say that there exists something of a certain
sort. (Of course, one might present a philosophical argument that it implies
such a thing. But it doesn't say it.) This difference in what the antecedent and
consequent say, in fact, suggests that in some cases the antecedent might be
true and the consequent might be false. Perhaps, for example, it would be
possible for there to exist a ghost, even though there in fact exists nothing that
could have been a ghost. That is: if you go through all the things there are,
$u_1, u_2, \dots$, none of them is capable of having been a ghost; but nevertheless, the
following is possible: there exists an extra thing, v, distinct from each $u_i$, which
is a ghost.
(The contrast between $\exists x\Diamond Fx$ and $\Diamond\exists xFx$ is analogous to the contrast
between Quine's () "there exists someone whom Ralph believes to be a
spy" and "Ralph believes that someone is a spy". In the former, where the
existential quantifier comes first, there is said to be someone of a certain sort,
namely, someone who is believed-by-Ralph-to-be-a-spy. In the latter, where
the existential quantifier occurs inside of "Ralph believes that", no existential
statement is made; rather, the sentence attributes an existential belief to Ralph,
to the effect that there are spies. On the face of it, Ralph might believe that
there are spies, without there being any particular person whom Ralph believes
to be a spy.)
With all this in mind, let's return to the Barcan formula, $\forall x\Box Fx \to \Box\forall xFx$.
Notice how the quantifier $\forall x$ comes before the modal operator $\Box$ in the
antecedent, but after it in the consequent. Thus, the antecedent is universal in
form; it says that all entities have a certain feature: being-necessarily-F. The
consequent, on the other hand, is modal in form; it says that a certain claim
is necessarily true: the claim that everything is F. Apparently, this difference
between what the antecedent and consequent say leads to the possibility that
the antecedent could be true while the consequent is false. Let $u_1, u_2, \dots$ again
be all the entities there are; and suppose that each $u_i$ is necessarily F, so that
the antecedent is true. Mightn't it nevertheless be possible for there to exist
an extra entity, v, distinct from each $u_i$, that fails to be F? In that case, the
consequent would be false. Suppose, for instance, that each $u_i$ is necessarily
a material object. Then, letting F stand for "is a material object", $\forall x\Box Fx$ is
true. Nevertheless, $\Box\forall xFx$ seems false: it would presumably be possible for
there to exist an immaterial object, a ghost, say. The ghost would simply need
to be distinct from each $u_i$.
This objection to the validity of the Barcan formula is obviously based
on the idea that it is contingent which objects exist, for it assumes that there
could have existed an extra object, distinct from each $u_i$ that in fact exists. In
terms of possible worlds, the objection assumes that what objects exist can vary
from possible world to possible world. Anyone who thinks that the objection
is correct will then point to a corresponding defect in the SQML definition
of a model. Each SQML-model contains a single domain, $\mathcal{D}$; and the truth
condition for the quantified sentence $\forall\alpha\phi$, at a world w, is simply that $\phi$ is
true at w of every member of $\mathcal{D}$. Thus, the quantifier ranges over the same
domain, regardless of which possible world is being described: SQML does
not represent it as being contingent which objects exist. That is why the Barcan
formula turns out SQML-valid.
This feature of SQML models is problematic for an even more direct
reason: the sentence $\forall x\Box\exists y\, y = x$ turns out valid. (For suppose for reductio
that $V_g(\forall x\Box\exists y\, y = x, w) = 0$. Then $V_{g^x_u}(\Box\exists y\, y = x, w) = 0$, for some $u \in \mathcal{D}$. So
$V_{g^x_u}(\exists y\, y = x, w') = 0$, for some $w' \in \mathcal{W}$. So $V_{g^{xy}_{uu'}}(y = x, w') = 0$, for every $u' \in \mathcal{D}$.
So, since $u \in \mathcal{D}$, we have $V_{g^{xy}_{uu}}(y = x, w') = 0$. So $[y]_{g^{xy}_{uu}} \neq [x]_{g^{xy}_{uu}}$, i.e., $u \neq u$;
contradiction. It's clear that the source of the validity here is the same as with
the Barcan schema: SQML models have a single domain common to each
possible world.) This is problematic because $\forall x\Box\exists y\, y = x$ seems to say that
everything necessarily exists! It says that for each object $x$, it's necessary that
there is something identical to $x$; but if there is something identical to $x$ in a
possible scenario, then, it would seem, $x$ exists in that scenario.
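The Barcan schema is just one of a number of interesting schemas concerning
how quantifiers and modal operators interact (for each schema I also list an
equivalent schema with $\Diamond$ in place of $\Box$):
    $\forall\alpha\Box\phi \to \Box\forall\alpha\phi$      $\Diamond\exists\alpha\phi \to \exists\alpha\Diamond\phi$      (Barcan)
    $\Box\forall\alpha\phi \to \forall\alpha\Box\phi$      $\exists\alpha\Diamond\phi \to \Diamond\exists\alpha\phi$      (converse Barcan)
    $\exists\alpha\Box\phi \to \Box\exists\alpha\phi$      $\Diamond\forall\alpha\phi \to \forall\alpha\Diamond\phi$
    $\Box\exists\alpha\phi \to \exists\alpha\Box\phi$      $\forall\alpha\Diamond\phi \to \Diamond\forall\alpha\phi$
We have already discussed the Barcan schema. The fourth schema raises no
philosophical problems for SQML, since, quite properly, it has instances that
turn out invalid (example 9.3). Let's look at the other two schemas.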
First, the converse Barcan schema. Like the Barcan schema, each of its
instances is SQML-valid (exercise 9.2), and like the Barcan schema, this verdict
faces a philosophical challenge. Suppose the antecedent is true. So it's necessary
that everything is $\phi$; in every possible world, the statement "everything is $\phi$"
is true. But, the challenger says, this just means that in every world, all the
things that exist in that world are $\phi$. So it permits things that don't exist in that
world to fail to be $\phi$ in that world, in which case the consequent would be false.
This talk of an object being $\phi$ in a world in which it doesn't exist may seem
strange, but consider the following instance of the converse Barcan schema,
substituting "$\exists y\, y = x$" (think: "$x$ exists") for $\phi$:
    $\Box\forall x\exists y\, y = x \to \forall x\Box\exists y\, y = x$
This formula seems false. Its antecedent is clearly true; but its consequent
seems to say that everything exists necessarily.
Each instance of the third schema, $\exists\alpha\Box\phi \to \Box\exists\alpha\phi$, is also SQML-valid
(exercise 9.2); and again, this is philosophically questionable. Let's suppose that
physical objects are necessarily physical. Then, $\exists x\Box Px$ seems true, letting $P$
mean "is physical". But $\Box\exists x Px$ seems false: surely it would have been possible
for there to have existed no physical objects. This counterexample (as well as
the counterexample to the converse Barcan formula) requires that it be possible
for some objects that in fact exist to go missing, whereas the counterexample
to the Barcan formula required the possibility of extra objects.
Exercise 9.2 Show that all instances of the converse Barcan schema
and the third schema are SQML-valid.
9.5.3 Necessary existence defended
There are various ways to respond to the challenge of the previous section.
From a logical point of view, the simplest is to stick to one's guns and defend
the SQML semantics. SQML-models accurately model the modal facts. The
Barcan formula, the converse Barcan formula, the third schema, and the
statement that everything necessarily exists are all logical truths; the philosophical
objections are mistaken. Contrary to appearances it is not contingent what
objects there are. Each possible world has exactly the same stock of individuals.
Call this the doctrine of constancy.
One could uphold constancy either by taking a narrow view of what is
possible, or by taking a broad view of what there is. On the former alternative,
one would claim that it is just not possible for there to be any ghosts, and that
it is just not possible in any sense for an actual object to have failed to exist.
On the latter alternative, which I'll be discussing for the rest of this section,
one accepts the possibility of ghosts, dragons, and so on, but claims, roughly,
that there are possible ghosts and dragons in the actual world; and one accepts
that actual objects could have failed to exist in a certain robust sense while
denying that actual objects could have utterly failed to be.
The defender of constancy that I have in mind thinks that there are a great
many more things than one would normally suppose. In addition to normal
things (what one would normally think of as the actually existing entities:
people, tables and chairs, planets and electrons, and so on), our defender of
constancy claims that there are also objects that, in other possible worlds, are
ghosts, golden mountains, talking donkeys, and so forth. Call these further
objects "extraordinary things". In order for the formula $\forall x\phi$ to be true, it's
not enough for the normal things to be $\phi$, for the normal things are not all
of the things that there are. There are also all the extraordinary things, and
each of these must be $\phi$ as well (must be $\phi$ here in the actual world, that is), in
order for $\forall x\phi$ to be true. Hence, the objection to the Barcan formula from the
previous section fails. That objection assumed that $\forall x\Box Fx$, the antecedent of
the Barcan formula, was true when $F$ symbolizes "is a material object". But this
neglects the extraordinary things. Even if all the normal objects are necessarily
material objects, there are some further things (extraordinary things) that are
not necessarily material objects.
Further: in ordinary language, when we say "everything" or "something",
we typically don't mean to be talking about all objects; we're typically talking
about just the normal objects. Otherwise we would be speaking falsely when
we say, for example, "everything has mass": extraordinary things that might
have been unicorns presumably have no mass (nor spatial location, nor any
other physical feature). Ordinary quantification is restricted to normal things.
So if we want to translate an ordinary claim into the language of QML, we
must introduce a predicate for the normal things, "$N$", and use it to restrict
quantifiers. But now, consider the following ordinary English statement:
    If everything is necessarily a material object, then
    necessarily: everything is a material object
If we mindlessly translate this into the language of QML, we would get
$\forall x\Box Fx \to \Box\forall x Fx$, an instance of the Barcan schema. But since in everyday
usage, quantifiers are restricted to normal things, the thought in the mind
of an ordinary speaker who utters this sentence is more likely the following:
    $\forall x(Nx \to \Box Fx) \to \Box\forall x(Nx \to Fx)$
which says:
    If every normal thing is necessarily a material object,
    then necessarily: every normal thing is a material object.
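And this formula is not an instance of the Barcan schema, nor is it valid, as may
be shown by the following countermodel:
    Domain: $\mathcal{D} = \{u\}$. Worlds: r and a.
    World r: $N : \emptyset$. Here $Nx$ is false of $u$, so $\forall x(Nx \to \Box Fx)$ is true at r;
        $\Box\forall x(Nx \to Fx)$ is false at r; so $\forall x(Nx \to \Box Fx) \to \Box\forall x(Nx \to Fx)$
        is false at r.
    World a: $N : \{u\}$, $F : \emptyset$. Here $Nx$ is true of $u$ and $Fx$ is false of $u$,
        so $\forall x(Nx \to Fx)$ is false at a.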
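The countermodel can also be checked mechanically. The following is a minimal Python sketch, not part of the book's apparatus: the names `WORLDS`, `DOMAIN`, `N`, `F`, and the helper functions are illustrative assumptions, and the box is evaluated SQML-style (truth at every world, with no accessibility relation).

```python
# Sketch of the SQML countermodel described above (illustrative names only).
WORLDS = {"r", "a"}
DOMAIN = {"u"}

# Predicate extensions are sets of (object, world) pairs.
N = {("u", "a")}   # u is "normal" at world a only
F = set()          # nothing is F at any world

def nec(prop):
    """SQML box: prop (a function from worlds to bool) holds at every world."""
    return all(prop(w) for w in WORLDS)

def antecedent(w):
    """forall x (Nx -> box Fx), evaluated at world w."""
    return all((u, w) not in N or nec(lambda v: (u, v) in F) for u in DOMAIN)

def restricted_all_F(w):
    """forall x (Nx -> Fx), evaluated at world w."""
    return all((u, w) not in N or (u, w) in F for u in DOMAIN)

def consequent(w):
    """box forall x (Nx -> Fx); in SQML this does not depend on w."""
    return nec(restricted_all_F)

def formula(w):
    """The whole conditional, evaluated at world w."""
    return (not antecedent(w)) or consequent(w)

print(antecedent("r"), consequent("r"), formula("r"))
```

At world r the antecedent comes out true and the consequent false, so the conditional is false there, which is all that invalidity requires.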
So in a sense, the ordinary intuitions that were alleged to undermine the Barcan
schema are in fact consistent with constancy.
The defender of constancy can defend the converse Barcan schema and
the third schema in similar fashion. The objection to the converse Barcan
schema assumed the falsity of $\forall x\Box\exists y\, y = x$. "Sheer prejudice!", according to
the friend of constancy. And further, an ordinary utterance of "Everything exists
necessarily" expresses, not $\forall x\Box\exists y\, y = x$, but rather $\forall x(Nx \to \Box\exists y(Ny \wedge y = x))$
($N$ for "normal"), the falsity of which is perfectly compatible with constancy.
It's possible to fail to be normal; all that's impossible is to utterly fail to be.
Likewise for the third schema.
The defender of constancy relies on a distinction between "normal" and
"extraordinary" objects. This distinction is schematic; different defenders of
constancy might understand this distinction in different ways. Some might
say that the normal things are the existent things; extraordinary objects are
things that do not exist, but nevertheless are. Others might say that the normal
things are actual, and the extraordinary ones are nonactual (merely possible)
objects. Still others might say that the extraordinary things are those that are
not, but could have been, located in space and time.[5] However the distinction is
understood, this defense of SQML has hefty metaphysical commitments. Some
philosophers consider the postulation of nonexistent or nonactual entities as
being anywhere from obviously false to conceptually incoherent, or subversive,
or worse.[6] And even the postulation of contingently nonspatiotemporal entities
will strike many as extravagant.
[5] Compare Williamson (), whose defense of constancy inspired this section.
On the other hand, constancy's defenders can point to certain powerful
arguments in its favor. Here's a quick sketch of one such argument. First, the
following seems to be a logical truth:
    $\text{Ted} = \text{Ted}$
But it follows from this that:[7]
    $\exists y\, y = \text{Ted}$
This latter formula, too, is therefore a logical truth. But if $\phi$ is a logical truth
then so is $\Box\phi$ (recall the rule of necessitation from chapter 6). So we may infer
that the following is a logical truth:
    $\Box\exists y\, y = \text{Ted}$
Now, nothing in this argument depended on any special features of me. We
may therefore conclude that the reasoning holds good for every object; and
so $\forall x\Box\exists y\, y = x$ is indeed a logical truth. Since, therefore, every object exists
necessarily, it should come as no surprise that there are things that might have
been ghosts, dragons, and so on; for if there had been a ghost, it would have
necessarily existed, and thus must actually exist. This and other related
arguments have apparently wild conclusions, but they cannot be lightly dismissed,
for it is hard to say exactly where they go wrong (if they go wrong at all!).[8]
[6] See Quine (); Lycan ().
[7] Free logicians will of course resist this step. See section 5.6.
[8] On this topic see Prior (, -); Plantinga (); Fine (); Linsky and Zalta
(, ); Williamson (, ).
9.6 Variable domains
We now consider a way of dealing with the problems discussed in section 9.5.2
that does not require embracing constancy.
SQML models contain a single domain, $\mathcal{D}$, over which the quantifiers range
in each possible world. Since it was this feature that led to the problems of
section 9.5.2, let's introduce a new semantics that instead provides different
domains for different possible worlds. And let's also reinstate the accessibility
relation, for reasons to be made clear below. The new semantics is called
VDQML ("variable-domains quantified modal logic"):
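Definition of model: A VDQML-model is a 5-tuple $\langle\mathcal{W}, \mathcal{R}, \mathcal{D}, \mathcal{Q}, \mathcal{I}\rangle$ where:
- $\mathcal{W}$ is a nonempty set (possible worlds)
- $\mathcal{R}$ is a binary relation over $\mathcal{W}$ (accessibility relation)
- $\mathcal{D}$ is a nonempty set (super-domain)
- $\mathcal{Q}$ is a function that assigns to any $w \in \mathcal{W}$ a subset of $\mathcal{D}$. Let us refer
  to $\mathcal{Q}(w)$ as $\mathcal{D}_w$. Think of $\mathcal{D}_w$ as $w$'s sub-domain: the set of objects
  that exist at $w$.
- $\mathcal{I}$ is a function such that: (interpretation function)
  - if $\Pi$ is an $n$-place predicate then $\mathcal{I}(\Pi)$ is a set of ordered $n{+}1$-tuples
    $\langle u_1, \ldots, u_n, w\rangle$, where $u_1, \ldots, u_n$ are members of $\mathcal{D}$, and $w \in \mathcal{W}$.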
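Definition of valuation: The valuation function $V_{\mathcal{M},g}$, for VDQML-model
$\mathcal{M}$ $(= \langle\mathcal{W}, \mathcal{R}, \mathcal{D}, \mathcal{Q}, \mathcal{I}\rangle)$ and variable assignment $g$, is defined as the function
that assigns either 0 or 1 to each wff relative to each member of $\mathcal{W}$, subject to
the following constraints:
- for any terms $\alpha$ and $\beta$, $V_{\mathcal{M},g}(\alpha = \beta, w) = 1$ iff $[\alpha]_{\mathcal{M},g} = [\beta]_{\mathcal{M},g}$
- for any $n$-place predicate, $\Pi$, and any terms $\alpha_1, \ldots, \alpha_n$,
  $V_{\mathcal{M},g}(\Pi\alpha_1 \ldots \alpha_n, w) = 1$ iff $\langle[\alpha_1]_{\mathcal{M},g}, \ldots, [\alpha_n]_{\mathcal{M},g}, w\rangle \in \mathcal{I}(\Pi)$
- for any wffs $\phi$ and $\psi$, and variable, $\alpha$,
    $V_{\mathcal{M},g}(\sim\phi, w) = 1$ iff $V_{\mathcal{M},g}(\phi, w) = 0$
    $V_{\mathcal{M},g}(\phi \to \psi, w) = 1$ iff $V_{\mathcal{M},g}(\phi, w) = 0$ or $V_{\mathcal{M},g}(\psi, w) = 1$
    $V_{\mathcal{M},g}(\forall\alpha\phi, w) = 1$ iff for each $u \in \mathcal{D}_w$, $V_{\mathcal{M},g^\alpha_u}(\phi, w) = 1$
    $V_{\mathcal{M},g}(\Box\phi, w) = 1$ iff for each $v \in \mathcal{W}$, if $\mathcal{R}wv$ then $V_{\mathcal{M},g}(\phi, v) = 1$
The definition of denotation remains unchanged. The obvious derived clauses
for $\exists$ and $\Diamond$ are as follows:
    $V_{\mathcal{M},g}(\exists\alpha\phi, w) = 1$ iff for some $u \in \mathcal{D}_w$, $V_{\mathcal{M},g^\alpha_u}(\phi, w) = 1$
    $V_{\mathcal{M},g}(\Diamond\phi, w) = 1$ iff for some $v \in \mathcal{W}$, $\mathcal{R}wv$ and $V_{\mathcal{M},g}(\phi, v) = 1$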
Thus, we have introduced subdomains. We still have $\mathcal{D}$, a set that contains
all of the possible individuals. But for each possible world $w$, we introduce a
subset of the domain, $\mathcal{D}_w$, to be the domain for $w$. When evaluating a quantified
sentence at a world $w$, the quantifier ranges only over $\mathcal{D}_w$.
How should we define validity and semantic consequence here? There are
some complications. Our earlier definitions were: a valid formula must be true
at every world in every model under every variable assignment; and semantic
consequence is truth preservation at every world in every model under every
variable assignment. Sticking to those definitions leads to some odd results.
For example, the formula $\forall x Fx \to Fy$ turns out to be invalid. For consider
a model with a world $w$ such that everything in $\mathcal{D}_w$ is $F$ at $w$; suppose that
some object $u$ that is not a member of $\mathcal{D}_w$ is not $F$ at $w$; and consider a variable
assignment that assigns $u$ to $y$. $\forall x Fx$ is then true but $Fy$ is false at $w$, relative
to this model and variable assignment. This result is odd because $\forall x Fx \to Fy$
is an instance of the principle of universal instantiation (axiom schema PC1
from section 4.4). For similar reasons, $\forall x Fx \to Fa$ comes out invalid as well.
The example could be blocked by redefining validity. We could say that a
formula is valid iff it is true for every admissible choice of a model $\mathcal{M}$, world $w$,
and variable assignment $g$, where such a choice is admissible iff $[\alpha]_{\mathcal{M},g} \in \mathcal{D}_w$ for
each term $\alpha$ (whether variable or constant). But this just relocates the oddity:
now the rule of necessitation fails to preserve validity. $\forall x Fx \to Fy$ now turns
out valid, but $\Box(\forall x Fx \to Fy)$ does not (as a model like the one considered in
the previous paragraph demonstrates). Alternatively, we could stick with the
original definition, embrace the invalidity of $\forall x Fx \to Fy$ (and of $\forall x Fx \to Fa$),
thus accepting free logic (section 5.6).
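The valuation clauses and the failure of universal instantiation can be made concrete with a small evaluator. This is only a sketch under stated assumptions: formulas are encoded as nested tuples, a model is a plain dictionary with keys `"W"`, `"R"`, `"D"`, `"Q"`, `"I"`, and the names `val`, `denote`, `model`, `g`, and `ui` are all hypothetical conveniences rather than anything from the text.

```python
def denote(term, model, g):
    """Denotation of a term: variables via the assignment g, names via I."""
    return g[term] if term in g else model["I"][term]

def val(phi, w, model, g):
    """Truth (True/False) of tuple-encoded formula phi at world w."""
    op = phi[0]
    if op == "eq":                      # ("eq", term1, term2)
        return denote(phi[1], model, g) == denote(phi[2], model, g)
    if op == "pred":                    # ("pred", "F", (term, ...))
        args = tuple(denote(t, model, g) for t in phi[2])
        return args + (w,) in model["I"][phi[1]]
    if op == "not":                     # ("not", phi)
        return not val(phi[1], w, model, g)
    if op == "imp":                     # ("imp", phi, psi)
        return (not val(phi[1], w, model, g)) or val(phi[2], w, model, g)
    if op == "all":                     # ("all", "x", phi): ranges over D_w only
        return all(val(phi[2], w, model, {**g, phi[1]: u})
                   for u in model["Q"][w])
    if op == "box":                     # ("box", phi): all accessible worlds
        return all(val(phi[1], v, model, g)
                   for v in model["W"] if (w, v) in model["R"])
    raise ValueError(f"unknown operator: {op}")

# One world w whose sub-domain contains u1 but not u2; only u1 is F at w.
model = {
    "W": {"w"},
    "R": {("w", "w")},
    "D": {"u1", "u2"},
    "Q": {"w": {"u1"}},
    "I": {"F": {("u1", "w")}},
}
g = {"y": "u2"}                          # y is assigned an object outside D_w

all_F = ("all", "x", ("pred", "F", ("x",)))
F_y = ("pred", "F", ("y",))
ui = ("imp", all_F, F_y)                 # forall x Fx -> Fy

print(val(all_F, "w", model, g), val(F_y, "w", model, g), val(ui, "w", model, g))
```

The quantifier clause consults only `model["Q"][w]`, while the predicate clause looks up tuples over the whole super-domain; that asymmetry is what lets the instance of universal instantiation fail here.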
Note that if $\mathcal{M}$ is an SQML model, then we can construct a corresponding
VDQML model with the same set of worlds, (super-) domain, and interpretation
function, in which every world is accessible from every other, and in
which $\mathcal{Q}$ is a constant function assigning the whole super-domain to each world.
It is intuitively clear that the same sentences are true in this corresponding
model as are true in $\mathcal{M}$. Hence, whenever a sentence is SQML-invalid, it is
VDQML-invalid. (The converse of course is not true.)
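9.6.1 Contingent existence vindicated
What is the status in VDQML of the controversial SQML-valid formulas
discussed in section 9.5.2? They all turn out invalid. Here is an abbreviated
countermodel to the Barcan formula; for the others, see exercise 9.4.
Example 9.4: $\nvDash_{\text{VDQML}} \forall x\Box Fx \to \Box\forall x Fx$:
    World r: $\mathcal{D}_r : \{u\}$, $F : \{u\}$
    World a (accessible from r): $\mathcal{D}_a : \{u, v\}$, $F : \{u\}$
Official model:
    $\mathcal{W} = \{r, a\}$
    $\mathcal{R} = \{\langle r, r\rangle, \langle r, a\rangle, \langle a, a\rangle\}$
    $\mathcal{D} = \{u, v\}$
    $\mathcal{D}_r = \{u\}$
    $\mathcal{D}_a = \{u, v\}$
    $\mathcal{I}(F) = \{\langle u, r\rangle, \langle u, a\rangle\}$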
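As a sanity check on the official model, one can compute the antecedent and consequent of the Barcan formula at r directly from the sets. The sketch below hard-codes that one formula rather than implementing a general evaluator; the Python names (`W`, `R`, `Q`, `F`, `accessible`, and so on) are illustrative assumptions.

```python
# The official model of Example 9.4, written as Python sets (illustrative names).
W = {"r", "a"}
R = {("r", "r"), ("r", "a"), ("a", "a")}   # accessibility
Q = {"r": {"u"}, "a": {"u", "v"}}          # sub-domain of each world
F = {("u", "r"), ("u", "a")}               # extension of F: (object, world) pairs

def accessible(w):
    """Worlds accessible from w."""
    return {w2 for w2 in W if (w, w2) in R}

def antecedent(w):
    """forall x box Fx at w: every object in D_w is F at every accessible world."""
    return all((obj, w2) in F for obj in Q[w] for w2 in accessible(w))

def consequent(w):
    """box forall x Fx at w: at every accessible world w2, every object in D_w2 is F."""
    return all((obj, w2) in F for w2 in accessible(w) for obj in Q[w2])

def barcan(w):
    return (not antecedent(w)) or consequent(w)

print(antecedent("r"), consequent("r"), barcan("r"))
```

The only difference between the two functions is whose sub-domain the object is drawn from, `Q[w]` or `Q[w2]`; the extra object v in the domain of a is what falsifies the consequent at r.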
Exercise 9.3 Does the move to variable domain semantics change
whether any of the formulas in exercise set 9.1 are valid? Justify
your answers.
Exercise 9.4 Demonstrate the VDQML-invalidity of the following
formulas.
a) $\Box\forall x Fx \to \forall x\Box Fx$
b) $\exists x\Box Fx \to \Box\exists x Fx$
c) $\forall x\Box\exists y\, y = x$
9.6.2 Increasing, decreasing domains
If we made certain restrictions on the accessibility relation in variable-domains
models, then the validity of the controversial formulas of section 9.5.2 would
be reinstated. For example, the counterexample to the Barcan formula in the
previous section required a model in which the domain expanded; world a was
accessible from world r, and had a larger domain. But suppose we included the
following constraint on $\mathcal{R}$ in any model:
    if $\mathcal{R}wv$, then $\mathcal{D}_v \subseteq \mathcal{D}_w$      (decreasing domains)
The counterexample would then go away. Indeed, every instance of the Barcan
schema would then become valid, which may be proved as follows:
i) Suppose for reductio that $V_g(\forall\alpha\Box\phi \to \Box\forall\alpha\phi, w) = 0$. So $V_g(\forall\alpha\Box\phi, w) = 1$ and
ii) $V_g(\Box\forall\alpha\phi, w) = 0$. So for some $v$, $\mathcal{R}wv$ and $V_g(\forall\alpha\phi, v) = 0$; and so,
    for some $u \in \mathcal{D}_v$, $V_{g^\alpha_u}(\phi, v) = 0$.
iii) Given decreasing domains, $\mathcal{D}_v \subseteq \mathcal{D}_w$, so $u \in \mathcal{D}_w$. So by i), $V_{g^\alpha_u}(\Box\phi, w) = 1$;
    and so $V_{g^\alpha_u}(\phi, v) = 1$. This contradicts ii).
Similarly, the following constraint would validate the converse Barcan
schema as well as $\exists\alpha\Box\phi \to \Box\exists\alpha\phi$ (exercise 9.5):
    if $\mathcal{R}wv$ then $\mathcal{D}_w \subseteq \mathcal{D}_v$      (increasing domains)
Even after imposing the increasing domains constraint, the Barcan formula
remains invalid; and after imposing the decreasing domains constraint, the
converse Barcan formula and also $\exists x\Box Fx \to \Box\exists x Fx$ remain invalid. But when the
accessibility relation is symmetric (as it is in B and S5) this collapses: imposing
either constraint results in imposing both.
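These claims can be spot-checked by exhaustive search over very small models. The following sketch enumerates every VDQML-style model with two worlds and two objects and tests the single instance $\forall x\Box Fx \to \Box\forall x Fx$; it is an illustration, not a proof of validity, and the names (`models`, `counterexamples`, `decreasing`, `increasing`) are assumptions introduced here.

```python
from itertools import chain, combinations, product

WORLDS = ("w0", "w1")
OBJECTS = ("u", "v")

def subsets(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

RELATIONS = subsets(product(WORLDS, WORLDS))      # candidate accessibility relations
DOMAINS = subsets(OBJECTS)                        # candidate sub-domains
EXTENSIONS = subsets(product(OBJECTS, WORLDS))    # candidate extensions of F

def barcan(w, R, Q, F):
    """Truth of (forall x box Fx -> box forall x Fx) at world w."""
    acc = [v for v in WORLDS if (w, v) in R]
    antecedent = all((o, v) in F for o in Q[w] for v in acc)
    consequent = all((o, v) in F for v in acc for o in Q[v])
    return (not antecedent) or consequent

def decreasing(R, Q):
    """If Rwv then D_v is a subset of D_w."""
    return all(Q[v] <= Q[w] for (w, v) in R)

def increasing(R, Q):
    """If Rwv then D_w is a subset of D_v."""
    return all(Q[w] <= Q[v] for (w, v) in R)

def models():
    """Every (R, Q, F) triple over the fixed worlds and objects."""
    for R in RELATIONS:
        for doms in product(DOMAINS, repeat=len(WORLDS)):
            Q = dict(zip(WORLDS, doms))
            for F in EXTENSIONS:
                yield R, Q, F

def counterexamples(constraint):
    """Models satisfying the constraint in which the Barcan instance fails somewhere."""
    return [(R, Q, F) for R, Q, F in models()
            if constraint(R, Q)
            and not all(barcan(w, R, Q, F) for w in WORLDS)]

print(len(counterexamples(decreasing)), len(counterexamples(increasing)) > 0)
```

Under the decreasing domains constraint the search finds no countermodel, as the proof above predicts; under the increasing domains constraint it does find some, in line with the remark that the Barcan formula remains invalid there.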
Exercise 9.5 Show that every instance of each of the following
schemas is valid given the increasing domains requirement.
a) $\Box\forall\alpha\phi \to \forall\alpha\Box\phi$
b) $\exists\alpha\Box\phi \to \Box\exists\alpha\phi$
9.6.3 Strong and weak necessity
In order for $\Box\phi$ to be true at a world, the VDQML semantics requires that $\phi$
be true at every accessible world. This requirement might seem too strong. In
order for $\Box Fa$, say, to be true, $Fa$ must be true in all possible worlds. But what
if $a$ fails to exist in some worlds? In order for "Necessarily, I am human" to
be true, must I be human in every possible world? Isn't it enough for me to be
human in all the worlds in which I exist?
If the underlying worry here is that $a$ must exist necessarily in order for
$\Box Fa$ to be true (that I must exist necessarily in order to be necessarily human)
then the worry is unfounded. The VDQML semantics does require $Fa$ to
be true in every world in order for $\Box Fa$ to be true; but it does not require $a$
to exist in every world in which $Fa$ is true. The clause in the definition of a
VDQML-model for the interpretation of predicates was this:
    if $\Pi$ is an $n$-place predicate then $\mathcal{I}(\Pi)$ is a set of ordered $n{+}1$-tuples
    $\langle u_1, \ldots, u_n, w\rangle$, where $u_1, \ldots, u_n$ are members of $\mathcal{D}$, and $w \in \mathcal{W}$.
(Note that the clause requires $u_1, \ldots, u_n$ to be members of $\mathcal{D}$, not of $\mathcal{D}_w$.)
$\mathcal{I}(F)$ is allowed to contain pairs $\langle u, w\rangle$, where $u$ is not a member of $\mathcal{D}_w$.
$\Box Fa$ is consistent with $a$'s failing to necessarily exist; it's just that $a$ has
to be $F$ even in worlds where it doesn't exist.
I doubt this really addresses the philosophical worry about the semantics,
though, since it looks like bad metaphysics to say that a person could be human
at a world where she doesn't exist. One could hard-wire a prohibition of this
sort of bad metaphysics into VDQML semantics, by replacing the old clause
with a new one:
    if $\Pi$ is an $n$-place predicate then $\mathcal{I}(\Pi)$ is a set of ordered $n{+}1$-tuples
    $\langle u_1, \ldots, u_n, w\rangle$, where $u_1, \ldots, u_n$ are members of $\mathcal{D}_w$, and $w \in \mathcal{W}$.
thus barring objects from having properties at worlds where they don't exist.
But some would argue that this goes too far. The new clause validates
$\forall x\Box(Fx \to \exists y\, y = x)$. "An object must exist in order to be $F$" sounds clearly
true if $F$ stands for "is human", but what if $F$ stands for "is famous"? If Baconians
had been right and there had been no such person as Shakespeare, perhaps
Shakespeare might still have been famous.
The issues here are complex.[9] But whether or not we should adopt the new
clause, it looks as though there are some existence-entailing English predicates
$\Pi$: predicates such that nothing can be a $\Pi$ without existing. "Is human" seems
to be such a predicate. So we're back to our original worry about
VDQML-semantics: its truth condition for $\Box\phi$ requires truth of $\phi$ at all worlds, which
is allegedly too strong in at least some cases, for example the case where $\phi$
represents "I am human".
[9] The question is that of so-called "serious actualism" (Plantinga, ).
One could modify the clause for the $\Box$ in the definition of the valuation
function, so that in order for $\Box Fa$ to be true, $a$ only needs to be $F$ in worlds in
which it exists:
    $V_{\mathcal{M},g}(\Box\phi, w) = 1$ iff for each $v \in \mathcal{W}$, if $\mathcal{R}wv$, and if $[\alpha]_{\mathcal{M},g} \in \mathcal{D}_v$ for each
    name or free variable $\alpha$ occurring in $\phi$, then $V_{\mathcal{M},g}(\phi, v) = 1$
This would indeed have the result that $\Box Fa$ gets to be true provided $a$ is $F$
in every world in which it exists. But be careful what you wish for. Along
with this result comes the following: even if $a$ doesn't necessarily exist, the
sentence $\Box\exists x\, x = a$ comes out true. For according to the new clause, in order
for $\Box\exists x\, x = a$ to be true, it must merely be the case that $\exists x\, x = a$ is true in every
world in which $a$ exists, and of course this is indeed the case.
If $\Box\exists x\, x = a$ comes out true even if $a$ doesn't necessarily exist, then $\Box\exists x\, x = a$
doesn't say that $a$ necessarily exists. Indeed, it doesn't look like we have any way
of saying that $a$ necessarily exists, using the language of QML, if the $\Box$ has the
meaning provided for it by the new clause.
A notion of necessity according to which "Necessarily $\phi$" requires truth in
all possible worlds is sometimes called a notion of strong necessity. In contrast,
a notion of weak necessity is one according to which "Necessarily $\phi$" requires
merely that $\phi$ be true in all worlds in which objects referred to within $\phi$ exist.
The new clause for the $\Box$ corresponds to weak necessity, whereas our original
clause corresponds to strong necessity.
As we saw, if the $\Box$ expresses weak necessity, then one cannot even express
the idea that a thing necessarily exists. That's because one needs strong necessity
to say that a thing necessarily exists: in order to necessarily exist, you need to
exist at all worlds, not just at all worlds at which you exist! So this is a deficiency
of having the $\Box$ of QML express weak necessity. But if we allow the $\Box$ to
express strong necessity instead, there is no corresponding deficiency, for one
can still express weak necessity using the strong $\Box$ and other connectives. For
example, to say that $a$ is weakly necessarily $F$ (that is, that $a$ is $F$ in every world
in which it exists), one can say: $\Box(\exists x\, x = a \to Fa)$.
So it would seem that we should stick with our original truth condition for
the $\Box$, and live with the fact that statements like $\Box Fa$ turn out false if $a$ fails
to be $F$ at worlds in which it doesn't exist. Those who think that "Necessarily,
I am human" is true despite my possible nonexistence can always translate
this natural language sentence into the language of QML as $\Box(\exists x\, x = a \to Fa)$
(which requires $a$ to be $F$ only at worlds at which it exists) rather than as $\Box Fa$
(which requires $a$ to be $F$ at all worlds).
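The contrast between the two readings of the box can be illustrated on a two-world model in which the object named by a exists at only one world. This is a hedged sketch: the names (`strong_box`, `weak_box`, `exists`, `is_F`, the object `"ted"`) are made up for illustration, and the weak clause is read as requiring existence at the world being evaluated, which is the reading on which "F in every world in which it exists" comes out right.

```python
# Two worlds; the object "ted" exists at w1 only, and is F exactly where it exists.
W = {"w1", "w2"}
R = {(w, v) for w in W for v in W}       # every world accessible from every world
Q = {"w1": {"ted"}, "w2": set()}         # sub-domains
F = {("ted", "w1")}                      # extension of F: (object, world) pairs
a = "ted"                                # denotation of the name a

def exists(obj, w):
    """Truth at w of 'there is an x such that x = a' when a denotes obj."""
    return obj in Q[w]

def is_F(obj, w):
    return (obj, w) in F

def strong_box(prop, w):
    """Original clause: prop must hold at every accessible world."""
    return all(prop(v) for v in W if (w, v) in R)

def weak_box(prop, obj, w):
    """Modified clause: prop must hold at accessible worlds where obj exists."""
    return all(prop(v) for v in W if (w, v) in R and exists(obj, v))

strong_Fa = strong_box(lambda v: is_F(a, v), "w1")
strong_conditional = strong_box(lambda v: (not exists(a, v)) or is_F(a, v), "w1")
weak_Fa = weak_box(lambda v: is_F(a, v), a, "w1")
weak_existence = weak_box(lambda v: exists(a, v), a, "w1")
strong_existence = strong_box(lambda v: exists(a, v), "w1")

print(strong_Fa, strong_conditional, weak_Fa, weak_existence, strong_existence)
```

In this model the strong box makes "necessarily Fa" false and "necessarily, if a exists then Fa" true, while the weak box makes both "necessarily Fa" and "necessarily a exists" true, which is the expressive loss described above.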
9.6.4 Actualist and possibilist quantification
Suppose we kept the definition of a VDQML model as-is, but added a new
expression $\forall_p$ to the language of QML, with a grammar just like $\forall$ (i.e., $\forall_p\alpha\phi$ is
a wff for each variable $\alpha$ and wff $\phi$), and with a semantics given by the following
added clause to the definition of the valuation function:
    $V_{\mathcal{M},g}(\forall_p\alpha\phi, w) = 1$ iff for each $u \in \mathcal{D}$, $V_{\mathcal{M},g^\alpha_u}(\phi, w) = 1$
Thus, in any world $w$, whereas $\forall$ ranges just over $\mathcal{D}_w$, $\forall_p$ ranges over all of $\mathcal{D}$.
$\forall_p$ is sometimes called a "possibilist" quantifier, since it ranges over all
possible objects; $\forall$ is called an "actualist" quantifier since it ranges at world $w$
only over the objects that are actual at $w$. In this setup, $\forall$ continues to behave as
it does in VDQML, and hence the Barcan formula and company remain invalid.
But the $\forall_p$ behaves just like $\forall$ did in SQML. For example, $\forall_p x\Box Fx \to \Box\forall_p x Fx$
and $\forall_p x\Box\exists_p y\, y = x$ come out valid (where "$\exists_p$" is defined as meaning "$\sim\forall_p\sim$").
Formally speaking, this approach is very similar to the approach of section
9.5.3. For in effect, this section's $\forall_p$ is the sole quantifier $\forall$ of section 9.5.3;
and this section's $\forall$ is the restricted quantifier "$\forall x(Nx \to$" of section 9.5.3,
where $N$ is a predicate symbolizing "is a normal object". On the face of it, the
approaches are metaphysically similar as well. If there is a difference between
introducing two quantifiers, one possibilist and one actualist, on the one hand,
and introducing a single quantifier plus a predicate for normalcy/actuality, on
the other, then it's a subtle one.[10]
[10] Although see McDaniel (); Turner (MS) for some related subtle metaphysics.
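9.7 Axioms for SQML
So far our approach has been purely semantic. But one can also take a
proof-theoretic approach to quantified modal logic. This is quite straightforward for
SQML. (One can do it for VDQML as well, but we won't pursue that here.) To
get an axiomatic system, for instance, one can simply combine the axioms for
predicate logic introduced in section 4.4 with the axioms for S5 from section
6.4.6, plus axioms governing the identity sign:
Axiomatic system SQML:
- Rules: MP, plus:
    $\dfrac{\phi}{\forall\alpha\phi}$ UG      $\dfrac{\phi}{\Box\phi}$ NEC
- Axioms: instances of the PL1, PL2, PL3 schemas, plus:
    $\forall\alpha\phi \to \phi(\beta/\alpha)$      (PC1)
    $\forall\alpha(\phi \to \psi) \to (\phi \to \forall\alpha\psi)$      (PC2)
    $\Box(\phi \to \psi) \to (\Box\phi \to \Box\psi)$      (K)
    $\Box\phi \to \phi$      (T)
    $\Diamond\Box\phi \to \Box\phi$      (S5)
    $\alpha = \alpha$      (RX)
    $\alpha = \beta \to (\phi(\alpha) \to \phi(\beta))$      (II)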
QML wffs are now allowed to be substituted for the schematic letters.
Substitutions for the schematic letters in PC1 and PC2 must be restricted as discussed in
section 4.4. In RX ("reflexivity"), $\alpha$ may be any variable or individual constant;
in II ("indiscernibility of identicals"), $\alpha$ and $\beta$ may be any two variables or
individual constants, and $\phi(\alpha)$ and $\phi(\beta)$ may be any two wffs that are exactly
alike except that zero or more occurrences of $\alpha$ (free occurrences if $\alpha$ is a
variable) in the first are replaced by occurrences of $\beta$ (free occurrences, if $\beta$ is
a variable) in the second. Thus, RX says that everything is self-identical, and II
expresses the familiar principle of the indiscernibility of identicals: if objects
are identical then anything true of one is true of the other as well.[11]
[11] II must be distinguished from section 5.4.3's "indiscernibility of identicals" (though each
is based on the same idea). The former is an axiom schema (a claim in the metalanguage
to the effect that any formula of a certain shape is an axiom); the axioms it generates have
only first-order variables; and it is, intuitively, limited to properties that one can express in
the language of QML. The latter is a single sentence of the object language (the language
of second-order predicate logic); it contains second-order variables; and it is not limited to
expressible properties (since the second-order variable $X$ ranges over all subsets of the domain).
As always, a theorem is defined as the last line of a proof in which each line
is either an axiom or follows from earlier lines by a rule. As with MPL, we will
be interested only in theoremhood, and will not consider proofs from premise
sets. And we will use shortcuts as in sections 2.8, 4.4, and 6.4 to ease the pain of
axiomatic proofs. Though we won't prove this here, our SQML axiom system
is sound and complete with respect to the SQML semantics: a QML-wff is
SQML-valid iff it is an SQML-theorem.[12]
[12] See Hughes and Cresswell (, chapters ).
Many theorems of SQML are just what you'd expect from the result of
putting together axioms for predicate logic and axioms for S5 modal
propositional logic. Examples include:
    $\Box\forall x(Fx \wedge Gx) \to \Box\forall x Fx$
    $(\Box\forall x Fx \wedge \Box\forall x Gx) \to \Box\forall x(Fx \wedge Gx)$
    $(\Box\forall x Fx \wedge \Diamond\exists x Gx) \to \Diamond\exists x(Fx \wedge Gx)$
The first of these formulas, for example, is just an instance of a familiar sort
of K-theorem, namely, a wff of the form $\Box\phi \to \Box\psi$ where $\phi \to \psi$ is provable on
its own. The only difference in this case is that to prove $\phi \to \psi$ here, namely
$\forall x(Fx \wedge Gx) \to \forall x Fx$, you need to use predicate logic techniques (see example
4.9). The other theorems are also unsurprising: each is, intuitively, an amalgam
of a predicate logic theorem and an S5 MPL theorem.
But other theorems of SQML are more surprising. In particular, all
instances of the Barcan and converse Barcan schemas are theorems. An instance
of the converse Barcan schema may be proved as follows:
    1. $\forall x Fx \to Fx$      PC1
    2. $\Box\forall x Fx \to \Box Fx$      1, NEC, K, MP
    3. $\forall x(\Box\forall x Fx \to \Box Fx)$      2, UG
    4. $\Box\forall x Fx \to \forall x\Box Fx$      PC2, 3, MP
Note that the only propositional modal logic required in this proof is K. The
proof of the Barcan formula, on the other hand, requires B:
    1. $\forall x\Box Fx \to \Box Fx$      PC1
    2. $\Diamond\forall x\Box Fx \to \Diamond\Box Fx$      1, NEC, K$\Diamond$, MP
    3. $\Diamond\Box Fx \to Fx$      B
    4. $\Diamond\forall x\Box Fx \to Fx$      2, 3, PL (syllogism)
    5. $\Diamond\forall x\Box Fx \to \forall x Fx$      4, UG, PC2, MP
    6. $\Box\Diamond\forall x\Box Fx \to \Box\forall x Fx$      5, NEC, K, MP
    7. $\forall x\Box Fx \to \Box\Diamond\forall x\Box Fx$      B$\Diamond$
    8. $\forall x\Box Fx \to \Box\forall x Fx$      6, 7, PL (syllogism)
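For one more example, the formula $\Box\exists x\, x = a$, which attributes necessary
existence to $a$, may be proved as follows:
    1. $a = a$      RX
    2. $\forall x{\sim}x = a \to {\sim}a = a$      PC1
    3. $\sim\forall x{\sim}x = a$      1, 2, PL
    4. $\Box{\sim}\forall x{\sim}x = a$      3, NEC
The conclusion, $\Box{\sim}\forall x{\sim}x = a$, is the definitional equivalent of $\Box\exists x\, x = a$.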
Given completeness, all the other controversial SQML-valid formulas are
also SQML-theorems. Anyone who rejects these controversial formulas must
therefore come up with some other proof-theoretic approach. There are indeed
other proof-theoretic approaches. But these approaches are more complex.
Though we won't go into this in detail, let's look quickly at one possibility.
Take the SQML-proof of $\Box\exists x\, x = a$. How might we revise the rules of our
SQML axiomatic system to block it? The simplest method is to replace the
standard predicate logic axioms with those from free predicate logic. Once we
reach line 3 we have proved, purely by means of predicate logic, the sentence
$\sim\forall x{\sim}x = a$, that is, $\exists x\, x = a$. This is just the kind of conclusion that the free
logician wants to block; from her point of view, the name $a$ might fail to denote
any existing object. In section 5.6.2 we saw that the axiom of standard predicate
logic that is objectionable to free logicians is PC1. PC1 expresses the principle
of universal instantiation: if everything is $\phi$, then $\beta$ is $\phi$. The free logical
restriction of PC1 mentioned in section 5.6.2 was this:
    $\forall\alpha\phi \to (\exists\gamma\, \gamma = \beta \to \phi(\beta/\alpha))$      (PC1$'$)
which says: if everything is $\phi$, then $\beta$ is $\phi$ provided $\beta$ exists. Replacing PC1
with PC1$'$ blocks the proof of $\Box\exists x\, x = a$ at step 2. It also blocks the proofs of
the Barcan and converse Barcan formulas given above. Using this free-logical
approach, one can develop various axiomatic systems for QML that are sound
and complete with respect to variable domain semantics.[13]
Exercise 9.6 Construct axiomatic proofs in SQML for each of the
following wffs.
a)* $\Box(\Box\forall x(Fx \to Gx) \wedge \exists x Fx) \to \Box\exists x Gx$
b) $(\Box\forall x Fx \wedge \Diamond\exists x Gx) \to \Diamond\exists x(Fx \wedge Gx)$
c) $(\forall x\Box(Fx \to Gx) \wedge \exists x\Diamond Fx) \to \exists x\Diamond Gx$
d) $\forall y\Box\exists x\, x = y$
[13] See Garson (). An alternate approach is that originally taken by Kripke, who blocks
the objectionable proofs by disallowing the style of reasoning using free variables on which
those proofs are based. See Kripke (), and Hughes and Cresswell (, ).
Chapter 10
Two-dimensional modal logic
In this chapter we consider a class of extensions to modal logic with
considerable philosophical interest.
10.1 Actuality
The word "actually", in one of its senses anyway, can be thought of as a one-place
sentence operator: "Actually, $\phi$."
"Actually" might at first seem redundant. "Actually, snow is white" just
amounts to: "snow is white"; "actually, grass is blue" just amounts to: "grass is
blue". But it's not redundant when it's embedded inside modal operators. The
following two sentences, for example, have different meanings:
    Necessarily, if grass is blue then grass is blue
    Necessarily, if grass is blue then grass is actually blue
The first sentence makes the trivially true claim that grass is blue in any possible
world in which grass is blue. But the second sentence makes the false claim that
if grass is blue in any world, then grass is blue in the actual world. Intuitively,
"actual" lets us talk about what's going on in the actual world, even if we're
inside the scope of a modal operator where normally we'd be talking about
other possible worlds.
We symbolize "Actually, $\phi$" as "$@\phi$". (Grammar: whenever $\phi$ is a wff,
so is $@\phi$.) We can now symbolize the pair of sentences above as $\Box(B \to B)$
and $\Box(B \to @B)$, respectively. For some further examples of sentences we can
symbolize using "actually", consider:[1]
    It might have been that everyone who is actually rich is poor
        $\Diamond\forall x(@Rx \to Px)$
    There could have existed something that does not actually exist
        $\Diamond\exists x\, @{\sim}\exists y\, y = x$
10.1.1 Kripke models with designated worlds
Before doing semantics for @, let's return to the semantics of standard
propositional modal logic. Here is a way of doing that semantics which differs slightly
from that of section 6.3.1. First, instead of a triple $\langle\mathcal{W}, \mathcal{R}, \mathcal{I}\rangle$, let an
MPL-model be a quadruple $\langle\mathcal{W}, w_@, \mathcal{R}, \mathcal{I}\rangle$, where $\mathcal{W}$, $\mathcal{R}$, and $\mathcal{I}$ are as before, and
$w_@$ is some member of $\mathcal{W}$, thought of as the actual, or designated world of the
model. Second, define the valuation function exactly as before (the designated
world $w_@$ plays no role here). But third, use the designated worlds in the
following new definitions (where S is any modal system):
Definitions of truth in a model, validity, and semantic consequence:
- $\phi$ is true in model $\mathcal{M}$ $(= \langle\mathcal{W}, w_@, \mathcal{R}, \mathcal{I}\rangle)$ iff $V_{\mathcal{M}}(\phi, w_@) = 1$
- $\phi$ is S-valid iff $\phi$ is true in all S-models
- $\phi$ is an S-semantic consequence of $\Gamma$ iff for any S-model $\mathcal{M}$, if each
  $\gamma \in \Gamma$ is true in $\mathcal{M}$ then $\phi$ is true in $\mathcal{M}$
One could add a designated world to models for quantified modal logic in a
parallel way.
The old definitions of validity and semantic consequence, recall, never used
any notion of truth in a model. (A valid formula, for example, was defined
as a formula that is valid in all models.) But in model theory generally, one
normally defines some notion of truth in a model, and then uses it to define
validity as truth in all models, and semantic consequence as the preservation of
truth in models. The nice thing about our new definitions is that they let us do
the same for modal logic. But they don't differ in any substantive way from the
old definitions; they yield exactly the same results (exercise 10.1).
[1] In certain special cases, we could do without the new symbol @. For example, instead of
symbolizing "Necessarily, if grass is blue then grass is actually blue" as $\Box(B \to @B)$, we could
symbolize it as $\Diamond B \to B$. But the @ is not in general eliminable; see Hodes (b,a).
Exercise 10.1* Show that the new definitions of validity and
semantic consequence are equivalent to the old ones.
10.1.2 Semantics for @
We can give @ a simple semantics using models with designated worlds. And
now the designated worlds will play a role in the valuation function, not just
in the definition of validity. We'll move straight to quantified modal logic,
bypassing propositional logic. To keep things simple, let's go the SQML route:
constant domain and no accessibility relation.
Definition of model: A designated-world SQML-model is a four-tuple
$\langle \mathcal{W}, w_@, \mathcal{D}, \mathcal{I} \rangle$, where:
$\mathcal{W}$ is a non-empty set ("worlds")
$w_@$ is a member of $\mathcal{W}$ ("designated/actual world")
$\mathcal{D}$ is a non-empty set ("domain")
$\mathcal{I}$ is an "interpretation" function that assigns semantic values as before
(to names: members of $\mathcal{D}$; to predicates: extensions relative to worlds)
The valuation function is defined just as for SQML (section 9.3), with the
following added clause for the new operator @:
$V_{\mathcal{M},g}(@\phi, w) = 1$ iff $V_{\mathcal{M},g}(\phi, w_@) = 1$
Thus, $@\phi$ is true at any world iff $\phi$ is true in the designated world.
10.1.3 Establishing validity and invalidity
The strategies for establishing the validity or invalidity of a given formula are
similar to those from chapter 9.
Example 10.1: Show that $\vDash \forall x(Fx \lor \Box Gx) \to \Box\forall x(Gx \lor @Fx)$.
i) Suppose for reductio that this formula is not valid. Then for some model
and variable assignment $g$, $V_g(\forall x(Fx \lor \Box Gx) \to \Box\forall x(Gx \lor @Fx), w_@) = 0$.
So $V_g(\forall x(Fx \lor \Box Gx), w_@) = 1$ and
ii) $V_g(\Box\forall x(Gx \lor @Fx), w_@) = 0$. So $V_g(\forall x(Gx \lor @Fx), a) = 0$ for some
$a \in \mathcal{W}$. So $V_{g^x_u}(Gx \lor @Fx, a) = 0$ for some $u \in \mathcal{D}$. So $V_{g^x_u}(Gx, a) = 0$
and
iii) $V_{g^x_u}(@Fx, a) = 0$. So $V_{g^x_u}(Fx, w_@) = 0$ (by the truth condition for @).
iv) Given line i), $V_{g^x_u}(Fx \lor \Box Gx, w_@) = 1$. So either $V_{g^x_u}(Fx, w_@) = 1$ or
$V_{g^x_u}(\Box Gx, w_@) = 1$. So, given iii), $V_{g^x_u}(\Box Gx, w_@) = 1$, and so $V_{g^x_u}(Gx, a) = 1$,
contradicting ii).
Example 10.2: Show that $\nvDash \Box\forall x(Gx \lor @Fx) \to \Box\forall x(Gx \lor Fx)$:
$\mathcal{W} = \{w_@, a\}$
$\mathcal{D} = \{u\}$
$\mathcal{I}(F) = \{\langle u, w_@ \rangle\}$
$\mathcal{I}(G) = \emptyset$
The formula is false in world $w_@$ of this model. (The consequent is false in $w_@$
because at world $a$, something (namely, $u$) is neither $G$ nor $F$; but the antecedent
is true in $w_@$: since $u$ is $F$ at $w_@$, it's necessary that $u$ is either $G$ or actually $F$.)
So the formula is false in the model; so it is invalid.
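The countermodel of example 10.2 can also be checked mechanically. The following Python sketch is only an illustration, not part of the official semantics: the tuple encoding of formulas and the names `holds` and `true_in_model` are assumptions made for this example. It implements the designated-world SQML clauses (constant domain, no accessibility relation), including the clause for @, over a finite model.

```python
# Formulas are nested tuples (a hypothetical encoding chosen for this sketch):
#   ("pred", "F", "x"), ("not", p), ("or", p, q), ("imp", p, q),
#   ("all", "x", p), ("box", p), ("act", p)   # "act" plays the role of @

def holds(f, w, g, model):
    """Truth value of formula f at world w under variable assignment g."""
    worlds, w_at, domain, interp = model
    op = f[0]
    if op == "pred":   # one-place predicate applied to a variable
        return (g[f[2]], w) in interp[f[1]]
    if op == "not":
        return not holds(f[1], w, g, model)
    if op == "or":
        return holds(f[1], w, g, model) or holds(f[2], w, g, model)
    if op == "imp":
        return (not holds(f[1], w, g, model)) or holds(f[2], w, g, model)
    if op == "all":    # constant domain: quantify over the whole domain
        return all(holds(f[2], w, {**g, f[1]: u}, model) for u in domain)
    if op == "box":    # no accessibility relation: quantify over all worlds
        return all(holds(f[1], v, g, model) for v in worlds)
    if op == "act":    # @: evaluate at the designated world
        return holds(f[1], w_at, g, model)
    raise ValueError(f"unknown operator: {op}")

def true_in_model(f, model):
    """Truth in a model is truth at its designated world."""
    return holds(f, model[1], {}, model)

# The model of example 10.2: W = {w@, a}, D = {u}, I(F) = {<u, w@>}, I(G) = {}
model = ({"w@", "a"}, "w@", {"u"}, {"F": {("u", "w@")}, "G": set()})
Fx, Gx = ("pred", "F", "x"), ("pred", "G", "x")
antecedent = ("box", ("all", "x", ("or", Gx, ("act", Fx))))
consequent = ("box", ("all", "x", ("or", Gx, Fx)))
formula = ("imp", antecedent, consequent)

print(true_in_model(formula, model))  # False
```

A check of this kind can only refute validity by exhibiting a falsifying model; establishing validity, as in example 10.1, still requires an argument covering all models.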
10.2 $\times$
Adding @ to the language of quantified modal logic lets us express certain kinds
of comparisons between possible worlds that we couldn't express otherwise. But
it doesn't go far enough; we need a further addition.[2] Consider this sentence:
It might have been the case that, if all those then rich
might all have been poor, then someone is happy
What it's saying, in possible worlds terms, is this:
For some world $w$, if there's a world $v$ such that (everyone
who is rich in $w$ is poor in $v$), then someone is
happy in $w$
This is a bit like "It might have been that everyone who is actually rich is poor";
in this new sentence the word "then" plays a role a bit like the role "actually"
played in the earlier sentence. But the intention of the "then" is not to take us
"back" to the actual world; it is rather to take us back to the world, $w$, that was
introduced by the first possibility operator, "it might have been the case that".
We cannot, therefore, symbolize our new sentence this way:
$\Diamond(\Diamond\forall x(@Rx \to Px) \to \exists x Hx)$
For this says, in possible worlds terms:
For some world $w$, if there's a world $v$ such that (everyone
who is rich in $w_@$ is poor in $v$), then someone
is happy in $w$
The problem is that @, as we've defined it, always takes us back to the designated
world, whereas what we need to do is to "mark" the world $w$, and have @ take
us back to the marked world:
$\Diamond\times(\Diamond\forall x(@Rx \to Px) \to \exists x Hx)$
$\times$ marks the spot: it is a point of reference for subsequent occurrences of @.
[2] See Hodes (a) on the limitations of @; see Cresswell () on $\times$ (his symbol is "Ref"),
and further related additions.
10.2.1 Two-dimensional semantics for $\times$
So let's add another one-place sentence operator, $\times$ (grammar: whenever $\phi$
is a wff, so is $\times\phi$). The idea is that $\times\phi$ means the same thing as $\phi$, except
that subsequent occurrences of @ in $\phi$ are to be interpreted as picking out the
world that was the "current world of evaluation" when the $\times$ was encountered.
For semantics, let's return to the old SQML models $\langle \mathcal{W}, \mathcal{D}, \mathcal{I} \rangle$ (without
designated worlds). Denotation is defined as before. But let's change the
valuation function: it will now assign truth values to formulas relative to pairs of
possible worlds, rather than relative to single worlds (hence: "two-dimensional
semantics"). So we'll write $V^2_{\mathcal{M},g}(\phi, w_1, w_2)$ rather than $V_{\mathcal{M},g}(\phi, w)$. The
second world, $w_2$, plays the same role that the sole world $w$ played before; call
it the world of evaluation. The first world, $w_1$, is new; call it the reference
world. Think of it as a temporary actual world: it is the world that is picked
out by @, and it can be changed by $\times$. Thus, $V^2_{\mathcal{M},g}(\phi, w_1, w_2) = 1$ will mean that
$\phi$ is true at world $w_2$, when $w_1$ is treated as the actual world.
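Definition of valuation: The two-dimensional valuation function, $V^2_{\mathcal{M},g}$, for
an SQML-model $\mathcal{M}$ ($= \langle \mathcal{W}, \mathcal{D}, \mathcal{I} \rangle$) is defined as the three-place function that
assigns to each wff, relative to each pair of worlds, either 0 or 1 subject to the
following constraints, for any $n$-place predicate $\Pi$, terms $\alpha_1 \dots \alpha_n$, wffs $\phi$ and
$\psi$, and variable $\alpha$:
$V^2_{\mathcal{M},g}(\Pi\alpha_1 \dots \alpha_n, v, w) = 1$ iff $\langle [\alpha_1]_{\mathcal{M},g}, \dots, [\alpha_n]_{\mathcal{M},g}, w \rangle \in \mathcal{I}(\Pi)$
$V^2_{\mathcal{M},g}({\sim}\phi, v, w) = 1$ iff $V^2_{\mathcal{M},g}(\phi, v, w) = 0$
$V^2_{\mathcal{M},g}(\phi \to \psi, v, w) = 1$ iff $V^2_{\mathcal{M},g}(\phi, v, w) = 0$ or $V^2_{\mathcal{M},g}(\psi, v, w) = 1$
$V^2_{\mathcal{M},g}(\forall\alpha\phi, v, w) = 1$ iff for all $u \in \mathcal{D}$, $V^2_{\mathcal{M},g^\alpha_u}(\phi, v, w) = 1$
$V^2_{\mathcal{M},g}(\Box\phi, v, w) = 1$ iff for all $w' \in \mathcal{W}$, $V^2_{\mathcal{M},g}(\phi, v, w') = 1$
$V^2_{\mathcal{M},g}(@\phi, v, w) = 1$ iff $V^2_{\mathcal{M},g}(\phi, v, v) = 1$
$V^2_{\mathcal{M},g}(\times\phi, v, w) = 1$ iff $V^2_{\mathcal{M},g}(\phi, w, w) = 1$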
Note the final clause. $\times$ says to forget about the old reference world, and let
the new reference world be the current world of evaluation. As for validity and
semantic consequence, our official definitions will be the following:
$\phi$ is 2D-valid ("$\vDash_{2D} \phi$") iff for every model $\mathcal{M}$, every variable assignment
$g$ for $\mathcal{M}$, and every world $w$ in $\mathcal{M}$, $V^2_{\mathcal{M},g}(\phi, w, w) = 1$
$\phi$ is a 2D-semantic consequence of $\Gamma$ ("$\Gamma \vDash_{2D} \phi$") iff for every model
$\mathcal{M}$, every variable assignment $g$ for $\mathcal{M}$, and every world $w$ in $\mathcal{M}$, if
$V^2_{\mathcal{M},g}(\gamma, w, w) = 1$ for each $\gamma \in \Gamma$, then $V^2_{\mathcal{M},g}(\phi, w, w) = 1$
These define validity as truth in every pair of worlds of the form $\langle w, w \rangle$, and
semantic consequence as truth-preservation at every such pair. But these aren't
the only notions of validity and consequence that one could introduce. There
are also the notions of truth and truth-preservation at every pair of worlds:[3]
Definitions of general 2D validity and semantic consequence:
$\phi$ is generally 2D-valid ("$\vDash_{G2D} \phi$") iff for every model $\mathcal{M}$, every variable
assignment $g$ for $\mathcal{M}$, and any worlds $v$ and $w$ in $\mathcal{M}$, $V^2_{\mathcal{M},g}(\phi, v, w) = 1$
$\phi$ is a general 2D-semantic consequence of $\Gamma$ ("$\Gamma \vDash_{G2D} \phi$") iff for every
model $\mathcal{M}$, every variable assignment $g$ for $\mathcal{M}$, and any worlds $v$ and $w$
in $\mathcal{M}$, if $V^2_{\mathcal{M},g}(\gamma, v, w) = 1$ for each $\gamma \in \Gamma$, then $V^2_{\mathcal{M},g}(\phi, v, w) = 1$
Validity and general validity, and consequence and general consequence, come
apart in various ways, as we'll see below.
As noted, the move to this new language lets us symbolize "It might have
been the case that, if all those then rich might all have been poor, then someone
is happy" as $\Diamond\times(\Diamond\forall x(@Rx \to Px) \to \exists x Hx)$. Moreover, the move costs us
nothing. For we can replace any sentence $\phi$ of the old language with $\times\phi$ in
the new language (i.e., we just put the $\times$ operator at the front of the sentence).[4]
For example, instead of symbolizing "It might have been that everyone who is
actually rich is poor" as $\Diamond\forall x(@Rx \to Px)$ as we did before, we can symbolize it
now as $\times\Diamond\forall x(@Rx \to Px)$.
Example 10.3: Show that if $\vDash_{2D} \phi$ then $\vDash_{2D} @\phi$. Suppose that $@\phi$ is not
valid. Then in some model and some world, $w$ (and some assignment $g$, but
I'll suppress this when it isn't relevant), $V^2(@\phi, w, w) = 0$. Thus, given the
truth condition for @, $V^2(\phi, w, w) = 0$, and so $\phi$ isn't valid.
[3] The term "general validity" is from Davies and Humberstone (); the first definition of
validity corresponds to their "real-world validity".
[4] This amounts to the same thing as the old symbolization in the following sense. Let
$\phi$ be any wff of the old language. Thus, $\phi$ may have some occurrences of @, but it has
no occurrences of $\times$. Then, for every SQML-model $\mathcal{M} = \langle \mathcal{W}, \mathcal{D}, \mathcal{I} \rangle$, and any $v, w \in \mathcal{W}$,
$V^2_{\mathcal{M},g}(\times\phi, v, w) = V_{\mathcal{M}',g}(\phi, w)$, where $\mathcal{M}'$ is the designated-world model
$\langle \mathcal{W}, w, \mathcal{D}, \mathcal{I} \rangle$.
Example 10.4: Show that every instance of $\phi \leftrightarrow @\phi$ is 2D-valid, but not
every instance of $\Box(\phi \leftrightarrow @\phi)$ is. (Moral: any proof theory for this logic had
better not include the rule of necessitation!) For the first, the truth condition
for @ ensures that for any world $w$ in any model (and any variable assignment),
$V^2(@\phi, w, w) = 1$ iff $V^2(\phi, w, w) = 1$, and so $V^2(\phi \leftrightarrow @\phi, w, w) = 1$. Thus,
$\vDash_{2D} \phi \leftrightarrow @\phi$. But here is a countermodel for $\Box(Fa \leftrightarrow @Fa)$:
$\mathcal{W} = \{c, d\}$
$\mathcal{D} = \{u\}$
$\mathcal{I}(a) = u$
$\mathcal{I}(F) = \{\langle u, c \rangle\}$
$V^2(\Box(Fa \leftrightarrow @Fa), c, c) = 0$ because $V^2(Fa \leftrightarrow @Fa, c, d) = 0$. For $Fa$ is true at
$\langle c, d \rangle$ iff the referent of $a$ is in the extension of $F$ at world $d$ (it isn't) whereas
$@Fa$ is true at $\langle c, d \rangle$ iff the referent of $a$ is in the extension of $F$ at world $c$ (it is).
Note that this same model shows that $\phi \leftrightarrow @\phi$ is not generally valid. General
validity is truth at all pairs of worlds, and the formula $Fa \leftrightarrow @Fa$, as we just
showed, is false at the pair $\langle c, d \rangle$.
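The truth values claimed in example 10.4 can be recomputed with a small evaluator for the two-dimensional clauses. This Python sketch is illustrative only: the tuple encoding and the names `v2`, `true_on_diagonal` and `true_at_all_pairs` are assumptions of the example, and the last two functions test a formula in one given model, which is weaker than 2D-validity or general 2D-validity (both of which quantify over all models).

```python
# Formulas are nested tuples (a hypothetical encoding for this sketch):
#   ("pred", "F", "a"), ("not", p), ("imp", p, q), ("iff", p, q),
#   ("all", "x", p), ("box", p), ("act", p) for @, ("times", p) for the x operator

def v2(f, v, w, g, model):
    """2D truth value of f at reference world v and evaluation world w."""
    worlds, domain, interp = model
    op = f[0]
    if op == "pred":
        term = f[2]
        obj = g[term] if term in g else interp[term]  # variable or name
        return (obj, w) in interp[f[1]]
    if op == "not":
        return not v2(f[1], v, w, g, model)
    if op == "imp":
        return (not v2(f[1], v, w, g, model)) or v2(f[2], v, w, g, model)
    if op == "iff":
        return v2(f[1], v, w, g, model) == v2(f[2], v, w, g, model)
    if op == "all":
        return all(v2(f[2], v, w, {**g, f[1]: u}, model) for u in domain)
    if op == "box":    # universal quantifier over the world of evaluation
        return all(v2(f[1], v, w2, g, model) for w2 in worlds)
    if op == "act":    # @: move the evaluation world to the reference world
        return v2(f[1], v, v, g, model)
    if op == "times":  # reset the reference world to the evaluation world
        return v2(f[1], w, w, g, model)
    raise ValueError(f"unknown operator: {op}")

def true_on_diagonal(f, model):
    """True at every pair <w, w> of this one model."""
    return all(v2(f, w, w, {}, model) for w in model[0])

def true_at_all_pairs(f, model):
    """True at every pair <v, w> of this one model."""
    return all(v2(f, v, w, {}, model) for v in model[0] for w in model[0])

# The countermodel of example 10.4
model = ({"c", "d"}, {"u"}, {"a": "u", "F": {("u", "c")}})
Fa = ("pred", "F", "a")
bicond = ("iff", Fa, ("act", Fa))

print(true_on_diagonal(bicond, model))           # True
print(v2(("box", bicond), "c", "c", {}, model))  # False
print(true_at_all_pairs(bicond, model))          # False
```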
Exercise 10.2 Demonstrate the following facts:
a) For any wff $\phi$, $\vDash_{2D} \phi \to \Box@\phi$
b) $\vDash_{2D} \Box\times\forall x\Diamond@Fx \to \Box\forall x Fx$
10.3 Fixedly
The two-dimensional approach to possible-worlds semantics (evaluating formulas
at pairs of worlds rather than single worlds) raises an intriguing possibility.
The $\Box$ is a universal quantifier over the world of evaluation; we might,
by analogy, follow Davies and Humberstone () and introduce an operator
that is a universal quantifier over the reference world. Davies and Humberstone
call this operator "fixedly". We'll symbolize "fixedly, $\phi$" as "$\mathcal{F}\phi$". Grammatically,
$\mathcal{F}\phi$ is a wff whenever $\phi$ is; its semantic clause is this:[5]
$V^2_{\mathcal{M},g}(\mathcal{F}\phi, v, w) = 1$ iff for every $v' \in \mathcal{W}$, $V^2_{\mathcal{M},g}(\phi, v', w) = 1$
The other two-dimensional semantic definitions, including the definitions of
validity and semantic consequence, remain the same.
[5] Humberstone and Davies use designated-world QML models rather than two-dimensional
semantics (and they don't include $\times$). Their truth condition for $\mathcal{F}$ is this: $V_{\mathcal{M},g}(\mathcal{F}\phi, w) = 1$ iff
$V_{\mathcal{M}',g}(\phi, w) = 1$ for every model $\mathcal{M}'$ that is just like $\mathcal{M}$ except perhaps containing a different
designated world. This approach isn't significantly different from the two-dimensional one.
Humberstone and Davies point out that given $\mathcal{F}$, @, and $\Box$, we can introduce
two new operators: $\mathcal{F}@$ and $\mathcal{F}\Box$. It's easy to show that:
$V^2_{\mathcal{M},g}(\mathcal{F}@\phi, v, w) = 1$ iff for every $v' \in \mathcal{W}$, $V^2_{\mathcal{M},g}(\phi, v', v') = 1$
$V^2_{\mathcal{M},g}(\mathcal{F}\Box\phi, v, w) = 1$ iff for every $v', w' \in \mathcal{W}$, $V^2_{\mathcal{M},g}(\phi, v', w') = 1$
Thus, we can think of $\mathcal{F}@$ and $\mathcal{F}\Box$, as well as $\Box$ and $\mathcal{F}$ themselves, as expressing
kinds of necessities, since their truth conditions introduce universal quantifiers
over worlds of evaluation and reference worlds. (What about $\Box\mathcal{F}$? It's easy to
show that $\Box\mathcal{F}\phi$ is equivalent to $\mathcal{F}\Box\phi$.)
As with the semantics of the previous section, validity and general validity
do not always coincide, as the following example shows.
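Example 10.5: $\vDash_{2D} \mathcal{F}@\phi \to \phi$ for each wff $\phi$ (exercise 10.3). But
some instances of this wff fail to be generally valid, for example:
$\mathcal{F}@(@Ga \leftrightarrow Ga) \to (@Ga \leftrightarrow Ga)$
General validity requires truth at all pairs $\langle v, w \rangle$ in all models. But in the
following model, $V^2(\mathcal{F}@(@Ga \leftrightarrow Ga) \to (@Ga \leftrightarrow Ga), c, d) = 0$:
$\mathcal{W} = \{c, d\}$
$\mathcal{D} = \{u\}$
$\mathcal{I}(a) = u$
$\mathcal{I}(G) = \{\langle u, c \rangle\}$
In this model, the referent of $a$ is in the extension of $G$ in world $c$, but not
in world $d$. That means that $@Ga$ is true at $\langle c, d \rangle$ whereas $Ga$ is false at $\langle c, d \rangle$,
and so $@Ga \leftrightarrow Ga$ is false at $\langle c, d \rangle$. But $\mathcal{F}@\phi$ means that $\phi$ is true at all pairs
of the form $\langle v, v \rangle$, and the formula $@Ga \leftrightarrow Ga$ is true at any such pair (in any
model). Thus, $\mathcal{F}@(@Ga \leftrightarrow Ga)$ is true at $\langle c, d \rangle$ in this model, and so the
conditional as a whole is false at $\langle c, d \rangle$.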
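The computation in example 10.5 can be replayed in a few lines. In this Python sketch, which is an illustration under stated assumptions rather than anything from Davies and Humberstone, a formula is represented directly by a function from a (reference world, evaluation world) pair to a truth value, and each operator is a function on such functions. The names `act`, `box`, `fixedly`, `iff` and `imp` are hypothetical, and the model is hard-coded.

```python
# The model of example 10.5 (hard-coded for this sketch)
WORLDS = ["c", "d"]
G_EXT = {("u", "c")}   # I(G): u is G at c only
A_REF = "u"            # I(a)

def Ga(v, w):
    """Atomic formula Ga: only the evaluation world w matters."""
    return (A_REF, w) in G_EXT

def act(p):            # @: evaluate at the reference world
    return lambda v, w: p(v, v)

def box(p):            # universal quantifier over evaluation worlds
    return lambda v, w: all(p(v, w2) for w2 in WORLDS)

def fixedly(p):        # universal quantifier over reference worlds
    return lambda v, w: all(p(v2, w) for v2 in WORLDS)

def iff(p, q):
    return lambda v, w: p(v, w) == q(v, w)

def imp(p, q):
    return lambda v, w: (not p(v, w)) or q(v, w)

inner = iff(act(Ga), Ga)                   # @Ga <-> Ga
example = imp(fixedly(act(inner)), inner)  # F@(@Ga <-> Ga) -> (@Ga <-> Ga)

print(example("c", "d"))                   # False: fails at the pair <c, d>
print(all(example(w, w) for w in WORLDS))  # True: holds at every pair <w, w>
```

The same combinators give a quick sanity check, in this one model, of the remark that box-fixedly and fixedly-box agree: `box(fixedly(inner))` and `fixedly(box(inner))` return the same value at every pair.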
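Exercise 10.3 Show that $\vDash_{2D} \mathcal{F}@\phi \to \phi$, for each wff $\phi$.
Exercise 10.4 Show that for some $\phi$, $\nvDash_{2D} \phi \to \mathcal{F}\phi$.
Exercise 10.5** Show that if $\phi$ has no occurrences of @, then
$\vDash_{2D} \phi \to \mathcal{F}\phi$.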
10.4 Necessity and a priority
The two-dimensional modal framework has been put to significant philosophical
use in the past thirty or so years.[6] This is not the place for an extended
survey; rather, I will briefly present the two-dimensional approach to just one
philosophical issue: the relationship between necessity and a priority.
In Naming and Necessity, Saul Kripke famously presented putative examples
of necessary a posteriori statements and of contingent a priori statements:
Hesperus = Phosphorus
B (the standard meter bar) is one meter long
The first statement, Kripke argued, is necessary because whenever we try to
imagine a possible world in which Hesperus is not Phosphorus, we find that we
have merely imagined a world in which "Hesperus" and "Phosphorus" denote
different objects than they in fact denote. Given that Hesperus and Phosphorus
are in fact one and the same entity (namely, the planet Venus), there is no
possible world in which Hesperus is different from Phosphorus, for such a
world would have to be a world in which Venus is distinct from itself. Thus, the
statement is necessary. But it's a posteriori. It took astronomical investigation
to learn that Hesperus and Phosphorus were identical; no amount of pure
rational reflection would have sufficed. And the second sentence is a priori,
according to Kripke, because anyone possessing the semantic knowledge that
the description "the length of bar B" fixes the reference of "one meter" can
know that it's true. Nevertheless, he argues, the second sentence is contingent:
bar B does not have its length essentially, and thus could have been longer or
shorter than one meter.
But these conclusions are quite surprising. How can a statement that is true
in all possible worlds be in principle resistant to a priori investigation? Worse,
how can a statement that might have been false be known a priori?
Some think that the two-dimensional framework sheds light on all this.
Let's consider the contingent a priori first. Consider the following notion:
Definition of superficial contingency: $\phi$ is superficially contingent in model
$\mathcal{M}$ at world $w$ iff, for every variable assignment $g$ for $\mathcal{M}$, $V^2_{\mathcal{M},g}(\Box\phi, w, w) = 0$
and $V^2_{\mathcal{M},g}(\Box{\sim}\phi, w, w) = 0$.
Intuitively: if you were sitting in $w$ you could say truly: $\Diamond\phi \land \Diamond{\sim}\phi$.
[6] For work in this tradition, see Stalnaker (, a, ); Evans (); Davies and
Humberstone (); Hirsch (); Chalmers (, ); Jackson (); see Soames ()
for an extended critique.
Superficial contingency is one way to formalize the notion of contingency.
How should we formalize the notion of a priority? As a rough and ready guide,
let's think of a sentence as being a priori iff it is 2D-valid, i.e., true at every pair
$\langle w, w \rangle$ of every model. In defense of this guide: we can think of the truth value
of an utterance of a sentence as being the truth value of that sentence at the
pair $\langle w, w \rangle$ in a model that accurately models the genuine possibilities, and in
which $w$ accurately models the world of the speaker. So any 2D-valid sentence
is invariably true whenever uttered; hence, if $\phi$ is 2D-valid, any speaker who
understands the logic of her language is in a position to know that an utterance
of $\phi$ would be true.
These definitions allow sentences to be superficially contingent but nevertheless
a priori (2D-valid). For example, $Fa \leftrightarrow @Fa$ is superficially contingent
in any world of any model where $Fa$ is true in some worlds and false in others,
but it is 2D-valid (example 10.4). One can also give examples that are similar
in spirit both to Kripke's example of the meter bar, and to a related example
due to Gareth Evans (). Consider these sentences:
Bar B is one meter
Julius invented the zip
Bar B is the standard meter bar. "One meter" and "Julius" are supposed to
be "descriptive names": rigid designators whose references are fixed by the
descriptions "the length of bar B" and "the inventor of the zip", respectively.
Now, whether or not these English sentences are indeed contingent and a priori
depends on delicate issues in the philosophy of language concerning descriptive
names, rigid designation, and reference fixing. Rather than going into all that,
let's construct some examples that are similar to Kripke's and Evans's. Let's
stipulate that "one meter" and "Julius" are to abbreviate actualized descriptions:
"the actual length of bar B" and "the actual inventor of the zip". With a little
creative reconstruing in the first case, the sentences then have the form "the
actual G is G":
the actual length of bar B is a length of bar B
the actual inventor of the zip invented the zip
Now, these sentences are not quite a priori, since for all one knows, the G might
not exist: there might exist no unique length of bar B, no unique inventor of
the zip. So suppose we consider instead the following sentences:
If there is exactly one length of bar B, then the actual
length of bar B is a length of bar B
If there is exactly one inventor of the zip, then the actual
inventor of the zip invented the zip
Each has the form:
If there is exactly one G, then the actual G is G
Or, in symbols:
$\exists x(Gx \land \forall y(Gy \to y = x)) \to \exists x(@Gx \land \forall y(@Gy \to y = x) \land Gx)$   (*)
(*) is 2D-valid (though not generally 2D-valid), and can be superficially contingent
(exercise 10.6). So we have further examples of the contingent a priori.
Various philosophers want to concede that these sentences are contingent in
one sense, namely, in the sense of superficial contingency. But, they claim, this
is a relatively unimportant sense (hence the "superficial"; the term is Evans's).
In another sense, they're not contingent at all. Evans calls the second sense of
contingency "deep contingency", and defines it thus (, p. ):
If a deeply contingent statement is true, there will exist some state of
affairs of which we can say both that had it not existed the statement
would not have been true, and that it might not have existed.
The intended meaning of "the statement would not have been true" is that the
statement, as uttered with its actual meaning, would not have been true. The
idea is supposed to be that "Julius invented the zip" is not deeply contingent
because we can't locate the required state of affairs; in any situation in which
"Julius invented the zip" is uttered with its actual meaning, it is uttered truly.
Evans's notion of deep contingency is not perfectly transparent. But as
Davies and Humberstone () point out, we can give a clear definition using
the two-dimensional modal framework:
Definition of deep contingency: $\phi$ is deeply contingent in $\mathcal{M}$ at $w$ iff (for
all $g$) $V^2_{\mathcal{M},g}(\mathcal{F}@\phi, w, w) = 0$ and $V^2_{\mathcal{M},g}(\mathcal{F}@{\sim}\phi, w, w) = 0$.
(This is parallel to the definition of superficial contingency, but with $\mathcal{F}@$ in
place of $\Box$.) The putative examples of the contingent a priori given above
are not deeply contingent. To be sure, this definition is only as clear as the
two-dimensional notions of fixedness and actuality. The formal structure of the
two-dimensional framework is of course clear, but one can raise philosophical
questions about how that formalism is to be interpreted. But at least the
formalism provides a clear framework for the philosophical debate to occur.
As for the necessary a posteriori, let's follow our earlier strategy and take
the failure to be 2D-valid as our conception of a posteriority. And let's define a
notion of superficial necessity by analogy to superficial contingency:
Definition of superficial necessity: $\phi$ is superficially necessary in $\mathcal{M}$ at $w$
iff (for all $g$) $V^2_{\mathcal{M},g}(\Box\phi, w, w) = 1$
But here we must take a bit more care. It's a trivial matter to construct models
in which 2D-invalid sentences are necessarily true; and we don't need the
two-dimensional framework to do it. We don't want to say that "Everything is a
lawyer" is an example of the necessary a posteriori. But let $F$ symbolize "is a
lawyer"; we can construct a model in which the predicate $F$ is true of every
member of the domain at every world. $\forall x Fx$ is superficially necessary at every
world in this model, despite the fact that it is not 2D-valid. But this is too
cheap. What's wrong is that this model isn't realistic. Relative to our choice to
let $F$ symbolize "is a lawyer", the model doesn't accurately depict the modal fact
that it's simply not necessarily true that everything is a lawyer.
To provide a nontrivial formalization of the necessary a posteriori, we will
provide realistic models in which 2D-invalid sentences are necessarily true in
the world corresponding to actuality. To do so, we will first think of nonlogical
expressions of the language of QML as symbolizing certain particular expressions
of natural language. And then, we will choose a model that accurately
depicts the real modal facts, given what the nonlogical expressions symbolize.
(This notion of a "realistic model" is admittedly vague.)
Our putative necessary a posteriori sentence will be based on Kripke's
Hesperus and Phosphorus example. To avoid controversies about the semantics
of proper names in natural language, let's just stipulate that "Hesperus" is to be
short for "the actual F", and that "Phosphorus" is to be short for "the actual G",
where $F$ stands for "is a first heavenly body visible in the evening", and $G$ stands
for "is a last heavenly body visible in the morning". The sentence is then this:
If Hesperus and Phosphorus exist then they are identical; i.e.,
If the actual F and the actual G exist, they are identical; i.e.,
$[\exists x(@Fx \land \forall y(@Fy \to x = y)) \land \exists z(@Gz \land \forall y(@Gy \to z = y))] \to$
$\exists x[@Fx \land \forall y(@Fy \to x = y) \land \exists z(@Gz \land \forall y(@Gy \to z = y) \land z = x)]$   (**)
Sentence (**) isn't 2D-valid (exercise 10.7). But it is superficially necessary in the
world corresponding to actuality in any realistic model. To see why, consider
the facts. The planet Venus is in fact both the heavenly body last visible in the
morning, and also the heavenly body first visible in the evening. (Or so we may
pretend.) So any realistic model must have a part that looks as follows:
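$\mathcal{W} = \{c, \dots\}$
$\mathcal{D} = \{u, \dots\}$
$\mathcal{I}(F) = \{\langle u, c \rangle, \dots\}$
$\mathcal{I}(G) = \{\langle u, c \rangle, \dots\}$
Object $u$ corresponds to Venus, and world $c$ corresponds to the actual world
(note how $u$ is both the unique $F$ and the unique $G$ in $c$). And in any such
model, the necessitation of (**), i.e.:
$\Box([\exists x(@Fx \land \forall y(@Fy \to x = y)) \land \exists z(@Gz \land \forall y(@Gy \to z = y))] \to$
$\exists x[@Fx \land \forall y(@Fy \to x = y) \land \exists z(@Gz \land \forall y(@Gy \to z = y) \land z = x)])$
is true in $\langle c, c \rangle$ (since (**) is true in $\langle c, w \rangle$ for each world $w$: every occurrence of
$F$ and $G$ in (**) is governed by @, so only their extensions at the reference world $c$
matter). So (**) is superficially necessary in $c$ in any such model.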
Isn't it strange that (**) is both a posteriori and necessary? The two-dimensional
response is: no, it's not strange, since despite being superficially necessary,
(**) is not deeply necessary. Deep necessity is defined thus:
Definition of deep necessity: $\phi$ is deeply necessary in $\mathcal{M}$ at $w$ iff (for all $g$)
$V^2_{\mathcal{M},g}(\mathcal{F}@\phi, w, w) = 1$
To see why (**) isn't deeply necessary in the world corresponding to the actual
world of any realistic model, consider again the facts. It could have been that
Mars was the last heavenly body visible in the morning, while Venus remained
the first heavenly body visible in the evening. So in addition to the part depicted
above, any realistic model must also contain a world, $d$, corresponding to this
possibility:
$\mathcal{W} = \{c, d, \dots\}$
$\mathcal{D} = \{u, v, \dots\}$
$\mathcal{I}(F) = \{\langle u, c \rangle, \langle u, d \rangle, \dots\}$
$\mathcal{I}(G) = \{\langle u, c \rangle, \langle v, d \rangle, \dots\}$
(Note that the unique $G$ in $d$ is $v$, the object corresponding to Mars; $u$ is the
unique $F$ there; as before, $u$ is both the unique $F$ and the unique $G$ in $c$, which
continues to correspond to the actual world.) In any such model, the result of
prefixing (**) with $\mathcal{F}@$:
$\mathcal{F}@\{[\exists x(@Fx \land \forall y(@Fy \to x = y)) \land \exists z(@Gz \land \forall y(@Gy \to z = y))] \to$
$\exists x[@Fx \land \forall y(@Fy \to x = y) \land \exists z(@Gz \land \forall y(@Gy \to z = y) \land z = x)]\}$
is false at $\langle c, c \rangle$ (and indeed, at every pair of worlds), since (**) is false at $\langle d, d \rangle$.
And so, (**) is not deeply necessary in $c$ in this model.
One might try to take this two-dimensional line further, and claim that
in every case of the necessary a posteriori (or the contingent a priori), the
necessity (contingency) is merely superficial. But defending this stronger line
would require more than we have in place so far. To take one example, return
again to "Hesperus = Phosphorus", but now, instead of thinking of "Hesperus"
and "Phosphorus" as abbreviations for actualized descriptions, let us represent
them by names in the logical sense (i.e., the expressions called "names" in
the definition of well-formed formulas, which are assigned denotations by
interpretation functions in models). Thus, "Hesperus = Phosphorus" is now
represented as: $a = b$. Any realistic model will look in part as follows:
$\mathcal{W} = \{c, \dots\}$
$\mathcal{D} = \{u, \dots\}$
$\mathcal{I}(a) = u$
$\mathcal{I}(b) = u$
In any such model the sentence $a = b$ is deeply necessary (at any world in the
model). And yet it is a posteriori (2D-invalid) (exercise 10.8).
Exercise 10.6 Show that sentence (*) is valid, though not generally
valid, and is superficially contingent in some world in some model.
Exercise 10.7 Show that sentence (**) isn't 2D-valid.
Exercise 10.8 Show that $a = b$ is deeply necessary in any world of
any model in which $\mathcal{I}(a) = \mathcal{I}(b)$.
Exercise 10.9 Show that a formula is capable of being superficially
contingent (i.e., for some model and some world, it is superficially
contingent at that world) iff it fails to be generally valid.
Appendix A
Answers and Hints to Selected Exercises
Exercise 1.1a "'$P \lor {\sim}P$' is a logical truth" is a sentence of the metalanguage,
and (I would say) is false. "$P \lor {\sim}P$" contains the meaningless letter "P", so it isn't
a logical truth. Rather, it represents logical truths (assuming the law of the
excluded middle is correct! See chapter 3.)
Exercise 1.1b "$(P \lor Q) \to (Q \lor P)$" is a sentence of the object language. Since
it contains meaningless expressions ("P", "Q"), it isn't true. (Not that it's false!)
Exercise 1.1c This is a bit of a trick question. "'Frank and Joe are brothers'
logically implies 'Frank and Joe are siblings'" is a sentence of English, which is
talking about further sentences of English. So English is functioning here both
as the object language and as the metalanguage. As for whether the sentence is
true, I would say no, since the implication is not "formal".
Exercise 1.2a "Attorney and lawyer are synonyms" confuses use and mention;
inserting quotation marks thus fixes the problem:
"Attorney" and "lawyer" are synonyms.
Exercise 1.2b How can we insert quotation marks to remove the use-mention
confusion in "If $S_1$ is an English sentence and $S_2$ is another English sentence,
then the string $S_1$ and $S_2$ is also an English sentence"? This is again a bit of a
trick question. You might think to do it this way:
If $S_1$ is an English sentence and $S_2$ is another English
sentence, then the string "$S_1$ and $S_2$" is also an English
sentence.
But this isn't right. It makes the (false) claim that the string of letters "$S_1$ and
$S_2$" (a string that contains the variables "$S_1$" and "$S_2$") is an English sentence,
whereas the intention of the original sentence was to say that strings like "Snow
is white and grass is green" and "Roses are red and violets are blue" are English
sentences. Really, what we want is something like this:
If $S_1$ is an English sentence and $S_2$ is another English
sentence, then the string consisting of $S_1$, followed by
"and", followed by $S_2$, is also an English sentence.
Quine (, ) invented a device for saying such things more concisely. In
his notation, we could write instead:
If $S_1$ is an English sentence and $S_2$ is another English
sentence, then ⌜$S_1$ and $S_2$⌝ is also an English sentence.
His "corner quotes", "⌜" and "⌝", work like regular quotation marks, except when
it comes to variables of the metalanguage such as "$S_1$" and "$S_2$". Expressions other
than such variables simply refer to themselves within corner quotes, just as all
expressions do within regular quotation marks. But metalanguage variables
refer to their values (i.e., the linguistic expressions they stand for) rather than
themselves, within Quine's corner quotes. Thus,
⌜$S_1$ and $S_2$⌝
means the same as:
the string consisting of $S_1$, followed by "and", followed
by $S_2$
Exercise 1.3 Let sentence $S_1$ be "There exists an x such that x and x are
identical", and let $S_2$ be "There exists an x such that there exists a y such that x
and y are not identical".
Does $S_1$ logically imply $S_2$ according to the modal criterion? Well, that
depends. It depends on what is possible. You might think that there could have
existed only a single thing, in which case $S_1$ would be true and $S_2$ would be
false. If this is indeed possible, then $S_1$ doesn't logically imply $S_2$ (given the
modal criterion). But some people think that numbers exist necessarily, and in
particular that it's necessarily true that the numbers 0 and 1 exist and are not
identical. If this is correct, then it wouldn't be possible for $S_1$ to be true while
$S_2$ is false (since it wouldn't be possible for $S_2$ to be false). And so, $S_1$ would
logically imply $S_2$, given the modal criterion.
How about according to Quine's criterion? Again, it depends, in this case
on which expressions are logical expressions. If (as is commonly supposed)
"there exists an x such that", "there exists a y such that", "not", and "are identical"
are all logical expressions, then all expressions in $S_1$ and $S_2$ are logical expressions.
So, since each sentence is in fact true, there's no way to substitute nonlogical
expressions to make $S_1$ true and $S_2$ false. So $S_1$ logically implies $S_2$ (according to
Quine's criterion). But suppose "are identical" is not a logical expression. Then
$S_1$ would not logically imply $S_2$, according to Quine's criterion. For consider
the result of substituting the predicate "are both existent" for "are identical".
$S_1$ then becomes true: "There exists an x such that x and x are both existent",
whereas $S_2$ becomes false: "There exists an x such that there exists a y such that
x and y are not both existent".
Exercise 1.4 Here is the definition of the powerset of $A$: $\{u : u \subseteq A\}$. The
powerset of $\{2, 4, 6\}$ is $\{\emptyset, \{2\}, \{4\}, \{6\}, \{2, 4\}, \{2, 6\}, \{4, 6\}, \{2, 4, 6\}\}$. Notice that
the powerset of a set always contains both the null set and the set itself (look at
the definition of "subset" to see why this is so).
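For finite sets the definition can be computed directly. The sketch below is a minimal illustration; the function name `powerset` and the use of `frozenset` (so that sets can be members of a set) are choices made for this example.

```python
from itertools import combinations

def powerset(s):
    """Return the set of all subsets of the finite set s."""
    items = list(s)
    return {
        frozenset(combo)
        for size in range(len(items) + 1)   # subsets of size 0, 1, ..., len(s)
        for combo in combinations(items, size)
    }

subsets = powerset({2, 4, 6})
print(len(subsets))                  # 8
print(frozenset() in subsets)        # True: the null set is a member
print(frozenset({2, 4, 6}) in subsets)  # True: the set itself is a member
```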
Exercise 1.5 $\mathbb{N}$ and $\mathbb{Z}$ are equinumerous, because of the following function
$f$: $f(0) = 0$, $f(1) = 1$, $f(2) = -1$, $f(3) = 2$, $f(4) = -2$, $f(5) = 3$, $f(6) = -3$, ....
This function can be defined more rigorously as follows:
$f(n) = \begin{cases} -\frac{n}{2} & \text{if } n \text{ is even} \\ \frac{n+1}{2} & \text{if } n \text{ is odd} \end{cases}$   (for any $n \in \mathbb{N}$)
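The definition is easy to test on an initial segment of the natural numbers. The following sketch is only a finite spot-check, not a proof of bijectivity: it confirms the first few listed values, and that the numbers 0 through 200 are mapped one-to-one onto the integers from -100 through 100.

```python
def f(n):
    """Map a natural number to an integer: evens go to 0, -1, -2, ...; odds to 1, 2, 3, ..."""
    if n % 2 == 0:
        return -(n // 2)
    return (n + 1) // 2

first_values = [f(n) for n in range(7)]
print(first_values)  # [0, 1, -1, 2, -2, 3, -3]

# On the initial segment 0..200 the values are exactly the integers -100..100,
# each hit once.
image = sorted(f(n) for n in range(201))
print(image == list(range(-100, 101)))  # True
```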
Exercise 2.8 Hint: instead of trying to show directly that every wff without
repetition of sentence letters has the feature of PL-invalidity, find some feature
F that is stronger than PL-invalidity (i.e., some feature F from which
PL-invalidity follows), and show by induction that every wff without repeated
sentence letters has this feature F; and then, finally, conclude that every wff
without repeated sentence letters is PL-invalid.
Exercise 2.10 Hint: call a sequent $\Gamma \Rightarrow \phi$ valid iff $\Gamma \vDash \phi$; prove by induction
that every provable sequent is a valid sequent.
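Exercise 3.7 We're to show that there are no valid formulas in Kleene's
system. Consider the trivalent interpretation $\mathcal{I}$ that assigns # to every sentence
letter. If there existed any Kleene-valid formula $\phi$ then $KV_{\mathcal{I}}(\phi)$ would need to
be 1, whereas we can show by induction that $KV_{\mathcal{I}}(\phi) = \#$ for every wff $\phi$. Base
case: all the sentence letters are obviously # in $\mathcal{I}$. Inductive step: assume that
$\phi$ and $\psi$ are both # in $\mathcal{I}$. We need now to show that $\phi \land \psi$, $\phi \lor \psi$, and $\phi \to \psi$
are all # in $\mathcal{I}$. But that's easy: just look at the truth tables for $\land$, $\lor$ and $\to$. $\# \land \#$
is #, $\# \lor \#$ is #, and $\# \to \#$ is #. (The case of negation is equally immediate:
${\sim}\#$ is #.)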
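The inductive argument can be spot-checked by brute force on small formulas. The sketch below makes several assumptions that are not in the text: the value # is encoded as the number 0.5, Kleene's tables are computed as min, max and "1 minus" (which agrees with the tables on the values 1, 0 and #), and formulas are nested tuples. It enumerates every formula of nesting depth at most 2 over two sentence letters and confirms that each receives # when every letter does.

```python
from itertools import product

HALF = 0.5  # stands in for '#' (an encoding chosen for this sketch)

def kv(f, interp):
    """Kleene valuation of formula f under a trivalent interpretation."""
    if isinstance(f, str):           # sentence letter
        return interp[f]
    op = f[0]
    if op == "not":
        return 1 - kv(f[1], interp)
    a, b = kv(f[1], interp), kv(f[2], interp)
    if op == "and":
        return min(a, b)
    if op == "or":
        return max(a, b)
    if op == "imp":                  # Kleene's conditional: # -> # is #
        return max(1 - a, b)
    raise ValueError(f"unknown connective: {op}")

def formulas(letters, depth):
    """All formulas over the given letters with nesting depth <= depth."""
    if depth == 0:
        return list(letters)
    smaller = formulas(letters, depth - 1)
    out = list(smaller)
    out += [("not", f) for f in smaller]
    out += [(op, f, g) for op in ("and", "or", "imp")
            for f, g in product(smaller, repeat=2)]
    return out

all_hash = {"P": HALF, "Q": HALF}    # the interpretation assigning # everywhere
candidates = formulas(("P", "Q"), 2)
print(all(kv(f, all_hash) == HALF for f in candidates))  # True
```

Since a Kleene-valid formula must receive 1 in every trivalent interpretation, none of the enumerated formulas is Kleene-valid; the induction in the answer extends this to all wffs.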
Exercise 3.8 Hint: use induction.
Exercise 3.11 Hint: in each direction, prove the contrapositive. Exercise 3.8
might come in handy.
Exercise 3.15 Hint: this isn't hard, but it's a bit tricky. It might help to
note that every classical (bivalent) interpretation also counts as a trivalent
interpretation, with itself as its only precisification.
Exercise 3.16 We're to argue that contraposition and reductio should fail,
given a supervaluational semantics for $\triangle$ (assuming the identification of truth
with truth-in-all-sharpenings). Contraposition: as argued in the text, for all $\phi$,
$\phi$ logically implies "definitely, $\phi$". So "Middling Mary is rich" logically implies
"Middling Mary is definitely rich". But "not: definitely, Middling Mary is rich"
doesn't logically imply "not: Middling Mary is rich", since if Mary is a (definite)
borderline case of being rich, the first is true on all sharpenings and hence is
true, while the second is false under some sharpenings and so is not true. So to
model these results, it should turn out under the supervaluationist semantics
that $P \vDash \triangle P$ but ${\sim}\triangle P \nvDash {\sim}P$.
As for reductio, "Mary is rich and Mary is not definitely rich" cannot be
true (on logical grounds), and so vacuously implies anything at all. (If it were
true, then it would be true on all sharpenings; but then "Mary is rich" would be
true on all sharpenings; but then "Mary is not definitely rich" would be false.)
So in particular, it logically implies both "Snow is white" and "Snow is not white"
(say). But, contrary to reductio, "not: Mary is rich and Mary is not definitely
rich" is not a logical truth, since it isn't true. For there are sharpenings in which
both "Mary is rich" and "Mary is not definitely rich" are true.
Exercise 3.17 For the systems of Łukasiewicz, Kleene, and Priest, we are to
find intuitionistically provable sequents whose premises do not semantically
imply their conclusions. Let's begin with Kleene's system. We showed in
exercise 3.7 that there are no Kleene-valid wffs. Thus, ⊭_K P→P. But the
following is an intuitionistically acceptable proof of the sequent ∅ ⇒ P→P:
1. P ⇒ P          RA (for conditional proof)
2. ∅ ⇒ P→P        1, →I
Next, ŁV_ℐ(∼(P∧∼P)) = # for any trivalent assignment ℐ in which P is #,
so ∼(P∧∼P) is Łukasiewicz-invalid. But ∅ ⇒ ∼(P∧∼P) is intuitionistically
provable:
1. P∧∼P ⇒ P∧∼P    RA (for reductio)
2. ∅ ⇒ ∼(P∧∼P)    1, RAA
(Since ∼(P∧∼P) is also Kleene-invalid, we could just as well have used this
example for that system as well.) Finally, P, P→Q ⊭_LP Q (exercise 3.10d),
whereas P, P→Q ⇒ Q is intuitionistically provable:
1. P ⇒ P          RA
2. P→Q ⇒ P→Q      RA
3. P, P→Q ⇒ Q     1, 2, →E
Exercise 4.1 Hint: first prove by induction that for any wff φ, perhaps with
free variables, and model ℳ, if variable assignments g and h agree on all
variables with free occurrences in φ, then V_ℳ,g(φ) = V_ℳ,h(φ), and then use
this fact to establish the desired result.
Exercise 4.3d Hint: the premise has a free variable. Look carefully at the
definition of semantic consequence to see how to accommodate this.
Exercise 4.5 We're to show that the set Γ = {φ, ∃_2 xFx, ∃_3 xFx . . . } would
violate compactness, where by hypothesis, i) for each n, the sentence ∃_n xFx is
true in a model iff the extension of F in that model has at least n members; and
ii) the sentence φ is true in a given model iff the extension of F in that model
is finite.
Γ is unsatisfiable. For suppose for reductio that each member of Γ were true
in some model ℳ. Since φ ∈ Γ, φ is true in ℳ, and so by ii), the extension of F
in ℳ has some finite number, k, of members. But ∃_{k+1} xFx is also a member of Γ,
and so by i), the extension of F in ℳ would have to have at least k+1 members.
Since Γ is unsatisfiable, compactness tells us that it has some finite
unsatisfiable subset Γ_0. But that is impossible. Since Γ_0 is finite, there's a limit to
how many sentences of the form ∃_n xFx are in it. Let k be the largest such n.
So: every member of Γ_0 is either a) φ, or is b) ∃_n xFx for some n ≤ k. Now let
ℳ be some model in which the extension of F has k members. By i) and ii),
every sentence of type a) or b) is true in ℳ, so every member of Γ_0 is true in
ℳ. Contradiction.
Exercise 5.5a Hint: it's easy to get confused by the complexity of the
antecedent here, ∀xLxιyFxy. This just has the form: ∀xLxα, where α is
ιyFxy. L is a two-place predicate; it applies to the terms x and α. If you think
of Fxy as meaning that x is a father of y, and Lxy as meaning that x loves
y, then ∀xLxιyFxy means "everyone x loves the y that he (x) is the father of".
Exercise 5.6 We must show that for any model ⟨𝒟, ℐ⟩, and any variable
assignment g, [α]_g (relative to this model) is either undefined or a member of
𝒟. We'll do this by induction on the grammar of α. So, we'll show that the
result holds when α is a variable, constant, or ι term (base cases), and then show
that, assuming the result holds for simpler terms (inductive hypothesis), it also
holds for complex terms made up of the simpler terms using a function symbol.
Base cases. If α is a variable then [α]_g is g(α), which is a member of 𝒟
given the definition of a variable assignment. If α is a constant then [α]_g is
ℐ(α), which is a member of 𝒟 given the definition of a model's interpretation
function. If α has the form ιβφ then [α]_g is either the unique u ∈ 𝒟 such that
V_{g^β_u}(φ) = 1 (if there is such a u) or undefined (if there isn't). So in all three
cases, [α]_g is either undefined or a member of 𝒟. (Note that even though
ι terms are syntactically complex, we treated them here as a base case of our
inductive proof. That's because we had no need for any inductive hypothesis;
we could simply show directly that the result holds for all ι terms.)
Next we assume the inductive hypothesis (ih): the denotations of terms
α_1 . . . α_n are either undefined or members of 𝒟; and we must show that the
same goes for the complex term f(α_1 . . . α_n). There are two cases; in each case
we'll show that [f(α_1 . . . α_n)]_g is either undefined or a member of 𝒟.
Case 1: at least one of [α_1]_g . . . [α_n]_g is undefined. Then [f(α_1 . . . α_n)]_g
is undefined.
Case 2: all of [α_1]_g . . . [α_n]_g are defined. Then [f(α_1 . . . α_n)]_g is defined as
ℐ(f)([α_1]_g . . . [α_n]_g). Moreover, the (ih) tells us that each of [α_1]_g . . . [α_n]_g is
a member of 𝒟. And we know from the definition of a model that ℐ(f) is a
total function defined on 𝒟. So ℐ(f)([α_1]_g . . . [α_n]_g) is a member of 𝒟.
Exercise 6.1 Hint: the only hard part is showing that if ⊨_O φ then ⊨_S5 φ.
Suppose ⊨_O φ and let ℳ = ⟨𝒲, ℛ, ℐ⟩ be any S5-model; we must show
that V_ℳ(φ, w) = 1 for each w ∈ 𝒲. Now, it's a fact from set theory that
any equivalence relation R over set A "partitions" A: it divides A into
non-overlapping subsets where: i) each element of A is in exactly one of the subsets,
and ii) every member of every subset bears R to every member of that subset. So
ℛ partitions 𝒲 in this way. Let 𝒲_w be the subset containing w, and consider
the model ℳ′ that results from ℳ by cutting away all worlds other than those
in 𝒲_w. ℳ′ is a total model, so φ is valid in it, so V_ℳ′(φ, w) = 1. But then
V_ℳ(φ, w) = 1, as well. Why? You can prove by induction that the truth value
of any wff at any world v in ℳ is determined by the truth values of sentence
letters within v's subset. (Intuitively: chains of modal operators take you to
worlds seen by v, worlds seen by worlds seen by v, and so on; you'll never need
to look at worlds outside of v's subset.)
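The set-theoretic fact used in the hint to exercise 6.1 (an equivalence relation partitions its field) can be illustrated concretely. The following is a minimal sketch: the function name `partition`, the world names, and the sample relation are all assumptions made for the example, and the relation is represented as a set of ordered pairs.

```python
def partition(elements, relation):
    """Group elements into cells of mutually related members.

    `relation` is a set of ordered pairs and is assumed to be an
    equivalence relation over `elements`.
    """
    cells = []
    for x in elements:
        for cell in cells:
            representative = next(iter(cell))
            if (x, representative) in relation:
                cell.add(x)
                break
        else:
            # x is related to no existing cell, so it starts a new one
            cells.append({x})
    return cells


worlds = ["w1", "w2", "w3", "w4"]
relation = {(w, w) for w in worlds} | {
    ("w1", "w2"), ("w2", "w1"),
    ("w3", "w4"), ("w4", "w3"),
}

cells = partition(worlds, relation)
print(cells)

# Condition ii) from the hint: within a cell, everything bears R to everything.
total_within_cells = all((x, y) in relation
                         for cell in cells for x in cell for y in cell)
print(total_within_cells)
```

Restricting a model to a single cell is what yields the total model ℳ′ of the hint: inside a cell, the accessibility relation holds between every pair of worlds.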
Exercise 6.3a □[P→◇(Q→R)]→◇[Q→(□P→◇R)]:
[Countermodel diagram: r sees a, a sees b, and b sees itself. At r the formula is 0, with □[P→◇(Q→R)] = 1 and ◇[Q→(□P→◇R)] = 0. At a, P = 0 and P→◇(Q→R) = 1, while Q = 1, □P = 1, ◇R = 0, and Q→(□P→◇R) = 0. At b, P = 1 and R = 0.]
D-countermodel:
𝒲 = {r, a, b}
ℛ = {⟨r, a⟩, ⟨a, b⟩, ⟨b, b⟩}
ℐ(Q, a) = ℐ(P, b) = 1, all else 0
(also establishes K-invalidity)
T-validity proof (also establishes validity in B, S4, and S5):
i) Suppose for reductio that the formula is false in some world r in some
T-model ⟨𝒲, ℛ, ℐ⟩. Then V(□[P→◇(Q→R)], r) = 1, and
ii) V(◇[Q→(□P→◇R)], r) = 0.
iii) By reflexivity, ℛrr, so by ii), V(Q→(□P→◇R), r) = 0. So
V(□P→◇R, r) = 0. Thus, V(□P, r) = 1 and so V(P, r) = 1; also
iv) V(◇R, r) = 0.
v) From i), given ℛrr, V(P→◇(Q→R), r) = 1, and so, given iii),
V(◇(Q→R), r) = 1. So for some world a, ℛra and V(Q→R, a) = 1.
vi) Since ℛra, from ii) we have V(Q→(□P→◇R), a) = 0, and so V(Q, a) = 1;
and from iv) we have V(R, a) = 0. These contradict line v).
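The D-countermodel for exercise 6.3a can be checked mechanically with a small truth-at-a-world evaluator. This is a sketch under stated assumptions: formulas are nested tuples, the operator tags "not", "if", "box", and "dia" are invented for the example, and the model is read as having worlds r, a, b, with r seeing a, a seeing b, b seeing itself, Q true only at a, and P true only at b.

```python
def holds(formula, world, worlds, access, true_at):
    """Truth value of an MPL formula at a world of a model.

    formula: a sentence letter (str), or ("not", A), ("if", A, B),
             ("box", A), ("dia", A)
    access:  set of (w, v) pairs, meaning w sees v
    true_at: set of (letter, world) pairs assigned 1; all else is 0
    """
    if isinstance(formula, str):
        return (formula, world) in true_at
    op = formula[0]
    if op == "not":
        return not holds(formula[1], world, worlds, access, true_at)
    if op == "if":
        return (not holds(formula[1], world, worlds, access, true_at)
                or holds(formula[2], world, worlds, access, true_at))
    seen = [v for v in worlds if (world, v) in access]
    if op == "box":
        return all(holds(formula[1], v, worlds, access, true_at) for v in seen)
    if op == "dia":
        return any(holds(formula[1], v, worlds, access, true_at) for v in seen)
    raise ValueError(f"unknown operator: {op}")


worlds = ["r", "a", "b"]
access = {("r", "a"), ("a", "b"), ("b", "b")}
true_at = {("Q", "a"), ("P", "b")}

antecedent = ("box", ("if", "P", ("dia", ("if", "Q", "R"))))
consequent = ("dia", ("if", "Q", ("if", ("box", "P"), ("dia", "R"))))
formula = ("if", antecedent, consequent)

# A D-model needs a serial accessibility relation: every world sees some world.
serial = all(any((w, v) in access for v in worlds) for w in worlds)

print(serial, holds(formula, "r", worlds, access, true_at))
```

The relation is serial but not reflexive (r does not see itself), which fits the fact that the formula is D-invalid yet T-valid.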
Exercise 6.3d □(P↔Q)→□(□P↔□Q):
[Countermodel diagram: every world sees itself, r and a see each other, and a and b see each other. At r, P = Q = 1, □(P↔Q) = 1, and □(□P↔□Q) = 0, so the formula is 0. At a, P = Q = 1 and P↔Q = 1, while □P = 1, □Q = 0, and □P↔□Q = 0. At b, Q = 0 and P = 1.]
B-countermodel:
𝒲 = {r, a, b}
ℛ = {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨r, a⟩, ⟨a, r⟩, ⟨a, b⟩, ⟨b, a⟩}
ℐ(P, r) = ℐ(Q, r) = ℐ(P, a) = ℐ(Q, a) = ℐ(P, b) = 1, all else 0
(also establishes K, D, and T invalidity)
Validity proof for S4 (and so for S5 as well):
i) Suppose for reductio that the formula is false in some world r of some
S4-model. Then V(□(P↔Q), r) = 1 and
ii) V(□(□P↔□Q), r) = 0. So for some a, ℛra and V(□P↔□Q, a) = 0,
and so □P and □Q must have different truth values in a. Without loss
of generality (given the symmetry between P and Q elsewhere in the
problem), let's suppose that V(□P, a) = 1 and
iii) V(□Q, a) = 0. So for some world b, ℛab and V(Q, b) = 0. Also, given
ii), V(P, b) = 1. So P and Q have different truth values at b.
iv) By transitivity, ℛrb, and so given i), V(P↔Q, b) = 1, contradicting iii).
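Whether a countermodel belongs to B, S4, or S5 depends on the formal properties of its accessibility relation, and these are easy to test directly. The sketch below assumes the relation of the B-countermodel for exercise 6.3d is the set of seven pairs shown in the code; the function names are made up for the example.

```python
def is_reflexive(worlds, access):
    return all((w, w) in access for w in worlds)


def is_symmetric(access):
    return all((v, w) in access for (w, v) in access)


def is_transitive(access):
    # whenever w sees v and v sees u, w must see u
    return all((w, u) in access
               for (w, v) in access
               for (v2, u) in access
               if v == v2)


worlds = {"r", "a", "b"}
access = {
    ("r", "r"), ("a", "a"), ("b", "b"),
    ("r", "a"), ("a", "r"),
    ("a", "b"), ("b", "a"),
}

properties = (is_reflexive(worlds, access),
              is_symmetric(access),
              is_transitive(access))
print(properties)
```

Reflexivity and symmetry make this a B-model. Transitivity fails because r sees a and a sees b while r does not see b, and that is exactly the step the S4 validity proof relies on.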
Exercise 6.3g ◇◇□P↔□P:
[Countermodel diagram for B: every world sees itself, r and a see each other, and r and b see each other. At r, ◇◇□P = 1 and □P = 0, so the biconditional is 0. At a, □P = 1 and ◇□P = 1. At b, P = 0.]
B-countermodel:
𝒲 = {r, a, b}
ℛ = {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨r, a⟩, ⟨a, r⟩, ⟨r, b⟩, ⟨b, r⟩}
ℐ(P, r) = ℐ(P, a) = 1, all else 0
(also establishes K, D, and T invalidity)
[Countermodel diagram for S4: every world sees itself, and r sees a and b. At r, ◇◇□P = 1 and □P = 0, so the biconditional is 0. At a, P = 1, □P = 1, and ◇□P = 1. At b, P = 0.]
S4-countermodel:
𝒲 = {r, a, b}
ℛ = {⟨r, r⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨r, a⟩, ⟨r, b⟩}
ℐ(P, a) = 1, all else 0
S5-validity proof:
i) Given the truth condition for the ↔, it will suffice to show that ◇◇□P
and □P have the same truth value in every world of every S5 model. So
let r be any world in any S5 model, and suppose first for reductio that
V(◇◇□P, r) = 1 and
ii) V(□P, r) = 0. So for some b, ℛrb and V(P, b) = 0.
iii) From i), for some a, ℛra and V(◇□P, a) = 1, and so for some c, ℛac and
V(□P, c) = 1. By symmetry, ℛca and ℛar, and so by transitivity, ℛcb,
and so V(P, b) = 1, contradicting ii). So the first reductio assumption is
false. Suppose next for reductio that V(◇◇□P, r) = 0 and
iv) V(□P, r) = 1. By reflexivity, ℛrr. So V(◇□P, r) = 1; and so,
V(◇◇□P, r) = 1, contradicting iii).
Exercise 6.5a ⊢_K ◇(P∧Q)→(◇P∧◇Q):
1. (P∧Q)→P                       PL
2. □[(P∧Q)→P]                    1, NEC
3. □[(P∧Q)→P]→[◇(P∧Q)→◇P]        K◇
4. ◇(P∧Q)→◇P                     2, 3, MP
5. ◇(P∧Q)→◇Q                     Similar to 1–4
6. ◇(P∧Q)→(◇P∧◇Q)                4, 5, PL (composition)
Exercise 6.5c ⊢_K ∼◇(Q∧R)↔□(Q→∼R):
1. ∼◇(Q∧R)→□∼(Q∧R)               MN (first one direction)
2. ∼(Q∧R)→(Q→∼R)                 PL
3. □[∼(Q∧R)→(Q→∼R)]              2, NEC
4. □∼(Q∧R)→□(Q→∼R)               3, K, MP
5. ∼◇(Q∧R)→□(Q→∼R)               1, 4, PL (syllogism)
6. (Q→∼R)→∼(Q∧R)                 PL (neg. conjunction) (now the other)
7. □(Q→∼R)→□∼(Q∧R)               6, NEC, K, MP
8. □∼(Q∧R)→∼◇(Q∧R)               MN
9. □(Q→∼R)→∼◇(Q∧R)               7, 8, PL (syllogism)
10. ∼◇(Q∧R)↔□(Q→∼R)              5, 9, PL (biconditional)
Exercise 6.5d Hint: you can move from φ→(ψ→χ) and φ→(χ→ψ) to
φ→(ψ↔χ) using PL.
Exercise 6.5g We're to show that ⊢_K ◇(P→Q)↔(□P→◇Q). This one's a
bit tough. The trick for the first half is choosing the right tautology, and for
the second half, getting the right PL strategy.
1. P→[(P→Q)→Q]                   PL
2. □P→□[(P→Q)→Q]                 1, NEC, K, MP
3. □[(P→Q)→Q]→[◇(P→Q)→◇Q]        K◇
4. □P→[◇(P→Q)→◇Q]                2, 3, PL (syllogism)
5. ◇(P→Q)→(□P→◇Q)                4, PL (permutation)
I must now prove the right-to-left direction, namely, (□P→◇Q)→◇(P→Q).
Note that the antecedent of this conditional is PL-equivalent to ∼□P∨◇Q
(disjunction from table 4.1), and that a conditional (φ∨ψ)→χ follows in PL
from the conditionals φ→χ and ψ→χ (dilemma). So my goal will be to get two
conditionals, ∼□P→◇(P→Q), and ◇Q→◇(P→Q), from which the desired
conditional follows by PL (line 12 below).
6. ∼□P→◇∼P                       MN
7. ∼P→(P→Q)                      PL
8. ◇∼P→◇(P→Q)                    7, NEC, K◇, MP
9. ∼□P→◇(P→Q)                    6, 8, PL (syllogism)
10. Q→(P→Q)                      PL
11. ◇Q→◇(P→Q)                    10, NEC, K◇, MP
12. (□P→◇Q)→◇(P→Q)               9, 11, PL (dilemma, disjunction)
13. ◇(P→Q)↔(□P→◇Q)               5, 12, PL (biconditional)
Exercise 6.7b Hint: use the strategy of example 6.10.
Exercise 6.8b Hint: first prove □(P→□P)→(◇P→P).
Exercise 6.10c Hint: follows in PL from ; and remember MN.
Exercise 7.1 One condition on accessibility that validates every instance of
(U) is the condition of "reflexivity at one remove": if ℛwv for some w then ℛvv.
For, let w be any world in any MPL model whose accessibility relation is
reflexive at one remove, and suppose for reductio that O(Oφ→φ) is false there.
Then Oφ→φ is false at some v accessible from w; and so, Oφ is true at v and φ is
false there. By reflexivity at one remove, ℛvv;
so, since Oφ is true at v, φ must be true at v; contradiction.
Exercise 7.2 To establish completeness for system X, we can use the theorems
and lemmas used to prove soundness and completeness for modal systems in
chapter 6 (strictly, those systems require the modal operator to be the □; so
let's think of O as a kind of rounded way of writing □.) For soundness, note
that system X is K + D ∪ U in the notation of section 6.5, where D is the set of
all instances of the D schema and U is the set of all instances of the (U) schema.
So, given lemma 6.1, all we need to do is show that all members of D ∪ U are
valid in every model whose accessibility relation is serial and reflexive at one
remove. This follows, for the members of D, from the proof in exercise 6.14,
and for the members of U, from the proof in exercise 7.1.
As for completeness, first let's show that the accessibility relation in the
canonical model for X is serial and reflexive at one remove. For seriality, the
analogous part of the proof of D's completeness (section 6.6.5) may simply be
repeated. As for reflexivity at one remove, suppose ℛwv; we must show that
ℛvv. So let □φ be any member of v; we must show that φ ∈ v. □(□φ→φ) is
an axiom and hence a theorem of X, and so is a member of w (lemma 6.5c). So
by the definition of ℛ, □φ→φ ∈ v; so, since □φ ∈ v, φ ∈ v by 6.5b.
Now for completeness. Suppose that ⊨_X φ. That is, φ is valid in all
serial-and-reflexive-at-one-remove MPL models. Given the previous paragraph, φ is
valid in the canonical model for X, and so by corollary 6.8, ⊢_X φ.
Exercise 7.5 For any MPL-wff φ, let φ^H be the result of replacing □s with
Hs in φ. Now suppose φ is an MPL-wff and ⊨_K φ; we must show that ⊨_PTL φ^H.
Intuitively, this holds because H works just like the □ except that it looks
backward along the accessibility relation, and there's nothing special about
either direction of the accessibility relation. But we need a proper argument.
Let ℳ = ⟨𝒯, ≤, ℐ⟩ be any PTL-model, and let t be any member of 𝒯;
we must show that φ^H is true at t in ℳ. Let ≥ be the converse of ≤ (that is,
t ≥ t′ iff t′ ≤ t); and let ℳ′ be the MPL model just like ℳ except that ≥ is its
accessibility relation. That is, ℳ′ = ⟨𝒯, ≥, ℐ⟩. Since ⊨_K φ, V_ℳ′(φ, t) = 1. I'll
show in a moment that:
(*) for any MPL-wff ψ and any s ∈ 𝒯, V_ℳ′(ψ, s) = 1 iff V_ℳ(ψ^H, s) = 1
Thus, V_ℳ(φ^H, t) = 1, which is what we wanted to show.
It remains to establish (*). I'll do this by induction. The base case, that
V_ℳ′(α, s) = 1 iff V_ℳ(α^H, s) = 1 for any sentence letter α, is immediate since
α = α^H and ℳ and ℳ′ share the same interpretation function ℐ. Now assume
for induction that (*) holds for ψ and χ; we must show that it also holds for
∼ψ, ψ→χ, and □ψ. This is obvious in the first two cases; as for the latter:
V_ℳ′(□ψ, s) = 1 iff V_ℳ′(ψ, s′) = 1 for each s′ such that s ≥ s′   (t.c. for □)
                iff V_ℳ′(ψ, s′) = 1 for each s′ such that s′ ≤ s   (def of ≥)
                iff V_ℳ(ψ^H, s′) = 1 for each s′ such that s′ ≤ s  (ih)
                iff V_ℳ(Hψ^H, s) = 1                               (t.c. for H)
Since Hψ^H is the same wff as (□ψ)^H, we're done.
Exercise 7.8 We must show that if Γ ⊨_I φ then Γ ⊨_PL φ. Suppose Γ ⊨_I φ,
and let ℐ be a PL-interpretation in which every member of Γ is true; we
must show that V_ℐ(φ) = 1. (V_ℐ is the classical valuation for ℐ.) Consider the
intuitionist model ℳ with just one stage, r, in which sentence letters have the
same truth values at r as they have in ℐ, i.e., ℳ = ⟨{r}, {⟨r, r⟩}, ℐ′⟩, where
ℐ′(α, r) = ℐ(α) for each sentence letter α. Since ℳ has only one stage, the
classical and intuitionist truth conditions collapse in this case: it would be
easy to show by induction that for every wff ψ, IV_ℳ(ψ, r) = V_ℐ(ψ). So, since
V_ℐ(γ) = 1 for each γ ∈ Γ, IV_ℳ(γ, r) = 1 for each γ ∈ Γ. Since Γ ⊨_I φ, it follows
that IV_ℳ(φ, r) = 1; and so, V_ℐ(φ) = 1.
Exercise 7.10 We're to come up with cases of semantic consequence in the
systems of Łukasiewicz, Kleene, Priest, and supervaluationism, that are not
cases of intuitionist consequence. Actually a single case suffices. Exercise
7.9a shows that ∼(P∧Q) ⊭_I ∼P∨∼Q. But ∼(P∧Q) ⊨ ∼P∨∼Q in each of
these systems. Exercise 3.10c demonstrates this for LP. As for Łukasiewicz,
suppose that ŁV(∼(P∧Q)) = 1. Then ŁV(P∧Q) = 0, so either ŁV(P) = 0 or
ŁV(Q) = 0. So either ŁV(∼P) = 1 or ŁV(∼Q) = 1. So ŁV(∼P∨∼Q) = 1.
Since the Kleene tables for the ∼, ∧, and ∨ are the same as Łukasiewicz's, the
implication holds in Kleene's system as well. Finally, supervaluationism: since
∼(P∧Q) PL-implies ∼P∨∼Q, by exercise 3.13, ∼(P∧Q) ⊨_S ∼P∨∼Q.
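The three-valued cases of exercise 7.10 can also be confirmed by running through all nine assignments to P and Q. The sketch below rests on assumptions: # is encoded as 0.5, negation is 1 − x, conjunction is minimum, and disjunction is maximum (tables shared by Łukasiewicz, Kleene, and LP), and a system's notion of consequence is captured by its set of designated values ({1} for Łukasiewicz and Kleene, {1, #} for LP). The function names are invented for the example.

```python
from itertools import product

VALUES = (0, 0.5, 1)  # 0.5 stands in for '#'


def neg(x):
    return 1 - x


def premise(p, q):
    # ∼(P∧Q)
    return neg(min(p, q))


def conclusion(p, q):
    # ∼P∨∼Q
    return max(neg(p), neg(q))


def entails(designated):
    """True iff every assignment giving the premise a designated value
    also gives the conclusion a designated value."""
    return all(conclusion(p, q) in designated
               for p, q in product(VALUES, repeat=2)
               if premise(p, q) in designated)


holds_with_1_designated = entails({1})           # Łukasiewicz and Kleene
holds_with_1_and_hash_designated = entails({1, 0.5})  # LP
print(holds_with_1_designated, holds_with_1_and_hash_designated)
```

The check succeeds for a simple reason: on these tables the premise and the conclusion receive the same value under every assignment.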
Exercise 8.1 Could a general limit assumption be derived from a limit
assumption for atomics? No. Consider the following model. There are infinitely
many worlds, including a certain world w in which P is true. P is false at all
other worlds, and all other sentence letters are false in all worlds. There is
no nearest-to-w world (other than w itself): for each world x ≠ w, there is
some y ≠ w such that y ≺_w x. And w is always the nearest-to-x world (other
than x itself), for any other world x: for any x and any y ≠ x, w ⪯_x y. In
this model the limit assumption for atomics holds (for any world x, w is the
nearest-to-x P world; and no other atomic is true at any world.) But the general
limit assumption fails: although ∼P is true in some worlds (indeed, infinitely
many), there is no nearest-to-w ∼P world.
Exercise 8.3c P□→(Q□→R) ⊭_SC Q□→(P□→R):
[Countermodel diagrams, with worlds listed from nearest to farthest.
View from r: r, a, b, c, d. At r, P = 0 and Q = 0, with P□→(Q□→R) = 1 and Q□→(P□→R) = 0. The nearest P world is a, where Q = 0 and Q□→R = 1. The nearest Q world is b (there is no Q at a), where P□→R = 0.
View from a: a, c, b, r, d. The nearest Q world is c, where Q = 1 and R = 1.
View from b: b, d, a, r, c. The nearest P world is d, where P = 1 and R = 0.]
Official model:
𝒲 = {r, a, b, c, d}
⪯_r = {⟨a, b⟩, ⟨b, c⟩, ⟨c, d⟩ . . . }
⪯_a = {⟨c, b⟩, ⟨b, r⟩, ⟨r, d⟩ . . . }
⪯_b = {⟨d, a⟩, ⟨a, r⟩, ⟨r, c⟩ . . . }
ℐ(P, a) = ℐ(Q, b) = ℐ(Q, c) = ℐ(R, c) = ℐ(P, d) = 1, all else 0
Exercise 8.6 Hint: say that an LC model is "Stalnaker-acceptable" iff it
obeys the limit and anti-symmetry assumptions. Show (by induction) that in
Stalnaker-acceptable models, Lewis's truth-conditions yield the same results as
Stalnaker's. That is, in any such model, a wff counts as being true at a given
world given Lewis's definition of truth in a model if and only if it counts as
being true at that world given Stalnaker's definition.
Exercise 9.1a ⊨_SQML (□∀x(Fx→Gx) ∧ ◇∃xFx) → ◇∃xGx:
i) Suppose for reductio that V_g((□∀x(Fx→Gx) ∧ ◇∃xFx) → ◇∃xGx, w) =
0, for some world and variable assignment w and g in some model. Then
V_g(□∀x(Fx→Gx) ∧ ◇∃xFx, w) = 1 and
ii) V_g(◇∃xGx, w) = 0.
iii) Given i), V_g(◇∃xFx, w) = 1. So V_g(∃xFx, v) = 1, for some v ∈ 𝒲. So,
V_{g^x_u}(Fx, v) = 1, for some u ∈ 𝒟.
iv) Given i), V_g(□∀x(Fx→Gx), w) = 1. So V_g(∀x(Fx→Gx), v) = 1. So
V_{g^x_u}(Fx→Gx, v) = 1. So either V_{g^x_u}(Fx, v) = 0 or V_{g^x_u}(Gx, v) = 1; and so,
given iii), V_{g^x_u}(Gx, v) = 1.
v) Given ii), V_g(∃xGx, v) = 0. So V_{g^x_u}(Gx, v) = 0, contradicting iv).
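Exercise 9.1c ⊭_SQML ∃x◇Rax → ◇□∃x∃yRxy:
[Countermodel diagram: two worlds, r and c, with 𝒟 = {u} and ℐ(a) = u. At r, the extension of R is {⟨u, u⟩}, so ◇Rax is 1 when x is assigned u and ∃x◇Rax = 1, while ◇□∃x∃yRxy = 0; the conditional is 0. At c, nothing bears R to anything, so ∃x∃yRxy = 0.]
Official model:
𝒲 = {r, c}
𝒟 = {u}
ℐ(a) = u
ℐ(R) = {⟨u, u, r⟩}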
Exercise 9.6a ⊢_SQML □(□∀x(Fx→Gx) ∧ ∃xFx) → □∃xGx:
1. ∀x(Fx→Gx) → (Fx→Gx)                   PC1
2. ∀x(Fx→Gx) → (∼Gx→∼Fx)                 1, PL (contr., syll.)
3. ∀x(∀x(Fx→Gx) → (∼Gx→∼Fx))             2, UG
4. ∀x(Fx→Gx) → ∀x(∼Gx→∼Fx)               PC2, 3, MP
5. ∀x(∼Gx→∼Fx) → (∀x∼Gx→∀x∼Fx)           Distribution
6. ∀x(Fx→Gx) → (∀x∼Gx→∀x∼Fx)             4, 5, PL (syllogism)
7. (∀x(Fx→Gx) ∧ ∼∀x∼Fx) → ∼∀x∼Gx         6, PL (contr., imp/exp.)
8. (∀x(Fx→Gx) ∧ ∃xFx) → ∃xGx             7, def of ∃
9. □∀x(Fx→Gx) → ∀x(Fx→Gx)                T
10. (□∀x(Fx→Gx) ∧ ∃xFx) → ∃xGx           8, 9, PL (see below)
11. □(□∀x(Fx→Gx) ∧ ∃xFx) → □∃xGx         10, NEC, K, MP
(Step 10 used the tautology: ((P∧Q)→R)→((S→P)→((S∧Q)→R)).) My
approach: I set myself an ultimate goal of getting the conditional
(□∀x(Fx→Gx) ∧ ∃xFx) → ∃xGx (since then I could use the usual K technique
for adding a □ to each side). Since SQML includes the T axioms, it sufficed to
establish (∀x(Fx→Gx) ∧ ∃xFx) → ∃xGx, that is: (∀x(Fx→Gx) ∧ ∼∀x∼Fx) →
∼∀x∼Gx. So this latter formula became my penultimate goal. But this follows
via PL from ∀x(Fx→Gx) → (∀x∼Gx→∀x∼Fx). So this final formula became
my first goal.
Exercise 10.1 I'll show that the designated-worlds definition of validity is
equivalent to the old one; the proof for semantic consequence is parallel. First
note that:
(*) For each new model ℳ = ⟨𝒲, w@, ℛ, ℐ⟩, the corresponding old model
ℳ′ = ⟨𝒲, ℛ, ℐ⟩ has the same distribution of truth values, i.e., for every
wff φ and every w ∈ 𝒲, V_ℳ(φ, w) = V_ℳ′(φ, w)
(*) is true because the designated world plays no role in the definition of the
valuation function.
Now, where S is any modal system, suppose first that φ is S-valid under
the old definition. Then φ is valid in every old S-model. So by (*), φ is true in
every world of every new S-model, and so is true at the designated world of
every new S-model, and so is S-valid under the new definition.
For the other direction, suppose φ is S-invalid under the old definition.
So φ is false at some world, w, in some old S-model ℳ′ = ⟨𝒲, ℛ, ℐ⟩. Now
consider the new S-model ℳ = ⟨𝒲, w, ℛ, ℐ⟩ (same worlds, accessibility, and
interpretation function as ℳ′; w is the designated world). By (*), φ is false at
w in ℳ, and so is false in ℳ, and so is S-invalid under the new definition.
Exercise 10.5 Hint: first prove by induction the stronger result that if has
no occurrences of @, then F is generally valid.
Indcx
+ (QML),
2
,
,
,
, ,
denition,
semantics,
semantics, in modal logic,
, see also might-counterfactual
denition,
semantics,
Z,
3,
denition,
semantics,
denition,
semantics,
a priority, two-dimensional formaliza-
tion of,
accessibility, , ,
in tense logic,
actualist versus possibilist quantiers,
actuality,
adequate connective sets, see truth func-
tion, symbolizable in PL
alethic necessity, , ,
anti-symmetric (relation),
anti-symmetry (counterfactuals), ,
Lewiss objection,
argument, of a function,
assumptions
in MPL, see conditional proof, in
MPL
reasoning with, , ,
asterisks (MPL countermodels)
above,
below,
atomic formula, ,
augmentation (counterfactuals), ,
,
axioms, see proof system, axiomatic
Barcan formula, , , ,
base axiom (counterfactuals), ,

Berra, Yogi,
binary (relation),
bivalence,
box-strip, see 2
B3, see schema, B3

canonical models for MPL
denition,
idea of,
truth and membership,
closed formulas, see free variables
compactness
INDEX
fails in second-order logic,
predicate logic,
completeness
fails in second-order logic,
intuitionistic logic,
modal propositional logic,
predicate logic,
propositional logic,
SQML,
complex predicates,
conditional,
conditional excluded middle (counter-
factuals), ,
conditional proof,
in MPL,
congurations, , , , ,
,
connected (relation),
constancy,
contingent a priori,
two-dimensional account of,
contraposition (counterfactuals), ,
converse Barcan formula, , , ,
correct substitution,
counterfactuals
context dependence,
distinguished from indicative con-
ditionals,
dynamic semantics for,
introduced,
Lewis/Stalnaker theory,
logic in natural language,
daggers (MPL countermodels),
de re
modality, philosophical problem
of,
versus de dicto modal sentences,
decidability
decreasing domains requirement,
deduction theorem,
proof of, for propositional logic,
deep contingency,
deep necessity,
dened connectives, see primitive con-
nectives
denite descriptions, in natural lan-
guage, ,
deniteness, see determinacy
denotation
,
function symbols,
predicate logic,
dense (relation),
deontic logic,
designated values,
designated world,
determinacy,
and supervaluationism,
operator for,
three-valued approach to,
deviation (from standard logic),
dialetheism,
distribution (counterfactuals), ,
domain
modal logic, ,
predicate logic,
double-negation elimination and in-
troduction,
INDEX
doxastic logic
empty descriptions
empty names
epistemic logic
equivalence relation
eternality (tense logic)
Evans, Gareth
ex falso (EF)
existential quantifier, see ∃
exportation (counterfactuals)
extension (of a predicate)
extension (of a proof system)
extension (of standard logic)
extraordinary (vs normal) objects
fixedly, idea of
formal language
formalization, see formal language
frame
free logic
  positive, negative, neutral
free variables
  axiomatic proofs of formulas containing
  truth for formulas containing
function
  domain of
  range of
  total
function symbols, in natural language
Gödel, Kurt
  incompleteness theorem
Geach-Kaplan sentence
generalized quantifiers
grammar
  =
  Z
  counterfactuals
  function symbols
  predicate logic
  quantified modal logic
  second-order logic
  tense logic
Henkin proof
heredity condition
  generalized to all formulas
Hilbert-style system, see proof system, axiomatic
identity of indiscernibles
importation (counterfactuals)
inadequate connective sets, see truth function, not symbolizable in PL
inclusive logic
increasing domains requirement
indiscernibility of identicals
induction, proof by
infinity
intension
interpretation
  bivalent, see interpretation, propositional logic
  predicate logic
  trivalent
intuitionistic logic
  philosophical justification of
  proof theory for
  semantics for
Kleene, Stephen
  strong vs. weak truth tables
  truth tables
Kripke, Saul
Kripke-style semantics, see possible worlds, semantics
K◇, see schema, K◇
lambda abstraction, see complex predicates
Leibniz's Law
Lewis, C. I.
Lewis, David
liar paradox
limit assumption
  Lewis's objection
logic
  application
  mathematical
  semantic approach to
  symbolic, see logic, mathematical
logical consequence
  introduced
  nature of
  proof-theoretic conception of
  semantic conception of
logical constant
logical form
logical truth
logically correct argument
Łukasiewicz, Jan
major connective
Marcus, Ruth Barcan
maximal consistent sets
  defined
  defined for modal logic
  existence proved
  features of
  in canonical models
meaning
merely possible objects, see extraordinary objects
mesh (completeness proof for MPL)
metalanguage
metalogic
  proof
might-counterfactual
  Lewis's definition
might-counterfactuals
modal negation, see schema, modal negation
modal reduction theorem
modal systems
  relations between
modality
  introduced
  strengths of
model
  counterfactuals
    Lewis
    Stalnaker
  free logic
  function symbols
  predicate logic
  quantified modal logic
    simple
    simple, designated world
    variable domains
  systems of MPL
modus ponens
MP technique
MPL-tautological consequence
MPL-tautology
nearness relation
necessary a posteriori
necessary a priori
  two-dimensional account of
necessary existence
necessitation (rule of)
necessity, see modality
necessity of identity
necessity of the consequence vs necessity of the consequent
necessity, strong versus weak
negation
nonactual objects, see extraordinary objects
nonclassical logic
nonexistent objects, see extraordinary objects
object language, see metalanguage
one-to-one function
open formulas, see free variables
ordered pair
over (relations)
paraconsistent logic
PC-tautological consequence
PC-tautology
plus sign (QML), see +
polish notation
possibility, see modality
possible worlds
  introduced
  philosophical questions about
  semantics, idea of
precisification
presupposition
Priest, Graham
primitive connectives
Prior, Arthur
proof stage
proof system
  axiomatic
    B
    D
    K
    MPL strategies
    predicate logic
    S4
    S5
    SQML
    T
  idea of
  natural deduction
  sequent
Quine, W. V. O.
recursive definition
reflexive (relation)
reflexivity (of identity)
relation
  domain of
  range of
relations
  formal properties of
relevance logic
rigid designator
Russell, Bertrand
  theory of descriptions
S4◇, see schema, S4◇
S5◇, see schema, S5◇
schema
  axiom
  B◇
  contraposition, form 1
  contraposition, form 2
  double-negation elimination
  double-negation introduction
  ex falso quodlibet
  excluded middle MP
  K◇
  modal negation
  negated conditional
  permutation
  proof-
  S4◇
  S5◇
  T◇
  validity and invalidity of
  weakening the consequent
scope
second-order logic
semantic consequence
  counterfactuals
  Kleene
  LP
  Łukasiewicz
  predicate logic
  propositional modal logic
  simple
  variable domains
  supervaluationism
  two-dimensional modal logic
    general semantic consequence
semantic equivalence
semantics, see logic, semantic approach to
sentences vs open formulas, see free variables
sequent
  defined
  logically correct
  proof
    defined
    idea of
  provable sequent defined
  rules
serial (relation)
serious actualism
set theory
Sheffer stroke
similarity (counterfactuals)
soundness
  predicate logic
  SQML
stage, see proof stage
Stalnaker, Robert
substitution of equivalents
superficial contingency
superficial necessity
supervaluations
symmetric (relation)
Tarski, Alfred
tautologies
  table of
tense logic
tense operators
  formalized
  idea of
  metrical
toolkit
total (relation)
  generates same modal logic as equivalence relation
transitive (relation)
transitivity
  of conditionals in PL
transitivity (counterfactuals)
truth function
  not symbolizable in PL
  symbolizable in PL
truth in a model, see also valuation
  predicate logic
  with designated worlds
truth table
  and truth-functionality
  cannot be given for □
  Kleene
  Łukasiewicz
  Sheffer stroke
truth values
truth-conditions, see meaning
truth-functionality
truth-value gaps
  philosophical justification of
  philosophical objection to
  versus gluts
tuple, see ordered pair
two-dimensional modal logic, idea of
T◇, see schema, T◇
undecidability (predicate logic)
uniform substitution
universal generalization (UG)
use and mention
vacuously true counterfactual
vagueness
  higher-order
validity
  counterfactuals
  in an MPL model
  Kleene
  LP
  Łukasiewicz
  predicate logic
  simple
  variable domains
  supervaluationism
  tense logic
  two-dimensional modal logic
    general validity
  with designated worlds
valuation
  =
  (two-dimensional)
  actuality
  counterfactuals
    Lewis
    Stalnaker
  free logic
  Kleene
  Łukasiewicz
  simple
  variable domains
  second-order logic
  supervaluation
  tense logic
value, of a function
variable assignment
  defined
  truth relativized to
variables
  metalinguistic
variation (of standard logic)
well-formed formula (wff), see grammar
