
PRONOUNS IN CATALAN: INFORMATION, DISCOURSE AND STRATEGY
Laia Mayol
A DISSERTATION
in
Linguistics
Presented to the Faculties of the University of Pennsylvania in Partial
Fulfillment of the Requirements for the Degree of Doctor of Philosophy
2009
Robin Clark, Supervisor of Dissertation
Eugene Buckley, Graduate Group Chair

Acknowledgements
I thought writing the acknowledgments would be the easiest part of this thesis, but it is not!
It is very difficult to find the appropriate words to thank everyone who has helped me along
this process, showing my gratitude without being (too) corny. I tried setting up one of my
game theory trees and computing the equilibrium, but it did not seem to work very well for
this problem. Let’s try, though!
I would like to thank my advisor, Robin Clark, for guiding me throughout this project,
teaching me game theory and discussing every section of this thesis. Thanks also to my
committee, Aravind Joshi and Charles Yang, for giving me valuable feedback which greatly
helped to improve this work. Penn has been a wonderful place to learn about linguistics. I
would like to thank all my professors here, and in particular Maribel Romero. Thanks to
Maribel I have a much clearer idea of how a teacher should be!
My experience here would not have been the same without the many hours spent at the
p.lab (many thanks to Jiahong Yuan for letting me squat there!). Aviad Eilam, Catherine
Lai, Yanyan Sui and Josh Tauberer have been very good friends here. We did not only
talk about linguistics, but also gossiped about it! Our New York trips, dinners around
West Philly, chats in the p.lab, squash matches and adventurous trips to the mall have
been an important part of my life here. I’ll miss you! Double thanks to Aviad for having
the patience to proofread 200 pages with weird prepositions and determiners. Eva
Florencio and Ganesh Krishnamurthi were my first good friends in Philly and helped me
survive the (sometimes) amusing and (sometimes) frustrating times of getting used to a
new city and a new country. I wish you much happiness! My fellow grad students at
Penn have made my stay here much more pleasant and I’d like to thank them for their
camaraderie: Lukasz Abramowicz, Stefanie Brody, Lucas Champollion, Toni Cook, Ariel
Diertani, Aaron Dinkin, Keelan Evanini, Michael Friesner, Kyle Gorman, Jonathan Gress-
Right, Damien Hall, Robert Lannon, Caitlin Light, Laurel MacKenzie, Brittany McLaugh-
lin, Satoshi Nambu, Giang Nguyen, Marjorie Pak, Maya Ravindranath, Tatjana Scheffler,
Augustin Speyer and Joel Wallenberg.
My professors in Barcelona are the first people who are responsible for my writing
these lines now. Enric Vallduví is an amazing teacher and an amazing linguist. If he
hadn’t been my advisor in my senior year, I would probably never have discovered how fun
and satisfying doing linguistics can be. Louise McNally and Toni Badia have been very
supportive, always ready to help and give advice. The Glicom was the first research group
I ever worked in and I have fond memories of my colleagues there: Gemma Boleda, Stefan
Bott, Àngel Gil, Martí Quixal and Oriol Valentín.
What an extremely good friend Elena Castroviejo has been! Our skype conversations
have been very important in helping me survive these two years of writing a thesis. I
have had lots of fun discussing and brainstorming about semantics and pragmatics, shar-
ing youtube links and making up emoticons, among the most successful being (opn) and
(puny). I am looking forward to our next project! Gemma Barberà has been another very
good friend, always ready to listen, help and bike while singing bonfire songs. Noies,
gràcies per tot!
I have been very lucky to have been able to attend several ESSLLI summer schools.
Thanks to everyone who makes them possible! At ESSLLI, I have learned a lot, had lots of
fun and met many great people, in particular, Manuel Kirschner, who is one of the nicest
people I know, always kind and supportive.
In these five years, I have grown very fond of Philadelphia. It was not love at first
sight, but I know I will miss this place very much, its parks, its cafes, its bumpy roads
(well, maybe not the bumpy roads). However, I am also looking forward to spending some
time at home, closer to my family and friends. I am very lucky to have friends back home
with whom I have remained very close despite the distance. Mercè is a lot of fun and the
non-linguist with the best linguistic intuitions I know. Montse is one of the most passionate
people I know and always a pleasure to talk to.
My family deserves the biggest thank you for their unconditional support in all the
decisions I have made in my life: my parents Rosa and Joan, my brother Adrià and my
grandparents Martí, Lola and Vicenç. Aquesta tesi és per a vosaltres. Moltes gràcies!
Finally, everything would have been different had I not met Pedro. Thanks for putting up
with my hydrogen-like behavior, being enormously patient as I struggled with writing this
thesis, driving the car when the road gets too narrow and making this story possible across
several countries and continents. ¡Gracias por todo!
ABSTRACT
PRONOUNS IN CATALAN: INFORMATION, DISCOURSE AND STRATEGY
Laia Mayol
Supervisor: Robin Clark
This thesis investigates the variation between null and overt pronouns in subject position
in Catalan, a null subject language. I argue that null and overt subject pronouns are two
resources that speakers efficiently deploy to signal their intended interpretation regarding
antecedent choice or semantic meaning, and that communicative agents interact strategi-
cally in order to communicate the desired meaning with the most economical form possi-
ble. The mathematical framework of Game Theory is used to analyze this variation, since
it is particularly suitable for modeling strategic interaction and choices.
The Position of Antecedent Hypothesis, proposed by Carminati (2002) for Italian, states
that null pronouns have a subject preference, while overt pronouns have a non-subject pref-
erence. I show that Catalan intersentential data conforms to the PAH whenever the subject
is the link of the sentence. However, the PAH needs to be redefined once the topic-focus
articulation of the sentence is taken into account: null pronouns have a subject preference
regardless of whether the subject is acting as link of the sentence or not, while overt pro-
nouns have a preference for low salience (non-subject, non-link) antecedents. These results
point to a model in which salience is composed of several factors and different forms are
sensitive to different factors. This data is modeled using games of partial information, in
which information states represent different levels of salience. This model makes the pre-
diction that the biases emerging from the PAH should be overridden if there are powerful
enough contextual cues, which is borne out.
The relative rates of null and overt pronouns vary greatly in different Romance varieties.
I present two hypotheses to deal with this variation: one based on priming effects and the
other on a grammatical change in progress. Finally, the relationship between contrastivity
and overt pronouns is addressed. I argue that all instances of contrastive pronouns are
Contrastive Topic markers, which trigger an uncertainty contrast interpretation, which can
be coerced into an exhaustive contrast if there is a salient alternative in the discourse or in
the context. I offer a game theoretical analysis of the pairing between forms and contrastive
meanings.
Contents
Acknowledgements
Abstract
Contents
List of Tables
List of Figures
List of Abbreviations
1 Introduction
1.1 The research question
1.2 Structure of the thesis
2 Background
2.1 Choice of referring expressions and their processing
2.1.1 Accessibility Theory (Ariel, 2001)
2.1.2 The Givenness Hierarchy (Gundel et al., 1993)
2.1.3 Centering Theory
2.2 Information structure
2.2.1 The Old and the New
2.2.2 Vallduví’s (1992) tripartite approach
2.3 Catalan syntactic and pragmatic structure
2.4 Some claims about overt pronouns in Romance Languages
2.4.1 The Position of Antecedent Hypothesis (PAH)
2.4.2 S-topic change
2.4.3 Contrast
2.4.4 Focal information
2.4.5 Rigau’s (1989) approach
2.4.6 Summary of proposals
2.5 Corpus data
3 Subjecthood and pronouns: The Position of Antecedent Hypothesis
3.1 Italian pronouns: Carminati (2002)
3.1.1 Experiment 1: questionnaire with non-biased sentences
3.1.2 Experiment 2: self-paced reading experiment
3.2 The PAH in Spanish
3.3 Experiments on Catalan pronouns
3.3.1 Experiment 1: questionnaire study
3.3.2 Experiment 2: self-paced reading test
3.4 Conclusion
4 Game theory
4.1 Overview of game theory
4.1.1 The role of payoffs
4.2 A game theoretical approach to discourse anaphora
4.3 An analysis of null-subject languages
4.3.1 Experiment 3: self-paced reading experiment with different degrees of biasing
4.4 Mixed strategies and uncertainty
4.5 Conclusion
5 Pragmatic structure and pronouns: topic, link and focus
5.1 Topics and links
5.1.1 Related work: Italian pronouns
5.1.2 Related work in other languages
5.1.3 Experiment 4: questionnaire experiment
5.1.4 Game theoretical analysis
5.2 Focus
5.2.1 Experiment 5: questionnaire study with focal subjects
5.2.2 Game theoretical analysis
5.2.3 Incompatibility of NSPs and focus: a game theoretical perspective
5.3 Conclusion
6 Cross-linguistic variation
6.1 Null and overt subjects across varieties
6.1.1 Quantitative studies
6.1.2 Qualitative studies
6.2 Game theory and cross-linguistic variation
6.2.1 Hypothesis I: Priming effects
6.2.2 Hypothesis II: Competition of grammars
6.3 Conclusion
7 Contrast
7.1 The data
7.1.1 Double contrast
7.1.2 Implicit contrast
7.1.3 Weak contrast
7.1.4 Stressed and unstressed overt pronoun
7.2 On the notion of contrast
7.2.1 Contrast as a rhetorical relation
7.2.2 Contrast as a semantic operator
7.2.3 Contrastive Topics
7.2.4 A reformulation
7.3 Analysis of contrast in Romance OSPs
7.3.1 Contrastive OSPs as Contrastive Topics
7.3.2 Game theory and contrast
7.4 Conclusion
8 Conclusion
8.1 Contributions of this thesis
8.2 Directions for future work
9 Appendices
9.1 Appendix A. Materials for Experiment 1
9.2 Appendix B. Materials for Experiment 2
9.3 Appendix C. Materials for Experiment 3
9.4 Appendix D. Materials for Experiment 4
9.5 Appendix E. Materials for Experiment 5
Bibliography
List of Tables
2.1 Transitions in Centering Theory
3.1 Results for Experiment 1 in Carminati (2002)
3.2 Results for Experiment 2 in Carminati (2002)
3.3 Results for Experiment 1
4.1 Example with a dominant strategy
4.2 Game without a dominant strategy
4.3 Game with a Pareto-Nash equilibrium
4.7 Game with a mixed Nash equilibrium
4.8 Potential mixed strategy game
5.1 Results in Frana (2007)
5.2 Results in Kaiser and Trueswell (2008)
6.1 Overall rate of OSPs in different varieties
6.2 Distribution of pronouns across categories in Cameron (1992)
6.3 Distribution of OSPs in switch and same conditions in Cameron (1992)
6.4 Varbrul weights for Switch Reference in Cameron (1992)
6.5 Percentage of overt singular pronouns in Madrid and San Juan: cross-tabulation of trigger status by same/switch condition
6.6 Percentage of G1 items incompatible with a G2 grammar
6.7 Percentage of G2 items incompatible with a G1 grammar
List of Figures
2.1 Catalan VOS order
4.1 Game of incomplete information
4.2 Game of partial information for English anaphora
4.3 Game for Catalan pronouns
4.4 Expected payoffs
5.1 Game for interaction between subjecthood and linkhood
5.2 Game for focal pronouns (1)
5.3 Game for focal pronouns (2)
6.1 Full pronominal subjects during seven periods in Brazilian Portuguese
7.1 Contrast I Game
7.2 Contrast II Game
List of Abbreviations
CB Backward-looking center
CF Forward-looking center
CL Clitic pronoun
CP Preferred center
DD Definite description
GT Game theory
IO-CL Indirect object clitic pronoun
IS Information state
NSL Null subject language
NSP Null subject pronoun
OSP Overt subject pronoun
PAH Position of Antecedent Hypothesis
Q Question marker
Chapter 1
Introduction
1.1 The research question
The goal of this thesis is to investigate which factors regulate the variation between null and
overt subject pronouns (NSP and OSP, henceforth) in Catalan and to model this variation
as a game theoretical problem, in which there are, in principle, two resources competing
for the same position and function.
Catalan, on a par with other Romance languages like Italian or Spanish, is a null subject
language (NSL) and has a double system of pronouns (Rigau, 1986). In subject position,
there is an alternation between overt pronouns (ell in 1a) and null pronouns (in 1b).
(1) a. Ell estima la Maria.
       he loves the Mary
    b. Estima la Maria.
       loves the Mary
    ‘He loves Mary.’
There are cases in which OSPs are ungrammatical, as in 2a, cases in which they are
optional, as in 2b, and cases in which they are mandatory, as in 2c (examples from Rigau
(1989)).
(2) a. Has entrat i (*tu) has sortit.
       have entered and you have left
    b. Quan (ell) va arribar, tothom va callar.
       when (he) arrived, everyone stopped talking
    c. En Pere és de Barcelona però *(tu) ets de Girona.
       the Peter is from Barcelona but you are from Girona
The goal of this thesis is to investigate what regulates these patterns, to examine which
type of referring preferences they exhibit and to give a pragmatic game theoretical analysis
of these preferences. In Catalan, NSPs and OSPs are two resources that speakers may
deploy to signal their intended interpretation and I argue that in this domain speakers and
hearers interact strategically in order to communicate the desired meaning with the most
economical form possible. That is, this extra resource that NSLs have at their disposal
is efficiently used by speakers to communicate particular decisions regarding antecedent
choice or semantic meaning, specifically as it relates to contrastivity.
Game theory, which provides a mathematical framework to deal with strategic choice,
will be the tool used in order to analyze this variation. Rationality plays a key role in
pragmatic choices: the speaker makes a choice taking into account how the hearer will
interpret this choice. The hearer interprets the speaker’s choice taking into account that
the speaker took the hearer into account, etc. Game theory is equipped to handle precisely
these situations, in which two agents take each other into account in order to choose the
action which should turn out to be optimal for them. In this thesis, I propose a game
theoretical model of how participants in a conversation interpret null and overt pronouns in
a discourse; that is, I specify the necessary ingredients of the model: the information states
and their probabilities, the possible actions in each information state and their payoffs.
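The ingredients just listed (information states, their probabilities, actions and payoffs) can be made concrete with a minimal sketch. The Python below is illustrative only: the two information states, their probabilities, the payoff values and all function names are hypothetical placeholders, not figures or definitions taken from this thesis. It simply enumerates pure speaker and hearer strategies and scores each pair by its expected payoff.

```python
from itertools import product

# Hypothetical information states: which antecedent the speaker intends.
# The probabilities are placeholders, not corpus estimates.
STATE_PROBS = {"subject": 0.7, "object": 0.3}

# Actions available to the speaker in every information state.
FORMS = ("null", "overt")

# Assumed payoffs: successful communication with the shorter (null) form
# is worth slightly more than with the overt form; failure is worth nothing.
SUCCESS_PAYOFF = {"null": 10, "overt": 8}
FAILURE_PAYOFF = 0


def expected_payoff(speaker, hearer):
    """Expected payoff of one strategy pair.

    speaker maps each information state to a form;
    hearer maps each form to an interpretation (an information state).
    """
    total = 0.0
    for state, prob in STATE_PROBS.items():
        form = speaker[state]
        understood = hearer[form] == state
        total += prob * (SUCCESS_PAYOFF[form] if understood else FAILURE_PAYOFF)
    return total


def all_strategy_pairs():
    """Yield every pure speaker strategy paired with every pure hearer strategy."""
    states = list(STATE_PROBS)
    for speaker_choice in product(FORMS, repeat=len(states)):
        speaker = dict(zip(states, speaker_choice))
        for hearer_choice in product(states, repeat=len(FORMS)):
            hearer = dict(zip(FORMS, hearer_choice))
            yield speaker, hearer


# Strategy pair with the highest expected payoff under the placeholder values.
best_speaker, best_hearer = max(
    all_strategy_pairs(), key=lambda pair: expected_payoff(*pair)
)
```

Under these placeholder values, the highest-scoring pair uses the null form for the more probable state and the overt form for the less probable one, with the hearer interpreting each form accordingly. Different probabilities or payoffs would of course change the outcome; the sketch is meant only to show how the four ingredients fit together.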
The main questions I aim to answer in this thesis are the following:
• Question 1: What is the relationship between syntactic function and pronouns in
Catalan? Do different pronouns have biases towards antecedents in particular syn-
tactic positions?
• Question 2: What is the relationship between information structure and pronouns in
Catalan? Do different pronouns have biases towards antecedents playing different
roles in the information structure of a sentence? Can the syntactic preferences of
pronouns be understood as a by-product of their pragmatic preferences? What does
this tell us about the notion of salience?
• Question 3: It is well known that different NSLs present different overall rates of
OSPs. How should we deal with this cross-linguistic variation in our game theoretical
approach?
• Question 4: How should contrastive pronouns be analyzed? Is there only one type of
contrastive pronoun? Are they always mandatory?
The main thesis defended in this work is that the variation between NSPs and OSPs in
Catalan allows participants engaged in a communicative exchange to interact strategically
and behave rationally in generating and interpreting anaphoric expressions. I will also
argue for a model of salience, in which both syntactic and pragmatic factors play a role and
different pronouns are sensitive to different factors.
1.2 Structure of the thesis
This thesis is structured as follows:
• Chapter 2 gives some background information about different models of referring expressions,
information structure, Catalan syntactic and pragmatic structure and previous
claims in the literature regarding overt pronouns in Romance.
• Chapter 3 presents Experiments 1 and 2, which test the relationship between subject-
hood and subject pronouns with a questionnaire study and a self-paced reading study.
These experiments show that NSPs and OSPs have different types of biases: NSPs
have a subject preference and OSPs have an object preference.
• Chapter 4 gives an overview of game theory and its application to linguistics as it
pertains to the issue of referring expression choice and anaphora resolution, and the
application of these ideas to the Catalan data. I model the empirical data from Chap-
ter 3 as a game of partial information in which speaker and hearer interact to choose
a form and a meaning. I also present Experiment 3, which provides further evidence
for the game theoretical analysis previously offered. This experiment shows that
NSPs can go against their bias and refer to the object if contextual information has
been appropriately manipulated.
• Chapter 5 presents how pragmatic notions, such as topic and focus, interact with
the choice and interpretation of pronouns. Experiments 4 and 5 provide some insight
into this relationship. Experiment 4 shows that NSPs are sensitive to syntactic factors
(they have a subject preference) and that OSPs are sensitive to both syntactic and
pragmatic factors (they have an object preference only when the object is not the
topic). Experiment 5 shows that the referring preferences of OSPs change when the
pronoun acts as focus and that, in this situation, they are fully ambiguous.
• Chapter 6 investigates the differences among null-subject Romance varieties.
I review the main differences reported in sociolinguistic studies and present
two hypotheses about how to approach cross-dialectal differences. The first hypoth-
esis derives the different rates of overt pronouns from priming effects. The sec-
ond hypothesis derives the different rates of overt pronouns from the idea that some
varieties (Brazilian Portuguese and Caribbean Spanish) are undergoing a language
change process from being an NSL to being a non-NSL.
• Chapter 7 deals with the relationship between contrastivity and OSPs in Romance.
I argue that contrastive OSPs in Romance should be treated as Contrastive Topics.
I present an analysis of Contrastive Topics which combines insights from Büring
(2003), Hara and van Rooij (2007) and Tomioka (2008): contrastive OSPs convey an
‘uncertainty contrast’, which can be strengthened into an ‘exhaustive contrast’, under
certain circumstances. I argue that contrastive pronouns are not always mandatory,
but rather only when the utterance must be interpreted as answering a sub-question
of the Question Under Discussion.
• Chapter 8 concludes by reviewing the main findings reported in this thesis and exploring
open questions and avenues for future work.
Chapter 2
Background
This chapter presents some background information on the topics relevant to this thesis.
Section 2.1 gives an overview of linguistic and psycholinguistic proposals regarding choice
and processing of referring expressions. Section 2.2 reviews some information-structural
notions and Section 2.3 briefly presents the syntactic and pragmatic structure of Catalan.
Section 2.4 discusses several hypotheses regarding what triggers the presence of overt
pronouns, which I test or reformulate in Chapters 3, 5 and 7. Finally, Section 2.5 presents
a corpus of Catalan narrations that will be used at several points in this thesis, mainly to
estimate the probability of the different information states in our game theoretical models.
2.1 Choice of referring expressions and their processing
The choice and interpretation of referring expressions and the processing of pronouns have
been topics of interest in linguistics for a long time. I do not intend to present a complete
review of the different proposals in this section, but to give an overview of some influen-
tial ideas, which are relevant for my data and my analysis. I review the basic ideas behind
Accessibility Theory (Ariel, 2001), the Givenness Hierarchy (Gundel et al., 1993) and Cen-
tering Theory (Grosz et al., 1995). All these proposals share the idea that some notion of
salience or accessibility drives the choice and interpretation of referring expressions, but
differ in the mechanisms invoked.
2.1.1 Accessibility Theory (Ariel, 2001)
Accessibility Theory (Ariel, 2001) claims that referring expressions encode a specific de-
gree of mental accessibility. Thus, anaphoric expressions are seen as ‘accessibility markers’
in the following hierarchy:
(3) Accessibility marking scale, from low accessibility (left) to high accessibility (right):
Full name + modifier > full name > long definite description > short definite
description > last name > first name > distal demonstrative + modifier > proximate
demonstrative + modifier > distal demonstrative + NP > proximate demonstrative
+ NP > distal demonstrative (-NP) > proximate demonstrative (-NP) > stressed
pronoun + gesture > stressed pronoun > unstressed pronoun > cliticized pronoun
> verbal inflection > zero
According to this hierarchy, names and definite descriptions are low accessibility mark-
ers and, thus, can retrieve referents not salient in memory, while, for instance, pronouns are
high accessibility markers and thus, retrieve antecedents in the current focus of attention.
Ariel argues that three different criteria determine the association of a particular anaphoric
expression with a degree of accessibility: informativity, rigidity and degree of attenuation.
The more informative and rigid an anaphoric expression is, the better it is at referring to a
less accessible referent; the less informative and rigid it is, the better it is at retrieving
a highly accessible referent. Degree of attenuation refers to the amount of phonological
material a marker has. Ariel’s approach makes the prediction that there is an asymmetry
between NSPs and OSPs: NSPs are predicted to refer to more accessible referents than the
OSPs, since they mark a higher degree of accessibility on the scale.
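The predicted asymmetry can be read directly off the scale in (3). The following minimal Python sketch encodes the scale as an ordered list and compares two markers by position. The function names are invented for illustration, and the mapping of NSPs to "zero" and of OSPs to "unstressed pronoun" is an assumption made here for the sake of the example, not a claim made by Ariel.

```python
# Ariel's accessibility markers as listed in (3),
# ordered from low accessibility to high accessibility.
ACCESSIBILITY_SCALE = [
    "full name + modifier",
    "full name",
    "long definite description",
    "short definite description",
    "last name",
    "first name",
    "distal demonstrative + modifier",
    "proximate demonstrative + modifier",
    "distal demonstrative + NP",
    "proximate demonstrative + NP",
    "distal demonstrative (-NP)",
    "proximate demonstrative (-NP)",
    "stressed pronoun + gesture",
    "stressed pronoun",
    "unstressed pronoun",
    "cliticized pronoun",
    "verbal inflection",
    "zero",
]


def accessibility_rank(marker):
    """Position of a marker on the scale; a larger value means higher accessibility."""
    return ACCESSIBILITY_SCALE.index(marker)


def marks_higher_accessibility(marker_a, marker_b):
    """True if marker_a signals a more accessible referent than marker_b."""
    return accessibility_rank(marker_a) > accessibility_rank(marker_b)


# Assumed mapping for illustration: NSP -> "zero", OSP -> "unstressed pronoun".
nsp_outranks_osp = marks_higher_accessibility("zero", "unstressed pronoun")
```

On this encoding the comparison holds for any placement of OSPs among the pronoun entries, since "zero" is the last item on the scale.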
In order to determine the degree of accessibility of a discourse referent, Ariel proposes
that different factors interact: (1) salience (determined by many other factors: grammatical
function, high vs. low physical salience in the context, order of mention, definiteness
and quantificational status of the DPs), (2) competition (if there is competition between
potential antecedents for an anaphoric expression, these antecedents are less accessible than
if there is no competition), (3) distance (recently mentioned entities are more accessible
than remotely mentioned ones) and (4) unity (the greater the degree of cohesion between the
clause which contains the antecedent and the clause that contains the anaphor, the greater
the accessibility of the antecedent).
Ariel (1990) shows how her approach yields good predictions for corpus studies on
English and Hebrew. However, it is not clear how the criteria that she identifies for de-
termining the degree of accessibility of a discourse referent interact. Does competition
completely override salience? If so, we would expect the overt pronouns to be the pre-
ferred choice in cases of competition. However, experimental results show that this is not
always the case (see the experiments reported in Chapter 3). Also, some contexts require
an OSP (and do not allow for an NSP) regardless of the accessibility of the antecedent (see
Chapter 7).
The approach defended in this thesis uses Ariel’s idea that different referring expres-
sions have different degrees of accessibility, but derives this fact in a less stipulative way by
relating informativity and rigidity of a particular form to its corpus frequency in a particular
situation (cf. Chapters 4 and 5). In addition, my experiments support a view of salience as
a non-monolithic concept in which both syntactic and pragmatic factors play a role.
2.1.2 The Givenness Hierarchy (Gundel et al., 1993)
According to the Givenness Hierarchy proposed by Gundel et al. (1993), the choice of a
referring expression depends on the assumed cognitive status of the referent, that is, on
“assumptions that a cooperative speaker can reasonably make regarding the addressee’s
knowledge and attention state in the particular context in which the expression is used”
(Gundel et al., 1993, page 275). Thus, this proposal highlights the role of the speaker as a
rational agent who acts strategically and takes into account the addressee’s knowledge. The
game theoretical approach I present in this thesis also uses this idea but views both speaker
and hearer as rational agents who interact strategically, each taking the other into account
to make their own decisions.
Gundel et al. (1993) develop a hierarchy of cognitive states which correspond to dif-
ferent degrees of ‘givenness’. They propose the following hierarchy (the types of linguis-
tic referring expressions that correspond to each cognitive status in English are shown in
parentheses).
(4) In focus (it) > activated (that, this, this N) > familiar (that N) > uniquely identifi-
able (the N) > referential (indefinite this N) > type identifiable (a N).
The main difference between this approach and Ariel’s approach is that while Ariel’s
accessibility statuses are seen as mutually exclusive, Gundel et al.’s proposal is that they
are implicationally related, so that each cognitive status entails all that are below it on the
scale (to the right in 4). Thus, in principle, this predicts that a referent of a particular
givenness status may be referred to with a linguistic form associated with a status lower on
the scale: for example, a referent ‘in focus’ can be referred to with a definite article because
‘in focus’ entails ‘uniquely identifiable’. A corpus study of different languages shows that
this is sometimes the case: for instance, the definite article in English appears mostly in
the uniquely identifiable status, but also appears frequently in the statuses at the left, as
predicted. However, the indefinite article only appears with the two rightmost statuses
and not all the way to the left, as Gundel et al. would predict. They explain this data by
proposing that the correlation between linguistic form and givenness status is regulated by
the two opposing Gricean Quantity Maxims:
(5) a. Q1. Make your contribution as informative as required
b. Q2. Do not make your contribution more informative than is required
Q1 explains why the indefinite article is not used for statuses above referential or why
the in-focus referents are encoded mostly by the most restrictive forms (zero or unstressed
pronouns) depending on the language. Q2 explains why a definite article appears in sta-
tuses above uniquely identifiable (familiar or activated, for instance). However, such an
explanation could be applied to almost any distribution of the data and there does not seem
to be any principled reason why Q1 and Q2 should affect different referring expressions.
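The implicational reading of the hierarchy in 4 can be made concrete with a small sketch. The Python below is only an illustration under my own assumptions, not part of Gundel et al.'s proposal: the names HIERARCHY, entailed_statuses and licensed_forms are hypothetical, and the forms are the English ones listed in 4. It computes which forms entailment alone would allow for a given status, before the Quantity maxims restrict the choice.

```python
# Statuses of the hierarchy in (4), from most to least restrictive,
# paired with the English forms listed for each status.
HIERARCHY = [
    ("in focus", ["it"]),
    ("activated", ["that", "this", "this N"]),
    ("familiar", ["that N"]),
    ("uniquely identifiable", ["the N"]),
    ("referential", ["indefinite this N"]),
    ("type identifiable", ["a N"]),
]

STATUSES = [status for status, _ in HIERARCHY]
FORMS = dict(HIERARCHY)


def entailed_statuses(status):
    """Return the status itself plus every status to its right in (4)."""
    return STATUSES[STATUSES.index(status):]


def licensed_forms(status):
    """Forms that entailment alone allows for a referent with this status
    (hypothetical helper; the Quantity maxims are not modeled here)."""
    forms = []
    for entailed in entailed_statuses(status):
        forms.extend(FORMS[entailed])
    return forms
```

Under this sketch, licensed_forms("in focus") contains "the N", which matches the corpus finding for the definite article, but it also contains "a N", which is the overprediction that Q1 is invoked to rule out.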
Gundel et al. (1993) apply their proposal to one Romance null-subject language, Span-
ish. They include both pronouns (NSPs and OSPs) in the ‘in focus’ status. In their corpus
study, all NSPs refer to ‘in focus’ referents, while this is also the case for almost all OSPs,
except for three instances which are ‘activated’. As I show in the next chapters, Catalan
data does not seem to follow this pattern: NSPs and OSPs tend to select antecedents with
different degrees of accessibility; however, NSPs can select referents which are not ‘in fo-
cus’, but just ‘activated’ if there is enough contextual information (see Chapters 3 and 4).
Also, I argue in Chapter 7 that the use of an OSP to refer to the most activated referent
conveys an additional meaning, associated with Contrastive Topics, and it is not used in a
purely referential way.
2.1.3 Centering Theory
Centering Theory (Grosz et al., 1995) is a way of modeling attention during discourse
processing, a framework to theorize about local coherence, salience and choice of referring
expressions. Centering Theory (CT, henceforth) claims that some entities are more central
than others in a discourse and this affects the referring expressions speakers use to refer to
these entities. The basic units of analysis in CT are utterances. Every utterance evokes a
set of entities, called the forward-looking center (CF), which is partially ordered according
to some language-dependent ranking.
There are two special members in the CF of an utterance Ui :
• the preferred center (CP) of some utterance Ui is the highest ranked center of the CF
of Ui . It is predicted that Ui+1 will be about this entity.
• the backward-looking center (CB) of some utterance Ui is the highest-ranked center
of the CF of the previous utterance Ui−1 which is realized in Ui . That is, it is the
most central entity in the utterance, which connects the previous sentence with the
current sentence. Each utterance may have at most one CB, and it may be the case
that an utterance does not have a CB; this will happen if none of the entities of the
CF of Ui−1 are present in Ui .
Between any two utterances Ui and Ui+1 , there will be a transition, which can be clas-
sified into four types depending on the interaction between the CB and the CP of both
utterances. These transitions are illustrated in Table 2.1.
                      CB(Ui) = CP(Ui)    CB(Ui) ≠ CP(Ui)
CB(Ui) = CB(Ui−1)     Continue           Retain
CB(Ui) ≠ CB(Ui−1)     Smooth-Shift       Rough-Shift
Table 2.1: Transitions in Centering Theory
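The definitions of CP and CB and the classification in Table 2.1 can be sketched as a few short functions. This is a minimal illustration under my own assumptions: a CF is represented as a Python list of entity names that is already ranked, the function names are hypothetical, and the comparison of CBs uses strict equality, so the sketch leaves open how an undefined previous CB should be treated.

```python
def preferred_center(cf):
    """CP: the highest-ranked element of a CF list (None if the list is empty)."""
    return cf[0] if cf else None


def backward_center(cf_prev, cf_curr):
    """CB: the highest-ranked element of the previous CF that is realized
    in the current utterance (None if there is no such element)."""
    for entity in cf_prev:
        if entity in cf_curr:
            return entity
    return None


def transition(cb_prev, cb_curr, cp_curr):
    """Classify the transition into the current utterance as in Table 2.1."""
    if cb_curr is None:
        return None  # Table 2.1 does not cover utterances without a CB
    same_cb = cb_curr == cb_prev
    cb_is_cp = cb_curr == cp_curr
    if same_cb:
        return "Continue" if cb_is_cp else "Retain"
    return "Smooth-Shift" if cb_is_cp else "Rough-Shift"
```

For instance, if the previous utterance has the CF ["Maria", "Anna"] with Maria as its CB, and the current utterance has the CF ["Anna"], the CB and CP of the current utterance are both Anna and the transition is classified as a Smooth-Shift.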
Centering proposes two rules, which make the theory empirically testable. The first one
is a constraint on center realization and the second one is on center movement:
• Rule 1: If there is a pronoun in an utterance, its CB must also be realized as a
pronoun. This rule encodes the idea that, since the CB is the most salient entity,
it can be referred to in the most minimal way. Therefore, this rule predicts that if
only one pronoun is used in any utterance, it will refer to the CB.
• Rule 2: (Sequences of) Continues are preferred over retains, which are preferred over
smooth-shifts, which are preferred over rough-shifts.
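The two rules can be checked mechanically once the CB and the transitions are known. The sketch below is an illustration only: the function names and the representation (a set of pronominalized entities, transition labels as strings) are my own assumptions, and Rule 2 is simplified to a ranking over single transitions rather than over sequences.

```python
# Rule 2 preference order: lower numbers are preferred.
TRANSITION_RANK = {"Continue": 0, "Retain": 1, "Smooth-Shift": 2, "Rough-Shift": 3}


def satisfies_rule_1(pronominalized, cb):
    """Rule 1: if any entity is realized as a pronoun, the CB must be too.

    `pronominalized` is the set of entities realized as pronouns in the
    utterance; `cb` is its backward-looking center (or None).
    """
    if not pronominalized or cb is None:
        return True  # the rule has nothing to constrain
    return cb in pronominalized


def preferred_transition(candidates):
    """Rule 2 (simplified): pick the most preferred of the candidate transitions."""
    return min(candidates, key=TRANSITION_RANK.__getitem__)
```

An utterance whose only pronoun refers to an entity other than the CB fails satisfies_rule_1, which is the sense in which Rule 1 predicts that a single pronoun refers to the CB.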
Centering Theory as it stands does not make predictions about the variation between
NSPs and OSPs. However, CT has been applied to null subject languages, such as Turkish,
Japanese, Italian and Greek. These proposals are summarized and examined in Section 2.4.
2.2 Information structure
The choice of referring expressions and, thus, the use of pronouns contributes to the con-
struction of a certain information structure in discourse. In this section, I review some of
the main concepts in relation to this topic.
It has been noted for a long time that there is a distinction between the grammatical sub-
ject and predicate of a sentence and the subject-predicate structure of the meaning conveyed
by the sentence. The latter is the Information Structure or the Information Packaging of the
sentence. In 6a the grammatical subject and predicate coincide with the ‘logical’ subject
and predicate: it is predicated about Mary that she ate beans. However, they do not need to
coincide as 6b shows. In this case, the grammatical direct object is the ‘logical’ subject: it
is predicated about the beans, that Mary ate them.
(6) a. What did Mary eat?

Mary ate BEANS.
b. Who ate beans?
MARY ate beans.
The term information packaging was introduced by Chafe (1976) to refer to the level
of linguistic analysis that has “to do primarily with how the message is sent and only sec-
ondarily with the message itself” (Chafe, 1976, page 28). In Prince’s words (Prince, 1981,
pg. 224), “information-packaging in natural language reflects the sender’s hypothesis about
the receiver’s assumptions and beliefs and strategies”. Thus, in modeling the information
structure of a discourse, it is crucial to be able to talk about the common knowledge shared
by the speaker and the hearer and about the mutual assumptions about each other’s beliefs.
A traditional approach in information structure is to divide the sentence into topic, what
the sentence is about, and focus, what is predicated about the topic. This partition is often
associated with the division between given and new information in a sentence. However,
there is a great deal of terminological confusion regarding what it means for information
to be ‘new’ or ‘old’. Different terms have been used to refer to the same phenomena
or the same terms have been used to describe different facts. For instance, the partition
between new and old/given information has also been called focus-ground, focus-topic,
rheme-theme, etc. Section 2.2.1 discusses what it means for information to be new or to
be old. Section 2.2.2 presents Vallduvı́’s (1992) approach to information structure, which
explains how informational roles are mapped into syntactic positions in Catalan and makes
some predictions about the use of overt pronouns in discourse (see Section 2.4.2).
2.2.1 The Old and the New
Gundel and Fretheim (2001) distinguish between two types of givenness-newness: referen-
tial givenness-newness and relational givenness-newness. Referential givenness-newness
is a relationship between a linguistic expression and a corresponding non-linguistic en-
tity in the speaker or hearer’s discourse model. Some examples of referential givenness
concepts are the hearer-old/new and discourse-old/new statuses of Prince (1992) and the
cognitive statuses of Gundel et al. (1993). Relational givenness-newness is a partition of
the semantic representation of a sentence into two complementary parts, X and Y, where X
is what the sentence is about and Y is what is predicated about X. X is given in relation to
Y, and Y is new in relation to X. Thus, information structure is associated with relational
givenness-newness.
The two types of givenness-newness partitions are logically independent, as can be seen
in 7.
(7) Who called?

Pat said SHE called.
The pronoun in 7 is referentially given, since the intended referent is activated, hearer-
old, discourse-old, etc. At the same time, it is relationally new: it is the new information of
7. However, there seems to be a connection between topicality (relational givenness) and
some degree of referential givenness. For example, the phrase marked by a topic marker
in Japanese and Korean has a definite interpretation. However, the exact nature of this
association remains unclear. For example, Gundel (1985, 1988) proposes that the referents
of topics must already be familiar (the addressee must have an existing representation in
memory), but there appear to be counterexamples to this claim: Reinhart (1981) notes that
specific indefinites can appear in dislocated topic positions in certain contexts.
2.2.2 Vallduvı́’s (1992) tripartite approach
Vallduví (1992) views information packaging as the “structuring of sentences by syntactic,
prosodic or morphological means that arises from the need to meet the communicative
demands of a particular context or discourse” (Vallduvı́ and Engdahl, 1996).
He argues that the different bipartite articulations found in the literature (namely the
‘topic-comment’ approaches and the ‘ground-focus’ approaches) cannot capture all the
information distinctions present in a sentence.
Vallduvı́’s (1992) proposal is that sentences are divided into focus and ground, where
the ground is further divided into link and tail. Information packaging is seen as instructions
for the update of information. The focus is the actual update potential of the sentence, while
the ground indicates how the information update must take place. The link indicates where
the focus should go (in which file, following File Change Semantics (Heim, 1983)), and the
tail how the information must be updated. All sentences have a focus, while both elements
of the ground are optional. Thus, a sentence may present one of the following structures:
link-focus, link-focus-tail, all-focus and focus-tail. The four types are illustrated in 8.
(8) a. Link-focus
Tell me about the people in the White House. Is there anything I should know?
(The president Link) (hates CHOCOLATE Focus).
b. Link-focus-tail
And what about the president? How does he feel about chocolate?
(The president Link) (HATES Focus) (chocolate Tail).
c. All-focus
The president has a weakness.
(He hates CHOCOLATE Focus).
d. Focus-tail
You shouldn’t have brought chocolates for the president.
(He HATES Focus) (chocolate Tail).
Note that while a DP subject, such as the president, constitutes a link, a subject pronoun
in English does not and, according to Vallduvı́ (1992), both 8c and 8d are linkless sentences.
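The four structures in 8 follow from two facts: the focus is obligatory, and each of the two parts of the ground is optional. A minimal sketch, assuming a hypothetical Packaging record of my own (it is not Vallduví's formalization), makes the combinatorics explicit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packaging:
    """A sentence split into focus, link and tail (illustrative only)."""
    focus: str                  # every sentence has a focus
    link: Optional[str] = None  # optional part of the ground
    tail: Optional[str] = None  # optional part of the ground

    def structure(self) -> str:
        """Name the structure type, as in the four cases of (8)."""
        if self.link and self.tail:
            return "link-focus-tail"
        if self.link:
            return "link-focus"
        if self.tail:
            return "focus-tail"
        return "all-focus"
```

For example, Packaging(focus="He HATES", tail="chocolate").structure() gives "focus-tail", corresponding to 8d, where the subject pronoun is part of the focus rather than a link.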
In English, the different ground-focus partitions are usually encoded through stress.
The focus is associated with a pitch accent (H*, in Pierrehumbert (1980)). Links may be
marked syntactically (as in 9, where the knives is fronted) or intonationally (with a characteristic
L+H* accent). Subject links may be unmarked (without the L+H* accent), while con-
trastive links are always B accented (a combination of pitch accent plus a high boundary
tone, H*L-H%). Finally, tails are not marked in a particular way in English, apart from
being de-accented.
(9) Where can I find the cutlery?

The forks are in the CUPBOARD but the knives I left in the DRAWER.
In this thesis, I follow Vallduvı́’s approach and terminology: focus and link refer to
linguistic material that plays a particular informational structure role: they indicate the
update potential and the file where this update should be located, respectively. I use s-topic
to refer to the abstract file where the information is entered. Thus, a link is the linguistic
material that points to an abstract s-topic (see McNally (1998) for discussion about these
two notions). I depart from this terminology in Chapter 7: I use the term Contrastive
Topic (CT) to refer to a contrastive link because CT is the term mostly used in the current
literature. Finally, I use d-topic to refer to Discourse Topics: what the discourse or the
discourse segment is about (see Asher (2004)). It has been argued that d-topics are crucial
in explaining some discourse relations: for example, Asher and Lascarides (2003) require
that all elements belonging to the discourse relation Narration have a common d-topic.
As mentioned, links are an optional part of the information structure of a sentence
and, thus, linkless sentences do occur. Vallduvı́ (1992) distinguished two types of linkless
sentences: (i) sentences in which no particular file is relevant (such as presentational or
existential sentences, which are topicless) and (ii) sentences in which there is a relevant
file/topic, but it is not necessary that there be a linguistic link pointing to it, because it can
be inferred from context. Section 2.4.2 elaborates on the relationship between links and
subject pronouns.
The next section discusses how the information structure instructions identified by Vall-
duvı́ are mapped onto syntactic positions in Catalan.
2.3 Catalan syntactic and pragmatic structure
This section contains a brief review of Catalan syntactic and pragmatic structure. Catalan,
like other Romance null-subject languages, has a relatively free word order: the subject
can either be preverbal (10a) or postverbal (10b) and the arguments of the verb may be
dislocated to the left or to the right (10c and 10d). By contrast, the direct object must
precede any oblique or locative arguments (11).
(10) a. Els nens diuen moltes mentides.

the children say many lies
b. Diuen moltes mentides els nens.

say many lies the children
c. De mentides, els nens en diuen moltes.

of lies, the children part-pr say many
d. Els nens en diuen moltes, de mentides.

the children part-pr say many, of lies
‘Children lie a lot.’
(11) * Els nens fiquen al calaix la roba.

The children put in the drawer the clothes
‘Children put clothes in the drawer.’
Romance null-subject languages have received different analyses. These analyses dif-
fer mainly in how postverbal subjects are treated and in which order is considered to be
the base-generated one. It is beyond the scope of this thesis to provide a full review of
proposals, but I will briefly present some of them.
Catalan has been traditionally analyzed as an SVO language, together with other Ro-
mance NSLs. For Italian, Rizzi (1982) argues that postverbal subjects are generated through
a postposition rule, a rightward NP movement, by which the subject is adjoined to the VP
(see also Belletti and Rizzi (1981) and Burzio (1986) for similar views).
There is another line of analysis, in which postverbal subjects are analyzed as stay-
ing low in the clause, while the verb raises higher up. For instance, for Italian, Belletti
(2000) argues that postverbal subjects move from their VP-internal position to a higher
Focus Phrase immediately above the VP, while the verb raises higher up to the IP. This
analysis captures the fact that a postverbal subject conveys new, focal information. Bar-
bosa (2000) analyzes subjects in European Portuguese as having a thematic position to the
right of a raised V, while preverbal subjects are left-dislocated constituents (Alexiadou and
Anagnostopoulou (1998) present a very similar analysis for Romance NSLs).
Catalan has also been analyzed as a VOS language. Under this approach, all prever-
bal subjects (including all preverbal pronoun subjects) are taken to be instances of left-
dislocations. For example, Bonet (1990) proposes that all subjects in Catalan are base-
generated in the specifier of the VP to the right of its head and, thus, VOS is the base-
generated order, as shown in Figure 2.1 for the sentence in 12.
(12) Llegeix un llibre la Maria.

reads a book Mary
‘Mary reads a book.’
Vallduvı́ (1993) gives evidence for a VOS structure from an information packaging
point of view. His thesis is that, in Catalan, information packaging is conveyed by syntactic
means: i.e. there is a mapping between syntactic position and information-structure roles.
For verbal complements, there is a clear one-to-one mapping since there are three possible
information-packaging roles in his approach (link, focus and tail, as explained in 2.2.2)
and three possible surface syntactic slots in which verbal complements may appear: left-
detached, right-detached and in situ. Vallduví’s proposal is that focus stays in situ, while
links are left-dislocated and tails are right-dislocated.1
Figure 2.1: Catalan VOS order
Example 13 shows the three positions in which an object may appear: in situ, left-
dislocated or right-dislocated. Left and right-dislocations trigger the appearance of a coin-
dexed clitic attached to the verb.
(13) a. Ficarem el ganivet al calaix.

put the knife in the drawer
b. El ganivet, el ficarem al calaix.

the knife, CL put in the drawer
c. El ficarem al calaix, el ganivet.

CL put in the drawer, the knife
Under an SVO analysis, this one-to-one mapping cannot be maintained for
subjects. The reason is that, while there are three information-packaging roles, there would
potentially be four syntactic positions: left-detached, right-detached, preverbal and postver-
bal. By contrast, with a VOS analysis, the one-to-one mapping can be maintained, adding
the idea that all preverbal subjects are in fact left-detached subjects and that, therefore,
there are only three syntactic positions for subjects as well.
1 Although Vallduví supports a VOS analysis, the evidence he presents would also be compatible with
other analyses, such as Barbosa’s or Alexiadou and Anagnostopoulou’s, in which preverbal subjects
are treated as left-dislocations and the subject is base-generated in some position low in the clause.
As mentioned, subjects may appear right-detached, outside the main clause, or postver-
bally, inside the main clause. It is not possible to use the appearance of a clitic to dis-
tinguish these positions because Catalan does not have subject clitics. However, the two
structures show different intonation patterns: while postverbal subjects are within the scope
of intonational prominence, right-detached subjects are placed to the right of prominence
(this is often indicated in writing by placing a comma before the right-dislocated subject).
Moreover, while postverbal subjects must appear to the left of VP adjuncts and vocatives,
right-dislocated subjects may appear to their right. Thus, if a subject appears to the right
of an adjunct, it will be a right-dislocated subject and cannot be prominent, as the contrast
between 14a and 14b shows.
(14) a. Ha trucat a les VUIT, l’amo.

has called at eight, the boss
b. * Ha trucat a les vuit l’AMO.

has called at eight the boss
The contrast in 14 also shows that intonation in Catalan has a fixed invariable contour,
in which the intonational prominence falls on the clause-final position. If linguistic material
occurs to the right of the intonational peak, it must be clause-external, dislocated material.
Vallduvı́ (1993) shows that all preverbal subjects may be reanalyzed as left-detached ar-
guments, based on the following evidence: (1) preverbal Catalan subjects and left-detached
subjects are pragmatically interpreted in the same way, as links, and (2) preverbal subjects,
like left-detached complements, must appear to the left of wh-phrases and yes/no mor-
phemes as 15 shows. There is just one position for preverbal subjects in questions and this
is the position of left-detached constituents.
(15) a. L’amo que ha trucat?

the boss Q has called?
b. * Que l’amo ha trucat?
Q the boss has called?
‘Has the boss called?’
Thus, it is possible to maintain that subjects are base-generated in a postverbal position
and that all preverbal subjects are left-detached subjects. The greater frequency of
SVO order is due to the fact that subjects usually serve as links, and not as focal information
and, therefore, are typically left-dislocated. If this is correct, the mapping Vallduvı́ (1992,
1993) proposes can be maintained for all constituents: in Catalan, the focus remains in its
canonical position, while the ground is detached; links are left-detached and tails are right-
detached. In 16, the four informational structures are shown: in 16a link-focus, in 16b
link-focus-tail, in 16c all-focus and in 16d focus-tail. Note how all links are left-detached
(appear before the verb), while tails are right-detached.
(16) a. El president odia la xocolata.

the president hates the chocolate
b. El president, l’ odia, la xocolata.

the president CL hates the chocolate
c. Odia la xocolata el president.

hates the chocolate the president
d. L’ odia, la xocolata.
CL hates, the chocolate
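The mapping illustrated in 16 can be summarized as a lookup from surface position to informational role. The sketch below is my own illustration of that mapping, with hypothetical names; it assumes, following the VOS analysis, that preverbal subjects count as left-detached material.

```python
# One-to-one mapping between surface position and informational role
# in Catalan, following the proposal summarized above.
ROLE_BY_POSITION = {
    "left-detached": "link",
    "in situ": "focus",
    "right-detached": "tail",
}


def structure_from_positions(filled_positions):
    """Derive the information structure from the positions that contain
    material. The focus is always present, since every sentence has one."""
    roles = {ROLE_BY_POSITION[p] for p in filled_positions} | {"focus"}
    if roles == {"link", "focus", "tail"}:
        return "link-focus-tail"
    if roles == {"link", "focus"}:
        return "link-focus"
    if roles == {"focus", "tail"}:
        return "focus-tail"
    return "all-focus"
```

On this view, 16b fills all three positions and comes out as link-focus-tail, while 16c, with a postverbal subject and no detached material, comes out as all-focus.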
2.4 Some claims about overt pronouns in Romance Languages
In this section, I present the main proposals found in the literature regarding which factors
trigger the appearance of overt pronouns.
2.4.1 The Position of Antecedent Hypothesis (PAH)
Carminati (2002) proposes that the variation between NSPs and OSPs is regulated by the
Position of Antecedent Hypothesis (PAH). According to the PAH, within a sentence, null
and overt pronouns have different antecedent biases: null pronouns prefer to retrieve an
antecedent in the (highest) Spec IP, whereas overt pronouns prefer an antecedent in a
lower syntactic position. This hypothesis is in accordance with Ariel’s proposal that more
marked, informative forms tend to retrieve less salient antecedents, while unmarked, less
informative forms tend to retrieve more salient antecedents. In Chapter 3, I discuss several
experiments that test the PAH in different languages, including my own experiments for
Catalan.
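The antecedent biases stated by the PAH can be sketched as a simple function. This is only an illustration under my own assumptions: the position labels, the function name and the example names are hypothetical, and the PAH states a preference, not the categorical choice that the code returns.

```python
def pah_preferred_antecedent(pronoun_type, candidates):
    """Return the antecedent that the PAH predicts to be preferred.

    `candidates` maps a position label to an entity. The label "spec_ip"
    stands for the (highest) Spec IP; any other label is a lower position.
    """
    if pronoun_type == "null":
        return candidates.get("spec_ip")
    if pronoun_type == "overt":
        lower = [entity for pos, entity in candidates.items() if pos != "spec_ip"]
        return lower[0] if lower else None
    raise ValueError(f"unknown pronoun type: {pronoun_type!r}")
```

With the hypothetical candidates {"spec_ip": "Maria", "object": "Anna"}, a null pronoun is predicted to prefer Maria and an overt pronoun to prefer Anna.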
Other studies that consider switch/same reference (that is, reference to the previous
subject or not) as an important factor in the appearance of OSPs are a number of variation-
ist studies of several languages, including Cameron (1992) and Silva-Corvalán (1977) for
Spanish.
2.4.2 S-topic change
Vallduvı́ (1992) observes that weak and null proforms do not participate in the construc-
tion of Information Packaging instructions (see Section 2.2.2), while overt pronouns do
participate. Thus, his general hypothesis is that OSPs work towards constructing the in-
formation structure of the text. More specifically, preverbal subject pronouns are links,
which designate a specific file card where the information update is to be carried out. As
mentioned before, links are an optional part of the information structure of a sentence and,
thus, linkless sentences occur when (i) no particular file is relevant (such as presentational
or existential sentences) and (ii) when there is a relevant file/s-topic, but it need not be
mentioned, because it can be inferred from context.
The second case includes those pairs of sentences in which a sentence Sn shares its
abstract s-topic with Sn−1 . In this situation, Sn need not have a link; it may have an NSP. In
contrast, the use of a link in two adjacent sentences will imply a change of locus of update
from Sn−1 to Sn .
The second sentence in 17a is an example of a linkless sentence, which inherits the
s-topic (the update file card) from the previous sentence and, thus, the NSP is coreferential
with the previous subject. In contrast, the sentences in 17b are an example of links in two
adjacent sentences, which implies that there is a change of locus of update, from Maria
in S1 to Anna in S2 . Therefore, the overt pronoun is coreferential with the previous object
and indicates a change of locus of update, or s-topic.
(17) a. La Maria va insultar l’Anna i li va fotre una hòstia.
‘Mary insulted Anna and [null] hit her.’
b. La Maria va insultar l’Anna i ella li va fotre una hòstia.
‘Mary insulted Anna and she hit her.’
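The contrast in 17 amounts to a simple update rule for the locus of update. The sketch below is a deliberately minimal illustration with a hypothetical function name; it encodes only the idea that a linkless sentence inherits the s-topic while a link points to the file to be updated.

```python
def next_s_topic(current_s_topic, link_referent=None):
    """Return the s-topic (locus of update) of sentence S_n.

    A linkless sentence (for example, one with an NSP) inherits the
    s-topic of S_{n-1}; a sentence with a link, such as a preverbal
    overt subject pronoun, updates the file the link points to.
    """
    return current_s_topic if link_referent is None else link_referent
```

For 17a, next_s_topic("Maria") stays with Maria; for 17b, next_s_topic("Maria", link_referent="Anna") moves the locus of update to Anna.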
Note that, for the examples in 17, this approach and the PAH make the same predictions.
However, this is not always the case. The cases covered by the PAH do not completely
overlap with the cases covered by Vallduvı́’s approach. The former in principle only covers
intrasentential cases, while the latter covers both within- and across-sentence cases (see
experiments 1 and 2 for evidence that the PAH also holds intersententially). Moreover,
these two approaches also yield different predictions when the subject does not act as a
link. In Chapter 5, I show that the pragmatic structure of the sentence has an effect on
the interpretation of some pronouns, but not in the straightforward manner proposed by
Vallduvı́.
Applications of Centering Theory (Grosz et al., 1995) to null-subject languages im-
plement a similar idea to Vallduví’s (see Turan (1995) for Turkish, Kameyama (1985) for
Japanese and Di Eugenio (1998) for Italian). For example, Di Eugenio (1998) claims that
NSPs are used when the center transition between the two sentences is a continue and OSPs
are used when the center transition is a retain or a shift (that is, when the center of attention
is not the one expected, given the previous sentence). Analyzing data from Turkish, Tu-
ran (1995) concludes that the salience of referents should be computed according to their
thematic role, based on the observation that the objects of some psychological verbs rank
higher than the subjects and are the preferred antecedent for NSPs. Following the same
line as Di Eugenio, Turan also reports a connection between continue transitions and zero
subjects in Turkish.
Dimitriadis (1996) accounts for the overt/null pronoun variation in Greek in a slightly
different way: his proposal is that an OSP in Greek should not be construed as the CP of
the previous utterance. Thus, the overt pronoun will ‘skip’ the first element in the CF list of
the previous utterance. Since the CP is the subject of the utterance,2 this proposal amounts
to claiming that an overt subject cannot refer to a previous subject. Therefore, this proposal
may be more in line with Carminati’s proposal, although it is hard to say so conclusively
because he does not explicitly discuss the role of s-topics and links.
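The Centering-based proposals reviewed above can be contrasted in a short sketch. The code is an illustration under my own assumptions, not a reconstruction of any of these authors' implementations: the function names are hypothetical, entities are given as (name, grammatical function) pairs, and the ranking is the one attributed to Dimitriadis (1996) in footnote 2.

```python
# Ranking of grammatical functions assumed for the CF list (footnote 2).
RANKING = ["SUBJECT", "OBJECT2", "OBJECT", "OTHER"]


def cf_list(entities):
    """Order (entity, grammatical function) pairs into a ranked CF list."""
    ranked = sorted(entities, key=lambda pair: RANKING.index(pair[1]))
    return [entity for entity, _ in ranked]


def di_eugenio_pronoun(transition):
    """NSP for a continue; OSP for a retain or a shift."""
    return "NSP" if transition == "Continue" else "OSP"


def dimitriadis_overt_candidates(cf_prev):
    """Possible referents of an overt subject pronoun: the previous CF
    list without its first element (the CP)."""
    return cf_prev[1:]
```

With the hypothetical input [("Anna", "OBJECT"), ("Maria", "SUBJECT")], cf_list returns ["Maria", "Anna"], and dimitriadis_overt_candidates then leaves only Anna, which is the sense in which an overt subject is predicted not to refer to the previous subject.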
Samek-Lodovici (1996) also argues that s-Topic Continuation/Change is responsible for
the distribution of null and overt pronouns in Italian. He models this situation using Opti-
mality Theory, by positing a higher-ranked DropTopic constraint, which outranks Subject
and Parse constraints (see 5.1.1 for more details on Samek-Lodovici’s work).
2 The ranking Dimitriadis (1996) assumes is SUBJECT > OBJECT2 > OBJECT > OTHER.
2.4.3 Contrast
Several researchers have pursued the idea that the appearance of OSPs in null-subject lan-
guages is related to the expression of contrast. This is actually a recurrent idea in traditional
grammars of Spanish and Catalan. However, although the role of contrast is recognized,
these grammars do not attempt to define what contrast means and remain quite vague about
the contrastive import of pronouns. For instance, the official Spanish grammar by the Real
Academia Española (Alarcos Llorach, 1994) says that overt pronouns “tienen marcado
carácter enfático y expresivo y trata de contraponer la persona aludida a las otras” (“have
a marked emphatic and expressive character and [they] contrast the alluded person to the
others”), without defining exactly what ‘contrast’ means or who the referent is con-
trasted with. Similarly vague claims are found in Catalan grammars: “subject pronouns
may be left unexpressed. In fact, they are usually left unexpressed, except for emphasis
or contrast” (Hualde, 1992) or “Catalan is characterized, like Spanish, Italian, and Por-
tuguese, but unlike French, by the way in which subject pronouns accompany verbs only
for particular emphasis.” (Wheeler, 1988).
Luján (1985, 1999) argues that OSPs in Spanish convey emphasis and are contrastive
in those contexts in which they are optional (subject and object positions). The contrast can
be understood with respect to the pronoun alone or it can be wider, with respect to both the
pronoun and the predication.
(18) a. Tú trabajas demasiado, no otro.3
You work too much, not someone else.
b. Tú trabajas demasiado, ellos te pagan poco.
You work too much, they pay you little.
3 In fact, example 18a illustrates a focal subject pronoun, rather than a purely contrastive pronoun. See
section 2.4.4 for comments on the relationship between focus and subject pronouns.
Luján (1985) argues that OSPs in Romance correspond to stressed pronouns in English
and proposes the following generalization:
(19) null pronoun in Italian/Spanish → unstressed pronoun in English
overt pronoun in Italian/Spanish → stressed pronoun in English
Her evidence for establishing this correspondence is that OSPs in Spanish and stressed
pronouns in English cannot precede their antecedents, as in 20, and do not allow for a
sloppy reading in elliptical contexts, as in 21, while the opposite is true with NSPs in
Spanish and unstressed pronouns in English.4
(20) a. Cuando él_i/*j come, Pedro_j no fuma.
b. When HE_i/*j eats, Pedro_j does not smoke.
(21) a. Marcos cree que él ha aprobado el examen y Ana también.
b. Mark thinks HE has passed the exam, and so does Anne.
However, as Carminati (2002) argues, this proposal cannot be complete, since there
are also stressed pronouns in Romance and there are cases in which overt pronouns in
Romance are not contrastive and would not be translated by a stressed pronoun in English.
Furthermore, sentences with a double contrast, which do require OSPs in Romance, do not
require stressed pronouns in English (see 22).
(22) a. Jo vaig anar a la festa i tu vas quedar-te a casa.
b. I went to the party and you stayed at home.
4 See Hirschberg and Ward (1991) for an experimental study that casts doubt on these intuitions.
Brunetti (2006) also argues that subjects may trigger a contrastive interpretation in some
contexts. She follows Vallduví’s account of information structure and agrees that the pres-
ence of a link is used to indicate an s-topic shift, while an NSP represents a continuous
s-topic. She also notes, however, that in some cases a link appears to refer to a continuous
s-topic, as example 23 shows for Italian. In these cases, a special contrastive interpretation
arises: the hearer expects the speaker to say something about other friends/relatives
to whom he will give (or not give) presents.5
(23) a. A Dante, che cosa (gli) regalerai?

To Dante, what thing (IO-CL) give?
‘What will you give to Dante (as a present)?’
b. A Dante (gli) regalerò un LIBRO.
‘To Dante, I’ll give a book.’
Brunetti’s (2006) proposal is that this unexpected use of a link legitimates a contrastive
interpretation. The question in 23 is about Dante. Thus, one expects an answer about
Dante with a continuous s-topic and, thus, an NSP. So, if the speaker decides to utter a
sentence with a link, he/she does so to evoke the alternatives to that link and this is how the
contrastive interpretation arises.
Her proposal is that a link implies the existence of alternatives: selecting an address
from the knowledge store always implies choosing among potentially different addresses
that may be relevant in the context. But the relevance of alternatives varies according to
the context. If there is a change of s-topic, the alternatives are not relevant, and not evoked.
In cases like 23b, the speaker wants to contrast the current s-topic with other entities, and
does so by once again using a link to refer to that s-topic. The s-topic is sorted again, which
makes the alternatives relevant and triggers a contrastive interpretation. Summarizing
Brunetti (2006), we may expect to find pronoun links in two contexts: (1) to introduce a
shift and (2) to introduce a contrastive interpretation (if there is link repetition).
5 Brunetti (2006) deals with left-dislocated verbal constituents. If we stick to the hypothesis that preverbal
subjects are left-dislocations, the same effect should hold for subjects.
Cameron (1992) conducted a variationist study of the expression of subject in differ-
ent Spanish dialects. He did not argue for contrast as the main factor which affects the
presence/absence of OSPs, but it is interesting to note that he excluded from the envelope
of variation cases which he counted as contrast, given that there was no variation in such
cases;6 that is, the OSP is taken to be obligatory (see Todolí (2002) for the same insights
regarding Catalan data). He distinguished three different types of contrast.
• Contrast of Negation: the same predicate (or two similar predicates) occurs in two
sentences, but it is negated in the second one:
(24) Ellos fueron pero yo no fui.
They went but I did not go.
• Contrast of Scalar Opposition: there are two similar predicates, which are modified
by adjuncts which are construable as elements of a scalar set, such that the two ad-
juncts differ by degree.
(25) Mi señora habla bien inglés pero yo lo hablo bastante mal.
My wife speaks English well but I speak it very brokenly.
• Contrast of Alternatives: this type occurs when object arguments of the first and
second sentences are construable as elements of a set and understood as alternatives
to one another.
(26) Yo fuı́ a una escuela y él fue a otra.
I went to a school and he went to another one.
In Chapter 7, I review several approaches to contrast and argue for a unitary analysis
of all pronouns conveying contrast as marking a Contrastive Topic. I also argue that not all
contrastive pronouns are obligatory: they are obligatory only when they appear in an utterance
which is an answer to a subquestion of the Question Under Discussion.
6
This is not always the case. See Section 7.1 for more discussion regarding this issue.
2.4.4 Focal information
Vallduvı́ (2002) notes that OSPs are mandatory in cases in which they represent focal infor-
mation (in clefts, answers to wh-questions, comparative constructions, focus constructions,
constructions with an elliptical verb, etc.). This naturally follows from the fact that focal
information is always placed at the end of the main clause in Catalan, which is where the
main pitch of the sentence is located. Thus, focal information in Catalan always receives
the main pitch of the utterance. Since null pronouns cannot be stressed, an overt pronoun
is required to express the focus and receive the main pitch. This can clearly be seen in 27:
the subject pronoun, which represents focal information and is accordingly
postverbal, cannot be omitted, although the verb contains all the necessary agreement infor-
mation to retrieve the antecedent of the pronoun. The main stress needs to fall on the focus
and that’s why the pronoun must be pronounced. The answer in 27b is not appropriate in
this context because the predicate receives the main pitch and, thus, is marked as focus.
(27) a. Qui et va veure?

Who saw you?
b. * Em vas veure.
me past see
c. Em vas veure tu.

me past see you
‘You saw me.’
Brucart (1987) has a similar insight and proposes the Principle of Lexicalization of Pro-
nouns, which says that those pronouns which contribute new information to the discourse
must have phonetic realization.
Chapter 5 shows that focal pronouns and overt non-focal pronouns do not share the
same referring preferences and it derives the contrast in 27 from a game theoretical per-
spective.
2.4.5 Rigau’s (1989) approach
Rigau (1989) has a more complex account of the distribution of NSPs and OSPs in Catalan,
which combines several of the factors mentioned so far. She distinguishes between two
types of OSPs in Catalan: a plain overt pronoun and a stressed overt pronoun.
(28) a. Jo vull venir.
b. JO vull venir.
‘I want to come.’
Following Kuno (1972), her proposal is that an unstressed OSP triggers an exhaustive
listing interpretation, while stressed ones trigger a contrastive focus interpretation. Rigau
(1989) assumes that the two readings are variants of the same emphatic operator. The ex-
haustive listing interpretation could be paraphrased as ‘Among the people under discussion,
only A wants to come’. The contrastive focus interpretation conveys the negation of some
alternative and can be paraphrased as ‘as for A (A = 1st person in 28), but not for X, A
wants to come’.7
(29) a. Qui vol venir, tu o en Joan?
‘Who wants to come, you or John?’
b. *JO vull venir... en Joan, no ho sé.
c. Jo vull venir... en Joan, no ho sé.
‘I want to come... John, I don’t know.’
According to Rigau, the contrastive focus interpretation is not possible in 29b because
it amounts to saying ‘It is not John who wants to come’, which is contradictory with the
second part of the utterance. However, I don’t see why answer 29c should be acceptable if
7
Rigau ignores the possibility of placing the pronoun in a postverbal position.
it conveys an exhaustive listing interpretation, given the fact that this interpretation amounts
to saying ‘only I want to come’ and thus should also be in contradiction with the second
part of the utterance. See Chapter 7 for a review of several notions of contrast found in the
literature and for arguments that the readings Rigau identified should be relabeled.
Apart from noting the contrastive nature of pronouns, Rigau attempts to offer an account
of the distribution of Catalan pronouns in discourse. Her proposal is that once a
discourse element becomes a discourse-topic, it is represented by a pronoun in Catalan.
If the discourse topic is the subject of a sentence, an NSP must appear except under the
following circumstances:
1. When there is another possible antecedent for the subject pronoun. Thus, she claims
that whenever there is some ambiguity an overt pronoun is always preferred.
2. When the subject of the sentence is used to recover a discourse-topic, which has been
abandoned.
3. When the position occupied by the empty pronoun receives an emphatic interpretation
(either exhaustive listing or contrastive focus interpretation).
Chapters 3 and 5 show that it is not the case that OSPs are always preferred when-
ever there is some ambiguity and Chapter 7 argues for a simpler approach to the so-called
emphatic interpretation of pronouns.
2.4.6 Summary of proposals
Below is a summary of the main claims presented in this section, accompanied by an ex-
ample showing how each accounts for the presence of OSPs.
• PAH: NSPs prefer an antecedent in the highest Spec IP, whereas OSPs prefer an
antecedent in a lower syntactic position.
(30) La Marta escrivia sovint a la Raquel_i quan ella_i era als Estats Units.
“Marta wrote frequently to Raquel_i when she_i was in the United States.”
• Topic Change: OSPs are used to change the locus of update of information of a
sentence.
(31) La Maria_i va insultar l’Anna_j. Ella_j li_i va fotre una hòstia.
‘Maria_i insulted Anna_j. She_j hit her_i.’
• Contrast:
– Contrastive focus: a stressed OSP has a contrastive focus interpretation.
b. JO vull venir.
‘I want to come.’
– Exhaustive listing: an unstressed OSP has an exhaustive listing interpretation.
b. Jo vull venir... en Joan, no ho sé.
‘I want to come.. John, I don’t know.’
– Implicit contrast: an unexpected repetition of a previous link triggers an implicit
contrast with other alternatives. (This example has been adapted from Brunetti
(2006), so that a pronoun appears in subject position.)
(34) a. El Dante t’ha regalat alguna cosa?
‘Did Dante give you anything (as a present)?’
b. Sı́, ell m’ha regalat un llibre.
‘Yes, he gave me a book.’
• Focus information: the overt pronoun is necessary when it represents focal informa-
tion.
(35) a. Qui et va veure?
‘Who saw you?’
b. Em vas veure tu.

Me saw you
‘You saw me.’
2.5 Corpus data
Most examples I use in this thesis are naturally-occurring examples taken from a corpus of
oral narrations. This corpus was collected within the Nocando Project (2004), which aimed
to study noncanonical constructions in different languages. As part of this project, Catalan
speakers were asked to narrate stories presented to them with illustrations only. There were
three different stories and each story was told by nineteen speakers. The narrations were
recorded and transcribed.
The game theoretical approach I will present in Chapter 4 makes crucial use of proba-
bilities that speaker and listener estimate about different situations (for instance, the proba-
bility that the current subject refers to the previous subject, etc.). In this thesis, probabilities
are approximated by means of corpus counts. That is, the counts we can find in a corpus are
taken to be an approximation of the probabilities speakers and hearers assign to different
situations. In this section, I present several counts that will be used later in the analysis.
All the counts were obtained manually from the transcriptions of the narrations. I present
one example of each type, with the relevant subject indicated in boldface. The corpus consisted
of 5473 clauses with a finite verb. The counts mostly refer to the behavior of the subjects
in the corpus.
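The approximation of probabilities by corpus counts described above can be sketched as follows. This is only a minimal illustration, not the procedure actually used for the analysis: the function name and the outcome labels are hypothetical, and the counts are those reported in this section for focused subjects (27 instances).

```python
from fractions import Fraction


def relative_frequencies(counts):
    """Turn raw corpus counts into relative frequencies that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot estimate probabilities from zero observations")
    return {outcome: Fraction(n, total) for outcome, n in counts.items()}


# Counts reported in this section for focused subjects (27 instances).
# The dictionary keys are illustrative labels, not the thesis's terminology.
focused_subject_refers_to = {
    "previous_subject": 4,
    "other_position": 2,
    "not_in_previous_utterance": 21,
}

probs = relative_frequencies(focused_subject_refers_to)
for outcome, p in probs.items():
    print(f"{outcome}: {float(p):.3f}")
```

Exact fractions are used so that the estimates sum to exactly 1; with small samples such as this one, the resulting figures should be read as rough approximations.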
• Subject of the current utterance refers to:
– The subject of the previous utterance: 41%
(36) Llavors el gat_i salta i, doncs, [null]_i vol caçar la granota.
Then, the cat_i jumps and, well, [null]_i wants to hunt the frog.
– Some other antecedent (not in subject position) of the previous utterance: 11%
(37) Llavors el gat salta i, doncs, vol caçar la granota_j, però la granota_j
s’agafa al biberó.
Then, the cat jumps and, well, [null] wants to hunt the frog_j, but the
frog_j holds on to the baby bottle.
– Antecedent not present in the previous utterance: 48%
(38) El gat estava asseguda en un banc i el nen s’estava mirant un vaixell.
The cat was sitting on a bench and the child was looking at a boat.
• Subjects whose antecedent is in the previous sentence refer to:
– Previous subject: 79%
– Other: 21%
• In utterances in which the subject is not the link of the sentence, this non-link subject
is a:
– Focused subject (in a cleft or with a focal particle): 27 instances, 4 of which
refer to a previous subject, 2 to a referent in a different position and 21 to an
antecedent not mentioned in the previous utterance.
(39) La granota_i s’ha posat al davant i és ella_i qui està a punt de prendre’s el
biberó.
The frog_i is now at the front and she_i is the one who’s drinking from
the bottle.
– Postverbal subject8: 64 instances, 8 of which refer to a previous subject, 9 to
a referent in a different position and 47 to an antecedent not mentioned in the
previous utterance.
(40) Les granotes_i miren el gat de reüll. I mentre estan pujades les dues
granotetes_i a sobre de la tortuga, (...)
The frogs_i are sneaking a look at the cat. And while the two little frogs_i
are on top of the turtle, (...)
• I again consider utterances in which the subject is not the link of the sentence. In
particular, three different constructions are considered (left-dislocations, focused
subjects and postverbal subjects), and the counts indicate what the subject of the next
utterance refers to. That is, I examine the effects of non-link subjects on
subsequent discourse.
– Left-dislocations: 10 instances, in 2 of which the next subject refers to the previous
link, 2 to the previous non-link subject, 2 to another non-link constituent,
4 to an antecedent not mentioned in the previous utterance.
8
I exclude subjects of unaccusative verbs, which tend to appear postverbally by default.
(41) L’amanida, [null] la va servir a una dona_i, molt guapa ella, molt ben
vestida i la dona_i va començar a menjar.
The salad, [he] served it to a lady_i, very beautiful, very elegant and the
lady_i started eating.
– Focused subjects (in a cleft or with a focal particle): 27 instances, in 9 of which
the next subject refers to a previous subject, 3 to a referent in a different position
and 15 to an antecedent not mentioned in the previous utterance.
(42) La granota s’ha posat al davant i és ella qui està a punt de prendre’s el
biberó. En canvi, el gat sı́ que l’ha vista.
The frog is now at the front and she is the one who’s drinking from the
bottle. In contrast, the cat did see her.
– Postverbal subjects9: 64 instances, in 22 of which the next subject refers to a
previous subject, 4 to a referent in a different position and 38 to an antecedent
not mentioned in the previous utterance.
(43) I mentre estan pujades les dues granotetes a sobre de la tortuga_j, perquè
[null]_j les porti, (...)
And while the two little frogs are on top of the turtle_j, so that (it)_j
carries them, (...)
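The two sets of percentages given at the beginning of this list appear to be related by simple conditioning: restricting attention to subjects whose antecedent is in the previous utterance and renormalizing. The sketch below checks this arithmetic. It works from the rounded percentages (41%, 11%, 48%) rather than the raw counts, which are not given here, and the function and label names are illustrative assumptions.

```python
def condition_on(distribution, kept_outcomes):
    """Renormalize a distribution over the subset of outcomes that are kept."""
    mass = sum(distribution[o] for o in kept_outcomes)
    if mass == 0:
        raise ValueError("kept outcomes have no probability mass")
    return {o: distribution[o] / mass for o in kept_outcomes}


# Rounded percentages reported for what the current subject refers to.
subject_refers_to = {
    "previous_subject": 41,
    "previous_non_subject": 11,
    "not_in_previous_utterance": 48,
}

# Keep only the cases where the antecedent is in the previous utterance.
given_previous = condition_on(
    subject_refers_to, ["previous_subject", "previous_non_subject"]
)
rounded = {o: round(100 * p) for o, p in given_previous.items()}
print(rounded)  # {'previous_subject': 79, 'previous_non_subject': 21}
```

The result matches the 79% / 21% split reported above for subjects whose antecedent is in the previous sentence.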
9
Again, excluding unaccusative verbs
Chapter 3
Subjecthood and pronouns: The Position of Antecedent Hypothesis
This chapter investigates the relationship between pronouns in Catalan and syntactic position.
This relationship was first studied experimentally by Carminati (2002) to explain the
variation between NSPs and OSPs in Italian. Her proposal is that this variation is regulated
by the Position of Antecedent Hypothesis:
(44) Position of Antecedent Hypothesis: NSPs prefer to retrieve an antecedent in the
(highest) Spec(IP), whereas OSPs prefer an antecedent in a lower syntactic position.
Thus, for Carminati (2002) NSPs and OSPs have different functions, given that they
have different antecedent biases, based on syntactic position. Subject position is thought
to host more salient antecedents than object position. If this is so, the PAH is compatible
with Ariel’s Accessibility Theory: more reduced forms tend to refer to more accessible
antecedents (salience being one of the factors that compose accessibility) than less reduced
forms. In her work, Carminati is concerned with intrasentential anaphora and her hypoth-
esis is that this kind of anaphor has access to the syntactic representation. As for intersen-
tential anaphora, she basically remains agnostic about whether this hypothesis also holds.
My goal is to show that it does indeed hold for Catalan across sentences.
In this chapter, I first review Carminati’s experiments in some detail in Section 3.1. I
also report results for intersentential anaphora in Spanish (Section 3.2), which show partial
support for the PAH. Finally, I present my own experiments for Catalan in Section 3.3,
which show that PAH holds in Catalan even across sentences.
3.1 Italian pronouns: Carminati (2002)
Carminati (2002) tested the PAH in a series of off-line and on-line experiments investi-
gating a variety of antecedents standardly assumed to occupy the subject position in the
syntactic structure. Overall, her findings supported the Position of Antecedent Hypothesis,
as opposed to other hypotheses, such as hypotheses based on an economy principle (gen-
erally favoring NSPs), or those based on avoidance of ambiguity (favoring OSPs, since
they carry gender information and, therefore, could disambiguate some cases). Carminati
(2002) used different methods in her work: self-paced reading tasks, questionnaires, and
correction tasks. I summarize here two of her experiments, which I replicate for Catalan:1
3.1.1 Experiment 1: questionnaire with non-biased sentences
This experiment tested the PAH with regard to intra-sentential coreference, in complex
sentences consisting of a main clause followed by a subordinate clause. The main clause
introduces two individuals by means of two proper names of the same grammatical gender,
one in subject position and the other in object position. The subordinate clause, which starts
1
For ease of reference and presentation I have changed Carminati’s original experiment numbers: what I
call Carminati’s Experiment 1 is her Experiment 2, and what I call Carminati’s Experiment 2 is her Experi-
ment 1.
with either an NSP or an OSP, is not pragmatically biased and, in principle, can refer either
to the previous subject or the previous object. This study involved a questionnaire, in which
after reading sentences with NSPs or OSPs (such as 45a and 45b respectively), subjects had
to choose their preferred interpretation for the pronoun, by answering a question like the
one in 45c :
(45) a. Null Pronoun

Marta scriveva frequentemente a Piera quando ∅2 era negli Stati Uniti.
“Marta wrote frequently to Piera when ∅ was in the United States.”
b. Overt Pronoun
Marta scriveva frequentemente a Piera quando lei era negli Stati Uniti.
“Marta wrote frequently to Piera when she was in the United States.”
c. Who was in the States?
The materials of the experiment consisted of eighteen experimental items which were
counterbalanced and randomized across two presentation lists. Forty-four participants took
part in this experiment. The results in raw percentages can be seen in Table 3.1.
                 subject antecedent (%)   object antecedent (%)
null pronoun             80.7                     19.3
overt pronoun            16.7                     83.3
Table 3.1: Results for Experiment 1 in Carminati (2002)
Thus, there was a strong preference to interpret null pronouns as having subject an-
tecedents and overt pronouns as having object antecedents.
A one-way ANOVA of the difference in choosing the subject antecedent between the
null vs. overt pronoun conditions was performed with subjects and items as random effects.
The difference in choosing the subject antecedent between the two conditions (80.7% vs 16.7%)
2
I use the empty set to represent the null pronoun in examples.
was statistically significant (F1(1,43) = 161.64, p < .001; F2(1,17) = 286.14, p < .001). The
difference between the preferred antecedent choices of the null and the overt pronoun (80.7%
subject choices vs 83.3% object choices) was not significant.
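For readers unfamiliar with this kind of analysis, the following is a minimal sketch of how such an F statistic can be computed when the factor has only two levels (null vs. overt pronoun). It assumes the usual psycholinguistic convention that F1 is computed over per-participant means and F2 over per-item means; this matches the reported degrees of freedom (44 participants give (1,43), 18 items give (1,17)), but the original analysis may have been set up differently. The numbers below are hypothetical toy values, not Carminati's data.

```python
from math import sqrt
from statistics import mean, stdev


def within_f_two_levels(cond_a, cond_b):
    """F statistic for a two-level repeated-measures factor.

    With only two levels, the repeated-measures F equals the square of the
    paired t statistic, with (1, n - 1) degrees of freedom.
    """
    if len(cond_a) != len(cond_b) or len(cond_a) < 2:
        raise ValueError("need paired observations from at least two units")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; F is undefined")
    t = mean(diffs) / (sd / sqrt(len(diffs)))
    return t * t, (1, len(diffs) - 1)


# Hypothetical toy data: proportion of subject-antecedent choices
# per participant in each condition (NOT the study's data).
null_by_subject = [0.9, 0.8, 0.7, 1.0, 0.6]
overt_by_subject = [0.2, 0.1, 0.3, 0.2, 0.1]

f1, df1 = within_f_two_levels(null_by_subject, overt_by_subject)
print(f"F1{df1} = {f1:.2f}")
```

The same function gives F2 when it is fed one mean per item instead of one mean per participant.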
The results of this experiment speak in favor of the PAH. However, the experiment
crucially hinges on the assumption that the sentences are neutral (i.e. it must be equally
plausible that the pronouns refer to the previous object or to the previous subject). This
assumption is dropped in Experiment 2, where the sentences are biased and what is measured
are reading times. In addition, participants are not directly asked for their judgments;
rather, reading times provide a way to estimate ease or difficulty of processing.
3.1.2 Experiment 2: self-paced reading experiment
This experiment tested the PAH with regard to intra-sentential coreference in complex sen-
tences consisting of a subordinate clause followed by a main clause. The subordinate clause
introduces two individuals by means of two proper names of the same grammatical gender,
one in subject position and the other in object position. The main clause, which starts with
either an NSP or an OSP, is pragmatically biased to refer to one of the two referents in the
preceding subordinate clause.
(46) a. Condition 1: Subject Bias + Null Pronoun

Dopo che Giovanni ha messo in imbarazzo Giorgio di fronte a tutti, ∅ si è
scusato ripetutamente.
“After G. embarrassed G. in front of everyone, ∅ apologized repeatedly.”
b. Condition 2: Subject Bias + Overt Pronoun

Dopo che Giovanni ha messo in imbarazzo Giorgio di fronte a tutti, lui si è
scusato ripetutamente.
“After G. embarrassed G. in front of everyone, he apologized repeatedly.”
c. Condition 3: Object Bias + Null Pronoun
Dopo che Giovanni ha messo in imbarazzo Giorgio di fronte a tutti, ∅ si è offeso
tremendamente.
“After G. embarrassed G. in front of everyone, ∅ was very offended.”
d. Condition 4: Object Bias + Overt Pronoun

Dopo che Giovanni ha messo in imbarazzo Giorgio di fronte a tutti, lui si è
offeso tremendamente.
“After G. embarrassed G. in front of everyone, he was very offended.”
The materials of the experiment consisted of sixteen experimental items which were
counterbalanced and randomized across two presentation lists. Forty participants took part
in this experiment. Comprehension questions probing the resolution of the pronoun were
asked after seven of the items. The results can be seen in Table 3.2, where the ‘% correct’
column contains the percentage of answers in which subjects understood the pronoun as
referring to the pragmatically-biased antecedent.
Condition                   Main clause          Difference                   % correct
                            reading time (ms)    (observed - expected, ms)
Condition 1: subj + null    1844                 -162                         88.7
Condition 2: subj + pron    2666                  499                         80.4
Condition 3: obj + null     2352                  349                         70.4
Condition 4: obj + pron     2236                   41                         89.1
Table 3.2: Results for Experiment 2 in Carminati (2002)
The average reading times for the main clause were computed after eliminating times
that were longer than 6000 ms or shorter than 200 ms (about 4% of the total number of
trials).
Also, since there were small length differences between the conditions, deviations from
regressions were also computed. These are the numbers in the Difference column. They
were calculated as follows. The predicted reading time for each segment was computed by
a regression equation, calculated on a subject by subject basis by correlating the reading
time and segment length over all times and conditions of the experiment. Expected times
were then calculated for each segment and each subject and subtracted from the observed
reading times. Positive numbers mean reading times were slower than expected, negative
numbers reading times faster than expected.
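The length correction just described can be sketched as follows. This is a minimal illustration under stated assumptions: a simple least-squares line is fitted separately for each participant, segment length is measured in characters, and all names and toy values are hypothetical rather than taken from Carminati's materials.

```python
def fit_line(lengths, times):
    """Ordinary least squares fit of time = intercept + slope * length."""
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    if sxx == 0:
        raise ValueError("segment lengths must vary to fit a slope")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, times))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residual_reading_times(trials):
    """Observed minus expected reading time for ONE participant.

    trials: list of (segment_length, observed_rt_ms) pairs.
    """
    lengths = [length for length, _ in trials]
    times = [rt for _, rt in trials]
    intercept, slope = fit_line(lengths, times)
    return [rt - (intercept + slope * length) for length, rt in trials]


# Hypothetical toy trials for a single participant (NOT real data).
toy_trials = [(20, 1900), (25, 2300), (30, 2500), (35, 3100)]
residuals = residual_reading_times(toy_trials)
print([round(r, 1) for r in residuals])
```

Averaging these residuals by condition would yield figures comparable to those in the Difference column: positive values for segments read more slowly than their length predicts, negative values for segments read faster.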
Reading times were significantly faster for main clauses with NSPs than clauses with
OSPs, which is expected because clauses with NSPs are always shorter than clauses with
OSPs. The effect of bias was not significant. In addition, there was a significant antecedent
by pronoun interaction, using both raw Reading Times and Difference times (the
ANOVA results for raw RT are: F1(1,39) = 28.16, p < .001; F2(1,15) = 23.68, p < .001).
NSPs in main clauses biased towards a subject antecedent were read faster than sentences
biased towards the object, while the opposite was true for sentences with OSPs.
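The direction of this interaction can be read off the cell means in Table 3.2. The sketch below computes, for each bias, the cost of the overt pronoun relative to the null pronoun, and then the difference between the two costs. It is descriptive arithmetic on the reported means only and says nothing about significance; the function and variable names are illustrative.

```python
def overt_cost(cell_means, bias):
    """Reading-time cost of an overt pronoun relative to a null one."""
    return cell_means[(bias, "overt")] - cell_means[(bias, "null")]


def interaction_contrast(cell_means):
    """Overt-pronoun cost under subject bias minus the cost under object bias."""
    return overt_cost(cell_means, "subject") - overt_cost(cell_means, "object")


# Cell means from Table 3.2 (ms).
raw_rt = {
    ("subject", "null"): 1844,
    ("subject", "overt"): 2666,
    ("object", "null"): 2352,
    ("object", "overt"): 2236,
}
difference_rt = {
    ("subject", "null"): -162,
    ("subject", "overt"): 499,
    ("object", "null"): 349,
    ("object", "overt"): 41,
}

for label, table in (("raw", raw_rt), ("difference", difference_rt)):
    print(
        label,
        overt_cost(table, "subject"),  # 822 (raw), 661 (difference)
        overt_cost(table, "object"),   # -116 (raw), -308 (difference)
        interaction_contrast(table),   # 938 (raw), 969 (difference)
    )
```

In both measures the overt pronoun slows reading under subject bias and speeds it up under object bias, which is the crossover pattern the PAH predicts.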
As mentioned earlier, Carminati’s (2002) study is mainly concerned with intrasentential
anaphora. She herself insists that intra- and intersentential anaphora must be
studied separately and that it cannot be assumed that both types of anaphora are processed
in the same way. She suggests that, while her studies show that intra-sentential anaphora
involves accessing syntactic representations, this may not be the case for inter-sentential
anaphora. The latter type of anaphora was tested for Spanish and Catalan in experiments that are
presented in the next two sections.
3.2 The PAH in Spanish
Alonso-Ovalle et al. (2002) tested the Position of Antecedent Hypothesis for Spanish in
several contexts. Especially interesting for the purposes of this proposal is the fact that they
replicated Carminati’s Experiment 1 in intersentential contexts: that is, they constructed a
questionnaire which contained two-sentence discourses and questions about the interpreta-
tion of pronouns in the second sentence. They found that, if the second sentence contained
an NSP, it was mostly interpreted as referring to the previous subject (73.2%), while, if
it contained an OSP, this percentage dropped to 50.2%, the difference being significant
(F1(1,79) = 65.28; F2(1,11) = 43.38, p< .001).
Thus, this data seems to support the idea that the Position of Antecedent Hypothesis
holds for Spanish too. However, there are some intriguing differences regarding the overt
pronoun in the Italian and the Spanish experiments: in Italian, the OSP was interpreted as
referring to the previous subject in only 16.67% of the cases, while in Spanish it was 50.2%.
The Spanish experiments thus seem to indicate that, while the NSP is clearly biased towards
the previous subject, the OSP does not show a clear preference. Although the
PAH also seems to be in effect for Spanish in intersentential cases, its effects seem to
be milder. This could be due to the change of language (Italian versus Spanish) or to the
change of type of anaphora tested (intersentential versus intrasentential). Given that the
results obtained for Spanish and Italian do not exactly match and show interesting differences,
it is worth exploring this hypothesis further and replicating these experiments
in other Romance languages, in order to get a better grasp of the phenomenon we are dealing
with. In the next section, I present my experiments for intersentential anaphora in
Catalan.
3.3 Experiments on Catalan pronouns
As reported in the previous sections, Carminati (2002) showed that the PAH holds for
Italian intrasententially. By contrast, the results for intersentential contexts in Spanish look
more puzzling, in particular with respect to the behavior of OSPs. In this Section, I present
the two experiments I carried out for Catalan in cross-sentential contexts.
3.3.1 Experiment 1: questionnaire study
Experiment 1 replicates Carminati’s (2002) and Alonso-Ovalle et al.’s (2002) Experiment 1.
It tests the PAH in two-sentence discourses without semantic bias.
Materials: the materials consisted of sixteen two-sentence discourses with two condi-
tions. The first sentence introduces two individuals by means of two proper names of the
same grammatical gender, one in subject position and the other in object position. The con-
tent of the second sentence is not pragmatically biased to refer to one of the two referents.
The subject of the second clause is either an NSP or an OSP. Thus, the two conditions are:
(47) a. Condition 1: Null Pronoun

La Marta escrivia sovint a la Raquel. ∅ Vivia als Estats Units.
“Marta wrote frequently to Raquel. ∅ Lived in the United States.”
b. Condition 2: Overt Pronoun

La Marta escrivia sovint a la Raquel. Ella vivia als Estats Units.
“Marta wrote frequently to Raquel. She lived in the United States.”
The conditions for each item set were counterbalanced and incorporated into a question-
naire experiment together with 24 filler items and 5 practice items. Four counterbalanced
lists were constructed (the last two lists with the items in reverse order), with a single ran-
domization for all lists. The complete set of experimental items can be seen in Appendix
A.
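One way to build counterbalanced lists of this kind is sketched below. It is a hypothetical reconstruction: the Latin-square rotation of conditions over items is an assumption, since the text only states that the conditions were counterbalanced and that the second half of the lists presents the items in reverse order. Fillers, practice items and the single randomization of item order are left out.

```python
def build_lists(n_items=16, conditions=("null", "overt")):
    """Build counterbalanced presentation lists.

    Assumed scheme: a Latin-square rotation gives one forward list per
    condition, and each forward list is then repeated in reverse order.
    """
    k = len(conditions)
    forward = [
        [(item, conditions[(item + shift) % k]) for item in range(1, n_items + 1)]
        for shift in range(k)
    ]
    backward = [list(reversed(lst)) for lst in forward]
    return forward + backward


lists = build_lists()
for number, lst in enumerate(lists, start=1):
    print(number, lst[:3])
```

With two conditions this yields four lists, as in this experiment; with four conditions the same scheme would yield eight lists.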
Procedure: The experiment was administered using a laptop, equipped with EPrime
software. Before starting the experimental session proper, subjects read a set of written in-
structions, which explained the experimental procedure. Participants went through a prac-
tice session, so that they could get familiar with the keyboard and the procedure, and the
experiment subsequently began. The discourses were presented on the computer screen.
Subjects were asked to indicate which interpretation of the second sentence they preferred,
i.e., whether they thought it was a statement about the subject of the first sentence, or the
object of the first sentence. Therefore, under each experimental sentence, two paraphrases
of the second sentence were given, such as the following, corresponding to the example
items presented above.
(48) a. Marta lived in the United States
b. Raquel lived in the United States
Participants: Thirty-two participants from Universitat Pompeu Fabra in Barcelona took
part in this experiment. They also participated in Experiment 2 and did not participate
in any of the other experiments.
Results: The results can be seen in Table 3.3.
                 subject antecedent (%)   object antecedent (%)
null pronoun             70.3                     29.7
overt pronoun            35.5                     64.5
Table 3.3: Results for Experiment 1
There was quite a strong preference to interpret NSPs as having subject antecedents
and OSPs as having object antecedents. The effect was not as strong as in Carminati’s
(2002) experiment, but much stronger than Alonso-Ovalle et al.’s (2002). OSPs do not
show the mixed behavior reported for Spanish, but rather show a clear preference for an
object antecedent.
To test the statistical significance of these patterns, an analysis of variance (ANOVA) of
the frequency with which the subject antecedent was chosen in the null vs. overt pronoun
conditions was performed with subjects and items as random effects. The difference in
choosing the subject antecedent (70% vs 35%) was significant (F1(1,31) = 64.23, p< 0.001;
F2(1,15) = 26.153, p < 0.001). The difference between the preferred antecedent choices
for the two conditions (70% vs 64%) was not significant (F1(1,31) = 1.5573, p = 0.22;
F2(1,15) = 0.353, p = 0.56). This analysis confirms that the type of pronoun has an effect
on its interpretation: NSPs display a subject preference and OSPs an object preference.
3.3.2 Experiment 2: self-paced reading test
Experiment 2 replicates Carminati’s (2002) Experiment 2 for intersentential anaphora. The
goal of this experiment was to test the Position of Antecedent Hypothesis in two-sentence
discourses with semantic bias using self-paced reading. As explained above, this kind of
design has the advantage of not assuming that the sentences are neutral3 (that is, pragmatically
non-biased); furthermore, participants are less aware of the goal of the experiment,
since they are not explicitly asked for their judgments.
It has been widely shown in the psycholinguistic literature that readers make use of all
the available linguistic cues to arrive at a coherent interpretation. If they encounter explicit
information which goes against some of the linguistic cues they have encountered before,
this does not result in an unacceptable or anomalous sentence, but they do need more time
to read the sentence (see Caramazza et al. (1977)). For instance, Koornneef and Berkum
(2006) investigated some verbs conveying implicit causality, such as apologize, which has
a bias towards a continuation that makes reference to the first NP. They tested sentences
such as example 49 in which the pronoun was either consistent or inconsistent with the
bias of the verb. Although both discourses are coherent, they found that the bias-consistent
sentence was read faster than the bias-inconsistent one.
(49) a. Bias-consistent pronoun.

Linda and David had an accident. David apologized to Linda because he was
the one to blame.
3
I consider dropping the assumption of neutrality to be an advantage in the sense that the neutrality of a
sentence is much more subjective and disputable than its non-neutrality.
b. Bias-inconsistent pronoun.
Linda and David had an accident. Linda apologized to David because he was
not the one to blame.
These results suggest that sentences which obey the biases predicted by the PAH should
be read faster than sentences which do not obey them. The goal of Experiment 2 is to test
this claim.
Materials: the materials consisted of sixteen two-sentence discourses with four con-
ditions. In these discourses, the first sentence introduces two individuals by means of two
proper names of the same grammatical gender, one in subject position and the other in ob-
ject position. The second sentence contains either an NSP or an OSP and is semantically
biased so that the pronoun refers either to the previous subject or previous object. Thus,
the four conditions are:
(50) a. Condition 1: Null pronoun + bias towards subject antecedent.

El Joan va deixar en ridı́cul el Dani davant de tothom. ∅ Es va excusar repeti-
dament.
“John made fun of Dani in front of everyone. ∅ Apologized many times.”
b. Condition 2: Overt pronoun + bias towards subject antecedent.

El Joan va deixar en ridı́cul el Dani davant de tothom. Ell es va excusar repeti-
dament.
“John made fun of Dani in front of everyone. He apologized many times.”
c. Condition 3: Null pronoun + bias towards object antecedent.

El Joan va deixar en ridı́cul el Dani davant de tothom. ∅ Es va ofendre molt.
“John made fun of Dani in front of everyone. ∅ Was very offended.”
d. Condition 4: Overt pronoun + bias towards object antecedent.

El Joan va deixar en ridı́cul el Dani davant de tothom. Ell es va ofendre molt.
“John made fun of Dani in front of everyone. He was very offended.”
The conditions for each item set were counterbalanced and incorporated into a self-paced
reading experiment together with 24 filler items and 5 practice items. Eight counterbal-
anced lists were constructed (the last four lists with the items in reverse order), with a
single randomization for all lists. The complete set of experimental items can be seen in
Appendix B.
Procedure: The experiment was administered using a laptop, equipped with EPrime
software. Before starting the experimental session proper, subjects read a set of written
instructions, which explained the experimental procedure. Participants then went through
a practice session, so that they could get familiar with the keyboard and the procedure,
and the experiment subsequently began. The discourses were presented on the computer
screen. Subjects were asked to press the space bar after each sentence and this is how the
reading times for each sentence were measured. Comprehension questions, such as the one
in 51, probing the resolution of the pronoun, were asked after each item.
(51) a. Qui es va ofendre?
“Who was offended?”
b. El Joan.
c. El Dani.
Participants: Thirty-two students from Universitat Pompeu Fabra in Barcelona took
part in this experiment.
Results: Table 3.4 contains the results for this experiment.
The average reading times for the second sentence were computed after eliminating
times that were longer than 6000 ms or shorter than 200 ms (about 3.5% of the total number
of trials). The number in the ‘% correct’ column refers to the percentage of answers in
which subjects understood the pronoun as referring to the pragmatically-biased antecedent.
Condition                   Second sentence (ms)   Difference (ms)   % correct
Condition 1: subj + null    2464                   -24               90
Condition 2: subj + pron    2929                   290               90
Condition 3: obj + null     2587                   45                91
Condition 4: obj + pron     2700                   -1                91
For both types of bias (sentences pragmatically biased to the subject and to the ob-
ject), the sentence with the NSP was read faster than the sentence with the OSP. Thus, the
data for the OSP does not follow the pattern found by Carminati (and, therefore, does not
confirm the PAH). In fact, an ANOVA analysis gives pronoun type (null vs. overt) as the
only significant factor (F1(1,31) = 23.86, p < .001; F2(1,15) = 27.66, p < .001), while
the interaction Pronoun by Bias is not significant (F1(1,31) = 3.93, p = .059; F2(1,15) =
3.09, p = .098). However, the sentences in conditions 2 and 4 are systematically longer
because they contain the OSP, and this may be masking the effect of the PAH.4 Thus, deviations
from a length-based regression were computed to adjust for these length differences. These are
the numbers in the Difference column. They were calculated following the same method Carminati applied,
which I repeat here: The predicted reading time for each segment was computed by a re-
gression equation, calculated on a subject by subject basis by correlating the reading time
and segment length over all times and conditions of the experiment. Expected times were
then calculated for each segment and each subject and subtracted from the observed read-
ing times. This procedure to adjust for length differences is quite standard in the processing
literature (see Ferreira and Clifton (1986) for one of the first papers to use it and Trueswell
et al. (1994) for some additional discussion of this technique).
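The length adjustment just described can be sketched in a few lines of Python. This is only an illustration of the general technique (a per-subject linear regression of reading time on segment length, followed by subtraction of the predicted time), not the script used for the experiment; the function names and the sample trials are hypothetical.

```python
from collections import defaultdict

def fit_line(lengths, times):
    """Ordinary least-squares fit: time = intercept + slope * length.
    Assumes the lengths are not all identical."""
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, times))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residual_reading_times(trials):
    """trials: list of (subject, length_in_chars, reading_time_ms).
    Returns (subject, length, observed - predicted) for every trial,
    with one regression line fitted per subject."""
    trials = list(trials)
    by_subject = defaultdict(list)
    for subject, length, rt in trials:
        by_subject[subject].append((length, rt))
    fits = {
        subject: fit_line([l for l, _ in obs], [t for _, t in obs])
        for subject, obs in by_subject.items()
    }
    residuals = []
    for subject, length, rt in trials:
        intercept, slope = fits[subject]
        residuals.append((subject, length, rt - (intercept + slope * length)))
    return residuals

# Made-up trials for two hypothetical subjects (not data from the experiment).
trials = [
    ("s1", 59, 2400), ("s1", 63, 2900), ("s1", 60, 2500), ("s1", 62, 2700),
    ("s2", 59, 2000), ("s2", 63, 2300), ("s2", 60, 2150), ("s2", 62, 2200),
]
for subject, length, diff in residual_reading_times(trials):
    print(subject, length, round(diff, 1))
```

Positive residuals correspond to reading times slower than expected for a segment of that length, negative residuals to faster ones.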
4
Although the stimuli were designed so that length differences would be kept to a minimum across
conditions, some differences remained. The mean lengths (in characters) of the items in each
condition are: condition 1 = 59.25; condition 2 = 62.75; condition 3 = 59.56; condition 4 = 63.0.

The average differences between observed and expected reading times for each condition
are shown in the Difference column in Table 3.4. Positive numbers mean that reading times
were slower than expected, negative numbers that they were faster than expected. Conditions
1 and 4 were faster than expected, while 2 and 3 were slower, which is consistent with
the biases predicted by the PAH.
The differences between observed and expected reading times were submitted
to an ANOVA. As with raw reading times, the effect of the type of pronoun was
significant (F1(1,31) = 4.78, p = 0.03; F2(1,15) = 4.58, p < .049). In addition, in this
case there was also a significant bias by pronoun interaction (F1(1,31) = 4.68, p = 0.038;
F2(1,15) = 11.04, p < .001).
Thus, we can conclude that this experiment also shows that the PAH holds for Cata-
lan intersententially: NSPs tend to refer to subject antecedents and OSPs tend to refer to
non-subject antecedents. However, this tendency is milder at a discourse level than at the
sentence level. I will have more to say about the PAH and the results of these experiments
at the end of Chapter 4 and in Chapter 5.
3.4 Conclusion
In this Chapter, I have shown that the Position of Antecedent Hypothesis, the hypothesis
Carminati (2002) tested for Italian, also holds for Catalan. That is, NSPs and OSPs have
different biases: NSPs have a subject preference, while OSPs have an object preference.
These experiments show that it is not the case that OSPs are preferred whenever there is
some ambiguity, contra Rigau. Following Ariel’s (1990) terminology, competition does
not override salience. Under the right circumstances, NSPs are preferred even if there is
potential ambiguity.
The effect of the PAH in Catalan across sentences is milder than in Italian within a
sentence, which could be due to the fact that, at a discourse level, pragmatic notions (such
as topicality) become more prominent than syntactic notions (such as subjecthood). I ex-
plore this possibility in Chapter 5 and, although I show that pragmatic structure has some
influence on the interpretation of pronouns (and particularly of OSPs), it is not as straight-
forward as one might believe.
In light of the results for Italian and Catalan, the Spanish data remain puzzling. Al-
though the authors of the Spanish study claim that their data gives support to the PAH,
the overt pronoun shows a mixed, random behavior the PAH cannot account for. Alonso-
Ovalle et al. (2002) did not do any on-line experiments; therefore, further experiments,
using methods other than questionnaires, are needed to establish whether the behavior of
overt pronouns in Spanish is qualitatively different from Italian or Catalan.
Chapter 4
Game theory
One of the main goals of this thesis is to show how the psycholinguistic data about the
overt/null pronoun variation in Catalan can be analyzed in terms of game theory, as the
result of the strategic interaction between participants in a communicative exchange. I
argue that game theoretical approaches can account for this type of data more accurately
than other approaches and can capture the fact that the judgments are extremely sensitive
to the context.
This chapter contains a brief introduction to game theory (Section 4.1) and explains how
it has been applied to linguistics, and in particular to the modeling of anaphora choice and
resolution (Section 4.2). In Section 4.3, I propose my own analysis of the psycholinguistic
data just presented, which is further supported by Experiment 3. Finally, Section 4.4 argues
that mixed strategies are not suitable to model the phenomenon studied in this thesis.
4.1 Overview of game theory
Game theory (GT) is the study of the ways in which strategic interaction among rational
players produces outcomes with respect to the preferences (or utilities) of those players. In
linguistics, GT has mainly been used in semantics and pragmatics since it provides a good
framework to explain why speakers and hearers (that is, rational agents) choose a certain
action (i.e. utter a sentence or interpret a sentence with a particular meaning in a particular
context). Specifically, game theory has been applied to topics as varied as the semantics of
questions (van Rooij, 2003), discourse anaphora (Clark and Parikh, 2007) and implicatures
(Parikh, 2001; Ross, 2006).
In what follows, I summarize the basic ideas of GT. An agent is often faced with a
decision. He can choose among several different actions; if so, he will choose the one he
prefers. It is possible to translate this preference by giving a numerical value (or payoff) to
each option. The option with the highest payoff will be the preferred action, since agents
seek to maximize their payoffs. However, sometimes payoffs are uncertain, so that every
possible outcome has a certain probability associated with it. In this case, the agent will
choose the action with the highest expected payoff. When there is more than one agent
making decisions, the action one agent decides to make might affect the other agent’s payoff
and, thus, the other agent’s decisions. That is, one agent needs to consider the other agents’
actions and payoffs in order to choose the best option. In this sense, there is a strategic
interaction among all rational agents and everyone is trying to choose so that their partial
influence over the outcome benefits them the most.
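As a minimal illustration of choosing the action with the highest expected payoff, the sketch below compares two actions whose outcomes are uncertain. The action names, probabilities and payoffs are hypothetical and serve only to show the computation.

```python
def expected_payoff(outcomes):
    """outcomes: list of (probability, payoff) pairs for one action."""
    return sum(p * u for p, u in outcomes)

def best_action(actions):
    """Return the name of the action with the highest expected payoff."""
    return max(actions, key=lambda name: expected_payoff(actions[name]))

# Hypothetical decision problem: a certain payoff versus a gamble.
actions = {
    "safe": [(1.0, 5)],
    "risky": [(0.5, 10), (0.5, -2)],
}
for name, outcomes in actions.items():
    print(name, expected_payoff(outcomes))
print("chosen:", best_action(actions))
```

With these made-up numbers the gamble has the higher best-case payoff but the lower expected payoff, so the agent picks the certain option.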
When no agent has an incentive to change his action (given all other agents’ actions),
an equilibrium (known as a Nash equilibrium) is reached and the game is solved. As
I illustrate below with examples, a Nash equilibrium is a strategy profile in which if a
player chooses to unilaterally defect and do something different, this player will get a worse
payoff. In a given game, there may be several Nash equilibria: the one (or ones) with the
highest payoffs (called a Pareto-Nash equilibrium) will be preferred.
In a more technical notation, a strategic game is a structure such that (van Rooij, 2006):
• N = {1, ... , n} denotes the set of players.
• For each player i there is a set Ai of actions that he can perform. An action profile is
an n-tuple (a1 , ... , an ) of actions where each ai ∈ Ai .
• A payoff is a function U that maps each action profile (a1 , ... , an ) ∈ A, where
A = A1 × ... × An is the set of all action profiles, to an n-tuple
of real numbers (u1 , ... , un ). In zero-sum games, the payoffs of the players sum to zero
for each profile: that is, the winnings of one player entail losses for the other player.
The opposite situation is represented by games of pure coordination, in which the
payoffs of all players are identical for each action profile.
An action a1 strictly dominates an action a2 if, in all possible worlds, the payoffs an
agent gets when choosing a1 are better than when choosing a2 . I will illustrate the concept
of dominant action with an example (from Dixit and Nalebuff (1991)). Suppose that during
a certain week there are two major news stories: an impasse between the House
and the Senate on the budget and a new drug which is claimed to be effective against AIDS.
Two magazines, say Time and Newsweek, have to decide on one of these topics for the
cover. The buyers will buy a magazine depending on the story on the cover. Suppose that
70% are interested in the AIDS story and 30% in the budget story. Suppose also that, if the
two magazines have the same story on the cover, the group interested in that story splits
equally between the two magazines. Table 4.1 represents the situation: the first number of
the pair represents Time’s payoff, the second Newsweek’s payoff; that is, if Time chooses
AIDS and Newsweek chooses Budget, Time gets 70 and Newsweek gets 30 (as can be seen
in the right column of the first row).
                           Newsweek’s Choices
                           AIDS      Budget
Time’s Choices   AIDS      35,35     70,30
                 Budget    30,70     15,15
Table 4.1: Example with a dominant strategy
Time has a dominant strategy, namely using the AIDS story, because whatever Newsweek
decides to do, Time’s payoffs are higher if it uses the AIDS story. If Newsweek uses the
AIDS story, Time will get 35 if it also uses the AIDS story or 30 if it uses the budget story.
If Newsweek uses the budget story, Time will get 70 if it uses the AIDS story and 15 if it
uses the budget story. Therefore, in all situations, Time is better off if it uses the AIDS
story: that’s its dominant strategy. The same reasoning applies to Newsweek.
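The dominance reasoning for Table 4.1 can be checked mechanically. The following sketch encodes the payoffs of the table and tests strict dominance for either magazine; the dictionary layout and the function name are illustrative choices, not part of the original example.

```python
# Payoffs from Table 4.1, keyed by (Time's choice, Newsweek's choice);
# each value is (Time's payoff, Newsweek's payoff).
payoffs = {
    ("AIDS", "AIDS"): (35, 35),
    ("AIDS", "Budget"): (70, 30),
    ("Budget", "AIDS"): (30, 70),
    ("Budget", "Budget"): (15, 15),
}
ACTIONS = ("AIDS", "Budget")

def strictly_dominates(player, a, b):
    """True if action a gives `player` (0 = Time, 1 = Newsweek) a strictly
    higher payoff than action b against every action of the other magazine."""
    def payoff(own, other):
        profile = (own, other) if player == 0 else (other, own)
        return payoffs[profile][player]
    return all(payoff(a, other) > payoff(b, other) for other in ACTIONS)

print(strictly_dominates(0, "AIDS", "Budget"))  # Time
print(strictly_dominates(1, "AIDS", "Budget"))  # Newsweek
```

Both calls return True for the payoffs in the table, matching the conclusion that the AIDS story is the dominant strategy for each magazine.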
However, sometimes, in a game, there is no dominant strategy for any of the players.
Consider the following story (cited in van Rooij (2006), originally taken from Luce and
Raiffa (1957)): Adam wants to go to a boxing event and Eve to a concert the same night.
However, they both prefer to go somewhere together over going alone to the place each one
individually prefers. We represent this situation in Table 4.2. Columns represent Adam’s
choices and rows Eve’s choices. The first number of the ordered pair represents the payoffs
for the row player (Eve) and the second number the payoffs for the column player (Adam).
Payoffs represent the preferences of the players; the particular numbers are not important,
but the relationship between the payoffs is crucial to determine the equilibria of the game.
                           Adam’s Choices
                           Boxing    Concert
Eve’s Choices    Boxing    4,2       0,0
                 Concert   1,1       2,4
Table 4.2: Game without a dominant strategy
Adam does not have a dominant strategy. If Eve goes to the concert, he prefers going
to the concert. If Eve goes to the boxing event, he prefers going to the boxing event.
The same is true for Eve. Intuitively, they should avoid strategies (Boxing, Concert) and
(Concert, Boxing) and agree on either (Boxing, Boxing) or (Concert, Concert). These last
two strategies are the Nash Equilibria of the game. A strategy profile s is called a Nash
Equilibrium if none of the players i has an interest in playing a strategy different from si
given what the other players play. In our game above, if Eve plays Concert, Adam has
no interest in playing Boxing instead of Concert, since he will be worse off. Similarly, if
Adam plays Concert, Eve has no interest in playing Boxing instead of Concert. Therefore,
(Concert, Concert) is a Nash Equilibrium. The same reasoning applies to (Boxing, Boxing).
This is a coordination game in which both players have a common interest: in this case,
to meet at some event. Communication can also be seen as a coordination game, in which
speaker and hearer also have a common interest: that is, to understand each other.
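The definition of a Nash equilibrium can be turned into a direct search over the cells of a two-player table. The sketch below uses the payoffs exactly as printed in Table 4.2 (first number for the row player, second for the column player); the function name is an illustrative choice.

```python
from itertools import product

def pure_nash_equilibria(payoffs, row_actions, col_actions):
    """Return every profile (row, col) from which neither player can
    improve his or her own payoff by deviating unilaterally."""
    equilibria = []
    for r, c in product(row_actions, col_actions):
        row_u, col_u = payoffs[(r, c)]
        row_stays = all(payoffs[(r2, c)][0] <= row_u for r2 in row_actions)
        col_stays = all(payoffs[(r, c2)][1] <= col_u for c2 in col_actions)
        if row_stays and col_stays:
            equilibria.append((r, c))
    return equilibria

# Table 4.2, keyed by (row choice, column choice).
table_4_2 = {
    ("Boxing", "Boxing"): (4, 2),
    ("Boxing", "Concert"): (0, 0),
    ("Concert", "Boxing"): (1, 1),
    ("Concert", "Concert"): (2, 4),
}
choices = ("Boxing", "Concert")
print(pure_nash_equilibria(table_4_2, choices, choices))
```

The search returns the two coordinated profiles, (Boxing, Boxing) and (Concert, Concert), and rejects the two profiles in which the players end up at different events.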
In the Adam and Eve game, given that there are two Nash Equilibria, what should they
do? If they look only at their own payoffs, they will go to the event they prefer but they will
never meet and will be limited to a payoff of 1. Since there are two Nash Equilibria with
the same payoffs, they cannot use a pure strategy, always playing Boxing or Concert, but
should play a mixed strategy choosing each strategy with a certain probability. As we will
see in the next sections, this circumstance does not occur in the particular linguistic games
we will be examining. It is rarely the case that two strategies yield exactly the same payoff
because, for instance, some forms are cheaper to utter than other forms and some meanings
are more informative, and thus more valuable, than other meanings.1
Consider now the situation in Table 4.3, in which there are two Nash equilibria (A,
a) and (B, b). However, intuitively, both players would prefer (A, a) to (B, b). (A, a) is
the Pareto-Nash equilibrium. A Nash equilibrium s is Pareto optimal iff there is no other
equilibrium s’, such that for all players the payoffs of s are smaller than the payoffs of s’.
       a       b
A      3,3     0,0
B      0,0     1,1
Table 4.3: Game with a Pareto-Nash equilibrium
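Selecting the Pareto-Nash equilibrium amounts to two steps: finding the pure Nash equilibria and discarding any equilibrium that is worse for all players than some other equilibrium. The sketch below applies this to Table 4.3; the function names are illustrative.

```python
from itertools import product

def pure_nash_equilibria(payoffs, rows, cols):
    """Profiles from which no player gains by deviating unilaterally."""
    found = []
    for r, c in product(rows, cols):
        row_u, col_u = payoffs[(r, c)]
        if (all(payoffs[(r2, c)][0] <= row_u for r2 in rows)
                and all(payoffs[(r, c2)][1] <= col_u for c2 in cols)):
            found.append((r, c))
    return found

def pareto_nash(payoffs, rows, cols):
    """Nash equilibria s for which no other equilibrium s' gives every
    player a strictly higher payoff than s does."""
    equilibria = pure_nash_equilibria(payoffs, rows, cols)

    def dominated(s):
        return any(
            all(u < v for u, v in zip(payoffs[s], payoffs[other]))
            for other in equilibria if other != s
        )

    return [s for s in equilibria if not dominated(s)]

# Table 4.3, keyed by (row choice, column choice).
table_4_3 = {
    ("A", "a"): (3, 3), ("A", "b"): (0, 0),
    ("B", "a"): (0, 0), ("B", "b"): (1, 1),
}
print(pure_nash_equilibria(table_4_3, ("A", "B"), ("a", "b")))
print(pareto_nash(table_4_3, ("A", "B"), ("a", "b")))
```

Both (A, a) and (B, b) survive the first step, but only (A, a) survives the second.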
1
Cases of stable sociolinguistic variation are likely to be exceptions. See section 4.4 for more comments.

An additional type of game is one involving incomplete information, in which, at some
point, one agent i does not know which action the other agent is playing. That is, one agent
does not know which information state s/he is in: an information state is one state of affairs
that can possibly hold. Usually, games of incomplete information are represented not using
strategic forms, as we have been doing so far, but using games in extensive forms or game
trees. A tree consists of several nodes; each node is identified with a move of one of the
players who has to decide between several actions. The set of information states in which
a player thinks he may be is called an information set and it is represented by circling the
states in the set.
Figure 4.1 represents a game tree for a game of incomplete information. Imagine a
game between player A and B, in which player A hides a coin in either his left or his right
hand and player B has to guess in which hand the coin is hidden. Since player B does not
know whether A has chosen the action left (hiding the coin in his left hand) or action right
(hiding the coin in his right hand), he does not know whether he is in t or t’; {t, t’} is his
information set and that is why these two nodes are circled. Each end node has assigned a
pair of payoffs for player A and B, respectively. These are the payoffs the players get from
playing the strategy starting at the root node and leading to that end node.
Figure 4.1: Game of incomplete information
Similar, but not equivalent, to the games of incomplete information are the games of
partial information, which will be the ones used in my analysis. A game of partial infor-
mation is a game in which, at some point, one agent i does not know which state he is in
because, although he is sure of which action the other player has chosen, the action may
correspond to different information states (different states of affairs in the world). For
example, lexical ambiguity can be represented by a game of partial information. Imagine a
speaker utters the word ‘pen’. The hearer is sure that the speaker has chosen to utter ‘pen’,
but, in principle, in the absence of context, he does not know whether the speaker meant
‘writing instrument’ or ‘enclosure for animals’.
Parikh (2001) has described how games of partial information can be used to model
a variety of linguistic problems. The general idea is as follows: speaker and hearer are
rational agents; the speaker is trying to convey some information by uttering a proposition
(among the several possible propositions she2 could utter), and the hearer is trying to cor-
rectly interpret this proposition (among several interpretations, given the fact that utterances
can mean different things depending on the context). Both agents are trying to minimize
production and processing costs (for example, by avoiding unambiguous but extremely
long sentences), while communicating successfully. Section 4.2 presents an application of
games of partial information to discourse anaphora in English.
2
I adopt the convention of using the feminine pronoun to refer to the speaker and the masculine pronoun
to refer to the hearer.

4.1.1 The role of payoffs
Before turning to the analysis of discourse anaphora in English, I would like to make a
point about the role of payoffs or utilities in the analysis. Payoffs are used to represent
preferences. So if an agent prefers action A to action B, the payoff for the first action must
be greater than the payoff for the second one, but the particular values assigned to them are
not important. That is, payoffs are basically indices and their relationship and ordering is
what is important, but not their particular value. Assigning a particular value, though, is
useful to be able to compute the equilibria of the games and to have a clearer intuitive idea
of the expected outcomes of the different strategies. The following quote from Luce and
Raiffa (1957) expresses this same idea very clearly, as well as the dangers associated with
giving specific values to payoffs:
“One may contend that introducing the numbers does no harm, that they summarize
the ordinal data in a compact way and that they are mathematically
convenient to manipulate. But, in part, their very manipulative convenience is
a source of trouble, for one must develop an almost inhuman self-control not
to read into these numbers those properties which numbers usually enjoy. For
example, one must keep in mind that it is meaningless to add two together or
to compare magnitudes of difference between them. If they are used as indices
in the way we have described, then the only meaningful numerical property is
order. We may compare two indices and ask which is the larger, but we may
not add or multiply them.” (Luce and Raiffa (1957), page 16)
4.2 A game theoretical approach to discourse anaphora
This section reviews Clark and Parikh’s (2007) proposal for discourse anaphora in English.
Their approach is the basis of my own analysis of Catalan, which I present in Section 4.3.
Consider the simple text in 52:
(52) A cop saw a hoodlum. He yawned.
There are several issues regarding the choice of referring expression in this small text,
both from the speaker’s and hearer’s point of view. How does the hearer know who he
refers to? Why does the speaker choose to utter he instead of a definite description? Why
is the text judged by most speakers to be unambiguous? Clark and Parikh (2007) view
this problem as a game of partial information in which speaker and hearer share some
knowledge and in which they try to find the most efficient strategy to solve the game of
communicating the utterance.
On the one hand, the speaker uses particular discourse anaphors when she expects the
hearer to be able to correctly identify the referent. On the other hand, the hearer chooses
antecedents to discourse entities based on how he expects the speaker to refer to each entity.
As both agents, speaker and hearer, are aware of this fact (they both know they are playing
this game), they can find a maximally efficient solution, that is, they can compute a Pareto-
Nash equilibrium as a solution for the game. This equilibrium is maximally efficient in
the sense that for the speaker it is the best way to encode the meaning she wants to convey
and, for the hearer, given the form he has heard, it is the best way to interpret the referring
expression. Thus, they cannot do better by deviating from this strategy profile.
Suppose that a speaker utters the first sentence of 52. By doing so, she has introduced
two discourse entities. Now, she wants to convey the meaning that the cop yawned. Both
agents know that the speaker could refer to either of the two entities by using either a
pronoun or a definite description. Once the speaker has uttered a second sentence, the
hearer has to decide who the referring expression refers to.
The game tree in Figure 4.2 shows the moves of the hearer and speaker and the payoffs
they get in each situation for each option. There are two information states, that is, two possible
states of affairs in the world which are relevant to the game. There is one tree for each
information state, s1 and s2 , with probabilities p1 and p2 , respectively. The tree rooted
in s1 is the one in which the speaker intends to refer to the cop (Subj, henceforth), and
s2 is the one in which the speaker intends to refer to the hoodlum (Obj, henceforth). The
branches of the main root show the speaker moves, while the sub-branches emanating from
these show the hearer’s moves. At the leaves, there is a set of ordered pairs of payoffs, the
first element of which refers to the payoff of the speaker and the second to the payoff of the
hearer.
For each tree, the speaker may use a pronoun or a definite description. If a pronoun is
used, the hearer may resolve the anaphora correctly or may make a mistake. This is pre-
cisely what makes the game a game of partial information: if the speaker utters a pronoun
the hearer will not be sure whether the speaker intends to refer to Subj or to Obj, that is,
the hearer will not be sure whether he is in information state s1 or in s2 (this is indicated
by circling the nodes t1 and t2 ).
Figure 4.2: Game of partial information for English anaphora
The payoffs for each action are assigned according to the following principles:
• Generally, it is more costly to use longer expressions.
• Generally, it is more costly to use expressions with “high” conventional content, in-
dependent of context (thus, names and descriptions are costlier than pronouns, which
are context-dependent).
• It is cheaper to refer to more salient entities with pronouns, and to less salient entities
with definite descriptions. Prominence is calculated according to the grammatical
function of the element in the preceding sentence, following the hierarchy in 53,
which is the one assumed in much of Centering Theory.
(53) Subject > Indirect Object > Direct Object > Others
In s1 , if the speaker uses a definite description, the hearer will surely resolve the
anaphor correctly. However, the payoffs will not be very high due to production and processing
costs and due to the assumption that referring to a prominent element (the subject) with a
full description rather than a pronoun entails some cost. Therefore, the proposed payoffs
are (6,6). If the speaker uses a pronoun in s1 and the hearer correctly resolves the anaphor,
the payoffs are higher (10, 10), since the costs are much less. However, if the hearer inter-
prets Obj instead of Subj, the payoffs would be negative (-10, -10) and would lead to an
undesirable situation of miscommunication. In s2 , the situation is very similar. However,
if a definite description is used, the payoffs are (7, 7), and not (6, 6) as in s1 , because using
a definite description for a less prominent entity (Obj) is assumed to be less costly. Also, if
the speaker chooses a pronoun and the hearer correctly chooses Obj, the payoffs are (8,8)
and not (10,10), because it is less efficient to pronominalize a less prominent element. If
the speaker chooses a pronoun and the hearer incorrectly chooses Subj, the payoffs are
again negative (-10, -10). As mentioned in 4.1.1, the particular value of the payoffs is not
important or meaningful; what is important and meaningful is the relationship between the
payoffs.
In the absence of further information, Clark and Parikh (2007) assume that the two
information states are equally likely3 , that is p1 = p2 . In this case, there are two pure
Nash equilibria4 , corresponding to the following strategies and payoffs (the payoffs are
calculated adding the outcomes of the different situations, weighted by the probabilities):
1. {(s1 , ‘he’), (s2 , ‘the hoodlum’), ({t1 , t2 }, Subj)}: the speaker should utter ‘he’ in s1 ,
‘the hoodlum’ in s2 and the hearer should interpret a pronoun as referring to Subj.
Thus, the expected payoff is: p1 (10) + p2 (7) = 0.5(10) + 0.5(7) = 8.5.
2. {(s1 , ‘the cop’), (s2 , ‘he’), ({t1 , t2 }, Obj)}: the speaker should utter ‘the cop’ in s1 ,
‘he’ in s2 and the hearer should interpret a pronoun as referring to Obj. Thus, the
expected payoff is: p1 (6) + p2 (8) = 0.5(6) + 0.5(8) = 7.
None of the other strategies are Nash equilibria. For example, the strategy of the speaker
uttering definite descriptions both in s1 and s2 and the hearer interpreting a pronoun as
referring to the subject is not a Nash equilibrium because the speaker can do better by devi-
ating and using a pronoun instead of a definite description (her payoffs in this information
state would increase from 6 to 10).
Among the two Nash equilibria of the game, the first one has the highest expected
payoff; it is the only Pareto-Nash equilibrium of the game. Both participants can compute
this equilibrium and will choose this strategy as the solution of the game. Communication,
even in the absence of complete information, becomes possible.
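The equilibrium computation for this game can be reproduced by brute force. The sketch below is an illustrative reconstruction, not Clark and Parikh's own implementation: it takes the payoffs given in the text, enumerates every pure strategy profile (a form for each information state plus one interpretation of the pronoun), and keeps the profiles from which neither player can raise the expected payoff by deviating alone. The names used in the code are hypothetical.

```python
from itertools import product

# Payoffs shared by speaker and hearer, as given in the text.
DD_PAYOFF = {"s1": 6, "s2": 7}     # definite description, always understood
PRONOUN_OK = {"s1": 10, "s2": 8}   # pronoun resolved as intended
MISCOMMUNICATION = -10             # pronoun resolved to the wrong entity
INTENDED = {"s1": "Subj", "s2": "Obj"}

def expected_payoff(speaker, hearer, probs):
    """speaker: state -> 'he' or 'dd'; hearer: how a pronoun is resolved
    ('Subj' or 'Obj'); probs: state -> probability."""
    total = 0.0
    for state, p in probs.items():
        if speaker[state] == "dd":
            u = DD_PAYOFF[state]
        elif hearer == INTENDED[state]:
            u = PRONOUN_OK[state]
        else:
            u = MISCOMMUNICATION
        total += p * u
    return total

def pure_nash(probs):
    """Return (form in s1, form in s2, interpretation, expected payoff)
    for every pure Nash equilibrium."""
    speaker_strategies = [
        dict(zip(("s1", "s2"), forms))
        for forms in product(("he", "dd"), repeat=2)
    ]
    hearer_strategies = ("Subj", "Obj")
    result = []
    for sp, hr in product(speaker_strategies, hearer_strategies):
        u = expected_payoff(sp, hr, probs)
        # Both players receive the same payoff, so one number suffices.
        speaker_stays = all(
            expected_payoff(alt, hr, probs) <= u for alt in speaker_strategies)
        hearer_stays = all(
            expected_payoff(sp, alt, probs) <= u for alt in hearer_strategies)
        if speaker_stays and hearer_stays:
            result.append((sp["s1"], sp["s2"], hr, u))
    return result

equilibria = pure_nash({"s1": 0.5, "s2": 0.5})
print(equilibria)
print("Pareto-Nash:", max(equilibria, key=lambda e: e[3]))
```

With p1 = p2 = 0.5 the search finds the two profiles listed above, with expected payoffs 8.5 and 7, and the first is selected as the Pareto-Nash equilibrium.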
3
I change this assumption in the next section.
4
There are two other mixed Nash equilibria, in which the players choose each option with a certain
probability.

Clark and Parikh (2007) also show how this account can deal with apparent counterexamples,
in which the Pareto-Nash equilibrium seems to be violated. The basic idea is that
several factors can influence the probabilities, so that one of the information states becomes
more likely. For instance, note the following contrast:
(54) a. John called Bill a Republican. Then he insulted him.
b. John called Bill a Republican. Then HE insulted him.
The partial game just presented correctly predicts that the pronoun in 54a should refer to
John. The game for 54b should be identical to the game for 54a. However, the contrastive
stress on the pronoun has the effect of altering the probabilities, so that p2 > p1 ; that is,
it becomes more likely that the speaker wants to refer to Obj, the object of the previous
sentence. That is:
(55) p1 = P(s1 | he bears contrastive stress)
     p2 = P(s2 | he bears contrastive stress)
     p2 > p1
Since both speaker and hearer know that stress alters the probabilities, they can use this
information to compute the optimal strategy.
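To see how a change in the probabilities can change the preferred equilibrium, the sketch below recomputes the expected payoffs of the two pure equilibria of the English game for different values of p2, using the illustrative payoffs from Section 4.2 (10 and 7 for the profile that reserves the pronoun for Subj, 6 and 8 for the profile that reserves it for Obj). The function names are hypothetical, and since the particular payoff values are not meant to be meaningful, the exact point at which the preference flips should not be read as an empirical claim.

```python
def profile_payoffs(p2):
    """Expected payoffs of the two pure equilibria when P(s2) = p2."""
    p1 = 1.0 - p2
    pronoun_for_subj = p1 * 10 + p2 * 7   # 'he' in s1, description in s2
    pronoun_for_obj = p1 * 6 + p2 * 8     # description in s1, 'he' in s2
    return pronoun_for_subj, pronoun_for_obj

def preferred_reading(p2):
    """Interpretation of the pronoun in the equilibrium with the higher
    expected payoff."""
    subj, obj = profile_payoffs(p2)
    return "Subj" if subj >= obj else "Obj"

for p2 in (0.5, 0.6, 0.9):
    print(p2, profile_payoffs(p2), preferred_reading(p2))
```

With these numbers the two expected payoffs are equal when 10 p1 + 7 p2 = 6 p1 + 8 p2, that is, when p2 = 4 p1; the Obj reading becomes the Pareto-Nash solution only once s2 is sufficiently more likely than s1.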
Lexical semantics and world knowledge can also influence the probabilities, as the
following examples show:
(56) a. John can open Bill’s safe. He knows the combination.
b. John can open Bill’s safe. He should change the combination.
The coreference of 56a is straightforwardly predicted by the model. In contrast, in 56b,
world knowledge increases the probability p2 , so that the pronoun corefers with ‘Bill’.
One aspect that makes this game theoretical approach conceptually different from Cen-
tering Theory or Optimality Theory approaches is that the former explicitly relates dis-
course anaphora to the rational choice of some agents; referring expressions are conceived
of as a way of signaling a specific strategy. The idea behind the game theoretical system
for analyzing discourse anaphora is that participants in a discourse are able to communicate
efficiently because they share some information, some common knowledge (the set of ac-
tions available, their payoffs and probabilities) and thus are able to use linguistic resources
in the most efficient way.
The linguistic phenomena analyzed with games of partial information so far all present
one ambiguous linguistic form competing against other less economical, non-ambiguous
linguistic forms. Discourse anaphora in Catalan are different in this respect since there
are two ambiguous forms competing against each other and against non-ambiguous forms.
The next section presents an account of these cases.
4.3 An analysis of null-subject languages
In this section, I present a game theoretical model of the asymmetry predicted by the PAH,
which, as was shown in Section 3.3, also holds for Catalan. Consider sentence 57, which
was one of the items in Experiment 1.
(57) Marta wrote frequently to Raquel. ∅ Lived in the United States.
The game between hearer and speaker to resolve the anaphor in sentence 57 is shown
in Figure 4.3. It looks similar to the game for the English discourse in Figure 4.2, although
its complexity has increased. There are two information states, two situations speaker
and hearer could be in. In particular, since experiments 1 and 2 showed that there is a
relationship between salience and interpretation of anaphoric forms, information states will
be understood as encoding antecedents with different, relevant degrees of salience. In this
chapter, we consider the two degrees of salience studied so far: Subj, in which the speaker
wants to refer to the antecedent in subject position, highly salient, and, Obj, in which the
speaker wants to refer to the antecedent in object position, with lower salience.
In each of the two information states of the game, the speaker now has three choices
instead of two: she can use an overt pronoun (OSP), a null pronoun (NSP) or a proper
name/definite description (DD). When hearing a sentence with either of the two pronominal
forms, the hearer will have to decide whether the speaker wants to refer to Marta, the
Subject, or to Raquel, the Object. The former corresponds to information state s1 and the
latter to information state s2 . I call the information set the hearer is in after hearing a
sentence with an NSP {t1 , t2 } and the one after hearing a sentence with an OSP {u1 ,u2 }.
Figure 4.3: Game for Catalan pronouns
I follow the same assumptions as Clark and Parikh to assign payoffs to each option
(see Section 4.2), with one change. I follow them in assuming a ranking within referring
expressions, so that the referring expressions which are shorter and more context-dependent
receive higher payoffs, while the ones which are longer and more conventional receive
lower payoffs. This is the hierarchy I propose:
(58) Null Pronoun > Overt Pronoun > Proper Name/Definite Description.
That is, NSPs are the most economical form, followed by OSPs, followed by proper
names and definite descriptions. Therefore, I assume that the payoffs for each option are
10, 8 and 5, respectively. As mentioned before, what is important is not the numerical value
of the payoff, but the relationship between payoffs: the relationship is what determines the
Pareto-Nash equilibrium.
I do not follow Clark and Parikh (2007) in encoding in the payoffs any asymmetry
between referring to a subject antecedent and an object antecedent. That is, I do not encode
the hierarchy in 53 in the payoffs; my proposal is that the payoffs for correctly interpreting
NSPs and OSPs are the same in s1 and s2 . However, the same result they present can be
achieved by introducing this asymmetry in the probabilities of the information states, which
is how Clark and Parikh (2007) modeled the various non-default antecedent assignments
they dealt with (examples 54 and 56). In a nutshell, my proposal is that speakers and hearers
assign different probabilities to different information states and that these probabilities can
be estimated through corpora counts. This distribution of the probabilities is common
knowledge for the participants in a conversation and, thus, they can take advantage of it to
make the most efficient use of their resources.
The corpus of Catalan narrations, presented in section 2.5, provides clear evidence that
the two information states which we are considering here, Subj and Obj reference, are not
equally likely. 79% of the subjects whose antecedents are in the previous sentence refer
to the previous subject and 21% to another constituent. Therefore, the two information
states are not equally, or even similarly, likely: the first one is much more likely than the
second one. The idea that, by default, the referent of the subject of the current utterance Ui
is the same as the referent of the subject of the previous utterance Ui−1 gets support from
different sources. First, the same claim is found in Centering Theory literature, in which
it represents a Continue Transition, which is the preferred transition (see also Walker et al.
(1998) on discourse continuity). Second, research about discourse structure has shown that
a discourse normally sticks to the same topic, talking about the same objects and events
(see Jasinskaja and Zeevat (2008)). Third, there is evidence from psycholinguistic studies
that referents in subject position are expected to be mentioned again. Kaiser and Trueswell
(2008) performed an eye-tracking experiment and they found that, after an SVO sentence
in Finnish, people anticipated that a subject would be mentioned again. Kim (2009) also
found a subject preference in Korean. To sum up, subject continuity is not a property spe-
cific to a particular language or language family, but rather a cross-linguistic tendency;
therefore, it is best encoded in the probabilities. As Jäger (2007) points out, probabili-
ties in game theory should be used to represent cognitive and communicative tendencies,
not particularities of a certain language. Subject continuity is one of these communicative
tendencies.
As mentioned, corpora counts can be used to estimate probabilities. A different ques-
tion would be where these probabilities come from; that is, why the probability distribution
is the way it is or, in different words, why communicative tendencies are the way they are.
This question is beyond the scope of this thesis. It is a question that can probably be ad-
dressed within the framework of evolutionary game theory, rather than with the framework
of rationalistic game theory used in this thesis.
Going back to our game, I propose that probability p1 , corresponding to information
state s1 , is greater than probability p2 , corresponding to information state s2 . For the
purposes of showing the calculations, I assume that p1 = 2/3 and p2 = 1/3. However, note
that the equilibria will remain constant as long as p1 > p2 and that they do not depend on
the particular values assigned to p1 and p2 . The game has the following four pure Nash
equilibria:
(59) a. {(s1 , NSP), (s2 , OSP), ({t1 , t2 }, Subj), ({u1 , u2 }, Obj)}. The expected payoff
is: p1 (10) + p2 (8) = 2/3(10) + 1/3(8) = 28/3.
b. {(s1 , DD), (s2 , NSP), ({t1 , t2 }, Obj), ({u1 , u2 }, Obj)}. The expected payoff
is: p1 (5) + p2 (10) = 2/3(5) + 1/3(10) = 20/3.
c. {(s1 , OSP), (s2 , NSP), ({t1 , t2 },Obj), ({u1 , u2 }, Subj)}. The expected payoff
is: p1 (8) + p2 (10) = 2/3(8) + 1/3(10) = 26/3.
d. {(s1 , NSP), (s2 , DD), ({t1 , t2 }, Subj), ({u1 , u2 }, Subj)}. The expected payoff
is: p1 (10) + p2 (5) = 2/3(10) + 1/3(5) = 25/3.
No other strategy is a Nash equilibrium. For example, {(s1 , NSP), (s2 , NSP), ({t1 , t2 },
Subj), ({u1 , u2 }, Obj)} is not a Nash equilibrium. The speaker would always use a null
pronoun, regardless of whether she wants to refer to the subject or the object, while the
hearer would always understand a null pronoun as referring to the subject. This means
that there would always be miscommunication whenever the speaker refers to the object.
The expected payoff for this strategy is: p1 (10) + p2 (0) = 2/3(10) + 1/3(0) = 20/3. Given
the strategy the hearer is using, it is in the speaker’s best interest to deviate from her own
strategy and to use an overt pronoun when she wants to refer to the object: that is, she
should use the strategy in 59a. This would increase her payoffs from 20/3 to 28/3.
There is a single Pareto-Nash equilibrium, which is the equilibrium in 59a. According
to this equilibrium, the speaker should use an NSP to refer to a previous subject and an OSP
to refer to a previous object. The hearer should interpret an NSP as referring to a previous
subject and an OSP as referring to a previous object. It is easy to see that this strategy is
equivalent to the predictions of the Position of Antecedent Hypothesis.
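The equilibrium reasoning above can be checked mechanically. The following Python sketch enumerates every pure strategy profile of the game and keeps those from which neither player can profitably deviate. It is only an illustration under stated assumptions: it uses the payoffs 10, 8 and 5 and the probabilities 2/3 and 1/3 from the text, assumes a payoff of 0 for miscommunication (as in the calculation for the non-equilibrium profile above), and assumes that a definite description is always understood. All function and variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

PAYOFF = {"NSP": 10, "OSP": 8, "DD": 5}   # payoffs for successful reference
MISS = 0                                   # assumed payoff for miscommunication
STATES = ("Subj", "Obj")                   # s1 = subject, s2 = object
P = {"Subj": Fraction(2, 3), "Obj": Fraction(1, 3)}

def expected_payoff(speaker, hearer, probs=P):
    """speaker: state -> form; hearer: pronoun form -> interpretation."""
    total = Fraction(0)
    for state in STATES:
        form = speaker[state]
        # Assumption: a definite description is never misunderstood.
        understood = form == "DD" or hearer[form] == state
        total += probs[state] * (PAYOFF[form] if understood else MISS)
    return total

# All pure strategies: 3 x 3 for the speaker, 2 x 2 for the hearer.
speakers = [dict(zip(STATES, forms)) for forms in product(PAYOFF, repeat=2)]
hearers = [dict(zip(("NSP", "OSP"), interp)) for interp in product(STATES, repeat=2)]

def is_nash(speaker, hearer):
    """True if neither player gains by deviating unilaterally."""
    current = expected_payoff(speaker, hearer)
    no_speaker_gain = all(expected_payoff(s, hearer) <= current for s in speakers)
    no_hearer_gain = all(expected_payoff(speaker, h) <= current for h in hearers)
    return no_speaker_gain and no_hearer_gain

equilibria = [(s, h) for s in speakers for h in hearers if is_nash(s, h)]
best = max(equilibria, key=lambda sh: expected_payoff(*sh))

for s, h in equilibria:
    print(s, h, expected_payoff(s, h))
print("Pareto-Nash:", best)
```

Under these assumptions the sketch finds four equilibria with expected payoffs 28/3, 20/3, 26/3 and 25/3, and selects the profile in 59a as the one with the highest payoff.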
As mentioned before, we can think of the two information states of the game as two dif-
ferent degrees of salience: Subj is the information state with the highest degree of salience
and Obj is the information state with the lowest degree of salience. The game of partial
information easily derives the division of labor usually found in pragmatics, in which the
unmarked form expresses an unmarked meaning (or rather, refers to an unmarked, expected
referent) and the marked form expresses a marked meaning (or rather, refers to a marked,
less expected referent).
This approach predicts that if p1 < p2 (if it becomes more likely that we are referring to
the previous object), the NSP should be used to refer to the previous object. This prediction
is borne out, as the following naturally-occurring example shows:
(60) Altre cop tira la granoteta fora però, com que estan a l’aigua, ∅ cau a l’aigua.
“Again ∅[big frog] pushes the little frog outside, but since ∅ are in the water, ∅[little frog]
falls in the water.”
This is a story about two frogs. In the first clause of 60, an NSP refers to the big
frog, while a DP in direct object position refers to the little one. In the second clause, a
null pronoun can felicitously refer to the little frog, given that the semantic content of the
sentences clearly biases the hearer in this direction: if x pushes y, y, and not x, is the one
likely to fall. With this extra information, the speaker can use the more economical form,
the NSP, which the hearer can interpret correctly.
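To see how this prediction follows from the model, the two separating strategies can be compared as p1 varies. The sketch below assumes the payoffs 10 (NSP) and 8 (OSP) used above; the function name is illustrative, and the probability values are chosen only to show the reversal.

```python
from fractions import Fraction

NSP, OSP = 10, 8  # payoffs assumed in the text

def separating_payoffs(p1):
    """Expected payoffs of the two separating strategies for a given p1."""
    p2 = 1 - p1
    null_for_subject = p1 * NSP + p2 * OSP   # NSP -> subject, OSP -> object
    null_for_object = p1 * OSP + p2 * NSP    # OSP -> subject, NSP -> object
    return null_for_subject, null_for_object

for p1 in (Fraction(2, 3), Fraction(1, 2), Fraction(1, 3)):
    a, b = separating_payoffs(p1)
    verdict = "NSP -> subject" if a > b else "NSP -> object" if b > a else "tie"
    print(p1, a, b, verdict)
```

With these payoffs the more economical form always goes to the more probable antecedent, and the two strategies tie only at p1 = 1/2.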
Note that Carminati presents the PAH not as a grammatical constraint, but as a prag-
matic principle which expresses preferences that can be violated. However, she does not
provide a mechanism to express when the biases predicted by the PAH can be violated. By
translating the PAH into games of partial information, it becomes obvious how to do so: the
probabilities of each information state encode the shared knowledge about the likelihoods
of these states and, depending on how agents assess them, the biases emerging from the
PAH will be obeyed or violated.
In light of this analysis, consider again the results from the reading-time experiment
(Experiment 2), which are repeated below.

Condition                   Reading time (ms)   Difference (ms)   % correct
Condition 1: subj + null    2464                -24               90
Condition 2: subj + pron    2929                290               90
Condition 3: obj + null     2587                45                91
Condition 4: obj + pron     2700                -1                91
The corrected reading times (in the Difference column) showed the asymmetry pre-
dicted by the PAH even in cases of semantic/pragmatic biasing (that is, even when the
probabilities p1 and p2 are being manipulated). However, the effect is very different in
each one of the four conditions:
• Conditions 1 and 2 correspond to cases in which there was a bias towards the subject.
While my proposal is that p1 > p2 in unbiased situations, in these cases the difference
between the probabilities is even larger. Thus, as expected, the null pronoun condition
(Condition 1) is greatly favored, while the overt pronoun condition
(Condition 2) receives a large penalty.
• Conditions 3 and 4 correspond to cases in which there was bias towards the object.
So is this bias towards the object capable of eliminating the initial bias towards the
subject? Looking at the results, the answer seems to be no, although they certainly
show the effect of the conflicting biases. In Condition 3, there is some penalization
in the reading time: that is, the biasing does not render the null pronoun completely
felicitous. However, note that the penalization is much smaller than in Condition 2, as
we would expect (thus, the semantic bias is indeed doing some work). In Condition
4, there is some facilitatory effect: that is, in spite of the bias towards p2 , the overt
pronoun is still easing the processing of the sentence. However, the facilitatory effect
is quite small, particularly if we compare it with the one in Condition 1.
I take these results to indicate that the initial difference between p1 and p2 is fairly big
and that, thus, it takes many extra signals to compensate for this initial difference and to
reverse the probabilities. So, even in the case of some semantic bias, p1 continues to be
greater than p2 , and thus the OSP is needed to indicate reference to the object. This can be
tested experimentally by constructing sentences with several degrees of biasing. Consider
the following two conditions: (1) mild bias, with just some semantic bias in the discourse
and (2) strong bias, in which the semantic bias is reinforced by discourse connectives. The
prediction of this approach is that if we are referring to the object with an NSP, the reading
times of Condition 1, with a mild bias, would still show some penalty, while the ones of
Condition 2, with a strong bias, would not. Experiment 3 is aimed at showing that this
prediction is fulfilled.
4.3.1 Experiment 3: self-paced reading experiment with different de-
grees of biasing
Experiment 3 is very similar to Experiment 2 and it also uses the methodology of self-
paced reading. The goal of this experiment is to test how context affects the processing and
biasing preferences of NSPs.
Materials: Materials consisted of sixteen two-sentence discourses with eight condi-
tions. In these discourses, the first sentence introduces two individuals by means of two
proper names of the same grammatical gender, one in subject position and the other in
object position. The second sentence contains either an NSP or an OSP and is semantically
biased so that the pronoun refers either to the previous subject or to the previous object;
this bias is either mild or strong. The degree of biasing is manipulated by means of connectives. The
idea behind this move is that, because connectives explicitly mark the rhetorical relation
between sentences, they reinforce discourse coherence and pronouns can be interpreted
more easily. The connectives used for the subject biasing condition were those marking
narration, elaboration or explanation (‘after’, ‘in addition’, ‘it turns out’), while for the ob-
ject biasing condition, the connectives marked result and violated expectation (‘that’s why’,
‘however’). These two sets of relations tend to trigger subject and object interpretation of
the pronoun, respectively (Stevenson et al. (2000), Kehler (2002), Hobbs (1979)). Note
that the goal of placing a connective is not to change the interpretation of the pronoun, but
to increase the coherence of the text and see how this affects the processing of pronouns.
The eight conditions of the experiment are:
(61) a. Condition 1: Null pronoun + mild bias towards subject antecedent.
El Joan va deixar en ridícul el Dani davant de tothom. ∅ Es va excusar repetidament.
“John made fun of Dani in front of everyone. ∅ Apologized many times.”
b. Condition 2: Null pronoun + strong bias towards subject antecedent.
El Joan va deixar en ridícul el Dani davant de tothom. Després, ∅ es va excusar repetidament.
“John made fun of Dani in front of everyone. Afterwards, ∅ apologized many times.”
c. Condition 3: Overt pronoun + mild bias towards subject antecedent.
El Joan va deixar en ridícul el Dani davant de tothom. Ell es va excusar repetidament.
“John made fun of Dani in front of everyone. He apologized many times.”
d. Condition 4: Overt pronoun + strong bias towards subject antecedent.
El Joan va deixar en ridícul el Dani davant de tothom. Després, ell es va excusar repetidament.
“John made fun of Dani in front of everyone. Afterwards, he apologized many times.”
e. Condition 5: Null pronoun + mild bias towards object antecedent.
El Joan va deixar en ridícul el Dani davant de tothom. ∅ Es va ofendre molt.
“John made fun of Dani in front of everyone. ∅ Was very offended.”
f. Condition 6: Null pronoun + strong bias towards object antecedent.
El Joan va deixar en ridícul el Dani davant de tothom. Per això, ∅ es va ofendre molt.
“John made fun of Dani in front of everyone. That’s why ∅ was very offended.”
g. Condition 7: Overt pronoun + mild bias towards object antecedent.
El Joan va deixar en ridícul el Dani davant de tothom. Ell es va ofendre molt.
“John made fun of Dani in front of everyone. He was very offended.”
h. Condition 8: Overt pronoun + strong bias towards object antecedent.
El Joan va deixar en ridícul el Dani davant de tothom. Per això, ell es va ofendre molt.
“John made fun of Dani in front of everyone. That’s why he was very offended.”
The conditions for each item set were counterbalanced and incorporated into a self-
paced reading experiment together with 24 filler items and 5 practice items. Sixteen coun-
terbalanced lists were constructed (the last eight lists with the items in reverse order), with
a single randomization for all lists. The complete set of experimental items can be seen in
Appendix C.
Procedure: The procedure is the same as explained for Experiment 2. Discourses were
presented on a computer running E-Prime software. Subjects were asked to press
the space bar after each sentence and this is how the reading times for each sentence were
measured. Comprehension questions, probing the resolution of the pronoun, were asked
after each item.
Participants: Thirty-two members of the Universitat Pompeu Fabra community took
part in this experiment. They had not participated in either Experiment 1 or Experiment 2
(they did participate in Experiments 4 and 5).
Results: Table 4.5 contains the results for this experiment. The second column con-
tains the raw reading times; the third column the difference between the Observed and the
Expected reading time and the fourth column the percentage of correct answers.

Condition                       Reading time (ms)   Difference (ms)   % correct
Cond 1: subj + null + mild      2447                -176              92
Cond 2: subj + null + strong    2570                -423              92
Cond 3: subj + pron + mild      3077                288               83
Cond 4: subj + pron + strong    3342                275               85
Cond 5: obj + null + mild       2609                170               81
Cond 6: obj + null + strong     2757                -124              82
Cond 7: obj + pron + mild       2783                97                94
Cond 8: obj + pron + strong     3119                -104              77
The average reading times for the main clause were computed, after eliminating times
that were longer than 7000 ms or shorter than 200 ms (about 3.12% of the total number of
trials). The number in the ‘% correct’ column refers to the percentage of answers in which
participants understood the pronoun as referring to the expected, pragmatically-biased an-
tecedent.
Deviations from regressions were computed to account for length differences, follow-
ing the same method used in Experiment 2 (see Section 3.1) and these results will be the
ones discussed here. Negative numbers indicate that the reading times were shorter than
expected (i.e. they were read faster) and positive numbers that they were longer than ex-
pected (i.e. they were read slower). In the first four conditions, the ones with bias to the
subject, we observe the pattern we have reported so far: the conditions with NSPs are read
faster than expected, while conditions with OSPs are read slower than expected. The differ-
ent level of biasing increases the ease of processing in the conditions with the null pronoun
(Condition 1 vs. Condition 2), while it does not have any significant effect in the condi-
tions with the overt pronoun (Condition 3 vs. Condition 4). The most interesting result
for our purposes is the contrast observed for Conditions 5 and 6: the ones with bias to the
object and NSPs. If the bias is mild (Condition 5), we see some difficulty in processing
(the sentences are read slower than expected). However, if the bias is strong (Condition 6),
this difficulty disappears and the sentence is read faster than expected. This is exactly what
our model predicts: if the bias is strong enough so that the probabilities are switched, the
speaker can use the more economical form and can expect the hearer to process the sentence
without problems. Finally, we see a parallel pattern with the overt pronoun in Conditions
7 and 8, although both the ease and the difficulty of processing are less extreme. Note
that we predict that a speaker should not produce an OSP with a strongly biased sentence
and, therefore, the sentences in Condition 8 should be unnatural. This may be somewhat
reflected in the percentage of correct answers, which is the lowest of all conditions (77%).
However, it seems that the strong bias overrides the conflicting linguistic cue and hearers
can nonetheless process the sentence with ease (although less than in Condition 6, with
the null pronoun). In Condition 8, the strong bias speeds processing, but the conflicting
linguistic cues (the discourse connective vs. the OSP) impair comprehension. In contrast,
in Condition 7, it looks like the OSP is enabling comprehension, albeit at some processing
cost.
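The two preprocessing steps described above, trimming extreme reading times and taking deviations from a length-based regression, can be sketched as follows. This is only an illustration: the exact regression is the one described for Experiment 2 and is not reproduced here, so the sketch assumes a simple least-squares line of reading time on sentence length in characters, and the data points are invented.

```python
def trim(trials, low=200, high=7000):
    """Drop trials whose reading time (ms) is below `low` or above `high`."""
    return [(length, rt) for length, rt in trials if low <= rt <= high]

def fit_line(trials):
    """Ordinary least squares of reading time on length: rt = a + b * length."""
    n = len(trials)
    mean_x = sum(x for x, _ in trials) / n
    mean_y = sum(y for _, y in trials) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in trials)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in trials)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def deviations(trials):
    """Observed minus expected reading time for each trial."""
    intercept, slope = fit_line(trials)
    return [rt - (intercept + slope * length) for length, rt in trials]

# Hypothetical (length in characters, reading time in ms) pairs.
trials = [(20, 2100), (25, 2600), (30, 2900), (35, 3600), (28, 9500), (22, 150)]
kept = trim(trials)
for (length, rt), dev in zip(kept, deviations(kept)):
    print(length, rt, round(dev))
```

A negative deviation corresponds to a sentence that was read faster than its length predicts, and a positive deviation to one that was read more slowly.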
The differences between observed and expected reading times were
submitted to an ANOVA. The effect of the type of pronoun was significant
(F1(1,31) = 7.55, p < 0.01; F2(1,15) = 26.67, p < .001), as was the effect of the type
of biasing, although only marginally by items (F1(1,31) = 7.74, p < 0.01; F2(1,15) = 3.41,
p = .083). In addition, there was a significant bias by pronoun interaction (F1(1,31) = 11.85,
p < 0.001; F2(1,15) = 12.94, p < .001).
These results clearly show that contextual information is crucial in assigning antecedents
to the different referential forms and in processing them. Games of partial information
provide a way of explicitly modeling the context, by assigning probabilities to different
contextual states of affairs and this is one of the reasons why they are highly suitable for
analyzing phenomena such as the one studied in this thesis.
4.4 Mixed strategies and uncertainty
The game of partial information presented in Section 4.3 has proved successful in model-
ing the experimental data from Chapter 3. These experiments have identified significant
tendencies for different pronouns. However, the amount of data that does not follow the
identified tendency is quite large. For example, consider the results for Experiment 1 (the
questionnaire study) repeated in Table 4.6. The dispreferred antecedent was chosen 30%
of the time for NSPs and 35% of the time for OSPs.

Pronoun          Subject antecedent (%)   Object antecedent (%)
null pronoun     70                       30
overt pronoun    35                       65
A natural way to think about this distribution of probabilities for a game theorist would
be to see them as the result of a mixed strategy. In this section, I briefly explain the notion of
a mixed strategy and argue that these strategies cannot be applied to the null/overt pronoun
variation.
Consider the following game (originally from von Neumann and Morgenstern (1944))
between Sherlock Holmes and his enemy Moriarty. Sherlock Holmes wants to go from
London to Dover and to the continent, to escape from Moriarty. Moriarty is aware of
this plan. Holmes, then, has two options: (1) continue with his plan and go to Dover or
(2) change his plan and leave the train at Canterbury, the only intermediate station. His
adversary has the same choice: he can go all the way to Dover or he can stop at Canterbury.
That is, they both have to choose and take into account what the other player might choose.
If they both decide to leave the train at the same stop, Moriarty will certainly catch Holmes
and thus Moriarty will have a positive payoff of 10, while Holmes will have a negative
payoff of −10. If Holmes reaches Dover, while Moriarty leaves the train at Canterbury,
Holmes will be able to temporarily escape and thus will get a positive payoff, say of 5,
while Moriarty will receive a payoff of −5. Lastly, if Moriarty goes all the way to Dover,
while Holmes stays at Canterbury, this is best seen as a tie (0,0) between both players,
since Holmes has so far escaped, but has failed to reach the continent. Note that this is a
zero-sum noncooperative game, that is, the winnings of one player represent losses for the
other player. The game is depicted in Table 4.7.
                          Moriarty’s choices
                          Dover       Canterbury
Holmes’s    Dover         -10, 10     5, -5
choices     Canterbury    0, 0        -10, 10
Table 4.7: Game with a mixed Nash equilibrium
There is no pure Nash equilibrium in this game. If Holmes plays Dover, then
Moriarty should also play Dover. However, in that case, Holmes should deviate and play
Canterbury instead. There is, however, a mixed-strategy equilibrium. Holmes and Moriarty can
randomize and play both actions with a certain probability, so that the other player does not
have a reason to prefer one of the two options. Let p be the probability of Moriarty play-
ing Dover and 1 − p the probability of Moriarty playing Canterbury. The expected payoff
for Holmes to play Dover is calculated in 62a and the one to play Canterbury in 62b. If
p = 3/5, Holmes receives the same payoff in both Dover and Canterbury as shown in 62c
and, thus, has no reason to prefer a particular action. The same reasoning applies to the
other player.
(62) a. -10p + 5(1-p) = -15p + 5
b. 0p + (-10)(1-p) = 10p - 10
c. 10p - 10 = -15p + 5, hence 25p = 15 and p = 3/5
The mixed equilibrium requires Moriarty to leave the train at Dover 60% of the time and
at Canterbury 40% of the time, and requires Holmes to leave the train at Canterbury 60% of
the time and at Dover 40% of the time. If they randomize according to these probabilities,
they prevent their adversary from anticipating their actions (Schelling, 1969).
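The indifference conditions behind this mixed equilibrium can be solved directly from the payoff table. The sketch below is a minimal illustration for the 2x2 game in Table 4.7: p is the probability that Moriarty plays Dover (the value that makes Holmes indifferent), and q, a symbol introduced here for illustration, is the probability that Holmes plays Dover (the value that makes Moriarty indifferent). The helper name is hypothetical.

```python
from fractions import Fraction

D, C = "Dover", "Canterbury"

# Payoffs (Holmes, Moriarty); the first key is Holmes's stop, the second Moriarty's.
GAME = {
    (D, D): (-10, 10),
    (D, C): (5, -5),
    (C, D): (0, 0),
    (C, C): (-10, 10),
}
H = {k: v[0] for k, v in GAME.items()}   # Holmes's payoffs
M = {k: v[1] for k, v in GAME.items()}   # Moriarty's payoffs

def indifference_mix(a_dd, a_dc, a_cd, a_cc):
    """Probability x of the opponent playing Dover that equalizes a player's options.

    a_dd, a_dc: the player's payoff for Dover against the opponent's Dover / Canterbury
    a_cd, a_cc: the player's payoff for Canterbury against the opponent's Dover / Canterbury
    Solves x*a_dd + (1-x)*a_dc = x*a_cd + (1-x)*a_cc for x.
    """
    return Fraction(a_cc - a_dc, (a_dd - a_dc) - (a_cd - a_cc))

# Moriarty's mix p makes Holmes indifferent between Dover and Canterbury.
p = indifference_mix(H[D, D], H[D, C], H[C, D], H[C, C])
# Holmes's mix q makes Moriarty indifferent; keys are (Holmes, Moriarty).
q = indifference_mix(M[D, D], M[C, D], M[D, C], M[C, C])

print("Moriarty plays Dover with probability", p)
print("Holmes plays Dover with probability", q)
```

For the payoffs in Table 4.7 this gives p = 3/5 and q = 2/5, so each player's randomization removes any reason for the other to prefer one stop.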
Now the interesting question for our purposes is the following: is it plausible to think
that the results of Experiment 1 are a product of players playing a mixed strategy? In what
follows, I present what a game with a mixed Nash equilibrium solution would look like and
the predictions such a game would make.
Let’s assume first that the speaker uses an NSP. Again, the speaker may have used
this pronoun to refer to the previous subject or to refer to the previous object, while the
hearer needs to decide how to interpret the pronoun. Furthermore, let’s assume we have no
information about the payoffs, other than the fact that if speakers and hearers coordinate,
the payoffs will be positive and if they don’t the payoffs will be negative. This is depicted
in Table 4.8.⁵
                        Hearer’s choices
                        Subject      Object
Speaker’s    Subject    +s1,+h1      -s2,-h2
choices      Object     -s3,-h3      +s4,+h4
Table 4.8: Potential mixed strategy game
⁵ ‘s’ stands for speaker and ‘h’ stands for hearer.
Let’s now take the results from Experiment 1 and use them to represent the probabilities
with which the hearer chooses a particular interpretation. That is, the hearer chooses
Subject with a probability of 0.7 and Object with a probability of 0.3. If this is a mixed
strategy, this means that the payoffs for the speaker have to be equivalent in both strategies.
That is, s1 · 0.7 = s4 · 0.3. This equation will not give us fixed values for the payoffs, but will
give us one payoff as a function of the other; more specifically: s1 = s4 · 0.3 / 0.7
≈ 0.43 · s4. This predicts that the payoff for assigning the NSP an object antecedent
should be roughly twice as much as that for assigning it a subject antecedent. This is not a plausible
assumption given that the experimental results point to the opposite direction: subjects are
the preferred antecedents for NSPs.
Consider now the case in which the speaker uses an OSP. The hearer’s probabilities
are the following: the hearer chooses Subject with a probability of 0.35 and Object with a
probability of 0.65. Doing the same calculations as before, the payoffs for the speaker have
to be such that the following equality holds: s1 · 0.35 = s4 · 0.65 and, therefore, s1 ≈ 1.86
· s4. This predicts that the payoff for interpreting the OSP as the subject should be roughly
twice as much as that for interpreting it as the object. Again, this is not plausible, since the
empirical pattern is just the opposite.
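The arithmetic behind these two implausible predictions can be reproduced in a few lines. The sketch below simply solves the indifference condition s1 · P(Subject) = s4 · P(Object) used above for the ratio between the two payoffs, with the hearer's probabilities taken from Experiment 1; the function name is illustrative.

```python
def implied_payoff_ratio(prob_subject):
    """Ratio s4/s1 implied by the condition s1 * P(Subject) = s4 * P(Object)."""
    prob_object = 1 - prob_subject
    return prob_subject / prob_object

for form, prob_subject in (("NSP", 0.70), ("OSP", 0.35)):
    ratio = implied_payoff_ratio(prob_subject)
    print(f"{form}: s4 = {ratio:.2f} * s1, s1 = {1 / ratio:.2f} * s4")
```

In both cases the dispreferred interpretation would need roughly twice the payoff of the preferred one, which is the opposite of what the experimental preferences suggest.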
Although the results of the experiment may look at first sight as if they are a product
of mixed strategies, this would make very odd predictions about what the payoffs for the
different options should look like. The fundamental difference between the Holmes and
Moriarty game and the pronoun game is that the first is uncooperative, while the second is
cooperative. That is, the point of using mixed strategies in the Holmes and Moriarty game
is to confuse the other player so that he does not have a reason to prefer one action over the
other. This is not how language works. A speaker is cooperative because it is in his best
interest to be understood, unless he is trying to deceive the hearer.⁶ A mixed strategy could
be used to model cases of stable sociolinguistic variation (see for example Labov (1994)),
that is, situations in which two forms with the same meaning coexist, but not cases in which
the two forms (1) can potentially convey different meanings and (2) are associated with
different costs, such as the anaphora case.
⁶ The game theoretical analysis of indirect speech by Pinker et al. (2008) is consistent with this idea. A
speaker may want to be somewhat misleading and use indirect speech, which is less efficient and more costly
than direct speech, but which can be “plausibly denied”. For instance, using indirect speech to convey a bribe
is the optimal choice if the speaker thinks he might be talking to an honest officer, who will not accept the
bribe, because the speaker can deny the bribe ever took place, while he would not be able to do that if direct
speech is used.
Although further work should be done to explain the variation found in the experimental
data, we can reject the idea that it is a product of agents playing mixed strategies and I
would like to entertain two alternative explanations. One possible explanation would be
that there is some uncertainty regarding the probabilities that speaker and hearer assign to
different information states. In some cases, their probabilities might not exactly coincide:
for instance, if the Speaker assigns p1 = 0.6 and the Hearer assigns p1 = 0.4, the speaker
will use an NSP to refer to the subject, but the hearer will interpret it as referring to the
object.
Consider the following discourse from Nesson et al. (2008):
(63) a. My dog has been getting quite obstreperous lately.
b. I took him to the groomer yesterday.
c. He hates him.
d. In fact, he tried to bite him last month.
e. In fact, he always tries to schedule his appointments when the other groomer is
on duty.
The third sentence is ambiguous and the following utterance in the discourse (either 63d
or 63e) disambiguates the pronoun. In spite of the temporary ambiguity, these discourses
are not perceived to be incoherent. Nesson et al. (2008) propose that this is due to the fact
that the salience ranking of the two entities is not clearly fixed (neither referent appears
in subject position in 63b) and it may be that hearer and speaker have different rankings
or, in our terms, different probabilities. If the most likely referent is the same for both
participants, the discourse will proceed without any problem. However, if it is not the
same, the hearer will have to backtrack and correct his antecedent assignment in light of
the contextual information. In fact, miscommunication regarding anaphora interpretation
does occur in natural conversation. Speakers and hearers mostly understand each other, but
communication is not always perfect. Thus, it is plausible that this type of mismatch in
probability assignment happens occasionally.
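The mismatch scenario can be made concrete with a small sketch in which each participant independently applies the Pareto-Nash pairing that follows from his or her own estimate of p1. It assumes the payoffs from Section 4.3 (so the cheaper NSP always goes to the likelier antecedent) and, as an arbitrary choice made here, resolves exact ties in favor of the subject; the function names are illustrative.

```python
def convention(p1):
    """Pairing of pronoun forms and antecedents for a given estimate of p1.

    Assumes NSP = 10 and OSP = 8, so the NSP is paired with the likelier
    antecedent; a tie (p1 = 0.5) is resolved towards the subject by assumption.
    """
    if p1 >= 0.5:
        return {"NSP": "Subj", "OSP": "Obj"}
    return {"NSP": "Obj", "OSP": "Subj"}

def communicate(intended, p1_speaker, p1_hearer):
    """Return (form chosen by the speaker, antecedent chosen by the hearer)."""
    speaker_map = convention(p1_speaker)
    form = next(f for f, ref in speaker_map.items() if ref == intended)
    return form, convention(p1_hearer)[form]

print(communicate("Subj", 0.6, 0.4))   # estimates fall on opposite sides of 0.5
print(communicate("Subj", 0.6, 0.7))   # both participants rank the subject highest
```

In the first call the speaker produces an NSP for the subject and the hearer resolves it to the object, which is the kind of miscommunication that forces backtracking; in the second call the interpretation matches the intended referent.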
If miscommunication due to anaphora does occur in natural conversation, with plenty
of contextual information, it is not surprising to find a great amount of variation in experi-
mental settings, in which context is very limited. In fact, the items used in the experiments
had no previous context because pronouns are highly sensitive to the previous context and
the goal of the experiments was to find out the pronoun preferences precisely in the absence
of contexts. Participants had to estimate the probabilities of information states without any
contextual cues and, therefore, it is not surprising that the results showed some amount of
variation, although with very clear tendencies.
Another possible way of explaining the variation found in the experiments would be to
think that there is some degree of uncertainty in the payoffs. That is, it might be that agents
don’t assign a fixed value as payoff, but can only estimate the range in which the payoff is
found. For example, consider a situation in which the agents assign a payoff of 10 for the
null pronoun and a payoff in the range of 8 to 9 to the overt pronoun. In this situation, if
p1 = 0.70, the Null-Overt⁷ strategy would yield an expected payoff in the range 9.4-9.7 and
the Overt-Null strategy an expected payoff in the range 8.6-9.3. Therefore, the Null-Overt
strategy is still the Pareto-Nash equilibrium. However, consider what happens if p1 = 0.6.
In this case, the Null-Overt strategy would yield an expected payoff in the range 9.2-9.6
and the Overt-Null strategy an expected payoff in the range 8.8-9.4. That is, there is some
overlap in the payoffs and in such cases there is no single Pareto-Nash equilibrium. Figure
4.4 plots the range of payoffs for the two strategies against the different probabilities. The
descending band represents the payoffs for the Null-Overt strategy and the ascending band
the payoffs for the Overt-Null strategy. The dark area represents the area where there is
some overlap and, thus, there is not a single Pareto-Nash equilibrium which is the solution
of the game. Conversational agents may randomize if the payoffs of their strategies are
found in this dark area and that would explain some of the variation that we find in the
experiments.
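The payoff ranges and their overlap can be computed directly. The sketch below assumes, as in the example above, a fixed payoff of 10 for the null pronoun and a payoff somewhere between 8 and 9 for the overt pronoun; it represents each payoff as a (minimum, maximum) pair, and the function names are illustrative.

```python
from fractions import Fraction

NSP = (Fraction(10), Fraction(10))   # fixed payoff, written as a degenerate range
OSP = (Fraction(8), Fraction(9))     # payoff only known to lie between 8 and 9

def payoff_range(p1, subj_form, obj_form):
    """(min, max) expected payoff when subj_form / obj_form refer to subject / object."""
    p2 = 1 - p1
    return (p1 * subj_form[0] + p2 * obj_form[0],
            p1 * subj_form[1] + p2 * obj_form[1])

def overlap(p1):
    """True if the Null-Overt and Overt-Null ranges share at least one value."""
    lo_a, hi_a = payoff_range(p1, NSP, OSP)   # Null-Overt
    lo_b, hi_b = payoff_range(p1, OSP, NSP)   # Overt-Null
    return lo_a <= hi_b and lo_b <= hi_a

for p1 in (Fraction(7, 10), Fraction(6, 10)):
    print(p1, payoff_range(p1, NSP, OSP), payoff_range(p1, OSP, NSP), overlap(p1))
```

With these particular ranges, the two strategies overlap whenever p1 lies between 1/3 and 2/3, so a single Pareto-Nash equilibrium is only guaranteed outside that interval.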
As I mentioned before (see Section 4.1.1), the particular values that I have used to
illustrate my analysis are not meaningful and they only indicate preferences. Giving ranges
of values, instead of a single value, to payoffs is a way of saying that preferences may have
some degree of vagueness and this can become relevant if the probabilities of the different
information states are not very different and the expected payoffs have some degree of
overlap.
⁷ I use the following abbreviation: X-Y strategy means use X to refer to the subject and Y to refer to the
object, where X and Y can be null or overt pronouns.
Figure 4.4: Expected payoffs
4.5 Conclusion
In this chapter, I have presented a game theoretical analysis of the data obtained through
the experiments presented in Chapter 3. The results of my experiments supported the PAH,
which is a principle that can easily be modeled using games of partial information. Moreover,
by translating a principle such as the PAH to a game of partial information, we are able
to express when the biases predicted by the PAH should be obeyed and when they should
be violated. Experiment 3 confirms the prediction made by the game theoretical model that
by manipulating the context, probabilities can be shifted and NSPs can become felicitous
to refer to a less salient antecedent. That is, the preferences of NSPs in the absence of
context are easily overridden in presence of context. A powerful model which associates
probabilities with information states can easily accommodate these changing preferences.
Finally, I have argued against analyzing the variation we find in the experimental data as a
result of participants playing mixed strategies; instead, it is a result of their probabilities
not perfectly matching in some circumstances or of the payoffs not having a fixed value,
but rather a range of values.
Pragmatic choices, including production and interpretation choices of referring expres-
sions, are essentially rational choices, while cases of sociolinguistic variation or linguistic
change are not necessarily so. Rationalistic game theory is a good framework to deal with
the former, while evolutionary game theory, which deals with behavior rather than with
rationality, could be a good framework to deal with the latter.
In this chapter, I presented a first step towards constructing a game theoretical model
that speakers and hearers use to choose and interpret anaphoric forms. The mathematics
behind game theory, and in particular behind the Pareto-Nash equilibrium, determines the
outcome of the game. Therefore, the task of the theorist is to propose a model that can
capture the empirical data and specify its necessary ingredients: payoffs, probabilities,
information states, choices and information sets. Given that game theory provides us with
a powerful mechanism, it is important to justify well the ingredients of the model. In order
to do so, I follow two criteria: (1) payoffs are assigned following two simple criteria and
are kept constant throughout this thesis and (2) probabilities are always estimated following
the same method (i.e. using corpora counts).
The goal of this thesis is to present the empirical data obtained through experiments
and to specify the ingredients of the game theoretic model. In this chapter, I have presented
an initial model which captures the tendencies of the data presented so far. In the next
chapters, I present additional empirical evidence which calls for a redefinition of some
aspects of the model.
Chapter 5
Pragmatic structure and pronouns: topic, link and focus
In the preceding chapters, I have examined the idea that there is a special relationship
between NSPs and subjecthood, which is well supported by the experiments reported in
Chapter 3. However, it is possible that this relationship is a byproduct of the pragmatic
structure of the sentence. This chapter aims to establish whether the referring preferences
of pronouns are determined by their pragmatic status. I start with the relationship between
pronouns and links in Section 5.1 and then I examine the relationship between pronouns
and focus in Section 5.2. Also, since we know that there is a relationship between reduced
anaphoric forms and salience, we can gain insight into which factors compose salience,
particularly because in Catalan there is not just one reduced anaphoric form, but two of
them.
Regarding the relationship between pronouns and links, I argue that both syntax and
pragmatics have an effect on the referring preferences of pronouns: both add to salience,
but the former has a larger weight than the latter. NSPs refer to the most salient an-
tecedent, which is always the subject, even if it is not the link (contra Frana (2007) or
Samek-Lodovici (1996)). In contrast, OSPs refer to a non-salient antecedent, if there is
one.
As for the relationship between pronouns and focus, I argue that focal overt pronouns
are ambiguous and do not show the same referring preferences as non-focal overt pronouns.
I analyze this ambiguity and propose a model to derive the inability of NSPs to carry focal
information.
5.1 Topics and links
I assume Vallduvı́’s (1992) proposal (see section 2.2), according to which a sentence is
divided into focus and ground, where the ground is further divided into link and tail. The
link indicates where the focal information should go (in which file, following the terminol-
ogy from File Change Semantics (Heim, 1983)) and is represented by preverbal material
in Catalan. In the experiments presented so far, the only preverbal material in the sen-
tences was the subject. Therefore, in all the experimental items the subject of the sentences
overlapped with the link, while the object was always non-link material. The results from
Experiment 1 are thus compatible with two explanations: the preference of the NSP may be
syntactic in nature (preference for a previous subject) or it may be pragmatic (preference
for a previous link). Carminati’s hypothesis was cast in purely syntactic terms and she
argues that this access to syntactic information is crucial in cases of non-referential uses,
in which pronouns do not refer to discourse referents, since they act as bound variables.
However, inter- and intrasentential anaphora may work at different linguistic levels and,
even intrasententially, none of Carminati’s experiments were designed to be able to dis-
tinguish the pragmatic from the syntactic hypothesis. In the next section, I review several
pieces of related work that address this question.
5.1.1 Related work: Italian pronouns
Vallduvı́ (1992) argued for the pragmatic hypothesis and his proposal is that, while NSPs
inherit the previous topic, OSPs act as links and, thus, change the topic of the sentence.
Frana (2007) pursued a similar idea, which she calls the Discourse-Prominence Hypothesis
of Antecedent Assignment (DPH): NSPs have a link preference. Frana (2007) also en-
tertains the Anti-Topic Hypothesis according to which OSPs decrease their preference for
non-subject antecedents, when this position correlates with a link. She tested the DPH by
performing an experiment very similar to Carminati’s Experiment 1, but manipulating the
items so that, according to her analysis, the immediately preceding subject does not always
coincide with the link. The details of the experiment are as follows:
Materials: the materials consisted of twenty two-sentence passages with four condi-
tions. The first sentence introduces an individual by proper name (Referent 1). The second
sentence is a complex sentence. In the subordinate clause, a new individual is introduced in
subject position (Referent 2), while Referent 1 is repeated, either by a full DP or by a clitic.
The main clause contains either an NSP or an OSP in subject position. The content of the
second sentence is not pragmatically biased to refer to one of the two referents. Thus, the
four conditions are:
(64) a. Cond 1: full DP + null

La signora Rossi è una persona molto maleducata che non merita alcun riguardo.
Quando Maria incontra la signora Rossi per strada, ∅ fa sempre finta di non
vederla.
“Mrs Rossi is a very rude person that does not deserve any regard. When Maria
sees Mrs Rossi in the street, ∅ always pretends not to see her.”
b. Cond 2: full DP + overt

Quando Maria incontra la signora Rossi per strada, lei fa sempre finta di non
vederla.
“When Maria sees Mrs Rossi in the street, she always pretends not to see her.”
c. Cond 3: clitic + null

Quando Maria la incontra per strada, ∅ fa sempre finta di non vederla.
“When Maria her-sees in the street, ∅ always pretends not to see her.”
d. Cond 4: clitic + overt

Quando Maria la incontra per strada, lei fa sempre finta di non vederla.
“When Maria her-sees in the street, she always pretends not to see her.”
Frana assumes that this manipulation (clitic versus proper name) is able to distinguish
subject from link. The clitic in conditions 3 and 4 is supposed to reinforce the DP it corefers
with and its discourse referent and, as a consequence, reinforce its topical status. In these
cases, Frana assumes that the clitic is the link of the sentence, although it is not the subject.
In contrast, the proper name in conditions 1 and 2 is supposed to not reinforce the DP and,
consequently, the subject, and not the object, is supposed to be the link of the sentence.
Carminati (2002) predicts that this manipulation should not produce any effect and that
conditions 1 and 3, on the one hand, and conditions 2 and 4, on the other hand, should
behave in the same way. In contrast, if the DPH is correct, the prediction is that conditions
1 and 3 should show a different pattern: the NSP should prefer the subject antecedent in
condition 1 and the object antecedent in condition 3. As for condition 4, Carminati predicts
object antecedent, while the Anti-Topic Hypothesis would predict subject antecedent.
Procedure: Four counterbalanced versions of the questionnaire were created. 32 Italian
native speakers completed the questionnaire via e-mail.
Results: The results can be seen in Table 5.1.

                            Subject   Object
Cond 1: null + full DP         70        30
Cond 2: overt + full DP        27        73
Cond 3: null + clitic          35        65
Cond 4: overt + clitic         16        84
Table 5.1: Results in Frana (2007)
The results for condition 1 and condition 2 are parallel to Carminati’s results for Italian
or my results for Catalan. When the subject acts as a link, NSPs have a subject preference
and OSPs have an object preference. However, the pattern for condition 3 is different.
The NSP does not refer to the immediately preceding subject, but to the subject of the first
sentence, which appears as an object clitic in the subordinate clause of the second sentence.
Frana (2007) takes this as supporting evidence for the DPH. In addition, the Anti-Topic
Hypothesis does not get support from the data, as shown by the results in condition 4. As
in condition 3, in condition 4, OSPs preferably refer to the previous object, when it acts as
a link, in a proportion that is even larger than when the object is not the link (condition 2).
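The comparison between the predictions and the results can be made explicit with a short sketch. It assumes that the two figures in each row of Table 5.1 are the percentages of subject and object antecedent choices, in that order (this reading is inferred from the discussion above), and it encodes only the predictions stated in the text: the PAH for all four conditions, the DPH for conditions 1 and 3, and the Anti-Topic Hypothesis for condition 4. The function and variable names are illustrative only.

```python
# Percentages of subject vs. object antecedent choices, copied from Table 5.1
# (column order assumed: subject first, object second).
RESULTS = {
    1: {"pronoun": "null", "subject": 70, "object": 30},
    2: {"pronoun": "overt", "subject": 27, "object": 73},
    3: {"pronoun": "null", "subject": 35, "object": 65},
    4: {"pronoun": "overt", "subject": 16, "object": 84},
}

# Only the predictions explicitly stated in the text are encoded.
PREDICTIONS = {
    "PAH": {1: "subject", 2: "object", 3: "subject", 4: "object"},
    "DPH": {1: "subject", 3: "object"},
    "Anti-Topic": {4: "subject"},
}


def observed_preference(condition: int) -> str:
    """Return the antecedent chosen by the majority in a condition."""
    row = RESULTS[condition]
    return "subject" if row["subject"] > row["object"] else "object"


def check(hypothesis: str) -> dict:
    """Map each condition covered by a hypothesis to True/False, depending
    on whether the predicted antecedent is the one preferred in the data."""
    return {
        cond: observed_preference(cond) == predicted
        for cond, predicted in PREDICTIONS[hypothesis].items()
    }


if __name__ == "__main__":
    for name in PREDICTIONS:
        print(name, check(name))
```

Under these assumptions, the PAH fails only in condition 3, the DPH matches both of the conditions it addresses, and the Anti-Topic Hypothesis fails in condition 4. Note that the check only looks at the direction of the majority preference, not at its strength.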
I take these results as evidence that the relationship between syntactic position and type
of pronoun is not as straightforward as proposed in Carminati (2002). However, I am not
convinced that these results are conclusive evidence for the DPH. It can be argued that
Frana’s experimental items are not manipulating the information structure of the sentence,
which in Romance languages correlates with word order. Following Vallduvı́’s approach,
if the update of the information is done clause by clause, the link of the experimental items
by the time the pronoun is reached is the subject, Maria, in all four conditions. In Frana’s
results, it is very striking that in conditions 3 and 4, there does not seem to be a way to refer
to the subject of the subordinate clause. That is, when there is a clitic coreferential with a
previous antecedent, both pronouns have a preference for this referent. In what follows, I
present two hypotheses as to why this is the case, one taking into account the previous discourse
and the other referring to subsequent discourse.
First, Carminati’s experimental items were all concerned with intrasentential anaphora
and all of them consisted of a single, complex sentence. My own experimental items ad-
dress intersentential anaphora and all consist of multiple simple sentences. In contrast,
Frana’s items have multiple sentences, one of which is complex. Thus, in her items, both
intersentential and intrasentential anaphora play a role. An alternative explanation of the re-
sults could be that a pronominal clitic in a subordinate clause is a sign for future anaphoric
pronouns to ignore this clause and look for a referent in some previous point of the dis-
course. That is, clitics may signal that resolution needs to be intersentential. If this is so,
there is only one referent available in the previous sentence and this is what both pronouns
end up referring to. In other words, the choice of referring expressions may affect discourse
segmentation. A DP or a full noun phrase signals a new discourse segment, while a clitic
signals a continuing discourse segment.
Of course, it is possible to remove the complexity added by pronouns potentially coreferring
both within and across sentences by constructing items such as the ones in 65, in which
all potential coreferential relations are across sentences.
(65) a. La signora Rossi è una persona molto maleducata che non merita alcun riguardo.
Maria incontra la signora Rossi spesso. (Lei) fa sempre finta di non vederla.
Mrs Rossi is a very rude person that does not deserve any regard. Maria sees
Mrs Rossi often. (She) always ignores her.
b. La signora Rossi è una persona molto maleducata che non merita alcun riguardo.
Maria la incontra spesso. (Lei) fa sempre finta di non vederla.
Mrs Rossi is a very rude person that does not deserve any regard. Maria sees
her often. (She) always ignores her.
Second, it could be that clitics trigger certain expectations about how the discourse will
continue. Since Frana’s sentences have quite a broad context, the concept of discourse
topic may be playing an important role in determining pronoun preferences. According to
Asher and Lascarides (2003), the discourse relation of Narration requires that there be a
d-topic. In conditions 3 and 4, the clitic signals that the subject of the previous sentence,
‘la signora Rossi’, is the d-topic of the Narration. Thus, it is expected that this d-topic
will be maintained and that the speaker will add some more information about it. In other
words, the fact that ‘la signora Rossi’ is interpreted as a d-topic is responsible for the fact
that in both conditions 3 and 4 the pronoun must be coreferential with it. In contrast,
in conditions 1 and 2, there are not enough linguistic cues to construe ‘la signora Rossi’
as a d-topic, since there is no clitic reinforcing the subject of the previous sentence and,
moreover, the name is repeated. A more general d-topic (such as, ‘what’s happening in our
neighborhood’) is constructed and the coreferential pattern follows Carminati’s Position of
Antecedent Hypothesis. In other words, a clitic can act as some sort of cataphoric marker,
triggering the expectation that something else will be added about its referent.¹ Both the
overt and null pronoun fulfill this expectation and are interpreted as adding information
about the clitic referent. In order not to fulfill this expectation and change the d-topic, a
stronger cue than a pronominal form would be needed. This stronger cue could be a definite
description or a proper name, as shown in 66. The missing coreferential pattern is achieved
by placing a proper name in the main clause and a null pronoun in the subordinate clause.
(66) Quando ∅ la incontra per strada, Maria fa sempre finta di non vederla.
When ∅ sees her on the street, Maria always pretends not to see her.
1
Thanks to Aviad Eilam for discussion of this point.
Finally, it is possible that this manipulation is not related to topicality at all, since it has
been suggested that pronominalization is one of the factors that contributes to the complex
concept of ‘salience’. Kameyama (1999) claims that pronominalized non-subjects gain in
salience by virtue of being pronominalized and that they compete in salience with a non-
pronominalized entity in subject position. However, this claim is partially disconfirmed
by one of Carminati’s experiments. In her experiment, she tested the reading times of
non-ambiguous sentences which contained a clitic pronoun vs. name manipulation, as in
67.
(67) a. Condition a: subject antecedent + name

Quando Maria cerca Roberto, ∅ diventa ansiosa
“When Maria looks for Roberto, ∅ becomes anxious (fem).”
b. Condition b²: object antecedent + clitic

Quando Maria lo cerca, ∅ diventa ansioso
“When Maria him looks for, ∅ becomes anxious (masc).”
Condition a was read faster than condition b,³ while the Discourse Prominence Hypothesis
would predict the opposite.
Frana’s study is the only study I am aware of that tries to distinguish linkhood from
subjecthood in a null-subject language based on experimental data. As mentioned in Sec-
tion 2.4.2, other authors support the idea that linkhood is responsible for the distribution of
NSPs and OSPs. Several corpus studies using Centering Theory (see DiEugenio (1998) and
Dimitriadis (1996)) assume that OSPs are used to mark a non-default transition. However,
since subject and link largely overlap in corpus data, it is again difficult to decide which
2
This was condition c in Carminati’s experiment. She tested other factors which are not relevant for our
purposes here.
3
The average reading time for condition a was 1358 ms., while it was 1537 ms. for condition b.
of the two approaches makes the best predictions. From a more theoretical perspective,
Samek-Lodovici (1996) also argues for linkhood as the factor regulating the distribution of
NSPs and OSPs. His evidence is based on the contrast between the passive in 68 and the
wh-question in 69 in Italian in terms of their ability to license NSPs. He argues that the
agent of a passive sentence cannot license a null pronoun, while the agent of a wh-question
can. He assumes that the difference is that, in the passive, the subject, and not the referent
in the oblique by-phrase, is the link. In contrast, in the wh-question, the referent of the
by-phrase is the only non wh-constituent and, thus, should be a link.
(68) a. Questa mattina, la mostra è stata visitata da Gianni.
“This morning the exhibition was visited by John.”
b. Più tardi, *∅/egli/lui ha visitato l’università.
“Later on, *∅/he/he has visited the university.”
(69) a. Quali mostre sono state visitate da Gianni?
“What exhibitions were visited by John?”
b. Recentemente ∅/??egli/*lui ha visitato la mostra di Klee e di Miró.
“Recently ∅/??he/*he has visited the exhibits by Klee and Miró.”
I would like to note here that, to the extent that passive sentences are natural in Catalan,
I find an NSP acceptable in a context like the one in 68, in which the pronoun has only
one potential antecedent. This is in line with the game theoretical approach presented
in Chapter 4 which predicts that whenever possible the more economical form should be
used. As far as Italian is concerned, it seems to present some grammatical constraints
on the distribution of NSPs, which are absent in Catalan (see Chapter 6 for some more
considerations regarding cross-linguistic differences in pronoun distribution).
Unlike Samek-Lodovici (1996), Calabrese (1985), in line with Carminati (2002), argues
in favor of the subject hypothesis on the basis of sentences like 70, in which the direct object
has been left-dislocated and, thus, occupies the link position. According to Calabrese, the
NSP corefers with the subject and not with the left-dislocated constituent.
(70) a. Mario_j, Sandro_i l’ha incontrato per strada ieri.
Mario_j, Sandro_i met him in the street yesterday.
b. Appena ∅_i l_j’ha visto, ∅_{i,*j} è arrossito
As soon as ∅_i saw him_j, ∅_{i,*j} blushed
Samek-Lodovici (1996) argues that it is not obvious what should be considered the link
in the first sentence of 70. I agree with this criticism; since there is a preverbal subject
in 70a, the two preverbal constituents might have some topical status. Also, although this
is not discussed by either Samek-Lodovici or Calabrese, the judgments of discourses like
70 are far from absolute and, thus, experiments should be carried out to find out about the
general preferences of speakers.
To sum up, there seem to be arguments both for the subject and the link hypothesis,
although the discussion is far from being settled. I have also pointed out that it is difficult
to construct experimental items which clearly separate subjecthood from linkhood, without
adding other factors which might affect the results (such as constructing one of the ref-
erents as a discourse topic or mixing different levels in the resolution of anaphora). My
experiments address these issues. Before presenting them, I will review some more work
which deals with related phenomena in Finnish and English.
5.1.2 Related work in other languages
Kaiser and Trueswell (2008) have studied the interpretation of pronouns and demonstra-
tives in Finnish. Finnish is a partial null-subject language (Holmberg et al., 2009), which
allows 3rd person null subjects in very restricted circumstances. However, Finnish, like
Catalan, has two types of third person anaphors: the pronoun hän, ‘s/he’, and the demonstrative
tämä, ‘this’. Additionally, as in Catalan, word order in Finnish is flexible: SVO
is the default word order, but OVS sentences are also possible and felicitous when the ob-
ject is discourse-old information and the subject is discourse-new information. Therefore,
Finnish and Catalan map informational status with sentence position very similarly. Kaiser
and Trueswell carried out a sentence completion task. Participants were presented with
small discourses of three sentences. The third sentence was either SVO or OVS and the
fourth sentence started either with the pronoun hän or with the demonstrative tämä. A
sample item is given in 71:
(71) a. Sentence 1: Nina was shopping at the grocery store.

Sentence 2: While waiting in line, she saw a cook with a white hat behind her
b. Sentence 3a. SVO: The cook-subj pushed a baker-obj at the back of the line
Sentence 3b. OVS: The cook-obj pushed the baker-subj at the back of the line
c. Sentence 4a. Hän ...

Sentence 4b. Tämä ...
The results from this study can be seen in Table 5.2. Since tämä is a demonstrative, some of
the continuations used it in this way. Also, when the continuation was ambiguous, it was
coded as ‘unclear’.
             Subject   Object   Demonstrative   Unclear
SVO Hän         64       13           0            23
SVO Tämä         0       88           9             3
OVS Hän         64       13           0            23
OVS Tämä        44        0          30            17
Table 5.2: Results in Kaiser and Trueswell (2008)
The results show that the pronoun hän is sensitive primarily to syntactic role and has a
subject preference regardless of word order. In contrast, the demonstrative is sensitive to
word-order. It prefers postverbal referents, but this preference is modulated by the syntactic
role of the antecedent: it prefers objects to subjects. It is remarkable that in the last condi-
tion, OVS-Tämä, there were many uses of tämä as a demonstrative (for example, ‘this was
fun’), indicating that the anaphoric use of tämä is less felicitous in the OVS condition than
in the SVO condition. Kaiser and Trueswell (2008) take these results to show that salience
cannot be described by a single-factor concept, but rather requires a model with multiple
constraints, in which referential forms can show different degrees of sensitivity to different
factors.
In another study, Kaiser (2006) studied how focalization affects subsequent pronoun in-
terpretation. This sentence completion study manipulated whether the focused constituent
was the subject or the object and whether focalization was only semantic or also structural
(with a cleft). A summary of the conditions can be seen in 72.
(72) a. The maid scolded the bride.
b. (SVO + Object=focus) No that’s wrong. She scolded the secretary. She...
c. (SVO + Subject=focus) No that’s wrong. The secretary scolded her. She...
d. (cleft + Object=focus) No that’s wrong. It was the secretary that she scolded.
She...
e. (cleft + Subject=focus) No that’s wrong. It was the secretary who scolded her.
She...
She found a subject preference across the board, regardless of whether the subject was
a link or a focus. However, the preference was stronger in the first condition (SVO + Ob-
ject=focus) than in the others. She suggests that these results show that subjecthood makes
both topics and focus good antecedents for subsequent pronouns, but that this effect can
be mitigated by other factors. Subject preference is not as strong in the other three con-
ditions, because of the other factors: the structural focusing of the object in the condition
‘cleft + Object=focus’ and the fact that the other potential antecedent is pronominalized in
the conditions ‘Subject=focus’, which takes away salience from the subject. Thus, salience
is once again seen as a multiple-factor system, which is computed by the interaction of
several factors.
5.1.3 Experiment 4: questionnaire experiment
Experiment 4 aims to test whether the different referring preferences of NSPs and OSPs
are due to syntactic factors (preference for antecedents in particular syntactic positions:
subject vs. non-subject) or to pragmatic factors (preference for antecedents belonging to
different pragmatic categories: link vs. non-link). As mentioned in Section 2.3, information
structure is encoded through syntactic position in Catalan. Preverbal elements are links and
they may take different shapes. Most often, links are syntactically encoded through subjects
and the resulting sentence then has SVO order. However, links can also be realized by left-
dislocated objects and, in this case, the resulting sentence has OVS order. Therefore, by
manipulating the syntactic order, we can differentiate linkhood from subjecthood and test
what drives the preferences of the different types of pronouns.
Materials: the materials consisted of sixteen three-sentence discourses with four con-
ditions. Since OVS sentences are unnatural without context, all sentences were preceded
by a question asking about the referent mentioned preverbally in the second sentence. The
second sentence introduces two individuals by means of two proper names of the same
grammatical gender, one in subject position and the other in object position. The content of
the third sentence is not pragmatically biased to refer to one of the two referents. In two
of the conditions, the second sentence has an SVO order, in which the subject is the link of
the sentence; in the other two conditions, the second sentence has an OVS order, in which the
object is the link and the subject is new information. The subject of the third sentence is
either an NSP or an OSP. Conditions 1 and 2 follow the same structure as the two conditions
in Experiment 1. Thus, the four conditions are:
(73) a. Cond 1: SVO + Null

A: Què li va passar a la Marta?
A: “What happened to Marta?”

B: La Marta escrivia sovint a la Raquel. ∅ Vivia als Estats Units.
B: “Marta wrote frequently to Raquel. ∅ Lived in the United States.”
b. Cond 2: SVO + Overt

A: Què li va passar a la Marta?
A: “What happened to Marta?”

B: La Marta escrivia sovint a la Raquel. Ella vivia als Estats Units.
B: “Marta wrote frequently to Raquel. She lived in the United States.”
c. Cond 3: OVS + Null

A: Què li va passar a la Raquel?
A: “What happened to Raquel?”

B: A la Raquel, l’escrivia sovint la Marta. ∅ Vivia als Estats Units.
B: “To Raquel, Marta wrote (to her) frequently. ∅ Lived in the United States.”
d. Cond 4: OVS + Overt

A: Què li va passar a la Raquel?
A: “What happened to Raquel?”

B: A la Raquel, l’escrivia sovint la Marta. Ella vivia als Estats Units.
B: “To Raquel, Marta wrote (to her) frequently. She lived in the United States.”
The conditions for each item set were counterbalanced and incorporated into a ques-
tionnaire experiment together with 24 filler items (some of them belonging to Experiment
5) and 5 practice items. Eight counterbalanced lists were constructed (the last four lists
with the items in reverse order), with a single randomization for all lists. The complete set
of experimental items can be seen in Appendix D.
Note that, as in Experiment 1, these items deal exclusively with intersentential anaphora,
unlike Frana’s experimental items. The context is also kept quite minimal to avoid the con-
struction of a discourse topic. Also, the subject is placed in postverbal position, which does
not have a link interpretation, to avoid the problems found in sentences like 70.
Procedure: The procedure is the same as explained in Experiment 1. The discourses
were presented on a computer screen. After reading them, the participants had to choose
which paraphrase they preferred for the second sentence.
(74) a. Marta lived in the United States
b. Raquel lived in the United States
Subjects: Thirty-two members from the Universitat Pompeu Fabra community took
part in the experiment.
Results: The results can be seen in Table 5.3.⁴

                        Subject   Object
Cond 1: SVO + null        59.1     40.9
Cond 2: SVO + overt       35.2     64.8
Cond 3: OVS + null        58.0     42.0
Cond 4: OVS + overt       51.1     48.9
Conditions 1 and 2 follow the pattern predicted by the PAH and mimic the results in
Experiment 1, although the results for condition 1 in the current experiment are less strong
than the ones previously reported. The results also coincide with the results Frana obtained
in her conditions 1 and 2. In contrast, the results for conditions 3 and 4 differ greatly from
4
I thank Enric Vallduvı́ for discussion of these results.
the results for conditions 3 and 4 in Frana’s experiments. In condition 3, the NSP still
shows a preference for the subject, and not for the object, which crucially is the link in
these sentences. In condition 4, there is no clear preference of the OSP towards either the
object or the subject. While conditions 1 and 2 are mirror images of each other, this is
not the case for conditions 3 and 4. An ANOVA analysis of the frequency with which the
subject antecedent was chosen in the four conditions was performed with subjects and items
as random effects. The ANOVA shows that whether an anaphoric element is interpreted
as referring to the preceding subject depends on the type of pronoun (null or overt)
(F1(1,31) = 7.02, p = 0.01; F2(1,15) = 5.07, p = 0.04). There is also a significant interaction
between type of pronoun and word order, although it is only marginally significant by
subjects (F1(1,31) = 2.32, p = 0.07; F2(1,15) = 3.06, p = 0.04). The lack of significance by
subjects can probably be attributed to the behavior of the OSP in condition 4.
Note that the preferences in the SVO conditions are less clear in Experiment 4 than
the ones reported in Chapter 3 for Experiment 1, although both results do point in the
same direction. This can be attributed to the fact that Experiment 4 added OVS items,
which require a more developed discourse context than the one used in the experiment.
Experiment 4 did include some more context than Experiment 1 so that the OVS sentences
would not sound completely unnatural. However, it seems that this was not enough, and
the lack of sufficient context contributed to raising the overall variation found in the results.
The two pronominal forms are not sensitive to the same factors: while NSPs have a
simple subject preference, regardless of the pragmatic function of the subject, OSPs have
an object preference only when it is not the link and show no clear preference when the
object is the link. In other words, NSPs are only sensitive to syntactic function, while
OSPs are sensitive to both syntactic and pragmatic function. These results support a notion
of salience in which various factors play a role and different referential expressions are
sensitive to different factors. In particular, both subjecthood and linkhood add to salience,
but the former has a larger weight than the latter. NSPs, being the default pronominal form,
have a preference for the most salient entity, the subject, which remains the most salient
entity even if it is not the link (that is, both in condition 1 and condition 3). In contrast,
OSPs have more constrained preferences: they are constrained to refer to low salience
entities. When both factors contributing to salience (syntactic and pragmatic function)
agree in marking a referent as low in salience, this is the one the OSP will prefer. When
both factors do not agree (one potential antecedent is subject but non-link, and the other is
non-subject but link), both potential antecedents have an intermediate degree of salience,
there is no low salience antecedent, and, therefore, OSPs do not show a clear preference for
any of the candidates. This explains the contrast between condition 2, in which the OSP
shows a clear preference for the object, non-link referent, and condition 4, in which OSPs
do not exhibit a clear preference.⁵
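The two-factor notion of salience described above can be sketched as a weighted sum. This is only an illustrative sketch, not a fitted model: the numerical weights are hypothetical, and the only assumption that matters is that subjecthood weighs more than linkhood. The NSP is modeled as choosing the most salient candidate, and the OSP as choosing a candidate only if it is low in salience on both factors.

```python
from typing import Dict, Optional, Tuple

# Hypothetical weights: only the ordering W_SUBJECT > W_LINK matters here.
W_SUBJECT = 2.0
W_LINK = 1.0

# Each candidate antecedent is described as (is_subject, is_link).
Candidates = Dict[str, Tuple[bool, bool]]


def salience(is_subject: bool, is_link: bool) -> float:
    """Weighted sum of the two factors contributing to salience."""
    return W_SUBJECT * is_subject + W_LINK * is_link


def nsp_preference(candidates: Candidates) -> str:
    """The NSP prefers the most salient candidate."""
    return max(candidates, key=lambda name: salience(*candidates[name]))


def osp_preference(candidates: Candidates) -> Optional[str]:
    """The OSP prefers a candidate that both factors mark as low in
    salience; if there is no such candidate, it has no clear preference."""
    low = [name for name, feats in candidates.items() if salience(*feats) == 0]
    return low[0] if len(low) == 1 else None


# The two configurations of Experiment 4, using the names from example 73.
SVO: Candidates = {"Marta": (True, True), "Raquel": (False, False)}
OVS: Candidates = {"Marta": (True, False), "Raquel": (False, True)}

if __name__ == "__main__":
    for label, cands in (("SVO", SVO), ("OVS", OVS)):
        print(label, nsp_preference(cands), osp_preference(cands))
```

In this sketch the NSP selects the subject in both word orders, while the OSP selects the object only in the SVO configuration and returns no preference in the OVS configuration, which mirrors the qualitative pattern in conditions 1 to 4.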
The crucial difference between the two types of pronouns is that NSPs have a simple
preference for previous subjects, while OSPs have a more complex preference involving
both syntactic and pragmatic factors. This proposal is very much in the spirit of Kaiser
and Trueswell’s (2008) approach, according to which multiple constraints affect anaphora
resolution. Note that the results for Catalan are very similar to the results of the Finnish
(hän/tämä) study. The main difference is that while the overt pronoun in the OVS condition
does not show a clear preference, the demonstrative tämä in Finnish showed a weak preference
for the postverbal subject. However, the results for Finnish may have been affected by
the fact that the demonstrative use of tämä acted as an ‘escape hatch’ in a situation where
neither argument is a good antecedent. Catalan does not have such an escape hatch and
participants just chose the possible antecedents at random.
5
The results of this experiment could also be attributed to the different syntactic structures of SVO and
OVS sentences. In particular, the behavior of the OSP could be explained as follows: OSPs have a preference
for non-subject antecedents in a position lower than Spec(IP). In SVO items, this behavior results in
a preference for the object, while in OVS conditions there is no candidate that fulfills the conditions, the
object having been left-dislocated in a position that cannot be lower than Spec(IP). In contrast, the NSP has
a preference for subject antecedents, regardless of their syntactic position: in Spec(IP) in SVO sentences and in
a lower position in OVS sentences. I thank Charles Yang for this observation.
The corpus data (see Section 2.5) is compatible with the approach presented here. I
have identified 101 instances in which the subject is not the link of the sentence, either
because it appears postverbally (64 instances), in a left-dislocation (10 instances) or with
a focal particle (27 instances). These non-link subjects continue to be good referents for
subsequent reference (their referents remain as subjects in 33% of the instances, in 57%
the subject is an entity not mentioned in the previous sentence and in 9% the subject is
some other (not subject, not link) constituent). There are only two instances in my corpus
in which a non-subject link constituent becomes the subject of the next utterance. Thus,
there is a tendency for subjects to remain subjects across sentences, regardless of whether
they are links or not, while this is not the case for other constituents. The case of left-
dislocations is particularly clear: a left-dislocation introduces a state of affairs in which a
non-subject becomes the link of the sentence. However, the probability of this link being
selected for subsequent reference is not higher than for the other constituents. This has also
been argued by Givón (1983) based on corpus studies; according to him, left-dislocations
encode less continuous topics than canonical word-order and right-dislocations.
5.1.4 Game theoretical analysis
As I argued in Chapter 4, the role of the game theorist is to construct a model such that its
equilibria coincide with the empirical data available. Therefore, my goal in this section is to
adapt the model presented in Chapter 4, so that it can capture the new empirical evidence
provided by Experiment 4. In Chapter 4, I mentioned that information states could be
thought of as representing different degrees of salience a referent has. In this section, I
apply this idea to account for the data provided by Experiment 4.
The results of Experiment 4 can be interpreted as a sign that hearers do not behave
in a completely Gricean manner.6 That is, given that in condition 3 (OVS + null), there
is a subject preference, we would expect to observe the usual division of labor and find
an object preference in condition 4 (OVS + Overt). However, as argued in the previous
section, OSPs seem to be highly constrained to refer only to low salience antecedents and
not to object links. I discuss why this might be so at the end of this section.
I propose to model this situation by changing the model presented in the last chapter,
so that there are three information states (IS), each corresponding to a degree of salience
that is relevant for the problem at hand. The three ISs of the game are the following:
1. Information state [Subject]: the IS in which the speaker wants to refer to the previous
subject (regardless of whether it is also the link). This IS corresponds
to referring to the most salient antecedent. As shown in the experiments, the studied
pronominal forms are not sensitive to whether a subject is acting as a link or not.
Therefore, we do not need two different ISs for subjects depending on their status as
links; rather, this can be left unspecified.
2. Information state [Object -link]: the IS in which the speaker wants to refer to a
previous non-subject, non-link antecedent. This IS corresponds to referring to the
least salient antecedent.
3. Information state [Object +link]: the IS in which the speaker wants to refer to a link
non-subject antecedent. This IS corresponds to referring to an antecedent with an
intermediate degree of salience. Since pronominal forms are sensitive to the status of
object antecedents as links, this cannot be left unspecified and we need two different
ISs for object referents.
6
Thanks to Satoshi Tomioka for discussion of this point.
This state of affairs is represented as a tree in Figure 5.1, which constitutes another
game of partial information.
Figure 5.1: Game for interaction between subjecthood and linkhood
In this game of partial information, the speaker announces her choice but this choice
may be compatible with different information states; it may be ambiguous for the hearer.
The speaker must choose between three forms in each of the three information states: she
can utter a sentence with a definite description, an NSP or an OSP. Both pronouns are,
in principle, ambiguous across the three information states and a hearer encountering them
will not be certain of which state he is in. The two information sets are indicated by circling
the ambiguous nodes: there is one information set for the NSP ({t1 , t2 , t3 }) and another one
for the OSP ({u1 , u2 , u3 }) .
The three information states do not have the same probabilities; they represent differ-
ent cross-linguistic tendencies which are not equally likely. It is possible to estimate their
respective probabilities through corpus counts, as I have done in the previous chapter. In-
formation state [Subject] is, by far, the most common: it accounts for over 70% of the
instances in the corpus of narrations; then comes information state [Object -link], which
accounts for over 20% of the corpus instances. Finally, the information state [Object +link]
occurs very rarely; there are only two attested instances in the corpus. To sum up, the prob-
ability of information state [Subject] is greater than the probability of the information state
[Object -link], which is greater than the probability of the information state [Object +link].
For the purposes of showing the calculations, I assume the following probabilities:
p([Subject]) = 7/10, p([Object -link]) = 2/10 and p([Object +link]) = 1/10. I keep the values
for the payoffs of the different options constant from the last chapter: 5 for the definite
descriptions, 8 for overt pronouns and 10 for null pronouns. That is, NSPs are the most
economical forms, followed by OSPs and by DDs. In this situation, there is a single Pareto-
Nash equilibrium, which states the following:
(75) In information state [Subject], use an NSP; in information state [Object -link], use
an OSP; in information state [Object +link], use a definite description.
When encountering an NSP, interpret it as referring to the previous subject; when
encountering an OSP, interpret it as referring to the previous non-link object.
The expected payoff for the equilibrium is p([Subject]) · 10 + p([Object -link]) · 8 +
p([Object +link]) · 5 = 7/10 · 10 + 2/10 · 8 + 1/10 · 5 = 9.1.
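The equilibrium in (75) and its expected payoff can be checked by brute force. The following is a minimal sketch, not the procedure used in the thesis: it enumerates every pure strategy pair and keeps those with the highest expected payoff, which in a game of pure common interest form a Nash equilibrium that no other profile Pareto-dominates. The payoff of 0 for a misunderstood pronoun is an assumption (the text does not state the miscommunication payoff here), and all identifiers are illustrative.

```python
from fractions import Fraction
from itertools import product

STATES = ["Subject", "Object-link", "Object+link"]
PROB = {
    "Subject": Fraction(7, 10),
    "Object-link": Fraction(2, 10),
    "Object+link": Fraction(1, 10),
}
PAYOFF = {"NSP": 10, "OSP": 8, "DD": 5}  # values given in the text
MISCOMMUNICATION = 0  # assumption: not specified in the text


def expected_payoff(speaker, hearer):
    """speaker maps state -> form; hearer maps pronoun -> state it is taken to signal."""
    total = Fraction(0)
    for state in STATES:
        form = speaker[state]
        # A definite description is unambiguous; a pronoun succeeds only
        # if the hearer's interpretation matches the intended state.
        understood = form == "DD" or hearer[form] == state
        total += PROB[state] * (PAYOFF[form] if understood else MISCOMMUNICATION)
    return total


def best_profiles():
    """Return the maximal expected payoff and every strategy pair that reaches it."""
    results = []
    for forms in product(PAYOFF, repeat=len(STATES)):
        speaker = dict(zip(STATES, forms))
        for nsp_reading, osp_reading in product(STATES, repeat=2):
            hearer = {"NSP": nsp_reading, "OSP": osp_reading}
            results.append((expected_payoff(speaker, hearer), speaker, hearer))
    top = max(value for value, _, _ in results)
    return top, [(s, h) for value, s, h in results if value == top]


top, profiles = best_profiles()
print(top)
for speaker, hearer in profiles:
    print(speaker, hearer)
```

Under these assumptions the search returns a single best profile, the one described in (75). Changing the entries of `PROB` shows how a contextual shift in the probabilities can move the optimum.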
As with the previous analysis, contextual and linguistic factors may affect the probabilities
and the shift in probabilities can change the Pareto-Nash equilibrium of the game, so
that, for instance, NSPs can also be felicitously uttered in the other two information states.
The empirical data and its game theoretical modeling point to the following inter-
pretation: Catalan has specific anaphoric forms to refer to antecedents at opposite ends
on a scale of salience for activated referents. That is, NSPs refer to maximally salient
antecedents (i.e. subjects) and OSPs to low salience antecedents in the immediate con-
text (i.e. non-subject, non-link constituent). In contrast, there is no particular pronominal
anaphoric form to refer to an antecedent with an intermediate degree of salience. Inter-
estingly enough, there is also a correlation between the frequency of an IS and whether it
is associated with a particular anaphoric form. As noted, cases of referents with an inter-
mediate degree of salience are very scarce. This could explain the seemingly non-Gricean
results of Experiment 4. Gricean behavior (and the division of labor between forms and
interpretations) does take place, but only for interpretations that are distinct and frequent
enough. Opposite ends on a scale of salience for activated referents fulfill the two con-
ditions: they are distinct from each other and they are frequent (even if one end is much
more frequent than the other). In contrast, the intermediate degree on a scale of salience,
represented by link, non-subject referents, is neither distinct enough from the other two,
nor frequent enough. The consequence of this is that no pronominal form shows a clear
preference for referents with this intermediate degree of salience. Note that this approach
proposes a more complicated relationship between anaphoric forms and salience than, for
example, Ariel’s Accessibility Theory.
5.2 Focus
In the previous section, I examined one aspect of how the informational structure of a
sentence affects pronoun resolution. In particular, I looked at how the pragmatic structure
of the previous context affects the preferences of the following pronouns. In this section, I
examine how the pragmatic status of the pronoun itself affects its own preferences: namely,
I consider what happens when a pronoun is marked as being focal. According to Vallduví
(1992), focus is the update potential of the sentence. In Catalan, focal information remains
in its canonical position, which, for subjects, corresponds to the postverbal position. In
addition, subjects can also be explicitly marked as focal by other linguistic cues: syntactic
constructions, such as clefts, or focal particles, such as ‘even’ or ‘also’, as in the following
examples from the Nocando corpus.
(76) a. Postverbal subject pronoun:

La granota gran va dir ‘Aquesta no es queda aquí a casa meva, si hi sóc jo’.
“The big frog said ‘She will not stay here, in my place, if I am here’.”
b. Subject pronoun in the focus position of a cleft or pseudo-cleft:

La mare s’enfada molt amb el nen perquè es pensa que ha sigut ell que l’ha
enfonsat.
“The mother gets very angry with the child because she thinks that he was the
one who sank it.”
c. The pronoun appears together with a focal or emphatic particle (even, self, also
etc.).
A l’home li va caure el te, les ulleres, li va caure tot. Va caure fins i tot ell a
terra.
“The man dropped the tea, the glasses, everything. Even he fell down.”
Although it can be argued that the postverbal position is the canonical position for subjects,
it is not the most frequent position. As mentioned, subjects act most frequently as links
and appear preverbally. In the entire Nocando corpus, there were only 27 cases of focused
subjects and 64 instances of postverbal subjects7 (out of 5473 utterances). Most of these
focal subjects (68 out of 91) refer to an entity not mentioned in the previous discourse. Out
of the remaining 23 focal subjects, 12 refer to the previous subject and 11 to the previous
object. The data for these counts is very scarce, given that, in the first place, focusing
a subject is a marked operation and, in the second place, focused subjects usually refer to
discourse-new (or newly introduced) referents. Since the focus of the sentence is the update
potential of the sentence, it is a good place to introduce (or reintroduce) new referents in
the discourse (Gundel and Fretheim (2001)), but this is only a tendency, not a necessity.
The examples in 76 are good evidence of this: focused subjects can be pronouns, which by
definition are discourse-old.
In these cases, there is no choice between the two types of pronouns. As mentioned in
Section 2.4, focal information is placed at the end of the main clause in Catalan, which is
where the main pitch of the sentence is located. If the subject is focal information and the
speaker wants to use a pronoun, she is forced to use an OSP because only OSPs can host
the main pitch of the sentence in the sentence-final focal position. Otherwise, if an NSP
were used, the main pitch would be placed on some other constituent and this would yield a
different informational structure. There is, however, a choice between using the pronoun or
using a definite description. For example, in sentence 76b above, the choice is between the
pronoun ell and the DP ‘el nen’ (‘the child’) (note that the same choice would be present in
the English translation of the cleft).
Given the fact that OSPs in focal position are not optional anymore, it is an interesting
question how this can affect their referring preferences. It may be that they retain the object
preference of non-focal OSPs. However, since they are the most economical resource in
this situation, it may be that they play the same role as NSPs in the default case and that they
exhibit a subject preference. Yet another hypothesis is that focal OSPs are fully ambiguous.
7
Excluding subjects of unaccusative verbs, which tend to always appear postverbally.
Given that they appear in a very marked case (focal subject referring to a discourse-old
referent) for which the statistical evidence is scarce, it also seems plausible that speaker
and hearer are not able to estimate the probabilities of the two information states. The goal
of Experiment 5 is to find out which of the three hypotheses is correct.
5.2.1 Experiment 5: questionnaire study with focal subjects
The goal of this experiment is to test the effect that focus marking has on the biases of
OSPs. That is, in a context in which a pronoun is no longer optional, will the referential
preferences remain the same as in the default case? Will the OSP take the place of the NSP
as the most economical form or is it truly ambiguous?
Materials: the materials consisted of nine two-sentence discourses. In these discourses,
the first sentence introduces two individuals by means of two proper names of the same
grammatical gender, one in subject position and the other in object position. The second
sentence contains an OSP in focus position: the focus marking comes from focal particles
(such as ‘even’ and ‘only’) or from the syntactic structure (such as being the subject of a
cleft). The content of the second sentence is not pragmatically biased to refer to any of the
two referents.
(77) La Maria va trobar-se amb la Clara a la biblioteca. Era ella qui havia volgut que
estudiessin juntes.
“Maria met Clara at the library. She was the one who wanted them to study to-
gether.”
The items were incorporated into a questionnaire experiment together with 24 filler
items (some belonging to the items from Experiment 4) and 5 practice items. The complete
set of experimental items can be seen in Appendix E.
Procedure: The procedure was the same as explained for Experiment 1. The discourses
were presented on the computer screen. Subjects were asked to indicate which interpreta-
tion of the second sentence they preferred, i.e., whether they thought it was a statement
about the subject of the first sentence, or the object of the first sentence, by choosing one
of the two possible paraphrases for the second sentence, such as the ones in 78.
(78) a. La Maria havia volgut que estudiessin juntes.
“Maria was the one who wanted them to study together.”
b. La Clara havia volgut que estudiessin juntes.
“Clara was the one who wanted them to study together.”
Subjects: Thirty-two members from the Universitat Pompeu Fabra community took
part in the experiment.
Results: The results can be seen in Table 5.4. The numbers indicate the percentage with
which every option was chosen. For comparison, I also include the results for condition
2 in Experiment 4, that is, the results for the non-focused overt pronoun with a preceding
SVO sentence.

Condition              Subject (%)   Object (%)
focused pronoun            45            55
non-focused pronoun        35            65
Subject interpretation was chosen for 45% of the items, while object interpretation was
chosen 55% of the time. A t-test was performed to test the hypothesis that the probability
of choosing a subject antecedent is 0.5. This test yielded a p-value of 0.26 and a confidence
interval between 0.37 and 0.53. Therefore, the null hypothesis cannot be rejected and we
have no reason to believe that subject and object were chosen with different probabilities.
Therefore, no statistically significant pattern is detected concerning the referring prefer-
ences of focused overt pronouns, which indicates that they behave in a truly ambiguous
way. This contrasts with the t-test performed on the data for non-focused pronouns (condi-
tion 2 of Experiment 4): the p-value is 0.005 and the confidence interval ranges from 0.25
to 0.45, and, therefore, the null hypothesis can be rejected. In the next section, a game
theoretical analysis of these ambiguous focused pronouns is presented.
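The null hypothesis tested above, that a subject antecedent is chosen with probability 0.5, can also be examined with an exact binomial test. The sketch below uses this different technique in place of the t-test reported in the text, so it is not expected to reproduce the reported p-values or confidence intervals. The function name is illustrative and the counts are toy values, not the experimental data.

```python
from math import comb


def two_sided_binomial_p(successes, trials):
    """Exact two-sided p-value for H0: P(success) = 0.5.

    Under p = 0.5 the binomial distribution is symmetric, so doubling
    the smaller tail (capped at 1) gives the exact two-sided value.
    """
    smaller = min(successes, trials - successes)
    tail = sum(comb(trials, k) for k in range(smaller + 1))
    return min(1.0, 2 * tail / 2 ** trials)


# Toy counts for illustration only (NOT the thesis data):
# "subject" chosen 9 times out of 20, then 3 times out of 20.
print(two_sided_binomial_p(9, 20))
print(two_sided_binomial_p(3, 20))
```

A large p-value, as with the first toy count, gives no grounds to reject an even split between subject and object, which is the pattern reported for focused pronouns. A small one, as with the second, corresponds to the pattern reported for non-focused pronouns.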
5.2.2 Game theoretical analysis
The results from Experiment 5 are also amenable to a game theoretical approach. As before,
my goal here is to adapt the previous models so that they capture the empirical data just
presented. To account for the focus data, we need a model containing the following four
information states, in which the speaker has uttered the subject of utterance Ui .
1. Information state [S +f]: the speaker wants to refer to the subject of Ui−1 and mark
the current subject of Ui as focal.
2. Information state [S -f]: the speaker wants to refer to the subject of Ui−1 and not
mark the current subject of Ui as focal.
3. Information state [O +f]: the speaker wants to refer to the object of Ui−1 and mark
the current subject of Ui as focal.
4. Information state [O -f]: the speaker wants to refer to the object of Ui−1 and not mark
the current subject of Ui as focal.8
Each information state has an initial probability, which I call p(1), p(2), p(3) and p(4),
respectively. When the speaker does not want to mark the current subject as focal, she has
three options: using a definite description, an overt pronoun or a null pronoun. When the
speaker wants to mark the current subject as focal, she has only two options: using an overt
pronoun or a definite description. The use of the null pronoun is not an option here, given
the syntactic, phonological and informational structure of Catalan. The game tree for this
game is shown in Figure 5.2.
8
I am ignoring here the distinction added in the last section between link objects and non-link objects for
simplicity. Nothing would change in the Pareto-Nash equilibria if these distinctions were considered.
Figure 5.2: Game for focal pronouns (1)
Whenever the subject is marked as focal, there are explicit cues in the sentence that
indicate so: for example, focal particles, the postverbal position of the subject, the use of
a cleft, etc. Therefore, the hearer knows based on these cues whether the speaker wants to
mark the subject as focal or not. That is, it is not ambiguous whether OSPs are focal or not.
Given this, there are three possible information sets the hearer can find himself in:
• When a focally marked OSP is used, the hearer will know he is either in [S +f] or [O
+f].
• When a non-focally marked OSP is used, the hearer will know he is either in [S -f]
or [O -f].
• When an NSP is used, the hearer will know he is either in [S -f] or [O -f].
The payoffs are assigned according to the same principles as before: shorter forms are
more economical, and therefore receive higher payoffs than longer forms. NSPs receive a
payoff of 10, OSPs a payoff of 8 and definite descriptions a payoff of 5.
I continue to estimate probabilities through corpus counts, which are also supported by
theoretical considerations. The default state is that in which the current subject is non-focal
and refers to a previous subject [S -f]. Most of the counts from the corpus belong to this
class (almost 80%) and this reflects the fact that there is a connection between linkhood and
subjecthood and that link-focus structures are the unmarked type of information structure
(Lambrecht, 2001). Next comes the information state [O -f], with a probability lower than
[S -f], since it is less frequent, although it still accounts for almost 20% of the cases. Finally,
[S +f] and [O +f] both have the lowest probabilities. They appear much less frequently than
the other two information states and with equal low frequency. In our corpus, out of the 23
focused subjects that refer to an antecedent in the previous clause, 12 refer to a previous
subject and 11 to a previous object.
With this state of affairs, there are two Pareto-Nash equilibria.9 Both equilibria make
the same predictions for [-f] states (states in which the current pronoun is not marked as
focal), but differ in [+f] states. In [-f] states, the equilibrium is that in [S -f] the speaker
will utter an NSP and in [O -f] an OSP; the hearer will interpret the NSP as referring to
the subject and the OSP as referring to the object. This is exactly the same equilibrium
presented in Section 4.3 to model the Position of Antecedent Hypothesis. As for the [+f]
states, the two equilibria differ and are the following:
(79) a. In [S +f], the speaker will use an overt pronoun and in [O +f] a definite descrip-
tion. The hearer will interpret the overt pronoun as referring to the subject.
b. In [S +f], the speaker will use a definite description and in [O +f] an overt
pronoun. The hearer will interpret the overt pronoun as referring to the object.
9
I assume that p([S -f]) = 6/10, p([O -f]) = 2/10 and p([S +f]) = p([O +f]) = 1/10. With these probabilities,
the expected payoff for both Pareto-Nash equilibria is 8.9.
That is, the speaker can use the OSP to refer to either the subject or the object (depending
on which equilibrium she chooses) and the hearer can interpret the OSP as referring either
to the subject or to the object. In other words, focal OSPs are ambiguous in the absence of
contextual bias, which is the behavior we wanted to model given the results of Experiment
5. In addition, the model predicts that once either [S +f] or [O +f] becomes more likely,
due to contextual information, the tie between the two equilibria is broken: a single
Pareto-Nash equilibrium remains, in which the pronoun refers felicitously to whichever
referent, subject or object, the context favors. This seems to be correct. In all the corpus
examples, there is either only one salient referent or the contextual
information clearly indicates which is the intended referent. For example, if we slightly
change the discourse in 76b so that the two referents are of the same gender, as in 80a, the
contextual cues allow the hearer to interpret the pronoun as unambiguously referring to the
previous object. It is also possible for the focused pronoun to refer to the subject, if the
context points in this direction, as in 80b.
(80) a. La mare s’enfada molt amb la nena perquè es pensa que és ella qui ha enfonsat
la barca.
“The mother gets very angry with the girl because she thinks that she was the
one who sank it.”
b. La mare s’enfada molt amb la nena perquè es pensa que és ella qui haurà de
pagar les destrosses.
“The mother gets very angry with the girl because she thinks that she will be
the one to pay for the damages.”
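The tie between the two equilibria in (79), and the way a contextual shift breaks it, can be verified with simple arithmetic. The sketch below assumes that communication succeeds in every state under both equilibria, so each state simply contributes its probability times the payoff of the form used. The base probabilities are those assumed in this section (6/10 for [S -f], 2/10 for [O -f], 1/10 for each [+f] state). The shifted distribution is a hypothetical example of a context that favors [S +f], and all identifiers are illustrative.

```python
from fractions import Fraction

PAYOFF = {"NSP": 10, "OSP": 8, "DD": 5}  # values given in the text


def expected_payoff(probabilities, forms):
    """Sum of p(state) * payoff(form used in that state), assuming no miscommunication."""
    return sum(probabilities[state] * PAYOFF[forms[state]] for state in probabilities)


# Probabilities assumed in the text for the four information states.
BASE = {
    "S-f": Fraction(6, 10),
    "O-f": Fraction(2, 10),
    "S+f": Fraction(1, 10),
    "O+f": Fraction(1, 10),
}

# Hypothetical context in which a focal subject antecedent is more likely.
SHIFTED = {
    "S-f": Fraction(6, 10),
    "O-f": Fraction(2, 10),
    "S+f": Fraction(3, 20),
    "O+f": Fraction(1, 20),
}

# Speaker strategies of the two equilibria in (79).
EQ_A = {"S-f": "NSP", "O-f": "OSP", "S+f": "OSP", "O+f": "DD"}
EQ_B = {"S-f": "NSP", "O-f": "OSP", "S+f": "DD", "O+f": "OSP"}

for label, probabilities in (("base", BASE), ("shifted", SHIFTED)):
    print(label, expected_payoff(probabilities, EQ_A), expected_payoff(probabilities, EQ_B))
```

With the base probabilities both strategies yield the same expected payoff, so neither dominates and the focal OSP stays ambiguous. Under the hypothetical shift, the strategy that reserves the OSP for the more likely state comes out ahead.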
5.2.3 Incompatibility of NSPs and focus: a game theoretical perspective
The approach taken in the last section was to exclude the null pronoun as a valid option
for the two [+f] information states. The goal of this section is to show that this incompat-
ibility does not need to be built into the model, but rather can be derived from the model.
Samek-Lodovici (1996) points out that the possibility of a focused null subject is usually
rejected on the basis that focusing always requires stress, which null subjects evidently
cannot support. However, as mentioned above, subjects can be focused structurally, that is,
by occupying a postverbal position and no emphatic stress is then needed. In these cases,
the null pronoun is, in principle, an option, which needs to be ruled out independently.
As I pointed out before, the overt postverbal pronoun does get the nuclear accent of the
sentence, but this is something that can be derived from the game and does not need to be
stipulated.
There is, however, a fundamental difference between the game I present in this sec-
tion and the games presented in Chapter 4 and in Section 5.1.4. Those games captured
tendencies which are highly affected by context. In fact, games of partial information are
particularly useful to model anaphora phenomena because they can capture this context-
dependency through the probabilities associated with information states. However, in this
case, it is a grammatical fact, and not a tendency, that NSPs cannot convey focal information.
The idea I entertain in this section is that this grammatical rule is the “frozen” outcome
of a game of partial information, “frozen” because the probabilities and, as a consequence
the equilibria, cannot be altered by contextual factors.
Consider the model shown in Figure 5.3. As before, there are four information states.
The speaker may wish to refer to the previous subject and mark the current subject as focal
([S+f]) or as non-focal ([S-f]) or to refer to the previous object and mark the current subject
as focal ([O+f]) or as non-focal ([O-f]). Let’s now assume that, in all four information
states, the speaker may choose any of the three forms available. That is, the difference
vis-à-vis the previous game is that we allow NSPs as a possible option in the [+f] states.
The situation for each of the linguistic forms is the following:
• When the speaker chooses a definite description, its referent is not ambiguous for
the hearer. In addition, word-order disambiguates the information-structure of the
sentence, so whether the subject is focal or non-focal is not ambiguous.
• When the speaker chooses an NSP, the pronoun is in principle ambiguous with regard
to its referent and also with regard to its information structure. That is, in a structure
with an NSP, the hearer cannot in principle tell whether it is preverbal or postverbal
(that is, whether it has been focused or not).
• When the speaker chooses an OSP, it will in principle be ambiguous with regard to its
reference, just like the NSP. In addition, unlike the NSP, the sentence has two distinct
syntactic structures depending on the information structure; that is, the OSP is either
preverbal or postverbal depending on whether it is non-focal or focal, respectively.
Therefore, the informational role of the OSP is not ambiguous.
Thus, we have three information sets, circled in the figure, where, in principle, there is
some degree of ambiguity for the hearer:
1. The use of an NSP is ambiguous across the four information states.
2. The use of a preverbal OSP is ambiguous between the first two information states:
[S -f] and [O -f]. Given the syntactic position of the pronoun, it is impossible for it
to be focal.
3. The use of a postverbal OSP is ambiguous between the last two information states:
[S +f] and [O +f]. Given the syntactic position of the pronoun, it must be focal.
As in previous games, the payoffs of each form are assigned according to their com-
plexity: 10 for the NSP, 8 for the OSP, and 5 for the definite description. The probabilities
assigned to each information state follow from corpus frequencies and are kept constant
from the last section: the information state [S -f] is the most frequent one, followed by [O
-f] and then by the two [+f] states, [S +f] and [O +f]. The probabilities used for the
computation are the same as in the previous game: p([S -f]) = 6/10, p([O -f]) = 2/10, and
p([S +f]) = p([O +f]) = 1/10.
With this state of affairs, the equilibria are exactly the same as for the game presented
in the previous section. That is, the same results are obtained even if we allow for the
NSP to be an option in all information states. The results are the following: there are two
Pareto-Nash equilibria, with an expected payoff of 8.9. Both equilibria make the same
predictions for [-f] states, but differ in [+f] states. In [-f] states, the equilibrium is that in [S
-f] the speaker will utter an NSP and in [O -f] an OSP; the hearer will interpret the NSP as
referring to the subject and the OSP as referring to the object. These are exactly the same
equilibria presented in Section 4.3, capturing the predictions of the Position of Antecedent
Hypothesis. As for the [+f] states, the two equilibria are the following:
• In [S +f], the speaker will use an OSP and, in [O +f], a definite description. The
hearer will interpret the OSP as referring to the subject.
• In [S +f], the speaker will use a definite description and, in [O +f], an OSP. The
hearer will interpret the OSP as referring to the object.
Therefore, the prediction is that NSPs will not occur if the speaker wants to mark them
as focal, even if we allow them as a potential option in the game. The fact that the use of
an NSP triggers a potential four-way ambiguity (it is ambiguous among the four informa-
tion states) makes it not an optimal option in the more marked (with lower probabilities)
information states.
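The claim that NSPs drop out of the [+f] states even when they are allowed can be checked by exhaustive search over the extended game. This is a minimal sketch under stated assumptions: a misunderstood pronoun earns a payoff of 0 (the text does not specify this value), the probabilities are those assumed in this section, and the signal labels `OSP-pre` and `OSP-post` are illustrative names for the preverbal (non-focal) and postverbal (focal) overt pronoun.

```python
from fractions import Fraction
from itertools import product

PROB = {
    "S-f": Fraction(6, 10),
    "O-f": Fraction(2, 10),
    "S+f": Fraction(1, 10),
    "O+f": Fraction(1, 10),
}
STATES = list(PROB)
PAYOFF = {"NSP": 10, "OSP": 8, "DD": 5}  # values given in the text
MISCOMMUNICATION = 0  # assumption: not specified in the text

# What the hearer cannot tell apart: an NSP is compatible with all four
# states, while word order reveals whether an OSP is focal.
INFO_SETS = {
    "NSP": STATES,
    "OSP-pre": ["S-f", "O-f"],
    "OSP-post": ["S+f", "O+f"],
}


def signal(state, form):
    """The signal the hearer perceives when `form` is used in `state`."""
    if form == "OSP":
        return "OSP-post" if state.endswith("+f") else "OSP-pre"
    return form


def expected_payoff(speaker, hearer):
    total = Fraction(0)
    for state in STATES:
        form = speaker[state]
        understood = form == "DD" or hearer[signal(state, form)] == state
        total += PROB[state] * (PAYOFF[form] if understood else MISCOMMUNICATION)
    return total


def best_profiles():
    """Return the maximal expected payoff and every strategy pair that reaches it."""
    results = []
    for forms in product(PAYOFF, repeat=len(STATES)):
        speaker = dict(zip(STATES, forms))
        for readings in product(*INFO_SETS.values()):
            hearer = dict(zip(INFO_SETS, readings))
            results.append((expected_payoff(speaker, hearer), speaker, hearer))
    top = max(value for value, _, _ in results)
    return top, [(s, h) for value, s, h in results if value == top]


top, profiles = best_profiles()
print(top)
for speaker, hearer in profiles:
    print(speaker, hearer)
```

Under these assumptions the search yields two best profiles, and in neither does the speaker choose an NSP in a [+f] state: the NSP is reserved for the most probable state, [S -f], which matches the prediction stated above.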
The idea behind this game is that its outcome becomes grammaticalized and it becomes
a part of the grammar that NSPs cannot be used to convey focal information. Once this
happens, the probabilities of the different information states cannot be altered and NSPs
can never be the optimal solution when the speaker wants to mark the subject as focal.
This game is, then, at a different level than the other games presented here. The other
games were meant to represent a model of speaker and hearer competence, which reflects
how they choose and interpret different referring expressions. In contrast, the last model
presented here is meant as a proposal of why it needs to be the case that it is grammatically
impossible for NSPs to convey focal information.
5.3 Conclusion
In this chapter, I have argued for a more complex model to account for the referring pref-
erences of NSPs and OSPs. The experimental data shows that neither the Position of An-
tecedent Hypothesis nor the Discourse Prominence Hypothesis can account for the behavior
of both pronouns. NSPs have a simple subject preference, while OSPs are influenced by
both syntax and information structure: they prefer low salience antecedents. In addition, in
contexts where there is no variation, such as when OSPs convey the focal information of
the sentence, their preferences disappear and they become fully ambiguous.
The relationship between pronouns and information structure categories has been mod-
eled with games of partial information in Sections 5.1.4 and 5.2.2. A game theoretical
explanation of why null pronouns cannot convey focal information is proposed in Section
5.2.3.
One of the findings reported in this chapter is that there seems to be a correlation be-
tween lack of preferences of a pronominal expression in a particular context and scarcity
of data regarding the use of the pronominal expression in that particular context. It is then
an interesting question how much data is necessary so that speakers and hearers can form
estimations about the use of a form and take advantage of them to make the most efficient
use of the different linguistic resources at their disposal.
Figure 5.3: Game for focal pronouns (2)
Chapter 6
Cross-linguistic variation
The main focus of interest in this thesis is analyzing the behavior of NSPs and OSPs in
Catalan and modeling this data. However, it is interesting to look at other null-subject
languages and dialects (NSL) and discuss how we can deal with cross-linguistic variation.
In the first section of this chapter, I discuss data from other languages that shows that not
all Romance NSLs behave in a uniform way, in both quantitative and qualitative terms.
Subsequently, I explore two different options for capturing this variation. My goal is not to
review all the Romance dialects and to provide a definitive answer about these issues, but
to draw attention to two promising hypotheses: the priming hypothesis and the grammar
competition hypothesis.
6.1 Null and overt subjects across varieties
It is well-attested in the literature that not all NSLs behave the same. The contexts in which
NSPs and OSPs are required or forbidden vary across dialects, as well as their respec-
tive rates. In this section, I review some of the main differences across some Romance
languages and dialects. I cannot do justice here to the vast literature on the topic (see
Flores-Ferrán (2007) for a nice overview for Spanish); my aim here is just to highlight
some interesting differences. I start with quantitative differences revealed by sociolinguis-
tic studies and continue with qualitative differences investigated in other frameworks.
6.1.1 Quantitative studies
There has been extensive sociolinguistic research on the variable use of OSPs in different
Romance varieties, particularly in different Spanish dialects. These studies have identified
several factors which regulate this variation and have also highlighted that there is signifi-
cant variation in the overall use of OSPs.
Table 6.1 summarizes the overall rate of OSPs found in different sociolinguistic stud-
ies. As can be seen, there is a wide range of overall use of OSPs, with Brazilian Portuguese
and the Caribbean varieties (Dominican, Puerto Rican and Cuban Spanish) at one end of the
spectrum and Mexican Spanish, Iberian Spanish and European Portuguese at the other end.
Variety                   % of overt pronouns   # of verbs   Study
Brazil                    56                    8924         Lira (1982)
San Juan (Puerto Rico)    45                    2110         Cameron (1992)
Dominican Republic        41                    2217         Otheguy et al. (2007)
Puerto Rico               35                    3805         Otheguy et al. (2007)
Cuba                      33                    2778         Otheguy et al. (2007)
Ecuador                   27                    3735         Otheguy et al. (2007)
Colombia                  24                    1926         Otheguy et al. (2007)
Portugal                  22                    162          Barbosa et al. (2005)
Madrid (Spain)            21                    1059         Cameron (1992)
Mexico                    19                    2569         Otheguy et al. (2007)
Portugal                  8                     6091         Tycho Brahe Corpus (2009)
Table 6.1: Overall rate of OSPs in different varieties
In his sociolinguistic study, Cameron (1992) did an extensive comparison between two
of the dialects which sit at opposite ends of the spectrum: he compared his own data for
Puerto Rican Spanish and data from Madrid Spanish, coming from a collection of inter-
views (Esgueva and Cantarero, 1981). The participants in both studies were comparable
in terms of age and socioeconomic class. As mentioned, the overall percentage of OSPs
is much higher in San Juan than in Madrid. This is the case for every pronoun, except for
the second person singular pronoun. This pronoun can be used to refer to one of the par-
ticipants of the conversation, [+specific] you, or can also be used generically, [-specific]
you. The two dialects studied by Cameron treat these two types of second person singular
pronouns differently: the two dialects show a similar rate when it is [+specific], but not
when it is [-specific]. In the latter situation, there is an increase of pronominal subjects for
the Puerto Rican data (69%) and a decrease for the Madrid data (19%).
Category          % of OSPs in San Juan   % of OSPs in Madrid
Overall           45                      21
specific you      48                      40
unspecific you    69                      19
Table 6.2: Distribution of pronouns across categories in Cameron (1992)
Cameron identified Switch Reference as the most important constraint regulating the
appearance of NSPs and OSPs. Switch Reference is the configuration in which the pronoun
under study (called the target) does not refer to the previous subject (called the trigger) and
Same Reference is the one in which the pronoun does refer to the previous subject. Table
6.3 shows the data according to this condition in both dialects. It can be observed that
in both dialects Switch Reference favors the expression of the OSP, which is of course
compatible with the Catalan experiments presented in Chapter 3. However, the rate of
overt pronouns is still roughly two to three times as high in Puerto Rico as in Madrid in both categories.
         Madrid            San Juan
         Same    Switch    Same    Switch
Overt    11      30        31      57
Null     89      70        69      43
Table 6.3: Distribution of OSPs in switch and same conditions in Cameron (1992)
Although the rates are very different, Cameron argues that the strength of the constraints
on the variation is the same. He argues this on the basis of the Varbrul weights of his
statistical analysis, which:
“[...] provide a measure of the strength of a given constraint on variation which
is relative to other constraints within the same domain as they apply within
the dialect or group of speakers being analyzed. Therefore, it is possible for
two dialects or groups of speakers to exhibit strikingly different rates of the
occurrence of a given variant, and yet to share similar Varbrul weights for the
strength of factors which constrain the presence of this constraint (Cameron,
1992, page 227)”.
The Varbrul weights for Switch and Same Reference in both dialects in Cameron’s
study are shown in Table 6.4:
          San Juan   Madrid
Switch    .64        .65
Same      .34        .34
Table 6.4: Varbrul weights for Switch Reference in Cameron (1992)
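The point that similar weights can coexist with very different rates can be illustrated with a small sketch. It uses the multiplicative (logistic) form in which Varbrul combines an input probability with factor weights on the odds scale. Two things are assumptions made only for this illustration and are not part of Cameron's analysis: the overall OSP rate of each dialect (Table 6.2) is used as a stand-in for the input probability, and Switch/Same Reference is treated as the only factor group. The function name `varbrul_rate` is hypothetical.

```python
# Illustrative sketch only: one factor group, and the overall OSP rate is
# used as a stand-in for the Varbrul input probability (an assumption).

def varbrul_rate(input_prob, factor_weights):
    """Combine an input probability with factor weights on the odds scale."""
    odds = input_prob / (1.0 - input_prob)
    for w in factor_weights:
        odds *= w / (1.0 - w)
    return odds / (1.0 + odds)

overall_rate = {"Madrid": 0.21, "San Juan": 0.45}    # Table 6.2
weights = {                                          # Table 6.4
    "Madrid": {"Switch": 0.65, "Same": 0.34},
    "San Juan": {"Switch": 0.64, "Same": 0.34},
}
observed = {                                         # Table 6.3 (% overt)
    "Madrid": {"Switch": 30, "Same": 11},
    "San Juan": {"Switch": 57, "Same": 31},
}

# Predicted rate of overt pronouns per dialect and context.
predicted = {
    (dialect, context): varbrul_rate(overall_rate[dialect], [w])
    for dialect, by_context in weights.items()
    for context, w in by_context.items()
}

for (dialect, context), rate in predicted.items():
    print(f"{dialect:9s} {context:6s} predicted {100 * rate:5.1f}%  "
          f"observed {observed[dialect][context]}%")
```

Under these simplifying assumptions, nearly identical weights applied to different baseline rates yield rates that differ between the dialects in both contexts, in the same direction as the figures in Table 6.3.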
Thus, although the rates of pronominal expression are very different in the two dialects,
the weights are very similar. Cameron offers a speculative explanation of this fact. In the
grammar of a language, both the overall rate with which a particular variant occurs and
the Varbrul weights associated with the constraints of variation are defining features of this
grammar. If there is a change in the weight of a constraint regulating a variation, this may
result in a change of the rate of the distribution of this variation, which may, in turn, serve
to assign new weights to the other constraints on variation. However, if these weights are
resistant to change, a way of maintaining the values would be to increase or decrease the
overall expression rate of the variant involved. For the null/overt variation, this idea is
translated as follows:
“At some point in time, the effect of Nonspecificity on second person TÚ
changed in various dialects of Spanish. In order to maintain the values of
the Varbrul weights associated with other constraints in the language, such as
Switch Reference, the overall rate of pronominal expression increased or de-
creased as the case may be. This, in turn, served to maintain the value of the
weights associated with the constraints of variation.”
I will explore this idea in connection with another factor that has been identified as
regulating this variation: priming effects. Many studies have found that an OSP appears
to favor a following OSP, while an NSP appears to favor a following NSP. Table 6.5, from
Cameron (1992), shows this effect for the two Spanish dialects he studied. When there
is a Same Reference configuration, we find a significant priming effect: OSP triggers, in
contrast to NSP triggers, favor OSP targets. When there is a Switch Reference configura-
tion, the priming effect is significant only for the Puerto Rican Spanish data, and not for
the Madrid data: in Puerto Rican Spanish, OSPs again lead to more OSPs and NSPs to
more NSPs.
              Madrid            San Juan
Trigger is    Same    Switch    Same    Switch
Both          14      38        35      66
Overt         24      41        47      72
Null          11      37        26      63
Table 6.5: Percentage of overt singular pronouns in Madrid and San Juan: cross-tabulation
of trigger status by same/switch condition
This priming effect has been extensively discussed for subject pronouns in Spanish
(see also Flores-Ferrán (2002)) and noticed in many sociolinguistic and psycholinguistic
studies for other phenomena (for instance, see Branigan et al. (2000) for syntactic priming
or Poplack (1981) for priming effects in the expression of plural markers in Puerto Rican
Spanish). Jäger and van Rooij (2007) argue that language users show a tendency to repeat
linguistic material from the immediately preceding context. If a certain item or construc-
tion has been used before, the likelihood that it is used again increases, possibly because
activated units are more likely to be used than non-activated ones.
6.1.2 Qualitative studies
It has also been proposed that the high frequencies of overt pronouns in Brazilian Por-
tuguese and Caribbean Spanish (Dominican, Cuban and Puerto Rican varieties) are due to
changes in the settings of the null subject parameter. For instance, Toribio (2000) argues
that Dominican Spanish is undergoing a change process and displays properties both of
NSL and Non-NSL. Although null subjects are grammatical, overt pronouns may be used
as expletives and in non-finite clauses, as 81a and 81b show. These sentences are ungram-
matical in other dialects of Spanish. Also, the discourse in 81c shows a density of overt
pronouns which would be highly infelicitous in, for instance, Peninsular Spanish:
(81) a. Ello quiere llover

It wants to rain
b. Ven acá, para nosotros verte

‘Come here, for us to see you’
c. Entre tú más estudias tú te vas proyectando mejor y estás adquiriendo más ex-
periencia. Algo que tú no conoces o no conocías a través de los estudios tú
lo vas a conocer. Si tú decías una palabra mal anteriormente, tú ya la hablas
correctamente
“The more you study the better you project yourself and acquire more experi-
ence. Something that you don’t know or didn’t know through studies you begin
to know. If you used to say a word badly before, you now speak it correctly”.
Moreover, unlike other Spanish varieties, word-order in Dominican Spanish is almost
categorically SVO, even in contexts which would require subject inversion in other vari-
eties, such as matrix and embedded questions:
(82) a. Que tú piensas?

What you think?
b. No sabía cuándo ella iría.

No know when she would go
‘I did not know when she would go.’
Toribio (2000) argues that Dominican Spanish is in a state of change and that it contains
two grammars: a grammar with the null subject settings and a new, incoming grammar with
the non-null subject settings.
Similar claims are found in the literature regarding Puerto Rican Spanish (Morales,
1989) and Brazilian Portuguese. Duarte (1993) claims that Brazilian Portuguese is evolving
from being NSL to being non-NSL. She presents some examples in which an obligatory
NSP has become optional, as in 83. In other Romance varieties, these contexts (embedded
subject coreferential with the main subject of the clause and left-dislocation of the subject)
require an NSP. Moreover, Duarte’s diachronic data shows a great increase of the rate of
OSPs, from a rate of 20% in 1845 to a rate of 74% in 1992 (shown in Figure 6.1).
(83) a. De repenta ela_i sabe que ela_i quando criança ficava meio triste perisso.
‘It may happen that she knows that she as a child would be sad for that.’
b. A Clarinha ela cozinha que é uma maravilha.

‘Clarinha she can cook wonderfully.’
Figure 6.1: Full pronominal subjects during seven periods in Brazilian Portuguese
Moreover, diachronic data from de Andrade Berlinck (2000) also shows how the fre-
quency of SV orders has increased from 42% at the beginning of the 19th century to an
almost categorical 96% in the second half of the 20th century. The opposite has been the case
for postverbal orders, which have almost disappeared. VSX orders decreased from 34% to
2% and VXS orders from 24% to 2%.
6.2 Game theory and cross-linguistic variation
In this section I present two hypotheses about how to understand the dialectal differences
presented in the previous section. Let dialect A be the generic name for a dialect with
a high rate of overt pronouns (Brazilian Portuguese or Caribbean Spanish) and dialect B
the generic name for a dialect with a low rate of overt pronouns (Catalan or Spanish from
Madrid or Mexico). The first hypothesis derives the rate differences between dialects from
the priming effects presented in Section 6.1.1. The second hypothesis takes as a basis the
qualitative differences between dialects just presented and explores the idea that the high
rates of OSPs present in dialect A are the result of two grammars currently competing in this
dialect.
6.2.1 Hypothesis I: Priming effects
The first hypothesis I explore is the following: priming effects are responsible for (at least,
part of) the rate differences between dialect A and dialect B. Consider what happens if
priming exerts an effect in a particular dialect: if a particular form is used, this use primes
other instances of the same form and, as a consequence, its overall rate increases. If,
in a particular context of dialect A, OSPs become highly favored, they will get primed
more often, including outside the context that initially triggered them, and, therefore, their
overall rate will increase in the whole of this linguistic system. In the case of Puerto Rican
Spanish, a good candidate for a triggering context would be the association between OSPs
and second person singular pronouns to express generic statements (recall that 69% of
generic second person subjects were expressed through OSPs). Most instances of second
person singular pronouns express generic statements (Cameron, 1992) and the increase of
OSPs in this context could spread to other contexts, which in principle do not favor OSPs,
and raise their overall rate.
A way to think about priming effects in a game theoretical model would be to treat them
as a Schelling point (Schelling, 1969), that is, as a focal point salient for the participants of
the game, in this particular case by virtue of having just been used in the conversation. Once
something becomes a Schelling point, it is active in the mental representations of speaker
and hearer and it can become easier to produce and interpret. The form that has been used
to refer to an antecedent becomes a temporary convention to refer to that particular referent.
That is, the initial advantage of NSPs as the maximally economical form may be leveled
by OSPs, if they have been primed by another OSP.
It is interesting that in Cameron’s data the priming effect is not equally strong in Same
Reference and Switch Reference contexts. It is particularly strong for Same Reference, and
not that strong for Switch Reference. In the data for Madrid Spanish, the priming effect
was statistically significant only in the Same Reference context and, in San Juan, although
it was significant for both Same and Switch contexts, the difference in OSP expression
between primed and unprimed contexts was 21 percentage points for Same Reference, and only 9 for
Switch Reference. I believe this is related to the fact that in Same Reference contexts there
is one clearly favored form, the NSP, while in Switch Reference contexts there is no such
clearly favored form; rather, the preferred form depends on the probabilities assigned to the
information states.
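The size of the priming effect discussed here can be read directly off Table 6.5 as the difference, in percentage points, between the rate of overt targets after an overt trigger and after a null trigger. The short sketch below only restates that arithmetic; the variable names are mine.

```python
# Percentage of overt targets by trigger form, copied from Table 6.5.
table_6_5 = {
    "Madrid": {
        "Same": {"Overt": 24, "Null": 11},
        "Switch": {"Overt": 41, "Null": 37},
    },
    "San Juan": {
        "Same": {"Overt": 47, "Null": 26},
        "Switch": {"Overt": 72, "Null": 63},
    },
}

# Priming gap: overt targets after an overt trigger minus overt targets
# after a null trigger, in percentage points.
priming_gap = {
    (dialect, context): cells["Overt"] - cells["Null"]
    for dialect, by_context in table_6_5.items()
    for context, cells in by_context.items()
}

for (dialect, context), gap in priming_gap.items():
    print(f"{dialect:9s} {context:6s} {gap:2d} points")
```

In both dialects the gap is larger in Same Reference than in Switch Reference contexts.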
Let me spell out in more detail the differences between the two contexts and how this
might affect priming effects. Our models from Chapters 4 and 5 predict the use of NSPs in
cases of Same Reference. There is one favored form and there is room for priming effects
to alter this preference. That is, the Pareto-Nash equilibrium of the game may change if
an OSP was used before and is being primed. This would explain why we find roughly twice as
many Overt-Overt sequences as Null-Overt sequences in Same Reference contexts. As I
mentioned before, the form itself becomes temporarily associated with the referent: it becomes
a Schelling point which participants in the conversation use as a convention to refer to a
particular antecedent, temporarily overriding the pragmatic constraints that regulate the
distribution of NSPs and OSPs.
The situation is quite different in Switch Reference contexts, in which priming effects
play a relatively minor role. First, our models already predict that there should be more
variation in these contexts: the use of an OSP is predicted, unless there are cues indicating
a change of probabilities, in which case an NSP can be used to refer to the object. That is,
in Switch Reference contexts, there is already much more variation between the two forms,
since the distribution of probabilities in the information states regulates which pronoun will
be used and there is less room for priming effects to appear. Second, the nature of Switch
Reference contexts, in which the referents of two consecutive subjects are different, does
not allow for the association between a particular form and a particular referent.
When there are two factors influencing a particular choice, the effect of each is stronger
when the other factor is less constraining. This has been reported for the relative role
of heaviness and newness in Heavy NP Shift and Dative Alternation constructions (Arnold
et al., 2000). In our case, priming is able to have a greater effect if the other factor (syntactic
and pragmatic constraints) clearly predicts a particular form.
It is beyond the scope of this thesis to work out the details of (i) how priming should be
formalized in a game of partial information and (ii) when exactly priming effects operate.
With respect to the first question, a possibility would be to translate priming effects directly
into the payoff function: that is, the payoff of a particular form rises if it has been primed
by the same form. This spike of payoffs of a primed form could alter the Pareto-Nash
equilibrium of the game, so that, for instance, an OSP in a Same Reference context becomes
the optimal form. As for the second question, although it is clear that priming has an effect
on the choice of pronouns, it is also clear that it does not always have an effect and that
its effects are temporally limited. That is, there are many appearances of unprimed forms.
Thus, it cannot be the case that a primed pronoun always gets higher payoffs, although this
does seem to be the case in some circumstances.
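As a very rough illustration of possibility (i), the sketch below adds a priming bonus to the payoff of the form that has just been used and then picks the form with the highest payoff. This is a single-decision stand-in, not the games of partial information of Chapters 4 and 5: the payoff values, the bonus sizes and the function name `best_form` are all hypothetical and chosen only to show how a sufficiently large bonus could reverse the preferred form in a Same Reference context.

```python
# Toy sketch: all payoffs and bonuses are hypothetical illustration values.

def best_form(payoffs, primed_form=None, bonus=0.0):
    """Return the form with the highest payoff after adding a priming bonus."""
    adjusted = dict(payoffs)
    if primed_form is not None:
        adjusted[primed_form] += bonus
    return max(adjusted, key=adjusted.get)

# Same Reference context: the NSP starts out ahead as the more economical form.
same_reference = {"NSP": 1.0, "OSP": 0.8}

unprimed = best_form(same_reference)
weakly_primed = best_form(same_reference, primed_form="OSP", bonus=0.1)
strongly_primed = best_form(same_reference, primed_form="OSP", bonus=0.3)

print(unprimed, weakly_primed, strongly_primed)
```

A bonus that shrinks with distance from the priming form would be one way to capture the observation that priming is temporally limited, although nothing here commits to a particular decay function.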
6.2.2 Hypothesis II: Competition of grammars
Hypothesis II takes as its basis Toribio’s (2000) idea that dialect A’s grammar is undergoing
a change in progress from being NSL to being non-NSL. While the change is in progress,
speakers will have both grammars at their disposal, although their respective rates will
change over time.
In order to give a game theoretical analysis of dialect A, we need both the model for
NSLs proposed in Chapter 4 and refined in Chapter 5 (see Figures 4.3 and 5.1) and the model
for non-NSLs (see Figure 4.2). The two games, belonging to two different grammars, are
in competition and when the innovative grammar is selected in the competition, it may be
that an OSP will be used in a context in which a speaker of dialect B would have used an
NSP.
How does the competition of grammars evolve and how do their respective rates change
through time? Yang (2000) develops a model of language change and acquisition, which
I briefly summarize here. Language acquisition is seen as a competition process among a
population of grammars. When an input sentence s is presented, a grammar G is selected
with a certain probability p. If that grammar can parse the sentence, the selected grammar
is rewarded and all the others are punished. If the sentence cannot be parsed, the selected
grammar is punished and all the others are rewarded. The penalty probability is what de-
fines the fitness value of a grammar: the penalty probability c_i of a grammar G_i is the
probability that an item s in the linguistic environment cannot be parsed by G_i.
Language change occurs when two generations, n and n + 1, are exposed to sufficiently
different linguistic evidence, due to some factor, be it migration, real linguistic innovation
or social and cultural factors affecting the distribution of the linguistic expressions in a
population. Suppose that the expressions used in a linguistic environment, let’s call them
E_{G1,G2}, come from two different grammars, G1 and G2. Suppose a proportion α of G1
expressions are incompatible with G2 and a proportion β of G2 expressions are incompat-
ible with G1.
At generation n, a proportion p of expressions are generated by G1 and a proportion q
are generated by G2, where p + q = 1. This constitutes the linguistic evidence for the next
generation n + 1. The penalty probabilities of G1 and G2, c1 and c2, correspond to βq and
αp, respectively. We can then compute p′ and q′, the weights of G1 and G2 respectively, as internalized
by the learners of the next generation n + 1 (the reader is referred to the original paper
for all the mathematical details), which may be different from the weights of the previous
generation.
In order for G2 to overtake G1, q, the weight of G2, needs to increase in successive
generations, until the weight of G1 eventually reaches 0. Expressed in other terms, G2
overtakes G1 if β > α, which has the following corollary: once a grammar is on the rise,
it is unstoppable. Moreover, the weight of G2 increases over time, yielding an S-shaped
curve, as frequently described in the language change literature (Kroch, 1989).
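The dynamics just described can be sketched numerically. The update rule used below, in which the next generation's weight for G1 is c2 / (c1 + c2), is my reading of the limiting behavior of the learner and should be checked against the original paper; the values of α, β and the starting weight are hypothetical, and the function names are mine.

```python
# Sketch under stated assumptions: the update p' = c2 / (c1 + c2) and all
# numeric values are assumptions made for illustration only.

def next_generation(p, alpha, beta):
    """Weight of G1 internalized by the next generation of learners."""
    q = 1.0 - p
    c1 = beta * q    # penalty probability of G1
    c2 = alpha * p   # penalty probability of G2
    return c2 / (c1 + c2)

def trajectory(p0, alpha, beta, generations):
    """Weights of G1 over successive generations, starting from p0."""
    weights = [p0]
    for _ in range(generations):
        weights.append(next_generation(weights[-1], alpha, beta))
    return weights

# G1 starts out nearly categorical, but beta > alpha, so G2 gains ground.
g1_weights = trajectory(p0=0.99, alpha=0.5, beta=0.6, generations=60)
g2_weights = [1.0 - p for p in g1_weights]

for generation in range(0, 61, 10):
    print(generation, round(g2_weights[generation], 3))
```

With these values the weight of G2 rises slowly at first, then quickly, then levels off near 1, which is the S-shaped pattern; with α = β the weights do not move.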
In the case at hand, in order for a language to change from being NSL to being non-
NSL, there need to be more sentences in the linguistic evidence that are incompatible with
the G1 grammar (null-subject grammar) than with the G2 grammar (non-null-subject + rigid
SVO). In the dialect that is changing we cannot observe α and β directly, but only α·p and
β·q. However, we can observe α and β in varieties in which there is no change in progress,
that is, in pure monolingual NSLs and non-NSLs.
In order to estimate α, we need the percentage of items in a G1 grammar which are
incompatible with a G2 grammar, that is, those sentences with a null or postverbal sub-
ject. These counts are fairly easy to find in sociolinguistic or acquisition studies. For
example, Table 6.6 shows the rates of null and postverbal subjects for Catalan (Casanova
(1998)), Mexican Spanish (Silva-Corvalán (1994)) and Italian (Lorusso et al. (2005) and
Bates (1976)).
Variety                                    % null subjects   % postverbal subjects
Catalan (Casanova, 1998)                   72                8.6
Italian (Bates, 1976)                      51                23
Italian (Lorusso et al., 2005)             74                NA
Mexican Spanish (Silva-Corvalán, 1994)     59                NA
Table 6.6: Percentage of G1 items incompatible with a G2 grammar
Although there is some variation in the data, these three languages show comparable
behavior and it is possible to estimate α at around 75%.
In order to estimate β, we need the percentage of items in a G2 grammar which are
incompatible with a G1 grammar, that is, those sentences with expletive subjects, infini-
tival subjects, left-dislocated subjects followed by an OSP, preverbal subjects in contexts
where an NSL would display a postverbal subject (i.e. in questions) and cases of overuse of
OSPs. These counts are somewhat more complicated to obtain. Yang (2003) estimates the
appearance of expletive subjects at 1.2 % in English. Infinitival subjects and left-dislocated
subjects followed by pronouns are not very frequent constructions either. We can safely
assume that they are not more frequent than expletive subjects and approximate their fre-
quency at 1% at most. It is also not obvious how to estimate the percentage of preverbal
subjects which would be postverbal in an NSL and the rate of overuse of OSPs. For the
former, we can assume that most of the postverbal subjects in Spanish or Catalan data
would be ungrammatical or dispreferred if placed preverbally. Thus, we can estimate at
18% the percentage of items with preverbal subjects which would be incompatible with an
NSL. For the latter, it is hard to decide what constitutes an overuse of OSPs. However,
we can get a good approximation looking at data from a topic-drop language, such as Chi-
nese. In Chinese, both subjects and objects can be dropped when they refer to the discourse
topic. Chinese topic-drop is more restricted than Romance pro-drop. For instance, if a topic
phrase has been fronted in Chinese, an NSP is only possible if the topic phrase is an adjunct
and, thus, it is not a possible referent of the NSP. In contrast, if the topicalized constituent is
an argument of the verb, the subject cannot be dropped. The contrast is shown in examples
84a and 84b, taken from Yang (2003). In Romance pro-drop, these sentences are acceptable,
as the Catalan example in 84c shows.
(84) a. Zai gongyuan-li, e1 t2 da-le ren. (e1 = John)

In park-LOC2 , e1 t2 beat-ASP people
‘It is in the park that John beat people up’
b. * Sue2 , e1 xihuan t2 (e1 = John)

Sue2 , e1 likes t2
‘It is Sue that John likes’
c. La Sue2 , e1 estima t2 (e1 = John)

Sue, e1 loves
‘It is Sue that John loves’
The instances of pro-drop in Romance are roughly a superset of the instances of topic-
drop in Chinese. It is a superset because NSPs in Romance do not have to refer to a previous
topic, but can refer to a less salient entity depending on the distribution of probabilities in
the information states (see Experiment 3 in Section 4.3.1). In addition, those pronouns
referring to previous topics will most likely be null: we have shown in Chapters 3 and 5
that null pronouns have a tendency to refer to previous subjects and syntactic subjects tend
to coincide with topics most of the time. We can assume that an OSP referring to a topic
in a null-subject language would be felt as ‘unnatural’ and would be counted as a case of
overuse of an OSP. Since the rate of subject drop in Chinese is 50% (Yang (2003)), we can
estimate at 50% the rate of overt pronouns in a G2 grammar which would be incompatible
with a G1 grammar (which would count as an ‘overuse’ of OSPs in a G1 grammar).
Contexts                                                   %
Expletive subjects                                         1.2
Infinitival subjects                                       ≈ 1
Left dislocation + OSP                                     ≈ 1
Preverbal subjects in questions + other contexts
  in which postverbal would be preferred                   ≈ 18
‘Overuse’ of OSPs                                          ≈ 50
Table 6.7: Percentage of G2 items incompatible with a G1 grammar
Therefore, we can estimate β at around 72%. We see, then, that the estimated values of
α and β are quite close to each other. In fact, there are many stable NSLs and non-NSLs so
it is plausible to think that they cannot be easily overcome by other grammars. If so, what
can we say about Brazilian Portuguese?
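For transparency, the arithmetic behind the two estimates can be laid out as a small sketch. It simply adds up the figures in Tables 6.6 and 6.7, treating the approximate entries as exact and including only the rows of Table 6.6 that report both null and postverbal subjects; the variable names are mine.

```python
# Components of beta from Table 6.7 (approximate entries treated as exact).
table_6_7 = {
    "Expletive subjects": 1.2,
    "Infinitival subjects": 1.0,
    "Left dislocation + OSP": 1.0,
    "Preverbal subjects where postverbal would be preferred": 18.0,
    "'Overuse' of OSPs": 50.0,
}
beta_estimate = sum(table_6_7.values())

# Rows of Table 6.6 with both figures: (% null subjects, % postverbal subjects).
table_6_6 = {
    "Catalan (Casanova, 1998)": (72.0, 8.6),
    "Italian (Bates, 1976)": (51.0, 23.0),
}
alpha_estimates = {
    variety: null + postverbal
    for variety, (null, postverbal) in table_6_6.items()
}

print(f"beta estimate: {beta_estimate:.1f}%")
for variety, value in alpha_estimates.items():
    print(f"alpha, {variety}: {value:.1f}%")
```

The sums land in the low 70s for β and between the mid 70s and about 80 for α, in line with the rounded estimates used in the text.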
It has been observed that Popular Brazilian Portuguese, the variety spoken by the ru-
ral and working class, presents significant differences with standard Brazilian Portuguese.
Guy (1981) argues that some of its properties could not have arisen from a natural language
change and claims that this dialect originated in a creole language spoken by African speak-
ers in the colonial period, which subsequently underwent a process of decreolization. Brazil
had the largest proportion of slaves displaced to the New World, around 3.6 million peo-
ple according to Curtin (1969), cited in Guy (1981). These slaves formed the entire labor
force in agriculture and mining and, at the end of the colonial period, in 1817, the African
population represented 75% of the population. Therefore, during the period in which the
foundations of Popular Brazilian Portuguese were being laid, Afro-Brazilians were the
largest group. The African languages most likely to have influenced Brazilian Portuguese are
the West African languages, Igbo and Yoruba, and the Bantu languages of Angola and the
Congo River basin.
One of the linguistic variables studied by Guy and considered incompatible with a natu-
ral language change is the variable agreement found in Popular Brazilian Portuguese within
a noun phrase or between subject and verb. Guy found that, within a noun phrase, the first
word of the NP was almost always marked for plural, while other positions disfavored plu-
ral marking. This type of rule has no precedent in the history of Portuguese and Romance
languages and does not easily lend itself to a natural change account. In contrast, this same
pattern is found in a number of creole varieties of Portuguese and Spanish. The hypothet-
ical Brazilian proto-creole probably lacked agreement, as most creoles do, and would use
some NP-initial element to express plurality. This is precisely the pattern found in many
West African languages, which were the native languages of the African people brought to
Brazil in colonial times.
Also, interestingly, some of the linguistic variables studied by Guy (1981), such as
variable agreement, are shared between Brazilian Portuguese and the Caribbean Spanish
dialects, and not with the rest of the Spanish-speaking world. In fact, the Caribbean was the
region of the Spanish Empire which used slave labor most heavily. Holms (2004) analyzes
both Popular Brazilian Portuguese and Nonstandard Caribbean Spanish as semi-creoles, or
partially restructured languages, which have some features of both creoles and non-creoles.
According to him these varieties are different both from unrestructured overseas dialects
(Quebec French or Chilean Spanish) and from completely restructured creole languages
(Guyanese Creole English and Palenquero Creole Spanish). The ratio between native and
non-native speakers of the source language during the first century of creation of the new
dialect seems to be the most important factor in determining in which group it will fall. In
unrestructured dialects, native speakers were the vast majority. In restructured creoles, non-
native speakers were a vast majority. In partially restructured dialects, there was a majority
of non-native speakers, but also a significant percentage of native speakers (around 30-
40%). Holms also points out that the lack of subject inversion in questions is common in
the Atlantic creoles and in the African substrate languages.
As mentioned, Yoruba and Igbo are two of the languages that are most likely to have in-
fluenced Popular Brazilian Portuguese. Yoruba is an SVO language, without null subjects
(Bode, 2000). As for Igbo, it has traditionally been analyzed as non-NSL as well. However,
Eze (1995) argues that it should be treated as an NSL and that its subject pronouns should
be analyzed as subject clitics. In any event, whether the languages that influenced Popular
Brazilian Portuguese were NSL or not is not crucial for the argument. There is extensive
evidence from the field of second language learning that shows that learners of an NSL
will overuse OSPs even if their language also allows null subjects. For instance, according
to Bini (1993), Spanish learners of Italian use OSPs significantly more than native Italian
speakers, although the two languages show a very similar distribution of null and overt
pronouns. Similar results are reported by Marzaga and Bel (2006) for Greek learners of
Spanish. Bini argues that learners produce these OSPs, which would be absent in their L1,
because they have not fixed the [+null subject] value of their Italian Interlanguage yet and
need to reinforce verbal morphology. It could also be that in order to ease the processing
load of the foreign language, OSPs serve as some sort of ‘default strategy’ (Sorace et al.
(2009)).
The hypothesis I entertain is that Brazilian Portuguese began to change due to the mi-
gration of African people during colonial times. A language change due to migration can
be modeled using equations coming from population genetics. Population genetics is inter-
ested in modeling the evolution of different alleles of the same gene within a population. In
particular, population genetics deals with what happens when the proportions of different
alleles change due to migration. In the so-called island model, there are two alleles A and
a, A being favored over a. Their respective fitness values are 1 and 1 − s and their respec-
tive frequencies are p and q (or 1 − p). If the island receives some migration m from the
continent, in which only allele a is present (and, therefore, p is 0 for this population), p will
change in the island after migration. In particular, it can be shown that if m ≥ s (Hartl
and Clark, 1989), i.e., if the proportion of migrants is at least as large as the fitness advantage of
the local variety, the incoming variety will take over the local variety. In our case study,
if the proportion m of immigrants speaking a non-NSL variety is greater than the fitness
advantage s that an NSL variety has over a non-NSL variety, the incoming variety (non-
NSL) is predicted to take over the local NSL variety. As we have seen, the fitness values of
NSL and non-NSL varieties are very similar, so s is certainly smaller than the proportion
m of immigrants in Brazil and the Caribbean that spoke a creole, or a non-NSL variety,
during colonial times. It is, then, expected that these varieties will lose null pronouns and
will go through a period in which they exhibit properties of both NSL and non-NSL.
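The threshold m ≥ s can be checked numerically. The following is a minimal sketch, not the derivation in Hartl and Clark (1989): it assumes a haploid population in which selection against allele a acts first and migration from the continent follows, and the function names (island_step, simulate, predicted_equilibrium) are hypothetical. Under these assumptions the recursion is p' = (1 − m) p / (1 − s(1 − p)), whose non-trivial fixed point has q = m/s, so allele A survives only when m < s.

```python
def island_step(p, m, s):
    """One generation on the island (assumed order: selection, then migration).

    p: frequency of allele A (fitness 1); allele a has fitness 1 - s.
    m: proportion of the island population replaced by migrants carrying only a.
    """
    q = 1.0 - p
    mean_fitness = p * 1.0 + q * (1.0 - s)
    p_after_selection = p / mean_fitness
    # Migrants contribute no copies of A, so A is diluted by a factor (1 - m).
    return (1.0 - m) * p_after_selection


def simulate(p0, m, s, generations=2000):
    """Iterate the recursion from an initial frequency p0."""
    p = p0
    for _ in range(generations):
        p = island_step(p, m, s)
    return p


def predicted_equilibrium(m, s):
    """Fixed point of the recursion: q = m / s, hence p = 1 - m / s (0 if m >= s)."""
    return max(0.0, 1.0 - m / s)


if __name__ == "__main__":
    for m, s in [(0.05, 0.2), (0.3, 0.1)]:
        print(m, s, simulate(0.9, m, s), predicted_equilibrium(m, s))
```

With m = 0.05 and s = 0.2 the frequency of A settles near 0.75; with m = 0.3 and s = 0.1 it decays towards 0, which corresponds to the incoming variety taking over.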
6.3 Conclusion
Null-subjecthood is not displayed uniformly across NSLs: there are both quantitative and
qualitative differences, which were reviewed in Section 6.1. I presented two hypotheses re-
garding how to formalize these differences. Hypothesis I derives the rate differences across
dialects from priming effects. Once a particular pronominal form becomes favored in a
particular linguistic context, it gets primed more often and, thus, its overall rate increases.
Cameron’s (1992) data for Puerto Rican and Madrid Spanish fits nicely with this idea. Hy-
pothesis II entertains the idea that some dialects are undergoing a process of change and are
currently in a transition state from being an NSL towards being a non-NSL with rigid SVO
order. Caribbean Spanish and Brazilian Portuguese are good candidates for such varieties.
Chapter 7
Contrast
As mentioned in Section 2.4, OSPs in Romance null subject languages become obligatory
when they convey contrast, while NSPs are generally prohibited in these contexts. This
is widely acknowledged in the literature about subject expression in Romance. However,
often no definition is offered of what is meant by contrast. Moreover, naturally-occurring
instances of contrastive OSPs seemingly convey different types of contrast and it is not
obvious that they are amenable to a unitary analysis.
In this chapter, I present an analysis of the contrastive import of Romance contrastive
OSPs, based on data from Catalan, Spanish and Italian. I claim that contrastive OSPs are
Contrastive Topic markers and I offer a definition based on previous analyses by Büring
(2003), Tomioka (2008) and Hara and van Rooij (2007). The basic meaning of a Con-
trastive Topic is an uncertainty contrast, which can be strengthened into an exhaustive con-
trast in some particular discourse conditions. The pairing between forms and meanings will
be derived using game theory.
Section 7.1 presents the data on contrastive Romance pronouns by examining corpus
examples. Section 7.2 reviews different approaches to the notion of contrast. Section
7.3 presents an analysis of the corpus data in which OSPs are analyzed as Contrastive
Topic markers, and I advocate for a particular approach to Contrastive Topics. The pairing
between forms and meanings is derived by means of game theory in section 7.3.2. Finally,
section 7.4 concludes.
7.1 The data
This section presents corpus examples in which the subject pronoun may be taken to convey
some notion of contrast. The corpus examples do not all follow the same structure, nor do
they seem to convey the same contrastive meaning. Thus, there seem to be different types
of examples and different notions of contrast seem to be at stake. Unless otherwise noted,
all examples are taken from the Nocando (2004) corpus of narrations: Catalan, Spanish
and Italian speakers were asked to narrate stories presented to them with illustrations only.
There were three different stories and each story was told by several speakers in their native
language. The narrations were recorded and transcribed.
Descriptively, the corpus examples can be broadly divided into three classes: those
conveying a double contrast between two entities, those conveying an implicit contrast and
those conveying what I call a weak contrast.
7.1.1 Double contrast
I call “double contrast” discourses those two-clause discourses in which two different ref-
erents occupy the subject position and their respective verb phrases predicate two different,
and in some sense opposite, actions or states. Consider the two examples from Catalan in
85. There are antonym predicates in the two discourses: be happy and be sad in the first
one and go (sailing) and stay in the second one. These two opposite actions or states are
predicated of two different referents in subject position.
(85) a. En el camí de tornada tots estan enfadats i ell, en canvi, està content.
“On the way home, they are all angry and he, in contrast, is happy.”
b. Ara nosaltres anirem a navegar per l’aigua i tu et quedaràs aquí sola.
“Now we will go sailing in the water and you will stay here on your own.”
This type of contrast is explicitly stated in the discourse (as opposed to the implicit
contrast that I present in Section 7.1.2) and, for each of the relevant entities in the discourse,
it is conveyed whether they did (or did not do) the action that is predicated of them (as
opposed to the weak contrast that I present in Section 7.1.3).
The two sentences become infelicitous if the OSP is replaced by just an NSP. The
infelicity of these sentences without the OSP is neither due to potential ambiguity nor
to the fact that there is coordination. As for potential ambiguity, the verbal morphology
unambiguously indicates the person of the verb, which is different in the two clauses of
both sentences: third person plural and third person singular in 85a and first person plural
and second person singular in 85b. As for coordination, NSPs in a coordination that do not
convey contrast are acceptable, as shown in example 86a. In contrast, NSPs in a discourse
without coordination, but which conveys contrast, are still infelicitous, as shown in example
86b, a modified version of 85a.
(86) a. A mig camí la gran tira la granoteta avall i ∅_smallfrog es queda endarrerida i, a
sobre, ∅_smallfrog es fa mal.
“On the way, the big frog pushes the small frog and ∅_smallfrog is left behind
and, on top of that, ∅_smallfrog hurts itself.”
b. En el camí de tornada tots estan enfadats amb el nen. # En canvi, ∅_thechild està
content.
“On the way home, they are all angry with the child. # In contrast, ∅_thechild is
happy.”
Similar examples are found in Spanish and Italian, as shown in 87.
(87) a. La tortuga grande se queda a un lado del río, mientras ellos van a dar una vuelta
con la barca.
“The big turtle stays at the side of the river, while they go around with the
boat.”
b. Io resto sulla barca e tu cadi in acqua.
“I stay on the boat and you fall into the water.”
This type of contrastive OSP, in a double contrast discourse, is usually excluded from
the envelope of variation between NSPs and OSPs in sociolinguistic studies, since there
does not seem to be variation between the two forms. Cameron’s (1992) study of subject
expression in Puerto Rican Spanish is one such study. In particular, he distinguished
three subcases of double contrast to be excluded from the envelope of variation.
• Contrast of Negation: the same predicate (or two similar predicates) occurs in two
sentences, but it is negated in the second one:
(88) Ellos fueron pero yo no fui.
“They went but I did not go.”
• Contrast of Scalar Opposition: there are two similar predicates, which are modified
by adjuncts which are construable as elements of a scalar set, such that the two ad-
juncts differ by degree.
(89) Mi señora habla bien inglés pero yo lo hablo bastante mal.
“My wife speaks English well but I speak it very brokenly.”
• Contrast of Alternatives: this type occurs when object arguments of the first and
second sentences are construable as elements of a set and understood as alternatives
to one another.
(90) Yo fuí a una escuela y él fue a otra.
“I went to a school and he went to another one.”
However, it is not the case that NSPs are always excluded in double contrast discourses.
In fact, Matos Amaral and Schwenter (2005) argue against the idea that OSPs are obligatory
when contrast is conveyed and show that other linguistic material (such as adverbs) can
enable the appearance of an NSP, as 91 shows for Spanish. The reply of Informant A in
91c is unacceptable if there is no contrast marker. It becomes felicitous when a contrastive
marker is present, be it an OSP or the adverb aquí (literally ‘here’, translated by Matos Amaral and
Schwenter as ‘in our case’). According to them, adverbials that can be construed as
referring to the referent of the subject of the sentence will be acceptable in situations that
require a contrastive marker.
(91) a. Inf A: Vosotros lo tenéis el lunes?
‘You guys have it on Monday?’
b. Inf B: El lunes. Un día estratégico, además.
‘Monday. A day, a strategic day, besides.’
c. Inf A: Bueno, (nosotros / aquí / *∅) lo tenemos el viernes.
‘OK, we have it on Friday.’
‘OK, here we have it on Friday.’
7.1.2 Implicit contrast
Implicit contrast discourses do not have the explicit contrastive structure we have just exam-
ined. However, they do convey an implicit contrast between the antecedent of the pronoun
and another entity, highly salient in the context.
Consider example 92a for Catalan: two frogs are the main characters of this story. The
big frog is the referent of the NSP of the clause before the OSP. Thus, this referent is
maximally salient at the moment of utterance of the OSP. However, an OSP, which, as we
have seen, has a preference for non-subject referents, is used to refer to this maximally
salient entity. By using this OSP, a contrast is conveyed between the antecedent of the
pronoun and the other entity salient in the discourse (that is, the other frog): that is, it is
conveyed that one frog, the referent of the pronoun, is big, while the other is not, although
this second frog is not explicitly mentioned.
If the OSP were absent, the discourse would still be acceptable and the NSP would still
refer to the same referent (to the previous subject, which is what NSPs tend to refer to), but
no contrast would be evoked.
The same thing happens with the second OSP of 92b, by which an implicit contrast
between the boy and the rest of the family is established, and it is conveyed that the rest of
the family, unlike the boy, was looking forward to the dinner.
(92) a. El nen torna a renyar la granota gran i li torna a dir que això no pot ser, que han
de ser amics, que s’han de comportar bé i que i ∅_bigfrog l’ha de cuidar perquè
ella_bigfrog és la gran.
“The child scolds the big frog again and tells it again that this can’t continue,
that they should be friends, they should behave themselves and that ∅_bigfrog
should take care of it because she_bigfrog is the big one.”
b. En el camí de tornada tots estan enfadats i ell, en canvi, està content perquè ell
no tenia cap ganes d’anar-se’n a sopar.
“On the way home, they are all angry and he, in contrast, is happy, because he
was not looking forward to going out for dinner.”
This type of contrast is also conveyed through OSPs in Spanish and Italian. The OSP
in 93a refers to a highly salient referent and the discourse strongly conveys that while the
small frog (the referent of the pronoun) wanted to be friends with the big one, the opposite
was not true. The same implicit opposition is found in 93b, in which the speaker implicitly
contrasts his having known Michelino for many years with the addressee’s having just
met him.
(93) a. La ranita se pone a llorar porque se ha hecho daño y además ella quería que las
dos fueran amigas.
“The little frog starts crying because she has hurt herself and, moreover, she
wanted them to be friends”.
b. Guarda che io lo conosco da un sacco di anni, a Michelino.
“Look, I have known Michelino for many, many years.”
7.1.3 Weak contrast
Finally, the third type of contrast is the weakest of all three types: it is conveyed that the
speaker does not know, or does not want to commit herself to, whether the predicate is true of
anyone other than the antecedent of the OSP. That is, unlike double contrast and implicit
contrast, it is not conveyed that there is an opposition between the antecedent of the OSP
and some other entity in the discourse or in the context. Rather, the speaker is only making
a claim about the referent of the OSP and leaves it open whether this claim should or should
not apply to the other entities relevant in the discourse.
Consider example 94a: a waitress is asking a group of people what they would like
for dinner. The mother answers with a sentence containing an OSP. Her answer does not
convey an implicit contrast between her eating chicken and someone else eating something
else, but it is just a partial answer to the waitress’s question; the other people in the group
may or may not eat chicken. The sentence without the OSP would be unacceptable, because
it would present the answer as if it were complete and exhaustive in a context in which
obviously it is not.
(94) a. ‘Què voldran per sopar?’ La mare diu: ‘Bé, doncs jo vull pollastre’ i el pare
‘Doncs, jo vull sopa’.
“ ‘What will you have for dinner?’ The mother says: ‘Well, I’ll have chicken’
and the father says ‘Well, I will have soup’.”
b. “Miri, senyora, nosaltres no sabem pas res de cap granota”
“Look, Ma’am, we don’t know anything about a frog”
The context for 94b is the following: a frog has been creating trouble in a restaurant, and
one of the customers complains to the waiters, who are quite clueless about what is going
on with the frog. As before, there is no opposition between them not knowing about the
frog and someone else knowing about it, but the sentence conveys a weaker meaning: as far
as they are concerned, they don’t know anything about a frog; someone else may or may
not know the answer.
This weaker contrast can also be expressed through OSPs in Spanish and Italian. In
example 95a, taken from Stewart (2003), the informant of a sociolinguistic interview is
explaining how she prepares for her job as a journalist. In the first part of the example,
she uses the generic second person. However, when she wants to make it clear that this is
just her personal experience and that other journalists may or may not do what she has just
described, she switches to first person and uses an OSP. In 95b, the speaker makes explicit
her own ignorance, leaving it open whether other people may or may not know the answer to the
question under discussion.
(95) a. Entonces cuando por la mañana sabes que se convoca una manifestación de
estudiantes o, vamos, una cosa similar, pues te informas un poco del tema.
Vamos yo por lo menos pues miro si ha pasado en días anteriores
“so when one morning you know that a student demonstration is to be held, or,
well something like that, well, you find out a bit about the issue. Well, at least
I, well, look if it has happened on previous days.”
b. “Ma, io non so niente”
“But I don’t know anything about it”
7.1.4 Stressed and unstressed overt pronoun
As mentioned in Section 2.4.5, Rigau (1986; 1989) also noticed that OSPs can convey the
weak contrast just examined and, moreover, she noticed that there is a difference between
stressed and unstressed OSPs. Unstressed OSPs are compatible with the speaker claiming
ignorance about some other entity, while stressed OSPs are not, as shown in 96 for Catalan.
That is, unstressed OSPs can convey a weak contrast, while stressed OSPs cannot.
‘Who wants to come, you or John?’
b. Jo vull venir... en Joan, no ho sé.
c. # JO vull venir.. en Joan, no ho sé.
‘I want to come.. John, I don’t know’
Following Kuno (1972), Rigau’s proposal is that an unstressed OSP triggers an exhaus-
tive listing interpretation, while a stressed one triggers a contrastive focus interpretation.
Rigau (1989) assumes that the two readings are variants of the same emphatic operator.
The exhaustive listing interpretation could be paraphrased as ‘Among the people under dis-
cussion, only A wants to come’. The contrastive focus interpretation conveys the negation
of some alternative and can be paraphrased as ‘as for A (A = 1st person in 96c), but not for
X, A wants to come’.
While I agree with the judgments, the labels she uses do not seem correct. The un-
stressed OSP does not convey an exhaustive listing interpretation. If it did, since “only A
wants to come” conveys that the speaker knows that nobody else wants to come, sentence
96b should be a contradiction, but it is not.
Also, interestingly, stressed OSPs cannot appear in discourses with double contrast:
(97) a. ‘Qui vindrà?’
‘Who will come?’
b. # JO vindré, però ELLA es quedarà.
‘I will come, but SHE will stay.’
7.2 On the notion of contrast
This section contains a review of the different notions of contrast in the literature. They can
be broadly divided into those which look at contrast from the point of view of rhetorical
relations and those which see it as a semantic operator. These two points of view are
reviewed here and, finally, I concentrate on the analysis of Contrastive Topics, which will
be used to analyze the contrastive import of Romance contrastive OSPs.
7.2.1 Contrast as a rhetorical relation
In his approach to discourse relations, Kehler (2002) categorizes contrast as a type of
Resemblance relation. Resemblance relations require that commonalities and contrasts
among corresponding sets of entities and relations be recognized. For each relation, the
hearer identifies a relation p1 that applies over a set of entities a1, ..., an from the first
sentence S1, and a corresponding relation p2 that applies over a corresponding set of entities
b1, ..., bn from the second sentence S2. Coherence results from inferring a common (or
contrasting) relation p that subsumes p1 and p2, along with a suitable set of common (or
contrasting) properties qi of the arguments ai and bi. In particular, contrast can create this
inference in two ways:
• Infer p(a1, a2, ...) from the assertion of S1 and p(b1, b2, ...) from the assertion of S2, in
which for some property vector q, qi(ai) and ¬qi(bi) for some i.
(98) Gephardt supported Gore, but Armey supported Bush.
In this example, the same relation p (support) applies in both sentences, and for the
contrasting elements a2 (Gore) and b2 (Bush), there is a property q (belong to the
Democratic party), such that it is true of a2 and false of b2 .
• Infer p(a1, a2, ...) from the assertion of S1 and ¬p(b1, b2, ...) from the assertion of S2,
in which for some property vector q, qi(ai) and qi(bi) for all i.
(99) Gephardt supported Gore, but Armey opposed him.
In this example, p1 and p2 correspond to the relations denoted by support and oppose;
the common relation p that subsumes these might be the relation denoted by have an atti-
tude towards a candidate. The contrasting elements a1 and b1 correspond to Gephardt and
Armey, who have the contrasting property q1 of supporting different political parties.
The parallel elements a2 and b2 correspond to the meanings of Gore and him, which share
the trivial common property q2 that they denote the same individual.
7.2.2 Contrast as a semantic operator
Vallduví and Vilkuna (1998) convincingly argue that it is necessary to distinguish between
informational rhematicity and quantificational kontrast1, two notions that are often subsumed
under the term focus. According to them, kontrast is a semantic operator which
generates a set of alternatives which become available to the semantic computation as a
quantificational domain of, for instance, focus-sensitive adverbs. If an expression a is kon-
trastive, a set M of alternatives is generated. These alternatives need to be comparable to a,
in the sense of being similar but different (see Umbach (2004) for discussion). For instance,
in 100a, the focused (or kontrast-marked) constituent ‘Sue’ generates a set of alternatives,
whose elements need to be different from Sue but similar to her at the same time, for in-
stance by including other friends or colleagues of John. The unacceptability of 100b shows
that the alternatives need to be different, that is one cannot subsume the other (100b is only
acceptable if martini is not a drink). In 100c, the need for the alternatives to be similar
triggers the interpretation of port as a drink.
(100) a. John only saw SUE at the dinner party.
b. # John only paid for the DRINKS, not for the MARTINI.
c. John only paid for the BEER, not for the PORT.
Vallduví and Vilkuna (1998) explore several ways in which contrastiveness can operate
in the set of alternatives. Particularly relevant for our purposes is the distinction between
identificational, exhaustiveness and thematic kontrast, which they informally define as:
• Identificational kontrast: if M = {a,b,c} and P(x ∈ M), then P(a).
• Exhaustiveness kontrast: if M = {a,b,c} and P(x ∈ M), then ¬(P((y ∈ M) ≠ a)).
• Thematic kontrast: if M = {a,b,c} and P(a), then P’((y ∈ M) ≠ a).
1 They spell kontrast like this to distinguish it from the general notion of contrast.
There is some controversy in the literature about whether focused constituents convey
an identificational or exhaustiveness contrast. Rooth (1985) argues that in a sentence like
101a contrast is merely “identificational”. The contrastive import of the focused phrase
could be paraphrased as follows: if a proposition of the form “John saw x at the dinner
party” is true, then “John saw Sue at the dinner party” is true. In contrast, operators like
only give rise to exhaustiveness by negating all the alternatives created by the focused
constituent (101b). Other authors argue that focused constituents do not trigger an identi-
ficational kontrast, but an exhaustive one, even if no adverb such as only is present (see,
for example Svoboda and Materna (1987)). Also, according to Kuno (1972), the Japanese
morpheme ga triggers what he calls an exhaustive listing interpretation, which can be para-
phrased as ‘x and only x’ or ‘it is x that’ and is equivalent to the exhaustiveness contrast
just presented.
(101) a. John saw SUE at the dinner party
b. John only saw SUE at the dinner party
Finally, the thematic contrast can be paraphrased as “if a property P holds of a, then
other properties P’ hold of other members of M”. Vallduví and Vilkuna argue that this is
the contrast conveyed by “Contrastive Topics” (see also Szabolcsi (1981)). This paraphrase
captures the idea that a Contrastive Topic triggers alternatives and that it is left unspecified
what is asserted of these alternatives. Subsequent analyses of Contrastive Topics have
attempted to make more explicit their meaning. The main analyses will be reviewed in the
next section.
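The difference between the identificational and the exhaustiveness readings can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: the alternative set M is a finite list, a property P is a Boolean function on its members, and the function names are hypothetical. Thematic kontrast is not modeled, since its definition leaves the properties P’ unspecified.

```python
def identificational(a, M, P):
    """If P holds of some member of M, then P holds of a."""
    some_member_has_P = any(P(x) for x in M)
    return (not some_member_has_P) or P(a)


def exhaustive(a, M, P):
    """If P holds of some member of M, then P holds of no member other than a."""
    some_member_has_P = any(P(x) for x in M)
    return (not some_member_has_P) or all(not P(y) for y in M if y != a)


# Hypothetical alternative set generated by the kontrast-marked constituent 'Sue'.
M = ["Sue", "Bill", "Mary"]


def saw_only_sue(x):
    """Scenario 1: John saw only Sue at the dinner party."""
    return x in {"Sue"}


def saw_sue_and_bill(x):
    """Scenario 2: John saw Sue and Bill at the dinner party."""
    return x in {"Sue", "Bill"}


if __name__ == "__main__":
    for scenario in (saw_only_sue, saw_sue_and_bill):
        print(scenario.__name__,
              identificational("Sue", M, scenario),
              exhaustive("Sue", M, scenario))
```

In the second scenario the identificational condition is still satisfied, whereas the exhaustiveness condition fails, which mirrors the difference between 101a and 101b.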
7.2.3 Contrastive Topics
Büring (1999, 2003)
Büring (1999, 2003) proposes an analysis for Contrastive Topics (CT, henceforth)2. In
particular, Büring discusses CTs in English and German, which mark CTs with stress,
namely with a rising pitch contour, L-H*, unlike Focus, which receives a falling one, H-
L*. In example 102, the first constituent is the CT, while the last one is a Focus.
(102) a. A: Which book would Fritz buy?
b. B: [I]CT would buy [‘The Hotel New HAMPshire’]F
c. B’: # I would buy [‘The Hotel New HAMPshire’]F
Büring’s idea is that CTs introduce alternatives, in a similar way as Focus does. How-
ever, instead of introducing a set of propositions, like Focus does (Rooth, 1992), a CT
introduces a set of sets of propositions or, in other words, a set of questions. Consider
again example 102. The focus value of answer B (103a) is a set of propositions, such as
the one in 103b, while its Contrastive Topic value (represented by [[A]]CT ) is a set of such
sets of propositions, such as in 103c or 103d.
(103) a. A: [I]CT would buy [‘The Hotel New HAMPshire’]F
b. [[A]]F : {I would buy ‘War and Peace’, I would buy ‘The Hotel New Hamp-
shire’, I would buy ‘The World According to Garp’, ...}
c. [[A]]CT : {{I would buy ‘War and Peace’, I would buy ‘The Hotel New Hampshire’,
I would buy ‘The World According to Garp’, ...}, {Paul Simon would
buy ‘War and Peace’, Paul Simon would buy ‘The Hotel New Hampshire’, Paul
Simon would buy ‘The World According to Garp’, ...}, {Fritz would buy ‘War
and Peace’, Fritz would buy ‘The Hotel New Hampshire’, Fritz would buy ‘The
World According to Garp’, ...}}
d. {Which book would I buy, which book would Paul Simon buy, which book
would Fritz buy, ...}
2 In earlier papers, Büring refers to Contrastive Topics as S-Topics or Sentence Topics.
In addition, Büring proposes the following Question/Answer Condition:
(104) Question/Answer Condition: the meaning of the question Q must match one ele-
ment in the Topic value ([[A]]CT ) of its answer A.
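The relation between the focus value, the Contrastive Topic value and the Question/Answer Condition can be sketched in a few lines. This is an illustrative model, not Büring's formalization: propositions are represented as (buyer, book) pairs, a question is identified with the set of its possible answers, the lists of buyers and books are small hypothetical domains, and an answer without CT marking is assumed to have a Topic value containing only its focus value.

```python
# Hypothetical finite domains standing in for the open-ended alternatives in 103.
BOOKS = ["War and Peace", "The Hotel New Hampshire", "The World According to Garp"]
BUYERS = ["I", "Paul Simon", "Fritz"]


def focus_value(buyer, books=BOOKS):
    """Set of propositions obtained by varying the focused book (as in 103b)."""
    return frozenset((buyer, book) for book in books)


def ct_value(buyer, ct_marked, buyers=BUYERS, books=BOOKS):
    """Set of questions obtained by also varying the CT-marked buyer (as in 103c/d).

    Assumption: without CT marking, only the focal alternatives are introduced.
    """
    if ct_marked:
        return frozenset(focus_value(b, books) for b in buyers)
    return frozenset([focus_value(buyer, books)])


def question_meaning(buyer, books=BOOKS):
    """'Which book would <buyer> buy?' modeled as the set of its possible answers."""
    return focus_value(buyer, books)


def qa_condition(question, topic_value):
    """Question/Answer Condition (104): the question must be an element of the Topic value."""
    return question in topic_value


if __name__ == "__main__":
    question = question_meaning("Fritz")
    print(qa_condition(question, ct_value("I", ct_marked=True)))   # answer B in 102
    print(qa_condition(question, ct_value("I", ct_marked=False)))  # answer B' in 102
```

Under these assumptions the condition is met by the CT-marked answer and fails for the unmarked one, since the question about Fritz is one of the questions in the first Topic value but not in the second.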
This condition explains the felicity of a Contrastive Topic, as illustrated in answer B of
102: since the answer introduces alternatives, including ‘Which book would Fritz buy?’,
the meaning of the question matches this alternative.3 In contrast, answer B’ is not felic-
itous, since it lacks the Topic marking and only the focal alternatives are introduced. The
Question/Answer Condition also explains the felicity of a partial Topic, in which the an-
swer addresses part of the Topic. The meaning of the question 105a matches one element
in the Topic value of the answer, namely the one represented first in the set in 105d.
(105) a. A: What did the pop stars wear?
b. B: The [female]T pop stars wore [caftans]F
c. B’: # The female pop stars wore [caftans]F
d. {What did the male or female pop stars wear, what did the female pop stars
wear, what did the male pop stars wear, what did the Italian pop stars wear}
3 Krifka (1999) points out that the Question/Answer Condition is a necessary, but not sufficient, condition
for the felicity of this type of answer. The answer only makes sense if we can assume that there is some
relation between the original question and the answer.
Furthermore, Büring assumes that CTs carry the following implicature:
(106) Topic Implicature: Given a sentence A, containing a Contrastive Topic, there is
an element Q in [[A]]CT such that Q is still under consideration after uttering A.
That is, there is a question in the set of questions denoted by [[A]]CT which is still
disputable, which he calls Residual Topic. The Residual Topic might be 107a for 103a and
107b for 105. Note that in both cases the Residual Topic is an element of the Topic Value;
in the second case, it is the original Topic, which has not been yet resolved.
(107) a. What would Fritz buy?
b. What did the male pop stars wear?
This Topic Implicature accounts for the so-called purely implicational Topic, as illus-
trated in example 108. The answer without the Topic marking would be felicitous as well.
However, the answer with the Topic is also acceptable. According to Büring, the CT in-
troduces alternatives, such as the ones in 109 and, by the Topic Implicature, at least one
element in 109 is still under consideration and can serve as a Residual Topic. With this kind
of utterance, the speaker can indicate that there is at least one person whose wife might or
might not have kissed other men.
(108) a. A: Did your wife kiss other men?
b. B: [My]CT wife [didn’t]F kiss other men
c. B’: My wife [didn’t]F kiss other men
(109) {{my wife kissed other men, my wife didn’t kiss other men}, {your wife kissed
other men, your wife didn’t kiss other men}, {Fritz’ wife kissed other men, Fritz’
wife didn’t kiss other men}, ...}
According to Büring (2003), CTs are used to mark a discourse strategy to answer a
Question Under Discussion and are obligatory when there is an implicit sub-question. For
instance, in the case of partial Topics (105), an answer with a CT is not fully answering
the Question Under Discussion, but an implicit sub-question, which addresses part of the
Question Under Discussion. Büring relates this to the fact that not-given information needs
to be marked in discourse.
Hara and van Rooij (2007)
Hara and van Rooij (2007) claim that the Japanese morpheme wa is a Contrastive Topic
marker and that it is licit when the speaker is not sure of the alternatives having the prop-
erty denoted by the verb or when the speaker knows that the alternatives do not have this
property, as the following example shows.
(110) a. Among John and Bill, who came to the party?
b. JOHN-wa kita.
John-wa came.
(John came, Bill didn’t come or I don’t know about Bill; the speaker considers
the possibility that ‘Bill came’ is false.)
Hara and van Rooij (2007) point out a number of problems with Büring’s proposal and
propose a simpler way to obtain the Topic alternatives. The main problem with Büring’s
approach is that it predicts that Topic marking can only occur with partial answers and
should not be able to occur when questions are completely resolved. This prediction is
not borne out, as answer 111b shows. That is, the CT marking on Bill implies that there
is still an alternative under consideration. However, at this point, the question has been
completely answered. A possible way out for Büring would be to limit the domain of the
partial-answer requirement to each conjunct. However, then another problem would arise:
answer 111c would be predicted to be felicitous.
b. [CT John] came, and [CT Bill] didn’t come.
c. # [CT John] came, and [CT Bill] came.
Tomioka (2008) also notes that Japanese wa is licensed even without a focused constituent
and this poses a problem for Büring’s approach, which crucially relies on a focus-marked
constituent to generate the alternatives.
Hara and van Rooij (2007) propose that Topic marking creates a simple set of Topic-
alternative propositions and gives rise to the implicature that one of the Topic-alternatives
is not known to be true by the speaker. Crucially, knowledge is defined as “a speaker
has more knowledge about P if she knows of more individuals that they have property P”
(Schulz and Van Rooij, 2006): thus, knowing that some individual does not have property
P is not counted as knowledge. Then, for the speaker not to know that one of the Topic
alternatives is true is compatible both with not knowing whether it is true and with knowing
that it is false. Their proposal is summarized in 112.
(112) a. Topic alternatives: {P(T’): T’ ∈ Alt(T)}
b. CT-implicature: ∃T’ [T’ ∈ Alt(T)] [¬Ksp(P(T’))], where Ksp represents “the
speaker knows that”.
This proposal derives the contrast in 111 in the following way. Consider first the ac-
ceptable answer in 111b. The first conjunct of this sentence generates the Topic alternatives
in 113a and the implicature in 113b. The implicature is compatible with the second con-
junct of the sentence; in fact, the second conjunct is just strengthening the implicature
(113c). The second conjunct generates the set of Topic alternatives in 114a and the impli-
cature in 114b. Again, the implicature is compatible with the assertion of the first conjunct
(114c). Note that the implicature of the second conjunct is not informative, since it conveys
something weaker than the previous assertion: however, this does not render the discourse
infelicitous. Compatibility between implicatures and assertions is all that is needed to
make the discourse felicitous.
(113) a. Topic alternatives: {John came, Bill came}
b. CT-implicature (1st conjunct): ¬Ksp(Bill came)
Possibly Bill did not come.
c. ¬Ksp(Bill came) [Implicature] and Ksp¬(Bill came) [Assertion]
(114) a. Topic alternatives: {John did not come, Bill did not come}
b. CT-implicature (2nd conjunct): ¬Ksp(John did not come)
Possibly John came.
c. Ksp(John came) [Assertion] and ¬Ksp(John did not come) [Implicature]
Consider now the unacceptable answer in 111c. The same Topic alternatives are gen-
erated for both conjuncts, namely those in 113a. The CT of the first conjunct implicates
¬Ksp (Bill came) and this is contradicted by the assertion of the second conjunct: Ksp (Bill
came). Note also that this implicature should therefore be treated as a Conventional
Implicature (in the sense of Potts (2007)), since it cannot be canceled by a following
assertion.4
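The compatibility reasoning just described can be made concrete with a toy possible-worlds model. The Python sketch below is only an illustration under stated assumptions: Ksp(p) is modeled as truth of p in every world of the speaker's epistemic state, the domain contains only John and Bill, and the names WORLDS, knows and compatible_states are hypothetical helpers, not part of Hara and van Rooij's proposal.

```python
from itertools import product

# Hypothetical toy model: a world records who came to the party.
PEOPLE = ("John", "Bill")
WORLDS = [dict(zip(PEOPLE, values))
          for values in product([True, False], repeat=len(PEOPLE))]

def came(person):
    """Proposition 'person came', as a function from worlds to truth values."""
    return lambda world: world[person]

def negate(prop):
    """Proposition that is true exactly where `prop` is false."""
    return lambda world: not prop(world)

def knows(state, prop):
    """Ksp(prop): prop holds in every world the speaker considers possible."""
    return all(prop(world) for world in state)

def compatible_states(asserted, not_known):
    """Non-empty epistemic states in which every asserted proposition is known
    and no proposition in `not_known` is known (the CT-implicatures)."""
    states = []
    for mask in product([True, False], repeat=len(WORLDS)):
        state = [w for w, keep in zip(WORLDS, mask) if keep]
        if not state:
            continue
        if (all(knows(state, p) for p in asserted)
                and not any(knows(state, p) for p in not_known)):
            states.append(state)
    return states

# 111b: John came, Bill did not come; implicatures as in 113b and 114b.
answer_111b = compatible_states(
    asserted=[came("John"), negate(came("Bill"))],
    not_known=[came("Bill"), negate(came("John"))],
)
# 111c: John came, Bill came; both conjuncts use the alternatives in 113a.
answer_111c = compatible_states(
    asserted=[came("John"), came("Bill")],
    not_known=[came("Bill"), came("John")],
)
print(len(answer_111b), len(answer_111c))  # prints: 1 0
```

Under these assumptions, exactly one epistemic state supports the acceptable answer, while no state supports the unacceptable one, which mirrors the contradiction between implicature and assertion.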
Tomioka (2008) notes that this knowledge based analysis cannot easily account for the
presence of CT marking in a variety of speech acts, other than assertions, such as questions,
imperatives or performatives (see example 115). His approach is discussed in the next
section.
(115) a. Zyaa Erika-WA doko-e itta-no?
“Well, then, where did [CT Erika] go?”
b. Eego-WA tyanto yatte-ok-e.
“At least, prepare yourself for [CT English].”

4
Or rather a stronger cue is needed to cancel the implicature, such as the particle too. Krifka (1999)
proposes an analysis for these particles, which basically provides a mechanism to get around what he calls
the Distinctiveness Constraint, similar to the CT-implicature discussed here.
Another problem with Hara and van Rooij (2007) is that it does not follow from their
approach why the CT marking should be obligatory in 111b. That is, it is not explained
why the alternatives and the CT-implicatures are crucial for the felicity of the discourse,
which is something Büring (1999) did address.
Tomioka (2008)
Tomioka’s analysis is based on the idea that CTs operate at the level of Speech Acts, which
are assumed to be within the bounds of sentence grammar. A Contrastive Topic triggers
a set of alternatives, not at the sentence level, but at the speech act level. In Tomioka’s
approach the rest of the work is done by Gricean reasoning, as is usually applied to im-
plicatures. Consider again the answer in 110: the Contrastive Topic generates a set of
alternative speech acts and the pragmatic Gricean reasoning applies.
The Topic alternatives are derived as follows:
(116) a. [[[John]1CT came]]f(g) = {p : ∃h, h is a distinguished assignment, p = λw. h(1)
came in w} = {p : ∃x ∈ De, p = λw. x came in w}
b. [[assert [John]1CT came]]f(g) = {a : ∃x ∈ De, a = assert(λw. x came in w)}
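The derivation in 116 can be sketched procedurally: the CT-marked element is replaced by every individual in the domain while the illocutionary force and the predicate are held constant. The following Python fragment is a minimal illustration, not Tomioka's formalization; the SpeechAct class, the topic_alternatives function and the two-member domain are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SpeechAct:
    force: str      # e.g. "assert", "ask", "command"
    topic: str      # the CT-marked individual
    predicate: str  # what is said about the topic

def topic_alternatives(act, domain):
    """Replace the CT-marked element by every individual in `domain`,
    keeping force and predicate fixed (cf. 116b)."""
    return {SpeechAct(act.force, x, act.predicate) for x in domain}

DOMAIN = {"John", "Bill"}  # hypothetical set of contextually relevant individuals
uttered = SpeechAct("assert", "John", "came")
alternatives = topic_alternatives(uttered, DOMAIN)

# The alternatives that the speaker could have carried out but did not.
not_carried_out = alternatives - {uttered}
print(not_carried_out)
```

Because the force is a parameter of the speech act, the same function yields alternative questions or commands when the force is "ask" or "command".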
Then, the usual Gricean reasoning applies to the set of alternatives in 116b.
(117) a. The speaker asserted that John came.
b. There are two possible assertions that she could have made, but she only as-
serted one of them.
c. There must be a reason for not asserting the remaining one.
This last step brings about the sense of uncertainty or incompleteness which invites the
hearer to make speculations about the reasons for using a CT. The listener may deduce that
the speaker does not know about Bill, but this is only one of the possibilities. If the hearer
knows that the speaker has complete knowledge, she might think that the speaker considers
it impolite to advertise that Bill did not come to the party, etc.
Thus, Tomioka’s analysis in principle does not preclude the possibility that the speaker
is fully knowledgeable (and neither did Hara and van Rooij’s). He notes, however, that
this possibility is absent with CT on measure phrases. The answer in 118b conveys that
the speaker does not have full knowledge or, in other words, ‘three’ cannot mean ‘exactly
three’. Tomioka derives this effect from a competition between the CT marking and the
focus marking. The speaker could have marked the measure phrase with a focus accent, as
in 118c. The result of this competition between CT and focus marking adds an extra step
in the pragmatic reasoning, as specified in 119:
(118) a. How many people will come to the party?
b. SAN-NIn-wa kuru-desyoo.
“(At least) Three people will come.”
c. SAN-NIn kuru-desyoo.
“Three people will come.”
(119) d. The speaker could have avoided using a CT by using focus. There must be a
reason for the speaker choosing a CT.
A CT will, then, bring about a sense of uncertainty or incompleteness when a focus
could have been used, but was not (as in 118). A CT will not preclude complete knowledge
on the part of the speaker if focus is not possible, as in examples 110 and 111b repeated
below, in which it is not possible to use focus marking.
(120) a. Did both Ken and John come to the party?
b. # John-ga ki-ta.
“[JohnF] came.”
c. # John-ga ki-ta ga Bill-ga ko-nakat-ta.
“[JohnF] came but [BillF] did not come.”
Tomioka’s approach can capture the compatibility of wa with a variety of speech acts.
However, it loses some of the insights from Hara and van Rooij’s theory. Consider first
121b. Tomioka predicts that this sentence could be uttered in the following scenario: the
speaker organized a party and invited, among other people, John and Bill. Bill was not
really supposed to come to the party, because he was supposed to help take care of his
twins, who are sick, but he came to the party anyway. The hearer knows that the speaker
has complete knowledge about who came to the party and about Bill’s situation. In this
scenario it should be possible for the hearer to reason that the alternative assertion was
not uttered because the speaker considers it impolite to announce that Bill did come to the
party. However, 121b is not acceptable in this scenario and, if the speaker has complete
knowledge, it can only be understood as implying that Bill did not come.
Consider now 121c. It is not clear how the Gricean reasoning should apply in this
discourse. Steps (b) and (c) of the Gricean reasoning in 117 do not apply in 121c because
in fact the alternatives were asserted, so it is not clear what Tomioka’s predictions are for
this example.
(121) b. [CT John] came.
c. # [CT John] came, and [CT Bill] came.
7.2.4 A reformulation
My proposal is to combine the insights of Hara and van Rooij (2007), Tomioka (2008) and
Büring (1999, 2003).5 From Hara and van Rooij (2007), I use the idea that CTs trigger
an implicature, to which I refer as uncertainty contrast; from Tomioka (2008), I use the
idea that CTs operate on Speech Acts and, from Büring (1999, 2003), I use the idea that
CTs are obligatory to address subquestions of the Question Under Discussion. Finally, I
also propose that the CT-implicature, the uncertainty contrast, can be strengthened into a
stronger contrast under certain circumstances.

5
I thank Elena Castroviejo for discussion of this section.
CTs convey a CT-implicature, which is basically the CT-implicature proposed by Hara
and van Rooij (2007), but applied to alternative Speech Acts (derived following the mech-
anism proposed by Tomioka (2008)). This implicature is informally stated in 122.
(122) CT-implicature (uncertainty contrast): the speaker conveys that she is not carrying
out the alternative speech acts generated by the mechanism in 116.
I use ‘uncertainty contrast’ as a label to refer to this meaning: after triggering a set of
alternative speech acts, the speaker refuses to carry them out and the hearer is left wonder-
ing why this is so. In the case of assertions, not carrying out an assertion amounts to the
proposal by Hara and van Rooij (2007): there is an alternative assertion that the speaker
does not want to make either because the speaker knows it is false or because she does not
know whether it is true.
Consider now speech acts other than assertions, such as the ones in 115. Regarding the
question (115a), a set of alternative questions are generated and it is CT-implied that the
speaker is not carrying out the alternative questions. That is, although there are alternative
questions she could make, the speaker is only asking about the CT-marked referent. As for
the imperative (115b), a set of alternative commands are generated and it is CT-implied that
the speaker does not want to carry out these alternative commands either because she has
nothing to say about subjects other than English or because she thinks that her addressee
should not prepare for subjects other than English.
Furthermore, this CT-implicature can be strengthened into a stronger meaning, under
certain circumstances, whenever one of the Topic-alternatives is salient either in the context
or the discourse itself.
(123) Strengthened CT-implicature (exhaustive contrast): a speech act is implied, whose
content is the opposite of the content of the salient speech act.
In the case of assertions, the Strengthened CT-implicature implicates the opposite as-
sertion of the salient assertion alternative. That is, the speaker conveys that the proposition
expressed by the salient assertion alternative is false. I use ‘exhaustive contrast’ as a label
to refer to this meaning.
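The division of labor between 122 and 123 can be summarized as a small decision procedure. The Python sketch below is an informal illustration only: speech acts are represented as hypothetical (force, topic, predicate) tuples, and the function name ct_implicature and its return labels are assumptions introduced for the example.

```python
def ct_implicature(uttered, alternatives, salient=None):
    """Return the kind of contrast conveyed by a CT-marked speech act.

    uttered      -- the speech act actually carried out
    alternatives -- the set of Topic alternatives (including `uttered`)
    salient      -- an alternative made salient by context or discourse, if any
    """
    not_carried_out = set(alternatives) - {uttered}
    if salient is not None and salient in not_carried_out:
        # (123): the opposite of the salient alternative is implied.
        return ("exhaustive contrast", {salient})
    # (122): the speaker merely declines to carry out the alternatives.
    return ("uncertainty contrast", not_carried_out)

john = ("assert", "John", "came")
bill = ("assert", "Bill", "came")

weak = ct_implicature(john, {john, bill})
strong = ct_implicature(john, {john, bill}, salient=bill)
print(weak[0], "/", strong[0])  # prints: uncertainty contrast / exhaustive contrast
```

The sketch deliberately leaves open why the speaker declines to carry out the alternatives in the unstrengthened case, since that is what the hearer is left to infer.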
Finally, we also need Büring’s idea about the relationship between CTs and the Ques-
tion Under Discussion, which is repeated in 124.
(124) CTs are obligatory when the utterance does not address the Question Under Discussion
directly, but an implicit sub-question of the Question Under Discussion.
The next section shows how this reformulation accounts for all the data presented in
Section 7.1.
7.3 Analysis of contrast in Romance OSPs
7.3.1 Contrastive OSPs as Contrastive Topics
The main proposal of this chapter is that, in all the examples from section 7.1, the pro-
noun is a Contrastive Topic marker, which conveys the CT-implicature just presented: the
speaker conveys that she is not carrying out the generated alternative speech acts. This
CT-implicature can be strengthened in certain circumstances: that is, the uncertainty con-
trast can be coerced into an exhaustive contrast. In this section, I apply this analysis to
contrastive OSPs and I make more precise the pairing between forms and meanings. I concentrate
on applying the analysis to the Catalan examples, but the same reasoning applies
to Spanish and Italian.
All the corpus data presented here deals with assertions and, thus, Tomioka’s insight
about CTs operating on Speech Acts is not crucial to derive the correct interpretation. How-
ever, Romance OSPs provide support for Tomioka’s idea: contrastive OSPs can appear in
Speech Acts other than assertions, without their contrastive import changing. For instance,
125 is an example of an imperative. The answer with the OSP triggers a set of alternative
commands (126), which the speaker chooses not to utter. The meaning we derive is that
the speaker does not have anything to say about Bill or that she thinks that Bill should not
prepare for English.
(125) a. Quins exàmens ens hauríem de preparar els estudiants de primer?
‘Which exams should we, first year students, prepare for?’
b. Tu prepara’t l’examen d’anglès.
‘You prepare yourself for English.’
(126) {I command that you prepare for English, I command that Bill prepares for English,
... }
Let us now go back to the data. The weak contrast examples will be examined first,
since they are the ones that fit best with the unstrengthened meaning of Contrastive Topics.
Consider first example 94a, repeated in 127a. By virtue of being a CT marker in an asser-
tion, the OSP introduces alternative assertions (127b) and the CT-implicature conveys that
the speaker is not asserting the introduced alternatives. An uncertainty contrast is derived:
the speaker is not committing herself to asserting what other members of the family will
eat. Note that the answer is a partial answer to the Question Under Discussion, that is, an
answer to a subquestion of the Question Under Discussion. As a consequence, we expect
the OSP to be obligatory. An answer without the OSP would be unacceptable in this con-
text, because it would present the mother’s answer as a complete answer to the Question
Under Discussion, although obviously this is not the case. Note also that, at the end of the
dialogue, the Question Under Discussion is completely resolved.
(127) a. ‘Què voldran per sopar?’ La mare diu: ‘Bé, doncs jo vull pollastre’ i el pare
‘Doncs, jo vull sopa’.
“ ‘What will you have for dinner?’ The mother says: ‘Well, I’ll have chicken’
and the father says ‘Well, I will have soup’.”
b. {I assert ‘My husband will have soup’, I assert ‘My son will have soup’}
Consider now example 94b, repeated below. This is what Büring would call a Purely
Implicational Topic: the OSP is not necessary because the utterance does not need to be
interpreted as an answer to a sub-question of the Question Under Discussion. However, the
OSP serves to trigger the CT-implicature about the non-asserted propositions. That is, it
serves to convey an uncertainty contrast: as far as the speaker is concerned, someone else
may (or may not) know about the frog.
(128) a. “Miri, senyora, nosaltres no sabem pas res de cap granota.”
“Look, ma’am, we don’t know anything about any frog.”
b. {I assert ‘The kids don’t know anything about any frog’, I assert ‘The cooks
don’t know anything about any frog’}
Let us move now to the implicit contrast exemplified by the examples in 92, repeated
below in 129 and 130. The OSP also introduces Topic alternatives but, in these cases, the
CT-implicature gets strengthened: the speaker does not want to carry out the alternative
assertion, not because she does not know whether it is true or not, but because she knows
it is false. The uncertainty contrast is coerced into an exhaustive contrast. The discourse
leaves no feeling of uncertainty to the hearer because there is one salient Topic alternative
in the context which is not likely to be true.
For instance, in 129a, the hearer can easily see that one of the Topic alternatives of the
set (represented in 129b), namely the assertion ‘the small frog is the big one’ is not true.
Thus, the CT-implicature gets strengthened: the weak contrast becomes an implicit contrast
between two entities and it is implicated that the contextually salient Topic-alternative is
not true.
(129) a. El nen torna a renyar la granota gran i li torna a dir que això no pot ser, que han
de ser amics, que s’han de comportar bé i que l’ha de cuidar perquè ella és la
gran.
“The child scolds the big frog again and tells it again that this can’t continue,
that they should be friends, they should behave themselves and that ∅_bigfrog
should take care of it because she_bigfrog is the big one.”
b. {The child asserts ‘The little frog is the big one’, The child asserts ‘The big
frog is the big one’}
The same reasoning holds for 130a. It is clear in the story told by the speaker that the
boy (referred to by the pronoun he) was not looking forward to the dinner, while the rest of
the family was. With the use of the Contrastive Topic, it is implied that this salient Topic
alternative is not true: the CT-implicature is strengthened and a stronger implicit contrast
is conveyed.
(130) a. En el camí de tornada tots estan enfadats i ell, en canvi, està content perquè ell
no tenia cap ganes d’anar-se’n a sopar.
“On the way home, they are all angry and he, in contrast, is happy, because he
was not looking forward to going out for dinner.”
b. {The speaker asserts ‘The rest of the family was not looking forward to
going out for dinner’, the speaker asserts ‘the pets were not looking forward
to going out for dinner’}
Note that the OSP refers to a previous subject (against its tendency) and that its use
is optional, since the utterance does not need to be interpreted as an answer to a sub-
question of the Question Under Discussion. An NSP would refer to the same antecedent
but no implicature would be triggered (there would be no implicit contrast between the two
referents). Thus, the OSP is used here not to select a particular referent, but to convey a
particular implicature.
Finally, consider the cases of double contrast from 85, also repeated below. In these
cases, the OSP is mandatory and there is no uncertainty feeling overall in the discourse. I
argue that this is just another case of the Strengthened CT-Implicature being conveyed, of
an uncertainty contrast being coerced into an exhaustive contrast. The OSP is mandatory
because a CT is needed to mark that each of the conjuncts is addressing a sub-question of
the Question Under Discussion.
Consider the example in 85b, repeated below in 131. This whole discourse is an answer
to the implicit question under discussion in 132a, but each conjunct is an answer to the sub-
questions in 132b and 132c and, thus, the OSP is needed in each conjunct.6 Sentences with
NSPs are understood as complete answers to the Question Under Discussion. If an NSP
were used in 131, the second conjunct would become uninterpretable.
(131) Ara nosaltres anirem a navegar per l’aigua i tu et quedaràs aquí sola.
“Now we will go to the boat to sail in the lake and you will stay here on your own.”
(132) a. What will everyone do?
b. What will we do?
c. What will you do?

6
Or an adverbial of the type discussed by Matos Amaral and Schwenter (2005). See Section 7.3.2 for
more comments on this.
With a pronoun marking a CT in each conjunct, a set of alternatives is introduced for
each conjunct (133a and 133b). The hearer will try to make sense of these sets and of why
the speaker did not assert the non-asserted alternatives. In this case, he can easily arrive
at the conclusion that the alternatives were not asserted because they are not true, since,
in fact, they are explicitly negated in the discourse. The discourse, thus, does not leave an
overall feeling of uncertainty and, by virtue of the Topic alternatives being explicit in the
discourse, the CT-implicature can be strengthened.
(133) a. {I assert ‘We will sail in the lake’, I assert ‘you will sail in the lake’}
b. {I assert ‘We will stay here’, I assert ‘you will stay here’}
In this example, and in all examples of double contrast, there is a rhetorical relation of
contrast, but this is orthogonal to the discussion. Whenever one of the Topic alternatives
ends up being subsequently negated, the resulting rhetorical relation will be of contrast, but
this does not need to be the case, as in the other examples discussed above (examples 127,
128, 129, and 130).
The approach defended here explains two correlations reported in sociolinguistic stud-
ies as well: correlations between pronominal subject expression and (i) first person singu-
lar pronouns (Cameron, 1992; Silva-Corvalán, 1994) and (ii) psychological verbs (Silva-
Corvalán, 1994; Travis, 2005). First person singular often serves to convey the speaker’s
own opinion. It is, then, not unexpected that first person singular triggers a higher rate
of overt pronouns if the speaker wishes to convey that those are exclusively her opinions
which may or may not coincide with those of other people. The same argument holds
for psychological verbs which express subjective opinions or points of view. Note that al-
though variationist studies claim that they exclude contrastive pronouns from the envelope
of variation, they would not exclude those cases in which the Contrastive Topic conveyed
by the OSP is optional (such as examples 94 and 92). These optional contrastive pronouns
could also account for some percentage of OSPs in same reference contexts, which, in
principle, favor NSPs. Finally, this approach is also consistent with views coming from
discourse analysis studies. For instance, Davidson (1996) claims that pronouns serve to
increase the ‘pragmatic weight’ of utterances and make them ‘more personally relevant’
and Stewart (2003) claims that overt pronouns are used as a way to hedge the speaker’s
opinions and protect their pragmatic face. In fact, these analyses point to observations that
are a byproduct of pronouns expressing Contrastive Topics and, thus, bringing about some
sense of uncertainty or non-finality about the non-asserted alternatives. The speaker may
wish to trigger this uncertainty feeling for politeness reasons: by uttering a Contrastive
Topic in first person singular, the speaker is protecting her face (the “public self-image that
every member wants to claim for themselves” (Brown and Levinson, 1987)) and conveying
that she is making a modest claim that need not be true of other discourse alternatives.
7.3.2 Game theory and contrast
Let me begin this section with a reminder of the pairings between forms and meanings
found in Romance null subject languages. These will be the pairings derived by means of
game theory.
• A contrastive overt pronoun is a Contrastive Topic marker (Hara and van Rooij,
2007; Tomioka, 2008), which conveys an uncertainty contrast, by means of the CT-
implicature. This uncertainty meaning can be coerced into an exhaustive meaning if
there are enough contextual cues (if there is one salient alternative in the context or
the discourse).
• In cases of double contrast, the overt pronoun cannot be replaced by a null pronoun
alone, but can be replaced by another contrast marker, such as an adverbial. There
is no overall uncertainty meaning.
• Stressed overt pronouns convey an exhaustive contrast, but are not acceptable in dou-
ble contrast structures (see Section 7.1.4).
A contrastive OSP is not used to select the correct antecedent, but mainly to trigger
the desired interpretation. Thus, the information states in the game cannot select between
different antecedents, but between different interpretations. The relevant interpretations are
(i) a non-contrastive one, (ii) an uncertainty contrast interpretation and (iii) an exhaustive
contrast interpretation. My proposal is that we need a chain of two games to derive the
pairing between forms and meanings described above. In the first game, the decision of the
speaker is between uttering an NSP or an OSP and the decision of the hearer is between
interpreting the discourse contrastively or non-contrastively (I call this game Contrast I
Game, henceforth). In the second game, the decision of the speaker is which type of OSP
to utter, stressed or non-stressed, and the decision of the hearer is which type of contrastive
interpretation to arrive at, uncertain or exhaustive (I call this game Contrast II Game).
Contrast I Game can be seen in figure 7.1 and its structure is as follows. There are
two relevant Information States: s1 , Non-Contrast, and s2 , Contrast. In the former state,
the utterance is a complete answer to the Question Under Discussion and no contrast with
other entities is conveyed. In the latter state, there is some contrast between the referent of
the pronoun and some other entity. In each of the information states, the speaker has two
options: she can either use an NSP or an OSP. Note that it is left underspecified in Contrast
I Game both whether the OSP is stressed or not and whether contrast is exhaustive or not.
This is precisely the goal of Contrast II Game.
The two pronouns are in principle ambiguous with respect to the two potential inter-
pretations. Whenever an NSP is used, the hearer will be in the information set {t1 , t2 }.
Figure 7.1: Contrast I Game
Whenever an OSP is used, the hearer will be in the information set {u1 , u2 }. The payoffs
for the two pronominal forms are kept constant from past chapters: 10 for NSPs and 8 for
OSPs in both information states. As for the probabilities of the two information states,
Non-contrast is the default, unmarked state, while Contrast is the non-default marked state,
and therefore, the probability assigned to the former, p1, is greater than the probability
assigned to the latter, p2 . With this state of affairs we derive the following Nash equilibria.7
(134) a. {(s1, NSP), (s2, OSP), ({t1, t2}, Non-contrast), ({u1, u2}, Contrast)}. The
expected payoff is: p1(10) + p2(8) = 2/3(10) + 1/3(8) = 28/3.
b. {(s1, OSP), (s2, NSP), ({t1, t2}, Contrast), ({u1, u2}, Non-contrast)}.
The expected payoff is: p1(8) + p2(10) = 2/3(8) + 1/3(10) = 26/3.
c. {(s1, NSP), (s2, NSP), ({t1, t2}, Non-contrast), ({u1, u2}, Non-contrast)}. The
expected payoff is: p1(10) + p2(-10) = 2/3(10) + 1/3(-10) = 10/3.

7
The calculations assume p1 = 2/3 and p2 = 1/3, but the equilibrium will be the same whenever
p1 > p2.
None of the other strategies are Nash equilibria. The first of the three Nash equilib-
ria (134a) is the only Pareto-Nash equilibrium of the game, the Nash equilibrium with the
highest payoffs. According to this equilibrium, the speaker should use an NSP in s1 and an
OSP in s2. Also, when the hearer finds himself in the information set {t1, t2}, he should
interpret the NSP as not conveying contrast; when the hearer finds himself in the information
set {u1, u2}, he should interpret the OSP as conveying contrast.
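The expected payoffs reported in 134 can be recomputed mechanically. The Python sketch below is an illustration under stated assumptions: both players receive the same payoff, successful communication yields the payoff of the form used (10 for NSPs, 8 for OSPs), miscommunication yields -10 (the value that appears in 134c), and the probabilities are those of footnote 7. For 134b the hearer is taken to read the NSP as contrastive, which is what the payoff p1(8) + p2(10) requires. The dictionary encoding of strategies is a hypothetical choice made for the example.

```python
from fractions import Fraction

FORM_PAYOFF = {"NSP": 10, "OSP": 8}   # payoffs stated in the text
MISCOMMUNICATION = -10                # assumed from the calculation in 134c
PRIOR = {"Non-contrast": Fraction(2, 3), "Contrast": Fraction(1, 3)}  # footnote 7

def expected_payoff(speaker, hearer, prior=PRIOR):
    """Expected payoff of a pure strategy profile.

    speaker -- maps each information state to a form
    hearer  -- maps each form to an interpretation
    """
    total = Fraction(0)
    for state, prob in prior.items():
        form = speaker[state]
        understood = hearer[form] == state
        total += prob * (FORM_PAYOFF[form] if understood else MISCOMMUNICATION)
    return total

profile_a = expected_payoff(
    speaker={"Non-contrast": "NSP", "Contrast": "OSP"},
    hearer={"NSP": "Non-contrast", "OSP": "Contrast"},
)
profile_b = expected_payoff(
    speaker={"Non-contrast": "OSP", "Contrast": "NSP"},
    hearer={"NSP": "Contrast", "OSP": "Non-contrast"},
)
# In the third profile the OSP is never sent, so the hearer's reading of it
# does not affect the value.
profile_c = expected_payoff(
    speaker={"Non-contrast": "NSP", "Contrast": "NSP"},
    hearer={"NSP": "Non-contrast", "OSP": "Non-contrast"},
)
print(profile_a, profile_b, profile_c)  # prints: 28/3 26/3 10/3
```

Exact fractions are used so that the results can be compared directly with the values given in 134.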
This equilibrium corresponds to Horn’s division of pragmatic labor (Horn, 2004), ac-
cording to which marked forms express marked meanings and unmarked forms express un-
marked meanings. A further prediction of the game theoretical model is that if probabilities
change, the Pareto-Nash equilibrium of the game will also change. That is, if probability
p2 becomes greater than p1 , then the Pareto-Nash equilibrium would be the second equi-
librium of 134, rather than the first one. For the purposes of our game, what could make
the probabilities change? What could make a contrastive interpretation more likely than a
non-contrastive answer? The adverbials studied by Matos Amaral and Schwenter (2005)
are a good candidate. Since the adverb is already marking contrast, s2 becomes more likely
and the equilibrium in 134b becomes the Pareto-Nash equilibrium: in this circumstance,
the null pronoun can be used felicitously when the speaker wants to convey contrast.
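The effect of changing the probabilities can be checked with a short calculation. The sketch below assumes the payoffs given in the text (10 for NSPs, 8 for OSPs) and compares the expected payoffs of the profiles in 134a and 134b as a function of p1, with p2 = 1 - p1; the function names are hypothetical.

```python
from fractions import Fraction

def payoff_134a(p1):
    """NSP for Non-contrast (probability p1), OSP for Contrast (1 - p1)."""
    return 10 * p1 + 8 * (1 - p1)

def payoff_134b(p1):
    """OSP for Non-contrast (probability p1), NSP for Contrast (1 - p1)."""
    return 8 * p1 + 10 * (1 - p1)

def pareto_preferred(p1):
    """Which of the two separating profiles yields the higher expected payoff."""
    a, b = payoff_134a(p1), payoff_134b(p1)
    if a == b:
        return "tie"
    return "134a" if a > b else "134b"

for p1 in (Fraction(2, 3), Fraction(1, 2), Fraction(1, 3)):
    print(p1, pareto_preferred(p1))
```

Since payoff_134a(p1) - payoff_134b(p1) = 4·p1 - 2, the profile in 134a is preferred exactly when p1 > 1/2, that is, when p1 > p2, and the profile in 134b is preferred when the contrastive state becomes the more probable one.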
It also follows from this game that in a situation in which the discourse is sensible with
and without a contrastive interpretation (see for example the optional contrastive pronouns
in 129a and 130a), both types of pronouns will be acceptable; the only difference will be
whether the selected information state is contrastive or not.
The second game is played only if the optimal interpretation in the equilibrium of Con-
trast I Game was a contrastive one and the optimal form the OSP. A representation of this
game can be seen in figure 7.2. The Information States in this game are also two differ-
ent interpretations, in this case two different contrastive interpretations: s1 represents an
uncertainty contrast interpretation (it is not conveyed whether the relevant alternatives did
or did not do what is predicated about the referent of the pronoun) and s2 represents an
exhaustive contrast interpretation (the relevant alternatives did not do what is predicated
about the referent of the pronoun). As usual, in each of the information states, the speaker
could potentially use two different linguistic expressions: a non-stressed overt pronoun and
a stressed overt pronoun, the latter being more marked and requiring more effort than the
former. Thus, the payoffs of the non-stressed overt pronouns should be greater than the
payoffs for the stressed overt pronouns: I assign them a payoff of 8 and 7, respectively.
As for the probabilities, note that all instances of exhaustive contrast are a subset of uncer-
tainty contrast, which is less informative and more general. That is, uncertainty contrast is
more unspecified and compatible with more states of affairs than exhaustive contrast. This
is a clear indication of their relative probabilities: the probability assigned to uncertainty
contrast IS, p1 , needs to be greater than the one assigned to the exhaustive contrast IS, p2 .
Figure 7.2: Contrast II Game
With this state of affairs, the game has three Nash equilibria.8
(135) a. {(s1, Non-stressed OSP), (s2, Stressed OSP), ({t, t’}, Uncertainty Contrast),
({u, u’}, Exhaustive Contrast)}. The expected payoff is: p1(8) + p2(7) = 2/3(8)
+ 1/3(7) = 23/3.
b. {(s1, Stressed OSP), (s2, Non-stressed OSP), ({t, t’}, Exhaustive Contrast),
({u, u’}, Uncertainty Contrast)}. The expected payoff is: p1(7) + p2(8) =
2/3(7) + 1/3(8) = 22/3.
c. {(s1, Non-stressed OSP), (s2, Non-stressed OSP), ({t, t’}, Uncertainty Contrast),
({u, u’}, Uncertainty Contrast)}. The expected payoff is: p1(8) + p2(-10)
= 2/3(8) + 1/3(-10) = 6/3.
There are no other Nash equilibria and the only Pareto-Nash equilibrium is the one in
135a. Again, the game of partial information derives the division of labor between stressed
and unstressed OSPs and two different contrastive interpretations: non-stressed OSPs serve
to express the less informative, more general uncertainty contrast and stressed OSPs serve
to express the more informative, more specific exhaustive contrast.
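The claim that Contrast II Game has exactly three Nash equilibria can be verified by enumerating all pure strategy profiles. The Python sketch below rests on several assumptions: both players share a single payoff, miscommunication is worth -10 (the value used in 135c), the probabilities are those of footnote 8, and a profile counts as a Nash equilibrium if neither player can strictly improve the expected payoff by deviating unilaterally. All identifiers are hypothetical.

```python
from fractions import Fraction
from itertools import product

STATES = ("Uncertainty", "Exhaustive")
FORMS = ("Non-stressed OSP", "Stressed OSP")
FORM_PAYOFF = {"Non-stressed OSP": 8, "Stressed OSP": 7}  # payoffs from the text
MISCOMMUNICATION = -10                                    # assumed from 135c
PRIOR = {"Uncertainty": Fraction(2, 3), "Exhaustive": Fraction(1, 3)}

def payoff(speaker, hearer):
    """Expected payoff of a pure strategy profile (shared by both players)."""
    return sum(
        prob * (FORM_PAYOFF[speaker[s]] if hearer[speaker[s]] == s
                else MISCOMMUNICATION)
        for s, prob in PRIOR.items()
    )

# Speaker strategies map states to forms; hearer strategies map forms to states.
SPEAKER_STRATEGIES = [dict(zip(STATES, forms)) for forms in product(FORMS, repeat=2)]
HEARER_STRATEGIES = [dict(zip(FORMS, readings)) for readings in product(STATES, repeat=2)]

def is_nash(speaker, hearer):
    """No unilateral deviation strictly improves the expected payoff."""
    value = payoff(speaker, hearer)
    return (all(payoff(s, hearer) <= value for s in SPEAKER_STRATEGIES)
            and all(payoff(speaker, h) <= value for h in HEARER_STRATEGIES))

equilibria = [(s, h, payoff(s, h))
              for s in SPEAKER_STRATEGIES
              for h in HEARER_STRATEGIES
              if is_nash(s, h)]
pareto = max(equilibria, key=lambda e: e[2])

for speaker, hearer, value in equilibria:
    print(value, speaker, hearer)
```

Under these assumptions the enumeration returns three profiles with expected payoffs 23/3, 22/3 and 2 (that is, 6/3), and the profile with the highest payoff pairs the non-stressed OSP with uncertainty contrast and the stressed OSP with exhaustive contrast.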
The prediction is again that this Pareto-Nash equilibrium can be altered and that the
non-stressed OSPs can successfully convey an exhaustive contrast. I argue that this happens
precisely in two cases identified before: (i) when there is one salient alternative in the
context, that is, in cases of implicit contrast (see examples 92a and 92b) and (ii) when
there is one salient alternative in the discourse, that is, in cases of double contrast (see
examples 85a and 85b). In both cases, p2 becomes greater than p1 . In one case, there is
a salient alternative in the context and the referent of the pronoun is in contrast with this
alternative. In the other case, the salient alternative with which the referent of the pronoun
contrasts is in the discourse itself. The existence of this salient alternative is able to make
the probabilities switch and the non-stressed OSP can express an exhaustive contrast. In all
other cases, a stressed OSP needs to be used to convey an exhaustive contrast.

8
The calculations again assume p1 = 2/3 and p2 = 1/3, but the equilibrium will be the same whenever
p1 > p2.
Finally, note that we are predicting that an OSP should be uttered for two different,
independent reasons: to select the correct antecedent, based on the pragmatic and syntactic
factors described in Chapters 4 and 5, or to express (exhaustive or uncertainty) contrast.
That is, an OSP will be used if there is either (i) contrast or (ii) reference to a non-salient
antecedent or if both conditions are fulfilled. The discourses in 136 show the three possi-
bilities: 136a shows an OSP that is both contrastive and refers to a low salience antecedent,
136b a non-contrastive OSP that refers to a low salience antecedent and 136c a contrastive
OSP that refers to a high salience antecedent.
(136) a. La tortuga veu el nen i ell xiscla, però ella no.
“The turtle sees the child and he screams, but she does not.”
b. La tortuga veu el nen i ell xiscla.
“The turtle sees the child and he screams.”
c. La granota gran ha de cuidar la petita perquè ella és la gran.
“The big frog must take care of the small one since she is the big one.”
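The two independent conditions for using an OSP amount to a simple disjunction, which can be written out as follows. The function name and the boolean encoding of contrast and antecedent salience are assumptions made for this illustration; the sketch abstracts away from the probabilities that the games use to model context.

```python
def subject_pronoun(contrast: bool, salient_antecedent: bool) -> str:
    """OSP if there is contrast or the antecedent is non-salient; NSP otherwise."""
    return "OSP" if contrast or not salient_antecedent else "NSP"

predictions = {
    "136a": subject_pronoun(contrast=True, salient_antecedent=False),
    "136b": subject_pronoun(contrast=False, salient_antecedent=False),
    "136c": subject_pronoun(contrast=True, salient_antecedent=True),
    "no contrast, salient antecedent": subject_pronoun(contrast=False, salient_antecedent=True),
}
for example, form in predictions.items():
    print(example, "->", form)
```

Only the last combination, a non-contrastive pronoun referring to a salient antecedent, is predicted to surface as an NSP.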
7.4 Conclusion
Although there seem to be different kinds of contrastive OSPs in Romance, I have argued
that all contrastive OSPs are in fact Contrastive Topic markers. A Contrastive Topic gener-
ates Topic alternatives and the speaker conveys that she is not carrying out the alternative
speech acts. This implicature, the uncertainty contrast, can be coerced into an exhaustive
contrast if there are enough contextual cues: namely, there needs to be a salient relevant
alternative either in the context or in the discourse (in the cases of double contrast). In
addition, a Contrastive Topic is needed when the utterance is not a direct answer to the
Question Under Discussion, but to a sub-question of the Question Under Discussion.
These pairings between forms and meanings were derived by having a chain of two
games of partial information, which matched unmarked, frequent meanings with unmarked,
cheap forms and marked, infrequent meanings with marked, expensive forms.
Chapter 8
Conclusion
8.1 Contributions of this thesis
This thesis has examined the variation between null subject pronouns and overt subject
pronouns in Catalan, a null-subject language. The empirical basis of the thesis was obtained
through five psycholinguistic experiments and corpus data, which were then modeled using
game theory. In order to summarize the main contributions of this thesis, the four questions
posed in Chapter 1 can now be answered.
• Question 1: What is the relationship between syntactic function and pronouns in
Catalan? Do different pronouns have biases towards antecedents in particular syntactic
positions?
NSPs and OSPs have different preferences depending on the syntactic position of the
antecedent: NSPs have a subject preference and OSPs an object preference. The two
experiments of Chapter 3 (a questionnaire study and a self-paced reading task) both
showed this asymmetry between the two types of pronouns. This division of labor
was first studied by Carminati (2002), who called it the Position of Antecedent Hy-
pothesis (PAH). The PAH can easily be captured as the Pareto-Nash equilibrium of a
game of partial information, in which more economical and less marked forms cor-
respond to more frequent and less marked information states, while less economical
and more marked forms correspond to less frequent and more marked information
states.
Experiment 3 showed that the biases predicted by the PAH can be overridden if there
is enough contextual information. NSPs can felicitously refer to the previous object
if there are enough contextual cues. Since context is explicitly encoded in games of
partial information by assigning probabilities to the different information states, this
context dependency directly follows from the model.
• Question 2: What is the relationship between information structure and pronouns
in Catalan? Do different pronouns have biases towards antecedents playing different
roles in the information structure of a sentence? Can the syntactic preferences of
pronouns be understood as a byproduct of their pragmatic preferences? What does
this tell us about the notion of salience?
Experiment 4 shows that the syntactic preferences of subject pronouns are not a by-
product of their pragmatic preferences, but that the two levels interact and the PAH
needs to be redefined in order to capture the pronouns’ preferences. NSPs have a
simple preference for subject antecedents, regardless of whether they are links or
not. OSPs have a more complex preference for low-salience (non-subject, non-links)
antecedents.
This data points to a multi-factor concept of salience, to which both syntactic and
pragmatic factors contribute, although with different weights. Syntactic factors have
a greater weight, so that if a referent is in subject position, it is the most salient
antecedent, even if it is not the link of the sentence.
Experiment 5 showed that the lack of variation affects the preferences of the OSP. In
a context in which OSPs become mandatory, because they are focused, they do not
have a low-salience preference, but are fully ambiguous.
• Question 3: It is well known that different NSLs present different overall rates of
OSPs. How should we deal with this cross-linguistic variation in our game theoreti-
cal approach?
I have presented two possibilities. Hypothesis I is that priming effects are responsible
for the dialect differences. It has been found in sociolinguistic studies that the use
of a particular form primes the subsequent use of this same form. If, in a particular
context, OSPs become favored (for instance, OSPs become favored to express gener-
icity in Puerto Rican Spanish), this preference will prime OSPs outside this particular
context and the overall rate of OSPs will increase. Hypothesis II is that some dialects
(namely, Brazilian Portuguese and Caribbean Spanish) are in a transition from being
a null subject language to being a non-null subject language. The two grammars are
currently active in those dialects and this affects the use of pronouns and makes the
rates of OSPs higher than in other Romance varieties.
• Question 4: How should contrastive pronouns be analyzed? Is there only one type of
contrastive pronoun? Are they always mandatory?
I have argued that all contrastive OSPs in Catalan are Contrastive Topic markers. A
Contrastive Topic generates a set of alternatives and triggers an implicature concern-
ing the alternatives. This meaning, an ‘uncertainty contrast’, can be strengthened into
an ‘exhaustive contrast’ under certain circumstances: when there is a salient
alternative in the discourse or in the context. Otherwise, a stressed OSP needs to be used to
derive this contrast.
Contrastive pronouns are not always mandatory, but only when they must be inter-
preted as answering a sub-question of the Question Under Discussion.
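The multi-factor notion of salience summarized under Question 2 can be sketched as a weighted score. The following Python fragment is only an illustration: the numeric weights are hypothetical (the results support only the ordering in which the syntactic factor outweighs the pragmatic one), and the names `Antecedent`, `salience`, `rank_by_salience` and `is_low_salience` are invented for this example.

```python
from dataclasses import dataclass

# Hypothetical weights: only their ordering (syntactic > pragmatic) is motivated.
W_SUBJECT = 2.0
W_LINK = 1.0


@dataclass(frozen=True)
class Antecedent:
    name: str
    is_subject: bool   # syntactic factor
    is_link: bool      # pragmatic (information-structure) factor


def salience(a):
    """Weighted sum of the syntactic and pragmatic factors."""
    return W_SUBJECT * a.is_subject + W_LINK * a.is_link


def rank_by_salience(candidates):
    """Candidates ordered from most to least salient."""
    return sorted(candidates, key=salience, reverse=True)


def is_low_salience(a):
    """Non-subject, non-link: the kind of antecedent OSPs are said to prefer."""
    return not a.is_subject and not a.is_link


# Assumed analysis of an OVS item: the dislocated object is the link.
marta = Antecedent("Marta", is_subject=True, is_link=False)
raquel = Antecedent("Raquel", is_subject=False, is_link=True)
print([a.name for a in rank_by_salience([raquel, marta])])
print(is_low_salience(raquel))
```

With any weights that respect the ordering, a subject that is not a link still outranks an object that is a link, which is the pattern described above.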
8.2 Directions for future work
There are several issues relevant to this thesis that remain open for future work. In this
section I discuss several open questions.
• What is the role of definite descriptions? Although the goal of this thesis was to
examine the variation between NSPs and OSPs, the role of definite descriptions has
been mentioned and used in the analysis. However, more needs to be said about their
role in discourse. It is clear that DDs are lower in Ariel’s Accessibility Hierarchy or
Gundel et al.’s Givenness Hierarchy and, thus, they can refer to less activated
referents. However, they can also refer to previous objects or even to previous subjects,
as shown in 137.
(137) Llavors el gat salta i, doncs, ∅ vol caçar la granota_j, però la granota_j s’agafa
al biberó
Then, the cat jumps and, well, ∅ wants to hunt the frog_j, but the frog_j holds
on to the baby bottle
These uses of definite descriptions may be related to stylistic or discourse
considerations. For instance, they might indicate the beginning of a new discourse segment or
a change of discourse topic. The experiments have shown that a payoff-dominant
equilibrium (that is, a Pareto-Nash equilibrium) is used to interpret anaphoric forms.
However, it could also be that speakers occasionally decide to use a risk-dominant
equilibrium (Sally, 1993), both to add some variation to their speech and to make ab-
solutely sure the hearer is understanding correctly, when they believe there is a risk
of mismatches, such as the ones explored in Section 4.4.
• Related to the previous point, I leave it for future work to examine how referring
expressions affect the construction of a discourse structure. As just mentioned,
definite descriptions are likely to contribute to the building of discourse structure. In
Chapter 5, I also discussed the role of clitics and how they affect the segmentation
of past and future utterances. The segmentation of discourse into different units, as
well as the construction of the information structure of sentences, could be seen as
part of a bigger game, in which several linguistic cues are used to achieve the desired
segmentation.
• How much evidence is necessary so that conversational agents can form estimations
and use them? This thesis has used an approach which estimates probabilities of
information states based on corpora counts. I have shown that focused OSPs are
ambiguous and do not have the non-subject preference of non-focused OSPs. This
correlated with the fact that the two involved information states ([subject reference
+ focused subject] and [object reference + focused subject]) are very rare in cor-
pora, again unlike their unfocused counterparts, so participants are not able to form
estimates and use them to choose the maximally efficient form. Is there a minimal
threshold so that speakers and hearers can use frequencies in corpora to approximate
probabilities of information states?
• How exactly should the role of priming effects be formalized? Priming has been
found to have an effect on pronoun choice. It has also been found that it plays a
greater role in Same Reference contexts than in Switch Reference contexts (in which
there is already much more variation between the two forms). However, it is not the
case that priming has an effect all the time. Is it constrained by some other factor?
The payoff function is a good place where priming effects could be encoded in
a game theoretical model. How exactly should priming effects modify the payoff
function of the game?
• It has also been left for future work how to formalize the implicature conveyed by a
contrastive OSP. The formalization proposed by Hara and van Rooij (2007) should
be modified so that it does not rely on the speaker’s knowledge but on her intentions
to not carry out the alternative speech acts. It would also be desirable to be more
precise about how the strengthening of the implicature takes place. A promising idea
would be to use speech act operators, such as Assert or Quest (Krifka, 1995), so that
the relative scope of negation and the speech act operator determine which of the CT
implicatures is obtained.
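The question above about how much evidence agents need can be made concrete with a small Python sketch. It estimates the probabilities of information states from corpus counts by relative frequency and falls back to a uniform distribution when too few tokens have been observed. The threshold value, the counts and the uniform fallback are all assumptions made for illustration; whether such a threshold exists, and what its value would be, is exactly what remains open.

```python
def estimate_state_probabilities(counts, min_total=30):
    """Relative-frequency estimate of state probabilities from corpus counts.

    If fewer than `min_total` tokens were observed (a hypothetical threshold),
    return a uniform distribution, i.e. no usable bias towards any state.
    """
    if not counts:
        raise ValueError("counts must contain at least one information state")
    total = sum(counts.values())
    if total < min_total:
        return {state: 1.0 / len(counts) for state in counts}
    return {state: n / total for state, n in counts.items()}


# Invented counts, not taken from the corpora used in this thesis.
unfocused = {"subject reference": 400, "object reference": 100}
focused = {"subject reference + focused subject": 3,
           "object reference + focused subject": 2}

print(estimate_state_probabilities(unfocused))
print(estimate_state_probabilities(focused))
```

In this sketch the frequent, unfocused states yield a clear bias, while the rare, focused states yield none, which mirrors the ambiguity reported for focused OSPs.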
Chapter 9
Appendices
9.1 Appendix A. Materials for Experiment 1
List of the sixteen experimental items in the two conditions and the two paraphrases of the
second sentence. The two conditions are:
• Condition 1: null pronoun.
• Condition 2: overt pronoun.
(1) a. La Marta escrivia sovint a la Raquel. Vivia als Estats Units.
b. La Marta escrivia sovint a la Raquel. Ella vivia als Estats Units.
‘Marta wrote frequently to Raquel. She lived in the United States.’
i. La Marta vivia als Estats Units.
‘Marta lived in the United States.’
ii. La Raquel vivia als Estats Units.
‘Raquel lived in the United States.’
(2) a. El Robert va insultar el Carles. Estava borratxo.
b. El Robert va insultar el Carles. Ell estava borratxo.
‘Robert insulted Carles. He was drunk.’
i. El Robert estava borratxo.
‘Robert was drunk.’
ii. El Carles estava borratxo.
‘Carles was drunk.’
(3) a. La Gemma ja no veu l’Anna. S’ha casat fa poc.
b. La Gemma ja no veu l’Anna. Ella s’ha casat fa poc.
‘Gemma does not see Anna anymore. She recently got married.’
i. La Gemma s’ha casat.
‘Gemma got married.’
ii. L’Anna s’ha casat.
‘Anna got married.’
(4) a. L’Adrià ha trucat a l’Albert. Estava a l’oficina.
b. L’Adrià ha trucat a l’Albert. Ell estava a l’oficina.
‘Adrià called Albert. He was at the office.’
i. L’Adrià estava a l’oficina.
‘Adrià was at the office.’
ii. L’Albert estava a l’oficina.
‘Albert was at the office.’
(5) a. Demà la Montse anirà al teatre amb la Marta. No ha de treballar.
b. Demà la Montse anirà al teatre amb la Marta. Ella no ha de treballar.
‘Tomorrow Montse will go to the theater with Marta. She doesn’t have to
work.’
i. La Montse no ha de treballar.
‘Montse doesn’t have to work.’
ii. La Marta no ha de treballar.
‘Marta doesn’t have to work.’
(6) a. La Sònia ha trucat a la Sílvia. Sempre arriba tard.
b. La Sònia ha trucat a la Sílvia. Ella sempre arriba tard.
‘Sònia called Sílvia. She is always late.’
i. La Sònia sempre arriba tard.
‘Sònia is always late.’
ii. La Sílvia sempre arriba tard.
‘Sílvia is always late.’
(7) a. El Toni farà un viatge amb el Marc. Vol anar de viatge a l’agost.
b. El Toni farà un viatge amb el Marc. Ell vol anar de viatge a l’agost.
‘Toni will travel with Marc. He wants to travel in August.’
i. El Toni vol anar de viatge a l’agost.
‘Toni wants to travel in August.’
ii. El Marc vol anar de viatge a l’agost.
‘Marc wants to travel in August.’
(8) a. El Josep sempre juga a tennis amb el Martí el dijous a les sis. Té la tarda lliure.
b. El Josep sempre juga a tennis amb el Martí el dijous a les sis. Ell té la tarda
lliure.
‘Josep always plays tennis with Martí on Thursdays at six. He is free in the
afternoon.’
i. El Josep té la tarda lliure.
‘Josep is free in the afternoon.’
ii. El Martí té la tarda lliure.
‘Martí is free in the afternoon.’
(9) a. La Roser va anar a veure la Marina. Tenia problemes.
b. La Roser va anar a veure la Marina. Ella tenia problemes.
‘Roser went to see Marina. She had problems.’
i. La Roser tenia problemes.
‘Roser had problems.’
ii. La Marina tenia problemes.
‘Marina had problems.’
(10) a. El Jordi va avisar el Gabriel que tindrien problemes. Està espantat.
b. El Jordi va avisar el Gabriel que tindrien problemes. Ell està espantat.
‘Jordi warned Gabriel that they would have trouble. He is scared.’
i. El Jordi està espantat.
‘Jordi is scared.’
ii. El Gabriel està espantat.
‘Gabriel is scared.’
(11) a. El Germà va insultar el Pere. El va pegar.
b. El Germà va insultar el Pere. Ell el va pegar.
‘Germà insulted Pere. He hit him.’
i. El Germà va pegar.
‘Germà hit.’
ii. El Pere va pegar.
‘Pere hit.’
(12) a. La Lali va sempre d’excursió amb la Jordina. Ella disfruta caminant.
b. La Lali va sempre d’excursió amb la Jordina. Disfruta caminant.
‘Lali always goes hiking with Jordina. She likes walking.’
i. La Lali disfruta caminant.
‘Lali likes walking.’
ii. La Jordina disfruta caminant.
‘Jordina likes walking.’
(13) a. L’Andreu sol anar a l’òpera amb en Ricard. N’és molt aficionat.
b. L’Andreu sol anar a l’òpera amb en Ricard. Ell n’és molt aficionat.
‘Andreu usually goes to the opera with Ricard. He’s quite an expert.’
i. L’Andreu és molt aficionat a l’òpera.
‘Andreu is quite an opera expert.’
ii. El Ricard és molt aficionat a l’òpera.
‘Ricard is quite an opera expert.’
(14) a. L’Estel va conèixer la Blanca a la facultat. Era força més gran que la resta.
b. L’Estel va conèixer la Blanca a la facultat. Ella era força més gran que la resta.
‘Estel met Blanca in college. She was older than the rest of the people.’
i. L’Estel era més gran.
‘Estel was older.’
ii. La Blanca era més gran.
‘Blanca was older.’
(15) a. La Núria ha marxat a viure lluny de la Rosa. La troba a faltar.
b. La Núria ha marxat a viure lluny de la Rosa. Ella la troba a faltar.
‘Núria has moved far away from Rosa. She misses her.’
i. La Núria troba a faltar la Rosa.
‘Núria misses Rosa.’
ii. La Rosa troba a faltar la Núria.
‘Rosa misses Núria.’
(16) a. El Víctor no està d’acord amb el Rubén. És molt tossut.
b. El Víctor no està d’acord amb el Rubén. Ell és molt tossut.
‘Víctor does not agree with Rubén. He’s very stubborn.’
i. El Víctor és molt tossut.
‘Víctor is very stubborn.’
ii. El Rubén és molt tossut.
‘Rubén is very stubborn.’
9.2 Appendix B. Materials for Experiment 2
List of the sixteen experimental items in the four conditions:
• Condition 1: null pronoun + subject bias.
• Condition 2: overt pronoun + subject bias
• Condition 3: null pronoun + object bias.
• Condition 4: overt pronoun + object bias.
(1) a. El Joan va deixar en ridícul al Dani davant de tothom. Es va excusar
repetidament.
b. El Joan va deixar en ridícul al Dani davant de tothom. Ell es va excusar
repetidament.
‘Joan made fun of Dani in front of everyone. He apologized many times.’
c. El Joan va deixar en ridícul al Dani davant de tothom. Es va ofendre moltíssim.
d. El Joan va deixar en ridícul al Dani davant de tothom. Ell es va ofendre
moltíssim.
‘Joan made fun of Dani in front of everyone. He was very offended.’
(2) a. El Marc li ha demanat al Lluís que no fumés. Li ha dit que era al·lèrgic al fum
del tabac.
b. El Marc li ha demanat al Lluís que no fumés. Ell li ha dit que era al·lèrgic al
fum del tabac.
‘Marc asked Lluís to stop smoking. He told him he was allergic to tobacco
smoke.’
c. El Marc li ha demanat al Lluís que no fumés. Li ha dit que mai no aconseguia
deixar-ho.
d. El Marc li ha demanat al Lluís que no fumés. Ell li ha dit que mai no aconseguia
deixar-ho.
‘Marc asked Lluís to stop smoking. He told him he had not managed to quit.’
(3) a. La Carla sempre contradiu la Júlia. Ho fa per venjar-se.
b. La Carla sempre contradiu la Júlia. Ella ho fa per venjar-se.
‘Carla is always contradicting Júlia. She wants revenge.’
c. La Carla sempre contradiu la Júlia. Sempre s’acaba enfadant.
d. La Carla sempre contradiu la Júlia. Ella sempre s’acaba enfadant.
‘Carla is always contradicting Júlia. She always ends up being angry.’
(4) a. La Mercè va visitar la Rosa a l’hospital. Li va portar bombons.
b. La Mercè va visitar la Rosa a l’hospital. Ella li va portar bombons.
‘Mercè visited Rosa in the hospital. She brought her sweets.’
c. La Mercè va visitar la Rosa a l’hospital. Ja està fora de perill.
d. La Mercè va visitar la Rosa a l’hospital. Ella ja està fora de perill.
‘Mercè visited Rosa in the hospital. She is out of danger now.’
(5) a. La Maria es va trobar la Núria inconscient al sofà. Es va espantar molt.
b. La Maria es va trobar la Núria inconscient al sofà. Ella es va espantar molt.
‘Maria found Núria unconscious on the couch. She got very scared.’
c. La Maria es va trobar la Núria inconscient al sofà. Estava molt pàl·lida.
d. La Maria es va trobar la Núria inconscient al sofà. Ella estava molt pàl·lida.
‘Maria found Núria unconscious on the couch. She was very pale.’
(6) a. El Pere va desafiar el Miquel a beure’s una ampolla sencera de whisky. No ho
va dir de broma.
b. El Pere va desafiar el Miquel a beure’s una ampolla sencera de whisky. Ell no
ho va dir de broma.
‘Pere challenged Miquel to drink a whole bottle of whiskey. He was not joking.’
c. El Pere va desafiar el Miquel a beure’s una ampolla sencera de whisky. Va
acceptar el repte.
d. El Pere va desafiar el Miquel a beure’s una ampolla sencera de whisky. Ell va
acceptar el repte.
‘Pere challenged Miquel to drink a whole bottle of whiskey. He accepted the
challenge.’
(7) a. El Vicenç va insultar l’Enric pel carrer. Va fer servir insults molt forts.
b. El Vicenç va insultar l’Enric pel carrer. Ell va fer servir insults molt forts.
‘Vicenç insulted Enric on the street. He used very harsh words.’
c. El Vicenç va insultar l’Enric pel carrer. Li va tornar insults encara pitjors.
d. El Vicenç va insultar l’Enric pel carrer. Ell li va tornar insults encara pitjors.
‘Vicenç insulted Enric on the street. He insulted him with even worse words.’
(8) a. El Llorenç respecta molt l’opinió del Quim. Sempre li demana consell.
b. El Llorenç respecta molt l’opinió del Quim. Ell sempre li demana consell.
‘Llorenç respects Quim’s opinion a lot. He always asks him for advice.’
c. El Llorenç respecta molt l’opinió del Quim. Se sent molt important.
d. El Llorenç respecta molt l’opinió del Quim. Ell se sent molt important.
‘Llorenç respects Quim’s opinion a lot. He feels very important.’
(9) a. La Irene sempre li fa regals cars a la Maria. Sovint arriba molt justa a fi de mes.
b. La Irene sempre li fa regals cars a la Maria. Sovint ella arriba molt justa a fi de
mes.
‘Irene always gives expensive presents to Maria. She often has trouble making
ends meet.’
c. La Irene sempre li fa regals cars a la Maria. A canvi, la convida sovint al teatre.
d. La Irene sempre li fa regals cars a la Maria. A canvi, ella la convida sovint al
teatre.
‘Irene always gives expensive presents to Maria. She invites her to the theater
in return.’
(10) a. La Carme sempre intimida la Sònia. Té un caràcter molt fort.
b. La Carme sempre intimida la Sònia. Ella té un caràcter molt fort.
‘Carme has always intimidated Sònia. She has a very strong personality.’
c. La Carme sempre intimida la Sònia. No s’atreveix a parlar-li.
d. La Carme sempre intimida la Sònia. Ella no s’atreveix a parlar-li.
‘Carme has always intimidated Sònia. She does not dare to talk to her.’
(11) a. La Paula va renyar la Núria. És una persona molt exigent.
b. La Paula va renyar la Núria. Ella és una persona molt exigent.
‘Paula scolded Núria. She is a very demanding person.’
c. La Paula va renyar la Núria. Havia comès un error greu.
d. La Paula va renyar la Núria. Ella havia comès un error greu.
‘Paula scolded Núria. She had made a serious mistake.’
(12) a. El Miquel va ensenyar al Joan a tocar la guitarra. És un bon mestre.
b. El Miquel va ensenyar al Joan a tocar la guitarra. Ell és un bon mestre.
‘Miquel taught Joan to play the guitar. He’s a good teacher.’
c. El Miquel va ensenyar al Joan a tocar la guitarra. És un bon alumne.
d. El Miquel va ensenyar al Joan a tocar la guitarra. Ell és un bon alumne.
‘Miquel taught Joan to play the guitar. He’s a good student.’
(13) a. La Maria sempre vol que la Núria li faci massatges. Així no té mal d’esquena.
b. La Maria sempre vol que la Núria li faci massatges. Així ella no té mal
d’esquena.
‘Maria always wants Núria to give her a massage. That way, she does not have
back pain.’
c. La Maria sempre vol que la Núria li faci massatges. És massatgista professional.
d. La Maria sempre vol que la Núria li faci massatges. Ella és massatgista professional.
‘Maria always wants Núria to give her a massage. She’s a professional masseuse.’
(14) a. El Pau mai no vol anar al cine amb el Joan. Prefereix anar-hi tot sol.
b. El Pau mai no vol anar al cine amb el Joan. Ell prefereix anar-hi tot sol.
‘Pau never wants to go to the movies with Joan. He prefers to go there on his
own.’
c. El Pau mai no vol anar al cine amb el Joan. Sempre xerra durant la peli.
d. El Pau mai no vol anar al cine amb el Joan. Ell sempre xerra durant la peli.
‘Pau never wants to go to the movies with Joan. He always talks during the
film.’
(15) a. La Raquel va ser la mestra d’anglès de l’Anna. Té bon record de la seva estu-
diant.
b. La Raquel va ser la mestra d’anglès de l’Anna. Ella té bon record de la seva
estudiant.
‘Raquel was Anna’s English teacher. She has good memories of her student.’
c. La Raquel va ser la mestra d’anglès de l’Anna. Té bon record de la seva pro-
fessora.
d. La Raquel va ser la mestra d’anglès de l’Anna. Ella té bon record de la seva
professora.
‘Raquel was Anna’s English teacher. She has good memories of her teacher.’
(16) a. L’Albert va ajudar el Toni a pintar la casa. Sempre ajuda els amics quan pot.
b. L’Albert va ajudar el Toni a pintar la casa. Ell sempre ajuda els amics quan pot.
‘Albert helped Toni to paint the house. He always helps his friends when he
can.’
c. L’Albert va ajudar el Toni a pintar la casa. Li va agrair molt el cop de mà.
d. L’Albert va ajudar el Toni a pintar la casa. Ell li va agrair molt el cop de mà.
‘Albert helped Toni to paint the house. He was grateful for his help.’
9.3 Appendix C. Materials for Experiment 3
List of the sixteen experimental items in the eight conditions:
• Condition 1: null pronoun + mild subject bias.
• Condition 2: null pronoun + strong subject bias.
• Condition 3: overt pronoun + mild subject bias
• Condition 4: overt pronoun + strong subject bias
• Condition 5: null pronoun + mild object bias.
• Condition 6: null pronoun + strong object bias.
• Condition 7: overt pronoun + mild object bias.
• Condition 8: overt pronoun + strong object bias.
In parentheses, the added connective in the ‘strong bias’ conditions.
(1) a. El Joan va deixar en ridícul al Dani davant de tothom. (Després), es va excusar
repetidament.
b. El Joan va deixar en ridícul al Dani davant de tothom. (Després), ell es va
excusar repetidament.
‘Joan made fun of Dani in front of everyone. (Afterwards), he apologized many
times.’
c. El Joan va deixar en ridícul al Dani davant de tothom. (Per això), es va ofendre
moltíssim.
d. El Joan va deixar en ridícul al Dani davant de tothom. (Per això), ell es va
ofendre moltíssim.
‘Joan made fun of Dani in front of everyone. (That’s why), he was very
offended.’
(2) a. El Marc li ha demanat al Lluís que no fumés. (Resulta que) és al·lèrgic al fum
del tabac.
b. El Marc li ha demanat al Lluís que no fumés. (Resulta que) ell és al·lèrgic al
fum del tabac.
‘Marc asked Lluís to stop smoking. (It turns out that) he is allergic to tobacco
smoke.’
c. El Marc li ha demanat al Lluís que no fumés. (Tanmateix), no aconsegueix
deixar-ho.
d. El Marc li ha demanat al Lluís que no fumés. (Tanmateix), ell no aconsegueix
deixar-ho.
‘Marc asked Lluís to stop smoking. (However), he does not manage to
quit.’
(3) a. La Carla sempre contradiu a la Júlia. (A més), a ella li agrada fer-la empipar.
b. La Carla sempre contradiu a la Júlia. (A més), li agrada fer-la empipar.
‘Carla is always contradicting Júlia. (In addition), she likes to annoy her.’
c. La Carla sempre contradiu a la Júlia. (Per això), sempre s’acaba enfadant.
d. La Carla sempre contradiu a la Júlia. (Per això), ella sempre s’acaba enfadant.
‘Carla is always contradicting Júlia. (That’s why), she always ends up being
angry.’
(4) a. El Vicenç va insultar l’Enric pel carrer. (Després d’insultar-lo), va dir-li coses
molt grosses.
b. El Vicenç va insultar l’Enric pel carrer. (Després d’insultar-lo), ell va dir-li
coses molt grosses.
‘Vicenç insulted Enric on the street. (After having insulted him), he used very
harsh words.’
c. El Vicenç va insultar l’Enric pel carrer. (Després de ser insultat), li va tornar
insults encara pitjors.
d. El Vicenç va insultar l’Enric pel carrer. (Després de ser insultat), ell li va tornar
insults encara pitjors.
‘Vicenç insulted Enric on the street. (After being insulted), he insulted him
with even worse words.’
(5) a. El Llorenç respecta molt l’opinió del Quim. (A més), sempre li demana consell.
b. El Llorenç respecta molt l’opinió del Quim. (A més), ell sempre li demana
consell.
‘Llorenç respects Quim’s opinion a lot. (In addition), he always asks him for
advice.’
c. El Llorenç respecta molt l’opinió del Quim. (Tanmateix), no li sol donar gaire
bons consells.
d. El Llorenç respecta molt l’opinió del Quim. (Tanmateix), ell no li sol donar
gaire bons consells.
‘Llorenç respects Quim’s opinion a lot. (However), he usually does not give
good advice.’
(6) a. La Carme sempre intimida la Sònia. (Resulta que) té un caràcter molt fort.
b. La Carme sempre intimida la Sònia. (Resulta que) ella té un caràcter molt fort.
‘Carme has always intimidated Sònia. (It turns out that) she has a very strong
personality.’
c. La Carme sempre intimida la Sònia. (Per això), no s’atreveix a parlar-li.
d. La Carme sempre intimida la Sònia. (Per això), ella no s’atreveix a parlar-li.
‘Carme has always intimidated Sònia. (That’s why), she does not dare to talk
to her.’
(7) a. La Paula va renyar la Núria. (Tanmateix), ho va fer sense ser gaire dura.
b. La Paula va renyar la Núria. (Tanmateix), ella ho va fer sense ser gaire dura.
‘Paula scolded Núria. (However), she was not too tough.’
c. La Paula va renyar la Núria. (Resulta que) havia comès un error greu.
d. La Paula va renyar la Núria. (Resulta que) ella havia comès un error greu.
‘Paula scolded Núria. (It turns out that) she had made a serious mistake.’
(8) a. La Maria es va trobar la Núria inconscient al sofà. (Després de trobar-la), va
trucar l’ambulància.
b. La Maria es va trobar la Núria inconscient al sofà. (Després de trobar-la), ella
va trucar l’ambulància.
‘Maria found Núria unconscious on the couch. (After finding her), she called
an ambulance.’
c. La Maria es va trobar la Núria inconscient al sofà. (Després que la trobessin),
va recuperar a poc a poc el coneixement.
d. La Maria es va trobar la Núria inconscient al sofà. (Després que la trobessin),
ella va recuperar a poc a poc el coneixement.
‘Maria found Núria unconscious on the couch. (After being found), she slowly
recovered consciousness.’
(9) a. El Pau va preparar-li el sopar a l’Albert. (A més), també li va preparar les
postres.
b. El Pau va preparar-li el sopar a l’Albert. (A més), ell també li va preparar les
postres.
‘Pau prepared dinner for Albert. (In addition), he also prepared dessert.’
c. El Pau va preparar-li el sopar a l’Albert. (Tanmateix), no se’l va poder acabar.
d. El Pau va preparar-li el sopar a l’Albert. (Tanmateix), ell no se’l va poder
acabar.
‘Pau prepared dinner for Albert. (However), he could not finish it.’
(10) a. El Pere va guanyar el Pau al futbolí. (Resulta que) té molta més experiència.
b. El Pere va guanyar el Pau al futbolí. (Resulta que) ell té molta més experiència.
‘Pere beat Pau at table football. (It turns out that) he is much more experienced.’
c. El Pere va guanyar el Pau al futbolí. (Tanmateix), no s’enfada quan perd.
d. El Pere va guanyar el Pau al futbolí. (Tanmateix), ell no s’enfada quan perd.
‘Pere beat Pau at table football. (However), he does not get angry when he
loses.’
(11) a. L’Adrià va felicitar el Manel pel seu aniversari. (A més), li va fer un regal.
b. L’Adrià va felicitar el Manel pel seu aniversari. (A més), ell li va fer un regal.
‘Adrià wished Manel a happy birthday. (In addition), he gave him a present.’
c. L’Adrià va felicitar el Manel pel seu aniversari. (Tanmateix), odia fer anys.
d. L’Adrià va felicitar el Manel pel seu aniversari. (Tanmateix), ell odia fer anys.
‘Adrià wished Manel a happy birthday. (However), he hates getting older.’
(12) a. La Verònica no va reconèixer la Marina pel carrer. (Resulta que) és una mica
despistada.
b. La Verònica no va reconèixer la Marina pel carrer. (Resulta que) ella és una
mica despistada.
‘Verònica did not recognize Marina in the street. (It turns out that) she is a bit
absent-minded.’
c. La Verònica no va reconèixer la Marina pel carrer. (Resulta que) estava molt
canviada.
d. La Verònica no va reconèixer la Marina pel carrer. (Resulta que) ella estava
molt canviada.
‘Verònica did not recognize Marina in the street. (It turns out that) she was
looking very different.’
(13) a. L’Elena va regalar un llibre a la Gemma. (Tanmateix), no té per costum fer
regals.
b. L’Elena va regalar un llibre a la Gemma. (Tanmateix), ella no té per costum fer
regals.
‘Elena gave a book to Gemma. (However), she does not usually give presents.’
c. L’Elena va regalar un llibre a la Gemma. (Tanmateix), ja el tenia.
d. L’Elena va regalar un llibre a la Gemma. (Tanmateix), ella ja el tenia.
‘Elena gave a book to Gemma. (However), she already had it.’
(14) a. El Toni va anar a escoltar el concert del Pau. (Després), va anar al gimnàs abans
que acabés.
b. El Toni va anar a escoltar el concert del Pau. (Després), ell va anar al gimnàs
abans que acabés.
‘Toni went to listen to one of Pau’s concerts. (Afterwards), he went to the gym
before it was over.’
c. El Toni va anar a escoltar el concert del Pau. (Resulta que) toca cada dijous.
d. El Toni va anar a escoltar el concert del Pau. (Resulta que) ell toca cada dijous.
‘Toni went to listen to one of Pau’s concerts. (It turns out that) he plays every
Thursday.’
(15) a. El Miquel ha fet enfadar el Ramon. (Per això), li ha demanat disculpes.
b. El Miquel ha fet enfadar el Ramon. (Per això), ell li ha demanat disculpes.
‘Miquel made Ramon get angry. (That’s why), he apologized.’
c. El Miquel ha fet enfadar el Ramon. (Per això), no li parla quan el veu.
d. El Miquel ha fet enfadar el Ramon. (Per això), ell no li parla quan el veu.
‘Miquel made Ramon get angry. (That’s why), he does not talk to him when he
sees him.’
(16) a. La Júlia ha enganyat la Maria més d’una vegada. (Resulta que) és molt men-
tidera.
b. La Júlia ha enganyat la Maria més d’una vegada. (Resulta que) ella és molt
mentidera.
‘Júlia cheated on Maria more than once. (It turns out that) she tells a lot of lies.’
c. La Júlia ha enganyat la Maria més d’una vegada. (Per això), ja no se’n refia.
d. La Júlia ha enganyat la Maria més d’una vegada. (Per això), ella ja no se’n
refia.
‘Júlia cheated on Maria more than once. (That’s why), she does not trust her.’
9.4 Appendix D. Materials for Experiment 4
List of the sixteen experimental items in the four conditions:
• Condition 1: null pronoun + svo.
• Condition 2: null pronoun + ovs.
• Condition 3: overt pronoun + svo.
• Condition 4: overt pronoun + ovs.
In parentheses, the overt pronouns from Conditions 3 and 4.
(1) a. La Marta escrivia sovint a la Raquel. (Ella) vivia als Estats Units.
b. A la Raquel, l’escrivia sovint la Marta. (Ella) vivia als Estats Units.
‘Marta wrote frequently to Raquel. She lived in the United States.’
(2) a. El Robert va insultar el Carles. (Ell) estava borratxo.
b. Al Carles, el va insultar el Robert. (Ell) estava borratxo.
‘Robert insulted Carles. He was drunk.’
(3) a. La Gemma fa temps que no veu l’Anna. (Ella) s’ha casat fa poc.
b. A l’Anna fa temps que no la veu la Gemma. (Ella) s’ha casat fa poc.
‘Gemma has not seen Anna in a long time. She recently got married.’
(4) a. L’Adrià ha trucat a l’Albert. (Ell) estava a l’oficina.
b. A l’Albert, l’ha trucat l’Adrià. (Ell) estava a l’oficina.
‘Adrià called Albert. He was at the office.’
(5) a. La Montse ha convidat al teatre a la Marta. (Ella) no ha de treballar.
b. A la Marta, l’ha convidada al teatre la Montse. (Ella) no ha de treballar.
‘Montse has treated Marta to the theater. She doesn’t have to work.’
(6) a. La Sònia va trucar a la Sílvia. (Ella) estava arribant tard.
b. A la Sílvia, la va trucar la Sònia. (Ella) estava arribant tard.
‘Sònia called Sílvia. She was late.’
(7) a. El Toni portarà de viatge el Marc. (Ell) sempre ha volgut anar a Londres.
b. Al Marc, el portarà de viatge el Toni. (Ell) sempre ha volgut anar a Londres.
‘Toni will travel with Marc. He has always wanted to visit London.’
(8) a. El Josep dóna classes de tennis al Martí. (Ell) té la tarda lliure.
b. Al Martí, li dóna classes de tennis el Josep. (Ell) té la tarda lliure.
‘Josep teaches tennis to Martí. He is free in the afternoon.’
(9) a. La Roser va anar a veure la Marina. (Ella) tenia problemes.
b. A la Marina, la va visitar la Roser. (Ella) tenia problemes.
‘Roser went to see Marina. She had problems.’
(10) a. El Jordi va avisar el Gabriel que tindrien problemes. (Ell) està espantat.
b. Al Gabriel, el va avisar el Jordi que tindrien problemes. (Ell) està espantat.
‘Jordi warned Gabriel that they would have trouble. He is scared.’
(11) a. El Germà va insultar el Pere. (Ell) el va pegar.
b. Al Pere, el va insultar el Germà. (Ell) el va pegar.
‘Germà insulted Pere. He hit him.’
(12) a. La Lali va a buscar la Jordina per anar d’excursió. (Ella) disfruta caminant.
b. A la Jordina, l’ha vingut a buscar la Lali per anar d’excursió. (Ella) disfruta
caminant.
‘Lali always goes hiking with Jordina. She likes walking.’
(13) a. L’Andreu ha trucat el Ricard per anar a l’òpera. (Ell) n’és molt aficionat.
b. Al Ricard, l’ha trucat l’Andreu per anar a l’òpera. (Ell) n’és molt aficionat.
‘Andreu called Ricard to go to the opera. He is quite an expert.’
(14) a. L’Estel ajuda amb l’anglès a la Blanca. (Ella) s’hi esforça molt.
b. A la Blanca, l’ajuda amb l’anglès l’Estel. (Ella) s’hi esforça molt.
‘Estel helps Blanca with her English language skills. She puts in a lot of effort.’
(15) a. La Núria troba molt a faltar la Rosa. (Ella) ha marxat a viure lluny.
b. A la Rosa, la troba molt a faltar la Núria. (Ella) ha marxat a viure lluny.
‘Núria misses Rosa a lot. She has moved far away.’
(16) a. El Víctor s’ha enfadat molt amb el Rubén. (Ell) és molt tossut.
b. Amb el Rubén, s’hi ha enfadat molt el Víctor. (Ell) és molt tossut.
‘Víctor got very angry with Rubén. He is very stubborn.’
9.5 Appendix E. Materials for Experiment 5
List of the nine experimental items.
(1) a. El Joan va trucar al Jaume. Era ell qui aniria a París.
‘Joan called Jaume. It was him who was going to Paris.’
(2) a. La Cèlia va anar a buscar la Sònia a la feina. Era ella qui havia insistit per
quedar.
‘Cèlia went to Sònia’s office. It was her who insisted on meeting.’
(3) a. La Maria va trobar-se amb la Clara a la biblioteca. Era ella qui havia volgut
que estudiessin juntes.
‘Maria met Clara at the library. It was her who insisted on them studying
together.’
(4) a. La Pepa va parlar amb la Marina. La responsable del que havia passat era ella.
‘Pepa talked to Marina. She was the one who was responsible for what had
happened.’
(5) a. El Carles li va escriure una carta al Marcel. El guanyador del concurs era ell.
‘Carles wrote Marcel a letter. He was the one who won the competition.’
(6) a. El Lluc volia parlar seriosament amb en Joan. El millor candidat per la feina
era ell.
‘Lluc wanted to talk to Joan. He was the one who was the best candidate for
the job.’
(7) a. El Pep va anar de viatge amb en Màrius. Només ell pot fer fotos tan magnífiques
com les que ens va ensenyar.
‘Pep travelled with Màrius. Only he can take such wonderful pictures as the ones
he showed us.’
(8) a. L’Anna va sortir de festa amb l’Eli. Al final de la nit només ella estava borratxa.
‘Anna went out with Eli. At the end of the night, only she was drunk.’
(9) a. L’Aina va quedar amb l’Eva per anar a un concert. Fins i tot ella creia que s’ho
passaria bé.
‘Aina met Eva to go to a concert. Even she thought she would have a good
time.’
Bibliography
Emilio Alarcos Llorach. Gramática de la lengua española. Planeta, Madrid, fourth edition,
1994.
Artemis Alexiadou and Elena Anagnostopoulou. Parametrizing Agr: V-movement, word
order, and EPP-checking. Natural Language and Linguistic Theory, 16:491–539, 1998.
Luis Alonso-Ovalle, Susana Fernández Solera, Lyn Frazier, and Charles Clifton. Null vs.
overt pronouns and the topic-focus articulation in Spanish. Italian Journal of Linguistics,
14(2):151–169, 2002.
Mira Ariel. Accessibility theory: an overview. In T. Sanders, J. Schilperoord, and
W. Spooren, editors, Text representation: linguistic and psycholinguistic aspects. John
Benjamins, Amsterdam, 2001.
Mira Ariel. Accessing Noun Phrase Antecedents. Routledge, London, 1990.
Jennifer E. Arnold, Thomas Wasow, Anthony Losongco, and Ryan Ginstrom. Heaviness
vs. newness: The effects of structural complexity and discourse status on constituent
ordering. Language, 76:28–55, 2000.
Nicholas Asher. Discourse topic. Theoretical Linguistics, 30:163–201, 2004.
Nicholas Asher and Alex Lascarides. Logics of Conversation. Cambridge University Press,
2003.
Pilar Barbosa. Two kinds of subject pro. Studia Linguistica, 62(1):2–58, 2000.
Pilar Barbosa, Maria Eugénia Duarte, and Mary Kato. Null subjects in European and
Brazilian Portuguese. Journal of Portuguese Linguistics, 4:11–52, 2005.
Elizabeth Bates. Language and Context: The acquisition of pragmatics. Academic Press,
New York, 1976.
Adriana Belletti. “Inversion” as focalization. In Aafke C.J. Hulk and Jean-Yves Pollock,
editors, Subject Inversion in Romance and the Theory of Universal Grammar, pages
60–106. Oxford University Press, 2000.
Adriana Belletti and Luigi Rizzi. The syntax of ‘ne’: Some theoretical implications. Lin-
guistic Review, 2(3):1–33, 1981.
Milena Bini. La adquisición del italiano: más allá de las propiedades sintácticas del
parámetro pro-drop. In J.M. Liceras, editor, La lingüística y el análisis de los sistemas
no nativos, pages 126–139. Dovehouse, Ottawa, 1993.
Oduentan Bode. Yoruba Clause Structure. PhD thesis, University of Iowa, 2000.
Eulàlia Bonet. Subjects in Catalan. The Massachusetts Institute of Technology Working
Papers in Linguistics, 13:1–26, 1990.
Holly P. Branigan, Martin J. Pickering, and Alexandra A. Cleland. Syntactic co-ordination
in dialogue. Cognition, 75(2):B13–B25, May 2000.
Penelope Brown and Stephen C. Levinson. Politeness: Some Universals in Language Usage
(Studies in Interactional Sociolinguistics). Cambridge University Press, Cambridge,
1987.
José Maria Brucart. La elisión sintáctica en español. Publicacions de la Universitat
Autònoma de Barcelona, Bellaterra, 1987.
Lisa Brunetti. Italian background: Links, tails, and contrast effects. In B. Gyuris, L. Kálmán,
and C. Piñon, editors, Proceedings of the Ninth Symposium on Logic and Language,
2006.
Daniel Büring. On D-Trees, Beans, and B-Accents. Linguistics & Philosophy, 26(5):
511–545, 2003.
Daniel Büring. Topic. In Peter Bosch and Rob van der Sandt, editors, Focus — Linguis-
tic, Cognitive, and Computational Perspectives, pages 142–165. Cambridge University
Press, 1999.
Luigi Burzio. Italian Syntax: A Government-Binding Approach. Reidel, Dordrecht, 1986.
Andrea Calabrese. Pronomina: Some properties of the Italian pronominal system. MITWP
in Theoretical Linguistics, 8:1–46, 1985.
Richard Cameron. Pronominal and null subject variation in Spanish: constraints, dialects,
and functional compensation. PhD thesis, University of Pennsylvania, 1992.
Alfonso Caramazza, Ellen Grober, Catherine Garvey, and Jack Yates. Comprehension
of anaphoric pronouns. Journal of Verbal Learning and Verbal Behavior, 16:601–609,
1977.
Maria Nella Carminati. The processing of Italian subject pronouns. PhD thesis, University
of Massachusetts, 2002.
Lourdes Casanova. Un estudi tipològic del català col·loquial. Sintagma, 10:5–25, 1998.
Wallace L. Chafe. Givenness, contrastiveness, definiteness, subjects, topics, and points of
view. In C. N. Li, editor, Subject and Topic, pages 25–56. Academic Press, 1976.
Robin Clark and Prashant Parikh. Game theory and discourse anaphora. Journal of Logic,
Language and Information, 16(3):265–282, 2007.
Philip D. Curtin. The Atlantic slave trade: a census. University of Wisconsin Press,
Madison, 1969.
Brad Davidson. ‘Pragmatic weight’ and Spanish subject pronouns: The pragmatic and
discourse uses of ‘tu’ and ‘yo’ in spoken Madrid Spanish. Journal of Pragmatics,
26:543–565, 1996.
Rosane de Andrade Berlinck. Brazilian Portuguese VS order: A diachronic analysis. In
Mary Kato and Esmeralda Negrão, editors, Brazilian Portuguese and the Null Subject
Parameter, pages 175–194. Vervuert, Frankfurt am Main, 2000.
Barbara Di Eugenio. Centering in Italian. In M. Walker, A. K. Joshi, and E. Prince, editors,
Centering Theory in Discourse, pages 114–137. Oxford University Press, 1998.
Alexis Dimitriadis. When pro-drop languages don’t: Overt pronominal subjects and prag-
matic inference. In Proceedings of Chicago Linguistics Society 32, 1996.
Avinash Dixit and Barry Nalebuff. Thinking Strategically. Norton, New York, 1991.
Maria Eugénia Duarte. Do pronome nulo ao pronome pleno. In Ian Roberts and Mary A.
Kato, editors, Português Brasileiro: Uma viagem diacrônica, pages 107–128. Ed. da
Unicamp, Campinas, 1993.
Manuel Esgueva and Margarita Cantarero. El habla de la ciudad de Madrid: Materiales
para su estudio. Consejo Superior de Investigaciones Científicas, Instituto Miguel de
Cervantes, Madrid, 1981.
Ezike Eze. The forgotten null subject of Igbo. In Akinbiyi Akinlabi, editor, Theoretical
Approaches to African Linguistics, pages 59–81. Africa World Press, Trenton, 1995.
Fernanda Ferreira and Charles Clifton. The independence of syntactic processing. Journal
of Memory and Language, 25:555–568, 1986.
Nydia Flores-Ferrán. Subject personal pronouns in Spanish narratives of Puerto Ricans in
New York City. Lincom Europa, München, 2002.
Nydia Flores-Ferrán. A bend in the road: Subject personal pronoun expression in Spanish
after 30 years of sociolinguistic research. Language and Linguistics Compass, 1(6):624–652,
November 2007.
Ilaria Frana. The role of discourse prominence in the resolution of referential ambiguities.
Evidence from co-reference in Italian. UMOP 37: Semantics and Processing, 2007.
Talmy Givón. Topic continuity in discourse: An introduction. In Talmy Givón, editor,
Topic continuity in discourse. A quantitative cross-language study, pages 11–42. John
Benjamins, Amsterdam, Philadelphia, 1983.
Barbara Grosz, Aravind Joshi, and Scott Weinstein. Centering: A framework for modeling
the local coherence of discourse. Computational Linguistics, 21(2):203–225, 1995.
Jeanette Gundel. Universals of topic-comment structure. In M. Hammond et al., editors,
Studies in Syntactic Typology, pages 209–239. John Benjamins, 1988.
Jeanette Gundel. Shared knowledge and topicality. Journal of Pragmatics, 9:83–107, 1985.
Jeanette Gundel and Torsten Fretheim. Topic and focus. In Laurence Horn and Gregory
Ward, editors, Handbook of Pragmatic Theory. Blackwell, Oxford, 2001.
Jeanette Gundel, Nancy Hedberg, and Ron Zacharski. Cognitive status and the form of
referring expressions in discourse. Language, 69:274–307, 1993.
Gregory Guy. Linguistic variation in Brazilian Portuguese: Aspects of the phonology,
syntax, and language history. PhD thesis, University of Pennsylvania, 1981.
Yurie Hara and Robert van Rooij. Contrastive topics revisited: A simpler set of topic
alternatives. In Proceedings of NELS, volume 38, 2007.
Daniel Hartl and Andrew Clark. Principles of Population Genetics. Sinauer Associates,
1989.
Irene Heim. File Change Semantics and the Familiarity Theory of Definites. In Rainer
Bäuerle, Christoph Schwarze, and Arnim von Stechow, editors, Meaning, Use and Inter-
pretation of Language, pages 164–189. De Gruyter, Berlin, 1983.
Julia Hirschberg and Gregory Ward. Accent and bound anaphora. Cognitive Linguistics,
2:101–121, 1991.
Jerry R. Hobbs. Coherence and coreference. Cognitive Science, 3:67–90, 1979.
Anders Holmberg, Aarti Nayudu, and Michelle Sheehan. Three partial null-subject lan-
guages: a comparison of Brazilian Portuguese, Finnish, and Marathi. Studia Linguistica,
63(1):59–97, 2009.
John Holm. Languages in Contact: The partial restructuring of vernaculars. Cambridge
University Press, Cambridge, 2004.
Larry Horn. Implicature. In The Handbook of Pragmatics, pages 3–28. Blackwell, Oxford,
2004.
José Ignacio Hualde. Catalan. Descriptive Grammars. Routledge, London, 1992.
Katja Jasinskaja and Henk Zeevat. Contrast in Russian and English. Proceedings of Sinn
und Bedeutung, 13, 2008.
Gerhard Jäger. Evolutionary game theory and typology: A case study. Language, 83(1):
74–109, 2007.
Gerhard Jäger and Robert van Rooij. Language structure: psychological and social con-
straints. Synthese, 159(1):93–130, 2007.
Elsi Kaiser. Effects of topic and focus on salience. In Christian Ebert and Cornelia Endriss,
editors, Proceedings of Sinn und Bedeutung 10, volume 44, pages 139–154, 2006.
Elsi Kaiser and John Trueswell. Interpreting pronouns and demonstratives in Finnish: Ev-
idence for a form-specific approach to reference resolution. Language and Cognitive
Processes, 23(5):709–748, 2008.
Megumi Kameyama. Stressed and unstressed pronouns: Complementary preferences. In
Focus: Linguistic, Cognitive, and Computational Perspectives, pages 306–321. Cambridge
University Press, 1999.
Megumi Kameyama. Zero Anaphora: The Case of Japanese. PhD thesis, Stanford Univer-
sity, 1985.
Andrew Kehler. Coherence, Reference and the Theory of Grammar. CSLI Publications,
Stanford, CA, 2002.
Lucy Kyoungsook Kim. Korean honorific agreement too guides null argument resolution:
Evidence from an offline study. Talk at the Penn Linguistics Colloquium, 2009.
Arnout W. Koornneef and Jos J. A. Van Berkum. On the use of verb-based implicit causality
in sentence comprehension: Evidence from self-paced reading and eye tracking. Journal
of Memory and Language, 54(4):445–465, 2006.
Manfred Krifka. Additive particles under stress. In Proceedings of SALT 8, pages
111–128. CLC Publications, 1999.
Manfred Krifka. The semantics and pragmatics of polarity items. Linguistic Analysis, 25
(3-4):209–257, 1995.
Anthony Kroch. Reflexes of grammar in patterns of language change. Language Variation
and Change, 1:199–244, 1989.
Susumu Kuno. Functional sentence perspective: A case study from Japanese and English.
Linguistic Inquiry, 3:269–320, 1972.
William Labov. Principles of linguistic change. Volume 1: Internal Factors. Basil Black-
well, Oxford, 1994.
Knud Lambrecht. When subjects behave like objects: An analysis of the merging of S
and O in sentence-focus constructions across languages. Studies in Language, 24(3):
611–682, 2001.
Solange de Azambuja Lira. Nominal, pronominal and zero subject in Brazilian Portuguese.
PhD thesis, University of Pennsylvania, 1982.
Paolo Lorusso, Claudia Caprin, and Maria Teresa Guasti. Overt subject distribution in early
Italian children. In Proceedings of the Boston University Conference on Language
Development, volume 29, 2005.
Duncan Luce and Howard Raiffa. Games and Decisions. John Wiley & Sons, Inc., New
York, 1957.
Marta Luján. Binding properties of overt pronouns in null pronominal languages. In
W. H. Eilforth, P. D. Kroeber, and K. L. Peterson, editors, Proceedings of the Chicago
Linguistics Society, volume 21, pages 424–438, 1985.
Marta Luján. Expresión y omisión del pronombre personal. In Violeta Demonte and Ignacio
Bosque, editors, Gramática descriptiva de la lengua española, pages 1275–1316.
Espasa-Calpe, Madrid, 1999.
Panagiota Marzaga and Aurora Bel. Null subjects at the syntax-pragmatics interface: Evi-
dence from Spanish interlanguage of Greek speakers. Proceedings of the 8th Generative
Approaches to Second Language Acquisition Conference, pages 88–97, 2006.
Patrícia Matos Amaral and Scott A. Schwenter. Contrast and the (non-)occurrence of subject
pronouns. Selected Proceedings of the 7th Hispanic Linguistics Symposium, pages
116–127, 2005.
Louise McNally. On recent formal analyses of topic. In The Tbilisi Symposium on Lan-
guage, Logic, and Computation: Selected Papers, 1998.
Amparo Morales. Hacia un universal sintáctico del español del Caribe: el orden SVO.
Anuario de Lingüística Hispánica, 5:139–152, 1989.
Rebecca Nesson, Floris Roelofsen, and Barbara J. Grosz. Rational coordinated anaphora
theory. Technical Report TR-01-08, School of Engineering and Applied Sciences, Har-
vard University, Cambridge, MA, 2008.
Nocando. Non-canonical constructions in oral speech: a cross-linguistic study. Technical
Report I+D HUM2004-04463, Universitat Pompeu Fabra, Barcelona, 2004.
Ricardo Otheguy, Ana Celia Zentella, and David Livert. Language and dialect contact in
Spanish in New York: Towards the formation of a speech community. Language, 83:
1–33, 2007.
Prashant Parikh. Language in Use. Center for the Study of Language and Information,
Stanford, 2001.
Janet Pierrehumbert. The Phonology and Phonetics of English Intonation. Indiana Univer-
sity Linguistics Club, Bloomington, 1980.
Steven Pinker, Martin A. Nowak, and James J. Lee. The logic of indirect speech. PNAS,
105(3):833–838, January 2008.
Shana Poplack. Mortal phonemes as plural morphemes. In David Sankoff and Henrietta
Cedergren, editors, Variation omnibus, pages 59–71. Edmonton: Linguistic Research
Inc., 1981.
Chris Potts. The expressive dimension. Theoretical Linguistics, 33(2):165–197, 2007.
Ellen Prince. Toward a taxonomy of given-new information. In Peter Cole, editor, Radical
Pragmatics, pages 223–255. Academic Press, 1981.
Ellen Prince. The ZPG letter: subjects, definiteness, and information-status. In William
Mann and Sandra Thompson, editors, Discourse description: diverse analyses of a fund
raising text, pages 195–325. John Benjamins, 1992.
Tanya Reinhart. Pragmatics and linguistics: An analysis of sentence topics. Philosophica,
27:53–94, 1981.
Gemma Rigau. Some remarks on the nature of strong pronouns in null-subject languages.
In I. Bordelois, H. Contreras, and K. Zagona, editors, Generative Studies in Spanish
Syntax. Foris, Dordrecht, 1986.
Gemma Rigau. Connexity established by emphatic pronouns. In M. Conte, J. S. Petöfi,
and E. Sözer, editors, Text and Discourse Connectedness. John Benjamins Publishing,
Amsterdam, 1989.
Luigi Rizzi. Issues in Italian Syntax. Foris, Dordrecht, 1982.
Mats Rooth. Association with Focus. PhD thesis, University of Massachusetts, 1985.
Mats Rooth. A theory of focus interpretation. Natural Language Semantics, 1(1):75–116,
1992.
Ian Ross. Games Interlocutors Play: New Adventures in Compositionality and Conversational
Implicature. PhD thesis, University of Pennsylvania, 2006.
David Sally. Risky speech: behavioral game theory and pragmatics. Journal of Pragmatics,
35(8):1223–1245, 2003.
Vieri Samek-Lodovici. Constraints on Subjects: An Optimality Theoretic Analysis. PhD
thesis, Rutgers University, 1996.
Thomas Schelling. The strategy of conflict. Harvard University Press, Cambridge, 1969.
Katrin Schulz and Robert van Rooij. Pragmatic meaning and non-monotonic reasoning:
The case of exhaustive interpretation. Linguistics and Philosophy, 29(2):205–250, 2006.
Carmen Silva-Corvalán. Language Contact and Change. Clarendon Press, Oxford, 1994.
Carmen Silva-Corvalán. A discourse study of word order in the Spanish spoken by
Mexican-Americans in West Los Angeles. Master’s thesis, University of California,
Los Angeles, 1977.
Antonella Sorace, Ludovica Serratrice, Francesca Filiaci, and Michela Baldo. Discourse
conditions on subject pronoun realization: testing the linguistic intuitions of older bilin-
gual children. Lingua, 119:460–477, 2009.
Rosemary Stevenson, Alistair Knott, Jon Oberlander, and Sharon McDonald. Interpreting
pronouns and connectives: Interactions among focusing, thematic roles and coherence
relations. Language and Cognitive Processes, 15:225–262, 2000.
Miranda Stewart. ‘Pragmatic weight’ and face: pronominal presence and the case of the
Spanish second person singular subject pronoun tu. Journal of Pragmatics, 35:191–206,
2003.
Aleš Svoboda and Pavel Materna. Functional sentence perspective and intensional logic.
In René Dirven and Vilém Fried, editors, Functionalism in linguistics, pages 191–205.
John Benjamins, Amsterdam, 1987.
Anna Szabolcsi. The semantics of topic-focus articulation. In Jeroen Groenendijk and
Martin Stokhof, editors, Methods in the Study of Language, pages 413–541. Mathematisch
Centrum, Amsterdam, 1981.
Júlia Todolí. Els pronoms personals. In Joan Solà et al., editors, Gramàtica del català
contemporani, pages 1341–1433. Empúries, 2002.
Satoshi Tomioka. Contrastive topics operate on speech acts. In Caroline Féry and Malte
Zimmermann, editors, Information Structure from Different Perspectives. Oxford Uni-
versity Press, 2008.
Almeida Jacqueline Toribio. Setting parametric limits on dialectal variation in Spanish.
Lingua, 110:315–341, 2000.
Catherine E. Travis. The yo-yo effect: Priming in subject expression in Colombian Span-
ish. In Randall S. Gess and Edward J. Rubin, editors, Theoretical and Experimental
Approaches to Romance Linguistics, pages 329–349. Benjamins, Amsterdam, Philadelphia,
2005.
John C. Trueswell, Michael K. Tanenhaus, and Susan M. Garnsey. Semantic influences on
parsing: Use of thematic role information in syntactic ambiguity resolution. Journal of
Memory and Language, 33(3):285–318, June 1994.
Umit Turan. Null vs. Overt Subjects in Turkish Discourse: A Centering Analysis. PhD
thesis, University of Pennsylvania, 1995.
Tycho Brahe Corpus. Tycho Brahe parsed corpus of historical Portuguese. Technical Re-
port http://www.tycho.iel.unicamp.br/ tycho/corpus/en/index.html, University of Camp-
inas, Campinas, 2009.
Carla Umbach. Contrast in information structure and discourse structure. Journal of Se-
mantics, 21(2):155–175, 2004.
Enric Vallduví. L’oració com a unitat informativa. In Joan Solà et al., editors, Gramàtica
del català contemporani, pages 1221–1279. Empúries, 2002.
Enric Vallduví. Catalan as VOS: evidence from information packaging. In William J.
Ashby, Marianne Mithun, and Giorgio Perissinotto, editors, Linguistic Perspectives on
Romance Languages, pages 335–350. Benjamins, Philadelphia, 1993.
Enric Vallduví and Maria Vilkuna. On rheme and kontrast. In P. Culicover and L. McNally,
editors, Syntax and Semantics, volume 29: The Limits of Syntax, pages 209–239.
Academic Press, San Diego, 1998.
Enric Vallduví. The informational component. Garland, New York, 1992.
Enric Vallduví and Elisabet Engdahl. The linguistic realisation of information packaging.
Linguistics, 34:459–519, 1996.
Robert van Rooij. Game theory for linguists. In A. Benz, G. Jäger, and R. van Rooij,
editors, Game theory and pragmatics. Palgrave Macmillan, Basingstoke, New York,
2006.
Robert van Rooij. Questioning to resolve decision problems. Linguistics and Philosophy,
26:727–763, 2003.
John von Neumann and Oskar Morgenstern. Theory of Games and Economic Behavior.
Princeton University Press, 1944.
Marilyn Walker, Aravind Joshi, and Ellen Prince. Centering theory in discourse. Oxford
University Press, New York, 1998.
Max Wheeler. Catalan. In M. Harris and N. Vincent, editors, The Romance Languages.
Routledge, London, 1988.
Charles D. Yang. Internal and external forces in language change. Language Variation and
Change, 12(3):231–250, October 2000.
Charles D. Yang. Knowledge and Learning in Natural Language. Oxford University Press,
2003.