The Syntax-Prosody Interface

To appear in Annual Review of Linguistics

This article provides an overview of current and historically important issues in the study of the syntax-prosody interface, the point of interaction between syntactic structure and phrase-level phonology. We take a broad view of the syntax-prosody interface, surveying both direct and indirect reference theories, with a focus on evaluating the continuing prominent role of prosodic hierarchy theory in shaping our understanding of this area of linguistics. Specific topics discussed in more detail include the identification of prosodic domains, the universality of prosodic categories, the recent resurgence of interest in the role of recursion in prosodic structure, cross-linguistic variation in syntax-prosody mapping, prosodic influences on syntax and word order, and the influence of sentence processing in the planning and shaping of prosodic domains. We consider criticisms of prosodic hierarchy theory in particular, and provide an assessment of the future of prosodic hierarchy theory in work on the syntax-prosody interface.

Ryan Bennett (University of California, Santa Cruz) and Emily Elfner (York University)

Corresponding author: Emily Elfner, Department of Languages, Literatures and Linguistics, Room S561, Ross Building, York University, 4700 Keele St, Toronto, ON M3J 1P3

Keywords: syntax-prosody interface, prosody, phrasal phonology, recursion, prosodic hierarchy

1. Introduction: Domains in phrasal phonology

The phonetic form of a word often depends on its position within a larger, containing phrase. To illustrate, consider a process of /r/-assimilation found in Bengali: in some circumstances, the approximant /r/ may be realized as identical with any following coronal consonant, as in 1 (Hayes & Lahiri 1991, Fitzpatrick-Cole 1996, Truckenbrodt 2002).
This process applies optionally, both within and across words, as in 2a-c.

(1) /r/ → Cₓ / __ Cₓ [+cor]

(2a) [bɔrʃa] ~ [bɔʃʃa] ‘rainy season’
(2b) [kor-t͡ʃʰe] ~ [kot͡ʃ-t͡ʃʰe] ‘(s)he does’
(2c) [ram-er ʃoʃur-er d͡ʒonno] ~ [ram-eʃ ʃoʃur-er d͡ʒonno] ~ [ram-eʃ ʃoʃur-ed͡ʒ d͡ʒonno] ‘for Ram’s father-in-law’

The pronunciation of any given word may thus vary dramatically depending on the wider context in which it is embedded. Processes like 1, which apply at the junctures of words and morphemes, are sometimes known as sandhi processes (from Sanskrit saṃ- ‘together’ + dhi ‘putting’; Ruppel 2017, p. 109).

It has long been known that sandhi processes do not apply indiscriminately between any adjacent pair of words (Selkirk 1980a, Nespor & Vogel 1986, and references there). More typically, sandhi rules apply between words standing in a particular syntactic relationship. For example, consider the syntactic structure of 2c ‘for Ram’s father-in-law’, adapted from Fitzpatrick-Cole (1996):

(3) PP[ NP[ram-er ʃoʃur-er] d͡ʒonno]
    Ram-GEN father.in.law-GEN for

This syntactic structure is reflected in the distribution of /r/-assimilation. Assimilation may apply between [ram-er] and [ʃoʃur-er] alone, or both there and between [ʃoʃur-er] and [d͡ʒonno], as illustrated in 2c. Missing from this list is the form *[ram-er ʃoʃur-ed͡ʒ d͡ʒonno], in which assimilation has applied between [ʃoʃur-er] and [d͡ʒonno], but not also between [ram-er] and [ʃoʃur-er]. It thus appears that the domain of /r/-assimilation is conditioned by syntactic structure: it may apply between words contained in the same noun phrase (NP) or the same prepositional phrase (PP); but it may not apply solely between [ʃoʃur-er] and [d͡ʒonno] because these two words do not form a syntactic unit to the exclusion of [ram-er] in 3.

There is nonetheless reason to believe that the domain of /r/-assimilation is not, in fact, syntactically defined.
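Before turning to that evidence, the syntactic generalization just described can be made concrete. The following is a minimal sketch, not an analysis from the literature: the function names, the segment lists, and the partial coronal inventory are all assumptions made for illustration. It applies rule 1 at chosen word junctures of 2c and checks whether the words linked by assimilation form a constituent of structure 3.

```python
# Illustrative sketch only: names and the coronal inventory are assumptions.
CORONALS = {"t", "d", "n", "s", "l", "ʃ", "t͡ʃ", "d͡ʒ"}  # partial, assumed

# Words of example 2c as segment lists (morpheme hyphens omitted;
# affricates count as single segments).
WORDS = [
    ["r", "a", "m", "e", "r"],
    ["ʃ", "o", "ʃ", "u", "r", "e", "r"],
    ["d͡ʒ", "o", "n", "n", "o"],
]

# Multi-word constituents of structure 3, as (first, last) word indices:
# NP = words 0-1, PP = words 0-2.
CONSTITUENTS = {(0, 1), (0, 2)}


def assimilate(words, junctures):
    """Apply rule 1 at each juncture i (between word i and word i + 1)."""
    out = [list(w) for w in words]
    for i in junctures:
        following = out[i + 1][0]
        if out[i][-1] == "r" and following in CORONALS:
            out[i][-1] = following  # /r/ becomes identical to the coronal
    return " ".join("".join(w) for w in out)


def runs(junctures):
    """Maximal runs of words linked by assimilating junctures."""
    spans = []
    for j in sorted(junctures):
        if spans and spans[-1][1] == j:
            spans[-1] = (spans[-1][0], j + 1)  # extend the current run
        else:
            spans.append((j, j + 1))  # start a new run
    return spans


def is_licit(junctures):
    """True if every run of assimilation-linked words is a constituent."""
    return all(span in CONSTITUENTS for span in runs(junctures))


for pattern in [set(), {0}, {1}, {0, 1}]:
    mark = "" if is_licit(pattern) else "*"
    print(mark + assimilate(WORDS, pattern))
```

Under these assumptions only the pattern with assimilation at the second juncture alone is starred, which matches the gap noted for 2c.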
Hayes & Lahiri (1991) point out that in a sentence like [ɔmɔr t͡ʃador tarake diet͡ʃʰe] ‘Amor gave a scarf to Tara’, assimilation is impossible between the first two words ([ɔmɔr] ‘Amor’ and [t͡ʃador] ‘scarf’) in careful speech. However, assimilation to [ɔmɔt͡ʃ t͡ʃador...] is possible in faster, less guarded renditions of the same sentence, or in a discourse context in which the word [t͡ʃador] ‘scarf’ has been previously mentioned (e.g. ‘Shamoli gave a scarf to Ram, and Amor gave a scarf to Tara’). It seems unlikely that the syntactic structure of this sentence is conditioned by speech rate or by the larger discourse context, since the semantic interpretation of the sentence (derived, by hypothesis, from the syntax) is fixed across all of these contexts. Hayes & Lahiri therefore conclude that the domain of /r/-assimilation cannot be defined in terms of the syntax itself.

Here we appear to be at an impasse. The domain of /r/-assimilation in Bengali seems closely tied to syntactic structure, and yet it cannot be identical to that syntactic structure. Such partial (mis)matches between syntax and the domains in which sandhi, intonation, and other aspects of phrasal phonology apply are at the heart of a research area known as the syntax-prosody interface.

A central problem in research on phrasal phonology is ‘chunk definition’ (Scheer 2012a, b): what kind of structures or domains condition phonology at the phrase level and above? What are the algorithms which produce those domains? How do we identify phonologically relevant phrasal domains on the basis of phonetic or phonotactic evidence? These issues were raised in a modern context at least as early as Chomsky & Halle (1968), the foundational document of generative phonology.
Fifty years of subsequent research on the syntax-prosody interface has deepened the empirical basis for these questions, and has given rise to a range of different perspectives on what an adequate theory of phrase-level phonology should look like.

Prosodic phrasing is often invoked to account for sandhi processes, like Bengali /r/-assimilation, which affect the segmental or tonal content of individual words. Prosodic domains also play a key part in conditioning phrase-level intonation (e.g. Jun 2004, 2015), as well as ‘lower-level’ phonetic properties such as duration, voice quality, and so on (e.g. Fougeron & Keating 1997; Keating et al. 2003, and discussion in §§3.1, 5). We thus adopt a broad view of phrasal phonology, in which segmental phonotactics, stress, tone, intonation, morphology, and sub-segmental phonetic patterning all count as potential evidence for theories of phrase-level domains.

Here we survey some current and historically important issues in the study of the syntax-prosody interface. Our intent is to provide some basic background on this research tradition, while highlighting debates and empirical findings which strike us as particularly important for the ongoing development of the field. In an article of this scope we cannot hope to do justice to all of the phenomena and analytical issues which are important for understanding the relationship between syntax and prosody. To supplement the material presented here, we refer readers to overviews like Inkelas & Zec (1995), Shattuck-Hufnagel & Turk (1996), Turk et al. (2006), Truckenbrodt (2007), Elordieta (2008), Wagner & Watson (2010), Selkirk (2011), Frota (2012), Ishihara (2015), Wagner (2015), Cole (2015), and Elfner (2018); to collected volumes like Kaisse & Zwicky (1987), Inkelas & Zec (1990), Jun (2005, 2014), and Selkirk & Lee (2015); and to books like Selkirk (1984), Nespor & Vogel (1986), Pierrehumbert & Beckman (1988), Gussenhoven (2004), Ladd (2008 [1996]), and Féry (2016).

2.
Direct and indirect reference

Theories of ‘chunk definition’ in phrasal phonology fall into one of two broad camps. Indirect reference theories assume that syntactic structure is first mapped to a separate representation, typically called prosodic structure, which provides the groupings of words that phrase-level phonological processes are sensitive to. To illustrate with Bengali, Hayes & Lahiri (1991) argue that syntactic phrases are mapped to a set of non-syntactic units known as phonological phrases (φ). These φ condition /r/-assimilation and other phrasal phenomena. The mapping from syntax to φ-structure yields groupings like those in 4a-c, which reflect syntax to the extent that each φ corresponds to a syntactic constituent of some kind (cf. the ungrammatical 4d, which violates this requirement; see Fitzpatrick-Cole 1996, Truckenbrodt 2002).

Possible φ-groupings for PP[ NP[ram-er ʃoʃur-er] d͡ʒonno]
(4a) (ram-eʃ ʃoʃur-ed͡ʒ d͡ʒonno)φ
(4b) (ram-eʃ ʃoʃur-er)φ (d͡ʒonno)φ
(4c) (ram-er)φ (ʃoʃur-er)φ (d͡ʒonno)φ
(4d) *(ram-er)φ (ʃoʃur-ed͡ʒ d͡ʒonno)φ

The fact that prosodic structure is derived from syntax, but need not be identical to it, thus provides an explanation for why processes like /r/-assimilation are only imperfectly conditioned by syntactic constituency. Additionally, if factors like speech rate and discourse status can affect groupings of words at this level of representation, we expect such factors to condition processes like /r/-assimilation even when the underlying syntax (and resultant semantics) remains constant.

In contrast, direct reference theories assume that the domains which condition segmental sandhi and other types of phrasal phonological processes are defined solely with reference to syntactic (and/or morphological) structure.
Proponents of direct reference theories include Rotenberg (1978), Kaisse (1985), Odden (1987), Chen (1990), Cinque (1993), Seidl (2001), Wagner (2005, 2010), Arregi (2006), Pak (2008), Newell (2008), Samuels (2009), Scheer (2010, 2012a, 2012b), and Newell & Piggott (2014), among others (see Elordieta 2008 for additional references). As Elordieta (2008) emphasizes, existing direct reference theories do not claim that the domains of phrasal phonology are exactly identical to syntactic units (indeed, such a position is likely untenable, given the existence of well-known mismatches between syntax and the domains of phrasal phonology; see §2.1 and Pak 2008, §2.2.1). Instead, direct reference theories claim that phonologically relevant groupings of words are determined by the syntactic relationships which hold between those words. These relationships may be defined in terms of structural properties like c-command (Kaisse 1985), or in terms of the kinds of nodes which intervene between words in the same syntactic structure (Odden 1987; see also Elfner 2018 on cyclic/phasal models of prosodic phrasing, and work cited there). In a direct reference framework, we might explain the ungrammaticality of an output like 4d by assuming that /r/-assimilation applies more readily between two words which are structural sisters ([ram-er] and [ʃoʃur-er]) than between two words which are not ([ʃoʃur-er] and [d͡ʒonno]), such that assimilation in the second pair of words entails assimilation in the first. Importantly, these are purely syntactic conditions on sandhi: there is no mediation by a separate level of structure such as φ.

While we adopt the traditional division between direct and indirect reference theories of phrasal phonology here, we recognize that this dichotomy is probably too simplistic a characterization of the actual theoretical landscape (see also Elordieta 2008).
Seidl (2001), for instance, advocates a ‘Minimal Indirect Reference’ theory in which some phonological processes apply at a level of representation (M0) which contains only morpho-syntactic structure, while other processes apply at a level of representation (P0) which is derived from but not identical to the morpho-syntax (see also Rotenberg 1978). Pak (2008, pp. 33-6) proposes a similar direct reference architecture, but further adopts rules which optionally rebracket groupings of words on the basis of speech rate. The space of theories here may be better characterized as a continuum, with more-or-less rigid adherence to the underlying syntax (see also Wagner 2010, Selkirk 2011).

Indirect reference theories are clearly ascendant in work on the syntax-prosody interface. In §2.1 we discuss some additional evidence favoring indirect over direct reference as the source of ‘chunk definition’ in phrasal phonology. In §4 we consider some recent critiques of prosodic hierarchy theory (§2.2), the most widely employed indirect reference theory of phrasal phonology and phonetics.

2.1. Arguments in favor of indirect reference

A number of arguments have been offered in support of indirect reference theories of phrasal phonology, many originating in Nespor & Vogel’s (1986) seminal work on the topic. As discussed in §§3-5, some of these arguments are more convincing than others; we present a few of them here without commenting on their soundness.

(i) Blindness to syntactic category distinctions: Processes of phrasal phonology do not generally distinguish between words based on their lexical or syntactic category. For example, some dialects of American English insert [ɹ] between lexical words (nouns, verbs, adjectives) ending in [ɑ ə ɔ], when followed by a vowel as in 5a-b. This applies regardless of the category of the following word.
This indifference to syntactic category distinctions suggests that sandhi processes apply at a level of representation which no longer encodes such category information.

Intrusive [ɹ]
(5a) Did Wanda[ɹ]_NOUN eat_VERB much at dinner?
(5b) The boat will yaw[ɹ]_VERB [a_DET little_NOUN]_NP. (McCarthy 1993)

(ii) Non-isomorphisms (mismatches) between syntax and prosody, and eurhythmic effects: Some domains for phrasal prosody do not match the groupings of words provided by morphology or syntax, and are often shaped by factors which are purely phonological in nature. In Catalan, for instance, there is a preference for the last φ domain in the utterance to contain no more than two words (ω) (Prieto 2005, 2014). This creates a contrast in intonational grouping between [verb object] clauses with a one-word object, which are phrased (ω1 ω2), and those with a two-word object, which are phrased (ω1)(ω2 ω3) (see also Ghini 1993, Selkirk 2000, Elordieta 2007, Myrberg 2013, Elfner 2012, 2015 and many others). Eurhythmic effects (i.e. preferences relating to the size and rhythmic patterning of prosodic constituents) are a common source of non-isomorphism, though non-isomorphisms can also emerge from other factors, such as the particular mapping algorithm used to derive prosodic structure from syntactic structure (see e.g. Selkirk & Shen 1990 on Shanghai Chinese, and the discussion of Kwak’wala determiners in §4).

(iii) Insensitivity to phonetically null elements: Syntacticians frequently assume that phonetically null elements may still be present in the syntactic representation (e.g. traces, unpronounced copies, ᴘʀᴏ). These elements seem to have no effect on sandhi rules or other processes of phrasal phonology (Kaisse 1985; Nespor & Vogel 1986; Truckenbrodt 1995, 1999; Elfner 2012, 2015; among others).

(iv) Variability and optionality: Many processes of phrasal phonology apply variably and/or optionally (e.g. Bengali /r/-assimilation, which can be affected by factors such as speech rate).
Such optionality is not characteristic of syntactic structure, though see §5 for explanations of such effects which make reference to speech processing and production planning.

2.2. Prosodic hierarchy theory

The dominant indirect reference theory is prosodic hierarchy theory (PHT; Selkirk 1981 [1978], 1980a, 1984; Nespor & Vogel 1986; Pierrehumbert & Beckman 1988). PHT assumes that phrasal phonology is conditioned not by syntax, but by abstract phonological constituents known as prosodic categories. These prosodic categories are derived from the syntax, and come in different types (or ‘levels’) depending on what syntactic units they characteristically correspond to. As these prosodic categories are each derived from syntactic units of different sizes, at least clauses (CP), maximal projections (XP), and words (X0), they too can be arranged into a ‘hierarchy’ reflecting their relative sizes (Figure 1).¹

Evidence for the prosodic hierarchy primarily comes from the observation that phonological domains of different sizes may be associated with categorically distinct phonological processes (e.g. Nespor & Vogel 1986; Vogel 2009; Vigário 2010). This suggests that different types of prosodic units may be co-present in the same representation, as in PHT. Relatedly, in many languages multiple phonological processes converge on the same domains; this suggests that a relatively small number of prosodic categories (as in Figure 1) may be sufficient to account for most patterns of word- and phrase-level phonology. An example of such domain clustering comes from European Portuguese, where the prosodic word conditions both stress assignment and a diverse set of segmental phenomena (Vigário 2003, ch. 5; see also Peperkamp 1997).

The prosodic categories in Figure 1 can be nested: a sentence like She loaned her rollerblades to Robin would have (at least) the structure {([She loaned]ω [her rollerblades]ω)φ ([to Robin]ω)φ}ι (Selkirk 2000).
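The nesting in this example can be written down as a small tree. The sketch below is purely illustrative: the `Node` class, the `render` and `level_gaps` helpers, and the three-level ordering ω < φ < ι are assumptions made for the example, not part of any published formalism, and the rendering uses one bracket style throughout instead of the mixed brackets above.

```python
from dataclasses import dataclass, field

# Assumed ordering of categories, smallest to largest.
LEVELS = ["ω", "φ", "ι"]


@dataclass
class Node:
    cat: str                                       # prosodic category label
    children: list = field(default_factory=list)   # Nodes or plain words


def render(node):
    """Return a labelled bracketing such as [[a b]ω]φ."""
    if isinstance(node, str):
        return node
    inner = " ".join(render(child) for child in node.children)
    return f"[{inner}]{node.cat}"


def level_gaps(node):
    """List (mother, daughter, gap) triples, where gap is the number of
    hierarchy levels separating a constituent from a daughter constituent."""
    gaps = []
    for child in node.children:
        if isinstance(child, Node):
            gap = LEVELS.index(node.cat) - LEVELS.index(child.cat)
            gaps.append((node.cat, child.cat, gap))
            gaps.extend(level_gaps(child))
    return gaps


sentence = Node("ι", [
    Node("φ", [Node("ω", ["She", "loaned"]),
               Node("ω", ["her", "rollerblades"])]),
    Node("φ", [Node("ω", ["to", "Robin"])]),
])

print(render(sentence))
print(level_gaps(sentence))
```

In this particular structure every daughter sits exactly one level below its mother, so every gap is 1. A gap of 2 or more would correspond to a mother dominating a category more than one step down, and a gap of 0 to a constituent dominating another of its own type.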
Early versions of PHT adopted the strict layer hypothesis, which proposes that prosodic constituents can only dominate constituents at the next level down on the hierarchy (Selkirk 1984, p. 26; see also Selkirk 1981 [1978]; Beckman & Pierrehumbert 1986; Nespor & Vogel 1986; Pierrehumbert & Beckman 1988). It is now clear that the strict layer hypothesis is too strong: at a minimum, categories may dominate categories which are more than one step lower on the hierarchy (‘level skipping’; Selkirk 1995). Verbal clitics in Standard Italian furnish a clear example: these clitics are outside the domain of stress assignment, the prosodic word (ω), which includes the verb, and must therefore be dominated directly by φ (e.g. pórtamelo ([pórta]ω me lo)φ ‘Bring it to me!’; Peperkamp 1997; Anderson 2005; Vogel 2009). Recent work has also revived the possibility that prosodic constituents may dominate other constituents of the same type, a configuration also banned by the strict layer hypothesis (‘recursion’; e.g. Ladd 2008 [1996], ch. 8, and §3.3 below).

¹ There are various other views on the composition of the prosodic hierarchy, other than that shown in Figure 1; see, for example, Nespor & Vogel (1986), Jun (2005, 2014), and Ladd (2008 [1996]) for additional discussion.

3. Some current issues in indirect reference theory

3.1. Identifying domains: How are mapping algorithms distinguished?

Broadly understood, the term ‘prosodic domain’ refers to a portion of an utterance which is identifiable due to its behavior with respect to some phonetic or phonological process. As discussed above, /r/-assimilation in Bengali applies across domains larger than the word, yet does not apply indiscriminately across the entire utterance.
The domain of application of /r/-assimilation is thus definable according to those portions of the utterance in which the phonological process may be observed, whether those domains are defined according to syntactic constituents (under a direct reference approach) or according to prosodic constituents, such as φ (as argued by Hayes & Lahiri 1991). Thus, we might infer that the edges of phrase-level phonological domains in Bengali fall between those words in which /r/-assimilation is blocked.

As /r/-assimilation in Bengali only applies between words contained in the same domain, it can be considered a domain-span process (in the terminology of Selkirk 1980a). Other phonological and phonetic processes target elements at domain edges: typically, either the left or right edge of some prosodic domain. For example, in Japanese the prosodic domain traditionally referred to as the Minor Phrase is marked at its left edge by a rise in pitch (a %LH boundary tone), while the Major Phrase domain is marked at its left edge by a pitch reset, which undoes the pitch downtrends of the preceding prosodic phrase (McCawley 1968, Pierrehumbert & Beckman 1988, Selkirk & Tateishi 1991). In ChiMwiini, the right edge of intonational phrases is marked by lengthening of the vowel in the penultimate syllable, as well as by the presence of a right-edge phrasal H tone (Kisseberth & Abasheikh 1974; Selkirk 2011).

As discussed in Selkirk (2011), the diagnostics for prosodic domain edges are often partial and asymmetric, such that prosodic domains are typically diagnosable only on the left or right edge, but not both. This observation inspired the ‘end-based’ theory of syntax-prosody mapping, which claims that languages differ parametrically as to whether the left or right edges of syntactic constituents are referred to in the mapping to corresponding prosodic domains (Selkirk 1986; Selkirk & Shen 1990).
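The logic of an end-based mapping can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Selkirk's formulation: the function name `end_based_phrases`, the representation of syntactic phrases as word-index spans, and the toy four-word structure are all hypothetical, and the output is assumed to be a flat, exhaustive sequence of phrases.

```python
def end_based_phrases(words, xps, edge):
    """Group words into phrases by placing a boundary at one edge of
    every syntactic phrase.

    words: list of word strings
    xps:   list of (start, end) inclusive word-index spans, one per XP
    edge:  "left" or "right", the parameter setting assumed for a language
    """
    n = len(words)
    if edge == "right":
        cuts = {end + 1 for _, end in xps}   # boundary after each XP
    elif edge == "left":
        cuts = {start for start, _ in xps}   # boundary before each XP
    else:
        raise ValueError("edge must be 'left' or 'right'")
    # Utterance-initial and utterance-final boundaries are always present.
    bounds = [0] + sorted(c for c in cuts if 0 < c < n) + [n]
    return [words[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical structure: [VP verb [NP noun1] [PP prep [NP noun2]]]
words = ["verb", "noun1", "prep", "noun2"]
xps = [(0, 3), (1, 1), (2, 3), (3, 3)]

print(end_based_phrases(words, xps, "right"))
print(end_based_phrases(words, xps, "left"))
```

With the right-edge setting the toy structure is split once, after `noun1`; with the left-edge setting each word begins its own phrase. The same syntax thus yields different groupings depending only on the edge parameter, and the position of the unspecified edge falls out from the requirement that the phrases be flat and exhaustive.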
ChiMwiini, under Selkirk’s (1986) proposal, would map the right edges of syntactic phrases onto the right edges of phonological phrases, leaving the position of left edges unspecified. Japanese, alternatively, would be specified with a left-edge setting (Selkirk & Tateishi 1991). Under the strict layer hypothesis, the location of the unspecified domain edge from such incomplete mappings will be determined by general, formal constraints imposed on the prosodic hierarchy, such as a ban on recursive nesting of prosodic domains (see §2.2).

Selkirk (2011) advocates for an approach to syntax-prosody mapping that moves away from reference to phrase edges, and which permits certain structures banned under strict layering, such as recursion. Selkirk (2011) instead proposes that both the left and right edges of syntactic constituents are transparently mapped to their corresponding prosodic constituents. Evidence for such an approach comes from languages where it is possible to diagnose both the left and right edges of prosodic domains simultaneously. Selkirk (2011) discusses the case of Xitsonga, in which a lexical H tone spreads throughout φ-level domains. It is blocked from spreading onto the final syllable of φ (providing a diagnostic for the right edge), and is further blocked from spreading onto the initial syllable of the following ω (thus providing a diagnostic for the left edge). Connemara Irish (Elfner 2012, 2015) provides another example of a language with diagnostics for both the left and right edges of φ, which are marked with rising and falling phrasal pitch accents, respectively.

In the case of languages like Japanese and ChiMwiini, the absence of a clear phonological diagnostic for one edge or the other would not necessarily constitute evidence against the presence of prosodic boundaries in locations predicted by syntactic structure.
Selkirk’s (2011) proposal thus assumes that prosodic boundaries may be present even when there is no explicit phonological or phonetic evidence for their existence, and instead places the onus on the theory of syntax-prosody mapping to predict the locations for prosodic boundaries.

Such an approach contrasts with what is typically assumed in psycholinguistic (§5) and ‘intonation-first’ approaches to prosodic domain demarcation (Jun 1998, 2005, 2014): namely, that prosodic boundaries are present only when there is some explicit phonetic or phonological cue to their existence. Such evidence may come from categorical measures, such as the presence of a boundary tone, a pause, or final lengthening, or from gradient measures which are equated with the relative strength of a prosodic boundary, such as the degree of final lengthening, the degree of pitch reset or pitch scaling, the magnitude of rises or falls in pitch, the duration of pauses, the degree of initial strengthening, and the frequency with which phonological processes apply across word boundaries. See §5 for further discussion.

3.2. Universality of categories

The prosodic hierarchy is typically assumed to be a component of universal grammar: all languages have hierarchically ordered prosodic structure, and languages are thought to make use of the same set of prosodic categories in the structuring of utterances. Theories of the syntax-prosody interface which adopt prosodic hierarchy theory similarly assume that the principles or constraints governing syntax-prosody mapping are universal in nature. Thus, although languages may differ in terms of their surface syntactic structure and in the explicit marking of prosodic domains using phonetic and phonological processes, the grammatical mechanism underlying both the mapping of prosodic structure from syntactic structure, and the hierarchical organization of prosodic structure into distinct prosodic categories, is thought to remain constant across languages.
To what extent is this an accurate representation of prosodic structure cross-linguistically? In terms of the number and types of prosodic categories that are present in the prosodic hierarchy, there are languages which arguably under-represent or over-represent the prosodic hierarchy as given in Figure 1. For example, Japanese and Basque, both lexical pitch accent languages, show evidence for an additional prosodic domain which is larger than the prosodic word but smaller than the phonological phrase, traditionally referred to as the Accentual or Minor Phrase (McCawley 1968, Pierrehumbert & Beckman 1988, Selkirk & Tateishi 1991, Jun & Elordieta 1997). Other languages, alternatively, appear to provide positive evidence for just some of the prosodic domains listed in Figure 1. For example, the Inuit languages demarcate prosodic domains using tonal cues at the level of the prosodic word and the level of the intonational phrase, but (seemingly) provide no evidence for an intermediate, φ-level prosodic domain (Arnhold 2014; Arnhold et al. 2018; see also Bennett 2015 for discussion).

Such languages call into question the universality of category labels in prosodic typology. For example, does the presence of an additional intermediate-level domain in Japanese and Basque necessitate a revision to the prosodic hierarchy, as envisioned in Figure 1? And conversely, if Inuit differentiates between just two types of higher-level domains, what determines which two categories these domains correspond to?
Recent work by Ito & Mester (2012, 2013) on Japanese and Elordieta (2015) on Basque argues that the additional intermediate-level categories may be captured under the assumption that φ domains may be recursive, such that at least some apparent category distinctions actually derive from the depth of embedding relative to the amount of structure present in the utterance, and not (necessarily) from distinctions made in the number and type of prosodic categories (see §3.3, as well as Selkirk 2011, Elfner 2015, 2018 for further discussion). Wagner (2005, 2010) argues in favour of a more radical, ‘label-free’ version of this approach, such that the hierarchical structure observed in the prosodic organization of sentences derives not from a universal prosodic hierarchy at all, but rather from the hierarchical, recursive nature of syntactic structure, on which prosodic domains are built (§3.3 and §5).

Wagner’s ‘label-free’ approach contrasts with the ‘syntactic grounding’ approach proposed most explicitly in Selkirk (2011), in which the universality of prosodic categories derives from the universality of syntactic constituent types. More specifically, Selkirk (2011) proposes that there is a direct correspondence between the syntactic constituents of word (X0), phrase (XP), and clause (CP), each of which maps onto a corresponding prosodic category, ω, φ, and ι, respectively. The hierarchical structure of syntax in this way results naturally in the hierarchy of prosodic categories, as espoused by prosodic hierarchy theory, under the assumption that prosodic domains may be recursive, as discussed above. Language-specific differences in prosodic category distinctions may thus derive from differences in the syntactic organization of individual utterances.
A challenge for this type of approach is a relative lack of understanding regarding how such mapping constraints are applied in languages with radically different systems of syntactic organization, such as polysynthetic languages (see Elfner 2018 for discussion and references).

Another prediction made by prosodic hierarchy theory is that processes of phrasal phonology should apply within the same phonological domains, namely those supplied by the prosodic hierarchy in Figure 1. This prediction, known as ‘domain clustering’ (§2.2, Inkelas 1990), has been the target of some criticism. Bickel et al. (2009) and Schiering et al. (2010) claim that word-level prosody in Limbu violates domain clustering (as well as nesting, §4) because domain-bounded phonological processes diagnose four distinct groupings of morphemes in [prefix-stem-suffix=clitic] strings. These four domains do not correspond neatly to ω or φ. Padgett (2014) notes that stress, vowel reduction and final devoicing in Russian (which generally converge on a single definition of the prosodic word) come apart in compounds, where they appear to have non-identical domains of application. Seidl (2001) discusses similar facts in Mende. All of these patterns seem to point toward a richer prosodic hierarchy than that in Figure 1, perhaps making use of language-specific or even process-specific domains. Seidl (2001) and Schiering et al. (2010) consider (but dismiss) the possibility that these apparent ‘extra’ domains actually reflect recursion of ω and φ (§3.3); it would be worthwhile to revisit these arguments in light of recent developments in the theory of prosodic recursion (particularly the recursive prosodic sub-categories of Ito & Mester 2009, 2010, 2012, 2013). See Wagner (2010, 2015) for additional critical commentary.

3.3.
Recursion in prosodic structure

The strict layer hypothesis (§2.2) sharply restricts the nesting of phonological domains, prohibiting recursion in prosodic structure: a node of type κ cannot dominate another node of the same type. This ban on prosodic recursion was questioned early on in the development of prosodic hierarchy theory, most notably by Ladd (1986, 1988). Ladd investigated the strength of intonational boundaries in coordinate structures like [[A and B] but C] vs. [A [but B and C]], where {A, B, C} are all full clauses. He found that intonational boundaries were stronger before ‘but’ than before ‘and’. Under the assumption that all clauses correspond to intonational phrases (ι), the only way to represent this distinction is through recursion: [[A and B]ι but C]ι vs. [A [but B and C]ι]ι (see also Dresher 1994; Kubozono 1989; Ladd 2008 [1996]; Wagner 2005, 2010; Féry & Truckenbrodt 2005).

Throughout the 1990s, recursion was commonly invoked for the prosodic word (ω), chiefly as a means of understanding how unstressed affixes, clitics, and function words are phonologically incorporated into their hosts (e.g. Inkelas 1990; Selkirk 1995; Booij 1996; Peperkamp 1997; Vigário 1999; and many others). More sporadically, recursion was also invoked for the phonological phrase (φ), e.g. Truckenbrodt (1995, 1999). The last ten years have seen a resurgence of interest in the possibility of prosodic recursion at both higher levels (e.g. φ, Ito & Mester 2007, 2012, 2013; Selkirk 2011; Elfner 2012, 2015; Elordieta 2015; ι, Selkirk 2009; Myrberg 2013) and lower levels (e.g. the metrical foot, Bennett 2013; Martínez-Paricio 2013; Martínez-Paricio & Kager 2015; and references there).

Some researchers remain skeptical about the possibility of prosodic recursion, either rejecting it outright (e.g. Vogel 2009; Schiering et al. 2010) or arguing that it has a limited role to play in prosodic systems (e.g. Vigário 2010; Frota & Vigário 2013).
In large part this debate concerns the kinds of diagnostics which are taken to be valid indicators of recursion. Vogel (2009) and Vigário (2010) argue that recursion of a category κ should only increase the strength of the phonetic cues associated with that category, as in Ladd’s (1986) study of coordinate structures in English. In contrast, Ito & Mester (2007, 2009, 2010, 2013), Martínez-Paricio (2013), Elfner (2015), and others argue that different levels of recursive structure can show not only gradient differences in boundary strength, but also categorical differences in the kinds of phonological phenomena which occur at each level (e.g. the topmost ω in a recursive prosodic word structure can show different behavior than the bottommost ω; see Elfner 2018, Bennett to appear for discussion). The question of how recursive prosodic structures are interpreted by the phonetics and phonology clearly merits further investigation.

3.4. Cross-linguistic variation in syntax-prosody mapping

Prosodic structure is derived, at least in part, through reference to syntactic structure. While the basic building blocks of syntactic structure may be universal in nature, languages differ in terms of surface structure, after syntactic operations such as movement have taken place. Under most current theories of the syntax-prosody interface, at least those deriving from generative syntax and more specifically, minimalism (Chomsky 1995), prosodic domains are created based on syntactic constituent structure, and not vice-versa (see also §3.5). This view of the architecture of the grammar is commonly referred to as the ‘Y’-model of the grammar, which holds that syntactic structure feeds into the phonological component of the grammar. This assumption holds even in views assuming that prosodic domains are created via direct reference, where there is no mediating component of prosodic structure.
To what extent, therefore, can cross-linguistic variation in syntactic structure account for the typology of patterns found in prosodic structure? Syntactic structure provides, in a sense, the blueprints for prosodic domains; the particular phonological properties of sentences, including the number and type of words, their order and their organization in terms of constituents, will be determined by the syntactic structure. Different theories of syntax-prosody mapping will make different predictions regarding the particulars of how these constituents are mapped to prosodic domains, but language-specific syntactic structure will determine the basis of prosodic constituents in any given utterance.

Given the basic mapping of syntactic constituents to prosodic constituents in Figure 1, we make the assumption that all languages will contain these basic syntactic building blocks: words (X0), phrases (XP), and clauses (CP). Selkirk’s (2011) Match Theory predicts that languages will map these basic syntactic elements onto corresponding prosodic domains: ω, φ, and ι, making the prediction that all languages will distinguish at least three types of prosodic domains. Alternatively, it may be the case that prosodic domains are relative, and do not require specific category labels themselves, as would arise under the ‘label-free’ theory discussed in §3.2 (Wagner 2005, 2010). Owing to the relatively recent resurgence of interest in the role of recursion in prosodic structure (§3.3), such views of prosodic structure are becoming more common in the literature.
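The word/phrase/clause correspondence just described can be sketched as a purely structural mapping. This is a deliberately simplified illustration, not an implementation of Match Theory: it ignores the phonological constraints that can override matching, and the tuple encoding, the ASCII labels (`"omega"`, `"phi"`, `"iota"`), the function names and the toy sentence are all assumptions introduced for the example.

```python
# Hypothetical lookup table: syntactic category -> prosodic category.
MATCH = {"X0": "omega", "XP": "phi", "CP": "iota"}


def match(tree):
    """Map a syntactic tree onto a prosodic tree node by node.

    A tree is a (label, children) pair; terminals are plain strings.
    """
    label, children = tree
    mapped = [c if isinstance(c, str) else match(c) for c in children]
    return (MATCH[label], mapped)


def brackets(p):
    """Render a prosodic tree as labelled brackets."""
    if isinstance(p, str):
        return p
    label, kids = p
    return "[" + " ".join(brackets(k) for k in kids) + "]" + label


# Toy clause: a subject phrase, and a verb phrase containing an object phrase.
syntax = ("CP", [
    ("XP", [("X0", ["dogs"])]),
    ("XP", [("X0", ["chase"]), ("XP", [("X0", ["cats"])])]),
])

prosody = match(syntax)
print(brackets(prosody))
```

Because every XP is matched, an XP nested inside another XP yields a φ nested inside a φ in this sketch, so a fully faithful mapping produces recursive prosodic structure of the kind discussed in §3.3.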
At one extreme, Wagner (2005, 2010) proposes that the prosodic hierarchy be re-envisioned as recursive prosodic domains, where relative boundary strength and the depth of embedding, rather than category labels or correspondence with particular syntactic elements, are responsible for the effects of the prosodic hierarchy (see also Ladd 1986, 1988); as such, cross-linguistic variation in syntax-prosody mapping may be derived exclusively from differences in surface syntactic structure, leaving little or no role for differences in terms of prosodic structure and the prosodic hierarchy. However, even within prosodic hierarchy theory, the presence of recursion has been used to argue for a simplification of the number of distinctions necessary in the prosodic hierarchy (§3.2, as well as Selkirk 2009; Elfner 2012, 2015; Ito & Mester 2012, 2013; Myrberg 2013; Elordieta 2015).

In summary, while languages differ in terms of the details of their surface syntactic configurations, the mechanisms governing syntax-prosody mapping are sensitive only to the larger patterns of constituent structure. This means that if languages systematically map syntactic constituents onto prosodic domains in universally consistent ways, we expect to see commonalities in terms of how prosodic domains are constructed across languages.

3.5. Prosodic influence on syntax and word order

Various phenomena suggest that the phonological size of a syntactic constituent can affect its position within the sentence. For example, Zec & Inkelas (1990) observe that topicalization in Serbo-Croatian is subject to a phonological size condition: prosodically large units like U Rio de Žaneiru ‘in Rio de Janeiro’ can be moved to a sentence-initial topic position, but not prosodically small units like U Riju ‘in Rio’ (see also Ryan 2018 on ‘end weight’ effects like Heavy XP shift).
The distribution of ‘small’ words like clitics and pronouns can also be conditioned by constraints governing the position of such words within phonological domains like φ or ι (Zec & Inkelas 1990; Halpern 1995; Anderson 2005; Werle 2009; Bennett et al. 2016). A growing body of research suggests that even phenomena traditionally (and uncontroversially) taken to belong to the syntax proper, such as wh-movement and argument incorporation, may be influenced by fundamentally prosodic factors (among others, Aissen 2000; Kandybowicz 2009, 2015, 2017; Agbayani & Golston 2010, 2016; Richards 2010, 2016; Sabbagh 2013; Clemens 2014). Surveys of recent research in this area can be found in Anttila (2016) and Shih (2017), to which we refer the reader for further discussion and references.

Almost all work dealing with phonological effects on syntax and word order assumes some version of prosodic hierarchy theory: we are unaware of any research on prosodically-motivated syntactic variation which adopts an explicit direct reference framework. A possible exception is Wagner (2005, 2010), who proposes that prosody may be a deciding factor when there is more than one syntactic parse available, as in the optional extraposition of relative and complement clauses.

4. A return to direct reference?

In the last twenty years there has been a revived (though somewhat muted) debate over ‘chunk definition’ in phrasal phonology. In particular, there has been a renewed skepticism over the abstract categories assumed by prosodic hierarchy theory (§2.2), with some authors advocating the rejection of indirect reference theories as a whole (recalling, and to some extent updating, arguments made in the 1980s by Kaisse, Odden, and others). As noted above, critics of prosodic hierarchy theory (PHT) have sometimes seized on the apparent non-universality of prosodic categories as evidence that PHT is fundamentally misguided as a theory of phrase-level phonological domains (§3.2).
A second, related argument against PHT concerns patterns of overlap between phonological domains. PHT makes at least the following claims about the ways in which phrasal domains can be related to each other:

• NESTING: If the domains of two processes overlap, they must be nested, with one domain containing the other entirely (a.k.a. ‘proper bracketing’; see Ito & Mester 2009).
• LAYERING: If the domains of two processes overlap, one of them should be consistently larger than (i.e. contain) the other, in all contexts.

It has been suggested that both the nesting and the layering requirements of PHT are too strong. Seidl (2001), drawing on Akinlabi & Liberman (2000), argues that Yoruba falsifies the nesting provision of PHT. Yoruba has a ban on two adjacent identical tones within the same word. This restriction also applies between a verb and following enclitic, but not between verbs and preceding proclitics. This suggests the domain structure [proclitic-[verb-enclitic]]. Some dialects of Yoruba have another process of vowel harmony which takes the proclitic-verb sequence as its domain, suggesting the structure [[proclitic-verb]-enclitic]. This would appear to be a violation of nesting: the verb must belong to two different domains, but these domains are only partially overlapping, with neither one contained within the other (see also Chen 1987; Pak 2005, 2008, §6.4.1; Samuels 2009, §5.4.4).

Seidl (2001) and Pak (2008) argue against layering on the basis of phrasal processes in Luganda (an argument originally due to Hyman et al. 1987). Luganda has a process which spreads high tone between certain words, and a second process shortening word-final long vowels in particular contexts.
The domains of high tone spreading and shortening do not stand in a consistent containment relation: both {...[...]SHORTEN...}SPREAD and [...{...}SPREAD...]SHORTEN are possible nestings of these domains, in violation of layering.2

2 Hyman et al. (1987) suggest that there is a principled reason why apparent violations of domain nesting in Luganda and elsewhere often involve tonal processes: such processes may be subject to their own groupings, independent of the prosodic hierarchy as it determines segmental patterning (i.e. prosodic groupings are tier-specific in an autosegmental sense).

The apparent empirical advantages of PHT (§2.1) have been called into question as well. First, there are some phrasal sandhi processes which seem to be sensitive to syntactic category distinctions. For example, Puerto Rican Spanish deletes word-final stressed [ˈa] before mid vowels, but only in verbs (Kaisse 1985, p. 128). Such phenomena are beyond the reach of basic indirect reference theories, given the assumption that phrasal phonology cannot ‘see’ syntactic category distinctions (see also Nespor & Vogel 1986, Ch. 2; Hayes 1990).

A pillar of indirect reference theories is the existence of mismatches (‘nonisomorphisms’) between syntax and the groupings of words which are relevant for phrasal phonology. Such mismatches demonstrate that phonological constituents cannot be reduced to syntactic constituents. As another example, determiners in Kwak’wala form a syntactic unit with the following word(s), but a phonological unit with the preceding word, as diagnosed by stress and segmental patterning (Anderson 2005). Such mismatches seem to support indirect reference theories, to the extent that they show that syntactic constituency fails to determine the domains of phonological processes. Though suggestive, such arguments must in fact be evaluated on a case-by-case basis. Pak (2008, §2.2.1) points out that most direct reference theories do predict non-isomorphisms between syntactic domains and phonological domains: this is because direct reference models typically apply phonological processes on the basis of syntactic relations, rather than syntactic constituency as such. Other authors have argued that some apparent non-isomorphisms actually reflect a misunderstanding of the underlying syntax rather than non-isomorphism per se (e.g. Seidl 2001; Wagner 2010).3 Lastly, some direct reference theories (e.g. Seidl 2001; Pak 2008) make use of rebracketing operations which produce groupings of words that deviate from the syntax itself (§2). The question, then, is whether there exists a residue of non-isomorphism which truly cannot be accommodated by principled direct reference theories of phrasal phonology.

3 Determining the correct underlying syntax is a thorny problem for all approaches to the syntax-phonology interface, not just indirect reference theories. For example, certain syntactic structures in both Kaisse (1985) (direct reference) and Nespor & Vogel (1986) (indirect reference) are manifestly not the structures that most syntacticians would assume today (e.g. Seidl 2001, Ch.3). Similar problems persist in modern theories which give the syntactic ‘phase’ (e.g. Chomsky 2001, 2008) a central role in the syntax-phonology interface, as there exists no current consensus as to which syntactic units constitute phases and which do not (see also Newell 2008, Scheer 2012a).

Apart from these critiques of PHT, conceptual arguments against the prosodic hierarchy (mostly drawing on considerations of parsimony) have been set forth by Samuels (2009) and Scheer (2012a, 2012b), among others. Readers are referred to those works for details.

4.1. Why does prosodic hierarchy theory endure?

Despite these critiques, PHT remains the most widely-accepted and widely-practiced framework for understanding the syntax-prosody interface. It is worth asking why. One reason is that PHT (and the closely related ‘autosegmental-metrical’ theory; Ladd 2008 [1996]) has proven to be a useful framework for analyzing phrasal phonology in a wide range of typologically diverse languages (see Jun 2005, 2014; Gussenhoven 2004; Ladd 2008 [1996], and many others for examples). The indisputable empirical success of PHT thus provides indirect support for the validity of such a theory, at least in its broad contours.

A second reason for the persistence of PHT may have to do with its treatment of eurythmic constraints on phonological domains (§2.1). Phrase-level domains are sometimes conditioned by factors which seem purely phonological in nature, often having to do with domain size and rhythmic balancing. Ito & Mester (2007) analyze such effects as they arise in compounding in Japanese. Two-member compounds can be parsed together in the same accent domain, (ω1 ω2), but only when ω2 is short. When ω2 is longer (>4 moras), it must be parsed into its own accent domain, [ω1 (ω2)]. Similar effects can be observed for other levels of the prosodic hierarchy, including φ (≈XP; e.g. Prieto 2005, Elfner 2015), ι (≈CP; Myrberg 2013), and lower-level units such as the metrical foot (e.g. Selkirk 1980b, Prince 1991, Hayes 1995). Such effects are naturally accommodated in indirect reference theories, which assume that eurythmic pressures operate at a level of representation (prosodic structure) that is essentially phonological rather than syntactic in nature (§2).

Eurythmic constraints are less naturally accommodated in direct reference theories. A subset of eurythmic effects, namely those which distinguish between branching and non-branching constituents, is addressed in syntactic terms by Rotenberg (1978) and Kaisse (1985) (see also Inkelas & Zec 1995, Pak 2005, and references there).
The full range of eurythmic effects cannot be analyzed in the same way (e.g. the size effects in Japanese compounding mentioned above are clearly non-syntactic). To the extent that direct reference theories can account for such effects, it appears that they must invoke at least some level of post-syntactic representation (as in Seidl 2001, Pak 2005, 2008), or, alternatively, delegate such matters to processing and phonetic implementation (Wagner 2005, see §5 for further discussion). The apparent need for extra levels of representation of course brings direct reference theories closer to indirect reference theories like PHT (§2), perhaps lessening their overall appeal.

Lastly, counterexamples notwithstanding, it appears that most phrasal phonological processes do have the properties enumerated in §§2.1, 3.2, 4: insensitivity to syntactic categories; domain clustering; domain layering and nesting; and so on. These tendencies, all predicted by PHT, must be explained in some other way in direct reference theories.4

4 Direct reference theories can account for some of these tendencies, at least in principle (Elordieta 2008, Pak 2008). For example, direct reference theories which assume that sandhi rules are conditioned only by c-command relations predict that such rules should be insensitive to syntactic category distinctions.

Direct and indirect theories are typically pitted against each other because they seem to offer competing explanations for the same set of facts. An alternative view, suggested to us by a reviewer, is that there are simply two kinds of phrasal phonology: one conditioned by abstract prosodic structure, as in indirect reference theories; and another conditioned by the syntax itself, as in direct reference theories (see also Wagner 2012 and §5 below). For instance, tone sandhi in Xiamen and Taiwanese appears to be conditioned by morpho-syntactic structure (Chen 1987, Tsay & Myers 1996), and is correspondingly insensitive to factors like speech rate and the presence of pause between words. In contrast, 3rd tone sandhi in Beijing Mandarin appears to be conditioned by prosodic structure, as it applies in domains which do not match the syntax, and does show sensitivity to factors like speech rate (Chen 2000, Chen & Yuan 2007). It may be, then, that the tension between direct and indirect reference in phrasal phonology has persisted because phrasal phonology may in fact be sensitive to either syntactic or prosodic domains. This ambiguity may explain some apparent violations of domain layering and nesting (§4): in French, for example, the domains in which liaison applies do not seem to align with intonational domains, but liaison is arguably conditioned by syntax rather than prosodic structure (Kaisse 1985, Pak 2008, Wagner 2012).

5. Gradiency, processing and production planning

Up to this point, we have focused primarily on the grammatical interface between syntax and prosody: how syntactic domains map onto prosodic domains, and how the boundaries of these domains may be demarcated in terms of phonetic and phonological processes. This section considers the role of sentence processing and production planning, each of which plays some role in the creation and demarcation of prosodic domains. For reviews and discussion of the interface of prosody with sentence processing, production planning, and other psycholinguistic factors, see Wagner & Watson (2010), Wagner (2015) and Cole (2015).

An assumption implicit in prosodic hierarchy theory is that the distinction between domain levels is categorical rather than gradient. While many cues to prosodic boundaries are gradient in character, such as duration and pitch scaling, prosodic hierarchy theory predicts that speakers should be able to associate such distinctions with a limited set of universal prosodic categories, which align with syntactic constituents in predictable ways. To what extent is this the case?
As discussed in §3.3, evidence for the recursion of prosodic domains suggests that speakers make use of a greater number of prosodic domains than the prosodic hierarchy allows, at least under the confines of the strict layer hypothesis, and, further, that the number of phonetically distinct domains is directly correlated with the syntactic complexity of the utterance (Ladd 1986, 1988; Kubozono 1989; Wagner 2005, 2010). Prosodic hierarchy theory predicts that the three basic prosodic categories (ω, φ, ι) will be present in every utterance, regardless of syntactic complexity (though see discussion in §§3.2-3.4).

Gradient cues to prosodic domains include durational cues, involving final lengthening, the use of prosodic pauses, and initial strengthening, as well as cues involving fundamental frequency (pitch), such as the scaling of pitch accents and pitch resets. The magnitude of such gradient cues appears to depend on relative boundary strength: stronger cues occur at stronger boundaries. For example, final lengthening, which involves the lengthening of segments and syllables before prosodic boundaries, has been shown to be positively correlated with the relative strength of the boundary (Klatt 1975; Byrd & Saltzman 1998; Price et al. 1991; Shattuck-Hufnagel & Turk 1996; Wightman et al. 1992; Turk & Shattuck-Hufnagel 2000). Similarly, prosodic pauses, though not required even at relatively strong prosodic boundaries, pattern in conjunction with final lengthening such that pauses are both more likely to appear, and have a relatively longer duration, at strong prosodic boundaries as opposed to weak ones (Ferreira 1991, 1993; Watson & Gibson 2004). Finally, the phonetic properties of segments in domain-initial positions, such as duration and degree of stricture, also correlate with the relative strength of the prosodic boundary, resulting in what has been termed domain-initial strengthening (e.g. Fougeron & Keating 1997; Keating et al. 2003).
With respect to fundamental frequency, the scaling of pitch accents and the use of pitch reset within an utterance have been shown to be similarly correlated with relative boundary strength and the presence of recursion in prosodic structure (de Pijper & Sanderman 1994; Féry & Truckenbrodt 2005; Ladd 1988; Elordieta 2015).

In terms of perception, research has shown that listeners are fairly adept at discerning relative differences in the strength of prosodic boundaries, but are not able to reliably categorize boundaries in terms of their strength (Price et al. 1991; de Pijper & Sanderman 1994). This observation may suggest that listeners do not interpret gradient cues as a reflection of specific domain types (i.e. specific categories), be they syntactically or prosodically defined (§2 and §4). Recent developments exploring the recursion of prosodic categories (§§3.2-3.3) may help capture gradiency in the perception of relative boundary strength while retaining the advantages of prosodic hierarchy theory and indirect reference more generally (Selkirk 2009; Ito & Mester 2012, 2013; Myrberg 2013; Elordieta 2015).

To some extent, relative boundary strength may be tied to syntactic structure and the relative complexity of syntactic constituents. Algorithmic models, common in relatively early work on the syntax-prosody interface, attempt to derive relative boundary strength (particularly durational cues to prosodic boundaries) directly from the depth of embedding found in syntactic structure (Cooper & Paccia-Cooper 1980; Ferreira 1993; Gee & Grosjean 1983). More recently, such algorithmic approaches have been tied to sentence processing and relative boundary strength, and more specifically to the notion that the relative length and/or complexity of syntactic material affects processing time (Watson & Gibson 2004; Watson et al. 2006).
Prosodic boundaries are thus more likely to occur following a complex syntactic constituent (allowing time for recovery), as well as more likely to occur preceding a complex syntactic constituent (allowing time for planning).

Speech processing and production planning have also been tied to patterns of variability and optionality in sandhi processes. Some authors have argued that variability in the application of sandhi processes reflects variation in how much of an utterance is concurrently planned during real-time speech production (Wagner 2012; Kilbourn-Ceron et al. 2017; Kilbourn-Ceron & Sonderegger 2018, and references cited there). Such variability may also be captured by formal algorithms which build prosodic domains in a variable or stochastic fashion (as in Hayes and Lahiri’s account of Bengali, §1). Still, to the extent that appeals to production planning might offer a more principled account of variability in phrasal phonology, such variability may no longer provide an argument for indirect over direct reference theories of phrasal domains (§2). For example, Kilbourn-Ceron et al. (2017) discuss variability in the application of flapping in English, which variably occurs across word boundaries and syntactic boundaries of varying sizes, yet is less likely to occur as the relative strength of the prosodic boundary increases. Kilbourn-Ceron et al. (2017) propose that the application of flapping depends, at least in part, on the relative likelihood that upcoming syntactic material has been planned at the time of production, such that flapping will occur only if the two words exhibiting the conditioning environment are within the same planning window. This model predicts that the application of sandhi processes, while partially dependent on syntactic structure (because words that are closely related syntactically are more likely to be planned together), will also be variable to the extent that speakers may not fully plan their utterances.
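The logic of a planning-window account can be illustrated with a toy simulation. This is not the model of Kilbourn-Ceron et al. (2017): the exponential relationship between boundary strength and joint planning, the `decay` parameter, the integer strength scale and all function names are assumptions made purely for illustration.

```python
import random


def p_planned_together(boundary_strength: int, decay: float = 0.6) -> float:
    """Hypothetical probability that the following word is already planned
    when the current word is produced; it falls as the boundary gets stronger."""
    return decay ** boundary_strength


def flapping_rate(boundary_strength: int, trials: int = 10_000, seed: int = 0) -> float:
    """Simulate productions of one word pair.

    On each trial the sandhi process applies only if both words happen to
    fall inside the same planning window.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    p = p_planned_together(boundary_strength)
    applied = sum(rng.random() < p for _ in range(trials))
    return applied / trials


# Strength 0 stands in for a word-internal or very weak juncture.
for strength in range(4):
    print(strength, round(flapping_rate(strength), 2))
```

In this sketch the process is categorical on each trial (it either applies or does not), yet its rate of application declines gradually with boundary strength, which is the general shape of the prediction described above.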
The role of factors such as lexical frequency and speech rate can thus be directly tied to production planning, and fit well with evidence that prosodic phrasing and prominence are affected by the relative predictability of lexical items within a given utterance (Aylett & Turk 2004; Turk 2010).

Finally, there is evidence to suggest that the use of prosodic cues is not absolute, and is conditioned, at least in part, by the discourse and situational context in which speech is uttered, including information structural notions such as focus, givenness and topicality (see Féry & Ishihara 2016, and references there, for details). For example, while it is incontrovertible that prosody may be used to disambiguate otherwise ambiguous utterances (e.g. Lehiste 1973), speakers are more likely to employ such prosodic cues when the context allows for multiple interpretations, and are less likely to use prosodic cues where only a single interpretation is possible, i.e. where the utterance is syntactically, but not contextually, ambiguous (Snedeker & Trueswell 2003). The role of context in prosody is a complex one, involving the integration of a number of linguistic modules; although the focus of this paper has been on the interface between prosody and syntactic structure, it is important to keep in mind that even this relationship may be affected by a number of factors external to a straightforward mapping between syntactic structure and prosodic domains. For a thorough overview of the role of context in prosody, see Cole (2015).

6. Conclusion

We hope to have shown in this brief overview that the study of phrasal phonology (including segmental patterning, intonational patterns, and other phenomena at higher domains) can have wide-ranging consequences for our understanding of grammar. Phrasal phonology is deeply intertwined with syntax, semantics, pragmatics, phonetics, psycholinguistics, and other components of language behavior beyond phonology itself.
Research on phrasal phonology, and its relation to syntax, has potentially profound consequences for our understanding of the architecture of grammar, raising issues related to the modularity and independence of different types of linguistic knowledge, as well as the traditional division between performance and competence (Chomsky 1965). Phrasal phonology has been an area of intense scrutiny for half a century, but, in our view, many of the most interesting and important issues in this area remain to be definitively settled. We also expect that new puzzles and problems will emerge from the continued development of theories which integrate quantitative evidence from experiments and corpus studies with more traditional qualitative descriptions of phonological patterning at the phrase level, and as data from a wider range of languages is brought into the fold. 32 ACKNOWLEDGEMENTS We thank Jennifer Bellik and an anonymous reviewer for helpful comments on an earlier version of this article. LITERATURE CITED Agbayani B, Golston C. 2010. Phonological movement in Classical Greek. Language 86: 133-67 Agbayani B, Golston C. 2016. Phonological constituents and their movement in Latin. Phonology 33: 1-42 Aissen J. 2000. Prosodic conditions on anaphora and clitics in Jakaltek. In The Syntax of Verb Initial Languages, ed. A Carnie, E Guilfoyle, pp. 185-200. Oxford: Oxford University Press Akinlabi A, Liberman M. 2000. The tonal phonology of Yoruba clitics. In Clitics in Phonology, Morphology and Syntax, ed. B Gerlach, J Grijzenhout, pp. 31-62. Amsterdam: John Benjamins Anderson SR. 2005. Aspects of the Theory of Clitics. Oxford: Oxford University Press Anttila A. 2016. Phonological effects on syntactic variation. Annual Review of Linguistics 2: 115-37 Arnhold A. 2014. Prosodic structure and focus realization in West Greenlandic. In Prosodic Typology II, ed. S-A Jun, pp. 216-51. Oxford: Oxford University Press Arnhold A, Compton R, Elfner E. 2018. 
Prosody and wordhood in South Baffin Inuktitut. In Proceedings of the Workshop on Structure and Constituency in Languages of the Americas 21, ed. M Keough, N Weber, A Anghelescu, S Chen, E Guntly, et al, pp. 30-39. Vancouver: University of British Columbia Working Papers in Linguistics Arregi K. 2006. Stress and islands in Northern Bizkaian Basque. In Studies in Historical and Basque Linguistics Dedicated to the Memory of R. L. Trask, ed. J Hualde, JA Lakarra, pp. 81-106. Donostia: Diputación Foral de Gipuzkoa Aylett M, Turk A. 2004. The smooth signal redundancy hypothesis: A functional explanation for relationships between redundancy, prosodic prominence, and duration in spontaneous speech. Language and Speech 47: 31-56 Beckman M, Pierrehumbert J. 1986. Intonational structure in English and Japanese. Phonology 3: 255-309 Bennett R. 2013. The uniqueness of metrical structure: Rhythmic phonotactics in Huariapano. Phonology 30: 355-98 Bennett R. 2015. Review of Sun Ah-Jun (ed.) (2014). Prosodic typology II: the phonology of intonation and phrasing. Oxford: Oxford University Press. xv + 587. Phonology 32: 33750 Bennett R. to appear. Recursive prosodic words in Kaqchikel (Mayan). Glossa 33 Bennett R, Elfner E, McCloskey J. 2016. Lightest to the right: An apparently anomalous displacement in Irish. Linguistic Inquiry 47: 169-234 Bickel B, Hildebrandt K, Schiering R. 2009. The distribution of phonological word domains: A probabilistic typology. In Phonological Domains: Universals and Deviations, ed. J Grijzenhout, B Kabak, pp. 47-75. Berlin: Mouton de Gruyter Booij G. 1996. Cliticization as prosodic integration: The case of Dutch. The Linguistic Review 13: 219-42 Byrd D, Saltzman E. 1998. Intragestural dynamics of multiple prosodic boundaries. Journal of Phonetics: 173-99 Chen M. 1987. The syntax of Xiamen tone sandhi. Phonology 4: 109-50 Chen M. 1990. What must phonology know about syntax. In The Phonology-Syntax Connection, pp. 19-46. 
Chicago: University of Chicago Press Chen MY. 2000. Tone Sandhi: Patterns across Chinese Dialects. Cambridge: Cambridge University Press Chen Y, Yuan J. 2007. A corpus study of the 3rd tone sandhi in Standard Chinese. In Proceedings of Interspeech 2007. Antwerp, Belgium Chomsky N. 1965. Aspects of the Theory of Syntax. Cambridge, Mass.: MIT Press Chomsky N. 1995. The Minimalist Program. Cambridge, Mass.: MIT Press Chomsky N. 2001. Derivation by phase. In Ken Hale: A Life in Language, ed. M Kenstowicz, pp. 1-52. Cambridge, MA: MIT Press Chomsky N. 2008. On phases. In Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud, ed. R Freidin, C Otero, ML Zubizarreta, pp. 133-66. Cambridge, MA: MIT Press Chomsky N, Halle M. 1968. The Sound Pattern of English. New York: Harper & Row Cinque G. 1993. A null theory of phrase and compound stress. Linguistic Inquiry 24: 239-398 Clemens LE. 2014. Prosodic noun incorporation and verb-initial syntax. PhD thesis. Harvard University Cole J. 2015. Prosody in context: A review. Language, Cognition and Neuroscience 30: 1-31 Cooper WE, Paccia-Cooper J. 1980. Syntax and Speech. Cambridge, MA: Harvard University Press de Pijper JR, Sanderman A. 1994. On the perceptual strength of prosodic boundaries and its relation to suprasegmental cues. Journal of Acoustical Society of America 96: 2037-47 Dresher BE. 1994. The prosodic basis of the Tiberian Hebrew system of accents. Language 70: 1-52 Elfner E. 2012. Syntax-prosody interactions in Irish. PhD thesis. University of Massachusetts, Amherst Elfner E. 2015. Recursion in prosodic phrasing: Evidence from Connemara Irish. Natural Language & Linguistic Theory 33: 1169-208 Elfner E. 2018. The syntax-prosody interface: Current theoretical approaches and outstanding questions. Linguistics Vanguard 4: 1-14 Elordieta G. 2007. Minimum size constraints on intermediate phrases. In Proceedings of the 16th International Congress of the Phonetic Sciences, ed. 
J Trouvain, W Barry, pp. 1021-24. Saarbrücken, Germany
Elordieta G. 2008. An overview of theories of the syntax-phonology interface. International Journal of Basque Linguistics and Philology 42: 209-86
Elordieta G. 2015. Recursive phonological phrasing in Basque. Phonology 32: 49-78
Ferreira F. 1991. Effects of length and syntactic complexity on initiation times for prepared utterances. Journal of Memory and Language 30: 210-33
Ferreira F. 1993. Creation of prosody during sentence production. Psychological Review 100: 233-53
Féry C. 2016. Intonation and Prosodic Structure. Cambridge: Cambridge University Press
Féry C, Ishihara S, eds. 2016. The Oxford Handbook of Information Structure. Oxford, UK: Oxford University Press
Féry C, Truckenbrodt H. 2005. Sisterhood and tonal scaling. Studia Linguistica 59: 223-43
Fitzpatrick-Cole J. 1996. Reduplication meets the phonological phrase in Bengali. The Linguistic Review 13: 305-56
Fougeron C, Keating PA. 1997. Articulatory strengthening at edges of prosodic domains. Journal of the Acoustical Society of America 101: 3728-40
Frota S. 2012. Prosodic structure, constituents and their implementation. In The Oxford Handbook of Laboratory Phonology, ed. A Cohn, C Fougeron, M Huffman, pp. 255-65. Oxford, UK: Oxford University Press
Frota S, Vigário M. 2013. Review of Toni Borowsky, Shigeto Kawahara, Takahito Shinya and Mariko Sugahara (eds.) (2012). Prosody matters: essays in honor of Elisabeth Selkirk. (Advances in Optimality Theory.) Sheffield & Bristol, Conn.: Equinox. Pp. xv + 528. Phonology 30: 165-72
Gee JP, Grosjean F. 1983. Performance structures: A psycholinguistic and linguistic appraisal. Cognitive Psychology 15: 411-58
Ghini M. 1993. Phi-formation in Italian: A new proposal. Toronto Working Papers in Linguistics 12: 41-78
Gussenhoven C. 2004. The Phonology of Tone and Intonation. Cambridge: Cambridge University Press
Halpern A. 1995. Topics in the displacement and morphology of clitics. PhD thesis. Stanford University
Hayes B. 1990.
Precompiled phrasal phonology. In The Phonology-Syntax Connection, ed. S Inkelas, D Zec. Chicago: University of Chicago Press
Hayes B. 1995. Metrical Stress Theory: Principles and Case Studies. Chicago: University of Chicago Press
Hayes B, Lahiri A. 1991. Bengali intonational phonology. Natural Language & Linguistic Theory 9: 47-96
Hyman L, Katamba F, Walusimbi L. 1987. Luganda and the Strict Layer Hypothesis. Phonology 4: 87-108
Inkelas S. 1990. Prosodic Constituency in the Lexicon. New York: Garland
Inkelas S, Zec D, eds. 1990. The Phonology-Syntax Connection. Chicago: University of Chicago Press
Inkelas S, Zec D. 1995. Syntax-phonology interface. In The Handbook of Phonological Theory, pp. 535-49. Cambridge, MA, and Oxford, UK: Blackwell
Ishihara S. 2015. Syntax-phonology interface. In The Handbook of Japanese Phonetics and Phonology, ed. H Kubozono, pp. 569-618. Berlin: Mouton de Gruyter
Ito J, Mester A. 2007. Prosodic adjunction in Japanese compounds. In Formal Approaches to Japanese Linguistics (FAJL), ed. Y Miyamoto, M Ochi, pp. 97-111. Cambridge, MA: MITWPL
Ito J, Mester A. 2009. The extended prosodic word. In Phonological Domains: Universals and Deviations, ed. J Grijzenhout, B Kabak, pp. 135-94. Berlin & New York: Mouton de Gruyter
Ito J, Mester A. 2010. The onset of the prosodic word. In Phonological Argumentation: Essays on Evidence and Motivation, ed. S Parker. London: Equinox
Ito J, Mester A. 2012. Recursive prosodic phrasing in Japanese. In Prosody Matters, ed. T Borowsky, S Kawahara, T Shinya, M Sugahara. London: Equinox Press
Ito J, Mester A. 2013. Prosodic subcategories in Japanese. Lingua 124: 20-40
Jun S-A. 1998. The accentual phrase in the Korean prosodic hierarchy. Phonology 15: 189-226
Jun S-A, ed. 2005. Prosodic Typology: The Phonology of Intonation and Phrasing. Oxford: Oxford University Press
Jun S-A, ed. 2014. Prosodic Typology II: The Phonology of Intonation and Phrasing. Oxford: Oxford University Press
Jun S-A, Elordieta G.
1997. Intonational structure in Lekeitio Basque. In Intonation: Theory, Models and Applications, ed. A Botinis, G Kouroupetroglou, G Carayiannis, pp. 193-96. Athens, Greece: ISCA
Kaisse E, Zwicky A, eds. 1987. Phonology Yearbook 4: Syntactic Conditions on Phonological Rules
Kaisse EM. 1985. Connected Speech: The Interaction of Syntax and Phonology. San Diego: Academic Press
Kandybowicz J. 2009. Embracing edges: Syntactic and phono-syntactic edge sensitivity in Nupe. Natural Language & Linguistic Theory 27: 305-44
Kandybowicz J. 2015. On prosodic vacuity and verbal resumption in Asante Twi. Linguistic Inquiry 46: 243-72
Kandybowicz J. 2017. On prosodic variation and the distribution of wh-in-situ. Linguistic Variation 17: 111-48
Keating PA, Cho T, Fougeron C, Hsu C-S. 2003. Domain-initial articulatory strengthening in four languages. In Phonetic Interpretation (Papers in Laboratory Phonology 6), ed. J Local, R Ogden, R Temple. Cambridge: Cambridge University Press
Kilbourn-Ceron O, Sonderegger M. 2018. Boundary phenomena and variability in Japanese high vowel devoicing. Natural Language & Linguistic Theory 36: 175-217
Kilbourn-Ceron O, Wagner M, Clayards M. 2017. The effect of production planning locality on external sandhi: A study in /t/. In Proceedings of the 52nd Annual Meeting of the Chicago Linguistic Society
Kisseberth C, Abasheikh M. 1974. Vowel length in Chi Mwi:ni: A case study of the role of grammar in phonology. In CLS 10: Parasession on Natural Phonology, pp. 193-209. Chicago: Chicago Linguistic Society
Klatt D. 1975. Vowel lengthening is syntactically determined in a connected discourse. Journal of Phonetics 3: 129-40
Kubozono H. 1989. Syntactic and rhythmic effects on downstep in Japanese. Phonology 6: 39-67
Ladd DR. 1986. Intonational phrasing: The case for recursive prosodic structure. Phonology 3: 311-40
Ladd DR. 1988. Declination 'reset' and the hierarchical organization of utterances. Journal of the Acoustical Society of America 84: 530-44
Ladd DR.
2008 [1996]. Intonational Phonology. Cambridge: Cambridge University Press
Lehiste I. 1973. Phonetic disambiguation of syntactic ambiguity. Glossa 7: 107-23
Martínez-Paricio V. 2013. An exploration of minimal and maximal metrical feet. PhD thesis. University of Tromsø
Martínez-Paricio V, Kager R. 2015. The binary-to-ternary rhythmic continuum in stress typology: Layered feet and non-intervention constraints. Phonology 32: 459-504
McCarthy JJ. 1993. A case of surface constraint violation. Canadian Journal of Linguistics 38: 169-95
McCawley JD. 1968. The Phonological Component of a Grammar of Japanese. The Hague: Mouton
Myrberg S. 2013. Sisterhood in prosodic branching. Phonology 30: 73-124
Nespor M, Vogel I. 1986. Prosodic Phonology. Dordrecht: Foris
Newell H. 2008. Aspects of the morphology and phonology of phases. PhD thesis. McGill University
Newell H, Piggott G. 2014. Interactions at the syntax-phonology interface: Evidence from Ojibwe. Lingua 150: 332-62
Odden D. 1987. Kimatuumbi phrasal phonology. Phonology Yearbook 4
Padgett J. 2014. On the origins of the prosodic word in Russian. In Proceedings of the Speech Prosody 2014 Conference, ed. N Campbell, D Gibbon, D Hirst. Dublin, Ireland: Speech Prosody
Pak M. 2005. Explaining branchingness effects in phrasal phonology. In Proceedings of the 24th West Coast Conference on Formal Linguistics, ed. J Alderete, C-h Han, A Kochetov, pp. 308-16. Somerville, MA: Cascadilla Proceedings Project
Pak M. 2008. The postsyntactic derivation and its phonological reflexes. PhD thesis. University of Pennsylvania
Peperkamp S. 1997. Prosodic Words. The Hague: Holland Academic Graphics
Pierrehumbert J, Beckman M. 1988. Japanese Tone Structure. Cambridge, MA: MIT Press
Price P, Ostendorf M, Shattuck-Hufnagel S, Fong G. 1991. The use of prosody in syntactic disambiguation. Journal of the Acoustical Society of America 90: 2956-70
Prieto P. 2005. Syntactic and eurhythmic constraints on phrasing decisions in Catalan.
Studia Linguistica 59: 194-222
Prieto P. 2014. The intonational phonology of Catalan. In Prosodic Typology II: The Phonology of Intonation and Phrasing, ed. S-A Jun, pp. 43-80. Oxford, UK: Oxford University Press
Prince A. 1991. Quantitative consequences of rhythmic organization. In CLS 26(2): Papers from the Parasession on the Syllable in Phonetics and Phonology, ed. K Deaton, M Noske, M Ziolkowski, pp. 355-98. Chicago: University of Chicago, Chicago Linguistic Society
Richards N. 2010. Uttering Trees. Cambridge, MA: MIT Press
Richards N. 2016. Contiguity Theory. Cambridge, MA: MIT Press
Rotenberg J. 1978. The Syntax of Phonology. MIT, Cambridge, MA
Ruppel A. 2017. The Cambridge Introduction to Sanskrit. Cambridge, UK: Cambridge University Press
Ryan K. 2018. Prosodic end-weight reflects phrasal stress. Natural Language & Linguistic Theory
Sabbagh J. 2013. Word order and prosodic-structure constraints in Tagalog. Syntax 17: 40-89
Samuels B. 2009. The structure of phonological theory. PhD thesis. Harvard University, Cambridge, MA
Scheer T. 2010. A Guide to Morphosyntax-phonology Interface Theories: How Extraphonological Information is Treated in Phonology since Trubetzkoy’s Grenzsignale. Berlin: de Gruyter
Scheer T. 2012a. Chunk definition in phonology: Prosodic constituency vs. phase structure. In Modules and Interfaces, ed. M Bloch-Trojnar, A Bloch-Rozmej, pp. 221-53. Lublin, Poland: Catholic University of Lublin (Katolicki Uniwersytet Lubelski)
Scheer T. 2012b. Direct Interface and One-Channel Translation. Boston, MA: de Gruyter
Schiering R, Bickel B, Hildebrandt K. 2010. The prosodic word is not universal, but emergent. Journal of Linguistics 46: 657-709
Seidl A. 2001. Minimal Indirect Reference: A Theory of the Syntax-Phonology Interface. London: Routledge
Selkirk E. 1980a. Prosodic domains in phonology: Sanskrit revisited. In Juncture, ed. M Aronoff, M-L Kean, pp. 107-29. Saratoga, CA: Anma Libri
Selkirk E. 1980b.
The role of prosodic categories in English word stress. Linguistic Inquiry 11: 563-605
Selkirk E. 1981 [1978]. On prosodic structure and its relation to syntactic structure. In Nordic Prosody, ed. T Fretheim, pp. 111-40. Trondheim: TAPIR
Selkirk E. 1984. Phonology and Syntax: The Relation between Sound and Structure. Cambridge, MA: MIT Press
Selkirk E. 1986. On derived domains in sentence phonology. Phonology Yearbook 3: 371-405
Selkirk E. 1995. The prosodic structure of function words. In Papers in Optimality Theory, ed. J Beckman, LW Dickey, S Urbanczyk, pp. 439-70. Amherst, MA: GLSA Publications
Selkirk E. 2000. The interaction of constraints on prosodic phrasing. In Prosody: Theory and Experiment, ed. M Horne, pp. 231-61. Dordrecht: Kluwer
Selkirk E. 2009. On clause and intonational phrase in Japanese: The syntactic grounding of prosodic constituent structure. Gengo Kenkyu 136: 35-73
Selkirk E. 2011. The syntax-phonology interface. In The Handbook of Phonological Theory, 2nd edition, ed. J Goldsmith, J Riggle, A Yu, pp. 435-84
Selkirk E, Lee S, eds. 2015. Phonology 32(1): Constituency in Sentence Phonology
Selkirk E, Shen T. 1990. Prosodic domains in Shanghai Chinese. In The Phonology-Syntax Connection, ed. S Inkelas, D Zec, pp. 313-37. Chicago: University of Chicago Press
Selkirk E, Tateishi K. 1991. Syntax and downstep in Japanese. In Interdisciplinary Approaches to Language: Essays in Honor of S.-Y. Kuroda, ed. C Georgopoulos, R Ishihara, pp. 519-44. Dordrecht: Kluwer
Shattuck-Hufnagel S, Turk A. 1996. A prosody tutorial for investigators of auditory sentence processing. Journal of Psycholinguistic Research 25: 193-247
Shih S. 2017. Phonological influences in syntactic alternations. In The Morphosyntax-Phonology Connection, ed. V Gribanova, S Shih, pp. 223-54. Oxford, UK: Oxford University Press
Snedeker J, Trueswell J. 2003. Using prosody to avoid ambiguity: Effects of speaker awareness and referential context.
Journal of Memory and Language 48: 103-30
Truckenbrodt H. 1995. Phonological Phrases: Their Relation to Syntax, Focus, and Prominence. PhD thesis. Massachusetts Institute of Technology
Truckenbrodt H. 1999. On the relation between syntactic phrases and phonological phrases. Linguistic Inquiry 30: 219-56
Truckenbrodt H. 2002. Variation in p-phrasing in Bengali. Linguistic Variation Yearbook 2: 259-303
Truckenbrodt H. 2007. The syntax-phonology interface. In The Cambridge Handbook of Phonology, ed. P de Lacy, pp. 435-56. Cambridge, UK: Cambridge University Press
Tsay J, Myers J. 1996. Taiwanese tone sandhi as allomorph selection. In Proceedings of the Twenty-Second Annual Meeting of the Berkeley Linguistics Society: General Session and Parasession on the Role of Learnability in Grammatical Theory, pp. 395-405
Turk A. 2010. Does prosodic constituency signal relative predictability? A Smooth Signal Redundancy hypothesis. Laboratory Phonology 1: 227-62
Turk A, Nakai S, Sugahara M. 2006. Acoustic segment durations in prosodic research: a practical guide. In Methods in Empirical Prosody Research, ed. S Sudhoff, D Lenertova, RMS Pappert, P Augurzky, I Mleinek, et al, pp. 1-28. Berlin: De Gruyter
Turk A, Shattuck-Hufnagel S. 2000. Word-boundary-related duration patterns in English. Journal of Phonetics 28: 397-440
Vigário M. 1999. On the prosodic status of stressless function words in European Portuguese. In Studies on the Phonological Word, ed. TA Hall, U Kleinhenz, pp. 255-94. Amsterdam: John Benjamins
Vigário M. 2003. The Prosodic Word in European Portuguese. Berlin: Mouton de Gruyter
Vigário M. 2010. Prosodic structure between the prosodic word and the phonological phrase: Recursive nodes or an independent domain? The Linguistic Review 27: 485-530
Vogel I. 2009. The status of the Clitic Group. In Phonological Domains: Universals and Deviations, ed. J Grijzenhout, B Kabak, pp. 15-46. Berlin: Mouton de Gruyter
Wagner M. 2005. Prosody and recursion.
PhD thesis. Massachusetts Institute of Technology
Wagner M. 2010. Prosody and recursion in coordinate structures and beyond. Natural Language & Linguistic Theory 28: 183-237
Wagner M. 2012. Locality in phonology and production planning. In Proceedings of Phonology in the 21st Century: Papers in Honour of Glyne Piggott, ed. J Loughran, A McKillen, pp. 1-18. Montreal: McGill Working Papers
Wagner M. 2015. Phonological evidence in syntax. In Syntax - Theory and Analysis. An International Handbook, ed. T Kiss, A Alexiadou, pp. 1154-98. Berlin: Mouton de Gruyter
Wagner M, Watson D. 2010. Experimental and theoretical advances in prosody: A review. Language and Cognitive Processes 25: 905-45
Watson D, Breen M, Gibson E. 2006. The role of syntactic obligatoriness in the production of intonational boundaries. Journal of Experimental Psychology: Learning, Memory, and Cognition 32: 1045-56
Watson D, Gibson E. 2004. The relationship between intonational phrasing and syntactic structure in language production. Language and Cognitive Processes 19: 713-55
Werle A. 2009. Word, phrase, and clitic prosody in Bosnian, Serbian, and Croatian. PhD thesis. University of Massachusetts, Amherst
Wightman CW, Shattuck-Hufnagel S, Ostendorf M, Price PJ. 1992. Segmental durations in the vicinity of prosodic phrase boundaries. Journal of the Acoustical Society of America 92: 1707-17
Zec D, Inkelas S. 1990. Prosodically constrained syntax. In The Phonology-Syntax Connection, ed. S Inkelas, D Zec, pp. 365-78. Chicago: University of Chicago Press

Figure 1: The prosodic hierarchy above the word