Bauer 2019
Edited by
Eva Breindl and Lutz Gunkel
On behalf of the Institut für Deutsche Sprache
Review board
Ruxandra Cosma (Bucharest), Martine Dalmas (Paris), Livio Gaeta (Turin),
Matthias Hüning (Berlin), Sebastian Kürschner (Eichstätt-Ingolstadt),
Torsten Leuschner (Ghent), Marek Nekula (Regensburg), Attila Péteri (Budapest),
Christoph Schroeder (Potsdam), Björn Wiemer (Mainz)
Volume 9
Complex Lexical Units
ISBN 978-3-11-063242-2
e-ISBN (PDF) 978-3-11-063244-6
e-ISBN (EPUB) 978-3-11-063253-8
www.degruyter.com
Contents
Rita Finkbeiner/Barbara Schlücker
Compounds and multi-word expressions in the languages of Europe
Laurie Bauer
Compounds and multi-word expressions in English
Barbara Schlücker
Compounds and multi-word expressions in German
Geert Booij
Compounds and multi-word expressions in Dutch
Francesca Masini
Compounds and multi-word expressions in Italian
Jesús Fernández-Domínguez
Compounds and multi-word expressions in Spanish
Maria Koliopoulou
Compounds and multi-word expressions in Greek
Ingeborg Ohnheiser †
Compounds and multi-word expressions in Russian
Bożena Cetnarowska
Compounds and multi-word expressions in Polish
Irma Hyvärinen
Compounds and multi-word expressions in Finnish
1 Introduction
This volume deals with compounds (e. g., boat house, softball) and multi-word
expressions (piece of cake, dry cough) in European languages.1 Compounds and
multi-word expressions (henceforth MWEs) are similar as they are both lexical
units and complex, made up of at least two constituents. The most basic differ-
ence between compounds and MWEs seems to be that the former are the product
of a morphological operation and the latter result from syntactic processes. This
is, admittedly, a very vague distinction. However, as soon as one takes into
account more than one specific language (or language family), it seems that this
is the closest one may come to a definition that is more or less applicable to the
European languages. In fact, in light of Romance examples such as French glace
au chocolat, Spanish helado de chocolate ‘chocolate ice cream’ which have often
been analyzed as compounds although they contain syntactic relational markers,
even the morphological criterion for compoundhood seems to be questionable.
Further complicating matters, whereas in many languages compounds are
regarded as being opposed to MWEs, in other languages, and particularly in Eng-
lish, compounds are often regarded as a kind of MWE. In addition, for languages
that are assumed to have an opposition between compounds and MWEs, the
question arises of whether compounds and MWEs act in competition or comple-
mentation with regard to the formation of new lexical units.
Given this background, the aim of the volume is to present an overview of
compounds and MWEs in a sample of European languages. Central questions
that are discussed for each language concern the formal distinction between
compounds and MWEs (in particular prosodic, morphological, and syntactic
properties), the relation between compounding and MWE formation as well as
the conclusions concerning the theory of grammar and the lexicon that follow
from these observations. Although several comprehensive volumes on compounding
and phraseology have appeared in recent (and not so recent) years (cf.
Scalise (ed.) 1992; Burger et al. (eds.) 2007; Lieber/Štekauer (eds.) 2009a; Gaeta/
Grossmann (eds.) 2009; Scalise/Vogel (eds.) 2010; Gaeta/Schlücker (eds.) 2012),
the relationship between compounds and MWEs with respect to their status in
lexicon and grammar has received comparatively little attention (cf. Hüning/Schlücker
2015 for an overview). For this reason, this relationship constitutes the
central focus of this volume.
1 We would like to thank Kristel Van Goethem and Carmen Scherer for very valuable comments
on an earlier version of this chapter.
The aim of the present chapter is to review the language-specific properties,
bring them together and compare them against German. German is well known
for its propensity for (nominal) compounding, as compared to, e. g., French. Also,
there is a rather clear demarcation line between compounds and MWEs in Ger-
man, in contrast to English, for instance. Taking German as a reference point may
help to shed more light on some of the crucial questions with respect to the com-
pound-MWE relationship in the various European languages such as, for instance,
the potential competition between the two processes, or their demarcation line.
By way of language comparison, the differences and commonalities between
languages – both within language families and across these borders – become
clearer, ultimately revealing that a cross-linguistically valid definition of com-
pounds and the demarcation from MWEs may be impossible, given that languages
vary greatly in their defining properties and in the number and productivity of
compound and MWE subpatterns.
The volume contains chapters on English, German, Dutch, French, Italian,
Spanish, Greek, Russian, Polish, Finnish, and Hungarian. Although this sample
is neither complete nor representative of “the” languages of Europe, it neverthe-
less provides thorough analyses of a large set of central European languages.
Importantly, it should be noted that the selection here is mostly due to various
practical reasons, rather than an assessment of the relevance of languages. In
addition to the languages mentioned, the present chapter also comprises an over-
view of the North Germanic languages.
The structure of this chapter is as follows: Section 2 starts with general con-
siderations about the lexicon and the lexicon-syntax interface and discusses
basic notions such as morphological vs. syntactic lexical unit, lexicalization, and
the problem of correspondence. Section 3 discusses compounds and MWEs
against the background of German, sorted by language families. The chapter ends
with a brief conclusion in Section 4.
2 Theoretical considerations
At the outset of our overview, a short remark on the notion of MWE is in order. It
is widely known that different research traditions within this field have focused
on different types of MWEs, applying an extremely diverse terminology. In the
early Anglo-American structuralist tradition (e. g., Weinreich 1969; Newmeyer
1974), the focus was on idioms as semantically and/or syntactically irregular
MWEs. Idioms – a notorious example being kick the bucket – were mainly dis-
cussed under the assumption that they posed a problem to rule-based grammar.
Traditional German phraseology, on the other hand, which is influenced by the
Soviet tradition, has been investigating idioms in their own right, as a core phe-
nomenon of the linguistic subfield of phraseology (Häusermann 1977; Fleischer
1982; Burger et al. 2007). This tradition has put much effort into issues of classifi-
cation, studying not only idioms, but also other types of MWEs which need not be
idiomatic, for instance collocations such as starker Raucher (lit. strong smoker,
‘heavy smoker’) or routine formulae such as Kein Problem (‘no problem’) (e. g.,
Burger 1998). However, under the growing influence of theories such as Construc-
tion Grammar (Fillmore/Kay/O’Connor 1988; Goldberg 2006; Hoffmann/Trous-
dale (eds.) 2013), and insights from applied linguistics, such as research in for-
eign language learning (Pawley/Syder 1983; Wray 2002), and with the advent of
new technologies within quantitative linguistics and corpus linguistics (Sinclair
1991; Gries 2008), the notion of MWE has broadened dramatically in recent
decades. In particular, it has become increasingly accepted that there is a large
inventory of lexically partially fixed patterns in the lexicon such as [N by N] (page
by page, year by year, country by country, cf. Jackendoff 2008) that may or may
not be fully compositional, and that may be used productively to create new
instances. Under such a broad view, MWEs are “co-occurrence phenomena at the
syntax-lexis interface” (Gries 2008: 8) that may be defined as syntactic patterns
consisting of at least two words, the combination of which may be more or less
fixed, more or less idiomatic, and more or less productive. Crucially, as idiomatic-
ity is not a defining feature of all of these patterns, their status as stored MWEs
hinges on sufficient frequency and on their function as a lexical unit; hence the
term ‘phrasal lexical unit’, which is regularly employed throughout the volume
and the remainder of this introduction. To decide whether or not a frequent syn-
tactic pattern is a lexical unit, a well-defined notion of lexical unit, and of the
lexicon, is required.
One is a finite list of structural elements that are available to be combined. This list is tradi-
tionally called the “lexicon”, and its elements are called “lexical items”. […] The other com-
ponent is a finite set of combinatorial principles, or a grammar. (Jackendoff 2002: 39)
This view entails the idea that lexical items have to be learned, as they are not
predictable. By contrast, grammar – which is often equated with syntax – is
viewed as the domain of rules, or principles, that enable speakers of a language
to productively generate new sentences. For example, it is an idiosyncrasy of Eng-
lish that the word squirrel (and not, say, the word dog, or the word hamburger)
refers to the concept squirrel. Speakers of English have to learn this word with
its specific phonological, categorial and semantic features. However, they do not
have to learn the sentence The squirrel is eating nuts, as they can productively
generate it by combining the respective words according to the rules of grammar.
Therefore, the dichotomy between lexicon and grammar also tends to be concep-
tualized as a dichotomy between words and phrases, and between idiosyncrasies
and rules (Engelberg/Holler/Proost 2011: 1).
However, it has long been recognized that there are a considerable number of
phenomena in the languages of the world that pose a serious problem to the view
of a strict lexicon/grammar divide. Compounds and MWEs are a pertinent case in
point. As to compounds, Jackendoff (2009: 108) points out that on the one hand,
speakers must store thousands of lexicalized compounds, e. g., peanut butter, but
on the other hand, they may build compounds “on the fly”, e. g., bike girl for a girl
who left her bike in the vestibule. Thus, compounds arguably are part of the lexi-
con, but at the same time, compounding is a productive, and therefore rule-based
process. For this reason, it is necessary to distinguish between the properties of
being morphological, and of being lexical (Gaeta/Ricca 2009).
As to MWEs such as kick the bucket, it is obvious that on the one hand, they
are phrasal units, often showing a fully regular syntactic behavior, but on the
other hand, they must be part of the lexicon, as their meaning is non-composi-
tional and has to be learned (Nunberg/Sag/Wasow 1994; Gries 2008). What is
more, there is ample evidence by now that not all MWEs are isolated units that
have to be learned one by one, but that there must be something like MWEs “on
the fly”, as well. That is, there seem to be abstract patterns in the lexicon that can
be used by speakers to create new MWEs (Fillmore/Kay/O’Connor 1988). For
example, speakers might newly coin the potential, but unattested phrasal simile
heavy as a truck on the basis of the lexicalized pattern [(as) A as NP], which com-
prises established examples such as strong as a horse or dead as a doornail (Fink-
beiner 2008).
This raises the more general question of the interrelation between the lexicon
and the two “rule-based” components of grammar, morphology and syntax. If
both morphology and syntax may feed the lexicon, as is evidenced by compounds
and MWEs, how is the interaction of morphology and syntax with the lexicon to
be represented in our theory of grammar?
2 A weaker form of lexicalism assumes that inflectional morphology is more closely related to
syntax, while word-formation is more closely related to the lexicon (e. g., Anderson 1982; cf. also
Giegerich 2009).
Under such a conception, one may account for the fact that compounds are
part of the lexicon, while at the same time being the output of a productive mor-
phological component. However, the model does not account for the difference
between listed compounds and novel ones, as morphosyntactically they look
exactly the same. Even more importantly, lexicalism predicts a lexicon free of
syntactic phrases. Thus, not only do MWEs such as idioms and collocations pose
a serious problem to lexicalism, but also phenomena like phrasal compounds,
i. e. compounds with a phrasal modifier constituent (e. g., Pafel 2015; Trips/Korn-
filt 2015), and particle verbs, i. e. verbs with a separable particle (e. g., Lüdeling
2001; Zeller 2001).
The linear view of the lexicon/syntax relation is abandoned in Jackendoff’s
(1997) Parallel Architecture. At the heart of this approach is the hypothesis of
representational modularity, which states that grammar is organized into three
autonomous and generative components: viz. phonological structure, syntactic
structure, and conceptual structure. Each domain generates representations of
its own. The interaction between the components is established by separate inter-
face modules between the systems that contain correspondence rules. In this
model, a lexical entry is exactly such a (small-scale) correspondence rule. It links
a small chunk of phonology with a small chunk of syntax and a small chunk of
semantics. Instead of lexical insertion, there is lexical licensing, in that a lexical
item licenses its chunks of information as the result of three independent pro-
cesses. As Jackendoff (2009) puts it:
The crucial point is that this model allows for including into the lexicon all kinds
of units, not only simplex and complex words, but also phrases of different kinds.
That is, MWEs can be listed in the lexicon as correspondence rules like every
other lexical item. The only difference is that in an MWE such as kick the bucket,
the three syntactic words are associated with three phonological words, but only
with one element in semantics (‘to die’). Complex words, such as compounds, are
treated as instantiations of more abstract morphosyntactic schemata that contain
variables at the three representational levels. Thus, morphology is not a separate
component in Jackendoff’s model. There is no difference between words and
rules, but both are conceived of as declarative schemata that have the status of
(more or less abstract, and more or less productive) lexical units.
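The idea of a lexical entry as a small-scale correspondence rule can be made concrete with a minimal sketch. It is an illustration only, not Jackendoff's own formalism: the class name LexicalEntry, the method is_many_to_one, the category labels (V, Det, N) and the capitalized concept labels are all assumptions introduced here, and orthographic words stand in for phonological ones.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LexicalEntry:
    """A hypothetical lexical entry linking three parallel chunks of structure."""
    phonology: Tuple[str, ...]  # phonological words (spelling used as a stand-in)
    syntax: Tuple[str, ...]     # syntactic words, given as assumed category labels
    semantics: Tuple[str, ...]  # elements of conceptual structure (assumed labels)

    def is_many_to_one(self) -> bool:
        # True when several syntactic words map onto a single semantic element,
        # as described above for an MWE such as "kick the bucket".
        return len(self.syntax) > 1 and len(self.semantics) == 1


# A simplex word: one chunk at each level.
squirrel = LexicalEntry(("squirrel",), ("N",), ("SQUIRREL",))

# An idiom: three syntactic words, three phonological words, one meaning.
kick_the_bucket = LexicalEntry(
    ("kick", "the", "bucket"),
    ("V", "Det", "N"),
    ("DIE",),
)

print(squirrel.is_many_to_one())         # False
print(kick_the_bucket.is_many_to_one())  # True
```

On this sketch, the simplex word and the idiom are entries of the same type; they differ only in how many elements are linked at each level.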
The Parallel Architecture has much in common with Construction Grammar
and Construction Morphology (Booij 2010). In a way, one can say that Construc-
tion Grammar, or at least certain variants of it, are realizations of the Parallel
Architecture. At the heart of Construction Grammar is the insight that linguistic
knowledge largely consists of stored knowledge of constructional schemata, ranging from
morphological through lexical and phrasal to even discourse schemata. Both
the Parallel Architecture and Construction Grammar thus argue for a continuity
between lexicon and grammar.3 In Construction Grammar, this continuum view
culminates in the notion of the ‘constructicon’, which replaces older views of a
lexicon/grammar dichotomy. The constructicon is conceived of as a large struc-
tured inventory of constructions of all levels of abstraction. Under this approach,
compounds and MWEs can easily be treated on a par with each other, both
being complex constructions sharing certain conceptual or functional features.
2.3 Lexicalization
The continuum view of the syntax/lexicon relationship may lay the ground for an
integrated and systematic treatment of both compounds and MWEs as the output
of productive or semi-productive schemata localized in the lexicon. Still, it does
not say anything about the differences in the lexical status between, e. g., the
compounds grass frog vs. grass slug, or the VPs hit the road vs. hit the dog. While
grass frog is a lexicalized compound, grass slug is not, and while hit the road is a
lexicalized MWE, hit the dog is not. Obviously, some outputs of schemata, or
rules, have the status of established lexical items listed in the lexicon, while oth-
ers have not (Hohenhaus 2005; Bauer 2006; Gaeta/Ricca 2009). In order to
account for these differences, one needs a concept of lexicalization.
According to Hohenhaus (2005: 356), the term lexicalization denotes both the
process of listing and the state of listedness, that is, the property of some element
to be a lexical item of a language. The main rationale behind the joint investiga-
tion of compounds and MWEs is precisely their common status as complex lexical
items. In order to delimit the field of investigation, it is therefore crucial to
properly define the notion of ‘lexical item’. That is, while we want to include grass
frog and hit the road in our field of investigation, we would like to exclude grass
slug and hit the dog. In particular, the following two criteria seem to be crucial in
this respect.
3 One difference between the Parallel Architecture approach and Construction Grammar lies in
the conceptualization of productivity. While Jackendoff (2009, 2013) clearly differentiates between
productive and semi-productive phenomena, Construction Grammar is somewhat less
explicit in this respect, assuming a flexible continuum of productivity of constructions. Another
difference lies in the conceptualization of the contents of constructions. While in a homogeneous
approach (e. g., Goldberg 1995, 2006), all linguistic units are taken to be meaningful constructions
– there being no autonomous syntactic principles – a heterogeneous approach takes meaningful
constructions as only one kind of stored structure, assuming that the grammar can also
contain independent principles of syntactic form or semantic structure (Jackendoff 2013: 78 f.).
Firstly, a lexical item functions as a semantic, or conceptual unit. For exam-
ple, grass frog refers to a unitary concept, a certain species, and hit the road refers
to a specific kind of activity. Both are concepts that speakers of the language have
stored together with the respective items. By contrast, while speakers of English
will be able to assign an interpretation to grass slug, they have not stored it as
a unit together with a certain conventional concept, or stable referent. Similarly,
speakers will be able to interpret the phrase hit the dog, but they do so on compo-
sitional grounds, and not because they have learned this phrase together with a
certain concept.
Secondly, for an element to have the status of a lexical item, it must occur
with significant frequency in the language. This criterion has received increasing
attention with the growing influence of usage-based approaches and rapidly
developing quantitative methods in corpus linguistics. It is closely related to the
first criterion, because high frequency makes it more likely that an item becomes
listed with a certain meaning. For example, if during a rainy summer a plague
of slugs that eat all the grass in people’s gardens were to sweep over a country,
and everybody started talking about the nasty grass slugs, it might be that after a
while, this compound would get stored in the English lexicon as a label of this
specific concept (‘certain kind of nasty grass-eating slug’).
stable concept and occurs with sufficient frequency in the language. Cross-classi-
fying the two properties results in the following matrix (ibid.):
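As a rough illustration of such a cross-classification, the following sketch enumerates the four combinations of the features [±morph] and [±lex] and fills each cell with one of the example items discussed earlier in this section. It is not a reproduction of the matrix in the cited source; the function names and the assignment of hit the dog to [-morph, -lex] are assumptions made here for illustration, and a plain hyphen stands in for the minus sign.

```python
from itertools import product

# Example items from the discussion above, keyed by (morph, lex).
# The assignment of items to cells is an illustrative assumption.
EXAMPLES = {
    (True, True): "grass frog",     # lexicalized compound
    (True, False): "grass slug",    # novel, non-lexicalized compound
    (False, True): "hit the road",  # lexicalized MWE
    (False, False): "hit the dog",  # free syntactic phrase
}


def feature_label(morph: bool, lex: bool) -> str:
    """Render a feature pair such as [+morph, -lex]."""
    def sign(value: bool) -> str:
        return "+" if value else "-"
    return f"[{sign(morph)}morph, {sign(lex)}lex]"


def cross_classify():
    """Return all four feature combinations, each with its example item."""
    return [
        (feature_label(morph, lex), EXAMPLES[(morph, lex)])
        for morph, lex in product((True, False), repeat=2)
    ]


for label, example in cross_classify():
    print(f"{label:<18}{example}")
```

The two features are kept independent of each other, which is what allows a productive morphological pattern to yield both listed and unlisted outputs.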
4 These items are also called occasionalisms. They may become listed in the lexicon at a later
stage, but not all of them will. Hohenhaus (2005) discusses the question whether there are occa-
sionalisms that are not listable (non-lexicalizable) in principle.
5 Booij (2010: 190) uses the term “lexical phrasal constructions” to refer to these units.
6 While Gaeta/Ricca (2009) focus on the delimitation between compounds and MWEs, i. e., complex
lexical units, it is clear that the feature combination [–morph], [+lex] also applies to established
simplex words, such as grass. Likewise, the combination [–morph], [–lex] also applies to
nonexistent simplex words, such as the nonce verb to gorp from a textbook sentence on language
acquisition (“The duck is gorping the bunny”), cf. Saxton (2010).
usually behave like single words phonologically. For example, the stress pattern
in English compounds is more like the stress pattern in single words than the
stress pattern in phrases, e. g., gréen card (compound; ‘residence permit’) vs.
green cárd (phrase; ‘green card’, e. g., in a game of cards).
Second, compounds are marked as word-like units morphologically. While
the prototypical case is that a compound is made up of two unmarked lexemes, in
languages with inflection, the non-head may carry an inflection-like element
(e. g., the element -s in German Liebe+s+brief ‘love letter’). Crucially, though, this
inflection-like element does not vary as a function of the compound’s role in the
matrix sentence (Bauer 2009a: 346). What carries the inflection for the compound
as a whole, according to its role in the matrix sentence, is the head (ibid.). For
example, the linking element -s in the German compound Liebe+s+brief is carried
by the non-head, while inflection according to the compound’s role in a matrix
sentence goes to the end of the head, e. g., in den Liebe+s+brief+en (‘in the[DAT.PL]
love letters[DAT.PL]’).
Third, compounds can be defined according to syntactic criteria, most impor-
tantly syntactic inseparability and an inability to modify the non-head. For exam-
ple, one cannot insert an element in between the two constituents of the German
compound Alt+bau ‘old building’, cf. *dieser Alt teure Bau (lit. this old expensive
building), and the non-head (the first constituent) cannot be modified: *dieser
sehr Alt+bau (lit. this very old building).
As for MWEs, scholars like Nunberg/Sag/Wasow (1994) and Gries (2008)
make use of syntactic, semantic, and frequency criteria to arrive at a definition.
As outlined in the beginning of this chapter, in modern phraseological research,
most scholars hold a rather broad view of the notion of MWEs, including many
different types of phrasal units. Syntactically, MWEs are required to consist
of at least two syntactic elements, which may be of different kinds. For
example, the collocation heavy smoker consists of two words. In other MWEs, a
word tends to co-occur with a particular grammatical pattern, for instance, the
verb to hem tends to co-occur with the passive. In this case, the MWE consists of
a word and a syntactic frame (Gries 2008: 5). MWEs often are syntactically more
or less fixed, but there are also fully flexible MWEs. For instance, the MWE by and
large is completely fixed (e. g., the reverse order *large and by would be
ungrammatical), while run amok is rather flexible (e. g., it allows for different
tenses).
Semantically, it is usually required that MWEs be semantic units, i. e. that
they have a meaning just like a single word or morpheme. For example, hit the
road roughly means ‘leave’. While many MWEs tend to have a non-compositional
semantics, non-compositionality is not a necessary criterion. For example, while
kick the bucket is semantically non-compositional, too much to ask is fully compositional.
While MWEs such as kick the bucket and compounds such as blackbird do not
seem to have much in common except their being complex lexical units, it has
been pointed out repeatedly in the literature that there are certain subsets of com-
pound words and MWEs that closely correspond to each other. For example, in
German, as in many other languages, there are adjectival compounds, e. g., butter+weich
‘butter soft’, that have corresponding phrasal similes, e. g., weich wie
Butter ‘as soft as butter’. These expressions share lexical material and have a very
similar meaning. Another case in point are A+N combinations such as schwarzer
Tee vs. Schwarz+tee ‘black tea’ (cf. Schlücker 2014; Hüning/Schlücker 2015). As
both the morphological and the syntactic pattern are stored lexical units, they
pose a problem to the principle of synonymy blocking in the lexicon, suggesting
that this principle might not be as strong as often assumed. For such cases, poten-
tial tasks for the researcher are to find out how much the two competing pro-
cesses overlap, if the overlap is systematic or only applies to a subset of the
respective patterns, whether one is dealing with real doublets, or whether there
are more specific differences in meaning or usage (cf. Masini, this volume;
Schlücker, this volume). For example, Hüning/Schlücker (2015) point out that the
morphological and the phrasal pattern in similes such as butter+weich/weich wie
Butter are competitive only with regard to a relatively small subset of all possible
similes. This can be shown by pairs such as *brot+dumm/dumm wie Brot (lit.
dumb as bread, ‘very dumb’), where one of the two patterns is ruled out. Theoret-
ically, the interesting question is what underlying principles guide the choice of
strategy that is employed in a given language, or in a given context. For German
A+N sequences, for instance, the choice between the morphological and the
phrasal pattern seems to be sensitive to type frequency effects (cf. Schlücker/Plag
2011).
While all contributions to this volume discuss the compound-MWE relation-
ship, some of them focus explicitly on corresponding patterns, while others look
at the issue from a broader perspective. What can be said more generally for the
different languages and language families of Europe is that the potential corre-
7 This is not to say that German phrasal patterns cannot adopt a kind reading, which is clearly
not the case (e. g., schwarzes Brett ‘bulletin board’). The point in Härtl (2016: 66) is that “right
from the beginning”, a compound is semantically more specialized, or more restricted than its
corresponding phrase, which may, but need not, adopt a kind reading. Potential counterexamples
to this hypothesis are pairs such as Warmwasser vs. warmes Wasser (‘warm water’), or
Blondhaar vs. blondes Haar (‘blond hair’), where the compound does not seem to be semantical-
ly more restricted than the phrase; cf. Schlücker (2014).
3 A contrastive overview
The second part of this chapter is devoted to the comparison of German with the
West Germanic, North Germanic, Romance, Slavic, Greek, and Finno-Ugric lan-
guage families in terms of the relationship between compounds and MWEs. It
strives to illustrate the similarities and differences between these languages and
to sketch some more general tendencies of the respective language families with
respect to this relationship. The languages discussed in the following overview
are restricted to those represented in the various chapters of the volume, except
the North Germanic languages, which lack their own chapter and which have
been added to this overview to complete the picture. When relevant, compound
boundaries are marked by “+” in the following.
kinds of nominal phrasal constructions with a naming function and which are
therefore on a par with nominal compounds. Thus, they are lexical noun phrases,
sometimes also termed ‘phrasal nouns’. Patterns of lexical noun phrases are eas-
ily found in all these languages, e. g., close apposition (German Prinzip Hoffnung
‘principle of hope’), genitive (or possessive) constructions (English baby’s chair,
German Ei des Kolumbus ‘egg of Columbus’), constructions with prepositional
phrases (Dutch restaurant met tuin ‘garden café’), binomials (English fish and
chips), or A+N phrases, often with a relational adjective (Dutch stalen zenuwen
‘nerves of steel’).8
However, in addition to these similarities, there are also differences within
West Germanic. In particular, there is one fundamental contrast that distin-
guishes English from the other two. Overall, in German and Dutch, compounds
can be very clearly distinguished from phrasal constructions on the basis of for-
mal criteria, primarily stress and inflection. This distinction is reflected in spell-
ing, with compounds displaying solid spelling and MWEs being written in two (or
more) orthographic words. There are only very few patterns that resist a clear
classification as either morphological or phrasal, at least at first view, such as
phrasal (particle) verbs. In fact, German and Dutch seem to pattern very much
alike with regard to the (number of) types of compound and MWE patterns that
exist in both languages.
Leaving aside various minor differences and specific characteristics of each
language, the major difference between German and Dutch seems to lie in the
often noted observation that – at least in the nominal domain – Dutch seems to
use phrasal patterns more often than German, which in contrast opts for com-
pounding more frequently, although both patterns are in principle available in
both languages, e. g., German Tag+es+gespräch, Dutch gesprek van de dag (lit.
talk of the day, ‘nine days’ wonder’), German Stumm+film, Dutch stomme film
(‘silent film’) (cf. van Haeringen 1956; De Caluwe 1990; Booij 2002; Hüning 2010;
Hüning/Schlücker 2010, among others).
In English, on the other hand, the formal distinction between (nominal)
compounds and phrases is notoriously difficult. First of all, the criterion of
inflection is inapplicable in English. Secondly, the (formerly often invoked) cri-
terion of stress has been shown in a number of works (cf., for instance, Plag
2006; Kunter 2011) to be incapable of drawing this distinction because although
the vast majority of (NN) compounds have forestress, as predicted, there are also
numerous exceptions, as can be seen from classical examples such as ′apple
cake vs. apple ′pie, ′Madison Street vs. Madison ′Avenue. Thirdly, the distinctive
force of other tests which refer to the idea that compounds, being words, should
be subject to lexical integrity (contrary to phrases), such as the pro-one test,
internal modification, or coordination, have been proven weak in works such as
Bauer (1998), Giegerich (2015) and Bauer (to appear). Also, the forms that evolve
as either morphological or phrasal on the basis of the stress criterion do not necessarily
coincide with the outcomes of the other tests. For this reason, very divergent
opinions on the definition of compounds in English and the demarcation
from phrases can be found in the literature. A literature survey is beyond the
scope of the present paper (but see, for instance, Olsen 2000; Lieber/Štekauer
2009b). Generally speaking, in addition to uniform analyses that assume that
the constructions in question are either all morphological, and thus compounds,
or all syntactic, and thus phrases, it has also been suggested that some of
them are morphological whereas others are syntactic, depending on how the
above-mentioned criteria are weighted (e. g., Giegerich 2004). Finally, it has
been advocated that the inconclusive data are an indication of the fact that the
compound-phrase distinction does not exist and that there is either a continuum
or an overlap between syntax and the lexicon (e. g., Giegerich 2015; Bauer, this
volume). Another problematic case is that of ‘descriptive’ or ‘classifying’ genitives,
e. g., lawyer’s fee, mother’s milk. Regardless of their obvious phrasal form, they
are like compounds in that the genitive dependent has a classifying rather than
a determinative function, that it is immediately adjacent to the head noun, and
that the constituents cannot be separated, e. g., by another modifier. For this
reason, they have often been treated as compounds in the literature (cf. Rosen-
bach 2006: 82–89 for a literature survey).
In sum, the major difference between German (and Dutch) on the one hand
and English on the other is that in English, due to the apparent impossibility of
distinguishing clearly between morphological and syntactic N+N and A+N
sequences, compounds are often regarded as just one kind of MWE, cf., for
instance, Ramisch (2015), Bauer (this volume),9 whereas in German and Dutch,
compounds and MWEs are clearly opposed and there are only a few patterns that
elude immediate classification as either compound or MWE. Apart from that, the
West Germanic languages pattern very much alike regarding the existence of var-
ious specific subtypes of compounds and MWEs. This similarity becomes particu-
larly obvious when German is compared to other languages and language
families.
9 Moon (2015), on the other hand, explicitly excludes compounds from the set of MWEs.
10 In this overview, we will concentrate on examples from Swedish, Danish, and Icelandic.
11 E. g., Swedish bil+en (lit. car+the, ‘the car’).
12 E. g., Swedish en stor bil (common gender, ‘a big car’), ett stort hus (neuter, ‘a big house’);
bilen är stor (common gender, ‘The car is big’), huset är stort (neuter, ‘The house is big’).
13 E. g., Swedish dörren öppnade-s ‘the door was opened’, with the -s-suffix marking passive.
14 Cf. Swedish Där kommer hon, German Da kommt sie (Adv V S) (both lit. there comes she), but
English There she comes (Adv S V).
15 This is reflected not only in subordinate clauses, but also in main clauses if one takes into
account the position of non-finite verbal parts. Cf. for main clauses Swedish Hon har sett huset,
English She has seen the house (Vfin Vinfin O) (both ‘She has seen the house’), but German Sie hat
das Haus gesehen (Vfin O Vinfin) (lit. she has the house seen); for subordinate clauses Swedish [Jag
vet att] hon har sett huset, English [I know that] she has seen the house (Vfin Vinfin O) (both ‘I know
that she has seen the house’), but German [Ich weiß, dass] sie das Haus gesehen hat (O Vinfin Vfin)
(lit. I know that she the house seen has).
sive (e. g., Danish Kilde+skatte+direktorat+et ‘internal revenue service’ (lit. source
tax directorate), cf. Haberland 1994, Icelandic Norð+austur+atlant+s+haf+s+
fisk+veiði+nefndin ‘The North East Atlantic Ocean Fisheries (lit. Fish-Catching)
Committee’, cf. Bjarnadóttir 2017).16 North Germanic compounds normally dis-
play solid spelling and carry stress on the first constituent. Nominal compound-
ing (N+N, A+N, V+N) is by far the most common process, with N+N being the
most productive pattern, approximately as in German (cf. Thráinsson 1994;
Teleman 2005; Bauer 2009b). Some examples are Swedish ång+båt, Danish
damp+skib, Icelandic gufu+bátur ‘steam boat’ (N+N); Swedish lill+finger, Danish
lille+finger, Icelandic litli+fingur ‘little finger’, ‘pinkie’ (A+N); Swedish skriv+bord,
Danish skrive+bord, Icelandic skrif+borð, lit. write table, ‘desk’ (V+N).
One difference concerning V+N compounding in the three languages is that
V+N compounds in Swedish and Icelandic use the verbal stem as first constituent
(skriv-, skrif-), while Danish V+N uses the infinitive of the verb ([at] skrive). This
feature of Danish V+N compounds is distinct from German, which is also interest-
ing from a theoretical point of view. If one takes infinitival endings as inflectional
endings, the question arises whether Danish V+N compounds should be regarded
as cases of compound-internal inflection. However, it is clear that infinitival non-
heads are to be distinguished from cases where a non-head exhibits agreement
features with the head. Only the latter case may pose a serious problem to the
delimitation between compounds and syntactic phrases, since in cases with com-
pound-internal agreement there is a potential overlap between compound and
syntactic phrase.
A highly particular feature of Icelandic compounds, in contrast with all other
Germanic languages, is that they systematically exhibit compound-internal
inflection (cf. Bjarnadóttir 2017).17 This pertains both to a subclass of Icelandic
N+N compounds, i. e. those with a genitive (or sometimes also a dative) non-head,
as well as to all A+N compounds. As to N+N compounds, Bjarnadóttir (ibid.: 18)
distinguishes between compounds with a stem as non-head (e. g., fjár+hús ‘sheep
house’); compounds with a genitive as non-head (e. g., vegar+endi, ‘end of road’,
with vegar being one of two possible genitive forms of the noun vegur ‘way’); and
a very small class of compounds with a special stem form or a linking element (cf.
16 Note that compounds in North Germanic languages do not exhibit regular capitalization (in
contrast to German). The examples Kildeskattedirektoratet and Norðausturatlantshafsfiskveiðinefndin
exhibit upper case because they function as proper names.
17 Note that internal inflection, more generally, pertains to all nouns with a suffixed definite
article in Icelandic. Thus, in definite nouns, the noun and suffixed definite article both inflect,
e. g., hestur ‘horse’, hestur-inn ‘theNOM horseNOM’, hesti-num ‘theDAT horseDAT’.
also Thráinsson 1994). While she acknowledges that the nature of the genitives in
Icelandic compounds and the question of whether these are true inflectional
forms or linking elements are matters of debate, Bjarnadóttir (2017: 19) argues for
a genitive/inflectional analysis of these forms. One of her arguments in favor of
this analysis is that the inflected forms of the non-head nouns are always the
“correct” genitives, in spite of the complexity of the inflectional patterns. This
stands in contrast with German, where forms such as Liebe+s+brief (‘love letter’)
are paradigmatically incorrect, the expected genitive feminine being Liebe, not
*Liebes. Internal inflection is also found in the adjectival non-heads of A+N com-
pounds, where agreement of gender, case, and number “is exactly the same
within the compounds as in syntax” (ibid.: 28 f.). For example, in litli+fingur
‘pinkie’ (lit. small finger), the ending -i in litli ‘small’ is a marker for masculine,
singular, nominative, definite. In the accusative case, the compound form would
be litla+fingur, with the ending -a in litla marking masculine, singular, accusa-
tive, definite. Thus, on purely inflectional grounds, it is not possible in Icelandic
to differentiate between a definite noun phrase litli fingurinn ‘the small finger’
and a compound word in definite form, litli+fingurinn ‘the pinkie’. This distinc-
tion can be made only with the help of word stress (and spelling, though this is
not a very robust criterion), with compound words carrying primary stress on the
first constituent, and secondary stress on the second constituent in a binary
compound.
In Swedish and Danish, on the other hand, compounds can be distinguished
from phrasal constructions based on prosodic, morphological, and syntactic cri-
teria. Swedish and Danish compounds, as in Icelandic, carry primary stress on
the first constituent and secondary stress on the last constituent (cf. Teleman
2005; Bauer 2009b). According to the Swedish tonal system, which differentiates
between accent 1 (“acute”) and accent 2 (“grave”), compounds carry accent 2,
which is characteristic of polysyllabic words with primary stress on the first syllable
(²sport+ˌbil ‘sports car’, ²läs+glas+ˌögon, lit. read glass eyes, ‘reading glasses’).18
The difference between accent 1 and accent 2 is distinctive in pairs such as
²ande+n (‘spirit+definite’) and ¹and+en (‘duck+definite’). For these pairs, accent 2
is a lexical accent differentiating lexical words from inflected word forms. This is
specific to Swedish and contrasts with German. In German, there is a difference
between lexical stress and phrasal stress, but not between lexical stress in words
vs. word forms.
Moreover, in Swedish and Danish, compounds may be distinguished from
phrases on formal grounds. Generally, in contrast to Icelandic, Swedish and Dan-
ish compounds do not exhibit internal inflection. While in A+N phrases, the
adjective must carry inflection (e. g., Danish et stort køb ‘a big purchase’, det store
køb ‘the big purchase’), in A+N compounds, it is uninflected (e. g., Danish et
stor+køb, stor+køb+et ‘a wholesale’). Syntactically, while the adjective in an A+N
phrase may be modified (e. g., et meget stort køb ‘a very big purchase’, det største
køb ‘the biggest purchase’), in an A+N compound, it may not (e. g., *meget
stor+køb, lit. very big purchase, *størst+køb, lit. biggest purchase). Further evi-
dence for the compoundhood of A+N compounds comes from definiteness inflec-
tion on the noun. While a single noun in Swedish and Danish takes a postposed
definite article (hus+et ‘the house’), a premodified noun takes a preposed definite
article (cf. Swedish det stora hus+et, Danish det store hus ‘the big house’). Thus,
the correct definite form of the Danish compound hvid+vin ‘white wine’ is
hvid+vin+en, but not *den hvid+vin, as would be expected of a phrase (Bauer
2009b).
In Swedish and Danish N+N compounds, the non-head may be changed mor-
phologically in various ways. However, these forms are normally regarded not as
inflection, but rather as linking elements, as in German (Niemi, S. 2009; Bauer
2009b). Swedish compounds may display vowel deletion (flicka > flick+skola ‘girl
school’), vowel addition (tjänst > tjänst+e+man ‘service man’, ‘clerk’), or the addi-
tion of -s (stol > stol+s+ben ‘chair leg’) (cf. Josefsson 1997; Teleman 2005). Danish
compounds may display an s-link (træning+s+bane ‘training ground’), an e-link
(jul+e+dag ‘Christmas Day’), an er-link (blomst+er+bed ‘flower bed’) or an (e)n-link
(rose+n+gaard ‘rose garden’). In general, this picture is consistent with West
Germanic languages such as German and Dutch (but not English, which lacks
linking elements).
Apart from compounds on the one hand and regular syntactic phrases on the
other, a large stock of MWEs can be found in North Germanic languages, both
those that in principle correspond to compounds and those that do not. For exam-
ple, in Swedish, there are A+N phrases with a naming function such as röda hund
‘measles’ and hög hatt ‘top hat’; collocations such as ymnig grönska ‘lush green-
ery’ and duka bordet ‘lay the table’; complex verbs incorporating a non-referen-
tial noun such as knipa käft (lit. shut mouth) ‘keep one’s trap shut’ and vålla
storm, lit. cause storm, ‘to cause a great stir’; idioms such as tala i skägget ‘to
express oneself in an obscure way’; and speech act formulae such as Tack för
senast ‘thanks for the other day’. As Koptjevskaja-Tamm (2009: 134) observes,
lexicalized A+N phrases in Swedish may contain both indefinite (hög hatt ‘top
hat’) and definite adjectives (röda hund, lit. red dog, ‘measles’), with definite
adjectives combining with either unmarked nouns (röda hund ‘measles’) or nouns
with the suffixed definite article (röda korset ‘the Red Cross’). However, what is
avoided, according to Koptjevskaja-Tamm (2009), are lexicalizations of the nor-
mal pattern with preposed determiners, definite adjectives and nouns with suf-
fixed article (as in den gula hatten ‘the yellow hat’) (cf. Section 2.5). A specific
feature of Swedish MWEs is their connective prosody (cf. Anward/Linell 1976),
whereby all stressed syllables in the MWE become deaccentuated, except for the
last one. This can be taken as a distinctive feature for telling apart phrasal lexical
units from phrasal syntactic units. In this respect, Swedish clearly differs from
German, which does not distinguish lexical phrases from non-lexicalized phrases
on prosodic grounds.
Generally speaking, the North Germanic MWE systems are very similar to the
MWE system of German. Thus, there are overall commonalities both as to the
number and the types of MWEs, including rather specific idioms such as German
auf keinen grünen Zweig kommen, which directly corresponds to Swedish ej
komma på grön kvist (lit. to not come onto a green branch, ‘to get nowhere’). How-
ever, there are also many language-specific differences in lexicalization which
can be easily demonstrated, e. g., for the case of collocations. For example, in
Swedish, there are several collocations with the verb torka ‘to dry (sth.)’, e. g.,
torka bordet (‘wipe the table’), torka golvet (‘wipe/clean the floor’), torka disken
(‘dry the dishes’). While German has a direct verbal equivalent, trocknen ‘to dry
(sth.)’, it uses three different verbs in combination with the respective nouns: den
Tisch abwischen/*trocknen ‘wipe the table’, den Boden wischen/*trocknen ‘wipe/
clean the floor’, das Geschirr abtrocknen/*trocknen ‘dry the dishes’.
An interesting question is whether there are any tendencies in the North Ger-
manic languages as to the use of compounds compared to their corresponding
MWEs. It is well-known that Dutch, relative to German, tends to prefer MWEs over
compounds, while German, relative to Dutch, tends to prefer compounds over
MWEs (cf. Section 3.1). As to North Germanic languages, as far as we can see,
comprehensive studies on this issue are lacking. There is some evidence, though,
that Swedish tends to use compounds more frequently than corresponding MWEs
compared to other languages. For example, Dura/Gawronska (2007), in a parallel
corpus study on novel expressions, found that legislative concepts such as ‘quality
control’ were realized in the Swedish corpus as compound nouns (kvalitet+s+kontroll),
whereas the Polish parallel corpus used nominal phrases (kontrola
jakości). Combinations with ‘animal food’ were realized as compound nouns
(djur+foder ‘animal food’, fisk+foder ‘fish food’) in the Swedish corpus, but as
lexical noun phrases containing prepositional phrases (karma dla zwierząt ‘animal
food’, karma dla ryb ‘fish food’) in the Polish corpus. Inghult (1991), in an
investigation of the principles of lexical innovations in German and Swedish,
found that only 3 % of all new formations in dictionaries of neologisms were
phrases, while 97 % were word formations. Moreover, he found that Swedish
often has compounds where German has MWEs, for instance, German kupferne
Kanne vs. Swedish koppar+kanna, ‘copper pot’. However, these somewhat out-
dated results from dictionaries should be treated with caution and are in need of
confirmation by corpus-driven studies.
Comparing German and Danish, Farø (2015) finds that Danish tends to have
MWEs where German has compounds, e. g., Danish røget laks vs. German
Räucher+lachs (‘smoked salmon’), Danish stor begivenhed vs. German Groß+
ereignis (‘major event’). However, there are also reverse pairs such as Danish
spanskrør vs. German Spanisches Rohr (‘cane’). For a comparison of Dutch and
Danish, Haberland (1994: 347) remarks that where Dutch would use derivational
processes, Danish would use compounds, cf. Danish vel+smagende ‘well tasting’,
‘tasty’ vs. Dutch smakelijk. While more comprehensive studies on this issue are
lacking, these observations suggest, overall, that the North Germanic languages
tend to pattern with German with respect to the utilization of the two competing
processes.
An interesting commonality between the North Germanic languages, Ger-
man, and Dutch, which clearly sets them apart from English, is put forward by
Klinge (2006). Klinge investigates the [N de N] construction, which is well-known
from French (e. g., prisonnier de guerre). Interestingly, this is also a productive
pattern of formation in English (e. g., prisoner of war), yet not or only marginally
in other West Germanic languages (German, Dutch) or indeed in North Germanic
languages such as Danish or Icelandic. Thus, where English has bird of prey, Ger-
man has Raub+vogel, Danish rov+fugl, and Icelandic rán+fugl. The hypothesis
put forward by Klinge is that this may be explained as a language contact phe-
nomenon. Thus, the originally Romance [N de N] pattern was adopted in English
from Norman French. This would explain why it does exist in English, but not in
Dutch, German, Danish, or Icelandic. Importantly, Klinge argues that MWEs such
as weapons of mass destruction in English are not the result of some isolated lex-
icalization of a syntactic phrase, but instead reflect the presence of a lexical for-
mation pattern [N de N] in English which instantiates such structures directly as
lexical units.
In sum, one can say that the North Germanic languages largely pattern with
German with respect to the availability and utilization of the processes of compounding
and MWE formation. The most significant difference between North
Germanic and German lies in the Icelandic possibility of compound-internal
inflection, which makes Icelandic compounds look more “syntactic”
than German compounds. However, in many other respects, the commonalities
outweigh the differences.
‘red and white’).19 Regarding headedness, French and Italian compounds are gen-
erally left-headed, e. g., French stylo-bille (lit. pen ball, ‘ball pen’), Italian pesce-
spada (lit. fish sword, ‘sword fish’). However, Spanish, in addition to left-headed
compounds, e. g., célula madre (lit. cell mother, ‘stem cell’), also has some right-
headed compound patterns (cf., e. g., Guevara 2012; Rainer 2016), both adjectival
and nominal ones (cf. Fernández-Domínguez, this volume), e. g., drog+adicto (lit.
drug addict, ‘addicted to drugs’). For Italian, on the other hand, Masini/Scalise
(2012) argue that the existence of right-headed compounds does not provide evi-
dence against the assumption that Italian compounding is generally left-headed
because these cases are either neoclassical formations, Latin relics, or English
calques, such as scuolabus ‘school bus’. Another frequently mentioned property
of compounds, which is again particularly valid for compounding in German
(although not for all compound subpatterns) is recursivity. In general, compound-
ing is not considered to be recursive in the Romance languages under discussion
(cf., for instance, Scalise 1992 on Italian), with the exception of coordinate (or:
copulative) compounds (e. g., Arnaud 2015 on French). Also, solid spelling – which
is often said to be indicative of the compound (versus phrase) status in German –
is often found with morphological compounds, as well as hyphenated spelling.20
At the same time, however, there are also compounds with an unstable spelling
(cf. Fernández-Domínguez, this volume) as well as MWEs written as one word
(e. g., Van Goethem 2009; Van Goethem/Amiot, this volume).
So far, this brief overview has shown that in contrast to German it seems
much more difficult to provide clear criteria for morphological compounds as
opposed to MWEs in French, Spanish, and Italian. However, two important crite-
ria are still missing. They are among those that have been established by Lieber/
Štekauer (2009b: 8) as more general, cross-linguistic criteria of compounding,
namely (in addition to stress) (a) syntactic impenetrability, inseparability, and
unalterability, and (b) inflection. The first criterion is difficult to assess. On the
one hand, it is a basic criterion for distinguishing compounds from phrases (cf.,
for instance, Fernández-Domínguez, this volume, Van Goethem/Amiot, this vol-
ume). On the other hand, however, it is well-known that it also applies to some,
though not all kinds of lexicalized phrases (cf., for instance, Gunkel/Zifonun
19 In addition, Latinate and Greek linking elements are found in neoclassical compounding of
all three languages, cf., for instance, Villoing (2012) on French.
20 It goes without saying that spelling is subject to conventional norms and possible changes of
normative rules and, for these reasons, cannot be regarded as evidence for the grammatical sta-
tus of forms. However, in particular non-normative writing tendencies might be indicative of the
writer’s assessment of a form as a conceptual unit.
Italian porta+bagagli (lit. carry luggage, ‘trunk’), Spanish cubre+cama (lit. cover
bed, ‘bedspread’). The pattern is regarded as typical of the Romance languages;
it rarely exists in other Indo-European languages, and not at all in
German or Dutch (there are sporadic English examples such as turncoat,
killjoy). Although there have been debates concerning the stem form (and thus
the morphological nature) of the left, verbal constituent, these constructions are
relatively uniformly regarded as morphological compounds in contemporary
works (see the literature cited in this section as well as Ricca 2015, who also
discusses the interlinguistic differences of V+N compounds within Romance).
The second pattern is that of coordinate compounds (A+A, N+N), which can be said to
be fairly regular and productive in French, Italian, and Spanish (though with
some restrictions regarding specific subpatterns in the individual languages). In
comparison to German, they are interesting for two reasons: first, with regard to
form, coordinate compounds often show inflectional marking on both heads and
thus word-internal marking, which is impossible in German. One could therefore
argue that the pattern is more morphological in German than in Romance.
Second, the existence of N+N coordinate compounds has been widely discussed
in the literature on German (in contrast to A+A coordinate compounds, whose
existence has not been questioned). The main argument is that in many cases of
alleged N+N coordinate compounds it seems hard to establish a semantic
coordinate relationship and thus two semantic heads; instead, a determinative
interpretation is available in equal measure or even preferred. There are only
very few clearly nominal coordinate compounds in German (with an additive
meaning) such as toponyms, for instance the names of federal states that consist
of two regions, e. g., Nordrhein-Westfalen ‘North Rhine-Westphalia’, or technical
terms such as Sprecherschreiber ‘speaker-writer’ which is however restricted to
linguistic terminology. Thus, although in general German seems to be much
more prone to morphological compounding than the Romance languages which
in contrast make much more use of MWEs, there are at least these two patterns
of compounding that constitute exceptions to this general distribution of
forms.
Regarding compounds and MWEs, Modern Greek and German display many sim-
ilarities. Compounding in Greek is, just as in German, a very productive device of
word-formation, and both languages have various MWE patterns. As in German,
compounds can be distinguished clearly from syntactic phrases (both MWEs and
common ones) in Greek.
21 Contrary to compounding, though, they seem to be a rather recent pattern. According to Ralli
(2013a), they have been observed only in the last two centuries and have most probably emerged
under the influence of French and English. Also, they are almost always restricted to specific
registers.
Slavic languages, exemplified here by Russian and Polish, differ clearly from Ger-
man with respect to the formation of new lexical items and in particular com-
pounding. Although compounding, and in particular nominal compounding, is
a productive word-formation process in both languages, it is a less important
means for expanding the lexicon than it is in German (and other languages, such
as English), particularly since derivation is highly productive (Uluhanov 2016).22
As in German, nominal compounds, in particular N+N compounds, are the
predominant compound type both in Russian and Polish, e. g., Polish gwiazd+o+zbiór
(lit. star set, ‘constellation’), Russian gaz+o+snabženie (‘gas supply’), followed
by adjectival compounds, e. g., Polish ciemn+o+niebieski (‘dark blue’), Russian
tëmn+o+sinij (‘dark blue’). Verbal compounding is considered unproductive
in Polish (cf. Szymanek 2009) and only marginally productive in Russian (cf.
Benigni/Masini 2009), although both languages have a rather small inventory of
(older) verbal compounds. Generally, there are neither compounds with verbal
modifiers (V+X) (cf. Ohnheiser 2015: 761) nor phrasal modifiers (XP+X) (cf.
Bağrıaçık/Ralli 2015: 344; Szymanek 2017) in Slavic, in contrast to German. Com-
pounding is mostly right-headed, although there are also some (minor) left-
headed subpatterns. Compounds proper have a linking element, mostly -o-, as in
the above-mentioned examples or, less frequently, -e-, -i-, -u-, and they are written
as one word (or with a hyphen). Compounds in Polish display lexical stress on
the penultimate syllable which clearly sets them apart from phrases. Finally, Pol-
ish and Russian compounds are hardly recursive; compounds with more than two
constituents are only found with adjectival coordinate compounds, e. g., Polish
polsko-rosyjsko-ukraińskie (‘Polish-Russian-Ukrainian’).
22 In the following, only the most important and basic properties of compounding in Russian
and Polish are described. For further details, such as the difference between proper compounds
and solid compounds, the various kinds of input elements, neoclassical compounding, the gen-
der class shift etc. the reader is referred to the contributions by Ohnheiser (on Russian) and Cet-
narowska (on Polish) in this volume, as well as Szymanek (2009), Benigni/Masini (2009), Ohn-
heiser (2015), Uluhanov (2016), Nagórko (2016).
they also share properties with free syntactic phrases and compounds. For
instance, these lexical noun phrases are inseparable. That is, they cannot be
interrupted by intervening material, e. g., Russian sotovyj telefon (‘mobile phone’),
but *sotovyj služebnyj telefon (lit. cellular official telephone). Also, the individual
constituents cannot be modified internally, e. g., Russian posobie po bezrabotice
‘unemployment benefit’, but *posobie po ženskoj bezrabotice (lit. benefit by
female unemployment). These are properties typical of morphological entities
and unlike free syntactic phrases; also, the function of lexical noun phrases as
lexical naming units equals that of compounds. On the other hand, lexical noun
phrases display inflectional markers, like free syntactic phrases and unlike com-
pounds, and some patterns contain relational elements, thus prepositions and
conjunctions (as po in the last example), again like free syntactic phrases and
unlike compounds (for a more detailed discussion of the tests employed includ-
ing (apparent) counterexamples cf. Masini/Benigni 2012; Cetnarowska 2015;
Ohnheiser 2015; Cetnarowska 2018; Cetnarowska, this volume; Ohnheiser, this
volume). Thus, again it can be shown that these lexical noun phrases are lexical
entities on the interface of syntax and the lexicon, i. e. lexical entities that are
created in syntax. Building on works by Booij (2009, 2010) on A+N phrases in
Dutch and Greek, among others, constructionist analyses have been proposed for
these Russian and Polish lexical noun phrases in Masini/Benigni (2012), Cetnarowska
(2018), Cetnarowska (this volume).
MWEs, and lexical noun phrases in particular, are also known from German,
although it seems likely that these (or comparable patterns) are less productive
in German than they are in Slavic, given the predominance of compounding in
German. There is, however, another process for the formation of lexical items
which stands in a close relationship to MWEs and MWE formation. This process
is specific to Slavic and without a real equivalent in German. It is a process of
shortening phrasal items to a single morphological lexeme.23 More precisely,
there are several shortening processes, among them ellipsis, truncation, clip-
ping and de-suffixation (cf. Masini/Benigni 2012 on Russian; Martincová 2015 on
Slavic in general), e. g., Polish rzut karny (‘penalty throw’) > karny (lit. penal),
Russian mineral’naja voda (‘mineral water’) > mineralka. These processes are
referred to either as shortening, condensation or univerbation, although the lat-
ter term is somewhat misleading as univerbation is often understood elsewhere
23 There are, obviously, also shortening processes such as clipping and contamination in Ger-
man. They are much less systematic in nature than the shortening processes in Slavic however.
Also, they occur only sporadically and are much less frequent. Finally, they do not require MWEs
as input structures.
24 More precisely, in the Finnish literature, this is regarded as the nominative case, as the
nominative equals the base form without any inflectional suffixes (cf., e. g., Hyvärinen, this
volume).
25 An additional, morphosyntactic criterion is that possessive suffixes and clitics that can be
added to Finnish nouns are not allowed inside compounds (Niemi, J. 2009: 241 f.).
3.7 Discussion
26 There are, however, numerous studies on MWEs in Finnish and Hungarian in the more tradi-
tional sense, written in German or English, in particular on verbal idioms, including various
Hungarian-German and Finnish-German contrastive studies. For an overview on Finnish cf.
Hyvärinen (2007).
27 Guevara/Scalise (2009) correctly point out that defining criteria of compounding such as
those mentioned here usually reflect the Germanic perspective, given the huge amount of stud-
ies on compounding in Germanic, but cannot do justice to compounding from a broader
perspective.
28 In this overview, more attention has been given to nominal MWEs than to verbal ones. This is
due to reasons of space as well as to the fact that the starting point of this study is German, and
that German compounding is predominantly nominal. Verbal lexical phrasal units are studied in
detail in the chapters on Dutch, Finnish, and Hungarian (this volume).
29 As noted in Section 3.6, as far as we are aware there is no English- or German-language literature
on lexical noun phrases and, in particular, on the respective lexical patterns in Finnish and
Hungarian so far. There are, however, studies on complex verbal lexical units.
Slavic due to the productivity of both derivation and MWE formation, whereas
the high productivity of nominal compounding in German has often been used
as an explanation for the fact that the number and productivity of nominal Ger-
man MWEs seem to be lower than in other languages.
Comparing the MWE patterns, it turned out that all languages have, among
others, productive patterns for the formation of [A N] phrasal units (or [N A], in
left-headed configurations). Among these, units with a relational adjective play an
important role. In addition, some languages (among them German, Dutch, Danish,
Swedish, Polish, and Greek) also have morphological A+N compounds, which
raises – for each language – the question of synonymy and synonymy blocking.
Another phrasal pattern that can be observed cross-linguistically is that of so-called
phrasal similes, i. e. comparative adjectival phrases of the type [(as) A as NP], e. g.,
as red as blood (cf. Section 2.5). Phrasal similes are attested in the West and North
Germanic languages as well as in Finnish and Italian,30 e. g., Swedish mjuk som
silke, German weich wie Seide, both ‘soft as silk’, Italian rosso come il sangue ‘red
as blood’ (note that not all comparisons make sense in their literal meaning, e. g.,
Danish dum som en dør ‘as stupid as a door’). They are particularly interesting
with respect to the question of synonymy and synonymy blocking since all these
languages also have an equivalent N+A compound pattern with a comparative
meaning, e. g., blood-red, Swedish silkesmjuk ‘silky smooth’. It can be observed
that in some cases the existence of a phrasal or morphological form blocks the
other (e. g., Swedish mjuk som smör ‘soft as butter’, but *smörmjuk (lit. butter-soft)),
but in other cases the phrasal and morphological form co-exist (e. g., Danish dum
som snot (lit. stupid as snot, ‘very stupid’), snotdum (id.)). The principles that
underlie the (non-)blocking in the various cases, both within single languages and
cross-linguistically, are however not yet fully understood. While both phrasal A+N
units and phrasal similes seem to arise quite naturally from the usual syntactic
patterns of the various languages, it is very interesting to see that phrasal patterns
that are more specific in that they violate the syntactic rules can also be found in
various languages. A case in point is the [an N1 of an N2] pattern (e. g., a hell of a
guy), again a comparative pattern. It expresses a comparison of N2 to the reference
value provided by N1. Hence, there is a mismatch between the semantic head of the
construction (N2) and the syntactic one (N1), referred to as ‘dependency reversal’ in
Rijkhoff (2009: 76). This pattern exists not only in Germanic, as for instance in
English (a hell of a guy), German (ein Idiot von (einem) Arzt ‘an idiot of (a) doctor’),
Danish (en klovn av en statsråd ‘a clown of privy council’), Swedish (en kretin till
30 More detailed studies on phrasal similes can be found in this volume in the chapters on Ger-
man, Dutch, and Italian.
polisprefekt ‘an imbecile of police chief’) and Dutch (where it is well-known in the
linguistic literature in connection with the famous example schat van een kind (lit.
sweetheart of a child, ‘very sweet child’), cf. Paardekooper 1956), but also in Italian,
French, and Spanish (e. g., esta maravilla de niño ‘this wonder of a child’) (cf. Gun-
kel et al. 2017: 1627 ff.).
One has to add that while in the present context attention is given only to the
formal side, i. e. the morphosyntactic and possibly phonological properties of
patterns such as [(as) A as NP] and [an N1 of an N2], cross-linguistic similarities
(and differences) have also been studied with respect to the semantic side, in
particular themes and images that feed imagery and metaphors in phrasal pat-
terns and that re-occur cross-linguistically, due to cultural links and other factors
(cf. Piirainen 2012, among many others). Thus, from this perspective, it is not
unexpected that similar patterns occur cross-culturally in different, even geneti-
cally unrelated languages.
4 Summary
In this chapter, we sought to present an introductory overview of compound and
MWE formation in a sample of European languages. We started with some general
considerations about the notion of complex lexical unit, the lexicon, and the lex-
icon-syntax interface, and provided some preliminary criteria for the distinction
between compounds and MWEs. In the second part of the chapter, we reviewed
the language-specific properties of compounds and MWEs in West Germanic,
North Germanic, Romance, Greek, Slavic, and Finno-Ugric languages, comparing
them to German. Central questions that were discussed for each language family
included the formal distinction between compounds and MWEs (in particular
prosodic, morphological, and syntactic properties), the relationship between
compounding and MWE formation as well as the conclusions concerning the the-
ory of grammar and the lexicon following from these observations. One major
finding is that while there are great similarities as well as differences regarding
compound and MWE formation in the languages of Europe, a cross-linguistically
valid definition of compounds and MWEs is hard to establish, because the lan-
guages differ greatly with respect to both the compound criteria that can be rele-
vantly applied to them, and the relevant types of compound and MWE patterns
and their degree of productivity. The various chapters of this volume provide
in-depth analyses of the situation in the respective languages and language fam-
ilies, also discussing in more detail the relevant implications for the theory of the
lexicon-grammar interface.
References
Anderson, Stephen R. (1982): Where’s morphology? In: Linguistic Inquiry 13. 571–612.
Anward, Jan/Linell, Per (1976): Om lexikaliserade fraser i svenskan. In: Nysvenska studier
55/56. 76–119.
Arnaud, Pierre J. L. (2015): Noun-noun compounds in French. In: Müller, Peter O. et al. (eds.).
673–687.
Bağrıaçık, Metin/Ralli, Angela (2015): Phrasal vs. morphological compounds: Insights from
Modern Greek and Turkish. In: STUF – Language Typology and Universals 68, 3. 323–357.
Bandle, Oscar et al. (eds.) (2005): The Nordic Languages. An International Handbook of the
History of the North Germanic Languages. Vol. 2. (= Handbooks of Linguistics and
Communication Science (HSK) 22.2). Berlin/New York: De Gruyter.
Bauer, Laurie (1998): When is a sequence of two nouns a compound in English? In: English
Language and Linguistics 2. 65–86.
Bauer, Laurie (2006): Morphological productivity. (= Cambridge Studies in Linguistics 95).
Cambridge, UK: Cambridge University Press.
Bauer, Laurie (2009a): Typology of compounds. In: Lieber, Rochelle/Štekauer, Pavol (eds.).
343–356.
Bauer, Laurie (2009b): IE, Germanic: Danish. In: Lieber, Rochelle/Štekauer, Pavol (eds.).
400–416.
Bauer, Laurie (2017): Compounds and Compounding. (= Cambridge Studies in Linguistics 155).
Cambridge, UK: Cambridge University Press.
Bauer, Laurie (to appear): Compounds. In: Aarts, Bas/Bowie, Jillian/Popova, Geri (eds.): Oxford
Handbook of English Grammar. Oxford: Oxford University Press.
Benigni, Valentina/Masini, Francesca (2009): Compounds in Russian. In: Lingue e Linguaggio
8, 2. 171–193.
Bjarnadóttir, Kristín (2017): Phrasal compounds in Modern Icelandic with reference to Icelandic
word formation in general. In: Trips, Carola/Kornfilt, Jaklin (eds.). 13–48.
Booij, Geert (2002): The morphology of Dutch. Oxford: Oxford University Press.
Booij, Geert (2009): Phrasal names. A constructionist analysis. In: Word Structure 2, 2.
219–240.
Booij, Geert (2010): Construction morphology. Oxford/New York: Oxford University Press.
Braunmüller, Kurt (2007): Die skandinavischen Sprachen im Überblick. 3rd ed. Tübingen/Basel:
Francke.
Burger, Harald (1998): Phraseologie. Eine Einführung am Beispiel des Deutschen. Berlin:
Schmidt.
Burger, Harald et al. (eds.) (2007): Phraseologie/Phraseology. Ein internationales Handbuch
zeitgenössischer Forschung/An International Handbook of Contemporary Research.
(= Handbooks of Linguistics and Communication Science (HSK) 28). Berlin/New York: De
Gruyter.
Cetnarowska, Bożena (2015): The lexical/phrasal status of Polish Noun+Adjective or
Noun+Noun combinations and the relevance of coordination as a diagnostic test. In:
SKASE Journal of Theoretical Linguistics 12, 3. 142–170.
Cetnarowska, Bożena (2018): Phrasal names in Polish: A+N, N+A and N+N units. In: Booij, Geert
(ed.): The Construction of Words. Advances in Construction Morphology. (= Studies in
Morphology Series 4). Cham: Springer. 287–313.
Gries, Stefan Th. (2008): Phraseology and linguistic theory: a brief survey. In: Granger,
Sylviane/Meunier, Fanny (eds.): Phraseology. An interdisciplinary perspective.
Amsterdam: Benjamins. 3–25.
Guevara, Emiliano/Scalise, Sergio (2009): Searching for Universals in Compounding. In:
Scalise, Sergio/Bisetto, Antonietta (eds.): Universals of Language Today. Dordrecht:
Springer. 101–128.
Guevara, Emiliano R. (2012): Spanish compounds. In: Probus 24, 1. 175–195.
Gunkel, Lutz/Zifonun, Gisela (2011): Klassifikatorische Modifikation im Deutschen und
Französischen. In: Lavric, Eva/Pöckl, Wolfgang/Schallhart, Florian (eds.): Comparatio
delectat: Akten der VI. Internationalen Arbeitstagung zum romanisch-deutschen und
innerromanischen Sprachvergleich, Innsbruck, 3.–5. September 2008. Frankfurt a. M.:
Lang. 549–562.
Gunkel, Lutz et al. (2017): Grammatik des Deutschen im europäischen Vergleich: das Nominal.
Vol. 2: Nominalflexion, Nominale Syntagmen. (= Schriften des Instituts für Deutsche
Sprache 14). Berlin/Boston: De Gruyter.
Haberland, Hartmut (1994): Danish. In: König, Ekkehard/Auwera, Johan van der (eds.).
313–348.
Haeringen, Coenraad Bernardus van (1956): Nederlands tussen Duits en Engels. 2de druk. Den
Haag: Servire.
Härtl, Holden (2016): Normality at the boundary between word-formation and syntax. In: d’Avis,
Franz/Lohnstein, Horst (eds.): Normalität in der Sprache. In: Linguistische Berichte
Sonderheft 22. Hamburg: Buske. 65–92.
Häusermann, Jürg (1977): Phraseologie. Hauptprobleme der deutschen Phraseologie auf Basis
sowjetischer Forschungsergebnisse. Tübingen: Niemeyer.
Hoffmann, Thomas/Trousdale, Graeme (eds.) (2013): The Oxford Handbook of Construction
Grammar. Oxford: Oxford University Press.
Hohenhaus, Peter (2005): Lexicalization and institutionalization. In: Štekauer, Pavol/Lieber,
Rochelle (eds.). 353–373.
Hüning, Matthias (2010): Adjective + Noun constructions between syntax and word formation in
Dutch and German. In: Onysko, Alexander/Michel, Sascha (eds.): Cognitive Perspectives
on Word Formation. (= Trends in Linguistics. Studies and Monographs 221). Berlin/New
York: De Gruyter. 195–215.
Hüning, Matthias/Schlücker, Barbara (2010): Konvergenz und Divergenz in der Wortbildung.
Komposition im Niederländischen und im Deutschen. In: Dammel, Antje/Kürschner,
Sebastian/Nübling, Damaris (eds.): Kontrastive Germanistische Linguistik
(= Germanistische Linguistik 206–209). Vol. 2. Hildesheim i. a.: Olms. 783–825.
Hüning, Matthias/Schlücker, Barbara (2015): Multi-word expressions. In: Müller, Peter O. et al.
(eds.). 450–467.
Hyvärinen, Irma (2007): Phraseologie des Finnischen. In: Burger, Harald et al. (eds.). Vol. 2.
737–752.
Inghult, Göran (1991): Lexikalische Innovationen in Wortgruppenform. Zu einer Untersuchung
über die Erweiterung des Lexembestandes im Deutschen und Schwedischen. In: Palm,
Christine (ed.): Europhras 90. Akten der internationalen Tagung zur germanistischen
Phraseologieforschung Aske/Schweden, 12.–15. Juni 1990. Uppsala: Almqvist & Wiksell.
101–114.
Jackendoff, Ray (1997): The architecture of the language faculty. Cambridge, MA: MIT Press.
Jackendoff, Ray (2002): Foundations of language. Oxford: Oxford University Press.
Jackendoff, Ray (2008): Construction after construction and its theoretical challenges. In:
Language 84. 8–28.
Jackendoff, Ray (2009): Compounding in the Parallel Architecture and Conceptual Semantics.
In: Lieber, Rochelle/Štekauer, Pavol (eds.). 105–128.
Jackendoff, Ray (2013): Constructions in the Parallel Architecture. In: Hoffmann, Thomas/
Trousdale, Graeme (eds.). 70–92.
Josefsson, Gunlog (1997): On the principles of word formation in Swedish. Lundastudier i
nordisk språkvetenskap 51. Lund: Lund University Press.
Kapatsinski, Vsevolod/Vakareliyska, Cynthia (2013): [N[N]] compounds in Russian: A growing
family of constructions. In: Constructions and Frames 5, 1. 69–87.
Karlsson, Fred (2015): Finnish: an essential grammar. 3rd edition. Milton Park i. a.: Routledge.
[Trans. by Andrew Chesterman].
Kiefer, Ferenc (1990): Noun incorporation in Hungarian. In: Acta Linguistica Hungarica 40, 1–2.
149–177.
Kiefer, Ferenc (1992): Compounding in Hungarian. Rivista di Linguistica 4. 61–78.
Kiefer, Ferenc (2009): Uralic, Finno-Ugric: Hungarian. In: Lieber, Rochelle/Štekauer, Pavol
(eds.). 527–541.
Kiefer, Ferenc (2016): Hungarian. In: Müller, Peter O. et al. (eds.). 3308–3326.
Klinge, Alex (2006): The Origin of Weapons of Mass Destruction. Investigating Traces of Lexical
Formation Patterns in the (Linguistic) History of Europe. In: Nølke, Henning (ed.):
Grammatica: Festschrift in honour of Michael Herslund/Hommage à Michael Herslund.
Frankfurt a. M./New York: Lang. 233–248.
Kolehmainen, Leena/Savolainen, Tiina (2007): Deverbale Verbbildung im Deutschen und im
Finnischen: ein Überblick. Würzburg. (Internet: https://opus.bibliothek.uni-wuerzburg.de/
opus4-wuerzburg/frontdoor/index/index/docId/1937, last access: 30.5.2018).
König, Ekkehard/Auwera, Johan van der (eds.) (1994): The Germanic Languages. London/New
York: Routledge.
Koptjevskaja-Tamm, Maria (2009): Proper-name nominal compounds in Swedish between
syntax and lexicon. In: Rivista di Linguistica 21, 1. 119–148.
Kunter, Gero (2011): Compound stress in English: the phonetics and phonology of prosodic
prominence. (= Linguistische Arbeiten 539). Berlin: De Gruyter.
Lieber, Rochelle/Štekauer, Pavol (eds.) (2009a): The Oxford handbook of compounding. Oxford:
Oxford University Press.
Lieber, Rochelle/Štekauer, Pavol (2009b): Introduction: Status and definition of compounding.
In: Lieber, Rochelle/Štekauer, Pavol (eds.). 3–18.
Liimatainen, Annikki (2008): Untersuchungen zur Fachsprache der Ökologie und des
Umweltschutzes im Deutschen und Finnischen: Bezeichnungsvarianten unter einem
geschichtlichen, lexikografischen, morphologischen und linguistisch-pragmatischen
Aspekt. (= Finnische Beiträge zur Germanistik 22). Frankfurt a. M.: Lang.
Lipka, Leonhard (1977): Lexikalisierung, Idiomatisierung und Hypostasierung als Probleme
einer synchronischen Wortbildungslehre. In: Brekle, Herbert E./Kastovsky, Dieter (eds.):
Perspektiven der Wortbildungsforschung. Beiträge zum Wuppertaler Wortbildungskol-
loquium vom 9.–10. Juli 1976. Bonn: Bouvier. 155–164.
Lüdeling, Anke (2001): On particle verbs and similar constructions in German. Stanford: CSLI
Publications.
Martincová, Olga (2015): Multi-word expressions and univerbation in Slavic. In: Müller, Peter O.
et al. (eds.). 742–757.
Ralli, Angela (2016): Greek. In: Müller, Peter O. et al. (eds.). 3138–3156.
Ralli, Angela/Stavrou, Melita (1998): Morphology-syntax interface: A-N compounds vs. A-N
constructs in Modern Greek. In: Booij, Geert E./Marle, Jaap Van (eds.): Yearbook of
Morphology 1997. Dordrecht: Springer. 243–264.
Ramisch, Carlos (2015): Multiword expressions acquisition. A generic and open framework. New
York i. a.: Springer.
Ricca, Davide (2015): Verb-noun compounds in Romance. In: Müller, Peter O. et al. (eds.).
688–707.
Rijkhoff, Jan (2009): On the co-variation between form and function of adnominal possessive
modifiers in Dutch and English. In: McGregor, William (ed.): The expression of possession.
Berlin/New York: De Gruyter. 51–106.
Rosenbach, Anette (2006): Descriptive genitives in English. A case study on constructional
gradience. In: English Language and Linguistics 10. 77–118.
Saxton, Matthew (2010): Child Language. Acquisition and Development. London: SAGE.
Scalise, Sergio (1992): Compounding in Italian. In: Rivista di Linguistica 4. 175–199.
Scalise, Sergio (ed.) (1992): The Morphology of Compounding. Special issue of Rivista di
Linguistica 4, 1.
Scalise, Sergio/Guevara, Emiliano (2005): The lexicalist approach to word formation and the
notion of the lexicon. In: Štekauer, Pavol/Lieber, Rochelle (eds.). 147–187.
Scalise, Sergio/Vogel, Irene (eds.) (2010): Cross-disciplinary issues in compounding.
Amsterdam/Philadelphia: Benjamins.
Schlücker, Barbara (2014): Grammatik im Lexikon. Adjektiv+Nomen-Verbindungen im
Deutschen und Niederländischen. (= Linguistische Arbeiten 553). Berlin/Boston: De
Gruyter.
Schlücker, Barbara/Plag, Ingo (2011): Compound or Phrase? Analogy in Naming. In: Lingua 121.
1539–1551.
Sinclair, John (1991): Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Štekauer, Pavol/Lieber, Rochelle (eds.) (2005): Handbook of Word-Formation. Dordrecht:
Springer.
Szymanek, Bogdan (2009): IE, Slavonic: Polish. In: Lieber, Rochelle/Štekauer, Pavol (eds.).
464–477.
Szymanek, Bogdan (2017): Compounding in Polish and the absence of phrasal compounding.
In: Trips, Carola/Kornfilt, Jaklin (eds.). 49–79.
Teleman, Ulf (2005): The Standard languages and their systems in the 20th century: Swedish. In:
Bandle, Oscar et al. (eds.). 1603–1626.
Thráinsson, Höskuldur (1994): Icelandic. In: König, Ekkehard/Auwera, Johan van der (eds.).
142–190.
Torp, Arne (2002): The Nordic Languages in a Germanic Perspective. In: Bandle, Oscar et al.
(eds.). 13–24.
Trips, Carola/Kornfilt, Jaklin (2015): Typological aspects of phrasal compounds in German,
English, Turkish and Turkic. In: Language typology and universals (STUF) 68, 3. 281–321.
Trips, Carola/Kornfilt, Jaklin (eds.) (2017): Further investigations into the nature of phrasal
compounding. (= Morphological Investigations 1). Berlin: Language Science Press.
Uluhanov, Igor’ S. (2016): Russian. In: Müller, Peter O. et al. (eds.). 2953–2978.
Van Goethem, Kristel (2009): Choosing between A+N compounds and lexicalized A+N phrases:
The position of French in comparison to Germanic languages. In: Word Structure 2, 2.
241–253.
1 Introduction
Compounds are traditionally defined as being, in the words of Lieber (2010: 43),
“words that are composed of two (or more) bases, roots, or stems”. Multi-word
expressions (also known as multi-word units or items, henceforth MWEs) can be
defined as “lexical items which consist of more than one ‘word’ and have some
kind of unitary semantic or pragmatic function” (Moon 2015: 120). Since all words
(in the sense of ‘lexeme’, which is what I assume Lieber to mean in the cited pas-
sage) are lexical items, the first thing to note is that these two definitions overlap
(pace ibid.: 121). Things called compounds, if they have ‘some kind of unitary
semantic or pragmatic function’, which they can be argued always to have, are
MWEs, although not all MWEs are compounds.
In this chapter, it will be argued that this fuzzy borderline between compounds
and MWEs is real, that there is no generally accepted way of dividing compounds
from MWEs, and that much of this derives from their common function as lexical
items. Furthermore, there is no generally accepted way of dividing compounds
from syntactic phrases, so that it follows that there is no generally accepted way of
dividing MWEs from syntactic phrases. This situation arises partly from the data,
and partly from the varying views of different scholars, who have tried to draw
dividing lines in different places, thus illustrating the lack of commonality of opin-
ion. Because this chapter focusses on the situation in English, the arguments
affect English specifically, and may not all transfer to other languages. No attempt
is made here to generalise to other languages; that is left for another chapter. The
effect is, however, a claim that there is no agreed definition of a compound in Eng-
lish (and possibly not of an MWE, as noted ibid.).
Open Access. © 2019 Bauer, published by De Gruyter. This work is licensed under the Creative
Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
https://doi.org/10.1515/9783110632446-002
In addition, Bauer (1983b) finds that speakers are inconsistent in assigning stress
to (at least some) such expressions, and also notes (Bauer to appear) variable
usage of stress in the speech of newsreaders. Kunter (2011) finds a reasonable
minority of such forms show variable stress. While most authorities now see
stress as not being a reliable guide to the status of such items as compounds
(Giegerich 2004), this has not always been the case, so that some such expres-
sions have seemed to be changing category from compound to non-compound in
an apparently random fashion. Chomsky/Halle (1968), for example, use stress as
definitional for compounds.
Spelling is, to some extent, linked with stress: railway is written as one word
and has forestress, iron bar is written as two and has end-stress. Other factors are
also involved, however: schoolgirl tends to be written as one word, while university
student has to be written as two, despite parallel stress and semantic readings.
Some of the examples in (1) equally show a distinction between stress and
orthography. It is also well known that English orthography is inconsistent when
it comes to writing some compounds: rainforest, rain-forest and rain forest can all
be found in dictionaries. Matters as difficult to quantify as house-style and fash-
ion can influence such spellings. Spelling cannot be criterial for word status in
English. Nonetheless, some scholars use it in this way, either by default (cf. Hall
1964: 134) or to make dealing with the computational analysis of written text pos-
sible (McEnery/Xiao/Tono 2006: 147).
As a third criterion, consider the notion that words allow for global inflec-
tion, but not for inflection which is internal and applies to some element within
the word. If we consider a compound verb like badge-flash (see (2)), we can see
how this works.
In (2) we see that badge-flash can take a past tense which affects the entire entity
badge-flash. However, even if several members of the police made their way to the
scene in this way, we could not change this to *We badges-flashed our way to the
scene. Global inflection is possible, but not internal inflection. There are two
problematic constructions in English in relation to this criterion. The first is illus-
trated by jobs growth, where the first element of the compound has an apparent
plural. Pinker (1999) sees this as sufficient evidence to say that such constructions
are phrasal, not words; others include these as compounds (and, hence, as
words). The other awkward construction in this regard is the classifying genitive
as in cat’s eye (‘reflecting road marker’ or ‘semi-precious stone’). Even if we ignore
the question as to whether the s-genitive in English is inflectional or a clitic (cf.
Bauer/Lieber/Plag 2013: 141 f. for a brief summary), it is not clear whether such
constructions count as single words. They are compound-like in many ways
(Rosenbach 2006), though most scholars exclude them from the set of com-
pounds. Corresponding expressions in other Germanic languages are generally
thought of as compounds, although ‘uneigentlich’ (‘non-genuine, false’) com-
pounds in Grimm’s terminology.
Other criteria for wordhood are frequently used in attempting to determine
whether given constructions are words (compounds) or not. These include the
fixed order of constructions, non-interruptibility of elements, lack of modifica-
tion of internal elements, lack of coordination of internal elements, impossibility
of referring back to individual elements by pronouns, including one, and listed-
ness. Not only do such criteria not define a coherent set of items as words (Bauer
1998), they are often broken in derivatives, whose wordhood is not usually que-
ried. These criteria will be referred to below, as required. The point here is that not
only do the criteria for wordhood not fit compounds particularly well (cf. also
Giegerich 2015; Bauer 2017), they do not allow agreement on what is or is not a
compound in English. The border of compounding is vague partly because the
border of wordhood is vague.
In what follows, a number of constructions will be considered in varying
detail. Some of these constructions will be ones which some scholars see as compounds;
others will be MWEs more loosely defined. The borderline between these
two groups of construction will be shown to be non-principled, with different
theoreticians making different decisions as to what is or is not a compound.
1 Karp, Marshall (2006): The rabbit factory. San Francisco: MacAdam, 179.
3 Formal constructions
3.1 N+N
There are several classes of N+N constructions in English, and while some of them
are regularly considered to be compounds, many of them are equally regularly
considered to be excluded from the category. We can illustrate some of the classes
as in (3).
Names are not usually counted as being compounds. Those in (3c) are generally
seen as instances of apposition, and frequently have a pause and intonation
break between the title and the name (unlike those in (3a) which would other-
wise be parallel). Apposition is usually considered a syntactic construction
rather than a lexical one, and so the examples in (3a–c) and also the examples
in (3i) with common nouns, are excluded from compounds. However, at least
those in (3a) and (3b) must be listed, since they denote individuals and have
little semantic transparency. Those in (3a) appear to be left-headed (Doctor
Johnson is a member of the class of people with title of doctor), while those in
(3b) may not be – it is not clear whether it even makes sense to ask whether Eliz-
abeth Taylor is a member of the set of Elizabeths or the set of Taylors, especially
since asking such a question changes the category of both Elizabeth and Taylor
from proper noun to common noun. The examples in (3e) may also be instances
2 Internet: www.spotlightstores.com/party/party-decorator/room-table/decorating-accessories/
amscan-assorted-silver-cutlery-box/p/BP80402188 (last access: 17 Nov 2017).
3 Internet: https://en.wikipedia.org/wiki/Frank_Mundus (last access: 17 Nov 2017).
3.2 A+N
Again, we can find many classes of construction involving adjectives and nouns.
A+N compounds are usually distinguished from syntactic constructions by their
stress (forestress) and, correspondingly, their orthographic unity, by the fact that
the adjective cannot be submodified or graded, and by the fact that the adjective
can be denied without contradiction. Thus blackbird is a compound by virtue of
its stress, its orthography, the fact that we cannot have a blackerbird or a very
blackbird, and because This blackbird is brown is not a contradiction. The syntac-
tic construction black bird differs from the compound blackbird in all of these
respects. This distinction arises because black in black bird describes, while black
in blackbird categorises.
If we look at intersective adjectives like black, heavy, silly etc. where a black
bird represents the intersection of black things and birds, we discover that they
are not always intersective. A red book may illustrate an intersective use of red,
but a red squirrel does not: red squirrel behaves semantically like blackbird, not
like black bird, despite different stress and orthography. Bauer (2004) points out
that there is a difference in frequency between the forestressed words and the
end-stressed expressions: the forestressed words are more frequent. This would
seem to indicate that there are intersective adjectives used descriptively, intersec-
tive adjectives used to categorise, and intersective adjectives used with forestress
to categorise. Most authorities distinguish the compounds with forestress from
the other two types, but we might equally distinguish the descriptive adjectives
from the categorising ones.
If we now turn to relational adjectives like canine, dental, parental, vernal,
they are not intersective. A canine tooth is not the intersection of canine things
and teeth, but a kind of tooth related in some way to dogs (for fuller discussion
cf. Giegerich 2015). Relational adjectives are rarely descriptive unless they are
figurative or used predicatively, as in his movements were feline, her attitude was
vaguely parental. The precise relationship between the adjective and the noun
has to be discovered by considering the individual example, just as the relation-
ship between nouns in N+N compounds has to be discovered by considering the
individual example: a windmill uses wind power, but a flour mill grinds flour.
Part of the result of this is that relational adjectives are by default categorising.
Nevertheless, there are instances when they, too, can take forestress: consider
for instance dramatic society, mental hospital, primary school. The reason for the
forestress here is not clear. Neither is it clear whether things like mental hospital
are compounds. Scholars disagree on whether A+N constructions with relational
adjectives are compounds or not, but they certainly seem to fulfil a similar pur-
pose. In some cases there are pairs with a modifying noun and a modifying
adjective which may be nearly synonymous (atom bomb, atomic bomb; language
description, linguistic description), while in other instances they contrast in
meaning (a civic centre is not the same as a town centre). Speakers must know
that it is solar flare but sunspot; there does not seem to be a way to predict such
distinctions.
Expressions such as attorney general, court martial, heir apparent, where the
adjective follows the noun it modifies, are usually of French origin, and follow the
French order of noun and adjective. A few such as postmaster general are formed
in English on a French pattern. The pattern does not seem to be productive, so in
principle a full list of these can be given. There seems to be little reason to include
such expressions among compounds, particularly since they are left-headed
though most compounds are right-headed, but they are certainly MWEs.
The examples in (4a) probably imitate a Romance pattern which is no longer pro-
ductive in modern English. However, a similar type is found with the order of the
elements reversed: prick-tease, for example. The type in (4b) has verbs in modify-
ing position, but is endocentric (show room is a hyponym of room). It is often the
case that modifying verbs in forestressed constructions take the -ing form: dining
room, shooting party, walking stick. These are then usually considered to have
nominal first elements. The type in (4c) shows adverbs/prepositions/particles in
modifying position. Things like through-put may also belong here formally,
though they are probably nominalisations of phrasal verbs. Reverse ordered
forms like put-down are also found. The type in (4d) shows alternatives in modi-
fying position, the alternatives being, in these instances, verbs or adverbs. The
type in (4e) shows apparently unlimited syntactic constructions in initial posi-
tion. These expressions do not have to be idiomatic or even familiar. If these are
compounds, though, and most scholars accept that they are, they allow syntactic
structure within word-structure. The types in (4f–g) have already been men-
tioned, with plurals or genitives in the first element. In both cases there is often
an alternative with an unmarked noun in the first position (lineman and linesman
are synonymous; according to the OED tailor’s tack and tailor tack are synony-
mous). At the same time, a genitive first element can contrast with an unmarked
first element, as illustrated in (5) (data from the OED).
3.4 Adjectival compounds
origin). These are most commonly found with verbs of motion as the first verb (go
see, come buy) but go beyond that (I hope see you soon), especially in US English.
Some such constructions can be the base of further derivation, which seems to
imply listedness, if not other features of words (consider go-getter, jump-starter).
3.7 Binomials
Binomials are pairs of words linked usually by and, occasionally by or. They are
normally called binomials only if they are fixed collocations. Thus Monday or
Tuesday would not be considered a binomial, but the examples in (6) would be.
(6) Abbott and Costello, bacon and eggs, bread and butter, cat and mouse,
chalk and cheese, fish and chips, gin and it, kit and caboodle, kith and kin,
life or death, milk and honey, salt and pepper, slap and tickle, sun and sand,
whisky and soda; do or die, kiss and tell, make or break, put up or shut up,
wine and dine; black and blue, free and easy, neat and tidy, sick and tired,
spick and span; as and when, back and forth, far and away ‘by a wide mar-
gin’, far and near ‘everywhere’, now and again
There is quite a large literature on the order of the elements in binomials (for a
good summary cf. Benor/Levy 2006), and the fixedness of the order. Binomials
vary in the degree to which each element presupposes the other. In spick and
span we cannot have either element without the other; black can easily occur
without blue, but black and blue is a fixed expression whose implications go
beyond the colours involved; chalk and cheese collocate only when illustrating
how different two things can be; Abbott and Costello illustrates a collocation
which was originally purely arbitrary, but became more fixed as the team became
more established. They also differ in how easily they can be interrupted: bread
and manuka honey is perfectly possible, but sick and really tired is no longer an
example of the relevant collocation. Again, they differ in how easily the coordi-
nated items can be reversed. Eggs and bacon or bacon and eggs seem to be equally
good (and scrambled can be added to eggs in either ordering), jam and bread is
possible, if slightly unusual (it is found in a song in The Sound of Music, for
instance), chips and fish is mainly used when the chips and the fish are referred
to separately rather than as a single dish. Those binomials that have a figurative
reading cannot in general be interrupted or reversed: bread and butter ‘main
source of income’, salt and pepper ‘colour term’, far and away.
3.8 N+P+N constructions
Part of the question here (and, incidentally, also in French) is the status of items
with internal determiners, such as those in (8). Are they a different construction
by virtue of having an NP (or DP) in second position, or are they a variant of the
same construction?
(8) belle of the ball, birds of a feather, two bites of the cherry, a Jack of all
trades, the man in the moon, the man of the moment, a pain in the neck, the
time of your life, will of the wisp
3.9 Phrasal verbs
Phrasal verbs are usually taken to be syntactic units in English, though many of
them are figurative or idiomatic. Look up is literal when it means ‘raise your eyes
towards the sky’, but idiomatic when it means ‘refer to’ (as in look up a word in the
dictionary) or even ‘improve’ as in business is looking up. Put up is literal in put
your hand up the pipe, figurative in to put someone’s back up (‘annoy’) and idio-
matic in I can put you up in our spare room (‘accommodate’). Note that some
phrasal verbs have two particles, as put up with ‘tolerate’, look up to ‘admire’, but
this construction too can be literal, as in fall out of. Phrasal verbs have syntax-like
behaviour in being interrupted by their direct objects, but are lexical to the extent
that their meaning is not predictable from their elements.
It might be claimed that some of the items mentioned above are simply syntactic
phrases that have become more word-like, by a process usually called univerba-
tion. Since univerbation is a diachronic process that proceeds by degrees, and
since there are a number of different univerbation processes, there are many dif-
ferent kinds of expression which, even if they started out containing two or more
words, are currently considered to be single words. Some examples are given
in (9).
Because these fit the rough definition of a compound given in Section 1, they are
sometimes considered to be compounds. To the extent that the constituent words
are transparent, they might be considered to be MWEs. (Note that bullseye fits
into the type illustrated in (4g) except that it is written as a single word and the
genitive is not overtly marked.) They might also be considered to be single
unanalysable words, as is implied in the term univerbation. Such items span the
borders of MWEs.
4 Functional categories
The last section looked at categories that are more or less formally defined; in this
section other types of category are considered, including formation-types that
lead to MWEs. These are grouped together as ‘functional’ categories, in the sense
that they are not formal, but they are nonetheless a heterogeneous group. In par-
ticular, the first section below scarcely seems to be a category at all, but contrasts
with other categories discussed later.
It may seem trivial that literal interpretations of such constructions exist. For
example, Kim is good at music and maths contains a N+and+N construction
whose interpretation follows from the construction in which the coordinated pair
occurs and the meanings of the words involved. Such examples are typically non-
word-like. Discussion of such types is frequently carried out under the heading of
An expression may also be interpreted figuratively. This is not the place to discuss
the various possible figures of speech, or the distinctions between them. Suffice it
to say that a figurative interpretation is a pragmatic interpretation based on the
literal meaning, but providing an interpretation which is not literal. Consider the
established metaphor a dog’s breakfast. We could interpret that as ‘a morning
meal for a dog’, that is literally, but its established meaning is ‘a mess’, and that
involves pragmatically inferring that where a dog has eaten, things are not tidy. A
king’s ransom means ‘a lot of money’, which is pragmatically inferred from the
amount that would be required to ransom a king. To be on the ropes is a metaphor
from boxing and means ‘to be in a desperate position’. As has been shown in a
number of publications (e. g. Lakoff/Johnson 2003), figurative language is ubiqui-
tous in everyday communication, and appears to be cognitively normal and
effortless: indeed, it is often the sign of brain damage if a listener cannot interpret
figures of speech.
4.3 Idiomaticity
‘hold a conversation’, not by a long chalk ‘fall far short’, be in fine fettle ‘be fit and
healthy’ (fettle is now extremely rare except in this phrase). The important point
about this, though, is that expressions of all kinds can be idiomatic, including
compounds (consider blackmail, yellowhammer ‘bird sp.’) and phrasal verbs
(consider put up with ‘tolerate’, pan out ‘conclude’ – perhaps once figurative, but
not now recuperable).
A different type of idiom is the constructional idiom, a syntactic construction
where the idiomatic semantics is provided by the construction, and the construc-
tion may be filled with varied lexical content (Booij 2002). An example from Eng-
lish is found in (10) (cf. also Philip 2008), where all the examples mean ‘not to be
particularly intelligent’.
Any language will have a large number of recognised expressions which, in some
way, acknowledge the wisdom of past speakers of the language. Some of these are
quotations (from traditional tales, from literary works, from songs, movies or TV
shows, from religious sources); others are proverbial or even family sayings. Their
length and structure are infinitely variable: in principle, an actor or literary scholar
might know the whole of Hamlet by heart and quote from it freely. Quotations are
often abbreviated, mis-quoted or even alluded to. The proverb Too many cooks
spoil the broth may be shortened, as perhaps It’s a case of too many cooks, or, if
someone was complaining about the number of people involved in a project,
someone else might conceivably ask, So how did the broth turn out? Quotations
may often go unrecognised by hearers. Some examples are given in (11).
(11) eye of the needle, fisher of men, the salt of the earth (Biblical); the goose that
lays the golden egg, the grand old duke of York, white rabbits (said on the
first of the month) (folklore); this sceptered isle, pound of flesh, star-crossed
lovers, strange bedfellows (Shakespeare); dim, religious light, a modest proposal,
a truth universally acknowledged (other literary sources); the curate’s
egg, famous last words, lies, damned lies and statistics (non-literary
sources); the early bird, a gift horse, a watched pot (proverbial)
4.5 Abbreviations
Initialisms and acronyms deserve a marginal place in this discussion, as they are
a means by which MWEs turn into single words. In initialisms, an MWE becomes
an orthographic word: FBI is a single orthographic entity, while its origin, Federal
Bureau of Investigation is an MWE. In acronyms, the MWE turns into a new pho-
nological and orthographic word: the MWE North Atlantic Treaty Organization
turns into NATO (/neɪtəʊ/). Although there is a rather old-fashioned spelling con-
vention whereby some of these items may have their individual letters interrupted
by full stops/periods (N.A.T.O.), the more modern orthography stresses the word-
hood of the outcome. For the most successful acronyms, the original MWE
becomes lost, and a new morpheme arises: scuba < self-contained underwater
breathing apparatus.
4.6 Rhyming slang
The essence of rhyming slang is that a word is replaced with a (usually two- or
three-word) phrase which rhymes with the original. In this first stage, non-
MWEs are deliberately replaced by MWEs. The word kids is replaced by dustbin
lids, the word stairs is replaced with apples and pears. Note that there is no
semantic link between the original word and the rhyming replacement, though
occasional examples may be (or may be thought to be) jocularly appropriate,
such as trouble and strife for wife. To make things more difficult, the rhyming
word is then often deleted, so that kids becomes dustbins and stairs becomes
apples and what was an MWE is now replaced by a polysemous lexeme. Although
this is often termed ‘Cockney rhyming slang’, it is not restricted to London English.
Not only is it also found, for instance, in Glasgow, Australia and New Zealand,
but occasional expressions of rhyming slang creep unacknowledged into
the vocabulary of the wider language community: to do bird (bird lime = time [in
prison]), let’s have a butcher’s (butcher’s hook = look), my old china (china plate
= mate), use your loaf (loaf of bread = head), rabbit on (rabbit and pork = talk).
All of these retain the distinctly informal style level of the originals, and form
new idiomatic MWEs.
All the examples provided above are established examples. But rhyming
slang can also be used productively. One website cites Jar Jar Binks for forty winks
(‘a snooze’), clearly postdating the relevant Star Wars movie, and not necessarily
widely known.
4.7 Collocation
Collocations are sets of words which habitually occur together, even if they are
perfectly transparent. A standard example concerns the way in which dry changes
its meaning depending upon what it collocates with, as shown in (13).
Collocations are not always of the same strength. Sometimes the ability to predict
one of the items in the collocation from the other is strong, sometimes it is weak.
This can be measured in terms of the mutual information each element provides
as to the identity of the other element(s) in the collocation (Xiao 2015). This may
complicate the process of deciding what belongs in the lexicon in a theoretical
sense, but does not interfere with the notion that more than just the individual
word might have to be listed.
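As a hedged illustration of how such a measure might be computed, the following Python sketch calculates pointwise mutual information (PMI) for adjacent word pairs in a tokenised text. PMI is only one common way of operationalising mutual information, and the work cited above may use a different formulation; the function name pmi_scores and the toy sentence are invented for this example.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Return PMI (in bits) for every adjacent word pair in `tokens`.

    Assumption: probabilities are simple relative frequencies, with no
    smoothing and no minimum-frequency threshold.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = len(tokens) - 1
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_unigrams
        p_w2 = unigrams[w2] / n_unigrams
        # PMI compares the observed pair probability with the probability
        # expected if the two words occurred independently of each other.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy input, invented for illustration only.
toy = "dry wine and dry toast and dry wine with the toast and the wine".split()
scores = pmi_scores(toy)
print(scores[("dry", "wine")] > scores[("and", "the")])
```

On realistic corpora, pairs that contain very rare words tend to receive inflated scores, so a frequency threshold or an alternative association measure is usually applied alongside PMI.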
Note that while dry in dry ground can be submodified (very dry ground), and
many of these expressions can be interrupted (a dry French wine, dry red-rimmed
eyes), some of them seem to be more word-like (*very dry toast, dry battery does
not appear to allow random insertions).
A particular kind of collocation is that provided by light verbs. The conventional choices are make a
difference, give a lecture, make a mistake, take the opportunity, take a shower,
have a smoke. There does not seem to be any straightforward semantic reason for
the selection of these light verbs, and speakers (including native speakers) will
often use a different one from the one expected, and say things like do a
mistake.
Another similar case is provided by adjectives that take complements, and
then collocate with fixed prepositions, as in afraid of, averse to, different from/
than/to, proud of. The case of different, which becomes a matter of prescription,
shows that the preposition is not always fixed, but generally speaking the preposition
has to be seen as being chosen by the head adjective. This puts such constructions
on the borderline between being lexical combinations and syntactic
structures showing government.
4.8 Formulae
Formulae are the way things are said rather than the way they could be said (cf.
also Sabban 2008). In many European languages, there is an expression which
can be translated as ‘good day’ which is a greeting. In England, good day is a
farewell. In Australia and New Zealand, good day (with a phonetically very much
reduced first syllable) is again a greeting. In the usage of young New Zealanders
around the turn of the millennium, spot you later, and laters were farewells
(Bauer/Bauer 2003). The fact that these are greetings and farewells (as opposed to
other potential expressions which are not, such as until we meet again, till the
next time or soon), with the corresponding increase in usage of these precise
phrases, makes them into formulae. Corresponding to the rather old-fashioned
How do you do? heard in England, How are you doing? can be heard in other parts
of the English-speaking world, but as a day-to-day greeting rather than as a greet-
ing on first introduction. How is it going? is an alternative possibility, but not How
does it go? There are many perfectly grammatical possible ways of saying things
that are never used, and those that are used, and their precise meaning, may be
unexpected.
Formulae, then, are particular types of collocation, with high frequency in
particular social environments. While they have syntactic structure (in the case
of How do you do a rather outmoded syntactic structure), some of them may be
learned as listed, fixed expressions, or have the status of words (as with
good-bye).
4.9 Lexicalisation
5 Discussion
While this wide range of MWEs has to be recognised (however difficult they may
be to systematise), there are a number of expressions which do not appear to be
sufficiently lexical to fit in the category. Any study of n-grams will come up with
expressions like in a, which collocate not because in a is a constituent with its own
meaning, but because all members of the category preposition are typically fol-
lowed by determiner phrases, typically headed by determiners like a in initial
position. The high number of such cross-constituent collocations has thus more
to do with the productivity of syntax than anything lexical. Similarly, colligations,
such as the fact that the verb construct is transitive, are not a matter of lexis but a
matter of grammar (again, perhaps, a matter of government). It is true that
construct a building is likely to be more frequent than construct a daisy, but this has
as much to do with the nature of the world as with the nature of lexical items.
While it makes little sense to suggest that construct demands in its complement
something with a feature [+ constructible], as has been done on occasions, it
makes rather more sense to say that pragmatically the need for a sentence which
contains construct a daisy is likely to be extremely low (although it might be pos-
sible if people were decorating a kindergarten and making flowers out of recycled
material to use as decorations). As McCawley remarked many years ago (McCaw-
ley 1971), if someone says my toothbrush is pregnant, it is unlikely to be their
grammatical competence which is at fault.
The borderline between things which happen to collocate because they are
syntactically likely to arise in similar contexts and what is lexical is not necessarily
an easy one to draw. I tend to think that ‘is like a’ is on the grammatical side, but
Wikberg (2008: 136 f.) makes a case for it on the basis that it is a formula used to
introduce similes.
Borderlines like these, and one mentioned earlier between government and
lexical structure, are potentially problematic, and the entire idea that there are
such borderlines is worthy of further discussion. At one extreme we find a view,
which we can characterise as essentially Chomskian, that virtually everything we
produce is the result of free syntactic rules in operation. The other extreme posi-
tion, and one worth arguing for, would be that there is no such thing as free syn-
tax, but that everything is lexically-driven, with MWEs, fixed phrases and strongly
restrictive constructions accounting for the fact that speakers do not say many
things which might appear to be grammatical. I distrust extreme views, and sus-
pect that there is some of each involved, but that the limits of each require careful
motivation. It seems to me that a line like Carroll’s (1871) ‘Twas brillig and the
slithy toves did gyre and gimble in the wabe shows that there must be some syntax
separate from vocabulary items, while the range of MWEs discussed in the phra-
compounds arise from pieces of syntactic structure being frozen. While anyone is
free to define compounds as they see fit, agreement on any definition which can
determine which of the structures that have been canvassed here are really com-
pounds seems a long way off.
References
Adams, Valerie (2001): Complex words in English. Harlow: Pearson.
Bauer, Laurie (1983a): English word-formation. Cambridge, UK: Cambridge University Press.
Bauer, Laurie (1983b): Stress in compounds. A rejoinder. In: English Studies 64. 47–53.
Bauer, Laurie (1998): When is a sequence of two nouns a compound in English? In: English
language and linguistics 2. 65–86.
Bauer, Laurie (2000): Word. In: Booij, Geert/Lehmann, Christian/Mugdan, Joachim (eds.):
Morphology. An international handbook on inflection and word-formation. Berlin/New
York: De Gruyter. 247–257.
Bauer, Laurie (2001): Compounding. In: Haspelmath, Martin et al. (eds.): Language universals
and language typology. Berlin/New York: De Gruyter. 695–707.
Bauer, Laurie (2004): Adjectives, compounds and words. In: Nordic Journal of English Studies 3,
1. 7–22.
Bauer, Laurie (2016): Re-evaluating exocentricity in word-formation. In: Siddiqi, Daniel/
Harley, Heidi (eds.): Morphological metatheory. Amsterdam/Philadelphia: Benjamins.
461–477.
Bauer, Laurie (2017): Compounds and compounding. Cambridge, UK: Cambridge University
Press.
Bauer, Laurie (to appear): Stressing about the news. In: New Zealand English Journal.
Bauer, Laurie/Bauer, Winifred (2003): Playground talk. Wellington: Victoria University.
Bauer, Laurie/Lieber, Rochelle/Plag, Ingo (2013): The Oxford reference guide to English
morphology. Oxford: Oxford University Press.
Bauer, Laurie/Renouf, Antoinette (2001): A corpus-based study of compounding in English. In:
Journal of English Linguistics 29. 101–123.
Bell, Melanie J. (2014): The English noun-noun construct: a morphological and syntactic object.
In: Ralli, Angela (eds.): Morphology and the architecture of grammar. 59–91. (Internet:
https://geertbooij.files.wordpress.com/2014/02/mmm8_proceedings.pdf, last access:
20.4.2018).
Benor, Sarah/Levy, Roger (2006): The chicken or the egg? A probabilistic analysis of English
binomials. In: Language 82. 233–278.
Booij, Geert (2002): Constructional idioms, morphology and the Dutch lexicon. In: Journal of
Germanic linguistics 14. 301–329.
Carroll, Lewis (1871): Through the looking glass and what Alice found there. London:
Macmillan.
Carstairs-McCarthy, Andrew (2002): An introduction to English morphology. Edinburgh:
Edinburgh University Press.
Chomsky, Noam (1965): Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, Noam/Halle, Morris (1968): The sound pattern of English. New York: Harper & Row.
1 Introduction
This chapter reviews multi-word expressions, compounds, and their mutual rela-
tion regarding their status in grammar and lexicon in contemporary German.1
Both multi-word expressions and compounds are lexical units and morphosyn-
tactically complex. That is, they are made up of a minimum of two words or
stems,2 which sets them apart both from simplex lexemes and from morphologi-
cally complex words derived by other word-formation processes, in particular
derivation and conversion.3 As lexical units, they have the common function of
providing labels for all kinds of concepts. This apparent similarity – which
becomes immediately obvious from the existence of parallel units such as
Frischluft / frische Luft ‘fresh air’ – raises various questions concerning the status,
the function, and the division of labor between multi-word expressions (hence-
forth: MWEs) and compounds, but also regarding the identification and demarca-
tion of these forms. These questions will be discussed in this chapter. To start
with, it has been noted time and again that the dividing line between MWEs and
compounds cannot always be clearly drawn. While many of the problems that are
discussed in the following – such as the theoretical considerations concerning
MWE formation and the status of MWEs and compounds in the mental lexicon –
have cross-linguistic implications, the question of identification and demarcation
of the forms is language-specific. Therefore, we will start our overview with a
brief survey of the relevant properties in German. The chapter is organized as
follows: Section 2 defines the central terms in the context of the object of
1 I would like to thank Geert Booij, Jesús Fernández, Rita Finkbeiner, and Katerina Stathi for
very valuable comments on earlier versions of this chapter.
2 Although the notion of word is known to be notoriously problematic, it is used in most defini-
tions of multi-word expressions, relying (usually without further discussion) on orthography as
the defining criterion. In addition, one also finds other (unspecified) terms such as ‘element’
(Gries 2008). The term ‘stem’ is mentioned here because stems rather than words form the basic
constituents in compounds.
3 Strictly speaking, conversions, although derived by a morphological process, are not morpho-
logically complex.
Open Access. © 2019 Schlücker, published by De Gruyter. This work is licensed under the
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
https://doi.org/10.1515/9783110632446-003.
investigation of the study, in particular the scope of the units known as MWEs. This
section covers general aspects such as the relation between morphology and the
lexicon, as well as MWE formation, the proportion of compounds and MWEs in
the German lexicon, and the relation between both processes with respect to their
function as providing lexical units. Section 3 gives a more detailed overview of
German MWEs and compounds classified according to lexical category. Section 4
discusses the theoretical implications of the findings. The chapter ends with a
brief conclusion in Section 5.
2 General aspects
In his chapter “Idioms and other fixed expressions: Parallels between idioms and
compounds”, Jackendoff (1997a: 164) writes:
Another part of the goal is to show that the theory of fixed expressions is more or less coex-
tensive with the theory of words. Toward this end, it is useful to compare fixed expressions
with derivational morphology, especially compounds, which everyone acknowledges to be
lexical items.
The main reason for investigating MWEs and compounds and their interrelation
is the fact that they are quite similar with respect to (i) their status as lexical units
and their function of providing labels for concepts and (ii) their form, as both are
morphosyntactically complex, i. e. consisting of a minimum of two words or
stems. What follows from this first description is that if MWEs and compounds
are similar in being both lexical units and consisting of two (or more) words/
lexemes, the crucial difference lies in the way these words are combined. Gaeta/
Ricca (2009) have made this point very clear, distinguishing strictly between the
properties of being [± lexical] and [± morphological], where “lexical” means that
a unit has a stable referent, a unitary meaning and possibly a non-negligible
frequency of occurrence (ibid.: 39). While both MWEs and compounds are [+ lexical],
compounds are [+ morphological] but MWEs are [− morphological]. This
means that only those lexical units that are the output of the morphological
operation of compounding can be regarded as compounds, and this operation must in
turn clearly differ from the syntactic operations of the language in question. For
this reason, we will start with a concise description of compounding.
In German, nominal and, to a more limited extent, adjectival compounding
are productive word formation patterns, whereas verbal compounding is regarded
In addition, there are several properties that apply only to specific subtypes of
compounding. To the extent that they are relevant to the present issue they will
be discussed in Section 3.
The properties mentioned distinguish compounds not only from phrases but
also from univerbations in the strict sense (“Zusammenrückung”), such as zulasten
(lit. on burden of, ‘account of’), demzufolge (lit. as a result of this, ‘accordingly’) or
Möchtegern (‘would-be, wannabe’). These lexical units are inseparable and written
in one single word. They are, however, not the result of a word formation process
but rather fossilized phrases. This can be seen in the fact that they can contain
inflected material instead of stem forms, such as lasten (pl.dat.) in zulasten, dem
(dat.sing.) in demzufolge or möchte (1./3.pers.sing.pres.act.) in Möchtegern.
Also, they retain phrasal stress. Contrary to compounding, the formation of such
units is unsystematic and cannot be predicted. Thus, they are lexical but not mor-
phological units. Accordingly, if we rely on the properties of [± lexical] and [± mor-
phological] only, univerbations are no different from MWEs (see below). However,
due to their inseparability and solid spelling they are generally considered words.6
4 Linking elements in German are not inflectional elements although some of them have evolved
diachronically from inflectional affixes, cf. footnote 6.
5 It can be observed that German language users sometimes write compounds as two separate
words (cf. Scherer 2012, for instance) and it has been speculated that this might be an
increasing tendency due to influence from English. This breaks the official German spelling rules,
however.
6 From a diachronic perspective, it can be seen that a particular type of univerbation forms a
close link between MWEs and nominal compounds. In addition to compounds proper that can be
(1a) ein Fass aufmachen (lit. to open a barrel, ‘make a fuss’),7 um ein Haar (lit.
by a hair, ‘very nearly’)
(1b) Dank sagen (lit. say thanks, ‘thank’), leere Menge (‘empty set’), in
Zusammenhang mit (‘in connection with’)
(1c) Knall auf Fall (lit. bang on fall, ‘suddenly’), auf gut Glück (lit. on good luck,
‘on the off chance’)
found since Old High German (and before), a second type of compounds, the so-called ‘genitive
compounds’, or, in Grimm’s terminology, “uneigentliche Komposita” (‘false compounds’) arise
sporadically in Old High German and Middle High German times and become more frequent later
on. They are univerbations of a prenominal genitive construction and, for this reason, contain
genitive case marking. In Early New High German, this pattern becomes productive and collapses
with the older compound type. As a result, the former case markings are reanalyzed as linking
elements, and the newly coined forms are no longer conceived of as univerbations, thus (former)
syntactic patterns, but as word formation proper (cf. Pavlov 1983, for instance). For instance, the
genitive construction (des) menschen herz (‘(the) human’s heart’) is reanalyzed as a nominal
compound Menschenherz (‘human heart’) and the former suffix -en (sing.gen.) is reanalyzed as
a linking element.
7 The German MWE is the result of folk etymology relating to the English verb fuss, due to the
phonological similarity of English fuss and German Fass ‘barrel’ and the equivalence between
German (auf)machen and English make.
As formal and semantic irregularities are not defining criteria of MWEs, their
identification hinges crucially on (a) the function of the combination of words as
a semantic unit and (b) the frequency of occurrence, which means that the fre-
quency of occurrence of the particular combination of words is larger than expect-
ed.8, 9
This definition (and many similar approaches in the literature) has led to a
rather broad view of MWEs that encompasses many different types of lexical
phrasal units, some of which are not regarded as MWEs in older and more traditional
phraseological theory. In particular, collocations, which may have a fully
compositional meaning, are nowadays usually regarded as MWEs, e. g., billige
Kopie (‘cheap copy’), den Kopf schütteln (‘to shake one’s head’), eine Entscheidung
treffen (lit. to hit a decision, ‘to make a decision’). Presumably, they make up
a large part of all MWEs in German. Another group are partially fixed (or: lexically
filled) patterns, that is, patterns that contain open slots that can be filled with
various lexical items to produce new MWEs, cf. (2):
(2a) [X um X] ‘X by X’: Stein um Stein (‘brick by brick’), Jahr um Jahr (‘year
by year’)
(2b) [Wer X (der) Y] (‘he who X, Y’): Wer rastet, der rostet (‘He who rests, rusts’),
Wer suchet, der findet (‘He who seeks, finds’), Wer schreibt, der bleibt (‘He
who writes, remains’)
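The notion of a pattern with open slots can be made concrete with a small sketch. The Python fragment below is a minimal, hypothetical matcher for the pattern [X um X] in (2a): it finds any word that is repeated on both sides of um. It is an illustration only, not a procedure described in the literature cited here; the names X_UM_X and find_x_um_x are invented, and a purely formal match of this kind says nothing about whether a given instance is established as a lexical unit.

```python
import re

# Hypothetical matcher for the partially fixed pattern [X um X]:
# the backreference \1 requires the same word form in both slots.
X_UM_X = re.compile(r"\b(\w+) um \1\b")


def find_x_um_x(text):
    """Return all substrings of `text` that instantiate [X um X]."""
    return [match.group(0) for match in X_UM_X.finditer(text)]


# Invented example sentence containing the two instances from (2a).
sentence = "Sie bauten das Haus Stein um Stein, Jahr um Jahr."
print(find_x_um_x(sentence))
```

The same approach would not extend directly to (2b), where the two slots are filled by different items and the constraint on them is semantic and rhythmic rather than one of identity.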
The observation that some phrasal patterns are systematically and productively
used to form lexical units can already be found in early traditional German phra-
seological research (cf. Häusermann 1977; Fleischer 1982). Quite influentially, the
idea of productive syntactic patterns in the lexicon has been discussed in detail in
cognitive and constructionalist frameworks (cf. Fillmore/Kay/O’Connor 1988;
Jackendoff 1997b; Kay/Fillmore 1999, among many others), often in connection
with the term ‘constructional idiom’ (Jackendoff 1997a; Jackendoff 2002; Booij
2002). Finally, and especially in connection with recent developments in usage-
based and corpus linguistics and rapidly increasing corpus sizes, the idea
emerged that the vast majority of MWEs are indeed realizations of abstract
8 The first criterion can serve to exclude frequently co-occurring sequences such as and the,
which obviously do not form a semantic unit. Yet, it is not clear what exactly a semantic unit is;
Gries (2008: 6), for instance, defines it as “to have a sense just like a single morpheme or word”.
This, however, seems too narrow given the meaning of proverbs or (some) verbal idioms such as
ein Fass aufmachen / make a fuss.
9 Other properties which are in principle compatible with these properties make use of the psy-
cholinguistic dimension, e. g., psycholinguistic stability or retrieval as a whole.
patterns, with numerous relations between these patterns (e. g. Steyer 2015, 2016).
Crucially, the idea of abstract MWE patterns implies that there are also occasional
MWEs, that is, nonce-MWEs that are formed ad hoc, and even potential MWEs
that might be formed according to these patterns but have yet to do so, just as is
the case with occasional and potential compounds. Obviously, such ideas chal-
lenge the original idea of MWEs as idiosyncratic stored items in the lexicon. We
will come back to this issue in Section 4.
For the purpose of the present chapter, some constraints on the range of
MWEs to be discussed are in order. First, MWEs are usually also thought to encom-
pass proverbs, sayings, quotations, and routine formulas, e. g. Good Morning or
Happy Birthday. However, these kinds of MWEs do not denote referents (either
objects or events), but rather have a propositional function due to their sentence
character, or, in the case of routine formulas, a purely pragmatic (communica-
tive) function. As the present discussion focuses on MWEs that parallel com-
pounds, they are excluded in what follows. Similarly, as will be discussed in more
detail in Section 3, we will not be concerned with MWE patterns that systemati-
cally lack equivalent compound forms.
Given the potential functional overlap between MWEs and compounds, the ques-
tions of what share they hold in the (German) lexicon and whether any regulari-
ties can be observed concerning their distribution arise. Obviously, the answers
are determined by various factors: first, they crucially hinge on the definition of
MWEs and the question of which combinations are considered MWEs. Further-
more, we might ask how to deal with occasional and possible/potential forma-
tions, i. e. concrete patterns that might be instantiated from abstract MWE
patterns.
Most remarks in the literature on the distribution of MWEs and compounds
relate to lexical categories. It has often been assumed that verbal MWEs make up
the largest part of German MWEs (e. g., Burger 2001: 34). Nominal MWEs are usu-
ally considered much less frequent (e. g., Barz 1996: 131, 2007: 28; Donalies 2008:
308). According to Fleischer (1996a: 152, 1997a: 17–20), MWEs are most frequent in
the verbal and least frequent in the adjectival domain, with the nominal and the
adverbial domain in between. Fleischer (1996b: 336) and Barz (2007: 28) relate
this distribution to differences in the productivity of compounding (or word-formation
in general) in the respective lexical categories: whereas nominal compounding
is highly productive in German, there are considerably fewer word-formation
patterns in the verbal domain, and verbal compounding in particular is considered
marginal or non-existent. In addition, it has been observed that the distribution
also depends on register: nominal MWEs seem to be much more frequent in
terminology, e. g., medical language or professional titles, than in general usage
(Möhn 1986; Fleischer 1996a; Barz 2007).
However, these assessments about the distributions among the lexical cate-
gories crucially depend on what counts as an MWE. “Classical” verbal idioms
such as jdn. auf die Palme bringen (lit. bring so. on the palm, ‘drive so. nuts’)
stand out for their semantic and morphosyntactic idiosyncrasies and are there-
fore more often perceived as MWEs. Collocations, on the other hand, in particular
nominal ones, have not always been recognized as fixed units, as many of them
have a fully compositional meaning. They have often not been included in dic-
tionaries or phraseological lists. However, inclusion in such dictionaries or lists
usually forms the basis for the sort of assessment mentioned above. Thus, given
a broader view on MWEs like that introduced in the preceding section, it seems
hard to say whether (or to what extent) a distribution of compounds and MWEs
by lexical category can be established at all.
2.3 Relation between MWEs and compounds in the German
lexicon: complementarity or competition?
An old and widespread idea about the lexicon is that it usually does not contain
real synonyms or doublets, which means that the co-existence of compounds and
MWEs with identical meanings and grammatical function/distribution is not
expected (for discussion cf. Haiman 1980, for instance). It has also been assumed
that real doublets only exist between terminology and general vocabulary (Barz
1996: 132). However, this view is probably too strict. Obviously, there are also
examples of “real” doublets within the general lexicon, some of the (often cited)
examples being Schwert des Damokles / Damoklesschwert (‘sword of Damocles’),
Grüntee / grüner Tee (‘green tea’), schwarzer Markt / Schwarzmarkt (‘black mar-
ket’), halbherzig / mit halbem Herzen (‘half-hearted’), although in some cases
there are clear differences in the frequency of use of both forms.10 Also, there
might be regional variation concerning the use of an MWE vs. compound.
As to the differences, MWEs are often assumed to be more expressive than
parallel morphological units, e. g., jdn. übers Ohr hauen (lit. hit so. across the ear)
10 For instance, Schuster (2016: 195) shows that the distribution of schwarzer Markt vs.
Schwarzmarkt (‘black market’) has changed considerably in the period of 1946−2009 [ZEIT cor-
pus], with an initial proportion of the compound of about 10 % and 90 % at the end.
vs. jdn. betrügen, both meaning ‘cheat so.’. Expressivity is often due to metaphor-
ical meaning, e. g. grüne Welle (lit. green wave, ‘phased traffic lights’), blondes
Gift (lit. blond poison, ‘blonde bombshell’), but it may also arise from phonolog-
ical-prosodic properties, like rhyme or alliteration, as in binomial constructions
such as null und nichtig (‘null and void’), hegen und pflegen (‘nurture’) (cf. Flei-
scher 1997b: 164 f.). However, although expressivity and imagery might be the ini-
tial driving forces for the coinage of an MWE, these properties might wear out over
time and the forms are no longer perceived as particularly expressive (cf. Fleis-
cher 1997a). Furthermore, compounds might also have a metaphorical meaning,
such as Dickmops (lit. fat pug, ‘fat person, fatty’), Baumdiagramm (‘tree dia-
gram’), Kuchenhimmel (lit. cake heaven, ‘place that serves excellent cake’).
The question of whether the relation between compounds and MWEs is to be
characterized as complementary or competitive depends on the ideas about the
status and the formation of MWEs. According to the traditional view, MWEs are
not formed by abstract patterns (or rules) in the way compounds are. Rather, their
emergence has been regarded as a secondary, purely semantic process of idioma-
tization (e. g. metaphoric or metonymic) of syntactic units, which might in turn
have an effect on the morphosyntactic properties of the unit in question (e. g.,
Fleischer 1997a: 11; Barz 2007: 31). Barz (1996: 132, 2007: 30) regards MWEs as less
economic than complex morphological units due to their complexity, i. e. the
number of constituent parts, although they are often semantically more explicit
since the relation between the constituents is morphosyntactically expressed,
unlike with compounds. A typical example is an adjectival phrasal simile such as
so rot wie Blut (‘as red as blood’) and the corresponding adjectival compound
blutrot (‘blood-red’) (cf. also Section 3.4). The comparison between ‘blood’ and
‘red’ is expressed explicitly in the phrase while this relation is implicit in the com-
pound and must be inferred by the reader. At the same time, the morphological
counterpart is structurally less complex than the phrasal unit.
According to this view, MWE formation can be regarded as complementary to
compounding and is employed if compounding is not available (cf. Section 2.2) or
(at least in some cases) for the purposes of increasing expressivity (e. g., Fleischer
1997b). However, on a broader view of MWE formation that acknowledges – in
addition to sporadic, secondary idiomatization of phrases – the (widespread)
existence of more abstract MWE patterns, both with and without a compositional
meaning, MWE formation is not complementary to compounding but rather com-
peting or at least on an equal footing. If this is indeed the case, we ought to ques-
tion whether more can be said about the distribution of MWEs and compounds in
the lexicon than the preferences concerning lexical category. In other words: Are
there more (or other) factors influencing or determining the choice between both
patterns?
In recent years, several studies have approached this question for German
with a focus on nominal units, both A+N and N+N. The study by Schlücker/Plag
(2011) adopts an analogical approach, investigating the idea that the choice
between MWEs and compounds depends on the individual lexemes involved. The
study examines the formation of new A+N combinations. It shows that there are
no general preferences for coining new A+N lexical units as either MWE or com-
pound, but that the choice depends on the way the individual adjectives and
nouns have been used before, i. e. either as a compound (e. g. voll (‘full’): Vollbart
‘full beard’, Vollmond ‘full moon’) or an MWE (e. g. offen ‘open’, offenes Geheimnis
‘open secret’, offenes Ohr ‘sympathetic ear’) or both (e. g. rot (‘red’): Rotwein ‘red
wine’, Rotkohl ‘red cabbage’; rote Bete ‘beetroot’, rote Grütze ‘red fruit jelly’).11 Put
simply, constituents that have previously been used in compounds tend to be
realized as compounds when coining new combinations, and those that have pre-
viously been used in MWEs tend to be realized as MWEs. Thus, the choice between
the forms is determined by the existence and number of related similar construc-
tions in the mental lexicon of the language users. This analogical effect has been
shown to be stronger for adjectives than for nouns.12 There is also evidence for the
co-existence of both patterns as well as for analogical effects from the diachronic
perspective. Studying the diachronic development of German A+N sequences
since 1700, Schuster (2016) shows that both patterns have continuously co-ex-
isted and that there is no clear trend towards either of the patterns or the disap-
pearance of the other. Again, the choice for either an MWE or a compound seems
to depend on individual adjectives. Thus, some adjectives consistently form A+N
phrases whereas others always occur in compounds. A third group is productive
in both patterns, which also leads to the formation of doublets, e. g. rotes Wild –
Rotwild (‘red deer’), both of which can be found in 19th century dictionaries (cf.
Schuster 2016: 278). It is only for the third group of adjectives that a diachronic
tendency towards compounding can be observed, as in the case of Rotwild which
is the only acceptable form in present-day language.13
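
The analogical idea described above can be illustrated with a small sketch. It is not the model used by Schlücker/Plag (2011); it only assumes that a new A+N combination takes the form that its adjective and noun have most often taken before. The toy lexicon reuses the examples from the text, while the function names and the two weights are hypothetical (the text states only that the effect is stronger for adjectives than for nouns).

```python
# Toy lexicon of attested A+N units, taken from the examples in the text.
# Each entry: (adjective lemma, noun lemma, realized form).
ATTESTED = [
    ("voll", "Bart", "compound"),   # Vollbart
    ("voll", "Mond", "compound"),   # Vollmond
    ("offen", "Geheimnis", "mwe"),  # offenes Geheimnis
    ("offen", "Ohr", "mwe"),        # offenes Ohr
    ("rot", "Wein", "compound"),    # Rotwein
    ("rot", "Kohl", "compound"),    # Rotkohl
    ("rot", "Bete", "mwe"),         # rote Bete
    ("rot", "Grütze", "mwe"),       # rote Grütze
]

# Hypothetical weights: the only grounded assumption is that the
# adjective counts for more than the noun.
ADJ_WEIGHT = 2.0
NOUN_WEIGHT = 1.0


def analogical_scores(adjective, noun, lexicon=ATTESTED):
    """Sum weighted support for each form from units sharing a constituent."""
    scores = {"compound": 0.0, "mwe": 0.0}
    for adj, n, form in lexicon:
        if adj == adjective:
            scores[form] += ADJ_WEIGHT
        if n == noun:
            scores[form] += NOUN_WEIGHT
    return scores


def predict_form(adjective, noun, lexicon=ATTESTED):
    """Return the better-supported form, or 'undecided' on a tie."""
    scores = analogical_scores(adjective, noun, lexicon)
    if scores["compound"] > scores["mwe"]:
        return "compound"
    if scores["mwe"] > scores["compound"]:
        return "mwe"
    return "undecided"
```

In this sketch an adjective such as voll pulls a new combination towards the compound, offen pulls it towards the MWE, and rot, which is attested equally often in both forms, leaves the decision to the noun or to other factors.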
11 The same holds for the noun; examples are not provided for reasons of space.
12 In addition, morphological and semantic properties also play a role in the determination of
the form, cf. Schlücker/Plag (2011); Schlücker (2014). Regarding semantics, there is a comple-
mentary distribution of metaphorical and metonymic A+N combinations such that the former
are always realized as MWEs (e. g., roter Faden: lit. red thread, ‘common thread’) and the latter (almost) always
as compounds (e. g., Blauhelm ‘Blue helmet’). However, the bulk of A+N combinations have
neither a metaphorical nor a metonymic meaning and are found in both forms.
13 To be sure, the phrase rotes Wild is fully grammatical, as it is formed according to the syntac-
tic rules for a nominal phrase with an adjectival modifier in present-day German. It is however
not a conventional lexical unit denoting the concept of red deer, and thus no MWE.
14 Schlücker/Hüning (2009) deal for the most part with Greek- and Latin-based relational adjec-
tives such as sozial ‘social’ and optimal ‘optimal’. Roth’s (2014, 2015) choice of comparable pat-
terns (i. e. compounds and collocations) relies on the quantitative method of distributional se-
mantics which determines the meaning of an expression on the basis of its context in an
automatic procedure. Expressions with very similar or identical lexical constituents in the
context are considered semantically equivalent, although it is obvious that subtler meaning dif-
ferences cannot be detected in this way.
German has various prepositional MWEs, such as auf Grund (‘due to’), in Anbetracht
(‘in consideration of’). Some of them have morphological counterparts, in
particular derivatives formed by the suffix -lich (‘belonging to X’), e. g. in Bezug
auf – bezüglich (‘pertaining to’), in Hinsicht auf – hinsichtlich (‘regarding’). There
are also morphological counterparts that resemble compounds, often consisting
of a P+N sequence, e. g. aufgrund (lit. onP groundN, ‘due to’), anhand (lit. atP handN,
‘on the basis of’). They are, however, not the output of compounding but the
result of univerbation, that is, they are former phrases that have become fixed
and, as a result, are now written as one word (cf. Section 2.1). This is also obvious
from the phrasal stress pattern of these forms (stress on the nominal head, e. g.
aufgrúnd), in contrast to genuine P+N compounds which have modifier stress,
e. g. Vórdach (lit. in front ofP roofN, ‘porch roof’). In many cases, this transition is
still in progress which means that both writing norms officially co-exist, e. g. zu
Gunsten – zugunsten (‘in favor of’). For the reason just mentioned, however, they are
not instances of MWE/compound doublets. The same holds for grammatical
MWEs such as conjunctions, e. g. wenn auch (‘although’). Although there are a few
non-phrasal counterparts, such as wenngleich (‘albeit’), these are not compounds
but univerbations.
Fleischer (1997b: 149–153) stresses that adverbial MWEs display great structural
variety. Many of them contain prepositions. Some frequent patterns are given in
(3). Note that these examples are diverse regarding syntactic category (so some
are structurally equivalent to the prepositional MWEs in the previous section,
with others equivalent to the binomials discussed in Section 3.5). The various
forms are grouped together due to their common adverbial function, in order to
compare them with adverbial word-formation.
(3a) Prepositional phrases: auf Anhieb (‘straightaway’), in der Tat (lit. in the
deed, ‘indeed’), unter vier Augen (lit. under four eyes, ‘in private’),
von Hause aus (lit. from home out, ‘by nature’)
Various kinds of binomials:
(3b) Conjoined nouns: Tag und Nacht (‘day and night’), bei Nacht und Nebel (lit.
at night and fog, ‘in secrecy’)
(3c) With prepositions: von Zeit zu Zeit (‘from time to time’), von Kopf bis Fuß
(‘from top to toe’), von Haus zu Haus (‘from house to house’)
(3d) Identical constituents (adverbs): durch und durch (‘out and out’), nach und
nach (‘little by little’)
It is obvious (and has also been discussed by Fleischer 1997b) that many of these
MWEs are instantiations of partially fixed abstract patterns (cf. Section 4).
Adverbial compounding, on the other hand, is highly restricted and often not
recognized as a word formation type on its own. Adverbial compounds are only
found with a handful of adverbs and prepositions, in particular directional
adverbs such as hin ‘to, there’ and her ‘to, there’ (cf. Fleischer/Barz 2012), e. g.
herauf (lit. there up, ‘up’), hinüber (lit. there over, ‘over’), dorthin (lit. thereto,
‘there’), daneben (lit. there next, ‘alongside’). However, in some (though not all)
cases these forms seem to be univerbations rather than compounds proper. Also,
contrary to genuine compounds, the head cannot be clearly identified in most
cases and they are not right-headed, as is usual in German. These restrictions
on adverbial compounding can explain the enormous amount and structural
diversity of adverbial MWEs, in particular given the fact that adverbial deriva-
tion is also restricted to a handful of affixes. For the domain of adverbs and
adverbials, this supports the idea of MWE formation as a complementary device
to compounding.
3.3 Complex verbs
The verbal domain is usually regarded as the most diverse and extensive domain of
German MWE formation. Verbal MWEs (the classical “idioms”), either with a fully or
a partially non-compositional meaning, such as bei jdm. einen Stein im Brett haben
(lit. have a stone in so.’s plank, ‘be in so.’s good books’) or den Wald vor lauter Bäumen
nicht mehr sehen (‘not see the wood for the trees’) have long been at the core of
phraseological research. In addition, there are various abstract verbal MWE pat-
terns and verbal collocations (mostly N+V). However, there are no corresponding
verbal MWE and compound patterns, due to the absence of verbal compounding in
German. We will therefore only briefly discuss some patterns, in particular in connection
with the question of the demarcation between syntactic and morphological
verbal units.
The first group consists of light verb constructions. They are either [NP V]VP or [PP V]VP
sequences. All of them have corresponding morphological forms, either simplex
or derived, but no compounds. The correspondence is also obvious as most of
them (though not all) contain a corresponding lexical item, e. g. einen Beschluss
fassen/beschließen (lit. grab a decision, ‘decide’), zur Anzeige bringen / anzeigen
(lit. bring to record, ‘report’), but in Kenntnis setzen / informieren (lit. set in knowl-
edge, ‘inform’). The phrasal and the morphological forms are equal in meaning,
but often differ in argument structure. There are also differences in register as the
phrasal constructions are more formal.
Another group are particle verbs. Particle (or: phrasal) verbs such as anlächeln
‘smile at’, abschicken ‘send off’, austrinken ‘drink up’ have been widely discussed
for German as well as for other Germanic languages in connection with their unclear
status as either morphological or phrasal entities (cf. Los et al. 2012; Dehé 2015;
McIntyre 2015; Booij, this volume; a. o.). The central problem is that they are syntac-
tically and morphologically separable in some contexts, e. g. Er schickt den Brief ab.
(‘He sends the letter off’); past participle: abgeschickt (‘sent off’), i. e. with the
ge-prefix in the middle of the word rather than at the beginning, as usual. Insepa-
rability, however, is usually considered a basic property of morphological units.
Interestingly, it seems that German particle verbs are mainly discussed in morpho-
logical research (often in connection with the question of whether they form a word
formation pattern on their own or not) but are rarely considered in phraseological
research. For English, on the other hand, they are quite naturally also included in
phraseological work, cf. Gries (2008) and Ramisch (2015), for instance.15
Particles in German particle verbs often have prefixal counterparts, but there
are also particles that are homonymous to prepositions, adverbs, adjectives, and
nouns (cf. Fleischer/Barz 2012). Thus, even forms like herumbrüllen (lit. yell
around, ‘yell’), schönreden (lit. talk st. beautiful, ‘sugarcoat’) or totarbeiten (lit.
dead work, ‘work to death’) that on the surface look like compounds since they
involve lexical stems rather than prefixes, are in fact particle verbs since they are
separable.
For this reason, particle verbs have often been regarded as problematic
regarding the demarcation between MWEs and compounds/morphological units.
Whereas the cases discussed so far in this chapter raise the question of the way in
which (clearly) morphological and (clearly) syntactic lexical patterns relate to
15 However, Moon (1998) argues against the classification of particle verbs as verbal MWEs in
English.
each other, they rather demand a solution for the fact that there are also interme-
diate constructions. We will come back to the issue of intermediate construc-
tions in Section 3.5 and 4.
Finally, another unclear, intermediate group are N+V patterns of the type Rad
fahren / radfahren (‘ride a bike’), brustschwimmen (‘breaststroke’) or Eis laufen /
eislaufen (‘ice-skate’). They have been widely discussed in the literature, regard-
ing both their orthography and their morphosyntactic properties. However, con-
trary to particle verbs, they do not seem to form a homogeneous group. Thus,
several co-existing subtypes of these N+V patterns have been identified, with dif-
ferent analyses as either verbal compounds, backformations or incorporation (cf.
Fuhrhop 2007, among many others).
Häcki Buhofer et al. (2014) list numerous adjectival collocations, mostly an adjec-
tive preceded by a modifier (adverb, adjective or other), cf. (4):
(4) streng geheim (lit. strictly secret, ‘top secret’), bitter nötig (‘urgently
necessary’), geradezu klassisch (‘almost classical’), verschwindend klein
(‘vanishingly small’), spielend leicht (lit. playing easy, ‘easily’), furchtbar
traurig (‘terribly sad’), immens wichtig (‘immensely important’)
(5) dunkelrot (‘dark red’), tiefrot (lit. deep red, ‘bright red’), heilfroh (lit.
salvation glad, ‘really glad’), stinkfaul (lit. stinking lazy, ‘bone-idle’),
grundverkehrt (‘fundamentally wrong’), hochbegabt (‘highly talented’)
However, real doublets are rare, e. g., schwerkrank – schwer krank (lit. heavily ill,
‘critically ill’). In addition to such compounds having a gradational meaning,
adjectival compounds very often have a determinative meaning, that is, the mod-
ifier specifies the property denoted by the adjectival head, often, though not
always, in a comparative way, cf. (6).
(6) graublau (‘grey-blue, powderblue’), hautnah (lit. skin close, ‘very close’),
schneeblind (‘snow-blind’), butterweich (lit. butter soft, ‘beautifully soft’)
Thus, the morphological and the syntactic units discussed above only partially
overlap in the semantically restricted domain of gradation and cannot generally
be regarded as competing patterns.16
In addition to the adjectival collocations as in (4), there are also partially
fixed MWE patterns in the adjectival domain. One of them is the adjectival phrasal
simile as in (7) (cf. Burger 2015: 56 f.; Hüning/Schlücker 2015).17 It is a typical
example of a partially filled MWE. The property denoted by the adjective is – by
means of the comparative conjunction wie ‘as’ – compared to a reference value
provided by the noun.
(7) [(so) A wie (ein) N] ([(as) A as (an) N]): so weich wie Seide (‘as soft as silk’)
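
The partially filled pattern in (7) has two fixed or optional elements (so, wie, an indefinite article) and two open slots (A and N). As a rough illustration only, it can be written as a regular expression that recognizes instances of the pattern and returns the fillers of the open slots. The names SIMILE and parse_simile are hypothetical, and the sketch makes simplifying assumptions: it accepts any word in the adjective slot, treats a capitalized word as the noun, and does not check whether the combination is conventionalized.

```python
import re

# [(so) A wie (ein) N]: optional "so", open adjective slot, fixed "wie",
# optional indefinite article, open noun slot (capitalized in German).
SIMILE = re.compile(
    r"^(?:so\s+)?(?P<adj>\w+)\s+wie\s+"
    r"(?:(?:ein|eine|einen|einem|einer)\s+)?"
    r"(?P<noun>[A-ZÄÖÜ]\w*)$"
)


def parse_simile(phrase):
    """Return (adjective, noun) if the phrase fits the pattern, else None."""
    match = SIMILE.match(phrase.strip())
    if match is None:
        return None
    return match.group("adj"), match.group("noun")
```

The sketch deliberately stops at recognition. Deriving the corresponding N+A compound is not a mechanical step, since it may involve linking elements (seidenweich) and, as the starred forms in (8) show, may not be available at all.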
(8) (so) stumm wie ein Fisch / *fischstumm (lit. as mute as a fish, ‘as mute as a
maggot’)
(so) sanft wie Regen / *regensanft (‘as soft as rain’)
16 For some examples in (5) and (6) it may be a matter of debate whether they only have a gra-
dational or a determinative meaning or both, e. g. dunkelrot (‘dark red’). The crucial point here is,
however, that the determinative meaning is not available for the syntactic pattern and that for
this reason there is only partial overlap between the morphological and the syntactic pattern.
17 Adjectival binomials form another pattern, cf. (i). However, nominal, verbal and adverbial
binomials seem to be much more frequent than adjectival ones. Yet another pattern is given in
(ii), cf. Fleischer (1997b: 149). However, the patterns do not have a direct morphological counter-
part, neither regarding form nor semantics.
(i) [A und A] ([A and A]): fix und fertig (lit. fix and ready, ‘beat, strung out’), still und leise (‘silent and
quiet’)
(ii) [zum + infinitive + A] ([to the + infinitive + A]): zum Weinen schön (lit. to the crying beautiful, ‘mov-
ingly beautiful’), zum Bersten voll (lit. to the bursting full, ‘full to bursting’)
18 Interestingly, adjectival phrasal similes and corresponding N+A compounds also exist in
other languages (cf. Finkbeiner/Schlücker, this volume), such as Dutch (cf. Booij, this volume),
Italian (cf. Masini, this volume), and Finnish (cf. Hyvärinen, this volume).
(so) dumm wie Brot / *brotdumm (lit. as dumb as bread, ‘as thick as a
brick’)
(so) frech wie Dreck / *dreckfrech (lit. as cheeky as dirt, ‘as bold as brass’)
On the other hand, there are also compounds that lack corresponding phrasal
comparisons. In these cases, the phrasal expressions are not ungrammatical but
are not conventionalized lexical units and therefore much rarer, as can be seen
from corpus data, cf. (10).19
The distribution of forms is also dependent on the context, as discussed for A+N
sequences in Section 2.3. Whereas both patterns can be used predicatively or
adverbially, only compounds can occur in attributive position. Thus, although
the phrasal pattern might be more expressive, especially since it also allows non-
sensical, apparently unmotivated comparisons which compounds generally do
not (e. g. frech wie Dreck, dumm wie Brot, cf. (8)),20 compounds are more versatile
concerning their syntactic distribution.
Furthermore, it has been assumed that both the phrasal and the morpholog-
ical pattern have developed a semantic subpattern with an intensifying rather
than a comparative meaning (cf. Hüning/Booij 2014; Hüning/Schlücker 2015).
Thus, in cases like hart wie Stein / steinhart (‘rock-hard’), stark wie ein Bär / bärenstark
(lit. as strong as a bear, ‘strong as an ox’) the noun does not provide an
actual measure for comparison but rather functions as an intensifier (‘very hard’,
‘very strong’). This intensifying meaning is available for both phrasal similes and
compounds, although it is not entirely clear under which condition comparative
patterns develop an intensifying meaning. Importantly, neither the phrasal nor
the morphological pattern always has this intensifying meaning. For instance,
the adjective weich ‘soft’ occurs in numerous comparative patterns, both phrasal
and morphological, and all of them have a comparative rather than an intensify-
ing meaning. More specifically, two subgroups can be observed, one relating to
the softness of the surface and the other to the softness of the substance, cf.
(11)–(12).
(11) surface
(12) substance
In these cases, the various measures of comparison are literally present, thus
samtweich is different from seidenweich in the way velvet is different from silk. In
particular, the groups in (11) and (12) have clearly different meanings and cannot
be used interchangeably. It might then be concluded that an intensifying mean-
ing can only develop if only one comparative measure is conventionalized, as in
the case of hart (hart wie Stein / steinhart) and stark (stark wie ein Bär / bärenstark),
and not several.21
21 There are also intensifying modifiers in compounds that have developed into a productive
intensifying pattern such that the modifier has completely lost its literal meaning, often dis-
cussed in connection with the term affixoid. A case in point is the intensifier stock ‘stick’ which
first occurred in morphological and phrasal comparisons such as stocksteif / steif wie ein Stock
‘as stiff as a stick’ but later, after having developed an intensifying meaning, was used as an in-
tensifier of other, totally unrelated adjectives, e. g. stockdunkel (‘very dark’), stockbesoffen (‘very
drunk’) (cf. Hüning/Booij 2014; Hüning/Schlücker 2015). The abovementioned example of grund-
(‘ground’) (cf. (5)) seems to be a similar case.
These observations lead to the conclusion that phrasal similes and adjectival
compounds are an example of the co-existence of corresponding phrasal and
morphological lexical structures. They show that blocking as a principle con-
trolling the lexicon does not seem to be as strong as sometimes assumed. Also, as
both patterns share lexical material and semantic subgroups (comparative/inten-
sifying) they cannot be regarded as complementary. Rather, it can be assumed
that both patterns as well as their instantiations are related to each other via their
constituents and their meanings. The choice between either form in the case of
doublets as in (9), (11), and (12) is likely to be determined by expressivity (in favor
of the phrasal structure) as well as syntactic flexibility and conciseness (favoring
the compound), but also other factors determined by the actual context, e. g. sen-
tence length (cf. Section 2.3).
(13a) Postnominal genitives: Schlaf der Gerechten (‘sleep of the just’), Geschenk
des Himmels (lit. gift from heaven, ‘godsend’), Macht der Gewohnheit
(‘force of habit’)
(13b) Prenominal genitives: des Rätsels Lösung (lit. the puzzle’s solution, ‘the
answer to this problem’)
(13c) Prepositional constructions: Dame von Welt (lit. lady of world, ‘sophisti-
cated woman’), Nerven aus Stahl (‘nerves of steel’)
(13d) Close apposition: Häufchen Elend (lit. heap misery, ‘picture of misery’),
Vater Staat (lit. father state, ‘Uncle Sam’)
(13e) Binomials: Grund und Boden (lit. ground and soil, ‘property’), Sack und
Pack (‘bag and baggage’)
(14b) gelbes Trikot (‘yellow jersey’), echte Grippe (lit. real flu, ‘influenza’),
schwarzes Brett (lit. black board, ‘notice board’)
(15a) N+A constructions: Forelle blau (lit. trout blue, ‘blue trout’), Sonne pur (lit.
sun pure), Rahmspinat tiefgefroren (lit. cream spinach deep-frozen)
(15b) [ein N1 von einem/einer N2] (‘[an N1 of an N2]’): ein Berg von einem Mann (lit.
a mountain of a man, ‘a man like a mountain’), eine Null von einem Stürmer
(lit. a null of a striker, ‘a useless striker’), ein Arsch von einem Professor (lit.
a butt of a professor, ‘an idiot of a professor’)23
(15c) [N1 von N2] (‘[N1 of N2]’): Salat von Flusskrebsen (lit. salad of crayfish), Gratin
von Tomaten (lit. gratin of tomatoes), Suppe von Spinat und Bärlauch
(lit. soup of spinach and wild garlic)
22 This view is somewhat simplified as there does not seem to be a strict border between con-
ventionalized binomials and free coordinative constructions. It can be observed that nouns in
occasional binomials do not occur with determiners, but their internal order is interchangeable
(cf. D’Avis/Finkbeiner 2013, for instance), so they might be regarded as in-between forms.
23 I owe the last two examples to Rita Finkbeiner.
4 Theoretical implications
In the past decades of phraseological research it has become obvious that the
existence of abstract, partially fixed phrasal patterns in the lexicon is not
restricted to a handful of MWE patterns, such as binomials, but rather seems to be
a fundamental characteristic of MWEs more generally. Such patterns are assumed
to underlie MWEs both with and without a compositional meaning and both with
and without deviant phonological, morphological, or syntactic properties.
The crucial point here is that under this view, MWE patterns are syntactic
patterns in the lexicon, and thus are lexical patterns on a par with morphological
ones. Booij (2002, 2010) argues that constructional idioms are syntactic expres-
sions that function as alternatives to morphological expressions. In his defini-
tion, constructional idioms are
This definition can capture many pattern-like, partially-fixed MWEs as, for
instance, in (2), (7), or (15b). In addition, it also covers other, more grammatical
kinds of MWEs such as analytic causative constructions or analytic progressives
(cf. Booij 2002, 2010). These are productive patterns with the same function as
their morphological, synthetic counterparts and, just like these morphological
counterparts, their productivity can be shown to be subject to certain restrictions.
24 For further details, including an analysis of the adjective as either A0 or AP, cf. Booij (2010:
176 ff.) and Schlücker (2014: 177 ff.).
structions that have both phrasal and morphological properties like the syntactic
compounds discussed at the end of Section 3.5 (in addition to clearly morpholog-
ical and clearly syntactic patterns). They can be regarded as a link or transitional
category between morphological and syntactic lexical patterns. In other words,
morphological and syntactic lexical patterns form a continuum and these inter-
mediate constructions are situated in the middle.
In sum, treating MWEs in the way advocated here has a crucial impact on
ideas about the structure of the lexicon and the division of labor between mor-
phology and syntax.
5 Conclusion
This chapter has provided an overview of German MWEs from the perspective of
relating MWEs and MWE formation to compounds and compounding. It has been
shown that in German, MWEs for the most part can be clearly distinguished from
compounds on formal grounds. This chapter has focused on MWEs that have – or
at least could have in principle – corresponding compounds with a similar meaning
and function. In general, it can be seen that the proportion of compounds and
MWEs differs between lexical categories. These differences – or at least some of
them – can be explained by the tendency to avoid synonymous expressions
in the lexicon. On the other hand, however, it has also become clear that
there are numerous parallel and thus competing abstract patterns and even dou-
blets on the level of specific forms.
From a theoretical perspective, it has been argued that MWEs should not gen-
erally be regarded as individual and idiosyncratic formations that are derived
from “regular” syntactic phrases in a secondary process of idiomatization and
lexicalization. Instead – and in accordance with numerous findings in recent lit-
erature – it can be assumed that abstract patterns underlie MWE formation and
that, therefore, MWE formation can be regarded as being on a par with word for-
mation. Thus, just as there are abstract morphological patterns for the formation
of lexical units there are also syntactic ones.
References
Barz, Irmhild (1996): Komposition und Kollokation. In: Knobloch/Schaeder (eds.). 127–146.
Barz, Irmhild (2007): Wortbildung und Phraseologie. In: Burger, Harald et al. (eds.):
Phraseology. Vol. 1. Berlin/New York: De Gruyter. 27–36.
Booij, Geert (2002): Constructional Idioms, Morphology, and the Dutch Lexicon. In: Journal of
Germanic Linguistics 14, 4. 301–329.
Booij, Geert (2010): Construction morphology. Oxford/New York: Oxford University Press.
Burger, Harald (2001): Von lahmen Enten und schwarzen Schafen. Aspekte nominaler
Phraseologie. In: Häcki Buhofer, Annelies/Burger, Harald/Gautier, Laurent (eds.):
Phraseologiae Amor. Aspekte europäischer Phraseologie. Festschrift für Gertrud Gréciano
zum 60. Geburtstag. Hohengehren: Schneider. 33–42.
Burger, Harald (2015): Phraseologie: eine Einführung am Beispiel des Deutschen. 5th ed.
(= Grundlagen der Germanistik 36). Berlin: Schmidt.
Culicover, Peter W./Jackendoff, Ray/Audring, Jenny (2017): Multiword Constructions in the
Grammar. In: Topics in Cognitive Science 9, 3. 552–568.
D’Avis, Franz/Finkbeiner, Rita (2013): “Podolski hat Vertrag bis 2007, egal, ob wir in der Ersten
oder Zweiten Liga spielen.” Zur Frage der Akzeptabilität einer neuen Konstruktion mit
artikellosem Nomen. In: Zeitschrift für germanistische Linguistik 41, 2. 212–239.
Dehé, Nicole (2015): 35. Particle verbs in Germanic. In: Müller et al. (eds.). 611–626.
Donalies, Elke (2008): Sandstrand, sandy beach, plage de sable, arenile, piaskowy plaża,
homokos part. Komposita, Derivate und Phraseme des Deutschen im europäischen
Vergleich. In: Deutsche Sprache 36, 4. 305–323.
Dürscheid, Christa (2002): “Polemik satt und Wahlkampf pur” – Das postnominale Adjektiv im
Deutschen. In: Zeitschrift für Sprachwissenschaft 21, 1. 57–81.
Fillmore, Charles J./Kay, Paul/O’Connor, Mary Catherine (1988): Regularity and idiomaticity in
grammatical constructions. The case of “let alone”. In: Language 64, 3. 501–538.
Finkbeiner, Rita (2008): Zur Produktivität idiomatischer Konstruktionsmuster. Interpretier-
barkeit und Produzierbarkeit idiomatischer Sätze im Test. In: Linguistische Berichte 216,
4. 391–430.
Fleischer, Wolfgang (1982): Phraseologie der deutschen Gegenwartssprache. Leipzig: VEB
Bibliographisches Institut.
Fleischer, Wolfgang (1996a): Phraseologische, terminologische und onymische Wortgruppen
als Nominationseinheiten. In: Knobloch/Schaeder (eds.). 147–170.
Fleischer, Wolfgang (1996b): Zum Verhältnis von Wortbildung und Phraseologie im Deutschen.
In: Korhonen, Jarmo (ed.): Studien zur Phraseologie des Deutschen und des Finnischen II.
Bochum: Universitätsverlag Brockmeyer. 333–344.
Fleischer, Wolfgang (1997a): Das Zusammenwirken von Wortbildung und Phraseologisierung in
der Entwicklung des Wortschatzes. In: Wimmer, Rainer/Berens, Franz-Josef (eds.):
Wortbildung und Phraseologie. Tübingen: Narr. 9–24.
Fleischer, Wolfgang (1997b): Phraseologie der deutschen Gegenwartssprache. 2nd ed.
Tübingen: Narr.
Fleischer, Wolfgang/Barz, Irmhild (2012): Wortbildung der deutschen Gegenwartssprache. 4th
ed. Berlin: Schmidt.
Fuhrhop, Nanna (2007): Verbale Komposition: Sind brustschwimmen und radfahren Komposita?
In: Kauffer, Maurice/Métrich, René (eds.): Verbale Wortbildung im Spannungsfeld
zwischen Wortsemantik, Syntax und Rechtschreibung. Tübingen: Narr. 49–58.
Gaeta, Livio/Ricca, Davide (2009): Composita solvantur: Compounds as lexical units or
morphological objects? In: Gaeta, Livio/Grossmann, Maria (eds.): Compounds between
syntax and lexicon. Special Issue of Italian Journal of Linguistics/Rivista di Linguistica 21, 1.
35–70.
Compounds and multi-word expressions in German 93
Gries, Stefan Th. (2008): Phraseology and linguistic theory: a brief survey. In: Granger,
Sylviane/Meunier, Fanny (eds.): Phraseology. An interdisciplinary perspective.
Amsterdam/Philadelphia: Benjamins. 3–25.
Häcki Buhofer, Annelies et al. (2014): Feste Wortverbindungen des Deutschen. Kollokationen-
wörterbuch für den Alltag. Tübingen: Francke.
Haiman, John (1980): The Iconicity of Grammar: Isomorphism and Motivation. In: Language 56,
3. 515–540.
Häusermann, Jürg (1977): Phraseologie: Hauptprobleme der deutschen Phraseologie auf der
Basis sowjetischer Forschungsergebnisse. Tübingen: Narr.
Hüning, Matthias/Booij, Geert (2014): From compounding to derivation. The emergence of
derivational affixes through “constructionalization”. In: Folia Linguistica 48, 2. 579–604.
Hüning, Matthias/Schlücker, Barbara (2015): 24. Multi-word expressions. In: Müller et al. (eds.).
450–467.
Jackendoff, Ray (1997a): The architecture of the language faculty. Cambridge, UK: Cambridge
University Press.
Jackendoff, Ray (1997b): Twistin’ the night away. In: Language 73, 3. 534–559.
Jackendoff, Ray (2002): Foundations of language. Oxford: Oxford University Press.
Kay, Paul/Fillmore, Charles J. (1999): Grammatical constructions and linguistic generalizations:
The What’s X doing Y? construction. In: Language 75. 1–33.
Knobloch, Clemens/Schaeder, Burkhard (eds.) (1996): Nomination, fachsprachlich und
gemeinsprachlich. Opladen: Westdeutscher Verlag.
Los, Bettelou et al. (2012): Morphosyntactic change: a comparative study of particles and
prefixes. (= Cambridge Studies in Linguistics 134). Cambridge, UK: Cambridge University
Press.
McIntyre, Andrew (2015): 23. Particle-verb formation. In: Müller et al. (eds.). 434–449.
Möhn, Dieter (1986): Determinativkomposita und Mehrwortbenennungen im deutschen
Fachwortschatz. In: Jahrbuch Deutsch als Fremdsprache 12. 111–133.
Moon, Rosamund (1998): Fixed expressions and idioms in English: a corpus-based approach.
Oxford/New York: Oxford University Press.
Motsch, Wolfgang (2004): Deutsche Wortbildung in Grundzügen. 2nd ed. Berlin/New York: De
Gruyter.
Müller, Peter O. et al. (eds.) (2015): Word-formation. An international handbook of the
languages of Europe. Vol. 1 (= Handbooks of Linguistics and Communication Science (HSK)
40.1). Berlin/Boston: De Gruyter.
Pavlov, Vladimir M. (1983): Zur Ausbildung der Norm der deutschen Literatursprache im Bereich
der Wortbildung (1470–1730): Von der Wortgruppe zur substantivischen Zusammen-
setzung. (Zur Ausbildung der Norm der deutschen Literatursprache (1470–1730) VI).
Berlin: Akademie.
Ramisch, Carlos (2015): Multiword expressions acquisition: a generic and open framework. New
York: Springer.
Roth, Tobias (2014): Wortverbindungen und Verbindungen von Wörtern. Lexikografische und
distributionelle Aspekte kombinatorischer Begriffsbildung zwischen Syntax und
Morphologie. Tübingen: Narr.
Roth, Tobias (2015): Kompositum oder Kollokation? Konkurrenz an der Syntax-Morpholo-
gie-Schnittstelle. In: Schmidlin, Regula/Behrens, Heike/Bickel, Hans (eds.):
Sprachgebrauch und Sprachbewusstsein. Berlin/Boston i. a.: De Gruyter. 155–176.
Scherer, Carmen (2012): Vom Reisezentrum zum Reise Zentrum. Variation in der Schreibung von
N+N-Komposita. In: Gaeta, Livio/Schlücker, Barbara (eds.): Das Deutsche als komposi-
tionsfreudige Sprache. Strukturelle Eigenschaften und systembezogene Aspekte. Berlin/
New York: De Gruyter. 57–81.
Schlücker, Barbara (2014): Grammatik im Lexikon. Adjektiv+Nomen-Verbindungen im
Deutschen und Niederländischen. (= Linguistische Arbeiten 553). Berlin/Boston: De
Gruyter.
Schlücker, Barbara/Hüning, Matthias (2009): Compounds and phrases. A functional
comparison between German A+N-compounds and corresponding phrases. In: Italian
Journal of Linguistics/Rivista di Linguistica 21, 1. 209–234.
Schlücker, Barbara/Plag, Ingo (2011): Compound or phrase? Analogy in naming. In: Lingua 121,
7. 1539–1551.
Schuster, Saskia (2016): Variation und Wandel. Zur Konkurrenz morphologischer und syntak-
tischer A+N-Verbindungen im Deutschen und Niederländischen seit 1700. (= Konvergenz
und Divergenz (KuD) 4). Berlin/Boston: De Gruyter.
Steyer, Kathrin (2015): Patterns. Phraseology in a state of flux. In: International Journal of
Lexicography 28, 3. 279–298.
Steyer, Kathrin (2016): Corpus-driven Description of Multi-word Patterns. In: Pastor, Gloria
Corpas et al. (eds.): Workshop Proceedings “Multi-Word Units in Machine Translation and
Translation Technologies” (MUMTTT2015). Genf: Editions Tradulex. 13–18.
Geert Booij
Compounds and multi-word expressions
in Dutch
1 The existence of such a wide range of MWEs also raises the psycholinguistic question of which
role they play in lexical processing. As far as Dutch is concerned, there are a number of
psycholinguistic studies (Levelt/Meyer 2000; Sprenger 2003; Sprenger/Levelt/Kempen 2006;
Nooteboom 2011) to which the reader is referred. However, this psycholinguistic dimension will be left
out of consideration here.
Open Access. © 2019 Booij, published by De Gruyter. This work is licensed under the Creative
Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
https://doi.org/10.1515/9783110632446-004
In most cases, these two structural options for creating names complement
each other, but there is also some competition. A comparison of these two options
provides insight into the organization of grammar, the role of the lexicon, and the
division of labour between morphological and syntactic devices.
The topic broached in this article may be qualified as a study of the relation
between compounding and forms of ‘periphrastic word formation’. The latter
term is used in Booij (2002c) as a characterization of the function of Dutch parti-
cle verbs. Traditionally, the term ‘periphrasis’ is applied to word combinations
that fill cells of inflectional paradigms, for instance the cells for the perfect tense
forms of Dutch verbs, combinations of an auxiliary (hebben ‘to have’ or zijn ‘to
be’) and a past participle. As we will see below, phrasal word combinations can
be used to fill in certain gaps in the word formation system and compete with
synonymous complex words. This is the idea of complementarity between mor-
phological and phrasal lexical units.
Investigating this relationship also makes sense from a diachronic perspec-
tive, since syntactic word combinations are the historical source of the various
types of compounding that we find in Germanic languages like Dutch. Hence, it is
important to understand the differences and similarities between phrasal and
morphological constructions, and it may not always be easy to make this distinc-
tion due to this historical source of compounds. This demarcation problem has
been pointed out by Hermann Paul in chapter XIX of his Prinzipien der Sprachgeschichte
(Paul 1898), where he argues that “[d]er Uebergang von syntaktischem
Gefüge zum Kompositum ist ein so allmählicher, dass es gar keine scharfe Grenzlinie
zwischen beiden gibt” (ibid.: 304) [‘the transition from syntactic structure to
compound is so gradual that there is no sharp boundary between the two’]. Paul’s
observation on the blurred boundary between phrases and compounds implies that we need to investigate in more
detail how we can distinguish compounds from phrases with a similar form and
function. In this article, I will therefore first discuss the formal demarcation of
compounds from phrases (Section 2). In Section 3, the naming functions of vari-
ous types of compounds and their phrasal counterparts are discussed in detail.
Section 4 shows how syntax plays a role besides compounding in the construction
of complex numeral expressions. Section 5 briefly discusses what these
empirical findings imply for a proper theory of the organization of grammar, and
why Construction Morphology (CxM) offers an insightful account of the relevant
facts.
The N+N and A+A phrases in (1) do not have a naming function; they are descriptive
in nature. The A+N phrase rode wijn can be used as a name for a particular
type of wine, or as a description. Yet, I discuss these phrases here because we are
focusing on the formal differences between compounds and phrases, whether
with a naming or a descriptive function.
Not all types of Dutch compounds have a counterpart in phrasal form; this
applies to the following types:
In these cases we cannot find phrasal counterparts because verbs cannot modify
a nominal head, and nouns and verbs cannot modify adjectives in pre-adjectival
position. Hence, for these types of word combinations there is no phrasal inter-
pretation possible, and thus, the demarcation issue does not arise.
The first criterion that comes to mind for the demarcation of words and phrases is
that of Lexical Integrity. The criterion of Lexical Integrity can be defined as fol-
lows: ‘Syntactic rules cannot manipulate parts of words’. In other words, words
are islands for syntactic operations. This narrow definition of Lexical Integrity as
being restricted to syntactic operations does not exclude the possibility that the
internal structure of words is accessible for other purposes, such as semantic
interpretation, as should be the case (cf. Booij 2009b for detailed discussion of
various definitions of Lexical Integrity).
The word combinations listed as phrases in (1) all allow for syntactic splits:
In the cases in (3a), the nominal head can be modified additionally, and hence we
get a syntactic split between the first and the second word. The same applies to
the adjectival head in (3b). The three verbal phrases in (3c) are all examples of
so-called separable complex verbs (cf. Section 3.3). The non-verbal part is split off
from the verb in main clauses (Booij 2010; Los et al. 2012). The word combinations
in (1) that are classified as compounds, on the other hand, cannot be split. In the
case of compound verbs this is clear from their not being split in main clauses:
There are two cases where it seems as if parts of compounds can be split. First,
Dutch features gapping of parts of words: a compound constituent can be omitted
under identity with another constituent of the same prosodic form in a phrase, as
in:
However, as shown in Booij (1985), this kind of ellipsis is not syntactic in nature.
Instead, it is a prosodic process in which one of two identical prosodic words is
omitted. Both in compounds and phrases, the word constituents correspond to
separate prosodic words (also referred to as ‘phonological words’). That is, this
type of gapping is phonological in nature. This explains why a compound constit-
uent like divisie in eredivisie can be omitted under identity with a separate word
divisie, as in (5c): they are identical prosodic words, although their morpho-syn-
tactic status is different.
The second type of split is found in phrases with coordinated elative com-
pounds (cf. Hoeksema 2012) such as:
In elative compounds the first part functions as an intensifier. Again, this is not
a case of syntactic gapping. We cannot assume underlying structures like doornat
en doornat or doodziek en doodziek as the sources of the phrases in (6) since
such phrases are ill-formed. Instead, what is at stake here is the repetition
of an intensifier word in the left part of a compound, a case of word-internal
coordination.
2.2 Orthography
A+A compounds and A+A phrases are not always that easy to distinguish. In A+A
phrases, the first adjective functions as an adverb. However, Dutch adjectives can
be used as adverbs without being morphologically marked as such. Hence, when
we come across an A+A sequence such as jong getrouwd lit. ‘young married’ this
word sequence can be interpreted either as a compound or as a phrase. The dif-
ference between compound and phrase is primarily a semantic one. When we
spell jonggetrouwd, it is considered a compound with a naming, classifying function,
and the meaning is ‘recently married’. When we use the phrase jong getrouwd,
the phrase has a descriptive function ‘married at a young age’. In the latter
case, we can modify the adjective jong:
(8) Matthias was de kamer aan het schoonmaken ‘Matthias was cleaning the room’
    Ik merkte dat de boodschap niet overkwam ‘I noticed that the message did not come across’
This spelling convention reflects that these word combinations are felt as lexical
units, with often idiosyncratic meaning aspects. On the other hand, these separa-
ble complex verbs are not words in the morphological sense, as they cannot
appear in second position in main clauses. In Section 3.4 I will come back to this
issue.
Dutch orthography requires compounds to be written without an internal
space. However, many users of Dutch occasionally do insert a space between the
two parts of a compound. This may be partially due to the influence of English
orthography in which many compounds are written with an internal space.
Another factor might be that from a phonological point of view compounds are
similar to phrases in that each constituent word forms a phonological word of its
own. For instance, the N+N compound tandextractie ‘tooth-extraction’ consists of
the phonological words /tɑnd/ and /ɛkstrɑksi/. These two words form separate
domains of syllabification. Hence, the first part tand is a syllable of its own. This
implies that the underlying final /d/ of tand is in syllable-final position, and not in
the onset of a syllable with the vowel /ɛ/ as its nucleus. It is therefore subject to
the constraint of Dutch that obstruents are voiceless in coda position (Auslautverhärtung),
and thus tand is pronounced as [tɑnt], and the phonetic form of tandextractie
is [tɑntɛkstrɑksi].
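The interaction between phonological-word boundaries and final devoicing described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the names DEVOICED, devoice_final and phonetic_form are introduced here for the example, and the rule is simplified to the last segment of each phonological word rather than to every obstruent in coda position.

```python
# Simplified sketch: final devoicing applied per phonological word.
# The mapping and all function names are illustrative assumptions.
DEVOICED = {"b": "p", "d": "t", "v": "f", "z": "s", "ɣ": "x"}


def devoice_final(pword: str) -> str:
    """Devoice a voiced obstruent at the end of one phonological word."""
    if pword and pword[-1] in DEVOICED:
        return pword[:-1] + DEVOICED[pword[-1]]
    return pword


def phonetic_form(pwords: list[str]) -> str:
    """Treat each constituent as its own syllabification domain:
    devoice first, then concatenate the constituents."""
    return "".join(devoice_final(w) for w in pwords)


print(phonetic_form(["tɑnd", "ɛkstrɑksi"]))  # tɑntɛkstrɑksi
```

The point of the sketch is the order of operations: because devoicing applies to each constituent before concatenation, the final /d/ of tand never gets the chance to become the onset of the following syllable.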
This phonological similarity between compound constituents and phrasal
constituents, which both consist of more than one phonological word, may lead
to uncertainty as to how to spell compounds properly.
2.3 Phonological properties
2.4 Morphological properties
The presence of a linking element is a clear mark of compound status. The only
apparent exceptions to this criterion are nouns used in the possessive construc-
tion (Booij 2010: 216–222). The N+N sequence opoe-s fiets ‘grandma’s bike’, for
example, is a phrase: the -s is not a linking element here, but a marker of the
possessive construction. This word sequence exhibits the normal flexibility of
phrases, witness a phrase like opoe’s zwarte fiets ‘grandma’s black bike’. The
stress pattern is also revealing, as in this word sequence the word fiets can carry
main stress.
In the case of A+N sequences, the presence of the inflectional ending -e on
the adjectives reveals the phrasal status of such sequences. In Dutch, prenominal
adjectives have an ending -e, unless the noun phrase as a whole is singular indef-
inite, and the head noun is neuter. In the examples (10), the noun boek ‘book’ is
neuter, and the word vrouw ‘woman’ has common gender:
3 In zonneschijn, the final schwa of zonne ‘sun’ has disappeared in present-day Dutch, and zon
is now the Dutch word for ‘sun’.
The inflection of prenominal adjectives indicates that these adjectives are words
by themselves; within compounds an adjectival modifier cannot be inflected
(compare the compound snel+trein ‘fast train, intercity train’ with snelle trein ‘fast
train’). It is only the head of a compound that can carry inflectional markers.
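The inflection rule for prenominal adjectives stated above can be written out as a small decision function. The following Python sketch is an illustration only: the function name prenominal_form and the example adjective nieuw ‘new’ are assumptions introduced here (nieuw is chosen because it needs no spelling adjustment), and the sketch ignores both orthographic changes such as rood/rode and the exceptional cases without overt inflection.

```python
def prenominal_form(stem: str, *, neuter: bool, singular: bool,
                    definite: bool) -> str:
    """Return the prenominal form of a regular adjective stem.

    Simplified rule: the adjective takes -e unless the noun phrase is
    singular and indefinite and the head noun is neuter.
    Spelling adjustments and lexical exceptions are not modelled.
    """
    if neuter and singular and not definite:
        return stem
    return stem + "e"


# boek 'book' is neuter, vrouw 'woman' has common gender.
print("een", prenominal_form("nieuw", neuter=True, singular=True, definite=False), "boek")
print("het", prenominal_form("nieuw", neuter=True, singular=True, definite=True), "boek")
print("een", prenominal_form("nieuw", neuter=False, singular=True, definite=False), "vrouw")
```

Within a compound no such function applies at all: the adjectival modifier appears as a bare stem regardless of the gender, number, or definiteness of the head.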
There are two complications, however. The first one is that in some types of
noun phrases the adjective does not carry an overt inflectional marker (Booij
2002a: 43 ff.; Tummers 2005). This applies to adjectives ending in -en /ən/ (11a),
where a sequence of two syllables with a schwa as vowel is avoided. It also holds
for adjectives in A+N phrases that denote an individual (11b), the function of an
individual (11c), or an institution (11d), where the presence of the inflectional
marker -e is optional:
In these cases, the absence of the inflectional ending -e should not be taken as an
indication of compound status. The stress pattern is that of noun phrases, with
main stress on the nominal head.
The second complication is that some A+N phrases with inflected adjectives
have undergone univerbation, and are now considered as one word, as reflected
in the orthography:
The words in (12a) have final stress, like phrases, but the words in (12b) carry ini-
tial stress. The word status of these A+N sequences can be concluded from the
way in which they form diminutives, in contrast to regular phrases:
(13) een jongemannetje ‘a little boy’ versus een jong mannetje ‘a young little man’
     een wittebroodje ‘a small white sandwich’ versus een wit broodje ‘a white small loaf of bread’
Diminutives are neuter nouns, and hence they require a prenominal adjective
without -e in indefinite singular phrases of which they form the head. The examples
in (13) show that both uses of the same A+N sequence are sometimes possible.
In their use as words, they function as names, whereas in their phrasal use
they have a descriptive interpretation.
A+N phrases frequently occur as left constituents of nominal compounds, as
in
These sequences are words, and they are to be written without internal spaces:
oudemannenhuis, heteluchtballon, zwartekousenkerk. The inflectional ending -e
of the adjectives oude, hete and zwarte shows that here A+N phrases have been
made parts of words. In the orthography, these compounds can be distinguished
from phrases like oude mannenhuis ‘old house for men’ and hete luchtballon ‘air
balloon that is hot’. The presence of a linking element s after the phrasal constit-
uent confirms the compound status, as in oude-dag-s-voorziening lit. old-day-s-
provision, ‘pension’.
In conclusion, there are a number of criteria for distinguishing between compounds
and phrases. In a few cases two structural interpretations of two-word
sequences are possible, and in these cases there is variation in the way language
users deal with such word sequences.
As pointed out by Schlücker (2014), the main, though not the only, function of
A+N and N+N compounds is that of classification. These words create names for
subclasses of entities. The same classifying function can be performed by A+N
phrases (Booij 2002b, 2009a, 2010: 183 ff.). Compare first N+N compounds with
A+N phrases:
In (15) we see that an N+N compound may correspond to an A+N phrase. Typi-
cally, in these phrases the adjective is a denominal adjective that belongs to the
class of relational adjectives. This is a productive class of adjectives in Dutch,
mainly, but not exclusively, non-native in character. Both options are grammatical,
and both types function as names. This may be expected for these A+N
phrases since relational adjectives do not describe properties, but denote the
relation between the head noun of the phrase and the base noun of the adjective.
In principle both options are available, and which one is used is partially a matter
of convention. For me as a speaker of Dutch, muzikale scholing is the conventional
name for this type of education, but muziekscholing is also found on the internet.
The compound koning-s-besluit ‘king-s-decision’ is not used as an alternative for
the A+N phrase koninklijk besluit ‘royal decision’, nor is koningsfamilie ‘king-family’
used besides koninklijke familie ‘royal family’, even though these N+N compounds are
well-formed. The advantage of using the adjective koninklijk ‘royal’ instead of the
compound constituent koning ‘king’ is that it may also be used for denoting
queens.
This kind of competition between words and phrases is similar to the compe-
tition between words that is known as ‘blocking’. Blocking is the phenomenon
that the formation of a complex word is blocked by the existence of another (sim-
plex or complex) word with the same meaning. The formation of the deverbal
noun lieg-er ‘liar’ in Dutch, for instance, is blocked by the existing complex word
leugenaar ‘liar’. This does not mean that lieger is ill-formed, but that it does not
belong to the language convention (the norm) of the Dutch-speaking community.
The fact that we find this type of competition between words and phrases as well
confirms that both types of lexical units must be stored in the lexicon, and that
the use of one of the relevant (morphological or syntactic) constructions for the
formation of a new expression can be blocked by a stored instantiation of a com-
peting construction. This implies that there cannot be a strict separation of mor-
phology and syntax in the grammar of Dutch.
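The logic of blocking across words and phrases can be made concrete with a toy model. The Python sketch below is an illustration under strong assumptions, not a claim about how the mental lexicon is implemented: the dictionary LEXICON, the meaning labels, and the function coin are introduced here purely for the example, and blocking is treated as categorical although it is in reality a matter of convention.

```python
# Toy lexicon: conventional expressions stored by meaning.
# Stored items may be words (leugenaar) or phrases (koninklijk besluit).
LEXICON = {
    "liar": "leugenaar",
    "royal decision": "koninklijk besluit",
    "royal family": "koninklijke familie",
}


def coin(meaning: str, candidate: str) -> str:
    """Return the expression used for a meaning.

    A well-formed candidate is blocked when the lexicon already stores
    a different expression, word or phrase, with the same meaning.
    """
    stored = LEXICON.get(meaning)
    if stored is not None and stored != candidate:
        return stored  # candidate is well-formed but blocked
    return candidate


print(coin("liar", "lieger"))                    # leugenaar
print(coin("royal decision", "koningsbesluit"))  # koninklijk besluit
print(coin("fast train", "sneltrein"))           # sneltrein
```

The single lookup table is the relevant design choice: because stored words and stored phrases sit in the same lexicon, a phrase can block a compound in exactly the way a word blocks another word.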
The second type of competition is that between A+N compounds and A+N
phrases, a topic discussed in Hüning (2010), Hüning/Schlücker (2010), Schlücker
(2014) and Schuster (2016). Here are some examples:
The compounds carry stress on the first constituent, whereas the phrases carry
stress on the head noun, that is, final stress. The data in (16a) illustrate that both
A+N compounds and A+N phrases are possible as names, and do not necessarily
block each other. A compound such as roodwijn, however, is odd. In some cases
the compounds differ in semantic interpretation from the phrasal correlates, as
shown in (16b): the compounds are names, but the corresponding phrases are
used as descriptions.
A+N phrases that function as names have a restricted syntax compared to
other A+N phrases (Booij 2010: 178): they cannot be modified, or split by another
word. For instance, we cannot say *heel gele koorts ‘very yellow fever’, and a
phrase like gele en hevige koorts ‘yellow and high fever’ is also odd. When we
coin the phrase heel rode wijn ‘very red wine’, we coerce rode wijn into a descrip-
tion, denoting wine with a very red color. This lack of syntactic flexibility of
phrases with a naming function makes them more similar to compounds than
other kinds of phrases.
Compared to German, Dutch more often opts for A+N phrases rather than A+N
compounds as names for entities (Booij 2002b; Hüning 2010). There are two structural
factors that play a role in this difference. First, given the rich adjectival
inflection of German, A+N phrases in German have quite a number of different
forms, whereas in Dutch there is only marginal variation in the shape of the adjec-
tive (usually ending in -e, occasionally in ø). Hence, in the case of German the
compound option has the advantage of reducing the form variation of the adjec-
tive, as only its stem is used (Hüning 2010). For instance, the Dutch phrase rode
wijn ‘red wine’ and the German compound Rot-wein both have a constant form for
the adjective (rode/rot). This makes the use of the phrasal alternative more feasible
for Dutch. A second factor is that in Dutch A+N compounds the adjective has to be
simplex (Schlücker 2014). This excludes the use of relational adjectives in A+N
compounds. For instance, the compound wetenscháppelijk+domein ‘scientific
domain’ is ill-formed, whereas this combination is fine as a phrase: wetenschappelijk
doméin. This restriction also excludes the use of the various non-native
relational adjectives in A+N compounds, a common pattern in German A+N
compounds:
This does not mean that A+N compounds with non-native adjectives are com-
pletely excluded in Dutch, but they are relatively rare, and often considered as
loan translations from German (Schlücker 2014: 234). This applies to compounds
such as nationaal-socialist ‘national-socialist’, normaal+verdeling ‘standard
distribution’, speciaal+zaak ‘specialist shop’, and spectraal+analyse ‘spectral
analysis’.
As to the choice between A+N compounds and A+N phrases, it has been
argued for German that paradigmatic analogy plays an important role (Schlücker/
Plag 2011; Rainer 2013; Schlücker, this volume). Schlücker/Plag (2011: 1546) argue
that “the larger the compound family of an item, the more likely it is that partici-
pants choose the compound, and the larger the phrasal family of an item, the
more likely it is that participants choose the phrase”. This role of paradigmatic
analogy in the choice between compounds and phrases has been confirmed for
Dutch by Schuster (2016) on the basis of an investigation of Dutch dictionaries
and corpora.
The role of paradigmatic analogy can be observed in the use of color adjec-
tives. For example, Dutch color adjectives such as geel ‘yellow’, rood ‘red’, and
zwart ‘black’ are used in A+N compounds that function as names for animals and
for human beings (in some cases with a possessive interpretation):
On the other hand, we find these color adjectives in phrasal names such as gele
kaart ‘yellow card’ and rode kaart ‘red card’, names for the cards used for indicat-
ing improper actions in a football match (a kaart-family). Likewise, there is a
family of phrasal names with zwart ‘black’, as in zwarte markt ‘black market’,
zwart geld ‘black money’, zwarte doos ‘black box’, and zwarte kunst ‘black magic’,
a zwart-family with zwart being used with the meaning ‘illegal, opaque’. These
observations confirm that analogy to similar compounds or phrases plays an
important role in the choice between compound and phrase.
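The family-size idea behind paradigmatic analogy can be illustrated with a toy count. The Python sketch below is not the statistical model of Schlücker/Plag (2011); it is a deliberately crude majority comparison over a handful of forms mentioned in this chapter, and the lists PHRASES and COMPOUNDS as well as the function names are assumptions made for the example.

```python
# Toy inventory of (adjective, noun) names, split by structure.
PHRASES = [
    ("geel", "kaart"), ("rood", "kaart"), ("rood", "wijn"),
    ("zwart", "markt"), ("zwart", "geld"),
    ("zwart", "doos"), ("zwart", "kunst"),
]
COMPOUNDS = [("snel", "trein")]


def family_sizes(item: str, position: str) -> tuple[int, int]:
    """Count stored phrases and compounds that share `item`
    as modifier (position='modifier') or as head (position='head')."""
    idx = 0 if position == "modifier" else 1
    n_phrases = sum(1 for pair in PHRASES if pair[idx] == item)
    n_compounds = sum(1 for pair in COMPOUNDS if pair[idx] == item)
    return n_phrases, n_compounds


def preferred_pattern(item: str, position: str) -> str:
    """Pick the structure with the larger family; ties stay undecided."""
    n_phrases, n_compounds = family_sizes(item, position)
    if n_phrases > n_compounds:
        return "phrase"
    if n_compounds > n_phrases:
        return "compound"
    return "undecided"


print(preferred_pattern("kaart", "head"))      # phrase
print(preferred_pattern("zwart", "modifier"))  # phrase
print(preferred_pattern("trein", "head"))      # compound
```

A more realistic treatment would turn the two counts into a probability rather than a categorical choice, in line with the formulation that a larger family makes a structure "more likely".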
According to Hoeksema (2012) the choice of the compound structure over the
phrasal alternative is determined by two advantages of the compound option:
compactness and expressiveness. There is always a phrasal alternative for the
compound, but not vice versa. For instance, the comparison sterk als een paard
‘strong as a horse’ cannot be expressed by the compound paardesterk. The
phrasal alternative might, however, not carry exactly the same meaning: ijzersterk
‘iron-strong’ can be used in contexts where the phrasal expression is odd.
For instance, een ijzersterk verhaal ‘a very strong story’ cannot be properly para-
phrased as een verhaal sterk als ijzer ‘a story strong as iron’ (ibid.). Similar obser-
vations have been made for German (Schlücker, this volume), and Italian (Masini,
this volume). The same applies to compounds with reuze (an allomorph of reus
‘giant’), as in reuze-groot ‘giant-big, very big’ where the phrase zo groot als een
reus ‘as big as a giant’ may not be a proper paraphrase. In these compounds the
nouns ijzer and reuze have acquired a more general meaning of intensification.
These compounds are called elative compounds and express that the property
denoted by the head is present to a high degree. This elative use is the source of
the development of these nouns into intensifier affixoids. For instance, besides
bloed+rood ‘red as blood’ we find compounds like bloed+saai lit. blood-boring,
‘very boring’ and bloed+mooi lit. blood-beautiful, ‘very beautiful’, which cannot
be paraphrased as saai / mooi als bloed ‘boring / beautiful as blood’.
This difference between compounds and phrases can also be observed for
another class of N+A compounds of the type dood+ziek lit. dead-ill, ‘so ill that it
may cause death’. Again, some of these nominal modifiers have acquired a more
general meaning of intensification, and in such cases a phrasal paraphrase is not
adequate:
This development of nominal (and other) modifiers into affixoids, that is, words
with a more abstract meaning of intensification when embedded in compounds,
is discussed in detail in Booij/Hüning (2014) and Hüning/Booij (2014).
A second type of verbal compounds are verbs like klapper+tanden lit. chat-
ter-tooth.inf, ‘to have chattering teeth’ and kwispel+staarten lit. wag-tail.inf, ‘to
wag one’s tail’. They have the structure [VN]V, and are exceptional in that they are
left-headed. There are also a few V+V compounds like hoeste+proesten lit. to
cough-sneeze, ‘to cough and sneeze’, but again, this is not a productive process of
word formation (Booij 2002a: 164 f.).
The productive alternative for N+V compounds are phrasal word sequences
that consist of a bare noun and a verb. An example is the N+V sequence
piano+spelen ‘to play the piano’. This word sequence can be used as a verb phrase, but the
noun can also be quasi-incorporated into the verb:
Verb phrases with a bare noun are often used as names for denoting a certain
kind of activity. For instance, piano spelen is a specific type of musical activity.
The word piano does not denote a specific referent here. This may be contrasted
with a verb phrase like de piano bespelen ‘to play on the piano’, where, by using
a definite noun phrase, the identifiability of a specific referent of piano is presup-
posed. When count nouns are used as bare nouns, without the normally expected
determiner, this evokes an interpretation as name instead of description of the
verbal phrase in which that bare noun is used. Note that in a compound like
pianospeler ‘piano player’ the word piano likewise has no referential power.
In the second variant in (22), the noun and the verb form a syntactically closer
unit than in the first variant, and are adjacent. This unit can be qualified as a case
of quasi-noun incorporation. Noun incorporation is the process in which a noun
is incorporated into a verb, and thus creates a verbal compound. However, in
Dutch the incorporation process does not lead to compounds in the morphologi-
cal sense. This is shown by the fact that the N+V sequence cannot appear in the
position for finite verbs (the second position) in main clauses, unlike a real verbal
compound like beeldhouwen ‘to sculpture’:
This is why Dahl (2004) calls this process quasi-incorporation: there is incorpora-
tion and formation of lexical units, but these lexical units are not words. Qua-
si-noun incorporation in Dutch is discussed in detail in Booij (2010: Chapter 4),
and the account below is mainly based on this chapter.
The strong bond between N and V in the incorporated variant can also be
seen in two syntactic constructions, the verb raising construction and the pro-
gressive construction. In the verb raising construction the verb of the main clause
forms a unit with the verb of the embedded clause. The incorporated noun can
appear in between the two verbs (24a), whereas this is impossible for a full noun
phrase (24b). The first option in (24a) is that with quasi-incorporation, and Dutch
orthography requires the quasi-incorporated word combination to be spelled as
one word, without an internal space:
The second construction that functions as a litmus test for quasi-noun incorpora-
tion is the progressive construction of the form aan het V-infinitive:
Matthias is {de piano aan het bespelen / *aan het de piano bespelen}
Matthias is {the piano at the pref.play.inf / at the piano pref.play.inf}
‘Matthias is playing on the piano’
Verbs with an incorporated noun can function as a unit in the progressive con-
struction, and thus appear after aan het. This applies to the N+V sequence
piano+spelen. On the other hand, the prefixed verb bespelen ‘to play on’ is an
obligatorily transitive verb that does not allow for noun incorporation. Like ver-
bal phrases with bare nouns, the quasi-incorporation structure is used to express
that the action referred to is a conventional action. In other words, it creates
names for types of action. Whatever is considered as a conventional action by the
language user can be expressed in this form. For instance, auto+wassen ‘to wash
cars’ is a conventional action, whereas buying a car is not conceived as a conven-
tional action, and therefore there is no verb phrase auto kopen, or quasi-com-
pound autokopen (instead, the proper phrase for naming this action is een auto
kopen, with an indefinite determiner). Hence the difference in syntactic behavior
between auto+wassen and auto+kopen:
Conventional actions can also be expressed with verbs + plural nouns. For
instance, aardappels schillen lit. potatoes-peel, ‘to peel potatoes’ can be con-
ceived as a conventional action, and hence we can say:
(27) Geert is aan het aardappels schillen ‘Geert is peeling the potatoes’
… dat Geert wil aardappels schillen ‘… that Geert wants to peel potatoes’
However, when the noun is plural, the N+V sequence is not spelled as one word.
The use of the term ‘quasi-incorporation’ may suggest that these quasi-com-
pounds always derive from a regular phrase, but this is not the case. There are
many N+V sequences where the bare noun cannot be interpreted as an object-NP.
This applies to, for instance, the following cases (Booij 2010: 112):
trated here for zee-zeilen (29a). At the same time, they cannot be split (29b), but
are fine if they are not split (29c, d):
The conclusion drawn from these facts in Booij (2010: Chapter 4) is that there are
N+V combinations that are neither regular compounds nor regular syntactic
phrases. Instead, they are quasi-compounds without a corresponding verbal
phrase: a word sequence such as zee zeilen cannot be used as a well-formed
phrase.
For a proper account of the distribution of quasi-compounds, their structure
should be different from that of phrases and that of morphological compounds.
They may be considered syntactic compounds. In a syntactic verbal compound a
bare N⁰ is adjoined to a V⁰, and together they form a V⁰:
Their syntactic compound status prohibits them from being split in main clauses
(29a). At the same time they cannot appear in second position in main clauses as
this position allows only for a single verb (29b). When the bare noun functions as
an object, as in pianospelen, the quasi-compound corresponds with a verbal
phrase with a bare noun that can be split. Hence, the two possible word orders in
sentences like (22). Thus, the grammar of Dutch provides three different struc-
tures for N+V combinations that function as names:
These A+V combinations are not words in the morphological sense, and are
therefore split in main clauses, just like the N+V combinations. They exhibit the
same word order variation as that shown in (22):
(33) … dat de directeur het voorstel {wilde goedkeuren / goed wilde keuren}
… that the director the proposal {wanted good-judge / good wanted judge}
‘… that the director wanted to approve the proposal’
… dat Ton het boek {heeft zoekgemaakt / zoek heeft gemaakt}
… that Ton the book {has missing-made / missing has made}
‘… that Ton has mislaid the book’
In other words, what we see here are A+V combinations, often idiosyncratic in
meaning, that are structurally interpreted either as verbal phrases with a bare
adjective as complement, or as quasi-compounds.
Both types of compounds have past participles in which the participial prefix
ge- appears before the verbal stem, which confirms their phrasal status:
The adjectives of the quasi-compounds cannot be modified, that is, they cannot
head an adjectival phrase. When we add a modifier, this leads to an ungrammat-
ical result, or another, more literal interpretation. For instance, the verb phrase
heel vreemd gaan lit. very strange go, ‘to go very strange’, with the degree adverb
heel, cannot be interpreted as ‘sleep around intensively’.
The types of verb with aan-, achter- and voor- exemplified in (35a) are unproduc-
tive, just like those with the adverbs mis- and weer- and the adjective vol- shown
in (35c). The types exemplified in (35b) with door-, om-, onder-, and over-, how-
ever, are productive. In reference grammars of Dutch they are usually considered
prefixed words, because unlike what is normally the case for Dutch compounds,
the main stress in these words is located on the second constituent (instead of the
first constituent). Thus, from the point of view of stress location, they pattern
with prefixed verbs such as be-hálen ‘to acquire’ and ver-zóeken ‘to request’.
Moreover, the meaning contribution of these morphemes in complex verbs may
differ from that of the corresponding morphemes when used as words by them-
selves. In other words, these words have grammaticalized into prefix-like mor-
phemes. Prefixes like be- and ver- also originate from words that are parts of com-
pounds, but their phonological form has been reduced as well, with a reduced
vowel /ə/. Hence, in present-day Dutch there are no identical lexical counterparts
for these prefixes.
The number of productive processes of verbalizing prefixation in Dutch is
quite restricted. A wide range of meanings is therefore expressed by phrasal
verbal predicates with a corresponding make-up. This is the class of particle
verbs, with the particles corresponding to prepositions like binnen ‘inside’,
postpositions like mee ‘with’, and adverbs like neer ‘down’. The number of types
is quite large, and I list here only a few for the purpose of illustration.
Complete lists can be found in De Haas/Trommelen (1993), and on Taalportaal
(www.taalportaal.org):
Particle verbs are lexical units, but phrasal in nature, just like verbal predicates
such as piano+spelen and schoon+maken discussed in Section 3.3. They are split
in main clauses, and can function as verbal phrases. At the same time, they can
also be used as quasi-compounds, that is, behave like a tight syntactic unit in
verb raising constructions. In this latter use, they are spelled as one word. These
two syntactic options are illustrated by the following sentences:
(37) … dat Hans zijn moeder {op wilde bellen / wilde opbellen}
… that Hans his mother {up wanted phone / wanted up-phone}
‘… that Hans wanted to call his mother’
Morphologically, particle verbs also behave as phrases, since the prefix ge- of the
past participle appears in between the particle and the verb: op-ge-beld, not
*ge-op-beld. When we nominalize a particle verb by means of the prefix ge-, it
also appears before the verbal stem, as in op-ge-bel ‘calling’.
The proper grammatical analysis of Dutch particle verbs is discussed in detail
in Booij (2010: Chapter 5), and in Los et al. (2012). The gist of this analysis is that
each class of particle verbs has to be represented in the grammar of Dutch as a
constructional idiom. Constructional schemas are schemas that specify the sys-
tematic correspondence between form and meaning of a construction. A con-
structional idiom is a constructional schema in which one or more slots are lexi-
cally fixed. Each type of particle verb will be represented by a constructional
idiom with that particle specified. The meaning of the particle sometimes corre-
sponds with that word used in isolation, but in other cases it has acquired a spe-
cific meaning. For instance, the particle door ‘through’ has acquired, among oth-
ers, the aspectual meaning of ‘to continue with’, as in door+fietsen ‘to continue
cycling’ and door+eten ‘to continue eating’, unlike the preposition door ‘through’.
Hence, I assume the following constructional idioms for door+V, one without,
and one with quasi-incorporation. In the first case we have a phrasal verbal pred-
icate, labeled as V’, in the second case a syntactic compound:
where SEMᵢ stands for the meaning of the verb Vᵢ, and the symbol ≈ indicates the
paradigmatic relationship between the two constructional schemas.
For a number of morphemes we saw that they are used in Dutch either as
prefix or as particle. This applies in particular to door, om, onder, and over, which
can be used productively as prefixes. In these cases there is no competition
between prefixed verbs and particle verbs, but complementarity, since they differ
in meaning. These morphemes in their prefixal use create transitive verbs that
denote an action that completely affects the object in a specific manner, as illus-
trated in (39) (examples from Los et al. 2012: 184):
There are a few minimal pairs for prefixed verbs / particle verbs with semantic
differences, for example:
In sum, prefixed verbs and particle verbs coexist, the number of prefixed verb
types is restricted, and the high number of particle verb types provides an exten-
sive range of names for activities and events.
where [[x]V z]N stands for the nominalized form of the simplex verb. The variable
x stands for (an allomorph of) the verbal stem, and the variable z stands for a
suffix or zero. All instantiations of unproductive types of nominalization have of
course to be listed. Hence, listed nouns like gang and komst will be available for
combining with a particle into a compound. Thus, it is predicted that the nomi-
nalized form of a particle verb corresponds to that of the nominalized form of the
corresponding simplex verb.
The structure for compounds of the form (42) has to be available anyway in
the grammar of Dutch, as there are a number of compounds of this form with-
out a corresponding particle verb. This applies to, for instance, the following
nouns:
Recall that the symbol ≈ denotes a paradigmatic relationship. The formal and
semantic correspondences between the two schemas are specified by means of
co-indexation. Such a schema of schemas is called a second order schema. For
instance, given the particle verb aankomen with the meaning ‘to arrive’, second
order schema (44) states that the compound noun aankomst is interpreted as the
event of arriving.
This case shows that there might be an asymmetry between form and mean-
ing in morphological constructions. The meaning of the particle compound is a
compositional function of the meaning of the particle verb, even though the par-
ticle verb is not a formal subconstituent of the corresponding compound. Instead,
there is a paradigmatic relationship between the particle compound schema and
the schema for particle verbs. This kind of asymmetry can be accounted for by
relating schemas paradigmatically in second order schemas (Booij/Masini 2015).
Schema (44) is a second order schema, as it relates the constructional schema for
particle compounds to the constructional schema for particle verbs.
This implies that the grammar of Dutch requires access to the meaning of
phrasal lexical expressions in order to account for the meaning of particle com-
pounds. This is another type of complementarity between compounds and
phrasal lexical items, and shows again that we need a grammar in which mor-
phological and phrasal lexical units can interact.
In (46) we see the use of syntactic coordination by means of en. This corresponds
with the semantic effect of addition. This phrasal pattern is subject to a specific
restriction, however, that does not apply to syntactic coordination in general:
there is a fixed order in which the two numbers have to appear, the lower digit
before the higher digit in numbers < 100, the higher digit before the lower one in
numbers > 100. You cannot say zestig-en-drie ‘63’ or drie-en-honderd ‘103’.
Moreover, the conjunction en is optional in numbers > 100, an optionality that
does not apply to regular coordination. In other words, phrasal coordination is
used here for the expression of addition, but is subject to specific restrictions.
An additional construction-specific property of this use of coordination is that
the conjunction en /ɛn/ can be pronounced either as [ɛn] or as [ən] in numbers
< 100.
Compounding and phrasal coordination are used together in the formation of
complex numerals: the numeral compounds are building blocks of the coordina-
tion construction, as in:
4 The word sequence zes miljoen ‘six million’ looks similar to these compounds, but is conside-
red a phrase, as reflected by its spelling with an internal space. This means that miljoen is inter-
preted as a measure noun, similar to nouns like gulden ‘guilder’ and uur ‘hour’ which also appear
in their singular form after a cardinal > 1: drie gulden, drie uur. However, this interpretation is not
chosen for words with honderd and duizend. Honderd, duizend, and miljoen can all function as
nouns, and may appear in plural form: honderden, duizenden, miljoenen.
The orthography of numerals reflects their hybrid nature. The compounds and
the coordinated numerals are spelled as one word, except that there is a space
after duizend. Moreover, the words miljoen and miljard are always spelled as
separate words. Thus we get spellings like achthonderd (800), drieëntwintig (23),
achthonderdendrieëntwintig (823), tweeduizend drieënveertig (2,043), and vijf
miljoen achthonderdduizend driehonderdentwintig (5,800,320).
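The spelling conventions just described can be made explicit in a small program. The following Python sketch is an illustration only and not part of the analysis proposed here. The function names (below_100, below_1000, spell_numeral) and the parameter use_en are hypothetical. The sketch assumes that the linking element is written ën after the units twee and drie, that 100 and 1,000 are rendered as bare honderd and duizend, and that the optional en after honderd can be treated as a simple on/off choice. It covers the cardinals from 1 to 999,999,999 only.

```python
# Illustrative sketch: spelling of Dutch cardinals as described in the text.
# All names are hypothetical; the lexical lists are assumptions of the sketch.
UNITS = ["", "een", "twee", "drie", "vier", "vijf", "zes", "zeven", "acht", "negen"]
TEENS = ["tien", "elf", "twaalf", "dertien", "veertien", "vijftien",
         "zestien", "zeventien", "achttien", "negentien"]
TENS = ["", "", "twintig", "dertig", "veertig", "vijftig",
        "zestig", "zeventig", "tachtig", "negentig"]


def below_100(n):
    """Numbers < 100: the lower digit precedes the higher one, linked by en."""
    if n < 10:
        return UNITS[n]
    if n < 20:
        return TEENS[n - 10]
    tens, unit = divmod(n, 10)
    if unit == 0:
        return TENS[tens]
    # After a unit ending in -e (twee, drie) the linking en is spelled ën.
    link = "ën" if UNITS[unit].endswith("e") else "en"
    return UNITS[unit] + link + TENS[tens]


def below_1000(n, use_en=True):
    """Numbers < 1000: the higher digit comes first; en after honderd is optional."""
    hundreds, rest = divmod(n, 100)
    if hundreds == 0:
        return below_100(rest)
    head = ("" if hundreds == 1 else UNITS[hundreds]) + "honderd"
    if rest == 0:
        return head
    return head + ("en" if use_en else "") + below_100(rest)


def spell_numeral(n, use_en=True):
    """Spell n as one word per block: a space follows duizend, and miljoen is a separate word."""
    if not 0 < n < 1_000_000_000:
        raise ValueError("this sketch covers 1 to 999,999,999 only")
    millions, rest = divmod(n, 1_000_000)
    thousands, last = divmod(rest, 1_000)
    parts = []
    if millions:
        parts.append(below_1000(millions, use_en) + " miljoen")
    if thousands:
        multiplier = "" if thousands == 1 else below_1000(thousands, use_en)
        parts.append(multiplier + "duizend")
    if last:
        parts.append(below_1000(last, use_en))
    return " ".join(parts)


for number in (800, 23, 823, 2043, 5_800_320):
    print(number, spell_numeral(number))
```

With use_en=False the sketch omits the optional conjunction after honderd, so that 823 comes out as achthonderddrieëntwintig. The fixed order of the digits is not a parameter, which mirrors the observation that this order, unlike the presence of en in numbers > 100, is not subject to variation.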
These numeral phrases seem to feed word formation in the construction of
ordinals, as in:
5 Construction Grammar and Construction Morphology
The data discussed in Sections 3 and 4 provide strong evidence for a view of the
organization of the grammar in which there is no strict separation between mor-
phology and syntax. This is one of the core hypotheses of constructionist
approaches to morphology and syntax. Here are the main points:
(i) Morphological and syntactic constructions may compete; both can be
used for creating names, and hence, there are blocking effects between
morphological and phrasal constructs.
(ii) Phrasal constructions may be subject to specific restrictions when used as
names. For instance, in A+N phrasal names, the adjective cannot be sepa-
rated from the head noun, nor be modified. In a constructionist approach
we can account for the properties of such phrasal names by phrasal const-
ructional schemas which derive from general syntactic schemas, but with
specific formal and semantic properties specified. The same applies to the
description of specific forms of coordination in the construction of com-
plex numeral expressions.
(iii) Morphological processes may be unproductive, or unavailable for the
expression of certain types of names. In Dutch, phrasal structures fill
The claim that morphology and syntax cannot be separated in grammar does
not mean that there is no formal distinction between morphological and phrasal
constructions. This formal distinction is necessary for a proper account of the
syntactic behavior of the various types of names. At the same time, since com-
pound schemas and phrasal schemas are not split in different components of the
grammar, they can interact: phrasal constituents may form parts of compounds
and vice versa, and compounds may function as nominalizations of particle verbs
which themselves are phrasal expressions. These observations led to the conclu-
sion that second order schemas (paradigmatic relations between constructional
schemas) form part of the grammar.
Since morphology often derives historically from syntax, it should not come
as a surprise that there are transitional cases such as quasi-compounds, verbs
with incorporated particles, and cardinal numerals of the type drieëntwintig ‘23’
where the conjunction en can also be interpreted as a linking element. These phe-
nomena underscore Hermann Paul’s remarks on the blurred boundary between
syntax and word formation quoted in the introduction of this article. As we saw
above, a Construction Grammar approach can do justice to these transitional
cases.
References
Booij, Geert (1985): Coordination reduction in complex words: a case for prosodic phonology.
In: Van der Hulst, Harry/Smith, Norval (eds.): Advances in non-linear phonology.
Dordrecht: Foris. 143–160.
Booij, Geert (2002a): The morphology of Dutch. Oxford: Oxford University Press.
Booij, Geert (2002b): Constructional idioms, morphology, and the Dutch lexicon. In: Journal of
Germanic Linguistics 14. 301–329.
Booij, Geert (2002c): Separable complex verbs in Dutch: a case of periphrastic word formation.
In: Dehé, Nicole et al. (eds.): Verb-particle explorations. Berlin: De Gruyter. 21–42.
Booij, Geert (2009a): Phrasal names: A constructionist analysis. In: Word Structure 2.
219–240.
Booij, Geert (2009b): Lexical integrity as a morphological universal, a constructionist view. In:
Scalise, Sergio/Magni, Elisabetta/Bisetto, Antonietta (eds.): Universals of Language
Today. Dordrecht: Springer. 83–100.
Booij, Geert (2010): Construction Morphology. Oxford: Oxford University Press.
Booij, Geert (2015): The nominalization of Dutch particle verbs: schema unification and second
order schemas. In: Nederlandse Taalkunde 20. 285–314.
Booij, Geert/Hüning, Matthias (2014): Affixoids and constructional idioms. In: Boogaart,
Ronny/Colleman, Timothy/Rutten, Gijsbert (eds.): Extending the scope of Construction
Grammar. Berlin: De Gruyter. 77–105.
Booij, Geert/Masini, Francesca (2015): The role of second order schemas in word formation. In:
Bauer, Laurie/Körtvélyessy, Lívia/Štekauer, Pavol (eds.): Semantics of complex words.
Cham i. a.: Springer. 47–66.
Dahl, Östen (2004): The growth and maintenance of linguistic complexity. Amsterdam/
Philadelphia: Benjamins.
De Haas, Wim/Trommelen, Mieke (1993): Morfologisch handboek van het Nederlands. Den
Haag: SDU Uitgeverij.
Hoeksema, Jack (2012): Elative compounds in Dutch: properties and developments. In: Oebel,
Guido (ed.): Intensivierungskonzepte bei Adjektiven und Adverben im Sprachenvergleich/
Crosslinguistic comparison of intensified adjectives and adverbs. Hamburg: Verlag Dr.
Kovač. 97–142.
Hoffmann, Thomas/Trousdale, Graeme (eds.) (2013): The Oxford Handbook of Construction
Grammar. Oxford: Oxford University Press.
Hüning, Matthias (2010): Adjective + noun constructions between syntax and word formation in
Dutch and German. In: Onysko, Alexander/Michel, Sascha (eds.): Cognitive perspectives
on word formation. Berlin/New York: De Gruyter. 195–215.
Hüning, Matthias/Booij, Geert (2014): From compounding to derivation. The emergence of
derivational affixes through ‘constructionalization’. In: Folia Linguistica 48. 579–604.
Hüning, Matthias/Schlücker, Barbara (2010): Konvergenz und Divergenz in der Wortbildung.
Komposition im Niederländischen und im Deutschen. In: Dammel, Antje/Kürschner,
Sebastian/Nübling, Damaris (eds.): Konvergenz und Divergenz in der Wortbildung.
(= Germanistische Linguistik 206–209). Hildesheim i. a.: De Gruyter. 783–825.
Hüning, Matthias/Schlücker, Barbara (2015): Multi-word expressions. In: Müller, Peter O. et al.
(eds.): Word formation. An international handbook of the languages of Europe. Vol. 1.
(= Handbooks of Linguistics and Communication Science (HSK) 40.1). Berlin/Boston: De
Gruyter. 450–467.
Koefoed, Geert (1993): Benoemen. Een beschouwing over de faculté de langage. Amsterdam:
Meertens-Instituut.
Levelt, Willem J. M./Meyer, Antje (2000): Word for word: Multiple lexical access in speech
production. In: European Journal of Cognitive Psychology 12, 4. 433–452.
Los, Bettelou/Blom, Corrien/Booij, Geert/Elenbaas, Marion/van Kemenade, Ans (2012):
Morphosyntactic change. A comparative study of particles and prefixes. Cambridge, UK:
Cambridge University Press.
Nooteboom, Sieb (2011): Self-monitoring for speech errors in novel phrases and phrasal lexical
items. In: Yearbook of Phraseology 2. 1–16.
Paul, Hermann (1898): Prinzipien der Sprachgeschichte. Halle/Saale: Max Niemeyer [¹1880].
Rainer, Franz (2013): Can relational adjectives really express any relation? An onomasiological
perspective. In: Skase Journal of Theoretical Linguistics 10. 12–40.
Schlücker, Barbara (2014): Grammatik im Lexikon. Adjektiv-Nomen-Verbindungen im Deutschen
und Niederländischen. Berlin: De Gruyter.
Schlücker, Barbara/Plag, Ingo (2011): Compound or phrase? Analogy in naming. In: Lingua 121.
1539–1551.
Schuster, Saskia (2016): Variation und Wandel. Zur Konkurrenz morphologischer und
syntaktischer A+N-Verbindungen im Deutschen und Niederländischen seit 1700.
Berlin: De Gruyter. Internet: www.degruyter.com/view/product/456743 (last access:
4.5.2018).
Schutz, Rik/Permentier, Ludo (2016): Met zoveel woorden. Gids voor trefzeker taalgebruik.
Amsterdam/Leuven: Amsterdam University Press/Davidsfonds Uitgeverij.
Sprenger, Simone A. (2003): Fixed expressions and the production of idioms. Ph. D.
dissertation. University of Nijmegen.
Sprenger, Simone A./Levelt, Willem J. M./Kempen, Gerard (2006): Lexical access during the
production of idiomatic phrases. In: Journal of Memory and Language 54. 161–184.
Tummers, José (2005): Het naakt(e) adjectief. Kwantitatief-empirisch onderzoek naar de
adjectivische buigingsalternantie bij neutra. Leuven: Katholieke Universiteit Leuven
[Ph. D. dissertation].
Vikner, Sten (2005): Immobile complex verbs in Germanic. In: Journal of Comparative Germanic
Linguistics 8. 83–105.
Kristel Van Goethem/Dany Amiot
Compounds and multi-word expressions
in French
1 Introduction
French compounds differ from Germanic compounds in two important aspects.
First, while Germanic compounding complies with the Right-hand Head Rule
(e. g. English postage stamp, German Briefmarke, Dutch postzegel), French, like
other Romance languages (see the chapters by Masini (Italian) and Fernán-
dez-Domínguez (Spanish) in this volume), has a general tendency towards left-
hand headed compounding (e. g. timbre-poste lit. stamp-post). Second, whereas
languages such as Dutch and German establish a clear demarcation between
compounds and lexicalized phrases on the basis of formal criteria (spelling, pros-
ody, linking elements, loss of adjectival inflection in [A N] compounds), French
compounds are not easily distinguishable from syntactic expressions, and true
compounds in Germanic languages often correspond to syntactic multi-word
units in French (e. g. English admission ticket vs. French billet d’entrée (lit. ticket
of entrance)) (Zwanenburg 1992: 222; see also the chapters by Booij (Dutch),
Schlücker (German) and Bauer (English) in this volume).
Contrary to Germanic languages, French has no distinctive word stress, only
phrase stress. Moreover, whereas Germanic compounds may present linking ele-
ments (e. g. Dutch zonnebril, German Sonnenbrille ‘sunglasses’), these do not
occur in French. Furthermore, the spelling of French multi-word units is charac-
terized by many inconsistencies and irregularities: many combinations can be
spelled with or without a hyphen (e. g. bébé(-)éprouvette ‘test-tube baby’ (lit.
baby(-)test tube), porte(-)monnaie ‘coin purse’ (lit. carry(-)money)) or even as one
word (e. g. portefeuille ‘wallet, billfold’ (lit. carrysheet)) (Lehmann/Martin-Berthet
2008). Spelling of complex lexical units as one word occurs (e. g. vinaigre ‘vine-
gar’ (lit. wineacid)), but it is far from being the rule (cf. French vin rouge vs. Ger-
man Rotwein), and the French spelling rules are systematically updated by
orthographic reforms.1 Finally, many French compound-like expressions have
1 The orthographic reform of 1990 proposed, for instance, to hyphenate complex numerals
greater or lower than one hundred (e. g. vingt-trois ‘twenty-three’, cent-cinquante-huit ‘one
hundred and fifty-eight’), whereas this was only the case for numerals lower than one hundred
before. The French Academy also suggested writing as one word a list of complex lexical units
Open Access. © 2019 Goethem/Amiot, published by De Gruyter. This work is licensed under
the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
https://doi.org/10.1515/9783110632446-005
internal inflection markers (e. g. beaux-arts ‘fine arts’), while these are generally
attributed to syntactic formations.
As a result, none of the formal criteria typically applicable in Germanic lan-
guages2 allow for a straightforward differentiation between compounds and (lex-
icalized) multi-word phrases in French, and, accordingly, the term ‘compound’ is
not always used in a consistent way in the literature on French morphology. As a
matter of fact, ‘compounding’ is often used to refer to various types of complex
lexical units regardless of the formation process (morphological or syntactic) (for
an overview, see, for example, Van Goethem 2009 and Villoing 2012).
Van Goethem (2009) illustrates this in the domain of [A N] units. The Dutch
compound zuurkool ‘sauerkraut’ (lit. sour-cabbage) can be distinguished from the
lexicalized phrase zure regen ‘acid rain’ and the non-lexicalized syntactic phrase
zure kers ‘sour cherry’ on the basis of its spelling (written as one word), its stress
pattern (prominent stress on zuur in zúurkool while zúre kérs has double stress
and zure régen has prominent stress on the noun regen, cf. De Caluwe 1990: 17)
and the lack of inflection of the adjectival component zuur in the compound (cf.
Booij 2002: 314). In French, however, these criteria do not apply and Van Goethem
(2009) concludes that, leaving aside some exceptions that do not conform to reg-
ular modern French syntax (e. g. rouge-gorge ‘robin’ (lit. red-throat) and grand-
mère ‘grandmother’, cf. Van Goethem 2009: 246 f.), French [N A] and [A N] units
are phrases and not compounds, whatever their spelling may be: whether written
as two separate words (e. g. premier ministre ‘prime minister’), hyphenated (e. g.
cordon-bleu ‘master chef’ (lit. cord-blue)) or even as one single word (e. g. vinaigre
‘vinegar’ (lit. wineacid)).
In this paper, we will turn the focus to [N1 N2] units, but before doing so we
will present the different approaches to complex lexical units in French and show
previously written as separate words (with or without a hyphen), for example chauvesouris
‘bat’ (lit. bald-mouse), millepattes ‘centipede’ (lit. thousand-legs), passepartout ‘pass key’ (lit.
pass-everywhere), portemonnaie ‘coin purse’ (lit. carry-money) and véloski ‘skibob’ (lit. bike-
ski). (Internet: www.lalanguefrancaise.com/guide-complet-nouvelle-orthographe, last access:
18.4.2017).
2 In this respect, English may be considered to occupy an intermediary position: the traditional
distinctive criterion applicable to English is the stress pattern, compounds being typically char-
acterized by fore-stress (e. g. black bírd vs. bláckbird, cf. Bauer 2004 and this volume), but even
this criterion is not always straightforward and many mismatches can be observed: as shown by
Bauer (2004), a lexicalized phrase such as prímary school has first-element stress (or compound
stress), whereas first-áid, with the two components hyphenated and unified, has second element
stress (or phrase stress). These inconsistencies also apply to [N N] formations: péanut oil, for in-
stance, has fore-stress, whereas olive oil may have end stress (cf. Bauer 1998, this volume; Gieg-
erich 2009a, 2009b).
In French, formations such as [N de N] (e. g. fil de fer ‘iron wire’ (lit. wire of
iron)), [N à N] (e. g. verre à vin ‘wine glass’ (lit. glass to wine)), [N à Det N] (e. g.
sauce à l’ail ‘garlic sauce’ (lit. sauce to the garlic)), [A N] (e. g. Moyen Âge ‘Middle
Ages’) and [N A] (e. g. poids lourd ‘heavyweight’ (lit. weight heavy)) (Fradin 2003:
199; Booij 2010: 172) are constructed by means of syntactic rules, as manifested
through the presence of prepositions, determiners and adjectival inflection. Nev-
ertheless, like compounds, they are productively used in name formation and it is
therefore not surprising that the notion of compounding is often extended to all
kinds of complex lexical units with a naming function, regardless of the forma-
tion rules. This approach can be illustrated by Mathieu-Colas’s (1996) classifica-
tion of French compounds, which includes, for instance, lexicalized [A N] and
[N A] units such as premier ministre ‘prime minister’ and table ronde ‘round table
meeting’ (lit. table round), even though these comply with the syntactic forma-
tion rules, including adjectival inflection.
(2) Nous avons constaté un fait qui est évident
    ‘we have observed a fact that is evident’
    vs.
    *Nous avons constaté un fait qui est divers
    ‘we have observed a fact that is diverse’
3 Cf. also ten Hacken’s (1994) tests (such as insertion, substitution, anaphora from one constit-
uent of the sequence).
132 Kristel Van Goethem/Dany Amiot
It now appears that French (and no doubt Spanish) lacks compounding altogether. Once we
have subtracted fixed syntactic phrases (idioms) such as timbres-poste and phrases reanal-
yzed as words (syntactic words) such as essui-glace <sic>, there are no candidates left.
(ibid.: 83)
Corbin (1992, 1997) is less restrictive and preserves the term ‘compound’ to refer to
lexical units of the type [N1 N2] (e. g. timbre-poste ‘postage stamp’) and [V N] (e. g.
essuie-glace ‘windscreen wiper’) because they are formed according to lexical
composition rules, specific to the lexicon and different from syntactic rules.
Corbin (1997) uses the notion of ‘polylexematic units’ (‘unités polylexématiques’)
as a general term for covering both compounds and lexicalized phrases. However,
both naming strategies are distinguished on the basis of the ‘division of labor
principle’ between morphology and syntax. According to this principle, also labe-
led the ‘Lexical Integrity Hypothesis’ (LIH hereafter), syntax has no access to mor-
phological operators or infralexical units and, conversely, morphology has no
access to syntactic operators:4
Les règles syntaxiques n’ont accès ni aux opérateurs morphologiques ni à des unités
infralexicales. Les règles morphologiques n’ont pas accès aux opérateurs syntaxiques.
(ibid.: 83)
On the one hand, this implies that affixed polylexematic units such as fil-de-
fériste ‘high wire walker’ (lit. wire-of-iron-ist) belong to morphology, since syntax
cannot attach affixes. On the other hand, polylexematic units containing a syn-
tactic operator, a preposition as in verre à vin ‘wine glass’ (lit. glass to wine) or a
determiner as in hors-la-loi ‘outlaw’ (lit. outside-the-law), necessarily belong to
syntax.5 In other words, polylexematic units are exclusively formed either by syn-
tax or by morphology, and the idea of a scale is thus rejected:
4 Corbin’s analysis is in line with the strong lexicalist hypothesis: ‘The syntax neither manipu-
lates nor has access to the internal structure of words’ (Anderson 1992: 84). On this topic, see,
among many others, Lieber (1992), Plag (2003) and, for an overview, Lieber/Scalise (2007).
5 There seems to be a contradiction in Corbin’s analysis, which considers fil-de-fériste as a mor-
phological unit despite the presence of the preposition de ‘of’. However, Corbin (1997: 83) argues
that the morphological insertion of the suffix -iste is subsequent to the insertion of the preposi-
tion de and that only the final step of the formation should be taken into account: the word is a
morphological construct (application of the suffix -iste) on the basis of a syntactically construct-
ed stem, fil de fer, which can be considered a lexical unit.
En vertu du partage des tâches entre les modules d’une grammaire, les séquences engen-
drables syntaxiquement ne le sont pas morphologiquement et réciproquement. (ibid.: 84)6
On the same grounds, Fradin (2009: 418) excludes expressions such as sans-
papiers ‘person without identity papers, illegal immigrant’ (lit. without papers)
and pied-à-terre ‘pied-à-terre, holiday cottage’ (lit. foot-on-ground) from true
compounding because they correspond to phrases that can be generated by syn-
tax (cf. Il s’est retrouvé sans papiers ‘he ended up without (identity) papers’ and
Le cavalier mit pied à terre ‘the horseman dismounted’ (lit. put foot on ground)).
He relabels Corbin’s proposal as ‘Principle A’:
Principle A: Compounds may not be built by syntax (they are morphological constructs)
(ibid.: 417)
6 Our translation: ‘By virtue of the division of tasks between the modules of a grammar, sequenc-
es that are possibly generated by syntax are not generated by morphology and vice versa’.
Villoing (2012: 30) specifies that French native compounding7 ‘is prototypically
formed of two lexemes of the current lexicon of French, without any linking ele-
ment; the internal order of constituents is XY, where X is the governing element’.
Furthermore, the composing lexemes belong, by definition, to the major word
classes (noun, verb, adjective), and are uninflected. This implies that ‘no constit-
uent is marked by inflection: no modality, tense, person or aspect marking on the
verb in VN compounds, no number on the N, and no gender or number on adjec-
tives, disregarding cases of agreement’ (ibid.: 31 f.).8 Examples are poisson-chat
‘catfish’ (lit. fish-cat), wagon-fumeur ‘smoking car’ (lit. car-smoker), ouvre-boîte
‘can opener’ (lit. open-can) and vert-pomme ‘apple green’ (lit. green-apple).9
This view implies that many other multi-word units that are often considered
compounds do in fact belong to syntax and, therefore, need to be analyzed as
lexicalized phrases. According to Villoing (ibid.: 35 f.), the following French mul-
ti-word units should not be analyzed as compounds:
–– Complex units composed of non-lexemes, such as complex prepositions
and complex conjunctions: e. g. par-dessus ‘from above’, de sorte que ‘such
that’10
–– Lexicalized syntactic constructions, namely NPs (3), PPs (4) and VPs (5) that
behave like lexical units:
7 Villoing (2012) distinguishes native compounds from neoclassical compounds, which have
different properties: the latter are ‘prototypically composed of two bases of Greek or Latin origin
that are not syntactically autonomous in French, connected by a linking element; the internal
order of constituents is YX, where X is the governing element’ (Villoing 2012: 30) (e. g. ludo-thèque
‘game library’, homi-cide ‘manslaughter’, cyno-céphale ‘dog head’).
8 However, Villoing (2012: 34) rightly observes that some compounds actually display inflected
forms of the lexeme: for instance, many [V N] compounds include a plural N, orthographically
and/or phonologically marked (e. g. presse-fruits ‘fruit press’ (lit. press-fruits), protège-yeux ‘eye
protector’ (lit. protect-eyes)). Villoing argues that this plural inflection is not the result of syntac-
tic marking, but of inherent and semantically motivated inflection.
9 This approach, in line with Corbin (1992), Villoing (2009), Bonami/Boyé (2003, 2014) and Fra-
din (2009), among others, implies that the V in French [V N]N compounds (e. g. ouvre-boîte ‘can
opener’) is not an inflected form of the verb (imperative or present indicative), but a stem of the
lexeme.
10 Although Zwanenburg (1992) starts from the same syntax-morphology divide principle, his
analysis leads to completely different results: he concludes that real compounding in French is
precisely restricted to nouns, adjectives and verbs with a modifying preposition or adverb (e. g.
sous-chef ‘deputy’ (lit. under-boss), arrière-pays ‘hinterland’ (lit. behind-land), maltraiter ‘mal-
treat’). Paradoxically, this implies that French compounding would be right-headed, similar to
Germanic compounding.
–– Lexicalized phrasal expressions that behave like lexical units: for instance,
rendez-vous ‘appointment, date’ (lit. go-you), qu’en-dira-t-on ‘gossip’ (lit.
what about it-will say-one).
Villoing (ibid.: 36) admits, nevertheless, that the boundary between compounds
and syntactic units is most problematic in the case of [N1 N2] sequences. This can
also be derived from her examples: horloger-bijoutier ‘jeweler-watchmaker’ is
considered a compound, whereas case départ ‘square one’ is analyzed as a lexi-
calized syntactic construction. It does indeed appear that French [N1 N2] sequences
can be constructed by both morphology and syntax and that a subcategorization
of [N1 N2] formations is needed. We will therefore focus on this particular forma-
tion type in Sections 3 and 4.
It can be concluded from the preceding overview that the term ‘compounding’ is
not used consistently in the French linguistic tradition and often covers much
more than, strictly speaking, morphologically complex lexical units. Hüning/
Schlücker (2015) point out the commonalities and differences found between
compounds as word-formation units and syntactically formed multi-word expres-
sions. In spite of the differences, both patterns may serve the same purpose and
even enter into competition to do so. As for French, many examples of competi-
tion can be found between [N N] and [N Prep N] formations: village(-)vacances
coexists with village de vacances ‘holiday village, holiday resort’ (lit. village (of)
holidays) and the same holds for point(-)rencontre and point de rencontre ‘meet-
ing point’ (lit. point (of) meeting) and impression (par) laser ‘laser printing’ (lit.
printing (by) laser) (cf. also Section 3.1). These facts indicate that in French, too,
the boundary between compounds and syntactic multi-word expressions is fuzzy
and the data are suggestive of a lexicon-syntax continuum.
This non-modular view of language is precisely a basic assumption of Con-
struction Grammar (cf. Goldberg 1995, 2006; Croft 2001; Booij 2010; Hoffmann/
Trousdale (eds.) 2013, a. o.). Crucial to this model is the concept of ‘constructions’:
these are conventional pairings of form (referring to syntactic, morphological and
phonological properties) and meaning (including semantic, pragmatic and dis-
course-functional properties) and are considered the fundamental units of the
linguistic system. All levels of grammatical description involve such form-mean-
ing pairings – not only words as in the Saussurean tradition – and constructions
vary in size, degree of schematicity and complexity (cf. Goldberg 2009), the min-
imal linguistic construction being the word in Booij’s (2010) model of Construc-
tion Morphology. Furthermore, constructions, both syntactic and morphological,
are linked to each other by (vertical) inheritance relations and also by (horizon-
tal) connectivity links (Norde 2014; Norde/Morris 2018). As a consequence, lan-
guage can be considered a complex network of constructions. Substantive con-
structions (e. g. petit mais vaillant ‘small but tough’, position clé ‘key position’) are
instances of semi-schematic constructions (e. g. [Adj1 mais Adj2], [N1 clé]), which
– in turn – inherit properties from more general schematic constructions (e. g.
[Adj1 CONJ Adj2], [N1 N2]). Moreover, constructions may also inherit properties
from multiple-parent constructions via so-called ‘multiple inheritance’ (cf. Trous-
dale 2013; Trousdale/Norde 2013).11
It is not surprising that many recent studies in the field of multi-word expres-
sions are in the constructionist vein. In this approach, it can be assumed that
both compounds and phrasal structures with a naming function can act as con-
ventionalized form-meaning pairings or ‘constructions’, and we should accept
the existence of what Booij (2010: 190) calls ‘lexical phrasal constructions’: these
are syntactic formations that should be stored as lexical units in the mental lexi-
con, such as fil de fer ‘iron wire’ (lit. wire of iron) and moulin à vent ‘windmill’ (lit.
mill at wind). These formations demonstrate that there is no strict boundary
between the lexicon and syntax, or, as Booij (ibid.: 191) puts it, ‘syntax permeates
the lexicon because syntactic units can be lexical’.
Compounds and phrasal structures are not only closely linked in the con-
structional network; they may also compete or interact with each other. The pro-
cess of ‘multiple inheritance’ may even produce hybrid constructions that inherit
properties from parent constructions belonging to different domains, such as
morphology and syntax. We believe that these insights from Construction Grammar
are useful to account for problematic cases that cannot be univocally classified
as morphological or syntactic constructs, such as French [N1 N2] subordinatives.
In Sections 3 and 4 we will therefore focus on these particular cases and in
Section 5 we will propose an analysis in line with the constructionist insights.
11 The idea of ‘multiple inheritance’ could be seen as the synchronic representation of the
complexity of language change. Diachronic developments do not always follow linear pathways from
one source construction to another target construction; a complex interplay between different
sources and processes is often at stake (cf. De Smet/Ghesquière/Van de Velde’s (eds.) 2013
volume On multiple source constructions in language change).
Fradin (2009) distinguishes between four types of [N1 N2] sequences: coordinates,
subordinates, two-slot nominal constructs and identificational constructs; the
first two are assigned to morphology and the others to syntax.
First, two types of [N1 N2] coordinates can be distinguished: in (6) each N has
a distinct referent and the compound’s denotatum is the sum of these referents;
the compounds in (7), however, denote a unique referent combining properties of
both N1 and N2 (ibid.: 429 f.):
one of its salient properties. According to Fradin (2009: 430 f.), this property may
concern a physical dimension (shape, length, weight) (8), an intrinsic capacity
(slowness, quickness, strength, duration) (9) or a function (10), and is metaphor-
based.
Even though Fradin recognizes that the morphological status of these compounds
is open to debate (cf. Section 4), he claims that the regular interpretative patterns
found in these subordinate compounds are similar to those of some derived lex-
emes, such as French adjectives derived with the suffix -able (Fradin 2003). In the
same way as productive suffixes, the N2 of subordinate [N1 N2] formations can be
combined with a broad range of stems and forms a productive constructional pat-
tern with a regular interpretation. This similarity with derivation is taken as an
argument in favor of their morphological status.
Whereas coordinate and subordinate [N1 N2] sequences follow a constrained
pattern and have a regular semantic relationship between the constituents, this is
not the case with two-slot nominal constructs (Fradin 2009: 432 f.) and identifi-
cational [N1 N2] sequences. The examples in (11) all denote the referent expressed
by N1, but they completely differ from subordinate compounds because N2 does
not refer to an intrinsic and salient property of N1. Moreover, the sequence usually
corresponds to a syntactic phrase in which N2 forms part of a prepositional phrase
(12), which suggests a syntactic origin.
Fradin (2009: 433 f.) likewise argues for identificational [N1 N2] sequences (cf.
also Noailly 1990):
N2 identifies N1 (‘N2 is an N1’) and from this point of view, these sequences are
equivalent to syntactic (appositional) [N1 N2] constructs in which N2 is a proper
noun and N1 expresses a socially recognized category (e. g. le président Mandela
‘President Mandela’, la région Bourgogne ‘the region of Burgundy’).
3.2 Discussion: morphological and syntactic approaches to [N1 N2] subordinatives
We agree with Fradin that [N1 N2] coordinates are true compounds and cannot be
the result of syntactic formation. We also subscribe to his view on two-slot nomi-
nal and identificational [N1 N2] constructs: both sequences can be shown to corre-
spond to syntactic phrases. However, subordinate [N1 N2] formations are more
problematic than acknowledged by Fradin (2009) and it can be demonstrated
that the examples mentioned for this class are not all of the same kind. At first
glance, it can, for instance, be observed that some of them permit degree modifi-
cation of N2 while others do not (discours vraiment fleuve ‘really lengthy discourse’
(lit. discourse really river) vs. *requin vraiment marteau ‘really hammerhead
shark’ (lit. shark really hammer)), and some but not all N2s form productive series
(e. g. discours-fleuve ‘lengthy discourse’ (lit. discourse-river), roman-fleuve ‘novel
cycle’, film-fleuve ‘lengthy movie’, débat-fleuve ‘lengthy debate’, etc.), while no
series formation is possible for [N-marteau], for instance. We will discuss these
differences more extensively in Section 4.
As already mentioned, these formations have been the subject of some
debate. Amiot/Van Goethem (2012: 350 ff.) and Van Goethem (2012: 77–81) pro-
vide an overview of the different accounts, which range from purely syntactic
analyses (cf. Noailly 1990 and Goes 1999) to strictly morphological accounts, like
the one by Fradin (2009).
With regard to the syntactic approaches, a distinction can be made between
analyses where the second component of the phrase is still considered a noun in
spite of some adjectival properties (cf. Noailly 1990, who labels N2 as ‘substantif
épithète’ and Arnaud/Renner 2014, who detect adjective-like syntactic behavior
to some extent), and others like Lehmann/Martin-Berthet (2008: 206), who claim
As these definitions show, attributives (e. g. high school) and appositives (e. g.
snailmail, swordfish) belong to the same ATAP class because they have similar
functions. The metaphorical value of the modifier is argued to be an important
distinctive criterion between [N1 N2] subordinatives (e. g. mushroom soup), on the
one hand, and [N1 N2] appositives (e. g. mushroom cloud), on the other:
In appositives that, together with attributives, make up the ATAP class, the noun plays an
attributive role and is often to be interpreted metaphorically. Metaphoricity is the factor that
enables us to make a distinction between, e. g. mushroom soup (a subordinate ground com-
pound) and mushroom cloud, where mushroom is not interpreted in its literal sense but is
rather construed as a ‘representation of the mushroom entity’ (...) whose relevant feature in
the compound under observation is shape. (ibid.: 52)
In the next section, we will take a closer look at this specific type of formation and
will argue that we need to distinguish between two different subclasses: classify-
ing and qualifying [N1 N2] subordinatives, of which only the former undoubtedly
belong to morphology.
N1.12 Despite their similarities (in all these subordinate compounds, N2 denotes a
salient, metaphor-based property of N1), N2 has a classifying role in some [N1 N2]
formations (e. g. requin-marteau) but a qualifying role in others (e. g. guerre-
éclair). We will present the distinguishing properties of both types of [N1 N2] sub-
ordinatives in 4.1 and 4.2, respectively.
4.1 Classifying [N1 N2] subordinatives
12 This category merges what Arnaud (2003: 13) calls the ‘composés équatifs-analogiques’ (‘equa-
tive analogical compounds’) and the ‘composés méronymiques-analogiques’ (‘meronymic ana-
logical compounds’), i. e. poisson-chat ‘catfish’ vs. poisson-scie ‘sawfish’, respectively.
13 To a certain extent, such sequences correspond to the ‘generic-specific compounds’ in Ar-
naud (2003), but the author classifies them as ‘equative/analogical compounds’, because of the
metaphorical use of N2.
(iii) In these cases, and as opposed to coordinate compounds, the two nouns
denote concrete entities that do not belong to the same semantic class and the
metaphor that underpins the relation between N1 and N2 is often based on physi-
cal resemblance: the nose of a poisson-scie is shaped like a saw (scie) and a saule
têtard has roughly the shape of a tadpole (têtard): a big head like the upper part
(the foliage) of the willow, and a short tiny bottom part (like the trunk). In our
examples, the only sequences that do not instantiate this relation are enfant-roi,
femme-objet and voiture-bélier, in which the metaphor is based on behavioral
resemblance. For example, an enfant-roi is a child (enfant) who is treated like a
king (roi) and who often becomes a ‘domestic tyrant’.
(iv) Syntactically, all the linguistic tests usually used to measure the lexical
integrity of a sequence (cf. Sections 2.2 and 2.3) show that these classifying [N1 N2]
formations are words, insofar as they do not accept any of these manipulations,
unlike the qualifying [N1 N2] subordinatives that we will study in Section 4.2.
(v) The last property to be mentioned is the fact that, unlike the qualifying
[N1 N2] formations, these classifying subordinatives do not give rise to productive
series.
We can conclude from this survey that the subordinate [N1 N2] formations like
those exemplified under (14) are binominal words and true compounds in which
N2 metaphorically denotes a classifying property of N1.
4.2 Qualifying [N1 N2] subordinatives
Qualifying [N1 N2] subordinatives can be distinguished from the classifying sub-
type on the following grounds:
(i) All kinds of nouns may instantiate N1: nouns denoting artefacts (15a),
social roles (15b), time or slots of time (15c), events (15d), and even abstract nouns
(15e):
(ii) According to Fradin (2009), N2s often refer to a metaphoric intrinsic property
of N1 (cf. Section 3.1): slowness (e. g. justice-escargot), quickness (e. g. guerre-
éclair), strength (e. g. attaquant-bulldozer) or duration (e. g. discours-fleuve). To
a certain extent, they often express intensity, as in livre-phare, acteur-clé,
moment-charnière: a livre-phare, for example, is a very famous book that attracts
a lot of attention. However, N2 does not have a categorization function (a livre-phare
is not a kind of book, an acteur-clé is not a kind of actor, etc.): the [N1 N2]
sequences exemplified under (15) are not designations that could be included in
a hyperonymy/hyponymy hierarchy. Instead, N2 has a qualifying role and, moreover,
it can often be substituted with a qualifying adjective: an acteur-clé is a very
important actor (in a given context), a justice-escargot is very slow justice, and so
on.
(iii) It is precisely the qualifying role of N2 that could, in our view, explain the
specific behavior of these [N1 N2] formations, and particularly their lack of lexical
integrity (cf. 2.3):
14 All examples followed by (www) were taken from the web via Google searches in May 2017.
(16b) Wall Street 2 adopte la forme d’une saga familiale fleuve (www)
‘Wall Street 2 takes the form of a very long (lit. river) family saga’
(16c) L’affiche du film d’animation culte Akira a eu droit à de nombreuses parodies (www)
‘The poster of the cult animated movie Akira spawned many parodies’
(16d) d’un coup de poing éclair, elle dévie le ballon (www)
‘with a lightning punch (lit. punch-of-fist), she deflects the ball’
In these examples, the N1s resemble phrases: they result from the association of a
noun and an adjective (16a–b) or of a noun and a prepositional phrase (16c–d).
The N2 slot can also be filled by a complex item, but this is more excep-
tional:
This second property is more challenging for the LIH: the lexical integrity of the
[N1 N2] sequences is undoubtedly called into question by the insertion of an adverb
of degree between the two Ns. This is why some authors put forward a weakened
(19) Les années 1970 constituent en effet une période absolument charnière
dans la vie des communautés […] (www)
‘The 1970s constituted an absolutely pivotal (lit. hinge) period in the life of
communities [...]’
(20) Nous reviendrons sur ce point réellement clé pour la suite de la réflexion
(www)
‘We will return to this point, which is really key (lit. this really key point)
for the continuation of the discussion’
(21) […] une version raccourcie d’un texte extrêmement fleuve qu’il a publié
quelques années plus tôt (www)
‘[…] an abridged version of an extremely lengthy (lit. river) text that he
published a couple of years before’
(22) le match a été une orgie offensive avec un score très fleuve (42–24 en
faveur des Parisiens) (www)
‘the match was an offensive orgy with a very crushing (lit. river) score (42–
24 in favor of the Parisians)’
The presence of such adverbs conflicts not only with the lexical integrity of the [N1
N2] sequence, but also with the nominal status of N2: usually an adverb of degree
modifies a gradable adjective, not a noun. However, in the context of the qualify-
ing [N1 N2] sequences, N2 seems to switch to adjectival status.
Syntactically, evidence for this adjectival status is not only provided by the
possibility of modification by an adverb, but, like a qualifying adjective, N2 can
also be inserted into a comparative construction:
15 Cf. also the ‘Italian trasporto latte-type constructions’ (Lieber/Scalise 2007), in which both
components can be modified by an adjective, e. g. produzione scarpe ‘shoe production’ → produzione
(accurata) scarpe (estive) ‘(accurate) production of (summer) shoes’.
(23a) pour moi c’est [la préadolescence] une période bien plus charnière
que l’adolescence (www)
‘For me it [pre-adolescence] is a much more pivotal (lit. hinge) period than
adolescence’
(23b) La proximité de commerces est moins clé que pour une résidence senior
(www)
‘The proximity of shops is less key than for a senior housing complex’
This demonstrates the qualifying value of N2 vis-à-vis N1. We will return to this in
Section 5, but it is worth noting for the time being that this behavior distin-
guishes the qualifying subordinative [N1 N2] from the classifying subordinative
(Section 4.1).
(c) Some N2s can be used predicatively. Predicative use is the most prototypical
use of qualifying adjectives. In some cases, ‘N2’ can fill the slot of an adjective in
a predicative construction (25) with or without degree marking:
In this use, the [N1 N2] construction (période charnière in (25a), rôle clé in (25b)
and interview fleuve in (25c)) is broken up, and N2 acquires autonomous adjectival
behavior. This separation of compound-like sequences has been labeled ‘debond-
ing’ by Norde (2009) (cf. also Amiot/Van Goethem 2012; Van Goethem 2012;
Norde/Van Goethem 2014; Van Goethem/De Smet 2014; and Van Goethem 2015).
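The contrasts drawn in Sections 4.1 and 4.2 can be summarized, very roughly, as a decision procedure. The Python sketch below is purely illustrative: the names `N1N2Profile` and `classify_subordinative` are hypothetical, the feature values are read off the examples cited above, and the yes/no coding is a simplifying assumption, since the data show that qualifying formations undergo syntactic operations only to a greater or lesser extent.

```python
from dataclasses import dataclass


@dataclass
class N1N2Profile:
    """Attested behavior of one subordinate [N1 N2] formation (hypothetical record)."""
    form: str
    degree_modification: bool  # e.g. une période absolument charnière
    comparative: bool          # e.g. une période bien plus charnière que ...
    predicative: bool          # N2 fills the slot of a predicative adjective
    productive_series: bool    # e.g. discours-fleuve, roman-fleuve, film-fleuve


def classify_subordinative(p: N1N2Profile) -> str:
    """Label a formation 'qualifying' if it passes any diagnostic, else 'classifying'.

    Assumption: a single attested diagnostic is enough; this flattens the
    gradience described in the text into a binary choice.
    """
    syntactic = p.degree_modification or p.comparative or p.predicative
    if syntactic or p.productive_series:
        return "qualifying"
    return "classifying"


examples = [
    # *requin vraiment marteau; no series formation for [N-marteau]
    N1N2Profile("requin-marteau", False, False, False, False),
    # degree modification (19), comparative (23a), predicative use (25a)
    N1N2Profile("période charnière", True, True, True, True),
    # degree modification (21), predicative use (25c), series in -fleuve;
    # no comparative example is cited in the text, hence False here
    N1N2Profile("discours-fleuve", True, False, True, True),
]

for profile in examples:
    print(profile.form, "->", classify_subordinative(profile))
```

On this coding, requin-marteau comes out as classifying and the other two formations as qualifying, which matches the grouping proposed in the text.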
5 A constructionist analysis of qualifying [N1 N2] subordinatives
As can be concluded from the preceding section, besides coordinate [N1 N2]
sequences, only classifying [N1 N2] subordinatives should be regarded as true
compounds in French, whereas the qualifying [N1 N2] formations display hybrid
behavior in the sense that they may, to a greater or lesser extent, undergo syntac-
tic operations. We will now demonstrate how the idea of ‘multiple inheritance’
(cf. Section 2.4) can be fruitfully applied to account for these hybrid qualifying
[N1 N2] subordinative constructions.
Two phases can be distinguished in the emergence of qualifying subordinatives
(cf. Amiot/Van Goethem 2012 and Van Goethem 2015 on [N1 clé] subordinatives).
The first step is the emergence of a productive constructional idiom – via
so-called ‘constructionalization’ (Traugott/Trousdale 2013; Hüning/Booij 2014) –
in which N2 develops a specific (metaphoric) qualifying meaning when combined
with an N1 in a compound(-like) sequence (e. g. question-clé ‘key question’,
moment charnière ‘pivotal moment’, réunion marathon ‘marathon meeting’, cas
limite ‘borderline case’, etc.). This qualifying meaning may be seen as the result
of ‘coercion’ (cf. Audring/Booij 2016) in which the metaphoric meaning some-
times already available for the noun outside the compound-like pattern (e. g. la
clé du succès ‘the key to success’) is selected (‘coercion by selection’) and/or in
which N2 develops adjective-like (semantic and formal) properties within the [N1
N2] pattern (‘coercion by override’). This semi-schematic construction, applied to
the example of [N charnière] formations, can be represented as follows:
16 The schematic representations are a bit simplified since, as we have seen in 4.2, N1 and N2 can
include a multi-word sequence, and the A can be instantiated by a phrase in the case of degree
modification (e. g. une période vraiment charnière).
The [N1 [(Adv) charnière]]N/NP sequence inherits its properties from two distinct
parent constructions, the morphological qualifying compound [N1 N2]N pattern
(e. g. moment-charnière ‘pivotal moment’) and the syntactic [N [(Adv) A]]NP pattern
(e. g. un moment (vraiment) crucial ‘a (really) crucial moment’). As a consequence,
and as shown in Section 4.2, it is a hybrid between a morphological and a syntac-
tic construction and N2 can, in some cases, gradually develop more adjective-like
syntactic uses, such as the predicative use.
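The notion of multiple inheritance used here has a direct analogue in object-oriented programming, which may help readers less familiar with constructional networks. The following Python sketch is an informal analogy only, not a formalization proposed in the constructionist literature: the class names and the property labels are hypothetical, chosen to mirror the two parent constructions named above.

```python
class Construction:
    """Hypothetical node in a constructional network."""
    properties = frozenset()

    @classmethod
    def inherited_properties(cls):
        """Union of the properties declared on this node and on all its ancestors."""
        collected = set()
        for klass in cls.__mro__:
            collected |= set(vars(klass).get("properties", ()))
        return frozenset(collected)


class QualifyingCompound(Construction):
    """Morphological parent: the [N1 N2]N pattern (e.g. moment-charnière)."""
    properties = frozenset({"modifier has a noun form",
                            "modifier has a metaphoric qualifying meaning"})


class NounAdjectivePhrase(Construction):
    """Syntactic parent: the [N [(Adv) A]]NP pattern (e.g. un moment vraiment crucial)."""
    properties = frozenset({"modifier accepts a degree adverb",
                            "modifier can be used predicatively"})


class NCharniere(QualifyingCompound, NounAdjectivePhrase):
    """Hybrid [N1 [(Adv) charnière]]N/NP construction with two parents."""
    properties = frozenset({"modifier slot is filled by charnière"})


for prop in sorted(NCharniere.inherited_properties()):
    print(prop)
```

The hybrid node ends up with properties from both parents, while each parent on its own lacks those of the other. What the analogy cannot capture is the gradual, item-specific way in which individual N2s acquire the syntactic properties.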
This approach indicates that French [N1 N2] subordinatives, and especially
the subclass of formations with a qualifying N2, are in reality closely related to
[N A] or [A N] formations. As we have seen in Section 3.2, Scalise/Bisetto (2009)
merge [N1 N2] appositives and [N A]/[A N] attributives within the class of ATAP
compounds because the modifier in both cases expresses a qualifying property of
the head noun. We can therefore conclude that their classification for these types
of formations is highly insightful. However, what is still missing in this approach
is the fact that this ATAP class contains not only pure (morphological) com-
pounds, but also hybrid constructs with both morphological and syntactic
properties.
6 Conclusion
In French, it turns out to be much more difficult than in Germanic languages to delineate
compounds from syntactic multi-word units. In the first part of this contribution,
we outlined three different approaches dealing with compounding in the
French tradition: non-restrictive, scalar and restrictive (lexicalist). Although we
believe morphological formations should be distinguished from syntactic forma-
tions, it is insightful to highlight their shared potential for expressing the same
denominative functions. We therefore added a fourth approach: we believe a
constructionist, non-modular approach to the language system provides a more
appropriate account. From this perspective, both compounds and phrasal struc-
tures with a naming function can act as conventionalized form-meaning pairings
or ‘constructions’ and we should accept the existence of what Booij (2010: 190)
calls ‘lexical phrasal constructions’, namely phrasal constructions that are stored
in the (mental) lexicon.
Another advantage of this constructionist approach is that it can deal with
structurally ambiguous formations, such as [N1 N2] structures with a qualifying
N2. As shown throughout this paper, these sequences are particularly difficult to
deal with in a modular approach because, on the one hand, they formally and
semantically resemble [N1 N2] (subordinative) compounds, but, on the other
hand, they allow syntactic operations to a greater or lesser extent. In a concep-
tion of language as a constructionist network, these hybrid formations can be
fruitfully accounted for by the mechanism of ‘multiple inheritance’. Following
this process, we have argued that the hybrid properties of French qualifying [N1
N2] sequences result from the inheritance of properties from both a morphological
and a syntactic parent construction.
References
Ackema, Peter/Neeleman, Ad (2004): Beyond Morphology. Oxford: Oxford University Press.
Amiot, Dany/Van Goethem, Kristel (2012): A constructional account of French -clé ‘key’ and
Dutch sleutel- ‘key’ as in mot-clé/sleutelwoord ‘key word’. In: Morphology 22. 347–364.
Anderson, Steven (1992): A-Morphous Morphology. Cambridge, UK: Cambridge University
Press.
Arnaud, Pierre J. (2003): Les Composés timbre-poste. Lyon: Presses Universitaires Lyon.
Arnaud, Pierre/Renner, Vincent (2014): English and French [NN]N lexical units: A categorial,
morphological and semantic comparison. In: Word Structure 7. 1–28.
Audring, Jenny/Booij, Geert (2016): Cooperation and coercion. In: Linguistics 54. 617–637.
Bauer, Laurie (1998): When is a sequence of two nouns a compound in English? In: English
Language and Linguistics 2. 65–86.
Bauer, Laurie (2004): Adjectives, compounds and words. In: Nordic Journal of English Studies 3.
7–22.
Bonami, Olivier/Boyé, Gilles (2003): Supplétion et classes flexionnelles. In: Langages 152.
102–126.
Bonami, Olivier/Boyé, Gilles (2014): De formes en thèmes. In: Villoing, Florence/David, Sophie/
Leroy, Sarah (eds.): Foisonnements morphologiques. Études en hommage à Françoise
Kerleroux. Presses Universitaires de Paris Ouest. 17–45.
Benveniste, Emile (1974): Problèmes de linguistique générale. Vol. 2. Paris: Gallimard.
Booij, Geert (2002): Constructional idioms, morphology, and the Dutch lexicon. In: Journal of
Germanic Linguistics 14. 301–329.
Booij, Geert (2005): Compounding and derivation. Evidence for construction morphology. In:
Dressler, Wolfgang U. et al. (eds.): Morphology and its Demarcations. Selected Papers
150 Kristel Van Goethem/Dany Amiot
from the 11th Morphology Meeting, Vienna, February 2004. Amsterdam: Benjamins.
109–132.
Booij, Geert (2010): Construction Morphology. Oxford: Oxford University Press.
Corbin, Danielle (1992): Hypothèses sur les frontières de la composition nominale. In: Cahiers
de Grammaire 17. 27–55.
Corbin, Danielle (1997): Locutions, composés, unités polylexématiques: lexicalisation et mode
de construction. In: Martins-Baltar, Michel (ed.): La locution entre langue et usages. Paris:
ENS. Editions Fontenay/Saint-Cloud. 53–101.
Croft, William (2001): Radical Construction Grammar: Syntactic theory in typological
perspective. Oxford: Oxford University Press.
De Caluwe, Johan (1990): Complementariteit tussen morfologische en syntactische
benoemingsprocédés. In: De Caluwe, Johan (ed.): Betekenis en productiviteit. Gentse
bijdragen tot de studie van de Nederlandse woordvorming. (= Studia Germanica
Gandensia 19). Gent: Seminarie voor Duitse taalkunde, Rijksuniversiteit Gent. 9–25.
De Smet, Hendrik/Ghesquière, Lobke/Van de Velde, Freek (eds.) (2013): On multiple source
constructions in language change. Special issue of Studies in Language 37, 3. Amsterdam
i. a.: Benjamins.
Di Sciullo, Anna Maria/Williams, Edwin (1987): On the definition of word. Cambridge, MA:
MIT Press.
Fradin, Bernard (2003): Nouvelles approches en morphologie. Paris: Presses Universitaires
de France.
Fradin, Bernard (2009): IE, Romance: French. In: Lieber/Štekauer (eds.). 417–435.
Giegerich, Heinz (2009a): Compounding and lexicalism. In: Lieber/Štekauer (eds.).
178–200.
Giegerich, Heinz (2009b): The English compound stress myth. In: Word Structure 2. 1–17.
Goes, Jan (1999): L’adjectif. Entre nom et verbe. Paris/Bruxelles: Duculot.
Goldberg, Adele (1995): Constructions. A Construction Grammar approach to argument
structure. Chicago: University of Chicago Press.
Goldberg, Adele (2006): Constructions at Work. The Nature of Generalization in Language.
Oxford: Oxford University Press.
Goldberg, Adele (2009): The nature of generalization in language. In: Cognitive Linguistics 20.
93–127.
Gross, Gaston (1988): Degré de figement des noms composés. In: Langages 90. 57–72.
Gross, Gaston (1996): Les expressions figées en français: noms composés et autres locutions.
Paris: Ophrys.
Hacken, Pius ten (1994): Defining Morphology. Hildesheim i. a.: Olms.
Hoffmann, Thomas/Trousdale, Graeme (eds.) (2013): The Oxford Handbook of Construction
Grammar. Oxford: Oxford University Press.
Hüning, Matthias/Booij, Geert (2014): From compounding to derivation. The emergence of
derivational affixes through “constructionalization”. In: Folia linguistica 48, 2.
579–604.
Hüning, Matthias/Schlücker, Barbara (2015): Multi-word expressions. In: Müller, Peter O. et al.
(eds.): Word-formation. An international handbook of the languages of Europe. Vol. 1.
(= Handbooks of Linguistics and Communication Science (HSK) 40.1). Berlin/Boston: De
Gruyter. 450–467.
Kleiber, Georges (1984): Dénomination et relations dénominatives. In: Langages 76.
77–94.
Van Goethem, Kristel/De Smet, Hendrik (2014): How nouns turn into adjectives. The emergence
of new adjectives in French, English and Dutch through debonding processes. In:
Languages in Contrast 14, 2. 251–277.
Wierzbicka, Anna (1996): Semantics. Primes and universals. Oxford: Oxford University Press.
Zwanenburg, Wiecher (1992): Compounding in French. In: Rivista di linguistica 4. 221–240.
Francesca Masini
Compounds and multi-word expressions
in Italian
It is often observed that compounds, being complex words formed by two (or
more) words, are the morphological constructions closest to syntactic construc-
tions, and that this is the reason why drawing a line between compounds and
phrases is often difficult. Other complex lexical units challenge – possibly even
more – the distinction between syntax, morphology and the lexicon: these are
generally known as multi-word expressions (henceforth MWEs). MWEs are larger
than morphological words and are nonetheless stored in our lexicon. The very
existence of such MWEs poses a number of theoretical questions regarding (i)
the organization of the lexicon, and (ii) the relationship between MWEs and
compounds.
The first question has been addressed, among others, by Jackendoff (1995,
1997), who proposes to extend the lexicon to “multiword constructions” (1997:
153), including so-called “constructional idioms” (Jackendoff 1990: 221; cf. also
Booij 2002a), since these phenomena are too pervasive to be regarded as a periph-
eral part of the grammar. This enlarged view of the lexicon is viable under such
approaches as the Parallel Architecture (Jackendoff 2010), Construction Mor-
phology (Booij 2010) and Construction Grammar in general (Hoffmann/Trous-
dale (eds.) 2013).
If we accept MWEs as part of our lexicon, we may want to address the second
question, which is exactly what the present volume does. More specifically, we
may ask:
a) Is there a way to distinguish between MWEs and compounds? On the basis of
which criteria? Are there criteria that would hold crosslinguistically?
b) What kind of role do MWEs and compounds play in the construction of the
lexicon? Is there competition between them?
These questions emerge quite naturally, given that both MWEs and compounds
are, in a way, complex (multiword) lexical units. Yet, relatively little attention has
been devoted to these specific issues, mainly because compounds and MWEs are
topics that traditionally belong to different linguistic fields: morphology on the
one hand, lexicology and phraseology on the other. In this paper I will address
Open Access. © 2019 Masini, published by De Gruyter. This work is licensed under the Creative
Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
https://doi.org/10.1515/9783110632446-006
the matter by discussing data from Italian, with a view to contributing some
answers to questions in a) and b) above.
First, I briefly describe the state of the art as far as Italian compounds and
MWEs are concerned (Section 2). In Section 3 I address demarcation issues con-
cerning compounding and MWEs. Section 4, instead, explores possible areas of
competition between compounds and MWEs.
(1a) pesce-cane
fish-dog
‘shark’
(1b) cava-tappi
extract-corks
‘corkscrew’
(1c) crimin-o-logo
crime-lv-logist
‘criminologist’
Compounding in Italian productively feeds mostly the word classes of nouns (2)
and adjectives (3), not verbs. As for input elements, productive patterns creating
nouns and adjectives involve mostly nouns, adjectives and verbs, secondarily
prepositions, as shown in (2)–(3) (where the head is underlined, when
present).1
1 These observations are taken from Masini/Scalise (2012). Patterns with semiwords are not
included.
divano letto is both a sofa and a bed, hence a hyponym of both its input elements),
or no internal head at all, like in nord-est (4c). Subordinate (SUB) NN compounds
also comprise two subtypes, depending on the nature of the head noun, that may
be deverbal (like vendita in (4d)) or not (like agenzia in (4e)).
Finally, one should note that Italian displays at least three productive pat-
terns of exocentric compounds: coordinate NN compounds (cf. (4c)), PN com-
pounds (cf. (2d)) and VN compounds, giving rise both to nouns (2c) and adjec-
tives (3c). The latter is one of the most productive types of compounds in
contemporary Italian (cf. Ricca 2010). Hence, exocentricity is well-attested in Ital-
ian compounding.
Most of these expressions have been investigated separately from word forma-
tion, and within other scholarly traditions. Idioms and collocations, for instance,
are typically the realm of phraseology (cf., e. g., Cowie (ed.) 1998) and corpus
linguistics (cf., e. g., Moon 1998), but also psycholinguistics (cf., e. g., Cacciari/
Tabossi (eds.) 1993) and syntax (cf., among others, Everaert et al. (eds.) 1995).
Morphologists, on the other hand, have always devoted little attention to
these multiword phenomena. A notable exception regards complex predicates
(cf., e. g., Butt 1995, Ackerman/Webelhuth 1997) – in particular verb-particle con-
structions in Germanic (cf., e. g., Dehé et al. (eds.) 2002) but also Romance (cf.
Iacobini/Masini 2007; cf. also below) languages.
Recently, morphologists have started devoting more attention to this area,
especially within the framework of Construction Morphology (Booij 2010; henceforth
CxM). This is hardly surprising, as also observed by Hüning/Schlücker (2015),
given that CxM is linked to Construction Grammar (Hoffmann/Trousdale (eds.)
2013; henceforth CxG), a model whose foundations lie in studies of idiomatic
structures, from Fillmore/Kay/O’Connor (1988) onwards.
In CxM, both words and word formation patterns are seen as ‘constructions’,
i. e. conventionalized form-meaning pairings: morphological constructions may
differ in size, complexity and schematicity, and are organized into a hierarchical
lexicon. Besides, units that are larger than a morphological word but nonetheless
conventionalized and stored in our lexicon are also regarded as constructions,
as complex signs. Indeed, CxM has originated from work on phenomena in-
between morphology and syntax, in particular separable complex verbs in Dutch,
which have been treated as a case of ‘periphrastic word formation’ by Booij
(2002b).
In other words, within CxG and CxM, MWEs are seen as part of our lexicon, as
anticipated in Section 1. Some MWEs have the same distribution as sentences
(sayings) or full VPs (idiomatic expressions); formulaic expressions may also
serve as full utterances (but note that formulae may also consist of a
single word). Some other MWEs, in particular those that have been called phrasal
lexemes or lexical phrases (Booij 2009a, 2010; Masini 2009, 2012) are closer than
other MWEs to morphological words (especially compounds), hence I will mainly
focus on these.
Phrasal lexemes are those MWEs that are closest to words in terms of both
distribution and function, i. e., they have a word-like distribution (so
sentence-level MWEs would not be phrasal lexemes) and they have the same
concept-naming function as words, thus contributing to lexical enrichment (cf.
Masini 2012). They correspond to various patterns and can in principle belong to all
lexical categories, at least in Italian, e. g.: nouns (6a), adjectives (6b), verbs (6c),
adverbs (6d), prepositions (6e), conjunctions (6f), interjections (6g), pronouns
(6h).
These items are not words in the proper sense, since they have a phrase-like struc-
ture; some of them may even be separable under certain conditions.2 At the same
time, however, they present a unitary, often conventionalized semantics, and dis-
play a higher degree of internal cohesion than free phrases.
As an example, let us take phrasal lexemes that belong to the noun category,
i. e., phrasal nouns.3 Italian presents a variety of patterns that fill this class (cf.
Masini 2012), including:
Phrasal nouns of the NP(Art)N type, for instance, look like normal noun phrases
(formed by a noun plus a prepositional phrase), but are more cohesive than free
phrases: indeed, they generally resist various operations (with some variation)
2 This is especially true of verbal expressions: stare su (6c), for instance, may be interrupted by
a light adverb, e. g. stai subito su! (lit. stay immediately up) ‘get up immediately!’. On this topic cf.
Voghera (2004), who claims that the (different) degree of cohesiveness displayed by these
expressions partially depends on the lexical category they belong to, with prepositional and
conjunctional phrasal lexemes being more cohesive than adverbial and adjectival ones, the latter
being more cohesive than nominal ones, whereas verbal expressions are the least cohesive of all.
3 These items have been named in many different ways in the literature, including, e. g., “phras-
al compounds” / “prepositional compounds” (Delfitto/Melloni 2009; Rio-Torto/Ribeiro 2009),
and “improper compounds” (Rainer/Varela 1992). The distinction between phrasal nouns and
compound nouns is not always trivial, as we will see in Section 3.
4 Coordinate phrasal nouns can also be formed by two verbs, e. g. va e vieni (lit. go and come)
‘coming and going / toing and froing’ (cf. Masini/Thornton 2008).
What is crucial about these items is that they are not just univerbations or lexical-
ized phrases that emerge diachronically. Some certainly are, but a number of
them are actually neologisms productively created by speakers to name new con-
cepts. Sometimes, they are calques from other languages. Take for instance the
three following examples, from the ONLI database:5
Cibo di strada (9a) is a calque from English street food which is however rendered
in Italian with a NPN phrasal noun rather than a NN compound, which possibly
points to the higher availability of the former type. Popolo della rete and città
digitale are new coinages that have been introduced into the Italian language by
exploiting the NPArtN and NA patterns, respectively. All examples in (9) are there-
(10a) andare su
go up
‘to go up(wards) / to ascend’
(10b) mettere sotto
put down
‘to run over (with a vehicle)’
(10c) guardare avanti
look forward
‘to look forward / to look to the future’
(10d) buttare via
throw away
‘to throw away / to waste’
Since phrasal lexemes (and other MWEs) can be seen as constructions within
CxM – exactly like simple and complex words, as well as word formation schemas
and subschemas – we expect them to interact in various ways with word-forma-
tion processes. Hüning/Schlücker (2015) claim that “MWEs and compounds are
largely a complementary means for creating lexical units”. In Section 4, I offer
some data and reflections about the relationship between these two strategies in
terms of competition. Before that, however, it is necessary to discuss some demar-
cation issues.
3 Demarcation issues
Starting from the idea that we have two sets of complex lexical constructions that
are used to form stable (stored), complex denotations in the world’s languages,
namely compounds and MWEs, we may ask if they can actually be distinguished,
and on what grounds. In addition, we may want to ask whether their demarcation
is clear-cut or not in every language, and if the criteria to be used are valid cross-
linguistically. The expectation is that crosslinguistic validity is hardly achievable,
since the demarcation between compounds and MWEs ultimately has to do with
the demarcation between morphology and syntax, between words and phrases,
which is a well-known, unsolved question, especially in a typological perspective
(cf., e. g., Haspelmath 2011).
Compounds as purely morphological objects have been defined by Guevara/
Scalise (2009: 108) as complex words formed by two (or more) words whose gen-
eral structure is captured by the formula in (12). Let us take this operational defi-
nition as a starting point for our discussion.
(12) [X ℜ Y]Z
where X, Y and Z represent major lexical categories, and ℜ represents an
implicit relationship between the constituents (a relationship not spelled
out by any lexical item)
German works very similarly (cf. Schlücker/Hüning 2009): like in Dutch, in Ger-
man AN compounds, the adjective is not inflected, bears the main stress and is
generally monomorphemic (14a), whereas in AN phrasal lexemes the adjective is
inflected, does not bear the main stress, and can be complex (cf. (14b–c)).
6 As Booij (2009a: 224) states, “[t]he pre-nominal adjective ends in the suffix -e, unless the NP is
indefinite and the head noun is singular and neuter (in the latter case the ending is zero)”.
Quite expectedly, the criteria vary from language to language, depending on lan-
guage-specific properties. Some criteria may be shared by more than one lan-
guage (e. g. Dutch and Russian share the loss of agreement inflection, although
the phenomenon is more consistent in Russian), whereas others may not (e. g.
Russian linking vowels can be used to distinguish compounds from phrases, but
not all languages feature these items). Furthermore, some criteria are themselves
questionable: it is not clear, for instance, whether “semantic transparency”
would be a reliable criterion, as we will discuss below.
What about Italian? How can we say, for instance, that the expressions in (17)
are phrasal nouns and not compounds?
It seems to me that the following criteria might be used for Italian, taking the
definition in (12) and lexical integrity as reference points:
Agreement in number and gender is present in (17a) and (17b), as shown by the
glosses. The presence of explicit relational markers is displayed by both (17c)
(preposition di ‘of’) and (17d) (conjunction e ‘and’). The presence of minor lexical
categories is shown by the examples in (19): in (19a) the two nouns are linked by
a preposition with article (della ‘of the’ = di ‘of’ + la ‘the.f.sg’), whereas in (19b)
we have a lexicalized expression containing an article. Finally, bounded elements
show up in compounds only (cf. Section 2.1).7
It is worth noting that the proposed criteria are formal, not semantic. Bisetto
(2004) proposes a semantic criterion to distinguish between compounds and
so-called polirematiche (an Italian standard term for phrasal lexemes): com-
pounds would be the result of a productive process, thus tending to be hyponyms
of their heads, whereas polirematiche would arise from lexicalization and thus
typically display a non-compositional meaning. In our view, this semantic
criterion is not really decisive: on the one hand, we may have compounds that are
formed productively and can be readily interpreted by the hearer (cf. (20a), where
capostazione is actually a type of capo) and compounds that are more lexicalized
and whose semantics is not as transparent (20b); on the other hand, phrasal
nouns may either be created on the basis of a productive and interpretable pat-
tern (21a) or arise from lexicalization or idiomatization of a phrase (21b).
(20a) capo-stazione
head-station
‘stationmaster’
7 A possible counterexample would be the phrasal lexemes with two coordinated verbs men-
tioned in footnote 4 (e. g. va e vieni lit. go and come ‘coming and going’). As argued by Masini/
Thornton (2008), the verbal forms used in these expressions are homophonous with the 2nd person
singular imperative, exactly like the verbal forms occurring within VN compounds. If we analyze
this verbal form as some sort of morphomic stem used in Italian morphology, we end up with a
clash: use of a bounded form on the one hand, and presence of an explicit relational marker (e
‘and’) on the other.
(20b) capo-cielo
head-sky
‘canopy erected over a high altar’
The application of the criteria proposed above is not always straightforward and
may produce unexpected results. Take for instance the agreement criterion. This
is pretty efficient in Dutch and Russian, but less so in Italian, since agreement
takes place in virtually all combinations of a noun and an adjective. This means
that even an expression like croce-rossa (cross.f.sg-red.f.sg) ‘Red Cross’, which is
traditionally regarded as a compound in the literature (like many others, e. g.,
cassaforte ‘safe’ in Table 1), should instead be considered as a phrasal lexeme by
this criterion, exactly like carta telefonica and terzo mondo in (17a–b).
Along these lines, one may argue that also “internal inherent inflection” (i. e.
inflectional marking occurring inside the word, not triggered by agreement, such
as number for nouns) should be considered as a criterion to be added to the list in
(18). Also in this case, we would end up regarding many Italian items (tradition-
ally analyzed as compounds) as phrasal lexemes, such as left-headed NN com-
pounds of the capostazione type (20a), in which the plural marker applies to the
left (head) constituent: capo-stazione (lit. head-station) ‘stationmaster’ turns
into capi-stazione (lit. heads-station) ‘stationmasters’, and not *capo-stazioni
(head-stations), with plural marker on the right (as we would expect from a “true
word”). However, we can also observe that, despite internal inflection,
capostazione is still (at least partly) compound-like due to the absence of any relational
element (see criterion (18b)) between capo and stazione (cf. the corresponding
phrasal expression capo della stazione lit. head of.the station). Therefore, the
compound-phrasal noun demarcation may be a matter of degree rather than
clear-cut (cf. also footnote 7): for instance, compounds that display no internal
agreement and no internal (inherent) inflection (e. g. dopoguerra ‘post war period’
or asciugamani ‘towel’, cf. (2), Section 2.1) are more compound-like than
capostazione (which is split by inflection in the plural). In other words, the concepts of
compound and phrasal lexeme may be seen as prototypes, or radial categories,
that can be defined on the basis of a complex interaction of properties, rather
than on a set of necessary and sufficient features.
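The gradient view just described can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: the three properties are treated as binary and equally weighted, which the text does not claim, and the class name `LexicalUnit` and the function `compound_likeness` are hypothetical. The feature values follow the discussion above, except where a comment marks an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalUnit:
    form: str
    internal_agreement: bool   # constituents agree with each other
    internal_inflection: bool  # inherent inflection inside the unit (e.g. plural on the left head)
    relational_marker: bool    # explicit preposition or conjunction between the constituents


def compound_likeness(unit: LexicalUnit) -> int:
    """Count the phrase-like properties the unit lacks (0 = most phrase-like, 3 = most compound-like)."""
    phrase_like = (unit.internal_agreement, unit.internal_inflection, unit.relational_marker)
    return sum(not present for present in phrase_like)


UNITS = [
    LexicalUnit("dopoguerra", False, False, False),
    LexicalUnit("capostazione", False, True, False),  # plural capi-stazione
    # Assumption: the phrasal counterpart also pluralizes on capo; it contains della 'of.the'.
    LexicalUnit("capo della stazione", False, True, True),
]

# Rank from most to least compound-like.
ranked = sorted(UNITS, key=compound_likeness, reverse=True)
for unit in ranked:
    print(unit.form, compound_likeness(unit))
```

A simple count is the crudest possible stand-in for a prototype category; a fuller model would weight the properties per language, in line with the point that the criteria are language-specific.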
All in all, based on the observations above, one may note that the demarca-
tion between noun compounds and phrasal nouns in Italian largely relies on cri-
teria that typically distinguish words from phrases, with the complication that
phrasal nouns are not free, full-fledged phrases.8 As is well-known, word(hood) is
far from being a simple concept with crosslinguistic validity (Haspelmath 2011).
However, CxM assumes that “cohesiveness is the defining criterion for canonical
wordhood” (Booij 2009b: 97). And cohesiveness obviously manifests itself in dif-
ferent ways in different languages, depending on the morphological and syntac-
tic properties of the language in question. So, the exact criteria to be used should
be identified on a language-specific basis, but the same general principle applies.
Given these premises, we might expect to have languages in which the formal
differences between compounds and phrasal lexemes are evident and easily
detectable (e. g. Russian, i. e. a language where compounds are mostly root-com-
pounds), languages in which these are vague or even non-existent (e. g. English,
where it is very difficult to state whether conventionalized AN combinations such
as black board are compounds or phrases, cf. Giegerich 2005, 2009), and lan-
guages, such as Italian, that are in-between, since they offer at least some evi-
dence in favor of maintaining such a division.
In conclusion, we may regard the demarcation between compounds and
phrasal lexemes as an element of variation among the languages of the world that
possibly correlates with their morphological type: with the limited data gathered
so far, we may hypothesize that this demarcation is clearer in highly inflectional
languages displaying root compounding, whereas in isolating languages the
boundary is definitely more blurred, if not absent.
4 Competition issues
Competition in morphology and the lexicon is generally viewed as a relation
holding between different word-level strategies that compete to realize the same
grammatical or lexico-conceptual meaning. However, recent work has claimed
that morphological words also compete with MWEs (Booij 2010; Hüning/Schlücker
2015; Masini 2016, to appear). The relationship between morphological
words and MWEs is nonetheless still underinvestigated and calls for further research.
8 The following general properties keep phrasal lexemes apart from true, free phrases: greater
internal cohesion, paradigmatic fixedness (i. e., they resist lexical substitution) and convention-
alized (though not necessarily idiomatic) meaning (cf. Section 2.2, example (8)).
In this section I show that competition between compounds and MWEs may
result in the blocking of specific lexical items, and that these blocking effects may
operate in both directions.9 More specifically, I briefly illustrate three case-studies
regarding the competition between compounds and phrasal lexemes in the nom-
inal domain, namely: i) NP(Art)N phrasal nouns (e. g. macchina della verità ‘lie
detector’) in comparison with NN compounds (e. g. capostazione ‘stationmaster’)
(Section 4.1); ii) the simile construction with color adjectives (e. g. rosso come il
fuoco lit. red as the fire ‘red as fire’) in comparison with the corresponding
compound pattern (e. g. rosso fuoco lit. red fire ‘fire-like red’) (Section 4.2); iii)
irreversible binomials (e. g. sano e salvo ‘safe and sound’) as compared with coordinate
compounds of the sordomuto ‘deaf-mute’ type (Section 4.3).
4.1 Complex nominals: NP(Art)N phrasal nouns vs. NN compounds
9 For a broader picture of the competition between MWEs and all kinds of morphological words,
including simple words and derived words, cf. Masini (2016, to appear).
Given that NN compounds and NP(Art)N phrasal nouns coexist in Italian, and
that both are used to coin new complex nominals, competition between these two
patterns is likely to emerge. As a case-study, let us consider Italian NN compounds
where capo ‘head, boss’ is the head (leftmost) constituent. This pattern of com-
pounding is pretty productive in Italian and is associated with the meaning
‘head/boss of N’.10
(23a) capo-stazione
head-station
‘stationmaster’
(23b) capo-classe
head-class
‘class president’
(23c) capo-gruppo
head-group
‘group leader’
(23d) capo-famiglia
head-family
‘head of the family’
(24a) °capo-stato
head-state
(24b) °capo-governo
head-government
(24c) °capo-polizia
head-police
However, these perfectly well-formed items are not actually produced (the ° sign
marks well-formed but non-existent expressions). The reason for this is that they
are blocked by already existing NPN phrasal nouns featuring the same constitu-
ent words, namely:
10 Note that not all capo+N compounds have this semantics: some mean ‘chief N’, such as capo-
redattore (lit. head-editor) ‘editor in chief’.
The reverse may also occur: for instance, the expression capo della classe (26) is
perfectly grammatical and interpretable as ‘class president’; however, it is not
used with this specific intended reading, because the same meaning is already
conveyed by the established compound capoclasse (cf. (23b)).
(27a) grüne Welle (lit. green wave) ‘progressive signal system’ (German)
(27a’) °Grünwelle
(27b) Dunkelkammer ‘darkroom’
(27b’) °dunkle Kammer
These data may be interpreted either as cases of lexical blocking (Rainer 2016)
(token blocking in Rainer 1988) or rather as an effect of a more general tension
between two competing patterns, namely, in the Italian case, between NN com-
pounding on the one hand and NP(Art)N phrasal lexemes on the other. Both
views are viable in a constructionist view of morphology and the lexicon, where
constructions are arranged into an inheritance hierarchy where abstract schemas
generalize over more specific constructions. Hence, which type of blocking is
actually at work is an empirical question.
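The token-blocking reading of these data can be sketched as a lookup in a stored lexicon: a well-formed candidate is blocked when an established item of the competing pattern already conveys the same meaning. This is a minimal illustration, not a model proposed in the text. The dictionary `ESTABLISHED`, the function `is_blocked` and the pattern labels are hypothetical, and the phrasal entry capo dello stato is an assumed example of an established phrasal noun rather than a form quoted from the examples.

```python
from typing import Dict, Tuple

# Toy stored lexicon: meaning -> (established form, pattern).
ESTABLISHED: Dict[str, Tuple[str, str]] = {
    "class president": ("capoclasse", "NN compound"),
    # Assumed established phrasal noun, used here only for illustration.
    "head of state": ("capo dello stato", "NP(Art)N phrasal noun"),
}


def is_blocked(form: str, pattern: str, meaning: str,
               lexicon: Dict[str, Tuple[str, str]] = ESTABLISHED) -> bool:
    """Return True if a different stored item of the competing pattern already has this meaning."""
    entry = lexicon.get(meaning)
    if entry is None:
        return False  # nothing stored for this meaning, so nothing can block
    stored_form, stored_pattern = entry
    return stored_form != form and stored_pattern != pattern


# Blocking operates in both directions:
print(is_blocked("capo della classe", "NP(Art)N phrasal noun", "class president"))  # phrase blocked by compound
print(is_blocked("capostato", "NN compound", "head of state"))                      # compound blocked by phrase
```

The sketch covers only item-by-item blocking; the alternative reading, a general tension between the two schemas, would need pattern-level information that a flat lookup does not represent.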
4.2 Complex color expressions: simile constructions vs. compounds
If we search for the [A come NP] pattern in a large corpus11 and rank the results by
frequency, what we find is that many of the top ranked occurrences contain a
color term (32), most notably nero ‘black’, bianco ‘white’ and rosso ‘red’ (but also
other colors, e. g. azzurro ‘light-blue’, giallo ‘yellow’, blu ‘blue’, verde ‘green’). The
simile construction with color terms apparently retains the intensification mean-
ing associated with the general [A come NP] construction.
Interestingly, some of the AN pairs occurring within the simile construction are
also found as compounds in German, as noted by Hüning/Schlücker (2015):
The same holds for Italian, but only for a subset of expressions, namely those
containing a color adjective (34). Similar doublets are not found in Italian with
other kinds of adjectives (cf. (35), corresponding to (31)).
11 The data for this analysis are taken from the Italian Web 2010 (or itTenTen10) corpus, a web
corpus of approx. 2.5 billion words searched through the SketchEngine
(www.sketchengine.co.uk, last access: March 2017).
Compounds of the [ACOLOR N] type are relatively common in Italian (cf. D’Achille/
Grossmann 2010, 2013). The color A is the head of the compound (and is generally
invariable), whereas the N serves as a modifier: more precisely, it denotes a refer-
ent that typically exemplifies the shade of the color in question. The expression
giallo canarino (lit. yellow canary), for instance, denotes a kind of yellow that is
typically exemplified by canary birds.
Therefore, we have a domain, that of complex color adjectives, where there
seem to be two competing strategies that form expressions with similar content:
[ACOLOR come NP] multiword simile constructions and [ACOLOR N] compounds. How
much do they actually overlap?
In order to answer this question, I generated frequency lists of both the [ACOLOR
N] and the [A come NP] pattern for five color terms (nero ‘black’, bianco ‘white’,
rosso ‘red’, azzurro ‘light-blue’, verde ‘green’), using the itTenTen10 corpus (cf.
footnote 11), and then I compared the top results of the (manually revised) lists,
in order to see if the two constructions occur with the same nouns. It turned out
that the two constructions share quite a lot of nouns, thus producing a consider-
able number of doublets. As an exemplification, see the 15 top ranked hits for
rosso ‘red’ in Table 2, where the grey cells highlight the nouns that both construc-
tions occur with.
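The comparison procedure described here (one frequency list per construction, then a check for shared nouns among the top-ranked hits) can be sketched as follows. This is a hedged illustration, not the actual extraction pipeline: the function names `top_nouns` and `doublets` are hypothetical, and the hit lists are invented placeholders rather than counts from the itTenTen10 corpus.

```python
from collections import Counter
from typing import List


def top_nouns(hits: List[str], k: int) -> List[str]:
    """Return the k most frequent nouns in a list of noun tokens, most frequent first."""
    return [noun for noun, _ in Counter(hits).most_common(k)]


def doublets(compound_hits: List[str], simile_hits: List[str], k: int = 15) -> List[str]:
    """Nouns among the top k of both constructions, in compound-frequency order."""
    top_simile = set(top_nouns(simile_hits, k))
    return [noun for noun in top_nouns(compound_hits, k) if noun in top_simile]


# Invented toy hits for rosso 'red': each string is the noun of one (manually revised) corpus hit.
compound_hits = ["fuoco"] * 5 + ["ruggine"] * 3 + ["mattone"] * 2  # [rosso N]
simile_hits = ["fuoco"] * 4 + ["peperone"] * 3 + ["ruggine"] * 1   # [rosso come NP]

print(doublets(compound_hits, simile_hits, k=3))  # nouns shared by both top-3 lists
```

Restricting the comparison to the top k of each list mirrors the choice of inspecting only the top-ranked hits; a larger k would admit rarer doublets at the cost of noisier lists.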
Table 2: Comparing [rosso N] and [rosso come NP]: top ranked results from the itTenTen10
corpus
A similar picture emerged for other colors. For instance: nero ‘black’ frequently
occurs with pece ‘pitch’, notte ‘night’, carbone ‘coal’, inchiostro ‘ink’ and petrolio
‘oil’ in both constructions (vs. e. g. morte ‘death’, which selects only the simile
construction: nero come la morte lit. black like the death ‘intense black’); bianco
‘white’ frequently occurs with latte ‘milk’, avorio ‘ivory’, marmo ‘marble’, neve
‘snow’, carta ‘paper’ and cadavere ‘corpse’ in both constructions (vs. cencio ‘rag’
and crema ‘cream’, which occur only in one construction: bianco come un cencio
lit. white as a rag ‘very pale’, bianco crema lit. white cream ‘cream-like white’).
Therefore, these two constructions share many of the nouns they combine with and
actually seem to compete with each other.
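The comparison just described can be made concrete with a minimal sketch in Python. It assumes that the hits for each construction have already been extracted from the corpus and manually revised into a noun-to-frequency mapping. The function names and the counts below are invented placeholders for illustration only, not figures from itTenTen10; the nouns are those mentioned above for bianco ‘white’.

```python
from collections import Counter


def top_nouns(freq, k):
    """Return the k most frequent nouns of a frequency list, in rank order."""
    return [noun for noun, _ in Counter(freq).most_common(k)]


def compare_patterns(compound_freq, simile_freq, k=15):
    """Split the top-k nouns of two patterns into shared and exclusive sets."""
    top_compound = top_nouns(compound_freq, k)
    top_simile = top_nouns(simile_freq, k)
    simile_set = set(top_simile)
    # Doublets: nouns attested among the top-k hits of both constructions.
    shared = [n for n in top_compound if n in simile_set]
    return {
        "shared": shared,
        "compound_only": [n for n in top_compound if n not in shared],
        "simile_only": [n for n in top_simile if n not in shared],
    }


# Placeholder counts (invented for illustration, NOT corpus figures).
bianco_compound = {"latte": 50, "avorio": 40, "neve": 30, "crema": 20}
bianco_simile = {"neve": 60, "latte": 45, "cencio": 35, "avorio": 10}

result = compare_patterns(bianco_compound, bianco_simile, k=4)
print(result["shared"])         # ['latte', 'avorio', 'neve']
print(result["compound_only"])  # ['crema']
print(result["simile_only"])    # ['cencio']
```

With real frequency lists, the "shared" entries would correspond to the doublets, while the two remaining lists would contain nouns such as crema and cencio that select only one of the two constructions.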
At this point, one may inquire whether they are really equivalent. Take for
instance the pairs in (36)–(39), where the (a) examples are taken from the
itTenTen10 corpus and the (b) examples contain the corresponding (either MWE
or compound) expression.
(37a) Per arrivarci bisogna guadare a piedi un fiume [...] rosso come la ruggine
‘To get there you have to cross on foot a rust-like red river’
(37b) Per arrivarci bisogna guadare a piedi un fiume [...] rosso ruggine
In these pairs, the two expressions seem quite interchangeable. However, a closer
analysis of a number of examples showed that interchangeability is possible only in
specific contexts that meet certain semantic conditions, to which I now turn.
I mentioned above that compounds of the [A_COLOR N] type denote, quite neutrally,
a kind of color that is typically exemplified by N, whereas the simile
construction with color terms, besides denoting a type of color, shares the
intensification meaning with the general [A come NP] construction. This intensifying
effect is especially prominent when N refers to an object that is associated with an
intense shade, or with the focal shade of the color in question (40a). The intensi-
fication effect diminishes when N identifies a referent that is not associated with
such an intense or “prototypical” shade (40b). At the same time, when the com-
pound features an N that identifies a referent that is associated with such an
intense or “prototypical” shade of the color at hand (41a), some slight intensifica-
tion emerges, otherwise absent in this construction (41b).12
12 Incidentally, the association with an entity (N) that is regarded as a prototypical example of
the property conveyed by A might actually be at the basis of the intensification meaning con-
veyed by the more general [A come NP] construction.
(40b) rosso come la ruggine (lit. red as the rust) ‘rust-like red’
≠ true/intense red
The two patterns are more likely to be interchangeable when they tend to
“converge”, i. e. when the intensification value is low in the [A_COLOR come NP] pattern
(cf. (37), (39)) and when some intensification emerges in the [A_COLOR N] pattern (cf.
(36), (38)), depending on the kind of N used. This said, it must be added that even
in these specific situations, the two constructions are not totally equal
semantically, because the simile construction always has a higher degree of
expressiveness, probably inherited from the more general simile construction of which
it is an instance. Compounds, on the other hand, are more objective and neutral. In those
contexts where they are interchangeable, the two expressions may thus be seen
as propositional synonyms (Cruse 2004), i. e. as denotationally equivalent but dif-
ferent in expressive meaning.
Besides semantics, there are a number of formal properties, partially derived
from their phrasal vs. morphological status, that differentiate the two
constructions. First of all, in the [A_COLOR come NP] pattern the color adjective is
variable (see e. g. (39a), where azzurri agrees in number and gender with occhi:
plural, masculine), whereas in the compound pattern it is primarily invariable:13
Second, in the [A_COLOR come NP] pattern, the color adjective can only be an
adjective, whereas the compound may also be used as a noun:
Third, although the two constructions share a lot of nouns, not all nouns are
equally likely to occur in both constructions. For instance, the combination of
azzurro ‘light blue’ and polvere ‘dust’ seems to occur within the compound pat-
tern only (44a), whereas the combination of nero ‘black’ and buio ‘dark’ seems to
work only within the simile construction (44b).
In some cases, the attempt to apply a given A-N combination occurring in one
construction to the other construction results in an unacceptable string. This typ-
ically happens when N is an abstract noun (45a–a’), when a metonymy is at work
(cf. (45b–b’), where the entity referred to is not a cardinal, but the cardinal’s cas-
sock), and when the association with N has a purely intensifying effect, like in
(45c–c’), where there is no obvious relationship between rags and whiteness.
In conclusion, what emerges from this overall picture is that the two
constructions are not really equivalent in terms of either meaning or form. In some
specific instances the two versions – compound and multiword – are pretty close and
possibly competing with one another (although the multiword version is gener-
ally more expressive), but even in these cases they have partially different struc-
tural properties. Besides, they do not share the whole array of possible A-N pairs.
In other words, the two constructions seem to do their best not to overlap too
much, and to differentiate from each other.
4.3 Coordination in the lexicon: irreversible binomials vs. compounds
(46a) sordo-muto
deaf-mute
‘deaf-mute’
(46b) studente-lavoratore
student-worker
‘student-worker’
(46c) agro-dolce
sour-sweet
‘sweet and sour, bittersweet’
(46d) ceco-slovacco
Czech-Slovak
‘Czechoslovak’
Along the lines of the pattern for coordinate compounds, we might theoretically
form compounds like those in (48): however, these expressions are not actually
created by speakers because the corresponding binomials already exist (47a–b).
(48a) °sanosalvo
healthy-safe
(48b) °vivo-vegeto
alive-thriving
The reverse situation may also occur: for instance, the existence of an established
coordinate compound like sordomuto (46a) blocks the formation, or lexicaliza-
tion, of the corresponding irreversible binomial (49), which would be technically
well-formed.
To what extent are these two patterns – coordinate compounds and irreversible
binomials – actually equivalent? Let us take a step back.
Arcodia/Grandi/Wälchli (2010: 178) propose a macro-distinction between:
i) “hyperonymic coordinate compounds” (what Wälchli 2005 calls “co-com-
pounds”), which express superordinate-level concepts, i. e. their referent is in a
superordinate relationship to the meaning of the parts (cf. (50a)); ii) “hyponymic
coordinate compounds”, which express subordinate-level concepts, i. e. their ref-
erent is in a subordinate relationship to the meaning of the parts (cf. (50b)). They
also claim that, whereas the latter are common in Standard Average European
(SAE) languages, including of course Italian (cf. also Grandi 2011), the former are
more typically found in East and South East Asia.
However, Masini (2006, 2012) shows that most of these functions are actually
found in Italian, but they are conveyed by irreversible binomials (cf. (54a), (55a),
(56a)). Most likely the same holds for other SAE languages: see for instance the
English examples in (54b), (55b) and (56b).
(54) Generalizing
(54a) giorno e notte (Italian)
day and night
‘day and night, always’
(54b) high and low (meaning ‘everywhere’) (English)
(55) Collective
(55a) coltello e forchetta (Italian)
knife and fork
‘cutlery’
(55b) bra and panties (meaning ‘lingerie’) (English)
(56) Approximate
(56a) poco o niente (Italian)
little or nothing
‘very little, almost nothing’
(56b) two or three (meaning ‘some’) (English)
5 Towards a unified treatment of complex lexical items
In this paper I dealt with complex lexical items in Italian, namely proper com-
pounds and MWEs. Specifically, I focused on so-called phrasal lexemes, which
are closer to compounds in distribution and function than other (e. g.
sentence-level) MWEs. Whereas compounds mostly feed nouns and adjectives in Italian,
phrasal lexemes – besides creating expressions belonging to nouns/adjectives
– may also feed other major word classes, most notably verbs and adverbs, thus
apparently compensating for the limits of compounding in these specific areas. What
should be stressed, once again, is that phrasal lexemes are not just the product of
diachronic lexicalization: some instances certainly are, but some others are the
result of synchronic lexical creation that relies on stored naming patterns (i. e.,
constructions).
The demarcation between compounds and phrasal lexemes turned out to be
a non-trivial issue. I proposed four tentative criteria for the Italian language, i. e.
presence/absence of: internal agreement, explicit relational markers, minor lexi-
cal categories, bounded elements. However, this set of criteria has no pretense of
crosslinguistic validity: in fact, each language will display a specific set of
properties that help distinguish between these two kinds of constructions (when
this is actually possible). Ultimately, these criteria trace back to the traditional
distinction between morphology and syntax, which is however not clear-cut
within a constructionist view of the grammar.
I also contributed some data and observations on the competition that –
quite expectedly – emerges between compounds and phrasal lexemes, given
their shared function. I showed that this competition may lead to bidirectional
blocking: compounds may block the establishment of a phrasal lexeme in the
lexicon, and an established phrasal lexeme may block the creation of a new com-
pound. From the data examined so far, it seems that these two competing pat-
terns tend to differentiate, by specializing for different functions (cf. especially
Sections 4.2 and 4.3). This goes in the direction advocated by Aronoff (2016,
to appear) in recent work, where competition leads either to extinction of one of
the competitors, or to differentiation in terms of form, meaning or distribution, as
a result of a “struggle for existence” between linguistic expressions.
In conclusion, the discussion of demarcation and competition issues carried
out in this chapter suggests a view of the mental lexicon where both compounds
and phrasal lexemes are stored, on a par with each other: they share the same
function and distribution, they may compensate for each other at the most
abstract level, and they definitely compete with each other for the expression of
lexico-conceptual meanings.
References
Ackerman, Farrell/Webelhuth, Gert (1997): The composition of (dis)continuous predicates:
Lexical or syntactic? In: Acta Linguistica Hungarica 44, 3/4. 317–340.
Arcodia, Giorgio Francesco/Grandi, Nicola/Wälchli, Bernhard (2010): Coordination in
compounding. In: Scalise, Sergio/Vogel, Irene (eds.). 177–197.
Aronoff, Mark (2016): Competition and the lexicon. In: Elia, Annibale/Iacobini, Claudio/
Voghera, Miriam (eds.): Livelli di analisi e fenomeni di interfaccia. Roma: Bulzoni. 39–52.
Aronoff, Mark (to appear): Competitors and alternants in linguistic morphology. In: Rainer,
Franz/Gardani, Francesco/Dressler, Wolfgang U./Luschützky, Hans Christian (eds.).
Baldwin, Timothy/Kim, Su Nam (2010): Multiword expressions. In: Indurkhya, Nitin/Damerau,
Fred J. (eds.): Handbook of Natural Language Processing. Boca Raton: CRC Press.
267–292.
Benveniste, Émile (1966): Différentes formes de la composition nominale en français. In:
Bulletin de la Société de Linguistique de Paris 61, 1. 82–95.
Bernal, Elisenda (2012): Catalan compounds. In: Probus 24, 1. 5–27.
Bisetto, Antonietta (2004): Composizione con elementi italiani. In: Grossmann, Maria/Rainer,
Franz (eds.). 33–51.
Bisetto, Antonietta/Scalise, Sergio (1999): Compounding. Morphology and/or syntax? In:
Mereu, Lunella (ed.): Boundaries of Morphology and Syntax. Amsterdam/Philadelphia:
Benjamins. 31–48.
Booij, Geert (2002a): Constructional idioms, morphology and the Dutch lexicon. In: Journal of
Germanic Linguistics 14, 4. 301–329.
Booij, Geert (2002b): Separable complex verbs in Dutch: A case of periphrastic word formation.
In: Dehé, Nicole et al. (eds.): Verb-Particle Explorations. Berlin/New York: De Gruyter.
21–42.
Booij, Geert (2009a): Phrasal names: A constructionist analysis. In: Word Structure 2, 2.
219–240.
Booij, Geert (2009b): Lexical integrity as a formal universal: A constructionist view. In: Scalise,
Sergio/Magni, Elisabetta/Bisetto, Antonietta (eds.). 83–100.
Booij, Geert (2010): Construction Morphology. Oxford: Oxford University Press.
Butt, Miriam (1995): The structure of complex predicates in Urdu. Stanford: CSLI.
Cacciari, Cristina/Tabossi, Patrizia (eds.) (1993): Idioms: Processing, structure, and
interpretation. Hillsdale: Psychology Press.
Cowie, Anthony (ed.) (1998): Phraseology: Theory, analysis, and applications. Oxford: Oxford
University Press.
Cruse, Alan (2004): Meaning in language. Oxford: Oxford University Press.
D’Achille, Paolo/Grossmann, Maria (2010): I composti aggettivo + aggettivo in italiano. In:
Iliescu, Maria/Siller-Runggaldier, Heidi M./Danler, Paul (eds.): Actes du XXVe Congrès
International de Linguistique et de Philologie Romanes (3–8 sept. 2007, Innsbruck), VII.
Berlin/New York: De Gruyter. 405–414.
D’Achille, Paolo/Grossmann, Maria (2013): I composti <colorati> in italiano tra passato e
presente. In: Casanova Herrero, Emili/Calvo Rigual, Cesáreo (eds.): Actas del XXVI
Congreso Internacional de Lingüística i Filología Románicas (Valencia, 6–11 de septiembre
2010). Berlin/New York: De Gruyter. 523–537.
Dehé, Nicole et al. (eds.) (2002): Verb-particle explorations. Berlin: De Gruyter.
Delfitto, Denis/Melloni, Chiara (2009): Compounds don’t come easy. In: Lingue e Linguaggio
VIII(1). 75–104.
Everaert, Martin et al. (eds.) (1995): Idioms: Structural and psychological perspectives.
Hillsdale.
Fillmore, Charles/Kay, Paul/O’Connor, Mary Catherine (1988): Regularity and idiomaticity in
grammatical constructions: The case of let alone. In: Language 64, 3. 501–538.
Giegerich, Heinz J. (2005): Associative adjectives in English and the lexicon-syntax interface. In:
Journal of Linguistics 41, 3. 571–591.
Giegerich, Heinz J. (2009): The English compound stress myth. In: Word Structure 2, 1. 1–17.
Grandi, Nicola (2011): La coordinazione tra morfologia e sintassi. Tendenze tipologiche ed
areali. In: Massariello Merzagora, Giovanna/Dal Maso, Serena (eds.): I luoghi della
traduzione/Le interfacce. Roma: Bulzoni. 881–895.
Grossmann, Maria/Rainer, Franz (eds.) (2004): La formazione delle parole in italiano. Tübingen:
Niemeyer.
Guevara, Emiliano/Scalise, Sergio (2009): Searching for universals in compounding. In:
Scalise, Sergio/Magni, Elisabetta/Bisetto, Antonietta (eds.). 101–128.
Haspelmath, Martin (2011): The indeterminacy of word segmentation and the nature of
morphology and syntax. In: Folia Linguistica 45, 1. 31–80.
Hoffmann, Thomas/Trousdale, Graeme (eds.) (2013): The Oxford Handbook of Construction
Grammar. Oxford: Oxford University Press.
Hüning, Matthias/Schlücker, Barbara (2015): Multi-word expressions. In: Müller, Peter O. et al.
(eds.). 450–467.
Iacobini, Claudio (2015): Particle-verbs in Romance. In: Müller, Peter O. et al. (eds.). 627–659.
Iacobini, Claudio/Masini, Francesca (2007): The emergence of verb-particle constructions in
Italian. In: Morphology 16, 2. 155–188.
Jackendoff, Ray (1990): Semantic structures. Cambridge, MA: MIT Press.
Jackendoff, Ray (1995): The boundaries of the lexicon. In: Everaert, Martin et al. (eds.). 133–165.
Jackendoff, Ray (1997): The architecture of the language faculty. Cambridge, MA: MIT Press.
Jackendoff, Ray (2010): Meaning and the lexicon: the Parallel Architecture 1975–2010. Oxford:
Oxford University Press.
Jezek, Elisabetta (2004): Types et degrés de verbes supports en italien. In: Linguisticae
investigationes 27, 2. 185–201.
Kay, Paul (2013): The limits of (Construction) Grammar. In: Hoffmann, Thomas/Trousdale,
Graeme (eds.). 32–48.
Lambrecht, Knud (1984): Formulaicity, frame semantics, and pragmatics in German binomial
expressions. In: Language 60, 4. 753–796.
Malkiel, Yakov (1959): Studies in irreversible binomials. In: Lingua 8. 113–160.
Masini, Francesca (2005): Multi-word expressions between syntax and the lexicon: the case of
Italian verb-particle constructions. In: SKY Journal of Linguistics 18. 145–173.
Masini, Francesca (2006): Binomial constructions: Inheritance, specification and subregu-
larities. In: Lingue e Linguaggio 5, 2. 207–232.
Masini, Francesca (2009): Phrasal lexemes, compounds and phrases: A constructionist
perspective. In: Word Structure 2, 2. 254–271.
Masini, Francesca (2012): Parole sintagmatiche in italiano. Roma: Caissa Italia.
Masini, Francesca (2016): Morphological words and multiword expressions: Competition or
cooperation? Paper given at the 17th International Morphology Meeting (IMM17), Vienna,
18–21 February 2016.
Masini, Francesca (to appear): Competition between morphological words and multiword
expressions. In: Rainer, Franz/Gardani, Francesco/Dressler, Wolfgang U./Luschützky,
Hans Christian (eds.).
Masini, Francesca/Benigni, Valentina (2012): Phrasal lexemes and shortening strategies in
Russian: The case for constructions. In: Morphology 22, 3. 417–451.
Masini, Francesca/Scalise, Sergio (2012): Italian compounds. In: Probus 24, 1. 61–91.
Masini, Francesca/Thornton, Anna M. (2008): Italian VeV lexical constructions. In: On-line
Proceedings of the 6th Mediterranean Morphology Meeting (MMM6). University of Patras.
146–186.
Moon, Rosamund (1998): Fixed expressions and idioms in English: A corpus-based approach.
New York: Clarendon Press.
Müller, Peter O. et al. (eds.) (2015): Word-formation. An international handbook of the
languages of Europe. Vol. 1. (= Handbooks of Linguistics and Communication Science
(HSK) 40.1). Berlin/Boston: De Gruyter.
Radimský, Jan (2015): Noun+Noun compounds in Italian: A corpus-based study. České
Budějovice: University of South Bohemia in České Budějovice.
Rainer, Franz (1988): Towards a theory of blocking: The case of Italian and German quality
nouns. In: Booij, Geert/van Marle, Jaap (eds.): Yearbook of Morphology 1988. Dordrecht:
Springer. 155–185.
Rainer, Franz (2016): Blocking. In: Aronoff, Mark (ed.): The Oxford Research Encyclopedia of
Linguistics. Internet: DOI: 10.1093/acrefore/9780199384655.013.33.
Rainer, Franz/Varela, Soledad (1992): Compounding in Spanish. In: Rivista di Linguistica 4, 1.
117–142.
Rainer, Franz/Gardani, Francesco/Dressler, Wolfgang U./Luschützky, Hans Christian (eds.) (to
appear): Competition in inflection and word formation. Cham: Springer.
Ricca, Davide (2010): Corpus data and theoretical implications with special reference to Italian
V-N compounds. In: Scalise, Sergio/Vogel, Irene (eds.). 237–254.
Rio-Torto, Graça/Ribeiro, Sílvia (2009): Compounds in Portuguese. In: Lingue e Linguaggio
VIII(2). 271–291.
Rio-Torto, Graça/Ribeiro, Sílvia (2012): Portuguese compounds. In: Probus 24, 1. 119–145.
Scalise, Sergio (1992): Compounding in Italian. In: Italian Journal of Linguistics/Rivista di
Linguistica 4, 1. 175–199.
Scalise, Sergio/Bisetto, Antonietta (2009): The classification of compounds. In: Lieber,
Rochelle/Štekauer, Pavol (eds.): The Oxford handbook of compounding. Oxford: Oxford
University Press. 34–53.
Scalise, Sergio/Magni, Elisabetta/Bisetto, Antonietta (eds.) (2009): Universals of language
today. Berlin: Springer.
Scalise, Sergio/Vogel, Irene (eds.) (2010): Cross-disciplinary issues in compounding.
Amsterdam: Benjamins.
Schlücker, Barbara/Hüning, Matthias (2009): Compounds and phrases. A functional
comparison between German A+N compounds and corresponding phrases. In: Italian
Journal of Linguistics 21, 1. 209–234.
Voghera, Miriam (2004): Polirematiche. In: Grossmann, Maria/Rainer, Franz (eds.). 56–69.
Wälchli, Bernhard (2005): Co-compounds and natural coordination. Oxford: Oxford University
Press.
Jesús Fernández-Domínguez
Compounds and multi-word expressions
in Spanish
1 Introduction
Compounds have been customarily defined as lexical units that consist of two
lexemes. They are morphological entities.1 Phrases, for their part, may be made
up of one unit, but often comprise two elements when they carry internal
modification. They are syntactic entities. Provisionally satisfactory though these
descriptions are, linguists are often in trouble when having to decide on which
basis a two-word structure is morphological or syntactic, as a formation may be
argued to be a compound on some grounds but at the same time display phrasal
features. Certainly, in any category, it is quite common for some members not to
meet all the prototypical features of the group; this is in fact statistically likely.
Compounding is no exception as can be seen regarding the many exceptions to
initial stress placement (e. g. ‘snowball vs. rubber ‘ball), or to spelling (market
place vs. market-place vs. marketplace). A dilemma arises, however, if peripheral
membership is not an exception but the norm (Bauer 1998: 65).
The Spanish word-formation system offers an array of means for the creation
of neologisms, typically classified within the categories of derivation, compound-
ing and minor processes. Derivation is by far the most fruitful resource and
includes prefixation (pintar lit. paint ‘to paint’ > repintar lit. re.paint ‘to repaint’),
suffixation (admirar lit. admire ‘to admire’ > admirable lit. admir.able ‘admira-
ble’) and infixation (cantar lit. sing ‘to sing’ > cant.urre.ar lit. sing.SUF ‘to hum’).
A number of other processes may be distinguished, for example, parasynthesis
(largo lit. long ‘long’ > alargar lit. a.long.ar ‘to lengthen’), back-formation
(comprar lit. purchase ‘to purchase’ > compra lit. purchase ‘purchase’), blending
(documental ‘documentary’ + drama ‘drama’ = docudrama ‘docudrama’), acronymy
(Pequeña Y Mediana Empresa ‘small and medium-sized business’ > PYME), and
clipping (colegio ‘school’ > cole). The affinities between compounds and phrases
have also been noted for Spanish, with the fundamental difference that com-
pounds have a morphological origin, serve a naming function and are often spe-
1 I would like to thank Francesca Masini, Barry Pennock, Vincent Renner, Barbara Schlücker
and an anonymous reviewer for sensible advice on the form and content of this article.
(1a) sunlight
(1b) girlfriend
In the face of these similarities, one unavoidable step when studying compounds
and MWEs is the description of the morphology-syntax interface (Gaeta/Ricca
2009; Masini 2009; Buenafuentes 2010, 2014; Pafel 2017). The most frequently
adduced factors concern several language areas:
(5) *Usó el quita₁esmalte₂, pero no lo₂ pudo borrar
Use-3sg-past the remove₁polish₂, but neg d.obj-m-sg₂ could erase
‘She used the polish remover, but couldn’t erase the polish’
Most of the above criteria are either syntactic (atomicity, fixity, locus of inflection)
or semantic (naming unity, idiomaticity), although productivity is a characteristic
of morphology. Besides these, stress has proved crucial for the differentiation
between phrases and compounds in other languages (e. g. German and Dutch),
but it is not decisive in Spanish, as compounds may display single but also dou-
ble stress (Rao 2015: 90 f.). Bustos Gisbert (1986) reviews the phonetics of Spanish
compounds and concludes that stress assignment is caused by the interaction of
factors like the number of syllables, the semantic relationship between the con-
stituents or the compound’s headedness. Rao (2015) provides interesting experi-
mental findings on the influence of orthography over prosodic interpretation, or
the apparently minimal effect of the semantic relation between constituents on
stress assignment.
The above generalizations represent general tendencies regarding proto-
typical compounds and prototypical syntactic entities but, crucially, most of
these features may be displayed by both compounds and phrases and cannot
individually provide conclusive evidence with respect to the compound-phrase
divide.
3 Spanish compounds and MWEs: between morphology, syntax and phraseology
Formations of a nominal, verbal and adjectival type are taken into consideration
in this section, particularly those made up of nouns, adjectives and verbs. There
is a consensus in the literature that Spanish compounding is largely endocentric
and that, while adjective and verb compounds are right-headed, noun com-
pounds are left-headed, with the exception of specific right-headed types. The
following subsections explore, in turn, nominal (Section 3.1), adjectival (Sec-
tion 3.2), and verbal (Section 3.3) compounds and MWEs.
3.1 Nouns
Spanish noun compounds most often consist of two members whose grammati-
cal categories may be the same, e. g. noun+noun (N+N), or different, e. g. noun+
adjective (N+A) or verb+noun (V+N). A preposition may also be involved as a link
between the two main constituents in the productive MWE type noun+preposi-
tion+noun (N+p+N), cf. (9a). A three-lexeme structure is less common but possi-
ble, as in limpiaparabrisas or portacuentakilómetros, which does not alter the
Within the range of existing nominal structures, several types stand out in Span-
ish: orthographic constructions (Section 3.1.1), where several different word-
classes are found as input, together with the syntagmatic types N+N (Sec-
tion 3.1.2), N+p+N (Section 3.1.3) and N+A (Section 3.1.4).
Spelling may be indicative of a unit’s lexical status. That is the case of construc-
tions which unequivocally qualify as compounds and as such are spelt as one
word. The following units, for example, are made up of a preposition and a
noun:
In contrast to such units, there are compounds whose form fluctuates between a
single-word and a two-word spelling. These are phrasal structures with an
increasingly tight compound status, which form a continuum from phrases to
completely settled compounds, with intermediate hybrid formations in between. It is therefore
possible to come across guardia civil and guardiacivil, or retrato-robot and retrato
robot (cf. Van Goethem 2009 for a scalar proposal on French A+N units):
A more extreme although less frequent situation is that in which both compound
constituents are pluralized, whether or not the compound as a whole is plural.
Constructions like (18) have been analyzed as exocentric compounds with
morphological mismatches in their number and gender (RAE 2010: 193, 199; Scalise/
Fábregas 2010: 122; Buenafuentes 2014: 4). These formations are dealt with in
detail in the following sections.
N+N formations are one of the most frequently studied phenomena within the
morphology-syntax divide, with a variety of labels revealing their ambiguous
condition (binominals, coordinate compounds, dvandvas) in a good number of
languages (Bauer 1998; Bisetto/Scalise 2005; Booij 2009). The constituents of
N+N constructions concatenate with no tying formal mark, although a hyphen
occasionally signals lexical status.2 Because Spanish is, with the exception of the
For Booij (2009: 223), the naming function shared by compounds and phrases
complicates their demarcation especially in languages with left-headed com-
pounding, since certain formations can be seen as compounds but also as phrases
followed by an apposition. The fact that plural inflection tends to appear on the
first constituent only (meriendas cena, efectos invernadero) substantiates access
from syntax to these formations, and Booij’s (2009) interpretation is hence a
phrasal one. This argument is not refuted by the use of these units in more collo-
quial registers, where inflection is attested for both members (perros policías,
muebles-bares). The degree of fixity is also significant here. Val Álvaro (1999:
4782) puts forward that an inflexible layout is typical of coordinate compounds
because, given the paratactic relationship of their constituents, it is an optimal
means to make the first member more salient. This happens, for example, when
the first member precedes the second one chronologically (merienda cena) or
when it is cognitively more relevant (perro policía). Similarly, sets of N+N forma-
tions may display a shared second (21) or first constituent (22):3
tive status. The same applies to minor types like V+V reduplicatives (pilla.pilla lit. catch.catch
‘tag’, a playground game) or V+V formations (duerme.vela lit. sleep.stay up ‘slumber’), which are
not illustrative of current trends (Val Álvaro 1999: 4804–4807).
3 An analogous series is comprised by visita relámpago lit. visit lightning ‘lightning visit’, guerra
relámpago lit. war lightning ‘blitzkrieg’, viaje relámpago lit. trip lightning ‘lightning trip’, etc.
These have been rejected as nominal MWEs because appositive nouns like clave or relámpago are
not restricted in their co-occurrence, and they can be accompanied by almost any noun. That
One variant is represented by the examples in (23), which have been analyzed
either as morphological or as syntactic structures. These constructions have also
been considered collocations on the basis that nouns concatenate with no
preposition whatsoever (Ruiz Gurillo 2002), but their referential uniqueness makes
such a reading inadvisable. Then again, an interpretation in terms of
compounding is hindered essentially by plural inflection, which materializes internally
(fotos tamaño carnet, cremas tipo pomada). This, together with the possibility of
recovering an elided preposition de ‘of’ after the left-most member (24), seems
sufficient evidence for a syntactic nature, in line with the units in (20).
such constructions can be inflected for number in standard registers (viajes relámpagos, guerras
relámpagos) points to semantic specialization and suggests that they are more akin to standard
modifying phrases (cf. Val Álvaro 1999: 4785; Montoro del Arco 2008: 133 f.).
N+p+Ns are a fertile kind of construction that links a noun to a simple (25a) or dever-
bal noun (25b) by way of a preposition. N+p+Ns are head-initial formations whose
right-hand constituent is subordinated and displays adjective-like behavior:4
The first hurdle in the description of N+p+N units is that they are derived from a
syntactic pattern whereby a nominal head is postmodified by a prepositional
phrase, but at the same time they perform a naming function that is typical of
compounding. Several criteria have been put forward to test the compoundhood
of N+p+N constructions. One is whether an equivalent lexeme exists in a different
language (26), or whether a synonymous structure has been attested in Spanish
through a different word-formation process, as in (27), although these do not
seem entirely reliable criteria. Both features hint at the lexical status of N+p+Ns
but do not evidence a morphological origin which, together with the syntactic
provenance of these formations, has led to their rejection as compounds (cf.
Rainer/Varela 1992). Telaraña, for example, developed out of lexicalization from
tela de araña, a process that has nothing to do with morphology and can be more
accurately described as univerbation than as compounding (Gaeta/Ricca 2009:
44 f.).
4 The label syntagmatic compound has been widely employed but it is also regarded as inaccu-
rate on the grounds that the phraseologization of these constructions converts them into lexical
units, not compounds, as they do not originate in morphology.
It was discussed above that whether or not constituents can be modified is a good
indication of the morphological status of a construction. The following examples
show how, for two N+p+N formations (cf. (28a) and (29a)), postmodification is
permitted (cf. (28b) and (29b)), while internal separability is ungrammatical (cf.
(28c) and (29c)):
Inflection in N+p+N units is customarily placed on the head, although the right-
hand member may display permanent plural if it refers to a plural notion. Even in
the latter case, the plural marker of the whole compound appears on the head
(agencias de viajes, trenes de mercancías, cuentos de hadas):
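The placement of plural inflection described above can be illustrated with a minimal sketch in Python, which is not part of the original analysis. It assumes a deliberately simplified pluralization rule (-s after a vowel, -es after a consonant) and treats the first word of the unit as its head; the function names pluralize and pluralize_npn are hypothetical.

```python
VOWELS = "aeiouáéíóú"


def pluralize(noun: str) -> str:
    # Simplified assumption: -s after a final vowel, -es after a final consonant.
    # Spelling changes (e.g. z > c) and invariable nouns are not handled.
    return noun + ("s" if noun[-1].lower() in VOWELS else "es")


def pluralize_npn(unit: str) -> str:
    # The head is the left-hand noun; everything after it stays unchanged,
    # including a right-hand member that is already a permanent plural.
    head, rest = unit.split(" ", 1)
    return f"{pluralize(head)} {rest}"


for unit in ("agencia de viajes", "tren de mercancías", "cuento de hadas"):
    print(unit, "->", pluralize_npn(unit))
```

Only the head changes; viajes, mercancías and hadas are left untouched, mirroring the three examples cited in the text.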
Nominal phrases constitute a different subtype, with full fixity and idiomaticity.
These are infrequent constructions with a significant degree of lexicalization and
metaphorical meanings, which bears witness to their phraseological status (exocentricity
is impossible in syntactic formations). Some such metaphorical
units, e. g. (35b), may perform a limited range of syntactic roles at the clause level,
usually direct object or subject complement, and never subject. This goes against
an analysis of these formations as compounds because their use seems to be lim-
ited to comparative constructions, as in (36):
In the vast majority of such constructions no article is found between the prepo-
sition and the right-hand member, although there are exceptions, for example
when an article denotes a well-known entity:
Spanish features abundant A+N and N+A nouns. Because of native left-headed-
ness, the former are less numerous even if, due to their spelling, they stand out
more clearly within the field of compounding than the latter. Moyna attributes a
syntactic origin to these formations, which explains why “[…] they are the hardest
to distinguish from non-compounded phrases” (2011: 181; cf. Gaeta/Ricca 2009:
51 ff. for Italian). Most A+N units are endocentric and display a relationship of
modification between their constituents (38a), although heads can become
opaque over time and acquire metaphorical readings (38b). It is possible to find
exocentric formations too, as in example (38c), which is not ‘a kind of table’ but
‘a kind of meeting’.
Spanish N+A compounds (39) are difficult to distinguish from N+A phrases (40)
if one only looks at their meaning, since in both kinds the head can be a concrete
noun (39a), a noun denoting physical state (39b), or an abstract noun (39c). As
in other Romance languages, orthography is not reliable by itself, especially in
formations with a separate spelling, since it does not necessarily reflect stress
assignment (cf. Van Goethem 2009).
In contrast to N+p+Ns, N+A units may undergo derivation, in which case the
whole construction serves as lexical base, as in agua bendita ‘holy water’ and
cuenta corriente ‘current account’, from which -era and -ista generate an instru-
ment (49a) and an agent (49b). This test proves the semantic unity of such con-
structions and ratifies their lexical nature, although it tells us nothing about their
morphological status. In addition, the test is of limited application from a mor-
phological viewpoint because operating derivation on Spanish N+A constructions
most frequently leads to ungrammatical formations (Bustos Gisbert 1986: 139).
N+A units show heterogeneous behaviors, and disparities exist regarding their
endocentricity/exocentricity, ability to undergo derivation or locus of inflection.
In particular, some authors have argued for a level intermediate between N+A
compounds and phrases. This would involve sets of MWEs that are compositional
but at the same time share one of their constituents, e. g. negro ‘black’ in (50).
Here, negro contributes a regular figurative sense throughout different examples,
while the other member of the construction adds a literal meaning. These cer-
tainly behave as mixed nominal phrases insofar as they have a fixed component
and an idiomatic one (Ginebra 2002: 148–151).
3.2 Adjectives
Complex adjectival constructions are less challenging than nominal ones thanks
to their spelling, which may be closed or hyphenated but always reveals their
lexical nature. The formal makeup of these constructions is adjective+adjective
(A+A) or N+A. The A+A examples in (51) are lexicalized and represent a synchron-
ically unproductive type, while those in (52) are profuse and stand unambigu-
ously within morphology. The former type is limited to adjectives expressing
colors and judgement, while the latter displays a much wider semantic scope and
is frequently recursive.
Two main kinds of N+A constructions exist: one where the noun refers to salient
body parts (53), an exocentric and usually non-compositional type, and a small
group where the noun is the name of a language and the head is a participle
meaning ‘to speak’, cf. (54). Both are analyzable as compounds as they receive
external inflection (e. g. pelirrojos lit. hair.red.PL, vascoparlantes lit. Basque.
speaking.PL) and forbid internal modification (*castellano.muy.hablante lit.
Spanish.very.speaking).
Some adjective compounds denote a color which is derived from the colors
expressed by their constituents (55), while others denote nationalities (56). Plural
marking varies, since most formations are peripherally inflected to the right (57a),
but some remain uninflected (57b); gender is always expressed in the right-most
member, both in the singular and the plural forms (58). The phraseological nature
of these constructions can be observed in the restricted selection of their compo-
(59a) *pat.i.muy.corto
lit. leg.i.very.short
(59b) muy ancho de espaldas ‘having a wide back’ (person)
lit. very wide of back
3.3 Verbs
These formations aside, the verbal procedure that most closely resembles com-
pounding is that of light verb constructions (62), where a semantically void verb
is accompanied by a noun to create a conceptual unit. These constructions are
compositional and have a corresponding synthetic lexical verb which often
expresses the same meaning. Even if they cannot be called morphological objects,
these are not regular verb phrases and resemble compounds because of their
highly regular and frequent occurrence (cf. Val Álvaro 1999: 4830–4834).
Despite their verbal nature, formations like (63) stand apart from regular verbs
and from light verb constructions due to the fact that it is impossible to replace
their constituents by synonyms (64), to internally modify the noun (65), or
to apply sentence transformations to their structure (66) (cf. Val Álvaro 1999:
4831).
Verbal collocations must be taken into account as well (cf. (68)). Here, the head is
a verb that is complemented by a noun (68a), a preposition plus a noun (68b) or
an adverb (68c). These exhibit different degrees of idiomaticity and fixity, and
must be regarded as syntactic.
As happens in Italian (Iacobini 2009), it may be the case that Spanish verbal
MWEs are proportionally more widely employed than MWEs of other word classes
because of the low productivity of verbal compounding, although this has yet to
be substantiated. Even though verbal MWEs do occur, it seems safe to assert that
the native procedures for phrasal or multi-word verbs are not powerful if com-
pared to Germanic languages or even Romance languages like Catalan or Italian
(Guevara 2012; Bisetto 2015).
Multi-word nature          +   +   +
Naming ability             +   +   –
Consolidated formation     ±   +   ±
Frequent co-occurrence     +   +   +
Paradigm membership        +   –   –
Plural inflection          +   +   +
Insertion of modifiers     –   –   ±
Isomorphism                +   –   +
Meaning compositionality   +   –   +
Idiomaticity               –   +   ±
Lexical selection          –   –   +
5 These features have been discussed at different points in the present article and appear here
with specialized Spanish terminology. Paradigm membership, for example, refers to the fact
that, if a construction is coined via a synchronic syntactic procedure, it will be placed together
with the previous constructions created by that rule. The body of constructions built through the
same structure would therefore constitute its consolidation as a paradigm. Similarly, isomorphism
is a variable of a unit’s idiomaticity, since it indicates to what extent a unit can be broken
down in meaningful subcomponents.
Table 1 makes manifest an uneven distribution pattern of features, with the result
that some are possible in all three constructions (e. g. the ability to be made up of
multiple words), others are largely optional (e. g. making up a consolidated for-
mation), and others are impossible (e. g. insertion of modifiers), although excep-
tions have been noted for most of the categories. Taken together, this causes a
cross-categorial overlap which leads to descriptive vagueness and fuzzy borders.
Depending on the degree of concurrence of these features, we will be faced with
a more or less prototypical morphological, phraseological or syntactic unit. The
combination of these characteristics also demarcates two features often associ-
ated with phraseology: fixity and idiomaticity. In principle, the more fixed and
idiomatic a unit is, the more it can be considered as unambiguously phraseologi-
cal, even if less prototypical constructions may be phraseological too (Gries 2008:
5 f.). There are hence archetypal compounds and archetypal phraseologisms,
depending on their overall reaction to the above criteria. In view of their border
properties, Gaeta/Ricca (2009) accommodate compounds and phrases into a
quadripartite system that distinguishes the feature of being listed in the lexicon
from that of being the output of morphology. For these authors, lexicalization and
compoundhood are independent notions and each may be present or absent in a
particular construction. This materializes in a four-level typology (69) which
embraces prototypical compounds (69a), prototypical phrases (69d), and two
intermediate positions (69b) and (69c):
The rationale is that, just like there are “[…] lexical units that are not compounds,
but syntactic units, we should also find compounds (morphological units) which
are not lexical units” (Gaeta/Ricca 2009: 40). An example of (69a) is compraventa
‘buying and selling’, and one of (69d) is gorra de metal ‘metal cap’. Type (69c)
involves syntactic elements that have a conceptual referent, e. g. dolor de cabeza
‘headache’, while (69b) is a priori an unexpected kind: compounds that are not
lexically listed. This is possible for extremely productive morphological pro-
cesses, whose output is large, and not all of which is lexicalized. In Spanish, it is
the case of V+N compounding, as in espantacucarachas ‘cockroach scarer’ (cf.
Section 3.1.1).
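The four-way typology can be summarized as the cross-classification of two independent binary features. The following Python sketch is an illustration only and is not taken from Gaeta/Ricca (2009): the class name Unit, the function classify and the feature values assigned to the four Spanish examples are assumptions based on the description given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    form: str
    morphological: bool  # output of morphology (compoundhood)
    lexical: bool        # listed in the lexicon (lexicalization)


def classify(unit: Unit) -> str:
    # The two features are independent, which yields four combinations.
    if unit.morphological and unit.lexical:
        return "69a"  # prototypical compound
    if unit.morphological:
        return "69b"  # compound that is not lexically listed
    if unit.lexical:
        return "69c"  # syntactic unit with a conceptual referent
    return "69d"      # prototypical phrase


# Feature values below are this sketch's reading of the examples in the text.
examples = [
    Unit("compraventa", morphological=True, lexical=True),
    Unit("espantacucarachas", morphological=True, lexical=False),
    Unit("dolor de cabeza", morphological=False, lexical=True),
    Unit("gorra de metal", morphological=False, lexical=False),
]

for unit in examples:
    print(unit.form, "->", classify(unit))
```

Keeping the two features as separate booleans, rather than as one four-valued label, reflects the claim that lexicalization and compoundhood may each be present or absent independently.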
This leads us to the competitive vs. cooperative behavior of compounds and
MWEs. The fact that many phrasal constructions (e. g. guerra fría ‘cold war’, café
con leche ‘white coffee’) have a denominative role and are accompanied by a defi-
nition in lexicographic studies is proof of their naming ability, which in turn sets
them up as potential competitors for word-formation (Booij 2009: 220). This is
evident for example in doublets formed by one morphological and one phraseo-
logical construction, as in (42): cita médica ‘medical appointment’ vs. cita del
médico ‘appointment of the doctor’. Occasionally, one of the units becomes estab-
lished and blocks the other, e. g. *guerra de(l) frío lit. war of the cold (vs. guerra
fría ‘cold war’), although coexistence is not rare. The exact nature of this interaction
depends on language-specific factors (Hüning/Schlücker 2015 on German;
Masini this volume on Italian), which have not been extensively discussed in the
Spanish literature.
The consequence deriving from this behavior is what one would expect: gen-
uine compounding is not a frequent lexical resource in Spanish, and this causes
the interference of MWEs as a naming device. In the case of nouns, Section 3.1
discusses orthographic constructions which unequivocally qualify as compounds
and the three configurations N+N, N+p+N and N+A. In the case of adjectives (Sec-
tion 3.2), broad agreement exists on their morphological origin, which is why
adjectival MWEs (70) are not generally required to fulfil a naming function.
to complex idioms. As has been shown, the data available for Spanish is not
favorable for a two-category distinction, and so the possibility of a single all-in-
clusive class is particularly welcome in this case. Turning to constructions of
course implies allowing MWEs into the mental lexicon, meaning that MWEs and
compounds co-exist, overlap somewhat in their forms and functions and are
hence competitors for the naming act. This position is consistent with the depic-
tion of the Spanish system presented above, and offers a middle-ground solution
to the apparently irreconcilable nature of these two sets of units.
5 Conclusions
This article has offered a concise overview of MWEs and compounds in Contem-
porary Spanish. It has dealt with constructions that can be viewed as compounds,
phrases or collocations depending on an analysis based on a combination of syn-
tactic, phraseological and morphological features. A non-discrete demarcation of
such units is the clearest outcome of the tests available, with several features
shared by compounds and idiomatic expressions. These tests make it impossible
to empirically separate morphological from phraseological formations due to idi-
osyncrasies and exceptions caused by semantic and functional similarities. The
above arguments and examples indeed make a case for a gradient structure of
MWEs, of which compounds and phrases are extreme positions.
Some Spanish compounds and MWEs stand in cooperative rivalry. This asso-
ciation is apparently inversely proportional to their respective lexical output,
such that the more productive compounding is for a given category, the less pro-
ductive MWE formation will be. This ensures that a linguistic resource for concept
naming will always be available. In this sense, observation of the data makes it
safe to assert that Spanish compounding is productive mainly for nouns and
adjectives, and that MWE formation is exploited for other categories. It must be
borne in mind, however, that the environment of Spanish morpho-syntax is dif-
ferent from that of English, from which most current linguistic frameworks and
theories of word-formation have emerged. The contrast between the Spanish and
English systems is evident, for example, in the allegedly poor output of Spanish
compounding or the very high productivity of the exocentric V+N pattern, measures
which will necessarily seem unsatisfactory if English is taken as the benchmark.
A strict application of Germanic models to Romance phenomena will then
most likely project an imperfect picture. The present situation
calls for an approach which considers MWEs in other languages but does not
impose external models on native patterns (e. g. Booij 2009; Gaeta/Ricca 2009;
Masini 2009).
In elucidating the status of MWEs, the need for agreement among linguistic
disciplines is urgent, a task that has been neglected so far. For decades, research
into morphology has made little headway in the analysis of phrase-like compounds,
and phraseologists have struggled without success to explain various
levels of multi-word formations. Joint efforts may succeed in precisely locating
MWEs in the language system, not through separate investigations, but by looking
at the common goals of morphology and phraseology: “a proper theory of the
relation between morphological and syntactic naming constructions is called
for” (Booij 2009: 220). Let us remember that phraseology is a young field whose
conceptual foundations seem to be under development. Gries puts it as follows
(2008: 22; also Colson 2016):
Many phraseologists […] have focused on rather descriptive work on phraseology (or, more
narrowly, idioms) and have often not been concerned with integrating their accounts of
phraseologisms in particular and other patterns more generally into a larger theory of the
linguistic system.
References
Bauer, Laurie (1998): When is a sequence of two nouns a compound in English? In: English
Language and Linguistics 2, 1. 65–86.
Bauer, Laurie (2017): Compounds and compounding. (= Cambridge Studies in Linguistics 155).
Cambridge, UK: Cambridge University Press.
Bisetto, Antonietta (2015): Do Romance languages have phrasal compounds? A look at Italian.
In: Language Typology and Universals (STUF) 68, 3. 395–419.
Bisetto, Antonietta/Scalise, Sergio (2005): The classification of compounds. In: Lingue e
Linguaggio 4, 2. 319–332.
Booij, Geert (2009): Phrasal names: A constructionist analysis. In: Word Structure 2, 2.
219–240.
Booij, Geert (2010): Construction Morphology. Oxford: Oxford University Press.
Buenafuentes de la Mata, Cristina (2010): La composición sintagmática en español. San Millán
de la Cogolla: Cilengua.
Buenafuentes de la Mata, Cristina (2014): Compounding and variational morphology: The
analysis of inflection in Spanish compounds. In: Borealis: An International Journal of
Hispanic Linguistics 3, 1. 1–21.
RAE: Real Academia Española y Asociación de Academias de la Lengua Española (2010): Nueva
gramática de la lengua española. Manual. Madrid: Espasa.
Rainer, Franz/Varela, Soledad (1992): Compounding in Spanish. In: Rivista di Linguistica 4, 1.
117–142.
Rao, Rajiv (2015): On the phonological status of Spanish compound words. In: Word Structure 8,
1. 84–118.
Renner, Vincent (2006): Les composés coordinatifs en anglais contemporain. Unpublished PhD
dissertation. Université Lumière Lyon 2. Internet: https://tel.archives-ouvertes.fr/
tel-00565046 (last access: 10.12.2016).
Ruiz Gurillo, Leonor (2002): Compuestos, colocaciones, locuciones: Intento de delimitación. In:
Veiga, Alexandre/González Pereira, Miguel/Gómez, Montserrat Souto (eds.): Léxico y
gramática. Lugo: Tris Tram. 327–339.
Scalise, Sergio/Fábregas, Antonio (2010): The head in compounding. In: Scalise, Sergio/Vogel,
Irene (eds.): Cross-disciplinary issues in compounding. Amsterdam u. a.: Benjamins.
109–125.
Schlücker, Barbara/Hüning, Matthias (2009): Compounds and phrases. A functional
comparison between German A+N compounds and corresponding phrases. In: Rivista di
Linguistica 21, 1. 209–234.
Val Álvaro, José Francisco (1999): La composición. In: Demonte, Violeta/Bosque, Ignacio (eds.):
Gramática descriptiva de la lengua española. Vol. 3: Entre la oración y el discurso.
Morfología. Madrid: Espasa-Calpe. 4757–4842.
Van Goethem, Kristel (2009): Choosing between A+N compounds and lexicalized A+N phrases:
The position of French in comparison to Germanic languages. In: Word Structure 2, 2.
241–253.
Maria Koliopoulou
Compounds and multi-word expressions in Greek
1 Introduction
Complex lexical units include compounds as well as multi-word expressions dis-
playing mixed morphosyntactic properties.1 These mixed properties are deter-
mined by language-specific characteristics. Moreover, a diversity of properties is
observed among the different types of multi-word expressions; in some cases
even within the same type of structure. Therefore, their status is rather unclear, as
is also revealed by the strong name variation among scholars (Hüning/Schlücker
2015: 450 f.), even within the same language. The different naming suggestions
cannot be considered as one-to-one equivalents or synonyms. The selection of
one of them is also determined by the theoretical approach adopted. Specifically,
the selection or the creation of a new label depends on the type of grammatical
model as well as on the role of the lexicon in the formation of new lexical units.
Multi-word expressions in Greek have caught the attention of linguists in the
twentieth century. This type of lexical unit has been used more often in the form
of loan translations from English and French (Anastassiadis-Symeonidis 1986,
1994). Since then it has been rather prominent in many terminological domains
as well as in media language. Moreover, it is a formation type commonly selected
for the naming of new concepts or the translation of
borrowed terms, gaining ground over the formation of typical compounds.
The phenomenon of terminological variation regarding multi-word expres-
sions is also apparent in the literature of Greek. Different names that have been
suggested among scholars are for instance lexical phrases (Anastassiadis-Symeonidis
1986; Ralli 1991), multi-word compounds (Ralli 1992; Anastassiadis-Symeonidis
1996; Christofidou 1997; Ralli/Stavrou 1998) and loose multi-word compounds
(Ralli 2005, 2007; Koliopoulou 2006, 2008, 2009). Ralli (2013a, 2013b; cf.
also Bağriaçik/Ralli 2015) adopts in her later studies the term phrasal compounds,
inspired by Booij’s (2009, 2010: 169–192) term phrasal names, in order to differen-
1 I wish to thank the editor of this volume, Anna Anastassiadis-Symeonidis, Pius ten Hacken as
well as the two anonymous reviewers for their constructive comments and criticism. Needless to
say, remaining mistakes and opinions expressed are my own responsibility.
Open Access. © 2019 Koliopoulou, published by De Gruyter. This work is licensed under the
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
https://doi.org/10.1515/9783110632446-008
tiate specific types of complex lexical units from typical one-word compounds
which are morphological objects. However, the use of the term phrasal compound
to refer to this type of structure can be misleading, since it is also used to denote
another kind of structure, namely compounds with a phrasal element at the non-
head position, like chicken and egg situation in English. Such structures are not
possible in Greek (cf. Section 2.1).
In this study, I adopt the term multi-word expression as a term that is general
and theory-neutral – also suggested by Hüning/Schlücker (2015: 451) – to refer to
different types of complex lexical units in Greek sharing morphological and syn-
tactic features in various proportions. The aim of this study is to analyze their
complicated properties and compare them to typical compounds without letting
theoretical considerations override the data. After having analyzed in detail the
different types of multi-word expressions in Greek, I will come back to more the-
oretical considerations regarding their interrelation with other comparable lexi-
cal units as well as their locus of realization in grammar.
Specifically, this study is structured as follows: Section 2 gives an overview of
various complex lexical units found in Greek. Typical compounds, multi-word
expressions as well as phrase-like structures are analyzed in detail and compared
to each other. Section 3 discusses the interrelation between the various types
arguing that they coexist in the lexicon as complementary resources of nominal
naming units. However, coexistence in the lexicon does not exclude competition
among types. Section 4 deals with the question of how complex lexical units can
be accounted for in the lexicon and in grammar. Finally, Section 5 summarizes
the conclusions.
2 Typical compounds vs. other complex lexical units
Compounding can be considered as the output of a morphological operation sit-
uated closer to syntax than any other morphological formation (Scalise 1992: 4).
As a result of this closeness, it is sometimes rather difficult to differentiate com-
pounds from phrases2, and even more from intermediate structures displaying a
2 Many studies have been carried out on the distinction between compounds and phrases based
on selected criteria mostly concerning the formal properties of a compound contrary to those of
a syntactic phrase (e. g. Borer 1988; Scalise 1992; ten Hacken 1994; Bisetto/Scalise 1999; Bauer
2001; Olsen 2001; Donalies 2004; Gaeta/Ricca 2009; Schlücker/Hüning 2009). Despite the detec-
tion of specific criteria, there is no agreement on a clear-cut distinction between compounds and
phrases, at least not cross-linguistically.
3 Greek examples in this chapter are given in Greek as well as transliterated in the Latin script,
before being translated into English.
4 The abbreviation stands for “linking element”.
Nominal compounds consisting of two nouns are the most productive ones (1), as
in a number of other languages, for instance in German. Verbal compounding is
very productive in Greek in comparison to other European languages, either in
the form of determinative structures, as in (3), or in the form of coordinative struc-
tures (e. g. πηγαινοέρχομαι ‘come and go’). In German, for instance, the limited
number of verbal compounds is the result of a backformation process from nom-
inal compounds (Becker 1992: 20 f.; Günther 1997: 6).
With regard to their structural properties, compounds in Greek, usually consisting
of stems, form one phonological word written as one graphemic unit
(cf. (1)–(3)). This phonological word has one main stress assigned either on the
antepenultimate syllable of the entire compound formation (1), or on the regular
stress position of the right-hand constituent (2, 3). Stress assignment is deter-
mined by two specific phonological rules applicable to all compound formations
(Nespor/Ralli 1994: 201, 1996: 357). The form of these rules will not concern us
here.
Moreover, compounds in Greek constitute one morphological unit, to which
syntactic operations do not have access. In the following, I contrast the properties
displayed by a compound formation (4) with those of a syntactic phrase (5), both
consisting of an adjective and a noun, so that the analysis is comparable. The first
indication of the word atomicity displayed by compounds is related to the fact
that word internal inflection is not allowed (4b), contrary to syntactic phrases,
whose components are inflected.
(4) [A N]compound
(4a) τρελόπαιδοNeu.Nom.Sg ← τρελ(ό)Neu.Nom.Sg -ο- παιδ(ί)Neu.Nom.Sg
trelopedo trel(o) -o- ped(i)
crazy boy crazy LE child
(4b) *τρελ-ά-παιδ-αNeu.Nom.Pl ← τρελ(ά)Neu.Nom.Pl παιδ(ιά)Neu.Nom.Pl
trel-a-ped-a trel(a) ped(ia)
crazy boys crazy children
(5) [A N]phrase
(5a) τρελ-όNeu.Nom.Sg παιδ-ίNeu.Nom.Sg
trel-o ped-i
crazy child/boy
(5b) τρελ-άNeu.Nom.Pl παιδ-ιάNeu.Nom.Pl
trel-a ped-ia
crazy children/boys
Apart from this first distinctive characteristic, I will apply a number of diagnostic
tests to both types of structure in order to verify the lexical integrity of compound
structures. Some of the typical diagnostic tests found in the literature on com-
pounding in Greek (cf. Ralli 2013a: 21, 24; Bağriaçik/Ralli 2015: 328 f.; ten Hacken/
Koliopoulou 2016: 130 ff.) are the following: a) independent modification of the
non-head (6), b) coordination of the components (7), c) reversing the word order
(8).
(6a) *πολυ-τρελ-ό-παιδο
poli-trel-o-pedo
very crazy boy
(6b) πολύ τρελό παιδί
poli trelo pedi
very crazy boy
(7a) *τρελ-ο-και-χαζ-ό-παιδο
trel-o-ke-chaz-o-pedo
crazy and stupid boy
(7b) τρελό και χαζό παιδί
trelo ke chazo pedi
crazy and stupid boy
(8a) *παιδ-ό-τρελο
ped-o-trelo
boy crazy
(8b) παιδί τρελό
pedi trelo
boy crazy
With regard to the last test, according to which the order of the constituents of
syntactic phrases can be reversed (8b), it should be mentioned that this possibil-
ity increases the emphasis on the syntactic phrase. Specifically, the property des-
ignated by the adjective is highlighted by this stylistic variation (ten Hacken/Koli-
opoulou 2016: 131 f.). On the contrary, the word order of compound components is
fixed (8a). Even in compounds consisting of components with the same lexical
category, like noun-noun compounds, the change of the order of the two components
is – at least in Standard Modern Greek – ungrammatical (cf. (1) κεφαλόσκαλο/*σκαλοκέφαλο
‘upper/wider step’).
(9a) [stem-stem]
καραβόπανο ← καράβ(ι)5 -ο- παν(ί)
karav-o-pano karav(i) -o- pan(i)
sailcloth ship LE cloth
(9b) [stem-word]
θαλασσοταραχή ← θάλασσ(α) -ο- ταραχή
thalassotarachi thalass(a) -o- tarachi
sea disturbance sea LE disturbance
(9c) [word-stem]
επτάψυχος ← επτά ψυχ(ή)
eptapsichos epta psich(i)
having seven lives seven soul
(9d) [word-word]
ξαναμιλάω ← ξανά μιλάω
ksanamilao ksana milao
talk again again talk
The most productive pattern is that of stem-stem formations (9a), since stem-con-
stituents are preferred in Greek compounds.
The preference for a specific type of constituent in the compound formations
constitutes an important parameter determining various structural characteris-
tics of compounds (cf. Koliopoulou 2013, 2014a), among others the possibility of
the appearance of a linking element. Specifically, in case the first constituent is a
stem (9a), (9b), the two constituents are linked to each other with the element
-o-.6 Its appearance is obligatory and rather systematic, motivated by the fact that
compound constituents are usually stems.
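The distribution of the linking element can be sketched in a few lines of Python. This is a strongly simplified illustration and not part of the original analysis: it works on the transliterated forms in (9), takes the second constituent in the form it has inside the compound, ignores stress assignment and the rare linking elements -ι- and -α-, and uses a hypothetical function name, combine.

```python
def combine(first: str, second: str, first_is_stem: bool) -> str:
    # The linking element -o- appears only when the first constituent is a stem.
    linker = "o" if first_is_stem else ""
    return first + linker + second


# Transliterated constituents taken from examples (9a)-(9d).
print(combine("karav", "pano", first_is_stem=True))       # [stem-stem]
print(combine("thalass", "tarachi", first_is_stem=True))  # [stem-word]
print(combine("epta", "psichos", first_is_stem=False))    # [word-stem]
print(combine("ksana", "milao", first_is_stem=False))     # [word-word]
```

The resulting strings correspond to the transliterations karavopano, thalassotarachi, eptapsichos and ksanamilao given in (9).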
5 A stem constituent in a Greek compound can also be indicated by the fact that the truncated
inflectional ending is given in parentheses.
6 Other possible forms of the linking element are -ι- and -α- appearing in rare cases (cf. Ralli
2013a: 50–53).
The difference in the degree of recursion between German and Greek compounds
is related to the type of constituents. Specifically, I argue that stem constituents
exhibit more restrictions than word constituents, whose more independent char-
acter allows the connection with further compound members, either in the head
or in the non-head position.
7 For an extensive comparison between Greek and German compounds cf. Koliopoulou (2013,
2014a, 2014c, 2015).
2.2 Multi-word expressions
Multi-word expressions are another type of complex lexical unit composed of free
morphemes. These peculiar structures, found also in Greek, have already been
studied by many scholars (cf. literature mentioned in Section 1 as well as
Anastassiadis-Symeonidis 1986: 138–143, 203 ff.; Koliopoulou 2012: 862) in comparison
to one-word compounds, to syntactic phrases and even to each other, since
the various types display different characteristics. Specifically, contrary to typical
compounds constituting one morphological word to which syntax has no access,
multi-word expressions in Greek are structures with some morphological properties
(cf. Section 2.2.1) that nevertheless do not prevent syntax from having access to
their internal structure. They can be considered as intermediate structures, since
they behave similarly to one-word compounds, but they also bear features typical
of syntactic phrases. Their mixed properties vary not only within the different
types of intermediate structures, but in some cases even among the various exam-
ples of the same type (cf. (26)–(27)).
Specifically, multi-word expressions in Greek are nominal structures8 com-
posed either of an inflected adjective and a noun or of two nouns. They look like
syntactic phrases since their components are independent phonological words,
contrary to one-word compounds constituting a single phonological word,
regardless of the type of the compound constituents. Moreover, multi-word
expressions consist of two inflected words. Compounds, by contrast, are usually
formed out of stems linked by the element -o-. Compound formations are inflected
at the right edge of the structure.
To be more specific, there are four types of multi-word structures9:
a) [A N] expressions composed of an inflected adjective and a noun (12),
b) [N NGEN] expressions consisting of two nouns, the second being in the genitive
case (13),
c) [N NAttr.] expressions consisting of two nouns in attributive relation (14),
d) [N NApp.] expressions composed of two nouns in appositive relation (15).
8 There are only nominal multi-word expressions in Greek, which should not be confused with
other types of phrasal expressions, like την κάνω (tinFEM.ACC.SG kano1P.SG, her make, ‘I am going’),
namely fossilized expressions with a very idiomatic meaning (cf. Ralli 2013a: 252).
9 Most examples are taken from Anastassiadis-Symeonidis (1986), the first linguist to mention
and thoroughly analyze these structures in Greek.
(12) [A N]
(12a) ψυχρός πόλεμος ← ψυχρόςMasc/Nom/Sg πόλεμοςMasc/Nom/Sg
psichros polemos psichros polemos
cold war cold war
(12b) τρίτος κόσμος ← τρίτοςMasc/Nom/Sg κόσμοςMasc/Nom/Sg
tritos kosmos tritos kosmos
third world third world
(12c) μαύρη αγορά ← μαύρηFem/Nom/Sg αγοράFem/Nom/Sg
mavri agora mavri agora
black market black market
(12d) μεγάλη οθόνη ← μεγάληFem/Nom/Sg οθόνηFem/Nom/Sg
megali othoni megali othoni
cinema big screen
(13) [N NGEN]
(13a) αγορά εργασίας ← αγοράNom.Sg εργασίαςGen.Sg
agora ergasias agora ergasias
job market market job
(13b) τάγματα ασφαλείας ← τάγματαNom.Pl ασφαλείαςGen.Sg
tagmata asfalias tagmata asfalias
security battalions battalions safety
(13c) κρέμα ημέρας ← κρέμαNom.Sg ημέραςGen.Sg
krema imeras krema imeras
day cream cream day
(13d) άρση βαρών ← άρσηNom.Sg βαρώνGen.Pl
arsi varon arsi varon
weightlifting lift weight
(14) [N NAttr.]
(14a) λέξη κλειδί ← λέξηNom κλειδίNom
leksi klidi leksi klidi
key word word key
(14b) νόμος πλαίσιο ← νόμοςNom πλαίσιοNom
nomos plesio nomos plesio
frame law law frame
(14c) φόρος φωτιά ← φόροςNom φωτιάNom
foros fotia foros fotia
very high tax tax fire
(14d) γυναίκα αράχνη ← γυναίκαNom αράχνηNom
gineka arachni gineka arachni
greedy, dishonest woman woman spider
(15) [N NApp.]
(15a) μεταφραστής διερμηνέας
metafrastis diermineas
translator-interpreter
(15b) σκηνοθέτης παραγωγός
skinothetis paragogos
director-producer
(15c) ηθοποιός τραγουδιστής
ithopios tragudistis
actor-singer
(15d) δικηγόρος πολιτικός
dikigoros politikos
lawyer-politician
[A N] expressions, like those given under (12), once stripped of both inflectional
endings and turned into one complex stem, can also receive a derivational suffix, as
shown in the examples given under (18). However, [A N] expressions are not the
only structures that display this possibility. As Anastassiadis-Symeonidis (1986:
140) mentions, some [N NGEN] expressions (13) can also be input to a suffixation
process, as shown in (19). The suffixes that take part in this derivational process
are the adjectival suffix -ικ(ός) and the nominal suffixes -ίτ(ης) and -ίστ(ας).
Coordinative compounds in Greek, which are possible in all major lexical categories
(21), do not usually display this possibility, except for very few [A A] compounds,
such as (21b) (cf. Ralli 2007: 99; Koliopoulou 2013: 301).
Although the order of the [N NApp.] components is more easily reversible,
reversal affects the meaning of the structure to some degree (Anastassiadis-Symeonidis
1986: 191 f.; Ralli 2013a: 256). Specifically, the first member
bears a more prominent semantic role than the second one. Therefore, the mean-
ing of the expression changes slightly in case the order of the constituents is
reversed.
Moreover, coordinative compounds and [N NApp.] structures are not directly
comparable, although some scholars treat them in this way (cf. Olsen 2001;
Bisetto/Scalise 2005). In many studies it has been argued that [N NApp.] expres-
sions display different characteristics in comparison to coordinative compounds
(cf. Wälchli 2005: 7; Bauer 2008: 4; Gaeta/Ricca 2009: 50; Manolessou/Tsolakidis
2009: 30). In the case of Greek, there is a clear demarcation between the two types
of formation since coordinative compounds constitute one phonological and
morphological word (21). In contrast, [N NApp.] expressions consist of two phono-
logically and morphologically independent words. Moreover, coordinative com-
pounds are not characterized by an appositional relation between the compo-
nents. The most common type of semantic relation found in Greek coordinative
compounds is the additive one (Ralli 2007: 80 f., 98, 2013a: 163; Koliopoulou 2013:
297 ff.).
Since multi-word expressions always consist of two inflected words, they do
not display the morphological properties of one-word compounds (cf. Anastassi-
adis-Symeonidis 1986: 149, 174, 196). Particularly, the inflected components of
[A N] expressions agree in gender, case and number, as shown in (22), like regular
syntactic phrases.
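The agreement requirement just described can be restated procedurally. The following is only a minimal sketch: the names `Word` and `agrees` are hypothetical helpers introduced for illustration, the feature labels follow the annotations in (12), and the mismatched pair is constructed here purely for contrast and is not an attested example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """An inflected word form with its agreement features (hypothetical helper)."""
    form: str
    gender: str
    case: str
    number: str


def agrees(adjective: Word, noun: Word) -> bool:
    """Return True if both words carry the same gender, case and number values."""
    return (adjective.gender, adjective.case, adjective.number) == (
        noun.gender, noun.case, noun.number)


# Feature values as annotated in (12a) and (12c).
cold_war = (Word("ψυχρός", "Masc", "Nom", "Sg"), Word("πόλεμος", "Masc", "Nom", "Sg"))
black_market = (Word("μαύρη", "Fem", "Nom", "Sg"), Word("αγορά", "Fem", "Nom", "Sg"))

# Constructed mismatch (adjective of (12a) with the noun of (12c)), for contrast only.
mismatch = (Word("ψυχρός", "Masc", "Nom", "Sg"), Word("αγορά", "Fem", "Nom", "Sg"))

print(agrees(*cold_war), agrees(*black_market), agrees(*mismatch))
```

The check covers only feature identity between the two components; it says nothing about the restricted inflection of non-heads discussed for the other types of expression.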
Similar characteristics of agreement in gender, case and number are also dis-
played by [N NApp.] expressions (cf. (15)), as illustrated below.
Moreover, the genitive case of the non-head is always singular regardless of the
number value of the head, as presented in (24b) and (24d). The inflectional prop-
erties of the non-head are less variable than the inflectional properties of the non-
head of equivalent regular phrases. Specifically, both constituents of a syntactic
phrase can be variably inflected regarding the features of number, as illustrated
in (25).
Interestingly, comparing two examples of the same type of expression, λέξη κλειδί
(‘key word’, (14a), (26)) and νόμος πλαίσιο (‘frame law’, (14b), (27)), it becomes
obvious that not all examples have the same inflectional properties in compara-
ble contexts (cf. Koliopoulou 2009: 67 f., 2012: 866 f., Ralli 2013a: 254 f.). Specifi-
cally, the non-head of the expression νόμος πλαίσιο displays a higher degree of
inflectional autonomy in comparison to the inflectional variation displayed by
the non-head of the example λέξη κλειδί (cf. (26b)–(27b), (26d)–(27d)). Moreover,
with regard to the features of plural and genitive case there are different gram-
maticality judgements among native speakers.
*νόμωνMasc/Gen/Pl πλαισίουNeu/Gen/Sg
nomon plesiu
*νόμωνMasc/Gen/Pl πλαισίωνNeu/Gen/Pl
nomon plesion
Despite the fact that multi-word expressions share basic properties with regular
syntactic phrases, they share many properties with typical compounds as well.
Specifically, all four types of multi-word expression in Greek display a certain
degree of syntactic fixedness. Some expressions are more restricted than others
with regard to the degree of access to syntactic operations, as illustrated by the
result of applying a number of tests concerning their internal properties i. e. their
degree of lexical integrity (Anderson 1992: 84). Their mixed morpho-syntactic
properties have been studied in detail (Anastassiadis-Symeonidis 1986, 1994,
1996; Ralli 1991, 1992, 2005, 2007, 2013a, 2013b; Christofidou 1997; Ralli/Stavrou
1998; Koliopoulou 2006: 43–56, 2008, 2009, 2012; Bağrıaçık/Ralli 2015; ten
Hacken/Koliopoulou 2016). In most of these studies, the degree of lexical integrity
of the multi-word expressions has been analyzed on the basis of diagnostic tests
exploring how many properties they share with regular syntactic formations.
In the following, I use the tests applied to typical compounds (cf. (6)–(8)) in
the previous section in order to determine the degree of syntactic fixedness dis-
played by the different types of multi-word expressions found in Greek (cf. (12)–
(15)). Moreover, I use an additional test regarding the possibility of adjective-noun
syntactic phrases to double the definite article for emphatic reasons, which is
only applicable to [A N] expressions. I summarize the tests under (28):
In (29), I apply the above tests contrastively to the [A N] expression μεγάλη οθόνη
(‘cinema’, (12d)) as well as to the corresponding syntactic phrase μεγάλη οθόνη
(‘big screen’). The examples chosen for the contrastive analysis consist of the
same constituents. However, the difference between them is clear since the [A N]
expression denotes the cinema, whereas the meaning of the syntactic phrase is
fully compositional, denoting a big screen.
The negative response of the [N NGEN] expression to the applied test reveals a
degree of syntactic fixedness similar to that of the [A N] expressions.
The reaction of [N NAttr.] expressions to the same tests is not different from that
of the structures tested above, as illustrated in the following on the basis of the
example φόρος φωτιά (‘very high tax’, (14c)).
With regard to the possibility of reversing the order of the constituents, most of
the examples belonging to this type of expression have a negative response,
proven by (31c) as well as by (32a΄–c΄). However, there are a few exceptions, e. g.
(32d΄), in which the inversion of the two constituents is allowed (Koliopoulou
2006: 52, 2009: 66), since in this way the property designated by the non-head
can be highlighted.10
2.2.2 Summary
In (34), I summarize the main points of the analysis of the four types of multi-word
expression found in Greek regarding their degree of syntactic fixedness:
(34a) [A N] and [N NGEN] expressions look like syntactic phrases and are inflected
as such. However, both their inflectional properties as well as their behav-
ior on the diagnostic tests show a certain degree of lexical integrity. Specif-
ically, they share the most morphological characteristics with typical com-
pounds compared to the other types of expressions. Moreover, they can be
input to a suffixation process. Although both types of expressions are
rather rigid with regard to their morphosyntactic features, not all instances
may take part in a suffixation process.
(34b) [N NAttr.] expressions display a rather unclear status. Not only their inflec-
tional properties but also their response to the tests of syntactic fixedness
varies among the different instances.
(34c) [N NApp.] expressions constitute a borderline case among multi-word
expressions in Greek. Not only with regard to their inflectional properties
but also with regard to their behavior in the diagnostic tests, they show the
lowest degree of syntactic fixedness among all types of expressions con-
sidered in this study. However, they still show some signs of lexical auton-
omy, according to which their classification as multi-word expressions is
justified.
2.3 Phrase-like structures
(35) [A N]
(35a) θεατρική κριτική ← θεατρική κριτική
theatriki kritiki theatriki kritiki
theater review theatrical criticism/review
(35b) βιομηχανική ζώνη ← βιομηχανική ζώνη
viomichaniki zoni viomichaniki zoni
industrial zone industrial zone
(35c) πυρηνική δοκιμή ← πυρηνική δοκιμή
piriniki dokimi piriniki dokimi
nuclear testing nuclear testing
(35d) ψηφιακό κύκλωμα ← ψηφιακό κύκλωμα
psifiako kikloma psifiako kikloma
digital circuit digital circuit
(36) [N NGEN]
(36a) επεξεργασία δεδομένων ← επεξεργασία δεδομένωνGEN
epeksergasia dedomenon epeksergasia dedomenon
data processing processing data
(36b) εκπομπή αερίων ← εκπομπή αερίωνGEN
ekpompi aerion ekpompi aerion
gas emission emission gases
However, as is obvious from the examples above, both types of structure display
a certain degree of semantic opacity, like the corresponding types of multi-word
expression.
According to their response to the diagnostic tests of syntactic atomicity, both
structures can be subject to syntactic operations. Specifically, in (37) I consider
the application of the tests (28a–d) to the [A N] structure βιομηχανική ζώνη (35b)
and in (38) I apply the tests (28a–c) to the [N NGEN] example εκπομπή αερίων, cf.
(36b).
It becomes clear from the above tests that syntactic operations have access to
their internal structure, contrary to the [A N] and [N NGEN] multi-word expressions
which display a certain degree of lexical integrity, as shown in (29) and (30).
However, due to the argument structure displayed by these structures it can
be argued that they are of a different nature from common syntactic phrases. Par-
ticularly, their structure resembles that of compounds consisting of a relational
adjective and a noun (ten Hacken 1994: 89–98; Bisetto 2010: 65–85). Moreover,
they constitute naming units, which also supports the view that they are of a different
nature than common syntactic phrases. Therefore, they constitute a further
type of complex lexical unit, which on the one hand differs from regular
syntactic phrases, and on the other cannot be classified as belonging to the set of
the multi-word expressions analyzed above. Moreover, they display more syntac-
tic properties than the multi-word expressions. Thus, their demarcation from syn-
tactic phrases is a rather difficult task, since it is only based on a few minor dis-
tinctive characteristics and not on their response with regard to the diagnostic
tests of syntactic fixedness.
3 Complementation vs. competition
I have argued above for a distinction between three types of complex lexical units
in Greek: one-word compounds, multi-word expressions and phrase-like struc-
tures. All three constitute nominal structures sharing a function, i. e. to name con-
cepts, particularly complex concepts. Regarding their function, they are clearly
different from syntactic phrases, which describe a concept but do not name it.
Since they provide further means for naming concepts associated with various
terminological areas, the set of naming devices in the nominal domain of the lex-
icon is extended through their existence. In this sense, the three types of complex
lexical units constitute complementary resources of nominal naming units.
Complementation in the lexicon with regard to different naming strategies
does not exclude competition among structures. Specifically, typical one-word
compounds and multi-word lexical units do not exist in Greek side by side,
although this scenario cannot be excluded for all languages. Take for instance
lexical units in German (cf. ten Hacken/Koliopoulou 2016), like grüner Tee and
Grüntee (‘green tea’) or schwarzer Markt and Schwarzmarkt (‘black market’), coexisting
synchronically. Their coexistence is explained by Hüning/Schlücker (2015:
459) on the grounds of stylistic variation and/or diachronic change arguing that
the structure schwarzer Markt, for instance, has been gradually replaced by the
compound Schwarzmarkt, which is synchronically more frequent than the equiv-
alent phrase.
In Greek, the three types of complex lexical units compete with each other.
However, there is no evidence supporting the existence of a blocking mechanism
(cf. Rainer 2016), although the formation of typical compounds is much more pro-
ductive and regular than the formation of multi-word structures. Moreover, I
claim that the selection of a possible naming strategy depends on the character-
istics of the concept. Specifically, a borrowed nominal concept or a complex con-
cept meant for terminological use is a possible candidate for a type of multi-word
lexical unit.
They are neither morphological structures nor regular syntactic phrases; they are
rather situated in between. Therefore, multi-word expressions in Greek have
often been assigned to a continuum situated between the two grammatical
components, morphology and syntax. In this
sense, different grammatical models supporting the interaction between the two
domains (Kiparsky 1982; Bybee 1985; Borer 1988) have been adopted by many
scholars as the most adequate way to deal with multi-word expressions and their
variable features in Greek (cf. Ralli 1991, 1992, 2007: 245 f.; Ralli/Stavrou 1998;
Koliopoulou 2009: 69, 2012: 868).
In a similar context, Ralli (2013a: 261 f., 266 ff., 2013b: 183 f., 194), based on
Borer’s (2009) analysis of comparable nominal constructs in Hebrew, argues that
multi-word expressions in Greek are derived within the syntactic domain which
interacts with morphology. Her argument is rather justified, since multi-word
expressions and phrase-like structures in Greek look like syntactic phrases that
consist of two phonologically and morphologically independent words. However,
they are different from regular syntactic phrases, since their structure is not
accessible to all syntactic operations. Moreover, they display a certain degree of
lexical integrity coinciding in many cases with a non-compositional meaning,
also displayed by typical compounds.
The fact that there is strong variation among the different types of multi-word
structures with regard to their mixed morphosyntactic properties supports the
view that there is no clear borderline between morphology and syntax and that
the two domains are situated on a continuum.11 Multi-word expressions in Greek
which display a varying degree of structural visibility to syntactic operations
occupy different positions on this continuum. [A N] and [N NGEN] expressions in
Greek are clearly nearer to the morphological domain, i. e. to typical compounds,
than any other multi-word expression. The fact that some [A N] and [N NGEN] for-
mations can be input to a derivational process is a further argument in favor of
the interaction between morphology and syntax, since structures generated in
syntax are turned into one complex stem in order to undergo a morphological
operation (cf. (18)–(19)). The other two types of nominal expressions are spread
out along the continuum, specifically between the [A N] and [N NGEN] formations
and regular syntactic phrases. Phrase-like structures are situated close to the
syntactic domain.
The various approaches that argue in favor of the existence of a continuum
between the two grammatical components or the interaction among them are
based on the assumption that the two grammatical domains are distinct. Although
they may account for structures like multi-word expressions in Greek displaying
mixed morphosyntactic properties, they do not throw any light on the grey zone
between morphology and syntax. In this respect, the question arises whether the
two grammatical domains are actually distinct and if not what kind of demarca-
tion would allow us to differentiate between typical morphological structures,
syntactic phrases and intermediate structures.
In order to distinguish compounds from phrases as well as from the in-between
formations, Gaeta/Ricca (2009: 38 f.) propose another type of demarcation.
They argue in favor of a four-way classification based on two criteria: a) ‘morphological’,
i. e. the output of a morphological operation and b) ‘lexicalized’, i. e.
attributed to the lexicon taking into consideration not only idiosyncrasy but also
token frequency and/or naming force. In this respect, typical compounds are
characterized as [+ morphological] and [+ lexical], whereas syntactic phrases
have a negative sign in both properties. Multi-word expressions – or phrase-like
units in Gaeta/Ricca’s terminology – are non-morphological but lexical units. In
this view being a lexical unit is independent from being an output of a morpho-
logical operation.
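The two binary criteria can be read as a simple decision table. The sketch below is an illustration only: `Unit` and `classify` are hypothetical names, and the label for the [+ morphological, - lexical] cell is a placeholder, since that combination is not characterized in the passage above.

```python
from typing import NamedTuple


class Unit(NamedTuple):
    """Feature bundle for a complex unit (hypothetical helper)."""
    morphological: bool  # output of a morphological operation
    lexical: bool        # attributed to the lexicon


def classify(unit: Unit) -> str:
    """Map the two binary features to the classes named in the text."""
    if unit.morphological and unit.lexical:
        return "typical compound"
    if not unit.morphological and unit.lexical:
        return "multi-word expression (phrase-like unit)"
    if not unit.morphological and not unit.lexical:
        return "syntactic phrase"
    # Remaining cell: [+ morphological, - lexical]; not named in this passage.
    return "morphological but not lexical (not characterized here)"


for features in (Unit(True, True), Unit(False, True), Unit(False, False)):
    print(features, "->", classify(features))
```

Treating the features as independent booleans mirrors the point that being a lexical unit does not depend on being the output of a morphological operation.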
On a similar basis, ten Hacken/Koliopoulou (2016: 134 ff.), dealing with [A N]
multi-word expressions in various languages, argue that the main criterion to
demarcate [A N] intermediate structures from adjective-noun syntactic phrases is
related to the function of these structures. Structures constituting a naming unit
are lexical units, while descriptive phrases belong to the syntactic domain.
With regard to Greek, the different types of multi-word expressions and
phrase-like structures, despite their varying morphosyntactic features, sometimes
even within the same type, share the naming function (cf. Anastassiadis-Symeonidis
1986: 142 f.). They are lexical units with a rule-based formation
extending the naming device of the lexicon. This extended view of the lexicon is
also supported by approaches such as the Parallel Architecture (cf. Jackendoff
2010) and Construction Morphology (cf. Booij 2010, this volume) on the basis of
comparable multi-word, intermediate structures.
5 Conclusions
The demarcation between the various types of complex lexical units is primarily
a language specific matter, although most of the criteria used to differentiate mor-
phological from syntactic structures apply at an abstract level to all languages. It
actually depends on the particular characteristics of wordhood and compound-
hood, as displayed in each language. These two basic characteristics determine
the morphological structures and the lexicon. The degree of resemblance between
typical morphological structures and other complex lexical units specifies the
form of the lexicon in a particular language and the possibility of interaction
between the grammatical domains.
Multi-word expressions and phrase-like structures in Greek are clearly dis-
tinct from typical compounds: their constituents are phonologically and morpho-
logically independent words, a linking element is not required, they display head
properties similar to syntactic phrases as well as internal inflection. In Greek, the
degree of syntactic fixedness depends on the type of expression one deals with.
Sometimes, there is variation of the syntactic characteristics even among the dif-
ferent examples of the same type of structure (cf. (26)–(27), (32)). Despite the fact
that multi-word expressions and phrase-like structures in Greek cannot be
assigned to morphology like typical compounds, all three types of complex lexi-
cal units share the same function, i. e. the naming function. They are generated
by different lexical unit formation patterns which extend the naming strategies of
the lexicon. The outcome of this formation process is lexical units stored in the
speakers’ mental lexicon.
Compounding in Greek is a very productive process and thus a main language
naming device. However, new concepts have been introduced to the language in
the last decades through the form of a multi-word expression or a phrase-like
structure mostly found in specialized or newspaper texts. The appearance of such
a lexical unit is an indication for native speakers of the terminological use of the
concept. The emergence of various types of lexical units other than compounds
shows a clear tendency towards different types of naming units and indicates a silent
process of language change regarding the naming of concepts, especially those
borrowed from English.
References
Anastassiadis-Symeonidis, Anna (1986): Η Νεολογία στην Κοινή Νεοελληνική [Neology in
standard Modern Greek]. Thessaloniki: Aristotle University of Thessaloniki.
Anastassiadis-Symeonidis, Anna (1994): Νεολογικός δανεισμός της Νεοελληνικής [Neological
borrowing in modern Greek]. Thessaloniki: Self publishing.
Anastassiadis-Symeonidis, Anna (1996): Η νεοελληνική σύνθεση [Modern Greek compounding].
In: Katsimali, Georgia/Kavoukopoulos, Fotis (eds.): Ζητήματα νεοελληνικής γλώσσας:
Διδακτική Προσέγγιση [Themes of the Modern Greek language: A didactic approach].
Rethymno: University of Crete. 97–120.
Anderson, Stephen R. (1992): A-morphous morphology. (= Cambridge Studies in Linguistics
62). Cambridge, UK: Cambridge University Press.
Bağrıaçık, Metin/Ralli, Angela (2015): Phrasal vs. morphological compounds: Insights from
Modern Greek and Turkish. In: Language Typology and Universals (STUF) 68. 323–357.
Bauer, Laurie (2001): Compounding. In: Haspelmath, Martin et al. (eds.): Language typology
and language universals. An international handbook. Vol. 1. Berlin/New York: De Gruyter.
695–707.
Bauer, Laurie (2008): Dvandva. In: Word Structure 1, 1. 1–20.
Bauer, Laurie (2009): Typology of compounds. In: Lieber, Rochelle/Štekauer, Pavol (eds.).
343–356.
Becker, Thomas (1992): Compounding in German. In: Rivista di Linguistica 4, 1. 5–36.
Bisetto, Antonietta (2010): Relational adjectives crosslinguistically. In: Lingue e Linguaggio 9, 1.
65–85.
Bisetto, Antonietta/Scalise, Sergio (1999): Compounding: Morphology and/or syntax? In:
Mereu, Lunella (ed.): Boundaries of morphology and syntax. (= Current Issues in Linguistic
Theory 180). Amsterdam: Benjamins. 31–48.
Bisetto, Antonietta/Scalise, Sergio (2005): The classification of compounds. In: Lingue e
Linguaggio 4, 2. 319–332.
Booij, Geert (2009): Compounding and construction morphology. In: Lieber, Rochelle/Štekauer,
Pavol (eds.). 201–216.
Booij, Geert (2010): Construction morphology. Oxford: Oxford University Press.
Borer, Hagit (1988): On the morphological parallelism between compounds and constructs. In:
Booij, Geert/van Marle, Jaap (eds.): Yearbook of morphology. Dordrecht: Foris. 45–65.
Borer, Hagit (2009): Afro-Asiatic, Semitic: Hebrew. In: Lieber, Rochelle/Štekauer, Pavol (eds.).
491–511.
Bybee, Joan (1985): Morphology: A study of the relation between meaning and form.
(= Typological Studies in Language 9). Amsterdam: Benjamins.
Christofidou, Anastasia (1997): A textlinguistic approach to the phenomenon of multi-word
compounds. In: Drachman, Gaberell et al. (eds.): Greek linguistics 95. Proceedings of the
2nd International Conference on Greek Linguistics. Graz: Neugebauer. 67–75.
Donalies, Elke (2004): Grammatik des Deutschen im europäischen Vergleich. Kombinatorische
Begriffsbildung. Vol. 1: Substantivkomposition. Mannheim: Institut für Deutsche Sprache.
Fellbaum, Christiane (2011): Idioms and collocations. In: Maienborn, Claudia/von Heusinger,
Klaus/Portner, Paul (eds.): Semantics. An international handbook of natural language
meaning. Vol. 1. Berlin/New York: De Gruyter. 441–456.
Gaeta, Livio/Ricca, Davide (2009): Composita solvantur: Compounds as lexical units or
morphological objects? In: Rivista di Linguistica 21, 1. 45–68.
Günther, Hartmut (1997): Zur grammatischen Basis der Getrennt-/Zusammenschreibung im
Deutschen. In: Dürscheid, Christa/Ramers, Karl-Heinz/Schwarz, Monika (eds.): Sprache
im Fokus. Festschrift für Heinz Vater zum 65. Geburtstag. Tübingen: Niemeyer. 3–16.
Hacken, Pius ten (1994): Defining morphology. A principled approach to determining the
boundaries of compounding, derivation and inflection. Hildesheim: Olms.
Hacken, Pius ten/Koliopoulou, Maria (2016): Adjectival non-heads and the limits of
compounding. In: SKASE Journal of Theoretical Linguistics 13, 2. 122–139.
Hüning, Matthias/Schlücker, Barbara (2015): Multi-word expressions. In: Müller, Peter O. et al.
(eds.): Word-formation. An international handbook of the languages of Europe. Vol. 1.
(= Handbücher zur Sprach- und Kommunikationswissenschaft (HSK) 40.1). Berlin/Boston:
De Gruyter. 450–467.
Jackendoff, Ray (2010): Meaning and the lexicon: The parallel architecture 1975–2010. Oxford:
Oxford University Press.
Kiparsky, Paul (1982): Lexical morphology and phonology. In: The Linguistic Society of Korea:
Linguistics in the morning calm. Selected papers from SICOL-1981. Seoul: Hanshin. 3–91.
Koliopoulou, Maria (2006): Περιγραφή και ανάλυση των χαλαρών πολυλεκτικών συνθέτων της
Νέας Ελληνικής [Description and analysis of the loose multi-word compounds of Modern
Greek]. Patras: University of Patras. M.A. Thesis. Internet: http://nemertes.lis.upatras.gr/
jspui/handle/10889/914 (last access: 14.6.2018).
Koliopoulou, Maria (2008): The loose multi-word compounds of Modern Greek under the prism
of construction morphology. In: Lavidas, Nikolaos/Nouchoutidou, Elissavet/Sionti,
Marietta (eds.): New perspectives in Greek linguistics. Newcastle upon Tyne: Cambridge
Scholars Publishing. 213–224.
Koliopoulou, Maria (2009): Loose multi-word compounds and noun constructs. In: Patras
Working Papers in Linguistics 1. Special issue: Morphology. 59–71.
Koliopoulou, Maria (2012): Μεταξύ συνθέτων και φράσεων [Between compounds and phrases].
In: Gavriilidou, Zoe et al. (eds.): Selected papers of the 10th International Conference on
Greek Linguistics. Komotini: Democritus University of Thrace. 861–869.
Koliopoulou, Maria (2013): Θέματα σύνθεσης της Ελληνικής και της Γερμανικής: συγκριτική
προσέγγιση. [Issues of Modern Greek and German compounding: A contrastive approach].
Patras: University of Patras. PhD Dissertation. Internet: http://nemertes.lis.upatras.gr/
jspui/handle/10889/5962?locale=en (last access: 14.6.2018).
Koliopoulou, Maria (2014a): Issues of Modern Greek and German compounding: a contrastive
approach. In: Journal of Greek Linguistics 14, 1. 117–125.
Koliopoulou, Maria (2014b): How close to syntax are compounds? Evidence from the linking
element in German and Modern Greek compounds. In: Rivista di Linguistica 26, 2. 51–70.
Koliopoulou, Maria (2014c): Komposition im Deutschen und Neugriechischen: Eine kontrastive
morphologische Analyse. In: Katsaounis, Nikolaos/Sidiropoulou, Renate M. (eds.):
Sprachen und Kulturen in (Inter)Aktion. Vol. 2: Linguistik, Didaktik, Translationswis-
senschaft. (= Hellenogermanica 2). Frankfurt a. M.: Lang. 43–55.
Koliopoulou, Maria (2015): Possessive/bahuvrīhi compounds in German: An analysis based on
comparable compounds in Modern Greek. In: Languages in Contrast 15, 1. 81–101.
Koliopoulou, Maria (2017): What can word formation offer to translation practice? A case study
of German compounds and their English equivalents. In: Zybatow, Lew N. et al. (eds.):
Übersetzen und Dolmetschen: Berufsbilder, Arbeitsfelder, Ausbildung. Ein- und Ausblick
in ein sich wandelndes Berufsfeld der Zukunft. (= Forum Translationswissenschaft 21).
Frankfurt a. M.: Lang. 117–136.
Lieber, Rochelle/Štekauer, Pavol (eds.) (2009): The Oxford handbook of compounding. Oxford:
Oxford University Press.
Manolessou, Io/Tsolakidis, Symeon (2009): Greek coordinated compounds: Synchrony and
diachrony. In: Patras Working Papers in Linguistics 1. 23–39.
Mukai, Makiko (2013): Recursive compounds and linking morpheme. In: International Journal of
English Linguistics 3, 4. 36–49.
Neef, Martin (2009): IE, Germanic: German. In: Lieber, Rochelle/Štekauer, Pavol (eds.).
386–399.
Nespor, Marina/Ralli, Angela (1994): Stress domains in Greek compounds: A case of
morphology-phonology interaction. In: Philippaki-Warburton, Irene/Nicolaidis, Katerina/Sifianou,
Maria (eds.): Themes in Greek linguistics I. Amsterdam: Benjamins. 201–208.
Nespor, Marina/Ralli, Angela (1996): Morphology-phonology interface: Phonological domains
in Greek compounds. In: The Linguistic Review 13, 3/4. 357–382.
Olsen, Susan (2001): Copulative compounds: A closer look at the interface between syntax and
morphology. In: Booij, Geert/van Marle, Jaap (eds.): Yearbook of morphology 2000.
Dordrecht: Springer. 279–320.
Rainer, Franz (2016): Blocking. In: Aronoff, Mark (ed.): Oxford research encyclopedia of
linguistics. 1–22. Internet: http://dx.doi.org/10.1093/acrefore/9780199384655.013.33
(last access: 14.6.2018).
Ralli, Angela (1991): Λεξική φράση: Αντικείμενο μορφολογικού ενδιαφέροντος [Lexical phrase:
A morphological analysis]. In: Μελέτες για την Ελληνική Γλώσσα [Studies in Greek
Linguistics 1990]. 205–221.
Ralli, Angela (1992): Compounds in Modern Greek. In: Rivista di Linguistica 4, 1. 143–174.
Ralli, Angela (2005): Μορφολογία [Morphology]. Athina: Patakis.
Ralli, Angela (2007): Η σύνθεση λέξεων: διαγλωσσική μορφολογική προσέγγιση [Compounding:
A cross-linguistic morphological approach]. Athina: Patakis.
Ralli, Angela (2013a): Compounding in Modern Greek. Dordrecht: Springer.
Ralli, Angela (2013b): Compounding and its locus of realization: Evidence from Greek and
Turkish. In: Word Structure 6, 2. 181–200.
Ralli, Angela/Stavrou, Melita (1998): Morphology-syntax interface: A-N compounds vs. A-N
constructs in Modern Greek. In: Booij, Geert/van Marle, Jaap (eds.): Yearbook of
morphology 1997. Dordrecht: Springer. 243–263.
Scalise, Sergio (1992): Compounding in Italian. In: Rivista di Linguistica 4, 1. 175–200.
Schlücker, Barbara/Hüning, Matthias (2009): Compounds and phrases: A functional
comparison between German A + N compounds and corresponding phrases. In: Rivista di
Linguistica 21, 1. 209–234.
Wälchli, Bernhard (2005): Co-compounds and natural coordination. Oxford: Oxford University
Press.
Ingeborg Ohnheiser †
Compounds and multi-word expressions
in Russian
Introduction
This chapter deals with the discussion of the relation between multi-word expres-
sions, compounds, and derivations in the description of Russian and other Slavic
languages. Referring to pertinent publications, the aim is to show how these
descriptions have been influenced by particular theoretical conceptions (e. g. the
onomasiological view adopted by Dokulil 1962) and the respective grammatical
tradition (e. g. Russkaja grammatika 1980, generally known as “Grammatika-80”:
Švedova 1980). New approaches to the description of the relation between phrases
and derivatives as well as between phrases and a special type of Russian com-
pounds (the so-called stump compounds) from the viewpoint of Construction
Grammar are presented with reference to works by Benigni/Masini (2010) and
Masini/Benigni (2012). In view of recent linguistic developments, the competi-
tion between multi-word expressions and N+N compounds is discussed, which
persists irrespective of the increasing productivity of this compound type in
Russian.
The chapter does not provide a comprehensive overview of all naming pro-
cesses in Russian, but rather focuses – also from the perspective of research his-
tory – on those types of nominal multi-word expressions and compounds (as well
as one derivational type) that stand in a mutual relation of cooperation and/or
competition. Particular attention will be paid to stylistic and pragmatic aspects.
The chapter is organized as follows: Section 1 gives a brief overview of the
main findings of previous studies on complex lexical units in Russian and other
Slavic languages. Section 2 presents compound and MWE patterns in Russian as
well as their interrelation as discussed in Grammatika-80. Sections 3 and 4 dis-
cuss the co-existence and interaction between MWEs and various morphological
patterns. The chapter ends with a conclusion in Section 5.
Open Access. © 2019 Ohnheiser, published by De Gruyter. This work is licensed under the
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
https://doi.org/10.1515/9783110632446-009
Isačenko (1958), for instance, paid special attention to the formal and semantic
condensation of complex naming units, stating that “complex designations
consisting of several lexical units have a clear tendency towards univerbation,
i. e. to the compression of the semantic content into one word” (ibid.: 340;
translated from Russian). This phenomenon manifests itself in different nam-
ing procedures:
a) Certain types of compounding (e. g., Slovak svet-o-názor [world.lv o-view]1
ʻworld view’ < svetový názor [world.ra view] ‘id.’2,3)
b) Mergers (Czech pravdě-podobný [truth.dat-similar] ‘probable’)
c) Ellipsis
1) of the head (Russian prjamaja ʻstraight line’ < prjamaja linija ʻid.’)
2) of the modifier (Russian plastinka ʻrecord’ < grammofonnaja plastinka
‘(gramophone) record’)
d) Affixal derivation (Russian setčat-k-a ʻretina’ < setčataja oboločka [net.a
membrane] ʻid.’)
e) Binominals (appositional compounds), particularly in Russian, e. g.,
ženščina-vrač [woman-doctor] ʻfemale doctor’
f) Different types of compounds with a clipped modifier (“stump compounds”
in the terminology of Comrie/Stone 1978 and Comrie/Stone/Polinsky 1996)
(Russian zarplata ʻsalary’ < zarabotnaja plata [for.work.ra payment] ʻid.’), but
also initialisms and acronyms (Russian IMLI < Institut mirovoj literatury
[institute world.ra.gen literature.gen] ʻInstitute of World Literature’ [of the
Russian Academy of Sciences]). According to Isačenko, the dominance of this
latter type in Russian is not accidental, as it provides an important option for
condensing MWEs with modifiers in the genitive case.
g) Formations of the type Russian Glavryba [glav- clipped stem of the adjective
glavnyj ʻmain, principal’ + ryba ʻfish’] < Glavnoe upravlenie rybnoj promyšlennosti –
the name of the Soviet central administration of the fishing industry.
As has been pointed out by one of the reviewers, from a semantic point of
view, Glavryba reflects a metonymic shift, because the modifier does not
directly modify the noun, but a concept connected to the noun (“central
1.2 MWEs, compounds, and derivations from an
onomasiological point of view
The relationship between MWEs, compounds, and derivations was dealt with in
Czech linguistics in the description of word formation as part of naming proce-
dures (Dokulil 1962). Thus for example Czech MWEs, compounds, and suffixed
compounds (1a–e) are contrasted with suffixed derivatives (1a’–e’) (with the same
meaning) (ibid.: 31):
(1a’) krajin-ář
(1b’) housl-ista
(1c’) prvň-ák
(1d’) kov-ák
(1e’) dřev-ař
Dokulil (ibid.) uses examples from terminology and technical language to show
that the formation of multi-word designations is a very common naming proce-
dure, as in (2):
4 This formation type can also be found in more recent designations, e. g., Glavlinza [main lens],
a leader brand for contact lenses. (www.glavlinza.ru, last access: 1.3.2017).
(3) čajová růže5 [tea.ra rose] ‘tea rose’ > čajovka [tea.ra-stem-suff] ‘id.’6
The word stař-ik ‘old man’ should, however, be regarded as a deadjectival suf-
fixed formation < starý ‘old’ and not as univerbation of starý člověk ‘old man’. A
significant extension of the concept of univerbation in Slavic studies has been
proposed in a new monograph on Slovak (Ološtiak (ed.) 2015). In this study, the
criterion of stability of the underlying MWEs is maintained. The results, however,
are not restricted to suffix formations. Some examples are provided by Ološtiak
(ed.) (ibid.: 308 ff.):
(4a) MWEs and “traditional” suffixal univerbations with truncation of the stem
and ellipsis of the head, e. g., Slovak izolač-n-á páska ‘insulating tape’ >
izolač-k-a ‘id.’
(4b) Combination of compounding and univerbation (“kompozičná univerbizácia”),
e. g., Slovak hráč prvej ligy [player first.gen division.gen] > prv-o-lig-
ist-a ‘first division player’. In Russian grammars, the analogous formation
pervoligist < pervaja liga is described as a suffixed compound.
(4c) Clipping of the modifier of an MWE and formation of a compound, e. g.
Slovak alkoholový test > alkotest ‘alcotest’ (however, this formation might
also be a direct loan from English)
(4d) Phenomena like the following are also included:
Slovak kompaktný fotoaparát > kompakt ‘compact camera’
(4e) The formation of acronyms from MWEs is also often regarded as univerba-
tion:
Slovak Mestská hromadná doprava > MHD; coll. suffixed MHD-čka ‘local
public transport’7
5 In botanical nomenclature N+RA růže čaj-ová [rose tea-ra] ‘tea rose’ with inverted word
order.
6 Another formally identical word can be based on the MWE čajový salám ‘tea sausage (spread)’.
7 The vernacular suffixation of acronyms is more productive in Slovak than in Russian.
2.1 Compounds
1. N+LV+N
1. NSTEM+LV+N11
2. N+N
Focussing on the absence of a linking vowel, Grammatika-80 forms a heterogeneous group
of formations, including loans such as džaz-orkestr ‘jazz orchestra’. On the activation of
the N+N type cf. Section 4.2.2.
9 Attributive compounds are not considered as a group in their own right, cf. however Benigni/
Masini (2009).
10 Some formations of this structure are not considered as compounds, but as appositive con-
structions and thus as syntactic phenomena, cf. car’-ubijca [tsar-murderer] ‘a tsar who was a
murderer’ (in contrast to the determinative compound careubijca [tsar.lv.murderer] ‘regicide’).
Cf. also inžener-fizik [engineer-physicist] ‘engineer and physicist’; sudno-cholodiľnik [ship-refrig-
erator] ‘refrigerator ship’; more recent: komp’juter-tabletka ʻtablet computer’.
The combination of two words is not considered appositive if they designate objects consisting of
a larger number of elements or groups of persons and semantically resemble a single word. In
Russian they are frequently used as a means of stylization, e. g., čaški-bljudca [cups-plates] ‘dish-
es’, ruki-nogi [hands-legs] ‘limbs’, devočki-maľčiki [girls-boys] ‘children’ (cf. also Wälchli 2015,
who uses the term “co-compound” for similar, but regular and stylistically neutral formations in
various languages).
11 The lack of compounds with (de-)verbal modifiers is often compensated by phrases of the
structure [A<V]+N (e. g., stiral’naja mašina ‘washing machine’). Compare, however, some types
of exocentric compounds, whose first constituent might be regarded as derived from an impera-
tive, as in sorvigolova [bite-off-head] ‘daredevil’.
e. Some formations with hyphenated spelling, also known from folk literature,
e. g., with čudo- ʻmiracle, wonder’ or gore- ʻsorrow, misery’:
čudo-bogatyr’ ʻ(epic) hero with magical strength’,
gore-rukovoditeľ ʻbad leader, manager’
2.2 Multi-word expressions
12 This type demonstrates once again the wide-spread distribution of compounds with a dever-
bal second component. Further, less productive formation types are not discussed here.
Phrasemes of the type (5a) and (5b) or their constituents can function as the basis
of compounds or derivations, cf., e. g., baklušničat’ as a synonymous expression to
bit’ bakluši (5a) or the adjective myl’noopernyj ‘similar to a soap opera’ < myl’naja
opera (5b). Numerous studies are devoted to the relations between phraseology
and word formation, including a dictionary of Russian dephrasemic lexis
(Alekseenko/Belousova/Litvinnikova (eds.) 2003).
Phrasemes with coordinate relations between the components are generally
disregarded in the literature – as in the case of free word combinations (cf. how-
ever Benigni (2012: 5 f.) on fixed coordinate phrases (binomi coordinativi) with a
varying degree of idiomaticity, e. g. mužčina i ženščina ‘man and woman’, sploš i
rjadom [pretty often and nearby] ‘very often’, ni ryba ni mjaso [neither fish nor
flesh] ‘neither fish nor fowl’).
Continuing Vinogradov’s classification of phrasemes, Šanskij ([1963]
1985) specifies a fourth group which proves to be of special importance for our
topic:
Just as the phrasemes of the other groups, they display the following characteris-
tics: multi-word structure, reproducibility, fixedness (and thus belonging to the
lexicon). They do not necessarily need to be idiomatic or metaphorical, however,
cf., e. g., medicinskaja sestra [medical sister] ‘nurse’, teplovaja ėnergija [heat.ra
energy] ‘heat energy, thermal energy’, vysšee učebnoe zavedenie [higher educa-
tional institution] ‘institution of higher education’, etc.
In recent Russian studies (cf. Droga 2010, for instance) such expressions are
described as “complex designations” (Russian sostavnye naimenovanija), par-
ticularly those of the structure:
13 Cf. Mokijenko/Walter (2008: 105); the authors do not, however, adopt the traditional typology
of phraseology.
(6a) A+N
paneľnyj dom [panel.ra house] ‘panel house, prefabricated building’
(6b) N+NGEN (or – more rarely – other oblique cases)
sredstva massovoj informacii [media mass.ra.gen information.gen] ‘mass
media’
(6c) N+Prep+N
kniga dlja čtenija [book for reading] ‘reader’
It should be mentioned here that such word combinations for a large part com-
pensate for non-existent compound patterns in Russian, including the adapta-
tion of compound loanwords (cf. Section 4). Complex designations of this kind
are regarded as “phrasal nouns” by Masini/Benigni (2012: 422): Just like com-
pounds they “generally cannot (a) be interrupted by lexical material, (b) undergo
paradigmatic commutability, (c) be internally modified”.
2.3 The interaction between different naming procedures in
Russian academic grammars
14 “Complex words that contain at least three morphemes, with neither the combination of the
first two nor of the last two existing as free words” (Neef 2015: 583); other studies use the term
“parasynthetic compound”.
walker’ < chodit’ po kanatu ʻto walk on a wire’ (in a certain way this also
refers to -vod, -mer formations)
[[A+N]-SUFF]
nouns: vodolyž-nik [[water.lv.ski]-suff] ʻwater skier’ < vodnye lyži
ʻwaterski’;
adjectives: daľnevostoč-nyj [[far.lv.east]-suff] ‘Far Eastern’ < Daľnij
vostok ‘Far East’; with alternation k > č; qualitative adjectives < free
word-combinations, e. g., dlinnonogij [[long.lv.leg]-suff] ʻlong-legged’ <
dlinnye nogi ʻlong legs’;
(7e) MWE > abbreviations
In Grammatika-80 formations of clipped components of MWEs are
regarded as abbreviations,15 e. g., prodmag < prodovoľstvennyj magazin
[food.ra store] ʻfood store’, etc. (cf. Section 3.2)
Synonymous word formations in the strict sense are listed systematically only for
derivations in Grammatika-80 (e. g., salat-nik/salat-nica ‘salad dish’, žad-ina/
žad-juga ‘greedy person’, meri-l’nyj/meri-tel’nyj ‘measuring’, akcentovat’/akcentovirat’
‘accent, emphasize’, kratk-o/v-kratc-e ‘briefly, in short’). Regarding adjectival
compounds, reference is made to synonymous second components expressing
similarity such as -vidnyj, -obraznyj (šarovidnyj, šaroobraznyj [globe/ball.lv.shaped]
‘globular, round’). However, Grammatika-80 does not take into account
parallel formations of MWEs and nominal compounds, cf. (8), as they are also
found in Polish (cf. Cetnarowska, this volume, example (25)):
The next section discusses phenomena like those in (7a) and (7e) above in greater
detail.
(9a) parusnoe sudno [sail.ra boat] ʻsailing boat’ > parus-nik ʻid.’
(9b) tovarnyj poezd [goods.ra train] ʻfreight train’ > tovarn-jak ʻid.’
According to Masini/Benigni (2012: 421), the MWEs the derivations are based on
are “phrasal lexemes which have a naming function”, e. g. kreditnaja karta [cred-
it-ra card] ‘credit card’. This means that the strategy at hand “consists in shorten-
ing a phrasal noun of the [ADJ N] type via ellipsis of the noun plus truncation of
the adjective by means of a set of suffixes” (ibid.: 431).17 They propose the follow-
ing formal representation of the Russian [ADJ N] lexical construction (ibid.: 444,
example (47)):
For phrasal nouns such as Polish telefon komórkowy [phone cellular]18 ‘mobile
phone’, Cetnarowska (this volume, example (31)) proposes the following
representation:
(11) $[N^0_i\ A^0_j]_k \leftrightarrow [\text{NAME for } SEM_i \text{ with some relation } R \text{ to entity } E \text{ of } SEM_j]_k$
16 Derivations that are not synonymous to MWEs are, for instance, neotlož-ka ‘ambulance’ <
neotložnaja pomošč’ [unpostponable aid] ‘emergency service’, jader-ščik ‘nuclear physicist’ <
jadernaja fizika ‘nuclear physics’, figur-ist ‘figure skater’ < figurnoe katanie [figure-ra skating]
ʻfigure skating’.
17 See above for the definition of the term “univerbation” in Slavic studies that does not explic-
itly mention the ellipsis of the head of the MWE.
18 The postposition of the RA typically applies to phrasal nouns in Polish, i. e. word combina-
tions with a naming function.
The wide semantic range of the underlying relational adjective results in the
occurrence of numerous homonyms, which are disambiguated in the context or
the respective communicative situation; ėlektronka, for instance, can refer to
1. ėlektronnaja kniga ‘e-book’ or 2. ėlektronnaja literatura ‘e-literature’, but also to
3. ėlektronnaja sigareta ‘electronic cigarette’.
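The homonymy described here can be pictured as an inverse lookup from one shortened form to several source phrases. The following is a minimal sketch under that assumption: the table contains only the three readings of ėlektronka named in the text, and the names `SOURCES`, `build_index` and `readings` are illustrative, not part of any cited analysis.

```python
from collections import defaultdict

# (phrasal noun, gloss, shortened form); only the readings named in the text
SOURCES = [
    ("ėlektronnaja kniga", "e-book", "ėlektronka"),
    ("ėlektronnaja literatura", "e-literature", "ėlektronka"),
    ("ėlektronnaja sigareta", "electronic cigarette", "ėlektronka"),
]


def build_index(sources):
    """Map each shortened form to all phrasal nouns it may stand for."""
    index = defaultdict(list)
    for phrase, gloss, short in sources:
        index[short].append((phrase, gloss))
    return dict(index)


INDEX = build_index(SOURCES)


def readings(short_form):
    """Return the candidate source phrases; an empty list if the form is unknown."""
    return INDEX.get(short_form, [])


if __name__ == "__main__":
    for phrase, gloss in readings("ėlektronka"):
        print(f"ėlektronka < {phrase} '{gloss}'")
```

A form with more than one entry in the index is a homonym in the sense used here; choosing among the entries is left to context, which this sketch does not model.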
Ološtiak (2015: 296) summarizes typical features, distinguishing Slovak
MWEs and the results of “univerbation”, as follows:
a) greater vs. lesser degree of formal explicitness and therefore
b) lack of ambiguity vs. greater degree of ambiguity,
c) lack of stylistic markedness vs. markedness,
d) official vs. unofficial character,
e) more pronounced association of MWEs with written language vs.
underrepresentation of univerbation in written language.
Similarly, Masini/Benigni (2012: 441) state that Russian shortened lexemes with
-ka, “despite having the same propositional meaning of corresponding full
forms, have different pragmatic features”. These features are implemented in
the formal representation of the -ka construction in (13) (cf. ibid.: 445, example
48). The features of the -ka lexemes are compared to those described for dimin-
utives with -ka which also display familiar/intimate characteristics. (Diminu-
tives may, however, also imply negative or ironic traits, cf. Nagórko 2014: 784 on
“quasi-diminutives”).
(13) FORM: $[[c]_{N'_z}\ \text{-ka}]_{N^0_k}$ where $SYN_z = [[a]_{Adj^0_x}\ [b]_{N^0_y}]_{N'_z}$ & $PHON_z$ = truncated ADJ
MEANING: < NAME for $SEM_z$ & [+ familiar/intimate] (& [+ jargon J]) >$_k$
Thus, although the full forms (the phrasal nouns) and the shortened lexemes in
-ka share the semantics, they differ with respect to their pragmatic and textual
properties, and thus the formal difference between the constructions comes
along with a difference in meaning. For this reason, they are not (fully) synonymous
and they meet the non-synonymy constraint on constructions as proposed
in Construction Grammar (cf. Masini/Benigni 2012: 446).
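The pairing of form and meaning in the -ka construction can be illustrated with a small data model. This is a rough sketch, not the authors' formalism: the class and function names are hypothetical, the truncated adjective is supplied by hand (the text states no truncation rule), and the phrasal noun is assumed to carry no [+ familiar/intimate] feature.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PhrasalNoun:
    adjective: str
    noun: str
    sem: str  # shared denotation (SEM)

    def form(self) -> str:
        return f"{self.adjective} {self.noun}"


@dataclass(frozen=True)
class KaLexeme:
    truncated_adj: str            # PHON = truncated ADJ
    sem: str                      # same SEM as the phrasal noun
    familiar: bool = True         # [+ familiar/intimate]
    jargon: Optional[str] = None  # optional [+ jargon J]

    def form(self) -> str:
        return f"{self.truncated_adj}ka"


def shorten(phrase: PhrasalNoun, truncated_adj: str,
            jargon: Optional[str] = None) -> KaLexeme:
    """Build the -ka lexeme: the noun is dropped, SEM is inherited."""
    return KaLexeme(truncated_adj=truncated_adj, sem=phrase.sem, jargon=jargon)


def fully_synonymous(phrase: PhrasalNoun, lexeme: KaLexeme) -> bool:
    """Same SEM and no added pragmatic features (assumption of this sketch)."""
    return (phrase.sem == lexeme.sem
            and not lexeme.familiar
            and lexeme.jargon is None)


if __name__ == "__main__":
    full = PhrasalNoun("ėlektronnaja", "sigareta", "electronic cigarette")
    short = shorten(full, "ėlektron")
    print(full.form(), ">", short.form(), "| fully synonymous:",
          fully_synonymous(full, short))
```

Because the shortened lexeme always carries the familiarity feature in this model, the two forms share SEM but never count as fully synonymous, which mirrors the non-synonymy constraint mentioned in the text.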
With respect to analogous forms in Polish, Cetnarowska (this volume) states:
“The interaction between phrasal lexemes and derivatives (or compounds
proper), exemplified by univerbation, can be accounted for in Construction Mor-
phology by means of second order schemas.” The respective representation of
“shortened phrasal nouns” in Polish can be found in (14) (cf. Cetnarowska, this
volume, example (37)):
Relations between MWEs and one-word designations do not only exist in the area
of derivation but also with compounding. This includes formations which in Rus-
sian research are frequently described as složnosokraščennye slova (‘stump com-
pounds’) as they represent a combination of compounding and shortening. The
shortening process is not based on morphemes but on syllables, in contrast to
compounds with clipped, mostly “neoclassical” modifiers, cf. Section 2.1. Stump
compounds became productive at the end of the 19th and the beginning of the
20th century and are frequently associated with Sovietisms, cf.
(15):
Numerous formations are now historical, but lexical units
based on these models can still be observed. New formations show a tendency to
shorten the modifying components. Formations that contain stumps of both components
are often proper nouns, e. g. names of Internet domains:
Stump compounds consisting of two clipped elements are found, for instance, in
the semi-official names of ministries (17a). The stump min-
combined with the full form of the genitive is relatively rare (17b). The formation
of the stump obor (from oborony) obviously does not comply with the preferred
number of syllables (for the phonetic idiosyncrasies of the first component of
stump compounds cf. Billings 1998). Nevertheless, among the new formations of
the type min- + Ngen there is a combination with the non-euphonic stump obr
(although not in final position) (17c):
The genitive ending of the nominal modifier is also retained in some designations
of deputies, e. g.:
Formations with oblique case forms as second components cannot be inflected.
In adjectival derivations from formations of the type (17b)–(19), which are generally
informal, colloquial or ironically connoted, the case ending is clipped, e. g. minoboron-skij
gambit ‘the gambit of the Ministry of Defense’, zamdekan-skij post ‘position of the
vice-dean’, or zavkafedr-al’nyj kabinet ‘office of the head of the department’.
3.2.2 Adjectives (mostly relational adjectives) + nouns as underlying
MWEs/phrases
20 See, however, the personal designation upravdom < upravljajuščij domom ‘caretaker’ where
dom is the stump of the instrumental case domom. This formation can be inflected and is easier
to use in colloquial language than the formations mentioned above.
The model is also productive in the formation of neologisms (see below). Com-
pared to the corresponding MWEs/phrases, stump compounds may have the
additional advantage of serving as bases for the derivation of relational adjec-
tives, cf.:
Some stumps such as kom- ‘communist’ and soc- ‘socialist’ are mostly found in
historical expressions of the Soviet era. Others are still productive, also as part of
newly coined formations, such as gos- ‘state.ra’, e. g.:
In addition, certain stumps such as polit- < političeskij ‘political’ which are known
from the notorious designation politbjuro ‘politburo’, can also be found in more
recent forms, such as (24):
(25a) roddom (7 m.) < rodiľnyj dom (2 m.) [give birth/bear.a house]
‘maternity clinic’; no corresponding stump compound (*rodklinika) of the
less frequent and more prestigious naming rodil’naja klinika ‘id.’ is found,
however.
(25b) zapčasti (212 m.) < zapasnye časti (15 m.) ‘spare parts’
Numerous new formations contain the stump Ros- < rossijskij ‘related to Russia,
Russian governmental institutions, enterprises with state participation etc.’, e. g.
Rostelekom ‘Rostelecom’. Ros- is, however, predominantly found in proper names
that are based only on parts of multi-word names. In the following example Ros-
can be said to replace Federal’nyj ‘federal’:
The differences between stump compounds and suffix formations with -ka can be
summarized as follows: Stump compounds are generally used in the area of
politics, administration and business. The underlying phrasal nouns, however,
indicate a higher level of official status. In currently used stump compounds, a
higher level of transparency is obtained by not clipping the head. The clipped
modifiers are less transparent than the respective word stem (which is retained in
the deadjectival -ka formations), but they relate to a thematically more clearly
restricted range of designation. Frequency data for stump compounds in
Russian newspapers from the year 2014 can be found in Milan Albertin (2013/14:
22 Compounds such as Rostrud [Russ(ian) labor], Federal’naja služba po trudu i zanjatosti ‘Fed-
eral Labor and Employment Service’ are reminiscent of the old type Glavryba (cf. the examples
cited by Isačenko 1958 in Section 1.1).
Masini/Benigni (2012: 447) regard the formation of such shortened lexemes also
as a strategy of a highly inflectional language “to ‘morphologize’ lexical items
that are larger than a word”.
23 There are, however, older formations where the stump kom refers to kommunističeskij ‘com-
munist (A)’, komandir ‘commander’ and komitet ‘committee’, for instance.
4 On the relation between MWEs of the type
“relational adjective + noun” and compounds
4.1 RA+N combinations compensating a lack of nominal
compound types
The preceding section has discussed the tendency towards “morphologizing” word
combinations. However, Russian everyday speech obviously also contains
numerous relatively fixed designations of the type [RA<N]+N without shortened
variants in -ka. These MWEs contrast with N+N compounds in English and
German (leaving calques out of consideration), e. g.:
a) polevaja myš’ ʻfield mouse’ (but see suffixal polёvka ʻvole’), vodjanaja ptica
‘water bird’,
b) utrennjaja smena ʻmorning shift’ (but see suffixal utrennik ʻmorning
performance’), nočnoj polet ʻnight flight’,
c) jabločnyj pirog ‘apple pie’, rapsovoe maslo24 ʻrape oil’ (see also parallel forma-
tions of the type N+Prep+N, e. g. with the preposition s ʻwith’, iz ʻof, from’),
d) bannoe polotence ‘bath towel’, komp’juternye igry ‘computer games’
(see also parallel formations of the type N+Prep+N, e. g. with the preposition
dlja ʻfor’).
The reservations that have been expressed about the listing of possible meaning
relations between modifiers and non-deverbal heads of compounds (cf., e. g., Plag
2009: 150 with respect to English) may also hold for RA+N combinations.25 However,
certain typical relations clearly emerge, depending on the semantics
of the modifier and the head of the MWE, i. e. local and temporal relations (a, b),
reference to the source or origin of what is referred to by the head (c), or purpose (d).
Even if compounds can be formed, MWEs may be perceived as more canonical.
This becomes evident from the persistence of RA+N combinations alongside
older compound calques as well as from the different ways of adapting new
English N+N compounds and compound patterns.
4.2 Compounds
When dealing with Russian determinative compounds with NSTEM as modifier and
non-derived head, it becomes obvious that their number is restricted. Compounds
of the type NSTEM + [N<V] are much more productive. Although Grammatika-80
(Švedova 1980: 242) does not make such a distinction, examples such as
zvukorežisser [sound.lv.director] ‘sound producer’, pticefabrika ‘poultry plant’,
chlebozavod [bread.lv.plant] ‘bakery plant’ (next to RA+N chlebnyj zavod), gazoballon
‘gas bottle’ (more frequently RA+N gazovyj ballon), kino-teatr [cinema-theatre]
‘cinema’ can be assigned to the first group, expressing primarily purpose relations.
The second group includes compounds like sen-o-uborka ‘hay harvest’,
dač-e-vladelec ‘dacha-owner’, ovošč-e-chranilišče ‘vegetable store’, reflecting the
argument structure of the verb that underlies the head.
These differences also become apparent in the form of Russian equivalents of
English compounds. Russian equivalents of English formations with deverbal
heads are more frequently compounds of the form NSTEM+LV+N (or N+NGEN) and
less frequently RA+N patterns. Russian equivalents of other English compounds
are, however, for the most part of the type RA+N, cf. Table 2:
[Table 2; columns: English, Russian; table body not preserved]
Besides, we also have to consider that numerous English MWEs and compounds
correspond to regular suffixal expressions in Russian, as in the case of a) denominal
personal nouns: parket-čik ‘parquet-layer’ < parket ‘parquet’, ryb-ak ‘fisherman’
< ryba ‘fish’, splet-nik ‘scandalmonger’ < spletni ‘rumors’, šachmat-ist ‘chess
player’ < šachmaty ‘chess’, and b) place nouns: vinograd-nik ‘vineyard’ < vinograd
‘grapes; vine’, cvet-nik ‘flower garden’ < cvet(y) ‘flower(s)’, spaľnja ‘bedroom’
< spať ‘sleep’, etc.
Numerous older borrowed N+N compounds (without linking vowel) as a rule have
RA+N equivalents, which sometimes are more frequent than the compound, cf.:
(28) dizeľ-motor (15,000) ‘diesel engine’ vs. dizeľnyj motor (13 m.) ‘id.’; note,
however, the use of the compound in names of business and the formation
of a new common noun according to the structure N+N: dizeľ-servis
“Dizeľ-Motor” ʻdiesel-service “Diesel-Engine”’ (cf. Section 4.2.3), vakuum-
kamera (45,000) ‘vacuum chamber’ vs. vakuumnaja kamera (25 m.) ‘id.’
(29) demping ceny ‘dumping prices’ vs. dempingovye ceny [dumping.ra prices]
the reasons for the preference of RA+N are to be found in the enhanced syntactic
availability or, more precisely, transparency. In oblique cases, the borrowed compound
is considerably less frequent than the phrase, cf. Russian dative pl. po
demping cenam ʻgoods at dumping prices’ (1,240) vs. po dempingovym [RA] cenam
ʻid.’ (181,000), prepositional case pl. o demping cenach (not attested) ʻabout dumping
prices’ vs. o dempingovych cenach ʻid.’ (900). The genitive plural demping cen
is obviously avoided entirely due to its homonymy with N+Ngen [dumping prices.
gen] ‘price dumping’.
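The size of the gap can be made explicit by computing the share of the N+N variant among all hits for each case form. The sketch below uses only the hit counts quoted above; treating “not attested” as zero hits is an assumption of this example, and the function name `share` is illustrative.

```python
def share(compound_hits, phrase_hits):
    """Proportion of N+N hits among all hits; None if there are no hits at all."""
    total = compound_hits + phrase_hits
    if total == 0:
        return None
    return compound_hits / total


# (N+N hits, RA+N hits) as quoted in the text; "not attested" is counted as 0
COUNTS = {
    "dative pl. (po ... cenam)": (1240, 181000),
    "prepositional pl. (o ... cenach)": (0, 900),
}

if __name__ == "__main__":
    for case_form, (n_n, ra_n) in COUNTS.items():
        print(f"{case_form}: N+N share = {share(n_n, ra_n):.2%}")
```

On these figures the N+N variant accounts for well under one percent of the dative plural hits.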
In addition to parallel formations of the patterns N+N (marketing direktor
‘marketing director’) and RA+N (marketingovyj direktor), alternative patterns of
the form N+Ngen (direktor marketinga) and N+Prep+N (direktor po marketingu
‘director of marketing’) occur frequently, in particular with respect to professional
titles and functional descriptions.
There are, however, numerous new N+N compounds, including compounds
with abbreviated modifiers, that do not or only occasionally have RA+N
“competitors”:27
27 The preference for another compound pattern over MWEs with RA has been evident for some
time in the formation of compounds with neo-classical modifiers, or, more generally, interna-
tionalisms, e. g., tele- (< televizionnyj) ‘TV-, television-’, cf. telezriteľ (nom.sg 50 m.) ‘TV-viewer’
vs. televizionnyj zriteľ (nom.sg 1,000).
28 Cf. also the direct, only grammatically adapted borrowing (here: nom.pl) lajting & šejding
supervajzery ‘lighting and shading supervisors’.
with some other Russian constructions” (in accordance with the No Synonymy
Principle as postulated in Construction Grammar). Whereas a possible stump
compound like gorzal from gorodskoj zal [city.ra hall] ʻcivic hall’ would have the
connotation of a “Soviet holdover”, the new N+N compound Krokus Siti Choll
ʻCrocus City Hall’ (opened in 2009 near Moscow) “has a cosmopolitan, western
association” (p. 81). In the case of other patterns that were already used in the
Soviet era such as Ntoponym+N (e. g. Tulaugoľ ʻTulacoal’, name of a coal trust in the
district of Tula), the pattern is retained but newly filled, e. g. Tulabar. Here, the
“difference in connotations can be plausibly attributed to the interaction between
the structure of the expression and the individual words that enter the structure,
rather than to the structure per se” (Kapatsinski/Vakareliyska 2013: 81). By means
of the new filling the pattern itself gains “a new prestige”.
As has been shown by some of the examples above, the idea that N+N compounds
(without linking vowel) are on the increase is also suggested by their frequent
occurrence in proper nouns, such as company names (e. g. Ivent-Ėkspert ‘event
expert’ as the name of an agency for marketing solutions). Such names often
adopt English patterns, which are also used for common nouns in English every-
day speech. This is however not the case in Russian:
29 It is striking that in many of these newly coined formations the second component is
capitalized.
Vodopolo (in standard language RA+N vodnoe polo) ‘water polo’ and vodolyži (in
standard language vodnye lyži) ‘water ski’ are also found as common nouns in
Internet texts, however with a linking vowel (!), i. e. not *Voda sport. (In standard
language the stem vod- ‘water’ is found only in compounds with a deverbal head,
e. g. vod-o-snabženie ‘water supply’.) It remains to be seen whether – under the
influence of certain text types – the pattern Nstem+LV+N will also occur with those
implicit meaning relations that have only been used in RA+N combinations so far
(cf. Section 4.1).
5 Conclusion
After a short overview of contributions of Slavic studies on the topic of the present
volume this chapter explored some of the relations between non-idiomatic deter-
minative MWEs/phrasal nouns and one-word designations in Russian, viz.:
a) MWEs and a (specific Slavic) type of condensed one-word designations,
b) MWEs/phrasal nouns and stump compounds,
c) MWEs/phrasal nouns and nominal compounds.
30 There we also find vodopolo (in standard language RA+N vodnoe polo) ‘water polo’, vodolyži
(in standard language vodnye lyži) ‘water ski’.
31 https://kachevan.livejournal.com/tag/%D1%81%D0%BF%D0%BE%D1%80%D1%82
(last access: 30.4.2018).
References
Alekseenko, Michail A./Belousova, Taťjana P./Litvinnikova, Oľga I. (eds.) (2003): Slovar’
otfrazeologičeskoj leksiki sovremennogo russkogo jazyka. Moskva: Azbukovnik.
Arcodia, Giorgio Francesco/Montermini, Fabio (2013): Are reduced compounds compounds?
Morphological and prosodic properties of reduced compounds in Russian and Mandarin
Chinese. In: Renner, Vincent/Maniez, François/Arnaud, Pierre (eds.): Cross-disciplinary
perspectives on lexical blending. Berlin/Boston: De Gruyter. 93–114.
Belošapkova, Vera A. (ed.) (1989): Sovremennyj russkij jazyk. 2nd ed. Moskva: Vysš. Škola.
Benigni, Valentina (2012): I binomi coordinativi in russo: un’analisi costruzionista. In:
mediAzioni 13. Internet: www.mediazioni.sitlec.unibo.it/images/stories/PDF_folder/
document-pdf/slavistica2012/01_benigni.pdf (last access: 1.9.2017).
Benigni, Valentina/Masini, Francesca (2009): Compounds in Russian. Lingue e Linguaggio
VIII, 2. 171–193.
Benigni, Valentina/Masini, Francesca (2010): Nomi sintagmatici in russo. In: Studi Slavistici
VII. 145–172.
Billings, Loren A. (1998): Morphology and Syntax. Delimiting stump compounds in Russian. In:
Booij, Geert/Ralli, Angela/Scalise, Sergio (eds.): Proceedings of the First Mediterranean
Morphology Meeting. Patras: University of Patras. 99–110.
Booij, Geert E. (2010): Construction morphology. Oxford/New York: Oxford Univ. Press.
Comrie, Bernard/Stone, Gerald (1978): The Russian language since the revolution. Oxford:
Clarendon Press.
Comrie, Bernhard/Stone, Gerald/Polinsky, Maria (1996): The Russian language in the twentieth
century. 2nd ed. Oxford: Clarendon Press.
Dokulil, Miloš (1962): Tvoření slov v češtině. Vol. 1: Teorie odvozování slov. Praha: Nakl.
Československé Akad. Věd.
Droga, Marina A. (2010): Sostavnye naimenovanija v russkom jazyke. Belgorod: Belgorodskij
Gosudarstvennyj Universitet. Internet: http://cheloveknauka.com/v/332425/d#?page=1
(last access: 11.9.2018).
Isačenko, Aleksandr V. (1958): K voprosu o strukturnoj tipologii slovarnogo sostava slavjanskich
literaturnych jazykov. In: Slavia 27. 334–352.
Kapatsinski, Vsevolod/Vakareliyska, Cynthia M. (2013): [N[N]] compounds in Russian. A growing
family of constructions. In: Constructions and Frames 5, 1. 69–87.
Kuchař, Jaroslav (1963): Základní rysy struktur pojmenování (Basic features of naming
structures). In: Slovo a slovesnost 24, 2. 105–114. Internet: http://sas.ujc.cas.cz/archiv.
php?art=1230 (last access: 11.9.2018).
Martincová, Olga (2015): Multi-word expressions and univerbation in Slavic. In: Müller, Peter O.
et al. (eds.). 742–757.
Masini, Francesca/Benigni, Valentina (2012): Phrasal lexemes and shortening strategies in
Russian. The case for constructions. In: Morphology 22. 417–451.
Milan Albertin, Isabella (2013/14): Analisi linguistica dei composti troncati in russo e del loro
utilizzo nel linguaggio giornalistico. (MA thesis). Università Ca’Foscari Venezia. Internet:
http://dspace.unive.it/bitstream/handle/10579/6112/987901-1193458.pdf?sequence=2
(last access: 1.11.2017).
Mokijenko, Walerij/Walter, Harry (2008): Leksičeskie i frazeologičeskie neologizmy: obščee i
različnoe. In: Mokijenko, Walerij/Walter, Harry (eds.): Komparacja systemów i
1 I would like to thank the editor of the volume and the anonymous reviewers for their useful
comments on the previous version of this chapter.
Open Access. © 2019 Cetnarowska, published by De Gruyter. This work is licensed under the
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
https://doi.org/10.1515/9783110632446-010
280 Bożena Cetnarowska
Before presenting some examples of MWEs in Polish, we can add that instead
of the term “multi-word unit” (Pol. jednostka wielowyrazowa), Polish linguists
often use the term “phraseological unit” or “phraseme”2 (Pol. związek frazeologiczny,
frazem). According to the traditional classification3 proposed by Stanisław
Skorupka (e. g. Skorupka 1967), three types of phraseological units are distin-
guished on the basis of their formal structure: units which are nominal expres-
sions (Pol. wyrażenia), such as pies ogrodnika (dog.nom gardener.gen) ‘dog in the
manger’, verb-phrases (Pol. zwroty), e. g. gryźć ziemię (bite.inf earth.acc) ‘to bite
the dust’, and units which exhibit the structure of a sentence (Pol. frazy), e. g. Do
wesela się zagoi (until wedding.gen refl heal.fut.3sg) ‘It will heal in no time’.
Furthermore, depending on their degree of semantic non-compositionality and
syntactic fixedness, phraseological units are divided into three types: fixed
idiomatic phraseological units (Pol. związki stałe), collocable phraseological
units (Pol. związki łączliwe), and free syntactic combinations (Pol. związki luźne,
lit. loose phraseological units). Fixed phraseological units, such as biały kruk (lit.
white raven) ‘rare specimen’, resemble non-derived words in that their meaning
does not follow from the meaning of individual components. In the case of collo-
cable phraseological units, such as dobry humor ‘good mood’ and pobudzić do
działania (wake.inf to action.gen) ‘to incite, to invigorate’, their constituents
retain literal meaning but show a preference to occur together. Loose phraseological
units correspond to free syntactic strings, such as młoda kobieta ‘young
woman’ or zjeść jabłko ‘to eat (an/the) apple’.
Cross-linguistic typologies of phraseological units are discussed by, among
others, Granger/Paquot (2008), Fellbaum (2011) and Hüning/Schlücker (2015:
45). I will follow the last of these classifications in a very brief presentation of types of
multi-word expressions in Polish below.
Proverbs in Polish can be exemplified by such sentences as Ręka rękę myje
(hand.nom hand.acc wash.pres.3sg) ‘You scratch my back and I’ll scratch yours’.
Commonplaces can be illustrated by truisms and tautologies based on everyday
experience, e. g. Żyje się raz ‘You only live once’. Quotations come from popular
literary works, songs and films, e. g. Kobieto, puchu marny (woman.voc fluff.voc
feeble.voc) ‘Woman, you wretched fluff’.
2 As is stated in the entry for “idiom” in Polański (ed.) (1999: 244), the term “phraseme” (Pol.
frazem) in the narrow sense is employed to refer to multi-word expressions in which at least one
item shows a literal meaning, e. g. ślepa uliczka ‘blind alley’, in contrast to idiomatic expressions
whose meaning shows no relatedness to the meaning of particular constituents, e. g. drzeć koty
(tear.inf cat.acc.pl) ‘to quarrel’.
3 For discussion of other classifications of phraseological units used in the Polish phraseologi-
cal literature, cf. Lewicki (1976: 9–23), Żmigrodzki (2009: 100) and Szerszunowicz (2012).
Compounds and multi-word expressions in Polish 281
4 Solid compounds, such as wniebowzięcie ‘assumption (of Virgin Mary)’, can also be interpret-
ed as frozen forms (cf. Section 2).
5 As pointed out to me by an anonymous reviewer, Buttler (1976) observes the expansion of
analytic constructions in Polish. She (ibid.: 70) mentions the occurrence of verbo-nominal
constructions, such as ulec zepsuciu (lit. undergo deterioration) ‘deteriorate, go bad’, and noun-
adjective combinations such as akcja szkoleniowa (lit. action training.ra), which replace
synonymous verbs or nouns, i. e. zepsuć się ‘to deteriorate, go bad’, and szkolenie ‘training
course’.
2 Types of compounds proper and solid compounds in Polish
Polish composites are usually divided into three types (Grzegorczykowa/Puzynina
1984; Szymanek 2010; Nagórko 2016): compounds proper (which meet the criteria
of morphological compounds, as shown in Section 4), solid compounds (Pol.
zrosty), and juxtapositions (Pol. zestawienia).
Solid compounds originate from the coalescence (i. e. merging) of syntactic
phrases (Długosz-Kurczabowa/Dubisz 1999: 60; Szymanek 2010: 224). They are
written as one orthographic word, e. g. Wielkanoc ‘Easter’, which comes from
Wielka Noc (lit. great night), czcigodny ‘respectful’, from czci godny (lit. respect-de-
serving), and zmartwychwstały ‘resurrected’, originating from the phrase z martwych
wstały (lit. from dead arisen). According to Grzegorczykowa/Puzynina
(1984: 396), solid compounds characteristically lack interfixes6 or suffixes but
they retain (compound-internal) inflectional elements.7
Compounds proper consist of two stems which are characteristically linked
with a vocalic interfix (abbreviated here as LV, i. e. linking vowel), e. g. drobn-o-
ustrój (small+lv+organism)8 ‘microorganism, microbe’ and słodk-o-gorzk-i (lit.
sweet+lv+bitter+nom.sg) ‘bittersweet’. In the case of compounds consisting of
a verb stem followed by a nominal stem, the interfix is the vowel -i-/-y-, as in gol-
i-brod-a (shave+lv+beard+nom.sg) ‘barber’, and mocz-y-mord-a (soak+lv+trap+
nom.sg) ‘sponge, drunkard’. When the left-hand constituent is the numeral
6 Consequently, Jadacka (2005: 121) regards other composites which lack a vocalic interfix as
solid compounds, even if they do not originate from the “freezing” of syntactic phrases, e. g.
seksmasaż ‘sex massage’, biznespartner ‘business partner’.
7 Cf. Section 4 for more discussion of inflectional endings in solid compounds.
8 The compound nouns in question are normally written without hyphens. I use hyphens here
to show the internal structure of the composites under discussion.
dw(u)- ‘two’, the interfix appears as the vowel -u-, e. g. dw-u-znak (two+lv+sign)
‘digraph’. Some types of compounds proper, e. g. those with the numeral trój-
‘three’, or the element pół- ‘half’ contain no linking vowel, e. g. trójskok (three+
jump) ‘triple jump’, północ (half+night) ‘midnight, north’.
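The distribution of linking vowels described above can be restated as a small lookup. The following Python sketch is only an illustration: the category labels, the function names and the decision to pass the verb-stem vowel in by hand are assumptions made for this example, not part of the analyses cited in the text.

```python
# Illustrative sketch only. The labels "default", "numeral_dw", "none" and
# "verb" are hypothetical names chosen for this example.
LINKERS = {
    "default": "o",     # e.g. drobn-o-ustrój, słodk-o-gorzk-i
    "numeral_dw": "u",  # e.g. dw-u-znak
    "none": "",         # e.g. trójskok, północ (no linking vowel)
}


def build_compound(left, right, left_type="default", verb_linker=None):
    """Return the segmented (hyphenated) form of a compound proper."""
    if left_type == "verb":
        # The text gives -i-/-y- for verb stems. What conditions the choice
        # is not modelled here, so the vowel must be supplied explicitly.
        if verb_linker not in ("i", "y"):
            raise ValueError("verb stems need verb_linker 'i' or 'y'")
        linker = verb_linker
    else:
        linker = LINKERS[left_type]
    parts = [left, linker, right] if linker else [left, right]
    return "-".join(parts)


def orthographic(segmented):
    """Drop the hyphens, which only display internal structure."""
    return segmented.replace("-", "")


print(build_compound("drobn", "ustrój"))
print(build_compound("gol", "brod-a", "verb", "i"))
print(build_compound("dw", "znak", "numeral_dw"))
print(orthographic(build_compound("trój", "skok", "none")))
```

The hyphenated output mirrors the segmentation used in the examples of this chapter, while `orthographic` gives the spelling without hyphens.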
Compounds such as drobnoustrój ‘microorganism’ and północ ‘midnight,
north’ can be compared to primary (root) compounds in English, in which two
stems are combined without any intervention of derivational suffixes. The only
formative that functions as the marker of composition is the vocalic interfix (if
present).
On the other hand, in the case of compound nouns such as król-o-bój-stw-o
(king+lv+kill+suff+nom.sg) ‘regicide’, and krwi-o-daw-c-a (blood+lv+give+
suff+nom.sg) ‘blood donor’ both the linking vowel and the final derivational
suffix act as co-formatives. Such Polish compounds, referred to as “interfix-
al-suffixal formations”, are analogous to synthetic compounds in English, such
as proof-reading or truck-driver (as observed by Szymanek 2010: 221). The right-
hand verb stem with the nominalising suffix can either form an independently
occurring word, e. g. dawca ‘giver’, or be unattested as a free form, e. g. *bójstwo
‘killing’.
There is yet another (formal) type of compounds proper, namely “interfix-
al-paradigmatic formations” (Grzegorczykowa/Puzynina 1984: 398; Szymanek
2010: 222), in which two elements act as co-formatives (signalling the operation
of compounding): the linking vowel and the so-called paradigmatic formative
(i. e. a change of the inflectional paradigm). The right-hand stems of the interfixal-
paradigmatic compounds paliw-o-mierz (fuel+lv+measure+ø)9 ‘fuel indicator’
and dług-o-pis (long+lv+write+ø) ‘ballpen’ are nominalised verb roots, which
undergo conversion (i. e. paradigmatic derivation) into nouns. The resulting nom-
inalised elements -mierz and -pis do not occur as nouns in isolation. Another type
of interfixal-paradigmatic formations is exemplified by the compound noun żmij-o-
głów (adder+lv+head+ø) ‘snakehead fish’, in which the right-hand stem does not
show a category change but undergoes a shift of the paradigm (from feminine
declension, as in głow-a (head+nom.sg), to masculine declension).
If Polish compounds proper are divided into structural types (according to
the cross-linguistic classification proposed by Scalise/Bisetto 2009), the com-
pounds in (1) are recognised as subordinate compounds, in which one constitu-
ent is subordinated semantically and syntactically to the other so that a comple-
ment-head relation can be established between them. The left-hand constituent
9 The element ø represents here a paradigmatic formative (i. e. a zero morpheme), as in Szyma-
nek (2010: 222) and Kolbusz-Buda (2014: 121).
in (1a–c) can be regarded as the object of the action of picking or indicating, and
the result of the action of writing. In (1d) the left-hand constituent, i. e. the verb
stem wyrw-, is syntactically superordinate to the following nominal stem dąb.
The compound nouns in (1a) and (1b) are endocentric since they are hyponyms
of their heads, e. g. bajkopisarz ‘fabulist, writer of fables’ is a kind of a writer.
The compounds in (1c) and (1d) are regarded as exocentric by Grzegorczykowa/
Puzynina (1984) and Szymanek (2010).10
(3a) barman-o-kelner (bartender+lv+waiter) ‘waiter and bartender’
(3b) gad-o-ptak (reptile+lv+bird) ‘archaeopteryx’
(3c) spódnic-o-spodni-e (skirt+lv+trouser+nom.pl) ‘skort, culottes’
Compound adjectives can be similarly divided into subordinate (e. g. (4a)), attrib-
utive (4b) and coordinate ones (4c).
Compound verbs are rare in Polish. Nagórko (2016: 2838) suggests that many of
them result from loan translation, e. g. lekceważyć ‘to disrespect, to neglect’ (from
German gering schätzen12).
Długosz-Kurczabowa/Dubisz (1999: 50 f.) point out that many compound
nouns proper, solid compounds, and compound adjectives in Polish can be
treated as calques. Some religious terms are translations of Latin compounds,
e. g. wszech-mogąc-y (all+able+nom.sg) ‘almighty’ (from Latin omnipotens).
Polish compounds which are imitations of German compound lexemes include,
among others, list-o-nosz (letter+lv+carry+ø) ‘postman’ (from Briefträger) and
ogni-o-trwał-y (fire+lv+durable+nom.sg) ‘fireproof’ (from feuerfest). The influ-
ence of Russian, on the other hand, can be observed in the case of such com-
pounds as brak-o-rób-stw-o (dud+lv+do+suff+nom.sg) ‘wastage’ (from brakodielstvo).
Nevertheless, Długosz-Kurczabowa/Dubisz (ibid.: 75) argue for the
recognition of compound formation in Polish as a native pattern (which can be
traced back to Proto-Slavonic forms or the Old Polish period).
(5) N+N.gen
(5a) dom studenta (house.nom student.gen.sg) ‘dormitory, student hall of residence’
(5b) mąż stanu (man.nom state.gen.sg) ‘statesman’
12 As is pointed out to me by the editor of the volume, the expression gering schätzen is not
normally regarded as a compound in German.
(6) N+PP
(6a) chustka do nosa (kerchief.dim.nom for nose.gen) ‘handkerchief’
(6b) dziurka od klucza (hole.dim.nom from key.gen) ‘keyhole’
(7) N+A
(7a) panna młoda (maid young) ‘bride’
(7b) drukarka laserowa (printer laser.adj) ‘laser printer’
(7c) krem odżywczy (cream nourishing) ‘nourishing cream’
(8) A+N
(8a) biały kruk (white raven) ‘rare specimen’
(8b) nocna zmiana (night.adj shift) ‘night shift’
(8c) wieczne pióro (eternal pen) ‘fountain pen’
(9) N+N
(9a) poeta-tłumacz (poet translator) ‘poet-translator’
(9b) kobieta-guma (woman rubber) ‘female contortionist’
(9c) wywiad-rzeka (interview river) ‘extended interview’
13 On the basis of translation equivalence between Germanic N+N compounds and Polish N+RA
(or RA+N) units, ten Hacken (2013) argues that multi-word expressions in Polish consisting of
nouns and relational adjectives should be treated as compounds.
The relationship between the syntactic type and the structural classification of
juxtapositions is not complete, though. N+N combinations (whose constituents
show agreement) and N+N.gen phrasal nouns in (13) require attributive
interpretation.
Damborský (1966) remarks that some N+N juxtapositions may have entered the
Polish language as calques of French formations (e. g. zegarek-bransoletka
‘watch-bracelet’) or as calques of Russian complex lexemes (e. g. miasto-bohater
‘hero city’). Nevertheless, he concludes that N+N juxtapositions represent mostly
a native pattern of composite formation (as is also observed by Długosz-Kurcza-
bowa/Dubisz 1999).
In the next section criteria which can be employed in distinguishing between
compounds proper and juxtapositions will be presented.
4 Differences between compounds proper, solid compounds and juxtapositions
Polish compounds proper exhibit features expected of morphological compounds
cross-linguistically (cf. Lieber/Štekauer 2009; Booij 2010). They are written as one
orthographic word, though some compounds are hyphenated, e. g. słodko-kwaśny
‘sweet and sour’.14
14 The hyphen is employed in the case of coordinate compound adjectives (e. g. przemysłowo-rolniczy
‘industrial and agricultural’) while attributive and subordinate compound adjectives
(14a) mebl-o-ścian-k-a
furniture+lv+wall+dim+nom.sg
‘wall unit’
(14b) staw-o-nog-a
joint+lv+foot+gen.sg
‘arthropod’ (gen.sg)
are written as single orthographic words (e. g. roponośny ‘oil-bearing’, ciemnozielony ‘dark
green’).
15 In the case of polysyllabic compounds, apart from the main stress on the penultimate sylla-
ble, there may occur secondary stresses on the first constituent, e. g. prAlkosuszArka ‘washer
dryer’, ciEmnoniebiEski ‘dark blue’.
16 The compound noun stawonóg ‘arthropod’ is masculine, while its right-hand constituent
noga ‘foot’ is feminine (cf. nog-i ‘foot+gen.sg’).
17 There is no vocalic element linking the constituents dusz (soul.gen.pl) and pasterz (shep-
herd.nom.sg) since the marker of genitive plural in the first constituent is a morphological zero.
5 Syntactic fixedness
The Lexical Integrity Principle, postulated by Anderson (1992), does not allow
rules of syntax to manipulate or have access to parts of words. Booij (2010: 177)
points out that this principle can be split into two subparts (i. e. two subcon-
straints).
One subconstraint prohibits the operation of syntactic rules of case assign-
ment and agreement on constituents of morphologically complex words. Inflec-
tional endings do not occur inside affixal derivatives or inside compounds proper,
cf. czarn-o-biał-ego (black+lv+white+gen.sg) ‘black-and-white.gen.sg’ and not
*czarn-ego-biał-ego (black+gen.sg+white+gen.sg). This subconstraint is vio-
18 There occur also solid compounds which allow alternative word-forms, e. g. Wielk-a-noc
(great+nom.sg/lv+night) ‘Easter.nom.sg’, Wielk-a-noc-y (great+lv+night+gen.sg) or Wielki-ej-
noc-y (great+gen.sg+night+gen.sg) ‘Easter.gen.sg’.
19 According to current prescriptive recommendations, Polish coordinate compounds should be
hyphenated while attributive compounds should not.
lated in the case of juxtapositions and some solid compounds, as was illustrated
in the previous section.
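The first subconstraint can be checked mechanically against the morpheme glosses used in this chapter. The sketch below is a hypothetical helper, assuming that glosses separate morphemes with "+" and mark inflection with a case label such as gen.sg. It reports whether an inflectional ending occurs anywhere before the final position. The inventory of case labels is an assumption made for the example.

```python
# Hypothetical helper, not part of the cited analyses. It assumes glosses of
# the shape used in this chapter, e.g. "black+lv+white+gen.sg".
CASE_LABELS = {"nom", "gen", "dat", "acc", "ins", "loc", "voc"}


def is_inflection(tag):
    """True if the gloss element starts with a case label (e.g. gen.sg)."""
    return tag.split(".")[0] in CASE_LABELS


def has_internal_inflection(gloss):
    """True if any non-final element of the gloss is an inflectional ending."""
    parts = gloss.split("+")
    return any(is_inflection(part) for part in parts[:-1])


# czarn-o-biał-ego: inflection only at the right edge
print(has_internal_inflection("black+lv+white+gen.sg"))
# *czarn-ego-biał-ego: inflection inside the word
print(has_internal_inflection("black+gen.sg+white+gen.sg"))
# Wielki-ej-noc-y: a solid compound with compound-internal inflection
print(has_internal_inflection("great+gen.sg+night+gen.sg"))
```

On this toy check the compound proper passes, whereas the ill-formed adjective and the solid compound form with internal genitive marking are flagged.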
The second subpart of the Lexical Integrity Principle predicts that words can
be neither split by intervening constituents nor reordered. This subconstraint is
met in the case of the majority of compounds proper and solid compounds in
Polish. The left-hand modifiers of the compound nouns dług-o-pis (long+
lv+write+ø) ‘ballpen’ and grzyb-o-bra-ni-e (mushroom+lv+take+suff+nom.sg)
‘mushroom picking’ cannot be shifted to the right-hand position, as is shown by
the ill-formedness of *pis-o-dług and *brani-o-grzyb. Moreover, those left-hand
(modifier) stems cannot be modified themselves, as indicated by the unaccepta-
bility of *bardzo-dług-o-pis (very+long+lv+write+ø) in the intended meaning
‘ballpen which can write for a long time’. Constituents of coordinate compounds
proper show some possibility of reordering, e. g. czerwono-biały ‘red and white’
and biało-czerwony ‘white and red’.20 However, one potential order of elements
tends to be conventionalised, hence ?suszark-o-pralk-a (dryer+lv+washer+nom.
sg) and ?robotnik-o-chłop (worker+lv+peasant) sound decidedly odd when com-
pared to the institutionalised forms pralk-o-suszark-a (washer+lv+dryer+nom.
sg) ‘washer and dryer’ and chłop-o-robotnik (peasant+lv+worker) ‘a peasant
farmer who also works in a factory’.
Juxtapositions resemble compounds proper in Polish in that their internal
constituents cannot be modified (cf. Cetnarowska/Trugman 2012; Cetnarowska
2018).21 If an adverbial modifier is inserted in front of the adjective in the N+A
juxtaposition foka szara (seal grey) ‘grey seal’, the resulting string stops function-
ing as a naming unit and can be interpreted as a free syntactic combination, i. e.
foka bardzo szara (seal very grey) ‘seal whose fur is very grey’. Similarly, the addi-
tion of the demonstrative tego (this.gen.sg) in front of the noun człowieka (man.
gen.sg) in the N+N.gen phrasal noun prawa człowieka (law.nom.pl man.gen.sg)
‘human rights’ results in the reanalysis of the juxtaposition as a freely composed
noun phrase, i. e. prawa tego człowieka (law.nom.pl this.gen.sg man.gen.sg)
‘this man’s rights’. Some instances of phrasal nouns that contain internal pre- or
post-modifiers (and complements) can be encountered, as shown in (15). It can be
argued, though, that these are cases of complex phrasal nouns which contain
20 Nagórko (2016: 2837) remarks that there is a difference in meaning between biało-czerwony
(white-red), which can be used to describe the flag of Poland, and czerwono-biały (red-white),
which describes the colours of the flag of Monaco.
21 Consequently, adjectives and nouns are regarded as non-projecting categories (A^0 and N^0) in
multi-word units in Polish by Cetnarowska (2018), as is suggested for MWEs in other languages
by Booij (2010).
22 The internal word order is fixed in the case of some types of coordinate and quasi-coordinate
juxtapositions, e. g. those that consist of a superordinate term followed by a hyponym, such as
lekarz ginekolog (physician+gynecologist) ‘gynecologist’ or Kinship+Property coordinate juxta-
positions, e. g. syn prawnik (son+lawyer) ‘lawyer son’.
three groups: idiomatic A+N combinations, N+A ‘tight units’ and A+N/N+A com-
binations in which the classifying adjective is regarded as ‘migrating’.
A+N juxtapositions which are regarded by Cetnarowska/Pysz/Trugman
(2011) as lexicalised idiomatic phrases, such as koński ogon (horse.ra tail) ‘pony-
tail’, lwia paszcza (lion.ra jaw) ‘snapdragon’, and boża krówka (god.ra cow.dim)
‘ladybird’, show syntactic fixedness. Their constituents cannot be shifted, since
the postposing of the adjective changes their meaning to non-idiomatic combina-
tions, as shown in (16).
The elements of N+A ‘tight units’ are not (normally) reversible, either. Post-head
classifying adjectives in tight units, such as kurier dyplomatyczny (courier diplo-
matic) ‘diplomatic courier’, pancernik olbrzymi (armadillo giant) ‘giant armadillo’
and foka szara (seal grey) ‘grey seal’, change their interpretation to those of qual-
ifying adjectives, as indicated in (17) and (18).
6 Competition between compounds and juxtapositions
The conventionalisation of a given concept by means of a compound or a phrasal
unit in Polish is to some extent arbitrary. For instance, while there exist the syn-
thetic compounds proper koni-o-krad (horse+lv+steal+ø) ‘horse thief’ and (used
rather rarely) kur-o-krad (hen+lv+steal+ø) ‘chicken thief’, N+N.gen phrasal lex-
emes are used to denote a person who steals cars or bicycles, i. e. złodziej samochodów
(thief.nom.sg car.gen.pl) ‘car thief’ and złodziej rowerów (thief.nom.sg
bicycle.gen.pl) ‘bicycle thief’.
Nevertheless, it is possible to come across synonymous compounds proper
and juxtapositions in Polish. Let us look at the competition between (and coexist-
ence of) subordinate synthetic compounds proper and N+N.gen combinations
(or N+A units).
There exist several institutionalised synthetic compounds which end in
the constituent -dawca ‘giver’, e. g. kredyt-o-daw-c-a ‘lender’, prac-o-daw-c-a
‘employer’, ustaw-o-daw-c-a ‘lawmaker, legislator’, spadk-o-daw-c-a ‘testator’.
Jadacka (2001: 96, 99) observes that compounds terminating in -dawca repre-
sent a fairly numerous group of neologisms in the Polish vocabulary at the end
of the twentieth century (i. e. after 1989).23
As shown in (21)–(22) below, the existence of synthetic compounds proper
terminating in -dawca, such as licencj-o-daw-c-a ‘licensor’, does not block the
formation (and use of) a synonymous N+N.gen juxtaposition, i. e. dawc-a licencj-i
‘licensor (lit. giver of licence)’.
(21) licencj-o-daw-c-a
licence+lv+give+suff+nom.sg
‘licensor’
(23a) krwi-o-daw-c-a
blood+lv+give+suff+nom.sg
‘blood donor’
(23b) daw-c-a krw-i
give+suff+nom.sg blood+gen.sg
‘blood donor’
23 Nevertheless, the pattern of synthetic compounds with the constituent -dawca ‘giver’ shows
many gaps. There are no attestations (in the National Corpus of Polish) of the potentially well-
formed compounds ?organodawca (organ+lv+giver) ‘organ donor’, ?szpikodawca (mar-
row+lv+giver) ‘(bone) marrow donor’ or ?sercodawca (heart+lv+giver) ‘heart donor’. However,
the anonymous reviewer points out that Google searches result in 17 hits for ?organodawca ‘or-
gan donor’ (including some metaphorical uses of the word) and 9 hits for ?szpikodawca ‘marrow
donor’.
The comparison of the occurrence of the (various inflectional forms of the) lex-
emes in (21)–(23) in the National Corpus of Polish (NKJP) shows that the synthetic
compound licencjodawca ‘licensor’ is more common in the corpus than the
phrasal noun dawca licencji (giver.nom.sg licence.gen.sg) ‘licensor’: it occurs
167 times, while the equivalent phrasal noun is attested 9 times. In the case of the
items in (23), both the synthetic compound krwiodawca ‘blood donor’ and the
N+N.gen phrasal noun dawca krwi ‘blood donor’ are fairly frequent.24
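The corpus counts reported here can be turned into proportions with a few lines of arithmetic. The Python sketch below is illustrative. The function name is an assumption, and the figures are the ones given in the text for licencjodawca and in the accompanying footnote for the nominative singular forms of krwiodawca and dawca krwi.

```python
def compound_share(compound_hits, phrasal_hits):
    """Proportion of all attestations that are realised as a compound."""
    total = compound_hits + phrasal_hits
    if total == 0:
        raise ValueError("no attestations")
    return compound_hits / total


# licencjodawca (167) vs. dawca licencji (9), all inflectional forms
licencjodawca = compound_share(167, 9)
# krwiodawca (345) vs. dawca krwi (57), nominative singular only
krwiodawca_nom_sg = compound_share(345, 57)

print(f"{licencjodawca:.3f}")
print(f"{krwiodawca_nom_sg:.3f}")
```

On these counts the compound accounts for roughly 95 per cent of the attestations of ‘licensor’ and roughly 86 per cent of the nominative singular attestations of ‘blood donor’. The proportions only summarise the figures quoted and say nothing about other inflectional forms.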
Jadacka (2001: 98) also points out the productivity of the pattern of interfix-
al-paradigmatic derivation of compounds, represented by such novel compounds
as diet-o-mierz (diet+lv+measure+ø) ‘dietometer’, where the right-hand constitu-
ent is the verb stem mierz- (as in mierzyć ‘measure.inf’) and the nominalizing
morpheme is the paradigmatic formative (i. e. the zero morpheme ø). There exist
doublets or even triplets consisting of synonymous compounds terminating in
-mierz or -metr and phrasal nouns consisting of the head miernik ‘meter, gauge’
followed by a noun in the genitive.
(24a) głośn-ości-o-mierz
loud+suff+lv+measure+ø
‘volume unit meter’
(24b) audio-metr
audio+meter
‘audiometer’
(24c) mier-nik głośn-ośc-i
measure+suff loud+suff+gen.sg
‘volume unit meter, volume indicator’
(25a) wilgotn-ości-o-mierz
wet+suff+lv+measure+ø
‘moisture meter’
(25b) higro-metr
hygro+meter
‘hygrometer’
(25c) mier-nik wilgotn-ośc-i
measure+suff wet+suff+gen.sg
‘hygrometer, moisture meter’
24 There is a difference in the occurrence of the nominative singular forms of both competing
lexemes: the compound occurs 345 times and the phrasal noun 57 times, mainly in the expres-
sion honorowy dawca krwi ‘honorary blood donor’.
The use of the N+N.gen pattern allows the speaker to achieve greater precision in
denoting the kind of instrument. The genitive attribute can in turn be modified by
another genitive, as is shown in (26)–(27).
(28a) chłop-robotnik
peasant+worker
‘peasant farmer who works in a factory’
(28b) chłop-o-robotnik
peasant+lv+worker
‘peasant farmer who works in a factory’
(28c) klub-kawiarni-a
club+café+nom.sg
‘café that hosts cultural events’
(28d) klub-o-kawiarni-a
club+lv+café+nom.sg
‘café that hosts cultural events’
In the case of the pairs of multifunctional coordinate phrasal nouns and com-
pounds proper given in (29), both formations coexist (and compete).
(29a) krem-żel
cream+gel
‘gel cream’
(29b) krem-o-żel
cream+lv+gel
‘gel cream’
(29c) barman-kelner
bartender+waiter
‘waiter-bartender’
(29d) barman-o-kelner
bartender+lv+waiter
‘waiter-bartender’
Certain types of coordinate composites allow for one pattern only, i. e. either the
creation of N+N juxtapositions or compounds proper. Multifunctional coordinate
composites representing (among others) the following semantic types26 cannot be
expressed by synthetic compounds:
26 The semantic typology is based on that postulated for English by Olsen (2001).
7 The treatment of phrasal nouns in Construction Morphology
As noted by Grzegorczykowa (1982: 59) and Długosz-Kurczabowa/Dubisz (1999)
and as mentioned in Section 2, in traditional accounts of Polish word-formation
(e. g. Klemensiewicz 1939) phrasal nouns were treated as a subtype of composites
(i. e. compounds in the broad sense of the term), namely as juxtapositions. In
more rigorous descriptive grammars of Polish (e. g. those written in the structur-
alist paradigm), juxtapositions are excluded from the domain of morphology.
Puzynina (1974) argues that multi-word expressions, such as maszyna do szycia
(machine for sewing) ‘sewing machine’ and szkoła podstawowa (school elemen-
tary) ‘primary school’, should fall within the domain of phraseological research,
and not morphological enquiry.27 In their chapter on compound nouns in Polish,
Grzegorczykowa/Puzynina (1984: 396) recognise only two types of compounds,
i. e. compounds proper and solid compounds. They do not devote any attention to
juxtapositions. Kallas (1980) treats coordinate multi-word units, such as kobieta
pilot ‘woman pilot’ and lalka-niemowlak (doll baby) ‘baby doll’, as free syntactic
combinations and analyses them in the same way as (regular) noun phrases in
apposition, such as mleko – cenny pokarm ‘milk – precious food’.
Nagórko (1997), in her brief but insightful account of Polish grammar, postu-
lates a strict division between syntax, phraseology and the lexicon. Consequently,
27 Grzegorczykowa (1982: 59) mentions the existence of juxtapositions, such as czarna jagoda
(black berry) ‘bilberry’ and maszyna do pisania (machine for typing) ‘typewriter’, yet she notes
that they do not constitute the subject matter of word-formation proper.
in her chapter on Polish syntax (Chapter V), she notes the occurrence of conven-
tionalised phraseological units but concludes that from the point of view of syn-
tax such strings of words are indivisible (Nagórko 1997: 189).28 Her conclusion
refers both to idiomatic multi-word units, such as kocie łby (cat.ra head.nom.pl)
‘cobblestones’ or pies ogrodnika (dog.nom.sg gardener.gen.sg) ‘dog in the man-
ger’, as well as semantically regular juxtapositions, e. g. kosz na śmieci (bin for
rubbish) ‘rubbish bin’ and gwiazda polarna (star polar) ‘pole star, Polaris’. In a
modular framework (such as the one assumed by Nagórko 1997) it is difficult to
draw a rigid and uncontroversial border between lexical multi-word units and
freely composed phrases. While such N+N combinations as człowiek instytucja
(man institution) ‘one-man-institution’ or kobieta szef (woman boss) ‘female
boss’ are regarded by Nagórko (1997: 190 f.) as syntactic units (consisting of a
head noun and a nominal attribute), other N+N juxtapositions, such as lekarz
pediatra (physician pediatrician) ‘pediatrician’ and szpital-pomnik (hospital
monument) ‘memorial hospital’, are recognised as lexical units.
Such a strict separation of modules of grammar, i. e. morphology, syntax and
the lexicon, is characteristic both of structuralist linguistics and of the generative
framework.29 Syntax and morphology do not interact, and the lexicon is treated
as a collection of irregularities (Bloomfield 1933; Di Sciullo/Williams 1987), i. e. a
list of items which carry unpredictable semantic information and/or exhibit other
idiosyncratic properties.
A markedly different view of the lexicon and the architecture of grammar is
postulated in Construction Grammar (Goldberg 2006), Parallel Architecture and
Construction Morphology (Masini 2009; Booij 2010; Masini/Benigni 2012; Booij/
Audring 2015; Booij/Masini 2015). The lexicon, referred to as the constructicon, is
viewed as a network of construction schemas of varying degrees of abstractness.
Schemas are instantiated by fully specified constructions, which are also stored
in the lexicon. Such constructions can take the form of syntactic strings, words or
units with an intermediate (i. e. both lexical and syntactic) status.
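The idea of a constructicon in which abstract schemas and their fully specified instances are stored side by side can be pictured as a simple data structure. The Python sketch below is a loose illustration under stated assumptions: the class names, the slot notation and the informal meaning paraphrases are hypothetical and are not taken from the Construction Morphology literature.

```python
from dataclasses import dataclass, field


@dataclass
class Construction:
    """A fully specified construction stored in the lexicon."""
    words: tuple
    meaning: str
    schema_name: str


@dataclass
class Schema:
    """An abstract form-meaning pairing with open slots."""
    name: str
    slots: tuple   # syntactic categories, e.g. ("N", "A")
    meaning: str   # informal paraphrase of the semantic side
    instances: list = field(default_factory=list)

    def instantiate(self, words, meaning):
        # An instance must fill every slot of the schema.
        if len(words) != len(self.slots):
            raise ValueError("number of words must match number of slots")
        construction = Construction(tuple(words), meaning, self.name)
        self.instances.append(construction)
        return construction


# The constructicon as a mapping from schema names to schemas.
constructicon = {
    "N+A": Schema("N+A", ("N", "A"), "NAME for N with some relation to A"),
    "A+N": Schema("A+N", ("A", "N"), "NAME for N with some relation to A"),
}

foka = constructicon["N+A"].instantiate(("foka", "szara"), "grey seal")
kruk = constructicon["A+N"].instantiate(("biały", "kruk"), "rare specimen")
print(foka.schema_name, len(constructicon["N+A"].instances))
```

Storing each instance together with the name of its schema keeps both levels in the same lexicon. A modular architecture would instead place the schema in the grammar and only the idiosyncratic items in a separate list.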
28 Phraseological units are treated as indivisible from the point of view of syntax as well as se-
mantics also by Grochowski (1982). Cf., however, Lewicki (1976) and Węgrzynek (1998) for some
discussion of the internal syntax of idioms in Polish.
29 N+A phrasal nouns are recognised as free syntactic combinations by, among others, Rut-
kowski/Progovac (2005), who are proponents of the Minimalist Program, and by Szymanek
(2010), who advocates the lexicalist approach. Willim (2001) regards N+A and N+N multi-word
units, such as ogród zoologiczny (garden zoological) ‘zoo’ and kobieta-anioł (lit. woman angel)
‘angel of a woman’ as syntactic constructs, basing her analysis on the discussion of Greek A+N
combinations by Ralli/Stavrou (1998). Syntactic constructs are treated as syntactic compounds
(i. e. phrasal lexemes) by Booij (2010).
(31) [N^0_i A^0_j]_k ↔ [NAME for SEM_i with some relation R to entity E of SEM_j]_k
Since some N+A strings contain classifying adjectives which are not denominal,
e. g. panda wielka (panda great) ‘giant panda’, the schema in (32) can account for
their structure.
A classifying adjective (be it relational or a non-derived one) can stand in the pre-
head position in a phrasal noun in Polish. Consequently, two more schemas are
necessary, to account for the structure of RA+N phrasal nouns, e. g. nocny dyżur
‘night shift’ (where the relational adjective nocny is derived from noc ‘night’) and
A+N units which contain a non-derived or deverbal adjective, e. g. głuchy telefon
(deaf phone) ‘Chinese whispers’, odżywczy krem na noc (nourishing cream for
night) ‘nourishing night cream’.
(33) [A^0_i N^0_j]_k ↔ [NAME for SEM_j with some relation R to entity E of SEM_i]_k
Another phrasal schema, given in (35) below, can be postulated for N+N.gen
phrasal nouns, both transparent semantically and idiomatic ones, e. g. prawa
człowieka (right.nom.pl man.gen.sg) ‘human rights’, and pies ogrodnika (dog.
nom.sg gardener.gen.sg) ‘dog in the manger’.
(35) [N^0_i N-GEN_j]_k ↔ [NAME for SEM_i with some relation R to SEM_j]_k
(36) [N^0_i N^0_j]_k ↔ [NAME for an entity which is both SEM_i and SEM_j]_k
(37) <[N^0_i A^0_j]_k ↔ [NAME for SEM_i with some relation R to entity E of SEM_j]_k>
≈ <[A -ka]_{Nz} ↔ [SEM_k [+familiar]]_z>
The second order schema given above states that deadjectival nouns terminating
in the suffix -ka can be motivated by (i. e. semantically related to) phrasal N+RA
lexemes.
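To make the format of such schemas concrete, the following Python sketch encodes a phrasal schema as a pairing of constituent slots with a meaning template and instantiates it with an N+RA example cited in this chapter. The class name `PhrasalSchema`, its fields and the method `instantiate` are illustrative assumptions and not part of Construction Morphology; the sketch mirrors only the shape of schema (31) and makes no attempt to model the relation R.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhrasalSchema:
    """Hypothetical encoding of a form-meaning pairing such as schema (31)."""
    slots: tuple   # categories of the constituents, e.g. ("N", "A")
    meaning: str   # meaning template; {0}, {1} stand for the slot glosses

    def instantiate(self, words, glosses):
        # A schema licenses a unit only if every slot is filled.
        if len(words) != len(self.slots) or len(glosses) != len(self.slots):
            raise ValueError("number of constituents does not match the schema")
        form = " ".join(words)
        return form, self.meaning.format(*glosses)


# Shape of schema (31): head noun followed by a relational adjective.
N_A = PhrasalSchema(
    slots=("N", "A"),
    meaning="NAME for {0} with some relation R to entity E of {1}",
)

form, meaning = N_A.instantiate(("droga", "dojazdowa"), ("road", "access"))
print(form)     # droga dojazdowa
print(meaning)  # NAME for road with some relation R to entity E of access
```

The same object can serve both functions attributed to schemas in the text: checking an existing unit against the slots, and filling the slots with new words to coin a novel unit.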
8 Conclusion
This chapter offered a brief overview of multi-word expressions in Polish, focus-
ing on phrasal nouns (which are often referred to as “juxtapositions”) and their
interaction with compound nouns. The following subtypes of juxtapositions were
discussed at greater length: N+N.gen, N+A, A+N, and coordinate N+N phrasal
lexemes. Juxtapositions do not meet the majority of the criteria for morphological
compounds (as stated by Lieber/Štekauer 2009). A morphological compound in
Polish, i. e. a compound proper, is written as one orthographic word and inflected
like one morphological word (with the inflectional endings attached to the right-
hand constituent). It carries one primary lexical stress (typically on the penulti-
mate syllable). A juxtaposition, in contrast, consists of two or more orthographic
words, each of which is inflected. Constituents of a juxtaposition can carry inde-
pendent lexical stresses, e. g. mĄż stAnu (man.nom state.gen) ‘statesman’. On the
other hand, juxtapositions act as naming units, therefore they can be regarded as
multi-word lexical items. It is important to emphasise here that phrasal nouns in
Polish are far from being exclusively idiomatic and unanalysable multi-word
expressions. While selected multi-word units are semantically non-composi-
tional (and can be treated as figurative idioms), e. g. biały kruk (white raven) ‘rare
specimen’, the majority of phrasal nouns in Polish show varying degrees of
semantic transparency. They are also analysable syntactically, which results in
some degree of their syntactic mobility, as is shown above for coordinate N+N
juxtapositions and for phrasal nouns consisting of a head noun and a relational
adjective. The syntactic analysability of phrasal nouns also tallies with the fact
that their constituents are inflected as independent morphological words.
The approach of Construction Morphology allows the researcher to provide a
proper account of the above-mentioned properties of phrasal nouns in Polish.
Multi-word units inherit their syntactic structure from construction schemas. In
Compounds and multi-word expressions in Polish 303
other words, phrasal construction schemas can be employed to analyse the inter-
nal structure of existing phrasal nouns. The construction schemas state that
phrasal nouns are generally interpreted as “names of kinds” (i. e. as subtypes of
entities), e. g. droga dojazdowa (road access.ra) ‘access road’, miernik promieniowania
(meter.nom radiation.gen) ‘radiation meter’, kierowca-dostawca (driver.nom
supplier.nom) ‘delivery driver’. Phrasal schemas can be used not only as
redundancy statements (to license conventionalised phrasal nouns), but also as
patterns for creating novel multi-word units. The latter function of schemas is
particularly important in Polish since the patterns for phrasal nouns discussed
above are very productive. Novel phrasal lexemes abound in Polish, e. g. in the
vocabulary associated with Internet technology, as is illustrated by such multi-word
units as dostawca usług internetowych (provider.nom.sg service.gen.pl
Internet.ra.gen.pl) ‘Internet service provider’, pióro świetlne (pen light.ra) ‘light
pen’, ekran dotykowy (screen touch.ra) ‘touch screen’, telefon z klapką (phone
with flip) ‘clamshell phone’. Schemas for multi-word units in Polish both compete
with and complement patterns of compounding. As was shown in Section 6,
fairly numerous examples can be found of the co-existence of synonymous compound
nouns and phrasal nouns in Polish, such as licencjodawca (licence+lv+giver)
and dawca licencji (giver.nom licence.gen) ‘licensor’. However, the formation of
synthetic compounds appears to be more restricted than the coinage of N+N.gen
or N+A multi-word units. Moreover, some types of naming units can be formed
only by using phrasal schemas, e. g. attributive N+N compounds, such as człowiek-zagadka
(man mystery) ‘mystery man’, and coordinate phrasal nouns consisting
of units denoting Kinship+Profession, e. g. mąż prawnik (husband lawyer)
‘lawyer husband’. Finally, it was shown that multi-word units need to be accessible
to affixation and compounding processes (i. e. to morphological construction
schemas), as they undergo morphological condensation. Such evidence indicates
that the study of both morphologically complex words (such as compounds
proper) and multi-word units should be of interest to morphologists. Researchers
should pay greater attention to the interaction between phrasal lexemes and mor-
phologically complex words in Polish, which is the kind of phenomenon that can
find an appropriate account within the framework of Construction Morphology.
References
Anderson, Stephen (1992): A-morphous morphology. Cambridge, UK: Cambridge University
Press.
Bloomfield, Leonard (1933): Language. New York: Holt, Rinehart and Winston Inc.
Booij, Geert (2010): Construction morphology. Oxford: Oxford University Press.
Booij, Geert/Audring, Jenny (2015): Construction morphology and the parallel architecture of
grammar. In: Cognitive Science 41, 2. 277–302.
Booij, Geert/Masini, Francesca (2015): The role of second order schemas in the construction of
complex words. In: Bauer, Laurie/Körtvélyessy, Livia/Štekauer, Pavol (eds.): Semantics of
complex words. Cham etc.: Springer. 47–66.
Buttler, Danuta (1976): Innowacje składniowe współczesnej polszczyzny. Warszawa: Państwowe
Wydawnictwo Naukowe.
Cetnarowska, Bożena (2014): On pre-nominal classifying adjectives in Polish. In: Bondaruk,
Anna/Dalmi, Gréte/Grosu, Alexander (eds): Topics in the syntax of DPs and agreement.
Amsterdam/Philadelphia: Benjamins. 100–127.
Cetnarowska, Bożena (2015): The linearization of adjectives in Polish noun phrases: Selected
semantic and pragmatic factors. In: Bondaruk, Anna/Prażmowska, Anna (eds.): Within
language, beyond theories. Vol. 1: Studies in theoretical linguistics. Newcastle upon Tyne:
Cambridge Scholars Publishing. 188–205.
Cetnarowska, Bożena (2016): Identifying (heads of) copulative appositional compounds in
Polish and English. In: Körtvélyessy, Lívia/Štekauer, Pavol/Valera, Salvador (eds.):
Word-formation across languages. Newcastle upon Tyne: Cambridge Scholars Publishing.
51–71.
Cetnarowska, Bożena (2018): Phrasal names in Polish: A+N, N+A and N+N units. In: Booij, Geert
(ed.): The construction of words. Advances in construction morphology. Cham: Springer
International Publishing AG. 287–313.
Cetnarowska, Bożena/Pysz, Agnieszka/Trugman, Helen (2011): Distribution of classificatory
adjectives and genitives in Polish NPs. In: Dębowska-Kozłowska, Kamila/Dziubalska-
Kołaczyk, Katarzyna (eds.): On words and sounds: A selection of papers from the 40th PLM,
2009. Newcastle upon Tyne: Cambridge Scholars Publishing. 273–303.
Cetnarowska, Bożena/Trugman, Helen (2012): Falling between the chairs: Are classifying
adjective+noun complexes lexical or syntactic formations? In: Błaszczak, Joanna/
Rozwadowska, Bożena/Witkowski, Wojciech (eds.): Current issues in generative
linguistics: Syntax, semantics and phonology. Wrocław: University of Wroclaw. 138–154.
Damborský, Jiří (1966): Apozycyjne zestawienia we współczesnej polszczyźnie. In: Język Polski
46, 4. 255–268.
Di Sciullo, Anna-Maria/Williams, Edwin (1987): On the definition of word. Cambridge, MA: The
MIT Press.
Długosz-Kurczabowa, Krystyna/Dubisz, Stanisław (1999): Gramatyka historyczna języka
polskiego: Słowotwórstwo. Warszawa: Wydawnictwa Uniwersytetu Warszawskiego.
Fellbaum, Christiane (2011): Idioms and collocations. In: Maienborn, Claudia/von Heusinger,
Klaus/Portner, Paul (eds.): Semantics. An international handbook of natural language
meaning. Vol. 1. Berlin/New York: De Gruyter. 441–456.
Goldberg, Adele (2006): Constructions at work. The nature of generalization in language.
Oxford: Oxford University Press.
Granger, Sylviane/Paquot, Magali (2008): Disentangling the phraseological web. In: Granger,
Sylviane/Meunier, Fanny (eds.): Phraseology. An interdisciplinary perspective.
Amsterdam/Philadelphia: Benjamins. 27−49.
Grochowski, Maciej (1982): Zarys leksykologii i leksykografii. Toruń: Wydawnictwa
Uniwersytetu Mikołaja Kopernika.
Grzegorczykowa, Renata (1982): Zarys słowotwórstwa polskiego. Słowotwórstwo opisowe.
5th edn. Warszawa: Państwowe Wydawnictwo Naukowe.
Ralli, Angela/Stavrou, Melita (1998): Morphology-syntax interface: A+N compounds and A+N
constructs in modern Greek. In: Booij, Geert/van Marle, Jaap (eds.): Yearbook of
morphology 1997. Dordrecht: Springer. 243–264.
Renner, Vincent/Fernández-Domínguez, Jesús (2011): Coordinate compounding in English and
Spanish. In: Poznań Studies in Contemporary Linguistics 47. 873–883.
Rutkowski, Paweł/Progovac, Ljiljana (2005): Classification projection in Polish and Serbian: The
position and shape of classifying adjectives. In: Franks, Steven/Gladney, Frank Y./
Tasseva-Kurktchieva, Mila (eds.): Formal approaches to Slavic linguistics: The South
Carolina Meeting. Ann Arbor, MI: Michigan Slavic Publications. 289–299.
Scalise, Sergio/Bisetto, Antonietta (2009): Classification of compounds. In: Lieber, Rochelle/
Štekauer, Pavol (eds). 49–82.
Skorupka, Stanisław (1967): Słownik frazeologiczny języka polskiego. (2 Vols.). Warszawa:
Wiedza Powszechna.
Sprenger, Simone (2003): Fixed expressions and the production of idioms. [Ph. D. dissertation,
Max Planck Instituut voor Psycholinguïstiek]. Nijmegen: Max-Planck Institut für
Psycholinguistik.
Szerszunowicz, Joanna (2012): English-Polish contrastive phraseology. In: Rozumko, Agata/
Szymaniuk, Dorota (eds.): Directions in English-Polish contrastive research. Białystok:
Wydawnictwo Uniwersytetu Białostockiego. 139–162.
Szumska, Dorota (2006): Przymiotnik jako przyłączone wyrażenie predykatywne. Analiza
formalizacji struktur propozycjonalnych w warunkach predykacji niezdaniotwórczej.
Kraków: UNIVERSITAS.
Szymanek, Bogdan (2010): A panorama of Polish word-formation. Lublin: Wydawnictwo KUL.
Węgrzynek, Katarzyna (1998): Składnia wyrażeń frazeologicznych w modelu gramatyki
generatywno-transformacyjnej. In: Polonica 19. 67–74.
Willim, Ewa (2001): On NP-internal agreement: A study of some adjectival and nominal
modifiers in Polish. In: Zybatow, Gerhild et al. (eds.): Current issues in formal Slavic
linguistics. Frankfurt a. M. i. a.: Lang. 80–95.
Żmigrodzki, Piotr (2009): Wprowadzenie do leksykografii polskiej. Katowice: Wydawnictwo
Uniwersytetu Śląskiego.
Irma Hyvärinen
Compounds and multi-word expressions
in Finnish
1 Introduction
Most of the processes to expand the vocabulary of a language are based on a recy-
cling principle: Instead of creating not yet occupied arbitrary sound sequences
for new concepts, existing lexemes or morphemes are reused as material for new
words. This can happen by borrowing a word from some other language or by
altering the meaning and thus shifting the extension of an existing word. Yet,
these means are fairly unsystematic. By contrast, a system of word-formation offers
productive models for expanding the lexicon economically, and it is in fact
the most common means of doing so.1
Word-formation types such as (1a–f) are usually regarded as a domain of
morphology:
1 Foreign influence can manifest itself in word formation, too, as calques of individual formations
or by taking over a formation model from another language. Many Finnish compounds are loan
translations from (or via) Swedish or German. Nowadays loan translations come increasingly
from English, cf. jakamis+talous < sharing economy, palvelu+muotoilu < service design. In termi-
nology, neoclassical compounds (with elements from Greek or Latin) as internationalisms play
an important role.
2 The word stem is the form to which affixes can be attached. As for word stems in Finnish,
cf. ISK (2004: 86–89).
Open Access. © 2019 Hyvärinen, published by De Gruyter. This work is licensed under the
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
https://doi.org/10.1515/9783110632446-011
The boundaries between different formation types are not always clear-cut: Com-
pound nouns often compete with MWEs, for example as constructional synonyms
in terminology, cf. (2e) above. Some Finnish compounds have internal inflec-
tional elements, which is a syntactic feature (cf. Section 2.1). Moreover, there are
hybrid formations, like the so-called “derived compounds” (Section 2.3.1.1,
group 2). And finally, scholars have divergent views of certain structures, such as
Finnish particle verbs that have been classified either as compounds, prefix deri-
vations or MWEs (Section 3).
Compounds and MWEs share some characteristics: Both are complex lexical
units and thus secondary signs for a specific concept, their constituents are
words, and they can bear an idiomatic (figurative or opaque) or non-idiomatic
(transparent) meaning. One instance of opaqueness is presented by unique com-
ponents (isolates, cranberry morphemes), compare the MWE in (2a) with the
cranberry-compound puna+tulkku (lit. red+unique component) ‘bullfinch’ (cf.
Nenonen 2002: 13, 15, 21 f., 37–40; Stein 2012: 227 f.). Both compounds and MWEs
can express determinative, appositive and coordinative relations. The compound
constituents occur in a fixed order; regarding MWEs this applies mainly to nomi-
nal, adjectival and adverbial expressions, whereas verbal MWEs are more flexi-
ble. In Finnish, the great majority of compounds are nouns (N), while among idi-
omatic MWEs verb idioms (V) are the predominant class.
In this chapter, the focus is on the characteristics of compounds, with remarks
on differences and overlap in the structure and syntactic distribution of com-
pounds and (fixed or free) phrasal units. Section 2 gives an overview of com-
pounding in Finnish, mostly using examples of nouns and adjectives:6 In Sec-
tion 2.1 characteristics of prototypical compounds and their absence, making a
compound less prototypical and bringing it nearer to an MWE, are discussed.
Section 2.2 deals with the complexity of compounds, and in Section 2.3 the main
semantic-hierarchical and morphosyntactic types of compounds are presented.
Section 3 focuses on a word class that has been regarded as rather peripheral
from the perspective of compounding in Finnish, namely complex verbs. They are
interesting for two reasons: They are on the increase in modern Finnish, and they
lie at the intersection of compounds (3.1), prefix derivatives (3.2) and MWEs (3.3).
In the closing remarks (Section 4) observations on the blurred border between
Finnish compounds and MWEs are gathered and suggestions for future research
are presented.
2 Compounding in Finnish
2.1 Prototypical compounds
(3a) Verb stem + suffix -jA → juoja ‘drinker’ vs. syöjä ‘eater’
(3b) yö+juna (*yö+jynä) ‘night train’, varpus+pöllö (*varpus+pollo) (lit. sparrow+owl) ‘pygmy owl’
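Examples (3a) and (3b) reflect vowel harmony: the suffix -jA surfaces as -ja after a stem containing a back vowel (a, o, u) and as -jä otherwise, whereas each constituent of a compound keeps its own vowels. The Python sketch below illustrates this choice. The function names are hypothetical, and the rule is deliberately simplified (it ignores stem alternations and disharmonic loanwords).

```python
BACK_VOWELS = set("aou")
FRONT_VOWELS = set("äöy")


def agent_suffix(stem: str) -> str:
    """Return 'ja' after a stem containing a back vowel, otherwise 'jä'."""
    return "ja" if any(ch in BACK_VOWELS for ch in stem.lower()) else "jä"


def agent_noun(stem: str) -> str:
    """Attach the harmonising agent suffix -jA to a verb stem."""
    return stem + agent_suffix(stem)


def is_harmonic(word: str) -> bool:
    """True if the word does not mix back (a, o, u) and front (ä, ö, y) vowels."""
    letters = set(word.lower())
    return not (letters & BACK_VOWELS and letters & FRONT_VOWELS)


print(agent_noun("juo"))   # juoja
print(agent_noun("syö"))   # syöjä
# Harmony holds within each constituent, not across the compound boundary.
print([is_harmonic(part) for part in "yö+juna".split("+")])  # [True, True]
print(is_harmonic("yöjuna"))  # False
```

The last two lines show why a compound such as yö+juna can be told apart from a simplex word: the whole string mixes front and back vowels, while each constituent on its own does not.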
and Section 2.3.1.1, group 2), has become more common than earlier (Tyysteri
2015).
As a general rule, Finnish compounds are written without space between the
constituents, cf. (4) below. Hyphenation is obligatory in case of hiatus (5a) and to
indicate the constituent boundary after a special sign (letter, number, acronym
etc.) (5b). A compound differs also prosodically from a phrase: The main stress is
on the first compound constituent (cf. criterion 3), while in a corresponding
phrase both words have a stress of their own (Pääkkönen 1989: 371; Vesikansa
1989: 213; ISK 2004: 388), cf. (6). Yet, stress is not a reliable criterion: Adverbial
and conjunctional units (7) bear only one main stress on the first part and show a
strong tendency towards univerbation. Until the 1960s they could be written
together or apart; today the orthographical norm requires separation and thus
assigns them MWE status, which contradicts the stress pattern (Niinimäki
1992).
(6) músta+rastas (lit. black+thrush) ‘blackbird’ vs. músta rástas ‘black thrush’
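The spelling conventions described above (no space between constituents, with an obligatory hyphen in two cases) can be summarised in a short Python sketch. It rests on simplifying assumptions: the function name `join_compound` is hypothetical, hiatus is approximated as "the first constituent ends in the same vowel with which the second begins", and a "special sign" is approximated as a first constituent that is a single letter, contains a digit or is written entirely in capitals. The test words linja-auto and EU-maa are supplied here for illustration and are not taken from the chapter's examples.

```python
VOWELS = set("aeiouyäö")


def join_compound(first: str, second: str) -> str:
    """Join two constituents, inserting a hyphen where the rule requires it."""
    # Approximation of "special sign": single letter, digit or all-caps acronym.
    special = len(first) == 1 or first.isupper() or any(c.isdigit() for c in first)
    # Approximation of hiatus: identical vowels meet at the constituent boundary.
    last, nxt = first[-1].lower(), second[0].lower()
    same_vowel = last in VOWELS and last == nxt
    return f"{first}-{second}" if special or same_vowel else first + second


print(join_compound("kivi", "talo"))   # kivitalo
print(join_compound("linja", "auto"))  # linja-auto
print(join_compound("EU", "maa"))      # EU-maa
```

The sketch covers spelling only; as the text notes, stress is a separate and less reliable criterion.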
(10a) kissan+kokoisin kirjaimin ‘in letters big as a cat, in huge letters’ (conventional idiom)
(10b) kissan kokoinen rotta ‘rat having the size of a cat’ (concrete compositional meaning)
7 There are fewer than 100 monosyllabic word roots in Finnish, whereas English has at least 7,000
(Karlsson 2004: 1329).
(18a) kivi+talo ‘stone house’ (a special kind of house: ‘house made of stone’)
(18b) kirkon+kello (lit. church_gen+bell) ‘church bell’ vs. (läheisen) kirkon kello ‘bell of the (nearby) church’
8 Tyysteri’s material consists of more than 28,000 new compounds (types) in Finnish print media
in the period 2000–2009, collected from Nykysuomen sanastotietokanta (Lexical database of
modern Finnish) of Kotimaisten kielten keskus (Institute for the Languages of Finland) (Tyysteri
2015: 79–84).
(21b) ajan+mukaistaa ‘modernize, update’ < ajan+mukainen (lit. time_gen+in accordance with) ‘up to date’
(21c) musta+pukuinen ‘dressed in black’ < musta puku ‘black dress’
9 As for the theoretical status of prefixation in the history of linguistics, cf. Olsen (2015: 364 f.).
Similarly to epä-, ei-, foreign negation prefixes, such as dis-, in-, are treated as
compound components in Finnish, as well as other foreign prefixes and confixes,
e. g. ex-/eks-, pre-, hyper-, mikro-, poly-, neo-, audio-, anglo-, bio-, geo-, psyko-
which occur in neoclassical compounds (see Olsen 2015: 374 f.), cf. (24a). Some
foreign pre-elements can also be combined with indigenous heads (24b) (cf. Saja-
vaara 1989: 76 ff.10; ISK 2004: 192, 394, 402; Pitkänen-Heikkilä 2016: 3214).
10 Sajavaara (1989: 79 f.) also gives an overview of bound second constituents of neoclassical
compounds in Finnish.
the head and implies a contrast (26). As for adjectives, cf. (27), the first constituent
is mostly in the genitive (A_gen+A) and functions as an intensifier; the components
can be combined as a compound or a phrase without an essential difference
in meaning (ISK 2004: 410; Tyysteri 2015: 66 f.), similar to (8) above.
(26) ruoka+ruoka (lit. food+food) ‘real food’ (in contrast to fast food or
unhealthy food)
kirja+kirja (lit. book+book) ‘printed book’ (in contrast to e-book).
can be a stem, case form or a specific combining form.11 The first component is
usually classified on grounds of its word class (if identifiable) and/or its form
(nominative, genitive, other case form, combining form, indeclinable element or
element with deficient paradigm). Subclasses that arise from the cross classifica-
tion of the morphosyntactic types of both constituents are described semantically
in detail in the research literature, but no hard and fast rules can be given.
It is a controversial question to what extent the meaning of a compound is
influenced by the form of the first constituent. The most frequent first constituent
form in Finnish is the nominative which is the base form without any inflectional
elements. This base form, as well as the combining forms, leaves the constituent
relation underspecified so that several interpretations are possible. Inherently
ambiguous compounds can be interpreted on semantic and pragmatic grounds, drawing
on world knowledge of prototypical (e. g. local, temporal, causal, instrumental,
possessive etc.) relations, common ground and contextual inference (cf. Olsen
2015: 365 f., 376 ff., 382; Pitkänen-Heikkilä 2016: 3213). Lexicalized and frequently
used compounds can be understood holistically, without analytic compositional
processing, but there is psycholinguistic evidence that some form of analysis
is co-present (Mäkisalo 2000). Räisänen (1986) points out that lexicalized
compounds can be reinterpreted on contextual grounds: In a football report,
maa+pallo (lit. earth+ball) and ilma+pallo (lit. air+ball), with the lexicalized
meanings ‘globe’ and ‘balloon’ respectively, are interpreted in a context-adequate way as
occasionalisms describing the motion of the ball either along the ground or through
the air.
If the first constituent is in the genitive or some other non-nominative case,
the interpretation is more restricted. In such cases the head is usually a deverbal
noun and the first component corresponds to an argument of the underlying verb
(synthetic compounds, cf. Section 2.3.1.1, group 1). A first constituent in the geni-
tive can indicate a (in a broad sense) subjective-possessive (29a) or objective rela-
tion (29b); the latter is more common (Saukkonen 1973: 338; cf. ISK 2004: 400).
Locative cases are also current (29c). It is noteworthy, however, that case marking
is not obligatory: Similar relations can also be expressed by compounds with
morphologically unspecified modifiers (30a–c).
11 A combining form (casus componens) is a form of the non-head constituent that as such does
not occur as an autonomous word form. Besides non-autonomous stem forms, such as nais- <
nainen ‘woman’ (nais+ryhmä ‘women’s group’) or pien- < pieni ‘small’ (pien+teollisuus ‘small
industries’), there are specific combining forms with additional morphological material. For
example, verbal first constituents appear mostly in a combining form with -ma- or -in- (istuma+paikka
(lit. sitting+place) ‘seat’, leivin+uuni ‘baking oven’) (cf. also Tyysteri 2015: 121, 131, 134 f.).
There are pairs of compounds with a nominative vs. genitive first constituent
where the case choice seems more or less arbitrary (31a–b), and others where the
difference in meaning is minimal (32a–b). Yet, sometimes there is a clear seman-
tic opposition: (33a) is a specific house, whereas in (33b) the head describes an
action and the first constituent in the genitive is the object argument of the underlying
verb (cf. Vesikansa 1989: 230–237; ISK 2004: 398–400).
Case marking on the constituent boundary does not contradict the principle of
world-knowledge and context-based interpretation, but in giving further infor-
mation on the relation between the constituents it can exclude alternatives that
are possible when the first constituent is unmarked: While the underspecified
form pöytä+tarjoilu (lit. table+service) can be used in the meaning ‘buffet service,
self-service from the table’ (source), the marked form pöytiin+tarjoilu (lit.
tables_illat+service) precludes this interpretation because the illative ending
makes the opposite direction (goal) explicit.
3 Complex verbs in Finnish at the intersection of
compounds, prefix derivatives and MWEs
In Finnish, compound verbs are rare.12 They belong to the category of determina-
tive compounds;13 the first constituent is a noun, adjective, numeral, pronoun,
non-autonomous stem or particle (adverb/adposition) (Rahtu 1984: 409–412; ISK
2004: 414 f.). Verbs with a particle as first constituent are often replaced by MWEs
with the same elements. On the other hand, some first constituents come near to
prefixes. Thus, complex verbs can be explored on a scale MWE – compound –
prefix derivative.
Modern Finnish has about 250 lexicalized compound verbs with a full para-
digm, but the number is increasing (ISK 2004: 414). Additionally, formations with
a deficient paradigm (mostly participle forms) are in use, and occasionalisms
occur. Compound verbs were banned by Finnish language planning as loan
translations for a long time. In the last decades the norm has become more per-
missive, which can explain the increasing occurrence (cf. Rahtu 1984: 409; Vesi-
kansa 1989: 254–258; Vaittinen 2003: 50; Tyysteri 2015: 40, 154, 220 f.).
There are three historical layers of compound verbs in Finnish: The oldest
compound verbs, with an adverb as first constituent, are loan translations from
the time of the Reformation. At the end of the 19th century a new type, derived
from compound nouns, appeared. At the beginning of the 20th century, adjective
compounds also became derivation bases of verbs (Häkkinen 1987: 10–19; Vaittinen
2003: 47). In modern Finnish, too, most compound verbs are secondary
“derived compounds”, i. e. derivatives or backformations from compound adjectives
or nouns, such as (34a–c) (Vesikansa 1989: 256 ff.; Tyysteri 2015: 153; cf. also
Section 2.3.1.1, group 2). According to ISK (2004: 414 f.), most present-day compound
verbs are derived from complex adjectives ending in the suffix -inen,
cf. (34a). According to Tyysteri (2015: 158, 213), however, the majority of the newest
compound verbs go back to compound nouns (34b–c). For the most part, new
compound verbs have a noun (N) as first component (Tyysteri 2015: 173), which is
12 According to Saukkonen (1973: 337 f.), the proportion of verbs among all compounds in “Nykysuomen
sanakirja” (1951–1961) remains at 0.3 %. In Tyysteri’s (2015: 113) corpus their ratio
(types) is 1.2 %.
13 Copulative compound verbs do not exist in Finnish. Compounds with a verb stem as first
constituent are possible, cf. riippu+liitää ‘hang-glide’, but the constituent relation is determina-
tive, not additive. In itku+naurattaa ‘make cry and laugh’ (Vesikansa 1989: 258) the semantic re-
lation is similar to an additive compound, but the first constituent is a deverbal noun, i. e. the
morphological structure is asymmetric.
unsurprising since compound nouns are the most common derivation base, and
among these, the structure N+N is predominant.
Adverbs, particles and non-autonomous elements can combine directly with ver-
bal heads (Vesikansa 1989: 254 ff.). Such preverbs are often called “prefix-like
elements” because they are in many respects similar to prefixes in other
languages. In Finnish, however, prefixation is untypical (Häkkinen 1994: 488;
Kolehmainen 2006: 111, 113). This is why word formation with bound “prefix-like
elements” is subsumed under compounding in the Finnish grammar tradition,
even if the notion of “prefix-likeness” varies (cf. Tyysteri 2015: 127 ff.). In the fol-
lowing, the focus is on verbs with such prefix-like elements.
Kolehmainen (2006) distinguishes positionally fixed bound preverbs,
divided into (a) confixes and (b) prefixes, from (c) separable
particles in phrasal verbs. Consequently, in each group the word formation
status of the verbs is different: in (a) compound (3.1), in (b) prefix derivative (3.2),
and in (c) MWE (3.3). In the following, these groups are examined in detail in
order to estimate their structural status and productivity.
3.1 Confix compounds
Complex words with a prefix-like first constituent that does not occur as an auton-
omous lexical unit (and thus has an unspecific word class status) are relatively
common in modern Finnish. In Tyysteri’s material, including all word classes,
they make up 9.3 % of all two-constituent compounds; indigenous and foreign
pre-elements are roughly equally common. Yet, the word class distribution (e. g.
the ratio of verbs) of such formations is not given (cf. Tyysteri 2015: 118 ff., 125,
128). The examples in (35a) are lexicalized compounds (cf. Kolehmainen 2006:
115); neologisms and occasionalisms such as (35b) are being used more and more
frequently.
The “prefix-likeness” of such elements is debatable. The term confix seems more
suitable here because in contrast to semantically abstract prefixes, the pre-ele-
ments in question still have a more or less clear lexical-conceptual meaning. His-
torically, they go back to autonomous lexemes; some of them occur today only as
bound elements (e. g. epä- ‘un-, non-’; esi- ‘pre-’; etä- ‘long-distance, remote’),
some are obsolete or archaic as autonomous words (e. g. lähi- ‘near’; taka- ‘back,
rear’; tasa- ‘even, equal’). Others have an autonomous homonym, but the seman-
tic difference is so big that the common origin is not transparent (e. g. edes- ‘fur-
ther, forth’; etu- ‘fore, forward, front’; jälki- ‘post-, after-’)14 (Kolehmainen 2006:
113 f., 128). Karlsson (1983: 192 f.) points out that these elements are semantically
similar to nouns and adjectives and calls them lexical “relic morphemes”.15
Moreover, these elements differ from prefixes in their ability to function as
derivation bases (cf. esi- ‘pre-’ in the derivative esittää ‘present, put forth, perform’
vs. esi+katsella ‘preview’; more examples in Kolehmainen 2006: 119). They are
somewhere between prototypical compound constituents and affixes (ibid.: 118–
124). Confix verbs meet the prototypicality criterion 4) (cf. Section 2.1) according
to which a form-identical phrase is not possible (*esi katsella, *katsella esi), but
since the first constituent is not an autonomous lexeme (criterion 1), they count
as non-prototypical compounds.
In spite of the fact that non-autonomous elements can in principle be com-
bined regularly with verbal heads, many of the complex verbs in this group are
actually secondary compounds, i. e. derivatives (36a) or backformations (36b)
from already existing compounds (see above).16 Many confix verbs have an incomplete
paradigm: They are preferably used in non-finite forms, especially as adjec-
14 In affirmative expressions the autonomous word edes means ‘at least’ in modern Finnish,
with negation it has the meaning ‘[not] even’. The noun etu means ‘advantage, benefit’ and the
noun jälki ‘track, trace’. In spite of the common etymology, native speakers hardly associate
these words with the corresponding pre-elements (Kolehmainen 2006: 114, 126).
15 ISK (2004: 192, 393, 414 f.) and Rahtu (1984: 409) characterize them as “prefix-like nominal
stems”.
16 In Tyysteri’s random sample of 300 two-constituent-compounds (100 nouns, 100 adjectives
and 100 verbs), 75 % of the compound verbs (including all kinds of first constituents) were
formed by derivation or backformation and only 25 % by regular compounding. The ratio of reg-
ular compounding is much lower than in previous studies (Tyysteri 2015: 154 f., 158).
tive-like participles, which is a transitional phase on the way towards a full para-
digm via analogy and generalization. Analogy plays a role in producing new
verbs as well: When verbs with a given initial element, e. g. ala- ‘sub-’, become
more frequent (e. g. alaotsikoida ‘subtitle’, alaluokitella ‘subclassify’ etc.), the
word structure is reanalyzed such that the main constituent boundary is after the
pre-element, and not after the complex nominal base, thus as if the verbs were
formed regularly via combining ala- directly with the verb. In this way, an origi-
nally prenominal confix can develop into a preverbal confix, cf. (i), which leads
to a symmetric compounding model (ii) that can be generalized, cf. (iii):
In Kolehmainen’s assessment (2006: 116 f.), given the limited lexical variation in
her research material (76 different verbs with 22 indigenous confixes)17 the struc-
ture confix+verb plays a minimal role in modern Finnish, i. e. it is not productive.
Yet, according to ISK (2004: 414 f.), the number of different verbs with epä- ‘un-’,
esi- ‘pre-’, jälki- ‘post-’, pika- ‘quick, instant’ is increasing, which means that at
least these elements are productive. Among the new compounds from the first
decade of the 21st century many more than the above-mentioned bound preverbs
are in frequent use – to an extent that proves the productivity of this formation
model (Tyysteri 2015: 130). Confix verbs are, however, often stylistically marked:
They occur as terms in languages for special purposes; in everyday language
and print media occasionalisms are often used playfully (Vesikansa 1989: 257 f.;
Kolehmainen 2006: 116; Tyysteri 2015: 88, 113, 213). Nevertheless, it is evident
that the number of lexicalized confix verbs in standard language is increasing.
The currently most popular indigenous and foreign verb confixes have a high
communicative and cultural relevance: They reflect modern life with its hectic
pace (pika-), green values (bio-, eko-) and technological innovations (digi-, nano-,
etä-, täsmä-).
17 Kolehmainen collected her research material from dictionaries and authentic texts from the
1990s in SKTP (Suomen kielen tekstipankki / Language Bank of Finland).
3.2 Prefix verbs?
The question is whether adpositional and adverbial elements that are used as
bound preverbs in Finnish can be regarded as prefixes. Kolehmainen (2006: 130–
137) cautiously refers to them as “prefix-like elements” and underlines that they
differ in some aspects from prefixes in Germanic languages. Firstly, they are not
unstressed: The main word stress in Finnish is generally on the initial syllable,
i. e. word stress does not apply as a prefix criterion in Finnish. Secondly, the
Finnish adpositions are mainly postpositions.18 Thirdly, many of them are secondary
adpositions, having developed from inflected forms of relative nouns,19 and have
therefore (fossilized) case endings; some of them have a restricted nominal para-
digm in several (still existing or historical), mostly locative, especially directional
cases. The same holds for adverbs: Many elements occur both as adpositions and
as adverbs (ISK 2004: 664 f.; Tyysteri 2015: 121). Consequently, there are hundreds
of different adposition and adverb forms in Finnish, but not all of them function
as preverbs.
Kolehmainen’s research material from grammars, previous studies and dictionaries
contains 70 such elements (Kolehmainen 2006: 134–137). “Nykysuomen sanakirja”
(Dictionary of modern Finnish, 1951–1961) mentions 251 complex verbs with these
elements, but many of them are marked as archaic, e. g. alas+astua ‘step down’, and
almost half of them occur in univerbated form only as participles, cf.
yhteen+laskettu (lit. together+counted) ‘combined’. In both
cases separated alternatives are recommended, cf. astua alas, yhteen laskettu (cf.
Section 3.3). Thus the number of inseparable verbs in active use is much lower.
Most of the elements combine only with one or two verbs (ibid.: 138). About ten
elements show a somewhat broader spectrum, e. g. irti- ‘loose, off’, läpi- ‘through,
throughout’, sisään- ‘in, inside’, ulos- ‘out, outside’, yli- ‘over’ (ibid.: 137 ff.). All in
all, Kolehmainen regards the model prefix+verb as unproductive.
Finnish inseparable verbs of this group are historical relics that go back to
old loan translations from Germanic and classical languages or to an
interference-based formation model (cf. Öhmann 1957: 33 ff.; Vaittinen 2003; Toropainen
2017: 72). In Old Literary Finnish (1540–1810) the majority of printed texts were
18 In principle, postpositions (and adverbs) can develop into prefixes in SOV-languages where
complements precede the verb. SOV is supposed to be the basic word order in Uralic languages;
in Finnish, however, the order has changed into SVO. This is one possible explanation for the
weak affinity to prefixes. As for typological theories of linearization in connection with prefixes,
see the overview in Kolehmainen (2006: 149–156).
19 This is the first step of a gradual grammaticalization called “noun-to-affix-cline”, cf. Leh-
mann (1985: 304); Hopper/Traugott (2003: 110); as for Finnish Jaakola (1997: 126 f., 134).
(38a) irti+sanoa ~ *irti sanoa ~ sanoa irti ‘discharge, fire; cancel, (fig.) break off’
(38b) irti+sanottu vs. *irti sanottu
In my opinion, these pre-elements are not prefixes. One reason is their obvious
unproductivity, i. e. the restricted verb variation per pre-element – for affixes a far
wider use is expected. The still existing bound forms are sporadic historical rel-
ics, based on calques from foreign languages with systematic prefixation, yet, in
Finnish, a generalization never took place. The initial word stress protects the
elements from phonological erosion typical of affixes. Above all, the fact that
there are parallel phrasal forms, cf. (37a) and (38a), is evidence of the lexical autonomy
of the elements in question – in that respect they show a higher autonomy
than confixes (cf. Section 3.1). It follows that the univerbated forms are com-
pounds. Here I agree with Tyysteri (2015: 119, 121) who, in contrast to Kolehmainen
(2006), does not classify the above-mentioned elements as prefixes or “prefix-like
elements” but as “indeclinable elements or elements with incomplete declina-
tion (adverbs, adpositions and particles)” in ordinary compounds. The advantage
of this analysis is that the coexistence of occurrences with and without separa-
tion, i. e. MWEs vs. compounds, can be compared with similar cases in other word
classes where both alternatives have (nearly) the same meaning, cf. (8) and (20b)
above.
Whether the one-word and the two-word combination represent one and the
same verb lexeme or two synonymous lexemes and whether the phrasal alterna-
tives should be regarded as regular (“free”) syntactic constructions or rather as
phrasal verbs, i. e. MWEs, is discussed in the next section.
3.3 Phrasal verbs
In the linguistic literature the terms “phrasal verb” and “particle verb” are often
used as synonyms. The former implies that the components are separate, while
the latter refers to the functional category of the component the verb is connected
with. In English, for example, particle verbs are always phrasal verbs. In Finnish
this need not be the case.
In traditional Finnish grammar phrasal verbs are not recognized as an estab-
lished category, but several scholars refer to fixed sayings or idiomatic figures of
speech in the form of MWEs, similar to separable particle verbs in Germanic lan-
guages (cf. Häkkinen 1997: 44; Nenonen 2002: 55), cf. (41a). They are semantically
and structurally similar to verb idioms consisting of a verb and a non-particle
component, for example a unique component (41b) or a nominal component in a
locative case (41c) (cf. Nenonen 2002: 55 f.; Kolehmainen 2006: 164).
Also according to ISK (2004: 447), particle verbs are “idiomatic predicates”. Here
“particle” refers to the functional category of the element co-occurring with the
verb, regardless of univerbation or separation. In some cases “Kielitoimiston
sanakirja” (2006) lemmatizes the univerbated form but refers to the phrasal one.
From entries like (42a) it can be inferred that both forms are regarded as
representations of the same lexeme; remarks such as ‘mostly’ or ‘better’ (42b)
indicate that the MWE is generally the dominant form. Occasionally only the
univerbated form is given although separated forms occur commonly, cf. (42c).
However, as mentioned above, some verbs are used only in the univerbated
form, cf. (39).
Kolehmainen (2006: 170 f.) regards the separation (i. e. the MWE structure) and the
idiomaticity or metaphoricity of the combination as key criteria; in her assessment
particle verbs are either singular idioms or go back to phraseological patterns.
This means that transparent (non-idiomatic) combinations, such as (43),
are excluded from the class of particle verbs and regarded as products of free
syntax; according to Kolehmainen (ibid.) native speakers do not perceive them as
single semantic units.
Yet, it is not always easy to draw the line between idiomatic and free combinations
because idiomaticity is a continuum. Kolehmainen (ibid.: 172–183) distinguishes
between four degrees of idiomaticity and compositionality:
(A) Fully idiomatic combinations that do not permit any component variation are
obvious verb idioms, e. g. lyödä laimin, cf. (41b) above, where laimin is a unique
(adverb-like) component and the verb lyödä does not bear its regular meaning
‘hit, beat’, cf. (44a) vs. (44b).
(44a) He lyövät laimin lapsiaan. ‘They neglect their children.’ (idiomatic mean-
ing) vs.
(44b) He lyövät lapsiaan. ‘They are beating their children.’ (regular meaning)
(C) MWEs consisting of verbs and the particles ilmi ‘open(ly), apparent(ly)’ and
julki ‘(in) public, out’ form productive phraseosyntactic patterns, expressing that
information is made available or public. In contrast to adverbs like ulos ‘out’ or
kiinni ‘shut, fixed, closed’ which occur both in concrete and in figurative
combinations, ilmi and julki always have a constant abstract meaning, which can
explain the stronger serialization. There are both intransitive and transitive
series. The kernel verbs are so-called light verbs like tulla ‘come’, antaa ‘give’,
saada ‘get’, tuoda ‘bring’, but they can be replaced with more specific verbs
expressing for example that the publicity was not intended, cf. (47a) vs. (47b). In
the transitive pattern, antaa ‘give’, saattaa ‘put’ or tuoda ‘bring’ can be replaced
by various speech verbs and their descriptive and expressive variants,
cf. (48a–c):
(D) Combinations of verb and directional adverb are often situated on the boundary
between regular syntactic constructions and fixed MWEs. At first sight it seems
contradictory that, according to Kolehmainen (2006: 91, 97, 170), the German
separable particle verbs in (49) are lexicalized phraseological (but not idiomatic)
units, whereas the corresponding Finnish combinations are not. However, this is not
necessarily a contradiction because the lexicalization strategies in two languages
need not be identical. Yet, the difference in the language-specific affinity of such
combinations to merge into one lexeme should be accounted for theoretically.
A possible explanation could be related to the degree of semantic-structural
autonomy of German and Finnish adpositions and adverbs. Different word order
conditions could be relevant, too.
(50a) ajaa ulos tunnelista ‘drive out of the tunnel, leave the tunnel’
(50b) ajaa ulos (Ø) ‘leave’
Besides contextual ellipses there are conventionalized ellipses that are not
figurative but bear some specific semantic features connected with a certain topic or
text type. For example, in reports on road accidents or motor sports ajaa ulos has
the conventional meaning ‘drive off the track, swerve off the road’ (51a). The noun
ulos+ajo (51b) is used particularly in this specific meaning, yet it is difficult to say
whether it has been derived from the lexicalized phrasal verb. It could just as well
have originated as a synthetic compound and then later specialized as a traffic term,
from which the specific phrasal verb was formed analogically, similar to
backformation. This makes it difficult to use phrasal input for derivation as a
criterion of lexicalizedness of the base, especially as there are synthetic compounds
going back to fully transparent non-specific combinations, cf. (52) and (49) above –
even if dictionaries codify primarily the idiomatized or specialized compounds
and leave the semantically self-evident ones out.
(53) Anna meni ulos. – Menikö hän sinne yksin? ‘Anna went out. – Did she go
there alone?’
4 Concluding remarks
Compounding is the most common way to form new words in modern Finnish.
Prototypical determinative nominal compounds with an underspecified first con-
stituent (N+N) form the most common and still increasing type. Apart from this
type many less prototypical compound models are productive, too. Among these,
special attention has been paid above to formations showing syntactic features
similar to MWEs and/or competing with MWEs. The essential findings can be
summarized as follows:
1. In about one third of A+N compounds the adjective agrees in number and
case with the head, which does not fulfil the criterion of morphological integ-
rity. However, compound-internal congruence is a recessive feature; there are
hardly any neologisms with internal congruence. Compounds with a non-con-
gruent first constituent tend to have a term-like character.
2. Internal inflection also occurs in complex numerals. Numerals with hun-
dreds, thousands etc. are grouped into smaller (still complex) units, thus
combining characteristics of non-prototypical compounds and MWEs.
3. In synthetic compounds argument relations of the verb that underlies the
head are explicated by case forms, which is a syntactic feature.
4. A prototypical compound cannot be replaced with a phrasal unit of formally
identical components. Generally, if such pairs occur, they differ in meaning.
Overlap occurs if the modifier is in the genitive, which is the situation for
semantically relative adjectives and many deverbal nominalizations. Univer-
bation strengthens the conceptual unity, and vice versa, conceptualization
furthers univerbation.
5. An opposite example of the correlation between conceptual unity and univerbation
is represented by Finnish particle verbs. Compound verbs with an adverb or
adposition as first constituent are not productive in modern Finnish, partly as a
consequence of normative language planning. This gap in the system is compensated
by “phrasalization”, i. e. keeping apart the components in particle verbs. However,
the formation model is far less productive than in English or German, for instance.
Apart from singular idioms, serialization, based on phraseosyntactic patterns,
occurs to some extent. Drawing
the line between lexicalized MWEs and syntactically free combinations
requires further research.
The following topics remain for further research: In Finnish, non-figurative MWEs
such as fixed collocations and nominations for specific concepts have so far been
studied mostly in terminology. In the future, more attention should also be paid
to corresponding combinations in standard language. So far, MWEs have been
excluded when working out the statistical distribution of different lexeme
structure types in the Finnish vocabulary. Another question deserving attention is the
role of MWE patterns at the intersection of syntax and lexicon: Besides particle
verbs and similes, e. g. light-verb constructions, binomials and serial modification
of a specific idiom structure are topics worthy of further attention. Several
individual studies in these areas have been carried out within contrastive phraseology
and construction grammar but a systematic overview of MWE patterns is still
lacking.
References
Baldwin, Timothy/Kim, Su Nam (2010): Multiword expressions. In: Indurkhya, Nitin/Damerau,
Fred J. (eds.): Handbook of natural language processing. 2nd ed. Boca Raton i. a.: CRC
Press. 267–292.
Eronen, Riitta (1996): Monimuotoiset yhdyssanat. In: Kielikello 1996, 4. Internet:
http://kielikello.fi.libproxy.helsinki.fi/index.php?mid=2&pid=11&aid=386 (last access:
15.9.2017).
Fleischer, Wolfgang/Barz, Irmhild (2012): Wortbildung der deutschen Gegenwartssprache. 4th
ed. Berlin/Boston: De Gruyter.
Häkkinen, Kaisa (1987): Suomen kielen vanhoista ja uusista yhdysverbeistä. In: Sananjalka 29.
7–29.
Häkkinen, Kaisa (1994): Agricolasta nykykieleen. Suomen kirjakielen historia. Porvoo: WSOY.
Häkkinen, Kaisa (1997): Kuinka ruotsin kieli on vaikuttanut suomeen? In: Sananjalka 39. 31–53.
Heinonen, Tarja Riitta (2001): Harmaaturkit herkkusuut – bahuvriihit sanakirjassa ja kieliopissa.
In: Virittäjä 105. 625–634.
Heinonen, Tarja Riitta (2010): Kuin-vertaukset. In: Virittäjä 114. 348–373.
Hopper, Paul J./Traugott, Elizabeth Closs (2003): Grammaticalization. 2nd ed. (= Cambridge
Textbooks in Linguistics). Cambridge, UK: Cambridge University Press.
Hüning, Matthias/Schlücker, Barbara (2015): Multi-word expressions. In: Müller, Peter O. et al.
(eds.). 450–467.
ISK (2004) = Hakulinen, Auli et al. (2004): Iso suomen kielioppi. (= Suomalaisen Kirjallisuuden
Seuran Toimituksia 950.) Helsinki: SKS.
Jaakola, Minna (1997): Genetiivin kanssa esiintyvien adpositioiden kieliopillistumisesta. In:
Lehtinen, Tapani/Laitinen, Lea (eds.): Kieliopillistuminen. Tapaustutkimuksia suomesta.
(= Kieli 12). Helsinki: SKS. 121–156.
Jussila, Raimo (1988): Agricolan sanasto ja nykysuomi. In: Koivusalo, Esko (ed.): Mikael
Agricolan kieli. (= Tietolipas 112). Helsinki: SKS. 203–288.
Karlsson, Fred (1983): Suomen kielen äänne- ja muotorakenne. Porvoo: WSOY.
Karlsson, Fred (2004): Finnish (Finno-Ugric). In: Booij, Geert et al. (eds.): Morphologie.
Morphology. Ein internationales Handbuch zur Flexion und Wortbildung. An international
handbook on inflection and word-formation. Vol. 2. (= Handbooks of Linguistics and
Communication Science (HSK) 17.2). Berlin/New York: De Gruyter. 1328–1342.
Karlsson, Fred (2015): Finnish. An essential grammar. Translated by Andrew Chesterman. 3rd
ed. London/New York: Routledge.
Kielitoimiston sanakirja (2006) = Grönros, Eija-Riitta et al. (2006): Kielitoimiston sanakirja. Osat
1–3. Helsinki: Kotimaisten kielten tutkimuskeskus.
Koivisto, Vesa (2013): Suomen sanojen rakenne. Helsinki: SKS.
Kolehmainen, Leena (2006): Präfix- und Partikelverben im deutsch-finnischen Kontrast.
(= Finnische Beiträge zur Germanistik 16). Berlin i. a.: Lang.
Korhonen, Jarmo (2018): Fraseologia – Kiinteiden sanayhtymien tutkimus. Helsinki: Finn
Lectura.
Lehmann, Christian (1985): Grammaticalization. Synchronic variation and diachronic change.
In: Lingua e Stile 20, 3. 303–318.
Mäkisalo, Jukka (2000): Grammar and experimental evidence of Finnish compounds. (= Studies
in Languages 35). Joensuu: University of Joensuu.
Malmivaara, Terhi (2004): Luupää, puupää, puusilmä. Näkymiä sananmuodostuksen
analogisuuteen ja bahuvriihiyhdyssanojen olemukseen. In: Virittäjä 108. 347–363.
Müller, Peter O. et al. (eds.) (2015–2016): Word-formation: An international handbook of the
languages of Europe. (= Handbooks of Linguistics and Communication Science (HSK) 40).
Berlin/Boston: De Gruyter.
Nenonen, Marja (2002): Idiomit ja leksikko. Lausekeidiomien syntaktisia, semanttisia ja
morfologisia piirteitä suomen kielessä. (= Publications in the Humanities 29). Joensuu:
University of Joensuu.
Niemi, Jussi (2009): Compounds in Finnish. In: Lingua & Linguaggio 8. 237–256.
Niinimäki, Anneli (1992): Sanaliittojen tiivistyminen yhdyssanoiksi. In: Virittäjä 96. 283–286.
Nykysuomen sanakirja (1951–1961) = Sadeniemi, Matti (1951–1961): Nykysuomen sanakirja.
Osat I–VI. Porvoo: WSOY.
Öhmann, Emil (1957): Beobachtungen über feste Verbalzusammensetzungen im Finnischen. In:
Ural-altaische Jahrbücher 29. 33–37.
Olsen, Susan (2015): Composition. In: Müller, Peter O. et al. (eds.). 364–386.
Pääkkönen, Irmeli (1989): Sanojen äänneasu ja oikeinkirjoitus. In: Vesikansa, Jouko (ed.).
357–382.
Pitkänen-Heikkilä, Kaarina (2016): Finnish. In: Müller, Peter O. et al. (eds.). 3209–3228.
Rahtu, Toini (1984): Suomen nominialkuiset yhdysverbit. In: Virittäjä 88. 409–430.
Räisänen, Alpo (1986): Sananmuodostus ja konteksti. In: Virittäjä 90. 155–163.
Sag, Ivan A. et al. (2002): Multiword expressions: A pain in the neck for NLP. In: Gelbukh,
Alexander (ed.): Computational linguistics and intelligent text processing. Third
International Conference, CICLing-2002, Mexico City, Mexico, February 17–23, 2002.
(= Lecture Notes in Computer Science 2276). Berlin i. a.: Springer. 1–15.
Sajavaara, Paula (1989): Vierassanat. In: Vesikansa, Jouko (ed.). 64–109.
Saukkonen, Pauli (1973): Suomen kielen yhdyssanojen rakenne. In: Commentationes
Fenno-Ugricae in honorem Erkki Itkonen. (= Suomalais-Ugrilaisen Seuran toimituksia 150).
332–339.
Schellbach-Kopra, Ingrid (1964): Die Bahuvrihi-Komposita in der alten finnischen
Volksdichtung. In: Suomalais-ugrilaisen seuran aikakauskirja – Journal de la Société
finno-ougrienne 65. 1–41.
Stein, Stephan (2012): Phraseologie und Wortbildung des Deutschen. Ein Vergleich von Äpfeln
mit Birnen? In: Prinz, Michael/Richter-Vapaatalo, Ulrike (eds.): Idiome, Konstruktionen,
„verblümte Rede“. Beiträge zur Geschichte der germanistischen Phraseologieforschung.
(= Beiträge zur Geschichte der Germanistik 3). Stuttgart: Hirzel. 225–240.
Suomen kielen perussanakirja (1990–1994) = Haarala, Risto (1990–1994): Suomen kielen
perussanakirja. Osat 1–3. (= Kotimaisten kielten tutkimuskeskuksen julkaisuja 55).
Helsinki: Valtion painatuskeskus.
Toropainen, Tanja (2017): Yhdyssanat ja yhdyssanamaiset rakenteet Mikael Agricolan teoksissa.
(= Turun yliopiston julkaisuja – Annales Universitatis Turkuensis C 439). Turku: University
of Turku. Internet:
https://doria.fi/bitstream/handle/10024/143331/AnnalesC%20439Toropainen.pdf?sequence=2
(last access: 25.8.2017).
Tyysteri, Laura (2015): Aamiaiskahvilasta ötökkötarjontaan. Suomen yleiskielen
morfosyntaktisten yhdyssanarakenteiden produktiivisuus. (= Turun yliopiston julkaisuja – Annales
Universitatis Turkuensis C 408). Turku. Internet:
https://doria.fi/bitstream/handle/10024/113113/AnnalesC408Tyysteri.pdf?sequence=2
(last access: 25.8.2017).
Vaittinen, Tanja (2003): Vanhan kirjasuomen yhdysverbit. In: Sananjalka 45. 45–66.
Vesikansa, Jouko (1989): Yhdyssanat. In: Vesikansa, Jouko (ed.). 213–258.
Vesikansa, Jouko (ed.) (1989): Nykysuomen sanavarat. Porvoo: WSOY.
Ferenc Kiefer/Boglárka Németh
Compounds and multi-word expressions in Hungarian
The notion of compounding is notoriously difficult to define and there are hardly
any universally accepted criteria for determining what a compound is. In the
present chapter we will make a distinction between prototypical compounds and
non-prototypical compounds. The latter but not the former are syntactically sep-
arable. All compounds are right-headed and are inflected as a whole. Moreover,
according to the received view compounds express a conceptual unit though it is
not easy to define what exactly this means. Finally, typically only the first syllable
of a compound bears stress.
Compounding is a rather late development in the history of Hungarian.
Though compounds can be found sporadically before the 18th century, during the
language reform (end of the 18th and beginning of the 19th century) new compounds were
created in large numbers, partly by using existing patterns and partly by loans mainly
from German. This explains why productive patterns of root (endocentric) com-
pounds are – as far as the categories involved are concerned – identical in Hun-
garian and German.1
The structure of our chapter is as follows: in the first part of the chapter we are
going to provide an overview of productive compounding patterns, i. e. root com-
pounds, morphologically marked compounds, deverbal compounds and coordi-
native compounds. Section 2 is devoted to the description of compound-like
phrases in Hungarian, i. e. preverb + verb constructions and bare noun + verb con-
structions. Finally, Section 3 summarizes the main conclusions of the chapter.
1 Prototypical compounds
1.1 Root compounds
1 Sections 1.1 through 1.3 and 2.2 are heavily based on our earlier works on the subject. Cf., in
particular, Kiefer (1992, 1993, 2009) and Kiefer and Németh (2018).
Open Access. © 2019 Kiefer/Németh, published by De Gruyter. This work is licensed under the
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
https://doi.org/10.1515/9783110632446-012
involve nouns and adjectives only, there are no productive patterns with adverbs
and/or verbs. All endocentric compounds in Hungarian are right-headed and are
formed by juxtaposition of the relevant lexical items. No morphological markers
appear between the constituents of root compounds. (1a–d) shows the chart of
productive patterns.2,3
(1a) N+N
város+háza
‘city hall’
tök+mag
‘pumpkin seed’
(1b) A+N
kis+autó
‘small car’
meleg+ágy
‘hotbed’
(1c) N+A
kő+kemény
‘stone hard’
oszlop+magas
‘pillar high’
(1d) A+A
sötét+zöld
‘dark green’
bal+liberális
‘left-liberal’
Recently a fifth pattern seems to be gaining ground in addition to the ones shown
in (1a–d), namely the pattern N + V. It can be argued, however, that the corre-
sponding compounds are (at least in the majority of cases) backformations from
the corresponding deverbal compounds. For some examples, cf. (2a–c).4
2 In Hungarian compounds are usually written as one word. In the examples the constituents
are written separately for the sake of clarity.
3 1 = first person; 3 = third person; acc = accusative; com = comitative; cond = conditional;
dat = dative; def = definite; inf = infinitive; instr = instrumental; intr = intransitive;
loc = locative; nmlz = nominalization; pl = plural; poss = possessive; prev = preverb;
pst = past; ptcp = participle; res = resultative; sg = singular; temp = temporal (terminative).
4 Cf. also Ladányi (2007: 64 f.).
(2a) N+V
gép+ír
machine write
‘write on a typewriter’
from gép+ír-ás5
machine writing
‘typing’
(2b) ház+kutat
house search (verb)
from ház+kutat-ás
house search (noun)
(2c) tömeg+közlekedik
mass run
from tömeg+közleked-és
mass/public transportation
Similar examples are legion. It should be noted, however, that compounds such
as (2a–c) are more frequent in everyday and newspaper language than in literary
language.
5 -ás/-és is a nominalizing suffix; the choice between the two forms is determined by vowel
harmony. The usual phonological notation is -Vs, where V denotes the harmonizing vowel,
i. e. -ás or -és.
6 This contrasts with phrases such as könyv-et néz book acc look ‘look at a book, on books’,
kép-et néz ‘look at a picture, on pictures’, which are not compound-like since they do not share
any property of compounds. Cf. Section 2.2 for a more detailed discussion of ‘bare object noun +
verb’ constructions.
(3a) vihar+ver-t-e
storm+beat-ptcp-3sg
‘storm-beaten’
(3b) víz+mos-t-a
water+wash-ptcp-3sg
‘water-lashed’
Once again the participial head adjective of the compound is not an independent
word: *verte, *mosta.8 At first sight it would seem that in these compounds the
first member satisfies the subject argument of the deverbal head. However, such
an analysis would run counter to the received view that subject arguments cannot
be satisfied in compound structure (cf., for example, Di Sciullo/Williams 1987).
The analysis of N+A constructions with participial heads as verbal compounds is
not mandatory, however. It can be argued that these constructions are participial
constructions rather than genuine compounds (cf. Kenesei 1986). Productive
participial constructions must be distinguished from frozen ones: while the former
can freely be modified, modification is impossible in the latter case. Compounds
such as víz+mosta ‘water-lashed’, por+lepte ‘covered with dust’ are frozen expres-
sions. In contrast, an expression such as (4),
(4) munkás+lak-t-a
worker+inhabit-ptcp-3sg
‘inhabited by workers’
1.3 Deverbal compounds
Deverbal compounds are special and have received much attention in the perti-
nent literature because there is a clear argument-head relationship between the
elements of the compound. In this case two questions need to be answered: (i)
what kind of arguments can the head inherit from its base; (ii) which arguments
can be satisfied by the nonhead.
Nouns can be derived from verbs by means of the suffix -ás and in a consid-
erable number of cases the derived nouns can be interpreted as event nouns, e. g.
ír-ás ‘writing’ (from the verb ír ‘write’), olvas-ás ‘reading’ (from the verb olvas
‘read’).9 If such an event noun occurs as the head of a compound the nonhead can
be interpreted as an argument of the verb. Apparently in the case of a deverbal
noun derived from a transitive verb the only argument which can occur in non-
head position is the object argument:
(5a) levél+ír-ás
letter+write-nmlz
‘letter writing’
(5b) könyv+olvas-ás
book+read-nmlz
‘book reading’
9 In the case of resultative verbs the derived nominal may be ambiguous between the action and
result reading. The deverbal noun írás may mean the activity of writing but also the result of
writing.
(7a) If the deverbal head of a compound is derived from a transitive verb the
only argument which can occur in nonhead position is the object
argument.
(7b) No other internal argument can occur in compounds.
(8a) hó+es-és
snow+fall-nmlz
‘snowfall’
(8b) motor+zúg-ás
engine+buzz-nmlz
‘hum of the engine’
(8c) dió+ér-és
walnut+ripen-nmlz
‘ripening of walnuts’
(9a) liba+gágog-ás
goose+gaggle-nmlz
‘gaggling of a goose’
(9b) kutya+ugat-ás
dog+bark-nmlz
‘barking of a dog’
(9c) gyermek+sír-ás
child+cry-nmlz
‘crying of a child’
(10a) ár+csökken-és
price+decrease.intr-nmlz
‘drop in prices’
ár+drágul-ás
price+go.up-nmlz
‘rise of prices’
(10b) ár+csökken-t-és
price+decrease-acc-nmlz
‘reduction of prices’
ár+drágít-ás
price+raise-nmlz
‘raising of prices’
The examples in (10a–b) demonstrate the difference between a head derived from
an intransitive and a head derived from a transitive verb. In (10a) the nonhead
can only be interpreted as the actor argument of the verbal base. In contrast the
head in (10b) is derived from a transitive verb, hence the nonhead is interpreted
as the object argument of the verbal base.
There are a number of compounds in which the nonhead looks very much
like an actor argument but it can be shown that the relation between nonhead
and head can only be interpreted conceptually but not syntactically. Consider:
(11a) bolha+csíp-és
flea+sting-nmlz
‘flea-bite’
(11b) kutya+harap-ás
dog+bite-nmlz
‘dog-bite’
(11c) disznó+túr-ás
pig+root-nmlz
‘rooting of pigs’
In the examples in (11) the head noun is a result nominal (referring to the result of
biting or rooting) which has not inherited the argument structure of the base
verb, hence argument satisfaction does not arise. The properties of result nominals
are well known from the relevant literature and will not be repeated here.
Suffice it to mention that result nominals are incompatible with durative temporal
adverbials while action nominals are compatible with them.
Before embarking on the discussion of coordinative compounds it should be
made clear that deverbal compounds can also be formed by means of the participial
suffixes -ó10 (present participle) and -t (past participle), e. g. dió+darál-ó
‘nut grinder’ and sertés+sül-t ‘roast pork’ (from sül ‘roast’).
1.4 Coordinative compounds
(12a) ad-vesz (from ad ‘give’ + vesz ‘buy’) ‘mart, buy and sell’
jön-megy (from jön ‘come’+ megy ‘go’) ‘come and go, fidget’
üt-ver (from üt ‘hit’+ ver ‘beat’) ‘beat, pound’
10 Denoting the suffixes -ó or -ő where once again the choice is determined by vowel harmony.
The ill-formed examples in (12b) above are meant to demonstrate the limited pro-
ductivity of the construction type: the compounds in (12a) are all fully lexicalized,
frozen items, while derivation from other non-bound elements seems to be rather
problematic.
Another type of coordinative compounds is derived by lexical reduplication,
which has several subcategories, as shown in (13).
Another type of lexical reduplication is when the base is copied with some
kind of modification: either an initial consonant of the base is replaced by another
one (cf. 13c), or there is a vowel alternation pattern similar to ablaut (cf. 13d).
Brdar/Brdar-Szabó (2014: 39 f.) label the former phenomenon as inexact total
reduplication or rhyming(-motivated) reduplication, and the latter as
ablaut-motivated reduplication. Finally, the examples in (13e) are instances of
partial reduplication, where only a segment of the base is copied (ibid.: 39).
Note that in these cases, too, the semantic feature added to the base is inten-
sification, and the compounds mainly serve as stylistic versions of their bases:
they mostly express the endearing attitude of the speaker, thus they should be
dealt with in a morphopragmatic framework as well.
2 Compound-like phrases
We have already mentioned some cases of non-prototypical compounds; in the
present section a more detailed analysis of such constructions will be provided.
In Hungarian, preverbs (particles attached to the verb base) are all separable and
can fulfil various functions. If fully grammaticalized, they express telicity, the
most typical being the preverb meg which has completely lost its original mean-
ing and has become an aspectual marker. Among other things, it can express the
resultative Aktionsart as in the case of főz ‘cook’ – meg+főz ‘cook.res’, varr ‘sew’
– meg+varr ‘sew.res’ or the semelfactive Aktionsart as in vakar ‘scrape’ – meg+
vakar ‘scrape once’, csóvál ‘wag’ – meg+csóvál ‘wag once’.
Most preverbs are less grammaticalized yet they can be used to derive an
Aktionsart. For example, the preverb el (whose original directional meaning is
‘away’) can be used to express inchoativity if it is accompanied by the reflexive
pronoun magát ‘self’, e. g. ordít ‘shout, cry’ – el+ordítja magát ‘cry out’ or nevet
‘laugh’ – el+neveti magát ‘burst out laughing’. In addition to meg some other orig-
inally directional preverbs can be used to express resultativity: takarít ‘tidy, clean’
– ki+takarít ‘clean up’, gereblyéz ‘rake’ – fel+gereblyéz ‘rake up’, kaszál ‘scythe’
– le+kaszál ‘scythe.res’, költ ‘spend’ – el+költ ‘spend.res’.
At first sight Aktionsart-formation may seem to belong to derivational mor-
phology. This would, however, contradict several generalizations concerning der-
ivational morphology in Hungarian. First, derivational affixes harmonize with
the verbal stem (szép-ség ‘beauty’, jó-ság ‘goodness’); in contrast, preverbs never
harmonize.11 Second, derivational affixes may change the part of speech category
of the base which is not the case with preverbs. Third, derivational affixes are
bound morphemes. On the other hand, preverbs can be detached from their base.
First, they can be used in short answers to a question without their base as in
(14–15) below.
Moreover, preverbs can freely be moved to various positions in the sentence, cf.
the variants of (15a) in (16a–c).
We may thus conclude that the formation of complex verbs cannot be part of der-
ivational morphology. On the other hand, preverb+verb constructions are not
prototypical compounds either, at least not with respect to their behavior vis-a-
vis syntax. In other words, their internal structure is accessible to syntactic rules.
Yet they are compounds semantically as testified, among other things, by the
large number of lexicalized forms. It should also be noted that a large number of
preverbs are indistinguishable from the formally identical adverbs.
An interesting property of the Hungarian preverbs is that they can be redupli-
cated to express iterativity.12 Consider:
11 Preverbs with a front vowel such as ki can easily be attached to back vowel stems as in ki+mar
‘corrode’, ki+old ‘undo’, ki+rúg ‘kick out’.
12 Iterativity can also be expressed by the verbal suffix -gat which is, however, semantically
radically different from the iterativity expressed by preverb reduplication.
These properties seem to suggest that reduplicated forms are words not only
semantically but also syntactically. First, they have a specific meaning (to do
something repeatedly); second, syntactic rules cannot change their internal structure.
Preverb reduplication is not possible across the board: it must obey a phono-
logical and several semantic constraints. The phonological constraint refers to
the length of the preverb in terms of the number of syllables: preverbs longer than
two syllables cannot be reduplicated, as shown by (20).
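The phonological constraint can be sketched in a few lines of Python. This is only an illustration under stated assumptions: syllables are counted as vowel letters (each Hungarian vowel is taken to be one syllable nucleus), the function names are hypothetical, the hyphenated spelling of the reduplicated form is an assumed convention, and the three-syllable preverb keresztül ‘through’ is an illustrative item not taken from the examples in this chapter. The semantic constraints are not modelled.

```python
# Hungarian vowel letters (precomposed/NFC form assumed for the input).
HUNGARIAN_VOWELS = set("aáeéiíoóöőuúüű")


def syllable_count(word: str) -> int:
    """Approximate the syllable count as the number of vowel letters."""
    return sum(1 for ch in word.lower() if ch in HUNGARIAN_VOWELS)


def passes_length_constraint(preverb: str, max_syllables: int = 2) -> bool:
    """Phonological constraint only: at most two syllables."""
    return syllable_count(preverb) <= max_syllables


def reduplicate(preverb: str):
    """Return the reduplicated preverb, or None if the length constraint fails.

    The hyphenated output form is an assumption made for this sketch.
    """
    if not passes_length_constraint(preverb):
        return None
    return f"{preverb}-{preverb}"


if __name__ == "__main__":
    # meg, el, ki, fel, le occur in this section; keresztül is an added illustration.
    for pv in ["meg", "el", "ki", "fel", "le", "keresztül"]:
        print(pv, syllable_count(pv), reduplicate(pv))
```

A preverb that passes this check may still resist reduplication for semantic reasons, so the function gives a necessary condition, not a sufficient one.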
According to the literature (Kiefer 1990; Farkas/de Swart 2003), Hungarian bare
noun + verb constructions (in short, BNV constructions) are instances of type I
noun incorporation in terms of Mithun (1984). Mithun describes the phenomenon
as a type of compounding where a verb and a noun with the semantic function of
patient, location or instrument combine to form a new complex verb. The eventu-
ality designated by the BNV construction is not just a random co-occurrence of an
entity and an eventuality, but it is perceived as a recognizable, unitary concept
worth labelling (cf. Mithun 1984: 848 f.).
We consider the Hungarian BNV construction type as a special case of com-
pounding by juxtaposition, the general characteristics of which are briefly cap-
tured by Mithun as follows:
A number of languages contain a construction in which a V and its direct object are simply
juxtaposed to form an especially tight bond. The V and N remain separate words phonolog-
ically; but as in all compounding, the N loses its syntactic status as an argument of the
sentence, and the VN unit functions as an intransitive predicate. The semantic effect is the
same as in other compounding: the phrase denotes a unitary activity, in which the compo-
nents lose their individual salience. (ibid.: 849)
(23) */?Péter újságot olvas, és elégedett vele.
Peter newspaper.acc read and content instr
*/?Péter zenét hallgat, és elégedett vele.
Peter music.acc listen and content instr
*/?Péter tanulmányt ír, és elégedett vele.
Peter article.acc write and content instr
*/?Péter keresztrejtvényt fejt, és elégedett vele.
Peter crossword.acc solve and content instr
*/?Péter ruhát próbál, és elégedett vele.
Peter outfit.acc try on and content instr
‘Peter is reading (a) newspaper(s) / listening to music / writing an article /
solving (a) crossword puzzle(s) / trying on (an) outfit(s), and he is content
with it.’
As pointed out by Kiefer (1990: 153 f.) and shown in (22) above, Hungarian BNVs
form one single phonological unit from the point of view of stress assignment
(i. e., only the subject and the incorporated object bear stress on their first sylla-
ble, cf. 22a), while their V + DP counterparts show the opposite pattern (i. e., the
subject, the verb and the direct object all bear separate stress on their first sylla-
ble, cf. 22b). The ill-formedness of some of the constructions in (23) is due to the
fact that some of these BNVs, namely keresztrejtvényt fejt ‘solve crossword puz-
zles’ and ruhát próbál ‘try on outfits’ seem to be lexicalized units without exact
syntactic paraphrases, e. g. V + DP counterparts.
One of the key semantic features of direct object incorporation, often men-
tioned in the literature (cf. Mithun 1984; Kiefer 1990; Farkas/de Swart 2003), is
the non-referentiality of the bare object noun, which means that the nouns in
these BNV constructions do not denote any specific, identifiable entity in the
(24a) Péter érdekes újságot olvas, és elégedett vele.
Peter interesting newspaper.acc read and content instr
Péter érdekes tanulmányt ír, és elégedett vele.
Peter interesting article.acc write and content instr
‘Peter is reading an interesting newspaper / writing an interesting article,
and he is content with it.’13
(24b) Péter egy érdekes újságot olvas, és elégedett vele.
Peter a interesting newspaper.acc read and content instr
Péter egy érdekes tanulmányt ír, és elégedett vele.
Peter a interesting article.acc write and content instr
‘Peter is reading an interesting newspaper / writing an interesting article,
and he is content with it.’
The constructions in (24a) above are meant to demonstrate the effects of modifi-
cation on BNV constructions. The inserted adjective overrides the non-referenti-
ality property of the object noun and – as a consequence – the complex eventual-
ity meaning of the BNVs. This means that we are dealing with at least two different
construction types from the point of view of semantics and discourse transpar-
ency, as shown by the fact that, contrary to the case of (23), the modified version
of the construction admits the insertion of an anaphoric pronominal constituent
into the sentence. As noted in Kiefer (1990: 152), the constructions like those in
(24a) seem to be some kind of stylistic variants of the full-fledged construction
types shown in (24b).
The number neutrality of the singular incorporated noun is another impor-
tant characteristic of BNVs, and it is strongly connected to the above mentioned
non-referentiality feature. As Farkas/de Swart (2003: 13 f.) point out, morpholog-
ically singular incorporated nouns are compatible with both atomic and non-
atomic interpretations. Most of the examples in (22a) above are underspecified
regarding the number of objects involved in the eventualities described by the
BNVs. The singular noun in the BNV újságot olvas ‘read (a) newspaper(s)’, for
instance, allows for both an atomic (singular) and a non-atomic (plural) interpre-
tation, i. e. the BNV does not specify whether Peter is reading one newspaper or
several newspapers one after the other. As shown by the examples in (25) below,
the varying interpretations are influenced by pragmatic (contextual) information.
The BNV in (25a) triggers an atomic interpretation due to extralinguistic
knowledge about marriage-related customs (though it would allow for a non-atomic
interpretation in the context of legal bigamy); the one in (25b) clearly triggers
an atomic interpretation (without any cultural variation); finally, the one in
(25c) unambiguously triggers a non-atomic interpretation.
As far as plural bare objects are concerned, the following generalization holds:
plural bare object nouns form grammatical BNVs; however, as shown in (26)
below, their discourse transparency properties are similar to those of modified
singular objects, as shown in (25a) above.
Finally, a distinction must be made between fully productive and idiomatic cases.
As pointed out in Kiefer (1990), the meaning of idiomatic BNVs cannot be derived
from a corresponding free construction (cf. the examples in (27)–(28) below),
while fully productive BNVs generally have matching syntactic paraphrases as
already demonstrated by the examples in (23a–b) above.
The difference between the lexicalized BNVs in (27a–c) and (27d) is that the
former type cannot be grammatically matched with a syntactic paraphrase (cf.
(28a–c)), while the latter construction type has a well-formed syntactic
paraphrase. Synchronically, however, this paraphrase has nothing to do with the
meaning of its BNV counterpart (compare (27d) and (28d)).
As mentioned above, the most prominent and universal semantic and prag-
matic feature of BNVs is that the eventuality designated by the construction has
to be perceived as a recognizable, unitary concept worth separately labelling.
This ‘institutionalized’ character of the complex activity expressed by the BNV
seems to be a strong criterion regarding the derivation of the construction type.
Thus it does not come as a surprise that not all bare objects are admitted in BNV
constructions with equal ease. Consider the examples in (29b) and (29d) which,
as opposed to those in (29a) and (29c), are odd on their generic reading.
The oddness of (29b) is caused by the fact that, generally speaking, reading
packages is not considered a recognizable, recurring complex eventuality. The BNV
in question becomes acceptable, however, if matched with a proper context: if, for
example, the participants of the speech situation know that Mari has a habit of
reading the packages of meat products in order to avoid certain ingredients.
The same holds true for (29d): waiting for the end of the world is generally
not perceived as an ‘institutionalized’ activity; nevertheless, the use of the BNV is
justified in a context where it is known that the Virágs have prepared for the end
of the world on several occasions in the past due to false predictions.
These types of marginal examples show that, although there may be some
pragmatic factors that influence the derivation of BNVs, if the contextual factors
match the corresponding pragmatic criteria, even seemingly odd BNVs will be
considered well-formed.
Finally, mention must be made of the aspectual restrictions filtering the
range of input verbs. The generalization seems to be as follows: activity/process
verbs, i. e. [+dynamic, –telic] verbs potentiate well-formed BNVs, while accom-
plishment and achievement verbs, i. e. [+dynamic, +telic] verbs as well as stative,
14 We use the terms activity, achievement, accomplishment and state according to the Vendleri-
an tradition well known in the literature on aspect. Vendler (1967) isolated four situation types:
states (e. g. love, know, etc.), activities (e. g. run), achievements (e. g. reach the summit) and ac-
complishments (e. g. draw a circle). For more on these aspectual categories, cf. Smith (1991), Ten-
ny (1994), Kiefer (2006), etc.
According to these examples, the above generalization seems to hold true for
Hungarian BNVs. The constructions in (30a–d) derived from telic verbs are
ungrammatical, although a distinction should be made between prefixed and
unprefixed telic verbs, as the latter are invariably ungrammatical in these con-
structions, while in some cases the former may serve as acceptable input verbs
(as shown in (31a–b) below).15 The ungrammatical BNVs like those in (30e–g)
lead to the conclusion that stative verbs are indeed excluded from the range of
possible input verbs, however, as shown in (31d–e), we may find some grammat-
ical BNVs derived from stative verbs as well.
15 The distributional properties of these verb classes are captured in Kiefer (1990: 169) as fol-
lows: “Syntactically, both the bare noun and the prefix belong to the same class of elements, of-
ten referred to as preverb since under normal circumstances an element of this class occupies the
position immediately preceding the verb. Consequently, two preverbs can never co-occur.”
The well-formed examples in (31) violate the aspectual criteria formulated above,
so we need to take a closer look at the semantic and pragmatic features of these
BNVs. The sentences in (31a–b) contain BNVs derived from telic verbs, while the
ones in (31d–e) contain stative verbs. The example in (31c), contrasted with (30d),
is meant to demonstrate how contextual non-atomicity entailments induce
aspectual coercion in the case of punctual verbs: the BNV triggers an iterative
interpretation, whereas with an atomic interpretation it would be considered
ill-formed, like the one in (30d) above. Conversely, the BNV poharat tör ‘break
glasses’ becomes well-formed with an iterative and habitual interpretation.
The common feature of these BNVs is that they all denote institutionalized,
re-occurring eventualities. The institutionalized nature of the eventualities
expressed by (31a–b) is also shown by their contrast with the constructions in
(30b–c) above: in football, touching the ball with one’s hand is a frequent, pun-
ishable occurrence. The same institutionalized character holds true for the even-
tuality of calling an ambulance and for the stative predicates in (31d–e).
Based on these observations, we conclude that the aspectual criterion
described above should be reduced to a remark regarding the prevalence of
process verbs in BNVs: the range of verbs which (potentially) denote
institutionalized eventualities strongly overlaps with the category of process
verbs, but some telic and stative verbs also describe eventualities which satisfy
the pragmatic criterion controlling BNV formation.
3 Summary
In the present paper we have summarized the most important facts concerning
compounds and compound-like phrases (= non-prototypical compounds) in
Hungarian. We have concentrated on the productive, or at least regular patterns
of compounding and derivation of compound-like constructions. In particular,
we have stressed the features which deviate from “Standard Average European”.
Some such features can be found in deverbal compounds as well, e. g. the
subject argument can be satisfied within the compound, which does not
seem to be the case in Germanic or Romance. However, the most striking feature
of Hungarian compounding is the existence of bare noun constructions and their
relation to verbal aspect.
References
Brdar, Mario/Brdar-Szabó, Rita (2014): Syntactic reduplicative constructions in Hungarian (and
elsewhere): Categorization, topicalization and concessivity rolled into one. In: Rundblad,
Gabriella et al. (eds.): Selected Papers from the 4th UK Cognitive Linguistics Conference.
London: UK Cognitive Linguistics Association. 36–51.
Di Sciullo, Anna Maria/Williams, Edwin (1987): On the definition of word. Cambridge, MA: The
MIT Press.
Farkas, Donka/de Swart, Henriëtte (2003): The semantics of incorporation: From argument
structure to discourse transparency. Stanford, CA: CSLI Publications.
Kenesei, István (1986): On the role of the agreement morpheme in Hungarian. ALH 86, 1–4.
104–120.
Kiefer, Ferenc (1990): Noun incorporation in Hungarian. In: Acta Linguistica Hungarica 40, 1–2.
149–177.
Kiefer, Ferenc (1992): Compounding in Hungarian. Rivista di Linguistica 4, 1. 45–55.
Kiefer, Ferenc (1993): Thematic roles and compounds. Folia Linguistica 27, 1–2. 25–55.
Kiefer, Ferenc (2000): A szóösszetétel [Compounds]. In: Kiefer, Ferenc (ed.): Strukturális magyar
nyelvtan 3. Morfológia. Budapest: Akadémiai Kiadó. 519–568.
Kiefer, Ferenc (2006): Aspektus és akcióminőség – különös tekintettel a magyar nyelvre [Aspect
and Aktionsart – with special emphasis on Hungarian]. Budapest: Akadémiai Kiadó.
Kiefer, Ferenc (2009): Compounding in Hungarian. In: Lieber, Rochelle/Stekauer, Pavol (eds.):
Oxford handbook of compounding. Oxford: Oxford Handbooks. 527–541.
Kiefer, Ferenc/Németh, Boglárka (2018): Aspectual constraints on noun incorporation in
Hungarian. In: Zoltán, Huba Bartos/den Dikken, Marcel/Váradi, Tamás (eds.): Boundaries
crossed: Studies of the crossroads of morphosyntax, phonology, pragmatics, and
semantics. Berlin: Springer. 21–32.
Ladányi, Mária (2007): Produktivitás és analógia a szóképzésben [Productivity and analogy in
word formation]. Budapest: Tinta Könyvkiadó.
Maleczki, Márta (1994): Bare common nouns and their relation to the temporal constitution of
events in Hungarian. In: Dekker, Paul/Stokhof, Martin (eds.): Proceedings of the Eighth
Amsterdam Colloquium. Amsterdam: Institute for Logic, Language and Computation,
University of Amsterdam. 347–365.
Mithun, Marianne (1984): The evolution of noun incorporation. Language 60. 847–895.
Smith, Carlota (1991): The parameter of aspect. Dordrecht: Springer.
Tenny, Carol L. (1994): Aspectual roles and the syntax-semantics interface. Dordrecht: Springer.
Vendler, Zeno (1967): Verbs and times. In: Linguistics in philosophy. Ithaca, NY: Cornell
University Press. 97–121.