Sonnenhauser, Ismajli and Widmer, Language Dynamics and Change (2023), 1-26, doi:10.1163/22105832-bja10027
Migration events splitting speaker communities and establishing novel contact situations are among the major drivers of language variation and change. While the precise processes that lead to change cannot usually be determined for past events with any certainty, the study of minority and heritage language usage in apparent time may provide insight into the contribution of the linguistic behavior underlying the dynamics. We capitalize on this and compare parts of speech usage in Pear Story renarrations across Gheg Albanian speakers of three generations in German-speaking environments, applying methods from information theory. The results suggest that the changing conventions in parts of speech usage across generations and places of residence can be attributed to changing linguistic behavior within the speaker community in the migration setting. These findings highlight the impact of changing sociocultural embedding and the roles of vertical and horizontal transmission in language change.
Previous work suggests that when speakers linearize syntactic structures, they place longer and more complex dependents further away from the head word to which they belong than shorter and simpler dependents, and that they do so with increasing rigidity the longer expressions get. For example, longer objects tend to be placed further away from their verb, and with less variation. Current theories of sentence processing furthermore make competing predictions on whether longer expressions are preferentially placed as early or as late as possible. Here we test these predictions using hierarchical distributional regression models that allow estimates of word order and word order variation at the level of individual dependencies in corpora from 71 languages, while controlling for confounding effects from the type of dependency (e.g., subject vs. object) and the type of clause (main vs. subordinate) involved, as well as from trends that are characteristic of individual languages, language families, and language contact areas. Our results show the expected correlations of length with position and variation only for two out of six dependency types (obliques and nominal modifiers) and no difference between clause types. These findings challenge received theories of across-the-board effects of complexity on word order and word order variation and call for theoretical models that relativize effects to specific kinds of syntactic structures and dependencies.
Philosophen im römischen Legionslager von Argentorate? Überlegungen zum Wanddekor mit Ritzinschriften in den Offiziersquartieren.
A critical feature of language is that the form of words need not bear any perceptual similarity to their function: these relationships can be 'arbitrary'. The capacity to process these arbitrary form-function associations facilitates the enormous expressive power of language. However, the evolutionary roots of our capacity for arbitrariness, i.e. the extent to which related abilities may be shared with animals, are largely unexamined. We argue this is due to the challenges of applying such an intrinsically linguistic concept to animal communication, and address this by proposing a novel conceptual framework highlighting a key underpinning of linguistic arbitrariness, which is nevertheless applicable to non-human species. Specifically, we focus on the capacity to associate alternative functions with a signal, or alternative signals with a function, a feature we refer to as optionality. We apply this framework to a broad survey of findings from animal communication studies and identify five key dimensions of communicative optionality: signal production, signal adjustment, signal usage, signal combinatoriality and signal perception. We find that optionality is widespread in non-human animals across each of these dimensions, although only humans demonstrate it in all five. Finally, we discuss the relevance of optionality to behavioural and cognitive domains outside of communication. This investigation provides a powerful new conceptual framework for the cross-species investigation of the origins of arbitrariness, and promises to generate original insights into animal communication and language evolution more generally.
Phylogenetic trees are a central tool for studying language evolution and have wide implications for understanding cultural evolution as a whole. For example, they have been the basis of studies on the evolution of musical instruments, religious beliefs and political complexity. Bayesian phylogenetic methods are transparent regarding the data and assumptions underlying the inference. One of these assumptions, that languages change independently, is incompatible with the reality of language evolution, particularly with language contact. When speakers interact, languages frequently borrow linguistic traits from each other. Phylogenetic methods ignore this issue, which can lead to errors in the reconstruction. More importantly, they neglect the rich history of language contact. A principled way of integrating language contact in phylogenetic methods is sorely missing. We present contacTrees, a Bayesian phylogenetic model with horizontal transfer for language evolution. The model efficiently infers the phylogenetic tree of a language family and contact events between its clades. The implementation is available as a package for the phylogenetics software BEAST 2. We apply contacTrees in a simulation study and a case study on a subset of well-documented Indo-European languages. The simulation study demonstrates that contacTrees correctly reconstructs the history of a simulated language family, including simulated contact events. Moreover, it shows that ignoring contact can lead to systematic errors in the estimated tree height, rate of change and tree topology, which can be avoided with contacTrees. The case study confirms that contacTrees reconstructs known contact events in the history of Indo-European and finds known loanwords, demonstrating its practical potential.
The model has a higher statistical fit to the data than a conventional phylogenetic reconstruction, and the reconstructed tree height is significantly closer to well-attested estimates. Our method closes a long-standing gap between the theoretical and empirical models of cultural evolution. The implications are especially relevant for less documented language families, where our knowledge of past contacts and linguistic borrowings is limited. Since linguistic phylogenies have become the backbone of many studies of cultural evolution, the addition of this integral piece of the puzzle is crucial in the endeavour to understand the history of human culture.
In Blasi et al. (2019) we showed, through a series of statistical analyses and models, that human sound systems have been affected by a transition in bite configuration starting from the Neolithic. Tarasov and Uyeda (2020) (henceforth T&U) raise a number of observations in relation to our article. We appreciate T&U’s engagement with our work and their sharing of the code and data of the analyses reported. In brief, their technical comment involves five analyses: (1) Binomial Causal Models (BCM); (2) Linear Regression of across-area variation in labiodentals and subsistence; (3) Predictive Posterior Simulations (PPS); (4) Poisson Linear Regression (PLR): model comparison; (5) Phylogenetic Analyses. In what follows, we show that the discrepancies they report between our findings and theirs are due mostly to ill-specified models, weak (or missing) statistical evidence, and a misinterpretation of our results. After these issues are addressed, we conclude that T&U’s claims do not hold.
This paper investigates the origins of sortal numeral classifiers in the Indo-Iranian languages. While these are often assumed to result from contact with non-Indo-European languages, an alternative possibility is that classifiers developed as a response to the rise of optional plural marking. This alternative is in line with the so-called Greenberg-Sanches-Slobin (henceforth GSS) generalization. The GSS generalization holds that the presence of sortal numeral classifiers across languages is negatively correlated with obligatory plural marking on nouns. We assess the extent to which Indo-Iranian classifier development is influenced by loosening of restrictions on plural marking using a sample of 65 languages and a Bayesian phylogenetic model, inferring posterior distributions over evolutionary transition rates between typological states and using these rates to reconstruct the history of classifiers and number marking throughout Indo-Iranian, constrained by historically attested sta...
Journal of South Asian Languages and Linguistics, 2022
The present study provides a survey of the semantics of depictive (in a broad sense, including circumstantials) adjectival compounds in Vedic Sanskrit. Following the typology of depictive constructions developed by Himmelmann and Schultze-Berndt (2005), we structure our classification along the semantic fields such expressions tend to occur in. Our results show that in Vedic, the use of depictive adjectival compounds spans (almost) the whole gamut of functions reported for depictives in cross-linguistic studies. In Vedic, depictive compounds rank on a par with other strategies of non-finite event elaboration such as participles, verbal adjectives, and action nouns.
In many cases of apparent contact-induced change the contribution of genealogical correlation in the language sample and its interaction with processes such as matter and pattern replication are difficult to specify. In order to get a better sense of the relevance of shared ancestry, we quantify the change in similarity since the late Middle Ages in a sample of Romance and Germanic languages with data from a selected grammatical domain (expression of reflexivity). We compare their dynamics to patterns of change of similarity in two contact zones in Europe, namely the British Isles (Dedio et al. 2019) and the Balkans. Concerning the genealogical signal, the results indicate a maintenance and gain of similarity in Romance as opposed to a loss of similarity in Germanic. This hints at the importance of the inherited states, the time since the split from the common ancestor, and subsequent developments. We presume that these factors are likely to be at the origin of the maintenance and increase in similarity observed for the sampled Romance varieties. While this result cannot be generalised beyond the specific case study presented here, the basic approach will contribute to a better understanding of how contact, genealogy and culture interact in shaping the dynamics of linguistic similarity.
In this paper we introduce an extended version of the Vedic Treebank (VTB, Hellwig et al. 2020), which comes with revised and extended annotation guidelines. In order to assess the quality of our annotations as well as the usability and limits of the guidelines, we performed an inter-annotator agreement test. The results show that agreement between annotators is hampered by various factors, most prominently by insufficient understanding of the content because of the cultural and temporal gap and by incomplete knowledge of Vedic grammar. An in-depth discussion of disagreeing annotations demonstrates that the setup of the workflow, too, has a major influence on inter-annotator agreement. We suggest some measures that can help increase the transparency and annotation consistency according to current knowledge of the language when annotating Vedic Sanskrit, or ancient language varieties in general.
Maiores philologiae pontes. Festschrift für Michael Meier-Brügger zum 70. Geburtstag.
Unlike the sparsely attested determinative compounds (AiGr., II.2, 241), the RV is known to offer a vast wealth of material for adjectivally used (exocentric) compounds (Scarlata & Widmer 2015). Entirely parallel to "simple" adjectives, these compounds are used as adnominal attributes and epithets but also, as Delbrück already noted (Delbrück 1878, 54), as equivalents of adverbial subordinate clauses, participial constructions, (restrictive) relative clauses, absolutives, and more. Taking a selected passage of the RV (5.8.3) that contains a whole series of exocentric compounds, this contribution discusses to what extent further compounds can depend recursively on such formally dependent constructions.
Linguistics 58(3) (Special Issue: Shades of Partitivity: Formal and areal properties), 745-766, 2020
We discuss a potential case of borrowing in this paper: Breton a 'of', 'from' marking of (internal) verbal arguments, unique in Insular Celtic languages, and reminiscent of Gallo-Romance de/du (and en) arguments. We look at potential Gallo-Romance parallels of three Middle Breton constructions analyzed in some detail: a with indefinite mass nominals in direct object position; a-marking of internal arguments under the scope of negation; and a [allomorphs an(ez)-/ahan-] with personal pronouns for internal arguments, subjects (mainly of predicative constructions) and as expletive subjects of existential constructions. We demonstrate that even if there are some semantic parallels and one strong structural overlap (a and de under the scope of negation), the number of divergences in morphology, syntax and semantics and the only partially fitting relative chronology of the different constructions do not allow us to conclude with certainty that language contact is an explanation of the Breton facts, which might also have come into being because of internal change (bound to restructuring of the pronominal system in Breton). More research is necessary to complete our knowledge of a-marking in Middle Breton and Modern Breton varieties and of the precise history of French en, in order to decide for one or the other explanation.
Convergence by loss is a concept that is often adduced to characterize the Balkans as a linguistic area and to substantiate the areality of particular linguistic features, developments and varieties. Time and again, it has been pointed out that however useful this concept may be for certain purposes, e.g., when descriptively stating differences between historical stages of one specific variety, it is problematic for others, in particular for comparing languages and assessing areality. In addition to implying the undisputed existence of categorial distinctions, applying this concept indiscriminately obscures the fact that its manifestations may differ substantially across features and languages. Furthermore, focusing on “loss” impedes insight into both more general and more specific processes. Using the examples of case and the infinitive in the standard norms of Albanian and Macedonian, this article acts on these intuitions and elaborates a finer-grained approach that avoids both the assumption of generally applicable categorial distinctions and the neglect of differences beneath seemingly identical surface phenomena. By decomposing linguistic units into their constitutive morphosyntactic features, it becomes possible to sketch the interaction of morphosyntactic exponents in expressing characteristic functions, such as the selection of grammatical relations or the licensing of constituency. This provides a solid empirical basis for comparing morphosyntactic patterns across languages in synchronic and diachronic respects and may be operationalized for assessing the areality of particular developments.
This paper introduces the first treebank of Vedic Sanskrit, a morphologically rich ancient Indian language that is of central importance for linguistic and historical research. The selection of the 4,000 sentences contained in this treebank reflects the development of metrical and prose texts over a period of 600 years. We discuss how these sentences are annotated in the Universal Dependencies scheme and which syntactic constructions required special attention. In addition, we describe a syntactic labeler based on neural networks that supports the initial annotation of the treebank, and whose evaluation can be helpful for setting up a full syntactic parser of Vedic Sanskrit.
Approaches to linguistic areas have largely focused either on purely qualitative investigation of area formation processes, on quantitative and qualitative exploration of synchronic distributions of linguistic features without considering time, or on theoretical issues related to the definition of the notion "linguistic area". What is still missing are approaches that supplement qualitative research on area formation processes with quantitative methods. Taking a bottom-up approach, we bypass notional issues and propose to quantify area formation processes by a) measuring the change in linguistic similarity given a geographical space, a socio-cultural setting, a time span, a language sample, and a set of linguistic data, and b) testing the tendency and magnitude of the process using Bayesian inference. Applying this approach to the expression of reflexivity in a dense sample of languages in northwestern Europe from the early Middle Ages to the present, we show that the method yields robust quantitative evidence for a substantial gain in linguistic similarity that sets the languages of Britain and Ireland apart from languages spoken outside Britain and Ireland and cross-cuts lines of linguistic ancestry.
Linguistic diversity, now and in the past, is widely regarded to be independent of biological changes that took place after the emergence of Homo sapiens. We show converging evidence from paleoanthropology, speech biomechanics, ethnography, and historical linguistics that labiodental sounds (such as “f” and “v”) were innovated after the Neolithic. Changes in diet attributable to food-processing technologies modified the human bite from an edge-to-edge configuration to one that preserves adolescent overbite and overjet into adulthood. This change favored the emergence and maintenance of labiodentals. Our findings suggest that language is shaped not only by the contingencies of its history, but also by culturally induced changes in human biology.
Migration events splitting speaker communities and establishing novel contact situations are amon... more Migration events splitting speaker communities and establishing novel contact situations are among the major drivers of language variation and change. While the precise processes that lead to change cannot usually be determined for past events with any certainty, the study of minority and heritage language usage in apparent time may provide insight into the contribution of the linguistic behavior underlying the dynamics. We capitalize on this and compare parts of speech usage in Pear Story renarrations across Gheg Albanian speakers of three generations in Germanspeaking environments, applying methods from information theory. The results suggest that the changing conventions in parts of speech usage across generations and places of residence can be attributed to changing linguistic behavior within the speaker community in the migration setting. These findings highlight the impact of changing sonnenhauser, ismajli and widmer 10.1163/22105832-bja10027 | Language Dynamics and Change (2023) 1-26 sociocultural embedding and the roles of vertical and horizontal transmission in language change.
Previous work suggests that when speakers linearize syntactic structures, they place longer and m... more Previous work suggests that when speakers linearize syntactic structures, they place longer and more complex dependents further away from the head word to which they belong than shorter and simpler dependents, and that they do so with increasing rigidity the longer expressions get, for example, longer objects tend to be placed further away from their verb, and with less variation. Current theories of sentence processing furthermore make competing predictions on whether longer expressions are preferentially placed as early or as late as possible. Here we test these predictions using hierarchical distributional regression models that allow estimates of word order and word order variation at the level of individual dependencies in corpora from 71 languages, while controlling for confounding effects from the type of dependency (e.g., subject vs. object), and the type of clause (main vs. subordinate) involved as well as from trends that are characteristic of individual languages, language families, and language contact areas. Our results show the expected correlations of length with position and variation only for two out of six dependency types (obliques and nominal modifiers) and no difference between clause types. These findings challenge received theories of across-the-board effects of complexity on word order and word order variation and call for theoretical models that relativize effects to specific kinds of syntactic structures and dependencies.
Philosophen im römischen Legionslager von Argentorate? Überlegungen zum Wanddekor mit Ritzinschri... more Philosophen im römischen Legionslager von Argentorate? Überlegungen zum Wanddekor mit Ritzinschriften in den Offiziersquartieren. .. . .
A critical feature of language is that the form of words need not bear any perceptual similarity ... more A critical feature of language is that the form of words need not bear any perceptual similarity to their functionthese relationships can be 'arbitrary'. The capacity to process these arbitrary form-function associations facilitates the enormous expressive power of language. However, the evolutionary roots of our capacity for arbitrariness, i.e. the extent to which related abilities may be shared with animals, is largely unexamined. We argue this is due to the challenges of applying such an intrinsically linguistic concept to animal communication, and address this by proposing a novel conceptual framework highlighting a key underpinning of linguistic arbitrariness, which is nevertheless applicable to non-human species. Specifically, we focus on the capacity to associate alternative functions with a signal, or alternative signals with a function, a feature we refer to as optionality. We apply this framework to a broad survey of findings from animal communication studies and identify five key dimensions of communicative optionality: signal production, signal adjustment, signal usage, signal combinatoriality and signal perception. We find that optionality is widespread in non-human animals across each of these dimensions, although only humans demonstrate it in all five. Finally, we discuss the relevance of optionality to behavioural and cognitive domains outside of communication. This investigation provides a powerful new conceptual framework for the cross-species investigation of the origins of arbitrariness, and promises to generate original insights into animal communication and language evolution more generally.
Phylogenetic trees are a central tool for studying language evolution and have wide implications ... more Phylogenetic trees are a central tool for studying language evolution and have wide implications for understanding cultural evolution as a whole. For example, they have been the basis of studies on the evolution of musical instruments, religious beliefs and political complexity. Bayesian phylogenetic methods are transparent regarding the data and assumptions underlying the inference. One of these assumptions—that languages change independently—is incompatible with the reality of language evolution, particularly with language contact. When speakers interact, languages frequently borrow linguistic traits from each other. Phylogenetic methods ignore this issue, which can lead to errors in the reconstruction. More importantly, they neglect the rich history of language contact. A principled way of integrating language contact in phylogenetic methods is sorely missing. We present contacTrees, a Bayesian phylogenetic model with horizontal transfer for language evolution. The model efficiently infers the phylogenetic tree of a language family and contact events between its clades. The implementation is available as a package for the phylogenetics software BEAST 2. We apply contacTrees in a simulation study and a case study on a subset of well-documented Indo-European languages. The simulation study demonstrates that contacTrees correctly reconstructs the history of a simulated language family, including simulated contact events. Moreover, it shows that ignoring contact can lead to systematic errors in the estimated tree height, rate of change and tree topology, which can be avoided with contacTrees. The case study confirms that contacTrees reconstructs known contact events in the history of Indo-European and finds known loanwords, demonstrating its practical potential. 
The model has a higher statistical fit to the data than a conventional phylogenetic reconstruction, and the reconstructed tree height is significantly closer to well-attested estimates. Our method closes a long-standing gap between the theoretical and empirical models of cultural evolution. The implications are especially relevant for less documented language families, where our knowledge of past contacts and linguistic borrowings is limited. Since linguistic phylogenies have become the backbone of many studies of cultural evolution, the addition of this integral piece of the puzzle is crucial in the endeavour to understand the history of human culture.
In Blasi et al. (2019) we have shown, through a series of statistical analyses and models, that h... more In Blasi et al. (2019) we have shown, through a series of statistical analyses and models, that human sound systems have been affected by a transition in bite configuration starting from the Neolithic. Tarasov and Uyeda (2020) (henceforth T&U) raise a number of observations in relation to our article. We appreciate T&U’s engagement with our work and their sharing of the code and data of the analyses reported. In brief, their technical comment involves five analyses:Binomial Causal Models (BCM)Linear Regression of across-area variation in labiodentals and subsistencePredictive Posterior Simulations (PPS)Poisson Linear Regression (PLR): model comparisonPhylogenetic AnalysesIn what follows, we show that the discrepancies they report between our findings and theirs are due mostly to ill-specified models, weak (or missing) statistical evidence, and a misinterpretation of our results. After these issues are addressed, we conclude that T&U’s claims do not hold.
This paper investigates the origins of sortal numeral classifiers in the Indo-Iranian languages. ... more This paper investigates the origins of sortal numeral classifiers in the Indo-Iranian languages. While these are often assumed to result from contact with non-Indo-European languages, an alternative possibility is that classifiers developed as a response to the rise of optional plural marking. This alternative is in line with the so-called Greenberg-Sanches-Slobin (henceforth GSS) generalization. The GSS generalization holds that the presence of sortal numeral classifiers across languages is negatively correlated with obligatory plural marking on nouns. We assess the extent to which Indo-Iranian classifier development is influenced by loosening of restrictions on plural marking using a sample of 65 languages and a Bayesian phylogenetic model, inferring posterior distributions over evolutionary transition rates between typological states and using these rates to reconstruct the history of classifiers and number marking throughout Indo-Iranian, constrained by historically attested sta...
Journal of South Asian Languages and Linguistics, 2022
The present study provides a survey of the semantics of depictive (in a broad sense, including ci... more The present study provides a survey of the semantics of depictive (in a broad sense, including circumstantials) adjectival compounds in Vedic Sanskrit. Following the typology of depictive constructions developed by Himmelmann and Schultze-Berndt (2005), we structure our classification along the semantic fields such expressions tend to occur in. Our results show that in Vedic, the use of depictive adjectival compounds spans (almost) the whole gamut of functions reported for depictives in cross-linguistic studies. In Vedic, depictive compounds rank on a par with other strategies of non-finite event elaboration such as participles, verbal adjectives, and action nouns.
In many cases of apparent contact-induced change the contribution of genealogical correlation in ... more In many cases of apparent contact-induced change the contribution of genealogical correlation in the language sample and its interaction with processes such as matter and pattern replication are difficult to specify. In order to get a better sense of the relevance of shared ancestry, we quantify the change in similarity since the late Middle Ages in a sample of Romance and Germanic languages with data from a selected grammatical domain (expression of reflexivity). We compare their dynamics to patterns of change of similarity in two contact zones in Europe, namely the British Isles (Dedio et al. 2019) and the Balkans. Concerning the genealogical signal, the results indicate a maintenance and gain of similarity in Romance as opposed to a loss of similarity in Germanic. This hints at the importance of the inherited states, the time since the split from the common ancestor, and subsequent developments. We presume that these factors are likely to be at the origin of the maintenance and increase in similarity observed for the sampled Romance varieties. While this result cannot be generalised beyond the specific case study presented here, the basic approach will contribute to a better understanding of how contact, genealogy and culture interact in shaping the dynamics of linguistic similarity.
In this paper we introduce an extended version of the Vedic Treebank (vtb, Hellwig et al. 2020) w... more In this paper we introduce an extended version of the Vedic Treebank (vtb, Hellwig et al. 2020) which comes along with revisited and extended annotation guidelines. In order to assess the quality of our annotations as well as the usability and limits of the guidelines we performed an inter-annotator agreement test. The results show that agreement between annotators is hampered by various factors, most prominently by insufficient understanding of the content because of the cultural and temporal gap and incomplete knowledge of Vedic grammar. An in-depth discussion of disagreeing annotations demonstrates that the setup of the workflow, too, has a major influence on inter-annotator agreement. We suggest some measures that can help increase the transparency and annotation consistency according to current knowledge of the language when annotating Vedic Sanskrit, or ancient language varieties in general.
Maiores philologiae pontes. Festschrift für Michael Meier-Brügger zum 70. Geburtstag.
Unlike, for instance, the sparsely attested determinative compounds (AiGr., II.2, 241), the RV is well known to offer an enormous wealth of material for adjectivally used (exocentric) compounds (Scarlata & Widmer 2015). These compounds are used, entirely in parallel to "simple" adjectives, as adnominal attributes and epithets, but also, as Delbrück already observed (Delbrück 1878, 54), as equivalents of adverbial subordinate clauses, participial constructions, (restrictive) relative clauses, absolutives, and more. Taking a selected passage of the RV (5.8.3) in which a whole series of exocentric compounds appears, this contribution discusses to what extent further compounds can recursively depend on such formally dependent constructions.
Linguistics 58(3) (Special Issue: Shades of Partitivity: Formal and areal properties), 745-766, 2020
We discuss a potential case of borrowing in this paper: Breton a- 'of', 'from' marking of (internal) verbal arguments, unique in Insular Celtic languages, and reminiscent of Gallo-Romance de/du- (and en-) arguments. Looking at potential Gallo-Romance parallels of three Middle Breton constructions analyzed in some detail (a with indefinite mass nominals in direct object position, a-marking of internal arguments under the scope of negation, a [allomorphs an(ez)-/ahan-] with personal pronouns for internal arguments, subjects (mainly of predicative constructions) and as expletive subjects of existential constructions), we demonstrate that even if there are some semantic parallels and one strong structural overlap (a and de under the scope of negation), the number of divergences in morphology, syntax and semantics and the only partially fitting relative chronology of the different constructions do not allow us to conclude with certainty that language contact is an explanation of the Breton facts, which might also have come into being because of internal change (bound to restructuring of the pronominal system in Breton). More research is necessary to complete our knowledge of a-marking in Middle Breton and Modern Breton varieties and of the precise history of French en, in order to decide for one or the other explanation.
Convergence by loss is a concept that is often adduced to characterize the Balkans as a linguistic area and to substantiate the areality of particular linguistic features, developments and varieties. Time and again, it has been pointed out that however useful this concept may be for certain purposes, e.g., when descriptively stating differences between historical stages of one specific variety, it is problematic for others, in particular for comparing languages and assessing areality. In addition to implying the undisputed existence of categorial distinctions, applying this concept indiscriminately obscures the fact that its manifestations may differ substantially across features and languages. Furthermore, focusing on “loss” impedes insight into both more general and more specific processes. On the examples of case and infinitive in the standard norms of Albanian and Macedonian this article acts on these intuitions and elaborates a finer-grained approach that avoids the assumption of generally applicable categorial distinctions and the ignoring of differences below seemingly identical surface phenomena. By the decomposition of linguistic units into their constitutive morphosyntactic features it becomes possible to sketch the interaction of morphosyntactic exponents in expressing characteristic functions, such as the selection of grammatical relations or the licensing of constituency. This provides a solid empirical basis for comparing morphosyntactic patterns across languages in synchronic and diachronic respects and may be operationalized for assessing the areality of particular developments.
This paper introduces the first treebank of Vedic Sanskrit, a morphologically rich ancient Indian language that is of central importance for linguistic and historical research. The selection of the 4,000 sentences contained in this treebank reflects the development of metrical and prose texts over a period of 600 years. We discuss how these sentences are annotated in the Universal Dependencies scheme and which syntactic constructions required special attention. In addition, we describe a syntactic labeler based on neural networks that supports the initial annotation of the treebank, and whose evaluation can be helpful for setting up a full syntactic parser of Vedic Sanskrit.
Approaches to linguistic areas have largely focused either on purely qualitative investigation of area formation processes, on quantitative and qualitative exploration of synchronic distributions of linguistic features without considering time, or on theoretical issues related to the definition of the notion "linguistic area". What is still missing are approaches that supplement qualitative research on area formation processes with quantitative methods. Taking a bottom-up approach, we bypass notional issues and propose to quantify area formation processes by a) measuring the change in linguistic similarity given a geographical space, a socio-cultural setting, a time span, a language sample, and a set of linguistic data, and b) testing the tendency and magnitude of the process using Bayesian inference. Applying this approach to the expression of reflexivity in a dense sample of languages in northwestern Europe from the early Middle Ages to the present, we show that the method yields robust quantitative evidence for a substantial gain in linguistic similarity that sets the languages of Britain and Ireland apart from languages spoken outside Britain and Ireland and cross-cuts lines of linguistic ancestry.
Linguistic diversity, now and in the past, is widely regarded to be independent of biological changes that took place after the emergence of Homo sapiens. We show converging evidence from paleoanthropology, speech biomechanics, ethnography, and historical linguistics that labiodental sounds (such as “f” and “v”) were innovated after the Neolithic. Changes in diet attributable to food-processing technologies modified the human bite from an edge-to-edge configuration to one that preserves adolescent overbite and overjet into adulthood. This change favored the emergence and maintenance of labiodentals. Our findings suggest that language is shaped not only by the contingencies of its history, but also by culturally induced changes in human biology.
'to see him glorified' M 3037 b. en-em 3SG.M-REFL maruaille marvel.IPF.3SG ez ADV bras great 'he marveled much' Ca. 6 'from somebody' B 107 c. pan if quaffen find.PRS.SUBJ.1SG vn-re
The Hittite clitic -za is known to be substitutable with clitic personal pronouns. Starting from this observation, a corpus analysis taking into account all periods of Hittite reveals that -za cooccurs with -šši- only in a handful of cases. We conclude that -za and -šši- are probably mutually exclusive, except for some rare cases, when -šši- functions as an argument of a predicate or as an adnominal modifier. This in turn means that clauses that contain -šši- (and probably other clitic dative forms) need to be looked at, too, when investigating the semantics of -za, since -šši- may stand for -za to an extent yet to be determined.
Indogermanische Morphologie in erweiterter Sicht, 2022
@book{aktenzurich2020ed,
address = {Innsbruck},
title = {Indogermanische Morphologie in erweiterter Sicht},
editor = {Sommer, Florian and Stüber, Karin and Widmer, Paul and Yamazaki, Yoko},
publisher = {Institut für Sprachwissenschaft der Universität},
year = {2022}}
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.
CONTENTS: Erich Poppe, Karin Stüber, Paul Widmer: Preface / Aaron Griffith: Preliminaries to the syntax of the Welsh reduplicated pronouns / Axel Harlos: The influence of animacy and accessibility on Middle Welsh positive declarative main clauses. Evidence from historiographical texts / Mícheál Hoyne: Why resumption? Resumptive pronouns in prepositional relative clauses / Britta Irslinger: Detransitive strategies in Middle Welsh. The preverbal marker ym- / Marieke Meelen: Object-initial word order in Middle Welsh narrative prose / Erich Poppe: How to resolve under-determination in Middle Welsh verbal-noun phrases / Karin Stüber: Subjects of non-finite adverbial clauses in the Old Irish biblical glosses / Paul Widmer: Cases, paradigms, affixes and indexes. Selecting grammatical relations in Middle Breton
Folia Linguistica (accepted for publication), 2019
In this article we venture to elucidate the origin of the Albanian subjunctive marker të-. We contend that this marker is historically linked to a morphosyntactic device which is traditionally described as linking article and which licenses nominal syntactic units as constituents of larger syntactic units. Based on the observation that there is a substantial distributional, functional and semantic overlap between nonfinite verbal forms marked with të- and finite subjunctive predicates, we propose that the subjunctive marker spread across host classes from nominals to nonfinite predicates and to finite subjunctive predicates. The spread into the finite verbal domain is areally fostered, while the licensing device itself is an independent Albanian development that possibly picks up a vertical, Indo-European signal.
Linguistic diversity and multilingualism are central concepts of Swiss language policy and the Swiss linguistic landscape, and they also concern the heritage languages of migrant groups. Although Albanian-speaking communities (mostly from Kosovo and Macedonia) have been among the largest migrant groups in the German-speaking area, and in Switzerland in particular, since the 1980s, little is known, despite important pioneering works (Caprez-…), about the language and linguistic behaviour of this speech community, which by now spans several generations. Given that social and economic integration and participation, as well as identity formation, are and always have been inextricably bound up with language, with enabled and supported language practice, and with language awareness, the study of heritage-language practice and its interaction with the new majority languages is an urgent desideratum. This project develops a comprehensive picture of the linguistic practice of heritage speakers of Albanian and of the language(s) they use, over time and in diverse contact situations, by combining approaches and methods from heritage-language linguistics and didactics with those of contact linguistics, sociolinguistics and variationist linguistics. Combining these approaches makes possible a) actor-centred, bottom-up insights into the contact-induced specification of features (preservation or change), which in turn are the prerequisite for modelling language contact in language history and language evolution, and b) the development of application-oriented solutions for maintaining the heritage language as an instrument of integration via social participation and for strengthening multilingualism, which is indispensable in today's Europe.
Substantial preliminary work exists in all these areas, but both heritage-language linguistics and didactics and contact linguistics lack robust studies covering longer time spans under maximally controlled sociocultural conditions. This concerns not only studies of Albanian as a heritage language in German-speaking environments and of the various contact phenomena in the varieties involved, but also the micro-perspective tracing of the sociocultural motivation and linguistic processes of language development in specific contact configurations. The linguistic data are elicited in three generations of speakers using various stimuli and collected in family networks and through crowdsourcing; sociocultural information is obtained through biographical and narrative interviews. The texts are double-coded, by linguists and, via crowdsourcing, by heritage speakers; for the latter, language awareness and attitudes towards the heritage language can thus be recorded at the same time. The resulting sociolinguistic speaker profiles and language attitudes are correlated with the linguistic data. On the basis of changes in language structure across speaker generations, contact effects are thus extracted and evaluated qualitatively and quantitatively under well-controllable conditions. Combining structural and sociolinguistic approaches for a comprehensive analysis of the linguistic, sociocultural and sociopolitical relevance of heritage-language aspects promises, in this combination and for all the disciplinary perspectives mentioned, a significant gain in scientific knowledge as well as application-oriented products in the form of didactic and pedagogical materials.
Specifically, empirical data for estimating contact influence in phylogenetic models are made available for contact linguistics, and teaching materials are produced for heritage-language instruction. The results are disseminated in the form of qualification theses, scholarly publications, information brochures and websites, and the collected data are deposited and made available in suitable repositories (DaSCH).
Papers by Paul Widmer