Papers by Simon J. Greenhill
Science Advances, 2023
While global patterns of human genetic diversity are increasingly well characterized, the diversity of human languages remains less systematically described. Here, we outline the Grambank database. With over 400,000 data points and 2400 languages, Grambank is the largest comparative grammatical database available. The comprehensiveness of Grambank allows us to quantify the relative effects of genealogical inheritance and geographic proximity on the structural diversity of the world's languages, evaluate constraints on linguistic diversity, and identify the world's most unusual languages. An analysis of the consequences of language loss reveals that the reduction in diversity will be strikingly uneven across the major linguistic regions of the world. Without sustained efforts to document and revitalize endangered languages, our linguistic window into human history, cognition, and culture will be seriously fragmented.
The Open Handbook of Linguistic Data Management
Cite the source of the dataset as: Greenhill, Simon J. (2015): TransNewGuinea.org: An Online Database of New Guinea Languages. PLoS ONE 10.10: e0141563.
Cite the dataset as: List, Johann-Mattis; Forkel, Robert; Greenhill, Simon J.; Rzymski, Christoph; Englisch, Johannes; and Russell D. Gray (2021): Lexibank: A publicly available repository of standardized lexical datasets with automatically computed phonological and lexical features for more than 2000 language varieties [Dataset, Version 0.1]. Geneva: Zenodo. https://github.com/lexibank/lexibank-analysed
Probabilities of presence/absence of a trait at root nodes.
Codes used in ESM1-6 to define the diversity of ways in which plants are used and plant parts used. Glottolog codes used to identify language of vernacular names.
Viking-Age archaeobotanical evidence per plant species, including sites where samples were found, interpretation, and references to source material.
This repository (see Tutorial.md for getting started) contains the code and data for reproducing the analyses described in the chapter "Managing Historical Linguistic Data for Computational Phylogenetics and Computer-Assisted Language Comparison".
List, Johann Mattis & Rzymski, Christoph & Greenhill, Simon & Schweikhard, Nathanael & Pianykh, Kristina & Tjuka, Annika & Hundt, Carolin & Forkel, Robert (eds.) 2021. Concepticon v2.5.0. A Resource for the Linking of Concept Lists. Leipzig: Max Planck Institute for Evolutionary Anthropology. Available online at https://concepticon.clld.org
A Python tool for constructing a newick formatted tree from a set of classifications.
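The idea of turning classifications into a Newick tree can be illustrated with a minimal sketch. This is not the tool's actual code or API: the function name `classifications_to_newick`, the input layout (a dict mapping each taxon to a comma-separated classification path), and the example languages are all assumptions made for illustration. Names are assumed to contain no spaces or Newick punctuation.

```python
def classifications_to_newick(classifications):
    """Build a Newick string from {taxon: "Group, Subgroup"} paths.

    Hypothetical helper for illustration only; the input format is an
    assumption, not the format used by the tool described above.
    """
    tree = {}
    for taxon, path in classifications.items():
        node = tree
        # Walk (and create) one nested dict per classification level.
        for group in (part.strip() for part in path.split(",")):
            if group:
                node = node.setdefault(group, {})
        node[taxon] = None  # None marks a leaf

    def render(name, children):
        if children is None:
            return name
        inner = ",".join(
            render(child, sub)
            for child, sub in sorted(children.items(), key=lambda kv: kv[0])
        )
        # Internal nodes are labelled with the group name.
        return f"({inner}){name}"

    top = ",".join(
        render(name, sub)
        for name, sub in sorted(tree.items(), key=lambda kv: kv[0])
    )
    return f"({top});"


example = {
    "Maori": "Austronesian, Oceanic",
    "Fijian": "Austronesian, Oceanic",
    "Malagasy": "Austronesian",
}
print(classifications_to_newick(example))
```

Children are sorted alphabetically so the output is deterministic; branch lengths are omitted because a classification alone carries no length information.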
Cite as Johann-Mattis List, Cormac Anderson, Tiago Tresoldi, Christoph Rzymski, Simon Greenhill, and Robert Forkel (2018). Cross-Linguistic Transcription Systems (Version 1.1.1). Max Planck Institute for the Science of Human History: Jena. DOI: 10.5281/zenodo.1623511
This is the first release of our LingPy tutorial, accompanying the paper "Sequence Comparison in Computational Historical Linguistics" (List et al. 2018, Journal of Language Evolution, DOI: http://dx.doi.org/10.1093/jole/lzy006).
The Dravidian language family consists of about 80 varieties (Hammarström H. 2016 Glottolog 2.7) spoken by 220 million people across southern and central India and surrounding countries (Steever SB. 1998 In The Dravidian languages (ed. SB Steever), pp. 1–39: 1). Neither the geographical origin of the Dravidian language homeland nor its exact dispersal through time are known. The history of these languages is crucial for understanding prehistory in Eurasia, because despite their current restricted range, these languages played a significant role in influencing other language groups including Indo-Aryan (Indo-European) and Munda (Austroasiatic) speakers. Here, we report the results of a Bayesian phylogenetic analysis of cognate-coded lexical data, elicited first hand from native speakers, to investigate the subgrouping of the Dravidian language family, and provide dates for the major points of diversification. Our results indicate that the Dravidi...
List, Johann-Mattis & Cysouw, Michael & Greenhill, Simon & Forkel, Robert (eds.) 2018. Concepticon. A Resource for the Linking of Concept Lists. Jena: Max Planck Institute for the Science of Human History. Available online at http://concepticon.clld.org
The data used to run the analysis
python-nexus - Generic nexus (.nex, .trees) reader/writer for python
Scholars have debated naturalistic theories of religion for thousands of years, but only recently have scientists begun to test predictions empirically. Existing databases contain few variables on religion, and are subject to Galton’s Problem because they do not sufficiently account for the non-independence of cultures or systematically differentiate the traditional states of cultures from their contemporary states. Here we present Pulotu: the first quantitative cross-cultural database purpose-built to test evolutionary hypotheses of supernatural beliefs and practices. The Pulotu database documents the remarkable diversity of the Austronesian family of cultures, which originated in Taiwan and spread west to Madagascar and east to Easter Island, a region covering over half the world’s longitude. The focus of Austronesian beliefs ranges from localised ancestral spirits to powerful creator gods. A wide range of practices also exists, such as headhunting, elaborate tattooing, and the const...
Cite the source dataset as Carling, Gerd (ed.) 2017. Diachronic Atlas of Comparative Linguistics Online. Lund: Lund University. (DOI/URL: https://diacl.ht.lu.se/). Accessed on: 2019-02-07.
Cite the source dataset as Tryon, D.T. and Hackman, B.D. 1983. Solomon Islands Languages: An internal classification. Canberra: Pacific Linguistics.
The origins of the Indo-European language family are hotly disputed. Bayesian phylogenetic analyses of core vocabulary have produced conflicting results, with some supporting a farming expansion out of Anatolia ~9000 years before present (yr B.P.), while others support a spread with horse-based pastoralism out of the Pontic-Caspian Steppe ~6000 yr B.P. Here we present an extensive database of Indo-European core vocabulary that eliminates past inconsistencies in cognate coding. Ancestry-enabled phylogenetic analysis of this dataset indicates that few ancient languages are direct ancestors of modern clades and produces a root age of ~8120 yr B.P. for the family. Although this date is not consistent with the Steppe hypothesis, it does not rule out an initial homeland south of the Caucasus, with a subsequent branch northward onto the steppe and then across Europe. We reconcile this hybrid hypothesis with recently published ancient DNA evidence from the steppe and the northern Fertile Crescent.