Computational Historical Linguistics
Recent papers in Computational Historical Linguistics
In: Matilde Serangeli & Thomas Olander (eds.). 2020. Dispersals and diversification: Linguistic and archaeological perspectives on the early stages of Indo-European (Brill's Studies in Indo-European Languages & Linguistics). Leiden &... more
The thesis discusses how concepts and methodology developed in evolutionary biology can be applied to explaining and researching language change. The parallel nature of the mechanisms of biological evolution and... more
By combining an Ancient Greek semantic domains database with a corpus of annotated Ancient Greek treebanks, one may observe semantic preferences of individual words or word combinations. This information may then be applied to phrases... more
The evidence one can draw from the rhyming behavior of Old Chinese words plays a crucial role in the reconstruction of Old Chinese, particularly in the more recent proposals. Some of these proposals are no longer solely based on the... more
The exploration of distant language relationships reaching back to the Neolithic age remains very demanding and is often perceived as controversial. Over the last two decades, significant advances in computational linguistics offer... more
This study attempts to use the computer program PHONO to develop a computer model that operates on the principle of the regularity of sound change. Surprising though it may seem, since this concept was first coined... more
The field of Chinese Historical Phonology traditionally deals with a large number of complex and diverse types of data. While the data diversity can be conveniently dealt with in qualitative approaches, computational possibilities... more
Advances in computer-assisted linguistic research have been greatly influential in reshaping linguistic research. With the increasing availability of interconnected datasets created and curated by researchers, more and more interwoven... more
Sound correspondence patterns play a crucial role in linguistic reconstruction. Linguists use them to prove language relationship, to reconstruct proto-forms, and for classical phylogenetic reconstruction based on shared... more
An etymological proposition is often said to be probable or improbable from the phonetic point of view, and it is not rare for opinions to diverge on which it is. The estimation is typically purely intuitive, based on perceived similarity... more
The ASJP (Automated Similarity Judgment Program) described an automated, lexical similarity-based method for dating the world's language groups using 52 archaeological, epigraphic and historical calibration date points. The present paper... more
The proposal of new quantitative methods intended to handle problems in historical linguistics has created a gap between what one could call "classical" approaches to historical language comparison and the "new and innovative" automatic... more
Like many other research fields, linguistics is entering the age of big data. We are now at a point where it is possible to see how new research questions can be formulated, and old research questions addressed from a new angle or... more
In this paper, we investigate how the prediction paradigm from machine learning and Natural Language Processing (NLP) can be put to use in computational historical linguistics. We propose word prediction as an intermediate task, where the... more
This paper presents the results of an exercise in lexical comparison between language families: In order to test the statistical significance of the quantity and quality of the author's lexical comparisons between reconstructed... more
A new approach is proposed for studying Arabic morphosemantics, grounded in the belief that language evolves, tends toward economy and simplification, and avoids synonymy. It is a corpus-based approach that utilizes, for theorization,... more
In order to better support the text mining of historical texts, we propose a combination of complementary techniques from Geographical Information Systems, computational and corpus linguistics. In previous work, we have described this as... more
This repository contains the nexus files and MrBayes command files needed for running the experiments to determine the optimal word list size required for inferring the best phylogenies. The paper is forthcoming at The 27th... more
The amount of data from languages spoken all over the world is rapidly increasing. Traditional manual methods in historical linguistics need to face the challenges brought by this influx of data. Automatic approaches to word comparison... more
Advances in computer-assisted linguistic research are greatly influencing and reshaping linguistic investigation. With the increasing availability of interconnected datasets created and curated by researchers, more and more interwoven... more
In this paper, we describe the problem of cognate identification and its relation to phylogenetic inference. We introduce subsequence-based features for discriminating cognates from non-cognates. We show that subsequence-based features... more
Abstract: The study of the grammatical change observable in language is an area of justified interest in diachronic linguistics. In this field, linguistic corpora can be an ideal tool for the... more