This is the handout of my contribution to the seminar "New Directions in the Computational Analysis of Biblical Hebrew Grammar", IOSOT Congress, Stellenbosch, September 2016. (Apologies for Word format; will be replaced by PDF)
Journal of Biblical Text Research , 2019
For more than four decades, the Eep Talstra Centre for Bible and Computer (ETCBC) has been building a richly-annotated linguistic database of the Hebrew Bible. This contribution describes the processes of data creation of this database and its underlying methodological principles. These principles, which can be labeled “bottom-up” and “form-to-function”, stem from a deep concern to do justice to the biblical text itself and to prevent it from being overruled by thematic or theological considerations. The database facilitates the application of computational linguistics and digital humanities to the Hebrew Bible and supports biblical exegesis, Bible translation as well as the study of the Bible as a language corpus. In recent years the ETCBC database has been transformed into an open tool, which can be consulted online and which can be downloaded as a package by anyone who wants to use it for more advanced computational analysis of the Hebrew Bible. A research project on syntactic variation in the Hebrew Bible demonstrated the interaction of presumed date of origin (early versus late texts), genre (e.g. prose or poetry), text type (e.g. narrative and direct speech) and syntactic environment (e.g. main versus subordinate clauses). Regarding the realization of the copula “to be”, for example, it can be observed that the narrative text type and the direct speech sections differ considerably in the alleged early texts of the Bible and that the direct speech in the early corpus shows similarities with the Late Biblical Hebrew corpus. Regarding the complexity of tree structures, it can be observed that changes in the average size of tree structures take place in main clauses, and only later, or not at all, in subordinate clauses. This agrees with a well-known principle in linguistics, the so-called Penthouse Principle, that accounts for the distinction between “innovative” main clauses and “conservative” subordinate clauses. Such distribution patterns, which can only be discovered with a computational full-corpus analysis, help to gain a better understanding of the diachronic development of Classical Hebrew at the intersection of oral and written text transmission.
Proceedings of 2007 Information Resources Management Association, International Conference, 2007
In processing language electronically, one can either concentrate on the digital simulation of human understanding and language production, or on the most appropriate way of storing and using existing knowledge. Both are valid and important. This paper falls in the second category, by assuming that it is useful to capture the results of linguistic analyses in well-designed, exploitable, electronic databanks. The paper focuses on the conversion of linguistic data of Genesis 1 between an XML data cube and a multidimensional array structure in Visual Basic 6 in order to facilitate data access and manipulation.
“The ETCBC Database of the Hebrew Bible,” Journal for Semitics 27/1 (2018). We provide a brief introduction to the history, methodology, and tools of the Eep Talstra Centre for Bible and Computer (ETCBC). The ETCBC maintains a searchable database of morphology, syntax, and text-level features for the Hebrew Bible, Hebrew inscriptions, Dead Sea Scrolls, the Peshitta, and one of the Targumim. The ETCBC follows a form-to-function approach, in which surface-level features are registered first and functional labels second. Linguists and exegetes can use the database’s freely accessible query tools for pattern searches and analysis of the text’s structure in order to address their research questions.
… Automatique des Langues, 2001
This paper describes the process of building the first tree-bank for Modern Hebrew texts. A major concern in this process is the need for reducing the cost of manual annotation by the use of automatic means. To this end, the joint utility of an automatic morphological analyzer, a probabilistic parser and a small manually annotated tree-bank was explored. An initial tree-bank that consists of 500 annotated sentences from a daily newspaper is described. The annotation scheme that underlies the tree-bank analyses integrates morphology and syntax. An existing morphological analyzer and a language-independent probabilistic parser were applied to this tree-bank. Based on the results of some experiments with these tools, a semi-automatic procedure for future enlargement of the tree-bank is outlined. RÉSUMÉ. Cet article décrit les différentes étapes dans la construction d'un corpus arboré de l'Hébreu moderne. L'objectif premier vise à la réduction du coût des annotations faites à la main à l'aide de moyens automatiques. À cette fin, nous montrons l'utilité de combiner un analyseur morphologique, un analyseur probabiliste et un corpus de référence de taille réduite manuellement annoté. Le corpus initial arboré consiste en 500 phrases annotées à la main extraites d'un quotidien. Le schéma d'annotation intègre des informations morphologiques et syntaxiques. Un analyseur morphologique et un analyseur syntaxique probabiliste ont été appliqués à ce corpus arboré. En fonction des résultats de quelques expérimentations avec ces outils, une procédure semi-automatique est mise au point pour annoter de nouveaux textes.
2015
Biblical Hebrew databases and grammars are not a novelty: numerous medieval treatises deal with grammatical features of the Hebrew Bible, providing statistics as to the number of occurrences of a given phenomenon. This can already be seen in the marginal notes that accompany the biblical text in Masoretic manuscripts. The development of computer science in the twentieth century has paved the way for the creation of extensive computer databases of the Hebrew Bible, starting with the text itself, usually that of the Leningrad Codex rather than an eclectic edition or a text with critical apparatus. Lemmatisation enhances the textual database by identifying the various forms of a given lemma, thus enabling the user to perform lexicological queries. Morphological analysis encodes such features as part of speech, person, gender, number, state, aspect, and so on. The user is then able to search for all occurrences of a given pattern.
The Linguistic Annotation Framework (LAF) provides a general, extensible stand-off markup system for corpora. This paper discusses LAF-Fabric, a new tool to analyse LAF resources in general with an extension to process the Hebrew Bible in particular. We first walk through the history of the Hebrew Bible as a text database in decade-wide steps. Then we describe how LAF-Fabric may serve as an analysis tool for this corpus. Finally, we describe three analytic projects/workflows that benefit from the new LAF representation: 1) the study of linguistic variation: extract co-occurrence data of common nouns between the books of the Bible (Martijn Naaijer); 2) the study of the grammar of Hebrew poetry in the Psalms: extract clause typology (Gino Kalkman); 3) construction of a parser of classical Hebrew by Data Oriented Parsing: generate tree structures from the database (Andreas van Cranenburgh).
… of The fifth international conference on …, 2006
University of Pretoria Electronic Theses and Dissertations, 2008
The thesis discusses a series of related techniques that prepare and transform raw linguistic data for advanced processing in order to unveil hidden grammatical patterns. A three-dimensional array is identified as a suitable data structure to build a data cube that captures multidimensional linguistic data in a computer's temporary storage. It also enables online analytical processing, such as slicing, to be executed on this data cube in order to reveal various subsets and presentations of the data. XML is investigated as a suitable mark-up language to permanently store such an exploitable databank of Biblical Hebrew linguistic data. This concept is illustrated by tagging a phonetic transcription of Genesis 1:1-2:3 on various linguistic levels and manipulating this databank. Transferring the data set between an XML file and a three-dimensional array creates a stable environment allowing editing and advanced processing of the data in order to confirm existing knowledge or to mine for new, as yet undiscovered, linguistic features. Two experiments are executed to demonstrate possible text-mining procedures. Finally, visualisation is discussed as a technique that enhances interaction between the human researcher and the computerised technologies supporting the process of knowledge creation. Although the data set is very small, there are exciting indications that the compilation and analysis of aggregate linguistic data may assist linguists to perform rigorous research, for example regarding the definitions of semantic functions and the mapping of these functions onto the syntactic module.
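The round trip between an XML databank and a three-dimensional array, and the slicing operation mentioned above, might look roughly like the following Python sketch. It is not the thesis's implementation: the element and attribute names (`databank`, `clause`, `phrase`, `form`, `syntax`, `semantics`), the choice of axes (clause, phrase, level), and all values are hypothetical placeholders invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed third axis of the cube: the linguistic levels recorded per phrase.
LEVELS = ["form", "syntax", "semantics"]

# Hypothetical toy databank; tags, attributes and values are placeholders,
# not data from the thesis.
XML_TEXT = """<databank>
  <clause>
    <phrase form="w1" syntax="Adjunct" semantics="Time"/>
    <phrase form="w2" syntax="Verb" semantics="Action"/>
  </clause>
  <clause>
    <phrase form="w3" syntax="Subject" semantics="Agent"/>
    <phrase form="w4" syntax="Verb" semantics="State"/>
  </clause>
</databank>"""


def xml_to_cube(xml_text):
    """Load the XML into a nested list indexed as cube[clause][phrase][level]."""
    root = ET.fromstring(xml_text)
    return [
        [[phrase.get(level, "") for level in LEVELS]
         for phrase in clause.findall("phrase")]
        for clause in root.findall("clause")
    ]


def slice_level(cube, level):
    """Slice along the level axis: keep one level's value for every phrase."""
    k = LEVELS.index(level)
    return [[phrase[k] for phrase in clause] for clause in cube]


def cube_to_xml(cube):
    """Serialise the cube back to XML using the same hypothetical markup."""
    root = ET.Element("databank")
    for clause in cube:
        clause_el = ET.SubElement(root, "clause")
        for phrase in clause:
            ET.SubElement(clause_el, "phrase", dict(zip(LEVELS, phrase)))
    return ET.tostring(root, encoding="unicode")


cube = xml_to_cube(XML_TEXT)
syntax_slice = slice_level(cube, "syntax")      # one syntactic label per phrase
round_trip = xml_to_cube(cube_to_xml(cube))     # XML -> array -> XML -> array
```

Here `syntax_slice` holds only the syntactic labels, grouped by clause, and `round_trip` should equal `cube`, which is the sense in which moving between the XML file and the array is stable. Nested lists are used to keep the sketch dependency-free; an array library would serve equally well.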
This paper presents an overview and evaluation of several currently available syntax databases of the Greek New Testament (not Classical, Byzantine, Medieval or Modern Greek), with comments on their theoretical foundation and consistency. *Note* This paper was written in spring 2010 and does not take into account developments since that time.