Arabic Corpus Linguistics
Recent papers in Arabic Corpus Linguistics
Prague Arabic Dependency Treebank (PADT) consists of refined multi-level linguistic annotations over the language of Modern Written Arabic. The kind of morphological and syntactic information comprised in PADT differs considerably from... more
This is a translation of my paper published under the title "Clausal Subjects in Modern Standard Arabic: A Corpus Analysis" in the proceedings of the Fifth International Conference of the Faculty of Arts, South Valley University, Egypt, November 2017.
Idioms represent a fascinating linguistic phenomenon that has captured the attention of many linguists for decades. The ubiquity of these expressions in language use, the wide range of functions they perform in discourse, the problems... more
I verify a chronology in which seven groups of passages represent consecutive phases. A proposed chronology is verified if independent markers of style vary over its phases in a smooth fashion. Four markers of style follow smooth... more
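Abstract: In recent decades, corpus linguistics has become a pillar of modern applied linguistics. The aim of this paper is therefore to explore in greater depth some of the conceptual and operational steps of corpus construction, so as to promote the corpus concept among the wider research community. The paper is divided into three main parts: 1. Corpus construction: theory and practice 2. Levels of processing of corpus texts 3. Annotation of corpus format attributes... more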
Encyclopedia of Arabic Language and Linguistics
This edited book is about Arabic corpora and Arabic corpus linguistics.
To all my family. To Salma. To Syria and to all Syrians in this world. "Science is a strategy, a way of tying down the truth, which is something more than matter, for the mystery hides behind it." Luis Eduardo Aute, De paso.
Aim of the study: to present how the activities of academic language departments are oriented toward exploiting the Arabic linguistic stock in order to extract an electronic inventory of its forms, structures, and styles, and then to employ that inventory in producing scholarly projects and linguistic applications, and to show the importance of... more
A dissertation submitted in fulfillment of the Bachelor of Arts (Honours) degree.
This book is about Arabic corpus linguistics.
[…]orandum est ut sit mens sana in corpore sano. Saturae X, v. 356, Decimus Iunius Iuvenalis (1st-2nd centuries AD). ABSTRACT: After an introduction to corpora and their development in the field of modern linguistics, we present an overview... more
Verbs in Maltese can occasionally be negated with the nominal negator mhux. Analogous negation with reflexes of muš has been observed in Egyptian Arabic, with those researching it speculating that it is a recent phenomenon. Yet its... more
This paper contains basic information on the electronic corpus of Modern Standard Arabic, which has been compiled at
Despite the notion that written Arabic is invariable across the Arab world, a few researchers, using large corpora to discover patterns of usage, have demonstrated regional differences in Arabic writing. While most such research has... more
The study explores the process of using Arabic websites for Arabic language learning, utilising the Arabic Corpus Linguistic approach. This approach enables data-mining out of websites, systematically compiling the mined data, as well as... more
Within corpus linguistics, keywords are defined as words whose frequency in one corpus differs markedly, from a statistical standpoint, from their frequency in another corpus. Extracting the keywords of a corpus has many benefits and uses, beginning with directing the attention of... more
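The definition of keywords above (words whose frequency in one corpus differs markedly from their frequency in another) can be illustrated with a small sketch. The abstract does not say which statistic the authors use; the log-likelihood (G2) score below is an assumption, chosen because it is a common keyness measure. The function names `log_likelihood` and `keywords` are hypothetical, and tokenisation is assumed to have been done already.

```python
import math
from collections import Counter


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood (G2) keyness of one word across two corpora.

    Assumption: this statistic is used for illustration only; the
    source abstract does not name a specific measure.
    """
    total = size_study + size_ref
    combined = freq_study + freq_ref
    # Expected frequencies if the word were equally likely in both corpora.
    expected_study = size_study * combined / total
    expected_ref = size_ref * combined / total
    g2 = 0.0
    if freq_study > 0:
        g2 += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        g2 += freq_ref * math.log(freq_ref / expected_ref)
    return 2.0 * g2


def keywords(study_tokens, ref_tokens, top_n=10):
    """Rank words that are relatively more frequent in the study corpus."""
    study, ref = Counter(study_tokens), Counter(ref_tokens)
    n_study, n_ref = len(study_tokens), len(ref_tokens)
    scored = []
    for word, freq in study.items():
        # Keep only positive keywords (over-represented in the study corpus).
        if freq / n_study > ref[word] / n_ref:
            score = log_likelihood(freq, n_study, ref[word], n_ref)
            scored.append((word, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

Calling `keywords(study_tokens, ref_tokens)` with two non-empty token lists returns `(word, score)` pairs, highest score first. A word with the same relative frequency in both corpora scores zero and is filtered out.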
The earliest Maltese grammars of the 18th and 19th centuries attest a polar interrogative enclitic –š. An early 20th-century grammar shows it retaining an interrogative function and marking indirect questions. By the late 20th century,... more
This paper investigates the constructional behaviour of three of the most frequent GO verbs in Modern Standard Arabic: ḏahaba, maḍā, and rāḥa. These verbs are considered somewhat synonymous according to many classical and modern... more
Corpora have made many significant contributions to our understanding of how language works and how it is used in society. In particular, they have become an indispensable tool in almost any systematic investigation of the lexicon.... more
This paper explains how Arabic corpus linguistics can learn from the more developed field of English corpus linguistics in order to develop and advance on its own terms.
In the present paper we look into the different ways in which one of the Arabic modalities of proximity, viz. kāda, is rendered in Hispanic Qur'anic translations ranging from the 16th century to the early 17th. The analysis is... more
Modal auxiliaries in spoken Arabic usually embed verbs in the unmarked imperfect. Yet, Brustad (2000) has documented modals embedding perfective verbs in the speech of an informant from a village near Latakia, Syria. This study... more
The main aim of this section is to present applied models and samples that illustrate ways of using "corpus linguistics" and of exploiting its computational capabilities to observe and analyse linguistic phenomena. The linguistic material used in the analysis is "contemporary Standard Arabic"... more
Text classification (TC) is an essential field in both text mining (TM) and natural language processing (NLP). Humans have a tendency to organize and categorize everything as they want to make things easier to understand. Therefore, text... more
As opposed to its earlier counterparts written in Aljamiado – i.e. Romance transcribed in Arabic letters –, the late Morisco Ms. 235 (Toledo, Biblioteca de Castilla-La Mancha) was copied in Latin script. Produced in the early 17th century, in... more
As in other languages, Arabic distinguishes between conditional sentences on the one hand, represented by the operators of supposition ʾiḏā, ʾin and law applied to a clause, and indefinite conditional sentences on the other, represented... more
Linguistic argumentation is defined by Arab grammarians to mean a formulation of grammatical rules from primary sources using anomaly, consensus, measurement, an argumentation based on circumstances and other rules. The anomaly sources... more
The Aljamiado-Morisco literature consists mainly of religious writings produced mostly in the 15th and 16th centuries by a cryptic Iberian Muslim community. With twenty-seven codices known to date, the Koranic translations are... more
Paronomasia, used as a grammatical device, is a construction much favoured in Semitic languages. It comes as no surprise to encounter such structures in calque languages. However, considering that those translations are very literal, it’s... more
This paper introduces the reader to those aspects of the Arabic language that require special treatment compared with the languages Europeans are more familiar with. In spite of having fresh experience in building the... more
The Arabic construction kāna … sa-/sawfa yafʿalu is frequently encountered in the contemporary Arabic press as well as in novels. Recent grammars of this state of the language seem to oscillate between three distinct values which are conditional (« il... more
I am writing my PhD dissertation on the elaboration of a vocabulary frequency list specific to the Arabic novel, based on a literary corpus that I am building. The purpose of this research is to come up with a list of the 2,000 most frequently... more
Purpose: The study explores the process of using Arabic websites for Arabic language learning, utilising the Arabic Corpus Linguistic approach. This approach enables data-mining out of websites, systematically compiling the mined data, as... more
Collocation extraction from corpora, whether complete or according to specific criteria, plays a significant role in computational linguistics, corpus linguistics, and natural language processing. In this paper, we present Musaheb, an... more
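As a rough illustration of what collocation extraction from a corpus involves, the sketch below scores adjacent word pairs by pointwise mutual information (PMI). This is not Musaheb's method: the abstract is truncated before the tool is described, so PMI is swapped in here as one common association measure. The function name `pmi_bigrams` and the `min_count` threshold are assumptions made for the example, and the input is assumed to be an already tokenised text.

```python
import math
from collections import Counter


def pmi_bigrams(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information.

    Assumption: PMI over bigrams is an illustrative stand-in, not the
    approach of any tool named in the listing.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = max(n_tokens - 1, 1)
    scores = {}
    for (first, second), count in bigrams.items():
        # Rare pairs give unreliable PMI values, so skip them.
        if count < min_count:
            continue
        p_pair = count / n_bigrams
        p_first = unigrams[first] / n_tokens
        p_second = unigrams[second] / n_tokens
        scores[(first, second)] = math.log2(p_pair / (p_first * p_second))
    # Highest-scoring pairs are the strongest collocation candidates.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

PMI tends to favour rare pairs, which is why the sketch filters by `min_count`; other measures, such as log-likelihood or t-score, weight frequency differently and may rank candidates in another order.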