
Aspects of Mongol writing today

https://doi.org/10.48796/20230809-000

This is the edited version of a paper on the traditional Mongol script I presented at the 63rd Annual Meeting of the Permanent International Altaistic Conference (PIAC) in Ulaanbaatar on August 27, 2021. The conference, held online via Zoom, was hosted by the Institute of Language and Literature of the Mongolian Academy of Sciences. The manuscript was submitted in May 2022 and is expected to be published.

Michael Balk, Berlin1

The theme of the 63rd Annual Meeting of the PIAC in Ulaanbaatar – Communality and Mutual Influence of Language and Civilisation in the Altaic World – provides an opportunity to address a complex cultural commonality of any civilisation: writing. The original script of the Mongols, in use since the dawn of the Mongol Empire, was adopted from the Uighurs. The script of this Turkic-speaking nation goes back to the alphabet of the Sogdians, an Iranian people, and ultimately to Aramaic. As a Semitic script, it is characterised by a clear-cut range in the notation of consonants, while vowels are marked in a more restrained manner. While it was abandoned by the Uighurs themselves in favour of the Arabic script in the wake of their adoption of Islam, their peculiar vertical form of writing became a unique feature of Mongolian communities over the centuries. In the form of the Oirat “clear script” (тод бичиг), it has developed a variant characterised by greater precision in the rendering of phonetic features; in the form of the so-called Galig (галиг), a system of notation has also emerged that is capable of expressing the characters of both Tibetan and Sanskrit in a distinctive manner. Let us not forget that Written Manchu is also derived from the Mongol script, holding nothing less than imperial status during the Qing dynasty.

Though, as a result of Soviet influence, the old script in modern Mongolia has met a strong and presumably lasting competitor in an adapted form of Cyrillic, Written Mongol has never really disappeared. Since 1990, there has been a steady renaissance, giving historical substance and etymological scope to what is now a two-pronged performance, constituting a dynamic duopoly in the writing of the language. In Inner Mongolia, which is part of present-day China, the ancient script was not replaced.
Here Mongol seems less threatened by Cyrillic than by the levelling pull of Chinese in the new age.

The nineties of the twentieth century were marked by a technological development that seems to have been as significant for writing and publishing as the invention of modern letterpress printing with movable type by Johannes Gutenberg around 1450. The reference is to Unicode, a comprehensive technical standard that aims to make it possible to write all characters of all scripts on the same technical platform (using the same typewriter, so to speak), comprising an inventory that includes all letters and ideographs on Earth. This is a profound media revolution. If it has become commonplace today to type not only elementary Latin letters on a computer or a smartphone, but also Cyrillic, Mongolian or characters of any other script more or less without problems, we owe this to the development of the Unicode standard in the 1990s.

As far as the traditional Mongol script is concerned, its characters were first included in the third version of the Unicode standard, which was published in September 1999.2 During this period, the present author worked as subject specialist for Central Asia at the Berlin State Library (Staatsbibliothek zu Berlin). In this capacity, I was involved in the project by attending meetings and international consultations as an official representative of Germany. My impression was that the details of the standard were mainly worked out by experts from the People’s Republic of China and a group of non-governmental specialists from the computer community who were interested in an early completion and, naturally, the widest possible acceptance for the new character code tables of the Mongol range of the standard.3 Opinions may differ on the outcome.

1 Thanks go to Juha Janhunen, who contributed some important points of detail.
2 See http://www.unicode.org/versions/Unicode3.0.0/.
A fundamental objection to the Unicode standard for Written Mongol is that the interpretation of the characters of the script does not primarily follow the logic of a one-to-one transliteration of the characters, but that of a transcription in the sense of determining an articulation understood as the classical standard. Put simply, the letters and basic elements of the script are not encoded in relation to themselves, but in relation to a pronunciation or “reading” in which the characters can be interpreted in different ways. This is particularly obvious in the encoding of the (genuine) Mongolian vowels, for which a total of seven letters are defined in the relevant partition of the code:

1820  ᠠ  MONGOLIAN LETTER A
1821  ᠡ  MONGOLIAN LETTER E
1822  ᠢ  MONGOLIAN LETTER I
1823  ᠣ  MONGOLIAN LETTER O
1824  ᠤ  MONGOLIAN LETTER U
1825  ᠥ  MONGOLIAN LETTER OE
1826  ᠦ  MONGOLIAN LETTER UE

This way of presenting the Mongol vowels (the Unicode labels A E I O U OE UE are non-diacritical modes of expressing a e i o u ö ü) corresponds to a traditional description as laid down in numerous grammars4 and reappearing in like manner in the Cyrillic script (а э и о у ө ү). On an unbiased view, however, this can only be understood as a phonetic interpretation of certain characters or composed ligatures of letters, not as a transliteration of the factual elements of the script. This is most evident with the codes that are graphically identical in the table: U+1823 (O) = U+1824 (U) and U+1825 (OE) = U+1826 (UE). The letters themselves offer no clue for recognising O and U as separate elements, and the same applies to OE and UE. Written Mongol simply does not make this distinction. As a genuinely Semitic script, it basically has only three characters to express vowels: ālaph, yodh and waw, to use the common names of these letters in the Aramaic alphabet.
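The seven vowel entries quoted above can be checked against the character database that ships with Python. The following is only a minimal sketch for readers who wish to verify the labels themselves; the function name `vowel_table` is an illustrative choice and not part of any standard.

```python
import unicodedata

# Code points U+1820..U+1826: the seven vowel letters discussed in the text.
VOWEL_RANGE = range(0x1820, 0x1827)

def vowel_table():
    """Return (code point label, character, Unicode name) for each vowel letter."""
    return [
        (f"U+{cp:04X}", chr(cp), unicodedata.name(chr(cp)))
        for cp in VOWEL_RANGE
    ]

for label, char, name in vowel_table():
    print(label, char, name)
```

The listing shows only what the standard names the characters; whether O and U, or OE and UE, look alike on screen depends on the font and cannot be read off the names.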
When Mongolian children learn their script at school, teachers are likely to teach them the elementary signs (махбод) through a series of popular nicknames.5 Semitic ālaph is matched in Mongol by characters that look and are named differently depending on their position in the word. Initially a “crown” (титим or титэм) is written, medially a “tooth” (шүд). In final position at the end of a word there can be either a “tail” (сүүл) curving to the right or a final stroke to the left (цацлага); the latter can appear connected to a preceding medial or unconnected, i.e. in isolated position after a final letter and a space. Primordial yodh corresponds to “shin” (шилбэ) in popular Mongol parlance, which also occurs in a smaller variant called “hook” (дэгээ) appearing in final position only. The Mongolian “belly” (гэдэс) goes back to an ancestral waw. In the Unicode characters listed above, most of these basic elements of the script can be detected, along with a so-called “bow” (нум). In the order of their appearance in the Unicode range we can discern: титим (at the onset of all seven vowels), сүүл (second element in Mongolian letter A), цацлага (E), нум (I), гэдэс (O and U), дэгээ (final element after гэдэс in letters OE and UE).

3 A list of the currently implemented characters of the script will be found at https://www.unicode.org/charts/PDF/U1800.pdf. Besides punctuation marks, format controls, digits and the basic Mongol letters, the table also includes additional signs for Todo, Sibe, Manchu and Buryat as well as extensions for Sanskrit and Tibetan.
4 For example: Poppe, Grammar of written Mongolian, p. 17.
5 In naming these elements I follow Шагдарсүрэн, Монголчуудын усэг бичигийн товчоон, pp. 29-30. See also: Kara, Books of the Mongolian nomads, pp. 79-95.
During the deliberations on the Unicode Standard, there was occasional talk of the basic elements known as “glyphs”,6 but they remained largely unconsidered in the alphabetic encoding of letters and were relegated to the level of graphic fonts. The reason for this procedure was certainly that at that time there existed no transliteration generally accepted among scholars in Mongolian studies to which one could have referred easily and without generating objections. There was no universally accepted set of precise rules for an unambiguous transliteration of letters (in the sense of an orthographic romanisation), as is the case with the Greek or Cyrillic alphabets and most other non-Latin scripts in Oriental philologies. The fact that only a transcription (or phonetic romanisation) has been in general use, and still is to a large extent, probably has something to do with the fact that Mongolian studies is a comparatively old discipline with conservative instincts. When Isaak Jakob Schmidt (1779-1847) dedicated his “Anfangsgründe der mongolischen Sprache” (rudiments of the Mongolian language), with its since unchanged transcription of the vowels,7 to His Majesty Nicholas the First in 1831, almost twenty years were to pass before Franz Xaver Ritter von Miklosich (in Slovenian: Franc Miklošič, 1813-1891) was appointed to the first chair of Slavic philology at the University of Vienna in 1849.
In Oriental studies, especially in the then prominent disciplines of Indology, Iranian and Arabic Studies, it was only towards the end of the nineteenth century that what we understand today as transliteration became firmly established in the methodological apparatus of scholars.8 While the principle of strict letter-based transliteration has since become widely accepted in most philological disciplines dealing with foreign scripts, Mongolian studies has adhered to a kind of mixed form in which the rendering of a letter is combined with an articulation-based transcription. This is not the place to expound the history and authorship of Latin renditions of Written Mongol, used today with a certain degree of variation.

6 The term glyph is derived from the Greek word γλυφη “carving”, from γλυφειν “to hollow out, cut out with a knife, engrave, carve”. In typography, this is broadly understood to mean any meaningful sign. While character refers to the abstract idea, glyph is understood rather as its concrete graphic representation.
7 Schmidt, Grammatik der mongolischen Sprache, p. 1.
8 Though it was common at the time to call the unambiguous transposition of letters transcription rather than transliteration. Sometimes we come across wordings like transcription of the script or (in German) Schrifttranskription. An important document of the time is the “Rapport de la commission de transcription” issued by the Tenth International Congress of Orientalists in Geneva in 1894.
Nicholas Poppe, whose grammar may be considered a standard reference for orthographic questions, remarks only in the preface: “The transcription used in this book is that found in most scientific works dealing with the Mongolian language.”9

Around the same time as my participation in the deliberations on the Unicode Standard for Written Mongolian, I presented some thoughts on a transliteration of the script in a talk given at the Seventh International Congress of Mongolists in Ulaanbaatar in 1997. The response was not overwhelming, but I made the wonderful acquaintance of Professor Juha Janhunen from Helsinki, who in his own presentation at the conference also firmly expressed the opinion that a true letter-based transliteration was an urgent necessity.10 The result was a joint project, the first outline of which appeared in a paper for the proceedings of the 41st PIAC held in Majvik (Finland) in 1998.11 A detailed description of our transliteration of the classical Mongolian written language is also contained in the volume “The Mongolic Languages” edited by Juha Janhunen, which has become a standard work in Mongolian studies.12 A brief overview of this romanisation following the Unicode arrangement of the Mongolian alphabet can be found on the website of the Berlin State Library.13

The system developed by Janhunen and myself is the first practical and manageable romanisation of the script that has proved useful, amongst other purposes, in the bibliographical documentation of Mongol publications from Inner Mongolia. It is not intended to displace or supersede existing transcriptions commonly used in scholarship, which everyone may continue to maintain as he or she wishes. The system is conceived and designed as an additional tool to existing forms of notation; anything else would be unhistorical and not feasible.
However, our romanisation system claims to describe the Mongolian script more accurately than the existing systems of transcription, and it does so in several respects. On the one hand, a general principle is observed that when describing a source, no more information should be indicated than what the original actually contains or implies. On the other, all recognisable features of the script should be distinctly expressed in its Latin rendering. The analysis of the script is done in two steps. At the elementary level, the glyphs are described, taking up the traditional names mentioned above (титэм, шүд, сүүл, шилбэ, гэдэс etc.) and expressing them through Latin characters. Based on these elementary signs, functional letters can be identified, distinguishing between vowels and consonants as they correspond to the inherent logic of Mongol phonotactic and orthographic rules.

9 Poppe, Grammar of written Mongolian, p. xiii.
10 A short recount of our first meeting can be found in: Balk, Sieben Strophen des Udānavarga in mongolischer Version, p. 25.
11 Balk & Janhunen, A new approach to the Romanization of Written Mongol.
12 See in particular Janhunen's contribution “Written Mongol” (The Mongolic Languages, pp. 30-56) and the “Chart of Romanization” (pp. xxvii-xxviii).
13 Please consult https://staatsbibliothek-berlin.de/die-staatsbibliothek/abteilungen/ostasien/rechercheund-ressourcen/zentralasiatischer-katalog/transkription-mongolisch. The Staatsbibliothek zu Berlin has one of the richest bibliographically documented Mongolian-language library collections in the world. A little less than 15,000 books and periodicals are post-war holdings from Mongolia, mostly in Cyrillic script. The slightly more than 7,000 Mongolian items published in the People's Republic of China date mainly from the period after about 1980 until the present and are catalogued following BJR (Balk-Janhunen Romanisation). The bibliographic data is available on the internet at http://stabikat.de/.
It is a common heritage of Semitic writing that this distinction applies in the first place to three characters, referred to in Aramaic as ālaph, yodh and waw. These signs can have both a vocalic and a consonantal function in the Mongol script, which is expressed accordingly at the alphabetic level. There is also a limited number of ligatures, where two glyphs stand for one single letter. A tactical detail on which Janhunen and I were very much in agreement from the outset was that we wanted to use only the 26 letters of the Latin alphabet for romanisation, hence no diacritics or Greek letters. Distinctions that are necessary are resolved combinatorially and not by means of such special marks (for example sh qh and not š ɣ to indicate the dots to the right and the left of letters s q).

One advantage of a purely script-based romanisation is an unbiased look at elementary facts concerning the relationship between script and language. The particular Mongolian way of balancing sound and letter becomes much more apparent to the eye than if transcriptional hybrids are used, where letter-based rewriting of a conservative script is combined with assumptions about a historical articulation that has long since ceased to be relevant for communication in the contemporary language. There are perhaps not too many examples from other languages where spelling and pronunciation are as far apart as in the case of Written Mongol and modern Khalkha. The same applies mutatis mutandis to other Mongolic languages such as Buryat, albeit to a lesser extent. The gap between speech and writing is even wider than in English, where the Great Vowel Shift has created a discrepancy between spelling and pronunciation that to call loose would be an understatement. Nevertheless, adherence to traditional spellings has not hindered the rise of English as the modern world’s most important language.
Rather, the apparent inconsistencies in the notation of vowels in relation to pronunciation seem to have given English a visual conciseness of orthography that allows the reader to grasp what is meant quite quickly. Something similar may be argued about Written Mongol. Despite its reductionism in the representation of vowels, partly also of consonants, its significance and prevalence extend beyond those of a local language, as it is used across linguistic boundaries for different Mongolic idioms. The orchestral range in the number of syllables, a result of the history of the language, as well as the resourceful inventiveness in the selection of glyphs and design of letters in various positions of a word, give the spelling a stimulating precision that facilitates reading. Both the recognisability of the agglutinative structure and the visual unambiguity of the orthographic impression of words as a whole seem to be higher in Mongol than in the Cyrillic script. For native Mongolian speakers, Written Mongol is complicated to write but comfortable to read if the spelling of the words is familiar.

In the following remarks on Mongol writing, I will refer to the Cyrillic spelling of the Mongolian language and its terminology throughout. This seems to make more sense to me than a parallel specification of the transcriptions commonly used in Mongolian studies, which are likely to be rather unknown to the general public. For Mongolian in Cyrillic script, there is an abundance of good dictionaries as well as excellent websites freely accessible via the internet, which have been developed thanks to the far-sighted support of the Mongolian cultural authorities. They are characterised by a high degree of reliability, and I may take this opportunity to express my deep appreciation for these achievements and the dedicated people who have created them.
Let me now go into some examples, not claiming to be exhaustive, but merely discussing a few salient points and some problems that typically arise in connection with Unicode. Those who are not familiar with Balk-Janhunen Romanisation may take them as a brief introduction to the idea and functioning of the system. Please consider these names for elementary signs:

<v>  титэм  crown
<v>  шүд  tooth
<v>  сүүл  tail
<e>  цацлага  final stroke to the left
<i>  шилбэ  shin
<j>  дэгээ  hook
<u>  гэдэс  belly
<g>  нум  bow
<b>  нумтай гэдэс  belly with bow

The system presented here for the Latin rendering of Written Mongol consists, in the first step, of a transliteration of the elementary signs or glyphs (махбод). The second step is a romanisation of these glyphs, which claims to capture the logic of writing in terms of the distribution of vowels and consonants and the implication of ligatures; it will prove surprisingly readable for those familiar with Mongolian. While the glyphs are set in angle brackets, romanised Mongolian letters appear in bold. Here are examples of the letter g occurring in varying positions:

1  ᠭᠡᠭᠡ  < g-v-g-v-e >  gagae  гэгээ  light
2  ᠬᠥᠭ  < g-u-i-i-e >  guig  хөг  tune
3  ᠭᠦᠩ  < g-u-i-v-i-e >  guivg  гүн  prince

In the first line you see a word consisting of a “bow” (нум), followed by a “tooth” (шүд), another “bow”, another “tooth” and a final stroke to the left at the end (цацлага). The sequence of these glyphs (нум-шүд-нум-шүд-цацлага) can be expressed via transliteration or elementary conversion into Latin letters by the formula < g-v-g-v-e >. However, this formula does not yet express the alphabetical value of the elements according to the inner logic of the Mongol script. The нум or < g > in glyphic transliteration is used only in initial and medial position to notate the letter romanised by g.
If the same letter g is to be placed at the end of a word, a ligature of шилбэ and цацлага is written: in the second and third example it is the glyph sequence < i-e > that serves the purpose. When talking about letters, it should be emphasised that the term is based on an alphabetical understanding of the concept. Pronunciation is something else: different phonemes may well correspond to the same letter, as can be seen from the Mongol-Cyrillic equation guig ~ хөг (traditional transcription: kög): in the onset of the syllable, the letter g has a different phonetic value than in the coda. In the nucleus, there is a “belly” (гэдэс) followed by a “shin” (шилбэ), which are graphically a sequence of two letters (ui) and phonemically represent an umlaut (ө or ү, as the case may be).

The glyphs титэм, шүд and сүүл transliterated with < v > as well as the glyph with the Mongolian nickname шилбэ represented with < i > allow for both a vocalic and a consonantal interpretation as regards their alphabetic value. If < v > is used for a consonant, v is retained, as in the final ligature in guivg where the penultima is part of a digraphic notation of the velar nasal (traditional transcription: güng). Consonantal function of the glyph < v > is also manifest in initial position of full words (particles aside) in the appearance of a prosthetic consonant before the vowels a i u ui, for which ālaph is also found in other Semitic scripts. Phonetically, the sign represents the onset of a vowel at the glottis. In the coda of a closed syllable, consonantal < v > usually serves to mark a nasal, as in valdav ~ алтан “golden”. Only if a diacritical dot is visible to the left of the element is the dental nasal expressed in romanisation, as in vuinav ~ үнэн “truth”. If < v > functions as a vowel, it is uniformly romanised as a, as in gagae (above) or words like ardani ~ эрдэнэ “jewel” (lacking prosthetic v), regardless of how the vowel is to be understood phonologically.
Final цацлага, following a and romanised by e in the word for “light”, is the second element in the digraphic spelling ae in vocalic auslaut. The principle of either consonantal (v) or vocalic valency (a) is fully independent of whether, in the case of a vocalic interpretation, the letter is phonetically realised as a back (а) or front vowel (э). This may take some getting used to, but it corresponds to the orthographic facts. In close analogy, glyph < i > can also be understood in two ways, either as a vowel (i) or a consonant (j). In examples 2 and 3, ui is a sequence of two vowels, even if phonetically a monophthong. As said, < g > is the regular character to write g only in initial or medial position of a word, while in final position we find < i-e > for the letter.

While transliteration of glyphs consists in the description of graphic facts, romanisation involves the notation of letters, whereby the position in the word is a decisive criterion for determining which letter is intended by a glyph or a combination of glyphs. To describe these mechanisms is to outline the inner logic of Written Mongol. It should be clear that this does not and should not imply the specification of a pronunciation; this is what transcription is supposed to do. If the elementary sign < g > appears in final or in isolated position (the latter is the case with the accusative particle after a final consonant of the preceding word), the “bow” is used to express the vowel i, as can be seen at the end of the following examples:

4  ᠮᠣᠷᠢ  < m-u-r-g >  muri  морь  horse
5  ᠵᠥᠭᠡᠢ  < i-u-i-g-v-g >  juigai  зөгий  bee

It was mentioned that “shin” is the Mongol continuation of Semitic yodh. Like most of its ancestors and relatives, it can have both a consonantal and a vocalic function.
The rule is that шилбэ is always a consonantal letter in initial and intervocalic position of non-enclitic words; only in grammatical particles may romanisations such as ijav (reflexive particle) or ijar (instrumental particle) occur.14 In medial position, < i > can be read as a vowel, which applies both to the single letter i (between two consonants) and to vowel compounds or digraphs with i as a second component such as ai, ii or ui. The latter can be observed in the first syllable of the word for “bee”, where the first and third elements are graphically identical, but romanised differently: initially by j (corresponding to з in Cyrillic) and medially by i (part of the digraphic notation of umlaut ө). In final position, “shin” does not occur; its place in the script is taken by a “bow”, which stands for i in this position.

In connection with these samples, it is worth recalling the clear conceptual difference between graphic transliteration, alphabetical romanisation and phonetic transcription as understood here. The sequences < u-i > and < v-g > are romanised as ui and ai according to their alphabetic function, which is dependent on the position of the glyphs in the word. As letters, ui and ai are sequences consisting of two vowels. Phonetically, ui is a digraph standing for an umlaut (traditionally ö ~ Cyrillic ө) and ai may be regarded as an original diphthong that has developed into a long vowel in Khalkha (ai ~ ий).

There is a remarkable parallelism between the glyphs < g > and < b > in terms of their alphabetic valence and otherwise. Unlike the glyph usually called “bow”, there seems to be no generally used name for < b >, at least not in the works of Kara and Shagdarsüren (Шагдарсүрэн) already mentioned. A relaxed view allows one to see a blend of “bow” and “belly”, in which a нум extending across the full width of the letter is fused with a гэдэс on the left side.
A Mongolian colleague in the library referred to the sign as нумтай гэдэс, saying that this is what she was taught in school, and I think the term is good enough to be retained. In any case, the parallelism between < g > and < b > is that these glyphs have the alphabetic value of g and b only in non-final positions. In final position, they express the vowels i and u:

6  ᠪᠠᠪᠠᠢ  < b-v-b-v-g >  babai  баавай  father
7  ᠬᠦᠦ  < g-u-i-b >  guju  хүү  son
8  ᠭᠡᠭᠦᠦ  < g-v-g-u-b >  gaguu  гүү  mare

The examples show < b > and < g > in initial and medial (letters b and g) as well as in final position (u and i). With regard to guju, an objection could be raised as to whether guiu should be written instead of guju because of the palatal nature of the word. The sequence ui would then correspond to the first umlaut in хүү (ui ~ ү). In the evidence available to me, it is consistently the case that the trigraphic Mongolian sequence < u-i-u > corresponds to a vowel sequence үү in Cyrillic, as also found in gujusi ~ гүүш “teacher” or bujur ~ бүүр “group”. I am not aware of any deviating example, at least not in modern orthography. For the sake of a consistent romanisation, however, a decision must be made between uiu and uju. I argue for the latter, because this corresponds to the general rule that < i > between vowels is considered a consonant (as in sajiv ~ сайн “good” or gajit ~ хийд “monastery”). In order to adequately assess cases such as guju, it must be taken into account that today, in addition to the glyph < i >, a character < y > is also used in writing, which differs in that the stroke of the шилбэ has a small upward bend at the end.

14 Particles are subject to other rules regarding the orthography of the initial. Prosthetic consonants (v) are generally not written, cf. vuv ~ он “year” versus uv = genitive particle after final consonant other than v.
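The positional parallelism between < g > and < b > described above can be sketched in a few lines of code. This is a deliberately partial illustration and not an implementation of Balk-Janhunen Romanisation: it handles only the two glyphs < g > and < b >, and it assumes the hyphenated glyph notation used in the examples. The function name `read_bow_glyphs` is hypothetical.

```python
# Positional values of the glyphs <g> ("bow") and <b> ("belly with bow"),
# as stated in the text: consonants g and b in non-final position,
# vowels i and u in final position. All other glyphs are left undecided,
# because their reading depends on rules this sketch does not model.
NON_FINAL = {"g": "g", "b": "b"}
FINAL = {"g": "i", "b": "u"}

def read_bow_glyphs(transliteration):
    """Map a glyph string such as 'b-v-b-v-g' to (glyph, letter) pairs.

    The letter is None for every glyph other than <g> and <b>.
    """
    glyphs = transliteration.split("-")
    last = len(glyphs) - 1
    result = []
    for index, glyph in enumerate(glyphs):
        table = FINAL if index == last else NON_FINAL
        result.append((glyph, table.get(glyph)))
    return result

print(read_bow_glyphs("b-v-b-v-g"))
```

For < b-v-b-v-g > the sketch yields b for the first and third glyph and i for the last, in line with the romanisation babai. A single isolated < g > is treated as final and therefore read as i.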
The difference between j and y is found in initial position in examples such as jaruca ~ зарц “servant” versus yaruvggai ~ ерөнхий “general”, in medial position in sajiqav ~ сайхан “beautiful” versus sayiqav ~ саяхан “recent”. It is therefore not surprising to find next to uju the sequence uyu, for example in buyu ~ буюу “is” or suyul ~ соёл “culture”. These back-vowel examples can be contrasted with front-vowel ones, for example guiyugu ~ гүйх “to run”. It can be empirically established that the glyph sequence < u-i-u > occurs regularly in palatal words, while in velar words the sequence < u-y-u > is used. But this is the evidence only for today’s orthography. As late as the nineteenth century the glyph < y > is not written in many texts, but uniformly expressed by < i >. We will therefore not only read guju and bujur (with pattern uju ~ үү), but also buju and sujul throughout (representing uju ~ ую or оё). Since it is essential that the romanisation be applicable to older orthographic variants as well, there is a case for consistently romanising < u-i-u > so that in intervocalic position the consonant letter j is used for < i > also in palatal words.

It is instructive in this context to look at what can be observed in the comparable glyph sequence < u-i-i >. This sequence can appear in both palatal and velar vowel environments. Examples include gujidav ~ хүйтэн “cold” and jujil ~ зүйл “kind” on the front side and bujiziqu ~ бойжих “grow” and bujilaqu ~ буйлах “roar (of camels)” for the back vowels. It seems reasonable to suggest that the Mongol element ji corresponds to short й (also known as и краткое). Viewed in this way, uji is obviously a correct spelling for the latter two (showing uji ~ ой and уй). In cases like gujidav one would rather expect a spelling *guijidav, where ui would correspond to the Cyrillic umlaut ү and ji to following й. But this is not the case.
The reason is that Written Mongol does not seem to permit a sequence of three шилбэ, which such a spelling would amount to. A sequence < i-i-i > never occurs; at least I know of no evidence for it in my lexical records.15 The observation indicates the following: the general rule that the letters traditionally denoted by ö and ü in the first syllable of a word are to be spelt by a combination of гэдэс and шилбэ does not always apply. The rule is not adhered to if ui is followed by the sequence ji. In that case *uiji is shortened to uji (implying u ~ ө ү in first-syllable position). From this I derive confidence that the glyph sequence < i-u-i-g > corresponding to Cyrillic зүй is correctly rendered alphabetically by juji and not *juii. The romanisation guju can be considered a casual analogy, which seems acceptable in order to preserve the general principle of how < i > is romanised in intervocalic position (as consonantal letter j). If this principle were abandoned, the glyphic sequences < a-i-i > and < u-i-i > would also have to be romanised as aii and uii throughout, which would not be an improvement. The reliability of the description of the glyphs remains the same, since both romanisations, uju as well as uiu, imply and represent a Mongol sequence to be transliterated as < u-i-u >. In both cases, then, the romanisation is not equivocal.

15 In the late 1990s, as part of my library work, I began to compile orthographic checklists of Mongolian terms appearing in the catalogue or elsewhere, which now contain over 7000 entries in both Cyrillic and Mongol orthography. From this material is derived my statement about the non-existence of the sequence < i-i-i >. The database is currently being prepared for publication on a website of the State Library, presumably at the following location, where a number of other online tools are already assembled: https://crossasia.org/service/crossasia-lab/.
At this point I will briefly discuss how the problem of spelling has been solved on the excellent website Монгол хэлний их тайлбар толь.16 This primarily Cyrillic site also provides the lemmas in Mongol script based on the existing Unicode standard. With the help of tools like “What Unicode character is this?”17 it can be determined which codes were actually used when entering Mongol characters. The sequence < u-i-u > was consistently realised by U+1826 (UE) inserted twice, which corresponds to üü in traditional notation:

9  хүү  guju  ᠬᠦᠦ  üü  QA · UE · UE
10  гүүш  gujusi  ᠭᠦᠦᠰᠢ  üü  GA · UE · UE · SA · I
11  бүүр  bujur  ᠪᠦᠦᠷ  üü  BA · UE · UE · RA

Unicode letter QA (U+182C) may surprise traditionally socialised readers, but can be explained by the fact that Unicode ignores the difference between velar q and palatal g in the encoding. Which character will actually appear in the typeface depends on the vowel environment, which must be either velar (traditionally a o u) or palatal (e ö ü). So basically, what is encoded under label QA is the fact that the assumed sound corresponds to what is traditionally transcribed as q or k and appears as х in Cyrillic. An analogous reasoning applies to Unicode letter GA (U+182D), which also does not really encode a Mongol character, but a pronunciation for qh or g, which will traditionally appear as γ or g and correspond to Cyrillic г. Here it becomes particularly apparent that Unicode is not oriented towards alphabetic reality, but towards articulation, which can only be derived from the script to a limited extent. The two velar signs (q and qh) and the one palatal letter (g) are encoded as if there were no distinction between velar and palatal. U+182C encodes a voiceless spirant (QA ~ х) and U+182D a voiced stop (GA ~ г).

16 https://mongoltoli.mn/.
17 https://www.babelstone.co.uk/Unicode/whatisit.html.
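The kind of check described above, finding out which codes were actually entered for a word in Mongol script, can also be carried out locally. The sketch below is not the tool mentioned in the text; it merely uses Python's standard `unicodedata` module, and the function name `code_labels` is an illustrative choice. The two sample words are written with escape sequences so that the code points are unambiguous; they follow the label sequences given in the table above.

```python
import unicodedata

PREFIX = "MONGOLIAN LETTER "

def code_labels(word):
    """Return the short Unicode labels (QA, UE, ...) of a Mongol-script string.

    Characters outside the Mongolian letters keep their full Unicode name,
    or a U+XXXX label if they have no name at all.
    """
    labels = []
    for char in word:
        name = unicodedata.name(char, f"U+{ord(char):04X}")
        labels.append(name[len(PREFIX):] if name.startswith(PREFIX) else name)
    return labels

son = "\u182c\u1826\u1826"                 # хүү, entered as QA UE UE
teacher = "\u182d\u1826\u1826\u1830\u1822"  # гүүш, entered as GA UE UE SA I

print(" · ".join(code_labels(son)))
print(" · ".join(code_labels(teacher)))
```

Such a listing reveals the reading that was chosen at input time (QA versus GA, UE versus U), which the rendered word on the page does not necessarily disclose.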
12  ᠬᠦᠢᠲᠡᠨ  gujidav  хүйтэн  üi  QA · UE · I · TA · E · NA
13  ᠵᠦᠢᠯ  jujil  зүйл  üi  JA · UE · I · LA
14  ᠵᠦᠢ  juji  зүй  üi  JA · UE · I
15  ᠪᠣᠶᠢᠵᠢᠬᠤ  bujiziqu  бойжих  oyi  BA · O · YA · I · JA · I · QA · U
16  ᠪᠤᠶᠢᠯᠠᠬᠤ  bujilaqu  буйлах  uyi  BA · U · YA · I · LA · A · QA · U

For the glyph sequence < u‑i‑i > romanised here as uji, the website gives examples of U+1826 (UE) followed by U+1822 (I) if a word is front-vocalic. If a word is velar in character, the letters uji are written as trigraphs, namely by U+1823 (O) or U+1824 (U) in the first, U+1836 (YA) in the second and U+1822 (I) in the third place. The sequence < u‑i‑i > of the Mongol script can thus be coded in Unicode in no fewer than three different forms. This is an orderly array of readings, but not a clear encoding of letters.

The remarkable parallelism between the glyphs < g > and < b > mentioned above is also evident in other respects. As described, < b > in the final position has the alphabetic valency of u. Now it is precisely with the letters g and b that this rule does not apply. Instead of writing the final syllable < *b‑b > or < *g‑b >, the glyph < u > is retained here and not supplanted:

17  ᠪᠣᠣᠪᠣ  < b-u-u-b-u >  buubu  боов  cake
18  ᠵᠢᠳᠬᠦᠬᠦ  < i-i-u-v-g-u-g-u >  jitgugu  зүтгэх  to pull

At this point, I would like to conclude these considerations; a detailed overall presentation of the Mongol script would go beyond the scope of this article. The aim was to illustrate, through a few simple examples, ways of approaching the script without being determined from the outset by transcriptions, and to show where Unicode becomes problematic.

The following are some remarks on words that are spelled the same but can be interpreted phonetically and semantically differently. The fact that homographs are not necessarily homophones is anything but a Mongolian peculiarity. In English, one may think of the bow that is pronounced |bou| when used to shoot arrows.
The same noun with the same spelling will sound like |bau| when it means an act of bending the head or body, for example in greeting. Mongolian also has many words that are homographs in the Mongol script but can be understood and pronounced differently. The Cyrillic standard, which is lexicologically well established, provides us with a good basis for a contrastive description of such cases:

19  ᠪᠥᠬᠡ  < b-u-i-g-v-e >  buigae  бөх  wrestler
20  ᠪᠥᠭᠡ  < b-u-i-g-v-e >  buigae  бөө  shaman
21  ᠳᠠᠶᠢᠨ  < t-v-i-i-v >  tajiv  дайн  war
22  ᠲᠡᠶᠢᠨ  < t-v-i-i-v >  tajiv  тийн  thus

It is at the reader’s discretion whether he or she will understand buigae as “wrestler” and pronounce it as бөх, or whether a “shaman” is assumed, who will be called бөө. In traditional notation, the two words are distinguished by using two different transcriptions, böke “wrestler” and böge “shaman”. In Unicode notation, the first word must be typed with QA (U+182C) in the middle of the word (~ k), the second with GA (U+182D) representing the voiced stop (~ g).18 A fundamental problem with these inputs tied to interpretations is that they do capture the real picture, but at the same time limit the range of what is potentially meant or alluded to by introducing a specification that has no basis in the spelling itself. Someone writing a prosaic text in Mongol will know whether he or she has бөх or бөө in mind. Unicode makes it possible to encode this subcutaneously, so to speak: the choice is not openly recognisable in the typeface and can be detected only by software tools. The fact that Unicode requires the user to make such specifications can be a disadvantage. Why deprive a writer of the opportunity to use a stylistic device known in Sanskrit as śleṣa, a poetological term signifying “pun, paronomasia, double entendre, susceptibility of a word or sentence to yield two or more interpretations (regarded as a figure of speech and very commonly used by poets)”19?
It could be, for example, that there is an intention in a poem that buigae may be understood and pronounced in the sense of both “wrestler” and “shaman”. Forcing an unambiguous decision here has a totalitarian feel to it. If Written Mongol gives the freedom to write both words the same, this is more than a licence, it is the writer’s liberty.

18 See Монгол хэлний их тайлбар толь, s.v. бөх II (https://mongoltoli.mn/dictionary/detail/16458) and бөө (https://mongoltoli.mn/dictionary/detail/16177).
19 Apte, The practical Sanskrit-English dictionary, p. 1579, s.v. श्लेषः.

Since tajiv ~ дайн and tajiv ~ тийн belong to different word categories – the former a noun, the latter a pronominal adverb – more sophisticated spell-checker programs should be able to offer the “correct” encoding as an automatic suggestion during input. From the point of view of Mongol script, however, different encodings of the two homographs seem redundant, as they are not associated with any visible disparity in the typeface. Superfluous things have the property of either disappearing or losing their relevance.

There is an aspect that is perhaps not entirely unimportant for the practice of writing: for the first two letters of tajiv ~ дайн “war” and tajiv ~ тийн “thus”, the person writing has to press different keys on the keyboard: DA (U+1833) and A (U+1820) for the first word (traditionally dayin), but TA (U+1832) and E (U+1821) for the second (teyin). The big problem is that you cannot see the difference on the screen or on a paper printout. If you cannot see a difference, you cannot tell if you have typed your text correctly. Proofreading is impossible unless additional technical tools are used to identify characters by their code. If spelling mistakes cannot be seen and recognised straight away, this is an invitation not to worry about making errors in the first place and not to try to correct them afterwards.
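Because the difference is invisible in the typeface, any proofreading has to fall back on the codes themselves. The following minimal Python sketch shows what such a check might look like: it compares two strings position by position and reports where the underlying code points diverge. The initial pairs DA · A and TA · E are taken from the description above; that the remaining letters of both words are entered identically as YA · I · NA is an assumption made here for illustration, and the function names are hypothetical.

```python
from itertools import zip_longest

# "war" (traditionally dayin) and "thus" (teyin). DA·A and TA·E follow the text;
# the shared tail YA · I · NA (U+1836 U+1822 U+1828) is an assumption.
DAYIN = "\u1833\u1820\u1836\u1822\u1828"
TEYIN = "\u1832\u1821\u1836\u1822\u1828"

def as_code(ch):
    """Format a character as U+XXXX, or None where one string has run out."""
    return None if ch is None else f"U+{ord(ch):04X}"

def diff_code_points(a, b):
    """List (position, code in a, code in b) wherever the two strings differ."""
    return [
        (pos, as_code(x), as_code(y))
        for pos, (x, y) in enumerate(zip_longest(a, b))
        if x != y
    ]

print(diff_code_points(DAYIN, TEYIN))

# Plain string sorting follows code point order, not the appearance of the letters.
print(sorted([DAYIN, TEYIN]) == [TEYIN, DAYIN])
```

Under these assumptions the two look-alikes differ in their first two positions only, and a plain sort by code point places the word beginning with TA before the one beginning with DA.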
If it is the same whether one writes “war” or “thus” correctly, this may not matter when reading, but with automatic indexing, the two spellings are sorted into different places, since the alphabetical arrangement is based on the codes in Unicode and not on the external appearance of the letters.

Homographs also exist in Cyrillic. Insofar as the Cyrillic standard of the state language of Mongolia is regarded as a reasonably reliable transcription of modern pronunciation, one can also speak of homophones with regard to the Mongol script, although the homophony of differently spelled words applies primarily to Khalkha, not necessarily to other Mongolic tongues.

23  ᠪᠡᠬᠡ  < b-v-g-v-e >  bagae  бэх  ink
24  ᠪᠡᠬᠢ  < b-v-g-i-j >  bagij  бэх  strong
25  ᠲᠡᠦᠬᠡ  < t-v-u-g-v-e >  taugae  түүх  history
26  ᠲᠡᠭᠦᠬᠦ  < t-v-g-u-g-u >  tagugu  түүх  to gather

The words for “ink” and “strong”, homographs in Cyrillic (бэх), show a different orthography at the end in Mongol: the first is written with шүд and цацлага (romanised ae), the second with шилбэ and дэгээ (ij). The glyph called “hook” in Mongolian, transliterated by < j > and always romanised by j, can in a sense be considered a final variant of шилбэ, which itself does not appear in final position. This is why there can be no possibility of confusion between the glyph < i > on the one hand and < j > on the other, even though < i > is romanised by j in initial and intervocalic position: in final position, the romanised letter j can only represent a дэгээ under the logic of the script. The examples show yet another peculiarity of the letter g: a following vowel is notated differently in final position than after most other letters. The vowel is not written < *g‑v > or < *g‑g >, but < g‑v‑e > and < g‑i‑j >, equivalent to gae and gij in the romanisation (which is not *ga or *gi). At this point, the parallelism between g and b can be pointed out yet again.
With b, too, the vowel is not written < *b‑v > and < *b‑g >, but expressed digraphically by < b‑v‑e > and < b‑i‑j >, as can be seen in common words like bae ~ ба “and” and bij ~ би “I”. Another pair of Cyrillic homographs is түүх: the word for “history”, which philologically trained historians will probably pronounce as teüke, as distinct from tegükü “to gather”.

From a Cyrillic perspective, Written Mongol appears to be graphemically overdifferentiated, especially in regard to the number of syllables, but phonemically underdifferentiated due to the lack of precision in notation, especially of vowels. For someone who uses the script, Mongol is certainly more cumbersome and complicated than Cyrillic, but it seems to make reading easier for those familiar with the orthography. Historically, it was probably precisely the tendency towards a non-specific correspondence between sound and letter that ensured that Written Mongol was in use across wide linguistic and dialectal boundaries and could stand the test of time as the common script of the Mongols. Written Mongol orthography does not force anyone to adopt a particular pronunciation. The writing is loose and tight at the same time: it leaves the reader free to articulate what is written in his or her own tongue, and on the other hand is quite precise when it comes to using orthography to explicitly note what is meant.

There is a striking similarity between the traditional transcription mostly used in Mongolian studies and the modern Cyrillic script with its refined orthographic regularities. In both systems, there is an evident effort to specify the phonemes as precisely and meticulously as possible. With the Cyrillic script, this is a good thing, since the script is conceived and intended as a binding standard for the official language of a country to be learned at school and expected to be adhered to later. In Written Mongol, the case is somewhat different, as the script obviously avoids noting phonemes too precisely.
In genuine Mongolian words (foreign words aside), there are only three letters with vowel intention (a i u), a few digraphs (ai au ii ui uu), trigraphic sequences (aji uji uju), and some special notations that appear only in final position (e ae ij and a few more). It should have become clear that the author of this paper considers it a priority to have a reliable transliteration of the elementary glyphs for the Mongol script, as well as an alphabetic romanisation with letters that accurately express the inherent logic of Written Mongol – no more, but no less. Transcription in the sense of traditional conventions, which have developed over a period of almost two hundred years, undeniably has its merits, which are in no way questioned here. But it is not a good idea to do transliteration by means of transcription when the focus is on the script. The encoding of Written Mongol in Unicode fell into exactly this trap – in the midst of the rapid technical innovations that took place in those eventful nineties of the last century. Basically, it was not the simple elements of the script or their alphabetic correlates that were encoded, but rather the traditional interpretation of these elements. This is particularly clear in the case of the vowels. The practical consequences are serious and not always pleasant. Now it is definitely impossible to undo or revoke a Unicode encoding once it has been authorised. Therefore, more flexible options should be discussed on how to supplement the existing specifications with fresh solutions to a convoluted situation. A step towards something more sustainable could be, for example, to allow the insertion of a гэдэс in a document even if one does not know (and will never know) whether the vowel is meant in terms of O or U. 
This amounts to providing, here and in similar cases, an additional encoding in which a character is not locked into one or the other phonetic interpretation, but is capable of representing both in an overarching representation. Furthermore, one should be able to distinguish the letters g and qh and enter them directly, regardless of the nature of adjacent vowels. This requirement amounts to complementing the letters encoded as QA (~ х) and GA (~ г) in such a way that q and qh can be disentangled from their counterpart g along the evidence of the letters themselves.

Works cited

APTE – The practical Sanskrit-English dictionary / Vaman Shivaram Apte. Kyoto 1978 [reprinted from the revised & enlarged edition, Poona 1957]
BALK – Sieben Strophen des Udānavarga in mongolischer Version / Michael Balk. In: Per Urales ad Orientem : iter polyphonicum multilingue ; festskrift tillägnad Juha Janhunen på hans sextioårsdag den 12 februari 2012 / edited by Tiina Hyytiäinen ... Helsinki 2012 (Suomalais-Ugrilaisen Seuran toimituksia ; 264), pp. 25-37
BALK & JANHUNEN – A new approach to the Romanization of Written Mongol / Michael Balk & Juha Janhunen. In: Writing in the Altaic world / edited by Juha Janhunen and Volker Rybatzki. Helsinki 1999 (Studia orientalia ; 87), pp. 17-27
JANHUNEN – The Mongolic languages / edited by Juha Janhunen. London 2003 (Routledge language family series ; 5)
KARA – Books of the Mongolian nomads : more than eight centuries of writing Mongolian / György Kara ; translated from the Russian by John R. Krueger. Bloomington 2005 (Indiana University Uralic and Altaic series ; volume 171)
POPPE – Grammar of written Mongolian / Nicholas Poppe. Wiesbaden 1954 [5th unrevised printing 2006] (Porta linguarum orientalium ; Neue Serie ; Band 1)
Rapport de la commission de transcription / Xme Congrès international des orientalistes. Session de Genève. Geneva, Switzerland 1894
SCHMIDT – Grammatik der mongolischen Sprache / verfasst von I. J. Schmidt. St. Petersburg 1831
ШАГДАРСҮРЭН – Монголчуудын үсэг бичигийн товчоон : үсэгзүйн судалгаа = Study of Mongolian scripts / Цэвэлийн Шагдарсүрэн. Улаанбаатар 2001 (Bibliotheca mongolica ; monograph 1)

Websites

http://stabikat.de/
http://www.unicode.org/versions/Unicode3.0.0/
https://crossasia.org/service/crossasia-lab/
https://mongoltoli.mn/
https://mongoltoli.mn/dictionary/detail/16177
https://mongoltoli.mn/dictionary/detail/16458
https://staatsbibliothek-berlin.de/die-staatsbibliothek/abteilungen/ostasien/recherche-und-ressourcen/zentralasiatischer-katalog/transkription-mongolisch
https://www.babelstone.co.uk/Unicode/whatisit.html
https://www.unicode.org/charts/PDF/U1800.pdf