In Jelle J. P. Wouters and Tanka B. Subba, Eds. (2022) The Routledge Companion
to Northeast India. London, Routledge: 469-474.
77
UPLAND LANGUAGES
Mark W. Post
In terms of sheer number of languages, India is one of a small handful of ‘superdiverse’ nations.
Currently, around 450 languages are thought to be natively spoken within India, meaning that
only Nigeria, Indonesia and Papua New Guinea can boast of more. Yet most of India’s languages
are spoken primarily in Northeast India, which hosts levels of ethno-linguistic diversity well
surpassing anywhere else in the Eurasian mainland. The bulk of this diversity is accounted for by
upland languages, of which there may be something on the order of 250–300.
Upland Northeast India is not only significant for its sheer number of languages, however,
but also for their phyletic diversity – which again exceeds that of anywhere else in mainland
Eurasia. In the hills of eastern Meghalaya, we find speakers of several languages of the Khasian
branch of Austroasiatic, distant relatives of Mundan, Khmer and Vietnamese. Tai-Kadai languages,
while mostly spoken in the plains of Upper Assam, descend from Shan varieties whose speakers
crossed the Patkai range from modern-day northern Myanmar beginning in the 13th century
CE. Indo-Aryan languages, though they are relatively late additions to the upland linguistic
landscape, have been spreading rapidly as lingua francas among tribal and non-tribal populations alike. This process has also given rise to endemic contact languages such as Nefamese,
Arunachali Hindi and Nagamese. Such languages have a special significance in being among the
few substantially contact-structured languages (a.k.a. ‘creoles’) that are based largely on
a lexicon of non-European origin.
The remaining languages of the region in focus here – by far the majority – fall within the
Trans-Himalayan phylum (a.k.a. ‘Sino-Tibetan’ or ‘Tibeto-Burman’), one of the largest and most
diverse language families in the world (Van Driem 2014). Within the Trans-Himalayan phylum,
the upland languages represent at least six of what may be around 10–15 major clades (groups of
closely related languages, analogous to genera in biology). This is a notably high level of diversity for such a compact area, given that the remaining Trans-Himalayan clades are distributed
across a vast terrain stretching from Pakistan, throughout Tibet and the Himalayas, into mainland
Southeast Asia and all the way up to Northwest China. The high levels of diversity found within
such a geographically compact area as Northeast India mark it as a region of special ethnolinguistic significance, in which sometimes profound linguistic differences can occur between
geographically contiguous and culturally similar populations (Post and Burling 2017).
Arunachal Pradesh is host to much of this diversity: Bodish languages related to Tibetan,
such as Brokpa and Khampa, are spoken along the Bhutan and Tibet borders to the west and
north, near the potentially related East Bodish language Dakpa (Tawang Monpa) and also
Tshangla. Moving east we find Kho-Bwa languages such as Puroik, Sherdukpen and Bugun,
followed by the quite distinct Miji/Bangru and Hruso languages. Spoken by relatively small
populations of anywhere from a few hundred to two or three thousand, these languages are among the most
significant in all of Asia from the perspective of diversity linguistics, and are also among the least
known.
DOI: 10.4324/9781003285540-78
Continuing east, Tani languages such as Nyishi, Apatani, Galo and Adi are spoken throughout
the bulk of central Arunachal Pradesh, flanked by the potentially related Siangic languages Koro
and Milang. Kera’a (Idu Mishmi) and Tawrã (Digaru Mishmi) form the eastern extremity of this
highly diverse Eastern Himalayan language ecology.
The Lohit River marks a linguistic boundary within upland Northeast India, with languages
such as Miju and Meyor presenting a marked linguistic contrast with the languages of central
and western Arunachal Pradesh – including Kera’a and Tawrã, to whose speakers the Miju are nonetheless in many ways culturally related. The again different Loloish language Lisu is spoken in
small pockets nearby, as is the Kachinic language Singpho. A highly diverse group of languages
is spoken in the areas immediately south and east of the Lohit and in many cases also across
the modern-day border with Myanmar. They include the ‘Northern Naga’ Konyak, Chang
Naga, Tutsa, Khiamniungan, Phom, Wancho and Tangsa-Nocte languages, most of which remain
insufficiently described. These languages were first argued by Burling (1983) to be related to
the Jingpho-Asakian and Bodo-Garo languages within a ‘Sal’ meso-phylum, despite the fact that the
latter languages are primarily spoken in upper Myanmar and in Assam, Tripura and western
Meghalaya, respectively.
Nagaland is then home to a perhaps even more diverse group of languages, forming the Ao
(or ‘Central Naga’) group, the Angami-Pochuri group and the ‘Western Naga’ group (Zeme,
Liangmai and Khoirao, among others). Further south and within Manipur, the enigmatic
Tangkhul-Sorbung-Khangoi cluster may or may not be relatable to the ‘Northwestern’ group of
what DeLancey has called the ‘South Central’ branch of his ‘Central’ group of Trans-Himalayan
languages (DeLancey 2015). The Northwestern group was at one time called ‘Old Kukish’ and
includes highly significant but still under-described languages such as Aimol, Purum, Monsang,
Anal and Lamkang. The core of the South Central group, also known as Kuki-Chin, comprises an enormous number of mostly closely related languages spoken in Manipur, Mizoram,
Myanmar’s Chin state and the Chittagong Hills in Bangladesh; they include Mizo, Zou, Thadou,
Lai and Paite – among many others – as well as the perhaps more distantly relatable Meitei, and,
even more tenuously, perhaps also Karbi, which is spoken primarily in the modern-day Karbi Anglong district of Assam.
It is important here to underscore the fact that the above is not simply a list of languages; it
is a list of language clades. Within each clade, it is, of course, possible to find two or more closely
related languages, analogous to Spanish and Portuguese, but across them, the languages may be as
different as are English, Russian, Farsi and Hindi. Yet while the latter languages are separated by
hundreds of kilometres, such divergent languages in the upland Northeast Indian context may
be separated by only a river or a small number of mountains.
Faced with such enormous diversity, it is natural to ask the question “why”? What is different
about this region such that it has given rise to, or been able to sustain, such high levels of ethnolinguistic diversity by comparison with other parts of mainland Eurasia?
One source of explanation is undoubtedly the landscape itself. The region presents an overwhelmingly mountainous environment, with few plains or plateau areas of any appreciable size.
It can thus be viewed as a ‘residual zone’ in the sense of Nichols (1992): a rugged environment
that – in pre-modern times, at least – tended to hinder inter-group communication and to
hold back language spread and language replacement. Instead, the hilly environment tended to
preserve, and perhaps also fostered, the genealogical and typological diversity that would have
at one time characterised much more of the surrounding region. By comparison, neighbouring plains or plateau areas such as the Brahmaputra valley or the Tibetan plateau, being more
connected, tend to have much larger populations speaking a far smaller number of languages.
Such ‘spread zones’ have undergone more, and wider-reaching, group expansion events, which
resulted in language spread, language replacement and linguistic homogenisation. For example, the Boro-Garo and Eastern Indo-Aryan languages appear to have overspread pre-existing
Austroasiatic languages in the plains of Assam, and Tibetic languages may have overspread other
Trans-Himalayan languages in the Tibetan plateau (DeLancey 2010).
Relative inaccessibility appears to have similarly preserved upland linguistic diversity from the
influencing effects of state-sponsored languages until relatively recent historical times. Although
many upland populations were long connected to both Brahmaputra Valley and Tibetan plateau-based states via trade networks and engaged with state actors in a variety of capacities (Huber
2011), the ultimate impact on most upland languages was until recently almost negligible (Post
2012). In the modern era, this situation has basically been reversed: through electronic media,
education and government employment, Indo-European languages such as Hindi and English
are spreading rapidly, serving as lingua francas between upland populations whose traditional
languages are mutually unintelligible, and completely supplanting traditional languages in some
areas and among some economic and age demographics (Modi 2006).
Time-depth is another factor with the capacity to explain the high levels of language diversity
in this region. Although the ultimate temporal and geographic origins of Trans-Himalayan languages remain uncertain, most modern scholars place the divergence points of upland Northeast
Indian clades at some of the highest levels within the family, implying a time-depth of between
three and five thousand years (Blench and Post 2014; Sagart et al. 2019; Zhang et al. 2019).
This does not of course mean that individual upland languages of Northeast India were spoken
throughout this period, still less in the precise geographical locations that we find them today; the
modern-day distribution of the region’s languages is likely to reflect a mixture of conserved deep-time distributions and other diversification events (such as group expansions or migrations) that
occurred much later. However, what is implied is that upland Northeast Indian linguistic diversity
at the highest levels – the differences, for example, between Tani languages, Kho-Bwa languages,
Miju-Meyor languages and South Central languages – is not likely to be an outcome of recent
language divergence on the order of centuries. Rather, upland linguistic diversity is likely to be
in part explainable through developments that occurred over a timescale of thousands of years.
In many cases, long-term use within a mountainous environment has influenced the structures of the region’s languages themselves. One especially clear case is that of topographical deixis:
a grammaticalised system found in many of the world’s montane languages in which demonstrative pronouns meaning ‘that (located on an upward slope)’ are distinguished from ‘that
(located on a downward slope)’ and ‘that (located on the same or an unknown level)’. Similar
distinctions are also found among verbs, directional verb suffixes and even case markers. Related
systems of topographical deixis are found in most northerly upland Northeast Indian languages,
though more rarely – or in a different form – in languages of the Myanmar/Bangladesh border regions. The presence or absence of topographical deixis in a given language correlates
significantly with the environment in which a language is traditionally spoken: it tends to be
found only in languages which have been spoken continuously and for many generations by
relatively small groups within a montane environment. It is almost never found in plains, plateau
or cosmopolitan languages with large populations. Analysis of topographical deixis in upland
Northeast Indian languages thus has the potential to reveal information about their prehistory
and evolution, in terms of the types of landscape their speakers were more or less likely to have
inhabited (Post 2020).
Another important contributor to the development of upland Northeast Indian languages
has been language contact. In modern times, the most noticeable language contact situation is that
of the profound influence currently being exerted by Indo-European languages upon upland
languages. Such influence can be found in almost every aspect of upland language structure,
including their phonologies, morphologies and syntax, but is perhaps most readily observed
through the replacement of vocabulary denoting traditional cultural perspectives (such as the
lexical encoding of time) with contemporary Indo-European forms. Yet, in pre-modern times,
many upland languages appear to have had at least equally and sometimes even more profound
and lasting impacts on one another. Prolonged contact with speakers of Adi appears to have
substantially influenced the structure and vocabulary of Milang (Post and Modi 2011), and similar situations seem to obtain between Nyishi and Miji, Nyishi and Puroik, and both Mizo and
Meitei in relation to the many smaller languages that surround them. While studies of language
contact phenomena on such a localised scale remain in their infancy in this region, they are of
potentially high significance. If conducted effectively, detailed studies of present and historical language contact situations in the region will ultimately enable a richer, more precise and
more socio-culturally nuanced interpretation of the history of and relationships among upland
Northeast Indian languages than we have been able to obtain through the mostly genealogical
studies that have been pursued thus far.
Research Landscape
Given their uniquely high levels of diversity and the long and complex population prehistories
that seem likely to have given rise to it, upland Northeast Indian languages have an outsized
potential significance to sciences of language, culture and the human past. Yet despite their clear
importance, they have not received commensurate levels of research attention. Early descriptions
such as those in the Linguistic Survey of India (Grierson 2005 [1909]) are invaluable historical
records but are insufficient for most modern applications and have not been effectively superseded by similarly comprehensive and large-scale modern linguistic and ethnographic surveys.
A few intrepid scholars – mostly PhD students – have produced large-scale, modern, fieldwork-based descriptive grammars of upland languages (e.g. Konnerth 2014; Boro 2017), and we must
be grateful to them. Yet most upland Northeast Indian languages remain either undescribed or
insufficiently described. The situation regarding ethnographies is even more vexing: virtually no
major ethnographic descriptions pertaining to this region have been published since the 1960s,
a fact which can only be regarded with consternation and dismay.
To this must be added the fact that many upland languages are either currently threatened or
endangered, or are likely to become so in the very near future. For example, Tangam, spoken east
of Tuting in Arunachal Pradesh’s Upper Siang District, has no more than 300 speakers. In decades past, such population levels were evidently sufficient to sustain Tangam’s intergenerational
transmission. Today, however, there is a new road connecting the Tangam area to metropolitan
Tuting, where there are boarding schools accepting Tangam-speaking children as young as 4 or
5. Such developments presumably improve socio-economic prospects for Tangam speakers and
other upland populations. In the absence of planning and management, however, they are also
virtually certain to disrupt intergenerational transmission of Tangam and other upland minority
languages over the coming years and decades.
Thus, one of the greatest research needs is for fundamental ethno-linguistic documentation and
description. Unless languages are understood in detail within their socio-cultural contexts, they
cannot be effectively developed, conserved and provided for. Parallel to this is the urgent need for
training, capacity building and enhancement of employment opportunities for local scholars, and
particularly for native speakers of upland languages. These are areas in which significant progress
has been made in recent years, such that it has become possible for a grammar written by an indigenous upland Northeast Indian linguist to be shortlisted for the Panini Award, perhaps the world’s
premier award in descriptive linguistics (Modi 2017). Similarly, community linguistics – linguistics
as practised by community member scholars, who typically conduct their research in the absence
of academic institutional support – has strengthened considerably in this region in recent years.
Community organisations such as the Galo Language Development Committee and the Aka
Language Academy have partnered with local and international linguists to drive forward long-term language documentation and development projects, producing orthographies, dictionaries,
textbooks, webpages and public signage. Two highly accomplished community linguists – Mosyel
Syelsaangthyel Khaling of the Uipo community in Manipur and Chikari Tisso of the Karbi community in Assam – have received prestigious international awards from the Linguistic Society of
America for their combined decades of language documentation, development and community
awareness-raising work.
Accomplishments like these underscore the potential significance, as well as the eminent
feasibility, of research partnerships involving local, national and international scholars, as well
as those which pair linguists with scholars from other disciplines (such as social and cultural
anthropology, archaeology, (ethno-) botany and zoology, and musicology). We need many
more such partnerships, which have the potential to move forward knowledge of upland
Northeast Indian languages within the cultural and environmental contexts of their use and
evolution, in addition to providing a knowledge base to support their conservation and sustainability.
References
Blench, Roger and Mark W. Post. 2014. Re-thinking Sino-Tibetan phylogeny from the perspective of
North East Indian languages. In Nathan Hill and Tom Owen-Smith (eds), Trans-Himalayan Linguistics:
Historical and Descriptive Linguistics of the Himalayan Area. Berlin: de Gruyter, Mouton, 71–104.
Boro, Krishna. 2017. A Grammar of Hakhun Tangsa. PhD Thesis, University of Oregon.
Burling, Robbins. 1983. The Sal languages. Linguistics of the Tibeto-Burman Area, 7(2), 1–32.
DeLancey, Scott. 2010. Language replacement and the spread of Tibeto-Burman. Journal of the South East
Asian Linguistics Society, 3(1), 40–55.
DeLancey, Scott. 2015. Morphological evidence for a Central branch of Trans-Himalayan (Sino-Tibetan).
Cahiers de Linguistique Asie Orientale, 44, 122–149.
Grierson, George A., (ed). 2005 [1909]. Linguistic Survey of India, Volume 3: Tibeto-Burman Family, Part 1:
General Introduction, Specimens of the Tibetan Dialects, Himalayan Dialects, & North Assam Groups. New
Delhi: Low Price Publications.
Huber, Toni. 2011. Pushing South: Tibetan economic and political activities in the Far Eastern Himalaya,
ca. 1900–1950. In Alex McKay and A. Balicki-Denjongpa (eds), Buddhist Himalaya: Studies in Religion,
History and Culture. Gangtok: Namgyal Institute of Tibetology, 259–276.
Konnerth, Linda. 2014. A Grammar of Karbi. PhD Thesis, University of Oregon.
Modi, Yankee. 2006. The Complexity and Emergence of Hindi as Lingua Franca in Arunachal Pradesh. Mysore:
CIIL.
Modi, Yankee. 2017. A Grammar of Milang. PhD Thesis, Universität Bern.
Nichols, Johanna. 1992. Linguistic Diversity in Space and Time. Chicago: University of Chicago Press.
Post, Mark W. 2012. The language, culture, environment and origins of Proto-Tani speakers: What is knowable, and what is not (yet). In Toni Huber and Stuart Blackburn (eds), Origins and Migrations in the
Extended Eastern Himalaya. Leiden: Brill, 153–186.
Post, Mark W. 2020. The distribution, reconstruction and varied fates of topographical deixis in Trans-Himalayan (Sino-Tibetan). Diachronica, 37(3), 146–187.
Post, Mark W. and Robbins Burling. 2017. The Tibeto-Burman languages of Northeast India. In Graham
Thurgood and Randy J. LaPolla (eds), The Sino-Tibetan Languages. London: Routledge, 213–242.
Post, Mark W. and Yankee Modi. 2011. Language contact and the genetic position of Milang (Eastern
Himalaya). Anthropological Linguistics, 53(3), 215–258.
Sagart, Laurent, Guillaume Jacques, Yunfan Lai, Robin J. Ryder, Valentin Thouzeau, Simon J. Greenhill
and Johann-Mattis List. 2019. Dated language phylogenies shed light on the ancestry of Sino-Tibetan.
Proceedings of the National Academy of Sciences, 116(21), 10317–10322. DOI: 10.1073/pnas.1817972116.
Van Driem, George. 2014. Trans-Himalayan. In Nathan W. Hill and Thomas Owen-Smith (eds), Trans-Himalayan Linguistics: Historical and Descriptive Linguistics of the Himalayan Area. Berlin: de Gruyter,
Mouton, 11–40.
Zhang, Menghan, Shi Yan, Wuyun Pan and Li Jin. 2019. Phylogenetic evidence for Sino-Tibetan origin in
northern China in the Late Neolithic. Nature, 569(7754), 112–115.