The maternal history of tribal populations of
Chhattisgarh, India
Shivani Dixit
Jaipur National University
Pankaj Shrivastava (
[email protected] )
Regional Forensic Science Laboratory
Manisha Rana
State Forensic Science Laboratory
Pushpesh Kushwaha
State Forensic Science Laboratory
Divya Shrivastava
Jaipur National University
R. K. Kumawat
State Forensic Science Laboratory
Prajjval Pratap Singh
Banaras Hindu University
Sachin K. Tiwary
Banaras Hindu University
Neeraj K. Chauhan
Thermo Fisher Scientific India Pvt. Limited
Gyaneshwer Chaubey
Banaras Hindu University
Article
Keywords: Central India, mitogenome, tribe, phylogenetics, haplogroup
Posted Date: May 19th, 2023
DOI: https://doi.org/10.21203/rs.3.rs-2757780/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
The central region of India is incredibly rich in tribal heritage and is home to one of the largest tribal
populations in the country. Understanding the genetic history of the tribal population of India may add detailed
information about various demographic processes, including social upliftment. However, to understand these
microevolutionary processes, high-resolution genetic analysis is warranted. Therefore, we used cutting-edge
next-generation sequencing (NGS) techniques to sequence the complete mitogenomes of 25 randomly selected
samples from two major tribal populations (Gond and Kanwar). We aimed to
understand the initial peopling of Chhattisgarh from a maternal perspective. The complete genome
sequencing enabled us to identify several novel sub-haplogroups. Our results suggest an early expansion
and proliferation of maternal ancestry rooted in the time of the initial settlement of the subcontinent, which
reached near saturation around 25–30 Kya. Against the background of the founding lineages M and N, we identified
maternal haplogroups M2, R5 and U2 as three basal founding haplogroups of this region. Overall, we suggest
a high effective maternal population size (Ne) in Central India around 25 Kya, sustained through the Last Glacial
Maximum (LGM).
Introduction
The Central region of India geographically consists of Madhya Pradesh, Maharashtra and Chhattisgarh. These
states have more than 40 designated tribal populations (1, 2). The major tribes of this region are Kol,
Bhil, Gond, Kolam, Oraon, Korku, Saharia and Varlis. In 2000, Madhya Pradesh was partitioned, and the region
with 10 Chhattisgarhi and six Gondi-speaking districts is now known as Chhattisgarh. Currently, this state has
five divisions and thirty-one districts. Chhattisgarh has an overall population of about thirty million, of which
nearly 34% belong to the Scheduled Tribes (3). This state shares borders with seven states of India (Fig. 1).
Physiographically, this region is divided into the Chhattisgarh Plain, the Rimland and the Bastar Plateau. The
Mahanadi River flows through the plain, whereas the Rimland consists of hills and plateaus. The river Godavari
and its tributaries drain the Bastar Plateau.
Historically, the studied region was an essential part of the narratives of the Mahabharata and the Ramayana. It was
known as the Dandakaranya and was a significant portion of the ancient empire of southern Kosala (4).
The region is divided into three cultural zones. The Northern cultural zone is
politically known as the Surguja division, the Central cultural zone as the Bilaspur division,
and the Southern cultural zone as the Bastar division. This region is an overlapping zone
of the Indo-Aryan, Dravidian and Austroasiatic language groups (5). The tribes Oraon, Kanwar, Munda, Nagesia,
Korwa, Bhuinhar, Bhumia, Dhanwar, Saunta, Biar, Majhwar, Majhi, Kharia, Savra, Birhor, Kondh, Khairwar, Gond,
Baiga and Agaria are the tribal groups of Chhattisgarh. Among these, the Gond and its subcastes (4.2 million),
Kanwar (0.9 million) and Oraon (0.8 million) are the largest (1). As per the 2011 census, tribes make up
30.60% of the total population of the state. The region under observation is densely inhabited by the tribes Gond, Halba,
Dhurvaa, Abujhmadia, Bison Horn Maria and Muria.
STR (Short Tandem Repeat) markers are the most commonly used markers for forensic investigation because
of their high information potential for establishing identity(6). Regardless of their potential, interpretations of
DNA typing results from degraded samples have long been a challenge in forensics. Besides this, samples
which lack nuclear DNA (viz., hairs without roots) have also been a challenge for STR-based DNA technology.
Because of its abundance in cells, its power to decipher maternal lineage and its comparatively lower sensitivity
to degradation, mitochondrial DNA has been preferred over STR markers for analyzing compromised
samples. Forensic analysts have relied on Sanger sequencing to decipher the mitochondrial control region, with no
alternative available in a pre-formulated kit format. The Precision ID mtDNA Whole Genome Panel (Applied Biosystems) is
a next-generation sequencing-based approach to mitochondrial DNA (mtDNA) analysis, specifically designed
for use in forensic DNA typing and anthropological studies in a kit format. The kit was recently validated for
forensic applications (7).
Over the past few years, genetic studies using mitochondrial DNA (mtDNA), Y-chromosomal and
autosomal variation have provided a substantial understanding of South Asia's human origins and dispersal
patterns. So far, no high-resolution maternal genetic study has been performed on the Chhattisgarh
population. As this state is a shelter for many tribal groups, such a study may help to test several language-gene
interaction models. Archaeological studies also suggest that this state played a vital role in the peopling
of the subcontinent. Given its central role in shaping the major episodes of the peopling of South Asia, a
high-resolution study of the populations of this critical state is required. Few detailed genetic
studies exist on the populations inhabiting this region. Therefore, we selected two major tribal populations,
i.e. Gond and Kanwar, from this state and studied their mitogenomes. This study is an
attempt at an extensive characterization of the maternal ancestry of the tribal populations using complete
mitochondrial sequences and to establish the use of NGS technology in forensic applications.
Material and Methods
Sample Collection
We collected 2 ml blood samples from 25 unrelated individuals belonging to the Gond and Kanwar
populations from Chhattisgarh state, India (Fig. 1). The samples were collected per the ethical approval from
the Institutional ethics committee of Dr. H.S. Gour Vishwavidyalaya, Sagar, Madhya Pradesh, India, vide its
approval no. DHSGV/IE/2021/2/02 dated 3.9.21. Written informed consent was obtained from all the
participants. We also confirm that all methods were performed in accordance with the relevant guidelines and
regulations of the Ethical Committee.
DNA isolation and quantification
DNA was isolated and purified with the AutoMate Express™ Forensic DNA Extraction System using the PrepFiler®
Express Forensic DNA Extraction Kit (Thermo Fisher Scientific (Thermo), Waltham, MA, USA) as per the
protocol of the manufacturer. DNA concentration was estimated with the Qubit 3.0 instrument using the
Qubit dsDNA HS Assay kit (Life Technologies, Invitrogen division, Darmstadt, Germany).
Library preparation
Genomic DNA isolated from each sample was converted to a sequencing library by targeted amplification of
regions of interest with the Precision ID mtDNA Whole Genome Panel (Thermo). The Precision ID mtDNA panel is an
innovative approach to mtDNA sequencing, specifically developed for forensic applications. This mtDNA tiling
approach, using amplicons that are only 163 bp in average length, assists with obtaining optimal
mitochondrial genome (mtGenome) coverage from highly compromised, degraded samples. The Precision ID
library preparation workflow was performed on an automated system (Ion Chef System from Thermo) as per
the protocol provided by the manufacturer. The system facilitates automation of up to 8 samples per run for a
2-pool panel design to generate pooled libraries ready for downstream template preparation.
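The tiling idea described above can be illustrated with a rough estimate of how many short amplicons are needed to cover a circular mitogenome. This is only a sketch: the 16,569 bp genome length is that of the revised Cambridge Reference Sequence and is not stated in the text, the overlap values and the function name `min_amplicons` are hypothetical, and none of the numbers below are taken from the kit documentation.

```python
import math

MITOGENOME_LENGTH_BP = 16_569   # rCRS length; an assumption, not given in the text
MEAN_AMPLICON_BP = 163          # average amplicon length quoted in the text


def min_amplicons(genome_bp: int, amplicon_bp: int, overlap_bp: int = 0) -> int:
    """Smallest number of equal-length amplicons that tile a circular genome.

    Each amplicon contributes (amplicon_bp - overlap_bp) new bases, because
    overlap_bp bases are shared with the preceding amplicon.
    """
    if not 0 <= overlap_bp < amplicon_bp:
        raise ValueError("overlap must be non-negative and shorter than the amplicon")
    return math.ceil(genome_bp / (amplicon_bp - overlap_bp))


if __name__ == "__main__":
    for overlap in (0, 20, 40):  # hypothetical overlaps between neighbours
        n = min_amplicons(MITOGENOME_LENGTH_BP, MEAN_AMPLICON_BP, overlap)
        print(f"overlap {overlap:>2} bp: at least {n} amplicons")
```

Overlapping neighbours generally cannot be amplified cleanly in the same reaction, which is one plausible reason a tiled design is split into two primer pools.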
Template preparation
Libraries prepared by automation are clonally amplified on the Ion Chef System by emulsion PCR of library
molecules captured on beads. The Ion Chef System automates all template preparation steps, including
creating the emulsion mixture, performing the PCR, carrying out the post-PCR purifications, and loading the
purified templated beads onto the Ion S5 chips.
Sequencing
A sequencing run on the Ion S5 systems is initiated by loading a reagent cartridge, buffer, cleaning solution,
and waste container as per the protocol of the manufacturer. The Ion S5 chip is then loaded, and the run
starts. The addition of nucleotides by the DNA polymerase results in the production of hydrogen ions; the
change in pH is converted to sequencing signals through ion-sensitive wells that hold the templated beads.
Converge Software for mtDNA analysis workflow
The raw data obtained after sequencing were analysed using the specially designed Converge software
(Thermo Fisher Scientific) to determine the sequence. The Converge NGS Data Analysis module automates
mtDNA analysis, leveraging optimized base calling, alignment, and quality filtering algorithms.
Analysis of obtained sequences
Haplogroups were assigned to each of the 25 sequences obtained using the global human mtDNA
phylogenetic tree (8). We manually reconstructed the mitogenome phylogenetic tree based on the tree
generated by mtphyl (https://sites.google.com/site/mtphyl/home) and the nomenclature of PhyloTree (Build
17). Coalescence ages for each haplogroup were calculated with the ρ statistic (9). Standard errors were calculated
as in Saillard et al. (10), using a synonymous clock of one substitution every 7884 years and a mitogenome clock
of one substitution every 3624 years. To evaluate the effective population size (Ne) of the studied
population, we computed Bayesian Skyline Plots (BSPs) using BEAST 1.8.0 (11). We used a relaxed molecular
clock, a two-parameter nucleotide evolution model, and a rate of 2.514 × 10⁻⁸ mutations per site. We
calculated the frequency of each haplogroup in the studied population and drew a PCA (Principal
Component Analysis) plot together with other populations from the adjoining regions and states. The spatial
distributions of the three major haplogroups were generated with Datawrapper (https://www.datawrapper.de/).
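The ρ-based dating step can be sketched as follows. As we understand the estimator, ρ is the average number of substitutions separating the sampled sequences from the clade root, ρ = Σ nᵢlᵢ / n, and the standard error following Saillard et al. is σ = √(Σ nᵢ²lᵢ) / n, where lᵢ is the number of substitutions on branch i, nᵢ the number of sampled sequences descending from it, and n the clade sample size. The two clock values are those quoted above; the toy genealogy and the function names are hypothetical and do not come from the study data.

```python
from math import sqrt

# Clock values quoted in the text (years per substitution).
MITOGENOME_CLOCK = 3624
SYNONYMOUS_CLOCK = 7884


def rho_statistics(branches, n_sequences):
    """Return (rho, sigma) for one clade.

    branches: list of (n_descendants, n_mutations) pairs, one per branch
              below the clade root.
    n_sequences: number of sampled sequences in the clade.
    """
    if n_sequences <= 0:
        raise ValueError("n_sequences must be positive")
    rho = sum(n * l for n, l in branches) / n_sequences
    sigma = sqrt(sum(n * n * l for n, l in branches)) / n_sequences
    return rho, sigma


def coalescence_age(branches, n_sequences, years_per_substitution):
    """Scale rho and its standard error by a molecular clock (in years)."""
    rho, sigma = rho_statistics(branches, n_sequences)
    return rho * years_per_substitution, sigma * years_per_substitution


# Hypothetical clade of four sequences: one internal branch carrying two
# substitutions leads to three sequences (with 1, 0 and 2 private
# substitutions), and a fourth sequence sits three substitutions from the root.
TOY_BRANCHES = [(3, 2), (1, 1), (1, 0), (1, 2), (1, 3)]

if __name__ == "__main__":
    age, se = coalescence_age(TOY_BRANCHES, 4, MITOGENOME_CLOCK)
    print(f"age = {age:.0f} years, standard error = {se:.0f} years")
```

Passing `SYNONYMOUS_CLOCK` instead requires counting only synonymous substitutions on each branch, so the `branches` input would differ between the two estimates.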
Results and Discussion
Based on archaeological and genetic data from South Asia, East Asia and Southeast Asia, it is
widely accepted that modern humans were present in this region at least 50–74 Kya (12–14). After the
Out-of-Africa dispersal events, the most prominent global population expansion is thought to have taken
place in South and Southeast Asia, where most of the human population might have lived around 25 Kya (15). The
maternal analysis of remote populations living in India is necessary to understand this demographic process.
Since the Central part of India has played a vital role in human migration (16), we randomly collected
samples from the Indian state of Chhattisgarh and sequenced their mitochondrial DNA (mtDNA) with NGS
technology.
We first classified our samples into haplogroups and constructed a PCA plot (Fig. 2). The maternal PCA of
India showed a clinal pattern: the geographical distribution of the populations is reflected in their genetic
similarities. The present study population is placed near a cluster composed mainly of populations from Central
Indian states (Madhya Pradesh, Chhattisgarh) (Fig. 2). The PCA thus suggests a close genetic affinity of our
studied samples with the populations of Madhya Pradesh and Chhattisgarh.
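A frequency-based PCA of this kind can be sketched with the standard library alone. The text does not state which software or input matrix was used, so everything here is an assumption: populations are rows, haplogroup frequencies are columns, and components are found by power iteration on the covariance matrix. Only the study row uses values reported in this paper (M2 0.28, R5 0.12, U2 0.12); the three reference populations and all function names are hypothetical.

```python
import math

# Columns: M2, R5, U2. Only the first row is taken from the text;
# the other rows are hypothetical reference populations.
FREQUENCIES = {
    "Chhattisgarh tribes (this study)": [0.28, 0.12, 0.12],
    "Hypothetical population A": [0.25, 0.10, 0.14],
    "Hypothetical population B": [0.05, 0.02, 0.20],
    "Hypothetical population C": [0.08, 0.03, 0.18],
}


def center_columns(rows):
    """Subtract the column mean from every entry."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def power_iteration(matrix, iterations=1000):
    """Dominant eigenvalue and unit eigenvector of a symmetric PSD matrix."""
    d = len(matrix)
    v = [float(i + 1) for i in range(d)]  # non-uniform deterministic start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left in this direction
            return 0.0, v
        v = [x / norm for x in w]
        eigenvalue = norm
    return eigenvalue, v


def pca_scores(rows, n_components=2):
    """Project rows onto the leading principal components."""
    centered = center_columns(rows)
    cov = covariance(centered)
    components = []
    for _ in range(n_components):
        eigenvalue, vec = power_iteration(cov)
        components.append(vec)
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[i][j] - eigenvalue * vec[i] * vec[j]
                for j in range(len(vec))] for i in range(len(vec))]
    return [[sum(x * c for x, c in zip(row, comp)) for comp in components]
            for row in centered]


if __name__ == "__main__":
    for name, (pc1, pc2) in zip(FREQUENCIES, pca_scores(list(FREQUENCIES.values()))):
        print(f"{name}: PC1 = {pc1:+.3f}, PC2 = {pc2:+.3f}")
```

The sign of each component is arbitrary, so only relative positions along an axis are meaningful: populations with similar haplogroup profiles fall on the same side of PC1.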
In the haplogroup frequency distributions, we observed three major haplogroups in the studied Chhattisgarh tribal
populations. Haplogroup (hg) M2 is the most common, with a frequency of 0.28, followed by
haplogroups R5 and U2 (0.12 each). We constructed a frequency-based geographical map to
understand the spatial distribution of these haplogroups among Indian populations (Fig. 3). The spatial
distribution of these haplogroups suggests their prominent presence in Central India. These haplogroups have
been reported as basal haplogroups of South Asian maternal ancestry (17, 18).
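With only 25 samples, the reported frequencies carry wide sampling uncertainty, which a short calculation makes concrete. The sketch below assumes the counts implied by the reported frequencies (0.28 × 25 = 7 for M2, 0.12 × 25 = 3 for R5 and for U2); these counts are inferred, not stated in the text. The Wilson score interval is our choice of interval, not a method used in the study.

```python
from math import sqrt

N_SAMPLES = 25
# Counts inferred from the reported frequencies; an assumption.
HAPLOGROUP_COUNTS = {"M2": 7, "R5": 3, "U2": 3}


def frequency(count, n):
    """Observed haplogroup frequency."""
    return count / n


def wilson_interval(count, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = count / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    for hg, count in HAPLOGROUP_COUNTS.items():
        low, high = wilson_interval(count, N_SAMPLES)
        print(f"{hg}: {frequency(count, N_SAMPLES):.2f} "
              f"(95% interval {low:.2f} to {high:.2f})")
```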
To understand the phylogenetic placement of the studied individuals, we reconstructed a phylogenetic tree
using the 25 complete sequences (Supplementary Fig. 1). In the phylogenetic tree, we identified several novel
sub-haplogroups. We defined a new branch of haplogroup M46 as M46b. In the background of haplogroup
M2, we defined sub-haplogroups M2a1a4, M2a1d, M2a3b and M2b1c. For haplogroup M63, we defined a
branch M63b. Similarly, we newly defined various sub-haplogroups in the background of haplogroups M5,
M78, R5 and U2 (Supplementary Fig. 1). Altogether, we designated twelve novel sub-haplogroups in the
present study.
The demographic history of the population in this region has not been evaluated yet. Therefore, we performed
a Bayesian Skyline Analysis (BSA) (Fig. 4). In the plot, we see a gradual expansion from 55 Kya, with
saturation at nearly 25 Kya. Thereafter, the population followed linear growth. This is consistent with the India-wide
introduction of microlithic technology, which may have sustained this linear population growth (19). However, we
did not see any dip in population growth during the LGM in the Chhattisgarh tribes, as has been
observed in Kashmir (20). This is likely because distinct geographic regions were affected differently
during the LGM (21, 22).
Thus, the complete mitogenome sequence analyses enabled us to identify at least twelve novel
sub-haplogroups. We suggest an early expansion of maternal ancestry in Chhattisgarh. The effective
population size of this region reached saturation around 25 Kya. We identified three basal maternal
haplogroups widespread in this region. Unlike in the colder regions, we did not observe any growth dip during
the LGM.
Declarations
Data availability
The datasets generated and/or analysed during the current study are available (GenBank accession numbers
OP718226 to OP718249) in the NCBI repository (https://www.ncbi.nlm.nih.gov/genbank).
Acknowledgement
The authors are thankful to Thermo Fisher Scientific India Pvt. Limited, Gurgaon, India, for providing the reagents and
kits used in the study.
References
1. Russell RV. The tribes and castes of the Central Provinces of India. Vol. 1. Macmillan and Co., limited;
1916.
2. Singh KS. The Scheduled Tribes. Singh KS, editor. Oxford: Oxford University Press; 1997. 1266 p. (People
of India; vol. III).
3. Census of India Website: Office of the Registrar General & Census Commissioner, India [Internet]. [cited
2021 Jan 5]. Available from: https://censusindia.gov.in/2011-common/censusdata2011.html
4. Shukla HL. Tribal History: A New Interpretation. Delhi: BR Publishing Corporation; 1988.
5. Grierson G. Linguistic Survey of India (Vol. XI, Gipsy Languages). Calcutta: Superintendent of Government
Printing, India; 1922.
6. Mohapatra B, Chauhan K, Shrivastava P, Dixit S, Kumawat R, Sharma A, et al. A genomic exploration of 15
autosomal STR loci for establishment of a DNA profile database of the population of Himachal Pradesh.
Leg Med. 2020;46:101719.
7. Strobl C, Eduardoff M, Bus MM, Allen M, Parson W. Evaluation of the precision ID whole MtDNA genome
panel for forensic analyses. Forensic Sci Int Genet. 2018;35:21–5.
8. van Oven M, Kayser M. Updated comprehensive phylogenetic tree of global human mitochondrial DNA
variation. Hum Mutat. 2009 Feb;30(2):E386-94.
9. Soares P, Ermini L, Thomson N, Mormina M, Rito T, Rohl A, et al. Correcting for purifying selection: an
improved human mitochondrial molecular clock. Am J Hum Genet. 2009;84(6):740–59.
10. Saillard J, Forster P, Lynnerup N, Bandelt HJ, Nørby S. mtDNA variation among Greenland Eskimos: the
edge of the Beringian expansion. Am J Hum Genet. 2000;67(3):718–26.
11. Drummond AJ, Suchard MA, Xie D, Rambaut A. Bayesian phylogenetics with BEAUti and the BEAST 1.7.
Mol Biol Evol [Internet]. 2012;29. Available from: http://dx.doi.org/10.1093/molbev/mss075
12. Pagani L, Lawson DJ, Jagoda E, Mörseburg A, Eriksson A, Mitt M, et al. Genomic analyses inform on
migration events during the peopling of Eurasia. Nature. 2016;538(7624):238–42.
13. Petraglia MD, Haslam M, Fuller DQ, Boivin N, Clarkson C. Out of Africa: new hypotheses and evidence for
the dispersal of Homo sapiens along the Indian Ocean rim. Ann Hum Biol. 2010 Jun;37(3):288–311.
14. Petraglia M, Korisettar R, Boivin N, Clarkson C, Ditchfield P, Jones S, et al. Middle Paleolithic assemblages
from the Indian subcontinent before and after the Toba super-eruption. Science. 2007 Jul
6;317(5834):114–6.
15. Atkinson QD, Gray RD, Drummond AJ. mtDNA variation predicts population size in humans and reveals a
major Southern Asian chapter in human prehistory. Mol Biol Evol. 2008 Feb;25(2):468–74.
16. Athreya S. Was Homo heidelbergensis in South Asia? A test using the Narmada fossil from central India.
In: Petraglia MD, Allchin B, editors. The evolution and history of human populations in South Asia
[Internet]. Springer Verlag; 2007. p. 464. Available from:
http://books.google.com/books?id=Qm9GfjNlnRwC&printsec=frontcover&dq=evolution+history+human+populations+South+Asia&ie=ISO8859-1&cd=1&source=gbs_gdata
17. Kivisild T, Rootsi S, Metspalu M, Mastana S, Kaldma K, Parik J, et al. The genetic heritage of the earliest
settlers persists both in Indian tribal and caste populations. Am J Hum Genet. 2003;72(2):313–32.
18. Quintana-Murci L, Chaix R, Wells RS, Behar DM, Sayar H, Scozzari R, et al. Where west meets east: the
complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet. 2004
May;74(5):827–45.
19. Petraglia M, Clarkson C, Boivin N, Haslam M, Korisettar R, Chaubey G, et al. Population increase and
environmental deterioration correspond with microlithic innovations in South Asia ca. 35,000 years ago.
Proc Natl Acad Sci U S A. 2009 Jul 28;106(30):12261–6.
20. Sharma I, Sharma V, Khan A, Kumar P, Rai E, Bamezai RN, et al. Ancient human migrations to and through
Jammu Kashmir-India were not of males exclusively. Sci Rep. 2018;8(1):1–9.
21. Quamar MF, Bera SK. Pollen records of vegetation dynamics, climate change and ISM variability since the
LGM from Chhattisgarh State, central India. Rev Palaeobot Palynol. 2020;278:104237.
22. Kumar V, Shukla T, Mishra A, Kumar A, Mehta M. Chronology and climate sensitivity of the post‐LGM
glaciation in the Dunagiri valley, Dhauliganga basin, Central Himalaya, India. Boreas. 2020;49(3):594–
614.
Figures
Figure 1
The sampling location of Chhattisgarh state.
Figure 2
The principal component analysis (PCA) of Indian populations from various states, showing the placement of
the studied tribal population.
Figure 3
The spatial distribution of major haplogroups (haplogroups M2, R5 and U2) observed in the studied
geographic region.
Figure 4
The Bayesian Skyline Plot (BSP) based on complete mitogenomes from Chhattisgarh, showing the population
demography of tribal populations in this region.
Supplementary Files
This is a list of supplementary files associated with this preprint.
SuppFig.1.jpg