Y-Chromosomal and Mitochondrial SNP Haplogroup Distribution in Indian Populations and its Significance in Disaster Victim Identification (DVI) - A Review Based Molecular Approach

Sinha M¹*, Rao IA¹ and Mitra M²

¹Department of Forensic Science, Guru Ghasidas University, India

²School of Studies in Anthropology, Pt. Ravishankar Shukla University, India

*Corresponding author: Sinha M, Department of Forensic Science, Guru Ghasidas University, India

Received: December 08, 2016; Accepted: January 19, 2017; Published: January 24, 2017

Abstract

Disaster Victim Identification (DVI) is an important aspect of the response to mass disasters. In India, DVI is particularly challenging, more so than in other developing countries, because there is no organized government agency to coordinate the work. The objective of this article is to highlight the potential and utility of uniparental DNA haplogroup databases in DVI. We therefore review the molecular studies on mitochondrial and Y-chromosomal DNA haplogroup distribution in various ethnic populations from all over India, which can be useful in framing a uniparental DNA haplogroup database of the Indian population for DVI.

Keywords: Disaster Victim identification; Uniparental DNA; Haplogroup database; India

Introduction

Disaster Victim Identification (DVI) is the recognized practice whereby numerous individuals who have died as a result of a particular event have their identity established through scientifically established procedures and methods [1]. Identifying the deceased is crucial in mass disasters, on both humanitarian and legal grounds. In every mass disaster a large number of victims remain unidentified, because the bodies are mostly damaged beyond recognition. In India, DVI is particularly challenging, more so than in other developing countries, because there is no organized theoretical or practical framework for dealing with mass fatalities. Conventional identification methods are available, but they fail to yield information from highly degraded remains. Technical advances in DNA-based methods have made forensic DNA profiling the method of choice. In mass disasters, DNA profiles from the victims' remains are compared with DNA obtained from those claiming the bodies. Bodies matched in this way are identified and the relevant documents are issued, but a problem arises when bodies cannot be identified because informative reference samples are not available. Environmental conditions at the disaster site cause disintegration, decay and commingling of remains. In such situations, DNA profiling alone is not sufficient. In determining the individual identity of the deceased, additional information on the geographical origin of such mutilated bodies could become the gold standard for DVI.

Genetic markers used in DVI

In DVI, the choice of genetic markers should be consistent with the requirements outlined above and should reflect the fact that variation in the human genome is not uniform. This seemingly trivial assertion has implications for a number of marker characteristics, ranging from their distribution in the genome, power of discrimination and population specificity, to their robustness against degradation and their amenability to multiplex and automated analysis. The characteristics of the different markers and the technical approaches currently used in personal genetic identification are discussed below.

Short tandem repeats (STR) markers

STR profiling is the most frequently applied approach for personal genetic identification. Usually, forensically relevant STR loci with 5 to 20 common alleles are considered [2]. These loci are characterized by a high power of individual discrimination and a high Polymorphism Information Content (PIC) [3]. One of the great advantages of STRs is their multiallelic nature, which helps in analyzing mixtures of alleles from multiple contributors, while their non-overlapping size ranges help to distinguish different alleles. They can be analyzed in multiplex, with as many as 16 loci amplified and typed simultaneously, thereby reducing the amount of material consumed [4,5]. However, despite being the most routinely applied genetic markers in forensics, STRs have some limitations that restrict their utility in DVI. STRs are particularly useful on well-preserved bone and soft tissue samples. The amplicon size required for STR analysis (150–450 bp) is too long to allow useful amplification of degraded DNA templates [6]. The development of mini-STRs of 60–80 bp in length, obtained with primers designed to anneal close to the tandem repeat sequence, largely resolves this problem [7-9] while maintaining consistency with the existing core loci. Even so, the template size required for the analysis, including with mini-STRs, may be too long for heavily degraded samples. Another limitation is that, where the analysis is not fully automated, reliable discrimination may not be achievable with the set of 13–15 core loci, although such discrimination is necessary in some cases. Null alleles, triallelic patterns and alleles whose size does not match the standardized allelic ladder are other potential technical problems in STR analysis. STR loci have a high mutation rate (10⁻³–10⁻⁵), which makes them very informative but also less stable [10,11].
The use of STRs in lineage or ethnicity analysis is consequently compromised by the difficulty of distinguishing alleles that are identical by state from those that are identical by descent [12]. Finally, STRs do not provide phenotypic hints regarding the analyzed samples [13].
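
The high power of discrimination of multi-locus STR profiles comes from multiplying per-locus genotype frequencies across independent loci. The following is a minimal Python sketch of this product rule, assuming Hardy-Weinberg proportions and independence between loci. The locus names and allele frequencies are hypothetical placeholders and are not taken from any population database.

```python
from math import prod

def genotype_frequency(p, q=None):
    """Hardy-Weinberg genotype frequency: p^2 for a homozygote, 2pq for a heterozygote."""
    return p * p if q is None else 2 * p * q

def random_match_probability(profile):
    """Multiply per-locus genotype frequencies (product rule, assumes independent loci)."""
    return prod(genotype_frequency(*alleles) for alleles in profile.values())

# Hypothetical allele frequencies for three illustrative loci (not real data).
profile = {
    "locus_1": (0.10, 0.20),  # heterozygote: two different alleles
    "locus_2": (0.15,),       # homozygote: one allele observed twice
    "locus_3": (0.05, 0.25),  # heterozygote
}

rmp = random_match_probability(profile)
print(f"Random match probability over {len(profile)} loci: {rmp:.2e}")
```

Each additional independent locus multiplies the match probability by a factor well below one, which is why a panel of 13 or more multiallelic loci can reach very small values.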

Single-nucleotide polymorphism (SNP) markers

SNP markers offer a useful and increasingly important supplement to routine STR-based DNA profiling. SNPs provide a practically unlimited source of human genome diversity for analysis (Cooper, et al. 1985; Wang, et al. 1998). As an application for DNA identification, SNP profiling offers some advantages over STR markers, but it also has some limitations. An important and general advantage of SNPs for DVI is the ability to analyze heavily degraded fragments and unpreserved tissue samples. With SNPs, DNA templates of less than 60 bp (approximately the length of two flanking primers) can be amplified. The biallelic nature of SNPs simplifies typing but makes them not very informative on a per-locus basis for identity testing. Their utility can be increased by examining many unlinked SNPs, but a large panel of SNPs (50–100) must be genotyped to reach the level of discrimination characteristic of the 13 core STR loci (i.e. 10⁻¹⁵–10⁻¹⁶) [14-16]. Conveniently, multiplex SNP profiling has become cost-effective and standardizable, because large multiplex assays allow complete automation of the process. SNP markers nevertheless have disadvantages: they are difficult to interpret in situations that involve mixtures of samples, since individual profiles cannot be reliably resolved from a mixed sample of unknown source [2]. An important feature of autosomal SNPs is the considerable variability in their heterozygosity levels across the genome. The mutation rate of SNPs is low, on the order of 10⁻⁸ per site per generation [17,18], which makes them stable genetic markers whose allelic state can be taken to reflect identity by descent.
This variability gives rise to differences in the frequency and distribution of SNPs among populations of different ethnic backgrounds, which reflect their distinct evolutionary and demographic histories [19,20]. Most studies of autosomal, X- and Y-chromosome, and mtDNA markers indicate higher levels of genetic variation in African than in non-African populations, reflecting the ancient history of the origin of modern humans [19-21]. Other differences in the distribution of SNP markers reflect more recent demographic events, such as migrations, population bottlenecks, isolation and admixture [22], and may be used to infer the geographic or ethnic origin of an individual. The possibility of using SNPs to infer the ethnic origin of the analyzed sample, which can help in predicting the geographic origin or certain physical traits of a donor, can be an important asset in DVI efforts when no match is found between the victim's remains and any living individual or available database record. Y-chromosomal and mitochondrial SNP haplotypes belong to continent-specific haplogroups and can be used to indicate the patrilineal and matrilineal origin of the analyzed sample; their informativity is limited, but they can provide some indication of paternal and maternal biogeographic ancestry [23,24]. SNP markers have now been applied to DVI in a variety of instances since the terrorist attacks of 11 September 2001 on the World Trade Center in New York City [25]. Finally, because their mutation rate is about 100,000 times lower than that of STRs, SNPs are superior for kinship testing [26] and may replace STRs for this purpose once commercial kits become available.
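
The figure of 50–100 SNPs quoted above can be checked with a short calculation. The sketch below assumes unlinked biallelic SNPs in Hardy-Weinberg proportions that all share the same allele frequency, which is a simplification; the function names are illustrative only. For each SNP it computes the probability that two unrelated individuals share a genotype, then finds how many such SNPs are needed for the combined match probability to fall to 10⁻¹⁵.

```python
from math import ceil, log10

def snp_match_probability(p):
    """Probability that two unrelated people share a genotype at one biallelic SNP
    with allele frequency p, under Hardy-Weinberg proportions."""
    q = 1.0 - p
    genotype_freqs = (p * p, 2 * p * q, q * q)
    return sum(f * f for f in genotype_freqs)

def snps_needed(target, p):
    """Smallest number of unlinked SNPs whose combined match probability is <= target."""
    return ceil(log10(target) / log10(snp_match_probability(p)))

for p in (0.5, 0.3, 0.2):
    print(f"allele frequency {p}: {snps_needed(1e-15, p)} SNPs to reach 1e-15")
```

Under these assumptions, 36 SNPs suffice when every allele frequency is 0.5, and 52 are needed at a frequency of 0.2. Real panels contain SNPs with less than ideal frequencies, which is consistent with the 50–100 range cited in the text.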

Y-chromosomal DNA and mitochondrial SNP markers

Y-chromosomal and mitochondrial SNP markers have already proven their utility in identification cases, but very little emphasis has been given to the potential of these uniparental SNP haplogroups for elucidating geographical ancestry. “Haplogroup is a genetic population group of people who share a common ancestor on the (y-chromosomal DNA) patrilineal or (mt DNA) matrilineal line. They are assigned letters of alphabets and numbers” (International Society of Genetic Genealogy); alternatively, a haplogroup can be defined as a group of haplotypes that share a common ancestor. A haplotype, in turn, is a group of genes inherited together from a single parent. Both Y-chromosomal and mitochondrial DNA haplogroups are identified by single nucleotide polymorphisms (SNPs), that is, positions in the DNA where one nucleotide has mutated to a different nucleotide. Individuals are assigned to a particular haplogroup by detecting the presence of a particular SNP at these uniparental DNA positions. Uniparental haplogroups therefore have the potential to trace geographical ancestry. A large number of molecular studies carried out since the 20th century have verified the utility of haplogroups in tracing the origin and ethnic affiliation of various populations of the world. Haplogroup frequencies vary from population to population and from continent to continent [27] because of the diverse nature of human populations at the genetic level. Examination of Y-chromosome and mitochondrial haplogroups can give information on the biogeographic ancestry of a person. Y-chromosome haplogroups have different frequency distributions around the world. For example, haplogroups A and B are found almost exclusively among sub-Saharan Africans; H is found solely on the Indian subcontinent (and among Roma); and M is found only in Oceania. Other Y haplogroups, such as R and N, are seen across vast areas of Eurasia.
Similarly, recent progress in whole mtDNA (mitochondrial DNA) sequencing has increased understanding of the mtDNA phylogeny and revealed a large number of mtDNA haplogroups. Many mtDNA haplogroups show restricted continental distributions, for example haplogroup L in Africa, V in Europe and the Middle East, P and Q in Oceania, and haplogroups M (except M1) and U in India [28-31].
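
The regional associations listed above can be written down as a simple lookup. The sketch below is illustrative only: the tables contain just the haplogroups named in this section, the function name `regional_hint` is hypothetical, and falling back from a sub-clade to its first letter is a simplifying assumption rather than a proper phylogenetic assignment.

```python
import re

# Regional associations as stated in the text above; not an exhaustive phylogeny.
Y_HAPLOGROUP_REGIONS = {
    "A": "sub-Saharan Africa",
    "B": "sub-Saharan Africa",
    "H": "Indian subcontinent (and Roma)",
    "M": "Oceania",
    "R": "widespread across Eurasia",
    "N": "widespread across Eurasia",
}
MT_HAPLOGROUP_REGIONS = {
    "L": "Africa",
    "V": "Europe and the Middle East",
    "P": "Oceania",
    "Q": "Oceania",
    "M": "India",  # the text excludes sub-clade M1
    "U": "India",
}

def regional_hint(haplogroup, marker_type):
    """Return the regional association for a haplogroup label, or None if not listed.

    marker_type is "Y" or "mt". Sub-clades (e.g. "H1a") fall back to their first
    letter, except mtDNA M1 and its sub-clades, which the text excludes from India.
    """
    label = haplogroup.strip()
    if marker_type == "mt":
        # Match "M1", "M1a", ... but not "M10" or "M18".
        if re.fullmatch(r"M1([A-Za-z].*)?", label):
            return None
        table = MT_HAPLOGROUP_REGIONS
    else:
        table = Y_HAPLOGROUP_REGIONS
    return table.get(label[:1].upper())

print(regional_hint("H1a", "Y"))
print(regional_hint("M18", "mt"))
print(regional_hint("M1", "mt"))
```

A lookup of this kind only gives a coarse hint. Haplogroups that are not listed return None, and a real assignment would rely on a curated phylogenetic tree and population frequency data.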

Human genetic diversity is not uniformly distributed across the globe (Cavalli-Sforza, Menozzi, and Piazza 1994). Understanding how these differences are distributed is important for answering questions about ethnic diversity, migrations and founder populations, for studies of complex disorders and pharmacogenomics, and for determining geographical origin in disaster victim identification in a large and diverse population. Geographical differences in diversity patterns are seen at many levels. At the continental level, heterozygosity is higher in African populations, which is consistent with their larger effective population size and their longer time of occupation [32]. On the other hand, abrupt changes and genetic boundaries generated by social hierarchy have also created differences [33]. Differences at the level of gene frequencies may signify natural selection, although the majority of DNA variation is considered to be neutral or nearly so.

Diversity on the haploid Y chromosome is strongly correlated with geography, which makes it an efficient tool for examining both settlement and population expansions [34,35]. The reasonably well-resolved global Y-chromosome phylogeny, and its continued extension, underpins studies and comparisons of composite Y-chromosome binary haplogroup diversity [36-41]. Extensive studies on the organization and function of the human mitochondrial genome have been widely used in population genetics and evolutionary studies. Its maternal transmission and lack of recombination help in identifying shared ancestry. Matching a sample to a particular haplogroup or its subclades serves to link individuals to a common geographic origin [42]. The worldwide distribution of mitochondrial DNA (mtDNA) variants links particular clades of the mtDNA phylogeny with certain geographic regions. However, a multiplex genotyping system for detecting the mtDNA haplogroups of major continental distribution would be desirable for efficient DNA-based biogeographic ancestry testing in various applications [43]. Although mitochondrial DNA has a low power of discrimination owing to its lack of recombination and uniparental inheritance, several typing methods are now used to improve it [44].

Indian scenario: Y-chromosomal and mitochondrial DNA SNP haplogroup distributions in India

The distinctive properties of uniparental DNA markers, such as uniparental (paternal or maternal) inheritance, small effective population size, polymorphic nature and absence of recombination, have allowed investigators worldwide to use these informative markers to determine the genetic structure and the maternal and paternal geographical history of human populations. In recent years, researchers have addressed major questions about the peopling and origin of various populations of India through haplogroup determination, by hierarchical typing of Y-chromosome binary polymorphisms and sequencing of the mitochondrial DNA hypervariable region. Studies on these markers help to understand the association of ethnicity and geography with particular haplogroup frequencies.
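
One way a haplogroup frequency database could support an inference about origin is through Bayes' rule: the probability of each candidate population, given an observed haplogroup, is proportional to the prior for that population multiplied by the haplogroup's frequency in it. The sketch below illustrates the arithmetic only. The group names and frequencies are invented for the example and are not taken from the studies reviewed here, and the function `posterior_origin` is a hypothetical helper.

```python
def posterior_origin(haplogroup, frequencies, priors=None):
    """Posterior probability of each candidate population given one observed haplogroup.

    frequencies maps population -> {haplogroup: frequency}. Priors default to uniform.
    """
    populations = list(frequencies)
    if priors is None:
        priors = {pop: 1.0 / len(populations) for pop in populations}
    weighted = {
        pop: priors[pop] * frequencies[pop].get(haplogroup, 0.0)
        for pop in populations
    }
    total = sum(weighted.values())
    if total == 0:
        raise ValueError(f"haplogroup {haplogroup!r} not observed in any population")
    return {pop: w / total for pop, w in weighted.items()}

# Hypothetical frequencies for three made-up reference groups (not real data).
HYPOTHETICAL_DB = {
    "group_A": {"O2a": 0.60, "H1": 0.20, "R1a1": 0.05},
    "group_B": {"O2a": 0.05, "H1": 0.35, "R1a1": 0.30},
    "group_C": {"O2a": 0.15, "H1": 0.25, "R1a1": 0.45},
}

posterior = posterior_origin("O2a", HYPOTHETICAL_DB)
for pop, prob in posterior.items():
    print(f"{pop}: {prob:.4f}")
```

With these invented numbers and uniform priors, observing O2a gives group_A a posterior of 0.75. Because a single uniparental marker rarely separates populations cleanly, such a result would narrow down the candidates for further testing and would not identify an individual.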

In this context, we present a systematic review of the studies conducted on Indian populations using different sets of Y-chromosomal and mitochondrial DNA markers to determine haplogroup distributions. The details of the populations studied by various authors, their geographical affiliation and haplogroup distributions are summarized in Table 1 and Table 2 for Y-chromosomal and mitochondrial DNA, respectively. The number of populations studied for Y-chromosomal and mitochondrial DNA polymorphisms in different geographical regions of India is given in Table 3 and Table 4. The number of populations studied in each geographical region of India for Y-chromosomal and mitochondrial DNA is depicted in Table 5 and Table 6. Frequencies of Y-chromosomal haplogroups in four language families and two major ethnic groups of India are summarized in Table 3. A language family can be defined as a group of languages related through descent from a common ancestor, called the proto-language of that family. The term “family” reflects the tree model of language origination in historical linguistics, which uses a metaphor comparing languages to people in a biological family tree or, in a subsequent modification, to species in a phylogenetic tree of evolutionary taxonomy (https://www.ethnologue.com/) [29]. A population of India that speaks a language of a particular language family is said to have a linguistic affiliation with that family. In India there are four language families: Dravidian, Indo-European, Tibeto-Burman and Austro-Asiatic.

Y-chromosomal haplogroups reported for the four language families and two major ethnic groups of India ("-" marks an empty cell)
References	Austro-Asiatic	Dravidian	Indo-European	Tibeto-Burman	Caste	Tribe
[48-54,59,62,63]	O2a	H1	O2a	O3e	R1a1, H1	O2a
	-	-	H1a	-	-	-
	O2a	-	-	-	-	-
	O2a	O2a	O2a	-	-	-
	-	-	-	O3a3c	-	-
	K	BR*	BR*	K	-	-
	O2a	H1, R1a	H1, R1a	O2a	-	-
	J2b	J2a	J2a	J2a, J2b	J2a, J2b, J1	J2a
	O2a	H1	R1a1	O2a	R1a1, R2	O2a
	-	-	-	-	R	H
	-	-	-	-	R, H	O

Mitochondrial DNA haplogroups reported for the four language families and two major ethnic groups of India
References	Austro-Asiatic	Dravidian	Indo-European	Tibeto-Burman	Caste	Tribe
[48,50,52,61-63,65,66,70,72,74,76-79,81]	M, M2, M3, M33, M37, M38, M55, M57, M4, M6, M48, M49, HV, pHV, R*, R6, U2i	R, R*, R5, R6, R7, R30, HV, N1, J2, U*, U2, U2a, U5, U7, N1, N, N5, R*, R5, TJ, X, J101, T1a, T2a, U1a, U2b, K1a1, K	M, M5, F, R, R*, R6, J, W, H, HV, U*, U2a, U2b, U2i, B, F, H, HV, N1, N, R*, U*, U1, U2i, U7	M, M6, M9, D, M18, M59, M60, R*, R9, W, U, U2, U2b, U7, M, C, D, N, A, B, F	M*, M3, U(K), R(B,J,T,F), A, U, U2a, U2b, U2i, U7	R*, HV, pHV, R*, R6, U2i, H, R*, R5, R7, T, U, U2a, U2b, U7, U2i, U7a

Number of populations studied for Y-chromosomal DNA polymorphisms, by geographical region
Geographic Region	No. of populations studied	References
North India	21	[45-52]
South India	57	[47,49-51,55-59]
Central India	21	[45,46,49,53,61,62]
East India	57	[45,49,50,53]
West India	22	[51,52]
North-East India	13	[49-51,53,60]

Number of populations studied for mitochondrial DNA polymorphisms, by geographical region
Geographic Region	No. of populations studied	References
North India	21	[55,56,63,65-68]
South India	51	[51,53,64,66,69,71,74]
East India	28	[63,65,66,70,76]
West India	35	[48,52,63,65,66,70,74,76-79]
North-East India	25	[50,55,63,66,70,72,82]
Central India	11	[61-63,70,80]
Andaman & Nicobar	5	[50,70,81]