Tuesday, March 26, 2013

Y-DNA Haplogroup N in India: Wayward Uralics or Lab Error? [Original Work]

Y-DNA Haplogroup N Eurasian Distribution

Per ISOGG's 2013 SNP tree and as has been the case for years, Y-DNA Haplogroup N is defined by the M231 mutation (G->A at rs9341278) on the Y-Chromosome. With a predominantly North Eurasian distribution, it peaks in Europe among the Finnish people and various ethnic groups residing in Russia's far north through the N1c-Tat subclade. N1c-Tat specifically is frequently associated with Uralic-speaking populations in the literature.

Haplogroup N also appears to have an association with Central Asia as shown in the N Y-DNA Haplogroup Project (FTDNA) results, with several samples coming in from Kazakhstan, Uzbekistan and Mongolia. It has also been observed in Turkey (KurdishDNA blog entry) as well as appearing in 1.6% of Iran's Azeri population (Grugni et al. entry).

The finding of Haplogroup N in India in Sharma et al.'s "The Indian origin of paternal haplogroup R1a1* substantiates the autochthonous origin of Brahmins and the caste system" [1] is a curious one. Unfortunately, the paper did not include any Y-STR data to help understand the basis of N's presence in India.

Significance of Potential Haplogroup N in India

Linguistics provides us with a plausible scenario for how Haplogroup N may have arrived in the Indian Subcontinent. Contacts between early Finno-Ugric and Indo-Iranian groups took place around the Ural mountains, specifically between the forest and steppe zones. The transmission of horsekeeping techniques, economy, deities and common words from the Andronovo archaeological horizon on the steppes into the "Andronovoid" societies living in the nearby forests is firmly established. [2]

Haplogroup N in India, if present in relevant populations and displaying MRCA values or STR clusters consistent with a Neolithic origin further north, would support the idea that it represents an accompanying genetic signal from the steppe zone roughly four thousand years ago, as well as a genetic remnant of the interactions that undoubtedly took place between Indo-Iranian and Finno-Ugric tribes.

Current Findings

In 2009, Sharma et al. published a paper highlighting the Y-Chromosome haplogroup differences between various upper caste (Brahmin) and tribal populations across India. The paper went on to deduce that Haplogroup R1a1a in India was autochthonous in origin based on their findings [1] (now disputable and improbable based on Underhill et al.'s landmark study on Y-DNA R1a1a and recent findings by the R1a Subclades FTDNA Project, although this topic is beyond the scope of this entry).

It was this very paper by Sharma et al. which revealed the presence of Y-DNA N in India. Haplogroup N1-LLY22g was found in Brahmins from Gujarat, Madhya Pradesh and Maharashtra (3.13%, 2.38% and 3.33% respectively), as well as tribal populations from Uttar Pradesh (1.56%). Their results were extended to include greater caste differentiation (Brahmins vs. Scheduled Castes vs. Tribals); here, Brahmins were found to have five times the frequency of N1-LLY22g seen in tribal groups (0.5% vs. 0.1% respectively). [1]

Although the frequencies were arguably insignificant, the inference stood: Y-DNA Haplogroup N showed an association with the upper caste practitioners of Hinduism in India, paving the way for the scenario described in the section above to be considered.
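To put "arguably insignificant" into numbers, the short Python sketch below converts each reported regional percentage into the sample size at which a single carrier would produce it. This is only arithmetic on the figures quoted above; the one-carrier reading is my assumption, not something stated in Sharma et al., and the function name is purely illustrative.

```python
# Regional N1-LLY22g frequencies (percent) as quoted above from Sharma et al. [1].
reported = {
    "Gujarat Brahmins": 3.13,
    "Madhya Pradesh Brahmins": 2.38,
    "Maharashtra Brahmins": 3.33,
    "Uttar Pradesh tribals": 1.56,
}


def implied_sample_size(percent):
    """Sample size at which ONE carrier gives roughly this percentage.

    Assumption: the reported frequency reflects a single individual.
    The paper's actual counts are not reproduced here.
    """
    return round(100 / percent)


for group, pct in reported.items():
    n = implied_sample_size(pct)
    # Show the frequency a single carrier in n samples would yield.
    print(f"{group}: {pct}% is close to 1 carrier in {n} ({100 / n:.3f}%)")
```

If that reading is right, each regional figure could rest on one man, which is exactly the situation where a single mistyped sample changes the picture.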

However, the strength of this conclusion is weakened greatly by cross-sectional data from numerous studies concerning the Indian Subcontinent produced in the past decade:

  • Sengupta et al. (2006) revealed that, out of 1090 samples, with the majority coming from the Indian Subcontinent, the only populations revealing any Haplogroup N (N-M231) and associated downstream subclades were either East Asian (Chinese ethnicities, Cambodian) or Siberian (Yakut). [3] No groups from India belonged to Haplogroup N-M231.
  • Furthermore, Sahoo et al. (2006) also sampled individuals from across the Indian Subcontinent (n=1074) and failed to find a single instance of N-M231. [4]
  • In a recent study on various populations in Tamil Nadu (South India), Haplogroup N was completely absent in the 1680 samples tested. [5]
  • Y-DNA N1c-Tat was absent in the 607 tribal samples tested from East and Northeast India. [6]
  • Returning to the north of the country, 560 men from various upper castes and Muslim groups were tested by Zhao et al. and N1c-Tat was absent from all. [7] 
  • Focusing specifically on Saraswat Brahmins from Jammu and Kashmir, Yadav et al. found none of the approximately 109 haplotypes to belong to any derivative of Haplogroup N. [8]

Finally, the N Y-DNA Haplogroup Project at FTDNA currently does not show any samples whatsoever from the Indian Subcontinent.

Possible Explanation

Across more than 4,000 samples from five studies representing various groups from across India, not a single trace of Haplogroup N has been detected. What explains this glaring discrepancy with Sharma et al.'s findings? Differences in sampling strategy between Sharma et al. and the other studies cannot account for it; there is enough regional overlap to rule this out.
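A rough way to gauge how glaring the discrepancy is: if Haplogroup N really occurred at the pooled frequencies Sharma et al. report, how likely would it be to draw roughly 4,000 men and find none? The Python sketch below is a back-of-the-envelope calculation under simplifying assumptions of my own: independent random sampling from populations comparable to Sharma et al.'s, and every study typing a marker that would catch N (two of the studies above report only N1c-Tat, so the effective sample is smaller than the headline total). The function names are illustrative.

```python
def prob_zero_carriers(freq, n):
    """Chance of seeing no carriers in n independent samples
    when the true population frequency is freq (binomial model)."""
    return (1.0 - freq) ** n


def upper_bound_95(n):
    """Exact one-sided 95% upper bound on the true frequency
    after observing zero carriers in n samples."""
    return 1.0 - 0.05 ** (1.0 / n)


n = 4000  # conservative round figure used in the text
for label, freq in [("Brahmin pooled, 0.5%", 0.005),
                    ("tribal pooled, 0.1%", 0.001)]:
    p0 = prob_zero_carriers(freq, n)
    print(f"{label}: P(no carriers in {n} samples) = {p0:.2e}")

print(f"95% upper bound on frequency given zero hits: {upper_bound_95(n):.4%}")
```

Under these assumptions, the Brahmin figure is very hard to reconcile with a complete absence elsewhere, while the tribal figure is merely unlikely. The model ignores population structure, so it should be read as an order-of-magnitude guide rather than a formal test.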

As was the case with Sengupta et al., where several Hazara haplogroup classifications were allegedly due to a laboratory error, it is probable that the Haplogroup N seen here follows suit. By reasonable deduction, if one study reveals a trend that several others covering thousands of samples cannot verify, there must be something intrinsically erroneous in the former.


Until I can physically view the purported Haplogroup N haplotypes reported in Sharma et al., it is the conclusion of this entry that they are most likely the result of a laboratory error, given the complete absence of any flavour of N-M231 in India in other recent studies. If any Haplogroup N is found, it must be compared against Sharma et al. and should be investigated as a separate line of inquiry. As ever, details of any future cases of Haplogroup N in India should be taken into consideration. If of a Mughal background, the paternal origins are readily explained by Medieval Central Asian ancestry. If from the furthest northeast of the Indian Subcontinent, the possibility of Nepali ancestry should be considered. [9] Although prehistoric indirect influence from Finno-Ugric interactions from the second millennium BC onwards shouldn't be dismissed outright, other more recent explanations exist.


1. Sharma S, Rai E, Sharma P, Jena M, Singh S, Darvishi K. The Indian origin of paternal haplogroup R1a1* substantiates the autochthonous origin of Brahmins and the caste system. J Hum Genet. 2009 Jan;54(1):47-55. doi: 10.1038/jhg.2008.2. Epub 2009 Jan 9.

2. Kuz'mina EE. The Origin of the Indo-Iranians. Koninklijke Brill NV, Leiden, The Netherlands. 2007.

3. Sengupta S, Zhivotovsky LA, King R, Mehdi SQ, Edmonds CA, Chow CE. Polarity and temporality of high-resolution y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of Central Asian pastoralists. Am J Hum Genet. 2006 Feb;78(2):202-21. Epub 2005 Dec 16.

4. Sahoo S, Singh A, Himabindu G, Banerjee J, Sitalaximi T, Gaikwad S. A prehistory of Indian Y chromosomes: evaluating demic diffusion scenarios. Proc Natl Acad Sci U S A. 2006 Jan 24;103(4):843-8. Epub 2006 Jan 13.

5. Arunkumar G, Soria-Hernanz DF, Kavitha VJ, Arun VS, Syama A, Ashokan KS. Population differentiation of southern Indian male lineages correlates with agricultural expansions predating the caste system. PLoS One. 2012;7(11):e50269. doi: 10.1371/journal.pone.0050269. Epub 2012 Nov 28.

6. Borkar M, Ahmad F, Khan F, Agrawal S. Paleolithic spread of Y-chromosomal lineage of tribes in eastern and northeastern India. Ann Hum Biol. 2011 Nov;38(6):736-46. doi: 10.3109/03014460.2011.617389. Epub 2011 Oct 6.

7. Zhao Z, Khan F, Borkar M, Herrera R, Agrawal S. Presence of three different paternal lineages among North Indians: a study of 560 Y chromosomes. Ann Hum Biol. 2009 Jan-Feb;36(1):46-59. doi: 10.1080/03014460802558522.

8. Yadav B, Raina A, Dogra TD. Genetic polymorphisms for 17 Y-chromosomal STR haplotypes in Jammu and Kashmir Saraswat Brahmin population. Leg Med (Tokyo). 2010 Sep;12(5):249-55. doi: 10.1016/j.legalmed.2010.05.003.

9. Gayden T, Chennakrishnaiah S, La Salvia J, Jimenez S, Regueiro M, Maloney T. Y-STR diversity in the Himalayas. Int J Legal Med. 2011 May;125(3):367-75. doi: 10.1007/s00414-010-0485-x. Epub 2010 Jul 21.