Saturday, December 22, 2012

Yaghnobi Tajiks: Preliminary Results May Reveal Iranian Plateau Affinity [Original Work]

Slipping under the radar of the genetic genealogy world is this paper by Elisabetta Cilli and her colleagues, which investigated the mitochondrial data of 62 individuals from Tajikistan's Yaghnobi population. [1]

The Yaghnobis are of interest given their geographical isolation and the East Iranic nature of their language. Spoken just northeast of the predominantly Persian (Dari) speaking capital, Dushanbe, Yaghnobi is a continuation of a fully agglutinative Soghdian dialect and is the sole survivor of that language following the Persianization of Central Asia in Medieval times [2]. Despite its East Iranic vocabulary, Yaghnobi demonstrates several linguistic features (i.e. gender loss, preservation of a past imperfective formed from the present stem of a verb) which separate it from the modern East Iranic languages immediately surrounding it. Adding to the uniqueness of the Yaghnobi language in this context is the unity it forms, through these features, with languages mostly spoken further west on the Iranian plateau (e.g. Persian, Gilaki, Kurdish dialects). [2]

Although the results are preliminary and are not accompanied by any raw data, Cilli et al. have discovered some interesting connections between the Yaghnobi and relevant populations. In summary, they found the following:

  • The 42 individuals used for the preliminary work belonged to only 19 distinct mtDNA haplotypes. Of these, 11 were unique to the Yaghnobi.
  • The Yaghnobi have less mtDNA genetic diversity than other Central Asian populations (0.930). This is attributed to their geographical isolation and to their displacement by the U.S.S.R. in the 1970s for agricultural purposes, after which a small group (300) returned and repopulated their original homelands.
  • Intriguingly, the Yaghnobi shared all of the mutual haplotypes (8/19) with populations from Iran (e.g. Gilakis, Mazandaranis and Iranians from Tehran and Esfahan) rather than with other Central Asian groups, including their Tajik compatriots.
  • The Yaghnobi shared the most of these mutual haplotypes with Gilakis, Kurmanji Kurds and Avars from the Caucasus (4 each).
  • However, owing to their predominantly distinct mtDNA character, the Yaghnobi are clear outliers from the general zone occupied by the reference groups.

[Figure: MDS plot of results]
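
For readers unfamiliar with how a diversity figure such as the one quoted above is typically obtained, here is a minimal sketch. It assumes the statistic is Nei's unbiased haplotype diversity, which the summary above does not confirm, and it uses made-up haplotype labels because the paper's raw sequences were not published.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n / (n - 1) * (1 - sum(p_i ** 2)).

    `haplotypes` holds one haplotype label per sampled individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies (probability two draws match).
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical toy sample: NOT the Yaghnobi data, which were not published.
toy_sample = ["A"] * 3 + ["B"] * 2 + ["C"]
print(round(haplotype_diversity(toy_sample), 3))
```

The value approaches 1 when almost every individual carries a different haplotype and falls as a few haplotypes come to dominate, which is the pattern expected after a founder effect or bottleneck.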

My critique and interpretation of these results are as follows:

  • At least two instances of genetic drift (a founder effect via geographic isolation and a bottleneck due to the Soviet relocation) are likely responsible for the decreased mtDNA diversity. Thus, the reduced diversity is simply a reflection of their environment.
  • Given the Soviet relocation, it may be useful to determine whether results from the displaced parent population match what has been reported here. This is quite feasible given the relocations occurred just over one generation ago (~40 years).
  • It is difficult to criticise the decision to sample 62 individuals and analyse 42 of them, given that the Yaghnobi population in their homeland between 2007 and 2009 numbered only approximately 500. Approximately 8% of that population was therefore analysed here, which is a generous proportion given the amount of attention the region has received.
  • The MDS plot would have benefited from the inclusion of populations in Europe, Southwest Asia and South Asia to comprehensively flesh out the position of Yaghnobis in Eurasia.
  • Accepting that this is a preliminary investigation, it would still have been pleasing to see some raw data published. Aside from confirming that at least one Yaghnobi matched the Cambridge Reference Sequence (CRS, thus Haplogroup H2a2a, which happened to be found in all the populations tested), there is no indication as to what the other mutations looked like or, for that matter, which mtDNA haplogroups were even present!
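
The coverage figure in the critique above is simple arithmetic, sketched here for transparency. The population of 500 is the approximate homeland figure quoted above; the function name is only illustrative.

```python
def sampled_fraction(n_sampled, population):
    """Share of a population covered by a sample."""
    return n_sampled / population


HOMELAND_POPULATION = 500  # approximate figure quoted above

# 42 = individuals used in the preliminary analysis, 62 = individuals sampled.
for n in (42, 62):
    print(n, f"{sampled_fraction(n, HOMELAND_POPULATION):.1%}")
```

The roughly 8% figure therefore corresponds to the 42 analysed individuals; counting all 62 sampled individuals would raise the coverage to about 12%.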

Correlation with Y-Chromosomal Data?

The Yaghnobi have been studied at least one other time, through their inclusion in Dr. Spencer Wells et al.'s seminal paper "The Eurasian heartland: a continental perspective on Y-chromosome diversity". The breakdown of their Y-Chromosomal SNP data (n=31) is as follows: [3]

3% C-M130 (xC3a3-M48)
32% J2-M172
3% K-M9 (xO-M175, O3-M122, O1a-M119, O2a1-M95, N1c1-M46) (possibly a paragroup such as K*-M9)
10% L-M20
3% P-M45 (xQ1a1-M120, Q1a3a1-M3, R2a-M124)
32% R1-M173 (likely R1b1a1-M73 or R1b1a2-M269)
16% R1a1a-M17 (xR1a-M87, private marker)

[Figure: Y-SNP clustering reveals Yaghnobis sit near SE Europe and the Near-East]
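
With n=31, each individual is worth about 3.2%, so the rounded percentages above can be converted back into approximate head counts. The sketch below does this. The counts are my inference from the rounded percentages, not figures taken from Wells et al., and the shortened haplogroup labels are used only as dictionary keys.

```python
N = 31  # Yaghnobi sample size in the Y-chromosomal data

# Rounded percentages as listed above.
reported_percent = {
    "C-M130": 3,
    "J2-M172": 32,
    "K-M9": 3,
    "L-M20": 10,
    "P-M45": 3,
    "R1-M173": 32,
    "R1a1a-M17": 16,
}

# Inferred head counts (an assumption: nearest whole number of individuals).
counts = {hg: round(pct / 100 * N) for hg, pct in reported_percent.items()}

# Combined share of the four haplogroups discussed in the comparison with Iran.
focus = ("J2-M172", "L-M20", "R1-M173", "R1a1a-M17")
combined_share = sum(counts[hg] for hg in focus) / N

print(counts)
print(sum(counts.values()), f"{combined_share:.1%}")
```

The inferred counts sum to exactly 31, which suggests the listed percentages are internally consistent even though they add up to 99% after rounding. It also shows how coarse the resolution is: a single individual moves any frequency by about three percentage points.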

Despite the double genetic drift undoubtedly affecting the frequencies, it is worth pointing out that the Yaghnobi present a broadly similar Y-DNA spectrum to that of Iran, where J2-M172, L-M20, R1-M173 and R1a1a-M17 (including subclades) comprise approximately 53% of the national average (refer to the Grugni et al. analysis).

This comparison should be taken with a grain of salt: the Iranian national average also includes non-Iranic-speaking ethnic groups, the Wells Yaghnobi data lack thorough downstream Y-SNP resolution, the sample size (n=31) is contentious and at least two sources of genetic drift are at play. However, that the Yaghnobi appear rich in J2, L and R is certainly reminiscent of Iranic-speaking populations in the region.


The Yaghnobi are an exceedingly interesting population whose uniparental markers, taken together, seem to support a connection with populations further west than one would anticipate.

Despite misgivings about all the data concerning them to date, the mtDNA similarity does parallel the specific linguistic features that the Yaghnobi language shares with languages of the Iranian plateau, such as Kurdish or Persian.

If the data hold up in future investigations, this connection certainly calls into question whether the proposed model of linguistic inheritance exclusively down the paternal line (as represented by Y-DNA data) is entirely correct.

How the Yaghnobi came to carry these markers whilst speaking an East Iranic dialect with traits akin to those found in West Iranic languages is an intriguing question. One possible scenario is that the Yaghnobi are partly descended from ancient Iranians who came from the Iranian plateau during the Achaemenid era. This would also account for the linguistic commonalities noted in current literature.

Time (with the assistance of more mtDNA, Y-DNA and auDNA) will help us understand what happened in Central Asia during the formative period that was the Indo-Iranian migrations.


1. Cilli E, Delaini P, Costazza B, Giacomello L, Panaino A, Gruppioni G. Ethno-anthropological and genetic study of the Yaghnobis; an isolated community in Central Asia. A preliminary study. J Anthropol Sci. 2011;89:189-94.

2. Windfuhr, G. The Iranian Languages. 1st ed. Routledge Language Family Series. 2009.

3. Wells RS, Yuldasheva N, Ruzibakiev R, Underhill PA, Evseeva I, Blue-Smith J. The Eurasian heartland: a continental perspective on Y-chromosome diversity. Proc Natl Acad Sci U S A. 2001 Aug 28;98(18):10244-9.