Thursday, February 9, 2012

Autosomal variation from Anatolia to the Tarim periphery [Original Work]

ADMIXTURE infers ancestral components without indicating how a component came to be shared, so the nature of an autosomal component shared between several populations is difficult to discern. For instance, a given component may originate in one population and be donated to others (e.g. purported African admixture in the Arabian Peninsula), stem from a common ancestral population (e.g. West Eurasian-specific components shared by the Druze and the French Basque in low-K runs) or be the result of genetic drift (e.g. potentially, the peaking of East Asian-specific components in Korea and Japan).

Nevertheless, using results from the latest Dodecad Ancestry Project K12b run (link), I have investigated how the components vary along a west-east axis from Anatolia to the Tarim periphery in western China, with the intention of establishing the nature of the components observed across this area. Raw values can be viewed on the newly-published Vaêdhya Data Sink. Populations are listed in geographical order along this axis.

One of the most immediate observations is the similarity between Kurdish and Iranian populations: their admixture percentages usually differ by no more than 1% per component. This suggests that Kurds and Iranians have common origins, with the former largely maintaining those ancestral signals despite moving further westwards relative to their linguistic cousins in Iran.
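The comparison behind this observation can be sketched in Python as a per-component absolute difference between two population averages. This is only a minimal illustration: the function name, the component labels and the percentages are placeholders invented for the example and are not the K12b values, which should be taken from the Data Sink.

```python
def component_deviations(pop_a, pop_b):
    """Absolute difference per component, in percentage points.

    pop_a and pop_b map component names to admixture percentages.
    A component missing from one population is treated as 0%.
    """
    components = sorted(set(pop_a) | set(pop_b))
    return {c: abs(pop_a.get(c, 0.0) - pop_b.get(c, 0.0)) for c in components}


# Placeholder percentages for two hypothetical populations (NOT K12b results).
example_a = {"component_1": 28.0, "component_2": 35.0, "component_3": 12.0}
example_b = {"component_1": 28.5, "component_2": 34.2, "component_3": 12.4}

deviations = component_deviations(example_a, example_b)

# The component on which the two populations differ most.
largest = max(deviations, key=deviations.get)
print(largest, round(deviations[largest], 2))
```

Two populations would count as near-identical in the sense used here if every value in `deviations` stays at or below roughly one percentage point.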

The near-congruency between the Assyrians and Armenians is also striking, apart from differences in the North European, Caucasus and Southwest Asian components. It is again tempting to postulate that the two descend for the most part from a similar root population, with the aforementioned component differences accounting for the linguistic differences.

If one groups the Kurds with the Iranians, several of the autosomal components shown here have a distribution that appears to be determined by geography alone:

  • South Asian peaks in Tajiks, who are situated roughly north-northwest of the Indian Subcontinent.
  • Caucasus reaches a maximum in Armenians and adjacent populations.
  • Atlantic Med steadily decreases as one moves further away from Europe.
  • Southeast Asian shows the opposite trend to the above, reaching significant levels only in the Uyghurs.
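The two kinds of pattern in the list above (a peak in one population, and a steady decline along the axis) can be checked mechanically. The following Python sketch assumes the populations are supplied in west-to-east order; the helper names, population labels and percentages are hypothetical placeholders, not K12b results.

```python
def peak_population(cline, values):
    """Return the population in the cline with the highest component value."""
    return max(cline, key=lambda pop: values[pop])


def is_non_increasing(cline, values, tolerance=0.0):
    """True if the component never rises by more than `tolerance`
    (percentage points) from one population to the next along the cline."""
    series = [values[pop] for pop in cline]
    return all(later <= earlier + tolerance
               for earlier, later in zip(series, series[1:]))


# Hypothetical populations ordered west to east, with placeholder percentages.
west_to_east = ["pop_w", "pop_x", "pop_y", "pop_z"]
declining = {"pop_w": 20.0, "pop_x": 15.0, "pop_y": 9.0, "pop_z": 4.0}
peaked = {"pop_w": 5.0, "pop_x": 8.0, "pop_y": 30.0, "pop_z": 12.0}

print(peak_population(west_to_east, peaked))
print(is_non_increasing(west_to_east, declining))
print(is_non_increasing(west_to_east, peaked))
```

A small non-zero `tolerance` may be sensible in practice, since population averages from small samples will not decline perfectly smoothly.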

Other components appear to have more complicated distributions:

  • Interestingly, East Asian and Siberian occur at broadly similar levels in the populations containing them. Both are elevated in populations speaking Turkic/Altaic languages relative to neighbours speaking other languages, which confirms genetic input from the Turkic steppe nomads who expanded from the eastern side of Central Asia, eventually reaching the Iranian plateau and Anatolia. However, some of the Siberian and East Asian values may simply be the result of prehistoric demic diffusion across Eurasia (suggested by a potential gradient between Kurds/Iranians <-> Tajiks), although that gradient may itself derive from medieval steppe ancestry.
  • Southwest Asian peaks in Assyrians, the only Semitic-speaking population shown in this analysis. This component falls rapidly beyond the Iranian plateau but is found at a background frequency east of Turkmenistan. Whether this is again an artefact of prehistoric demic movements or of more recent migrations (e.g. Silk Road, various Persian empires) is debatable. As with the Siberian and East Asian components, there is an elevation which defies the geographical pattern and agrees with historical accounts: the Tajiks, who descend in part from Persian speakers escaping Iran after the Sassanid collapse, show an elevation relative to the Uzbeks and Uyghurs. The greater frequency in Christian Armenians relative to the predominantly Muslim Kurdish territories and Iran argues strongly against the notion that it was introduced by the Islamic expansion out of the Arabian Peninsula.
  • The Gedrosia component has a bifurcated peak between Iranians and Tajiks, implying an ultimate peak in the region of Pakistan (corroborated by other Dodecad population results, such as the Balochis of Pakistan). However, the Gedrosia frequency drops from a stable 28% across West Iranic-speaking populations to 13-18% in Anatolian Turks, Armenians and Assyrians. It is again impossible to infer whether this is of prehistoric origin (i.e. a shared Neolithic phenomenon between the Iranian plateau and South-Central Asia) or more recent (inflated Gedrosia values being a function of Median, Persian and Parthian ancestry).
  • The North European component has what appears to be a dual geographic and linguistically-oriented distribution, which may be confounded further by recent interactions between Europe and some of the populations shown here (Anatolian Turks may potentially be the greatest example of this). It is interesting to note that the Assyrians and Armenians show an inverse relationship in the North European and Southwest Asian components despite otherwise appearing identical. The elevated frequency of this component in Central Asia will hopefully be covered in a future entry.
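The language-group contrast described for the East Asian and Siberian components amounts to comparing the mean of a component between two sets of populations. A minimal Python sketch of that comparison follows; the function name, the population labels, the grouping and the percentages are all hypothetical placeholders rather than K12b values.

```python
from statistics import mean


def mean_by_group(values, group_of):
    """Average a component's percentages within each group.

    values maps population -> percentage;
    group_of maps population -> group label (e.g. a language family).
    """
    grouped = {}
    for pop, value in values.items():
        grouped.setdefault(group_of[pop], []).append(value)
    return {group: mean(vals) for group, vals in grouped.items()}


# Placeholder combined East Asian + Siberian percentages (NOT K12b results).
combined = {"pop_1": 2.0, "pop_2": 4.0, "pop_3": 10.0, "pop_4": 14.0}
language_group = {
    "pop_1": "other",
    "pop_2": "other",
    "pop_3": "Turkic-speaking",
    "pop_4": "Turkic-speaking",
}

means = mean_by_group(combined, language_group)
print(means)
```

A higher mean in one group does not by itself separate linguistic from geographic effects, since language groups are also unevenly spread along the west-east axis; comparing geographically adjacent pairs is the closer analogue of the reasoning above.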

Despite the usefulness of ADMIXTURE in determining the approximate ancestral origins of populations and individuals, it cannot on its own establish the nature of a component X shared between populations A and B; such autosomal results should ideally be complementary to historical, linguistic, archaeological and even deep paternal and maternal evidence (Y-DNA, mtDNA).

Some of the observations made in this entry were gleaned from earlier versions of the population data; through the use of finer-grained autosomal techniques (such as IBD analysis), the exact nature of the component variations should hopefully be resolved in the future.


The raw values used in this investigation are attributed to Dienekes Pontikos, author of the Dodecad Ancestry Project.