Showing posts with label Central Asia. Show all posts

Thursday, August 6, 2015

Steppe Ancestry Estimations in West, Central & South Asia (Ancestral Proportions Method) [Original Work]

Disclaimer
This is largely a re-post, albeit with additional explanations, from a recent ADMIXTURE autosomal run (Eurasia K20) performed at Anthrogenica by the user Kurd. Full technical information and the original files may be found in his original thread. Full acknowledgement is provided to him for the great work. Unless stated otherwise, assume the contents refer to the Eurasia K20 run. This entry may be updated at any time to include further investigations based on future runs. Finally, this entry assumes the mainstream Pontic-Caspian theory for the genesis of the Indo-Europeans to be fully accurate.

Preamble
This entry/repost contains a "quick and dirty" method for a preliminary attempt at deriving Sintashta admixture levels in West, Central and South Asians based on their Eurasia K20 scores. Given the different admixture histories elsewhere in Eurasia, this probably won't be very informative for users with ancestral backgrounds outside the lands between Kurdistan and the Indo-Gangetic plains. This is especially the case with modern Europeans, who share the same core components with Sintashta, while also deriving their own Indo-European ancestries from different archaeological cultures and time periods.

Establishing the Context
According to this Eurasia K20 run, Sintashta are approximately 62% Yamnaya, 22% EEF, 10% European and 3% SHG_WHG. Sintashta, at present, appear to be the best proxy for the Indo-Iranians that arrived in West and South Asia. The above four components define the majority (94%) of Sintashta's autosomal profile here.

As discussed elsewhere in Anthrogenica (kudos to user Sein for pointing this out), Sintashta should be considered better surrogates for the Andronovo-related waves which reached West, Central and South Asia than the actual Andronovo samples derived from Allentoft et al. 2015. This is due to the Andronovo samples being derived from the extreme northeast of the archaeological horizon (above the Altai, near Afanasievo). Their position opens up the possibility for extraneous admixture from other steppe groups (including early speakers of Tocharian through Afanasievo?).

The user Kurd has previously demonstrated that recent steppe-related admixture may be segregated from other components. While undertaking this exercise, it also looks like Kurd has done an excellent job addressing the "teal" component that defined up to half of Samara Yamnaya and a big chunk of Sintashta. Kurd's K20 is, in my view, the most effective attempt thus far at separating the complicated autosomal overlapping in West and Central Eurasia.

Introducing the Ancestral Proportions Method (APM)
At present, the genetic landscape in West, Central and South Asia presents as a triple conundrum:

  1. There is, to date (and with the exception of the poor quality Barcin Neolithic Turkish sample), absolutely no interpretable ancient DNA (aDNA) from any of these regions at any point in this broad area's history. This is perhaps the greatest obstacle at present.
  2. Autosomal and uniparental marker data from across the region are either inconsistent in sample strategy, or outdated, preventing a knowledge-based approach towards interpreting results.
  3. Archaeological evidence is inconsistent across the region; some cultures are well-studied, whereas others have fallen to mirthful speculation or cannot be readily assigned to any particular prehistoric group.

The APM is, in principle, unconcerned with these issues. Instead, it relies on objective data from a single ancient population to discern the numerical degree of overlap with modern populations.

Whether or not Iranians, Punjabis or Nepalis derive the bulk of their ancestry from unrelated group X or Y is beside the point. The sole purpose of the APM is, therefore, to establish whether or not ancient population Z has left any genetic imprint on modern populations A-K, and if so, to what extent. As such, the methodology described here is completely different because it is asymmetrical: it models one-way gene flow across space and time from one ancestral (extinct) population to numerous extant populations. APM or derivative approaches should be considered as supplementary to, rather than directly competing with, symmetrical modelling techniques such as f3 statistics.

The APM was specifically designed to answer one question: to what degree did Sintashta-related populations contribute to the modern groups of West, Central and South Asia? This simple inquiry has a tendency to attract considerable debate and wildly differing estimates in online discussion boards. Today, using the APM and recently generated data from the Eurasia K20 run, I hope to provide one set of estimations completely independent of extraneous modelling factors.

This approach is entirely reliant on high component specificity (e.g. minimal overlap or bleed-over from one component to another). This particular parameter is not within my control in this instance. As such, the outputs from APM here should be considered cautionary preliminary estimates at best, given the potential for ADMIXTURE-related shortcomings in the absence of relevant aDNA. I anticipate this approach will be much more effective at gleaning admixture extents once aDNA from West and Central Asia dating <2200 B.C. is retrieved.

The APM Approach
To contrast against the ADMIXTURE Sintashta scores, two different approaches are utilised together:

1) Direct Overlap (DO): in summary, for each component, the maximum overlap between a given population average and Sintashta's score is calculated (the share of that component common to both). This is done individually across all four components (Yamnaya, EEF, European, SHG_WHG) and the outputs are added. See image below for schematic (conceptual breakdown of how the DO principle works between hypothetical samples 1 and 2, with Components a-d representing the distinct components).

Schematic diagram showing the principle behind Direct Overlap calculation
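The Direct Overlap step can be sketched in a few lines of Python. This is only my reading of the description above, assuming "overlap" means the smaller of the two scores for each component; the Sintashta values are the rounded figures quoted earlier, and the example population is invented purely for illustration.

```python
# Rounded Sintashta scores quoted in "Establishing the Context" (percent).
SINTASHTA = {"Yamnaya": 62.0, "EEF": 22.0, "European": 10.0, "SHG_WHG": 3.0}


def direct_overlap(population, reference=SINTASHTA):
    """Sum, over the reference components, the share present in both profiles.

    Assumption: the overlap for one component is the smaller of the two scores.
    """
    return sum(min(population.get(name, 0.0), score)
               for name, score in reference.items())


# Hypothetical population average (not taken from the Eurasia K20 run).
example = {"Yamnaya": 15.0, "EEF": 8.0, "European": 12.0, "SHG_WHG": 0.5}

# European is capped at Sintashta's 10, so the sum is 15 + 8 + 10 + 0.5.
print(direct_overlap(example))  # 33.5
```

Note how the European score above Sintashta's own is capped rather than counted in full, which is where the inflation for more West Eurasian populations comes from.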


2) Component Proportions (CP): A single dominating component (frequency > 50%) is considered modal for the ancestral population of choice, with the other values considered as a fraction of this in modern populations. Given the Yamnaya component makes up almost two-thirds of Sintashta, the ratio between a population's and Sintashta's Yamnaya score is calculated and re-applied to the rest.
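A minimal sketch of the Component Proportions step follows. It assumes that "re-applied to the rest" means every Sintashta component is scaled by the same Yamnaya ratio and the scaled values are summed, without capping them at the population's own scores; that reading, the function name and the example population are my assumptions, not something confirmed by the original spreadsheet.

```python
# Rounded Sintashta scores quoted in "Establishing the Context" (percent).
SINTASHTA = {"Yamnaya": 62.0, "EEF": 22.0, "European": 10.0, "SHG_WHG": 3.0}


def component_proportions(population, reference=SINTASHTA, modal="Yamnaya"):
    """Scale the whole reference profile by the ratio of the modal component.

    Assumption: the other components are not capped at the population's own
    scores, so a population lacking EEF is still credited with scaled EEF.
    """
    ratio = population.get(modal, 0.0) / reference[modal]
    return ratio * sum(reference.values())


# Hypothetical population average (not taken from the Eurasia K20 run).
example = {"Yamnaya": 15.5, "EEF": 1.0, "European": 2.0, "SHG_WHG": 0.0}

# ratio = 15.5 / 62 = 0.25, applied to the 97 points of the reference profile.
print(component_proportions(example))  # 24.25
```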

There are, however, problems with either approach:

1) DO is overinflated the more West Eurasian a population is. For example, several of the Iranian or Kurdish users at Anthrogenica had component scores greater than what is found in Sintashta (e.g. European being 12% in one sample, when it's 10% in Sintashta). This biases the results for Iranians and Kurds greatly, even when absolute value adjustments are set in place, as they are in the formula (it is highly improbable that an Iranian with 10% European derived all of it from Sintashta).

2) CP is more accurate given the Yamnaya component appears highly steppe-oriented in Eurasia K20 and can therefore serve as a direct admixture marker. However, some of the South Asians are scoring very low, or almost none of, the other key components found in Sintashta (e.g. EEF). Due to this, CP doesn't fully account for the "missing variation" in South Asians, biasing the results slightly in their favour.

One convenient workaround is to take an average of both scores. However, given CP is intuitively more accurate due to the reasonable specificity of the Yamnaya component, a weighted average biased in favour of CP by a ratio of 3:1 was undertaken. The choice of ratio in this variant of the APM is arbitrary. Other variants (2:1, 4:1) would not result in radically different outcomes.
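To illustrate the weighting, here is a small sketch with the CP:DO ratio as a parameter. The function name and the input scores are hypothetical and chosen only to show how little the outcome moves between the ratio variants mentioned above.

```python
def weighted_apm(do_score, cp_score, cp_weight=3.0, do_weight=1.0):
    """Weighted mean of the DO and CP scores, biased towards CP."""
    return (cp_weight * cp_score + do_weight * do_score) / (cp_weight + do_weight)


# Hypothetical DO and CP scores for one population (percent).
do_score, cp_score = 20.0, 16.0

for cp_weight in (2.0, 3.0, 4.0):
    estimate = weighted_apm(do_score, cp_score, cp_weight=cp_weight)
    print(f"{cp_weight:.0f}:1 -> {estimate:.2f}")
# 2:1 -> 17.33
# 3:1 -> 17.00
# 4:1 -> 16.80
```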

Results
Full results from up to 24 populations are shown in the Data Sink (interactive chart below). In summary, Pamiri Tajiks are the most Sintashta-derived at 31.9%. North Caucasian (Ossetian) and Central Asian ethnic groups (Pashtuns, Uzbek, other Tajiks, Turkmen) follow at 19-22%. Various other ethnic groups across West, Central and South Asia follow. The lowest scoring population sampled here is the Makrani at 9.2%.

Internal Validation
The output (Data Sink) readily demonstrates strong correlation between DO and CP scores per population (e.g. Tajik Pamiris at 34.20% & 30.8%, Nepalis at 15% & 14.5% and Makrani at 10.2% & 8.7% respectively account for the top, middle and bottom pairs). The only marked deviation between the DO and CP scores were noted in West Asian populations (Armenians, Kurds, Iranians), as mentioned previously. Thus, empirical confirmation of correlation (e.g. Spearman's rank order) is unwarranted here.

Another means of confirming the validity of APM is to confirm Andronovo is a descendant of Sintashta. As the Andronovo archaeological horizon originates from Sintashta directly, one would expect very high (>90%) Sintashta-derived ancestry among them.

This appears to be the case. Compared against Sintashta, Andronovo exhibits DO = 83.9%, CP = 97.3%, an average of 90.6% and a weighted average of 92.8%.

Summarised, these two results (dataset-wide correlation, ancestral-immediate successor high overlap) validate the outcomes of the APM.

Closing Thoughts
The results featured in this entry are in line with broad uniparental marker data and previously published IBD results (unfortunately removed from sources), and are largely (though not fully) compatible with the degree of archaeological input from Andronovo-derived cultures in Asia. As stated previously, due to earlier shortcomings, they should not be considered definitive.

Given the CP here is not exclusively associated with Sintashta, I anticipate this technique will be more accurate if future "steppe"/"Yamnaya"/"Yamanya_related" components are shown to define more of the Sintashta samples. I look forward to extending this method in the near future.

Acknowledgements
Special thanks to the user Kurd from Anthrogenica for making this data available and obliging member inquiries with productive responses, as well as the user Sapporo for generating several of the population averages.

Saturday, July 13, 2013

A Hidden Gem in Central Asia: Previously Unknown Y-DNA R1b Haplotype [Original Work]

1. Introduction

Central Asian Y-DNA diversity has been an area of constant intrigue in the genetics community. Wells et al.'s The Eurasian Heartland: A continental perspective on Y-chromosome diversity paved the way, with several others following in its wake. Members of the same team (including Dr. Wells) produced another paper - A Genetic Landscape Reshaped by Recent Events: Y-Chromosomal Insights into Central Asia - on the same topic in the following year, this time headed by Dr. Tatiana Zerjal. I noted a greater emphasis on East-Central Asian populations as well as a mention of Y-STR analysis in the study itself. However, none of this data was supplied, with only Y-SNP information included (shown sporadically in this entry). The age of this paper is apparent through the nomenclature used (see Method section).

Several months ago, I made a request to obtain the Y-STR data from this study to one of the co-authors, Dr. Tyler-Smith, who kindly replied with the results of all sampled populations (Data Sink > Zerjal et al. Raw Data).

In this blog entry, the Y-STR data is showcased with a special emphasis on the Y-DNA R1b-M269 which was discovered.


2. Method
Y-SNP Phylogeny in original paper (Zerjal et al.) [1]


The maximum number of compatible Y-STR's were utilised for processing in Urasin's YPredictor for easier haplogroup identification (14 of a possible 16, DYS434 and 435 were excluded). All data was run through YPredictor. Only samples with ≥70% probability were included in the final results (Data Sink > Processed Data). As discussed below, relevant findings are compared with the basic Y-SNP haplogroups shown in the original study (on right).

One point which needs to be addressed immediately is the high frequency of "_DE-M1" and "P-M45". It appears that the STR selection has led to a phantom result, rendering many of the samples useless. For instance, the original study shows the Kazakhs belong overwhelmingly to C3c-M48, [1] although the probable results shown here are mostly "_DE-M1". The exclusion of DYS434 and 435 from my level of processing likely contributed to this; if one assigns equal weight to each STR in the statistical strength of a prediction, removing two STR's from a panel numbering 16 reduces accuracy by 12.5%. Additionally, some conversion error seems to have applied with DYS437 (i.e. a value <12 is unusual). Therefore, "_DE-M1" and "P-M45" results were dismissed on account of the mismatch between predicted and likely confirmed haplogroups, probably due to a compatibility issue between the study's STR panel and YPredictor.
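The filtering described in this section can be sketched as below. The record layout, field names and sample rows are hypothetical (YPredictor's actual output format is not reproduced here); only the 70% threshold, the two dismissed labels and the 2-of-16 arithmetic come from the text.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    sample_id: str       # hypothetical identifiers, not real sample codes
    haplogroup: str
    probability: float   # predictor confidence, 0.0 to 1.0


# Labels dismissed in the Method section as phantom results.
DISMISSED = {"_DE-M1", "P-M45"}


def keep(prediction, threshold=0.70):
    """Retain predictions at or above the threshold that are not dismissed."""
    return (prediction.probability >= threshold
            and prediction.haplogroup not in DISMISSED)


predictions = [
    Prediction("s1", "R1b-M269", 0.93),
    Prediction("s2", "_DE-M1", 0.88),    # dismissed label
    Prediction("s3", "J2a-M410", 0.65),  # below the 70% cut-off
]

retained = [p.sample_id for p in predictions if keep(p)]
print(retained)  # ['s1']

# Share of the 16-marker panel lost by dropping DYS434 and DYS435.
excluded_fraction = 2 / 16
print(f"{excluded_fraction:.1%}")  # 12.5%
```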


3. Results

As the majority of samples were removed owing to the caveat described above, this entry will take a qualitative rather than quantitative approach to analysis on the general picture formed. Much of the remaining results are congruent with findings in other papers. Populations around the Caucasus are signified by plenty of R1b-M269, J2a-M410 and G2a-P15. Tajiks and the Kyrgyz were predominantly R1a1a-M17. Mongolians and other East-Central Asian ethnic groups yielded the most O3-M122 and "NO-M14" (likely to be Y-DNA N or O suffering from the STR restrictions described in the Method section).


Y-SNP distribution in Central Asia (Zerjal et al.) [1]


3i. The R1b Signal 

R1b-M269 was found across Central Asia and not only in the Caucasus (Armenians, Azeris, Georgians, Ossetians). It was mostly detected among the Turkmen (trk1, trk2, trk4, trk6, trk7, trk22, T29, T32) with a single sample among the Uzbek (uz-s110). [1]

Analysis of the haplotypes (including DYS434 and DYS435) revealed the nine Central Asian R1b samples belonged to a secure haplotype (Data Sink > R1b Results). trk6 diverged the most, albeit with only two 1-step mutations, on DYS393 and DYS434. The rest match this haplotype exactly or have single 1-step mutations. [1] When this Central Asian R1b haplotype is compared with the other Caucasian samples, a mixed picture emerges, with the poorest being an Armenian (arm47) at 8/16, whereas the best are another Armenian (arm12) and Azeri (az48), both at 15/16. [1]

One interesting point is that the Kurds sampled in this study (some of whom also belong to R1b-M269) are actually the displaced population positioned on the Iranian-Turkmenistani border. All of them match the Central Asian R1b haplotype with a similar value (12-13/16). This definitively rules out the Kurds as a source for the haplotype, particularly as better matches can be found further to the west. It should be noted the Kurds themselves formed their own R1b haplotype (defined here by DYS389II=27, DYS391=10). [1]
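The "x/16" match scores quoted above can be reproduced with a simple marker-by-marker comparison. The sketch below is illustrative only: the function name is my own and the repeat counts are invented rather than taken from the Zerjal et al. data.

```python
def matching_markers(haplotype_a, haplotype_b):
    """Return (identical markers, markers compared) for two STR haplotypes."""
    shared = haplotype_a.keys() & haplotype_b.keys()
    identical = sum(1 for marker in shared
                    if haplotype_a[marker] == haplotype_b[marker])
    return identical, len(shared)


# Invented repeat counts on four markers, for illustration only.
modal = {"DYS393": 12, "DYS391": 11, "DYS389II": 26, "DYS434": 9}
sample = {"DYS393": 13, "DYS391": 11, "DYS389II": 26, "DYS434": 9}

identical, compared = matching_markers(modal, sample)
print(f"{identical}/{compared}")  # 3/4
```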

In summary, the data reveals that the Turkmen are particularly abundant in R1b-M269 and all belong to the same haplotype as one of the Uzbek samples. This haplotype matched some Caucasians very well, but others not so well. The Kurds living in Turkmenistan belonged to their own haplotype.


3ii. Is This Actually R1b-M269?

Attention must first be shown to the original paper again; any potential R1b-M269 here will be present as P(xR1a)-92R7 (shown in the paper as "Haplogroup 1"). [1] Evidently, this makes up approximately half of the Turkmen lines and a quarter of Uzbek ones. Other haplogroups (such as other forms of R1b, R2a-M124, various Q subclades) presumably make up the rest of "Haplogroup 1" shown.

The next step is to verify whether or not this Central Asian R1b haplotype matches other R1b haplotypes online. As Y-DNA R1b-M269 is fortunately well-represented in the world of genetic genealogy, searching for the haplotype's matches on ySearch is a reasonable enterprise. DYS437 had to be excluded here due to a conversion issue, leaving the haplotype at 15 STR's. A genetic distance (GD) of 3 was allowed on these 15 markers. Results are shown on the right.

ySearch results for Central Asian R1b haplotype
With some confidence, the search has demonstrated that the Central Asian R1b haplotype does indeed belong to R1b-M269, as all the seven matches shown (one of whom is Armenian) belong to it.
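For readers unfamiliar with genetic distance, a common convention is to sum the absolute differences in repeat counts across the compared markers (a stepwise count). The sketch below assumes that convention; how ySearch itself computed GD is not described here, and the marker values are invented for illustration.

```python
def genetic_distance(haplotype_a, haplotype_b, exclude=()):
    """Stepwise genetic distance: total repeat-count difference over shared markers.

    Assumption: each one-repeat difference counts as one step.
    """
    markers = (haplotype_a.keys() & haplotype_b.keys()) - set(exclude)
    return sum(abs(haplotype_a[m] - haplotype_b[m]) for m in markers)


# Invented repeat counts, for illustration only.
modal = {"DYS393": 12, "DYS391": 11, "DYS434": 9, "DYS437": 8}
candidate = {"DYS393": 13, "DYS391": 11, "DYS434": 10, "DYS437": 14}

# DYS437 is left out, mirroring its exclusion in the search described above.
gd = genetic_distance(modal, candidate, exclude={"DYS437"})
print(gd, gd <= 3)  # 2 True
```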

Expanding the line of inquiry one further step came through comparing this haplotype with Iranian haplotypes [2] which were readily available. Due to differences in STR panels (an overlap of only 11) this proved to be inconclusive, aside from the observation that DYS389i+ii was completely different between the Central Asian modal (10-26) and the Iranian values. At this point I suspect that, much like DYS437, there is a conversion issue with DYS389 also.

Finally, a comparison was made with the R1b found in Afghanistan last year [3]. Interestingly, if DYS389i+ii and DYS437 are excluded, the two Uzbeks (samples 35 and 181) match the Central Asian R1b haplotype almost exactly based on the remaining 11 STR's. The one Tajik (sample 32) is less likely to be related due to two 1-step mutations on different STR's.


4. Conclusion

The inferences made from the data hang by a metaphorical thread due to the persistent STR issue; different labs have used different panels in the past decade, making it excruciatingly difficult to use materials from older papers. Fortunately, the presence of a specific strain of R1b-M269 in Central Asia (in Turkmen and Uzbeks) has successfully been demonstrated after select exclusions and no modifications to the data.

However, some larger questions remain. If STR limitations were not an issue, how would the Iranians from Haber et al. have compared? Would the Tajik from the other Haber et al. paper have belonged to the same haplotype in the end?

The origin of this Central Asian R1b haplotype will, I anticipate, also be a point discussed heavily among interested parties. At this point in time, I must stress that none of the evidence thus far points to anything in particular without ruling other theories out, although it leaves the door for interpretation wide open.

Having given this cautionary statement, the main thrust of this entry should be emphasised: R1b-M269 in Central Asia is a confirmed reality and here to stay. I will defer any subsequent analyses to the experts on Y-DNA R1b who grace several genetic genealogy boards for their take on the flavour of this haplotype.


5. Acknowledgement

I publicly extend my gratitude to Dr. Tyler-Smith for being so kind in sending me the raw STR's from this important paper for my research, as well as co-authoring the other two excellent studies I have cited here and in the past.


6. References

1. Zerjal T, Wells RS, Yuldasheva N, Ruzibakiev R, Tyler-Smith C. A genetic landscape reshaped by recent events: Y-chromosomal insights into central Asia. Am J Hum Genet. 2002 Sep;71(3):466-82. Epub 2002 Jul 17.

2. Haber M, Platt DE, Badro DA, Xue Y, El-Sibai M, Bonab MA, et al. Influences of history, geography, and religion on genetic structure: the Maronites in Lebanon. Eur J Hum Genet. 2011 Mar;19(3):334-40. doi: 10.1038/ejhg.2010.177. Epub 2010 Dec 1.

3. Haber M, Platt DE, Ashrafian Bonab M, Youhanna SC, Soria-Hernanz DF, Martínez-Cruz B, et al. Afghanistan's ethnic groups share a Y-chromosomal heritage structured by historical events. PLoS One. 2012;7(3):e34288. doi: 10.1371/journal.pone.0034288. Epub 2012 Mar 28.

Saturday, March 31, 2012

North European Component Variation within the Eurasian Heartland [Original Work]

As studies of DNA variation across Asia have progressed over the years (Wells et al., Xing et al., teaser mtDNA results from Burger et al.'s upcoming analysis of prehistoric Eurasian steppe remains), the theme of ancestral markers with origins in Europe has remained a frequent one, particularly with regard to the expansion of Bronze Age semi-pastoral nomads from the Pontic-Caspian steppe bearing the Indo-European languages.

David W. of the Eurogenes Genetic Ancestry Project has recently posted data online from a new Intra-European run using ADMIXTURE (K=12) with the intention of breaking up the North European component that often arises through the program. Spreadsheet results here.

This brief investigation seeks to identify the North European-derived component patterns within Asia by first mapping out the frequencies and then correlating with Eurogenes' release notes on each.

Method
As many samples as possible from immediately-identifiable populations were obtained from the spreadsheet results (link above). No sample restrictions were implemented. Averages of each population were calculated, except where n=1. No modifications were made to population labels except for Eurogenes population averages, denoted by the addition of a _Eg suffix. Populations were then allocated into arbitrary regional groups, allowing results to be displayed more coherently.
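The averaging step can be sketched with the Python standard library. The row layout, population names and scores below are invented for illustration and do not come from the Eurogenes spreadsheet.

```python
from collections import defaultdict
from statistics import mean


def population_averages(rows):
    """Average component scores per population; n=1 populations pass through."""
    grouped = defaultdict(list)
    for population, scores in rows:
        grouped[population].append(scores)

    averages = {}
    for population, members in grouped.items():
        if len(members) == 1:
            averages[population] = dict(members[0])
        else:
            components = members[0].keys()
            averages[population] = {
                c: mean(m[c] for m in members) for c in components
            }
    return averages


# Hypothetical individual results (percent), two components only.
rows = [
    ("PopA", {"North Sea": 4.0, "South Baltic": 5.0}),
    ("PopA", {"North Sea": 6.0, "South Baltic": 3.0}),
    ("PopB", {"North Sea": 1.5, "South Baltic": 0.5}),  # n=1, kept as-is
]

averages = population_averages(rows)
print(averages["PopA"])  # {'North Sea': 5.0, 'South Baltic': 4.0}
```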

Results
Tabulated results can be found in the Data Sink. Autosomal variation per Regional Group can be found below:











The North European-derived components, despite their exceptionally close Fst distances relative to the other components, do seem to reveal a few interesting trends:

  • Northeast European appears to (at least partially) be the result of allele sharing with populations further east, as evidenced by its predominance in East-Central Asian groups, as well as extending even further eastwards into the Siberian Selkup (n=1). This component has a circumstantial correlation with the craniometric and ancient mtDNA evidence suggestive of a "migration corridor" between Eastern Europe and Siberia (Malyarchuk et al.'s On the Origin of Mongoloid Component in the Mitochondrial Gene Pool of Slavs, Newton's Ancient Mitochondrial DNA From Pre-historic Southeastern Europe: The Presence of East Eurasian Haplogroups Provides Evidence of Interactions with South Siberians Across the Central Asian Steppe Belt). While it also explains this component's abundance in North Caucasian populations (which lie en route between Ukraine and Siberia), the same cannot be said with absolute certainty of South-Central Asia. With that being said, the 0.021 Fst distance with West European despite the markedly different distributions suggests both are the result of prehistoric (possibly paleolithic?) hunter-gatherer migration paths across large swathes of Eurasia.
  • West European has a sporadic appearance across Asia, with a peak in the North Caucasus. This implies that, staying true to its assigned label, it is a generic West Eurasian component that has reached a maximum in Western Europe, with the North Caucasus representing the closest point of reference to there. Indeed, this inference is made independently by Eurogenes, albeit using different parameters:
"I used samples of Scottish, Irish and Western English ancestry to create this cluster. Not surprisingly, it peaks in individuals of Western Irish descent. However, it also peaks in Basques and many Iberians, which is fascinating, because that makes it the autosomal equivalent of Y-chromosome haplgroup R1b in Europe."
  • North Sea and South Baltic accompany one another at similar frequencies across much of Asia, especially in populations with an Indo-Iranian-speaking heritage (observe the ~0.8-1:1 ratio among Kurds, Iranians, the Turkmen, Uzbeks, Tajiks, Brahmins, Kshatriyas and Kyrgyz as examples of this). It is interesting to note that, of the two, only the North Sea component is readily present in East-Central Asians. The only other likely migration path along this trajectory is that of the proto-Tocharians, who (under the Eurasian steppe theory) split off from the Proto-Indo-European homeland several millennia prior to the Proto-Indo-Iranians that eventually formed the Andronovo archaeological horizon from Sintashta/Pit Grave (E Kuz'mina, The Origin of the Indo-Iranians, pg.451). Perhaps this near-solitary North Sea component within the Altaians, Mongolians and Uyghurs is attributable to early speakers of Tocharian? Perhaps the elevated presence of the North Sea component in South-Central Asia (Jatts, Pathans, Kyrgyz) is a relic of the Kushans, nomads supposedly a part of the Yuezhi confederacy, who may have been Tocharian speakers themselves?
  • One curious phenomenon is the similar West European-North Sea-Northeast European component proportions across the Turkmen, Uzbeks, Kyrgyz, Pathans, Uttar Pradesh Brahmins, Altaians and the Uyghur. Whether this can be substantiated in any way, or whether it is simply an anomalous association arising from non-uniform and varying sample sizes, remains unclear, which prevents a firm conclusion from being made.
  • North European-derived frequencies among Southwest Asian Semitic-speaking groups shown here seldom exceed 1% apiece and are either the result of recent, inconsistent small-scale admixture events or are simply background noise generated by ADMIXTURE.
Summary
The Northeast European and West European components appear to have a distribution independent of any significant migration events since the Neolithic, instead being associated with either the "migration corridor" across Eurasia or simply being the result of mutual West Eurasian heritage. North Sea and South Baltic, on the other hand, do seem to correlate with one another and support (rather than contradict) the eastward movement of Bronze Age semi-pastoral nomads speaking early dialects of Proto-Indo-European.

Edit I [31/03/2012]: Correction of erroneous Brahmin results due to Google Spreadsheet lag.

Thursday, February 9, 2012

Autosomal variation from Anatolia to the Tarim periphery [Original Work]

The nature of ADMIXTURE as a tool for inferring ancestral components makes it difficult to discern the nature of a shared Autosomal component between several populations. For instance, a given component may originate in one population and be donated to others (e.g. purported African admixture in the Arabian Peninsula), stem from a mutual population (e.g. West Eurasian-specific components in low K=n runs between the Druze and the French Basque) or be the result of genetic drift (e.g. potentially, the peaking of East Asian-specific components in Korea and Japan).

Nevertheless, using results from the latest Dodecad Ancestry Project K12b run (link), I have investigated the component variation across a horizontal axis from Anatolia to the Tarim periphery in West China, with the intention of establishing the nature of the observed components across this area of interest. Raw values can be viewed on the newly-published Vaêdhya Data Sink. Populations are listed in a geographical cline.


One of the most immediate observations is the similarity between Kurdish and Iranian populations, with both expressing similar admixture percentages (deviation per component usually not >1%). This suggests that Kurds and Iranians have common origins, with the former largely maintaining those ancestral signals despite moving further westwards relative to their linguistic cousins in Iran.
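The "deviation per component usually not >1%" observation amounts to comparing two admixture profiles component by component. A minimal sketch follows; the function name and the scores are invented for illustration and are not the K12b values from the Data Sink.

```python
def max_component_deviation(profile_a, profile_b):
    """Largest absolute difference between two profiles on any one component."""
    components = profile_a.keys() | profile_b.keys()
    return max(abs(profile_a.get(c, 0.0) - profile_b.get(c, 0.0))
               for c in components)


# Invented admixture percentages for two populations, three components only.
pop_one = {"Gedrosia": 28.0, "Caucasus": 35.5, "North European": 6.25}
pop_two = {"Gedrosia": 27.5, "Caucasus": 36.5, "North European": 6.0}

print(max_component_deviation(pop_one, pop_two))  # 1.0
```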

Near-congruency between the Assyrians and Armenians is also striking, bar the variations on the North European, Caucasus and Southwest Asian components. It is again tempting to postulate the two descend for the most part from a similar root population with the aforementioned component differences accounting for the linguistic differences.

If one allocates the Kurds alongside the Iranians, several of the Autosomal components shown here have a distribution that appears to be determined by geography alone:


  • South Asian peaks in Tajiks, who are situated approximately due NNW of the Indian Subcontinent.
  • Caucasus reaches a maximum in Armenians and adjacent populations.
  • Atlantic Med steadily decreases as one moves further away from Europe.
  • Southeast Asian has an inverse relationship to the above, peaking significantly only in the Uyghurs.

Other components appear to have more complicated distributions;

  • Interestingly, East Asian and Siberian are not too dissimilar in the populations containing them. The elevation of both in populations which speak Turkic/Altaic languages relative to neighbours speaking other languages confirms genetic input from the Turkic steppe nomads who expanded from the eastern side of Central Asia, eventually reaching the Iranian plateau and Anatolia. However, it is possible some of the Siberian and East Asian values may simply be the result of prehistoric demic diffusion across Eurasia (demonstrated by a potential gradient between Kurds/Iranians <-> Tajiks), although this may in itself be of medieval steppe ancestry.
  • Southwest Asian peaks in Assyrians, the only Semitic-speaking population shown in this analysis. This component falls rapidly beyond the Iranian plateau but is found at a background frequency east of Turkmenistan. Whether this is again an artefact of prehistoric demic movements or more recent migrations (e.g. Silk Road, various Persian empires) is debatable. As with the Siberian and East Asian components, there is an elevation which defies a geographical pattern and confirms historical accounts; the Tajiks, who descend in part from Persian speakers escaping Iran after the Sassanid collapse, show an elevation relative to the Uzbeks and Uyghurs. The greater frequency in Christian Armenians relative to the predominantly Muslim Kurdish territories and Iran argues strongly against the notion that it was introduced by the Islamic expansion out of the Arabian Peninsula.
  • The Gedrosia component has a bifurcated peak between Iranians and Tajiks, implying an ultimate peak in the region of Pakistan (corroborated by other Dodecad population results, such as the Balochis of Pakistan). However, the Gedrosian frequency drops from a stable 28% across West Iranic-speaking populations to 13-18% in Anatolian Turks, Armenians and Assyrians. It is again impossible to infer whether this is of prehistoric origins (i.e. mutual Neolithic phenomena between the Iranian plateau and South-Central Asia) or more recent (inflated Gedrosian values a function of Median, Persian and Parthian ancestry).
  • The North European component has what appears to be a dual geographic and linguistically-oriented distribution, which may be confounded further by recent interactions between Europe and some of the populations shown here (Anatolian Turks may potentially be the greatest example of this). It is interesting to note the Assyrians and Armenians show an inverse in the North European and Southwest Asian components despite otherwise appearing identical. The elevated frequency of this component in Central Asia will hopefully be covered in a future entry.

Despite the usefulness of ADMIXTURE in determining approximate ancestral origins of populations and individuals, it is impossible to ascertain the nature of component X between populations A and B; such Autosomal results should ideally be complementary to historical, linguistic, archaeological and even deep paternal and maternal evidence (Y-DNA, mtDNA).

Some of the observations made in this entry have been gleaned with earlier renditions of population data; through the use of deeper penetrating Autosomal techniques (such as IBD), the exact nature of the component variations should hopefully be resolved in the future.


Reference

The raw values used in this investigation are attributed to Dienekes Pontikos, author of the Dodecad Ancestry Project.