Wednesday, September 23, 2015

Pain: OPRM1 & The Ancestral Contribution [Review]

A paper by Soto & Catanesi assessing genetic variation in the μ (mu) opioid receptor gene (OPRM1) was published in May 2015. OPRM1 contributes to the structure of the μ opioid receptor (MOR), one of three major opioid receptor types which broadly contribute to pain sensation, addiction, ion influx into cells and a host of other functions [1]. 

WHO Pain Ladder (courtesy of
Opioid receptors are of interest to medical researchers due to the varying specificity of receptor agonism (activation) by conventional treatments (e.g. tramadol has a higher affinity to MORs than to the other opioid receptors [2]). Additionally, opioid receptor agonists are widely used in pain management (as directed by the World Health Organisation's classic "pain management ladder", figure opposite [3]). As such, understanding how opioid receptor structure varies between individuals could theoretically help fine-tune the choice of pharmaceutical agents in specific situations, such as palliative or acute care, as well as narcotic or surgical rehabilitation (in effect, stratified or personalised medicine).

Although the paper indicates contradiction in our current data regarding the most studied SNP to date (rs1799971, linked to the A118G polymorphism), others residing within or near OPRM1 are postulated to have an effect on MOR function.

OPRM1 is of interest from a population genetics perspective because numerous older studies (listed within the paper) show differing A118G polymorphism frequencies across various world populations. The authors extended these findings by combining the HapMap world database with their own Argentinian samples to determine whether OPRM1 SNP variants correlate with ancestral background. Establishing the polymorphic frequency among Argentinians appears to be a secondary aim here.

The authors of this paper concluded that Sub-Saharan African, West and East Eurasian ancestral status coincides with OPRM1 polymorphism status in the several SNPs examined (through the use of Fst and AMOVA). However, they noted that no such clustering was observed between West Eurasians (Europeans) and American populations (mixed Argentinian and Mexican samples). This was taken to indicate extensive gene flow from European colonists had made a massive contribution to the polymorphic status at this gene. 

Another possibility not highlighted in the paper is that the native Amerindian population of pre-colonial America had an OPRM1 polymorphic status similar to that of modern West Eurasians. In light of the recent findings supporting sizeable mutual prehistoric ancestry between these two populations through a conceptual "Ancestral North Eurasian" (ANE) component (Raghavan et al., see below) [4], the OPRM1 congruity between Amerind-European mixed modern Americans and Europeans could partially be attributed to the proposed ANE-containing migratory events.

Estimated Shared Drift Heat Map with MA1 (Raghavan et al.)

Overall, Soto & Catanesi provide us with a good summary of the population structure that can be directly observed through OPRM1 gene polymorphism variation and support earlier work indicating a correlation with ancestry. Their discussion would have been better served by some exploration of recent developments in archaeogenetics. Their assertion of complete OPRM1 SNP status replacement among Amerind-European admixed American individuals is of course possible, but no evidence is provided that categorically dismisses commonality between Europeans and Amerindians in this genomic region as at least partially responsible for the observation.

Human population genetic structure detected by pain-related mu opioid receptor gene polymorphisms.
López Soto EJ, Catanesi CI. Genet Mol Biol. 2015 May;38(2):152-5. doi: 10.1590/S1415-4757382220140299. Epub 2015 May 1.
"Several single nucleotide polymorphisms (SNPs) in the Mu Opioid Receptor gene (OPRM1) have been identified and associated with a wide variety of clinical phenotypes related both to pain sensitivity and analgesic requirements. The A118G and other potentially functional OPRM1 SNPs show significant differences in their allele distributions among populations. However, they have not been properly addressed in a population genetic analysis. Population stratification could lead to erroneous conclusions when they are not taken into account in association studies. The aim of our study was to analyze OPRM1 SNP variability by comparing population samples of the International Hap Map database and to analyze a new population sample from the city of Corrientes, Argentina. The results confirm that OPRM1 SNP variability differs among human populations and displays a clear ancestry genetic structure, with three population clusters: Africa, Asia, and Europe-America."

1. Feng Y, He X, Yang Y, Chao D, Lazarus LH, Xia Y. Current research on opioid receptor function. Curr Drug Targets. 2012 Feb;13(2):230-46.

2. Dayer P, Desmeules J, Collart L. [Pharmacology of tramadol]. Drugs. 1997;53 Suppl 2:18-24. [Article in French]

3. WHO | WHO's cancer pain ladder for adults. [Last Retrieved 22/09/2015]: 

4. Raghavan M, Skoglund P, Graf KE, Metspalu M, Albrechtsen A. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans. Nature. 2014 Jan 2;505(7481):87-91. doi: 10.1038/nature12736. Epub 2013 Nov 20.

Thursday, August 6, 2015

Steppe Ancestry Estimations in West, Central & South Asia (Ancestral Proportions Method) [Original Work]

This is largely a re-post, albeit with additional explanations, from a recent ADMIXTURE autosomal run (Eurasia K20) performed at Anthrogenica by the user Kurd. Full technical information and the original files may be found in his original thread. Full acknowledgement is provided to him for the great work. Unless stated otherwise, assume the contents refer to the Eurasia K20 run. This entry may be updated at any time to include further investigations based on future runs. Finally, this entry assumes the mainstream Pontic-Caspian theory for the genesis of the Indo-Europeans to be fully accurate.

This entry/repost contains a "quick and dirty" method for a preliminary attempt at deriving Sintashta admixture levels in West, Central and South Asians based on their Eurasia K20 scores. Given the different admixture histories elsewhere in Eurasia, this probably won't be very informative for users with ancestral backgrounds outside the lands between Kurdistan and the Indo-Gangetic plains. This is especially the case with modern Europeans, who share the same core components with Sintashta, while also deriving their own Indo-European ancestries from different archaeological cultures and time periods.

Establishing the Context
According to this Eurasia K20 run, Sintashta are approximately 62% Yamnaya, 22% EEF, 10% European and 3% SHG_WHG. Sintashta, at present, appear to be the best proxy for the Indo-Iranians that arrived in West and South Asia. The above four components define the majority (94%) of Sintashta's autosomal profile here.

As discussed elsewhere in Anthrogenica (kudos to user Sein for pointing this out), Sintashta should be considered better surrogates for the Andronovo-related waves which reached West, Central and South Asia than the actual Andronovo samples derived from Allentoft et al. 2015. This is due to the Andronovo samples being derived from the extreme northeast of the archaeological horizon (above the Altai, near Afanasievo). Their position opens up the possibility for extraneous admixture from other steppe groups (including early speakers of Tocharian through Afanasievo?).

The user Kurd has previously demonstrated that recent steppe-related admixture may be segregated from other components. While undertaking this exercise, it also looks like Kurd has done an excellent job addressing the "teal" component that defined up to half of Samara Yamnaya and a big chunk of Sintashta. Kurd's K20 is, in my view, the most effective attempt thus far at separating the complicated autosomal overlapping in West and Central Eurasia.

Introducing the Ancestral Proportions Method (APM)
At present, the genetic landscape in West, Central and South Asia presents as a triple conundrum:

  1. There is, to date (and with the exception of the poor quality Barcin Neolithic Turkish sample), absolutely no interpretable ancient DNA (aDNA) from any of these regions, or indeed from any point in this broad area's history. This is perhaps the greatest obstacle at present.
  2. Autosomal and uniparental marker data from across the region are either inconsistent in sample strategy, or outdated, preventing a knowledge-based approach towards interpreting results.
  3. Archaeological evidence is inconsistent across the region; some cultures are well-studied, whereas others have fallen to mirthful speculation or cannot be readily assigned to any particular prehistoric group.

The APM is, in principle, unconcerned with these issues. Instead, it relies on objective data from a single ancient population to discern the numerical degree of overlap with modern populations.

Whether or not Iranians, Punjabis or Nepalis derive the bulk of their ancestry from unrelated group X or Y is beside the point. The sole purpose of the APM is, therefore, to establish whether or not ancient population Z has left any genetic imprint on modern populations A-K, and if so, to what extent. As such, the methodology described here is completely different in that it is asymmetrical: it assumes one-way gene flow across space and time from one ancestral (extinct) population to numerous extant populations. APM or derivative approaches should be considered as supplementary to, rather than directly competing with, symmetrical modelling techniques such as f3 statistics.

The APM was specifically designed to answer the question: to what degree did Sintashta-related populations contribute to the modern groups of West, Central and South Asia? This simple inquiry has a tendency to attract considerable debate and wildly differing estimates in online discussion boards. Today, using the APM and recently generated data from the Eurasia K20 run, I hope to provide one set of estimations completely independent of extraneous modelling factors.

This approach is entirely reliant on high component specificity (i.e. minimal overlap or bleed-over from one component to another). This particular parameter is not within my control in this instance. As such, the outputs from APM here should be considered cautionary preliminary estimates at best, given the potential for ADMIXTURE-related shortcomings in the absence of relevant aDNA. I anticipate this approach will be much more effective at gleaning admixture extents once aDNA from West and Central Asia dating <2200 B.C. is retrieved.

The APM Approach
To contrast against the ADMIXTURE Sintashta scores, two different approaches are utilised together:

1) Direct Overlap (DO): for each component, the maximum overlap between a given population average and Sintashta's score is calculated. This is done individually across all four components (Yamnaya, EEF, European, SHG_WHG) and the outputs are added together. See image below for schematic (conceptual breakdown of how the DO principle works between hypothetical samples 1 and 2, with Components a-d representing the distinct components).

Schematic diagram showing the principle behind Direct Overlap calculation

2) Component Proportions (CP): A single dominating component (frequency > 50%) is considered modal for the ancestral population of choice, with the other values considered as a fraction of this in modern populations. Given the Yamnaya component makes up almost two-thirds of Sintashta, the ratio between a population's Yamnaya score and Sintashta's is calculated and re-applied to the rest.
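The two calculations described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the exact spreadsheet formula used for the Data Sink: the per-component overlap is assumed to be the smaller of the two scores, CP is assumed to be the Yamnaya ratio expressed as a percentage, and the `example` population is invented purely for demonstration. The Sintashta values are the rounded scores quoted earlier in this entry.

```python
# Sintashta's Eurasia K20 scores as quoted in this entry (percentages, rounded).
SINTASHTA = {"Yamnaya": 62.0, "EEF": 22.0, "European": 10.0, "SHG_WHG": 3.0}


def direct_overlap(population, reference=SINTASHTA):
    """Add up, component by component, the share held by both profiles.

    Assumption: the overlap for one component is the smaller of the two scores.
    """
    return sum(min(population.get(name, 0.0), score)
               for name, score in reference.items())


def component_proportion(population, reference=SINTASHTA, modal="Yamnaya"):
    """Ratio of the modal component, re-applied to the whole reference profile.

    Assumption: the result is a percentage of total ancestry, so a population
    with half of Sintashta's Yamnaya score returns 50.0.
    """
    return 100.0 * population.get(modal, 0.0) / reference[modal]


# Hypothetical population average, invented for illustration only.
example = {"Yamnaya": 15.5, "EEF": 5.0, "European": 12.0, "SHG_WHG": 0.0}

do_score = direct_overlap(example)        # 15.5 + 5.0 + 10.0 + 0.0
cp_score = component_proportion(example)  # 100 * 15.5 / 62.0
print(f"DO = {do_score:.1f}%, CP = {cp_score:.1f}%")
```

Note that the hypothetical European score (12%) is capped at Sintashta's 10% in the DO sum, which is the situation the first caveat below is concerned with.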

There are, however, problems with either approach:

1) DO is overinflated the more West Eurasian a population is. For example, several of the Iranian or Kurdish users at Anthrogenica had component scores greater than those found in Sintashta (e.g. European being 12% in one sample, when it is 10% in Sintashta). This biases the results for Iranians and Kurds greatly, even with the absolute value adjustments built into the formula (it is highly improbable that an Iranian with 10% European derived all of it from Sintashta).

2) CP is more accurate given the Yamnaya component appears highly steppe-oriented in Eurasia K20 and can therefore serve as a direct admixture marker. However, some of the South Asians score very little, or almost none, of the other key components found in Sintashta (e.g. EEF). Due to this, CP doesn't fully account for the "missing variation" in South Asians, biasing the results slightly in their favour.

One convenient workaround is taking an average of both scores. However, given CP is intuitively more accurate due to the reasonable specificity of the Yamnaya component, a weighted average biased in favour of CP by a ratio of 3:1 was undertaken. The ratio choice in this variant of the APM is arbitrary. Other variants (2:1, 4:1) would not result in radically different outcomes.
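The weighting step is simple enough to check by hand or with a short script. The sketch below is an illustration rather than the original calculation; `weighted_apm` is a hypothetical helper name, and the DO/CP pairs are the ones quoted in the Internal Validation section of this entry. The ratio is left as a parameter because, as far as I can tell from those quoted pairs, the final figures reported here (31.9% for Pamiri Tajiks, 9.2% for Makrani, 92.8% for Andronovo) are reproduced by a 2:1 weighting in favour of CP rather than 3:1.

```python
def weighted_apm(do_score, cp_score, cp_weight=3.0, do_weight=1.0):
    """Weighted mean of the DO and CP scores, biased towards CP."""
    total_weight = cp_weight + do_weight
    return (cp_weight * cp_score + do_weight * do_score) / total_weight


# (DO, CP) pairs in percent, as quoted in this entry.
PAIRS = {
    "Tajik Pamiri": (34.2, 30.8),
    "Nepali": (15.0, 14.5),
    "Makrani": (10.2, 8.7),
    "Andronovo": (83.9, 97.3),
}

for name, (do_score, cp_score) in PAIRS.items():
    # Compare the 2:1, 3:1 and 4:1 variants side by side.
    variants = [weighted_apm(do_score, cp_score, cp_weight=w) for w in (2, 3, 4)]
    print(name, " / ".join(f"{value:.1f}" for value in variants))
```

For the modern populations listed, the three variants differ by well under one percentage point, which is consistent with the ratio choice not mattering much in practice.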

Full results from up to 24 populations are shown in the Data Sink (interactive chart below). Summarised, Pamiri Tajiks are the most Sintashta-derived at 31.9%. North Caucasian (Ossetian) and Central Asian ethnic groups (Pashtuns, Uzbek, other Tajiks, Turkmen) follow at 19-22%. Various other ethnic groups across West, Central and South Asia follow. The lowest scoring population sampled here is the Makrani at 9.2%.

Internal Validation
The output (Data Sink) readily demonstrates strong correlation between DO and CP scores per population (e.g. Tajik Pamiris at 34.2% & 30.8%, Nepalis at 15% & 14.5% and Makrani at 10.2% & 8.7% respectively account for the top, middle and bottom pairs). The only marked deviation between the DO and CP scores was noted in West Asian populations (Armenians, Kurds, Iranians), as mentioned previously. Thus, empirical confirmation of correlation (e.g. Spearman's rank order) is unwarranted here.

Another means of confirming the validity of APM is to confirm Andronovo is a descendant of Sintashta. As the Andronovo archaeological horizon originates from Sintashta directly, one would expect very high (>90%) Sintashta-derived ancestry among them.

This appears to be the case. Compared against Sintashta, Andronovo exhibits DO = 83.9%, CP = 97.3%, an average of 90.6% and a weighted average of 92.8%.

Summarised, these two results (dataset-wide correlation, ancestral-immediate successor high overlap) validate the outcomes of the APM.

Closing Thoughts
The results featured in this entry are in line with broad uniparental marker data and previously published IBD results (unfortunately removed from sources), and are largely (though not fully) compatible with the degree of archaeological input from Andronovo-derived cultures in Asia. As stated previously, due to the shortcomings described earlier, they should not be considered definitive.

Given the CP here is not exclusively associated with Sintashta, I anticipate this technique will be more accurate if future "steppe"/"Yamnaya"/"Yamanya_related" components are shown to define more of the Sintashta samples. I look forward to extending this method in the near future.

Special thanks to the user Kurd from Anthrogenica for making this data available and obliging member inquiries with productive responses, as well as the user Sapporo for generating several of the population averages.

Tuesday, July 28, 2015

Comparison of Online Y-STR Predictors (Petrejcíková et al.) [Review]

An interesting study was published in 2014 based on Slovak Y-STR samples testing for 12 microsatellite markers. The main scope of this paper appears to be the investigation of the efficacy of three publicly available Y-STR haplogroup predictors (Athey, Cullen and YPredictor in alphabetical order) based on these 12 Y-STRs. Study contents shown below.

Y-SNP analysis versus Y-haplogroup predictor in the Slovak population.
Petrejcíková E, Carnogurská J, Hronská D, Bernasovská J, Boronová I, Gabriková D, Bôziková A, Maceková S. Anthropol Anz. 2014;71(3):275-85.
Human Y-chromosome haplogroups are important markers used mainly in population genetic studies. The haplogroups are defined by several SNPs according to the phylogeny and international nomenclature. The alternative method to estimate the Y-chromosome haplogroups is to predict Y-chromosome haplotypes from a set of Y-STR markers using software for Y-haplogroup prediction. The purpose of this study was to compare the accuracy of three types of Y-haplogroup prediction software and to determine the structure of Slovak population revealed by the Y-chromosome haplogroups. We used a sample of 166 Slovak males in which 12 Y-STR markers were genotyped in our previous study. These results were analyzed by three different software products that predict Y-haplogroups. To estimate the accuracy of these prediction software, Y-haplogroups were determined in the same sample by genotyping Y-chromosome SNPs. Haplogroups were correctly predicted in 98.80% (Whit Athey's Haplogroup Predictor), 97.59% (Jim Cullen's Haplogroup Predictor) and 98.19% (YPredictor by Vadim Urasin 1.5.0) of individuals. The occurrence of errors in Y-chromosome haplogroup prediction suggests that the validation using SNP analysis is appropriate when high accuracy is required. The results of SNP based haplotype determination indicate that 39.15% of the Slovak population belongs to R1a-M198 lineage, which is one of the main European lineages.

Are They Really Comparable?
Although all three predictors returned similar efficacy rates (~97-99%), it should be noted the authors' chief divisions of interest appear to be the conventional subclade designations currently used in both literature and the genetic genealogy community (e.g. R1a1a-M198). The authors correctly state Y-SNP testing is paramount in definitively gauging subclade classifications, especially for lines substantially downstream of a given haplogroup's phylogeny.
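To put those percentages in perspective, they can be converted back into counts using the sample size of 166 given in the abstract. The short sketch below does only that arithmetic; the function name `correct_calls` is my own, and it assumes each percentage was computed over all 166 men.

```python
SAMPLE_SIZE = 166  # Slovak males genotyped, per the abstract

# Reported share of correctly predicted haplogroups (percent).
REPORTED = {"Athey": 98.80, "Cullen": 97.59, "YPredictor": 98.19}


def correct_calls(percent, n=SAMPLE_SIZE):
    """Nearest whole number of individuals matching a reported percentage."""
    return round(percent / 100 * n)


for name, percent in REPORTED.items():
    correct = correct_calls(percent)
    errors = SAMPLE_SIZE - correct
    print(f"{name}: {correct}/{SAMPLE_SIZE} correct, {errors} mispredicted")
```

In other words, the gap between the best and worst performer amounts to only two individuals in this sample.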

The rest of this entry determines whether these calculators display any other features which may give aspiring researchers reasons to choose one over another.

Subclade Coverage
A substantial difference is observed between the three. Athey's output is oriented around 21 categories spread across most of the major clades/subclades, although haplogroups not commonly found in West Eurasia (e.g. A-D) are unrepresented. Cullen improves on this significantly with 86 subclades, with Y-DNA I receiving the most attention (R1b to a lesser extent), as well as other improvements such as the inclusion of "A&B". YPredictor has the highest count, hosting over 100 subclades, with the majority found in Y-DNA haplogroups E, G, J, N and R. With the exception of Y-DNA M and S, all are accounted for here.

STR count
Athey is capable of handling 111 Y-STRs (21 and 27-STR versions also available) with the format being listed in either numerical or Family Tree DNA (FTDNA) order. Cullen accepts a maximum of 67 STRs. YPredictor houses approximately 82 STRs. As such, all three are capable of handling a considerable number.

All three predictors permit the use of batched data and provide different means of categorising the data as seen fit by the user. Instructions are adequately provided for all three as well. As a research utility, however, YPredictor stands out through its custom YFiler iterations (a widely used format in population genetics publications concerning Y-STRs) and the debug feedback it gives before predictions are made by the calculator.

Computational Time
This varies based on the user's CPU speed, as well as whether they are manually entering STR values or inserting batched data. As such, this probably shouldn't be a pertinent factor in deciding which calculator to use.

Output Information
All three produce similar information (subclade prediction with probability expressed as a percentage).

Before summarising these findings, it is worth noting that Athey's predictor predates both Cullen's and YPredictor. As such, any perceived deficiencies in subclade breakdown or functionality are likely a result of age. Athey's predictor was widely used in the past, irrespective of the current application rate.

All three predictors are of use to genetic genealogists. This entry concludes the following "idealised" purposes for each:

  • Athey - For users keen to utilise upwards of 111 FTDNA Y-STRs as cross-validation against the other two
  • Cullen - Best for those seeking refined Y-DNA I or R1b subclade predictions
  • YPredictor - Most versatile and research-friendly, best worldwide coverage of Y-DNA subclades

As such, the three calculators certainly are comparable for making basic Y-STR predictions for West Eurasians, but obvious differences exist with respect to non-West Eurasian subclade coverage.

If compelled to make a single choice, I would recommend Cullen first to genetic genealogists of Northwest European paternal heritage (given the high frequencies of Y-DNA I and R1b). YPredictor would be the best choice for those belonging to subclades more common outside Europe. This also explains why it has been extensively used in this blog to date. Athey's function has otherwise been usurped by the other two.

Thursday, July 9, 2015

Presenting Bakhtiari Uniparental Marker Data [Original Work]

Bakhtiari people (Google Search)
The Bakhtiari people are one of Iran's ethnic minorities. Inhabiting the Iranian plateau's southwestern portion, the Bakhtiari traditionally maintained a hierarchical social structure with a genealogical basis (with organisations or positions including rish safids, kalantars, khans and ilkhani) [1]. Historically, the Bakhtiari have played a role in several pivotal events leading up to the formation of the modern Iranian state [2].

In recent years, the Bakhtiaris have received some attention in the literature with respect to ancestry. This has been achieved predominantly via uniparental markers (Y-DNA and mtDNA) and coincides with work addressing the genetic origins of other ethnic minorities in Iran. For instance, in 2012, Grugni et al. expanded our understanding of Iranian Y-DNA across the country through sampling almost 1,000 unrelated men across 15 distinct ethnic groups (previous entry).

In spite of such developments, however, the Bakhtiari have received comparatively little attention in either the genetic genealogy community or the literature. This entry attempts to explore the available data and arrive at a stable set of results for this group.

Khuzestan province, Iran (Wikipedia)

Search engines were limited to PubMed and Google Translate. Search terms included "Bakhtiari", "Y-DNA", "Y-Chromosome", "mtDNA", "mitochondrial", "STR", "SNP", "HVR" and "Iran". No limit was placed on publication date. All mtDNA and Y-DNA data was compiled. Where Y-STRs are presented, these were run through Vadim Urasin's YPredictor (v1.0.3 offline version). A 70% prediction strength threshold was implemented. If the resulting data is sparse, novel ways of consolidating the information will have to be devised and explained during the course of this entry.

Search Outcomes
Three studies were found to contain Bakhtiari uniparental data, with one partially covering Bakhtiari mtDNA (Derenko et al. 2013 [3]) and two for Y-DNA (Nasidze et al. 2008 [4], Roewer et al. 2009 [5]). The Bakhtiari populations featured mostly reside in Izeh, Khuzestan province, Iran [3-5] with a single sample coming from Lurestan province, Iran [4].

mtDNA Results
Derenko et al. featured only two Bakhtiari samples. One belonged to mtDNA H*, which was also observed in several Persian (Kerman province) and Qashqai samples, alongside a single Armenian [3]. The only other sample was mtDNA U2d2, also found in a single Persian (Kerman province). The authors noted that the combined frequency of mtDNA U2c and U2d in Iran was highest among the Persians nationwide (approaching 10%) [3]. However, given the absence of additional samples, no reasonable conclusions can be drawn from these results.

Nasidze et al. provides both frequency and HVR1 derived variance data on the Bakhtiari and Ahwazi Arab populations [4]. The Bakhtiari appear to chiefly belong to mtDNA haplogroups N, U, H, T and J (below).

mtDNA Frequency Data from Khuzestan province, Iran (Nasidze et al. 2008)

Unfortunately, further information on subclade breakdown is not provided. However, as concluded by the authors and as is evident from the frequency data, the mtDNA profile of the Bakhtiari is almost identical to the Ahwazi Arab sample. Additionally, Nasidze et al. note "considerable sharing of HV[R]1 sequences" between these two groups [4]. In tandem with the inferences described above through Derenko et al., it appears that significant matrilineal marker overlap does exist across the Iranian plateau.

Y-DNA Results
Nasidze et al. first published data on 53 unrelated Bakhtiari men [4]. Due to substandard Y-SNP genotyping, the only conclusion that may broadly be discerned is that the Bakhtiari chiefly belong to Y-DNA haplogroups J2-M172 (25%) and G-M201 (15%) (Data Sink). In this respect, these results cannot give observers a reliable indication of the Bakhtiari Y-DNA profile. Roewer et al.'s data indicate that some Bakhtiari share the same core 17-STR haplotypes with one another (e.g. J2a4, T*), but not with any other samples across the country [5].

One "quick and dirty" way of addressing this problem is by using the YFiler (17 STR) Bakhtiari haplotypes (Data Sink) from Roewer et al. to "recharacterise" the Nasidze data. This is deemed the most suitable option for two reasons:
1) Nasidze et al. has an adequate sample size (n=53) but inadequate Y-SNP genotype selection
2) Roewer et al. has an inadequate sample size (n=18) and no confirmed Y-SNP testing, but the YPredictor data should provide reasonable subclade determination with a 70% probability threshold in place

"Recharacterisation" is achieved by splitting each Nasidze et al. frequency proportionally among the predicted subclades obtained from the Roewer et al. Y-STR haplotypes. For example, Nasidze et al. found "DE-YAP" at 8%, with the Roewer et al. predicted results showing 5.6% each for "DE*" and "E1b1b1". As both these subclades are contained within the DE-YAP node, the original value is recharacterised as DE 4% and E1b1b1 4%. The outcome is presented numerically (Data Sink) and demonstrated below (values rounded down to fit to 100%):

Y-DNA J2a4 constitutes the largest subclade (22.1%), with H (10.8%), R1a1a (8.9%) and T* (8.5%) following. The results imply considerable Y-SNP diversity within the Izeh Bakhtiari.
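For readers who wish to repeat the "recharacterisation" on their own data, the proportional step can be sketched as follows. This is a minimal illustration using only the DE-YAP example given above; the function name `recharacterise` is hypothetical and the rounding applied to the final chart is not reproduced here.

```python
def recharacterise(parent_frequency, predicted_children):
    """Split a parent haplogroup frequency among its predicted subclades.

    parent_frequency: frequency (percent) reported for the broad Y-SNP node.
    predicted_children: predicted subclade frequencies (percent) that fall
        under that node, used only for their relative proportions.
    """
    total = sum(predicted_children.values())
    if total == 0:
        raise ValueError("no predicted subclades to distribute the frequency over")
    return {name: parent_frequency * share / total
            for name, share in predicted_children.items()}


# DE-YAP example from the text: 8% in Nasidze et al., with DE* and E1b1b1
# each predicted at 5.6% in the Roewer et al. haplotypes.
result = recharacterise(8.0, {"DE*": 5.6, "E1b1b1": 5.6})
print(result)
```

Because only the relative proportions of the predicted subclades are used, the parent node's original frequency is always preserved in the total.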

These results are somewhat at odds with those suggested by the Roewer et al. figures, particularly the frequency of Y-DNA J2-M172 (50% in Roewer et al. vs. 25% in Nasidze et al.). The most likely basis for this is sampling bias, given the former only tested 18 individuals. It should be noted that Y-DNA J-12f2 has been documented to have a major (>60%) presence in Southwestern Iran (Quintana-Murci et al. 2001) with the majority of this likely being represented by downstream J2-M172 subclades (as per Grugni et al. 2012). It is therefore plausible for some Bakhtiari groups to yield exceptionally high frequencies of Y-DNA J2-M172 (likely J2a4 subclade) with future testing. The breakdown shown above is also broadly in line with past data from Southwestern Iran (Grugni et al. 2012).
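One rough way to gauge how much of the J2-M172 discrepancy could be down to sample size is to compare confidence intervals for the two proportions. The sketch below uses the Wilson score interval, which is my own choice of method rather than anything used in the cited papers. The counts are also assumptions: 9 of 18 stands in for the 50% in Roewer et al., and 13 of 53 for the roughly 25% in Nasidze et al.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Assumed counts reconstructed from the quoted percentages (not published counts).
roewer = wilson_interval(9, 18)    # ~50% J2-M172 among 18 men
nasidze = wilson_interval(13, 53)  # ~25% J2-M172 among 53 men

for label, (low, high) in (("Roewer et al.", roewer), ("Nasidze et al.", nasidze)):
    print(f"{label}: {low:.1%} to {high:.1%}")
```

Under these assumed counts the interval for the 18-man sample is very wide and overlaps the one for the larger sample, which fits the sampling bias explanation, though it does not prove it.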

It must be cautioned that literal interpretation of these results (both subclade breakdown and numbers) is not advised due to the inaccuracies introduced by the "recharacterisation" and the lack of Y-SNP confirmation in Roewer et al.

It should also be emphasised that, as a tribal group, the Bakhtiari have most likely undergone genetic drift in their uniparental markers over time. As such, the finding of ~10% Y-DNA H is not completely surprising. Whether these values will be substantiated in future work is an open question.

The current evidence does suggest that the Bakhtiari closely resemble and share heritage with their immediate neighbours matrilineally, resting upon a backdrop of some common mtDNA diversity across the Iranian plateau. Inferences beyond this point will fall towards the realm of speculation.

The situation appears somewhat inverted on the Y-DNA side, where no Y-STR haplotype sharing is observed with other groups in the Iranian plateau. The "recharacterised" data gives us an approximate idea of what the Bakhtiari Y-DNA profile might have looked like had Nasidze et al. used a better Y-SNP genotype panel.

Other ethnic minorities in Iran have received consistent attention in this respect, such as the neighbouring Qashqai and Lurs (Farjadian et al. 2011). The paucity of Bakhtiari uniparental marker data indicates this is very much an area that needs immediate attention. A first direction for researchers is to sample at least 50 unrelated individuals from Izeh using a more conventional Y-SNP genotype panel. Additional clarity will be gained by testing further areas, as well as reconciling the Bakhtiari tribal structure with these outcomes.

A very special thanks to the user "J Man" from Anthrogenica for bringing this interesting topic to my attention.

[Edit 10/07/2015]: I have also learned while researching this topic that Dr. Ivan Nasidze unfortunately passed away in 2012. His work served as an important early foundation towards understanding the genetic constitution of Caucasian and Iranian populations. May he rest in peace.

1. Bakhtiari. Last Accessed 25/06/2015:

2. Study of the Qajar government policy at the case of Household Bakhtiari. Last Accessed 6/07/2015:,%202014/26%202014-30-1-pp.124-127.pdf 

3. Derenko M, Malyarchuk B, Bahmanimehr A, Denisova G, Perkova M, Farjadian S. Complete mitochondrial DNA diversity in Iranians. PLoS One. 2013 Nov 14;8(11):e80673. doi: 10.1371/journal.pone.0080673. eCollection 2013.

4. Nasidze I, Quinque D, Rahmani M, Alemohamad SA, Stoneking M. Close genetic relationship between Semitic-speaking and Indo-European-speaking groups in Iran. Ann Hum Genet. 2008 Mar;72(Pt 2):241-52. doi: 10.1111/j.1469-1809.2007.00413.x. Epub 2008 Jan 20.

5. Roewer L, Willuweit S, Stoneking M, Nasidze I. A Y-STR database of Iranian and Azerbaijanian minority populations. Forensic Sci Int Genet. 2009 Dec;4(1):e53-5. doi: 10.1016/j.fsigen.2009.05.002. Epub 2009 Jun 5.