Haplotype frequencies at the DRD2 locus in populations of the East European Plain
© Flegontova et al. 2009
Received: 9 March 2009
Accepted: 30 September 2009
Published: 30 September 2009
It was demonstrated previously that the three-locus RFLP haplotype, TaqI B-TaqI D-TaqI A (B-D-A), at the DRD2 locus constitutes a powerful genetic marker and probably reflects the most ancient dispersal of anatomically modern humans.
We investigated TaqI B, BclI, MboI, TaqI D, and TaqI A RFLPs in 17 contemporary populations of the East European Plain and Siberia. Most of these populations belong to the Indo-European or Uralic language families. We identified three common haplotypes, which occurred in more than 90% of chromosomes investigated. The frequencies of the haplotypes differed according to linguistic and geographical affiliation.
Populations in the northwestern (Byelorussians from Mjadel'), northern (Russians from Mezen' and Oshevensk), and eastern (Russians from Puchezh) parts of the East European Plain had relatively high frequencies of haplotype B2-D2-A2, which may reflect admixture with Uralic-speaking populations that inhabited all of these regions in the Early Middle Ages.
The DRD2 gene is located on chromosome 11 and encodes the neuronal dopamine receptor D2, which plays a role in movement, emotional memory, and appetitive behavior. The DRD2 locus has been the object of numerous genetic association studies [2–5], and the most extensively studied polymorphism is a TaqI A RFLP (rs1800497; in the vicinity of the DRD2 gene), which has been associated with the pathology of psychoses (schizophrenia and manic-depressive disorder), Parkinson's disease, and various substance abuse syndromes. It has been proposed that TaqI A might be in linkage disequilibrium with some unidentified polymorphisms within the exons or regulatory regions of the DRD2 gene, but recently it has been mapped to the last exon of the ANKK1 (ankyrin repeat and kinase domain containing 1) gene, where it results in a Glu-Lys substitution. Other frequently studied RFLPs, for example, TaqI B and TaqI D (rs1079597 and rs1800498, respectively), are located in the introns of the DRD2 gene and most probably have no functional significance.
TaqI B, TaqI D, and TaqI A polymorphisms have also been studied on a worldwide scale ([7–11]; the ALFRED database, http://alfred.med.yale.edu/alfred/index.asp), and centers of dispersal, which probably reflect the most ancient dispersal of anatomically modern humans, have been proposed for their three-locus haplotypes. It has been shown that the B2, D2, and A1 alleles are ancestral alleles common to other hominoids [12–14]. Kidd et al. proposed the following evolutionary sequence for the most common haplotypes: evolution of B2-D2-A1 to B2-D2-A2 and B1-D2-A1, and evolution of B2-D2-A2 to B2-D1-A2. The other, less frequent haplotypes probably arose by recombination. Three-locus haplotypes exhibit pronounced geographical differentiation. With the exception of some tribal populations of India, the ancestral haplotype B2-D2-A1 is mainly confined to African groups. The singly derived haplotype B1-D2-A1 is most widespread among people of East Asian descent, including Native Americans. Haplotype B2-D2-A2 is present among all populations but is least prevalent in Western European and American populations. The doubly derived haplotype B2-D1-A2 is common in Europe and rare in East Asia. The other haplotypes are extremely rare and have a sporadic distribution.
Here we provide data on the variability of DRD2 haplotypes in a previously uninvestigated region, the East European Plain. We investigated 14 contemporary populations of the East European Plain and Siberia that belong to the Indo-European and Uralic language families. In addition, two populations of the Altaic language family (Yakuts and Kalmyks) and a population of the North Caucasian language family (Adygeis) were included as reference groups. We also performed an updated global analysis of B-D-A haplotype frequencies using our data and data from the ALFRED database.
Table 1. Geographical and linguistic affiliations of populations sampled in this study, and allele frequencies in the studied populations. Columns include the number of individuals.
Sampling locations, in table order: Russia, Krasnodar region; Belarus, Brest region, Pinsk; Belarus, Minsk region, Mjadel'; Belarus, Mogilev region, Klimovichi; Russia, Tver region, Andreapol'; Russia, Smolensk region, Sychevka; Russia, Kursk region, Ponyri; Russia, Ivanovo region, Puchezh; Russia, Archangelsk region, Oshevensk; Russia, Archangelsk region, Mezen'; Russia, Vologda region, Babaevo; Russia, Komi Republic, Izhma (Komi 1, Izhemski); Russia, Komi Republic, Obyachevo (Komi 2, Priluzski); Russia, Khanty-Mansi autonomous region; Russia, Yamalo-Nenetsky autonomous region; Russia, Saha Republic, Tiungiuliu; Russia, Republic Kalmykiya, Elista.
Alleles 1 at the TaqI B, BclI, and TaqI A loci were more frequent in the populations of Russians 1 and 5, Veps, Komi 2, Khants, Nenets, Yakuts, and especially Kalmyks, than in the other populations (Table 1). Allele 1 at the MboI locus and allele 2 at the TaqI D locus were the most frequent alleles in Asian populations, i.e., Khants, Nenets, Yakuts, and Kalmyks. It is notable that combinations of allele 2 at the TaqI B locus with allele 1 at the BclI locus and allele 2 at the MboI locus with allele 2 at the TaqI D locus, found in the sequenced chimpanzee genome, were absent in human populations (Table 1).
Table 2. Pairwise linkage disequilibrium values for the studied populations. Locus pairs: TaqIB/BclI - MboI/TaqID; TaqIB/BclI - TaqIA; MboI/TaqID - TaqIA. The surviving values, with their population label lost, are P = 0.155, P = 0.827, and P = 0.927.
Table 3. Frequencies of TaqI B-TaqI D-TaqI A haplotypes in the study populations. Row labels include Adygeis 1 (Shapsugs), Komi 1 (Zyrian), and Komi 2 (Zyrian).
In the Asian cluster, genetic distances were significant between Yakuts and Kalmyks (P < 0.00001), Khants and Yakuts (P < 0.00001), Khants and Kalmyks (P < 0.04661), and Nenets and Yakuts (P < 0.00099), but not between Nenets and Kalmyks (P < 0.07438). The distance between Khants and Nenets was small and non-significant (P < 0.52033).
All populations of the European cluster had significant genetic distances from the populations of the Asian cluster. The European-1 and European-2 clusters were quite homogeneous; most distances were non-significant within the former, and all were non-significant within the latter. Only Russians 1 occupied a distinct position within the European-1 cluster. In contrast, most pairwise genetic distances between the two subclusters were significant. Exceptions included Veps, which had no significant distances from the populations of the European-2 cluster, and Byelorussians 2, which had a small number of significant distances from the populations of the European-1 cluster.
The matrix of haplotype-based genetic distances was also compared with matrices of great circle geographical distances (data not shown) to assess the possible effect of isolation by distance. The matrices including all 17 populations were highly correlated according to the Mantel test (r = 0.749, P-value = 0.0001). However, correlation of geographical and genetic distances was not observed among the populations of the European cluster, i.e., after exclusion of Khants, Nenets, Kalmyks (which have migrated from East Asia in historical times), and Yakuts (r = 0.121, P-value = 0.2999). The use of more realistic distance measurements around geographical barriers, such as the Azov Sea, the Caucasus, and some parts of the Ural Mountains had little effect on results of the test (the correlation coefficients for 17 and 13 populations were 0.754 and 0.117, respectively; P-values were 0.0001 and 0.3109, respectively). Thus, isolation by distance is not a likely cause of the genetic variation observed in the East European Plain.
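The Mantel test used above can be illustrated with a minimal sketch. This is not the XLSTAT implementation used in the study: the function names, the one-sided permutation P-value, and the toy distance matrix are all assumptions made for illustration. The idea is to correlate the off-diagonal entries of the genetic and geographical distance matrices, then re-compute the correlation after randomly relabelling the populations in one matrix.

```python
import math
import random


def upper_triangle(matrix):
    """Return the entries above the diagonal of a square matrix as a flat list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(genetic, geographic, n_perm=999, seed=0):
    """One-sided Mantel test; returns (r, permutation P-value)."""
    rng = random.Random(seed)
    n = len(genetic)
    geo_flat = upper_triangle(geographic)
    r_obs = pearson(upper_triangle(genetic), geo_flat)
    hits = 0
    for _ in range(n_perm):
        # Relabel populations: permute rows and columns of one matrix together.
        order = list(range(n))
        rng.shuffle(order)
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), geo_flat) >= r_obs - 1e-12:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical toy data (not from the study): six points on a line.
positions = [0, 1, 3, 7, 15, 31]
toy = [[abs(a - b) for b in positions] for a in positions]
r, p = mantel_test(toy, toy)
```

Because rows and columns are permuted together, the dependence among distances that share a population is preserved, which is why an ordinary correlation test on the flattened matrices would not be appropriate here.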
Table 4. Comparison of some population groups according to the Kolmogorov-Smirnov test of haplotype frequencies
The following populations constituted the European-1 subcluster: Indo-European-speaking Irish, Danes, Byelorussians 1 and 3, Russians 1, 2, 3, and 7; North-Caucasian-speaking Adygeis 1 and 2; Jews 3 (Ashkenazi); Uralic-speaking Veps, Komi 1 and 2. The populations of this cluster had the following haplotype frequencies: 0.50-0.65 for the "European" haplotype B2-D1-A2, 0.21-0.31 for the "worldwide" haplotype B2-D2-A2, 0.08-0.25 for the "East Asian" haplotype B1-D2-A1. Some populations of the East European Plain (Indo-European-speaking Byelorussians 2, and Russians 4, 5, and 6; Uralic-speaking Finns and Komi 3; Altaic-speaking Chuvash) fell into the European-2 subcluster, which is characterized by the following haplotype frequencies: B2-D1-A2, 0.37-0.52; B2-D2-A2, 0.27-0.45; and B1-D2-A1, 0.04-0.25. Two European subclusters can be differentiated on the basis of their B2-D2-A2 and B2-D1-A2 frequencies (P-values according to the Kolmogorov-Smirnov test < 0.0001 and 0.001, respectively). This is also evident from Figure 3: on the MDS plot, both European subclusters occupy almost identical positions in the B1-D2-A1 frequency gradient (Figure 3, upward arrow) but have different positions in the B2-D1-A2 and B2-D2-A2 frequency gradients (Figure 3, arrows).
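The two-sample Kolmogorov-Smirnov comparison of haplotype frequencies between subclusters can be sketched as follows. The study used the test as implemented in XLSTAT; the function below is only an assumed minimal version that computes the test statistic (the largest gap between the two empirical distribution functions) and does not compute a P-value. The frequency lists are hypothetical, not the study's data.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)|."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the largest gap.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical haplotype frequencies in two groups of populations.
group_1 = [0.22, 0.24, 0.25, 0.27, 0.30]
group_2 = [0.29, 0.33, 0.36, 0.40, 0.44]
d_stat = ks_statistic(group_1, group_2)
```

A statistic near 1 means the two groups' frequency distributions barely overlap; a value near 0 means they are nearly indistinguishable.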
Table 5. Indexes of population structure. Columns include the number of samples.
Ugric-speaking Khants 1 and 2 and Samoyedic-speaking Nenets were distant from the other Uralic populations (Finns, Veps, Komi 1, 2, and 3) and grouped into the Intermediate-1 subcluster with Sino-Tibetan-speaking Riang and Tripperah from India, with Melanesians and Micronesians (the results for Melanesians and Micronesians should be interpreted with caution because of a low sample size; see Additional file 1). In Khants and Nenets, the frequencies of the "European" haplotype B2-D1-A2 (0.22-0.25) and the "East Asian" haplotype B1-D2-A1 (0.23-0.26) were intermediate between those of the European-1 and Asian-1 subclusters (Figure 3). It is interesting that the frequencies of the "worldwide" haplotype B2-D2-A2 in Khants (0.48-0.49) and Nenets (0.48) fell within the range typical of the Asian-1 subcluster, 0.44-0.53 (P-value = 0.9441 according to the Kolmogorov-Smirnov test).
The Asian-1 subcluster was formed by populations of East Asian descent (Altaic-speaking Kalmyks; Koreans and Japanese; Sino-Tibetan-speaking Hakka, Han 1 and 2; and Austro-Asiatic-speaking Cambodians). They had the following frequency pattern: B2-D1-A2, 0.04-0.14; B2-D2-A2, 0.44-0.53; and B1-D2-A1, 0.34-0.46.
The global FST value, 0.11413 (Table 5), falls within the range typical of autosomal markers (0.09-0.14). This value was estimated using haplotype frequencies without taking into account the extent of molecular differences between haplotypes. Calculation of FST based on numbers of pairwise differences between haplotypes gave a nearly identical value, 0.11383. Thus, the differentiation of populations may be explained by drift only, without any significant influence of mutation.
The East European (Russian) Plain is a region in which peoples of the Indo-European and Uralic language families have come into contact over an extended period. Uralic-speaking peoples have the longest validated archaeological record in this region. The most recent large-scale migration to this region involved the movement of Slavs (the Indo-European language family) to the east and northeast of their presumed homeland in Central Europe about 500 AD [18, 19]. Slavs were not the first Indo-European-speaking people who arrived in the Russian Plain: in the first millennium BC, Baltic-speaking tribes occupied a large part of the East European Plain. They were later displaced by Slavic tribes. According to the widely accepted hybridization theory of the origin of Eastern Slavs, Slavic populations arriving in the East European Plain mixed with indigenous Uralic- and, probably, Baltic-speaking people.
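As a rough illustration of a frequency-based FST, the sketch below uses Wright's heterozygosity formulation, FST = (HT - HS) / HT, with unweighted population means. This is an assumption chosen for simplicity: the study estimated FST with the AMOVA framework in Arlequin, which weights by sample size and will not give identical numbers. All frequencies shown are hypothetical.

```python
def heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for one set of frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def wright_fst(populations):
    """Unweighted Wright's FST from per-population haplotype frequencies.

    populations: list of frequency lists, one per population, all listing
    the same haplotypes in the same order.
    """
    n_pops = len(populations)
    n_haps = len(populations[0])
    mean_freqs = [sum(pop[k] for pop in populations) / n_pops
                  for k in range(n_haps)]
    h_s = sum(heterozygosity(pop) for pop in populations) / n_pops  # within
    h_t = heterozygosity(mean_freqs)                                # total
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


# Hypothetical three-haplotype frequencies in two populations.
example = [[0.55, 0.30, 0.15], [0.10, 0.50, 0.40]]
fst = wright_fst(example)
```

Treating all haplotypes as equidistant, as here, corresponds to the frequency-only estimate described above; an estimate based on pairwise differences would additionally weight each pair of haplotypes by the number of sites at which they differ.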
In our study, all populations of the East European Plain (excluding the Kalmyks, which are of East Asian origin) fell into a single large cluster termed European. Many populations within this cluster are indistinguishable with our genetic marker, i.e., genetic distances between them were not significant, which is in agreement with the low FST value for the European cluster (0.013). However, some populations were characterized by a large percentage of significant genetic distances from the other populations of the cluster. Most such populations fell into the so-called European-2 subcluster defined by cluster analysis; the 'core' subcluster was termed European-1, and 59% of genetic distances between populations of the two subclusters were significant. European-1 and European-2 subclusters (Figure 3) are differentiated according to the B2-D2-A2 frequency, but not according to the B1-D2-A1 frequency, which might reflect the degree of Asian admixture. Natural selection probably was not responsible for separation of the two European subclusters as there is no difference in allele frequencies at the TaqI A locus (Table 4), which is considered the most likely candidate for selection in the whole DRD2 region [2–6].
The European-2 subcluster includes two Middle Eastern populations (Jews 2 from Yemen and Druze from Israel), two Uralic-speaking populations (Finns and Komi 3), four Slavic-speaking populations (Byelorussians 2 and Russians 4, 5, and 6), and the Altaic-speaking Chuvash. All these linguistically and geographically distant populations are differentiated to some extent from the core of the European cluster, the European-1 subcluster, because of a relatively high B2-D2-A2 frequency.
The B2-D1-A2 and B1-D2-A1 haplotypes apparently have centers of dispersal in Europe/West Asia and East Asia, respectively. The B2-D2-A2 haplotype may also have a center of dispersal, the most probable location of which is in Africa. B2-D2-A2 was among the first haplotypes that evolved from the ancestral haplotype B2-D2-A1 in Africa and is still the most abundant haplotype in all African populations (Additional file 1). Therefore, the first settlers of Eurasia that migrated to Arabia and the Levant may mostly have carried the B2-D2-A2 haplotype and a small proportion of other haplotypes that were either subsequently eliminated or amplified by genetic drift and/or natural selection in various parts of the world.
B2-D2-A2 is the predominant haplotype in contemporary populations of Middle Eastern origin (Jews 1 from Ethiopia, Jews 2 from Yemen, and Druze from Israel). Populations of the Levant began to disperse into Europe about 50,000 YBP [21, 22]. People of that diaspora might have carried B2-D2-A2 as a prevalent haplotype. However, another haplotype, B2-D1-A2, is the predominant haplotype in contemporary populations of Europe, for example, in linguistically and geographically distant populations such as the Irish, Danes, Russians, and Adygeis. Amplification of this haplotype might have taken place during the initial migration to Europe or might be associated with confinement in refugia of the last glacial maximum and reexpansion. According to one of several hypotheses based on archaeological evidence (reviewed by Zvelebil), Uralic-speaking groups descended from Europeans who had been confined in the East European refugium in Ukraine and Southern Russia [24–29]. It is possible that the people of the East European refugium retained a high frequency of the B2-D2-A2 haplotype, in contrast with the other European populations that have spread from other refugia.
Thus, on a global scale, the B2-D2-A2 frequency possibly reflects genetic features of the first Eurasian settlers, e.g., in Jews 2 from Yemen and Druze from Israel, and, on a local scale, genetic features of Uralic-speaking populations, e.g., in Finns and Komi 3, Khants and Nenets. Therefore, the other members of the European-2 subcluster, Byelorussians 2, Russians 4, 5, and 6, and Chuvash, probably have a certain level of Uralic admixture. Moreover, according to other genetic and historical data, these populations have a Uralic substratum.
Studies on mtDNA [30–34] and Y-chromosome haplogroups [33, 35, 36], autosomal VNTR diversity [37, 38], and autosomal haplotypes consistently show a degree of Uralic admixture in populations of northern Russians. This admixture is manifested by a 'Uralic-specific' marker, the U4 mtDNA haplogroup [40–45], or by Asian-specific markers which Uralic populations acquired during earlier admixture with East Asians: Asian mtDNA haplogroups [40, 42, 45], N3 and N2 Y-chromosome haplogroups [46, 47]. For example, the Mezen' population investigated by Balanovsky et al. has a high N3 frequency, resembling some Uralic populations.
Russians 5 (Oshevensk) were most closely associated with Finns and Chuvash according to the MDS results (Figure 3). In the study of Verbenko et al. on polymorphic tandem repeats at the D1S80 locus, the same Oshevensk sample (as well as another northern Russian sample) clustered together with Uralic-speaking Mari, Komi, and Udmurts, whereas other Russian populations clustered with Indo-Europeans and Adygeis. Analysis of other repeat loci, 3' ApoB, DMPK, DRPLA, and SCA1, also demonstrated remoteness of some northern Russian populations (including Russians 5, Oshevensk) from the core of the European cluster. Similar results were obtained using haplotypes at the TP53 locus: the Oshevensk population tended to form a cluster with Uralic-speaking Mordvins and with Altaic-speaking Kalmyks and Buryats, but not with Russians from Smolensk (Russians 2) or Byelorussians from Pinsk (Byelorussians 1).
The Uralic genetic substratum is appreciable not only in the northeast of the East European Plain but also in its northwestern part, for example, in the Pskov and Novgorod regions, and in Latvians and Lithuanians [46, 49, 50]. Baltic-speaking peoples, now represented by the Latvians and Lithuanians, came into contact with Uralic groups before the Slavs did (Figure 4). That Byelorussians 2 (Mjadel') fell into the European-2 subcluster may also reflect a general tendency in the northwestern region. Mjadel' is located in the northwestern part of Belarus near the contemporary Lithuanian border. The Russians 4 (Puchezh) population is distant from the northeastern and northwestern groups, but also belongs to the European-2 subcluster (Figure 3). Uralic admixture in this population may be explained by the presence of Uralic-speaking tribes in the region of Puchezh in historical times (see Figure 4).
Russians 1 (Andreapol') and Uralic-speaking Veps are close to Uralic-speaking Komi 2 (Obyachevo) on the MDS plot (Figure 3). All these populations are located within the region occupied by Uralic peoples in the Middle Ages (Figure 4), but belong to the European-1 cluster and do not have high B2-D2-A2 frequencies typical of the European-2 cluster. However, they are shifted from the core of the European cluster because of a relatively high proportion of the "East Asian" haplotype B1-D2-A1. In fact, the Veps population has significant genetic distance only from Druze, but not from the other populations of the European-2 cluster, and Komi 2 only from Jews 2, Druze, Russians 6, and Komi 3. The Andreapol' sample had the highest B1-D2-A1 frequency of all European populations (Additional file 1), and eight of 13 genetic distances between this sample and the other populations of the European-1 subcluster are significant.
Komi populations demonstrate remarkable heterogeneity according to various marker systems. For example, Komi-Permyaks and Komi-Zyrians have rather different mtDNA haplogroup frequencies but both have a relatively high U4 frequency. In our study, one of the Komi-Zyrian populations (Komi 1, Izhma) belonged to the core of the European-1 cluster. It is interesting that the craniological results of Moiseyev also place Komi-Zyrians at the core of the European cluster and distant from Uralic and Asian groups. According to three-site haplotype frequencies at the TP53 locus and VNTR frequencies at the D1S80 and 3' ApoB loci, the Komi 1 and 2 populations are distant from Uralic-speaking Finns, Mordvins and Khants, East Asian groups, and Slavic groups. Moreover, the Komi 2 (Obyachevo) population is distant from Komi 1 and closer to Slavic groups than Komi 1. Thus, the position of Komi in genetic gradients remains uncertain because of substantial divergence of population samples and contradictory results, which may reflect a complex history of this group or natural selection.
B2-D2-A2 haplotype frequencies in Khants (0.48-0.49) and Nenets (0.48) fell within the range typical of the Asian-1 subcluster, 0.44-0.53. Therefore, the high frequency of the B2-D2-A2 haplotype in Uralic-speaking Khants and Nenets cannot be explained only by admixture of Europeans with typical East Asians; a combination of admixture and other processes such as genetic drift or selection is a more likely explanation. The Khants and Nenets may have received the B2-D2-A2 haplotype both from the East European refugium and from East Asia, e.g., from Siberian populations of East Asian origin. One such Siberian population, the Yakuts, has a very high frequency of haplotype B2-D2-A2, 0.6-0.7. However, DRD2 haplotype frequencies in other Siberian populations of East Asian origin are unknown, and it is not possible to draw definitive conclusions about B2-D2-A2 frequencies in Siberia based on one ethnic group alone. Moreover, because Yakuts 1 exhibited low linkage disequilibrium between RFLPs, haplotype frequencies for this sample should be interpreted with caution. Yakuts 1 and 2 apparently do not belong to the cluster of typical East Asians, Asian-1 (Figure 3), although they are clearly of East Asian origin. As suggested by archaeological and linguistic evidence, the Yakuts probably migrated north from their original area of settlement near Lake Baykal because of the Mongol expansion from the 13th to 15th century AD. Y-chromosome results reveal a very strong bottleneck in the Yakut population, which probably preceded their recent expansion [46, 54]. This bottleneck effect may be responsible for the aberrant haplotype frequencies for Yakuts observed in our study.
Populations in the northwestern (Byelorussians 2 from Mjadel'), northern (Russians 5 from Oshevensk and 6 from Mezen'; Komi 3), and eastern parts (Russians 4 from Puchezh and Chuvash) of the East European Plain have relatively high frequencies of haplotype B2-D2-A2, which may reflect admixture with Uralic-speaking populations. A Uralic genetic substratum in these regions, which were inhabited by Uralic-speaking tribes as late as the Early Middle Ages, was also shown by studies in which other genetic markers were used (mtDNA, Y-chromosome, and autosomal). Thus, the analysis of DRD2 haplotypes supports results on Slavic-Uralic admixture obtained using other markers, mainly neutral and sex-specific markers.
Blood samples (8 ml) were obtained by venipuncture and collected into EDTA-coated containers. Informed consent was obtained from each individual. The research protocols and forms of informed consent were approved by the Ethic Commission of the Medico-Genetic Scientific Centre of the Russian Academy of Medical Sciences (an approval was signed by the Head of the Ethic Commission, PhD, professor L.F. Kurilo). All individuals belonged to the native ethnic group of the region, i.e., their lineage in the region extended for at least two previous generations, and were unrelated to each other. DNA was isolated from leucocytes by proteinase K treatment and extraction with phenol-chloroform. Each DNA sample was subjected to three PCR analyses: amplification of a 459 bp fragment for TaqI B and BclI RFLP analysis, amplification of a 300 bp fragment for MboI and TaqI D RFLP analysis, and amplification of a 237 bp fragment for TaqI A RFLP analysis (primer sequences and original PCR protocols were obtained from the website of K. Kidd: http://info.med.yale.edu/genetics/kkidd/Protocol_TOC_new2002.html). The locations of these polymorphisms are shown on the gene map (Figure 1). All endonuclease restriction reactions were carried out overnight. Samples containing unrestricted fragments were tested at least twice. Restriction products were separated by electrophoresis using a 2.5% agarose gel.
The general strategy of statistical analysis was similar to that used in the work of Poloni et al. Allele frequencies, correspondence to Hardy-Weinberg equilibrium, and significance of linkage disequilibrium (P-values) were evaluated using Arlequin version 2.0 software (http://cmpg.unibe.ch/software/arlequin). Linkage disequilibrium values (D') between polymorphic loci were calculated as suggested by Lewontin. Frequencies of haplotypes were estimated from RFLP genotype data using the expectation-maximization algorithm of Excoffier and Slatkin implemented in Arlequin 2.0. Genetic affinities among populations were evaluated using coancestry coefficients, or linearized pairwise FST values, calculated on the basis of allele or haplotype frequencies. Significance of genetic distances was tested using permutations.
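To make the haplotype-estimation and linkage-disequilibrium steps concrete, the sketch below shows an expectation-maximization loop for the simplest case, two biallelic loci, followed by Lewontin's D'. It is a simplified illustration under stated assumptions, not the Arlequin implementation: the function names, the fixed iteration count, the genotype encoding, and the example counts are all hypothetical. Only double heterozygotes have ambiguous phase, so the E-step splits them between the two possible haplotype pairs in proportion to the current frequency estimates.

```python
def em_haplotype_freqs(genotype_counts, n_iter=200):
    """EM estimate of two-locus haplotype frequencies.

    genotype_counts[(i, j)] is the number of individuals with i copies of
    allele 1 at locus A and j copies of allele 1 at locus B (i, j in 0..2).
    Returns (f11, f12, f21, f22), where f12 is the frequency of the
    haplotype carrying allele 1 at A and allele 2 at B.
    """
    def count(i, j):
        return genotype_counts.get((i, j), 0)

    n_chrom = 2 * sum(genotype_counts.values())
    # Haplotype counts contributed by genotypes with unambiguous phase.
    base11 = 2 * count(2, 2) + count(2, 1) + count(1, 2)
    base12 = 2 * count(2, 0) + count(2, 1) + count(1, 0)
    base21 = 2 * count(0, 2) + count(1, 2) + count(0, 1)
    base22 = 2 * count(0, 0) + count(1, 0) + count(0, 1)
    double_het = count(1, 1)

    f11 = f12 = f21 = f22 = 0.25  # uninformative starting point
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is 11/22
        # rather than 12/21, given the current frequencies.
        cis, trans = f11 * f22, f12 * f21
        p_cis = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: re-estimate frequencies from expected haplotype counts.
        f11 = (base11 + double_het * p_cis) / n_chrom
        f22 = (base22 + double_het * p_cis) / n_chrom
        f12 = (base12 + double_het * (1 - p_cis)) / n_chrom
        f21 = (base21 + double_het * (1 - p_cis)) / n_chrom
    return f11, f12, f21, f22


def lewontin_d_prime(f11, f12, f21, f22):
    """Signed D' = D / Dmax for two biallelic loci."""
    p_a = f11 + f12  # frequency of allele 1 at locus A
    p_b = f11 + f21  # frequency of allele 1 at locus B
    d = f11 - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return d / d_max if d_max > 0 else 0.0


# Hypothetical genotype counts, for illustration only.
counts = {(2, 2): 4, (0, 0): 4, (1, 1): 2}
freqs = em_haplotype_freqs(counts)
d_prime = lewontin_d_prime(*freqs)
```

EM can converge to a local optimum, so in practice it is usually restarted from several starting points; this sketch uses a single uniform start for brevity.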
Correlation of geographical and genetic distances was assessed using the Mantel test and XLSTAT version 2008.6.04 software (Addinsoft). Great circle geographical distances were calculated using the haversine formula; latitudes and longitudes were determined using Google Earth software. Multidimensional scaling (MDS) and cluster analysis with the unweighted pair-group average (UPGA) method were performed using STATISTICA version 6.0 (http://www.statsoft.com). Statistical comparison of haplotype frequencies in population groups was performed using the Kolmogorov-Smirnov test implemented in XLSTAT. Differentiation between population groups defined by MDS and cluster analysis was estimated using the analysis-of-molecular-variance approach, AMOVA, implemented in Arlequin 2.0. Conventional FST distances between haplotypes were used, i.e., all haplotypes were considered equidistant (a conservative scenario of pure drift). Significance of the genetic-structure indexes obtained with the AMOVA method was tested using a permutational procedure (1 × 10^6 permutations).
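The great circle distance calculation mentioned above can be sketched with the haversine formula. The function name and the Earth radius of 6371 km are assumptions; the paper does not state which radius was used.

```python
import math

# Mean Earth radius in km; an assumed value, not taken from the paper.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# A quarter of a great circle: from the equator to the pole.
quarter = haversine_km(0.0, 0.0, 90.0, 0.0)
```

Applying this function to every pair of sampling locations yields a symmetric matrix of geographical distances that can be compared with the matrix of genetic distances.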
The study was supported by the Russian Ministry of Science and Education program "Federal Support of Leading Scientific Schools" (grant No 3911.2008-04), by the Russian Academy of Sciences programs "Molecular and Cell Biology" and "Basic Science for the Advantage of Medicine", and by the Russian Foundation for Basic Research grant No 07-04-00027-a.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.