Genetic polymorphisms of pharmacogenomic VIP variants in the Mongol of Northwestern China

Background Within a population, differences in pharmacogenomic variant frequencies may produce diversity in drug efficacy, safety, and the risk of adverse drug reactions. With the development of pharmacogenomics, extensive genetic research on drug metabolism has been conducted in major populations, but less is known about minorities. Results In this study, we recruited 100 unrelated, healthy Mongol adults from Xinjiang and genotyped 85 VIP variants from the PharmGKB database. We compared our data with eleven populations listed in the 1000 Genomes Project and the HapMap database, using χ² tests to identify loci that differed significantly between these populations. We downloaded SNP allele frequencies from the ALlele FREquency Database to observe the global distribution of genetic variation at these specific loci, and then used Structure software to analyze the genetic structure of the 12 populations. Conclusions Our results demonstrated that polymorphic allele frequencies differ between nationalities and indicated that Mongols are most similar to Chinese populations, followed by JPT. This information on the Mongol population complements the existing pharmacogenomic data and provides a theoretical basis for screening and therapy in the different ethnic groups within Xinjiang.


Background
It is well known that individuals react differently to the same medications. Pharmacogenomics seeks to identify genetic markers that may influence a person's response to pharmaceuticals, and it will undoubtedly become an indispensable part of medical care in the future [1,2]. Pharmacogenomic research seeks to identify single nucleotide polymorphisms (SNPs) or multiple gene signatures that may be associated with medication responses [3]. The goal of this research is to provide information for personalized medicine, i.e. to give each patient the optimal medication at the optimal dose, and to promote personalized therapeutics [4][5][6].
Numerous studies have shown that certain important genes and genetic variations affect critical functions during the drug reaction process. These genetic variations are called very important pharmacogenetic (VIP) variants and are listed in pharmacogenomics databases such as the Pharmacogenomics Knowledge Base (PharmGKB), the Pharmacogenetics of Membrane Transporters (PMT) database, and PharmaADME [6][7][8]. Currently, PharmGKB (http://www.pharmgkb.org) is the most comprehensive database and is dedicated to disseminating primary pharmacogenomic data and knowledge. Its curators have extensively annotated the vital drug response genes and presented this information in VIP summaries, pathway diagrams, and curated literature [9].
In China, there are 56 different nationalities. Excluding the Han, they account for approximately 100 million people. Because of their different genetic backgrounds and diverse environments, these minority populations are easily distinguished from the Han ethnicity. The Mongolian population is one of the fifteen largest ethnic minorities in China [10]. Mongols live primarily in Inner Mongolia, Liaoning, Heilongjiang and the Xinjiang Uygur Autonomous Region. These areas lie in the grassland region of Northern China and differ significantly from the Central Plains. The special living environments of the Mongol people have shaped their unique gene distribution frequencies. An increasing number of studies suggest that genes related to drug response vary between populations [11], so pharmacogenomic population genetic studies of different populations are valuable.
In this study, we randomly selected 85 VIP variants from the PharmGKB VIP database and genotyped them in 100 Mongols from Xinjiang. We designed primers using MassARRAY Assay Design 3.0 Software [12]. We compared the allele frequencies of the Mongols with those of 11 populations from the 1000 Genomes Project, and their genotype frequencies and haplotype construction with those of 11 HapMap populations, to identify the differences among them. The results will expand the current pharmacogenomic information on Mongols and on ethnic diversity. We aimed to provide new strategies for medical professionals to use genomic and molecular data to optimize drug administration and therapeutic treatment in the future.

Ethics statement
Blood samples and signed informed consent forms were obtained from all enrollees. All participants were informed both verbally and in writing of the procedures and purpose of the study. The clinical protocol was approved by the Clinical Research Ethics of Xizang Minzu University and Northwest University and complies with Department of Health and Human Services (DHHS) regulations for the protection of human research subjects.

Study participants
We recruited 100 randomly selected, unrelated Mongol adults (50 males and 50 females, aged 25-40 years) from the Xinjiang Region of China and collected blood samples. Participants were required to be in good health and to have at least three generations of exclusive Mongol ethnic ancestry. They rarely interact with other ethnic groups in Xinjiang because they are still nomads living on relatively limited pasture. They were therefore considered a representative sample of the Mongol population with regard to both ancestry and environmental exposure.

Variant selection and genotyping
Using the PharmGKB database, we screened published genetic polymorphisms associated with VIP variants, and finally 85 genetic variant loci from 37 genes were randomly selected for our investigation. We extracted genomic DNA from whole blood using a GoldMag-Mini Whole Blood Genomic DNA Purification Kit (GoldMag Ltd. Xi'an, China) according to the manufacturer's protocol. The genomic DNA concentration was measured by absorbance at 260 nm using a NanoDrop 2000C (Thermo Scientific, Waltham, Massachusetts, USA). We used the Sequenom MassARRAY Assay Design 3.0 software (San Diego, California, USA) to design multiplexed SNP MassEXTEND arrays [12]. We utilized a Sequenom MassARRAY RS1000 (San Diego, California, USA) to genotype the SNPs according to the manufacturer's instructions. Sequenom Typer 4.0 Software was used for data collection and analysis as described previously [13].

Statistical analyses
We used Microsoft Excel and the SPSS 19.0 statistical package (SPSS, Chicago, IL) to perform Hardy-Weinberg equilibrium (HWE) analysis and χ² tests. All p values were two-sided, and Bonferroni's adjustment was used to correct for multiple comparisons. Values were considered statistically significant at p < 0.05 without adjustment and at p < 0.05/(85 × 11) after adjustment [14]. We tested each variant frequency in Mongols with an exact test to identify those that departed from HWE. We downloaded the allele frequencies of the 85 loci for eleven randomly selected populations from the 1000 Genomes Project: a population of African ancestry in the southwestern USA (ASW); Chinese Dai in Xishuangbanna, China (CDX); Utah residents (CEPH) with Northern and Western European ancestry (CEU); Han Chinese in Beijing, China (CHB); Gujarati Indians in Houston, Texas, USA (GIH); Japanese in Tokyo, Japan (JPT); the Luhya people in Webuye, Kenya (LWK); people of Mexican ancestry from Los Angeles, USA (MXL); Puerto Ricans from Puerto Rico (PUR); the Tuscan people of Italy (TSI); and the Yoruba in Ibadan, Nigeria (YRI). We also downloaded the genotype frequencies of the 85 loci for eleven populations from the HapMap database: ASW; a northwestern European population (CEU); CHB; a Chinese population in metropolitan Denver, Colorado, USA (CHD); GIH; JPT; LWK; people of Mexican ancestry living in Los Angeles, California, USA (MEX); the Maasai people in Kinyawa, Kenya (MKK); TSI; and YRI. We first compared the allele frequencies of the Mongols with those of the eleven 1000 Genomes Project populations and calculated the correlation coefficient (R²) for the populations that differed least from the Mongols. We then compared the variant frequencies of the selected SNPs between the Mongols and the eleven HapMap populations (data from the second phase of HapMap: http://hapmap.ncbi.nlm.nih.gov) using a χ² test.
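The tests described above can be illustrated with a minimal Python sketch. It is not the SPSS or Excel procedure used in this study: the function names are hypothetical, the genotype counts in the usage note are invented, and the HWE test shown is one common exact test (summing the conditional probabilities of all heterozygote counts that are no more likely than the observed one), which may differ in detail from the test actually applied.

```python
import math


def bonferroni_threshold(alpha, n_loci, n_populations):
    """Per-test threshold when n_loci x n_populations comparisons are made."""
    return alpha / (n_loci * n_populations)


def chi2_genotype_test(counts_a, counts_b):
    """Pearson chi-square test on a 2 x 3 table of genotype counts.

    counts_a and counts_b are (n_AA, n_Aa, n_aa) for two populations.
    Genotype classes absent from both populations are dropped, which
    reduces the degrees of freedom. Returns (statistic, p_value).
    """
    columns = [(a, b) for a, b in zip(counts_a, counts_b) if a + b > 0]
    row_totals = (sum(a for a, _ in columns), sum(b for _, b in columns))
    if min(row_totals) == 0:
        raise ValueError("each population needs at least one genotype")
    total = sum(row_totals)
    stat = 0.0
    for col in columns:
        col_total = sum(col)
        for observed, row_total in zip(col, row_totals):
            expected = row_total * col_total / total
            stat += (observed - expected) ** 2 / expected
    df = len(columns) - 1
    if df == 2:
        p_value = math.exp(-stat / 2.0)             # chi-square tail, 2 df
    elif df == 1:
        p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square tail, 1 df
    else:
        p_value = 1.0                               # only one genotype class
    return stat, p_value


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact HWE test conditional on the observed allele counts."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab   # copies of allele A
    n_b = 2 * n_bb + n_ab   # copies of allele B

    def log_prob(het):
        # log P(het heterozygotes | n, n_a) under HWE
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * math.log(2.0)
                + math.lgamma(n + 1) - math.lgamma(hom_a + 1)
                - math.lgamma(het + 1) - math.lgamma(hom_b + 1)
                + math.lgamma(n_a + 1) + math.lgamma(n_b + 1)
                - math.lgamma(2 * n + 1))

    # Heterozygote counts must share the parity of the allele counts.
    probs = {het: math.exp(log_prob(het))
             for het in range(n_a % 2, min(n_a, n_b) + 1, 2)}
    observed = probs[n_ab]
    p_value = sum(p for p in probs.values() if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)
```

For example, `bonferroni_threshold(0.05, 85, 11)` gives the adjusted threshold 0.05/(85 × 11), and `chi2_genotype_test((50, 40, 10), (10, 40, 50))` compares two invented genotype distributions of 100 individuals each.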
Afterwards, we downloaded the SNP allele frequencies of selected loci from the ALlele FREquency Database (http://alfred.med.yale.edu, ALFRED) and analyzed the global patterns of genetic variation. We used the Haploview software package (4.2) to perform linkage disequilibrium (LD) analysis, construct haplotypes, and assess the genetic association of significant polymorphic loci.
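As an illustration of the pairwise LD measures underlying such an analysis, the following Python sketch computes D, D′ and r² for two biallelic loci. It assumes phased haplotype counts are already available, whereas Haploview estimates haplotype frequencies from unphased genotypes; the function name and the counts in the usage note are hypothetical, and the LOD score used in the Haploview color scheme is not computed here.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D_prime, r_squared) from phased haplotype counts.

    n_AB is the number of haplotypes carrying allele A at the first
    locus and allele B at the second, and so on for the other three.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total   # frequency of allele B at locus 2

    # Raw disequilibrium coefficient.
    d = p_AB - p_A * p_B

    # Largest |D| that the allele frequencies allow, given the sign of D.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # Squared correlation between the two loci.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared
```

With the invented counts `ld_stats(40, 10, 0, 50)`, one of the four haplotypes is absent, so D′ equals 1 while r² is below 1, which shows why the two measures are usually reported together.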

Analysis of population genetic structures
Population genetic structure is central to studies of human origins, DNA forensics and complex diseases, and it is equally important to a pharmacogenomic population study such as ours. Structure analysis is common in population genetic studies. To further investigate variation at the VIP loci in terms of population structure, we used STRUCTURE ver. 2.3.1 (Pritchard Lab, Stanford University, USA, http://pritchardlab.stanford.edu/structure.html), which uses the Bayesian clustering algorithm described by Pritchard et al. to assign samples to a hypothetical number K of populations [15]. We performed the structure analysis using the ancestry model with correlated allele frequencies among clusters. The possible number of clusters K ranged from 2 to 8, and 12 trials were run for each K. Each MCMC run consisted of an initial burn-in period of 10,000 iterations followed by 10,000 iterations for data collection. ΔK was calculated with STRUCTURE HARVESTER to identify the most likely number of clusters [16].
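The ΔK statistic can be sketched in Python as follows. This is an illustrative reimplementation under stated assumptions, not the STRUCTURE HARVESTER code: ΔK is taken as the absolute second-order difference of the mean Ln P(D) across replicate runs, divided by the sample standard deviation of Ln P(D) at that K. The function name and the log-likelihood values in the usage note are hypothetical.

```python
import statistics


def delta_k(lnp_by_k):
    """Compute delta K for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps each K to a list of Ln P(D) values, one per replicate
    run (at least two runs per K are needed for a standard deviation).
    Returns a dict {K: delta_K}.
    """
    mean_l = {k: statistics.mean(v) for k, v in lnp_by_k.items()}
    sd_l = {k: statistics.stdev(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in mean_l or k + 1 not in mean_l:
            continue  # the smallest and largest K have no delta K
        second_diff = abs(mean_l[k + 1] - 2 * mean_l[k] + mean_l[k - 1])
        if sd_l[k] > 0:
            result[k] = second_diff / sd_l[k]
    return result


def most_likely_k(lnp_by_k):
    """Return the K with the largest delta K."""
    scores = delta_k(lnp_by_k)
    return max(scores, key=scores.get)
```

With invented values such as `{2: [-1000, -1002], 3: [-900, -902], 4: [-890, -892], 5: [-885, -887]}`, the likelihood gain levels off after K = 3, so `most_likely_k` selects K = 3. Because ΔK cannot be evaluated at the ends of the tested range, a run over K = 2 to 8 yields values only for K = 3 to 7.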

Results
We genotyped 85 VIP variants in 100 Mongols. The PCR primers for the selected SNPs were designed using the Sequenom MassARRAY Assay Design 3.0 Software. Information regarding the selected VIP loci and their genotype frequencies is listed in Table 1, including the genes, their positions, the nucleotide change, the amino acid translation, the calculated allele frequencies, and the genotype frequencies for Mongols. Several variants, such as rs698, rs1695, rs5219, rs16974, rs20417, rs890293, rs2740574, and rs3211371, did not meet HWE at the 5 % significance level and were not included in the final 85 loci analyzed. We first compared the allele frequencies of the Mongols with those of the selected 11 groups from the 1000 Genomes Project database (p < 0.05) (Table 2). In Fig. 1, we selected CDX, CHB and JPT, the populations that differed least from the Mongol population, to calculate the correlation coefficient, R². From the comparison of allele frequencies, we drew the initial conclusion that the Mongols are relatively close to CDX, followed by CHB and JPT. We used χ² analyses to compare the genotype frequency distributions of the variants among the Mongols and the eleven HapMap populations (without adjustment, p < 0.05; with adjustment, p < 0.05/(85 × 11)). A number of loci had significantly different frequency distributions between the Mongols and the 11 HapMap populations; these are listed in Table 3. At p < 0.05, the rs1540339 locus (46489G > A), which is located in an intron of VDR (1,25-dihydroxyvitamin D3 receptor), showed the greatest number of significant differences between the Mongols and the 11 HapMap populations; rs776746 (12083G > A), an intronic SNP of CYP3A5, differed significantly in all of these populations except TSI.
After Bonferroni's multiple adjustment (p < 0.05/(85 × 11)), the number of HapMap populations in which the rs1540339 locus differed significantly changed substantially; the populations involved were CEU, CHB, CHD, JPT, MEX and TRI. The result for rs776746 also changed substantially, with CEU, CHD, GIH, JPT and MEX joining TSI as exceptions.
Of the 85 variants analyzed, 74 could be classified as part of a superfamily. When the gene superfamily categories were tallied, the number of the associated variants with significantly different frequencies between the Mongols and the eleven HapMap populations were as follows: ASW, 10; CEU, 9; CHB, 5; CHD, 1; GIH, 5; JPT, 4; LWK, 14; MEX, 1; MKK, 14; TSI, 4; and YRI, 21 (Table 4). A number of distinct loci were significantly different and included several pharmacogenomic superfamilies such as the nuclear receptor family, the sodium channel gene family, and the methylenetetrahydrofolate reductase family.
To further verify the ubiquitous differences between groups by examining the SNPs with the largest and smallest differences, we selected two variants, rs1540339, the most significantly different variant, and rs1801131, one of the least significantly different loci across all 12 populations, and downloaded their population data from the ALFRED database. Combining the new data, we carried out a global analysis. Figure 2 shows the global frequency data for rs1801131 and Fig. 3 shows the data for rs1540339. From the two figures, we found only that the frequencies in Mongols are relatively close to those of populations distributed in East Asia.
Meanwhile, we focused on rs1540339 to explore differences in haplotypes. We performed LD analysis to define blocks and haplotypes of the VDR gene, including rs1540339, rs7975232, rs1544410, and rs2239179. Figure 4 shows that Mongol and CHB each have only one block, consisting of rs1540339 and rs2239179, whereas the other populations have clearly different blocks from Mongol.
To further clarify the genetic structure of the Mongols and the other populations, we used Structure 2.3.1 to compare population genetic structure across the 85 loci (K = 2-8). The results for K = 3-5 are shown in Fig. 5; these values were chosen on the basis of the Estimated Ln Prob of Data and other recommendations in the STRUCTURE software manual. When K = 3, individuals were divided into three affinity groups (subgroup 1: Mongol, CHD, JPT, CHB; subgroup 2: MEX, TSI, GIH, CEU; subgroup 3: MKK, ASW, LWK, YRI) by assigning each individual to the subgroup with the relative majority of likelihood. We then ran STRUCTURE with larger K values and displayed the results as bar plots. At K = 4 and 5, Mongol is closest to CHD, followed by CHB and JPT, and shows significant differences in genetic structure from GIH and MEX.

Discussion
Personalized or stratified healthcare is an important goal for medicine in the 21st century. It aims to ensure that the treatments patients receive are safe and efficacious [17]. With the rapid development of pharmacogenetics, serious attention has been paid to interethnic or interracial differences in drug responses with the intent to identify the genetic backgrounds of these variations [18]. Our study analyzed the distribution of VIP variant allele and genotype frequencies to identify those that differ among human populations [19], and found that even the SNP with the smallest difference showed significant diversity between groups. Through comprehensive analysis, we found that the Mongols and the Chinese populations differ the least.
Two of the variants identified, rs1801133 (C677T) and rs1801131 (A1298C), one of which was among the least significantly different loci in our data, are located in the same gene, methylenetetrahydrofolate reductase (MTHFR). MTHFR is located on human chromosome 1p36.3 and encodes an important regulatory enzyme in the folate pathway. It catalyzes the conversion of 5,10-methylenetetrahydrofolate to 5-methyltetrahydrofolate [20,21]. Thymidylate synthesis requires 5,10-methylenetetrahydrofolate, and lower levels lead to misincorporation of uracil into DNA, increasing the frequency of chromosome damage. Lower levels of 5-methyltetrahydrofolate may reduce the methylation of homocysteine to methionine, which could lead to hyperhomocysteinemia and DNA hypomethylation. Severe MTHFR enzyme deficiency is the most common inherited disorder of folate metabolism; it leads to hyperhomocysteinemia and homocystinuria that eventually destroy the central nervous system.
Table 4 The VIP variants in Mongols compared with eleven HapMap groups according to the gene superfamily classification
ASW         CEU         CHB         CHD         GIH         JPT         LWK         MEX         MKK         TSI         YRI
rs1045642   rs1229984   rs1229984   rs4148323   rs1540339   rs1229984   rs10264272  rs1695      rs10264272  rs1544410   rs10264272
rs1128503   rs1544410   rs1801272   -           rs1544410   rs1801272   rs1128503   -           rs1045642   rs3807375   rs1045642
rs11568820  rs1801272   rs3782905   -           rs3807375   rs3782905   rs11568820  -           rs1051266   rs701265    rs1128503
rs1540339   rs3782905   rs776746    -           rs4124874   rs975833    rs1540339   -           rs1128503   rs7975232   rs11568820
rs2032582   rs3807375   rs975833    -           rs4148323   -           rs1695      -           rs11568820  -           rs1229984
rs2066702   rs4148323   -           -           -           -           rs1801133   -           rs1540339   -           ...
...         ...         -           -           -           -           ...         -           ...         -           ...
We randomly selected one of the moderately different variants in Mongols, the non-synonymous SNP rs1805124 (A1673G-H558R), which is located in exon 12 of SCN5A [27]. SCN5A encodes the α-subunit of the voltage-dependent sodium channel, an integral membrane protein. It primarily mediates sodium transport in human heart muscle cells [28,29]. SCN5A is responsible for fast depolarization during the upstroke phase of the cardiac action potential, which is why it serves as a molecular target of antiarrhythmic drugs [30]. Many studies have revealed that SCN5A is associated with various cardiac diseases, including long-QT syndrome (LQTS), Brugada syndrome (BrS), progressive cardiac conduction defect, atrial fibrillation (AF), dilated cardiomyopathy, and overlapping syndromes [27][28][29][30][31]. SCN5A-H558R has been shown to have moderate electrophysiological effects that can regulate the phenotypic expression of cardiac conduction. It is associated with the mechanism of atrial fibrillation [30,32] and can modify QTc duration in people with LQTS [33]. Studies relating genotype frequencies in various populations to SCN5A-H558R function have not yet been performed, but Nikulina et al. found that the AG genotype of the H558R (rs1805124) polymorphism of the SCN5A gene is a genetic predictor of idiopathic disorders of atrioventricular and intraventricular conduction [34]. Gene sequencing could therefore support the prevention and early treatment of these diseases.
Among Mongols and other global populations, numerous important genetic variants play critical roles in drug response, and this information should be applied directly to clinical guidelines. For instance, rs1540339 (46489G > A), the most significant locus in our data, is associated with bronchodilator responsiveness [35]. Studies have been performed on the correlation between asthma and rs1540339; however, evaluation of this polymorphism in a clinical setting is not yet routine [36,37].
Fig. 4 Linkage disequilibrium analysis of the VDR in each of the twelve populations. LD is displayed by standard color schemes with bright red for very strong LD (LOD > 2, D′ = 1), pink red (LOD > 2, D′ < 1) and blue (LOD < 2, D′ = 1) for intermediate LD, and white (LOD < 2, D′ < 1) for no LD
Beyond the genetic factor, we also note that long-term survival in different environments affects genetic adaptation. Environmental pressures shape genotype distributions towards specific functions, particularly in pharmacogenetic genes. Studies by Janha et al., Sabbagh et al., and Fuselli et al. demonstrated that the genotype frequencies of CYP2C19, NAT2, and CYP2D6 differ significantly between populations, and that race, subsistence modes, and dietary habits also play a role in the evolutionary trajectory [38][39][40].

Conclusions
Different populations have different genetic frequency distributions, and drug dosage and usage differ between carriers of different genotypes. Identifying genotype distributions and VIP variant frequencies in different populations to determine which medications might be most effective may provide a theoretical foundation for safe drug administration and improved curative effects. In addition, we found that the allele frequency difference was smallest between Mongol and CDX. We also provided preliminary pharmacogenomic data on the Mongol ethnic group, illustrated the differences between Mongols and other populations, and found that the Mongols differ least from the Chinese populations. The sample size of this study is relatively small; further investigation in a larger cohort of Mongols is needed to verify the generalizability of our results and would help us establish more reasonable and effective individualized treatment plans.