The goal of the present study was to comprehensively analyze sequence variation and linkage disequilibrium (LD) in the SLC6A1 gene in anticipation of larger pharmacogenetic studies of tiagabine and other GAT-1 inhibitors. Resequencing 12.4 kb of the SLC6A1 gene, including all 16 exons and the two putative promoter regions, revealed numerous novel genetic variants. Perhaps the most interesting polymorphism identified was a 21 bp VNTR located in the upper promoter sequence of the SLC6A1 gene (Figure 1). We have termed the alleles "SLC6A1 short," which carries one copy of the 21 bp sequence, and "SLC6A1 long," in which the sequence is duplicated (Figure 1). Interestingly, the long allele was common in AAs (39%) but absent in the other populations. A likely explanation for the lack of this allele in non-African populations is genetic drift. However, other explanations, such as natural selection, are also possible. Functional studies of whether the short and long alleles lead to differential expression of the GAT-1 protein are clearly warranted; such studies may also help to elucidate whether the allele frequency discrepancy between AAs and the other populations is due to genetic drift or selection. Previously, a comparable promoter-region VNTR polymorphism, which is known to influence expression of the gene, was described in the serotonin transporter gene. Several studies have shown that this polymorphism partially accounts for differences in the therapeutic response to selective serotonin reuptake inhibitors (SSRIs) [23–26], susceptibility to depression [27, 28] and alcohol dependence. The SLC6A1 short/long polymorphism is a candidate for moderating the response to tiagabine and susceptibility to neuropsychiatric disorders in which GABA dysfunction may play a role, albeit only in AA populations (and any other populations where this variant is present).
Resequencing of SLC6A1 revealed several notable patterns of genetic diversity. Although a limited number of chromosomes was examined, certain trends were found. First, no non-synonymous SNPs were discovered in the 80 chromosomes sequenced, suggesting that the coding sequence of SLC6A1 has been conserved against common amino-acid altering substitutions through active background selection. Consistent with these results, nucleotide diversity was much lower in the SLC6A1 exons than in the intronic regions examined [30–32]. Comparison of the sequence data between populations revealed higher nucleotide diversity in AAs than in other populations (Table 1). In accordance with these data, AAs were the only population in which we found common population-specific SNPs. In exons, however, nucleotide diversity was no higher in the AA population than in the other populations. In the other populations, SNPs observed in only one population were rarer. Although Finns and Hmong are considered to be isolated populations, nucleotide diversity in these two populations was not different from that observed in EAs or Thais. Overall, no major differences in nucleotide diversity were observed among the Hmong, Thai, EA, and Finnish populations. Together, these findings most likely reflect three factors: the older age of the African population relative to the other populations, which has allowed more intronic variation to accumulate in this population; founder effects in non-African populations; and selection pressure conserving the SLC6A1 exonic sequence. The extent of conservation of the GAT-1 amino acid sequence suggests an important role for this protein in normal brain function. The Hmong had a significantly lower number of heterozygous SNPs in comparison to the other populations. One explanation for this finding is that the Hmong subjects may have been distantly related.
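For readers less familiar with the statistic, nucleotide diversity (π) is commonly defined as the average number of pairwise differences per site among the sampled chromosomes. The sketch below illustrates that definition only; it is not the pipeline used in this study. The function name, the input layout (one string of alleles per chromosome, covering the variable sites only) and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes, region_length):
    """Average pairwise differences per site (pi).

    haplotypes    -- list of equal-length strings, one per chromosome,
                     holding the alleles at the variable sites only
                     (hypothetical input layout)
    region_length -- total number of base pairs sequenced
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    if len({len(h) for h in haplotypes}) != 1:
        raise ValueError("all haplotypes must have the same length")

    # Count mismatching sites for every unordered pair of chromosomes.
    total_differences = sum(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    )
    n_pairs = n * (n - 1) // 2
    # Mean differences per pair, scaled by the length of the region.
    return total_differences / n_pairs / region_length


if __name__ == "__main__":
    # Toy data: four chromosomes, three variable sites, 1000 bp sequenced.
    toy = ["AGT", "AGC", "CGT", "AAT"]
    print(nucleotide_diversity(toy, 1000))
```

Because the invariant sites contribute no differences, only the variable sites need to be stored; the full sequenced length enters solely as the denominator.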
We consider differences in the degree and age of population bottlenecks between the Hmong and the other populations to be less likely explanations for the lower heterozygosity observed in the Hmong, because all non-African populations had a low observed frequency of population-specific SNPs and because Hmong nucleotide diversity was not significantly lower than that of the other non-African populations. A caveat of this study is that the sample size per ethnicity was small. Consequently, rare population-specific non-synonymous SNPs would not have been identified. It may be useful to resequence larger samples to identify these kinds of variations, at least in the primary target populations of clinical trials. Laboratory methods established for the present study should facilitate such analyses. Another limitation of the study is that the resequencing effort focused on exons. If deep intronic variation in SLC6A1 contributes to functional variation at the protein level, those variants would have been missed in the present study. A low level of LD in SLC6A1 was observed in all five populations (Figure 2). Consistent with these results, in the EA, Thai, Hmong, and Finnish populations, only two or three haplotype tagging SNPs in the areas of preserved LD were identified. In the AAs, no haplotype tagging SNPs were found among the 16 SNPs genotyped. These results suggest that very dense SNP panels would be required to capture common variation in this gene. Using r²/distance as the index, a longer LD span was observed in the Hmong population than in the other populations (Figure 3). However, higher LD probably would not translate to a significant practical improvement in genotyping efficiency overall, because there were no major differences in the number of haplotype blocks between the Hmong and the Finnish, Thai, and EA populations.
Considering the low level of LD, SLC6A1 may pose special challenges for association studies in both isolated and mixed populations. The common SLC6A1 3-SNP haplotypes were largely the same in the five populations (Table 2), but were not completely overlapping. These results suggest a certain degree of, but not absolute, portability of SNP genotyping sets between the populations. It would be interesting to study larger EA, AA, Hmong, Thai and Finnish population samples to further refine the structure and frequencies of the common overlapping and population-specific haplotypes. In addition, it will be interesting to study whether haplotype and SNP profile characteristics, such as the absence of common non-synonymous substitutions, extend to patient populations suffering from various neuropsychiatric disorders, which were not studied here. The present study primarily focused on non-clinical samples, and therefore no data were available to assess whether disease-associated variants are present in human populations.
Intrigued by these findings, we examined whether recombination hotspots could explain the low levels of LD in SLC6A1. Two hotspots were identified using PHASE. The first is located in the area of exon 1 and intron 1; the second is located in the area demarcated by exons 8 and 16. As expected, within the hotspots, D' fell off rapidly. For example, in the Finns, in the area of the distal hotspot, D' was only 0.184 between markers 12 and 13, which are spaced 107 bp apart, and 0.044 between markers 13 and 14, which are spaced 489 bp apart. Areas of high recombination, such as those seen in SLC6A1, potentially limit large-scale association studies, as it would be exceptionally difficult to find risk alleles relying on linkage disequilibrium if the alleles were located inside a recombination hotspot.
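To make the two LD indices used above concrete, the following is a minimal sketch of how D' and r² can be computed for a pair of biallelic markers once phased haplotypes are available. It uses the standard textbook definitions and is not the software used in this study; the function name, the 0/1 allele coding and the toy data are assumptions made for illustration.

```python
def pairwise_ld(haplotypes):
    """Return (D', r^2) for two biallelic markers.

    haplotypes -- list of (a, b) tuples, one per phased chromosome,
                  with each allele coded 0 or 1 (hypothetical coding)
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")

    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at marker A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at marker B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    if p_a in (0.0, 1.0) or p_b in (0.0, 1.0):
        raise ValueError("both markers must be polymorphic")

    # Raw disequilibrium coefficient.
    d = p_ab - p_a * p_b

    # D' rescales |D| by its maximum attainable value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max

    # r^2 is the squared correlation between the two markers.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


if __name__ == "__main__":
    # Toy data: ten phased chromosomes typed at two markers.
    toy = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0)] + [(0, 1)]
    d_prime, r2 = pairwise_ld(toy)
    print(f"D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

The two measures can diverge: D' equals 1 whenever at least one of the four possible haplotypes is absent, whereas r² reaches 1 only when the two markers also have matching allele frequencies, which is why r² is the more direct guide to how well one SNP can tag another.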