- Methodology article
- Open Access
Fine-scale detection of population-specific linkage disequilibrium using haplotype entropy in the human genome
© Mizuno et al; licensee BioMed Central Ltd. 2010
- Received: 11 October 2009
- Accepted: 23 April 2010
- Published: 23 April 2010
The creation of a coherent genomic map of recent selection is one of the greatest challenges towards a better understanding of human evolution and the identification of functional genetic variants. Several methods have been proposed to detect linkage disequilibrium (LD), which is indicative of natural selection, from genome-wide profiles of common genetic variations, but they are designed for large regions.
To find population-specific LD within small regions, we have devised an entropy-based method that utilizes differences in haplotype frequency between populations. The method has the advantages of incorporating multilocus association, conciliation with low allele frequencies, and independence from allele polarity, which are ideal for short haplotype analysis. The comparison of HapMap SNP data from African and Caucasian populations with a median resolution size of ~23 kb yielded novel candidates as well as known selection targets. Enrichment analysis of the resulting genes showed associations with diverse diseases such as cardiovascular, immunological, neurological, and skeletal and muscular diseases. A possible scenario for a selective force is discussed. In addition, we have developed a web interface (ENIGMA, available at http://gibk21.bse.kyutech.ac.jp/ENIGMA/index.html), which allows researchers to query their regions of interest for population-specific LD.
The haplotype entropy method is powerful for detecting population-specific LD embedded in short regions and should contribute to further studies aiming to decipher the evolutionary histories of modern humans.
- Linkage Disequilibrium
- Window Size
- Large Window Size
- Recent Selection
- Extend Haplotype Homozygosity
Modern humans emerged in Africa approximately 200,000 years ago and over the last 100,000 years dispersed around the world, adapting to different environments. The evolutionary histories during this period are reflected in the human genome by "selective sweeps" wherein beneficial alleles keep the genetic patterns of the surrounding sites [1, 2]. The recent availability of high density maps of single nucleotide polymorphisms (SNPs) has provided us with a unique opportunity to uncover such selection traits.
Classically, statistical measurements such as r2 and D' that test linkage disequilibrium (LD) at the resolution of two SNPs have been used to detect regions that have undergone recent selection. However, their pairwise fashion cannot capture multilocus associations and so their testing power is limited. Newly developed techniques, which are based on the concept of extended haplotype homozygosity (EHH) (e.g. LRH, iHS, XP-EHH) [4–6] and the composite likelihood ratio (CLR), incorporate multilocus association and show higher power than conventional statistics. Nevertheless, those methods are weak in handling low SNP counts from minor alleles and/or require allele polarity (ancestral/derived), making their scores less reliable. Further, they need a relatively large window size to distinguish signal from noise, and so the human genome has not been investigated at resolution below 100 kb. Considering that recombination hotspots are estimated to exist with a frequency of at least one every 60 kb [8, 9] and erode LD, genomic scans of short/intermediate resolution would give more detailed insight into recent human evolution.
Lately, entropy of haplotype frequency has been proposed as a general measure to quantify the strength of LD and thus to uncover evolutionary forces [3, 11]. By definition, the haplotype entropy incorporates multilocus association, is proficient at handling low allele frequencies and does not rely on allele polarity. These features enable us to fully utilize nucleotide information and make short haplotype analysis feasible. In this study, we report a fine-scale genomic scan for population-specific LD, which is indicative of natural selection, using haplotype entropy.
Haplotype entropy for detecting population-specific LD
Entropy is an established measure of diversity or information content. Here we use entropy to quantify the genetic diversity of given haplotypes as introduced by Nothnagel et al. and Atwal et al. The analysis begins by counting the number of each haplotype within the genomic region of interest. Using these haplotype frequencies, we compute the entropy of the region (see Methods). Low entropy is associated with low genetic diversity, where one or a few haplotypes are over-represented at high frequency in the region. On the other hand, high entropy is indicative of high genetic diversity, where various haplotypes are equally represented at small frequencies in the region. Under neutrality, stochastic processes such as mutation, recombination and genetic drift perturb genetic variation of the genome. Meanwhile, advantageous alleles keep the genetic pattern of linked sites by "selective sweep" [1, 2], which decreases observations of recombination and increases the frequency of certain haplotypes, leading to low haplotype entropy. This suggests that the regions with entropy distinct from what is expected under neutral evolution are candidate targets for natural selection.
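As a minimal sketch of the counting and entropy steps described above, the following Python snippet computes the entropy of a toy set of phased haplotypes. The function name haplotype_entropy and the toy haplotype strings are illustrative assumptions and are not taken from the study.

```python
from collections import Counter
from math import log2


def haplotype_entropy(haplotypes):
    """Shannon entropy (in bits) of the haplotype frequency distribution."""
    counts = Counter(haplotypes)          # number of copies of each haplotype
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Toy region typed at 4 SNPs in 8 chromosomes (alleles coded 0/1).
swept = ["0110"] * 7 + ["1001"]                 # one over-represented haplotype
diverse = ["0000", "0001", "0010", "0100",
           "1000", "1100", "0110", "0011"]      # every haplotype is distinct

print(round(haplotype_entropy(swept), 3))   # low entropy: 0.544
print(haplotype_entropy(diverse))           # maximum for 8 chromosomes: 3.0
```

The over-represented haplotype in the first sample pulls the entropy towards 0, whereas the second sample reaches the maximum of log2(8) = 3 bits.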
The original haplotype entropy method resorted to a theoretical formula or simulation to estimate the haplotype entropy expected under neutral evolution. However, these approaches require a vast amount of calculation and reasonable parameters for the local recombination rate at each SNP, which limits the genome-wide application of the haplotype entropy method. Alternatively, we can compare two populations, where the entropy from one population provides the reference of neutral evolution for the other. This comparison is also beneficial because it virtually cancels the effect of the physical distance between SNPs, which haplotype entropy does not take into account. In this approach, population-specific LDs are identified by extreme entropy differences in certain genomic regions between populations. This modification maintains the key features that are ideal for short haplotype analysis: the incorporation of multilocus association, conciliation with low allele frequencies, and independence from allele polarity.
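The two-population comparison can be sketched as a sliding-window scan. This is an illustrative outline only: the function names, the toy chromosomes and the left-aligned windows are assumptions, and the sign convention (first population minus second) follows the definition of the entropy difference given in Methods.

```python
from collections import Counter
from math import log2


def entropy(haplotypes):
    """Shannon entropy (bits) of a list of haplotype strings."""
    counts = Counter(haplotypes)
    n = sum(counts.values())
    return -sum((c / n) * log2(c / n) for c in counts.values())


def delta_s_scan(pop1, pop2, window):
    """Entropy difference S(pop1) - S(pop2) for each window of `window` SNPs.

    pop1 and pop2 are lists of phased chromosomes (equal-length allele
    strings) typed at the same SNPs.
    """
    n_snps = len(pop1[0])
    scores = []
    for start in range(n_snps - window + 1):
        end = start + window
        s1 = entropy([chrom[start:end] for chrom in pop1])
        s2 = entropy([chrom[start:end] for chrom in pop2])
        scores.append(s1 - s2)
    return scores


# Toy data: 4 chromosomes per population, 5 SNPs, windows of 3 SNPs.
pop1 = ["00000", "00000", "00000", "00011"]   # nearly one haplotype
pop2 = ["00000", "01010", "10101", "11111"]   # all haplotypes distinct
print(delta_s_scan(pop1, pop2, window=3))
```

Strongly negative values mark windows where the first population carries far less haplotype diversity than the second, which is the pattern expected for LD specific to the first population.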
Next, for comparison, we scanned integrated EHH (I), a representative measurement for multilocus LD that is used to derive XP-EHH, for each SNP. I, which is designed to give high scores when LD is strong, was low at high-rate recombination sites and high at low-rate recombination sites (Figure 1D-F). As previous studies have noted, I presented highly skewed scores when a core SNP had a low minor allele frequency (MAF) (Figure 1D-F, gray dots) [5, 6, 12]. Note that, even when we ignored low MAF SNPs, I was still noisier than S and required a large window size (e.g. 51 SNPs) to secure reliable discrimination power. These observations were robust even when examined at a 100-fold higher mutation rate (Additional file 1). These results suggested the ability of S to better quantify the strength of LD with high resolution. We also note that the feature of conciliation with low allele frequencies (Figure 1A-C, gray dots) would keep S robust against stochastic perturbagens such as recent mutations and yin-yang haplotypes, and against artificial noise such as genotyping errors and haplotype inference errors.
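For readers unfamiliar with EHH, the sketch below computes an EHH decay curve using the standard definition (the proportion of pairs of core-allele carriers that remain identical over the extended interval) and integrates it with unit spacing between SNPs. It is a simplified illustration and not the exact procedure used for I in this study; the function names, the rightward-only extension and the toy haplotypes are assumptions.

```python
from collections import Counter
from math import comb


def ehh_decay(haplotypes, core, allele, max_steps):
    """EHH at 0..max_steps SNPs to the right of the core SNP.

    Only chromosomes carrying `allele` at index `core` are considered.
    max_steps should not run past the last SNP of the chromosomes.
    """
    carriers = [h for h in haplotypes if h[core] == allele]
    pairs = comb(len(carriers), 2)
    if pairs == 0:
        return []                      # fewer than two carriers: EHH undefined
    curve = []
    for d in range(max_steps + 1):
        groups = Counter(h[core:core + d + 1] for h in carriers)
        curve.append(sum(comb(c, 2) for c in groups.values()) / pairs)
    return curve


def integrated_ehh(curve):
    """Trapezoidal area under the EHH curve, assuming unit SNP spacing."""
    return sum((a + b) / 2 for a, b in zip(curve, curve[1:]))


haps = ["1000", "1000", "1001", "1011", "0110"]
curve = ehh_decay(haps, core=0, allele="1", max_steps=3)
print(curve)                   # EHH falls as haplotypes diverge from the core
print(integrated_ehh(curve))
```

With few carriers of the core allele the number of pairs is small, so a single divergent chromosome changes the curve sharply, which is consistent with the skewed scores noted above for low-MAF core SNPs.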
Analysis of the HapMap dataset
Comparison with previously reported signatures
[Table: overlap of previously reported signatures with the haplotype entropy method; surviving entries: ~800 kb-3.5 Mb]
Characteristics of population-specific LD signatures
Functions and diseases related to both CEU and YRI signatures:
- Coronary artery disease
- Non-insulin-dependent diabetes mellitus
- Endocrine system disorder
- Bipolar affective disease
- Digestive system disorder
- Skeletal and muscular disorder
- Progressive motor neuropathy
- Shape change of epithelial cells
- Shape change of dermal cells
- Insulin-dependent diabetes mellitus
- Amyotrophic lateral sclerosis
Vitamin D hypothesis
An interesting hypothesis has been proposed by McGrath, claiming that low prenatal vitamin D increases the risk of a wide range of diseases such as multiple sclerosis, diabetes, schizophrenia, prostate cancer, breast cancer and colorectal cancer because of its versatile function in normal development. Additional circumstantial evidence encouraged us to integrate this "vitamin D hypothesis" and population-specific selection. First, the assumed prime function of pigmentation, one of the most convincing recent selection targets differentiating CEU and YRI populations, is to control vitamin D synthesis from ultraviolet exposure [17, 29]. Second, in addition to McGrath's list of diseases, animal model studies and epidemiological surveys have further linked vitamin D insufficiency to cardiovascular disease and inflammatory bowel disease [17, 30, 31], as well as abnormal brain development. These diseases were dominant in our enrichment analysis, with the exception of cancer, which may have less impact on natural selection due to its relatively late onset in life [17, 32, 33].
To assess this possibility, we examined the entropy of VDR, the gene encoding the vitamin D receptor. Although it did not reach the 0.1% criterion, the difference in entropy of the 3' region of VDR between CEU and YRI was relatively high (Additional file 10, p = 0.00916). The corresponding haplotype contained the loci BsmI, Tru9I, ApaI and TaqI (Additional file 10), whose polymorphisms have been shown to associate with hypertension, coronary artery disease, Crohn's disease, diabetes, multiple sclerosis, Alzheimer's disease, cognition, and depression [31, 34–36]. RXRA, a heterodimeric partner of VDR, also showed borderline significance (Additional file 11, p = 0.00139). These results suggest that vitamin D might have been involved in the alteration of risk of disease and abnormal development and consequently in the genetic adaptation of modern humans, but validation in further studies is necessary.
In this study, we presented a fine-scale genomic scan using haplotype entropy to detect population-specific LD, which is indicative of natural selection, in the human genome. The yielded signatures included a number of previously detected genes, and overlaps with previously reported signatures were significant. On the other hand, there were a considerable number of novel predictions. These inconsistencies among the methods can be attributed to several factors.
First, the haplotype entropy method is effective at detecting signals embedded in short regions with a window size such as 23 kb (21 SNPs), whereas previous methods had to contend with a window size of 100 kb or larger [5, 6, 12, 16] to yield reliable signals. This high resolution allowed us to detect novel candidate regions such as MLPH and ATRNL1 that were undetected using previous methods. However, the haplotype entropy method has the issue of entropy saturation. Thus, under the constraint of a fixed window size, it cannot detect regions such as LCT, which may require a larger window size to show a sufficient signal.
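A short worked bound may help to make entropy saturation concrete. It uses only the sample size stated in Methods (n = 120 chromosomes per population); the symbol w, the number of biallelic SNPs in a window, is introduced here for illustration.

```latex
S_{\max} = \log_2 n = \log_2 120 \approx 6.91 \ \text{bits}, \qquad
\#\{\text{possible haplotypes}\} = 2^{w}, \qquad
2^{w} \ge 120 \iff w \ge 7 .
```

Saturation therefore becomes possible in principle once a window spans seven or more SNPs. If every sampled chromosome carries a distinct haplotype in both populations, both entropies equal S_max and their difference is 0, so the window carries no signal regardless of the underlying LD.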
The second factor affecting detection is the use of a reference population. The practical limitations regarding computational resources and uncertainty in parameters drove us to compare two populations, where one population provides the reference of neutral evolution for the other and so regions showing highly different haplotype entropy between the populations can be deemed to be population-specific LD. Although this would allow us to detect "fixed" selection signatures in one population which cannot be found with intrapopulation methods [4, 5], it still would cause false negatives when the regions have been selected in parallel in both populations. This problem can be tackled by considering more than one population, which would provide a better neutral control and improve the ability of the method to uncover unusual LD regions in a specific population.
A third factor leading to non-overlapping signatures could be statistical error. Teshima et al. have shown that the empirical approach is reasonable but can cause a large number of false negatives. Although we have focused on the genomic regions showing extreme entropy difference (top 0.1%), we cannot exclude the possibility that other signatures remain below the threshold. Since EHH derivatives and CLR are also based on the empirical approach, some inconsistencies may have been due to this limitation. Also, false positives need to be considered. Our analysis of simulation and HapMap datasets showed I was more variable than S (Figure 1 and Additional files 1 and 4). Thus, it is possible that earlier methods detected some false positives that the entropy method does not. At the same time, the haplotype entropy method may have caused some false positives absent in previous signatures because it relied on much less information due to the smaller window size.
Therefore, although powerful, the haplotype entropy method is not an ultimate solution. Rather, it would be most effective as a complement to other methods. Its unique detection power can fill the gap between pairwise methods and newer techniques such as EHH and CLR. It should also help in cross-validating candidates of natural selection from those statistics. We provide a web interface (ENIGMA at http://gibk21.bse.kyutech.ac.jp/ENIGMA/index.html) so that researchers can query their regions of interest in our fine-scale map of Caucasian and African population-specific LDs. Taken together, our work should contribute to further studies towards understanding human evolution as inscribed in the human genome.
The HapMap2 release #24, a dataset of 1,969,416 phased SNPs, was downloaded from the project web site [14, 37]. In this study, two populations, each consisting of 120 chromosomes from 60 donors, were analyzed: Yoruba in Ibadan, Nigeria (YRI) and a group of residents of Utah with European ancestry (CEU). A data table from NCBI Build 36 was also obtained from the NCBI FTP site and transcribed regions were considered for mapping the SNPs to genes.
Genomic scan using haplotype entropy
The degree of genetic diversity was measured using the entropy (S) for haplotype frequency, S = -∑ p(i) log2 p(i), where i is an index of the haplotypes and p(i) is the frequency of haplotype i in the population. S attains its maximum, log2(n), when all n sampled haplotypes for the region differ from each other. S is 0 when all haplotypes are identical. The entropy difference ΔS was defined as ΔS = S_pop1 - S_pop2, where S_pop1 and S_pop2 are the haplotype entropies for the two populations. For the genomic scan on the HapMap dataset, a window size of 21 SNPs was chosen for haplotype composition because it fulfilled three requirements: no entropy saturation, sufficient entropy difference and high resolution (see Results). SNPs on sex chromosomes and haplotypes for long segments (> 200 kb) were excluded. For each SNP, the ΔS between CEU and YRI was calculated and its empirical significance in the genome-wide distribution was determined. No correction for multiple testing was applied. A genomic scan for integrated EHH (I) was also done for chromosome 1 using the same window size of 21 SNPs. For direct comparison with S, segment lengths of the haplotypes were not considered. iHS scores, the other EHH-based measurement, were downloaded from the Haplotter database [5, 39].
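One way to read "empirical significance in the genome-wide distribution" is as the fraction of all scanned positions whose score is at least as extreme as the one observed. The sketch below follows that reading; the function name, the one-tailed treatment and the toy scores are assumptions, and the toy cut-off of 0.1 stands in for the 0.1% criterion used in the study.

```python
from bisect import bisect_left, bisect_right


def empirical_p(scores, tail="lower"):
    """Empirical p-value of each score within the distribution of all scores.

    tail="lower": fraction of scores <= x (pop1 entropy unusually low).
    tail="upper": fraction of scores >= x (pop2 entropy unusually low).
    """
    ordered = sorted(scores)
    n = len(ordered)
    if tail == "lower":
        return [bisect_right(ordered, x) / n for x in scores]
    if tail == "upper":
        return [(n - bisect_left(ordered, x)) / n for x in scores]
    raise ValueError("tail must be 'lower' or 'upper'")


# Hypothetical per-SNP entropy differences (S_pop1 - S_pop2).
delta_s = [-2.0, -1.0, -0.5, 0.0, 0.1, 0.2, 0.3, 0.5, 1.0, 1.5]
threshold = 0.1   # toy cut-off; the study reports the top 0.1% (0.001)

lower = empirical_p(delta_s, "lower")
upper = empirical_p(delta_s, "upper")
candidates = [i for i, p in enumerate(lower) if p <= threshold]
print(candidates)   # indices of positions in the extreme lower tail
```

Because the ranking is purely empirical and uncorrected for multiple testing, the resulting values order candidates rather than establish formal significance.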
We considered model chromosomes composed of 500 loci, where the jth (1 ≤ j ≤ 500) locus has recombination rate r = exp(- j/25) per haploid and generation, shaping a continuous gradient from one end (r ≈ 1) to the other end (r ≈ 2.1E-9). For each locus, initial allele frequencies k and 1 - k (0 ≤ k ≤ 1) for the two alleles were assigned at random. Using the GenomePop software, the evolutionary process was simulated for 5,000 generations with a population size of 10,000 and a mutation rate per locus of 2.0E-9. Then, 120 chromosomes for 60 individuals were sampled from one run of the simulation and scanned for S and I using window sizes of 5, 21 and 51 loci. The simulation was repeated 100 times.
CEU and YRI signatures were queried against IPA version 7.5. "Function and Disease" libraries were overlaid on each signature and enrichment scores were calculated. Categories with p < 1E-5 significance were listed.
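The recombination gradient of the model chromosome can be tabulated directly from the formula above. This sketch only evaluates r = exp(- j/25) for each locus and does not reproduce the GenomePop simulation itself; the names N_LOCI and recombination_rate are illustrative.

```python
from math import exp

N_LOCI = 500   # loci per model chromosome, as stated in the text


def recombination_rate(j):
    """Recombination rate of the jth locus (1-based) of the model chromosome."""
    return exp(-j / 25)


rates = [recombination_rate(j) for j in range(1, N_LOCI + 1)]

print(f"{rates[0]:.3f}")    # first locus: 0.961
print(f"{rates[-1]:.2e}")   # last locus: 2.06e-09
```

The rate falls by a factor of e every 25 loci, so windows of 5, 21 and 51 loci each span a different stretch of this gradient.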
We thank S. Christen of the Institute for Advanced Study and F. Ford of Chugai Pharmaceuticals for their helpful discussions and checking of the manuscript.
- Tishkoff SA, Verrelli BC: Patterns of human genetic diversity: implications for human evolutionary history and disease. Annu Rev Genomics Hum Genet. 2003, 4: 293-340. 10.1146/annurev.genom.4.070802.110226.
- Sabeti PC, Schaffner SF, Fry B, Lohmueller J, Varilly P, Shamovsky O, Palma A, Mikkelsen TS, Altshuler D, Lander ES: Positive natural selection in the human lineage. Science. 2006, 312: 1614-20. 10.1126/science.1124309.
- Nothnagel M, Fürst R, Rohde K: Entropy as a measure for linkage disequilibrium over multilocus haplotype blocks. Hum Hered. 2002, 54: 186-98. 10.1159/000070664.
- Sabeti PC, Reich DE, Higgins JM, Levine HZ, Richter DJ, Schaffner SF, Gabriel SB, Platko JV, Patterson NJ, McDonald GJ, Ackerman HC, Campbell SJ, Altshuler D, Cooper R, Kwiatkowski D, Ward R, Lander ES: Detecting recent positive selection in the human genome from haplotype structure. Nature. 2002, 419: 832-7. 10.1038/nature01140.
- Voight BF, Kudaravalli S, Wen X, Pritchard JK: A map of recent positive selection in the human genome. PLoS Biol. 2006, 4: e72. 10.1371/journal.pbio.0040072.
- Sabeti PC, Varilly P, Fry B, Lohmueller J, Hostetter E, Cotsapas C, Xie X, Byrne EH, McCarroll SA, Gaudet R, Schaffner SF, Lander ES, International HapMap Consortium: Genome-wide detection and characterization of positive selection in human populations. Nature. 2007, 449: 913-8. 10.1038/nature06250.
- Nielsen R, Williamson S, Kim Y, Hubisz MJ, Clark AG, Bustamante C: Genomic scans for selective sweeps using SNP data. Genome Res. 2005, 15: 1566-75. 10.1101/gr.4252305.
- Reich DE, Cargill M, Bolk S, Ireland J, Sabeti PC, Richter DJ, Lavery T, Kouyoumjian R, Farhadian SF, Ward R, Lander ES: Linkage disequilibrium in the human genome. Nature. 2001, 411: 199-204. 10.1038/35075590.
- Crawford DC, Bhangale T, Li N, Hellenthal G, Rieder MJ, Nickerson DA, Stephens M: Evidence for substantial fine-scale variation in recombination rates across the human genome. Nat Genet. 2004, 36: 700-6. 10.1038/ng1376.
- Nakajima T, Wooding S, Sakagami T, Emi M, Tokunaga K, Tamiya G, Ishigami T, Umemura S, Munkhbat B, Jin F, Guan-Jun J, Hayasaka I, Ishida T, Saitou N, Pavelka K, Lalouel JM, Jorde LB, Inoue I: Natural selection and population history in the human angiotensinogen gene (AGT): 736 complete AGT sequences in chromosomes from around the world. Am J Hum Genet. 2004, 74: 898-916. 10.1086/420793.
- Atwal GS, Bond GL, Metsuyanim S, Papa M, Friedman E, Distelman-Menachem T, Ben Asher E, Lancet D, Ross DA, Sninsky J, White TJ, Levine AJ, Yarden R: Haplotype structure and selection of the MDM2 oncogene in humans. Proc Natl Acad Sci. 2007, 104: 4524-9. 10.1073/pnas.0610998104.
- Pickrell JK, Coop G, Novembre J, Kudaravalli S, Li JZ, Absher D, Srinivasan BS, Barsh GS, Myers RM, Feldman MW, Pritchard JK: Signals of recent positive selection in a worldwide sample of human populations. Genome Res. 2009, 19: 826-37. 10.1101/gr.087577.108.
- Zhang J, Rowe WL, Clark AG, Buetow KH: Genomewide distribution of high-frequency, completely mismatching SNP haplotype pairs observed to be common across human populations. Am J Hum Genet. 2003, 73: 1073-81. 10.1086/379154.
- International HapMap Consortium: A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007, 449: 851-61. 10.1038/nature06258.
- Teshima KM, Coop G, Przeworski M: How reliable are empirical genomic scans for selective sweeps?. Genome Res. 2006, 16: 702-12. 10.1101/gr.5105206.
- Williamson SH, Hubisz MJ, Clark AG, Payseur BA, Bustamante CD, Nielsen R: Localizing recent adaptive evolution in the human genome. PLoS Genet. 2007, 3: e90. 10.1371/journal.pgen.0030090.
- Parra EJ: Human pigmentation variation: evolution, genetic basis, and implications for public health. Am J Phys Anthropol. 2007, 45: 85-105. 10.1002/ajpa.20727.
- Matesic LE, Yip R, Reuss AE, Swing DA, O'Sullivan TN, Fletcher CF, Copeland NG, Jenkins NA: Mutations in Mlph, encoding a member of the Rab effector family, cause the melanosome transport defects observed in leaden mice. Proc Natl Acad Sci. 2001, 98: 10238-43. 10.1073/pnas.181336698.
- Ménasché G, Ho CH, Sanal O, Feldmann J, Tezcan I, Ersoy F, Houdusse A, Fischer A, de Saint Basile G: Griscelli syndrome restricted to hypopigmentation results from a melanophilin defect (GS3) or a MYO5A F-exon deletion (GS1). J Clin Invest. 2003, 112: 450-6.
- Walker WP, Aradhya S, Hu CL, Shen S, Zhang W, Azarani A, Lu X, Barsh GS, Gunn TM: Genetic analysis of attractin homologs. Genesis. 2007, 45: 744-56. 10.1002/dvg.20351.
- Ingenuity Pathway Analysis database. [http://www.ingenuity.com]
- Rockman MV, Hahn MW, Soranzo N, Loisel DA, Goldstein DB, Wray GA: Positive selection on MMP3 regulation has shaped heart disease risk. Curr Biol. 2004, 14: 1531-9. 10.1016/j.cub.2004.08.051.
- Costas J, Carrera N, Domínguez E, Vilella E, Martorell L, Valero J, Gutiérrez-Zotes A, Labad A, Carracedo A: A common haplotype of DRD3 affected by recent positive selection is associated with protection from schizophrenia. Hum Genet. 2009, 124: 607-13. 10.1007/s00439-008-0584-7.
- Simmons JD, Mullighan C, Welsh KI, Jewell DP: Vitamin D receptor gene polymorphism: association with Crohn's disease susceptibility. Gut. 2000, 47: 211-4. 10.1136/gut.47.2.211.
- Picornell Y, Mei L, Taylor K, Yang H, Targan SR, Rotter JI: TNFSF15 is an ethnic-specific IBD gene. Inflamm Bowel Dis. 2007, 13: 1333-8. 10.1002/ibd.20223.
- Myles S, Davison D, Barrett J, Stoneking M, Timpson N: Worldwide population differentiation at disease-associated SNPs. BMC Med Genomics. 2008, 1: 22. 10.1186/1755-8794-1-22.
- Myles S, Hradetzky E, Engelken J, Lao O, Nürnberg P, Trent RJ, Wang X, Kayser M, Stoneking M: Identification of a candidate genetic variant for the high prevalence of type II diabetes in Polynesians. Eur J Hum Genet. 2007, 15: 584-9. 10.1038/sj.ejhg.5201793.
- McGrath J: Does 'imprinting' with low prenatal vitamin D contribute to the risk of various adult disorders?. Med Hypotheses. 2001, 56: 367-71. 10.1054/mehy.2000.1226.
- Jablonski NG, Chaplin G: The evolution of human skin coloration. J Hum Evol. 2000, 39: 57-106. 10.1006/jhev.2000.0403.
- Reis AF, Hauache OM, Velho G: Vitamin D endocrine system and the genetic susceptibility to diabetes, obesity and vascular disease. Diabetes Metab. 2005, 31: 318-25. 10.1016/S1262-3636(07)70200-8.
- Bouillon R, Carmeliet G, Verlinden L, van Etten E, Verstuyf A, Luderer HF, Lieben L, Mathieu C, Demay M: Vitamin D and human health: lessons from vitamin D receptor null mice. Endocr Rev. 2008, 29: 726-76. 10.1210/er.2008-0004.
- Niell BL, Long JC, Rennert G, Gruber SB: Genetic anthropology of the colorectal cancer-susceptibility allele APC I1307K: evidence of genetic drift within the Ashkenazim. Am J Hum Genet. 2003, 73: 1250-60. 10.1086/379926.
- Ribas G, Milne RL, Gonzalez-Neira A, Benítez J: Haplotype patterns in cancer-related genes with long-range linkage disequilibrium: no evidence of association with breast cancer or positive selection. Eur J Hum Genet. 2008, 16: 252-60. 10.1038/sj.ejhg.5201953.
- Valdivielso JM, Fernandez E: Vitamin D receptor polymorphisms and diseases. Clin Chim Acta. 2006, 371: 1-12. 10.1016/j.cca.2006.02.016.
- Gezen-Ak D, Dursun E, Ertan T, Hanagasi H, Gürvit H, Emre M, Eker E, Oztürk M, Engin F, Yilmazer S: Association between vitamin D receptor gene polymorphism and Alzheimer's disease. Tohoku J Exp Med. 2007, 212: 275-82. 10.1620/tjem.212.275.
- Kuningas M, Mooijaart SP, Jolles J, Slagboom PE, Westendorp RG, van Heemst D: VDR gene variants associate with cognitive function and depressive symptoms in old age. Neurobiol Aging. 2009, 30: 466-73. 10.1016/j.neurobiolaging.2007.07.001.
- International HapMap project web site. [http://www.hapmap.org/]
- NCBI FTP site. [http://www.ncbi.nlm.nih.gov/Ftp/]
- Haplotter database. [http://hg-wen.uchicago.edu/selection/haplotter.htm]
- Carvajal-Rodríguez A: GENOMEPOP: a program to simulate genomes in populations. BMC Bioinformatics. 2008, 9: 223. 10.1186/1471-2105-9-223.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.