Open Access

Classical sickle beta-globin haplotypes exhibit a high degree of long-range haplotype similarity in African and Afro-Caribbean populations

  • Neil Hanchard1, 5Email author,
  • Abier Elzein2,
  • Clare Trafford2,
  • Kirk Rockett2,
  • Margaret Pinder3,
  • Muminatou Jallow3,
  • Rosalind Harding4,
  • Dominic Kwiatkowski2 and
  • Colin McKenzie1
BMC Genetics20078:52

DOI: 10.1186/1471-2156-8-52

Received: 12 March 2007

Accepted: 10 August 2007

Published: 10 August 2007



The sickle (βs) mutation in the beta-globin gene (HBB) occurs on five "classical" βs haplotype backgrounds in ethnic groups of African ancestry. Strong selection in favour of the βs allele – a consequence of protection from severe malarial infection afforded by heterozygotes – has been associated with a high degree of extended haplotype similarity. The relationship between classical βs haplotypes and long-range haplotype similarity may have both anthropological and clinical implications, but to date has not been explored. Here we evaluate the haplotype similarity of classical βs haplotypes over 400 kb in population samples from Jamaica, The Gambia, and among the Yoruba of Nigeria (Hapmap YRI).


The most common βs sub-haplotype among Jamaicans and the Yoruba was the Benin haplotype, while in The Gambia the Senegal haplotype was observed most commonly. Both subtypes exhibited a high degree of long-range haplotype similarity extending across approximately 400 kb in all three populations. This long-range similarity was significantly greater than that seen for other haplotypes sampled in these populations (P < 0.001), and was independent of marker choice and marker density. Among the Yoruba, Benin haplotypes were highly conserved, with very strong linkage disequilibrium (LD) extending a megabase across the βs mutation.


Two different classical βs haplotypes, sampled from different populations, exhibit comparable and extensive long-range haplotype similarity and strong LD. This LD extends across the adjacent recombination hotspot, and is discernable at distances in excess of 400 kb. Although the multi-centric geographic distribution of βs haplotypes indicates strong subdivision among early Holocene sub-Saharan populations, we find no evidence that selective pressures imposed by falciparum malaria varied in intensity or timing between these subpopulations. Our observations also suggest that cis-acting loci, which may influence outcomes in sickle cell disease, could lie considerable distances away from β-globin.


The sickle mutation (βs) of the beta-globin locus (HBB), which in the homozygous state gives rise to sickle cell anaemia, is associated with five "classical" haplotypes, each with different geographic distributions across sub-Saharan Africa, Arabia and India [1, 2]. These βs haplotypes were first identified by the presence or absence of restriction fragment length polymorphisms (RFLPs) in the 70 kilobases (kb) surrounding HBB [3], and subsequently found to be characterized by strong linkage disequilibrium (LD) across a 'hot spot' of recombination just 5' of HBB [4, 5] (Figure 1). The multi-centric geographical distribution of classical βs haplotypes reflects the recency of strong selection pressures imposed by falciparum malaria on local human populations [6].
Figure 1

Classical βS haplotypes. The figure illustrates restriction fragment length polymorphisms in a 70 kb region around HBB. Five other globin synthesis genes are shown along with the approximate positions of four RFLP sites used to designate the five classical βS haplotypes.

Classical βs haplotypes are named according to their putative geographical origins – Benin, Bantu (Central African), Cameroon, Senegal and Arab. In general, within ethnic groups in which the βs allele has a high frequency, one particular βs haplotype usually predominates; for instance, the Senegal βs haplotype is the most commonly observed βs haplotype in Senegal, although the Benin haplotype is also present [4]. The Benin haplotype is very common (~92%) among the Yoruba in Nigeria [7], and also is common in Jamaica (~71%) [8, 9], although Bantu and Senegal types also occur in the Jamaican population [4]. The geographical distribution of classical βs haplotypes has been attributed to independent origins of the βs mutation [1, 2] but the role of gene conversion in the transfer of the original mutation(s) between haplotypes has also been discussed [3, 4].

Possession of the βs allele in the heterozygous form (HbAS) confers a very strong protection against severe malarial infection compared to mutant (HbSS) and wild-type (HbAA) homozygotes [1013]. This selective advantage results in a rapid increase in the frequency of the βS allele over several generations in regions of high malarial endemicity. This rise in frequency occurs at a faster rate than meiotic recombination can break down the haplotype background on which the allele first arose. Therefore, like other recently selected alleles such as those found in the glucose-6-phosphate dehydrogenase or lactase genes [14, 15], the βs allele would be expected to maintain its ancestral haplotypic relations over a relatively long genetic distance. We previously used the βs allele as a practical example of recent selection in our description of HAPLOSIMILARITY, a method of evaluating long-range haplotypes [16]. Using 20 high-frequency SNP markers spaced across 414 kb in a sample of Gambian cord bloods we demonstrated a high degree (approximately 60%) of similarity among haplotypes associated with the βs allele [16]. This was considerably higher than haplotype similarity scores for surrounding alleles and also higher than neutral expectations derived from coalescent simulations.

To date, the relationship between classical βs haplotypes and the long-range similarity expected of haplotypes associated with the βs allele has not been defined. A better understanding of classical βs haplotypes is of particular relevance to anthropological and population genetic studies [5, 17] and may also be useful for understanding the varying clinical outcomes seen among individuals with sickle cell disease [18, 19]. Therefore, we chose to extend the results of our previous study by investigating βs haplotype similarity in two additional populations; specifically asking whether classical βs haplotypes demonstrate strong haplotype similarity over extended physical distances (ie over hundreds of kb). To do so, we analyzed 76 βs chromosomes from Jamaica, identified in a sample of 30 HbSS individuals and a sample of 133 participants from a population survey, as well as 16 βs chromosomes from 60 unrelated Yoruba participants in the International HapMap Project. We then contrasted these results with the analysis of 37 βs chromosomes identified in our previous study of 191 cord blood samples in The Gambia [16].


Jamaican βs haplotype similarity

In the Jamaican population, 26 SNPs met our selection criteria (see Methods) and these were used to construct haplotypes over approximately 200 kb both 5' and 3' of the βs allele (a total of 400 kb – Table 1). Fifty-eight of the 76 βs haplotypes identified (76%) were of the Benin type. Thirty-six (68%) of these Benin haplotypes were identical across 400 kb investigated, and several of the remaining haplotypes differed from the common haplotype at only two or three loci (Figure 2).
Table 1

SNPs used for haplotype construction in Jamaicans.

rs number a

Chr loc a

Dist from HbS (bp)b

Alleles c























































































HbS (rs334)













































a – rs numbers and location on chromosome 6 (Chr loc) are given with reference to HapMap release #20

b – Negative distances (Dist) from the HbS allele are upstream (5') of the allele, positive distances are downstream (3')

c – SNP alleles given as major/minor.

d – Minor allele frequency.
Figure 2

Jamaican βS haplotypes. Haplotypes designated as 'Benin' type are shown in part A; non-Benin type haplotypes are shown in part B. RFLP markers used to distinguish individual βS haplotypes have shaded marker labels and are outlined by the white border. Haplotypes are arrayed along the Y-axis with SNPs on the X-axis. At each SNP position, the major allele of each SNP is represented in blue and the minor allele in orange. For reference, the 70 kb region of the HBB cluster is indicated at the top of the figure.

In order to provide a summary statistic for the overall degree of βs Benin haplotype similarity in the population, we used the HS score of the HAPLOSIMILARITY algorithm [16]. The HS score is a measure of the mean similarity of haplotypes calculated by assessing the frequency of distinct haplotypes in smaller, overlapping, sliding windows (see Methods). HS scores range from small values approaching zero (all haplotypes are distinct) to one (all haplotypes are the same). We modified the existing algorithm to determine confidence limits for the estimates of haplotype similarity by bootstrapping the sample of haplotypes 1000 times. The HS score for Jamaican βS Benin haplotypes was 0.689 (95% CI 0.685 – 0.693). By contrast, haplotypes that were identical to the Benin type, as defined by RFLPs, but associated with the major allele (βA, N = 93) had an HS score of 0.132 (95% CI 0.131 – 0.133) which was significantly less (P < 0.001) than the score for the βs Benin haplotypes.

We also considered whether the high degree of haplotype similarity among βs Benin haplotypes was simply the consequence of a relatively small number of βs Benin haplotypes having a higher similarity by chance. To do so, we constructed 1000 samples of haplotypes that were not of the βS Benin type (i.e. both βS and βA non-Benin haplotypes); each sample consisted of the same number of haplotypes as the number of βS Benin chromosomes present in the dataset. The mean HS score (HS = 0.133, 95% CI 0.132 – 0.134) of these non-βS Benin haplotypes was significantly lower (P < 0.001) than that obtained for βS Benin haplotypes, suggesting that the high degree of haplotype similarity observed for βS Benin haplotypes was unlikely to be the consequence of sampling error.

Yoruba βs haplotypes

We wanted to determine whether our observations were unique to the Jamaican population, as well as ascertain the extent to which the strong similarity of βS Benin haplotypes might have been the result of the relatively low marker density employed (one SNP per 16 kb). To do this we utilized SNPs genotyped in Yoruba family trios (the YRI dataset, see Methods) of the International HapMap Project [20].

We first considered parental haplotypes constructed from family trios (see methods) using 181 SNPs spaced across the same 400 kb investigated in the Jamaican sample. Using this increased marker density (≈ 1 SNP every 2 kb), 14 of the 15 βs haplotypes observed were noted to be of the Benin type (see Additional file 1). Again, a high degree of haplotype similarity was noted among the haplotypes (HS = 0.805, 95% CI 0.802 – 0.808), and, as before, this was significantly higher (P < 0.001) than that observed among equivalent samples of non-βs Benin haplotypes (mean HS = 0.221, 95% CI 0.220 – 0.222). Thus, a high degree of βS haplotype similarity appears to be a general feature of βS Benin haplotypes which is independent of both the country of origin and the marker density used to construct the haplotype. This strong haplotype similarity was also reflected in the pattern of LD around the βS allele in this population. As shown in Figure 3, the SNP corresponding to the βSA allele demonstrated strong LD (mean D' 0.830) with alleles of almost all markers across the 400 kb investigated, and this extended across the 5' recombination hot spot.
Figure 3

Pattern of HBB LD in Yoruba. The figure shows pairwise D' between the βSA allele (shaded allele) and 180 high frequency SNPs (minor allele frequency > 0.05). D' > 0.9 is shown in red, D' > 0.7 in green, D' > 0.5 in gray, and values < 0.5 in white. The figure also indicates the 70 kb region around β-globin, and the recombination hotspot (dark gray box).

Having noted a high degree of haplotype similarity with concomitant strong LD over 400 kb, we then considered the distance over which this might extend. Without a priori knowledge of the extent or rate of LD decay along the chromosome, we arbitrarily chose to consider haplotype similarity over a distance of 1 Mb. To do so, an additional 220 SNPs were added from the HapMap YRI dataset, providing a final set of 401 SNPs to cover approximately 500 kb on either side of βS. Five of the 14 βS Benin haplotypes were identical across the 1 Mb evaluated, and the haplotypes still exhibited a high degree of similarity (HS = 0.702, 95% CI 0.698 – 0.706).

Gambian βs haplotypes

We also re-evaluated the similarity among βS haplotypes analysed in our previous studies in the Gambia [16]. RFLP genotypes were not available in this dataset so we examined haplotypes constructed from the six SNPs genotyped in the 70 kb surrounding the βS allele (see Methods). Using these markers we found that a single haplotype dominated the distribution, comprising 30 of the 37 haplotypes evaluated (81%). Although we were unable to assign this 'most common' haplotype to one of the classical haplotype groups with absolute certainty, it is of a similar frequency to that expected for the Senegal βs haplotype in this population [21]. In addition, when compared to other markers in the same 70 kb region, this 'most common' Gambian haplotype was clearly different from the βS Benin haplotypes observed in the other two populations (Figure 4). These observations, taken together, suggest that this 'most common' haplotype represents the Senegal βs haplotype. Analysis of this 'SNP-defined' Senegal haplotype revealed a high degree of long-range haplotype similarity (HS = 0.827, 95% CI 0.825 – 0.829), wherein 16 (51%) of the haplotypes were identical across the entire 400 kb region (Figure 4).
Figure 4

Most common Gambian βS haplotypes. The 'most common' short-range haplotype, including extension of the haplotype to 400 kb is shown in part A. Individual haplotypes are arrayed along the Y-axis with SNPs on the X-axis. At each SNP position, the major allele of each SNP is represented in blue and the minor allele in orange. The 70 kb region defining the 'short-range' βS haplotypes is indicated above the figure and by the white border. A comparison of this 70 kb region in Jamaica, Gambia and Yoruba is shown in part B using markers successfully genotyped in all three populations.


We have extended our previous observations of long-range βS haplotypes by demonstrating that classical RFLP-defined βS haplotypes are highly conserved over several hundreds of kb. To our knowledge this is the first time that classically described βS haplotypes have been shown to extend over such long genomic distances. We were also able to demonstrate that this conserved βS haplotype similarity was related to a pattern of strong extended LD around the βS allele. Comparable results were found for three different population groups, using differing markers and marker densities. The number of other classical βS haplotypes in the groups sampled was not large enough for us to make definitive statements about all classically-described βS haplotypes; however, since the high degree of haplotype similarity we observed is almost certainly the result of recent selection, and the selective force underlying these observations – severe malaria infection – also applies to other βS haplotypes, it seems likely that a high degree of long-range haplotype similarity will be seen on other βS chromosomes as well. Our findings provide a framework for further investigating the anthropology of the βS allele, including its selection dynamics across the African Diaspora and the origin of classical βS haplotypes, and may have implications for other selected alleles in the genome as well as for the search for genetic modifiers of sickle cell disease.

As an example, Afro-Caribbean and African βs haplotypes have differing demographic and social histories, with concomitant differences in the duration and extent of malarial selection pressure on the allele. The large-scale importation of slaves from Africa to Jamaica some 400 years ago, forcibly moved persons from an area where the selective force in favour of the βs allele was strong – malaria remains a major cause of mortality in equatorial Africa [22], to one where the selective force was substantially less – endemic malaria is not likely to have been a major cause of mortality in Jamaica and was eradicated from the island in 1963 [23, 24]. We might then expect differences in malarial selection between African and Afro-Caribbean populations. Similarly, the strong geographic sub-division of βS haplotypes across Africa, which presumably resulted from sub-division of early Holocene sub-Saharan Africans, suggests potential differences in the duration and intensity of malaria among the African population groups sampled as well. These observations suggest that we might have expected significant differences in the degree of haplotype similarity across the three populations.

The HS score in Jamaica was quantitatively less than that in Africa, albeit using differing marker sets and sample sizes. Conversely, a comparison of HS scores generated using only the 16 SNPs that were typed in all three populations, did not demonstrate significant differences between populations (data not shown), despite the expectation of selection differences between the two African groups and even more so between the African groups and the Jamaican sample. This observation is somewhat surprising, although there are likely to be other forces at work which are not accounted for in our appraisal. In the Jamaican population, for instance, there is the potential influence of genetic drift, admixture of sickle haplotypes from across Africa, as well as the non-random survival of individual sickle haplotypes; in our sample the frequency of Benin βS haplotypes among population samples (68%) was similar to that in βSβS individuals (78%), albeit with differing sample sizes. Equally, it may be that the differences in the selection pressures themselves are too subtle, too complex, or too recent to affect LD patterns of common SNPs/RFLPs. This would be beneficial for detecting selection from population-based surveys such as the HapMap project, which is already being used as a tool to screen for recently selected alleles in the human genome [25]. A more extensive evaluation of LD/haplotype decay in larger and more diverse datasets with a denser set of markers would help to clarify this.

There remains some uncertainty surrounding the implications of the 5' recombination 'hot spot', both with regard to the origin of the βS allele and to signals of recent selection around HBB. For instance, using sequence data over 5.2 kb, Wood et al [26] found that the recombination hotspot was responsible for attenuation of the haemoglobin C selection signal (HbC); however, strong LD extending over 100 kb and across the β-globin recombination hot spot has also been described in relation to positive malarial selection of the Hemoglobin E (HbE) allele in Southeast Asia [15]. In our dataset, using the Extended Haplotype Homozygosity (EHH) score of the Long-range Haplotype test (LRH) [27], we did not find any substantive differences in haplotype homozygosity between common βs haplotypes 5' of βS and those extending 3' of βS in either Jamaican or Gambian samples (the EHH score is the probability that any two haplotypes extending outwards from a core haplotype or SNP will be the same at a given distance away from the SNP – see Methods). Among the Yoruba, 5' haplotypes appeared to have less similarity and a steeper decline in similarity than 3' haplotypes over 200 kb, but at 1 MB, 5' and 3' haplotypes had comparable degrees of similarity (Additional file 2). The inconsistency of these preliminary results precludes definitive statements about either the impact of the recombination hot spot on the signature of βS selection or the contribution of gene conversion to the origin of classical βS haplotypes; however, a combined approach of short-range sequencing and dense long-range SNP data may help to resolve some of these issues.

Lastly, we offer a note on the potential clinical relevance of our findings. The extent of LD between markers on the haplotypes evaluated may have implications for studies of genetic modifiers of sickle cell disease. To date, such studies have used the strong LD in the surrounding 70 kb to generate hypotheses about local variants that are likely to modulate the HbSS phenotype [18]. Future attempts to identify genetic modifiers of sickle cell disease will have to account for the extended LD observed here, which may require a consideration of cis-acting variants or genes located hundreds to thousands of kilobases away. This approach may provide new candidate loci that either modulate the sickle phenotype or influence traits such as hereditary persistence of foetal haemoglobin, which are known to modify clinical outcomes in the beta-haemoglobinopathies [28].


We have shown that common βS haplotypes from different populations exhibit a high degree of haplotype similarity, with concomitant strong LD, over hundreds of kilobases despite the adjacent 5' recombination hotspot. To the best of our knowledge, this is the first time that this has been described. These findings suggest little support for differences in selective pressures on βS between major population subdivisions, and may have implications for association studies of genetic modifiers of sickle cell disease in cis with the β-globin cluster. Further studies, using both simulated and actual data from multiple populations, are needed to clarify the effects of recombination and population demography on long-range haplotype similarity and LD in the region of this well-established example of a recently selected allele.



DNA samples from Jamaican adults were obtained by randomly sampling from a population survey that has been described in detail previously [29]. DNA samples from HbSS adults attending the main clinic at the Sickle Cell Unit, University of the West Indies, Jamaica were obtained at random from among samples collected during a previous study of genetic modifiers of HbSS disease [30]. All study samples were anonymised. Use of these samples for the purposes of this study was approved by the University Hospital of the West Indies (UHWI)/University of the West Indies (UWI) Faculty of Medical Sciences Ethics committee. Gambian DNA samples were extracted from a set of cord bloods recruited at the Royal Victoria Hospital, Faraja, The Gambia. Permission for the collection, storage and use of these samples for genetic research was granted by the Joint Gambian Government/MRC Ethics Committee. Cell lines from individuals used in Phase 1 of the HapMap project from the Yoruba in Ibadan, Nigeria (YRI dataset) were obtained from the Coriell Cell Repository at the Coriell Institute for Medical Research [31] as transformed B-lymphocytes from peripheral blood (see Additional file 3 for sample details). Cell lines were cultured and DNA was extracted using CST Genomic DNA Purification Kit [32], and then quantified using NanoDrop technology [33].

SNP selection and genotyping

In Jamaicans, the ensemble database [34] was used to identify an initial set of 35 SNPs across 414 kb of the β-globin locus on chromosome 11, spanning about 200 kb on either side of the HbS SNP. SNPs were chosen on the basis of validation (preferably in an African-related population), available frequency data, and a desired SNP density of approximately one SNP per 10 kb. Chosen SNPs (including the HbS SNP) were genotyped using MALDI-TOF mass spectrometry (SEQUENOM) on PEP DNA [35] in 137 Jamaican population samples (SNP assay details are available in Additional file 4). SNPs with greater than 10% missing data, genotypes not consistent with Hardy-Weinberg equilibrium (P < 0.01), or minor allele frequencies < 5% were then excluded, resulting in a final set of 22 SNPs. The frequency of the HbS haplotype was 6% in the population sample, which compares favourably with the 5% figure obtained in larger-scale surveys of Jamaicans [36].

Five RFLP sites were additionally genotyped in the same samples using the restriction enzymes Hinf I, Hinc II, Hind III in HbG1, Hind III in HbG2, and Xmn I. Hinf I digests were uninformative as there were multiple Hinf I sites in the amplified PCR fragment; the remaining restriction digests were therefore used to define classically-described βs haplotypes. SNP selection and typing in the 380 Gambian cord blood samples was very similar to the procedure used for the Jamaican samples, and has been described previously [16].

Publicly available SNPs genotyped in the YRI dataset from HapMap release #20 [37] were chosen from a one megabase region spanning first 200 kb and then 500 kb on either side of the βS allele (see Additional file 4). SNPs were chosen on the basis of being polymorphic in the population sampled, having passed HapMap quality control measures, and providing an approximate marker density of one SNP per 2 kb (N = 398). Three RFLP sites – Hind III in HbG1, Hind III in HbG2, and Xmn I (see below)- and the HbS SNP (see above) were independently genotyped in the same samples, for a total of 401 SNPs. Along with SNP rs968857 (which is the same as the Hinc II RFLP), these were used to define classically-described βs haplotypes.

RFLP genotyping

HBB PCR Primers (see Additional file 5) used to amplify products for RFLP genotyping were designed with careful consideration of the high degree of homology in the region due to gene duplication; this resulted in relatively large amplification fragments. The HBG2 fragment (2734 bp in length) amplified the HBG2 gene and contained restriction sites for both Hind III and Xmn1. The HBG1 fragment (2909 bp in length) amplified the HBG1 gene and contained the restriction site Hind III. The HBB fragment (1200 bp in length) amplified the HBB gene and contained the restriction site Ava II. The recognition site for Hinc II was in an intergenic region with unique flanking sequence, so a small fragment of 118 bp containing it was amplified.

For each PCR reaction 2 μl of genomic DNA at 5 ng/μl was added to 6 μl of PCR mix. PCR mix for 192 reactions was prepared by adding the following: MgCl2 (50 mM) – 44 μl; dNTP's (8 mM pool) – 110 μl; ×10 buffer – 110 μl; Biotag 5 U/μl – 5.5 μl; H2O – 386.1 μl; 1st PCR primer – 2.2 μl; 2nd PCR primer – 2.2 μl. The PCR mix was the same for all fragments except HBG1 (2909 bp) for which 3.3 μl of each of the forward and reverse primers was used.

PCR protocols for the HBG1 and HBG2 RFLP fragments consisted of an initial five cycle denaturation of 96°C for 1 minute, 94°C for 45 seconds, 62°C for 2.5 minutes, and 72°C for 1 minute; followed by a 29 cycles of 94°C for 45 seconds, 65°C for 2.5 minutes, and 72°C for 1 minute, and a final extension of 72°C for 10 minutes and 15°C for 15 minutes. The PCR protocol for the Hinc2 fragment differed only with regard to the main cycling conditions which required an annealing temperature of 65°C for 45 seconds and an extension temperature of 72°C for 30 seconds. The HBB fragment did not require the initial 5 cycle denaturation; instead 35 cycles consisting of 96°C for 1 minute, 94°C for 45 seconds followed by an annealing temperature of 56°C for 45 seconds, and a 72°C extension for 1 minute was used.

Restriction enzymes and their buffers were ordered from New England BioLabs (Ipswich, MA, USA); digests were carried out according to the manufacturer's recommendations. Digestion products were loaded onto an agarose gel and scored as +/+ if the two alleles were digested, as +/- if one but not the other allele was digested (heterozygote), and as -/- if no digestion occurred in the sample.

Haplotype construction

In order to improve the integrity of the haplotype inference in the Jamaicans, we omitted any individuals who had more than one site (marker) with missing data, resulting in 133 population samples and 30 HbSS samples. Haplotypes were constructed using the PHASE (version 2.0) software package [38, 39]. Among the Yoruba, parental genotypes were first phased using the PHAMILY program [40], which uses parent to offspring transmission to derive phase-known sites from family-trio pedigree data. The resulting haplotypes consisting of phase known and phase unknown sites were then phased using the PHASE algorithm.

Long-range haplotype similarity

The HS statistic of HAPLOSIMILARITY uses sliding windows to assess the mean similarity of haplotypes (given as the mean of the sum of the squares of the frequencies of distinct haplotypes within a given window) associated with the minor allele of a given SNP. The value of HS ranges from one (all haplotypes associated with the allele are exactly the same) to a minimum given by 1/kmax, where kmax is the maximum possible number of distinct haplotypes for a given sliding window size (haplotypes associated with the allele are extremely diverse). We used a sliding window size of ten SNPs (the default option) in our evaluation. HAPLOSIMILARITY (including details on operating characteristics and implementation) is available for public use at the GMAP website [41].

The EHH statistic of the long-range haplotype test (LRH) is very similar to the HS statistic of HAPLOSIMILARITY and is the probability that at a given distance away from a core haplotype or SNP, any two haplotypes extending outward from the core haplotype/SNP will be homozygous at all SNPs. EHH scores range from a minimum of zero to a maximum of one [26].

The Normal approximation for the difference between two proportions [42] was used to test the significance of differences in haplotype similarity between the three populations.



We would like to thank Dr. Jonathan Pritchard and Dr. Graham Coop at the University of Chicago, for helpful comments and suggestions. This work was supported in part by a grant from the Caribbean Health Research Council (to NH), and by the Medical Research Council, UK

Authors’ Affiliations

Tropical Metabolism Research Unit, Tropical Medicine Research Institute, University of the West Indies
Wellcome Trust Centre for Human Genetics, University of Oxford
MRC Laboratories
Departments of Zoology and Statistics, University of Oxford
Department of Pediatric and Adolescent Medicine, Mayo Clinic


  1. Nagel RL, Fabry ME, Pagnier J, Zohoun I, Wajcman H, Baudin V, Labie D: Hematologically and genetically distinct forms of sickle cell anemia in Africa. The Senegal type and the Benin type. N Engl J Med. 1985, 312: 880-884.View ArticlePubMedGoogle Scholar
  2. Pagnier J, Mears JG, Dunda-Belkhodja O, Schaefer-Rego KE, Beldjord C, Nagel RL, Labie D: Evidence for the multicentric origin of the sickle cell hemoglobin gene in Africa. Proc Natl Acad Sci U S A. 1984, 81: 1771-1773. 10.1073/pnas.81.6.1771.PubMed CentralView ArticlePubMedGoogle Scholar
  3. Orkin SH, Kazazian HH, Antonarakis SE, Goff SC, Boehm CD, Sexton JP, Waber PG, Giardina PJ: Linkage of beta-thalassaemia mutations and beta-globin gene polymorphisms with DNA polymorphisms in human beta-globin gene cluster. Nature. 1982, 296: 627-631. 10.1038/296627a0.View ArticlePubMedGoogle Scholar
  4. Webster MT, Clegg JB, Harding RM: Common 5' beta-globin RFLP haplotypes harbour a surprising level of ancestral sequence mosaicism. Hum Genet. 2003, 113: 123-139.PubMedGoogle Scholar
  5. Fullerton SM, Harding RM, Boyce AJ, Clegg JB: Molecular and population genetic analysis of allelic sequence diversity at the human beta-globin locus. Proc Natl Acad Sci U S A. 1994, 91: 1805-1809. 10.1073/pnas.91.5.1805.PubMed CentralView ArticlePubMedGoogle Scholar
  6. Rich SM, Licht MC, Hudson RR, Ayala FJ: Malaria's Eve: evidence of a recent population bottleneck throughout the world populations of Plasmodium falciparum. Proc Natl Acad Sci U S A. 1998, 95: 4425-4430. 10.1073/pnas.95.8.4425.PubMed CentralView ArticlePubMedGoogle Scholar
  7. Flint J, Harding RM, Boyce AJ, Clegg JB: The population genetics of the haemoglobinopathies. Baillieres Clin Haematol. 1998, 11: 1-51. 10.1016/S0950-3536(98)80069-3.View ArticlePubMedGoogle Scholar
  8. Wainscoat JS, Bell JI, Thein SL, Higgs DR, Sarjeant GR, Peto TE, Weatherall DJ: Multiple origins of the sickle mutation: evidence from beta S globin gene cluster polymorphisms. Mol Biol Med. 1983, 1: 191-197.PubMedGoogle Scholar
  9. Antonarakis SE, Boehm CD, Serjeant GR, Theisen CE, Dover GJ, Kazazian HH: Origin of the beta S-globin gene in blacks: the contribution of recurrent mutation or gene conversion or both. Proc Natl Acad Sci U S A. 1984, 81: 853-856. 10.1073/pnas.81.3.853.PubMed CentralView ArticlePubMedGoogle Scholar
  10. Hill AV, Allsopp CE, Kwiatkowski D, Anstey NM, Twumasi P, Rowe PA, Bennett S, Brewster D, McMichael AJ, Greenwood BM: Common west African HLA antigens are associated with protection from severe malaria. Nature. 1991, 352: 595-600. 10.1038/352595a0.View ArticlePubMedGoogle Scholar
  11. Aidoo M, Terlouw DJ, Kolczak MS, McElroy PD, ter Kuile FO, Kariuki S, Nahlen BL, Lal AA, Udhayakumar V: Protective effects of the sickle cell gene against malaria morbidity and mortality. Lancet. 2002, 359: 1311-1312. 10.1016/S0140-6736(02)08273-9.View ArticlePubMedGoogle Scholar
  12. Jelliffe DB, Humphreys J: The sickle-cell trait in western Nigeria; a survey of 1,881 cases in the Yoruba. Br Med J. 1952, 1: 405-406.PubMed CentralView ArticlePubMedGoogle Scholar
  13. Allison AC: The distribution of the sickle-cell trait in East Africa and elsewhere, and its apparent relationship to the incidence of subtertian malaria. Trans R Soc Trop Med Hyg. 1954, 48: 312-318. 10.1016/0035-9203(54)90101-7.View ArticlePubMedGoogle Scholar
  14. Bersaglieri T, Sabeti PC, Patterson N, Vanderploeg T, Schaffner SF, Drake JA, Rhodes M, Reich DE, Hirschhorn JN: Genetic Signatures of Strong Recent Positive Selection at the Lactase Gene. Am J Hum Genet. 2004, 74:Google Scholar
  15. Ohashi J, Naka I, Patarapotikul J, Hananantachai H, Brittenham G, Looareesuwan S, Clark AG, Tokunaga K: Extended Linkage Disequilibrium Surrounding the Hemoglobin E Variant Due to Malarial Selection. Am J Hum Genet. 2004, 74:Google Scholar
  16. Hanchard NA, Rockett KA, Spencer C, Coop G, Pinder M, Jallow M, Kimber M, McVean G, Mott R, Kwiatkowski DP: Screening for recently selected alleles by analysis of human haplotype similarity. Am J Hum Genet. 2006, 78: 153-159. 10.1086/499252.PubMed CentralView ArticlePubMedGoogle Scholar
  17. Harding RM, Fullerton SM, Griffiths RC, Bond J, Cox MJ, Schneider JA, Moulin DS, Clegg JB: Archaic African and Asian lineages in the genetic ancestry of modern humans. Am J Hum Genet. 1997, 60: 772-789.PubMed CentralPubMedGoogle Scholar
  18. Powars D, Hiti A: Sickle cell anemia. Beta s gene cluster haplotypes as genetic markers for severe disease expression. Am J Dis Child. 1993, 147: 1197-1202.View ArticlePubMedGoogle Scholar
  19. Powars DR, Chan L, Schroeder WA: Beta S-gene-cluster haplotypes in sickle cell anemia: clinical implications. Am J Pediatr Hematol Oncol. 1990, 12: 367-374.View ArticlePubMedGoogle Scholar
  20. Altshuler D, Brooks LD, Chakravarti A, Collins FS, Daly MJ, Donnelly P: A haplotype map of the human genome. Nature. 2005, 437: 1299-1320. 10.1038/nature04226.View ArticleGoogle Scholar
  21. Currat M, Trabuchet G, Rees D, Perrin P, Harding RM, Clegg JB, Langaney A, Excoffier L: Molecular analysis of the beta-globin gene cluster in the Niokholo Mandenka population reveals a recent origin of the beta(S) Senegal mutation. Am J Hum Genet. 2002, 70: 207-223. 10.1086/338304.PubMed CentralView ArticlePubMedGoogle Scholar
  22. WHO: Africa Malaria Report. []
  23. Chandler D: Health and slavery: a study of health conditions among Negro slaves in the Viceroyalty of New Granada and its associated slave trade, 1600-1810. Doctoral Dissertation, History section, Latin American Library. 1972, 308-Google Scholar
  24. Boyd MF, Aris FW: A malaria survey of the island of Jamaica, BWI. American Journal of Tropical Medicine and Hygiene. 1929, 9: 309-399.Google Scholar
  25. Voight BF, Kudaravalli S, Wen X, Pritchard JK: A map of recent positive selection in the human genome. PLoS Biol. 2006, 4: e72-10.1371/journal.pbio.0040072.PubMed CentralView ArticlePubMedGoogle Scholar
  26. Wood ET, Stover DA, Slatkin M, Nachman MW, Hammer MF: The beta -globin recombinational hotspot reduces the effects of strong selection around HbC, a recently arisen mutation providing resistance to malaria. Am J Hum Genet. 2005, 77: 637-642. 10.1086/491748.PubMed CentralView ArticlePubMedGoogle Scholar
  27. Sabeti PC, Reich DE, Higgins JM, Levine HZ, Richter DJ, Schaffner SF, Gabriel SB, Platko JV, Patterson NJ, McDonald GJ, Ackerman HC, Campbell SJ, Altshuler D, Cooper R, Kwiatkowski D, Ward R, Lander ES: Detecting recent positive selection in the human genome from haplotype structure. Nature. 2002, 419: 832-837. 10.1038/nature01140.View ArticlePubMedGoogle Scholar
  28. Powars DR, Chan L, Schroeder WA: The influence of fetal hemoglobin on the clinical expression of sickle cell anemia. Ann N Y Acad Sci. 1989, 565: 262-278. 10.1111/j.1749-6632.1989.tb24174.x.View ArticlePubMedGoogle Scholar
  29. Cooper R, Rotimi C, Ataman S, McGee D, Osotimehin B, Kadiri S, Muna W, Kingue S, Fraser H, Forrester T, Bennett F, Wilks R: The prevalence of hypertension in seven populations of west African origin. Am J Public Health. 1997, 87: 160-168.PubMed CentralView ArticlePubMedGoogle Scholar
  30. Haverfield EV, McKenzie CA, Forrester T, Bouzekri N, Harding R, Serjeant G, Walker T, Peto TE, Ward R, Weatherall DJ: UGT1A1 variation and gallstone formation in sickle cell disease. Blood. 2005, 105: 968-972. 10.1182/blood-2004-02-0521.View ArticlePubMedGoogle Scholar
  31. Coriell Institute for Medical Research. []
  32. CST Genomic DNA Purification. []
  33. Nanodrop. []
  34. Ensembl database. []
  35. Griffin TJ, Smith LM: Single-nucleotide polymorphism analysis by MALDI-TOF mass spectrometry. Trends Biotechnol. 2000, 18: 77-84. 10.1016/S0167-7799(99)01401-8.View ArticlePubMedGoogle Scholar
  36. Hanchard NA, Hambleton I, Harding RM, McKenzie CA: The frequency of the sickle allele in Jamaica has not declined over the last 22 years. British Journal of Haematology. 2005, 130: 939-942. 10.1111/j.1365-2141.2005.05704.x.View ArticlePubMedGoogle Scholar
  37. HapMap project. []
  38. Stephens M, Smith NJ, Donnelly P: A new statistical method for haplotype reconstruction from population data. Am J Hum Genet. 2001, 68: 978-989. 10.1086/319501.PubMed CentralView ArticlePubMedGoogle Scholar
  39. Stephens M, Donnelly P: A comparison of bayesian methods for haplotype reconstruction from population genotype data. Am J Hum Genet. 2003, 73: 1162-1169. 10.1086/379378.PubMed CentralView ArticlePubMedGoogle Scholar
  40. Ackerman H, Usen S, Mott R, Richardson A, Sisay-Joof F, Katundu P, Taylor T, Ward R, Molyneux M, Pinder M, Kwiatkowski DP: Haplotypic analysis of the TNF locus by association efficiency and entropy. Genome Biol. 2003, 4: R24-10.1186/gb-2003-4-4-r24.PubMed CentralView ArticlePubMedGoogle Scholar
  41. GMAP Website. []
  42. Kirkwood BR, Sterne JAC: Essential Medical Statistics. 2003, , Blackwell Publishing, 501-2nd EditionGoogle Scholar


© Hanchard et al; licensee BioMed Central Ltd. 2007

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.