Genetic determinants of hair and eye colours in the Scottish and Danish populations

Background Eye and hair colour is highly variable in the European population, and is largely genetically determined. Both linkage and association studies have previously been used to identify candidate genes underlying this variation. Many of the genes found were previously known as underlying mutant mouse phenotypes or human genetic disease, but others, previously unsuspected as pigmentation genes, have also been discovered. Results We assayed the hair of a population of individuals of Scottish origin using tristimulus colorimetry, in order to produce a quantitative measure of hair colour. Cluster analysis of this data defined two groups, with overlapping borders, which corresponded to visually assessed dark versus red/light hair colour. The Danish population was assigned into categorical hair colour groups. Both cohorts were also assessed for eye colour. DNA from the Scottish group was genotyped at SNPs in 33 candidate genes, using 384 SNPs identified by HapMap as representatives of each gene. Associations found between SNPs and colorimetric hair data and eye colour categories were replicated in a cohort of the Danish population. The Danish population was also genotyped with SNPs in 4 previously described pigmentation genes. We found replicable associations of hair colour with the KITLG and OCA2 genes. MC1R variation correlated, as expected, with the red dimension of colorimetric hair colour in Scots. The Danish analysis excluded those with red hair, and no associations were found with MC1R in this group, emphasising that MC1R regulates the colour rather than the intensity of pigmentation. A previously unreported association with the HPS3 gene was seen in the Scottish population. However, although this replicated in the smaller cohort of the Danish population, no association was seen when the whole study population was analysed. 
Conclusions We have found novel associations between SNPs in known pigmentation genes and colorimetrically assessed hair colour in a Scottish and a Danish population.


Background
The colours of hair, skin and eyes provide some of the most visible variation between and within human populations. Whilst variation in skin pigmentation is notable between populations, hair colour variation is most notable within populations of European origin [1,2]. In Europeans, genetic factors explain 92% of the variation in hair colour, while most of the rest of the variation is due to environmental influence [2]. Blond and red hair colours are commonly seen variations in Europeans, but are rare in other populations. Both linkage studies in families and genome-wide association studies in populations have identified genetic factors that determine hair and eye colour. Thus, for example, brown eye and hair colour was mapped to chromosome 15 by linkage [3,4] and was also found to be associated with SNPs in the OCA2 and adjacent HERC2 genes in both whole-genome and candidate studies [4-8]. Red hair colour was initially attributed to MC1R variation by association studies [9], but the highly penetrant phenotype of some variants subsequently allowed family studies [10]. Presently, more than 70 variants of MC1R have been reported [11]. The commonest variants in MC1R have been characterised as high-penetrance or low-penetrance red hair alleles and classified as R or r, respectively [12]. Red-haired individuals most commonly have the R/R genotype, and this genotype accounts for as much as 84% of red hair colour, but R/r, R/+ and r/r genotypes can also result in red hair [12-14]. Association analyses between hair colour and SNPs in Europeans have been more informative than linkage analyses and have revealed associations with several genes. Most were already known from mouse or human pigmentary mutations (SLC45A2, TYR, OCA2, KITLG, ASIP and TYRP1), but others had not previously been implicated in pigmentation (SLC24A4, IRF4 and TPCN2) [4,5,15-17].
In addition, four groups have recently and independently found strong associations between eye colour and polymorphisms in the HERC2 gene upstream of OCA2 [5-8]. These SNPs also have a weak association with hair colour, but there are notably stronger associations between hair colour and haplotypes across the OCA2 gene itself [4].
The precise mechanisms behind most SNP associations with hair colour are not clear. However, most associations found using whole-genome studies map in or near genes that are already known to play a role in pigmentation through human, mouse or zebrafish mutations. It is clear that most components of the pigmentation pathway are already known through mutation and disease studies and that subtle changes in expression or function of these genes underlie much of normal pigmentary variation.
On this principle, we have studied associations of hair and eye colour in a Scottish population with 384 tagging SNPs that comprehensively cover 33 candidate genes known, from mouse and human disease genetics or other studies, to be involved in the pigmentation pathway. Associations with hair colour were followed up in a Danish population using 25 SNPs in regions of four candidate pigmentation genes.

Methods
The Scottish population
A total of 133 unselected young adults aged 18 to 40 (32 males and 101 females) were recruited from Edinburgh, Scotland. Their hair colour was measured at 6 scalp hair sites: left and right frontal (8 cm superiorly from the supraorbital ridge), left and right temple (8 cm laterally from the supraorbital ridge), and left and right occipital (5 cm laterally from the occiput), using tristimulus L*a*b* colorimetry with a Minolta spectrophotometer CM-2600d (Minolta Co., Ltd, Osaka, Japan). Means of triplicate measurements over all sites were taken. Colour was represented as summary values in three dimensions designed to be commensurable with human colour perception: L*, representing lightness, on a scale of 0-100 where 0 is black and 100 is white; a*, representing red-green, on a scale from +60 to -60, where positive values indicate increasing shades of red; and b*, representing yellow-blue, on a scale from +60 to -60, where positive values indicate increasing shades of yellow. These three values were plotted in a three-dimensional space to give a numerical representation of colour. Subjects with dyed hair and/or who were not of north European origin were excluded from the study. The eyes of the participants were photographed and categorised as blue, grey, green, hazel or brown by inspection. Table 1 shows the numbers of subjects by eye colour, pooled to give sufficient statistical power, with their hair colour by observation. Venous blood was collected from all volunteers and DNA was extracted using the Nucleon Genomic DNA extraction kit (Tepnel Life Sciences PLC, Manchester, UK). Ethical approval was obtained from the Lothian Regional Ethics Committee, and consent to carry out and publish the study was obtained from each subject.

The Danish population
Unrelated, healthy donors participated and their hair colours were categorised as white, light blond, dark blond, brown, black, red or auburn by the same observer. Elderly participants answered by recall of their colour at age 20 years. Data and genotypes from 378 participants with hair colour other than red were analysed with respect to dark or blond hair colour and were tested at 25 SNPs. Initial validation of SNP associations with hair colour in the Scottish population used a subset of participants from the Danish cohort (N = 210), with subsequent follow-up in the whole sample. The hair colours were classified as dark (dark blond, brown, black) or light (fair, light blond). Table 2 shows the hair and eye colour categories from this population. Participants were asked where their parents or grandparents were born to determine the individual's ancestral origin, and only those of north European ancestry were included. Blood samples were collected from all volunteers and DNA was purified using the QIAamp DNA blood minikit according to the manufacturer's protocol (Qiagen). The project was approved by the Danish ethical committee (ref. KF-01-037/03), and consent to carry out and publish the study was obtained from each subject.
In this population the sample size of 210 achieves 67% power to detect at P = 0.05 an association with an odds ratio of 1.6 for a SNP with a minor allele frequency of 0.45 (as we see, for example, with rs2254913 in HPS3).

Candidate genes and SNP selection
A total of 33 candidate genes were selected which are reported to play a role in pigmentation through melanin biosynthesis, metabolic pathways associated with pigmentation or tanning response or melanocyte biology.

SNP typing with Illumina microarray (Scots)
The Illumina GoldenGate microarray system was used for genotyping of the Scottish samples [18]. Typing was performed at the Wellcome Trust Clinical Research Facility, Institute of Genetics and Molecular Medicine, Edinburgh [19].

SNP typing by MALDI-TOF MS (Danes)
A multiplex PCR with 13 short amplicons was designed to amplify the loci with the selected 13 candidate variations [see Additional file 1, Table S1]. Primer concentrations ranged from 26.7 μM to 66.7 μM [see Additional file 1, Table S2]. The reaction was balanced to obtain equal peak intensities in the MALDI-TOF MS spectra. PCRs and detection by MALDI-TOF MS technology were performed as previously described [14].

Statistical analysis
Allele frequencies of categorical data were analysed using Fisher's exact test and odds ratios. Quantitative values were analysed using linear regression on SNP allelic counts, and SNP effects were tested by the Wald test using Plink v0.99 [21]. All P-values were corrected by using empirical (adaptive) permutation with standard settings or by using Bonferroni single-step adjusted p-values. Analysis of variance (ANOVA), cluster analysis and discriminant analysis were performed with SYSTAT® v.11. Kruskal-Wallis one-way analysis of variance was used to test for differences between male and female a* values. Linkage disequilibrium (D' and r²) was calculated using Haploview 4.0 [22].
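The categorical tests described above can be illustrated with a minimal Python sketch. It is not the software used in the study: the function names are hypothetical, the allele counts are invented for illustration, and the number of tests used for the Bonferroni adjustment is an arbitrary example value.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio of the 2x2 table [[a, b], [c, d]] (b and c must be non-zero)."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables that are no more likely than the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def bonferroni(p_value, n_tests):
    """Single-step Bonferroni adjustment of one p-value for n_tests tests."""
    return min(1.0, p_value * n_tests)


# Hypothetical allele counts, for illustration only (not data from this study).
# Rows: light hair, dark hair; columns: minor allele, major allele.
light_minor, light_major = 12, 88
dark_minor, dark_major = 10, 190

p = fisher_exact_two_sided(light_minor, light_major, dark_minor, dark_major)
print("odds ratio:", round(odds_ratio(light_minor, light_major, dark_minor, dark_major), 2))
print("Fisher p:", p)
print("Bonferroni-adjusted p (25 tests assumed):", bonferroni(p, 25))
```

The two-sided p-value here sums all tables at least as unlikely as the observed one, which is one common convention; other software may define the two-sided test differently.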

Quantitative measure of hair colour
We measured hair colour in the Scottish population using tristimulus colorimetry, which assigns three values: L* assays the light/dark axis, a* measures red/green and b* indicates yellow/blue. Figure 1 shows all individuals plotted against each pair of parameters (L* vs a*, L* vs b* and a* vs b*). The values are clearly correlated, as previously reported [23]. The measured hair colours showed L* values ranging from 16.42 to 53.20 (within a maximum range of 0 to 100, where higher values are lighter). The a* values ranged from 0.88 to 12.74 and b* values from 0.97 to 19.51 (where positive values indicate increasing red and yellow colour, respectively). When plotted in 3 dimensions, against each parameter simultaneously (Figure 2A), the individual hair colours are distributed in a triangular pattern in which individuals with brown hair colour had low L*, a* and b* values and red-haired individuals had high a* and b* values and midrange L* values. Individuals with blond hair colour had midrange a* and high b* and L* values. A similar hair colour distribution was previously described [23]. Cluster analysis of the quantitative measures of hair colour into two groups defined a group with dark hair colour, with 95% confidence intervals L* (16.90; 29.80), a* (1.29; 6.31) and b* (1.51; 10.99), which was separated from the red/light hair colour group, with L* (26.61; 48.61), a* (2.31; 11.01) and b* (9.72; 19.87), but the groups had overlapping borders in all three dimensions: L* (26.61-29.80), a* (2.31-6.31) and b* (9.72-10.99). Cluster analysis with more than two groups did not correlate with groups as defined by visual inspection. A difference between the sexes was observed. In females (N = 85), the a* values were significantly higher than in males (N = 23) (p = 0.018, R² = 0.05) (Figure 2B). The mean female a* value is 4.67 and the mean male value is 3.39. Using a nonparametric test for differences, Kruskal-Wallis chi-squared analysis gives a P value of 0.003.
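As an illustration of how a two-group cluster analysis of L*a*b* values can work, the following Python sketch uses plain k-means with two clusters. This is an assumption made for illustration: the clustering algorithm used in SYSTAT is not specified here, the function name is hypothetical, and the colour values are invented (chosen only to lie within the measured ranges).

```python
from math import dist


def two_means(points, max_iter=100):
    """Split (L*, a*, b*) points into two clusters using plain k-means."""
    # Deterministic start: the darkest and the lightest sample, judged by L*.
    centroids = [
        min(points, key=lambda p: p[0]),
        max(points, key=lambda p: p[0]),
    ]
    labels = None
    for _ in range(max_iter):
        # Assign each point to its nearest centroid (Euclidean distance).
        new_labels = [
            0 if dist(p, centroids[0]) <= dist(p, centroids[1]) else 1
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the clustering has converged
        labels = new_labels
        # Move each centroid to the mean of its current members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical (L*, a*, b*) measurements, for illustration only.
hair = [
    (18.0, 2.0, 3.0),    # dark
    (22.0, 3.0, 5.0),    # dark
    (25.0, 4.0, 7.0),    # dark
    (40.0, 5.0, 17.0),   # blond
    (45.0, 4.0, 18.0),   # blond
    (35.0, 11.0, 15.0),  # red
]
labels, centroids = two_means(hair)
print(labels)
print(centroids)
```

With two clusters, red and blond samples fall together because both have high b* values, which mirrors the dark versus red/light split reported above.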
Discriminant analysis showed that the hair colours assigned as blond by inspection were correctly assigned in 85% (N = 22) of the individuals, whereas dark (black, brown) was correctly assigned in 94% (N = 61). Red hair colour was correctly assigned in 86% (N = 6) of the individuals whereas auburn hair colour was classified correctly in only 22% (N = 2) and alternatively defined as either blond (N = 1), brown (N = 3) or red (N = 3).
Chi-square analysis of the hair and eye colour categories (Tables 1 and 2) indicates that these are not independent and blue/grey eyes are more often found with fair or

Figure 1. L* (dark/light), a* (red) and b* (yellow) values (hair colour) measured by tristimulus colorimetry in 107 Scots from Edinburgh. Each individual is plotted as L* vs a* (top), L* vs b* (middle) and a* vs b* (bottom).

SNP Associations with Hair and Eye Colour
107 individuals from the Scottish cohort were typed at 384 SNPs from 33 genes. Choice of SNPs was guided by HapMap so that each gene was represented by tagging SNPs. In addition, the MC1R gene was sequenced from each individual and any variants found were categorised as R (high-penetrance red hair) or r (low-penetrance red hair). Associations between SNPs and quantitative tristimulus values were examined using linear regression, and p-values for putative associations were corrected using permutation or by Bonferroni adjustment. Table 3 shows the 11 genes in which SNPs with adjusted p-values < 0.05 were found. Associations of the same 384 SNPs with eye colour were also analysed in the same 107 Scottish individuals, using Fisher's exact test on the categorical data (Table 4). All association data are tabulated in additional material [see Additional file 2]. The Danish population was initially typed at SNPs in 4 genes previously indicated to have associations with hair colour (SLC45A2, HERC2, OCA2 and MC1R) across 378 individuals, and associations were tested with dark versus light hair (Table 5). A subset of 210 Danes was further analysed to follow up significant associations found in the Scottish population (Table 6). The size of these sample populations means that, for alleles of low frequency, the power to detect associations is limited unless the effect is large.
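The quantitative analysis can be sketched in Python as an additive regression of a colorimetric value on minor-allele count, with an empirical p-value from permutation. This is a simplified illustration, not the Plink procedure: it uses a fixed number of permutations rather than adaptive permutation, the function names are hypothetical, and the genotypes and a* values are invented.

```python
import random


def regress(genotypes, values):
    """Least-squares fit of values on minor-allele counts (0, 1 or 2).

    Returns (slope, r_squared). Both inputs must show some variation.
    """
    n = len(values)
    mean_g = sum(genotypes) / n
    mean_v = sum(values) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((v - mean_v) ** 2 for v in values)
    sxy = sum((g - mean_g) * (v - mean_v) for g, v in zip(genotypes, values))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


def permutation_p(genotypes, values, n_perm=2000, seed=0):
    """Empirical p-value: how often shuffled phenotypes fit at least as well."""
    rng = random.Random(seed)
    observed = regress(genotypes, values)[1]
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if regress(genotypes, shuffled)[1] >= observed - 1e-12:
            hits += 1
    # Adding 1 to numerator and denominator avoids reporting p = 0.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data, for illustration only: minor-allele counts and a* values.
allele_counts = [0, 0, 0, 0, 1, 1, 1, 2, 0, 1]
a_star = [3.1, 2.8, 3.5, 4.0, 5.2, 4.8, 6.1, 7.9, 3.3, 5.5]

slope, r2 = regress(allele_counts, a_star)
print("slope per allele:", round(slope, 3))
print("R squared:", round(r2, 3))
print("permutation p:", permutation_p(allele_counts, a_star))
```

Shuffling the phenotypes breaks any genotype-phenotype relationship while keeping both distributions intact, so the permuted fits estimate how large R² can be by chance alone.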

The KITLG gene and hair colour
In the Scottish population, the three SNPs rs1492354 (p = 0.0009), rs1907702 (p = 0.018) and rs10777129 (p = 0.007), located in intron 1 of the KITLG gene, were significantly associated with hair colour (Table 3). Of the three SNPs, the rs1492354 AG genotype showed the highest contribution to both the red hair colour dimension a* values (P < 0.001, R² = 0.119) and the yellow hair colour dimension b* values (p = 0.001, R² = 0.103); the latter value was significant when uncorrected by permutations. The L* axis also showed a trend towards lighter hair colour associated with the A genotype, although this is not statistically significant (Figure 3). Associations were also found between hair colour (a* values) and the two other intron 1 SNPs, rs1907702 and rs10777129. All three SNPs are in linkage disequilibrium in the population, with D' and r² values as follows: rs1492354 and rs1907702 (D' = 1.0, r² = 0.33), rs1492354 and rs10777129 (D' = 0.88, r² = 0.74), rs1907702 and rs10777129 (D' = 1.0, r² = 0.34). When retyped in the Danish population, the SNPs rs10777129 (p = 0.02, OR = 3.0) and rs1492354 A (p = 0.04, OR = 2.5) were significantly associated with light hair colour (Table 4). In both the Scottish and the Danish populations, the alleles associated with lighter colour had similarly low frequencies (rs1492354: 0.08 and rs10777129: 0.07), and given the sample sizes only one rs10777129 GG homozygote was observed, in a Dane.

The OCA2 gene and eye and hair colour
SNPs in or close to the OCA2 gene have been reported to be associated with hair and eye colour. However, reported associations are with different regions of the gene, including coding variations and SNPs within the adjacent HERC2. In the Scottish population, associations between eye colour and tagging SNPs were found for SNPs in introns 1, 2, 4, 6, 16, 18 and 23 of OCA2 (Table 4). Associations with hair colour were also found for some of these same SNPs, and others, located in introns 1, 2, 4, 5, 19 and 23 (Table 3). The intron 1 SNPs rs7495174 (p = 0.02, R² = 0.051) and rs7174027 (p = 0.005, R² = 0.075) were associated with the L* values, whereas SNPs in introns 2, 4, 5 and 19 were associated exclusively or more strongly with the a* values. The majority of SNPs associated with hair colour were located in intron 23, and SNPs in this intron were associated with both a* and L* values. The markers with the highest correlation were found in intron 23: rs6497233 (p = 0.0004, R² = 0.116) and rs11631195 (p = 0.0002, R² = 0.126) (Table 3). Subsequently, analysis of variance of the rs11631195 AA genotype versus the collapsed AG and GG genotypes showed significant association with the black/white dimension, L* value (p = 0.001, R² = 0.094), the red dimension, a* value (p = 0.009, R² = 0.064), and the yellow dimension, b* value (p = 0.001, R² = 0.107). LD analysis of the data reveals that in this population the SNPs in introns 1 to 5 are in linkage disequilibrium, as are those in introns 19 to 23 (data not shown). Thus there appear to be two separate regions of association with hair and eye colour in the OCA2 gene.
As we and others have previously published, a SNP in a putative regulatory element for OCA2, located about 20 kb upstream within the HERC2 gene, is associated with eye colour [4-7] and hair colour [4]. In the Danish population (N = 378) analysed here, a significant association was also observed between dark hair colour and SNPs in HERC2 (rs916977, rs1129038, rs2238289 and rs7170852) (p = 2.0 × 10⁻⁶ to 8.3 × 10⁻⁵, OR = 3.5-4.0). However, the OCA2 coding SNPs R419Q (rs1800407) and R305W (rs1800401) were not significantly associated with hair colour (Table 5), although the minor allele frequencies were low (0.04 for each), which limits the power to detect associations in populations of this size.
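The relationship between D' and r² reported above can be illustrated with a short Python sketch. It assumes phased haplotype counts for two biallelic SNPs (alleles A/a and B/b), which is a simplification: Haploview estimates haplotype frequencies from unphased genotypes. The function name is hypothetical and the counts are invented.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D', r squared) from counts of the four two-SNP haplotypes."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n   # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / n   # frequency of allele B at the second SNP
    d = n_AB / n - p_A * p_B  # raw disequilibrium coefficient D
    # D' rescales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d ** 2 / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d_prime, r_squared


# Hypothetical haplotype counts, for illustration only.
d_prime, r2 = ld_stats(n_AB=10, n_Ab=0, n_aB=20, n_ab=70)
print("D' =", round(d_prime, 2))
print("r squared =", round(r2, 2))
```

In this example one of the four haplotypes is absent, so D' reaches its maximum, yet r² stays low because the two SNPs have different allele frequencies. The same pattern appears above for rs1492354 and rs1907702 (D' = 1.0 with a much lower r²).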

The MC1R gene and hair colour
Sequence variations in MC1R in the Scottish population were categorised into the two allelic groups R and r. The R group was significantly associated (p = 2.0 × 10⁻⁶, single-step adjusted Bonferroni correction) and strongly correlated (R² = 0.317) with a* values characteristic of red hair colour. The MC1R 'r' group was significantly associated with low a* values (p = 0.005, R² = 0.007) when R and consensus sequences were collapsed. Analysis of variance of the MC1R R/R genotype showed significant association with both the red dimension, a* value (P < 0.001, R² = 0.367), and the yellow dimension, b* value (P < 0.001, R² = 0.163). No association was seen with the L* (light-dark) dimension. In addition, the Danish population excluded individuals with red hair, which permitted an analysis of association between individual MC1R variants and light hair. No association was seen, reinforcing the notion that MC1R affects the colour but not the intensity of hair pigmentation.

The HPS3 gene and hair colour
We initially identified in the Scottish population a hitherto unreported association between the HPS3 gene and the red/yellow colour axes. Five SNPs (rs4681169, rs16861514, rs16861552, rs6785780 and rs7636389), all in LD in this population, were significantly associated with the red dimension a* value, of which rs6785780 (p = 0.003, R² = 0.079) was the most strongly associated (Table 3). When replicated initially in a subset (N = 210) of the Danish population, the association between the rs6785780 T allele and light hair colour was also significant (p = 0.04, OR = 1.6) (Table 4). However, an extended analysis of this SNP in the larger (N = 378) Danish population failed to find an association. Our data do not, therefore, support a role for HPS3 in hair colour variation, but suggest that it may be worth further investigation.

The TYR and SLC45A2 genes and hair colour
SNPs within the tyrosinase (TYR) and the SLC45A2 (MATP) genes have previously been shown to associate with hair colour [15,16]. We therefore paid particular attention to these genes in our populations. The SNP rs12421746 in the TYR gene was significantly associated with blond hair colour in the Danish population (p = 0.04, OR = 3.1) (Table 6), but we did not replicate this association with any of the colorimetric values in the Scottish population. In the Scottish population, the minor allele frequency in TYR was 0.012, while it was 0.04 in the Danish population. Again, this low allele frequency will restrict the ability to detect small effects in a population sample of the size studied here.
Likewise, no statistically significant association was observed between tagging SNPs in SLC45A2 and hair colour in the Scottish population, whereas in the Danish population (N = 378), the one coding SNP rs16891982 (F374L) (p = 0.005, OR = 7.0) was associated with dark hair colour (Table 5).

Discussion
We have used quantitative hair colour on a tristimulus L*, a* and b* scale not only to associate genetic markers with hair colour, but also to evaluate the accuracy of the inspected hair colours. The two clusters of light or dark hair colour based on colorimetric analysis were well separated, and good correlation was observed between inspections (88-94%) and quantitative hair measures. Further sub-classification of these groups did not satisfactorily correlate with the groupings. Shekar and co-workers observed 97% correct classification in two groups, whereas only 73.1% were correctly assigned using observer-reported colour in six groups [2]. In total, robust associations between hair colour and five genes (MC1R, KITLG, TYR, OCA2 and SLC45A2) were observed. SNPs in or close to all of these have previously been reported by others as showing associations.
Among the numerous MC1R variants [13], those classified as "R" are highly penetrant red-hair alleles. In the Scottish population the MC1R R/R genotype showed the strongest contribution to the variance of the correlated red and yellow dimensions, a* (R² = 0.367) and b* (R² = 0.163) (Table 3).
Associations with hair colour were also observed with SNPs in intron 1 of the KITLG gene, in both the Scottish (a* dimension) and Danish populations. Another SNP near the KITLG gene, rs12821256, has also been shown to be associated with hair colour in European populations from Iceland, the Netherlands and the United States [4,15]. This SNP is located several hundred kilobases 5' of the KITLG gene and does not appear to be in LD with those we have analysed in intron 1 (D' = 1, r² = 0.01). Unfortunately this SNP was not typed in our populations, so we are unable to determine whether those we analysed show a better correlation with hair colour than this previously reported one. KITLG encodes stem cell factor, the ligand of the KIT receptor, which is essential for normal melanocyte proliferation and development [24]. Mutations of KITLG in mice result in deficits in melanocytes and unpigmented patches in the skin and hair, and it is not unreasonable to expect that variation in expression or function of the gene in humans could result in variation of melanocyte number in the hair follicles.
The OCA2 gene was first identified in mice, in which mutations of the gene result in a pale coat, and was later shown to be mutated in patients with tyrosinase-positive albinism. The function of the gene product is not unequivocally established, but it is related to a transporter protein family which has 12 transmembrane domains and is localised to the melanosome [25]. Associations of OCA2 and HERC2 SNPs with hair and eye colour were found in this study, in accordance with previous reports of linkage or association [3-7]. The major contribution to eye colour was conveyed by two HERC2 SNPs, rs1129038 and rs12913832, which lie about 20 kb upstream of the OCA2 gene and which were in almost perfect linkage disequilibrium [5,6]. Association was observed between hair colour and rs1129038 in the Danish population (p = 2.0 × 10⁻⁶, OR = 3.5), but two SNPs, rs2238289 and rs916977, that were in complete linkage disequilibrium with rs1129038 (D' = 1, r² = 0.62 and D' = 1, r² = 0.64) were slightly more strongly associated with dark hair colour (p = 8.3 × 10⁻⁵, OR = 3.7 and p = 2.2 × 10⁻⁵, OR = 4.0). These results support earlier results from Shekar and co-workers, who demonstrated OCA2 haplotypes to be more strongly associated with hair colour than rs1129038 and rs12913832 [3]. By contrast, the strongest associations with hair colour in the OCA2 gene in the Scottish population were accounted for by two SNPs in intron 23 on the red dimension (a*): rs6497233 (p = 0.0004, R² = 0.116) and rs11631195 (p = 0.0002, R² = 0.126). Linkage disequilibrium was observed between these markers (D' = 0.94, r² = 0.79). The association results from the HERC2/OCA2 region suggest that the molecular mechanism affecting eye colour may not be the same as that for hair colour.
Overall in the Danish population, the strongest association with hair colour was observed with the SLC45A2 missense variant rs16891982 (OR = 7.0), followed by the HERC2 (upstream of OCA2) variant rs2238289 (OR = 4.0) and the KITLG variant rs1492354 (OR = 3.0). Associations between hair colour and SNPs in TYR and SLC45A2 were only observed in the Danish population. Mutation of either of these genes in mice or humans results in albinism. TYR encodes the rate-limiting enzyme required for melanin synthesis, whilst SLC45A2 encodes a solute transporter whose substrate is unknown [25]. Absence of observed association with these two genes in the Scottish population may well be due to the limited population sizes. The Danish population has power of 82% to detect at p = 0.05 an effect of this TYR SNP with the observed odds ratio of 3.1 and minor allele frequency of 0.04.
We tentatively suggest a novel association between hair and eye colour and the HPS3 gene. Seven SNPs across the whole gene showed significant association in the Scottish population. However, although the association was initially replicated in a subset of the Danish population, we no longer detected it when the analysis was extended to the whole sample. HPS3 belongs to a group of genes that are involved in biogenesis and/or maturation of multiple cellular organelles including lysosomes and melanosomes, the site of melanin synthesis and export. In the mouse, at least seventeen genes have been identified as HPS-like genes, with a mutant phenotype comprising reduced pigmentation and a long bleeding time [26]. Seven of these genes have been demonstrated to be mutated in forms of the inherited human disorder Hermansky-Pudlak syndrome (HPS), which is a syndrome with multiple disorders including oculocutaneous albinism, bleeding tendency and lysosomal dysfunction [27]. Patients with HPS type 3, who carry mutations in HPS3, usually have mild symptoms, and mice mutated in the orthologous gene (Hps3) have the cocoa phenotype, which produces a lighter coat colour and prolonged bleeding time but does not include a lysosomal disorder [28-30]. We suggest that variation within the human orthologues of the 17 mouse HPS-like genes merits further detailed analysis as candidates for contributing to hair colour variation.

Conclusions
We have found novel associations between SNPs in pigmentation genes previously shown to play a role in human hair and eye colour and colorimetrically assessed hair colour in a Scottish and a Danish population.