Genome-wide association of milk fatty acids in Dutch dairy cattle

Background Identifying genomic regions, and preferably individual genes, responsible for genetic variation in milk fat composition of bovine milk will enhance the understanding of biological pathways involved in fatty acid synthesis and may point to opportunities for changing milk fat composition via selective breeding. An association study of 50,000 single nucleotide polymorphisms (SNPs) was performed for even-chain saturated fatty acids (C4:0-C18:0), even-chain monounsaturated fatty acids (C10:1-C18:1), and the polyunsaturated C18:2cis9,trans11 (CLA) to identify genomic regions associated with individual fatty acids in bovine milk. Results The two-step single SNP association analysis found a total of 54 regions on 29 chromosomes that were significantly associated with one or more fatty acids. Bos taurus autosomes (BTA) 14, 19, and 26 showed highly significant associations with seven to ten traits, explaining a relatively large percentage of the total additive genetic variation. Many additional regions were significantly associated with the fatty acids. Some of the regions harbor genes that are known to be involved in fat synthesis or were previously identified as underlying quantitative trait loci for fat yield or content, such as ABCG2 and PPARGC1A on BTA 6; ACSS2 on BTA 13; DGAT1 on BTA 14; ACLY, SREBF1, STAT5A, GH, and FASN on BTA 19; SCD1 on BTA26; and AGPAT6 on BTA 27. Conclusions Medium chain and unsaturated fatty acids are strongly influenced by polymorphisms in DGAT1 and SCD1. Other regions also showed significant associations with the fatty acids studied. These additional regions explain a relatively small percentage of the total additive genetic variance, but they are relevant to the total genetic merit of an individual and in unraveling the genetic background of milk fat composition. Regions identified in this study can be fine mapped to find causal mutations. The results also create opportunities for changing milk fat composition through breeding by selecting individuals based on their genetic merit for milk fat composition.


Background
The biosynthesis of bovine milk fat is a complicated process regulated by many genes belonging to several pathways [1]. Genetic analyses of bovine milk fatty acids have shown heritable variation. Short and medium chain fatty acids (C4:0 up to and including C16:0), which are synthesized de novo in the mammary gland, have moderate to high heritability [2,3]. Long chain fatty acids (i. e. C16:0 and higher) are derived from blood lipids that originate mainly from the diet and endogenous lipids, nevertheless, they have low to moderate heritability [2][3][4]. Identification of genomic regions, and preferably individual genes, responsible for genetic variation in milk fat composition will enhance the understanding of biological pathways involved in fatty acid synthesis and may point towards opportunities for changing milk fat composition via selective breeding. Candidate gene studies have shown that polymorphisms in diacylglycerol O-acyltransferase 1 (DGAT1 K232A) [5] and stearoyl-CoA desaturase 1 (SCD1 A293V) [6] have important effects on milk fat composition [7][8][9][10][11][12]. Many genes are involved in the biosynthesis of milk fat, and analyzing these candidate genes one by one in a candidate gene approach is not an option; therefore, quantitative trait loci (QTL) studies try to identify regions associated with milk fat composition to identify candidate genes that are worth considering.
In order to identify genomic regions involved in the biosynthesis of milk fat, Schennink et al. [13] and Stoop et al. [14] performed genome-wide linkage analyses of milk fatty acids and detected genome-wise significant QTL and several suggestive QTL. Other linkage studies have been performed for single chromosomes [15] or the fat composition of adipose tissue in beef cattle [15][16][17][18].
Recent developments in molecular genetics have made it possible to perform genome-wide association studies using thousands of single nucleotide polymorphism (SNP) markers to detect QTL. A genome-wide association study has higher power to detect QTL and provides more precise estimates of QTL locations compared to a linkage study. Some genome-wide associations for routinely evaluated traits in dairy cattle, such as milk production and fertility, have been published [19][20][21][22]. To the best of our knowledge, no genome-wide association study of milk fatty acids has been reported.

Phenotypes
The fat composition of winter milk samples from 1,905 first-lactation Dutch Holstein Friesian cows was available for this study. The cows were housed on 398 commercial farms throughout the Netherlands. At least three cows were sampled per farm. The cows were between 63 and 282 days in milk. The period of negative energy balance in early lactation was avoided by choosing cows over 63 days in lactation. The population consisted of five large paternal half-sib families from proven sires (200,199,195,176, 101 daughters per sire) and 50 small paternal half-sib families from test-sires (10-24 daughters per sire), as well as 190 cows descending from 45 other proven sires (1-30 daughters per sire). The pedigree of the cows was supplied by CRV (Cooperative cattle improvement organization, Arnhem, the Netherlands) and consisted of 26,300 animals.
Milk fat composition was measured by gas chromatography. Many fatty acids were measured, but only the major fatty acids are reported here: even-chain saturated fatty acids C4:0 to C18:0, even-chain (cis9) monounsaturated fatty acids C10:1 to C18:1, and the polyunsaturated fatty acid CLA. The fatty acids were expressed in terms of weight-proportion of total fat weight (w/w%). In total, these fatty acids made up 89% of the total fat. Table 1 presents the mean, phenotypic standard deviation, and intra-herd heritability for the fatty acids included in this study. More detailed information about the population and phenotypes can be found in Stoop et al. [3].

Genotypes
Blood samples were collected from the cows for DNA isolation. The cows were genotyped using a custom Infinium Array (Illumina, San Diego, CA, USA) designed by CRV. In total, 1,810 cows were successfully (call rate > 90%) genotyped. The cows were genotyped for 50,855 technically successful SNPs. The assumed map positions of the SNPs were based on the bovine genome assembly BTAU 4.0 [23]. From these 50,855 SNPs, a total of 776 SNPs could not be mapped to any of the Bos taurus (BTA) chromosomes and were assigned to BTA 0. In addition, 591 of the SNPs were located on the X chromosome. The SNPs on BTA 0 and the X chromosome were included in the study. The average distance between SNPs was 52,452 bp. Monomorphic SNPs (n = 245), SNPs with a genotyping rate < 80% (n = 383), and SNPs with a genotype frequency < 0.006 (1-9 observations for one of the genotype classes, SNPs with two genotype classes instead of three were included in the final marker set; n = 5,494) were discarded from the original SNP set of 50,855 SNPs, resulting in the final marker set of 44,733 SNPs used for the association analysis. Table 2 provides an overview of the number of SNPs available for the association study per chromosome.

Statistical analysis
For the association study, both phenotype and genotype information was available for 1,706 individuals. A twostep single SNP association analysis was performed. In the first step, the genome was screened for interesting regions using a general linear model. In the second step, the interesting regions were verified using an animal model. In the first step, a genome-wide association study was performed with a general linear model using the R package 'SNPassoc' [24]. In this step, the analyzed phenotypes were pre-adjusted for systematic environmental effects, and the general linear model accounted for the SNP effect and the effect of sire. The general linear model used in the first step was: where y* was the phenotype adjusted for the systematic environmental effects; sire was the fixed effect of sire; SNP was the fixed effect of SNP genotype; and e was the random residual. Sire effect was included in the SNPassoc model to account for paternal half-sib relations. Phenotypes were adjusted for days in milk, age at first calving, calving season, and herd. Adjusted phenotypes were obtained from the phenotypes of 1,905 cows using an animal model in ASReml [25]: where y was the (unadjusted) phenotype; μ was the overall mean; dim was the covariate describing the effect of days in milk; afc was the covariate describing the effect of age at first calving; season was the fixed effect of the class of calving season (June-Aug 2004, Sept-Nov 2004, or Dec 2004-Jan 2005); scode was the fixed effect accounting for differences in genetic level between groups of proven bull daughters, young bull daughters, and other bull daughters; herd was the random effect of herd, distributed as N(0, I σ 2 herd ), with identity matrix I and herd variance σ 2 herd ; animal was the random additive genetic effect of the individual, distributed as N(0, A σ 2 a ), with the additive genetic relationship matrix A and the additive genetic variance σ 2 a ; and e was the random residual, distributed as N(0, I σ 2 e ), with identity matrix I and residual variance σ 2 e . The genome-wide false discovery rate (FDR) was controlled according to the method described by Storey and Tibshirani [26], by separately calculating the genomewide FDR based on the P-values from the general linear model for each trait using the R package 'qvalue'. Associations with a genome-wide FDR < 0.05 for the general linear model were considered significant.
The first step was performed to identify interesting regions, which were then further analyzed with an animal model to account for all relationships among individuals. Including a polygenic effect and accounting for genetic relationships would be more appropriate [27]. The model including a polygenic effect is computationally demanding when analyzing many traits, SNPs, and animals; therefore, in the second step we only analyzed the regions that contained multiple SNPs that were significant in the first step.
A region started at the first significant SNP on a chromosome that was followed by an additional significant SNP within 10 Mbp; the region was extended as long as another significant SNP occurred within 10 Mbp from the previous one and ended at the last significant SNP that was not followed by another significant SNP within the next 10 Mbp. Thus, a region contained at least two significant SNPs. More than one region could be present on the same chromosome when there were groups of significant SNPs located within 10 Mbp from each other but further than 10 Mbp from the other region(s) on the chromosome. The 10 Mbp distance between significant SNPs is rather large, but it was chosen to prevent having many small regions on one chromosome, each containing a small number of significant SNPs.
In the second step, all SNPs in regions with significant effects were analyzed using animal model (2) extended with an SNP effect in ASReml [25]. In this model the phenotypes were simultaneously adjusted for systematic environmental effects, for all genetic relationships among individuals, and for the SNP genotype. Associations with a -log 10 (P-value) ≥ 3 were considered significant.
The genetic variance explained by an SNP was calculated from the estimated genotype effects from animal model (2) extended with an SNP effect and the observed genotype frequencies. The result was expressed as a percentage of the total additive genetic variance. These percentages can be overestimated, especially when the effect of an SNP is small, this is due to the so called Beavis effect [28]. The percentage of the total additive genetic variance explained by the most significant SNP per trait per region is reported. The most significant SNP can differ per trait for a region associated with multiple traits.

Results
In the first step of the single SNP association study, all SNPs were analyzed using a general linear model. The analysis resulted in many significant (FDR < 0.05) associations between SNPs and the studied fatty acids. Figure 1 shows the genome-wide plots of -log 10 (P-values) for all of the studied fatty acids. All analyzed fatty acids showed significant associations with at least one genomic region.
From the results of the general linear model, a total of 64 significant regions were defined, affecting one or more (up to 10) traits. Significant associations with SNPs were detected on all chromosomes; however, on BTA 29 we identified only a single SNP associated with a trait instead of multiple SNPs in a region. Therefore, no regions were defined on BTA 29. These 64 regions were analyzed with an animal model and 54 remained significant at -log 10 (P-value) ≥ 3. Thus, most of the regions identified with the general linear model were confirmed with the animal model. The correlation between P-values in both analyses was 0.90. All results mentioned hereafter originate from the animal model and concern the 54 regions that remained significant at -log 10 (P-value) ≥ 3 with the animal model.
Significant associations were detected with several regions for most of the fatty acids, especially C14:0 (19 regions) and C16:1 (18 regions), whereas C18:0 and C12:0 were associated with only one region ( Table 3). The X chromosome showed significant association with C14:0. Table 3 shows that several regions affect more than one trait. In particular, the medium chain unsaturated fatty acids were often associated with the same region.
Sixteen of the unmapped SNPs showed significant associations with one or more of the studied fatty acids. Comparing the sequence of these significant unmapped SNPs against genome assembly UMD 3.0 [29] identified 14 of them as mapping to regions 14a, 19b, and 26, which were already identified as being associated with the traits. The other two unmapped SNPs (ULGR_BTA-38166 and ULGR_BTA-38169) were significantly associated with C10:0 and located at 24.2 Mbp on BTA 16 according to the UMD 3.0 genome assembly [29].
For each region, we estimated the variance explained by the most significant SNP for each trait. Within a region, this most significant SNP can be a different SNP per trait. The percentage of total additive genetic variance explained by the most significant SNP per region ranged from 0.9 for C10:0 on BTA 1 to 61.9 for C18:1 on BTA 14 (Table 3). These percentages can be overestimated, especially when the effect of an SNP is small, this is due to the so called Beavis effect [28]. The QTL with large effects, such as region 14a and 26, are less likely overestimated.

Discussion
This study is the first to report a genome-wide association study of bovine milk fatty acids. A two-step single SNP association analysis was performed. In the first step, the genome was screened for interesting regions using a general linear model. In the second step, the interesting regions were verified using an animal model. The animal model accounted for all relationships between individuals in the pedigree, whereas the general linear model accounted only for the paternal half-sib structure of the phenotyped cows. Ignoring relationships between individuals can cause false positive associations [30]; therefore, the animal model was applied to verify the results from the general linear model.
The results showed that DGAT1 and SCD1 are highly associated with several of the fatty acids. The results also showed that, for some traits, other regions were more significantly associated, such as BTA 19 for some short and medium chain saturated fatty acids. In addition, many other regions were associated with fatty acids, but with smaller effects. Table 3 Regions significantly (-log 10 (P-value ≥ 3)) associated with fatty acids and the percentage of the total additive genetic variance explained (Continued) 26 2  Schennink et al. [13] and Stoop et al. [14] reported a linkage study of milk fat composition using some of the same data used in the present study. In the linkage study, 1,341 SNPs were genotyped in 849 cows and their seven sires. Schennink et al. [13] and Stoop et al. [14] detected genome-wide significant QTL on BTA 6, 14, 19, and 26 for fatty acids included in the present study. The QTL on BTA 14, 19, and 26 were confirmed in our study. Our results suggest a QTL on BTA 6 for C6:0 and C8:0 (Figure 1A), but for C6:0 only one SNP exceeded the FDR threshold, and for C8:0 none of the SNPs exceeded the FDR threshold. Therefore, this region was not included in the animal model analysis in our second step. Given the quite stringent threshold, this region is still likely to harbor a QTL for C6:0 and C8:0.
The suggestive QTL found by Schennink et al. [13] and Stoop et al. [14] on BTA 2, 13, 14, 17, 19, and 26 was also confirmed in our study. Other suggestive QTL reported in the studies could not be confirmed, though in some cases a QTL was indicated, but this did not pass the threshold. In addition, several novel regions significantly associated with fatty acids were detected in our study but not reported by Schennink et al. [13] and Stoop et al. [14].
Some studies of the fat composition of adipose tissue in beef cattle were confirmed by our findings regarding milk fat composition. This result indicates that these regions are not unique for milk fat composition. Our findings confirmed the QTL detected by Morris et al. [18] in subcutaneous fat from beef cattle: C16:1 on BTA 1; C14:0, C16:1, and C18:1 on BTA 19; C14:0 and C14:1 on BTA 26; and C14:0 on BTA 27. Our findings confirmed the QTL detected by Abe et al. [17] in back fat, intermuscular fat, or intramuscular fat from beef cattle: C14:0 and C18:1 on BTA 19. Our findings confirmed the QTL detected by Alexander et al. [16] in the longissimus dorsi of beef cattle: CLA on BTA 7. Our findings confirmed the QTL detected by Uemoto et al. [31] in intramuscular adipose tissue from beef cattle: C18:1 on BTA 19.

Major regions
Three major regions were detected in this genome-wide association study (regions 14a, 19b, and 26), with significant effects on milk fat composition. These regions showed highly significant associations with several fatty acids. The regions on BTA 14 and 26 are regions that have been studied previously, and our results confirm the earlier findings [5][6][7][8][9][10][11][12]. The region on BTA 19 has not been studied extensively in relation to milk fat composition, but it harbors a number of candidate genes involved in fatty acid synthesis.
For C10:1 and C14:1, the most significant SNP (ULGR_BTC-068225) was located at 3.0 Mbp, and for C12:1 the most significant SNP (ULGR_BTC-067423) was located at 3.7 Mbp. However, for C10:1 and C12:1, the DGAT1 K232A SNPs were also significant. After correcting the phenotypes for the effect of the DGAT1 K232A polymorphism, the -log 10 (P-values) of these most significant SNPs decreased from 6.50 to 2.30 for C10:1, from 5.44 to 2.52 for C12:1, and from 4.63 to 2.40 for C14:1. The LD between these most significant SNPs and the DGAT1 K232A SNPs was moderate (r 2 = 0.26 and 0.31). Although the associations of C10:1, C12:1, and C14:1 were just below the significance threshold after correcting for the DGAT1 K232A genotype, the findings still suggest that an additional QTL may be present for the medium chain unsaturated fatty acids on BTA 14.
The region on BTA 26 that showed an association with C10:0, C14:0, C16:0, C10:1, C12:1, C14:1, and C16:1 is the region harboring SCD1, which is known to be associated with the desaturation of fatty acids [7,8,10,11]. For all of these traits, except C16:0, the most significant SNP on BTA 26 corresponded to the nucleotide substitution in SCD1 that causes an A to V amino acid substitution (SCD1 A293V) at 21.1 Mbp. The A allele was associated with larger fractions of C10:1, C12:1, and C14:1, and with smaller fractions of C10:0, C14:0, and C16:1. Thus, the A allele resulted in more C10:1 and C14:1 at the cost of C10:0 and C14:0. A similar effect was found for C12:0 and C12:1, though C12:0 was not significantly associated with the SNP. The opposite effect was found for C16:0 and C16:1; the A allele resulted in less C16:1 and more C16:0, though C16:0 was not significantly associated with the SNP. The SCD1 gene codes for the SCD enzyme, which desaturates saturated fatty acids to Δ9 unsaturated fatty acids in the mammary gland [32]. The association of this SNP with, and its effects on, the medium chain unsaturated fatty acids and their equivalent saturated fatty acids is, therefore, in agreement with the function of the enzyme. The associations we identified for the medium chain unsaturated fatty acids confirm previous studies on the effect of the SCD1 A293V polymorphism on milk fatty acids [8,10,11]. The associations we identified for the medium chain saturated fatty acids confirm only the results of Schennink et al. [10], who used the same population as the present study.
The SCD1 A293V SNP was not significant for C16:0 (-log 10 (P-value) = 1.09), which also confirms previous studies [7,10,11]. The most significant SNP for C16:0 on BTA 26 was located at 28.8 Mbp. This SNP was not in LD with the SCD1 A293V SNP (r 2 = 0.08), and correcting for SCD1 A293V had little effect on the significance of the SNP associated with C16:0 (-log 10 (P-value) decreased from 5.52 to 4.46). Also, one allele of this SNP is associated with larger fractions of C10:1, C12:1, C14:1, and C16:0, and with smaller fractions of C10:0, C12:0, C14:0, and C16:1, suggesting that it has something to do with desaturation, but it was only significantly associated with C16:0. We did not identify obvious candidate genes in this region.
The region on BTA 19, at 32.7-61.8 Mbp, showed associations with C4:0, C8:0, C10:0, C12:0, C14:0, C16:1, C18:1, and CLA, i.e. mainly with the short and medium chain saturated fatty acids and with the long chain unsaturated fatty acids. No significant effects were found for C6:0, but a QTL was indicated below the threshold in the region on BTA 19 ( Figure 1A). Morris et al. [15] performed a linkage analysis of milk fatty acids on BTA 19, detecting QTL for C8:0, C10:0, C12:0, C14:0, C18:1, and C18:2, which was confirmed by our findings and suggested fatty acid synthase (FASN, at 52.2 Mbp) as a candidate gene responsible for the observed effect. In addition to FASN, several other genes located within the region on BTA 19 are involved in the biosynthesis of milk fat, including sterol regulatory element binding transcription factor 1 (SREBF1, at 35.7 Mbp), ATP citrate lyase (ACLY, at 43.4 Mbp), signal transducer and activator of transcription 5A (STAT5A, at 43.7 Mbp), and growth hormone (GH, at 49.7 Mbp). These genes are all candidate genes because SNPs in the whole region showed an association with the traits, perhaps in LD with mutations in genes not captured by our marker set. The strongest association was found near FASN, but also near some other candidate genes as discussed below.
A haplotype of five FASN SNPs has been shown to be significantly associated with C6:0, C8:0, C10:0, C12:0, C14:0, and C18:1 in Friesian-sired cows [15], which are almost the same traits for which we found an association in the region, though not specifically with SNPs in FASN. This finding suggests a QTL in this region with an effect on short and medium chain fatty acids and long chain unsaturated fatty acids, but whether it is actually FASN that causes the association remains unclear. Our genome-wide scan showed that an SNP outside of FASN is the most significant SNP for C14:0 and was also associated with some of the other traits. This SNP showed very strong association with C14:0, whereas the association of C14:0 with the SNP in FASN was just barely significant. Candidate gene studies have shown that FASN is mainly associated with C14:0, but we found associations with additional traits, similar to the haplotype findings of Morris et al. [15]. Perhaps the causal mutation is located outside of FASN and is mainly the effect on C14:0 strong enough to be detected by SNPs in LD with this mutation.
No SNPs were located in the other candidate genes on BTA 19. The SNP showing the strongest association with C16:1 was located at 43.8 Mbp on BTA 19 (BFGL-NGS-111365), which is 13,873 bp from STAT5A. The SNPs neighboring ACLY and STAT5A showed significant associations with several of these traits, especially C14:0. All SNPs in the regions seem to be detecting the same effect, which is strongest with C14:0. Although previous studies suggested FASN as the candidate gene for association, which of the candidate genes causes the effect remains debatable. The causal mutation might even be in a gene not considered here.

Additional regions
In addition to the three major regions mentioned above, many additional regions showed associations with fatty acids (see Figure 1). We will not discuss all of the regions in this paper, but what follows are some select regions that showed association with three or more traits.
On BTA 6, region 6a was associated with C12:1, C14:1, and C16:1 ( Table 3). The SNP effects were in the same direction for all three fatty acids. This region harbors the genes ATP binding cassette, subfamily G, member 2 (ABCG2, 37.4 Mbp) and peroxysome proliferatoractivated receptor-gamma coactivator-1alpha (PPARGC1A, 44.8 Mbp). The SNPs most significantly associated with C12:1 (-log 10 (P-value) = 5.03, at 44.3 Mbp), C14:1 (-log 10 (P-value) = 4.39, at 41.2 Mbp), and C16:1 (-log 10 (P-value) = 4.63, at 40.2 Mbp) were located between these candidate genes. ABCG2 has been associated with milk fat yield and percentage [35,36]. One SNP in our marker set was located in ABCG2 and showed no significant associations with the studied fatty acids. PPARGC1A has been associated with milk fat yield in German Holsteins [37], but this was not confirmed by Khatib et al. [38] in two larger American Holstein populations. Up to 10 SNPs in our marker set were located in PPARGC1A, but none of these were significantly associated with C12:1, C14:1, or C16:1. However, one of the 10 SNPs in PPARGC1A showed an association with C16:1, which was just below the significance threshold (-log 10 (P-value) = 2.86). In a candidate gene study, Schennink et al. [33] found a significant association in the same population as in the present study for two SNPs in PPARGC1A that were not included in our marker set: PPARGC1A 1790+514 with C16:1 and PPARGC1A 1892+19 with C14:1. Our genomewide scan indicates that not PPARGC1A, but another region has the strongest effect on the unsaturated medium chain fatty acids.
Region 7a showed an association with almost all unsaturated fatty acids. In general, the SNP effects on C18:1 were in the opposite direction of the SNP effects on C10:1, C12:1, and C14:1. SNPs in this region were either associated with C10:1, C12:1, and C14:1, or with C18:1, suggesting one QTL for the medium chain unsaturated fatty acids and another for C18:1. The SNP effects on C10:0, C12:0, and C14:0 were in the same direction as the effects on C10:1, C12:1, and C14:1, but they were not significant. The SNP effects on C18:0 were in the same direction as the effects on C18:1, but they were not significant. Although only unsaturated fatty acids were significantly associated with this region, the SNP effects suggested that this QTL has nothing to do with desaturation because the SNP effects on unsaturated fatty acids were in the same direction as the effects on their saturated equivalents.
Region 7b showed associations with C14:0, C18:0, C12:1, and C14:1. The most significant SNP for each trait was different, but they were located in the same neighborhood, around 64.0-64.1 Mbp, and were in high LD with one another (r 2 = 0.53-0.97). This finding indicates the likelihood of one QTL in this region with an effect on these four traits. The effects of these most significant SNPs were in the opposite direction for C18:0 compared to C14:0, C12:1, and C14:1. The SNP effects on C10:0 and C12:0 were in the same direction as the effects on C14:0, C12:1, and C14:1, but they were not significant. The SNP effects on C16:0, C18:1, and CLA were in the same direction as the effects on C18:0, but they were not significant. These SNP effects suggest that this QTL has something to do with a trade-off between long chain fatty acids and de novo synthesis of medium chain fatty acids. No candidate genes were located in this region.
On BTA 13, region 13 was associated with C6:0, C8:0, C10:0, C14:1, and C16:1 (Table 3). This region confirms the QTL for C6:0, C14:1, and C16:1 detected by Stoop et al. [14], who also found that C8:0 and C10:0 showed a QTL around the same position as C6:0, but these QTL were just below the threshold and, therefore, not reported. For C6:0, C8:0, and C10:0, the same SNP, located at 64.8 Mbp (ULGR_SNP_BES11_Con-tig346_1209), was the most significant SNP in the region, and the SNP effect was in the same direction for all three traits. These traits have a high genetic correlation [3], which supports one QTL affecting these three short chain fatty acids. This SNP is located in acyl-CoA synthetase short-chain family member 2 (ACSS2), which activates acetate for de novo fatty acid synthesis [1]; thus, ACSS2 is a good candidate gene for a QTL with an effect on C6:0, C8:0, and C10:0. Given that this particular significant SNP is located in an intron (between exon 16 and 17) of ACSS2, this SNP is not likely the causal mutation, but it can be in high LD with the causal mutation. In addition, the region also had an effect on C14:1 and C16:1. The SNP effects for C14:1 and C16:1 were in the same direction, but these effects were in opposite direction of SNP effects on C6:0-C10:0. The SNP in ACSS2 had no significant association with C14:1 or C16:1, which indicates an additional QTL with an effect on these unsaturated fatty acids.
On BTA 27, region 27 was associated with C14:0, C16:0, C10:1, C12:1, C14:1, and C16:1 ( Table 3). The region on BTA 27 includes 1-acylglycerol-3-phosphate O-acyltransferase 6 (AGPAT6), which is the most abundant isoform of all AGPAT mRNA (~60%) in the mammary gland and involved in positioning fatty acids on the second position of the triglyceride backbone [1]. Given that 62.2% of C14:0 and 43.1% of C16:0 is located at the second position of the triglyceride backbone [39], this gene might be a candidate for this association. The SNP effects on C14:0 and C16:0 were in opposite directions, which suggests competition between C14:0 and C16:0 for the second position of the triglyceride backbone as an explanation for this association. In AGPAT6 knock-out mice, the composition of the triacylglycerol is altered and contains proportionally more polyunsaturated fatty acids at the expense of monounsaturated fatty acids [40], which might explain the effect of this region on the monounsaturated fatty acids. In general, the SNP effects on C10:1, C12:1, C14:1, and C16:1 were in the same direction. Table 3 shows that regions 14a and 26 explained a large portion of the total additive genetic variation of C10:1, C12:1, C14:0, C14:1, C16:0, C16:1, C18:1, and CLA. This variation is caused by DGAT1 and SCD1. For other traits, however, no regions had such large effects. The sum of the total additive genetic variance explained by the most significant SNP per region for C4:0-C12:0 and C18:0 was less than 20% of the total additive genetic variance. This finding suggests that these traits are influenced by many genes with small effects. Regions explaining roughly 1% of the total additive genetic variation or more were detected by the studied design (e.g., C10:0 on BTA1), but additional regions with either undetectable small effects or that were not dense enough in our marker set to detect the effect is likely.

Variance explained
Even though short chain fatty acids are produced by de novo synthesis, less than 20% of the total additive genetic variance was explained by the analyzed regions. FASN plays an important role in de novo synthesis, but less than 6.5% of the total additive genetic variance of short chain fatty acids is explained by the region harboring FASN. De novo synthesis of fatty acids requires several compounds in addition to FASN enzymes to elongate fatty acids. One of the compounds is acetate, which is activated by ACSS2 for de novo synthesis, ACSS2 was indicated as a candidate gene associated with C6:0-C10:0 in this study. More genes like this that assist FASN in de novo synthesis and, therefore, explain a portion of the total additive genetic variation is likely.
Also, genes involved in transport and triacylglyceride production might explain some of the total additive genetic variation.

Conclusions
A genome-wide association study of 50,000 SNPs was performed for milk fatty acids, resulting in many QTL. All over the genome regions were associated with milk fatty acids, some regions with just one fatty acid and other regions with multiple fatty acids. Milk fat composition is strongly influenced by polymorphisms in DGAT1 and SCD1, genes that have large effects on medium chain fatty acids and unsaturated fatty acids. Several regions showed associations with these milk fatty acids, but with smaller effects. The short chain fatty acids, C12:0 and C18:0, are not strongly affected by genes with large effects, but are influenced by regions with small effects. Some regions included candidate genes involved in milk fat synthesis pathways. On BTA 19, there were several genes involved in fat synthesis underlying the region associated with multiple fatty acids. Only in a few cases was an SNP associated with fatty acids actually located in a candidate gene. Regions identified in this study can be fine mapped to find causal mutations. The results also create opportunities for changing milk fat composition through breeding by selecting individuals based on their genetic merit for milk fat composition, which can be retrieved from the estimated SNP effects and the individual's genotype.