Polymorphisms in bovine immune genes and their associations with somatic cell count and milk production in dairy cattle

Background Mastitis, an inflammation of the mammary gland, is a major source of economic loss on dairy farms. The aim of this study was to quantify the associations between two previously identified polymorphisms in the bovine toll-like receptor 2 (TLR2) and chemokine receptor 1 (CXCR1) genes and mammary health indictor traits in (a) 246 lactating dairy cow contemporaries representing five breeds from one research farm and (b) 848 Holstein-Friesian bulls that represent a large proportion of the Irish dairy germplasm. To expand the study, a further 14 polymorphisms in immune genes were included for association studies in the bull population. Results TLR4-2021 associated (P < 0.05) with both milk protein and fat percentage in late lactation (P < 0.01) within the cow cohort. No association was observed between this polymorphism and either yield or composition of milk within the bull population. CXCR1-777 significantly associated (P < 0.05) with fat yield in the bull population and tended to associate (P < 0.1) with somatic cell score (SCS) in the cows genotyped. CD14-1908 A allele was found to associate with increased (P < 0.05) milk fat and protein yield and also tended to associate with increased (P < 0.1) milk yield. A SERPINA1 haplotype with superior genetic merit for milk protein yield and milk fat percentage (P < 0.05) was also identified. Conclusion Of the sixteen polymorphisms in seven immune genes genotyped, just CXCR1-777 tended to associate with SCS, albeit only in the on-farm study. The lack of an association between the polymorphisms with SCS in the Holstein-Friesian data set would question the potential importance of these variants in selection for improved mastitis resistance in the Holstein-Friesian cow.


Background
Mastitis, an inflammation of the mammary gland, is a major source of economic loss on dairy farms. The disease cost to the EU dairy industry was €1.5 billion in 2005 [1]. This is due to increased involuntary culling [2], therapeutic costs [3], reduced milk yield and changes in milk composition [4]. Although farm management practices play a major role in mastitis incidence and control [5], modern day breeding programmes aim to incorporate increased mastitis resistance in the dairy herd without compromising milk production [6]. With the exception of the Nordic countries [7], most countries are limited to select for animals with low somatic cell count (SCC) due to the lack of routinely available phenotypic data on clinical mastitis. Indeed it has been shown in sheep that animals bred for low somatic cell score (SCS; natural logarithm of SCC) possess a better ability to limit and eliminate mastitis infections than their high SCS counterparts [8]. Mastitis resistance/susceptibility is a complex multi-locus trait and unsurprisingly, QTL for mastitis and/or SCC have been identified on almost all chromosomes of the bovine genome [9]. With the development of high density single nucleotide polymorphism (SNP) chips, genome wide association studies (GWAS) are becoming popular for breeding programmes. While GWAS has advantages, large studies are needed to find significance after adjusting for multiple testing [10,11]. Furthermore, fine mapping is still needed to efficiently identify the candidate gene and/or causative mutation [11].
Given that the host response to mastitis is complex and involves many different genes and cellular pathways [12], genes coding for immune factors that detect and eliminate pathogens are strong potential markers. For example, Toll-like Receptor 4 (TLR4) is an innate immune protein on cell surfaces that recognizes lipopolysaccharide (LPS) of Gram-negative bacteria [13]. The TLR4-2021 (C/T) polymorphism, within the third exon, results in the substitution of threonine with isoleucine at amino acid position 674, located within the transmembrane-cytoplasmic domain [14]. The C allele associates with decreased SCS and higher lactation persistency in the daughters of Holstein-Friesian bulls [15]. Combined with this, TLR4 exhibits a differential expression during mastitis episodes [16,17] which prompted Ogorevc et al. [9] to propose TLR4 as a strong candidate for inclusion in cattle breeding programs to augment the accuracy of selection for mastitis resistance.
CXCR1 encodes a chemokine receptor on neutrophil surfaces which binds the proinflammatory cytokine interleukin 8 (IL8) with high affinity. The bovine CXCR1 gene maps to a position of 110,617,059,110 megabase-pairs (Mbp) on chromosome (BTA) 2 and codes for a protein of 360 amino acids in length. The CXCR1-777 (G/C) single nucleotide polymorphism (SNP) results in a glutamine to histidine substitution at amino acid position 245 located within the third intracellular loop of CXCR1 [18]. When CXCR1 is stimulated by IL8, this loop is implicated in G-protein binding and calcium signaling [19]. The CC genotype of CXCR1-777 associates with impaired neutrophil migration, impaired reactive oxygen species production in vitro and increased subclinical mastitis in vivo [18,20].
Other potential immune targets for incorporation into breeding programs include polymorphisms in the cluster of differentiation 14 (CD14), caspase recruitment domain 15 (CARD15) and IL8 genes. CD14 encodes a membrane-associated protein on the surface of cells which is involved in the recognition of LPS in conjunction with TLR4 [21]. Of the exonic SNPs identified in the CD14 gene, CD14-1908 (A/G) causes an asparagine to aspartic acid substitution at amino acid position 175 within the leucine rich repeat (LRR) domain involved in pathogen recognition [22]. Animals with the mutant variant have a higher percentage of neutrophils expressing CD14 molecules on their surface which may increase the speed of response to pathogen attack [22]. CARD15 is a cytosolic protein that initiates inflammation following microbe recognition [23] and plays an important role in nuclear factor-kappa B (NF-B) activation [24]. The CARD15-3168 (A/T) polymorphism, within exon 12, results in a leucine 125 to glutamine switch within the LRR domain and is associated with increased SCS values [25]. IL8 is a potent inducer of chemotaxis and attracts and activates neutrophils in the early stages of host defense against bacterial invasion [26]. Several IL8 SNPs have been identified [27] and associations with milk fat yield and udder depth have been reported [28].
SERPINA1 encodes a serum protease inhibitor, whose primary role is to protect tissues against neutrophil attack. In particular, secretion of this protease protects 'self' cells from neutrophil elastase digestion and may play a role in the maintenance of milk quality over the lactation curve [29]. Indeed a SERPINA1 haplotype, composed of five SNPs within exon regions (SERPINA1-164, -269, -284, -407, -989), is associated with decreased SCS, increased milk yield, milk fat yield and productive life [30].
The main objective of this study was to quantify the associations between CXCR1-777 and TLR4-2021 with milk performance and somatic cell traits using both individual cow production data and estimated genetic merit of Holstein-Friesian sires based on progeny performance. The advantage of an experimental design that uses direct phenotypic data from individual cows is that it allows for the inclusion of deeper phenotypes and various dairy breeds but the results are only applicable to (a) the family lines and (b) farm management practices and environments similar to those described. In comparison, the performance traits evaluated in the Holstein-Friesian sires are not expressed by the sires themselves. The advantage of this alternative design is the increased study power, and therefore increased accuracy, to detect associations by minimizing the proportion of variance in the trait under investigation attributable to non-genetic variation. The disadvantage is that performance traits included in the analyses are limited to traits routinely measured on progeny.

Dairy cows
Genotype frequencies of TLR4-2021 and CXCR1-777 and summary statistics in the on-farm study are summarized in Tables 1, 2 and 3. For TLR4-2021, Jersey and Jersey x Holstein-Friesian crossbreds had a lower proportion of CC homozygotes, than their herdmates. The rare TT genotype was absent in the Montbeliarde and Norwegian Red breeds. For CXCR1-777, heterozygous CG was the most abundant genotype, within breed, with the exception of the Jersey cows where the GG genotype predominated (62%).
Although the C allele of TLR4-2021 was associated with decreased milk protein percentage over the entire lactation (Table 2), this association occured in late lactation (lactation days 141 to 305) ( Table 3). In addition, the heterozygous TC genotype associated with an increase in milk fat percentage from days 141 to 305 of lactation (Table 3). Interestingly this association did not hold when the genotype was included in the model as a class effect ( Table 2). CXCR1-777 tended to associate with SCS throughout the entire lactation when the polymorphism was included as either a class or continuous effect in the statistical model (Table 4). However, no associations were observed between CXCR1-777 and alternative mastitis indicators, i.e. peak SCC (highest SCC record per cow per lactation) and subclinical mastitis(SCC > 250,000 cells/ml) (data not shown).

Holstein-Friesian bulls
The genotype and allele frequencies of TLR4-2021 and CXCR1-777 in 848 Holstein-Friesian sires are listed in Table 5, with summary statistics listed in Table 6. Both SNPs deviated significantly from Hardy Weinberg equilibrium with a lower than expected frequency of TT and CC genotypes for TLR4-2021 and CXCR1-777, respectively. TLR4-2021 did not associate with any milk composition traits investigated ( Table 6). The G allele of CXCR1-777 was associated with a 1.01 kg increase in fat yield (P < 0.05; Table 6). Table 5 also summaries the allele frequencies and genotypes of an additional 14 polymorphisms. Both IL8-1203 and TLR4-1142 were monomorphic with only the T and A allele present, respectively and were therefore not included in the association analysis. TLR4-452, TLR4-1040, and TLR4-1948 were also excluded from the association analyses due to the absence of the homozygous CC, AA, and AA genotypes respectively and the low proportion of the heterozygous genotypes. Minor allele frequency in the remaining SNPs ranged from 8% to 49%.
Linkage disequilibrium (LD), based on the r 2 statistic, between SNPs SERPINA1-164, SERPINA1-407 and SER-PINA1-989 was > 0.95 (Table 7). As such, only association analysis of SERPINA1-164 was reported in Table 6. SERPINA1-269 and SERPINA1-284 also exhibited strong LD with a score of 0.99 (Table 7) with SER-PINA1-284 omitted from the analysis listed in Table 6. LD between IL8-182 and IL8-1226 was 0.99 (data not shown), allowing omission of IL8-1226 from Table 6.  The associations among the seven segregating SNPs and milk production traits are listed ( Table 6). The G allele of CD14-1908 was associated with decreases (P < 0.05) in milk fat and protein yield by 1.32 kg and 0.98 kg respectively and also tended to associate with decreased milk yield (P < 0.1). The T allele of SER-PINA1-269 associated with 0.74 kg (P < 0.05) and 2.34% (P < 0.01) decreases in fat yield and fat percentage respectively. The T allele of SERPINA-269 also showed a tendency (P < 0.1) to associate with an increase in protein yield by 0.48 kg. The remaining SNPs, SER-PINA1-164, CARD15-3168 and IL8-182, did not associate with any of the milk production traits investigated ( Table 6).
Using the five SNPs in SERPINA1, four possible haplotypes were constructed in the Holstein-Friesian bulls ( Table 8). The haplotypes G 164 T 269 T 284 G 407 C 989 , ACGCT, GCGGC were present in the population in almost equal proportions with only a 1% incidence of the GCGCT haplotype (Table 8). Bulls with the GCGGC haplotype had superior genetic merit for milk protein yield and milk fat percentage compared to the referent haplotype, GTTGC (P < 0.05), with no differences in milk yield. The ACGCT haplotype had a superior genetic merit for milk fat yield (P < 0.05) compared to GTTGC.

Discussion
The modern dairy cow has been subjected to intense selection for milk production which has adversely affected its health [31,32]. The G allele of the CXCR1-777 SNP showed a significant association with increased milk fat yield in Holstein-Friesian bulls and a tendency to associate with decreased SCS in the on-farm study. In agreement, Leyva-Baca et al. [28] reports a tendency for the G allele to associate with increased fat yield, but not SCS, in Canadian Holstein-Friesian bulls. Furthermore, Goertz et al. [33] failed to identify an association between CXCR1-777 genotype and SCS in German Holsteins. However, Youngerman et al. [18] demonstrates an association between higher SCS and the heterozygous GC genotype compared to the GG genotype in Holstein-Friesian cows. Interestingly, Holstein-Friesian cows with high SCS have reduced daily fat production and Table 3 The associations (standard error in parenthesis) between the TLR4-2021 genotypes and milk composition variables in cows within early, mid and late lactation  higher lipolysis compared to low SCS herdmates [34]. In other species, CXCR1 is expressed by adipocytes [35], and its ligand, IL8, inhibits adipogenesis through downregulation of the transcription factor peroxisome proliferator-activated receptor-γ (PPAR-g) gene [36]. IL8 and CXCR1 may indirectly influence fatty acid synthesis in the mammary gland through this PPAR-g regulation [37]. Alternatively, the association of CXCR1 with milk fat yield may simply be explained by the mapping of quantitative trait loci (QTL) for fat yield and percentage [9] at 110 to 130 Mbp on BTA2. Indeed a QTL for SCS is also linked to BTA2 at position 105 Mbp [9]. The CC genotype of TLR4-2021 was associated with a decrease in milk protein percentage in late lactation across all breeds within the on-farm study. However, this association did not persist within the Holstein-Friesian bull population. This discrepancy between the two studies may be a function of 1) differences in phenotypes and accuracy of phenotypes used (actual cow phenotypic records adjusted for systematic environmental effects or predicted transmitting abilities), 2) differences in the parity of the progeny or period during which the phenotype was measured, 3) the mean performance of the animals or the environment they were exposed to (i.e. genotype by environment interaction) and 4) the different genetic background and breed of the animals in the study and how that may affect interactions between the polymorphism and background genes. It is worth noting that there was a lower frequency of TLR4-2021 CT heterozygotes (0.26) in the Holstein-Friesian bull population than reported in the Holstein-Friesian cows (0.46). This discrepancy reflects the small sample size of the on-farm study. Flow cytometry analysis demonstrates that heterozygous AG animals of CD14-1908 have higher numbers of neutrophils expressing CD14 molecules [22]. This may confer the neutrophils with altered sensitivity to LPS [22] which may aid in their fight against pathogen attack. While no association with SCS was established within our study, to the authors' knowledge this is the first report of the G allele of CD14-1908 associating with lower milk fat and protein yields and tending to associate with a decrease in milk production. SERPINA1 levels are known to affect milk composition milk quality over the lactation curve [29]. The synonymous SERPINA1-269 polymorphism demonstrated an association with milk fat yield and milk fat percentage. Similarly, Khatib et al. [30] also found associations with a SERPINA1 haplotype and milk traits (milk yield, fat yield and protein yield). In our study animals, the GCGGC haplotype had similar genetic merit for milk yield but a superior genetic merit for milk protein yield and milk fat percentage when compared to GTTGC animals. Although the GTTGC haplotype associates with a decrease in SCS in a North American herd [30], it failed to associate with SCS in our bull data set. Interestingly, within our data set, the haplotype GTTGT was absent and the previously unreported GCGCT haplotype occurred in 7 bulls (1%).

Conclusion
Of the sixteen polymorphisms in seven immune related genes genotyped, only CXCR1-777 tended to associate with SCS albeit only in the on-farm study. Interestingly polymorphisms in the immune genes, CXCR1, CD14, SERPINA1, did associate with milk composition. Animals with the GCGGC SERPINA1 haplotype associated with superior genetic merit for milk protein yield and fat percentage. The lack of an association between the polymorphisms with SCS in the Holstein-Friesian data set would question the potential importance of these variants in selection for improved mastitis resistance in the Holstein-Friesian cow.

DNA extraction and genotyping of cow samples
A total of 246 dairy cows (77 Holstein-Friesian, 47 Jersey, 51 Norwegian Red and 24 Montbeliarde, 47 Holstein-Friesian x Jersey crossbreeds) from a single experimental dairy farm were milked twice daily. Blood sampling, performed under license from the Department of Agriculture, Fisheries and Food by trained personnel,   was taken from the tail vein by caudal venipuncture and collected into tubes containing potassium ethylenediaminetetraacetic acid (EDTA K3E 15%, 0.12 ml; BD Vacutainer, BD Vacutainer Systems, Plymouth, UK). Blood samples were initially centrifuged at 2500 × g for 20 min at 4°C. Buffy layers containing white blood cells were collected and DNA extracted using the QIAGEN Flexigene DNA kit (QIAGEN Ltd., West Sussex, UK), according to manufacturer's instructions. DNA was quantified and quality assessed spectrophotometrically and by gel electrophoresis. All cows on the farm were genotyped by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP). CXCR1 PCR primers were designed based on the NCBI http://www.ncbi.nlm.nih.gov nucleotide sequence U19947.(Forward: 5'-caa tac aac gaa atg gcg gat gat-3', Reverse: 5'-cag gtt gta ggg cag cca gca gag-3'). TLR4 primers were designed based on the sequence NM_174198 (Forward: 5'-aac agg tag ccc aga cag cat ttc-3', Reverse: 5'-ggc acg ccc tcc tcc aag tt-3'). Each PCR product was designed to incorporate a Bsp 1286 I restriction site within the amplicon to act as an internal quality control for enzyme activity. Cycling conditions were as follows: initial denaturation at 94°C for 5 min followed by 30 cycles for 94°C for 1 min, annealing temp (61°C for CXCR1, 60°C for TLR4) for 1 min, 72°C for 1 min, with a final extension at 72°C for 5 min. Each PCR reaction was performed in 25 μl volumes and contained 1× NH 4 Buffer (160 nM (NH 4  Allele specific fragments of all products were visualized on a 2.5% agarose gels following electrophoresis. To confirm accuracy of genotyping by RFLP, PCR products from a subset of 20 animals were sequenced (Cogenics and MWG-Biotech). Sequence data was analyzed using Lasergene 6 (DNAStar Inc., Madison WI, USA). Control homozygous and heterozygous samples, identified by sequencing, were included in each digest as a quality control measure.

DNA extraction and genotyping of bull samples
Semen straws from 848 Holstein-Friesian sires, representing the Irish dairy germplasm in recent years, were collected for genomic DNA extraction. Semen was washed twice in phosphate buffered saline (pH7.4) and centrifuged. The resulting cell pellets were resuspended in 450 μl of pre-warmed extraction buffer (10 mM Tris pH8, 10 mM EDTA, 0.1% SDS, 100 mM NaCl) and 15 μl of 2-mercaptoethanol was subsequently added. Following addition of 10 μl proteinase K (20 mg/ml), samples were incubated for 15 minutes at 55°C. An overnight incubation at 60°C ensured complete lysis. DNA was extracted using the Maxwell® instrument (Promega Corp., Madison, WT, USA) according to the manufacturer's instructions. All SNPs were genotyped in the bull population by Sequenom® using the iPLEX Gold assay on a MassARRAY® platform http://www.sequenom.com. As a quality control measure, genomic DNA representing twenty-five animals was genotyped in duplicate for each SNP. Concordance across all duplicates was 100%.

Cow phenotypic data and editing
Cows were primarily grass-fed (100% pyrennial grass) and housed during winter months. Cows were milked twice daily. Prior to milking, teats were wiped. Post milking, teats were treated with iodine. Milk yield was recorded daily. Milk sample were collected once weekly from successive morning and evening milkings and the Milkoscan 203 instrument (Foss Electric, Hillerød, Demark) was used to determine milk fat, protein and lactose concentrations. Somatic cell count was determined from milk samples using a Bentley Somacount 300® (Bentley Instruments Inc., Chaska MN, USA). Subclinical mastitis events were defined as the number of SCC records greater than 250,000 cells/ml per animal per lactation. Peak SCC was the highest SCC record per animal per lactation. All animals genotyped were included in the association analysis (i.e. no selection was applied). The 246 dairy cows represented 71 sire families. As well as entire lactation totals, records were divided into 3 groups for analysis; early (days 1-70), mid (days 71-140) and late (days 141-305) lactation. Pedigree files to determine relatedness were constructed using sire information extracted from the Irish Cattle Breeding Federation (ICBF).
Bull phenotypic data and editing ICBF evaluated data, which included daughter yield deviations (DYDs) and predicted transmitting abilities (PTAs), as well as associated reliabilities, for a range of performance traits, was available for inclusion in the analysis. A description of models used in Irish genetic evaluations and variance components is summarized in Berry et al. [38]. The average co-ancestry among the sires was 2.2%. A repeatability model across the first five lactations is used to estimate DYD for 305-day milk, fat and protein yield and geometric mean SCS (log e SCC). DYDs expressed on the scale of PTA were used in this study. PTAs were de-regressed according to Berry et al. [39]. Using the method of Harris and Johnson [40], the parental contribution of the reliability of each DYD or PTA was removed. Only sires with reliability, less parental contribution, of greater than 60% were included in association analysis. For milk, fat and protein yield, as well as fat and protein percentage, a total of 742 sires were included in analysis, while 703 sires were suitable for association analysis of SCS.

Statistical analysis Cow Dataset
The association between each SNP and performance was quantified using linear mixed models in ASREML [41] with an additive genetic effect and permanent environmental effect included as random effects.
Relationships among individuals were accounted for using a numerator relationship matrix. Fixed effects included in the model were contemporary group defined as experimental treatment-by-year, parity and breed. The ratio of the additive genetic to phenotypic variance and permanent environmental variance to phenotypic variance were fixed to give a heritability and repeatability of 0.35 and 0.5 respectively for milk, fat, protein, and lactose yield; 0.52 and 0.6 for fat and protein percentage and 0.11 and 0.13 for SCS, subclinical mastitis and peak SCS. SNPs were included in the model as either continuous variables or as class effects. SNPs were included in the model either individually or together. Significance was based on the F-statistic.

Bull Dataset
The association between each SNP and performance was quantified using weighted mixed models in ASREML [41] with bull included as a random effect, and average expected relationships among bulls accounted for through the numerator matrix. Year of birth (divided into 5 yearly intervals) and percentage Holstein of the individual bull were included as fixed effects in the model. In all instances the dependent variable was DYD for 305-day milk yield, fat yield, protein yield, and average lactation SCS, weighted by their respective reliabilities less the parental contribution. Each SNP was individually included in the model as a continuous variable. For SERPINA1, a multiple regression mixed model with all segregating SNPs included in the model was also undertaken. In a separate analysis, the probability of each haplotype for the bull was included as a fixed effect in the weighted animal mixed model with year of birth of the bull and Holstein proportion included as fixed effect; the most frequent haplotype (G 164 T 269 T 284 G 407 C 989 ) was not included in the model to avoid linear dependencies among haplotype effects. Significance of associations was determined by the F-test and the appropriate degrees of freedom.