The patterns of genomic variances and covariances across genome for milk production traits between Chinese and Nordic Holstein populations
 Xiujin Li^{1, 2, 3},
 Mogens Sandø Lund^{1},
 Luc Janss^{1},
 Chonglong Wang^{4},
 Xiangdong Ding^{2},
 Qin Zhang^{2}Email author and
 Guosheng Su^{1}Email author
DOI: 10.1186/s1286301704919
© The Author(s). 2017
Received: 18 June 2016
Accepted: 7 March 2017
Published: 15 March 2017
Abstract
Background
With the development of SNP chips, SNP information provides an efficient approach to further disentangle different patterns of genomic variances and covariances across the genome for traits of interest. Due to the interaction between genotype and environment as well as possible differences in genetic background, it is reasonable to treat the performances of a biological trait in different populations as different but genetic correlated traits. In the present study, we performed an investigation on the patterns of regionspecific genomic variances, covariances and correlations between Chinese and Nordic Holstein populations for three milk production traits.
Results
Variances and covariances between Chinese and Nordic Holstein populations were estimated for genomic regions at three different levels of genome region (all SNP as one region, each chromosome as one region and every 100 SNP as one region) using a novel multitrait random regression model which uses latent variables to model heterogeneous variance and covariance. In the scenario of the whole genome as one region, the genomic variances, covariances and correlations obtained from the new multitrait Bayesian method were comparable to those obtained from a multitrait GBLUP for all the three milk production traits. In the scenario of each chromosome as one region, BTA 14 and BTA 5 accounted for very large genomic variance, covariance and correlation for milk yield and fat yield, whereas no specific chromosome showed very large genomic variance, covariance and correlation for protein yield. In the scenario of every 100 SNP as one region, most regions explained <0.50% of genomic variance and covariance for milk yield and fat yield, and explained <0.30% for protein yield, while some regions could present large variance and covariance. Although overall correlations between two populations for the three traits were positive and high, a few regions still showed weakly positive or highly negative genomic correlations for milk yield and fat yield.
Conclusions
The new multitrait Bayesian method using latent variables to model heterogeneous variance and covariance could work well for estimating the genomic variances and covariances for all genome regions simultaneously. Those estimated genomic parameters could be useful to improve the genomic prediction accuracy for Chinese and Nordic Holstein populations using a joint reference data in the future.
Keywords
Chinese Holstein Nordic Holstein Genomic variance Genomic covariance Genomic correlationBackground
The Chinese Holstein has been formed in the process that the imported Holstein bulls from Europe and North America were crossbred with local yellow cattle, and the crossbred cows were continuously backcrossed with imported Holstein bulls [1, 2]. Therefore, it is assumed that the Chinese Holstein population is genetically close to the other Holstein populations in the world. To date, Chinese and Nordic Holstein populations have been jointly studied widely [3–5]. It has been reported that around 30% of Chinese cows has a relationship coefficient above 0.20 with one or more Nordic Holstein bulls, calculated by using genomic data [5]. Zhou et al. [4] reported that the extent of linkage disequilibrium (LD) was similar in the Chinese and Nordic Holstein populations, and the consistency of LD phase between two populations was very high with a correlation of 0.97. Thus, it was successful in improving the prediction accuracy for Chinese Holstein population using a joint reference population including Nordic genotyped progenytest bulls. In addition, the accuracy of imputation from the 54 K to the HD marker data for Chinese Holsteins was improved by adding the Nordic HDgenotyped bulls into the reference data [5].
Because of different production systems between China and Nordic countries, some genes may show significant different effects for the same trait between Chinese and Nordic Holstein populations, i.e., genotype by environment interactions. The different genetic effects on the same trait between two populations can be reflected by the patterns of genomic variances, covariances and correlations across genome, which can be detected by an analysis where a given biological trait in Chinese and Nordic Holstein populations is considered as two traits. On one hand, knowing these genomic parameters of different genome regions in two populations will give a better opportunity to understand genetic architectures of traits of interest, On the other hand, these genomic parameters can be used to improve the accuracy of genomic prediction for traits of interest in both populations when using a joint reference population.
With the availability of SNP chips and genome sequencing, SNP information has offered a possibility to study genetic architecture of complex traits. Using SNP data, it is possible to disentangle the pattern of genomic variance and covariance across the whole genome. However, there are few literatures to report genomic variances, covariances and correlations for different genome regions by using SNP information. Recently, Janss [6] has proposed a multitrait Bayesian method using latent variables to model heterogeneous variances and covariances, which makes it easy to estimate the genomic variances and covariances for all genome regions simultaneously. Furthermore a modified model has been developed in the present study, which is more flexible to handle heterogeneous variances and covariances. So far, this new multitrait Bayesian method has not been applied in largescale real data.
Therefore, the objective of the present study is to investigate regionspecific genomic variances in Chinese and Nordic Holstein populations as well as regionspecific genomic covariances and correlations between the two populations for three milk production traits, using the novel multitrait Bayesian method.
Results
Total estimated genomic variance (Va), covariance (Cov), correlation (Corr) with standard error (SE) in parentheses, heritability (h^{2}), and DIC for Chinese (CN) and Nordic (NO) Holstein population in different scenarios
Method  Group^{a}  Trait^{b}  Va_CN  Va_NO  Cov  Corr  h^{2}_CN^{c}  h^{2}_NO^{c}  DIC^{d} 

MTGBLUP  All SNP  MY  358160.4 (25519.1)  112.3(4.1)  4116.0(289.4)  0.649(0.037)  0.41  0.86  NA 
FY  469.8(33.0)  104.6(3.8)  134.7(10.1)  0.607(0.038)  0.42  0.86  NA  
PY  309.0(22.3)  107.1(3.9)  98.2(8.5)  0.540(0.042)  0.40  0.86  NA  
MTrrBLUP  All SNP  MY  349985.3(27058.4)  108.2(3.8)  3825.6(271.7)  0.622(0.034)  0.40  0.84  125746.4 
FY  462.2(34.4)  102.2(3.8)  126.5(11.5)  0.582(0.044)  0.42  0.85  83081.2  
PY  302.1(21.5)  104.4(3.6)  93.6(8.5)  0.527(0.043)  0.39  0.84  81589.5  
BTA  MY  346995.3(27762.0)  104.7(5.0)  3550.9(306.1)  0.589(0.042)  0.40  0.84  125726.5  
FY  418.2(36.6)  95.2(5.1)  112.0(12.0)  0.562(0.048)  0.39  0.83  83255.6  
PY  289.5(20.8)  101.6(4.4)  90.8(8.1)  0.529(0.040)  0.38  0.83  81554.3  
100 SNPs  MY  279770.6(23729.6)  85.4(3.7)  2856.8(281.3)  0.585(0.044)  0.35  0.81  125667.4  
FY  331.4(26.9)  74.5(4.1)  82.2(7.3)  0.523(0.037)  0.34  0.80  83143.4  
PY  272.5(18.1)  86.6(4.1)  85.9(8.0)  0.559(0.044)  0.36  0.81  81570.8 
The proportion of additive genetic variance to phenotypic variance (h^{2}) in Table 1 represented reliability of DRP. The reliabilities in Nordic Holsteins were much higher than those in Chinese Holsteins, because the Nordic Holsteins in the analysis comprised bulls with large group of daughters while the Chinese Holsteins comprised mainly cows. In addition, based on the Deviance Information Criterion (DIC) statistical criteria, MTBayesian rrBLUP with every 100 SNP as one region was better than other two scenarios for MY, MTBayesian rrBLUP with all SNP as one region was best among three scenarios for FY, and MTBayesian rrBLUP with each BTA as one region was best for PY.
Discussion
The present study for the first time reported the patterns of genomic variance, covariance and correlation between the Chinese and Nordic Holstein populations for three milk production traits. The patterns of those genomic parameters were investigated at different levels of genome regions, i.e., the whole genome as one genome region, each chromosome as one region and every 100 SNP as one region, using the novel MTBayesian rrBLUP model. The results showed that the MTBayesian rrBLUP worked well for estimating genomic parameters for different regions in the genome simultaneously. As expected, the results showed that different regions explained different amounts of genomic variance and covariance as well as different degrees of genomic correlation.
Both MTGBLUP and MTBayesian rrBLUP were used to estimate genomic variance, genomic covariance and residual variance between Chinese and Nordic Holstein populations for three milk production traits. The total genomic variance and covariance as well as genomic correlation estimated from MTGBLUP were slightly higher than those from MTBayesian rrBLUP without dividing the genome into regions. Although MTGBLUP and MTrrBLUP without dividing regions were equivalent, the two models used different algorithms. The MTGBLUP used AIREML algorithm while MTrrBLUP used MCMC algorithm. The two different algorithms might lead to slight difference in estimates of variance and covariances, especially when the distribution of the estimates was deviated from a normal distribution. In fact, the difference was very small, compared with the size of the standard errors. MTBayesian rrBLUP had the advantage to estimate regionspecific variance and covariance for all regions simultaneously. In principle, MTGBLUP also can estimate regionspecific variance and covariance for all regions simultaneously by taking each region as one component in the model, but it has a very high computational demand, and is difficult to reach convergence due to too many components in the model. Another approach to use MTGBLUP to estimate regionspecific variance and covariance is to take a particular region as one component and the others as one component in the model, and thus obtain variance and covariance for one region in the model in one run. But this is still a timeconsuming approach. For example for MY, although the computation time for MTGBLUP is much shorter than MTrrBLUP (3 h vs 42 h), the time for MTGBLUP with every 100 SNP as one region is much longer than MTrrBLUP with every 100 SNP as one region (10 h* region number vs 42 h). Thus it is difficult to use MTGBLUP to estimate regional specific variance and covariance with every 100 SNP as one region because of very time consuming. It was observed that with the number of genome regions increased, total genomic variance and covariance derived from each region decreased (Table 1). One possible reason could be that total genomic variance and covariance were obtained by simply summing the genomic variance and covariance for each region, and ignoring covariance between regions. In addition, the correlations of genomic estimated breeding values (GEBV) between MTBayesian rrBLUP with three different scenarios ranged from 0.87 to 0.94 for MY, from 0.80 to 0.91 for FY, and from 0.96 to 0.98 for PY in both populations. These showed that the performance of different MTBayesian rrBLUP with different assumptions is greatly influenced by genetic architectures of traits of interest.
At chromosome level, very large genomic variance, covariance and correlation for three traits were found on BTA14 as the diacylglycerolacyltransferase 1 (DGAT1) on BTA 14 has a very large effect and segregates in the Holstein populations [7–9]. Among the regions with size of 100 SNP on BTA 14, the genome region (1.46 Mbp ~ 5.57 Mbp) showed much higher genomic variance, covariance and correlation (0.98 for MY and FY; 0.80 for PY) than other regions, which confirmed that DGAT1 plays an important role on these three traits (Additional file 1: Figure S1 and Additional file 2: Figure S2) in both Chinese and Nordic Holstein populations. On the other hand, the proportions of genomic variance and covariance for this first region for PY were smaller than those for the other two traits, which indicated that the effect of DGAT1 was larger on MY and FY than on PY.
Besides BTA 14, BTA 5 and BTA 20 also showed a relative high genomic covariance and correlation which speculated that some QTL regions had the same effects one those two chromosomes in both populations. Among all 100 SNP regions on BTA 5, two consecutive regions (84.09 Mbp ~ 98.29 Mbp, 13^{th} and 14^{th}region on BTA5) showed much higher genomic variance, covariance and correlation than other regions for MY and FY (Additional file 3: Figure S3 and Additional file 4: Figure S4). This segment on BTA 5 has been previously reported to include the QTL affecting MY and FY in Holstein populations [7, 10, 11]. In the present study, we also detected that aregion (27.93 Mbp ~ 34.73 Mbp), which included the growth hormone receptor (GHR) gene, presented much higher genomic variance, genomic covariance and genomic correlation for MY in this region than other regions on BTA 20 (Additional file 5: Figure S5). It has been reported that the growth hormone receptor (GHR) gene on BTA 20 has a large effect on MY in Holstein populations [8, 12–14].
Although overall genomic correlations between two populations for three traits were highly positive, some genome regions showed weakly positive correlation or highly negative genomic correlations for MY and FY. In the present study, we found one region (47.20 Mbp ~52.23 Mbp) on BTA 19 for MY and 10 regions on different chromosomes for FY showing highly negative correlations (Additional file 6: Table S1). There were two regions (48.79 Mbp–53.49 Mbp; 58.99 Mbp–64.03 Mbp) on BTA 8 showing lower genomic covariance and negative genomic correlation, i.e., −0.016 and −0.327 (Additional file 7: Figure S6), and thus these two region might resulted in the lowest genomic covariance and correlation for BTA8 among all autosomes for FY (Figs. 2 and 3). The negative genomic correlations between two populations indicated that the same region has different effects on the same trait in different populations. One possible reason for some regions showing negative correlations could be due to different selection histories, e.g. different weights for traits in the selection index, between two populations which led to difference in effects between two populations for some genomic regions. Another possible reason could be due to different production systems in Chinese and Nordic Holstein populations, i.e., genotype by environment interaction. In other words, the regions with negative or very low correlations can be considered as candidate regions involved in genotype be environment interaction for further studies.
How to define genome regions to inspect the patterns of genomic variance and covariance is a key point, and it needs to be further investigated. In this study, we divided the whole genome into regions in an arbitrary way (i.e., each chromosome or every 100SNP as one region). The way to partition the genomic variance into chromosomes has also been reported by Jensen et al. [15]. For the way to define how many SNP as one region, we have tried to use every 50 SNP, every 100 SNP, every 200 SNP and every 400 SNP as one region to estimate genomic variance and covariance for each region, respectively. The results showed that every 100, 200,400 SNP as one region performed well, while every 50 SNP as one region performed worse, especially for MY, with regard to total genomic variance and covariance (results not shown). The reason may be that small regions, such as every 50 SNP as one region, reduce the accuracy of estimated variance and covariances because of less accuracy of \( {\mathrm{r}}_{\mathrm{ij}} \) for a particular region j. We chose every 100 SNP as one genome region rather than every 200 or 400 SNP to analyze the region patterns of genomic variance, covariance and correlation because the large region dilutes differences in variance and covariance between regions.
Using a weighted G matrix in a singletrait GBLUP model to improve the accuracy of genomic prediction has already been reported, in which the estimated SNP variances are used as a weighting factor for building a traitspecific G matrix [16, 17]. Likewise, the information of genomic variance, covariance and correlation in our study can also be further used as weighting factors for building a weighted G matrix for MTGBLUP so that a MTGBLUP can account for heterogeneous variances and covariances across the genome. In the Chinese cattle population, reference data mainly comprises females. Ding et al. [18] have reported prediction accuracy using such a reference data are greatly increased, compared with using pedigree information. The genomic prediction accuracy for Chinese Holsteins has been improved by using Chinese and Nordic Holsteins as reference animals and MTGBLUP [4, 5]. Similar to the single trait model, MTBayesian methods with different SNP/regions having different variances may produce higher reliability of genomic prediction than MTGBLUP assuming that all SNP have equal variances across genome. Therefore, we will consider MTBayesian methods with different SNP/regions and MTGBLUP with such a weighted G matrix to further improve the prediction accuracy for the Chinese Holstein population using a joint reference population including reference animals from other countries.
Conclusions
This study revealed the patterns of genomic variance, covariance and correlation for genomic regions across the whole genome between Chinese and Nordic Holstein populations for three milk production traits. BTA14 and BTA 5 showed very large genomic variance, covariance and correlation for MY and FY than other chromosomes, whereas no specific chromosomes showed very large genomic variance, covariance and correlation for PY. In scenario of every 100 SNP as one genome region, most regions explained <0.50% of genomic variance and covariance for MY and FY, and most regions explained <0.30% for PY. A few regions showed highly negative genomic correlations, e.g., one important region on BTA8 for FY. These regions can be considered as candidate regions accounting for interaction between genotype and production environment for MY and FY. The pattern of genomic variance, covariance and correlation across genomic regions could be useful for improving multipopulation (or multibreed) genomic prediction.
Methods
Data
In this study, both Chinese and Nordic Holstein cattle were genotyped with the Illumina BovineSNP50 BeadChip (Illumina, San Diego, CA). For each population, we firstly removed SNP with unknown positions and SNP on Bostaurus (BTA) X, and then imputed missing genotypes using the software Beagle [19]. After the imputation, SNP with minor allele frequencies less than 0.01 were removed. Finally, the marker set included 43,008 SNP on 29 autosomes in both populations. In order to investigate the patterns of genomic variance, covariance and correlation across the whole genome, we divided the whole genome into genome regions in two ways, which has been done by Jensen et al. [15] and Hayes et al. [20]. The first way was that each chromosome was considered as one genome region. The second way was that every adjacent 100 SNP were considered as one region. If the last region of one BTA had less than 50 SNP, this region was merged with the previous region; otherwise, this region was considered as one independent region.
The Chinese Holstein population consisted of 237 genotyped progenytest bulls and 6076 genotyped cows, and the Nordic population included 5244 genotyped progenytest bulls. Three milk production traits, i.e., milk yield (MY), fat yield (FY) and protein yield (PY), were analyzed. In the analysis, deregressed proofs (DRP) were used as phenotypes for all three traits. The DRP of Chinese Holstein bulls and cows were derived from the estimated breeding values (EBV) in original scale obtained from Dairy Association of China, and the DRP of Nordic Holstein bulls were derived from the EBV in standardized scale (http://www.landbrugsinfo.dk/Kvaeg/Avl/Sider/principles.pdf) obtained from Nordic Genetic Evaluation. Thus, the DRP had different scales in two populations.
Statistical Model
In this study, each given biological trait was regarded as two different but genetic correlated traits in the Chinese and Nordic Holstein populations. A new multitrait Bayesian method according to Janss [6] and the modified form of this method were used to calculate overall variance and covariance as well as genomic variances, covariances and correlations for genome regions for each milk production trait in both populations, respectively. The overall variance and covariance were also estimated using a multitrait genomic BLUP (MTGBLUP).
Multitrait Bayesian rrBLUP model for homogeneous variance and covariance
The prior distributions were assumed as \( {r}_i\sim u n i,\mathbf{s}\sim \mathrm{N}\left(0,\mathbf{I}\right),{{\boldsymbol{a}}_i}^{*}\sim \mathrm{N}\left(0,\mathbf{I}{\sigma_{a{ i}^{*}}}^2\right),{\sigma_{a{ i}^{*}}}^2\sim u n i, \) where the part r _{ i } * s contributed to the covariance between SNP effects across populations; the “residual SNP effect” α _{ i } ^{*} was taken as uncorrelated across populations. The modeled variance and covariance for each SNP were worked out as \( \operatorname{var}\left({\boldsymbol{a}}_i\right)=\operatorname{var}\left({r}_i\ast \boldsymbol{s}+{{\boldsymbol{a}}_i}^{*}\right)={r}_i^2+{\sigma_{a{ i}^{\ast}}}^2 \), and cov(a _{1}, a _{2}) = cov(r _{ i } ∗ s + a _{1} ^{ ∗}, r _{2} ∗ s + a _{2} ^{ ∗}) = r _{1} ∗ r _{2}. Thus, the total genomic variance for the population i was \( {V}_{g_i}={\displaystyle {\sum}_{k=1}^{nSNP}2{p}_{i k}\ast \left(1{p}_{i k}\right)* var\left({\boldsymbol{a}}_i\right)} \), the total covariance between two populations was \( {V}_{g_{1,2}}={\displaystyle {\sum}_{k=1}^{nSNP}2*\sqrt{p_{1 k}*\left(1{p}_{1 k}\right)*{p}_{2 k}*\left(1{p}_{2 k}\right)}* cov\left({\boldsymbol{a}}_1,{\boldsymbol{a}}_2\right)} \), and the overall genomic correlation was \( {R}_{g_1{g}_2}=\frac{V_{g_{1,2}}}{\sqrt{V_{g_1}*{V}_{g_2}}}, \) where nSNP was the total number of SNP across the genome; \( {\mathrm{p}}_{\mathrm{ik}} \) was the frequency of minor allele at marker locus k in the population i.
MTBayesian rrBLUP model for heterogeneous variances and covariances
Thus, the genomic variance for each SNP in region j in population i (i = 1, 2) was worked out as \( \operatorname{var}\left({\boldsymbol{a}}_{i j}\right)={r}_i^2+{r}_{i j}^2+{\sigma_{a{ i}^{*}}}^2, \) and the genomic covariance between the two populations was cov(a _{1j }, a _{2j }) = r _{1} * r _{2} + r _{1j } * r _{2j }. Therefore, the total genomic variance for the region j in the population i was \( {V}_{g_{ij}}={\displaystyle {\sum}_{k=1}^{nSNP}2{p}_{ik}\ast \left(1{p}_{ik}\right)*\operatorname{var}\left({\boldsymbol{a}}_{ij}\right),} \) the total genomic covariance for the region j between two populations was \( {V}_{g1 j,2 j}={\displaystyle {\sum}_{k=1}^{nSNP}2*\sqrt{p_{1 k}\ast \left(1{p}_{1 k}\right)*{p}_{2 k}*\left(1{p}_{2 k}\right)}* cov\left({\boldsymbol{a}}_{1 j},{\boldsymbol{a}}_{2 j}\right),} \) and the overall genomic correlation for the region j was \( {R}_{g1 j,2 j}=\frac{ V g1 j,2 j}{\sqrt{V{g}_{1 j}* V{g}_{2 j}}}, \) where nSNP was the total number of SNP for region j; p _{ ik } was the frequency of minor allele at marker locus k in the population i. The proportion of genomic variance and covariance explained by each region were calculated as \( {V}_{g_{ij}}/{\displaystyle {\sum}_{j=1}^{ngroup}{V}_{g_{ij}}} \) and V _{ g1j,2j }/∑ _{ j = 1} ^{ ngroup } V _{ g1j,2j }, respectively.
The analysis of MTBayesian rrBLUP model was carried out using the BAYZ software (www.bayz.biz). The Markov chains were run for 50,000 cycles of Gibbs sampling, and the first 20,000 cycles were discarded as burning in. After this, every 20th sample of the remaining 30,000 cycles was saved for the posterior analysis, and then median value was considered as the estimate of each unknown parameter. The posterior standard deviations of estimated genetic parameters across 30,000 cycles were denoted as standard error.
Multitrait GBLUP model
Abbreviations
 DRP:

Deregressed proofs
 EBV:

Estimated breeding values
 FY:

Fat yield
 MTBayesian rrBLUP:

Multitrait Bayesian ridge regression BLUP
 MTGBLUP:

Multitrait genomic BLUP
 MY:

Milk yield
 PY:

Protein yield
Declarations
Acknowledgements
Not applicable.
Funding
This work is supported by the Genomic Selection in Plants and Animals (GenSAP) research project financed by the Danish Council of Strategic Research (Aarhus, Denmark), the project ‘MultiGenomics’ from the Danish Milk Levy Fund (Aarhus, Denmark), the ‘948’ Project of the Ministry of Agriculture of China (Beijing, China; No.2011G2A(2)), the National Natural Science Foundation of China (No. 31371258, 31601009), the project funded by China Postdoctoral Science Foundation (No: 2016 M600699).
Availability of data and materials
Genotype and phenotype data are available only upon agreement with Aarhus University, China Agricultural University and the commercial breeding organization (http://www.vikinggenetics.com), and should be requested directly from the authors or these organizations.
Competing interests
The authors declare that they have no competing interests.
Author’s contributions
LX performed the analysis and wrote the manuscript. GS and MSL conceived the study, gave help in the analysis and contributed to the manuscript. LLJ developed the new multitrait Bayesian method and revised the manuscript. ZQ, DX and WC provided Chinese data and revised the manuscript. All authors have read and approved the final manuscript.
Consent to publish
Not applicable.
Ethics approval and consent to participate
For the Nordic population, the phenotypic data was collected from routine records of dairy cattle farms. Genotyped animals used in this study were the progenytest bulls, and the semen samples for genotyping were obtained from routine bull semen collection. Therefore, no ethical approval was necessary. For the Chinese population, blood samples were collected from Chinese Holstein cattle when the regular quarantine inspection of the farms was conducted. The procedure for collecting the blood samples was carried out in strict accordance with the protocol approved by the Animal Welfare Committee of China Agricultural University (Permit Number: DK996). The Dairy Association of China and the commercial breeding organization VIKINGGENETICS of Denmark gave permission to use their cows in this study.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Authors’ Affiliations
References
 Qiu H, Qing ZR, Chen YC, Wang AD. Bovine breeds in China. Shanghai: Shanghai Sci Tech Pub Shanghai; 1988. pp. 31–117.
 Liu JX, Wu YM, Zhou ZE. Current situation and prospect for dairy production in China. In: Rangnekar D, Thorpe W, editors. Smallholder dairy production and marketing: opportunities and constraints. Anand: Proceedings of a SouthSouth workshop, NDDB; 2001. p. 538.Google Scholar
 Li X, Buitenhuis AJ, Lund MS, Li C, Sun D, Zhang Q, et al. Joint genomewide association study for milk fatty acid traits in Chinese and Danish Holstein populations. J Dairy Sci. 2015;98:8152–63.View ArticlePubMedGoogle Scholar
 Zhou L, Ding X, Zhang Q, Wang Y, Lund MS, Su G. Consistency of linkage disequilibrium between Chinese and Nordic Holsteins and genomic prediction for Chinese Holsteins using a joint reference population. Genet Sel Evol. 2013;45:7.View ArticlePubMedPubMed CentralGoogle Scholar
 Ma P, Lund MS, Ding X, Zhang Q, Su G. Increasing imputation and prediction accuracy for Chinese Holsteins using joint ChineseNordic reference population. J Anim Breed Genet. 2014;131:462–72.View ArticlePubMedGoogle Scholar
 Janss L. Disentangling Pleiotropy along the Genome using Sparse Latent Variable Models. Proceedings of the 10th World Congress of Genetics Applied to Livestock Production. 18–22 August 2014; Vancouver.
 Sahana G, Guldbrandtsen B, Thomsen B, Holm LE, Panitz F, Brøndum RF, et al. Genomewide association study using highdensity single nucleotide polymorphism arrays and wholegenome sequences for clinical mastitis traits in dairy cattle. J Dairy Sci. 2014;97:7258–75.View ArticlePubMedGoogle Scholar
 Jiang L, Liu J, Sun D, Ma P, Ding X, Yu Y, et al. Genome wide association studies for milk production traits in Chinese Holstein population. PLoS One. 2010;5(10):e13661. doi:10.1371/journal.pone.0013661.View ArticlePubMedPubMed CentralGoogle Scholar
 Spelman RJ, Ford CA, McElhinney P, Gregory GC, Snell RG. Characterization of the DGAT1 gene in the New Zealand dairy population. J Dairy Sci. 2002;85:3514–7.View ArticlePubMedGoogle Scholar
 Fang M, Fu W, Jiang D, Zhang Q, Sun D, Ding X, et al. A MultipleSNP Approach for GenomeWide Association Study of Milk Production Traits in Chinese Holstein Cattle. PLoS ONE. 2014;9(8):e99544. doi:10.1371/journal.pone.0099544.View ArticlePubMedPubMed CentralGoogle Scholar
 Guo J, Jorjani H, Carlborg Ö. A genomewide association study using international breedingevaluation data identifies major loci affecting production traits and stature in the Brown Swiss cattle breed. BMC Genet. 2012;13:82.View ArticlePubMedPubMed CentralGoogle Scholar
 Aggrey SE, Yao J, Sabour MP, Lin CY, Zadworny D, Hayes JF, et al. Markers within the regulatory region of the growth hormone receptor gene and their association with milkrelated traits in Holsteins. J Hered. 1999;90:148–51.View ArticlePubMedGoogle Scholar
 Kadri NK, Guldbrandtsen B, Lund MS, Sahana G. Genetic dissection of milk yield traits and mastitis resistance QTL on chromosome 20 in dairy cattle. J Dairy Sci. 2015;98:9015–25.View ArticlePubMedGoogle Scholar
 Sun D, Jia J, Ma Y, Zhang Y, Wang Y, Yu Y. Effects of DGAT1 and GHR on milk yield and milk composition in the Chinese dairy population. Anim Genet. 2009;40:997–1000.View ArticlePubMedGoogle Scholar
 Jensen J, Su G, Madsen P. Partitioning additive genetic variance into genomic and remaining polygenic components for complex traits in dairy cattle. BMC Genet. 2012;13:44.View ArticlePubMedPubMed CentralGoogle Scholar
 Zhang Z, Liu J, Ding X, Bijma P, de Koning DJ, Zhang Q. Best linear unbiased prediction of genomic breeding values using a traitspecific markerderived relationship matrix. PLoS One. 2010;5:1–8.Google Scholar
 Su G, Christensen OF, Janss L, Lund MS. Comparison of genomic predictions using genomic relationship matrices built with different weighting factors to account for locusspecific variances. J Dairy Sci. 2014;97:6547–59.View ArticlePubMedGoogle Scholar
 Ding X, Zhang Z, Li X, Wang S, Wu X, Sun D, et al. Accuracy of genomic prediction for milk production traits in the Chinese Holstein population using a reference population consisting of cows. J Dairy Sci. 2013;96:5315–23.View ArticlePubMedGoogle Scholar
 Browning BL, Browning SR. A unified approach to genotype imputation and haplotypephase inference for large data sets of trios and unrelated individuals. Am J Hum Genet. 2009;84:210–23.View ArticlePubMedPubMed CentralGoogle Scholar
 Hayes BJ, Pryce J, Chamberlain AJ, Bowman PJ, Goddard ME. Genetic architecture of complex traits and accuracy of genomic Prediction: Coat colour, Milkfat percentage, and type in holstein cattle as contrasting model traits. PLoS Genet. 2010;6(9):e1001139. doi:10.1371/journal.pgen.1001139.View ArticlePubMedPubMed CentralGoogle Scholar
 Su G, Madsen P, Nielsen US, Mäntysaari EA, Aamand GP, Christensen OF, et al. Genomic prediction for Nordic Red Cattle using onestep and selection index blending. J Dairy Sci. 2012;95:909–17.View ArticlePubMedGoogle Scholar
 Madsen P, Jensen J, R. DMU  A Package for Analyzing Multivariate Mixed Models in quantitative Genetics and Genomics. Proceedings of the 10th World Congress of Genetics Applied to Livestock Production. 18–22 August 2014; Vancouver.
 VanRaden PM. Efficient methods to compute genomic predictions. J Dairy Sci. 2008;91:4414–23.View ArticlePubMedGoogle Scholar