Efficiency of genomic selection using Bayesian multi-marker models for traits selected to reflect a wide range of heritabilities and frequencies of detected quantitative traits loci in mice

Background Genomic selection uses dense single nucleotide polymorphisms (SNP) markers to predict breeding values, as compared to conventional evaluations which estimate polygenic effects based on phenotypic records and pedigree information. The objective of this study was to compare polygenic, genomic and combined polygenic-genomic models, including mixture models (labelled according to the percentage of genotyped SNP markers considered to have a substantial effect, ranging from 2.5% to 100%). The data consisted of phenotypes and SNP genotypes (10,946 SNPs) of 2,188 mice. Various growth, behavioural and physiological traits were selected for the analysis to reflect a wide range of heritabilities (0.10 to 0.74) and numbers of detected quantitative traits loci (QTL) (1 to 20) affecting those traits. The analysis included estimation of variance components and cross-validation within and between families. Results Genomic selection showed a high predictive ability (PA) in comparison to traditional polygenic selection, especially for traits of moderate heritability and when cross-validation was between families. This occurred although the proportion of genomic variance of traits using genomic models was 22 to 33% smaller than using polygenic models. Using a 2.5% mixture genomic model, the proportion of genomic variance was 79% smaller relative to the polygenic model. Although the proportion of variance explained by the markers was reduced further when a smaller number of SNPs was assumed to have a substantial effect on the trait, PA of genomic selection for most traits was little affected. These low mixture percentages resulted in improved estimates of single SNP effects. Genomic models implemented for traits with fewer QTLs showed even lower PA than the polygenic models. Conclusions Genomic selection generally performed better than traditional polygenic selection, especially in the context of between family cross-validation. Reducing the number of markers considered to affect the trait did not significantly change PA for most traits, particularly in the case of within family cross-validation, but increased the number of markers found to be associated with QTLs. The underlying number of QTLs affecting the trait has an effect on PA, with a smaller number of QTLs resulting in lower PA using the genomic model compared to the polygenic model.


Background
Recently, high-density single nucleotide polymorphism (SNP) arrays for a broad range of species have been developed, including humans, mice, plant species such as barley, wheat or maize as well as major livestock species, such as cattle, pigs, sheep and chickens. In the past, selective breeding in plant and livestock species was based on phenotypic information combined with extensive pedigrees using best linear unbiased prediction. The use of high density SNP arrays opened the opportunity of using genomic information to estimate genomic breeding values for individuals [1]. Estimating a breeding value based on the genotype of an individual may provide large benefits in situations where a species has a large generation interval e.g. oil palm [2] or when the trait of interest is recorded in one sex only e.g. milk production [3]. Other traits that may benefit from genomic selection are behavioural traits in animals, which are often costly and time consuming to measure routinely e.g. aggressiveness [4].
The high cost of genotyping, especially for the high density SNP arrays, limits the extent to which routine genotyping can be implemented in practice. Additionally, many of the SNPs contribute little to the genetic variance of a trait, as was found for example for human height variation [5] or complex disease traits [6]. Moreover, statistical limitations can arise when the number of SNP effects exceeds by far the amount of phenotypic data available. For these reasons, there could be interest in reducing the number of SNPs while maintaining efficiency of selection. Costs of genotyping may be reduced by genotyping only part of the population, e.g. [7], or a two-step approach could be used to prioritize SNPs for genotyping with lower density SNP arrays, e.g. [8]. To circumvent the statistical limitations, many different approaches have been developed to reduce the number of SNP effects to be estimated, e.g. [9,10].
The aim of this study was to assess the efficiency of genomic selection using mouse data and how it is affected by a) the heritability of the trait, b) the number of QTLs affecting the trait, c) the type of trait ('classical' traits that are easily measurable versus behavioural traits) and d) the number of SNP markers in the model allowed to have a substantial effect. Various models are fitted (including polygenic and/or genomic effects), and cross-validation performance within and between families is compared.

Variance components
Tables 1, 2 and 3 show estimates of the total phenotypic variances, heritabilities based on polygenic effects, proportions of variances attributed to genomic effects relative to the phenotypic variance and of the phenotypic fractions of cage variances. Estimated variance components are based on the full dataset and are presented for seven models, namely: models (1), (2) and (3), and sub-models with 10% and 2.5% of the markers assumed to be associated with a substantial effect using models (2) and (3). Results based on sub-models using mixtures of 70%, 40%, 7.5% and 5% are not presented, because they showed the same trend as the 10% and 2.5% mixtures.
Analyses of weight traits based on model (1), using polygenic effects only, showed slightly lower heritabilities (Table 1) compared to those reported by Valdar et al. [11]. The differences are likely because of different fixed effects fitted in the models. For behavioural and physiological traits, Tables 2 and 3 show estimates of heritabilities of comparable magnitude to those reported by Valdar et al. [11]. Using model (1), phenotypic proportions of cage variances were low for the behavioural traits (4 to 7% of the total variance, Table 2) compared to weight and physiological traits (15 to 29%, Tables 1 and 3).
Phenotypic proportions of genomic variances of weight, behavioural and physiological traits using genomic model (2) were 22 to 31%, 23 to 33% and 25 to 30% lower, respectively, than those using the polygenic model (1) ( Tables 1 to 3). This was compensated for by an increase in variances attributed to the cage effects and/or error effects depending on trait. Using the 2.5% mixture in genomic model (2)i.e. 2.5% of the genotyped SNP markers assumed to have a substantial effect on the traitthe phenotypic proportions of variance of genomic effects were 65 to 79%, 43 to 61% and 60 to 69% lower than estimates of the heritability from the polygenic model (1) for weight, behavioural and physiological traits, respectively. The underestimation of variances of genomic effects compared to variances of polygenic effects may be due to incomplete linkage disequilibrium between SNPs markers and causal variants, and due to low frequencies of these causal variants [12].
In model (3), additionally fitting polygenic effects essentially captured part of the genetic variance that was not accounted for by the genomic effects. The total variance attributed to genetic effects (polygenic and genomic) was in line with the polygenic variance found in model (1). Phenotypic proportions of the variance of the genomic effects using model (3) were consistently lower than in model (2). The phenotypic proportions of the variance of the genomic effects were 33 to 44%, 37 to 52% and 44 to 50% lower for weight, behavioural and physiological traits, respectively, than the phenotypic proportions of the variance of the polygenic effects obtained from model (1). For these traits, the phenotypic proportions of the variance of the polygenic effects accounted for 40 to 48%, 31 to 50% and 50 to 80%, respectively, of the heritability estimated using model (1). The use of different mixtures in model (3) resulted in a 75 to 92%, 60 to 70% and 75 to 80% decrease in phenotypic proportions of variance of the genomic effects compared to the corresponding proportions of polygenic effects from model (1) for the respective traits. The phenotypic proportions of the variance of the polygenic effects for weight, behavioural and physiological traits accounted for 73 to 80%, 60 to 61% and 81 to 100%, respectively, of the phenotypic proportions of the variance of the polygenic effects from model (1).
Using model (3), in general, a lower mixture percentage led to a decrease in phenotypic proportions of the variance of the genomic effects and to an increase in the phenotypic proportions of variance of the polygenic effects in all traits. Comparing W6 and W6m, treating missing alleles as a separate 3 rd allele, resulted in small changes in proportions of the variance of the genomic effects with lower mixture percentages. Tables 4 and 5 show the average predictive abilities (PA) based on cross-validation within (W) or between families (B). PA was calculated using ten training and validation sets, and is shown for all three models and their sub-model using different mixtures. Within family cross-validation always performed substantially better, as expected because of the higher genetic connectedness between the training and validation dataset. Within family cross-validation resulted in little change between PAs for all models for most traits, with only W6 and TA showing an increase in PA using models (2) and (3) compared to model (1).

Predictive ability
In contrast, using between family cross-validation, model (1) resulted in substantially lower PA than models (2) and (3) Tables 4 and 5). For traits with low heritabilities there were little differences in PA between model (1) and the other two models.
TA was the only trait to show significant differences in PA for within as well as between family cross-validation using model (2) with different mixtures. PA was stable at first with the low mixture models, but with mixtures below 7.5%, PA decreased substantially compared to its highest value (0.27 vs. 0.35). W6 showed a similar pattern for between family cross-validation, with a drop-off for mixtures below 7.5% (from 0.27 to 0.20). Both W6 and W10 showed a trend for a decrease in PA for within family cross-validation for mixtures below 7.5% in model (2). TA was the only trait to show a tendency for a reduction in PA with a decrease in mixture percentage, for both within (from 0.43 to 0.41) and between family cross-validation (from 0.34 to 0.28) using model (3). All other traits showed no significant decrease in PA with lower mixture percentages. Different modelling of missing genotypes as used for W6m compared to W6 showed almost no difference in PA.

Importance of individual markers
As an illustration of the statistical relevance of particular markers, ratios of posterior to prior odds based on two 2.5% mixture models are shown in Table 6. Model (2) excludes and Model (3) includes polygenic effects. No trait showed markers with an increased evidence for an effect using mixture percentages higher than 10%. As an example of decreased number of markers showing evidence for an effect with increasing mixture percentages, the estimates for TA are illustrated in Figure 1. This pattern was found for all eight traits using both model (2) and model (3). Note that the number of SNP markers is not equal to the number of QTLs; a QTL effect may be spread over several markers in a region, whereby each individual marker picks up part of the effect of the QTL. The table lists the number of markers with substantial (3.2 < PPOR ≤ 10), strong (10 < PPOR ≤ 100) or decisive (PPOR > 100) effects. Moreover, Figures 2 and 3 show Manhattan plots of the PPOR per marker for model (2) and model (3), respectively. Generally model (2) detected more SNP markers to be associated with QTLs than model (3). Based on model (2), the two weight traits and TA showed the highest numbers of markers associated with QTLs (30 to 57 in total), followed by TF (21 in total). The three traits with the lowest heritabilities, FB, HC and I75, showed the lowest numbers of markers associated with QTLs (7 to 10 in total).  In contrast to the variance estimates and PA, for which treating missing alleles as a separate 3 rd allele did not change their estimates, the number of markers with increased evidence to be associated with a QTL was much lower for W6m than for W6. Figures 2 and 3 show that SNP markers, indicating the presence of QTLs, differed little for models (2) and (3) in terms of the location of the QTLs. Some variation was visible in the relative weight of markers located closely together, since markers situated near a QTL might each pick up part of the QTL effect. For example, for trait W6, in which a QTL region has been previously detected on chromosome 11, model (2) detected two adjacent markers in the region with a PPOR of 105 and 14, respectively ( Figure 2), while the same markers for model (3) showed a PPOR of 29 and 42, respectively ( Figure 3). Generally, model (2) detected more QTLs with decisive evidence except for W10, for which two decisive QTLs were found using model 3.

Heritabilities
In general, higher heritabilities resulted in an increase in PA of genomic selection for all traits. Similar results were found for a different set of traits from this dataset [10,13], with PA as high as 0.67 for a trait with a high heritability (weight, h 2 = 0.74), but as low as 0.27 for a trait with a low heritability (body length, h 2 = 0.13). However, the relationship between heritability and PA was far from linear, as can be seen when comparing for example TF and I75, where the latter trait had a lower heritability but a higher PA using within family cross-validation. This might  indicate that other factors besides the heritability have an influence on PA of a model.

QTL and individual marker distribution
In addition to heritability, the influence of the number of QTLs on PA of a trait was also investigated. As pointed out earlier, the number of SNP markers to be associated with QTLs is higher than the number of QTLs so that we found more markers to have an substantial effect on QTLs than found by Valdar et al. [14]. Across traits, the number of markers associated with QTLs depended partially on the heritability, but especially for traits with low to moderate heritabilities the number of markers associated with QTLs varied substantially between traits with similar heritabilities. There was a clear tendency for traits with fewer SNP markers associated with QTLs to have a lower PA in the case of between family cross-validation. The only exception was HC, which had the lowest PA but not the lowest total number of markers associated with QTLs. However, this trait had a relatively high number of markers classified with the lowest levels of evidence for QTLs compared to the other traits, which indicate their low effect size. For within family cross-validation the tendency was weaker.
Simulation studies, e.g. by Zhong et al. [15], Kizilkaya et al. [16] and Meuwissen and Goddard [17], have shown that the number of QTLs affecting a trait influences the performance of genomic selection, though the influence differed depending on the methodology that was used to estimate genomic breeding values. Kizilkaya et al. [16] found that for a given amount of genetic variance, an increase in the number of QTLs affecting a trait, and thereby a reduction of the variance attributed to a single    QTL, led to a decrease in correlations between true and predicted genotype in both purebred (from 0.39 to 0.20) and multi-breed (from 0.42 to 0.30) situations. Assuming the availability of whole-genome sequence data, Meuwissen and Goddard [17] found that the accuracy of the predicted total genetic value using Bayesian methodology was higher in a scenario simulating three causative QTL per chromosome compared to that simulating 30 QTL. They suggested that the lower accuracy in the presence of more QTL may be caused by the fact that each QTL was associated with a smaller effect and therefore harder to detect and estimate accurately. For W6m, treating missing SNPs as a 3 rd allele reduced the number of detected SNPs associated with QTLs. This reduction had no influence on PA.

Behavioural traits versus weight traits and physiological traits
Analysis of variance components based on models (2) and (3) indicated that behavioural traits showed in general much lower variability attributable to cage effects. The larger variability of cage effects for weight and physiological traits was also found by Valdar et al. [11] and various reasons, such as the more automated process used to record behavioural phenotypes, were discussed. Behavioural traits are generally difficult to collect in large quantities and difficult to measure directly, and therefore require suitable proxy traits.

Cross-validation
There is a vast statistical literature on model comparison criteria using Bayesian and frequentist perspectives. In this work we focus on the use of genomic models to predict genetic values of individuals or to predict future observations. In such a context an obvious criterion of model comparison is their predictive ability. This was studied using cross-validation. Using this method, all three models performed equally well using within family cross-validation. Extensive pedigree information reduced the advantage of genomic information which provided only a small benefit relative to polygenic selection. However, with less close family ties, as is the case with between family cross-validation, genomic information became substantially more valuable, in agreement with other studies [13,18]. This effect was to some extent dependent on a number of factors discussed before, namely the heritability and number of QTLs affecting the trait. For FB and HC, two traits with low heritabilities and a small number of QTLs, genomic selection did not lead to an increase in PA. This indicates that a larger reference population is necessary for these traits to obtain more accurate inferences of genomic values as discussed by Goddard and Hayes [19]. For TF, a trait with moderate heritability and despite a low number of QTLs, genomic information led to a substantial increase in PA using between family cross-validation. For I75, a trait affected by a relatively large number QTLs, but with low heritability, inclusion of genomic information led to a moderate increase in PA when between family cross-validation was used.

Inclusion of polygenic effects
Adding polygenic effects to a genomic model influenced the estimated variances by picking up the part of the genetic variance that was not captured by the genomic effects model. However it had little influence on PA. Legarra et al. [13] and De los Campos et al. [20] used the same dataset and found an increased PA using genomic information relative to polygenic information, but little difference between a solely genomic model and a combined genomic-polygenic model. A simulation study showed slight increases of accuracy when adding polygenic effects to the genomic model, but this was dependent on the extent of linkage disequilibrium between adjacent markers [21]. The same study also showed that a genomic model underestimates genetic variance, but that this is improved by adding a polygenic component, as was the case in this study.

Influence of proportion of markers
A reduced number of markers assumed to have a substantial effect on the trait had an influence on estimates of variance but had no significant effect on PA for most traits. Mixture models explained less of the variance attributed to genomic effects, but resulted in better estimates of individual SNP effects. As a consequence, the PAs between models differed little. Within trait, there was a clear relationship between the mixture percentage and the number of SNP markers associated with QTLs, but no clear association with the PA. TA was the only trait to show a significant decrease in PA for within as well as between family cross-validation, but only in cases where the mixture proportion dropped below 7.5%. Weight traits showed almost the same trends for some mixture models, but for most traits no change in PA occurred even at a mixture proportion of 2.5%. Even though estimates for PA were not significantly different from each other, there seemed to be an optimum mixture percentage, with highest values obtained often around mixtures of 40%.
Su et al. [22] found similar results in dairy cattle when looking at the squared correlation between true and predicted breeding values in bulls, across a range of percentages of mixtures and traits. Reducing the percentages eventually led to lower correlations, but, depending on the trait, the decline was small and did not appear until percentages were below 20% (e.g. in the trait fat percentage in milk). In traits affected by a small number of QTLs with a large effect each, a larger part of the variance is accounted for by these QTLs. Reducing the proportion of SNPs might lead to an even higher proportion of variance explained by these QTLs and a more skewed distribution of SNP-effect size, as was shown by Su et al. [22, Figure 2]. In contrast, in traits not affected by QTLs of large effect, the variance is shared more uniformly among all available SNPs. Similar to this link between SNP-effect size and mixture percentage, a larger number of markers showing a high PPOR and a more skewed distribution of PPOR was found in the present study when the mixture percentage was reduced.
The relationship between mixture percentage and PPOR or SNP-effect size may be a reason for a slightly higher PA when the variance is distributed more evenly, which could be seen when comparing traits with more QTLs (e.g. TA) to traits with few QTLs (e.g. FB). Due to the large costs of genotyping, low density SNP arrays or methodologies that reduce the numbers of animals to be genotyped are of great importance. Research in genome-wide association studies has found that a two-stage design with pre-selection of SNPs between steps can reduce costs substantially without reducing the power of the study [8,23]. Another strategy is the use of imputation of haplotypes or of missing genotypes, for example long-range phasing [24,25]. Our results indicate that, depending on trait characteristics such as heritability and number of QTLs involved, an optimum mixture percentage, i.e. an optimum number of SNPs considered to have a substantial effect, may exist. This indicated that a pre-selected, optimal subset of SNPs could be used for genomic selection of specific traits, where high efficiency is combined with lower financial costs. However, breeding programmes involve simultaneous selection of many traits, and depending on the degree of overlap of the selected SNP markers, the total number of selected SNPs may be considerably larger than the number of SNPs selected for a single trait.

Conclusions
Genomic selection generally performed better than traditional polygenic selection, as indicated by an increase in PA. The increase in PA was most pronounced in the case of between family cross-validation. Larger increases in PA were found for traits with lower heritabilities, but the underlying number of QTLs affecting the trait had an important effect. Traits with a small number of QTLs showed lower PA using the genomic model compared to the polygenic model. Behavioural traits showed a lower variance of cage effects than other traits, but no difference in efficiency of genomic selection compared to traits with a similar heritability. Models including both polygenic and genomic effects captured more of the genetic variance, but did not improve PA. The dataset was restricted to genotyped animals only; incorporation of non-genotyped animals may show different results as a result of for example lower errors of estimation of fixed effects and higher accuracy of prediction of polygenic effects [26]. Reducing the number of SNP markers assumed to have a substantial effect in a mixture model did not significantly change PA for most traits, particularly in the context of within family cross-validation. The mixture approach showed that models using different percentages of SNPs affecting the trait performed efficiently even with low percentages, which may be of greater importance in the future with increasing sizes of SNP arrays.
In the present work, the a priori probability that a marker effect has a detectable effect was treated as a known parameter. In common with other results from the literature, this did not have a clear effect on the PA of the models. However as shown in Figure 1, the a priori probability influences the number of detectable markers a posteriori. Therefore when focus is on detection, it would be desirable to infer the probability of markers with detectable effects from the data. Recently, Bayesian implementations of such methods have been developed [27].

Animals
Data on 2,188 geno-and phenotyped mice provided by the Wellcome Trust Centre for Human Genetics were used to analyse the efficiency of genomic selection in seven traits. The data were freely available [28] and the care and use of animals were performed in compliance with the guidelines at the Wellcome Trust Centre for Human Genetics, University of Oxford, UK. The population has already been described and analyzed comprehensively in various papers including Solberg et al. [29] and Valdar et al. [11]. Therefore, only the aspects important for the present analysis will be highlighted here. Animals were obtained from crossing eight purebred mice strains, followed by 50 generations of pseudo-random mating. Data comprised of 175 full-sib families belonging to one generation and were collected over a period of three years, with a pedigree that consisted of parents and grandparents (2,890 animals in total). Parents and grandparents had no phenotypic records.

Single nucleotide polymorphism markers
After removing uninformative markers, 10,496 SNPs were retained for the analysis. All animals had a call rate above 95% and 99% of all SNPs had call rates higher than 99%. Missing alleles were imputed at random based on the Hardy-Weinberg equilibrium conditional on the observed allelic frequencies of genotyped SNPs. The random numbers were generated based on a uniform distribution. The extent of linkage disequilibrium between pairs of markers was low with an r 2 < 0.5 within 2 Mb and < 0.2 within 8 Mb [14] .

Phenotypic data
Traits were chosen across a range of heritabilities, type (weight, behavioural or physiological) and number of QTLs (Table 7), based on Valdar et al. [11, 14, suppl.]. Weight traits included body weight at the start of the test at six weeks of age (W6) and body weight at the end of the test at ten weeks of age (W10). Behavioural traits included three measurements. One measurement was recorded as part of an open field test (a model of anxiety) at six weeks of age, namely total activity, measured as distance travelled in a time span of five minutes (TA). Two measurements were recorded as part of a cue conditioning test at seven weeks of age, whereby freezing to a tone after association with a foot shock was measured: time spent freezing during cue in minutes (TF) and number of fecal boli after cue (FB). Physiological traits were hematocrit percentage in blood as part of a full blood count test (HC) and insulin level at 75 minutes after intraperitoneal injection with glucose dose as part of a test to model type 2 diabetes mellitus, at nine weeks of age (I75). For further information regarding the biology behind these traits we refer to Solberg et al. [29].
These traits were normalized using the transformation given in Valdar et al. [11] and subsequently multiplied or divided by appropriate factors to avoid rounding errors in the multi-marker programme. To investigate the influence of low frequencies of missing SNPs, weight at 6 weeks was analysed with missing values for SNPs treated as a separate 3 rd allele with low frequency (W6m).
Three basic groups of models were used to compare changes in variance components and PA as a result of using genomic information. One model used only polygenic effects (1), a second model used only genomic effects (2), and a third model fitted both effects (3). For models (2) and (3), seven different sub-models were considered based on the percentage of markers that was assumed to have a substantial effect. This included a non-mixture model using 100% and six mixture models, ranging from 70%, 40%, 10%, 7.5%, 5% to 2.5% of the SNPs assumed to have a substantial effect. In the following, these sub-models will be labelled according to the mixture percentages. All analyses were performed using a Bayesian approach and implemented with Markov chain Monte Carlo methods [30] using the programme iBay [31]. The basic model using polygenic effects can be described as follows: where μ fits a general mean and the vectors b, c, u and e fit the fixed, cage Þand residual effects e $ N 0; Iσ 2 e ð Þ ð , respectively. I is the identity matrix and A the additive genetic relationship matrix. X 1 , X 2 and Z are incidence matrices relating the vectors b, c and u with y. This is the mixed model which is most commonly used to predict traditional breeding values in animal breeding programmes. For the model using genomic effects, model (1) was changed to a Bayesian multimarker association model as follows: where Qas fits the genomic effects, with a the vector representing effects associated with marker alleles a $ N 0; 1 ð Þ ð Þ , s a scaling factor modelling the variance explained by each marker and Q the design matrix linking alleles with markers [31]. Priors were assigned to the scaling factor s as follows for the non-mixture models: where σ 2 g can be interpreted approximately as the expected average fitted variance per marker and TN denotes a truncated normal distribution. For mixture models the following scaling factors s were used: where the first distribution models the markers with on average little to no effect at a proportion π 0 , and the second distribution models the markers that have a substantial effect at a proportion π 1 . The proportions of markers π 1 were varied across mixture models ranging from 100 to 2.5%. Variances for the first distribution σ 2 g0 À Á were set to 1% of the phenotypic variance of the trait divided by the number of markers. No polygenic effects were present and all other effects were as described for model (1). Using the methodology of genomic selection as described by Meuwissen et al. [1], it was possible to solve models with more markers than phenotypic records. The last model, which combined both genetics effects of model (1) and (2), can be as described as follows: where the effects are as defined earlier.
Here the polygenic variance of u accounts for genetic variation which could not be explained by the genomic markers a.

Variance components
Estimates for the variance of polygenic effects σ 2 u ð Þ, variance of genomic effects σ 2 a ð Þ, cage variance σ 2 c ð Þ, residual variance σ 2 e ð Þ and total phenotypic variance σ 2 p À Á were calculated using information from all animals that had both genomic and phenotypic information. The variance of genomic effects σ 2 a is calculated as the sum of the contributions to the genetic variance from each marker, plus all possible covariances due to linkage disequilibrium, taking into account the allele frequencies. The heritabilities for the polygenic effects h 2 u À Á and genomic effects h 2 a À Á were calculated based on their corresponding variance components σ 2 u and σ 2 a ð , respectively) as proportion of the phenotypic variance. The software iBay required that animals had both genomic and phenotypic data available to be included in the analysis.

Predictive ability
PA was calculated as the Pearson's correlation between a predicted observation and the corresponding realized observation. Realized observation was calculated as the phenotype corrected for fixed effects and covariates, while the predicted observation was the estimated breeding value, as was done by Legarra et al. [13]. To predict these observations, a cross validation approach was used, whereby the dataset was split into a validation set and a training set. The validation set contained the animals for which the observation had to be predicted, while the training set was used to estimate the parameters for the prediction model. Size of the training set is of importance for the estimation of accurate breeding values [19] and to ensure a sufficient size of training population, a 1:5 proportion of validation to training dataset was used. Only animals from families with at least two members were used to create validation sets (~80% of all animals). These animals were randomly split into five groups to create five validation sets. Thus each validation set con-tained~16% of all animals. This was repeated to create ten validation sets in total. Each validation set had a corresponding training set, which contained the remaining animals with phenotypic data.
Two different routines for splitting the data were used: within family and between family cross-validation. For within family cross-validation, full sib families were randomly split between training and validation set such that each set contained at least one animal from a family. For between family cross-validation, families were split such that no full sib family would have animals in both datasets simultaneously. As a result, for between family cross-validation no close genetic connectedness due to full sib families was available between training and validation data. In the case of within family cross-validation, full sibs with phenotypic data linked the breeding values of the training and validation data.

Importance of individual markers
As an illustration, the relative importance of individual markers was quantified via the computation of Bayes Factors, conditional on either model (2) or model (3). The correct inferences about the statistical relevance of particular markers could involve, first, calculation of the posterior probability of each model. Secondly one could report Bayes factors conditional on the model with largest posterior probability, or averaging over all models. This task was judged to be computationally too burdensome and was not undertaken in this study. As indicated in Table 7, traits were chosen across a range of number of QTLs, ranging from as low as 1 for TF and HC up to 20 for W10. The objective was to compare the performance of genomic models (2) and (3) in finding regions with evidence of a marker having an increased effect, and to study how the number of QTLs affecting a trait influences the efficiency of genomic selection. Using the Bayesian approach implemented in the programme iBay [31], the Bayes Factor computed as the change in prior to posterior odds (PPOR) for each marker was calculated with the following formula: PPOR ¼p 1 = 1 Àp 1 ð Þ ð Þ = π 1 =1 À π 1 ð Þ ; wherep 1 is the estimate for the posterior probability of the marker having a substantial effect, and π 1 the a priori probability that the marker has a substantial effect. Results were plotted per trait for all markers, whereby a PPOR > 3.2 can be interpreted as substantial evidence for the marker to have an increased effect, a PPOR > 10 as strong evidence, and a PPOR > 100 as decisive [31].