Multiple trait multiple interval mapping of quantitative trait loci from inbred line crosses

Background Although many experiments have measurements on multiple traits, most studies map quantitative trait loci (QTL) for each trait separately using single trait analysis. Single trait analysis does not take advantage of possible genetic and environmental correlations between traits. In this paper, we propose a novel statistical method for multiple trait multiple interval mapping (MTMIM) of QTL for inbred line crosses. We also develop a novel score-based method for estimating the genome-wide significance level of putative QTL effects suitable for the MTMIM model. The MTMIM method is implemented in the freely available and widely used Windows QTL Cartographer software. Results Throughout the paper, we provide empirical evidence that: (1) the score-based threshold maintains a proper type I error rate and tends to keep the false discovery rate within an acceptable level; (2) the MTMIM method can deliver better parameter estimates and power than the single trait multiple interval mapping method; (3) an analysis of a Drosophila dataset illustrates how the MTMIM method can better extract information from datasets with measurements on multiple traits. Conclusions The MTMIM method represents a convenient statistical framework to test hypotheses of pleiotropic QTL versus closely linked nonpleiotropic QTL and of QTL by environment interaction, and to estimate the total genotypic variance-covariance matrix between traits and decompose it in terms of QTL-specific variance-covariance matrices, thereby providing more details on the genetic architecture of complex traits.


Background
Many traits that are important to agriculture, human health and evolutionary biology are quantitative in nature, influenced by multiple genes. Efficient and robust identification and mapping onto genomic positions of those genes is a very important goal in quantitative genetics. The availability of genome-wide molecular markers provides the means for us to locate and map those quantitative trait loci (QTL) in a systematic way. Since the publication of the interval mapping method for genome-wide QTL scans [1], many statistical methods have been proposed and developed to map multiple QTL with or without epistasis for a single trait in a variety of populations [2], e.g. composite interval mapping (CIM) [3,4], least squares [5,6], multiple interval mapping (MIM) [7], and Bayesian interval mapping [8,9].
Although single trait QTL mapping methods have been applied in many studies to estimate the genetic basis and architecture of complex traits, these methods do not use the information in genetic and environmental correlations between traits, and are therefore not ideal for analyzing data with multiple traits. Multiple trait analysis, however, can take these correlations into account and can also formally test a number of hypotheses concerning the nature of genetic correlations, such as pleiotropy vs. close linkage and genotype by environment interaction [10]. Moreover, multiple trait analysis allows the estimation of the genetic variance-covariance matrix between traits and its decomposition in terms of individual QTL ([11] and [12], pages 109-110).
http://www.biomedcentral.com/1471-2156/13/67
Multiple trait CIM [10], least squares [13] and Bayesian [14,15] methods have been available for multiple trait QTL analysis. However, these methods have not been targeted to multiple QTL for multiple traits, i.e. the full set of QTL that contribute to the genetic variances and covariances. These methods also lack appropriate criteria for assessing the genome-wide significance level of QTL effects. The multiple trait CIM method uses a genome-wide threshold based on either an asymptotic approximation of the log-likelihood ratio test (LRT) or permutation [16]. Nevertheless, when applied to multiple QTL models, the permutation test has some limitations in testing some targeted hypotheses. In this study, we develop: (1) a statistical method for multiple trait multiple interval mapping (MTMIM) of QTL from inbred line crosses, and (2) a score-based threshold for assessing the significance level of QTL that is suitable for MTMIM. In what follows, we motivate MTMIM modeling from a practical point of view, describe the MTMIM statistical model, build the likelihood function, derive parameter estimators, extend the score-based threshold method [17] to the MTMIM model, propose a forward selection strategy to build an MTMIM model using the score-based threshold as a criterion to assess the significance level of QTL effects, and propose a model optimization procedure to fine tune a fitted MTMIM model. We then frame the hypothesis testing of pleiotropic versus closely linked nonpleiotropic QTL, and of QTL by environment interaction, via the MTMIM model. Next, we implement the MTMIM model and score-based threshold method and evaluate them with several simulated datasets. More specifically, we evaluate type I error, model fitting, and the efficiency of the test of pleiotropic versus closely linked nonpleiotropic QTL delivered by the MTMIM model. Lastly, we demonstrate the usefulness of the MTMIM model by analyzing data from an experiment with Drosophila fruit flies and present our final considerations.
We organize this paper in a manner such that a reader less interested in the mathematical aspect of the modeling could skip the analytical derivations while being able to understand the main points regarding multiple trait multiple interval mapping of QTL.

A motivating example
We use data from a cross between the fruit flies Drosophila simulans and D. mauritiana to motivate MTMIM modeling. Detailed information about the experiment can be found in [18,19]. Briefly, males from an inbred line of D. mauritiana (Rob A JJ) were crossed to females from an inbred line of D. simulans (13w JJ) to produce F1 hybrids. F1 females were then crossed to each parental species to produce two backcross populations of males, mauritiana backcross (BM) and simulans backcross (BS). These two crosses were repeated one more time to produce two independent populations from each backcross: BS1 (sample size n=186), BS2 (n=288), BM1 (n=192) and BM2 (n=299).
Males from BM1 and BS1 were scored at 45 marker loci for which the two parental lines were homozygous for different alleles. Males from BM2 and BS2 were scored at 42 of the same 45 marker loci. The phenotypic values of each subject are: (1) average over both sides (left and right) of the first principal component of 100 Fourier coefficients of the posterior lobe (PC1); (2) area of the posterior lobe (AREA); (3) average over both sides of the first principal component of 100 Fourier coefficients of the rescaled posterior lobe, rescaled so that it has unit area (ADJPC1); and (4) length of the foreleg tibia (TIBIA). While PC1 provides a measure of both size and shape of the posterior lobe, AREA and ADJPC1 provide measures of size and shape somewhat separately. TIBIA provides a measure of overall body size. The genotypic and phenotypic data are freely available at [20].
All variables related to the posterior lobe (PC1, ADJPC1 and AREA) were reported to be highly correlated with one another in both BM1 and BS1 (correlations larger than 0.82) [18], suggesting the presence of pleiotropic and/or closely linked QTL affecting size and shape. However, all variables related to the posterior lobe were weakly correlated with TIBIA. Because posterior lobe shape and size possibly share most of their developmental process components, these two traits could be tightly related mostly due to pleiotropic effects [18]. Results of composite interval mapping analysis of AREA, PC1, and ADJPC1 were very similar to each other, except for the presence of a QTL affecting both AREA and PC1 but not ADJPC1 in the interval between marker loci Ddc and eve. Therefore, this QTL affects size but not shape of the posterior lobe [18]. In this article, we use only the PC1 and ADJPC1 traits and the BM1 and BM2 samples. AREA was not analyzed because it is highly correlated (0.99) with PC1 [18], and TIBIA was not analyzed because, according to Liu and coauthors [18], it has a small correlation with AREA and in general TIBIA is not an important factor governing the variability of posterior lobe shape. Besides, in our single trait analysis no QTL was found for TIBIA. BS1 and BS2 samples were not used for analysis because the main goal of this article is to present a novel method for QTL mapping, rather than to investigate details of the inheritance of posterior lobe shape.
We carried out MIM analysis of PC1 and ADJPC1 in the pooled samples of BM1 and BM2 (n=192+299), hereafter referred to as the BM data, and we found statistical evidence for seventeen genomic regions harboring QTL (Figure 1), of which twelve genomic regions showed statistical evidence of QTL affecting both traits (perhaps pleiotropic QTL), and five regions showed statistical evidence of QTL affecting only one of the traits (regions 3, 6, 9, 12 and 15). In all five of these regions except region 6, even for the trait for which the effect is not statistically significant there is still some evidence of a weak putative QTL effect, as shown in the LRT profiles from the MIM analysis of PC1 and ADJPC1. Region 6, which includes marker loci Ddc and eve, was previously reported not to harbor any putative QTL with significant effect on ADJPC1 [18]. Overall, the genomic regions inferred to harbor putative QTL in our MIM analysis are in strong agreement with the QTL previously inferred in [18,19].
Positions of mapped QTL in regions 4, 5, 7, 10, 11, 13, 14, 16 and 17 (Figure 1) did not coincide in the MIM models of PC1 and ADJPC1. Therefore, one could hypothesize the existence of two closely linked nonpleiotropic QTL at each of these regions. In order to test the hypotheses of pleiotropic versus closely linked nonpleiotropic QTL at each one of these regions, a joint analysis of PC1 and ADJPC1 is needed. The joint analysis also allows us to partition the genotypic variance-covariance matrix between traits PC1 and ADJPC1 in terms of QTL-specific variance-covariance matrices. Thus in this motivating example, the main reasons to use the MTMIM model are: (1) to test pleiotropic versus closely linked nonpleiotropic QTL, and (2) to estimate the contribution of each QTL to the total genotypic variance-covariance matrix between traits PC1 and ADJPC1. A third reason for the MTMIM modeling, though not applicable to this specific motivating data, is the possibility to test the hypothesis of QTL by environment interaction [10].

Type I error
The results show close agreement between the estimated type I error and the nominal level over the range of 1 to 15% (Figure 2).

Model size (results not shown)
The number of QTL in the MTMIM model of scenario SI was much closer to the simulated parameter (five QTL) when compared to scenario SII, for any genome-wide significance level. While a QTL in both scenarios has to exceed very similar thresholds to be declared significant in the forward selection, the number of traits affected by a QTL is rather different between the two scenarios. In scenario SI all QTL have effects on all traits, while in scenario SII a QTL may have an effect on one, two or three traits. Therefore, model overparametrization makes the detection of QTL with effects on one and two traits in scenario SII more difficult. Lastly, our results show that in general the number of mapped QTL is closer to the simulated number (five QTL) in the MTMIM than in the MIM model.

Figure 2 Estimated and expected type I error, in percentage, of the LRT when using the genome-wide score-based threshold to assess the significance level of putative QTL in a genome-wide scan of 1000 replicates.

FDR
FDR is a very important measure of quality control in statistical analysis [21]. However, FDR cannot feasibly be estimated in the analysis of data from traditional QTL experiments, due to the low discovery rate of putative QTL in such experiments. Nevertheless, in simulation experiments we are able to estimate FDR because we can replicate the experiment many times. We estimate FDR (Table 1) when varying the genome-wide significance levels (1, 5, and 10%) and LOD-d support interval levels (d=1, 1.5 and 2). While FDR is expected to increase with increments in genome-wide significance level, our results show that for a fixed LOD-d level FDR changed little with increments in genome-wide significance levels, in both MIM and MTMIM models. Regarding changes in LOD-d level, our results show that FDR and LOD-d are negatively correlated, as expected. Higher levels of LOD-d ultimately translate into wider LOD-d support intervals, therefore increasing the chances of capturing the true position of a QTL. FDR in the MIM and MTMIM models were very similar, except in the MIM model of trait T3 of scenario SII, which was simulated with only one QTL of small effect (heritability of 5%).
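To make the link between the LOD-d level and interval width concrete, here is a minimal Python sketch. It assumes the usual definition of a LOD-d support interval (the contiguous stretch of positions around the peak whose LOD score is within d units of the maximum); the function name and the toy LOD profile are hypothetical and are not taken from the simulations reported here.

```python
def lod_support_interval(positions, lod, d=1.5):
    """Return (left, right) bounds of the LOD-d support interval.

    Assumed definition: starting from the position with the highest LOD,
    extend left and right while the LOD stays within d units of the peak.
    """
    peak = max(range(len(lod)), key=lod.__getitem__)
    cutoff = lod[peak] - d
    left = peak
    while left > 0 and lod[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lod) - 1 and lod[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[right]


# Toy profile on a 1 cM grid (hypothetical values).
positions = [0, 1, 2, 3, 4, 5, 6]
lod = [0.5, 2.0, 3.2, 4.0, 3.0, 1.0, 3.9]

narrow = lod_support_interval(positions, lod, d=1.5)
wide = lod_support_interval(positions, lod, d=2.0)
```

In this toy profile the LOD-2 interval contains the LOD-1.5 interval, which is the mechanism behind the negative relationship between FDR and LOD-d described above: a wider interval is more likely to cover a simulated QTL position.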

Power
Results of power for the MIM and MTMIM models of all three scenarios clearly show a marked increase in power as genome-wide significance levels grow less stringent, for any LOD-d level (Table 2; results shown for the LOD-1.5 level only). Based on these results, as well as on those that showed near constancy of FDR across genome-wide significance levels, we hereafter show and discuss results for the 10% genome-wide significance level only.
Results of power (10% genome-wide significance level and LOD-1.5) to identify QTL in the MTMIM model show that QTL affecting more traits have higher chances of being identified in the forward selection. In scenario SI, which is the most favorable among all three scenarios, all QTL have effects on all traits. Therefore, all QTL were
In scenarios SII and SIII, we decomposed the power of QTL identification (10% genome-wide significance level and LOD-1.5) into three nonoverlapping subsets (Table 3). In scenario SII, there is a subset of replicates for which a QTL affects T1 only, another subset for which a QTL affects T1 and T2 simultaneously, and finally a subset of replicates for which a QTL affects all traits (T1, T2, and T3) simultaneously. In scenario SIII, there is a subset of replicates for which a QTL affects T1 only, another subset for which a QTL affects T2 only, and finally a subset of replicates for which a QTL affects T1 and T2 simultaneously. These decompositions allow us to break down the total power in the MTMIM model into QTL-trait power, therefore enabling us to measure the frequency with which a nonpleiotropic QTL is mapped as a pleiotropic one. In scenario SII, where all QTL are independent, most of the power to identify a QTL is concentrated on the simulated trait affected by that QTL. For instance, at the LOD-1.5 level, 66.4 out of 78.2% power (0.85 ratio) to identify Q1 is due to T1 alone, which is the only trait on which Q1 has an effect. In scenario SIII, because of linkage between QTL pairs Q1 and Q2, and Q4 and Q5, the contribution of the simulated traits affected by these QTL to their overall power is lower than in scenario SII, though the simulated traits still account for a large amount of power. For example, 36.8 out of 70% power (0.53 ratio) to identify Q1 is due to T1 alone, which is the only trait on which Q1 has an effect, and 46 out of 68% power (0.68 ratio) to identify Q5 is due to T1 alone, which is the only trait on which Q5 has an effect. Notice that in scenario SIII Q1 was mapped as a pleiotropic QTL (subset (1,1) in Table 3) more often than Q5, i.e. 30.4 out of 70% (0.43 ratio) and 20.8 out of 68% (0.31 ratio), respectively. Identification of Q1 as being pleiotropic more often than Q5 is mainly because the distance between Q1 and Q2 is shorter than the distance between Q4 and Q5, 10 and 15 cM, respectively. The smaller the distance between two nonpleiotropic QTL, the harder it is to separate them in the MTMIM model. Moreover, separation of nonpleiotropic QTL is also affected by the distance between genetic markers. Linkage maps with closely spaced markers are expected to help in separating nonpleiotropic QTL. On the other hand, separation of nonpleiotropic QTL in linkage maps with sparse markers, such as the linkage map used in our simulations, is a much harder task.

Table 3 Decomposition of total power (P_total in Table 2) from scenarios SII and SIII into QTL-trait power (P_trait) with 10% genome-wide significance level and LOD-1.5 support interval. In SII, subsets (1, 0, 0), (1, 1, 0) and (1, 1, 1) contain replicates with QTL affecting T1 only, T1 and T2, and T1, T2 and T3, respectively. In SIII, subsets (1, 0), (0, 1) and (1, 1) contain replicates with QTL affecting T1 only, T2 only, and T1 and T2, respectively. The ratio of QTL-trait power to overall power (ratio = P_trait / P_total) is also presented.
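The ratios quoted in this subsection are QTL-trait power divided by total power. As a quick arithmetic check, the short Python sketch below recomputes them from the percentages given in the text; the dictionary layout and the names used are illustrative only.

```python
# (P_trait, P_total) in percent, copied from the text of this subsection.
quoted = {
    ("SII", "Q1", "T1 only"): (66.4, 78.2),
    ("SIII", "Q1", "T1 only"): (36.8, 70.0),
    ("SIII", "Q5", "T1 only"): (46.0, 68.0),
    ("SIII", "Q1", "T1 and T2"): (30.4, 70.0),
    ("SIII", "Q5", "T1 and T2"): (20.8, 68.0),
}


def power_ratio(p_trait, p_total):
    """Share of the total power attributable to one trait subset."""
    return p_trait / p_total


ratios = {key: round(power_ratio(*value), 2) for key, value in quoted.items()}

for key, ratio in ratios.items():
    print(key, ratio)
```

The rounded values reproduce the ratios reported in the text (0.85, 0.53, 0.68, 0.43 and 0.31).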

Mean position of QTL
Our simulations show that mean estimates of QTL position in the MIM and MTMIM models have no qualitative difference and are in close agreement with the simulated parameters (Table 4). There is, though, a trend of smaller variation (measured in terms of standard error of the mean) in the MTMIM than in the MIM model. Also, in the MTMIM model there is a trend of smaller variation for those QTL with effects on a larger subset of traits.

Table 4 Means of QTL position (cM), LOD-d support interval coverage (%) and length (cM) in the MIM and MTMIM models as observed in scenario SII across LOD-d support intervals (1, 1.5 and 2) and 10% genome-wide significance level. Position estimates shown here are for the LOD-1.5 support interval only. The chromosome in which each QTL is located is shown between square brackets. Standard errors of means are between parentheses.

Coverage and length of LOD-d support interval
In Table 4

Mean effect of QTL
The average of effects of QTL in scenario SI (Table 5) shows that estimates of QTL effects in the MTMIM model are overall in close agreement with the simulated parameters, mostly because of high power in this scenario. Results of scenario SII demonstrate the robustness of the MTMIM model in estimating the effects of QTL, whereby QTL without effects on certain traits have estimates near zero, while QTL with nonzero effects have estimates with low bias. However, the robustness of the MTMIM to estimate QTL effect with low bias is less evident in scenario SIII. For instance, notice that while Q2 has zero effect on T1, its effect estimate is not close to zero. In order to understand why this bias is present in Q2 of scenario SIII, we need to understand how we matched a mapped to a simulated QTL. In the forward selection we searched and mapped pleiotropic QTL, then each mapped pleiotropic QTL was tested against the alternative hypothesis of closely linked nonpleiotropic QTL at the neighboring region of the mapped pleiotropic QTL. If the pleiotropic hypothesis was not rejected, we assumed the QTL was pleiotropic. Then, in order to apply our summary statistics, each mapped pleiotropic QTL was matched to its closest (smallest distance) simulated QTL. It could happen that a mapped pleiotropic QTL in the neighboring region of simulated Q1 and Q2 be matched to Q2, even though the major effect of the mapped pleiotropic QTL comes from Q1. Notice that when the previous situation happens, we mistakenly assign the effect of Q1 (which affects only T1) to Q2 (which presumably would not affect T1), therefore, producing biased estimated effect of Q2 on T1. The same explanation of "bias" carries over to Q4 (T1), Q1 (T2) and Q5 (T2) in scenario SIII. 
We put "bias" in quotation marks to emphasize that the bias observed in scenario SIII is not due to the MTMIM estimation per se, but rather due to our limited ability to separate closely linked nonpleiotropic QTL or to our criterion for matching mapped to simulated QTL. The effects of all QTL were overestimated in the MIM model. This phenomenon is expected due to estimation conditional on detection, the so-called "Beavis effect" [22]. A qualitative comparison of results shows that overall the estimates of QTL effects in the MTMIM model are less biased than in the MIM model.
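The matching rule described above (each mapped QTL is assigned to the closest simulated QTL) can be sketched in a few lines of Python. The absolute positions below are hypothetical; only the 10 cM spacing between Q1 and Q2 comes from scenario SIII, and the function name is an illustrative choice.

```python
def match_to_simulated(mapped_pos, simulated):
    """Return the name of the simulated QTL closest (in cM) to a mapped QTL."""
    return min(simulated, key=lambda name: abs(simulated[name] - mapped_pos))


# Hypothetical positions in cM; Q1 and Q2 are 10 cM apart as in scenario SIII.
simulated = {"Q1": 20.0, "Q2": 30.0}

# A single pleiotropic QTL mapped between the two simulated QTL.
match_near_q1 = match_to_simulated(24.0, simulated)
match_near_q2 = match_to_simulated(26.0, simulated)
```

A mapped QTL at 26 cM is assigned to Q2 even if most of its estimated effect on T1 is driven by Q1, which is how a nonzero T1 effect can end up attributed to Q2 in the summary statistics.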

Pleiotropic versus closely linked nonpleiotropic QTL
In scenario SIII, after selecting an MTMIM model in the forward selection, each mapped pleiotropic QTL was tested against the alternative of closely linked nonpleiotropic QTL. In the bivariate model, we performed a two-dimensional search for positions of putative closely linked nonpleiotropic QTL in the neighborhood of the position of each pleiotropic QTL, as suggested in [10]. The model with nonpleiotropic QTL that showed highest likelihood within the two-dimensional search region was selected and tested against the model with pleiotropic QTL. We compared two criteria for model selection, the AICc and LRT. The critical value for the LRT at 5% significance level was obtained from a chi-squared probability distribution with one degree of freedom.
Because Q3 was simulated as being pleiotropic, rejection of the pleiotropic hypothesis for Q3 provides a measure of type I error. On the other hand, Q1 and Q2, and Q4 and Q5, were simulated as pairs of closely linked nonpleiotropic QTL. Therefore, rejection of the pleiotropic hypothesis at these QTL provides a measure of power. Under our simulation setting, the LRT performed better than the AICc, keeping the best balance between type I error and power. The estimated frequency of rejecting pleiotropy for Q3 (4%) using the LRT agrees very well with the expected 5% nominal error rate, and the estimated frequencies of rejecting pleiotropy for Q1 (38%) and Q2 (36%) are satisfactorily high, taking into account that Q1 and Q2 are considerably close to each other in a linkage map with markers considerably distant from each other (10 cM from marker to marker). On the other hand, the AICc criterion showed higher power for Q1 (45%) and Q2 (45%), but at the cost of a high type I error for Q3 (15%). Moreover, because Q4 and Q5 are 15 cM apart from each other, the frequency of rejecting pleiotropy using the LRT for these two QTL (41 and 48%, respectively) is higher than for Q1 (38%) and Q2 (36%), which are 10 cM apart from each other.
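The LRT decision rule used here (compare twice the log-likelihood difference with a chi-squared distribution with one degree of freedom) can be sketched as follows. This is only an illustration: the log-likelihood values are hypothetical, the function names are ours, and the tail probability uses the standard identity P(X > x) = erfc(sqrt(x/2)) for one degree of freedom.

```python
import math


def chi2_1df_pvalue(lrt):
    """Upper tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(lrt / 2.0))


def reject_pleiotropy(loglik_pleiotropic, loglik_linked, alpha=0.05):
    """LRT of pleiotropy (null) against closely linked nonpleiotropic QTL.

    Returns the LRT statistic and whether the null is rejected at level alpha.
    """
    lrt = max(2.0 * (loglik_linked - loglik_pleiotropic), 0.0)
    return lrt, chi2_1df_pvalue(lrt) < alpha


# Hypothetical maximized log-likelihoods of the two competing models.
lrt_strong, rejected_strong = reject_pleiotropy(-100.0, -97.0)
lrt_weak, rejected_weak = reject_pleiotropy(-100.0, -99.0)
```

With these made-up values the first comparison (LRT = 6) rejects pleiotropy at the 5% level, while the second (LRT = 2) does not; the 5% critical value of this distribution is about 3.84.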

Motivating example revisited
Motivated by the fact that the joint analysis of PC1 and ADJPC1 in the Drosophila dataset could provide additional information to distinguish between genetic effects of QTL on size and shape of the posterior lobe, we analyzed these two traits with the MTMIM model. This additional information comes from: (1) testing pleiotropic versus closely linked nonpleiotropic QTL, and (2) estimating the contribution of each QTL in the fitted model to the genotypic variance-covariance matrix between PC1 and ADJPC1. In what follows, we show results of the MIM and MTMIM models of the pooled samples from BM1 and BM2 (n=192+299), the BM data. We also take advantage of this dataset to test the GEM-NR algorithm for maximizing the likelihood function under the MTMIM model with many QTL. Using data from a genetic experiment provides more realistic comparisons between the GEM-NR and ECM algorithms than a simulated dataset would.
The LRT profiles of the genome-wide scan in the BM data (Figure 1) show that the MTMIM model produced smaller values of LRT than the MIM model for some genomic positions, therefore seemingly violating the expectation that the MTMIM model would produce greater LRT values than the nested MIM models [10]. Nevertheless, this violation is easily explained because not all positions of putative QTL in the MIM and MTMIM models coincide. Therefore, the MIM models are not nested within the MTMIM model shown here. Seventeen regions in the genome showed statistical evidence of putative QTL in the MTMIM model with 10% genome-wide significance level (Figure 1 and Table 6).
The MIM models of PC1 and ADJPC1 taken together showed statistical evidence of twelve genomic regions with statistically significant QTL affecting both traits, and five regions with statistically significant QTL affecting only one of the traits (regions 3, 6, 9, 12 and 15 shown in Figure 1 and Table 6). The MTMIM model mapped these five regions either exactly at or very close to their respective estimated positions in the MIM models. Moreover, the estimated effects of these five regions in the MTMIM model showed small discrepancies from the estimates in the MIM models (Table 6). Nevertheless, empirical results from our simulations suggest that estimates of both positions and effects of QTL in the MTMIM model are more accurate than in the MIM models.
Positions of QTL in regions 4, 5, 7, 10, 11, 13, 14, 16 and 17 (Figure 1 and Table 6) did not coincide with those in the MIM models of PC1 and ADJPC1. Therefore, one could hypothesize the existence of two closely linked nonpleiotropic QTL at each of these regions. We tested the hypothesis of pleiotropic QTL versus closely linked nonpleiotropic QTL at each of these regions, and on the basis of the data available the null hypothesis of pleiotropic QTL could not be rejected for any region. Thus, since PC1 contains attributes of both shape and size of the posterior lobe, whereas ADJPC1 contains attributes of shape only, the available data provide strong evidence that the genetic mechanisms controlling shape and size of the posterior lobe are highly similar.
Partition of the phenotypic variance-covariance matrix between PC1 and ADJPC1 in terms of their environmental and genotypic components, as estimated in the MTMIM model, shows that most of the phenotypic variance-covariance between these traits is due to the genotypic component (Table 6). Moreover, we partitioned the total genotypic variance-covariance matrix in terms of QTL-specific variance-covariance matrices (Table 7) as proposed in [11] and [12] (pages 109-110). This decomposition of the genotypic variance-covariance matrix shows how much of the total genotypic variance-covariance is explained by each QTL in the fitted model. The possibility of fitting many traits and many QTL in the MTMIM model imposes a severe burden on the estimation of parameters, both in terms of reliability of parameter estimates (accuracy) and computation time (speed). The GEM-NR and ECM algorithms are two alternative approaches suitable for parameter estimation in such complex models. We evaluate these two algorithms with the BM data by fitting an MTMIM model for PC1 and ADJPC1. The results (Figure 3) show a large gain of GEM-NR over ECM in terms of number of iterations, 19 and 52, respectively, as well as in terms of computing time, 8.2 and 30.6 seconds on a desktop PC, respectively. The gain in computation time from GEM-NR is even more evident in genome-wide scans and model selection, because likelihood maximization has to be computed many times. Parameter estimates delivered by GEM-NR and ECM were very similar (Table 6).

Table 6 Estimates of QTL position (p) and main effect on PC1 ($\beta_1$) and ADJPC1 ($\beta_2$) in the MIM and MTMIM models of BM data with 10% genome-wide significance level. QTL effects in the MTMIM model were estimated via GEM-NR and ECM algorithms. Estimated phenotypic ($\hat{\Sigma}_p$) and genotypic ($\hat{\Sigma}_g$) variance-covariance matrices (multiplied by $10^5$) are also shown. (a) Estimated position (cM) of QTL from the leftmost genetic marker on the chromosome. (ns) Nonsignificant main effect tested with the LRT and 5% significance level. The critical value of the LRT was obtained from the chi-squared distribution function with one degree of freedom.

Conclusions
A novel statistical method for multiple trait multiple interval mapping (MTMIM) of QTL from inbred line crosses was proposed and developed. We also proposed a novel method for estimating genome-wide threshold and assessing the significance level of putative QTL effects in the MTMIM model. The method of genome-wide threshold estimation is based on the score-based resampling framework [17]. The MTMIM model has the advantage of allowing us to map QTL with effects on multiple traits, while taking advantage of information from correlations between traits. The MTMIM model has been implemented in the freely available software Windows QTL Cartographer [23].
The MTMIM model provides a comprehensive framework for QTL inference on multiple traits and the score-based threshold serves as an essential and elegant tool for computing the significance level of effects of putative QTL in the genome-wide scan. The MTMIM model and score-based threshold were evaluated through simulations. Also, we analyzed data from an experiment with Drosophila for the purpose of illustrating the MTMIM model and evaluating the performances of the GEM-NR and ECM algorithms.
Results from our simulations showed many interesting features of the MTMIM model and score-based threshold. First, the score-based threshold maintained the type I error at a desired nominal level when no QTL effects were present in the simulated datasets. Second, discovery of spurious QTL (false discovery rate) was almost constant across genome-wide significance levels of 1, 5 and 10%, while power to identify simulated QTL increased substantially as the significance level grew less stringent. Therefore, a more liberal (10%) genome-wide significance level could be used in the genome-wide scan, corroborating the results of C. Laurie, S. Wang However, the support interval was much wider in the MIM than in MTMIM model. Overall, a qualitative comparison of results from the MIM and MTMIM models shows that effect estimates in the latter are less biased than in the former. Lastly, the LRT was shown to keep adequate type I error level when testing the null hypothesis of pleiotropic QTL against the alternative of closely linked nonpleiotropic QTL in the bivariate analysis, while it delivered reasonable power when data were generated under the alternative.
Throughout this paper, we provided empirical evidence that the score-based threshold maintained a proper type I error rate and tends to give a false discovery rate within an acceptable level, and that the MTMIM model can deliver better parameter estimates and power than the MIM model. In addition, the MTMIM model provides a framework to test hypotheses of pleiotropic QTL versus closely linked nonpleiotropic QTL and of QTL by environment interaction, and to estimate the total genotypic variance-covariance matrix between traits and decompose it in terms of QTL-specific variance-covariance matrices. An analysis of phenotypic and genotypic data from an experiment with Drosophila illustrated the new tools present in the MTMIM model. In conclusion, the MTMIM model is a valuable tool to better extract information from experiments with measurements on multiple quantitative traits, thereby providing more details on the genetic architecture of complex traits.

Methods
In what follows, for any matrix $A$, its transpose is denoted by $A'$, its inverse by $A^{-1}$, its $u$th row by $A_{[u,\cdot]}$, its $v$th column by $A_{[\cdot,v]}$, and its element in row $u$ and column $v$ by $A_{[u,v]}$.

Table 7 Estimated QTL-specific genotypic variance-covariance matrix (multiplied by $10^5$) between traits PC1 and ADJPC1

Statistical model
Our statistical model for multiple trait multiple QTL inference for a backcross (BC) population is a linear model, in which the measurement $y_{ti}$ of trait $t$ ($t = 1, 2, \cdots, T$) on each subject $i$ ($i = 1, 2, \cdots, n$) is regressed on variables $x_{ir}$ ($r = 1, 2, \cdots, m$). These variables are defined according to the Cockerham genetic model [24,25]. For each subject $i$, $x_{ir}$ takes either value $\frac{1}{2}$ or $-\frac{1}{2}$, depending on whether QTL $r$ has homozygous or heterozygous genotype, respectively. The coefficient $\beta_{tr}$ is called the main effect of the $r$th QTL on trait $t$. The linear model also includes an intercept $\mu_t$ for each trait, may include a subset of $p$ epistatic effects ($w_{trl}$) among all pairwise QTL interactions ($r$ and $l \in \{1, 2, \cdots, m\}$), and includes a residual $e_{ti}$. The linear model is:

$$y_{ti} = \mu_t + \sum_{r=1}^{m} \beta_{tr} x_{ir} + \sum_{r<l}^{p} w_{trl}\, x_{ir} x_{il} + e_{ti}.$$

For each subject $i$, let $y_i = (y_{1i}, y_{2i}, \cdots, y_{Ti})'$ be a $T \times 1$ vector of trait measurements, and $e_i = (e_{1i}, e_{2i}, \cdots, e_{Ti})'$ be a $T \times 1$ random vector assumed to be independent and identically distributed according to a multivariate normal distribution with mean vector zero and positive definite symmetric variance-covariance matrix $\Sigma_e$, i.e., $e_i \sim MVN_T(0, \Sigma_e)$. For each $r$, let $\beta_r = (\beta_{1r}, \beta_{2r}, \cdots, \beta_{Tr})'$ be a column vector of main effects. For each interacting pair $r$ and $l$ ($r < l$), indexed by $b = 1, 2, \cdots, p$, let $w_b = (w_{1rl}, w_{2rl}, \cdots, w_{Trl})'$ be a column vector of epistatic effects. Lastly, let $\mu = (\mu_1, \mu_2, \cdots, \mu_T)'$ be a $T \times 1$ vector of means.
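As a rough illustration of the regression structure described in this subsection (a trait-specific intercept, main effects on genotype codes of 1/2 or -1/2, and optional pairwise epistatic terms), the following Python sketch computes the expected trait values of one subject. All numerical values, the function name and the data layout are hypothetical; residuals and parameter estimation are deliberately left out.

```python
def expected_traits(x, mu, beta, epistasis=None):
    """Expected value of each trait for one subject under the linear model.

    x         : list of m genotype codes, +0.5 (homozygous) or -0.5 (heterozygous)
    mu        : list of T trait intercepts
    beta      : T x m nested list, beta[t][r] = main effect of QTL r on trait t
    epistasis : dict mapping a QTL pair (r, l), r < l, to a list of T effects
    """
    epistasis = epistasis or {}
    values = []
    for t, mu_t in enumerate(mu):
        value = mu_t + sum(b * x_r for b, x_r in zip(beta[t], x))
        value += sum(w[t] * x[r] * x[l] for (r, l), w in epistasis.items())
        values.append(value)
    return values


# Two traits, two QTL, one epistatic pair (all numbers hypothetical).
x = [0.5, -0.5]
mu = [10.0, 5.0]
beta = [[2.0, 1.0],   # effects of QTL 1 and 2 on trait 1
        [0.0, 4.0]]   # QTL 1 is constrained to have no effect on trait 2
epistasis = {(0, 1): [4.0, 0.0]}

means = expected_traits(x, mu, beta, epistasis)
```

Setting an entry of `beta` to zero mirrors the idea that a QTL need not affect every trait in the model.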
We collect all effect parameters ($m$ main and $p$ epistatic effect vectors) into a $T \times s$ ($s = m + p$) matrix $B = (\beta_1, \beta_2, \cdots, \beta_m, w_1, w_2, \cdots, w_p)$, and all model parameters into a column vector $\theta = (\theta_1', \theta_2', \cdots, \theta_s', \mu', \mathrm{vect}(\Sigma_e))'$, where $\theta_b = \beta_b$ for $1 \le b \le m$ and $\theta_b = w_{b-m}$ for $m < b \le s$, and $\mathrm{vect}(\Sigma_e)$ is the operator that stacks the rows of $\Sigma_e$ one on top of the other into a column vector and then transposes it. Motivated by the fact that a QTL may not have a significant effect on all traits under analysis, we allow the nonsignificant effect parameters in each vector $\theta_b$ to be constrained to zero. Therefore, the MTMIM model allows each trait to have its own set of effect parameters, as in the seemingly unrelated regression model [26].

Likelihood function
In order to search the entire genome for significant QTL effects, the genome is partitioned into $H$ points, usually on a 1-centiMorgan (cM) grid. This partition is denoted by $\zeta$. The set of positions of $m$ putative QTL, $\lambda = \{\lambda_1, \lambda_2, \cdots, \lambda_m\}$, is assumed to be a subset of $\zeta$ [27]. For any subject $i$, let $M_i$ be the genotypic information of markers flanking the $m$ QTL, and $M^r_{i,L}$ and $M^r_{i,R}$ be the flanking markers on the left and right of QTL $r$, respectively. In a diploid species, a subject from a BC population generated from inbred line crosses has either genotype $QQ$ or $Qq$ at a locus, assuming the recurrent parent has genotype $QQ$. Therefore, if there are $m$ QTL affecting a trait, there are $2^m$ possible genotypes for any subject $i$. Genotypes are of the form $G_j = Q_1 Q_2 \cdots Q_m$, where $Q_r \in \{QQ, Qq\}$, $r = 1, 2, \cdots, m$ and $j = 1, 2, \cdots, 2^m$. Then, assuming no crossover interference between marker intervals and no more than one QTL within a marker interval, the probability of any genotype $G_j$, conditional on the genotypes of markers flanking the $m$ QTL, is

$$p_{ij} = \prod_{r=1}^{m} P\left(Q_r \mid M^r_{i,L}, M^r_{i,R}\right),$$

where the probabilities on the right-hand side of this equation can be estimated via a hidden Markov model [28].
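As a rough illustration of how the conditional genotype probabilities $p_{ij}$ could be computed, the Python sketch below covers only the simplest case, in which both flanking markers of every QTL are fully observed. It uses the Haldane map function, which is consistent with the no-interference assumption, and it is not the hidden Markov model of [28]. The function names and the 1/0 coding of marker genotypes are hypothetical choices made for this example.

```python
import math
from itertools import product

def haldane_r(d_cm):
    """Recombination fraction for a map distance in cM (Haldane, no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))

def qtl_prob_homozygous(left, right, d_left, d_right):
    """P(QTL = QQ | flanking markers) for one backcross subject.

    left, right: 1 if the marker is homozygous (recurrent-parent type), 0 if heterozygous.
    d_left, d_right: distances (cM) from the QTL to the left and right markers.
    """
    r1, r2 = haldane_r(d_left), haldane_r(d_right)
    # Likelihood of the observed marker pair given QTL = QQ and given QTL = Qq;
    # the prior 1/2 for each QTL genotype cancels in the ratio.
    hom = ((1 - r1) if left else r1) * ((1 - r2) if right else r2)
    het = (r1 if left else (1 - r1)) * (r2 if right else (1 - r2))
    return hom / (hom + het)

def genotype_probs(flanks):
    """p_ij over all 2^m joint genotypes, as a product over the m QTL.

    flanks: one (left, right, d_left, d_right) tuple per QTL.
    Returns a dict mapping a tuple of 1 (QQ) / 0 (Qq) to its probability.
    """
    per_qtl = [qtl_prob_homozygous(*f) for f in flanks]
    probs = {}
    for g in product((1, 0), repeat=len(flanks)):
        p = 1.0
        for is_hom, p_hom in zip(g, per_qtl):
            p *= p_hom if is_hom else (1.0 - p_hom)
        probs[g] = p
    return probs
```

For a QTL placed midway between two markers with discordant genotypes, the two QTL genotypes are equally likely, whereas concordant homozygous markers 10 cM apart make the homozygous QTL genotype almost certain.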
We define an $s \times 2^m$ matrix $Z$ of coded genotypes according to the Cockerham genetic model [24,25]. In the matrix $Z$, each row $b$, $Z_{[b,\cdot]}$, corresponds to a column of effect parameters in $B$ ($b = 1, 2, \cdots, s$) and each column $j$, $Z_{[\cdot,j]}$, represents a coded genotype. The individual ($L_i$) and overall ($L$) likelihood functions of the data under the MTMIM model with $m$ QTL are mixtures of $2^m$ multivariate normal distributions with different means ($\mu + BZ_{[\cdot,j]}$), a common variance-covariance matrix ($\Sigma_e$), and mixing probabilities $p_{ij}$ ($j = 1, 2, \cdots, 2^m$), i.e.,

$$L_i(\theta \mid y_i, M_i, \lambda) = \sum_{j=1}^{2^m} p_{ij}\, \phi\left(y_i \mid \mu + BZ_{[\cdot,j]}, \Sigma_e\right) \quad \text{and} \quad L(\theta \mid Y, M, \lambda) = \prod_{i=1}^{n} L_i(\theta \mid y_i, M_i, \lambda),$$

where $Y$ is the matrix of trait measurements, and $\phi(y_i \mid \mu + BZ_{[\cdot,j]}, \Sigma_e)$ is the probability density function of a multivariate normal random variable $y_i$ with mean $\mu + BZ_{[\cdot,j]}$ and variance-covariance $\Sigma_e$. In what follows, $\ell_i(\theta \mid y_i, M_i, \lambda)$ and $\ell(\theta \mid Y, M, \lambda)$ are the natural logarithms of the individual and overall likelihood functions, respectively.
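The coded-genotype matrix and the mixture likelihood can be made concrete with a small sketch. The Python code below is an illustration under stated assumptions, not the Windows QTL Cartographer implementation: it is restricted to two traits so that the bivariate normal density can be written in closed form, it codes epistasis as the product of the two main-effect codes, and all function names are hypothetical.

```python
import math
from itertools import product

def coded_genotypes(m, epistatic_pairs=()):
    """Return (genotypes, Z), where Z has s = m + p rows and 2^m columns.

    Main-effect rows use 1/2 (homozygous) and -1/2 (heterozygous).
    Epistatic rows are assumed to be products of the two main-effect codes.
    """
    genotypes = list(product((0.5, -0.5), repeat=m))
    Z = [[g[r] for g in genotypes] for r in range(m)]
    for r, l in epistatic_pairs:
        Z.append([g[r] * g[l] for g in genotypes])
    return genotypes, Z

def bvn_pdf(y, mean, cov):
    """Bivariate normal density; cov = [[s11, s12], [s12, s22]]."""
    d0, d1 = y[0] - mean[0], y[1] - mean[1]
    det = cov[0][0] * cov[1][1] - cov[0][1] ** 2
    quad = (cov[1][1] * d0 * d0 - 2.0 * cov[0][1] * d0 * d1 + cov[0][0] * d1 * d1) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def individual_loglik(y, p_i, mu, B, Z, cov):
    """log L_i = log sum_j p_ij * phi(y_i | mu + B Z[.,j], cov), for T = 2 traits.

    B is a 2 x s nested list; p_i holds the 2^m mixing probabilities of subject i.
    """
    total = 0.0
    for j, p in enumerate(p_i):
        mean = [mu[t] + sum(B[t][b] * Z[b][j] for b in range(len(Z))) for t in range(2)]
        total += p * bvn_pdf(y, mean, cov)
    return math.log(total)
```

The overall logarithm likelihood is then the sum of `individual_loglik` over subjects. When all effects in `B` are zero, every mixture component has the same mean and the mixture collapses to a single bivariate normal density.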

Parameter estimation
Estimation of parameters in the likelihood function is cumbersome because of the mixture of distributions. The expectation-maximization (EM) algorithm [29] is very popular for parameter estimation in mixture models. The EM algorithm is simple to program, provided that efficient estimators are available for the complete data. Moreover, the EM algorithm guarantees that the likelihood function is nondecreasing at every iteration.
However, EM may show slow convergence rate if there are many missing data, and EM does not provide standard errors of parameter estimates.
Many modifications of the EM algorithm and many hybrids of EM and Gauss-Newton (GN) methods have been proposed [30][31][32]. GN methods are not guaranteed to converge when the logarithm likelihood function is not concave, but when they do converge the rate is usually quadratic, as opposed to the linear rate of EM. Therefore, GN may converge much faster than EM. We describe two algorithms to obtain the maximum likelihood estimators (MLE) of parameters in the MTMIM model: expectation-conditional maximization (ECM) and a hybrid of EM and Newton-Raphson called generalized EM-NR (GEM-NR).

Expectation-conditional maximization algorithm
The EM algorithm [29] maximizes the incomplete-data logarithm likelihood function iteratively in terms of the unobserved complete-data logarithm likelihood function. If the complete-data logarithm likelihood function is messy and the M-step is complex, then the EM algorithm is no longer attractive. For such cases of a complicated M-step, [33] proposed a class of generalized EM algorithms, called expectation-conditional maximization (ECM). The ECM enjoys the convergence properties of the EM while simplifying the estimation of parameters. In the ECM, a complex M-step is broken down into many simpler CM-steps, each of which maximizes the expected complete-data logarithm likelihood function conditional on some function of the parameters. Besides simplifying the M-step, the CM-steps are often faster and more stable than the M-step because the conditional maximizations are over spaces of smaller dimension [33].

E-step: The E-step requires computation of the expectation of the complete-data logarithm likelihood function, conditional on the observed data $Y$ and evaluated at the current value of $\theta$ (see Appendix). The E-step at iteration $\nu + 1$ consists of updating the probabilities $\pi_{ij}$ as follows:

$$\pi_{ij}^{(\nu+1)} = \frac{p_{ij}\, \phi\left(y_i \mid \mu^{(\nu)} + B^{(\nu)} Z_{[\cdot,j]}, \Sigma_e^{(\nu)}\right)}{\sum_{k=1}^{2^m} p_{ik}\, \phi\left(y_i \mid \mu^{(\nu)} + B^{(\nu)} Z_{[\cdot,k]}, \Sigma_e^{(\nu)}\right)}$$

It is worth mentioning that in the E-step above, the updating equation at step $\nu + 1$ does not use the probabilities from the previous step $\nu$, i.e., it uses $p_{ij}$ instead of $\pi_{ij}^{(\nu)}$. This is the case in the QTL mapping literature because the a priori probabilities are indeed excellent estimates of the conditional probabilities of QTL genotypes given the flanking markers.

The CM-step consists of maximizing the expected complete logarithm likelihood function with respect to the unknown parameters (see Appendix).

CM-step without constrained parameters: We split the parameters into the groups $B_{[\cdot,1]}, B_{[\cdot,2]}, \cdots, B_{[\cdot,s]}$, $\mu$, and $\Sigma_e$.
Parameters within the same group are estimated simultaneously, while parameters in distinct groups are estimated consecutively. Closed-form estimators are available for each group, with the effect vectors $B_{[\cdot,b]}$ updated in turn for $b \in \{1, 2, \cdots, s\}$ (see Appendix).

CM-step with constrained parameters: The estimator of $B_{[\cdot,b]}$ for the unconstrained case is not appropriate if some parameters in $B_{[\cdot,b]}$ are constrained to zero, for instance, when estimating parameters in a model with closely linked nonpleiotropic QTL. If there are zero-constrained effect parameters in the MTMIM model, our strategy is to update the elements of $B_{[\cdot,b]}$ one at a time: given the current estimate $B^{(\nu)}_{[\cdot,b]}$, each unconstrained effect parameter $B_{[t,b]}$ is updated in turn while the constrained elements are kept at zero.

The E- and CM-steps are computed iteratively until convergence of the likelihood function. Our choice of initial values for $\mu$ and $\Sigma_e$ are the sample mean and the sample variance-covariance matrix, respectively, and all parameters in $B$ are set to zero. In the genome-wide scan, an efficient alternative is to use the converged parameters from the previous position in the search grid as initial values. It is worth mentioning that for many combinations of $i$ and $j$, the probabilities $p_{ij}$ are zero or very close to zero. Therefore, one may choose to ignore negligibly small probabilities in the computations, which may lead to a significant reduction in computation time.
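For orientation, the unconstrained CM-step updates can be sketched by setting the derivatives of the expected complete-data logarithm likelihood function to zero, one parameter group at a time. The block below is a derivation sketch that follows from the mixture model and the grouping described above; it is an assumption that it matches the form printed in the original article, and the residual symbol $r_{ij}$ is introduced here only for brevity.

```latex
% CM-step sketch, using the posterior probabilities \pi_{ij} from the E-step.
% B_{[\cdot,c]} on the right-hand side denotes the most recent value of each other column.
\begin{align*}
B_{[\cdot,b]}^{(\nu+1)} &=
  \frac{\sum_{i=1}^{n}\sum_{j=1}^{2^m} \pi_{ij}^{(\nu+1)} Z_{[b,j]}
        \left( y_i - \mu^{(\nu)} - \sum_{c \neq b} B_{[\cdot,c]} Z_{[c,j]} \right)}
       {\sum_{i=1}^{n}\sum_{j=1}^{2^m} \pi_{ij}^{(\nu+1)} Z_{[b,j]}^{2}},
  \qquad b = 1, 2, \cdots, s, \\
\mu^{(\nu+1)} &= \frac{1}{n} \sum_{i=1}^{n}\sum_{j=1}^{2^m}
  \pi_{ij}^{(\nu+1)} \left( y_i - B^{(\nu+1)} Z_{[\cdot,j]} \right), \\
\Sigma_e^{(\nu+1)} &= \frac{1}{n} \sum_{i=1}^{n}\sum_{j=1}^{2^m}
  \pi_{ij}^{(\nu+1)}\, r_{ij} r_{ij}', \qquad
  r_{ij} = y_i - \mu^{(\nu+1)} - B^{(\nu+1)} Z_{[\cdot,j]}.
\end{align*}
```

In the update for $B_{[\cdot,b]}$ the factor $\Sigma_e^{-1}$ cancels only because every element of the column is free. Once some elements are constrained to zero this cancellation no longer holds, which is one way to see why the constrained case calls for element-wise updates.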

Generalized EM algorithm based on Newton-Raphson methods
The generalized EM-Newton-Raphson (GEM-NR) methods combine the EM algorithm with the NR method for maximizing the complete-data logarithm likelihood function [30,31]. The hybrid methods take advantage of the EM algorithm to generate an accurate starting point for the NR algorithm, which usually has a faster convergence rate. By introducing a step-size $\kappa^{(\nu)}$ ($0 < \kappa^{(\nu)} \le 1$) and by replacing the incomplete-data logarithm likelihood function ($\ell$) with the expected complete-data logarithm likelihood function ($Q_c$) in the NR updating formula, a modified version of the updating equation [32] (see Appendix) is:

$$\theta^{(\nu+1)} = \theta^{(\nu)} - \kappa^{(\nu)} \left[ \frac{\partial^2 Q_c(\theta \mid Y)}{\partial\theta\, \partial\theta'} \right]^{-1} \left. \frac{\partial Q_c(\theta \mid Y)}{\partial\theta} \right|_{\theta = \theta^{(\nu)}} \tag{2}$$

The advantage of using equation (2) is that an appropriate choice of $\kappa^{(\nu)}$ guarantees that the logarithm likelihood function increases at each iteration. So long as $\kappa^{(\nu)}$ is chosen to make the matrix in (3) positive definite, the logarithm likelihood function is guaranteed to increase at every iteration (see Appendix). In (3), $C$ is the Cholesky decomposition of the negative of the matrix of second order derivatives of the complete logarithm likelihood function (see Appendix) and $I$ is an identity matrix.
To guarantee that the logarithm likelihood function is nondecreasing, [31] proposed starting with five iterations of the EM algorithm to quickly approach the MLE and then switching to NR until either convergence or a decrease of the logarithm likelihood function. If the logarithm likelihood function decreases, they suggested halving the step-size $\kappa$ up to five times. If the logarithm likelihood function still decreases, they suggested returning to the EM, running five iterations, and then switching back to NR. [31] argued that their choice of running the EM algorithm for five iterations is based on the previous experience of [34] that 95% of the change in the logarithm likelihood function from its initial value to its maximum value often happens within five EM iterations.
As $\theta^{(\xi)}$ lies in the line segment from $\theta^{(\nu)}$ to $\theta^{(\nu+1)}$, and $\theta$ lies in a high-dimensional space, the choice of $\kappa^{(\nu)}$ to make (3) positive definite may not be easy. We implemented an iterative GEM-NR procedure as follows:

1. Run the ECM algorithm for a few iterations (say five iterations);
2. Let $\theta^{(\nu)}$ be the parameter estimate in the $\nu$th EM iteration;
3. Set $\kappa^{(\nu)} = 1$;
4. Estimate $\theta^{(\nu+1)}$ using equation (2) with the first and second order derivatives of $Q_c(\theta \mid Y)$ evaluated at $\theta^{(\nu)}$;
5. If $\ell(\theta^{(\nu+1)} \mid Y, M, \lambda) > \ell(\theta^{(\nu)} \mid Y, M, \lambda)$, then set $\theta^{(\nu+1)}$ as the updated parameter; otherwise, keep repeating step 4 with smaller and smaller $\kappa^{(\nu)}$, until the likelihood function increases or until $\kappa^{(\nu)}$ gets too small, in which case start again from step 1.

In cases in which the complete-data logarithm likelihood function does not allow for a closed-form solution of the parameter estimators, [30] found that the GEM-NR can significantly reduce the computational burden when compared to the EM algorithm. In the Appendix, we derive all expressions (first and second order derivatives of the complete-data logarithm likelihood function) needed to implement the GEM-NR algorithm.
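The step-halving logic of the procedure above can be sketched on a toy problem. The Python code below is a minimal illustration, assuming a single scalar parameter and user-supplied first and second derivatives; it is not the MTMIM implementation, it omits the ECM warm start and fallback, and the function name `damped_newton` and the Poisson example are hypothetical.

```python
import math

def damped_newton(loglik, grad, hess, theta, max_iter=50, max_halvings=5, tol=1e-10):
    """Maximize a scalar-parameter log-likelihood by NR steps with step halving."""
    value = loglik(theta)
    for _ in range(max_iter):
        step = -grad(theta) / hess(theta)       # full Newton step (kappa = 1)
        kappa = 1.0
        for _ in range(max_halvings + 1):
            candidate = theta + kappa * step
            new_value = loglik(candidate)
            if new_value > value:               # accept only if the likelihood increases
                break
            kappa /= 2.0                        # otherwise shrink the step
        else:
            # No increase found; a fuller implementation would return to ECM here.
            return theta
        gain = new_value - value
        theta, value = candidate, new_value
        if gain < tol:
            break
    return theta

# Toy example: Poisson rate, log-likelihood up to an additive constant.
data = [1, 2, 3, 2, 2]
total, n = sum(data), len(data)

def poisson_loglik(lam):
    return total * math.log(lam) - n * lam if lam > 0 else float("-inf")

def poisson_grad(lam):
    return total / lam - n

def poisson_hess(lam):
    return -total / lam ** 2

estimate = damped_newton(poisson_loglik, poisson_grad, poisson_hess, 10.0)
```

Starting far from the maximum, the first full Newton steps overshoot into negative rates, so the step-size is halved until the likelihood increases; close to the maximum the full step is accepted and convergence is fast.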

Score-based threshold
We extend the score statistic [17] to assess the genomewide statistical significance level of QTL effect in the MTMIM model. Based on the individual and overall likelihood functions, we derived all required expressions to compute the score statistic to test any effect parameter in the MTMIM model (see Appendix).
Under some regularity conditions, the score and LRT statistics are asymptotically equivalent in large samples [35]. However, an interesting characteristic of the score statistic is that it can be approximated by a sum of independent random components. Motivated by this characteristic and based on the decomposition of the score function, [17,36] derived the large-sample distribution of the score statistic for genome-wide QTL mapping.
In a multiple trait genome-wide scan, a putative pleiotropic QTL is assumed at every position $\lambda \in \zeta$ and the significance of its effects (main or epistatic) is tested against the null hypothesis of no effects. For instance, assume a model with $m - 1$ QTL with main effects and $p$ epistatic effects between certain QTL pairs, and assume we are scanning for a putative $m$th QTL. Let $l$ denote the testing position of the putative QTL coming into the model, and let $\lambda = (\lambda_1, \lambda_2, \cdots, \lambda_{m-1}, l)$ be the current positions of all $m$ QTL in the model. Let $\theta_m = \beta_m$ be a $T \times 1$ vector of effects for the new QTL coming into the model, and let $\theta = (\theta_1', \theta_2', \cdots, \theta_{m-1}', \theta_m', \theta_{m+1}', \cdots, \theta_s', \mu', \mathrm{vect}(\Sigma_e))'$ be a column vector of all parameters in the model. Let $\eta = (\theta_1', \theta_2', \cdots, \theta_{m-1}', \theta_{m+1}', \cdots, \theta_s', \mu', \mathrm{vect}(\Sigma_e))'$ be the column vector of nuisance parameters. Then the hypothesis $H_0: \theta_m = 0$ versus $H_1: \theta_m \neq 0$ is assessed at every position $l$ in the genome by the LRT. The genomic position with the maximum LRT among all $l$ is assessed for the presence of a QTL via the score-based method.
The score statistic $S(l)$ to test $H_0$ versus $H_1$ is built from the individual score contributions $\hat{U}_i$ given by equation (4), in which $\hat{\eta}$ is the MLE of $\eta$ under $H_0$ (see Appendix for a detailed derivation of first and second order derivatives of the likelihood function). In order to maintain equal expected variances in the resampled score and score statistic [17], we multiply $\hat{U}_i$ by random variables $z_i$ from the univariate normal distribution with mean zero and unit variance, i.e., $z_i \sim N(0, 1)$. Let $\hat{U}_i(l)$ be equation (4) evaluated at position $l$. The resampling steps are repeated many times, say $N$ times, to obtain a sequence $(S^*_1, S^*_2, \cdots, S^*_N)$, and the score-based threshold for a given significance level $\alpha$ is the $100(1 - \alpha)$ percentile of the ascending ordered values $(S^*_{(1)}, S^*_{(2)}, \cdots, S^*_{(N)})$.
If $\hat{U}_i(l)$ in $\hat{U}^*(l)$ and $\hat{V}(l)$ are assumed to be fixed and $z_i$ in $\hat{U}^*(l)$ to be random, then: (I) the conditional distribution of $\hat{U}^*(l)$ given the observed data is normal with mean zero and the same limiting covariance as that of $\hat{U}(l)$; (II) from (I), it follows that the distributions of $n^{-1/2}\hat{U}^*(l)$ and $n^{-1/2}\hat{U}(l)$ are asymptotically equivalent; and (III) from (II), it is possible to approximate the distribution of $S(l)$ by that of $S^*(l)$ under the null hypothesis [17,37].
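The resampling idea can be sketched in a few lines of Python. The code below is an illustration under simplifying assumptions rather than the procedure as published: each score contribution is taken to be a scalar (a single tested effect) so that the statistic at position $l$ reduces to $\hat{U}^*(l)^2 / \hat{V}(l)$ with $\hat{V}(l) = \sum_i \hat{U}_i(l)^2$, the genome-wide statistic of each resample is assumed to be the maximum over positions, and the function name `score_threshold` is hypothetical.

```python
import math
import random

def score_threshold(U, alpha=0.05, n_resamples=1000, seed=0):
    """Resampling threshold from score contributions U[i][l] (subject i, position l)."""
    rng = random.Random(seed)
    n, n_pos = len(U), len(U[0])
    # Empirical variance of the score at each position (scalar case).
    V = [sum(U[i][l] ** 2 for i in range(n)) for l in range(n_pos)]
    maxima = []
    for _ in range(n_resamples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]     # one multiplier per subject
        s_star = []
        for l in range(n_pos):
            u_star = sum(U[i][l] * z[i] for i in range(n))
            s_star.append(u_star ** 2 / V[l])
        maxima.append(max(s_star))                      # genome-wide maximum
    maxima.sort()
    k = min(n_resamples - 1, math.ceil((1.0 - alpha) * n_resamples) - 1)
    return maxima[k]
```

The same multipliers $z_i$ are used at every position within a resample, which is what preserves the correlation between test statistics at linked positions. With a single position the threshold should be close to the 95th percentile of a chi-squared distribution with one degree of freedom, and adding positions can only raise it.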

Model selection
The search for QTL effects on phenotypic traits consists of identifying the subset of genomic regions for which statistical tests are significant. [38] framed the problem of finding such a subset of genomic regions as one of model selection, for which many tools are available in the vast literature on variable selection. However, in QTL studies the identification of a reasonable model, which maximizes the number of correctly identified QTL while controlling the rate of false discovery, takes precedence over the identification of models with the smallest prediction errors, which is the major criterion in model selection [38].
The score-based threshold can be used as a criterion to build and refine models with many QTL. Starting with a model with no QTL effect, we can select putative QTL and refine the model by including in or excluding from the MTMIM model any effects, all based on their statistical significance assessed via the score-based method. We propose an algorithm, analogous to the algorithm described in [11], to build an initial MTMIM model and to refine it using the score-based threshold criterion.

Forward selection
Assuming that model (1) starts with no QTL, one QTL is added at each step of the forward selection. In the $m$th step of the forward selection, we assume a putative pleiotropic QTL at every position $l \in \zeta$ (one at a time), avoiding positions within 5 cM of the $m - 1$ QTL already in the model, and compute the MLE of all parameters. For each position $l$, we compute the LRT statistic to test the null hypothesis $H_0: (\beta_{1m}, \beta_{2m}, \cdots, \beta_{Tm})' = (0, 0, \cdots, 0)'$ versus $H_1: (\beta_{1m}, \beta_{2m}, \cdots, \beta_{Tm})' \neq (0, 0, \cdots, 0)'$. A putative QTL at the position with the maximum LRT statistic is added to the model if the LRT statistic is larger than the score-based threshold. Next, the effect of the selected QTL on each trait is tested individually against the null hypothesis of no effect, using the LRT and a critical value from a chi-squared distribution with one degree of freedom at a pre-specified corrected error rate $\alpha_c$; i.e., when $T$ traits are analyzed jointly, the Bonferroni-corrected significance level to test each effect of the $m$th QTL at an error rate $\alpha$ is $\alpha_c = \alpha/T$. Finally, any nonsignificant effect of the $m$th QTL is removed from the model, ending the $m$th step of the forward selection. The forward selection continues until no maximum LRT statistic exceeds the score-based threshold.
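The per-trait pruning at the end of each forward-selection step amounts to a Bonferroni-corrected chi-squared test with one degree of freedom. The Python sketch below illustrates that rule only; the function names are hypothetical and the LRT values in the usage line are made up for illustration. It relies on the identity $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$.

```python
import math

def chi2_1df_pvalue(lrt):
    """Upper-tail probability of a chi-squared variable with one degree of freedom."""
    return math.erfc(math.sqrt(lrt / 2.0))

def significant_effects(lrt_by_trait, alpha=0.05):
    """Flag which trait-specific effects of a newly added QTL are retained.

    lrt_by_trait: one LRT statistic per trait, each testing a single effect.
    Uses the Bonferroni-corrected level alpha_c = alpha / T.
    """
    alpha_c = alpha / len(lrt_by_trait)
    return [chi2_1df_pvalue(x) < alpha_c for x in lrt_by_trait]

# Illustrative values only: three traits, the QTL is kept for the first trait.
keep = significant_effects([12.0, 4.0, 0.5], alpha=0.05)
```

In this illustration the second effect would be significant at the uncorrected 5% level but not at the corrected level $\alpha_c = 0.05/3$, so it would be removed from the model.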

Model optimization
In turn, we update the positions of all QTL in the model. We pick a QTL and hold the other QTL fixed at the positions at which they were mapped. The effects of the picked QTL are then removed from the model and a new search is done within the region delimited by its two neighboring QTL, avoiding 5 cM from each neighbor (the search extends to the end of the chromosome if there is no neighboring QTL on either side of the picked QTL). The new position of the picked QTL is set to the position of the maximum LRT statistic within the searched region and all parameters in the model are updated. This procedure is repeated until the positions of all QTL are updated.

Some suitable hypotheses in the MTMIM model

Testing pleiotropic versus closely linked nonpleiotropic QTL
Although testing for pleiotropic versus closely linked nonpleiotropic QTL is a part of model selection, we preferred to separate it from the model selection because in general this test is performed at the end of the model selection procedure, when the final model is almost fitted.
As previously stated, an advantage of multiple trait analysis is the possibility of testing for a single locus affecting multiple traits versus the alternative of two or more closely linked nonpleiotropic loci. For instance, suppose we have measurements of two traits and a total of three nonepistatic QTL at positions $\lambda_1$, $\lambda_2$ and $\lambda_3$. The multiple trait multiple QTL pleiotropic model for a subject $i$ would look like:

$$\begin{aligned} y_{1i} &= \mu_1 + \beta_{11} x_{i1} + \beta_{12} x_{i2} + \beta_{13} x_{i3} + e_{1i} \\ y_{2i} &= \mu_2 + \beta_{21} x_{i1} + \beta_{22} x_{i2} + \beta_{23} x_{i3} + e_{2i} \end{aligned} \tag{5}$$

The model above assumes that all QTL have the same pattern of pleiotropy. Suppose instead that we want to test whether the last locus in model (5) is in fact two closely linked nonpleiotropic loci. The model with two pleiotropic QTL (positions $\lambda_1$ and $\lambda_2$) and two closely linked nonpleiotropic QTL (positions $\lambda_3$ and $\lambda_4$) for a subject $i$ would look like:

$$\begin{aligned} y_{1i} &= \mu_1 + \beta_{11} x_{i1} + \beta_{12} x_{i2} + \beta_{13} x_{i3} + e_{1i} \\ y_{2i} &= \mu_2 + \beta_{21} x_{i1} + \beta_{22} x_{i2} + \beta_{24} x_{i4} + e_{2i} \end{aligned} \tag{6}$$

Or, suppose we want to test whether the last two QTL in model (6) are both pleiotropic. The model with four pleiotropic QTL for a subject $i$ would look like:

$$\begin{aligned} y_{1i} &= \mu_1 + \beta_{11} x_{i1} + \beta_{12} x_{i2} + \beta_{13} x_{i3} + \beta_{14} x_{i4} + e_{1i} \\ y_{2i} &= \mu_2 + \beta_{21} x_{i1} + \beta_{22} x_{i2} + \beta_{23} x_{i3} + \beta_{24} x_{i4} + e_{2i} \end{aligned} \tag{7}$$

Many hypotheses can be formulated and tested. For example, the hypotheses of model (5) versus (6) can be stated as $H_0: \lambda_3 = \lambda_4$ versus $H_1: \lambda_3 \neq \lambda_4$, and the hypotheses of model (6) versus (7) can be stated as $H_0: \beta_{14} = \beta_{23} = 0$ versus $H_1: \beta_{14} \neq 0$ and $\beta_{23} \neq 0$. In general, testing whether or not QTL $r$ has a pleiotropic main effect in a subset $S$ ($S \subseteq \{1, 2, \cdots, T\}$) of traits in the model means testing $H_0: \beta_{tr} = 0\ \forall\ t \in S$ versus $H_1: \beta_{tr} \neq 0$ for some $t \in S$. Likewise, testing whether or not QTL $r$ and $l$ have a pleiotropic epistatic effect in a subset $S$ of traits in the model means testing $H_0: w_{trl} = 0\ \forall\ t \in S$ versus $H_1: w_{trl} \neq 0$ for some $t \in S$. Model (6) illustrates a situation in which parameters are constrained to zero and for which the parameter estimators derived previously in the CM-step with constrained parameters are suitable.
When models are nested, the critical value to assess the strength of the LRT is straightforward, in the sense that under regularity conditions the LRT has an asymptotic chi-squared distribution with degrees of freedom equal to the difference between the number of parameters in the full and reduced models. However, the pleiotropic and close linkage models may not be nested (for instance, models (6) and (7)), which then requires some correction for the LRT [39,40]. The parametric bootstrap method [13] is an alternative for computing the empirical distribution of the LRT statistic in QTL mapping when models are not nested. Recognizing the test of pleiotropic versus closely linked nonpleiotropic QTL as one of model selection, we evaluate the performance of the corrected Akaike Information Criterion (AICc) [41] and the LRT using simulation.
When a QTL has epistasis, testing this QTL for pleiotropy versus close linkage is not trivial because the test depends not only on the QTL being tested but also on any other QTL in the model that might interact with it. In general, we suggest searching for QTL main effects first, then testing for pleiotropy versus close linkage, and finally searching for epistasis, at which point pleiotropy is either no longer tested or is tested solely for those QTL without epistasis.
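As a small aid for comparing candidate models with an information criterion, the Python sketch below computes AICc in its common small-sample form, $-2\ell + 2k + 2k(k+1)/(n-k-1)$. Whether [41] defines the criterion in exactly this form for multivariate models is an assumption, as is the use of the number of subjects for $n$; the function name is hypothetical.

```python
def aicc(loglik, k, n):
    """Small-sample corrected AIC: -2*loglik + 2k + 2k(k+1)/(n - k - 1).

    loglik: maximized logarithm likelihood of the model.
    k: number of estimated parameters; n: sample size (assumed: number of subjects).
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return -2.0 * loglik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)

def prefer_first(loglik_a, k_a, loglik_b, k_b, n):
    """True if model A has the smaller (better) AICc."""
    return aicc(loglik_a, k_a, n) < aicc(loglik_b, k_b, n)
```

When two models have the same number of parameters, the comparison reduces to comparing their maximized logarithm likelihoods, since the penalty terms are identical.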

QTL by environment interaction
The possibility of testing for QTL by environment interaction arises as another advantage of the multiple trait analysis. There are two situations in which we are able to study the differential expression of QTL: first, when the same set of genotypes is evaluated phenotypically in different environments (design I), and second, when the phenotypic evaluations are done on different sets of genotypes in different environments (design II) [10]. We regard the model for the analysis of data in design II as a multiple population model, and thus omit further discussion of it in the context of the multiple trait analysis in this paper.
Let us reiterate that in design I we regard the expression of a trait in different environments as different trait states [42]. Therefore, the index $t$ ($t = 1, 2, \cdots, T$), which was previously defined to index traits, is regarded as the environment index in what follows. With this in mind, testing whether or not the main effect of QTL $r$ on a trait is statistically different in a subset $S$ ($S \subseteq \{1, 2, \cdots, T\}$) of environments means testing $H_0: \beta_{tr} = \beta_r\ \forall\ t \in S$ versus $H_1: \beta_{tr} \neq \beta_r$ for some $t \in S$. Likewise, testing whether or not the epistatic effect of QTL $r$ and $l$ on a trait is statistically different in a subset $S$ of environments means testing $H_0: w_{trl} = 0\ \forall\ t \in S$ versus $H_1: w_{trl} \neq 0$ for some $t \in S$.
The LRT may be used to evaluate the hypotheses above. The cut-off point for the test can be obtained from the chi-squared probability distribution function with degrees of freedom being the difference between the number of parameters in the full (H 1 ) and reduced (H 0 ) models.

Evaluation of the MTMIM model by simulation
We implemented the MTMIM model and score-based threshold method, and evaluated them with several simulated datasets. More specifically, we evaluated type I error, model fitting, and the efficiency of pleiotropic versus closely linked nonpleiotropic QTL testing hypothesis delivered by the MTMIM model.

Genome-wide type I error
We use simulation to evaluate the proportion of falsely discovered QTL (type I error) in the analysis of datasets simulated without QTL effects. The LRT statistic is used for hypothesis testing and the score-based threshold is used as the criterion to assess the significance level of QTL effects in a genome-wide scan. Each replicate has six chromosomes, each with nine markers evenly spaced 10 cM apart, 300 subjects, and three quantitative traits (see Scenario S0 in Table 8). In the genome-wide scan, a putative pleiotropic QTL with main effects on all traits, $\beta = (\beta_1, \beta_2, \beta_3)'$, was assumed at each 1 cM in the genome as the alternative hypothesis. The effects of the putative QTL were then tested against the simulated null hypothesis of no effects, $\beta = (\beta_1, \beta_2, \beta_3)' = (0, 0, 0)'$ (Scenario S0 of Table 8). For each position in the genome, we resampled the score statistic 1000 times to obtain the genome-wide score-based threshold. One thousand replicates were analyzed in this type I error study.

Model fit evaluations
We use simulation to evaluate the overall performance of the MTMIM model with the score-based threshold as the criterion to assess the significance level of QTL effects in the genome-wide scan. We examined the performance of the MTMIM model in three different scenarios (SI, SII and SIII shown in Table 8), each evaluated with $R = 500$ replicates. Each replicate was simulated with six chromosomes, each with nine markers evenly spaced 10 cM apart, and 300 subjects. The genetic architecture of the quantitative traits in each scenario is described in detail in Table 8.

We evaluated the MTMIM model under three genome-wide significance levels: 1, 5 and 10%. For each replicate, all QTL selected in the forward selection are defined as mapped QTL. We summarize the performance of the MTMIM model with measures that are functions of the logarithm of odds ratio (LOD) support interval of mapped QTL. The LOD-$d$ ($d$ = 1, 1.5, and 2) support interval of a mapped QTL is a continuous genomic region that includes the position of the mapped QTL and all positions on its left and right sides with LOD values greater than or equal to the LOD value at the position of the mapped QTL minus a positive constant $d$ [1]. Let $Q_r$, for $r \in \{1, 2, \cdots, m = 5\}$, be a simulated QTL. A simulated QTL is defined as being paired with a mapped QTL if the simulated and mapped QTL are nearby. A mapped QTL is defined as being matched to a paired QTL if the LOD-$d$ support interval of the mapped QTL includes the paired QTL. A mapped QTL is defined as mismatched if it is not matched. A simulated QTL $Q_r$ is defined as identified if it has a matched QTL. For each simulated $Q_r$ and for each $d$, let $Q_{r,d}$ be the set of replicates for which $Q_r$ is identified. We define $|Q_{r,d}|$ as the number of elements in $Q_{r,d}$.
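The LOD-$d$ support interval and the matching rule lend themselves to a short sketch. The Python code below is an illustration assuming that the LOD profile is available on an ordered grid of positions along one chromosome; the function names and the example profile are hypothetical.

```python
def lod_support_interval(positions, lod, peak_index, d=1.0):
    """Return (left, right) bounds of the LOD-d support interval of a mapped QTL.

    positions: ordered grid positions (cM); lod: LOD value at each position;
    peak_index: index of the mapped QTL. The interval is the contiguous run of
    positions around the peak whose LOD is at least lod[peak_index] - d.
    """
    cutoff = lod[peak_index] - d
    left = peak_index
    while left > 0 and lod[left - 1] >= cutoff:
        left -= 1
    right = peak_index
    while right < len(lod) - 1 and lod[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[right]

def is_matched(interval, simulated_position):
    """A mapped QTL is matched if its support interval covers the paired simulated QTL."""
    left, right = interval
    return left <= simulated_position <= right

# Hypothetical LOD profile on a 1 cM grid, with the mapped QTL at index 3.
grid = [0, 1, 2, 3, 4, 5, 6]
profile = [0.5, 2.0, 3.2, 4.0, 3.5, 2.9, 3.6]
interval = lod_support_interval(grid, profile, peak_index=3, d=1.0)
```

Because the interval is required to be continuous, the position at 6 cM in this profile is excluded even though its LOD exceeds the cutoff: the profile drops below the cutoff at 5 cM first.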
A criterion to match mapped and simulated QTL that uses both the LOD-$d$ support interval and the closest distance between mapped and simulated QTL is more appropriate than the usual criterion that uses the closest distance alone. Our measures of model fit are: (1) False discovery rate per replicate, FDR$_b(d)$, which is the ratio of the number of mismatched QTL in replicate $b$ to the total number of mapped QTL in replicate $b$; (2) FDR over all replicates […]; (4) […]$(d)$, which is the ratio of $|Q_{r,d}|$ to the number of replicates for which $Q_r$ is paired with a mapped QTL; (5) Mean length of the LOD-$d$ support interval of $Q_r$, which is the average length of the LOD-$d$ support intervals of $Q_r$ over replicates in $Q_{r,d}$; (6) Mean effect of $Q_r$, which is the average of the effects of $Q_r$ over replicates in $Q_{r,d}$; (7) Mean position of $Q_r$, which is the average of the positions of $Q_r$ over replicates in $Q_{r,d}$; and (8) Model size, which is the number of mapped QTL. These summary statistics have been proposed by C. Laurie, S. Wang, L. A. Carlini-Garcia and Z-B. Zeng (unpublished observations).

Expectation-conditional maximization algorithm
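Let $z^*_i = (z^*_{i1}, z^*_{i2}, \cdots, z^*_{i2^m})'$ be a vector with information on the "missing" genotypes of the $m$ QTL for subject $i$. Each $z^*_{ij} = 1$ if the $i$th subject has genotype $G_j$ ($j = 1, 2, \cdots, 2^m$); otherwise $z^*_{ij} = 0$. Let $z^* = (z^*_1, z^*_2, \cdots, z^*_n)$ be a matrix containing the missing information from all subjects. The joint distribution of observed and missing data $(y_i, z^*_i)$ for subject $i$ is:

$$f(y_i, z^*_i \mid \theta) = \prod_{j=1}^{2^m} \left[ \phi\left(y_i \mid \mu + BZ_{[\cdot,j]}, \Sigma_e\right) p_{ij} \right]^{z^*_{ij}}$$

where $p_{ij} = P(G_j \mid M_i, R, \lambda)$, and $\phi(y_i \mid \mu + BZ_{[\cdot,j]}, \Sigma_e)$ is the probability density function of a multivariate normal random vector $y_i$ with mean vector $\mu + BZ_{[\cdot,j]}$ and variance-covariance matrix $\Sigma_e$. The joint distribution of observed and missing data allows us to obtain the complete-data logarithm likelihood function ($\ell_c$):

$$\ell_c(\theta \mid Y, z^*) = \sum_{i=1}^{n} \sum_{j=1}^{2^m} z^*_{ij} \left[ \log p_{ij} + \log \phi\left(y_i \mid \mu + BZ_{[\cdot,j]}, \Sigma_e\right) \right]$$

The E-step requires computation of the expectation of the complete-data logarithm likelihood function, conditional on the observed data $Y$ and evaluated at the current estimated values of $\theta$ (denoted here as $\theta^{(\nu)}$) [32]:

$$Q_c(\theta \mid Y) = E\left[\ell_c(\theta \mid Y, z^*) \mid Y, \theta^{(\nu)}\right] = \sum_{i=1}^{n} \sum_{j=1}^{2^m} \pi_{ij}^{(\nu+1)} \left[ \log p_{ij} + \log \phi\left(y_i \mid \mu + BZ_{[\cdot,j]}, \Sigma_e\right) \right]$$

where $\pi_{ij}^{(\nu+1)} = E\left[z^*_{ij} \mid y_i, \theta^{(\nu)}\right]$. The CM-step consists of maximizing the expected complete logarithm likelihood function with respect to the unknown parameters through derivatives (see Section Derivatives).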

Newton-Raphson method
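The NR updating formula for parameter estimation [32] is:

$$\theta^{(\nu+1)} = \theta^{(\nu)} - \left[ \frac{\partial^2 \ell(\theta \mid Y, M, \lambda)}{\partial\theta\, \partial\theta'} \right]^{-1} \left. \frac{\partial \ell(\theta \mid Y, M, \lambda)}{\partial\theta} \right|_{\theta = \theta^{(\nu)}}$$

The NR method is not very stable for complex functions because, in certain problems, it requires accurate initial values of the parameters in order to converge to the right solution. Moreover, the NR method is almost equally likely to move in the direction of saddle points, local minima or local maxima [32]. Nevertheless, the NR method has the major advantage of a quadratic convergence rate (when it does converge), and it can provide an estimate of the variance-covariance matrix of the parameters at the limiting value of $\theta$, $\theta^*$, through the inverse of the observed Fisher information matrix:

$$I(\theta^*) = - \left. \frac{\partial^2 \ell(\theta \mid Y, M, \lambda)}{\partial\theta\, \partial\theta'} \right|_{\theta = \theta^*}$$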

First order derivatives of the logarithm of the individual likelihood function
In the following equations we use the short-hand notation $\ell_i(\theta) = \ell_i(\theta \mid y_i, M_i, \lambda)$, and assume $b = 1, 2, \cdots, s$.

Extension to other crosses
The extension of the score statistic to other cross types (for instance, intercross F$_2$, recombinant inbred lines, doubled haploids) is straightforward. In fact, the auxiliary matrices and the expressions for the first and second order derivatives of the logarithm of the individual and overall likelihood functions can be obtained directly from the general expressions derived previously. For a specific cross type, the extension consists basically of building an appropriate design matrix $Z$ and matrix of parameters $B$, and replacing $2^m$ in the summations by the appropriate value for that cross type (for instance, $3^m$ for an intercross F$_2$).
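To make the change from $2^m$ to $3^m$ genotypes concrete, the Python sketch below enumerates the joint genotypes of $m$ QTL in an intercross F$_2$ and builds a main-effect design matrix. The additive codes (1, 0, -1) and dominance codes (-1/2, 1/2, -1/2) are an assumption based on the usual Cockerham-type coding for F$_2$ populations and are not stated in this section; epistatic rows are omitted, and the names `F2_CODES` and `f2_design` are hypothetical.

```python
from itertools import product

# Assumed codes for one F2 locus: genotype -> (additive, dominance).
F2_CODES = {"QQ": (1.0, -0.5), "Qq": (0.0, 0.5), "qq": (-1.0, -0.5)}

def f2_design(m):
    """Return (genotypes, Z) for m QTL in an F2 intercross.

    Z has 3**m columns (one per joint genotype) and 2*m rows:
    for each QTL, an additive row followed by a dominance row.
    """
    genotypes = list(product(F2_CODES, repeat=m))
    Z = []
    for r in range(m):
        Z.append([F2_CODES[g[r]][0] for g in genotypes])   # additive code of QTL r
        Z.append([F2_CODES[g[r]][1] for g in genotypes])   # dominance code of QTL r
    return genotypes, Z
```

With this design matrix, the mixture likelihood keeps the same form as in the backcross case, with the summation running over $3^m$ components and $B$ gaining one dominance column per QTL.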