Statistical properties of interval mapping methods on quantitative trait loci location: impact on QTL/eQTL analyses
- Xiaoqiang Wang^{1, 2},
- Hélène Gilbert^{3, 4},
- Carole Moreno^{5},
- Olivier Filangi^{1, 2},
- Jean-Michel Elsen^{5} and
- Pascale Le Roy^{1, 2}
DOI: 10.1186/1471-2156-13-29
© Wang et al; licensee BioMed Central Ltd. 2012
Received: 18 September 2011
Accepted: 20 April 2012
Published: 20 April 2012
Abstract
Background
Quantitative trait loci (QTL) detection on a very large number of phenotypes, such as eQTL detection on transcriptomic data, can be dramatically impaired by the statistical properties of interval mapping methods. One major consequence is the high number of QTL detected at marker locations. The present study aims to identify and specify the sources of this bias, in particular when analyzing data from outbred populations. Analytical developments were carried out in a backcross situation in order to specify the bias and to propose an algorithm to control it. The outbred population context was studied through simulated data sets in a wide range of situations.
The likelihood ratio test was first analyzed under the "one QTL" hypothesis in a backcross population. Designs of sib families were then simulated and analyzed using the QTLMap software. On the basis of the theoretical results in backcross, parameters such as the population size, the density of the genetic map, the QTL effect and the true location of the QTL were taken into account under the "no QTL" and the "one QTL" hypotheses. A combination of two nonparametric tests, the Kolmogorov-Smirnov test and the Mann-Whitney-Wilcoxon test, was used to identify the parameters that affected the bias and to specify how much they influenced the estimation of the QTL location.
Results
A theoretical expression of the bias of the estimated QTL location was obtained for a backcross type population. We demonstrated a common source of bias under the "no QTL" and the "one QTL" hypotheses and qualified the possible influence of several parameters. Simulation studies confirmed that the bias exists in outbred populations under both the "no QTL" and the "one QTL" hypotheses on a linkage group. The estimated QTL location was systematically closer to marker locations than expected, particularly in the case of low QTL effect, small population size or low density of markers, i.e. designs with low power. Practical recommendations for experimental designs for QTL detection in outbred populations are given on the basis of this bias quantification. Furthermore, an original algorithm is proposed to adjust the location of a QTL, obtained with interval mapping, when it co-locates with a marker.
Conclusions
Caution is therefore needed when a QTL is mapped at the location of a marker, especially under low power conditions.
Keywords
QTL linkage analysis QTL location bias
Background
Over the last decade, several studies have shown that a large proportion of QTL are mapped at marker locations whenever linkage analysis is applied. In the analysis of real data sets, this bias was first suspected by Spelman et al. [1], who observed a large proportion of significant test statistics at marker locations when looking for QTL for five milk production traits. Walling et al. [2] described the influence of markers on the construction of confidence intervals for the QTL location and questioned whether the estimated QTL location was biased towards the location of markers instead of its true position. By applying the regression coefficients on the markers as suggested by Whittaker et al. [3], Walling et al. [4] calculated the proportion of putative QTL located at marker positions in a backcross population. They reported a systematic bias of the estimated QTL position under the null hypothesis of the test, i.e. the hypothesis of no QTL on the linkage group. Moreover, linear regression methods for QTL detection have been reported to behave in the same way as maximum likelihood methods in interval mapping approaches [5]. The simulation studies by Walling et al. [4] confirmed that these two approaches have similar biases on the estimated QTL position in a backcross population.
These previous works have shown that a bias on the QTL location occurs when genetic linkage analysis for QTL mapping is used in a backcross population. However, little research has been devoted to establishing which parameters give rise to that bias. What is more, no study has investigated how it affects linkage analysis applied to outbred populations. In order to address these shortcomings, the present study aims at identifying the sources of that bias, in particular regarding the analysis of data issued from outbred populations.
This question is of critical importance in expression quantitative trait loci (eQTL) mapping. Indeed, a main objective is often to search for eQTL that co-localize with QTL influencing agronomical performances. The accuracy of the eQTL locations is thus a fundamental element for experimental design optimization, especially since experimental designs for gene expression analyses are generally of moderate size due to the cost of phenotyping. The pioneering work on eQTL detection can be traced back to the emergence of the concept of genetical genomics [6]. During the past decade, QTL mapping was widely applied to the detection of eQTL, for example in yeast [7], mice [8], human [9, 10], maize [11] and pig [12]. Generally, mapping procedures were used to map eQTL considering each transcript expression level as one quantitative trait in a trait by trait analysis.
Hence, in order to qualify this possible bias on the estimated QTL location in outbred populations and to specify which parameters influence it, this paper presents a study of the QTL location accuracy. Firstly, to make things more concrete, we explored the empirical distribution of the LRT along the linkage group under the null hypothesis of "no QTL" on a real dataset. Secondly, analytical developments were carried out to identify the parameters which influence the QTL location accuracy. Since such developments are intractable for outbred populations because of the complexity of the test statistic, the simpler case of a backcross type population, i.e. a backcross between inbred lines, was considered at that stage. Thirdly, designs of outbred sib families were simulated and analyzed in order to characterize the bias variability under the null and the alternative ("one QTL segregating on the linkage group") hypotheses. Parameters such as the population size and the marker density under the null hypothesis, as well as the QTL effect and the simulated QTL location under the alternative hypothesis, were taken into account. Finally, an approach to adjust the estimated location of a QTL mapped at the position of a marker is suggested.
Results
The LRT distribution along the linkage group under the null hypothesis
The QTL location bias expression in a backcross population
In order to investigate the bias on the QTL location under the hypothesis of "one QTL", we considered a linkage group limited to an interval [0,T] between two markers M_{1} (alleles M_{1} and m_{1}) at 0 and M_{2} (alleles M_{2} and m_{2}) at T flanking a QTL (alleles Q and q) in a backcross population obtained from the cross M_{1}M_{1}QQM_{2}M_{2} × M_{1}m_{1}QqM_{2}m_{2}.
It can be seen from formula (4) that the first term reaches its maximum at t_{0} since ${\rho}_{t{t}_{0}}\to 1$ when t → t_{0}. As seen in the previous section, the second term, which is proportional to the LRT at t under the "no QTL" hypothesis, reaches its maximum at marker positions more often than at positions between markers. As a result, the estimated QTL location is biased towards the position of markers. However, when n or a^{2} increases, when T decreases, or when t_{0} approaches a marker location (see Appendix III), the deviation between the first two terms in formula (4) increases and the influence of the second term is reduced. Therefore, in our simple backcross population model, under the hypothesis of one QTL, the bias of the estimated QTL location is expected to decrease when the population size, the marker density or the QTL effect increases, or when the true QTL location approaches the position of a marker.
Simulations under H0
According to the preceding results, the estimated QTL location cannot be expected to be uniformly distributed on the chromosome under the null hypothesis of no QTL. Familial designs were simulated to test the influence of the population size and the marker density on this bias in outbred populations.
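The non-uniformity stated above can be illustrated with a minimal sketch. It uses the simple two-marker backcross model of the analytical section, not the outbred family designs analyzed with QTLMap in the study. The sample size, interval length, grid and number of replicates are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np


def haldane_theta(d):
    """Recombination fraction for a map distance d in Morgan (Haldane function)."""
    return 0.5 * (1.0 - np.exp(-2.0 * d))


def x_profile(m1, m2, grid, T):
    """Expected QTL genotype score x_k(t): individuals in rows, tested positions in columns.

    m1 and m2 code the marker genotypes at 0 and T: +1 for M/M, -1 for M/m.
    """
    th_t, th_r, th_T = haldane_theta(grid), haldane_theta(T - grid), haldane_theta(T)
    non_rec = (1.0 - th_t - th_r) / (1.0 - th_T)  # no recombination between the markers
    rec = (th_r - th_t) / th_T                    # recombination between the markers
    same = (m1 == m2)[:, None]
    return m1[:, None] * np.where(same, non_rec[None, :], rec[None, :])


def marker_peak_rate(n=100, T=0.2, n_grid=21, n_rep=500, seed=0):
    """Share of H0 replicates whose LRT profile peaks at one of the two markers."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, T, n_grid)
    hits = 0
    for _ in range(n_rep):
        m1 = rng.choice([-1, 1], size=n)
        m2 = np.where(rng.random(n) < haldane_theta(T), -m1, m1)
        x = x_profile(m1, m2, grid, T)
        y = rng.standard_normal(n)  # H0: no QTL, unit residual variance
        lrt = (y @ x) ** 2 / (x ** 2).sum(axis=0)  # linearized backcross LRT
        hits += int(np.argmax(lrt)) in (0, n_grid - 1)
    return hits / n_rep


rate = marker_peak_rate()
print(f"LRT maximum at a marker in {100 * rate:.1f}% of H0 replicates")
```

If the estimated location were uniformly distributed over the 21 tested positions, only about 2/21 of the replicates would peak at a marker, which gives a reference point for reading the printed share.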
Impact of the population size
Proportion of estimated QTL locations at marker locations according to the population size under H0
| Number of individuals (s × d × p)^{1} | 60 (3 × 1 × 20) | 100 (5 × 1 × 20) | 300 (5 × 2 × 30) | 400 (5 × 2 × 40) | 800 (5 × 4 × 40) |
|---|---|---|---|---|---|
| Proportion^{2} (%) | 64.2 | 63.8 | 65.3 | 67.1 | 66.0 |
Impact of the marker density
Proportion of estimated QTL locations at marker locations according to the marker density under H0
| Number of markers^{1} | 2 | 3 | 5 | 7 | 11 |
|---|---|---|---|---|---|
| Proportion^{2} (%) | 58.6 | 58.5 | 52.6 | 57.8 | 69.5 |
Simulations under H1
According to the analytical results obtained in a backcross type population, the population size and the marker density, as well as the QTL effect and location, were parameters which were very likely to influence the bias on the estimated QTL location under the alternative hypothesis.
Impact of the population size
Power, RMSE of the QTL location and proportion of estimated QTL locations at marker locations according to the population size under H1
| Number of individuals (s × d × p)^{1} | | | | |
|---|---|---|---|---|
| | α^{2} | 100 (5 × 1 × 20) | 300 (5 × 2 × 30) | 800 (5 × 4 × 40) |
| Power^{3} (%) | 0.05 | 39 | 93 | 100 |
| | 0.01 | 15 | 81 | 100 |
| RMSE^{4} (cM) | - | 13.9 | 8.7 | 4.2 |
| | 0.05 | 12.2 | 8.4 | 4.2 |
| | 0.01 | 11.4 | 8.0 | 4.2 |
| Proportion^{5} (%) | - | 39.9 | 15.2 | 1.6 |
| | 0.05 | 29.7 | 14.0 | 1.6 |
| | 0.01 | 26.8 | 12.9 | 1.6 |
Impact of the marker density
Power, RMSE of the QTL location and proportion of estimated QTL locations at marker locations according to the marker density under H1
| Number of markers^{1} | | | | | | |
|---|---|---|---|---|---|---|
| | α^{2} | 2 | 3 | 5 | 7 | 11 |
| Power^{3} (%) | 0.05 | 62.2 | 91.6 | 94.4 | 96.5 | 97.5 |
| | 0.01 | 37.2 | 79.0 | 85.2 | 88.9 | 91.8 |
| RMSE^{4} (cM) | - | 16.6 | 10.5 | 8.8 | 7.4 | 5.7 |
| | 0.05 | 14.9 | 10.1 | 8.4 | 7.2 | 5.6 |
| | 0.01 | 14.1 | 9.7 | 8.0 | 6.9 | 5.2 |
| Proportion^{5} (%) | - | 19.0 | 15.4 | 12.5 | 11.8 | 35.0 |
| | 0.05 | 13.0 | 14.6 | 11.5 | 11.5 | 34.5 |
| | 0.01 | 10.8 | 13.7 | 10.7 | 10.5 | 34.0 |
Impact of the QTL effect
Power, RMSE of the QTL location and proportion of estimated QTL locations at marker locations according to the QTL effect under H1
| QTL effect | | | | | | |
|---|---|---|---|---|---|---|
| | α^{1} | 0.5 σ | 1 σ | 1.5 σ | 2 σ | 4 σ |
| Power^{2} (%) | 0.05 | 10.5 | 41.8 | 81.0 | 97.4 | 100 |
| | 0.01 | 2.9 | 19.5 | 58.1 | 89.8 | 100 |
| RMSE^{3} (cM) | - | 17.0 | 13.6 | 10.3 | 8.0 | 4.4 |
| | 0.05 | 15.6 | 11.9 | 9.8 | 7.8 | 4.4 |
| | 0.01 | 15.4 | 10.9 | 9.1 | 7.6 | 4.4 |
| Proportion^{4} (%) | - | 56.6 | 37.9 | 24.6 | 13.6 | 2.4 |
| | 0.05 | 37.1 | 28.8 | 22.2 | 13.2 | 2.4 |
| | 0.01 | 31.0 | 24.6 | 20.3 | 12.9 | 2.4 |
Impact of the true QTL location
Power, RMSE of the QTL location and proportion of estimated QTL locations at marker locations according to the true QTL location under H1
| QTL location (M) | | | | | | |
|---|---|---|---|---|---|---|
| | α^{1} | 0 | 0.05 | 0.1 | 0.15 | 0.2 |
| Power^{2} (%) | 0.05 | 96.1 | 92.3 | 87.2 | 82.0 | 81.2 |
| | 0.01 | 87.4 | 80.1 | 69.7 | 61.9 | 60.2 |
| RMSE^{3} (cM) | - | 7.5 | 7.6 | 9.5 | 10.7 | 11.2 |
| | 0.05 | 7.2 | 7.2 | 9.0 | 10.1 | 10.7 |
| | 0.01 | 7.0 | 7.0 | 8.6 | 9.7 | 10.3 |
| Proportion^{4} (%) | - | 51.5 | 38.6 | 26.6 | 19.0 | 16.2 |
| | 0.05 | 51.3 | 37.8 | 24.8 | 15.8 | 13.6 |
| | 0.01 | 50.9 | 37.2 | 24.3 | 13.8 | 11.2 |
An algorithm to adjust the location of QTL mapped on markers
Analytical developments in a backcross type population and a simulation study in outbred type populations demonstrated that the estimated position of the QTL is biased towards marker locations under some circumstances. In addition, the decomposition of the LRT according to formula (4) made it possible to identify a putative cause of this bias: the residual error ε_{1} in the LRT under both the "no QTL" and the "one QTL" hypotheses. Indeed, according to this decomposition, if the QTL is not estimated at its true location, two residual errors may have generated the bias: ε_{1} and ε_{2}. When the estimated QTL position is at a marker location, argmax_{t} ε_{2}(t) has a uniform distribution but argmax_{t} ε_{1}(t) is more often located at a marker than between markers. In such a situation, ε_{1} is very likely to play a dominant role in the bias. On the contrary, when the estimated QTL location is not at a marker location, argmax_{t} ε_{1}(t) and argmax_{t} ε_{2}(t) are unknown for a given argmax_{t} [ε_{1}(t) + ε_{2}(t)]. Under these circumstances, it is impossible to predict the relative influence of ε_{1} and ε_{2} on the bias. On the basis of this observation, we propose an approach to describe the ε_{1}(t) process and, consequently, to adjust the estimated QTL position when a QTL co-localizes with a marker, i.e. an approach to correct the "marker effect" on the bias of the estimated QTL location.
1. Obtain the vector which contains the LRT profile along the linkage group, calculated on the phenotypic data, say L_{0}. L_{0} is maximum at the location of the marker M.
2. Under the "no QTL" hypothesis, simulate phenotypes and compute the corresponding LRT profiles until 1000 profiles with their maximum at the position of the marker M have been obtained, say {L_{i}}_{i = 1,...,1000}.
3. Calculate the 1000 vectors V_{i} = L_{0} - L_{i}, i = 1,...,1000.
4. Retain the 1000 locations at which the vectors {V_{i}}_{i = 1,...,1000} reach their maximum.
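The four steps above can be sketched as follows. This is only an illustration under stated assumptions: the function `adjust_locations` and the toy two-marker backcross used to produce LRT profiles are hypothetical, the null phenotypes are drawn as standard normal values with the marker genotypes held fixed, and only 200 profiles are kept to keep the example fast. How the retained locations are summarized into a single adjusted position is not specified in the steps above, so the sketch stops at step 4.

```python
import numpy as np


def adjust_locations(l0, marker_idx, simulate_null_profile, n_keep=1000, max_draws=1_000_000):
    """Steps 1-4 of the text: grid indices at which V_i = L_0 - L_i is maximal."""
    l0 = np.asarray(l0, dtype=float)
    if int(np.argmax(l0)) != marker_idx:
        raise ValueError("L_0 must reach its maximum at the marker M")
    kept = []
    for _ in range(max_draws):
        li = np.asarray(simulate_null_profile(), dtype=float)
        if int(np.argmax(li)) != marker_idx:
            continue                          # step 2: keep only null profiles peaking at M
        kept.append(int(np.argmax(l0 - li)))  # steps 3 and 4: V_i and the location of its maximum
        if len(kept) == n_keep:
            return np.array(kept)
    raise RuntimeError("too few null profiles peaked at the marker M")


# Toy backcross example (hypothetical values): markers at 0 and T, 21 tested positions.
def theta(d):
    return 0.5 * (1.0 - np.exp(-2.0 * d))


rng = np.random.default_rng(1)
n, T = 100, 0.2
grid = np.linspace(0.0, T, 21)
m1 = rng.choice([-1, 1], size=n)
m2 = np.where(rng.random(n) < theta(T), -m1, m1)
non_rec = (1.0 - theta(grid) - theta(T - grid)) / (1.0 - theta(T))
rec = (theta(T - grid) - theta(grid)) / theta(T)
x = m1[:, None] * np.where((m1 == m2)[:, None], non_rec, rec)


def lrt_profile(y):
    """Linearized backcross LRT at every tested position."""
    return (y @ x) ** 2 / (x ** 2).sum(axis=0)


# "Observed" phenotypes: QTL at grid[5] = 0.05 M; redraw until the profile peaks at the marker at 0 M.
while True:
    y_obs = 0.25 * x[:, 5] + rng.standard_normal(n)
    l0 = lrt_profile(y_obs)
    if np.argmax(l0) == 0:
        break

locations = adjust_locations(l0, 0, lambda: lrt_profile(rng.standard_normal(n)), n_keep=200)
print(np.bincount(locations, minlength=grid.size))  # retained locations per tested position
```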
RMSE of the QTL location before or after adjustment
| True QTL location (M) | 0.1 | 0.2 |
|---|---|---|
| Before^{1} | 28.6 | 23.6 |
| After^{2} | 26.6 | 21.3 |
Discussion
In order to study the elements that give rise to the bias on the estimated QTL position, we checked whether the distribution of the test statistic changed along the locations on the linkage group. More precisely, we checked whether the significance threshold remained the same at a marker and at a non-marker location. Under the null hypothesis of "no QTL on the linkage group", the asymptotic distribution of the LRT at a given point is well known and identical for all locations. It approaches the central χ^{2} distribution with a number of degrees of freedom depending on the number of parameters fixed under the null hypothesis [20], i.e. here the number of sires or dams for which a QTL effect was estimated. Nevertheless, the population size is most often not large enough for the LRT to reach its asymptotic distribution at all the locations on the linkage group. The variability of the marker informativity along the linkage group may actually influence this convergence to asymptotic conditions, resulting in variability of the LRT distributions depending on the tested locations. Here, the differences between the empirical distributions of the LRT at each position along the linkage group were explored using a real example of an outbred type population. It appeared that the variability of informativity along the linkage group did not lead to a significant variability of the empirical distributions of the nominal test statistics. This observation does not contradict the bias of the estimated QTL location towards the locations of markers, but it means that the bias is due to the process which defines the supremum of the LRT on the linkage group.
Some analytical results concerning the bias of the estimated QTL location were obtained in a backcross type population, i.e. a backcross between inbred lines. We identified a common source of bias under the "no QTL" and the "one QTL" hypotheses, and also showed the possible influence of several parameters under the "one QTL" hypothesis, such as the population size, the marker density, the QTL effect and the true QTL location. Using simulations, we verified the existence of a bias on the estimation of the QTL location using the interval mapping method, under the null and the alternative hypotheses, when the family structure is more complex than the backcross design considered by Walling et al. [4]. Simulations of outbred populations confirmed that this bias is influenced by the size of the population and the density of the genetic map, as well as by the QTL effect under the alternative hypothesis. We also demonstrated that the true QTL location, relative to the flanking markers, had a significant impact on the accuracy of the estimated QTL location. Moreover, we quantified the bias of the estimated QTL location for various values of these parameters and validated the results by applying appropriate test statistics.
We showed that the population size does not affect the estimation of the QTL location under the null hypothesis. Under the alternative hypothesis, very similar values of the RMSE and of the proportion of QTL detected at marker locations were observed whatever the value of α, although a slight reduction in the bias seemed to be obtained when applying α < 0.01. However, choosing a more stringent significance level also implies a decrease in power and the detection of only a few QTL. As a consequence, it cannot be considered an efficient way to correct the bias problem.
Secondly, concerning the particular case of eQTL detection, when the marker information is relatively sparse, for example when microsatellite markers are used for genotyping, it is necessary to measure several hundred animals for transcriptomic data to obtain an accurate eQTL location. A population size of 300 progeny seems to be a good compromise for the detection of eQTL, even if only those with relatively large effects will be detected.
Thirdly, it is clear that a significant QTL detected at a marker position should be considered with caution, especially when the population size, the marker density or the QTL effect is low. The approach proposed above provides a way to remedy the bias on the estimated QTL location in such situations.
Conclusions
When the interval mapping method is applied to an outbred design to map QTL, the QTL is often incorrectly mapped at the position of a marker. In this work, this bias was studied by using analytical developments in a backcross type population and simulated data in outbred populations. In the absence of QTL, adjusting the thresholds at the location of markers cannot reduce the bias, and the population size does not affect the bias. Under the hypothesis of one QTL, the impact of some parameters on the bias was confirmed: when the population size and/or the QTL effect and/or the marker density are large enough, the bias is reduced. Moreover, the closer the QTL is to a marker location, the more accurate the estimation is. Therefore, caution should be taken when a QTL is mapped at the position of a marker, in particular for low power designs. In such cases, a method is proposed to correct the bias on the estimated QTL location. Simulations carried out in a backcross type population demonstrated that this method is valid to limit the bias.
Methods
Analyses on a real data set in pig
A real data set was used to illustrate some aspects of the present work. It comes from a porcine outbred population of 325 progeny from 4 sires. One example of eQTL analysis using the QTLMap software [21] is given, i.e. the analysis of chromosome SSC18 for 6 665 gene expression traits. The linkage analysis method was applied according to Le Roy et al. [22] with a gene by gene procedure. For each gene, when the LRT was significant at the 5‰ level at the chromosome level, the estimated eQTL location was the location where the LRT was maximum on the linkage group.
The same familial structure was used to study the empirical distribution of the LRT along the linkage group. Two thousand simulations were performed under the null hypothesis of "no QTL" on the chromosome SSC1 which carried 16 microsatellite markers. A polygenic heritability coefficient of 0.5 was assumed for the trait (see http://www.inra.fr/qtlmap).
Simulations of an Ornstein-Uhlenbeck process
where W_{t} denotes Brownian motion.
In a backcross type population, the mean of this process is 0 and the autocovariance is: cov(X_{ t }, X_{t'}) = e^{-2|t-t'| }with t and t' in the Haldane distance unit [16].
with s = 1, ..., mk, where τ denotes the spacing of two adjacent markers in Morgan. This sequence is a first-order autoregressive sequence.
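A minimal sketch of such a first-order autoregressive sequence is given below. It assumes a stationary process with mean 0, variance 1 and autocovariance e^{-2|t-t'|} sampled on a regular grid, so that two consecutive points separated by `step` Morgan have correlation e^{-2·step}. The function name, the grid step and the number of points are illustrative assumptions rather than the settings used in the study.

```python
import numpy as np


def simulate_ou(n_steps, step, rng):
    """Sample X at 0, step, 2*step, ... with cov(X_t, X_t') = exp(-2|t - t'|)."""
    rho = np.exp(-2.0 * step)                  # correlation between adjacent grid points
    x = np.empty(n_steps + 1)
    x[0] = rng.standard_normal()               # start from the stationary N(0, 1) law
    noise = rng.standard_normal(n_steps) * np.sqrt(1.0 - rho ** 2)
    for s in range(1, n_steps + 1):
        x[s] = rho * x[s - 1] + noise[s - 1]   # AR(1) recursion
    return x


rng = np.random.default_rng(0)
paths = np.array([simulate_ou(5, 0.1, rng) for _ in range(5000)])
lag1 = np.corrcoef(paths[:, 0], paths[:, 1])[0, 1]
print(f"empirical lag-1 correlation {lag1:.3f}, target {np.exp(-0.2):.3f}")
```

The innovation variance 1 - ρ² keeps the marginal variance equal to 1 at every grid point, which is what makes the discrete sequence agree with the continuous autocovariance.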
Simulations of outbred type population
y_{ijk} is the phenotype of the progeny ijk of the sire i and of the dam ij. u_{i} and u_{ij} denote the polygenic effects of the sire i and of the dam ij, respectively, which follow a normal distribution with mean 0 and variance ${\sigma}_{u}^{2}$. a denotes the QTL allelic substitution effect and g_{ijk}(t_{0}) is the genotypic value of ijk at the QTL location t_{0}. g_{ijk} takes the value 1, 0 or -1 depending on the QTL genotype, QQ, Qq or qq, respectively. e_{ijk} is a random normal variable with mean 0 and variance ${\sigma}_{e}^{2}$. The variance within QTL genotype is ${\sigma}^{2}=2{\sigma}_{u}^{2}+{\sigma}_{e}^{2}$ and a is expressed in σ units. The heritability coefficient, equal to $4{\sigma}_{u}^{2}/{\sigma}^{2}$, was fixed at 0.25.
For each of the cases studied, the results were based on 5 000 simulations, either under the null hypothesis (H_{0}: there is no QTL segregating on the linkage group, i.e. a = 0) or under the alternative hypothesis (H_{1}: there is one QTL segregating on the linkage group, i.e. most often a = 1σ). For each simulated dataset, the estimated QTL position was the location on the linkage group where the LRT was maximum.
Under the null hypothesis, simulations were carried out so as to compare the influence of the population size with 6 levels: 60 (3s,1d,20p), 80 (4s,1d,20p), 100 (5s,1d,20p), 300 (5s,2d,30p), 400 (5s,2d,40p), 800 (5s,4d,40p) progeny. Under the H1 hypothesis, only 3 of these population sizes were considered: 100, 300 and 800 progeny.
To understand how the QTL effect affects the estimation of the QTL location, a population of 100 progeny was simulated with a QTL effect ranging from 0.5 σ to 4 σ.
Other simulations were performed in a population of 300 progeny. Firstly, to check the bias extent depending on the marker density, samples with 2, 3, 5, 7 or 11 markers equidistant in a linkage group of 0.6 M were simulated, under the null and under the alternative hypotheses. Under H_{1}, one QTL was simulated at 0.25 M (a = 1σ). Secondly, to test how the true QTL location may affect the bias, we performed simulations under H_{1} with a QTL (a = 1σ) lying at 0 M, 0.05 M, 0.10 M, 0.15 M, 0.2 M on a linkage group of 0.4 M with two flanking markers at 0 M and 0.4 M.
Criteria
where L is the number of simulations (all, significant at the level 0.05 or at the level 0.01), ${\widehat{t}}_{l}$ is the l^{ th } estimated QTL position and t_{ 0 } is the true QTL position.
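The two accuracy criteria reported in the tables can be sketched as below. The RMSE is written in its usual form, the square root of the mean of $({\widehat{t}}_{l}-{t}_{0})^{2}$ over the L retained simulations, which is an assumption consistent with the symbols defined above. The helper names, the tolerance used to decide that an estimate sits on a marker, and the example values are hypothetical.

```python
import numpy as np


def rmse(estimates, t0):
    """Root mean square error of the estimated positions around the true position t0."""
    estimates = np.asarray(estimates, dtype=float)
    return float(np.sqrt(np.mean((estimates - t0) ** 2)))


def proportion_at_markers(estimates, marker_positions, tol=1e-9):
    """Share of estimated positions that coincide with a marker position."""
    est = np.asarray(estimates, dtype=float)[:, None]
    markers = np.asarray(marker_positions, dtype=float)[None, :]
    return float(np.mean(np.any(np.abs(est - markers) <= tol, axis=1)))


# Hypothetical estimates (in Morgan) for a QTL at 0.2 M flanked by markers at 0 M and 0.4 M.
t_hat = [0.0, 0.1, 0.2, 0.4]
print(rmse(t_hat, 0.2), proportion_at_markers(t_hat, [0.0, 0.4]))
```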
Hypothesis test
Appropriate statistical tests are needed to evaluate which parameters affect the bias of the estimated QTL position. ANOVA was not adequate to test the equality of the average QTL position in two different conditions (e.g. 2 population sizes) because of the non normality of the QTL position estimator. Therefore, two nonparametric tests were combined in order to test which parameters affect the bias, and how they influence the variation of the QTL location estimation. This was performed in two steps: (1) the parameters which influence the accuracy of the estimated QTL location were identified. This step was carried out with a Kolmogorov-Smirnov test; (2) for the parameters identified in the first step, a description of their effect on the accuracy of the estimated QTL position was made. This step was performed with a Mann-Whitney-Wilcoxon test [24].
where F_{a}, F_{b} denote the distributions of the estimated QTL position under the conditions a and b, respectively. For a given parameter, all the distributions were compared pairwise with the function ks.test in R. If none of the pairwise comparisons rejected the null hypothesis, the value of this parameter was considered not to influence the estimation of the QTL position.
where D_{ a }, D_{ b } denote the absolute values of the deviations between the estimated QTL position and the assumed, i.e. the true position, under the condition a and b, respectively. A smaller median D corresponds to a more accurate position estimation.
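The two-step procedure can be sketched in Python with SciPy, as an analogue of the R function ks.test mentioned above and not the code used in the study. The one-sided alternative in the second step (D_{a} tends to be smaller than D_{b}) and the 0.05 level are assumptions, and the simulated position estimates are purely illustrative.

```python
import numpy as np
from scipy.stats import ks_2samp, mannwhitneyu


def compare_conditions(est_a, est_b, t0, alpha=0.05):
    """Step 1: Kolmogorov-Smirnov test; step 2: Mann-Whitney-Wilcoxon test on |t_hat - t0|."""
    est_a = np.asarray(est_a, dtype=float)
    est_b = np.asarray(est_b, dtype=float)
    ks = ks_2samp(est_a, est_b)  # H0: F_a = F_b
    out = {"ks_pvalue": float(ks.pvalue), "distributions_differ": bool(ks.pvalue < alpha)}
    if out["distributions_differ"]:
        # Assumed one-sided alternative: deviations under a are stochastically smaller than under b.
        mww = mannwhitneyu(np.abs(est_a - t0), np.abs(est_b - t0), alternative="less")
        out["mww_pvalue"] = float(mww.pvalue)
    return out


# Illustrative data: condition a yields tighter position estimates than condition b.
rng = np.random.default_rng(0)
t0 = 0.25
est_a = np.clip(t0 + rng.normal(0.0, 0.02, 500), 0.0, 0.6)
est_b = np.clip(t0 + rng.normal(0.0, 0.08, 500), 0.0, 0.6)
result = compare_conditions(est_a, est_b, t0)
print(result)
```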
Appendix
Appendix I
Let us denote M^{k} the genotype of the markers at 0 and T for individual k and p_{kt} = ℙ(g_{k}(t) = 1|M^{k}). Then using a linearized likelihood function instead of the mixture of two normal distributions, the LRT can be written as: $\begin{aligned} LRT(t) &= -2\ln\frac{\mathcal{L}(y_{1},\dots,y_{n})}{\max_{a}\mathcal{L}(y_{1},\dots,y_{n};a)} \\ &\approx -2\ln\frac{\prod \varphi(y_{k};0,1)}{\max_{a}\prod \varphi\left(y_{k}+\frac{a}{2}(1-2p_{kt});0,1\right)} \\ &= \frac{\left[\sum y_{k}(1-2p_{kt})\right]^{2}}{\sum (1-2p_{kt})^{2}}, \end{aligned}$
| p_{kt} | Probability | M^{k} |
|---|---|---|
| (1-θ_{t})(1-θ_{T-t})/(1-θ_{T}) | (1-θ_{T})/2 | M_{1}M_{1}M_{2}M_{2} |
| (1-θ_{t})θ_{T-t}/θ_{T} | θ_{T}/2 | M_{1}M_{1}M_{2}m_{2} |
| θ_{t}θ_{T-t}/(1-θ_{T}) | (1-θ_{T})/2 | M_{1}m_{1}M_{2}m_{2} |
| θ_{t}(1-θ_{T-t})/θ_{T} | θ_{T}/2 | M_{1}m_{1}M_{2}M_{2} |

| x_{k}(t) | Probability | M^{k} |
|---|---|---|
| (1-θ_{t}-θ_{T-t})/(1-θ_{T}) | (1-θ_{T})/2 | M_{1}M_{1}M_{2}M_{2} |
| (θ_{T-t}-θ_{t})/θ_{T} | θ_{T}/2 | M_{1}M_{1}M_{2}m_{2} |
| (θ_{t}+θ_{T-t}-1)/(1-θ_{T}) | (1-θ_{T})/2 | M_{1}m_{1}M_{2}m_{2} |
| (θ_{t}-θ_{T-t})/θ_{T} | θ_{T}/2 | M_{1}m_{1}M_{2}M_{2} |
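The two tables can be checked numerically with a short sketch. It assumes the Haldane mapping function, i.e. θ_{d} = (1 - e^{-2d})/2 with no interference, and takes the conditional probability for the non-recombinant class M_{1}m_{1}M_{2}m_{2} to be θ_{t}θ_{T-t}/(1-θ_{T}). Under these assumptions, x_{k}(t) = 2p_{kt} - 1 in each marker class. The function names and the values of t and T are illustrative.

```python
import numpy as np


def theta(d):
    """Haldane recombination fraction for a distance d in Morgan."""
    return 0.5 * (1.0 - np.exp(-2.0 * d))


def marker_class_table(t, T):
    """Class probability, p_kt and x_k(t) for the four marker genotype classes."""
    th_t, th_r, th_T = theta(t), theta(T - t), theta(T)
    return {
        "M1M1 M2M2": ((1 - th_T) / 2, (1 - th_t) * (1 - th_r) / (1 - th_T), (1 - th_t - th_r) / (1 - th_T)),
        "M1M1 M2m2": (th_T / 2, (1 - th_t) * th_r / th_T, (th_r - th_t) / th_T),
        "M1m1 M2m2": ((1 - th_T) / 2, th_t * th_r / (1 - th_T), (th_t + th_r - 1) / (1 - th_T)),
        "M1m1 M2M2": (th_T / 2, th_t * (1 - th_r) / th_T, (th_t - th_r) / th_T),
    }


table = marker_class_table(t=0.03, T=0.1)
for genotype, (prob, p, x) in table.items():
    print(f"{genotype}: probability={prob:.4f}  p={p:.4f}  x={x:+.4f}")
```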
Appendix II
Replacing $y_{k}=\frac{a}{2}x_{k}(t_{0})+\epsilon_{k}$ in LRT (3), we have $\begin{aligned} LRT(t) &= \frac{\left[\sum \left(\frac{a}{2}x_{k}(t_{0})+\epsilon_{k}\right)x_{k}(t)\right]^{2}}{\sum x_{k}^{2}(t)} \\ &= \frac{1}{4}a^{2}\frac{\left[\sum x_{k}(t_{0})x_{k}(t)\right]^{2}}{\sum x_{k}^{2}(t)}+\frac{\left[\sum \epsilon_{k}x_{k}(t)\right]^{2}}{\sum x_{k}^{2}(t)}+a\frac{\sum x_{k}(t_{0})x_{k}(t)}{\sum x_{k}^{2}(t)}\sum \epsilon_{k}x_{k}(t) \\ &= f(t)+\epsilon_{1}(t)+\epsilon_{2}(t), \end{aligned}$
where, in the case of a large sample, we have
• $f\left(t\right)=\frac{1}{4}{a}^{2}\frac{{\left[\sum {x}_{k}\left({t}_{0}\right){x}_{k}\left(t\right)\right]}^{2}}{\sum {x}_{k}^{2}\left(t\right)}$ and $f\left(t\right)/n\to \frac{1}{4}{a}^{2}Var\left(x\left({t}_{0}\right)\right){\rho}_{t{t}_{0}}^{2}$ when n → ∞ according to the law of large numbers.
• ${\epsilon}_{1}\left(t\right)=\frac{{\left[\sum {\epsilon}_{k}{x}_{k}\left(t\right)\right]}^{2}}{\sum {x}_{k}^{2}\left(t\right)}$ is the LRT under the no QTL hypothesis.
• ${\epsilon}_{2}\left(t\right)=a\frac{\sum {x}_{k}\left({t}_{0}\right){x}_{k}\left(t\right)}{\sum {x}_{k}^{2}\left(t\right)}\sum {\epsilon}_{k}{x}_{k}\left(t\right)\sim a\frac{Cov\left(x\left(t\right),x\left({t}_{0}\right)\right)}{Var\left(x\left(t\right)\right)}\sum {\epsilon}_{k}{x}_{k}\left(t\right)$ is a residual error, a linear combination of Gaussian random variables. Since $\sum {\epsilon}_{k}{x}_{k}\left(t\right)$ has a variance close to $nVar\left(x\left(t\right)\right)$, its distribution is approximated as $N\left(0,n{a}^{2}Var\left(x\left({t}_{0}\right)\right){\rho}_{t{t}_{0}}^{2}\right).$
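The decomposition LRT(t) = f(t) + ε_{1}(t) + ε_{2}(t) is an algebraic identity, which the following sketch verifies numerically. The vectors standing for x_{k}(t) and x_{k}(t_{0}) are arbitrary illustrative values rather than scores derived from marker genotypes, and the function name is hypothetical.

```python
import numpy as np


def lrt_decomposition(x_t, x_t0, eps, a):
    """Return the three terms f(t), eps_1(t) and eps_2(t) of the linearized LRT."""
    s = np.sum(x_t ** 2)
    cross = np.sum(x_t0 * x_t)
    noise = np.sum(eps * x_t)
    f = 0.25 * a ** 2 * cross ** 2 / s   # signal term, maximal near t0
    e1 = noise ** 2 / s                  # LRT under the "no QTL" hypothesis
    e2 = a * cross / s * noise           # cross term between signal and noise
    return f, e1, e2


rng = np.random.default_rng(0)
n, a = 200, 1.0
x_t0 = rng.uniform(-1.0, 1.0, n)                  # stand-in for x_k(t0)
x_t = 0.8 * x_t0 + 0.2 * rng.uniform(-1.0, 1.0, n)  # stand-in for x_k(t), correlated with x_k(t0)
eps = rng.standard_normal(n)
y = 0.5 * a * x_t0 + eps                          # y_k = (a/2) x_k(t0) + eps_k

lrt = np.sum(y * x_t) ** 2 / np.sum(x_t ** 2)
f, e1, e2 = lrt_decomposition(x_t, x_t0, eps, a)
print(lrt, f + e1 + e2)
```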
Appendix III
Considering the first two terms in the expression of $\frac{1}{n}LRT\left(t\right)$ (4), when n tends to infinity, $\frac{1}{n}{\epsilon}_{1}\left(t\right)$ converges to 0 at each position t. So the amplitude of the curve representing the term $\frac{1}{n}{\epsilon}_{1}\left(t\right)$, with respect to that of the first term, is reduced. In the same way, as a^{2} increases, the amplitude of the first term becomes larger with respect to that of the second term.
The proof of the influence of t_{0} and T uses the results of the following lemma:
Lemma 1. Given two markers at the location 0 and T in a linkage group of length T and assuming a QTL located at t_{0}, from the distribution of Var(x(t)) and applying the Taylor series expansion in case of small T, we have:
1. $Var\left(x\left({t}_{0}\right)\right)=\frac{{\left(1-{\theta}_{{t}_{0}}-{\theta}_{T-{t}_{0}}\right)}^{2}}{1-{\theta}_{T}}+\frac{{\left({\theta}_{{t}_{0}}-{\theta}_{T-{t}_{0}}\right)}^{2}}{{\theta}_{T}}\approx \frac{4}{T}{t}_{0}^{2}-4{t}_{0}+1.$
2. $Cov\left(x\left(t\right),x\left({t}_{0}\right)\right)=\frac{\left(1-{\theta}_{t}-{\theta}_{T-t}\right)\left(1-{\theta}_{{t}_{0}}-{\theta}_{T-{t}_{0}}\right)}{1-{\theta}_{T}}+\frac{\left({\theta}_{t}-{\theta}_{T-t}\right)\left({\theta}_{{t}_{0}}-{\theta}_{T-{t}_{0}}\right)}{{\theta}_{T}}\approx \frac{4}{T}t{t}_{0}-2\left(t+{t}_{0}\right)+1.$
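The two small-T approximations of the lemma can be compared with the exact expressions in a short numerical sketch. It assumes the Haldane mapping function for θ, and the values of t, t_{0} and T as well as the function names are illustrative.

```python
import numpy as np


def theta(d):
    """Haldane recombination fraction for a distance d in Morgan."""
    return 0.5 * (1.0 - np.exp(-2.0 * d))


def cov_x(t, t0, T):
    """Exact Cov(x(t), x(t0)) from the lemma; Var(x(t0)) is the case t = t0."""
    a, b = theta(t), theta(T - t)
    a0, b0 = theta(t0), theta(T - t0)
    c = theta(T)
    return (1 - a - b) * (1 - a0 - b0) / (1 - c) + (a - b) * (a0 - b0) / c


def cov_x_approx(t, t0, T):
    """First-order approximation for small T."""
    return 4.0 * t * t0 / T - 2.0 * (t + t0) + 1.0


T, t0, t = 0.1, 0.03, 0.07
print("Var(x(t0)):", cov_x(t0, t0, T), "approximation:", cov_x_approx(t0, t0, T))
print("Cov(x(t), x(t0)):", cov_x(t, t0, T), "approximation:", cov_x_approx(t, t0, T))
```

Setting t = t_{0} in the covariance approximation gives back the variance approximation, so a single pair of functions covers both items of the lemma.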
It can be seen that when t_{0} → T from $\frac{T}{2}$ and/or when T decreases, δ will become larger.
Therefore, when T becomes larger, δ decreases.
In conclusion, the amplitude of g(t) will be greater as the QTL position tends to one marker and/or the distance between the markers decreases.
Declarations
Acknowledgements
These results are part of the SABRE research project that has been co-financed by the European Commission, within the 6th Framework Programme, contract No. FOOD-CT-2006-016250. XW is a Ph.D fellow supported by the SABRE research project and by the Animal Genetics division of INRA.
Authors’ Affiliations
References
- Spelman RJ, Coppieters W, Karim L, van Arendonk J, Bovenhuis H: Quantitative trait loci for five milk production traits on chromosome six in the Dutch Holstein-Friesian population. Genetics. 1996, 144: 1799-1808.
- Walling GA, Visscher PM, Haley CS: A comparison of bootstrap methods to construct confidence intervals in QTL mapping. Genetical Research. 1998, 71: 171-180. 10.1017/S0016672398003164.
- Whittaker JC, Thompson R, Visscher PM: On the mapping of QTL by regression of phenotype on marker-type. Heredity. 1996, 77: 23-32. 10.1038/hdy.1996.104.
- Walling GA, Haley CS, Perez-Enciso M, Thompson R, Visscher PM: On the mapping of quantitative trait loci at marker and non-marker locations. Genetical Research. 2001, 79: 97-106.
- Perez-Enciso M, Fernando RL, Bidanel JP, Le Roy P: Quantitative Trait Locus analysis in crosses between outbred lines with dominance and inbreeding. Genetics. 2001, 159: 413-422.
- Jansen RC, Nap JP: Genetical genomics: the added value from segregation. TRENDS in Genetics. 2001, 17: 388-391. 10.1016/S0168-9525(01)02310-1.
- Brem RB, Yvert G, Clinton R, Kruglyak L: Genetic Dissection of Transcriptional Regulation in Budding Yeast. Science. 2002, 296: 752-755. 10.1126/science.1069516.
- Schadt EE, Monks SA, Drake TA, Lusis AJ, Che N, Colinayo V, Ruff TG, Milligan SB, Lamb JR, Cavet G, Linsley PS, Mao M, Stoughton RB, Friend SH: Genetics of gene expression surveyed in maize, mouse and man. Nature. 2003, 422: 297-302. 10.1038/nature01434.
- Monks SA, Leonardson A, Zhu H, Cundiff P, Pietrusiak P, Edwards S, Phillips JW, Sachs A, Schadt EE: Genetic inheritance of gene expression in human cell lines. Am J Hum Genet. 2004, 75: 1094-1105. 10.1086/426461.
- Morley M, Molony CM, Weber TM, Devlin JL, Ewens KG, Spielman RS, Cheung VG: Genetic analysis of genome-wide variation in human gene expression. Nature. 2004, 430: 743-747. 10.1038/nature02797.
- Shi C, Uzarowska A, Ouzunova M, Landbeck M, Wenzel G, Lubberstedt T: Identification of candidate genes associated with cell wall digestibility and eQTL (expression quantitative trait loci) analysis in a Flint × Flint maize recombinant inbred line population. BMC Genomics. 2007, 8: 22. 10.1186/1471-2164-8-22.
- Ponsuksili S, Murani E, Schwerin M, Schellander K, Wimmers K: Identification of expression QTL (eQTL) of genes expressed in porcine M. longissimus dorsi and associated with meat quality traits. BMC Genomics. 2010, 11: 572. 10.1186/1471-2164-11-572.
- Cherel P, Glenisson J, Damon M, Vincent A, Liaubet L, Lobjois V, Hatey F, Milan D, Le Roy P: Colocalization of quantitative trait loci for meat pH and differentially expressed genes in skeletal muscles in pigs. Proceedings of the 8th World Congress on Genetics Applied to Livestock Production: 13-18 August 2009; Belo Horizonte, MG, Brazil. 2006, 06:18
- Le Mignon G, Desert C, Pitel F, Leroux S, Demeure O, Guernec G, Abasht B, Douaire M, Le Roy P, Lagarrigue S: Using transcriptome profiling to characterize QTL regions on chicken chromosome 5. BMC Genomics. 2009, 8: 22-
- Le Bras Y, Dechamp N, Montfort J, Cam AL, Krieg F, Quillet E, Prunet P, Le Roy P: Acclimation to seawater in rainbow trout: QTL/eQTL approach for plasmatic ions and gill tissue. Proceedings of the 9th World Congress on Genetics Applied to Livestock Production: 1-6 August 2010; Leipzig, Germany. 2010, 638-
- Lander E, Botstein D: Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics. 1989, 121: 185-199.
- Cierco C: Asymptotic distribution of the maximum likelihood ratio test for gene detection. Statistics. 1998, 31: 261-285. 10.1080/02331889808802639.
- Rabier CE, Azais JM, Delmas C: Likelihood ratio test process for quantitative trait loci detection. Journal of the Royal Statistical Society. 2009
- Elsen JM, Mangin B, Goffinet B, Boichard D, Le Roy P: Alternative models for QTL detection in livestock: I. General introduction. Genetics Selection Evolution. 1999, 31: 213-224. 10.1186/1297-9686-31-3-213.
- Goffinet B, Rebai A, Mangin B: Construction confidence intervals for QTL location. Genetics. 1994, 138: 1301-1308.
- Filangi O, Moreno C, Gilbert H, Legarra A, Le Roy P, Elsen JM: QTLMap, a software for QTL detection in outbred populations. Proceedings of the 9th World Congress on Genetics Applied to Livestock Production: 1-6 August 2010; Leipzig, Germany. 2010, 787-
- Le Roy P, Elsen JM, Boichard D, Mangin B, Bidanel JP, Goffinet B: An algorithm for QTL detection in mixture full and half sib families. Proceedings of the 6th World Congress on Genetics Applied to Livestock Production: 11-16 January 1998; Armidale, Australia. 1998, 257-260.
- Feller W: An Introduction to Probability Theory and Its Applications. 1968, Wiley, Volume 2: 2
- Lehmann EL: Nonparametrics: Statistical Methods Based on Ranks. 1975, McGraw-Hill
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.