Quantitative trait locus analysis of hybrid pedigrees: variance-components model, inbreeding parameter, and power
- Gulnara R Svischeva^{1}
https://doi.org/10.1186/1471-2156-8-50
© Svischeva; licensee BioMed Central Ltd. 2007
Received: 30 January 2007
Accepted: 26 July 2007
Published: 26 July 2007
Abstract
Background
In recent years, reliable mapping of quantitative trait loci (QTLs) has become feasible through linkage analysis based on the variance-components method. There are now many approaches to the QTL analysis of various types of crosses within one population (breed) as well as crosses between divergent populations (breeds). However, to analyse a complex pedigree with dominance and inbreeding whose founders have an inter-population (hybrid) origin, a high-powered method that takes these features of the pedigree into account is needed.
Results
We offer a universal approach to QTL analysis of complex pedigrees descended from crosses between outbred parental lines with different QTL allele frequencies. This approach improves the established variance-components method by accounting for the genetic effect conditioned by inter-population origin and inbreeding of individuals. To estimate the model parameters, namely the additive and dominance effects and the allelic frequencies of the QTL analysed, and to define the QTL position on a chromosome with respect to genotyped markers, we used the maximum-likelihood method. To detect linkage between the QTL and the markers we propose a statistic with a non-central χ^{2}-distribution, which makes it possible to derive analytical expressions for the power of the method and hence to estimate the pedigree size required for 80% power. The method works for arbitrarily structured pedigrees with dominance and inbreeding.
Conclusion
Our method uses the phenotypic values and the marker information for each individual of the pedigree under observation as initial data and can be valuable for fine mapping purposes. The power of the method is increased if the QTL effects conditioned by inter-population origin and inbreeding are enhanced. Several improvements can be developed to take into account fixed factors affecting trait formation, such as age and sex.
Background
The wide application of DNA markers scattered along the genome together with the rapid development of statistical methods provides reliable localization of quantitative trait loci (QTLs). There are now many approaches to QTL analysis of various types of crosses within one population (breed) as well as crosses between divergent populations (breeds) [1–5]. One of the most powerful approaches to QTL mapping is the variance-components method. In this method, variability among trait observations from individuals within pedigrees is expressed in terms of the effect caused by an unobservable trait-affecting major locus, of the polygenic effect, and of the residual non-genetic effect [2, 3, 6–13]. The effect attributable to a locus linked to a marker is a function of the additive and dominance components of variance of the locus, the recombination fraction, and the portion of alleles that are identical by descent (IBD) at the marker for each pair of individuals. The polygenic variance component depends only on the relationship between the relative pair.
If the pedigree analysed comes from a population with an identical distribution of genotypes for all the members of the pedigree and with an identical environmental influence on phenotypes, then the covariance between trait values of related pairs is the weighted sum of the variance components identical for all individuals [10, 14–16]. The presence of the marker information makes it possible to separate the variance component caused by the locus linked to the marker from the polygenic variance component, and to test the significance of major locus contribution with respect to trait polymorphism.
Crosses between individuals from divergent populations (breeds) that differ by trait distribution are often used in investigations of traits of livestock breeding, laboratory and domestic animals, and studies of human hereditary diseases. A set of statistical methods for QTL mapping was developed for backcrosses or F_{2} generations descended from inbred lines [17–21]. Recently, several studies devoted to the analysis of crosses between outbred lines have been reported [2, 3]. One of these statistical methods, known as the segment mapping method [2], is based on division of the genome of hybrid individuals into segments. Here, the genetic covariance of a trait is defined for each segment and depends on the variance of the initial breeds and the percentage of genetic material of these breeds in this segment. However, this method does not take into consideration such effects as dominance and inbreeding. In contrast, another method developed in [3] allows for these effects and makes it possible to find differences in genotype frequencies of the major locus analysed between the crossed breeds. The essence of this method is that the genetic covariance of any two individuals is expressed as a non-linear function of the probability of up to 15 possible identity modes differing by the allele origin of the locus. The disadvantage of this method is the complexity of the calculations it requires.
The objective of the present study is to present another high-powered theoretical approach to analysing data from crosses between outbred lines using marker information. This approach is based on the variance-components method, takes into account dominance and inbreeding, and uses all the pedigree information available. This study is structured as follows. First, we formulate assumptions about the genetic inter-population nature of the trait that justify the genetic model chosen and the distribution of phenotypes in the pedigree. Second, we develop a universal way of decomposing variance and covariance into equi-type components, so that the weighting factors of these components depend on the degree of relationship and the recombination frequency between the marker and the locus, and can be obtained from the joint distribution of IBD-alleles of the QTL and the marker [7]. This allows us to derive exact analytical expressions of variance components for different types of relative pairs. Third, we obtain analytical expressions for the power of our method without simulation data. The method is demonstrated on the example of hybrid sibships, which are widely used in experimental designs.
Results
The genetic model
A general explanatory multi-locus model describing the quantitative trait for the i th individual of a hybrid pedigree is
X_{i} = μ_{i} + g_{i} + G_{i} + e_{i},
where μ is the overall mean, g and G denote independent effects conditioned by the influence of QTLs (major locus and polygene, respectively), and e denotes the environmental effect. However, since the contribution of the major locus to the trait studied has no priorities in relation to other loci listed in the polygene, we will consider a simplified mono-locus model, which could be easily extended to general cases without major difficulties.
For the analysis of crosses between two divergent populations, P_{1} and P_{2}, it is necessary to consider additional assumptions about equi-type distribution of the trait in the parental populations, P_{1} and P_{2}, and in the hybrid pedigree, P_{1} × P_{2}. We assume that QTL contributions to trait formation do not depend on the population origin of the individuals, and that the crossed initial populations differ in their QTL allele frequencies, p_{1} for P_{1} and p_{2} for P_{2} [3, 23]. In addition, we assume that Hardy-Weinberg equilibrium holds in the P_{1} and P_{2} populations.
Furthermore, the allelic frequencies of the i th hybrid individual can be expressed in terms of the allelic frequencies of the initial populations, p_{1} and p_{2}, and a parameter, ε_{i1} (ε_{i2} = 1-ε_{i1}), called portion of "blood" of the population P_{1} (P_{2}) [23]:
p(A_{i}) = ε_{i1} p_{1} + (1-ε_{i1}) p_{2}.
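The relation above can be illustrated with a minimal sketch. The function names are hypothetical, and the rule that an offspring's blood portion is the mean of its parents' portions is an assumption added here for illustration; it is not stated in the text. The frequencies 0.9 and 0.4 are the values used later in the simulations.

```python
def hybrid_allele_freq(eps1: float, p1: float, p2: float) -> float:
    """Frequency of allele A for an individual whose portion of P1 'blood' is eps1."""
    if not 0.0 <= eps1 <= 1.0:
        raise ValueError("eps1 must lie in [0, 1]")
    return eps1 * p1 + (1.0 - eps1) * p2


def offspring_blood_portion(eps_father: float, eps_mother: float) -> float:
    """Assumed rule: the offspring's P1 portion is the mean of the parental portions."""
    return 0.5 * (eps_father + eps_mother)


# An F1 individual from a P1 x P2 cross carries half of its genome from each population.
eps_f1 = offspring_blood_portion(1.0, 0.0)
p_f1 = hybrid_allele_freq(eps_f1, p1=0.9, p2=0.4)
```

For pure-line individuals (ε_{i1} = 1 or 0) the function simply returns p_{1} or p_{2}.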
We assume that the trait values of individuals from a hybrid pedigree, as well as from the P_{1} and P_{2} populations, have a multivariate normal distribution that is parameterised by an expectation vector and a covariance matrix [25]. If the influence of the environment is identical for all hybrid individuals, then without loss of generality we can assume that the environmental effects for all individuals are random effects following a normal distribution with identical parameters, N (0, Var_{e}) [10].
Parameter of inbreeding
Conditional distribution of genotype frequencies of the QTL for an individual under the given genotype of his (her) parent
p(g_{3}/g_{2}) | g_{2} = AA | g_{2} = AB | g_{2} = BB |
---|---|---|---|
p(AA_{3}/g_{2}) | p(A_{1}) | 1/2 p(A_{1}) | 0 |
p(AB_{3}/g_{2}) | p(B_{1}) | 1/2 | p(A_{1}) |
p(BB_{3}/g_{2}) | 0 | 1/2 p(B_{1}) | p(B_{1}) |
Conditional distribution of genotype frequencies of the QTL for an inbred individual originated from cross "parent-offspring" under the given genotype of his (her) parent
p(g_{4}/g_{2}) | g_{2} = AA | g_{2} = AB | g_{2} = BB | p^{inb}(g_{4}) |
---|---|---|---|---|
p(AA_{4}/g_{2}) | 1/2 (p(A_{1})+1) | 1/4(p(A_{1})+1/2) | 0 | 1/2[p(A_{1}) p(A_{2}) + p(A_{2}) - 1/4p(AB_{2})] |
p(AB_{4}/g_{2}) | 1/2 p(B_{1}) | 1/2 | 1/2 p(A_{1}) | 1/2[p(B_{1}) p(A_{2}) + p(A_{1}) p(B_{2}) + 1/2p(AB_{2})] |
p(BB_{4}/g_{2}) | 0 | 1/4(p(B_{1})+ 1/2) | 1/2 (p(B_{1})+ 1) | 1/2[p(B_{1}) p(B_{2}) + p(B_{2}) - 1/4p(AB_{2})] |
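The last column of the table above can be checked by direct enumeration of Mendelian transmission. The sketch below is an illustration, not the author's code: it assumes individual 3 is the offspring of 1 × 2 and the inbred individual 4 is the offspring of 2 × 3, and the function names and the numerical frequencies are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

GENOS = ("AA", "AB", "BB")
GAMETES = {"AA": {"A": F(1)}, "AB": {"A": F(1, 2), "B": F(1, 2)}, "BB": {"B": F(1)}}


def cross(g_a, g_b):
    """Offspring genotype distribution for a mating of genotypes g_a and g_b."""
    out = {g: F(0) for g in GENOS}
    for (x, px), (y, py) in product(GAMETES[g_a].items(), GAMETES[g_b].items()):
        out["".join(sorted(x + y))] += px * py
    return out


def inbred_parent_offspring(p_a1, geno2):
    """Distribution of individual 4 (offspring of 2 x 3, where 3 is offspring of 1 x 2)."""
    result = {g: F(0) for g in GENOS}
    gametes_1 = {"A": p_a1, "B": 1 - p_a1}  # random allele passed on by individual 1
    for g2, p_g2 in geno2.items():
        for (a1, q1), (a2, q2) in product(gametes_1.items(), GAMETES[g2].items()):
            g3 = "".join(sorted(a1 + a2))
            for g4, p_g4 in cross(g2, g3).items():
                result[g4] += p_g2 * q1 * q2 * p_g4
    return result


def table_marginal_aa(p_a1, geno2):
    """Closed form for p^inb(AA_4) as printed in the table."""
    p_a2 = geno2["AA"] + geno2["AB"] / 2
    return F(1, 2) * (p_a1 * p_a2 + p_a2 - geno2["AB"] / 4)


# Hypothetical frequencies, chosen only to exercise the formulas.
geno2 = {"AA": F(1, 5), "AB": F(1, 2), "BB": F(3, 10)}
dist4 = inbred_parent_offspring(F(9, 10), geno2)
```

Exact fractions are used so that the enumerated value and the closed form can be compared without rounding error.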
The parameter of inbreeding may be generalized to any type of crosses with a single common ancestor:
τ = (1/2)^{k} f(g_{o}), (3)
where (1/2)^{k} is the degree of relationship of the parents of the inbred individual, and f(g_{o}) is the function of genotype frequencies of their common ancestor, g_{o}:
f(g_{o}) = (p(A_{o}) p(BB_{o}) + p(B_{o}) p(AA_{o}))/2. (4)
The proof of the validity of formulas (3–4) is presented in the Appendix.
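Formulas (3) and (4) translate directly into code. The following is a minimal sketch with hypothetical function names; taking k = 1 for a parent-offspring cross (relationship 1/2) is an assumption used in the example. Under Hardy-Weinberg proportions in the common ancestor, f(g_{o}) reduces algebraically to p(A_{o}) p(B_{o})/2, which the example uses as a check.

```python
def f_ancestor(p_aa: float, p_ab: float, p_bb: float) -> float:
    """f(g_o) from formula (4), given the genotype frequencies of the common ancestor."""
    p_a = p_aa + 0.5 * p_ab
    p_b = p_bb + 0.5 * p_ab
    return (p_a * p_bb + p_b * p_aa) / 2.0


def tau_single_ancestor(k: int, p_aa: float, p_ab: float, p_bb: float) -> float:
    """Inbreeding parameter from formula (3): tau = (1/2)^k * f(g_o)."""
    return 0.5 ** k * f_ancestor(p_aa, p_ab, p_bb)


# Ancestor in Hardy-Weinberg proportions with p(A) = 0.4 (illustrative value).
p, q = 0.4, 0.6
f_hwe = f_ancestor(p * p, 2 * p * q, q * q)
tau_po = tau_single_ancestor(1, p * p, 2 * p * q, q * q)
```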
Let the second pedigree examined include the shortest inbred loop with two common ancestors of parents of an inbred individual (Figure 1c). In this case, an inbred individual descends from a cross of sibs. To find τ, we considered a similar pedigree with the same structure but without inbreeding (Figure 1d). We need to determine the distributions of the genotype frequencies of the 5th individuals from inbred and outbred crosses through allelic frequencies of the pedigree's founders, p(A_{1}) and p(A_{2}), and to compare them with each other.
Conditional distribution of genotype frequencies of the QTL for sibs under the given genotypes of their parents
p(g_{3,4}/g_{1},g_{2}), by parental cross g_{1} × g_{2} | AA × AA | BB × BB | AB × AB | BB × AA, AA × BB | AB × AA, AA × AB | BB × AB, AB × BB |
---|---|---|---|---|---|---|
p(AA_{3,4}/g_{1},g_{2}) | 1 | 0 | 1/4 | 0 | 1/2 | 0 |
p(AB_{3,4}/g_{1},g_{2}) | 0 | 0 | 1/2 | 1 | 1/2 | 1/2 |
p(BB_{3,4}/g_{1},g_{2}) | 0 | 1 | 1/4 | 0 | 0 | 1/2 |
Conditional distribution of genotype frequencies of the QTL for an inbred individual originated from the cross of sibs under the given genotypes of parents of sibs
p(g_{5}/g_{1},g_{2}), by parental cross g_{1} × g_{2} | AA × AA | BB × BB | AB × AB | BB × AA, AA × BB | AB × AA, AA × AB | BB × AB, AB × BB | p^{inb}(g_{5}) |
---|---|---|---|---|---|---|---|
p(AA_{5}/g_{1},g_{2}) | 1 | 0 | 1/4 | 1/4 | 9/16 | 1/16 | 1/4[p(A_{1}) + p(A_{2}) - 1/4p(AB_{1}) - 1/4p(AB_{2})+ 2p(A_{1}) p(A_{2})] |
p(AB_{5}/g_{1},g_{2}) | 0 | 0 | 1/2 | 1/2 | 3/8 | 3/8 | 1/2[p(B_{1})p(A_{2}) + p(A_{1})p(B_{2}) + 1/4p(AB_{1}) + 1/4p(AB_{2})] |
p(BB_{5}/g_{1},g_{2}) | 0 | 1 | 1/4 | 1/4 | 1/16 | 9/16 | 1/4[p(B_{1})+ p(B_{2})- 1/4p(AB_{1}) - 1/4p(AB_{2}) + 2p(B_{1}) p(B_{2})] |
and can be generalized to all inbred individuals having parents with two common ancestors as:
τ = (1/2)^{k} (f(g_{1o}) + f(g_{2o}))/2, (5)
where (1/2)^{k} is the degree of relationship of the parents of the inbred individual, and f(g_{io}) for i = 1,2 is the function of genotype frequencies, g_{io}, of their i th common ancestor calculated by formula (4). The proof of the validity of formula (5) is presented in the Appendix.
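The conditional probabilities tabulated above for the offspring of a sib cross can be re-derived by enumerating Mendelian transmission over two generations. This is an illustrative sketch with hypothetical function names, not the author's program.

```python
from fractions import Fraction as F
from itertools import product

GENOS = ("AA", "AB", "BB")
GAMETES = {"AA": {"A": F(1)}, "AB": {"A": F(1, 2), "B": F(1, 2)}, "BB": {"B": F(1)}}


def offspring(g_a, g_b):
    """Genotype distribution of an offspring of g_a x g_b."""
    out = {g: F(0) for g in GENOS}
    for (x, px), (y, py) in product(GAMETES[g_a].items(), GAMETES[g_b].items()):
        out["".join(sorted(x + y))] += px * py
    return out


def sib_cross(g1, g2):
    """Distribution of individual 5, the offspring of two sibs born to g1 x g2."""
    sibs = offspring(g1, g2)
    out = {g: F(0) for g in GENOS}
    # Given the parental genotypes, the two sibs are drawn independently.
    for (g3, p3), (g4, p4) in product(sibs.items(), sibs.items()):
        for g5, p5 in offspring(g3, g4).items():
            out[g5] += p3 * p4 * p5
    return out


row_ab_aa = sib_cross("AB", "AA")
row_aa_bb = sib_cross("AA", "BB")
```

Here `row_ab_aa` corresponds to the column "AB × AA, AA × AB" and `row_aa_bb` to the column "BB × AA, AA × BB".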
Partitioning genetic covariance into components
To partition the genetic variance into components, we adapted the approach of Amos and Elston [7] to hybrid pedigree analyses.
Let us review in detail the process of finding the genetic covariance and its components, taking as an example the sib-pair, the relative pair most often used in studies of hereditary diseases.
The analysis of sib-pair
Conditional probability distribution of Y_{j} values for pair of sibs
Genotypes of sib-pair | Y_{j} | Conditional probability at π_{QTLj} = 0 | Conditional probability at π_{QTLj} = 1/2 | Conditional probability at π_{QTLj} = 1 |
---|---|---|---|---|
AA-AA | ξ_{j}^{2} | p(AA_{f})p(AA_{m}) + τ | 1/2 (p(AA_{f})p(A_{m}) + p(A_{f})p(AA_{m})) + τ | p(A_{f}) p(A_{m}) + τ |
BB-BB | ξ_{j}^{2} | p(BB_{f})p(BB_{m}) + τ | 1/2 (p(BB_{f})p(B_{m}) + p(B_{f})p(BB_{m})) + τ | p(B_{f})p(B_{m}) + τ |
AB-AB | ξ_{j}^{2} | p(AA_{f})p(BB_{m}) + p(BB_{f})p(AA_{m}) + 1/2p(AB_{f}) p(AB_{m}) -2τ | p(B_{f})p(A_{m}) + p(A_{f})p(B_{m}) - 1/4 [p(AB_{f})+ p(AB_{m})] - 2 τ | p(A_{f})p(B_{m}) + p(B_{f})p(A_{m})-2τ |
AA-AB | (a-d +ξ_{j})^{2} | 1/2 (p(AB_{f})p(AA_{m}) + p(AA_{f})p(AB_{m})) | 1/4(p(A_{f})p(AB_{m}) + p(AB_{f})p(A_{m})) | 0 |
AB-AA | (-a + d + ξ_{j})^{2} | 1/2 (p(AB_{f})p(AA_{m}) + p(AA_{f})p(AB_{m})) | 1/4(p(A_{f})p(AB_{m})+ p(AB_{f})p(A_{m})) | 0 |
AB-BB | (a + d +ξ_{j})^{2} | 1/2 (p(AB_{f})p(BB_{m}) + p(BB_{f})p(AB_{m})) | 1/4(p(B_{f})p(AB_{m}) + p(AB_{f})p(B_{m})) | 0 |
BB-AB | (-a-d +ξ_{j})^{2} | 1/2 (p(AB_{f})p(BB_{m}) + p(BB_{f})p(AB_{m})) | 1/4(p(B_{f})p(AB_{m}) + p(AB_{f})p(B_{m})) | 0 |
AA-BB | (2a +ξ_{j})^{2} | 1/4 p(AB_{f}) p(AB_{m}) | 0 | 0 |
BB-AA | (-2a + ξ_{j})^{2} | 1/4 p(AB_{f}) p(AB_{m}) | 0 | 0 |
Total | | 1 | 1 | 1 |
where ξ_{j} is the trait value difference caused by the environment. Evidently, formulas (7) for E(Y_{j}|π_{QTLj}) do not depend on the parameter of inbreeding. This means that the components caused by inbreeding are identical for covariance and variance.
where Ψ = θ^{2} + (1-θ)^{2}.
where β_{p} = (p(AB_{p}) - 2 p(A_{p}) p(B_{p})) at p = m, f.
The trait variance can be partitioned into components:
Var = A + D + R + Z,
One can conclude that trait covariance depends on the necessary set of parameters {a, d, p_{1}, p_{2}, θ}.
Criterion for the definition of QTL position
To localize a QTL on a chromosome, the maximum likelihood method was used. This method makes it possible to choose the most suitable genetic model, estimate the model parameters and define the position of the QTL with the required accuracy. Note that if there are no genetic effects (a = 0 and d = 0), it is impossible to localize a QTL since the recombination fractions between the QTL and the markers cannot be estimated. Let us consider two genotyped markers flanking the QTL and construct the corresponding log-likelihood function:
lnL = const - 1/2 ∑[ln|V| + (X- E_{ X }) V^{-1} (X- E_{ X })^{T}],
where the summation is over the two flanking markers; X and E_{X} are row vectors of quantitative trait values and their expectations, respectively; and V is a covariance matrix with the elements Cov(X_{1j}, X_{2j} | \hat{π}_{Mj}). The log-likelihood function keeps the same form when many markers are analysed, because localizing the QTL among multiple markers only requires testing each chromosome fragment bracketed by two adjacent genotyped markers.
We constructed the test statistic as twice the log-likelihood ratio, 2(lnL_{1}-lnL_{0}), where L_{0} is the maximum likelihood under a null hypothesis H_{0}, obtained by imposing restrictions on certain parameters of interest, and L_{1} is the maximum likelihood under an alternative hypothesis H_{1}, where these restrictions are removed. Here, we have chosen hypothesis H_{1} in which the parameter θ is not fixed, and hypothesis H_{0} in which the parameter θ is equal to the fixed value, θ_{k}. The recombination frequency between one of the markers and the QTL, θ_{k}, can be related to the genetic distance, k, by the Kosambi mapping function [27], which takes interference into account:
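For a single relative pair, the bracketed term of the log-likelihood can be evaluated without any matrix library. The sketch below assumes, for illustration only, that both individuals have the same trait variance `var` and that the pair covariance `cov` has already been computed from the model; the function names are hypothetical and the additive constant is omitted.

```python
import math


def pair_loglik(x, mu, var, cov):
    """-1/2 [ln|V| + (x - mu) V^-1 (x - mu)^T] for V = [[var, cov], [cov, var]]."""
    det = var * var - cov * cov
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    d1, d2 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form written out using the closed-form inverse of a 2x2 matrix.
    quad = (var * d1 * d1 - 2.0 * cov * d1 * d2 + var * d2 * d2) / det
    return -0.5 * (math.log(det) + quad)


def total_loglik(pairs, mu, var, covs):
    """Sum of pair contributions; covs[i] is the covariance assigned to pairs[i]."""
    return sum(pair_loglik(x, mu, var, c) for x, c in zip(pairs, covs))


ll_zero = pair_loglik((0.0, 0.0), (0.0, 0.0), var=1.0, cov=0.0)
ll_unit = pair_loglik((1.0, 0.0), (0.0, 0.0), var=1.0, cov=0.0)
```

In a linkage scan, `covs` would change with the assumed recombination fraction while the trait values stay fixed.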
θ_{ k }= 1/2 (e^{4k}- 1)/(e^{4k}+ 1),
where k varies from 0 to r discretely (with a given step length), and r is the fixed genetic distance between the two markers. Thus, we have several null hypotheses from which it is necessary to choose a suitable one. If the value of the statistic is calculated for each probable k-position of the QTL and compared with the critical value, then we can accept or reject the given position as correct. The specified criterion is a linkage test, for which the critical value converted from the LOD score is 2ln(10^{3}) = 13.8. Although many authors have demonstrated that a LOD score threshold greater than 3 is needed for evidence of significant linkage, we use this traditional threshold because it makes it easier to compare our method with others that use the same LOD score threshold. Researchers can, however, choose a more stringent threshold.
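The mapping function and the threshold conversion above are short enough to sketch directly. The function names are hypothetical, and the sketch assumes that the genetic distance k is expressed in Morgans, which the text does not state explicitly. The expression (e^{4k} - 1)/(e^{4k} + 1) equals tanh(2k), which is used here.

```python
import math


def kosambi_theta(k_morgans: float) -> float:
    """Recombination fraction from genetic distance: 1/2 (e^{4k} - 1)/(e^{4k} + 1)."""
    return 0.5 * math.tanh(2.0 * k_morgans)


def lod_to_chi2(lod: float) -> float:
    """Critical value of 2(lnL1 - lnL0) equivalent to a LOD score: 2 ln(10^LOD)."""
    return 2.0 * lod * math.log(10.0)


theta_5cm = kosambi_theta(0.05)   # QTL 5 cM from the marker
critical = lod_to_chi2(3.0)       # about 13.8
```

For short distances the recombination fraction is close to the distance itself, and it approaches 1/2 as the distance grows.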
Power
From mathematical statistics it is known that, in large samples, the likelihood ratio test statistic has a central χ^{2}-distribution under a null hypothesis and a noncentral χ^{2}-distribution under an alternative hypothesis [25]. Given a critical P-value, the power of a χ^{2}-test can be determined from the noncentrality parameter, λ, which is directly proportional to the sample size, N, and from the degrees of freedom of the noncentral χ^{2}-distribution, df. To estimate the power for any sample size at a given λ and df, one can refer to the appropriate function of the noncentral χ^{2}-distribution. It is possible to derive analytical formulas for the noncentrality parameter without carrying out data simulation [28]. For this, it is necessary to obtain the asymptotic values of the maximum-likelihood estimates of the parameters under both the H_{0} and H_{1} hypotheses, and then to take the log-likelihood expectations under these hypotheses evaluated at their respective asymptotic parameter estimates. The noncentrality parameter is then:
λ = E(2lnL_{1}) - E(2lnL_{0}). (10)
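Once λ is known, the power follows from the tail probability of the noncentral χ^{2}-distribution. The sketch below is a generic illustration rather than the author's procedure: it uses the standard representation of the noncentral χ^{2} as a Poisson mixture of central χ^{2}-distributions and, to stay within the standard library, is restricted to even df, where the central tail has a closed form. The function names and the truncation at 400 terms are assumptions.

```python
import math


def chi2_sf_even(x: float, df: int) -> float:
    """P(X > x) for a central chi-square with even df (closed form)."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def ncx2_power(crit: float, ncp: float, df: int, terms: int = 400) -> float:
    """P(X > crit) for a noncentral chi-square with even df and noncentrality ncp."""
    if df <= 0 or df % 2:
        raise ValueError("this sketch handles positive even df only")
    mu = ncp / 2.0
    weight = math.exp(-mu)          # Poisson(mu) probability of j = 0
    power = 0.0
    for j in range(terms):
        power += weight * chi2_sf_even(crit, df + 2 * j)
        weight *= mu / (j + 1)      # next Poisson probability
    return power


crit = 2.0 * math.log(1000.0)       # LOD 3 expressed as a chi-square value
size_under_null = ncx2_power(crit, 0.0, df=2)
power_example = ncx2_power(crit, 20.8, df=2)
```

With λ = 0 the function returns the ordinary χ^{2} tail probability, and the power increases monotonically with λ. For very large λ the initial Poisson weight underflows, so a library routine would be preferable in that range.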
where s is sibship size; and p_{i} and V_{ki} are the probability and the covariance matrix for the i th marker genotype configuration, respectively.
where Ψ = θ^{2} + (1-θ)^{2}. The noncentrality parameter for sib-pair is then given by:
λ = 1/4 ln(1-c_{0}^{2}) + 1/2 ln(1-c_{1}^{2}) + 1/4 ln(1-c_{2}^{2}) - 1/4 ln(1-c'_{0}^{2}) - 1/2 ln(1-c'_{1}^{2}) - 1/4 ln(1-c'_{2}^{2}),
where c'_{i} are values of c_{i} (i = 0, 1, and 2) at the fixed Ψ_{k}, Ψ_{k} = θ_{k}^{2} + (1-θ_{k})^{2}.
where Δ_{1} = Ψ-Ψ_{k} and Δ_{2} = 1-Ψ_{k}-Ψ. In this case λ_{1} is proportional only to the squares and products of the additive V_{A} and dominance V_{D} variance components and does not depend on the component conditioned by the hybrid (inter-population) origin of the sibs.
For a sufficiently accurate calculation of the noncentrality parameter, the second-order approximation, ln(1-x) ≈ -x - 1/2 x^{2}, is often used. It follows that
λ ≈ λ_{1} + 1/8(c'_{0}^{4} - c_{0}^{4}) + 1/4(c'_{1}^{4} - c_{1}^{4}) + 1/8(c'_{2}^{4} - c_{2}^{4}). (12)
The analytical expression for the noncentrality parameter after substituting the expressions (11) in formula (12) is lengthy, but we can see that λ depends on all variance components, V_{A}, V_{D}, and V_{R}. Moreover, the power to detect a given QTL effect increases with increasing proportion of the residual component, V_{R}. To obtain more accurate results, it is possible to use an approximation by involving higher-order terms.
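The quality of the second-order approximation can be checked numerically against the logarithmic expression for the sib-pair. In this sketch the correlations c_{i} and c'_{i} are hypothetical numbers chosen for illustration, the formulas are coded as printed above, and λ_{1} is taken to be the first-order term 1/4(c'_{0}^{2} - c_{0}^{2}) + 1/2(c'_{1}^{2} - c_{1}^{2}) + 1/4(c'_{2}^{2} - c_{2}^{2}), which is what the expansion ln(1-x) ≈ -x yields.

```python
import math

WEIGHTS = (0.25, 0.5, 0.25)  # weights of the three terms in the sib-pair formula


def lambda_log(c, c_fixed):
    """Sib-pair noncentrality parameter in its logarithmic form."""
    return sum(w * (math.log(1 - ci ** 2) - math.log(1 - cf ** 2))
               for w, ci, cf in zip(WEIGHTS, c, c_fixed))


def lambda_first_order(c, c_fixed):
    """First-order term (lambda_1), from ln(1 - x) ~ -x."""
    return sum(w * (cf ** 2 - ci ** 2) for w, ci, cf in zip(WEIGHTS, c, c_fixed))


def lambda_second_order(c, c_fixed):
    """Approximation (12): lambda_1 plus the fourth-power correction."""
    correction = sum(0.5 * w * (cf ** 4 - ci ** 4)
                     for w, ci, cf in zip(WEIGHTS, c, c_fixed))
    return lambda_first_order(c, c_fixed) + correction


c = (0.10, 0.20, 0.30)        # hypothetical c_0, c_1, c_2
c_fixed = (0.15, 0.20, 0.25)  # hypothetical c'_0, c'_1, c'_2
err_first = abs(lambda_log(c, c_fixed) - lambda_first_order(c, c_fixed))
err_second = abs(lambda_log(c, c_fixed) - lambda_second_order(c, c_fixed))
```

For correlations of this size the second-order error is roughly an order of magnitude smaller than the first-order error.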
To determine a noncentrality parameter for the entire sibship, we used a suitable approximation to calculate a determinant of correlation matrix (non-singular and symmetric) as shown in [28]:
ln|V| ≈ ln(1 - ∑V_{jk}^{2}) ≈ -∑V_{jk}^{2}, (13)
where ∑ denotes the sum over all possible sib-pairs (j, k), j <k. Then for an s-size sibship the noncentrality parameter, λ_{ s }, is equal to:
λ_{s} ≈ 1/2 s(s - 1) λ, (14)
where 1/2s(s-1) is the number of sib-pairs. As is obvious, the noncentrality parameter for the linkage test is proportional to the number of all pairs in the sibship. It is noteworthy that formula (14) is not exact for small samples, and in this case, it is necessary to calculate the power through data simulation.
In the analysis of a hybrid pedigree of arbitrary structure, the noncentrality parameter can be obtained in a similar manner. For this purpose, noncentrality parameters are calculated for all relative pairs of the pedigree analysed, and are then summed according to approximation (13). Once the theoretical noncentrality parameter has been obtained, it is easy to calculate the sample size required for any given level of significance and power. For the linkage test, the level of significance required is traditionally set at a LOD score of 3, which is equivalent to a χ^{2} statistic of 13.8 with df = 2 and a fixed-sample one-tailed significance level of 0.0001. The noncentrality parameter required for 80% power is 20.8 [28]. For example, the number of sib-pairs required can be obtained by dividing the noncentrality parameter required (i.e. 20.8 for the linkage test) by the theoretical noncentrality parameter per sib-pair.
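The sample-size rule just described, combined with approximation (14), amounts to a few lines of arithmetic. The function names and the per-pair value 0.0136 are hypothetical and serve only to illustrate the calculation.

```python
import math

REQUIRED_NCP = 20.8  # noncentrality parameter quoted in the text for 80% power


def sibship_ncp(pair_ncp: float, s: int) -> float:
    """Approximation (14): lambda_s ~ 1/2 s (s - 1) lambda for a sibship of size s."""
    return 0.5 * s * (s - 1) * pair_ncp


def sibships_required(pair_ncp: float, s: int = 2, required: float = REQUIRED_NCP) -> int:
    """Smallest number of sibships of size s whose summed lambda reaches `required`."""
    return math.ceil(required / sibship_ncp(pair_ncp, s))


n_pairs = sibships_required(0.0136)        # sib-pairs (s = 2)
n_quads = sibships_required(0.0136, s=4)   # sibships of four, six pairs each
```

Because the number of pairs grows as s(s - 1)/2, larger sibships reduce the number of families needed far more than proportionally, subject to the caveat that approximation (14) is not exact for small samples.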
Sample sizes required for 80% power to detect linkage for the range of V_{R}, recombination fraction and fixed components V_{D} = 0.15 and V_{A} = 0.15
Recombination fraction, θ | V_{R} = 0.00 | V_{R} = 0.05 | V_{R} = 0.10 | V_{R} = 0.15 | V_{R} = 0.20 | V_{R} = 0.25 | V_{R} = 0.30 | V_{R} = 0.35 |
---|---|---|---|---|---|---|---|---|
0.00 | 1530 | 1460 | 1379 | 1291 | 1200 | 1110 | 1023 | 940 |
0.05 | 2453 | 2346 | 2220 | 2082 | 1938 | 1793 | 1653 | 1520 |
0.10 | 4078 | 3909 | 3704 | 3478 | 3240 | 3001 | 2767 | 2545 |
0.15 | 7142 | 6855 | 6504 | 6111 | 5697 | 5279 | 4871 | 4480 |
Simulations
To examine the performance of the proposed approach in realistic situations, we conducted simulation studies on examples of inbred sibships. We generated 10, 20, 30, and 40 hybrid pedigrees covering three generations of individuals (F_{0}, F_{1}, and F_{2} generations). The founders (F_{0} individuals) of each pedigree analysed were two individuals from the P_{1} population and a single individual from the P_{2} population. Founders from different populations formed a crossing pair and had one offspring in the F_{1} generation. Inbred crossing between the related F_{1} individuals produced an F_{2} generation of 10 offspring. Thus, each pedigree consisted of three founders, two F_{1} individuals, and ten F_{2} individuals. For all inbred sibs from the F_{2} generation, the marker genotypes and phenotypic values of the trait were simulated and were considered as known.
We considered two QTL positions between two markers, assuming in turn that the QTL is located at chromosome positions 5 and 10 cM, and that the markers flanking it are fixed at positions 0 and 25 cM. We specified the distribution of allele frequencies of the QTL for all the individuals analysed based on the assumption that Hardy-Weinberg equilibrium holds for the founders.
The markers were assumed to have four alleles (two unique alleles from each initial population). The marker genotypes of the founders were selected randomly, assuming a uniform distribution of marker allele frequencies. The additive and dominance genetic values were taken to be a = 3 and d = 1, respectively. For the founders, the QTL allelic frequencies were taken to be p_{1} = 0.9 and p_{2} = 0.4. The phenotypic values were obtained by adding a normal deviate, N(0,1), to the genetic value. For each given set of model parameters {p_{1}, p_{2}, a, d, θ} and the given sample size, 100 replicates were simulated.
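The way a founder's phenotype is generated can be sketched as follows. This is an illustration under stated assumptions, not the simulation program used in the study: the genotypic values AA = a, AB = d, BB = -a are inferred from the trait differences listed in the sib-pair table, and the function names are hypothetical.

```python
import random


def draw_allele_count(p: float, rng: random.Random) -> int:
    """Number of A alleles (0, 1 or 2) under Hardy-Weinberg proportions."""
    return sum(1 for _ in range(2) if rng.random() < p)


def genetic_value(n_a: int, a: float, d: float) -> float:
    """Assumed coding: AA -> a, AB -> d, BB -> -a."""
    return {2: a, 1: d, 0: -a}[n_a]


def simulate_founder(p: float, a: float = 3.0, d: float = 1.0, rng=None):
    """Return (allele count, phenotype) with phenotype = genetic value + N(0, 1)."""
    rng = rng or random.Random()
    n_a = draw_allele_count(p, rng)
    return n_a, genetic_value(n_a, a, d) + rng.gauss(0.0, 1.0)


rng = random.Random(2007)                                        # fixed seed for reproducibility
p1_founders = [simulate_founder(0.9, rng=rng) for _ in range(2)]  # two founders from P1
p2_founder = simulate_founder(0.4, rng=rng)                       # one founder from P2
```

Descendants would then be generated by transmitting one randomly chosen allele from each parent and adding an independent N(0, 1) deviate in the same way.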
Our purpose is to locate the QTL while estimating the allelic frequencies for the initial populations P_{1} and P_{2} and the genetic effects of the trait in question in each replicate. To locate the QTL on a chromosome fragment between markers, we moved along the fragment in discrete steps of 0.01 cM and evaluated the double likelihood ratio statistic at each point. If the statistic calculated at a point was higher than the critical value, then the hypothesis that the QTL is located at this point was rejected. The QTL was taken to be located at the point where the statistic had the lowest value.
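The scanning rule described above can be sketched as a small helper. The function `scan_interval` and the toy statistic are illustrative assumptions; in an actual analysis `stat` would return 2(lnL_{1} - lnL_{0}) evaluated at each candidate position.

```python
def scan_interval(stat, r_cm: float, step_cm: float, crit: float):
    """Evaluate stat on a grid over [0, r_cm]; return (best position, accepted positions)."""
    n_steps = int(round(r_cm / step_cm))
    positions = [i * step_cm for i in range(n_steps + 1)]
    values = [stat(k) for k in positions]
    # Positions whose statistic does not exceed the critical value are not rejected.
    accepted = [k for k, v in zip(positions, values) if v <= crit]
    # The QTL is placed where the statistic is lowest.
    best = positions[min(range(len(values)), key=values.__getitem__)]
    return best, accepted


def toy_stat(k):
    """Toy statistic with a minimum at 5 cM, used only to exercise the function."""
    return (k - 5.0) ** 2


best, accepted = scan_interval(toy_stat, r_cm=25.0, step_cm=0.5, crit=13.8)
```

The accepted positions form a support interval around the estimate; a finer step, such as the 0.01 cM used in the study, only changes the grid resolution.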
We compared our method with the method for QTL analysis of F_{2} crosses between outbred lines, described in [2, 29], which was performed using the Qxpak software available free at [30].
The performance of both methods was tested using the same simulated data. The characteristics compared were the frequency with which the true location of the QTL was not rejected (W_{1}), and the frequency with which the statistical test indicated the true QTL location as the most likely one (W_{2}). It is obvious that W_{1} ≥ W_{2}. The value of (1-W_{1}) can be interpreted as the type I error rate, and W_{2} is analogous to the power of the method.
When the QTL is actually located at position 5 cM (Figure 2), our method shows the higher W_{1} frequency, exceeding 0.8 at every sample size. Only from N_{ped} = 450 onwards do the W_{1} bars of the two methods come within a 5% error range of each other. The W_{2} values do not exceed 0.8 for either method at any sample size. When the QTL is located at position 10 cM (Figure 3), the frequencies W_{1} and W_{2} are higher than 0.95 and 0.8, respectively, from N_{ped} = 150 for our method and from N_{ped} > 600 for Qxpak.
These results favour our method for the analysis of hybrid pedigrees with dominance and inbreeding effects, since it localizes the QTL more accurately.
Discussion
In this study we have updated the variance-components method for the analysis of hybrid pedigrees with dominance and inbreeding. We have considered hybrid sibships as an example to demonstrate the method. An advantage of our method is that the variance is partitioned into components, one of which is conditioned by the inter-population origin of individuals and by inbreeding. There is no need to resolve this component into separate elements caused by inter-population origin and by inbreeding, since these elements are indivisible in variances and covariances and therefore cannot be estimated separately.
We have derived an intuitively appealing result regarding the power of our method under a variance-components model for larger samples of sibships. If the effects of QTL are small, the results are particularly simple. We have generally arrived at the conclusion that the power of our method decreases rapidly with decreasing proportion of the variance component caused by the hybrid origin and by inbreeding. This means that the sample size required for 80% power for hybrid pedigrees is less than for pedigrees descended from one pure line.
For simplicity, we did not consider such fixed factors affecting the formation of traits as for example, age and sex, but these factors can easily be included in our model. Moreover, the method suggested can be used to choose the most suitable model for the description of the data: additive models (d = 0), dominance models (a = 0), models of crosses of two pure lines (p_{1} = 0, and p_{2} = 1) or models of intra-population crosses (p_{1}=p_{2}).
The results obtained make it possible to draw conclusions on the competence of the incorporated analysis that could specify not only the localization of the QTL, but also an estimate of the values of QTL effects.
Conclusion
We have presented a new modification of the variance-components method for QTL analysis. It is a linkage test method whose originality consists in considering the trait effect caused by inter-population origin and inbreeding. Analytical derivations for the variance components make it possible to analyse their dependence on the model parameters.
The analytical expressions for the power of our method avoid the intensive computations required for processing simulated data and make it possible to estimate the pedigree size required. We have shown that the method is more powerful if the QTL effects conditioned by inter-population origin and inbreeding are increased. Several improvements can be developed to take into account fixed factors affecting trait formation, such as age and sex.
Our method uses the trait values and the marker information for each individual of a pedigree with an arbitrary structure including inbred loops as initial data and can be valuable for fine mapping purposes.
Methods
This maximization is carried out numerically using the simplex program METHI, developed specifically to obtain the maximum likelihood (ML) and the ML parameter estimates of a likelihood function. METHI uses a method of configurations when maximising a function. METHGI is freely accessible on our laboratory website [31]. The parameters {a, d, p_{1}, p_{2}} must be estimated. For the recombination frequency, we have assigned different fixed values corresponding to specific distances on the chromosome.
Appendix
After some transformations, it is clear that the frequency of genotype AA in the inbred cross can be expressed through the corresponding frequency for the i th individual:
p^{inb}(AA_{j}) = 1/2(p^{inb}(AA_{i}) + p(A_{jfm}) p(A_{im})).
Thus, we have shown that the inbreeding parameter for an individual descended from any type of inbred cross depends on the degree of relationship of the parents and on the genotype distribution of the common ancestors, and does not depend on the distribution of the inbred offspring.
As a result, inbreeding changes the parameters of the distribution of quantitative trait values for hybrid individuals: genotypic means decrease or remain constant, and covariances generally increase. It is important to note that accounting for the inbreeding of hybrid individuals does not complicate QTL analysis and yields more accurate estimates of the distribution parameters of quantitative trait values.
Declarations
Acknowledgements
I gratefully thank Anne-Lise Haenni for critical reading of this manuscript, Tatiana Axenovich for the helpful discussions, and Galina Karpova, Dmitry Graifer and Ivan Shatsky for their help in the preparation of this manuscript.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.