A complete classification of epistatic two-locus models
- Ingileif B Hallgrímsdóttir^{1}Email author and
- Debbie S Yuster^{2, 3}
DOI: 10.1186/1471-2156-9-17
© Hallgrímsdóttir and Yuster; licensee BioMed Central Ltd. 2008
Received: 14 March 2007
Accepted: 19 February 2008
Published: 19 February 2008
Abstract
Background
The study of epistasis is of great importance in statistical genetics in fields such as linkage and association analysis and QTL mapping. In an effort to classify the types of epistasis in the case of two biallelic loci Li and Reich listed and described all models in the simplest case of 0/1 penetrance values. However, they left open the problem of finding a classification of two-locus models with continuous penetrance values.
Results
We provide a complete classification of biallelic two-locus models. In addition to solving the classification problem for dichotomous trait disease models, our results apply to any instance where real numbers are assigned to genotypes, and provide a complete framework for studying epistasis in QTL data. Our approach is geometric and we show that there are 387 distinct types of two-locus models, which can be reduced to 69 when symmetry between loci and alleles is accounted for. The model types are defined by 86 circuits, which are linear combinations of genotype values, each of which measures a fundamental unit of interaction.
Conclusion
The circuits provide information on epistasis beyond that contained in the additive × additive, additive × dominance, and dominance × dominance interaction terms. We discuss the connection between our classification and standard epistatic models and demonstrate its utility by analyzing a previously published dataset.
Background
The genetic dissection of complex traits is at the center of current research in human genetics. Complex traits are caused by multiple susceptibility genes and environmental factors, and mounting evidence from both human genetics and model organisms suggests that epistasis (gene × gene interaction) plays an important role [1, 2]. Although the need to consider epistasis when mapping complex trait loci has been discussed by several authors [3–6], most statistical methods used in gene mapping, be it case-control association studies, quantitative trait loci (QTL) mapping, or linkage analysis, are based only on measures of marginal effects at individual loci and do not consider epistasis. Due to recent advances in genotyping technology many large case-control genome-wide association studies [7] have recently been completed, and there has been renewed interest in two-locus disease models and two-locus tests for association [8–11]. The application of two-locus models also arises in expression QTL mapping where thousands of gene expression traits are mapped with linkage analysis and it is imperative to study gene interactions [12].
Ideally, a test for epistasis between two loci A and B should test for biological interaction, or non-independence between the effects of locus A and locus B. Loci A and B are considered independent if the effect of the genotype at locus A does not depend on the genotype at locus B. This biologically motivated concept has been formalized in a variety of ways by different communities seeking simple, mathematically convenient definitions. In the statistical genetics literature the term epistasis is typically taken to mean that the effects at loci A and B are not additive (the "effect" of a locus is defined in terms of the statistical model used [13]). For a further discussion on epistasis see [14–16]. Fisher [17] considered a linear model for the contribution of different loci to a quantitative trait and used the term epistasy to describe a departure from additivity. In linkage analysis based on variance component models, a model without epistasis is a model in which all dominance variance components are zero. In case-control association studies of dichotomous traits it is common to use logistic regression, and additivity is measured in the log-odds of disease for a genotype [18]. A new test for epistasis was recently suggested in [10], it tests for departures from linkage disequilibrium (LD) in the cases, which is equivalent to testing for departure from additivity of the log penetrance values (i.e. departure from a multiplicative model for the two-locus penetrances). These tests all test for departure from additivity on a particular scale, but if an additive model is rejected they provide no information on the type of interaction present. Furthermore, it is not clear what the biological meaning of the interaction is.
With each of the nine two-locus genotypes we associate a genotype value. In the case of a dichotomous phenotype the genotype value can e.g. be the penetrance associated with the genotype, the logarithm of the penetrance, or the logarithm of the odds ratio. In the case of a quantitative trait a natural choice for the genotype value is the expected phenotype value of individuals with that genotype (sometimes called measured genotype). We will consider epistasis to be any deviation from additivity of the genotype values. This is consistent with the definition of epistasis given in [13], both for quantitative and dichotomous traits. In this paper we provide a framework within which one can study and classify the types of epistasis possible between two biallelic loci. Our results are based on recent work of [19] who provide a rigorous geometric approach to epistasis in the haploid case. We extend their results to the diploid case, and characterize all possible patterns of physical interactions among the 9 possible genotypes in the two locus case, showing that there are 387 classes of models that fall into 69 symmetry classes. We discuss the meaning of the different types of interaction and show how the interaction pattern can be effectively measured and visualized.
In genetic analysis it is common to test not only for departure from additivity, but also for whether the data fits a particular two-locus model (e.g. recessive or dominant). We discuss the models that are frequently used and show how they relate to the classification given here. In order to study a wider class of two-locus models [20] enumerated all two-locus, two-allele, two-phenotype disease models with penetrance values 0 or 1 for the nine possible phenotypes. There are 512 such models, which can be reduced to a list of 50 models after allowing for symmetry between alleles, loci and affection status. We classify models with continuously varying penetrances, overcoming the difficulty they highlight in their paper, and show that in fact their 50 models fall into 29 of the 69 symmetry classes.
We introduce the mathematical concepts used to derive the 387 classes of two-locus models and demonstrate on a real dataset how the shapes can be used to classify pairs of loci and identify pairs with similar genetic effects. Finally, we consider the two-locus models typically used in human genetics, the 50 models from [20], and some models with epistasis. We show that these models only represent a small fraction of all possible two-locus models.
Results and Discussion
Shapes of two-locus models
In the case of a dichotomous trait, f_{ ij }can e.g. be a penetrance, the probability that an individual with genotype ij will get the disease. For a quantitative trait, f_{ ij }can e.g. be the expected phenotypic value for an individual with genotype ij.
In an additive model, the genotype values can be written as a sum of the effect at each locus, f_{ ij }= α_{ i }+ β_{ j }, where α_{ i }is the effect associated with having i disease alleles at the first locus, and β_{ j }is the effect associated with having j disease alleles at the second locus. An epistatic model is any non-additive two-locus model. To study epistasis we consider the interaction space, which is the space of all two-locus models modulo the space spanned by all additive two-locus models. The interaction space is spanned by a set of linear forms in the {f_{ ij }} called circuits. There is a circuit for each set of 3 collinear points, and for each set of four points in the plane such that no three of the points are collinear, resulting in a total of 86 circuits. The coefficients in the linear form are such that the sum of the points in the circuit, when scaled by these coefficients, is zero. For example, the circuit arising from the points f_{00}, f_{01}, f_{20}, and f_{12} is
-3f_{00} + 4f_{01} + f_{20} - 2f_{12},
since
-3·(0, 0) + 4·(0, 1) + (2, 0) - 2·(1, 2) = 0.
Every circuit with four points can be seen as a contrast between two pairs of genotype values and measures a specific deviation from additivity. For example, the above circuit is positive if 4f_{01} + f_{20} ≥ 3f_{00} + 2f_{12} and negative otherwise. For some circuits this contrast has a simple interpretation, e.g. the circuit arising from f_{00}, f_{01} and f_{02} is f_{00} - 2f_{01} + f_{02}. It compares the genotype value f_{01} (for genotype aa/Bb) to the average of the genotypic values f_{00} and f_{02} (for genotypes aa/bb and aa/BB), i.e. it measures deviation from additivity at locus B in individuals with genotype aa at locus A.
4·f_{ a }= -f_{00} + f_{02} - f_{20} + f_{22}
4·f_{ b }= -f_{00} - f_{02} + f_{20} + f_{22}
4·δ_{ a }= -f_{00} + 2f_{01} - f_{02} - f_{20} + 2f_{21} - f_{22}
4·δ_{ b }= -f_{00} + 2f_{10} - f_{20} - f_{02} + 2f_{12} - f_{22}
4·I_{ AA }= f_{00} - f_{02} - f_{20} + f_{22}
4·I_{ AD }= f_{00} - 2f_{01} + f_{02} - f_{20} + 2f_{21} - f_{22}
4·I_{ DA }= f_{00} - 2f_{10} + f_{20} - f_{02} + 2f_{12} - f_{22}
4·I_{ DD }= f_{00} - 2f_{01} + f_{02} - 2f_{10} + 4f_{11} - 2f_{12} + f_{20} - 2f_{21} + f_{22}
Note that with this choice the additive effect is scaled so that the contribution is -f_{ a }, 0, and f_{ a }for genotypes aa, Aa, and AA respectively, and similarly for the second locus. This is a simple linear transformation of the genotype values which can be used both for penetrances and phenotypic means. The space of all two-locus models has dimension 9 and the interaction space has dimension 6. A natural choice of a basis for the interaction space is given by the interaction coordinates (δ_{ a }, δ_{ b }, I_{ AA }, I_{ AD }, I_{ DA }, I_{ DD }) where δ_{ a }and δ_{ b }measure within-locus interaction and I_{ AA }, I_{ AD }, I_{ DA }, and I_{ DD }measure between-loci interaction.
A full list of the 86 circuits in the new coordinates is given in Appendix B. Although the circuits are fully specified by the six interaction coordinates they do contain important information on the type of interaction present. The circuits measure interesting contrasts and the new parameterization allows us to interpret them. For example, circuit c_{30} = 2δ_{ a }- 2δ_{ b }measures the difference between the dominance effects, circuit c_{1} = -2δ_{ a }+ 2I_{ AD }measures the difference between the dominance effect at the first locus and the additive × dominance interaction, etc. The sign of a circuit specifies whether the type of epistasis measured by the circuit is positive or negative, and its magnitude measures the degree of interaction. The circuits contain detailed information on the interaction in a model and to fully describe the pattern of interaction we can consider the sign pattern of all 86 circuits, however, this leads to a very large number of categories. For a more useful classification of all two-locus models according to the type of interaction present we consider the triangulation induced by the penetrances. The connection between a triangulation and the circuits will be discussed further below.
The mathematical definition of a triangulation is given in Appendix A but an informal description is provided here. We represent the 9 genotypes by 9 points in the plane on a 3 × 3 grid and the genotypic values by heights above these points. If the values come from an additive model it is possible to fit a plane through the height points. For any non-additive model we consider the surface given by the upper faces of the convex hull of the heights. Intuitively this is the surface formed if we were to drape a piece of stiff cloth on top of the heights and consider its shape. Any departure from additivity in the model becomes apparent in this surface. The triangulation, or shape, of a model is obtained by projecting these upper faces (the "creases" in the surface) onto the xy-plane.
A sign pattern for the circuits specifies a model shape, but the converse is not true. Thus considering the shape of a model, rather than the sign pattern of the 86 circuits, gives a coarser model classification, but it provides a very useful description of the type of epistasis in the model. A shape contains information about the signs of some of the 86 circuits. Every group of points in a circuit can be triangulated in exactly two ways [22] corresponding to the type of epistasis. If a model shape has a line connecting the points (i_{1}, j_{1}) and (i_{2}, j_{2}) then for some circuit, $c=({a}_{1}{f}_{{i}_{1}{j}_{1}}+{a}_{2}{f}_{{i}_{2}{j}_{2}})-({b}_{1}{f}_{{k}_{1}{l}_{1}}+{b}_{2}{f}_{{k}_{2}{l}_{2}})$, the pair ${f}_{{i}_{1}{j}_{1}}$ and ${f}_{{i}_{2}{j}_{2}}$ are the "winners", i.e. ${a}_{1}{f}_{{i}_{1}{j}_{1}}+{a}_{2}{f}_{{i}_{2}{j}_{2}}\ge {b}_{1}{f}_{{k}_{1}{l}_{1}}+{b}_{2}{f}_{{k}_{2}{l}_{2}}$. Similarly, if there is no line connecting the points (i_{1}, j_{1}) and (i_{2}, j_{2}), and it is not possible to add one without crossing an existing line segment, then there is some circuit such that ${f}_{{i}_{1}{j}_{1}}$ and ${f}_{{i}_{2}{j}_{2}}$ are the "losers". For example, in Figure 1, there is a line between (1, 0) and (0, 1) and f_{01} + f_{10} ≥ f_{00} + f_{11}, and also 2f_{01} + 2f_{10} ≥ 3f_{00} + f_{22}.
Note that the model shape gives information about the types of interaction present in the model, but does not reveal the magnitude of the interaction (for that we need the actual value of the circuits). For generic models we always get a triangulation of the 3 × 3 grid, but for some models the resulting shape is not a triangulation but a subdivision, where not all cells in the shape are 3-sided (this happens e.g. when many of the genotype values are identical). These coarse subdivisions are not counted in our 387 models, however each coarse subdivision is refined by two or more of our models. The model shape provides information that is complementary to that given by the values of the interaction coordinates. Looking at a specific triangulation or subdivision tells us which way some (but not all) of the circuits are triangulated, thus giving information about interaction for that particular model, in particular the triangulation allows us to identify the dominating interactions. Consider e.g. the example in Figure 1. Although there is additive × additive interaction present (I_{ AA }= 210.9), and the circuit c_{17} = 4I_{ AA }is clearly positive, the corresponding line between (0, 0) and (2, 2) is not included. This is because this interaction is dominated by other types of interaction. The two circuits with the largest values are c_{71} and c_{72}. The first of these contrasts f_{01} and f_{22} with f_{02} and f_{10} and thus the line between (0, 1) and (2, 2) is included in the triangulation, the second contrasts f_{10} and f_{22} with f_{02} and f_{21} and thus the line between (1, 0) and (2, 2) is included.
It is useful to have a notion of when two shapes are "close" or "similar". We say that two shapes are adjacent if one can move from one to the other by changing the sign of one of the 86 circuits (note that most sign changes do not result in a move between shapes). Out of 387 shapes, 350 are adjacent to 6 other shapes, 16 are adjacent to 7 other shapes, and 21 are adjacent to 8 other shapes. We define the distance between two shapes as the minimum number of circuit changes that are necessary to get from one to the other. In the set of 387 shapes the maximum distance between two shapes is 9, and around 70% of all pairs of shapes are distance 4 to 6 apart. Two-locus models which fall into adjacent model shapes share many of the same two-locus interactions, and in general the shorter the distance between two shapes, the more similar the genetic effects. For a further discussion on the shapes of genetic models see [23].
Each shape divides the 3 × 3 grid into 2 to 8 triangles (the numbers in each category are 2, 11, 38, 68, 96, 108 and 64 out of 387). Each shape corresponds to a subspace of 9-dimensional space and the volume of this subspace measures how much of the parameter space the shape inhabits. We obtained an estimate of this by generating 1,000,000 random vectors of length 9 and calculating the shape that each of them falls into. The model shape for a 9-vector is conserved under shifting and scaling so it suffices to consider vectors in [0, 1]^{9} and each of the 9 numbers were drawn uniformly at random from the interval [0, 1]. The fraction of observations that fell into shapes which divide the grid into 2, 3, 4, triangles, etc., was 6.4%, 17.2%, 28.3%, 24.9%, 15.0%, 6.1% and 2.0%, very different from the fraction of shapes in each category, which is 0.5%, 2.8%, 9.8%, 17.6%, 24.9%, 29.9% and 16.5%. Two-locus models where one, or a few, genotype values are larger than the remaining values induce shapes which contain fewer triangles. However, if the genotype values show only slight deviations from falling on a plane (i.e. δ_{ a }, δ_{ b }, I_{ AA }, I_{ AD }, I_{ DA }, and I_{ DD }are small), the surface is not dominated by a few genotypes and the resulting shape will be more subdivided.
We will further discuss how the shapes can be used to characterize the type of interaction in a dataset using an example on QTL mapping in chicken.
Two-locus models
In this section we study a number of model classes that are often used in genetic analysis, and the shapes that they induce. We show that each of the model classes restricts the analysis to a small subset of all possible two-locus models. Furthermore, because these models are very specific, they limit the types of interaction that can be modeled and only represent a small fraction of the 69 shapes.
Classical two-locus models. The table lists the values of the interaction coordinates for multiplicative two-locus models. The parameters are γ = (α_{2} - α_{0})(β_{2} - β_{0}), η_{1} = (α_{0} + α_{2})(β_{2} - β_{0}), and η_{2} = (α_{2} - α_{0})(β_{0} + β_{2}).
One-loc | δ _{ a } | δ _{ b } | I _{ AA } | I _{ AD } | I _{ DA } | I _{ DD } |
---|---|---|---|---|---|---|
rec-rec | -η_{1}/4 | -η_{2}/4 | γ | -γ | -γ | γ |
rec-add | 0 | -η_{2} | γ | 0 | 0 | -γ |
rec-dom | -η_{1}/4 | -η_{2}/4 | γ | γ | -γ | -γ |
dom-dom | -η_{1}/4 | -η_{2}/4 | γ | γ | γ | γ |
dom-add | 0 | η _{2} | γ | γ | 0 | 0 |
add-add | 0 | 0 | γ | 0 | 0 | 0 |
Classification of epistatic effects
Model organisms such as yeast, mouse or chicken are frequently used in genetic analysis, and several recent studies have shown that epistatic effects contribute greatly to observed genetic variability. When pairs of interacting loci have been found, using either QTL mapping, linkage analysis, or association analysis, it is of interest to describe the epistasis in the data. If many pairs of interacting loci have been found, it is of interest to identify pairs with similar genetic effects. This classification can be based on finding, for each pair, the model which best fits the data, out of the classical two-locus models. However, many datasets do not fall into any one of these classes (e.g. more than one type of epistasis can be present in the data). Another option is to base the classification on visual inspection, but that can be inaccurate and very time consuming, especially since in most applications the two alleles at a locus are interchangeable, so one would have to consider many rotations of the 3 × 3 data matrix.
We propose classifying observations according to the shape that they induce, and measuring the similarity of the genetic effects observed in two different datasets by the minimum distance between their induced shapes. This allows us to quickly and automatically identify observations with similar genetic effects. Here we only consider the shape of a model for classification but a more robust classification, outside the scope of this paper, could be obtained by testing which circuits are non-zero and considering the shape induced after circuits which are not significant have been set to zero. This would help in reducing mis-classification due to measurement error in the data and in particular this would reveal whether the data comes from an additive (or near additive) model.
In a study of growth traits in chickens, [24] measured various growth and body weight variables on 546 chickens from an F_{2} cross between two lines, a commercial broiler sire line and a White Leghorn line. The alleles at each locus are labeled with B and L, according to which line they came from. A method for simultaneous mapping of interacting QTLs was used to do a genome-wide analysis of five growth traits which identified 21 QTL pairs with a significant genetic effect. Some of the 21 QTL pairs were associated with more than one growth trait, resulting in 30 combinations of traits and QTL pairs. For each trait and QTL pair the phenotypic means of each of the nine two-locus genotypes were estimated using linear regression (see Table 2 in [24]). They noted that the standard models for epistasis do not adequately describe the types of interaction present in their data, and classified the QTL pairs into groups with similar genetic effect by visual inspection. They identified 4 general classes of models in this dataset, and classified 16 out of the 21 QTL pairs into one of these classes (when a QTL pair was associated with more than one trait the observations from both traits were considered to be in the same class). The classes are H) some of the homozygote/heterozygote combinations are lower than expected, B) the phenotype value associated with the genotype BB/BB is lower than expected, A) the data fits an additive model, by visual inspection, L) there is a set of genotypes with a high value, a set with a low value associated with it, and the value associated with the genotype LL/LL is between the two, and U) the 5 QTL pairs which did not fit into any of the four classes were left unclassified.
The visual classification corresponds very well to the classification based on shapes. All observations labeled H fall into clusters 1 and 2 (which are close to each other in the dendrogram) and all observations labeled B fall into clusters 3 and 4. The observations in group A (additive model) fall into two different clusters. An additive model has no shape (one can fit a plane through the points) but due to measurement error in real data this will not be the case. Note that 3 of the 5 observations in group A induce shapes which are very subdivided, as can be expected when there are no genotypes with very high values which dominate the shape. The observations in group U, which were previously unclassified, have now been grouped with the observations they are closest to. Two QTL pairs (4 and 6) were grouped together in category L. The two observations on QTL pair 6 are in cluster 4 and the observation on QTL pair 4 in cluster 1.
The power to detect epistasis depends on the model shape
where the coefficients of the model are the coordinates $\tilde{f}$, f_{ a }, f_{ b }, δ_{ a }, δ_{ b }, I_{ AA }, I_{ AD }, I_{ DA }and I_{ DD }, and ε is Gaussian. The x_{*} are dummy variables; x_{ A }takes the values -1, 0, and 1 for individuals with genotypes aa, Aa, and AA, respectively, and x_{ Aa }takes the value 1 for individuals with genotype A_{ a }. The variables x_{ B }and x_{ Bb }are defined similarly. To test for epistasis, the fit of this model is compared to an additive model where I_{ AA }= I_{ AD }= I_{ DA }= I_{ DD }= 0. The test statistic for a likelihood ratio test is minus twice the difference between the log-likelihood of the additive and the full model. This is equivalent to testing if the circuits c_{7} = c_{8} = c_{9} = c_{10} = 0.
where the f_{ ij }are penetrances and the dummy variables x_{*} are defined as above. By using logistic regression the log-odds scale is chosen as the scale of interest, and additivity on that scale corresponds to no interaction. A likelihood ratio test for epistasis compares the fit of the full model to an additive model where I_{ AA }= I_{ AD }= I_{ DA }= I_{ DD }= 0. This test is equivalent to testing c_{7} = c_{8} = c_{9} = c_{10} = 0 where the circuits are obtained by replacing f_{ ij }with log(f_{ ij }/(1 - f_{ ij })).
Recently [10] proposed a new test to detect unlinked interacting disease loci. They use an LD based interaction measure, I = h_{00}h_{11} - h_{01}h_{10}, where h_{ ij }is defined as the penetrance of a haplotype h_{ ij }(h_{00} is the haplotype ab, h_{01} is aB, etc.). The haplotype penetrance depends on the two locus penetrances as well as the allele frequencies. It is easy to show that the interaction measure, I, vanishes if c_{7} = c_{8} = c_{9} = c_{10} = 0 when the circuits are calculated using the log penetrance values. In other words, this interaction measure tests for multiplicative penetrances.
We also generated case-control data from an association study. The random vectors were normalized so that they all give the same population prevalence of disease. In the middle panel of Figure 7 we have plotted the maximum of the likelihood ratio test statistic as a function of the shape induced by the penetrances. The test measures deviation from additivity on a log-odds scale, so the difference between the different shapes is relatively small. When the shape is calculated based on the log-odds the results are the same as before. Finally, in Panel 3 of Figure 7 we plot the maximum absolute value of the interaction measure I. This test measures deviation from additivity on a log scale, yet the results seem to be more similar to the QTL mapping case.
Conclusion
The multitude of terms used to describe gene interactions are a testament not only to the importance of describing and classifying gene interaction, but also to the fact that even in a two-locus model the types of interactions that can and do occur are diverse and difficult to classify. Most examples of gene interactions that are observed in real data do not fall into any one of the categories typically used to describe interactions. Our approach overcomes this limitation and provides a complete classification of all two-locus models with continuous genotypic values into 69 (or 387) classes. The shape of a two-locus model reveals information about the types of gene interaction present and provides a visual representation of epistasis. By comparing an observed shape to the shapes of standard epistatic models we see which type of interaction is strongest in the data. Moreover, the values of the individual circuits listed in Appendix B provide a complete description of the epistasis in a two-locus system. The observed shape can differ from the true underlying model shape due to noise in the data. Rather than assign an observation to a shape based on the observed genotype values, one could test which circuits are significantly different from zero and use only those circuits to obtain the shape.
Two-locus models are frequently used to generate simulated datasets that form the basis for studies of the power of single-locus and two-locus methods. These can then be used e.g. to choose between exhaustive two-locus searches or two-stage two-locus analyses. There are many examples, both for linkage analysis and association analysis, where the results and ensuing recommendations depend on the models, and types of gene interactions, that are considered [8, 9]. With our complete classification it is possible to generate data from each model class (while varying parameters such as population prevalence and allele frequencies) and subsequently a more thorough analysis than previously possible can be performed.
As observed in [24] "there are no striking similarities with a Mendelian pattern of digenic epistasis" in the QTL example and we found many types of nontrivial interaction, including models which cannot easily be described using existing models. The fact that our classification is purely mathematical lends it strength, since we can describe all possible models and categorize them according to the relative genotypic values. It can easily be extended to three or more loci. It remains to be seen whether all of the 69 types occur in nature. Our results also provide a formalism for identifying types of epistasis that may play a role in determining genetic variability in populations [25], but we do not address these implications in this paper.
Appendix A: Polyhedral subdivisions
Our classification is based on the theory of regular polyhedral subdivisions.
Definition 1 A polyhedral subdivision of a point set A is a decomposition of conv(A), the convex hull of A, into a finite number of bounded polyhedra, such that the union of these polyhedra is conv(A), and the intersection of any two polyhedra is a common face of each (possibly the empty face).
Take conv($\tilde{A}$), and consider its "upper faces", that is, the faces whose outward-pointing normal vector has its last coordinate positive. Project each upper face onto conv(A), by dropping the final coordinate of each point. In this manner, we obtain a polyhedral subdivision of A. Note that some points of A may not be used in this subdivision.
Remark 2 In the construction of an induced subdivision there is some ambiguity as to the whether to project with the lower or upper faces of conv($\tilde{A}$). Both conventions are commonplace. We chose to use the upper faces in order to stay consistent with literature on induced subdivisions and gene epistasis [19].
If the set of heights {h_{ i }} is sufficiently generic, then the subdivision induced by the heights will be a triangulation. We will only consider regular subdivisions and triangulations, thus we will use the term "subdivision" to mean "regular polyhedral subdivision", and "triangulation" to mean "regular triangulation". For more on polyhedral geometry see the book [26].
Appendix B: Circuits
c_{1} = -2δ_{ a }+ 2I_{ AD }
c_{2} = -2δ_{ a }- 2I_{ DD }
c_{3} = -2δ_{ a }- 2I_{ AD }
c_{4} = -2δ_{ b }+ 2I_{ DA }
c_{5} = -2δ_{ b }- 2I_{ DD }
c_{6} = -2δ_{ b }- 2I_{ DA }
c_{7} = I_{ AA }+ I_{ AD }+ I_{ DA }+ I_{ DD }
c_{8} = I_{ AA }- I_{ AD }+ I_{ DA }- I_{ DD }
c_{9} = I_{ AA }- I_{ AD }- I_{ DA }+ I_{ DD }
c_{10} = I_{ AA }+ I_{ AD }- I_{ DA }- I_{ DD }
c_{11} = -2δ_{ a }- 2δ_{ b }+ 2I_{ AA }- 2I_{ DD }
c_{12} = -2δ_{ a }- 2δ_{ b }- 2I_{ AA }- 2I_{ DD }
c_{13} = 2I_{ AA }+ 2I_{ DA }
c_{14} = 2I_{ AA }- 2I_{ DA }
c_{15} = 2I_{ AA }+ 2I_{ AD }
c_{16} = 2I_{ AA }- 2I_{ AD }
c_{17} = 4I_{ AA }
c_{18} = -2δ_{ a }+ I_{ AA }+ I_{ DA }- I_{ DD }+ I_{ AD }
c_{19} = 2δ_{ a }+ I_{ AA }+ I_{ DA }+ I_{ DD }- I_{ AD }
c_{20} = -2δ_{ a }+ I_{ AA }- I_{ DA }- I_{ DD }- I_{ AD }
c_{21} = 2δ_{ a }+ I_{ AA }- I_{ DA }+ I_{ DD }+ I_{ AD }
c_{22} = -2δ_{ b }+ I_{ AA }+ I_{ DA }- I_{ DD }+ I_{ AD }
c_{23} = -2δ_{ b }- I_{ AA }+ I_{ DA }- I_{ DD }- I_{ AD }
c_{24} = -2δ_{ b }+ I_{ AA }- I_{ DA }- I_{ DD }- I_{ AD }
c_{25} = -2δ_{ b }- I_{ AA }- I_{ DA }- I_{ DD }+ I_{ AD }
c_{26} = -2δ_{ a }+ 2I_{ AA }
c_{27} = 2δ_{ a }+ 2I_{ AA }
c_{28} = -2δ_{ b }+ 2I_{ AA }
c_{29} = -2δ_{ b }- 2I_{ AA }
c_{30} = 2δ_{ a }- 2δ_{ b }
c_{31} = 2δ_{ a }+ 2I_{ AA }+ 2I_{ DA }+ 2I_{ DD }
c_{32} = -2δ_{ a }+ 2I_{ AA }+ 2I_{ DA }- 2I_{ DD }
c_{33} = 2δ_{ a }+ 2I_{ AA }- 2I_{ DA }+ 2I_{ AD }
c_{34} = -2δ_{ a }+ 2I_{ AA }- 2I_{ DA }- 2I_{ AD }
c_{35} = -2δ_{ b }- 2I_{ AA }- 2I_{ DD }+ 2I_{ AD }
c_{36} = 2δ_{ b }+ 2I_{ AA }+ 2I_{ DD }+ 2I_{ AD }
c_{37} = -2δ_{ a }+ 2I_{ AA }+ 2I_{ DA }+ 2I_{ AD }
c_{38} = 2δ_{ a }+ 2I_{ AA }+ 2I_{ DA }- 2I_{ AD }
c_{39} = -2δ_{ b }+ 2I_{ AA }- 2I_{ DD }+ 2I_{ AD }
c_{40} = 2δ_{ b }+ 2I_{ AA }+ 2I_{ DA }- 2I_{ AD }
c_{41} = -2δ_{ b }+ 2I_{ AA }- 2I_{ DD }- 2I_{ AD }
c_{42} = -2δ_{ b }- 2I_{ AA }+ 2I_{ DA }- 2I_{ AD }
c_{43} = 2δ_{ a }+ 2I_{ AA }- 2I_{ DA }+ 2I_{ DD }
c_{44} = -2δ_{ b }+ 2I_{ AA }+ 2I_{ DA }+ 2I_{ AD }
c_{45} = -2δ_{ b }+ 2I_{ AA }- 2I_{ DA }- 2I_{ AD }
c_{46} = -2δ_{ a }+ 2I_{ AA }- 2I_{ DA }- 2I_{ DD }
c_{47} = -2δ_{ a }+ 4I_{ AA }+ 2I_{ AD }
c_{48} = -2δ_{ b }+ 4I_{ AA }- 2I_{ DA }
c_{49} = -2δ_{ a }+ 4I_{ AA }- 2I_{ AD }
c_{50} = 2δ_{ a }+ 4I_{ AA }- 2I_{ AD }
c_{51} = -2δ_{ b }- 4I_{ AA }+ 2I_{ DA }
c_{52} = 2δ_{ a }+ 4I_{ AA }+ 2I_{ AD }
c_{53} = 2δ_{ b }+ 4I_{ AA }+ 2I_{ DA }
c_{54} = -2δ_{ b }+ 4I_{ AA }+ 2I_{ DA }
c_{55} = -2δ_{ a }+ 2δ_{ b }+ 2I_{ DA }+ 2I_{ AD }
c_{56} = 2δ_{ a }- 2δ_{ b }+ 2I_{ DA }+ 2I_{ AD }
c_{57} = 2δ_{ a }- 2δ_{ b }+ 2I_{ DA }- 2I_{ AD }
c_{58} = 2δ_{ a }- 2δ_{ b }- 2I_{ DA }+ 2I_{ AD }
c_{59} = 2δ_{ a }+ 2δ_{ b }+ 4I_{ AA }+ 2I_{ DA }- 2I_{ AD }
c_{60} = -2δ_{ a }- 2δ_{ b }+ 4I_{ AA }- 2I_{ DA }- 2I_{ AD }
c_{61} = -2δ_{ a }- 2δ_{ b }+ 4I_{ AA }+ 2I_{ DA }+ 2I_{ AD }
c_{62} = -2δ_{ a }- 2δ_{ b }- 4I_{ AA }+ 2I_{ DA }- 2I_{ AD }
c_{63} = 2δ_{ a }- 4δ_{ b }- 2I_{ AA }- 2I_{ DA }+ 2I_{ AD }
c_{64} = 4δ_{ a }- 2δ_{ b }+ 2I_{ AA }+ 2I_{ DA }- 2I_{ AD }
c_{65} = -4δ_{ a }+ 2δ_{ b }+ 2I_{ AA }+ 2I_{ DA }+ 2I_{ AD }
c_{66} = 2δ_{ a }- 4δ_{ b }- 2I_{ AA }+ 2I_{ DA }- 2I_{ AD }
c_{67} = 2δ_{ a }- 4δ_{ b }+ 2I_{ AA }- 2I_{ DA }- 2I_{ AD }
c_{68} = 4δ_{ a }- 2δ_{ b }+ 2I_{ AA }- 2I_{ DA }+ 2I_{ AD }
c_{69} = 2δ_{ a }- 4δ_{ b }+ 2I_{ AA }+ 2I_{ DA }+ 2I_{ AD }
c_{70} = 4δ_{ a }- 2δ_{ b }- 2I_{ AA }+ 2I_{ DA }+ 2I_{ AD }
c_{71} = 4δ_{ a }- 2δ_{ b }+ 4I_{ AA }+ 2I_{ DA }- 4I_{ AD }
c_{72} = 4δ_{ a }- 2δ_{ b }- 4I_{ AA }+ 2I_{ DA }+ 4I_{ AD }
c_{73} = 4δ_{ a }- 2δ_{ b }+ 4I_{ AA }- 2I_{ DA }+ 4I_{ AD }
c_{74} = 2δ_{ a }- 4δ_{ b }- 4I_{ AA }+ 4I_{ DA }- 2I_{ AD }
c_{75} = -2δ_{ a }+ 4δ_{ b }+ 4I_{ AA }+ 4I_{ DA }- 2I_{ AD }
c_{76} = -4δ_{ a }+ 2δ_{ b }+ 4I_{ AA }+ 2I_{ DA }+ 4I_{ AD }
c_{77} = 2δ_{ a }- 4δ_{ b }+ 4I_{ AA }+ 4I_{ DA }+ 2I_{ AD }
c_{78} = 2δ_{ a }- 4δ_{ b }+ 4I_{ AA }- 4I_{ DA }- 2I_{ AD }
c_{79} = -4δ_{ a }- 2δ_{ b }- 2I_{ DA }- 4I_{ DD }
c_{80} = -2δ_{ a }- 4δ_{ b }- 4I_{ DD }+ 2I_{ AD }
c_{81} = -4δ_{ a }- 2δ_{ b }+ 2I_{ DA }- 4I_{ DD }
c_{82} = -2δ_{ a }- 4δ_{ b }- 4I_{ DD }- 2I_{ AD }
c_{83} = -2δ_{ a }- 2δ_{ b }+ I_{ AA }+ I_{ DA }- 3I_{ DD }+ I_{ AD }
c_{84} = -2δ_{ a }- 2δ_{ b }- I_{ AA }+ I_{ DA }- 3I_{ DD }- I_{ AD }
c_{85} = -2δ_{ a }- 2δ_{ b }+ I_{ AA }- I_{ DA }- 3I_{ DD }- I_{ AD }
c_{86} = -2δ_{ a }- 2δ_{ b }- I_{ AA }- I_{ DA }- 3I_{ DD }+ I_{ AD }
Declarations
Acknowledgements
The authors thank Bernd Sturmfels and Lior Pachter for helpful discussions and the two anonymous reviewers whose comments greatly improved this paper. I.B.H. was supported by grant 512066 (LSHG-CT-2004) from the European Union FP6 programme.
Authors’ Affiliations
References
- Clark A, Wang L: Epistasis in measured genotypes: Drosophila P-element insertions. Genetics. 1997, 147: 157-63.PubMed CentralPubMed
- Storey J, Akey J, Kruglyak L: Multiple locus linkage analysis of genomewide expression in yeast. PLOS Biology. 2005, 3 (8):
- Carlborg O, Haley C: Epistasis: too often neglected in complex trait studies?. Nat Rev Genet. 2004, 5:
- Wade M: Epistasis, complex traits, and mapping genes. Genetica. 2001, 112–113: 59-69. 10.1023/A:1013316611768.View ArticlePubMed
- Hoh J, Ott J: Mathematical multi-locus approaches to localizing complex human trait genes. Nature Rev Genet. 2003, 4: 701-709. 10.1038/nrg1155.View ArticlePubMed
- Moore J: A global view of epistasis. Nature Genetics. 2005, 37:
- Hirschhorn J, Lohmueller K, Byrne E, Hirschhorn K: A comprehensive review of genetic association studies. Genet Med. 2002, 4: 45-61.View ArticlePubMed
- Marchini J, Donnelly P, Cardon L: Genome-wide strategies for detecting multiple loci that influence complex diseases. Nature Genetics. 2005, 37 (4): 413-417. 10.1038/ng1537.View ArticlePubMed
- Evans D, Marchini J, Morris A, Cardon L: Two-stage two-locus models in genome-wide association. PLOS Genetics. 2006, 2 (9): 1424-1432. 10.1371/journal.pgen.0020157.View Article
- Zhao J, Jin L, Xiong M: Test for interaction between two unlinked loci. Am J of Hum Genet. 2006, 79: 831-845. 10.1086/508571.View Article
- Pérez-Enciso M: Multiple association analysis via simulated annealing (MASSA). Bioinformatics. 2006, 22 (5): 573-580. 10.1093/bioinformatics/btk050.View ArticlePubMed
- Brem R, Kruglyak L: The landscape of genetic complexity acreoss 5,700 gene expression traits in yeast. PNAS. 2005, 102 (5): 1572-1577. 10.1073/pnas.0408709102.PubMed CentralView ArticlePubMed
- Cordell H: Epistasis: what it means, what it doesn't mean, and statistical methods to detect it in humans. Hum Mol Gen. 2002, 11 (20): 2463-2468. 10.1093/hmg/11.20.2463.View ArticlePubMed
- Wade M: A gene's eye view of epistasis, selection and speciation. J Evol Biol. 2002, 15: 337-346. 10.1046/j.1420-9101.2002.00413.x.View Article
- Phillips P: The language of gene interaction. Genetics. 1998, 149: 1167-1171.PubMed CentralPubMed
- Cheverud J, Routman J: Epistasis and its contribution to genetic variance components. Genetics. 1995, 139: 1455-61.PubMed CentralPubMed
- Fisher R: The correlations between relatives on the supposition of Mendelian inheritance. Trans R Soc Edinburgh. 1918, 52: 399-433.View Article
- Gauderman W: Sample size requirements for association studies of gene-gene interaction. Int J of Epidemiology. 2002, 155 (5): 478-484. 10.1093/aje/155.5.478.
- Beerenwinkel N, Pachter L, Sturmfels B: Epistasis and shapes of fitness landscapes. Statistica Sinica. 2007, 17 (4): 1317-1342.
- Li W, Reich J: A complete enumeration and classification of two-locus disease models. Hum Hered. 2000, 50: 334-349. 10.1159/000022939.View ArticlePubMed
- Rambau J: TOPCOM: Triangulations of point configurations and oriented matroids. Proc Int Congress of Mathematical Software, ICMS. 2002
- Ziegler G: Lectures on Polytopes. 1995, Springer, New York, NY, 152:
- Beerenwinkel N, Pachter L, Sturmfels B, Elena S, Lenski R: Analysis of epistatic interactions and fitness landscapes using a new geometric approach. BMC Evolutionary Biology. 2007, 7 (60):
- Carlborg O, Hocking P, Burt D, Haley C: Simultaneous mapping of epistatic QTL in chickens reveals clusters of QTL pairs with similar genetic effects on growth. Genet Res. 2004, 83: 197-209. 10.1017/S0016672304006779.View ArticlePubMed
- Turelli M, Barton H: Will population bottlenecks and multilocus epistasis increase additive genetic variance?. Evolution. 2006, 60 (9): 1763-1776.View ArticlePubMed
- DeLoera JA, Rambau J, Santos F: Triangulations: Applications, Structures, Algorithms. Springer to appear.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.