Assuming that a genotype interacts with some factor in determining a trait's value, the trait's variance is expected to be increased in the group of subjects carrying this genotype. Thus, a test of heterogeneity of variances can be proposed as a screen for potentially interacting SNPs. In this work, we evaluated the type I error and power of variance heterogeneity analysis with respect to the detection of potentially interacting SNPs in the scenario where the interaction variable is unknown.

Three tests of variance homogeneity were chosen in order to investigate their type I error performance: Bartlett's test, Bartlett's test with prior rank-transformation of the trait to normality, and Levene's (Brown-Forsythe) test. Not surprisingly, our results agreed with what is known from standard statistical theory [8–11]: for Bartlett's test, departure of the distribution of the analyzed trait from normality (e.g. skewness or heavy tails) leads to increased type I error, and Levene's test performs better under these conditions. Interestingly, we found that Bartlett's test has increased type I error even when the overall distribution of the trait is forced to be perfectly normal by rank transformation, provided that the original, pre-transformation distribution was non-normal and a direct effect of the SNP is present. These results, which may seem surprising at first, are easily explained: three non-normal genotype-specific distributions with the same variance but different means are translated by the transformation into distributions that are still non-normal and that have different variances. An illustrative example is provided in Additional file 1, Figure S1.
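The mechanism described above can be illustrated with a small simulation. The sketch below is not the analysis used in this work; it assumes three equally sized genotype groups, exponentially distributed noise with unit variance, and a direct SNP effect of one unit per allele (the function names, sample size, effect size and seed are all illustrative choices). It compares the within-genotype variances before and after a rank-based transformation of the pooled trait to normality.

```python
import random
from statistics import NormalDist, variance


def rank_to_normal(values):
    """Rank-based inverse normal transformation of the pooled sample."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    transformed = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        # (rank - 0.5) / n keeps the argument strictly inside (0, 1)
        transformed[idx] = std_normal.inv_cdf((rank - 0.5) / n)
    return transformed


def group_variances(values, genotypes):
    """Sample variance of `values` within each genotype group 0, 1, 2."""
    return [variance([v for v, g in zip(values, genotypes) if g == k])
            for k in (0, 1, 2)]


def simulate(n_per_group=10000, shift=1.0, seed=1):
    """Return (raw, rank-transformed) within-genotype variances."""
    rng = random.Random(seed)
    genotypes = [g for g in (0, 1, 2) for _ in range(n_per_group)]
    # Skewed (exponential) noise with variance 1, plus a direct SNP effect
    trait = [rng.expovariate(1.0) + shift * g for g in genotypes]
    return (group_variances(trait, genotypes),
            group_variances(rank_to_normal(trait), genotypes))


raw_var, transformed_var = simulate()
print("raw variances:             ", [round(v, 2) for v in raw_var])
print("rank-transformed variances:", [round(v, 2) for v in transformed_var])
```

On the raw scale the three groups share the same variance by construction; after the transformation the pooled trait is normal, but the genotype-specific variances are no longer equal, which is what a variance homogeneity test then picks up.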

We showed that even if a large interaction effect is present, the power of the "screening" variance heterogeneity test depends strongly on the main effect of the interacting factor and may be quite limited. This result may at first seem surprising and counter-intuitive. To aid understanding of this phenomenon, we provide here a simple example of a situation in which there is an interaction effect but the variances for all genotypes are equal, so that the variance test has no power. Consider a binary factor **F** ∈ {-1, 1} with an effect on the trait, in accordance with our previous notation, equal to *β*_{F}, and with the frequency of "1" denoted as *f* (thus the frequency of "-1" is 1 - *f*). Let the genotype in question be "dominant" and coded as *g* ∈ {0, 1, 1} for genotypes {*AA*, *AB*, *BB*}, respectively. Let the mean be *μ* = 0; for simplicity, let us first assume that the main effect of the genotype is *β*_{g} = 0. Let us denote the effect of the genotype-by-factor interaction as *β*_{gF}, and let the residual variance be *σ*_{e}^{2}. In this case, the conditional expectations of the trait for the genotype "0" are *E*(*y*|*g* = 0, **F** = -1) = -*β*_{F} (when the value of the factor is -1) and *E*(*y*|*g* = 0, **F** = 1) = *β*_{F}. For genotype "1", the expectations are *E*(*y*|*g* = 1, **F** = -1) = -*β*_{F} - *β*_{gF} and *E*(*y*|*g* = 1, **F** = 1) = *β*_{F} + *β*_{gF}. Because the variance of **F** is 4 *f* (1 - *f*), it is easy to see that the conditional variance of the trait in genotype *g* = 0 is simply *σ*_{e}^{2} + 4 *f* (1 - *f*) *β*_{F}^{2}, while the variance of the trait in the other genotype is *σ*_{e}^{2} + 4 *f* (1 - *f*) (*β*_{F} + *β*_{gF})^{2}. The conditional variances of the two genotypes are equal when either of two conditions is met: *β*_{gF} = 0 (absence of interaction) or *β*_{F} = -*β*_{gF}/2. Taking a simple example with *f* = 1/2, it is straightforward to see how the variances could be the same while an interaction effect is present. Interestingly, if *f* ≠ 1/2 and *β*_{F} = -*β*_{gF}/2, the conditional variances are equal, *Var*(*y*|*g* = 0) = *Var*(*y*|*g* = 1), but the conditional expectations differ, *E*(*y*|*g* = 0) ≠ *E*(*y*|*g* = 1), so the interaction will translate into a marginal SNP effect in the absence of a main effect (we assumed that *β*_{g} = 0). As *β*_{F} deviates from -*β*_{gF}/2 in either direction, the conditional variances *Var*(*y*|*g* = 1) and *Var*(*y*|*g* = 0) become unequal. With |*β*_{F}| → ∞, however, the ratio *Var*(*y*|*g* = 1)/*Var*(*y*|*g* = 0) → 1, because the contribution of the interaction becomes negligible relative to that of the main effect. This explains the non-monotonic, M-shaped dependency of the non-centrality parameter of the variance test on the main effect of the interaction variable demonstrated in Figure 2.
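The binary-factor example can be checked numerically. The sketch below assumes the model *y* = *β*_{g} *g* + *β*_{F} **F** + *β*_{gF} *g* **F** + *e*, as implied by the conditional expectations given above, with an arbitrary residual variance of 1; the function and argument names (`conditional_moments`, `sigma2_e`, and so on) are hypothetical. The moments are obtained by direct enumeration over the two values of **F** rather than from a closed-form expression.

```python
def conditional_moments(g, beta_F, beta_gF, f, sigma2_e=1.0, beta_g=0.0):
    """Mean and variance of the trait given genotype g (0 or 1).

    F takes the value 1 with probability f and -1 with probability 1 - f.
    """
    outcomes = [(-1, 1.0 - f), (1, f)]
    cond_means = [(beta_g * g + beta_F * F + beta_gF * g * F, p)
                  for F, p in outcomes]
    mean = sum(m * p for m, p in cond_means)
    # Variance contributed by F, added to the residual variance
    between = sum(p * (m - mean) ** 2 for m, p in cond_means)
    return mean, sigma2_e + between


def variance_ratio(beta_F, beta_gF, f):
    """Var(y | g = 1) / Var(y | g = 0)."""
    _, v0 = conditional_moments(0, beta_F, beta_gF, f)
    _, v1 = conditional_moments(1, beta_F, beta_gF, f)
    return v1 / v0


# Scan the main effect of the factor for a fixed interaction effect of 1
for beta_F in (-3.0, -1.0, -0.5, 0.0, 1.0, 3.0, 1000.0):
    print(beta_F, round(variance_ratio(beta_F, beta_gF=1.0, f=0.5), 3))
```

The ratio equals 1 at *β*_{F} = -*β*_{gF}/2, departs from 1 on either side of that point, and returns towards 1 for very large |*β*_{F}|, which is the pattern behind the M-shaped power curve.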

While in this work we consider a model assuming a SNP with an additive effect whose genotypes follow the Hardy-Weinberg distribution, and an interaction factor following a normal distribution, the same principal result (a non-monotonic dependence of the power of the variance test on the main effect of the interacting variable) should hold for other models and other types of interacting factor (e.g. binary, as we show above, or three-level, such as other SNPs); also, a deviation from HWE will not affect our major conclusions.

Our analysis of power was performed using Bartlett's test. Bartlett's test has the highest power when the trait is normally distributed, but it is not robust to non-normality of the trait distribution. Levene's test performs better under deviations from normality, but has lower power than Bartlett's test. Therefore, our principal findings will not change whether Bartlett's or Levene's test is used: the specific figures provided estimate the maximal power, but the relation of the power to the underlying model parameters will be the same for both tests.

We considered testing for heterogeneity of variances as a screening tool for potentially interacting SNPs in the context of population-based design. It has been proposed that this testing can be more effectively done in the context of monozygotic twins or migrant studies [4]. While these designs may indeed be more powerful compared to population-based design, the same relation between power of variance heterogeneity test and the underlying model parameters is to be expected in these designs as well.

Thus, for a wide range of designs, models and tests, we can conclude that the absence of significant heterogeneity of variances cannot be interpreted as the absence of a strong interaction, because the power of the variance test depends heavily on the main effect of the (unobserved) interacting factor.

It is interesting to consider whether the presence of significant variance heterogeneity tells us that a SNP indeed interacts with some factor. First of all, variance heterogeneity will be detected for a SNP having a main effect when the distribution of the trait is heteroscedastic, i.e. the variance increases with the mean, a situation rather common in biology. This suggests that a prior test for heteroscedasticity should be performed before running the variance heterogeneity test as an "interaction screening" test. Another, biological, possibility is that a genotype indeed affects the variance of the trait without any specific interaction. We can speculate that there may be genotypes which affect the stability of development or homeostasis, leading to a wider variance of the trait.
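The first of these artifacts is easy to reproduce. The sketch below assumes a trait that is log-normal within each genotype, so that its variance grows with its mean, and a SNP that has only a main effect on the log scale; no interaction is simulated. The effect size, noise level, sample size and seed are arbitrary illustrative values.

```python
import math
import random
from statistics import variance


def simulate(n_per_group=5000, beta_g=0.3, sd_log=0.5, seed=7):
    """Return within-genotype variances on the raw and on the log scale."""
    rng = random.Random(seed)
    raw, logged = [], []
    for g in (0, 1, 2):
        # Main effect of the SNP on the log scale, no interaction term
        log_y = [beta_g * g + rng.gauss(0.0, sd_log)
                 for _ in range(n_per_group)]
        logged.append(variance(log_y))
        raw.append(variance([math.exp(v) for v in log_y]))
    return raw, logged


raw_var, log_var = simulate()
print("raw scale:", [round(v, 3) for v in raw_var])
print("log scale:", [round(v, 3) for v in log_var])
```

On the raw scale the variance rises with the number of effect alleles even though the SNP interacts with nothing; on the log scale, where the mean-variance relationship is removed, the genotype-specific variances agree.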

Detection of variance heterogeneity for a given SNP does not necessarily indicate that a single factor is interacting with the studied SNP. It may instead suggest the presence of a complex network with many other SNPs and factors involved. The variance heterogeneity test may be especially effective in detecting such SNPs: in the case of multiple interacting factors, it is very unlikely that the cumulative effects of the interacting factors will fall at the point at which the power of the variance test is minimal.

Further dissection of the SNPs demonstrating strong heterogeneity of variances may be a challenging task, requiring a search for the interactors through phenomic screening. Whether an identified interactor does explain the heterogeneity of variances can be tested straightforwardly by applying the variance homogeneity test to the residuals from a regression involving the identified factor.
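A minimal sketch of this residual-based check follows. It assumes an additive SNP, a normally distributed interacting factor, and a regression fitted separately within each genotype (equivalent to allowing genotype-specific intercepts and slopes for the factor); these choices, the effect sizes, and the helper names are illustrative assumptions rather than the procedure used in this work. Bartlett's statistic is computed directly, and the p-value uses the fact that a chi-square variable with 2 degrees of freedom (three groups) has survival function exp(-x/2).

```python
import math
import random
from statistics import mean, variance


def bartlett_statistic(groups):
    """Bartlett's test statistic for equality of variances across groups."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    variances = [variance(g) for g in groups]
    total_df = sum(sizes) - k
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / total_df
    numerator = total_df * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances))
    correction = 1 + (sum(1 / (n - 1) for n in sizes)
                      - 1 / total_df) / (3 * (k - 1))
    return max(numerator / correction, 0.0)


def p_value_three_groups(statistic):
    """Chi-square survival function for 2 df, valid for three groups only."""
    return math.exp(-statistic / 2)


def residuals(x, y):
    """Residuals from a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


rng = random.Random(11)
by_genotype = {0: ([], []), 1: ([], []), 2: ([], [])}
for _ in range(6000):
    g = int(rng.random() < 0.3) + int(rng.random() < 0.3)  # allele freq. 0.3
    factor = rng.gauss(0, 1)
    y = 0.3 * factor + 0.5 * g * factor + rng.gauss(0, 1)
    by_genotype[g][0].append(factor)
    by_genotype[g][1].append(y)

p_raw = p_value_three_groups(
    bartlett_statistic([y for _, y in by_genotype.values()]))
p_resid = p_value_three_groups(
    bartlett_statistic([residuals(x, y) for x, y in by_genotype.values()]))
print("trait:", p_raw, " residuals:", p_resid)
```

If the identified factor accounts for the heterogeneity, the test on the residuals should no longer be significant; remaining heterogeneity would point to further interactors or to one of the other mechanisms discussed in this section.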

A number of genetic interaction models may lead to variance heterogeneity. These include the straightforward interaction models discussed above, in which an environmental or other genetic factor changes the expectation of the trait value in concert with the SNP studied. Another interesting model, leading to a specific increase in the variance of the heterozygous genotype, is the parent-of-origin model, in which the expectation of the trait in heterozygous individuals (*AB*) depends on whether allele *A* was transmitted from the father or from the mother.
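The parent-of-origin case can be sketched as follows. The parameterisation is a hypothetical one chosen for illustration: the trait mean in *AB* individuals is shifted by +`poe_effect` when allele *A* is paternal and by -`poe_effect` when it is maternal, each with probability 1/2, while both homozygotes have mean zero and all genotypes share a residual variance of 1.

```python
import random
from statistics import variance


def simulate_parent_of_origin(n_per_genotype=5000, poe_effect=1.0, seed=3):
    """Return the sample variance of the trait for each genotype."""
    rng = random.Random(seed)
    result = {}
    for genotype in ("AA", "AB", "BB"):
        values = []
        for _ in range(n_per_genotype):
            y = rng.gauss(0.0, 1.0)
            if genotype == "AB":
                # Allele A came from the father or the mother with equal odds
                y += poe_effect if rng.random() < 0.5 else -poe_effect
            values.append(y)
        result[genotype] = variance(values)
    return result


genotype_variances = simulate_parent_of_origin()
for genotype, value in genotype_variances.items():
    print(genotype, round(value, 2))
```

Because the parental origin is unobserved, heterozygotes are a mixture of two subgroups with different means, so their variance exceeds that of the homozygotes by roughly the square of the parent-of-origin effect.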

We showed that when one interacting factor is considered, the power of the direct test, which exploits knowledge of the interacting factor, is always greater than the power of the variance heterogeneity test. An interesting scenario in which the variance heterogeneity test may be more powerful than the direct test occurs when multiple interacting factors induce variance heterogeneity. In that case, the power to identify any single one of them (or all of them together) may be lower than the power of the variance heterogeneity test, owing to the small effect associated with each individual interacting factor and to the increased number of degrees of freedom.

In present GWAS, association between a SNP and a trait is studied by detecting differences between the mean trait values of the genotypes for a given SNP. We conclude that screening for differences in variances is a promising approach, as a number of biologically interesting models may lead to heterogeneity of variances. However, it should be clearly understood that the absence of variance heterogeneity for a SNP cannot be interpreted as the absence of involvement of the SNP in an interaction network, while the presence of significant heterogeneity may be explained not only by plain interaction with some factor, but also by other biological mechanisms and statistical artifacts.