Genome-wide association analysis is currently a primary tool for identifying loci associated with complex human traits. Testing for association under complex genetic models involving multiple interactions is a methodologically and computationally challenging task.
We and others have developed a method for genome-wide testing of SNPs for possible involvement in interactions [2, 3] by testing the heterogeneity of the trait's variance conditional on genotype. Here we extend this method to imputed SNPs. The method we suggest, SVLM, is based on linear regression, so results obtained in individual studies can easily be meta-analyzed using conventional methods and software tools.
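As an illustration of what "conventional" meta-analysis of regression results can look like, the sketch below pools per-cohort slope estimates with fixed-effect inverse-variance weighting. This is a generic technique chosen for illustration, not a procedure taken from this paper; the function name `inverse_variance_meta` and the input numbers are hypothetical.

```python
import math

def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect inverse-variance pooling of per-cohort regression slopes.

    Hypothetical helper for illustration; each cohort contributes an
    estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se

# Illustrative (made-up) variance-effect estimates from two cohorts.
pooled_beta, pooled_se, pooled_z = inverse_variance_meta([0.10, 0.20], [0.10, 0.10])
print(pooled_beta, pooled_se, pooled_z)
```

Because each cohort only needs to report a slope and its standard error, existing GWAS meta-analysis tools that accept these two quantities could in principle be reused unchanged.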
Analysis of genotypic variances can be of interest to medical research. If a certain genotype is associated with high variance of, for instance, blood pressure, subjects carrying this genotype may be at risk of extremely low or extremely high blood pressure.
In developing our method for the analysis of variances using imputed data, we used the fact that the variance of a variable with zero expectation is, by definition, the expectation of its squared values. This allowed us to reformulate the task of estimating and analyzing trait variances as a regression analysis of a transformed trait. In this setting, the methodological and computational tools developed for GWAS are applicable to variance analysis.
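The reformulation can be sketched as a two-step regression: first remove the mean effect of the SNP so that the residuals have zero expectation, then regress the squared residuals on the genotype. The code below is a minimal sketch under stated assumptions and is not the SVLM implementation itself. The helper names (`ols_slope`, `squared_residual_test`), the simulated data, and the effect sizes are all hypothetical; `dosage` stands for the allele count, which for imputed SNPs would be a fractional expected count.

```python
import numpy as np

def ols_slope(x, y):
    """Least-squares fit of y ~ 1 + x; returns slope, its standard error, residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_centered = x - x.mean()
    sxx = np.sum(x_centered ** 2)
    slope = np.sum(x_centered * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()
    residuals = y - intercept - slope * x
    slope_se = np.sqrt(np.sum(residuals ** 2) / (len(y) - 2) / sxx)
    return slope, slope_se, residuals

def squared_residual_test(dosage, trait):
    """Two-step variance regression (illustrative sketch)."""
    # Step 1: adjust the trait for the SNP's mean effect.
    _, _, residuals = ols_slope(dosage, trait)
    # Step 2: regress the squared residuals on the same dosage.
    slope, slope_se, _ = ols_slope(dosage, residuals ** 2)
    return slope, slope_se, slope / slope_se

# Simulated example (assumed parameters): variance grows by 0.5 per copy of the allele.
rng = np.random.default_rng(1)
n = 20000
dosage = rng.binomial(2, 0.3, size=n).astype(float)
trait = 0.2 * dosage + rng.normal(size=n) * np.sqrt(1.0 + 0.5 * dosage)

var_slope, var_slope_se, var_z = squared_residual_test(dosage, trait)
print(var_slope, var_slope_se, var_z)
```

In this simulation the slope from the second step estimates the change in trait variance per allele copy, which is why ordinary regression machinery can be reused for the variance analysis.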
The most important advantage of the proposed method is the possibility of detecting SNPs that belong to a complex genetic network with many interacting factors, which is impossible to study with standard tools. These SNPs will show variance heterogeneity, and with our method they can be detected without knowing all the factors involved in the network. To find the factors that interact with an identified SNP, a follow-up analysis can be applied in which interactions between the SNPs found in the variance analysis and all other measured SNPs or environmental factors are tested. In the case of interaction with an unknown factor, SNPs showing significant variance differences can still be used to improve the variance explained, as shown in the example below.
Consider a scenario in which the SNPs found to be associated with a trait in regular GWAS together explain a certain proportion of the trait's total variance:

$$h^2 = \frac{\sigma^2_{G}}{\sigma^2_{T}},$$

where $h^2$ is the proportion of total explained variance, $\sigma^2_{G}$ is the variance explained by the GWAS SNPs, and $\sigma^2_{T}$ is the trait's variance. In addition, a SNP has been found by variance analysis that shows different genotypic variances, such that the presence of the interacting allele B increases the trait's variance:

$$\sigma^2_{AA} < \sigma^2_{AB} < \sigma^2_{BB},$$

where $\sigma^2_{AA}$, $\sigma^2_{AB}$, and $\sigma^2_{BB}$ are the variances for the respective genotype groups AA, AB, and BB. Assume that the allele frequencies and the effects of the SNPs found in GWAS, which contribute to $\sigma^2_{G}$, are the same in each genotypic group AA, AB, and BB of the interacting SNP. Then the proportions of explained variance for the different genotypic groups are:

$$h^2_{AA} = \frac{\sigma^2_{G}}{\sigma^2_{AA}}, \qquad h^2_{AB} = \frac{\sigma^2_{G}}{\sigma^2_{AB}}, \qquad h^2_{BB} = \frac{\sigma^2_{G}}{\sigma^2_{BB}},$$

where $h^2_{AA}$, $h^2_{AB}$, and $h^2_{BB}$ are the proportions of variance explained by the GWAS SNPs in individuals with genotypes AA, AB, and BB, respectively, at the SNP identified by the variance analysis. Taking into account that $\sigma^2_{AA} < \sigma^2_{AB} < \sigma^2_{BB}$, it follows that the proportion of variance explained by the GWAS SNPs is higher in genotypic group AA than in AB and BB, and higher in genotypic group AB than in BB. The value of the proportion of total explained variance lies in the range $h^2_{BB} < h^2 < h^2_{AA}$, and this value depends on the interacting allele frequency, the effect of the interaction, and the variance and effect of the interacting factor. Thus, in such a scenario there is at least one genotypic group (AA) for which the SNPs found in GWAS explain more of the trait's variance ($h^2_{AA}$) than they explain of the total trait variance ($h^2$).

To perform genotypic variance analysis in pedigree-based studies we propose to use GRAMMAR, implemented in the GenABEL software. In GRAMMAR a mixed model is applied in which the trait is adjusted for a random additive polygenic effect. Residuals from this model are free from polygenic familial correlations and can be used for variance analysis.
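Returning to the explained-variance scenario described earlier in this section, a small numeric illustration may help. All numbers below are made up for illustration, and the sketch additionally assumes that the interacting SNP has no effect on the trait mean, so that the total variance is simply the frequency-weighted average of the genotype-group variances.

```python
# Hypothetical inputs: variance explained by GWAS SNPs and trait variance per genotype group.
var_gwas = 0.2
group_var = {"AA": 0.8, "AB": 1.0, "BB": 1.4}

# Genotype frequencies under Hardy-Weinberg proportions (assumed) for allele B frequency 0.3.
freq_b = 0.3
group_freq = {
    "AA": (1 - freq_b) ** 2,
    "AB": 2 * freq_b * (1 - freq_b),
    "BB": freq_b ** 2,
}

# Proportion of variance explained within each genotype group.
h2_group = {g: var_gwas / v for g, v in group_var.items()}

# Total trait variance, assuming equal trait means across genotype groups.
total_var = sum(group_freq[g] * group_var[g] for g in group_var)
h2_total = var_gwas / total_var

print(h2_group, h2_total)
```

With these assumed values the overall proportion falls between the proportions for the BB and AA groups, and the AA group has the largest share of its variance explained by the GWAS SNPs.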
To increase the power of variance analysis by including data from other studies, the same approach as for regular GWAS can be used: the analysis is done for each cohort separately, followed by meta-analysis. The SVLM method can be used to discover interacting SNPs that follow any of the additive, dominant, recessive, over-dominant (where the trait's variance among heterozygotes is increased), or genotypic models. When only the additive variance model is tested, the SVLM test has maximal power when the SNP truly follows an additive model and less power under the dominant, recessive, and over-dominant models. It is of interest to note that under the over-dominant model the power of the SVLM test to detect interaction is zero if the minor allele frequency (MAF) is 0.5, and it increases with decreasing MAF. When MAF is close to 0.5, Levene's test has higher performance.
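The zero-power case at MAF = 0.5 can be checked with a short calculation. The sketch below computes the expected slope of genotype-group variance on allele count under an over-dominant variance pattern, assuming Hardy-Weinberg genotype frequencies. The function name and the variance values are hypothetical. The slope is only one ingredient of power, so this illustrates why the additive test sees nothing at MAF = 0.5 rather than giving a full power calculation.

```python
import numpy as np

def expected_additive_variance_slope(maf, var_hom, var_het):
    """Population slope of group variance on allele count (0, 1, 2).

    Over-dominant pattern: both homozygotes share var_hom, heterozygotes have var_het.
    Genotype frequencies follow Hardy-Weinberg proportions (an assumption).
    """
    q = 1.0 - maf
    freq = np.array([q * q, 2.0 * maf * q, maf * maf])
    dosage = np.array([0.0, 1.0, 2.0])
    variance = np.array([var_hom, var_het, var_hom])
    mean_dosage = np.sum(freq * dosage)
    mean_variance = np.sum(freq * variance)
    covariance = np.sum(freq * (dosage - mean_dosage) * (variance - mean_variance))
    dosage_variance = np.sum(freq * (dosage - mean_dosage) ** 2)
    return covariance / dosage_variance

slopes = {maf: expected_additive_variance_slope(maf, 1.0, 2.0) for maf in (0.5, 0.3, 0.1)}
print(slopes)
```

At MAF = 0.5 the two homozygote groups are equally frequent and have equal variance, so the linear trend across allele counts cancels exactly. A test that compares the groups without imposing a linear trend, such as Levene's test, is not affected by this cancellation.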