Filter-free exhaustive odds ratio-based genome-wide interaction approach pinpoints evidence for interaction in the HLA region in psoriasis

Abstract

Background

Deciphering the genetic architecture of complex traits is still a major challenge for human genetics. In most cases, genome-wide association studies have only partially explained the heritability of traits and diseases. Epistasis, one potentially important cause of this missing heritability, is difficult to explore at the genome-wide level. Here, we develop and assess a tool based on the interaction odds ratio (IOR), Fast Odds Ratio-based sCan for Epistasis (FORCE), as a novel approach for exhaustive genome-wide epistasis search. The IOR is the ratio of the odds ratio (OR) of carrying both variants to the product of the ORs of carrying each variant alone. By definition, an IOR that significantly deviates from 1 suggests the occurrence of an interaction (epistasis). As the IOR is fast to calculate, we used it to rank and select pairs of interacting polymorphisms for P value estimation, which is more time consuming.

Results

FORCE displayed power and accuracy similar to existing parametric and non-parametric methods, and is fast enough to complete a filter-free genome-wide epistasis search in a few days on a standard computer. Analysis of psoriasis data uncovered novel epistatic interactions in the HLA region, corroborating the known major and complex role of the HLA region in psoriasis susceptibility.

Conclusions

Our systematic study revealed the ability of FORCE to uncover novel interactions, highlighted the importance of exhaustiveness, as well as its specificity for certain types of interactions that were not detected by existing approaches. We therefore believe that FORCE is a valuable new tool for decoding the genetic basis of complex diseases.

Background

During the past decade, many genome-wide association studies (GWAS) have aimed to identify new genetic factors determining susceptibility to a variety of diseases [1,2]. Although promising and sometimes successful, these large-scale studies have only led to modest advances [3]. One explanation is that the underlying model that single SNPs contribute independently to the complex trait may frequently be too simple. Rather, complex traits are likely to result from a complex interplay between genes, notably epistatic gene-environment and gene-gene interactions [4].

The principal obstacles in a genome-wide search for epistasis are statistical power to overcome the limitations of multiple testing and the computational time of the search itself. Over the past decades, many tools have been developed for epistasis detection using various statistical methods [5,6], including those based on regression [7-11], linkage disequilibrium and haplotypes [12,13], and Bayesian approaches [14,15]. Alternative approaches are based on data-filtering, machine-learning and data mining [16-19]. Here, we present an approach that detects pairwise epistasis on a genome-wide scale based on the classical interaction odds ratio (IOR). Introduced by Piegorsch et al. in 1994 [20], this approach has mainly been used for the detection of gene-environment interactions in case-only designs [21]. VanderWeele et al. [22] showed how the use of IOR can help reveal mechanistic interactions in case-only datasets.

Firstly, we report on the first efficient implementation of an approach for genome-wide epistasis detection, which we call FORCE (Fast Odds Ratio-based sCan for Epistasis). Due to its mathematical simplicity, the approach is suitable for exhaustive unfiltered epistasis analysis; i.e., the exact value of the IOR statistic can be evaluated for all pairs of genotyped SNPs in reasonable time on a standard computer. We introduce the mathematics to compute exact P-values for the most extreme values of IOR.

Secondly, we describe the application of FORCE to the Wellcome Trust Case Control Consortium (WTCCC) data on psoriasis, and analyze the previously unknown statistical interactions we found in the light of already-known results.

Lastly we ask whether the statistical interactions detected by FORCE were found due to its exhaustiveness and/or its underlying genetic model, and we present evidence for both. We show that the restriction of FORCE to analyzing only certain SNPs selected according to their marginal effect on psoriasis (as previously described by Knight et al. [23]) strongly limits the statistical significance of the results. We then benchmark the performance of FORCE and other popular methods to detect simulated epistatic interactions, always using exhaustive search. Under different common models for interaction and noise, FORCE consistently detects certain types of interactions better than other approaches.

Methods

Definition of interaction odds ratio (IOR)

For any given pair of SNPs, the interaction odds-ratio statistic IOR is calculated from a pair of 2×2 contingency tables. These tables are derived from 3×3 tables of all allele combinations, by collapsing them according to a dominant or recessive model (see Table 1). Following preliminary evidence that the dominant model allowed more efficient detection of epistasis (Table 2), all analyses were performed using this dominant genetic model.

Table 1 Contingency table under a dominant model
Table 2 Power and Family-wise error rate (FWER) for detection of the functional pair using a dominant or recessive transmission assumption in 6 different epistasis models
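
The collapsing step can be sketched as follows. This is a minimal illustration rather than the FORCE source code: the function name `collapse_genotype_table`, the coding of genotypes as 0, 1 or 2 copies of the minor allele, and the example counts are assumptions made here. Applying the function once to cases and once to controls yields the pair of 2×2 tables.

```python
def collapse_genotype_table(table3, model="dominant"):
    """Collapse a 3x3 genotype-count table into a 2x2 carrier table.

    table3[i][j] is the number of individuals carrying i copies of the minor
    allele at SNP 1 and j copies at SNP 2 (assumed coding, i and j in 0..2).
    """
    if model == "dominant":
        carrier = (0, 1, 1)   # one or two copies count as "carrier"
    elif model == "recessive":
        carrier = (0, 0, 1)   # only two copies count as "carrier"
    else:
        raise ValueError("model must be 'dominant' or 'recessive'")

    table2 = [[0, 0], [0, 0]]
    for i in range(3):
        for j in range(3):
            table2[carrier[i]][carrier[j]] += table3[i][j]
    return table2


# Hypothetical counts for one group (e.g. cases), for illustration only.
example = [[10, 5, 1],
           [4, 3, 2],
           [1, 1, 1]]
print(collapse_genotype_table(example, "dominant"))
print(collapse_genotype_table(example, "recessive"))
```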

We define the following odds ratios:

$$ {\mathrm{OR}}_1=\frac{\beta \varepsilon }{\alpha \zeta },\ {\mathrm{OR}}_2=\frac{\beta \gamma }{\alpha \delta },{\mathrm{OR}}_{1*2}=\frac{\beta \eta }{\alpha \theta },\ \mathrm{and}\ {\mathrm{I}}_{\mathrm{OR}}=\frac{{\mathrm{OR}}_{1*2}}{{\mathrm{OR}}_1\cdot {\mathrm{OR}}_2}=\frac{\alpha \delta \zeta \eta}{\beta \gamma \varepsilon \theta}. $$

Note that IOR is undefined when the denominator of this expression becomes zero. For formal consistency, we therefore added a pseudocount of 1 to each cell of the two contingency tables.
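
As a worked illustration of these definitions, the sketch below evaluates the IOR both from the three odds ratios and from the closed-form product, after adding the pseudocount. The function name `interaction_odds_ratio` and the example counts are hypothetical; only the formulas and the pseudocount of 1 come from the text.

```python
def interaction_odds_ratio(alpha, beta, gamma, delta,
                           epsilon, zeta, eta, theta, pseudocount=1):
    """I_OR = OR_{1*2} / (OR_1 * OR_2), with a pseudocount added to each cell."""
    a, b, g, d, e, z, h, t = (
        n + pseudocount
        for n in (alpha, beta, gamma, delta, epsilon, zeta, eta, theta)
    )
    or_1 = (b * e) / (a * z)
    or_2 = (b * g) / (a * d)
    or_12 = (b * h) / (a * t)
    return or_12 / (or_1 * or_2)


# Hypothetical cell counts, chosen only for illustration.
ior = interaction_odds_ratio(100, 100, 50, 50, 50, 50, 40, 10)

# Closed form alpha*delta*zeta*eta / (beta*gamma*epsilon*theta) on the same
# counts after adding the pseudocount of 1 to every cell.
closed_form = (101 * 51 * 51 * 41) / (101 * 51 * 51 * 11)
print(ior, closed_form)
```

Because the pseudocount keeps every cell strictly positive, none of the divisions can fail, even for pairs with empty cells.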

Statistical significance: Empirical and exact P-values

Note that an IOR of x equals an IOR of 1/x after exchanging counts between cases and controls. We define universal IOR, u(IOR):

$$ \mathrm{u}\left({\mathrm{I}}_{\mathrm{OR}}\right)=\begin{cases}\dfrac{1}{{\mathrm{I}}_{\mathrm{OR}}} & \mathrm{if}\ {\mathrm{I}}_{\mathrm{OR}}\le 1,\\ {\mathrm{I}}_{\mathrm{OR}} & \mathrm{if}\ {\mathrm{I}}_{\mathrm{OR}}>1.\end{cases} $$

This definition allows us to express significant deviations of u(IOR) from the expectation of 1 using a one-tailed P-value.
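
The symmetry behind u(IOR) can be checked numerically. The sketch below assumes that the eight counts pair up as (α, β), (γ, δ), (ε, ζ) and (η, θ), with one member of each pair counting cases and the other counting controls, as the odds-ratio formulas suggest; the function names and counts are hypothetical.

```python
def ior(alpha, beta, gamma, delta, epsilon, zeta, eta, theta):
    """Closed-form interaction odds ratio (all cells assumed positive)."""
    return (alpha * delta * zeta * eta) / (beta * gamma * epsilon * theta)


def universal_ior(value):
    """u(I_OR): fold values at or below 1 onto their reciprocal."""
    return 1.0 / value if value <= 1 else value


# Hypothetical counts with the pseudocount already included.
a, b, g, d, e, z, h, t = 101, 101, 51, 51, 51, 51, 41, 11

x = ior(a, b, g, d, e, z, h, t)
# Exchange the two members of every (case, control) pair.
swapped = ior(b, a, d, g, z, e, t, h)

print(x, swapped, universal_ior(x), universal_ior(swapped))
```

Both orientations give the same u(IOR), which is why a single one-tailed test suffices.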

Pairs with high u(IOR) were identified by the straightforward algorithm that computes u(IOR) for each pair of given SNPs. Our C implementation encodes, in a preprocessing step, all data related to any given SNP into a bit string, and then uses fast logical and bit-counting functions to compute u(IOR) for all pairs.
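
The bit-string idea can be sketched in a few lines. This is not the C implementation itself: Python integers stand in for the bit strings, `bin(x).count("1")` stands in for a hardware bit-counting instruction, missing genotypes are ignored, and all names (`to_bitmask`, `popcount`, `cell_counts`) are hypothetical.

```python
def to_bitmask(flags):
    """Pack a sequence of 0/1 flags (one per individual) into an integer."""
    mask = 0
    for i, flag in enumerate(flags):
        if flag:
            mask |= 1 << i
    return mask


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def cell_counts(carrier1, carrier2, is_case):
    """Count individuals in each (group, carrier at SNP 1, carrier at SNP 2) cell."""
    full = (1 << len(is_case)) - 1          # one bit per individual
    m1, m2, mc = to_bitmask(carrier1), to_bitmask(carrier2), to_bitmask(is_case)
    counts = {}
    for group, g in (("cases", mc), ("controls", full & ~mc)):
        for c1 in (0, 1):
            for c2 in (0, 1):
                s1 = m1 if c1 else full & ~m1
                s2 = m2 if c2 else full & ~m2
                counts[(group, c1, c2)] = popcount(g & s1 & s2)
    return counts


# Six hypothetical individuals: the first three are cases.
counts = cell_counts([1, 1, 0, 0, 1, 0], [1, 0, 1, 0, 1, 1], [1, 1, 1, 0, 0, 0])
print(counts)
```

In a full scan the per-SNP masks would be built once in the preprocessing step, so that each pair costs only a handful of AND and bit-count operations.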

Marginal empirical P-values for any given pair of SNPs were calculated as the proportion of u(IOR) values from randomly generated permutations of case–control labels that were larger than or equal to the value of u(IOR) obtained for the same pair in real data. The number of permutations performed (1000 for simulated data, 100,000 for real data) was adapted to the number of tests performed in these two scenarios.
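
A minimal sketch of this permutation scheme is given below. It assumes carrier status coded as 0/1 per individual, labels coded as 1 for cases and 0 for controls, and the pseudocount of 1 described above; the function names, the fixed seed and the example data are hypothetical. The P-value is the plain proportion described in the text, without any further correction.

```python
import random


def u_ior(carrier1, carrier2, labels, pseudocount=1):
    """u(I_OR) for one SNP pair; genotype class = carrier1 + 2 * carrier2."""
    counts = [[pseudocount] * 4 for _ in range(2)]   # counts[label][class]
    for c1, c2, y in zip(carrier1, carrier2, labels):
        counts[y][c1 + 2 * c2] += 1
    case, ctrl = counts[1], counts[0]
    # alpha*delta*zeta*eta / (beta*gamma*epsilon*theta)
    ior = (case[0] * ctrl[2] * ctrl[1] * case[3]) / (
        ctrl[0] * case[2] * case[1] * ctrl[3])
    return max(ior, 1.0 / ior)


def empirical_p_value(carrier1, carrier2, labels, n_perm=1000, seed=0):
    """Proportion of label permutations with u(I_OR) >= the observed value."""
    rng = random.Random(seed)
    observed = u_ior(carrier1, carrier2, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if u_ior(carrier1, carrier2, shuffled) >= observed:
            hits += 1
    return hits / n_perm


# Hypothetical data: 100 cases followed by 100 controls.
c1 = [0] * 40 + [1] * 10 + [0] * 10 + [1] * 40 + [0] * 10 + [1] * 40 + [0] * 40 + [1] * 10
c2 = [0] * 40 + [0] * 10 + [1] * 10 + [1] * 40 + [0] * 10 + [0] * 40 + [1] * 40 + [1] * 10
labels = [1] * 100 + [0] * 100
print(empirical_p_value(c1, c2, labels, n_perm=200, seed=1))
```

The smallest non-zero P-value this procedure can report is 1/n_perm, which is why the number of permutations has to grow with the number of tests.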

Exact P-values were calculated using

$$ p=\Pr\left(\mathrm{u}\left({\mathrm{I}}_{\mathrm{OR}}\right)\ge x\right)=2\cdot \sum_{\substack{\left(\alpha',\gamma',\varepsilon',\eta'\right):\ {\mathrm{I}}_{\mathrm{OR}}\ge x,\\ \alpha'+\gamma'+\varepsilon'+\eta'=\alpha+\gamma+\varepsilon+\eta}}\frac{\binom{\alpha+\beta}{\alpha'}\binom{\gamma+\delta}{\gamma'}\binom{\varepsilon+\zeta}{\varepsilon'}\binom{\eta+\theta}{\eta'}}{\binom{\alpha+\beta+\gamma+\delta+\varepsilon+\zeta+\eta+\theta}{\alpha'+\gamma'+\varepsilon'+\eta'}} $$

and computed by the straightforward algorithm with four nested loops to cover all required parameter tuples (α’,γ’,ε’,η’). Each inner loop only visits those parameter values that correspond to possible tuples with α’ + γ’ + ε’ + η’ = α + γ + ε + η, given the parameter values in the outer loop. Summed are those terms with u(IOR) ≥ x.
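
The enumeration can be sketched as below, under stated assumptions rather than as the FORCE code. The sketch sums the hypergeometric terms of all tuples with u(IOR) ≥ x, as described in the text; because that condition already covers both tails, no factor of 2 is applied here. It further assumes that the pseudocount of 1 is also used when evaluating u(IOR) for each enumerated table, and it lets the innermost of the four loops collapse to the single value forced by the sum constraint. The function name `exact_p_value` and the example counts are hypothetical.

```python
from math import comb


def exact_p_value(cases, controls, x, pseudocount=1):
    """Pr(u(I_OR) >= x) with all table margins held fixed.

    cases    = (alpha, gamma, epsilon, eta)
    controls = (beta, delta, zeta, theta)
    """
    rows = [c + k for c, k in zip(cases, controls)]   # alpha+beta, gamma+delta, ...
    n_cases = sum(cases)
    denominator = comb(sum(rows), n_cases)

    def u_ior(a, g, e, h):
        cells = (a, rows[0] - a, g, rows[1] - g, e, rows[2] - e, h, rows[3] - h)
        a, b, g, d, e, z, h, t = (v + pseudocount for v in cells)
        ior = (a * d * z * h) / (b * g * e * t)
        return max(ior, 1.0 / ior)

    numerator = 0
    for a in range(min(rows[0], n_cases) + 1):
        for g in range(min(rows[1], n_cases - a) + 1):
            for e in range(min(rows[2], n_cases - a - g) + 1):
                h = n_cases - a - g - e       # forced by the fixed number of cases
                if h > rows[3]:
                    continue
                if u_ior(a, g, e, h) >= x:
                    numerator += (comb(rows[0], a) * comb(rows[1], g)
                                  * comb(rows[2], e) * comb(rows[3], h))
    return numerator / denominator


# Hypothetical small table, for illustration only.
print(exact_p_value((3, 2, 2, 3), (2, 3, 3, 2), 2.0))
```

The number of visited tuples grows roughly with the cube of the group sizes, which is consistent with restricting exact P-values to the top-ranked pairs.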

Application of FORCE to psoriasis data

To evaluate FORCE, we assessed its performance on the WTCCC psoriasis dataset. Initial GWAS and further analyses performed on these data are described in [24]. Following general practice for pre-processing, we excluded potentially low-quality SNP data from further analysis. Specifically, we discarded i) any individual whose total missing rate was above 0.05, ii) any SNP with a frequency of missing data above 0.05, and iii) any SNP with minor allele frequency below 0.05. After pre-processing, our dataset consisted of 2,618 cases, 2,737 controls and 491,191 SNPs, corresponding to approximately \( 1.2\times {10}^{11} \) SNP pairs. We excluded pairs with a genomic distance of less than 100 kb to avoid pairs in linkage disequilibrium. In addition, we found that low row and cell counts in the contingency table (Table 1) can lead to extreme but frequently not significant values of u(IOR). For the purposes of this study, we excluded 3,521,114 SNP pairs with a total count of less than 50 in any row, or less than 5 in any cell of the contingency table. In addition to FORCE, we ran PLINK (FastEpistasis mode) on the top-ranked 500 pairs to compare the results obtained with both methods.
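
The pair count and the pair-level exclusion rules can be illustrated as follows. The sketch makes several assumptions that the text does not spell out: a "row" is taken to be one collapsed genotype class (cases plus controls), pairs on different chromosomes are never excluded by the distance rule, and the function name `keep_pair` and its argument layout are hypothetical.

```python
from math import comb

N_SNPS = 491_191
n_pairs = comb(N_SNPS, 2)          # number of unordered SNP pairs
print(f"{n_pairs:.3e}")


def keep_pair(chrom1, pos1, chrom2, pos2, table,
              min_distance=100_000, min_row=50, min_cell=5):
    """Decide whether a SNP pair passes the pair-level filters.

    table holds four (cases, controls) tuples, one per collapsed genotype class.
    """
    if chrom1 == chrom2 and abs(pos1 - pos2) < min_distance:
        return False                                   # too close: possible LD
    if any(cases + controls < min_row for cases, controls in table):
        return False                                   # sparse row
    if any(min(cases, controls) < min_cell for cases, controls in table):
        return False                                   # sparse cell
    return True
```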

Comparison of exhaustive FORCE with semi-exhaustive and conditional search

To assess the utility of exhaustive search, we constructed a reference dataset of SNPs previously implicated in psoriasis. We started with a set of 34 SNPs from two previous reviews on psoriasis genetics [25,26] that were part of our psoriasis dataset. After applying quality control thresholds (described above), 18 SNPs remained.

Following general practice for genome-wide approaches, for exhaustive and semi-exhaustive searches, we used a genome-wide significance threshold of \( \mathrm{p}=\frac{2\times 0.05}{{\left({10}^6\right)}^2}={10}^{-13} \), i.e., a Bonferroni correction of 0.05 for the approximately \( {\left({10}^6\right)}^2/2 \) pairs in a model of the human genome with \( {10}^6 \) independent SNPs [27].

Comparison of FORCE with other approaches on simulated datasets

We simulated datasets of 10 biallelic SNPs over 200 cases and 200 controls following the Hardy-Weinberg equilibrium model. Interactions were simulated according to six different previously described models without main effect [28] (Table 3). These models represent pure epistasis effects, and not confounding main effects. Model 1 is an interaction effect in which high risk of disease occurs when inheriting heterozygous genotypes at either locus (Aa or Bb) but not both. Model 2 represents high risk of disease when inheriting two high-risk alleles that could be A and/or B. Models 3–6 correspond to the epistasis model discovery method described by Moore et al. [29]. Allele frequencies are p = 0.25 and q = 0.75 for models 3 and 4, and p = 0.1 and q = 0.9 for models 5 and 6.

Table 3 Penetrances and allele frequencies (p,q) used to simulate the interaction models – from Ritchie [28]
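
The simulation scheme can be sketched as follows. The penetrance values below are hypothetical placeholders that only mimic the qualitative description of model 1 (elevated risk when exactly one locus is heterozygous); they are not copied from Table 3. The allele frequencies of 0.5, the function names and the seed are likewise assumptions.

```python
import random

# Hypothetical penetrances: rows = copies of the allele at SNP A (0, 1, 2),
# columns = copies at SNP B. Risk is raised only when exactly one locus is
# heterozygous, as in the verbal description of model 1.
XOR_LIKE = [[0.0, 0.1, 0.0],
            [0.1, 0.0, 0.1],
            [0.0, 0.1, 0.0]]


def sample_genotype(rng, freq):
    """Number of copies of an allele with frequency freq under Hardy-Weinberg."""
    return (rng.random() < freq) + (rng.random() < freq)


def simulate(penetrance, freq_a, freq_b, other_freq=0.5,
             n_snps=10, n_cases=200, n_controls=200, seed=0):
    """Draw individuals until both groups are full; SNPs 0 and 1 interact."""
    rng = random.Random(seed)
    cases, controls = [], []
    while len(cases) < n_cases or len(controls) < n_controls:
        genotype = [sample_genotype(rng, freq_a), sample_genotype(rng, freq_b)]
        genotype += [sample_genotype(rng, other_freq) for _ in range(n_snps - 2)]
        affected = rng.random() < penetrance[genotype[0]][genotype[1]]
        if affected and len(cases) < n_cases:
            cases.append(genotype)
        elif not affected and len(controls) < n_controls:
            controls.append(genotype)
    return cases, controls


cases, controls = simulate(XOR_LIKE, 0.5, 0.5)
print(len(cases), len(controls))
```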

For each of the six models, we generated 100 datasets in each of the 16 conditions of the presence or absence of four of the most commonly encountered sources of noise: missing data (MS), genotyping errors (GE), genetic heterogeneity (GH), and phenocopy (PC).

For GH, two independent interactions were simulated instead of one, each interaction being risk-associated in half of the affected cases. When PC was simulated, interaction affected the trait for half of the cases, emulating an unknown environmental effect. GE and MS were simulated at 5%, as previously described [28].

An epistatic pair of SNPs was considered as detected if the empirical P-value was below 0.001, i.e., approximately 0.05 after Bonferroni correction for the 45 possible pairs among 10 SNPs. Power was estimated as n/100, where n is the number of datasets with detection(s). When two pairs (P1, P2) of SNPs were simulated, detection was counted under one of three different conditions: D1) when P1 and P2 were detected, D2) when P1 was detected, or D3) when P1 or P2 was detected. Family-wise error rate (FWER) was calculated as m/100, where m is the number of datasets for which at least one pair other than the simulated pair was detected.
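
These definitions translate directly into a small bookkeeping routine. The sketch below assumes that each simulated dataset is summarized as a mapping from SNP pair to empirical P-value; the function name `power_and_fwer`, the data layout and the example values are hypothetical.

```python
def power_and_fwer(datasets, simulated_pairs, threshold=0.001):
    """Power under D1/D2/D3 and FWER over a list of {pair: p_value} mappings.

    simulated_pairs lists the truly interacting pairs, with P1 first.
    """
    n = len(datasets)
    d1 = d2 = d3 = false_alarms = 0
    for p_values in datasets:
        detected = {pair for pair, p in p_values.items() if p < threshold}
        hits = [pair in detected for pair in simulated_pairs]
        d1 += all(hits)            # D1: every simulated pair detected
        d2 += hits[0]              # D2: the first pair (P1) detected
        d3 += any(hits)            # D3: at least one simulated pair detected
        false_alarms += any(pair not in simulated_pairs for pair in detected)
    return {"D1": d1 / n, "D2": d2 / n, "D3": d3 / n, "FWER": false_alarms / n}


# Four hypothetical datasets with two simulated pairs, P1 = (0, 1), P2 = (2, 3).
example = [
    {(0, 1): 0.0, (2, 3): 0.0, (4, 5): 0.5},
    {(0, 1): 0.0, (2, 3): 0.2, (4, 5): 0.0},
    {(0, 1): 0.3, (2, 3): 0.0005},
    {(0, 1): 0.3, (2, 3): 0.3},
]
summary = power_and_fwer(example, [(0, 1), (2, 3)])
print(summary)
```

With a single simulated pair, the three detection conditions coincide.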

Results

FORCE enables exhaustive unfiltered epistasis analysis

The FORCE method for epistasis detection is based on the choice of a dominant or recessive model that collapses combinations of allele counts into two 2×2 incidence tables (see Methods). Interactions are then detected as extreme values of the IOR statistic. We implemented the FORCE method for epistasis in C language [30]. Due to its mathematical simplicity and efficient implementation, the computation of IOR could be performed rapidly, compared to other approaches (4.3 days on a single core of a standard computer). Table 4 shows running times of different methods selected for this study.

Table 4 Average time needed to exhaustively test one/all \( 1.25\times {10}^{11} \) pairs among 500,000 SNPs using a single-core CPU computer
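
As a back-of-the-envelope check, and taking the 4.3 days and the 500,000 SNPs quoted here at face value, the implied average cost per SNP pair can be computed as follows. This is simple arithmetic on the quoted figures, not a benchmark.

```python
from math import comb

n_pairs = comb(500_000, 2)               # unordered pairs among 500,000 SNPs
total_seconds = 4.3 * 24 * 3600          # 4.3 days on a single core
per_pair_microseconds = total_seconds / n_pairs * 1e6

print(n_pairs, round(per_pair_microseconds, 2))
```

The result is on the order of a few microseconds per pair.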

Identification of statistically strong interactions requires exhaustive search

To assess the value of exhaustive search, we first evaluated the performance of a conventional, non-exhaustive approach of constraining the analysis to pairs of SNPs that were previously shown to have main effects associated with the phenotype. We therefore performed a constrained analysis on all pairs of 18 high-quality SNPs that had main effects on psoriasis in previous GWA studies (see Methods). Table 5 gives the best 25 hits obtained through this approach when evaluated on the WTCCC dataset on psoriasis [24] (the results of all pairs are shown in Additional file 1: Table S1). None of the 153 pairs reached an interaction P-value below the genome-wide significance threshold of \( {10}^{-13} \).

Table 5 Results from conditional search, restricted to pairs of previously implicated SNPs

A more comprehensive approach, which we refer to here as semi-exhaustive, constrains only one of the SNPs in a pair to a set of previously identified SNPs [8]. Table 6 shows, for each of the 18 previously identified “main effects” SNPs, the highest-scoring interactors, according to the FORCE and PLINK FastEpistasis statistics. Note that FORCE and PLINK identified a few genome-wide significant interactions with P-values as low as \( {10}^{-20} \).

Table 6 Semi-exhaustive search among SNP pairs containing a GWAS-identified SNP

Finally, the relatively low computational complexity required for the FORCE statistic allowed us to perform exhaustive analysis of all SNP pairs in the psoriasis dataset. The results are shown in Table 7 (100 best hits shown in Additional file 1: Table S2). Strikingly, the best resulting P-values are another 20 orders of magnitude lower than the P-values identified by semi-exhaustive search. This shows that a large number of the most significant interactions are missed by the semi-exhaustive approach, and hence that the possibility of discovering the statistically best-supported interactions requires an exhaustive approach. Interestingly, FORCE and PLINK identify distinct interactions.

Table 7 FORCE Exhaustive search top hits, and PLINK FastEpistasis results in WTCCC psoriasis data

FORCE pinpoints interactions beyond main effects in the HLA region

We also analyzed the exhaustive FORCE results with regard to previous studies, which have detected numerous main effects [24-26], but only few weak statistical interactions [24,34,35]. We assessed the performance of FORCE using the WTCCC psoriasis dataset, which contains 2,618 cases, 2,737 controls and 491,191 SNPs. Table 7 shows the 25 best FORCE hits. Twenty-one out of 25 SNP pairs involve SNPs located in the HLA region on chromosome 6, which is consistent with the known strong involvement of the HLA region in psoriasis. Interestingly, certain SNP pairs found to be statistically significant by FORCE did not reach genome-wide significance when using PLINK FastEpistasis.

It is well known that SNPs with main effects may falsely appear to be interacting [36]. To avoid such artifacts in our analysis, we removed those SNPs that displayed a univariate statistical association P-value of \( {10}^{-5} \) or less [24]. The results show three highly significant interactions involving SNPs from the HLA region that display no main effect (Table 8). In the absence of correlation between the SNPs, we claim that these findings provide evidence of interactive effects involved in psoriasis susceptibility. This confirms that FORCE is able to uncover novel statistical interactions in the HLA region that have not been detected before using conventional approaches.

Table 8 Most significant interactions detected through exhaustive search after main effect SNPs removal

FORCE systematically detects interactions missed by other approaches

Besides its exhaustiveness, the other characteristic feature of the FORCE approach is the use of the IOR statistic for genome-wide epistasis analysis. To study the extent to which the choice of this statistic contributed to the identification of novel statistical interactions, we used datasets that contained different simulated epistatic interactions between SNPs without main effects, according to one of six models of Ritchie [28], and none or one of the four sources of noise: Genotyping Error (GE), Missing Data (MS), Genetic Heterogeneity (GH), Phenocopy (PC) (see Methods for details). We then evaluated the power of FORCE and three other popular epistasis detection methods (PLINK Epistasis [7] and PLINK FastEpistasis [8] using default parameters, and MB-MDR [16], using recommended parameters [37]) to detect the simulated interactions. We used a significance threshold of 0.001. Figure 1 shows the results for all epistatic models for the case of no noise.

Figure 1 Power of different approaches to detect simulated epistatic interactions across the six epistasis models by Ritchie [28]. Purple: FORCE – Green: MB-MDR – Blue: PLINK Epistasis – Red: PLINK FastEpistasis. Refer to Table 3 for the definitions of the 6 interaction models.

Under all six models, FORCE and MB-MDR consistently showed power close to 1. The situation became more interesting in the presence of noise. Figure 2 shows the power of the tested methods for all six models in the presence of one type of noise (numerical values are given in Tables 9, 10, 11 and 12). While the results for Genotyping Errors (GE) and Missing Data (MS) were very similar to the no-noise scenario, the presence of Genetic Heterogeneity (GH, independent of the definition of “detection”) or Phenocopy (PC) revealed larger differences among the different approaches. Firstly, we noted that, with GH and PC, all approaches lose power. Secondly, we observed that some approaches worked consistently better than others, depending on the interaction model. For interaction models 1 and 2, MB-MDR dominated all other approaches; FORCE dominated the other approaches for interaction models 3–6.

Figure 2 Power of different approaches to detect simulated epistatic interactions across the six epistasis models by Ritchie [28], in the presence of noise. Comparison of the power of four methods to detect interaction in the presence of one source of noise. GH: Genetic heterogeneity – GE: Genotyping errors – MS: Missing data – PC: Phenocopy. When GH is simulated, three different ways of calculating power are employed: the power of detecting both pairs in the same dataset, the power of detecting the first (fixed) pair and the power to detect either of the two epistatic pairs. Purple: FORCE – Green: MB-MDR – Blue: PLINK Epistasis – Red: PLINK FastEpistasis.

Table 9 Power and family-wise error rate (FWER) of FORCE, MB-MDR, PLINK Epistasis and PLINK FastEpistasis on 6 epistasis models with or without noise
Table 10 Power and family-wise error rate (FWER) of FORCE, MB-MDR, PLINK Epistasis and PLINK FastEpistasis on 6 epistasis models without noise or with simulated genetic heterogeneity (GH)
Table 11 Power of FORCE detection method, impact of various sources of noise and combinations of them for the 6 epistatic models
Table 12 Family-wise error rate (FWER) of FORCE for the 6 epistatic models and 16 noise conditions tested

Discussion

This study introduces the FORCE approach for genome-wide epistasis analysis. On the basis of the Interaction Odds Ratio (IOR) statistic, it performs a genome-wide search for epistatic interactions between pairs of SNPs in a reasonable time on a standard laptop computer. The search is exhaustive and filter-free; i.e., the result is guaranteed to reflect the most extreme IOR values over all possible interactions. Exhaustive search using FORCE is possible because of the computational simplicity of the IOR statistic.

Wu et al. [38] introduced a haplotype-based measure based on the following term:

$$ {I}_{GH}=\frac{OR_{G_1{H}_1}}{OR_{G_1{H}_2}{OR}_{G_2{H}_1}} $$

where \( {OR}_{G_1{H}_1} \) is the odds ratio for both risk haplotypes when carried together, compared to the baseline haplotypes; \( {OR}_{G_1{H}_2} \) and \( {OR}_{G_2{H}_1} \) are the odds ratios for each risk haplotype, respectively, compared to the baseline haplotype.

Although both methods are based on odds ratios, the methods differ in several respects. First, and most significantly, Wu’s method uses haplotypes, which typically require the statistical inference of haplotypes. Even though this design was shown to be better powered than classical genotype-based statistics, the additional calculations are computationally costly. As a result, FORCE can perform an exhaustive genome-wide epistasis search in a few days on a single compute core while, in practice, Wu’s method only allows a limited number of SNP pairs to be tested.

In addition to the different statistics themselves, the approaches to calculating significance differ. FORCE relies on an exact P-value that requires too much time to be calculated exhaustively for all SNP pairs. Instead, P-values are calculated only for pairs with the highest IOR. Conversely, Wu et al. used an approximate, chi-square distribution-based, P-value which can be applied to each investigated pair of the search.

Our study on WTCCC psoriasis data suggests that the computational effort for exhaustive testing is currently not just a luxury. The popular class of conditional analyses focuses only on possible interactions of previously implicated SNPs – often the only option to perform large-scale analysis in reasonable time. When comparing conditional and exhaustive FORCE analyses, we found that the conditional approach only detects interactions of vastly weaker statistical significance.

Our systematic study on small simulated datasets indicates that FORCE not only “goes farther” than existing approaches because of its exhaustive search, but also detects fundamentally different types of interactions, in particular in the biologically more relevant models 3–6. In four out of six models of epistatic interaction described by Ritchie [28], and across the different sources of noise in the data, FORCE consistently displayed good power of detection compared to other approaches. Interestingly, each of the four approaches is less efficient than another for at least one combination of model and type of noise.

Finally, by applying FORCE to WTCCC psoriasis data, we were able to detect statistical interactions between SNPs in the HLA region, even after the exclusion of all SNPs with main effects. To our knowledge this constitutes the first demonstration that the genetic structure of the HLA region cannot be understood by the analysis of main effects alone and that more than one interacting locus exists in that region.

Conclusions

Together, the different elements of our study suggest that FORCE represents a valuable new addition to the arsenal of genome-wide epistasis detection approaches for case–control studies. As with other approaches, the additionally detected interactions are a priori of a statistical nature, and require detailed analysis and follow-up.

Beyond this, our study has provided an example for the need for exhaustive epistasis analysis. In the future, exhaustive analysis will be facilitated by the ever-increasing computational power available to biological research. On one hand, this may enable the exhaustive calculation of FORCE P-values, which can be expected to lead to a potentially much enlarged set of statistically significant interactions. On the other hand, more computational power, as well as algorithmic improvements, may also render exhaustive analysis under those models of interactions feasible for which running times are prohibitive today. Finally, we believe that these improvements are necessary for the integration of different types of interactions and other types of large-scale data, which may ultimately be key to understanding the genetic basis of complex diseases.

Abbreviations

OR: Odds ratio
IOR: Interaction odds ratio
HLA: Human leukocyte antigen
GWAS: Genome-wide association study
SNP: Single nucleotide polymorphism
WTCCC: The Wellcome Trust Case Control Consortium
GE: Genotyping error
MS: Missing data
GH: Genetic heterogeneity
PC: Phenocopy
FWER: Family-wise error rate

References

  1. 1.

    Visscher PM, Brown MA, McCarthy MI, Yang J. Five years of GWAS discovery. Am J Hum Genet. 2012;90(1):7–24.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  2. 2.

    Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, et al. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 2014;42(Database issue):D1001–1006.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  3. 3.

    Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, et al. Finding the missing heritability of complex diseases. Nature. 2009;461(7265):747–53.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  4. 4.

    Mackay TF. Epistasis and quantitative traits: using model organisms to study gene-gene interactions. Nat Rev Genet. 2014;15(1):22–33.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  5. 5.

    Steen KV. Travelling the world of gene-gene interactions. Brief Bioinform. 2012;13(1):1–19.

    Article  PubMed  Google Scholar 

  6. 6.

    Wei WH, Hemani G, Haley CS. Detecting epistasis in human complex traits. Nat Rev Genet. 2014;15(11):722–33.

    Article  CAS  PubMed  Google Scholar 

  7. 7.

    Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–75.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  8. 8.

    Schupbach T, Xenarios I, Bergmann S, Kapur K. FastEpistasis: a high performance computing solution for quantitative trait epistasis. Bioinformatics. 2010;26(11):1468–9.

    Article  PubMed Central  PubMed  Google Scholar 

  9. 9.

    Hemani G, Theocharidis A, Wei W, Haley C. EpiGPU: exhaustive pairwise epistasis scans parallelized on consumer level graphics cards. Bioinformatics. 2011;27(11):1462–5.

    Article  CAS  PubMed  Google Scholar 

  10. 10.

    Lange K, Papp JC, Sinsheimer JS, Sripracha R, Zhou H, Sobel EM. Mendel: the Swiss army knife of genetic analysis programs. Bioinformatics. 2013;29(12):1568–70.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  11. 11.

    Wan X, Yang C, Yang Q, Xue H, Fan X, Tang NL, et al. BOOST: A fast approach to detecting gene-gene interactions in genome-wide case–control studies. Am J Hum Genet. 2010;87(3):325–40.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  12. 12.

    Kam-Thong T, Czamara D, Tsuda K, Borgwardt K, Lewis CM, Erhardt-Lehmann A, et al. EPIBLASTER-fast exhaustive two-locus epistasis detection strategy using graphical processing units. Eur J Hum Genet. 2011;19(4):465–71.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  13. 13.

    Prabhu S, Pe'er I. Ultrafast genome-wide scan for SNP-SNP interactions in common complex disease. Genome Res. 2012;22(11):2230–40.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  14. 14.

    Yi N, Kaklamani VG, Pasche B. Bayesian analysis of genetic interactions in case–control studies, with application to adiponectin genes and colorectal cancer risk. Ann Hum Genet. 2011;75(1):90–104.

    Article  PubMed Central  PubMed  Google Scholar 

  15. 15.

    Zhang Y, Liu JS. Bayesian inference of epistatic interactions in case–control studies. Nat Genet. 2007;39(9):1167–73.

    Article  CAS  PubMed  Google Scholar 

  16. 16.

    Luz Calle ML, Urrea V, Van Steen K. MB-MDR: Model-Based Multifactor Dimensionality Reduction for detecting interactions in high-dimensional genomic data. Vic, Spain: Universitat de Vic; 2008.

    Google Scholar 

  17. 17.

    Schwarz DF, Konig IR, Ziegler A. On safari to Random Jungle: a fast implementation of Random Forests for high-dimensional data. Bioinformatics. 2010;26(14):1752–8.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  18. 18.

    Ueki M, Tamiya G. Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis. BMC Bioinformatics. 2012;13:72.

    Article  PubMed Central  PubMed  Google Scholar 

  19. 19.

    Xie M, Li J, Jiang T. Detecting genome-wide epistases based on the clustering of relatively frequent items. Bioinformatics. 2012;28(1):5–12.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  20. 20.

    Piegorsch WW, Weinberg CR, Taylor JA. Non-hierarchical logistic models and case-only designs for assessing susceptibility in population-based case–control studies. Stat Med. 1994;13(2):153–62.

    Article  CAS  PubMed  Google Scholar 

  21. 21.

    Thomas D. Gene–environment-wide association studies: emerging approaches. Nat Rev Genet. 2010;11(4):259–72.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  22. 22.

    VanderWeele TJ, Hernandez-Diaz S, Hernan MA. Case-only gene-environment interaction studies: when does association imply mechanistic interaction? Genet Epidemiol. 2010;34(4):327–34.

    Article  PubMed Central  PubMed  Google Scholar 

  23. 23.

    Knight J, Spain SL, Capon F, Hayday A, Nestle FO, Clop A, et al. Conditional analysis identifies three novel major histocompatibility complex loci associated with psoriasis. Hum Mol Genet. 2012;21(23):5185–92.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  24. 24.

    Strange A, Capon F, Spencer CC, Knight J, Weale ME, Allen MH, et al. A genome-wide association study identifies new psoriasis susceptibility loci and an interaction between HLA-C and ERAP1. Nat Genet. 2010;42(11):985–90.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  25. 25.

    Chandran V. The genetics of psoriasis and psoriatic arthritis. Clin Rev Allergy Immunol. 2013;44(2):149–56.

    Article  CAS  PubMed  Google Scholar 

  26. 26.

    Oka A, Mabuchi T, Ozawa A, Inoko H. Current understanding of human genetics and genetic analysis of psoriasis. J Dermatol. 2012;39(3):231–41.

    Article  CAS  PubMed  Google Scholar 

  27. 27.

    Ziegler A, Konig IR, Thompson JR. Biostatistical aspects of genome-wide association studies. Biom J. 2008;50(1):8–28.

    Article  PubMed  Google Scholar 

  28. 28.

    Ritchie MD, Hahn LW, Moore JH. Power of multifactor dimensionality reduction for detecting gene-gene interactions in the presence of genotyping error, missing data, phenocopy, and genetic heterogeneity. Genet Epidemiol. 2003;24(2):150–7.

  29.

    Moore JH, Hahn LW, Ritchie MD, Thornton TA, White BC. Application of genetic algorithms to the discovery of complex models for simulation studies in human genetics. Proc Genet Evol Comput Conf. 2002;2002:1150–5.

  30.

    Kernighan BW, Ritchie DM. The C programming language. 2nd ed. Prentice Hall; 1988.

  31.

    Cattaert T, Calle ML, Dudek SM, Mahachie John JM, Van Lishout F, Urrea V, et al. Model-based multifactor dimensionality reduction for detecting epistasis in case–control data in the presence of noise. Ann Hum Genet. 2011;75(1):78–89.

  32.

    Cordell HJ. Detecting gene-gene interactions that underlie human diseases. Nat Rev Genet. 2009;10(6):392–404.

  33.

    Goudey B, Rawlinson D, Wang Q, Shi F, Ferra H, Campbell RM, et al. GWIS - model-free, fast and exhaustive search for epistatic interactions in case–control GWAS. BMC Genomics. 2013;14 Suppl 3:S10.

  34.

    Riveira-Munoz E, He SM, Escaramis G, Stuart PE, Huffmeier U, Lee C, et al. Meta-analysis confirms the LCE3C_LCE3B deletion as a risk factor for psoriasis in several ethnic groups and finds interaction with HLA-Cw6. J Invest Dermatol. 2011;131(5):1105–9.

  35.

    Veal CD, Clough RL, Barber RC, Mason S, Tillman D, Ferry B, et al. Identification of a novel psoriasis susceptibility locus at 1p and evidence of epistasis between PSORS1 and candidate loci. J Med Genet. 2001;38(1):7–13.

  36.

    Ueki M, Cordell HJ. Improved statistics for genome-wide interaction analysis. PLoS Genet. 2012;8(4):e1002625.

  37.

    Mahachie John JM, Van Lishout F, Van Steen K. Model-based multifactor dimensionality reduction to detect epistasis for quantitative traits in the presence of error-free and noisy data. Eur J Hum Genet. 2011;19(6):696–703.

  38.

    Wu X, Dong H, Luo L, Zhu Y, Peng G, Reveille JD, et al. A novel statistic for genome-wide interaction analysis. PLoS Genet. 2010;6(9):e1001131.

Acknowledgements

We would like to thank the WTCCC for permission to use the psoriasis genome-wide data set. This article is linked to a project funded by “ANR-11-BSV1-027-01”. LG is supported by the Ministère de la Recherche of France. IN is supported by the French Government's Investissement d'Avenir program, Laboratoire d'Excellence “Integrative Biology of Emerging Infectious Diseases” (grant n°ANR-10-LABX-62-IBEID).

Author information

Corresponding authors

Correspondence to Benno Schwikowski or Anavaj Sakuntabhai.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

LG participated in the study design, carried out the analyses, interpreted results and drafted the manuscript. JFB participated in the study design, interpreted results and drafted the manuscript. IN contributed to the program. RP participated in interpreting results and was involved in drafting and revising the manuscript. KVS participated in the study design and in revising the manuscript. BS conceived and implemented the FORCE method, analyzed and interpreted data, and was involved in drafting and revising the manuscript. AS conceived of the study, participated in its design and coordination and helped to draft the manuscript. All authors read and approved the final manuscript.

Additional file

Additional file 1: Table S1. Epistasis analysis among GWAS hits: all 153 pairs of the conditional search. Table S2. FORCE exhaustive search top 100 hits on psoriasis data.

About this article

Cite this article

Grange, L., Bureau, J., Nikolayeva, I. et al. Filter-free exhaustive odds ratio-based genome-wide interaction approach pinpoints evidence for interaction in the HLA region in psoriasis. BMC Genet 16, 11 (2015). https://doi.org/10.1186/s12863-015-0174-3

Keywords

  • Genome-wide interaction studies
  • Epistasis
  • Plink
  • MBMDR
  • IOR