A powerful latent variable method for detecting and characterizing gene-based gene-gene interaction on multiple quantitative traits
 Fangyu Li^{1},
 Jinghua Zhao^{2},
 Zhongshang Yuan^{1},
 Xiaoshuai Zhang^{1},
 Jiadong Ji^{1} and
 Fuzhong Xue^{1}
DOI: 10.1186/1471-2156-14-89
© Li et al.; licensee BioMed Central Ltd. 2013
Received: 22 May 2013
Accepted: 17 September 2013
Published: 23 September 2013
Abstract
Background
When thinking quantitatively about complex diseases, there are at least three statistical strategies for analyzing gene-gene interaction: SNP-by-SNP interaction on a single trait, gene-gene (each gene can involve multiple SNPs) interaction on a single trait, and gene-gene interaction on multiple traits. The third is the most general for dissecting the genetic mechanism underlying complex diseases that are underpinned by multiple quantitative traits. In this paper, we developed a novel statistic for this strategy by modifying Partial Least Squares Path Modeling (PLSPM), called the mPLSPM statistic.
Results
Simulation studies indicated that the mPLSPM statistic was powerful and outperformed the principal component analysis (PCA) based linear regression method. Application to real data in the EPIC-Norfolk GWAS sub-cohort showed suggestive interaction (γ) between the TMEM18 gene and the BDNF gene on two composite body shape scores (γ = 0.047 and γ = 0.058, with P = 0.021 and P = 0.005, respectively), and on BMI (γ = 0.043, P = 0.034). This suggested that these scores (synthetically latent traits) were more suitable than a single trait for capturing the obesity-related genetic interaction effect between genes.
Conclusions
The proposed novel mPLSPM statistic is a valid and powerful gene-based method for detecting gene-gene interaction on multiple quantitative phenotypes.
Keywords
Thinking quantitatively for complex diseases; Gene-based gene-gene interaction; Quantitative traits; mPLSPM statistic
Background
In the search for novel loci influencing complex traits in humans, the successes of genome-wide association studies (GWAS) have been well documented [1]. While these have greatly improved our understanding of the genetic architecture of complex traits, often implicating biological pathways that previously went undetected, most genetic components of complex traits are still to be revealed. One can attribute this to suboptimal study designs, but inappropriate statistical analysis strategies, including methods for gene-gene interaction analysis, may also play a role.
Although discussed extensively in the literature, a notable issue remains in GWAS using the case–control design [2, 3]. Given that the phenotypes of most complex diseases (obesity, hypertension, diabetes, to name a few) are actually quantitative [4], a case–control design is usually obtained by dividing a continuous quantitative measurement into case and control groups with a cut-off that might not relate well to genetic variation. Imposing a cut-off on a continuous variable can lead to loss of information and can decrease statistical power through selection bias. A recently revived proposal is to treat common disorders as quantitative traits, in a framework of thinking quantitatively, such that GWAS should be conducted using a population cohort with multiple quantitative traits [4]. In this framework, a complex disease is caused by multiple genes with small effects and their interactions, as well as their interactions with multiple environmental factors. The quantitative phenotype (trait) is expected to be continuous and normally distributed [4–6]. For some diseases the relevant quantitative traits seem obvious, such as body mass index (BMI, weight (in kilograms)/height (in meters)^{2}) for obesity, blood pressure for hypertension, and mood for depression; they may not be entirely clear for diseases such as arthritis, autism, cancers, dementia and heart disease, for which limited biomarkers are available. Even for obesity, BMI is only a proxy, since it crudely measures the mean weight for a given body surface area; it varies with the amount of body fat but does not represent its distribution.
Various studies have shown that people with abdominal obesity (with more weight around the waist) face higher risks of cardiovascular diseases [7, 8] and other related diseases (such as hypertension, type 2 diabetes, and high cholesterol) [9–11] than those with hip obesity (with more weight around the hip) [10], suggesting that the phenotype of obesity might be more appropriately represented by a synthetically latent trait (SLT) combined from disease-related manifest variables (BMI, waist circumference, hip circumference, neck circumference, etc.). This contrasts with most GWASs, which use either case–control designs [2, 3] or quantitative variables [12–15] with simple linear regression and single SNP-SNP interaction.
To detect gene-gene interaction, at least three statistical strategies can be considered for quantitative phenotypes: single SNP-SNP interaction on a single trait, gene-gene (with multiple SNPs) interaction on a single trait, and gene-gene (with multiple SNPs) interaction on multiple traits. The first strategy is the most susceptible to a high false positive rate and low power in detecting modest effects, because it ignores the linkage disequilibrium (LD) information between SNPs [16, 17]. Moreover, since genes are the functional units in living organisms, an analysis that treats a gene as a system could potentially yield more biologically meaningful results. In view of this, LD information is used in the second strategy, and some methods aimed at gene-based gene-gene interaction detection exist [18–22]. Building on ATOM, a gene-based association test that combines optimally weighted markers within a gene [18], He et al. extended it to analyze gene–gene interactions [19]. First, they derive the optimal weight for both quantitative and binary traits based on pairwise LD information and use the principal components (PCs) to summarize the information in each gene. Then, they test for interactions between the PCs. Li and Cui proposed a gene-centric framework for genome-wide gene–gene interaction detection [20]. They treat each gene as a testing unit and derive a model-based kernel machine method for two-dimensional genome-wide scanning of gene–gene interactions. Recently, Ma et al. combined marker-based interaction tests between all pairs of markers in two genes to produce a gene-level test for interaction between the two genes [21]. The tests are based on an analytic formula derived for the correlation between marker-based interaction tests due to LD.
Although the aforementioned methods were proposed to detect gene-based gene-gene interaction, they do not consider multiple traits or an SLT, especially when the traits are genetically related. It is therefore desirable to develop a new method to detect gene-gene (with multiple SNPs) interaction on multiple traits.
In this paper, we attempted to develop a novel model for detecting the effect of gene-gene interaction on the SLT summarized by multiple manifest traits. The proposed model was constructed by adding a product term of the combined multiple-SNP effects within two genes (genes A and B) via Partial Least Squares Path Modeling (PLSPM) [23, 24]. Thus, a structural equation model (SEM) was built between two genes and multiple manifest traits, linked by the latent variables of gene A, gene B, gene A × gene B, and multiple traits, so that the gene-gene interaction statistic was defined based on the path coefficient between the latent variables of gene A × gene B and multiple traits. As the path coefficient in the proposed statistic was calculated by modifying the Lohmöller PLSPM algorithm [25], we called it the modified PLSPM (mPLSPM) based statistic. Simulation studies were conducted to evaluate its type I error rate and power, and to compare its performance with the PCA-based linear regression model [26–28]. The method was also applied to a real data set to evaluate its utility.
Methods
Statistical model
Our model is motivated by the original PLSPM, which was developed from structural equation models (SEM). SEM are complex models that allow the study of real-world complexity by taking into account a number of causal relationships among latent concepts (i.e. the latent variables (LVs)), each measured by several observed indicators usually defined as manifest variables (MVs). Each path-modeling-based statistic is formed by two sub-models: the structural (inner) model and the measurement (outer) model. The structural model specifies the relationships among the latent variables, which in this study are inferred from the observed SNPs (from different genes) and traits (e.g. waist, hip, BMI), respectively. The formulation of the measurement model depends on the direction of the relationships between the latent variables and the corresponding manifest variables. Different types of measurement model are available: the reflective model (or outwards directed model), the formative model (or inwards directed model) and the MIMIC model (a mixture of the two previous models). The reflective model has causal relationships from the latent variable to the manifest variables in its block. In contrast to the reflective (or effects) model, the formative (causal) model has causal relationships from the manifest variables to the latent variable, namely the LV is caused (formed) by the MVs. It is constructed as a combination of observed (manifest) variables in multidimensional form and aims at minimizing residuals in the structural relationships to explain the unobserved (latent) variable with higher R^{2} [23]. For a more detailed interpretation of the original PLSPM, see Additional file 1.
In this paper, we modify Lohmöller's PLSPM algorithm to estimate the parameters. The modified procedure is as follows: 1) working on standardized manifest variables and giving initial values to the weights $w_{ij}$, the outer and inner estimation steps are alternated iteratively; 2) in the outer estimation step, the values of the latent variables ξ_{1}, ξ_{2}, and ξ_{3} are estimated by $\nu_{1}=\sum_{j=1}^{p} w_{1j} x_{1j}$, $\nu_{2}=\sum_{j=1}^{q} w_{2j} x_{2j}$ and $\nu_{3}=\sum_{j=1}^{k} w_{3j} y_{j}$, respectively; 3) in the inner estimation step, the endogenous latent variable $\nu_{\eta}$ (that is, $\nu_{3}$) is updated with $\nu_{3}=\mathrm{cov}(\nu_{3},\nu_{1})\nu_{1}+\mathrm{cov}(\nu_{3},\nu_{2})\nu_{2}+\mathrm{cov}(\nu_{3},\nu_{1}\nu_{2})\nu_{1}\nu_{2}$, and the exogenous latent variables $\nu_{1}$ and $\nu_{2}$ with $\nu_{1}=\mathrm{cov}(\nu_{1},\nu_{3})\nu_{3}$ and $\nu_{2}=\mathrm{cov}(\nu_{2},\nu_{3})\nu_{3}$; 4) the weights are updated before moving to the next iteration: $w_{1j}=\mathrm{cov}(x_{1j},\nu_{1})$, $w_{2j}=\mathrm{cov}(x_{2j},\nu_{2})$ and $w_{3j}=\mathrm{cov}(y_{j},\nu_{3})$. Steps 2) to 4) are repeated until convergence ($\max_{i,j}\left|w_{ij}^{\mathit{new}}-w_{ij}^{\mathit{old}}\right|<\Delta$, where ∆ is a convergence tolerance usually set at 0.0001 or less), at which point the outer weights are obtained. In addition, significance tests of the path coefficients and loadings are provided by bootstrap procedures [24, 25].
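The iteration described above can be sketched in a few lines of Python. This is only an illustrative reading of steps 1) to 4), not the authors' implementation: the function names (`mplspm`, `score`, `solve`), the rescaling of each latent score to unit variance, the final least-squares step used to read off the interaction path coefficient γ, and the toy continuous data are all assumptions made for the sketch.

```python
import math
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def standardize(v):
    m, s = mean(v), math.sqrt(cov(v, v))
    return [(x - m) / s for x in v]

def score(block, weights):
    """Outer estimate: weighted sum of a block's columns (step 2)."""
    n = len(block[0])
    raw = [sum(w * col[i] for w, col in zip(weights, block)) for i in range(n)]
    return standardize(raw)  # unit-variance scaling is an assumption of this sketch

def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def mplspm(gene_a, gene_b, traits, tol=1e-4, max_iter=500):
    """Each argument is a list of columns; returns (gamma, outer weights)."""
    blocks = [[standardize(c) for c in blk] for blk in (gene_a, gene_b, traits)]
    w = [[1.0] * len(blk) for blk in blocks]              # step 1: initial weights
    for _ in range(max_iter):
        v1, v2, v3 = (score(blk, wb) for blk, wb in zip(blocks, w))
        v12 = [a * b for a, b in zip(v1, v2)]             # product (interaction) term
        c1, c2, c12 = cov(v3, v1), cov(v3, v2), cov(v3, v12)
        # step 3: inner estimates of the trait LV and of the two gene LVs
        z3 = [c1 * a + c2 * b + c12 * ab for a, b, ab in zip(v1, v2, v12)]
        z1 = [c1 * t for t in v3]
        z2 = [c2 * t for t in v3]
        # step 4: new weights are covariances with the inner estimates
        new_w = [[cov(col, z) for col in blk]
                 for blk, z in zip(blocks, (z1, z2, z3))]
        delta = max(abs(x - y) for wn, wo in zip(new_w, w) for x, y in zip(wn, wo))
        w = new_w
        if delta < tol:
            break
    v1, v2, v3 = (score(blk, wb) for blk, wb in zip(blocks, w))
    preds = [v1, v2, [a * b for a, b in zip(v1, v2)]]
    sxx = [[cov(p, q) for q in preds] for p in preds]
    sxy = [cov(p, v3) for p in preds]
    return solve(sxx, sxy)[2], w                          # gamma: interaction path

# Toy data (hypothetical): three noisy indicators per block, true interaction present.
rng = random.Random(1)
g1 = [rng.gauss(0, 1) for _ in range(400)]
g2 = [rng.gauss(0, 1) for _ in range(400)]
signal = [0.5 * a + 0.5 * b + a * b for a, b in zip(g1, g2)]
noisy = lambda base: [x + 0.5 * rng.gauss(0, 1) for x in base]
gamma, weights = mplspm([noisy(g1) for _ in range(3)],
                        [noisy(g2) for _ in range(3)],
                        [noisy(signal) for _ in range(3)])
```

In real use the blocks would hold genotype codes (0, 1, 2) and trait measurements; columns with zero variance would need to be removed first, since the sketch divides by the standard deviation.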
Statistical significance
The mPLSPM statistic is defined as $U=\frac{\left|\gamma -0\right|}{\mathit{se}\left(\gamma \right)}$, where γ is the path coefficient of the interaction term and se(γ) denotes its standard deviation. The null hypothesis (H_{0}): γ = 0 is tested against the alternative hypothesis (H_{1}): γ ≠ 0 by referring U to the normal distribution. Since the distribution of the parameters from the modified PLSPM is unknown, se(γ) is calculated by bootstrap procedures [29, 30]. The testing stages are as follows: 1) A large, prespecified number of bootstrap samples (e.g. 1,000), each with the same number of subjects as the original sample, are generated via resampling with replacement. 2) Parameter estimation is done for each bootstrap sample using the modified algorithm above; the resulting path coefficients or loadings can be viewed as draws from their sampling distributions. All bootstrap samples together provide empirical estimators for the standard error of each parameter. 3) The result of the bootstrapping procedure permits a U-test to be performed for the significance of the path coefficients or loadings, ${U}_{\mathit{emp}}=\frac{\left|w-0\right|}{\mathit{se}\left(w\right)}$ (for example ${U}_{\mathit{inter}}=\frac{\left|\gamma -0\right|}{\mathit{se}\left(\gamma \right)}$ in Figure 1), where U_{emp} represents the empirical U-value, w (for example γ in Figure 1) denotes the original path coefficient or loading, and se(w) (for example se(γ) in Figure 1) indicates its bootstrap standard error. The normal distribution provides the critical U-values at given α-levels. The histogram of the statistic is shown in Additional file 1: Figure S2.
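The three testing stages can be illustrated with a short, generic bootstrap routine. This is a sketch under stated assumptions rather than the authors' code: `bootstrap_u_test` and `slope` are hypothetical names, the simple regression slope stands in for the mPLSPM path coefficient γ (any function mapping a list of subjects to a single estimate could be passed instead), and the two-sided normal P value is one reasonable reading of "the normal distribution provides the critical U-values".

```python
import random
from statistics import NormalDist, stdev

def bootstrap_u_test(rows, estimator, n_boot=1000, seed=0):
    """Return (estimate, bootstrap se, U, two-sided normal P value)."""
    rng = random.Random(seed)
    n = len(rows)
    estimate = estimator(rows)                 # coefficient on the original sample
    boot = []
    for _ in range(n_boot):
        # stage 1: resample subjects with replacement, same size as the original
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        # stage 2: re-estimate the coefficient on each bootstrap sample
        boot.append(estimator(sample))
    se = stdev(boot)                           # empirical standard error
    u = abs(estimate - 0.0) / se               # stage 3: U = |w - 0| / se(w)
    p = 2.0 * (1.0 - NormalDist().cdf(u))
    return estimate, se, u, p

def slope(rows):
    """Stand-in estimator (assumption): least-squares slope of y on x."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Toy data (hypothetical): 200 subjects with a true slope of 0.5.
data_rng = random.Random(7)
rows = [(x, 0.5 * x + data_rng.gauss(0, 1))
        for x in (data_rng.gauss(0, 1) for _ in range(200))]
estimate, se, u, p = bootstrap_u_test(rows, slope, n_boot=200)
```

The cost is one full model fit per bootstrap sample, which is why the number of resamples dominates the running time of the test.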
Simulation
Simulation was conducted similarly to a previous paper [31], as follows. Genotype data were generated by the software gs2.0 [32] according to phase 1 and 2 HapMap data. Multiple phenotypic data were created to mirror the European Prospective Investigation of Cancer (EPIC)-Norfolk study [33, 34], for which waist circumference, hip circumference, and BMI were defined as multiple quantitative traits to reflect body shape as the SLT. As noted earlier [31], the influence of body fat distribution has been linked with body shape, named crudely after the fruits and vegetables it resembles most (chilli, apple, pear, and pear-apple) [35, 36]. People with a larger waist have higher risks of hypertension, type 2 diabetes and high cholesterol than those who carry excess weight on the hips [10, 11]. The combination of BMI, waist and hip circumferences is also a good predictor of cardiovascular risk and mortality [11, 35, 37]. In this paper, the simulated phenotype data were created based on the abdominal obesity population from the EPIC-Norfolk study. The simulation procedure was as follows:
(2) As waist and waist-to-hip ratio (WHR) are commonly used to predict type II diabetes and cardiovascular disease [10, 11, 38, 39], we created an abdominal obesity data set based on the abdominal obesity sample (N = 355) in the EPIC-Norfolk study. Multiple quantitative phenotypes with three traits (waist, hip, BMI) were generated from a trivariate normal distribution Y ~ N(μ, Σ) to assess our proposed statistic, where Y = (y_{1}, y_{2}, y_{3}) was the random vector (waist, hip, BMI) for the abdominal obesity type in the EPIC-Norfolk study, with sample mean $\overline{Y}$ = (105.2746, 106.0051, 29.2172) and covariance $\Sigma =\begin{pmatrix}52.1991 & 36.8688 & 16.9545\\ 36.8688 & 37.1419 & 13.7969\\ 16.9545 & 13.7969 & 8.3859\end{pmatrix}$. The QQ-plots of the three variables (waist, hip, BMI) in the abdominal obesity group are shown in Additional file 1. We supposed that the causal SNPs' interaction affected waist but not hip. Under H_{0}, the causal SNP1 and SNP2 had main effects but no interaction effect on BMI, thus $\mu =\left(\widehat{\mathit{waist}},\ 106.0051,\ 29.2172+0.32\times \mathrm{SNP}1+0.09\times \mathrm{SNP}2\right)$, where SNP1, SNP2 = 0, 1, 2 for the three genotypes (GG, GA, and AA) at both loci, the main effects of SNP1 (0.32) and SNP2 (0.09) were assigned according to real data [40], and waist was estimated by the empirical model $\widehat{\mathit{waist}}=10.20345+0.62138\times \mathit{hip}+0.99947\times \mathrm{BMI}$ (F = 568.25, P < 0.0001, R^{2} = 0.7635).
Under H_{1}, the interaction effect of the two causal SNPs (SNP1 and SNP2) on BMI was δ kg/m^{2}, thus $\mu =\left(\widehat{\mathit{waist}},\ 106.0051,\ 29.2172+0.32\times \mathrm{SNP}1+0.09\times \mathrm{SNP}2+\delta \times \mathrm{SNP}1\times \mathrm{SNP}2\right)$. The range of the interaction effect, δ = (0.10, 0.20, 0.30, 0.40, 0.50), was chosen according to published data [41]. All simulations were performed with the R “mvtnorm” package available from CRAN (http://cran.r-project.org/).
(3) Under H_{0}, 1,000 simulations for each of several sample sizes (N = 1000, 2000, 3000, 4000, 5000) were conducted to assess the type I error. Under H_{1}, for each given δ, we repeated 1,000 simulations under the same sample sizes at two significance levels (α = 0.05, α = 0.01) to assess the power of the mPLSPM statistic. The power of the proposed statistic for waist, WHR, and the SLT was also estimated at a given interaction effect δ under various sample sizes to compare their performance.
(4) To assess the performance of our proposed statistic, we compared it with a PCA-based linear regression model based on the ideas of three published works [20, 26, 28]. The PCA-based linear regression model was defined as $\eta =b+\sum_{i=1}^{P}\beta_{1i}U_{i}^{1}+\sum_{j=1}^{Q}\beta_{2j}U_{j}^{2}+\sum_{i=1}^{P}\sum_{j=1}^{Q}\gamma_{ij}U_{i}^{1}U_{j}^{2}$, where η denoted the PCs of the three traits (waist, hip, and BMI), $U_{i}^{1}$ and $U_{j}^{2}$ represented the PCs for gene 1 and gene 2, respectively, and P and Q were the numbers of PCs in gene 1 and gene 2, chosen based on the proportion of variation explained. The prespecified fraction of the total variance was 85% in this study.
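The phenotype-generation step (2) can be sketched as follows. The study itself used the R "mvtnorm" package; this Python version is only an illustration that draws from the same trivariate normal distribution through a hand-written Cholesky factor. Two points are assumptions of the sketch: the fitted waist is evaluated at the mean hip and the genotype-specific mean BMI, and genotypes are drawn from a binomial model with a hypothetical minor allele frequency of 0.3, whereas the paper generated them with gs2.0 from HapMap data.

```python
import math
import random

MEAN_HIP, MEAN_BMI = 106.0051, 29.2172
SIGMA = [[52.1991, 36.8688, 16.9545],   # covariance of (waist, hip, BMI)
         [36.8688, 37.1419, 13.7969],
         [16.9545, 13.7969, 8.3859]]

def cholesky(a):
    """Lower-triangular L with L * L^T = a (a must be positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def mean_vector(snp1, snp2, delta=0.0):
    """Mean of (waist, hip, BMI); delta = 0 gives H0, delta > 0 gives H1."""
    bmi = MEAN_BMI + 0.32 * snp1 + 0.09 * snp2 + delta * snp1 * snp2
    # assumption: the empirical waist model is evaluated at the mean hip
    waist = 10.20345 + 0.62138 * MEAN_HIP + 0.99947 * bmi
    return [waist, MEAN_HIP, bmi]

def simulate_traits(snp1, snp2, delta, rng, chol):
    """One draw Y ~ N(mu, SIGMA) for a subject with the given genotypes."""
    mu = mean_vector(snp1, snp2, delta)
    z = [rng.gauss(0, 1) for _ in range(3)]
    return [m + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mu)]

rng = random.Random(42)
chol = cholesky(SIGMA)
maf = 0.3                                # hypothetical minor allele frequency
sample = []
for _ in range(1000):
    snp1 = sum(rng.random() < maf for _ in range(2))   # genotype coded 0, 1, 2
    snp2 = sum(rng.random() < maf for _ in range(2))
    sample.append(simulate_traits(snp1, snp2, 0.30, rng, chol))
```

With both genotypes at 0, the fitted waist reproduces the sample mean of 105.2746 to within rounding, which is a useful check that the mean vector has been read correctly.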
Application
Obesity is related to disruption of food intake and energy balance regulation. The neural center controlling food intake, hunger, and energy balance is located in the hypothalamus and brainstem, and involves a complicated neurochemical regulatory mechanism. The roles of both the TMEM18 gene and the BDNF gene in food intake and energy balance, as well as their association with obesity, have been shown [42–44]. Here we assess the interaction of these two genes on obesity-related quantitative traits. The genotype data of TMEM18 (13 SNPs) and BDNF (31 SNPs) and the phenotype data (waist, hip, BMI) are from the GWAS in the EPIC-Norfolk study (N = 2417). The EPIC-Norfolk study is a population-based, ethnically homogeneous, white European cohort study of 25,631 residents living in the city of Norwich, United Kingdom, and its surrounding area. Participants were 39–79 years old during the baseline health check between 1993 and 1997. Of these, 2417 individuals had complete genotype data for 2,500,000 SNPs across the whole genome [31, 33]. The interaction between TMEM18 and BDNF for waist, hip, BMI, WHR, body shape score 1 (BSS1, a latent variable with waist, hip, and BMI as its manifest variables), and body shape score 2 (BSS2, a latent variable with BMI and WHR as its manifest variables) was tested using our proposed mPLSPM statistic at the nominal level of α = 0.05.
Results
Simulation
Type I error rate
Statistical power
To evaluate the statistical power of the mPLSPM statistic, we repeated the simulations with various interaction effects δ and sample sizes. As expected, power increased monotonically with sample size and interaction effect (δ) at both nominal levels (α = 0.05, α = 0.01) (Figure 3c and 3d).
As one reviewer suggested, additional simulations were also conducted for the case in which different SNPs affect different phenotypes. Similar performance was found (see Additional file 1: Table S2).
Application
The PCA-based method was also applied to detect the different kinds of TMEM18-BDNF interactions on obesity. None showed statistical significance when using the first PC of each gene, while only the interactions on BSS1 (P = 0.012) and BMI (P = 0.008) were statistically significant when using the first two PCs (which explained over 85% of the total variance).
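The rule used to decide how many PCs to keep per gene (the smallest number whose cumulative share of variance reaches the prespecified 85%) can be written as a small helper. The function name `n_components` and the eigenvalues in the usage lines are hypothetical and are not taken from the TMEM18 or BDNF data.

```python
def n_components(eigenvalues, threshold=0.85):
    """Smallest number of leading PCs whose cumulative variance share >= threshold."""
    total = sum(eigenvalues)
    running = 0.0
    for k, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += value
        if running / total >= threshold:
            return k
    return len(eigenvalues)

# Hypothetical eigenvalues of a gene's SNP correlation matrix.
two_pcs = n_components([6.0, 3.0, 0.6, 0.4])     # cumulative shares 0.60, 0.90
three_pcs = n_components([5.0, 3.0, 1.0, 1.0])   # cumulative shares 0.50, 0.80, 0.90
```

A gene whose SNPs are in strong LD concentrates its variance in a few leading eigenvalues, so the threshold is reached with few PCs and the interaction model has correspondingly few product terms.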
Discussion
Under the hypothesis of thinking quantitatively [4], we have considered a general framework for gene-gene interaction on quantitative phenotypes, which includes single SNP-SNP interaction on a single trait, gene-gene (each with multiple SNPs) interaction on a single trait, and gene-gene (each with multiple SNPs) interaction on multiple traits; the last is the most reasonable in terms of the genetic mechanism for multiple quantitative traits underlying complex diseases. In this paper, we provided a novel mPLSPM statistic to detect the third type of interaction. The mPLSPM statistic should alleviate the burden of the single-SNP, single-trait paradigm, which inevitably has a high false positive rate due to the multiplicity problem, as well as reduced power due to the underuse of LD information [16, 17]. Furthermore, the new approach does not have the drawback of the gene (multiple SNPs)-single trait paradigm for the reasons mentioned earlier: for most complex diseases (type II diabetes, obesity, disturbance of consciousness), although the quantitative phenotype could in principle be measured, it might not be used for practical reasons (quantitative phenotypes are “really there” but hidden). Our proposed statistic uses the framework of the SLT as a quantitative phenotype, with latent variables inferred from observed variables (multiple SNPs within gene regions, and multiple traits of a specific complex disease). Simulation showed that the proposed novel mPLSPM statistic was not only powerful (Figure 3c, 3d) but also superior to the PCA-based linear regression method (Figure 5a, 5b, 6).
After applying the novel statistic to the real data, a significant TMEM18-BDNF interaction was shown for the body shape score as an SLT but not for its individual components (waist, hip, and WHR) (Figure 7a-7f), suggesting that the SLT (body shape score) is more suitable for capturing the interaction effect than any single trait. The biological role of these genes in the food intake and energy balance regulation system is in line with the literature, and the two genes have been confirmed to be associated with obesity [42–44].
Our approach shares similarities with traditional SEM, which is available in either covariance-based or component-based form [25, 45, 46]. However, with gene-based multiple SNPs in high LD in genomic data and multiple highly correlated traits, the covariance-based SEM suffers from strong multicollinearity among them. The PLSPM we use is component-based, with the following advantages: 1) use of a reflective measurement model to avoid the impact of high multicollinearity among multiple SNPs and among multiple traits; 2) as a “soft modeling” approach (very few distributional assumptions; variables can be numerical, ordinal or nominal; no need for normality assumptions), it is suitable for any genetic model (additive, recessive, dominant, etc.) [23, 24, 47]. However, the usual PLSPM cannot handle the interaction between latent variables straightforwardly, whereas the modified PLSPM includes a product term of the combined multiple-SNP effects within two genes (gene A and gene B).
A reviewer also indicated that another way to test interaction would be to add a new latent variable for all the pairwise SNP × SNP interactions to the path model and test whether the path coefficient from this interaction latent variable to the latent trait variable is significant [48]. We compared this method with our proposed statistic, and the results showed that they have similar performance (see Additional file 1: Table S1). However, when the number of SNPs is large, there will be a very large number of SNP × SNP terms, which brings a higher computational burden. Our method therefore seems more practical in real data analysis. It is worth mentioning that our proposed method should only be used for testing interaction, not for detecting main effects. Testing multiple traits may only be superior if pleiotropic SNPs and genetically related traits exist; when the number of traits is large or the correlation (or LD) structure among the traits is weak, the power of our statistic will decrease.
A possible drawback of the proposed approach is the computing time spent on the bootstrap test used to evaluate the standard deviation of the path coefficients. Ideally, a parametric statistic can be developed in the near future. Our findings on the interaction also call for replication by other studies.
Conclusions
The proposed novel mPLSPM statistic is a valid and powerful gene-based method for detecting gene-gene interaction on multiple quantitative phenotypes. Further work is needed to make its use in GWAS more practical.
Abbreviations
 mPLSPM: modified Partial Least Squares Path Modeling
 SLT: synthetically latent trait
Declarations
Acknowledgments
This work was supported by grants from the National Natural Science Foundation of China (31071155). The EPIC-Norfolk study is supported by research programme grants from Cancer Research UK and the Medical Research Council. Part of the work was done by FX during his visit to the MRC Epidemiology Unit.
Authors’ Affiliations
References
 Stranger BE SE, Raj T: Progress and Promise of Genome-Wide Association Studies for Human Complex Trait Genetics. Genetics. 2011, 187: 367-383. 10.1534/genetics.110.120907.
 Zhang Y, Liu JS: Bayesian inference of epistatic interactions in case–control studies. Nat Genet. 2007, 39 (9): 1167-1173. 10.1038/ng2110.
 Gayan J, Gonzalez-Perez A, Bermudo F, Saez ME, Royo JL, Quintas A, Galan JJ, Moron FJ, Ramirez-Lorca R, Real LM, et al: A method for detecting epistasis in genome-wide studies using case–control multi-locus association analysis. Bmc Genomics. 2008, 9: 360. 10.1186/1471-2164-9-360.
 Plomin R, Haworth CMA, Davis OSP: Common disorders are quantitative traits. Nat Rev Genet. 2009, 10 (12): 872-878.
 Rowe NGMP, Cumming RG, Wans JJ: Diabetes, fasting blood glucose and age-related cataract: the Blue Mountains Eye Study. Ophthalmic Epidemiol. 2000, 7: 103-114.
 RA F: The correlation between relatives on the supposition of Mendelian inheritance. Am J Hum Genet. 1968, 20 (4): 402-403.
 Donahue RP, Abbott RD: Central obesity and coronary heart disease in men. Lancet. 1987, 2 (8569): 1215.
 Ducimetiere P, Richard J, Cambien F: The pattern of subcutaneous fat distribution in middle-aged men and the risk of coronary heart disease: the Paris Prospective Study. Int J Obes. 1986, 10 (3): 229-240.
 Bjorntorp P: Abdominal obesity and the development of noninsulin-dependent diabetes mellitus. Diabetes Metab Rev. 1988, 4 (6): 615-622. 10.1002/dmr.5610040607.
 Yusuf S, Hawken S, Ounpuu S, Bautista L, Franzosi MG, Commerford P, Lang CC, Rumboldt Z, Onen CL, Liu LS, et al: Obesity and the risk of myocardial infarction in 27,000 participants from 52 countries: a case–control study. Lancet. 2005, 366 (9497): 1640-1649. 10.1016/S0140-6736(05)67663-5.
 Wells J: BMI compared with 3-dimensional body shape: the UK National Sizing Survey. Am J Clin Nutr. 2007, 85: 7.
 Pare G, Cook NR, Ridker PM, Chasman DI: On the Use of Variance per Genotype as a Tool to Identify Quantitative Trait Interaction Effects: A Report from the Women's Genome Health Study. Plos Genetics. 2010, 6 (6): e1000981. 10.1371/journal.pgen.1000981.
 Li M, Ye C, Fu W, Elston RC, Lu Q: Detecting Genetic Interactions for Quantitative Traits With U-Statistics. Genet Epidemiol. 2011, 35 (6): 457-468.
 Culverhouse R, Suarez BK, Lin J, Reich T: A perspective on epistasis: Limits of models displaying no main effect. Am J Hum Genet. 2002, 70 (2): 461-471. 10.1086/338759.
 Moore JH, Hahn LW, Bass M, Martin ER: Detection of gene-gene interactions in general pedigrees. Am J Hum Genet. 2003, 73 (5): 606.
 Beyene J, Tritchler D, Asimit JL, Hamid JS: Gene- or Region-Based Analysis of Genome-Wide Association Studies. Genet Epidemiol. 2009, 33: S105-S110. 10.1002/gepi.20481.
 Buil A, Martinez-Perez A, Perera-Lluna A, Rib L, Caminal P, Soria JM: A new gene-based association test for genome-wide association studies. BMC Proc. 2009, 3 (Suppl 7): S130.
 Li M, Wang K, Grant SFA, Hakonarson H, Li C: ATOM: a powerful gene-based association test by combining optimally weighted markers. Bioinformatics. 2008, 25 (4): 497-503.
 He J, Wang K, Edmondson AC, Rader DJ, Li C, Li M: Gene-based interaction analysis by incorporating external linkage disequilibrium information. Eur J Hum Genet. 2010, 19 (2): 164-172.
 Li S, Cui Y: Gene-centric gene–gene interaction: a model-based kernel machine method. Ann Appl Stat. 2012, 6 (3): 1134-1161. 10.1214/12-AOAS545.
 Ma L, Andrew GC, Alon K: Gene-Based Testing of Interactions in Association Studies of Quantitative Traits. Plos Genetics. 2012, 9 (2): e1003321.
 Rajapakse I, Perlman MD, Martin PJ, Hansen JA, Kooperberg C: Multivariate Detection of Gene-Gene Interactions. Genet Epidemiol. 2012, 36 (6): 622-630. 10.1002/gepi.21656.
 Tenenhaus M, Vinzi VE, Chatelin YM, Lauro C: PLS path modeling. Comput Stat Data Anal. 2005, 48 (1): 159-205. 10.1016/j.csda.2004.03.005.
 Esposito VV CW, Henseler J, Wang H: Handbook of Partial Least Squares: Concepts, Methods and Applications. 2010, Berlin Heidelberg: Springer.
 Lohmöller J: Latent variable path modeling with partial least squares. 1989, Heidelberg: Physica-Verlag Heidelberg.
 Wang KAD: A principal components regression approach to multilocus genetic association studies. Genet Epidemiol. 2008, 32: 108-118. 10.1002/gepi.20266.
 Gauderman WJ, Murcray C, Gilliland F, Conti DV: Testing association between disease and multiple SNPs in a candidate gene. Genet Epidemiol. 2007, 31: 383-395. 10.1002/gepi.20219.
 Klei L, Luca D, Devlin B, Roeder K: Pleiotropy and principal components of heritability combine to increase power for association analysis. Genet Epidemiol. 2008, 32 (1): 9-19. 10.1002/gepi.20257.
 Markus MT, Groenen PJF: An introduction to the bootstrap. Psychometrika. 1998, 63 (1): 97-101.
 Linden A, Adams JL, Roberts N: Evaluating disease management program effectiveness: An introduction to the bootstrap technique. Dis Manage Health Outcomes. 2005, 13: 159-167. 10.2165/00115677-200513030-00002.
 Xue F, Li S, Luan J, Yuan Z, Luben RN, Khaw KT, Wareham NJ, Loos RJF, Zhao JH: A Latent Variable Partial Least Squares Path Modeling Approach to Regional Association and Polygenic Effect with Applications to a Human Obesity Study. Plos One. 2012, 7 (2): e31927. 10.1371/journal.pone.0031927.
 Li J, Chen Y: Generating samples for association studies based on HapMap data. Bmc Bioinformatics. 2008, 9: 44. 10.1186/1471-2105-9-44.
 Riboli E, Kaaks R: The EPIC project: Rationale and study design. Int J Epidemiol. 1997, 26 (SUPPL. 1): S6-S14.
 Day N, Oakes S, Luben R, Khaw KT, Bingham S, Welch A, Wareham N: EPIC-Norfolk: study design and characteristics of the cohort. Br J Cancer. 1999, 80: 95-103.
 Rimm AA, Hartz AJ, Fischer ME: A weight shape index for assessing risk of disease in 44,820 women. J Clin Epidemiol. 1988, 41 (5): 459-465. 10.1016/0895-4356(88)90047-9.
 Walsh P: Research profile. The apple shape. Causes and effects. Diabetes Forecast. 2004, 57 (2): 73-75.
 Walton C, Lees B, Crook D, Worthington M, Godsland IF, Stevenson JC: Body fat distribution, rather than overall adiposity, influences serum lipids and lipoproteins in healthy men independently of age. Am J Med. 1995, 99 (5): 459-464. 10.1016/S0002-9343(99)80220-4.
 Dalton M, Cameron AJ, Zimmet PZ, Shaw JE, Jolley D, Dunstan DW, Welborn TA: Waist circumference, waist-hip ratio and body mass index and their correlation with cardiovascular disease risk factors in Australian adults. J Intern Med. 2003, 254: 555-563. 10.1111/j.1365-2796.2003.01229.x.
 Peter T, Peter T, Katzmarzyk SRS, Wei C, Malina RM, Claude B, Berenson GS: Body Mass Index, Waist Circumference, and Clustering of Cardiovascular Disease Risk Factors in a Biracial Sample of Children and Adolescents. Pediatrics. 2004, 114: e198-e205. 10.1542/peds.114.2.e198.
 Li S, Zhao JH, Luan J, Luben RN, Rodwell SA, Khaw KT, Ong KK, Wareham NJ, Loos RJF: Cumulative effects and predictive value of common obesity-susceptibility variants identified by genome-wide association studies. Am J Clin Nutr. 2010, 91 (1): 184-190. 10.3945/ajcn.2009.28403.
 Bhattacharya K, McCarthy MI, Morris AP: Rapid testing of gene-gene interactions in genome-wide association studies of binary and quantitative phenotypes. Genet Epidemiol. 2011, 35 (8): 800-808. 10.1002/gepi.20629.
 Speliotes E: Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat Genet. 2010, 42: 937-948. 10.1038/ng.686.
 Obici S: Minireview: Molecular targets for obesity therapy in the brain. Endocrinology. 2009, 150: 2512-2517. 10.1210/en.2009-0409.
 Walley A: The genetic contribution to non-syndromic human obesity. Nat Rev Genet. 2009, 10: 431-442. 10.1038/nrg2594.
 Henseler JR: The Use of Partial Least Squares Path Modeling in International Marketing. Adv in Intern Marketing. 2009, 20: 277-319.
 Fornell C: A comparative analysis of two structural equation models: LISREL and PLS applied to market data. A second generation of multivariate analysis. Edited by: Fornell C. 1982, New York: Praeger, 1:289–324.
 Chin W: The partial least squares approach in structural equation modeling. Modern methods for business research. Edited by: Marcoulides GA. 1998, Lawrence Erlbaum.
 Fuzhong LPH, Duncan TE, Duncan SC, Alan A, Shawn B: Approaches to Testing Interaction Effects Using Structural Equation Modeling Methodology. Multivar Behav Res. 1998, 33 (1): 1-39. 10.1207/s15327906mbr3301_1.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.