Mapping Haplotype-haplotype Interactions with Adaptive LASSO
- Ming Li^{1},
- Roberto Romero^{3},
- Wenjiang J Fu^{1} and
- Yuehua Cui^{2}
https://doi.org/10.1186/1471-2156-11-79
© Li et al; licensee BioMed Central Ltd. 2010
Received: 22 March 2010
Accepted: 27 August 2010
Published: 27 August 2010
Abstract
Background
The genetic etiology of complex human diseases is commonly viewed as a process involving both genetic and environmental factors acting in a complicated manner. Quite often the interactions among genetic variants play major roles in determining the susceptibility of an individual to a particular disease. Statistical methods for modeling interactions underlying complex diseases between single genetic variants (e.g. single nucleotide polymorphisms or SNPs) have been extensively studied. Recently, haplotype-based analysis has gained popularity in genetic association studies. When multiple sequence or haplotype interactions are involved in determining an individual's susceptibility to a disease, statistical modeling and testing of the interaction effects become daunting, largely because of the complicated higher-order epistatic structure.
Results
In this article, we propose a new strategy for modeling haplotype-haplotype interactions under the penalized logistic regression framework with an adaptive L_{1}-penalty. We consider interactions of sequence variants between haplotype blocks. The adaptive L_{1}-penalty allows simultaneous effect estimation and variable selection in a single model. We propose a new parameter estimation method which estimates and selects parameters by a modified Gauss-Seidel method nested within the EM algorithm. Simulation studies show that it has a low false positive rate and reasonable power in detecting haplotype interactions. The method is applied to test haplotype interactions between the mother and offspring genomes in a small for gestational age (SGA) neonates data set, and significant interactions between the two genomes are detected.
Conclusions
As demonstrated by the simulation studies and real data analysis, the approach developed provides an efficient tool for the modeling and testing of haplotype interactions. An implementation of the method in R can be freely downloaded from http://www.stt.msu.edu/~cui/software.html.
Background
It is commonly recognized that most human diseases are complex, involving the joint effects of multiple genes and complicated gene-gene as well as gene-environment interactions [1]. The identification of disease risk factors for monogenic diseases has been quite successful in the past. Because many single genetic variants have only a small effect on disease risk, the identification of disease variants for complex multigenic diseases has been far less successful [2]. There are multiple reasons for this. First, most complex diseases involve multiple genetic variants, each conferring a small or moderate effect on disease risk. Second, the complexity lies in the complicated interactions among disease variants, whether between single variants or between sets of multiple variants. Third, but not least, gene-environment interactions also play pivotal roles in determining the underlying complexity of disease etiology. Tests of gene-gene interaction have been commonly pursued in the past, but little has been achieved despite the importance of such interactions in determining disease risk (see [3] for a comprehensive review).
Mapping genetic interactions has been traditionally pursued in model organisms to identify functional relationships among genes [4–6]. With the seminal work in quantitative trait loci (QTL) mapping by Lander and Botstein [7], extensive work has focused on experimental crosses to study the genetic architecture of complex traits. Along this line, methods for mapping QTL interactions have also been developed [8, 9]. The recent development of the human HapMap and radical breakthroughs in genotyping technology have enabled us to generate high-throughput single nucleotide polymorphism (SNP) data that are dense enough to cover the whole genome [10]. This advancement allows us to characterize variants at the sequence level that encode a complex disease phenotype, and opens a promising future for disease variant identification [11, 12].
Genetic interaction, also termed epistasis, occurs when the effect of one genetic variant is suppressed or enhanced by the existence of other genetic variants [13]. In line with this definition, Mani et al. [14] recently defined two distinct genetic interactions, namely the synergistic interaction, in which an extreme phenotype is expected whenever double mutations are present, and the alleviating interaction, in which a mutation in one gene masks the effect of another mutation by impairing the function of the relevant pathways. As an important component of the genetic architecture of many biological traits, the role of epistasis in shaping an organism's development has been widely recognized [15, 16]. An increasing number of empirical studies have also revealed the role of epistasis in the pathogenesis of most common human diseases, such as cancer or cardiovascular disease [17, 18].
The high-dimensional SNP data present unprecedented opportunities as well as daunting challenges in statistical modeling and testing in identifying genetic interactions. However, for most complex diseases, it remains largely unknown which combination of genetic variants is causal to the disease. Given that most traits or diseases are multifactorial and genetically complex, it is very unlikely that the function of a single variant can induce an overt disease signal without modeling the gene networks or pathways. Lin and Wu [19] proposed a sequence interaction model in a linear regression framework for a quantitative phenotype. Zhang et al. [20] proposed an entropy-based method for searching haplotype-haplotype interactions using unphased genotype data with applications in type I diabetes. Musani et al. [21] and Cordell [3] recently gave a comprehensive review of statistical methods developed for detecting gene-gene interactions. While most methods are nonparametric in nature such as the popular multifactor dimensionality reduction (MDR) method [22], they do not provide effect estimates for gene-gene interactions. Thus methods focusing on data reduction ignore the biological interpretation of the interaction. For instance, if two SNPs are identified to have interaction, how do they interact in genetics? What are the modes of gene action?
In Cui et al. [12], a novel approach was proposed to group haplotypes to detect risk haplotypes associated with a disease. As an extension of that work, we propose a new statistical method to model haplotype-haplotype interactions responsible for a binary disease phenotype. We assume a population-based case-control design in which the disease phenotype is dichotomous. To handle high-order interactions, we propose a penalized logistic regression framework with an adaptive L_{1}-penalty, commonly termed the adaptive LASSO [23]. The adaptive L_{1}-penalty allows effect estimation and variable selection simultaneously in a single model. Moreover, it preserves the oracle property of variable selection [23]. Because of the binary nature of the response, we propose a modified Gauss-Seidel method nested within the EM algorithm to estimate parameters. The model is applied to a real data set in which significant haplotype interactions are detected between mother and offspring genomes that might be responsible for disease risks in pregnancy.
Methods
The configuration of two SNP combinations
Observed Genotype | Diplotype Configuration | Diplotype Frequency | Diplotype Relative Freq. | Composite Diplotype |
---|---|---|---|---|
11/11 | [11][11] | $p_{11}^{2}$ | 1 | $HH$ |
11/12 | [11][12] | $2p_{11}p_{12}$ | 1 | $H\overline{H}$ |
11/22 | [12][12] | $p_{12}^{2}$ | 1 | $\overline{H}\overline{H}$ |
12/11 | [11][21] | $2p_{11}p_{21}$ | 1 | $H\overline{H}$ |
12/12 | $\left\{\begin{array}{l}[11][22]\\ [12][21]\end{array}\right.$ | $\left\{\begin{array}{l}p_{11}p_{22}\\ p_{12}p_{21}\end{array}\right.$ | $\left\{\begin{array}{l}\varphi \\ 1-\varphi \end{array}\right.$ | $\left\{\begin{array}{l}H\overline{H}\\ \overline{H}\overline{H}\end{array}\right.$ |
12/22 | [12][22] | $2p_{12}p_{22}$ | 1 | $\overline{H}\overline{H}$ |
22/11 | [21][21] | $p_{21}^{2}$ | 1 | $\overline{H}\overline{H}$ |
22/12 | [21][22] | $2p_{21}p_{22}$ | 1 | $\overline{H}\overline{H}$ |
22/22 | [22][22] | $p_{22}^{2}$ | 1 | $\overline{H}\overline{H}$ |
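The table above can be read as a lookup from phased diplotypes to composite diplotypes once a risk haplotype is fixed (here [11]). The following is a minimal sketch of that mapping, not the authors' implementation. It assumes that φ is the relative frequency of [11][22] among double heterozygotes, i.e. φ = p11·p22 / (p11·p22 + p12·p21); the function names and the strings "HH", "Hh", "hh" (with "h" standing for any non-risk haplotype) are illustrative choices.

```python
def double_het_phase_prob(p11, p12, p21, p22):
    """Assumed definition of phi: share of [11][22] among 12/12 genotypes."""
    cis = p11 * p22      # diplotype [11][22]
    trans = p12 * p21    # diplotype [12][21]
    return cis / (cis + trans)


def composite_diplotype(hap_a, hap_b, risk="11"):
    """Collapse a phased diplotype into HH, Hh or hh given the risk haplotype."""
    n_risk = int(hap_a == risk) + int(hap_b == risk)
    return ("hh", "Hh", "HH")[n_risk]


# Example with hypothetical haplotype frequencies.
phi = double_het_phase_prob(0.14, 0.16, 0.26, 0.44)
labels = [composite_diplotype(a, b) for a, b in
          [("11", "11"), ("11", "21"), ("11", "22"), ("12", "21")]]
```

With [11] as the risk haplotype, only diplotypes that carry at least one copy of [11] map to HH or Hh; the double heterozygote is the single genotype whose composite diplotype is uncertain.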
The epistasis model
x_{t} and z_{t} can be defined similarly. With the above definition, a_{s(t)} and d_{s(t)} can be interpreted as the additive and dominance effects of the risk haplotype at block s(t); i_{aa}, i_{ad}, i_{da} and i_{dd} can be interpreted as the additive×additive, additive×dominance, dominance×additive, and dominance×dominance interaction effects between the two blocks, respectively.
Most non-parametric methods for detecting gene-gene interactions, such as the multifactor dimensionality reduction (MDR) method [22], provide only an interaction test. In contrast, the above interaction model allows one to identify the risk haplotypes in two haplotype blocks, and to further quantify the specific structure and effect size of the epistatic interactions between the two blocks. We argue that this model-based epistatic test provides biologically more meaningful results than a non-parametric method such as MDR.
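To make the eight genetic effects concrete, the sketch below builds one row of the design vector for a pair of composite diplotypes and evaluates the logistic disease probability. The coding is an assumption, since the defining equations are not reproduced here: x = 1, 0, -1 for HH, Hh, hh and z = 1 for the heterozygote Hh and 0 otherwise. The function names and the effect values in the example are hypothetical.

```python
import math


def code_block(composite):
    """Assumed coding: additive x in {1, 0, -1}, dominance z = 1 for Hh only."""
    x = {"HH": 1, "Hh": 0, "hh": -1}[composite]
    z = 1 if composite == "Hh" else 0
    return x, z


def design_row(comp_s, comp_t):
    """Columns ordered as a_s, d_s, a_t, d_t, i_aa, i_ad, i_da, i_dd."""
    xs, zs = code_block(comp_s)
    xt, zt = code_block(comp_t)
    return [xs, zs, xt, zt, xs * xt, xs * zt, zs * xt, zs * zt]


def disease_prob(mu, effects, row):
    """Logistic model: P(y = 1) for one subject with known composite diplotypes."""
    eta = mu + sum(b * v for b, v in zip(effects, row))
    return 1.0 / (1.0 + math.exp(-eta))


row = design_row("HH", "Hh")
# Hypothetical effects: additive effect at block s and an additive x dominance term.
prob = disease_prob(0.0, [0.8, 0, 0, 0, 0, 0.8, 0, 0], row)
```

Under this coding, each interaction column is simply the product of the corresponding main-effect columns, which is what lets the fitted coefficients be read as modes of gene action.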
Likelihood function
To construct the likelihood function, note that the three groups M_{2}, M_{3} and M_{4}, unlike group M_{1}, involve phase-ambiguous genotypes and hence need to be modeled with mixture distributions.
Because the phase-ambiguous states c_{si} and c_{ti} are not observable, we treat them as missing data and use the EM algorithm to estimate them iteratively (see below).
where λ is a tuning parameter that balances the likelihood and the penalty term and is chosen by minimizing the Bayesian Information Criterion (BIC); ω = (w_{1}, w_{2}, ..., w_{8}) is a weight vector for the genetic effects β. When w_{j} = 1 for every j, this reduces to the general LASSO penalty. Although the general LASSO estimator may not be consistent, a suitable data-dependent weight vector ω guarantees the oracle property for the corresponding adaptive LASSO estimator. Specifically, one choice is ω = 1/|β_{OLS}|, taken elementwise, where β_{OLS} is the ordinary least squares (OLS) estimator. This makes the adaptive LASSO estimate much more attractive than the general LASSO estimate [23].
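The role of λ and ω can be illustrated with a small sketch. It assumes the penalized objective has the usual adaptive LASSO form, log-likelihood minus λ·Σ w_j·|β_j|, with weights taken as reciprocals of the absolute initial estimates. The guard `eps` against division by zero and the function names are additions for illustration only.

```python
def adaptive_weights(beta_init, eps=1e-8):
    """w_j = 1 / |initial estimate of beta_j|; eps is a hypothetical safeguard."""
    return [1.0 / max(abs(b), eps) for b in beta_init]


def penalized_loglik(loglik, beta, lam, weights):
    """Log-likelihood minus the weighted L1 penalty (intercept excluded from beta)."""
    penalty = lam * sum(w * abs(b) for w, b in zip(weights, beta))
    return loglik - penalty


w = adaptive_weights([0.5, -2.0])
value = penalized_loglik(-10.0, [1.0, -1.0], 0.5, w)
```

Effects with a small initial estimate receive a large weight and are shrunk to zero more aggressively, while effects with a large initial estimate are penalized lightly. Setting every weight to 1 recovers the general LASSO penalty.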
Missing data and the EM algorithm
The phase-ambiguous genotypes lead to missing data. Currently available algorithms for LASSO or adaptive LASSO estimation cannot be applied directly to maximize the penalized likelihood (3). However, this can be solved by applying an EM algorithm, detailed as follows:
1) Initialize β, γ, and calculate ${\pi}_{i}=p({y}_{i}=1|{x}_{ig},{x}_{ie})=\frac{\mathrm{exp}({x}_{ig}\beta +{x}_{ie}\gamma )}{1+\mathrm{exp}({x}_{ig}\beta +{x}_{ie}\gamma )}$ for subject i;
for i ∈ M_{k}, (k, j) ∈ {(2, s), (3, t)}.
where $\begin{array}{c}\Pi ={\varphi}_{s}{\varphi}_{t}{\varphi}_{{M}_{4}1i}^{{y}_{i}}{(1-{\varphi}_{{M}_{4}1i})}^{1-{y}_{i}}+{\varphi}_{s}(1-{\varphi}_{t}){\varphi}_{{M}_{4}2i}^{{y}_{i}}{(1-{\varphi}_{{M}_{4}2i})}^{1-{y}_{i}}\\ +(1-{\varphi}_{s}){\varphi}_{t}{\varphi}_{{M}_{4}3i}^{{y}_{i}}{(1-{\varphi}_{{M}_{4}3i})}^{1-{y}_{i}}+(1-{\varphi}_{s})(1-{\varphi}_{t}){\varphi}_{{M}_{4}4i}^{{y}_{i}}{(1-{\varphi}_{{M}_{4}4i})}^{1-{y}_{i}}\end{array}$
3) M-step: Update β,γ by maximizing the penalized log likelihood function (3);
4) Repeat steps 1)-3) until convergence.
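The E-step can be sketched for a single subject as follows. This is an illustration under the assumption that the expected phase indicators are posterior probabilities obtained by Bayes' rule, with the prior weights given by the relative diplotype frequencies and the likelihood by the Bernoulli term for the observed disease status. For a subject in group M_4 the normalizing constant has the same four-term structure as Π above. All function names are hypothetical.

```python
def bernoulli(y, pi):
    """Likelihood of disease status y (0 or 1) under disease probability pi."""
    return pi if y == 1 else 1.0 - pi


def e_step_posteriors(y, priors, pis):
    """Posterior weight of each possible phase resolution for one subject."""
    joint = [w * bernoulli(y, p) for w, p in zip(priors, pis)]
    total = sum(joint)  # normalizing constant of the mixture
    return [v / total for v in joint]


def m4_priors(phi_s, phi_t):
    """Prior weights of the four phase combinations when both blocks are ambiguous."""
    return [phi_s * phi_t, phi_s * (1 - phi_t),
            (1 - phi_s) * phi_t, (1 - phi_s) * (1 - phi_t)]


# One block ambiguous (two resolutions), affected subject.
post2 = e_step_posteriors(1, [0.5, 0.5], [0.8, 0.2])
# Both blocks ambiguous (four resolutions), unaffected subject.
post4 = e_step_posteriors(0, m4_priors(0.6, 0.6), [0.7, 0.5, 0.5, 0.3])
```

In the M-step, each ambiguous subject would then contribute to the penalized log-likelihood once per phase resolution, weighted by these posterior values.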
Computational algorithm for maximizing the penalized log likelihood
In the M-step, the parameters β and γ are updated by calculating the LASSO estimate. LASSO regression with a continuous response has been well studied, and some very efficient algorithms have been proposed, such as the shooting algorithm and LARS [26, 27]. Estimation is more challenging for generalized linear models because of the non-linearity of the likelihood function, especially with an adaptive penalty term. No closed-form solution exists for parameter estimation in this setting. Here we propose a computational algorithm using a Gauss-Seidel method [28] to solve an unconstrained optimization problem. More details about this method can be found in Shevade et al. [29]. To simplify the notation, we explain our method without environmental covariates.
With the phase-ambiguous genotypes, F_{j} can be calculated accordingly with the mixture proportions E(c_{si}) and E(c_{ti}) that are estimated in the E-step.
Therefore, the optimality conditions are achieved when Viol_{j} = 0 for all j. For a given λ and w_{j}, j = 1, ..., p, we further define I_{z} = {j: β_{j} = 0, j > 0} and I_{nz} = {0} ∪ {j: β_{j} ≠ 0, j > 0}. The detailed estimation procedure is given as:
1) Initialize β_{j} = 0, j = 0, 1, ..., p;
2) While any Viol_{ j } > 0 in I_{ z },
Find the maximum violator V_{ k },
Update β_{ k } by optimizing L';
While any Viol_{ j } > 0 in I_{ nz },
Find the maximum violator V_{ l },
Update β_{ l } by optimizing L',
Until no violator exists in I_{ nz };
Until no violator exists in I_{ z }
For numerical precision, the condition Viol_{j} > 0 is relaxed to Viol_{j} > 10^{-5} in our computation.
This method is based on the convexity of the likelihood function. The computation procedure updates one β_{j} at a time until all the optimality conditions are achieved. The algorithm is relatively efficient because it does not involve matrix inversion. The convexity condition guarantees one and only one solution for each update (see additional file 1). A similar algorithm has been used in the linear regression setting, where it is commonly referred to as 'the shooting algorithm' [26], and in the logistic regression setting for the general LASSO [29]. The asymptotic convergence of this method for non-linear optimization problems has been proven in [28, Ch. 3, Prop. 4.1].
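The coordinate-wise procedure can be sketched as below for fully observed (phase-known) data. This is a simplified illustration, not the authors' code: it picks the maximum violator over all coordinates at once rather than alternating between I_z and I_nz, and it assumes F_j is the derivative of the log-likelihood with respect to β_j and that Viol_j measures the distance from the usual L1 optimality conditions, as in Shevade and Keerthi [29]. Each one-dimensional update is solved by bisection. Column 0 of X is the unpenalized intercept.

```python
import math


def sigmoid(u):
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def score(X, y, beta, j):
    """F_j: derivative of the log-likelihood with respect to beta[j]."""
    return sum((yi - sigmoid(sum(b * v for b, v in zip(beta, xi)))) * xi[j]
               for xi, yi in zip(X, y))


def violation(F, b, pen):
    """Assumed form of Viol_j; pen = lambda * w_j, and pen = 0 for the intercept."""
    if pen == 0.0:
        return abs(F)
    if b > 0:
        return abs(F - pen)
    if b < 0:
        return abs(F + pen)
    return max(F - pen, -F - pen, 0.0)


def update(X, y, beta, j, pen):
    """Exact one-dimensional maximizer in beta[j], all other coordinates fixed."""
    def F(b):
        return score(X, y, beta[:j] + [b] + beta[j + 1:], j)
    F0 = F(0.0)
    if abs(F0) <= pen:            # zero satisfies the optimality condition
        return 0.0
    sign = 1.0 if F0 > 0 else -1.0

    def slope(b):                 # directional derivative on the chosen side of zero
        return sign * (F(b) - sign * pen)
    lo, hi = 0.0, sign
    while slope(hi) > 0 and abs(hi) < 1e6:   # expand until the root is bracketed
        lo, hi = hi, 2.0 * hi
    for _ in range(60):           # bisection on a monotone function
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fit(X, y, lam, w, tol=1e-5, max_iter=10000):
    p = len(X[0])
    beta = [0.0] * p
    pens = [0.0] + [lam * wj for wj in w]
    for _ in range(max_iter):
        viol = [violation(score(X, y, beta, j), beta[j], pens[j]) for j in range(p)]
        k = max(range(p), key=lambda j: viol[j])   # maximum violator
        if viol[k] <= tol:
            break
        beta[k] = update(X, y, beta, k, pens[k])
    return beta


# Toy data: intercept column plus one coded genetic predictor.
xs = [-1, -1, 0, 0, 1, 1, 1, -1]
ys = [0, 0, 0, 1, 1, 1, 0, 1]
X = [[1.0, float(x)] for x in xs]
beta_hat = fit(X, ys, lam=0.5, w=[1.0])
beta_null = fit(X, ys, lam=100.0, w=[1.0])
```

Because every update is a scalar root-finding problem, no matrix is ever inverted. With phase-ambiguous subjects, the same loop would run on the E-step-weighted data.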
Risk haplotype selection
where d is the number of non-zero parameters in the model and n is the total sample size.
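The selection rule can be sketched as follows, assuming the criterion has the standard form BIC = -2·log-likelihood + d·log(n), with d and n as defined above. The candidate labels, log-likelihood values and coefficient vectors in the example are hypothetical.

```python
import math


def bic(loglik, beta, n):
    """Assumed standard BIC; d counts the non-zero entries of beta."""
    d = sum(1 for b in beta if b != 0.0)
    return -2.0 * loglik + d * math.log(n)


def select_min_bic(candidates, n):
    """candidates: list of (label, loglik, beta); returns the label with minimum BIC."""
    return min(candidates, key=lambda c: bic(c[1], c[2], n))[0]


# Hypothetical fits for two candidate risk-haplotype pairs.
fits = [("[11] x [11]", -100.0, [0.5, 0.0, 0.3, 0.0]),
        ("[12] x [11]", -105.0, [0.5, 0.0, 0.0, 0.0])]
best = select_min_bic(fits, n=1000)
```

The same function could be reused to choose the tuning parameter λ by fitting the model over a grid of values and keeping the fit with the smallest BIC.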
Results
Simulation study
We conducted a series of simulations under various scenarios to evaluate the statistical properties of the proposed method. Within each block, the minor allele frequencies of the two SNPs were assumed to be 0.3 and 0.4, with linkage disequilibrium D = 0.02. The simulations were conducted under different sample sizes (i.e., n = 200, 500, 1000).
List of parameter values under different simulation designs
Scenario | a_{s} | a_{t} | d_{s} | d_{t} | i_{aa} | i_{ad} | i_{da} | i_{dd} |
---|---|---|---|---|---|---|---|---|
S0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
S1 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 |
S2 | 0.8 | 0.8 | 0 | 0 | 0 | 0 | 0 | 0 |
S3 | 0.8 | 0.8 | 0.8 | 0.8 | 0 | 0 | 0 | 0 |
S4 | 0.8 | 0 | 0.8 | 0 | 0.8 | 0.8 | 0.8 | 0.8 |
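The haplotype frequencies implied by the simulation settings (minor allele frequencies 0.3 and 0.4, D = 0.02) can be derived with the standard two-locus relations. The sketch below assumes that allele 1 denotes the minor allele at both SNPs and that diplotypes are formed by random pairing of haplotypes (Hardy-Weinberg equilibrium); neither assumption is stated in the text, and the function names are illustrative.

```python
import random


def haplotype_freqs(p1, q1, D):
    """Two-SNP haplotype frequencies from allele frequencies and LD coefficient D."""
    return {
        "11": p1 * q1 + D,
        "12": p1 * (1 - q1) - D,
        "21": (1 - p1) * q1 - D,
        "22": (1 - p1) * (1 - q1) + D,
    }


def draw_diplotype(freqs, rng):
    """Draw two haplotypes independently (random mating assumption)."""
    haps = list(freqs)
    return tuple(rng.choices(haps, weights=[freqs[h] for h in haps], k=2))


freqs = haplotype_freqs(0.3, 0.4, 0.02)
rng = random.Random(2010)          # fixed seed so the draw is reproducible
diplotype = draw_diplotype(freqs, rng)
```

Under these assumptions the four haplotype frequencies are 0.14, 0.16, 0.26 and 0.44. Disease status for each simulated subject would then follow from the logistic model with the parameter values of the chosen scenario.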
A case study
We applied our model to a perinatal case-control study of small for gestational age (SGA) neonates, part of a large-scale candidate gene-based genetic association study of pregnancy complications conducted in Chile. A total of 991 mother-offspring pairs (406 SGA cases and 585 controls) were genotyped for 1331 SNPs involving 200 genes. Because maternal-fetal genome interaction is considered a primary genetic source of SGA, we focused our analysis on identifying haplotype interactions between the maternal and fetal genomes.
We first excluded SNPs that had a minor allele frequency of less than 5% or that did not satisfy Hardy-Weinberg equilibrium (HWE) in the combined mother and offspring control population, using a chi-square test with a cut-off p-value of 0.001. We then used the software Haploview [30] to identify haplotype blocks for SNPs within each gene. Two tag SNPs were used to represent each block. A sliding window approach was applied to search for interactions between two blocks.
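The SNP filter described above can be sketched from genotype counts alone. The example assumes the one-degree-of-freedom Pearson chi-square test for HWE, for which the upper-tail p-value equals erfc(sqrt(χ²/2)). The function names and the example counts are hypothetical.

```python
import math


def minor_allele_freq(n_aa, n_ab, n_bb):
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2.0 * n)
    return min(p, 1.0 - p)


def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """Pearson chi-square test of HWE with 1 degree of freedom."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2.0 * n)
    q = 1.0 - p
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return math.erfc(math.sqrt(chi2 / 2.0))   # upper tail of chi-square with 1 df


def passes_qc(counts, maf_min=0.05, hwe_alpha=0.001):
    """Keep a SNP if MAF >= 5% and the HWE test is not rejected at 0.001."""
    return (minor_allele_freq(*counts) >= maf_min
            and hwe_chisq_pvalue(*counts) >= hwe_alpha)


kept = passes_qc((25, 50, 25))      # counts in exact HWE proportions
dropped_hwe = passes_qc((50, 0, 50))  # no heterozygotes at all
dropped_maf = passes_qc((98, 2, 0))   # minor allele frequency of 1%
```

In the study the HWE test was applied to the control population only, so the counts passed to such a filter would come from controls.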
List of selected genes, corresponding "risk" haplotype structure, effect estimates and permutation p-values
SNP ID (allele) | Gene (region) | "Risk" haplotype | a_{s} | d_{s} | a_{t} | d_{t} | i_{aa} | i_{ad} | i_{da} | i_{dd} |
---|---|---|---|---|---|---|---|---|---|---|
9508994 (C/T) | PON1 (intron 1) | [TC]^{M} | 0 | 0 | 0 | 0 | 0 | -0.45 | 0 | 0 |
20209376 (C/T) | PON1 (intron 5) | [CC]^{O} | | | | | | p* = 0.001 | | |
659435566 (C/T) | NFKB1 (exon 12) | [CC]^{M} | 0 | 0 | 0 | 0 | -0.33 | 0 | 0 | 0 |
659435702 (C/G) | NFKB1 (intron 22) | [TC]^{O} | | | | | p* = 0.001 | | | |
22767327 (A/T) | FLT4 (intron 7) | [AT]^{M} | 0 | 0 | 0 | 0 | 0 | -0.30 | 0 | 0 |
22175087 (C/T) | FLT4 (intron 8) | [TC]^{O} | | | | | | p* < 0.001 | | |
1125300 (G/T) | SPARC (intron 3) | [TT]^{M} | 0 | -0.38 | 0 | 0 | 0 | 0 | 0 | 0.245 |
1125290 (G/T) | SPARC (intron 5) | [TT]^{O} | | p* = 0.001 | | | | | | p* < 0.001 |
634841108 (A/C) | TIMP2 (intron 2) | [AG]^{M} | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.68 |
634841123 (A/G) | TIMP2 (exon 3) | [CG]^{O} | | | | | | | | p* < 0.001 |
634018768 (A/G) | HPGD (promoter) | [AG]^{M} | 0 | 0 | 0.44 | 0 | 0 | 0 | 0 | 0 |
636105057 (A/G) | HPGD (promoter) | [GA]^{O} | | | p* < 0.001 | | | | | |
17252653 (G/T) | MMP9 (intron) | [GC]^{M} | 0 | 0 | 0.53 | 0 | 0 | 0 | 0 | 0 |
17254821 (C/G) | MMP9 (exon 10) | [TC]^{O} | | | p* < 0.001 | | | | | |
Other non-parametric methods, such as multifactor dimensionality reduction (MDR), have been shown to be successful for the identification of interaction effects in many studies. Because MDR can only be applied to studies with a balanced case/control design, generalized MDR (GMDR) has been proposed as an extension to MDR [31]. GMDR maps phenotypic traits into residual scores through certain link functions under the generalized linear model setting, and then conducts SNP selection and testing based on the residual scores. To compare with our method, we applied GMDR to the data. The mother-offspring paired genotype data were used as input for GMDR, and a logistic link was used to calculate the residual scores.
In the example of PON1, SNP 20209376 (C/T) in the fetal genome was first selected by GMDR (p-value = 0.0107). SNPs were then paired with each other to identify potentially significant pairwise interactions. Only SNP 9508994 (C/T) in the mother genome was found to interact with SNP 20209376, with marginal significance (p-value = 0.0547). More complex models were found to be non-significant (p-value = 0.1719 and p-value = 0.3770 for the 3-SNP and 4-SNP models, respectively). Even though GMDR indicated a maternal-fetal interaction between these two SNPs, it did not provide an estimate of the genetic effect or of the underlying interaction mechanism between the SNPs.
Model extension
Our method has been illustrated with two SNPs only, but the model can be easily extended to more than two SNPs. When three or more SNPs are involved in each haplotype block, Cui et al. [12] gave an explicit derivation of the possible "risk" haplotype structures. In fact, no matter how many SNPs are involved, three possible composite diplotypes can be constructed, as illustrated by Cui et al. [12]. The only challenge for this extension is to deal with the number of heterozygous loci. For example, when three SNPs are considered in a block, there are a total of seven possible phase-ambiguous genotypes. In a single-block haplotype analysis, there could be up to four mixture components when constructing the likelihood function. When we consider interactions between two blocks, there are a total of 16 possible mixture components in the likelihood function. This will increase the programming challenge and the computing burden. Fortunately, the increase in the number of mixture components does not affect the number of parameters to be estimated. We still have four main effects and four interactions, as these parameters are defined based on the "risk" haplotype structure.
Another possible solution to the challenges mentioned above is to do a sliding window search with each window covering two SNPs at a time. This is similar to the sliding window haplotype analysis commonly applied in some software such as PLINK.
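The counts quoted above for a three-SNP block can be checked by enumeration. The sketch assumes that a genotype is phase-ambiguous exactly when it is heterozygous at two or more SNPs, and that a genotype with h heterozygous loci is compatible with 2^(h-1) distinct diplotypes; the function names are illustrative.

```python
from itertools import product


def count_phase_ambiguous(n_snps):
    """Number of multi-SNP genotypes with at least two heterozygous loci."""
    count = 0
    for genotype in product(("hom1", "het", "hom2"), repeat=n_snps):
        if genotype.count("het") >= 2:
            count += 1
    return count


def max_phase_resolutions(n_snps):
    """Largest number of diplotypes compatible with one genotype (all loci het)."""
    return 2 ** (n_snps - 1)


ambiguous_3 = count_phase_ambiguous(3)
per_block = max_phase_resolutions(3)
two_blocks = per_block * per_block
```

For three SNPs this gives seven ambiguous genotypes, at most four mixture components per block and at most 16 for a pair of blocks, while the number of genetic parameters stays at eight.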
Discussion and Conclusions
Although it has been reported that gene-gene interaction plays a major role in genetic studies of complex diseases, the detection of gene-gene interaction has traditionally been pursued at the single-SNP level, i.e., focusing on interactions between single SNPs. Intuitively, SNP-SNP interaction cannot represent gene-gene interaction because single SNPs cannot capture the total variation of a gene. Thus, extending the idea of single-SNP interaction to haplotype interaction could potentially gain much in terms of capturing variation in genes. The proposed method defines gene-gene interaction through haplotype block interactions and offers an alternative strategy for finding potential interactions between two genes. We argue that the definition of haplotype block interaction could provide additional biological insights into disease etiology, compared to a single SNP-based interaction analysis.
One of the advantages of our method is in grouping, and hence in reducing the data dimension. By mapping genotypes to composite diplotypes, the data dimension is significantly reduced. We can then use the Bayesian information criterion to select potential "risk" haplotypes [12]. The selection of "risk" haplotypes provides another advantage of the method: we can identify significant haplotype structures and further quantify their main and interaction effects. This greatly enhances the interpretability and biological relevance of our model.
Our simulation study showed that our method has reasonable false positive control and selection power for the genetic parameters. As we expected, the interaction effects have lower selection power compared to the main effects. As sample size increases, we are able to achieve an optimal power for the interaction effects. Another novelty of the method is the modeling of the "risk" haplotype, which leads to the partition of composite diplotypes. No matter how many SNPs are involved, it always ends up with three types of composite diplotypes. Thus, the number of genetic parameters is always fixed regardless of the number of SNPs. The only cost is the search for possible "risk" haplotypes through a larger parameter space.
We applied our method to an SGA study data set. Several SNP pairs were selected with either main or interaction effects. The permutation test confirmed the statistical significance of the selected effects. Our findings are consistent with other reports in the literature. Gene PON1 was previously reported to be associated with preterm birth, which is one of the potential factors leading to SGA [32]. Gene FLT4 has been found to be associated with the growth of human fetal endothelial cells and early human development [33, 34]. Gene HPGD was also reported to be involved in human intrauterine growth restriction [35]. Gene MMP9 has been suggested to be related to placental function [36]. This evidence supports the biological relevance of our method.
We also identified potential interaction effects for several additional genes, including NFKB1, SPARC and TIMP2. To our knowledge, no experimental evidence has been reported for these genes regarding a biological function related to fetal development or SGA. However, we found that each of these genes has been suggested to be involved in many biological pathways. Studies have indicated that gene NFKB1 is functionally related to stress-impaired neurogenesis and depressive behavior [37], myelin formation [38], and adipose tissue growth [39]. Gene SPARC has been suggested to be associated with angiogenesis and tumor growth [40] and the progression of crescentic glomerulonephritis [41]. Gene TIMP2 was reported to be related to myogenesis [42] and the progression of cerebral aneurysms [43]. Further replication studies are needed to confirm the biological relevance of these genes to SGA.
Declarations
Acknowledgements
The authors wish to thank the two anonymous referees for their helpful comments that improved the manuscript, and thank Dr. Kelian Sun for helping data processing. This work was supported in part by NSF grant DMS-0707031 and by the Perinatology Research Branch, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, NIH, DHHS.
References
- Zhao J, Jin L, Xiong M: Test for interaction between two unlinked loci. Am J Hum Genet. 2006, 79 (5): 831-45. 10.1086/508571.
- Drysdale CM, McGraw DW, Stack CB, Stephens JC, Judson RS, Nandabalan K, Arnold K, Ruano G, Liggett SB: Complex promoter and coding region beta 2-adrenergic receptor haplotypes alter receptor expression and predict in vivo responsiveness. Proc Natl Acad Sci. 2000, 97 (19): 10483-8. 10.1073/pnas.97.19.10483.
- Cordell HJ: Detecting gene-gene interactions that underlie human diseases. Nat Rev Genet. 2009, 10: 392-404. 10.1038/nrg2579. [http://www.nature.com/nrg/journal/v10/n6/abs/nrg2579.html-a1]
- Phillips PC, Otto SP, Whitlock MC: Beyond the average: The evolutionary importance of epistasis and the variability of epistatic effects. Epistasis and the Evolutionary Process. Edited by: Wolf JB, Brodie ED, Wade MJ. 2000, Oxford Univ Press, New York
- Hartman JL, Garvik B, Hartwell L: Principles for the buffering of genetic variation. Science. 2001, 291: 1001-1004. 10.1126/science.291.5506.1001.
- Boone C, Bussey H, Andrews BJ: Exploring genetic interactions and networks with yeast. Nat Rev Genet. 2007, 8: 437-449. 10.1038/nrg2085.
- Lander ES, Botstein D: Mapping mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics. 1989, 121 (1): 185-99.
- Kao CH, Zeng ZB, Teasdale RD: Multiple interval mapping for quantitative trait loci. Genetics. 1999, 152 (3): 1203-16.
- Cui Y, Wu R: Mapping genome-genome epistasis: a high-dimensional model. Bioinformatics. 2005, 21 (10): 2447-55. 10.1093/bioinformatics/bti342.
- The International HapMap Consortium: A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007, 449: 851-861. 10.1038/nature06258.
- Liu T, Johnson JA, Casella G, Wu R: Sequencing complex diseases with HapMap. Genetics. 2004, 168: 503-511. 10.1534/genetics.104.029603.
- Cui Y, Fu W, Sun K, Romero R, Wu R: Mapping nucleotide sequences that encode complex binary disease traits with HapMap. Current Genomics. 2007, 5: 307-22. 10.2174/138920207782446188.
- Bateson W: Mendel's Principles of Heredity. 1909, Cambridge University Press, Cambridge
- Mani R, St Onge RP, Hartman JL, Giaever G, Roth FP: Defining genetic interaction. Proc Natl Acad Sci. 2008, 105 (9): 3461-6. 10.1073/pnas.0712255105.
- Wolf JB, Frankino WA, Agrawal AF, Brodie ED, Moore AJ: Developmental interactions and the constituents of quantitative variation. Evolution. 2001, 55 (2): 232-45.
- Segrè D, DeLuna A, Church GM, Kishony R: Modular epistasis in yeast metabolism. Nat Genet. 2005, 37: 77-83.
- Moore JH: The ubiquitous nature of epistasis in determining susceptibility to common human diseases. Hum Hered. 2003, 56: 73-82. 10.1159/000073735.
- Nagel RL: Epistasis and the genetics of human diseases. C R Biol. 2005, 328 (7): 606-615. 10.1016/j.crvi.2005.05.003.
- Lin M, Wu RL: Detecting sequence-sequence interactions for complex diseases. Current Genomics. 2006, 7: 59-72. 10.2174/138920206776389775.
- Zhang J, Liang F, Dassen WR, Veldman BA, Doevendans PA, DeGunst M: Search for haplotype interactions that influence susceptibility to type 1 diabetes through use of unphased genotype data. Am J Hum Genet. 2003, 73 (6): 1385-401. 10.1086/380417.
- Musani SK, Shriner D, Liu N, Feng R, Coffey CS, Yi N, Tiwari HK, Allison DB: Detection of gene × gene interactions in genome-wide association studies of human population data. Hum Hered. 2007, 63 (2): 67-84. 10.1159/000099179.
- Ritchie MD, Hahn LW, Roodi N, Bailey LR, Dupont WD, Parl FF, Moore JH: Multifactor Dimensionality Reduction Reveals High-Order Interactions among Estrogen Metabolism Genes in Sporadic Breast Cancer. American Journal of Human Genetics. 2001, 69: 138-147. 10.1086/321276.
- Zou H: The adaptive Lasso and its oracle properties. Journal of the American Statistical Association. 2006, 101: 1418-1429. 10.1198/016214506000000735.
- Cockerham CC: An extension of the concept of partitioning hereditary variance for analysis of covariances among relatives when epistasis is present. Genetics. 1954, 39: 859-882.
- Tibshirani R: Regression shrinkage and selection via the lasso. J Royal Statist Soc B. 1996, 58 (1): 267-288.
- Fu W: Penalized regressions: the Bridge versus the Lasso. J Computational and Graphical Statistics. 1998, 7 (3): 397-416. 10.2307/1390712.
- Efron B, Hastie T, Johnstone I, Tibshirani R: Least Angle Regression. Annals of Statistics. 2004, 32 (2): 407-499. 10.1214/009053604000000067.
- Bertsekas DP, Tsitsiklis JN: Parallel and Distributed Computation: Numerical Methods. 1989, Prentice Hall, Englewood Cliffs, NJ, USA
- Shevade SK, Keerthi SS: A simple and efficient algorithm for gene selection using sparse logistic regression. Bioinformatics. 2003, 19 (17): 2246-53. 10.1093/bioinformatics/btg308.
- Barrett JC, Fry B, Maller J, Daly MJ: Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005, 21 (2): 263-5. 10.1093/bioinformatics/bth457.
- Lou XY, Chen GB, Yan L, Ma J, Zhu J, Elston R, Li MD: A generalized combinatorial approach for detecting gene-by gene and gene-by-environment interactions with application to Nicotine Dependence. Am J Hum Genet. 2007, 80: 1125-1137. 10.1086/518312.
- Lawlor DA, Gaunt TR, Hinks LJ, Davey SG, Timpson N, Day IN, Ebrahim S: The association of the PON1 Q192R polymorphism with complications and outcomes of pregnancy: findings from the British Women's Heart and Health cohort study. Paediatr Perinat Epidemiol. 2006, 20 (3): 244-50. 10.1111/j.1365-3016.2006.00716.x.
- Kaipainen A, Korhonen J, Pajusola K, Aprelikova O, Persico MG, Terman BI, Alitalo K: The related FLT4, FLT1, and KDR receptor tyrosine kinases show distinct expression patterns in human fetal endothelial cells. J Exp Med. 1993, 178 (6): 2077-88. 10.1084/jem.178.6.2077.
- Boutsikou T, Malamitsi-Puchner A, Economou E, Boutsikou M, Puchner KP, Hassiakos D: Soluble vascular endothelial growth factor receptor-1 in intrauterine growth restricted fetuses and neonates. Early Hum Dev. 2006, 82 (4): 235-9. 10.1016/j.earlhumdev.2005.09.010.
- Nevo O, Many A, Xu J, Kingdom J, Piccoli E, Zamudio S, Post M, Bocking A, Todros T, Caniggia I: Placental expression of soluble fms-like tyrosine kinase 1 is increased in singletons and twin pregnancies with intrauterine growth restriction. J Clin Endocrinol Metab. 2008, 93 (1): 285-92. 10.1210/jc.2007-1042.
- Kiess W, Chernausek SD, Hokken-Koelega ACS, eds: Small for Gestational Age. Causes and Consequences. Pediatr Adolesc Med Basel, Karger. 2009, 13: 11-25.
- Koo JW, Russo SJ, Ferguson D, Nestler EJ, Duman RS: Nuclear factor-kappaB is a critical mediator of stress-impaired neurogenesis and depressive behavior. PNAS. 2010, 107 (6): 2669-74. 10.1073/pnas.0910658107.
- Limpert AS, Carter BD: Axonal neuregulin 1 type III activates NF-kappaB in Schwann cells during myelin formation. J Biol Chem. 2010, 285 (22): 16614-22. 10.1074/jbc.M109.098780.
- Tang T, Zhang J, Yin J, Staszkiewicz J, Gawronska-Kozak B, Jung DY, Ko HJ, Ong H, Kim JK, Mynatt R, Martin RJ, Keenan M, Gao Z, Ye J: Uncoupling of inflammation and insulin resistance by NF-kappaB in transgenic mice through elevated energy expenditure. J Biol Chem. 2010, 285 (7): 4637-44. 10.1074/jbc.M109.068007.
- Bhoopathi P, Chetty C, Gujrati M, Dinh DH, Rao JS, Lakka SS: The role of MMP-9 in the anti-angiogenic effect of secreted protein acidic and rich in cysteine. Br J Cancer. 2010, 102 (3): 530-40. 10.1038/sj.bjc.6605538.
- Sussman AN, Sun T, Krofft RM, Durvasula RV: SPARC accelerates disease progression in experimental crescentic glomerulonephritis. Am J Pathol. 2009, 174 (5): 1827-36. 10.2353/ajpath.2009.080464.
- Lluri G, Langlois GD, Soloway PD, Jaworski DM: Tissue inhibitor of metalloproteinase-2 (TIMP-2) regulates myogenesis and beta1 integrin expression in vitro. Exp Cell Res. 2008, 314 (1): 11-24. 10.1016/j.yexcr.2007.06.007.
- Aoki T, Kataoka H, Moriwaki T, Nozaki K, Hashimoto N: Role of TIMP-1 and TIMP-2 in the progression of cerebral aneurysms. Stroke. 2007, 38 (8): 2337-45. 10.1161/STROKEAHA.107.481838.
- Dattorro J: Convex Optimization & Euclidean Distance Geometry. 2005, Meboo Publishing
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.