 Methodology article
 Open Access
Comparison of weighting approaches for genetic risk scores in gene-environment interaction studies
BMC Genetics volume 18, Article number: 115 (2017)
Abstract
Background
Weighted genetic risk scores (GRS), defined as weighted sums of risk alleles of single nucleotide polymorphisms (SNPs), are statistically powerful for detecting gene-environment (GxE) interactions. To assign weights, the gold standard is to use external weights from an independent study. However, appropriate external weights are not always available. In such situations and in the presence of predominant marginal genetic effects, we have shown in a previous study that GRS with internal weights from marginal genetic effects (“GRSmarginalinternal”) are a powerful and reliable alternative to single SNP approaches or the use of unweighted GRS. However, this approach might not be appropriate for detecting predominant interactions, i.e. interactions showing an effect stronger than the marginal genetic effect.
Methods
In this paper, we present a weighting approach for such predominant interactions (“GRSinteractiontraining”) in which parts of the data are used to estimate the weights from the interaction terms and the remaining data are used to determine the GRS. We conducted a simulation study for the detection of GxE interactions in which we evaluated power, type I error and sign-misspecification. We compared this new weighting approach to the GRSmarginalinternal approach and to GRS with external weights.
Results
Our simulation study showed that in the absence of external weights and with predominant interaction effects, the highest power was reached with the GRSinteractiontraining approach. If marginal genetic effects were predominant, the GRSmarginalinternal approach was more appropriate. Furthermore, the power to detect interactions reached by the GRSinteractiontraining approach was only slightly lower than the power achieved by GRS with external weights. The power of the GRSinteractiontraining approach was confirmed in a real data application to the Traffic, Asthma and Genetics (TAG) Study (N = 4465 observations).
Conclusion
When appropriate external weights are unavailable, we recommend using internal weights from the study population itself to construct weighted GRS for GxE interaction studies. If the SNPs were chosen because a strong marginal genetic effect was hypothesized, GRSmarginalinternal should be used. If the SNPs were chosen because of their collective impact on the biological mechanisms mediating the environmental effect (hypothesis of predominant interactions), GRSinteractiontraining should be applied.
Background
For many diseases, genetic influences are exceedingly complex and cannot be explained by simple Mendelian modes of inheritance only. Moreover, genetic and environmental factors may jointly contribute to susceptibility, which underlines the importance of analyzing gene-environment (GxE) interactions, which can be defined as “a different effect of environmental exposure in disease risk in persons with different genotypes” [1].
Since most complex diseases are influenced by hundreds of genetic variants each having a small effect on its own, polygenic approaches that deal with the genetic basis en masse often access more of the heritable component of complex traits than is possible by single-variant approaches [2]. The most common polygenic approach is the weighted genetic risk score (GRS) approach in which a weighted GRS is calculated from a preselected number of genetic variants to define a person’s individual genetic risk for disease development [3].
One of the first GRS applications was published by Purcell et al. who used GRS to argue that schizophrenia has a polygenic risk [4]. Although their genome-wide association study (GWAS) identified few individually significant single nucleotide polymorphisms (SNPs), they provided evidence for a substantial polygenic component to risk of schizophrenia involving thousands of common alleles of very small effect. In addition, GRS show promise for patient stratification and sub-phenotyping [2]. Hamshere et al. showed that among bipolar disorder cases GRS for schizophrenia risk could distinguish schizoaffective cases from others [5]. Moreover, GRS were successfully used in interaction analyses to examine the genetic susceptibility to air pollution-induced type 2 diabetes [6], air pollution-induced airway inflammation [7] and fried food-induced obesity [8].
The high power of GRS approaches to detect GxE interactions has been confirmed in a recent methodological paper by Aschard [9]. In this publication, Aschard showed that if most interaction effects point in the same direction, the use of GRS increases the power to detect GxE interactions in comparison to the common univariate single-variant approaches, e.g. with Bonferroni correction, and the joint test of main genetic and interaction effects [9, 10]. Furthermore, by combining SNPs of a certain biological pathway, GRS can be used as a simple statistical approach for the complex biological pathways through which environment-induced diseases might be caused [7].
GRS have been employed to summarize genetic effects among an ensemble of markers that do not individually achieve significance and to estimate the variance explained by a marker panel [3]. In these applications, the gold standard is to use external weights, e.g. marginal genetic effects estimated in an independent study population [3, 11].
In a recent publication, we presented a new GRS approach that can be applied if no appropriate external weights are available and the marginal genetic effects are predominant, which means that the marginal genetic effects are stronger than the interaction effects [12]. In this approach, we used GRS with internal weights from the marginal genetic effects of the study itself and showed that using these GRS increased the power to detect gene-environment interactions substantially compared to the common single-SNP approach and to the use of unweighted GRS, with a well-controlled type I error [12]. In addition, GRS with weights from the marginal genetic effects estimated with elastic net regression [13] were able to handle a large number of correlated SNPs as well as noise SNPs, i.e. SNPs having no effect on the outcome of interest. Applying this approach to an epidemiological study, we showed in a study population of only 402 women that genetic variation in the endoplasmic reticulum (ER) stress pathway might play a role in air pollution-induced inflammation in the lung [7].
However, in scenarios with predominant interaction effects, a better approach might be to split the data into training and test data, using the training data to estimate the weights from the interaction term itself and the remaining test data to determine the GRS. Dudbridge (2013) evaluated a GRS approach in which the data were split into test and training data for the detection of marginal genetic effects [3]. Dudbridge reported that the optimal balance of sample sizes between training and test data sets is close to one-half regardless of the proportion of noise SNPs or the p-value threshold [3]. Therefore, given an initial sample to be split into training and test subsets, an obvious rule of thumb is to make an even split [3]. However, to the best of our knowledge, this approach has never been evaluated for the detection of GxE interactions.
The aim of the current study is to present a new GRS approach for GxE interaction studies, called GRSinteractiontraining, in which the weights are obtained from the interaction terms in a training dataset that is split off from the sample data and the remaining test data are used to determine the GRS. We performed a simulation study on the detection of gene-environment interactions in which we compared the performance of GRSinteractiontraining to GRS with external weights (gold standard) and to weighted GRSmarginalinternal [12]. We considered scenarios with predominant marginal genetic effects and smaller additional GxE interaction effects, and vice versa. We simulated scenarios with an increasing number of noise SNPs (up to 200) and with varying minor allele frequencies.
Moreover, we applied these different weighting approaches to a real data set from the Traffic, Asthma and Genetics (TAG) Study (N = 4465 observations in a pooled dataset across six birth cohorts) concerned with investigating the role of genetic variation of the oxidative stress and inflammation pathway on air pollution-induced asthma at school age.
Methods
Determination of weighted GRS
Weighted GRS (GRS _{ i }) are defined as weighted sums of the number of risk alleles (coded as 0, 1, 2) of k considered SNPs (g _{ i1}, …, g _{ ik }) for the n subjects (i = 1, …, n):
\[ \mathrm{GRS}_i=\sum_{j=1}^{k} w_j\, g_{ij} \qquad (1) \]
The most common weighting approach is to use external weights w _{1}, …, w _{ k }, e.g. marginal genetic effects of the k SNPs estimated in an independent study population [3, 11].
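The definition of the weighted GRS can be illustrated with a minimal Python sketch. The function name `weighted_grs`, the toy genotypes and the weights are hypothetical and serve only to show the weighted sum over SNPs for each subject.

```python
def weighted_grs(genotypes, weights):
    """Return one weighted GRS per subject.

    genotypes: list of rows, one per subject, each holding k risk-allele
               counts coded 0, 1 or 2.
    weights:   list of k SNP weights (e.g. external effect estimates).
    """
    scores = []
    for row in genotypes:
        if len(row) != len(weights):
            raise ValueError("each subject needs one genotype per weight")
        # Weighted sum of risk-allele counts over the k SNPs.
        scores.append(sum(w * g for w, g in zip(weights, row)))
    return scores


# Hypothetical toy data: 3 subjects, 3 SNPs, made-up weights.
genotypes = [[0, 1, 2], [2, 0, 0], [1, 1, 1]]
weights = [0.5, 0.25, 1.0]
print(weighted_grs(genotypes, weights))  # [2.25, 1.0, 1.75]
```

Setting all weights to 1 in this sketch would give the unweighted GRS, i.e. a plain count of risk alleles.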
Genome-wide meta-analyses that provide the combined effect estimates of a range of independent studies are usually preferred, followed by meta-analyses that include only a selected number of SNPs identified to be relevant for the phenotype, and by GWAS in large single cohorts. Determining weights from two or more different external studies should be treated with caution because effect estimates from different cohorts are often not comparable, e.g. due to differences in study design, ethnicity or phenotype definitions.
A limitation of GRS with external weights is that we can only include SNPs for which the marginal genetic effects have been published. In this regard, GRS with external weights are usually restricted to SNPs with a genome-wide significant (p-value < 5 × 10^{−8}) marginal genetic effect in the external study population, whereas SNPs with a predominant interaction effect are usually not presented. Furthermore, large-scale GWAS have not been published for every phenotype, and sometimes they have been conducted only in populations with different ethnicity, sex or age range.
GRSmarginalinternal approach
If no appropriate external weights are available, one approach that we developed recently is to estimate the weights w _{1}, …, w _{ k } from the internal marginal genetic effect of the study sample itself [12], called GRSmarginalinternal.
In this approach, the weights \( \left({w}_1,\dots, {w}_k\right)=\left({\widehat{\beta}}_1,\dots, {\widehat{\beta}}_k\right) \) in eq. (1) are estimated internally from a multivariate elastic net regression analysis [13,14,15] for the combined marginal genetic effect of k pathway-related SNPs on the health outcome y in the study population itself. In the elastic net regression model, the values of the unknown parameters for the intercept β _{0} and the marginal genetic effects of the k SNPs β _{ j } (j = 1, …, k) can be estimated by minimizing the sum of the residual sum of squares and a penalty term:
\[ \min_{\beta_0,\beta}\left(\frac{1}{2n}\sum_{i=1}^{n}\Bigl(y_i-\beta_0-\sum_{j=1}^{k}g_{ij}\beta_j\Bigr)^2+P\left(\lambda,\beta\right)\right) \qquad (2) \]
Here, G = (g _{ i1}, …, g _{ ik }) is an n x k matrix holding the k considered SNPs for the n subjects and the penalty function \( P\left(\lambda, \beta \right):= \lambda {\sum}_{j=1}^k\left(\frac{1}{2}\left(1-\alpha \right){\beta}_j^2+\alpha \left|{\beta}_j\right|\right) \) is a combined penalty of lasso and ridge regression penalties. We used cross-validation to find the optimal value of the regularization parameter λ, i.e. the largest λ value for which the cross-validated mean squared error is within 1 standard error (SE) of its minimum (minMSE), as implemented in the R package glmnet [14] and recommended in [15]. The penalty weight α can be chosen between 0 and 1. The elastic net with a penalty weight of α = 1 is identical to the lasso regression, whereas the elastic net with α = 0 is identical to the ridge regression [15]. Since we showed in our recent publication that the penalty weight α only has a minor impact on power and type I error for the detection of interactions [12], we chose a penalty weight of α = 0.5 in this publication to obtain a good balance between ridge and lasso regression. Zou and Hastie proposed the elastic net penalty for linear regression models [13], which was further extended to logistic regression and multinomial regression [14] and to the Cox regression [16].
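The role of the penalty weight α can be made concrete with a small Python sketch of the combined lasso and ridge penalty, written as λ times the sum over SNPs of ½(1 − α)β² + α|β|, the form used by glmnet. The function name and the coefficient values are hypothetical; this only evaluates the penalty and is not a fitting routine.

```python
def elastic_net_penalty(beta, lam, alpha):
    """Evaluate lam * sum(0.5 * (1 - alpha) * b**2 + alpha * |b|) over beta."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return lam * sum(0.5 * (1.0 - alpha) * b * b + alpha * abs(b) for b in beta)


beta = [1.0, -2.0, 0.0]  # hypothetical coefficients
lasso = elastic_net_penalty(beta, lam=0.5, alpha=1.0)  # pure L1 penalty: 1.5
ridge = elastic_net_penalty(beta, lam=0.5, alpha=0.0)  # pure L2 penalty: 1.25
mixed = elastic_net_penalty(beta, lam=0.5, alpha=0.5)  # halfway between: 1.375
print(lasso, ridge, mixed)
```

With α = 0.5 the penalty is the average of the lasso and ridge penalties, which is one way to read the "good balance" between the two.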
GRSinteractiontraining approach
In scenarios with predominant interaction effects, i.e. in scenarios in which the GxE interaction effects are stronger than the marginal genetic effects, a better approach might be to use the coefficients from the interaction terms to determine the weights instead of using the marginal genetic effect estimates.
In this new approach, which we call GRSinteractiontraining approach, SNPs get a larger weight to the extent that they interact more strongly with the environmental exposure.
Up to now, the use of training and test datasets for the construction of GRS has only been described for the detection of marginal genetic effects. If GRS are used to estimate marginal genetic effects, Dudbridge pointed out that the weights must be estimated from the marginal genetic effects in a training sample and be used to construct a GRS in an independent test dataset [3]. Along the same lines, Burgess et al. showed that using internal weights instead of weights from a training dataset should be avoided because it leads to biased effect estimates [17, 18].
Transferring this knowledge to GxE interaction analyses with GRS with weights from the interaction term itself, it is necessary to estimate these internal interaction weights in an independent training sample as well.
In the first step of the GRSinteractiontraining approach, the initial sample is split randomly into a training dataset and a test dataset. Next, the elastic net regression is used to estimate the interaction parameters δ _{ j } (j = 1, …, k) between each of the k SNPs and the environmental factor E by minimizing the sum of the residual sum of squares and a penalty term in the training data:
with E = (e _{1}, …, e _{ n }) being an n x 1 matrix holding the considered environmental exposure E for the n subjects, the environmental effect parameter γ and the penalty function:
The remaining parameters are defined as in eq. (2). The effect estimates for the interaction terms \( {\widehat{\delta}}_j\ \left(j=1,\dots, k\right) \) are then used as weights w _{1}, …, w _{ k } for the GRS (see eq. (1) for the general definition of weighted GRS) in the remaining test data.
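Under the definitions above, the training-data criterion and the resulting score can be written out as follows. This is a sketch under two assumptions that the text does not state explicitly: the residual sum of squares is scaled by 1/(2n) as in glmnet, with n here denoting the number of training subjects, and all coefficients except the intercept are penalized equally (the glmnet default).

```latex
\[
\min_{\beta_0,\beta,\gamma,\delta}\;
\frac{1}{2n}\sum_{i=1}^{n}\Bigl(y_i-\beta_0-\sum_{j=1}^{k}\beta_j g_{ij}
-\gamma e_i-\sum_{j=1}^{k}\delta_j g_{ij}e_i\Bigr)^{2}
+P(\lambda,\beta,\gamma,\delta)
\]
\[
P(\lambda,\beta,\gamma,\delta)
=\lambda\sum_{\theta\in\{\beta_1,\dots,\beta_k,\gamma,\delta_1,\dots,\delta_k\}}
\Bigl(\tfrac{1}{2}(1-\alpha)\theta^{2}+\alpha\lvert\theta\rvert\Bigr)
\]
\[
\mathrm{GRS}_i=\sum_{j=1}^{k}\widehat{\delta}_j\,g_{ij}
\quad\text{for each subject } i \text{ in the test data}
\]
```

Because the lasso part of the penalty can shrink individual interaction estimates exactly to zero, SNPs without an estimated interaction drop out of the score in the test data.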
Interaction analysis
In the subsequent gene-environment interaction analysis, a generalized linear model (GLM) [19, 20] is applied to estimate the gene-environment interaction (GRSxE interaction; interaction between GRS and environmental exposure) for the same health outcome y as in eqs. (2, 3). In a GLM, y is usually assumed to be generated from a distribution in the exponential family that includes, e.g., the normal, binomial, Poisson and gamma distribution. The mean μ of this distribution depends on the independent variables X through:
\[ \mathrm{E}(Y)=\mu =g^{-1}\left(X\tau \right) \]
where E(Y) is the expected value of the random variable Y, g is the link function and X = (grs _{ i }, e _{ i }, grs _{ i } e _{ i }) is an n x 3 matrix holding the considered GRS, the environmental exposure E and the interaction between the GRS and E for the n subjects. The unknown parameter vector τ is estimated using maximum likelihood.
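How the three columns of X enter the model can be sketched in a few lines of Python. The helper names are hypothetical, the logit link is chosen only as an example of a link function for a binary outcome, and the explicit intercept argument is an assumption, since the text lists only the three columns of X.

```python
import math


def design_row(grs, e):
    """Row of X for one subject: (GRS, exposure, GRS x exposure)."""
    return (grs, e, grs * e)


def logistic_mean(row, tau, intercept=0.0):
    """Mean under a logit link: mu = 1 / (1 + exp(-(intercept + row . tau)))."""
    eta = intercept + sum(x * t for x, t in zip(row, tau))
    return 1.0 / (1.0 + math.exp(-eta))


row = design_row(grs=2.0, e=1.5)
print(row)  # (2.0, 1.5, 3.0)

# With all parameters at zero the linear predictor is 0 and the mean is 0.5.
print(logistic_mean(row, tau=(0.0, 0.0, 0.0)))  # 0.5
```

The third entry of τ is the GRSxE interaction parameter whose sign and p-value are evaluated in the interaction analysis.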
Simulation study
Simulation design
The data for the simulation study were generated using the function simulateSNPglm from the R package scrime [21]. Each of the simulated datasets contains six independent genetic risk factors (i.e. SNPs) and either 6, 50, 100, or 200 additional noise SNPs. The impact of more noise SNPs (up to 840) and highly correlated SNPs was discussed in our previous publication where we showed that weighted GRS with weights estimated in the elastic net regression can handle even a high number of noise and correlated SNPs very well [12]. In most scenarios, we randomly chose minor allele frequencies (MAF) between 0.01 and 0.45 for the six risk SNPs as well as for the noise SNPs. When analyzing the impact of the MAF, we varied the MAFs of the six risk SNPs between 0.01 and 0.45, whereas the MAFs for the noise SNPs were randomly selected. A dominant mode of inheritance was considered for each risk SNP.
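The meaning of the MAF range and of the dominant mode of inheritance can be illustrated with a short Python sketch. It is not the simulateSNPglm algorithm from scrime: drawing the two alleles independently (Hardy-Weinberg equilibrium), treating the minor allele as the risk allele, and the helper names are assumptions made for illustration.

```python
import random


def simulate_genotype(maf, rng):
    """Draw a minor-allele count (0, 1 or 2) from two independent allele draws."""
    return sum(1 for _ in range(2) if rng.random() < maf)


def dominant_code(genotype):
    """Dominant coding: 1 if at least one minor allele is carried, else 0."""
    return 1 if genotype >= 1 else 0


rng = random.Random(1)  # fixed seed so the sketch is reproducible
mafs = [rng.uniform(0.01, 0.45) for _ in range(6)]  # six risk SNPs
subject = [simulate_genotype(maf, rng) for maf in mafs]
carrier = [dominant_code(g) for g in subject]
print(subject, carrier)
```

Under dominant coding, heterozygous and homozygous carriers of the minor allele are treated alike, so the effect of a risk SNP depends only on carrier status.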
We compared two scenarios:
In scenario (a), we constructed a predominant interaction effect which means that the interaction between each of the six risk SNPs and an environmental exposure E is set to an interaction effect of 1.5 with a smaller marginal genetic effect that is not explicitly defined (see [21]).
In scenario (b), we constructed a predominant marginal genetic effect, which means that the marginal genetic effect of each of the six risk SNPs is set to 1.5 with an additional (smaller) interaction effect. For the simulation of the gene-environment interaction terms in scenario (b), we followed the procedure previously described [12].
Effect estimates and p-values for the marginal genetic effects, the environmental effects and for the interaction effects of a simulated example dataset of N = 3000 are given for scenarios (a) and (b) in Tables S1 and S2 of Additional file 1.
Simulation of external weights
In real data applications, it is often difficult or impossible to obtain appropriate external weights. Therefore, we simulated different types of external data with varying degrees of fit to our own study sample. First, external weights were estimated from the marginal genetic effects in an external dataset that was simulated from the same distribution as our study sample data (perfect weights). In addition, we simulated two scenarios with less appropriate external weights. In the first scenario, the effect estimates of the risk SNPs in our own study sample were larger than in the external data (underestimating weights) and in the second scenario, only one of the six risk SNPs of the external data was associated with the outcome in our own study sample (overestimating weights).
We simulated external data with the same sample size as our own study sample and external data with a sample size four times larger than that of our own study sample, and varied the number of noise SNPs from 6 to 200.
Evaluation of power, proportion of sign-misspecification, and type I error
The main focus of the model comparison was to maximize the power to detect a gene-environment interaction with an acceptable type I error.
Power was evaluated in datasets with N = 3000 or N = 1000 observations and 100 or 1000 replications depending on the running time and precision needed in different scenarios. As shown in [12], the restriction to 100 replications only caused a minor sampling error of around 3 percentage points in power and type I error.
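The order of magnitude of this sampling error can be checked with the binomial standard error of an estimated proportion. The sketch below is a back-of-the-envelope calculation and not the derivation in [12]; the true power of 0.9 is an assumed value chosen for illustration.

```python
import math


def monte_carlo_se(p, replications):
    """Binomial standard error of a proportion p estimated from replications."""
    return math.sqrt(p * (1.0 - p) / replications)


se_100 = monte_carlo_se(0.9, 100)    # about 0.03, i.e. 3 percentage points
se_1000 = monte_carlo_se(0.9, 1000)  # about 0.0095, i.e. below 1 percentage point
print(round(se_100, 4), round(se_1000, 4))
```

The standard error shrinks with the square root of the number of replications, so moving from 100 to 1000 replications reduces it by a factor of about 3.2.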
The power of the model was calculated as the proportion of times a true-positive interaction was correctly identified (sign of the parameter estimate for the GRSxE interaction term correctly identified and p-value < 0.05) across all replications. The type I error of the model was calculated as the proportion of times a false-positive interaction was identified under the null hypothesis. We further evaluated the proportion of sign-misspecifications, which was calculated as the proportion of times a significant interaction was identified, but the sign of the parameter estimate for the GRSxE interaction term was not correctly determined.
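These three performance measures can be computed from the per-replication results as in the following Python sketch. The function names, the data layout (one pair of interaction estimate and p-value per replication) and the toy numbers are hypothetical.

```python
def summarize_replications(results, true_sign, alpha=0.05):
    """Power and sign-misspecification from (estimate, p_value) pairs.

    results:   one (estimate, p_value) pair for the GRSxE term per replication,
               simulated under a true interaction with sign true_sign (+1 or -1).
    """
    n = len(results)
    significant = [est for est, p in results if p < alpha]
    # Significant and on the correct side of zero counts towards power.
    correct = sum(1 for est in significant if est * true_sign > 0)
    wrong = len(significant) - correct
    return {"power": correct / n, "sign_misspecification": wrong / n}


def type_one_error(null_results, alpha=0.05):
    """Proportion of significant GRSxE terms in replications under the null."""
    return sum(1 for _, p in null_results if p < alpha) / len(null_results)


# Made-up results of four replications with a truly positive interaction.
alt = [(0.8, 0.001), (0.5, 0.04), (-0.3, 0.02), (0.1, 0.40)]
print(summarize_replications(alt, true_sign=1))
# {'power': 0.5, 'sign_misspecification': 0.25}

# Made-up results of four replications without any interaction.
null = [(0.1, 0.5), (-0.2, 0.03), (0.0, 0.9), (0.05, 0.2)]
print(type_one_error(null))  # 0.25
```

Keeping sign-misspecifications separate from power prevents a significant result in the wrong direction from being counted as a successful detection.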
Within the evaluation of our GRSinteractiontraining approach, we investigated the optimal balance between training and test datasets by comparing different proportions: We started with the scenario recommended by Dudbridge (2013) for GRS used for the detection of marginal genetic effects [3], in which the training and the test datasets have an even sample size (1:1). Further scenarios are based on smaller training datasets (1:2, 1:3, 1:4, 1:9 and 1:19) and larger training datasets (19:1, 9:1, 4:1, 3:1, 2:1) than test datasets.
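A random split at a given training-to-test ratio can be sketched in Python as follows. The function name, the fixed seed and the rounding rule for the training size are assumptions; the text does not specify how non-integer sizes were handled.

```python
import random


def split_indices(n, train_parts, test_parts, seed=0):
    """Randomly split range(n) into training and test indices at train:test."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Assumed rounding rule for the size of the training data.
    n_train = round(n * train_parts / (train_parts + test_parts))
    return indices[:n_train], indices[n_train:]


for train_parts, test_parts in [(1, 1), (1, 3), (1, 19), (19, 1)]:
    train, test = split_indices(3000, train_parts, test_parts)
    print(f"{train_parts}:{test_parts}", len(train), len(test))
# 1:1 1500 1500
# 1:3 750 2250
# 1:19 150 2850
# 19:1 2850 150
```

Each subject ends up in exactly one of the two sets, so the weights estimated in the training data are never evaluated on the subjects that produced them.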
All analyses were performed using R 3.3.1 [22].
Results
Simulation study
GRSinteractiontraining approach – Balance between training vs. test data
In a first step, we evaluated the optimal balance between training and test data applying our GRSinteractiontraining approach.
In Fig. 1, power and type I error to detect GxE interactions for (a) predominant interaction effects and (b) predominant marginal genetic effects are presented. Power and type I error were evaluated with an increasing sample size of the training data in comparison to the test data (from 19:1 to 1:19).
This figure reveals that in scenarios with many noise SNPs, the optimal split is close to one-half and the balance is roughly symmetrical around one-half. However, with a decreasing number of noise SNPs, a higher power was achieved by increasing the test data in comparison to the training data. In scenarios with an equal number of noise and risk SNPs, i.e. with six noise and six risk SNPs, the optimal balance between training and test data lay between 1:3 and 1:4. The type I error was well controlled over all scenarios and there was no difference in power and type I error between scenarios with predominant interaction effects (Fig. 1a) and scenarios with predominant marginal genetic effects (Fig. 1b).
GRSinteractiontraining in comparison to previous weighting approaches
Next, we compared the GRSinteractiontraining approach (balance training vs. test data 1:1) to our previously published GRSmarginalinternal approach [12] and to GRS with external weights (which is typically considered as gold standard) in scenarios with (a) predominant interaction effects and (b) predominant marginal genetic effects with an increasing number of noise SNPs (up to 200).
In scenarios with predominant interaction effects (see Fig. 2a), the GRSinteractiontraining approach achieved a higher power than the GRSmarginalinternal approach. In particular, in scenarios with many noise SNPs, the GRSmarginalinternal approach reached a very low power to detect interaction effects. Furthermore, with more noise, there was a high number of sign-misspecifications when using the GRSmarginalinternal approach in scenarios with predominant interaction effects.
In scenarios with predominant marginal genetic effects (see Fig. 2b), the GRSmarginalinternal approach achieved a slightly higher power to detect interaction effects than the GRSinteractiontraining approach, but the differences became smaller with an increasing number of noise SNPs. There were no sign-misspecifications in scenarios with predominant marginal genetic effects.
GRS with perfect external weights, obtained from external data simulated from the same distribution as our study sample data, outperformed the GRSinteractiontraining and the GRSmarginalinternal approaches. However, if the sample size of the external data was not larger than our own study sample size, the GRSinteractiontraining approach achieved a higher power than GRS with perfect external weights in scenarios with predominant interaction effects (Fig. 2a).
Furthermore, in real data applications, there is usually no perfect match between the external data and the sample data, e.g., effect estimates in our own study sample might differ from those in the external data or only a subset of risk SNPs identified in the external data is associated with the outcome in our own study sample. In these scenarios, the GRSinteractiontraining approach was often more appropriate to detect predominant interaction effects than GRS with external weights. The GRSmarginalinternal approach only outperformed GRS with external weights in the detection of predominant marginal genetic effects if there were <100 noise SNPs in the data (Fig. 2b).
The type I error was well controlled over all scenarios (Fig. 2).
GRSinteractiontraining vs. GRSmarginalinternal – Impact of MAF
In a last step, we analyzed the impact of the MAFs of the six risk SNPs on power, proportion of sign-misspecifications and type I error of the GRSinteractiontraining approach in comparison to the GRSmarginalinternal approach.
In scenarios with a predominant interaction effect (see Fig. 3a), the power achieved by the GRSinteractiontraining approach was highest for MAFs between 0.05 and 0.20. Furthermore, there were no sign-misspecifications and the type I error was well controlled. The power achieved by the GRSmarginalinternal approach was even higher than the power achieved by the GRSinteractiontraining approach in scenarios with only a small number of noise SNPs and small MAFs. However, with more noise and MAFs > 0.1, the GRSinteractiontraining approach outperformed the GRSmarginalinternal approach. Most interestingly, there was a high number of sign-misspecifications in scenarios with MAFs ≥ 0.2 when applying the GRSmarginalinternal approach, especially in scenarios with many noise SNPs.
In scenarios with a predominant marginal genetic effect (see Fig. 3b), the GRSmarginalinternal approach achieved a higher power than the GRSinteractiontraining approach with an acceptable proportion of sign-misspecifications.
The type I error was well controlled in all scenarios, but with a higher variation due to the reduced number of replications (100 instead of 1000).
Real data application
The real data application was based on a dataset from the Traffic, Asthma and Genetics (TAG) Study (N = 4465 observations in the pooled dataset across six birth cohorts) in which the interaction between air pollution and SNPs associated with oxidative stress and inflammation on incident childhood asthma was investigated.
Traffic-related air pollution, asthma, SNPs, and potential confounder data were pooled across six birth cohorts. Parents reported physician-diagnosed asthma from birth to 7–8 years of age (confirmed by a pediatric allergist in two cohorts). Individual estimates of annual average air pollution [nitrogen dioxide (NO_{2}), particulate matter ≤2.5 μm (PM_{2.5}), PM_{2.5} absorbance, ozone] were assigned to each child’s birth address using land use regression, atmospheric modeling, and ambient monitoring data. Gene-environment interactions between air pollution and SNPs in GSTP1 (rs1138272 and rs1695) and TNF (rs1800629) on asthma were investigated.
The main findings of the pooled analyses were that NO_{2} (OR = 1.23; 95% CI: 1.03, 1.46, for a 10 μg/m^{3} increase in NO_{2}) and GSTP1 rs1138272 (TT/TC vs. CC; OR = 1.49; 95% CI: 1.20, 1.84) were marginally associated with asthma and a significant interaction between GSTP1 rs1138272 and NO_{2} on asthma was detected (Bonferroni-corrected p = 0.012) [23].
More information about the TAG study can be found in [23,24,25].
In our analysis, we focused on the German Infant Study on the influence of Nutritional Intervention plus environmental and genetic influences on allergy development (GINIplus) as study sample (N = 593 observations), which is one of the six birth cohorts included in the TAG study. We compared the p-values derived from weighted GRS with weights from the pooled analysis as published in [23] (proxy for external weights) to p-values from the GRSmarginalinternal approach and to p-values from the GRSinteractiontraining approach (balance training vs. test data 1:1 (N _{test} = 296), 1:2 (N _{test} = 395) and 1:3 (N _{test} = 444)).
In Table 1, an overview of the marginal genetic effects in the pooled analysis [23] and in GINIplus is given. Only the marginal genetic association between GSTP1 rs1138272 and asthma was significant in the pooled TAG analysis. Effect estimates differed only slightly between the pooled analysis and GINIplus, being ~30% stronger in GINIplus than in the pooled analysis. However, due to the small sample size of GINIplus (N = 593), this marginal association was not significant in GINIplus.
Table 2 shows the results of the GxE interaction analysis in GINIplus. The significant GxE interaction between GSTP1 rs1138272 and NO_{2} on asthma, which was identified in the pooled analysis [23], was identified by each GRS approach. The lowest p-values were achieved by applying the GRSmarginalinternal approach and GRS with external weights, followed by the GRSinteractiontraining approach (using 25% of the data for training and the remaining 75% as test data). The weights from the GRSmarginalinternal approach were almost identical to the univariate estimates from the pooled analysis. The GRSinteractiontraining approach was the only approach that correctly identified GSTP1 rs1138272 as the only SNP that interacts with air pollution (cf. [23]) by setting the weights of the other SNPs to zero.
Discussion
In this article, we presented a new weighting approach, called GRSinteractiontraining, for GRSxE interaction studies in which parts of the study sample are used to estimate the weights and the remaining data are employed to determine the GRS.
In a simulation study and a subsequent real data application, we compared the performance of this approach to weighted GRS with internal weights from the marginal genetic effects, called GRSmarginalinternal [12], and GRS with external weights for the detection of gene-environment interactions.
Our simulation study has shown that the power for detecting GxE interactions reached by applying the GRSinteractiontraining approach was only slightly lower than the power achieved by weighted GRS with external weights from the marginal genetic effects estimated in an independent study population that fits our own study sample perfectly. If, however, the external data did not fit our own study sample perfectly or the sample size of the external data was not larger than our own sample size, the power was higher when using the GRSinteractiontraining approach.
The sample size of the test data in the GRSinteractiontraining approach is only half of the sample size from the GRSmarginalinternal approach, because in the GRSinteractiontraining approach half of the data is used to determine the weights and the remaining test data to calculate the GRS and to estimate the interaction. Nevertheless, if there were no external weights available and the underlying GxE interaction effect was larger than the marginal genetic effect, the highest power was reached with the GRSinteractiontraining approach. If the underlying marginal genetic effect was substantially larger than the GxE interaction effect, the GRSmarginalinternal approach was more appropriate.
GRSinteractiontraining approach – Balance between training vs. test data
Motivated by the idea that the interaction itself might be more suitable to estimate the weights than the marginal genetic effect, we divided each of our datasets into a training and a test dataset and used the interaction estimates from the training data as weights for the GRS in the test data. Dudbridge (2013) evaluated a similar approach for the detection of marginal genetic effects and reported that the optimal balance of sample sizes between training and test datasets is close to one-half regardless of the proportion of noise SNPs or the p-value threshold [3]. In our study, this recommendation held for scenarios with many noise SNPs (e.g., 6 risk SNPs and 200 noise SNPs) and the balance was roughly symmetrical around one-half, which is also in line with [3]. However, in contrast to Dudbridge (2013), with a decreasing number of noise SNPs (down to only 6), a higher power was achieved by increasing the size of the test data relative to the size of the training data. This finding was confirmed in our real data application with only two noise SNPs and one risk SNP, as a lower p-value was achieved when using more test data than training data. Nevertheless, since we usually consider a large number of noise SNPs in most gene-environment interaction studies, we generally support Dudbridge’s rule of thumb to make an even split between training and test data for GxE interaction studies.
Internal vs. external weights
Our simulation study has confirmed that the gold standard for the construction of GRS is to use external weights, e.g., from the marginal genetic effects estimated in independent study populations, if the external data fit the study sample very well. This strong assumption means that the marginal genetic associations in the external data are the same as in our own study sample. This might, but need not, be achieved if the phenotype is assessed in exactly the same way and there is no ethnic or age difference between the study populations. In real data analyses, these assumptions are often not fulfilled because large-scale GWAS are not published for every phenotype and sometimes only in populations with different ethnicity, sex or age range.
A violation of these assumptions might reduce the power to detect interaction effects when using GRS with external weights. Therefore, in the practical analysis of real data, internal weights from the study population itself might often be a more powerful alternative for detecting GxE interactions.
However, in our real data application, the power reached by GRS with external weights was similar to the power reached by the two approaches with internal weights. One reason for that might be that our study sample (GINIplus) was included in the estimation of the “external” effects. Therefore, the effect estimates from the pooled analysis might fit the GINIplus data slightly better than they would have if the GINIplus data had not been part of the pooled analysis. Furthermore, a limitation of the GRS-interaction-training approach is that the GRSxE interaction term can only be estimated in a subset (i.e. the test data) of the original sample data, which reduces the power to detect interactions.
A major limitation of GRS with external weights is that we can only include SNPs for which the marginal genetic effects have been published. In this regard, GRS with external weights are usually restricted to SNPs with a genome-wide significant (p-value < 5 × 10^{−8}) marginal genetic effect in the external study population, whereas SNPs with a predominant interaction effect are usually not reported. For GxE interaction studies, this leads to a publication bias towards SNPs with predominant marginal genetic effects. To avoid this publication bias and to increase the power for detecting GxE interactions, estimates from genome-wide gene-environment interaction studies might be used. However, up to now, very few genome-wide gene-environment interaction studies have been published because of the limited power to detect interactions in genome-wide analyses.
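To make the selection step concrete, the following sketch builds a GRS with external weights restricted to genome-wide significant SNPs. The summary statistics, SNP identifiers and function names are hypothetical and serve only as an illustration.

```python
GENOME_WIDE_P = 5e-8

# Hypothetical published marginal summary statistics: SNP id -> (beta, p-value).
external_stats = {
    "snp_a": (0.12, 3e-10),
    "snp_b": (0.08, 1e-9),
    "snp_c": (0.01, 0.20),  # weak marginal effect; could still interact with the exposure
}


def external_weights(stats, threshold=GENOME_WIDE_P):
    """Keep only SNPs whose marginal p-value is below the threshold."""
    return {snp: beta for snp, (beta, p) in stats.items() if p < threshold}


def weighted_grs(genotype, weights):
    """genotype maps SNP id -> risk-allele count (0, 1 or 2)."""
    return sum(w * genotype[snp] for snp, w in weights.items())


weights = external_weights(external_stats)
person = {"snp_a": 2, "snp_b": 1, "snp_c": 2}
score = weighted_grs(person, weights)
print(sorted(weights), score)
```

In this toy example `snp_c` never enters the score, whatever its interaction effect, because its marginal p-value does not pass the threshold.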
From a biological perspective, a pathway-oriented GxE interaction analysis might be a more powerful and biologically plausible alternative to genome-wide approaches. For example, we recently showed in a study population of 402 women that genetic variation in the ER stress pathway might play a role in air pollution-induced inflammation in the lung, using the GRS approach with internal weights from the marginal genetic effects, although there was no significant marginal genetic effect at the individual SNP level [7].
GRS-interaction-training vs. GRS-marginal-internal
In scenarios with a predominant interaction effect, i.e. an interaction effect that is (substantially) larger than the marginal genetic effect, the GRS-interaction-training approach was more powerful than the GRS-marginal-internal approach, particularly in the presence of noise SNPs. Furthermore, applying the GRS-marginal-internal approach in scenarios with predominant interaction effects might lead to a high number of sign-misspecifications when the MAFs of the risk SNPs are ≥0.2 and in the presence of noise.
However, in scenarios with a predominant marginal genetic effect and a smaller additional interaction effect, the GRS-marginal-internal approach achieved a slightly higher power than the GRS-interaction-training approach, with an acceptable number of sign-misspecifications.
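For comparison, the GRS-marginal-internal idea can be sketched as follows. Again this is only an illustration under assumptions that are not taken from the paper: a continuous outcome, ordinary least squares, weights from a joint regression of the outcome on all SNPs and the exposure, and simulated data with a predominant marginal genetic effect. All function names are hypothetical.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit; returns coefficients and their standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se


def grs_marginal_internal(snps, env, y):
    """Weights from marginal SNP effects in the full sample, then GRSxE in the same sample."""
    ones = np.ones(len(y))
    # Marginal model without interaction terms: y ~ 1 + E + all SNPs.
    beta, _ = fit_ols(np.column_stack([ones, env, snps]), y)
    weights = beta[2:]
    grs = snps @ weights
    # Interaction model on the same observations: y ~ 1 + GRS + E + GRS*E.
    X = np.column_stack([ones, grs, env, grs * env])
    beta, se = fit_ols(X, y)
    return weights, beta[3], beta[3] / se[3]


# Simulated example (assumption): marginal effect 0.3 per risk allele, interaction 0.15.
rng = np.random.default_rng(2)
n, n_risk, n_noise = 4000, 3, 5
snps = rng.binomial(2, 0.3, size=(n, n_risk + n_noise)).astype(float)
env = rng.normal(size=n)
risk_count = snps[:, :n_risk].sum(axis=1)
y = 0.3 * risk_count + 0.2 * env + 0.15 * risk_count * env + rng.normal(size=n)

weights, gxe_estimate, gxe_t = grs_marginal_internal(snps, env, y)
print(weights.round(2), round(gxe_estimate, 2), round(gxe_t, 1))
```

Unlike the training/test variant, all observations contribute to the GRSxE test here. The naive standard error in this sketch ignores the uncertainty of the internally estimated weights.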
In real data applications, the decision whether the interaction or the marginal genetic effect is predominant should be made a priori and be based on biological knowledge. If the SNPs were chosen because the underlying genes had been identified to be marginally associated with the same or a related phenotype (e.g. in a large-scale genome-wide meta-analysis), independently of the environmental exposure, the weights should be determined from the marginal genetic effects (GRS-marginal-internal approach). If, however, the SNPs were chosen because of their potential impact on the biological mechanisms mediating the association between the environmental exposure and disease development, the weights should be determined from the interaction term (GRS-interaction-training approach). This knowledge might be based either on mechanistic studies or on epigenome-wide association studies (EWAS). EWAS present differentially methylated probes (DMPs) and regions (DMRs) in relation to disease outcomes (e.g. [26] for lung function). Since EWAS identify regions that are modified by environmental factors, they might provide a good preselection of genetic regions to be considered in GxE interaction studies.
In the TAG study, for example, the considered SNPs were chosen because the associated biological mechanisms were thought to underlie both the toxicity of traffic-related air pollution and the development of asthma [27]. This was confirmed by our analysis, in which the GRS-marginal-internal approach reached almost the same power as the GRS-interaction-training approach.
Strengths and limitations
Our study has several strengths. To our knowledge, this is the first study presenting GRS with weights from the interaction term itself and comparing GRS with internal vs. external weights for the detection of gene-environment interactions. Furthermore, this is the first study comparing interaction approaches in scenarios with predominant interaction vs. predominant marginal genetic effects, a differentiation that is often ignored in real data practice but which was shown to have a major impact on the selection of the most powerful analytic strategy. A further strength is that we analyzed the performance of the GRS approaches in the presence of noise and SNPs with different MAFs to cover several data structures common in GxE interaction studies.
A few limitations and outstanding issues should be noted. In our simulation study, we compared the performance of GRS with internal and external weights in quite simple scenarios, which might not cover all types of interaction models. We did not include different modes of inheritance, genegene or other more complex interactions in these scenarios. Such considerations might be beneficial to further optimize the weighted GRS for other scenarios.
Moreover, a comparison of the considered GRS approaches with other state-of-the-art interaction approaches might be interesting. However, as Aschard recently showed, the use of GRS can increase the power to detect GxE interactions in comparison to common univariate single-variant approaches and the joint test of main genetic and interaction effects [9, 10]. We additionally compared our GRS approaches with a multiple logistic lasso regression, with p-values estimated using the significance test for the lasso [28]. The results of this comparison, presented in Additional file 1, show that our GRS approaches outperform the lasso regression in the considered scenarios.
Furthermore, there is room for improvement regarding the decision-making process between a predominant interaction effect and a predominant marginal genetic effect, because detailed a priori knowledge about the biological pathways is often limited. One possibility to improve the a priori knowledge might be to use information from EWAS. The growing field of epigenetics might clarify many of the biological pathways through which environmental exposures induce health problems and thereby improve the selection of candidate SNPs for pathway-based GxE interaction studies. A possibility to improve the GRS approaches might be to combine the GRS-marginal-internal approach and the GRS-interaction-training approach to reach a good power for the detection of interactions in scenarios with predominant marginal genetic effects as well as in scenarios with predominant interaction effects.
Our real data application has the limitation that we could only include the three SNPs for which we had previous knowledge about the marginal genetic and interaction effects from a large pooled analysis [23]. However, this is often a limitation in daily practice as well, since external weights are often limited to, e.g., genome-wide significant SNPs because other effect estimates are often not reported. Furthermore, since GINIplus (N = 593) was part of the TAG consortium (N = 4465), the weights from the pooled marginal genetic effects were not independent of our sample data. However, this problem also often occurs in real data practice because large-scale genome-wide meta-analyses often include all study populations that are available for the considered phenotype and thereby often include one’s own study sample as well.
Conclusion
In conclusion, when no appropriate external weights are available (due to, e.g., ethnic differences or differences in the phenotype assessment), we recommend using internal weights from the study population itself to construct weighted GRS for GxE interaction studies. If the SNPs were chosen because a marginal genetic effect was hypothesized, the weights should be estimated from the marginal genetic effects (GRS-marginal-internal approach). If the SNPs were chosen because of their potential impact on the biological mechanisms mediating the association between the environmental exposure and disease development, the weights should be estimated from the interaction term itself in a training dataset (GRS-interaction-training approach).
Abbreviations
EWAS: Epigenome-wide association study
GINIplus: German Infant Study on the influence of Nutritional Intervention plus environmental and genetic influences on allergy development
GLM: Generalized linear model
GRS: Genetic risk score
GRSxE interaction: Interaction between GRS and environmental exposure
GWAS: Genome-wide association study
GxE interaction: Gene-environment interaction
MAF: Minor allele frequency
NO_{2}: Nitrogen dioxide
PM_{2.5}: Particulate matter ≤2.5 μm
SNP: Single nucleotide polymorphism
TAG Study: Traffic, Asthma and Genetics Study
References
 1.
Ottman R. Gene–environment interaction: definitions and study designs. Prev Med (Baltim). 1996;25:764–70.
 2.
Dudbridge F. Polygenic epidemiology. Genet Epidemiol. 2016;40:268–72.
 3.
Dudbridge F. Power and predictive accuracy of polygenic risk scores. PLoS Genet. 2013;9:e1003348.
 4.
Purcell SM, Wray NR, Stone JL, Visscher PM, O’Donovan MC, Sullivan PF, et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009;460:748–52.
 5.
Hamshere ML, O’Donovan MC, Jones IR, Jones L, Kirov G, Green EK, et al. Polygenic dissection of the bipolar phenotype. Br J Psychiatry. 2011;198:284–8.
 6.
Eze IC, Imboden M, Kumar A, von Eckardstein A, Stolz D, Gerbase MW, et al. Air pollution and diabetes association: modification by type 2 diabetes genetic risk score. Environ Int. 2016;94:263–71.
 7.
Hüls A, Krämer U, Herder C, Fehsel K, Luckhaus C, Stolz S, et al. Genetic susceptibility for air pollution-induced airway inflammation in the SALIA study. Environ Res. 2017;152:43–50.
 8.
Qi Q, Chu AY, Kang JH, Huang J, Rose LM, Jensen MK, et al. Fried food consumption, genetic risk, and body mass index: gene-diet interaction analysis in three US cohort studies. BMJ. 2014;348:g1610.
 9.
Aschard H. A perspective on interaction effects in genetic association studies. Genet Epidemiol. 2016;40:678–88.
 10.
Kraft P, Yen YC, Stram DO, Morrison J, Gauderman WJ. Exploiting gene-environment interaction to detect genetic associations. Hum Hered. 2007;63:111–9.
 11.
Che R, Motsinger-Reif AA. Evaluation of genetic risk score models in the presence of interaction and linkage disequilibrium. Front Genet. 2013;4:1–10.
 12.
Hüls A, Ickstadt K, Schikowski T, Krämer U. Detection of gene-environment interactions in the presence of linkage disequilibrium and noise by using genetic risk scores with internal weights from elastic net regression. BMC Genet. 2017;18:55.
 13.
Zou H, Hastie T. Regularization and variable selection via the elastic net. J R Stat Soc B. 2005;67:301–20.
 14.
Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2009;33:1–22.
 15.
Waldmann P, Mészáros G, Gredler B, Fuerst C, Sölkner J. Evaluation of the lasso and the elastic net in genome-wide association studies. Front Genet. 2013;4:1–11.
 16.
Simon N, Friedman J, Hastie T, Tibshirani R. Regularization paths for Cox’s proportional hazards model via coordinate descent. J Stat Softw. 2011;39:1–13.
 17.
Burgess S, Dudbridge F, Thompson SG. Combining information on multiple instrumental variables in Mendelian randomization: comparison of allele score and summarized data methods. Stat Med. 2016;35:1880–906.
 18.
Burgess S, Thompson SG. Use of allele scores as instrumental variables for Mendelian randomization. Int J Epidemiol. 2013;42:1134–44.
 19.
McCullagh P, Nelder JA. Generalized linear models. 2nd ed. London: Chapman and Hall; 1989.
 20.
Nelder JA, Wedderburn RWM. Generalized linear models. J R Stat Soc A. 1972;135:370–84.
 21.
Schwender H, Fritsch A. scrime: Analysis of High-Dimensional Categorical Data such as SNP Data. R package version 1.3.3. 2013.
 22.
R Development Core Team. R: a language and environment for statistical computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing; 2017. Available from: http://www.r-project.org/
 23.
MacIntyre EA, Brauer M, Melén E, Bauer CP, Bauer M, Berdel D, et al. GSTP1 and TNF gene variants and associations between air pollution and incident childhood asthma: the traffic, asthma and genetics (TAG) study. Environ Health Perspect. 2014;122:418–24.
 24.
MacIntyre EA, Carlsten C, MacNutt M, Fuertes E, Melén E, Tiesler CMT, et al. Traffic, asthma and genetics: combining international birth cohort data to examine genetics as a mediator of traffic-related air pollution’s impact on childhood asthma. Eur J Epidemiol. 2013;28:597–606.
 25.
Fuertes E, Brauer M, MacIntyre E, Bauer M, Bellander T, Von Berg A, et al. Childhood allergic rhinitis, traffic-related air pollution, and variability in the GSTP1, TNF, TLR2, and TLR4 genes: results from the TAG study. J Allergy Clin Immunol. 2013;132:342–52.
 26.
Lee M, Hong Y, Kim W, London S. Epigenome-wide association study of chronic obstructive pulmonary disease and lung function in Koreans. Epigenomics. 2017;9:971–84.
 27.
Kelly FJ. Oxidative stress: its role in air pollution and adverse health effects. Occup Environ Med. 2003;60:612–6.
 28.
Lockhart R, Taylor J, Tibshirani RJ, Tibshirani RA. Significance test for the lasso. Ann Stat. 2014;42:413–68.
Acknowledgements
We would like to thank all members of the TAG consortium for providing us with the data for the real data application: Allan Becker, Andrew Sandford, Andrea von Berg, Anita L. Kozyrskyj, Anna Bergström, Anna Gref, Barbara Hoffmann, Beate Schaaf, Bert Brunekreef, Carl Peter Bauer, Carla M. T. Tiesler, Cilla Söderhäll, Claudia Klümper, Dietrich Berdel, Dirkje S. Postma, Elaina A MacIntyre, Elaine Fuertes, Elisabeth Thiering, Eric Melén, F. Nicole Dijk, Gerard H. Koppelman, Göran Pershagen, Inger Kull, Joachim Heinrich, Juha Kere, Marie Standl, Mario Bauer, Marit Westman, Marjan Kerkhof, Meaghan Macnutt, Melanie Waldenberger, Michael Brauer, Moira Chan-Yeung, Nathalie Acevedo, Olf Herbarth, Sibylle Koletzko, Tom Bellander, Ulrike Gehring.
Funding
This project was part of AH’s PhD thesis at the Faculty of Statistics, TU Dortmund University and was funded by the IUF-Leibniz Research Institute for Environmental Medicine, Düsseldorf. This work was also supported by the Deutsche Forschungsgemeinschaft (grant SCHW 1508/3–1 to HS). We further acknowledge financial support by the Deutsche Forschungsgemeinschaft and TU Dortmund University within the funding programme Open Access Publishing.
Availability of data and materials
All data generated within the simulation study can be made available to readers upon request.
Author information
Contributions
AH, HS, KI and UK conceived and designed the simulation study. AH, UK and TS (PI of the GINIplus study) and CC (PI of the TAG consortium) contributed to the study design of the real data application. AH performed the simulation study and real data application and was the major contributor in writing the manuscript. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Anke Hüls.
Ethics declarations
Ethics approval and consent to participate
The GINIplus study was approved by the relevant ethics committees (Ethikkommission der Ärztekammer Nordrhein and Ethikkommission der Bayerischen Landesärztekammer) with written informed consent obtained from the parents of all participants.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional file
Additional file 1: Table S1. Simulated predominant interaction effects (example data for N = 3000). Table S2. Simulated marginal genetic effects (example data for N = 3000). Comparison of GRS approaches and lasso regression. Figure S1. Power and sign-misspecifications comparison. Figure S2. Type I error comparison. (DOCX 405 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Keywords
 Polygenic approach
 Training dataset
 Internal weights
 External weights
 Simulation study
 Power
 Type I error