# Strategy and model building in the fourth dimension: a null model for genotype × age interaction as a Gaussian stationary stochastic process

Vincent P Diego^{1, 2}, Laura Almasy^{1}, Thomas D Dyer^{1}, Júlia MP Soler^{3} and John Blangero^{1}

**4(Suppl 1)**:S34

https://doi.org/10.1186/1471-2156-4-S1-S34

© Diego et al; licensee BioMed Central Ltd 2003

**Published:** 31 December 2003

## Abstract

### Background

Using univariate and multivariate variance components linkage analysis methods, we studied possible genotype × age interaction in cardiovascular phenotypes related to the aging process from the Framingham Heart Study.

### Results

We found evidence for genotype × age interaction for fasting glucose and systolic blood pressure.

### Conclusions

There is polygenic genotype × age interaction for fasting glucose and systolic blood pressure and quantitative trait locus × age interaction for a linkage signal for systolic blood pressure phenotypes located on chromosome 17 at 67 cM.

## Background

The Framingham Heart Study (FHS) [1] offers a unique opportunity to assess possible genotype × age (G × age) interaction in the genetic architecture of complex traits. To study G × age interaction in the FHS, we built on the variance components model [2], which, assuming negligible dominance, may be given as:

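$$\Sigma = \hat{\Pi}\sigma_q^2 + 2\Phi\sigma_g^2 + \mathbf{I}\sigma_e^2 \qquad (1)$$

where $\Sigma$ is the phenotypic covariance matrix for a pedigree, $\hat{\Pi}$ is the matrix of estimated proportions of alleles shared identical by descent (IBD) at a quantitative trait locus (QTL), $\Phi$ is the kinship matrix, **I** is the identity matrix, and $\sigma_q^2$, $\sigma_g^2$, and $\sigma_e^2$ are variances of the additive QTL, polygenic and environmental factors, respectively.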
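As a small illustration of how the three components combine, the following sketch assembles the covariance matrix for a single sib pair, assuming the standard form Σ = Π̂σ²_q + 2Φσ²_g + Iσ²_e. The function name `vc_covariance` and all numerical values are hypothetical and are not taken from the FHS analyses.

```python
def vc_covariance(pi_hat, phi, var_q, var_g, var_e):
    """Pedigree covariance matrix: Pi_hat*var_q + 2*Phi*var_g + I*var_e."""
    n = len(phi)
    return [
        [
            pi_hat[i][j] * var_q            # QTL component, weighted by IBD sharing
            + 2.0 * phi[i][j] * var_g       # polygenic component, weighted by 2 x kinship
            + (var_e if i == j else 0.0)    # environmental component, diagonal only
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical non-inbred sib pair: kinship 1/2 with self, 1/4 between sibs;
# assumed estimated IBD sharing at the QTL of 0.5 between the sibs.
phi = [[0.5, 0.25], [0.25, 0.5]]
pi_hat = [[1.0, 0.5], [0.5, 1.0]]
sigma = vc_covariance(pi_hat, phi, var_q=0.2, var_g=0.3, var_e=0.5)
```

The diagonal of `sigma` is the total phenotypic variance, and the off-diagonal element is the expected covariance between the sibs.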

Our study was carried out in two phases. In Phase 1, we made inferences on polygenic G × age interaction using what we call the G × age model [3], and we experimented with correlation functions in that model. In Phase 2, we implemented the QTL version of this model. Multivariate variance components linkage analysis [4] can also be used to detect G × age interaction, and we show how the bivariate variance components model can be used to detect QTL × age interaction. Given that the G × age and bivariate models apply to cross-sectional and longitudinal data, respectively, we compare their relative abilities to detect G × age interaction.

## Models and Methods

Using the bivariate mixed model [5], Blangero [6] derived the following expression for the genetic variance in phenotype response to two different environments:

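$$\sigma_\Delta^2 = \sigma_{g1}^2 + \sigma_{g2}^2 - 2\rho_G\sigma_{g1}\sigma_{g2} \qquad (2)$$

where $\sigma_{g1}^2$ and $\sigma_{g2}^2$ are the genetic variances in the two environments and $\rho_G$ is the genetic correlation between environments, in this case ages. The term environment, denoted by E, is used in accordance with standard statistical genetic theory [6]. There is no G × E interaction (i.e., $\sigma_\Delta^2 = 0$) if $\sigma_{g1}^2 = \sigma_{g2}^2 = \sigma_g^2$ and $\rho_G = 1$ [6]. To test $\sigma_\Delta^2 = 0$, $\sigma_g^2$ and $\rho_G$ are parameterized as functions of age, which is the environment of interest in our analyses:

$$\sigma_g^2 = \exp[\alpha + \beta(\text{age} - \overline{\text{age}})] \qquad (3)$$

$$\rho_G = \exp(-\lambda|m - n|) \qquad (4)$$

where the variance is modeled as an exponential function [7] and $\rho_G$ as exponential decay [8], α, β, and λ are parameters to be estimated, and m and n are two ages. Thus, the null hypotheses are β = 0 and λ = 0, which reflect a stationary covariance function.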
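The link between the two null hypotheses and the absence of interaction can be checked numerically. The sketch below assumes the genetic variance at a given age is exp[α + β(age − mean age)] and the genetic correlation between ages m and n is exp(−λ|m − n|). The function names and parameter values are illustrative assumptions, not estimates from this study.

```python
import math


def genetic_variance(age, alpha, beta, mean_age):
    """Exponential variance function of age (assumed form)."""
    return math.exp(alpha + beta * (age - mean_age))


def genetic_correlation(m, n, lam):
    """Exponential decay of the genetic correlation with age difference."""
    return math.exp(-lam * abs(m - n))


def response_variance(m, n, alpha, beta, lam, mean_age):
    """Genetic variance in response to two environments (here, ages m and n)."""
    var_m = genetic_variance(m, alpha, beta, mean_age)
    var_n = genetic_variance(n, alpha, beta, mean_age)
    rho = genetic_correlation(m, n, lam)
    return var_m + var_n - 2.0 * rho * math.sqrt(var_m * var_n)


# Under the null (beta = 0 and lambda = 0) the response variance vanishes.
null_case = response_variance(40, 60, alpha=0.1, beta=0.0, lam=0.0, mean_age=50)
# Either a non-zero beta or a non-zero lambda makes it positive.
beta_case = response_variance(40, 60, alpha=0.1, beta=0.02, lam=0.0, mean_age=50)
lam_case = response_variance(40, 60, alpha=0.1, beta=0.0, lam=0.01, mean_age=50)
```

With β = 0 the variance is the same at every age, and with λ = 0 the correlation is 1 at every age difference, which is why the two constraints together describe a stationary process.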

In our analyses, we used the expected covariance matrix for a given pedigree [3], which has elements giving the covariance in trait value *y* for any two relatives:

where x and z index individuals at ages m and n, respectively, Φ_{xz} gives their kinship status, π is the probability that a random allele is IBD at the QTL, δ_{xz} is 1 for x = z and 0 otherwise, subscripts g, q, and e denote the polygenic, QTL, and environmental components, respectively, and average age is taken over the sample population, defined as all individuals measured for the trait of interest. Some points of clarification are needed. This is a cross-sectional model that applies generally to three types of pair-wise comparisons of individuals. In the first type, x = z such that *m* = *n*; equation (5) then gives the variances, in accord with the standard variance components model. In the second type, x ≠ z while m = n, and in the third type, x ≠ z while m ≠ n. None of these types is a longitudinal comparison, which would be the case where x = z while m ≠ n (i.e., the same individual is measured at different ages). The goal of this approach is to estimate the change parameters, namely β and λ, so that we can test the null hypotheses stated above.
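To make the three pair-wise cases concrete, here is a minimal sketch of one covariance element. It rests on assumptions: each component's standard deviation is taken as the square root of an exponential variance function of age, the polygenic and QTL terms are damped by an exponential decay in |m − n|, and the environmental term contributes only when x = z. Whether a separate decay parameter is estimated for the QTL component is not stated in the text, so `lam_q` is a hypothetical parameter, as are all names and values below.

```python
import math


def sd_at_age(age, alpha, beta, mean_age):
    """Standard deviation implied by an exponential variance function (assumed)."""
    return math.sqrt(math.exp(alpha + beta * (age - mean_age)))


def decay(m, n, lam):
    """Exponential decay in correlation with the age difference |m - n|."""
    return math.exp(-lam * abs(m - n))


def expected_cov(m, n, phi_xz, pi_xz, same_person, p, mean_age):
    """Assumed covariance in trait value for individuals x (age m) and z (age n)."""
    def sd_pair(comp):
        a, b = p["alpha_" + comp], p["beta_" + comp]
        return sd_at_age(m, a, b, mean_age) * sd_at_age(n, a, b, mean_age)

    polygenic = 2.0 * phi_xz * decay(m, n, p["lam_g"]) * sd_pair("g")
    qtl = pi_xz * decay(m, n, p["lam_q"]) * sd_pair("q")
    environmental = sd_pair("e") if same_person else 0.0  # delta_xz term
    return polygenic + qtl + environmental


# Hypothetical null-model parameters: no change with age (beta = 0, lambda = 0).
null_params = {
    "alpha_g": math.log(0.3), "beta_g": 0.0, "lam_g": 0.0,
    "alpha_q": math.log(0.2), "beta_q": 0.0, "lam_q": 0.0,
    "alpha_e": math.log(0.5), "beta_e": 0.0,
}
# Type 1: x = z, m = n (a variance). Type 3: sibs measured at different ages.
variance = expected_cov(50, 50, 0.5, 1.0, True, null_params, mean_age=50)
sib_cov = expected_cov(30, 70, 0.25, 0.5, False, null_params, mean_age=50)
```

Under these null parameters the sib covariance does not depend on the ages at measurement, which is the stationary case; non-zero β or λ values make it age-dependent.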

For the bivariate model, a trait measured at two different time points is treated as a bivariate phenotype. The pedigree covariance matrix model for a bivariate phenotype K, with constituent phenotypes I and J, may be written as [4]:

$$\Sigma_K = \mathbf{G} \otimes 2\Phi + \mathbf{Q} \otimes \hat{\Pi} + \mathbf{E} \otimes \mathbf{I} \qquad (6)$$

where the new matrices **G**, **Q**, and **E** convey the polygenic, QTL, and environmental variance components, respectively, $\Phi$ is the kinship matrix, $\hat{\Pi}$ is the matrix of estimated IBD sharing at the QTL, **I** is the identity matrix, and $\otimes$ is the Kronecker-product operator. The variance components for this model are those for the univariate model for traits I and J, given by Σ_{I} and Σ_{J} (cf. equation (1)), and the trait cross-covariances, which may be parameterized as ρ_{gij}σ_{gi}σ_{gj}, ρ_{qij}σ_{qi}σ_{qj}, and ρ_{eij}σ_{ei}σ_{ej} for the polygenic, QTL, and environmental components, respectively [4]. The null hypotheses of no polygenic genotype × age interaction and no QTL × age interaction are expressed as ρ_{gij} = 1 and ρ_{qij} = 1, respectively.
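The Kronecker structure of the bivariate model can be sketched in a few lines. The example assumes the covariance matrix has the form G ⊗ 2Φ + Q ⊗ Π̂ + E ⊗ I, with each 2 × 2 component matrix holding the two trait variances on the diagonal and the cross-covariance ρσ_iσ_j off the diagonal. The helper names and the sib-pair values are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


def component(sd_i, sd_j, rho):
    """2 x 2 matrix of variances and cross-covariance for traits I and J."""
    cross = rho * sd_i * sd_j
    return [[sd_i ** 2, cross], [cross, sd_j ** 2]]


def mat_sum(*mats):
    """Element-wise sum of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*mats)]


def bivariate_cov(g, q, e, phi, pi_hat):
    """Assumed bivariate pedigree covariance: G (x) 2Phi + Q (x) Pi_hat + E (x) I."""
    n = len(phi)
    two_phi = [[2.0 * v for v in row] for row in phi]
    identity = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    return mat_sum(kron(g, two_phi), kron(q, pi_hat), kron(e, identity))


# Hypothetical sib pair measured at two exams (traits I and J).
phi = [[0.5, 0.25], [0.25, 0.5]]
pi_hat = [[1.0, 0.5], [0.5, 1.0]]
g = component(0.6, 0.7, rho=1.0)   # rho_g = 1: no polygenic G x age interaction
q = component(0.4, 0.4, rho=1.0)   # rho_q = 1: no QTL x age interaction
e = component(0.7, 0.7, rho=0.3)
sigma_k = bivariate_cov(g, q, e, phi, pi_hat)
```

Rows and columns of `sigma_k` are ordered by trait and then by individual, so the upper-left 2 × 2 block is the univariate covariance matrix for trait I and the upper-right block holds the cross-covariances between traits I and J.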

Scatter plots were examined for traits showing increasing or decreasing variance in trait values with age, which suggests potential for G × age interaction. Traits meeting this criterion, namely systolic blood pressure (SBP) and fasting glucose (GLUC), were analyzed using SOLAR [2], with age, sex, hypertension medication, and body mass index (BMI) as covariates. Phase 1 analyses were conducted on an augmented data set: Exams 12 (1970), 16 (1978), 18 (1982), and 20 (1986) in Cohort 1 were combined with Exams 1 (1971), 2 (1979), 3 (1983), and 4 (1987) in Cohort 2, respectively, giving combined measurement periods 1–4. To reduce kurtosis after combining, a few outliers were removed for SBP (1–6 per exam, all above the mean) and many more for GLUC (51–101 per exam, all but one above the mean). No data transformation was made. Diabetic status was not available, so we could not control for it.

Based on genome scans that we conducted for both SBP and GLUC, we decided to focus on SBP. For Phase 2, data for SBP values corrected for hypertension treatment for Cohorts 1 and 2, imputed following Levy et al. [9] (see Soler and Blangero [10]), were analyzed. Bivariate analyses were performed on residuals from multiple regressions with corrected SBP values as the dependent variable and age, sex and BMI as independent variables. For the bivariate analyses, analysis of residuals was necessary due to the unbalanced and incomplete structure of the FHS longitudinal data, which made covariate specification unrepresentative across measurements. For this SBP data set (corrected values and residuals), we experimented with combined exams from Cohorts 1 and 2 and with exams from Cohort 2 taken separately.

## Results

### Phase 1

The *p*-values reported in Figure 1 and Tables 1 and 2 are for the likelihood ratio statistic $\Lambda = -2[\ln L(\theta_N) - \ln L(\theta_A)]$, where the null hypothesis, H_{N} (parameter constrained to 0 or 1 as appropriate), is compared to the alternative, H_{A} (parameter estimated). In general, Λ is distributed as a χ^{2} random variable with degrees of freedom (d.f.) equal to the difference in the number of parameters under the null and alternative hypotheses [11]. If parameter values are constrained to a boundary under the null hypothesis, the asymptotic distribution of Λ is given by a mixture of $\chi^2_n$ random variables, where n denotes the d.f. and where the mixture may include n = 0 (point mass at 0) [11]. However, the traditional criterion (i.e., difference in parameters) is conservative [3].
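The difference between the traditional criterion and a boundary mixture can be illustrated with the log-likelihoods reported in Table 1 for the model with λ constrained (Conλ, -6538.801) and the polygenic × age model (-6537.371). The sketch assumes a single parameter on the boundary, for which the mixture is an equal-weight combination of a point mass at 0 and a 1-d.f. chi-square; the function names are hypothetical.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic: -2 [ln L(null) - ln L(alternative)]."""
    return -2.0 * (lnl_null - lnl_alt)


def chi2_1_sf(x):
    """Upper-tail probability of a chi-square variable with 1 d.f."""
    return math.erfc(math.sqrt(x / 2.0))


def mixture_p(x):
    """p-value under a 50:50 mixture of a point mass at 0 and chi-square(1)."""
    return 0.5 * chi2_1_sf(x) if x > 0 else 1.0


stat = lrt_statistic(-6538.801, -6537.371)
p_traditional = chi2_1_sf(stat)   # 1-d.f. chi-square criterion
p_mixture = mixture_p(stat)       # boundary-adjusted criterion (assumed mixture)
```

The traditional value is close to the tabulated 0.09078, and the mixture value is half of it, which shows in what sense the traditional criterion is conservative.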

Table 1. Univariate Analyses for Combined Cohort 1, Exam 20 and Cohort 2, Exam 4

Models | Ln Likelihood | AIC | Evidence Ratio | *p*-value | Likelihood Ratio Test |
---|---|---|---|---|---|
1. Polygenic (2) | -6714.557 | 13433.11 | 5.7 × 10^{77} | 1.35 × 10^{-5} | 1 vs. 6 |
2. P × age (5) | -6537.371 | 13084.74 | 128.8433 | 1.69 × 10^{-76} | 1 vs. 2 |
3. Conα G (4) | -6583.193 | 13174.39 | 3.77 × 10^{21} | 1.04 × 10^{-21} | 3 vs. 2 |
4. Conβ G (4) | -6542.711 | 13093.42 | 9888.592 | 0.00108 | 4 vs. 2 |
5. Conλ (4) | -6538.801 | 13085.6 | 198.1198 | 0.09078 | 5 vs. 2 |
6. QTL (3) | -6705.086 | 13416.17 | 1.2 × 10^{74} | 2.68 × 10^{-74} | 6 vs. 7 |
7. QTL × age (7) | -6530.512 | 13075.02 | 1 | 0.00105 | 2 vs. 7 |
8. Conα QTL (6) | -6534.916 | 13081.83 | 30.09494 | 0.00300 | 8 vs. 7 |
9. Conβ QTL (6) | -6533.796 | 13079.59 | 9.819076 | 0.01038 | 9 vs. 7 |
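The AIC and evidence ratio columns can be recomputed from the log-likelihoods, which is a useful check when reading the table. The sketch below assumes AIC = -2 ln L + 2k, with k taken as the number in parentheses after each model name, and an evidence ratio of exp(ΔAIC/2) relative to the model with the lowest AIC [12]. The function names are hypothetical; the log-likelihoods are copied from the table.

```python
import math


def aic(lnl, k):
    """Akaike information criterion for log-likelihood lnl and k parameters."""
    return -2.0 * lnl + 2.0 * k


def evidence_ratio(aic_model, aic_best):
    """Evidence for the best model relative to a competing model."""
    return math.exp((aic_model - aic_best) / 2.0)


# (log-likelihood, number of parameters) for selected univariate models.
models = {
    "P x age": (-6537.371, 5),
    "QTL x age": (-6530.512, 7),
    "Con-beta QTL": (-6533.796, 6),
}
aics = {name: aic(lnl, k) for name, (lnl, k) in models.items()}
best = min(aics.values())
ratios = {name: evidence_ratio(value, best) for name, value in aics.items()}
```

Here `ratios["QTL x age"]` is 1 because that model has the lowest AIC, and `ratios["P x age"]` comes out near 129, in line with the tabulated 128.8433 (small differences reflect rounding in the published values).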

Table 2. Bivariate Analyses – Cohort 2, Exams 4 and 5 Residuals^{A}

Models | Ln Likelihood | AIC | Evidence Ratio | *p*-value | Likelihood Ratio Test |
---|---|---|---|---|---|
1. Biv-polyg (6) | -8940.4 | 17888.81 | 15248.48 | 0.00024 | 1 vs. 2 |
2. Biv-QTL (9) | -8930.77 | 17869.54 | 1 | ----- | ----- |
3. Conρ G (8) | -8940.4 | 17888.81 | 15248.48 | 1.14 × 10^{-5} | 3 vs. 2 |
4. Conρ Q (8) | -8931.25 | 17870.49 | 1.608439 | 0.32815 | 4 vs. 2 |

### Phase 2

## Conclusion

Our results suggest that there is polygenic G × age interaction for both GLUC and SBP (Fig. 1; Tables 1,2) and QTL × age interaction for the putative QTL for SBP phenotypes on chromosome 17 at 67 cM (linkage: Fig. 2; interaction: Table 1). To our knowledge, this is the first demonstration of QTL × age interaction in humans using linkage analysis methods.

The cross-sectional and longitudinal analyses do not give the same results. The difference may mean that cross-sectional data are better than longitudinal data at capturing G × age interaction. However, based on the received wisdom regarding the relative utility of cross-sectional and longitudinal analyses [13], this explanation does not seem tenable. Alternatively, that the bivariate model did not detect QTL × age interaction (Table 2) may simply be due to the loss in power with an overly parameterized model [12]. Another explanation is that the bivariate model is really operating on time rather than on age and so perhaps the brief time span between Cohort 2, Exams 4 and 5, was too short to capture an interaction effect. Yet another explanation of the difference between the cross-sectional and longitudinal analyses is that the latter was carried out on Cohort 2 only, which, taken by itself, yielded a smaller sample size. Lastly, there is the possibility of some mixture of the last three problems.

Given that equations (5) and (6) derive from the same underlying modeling framework – namely variance components – they can be conceptualized as a strategy for testing the null hypothesis that a given phenotype "translated along the time axis" is a Gaussian covariance stationary stochastic process [14]. This is now an established approach in statistical genetics [14], and is readily extended to the genetics of aging. It is our hope that the analyses herein contribute to a better understanding of the genetic architecture of the complex traits associated with the aging process and associated complex diseases (e.g., cardiovascular disease). We conclude that the G × age and bivariate models offer a feasible system for model building in the fourth dimension.

## Declarations

### Acknowledgments

We thank Drs. Jean W. MacCluer and Ravindranath Duggirala for critically reviewing the manuscript. This work is supported in part by NIH R01MH59490.

## References

1. Cupples LA, Yang Q, Demissie S, Copenhafer D, Levy D, for the Framingham Heart Study Investigators: Description of the Framingham Heart Study Data for Genetics Analysis Workshop 13. BMC Genetics. 2003, 4 (suppl 1): S2. 10.1186/1471-2156-4-S1-S2.
2. Almasy L, Blangero J: Multipoint quantitative-trait linkage analysis in general pedigrees. Am J Hum Genet. 1998, 62: 1198-1211. 10.1086/301844.
3. Almasy L, Towne B, Peterson C, Blangero J: Detecting genotype × age interaction. Genet Epidemiol. 2001, 21 (suppl 1): S819-S824.
4. Almasy L, Dyer TD, Blangero J: Bivariate quantitative trait linkage analysis: pleiotropy versus co-incident linkages. Genet Epidemiol. 1997, 14: 953-958. 10.1002/(SICI)1098-2272(1997)14:6<953::AID-GEPI65>3.0.CO;2-K.
5. Blangero J, Konigsberg LW: Multivariate segregation analysis using the mixed model. Genet Epidemiol. 1991, 8: 299-316. 10.1002/gepi.1370080503.
6. Blangero J: Statistical genetic approaches to human adaptability. Hum Biol. 1993, 65: 941-966.
7. Davidian M, Carroll RJ: Variance function estimation. J Am Statist Assoc. 1987, 82: 1079-1091. 10.2307/2289384.
8. Lange K: Cohabitation, convergence, and environmental covariances. Am J Med Genet. 1986, 24: 483-491. 10.1002/ajmg.1320240311.
9. Levy D, DeStefano AL, Larson MG, O'Donnell CJ, Lifton RP, Gavras H, Cupples LA, Myers RH: Evidence for a gene influencing blood pressure on chromosome 17: genome scan linkage results for longitudinal blood pressure phenotypes in subjects from the Framingham Heart Study. Hypertension. 2000, 36: 477-483.
10. Soler JMP, Blangero J: Longitudinal familial analysis of blood pressure involving parametric (co)variance functions. BMC Genetics. 2003, 4 (suppl 1): S87. 10.1186/1471-2156-4-S1-S87.
11. Self SG, Liang K-Y: Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J Am Statist Assoc. 1987, 82: 605-610. 10.2307/2289471.
12. Burnham KP, Anderson DR: Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. 2nd edition. New York, Springer-Verlag. 2002.
13. Ware JH: Linear models for the analysis of longitudinal studies. Am Statist. 1985, 39: 95-101. 10.2307/2682803.
14. Pletcher SD, Geyer CJ: The genetic analysis of age-dependent traits: modeling the character process. Genetics. 1999, 151: 825-835.

## Copyright

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.