Volume 6 Supplement 1
Multivariate linkage analysis using the electrophysiological phenotypes in the COGA alcoholism data
© Zhang et al; licensee BioMed Central Ltd 2005
Published: 30 December 2005
Multivariate linkage analysis using several correlated traits may provide greater statistical power to detect susceptibility genes in loci whose effects are too small to be detected in univariate analysis. In this analysis, we apply a new approach and perform a linkage analysis of several electrophysiological phenotypes of the Collaborative Study on the Genetics of Alcoholism data of the Genetic Analysis Workshop 14. Our approach is based on a variance-component model to map candidate genes using repeated or longitudinal measurements. It can take into account covariate effects and time-dependent genetic effects in general pedigree data. We compare our results with those obtained by SOLAR using single-measurement data. Our multivariate linkage analysis found evidence of linkage in two regions on chromosome 4: around marker GABRB1 at 51.4 cM and marker FABP2 at 116.8 cM (unadjusted p-value = 0.00006).
The Collaborative Study on the Genetics of Alcoholism (COGA) is a large, multisite genetic study to identify susceptibility genes for alcohol dependence and related phenotypes. COGA data include information from the visual oddball experiment and the eyes-closed resting electroencephalogram (EEG) dataset. The four fields beginning with ttth contain data extracted from the target case of the visual oddball experiment for four electrode placements. The extracted measures correspond to the late time window, which is set at 300 to 700 ms following stimulus presentation (bounding the visual P3 event), and the theta band power (3 to 7 Hz). The ttth1 measures have yielded a strong linkage signal on chromosome 7. The fields beginning with ttdt contain data similar to the ttth variables except that they are based on the delta band power (1 to 2.5 Hz). The fields beginning with ntth contain data extracted from the non-target case of the visual oddball experiment for four electrode placements. The extracted measures correspond to the early time window, which is set at 100 to 300 ms following stimulus presentation, and the theta band power (3 to 7 Hz). The field labelled ecb21 contains data extracted from the eyes-closed resting EEG experiment. This measurement corresponds to the first component of a trilinear singular value decomposition of the Beta2 band (16.5 to 20 Hz) bipolar electrode data. These data have shown strong linkage on chromosome 4 and strong linkage disequilibrium (LD) with GABA-A gene single-nucleotide polymorphisms (SNPs) on chromosome 4 [2, 3]. The data also include the age, labelled ERP Age, at which the electrophysiological data were collected.
Multivariate linkage analysis using several correlated traits may provide greater statistical power to detect susceptibility genes in loci whose effects are too small to be detected in univariate analysis. In this report, we analyzed the COGA data using an extension of the variance components models for repeated measurements, and considered simultaneously several of the electrophysiological phenotypes.
We assume independence between pedigrees, and consider one pedigree to describe our model. Let $y = (y_{11},\ldots,y_{1m},\ldots,y_{n1},\ldots,y_{nm})$ be the vector of $m$ multivariate trait values for the $n$ members of the pedigree. The $i$th family member has $m$ trait values observed at the age of $t_i$, $i = 1,\ldots,n$. Consider the model, for $i = 1,\ldots,n$ and $j = 1,\ldots,m$,
$$y_{ij}(t_i) = f(X_i, t_i) + s(t_i)\gamma_{i1} + \gamma_{i2} + e_{ij}(t_i), \qquad (1)$$
where $f(X_i, t_i)$ is a function of the fixed covariate effects $X_i$ and time $t_i$, $s(t_i)$ is a simple parametric function to accommodate time-variant genetic effects, $\gamma_{i1}$ is the random effect for a major gene, $\gamma_{i2}$ is the random effect for the cumulative effect of the residual genes, and $e_{ij}(t_i)$ is the measurement error. We assume that $\gamma_{i1}$, $\gamma_{i2}$, and $e_{ij}$ are independent, although $e_{ij}(t_i)$, $j = 1,\ldots,m$, has a within-subject correlation structure. It follows that
$$\operatorname{cov}(y_{ij}(t_i), y_{lk}(t_l)) = s(t_i)s(t_l)\operatorname{cov}(\gamma_{i1},\gamma_{l1}) + \operatorname{cov}(\gamma_{i2},\gamma_{l2}) + \delta(i = l)\sigma_{jk}(t_i,t_l),$$
where $\sigma_{jk}(t_i,t_l)$ is the covariance function for $e_{ij}(t_i)$ and $e_{lk}(t_l)$, and $\delta(i = l)$ is the indicator function, which is 1 when $i = l$ and 0 otherwise. In addition, the covariances of $\gamma_{i1}$ and $\gamma_{i2}$ can be partitioned into additive and dominance variances as follows:
$$\operatorname{cov}(\gamma_{i1},\gamma_{l1}) = \pi_{il}\sigma_{a1}^2 + k_{2,il}\sigma_{d1}^2, \qquad \operatorname{cov}(\gamma_{i2},\gamma_{l2}) = 2\phi_{il}\sigma_{a2}^2 + \tau_{il}\sigma_{d2}^2,$$
where $\pi_{il} = k_{1,il}/2 + k_{2,il}$, $k_j$ represents the $k$ coefficient of Cotterman [4] for the probability of members $i$ and $l$ sharing $j$ alleles identical by descent (IBD) at the locus of interest, $\phi$ and $\tau$ are respectively the expected kinship coefficient and the expected probability of sharing 2 alleles IBD over the residual components of the genome, $\sigma_{a1}^2$ and $\sigma_{d1}^2$ are the additive and dominance genetic variances at the locus of interest, respectively, and $\sigma_{a2}^2$ and $\sigma_{d2}^2$ are the total additive and dominance genetic variances over the residual components of the genome, respectively. The $\pi_{il}$, $k_{2,il}$, $\phi_{il}$, and $\tau_{il}$ can be obtained using the SOLAR software program [5]. Because dominance effects are usually small, we do not consider them in this analysis. A restricted maximum likelihood approach is used to estimate the parameters. A likelihood ratio test is used to test the null hypothesis that the genetic variance due to the quantitative trait locus (QTL) equals zero (no linkage). Two times the log likelihood ratio yields a test statistic that is asymptotically distributed as a mixture of χ2 distributions [6].
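The relation between the likelihood ratio statistic, the LOD score, and the p-value can be sketched numerically. The snippet below is a minimal illustration and not the authors' implementation. It assumes the simplest case of testing a single variance component on the boundary of the parameter space, where the null distribution is an equal mixture of a point mass at zero and a χ2 distribution with 1 degree of freedom; the text itself specifies only "a mixture of χ2 distributions". All function names are hypothetical.

```python
import math

LN10 = math.log(10.0)


def lrt_from_lod(lod):
    """Convert a LOD score (log10 likelihood ratio) to 2 * ln(likelihood ratio)."""
    return 2.0 * LN10 * lod


def lod_from_lrt(lrt):
    """Convert 2 * ln(likelihood ratio) back to a LOD score."""
    return lrt / (2.0 * LN10)


def p_value_from_lrt(lrt):
    """P-value of the likelihood ratio statistic.

    ASSUMPTION (not stated in the source): the null distribution is
    0.5 * (point mass at 0) + 0.5 * chi-square with 1 degree of freedom.
    """
    if lrt <= 0.0:
        # The statistic is never negative, so a value of 0 carries no evidence.
        return 1.0
    # For a chi-square variable with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(lrt / 2.0))


if __name__ == "__main__":
    for lod in (1.4, 3.5):
        print(lod, p_value_from_lrt(lrt_from_lod(lod)))
```

Under this assumption, a LOD score of 3.5 corresponds to a p-value of about 3 × 10^-5 and a LOD score of 1.4 to a p-value of about 0.006. When s(t) carries additional parameters that vanish under the null hypothesis, the appropriate mixture may differ from the one assumed here.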
The dataset includes a total of 143 nuclear and multigenerational families with 1,614 individuals. We selected the genotyped individuals with no missing electrophysiological phenotypes or ages. This yields 140 families with a total of 819 individuals. We focus our analysis on the ecb21, ttth1, and ttth2 electrophysiological phenotypes, which gives a total of 2,457 measurements for the 819 individuals. A scatter plot of each electrophysiological phenotype versus the age at which the data were collected showed a roughly quadratic trend of the phenotype over age. Therefore, in Equation (1), we incorporated the age at which the electrophysiological data were collected and its square as covariates. We also incorporated sex as a covariate and included dummy variables as covariates to allow different intercepts for the individual phenotypes. We performed the analysis both with and without smoking status as a covariate. We considered two forms of $s(t)$: constant and linear functions of age. For the environmental covariance function $\sigma_{jk}(t_i,t_l)$, we assumed the same environmental variance for every phenotype (a standardization can be performed prior to the linkage analysis if this assumption is violated), and we considered either a single common environmental covariance between any two phenotypes or a different environmental covariance for each pair of phenotypes.
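To make the covariance structure concrete, the sketch below assembles the covariance matrix of one pedigree for the additive-only case, with a common environmental variance and a single common environmental covariance between phenotypes. It assumes the additive partition cov(γ_i1, γ_l1) = π_il σ²_a1 and cov(γ_i2, γ_l2) = 2φ_il σ²_a2. The function name, the argument names, the linear form chosen for s(t), and all numeric values are hypothetical and serve only as an illustration; this is not the authors' code.

```python
def pedigree_covariance(ages, pi, two_phi, s, var_qtl, var_poly,
                        var_env, cov_env, m):
    """Covariance matrix of the n*m trait values of one pedigree.

    Rows and columns are ordered as (member i, phenotype j) -> i * m + j.
    pi[i][l]      : estimated proportion of alleles shared IBD at the locus
    two_phi[i][l] : twice the expected kinship coefficient
    s             : function of age scaling the major-gene effect
    """
    n = len(ages)
    omega = [[0.0] * (n * m) for _ in range(n * m)]
    for i in range(n):
        for l in range(n):
            # Genetic part is shared by all phenotypes of members i and l.
            genetic = (s(ages[i]) * s(ages[l]) * pi[i][l] * var_qtl
                       + two_phi[i][l] * var_poly)
            for j in range(m):
                for k in range(m):
                    value = genetic
                    if i == l:
                        # Environmental part only within the same individual.
                        value += var_env if j == k else cov_env
                    omega[i * m + j][l * m + k] = value
    return omega


def constant_s(age):
    """Constant form of s(t)."""
    return 1.0


def linear_s(age):
    """Linear form of s(t); the slope 0.01 is a made-up value."""
    return 1.0 + 0.01 * age


# Hypothetical pedigree: two full siblings, three phenotypes each.
ages = [30.0, 34.0]
pi = [[1.0, 0.5], [0.5, 1.0]]
two_phi = [[1.0, 0.5], [0.5, 1.0]]

omega = pedigree_covariance(ages, pi, two_phi, constant_s,
                            var_qtl=0.2, var_poly=0.3,
                            var_env=0.5, cov_env=0.1, m=3)
omega_linear = pedigree_covariance(ages, pi, two_phi, linear_s,
                                   var_qtl=0.2, var_poly=0.3,
                                   var_env=0.5, cov_env=0.1, m=3)
```

Allowing a different environmental covariance for each pair of phenotypes would replace the scalar `cov_env` with an m × m table indexed by (j, k); the rest of the assembly is unchanged.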
We first used the SOLAR software program [5] to analyze each of the electrophysiological phenotypes. We computed the IBD information using SOLAR for two-point linkage analysis. The two-point linkage results from SOLAR show that, for the ttth1 data, markers D7S1804 at 156.4 cM and D7S509 at 163.7 cM on chromosome 7 both have LOD scores around 3.5 and p-values reaching 0.00003, D7S1796 at 120 cM has a LOD score of 2.4 and a p-value of 0.0003, and D7S794 at 177.9 cM has a LOD score of 2.1 and a p-value of 0.0009. Consistent with existing analyses, the individual ttth1 phenotype produced strong linkage signals on chromosome 7. The results from SOLAR also showed that, for the ecb21 data, marker D4S2382 at 43.3 cM on chromosome 4 has a p-value of 0.006 (LOD score = 1.4), GABRB1 at 51.4 cM on chromosome 4 has a p-value of 0.006 (LOD score = 1.4), and FABP2 at 116.8 cM on chromosome 4 has a p-value of 0.002 (LOD score = 1.7).
As we can see from Figure 1, curve d reaches a -log(p-value) of 9.08 (LOD score = 2.9) at marker FABP2 at 116.8 cM on chromosome 4. At the same marker, curve e has a value of 7.83 (LOD score = 2.4), curve f has a value of 9.27 (LOD score = 3.0), and curve g has a value of 9.65 (LOD score = 3.9). Our multivariate linkage analysis produced peaks in the same two regions found in the univariate ecb21 analysis, around marker GABRB1 at 51.4 cM and marker FABP2 at 116.8 cM, and the evidence of linkage in both regions was strengthened. The marker GABRB1 at 51.4 cM on chromosome 4 has previously been reported to be associated with alcoholism [1, 2]. Modeling s(t) as a constant parameter that differs for each phenotype deserves further investigation. Our multivariate linkage analysis did not find any significant evidence of linkage on chromosome 7; therefore, those results are not reported here.
In this analysis, we conducted a simultaneous linkage analysis of multivariate phenotypes. We identified candidate markers that had previously been reported using single phenotypes, such as marker GABRB1 at 51.4 cM on chromosome 4, as well as markers that had not been suggested before, such as marker FABP2 at 116.8 cM on chromosome 4. It is important to note that a multivariate analysis can also compromise power if only one of the phenotypes shows strong linkage. For example, univariate linkage analysis of the ttth1 phenotype revealed very strong linkage signals around 156.4 cM on chromosome 7, with p-values reaching 0.00003, but our multivariate linkage analysis considering ttth1 and ecb21 together, or ttth1, ttth2, and ecb21 together, did not find any significant linkage region on chromosome 7. This may be due to the noise introduced by the phenotypes that do not have linkage signals on that chromosome. In other words, a multivariate analysis is most effective when the linkage evidence for the individual phenotypes is not strong.
COGA: Collaborative Study on the Genetics of Alcoholism
IBD: Identity by descent
QTL: Quantitative trait loci
This research is supported in part by grants R01DA12468, DA016750, and DA017713 from the National Institute on Drug Abuse.
1. Porjesz B, Begleiter H, Wang K, Almasy L, Chorlian DB, Stimus AT, Kuperman S, O'Connor SJ, Rohrbaugh J, Bauer LO, Edenberg HJ, Goate A, Rice JP, Reich T: Linkage and linkage disequilibrium mapping of ERP and EEG phenotypes. Biol Psychol. 2002, 61: 229-248. 10.1016/S0301-0511(02)00060-1.
2. Porjesz B, Almasy L, Edenberg HJ, Wang K, Chorlian DB, Foroud T, Goate A, Rice JP, O'Connor SJ, Rohrbaugh J, Kuperman S, Bauer LO, Crowe RR, Schuckit MA, Hesselbrock V, Conneally PM, Tischfield JA, Li T-K, Reich T, Begleiter H: Linkage disequilibrium between the beta frequency of the human EEG and a GABAA receptor gene locus. Proc Natl Acad Sci USA. 2002, 99: 3729-3733. 10.1073/pnas.052716399.
3. Edenberg HJ: The collaborative study on the genetics of alcoholism: an update. Alcohol Res Health. 2002, 26: 214-218.
4. Cotterman CW: A calculus for statistico-genetics. PhD thesis. 1940, Ohio State University, Columbus.
5. Almasy L, Blangero J: Multipoint quantitative-trait linkage analysis in general pedigrees. Am J Hum Genet. 1998, 62: 1198-1211. 10.1086/301844.
6. Self SG, Liang KY: Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J Am Stat Assoc. 1987, 82: 605-610. 10.2307/2289471.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.