Approaches to mapping genetically correlated complex traits
© George et al; licensee BioMed Central Ltd 2003
Published: 31 December 2003
Our Markov chain Monte Carlo (MCMC) methods were used in linkage analyses of the Framingham Heart Study data using all available pedigrees. Our goal was to detect and map loci associated with covariate-adjusted traits log triglyceride (lnTG) and high-density lipoprotein cholesterol (HDL) using multipoint LOD score analysis, Bayesian oligogenic linkage analysis and identity-by-descent (IBD) scoring methods. Each method used all marker data for all markers on a chromosome. Bayesian linkage analysis detected a linkage signal on chromosome 7 for lnTG and HDL, corroborating previously published results. However, these results were not replicated in a classical linkage analysis of the data or by using IBD scoring methods.
We conclude that Bayesian linkage analysis provides a powerful paradigm for mapping trait loci but interpretation of the Bayesian linkage signals is subjective. In the absence of a LOD score method accommodating genetically complex traits and linkage heterogeneity, validation of these signals remains elusive.
The aim of our analyses was to detect and localize trait loci associated with the quantitative traits log triglyceride (lnTG) and high-density lipoprotein cholesterol (HDL) in the Framingham Heart Study data. Based primarily on Shearman et al., we focused our search on certain chromosomes. By using Markov chain Monte Carlo (MCMC) analysis, we were able to perform linkage analyses of data collected on large and sometimes complex pedigrees, using information from many marker loci simultaneously. Here we report multipoint linkage results from several approaches to analyzing the data.
Pedigree and map analysis
We first carried out routine data checks and removed uninformative individuals. Based on data availability, we used time point 11 for Cohort 1 and time point 1 for Cohort 2 in later analyses. For a few covariates unavailable at time point 11 for Cohort 1, the information was taken from time point 10 or 12. The program ECLIPSE was used to investigate pedigree uncertainties or errors on the basis of all available marker data. We also estimated sex-averaged and sex-specific recombination frequencies via an expectation maximization (EM) algorithm based on the MCMC program lm_auto in the MORGAN package (URL information below).
Trait definition and segregation analyses
Table: Definition of phenotypes used in reported analyses (traits and covariates)
HDL: Fasting HDL cholesterol (mg/dl)
TG: Fasting triglycerides (mg/dl)
BMI: 707 × weight/(height²)
Smk: Number of cigarettes smoked/day
Drk: Number of grams of alcohol/day
Gl: Fasting glucose (mg/dl)
CV1: Covariate set 1: age, sex, cohort
CV2: Covariate set 2: Drk, BMI
HDLA & lnTGA: HDL and lnTG adjusted for CV1
HDLAA: HDLA adjusted for CV2 and lnTG
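The adjusted traits defined above can be read as residuals from a regression of the trait on the listed covariates. The source does not state how the adjustment was computed, so the following is only a minimal sketch assuming ordinary least squares with an intercept; the function name `ols_residuals` and the variable names in the usage comment are hypothetical.

```python
def ols_residuals(y, covariates):
    """Residuals of y after least-squares regression on covariates plus an intercept.

    y          : list of trait values
    covariates : list of per-individual covariate tuples (same length as y)
    """
    n = len(y)
    X = [[1.0] + [float(v) for v in row] for row in covariates]
    p = len(X[0])

    # Build the normal equations (X'X) b = X'y as an augmented matrix.
    A = [
        [sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
        + [sum(X[i][r] * y[i] for i in range(n))]
        for r in range(p)
    ]

    # Solve by Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        piv = A[col][col]
        A[col] = [v / piv for v in A[col]]
        for r in range(p):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]

    beta = [A[r][p] for r in range(p)]
    return [y[i] - sum(b * x for b, x in zip(beta, X[i])) for i in range(n)]


# Hypothetical two-stage use, mirroring the table:
#   hdla  = ols_residuals(hdl, cv1_rows)              # HDL adjusted for CV1
#   hdlaa = ols_residuals(hdla, cv2_and_lntg_rows)    # HDLA adjusted for CV2 and lnTG
```

Because the model includes an intercept, the residuals sum to zero, so the adjusted traits are centered. This is consistent with the symmetric cutoffs around zero used for the binary traits.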
Two approaches were used to obtain models for HDLAA to be used in linkage analysis. Loki, the Bayesian MCMC program for oligogenic models, provided some initial models. In addition, normal mixtures were fitted to the adjusted trait values in a commingling analysis assuming Hardy-Weinberg equilibrium. Several binary traits were defined, with cutoffs at -10, +10, and +27.6 for both HDLAA and HDLA, corresponding to 21% (23%), 19% (20%), and 3% (3%) of the observed individuals having the low, high, or very high HDLAA (HDLA) phenotype, respectively. An ordinal trait with 15 ordered categories was also used. Penetrances were defined for the binary and ordinal traits: for HDLAA the model was based on the commingling analysis, and for HDLA on Loki output.
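A commingling analysis of this kind can be sketched as an EM fit of a normal mixture whose mixing proportions are tied to a single allele frequency through Hardy-Weinberg equilibrium. The sketch below is illustrative only and is not the authors' implementation: the choice of three genotype components, a common standard deviation, the starting values, and the names `fit_hwe_mixture` and `simulate_trait` are all assumptions.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def hwe_weights(q):
    """Genotype frequencies (0, 1, 2 copies of the allele) under HWE."""
    return [(1.0 - q) ** 2, 2.0 * q * (1.0 - q), q ** 2]


def fit_hwe_mixture(values, q=0.3, n_iter=100):
    """EM for a 3-component normal mixture with HWE-constrained weights.

    Returns (allele frequency, component means, common standard deviation).
    """
    n = len(values)
    lo, hi = min(values), max(values)
    mu = [lo + (hi - lo) * f for f in (0.25, 0.5, 0.75)]  # spread over the range
    mean = sum(values) / n
    # Heuristic start: half the overall SD (an assumption, not from the source).
    sigma = 0.5 * math.sqrt(sum((x - mean) ** 2 for x in values) / n)

    for _ in range(n_iter):
        pi = hwe_weights(q)
        # E-step: posterior genotype probabilities for each individual.
        resp = []
        for x in values:
            dens = [pi[k] * normal_pdf(x, mu[k], sigma) for k in range(3)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: allele counting for q, weighted means, pooled variance.
        counts = [sum(r[k] for r in resp) for k in range(3)]
        q = (counts[1] + 2.0 * counts[2]) / (2.0 * n)
        mu = [sum(r[k] * x for r, x in zip(resp, values)) / counts[k]
              for k in range(3)]
        sigma = math.sqrt(
            sum(r[k] * (x - mu[k]) ** 2
                for r, x in zip(resp, values) for k in range(3)) / n
        )
    return q, mu, sigma


def simulate_trait(n, q, means, sigma, seed=0):
    """Draw trait values from the model above (toy data for illustration)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        g = (rng.random() < q) + (rng.random() < q)  # copies of the allele
        out.append(rng.gauss(means[g], sigma))
    return out


if __name__ == "__main__":
    data = simulate_trait(2000, 0.3, (-20.0, 0.0, 20.0), 3.0, seed=1)
    print(fit_hwe_mixture(data))
```

The fitted component means and allele frequency are the kind of quantities from which genotype-specific penetrances for a thresholded binary trait could then be derived.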
Linkage detection and mapping
Our linkage studies focused primarily on chromosome 7. Linkage signals have been reported on chromosome 7 for lnTG, HDL, and log(HDL/TG) [1, 2], which we hoped to replicate. Additional analyses were carried out on chromosomes 3, 4, 9, 11, 16, and 20, either to attempt replication of reported signals or as a negative control. Except where indicated, our results and discussion refer only to chromosome 7.
We used Loki to analyze several quantitative traits based on HDL and lnTG. The binary HDLA and HDLAA traits were subjected to IBD scoring linkage detection methods based on lm_auto. We used the MORGAN multipoint LOD score program SCHNELL on the quantitative HDLAA trait. A few single-marker LOD scores for the quantitative HDLA trait at chromosome 7 markers were checked using FASTLINK. Both the binary and ordinal HDLA and HDLAA were also analyzed with the MORGAN MCMC program lm_bayes, a pseudo-Bayesian approach to the estimation of multipoint LOD scores. All reported analyses used the Haldane genetic map distances, and all multipoint analyses used all markers on a chromosome simultaneously. Except where stated, we used sex-averaged maps. All pedigrees were used unbroken.
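For readers unfamiliar with Haldane map distances, the standard Haldane map function relates a map distance d (in Morgans) to a recombination fraction θ by θ = (1 − e^(−2d))/2, which assumes no crossover interference. The short sketch below converts in both directions; the function names are hypothetical and are not taken from MORGAN or Loki.

```python
import math


def haldane_theta(d_cm):
    """Recombination fraction for a map distance given in centimorgans."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def haldane_cm(theta):
    """Map distance in centimorgans for a recombination fraction theta."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must satisfy 0 <= theta < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * theta)


if __name__ == "__main__":
    # A 10 cM interval corresponds to a recombination fraction of about 0.0906.
    print(round(haldane_theta(10.0), 4))
```

Under this map function, distances in centimorgans add along the chromosome, whereas recombination fractions do not.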
Pedigree and map analysis
An ECLIPSE analysis of putative sib trios identified individual 590513 in pedigree 27096 as being an unlikely member of the stated sibship. Whereas true sib trios give log-likelihood differences in the range 40 to 80 relative to half-sib alternatives, trios including this individual gave values close to 0. This pedigree has substantial missing marker data, suggesting there may have been Mendelian errors. The data set that was left after removing 35 individuals with no data, 593 unobserved founder couples each with only one offspring, and the individual 590513, consisted of 3470 individuals in 362 pedigree components, ranging in size from 1 to 74. Two pedigrees contained loops due to sibship exchanges, and several had more than one founder couple. Our segregation and joint linkage/segregation analyses used the reduced data set of 3470 potentially informative individuals. Genome sharing, map re-estimation, and LOD score methods used the subset of 3444 individuals in non-singleton pedigrees.
Re-estimated maximum likelihood recombination frequencies were obtained for chromosome 7. The genetic distance from marker 7_1 to marker 7_22 increased from 191 cM (given) to 205 cM (estimated). Overall, there were few large differences between the given and estimated sex-averaged and sex-specific maps. Marker intervals 7_13–14, 7_18–19 and 7_19–20 showed the highest relative increase in sex-averaged recombination rates: 37%, 45%, and 58%, respectively.
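The size of the overall map change can be checked directly from the two totals quoted above. This is a small arithmetic sketch; the helper name `relative_increase` is hypothetical.

```python
def relative_increase(given, estimated):
    """Fractional increase of an estimated value over the given value."""
    return (estimated - given) / given


# Chromosome 7 map length from marker 7_1 to 7_22, in cM (values from the text).
total = relative_increase(191.0, 205.0)
print(f"Total map length increase: {total:.1%}")  # prints 7.3%
```

The total map therefore grew by roughly 7%, far less than the 37% to 58% increases seen in the three individual intervals, consistent with the changes being concentrated in a few marker intervals.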
Trait definition and segregation analyses
[Table: PolyEM analysis of HDLA and lnTGAA]
[Table: Univariate analysis of HDLA. Recoverable entries: covariates used in trait preadjustment; covariates used within trait analysis; BMI, Smk, Drk, Gl]
Linkage detection and mapping
The strength of Loki signals was sensitive to the prior distribution assumed for QTL effects. It was also sensitive to changes in the marker genetic map: the MCMC EM estimated map resulted in reductions in the peak IR, relative to the supplied map, of 11.6% and 17.3% for HDLA and lnTGA, respectively. As shown in Figure 1b, the realized QTL in the region 7_19 to 7_22 having non-negligible trait contributions were used to help define trait models for HDLA and HDLAA LOD score analyses.
In combination, our segregation and linkage analysis results suggest both oligogenic inheritance of HDLA and lnTGA, and a negative genetic correlation, which may be the result of loci affecting these traits at chromosome 7-qter. The chromosome 7 signals were consistently stronger for HDLA than for HDLAA, the latter trait being adjusted for lnTG. For genetically correlated traits, adjustment may weaken the signal, whereas the ratio-trait of Shearman et al. will reinforce the signal in the presence of a negative genetic correlation. Adjustments for genetically correlated covariates should be applied cautiously.
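The effect of adjustment on a shared locus can be illustrated with a toy model that is not taken from the paper. Assume a single QTL with genotype score g raises one trait by a per unit of g and lowers a second trait by b, with independent residuals. The sketch below computes the fraction of variance due to the QTL for the raw first trait, for the first trait regression-adjusted for the second, and for the difference of the two traits, which stands in for a log-ratio trait. All names and parameter values are hypothetical.

```python
def qtl_variance_fractions(a, b, var_g, var_e1, var_e2):
    """Fraction of trait variance due to one QTL for three trait definitions.

    Toy model (an assumption): trait1 = a*g + e1 and trait2 = -b*g + e2,
    with independent residuals e1, e2 and genotype score variance var_g.
    Returns (raw trait1, trait1 adjusted for trait2, trait1 - trait2).
    """
    # Raw trait 1.
    raw = a * a * var_g / (a * a * var_g + var_e1)

    # Regression slope of trait1 on trait2 (negative when a, b > 0).
    beta = -a * b * var_g / (b * b * var_g + var_e2)
    # QTL effect remaining in the adjusted trait, trait1 - beta * trait2.
    eff_adj = a + beta * b
    adjusted = eff_adj ** 2 * var_g / (
        eff_adj ** 2 * var_g + var_e1 + beta ** 2 * var_e2
    )

    # Difference trait: the opposite-signed effects add up.
    eff_diff = a + b
    difference = eff_diff ** 2 * var_g / (
        eff_diff ** 2 * var_g + var_e1 + var_e2
    )
    return raw, adjusted, difference


if __name__ == "__main__":
    print(qtl_variance_fractions(a=1.0, b=1.0, var_g=0.5, var_e1=1.0, var_e2=1.0))
```

With these illustrative values the QTL accounts for one third of the raw trait variance, one sixth after adjustment, and one half for the difference trait. The ordering depends on the parameters chosen, but it shows how adjusting for a genetically correlated covariate can remove part of the signal that a ratio-type trait retains.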
The weak signals of the model-free analyses of binary traits may be due to low power, and the problems of the multipoint LOD score analyses due to model sensitivity. In the presence of oligogenic inheritance, Loki can detect weak signals, imputing linked QTL only in certain families and modeling other heritable variation with unlinked QTL. However, interpretation of the strength of the signal provided by Loki remains an open question. Thus, in the absence of a LOD score method accommodating genetically complex traits and linkage heterogeneity, validation of these signals remains elusive.
Research supported in part by NIH grant GM 46255.
The MORGAN, ECLIPSE and Loki packages are available at http://www.stat.washington.edu/thompson/Genepi/pangaea.shtml
- Shearman AM, Ordovas JM, Cupples LA, Schaefer EJ, Harmon MD, Shao Y, Keene JD, DeStefano AL, Joost O, Wilson PW, Housman DE, Myers RH: Evidence for a gene influencing the TG/HDL-C ratio on chromosome 7q32.3-qter: a genome wide scan in the Framingham study. Hum Mol Genet. 2000, 9: 1315-1320. 10.1093/hmg/9.9.1315.
- Heath SC: Markov chain Monte Carlo segregation and linkage analysis for oligogenic models. Am J Hum Genet. 1997, 61: 748-760.
- Thompson EA: MCMC estimation of multi-locus genome sharing and multipoint gene location scores. Int Stat Rev. 2000, 68: 53-73.
- Snow G, Wijsman E, Thompson E, Heath S: Multipoint linkage analysis of complex traits with Markov chain Monte Carlo linkage analysis. Am J Hum Genet. 1999, 65 (Suppl): A446.
- Cottingham RW, Idury RM, Schäffer AA: Faster sequential genetic linkage computations. Am J Hum Genet. 1993, 53: 252-263.
- George AW, Thompson EA: Multipoint linkage analyses for disease mapping in extended pedigrees: a Markov chain Monte Carlo approach. Technical Report #405, Seattle, WA, Department of Statistics, University of Washington. 2001.
- Wijsman EM: Joint segregation and linkage analysis using Markov chain Monte Carlo methods. In Quantitative Trait Loci: Methods and Protocols. Humana Press, Totowa, NJ. 2002, 139-161.
- Kruglyak L, Daly ML, Reeve-Daly MP, Lander ES: Parametric and nonparametric linkage analysis: a unified multipoint approach. Am J Hum Genet. 1996, 58: 1347-1363.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.