Genetic Analysis Workshop 13: Analysis of Longitudinal Family Data for Complex Diseases and Related Risk Factors

Genetic Epidemiology 25 (Supplement 1): S1–S4 (2003)

Genetic Analysis Workshop 13: Introduction to Workshop Summaries

Laura Almasy,1* L. Adrienne Cupples,2 E. Warwick Daw,3 Daniel Levy,4 Duncan Thomas,5 John P. Rice,6 Susan Santangelo,7 and Jean W. MacCluer1

1 Department of Genetics, Southwest Foundation for Biomedical Research, San Antonio, Texas
2 Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts
3 Department of Epidemiology, University of Texas M.D. Anderson Cancer Center, Houston, Texas
4 Framingham Heart Study, National Heart, Lung, and Blood Institute, Framingham, Massachusetts
5 Department of Preventive Medicine, University of Southern California, Los Angeles, California
6 Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri
7 Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts

Grant sponsor: National Institutes of Health; Grant number: GM31575.
*Correspondence to: Laura Almasy, Department of Genetics, Southwest Foundation for Biomedical Research, P.O. Box 760549, San Antonio, TX 78245-0549. E-mail: almasy@darwin.sfbr.org
Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/gepi.10278
© 2003 Wiley-Liss, Inc.

INTRODUCTION

The Genetic Analysis Workshops (GAWs) began in 1982 as a collaborative effort among researchers of various disciplines to evaluate and compare statistical genetics methods. For each GAW, one or more topics are chosen that relate to current analytical and methodological issues in statistical genetics of complex phenotypes. For each workshop, sets of simulated and real data are distributed to researchers worldwide who submit the results of their analyses of these data for presentation at GAW. The workshop itself is a 2½-day meeting comparing and contrasting the many different approaches used to analyze the data. New methods are introduced, old methods are evaluated in new contexts, and diverse analytical schemes are explored on the level playing field of a common data set. More information about GAW, including details of upcoming workshops, may be found at http://www.sfbr.org/external/gaw/welcome.html.

GAW13 was held November 11–14, 2002, in New Orleans, Louisiana. The 117 contributions submitted to GAW13 were organized into 11 presentation groups of 6–14 papers each. Within each group, a co-author with previous GAW experience was asked to serve as group leader to facilitate group discussion, organize an oral presentation for the group, and take the lead in writing the group summary papers collected in this volume. Eight presentation groups were organized around common methodological themes: derived phenotypes, methods for longitudinal analysis, consistency of genetic analyses across time, missing data and pedigree or genotyping errors, effects of covariates, pleiotropy and multivariate analyses, data mining/neural networks/tree-based methods, and development and extension of linkage methods. The remaining three presentation groups contained papers united by a common focus on particular phenotypes: analysis of blood pressure and hypertension phenotypes, analysis of obesity/diabetes/lipid phenotypes, and analysis of tobacco and alcohol phenotypes. Of the original 117 GAW13 contributions, 101 were published as a supplement to BMC Genetics [Almasy et al., 2003]. It is these 101 peer-reviewed, published papers that are summarized in the present volume.

Summaries have long been a part of the GAW proceedings. For many years, a single individual was charged with the Herculean task of summarizing all the GAW contributions using a particular data set. As of GAW11, there were as many as 67 individual contributions to be summarized for a given data set, and the task was divided among pairs of individuals. With the increasing GAW participation, it also became increasingly difficult to accommodate individual oral presentation of each paper within the schedule of the workshop. For GAW12, a new format was introduced both for workshop presentations and for summary papers. Individual contributions with common themes were assigned to presentation groups that had a single oral presentation at the workshop. In the GAW12 proceedings [Wijsman et al.,


Preface
This supplement to BMC Genetics contains the proceedings of Genetic Analysis Workshop 13 (GAW13), which was held November 11–14, 2002, at the Marriott hotel in New Orleans, Louisiana. The Genetic Analysis Workshops, which began in 1982 and are currently held biennially, provide a venue for the development and testing of statistical genetic methods. For each GAW, one or more data sets are made available to potential participants, and GAW attendees must submit an analysis of one of these data sets, be the providers of a data set, or be otherwise involved in workshop organization. The workshop itself is then a head-to-head comparison of different ways of analyzing the same data. New analytical methods are introduced and the performance of established methods is compared.
GAW13 focused on analytical issues relating to localization of genes influencing common, complex diseases and their risk factors, with an emphasis on the use of longitudinal data. For the first time in a GAW, longitudinal family data, covering a 40-year time period, were made available to workshop participants. As has been the case in recent GAWs, both simulated and actual human data were available. New this year, however, was a close correspondence between these data sets, with the simulated data mirroring the family structure, trait data, and longitudinal sampling scheme of the real data set.
The Framingham Heart Study data provided for GAW13 included longitudinal observations from two cohorts. The original Framingham cohort (Cohort 1) was first examined in 1948 and has been examined every 2 years thereafter. Cohort 2, composed primarily of offspring of the original cohort and the spouses of these offspring, was examined first in 1971 and has been examined approximately every 4 years. The data provided for GAW13 are a subset of these two cohorts and include 2885 individuals in 300 pedigrees. Both diagnostic phenotypes, such as hypertension, and quantitative risk factors, such as HDL cholesterol levels, weight, and systolic blood pressure, were available. Genotypes were provided for microsatellite markers representing a whole genome scan at an average 10-cM density. Marker maps and estimated marker allele frequencies were also supplied.

This article is available from: http://www.biomedcentral.com/1471-2156/4/s1/S1
A second data set was simulated, based as closely as possible on the Framingham Heart Study data made available to GAW13. While the exact details of the pedigrees were modified, the overall pedigree size and structure in the simulated data were based on those of the Framingham Heart Study data. Similarly, phenotypes were simulated to have the same distribution as those in the real data, and the observed correlations between some phenotypes were incorporated into the trait model. Genotype data were also closely matched to the actual data, particularly in regard to marker distribution and information content. While participants had the option of using the complete simulated data set, missing data patterns were also generated based on observations from the Framingham Heart Study data. Those who analyzed the simulated data had the choice of obtaining the simulation model prior to their analyses and were asked to indicate whether their analyses were done with or without this knowledge.
In the spring of 2002, the availability of the GAW13 data was announced by e-mail to the more than 1700 individuals on the GAW mailing list. A total of 97 groups requested GAW13 data. The Framingham data were requested by 75 groups and the simulated data by 90 groups, with 67 groups requesting both data sets. In the summer of 2002, 117 contributions were received describing analyses of the two data sets. Of these, 81 utilized the Framingham Heart Study data and 36 analyzed the simulated data. A book, or CD, containing these contributions plus papers describing the data sets was distributed to workshop participants.
A total of 241 individuals from 13 countries attended GAW13. Attendees included investigators from five continents: Asia, Australia, Europe, North America, and South America. As with GAW12, participants were organized into 11 presentation groups and a co-author with past GAW experience was asked to lead discussion in each presentation group. Groups ranged in size from 6 to 14 papers with common themes. Some groups collected papers exploring similar methodological issues, such as methods for longitudinal analysis or derivation of new phenotypes from the data, whereas others were related through a common focus on particular phenotypes, such as analyses of blood pressure and hypertension phenotypes or analyses of tobacco and alcohol use. Given the similarities in the Framingham Heart Study and simulated data sets, presentation groups were assigned without regard to the data set analyzed. Each group met individually during the workshop, and in many groups, members communicated beforehand to begin comparing and contrasting the approaches taken and the results obtained by group members. At GAW13, many groups used part of their group meeting time for individual presentations by each group member, giving investigators an opportunity to present their work. Although mainly attended by group members, group meetings were open to all GAW13 participants. From this process, each group developed an oral presentation, summarizing and synthesizing the work of the individual papers, which was delivered to the overall workshop during general sessions. Individual contributions were also presented in the form of 52 posters displayed during four poster sessions.
The manuscripts included here are a subset of those presented at GAW13. All of these papers have been reviewed for scientific merit. The proceedings begin with two papers describing the two data sets, followed by 103 individual GAW13 contributions organized by presentation group and alphabetically by first author within each group. Because the data set analyzed was not considered in assignment of group membership, papers reporting analyses of simulated data and Framingham Heart Study data are intermingled. New to GAW this year, the proceedings occupy two volumes. In addition to the individual contributions in this volume, each presentation group has a summary in a forthcoming supplement to the journal Genetic Epidemiology in which the present manuscripts are compared and contrasted and larger themes and conclusions are explored. Results of GAW13 analyses provided novel insights into the etiology of cardiovascular disease and some of its risk factors. In addition, many new methods were developed, explored, and applied for the analysis of genetic data from longitudinal studies, an area of research that has been underdeveloped to date.
Adrienne Cupples, Lynn Goldin, Jean MacCluer, and John Rice. We are grateful to all of these people for their efforts in the difficult task of designing and creating a simulated data set that mimicked many of the features of the Framingham data, thus offering participants the opportunity to address issues such as power and false positives. The creation of the simulated data set was supported by start-up funds to Warwick Daw, and by grants CA52862 and GM58897 to John Morrison and Duncan Thomas.
At GAW13, contributions were organized into groups, each focused on a single topic. Group leaders had the difficult task of generating discussion among strangers via e-mail, and organizing presentations that summarized all of the contributions in their group. Their efforts deserve special recognition. We are grateful to the following individuals who led the group presentations: Max Baur, Jim Gauderman, Laura Almasy, Gail Jarvik, Beth Hauser, Deborah Meyers, Catherine Falk, Ellen Wijsman, Heike Bickeböller, Lynn Goldin, and Nancy Saccone. Ample time was scheduled for discussion, with four discussion periods led by Robert Elston, Ingrid Borecki, Anne Spence, and Chris Amos. We thank these discussion leaders for their efforts in stimulating lively interactions among participants.
Institute, Harrah's New Orleans Casino, Aunt Sally's Praline Shops, Inc., WHERE New Orleans Magazine, and the New Orleans Convention and Visitors Bureau.
Numerous organizations provided funding for scholarships to postdoctoral fellows and graduate students to help defray their expenses in attending GAW13: the National Heart, Lung, and Blood Institute provided 15 scholarships; National Institute of Mental Health, 9 scholarships; Autogen Ltd., 1.5 scholarships; and CANFOR/GmbH, Genometrix, and Genomica, 4.5 scholarships total. We are grateful for their generosity.
Long-term planning for the Genetic Analysis Workshops is the responsibility of the Genetic Analysis Workshop Advisory Committee. Its members are Chris Amos, Max Baur, Françoise Clerget-Darpoux, Cathy Falk, Lynn Goldin, Sue Hodge, Jean MacCluer (chairman), Anne Spence, Brian Suarez, and Duncan Thomas.
The National Institute of General Medical Sciences has provided continuous funding for the Genetic Analysis Workshops since 1982, through grant R01 GM31575 to Jean MacCluer. We are particularly grateful to Irene Eckstrand of NIGMS for her enthusiasm and interest in the GAWs during the past 22 years. The Genetic Analysis Workshops would not be possible without the support of Dr. Eckstrand and NIGMS.
Finally, the Genetic Analysis Workshops could not have enjoyed continued success without the ongoing, enthusiastic support of the GAW participants.