A genome-wide scan to identify loci for smoking rate in the Framingham Heart Study population
© Li et al; licensee BioMed Central Ltd 2003
- Published: 31 December 2003
Although many years of genetic epidemiological studies have demonstrated that genetics plays a significant role in determining smoking behavior, little information is available on genomic loci or genes affecting nicotine dependence. Several susceptibility chromosomal regions for nicotine dependence have been reported, but few have received independent confirmation. To identify susceptibility loci for nicotine dependence, 313 extended pedigrees selected from the Framingham Heart Study population were analyzed by both the GENEHUNTER and S.A.G.E. programs.
In linkage analyses of the 313 extended Framingham Heart Study families, the EM Haseman-Elston method implemented in GENEHUNTER provided evidence for significant linkage of smoking rate to chromosome 11 and suggestive linkage to chromosomes 9, 14, and 17. Multipoint sib-pair regression analysis using the SIBPAL program of S.A.G.E. on 1389 sib pairs derived from the 313 extended families identified suggestive linkage of smoking rate to chromosomes 4, 7, and 17. Of these identified positive regions for nicotine dependence, loci on chromosomes 7, 11, and 17 were identified by both the GENEHUNTER and S.A.G.E. programs.
Our genome-wide scan results on the Framingham Heart Study data provide evidence for significant linkage of smoking rate to chromosome 11 and suggestive linkage to chromosomes 4, 7, 9, 14, and 17. These findings suggest that some of these regions may harbor susceptibility loci for nicotine dependence, and warrant further investigation in this and other populations.
- Smoking Behavior
- Nicotine Dependence
- Smoking Rate
- Framingham Heart Study
- Suggestive Linkage
Over the last several decades, a number of twin studies throughout the world have yielded results consistent with the overall conclusion that both genetic and environmental factors contribute to the risk of becoming a long-term smoker (for reviews, see [1, 2]). In a meta-analysis of most of the reported twin studies on smoking-related behaviors, we found that genetic factors contribute approximately 50% to smoking initiation and 59% to smoking persistence [3].
Although the twin studies suggest moderate genetic influences on nicotine dependence, they provide little information about the chromosomal locations harboring susceptibility loci or genes for nicotine dependence. Linkage-based genome-wide scans for smoking behavior have been reported by Straub et al. [4] on the Christchurch sample of New Zealand (130 families with 343 genotyped individuals) and the Richmond sample of Virginia (91 families with 264 genotyped individuals), and by Duggirala et al. [5] and Bergen et al. [6] on the Collaborative Study on the Genetics of Alcoholism data (COGA; 105 families with 987 genotyped individuals). However, only a few of the susceptibility regions for nicotine dependence reported in one study were replicated in another.
Generally speaking, there are two approaches to identifying susceptibility loci for nicotine dependence and other complex disorders. The first is to repeat and extend genome-wide linkage analyses in different populations; the second is to use higher marker densities in genome-wide association studies. Because smoking phenotype information is available for the Framingham Heart Study population, we adopted the first approach, a genome-wide linkage scan, to identify susceptibility loci for nicotine dependence in the present study.
Data from the Framingham Heart Study, along with clinical exam information from 1948 through 1988 for the original cohort and from 1971 through 1991 for the offspring cohort, were provided through Genetic Analysis Workshop 13 (GAW13). On the basis of the number of smokers present at each exam, the consistency of the clinical data and interviewing time between the two cohorts, and the potential environmental effect on the smoking phenotype in the Framingham Heart Study data, Exam 12 from 1970 for the original cohort and Exam 1 from 1971 for the offspring cohort were selected and used in this study.
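Table 1. Characteristics of pedigrees used to map susceptibility loci for smoking rate. Values are mean ± SEM (range).

| Characteristic | Group | Mean ± SEM (range) |
|---|---|---|
| No. of subjects per pedigree | All | 8.25 ± 0.31 (3–53) |
| No. of smokers per pedigree | All | 3.92 ± 0.16 (1–25) |
| Age of subjects | All | 43.76 ± 0.34 (9–83) |
| Age of subjects | Male | 42.92 ± 0.49 (11–82) |
| Age of subjects | Female | 44.54 ± 0.46 (9–83) |
| Age of smokers | All | 39.23 ± 0.39 (13–80) |
| Age of smokers | Male | 38.73 ± 0.54 (15–79) |
| Age of smokers | Female | 39.73 ± 0.56 (13–80) |
| Self-reported cigarettes per day, 1970–1971 | All | 20.73 ± 0.35 (1–90) |
| Self-reported cigarettes per day, 1970–1971 | Male | 23.51 ± 0.53 (1–90) |
| Self-reported cigarettes per day, 1970–1971 | Female | 17.95 ± 0.43 (1–60) |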
Two linkage analysis programs (SIBPAL in S.A.G.E. v. 4.2 and GENEHUNTER v. 2.1) were used in the study. For the EM (expectation maximization) Haseman-Elston quantitative trait locus (QTL) regression method implemented in GENEHUNTER, we analyzed both the log-transformed and the categorized smoking rate (SR) data sets. A detailed description of the method can be found in Kruglyak et al. [7] and Kruglyak and Lander [8]. In SIBPAL, default options were used for all parameters in the trait regression method, except that the options w3 (the weighted combination of squared trait difference and squared mean-corrected trait sum adjusting for the non-independence of sib pairs) and w4 (the non-independence of squared trait sums and differences) were examined [9]. Both options yielded essentially the same results on the three data sets (i.e., SR, log-transformed SR, and categorized SR). Sex and age were included as covariates in all analyses reported in this communication. The S-PLUS 6.1 and SAS 8.2 packages were used to prepare the data in the required format and to analyze the output of the linkage analysis programs.
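To illustrate the regression idea behind both programs, the following is a minimal sketch of the classic Haseman-Elston regression, which regresses the squared sib-pair trait difference on the estimated proportion of alleles shared identical by descent (IBD) at a position. It is not the EM variant in GENEHUNTER, nor the w3/w4 weighted regression in SIBPAL, and it omits the sex and age covariates. The function names, the simulated data, and the simulation model are assumptions made for illustration only.

```python
import numpy as np


def haseman_elston(trait_sib1, trait_sib2, ibd_proportion):
    """Classic Haseman-Elston regression at a single position.

    Regresses the squared sib-pair trait difference on the estimated
    proportion of alleles shared IBD. Returns (intercept, slope); a
    negative slope is consistent with linkage.
    """
    y = (np.asarray(trait_sib1, dtype=float)
         - np.asarray(trait_sib2, dtype=float)) ** 2
    pi_hat = np.asarray(ibd_proportion, dtype=float)
    design = np.column_stack([np.ones_like(pi_hat), pi_hat])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coef[0]), float(coef[1])


def simulate_sib_pairs(n_pairs, qtl_sd, noise_sd, seed=0):
    """Hypothetical simulation: synthetic sib pairs, not Framingham data.

    The QTL effects of the two sibs are given a correlation equal to
    their IBD proportion; everything else is independent noise.
    """
    rng = np.random.default_rng(seed)
    # Full sibs share 0, 1, or 2 alleles IBD with probability 1/4, 1/2, 1/4.
    pi_hat = rng.choice([0.0, 0.5, 1.0], size=n_pairs, p=[0.25, 0.5, 0.25])
    a = rng.normal(0.0, qtl_sd, n_pairs)
    b = rng.normal(0.0, qtl_sd, n_pairs)
    qtl_sib1 = a
    qtl_sib2 = pi_hat * a + np.sqrt(1.0 - pi_hat ** 2) * b
    trait_sib1 = qtl_sib1 + rng.normal(0.0, noise_sd, n_pairs)
    trait_sib2 = qtl_sib2 + rng.normal(0.0, noise_sd, n_pairs)
    return trait_sib1, trait_sib2, pi_hat


# Demo on synthetic data: with a QTL present the slope should be negative.
sib1, sib2, pi_hat = simulate_sib_pairs(20000, qtl_sd=1.0, noise_sd=1.0, seed=1)
intercept, slope = haseman_elston(sib1, sib2, pi_hat)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

Under this simulation model the expected slope is minus twice the QTL variance, so a clearly negative estimate reflects that pairs sharing more alleles IBD have more similar trait values. With no QTL effect (`qtl_sd=0.0`) the slope should be near zero.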
To make maximal use of the phenotypic information in the Framingham Heart Study data, we searched the clinical records on the smoking status of each subject from 1948 to 1988 for the original cohort, and from 1971 to 1989 for the offspring cohort. The data from 1970 for the original cohort and from 1971 for the offspring cohort were more complete and contained significantly more smokers than other time points (i.e., exams) for both cohorts. As shown in Table 1, 1228 smokers were included in the study, with an average daily smoking rate of 23.51 and 17.95 cigarettes for men and women, respectively. The average age of smokers included in this study was 38.7 ± 13.4 years for male and 39.73 ± 13.8 years for female subjects.
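The ± values quoted in the text for the age of smokers (13.4 and 13.8) are much larger than the corresponding SEM values in Table 1 (0.54 and 0.56). A rough consistency check, assuming the text values are standard deviations and using SEM = SD / sqrt(n), is sketched below. The assumption about standard deviations and the helper name `implied_n` are not from the source, and rounding in the published figures makes the result approximate.

```python
def implied_n(sd, sem):
    """Sample size implied by the relation SEM = SD / sqrt(n)."""
    return (sd / sem) ** 2


# Age of smokers: assumed SD from the text, SEM from Table 1.
male_n = implied_n(sd=13.4, sem=0.54)
female_n = implied_n(sd=13.8, sem=0.56)
total_n = male_n + female_n

# Pooled cigarettes per day from the sex-specific means, weighted by the
# implied group sizes, for comparison with the overall mean in Table 1.
pooled_rate = (male_n * 23.51 + female_n * 17.95) / total_n

print(f"implied male smokers:   {male_n:.0f}")
print(f"implied female smokers: {female_n:.0f}")
print(f"implied total:          {total_n:.0f} (reported: 1228)")
print(f"pooled cigarettes/day:  {pooled_rate:.2f} (reported: 20.73)")
```

If the assumption holds, the implied total is close to the 1228 smokers reported, and the pooled smoking rate is close to the overall mean of 20.73 cigarettes per day.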
Table 2. P-values for genomic locations linked to log-transformed and categorized SR across the human genome, for regions with a p-value < 0.01 in at least one transformed data set, as detected by the EM Haseman-Elston method in GENEHUNTER.
Column headings: Log-transformed SR (peak position); Categorized SR (peak position).
Marker pairs: GATA31A10 & GATA24D12; 183xh10 & ATA18A07; GGAA5C04 & GATA90D07; GATA51E09 & GGAA21G11; GATA73F01 & GATA27A03; ATA78D02 & GATA185H04; GATA47F05 & 321xd1.
Table 3. P-values for genomic locations linked to smoking rate (SR), log-transformed SR, and categorized SR across the human genome, for regions with a p-value < 0.01 in at least one data set, as detected by the SIBPAL program.
Column headings: SR (peak position); Log-transformed SR (peak position); Categorized SR (peak position).
Marker pairs: GATA42H02 & 165xc11; GGAA5C04 & GATA90D07; ATA78D02 & GATA185H04; ATC6A06 & GATA49C09.
In this study, using the EM Haseman-Elston QTL regression method implemented in GENEHUNTER and the SIBPAL program of S.A.G.E., we obtained evidence for significant linkage of smoking rate to chromosome 11 and suggestive linkage to chromosomes 4, 7, 9, 14, and 17. Additionally, our results suggest that genomic regions on chromosomes 1, 6, 12, 15, 20, and 21 may harbor susceptibility genes for nicotine dependence (p < 0.01). Of these loci, three, on chromosomes 7, 11, and 17, were identified by both linkage analysis methods.
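For readers more used to LOD scores than to pointwise p-values, the following sketch converts a one-sided pointwise p-value to an equivalent LOD score and labels it against genome-wide criteria. The source does not state which thresholds were applied; the sketch assumes the widely cited Lander and Kruglyak criteria for sib-pair studies (suggestive: p = 7.4 × 10^-4; significant: p = 2.2 × 10^-5) and assumes the test statistic behaves like a one-sided normal deviate. The function names are hypothetical.

```python
import math
from statistics import NormalDist

# Assumed genome-wide criteria (Lander and Kruglyak), not taken from the source.
SUGGESTIVE_P = 7.4e-4
SIGNIFICANT_P = 2.2e-5


def pointwise_p_to_lod(p_value):
    """Convert a one-sided pointwise p-value to an equivalent LOD score.

    Assumes the statistic is a one-sided standard normal deviate z,
    so that LOD = z**2 / (2 * ln 10).
    """
    if not 0.0 < p_value < 0.5:
        raise ValueError("p_value must lie strictly between 0 and 0.5")
    z = NormalDist().inv_cdf(1.0 - p_value)
    return z * z / (2.0 * math.log(10.0))


def classify_linkage(p_value):
    """Label a pointwise p-value against the assumed thresholds."""
    if p_value <= SIGNIFICANT_P:
        return "significant"
    if p_value <= SUGGESTIVE_P:
        return "suggestive"
    return "below suggestive"


for p in (0.01, SUGGESTIVE_P, SIGNIFICANT_P):
    print(f"p={p:g}  LOD={pointwise_p_to_lod(p):.2f}  {classify_linkage(p)}")
```

Under these assumptions the two thresholds correspond to LOD scores of roughly 2.2 and 3.6, which is why a region reported only at p < 0.01 would fall short of suggestive linkage.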
Although our mapping results provided only weak evidence for linkage of smoking rate to chromosome 20 (p = 0.0063; see Table 2), this locus appears to be interesting. Using the same Framingham Heart Study data, two other research groups [11, 12] independently reported weak linkage of the maximum number of cigarettes smoked per day, across the first four exams or across all exams of the original and offspring cohorts, to the same region of chromosome 20 identified in this study. To our knowledge, no previous studies in the literature have identified linkage or association with this region on chromosome 20. Therefore, it will be of interest to confirm this finding in other studies.
We obtained lower p-values for most regions reported here than did other GAW13 analyses that evaluated smoking as the phenotype, possibly because of how the smoking phenotype data were selected from the Framingham Heart Study data. As indicated earlier, we used only the smoking information obtained from the exams conducted during 1970–1971, instead of the maximum number of cigarettes smoked per day across multiple exams over many years. Epidemiological studies have indicated a steady and dramatic decline of 40% in the prevalence of cigarette smoking among people 18 years or older in the US from 1965 to 1990 [13]. This was also true in the Framingham Heart Study data (data not shown). Therefore, using smoking information obtained from multiple exams over a long period of time may affect estimation of the genetic and environmental parameters, and thus eventually the linkage analysis results.
Nicotine dependence is a complex trait with strong genetic and environmental influences. Many years of genetic epidemiological studies have documented that smoking behavior is determined by multiple genetic and environmental factors and by interactions among these factors.
Strong evidence for linkage of smoking behavior to chromosome 5q has been reported from an analysis of the COGA data [5]. Linkage of smoking behavior to chromosome 5 was also reported by another study that used a different linkage analysis method, but at a marginal level of significance [6]. In another independent study, Straub et al. [4] identified several possible regions for nicotine dependence on chromosomes 2, 4, 10, 16, 17 and 18 in the Christchurch sample of New Zealand but failed to confirm these regions in the Richmond sample of Virginia. This was probably due to insufficient statistical power resulting from the small sample size of the Richmond cohort (91 families with 264 genotyped individuals). Compared with the research described above, a much larger sample was used in the present study, which may partly account for the significant p-values obtained here.
There are limitations to this study. For example, we used the number of cigarettes smoked per day as an indirect measure of nicotine dependence, without considering which cigarette brand each smoker smoked. Nicotine content is known to vary substantially among cigarette brands. Therefore, the smoking rate phenotype used here may represent only a very rough measure of nicotine dependence. Given the objective of this study and the limitations of the data set, we did not, nor were we able to, distinguish individuals in the non-smoking group who had been passively exposed to smoking (i.e., through second-hand smoke) from those who were never exposed to nicotine. As documented earlier, the transformed smoking phenotype still deviated slightly from a normal distribution; however, we do not expect the remaining kurtosis to have a large effect on the linkage results reported here, because only model-free methods were used in the analysis and they tend to be more robust to non-normality in the data. Also, the participants in the Framingham Heart Study are predominantly Caucasian Americans. Accordingly, it is of interest to know whether these findings can be replicated in other ethnic populations.
The 313 extended Framingham Heart Study families were analyzed to identify susceptibility loci for smoking rate using the Haseman-Elston regression methods implemented in GENEHUNTER and the SIBPAL program of S.A.G.E. Our genome-wide scan results provided evidence for significant linkage of smoking rate to chromosome 11 and suggestive linkage to chromosomes 4, 7, 9, 14, and 17. Additionally, we found that several regions on chromosomes 1, 6, 12, 15, 20, and 21 are potentially of interest (p < 0.01). Interestingly, the genomic regions on chromosomes 7, 11, and 17 were identified by both linkage methods. To our knowledge, most of the susceptibility regions for smoking rate identified in this study have not been reported previously, and thus replication of these findings is an important next step.
This work was partially supported by NIH grants DA-12844 to MDL and GM-28356 to RCE. Some results of this paper were obtained by using the program package S.A.G.E., which is supported by a U.S. Public Health Service Resource Grant (RR03655) from the National Center for Research Resources. The Framingham Heart Study is conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with Boston University. This manuscript was not prepared in collaboration with investigators of the Framingham Heart Study and does not necessarily reflect the opinions or views of the Framingham Heart Study, Boston University, or NHLBI.
1. Hughes JR: Genetics of smoking: a brief review. Behav Ther. 1986, 17: 335-345. doi:10.1016/S0005-7894(86)80066-1.
2. Sullivan PF, Kendler KS: The genetic epidemiology of smoking. Nicotine Tobacco Res. 1999, 1: S51-S57.
3. Li MD, Cheng R, Ma JZ, Swan GE: A meta-analysis of estimated genetic and environmental effects on smoking behavior in male and female adult twins. Addiction. 2003, 98: 23-31. doi:10.1046/j.1360-0443.2003.00295.x.
4. Straub RE, Sullivan PF, Ma Y, Myakishev MV, Harris-Kerr C, Wormley B, Kadambi B, Sadek H, Silverman MA, Webb BT, Neale MC, Bulik CM, Joyce PR, Kendler KS: Susceptibility genes for nicotine dependence: a genome scan and followup in an independent sample suggest that regions on chromosomes 2, 4, 10, 16, 17 and 18 merit further study. Mol Psychiatry. 1999, 4: 129-144. doi:10.1038/sj.mp.4000518.
5. Duggirala R, Almasy L, Blangero J: Smoking behavior is under the influence of a major quantitative trait locus on human chromosome 5Q. Genet Epidemiol. 1999, 17 (suppl 1): S139-S144.
6. Bergen AW, Korczak JF, Weissbecker KA, Goldstein AM: A genome-wide search for loci contributing to smoking and alcoholism. Genet Epidemiol. 1999, 17 (suppl 1): S55-S60.
7. Kruglyak L, Daly MJ, Reeve-Daly MP, Lander ES: Parametric and nonparametric linkage analysis: a unified multipoint approach. Am J Hum Genet. 1996, 58: 1347-1363.
8. Kruglyak L, Lander ES: Complete multipoint sib-pair analysis of qualitative and quantitative traits. Am J Hum Genet. 1995, 57: 439-454.
9. Statistical Solutions: S.A.G.E.: Statistical Analysis for Genetic Epidemiology, Release 4.2. Cork, Ireland. 2002.
10. Nyholt DR: All LODs are not created equal. Am J Hum Genet. 2000, 67: 282-288. doi:10.1086/303029.
11. Bergen AW, Yang R, Bai Y, Beerman M, Goldstein AM, Goldin LR: Genomic regions linked to alcohol consumption in the Framingham Heart Study. BMC Genetics. 2003, 4 (suppl 1): S101. doi:10.1186/1471-2156-4-S1-S101.
12. Goode EL, Badzioch MD, Kim H, Gagnon F, Rozek LS, Edwards KL, Jarvik GP: Multiple genome-wide analysis of smoking behavior in the Framingham Heart Study. BMC Genetics. 2003, 4 (suppl 1): S102. doi:10.1186/1471-2156-4-S1-S102.
13. Giovino GA, Henningfield JE, Tomar SL, Escobedo LG, Slade J: Epidemiology of tobacco use and dependence. Epidemiol Rev. 1995, 17: 48-65.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.