Genome-wide linkage analysis of blood pressure under locus heterogeneity

We describe a method for mapping quantitative trait loci that allows for locus heterogeneity. A genome-wide linkage analysis of blood pressure was performed using sib-pair data from the Framingham Heart Study. Evidence of linkage was found at four markers (GATA89G08, GATA23D06, GATA14E09, and 049xd2) at a significance level of 0.01. Two of them (GATA14E09 and 049xd2) appear to overlap with previously reported linkage signals, while the other two do not correspond to any previously reported signal.


Background
High blood pressure (BP) is an important risk factor for cardiovascular disease and is a leading cause of mortality in industrialized countries [1]. Being a complex trait, BP may be influenced by genes at different locations on the genome in different families (inter-family locus heterogeneity). Presumably, allowing for locus heterogeneity in the analysis will increase the power to detect linkage. We propose a model that allows for locus heterogeneity for quantitative trait loci (QTLs). This model generalizes the locus heterogeneity model for qualitative traits [2,3] to the case of continuous traits. A genome-wide linkage analysis was conducted on the Framingham Heart Study data using the likelihood-ratio statistic based on this model.

Data
The Framingham Heart Study consists of two cohorts with a total of 10,333 subjects. The first cohort includes 5209 individuals who were recruited in 1948, when the study began. The second cohort includes 5124 subjects who are offspring, or spouses of offspring, of the initial participants. Longitudinal data were recorded over the follow-up years in both cohorts. Among the 330 pedigrees provided by the Genetic Analysis Workshop 13 (GAW13), 1702 subjects were genotyped at one or more markers and 2885 subjects were phenotyped for BP at least once.
To remove the effect of hypertension treatment, we restricted the analysis to the 1909 subjects who never received any antihypertensive medication. Their characteristics are summarized in Table 1. Systolic blood pressure was adjusted for age, sex, and body mass index (BMI) as follows. The averages of BP, age, and BMI were computed for each subject, and the average BP was then regressed on average age, average BMI, and sex, separately for each of the two cohorts. The residuals of this linear regression were used as the (adjusted) phenotypic values. The two-point identity-by-descent (IBD) sharing probabilities for relative pairs provided by GAW13 were used. Only sib pairs were used in the analysis. Sib pairs from the same sibship, or from different sibships in the same pedigree, were treated as if they were biologically unrelated.
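The covariate adjustment described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `adjusted_phenotype`, the argument names, and the 0/1 coding of sex are assumptions, and the per-subject averages are taken as already computed.

```python
import numpy as np

def adjusted_phenotype(bp_mean, age_mean, bmi_mean, sex):
    """Residuals from regressing per-subject mean BP on mean age, mean BMI, and sex.

    All arguments are equal-length sequences, one entry per subject.
    Sex is assumed to be coded 0/1 (hypothetical coding).
    """
    bp_mean = np.asarray(bp_mean, dtype=float)
    # Design matrix: intercept, mean age, mean BMI, sex indicator.
    X = np.column_stack([
        np.ones_like(bp_mean),
        np.asarray(age_mean, dtype=float),
        np.asarray(bmi_mean, dtype=float),
        np.asarray(sex, dtype=float),
    ])
    # Ordinary least squares fit; the residuals are the adjusted phenotypes.
    beta, *_ = np.linalg.lstsq(X, bp_mean, rcond=None)
    return bp_mean - X @ beta
```

To mirror the text, the function would be called once per cohort, so that each cohort gets its own regression coefficients.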
A total of 400 markers were used in the genome scan. Because not all sib pairs were genotyped at every marker, the number of sib pairs used in the analysis varied from marker to marker, roughly from 400 to 1500.
A significant departure from normality was detected in the (adjusted) BP measurements using the Shapiro-Wilk test (p-value < 0.0001). To reduce the impact of this non-normality on the performance of the proposed statistic, the BP measurements were transformed based on the normal copula model [4] so that the transformed trait values follow a standard normal distribution, with mean 0 and variance 1.
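The text does not spell out the transformation, so the sketch below uses a rank-based inverse normal transformation as a stand-in. It shows one common way to map a sample to standard normal scores and should not be read as the copula-based procedure of [4]. The function name `normal_scores` and the `(rank - 0.5) / n` offset are assumptions.

```python
from statistics import NormalDist

def normal_scores(values):
    """Map values to standard normal quantiles by rank (ties share an average rank)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the current group of tied values.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # (rank - 0.5) / n keeps every probability strictly inside (0, 1).
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]
```

Because the scores depend only on ranks, the transformed values are symmetric about 0 regardless of the skewness of the input.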

Statistical methods
Consider a sib pair whose trait values are denoted by $y_1$ and $y_2$. The trait values are standardized to have mean 0 and variance 1. It is assumed that the candidate locus has an additive effect but no dominance effect on the trait. Let $\rho_0$, $\rho_1$, and $\rho_2$ be the correlation coefficients of $y_1$ and $y_2$ when the number of alleles shared IBD by the sib pair is 0, 1, and 2, respectively. Because there is no dominance effect, we have $\rho_1 = (\rho_0 + \rho_2)/2$ [4]. When there is no linkage, $\rho_1$ can be estimated by the sample correlation coefficient between $y_1$ and $y_2$ [5,6], so $\rho_1$ can be treated as known.
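The relation among the three correlations can be made explicit with the usual variance-components decomposition of the sib covariance. The symbols $\sigma_a^2$ and $\sigma_d^2$ (additive and dominance variance attributable to the candidate locus) and $c$ (sib covariance from all other sources) are introduced here for illustration only and do not appear in the original text.

```latex
% Trait standardized to variance 1, so covariances are correlations.
\[
  \rho_j = c + \tfrac{j}{2}\,\sigma_a^2 + \mathbf{1}\{j = 2\}\,\sigma_d^2,
  \qquad j = 0, 1, 2.
\]
% With no dominance effect, \sigma_d^2 = 0:
\[
  \rho_0 = c, \quad
  \rho_1 = c + \tfrac{1}{2}\sigma_a^2, \quad
  \rho_2 = c + \sigma_a^2
  \;\Longrightarrow\;
  \rho_1 = \tfrac{1}{2}(\rho_0 + \rho_2).
\]
```

Under the same assumption, averaging $\rho_j$ over the sib-pair IBD probabilities 1/4, 1/2, and 1/4 gives $c + \sigma_a^2/2 = \rho_1$, which is consistent with estimating $\rho_1$ from the overall sib-pair correlation.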
Let $\alpha$ be the probability that a QTL is linked to the candidate locus for the sib pair. Then the likelihood function for the sib pair is $\alpha L(\rho_2; y_1, y_2) + (1 - \alpha) L(\rho_1; y_1, y_2)$.
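Let $i$ index the sib pairs in a collection of $n$ sib pairs. The log-likelihood function for these sib pairs is
$$\ell(\alpha, \rho_2) = \sum_{i=1}^{n} \log\left[\alpha L(\rho_2; y_{i1}, y_{i2}) + (1 - \alpha) L(\rho_1; y_{i1}, y_{i2})\right],$$
where $y_{i1}$ and $y_{i2}$ are the trait values for the $i$th sib pair. The hypotheses of interest are the null hypothesis of no linkage ($\alpha = 0$ or $\rho_2 = \rho_1$, in which case every sib pair has likelihood $L(\rho_1; y_{i1}, y_{i2})$) versus the alternative of linkage. This model is an extension of the heterogeneity model of Smith [2] for dichotomous traits to continuous traits. The likelihood ratio statistic was used as the test statistic. To obtain the maximum likelihood estimates (MLEs) of the parameters, we used a two-dimensional grid search over $\alpha$ and $\rho_2$, with a grid size of 0.01 for both parameters.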
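The grid search for the likelihood ratio statistic can be sketched as follows. The text does not give the explicit form of L(ρ2; y1, y2), so this sketch assumes it is a mixture of bivariate normal densities weighted by the pair's IBD probabilities at the marker, with ρ0 = 2ρ1 - ρ2. It also assumes the search is restricted to ρ2 ≥ ρ1 and capped at 0.99. All function and parameter names are hypothetical.

```python
import math

def bvn_density(y1, y2, rho):
    """Bivariate normal density with zero means, unit variances, and correlation rho."""
    one_minus = 1.0 - rho * rho
    q = (y1 * y1 - 2.0 * rho * y1 * y2 + y2 * y2) / one_minus
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(one_minus))

def linked_likelihood(y1, y2, ibd, rho1, rho2):
    """Assumed form of L(rho2; y1, y2): mixture over IBD states 0, 1, 2."""
    rho0 = 2.0 * rho1 - rho2  # follows from rho1 = (rho0 + rho2) / 2
    return sum(p * bvn_density(y1, y2, r) for p, r in zip(ibd, (rho0, rho1, rho2)))

def het_lrt(pairs, rho1, step=0.01, rho_max=0.99):
    """Grid search over (alpha, rho2); returns (LRT statistic, alpha_hat, rho2_hat).

    pairs: list of (y1, y2, (p0, p1, p2)), where (p0, p1, p2) are the pair's
    IBD probabilities at the marker. The 1e-300 floor guards against log(0)
    caused by numerical underflow.
    """
    unlinked = [bvn_density(y1, y2, rho1) for y1, y2, _ in pairs]
    null_ll = sum(math.log(max(u, 1e-300)) for u in unlinked)
    best_ll, best_alpha, best_rho2 = null_ll, 0.0, rho1
    n_alpha = round(1.0 / step)
    k = 0
    while True:
        rho2 = rho1 + k * step
        # Stop once rho2 or the implied rho0 leaves the allowed range.
        if rho2 > rho_max or 2.0 * rho1 - rho2 < -rho_max:
            break
        linked = [linked_likelihood(y1, y2, ibd, rho1, rho2) for y1, y2, ibd in pairs]
        for a in range(n_alpha + 1):
            alpha = a / n_alpha
            ll = sum(math.log(max(alpha * l + (1.0 - alpha) * u, 1e-300))
                     for l, u in zip(linked, unlinked))
            if ll > best_ll:
                best_ll, best_alpha, best_rho2 = ll, alpha, rho2
        k += 1
    return 2.0 * (best_ll - null_ll), best_alpha, best_rho2
```

Restricting the inner loop to alpha = 1 gives the corresponding statistic for a homogeneity model under the same assumptions. Because alpha = 0 is on the grid, the returned statistic is never negative.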
The asymptotic distribution of the likelihood ratio statistic is not known analytically, so a simulation study with 10,000 replicates was carried out. The trait locus was assumed to have two equally frequent alleles, with heritability set to 0.3. The unlinked marker was assumed to be fully polymorphic. The 90th, 95th, 99th, and 99.9th percentiles of the simulated distribution were 2.72, 3.91, 6.61, and 11.07, respectively.
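As a small illustration of how these simulated percentiles serve as critical values, the sketch below maps an observed statistic to the smallest tabulated significance level it reaches. The function name is hypothetical, and treating a value equal to the cutoff as significant is an assumption.

```python
# Simulated null percentiles of the likelihood ratio statistic reported in the text,
# stored as (significance level, critical value) in order of increasing cutoff.
NULL_CRITICAL_VALUES = [(0.10, 2.72), (0.05, 3.91), (0.01, 6.61), (0.001, 11.07)]

def smallest_significant_level(lrt):
    """Return the smallest tabulated level at which lrt is significant, or None."""
    best = None
    for level, cutoff in NULL_CRITICAL_VALUES:
        if lrt >= cutoff:
            best = level  # later entries have smaller levels
    return best
```

For example, a statistic of 7.0 exceeds the 99th percentile (6.61) but not the 99.9th (11.07), so it is significant at level 0.01.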

Results
A genome scan was conducted using three statistics: the test statistic of the Haseman-Elston method (H-E) [7], the likelihood ratio statistic (HET-LRT) for the heterogeneity model described above, and the likelihood ratio statistic (HOM-LRT) for the homogeneity model, which is a special case of the heterogeneity model with α fixed at 1. The markers that are significant for at least one statistic at significance level 0.01 are listed in Table 2. At this level, HET-LRT identifies four significant markers. Among them, GATA14E09 on chromosome 8 is very close to 8q21.11, a region that has shown linkage to BP [8]. Marker 049xd2 on chromosome 16 lies between 16p12 and 16p13.1, a region in which linkage to BP was found by two independent studies [8,9]. To the best of our knowledge, the other two markers, GATA89G08 on chromosome 5 and GATA23D06 on chromosome 8, do not correspond to any previous findings.

Discussion
We introduce a heterogeneity model for mapping QTLs using sib-pair data. Using this new method, a genome scan was performed on population-based data to map QTLs affecting BP variation. At significance level 0.01, evidence of linkage was found in four marker regions on chromosomes 5, 8, and 16. Two of the markers appear to overlap with linkage signals from previous studies, while the other two do not correspond to any previous findings.
Of the three statistics used in the genome scan, H-E yields the most linkage signals, followed by HOM-LRT and then HET-LRT. At the four markers at which HET-LRT is significant (and at many nonsignificant markers that are not shown), the value of HET-LRT is the same as that of HOM-LRT. A finer grid may change the values of HOM-LRT and HET-LRT, but the changes are expected to be small. Experiments on the four significant markers with a grid size of 0.001 strongly support this claim: HET-LRT remains equal to HOM-LRT at all four markers.
The proposed statistic HET-LRT is intended for a population sample. Its performance in selected samples is unknown. Although the normal copula model can be used on selected samples to recover normality, its effectiveness has yet to be investigated. It is worth pointing out that our analysis excluded individuals on antihypertensive medication. Therefore, care should be taken when generalizing our results to the general population. For a recent review of the issues involved in dealing with antihypertensive treatment, see Palmer [10].
We have been using the posterior probability of linkage (PPL) to assess linkage signals across heterogeneous data sets [11-14]. Since the proposed heterogeneity model for quantitative traits is parallel to that for qualitative traits, it is reasonable to adapt our previous work on qualitative traits to the current setting. The details are being worked out.
The proposed model is for sib-pair data only. The use of nonindependent sib pairs, as in this analysis, is not expected to affect the type I error rate asymptotically. It would be of interest to see how this model can be generalized to general pedigrees and how the generalized model performs.