Generalizing Terwilliger's likelihood approach: a new score statistic to test for genetic association

Background: In this paper, we propose a one degree of freedom test for association between a candidate gene and a binary trait. This method is a generalization of Terwilliger's likelihood ratio statistic and is especially powerful for the situation of one associated haplotype. As an alternative to the likelihood ratio statistic, we derive a score statistic, which has a tractable expression. For haplotype analysis, we assume that phase is known. Results: By means of a simulation study, we compare the performance of the score statistic to Pearson's chi-square statistic and the likelihood ratio statistic proposed by Terwilliger. We illustrate the method on three candidate genes studied in the Leiden Thrombophilia Study. Conclusion: We conclude that the statistic follows a chi square distribution under the null hypothesis and that the score statistic is more powerful than Terwilliger's likelihood ratio statistic when the associated haplotype has frequency between 0.1 and 0.4 and has a small impact on the studied disorder. With regard to Pearson's chi-square statistic, the score statistic has more power when the associated haplotype has frequency above 0.2 and the number of variants is above five.


Background
Haplotype analysis is a popular strategy to analyze multiple single nucleotide polymorphisms (SNPs) typed within one candidate gene [1,2]. Most of the genetic variation of a gene is described by 4 to 8 relatively common haplotypes (frequencies > 0.05). To compare the distribution of observed haplotype frequencies between cases and controls, the global test statistic of Schaid et al. [3] can be used when haplotype ambiguity exists. When no uncertainty in phase exists, for example in regions with little recombinations, the simple Pearson's chi square statistic is used to test for association. Both statistics test for any difference in haplotype frequencies between cases and controls and the degrees of freedom depend on the number of observed haplotypes. However, researchers often aim to identify one ancestral haplotype [4]. For this situation one may consider a test which summarizes the difference between the null hypothesis and the alternative hypothesis in one parameter, yielding a more powerful one degree of freedom test. This paper aims to derive such a method.
For multi-allelic markers, Terwilliger [5] proposed a model with one parameter (λ) for the excess frequency of the associated allele in the cases. To test for association he proposed a likelihood ratio test, i.e. comparing the likelihood under the alternative to the likelihood under no association, for testing. Since it is unknown which allele is associated, the likelihood is a weighted sum of conditional likelihoods of the data given that the allele is associated over the observed marker alleles. As for weights Terwilliger [5] proposed to use the allele frequencies in the population from which the cases and controls were sampled. Thus the most common allele is assigned the highest weight in the analysis. However, for haplotype analysis it is usually not the most common haplotype in the population which is associated to the disease. Therefore in this paper, we propose a model which assigns constant weight to each haplotype. We derive a score statistic from the likelihood of this model.
The log likelihood function of Terwilliger's model also appears to have some unusual statistical features. For example, the allele frequencies that are used as weights are also unknown parameters in the conditional likelihood functions. Maximizing the log likelihood function appears not always straightforward. Further its distribution under the null hypothesis is unknown. Terwilliger proposed to approximate the distribution of the likelihood ratio (TLR) statistic under the null hypothesis with the 50 : 50 mixture of chi square distributions of null and one degrees of freedom. This distribution appeared to yield conservative p-values [6]. Another property of the likelihood function is that the score function, the first derivative of the log likelihood function with respect to λ evaluated at λ = 0 is a constant zero for any observed data for the weights that Terwilliger proposed. Hence the corresponding score statistic does not exist. Score statistics are often preferred over likelihood ratio statistics, because they have often a tractable expression which provides insight in the data and are robust against model deviations [7]. When constant weights instead of allele frequencies as weights are used the score statistic exists. Under the null hypothesis, the statistic follows a normal distribution.
In this paper we derive a method for the extreme situation of phase unambiguity, which means that we study genetic regions with almost no recombination [8]. This assumption is reasonable for many genes, for example fibrinogen alpha, beta and gamma [9] and CRP [10]. When some uncertainty about phase exists, the p-values should be adapted. Our program [11] is able to compute valid p-values for the situation of phase ambiguity by permutation of case and control status. The program calls the publicly available software TAGSNPS [8] to derive the haplotypes in each permutation step (see also Becker et al. [12] for similar methods).
The methods derived in this paper are meant to detect a common haplotype. In general genetic association studies have power to detect common variants. Thus these methods should detect associations under the common disease-common variant hypothesis. However, it is also possible that a rare causal mutation is sometimes present on the identified haplotype. For example Hylckama Vlieg et al. [13] detected a common haplotype that was overrepresented in patients with deep venous thrombosis compared to healthy controls. Many homozygous carriers of this haplotype carried the factor V Leiden mutation, the most important genetic risk factor for deep venous thrombosis, which had been discovered before by other methods [14]. Thus if homozygous carriers of the associated haplotype were sequenced the factor V Leiden mutation would have been detected.
We illustrate the method by an association analysis of three candidate genes, fibrinogen alpha (FGA), beta (FGB) and gamma (FGA)in the Leiden Thrombophilia Study (LETS) [9]. In these data, no phase ambiguity was present and the pair of haplotypes could be derived for each individual. The number of common haplotypes varied from 3 to 5. The frequency of the associated haplotype varied from 0.2 to 0.3. In this paper, we consider the score statistic as an alternative to the classical chi-square and the original TLR statistic of [5] in the context of haplotypes. We also include the likelihood ratio statistic corresponding to the log likelihood using equal weights. We carried out a simulation study to examine the performance of the four statistics under the null hypothesis and to compare the power of the four statistics under various alternatives. Finally we illustrate the proposed statistics by analysis of the haplotype distributions of the FGG, FGB and FGA genes.

Simulation study
By means of a simulation study we compared the performance of the new score statistic with Pearson chi squared χ 2 and the Terwilliger's likelihood ratio with weights equal to p j 's (TLR). We also considered a likelihood ratio test with equal weights (LR). We first evaluated the type I error rates of the test statistics. For the score statistic we used the chi square distribution with one degree of freedom to approximate the distribution under the null hypothesis. For the LR and TLR statistics we used thê  Table 1.
For all m, the type I error rates of the score statistic were maintained at the nominal error rate. For m < 10, the type I error rates of Pearson's chi square corresponded to the nominal level, but became conservative for larger m due to sparse data. For all m considered, the type I error rates for the TLR statistic were conservative (< 0.03). The type I error rates for the LR statistic were also somewhat small (≈ 0.04), but were better than that of the TLR statistic.
To study the power of the statistics, we generated 10,000 samples of 100 case chromosomes and 100 control chro-mosomes from the multinomial distributions with probabilities p 1 (1 -λ) + λ and p j (1 -λ) for j = 2ʜm for cases and p j j = 1ʜm for controls, respectively. First, we considered the model used by Terwilliger [5]. The most common haplotype frequency p 1 in controls was again set to 0.5 and this haplotype was more frequent in cases. The parameter λ was fixed to 0.5 which corresponds to a haplotype frequency of 0.75 in the cases. The number of haplotypes m was again set to 4, 5, 8, 10, 15 and 20. The results are shown in the right columns of Table 1.

Data example
We applied the three statistics to a study on haplotype association of fibrinogen alpha (FGA), beta (FGB) and gamma (FGG) with risk of deep venous thrombosis in LETS [15,16]. Fifteen haplotype tagging SNPs were typed in 474 cases and 474 controls [9]. Within the three genes, the SNPs were in high linkage disequilibrium ( > 0.95).
The number of common haplotypes (frequency greater than 5%) describing FGG, FGA and FGB were three, five and five respectively. Since we focus on common haplotypes, we pooled the rare haplotypes with similar haplotypes by dropping rare tagging SNPs. For FGG, we pooled H4 with H1 and for FGB, we pooled H5 with H4 and H7 with H3 (see Uitte de Willige et al. [9] for more details on the tagging SNPs). In this analysis we considered p-values below 0.05 to be significant. In Table 2 the data are described and the results are given.
For all genes, haplotype H2 appeared to be more frequent in the cases than in the controls. For FGG, FGA and FGB the allelic odds ratios of presence of H2 versus the rest was 1.34, 1.29 and 1.28 respectively. Note that these oddŝ  Although for each subject we were able to derive the pair of haplotypes, we also estimated the p-values of the statistics by 1000 permutations using TAGSNPS to estimate the haplotype counts in each permutation step. The results were similar.

Discussion
In this report we have derived a new score statistic to test for association between a candidate gene and a binary trait. Score statistics are easy to compute, locally most powerful, and robust to small model deviations under the alternative [7]. From our simulations it appears that under the null hypothesis the statistic follows a chi-square distribution. For candidate genes with a small impact on the disease (λ small), five to eight observed variants, and a frequency of the associated variant in the range of 0.2 to 0.4, the new statistic performs better than the Terwilliger's LR statistic. For larger frequencies (around 0.5) or for a large number of variants (m > 8), Terwilliger's LR statistic appears to perform better, because the weight corresponding to the associated haplotype is larger. Compared to Pearson's chi-square statistic, the score statistic has more power when the associated haplotype has frequency larger than 0.2 and the number of observed variants m is larger than 5. Especially when the frequency of the associated haplotype is small (< 0.2), Pearson chi-squared statistic performs similar or better than the score statistic.
We used a mixture of two chi-square distributions to approximate the distribution of Terwilliger's LR statistic under the null hypothesis. It is known that this distribution appears to give conservative p-values [6]. Permutations of the case-control status may result in better pvalues. However, permutations are not attractive when no simple formulae for the statistic exists.
When the sampled population is not in Hardy Weinberg equilibrium, the considered statistics are not valid since they assume that the haplotypes of a subject are independent. For Pearson's chi-square statistic on haplotypes, a multi-allelic version of Armitage's trend test on pairs of haplotypes is available [17,18]. Under Hardy Weinberg equilibrium these statistics are asymptotically equivalent.
To derive a one degree of freedom score statistic for pairs of haplotypes, a logistic regression model can be used with the number of associated haplotypes as covariate.
Then the total likelihood is the average of the likelihoods over these logistic regression models of possible associated haplotypes. Here more research is needed. In this paper we used a smoothing approach to deal with the fact that the associated haplotype is unknown. As alternative, one may consider taking the maximum over each haplotype. A disadvantage of the latter approach is that permutations are needed to derive a p-value [19].
In this paper we assumed that the haplotypes could be derived from the typed tagging SNPs. When uncertainty in phase exists the haplotype frequencies have to be estimated using for example an EM algorithm [20] and the uncertainty has to be taken into account when the p-values are derived. This can easily be done by using permutations. Our program [11] is able to compute valid p-values in case of ambiguous phase by calling TAGSNPS [8] to derive the haplotype counts in each permutation step. It may be more efficient to incorporate phase unambiguity in the likelihood function. The gain in efficiency will probably be small. Analyzing genes with little information on phase yields many relatively rare haplotypes and interpretation of the results is often hard. Using permutations to deal with haplotype uncertainty is in the line with recent developments in weighted regression analysis [21] and in imputation of haplotypes [22]. The new score statistic has no power when the frequency of the associated haplotype in the pooled set of cases and controls iŝ . This raises the question if a better summary statistic of Pearson's chi square statistic into an one degree of freedom statistic exists. Here more research is needed.
As an alternative for the score function, the second derivative of the log likelihood may be used [23]. In another paper we study the performance of this second derivative of the log likelihood function of Terwilliger [24].

Conclusion
We conclude that by choosing alternative weights, in particular constant weights, in the likelihood of Terwilliger, a set of new powerful and robust statistical tests was derived. For genetic association studies aiming to identify common associated variants, the new statistic should be used in addition to the standard Pearson's chi square statistic. By using both statistics more insight in the data can be obtained. A program [11] which computes the statistics and corresponding p-values is freely available.

Methods
Let m be the number of haplotypes describing most of the genetic variation in a gene. Assume that the haplotype frequencies are in Hardy-Weinberg equilibrium proportions. Let p = (p 1 ,ʜ, p m ) be the vector of haplotype frequencies in controls. Assume that only one haplotype denoted with index i is over-represented in the cases, then the haplotype frequencies in the cases can be modelled as q i = p i + λ(1p i ) and q j = p j -λp j for j ∈ (1,ʜ,i -1, i + 1,ʜ,m). Let x = (x 1 ,ʜ,x m ) and y = (y 1 ,ʜ,y m ) be vectors of haplotype counts in the cases and the controls, respectively, and let n 1 and n 2 be the total number of case chromosomes and of control chromosomes, respectively, and n = n 1 + n 2 . Then the conditional likelihood L i given that haplotype i carries the mutation is given by and the likelihood proposed by Terwilliger is equal to with L j given in formula (1).
It is easy to see that likelihood function (2) can be generalized to the following likelihood function: with L j given in formula (1) and w = (w 1 ,ʜ,w m ) a vector of known positive weights restricted by . The first derivative of the log likelihood l(λ, p|x, y, w) = log(L(λ, p|x, y, w)) to λ evaluated in λ = 0 is equal to

Authors' contributions
REG participated in method development, wrote a C code, carried out the simulation study and data analysis, participated in interpreting results, and in drafting of the manuscript. SUdeW and MCHdeV provided the data and participated in interpreting the results. QH designed the web interface and extended the C code to account for phase uncertainty. LH participated in method development and in interpreting results. JJH-D participated in method development and in interpreting results, and participate in drafting of the manuscript. All authors read and approved the final version.