Volume 6 Supplement 1
Genetic Analysis Workshop 14: Microsatellite and single-nucleotide polymorphism
Fine-mapping using the weighted average method for a case-control study
 Kijoung Song†^{1},
 Mohammed S Orloff†^{1},
 Qing Lu†^{1} and
 Robert C Elston^{1}
DOI: 10.1186/1471-2156-6-S1-S67
© Song et al; licensee BioMed Central Ltd 2005
Published: 30 December 2005
Abstract
We present a new method for fine-mapping a disease susceptibility locus using a case-control design. The new method, termed the weighted average (WA) statistic, averages the Cochran-Armitage (CA) trend test statistic and the difference between the Hardy-Weinberg disequilibrium test statistic for cases and controls (the HWD trend). The main characteristic of the WA statistic is that it mitigates the weaknesses, and maintains the strengths, of both the CA trend test and the HWD trend test. Data from three different populations in the Genetic Analysis Workshop 14 (GAW14) simulated dataset (Aipotu, Karangar, and Danacaa) were first subjected to model-free linkage analysis to find regions exhibiting linkage. Then, for fine-scale mapping, 140 SNPs within the significant linkage regions were analyzed with the WA test statistic on replicates of the three populations, both separately and combined. The regions that were significant in the multipoint linkage analysis were also significant in this fine-scale mapping. The most significant regions obtained using the WA statistic were on chromosome 3 (B03T3056–B03T3058, p-value < 1 × 10^{-10}) and chromosome 9 (B09T8332–B09T8334, p-value < 1 × 10^{-6}). Based on the results for the simulated GAW14 data, the WA test statistic showed good performance and could narrow down the region containing the susceptibility locus. However, the strength of the signal depends on both the strength of the linkage disequilibrium and the heterozygosity of the linked marker.
Background
It has been shown that fine-scale mapping of a susceptibility locus for a complex disease can be accomplished by evaluating the deviation from Hardy-Weinberg equilibrium (HWE). For example, Feder et al. [1], Nielsen et al. [2] and Jiang et al. [3] have discussed using the Hardy-Weinberg disequilibrium (HWD) test on affected individuals alone. Their results indicate that this HWD test tends to perform well for a recessive disease model and could be more precise in gene localization, but has no power at all for a multiplicative disease model. For case-control studies, Sasieni [4] showed that the Cochran-Armitage (CA) trend test, which uses genotype data, is more appropriate than the allele-based test when HWE is violated. The CA trend test performs well when there is dominant inheritance and has power where the HWD test has none, but it requires allowance for population structure. Devlin and Roeder [5] proposed genomic control to allow for population heterogeneity when using the CA trend test. Song and Elston [6] proposed the HWD trend test, which compares the squared difference in HWD between cases and controls, divided by its estimated variance, with the chi-square distribution with 1 d.f. They showed by simulation that the HWD trend test statistic is not inflated by population stratification.
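The two component tests can be illustrated with a minimal Python sketch. The CA trend statistic below uses the standard additive scores (0, 1, 2) for a 2 × 3 genotype table. The HWD trend statistic follows the verbal description above (squared difference in the disequilibrium coefficient between cases and controls, divided by an estimated variance), but the variance used here is an assumed large-sample expression for the disequilibrium coefficient and may differ from the estimator in [6]. The function names and genotype counts are hypothetical and serve only as an illustration.

```python
import math


def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 1 d.f."""
    return math.erfc(math.sqrt(stat / 2.0))


def ca_trend(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend statistic for a 2 x 3 genotype table.

    cases and controls hold genotype counts ordered by the number of
    copies (0, 1, 2) of the allele of interest.
    """
    n_cases, n_controls = sum(cases), sum(controls)
    n_total = n_cases + n_controls
    totals = [r + s for r, s in zip(cases, controls)]
    sum_xr = sum(x * r for x, r in zip(scores, cases))
    sum_xn = sum(x * n for x, n in zip(scores, totals))
    sum_x2n = sum(x * x * n for x, n in zip(scores, totals))
    u = n_total * sum_xr - n_cases * sum_xn
    var_u = (n_cases * n_controls / n_total) * (n_total * sum_x2n - sum_xn ** 2)
    return u * u / var_u


def hwd(counts):
    """Return (D, estimated variance of D) for genotype counts (n0, n1, n2)."""
    n = sum(counts)
    p = (counts[1] + 2 * counts[2]) / (2 * n)  # allele frequency
    d = counts[2] / n - p * p                  # disequilibrium coefficient
    # ASSUMPTION: large-sample variance of D; not taken from the article.
    var_d = (p ** 2 * (1 - p) ** 2 + (1 - 2 * p) ** 2 * d - d * d) / n
    return d, var_d


def hwd_trend(cases, controls):
    """Squared case-control difference in D over its estimated variance."""
    d_case, v_case = hwd(cases)
    d_ctrl, v_ctrl = hwd(controls)
    return (d_case - d_ctrl) ** 2 / (v_case + v_ctrl)


# Hypothetical genotype counts for 100 cases and 100 controls.
cases = (20, 50, 30)
controls = (40, 45, 15)
ca_stat = ca_trend(cases, controls)
hwd_stat = hwd_trend(cases, controls)
print("CA trend:", ca_stat, "p =", chi2_1df_pvalue(ca_stat))
print("HWD trend:", hwd_stat, "p =", chi2_1df_pvalue(hwd_stat))
```

Both statistics are referred to the chi-square distribution with 1 d.f. The sketch does not handle monomorphic markers, for which the variances are zero.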
Song and Elston [6] developed a weighted average (WA) statistic that mitigates the weaknesses and maintains the strong points of both the CA trend test and the HWD trend test. In this study, we apply model-free linkage analysis to the Genetic Analysis Workshop 14 (GAW14) simulated dataset to find regions of interest, and then use the WA statistic to fine-map those regions in order to find disease susceptibility genes. Finally, we compare the results of the WA test with those of the CA trend and the HWD trend tests and find that for these data the WA test is virtually identical to the CA test.
Methods
Song and Elston [6] used simulation to show that asymptotically the random variable Y is well approximated by a Gamma distribution, F(y; θ, κ), with mean μ = 1.78 and variance σ^{2} = 3.45, i.e., θ = σ^{2}/μ = 1.94 and κ = μ^{2}/σ^{2} = 0.92. Choosing the best value for w was problematic because it depends on details of the alternative hypothesis, which are usually unknown. Thus, they chose the value of w indicated above after performing several exploratory simulation studies. They modeled the empirical α (p-value) that would predict the probability of type I error from the α corresponding to 1 − F(y_{(1−α)}; θ, κ) using the regression equation
where α_{ i }(i = 1, 2, ..., n) was estimated from a series of n simulation experiments. Based on extensive simulation results for various sample sizes > 50, they estimated a to be
1.4922[1/(log_{e}R × log_{e}S)] + 0.1208[log_{e}(R/S)] − 1.6929[1/(log_{e}R × log_{e}S × (0.5 − |M − 0.5|))] + 0.9250(M − 0.5)^{2}
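The Gamma approximation can be made concrete with a short Python sketch that recovers the scale θ and shape κ from the stated mean and variance by the method of moments, and evaluates the upper-tail probability 1 − F(y; θ, κ). The function names are hypothetical, and the tail probability uses the textbook power series for the regularized lower incomplete gamma function rather than any routine from [6]. The regression-based correction of the nominal α described above is not applied here.

```python
import math


def gamma_moments(mean, variance):
    """Method-of-moments scale (theta) and shape (kappa) of a Gamma."""
    theta = variance / mean
    kappa = mean ** 2 / variance
    return theta, kappa


def gamma_upper_tail(y, theta, kappa, max_terms=1000):
    """Return 1 - F(y; theta, kappa) for a Gamma with scale theta, shape kappa.

    Uses the power series of the regularized lower incomplete gamma
    function; accuracy degrades for very small tail probabilities
    because the result is formed as 1 minus the lower tail.
    """
    if y <= 0:
        return 1.0
    x = y / theta
    term = 1.0 / kappa
    total = term
    for n in range(1, max_terms):
        term *= x / (kappa + n)
        total += term
        if term < total * 1e-16:
            break
    lower = total * math.exp(-x + kappa * math.log(x) - math.lgamma(kappa))
    return max(0.0, 1.0 - lower)


theta, kappa = gamma_moments(1.78, 3.45)
print("theta =", round(theta, 2), "kappa =", round(kappa, 2))
# Nominal tail probability for a hypothetical observed value y = 6.0.
print("nominal p =", gamma_upper_tail(6.0, theta, kappa))
```

The moments reproduce the values θ = 1.94 and κ = 0.92 quoted above. For the very small p-values reported later in the article, a continued-fraction or library implementation of the upper tail would be numerically preferable.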
Genome scan
We analyzed the binary trait affected/unaffected in replicates from the three different populations in the dataset (Aipotu, Karangar, and Danacaa). For each of the three different populations, we first used the microsatellite markers that were on average 7.5 cM apart to perform a linkage scan. We used SIBPAL (S.A.G.E., version 4.6), a model-free linkage program, to perform a multipoint linkage analysis. Evidence for linkage was evaluated by Haseman-Elston regression [7] using a weighted average of the squared trait difference and the squared mean-corrected trait sum (option W4 [8]). The empirical p-values for each marker location were calculated and pooled over the populations.
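For orientation, the idea behind Haseman-Elston regression can be sketched in a few lines of Python. The sketch shows only the original, unweighted form [7], in which the squared sib-pair trait difference is regressed on the estimated proportion of alleles shared identical by descent (IBD); a negative slope suggests linkage. It is not the W4 weighting implemented in SIBPAL, and the function name and toy data are hypothetical.

```python
def haseman_elston_slope(ibd_sharing, squared_diffs):
    """Least-squares intercept and slope of squared trait difference on IBD sharing."""
    n = len(ibd_sharing)
    mean_x = sum(ibd_sharing) / n
    mean_y = sum(squared_diffs) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(ibd_sharing, squared_diffs))
    sxx = sum((x - mean_x) ** 2 for x in ibd_sharing)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical sib pairs: estimated IBD proportion at a marker location and
# squared difference of a binary (0/1) affection trait.
ibd = [0.0, 0.5, 0.5, 1.0, 0.0, 1.0, 0.5, 0.5]
sq_diff = [1, 1, 0, 0, 1, 0, 1, 0]
intercept, slope = haseman_elston_slope(ibd, sq_diff)
print("intercept:", intercept, "slope:", slope)
```

In practice the significance of the slope is assessed with a one-sided test, and the weighted form additionally uses the squared mean-corrected trait sum.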
Fine mapping
After reviewing the first-stage genome-scan linkage results, we selected for a case-control study 140 SNPs, spaced 0.3 cM apart, that encompassed the significant linkage regions. From each of three replicates (one from each population), we randomly sampled 100 affected probands as our cases and 100 unaffected probands as our controls. In an attempt to induce population stratification, we also pooled the replicates from the three populations to obtain 300 cases and 300 controls. The WA test statistic was then calculated for each of the four samples using the selected SNPs. We repeated the sampling from each population 5 times and recorded the percentage of times that the hypothesis of no disequilibrium was rejected. The potential confounding effect of population stratification could be allowed for by the genomic control method. Nineteen SNPs that were independent (>12 cM apart if on the same chromosome) and unlinked to the disease locus (p ≥ 0.8), spread throughout the whole genome, were selected for this purpose. However, for the 300 cases and 300 controls, the estimated variance inflation factor was close to 1.00 for each of the 5 replicate samples, suggesting that no significant stratification exists in this admixed population. Therefore, no adjustment for variance inflation was made for these data.
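The genomic control step can be sketched as follows. This is a minimal Python illustration in the spirit of Devlin and Roeder [5], assuming 1-d.f. chi-square statistics at the null SNPs and the median-based estimator of the variance inflation factor; the article does not state which estimator was used, and the function names and null statistics below are hypothetical.

```python
import statistics

# Median of the chi-square distribution with 1 d.f.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(null_stats):
    """Median-based variance inflation factor, bounded below by 1."""
    lam = statistics.median(null_stats) / CHI2_1DF_MEDIAN
    return max(1.0, lam)


def adjust(stat, lam):
    """Deflate an association statistic by the inflation factor."""
    return stat / lam


# Hypothetical 1-d.f. statistics at 19 unlinked null SNPs.
null_stats = [0.02, 0.10, 0.15, 0.21, 0.30, 0.33, 0.40, 0.42, 0.44, 0.46,
              0.51, 0.60, 0.75, 0.90, 1.10, 1.40, 1.90, 2.60, 3.50]
lam = inflation_factor(null_stats)
print("lambda:", lam, "adjusted statistic:", adjust(11.8, lam))
```

An estimate close to 1.00, as reported above for the pooled sample, leaves the test statistics essentially unchanged.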
Results
Genome scan using model-free linkage analysis
In our analysis, we observed 8 microsatellite markers with a significance level of p ≤ 0.005. Evidence of linkage was found on chromosome 1 (22-cM region that included D01S0023 (p = 0.00028) and D01S0024 (p = 0.0016)), chromosome 3 (12-cM region that included D03S0126 (p = 0.0012) and D03S0127 (p = 0.00022)), chromosome 5 (4-cM region that included D05S0172 (p = 0.0025)), and chromosome 9 (18-cM region that included D09S0348 (p = 0.000041) and D09S0349 (p = 0.0010)).
Fine mapping using the WA statistic method
A sample size as small as 100 cases and 100 controls showed on average strong association (p < 0.001) with a disease susceptibility gene in the region between markers B03T3056 and B03T3058 on chromosome 3 and weaker association at marker B09T8334 on chromosome 9 (p < 0.05). Using the sample size of 300 cases and 300 controls, the association analysis confirmed the linkage signals in all regions on chromosomes 1, 3, 5, and 9. In this analysis we found association signals between markers B01T0553 and B01T0555 on chromosome 1 (p = 0.00034) (Figure 1, panel 1), markers B03T3056 and B03T3058 on chromosome 3 (p < 1 × 10^{-10}) (Figure 1, panel 2), markers B05T4146 and B05T4148 on chromosome 5 (p = 0.00250) (Figure 1, panel 3), and markers B09T8332 and B09T8334 on chromosome 9 (p < 1 × 10^{-6}) (Figure 1, panel 4).
Discussion
In this paper we have illustrated the use of a new method for fine-mapping using a case-control design. If the mode of inheritance of a candidate gene is known, then with that knowledge a more powerful method can always be derived. The WA test was derived for a situation in which the mode of inheritance is not known. We compared the performance of the CA, WA, and HWD tests in this dataset and found that the HWD trend test always had low power. Thus, the WA test maintained the advantage of the CA trend test and overcame the weakness of the HWD trend test, while the CA and WA tests had almost equal power for these particular data. It should be noted that factors such as missing data and SNP density will affect the WA test to the same extent that they affect its components, the CA test and the HWD test.
Conclusion
Using the WA statistic, a sample size as small as 100 cases and 100 controls showed on average strong association (p < 0.001) with a disease susceptibility gene in the region between markers B03T3056 and B03T3058 on chromosome 3 and weaker association at marker B09T8334 on chromosome 9 (p < 0.05). For a larger sample size (300 cases and 300 controls), as expected, much more significant p-values were observed.
Notes
Abbreviations
CA: Cochran-Armitage
HWD: Hardy-Weinberg disequilibrium
HWE: Hardy-Weinberg equilibrium
WA: Weighted average
Declarations
Acknowledgements
This work was supported in part by grants from the U.S. Public Health Service: resource grant RR03655 from the National Center for Research Resources; research grant GM28356 from the National Institute of General Medical Sciences; and contract HD23342 from the National Institute of Child Health and Human Development.
References
 Feder JN, Gnirke A, Thomas W, Tsuchihashi Z, Ruddy DA, Basava A, Dormishian F, Domingo R, Ellis MC, Fullan A, Hinton LM, Jones NL, Kimmel BE, Kronmal GS, Lauer P, Lee VK, Loeb DB, Mapa FA, McClelland E, Meyer NC, Mintier GA, Moeller N, Moore T, Morikang E, Prass CE, Quintana L, Starnes SM, Schatzman RC, Brunke KJ, Drayna DT, Risch NJ, Bacon BR, Wolff RK: A novel MHC class I-like gene is mutated in patients with hereditary haemochromatosis. Nat Genet. 1996, 13: 399-408. 10.1038/ng0896-399.
 Nielsen DM, Ehm MG, Weir BS: Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus. Am J Hum Genet. 1998, 63: 1531-1540. 10.1086/302114.
 Jiang R, Dong J, Wang D, Sun FZ: Fine-scale mapping using Hardy-Weinberg disequilibrium. Ann Hum Genet. 2001, 65: 207-219. 10.1046/j.1469-1809.2001.6520207.x.
 Sasieni PD: From genotypes to genes: doubling the sample size. Biometrics. 1997, 53: 1253-1261. 10.2307/2533494.
 Devlin B, Roeder K: Genomic control for association studies. Biometrics. 1999, 55: 997-1004. 10.1111/j.0006-341X.1999.00997.x.
 Song K, Elston RC: A powerful method of combining measures of association and Hardy-Weinberg equilibrium for fine-mapping in case-control studies. Stat Med.
 Haseman JK, Elston RC: The investigation of linkage between a quantitative trait and a marker locus. Behav Genet. 1972, 2: 3-19. 10.1007/BF01066731.
 Shete S, Jacobs KB, Elston RC: Adding further power to the Haseman and Elston method for detecting linkage in larger sibships: weighting sums and differences. Hum Hered. 2003, 55: 79-85. 10.1159/000072312.
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.