Correcting for type I error is important in candidate gene and genome-wide SNP studies. In contrast to the proposed use of principal components based on pair-wise LD to correct for the number of effectively independent tests, we suggest using an LD block-based correction, based on the LD block structure empirically detected in the data. We showed the expected inflation of type I error rates when only nominal p-values are used, and the extremely conservative over-correction induced by the traditional Bonferroni method. In general, our results show that the LD block-based corrections prevent type I error inflation without being overly conservative, presenting a compromise between the other approaches. Specifically, the Gabriel blocking algorithm consistently gave a ~3.4% type I error rate across moderate and high LD conditions, which is close to the desired 5% level. Under moderate LD conditions both the 4GT and SSLD blocking methods gave slightly conservative type I error rates, whereas under high LD conditions these methods were slightly liberal. The Nyholt method, as employed in this paper, was equivalent to a traditional Bonferroni correction under moderate LD conditions, although under high LD conditions it gave a type I error rate similar to that of the Gabriel method.
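As a minimal sketch of how a block-based correction can be computed, the example below assumes that the effective number of tests is counted as the number of detected LD blocks plus the number of SNPs falling outside any block, and that the nominal level is then divided by this count. That counting rule, the function names, and the toy block assignment are illustrative assumptions, not the exact procedure or data used in this study.

```python
def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test significance threshold for n_tests tests at family-wise level alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def effective_tests_from_blocks(block_ids):
    """Count effectively independent tests from a per-SNP block assignment.

    block_ids holds one label per SNP; None marks a SNP outside every block.
    Assumption: each block counts as one test and each unblocked SNP as one test.
    """
    n_blocks = len({b for b in block_ids if b is not None})
    n_singletons = sum(1 for b in block_ids if b is None)
    return n_blocks + n_singletons


# Hypothetical assignment for 10 SNPs: three blocks (A, B, C) and two unblocked SNPs.
block_ids = ["A", "A", "A", None, "B", "B", None, "C", "C", "C"]

m_eff = effective_tests_from_blocks(block_ids)      # 3 blocks + 2 singletons
alpha_naive = bonferroni_alpha(len(block_ids))      # one test per SNP
alpha_block = bonferroni_alpha(m_eff)               # one test per block or singleton

print(f"SNPs: {len(block_ids)}, effective tests: {m_eff}")
print(f"Bonferroni threshold: {alpha_naive:.4f}, block-based threshold: {alpha_block:.4f}")
```

In this toy case the block-based threshold is less stringent than the per-SNP Bonferroni threshold because correlated SNPs within a block are treated as a single test.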
Like Schwartz et al., we found large differences in the definitions of haplotype blocks between blocking methods, with low levels of agreement about the number of independent SNPs among the 3 haplotype blocking methods. However, the range of these differences did not have a large effect on the type I error rates.
We believe the advantage of using the blocking algorithms instead of the Nyholt method is that the blocking methods are biologically meaningful and achieve type I error rates closer to the desired value over a range of LD levels.
In light of these results, several questions remain. First, recent variations on the Nyholt PC method have been proposed that may improve its performance for correction, and this improvement should be evaluated in comparison to the blocking algorithms. These extensions to the PC method allow for a lower LD threshold in determining the number of independent tests. Second, the thresholds for each of the blocking algorithms were set to default values. Varying these thresholds may result in a type I error rate closer to the desired value. Third, all methods examined here relied on D' as the LD metric of interest. The use of r² instead may improve all blocking methods, and this should be explored further. Finally, higher-order LD structure was still not considered in the choice of the number of effectively independent tests. A correction that allows for both within-block and across-block correlation should further improve the proposed correction.
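To illustrate why the choice between D' and r² could matter for block definition, the sketch below computes both measures for a pair of biallelic SNPs using the standard textbook definitions. The function name and the example frequencies are hypothetical and are not taken from the data analyzed here.

```python
def ld_measures(p_a, p_b, p_ab):
    """Return (D', r^2) for two biallelic loci.

    p_a, p_b: frequencies of allele A at locus 1 and allele B at locus 2.
    p_ab: frequency of the A-B haplotype.
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r_squared


# Hypothetical pair: a common SNP (0.5) and a rarer SNP (0.1) whose B allele
# always occurs on the A background.
d_prime, r_squared = ld_measures(p_a=0.5, p_b=0.1, p_ab=0.1)
print(f"D' = {d_prime:.3f}, r^2 = {r_squared:.3f}")
```

In this example D' reaches its maximum while r² remains low, because r² also depends on how closely the allele frequencies match. A block definition based on r² would therefore tend to group fewer such SNP pairs than one based on D'.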