# Power analysis for genome-wide association studies

- Robert J Klein

**8**:58

**DOI: **10.1186/1471-2156-8-58

© Klein; licensee BioMed Central Ltd. 2007

**Received: **22 November 2006

**Accepted: **28 August 2007

**Published: **28 August 2007

## Abstract

### Background

Genome-wide association studies are a promising new tool for deciphering the genetics of complex diseases. To choose the proper sample size and genotyping platform for such studies, power calculations that take into account genetic model, tag SNP selection, and the population of interest are required.

### Results

The power of genome-wide association studies can be computed using a set of tag SNPs and a large number of genotyped SNPs in a representative population, such as those available through the HapMap project. As expected, power increases with increasing sample size and effect size. Power also depends on the tag SNPs selected. In some cases, more power is obtained by genotyping more individuals at fewer SNPs than fewer individuals at more SNPs.

### Conclusion

Genome-wide association studies should be designed thoughtfully, with the choice of genotyping platform and sample size being determined from careful power calculations.

## Background

One goal of modern human genetics is to identify the genetic variants that predispose individuals to develop common, complex diseases. It has been proposed that population-based association studies will be more powerful than traditional family-based linkage methods in identifying such high-frequency, low-penetrance alleles [1]. Such studies require the genotypes of a large number of polymorphisms (usually single nucleotide polymorphisms [SNPs]) across the genome, each of which is tested for association with the phenotype of interest. As originally proposed, this would be a direct test of association, in which the functional mutation is presumed to be genotyped. An alternate approach to association studies takes advantage of the correlation between SNPs, called linkage disequilibrium (LD), that can occur due to the genealogical history of the polymorphisms [2]. In this approach, often called indirect association, one SNP is genotyped and used to infer indirectly the genotypes at other SNPs with which it is in high LD [3]. As one genotyped SNP, called a "tag" SNP, can be in LD with numerous other SNPs, far fewer SNPs (10^{5} – 10^{6}) would need to be genotyped to capture the common variation in the genome [3]. Recent advances in genotyping technology make such studies feasible [4, 5] and the first results of such studies are being published [6–10].

One key question in designing such studies is the choice of tag SNPs. Numerous methods for choosing the best set of tagging SNPs have been developed and compared [11]. One common measure evaluates the pairwise LD, measured by r^{2}, between the tag SNPs and all other SNPs [12]. The value r^{2} is the squared correlation coefficient between two SNPs. It is a useful measure because, if N individuals are needed for a specific power with a direct test of association, N/r^{2} individuals would be needed for an indirect test of association [2]. Sets of tag SNPs are usually compared by their "coverage," or fraction of variants in the genome that are in LD (r^{2} above some threshold) with at least one tag [12–15].
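The N/r^{2} relationship and the definition of coverage can be illustrated with a minimal sketch. The function names and the default threshold of 0.8 are illustrative assumptions, not part of the original analysis.

```python
import math


def indirect_sample_size(n_direct, r2):
    """Individuals needed for an indirect test, given N for a direct test.

    Applies the N / r^2 rule: lower LD with the tag means more individuals.
    """
    if not 0.0 < r2 <= 1.0:
        raise ValueError("r2 must be in (0, 1]")
    # Subtract a tiny tolerance so floating-point round-off in the
    # division cannot push an exact result up to the next integer.
    return math.ceil(n_direct / r2 - 1e-9)


def coverage(best_r2_values, threshold=0.8):
    """Fraction of SNPs whose best tag has r^2 at or above the threshold."""
    covered = sum(1 for r2 in best_r2_values if r2 >= threshold)
    return covered / len(best_r2_values)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(indirect_sample_size(1000, 0.5))
    print(coverage([0.9, 0.75, 1.0, 0.3]))
```

Note that `coverage` discards how far each value falls from the threshold, whereas `indirect_sample_size` changes smoothly with r^{2}.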

There are two related problems with this measure of coverage. First, the binary decision of whether r^{2} is above or below a threshold does not capture the continual decrease in power as r^{2} decreases. If the cutoff value of r^{2} is 0.8, a SNP that shows LD of r^{2} = 0.75 with a tag would be called undetectable since the measure of LD is below the threshold. In truth, association would be detectable, albeit with reduced power. Second, knowledge of the coverage of a set of tag SNPs says nothing about the number of individuals needed for a well-powered study. A better measure to evaluate tag SNPs would be an explicit calculation of the probability that a genome-wide association study will find a statistically significant association given that such an association exists (*i.e.*, power). To solve this problem, one needs to be able to calculate the power of a study given a specified genetic model and sample size. Skol *et al*. have proposed a method for computing power, though they were concerned with issues of study design rather than tag SNP choice [16]. Jorgenson and Witte, who noted the same problems, propose a "cumulative *r*^{2} adjusted power" that integrates LD and tag SNP information to provide the overall power of a study [17].

Realistically, one does not have an unlimited choice of SNPs but rather chooses among several competing commercial products with fixed sets of tag SNPs. Therefore, instead of choosing a set of tag SNPs, a more common problem now is how to evaluate which of several fixed sets of tag SNPs is better for a particular study. Several papers have looked at power for hypothetical and commercial sets of tag SNPs through empirical simulations on a subset of chromosomal regions [13, 18]. This approach suffers from both the speed problem of empirical simulations and the assumption that the sampled regions are representative of the genome as a whole. What is needed is an application of explicit power calculation methods (such as that of Jorgenson and Witte [17]) to the commercially available sets of tag SNPs to allow comparison among products and power calculation for real studies.

Here, I present a method for computing the power of a genome-wide association study when a genetic model and sample size are specified and LD information is available for the population being studied. This method is equivalent to the cumulative *r*^{2} adjusted power of Jorgenson and Witte [17], which will be referred to as "power" for brevity. I show that to obtain the best power, different commercial genotyping products should be used for different populations. I further find that power is sometimes improved by genotyping more individuals at fewer SNPs rather than fewer individuals at more SNPs. These calculations can guide the optimal design of future genome-wide association studies.

## Results and discussion

The power calculations require genotype data on a large representative sample of common SNPs from the population as well as a list of which of these representative SNPs are the tag SNPs (SNPs to be genotyped). Power is computed in three steps. First, the best tag SNP for each of the representative SNPs is found. Then, the power for detecting association for each of the representative SNPs, assuming that SNP directly influences the phenotype, is computed. For this computation, it is assumed that the study will be performed by testing for genotype frequency differences between cases and controls using a two-degree-of-freedom *χ*^{2} test in which multiple tests are corrected for using the Bonferroni correction. This test explicitly assumes a codominant model. I use this test because it is the most general, at the cost of reduced power relative to a model-specific test. While a multimarker tagging approach could be taken [13], this added level of complexity is not usually included in a first-pass analysis of genome-wide association data, and therefore including it in the power calculation would inflate the power one might expect in real-world application of genome-wide association studies. Finally, the average power over all the SNPs is taken to be the power of the study.

Assume there are $N$ SNPs present in a given population, each one represented as $S_i$. Let $C_i$ represent SNP $i$ being causative, and $D_i$ represent SNP $i$ being detected. Assume that one of these SNPs is the causative SNP, but it is unknown which one. Then the overall power of the study is given by $\sum_{i=1}^{N}\Pr(C_i, D_i)$. The power computed for a specific SNP $S_i$ is given as $P_i = \Pr(D_i \mid C_i)$. Thus, if each $P_i$ is multiplied by $\Pr(C_i)$, we get

$$\sum_{i=1}^{N} P_i \Pr(C_i) = \sum_{i=1}^{N} \Pr(D_i \mid C_i)\Pr(C_i) = \sum_{i=1}^{N} \Pr(C_i, D_i).$$

When each SNP is equally likely to be causative, $\Pr(C_i) = 1/N$, and this final equation is the same as taking the average power over all the SNPs.

Table 1: The number of SNPs present in each population and present in each commercial genotyping system

Population | CEU | JPT+CHB | YRI |
---|---|---|---|
SNPs in HapMap | 3868157 | 3890416 | 3796934 |
SNPs with MAF ≥ 0.05 (%) | 2230515 (58%) | 2046163 (53%) | 2477182 (65%) |
Common SNPs on Affy 100 K chip (%) | 91400 (79%) | 82995 (72%) | 91363 (79%) |
Common SNPs on Affy 500 K chip (%) | 378415 (77%) | 346887 (70%) | 409849 (83%) |
Common SNPs on Illumina 300 K chip (%) | 313265 (99%) | 251560 (79%) | 252678 (80%) |
Common SNPs on Illumina 550 K chip (%) | 506543 (91%) | 425631 (77%) | 441884 (80%) |

… r^{2} with the tag SNPs and therefore could slightly inflate the power estimation. As the fraction of SNPs with an r^{2} greater than the cutoff differs between the ENCODE and non-ENCODE regions by at most ten percentage points, and an average of three percentage points, this overestimation is not likely to be extreme.

I have presented a method to compute the power of a genome-wide association study in which a fixed set of tag SNPs will be genotyped. For the sake of simplicity, I only considered one straightforward single-SNP analysis scheme. While this approach has been used successfully [6], others have suggested that greater power can be obtained by looking at multiple tags or haplotypes [18, 20]. This method for computing power can be adapted to such strategies provided it is possible to compute the power of detecting each SNP in the population given the set of tagging SNPs. I also assume that each SNP is equally likely to be functional. If we knew *a priori* the probability that a given SNP is functional, we could use this to weight the average power over all the SNPs. Such a weighting scheme would prioritize SNPs more likely to be of interest because of either functional considerations or location [21]. For instance, assume we assigned each SNP a probability of being the causative SNP based on external evidence such as a prior linkage study. If these probabilities are normalized to sum to one, they can be used to compute a weighted average power in this approach.
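The weighted average described above can be sketched as follows. The function name and the example priors are hypothetical; when no priors are supplied, the sketch falls back to the equal-weight average used in this paper.

```python
def weighted_power(powers, priors=None):
    """Average per-SNP power, optionally weighted by prior probabilities.

    powers: per-SNP power values, Pr(detected | causative).
    priors: non-negative weights for each SNP being causative; they are
            normalized to sum to one. None means equal weights.
    """
    if priors is None:
        priors = [1.0] * len(powers)
    if len(priors) != len(powers):
        raise ValueError("powers and priors must have the same length")
    total = sum(priors)
    if total <= 0:
        raise ValueError("priors must have a positive sum")
    return sum(p * w for p, w in zip(powers, priors)) / total


if __name__ == "__main__":
    # Hypothetical powers for two SNPs; the second lies under a linkage peak
    # and is given three times the prior weight of the first.
    print(weighted_power([0.2, 0.8]))
    print(weighted_power([0.2, 0.8], priors=[1, 3]))
```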

## Conclusion

Proper design of a genome-wide association study requires careful calculation of the power. These calculations will be invaluable to anyone who is planning a genome-wide association study. Using these calculations, the proper sample size to get adequate power in a given study can be computed. Furthermore, the performance of different genotyping platforms can be compared, allowing an investigator to choose whatever is best for his or her study. By performing such calculations, genome-wide association studies can be optimized to get the maximal power possible for a given set of resources.

## Methods

### Genotype data and populations

I used genotype data from release 21 (phase II) of the International HapMap project [19]. I used data from all four populations studied in the HapMap project. These populations are defined by the HapMap project as follows: Yoruba in Ibadan, Nigeria (abbreviation: YRI); Japanese in Tokyo, Japan (abbreviation: JPT); Han Chinese in Beijing, China (abbreviation: CHB); and CEPH (Utah residents with ancestry from northern and western Europe) (abbreviation: CEU). Similar to the analysis performed by the HapMap project, I combined genotypes from the JPT and CHB populations to make a joint JPT+CHB population. For all three resulting populations, I removed SNPs that have a minor allele frequency (MAF) less than 0.05 in that population. The remaining SNPs are considered to be "common." A summary of the number of SNPs remaining for each population is found in Table 1. When phased data were needed, I used the phased chromosomes from release 21.
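A minimal sketch of the MAF filter follows, assuming genotype counts per SNP are already tabulated. The dictionary layout, function names, and SNP identifiers are hypothetical and do not reflect the HapMap file format.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF from counts of the three genotypes (AA, AB, BB) at one SNP."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)


def common_snps(genotype_counts, maf_cutoff=0.05):
    """Keep SNPs whose MAF in this population is at least the cutoff."""
    return [
        snp
        for snp, counts in genotype_counts.items()
        if minor_allele_frequency(*counts) >= maf_cutoff
    ]


if __name__ == "__main__":
    # Hypothetical SNP identifiers and counts, for illustration only.
    counts = {"snp_a": (81, 18, 1), "snp_b": (98, 2, 0)}
    print(common_snps(counts))
```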

### Calculation of power

To compute the overall power of an association study, I use three steps. First, I find the best tag SNP for each genotyped SNP in the data set. Then, I compute the power for each SNP assuming the specified GRR and sample size. Finally, I take an average power over all the SNPs to get the overall power.

To find the best tag SNP for each genotyped SNP, I look at the linkage disequilibrium between each SNP and all tag SNPs within 300 kb of it. For each pair of SNPs, I infer the two-locus haplotype frequencies between them using expectation maximization and compute r^{2} between the two SNPs from the inferred haplotype frequencies [12]. The best tag is then taken to be the tag SNP with the highest value of r^{2}.
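The sketch below shows one way to carry out this step for a single pair of SNPs, assuming unphased genotype counts in a 3 × 3 table. The function names, the fixed iteration count, and the uniform starting frequencies are assumptions of the sketch and are not taken from the original C program.

```python
def em_haplotype_freqs(counts, iterations=100):
    """Estimate two-locus haplotype frequencies by expectation maximization.

    counts[a][b] is the number of individuals with a copies of allele 1 at
    the first SNP and b copies at the second (a, b in 0..2).
    Returns (f00, f01, f10, f11), indexed by allele at SNP 1 then SNP 2.
    """
    n = sum(sum(row) for row in counts)
    # Haplotype counts that are unambiguous given the genotypes.
    known = [
        2 * counts[0][0] + counts[0][1] + counts[1][0],  # haplotype 00
        2 * counts[0][2] + counts[0][1] + counts[1][2],  # haplotype 01
        2 * counts[2][0] + counts[1][0] + counts[2][1],  # haplotype 10
        2 * counts[2][2] + counts[1][2] + counts[2][1],  # haplotype 11
    ]
    double_hets = counts[1][1]  # phase unknown: 00/11 or 01/10
    f = [0.25, 0.25, 0.25, 0.25]
    for _ in range(iterations):
        # E-step: probability that a double heterozygote carries 00 and 11.
        cis, trans = f[0] * f[3], f[1] * f[2]
        p_cis = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: re-estimate frequencies from expected haplotype counts.
        expected = [
            known[0] + double_hets * p_cis,
            known[1] + double_hets * (1 - p_cis),
            known[2] + double_hets * (1 - p_cis),
            known[3] + double_hets * p_cis,
        ]
        f = [c / (2 * n) for c in expected]
    return tuple(f)


def r_squared(f00, f01, f10, f11):
    """Pairwise r^2 from two-locus haplotype frequencies."""
    p_a = f10 + f11  # frequency of allele 1 at the first SNP
    p_b = f01 + f11  # frequency of allele 1 at the second SNP
    d = f11 - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else 0.0


def best_tag(snp_pos, tag_r2_by_pos, window=300_000):
    """Return (position, r2) of the highest-r^2 tag within the window."""
    nearby = [(pos, r2) for pos, r2 in tag_r2_by_pos.items()
              if abs(pos - snp_pos) <= window]
    return max(nearby, key=lambda item: item[1]) if nearby else None


if __name__ == "__main__":
    table = [[25, 0, 0], [0, 50, 0], [0, 0, 25]]  # hypothetical counts
    print(r_squared(*em_haplotype_freqs(table)))
```

Only double heterozygotes have ambiguous phase, so the E-step reduces to a single probability per iteration.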

As described above, association is tested with a two-degree-of-freedom $\chi^{2}$ test. The power of this test is computed using a non-central $\chi^{2}$ distribution with non-centrality parameter $\lambda$. Equations for $\lambda$ have been previously derived for a general $\chi^{2}$ test [22] and for application to genetic association [23]. Specifically, for genotypic association $\lambda$ is given by:

$$\lambda = N_A N_U \sum_{j=0}^{2} \frac{(p_{0j} - p_{1j})^{2}}{N_A\, p_{0j} + N_U\, p_{1j}}$$

where $N_A$ and $N_U$ are the number of case (affected) and control (unaffected) individuals, respectively; $p_{00}$, $p_{01}$, and $p_{02}$ are the genotype frequencies in the cases; and $p_{10}$, $p_{11}$, and $p_{12}$ are the genotype frequencies in the controls. If, instead of a 3 × 2 table, we use a 2 × 2 table for a one-degree-of-freedom test of allelic association, the non-centrality parameter is given by

$$\lambda = 2 N_A N_U \left[ \frac{(p_A - p_U)^{2}}{N_A\, p_A + N_U\, p_U} + \frac{(p_A - p_U)^{2}}{N_A (1 - p_A) + N_U (1 - p_U)} \right]$$

where $p_A$ and $p_U$ are the frequencies of allele 0 in the cases and controls, respectively.

I use the Bonferroni correction for multiple testing and require a *p*-value below 0.05/M (where M is the number of tag SNPs genotyped) for statistical significance. When association is directly tested (the SNP is a tag SNP), I use the actual number of cases and controls to compute the power. For indirect association (the SNP is in LD with a tag SNP), I reduce the number of cases and controls by a factor of r^{2} for the power computation [2].
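The per-SNP power computation can be sketched with the Python standard library alone. Two assumptions are made here: the non-centrality parameter is taken to be the standard form for a test of homogeneity in a 2 × 3 table, λ = N_A N_U Σ_j (p_{0j} − p_{1j})² / (N_A p_{0j} + N_U p_{1j}), and the non-central χ² tail is evaluated as a Poisson mixture of central χ² tails, which has a closed form for even degrees of freedom. All function names are illustrative.

```python
import math


def noncentrality_genotypic(n_cases, n_controls, case_freqs, control_freqs):
    """Non-centrality parameter for the 2-df genotypic test (assumed form)."""
    total = 0.0
    for p_case, p_control in zip(case_freqs, control_freqs):
        denom = n_cases * p_case + n_controls * p_control
        if denom > 0:
            total += (p_case - p_control) ** 2 / denom
    return n_cases * n_controls * total


def power_2df(lam, alpha, max_terms=100_000):
    """Power of a 2-df chi-square test at level alpha, non-centrality lam."""
    half_x = -math.log(alpha)  # half the critical value: exp(-x/2) = alpha
    half_lam = lam / 2.0
    log_half_lam = math.log(half_lam) if half_lam > 0 else 0.0
    tail_term = alpha          # exp(-x/2) * (x/2)**k / k! at k = 0
    tail = tail_term           # central chi-square upper tail, df = 2
    power = 0.0
    for j in range(max_terms):
        # Poisson(lam / 2) weight for the component with df = 2 + 2j.
        if half_lam > 0:
            weight = math.exp(-half_lam + j * log_half_lam - math.lgamma(j + 1))
        else:
            weight = 1.0 if j == 0 else 0.0
        power += weight * tail
        if j > half_lam and weight < 1e-15:
            break  # past the Poisson mode; remaining weights are negligible
        tail_term *= half_x / (j + 1)
        tail += tail_term      # tail now corresponds to df = 2 + 2(j + 1)
    return min(power, 1.0)


def snp_power(n_cases, n_controls, case_freqs, control_freqs,
              n_tags, r2=1.0, alpha=0.05):
    """Power for one SNP: Bonferroni threshold, sample size scaled by r^2."""
    lam = noncentrality_genotypic(n_cases * r2, n_controls * r2,
                                  case_freqs, control_freqs)
    return power_2df(lam, alpha / n_tags)


if __name__ == "__main__":
    # Hypothetical genotype frequencies and chip size, for illustration only.
    cases, controls = (0.5, 0.4, 0.1), (0.6, 0.3, 0.1)
    print(snp_power(1000, 1000, cases, controls, n_tags=500_000))
    print(snp_power(1000, 1000, cases, controls, n_tags=500_000, r2=0.5))
```

Because λ is linear in the sample size when cases and controls are scaled together, reducing both by a factor of r^{2} multiplies λ by r^{2}.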

I assume that the disease has a low enough prevalence in the population that the risk allele frequency in those without the disease approximates the risk allele frequency in the population. I can set the disease to follow a multiplicative, additive, dominant, or recessive model with a specified genotype relative risk (GRR) for the SNP of interest [1]. Given that genotype 0 is the wildtype, and taking *p*_{10}, *p*_{11}, and *p*_{12} from the observed genotype frequencies in the population, *p*_{00}, *p*_{01}, and *p*_{02} are computed as follows:
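Under the rare-disease assumption, the case frequency of each genotype is proportional to its control frequency multiplied by its risk relative to genotype 0. A minimal sketch of that computation follows. The mapping from a single GRR to per-genotype relative risks is an assumption of this sketch (in particular, the additive model is taken as 1, GRR, 2·GRR − 1, which is one common convention), and the function names are hypothetical.

```python
def relative_risks(model, grr):
    """Relative risks of genotypes 0, 1, 2 versus genotype 0.

    The parameterization of each model is an illustrative assumption.
    """
    if model == "multiplicative":
        return (1.0, grr, grr * grr)
    if model == "additive":
        return (1.0, grr, 2.0 * grr - 1.0)
    if model == "dominant":
        return (1.0, grr, grr)
    if model == "recessive":
        return (1.0, 1.0, grr)
    raise ValueError(f"unknown model: {model}")


def case_genotype_freqs(control_freqs, risks):
    """Case genotype frequencies from control frequencies and relative risks.

    Each control frequency is weighted by its relative risk, and the
    weights are renormalized to sum to one.
    """
    weighted = [p * r for p, r in zip(control_freqs, risks)]
    total = sum(weighted)
    return tuple(w / total for w in weighted)


if __name__ == "__main__":
    # Hypothetical control frequencies (p10, p11, p12) and GRR.
    controls = (0.25, 0.5, 0.25)
    print(case_genotype_freqs(controls, relative_risks("multiplicative", 2.0)))
```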

Assume that, of the $S$ SNPs under consideration (for which we have linkage disequilibrium [LD] data from, for instance, the HapMap project), $M$ are tags that will be genotyped on the chip and $S - M$ are not tags. Further assume that there are $T$ common SNPs in total in this population, which includes both those $S$ SNPs for which we have LD data and SNPs for which we do not know their LD with surrounding SNPs. Let $1 - \beta_i$ be the power for SNP $i$, where $i$ ranges from 1 to $S$ and SNP $i$ is a tag SNP when $i \le M$ and a non-tag otherwise. Then, the overall power is given by:

$$\frac{1}{T}\left[\sum_{i=1}^{M}(1-\beta_i) + \frac{T-M}{S-M}\sum_{i=M+1}^{S}(1-\beta_i)\right]$$

In this manner, the tag SNPs are only considered representative of themselves, while the non-tag SNPs for which we have LD data are considered representative of all common non-tag SNPs. For these calculations, I use *T* = 2 × 10^{7}.
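This weighting can be sketched as below. The sketch assumes that each tag SNP counts once and that the mean power of the non-tag SNPs with LD data is applied to all T − M common non-tag SNPs. The function name and the example values are hypothetical.

```python
def overall_power(tag_powers, nontag_powers, total_common_snps):
    """Overall power when LD data cover only part of the common SNPs.

    tag_powers:        power (1 - beta_i) for each of the M tag SNPs.
    nontag_powers:     power for each of the S - M non-tag SNPs with LD data.
    total_common_snps: T, the assumed number of common SNPs in the population.
    """
    m = len(tag_powers)
    s = m + len(nontag_powers)
    t = total_common_snps
    if t < s:
        raise ValueError("T cannot be smaller than the number of SNPs with data")
    tag_part = sum(tag_powers)
    if s > m:
        # Each observed non-tag SNP stands in for (T - M) / (S - M) SNPs.
        nontag_part = (t - m) / (s - m) * sum(nontag_powers)
    else:
        nontag_part = 0.0
    return (tag_part + nontag_part) / t


if __name__ == "__main__":
    # Hypothetical toy values: two tags, two non-tags, ten common SNPs.
    print(overall_power([1.0, 1.0], [0.5, 0.5], 10))
```

When T equals S, the result reduces to the simple average power over all SNPs with LD data.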

### Implementation

A computer program to implement these calculations was written in C. The source code is available upon request from the author.

## Declarations

### Acknowledgements

I am grateful to Jurg Ott, in whose lab the bulk of this work was performed; Joe Garsetti from Illumina for help in obtaining the list of SNPs on the Illumina chips; and Sara Hamon for critical comments on the manuscript. This work was performed while RJK was a postdoctoral fellow funded by F32HG003681 from NIH.

## References

1. Risch N, Merikangas K: The future of genetic studies of complex human diseases. Science. 1996, 273: 1516-1517. 10.1126/science.273.5281.1516.
2. Pritchard JK, Przeworski M: Linkage disequilibrium in humans: models and data. Am J Hum Genet. 2001, 69: 1-14. 10.1086/321275.
3. Hirschhorn JN, Daly MJ: Genome-wide association studies for common diseases and complex traits. Nat Rev Genet. 2005, 6 (2): 95-108. 10.1038/nrg1521.
4. Matsuzaki H, Dong S, Loi H, Di X, Liu G, Hubbell E, Law J, Berntsen T, Chadha M, Hui H, Yang G, Kennedy GC, Webster TA, Cawley S, Walsh PS, Jones KW, Fodor SPA, Mei R: Genotyping over 100,000 SNPs on a pair of oligonucleotide arrays. Nat Methods. 2004, 1: 109-111. 10.1038/nmeth718.
5. Gunderson KL, Steemers FJ, Lee G, Mendoza LG, Chee MS: A genome-wide scalable SNP genotyping assay using microarray technology. Nat Genet. 2005, 37 (5): 549-554. 10.1038/ng1547.
6. Klein RJ, Zeiss C, Chew EY, Tsai JY, Sackler RS, Haynes C, Henning AK, Sangiovanni JP, Mane SM, Mayne ST, Bracken MB, Ferris FL, Ott J, Barnstable C, Hoh J: Complement factor H polymorphism in age-related macular degeneration. Science. 2005, 308 (5720): 385-389. 10.1126/science.1109557.
7. Herbert A, Gerry NP, McQueen MB, Heid IM, Pfeufer A, Illig T, Wichmann HE, Meitinger T, Hunter D, Hu FB, Colditz G, Hinney A, Hebebrand J, Koberwitz K, Zhu X, Cooper R, Ardlie K, Lyon H, Hirschhorn JN, Laird NM, Lenburg ME, Lange C, Christman MF: A common genetic variant is associated with adult and childhood obesity. Science. 2006, 312 (5771): 279-283. 10.1126/science.1124779.
8. Arking DE, Pfeufer A, Post W, Kao WH, Newton-Cheh C, Ikeda M, West K, Kashuk C, Akyol M, Perz S, Jalilzadeh S, Illig T, Gieger C, Guo CY, Larson MG, Wichmann HE, Marban E, O'Donnell CJ, Hirschhorn JN, Kaab S, Spooner PM, Meitinger T, Chakravarti A: A common genetic variant in the NOS1 regulator NOS1AP modulates cardiac repolarization. Nat Genet. 2006.
9. Maraganore DM, de Andrade M, Lesnick TG, Strain KJ, Farrer MJ, Rocca WA, Pant PV, Frazer KA, Cox DR, Ballinger DG: High-resolution whole-genome association study of Parkinson disease. Am J Hum Genet. 2005, 77 (5): 685-693. 10.1086/496902.
10. Duerr RH, Taylor KD, Brant SR, Rioux JD, Silverberg MS, Daly MJ, Steinhart AH, Abraham C, Regueiro M, Griffiths A, Dassopoulos T, Bitton A, Yang H, Targan S, Datta LW, Kistner EO, Schumm LP, Lee A, Gregersen PK, Barmada MM, Rotter JI, Nicolae DL, Cho JH: A Genome-Wide Association Study Identifies IL23R as an Inflammatory Bowel Disease Gene. Science. 2006, 314 (5804): 1461-1463.
11. Ke X, Miretti MM, Broxholme J, Hunt S, Beck S, Bentley DR, Deloukas P, Cardon LR: A comparison of tagging methods and their tagging space. Hum Mol Genet. 2005, 14 (18): 2757-2767. 10.1093/hmg/ddi309.
12. Carlson CS, Eberle MA, Rieder MJ, Yi Q, Kruglyak L, Nickerson DA: Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am J Hum Genet. 2004, 74 (1): 106-120. 10.1086/381000.
13. Pe'er I, de Bakker PI, Maller J, Yelensky R, Altshuler D, Daly MJ: Evaluating and improving power in whole-genome association studies using fixed marker sets. Nat Genet. 2006, 38 (6): 663-667. 10.1038/ng1816.
14. Hinds DA, Stuve LL, Nilsen GB, Halperin E, Eskin E, Ballinger DG, Frazer KA, Cox DR: Whole-genome patterns of common DNA variation in three human populations. Science. 2005, 307 (5712): 1072-1079. 10.1126/science.1105436.
15. Barrett JC, Cardon LR: Evaluating coverage of genome-wide association studies. Nat Genet. 2006, 38 (6): 659-662. 10.1038/ng1801.
16. Skol AD, Scott LJ, Abecasis GR, Boehnke M: Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies. Nat Genet. 2006, 38 (2): 209-213. 10.1038/ng1706.
17. Jorgenson E, Witte JS: Coverage and Power in Genomewide Association Studies. Am J Hum Genet. 2006, 78.
18. de Bakker PI, Yelensky R, Pe'er I, Gabriel SB, Daly MJ, Altshuler D: Efficiency and power in genetic association studies. Nat Genet. 2005, 37 (11): 1217-1223. 10.1038/ng1669.
19. The International HapMap Consortium: A haplotype map of the human genome. Nature. 2005, 437 (7063): 1299-1320. 10.1038/nature04226.
20. Lin S, Chakravarti A, Cutler DJ: Exhaustive allelic transmission disequilibrium tests as a new approach to genome-wide association studies. Nat Genet. 2004, 36 (11): 1181-1188. 10.1038/ng1457.
21. Roeder K, Bacanu SA, Wasserman L, Devlin B: Using linkage genome scans to improve power of association in genome scans. Am J Hum Genet. 2006, 78 (2): 243-252. 10.1086/500026.
22. Mitra SK: On the limiting power function of the frequency chi-square test. Ann Math Stat. 1958, 29: 1221-1233.
23. Gordon D, Finch SJ, Nothnagel M, Ott J: Power and sample size calculations for case-control genetic association tests when errors are present: application to single nucleotide polymorphisms. Hum Hered. 2002, 54 (1): 22-33. 10.1159/000066696.

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.