Multiethnic genome-wide association study identifies ethnic-specific associations with body mass index in Hispanics and African Americans

Background: Genome-wide association studies of obesity have typically assumed fixed genetic effects across ethnicities, rarely attempting to thoroughly compare and contrast findings across various ethnic groups. Therefore, our study aimed to identify novel genetic associations with body mass index (BMI), a common measure of obesity, and explore their cross-ethnic generalizability in a multiethnic population. To that end, we conducted ethnic-specific genome-wide association analyses among 1235 Hispanic, 706 Asian, 1549 African American, and 2395 European American subjects from the Multi-ethnic Study of Atherosclerosis (MESA). We compared findings across ethnicities and investigated single-nucleotide polymorphisms (SNPs) with suggestive BMI-association p-values among 3379 Hispanic and 6871 African American subjects from the Women’s Health Initiative (WHI).
Results: We identified a genome-wide significant association in MESA Hispanics, rs12253976 in KLF6 (beta = 5.792 kg/m² per-allele, 95 % confidence interval (CI): 3.885, 7.698; p = 3.43 × 10⁻⁹), and suggestive SNPs with p < 5 × 10⁻⁶ in MESA Hispanics, European Americans and African Americans that display ethnic-specific effects on BMI. Of these suggestive SNPs, Hispanic SNP rs12255372 and African American SNP rs6435678 had the most evidence of replication in WHI. rs12255372 (in TCF7L2) was associated with lower BMI in both MESA (beta = −1.111 kg/m², 95 % CI: −1.578, −0.645; p = 3.33 × 10⁻⁶) and WHI Hispanics (beta = −0.304 kg/m², 95 % CI: −0.613, 0.006; p = 0.054). This TCF7L2 intronic region contains several SNPs (rs7901695, rs4506565, rs4132670, and rs12243326) with low p-values (p < 10⁻³) in MESA and betas of similar magnitude and direction in MESA and WHI, but only rs12243326 is in strong linkage disequilibrium with rs12255372 in our Hispanic populations, suggesting independent signals in this region. rs6435678 (in ERBB4) was associated with greater BMI in both MESA (beta = 1.104 kg/m², 95 % CI: 0.643, 1.564; p = 2.85 × 10⁻⁶) and WHI African Americans (beta = 0.219 kg/m², 95 % CI: −0.021, 0.460; p = 0.074).
Conclusions: Two BMI-association signals are present in the TCF7L2 intronic region of Hispanics, one of which is tagged by rs12255372. ERBB4 rs6435678 is a novel BMI-association signal in African Americans. Overall, our data suggest that ethnic-specific associations are involved in the genetic determination of BMI. Ethnic-specificity has potential implications for the development of gene-based therapies for obesity.
Electronic supplementary material: The online version of this article (doi:10.1186/s12863-016-0387-0) contains supplementary material, which is available to authorized users.


Background
Obesity is one of the most pressing health problems in the United States (U.S.). It affects nearly 35 % of adults and 17 % of children [1], predisposing them to many chronic conditions including type 2 diabetes (T2D), cardiovascular disease (CVD), and several cancers [2]. The cost of treating obesity-related conditions places a great financial burden on the healthcare system [3]. Consequently, understanding the etiology of obesity and developing interventions to prevent its comorbidities are critical public health concerns.
The etiology of obesity is multifactorial [4], but family studies suggest that 40-70 % of the variation in body mass index (BMI), a common measure of obesity, is explained by genetic factors [5][6][7]. Genome-wide association studies (GWAS), which have identified over 100 loci associated with BMI and other obesity-related traits, have greatly expanded our understanding of the genetic basis of obesity [8]. However, further investigation is warranted for several reasons. First, known BMI loci account for only a fraction of the estimated variation in BMI [9]. Second, previous GWAS have primarily relied on data from subjects of European ancestry [10]. Hispanics and African Americans are underrepresented in GWAS, and it is precisely these populations that are overburdened by obesity in the U.S. [11]. Third, obesity GWAS have either analyzed a single ethnic group in isolation or pooled multiethnic data in cross-ethnic meta-analyses, assuming that genetic effects are fixed across ethnic groups, and have rarely attempted to thoroughly compare and contrast findings across ethnic groups.
As noted in [12], to more fully gauge the clinical and public health implications of genetic associations with BMI, studies should not only focus on the replication of genetic loci identified in European populations; they should also evaluate the cross-ethnic generalizability of genetic associations in multiethnic populations. Making unbiased cross-ethnic comparisons of genetic effects is facilitated by the availability of data from multiple ethnic groups sampled in the same fashion from the same underlying source population. Conducting ethnic-specific GWAS within such multiethnic populations could also reveal loci not readily detectable in Europeans due to cross-ethnic differences in allele frequencies and haplotype structures [13].
For these reasons, we used an ethnic-specific GWAS approach to examine genetic associations with BMI in the Multi-ethnic Study of Atherosclerosis (MESA), which includes subjects of four ethnicities: Hispanic, Asian, African American, and European American. We identified the top BMI-associated single-nucleotide polymorphisms (SNPs) in each ethnicity and evaluated whether those SNPs were associated with BMI to a similar extent in the other ethnicities. We then sought to replicate the top SNPs in Hispanics and African Americans in an independent cohort consisting of multiethnic subjects from the Women's Health Initiative (WHI).

Results
Discovery sample characteristics and model covariates
Descriptive statistics for MESA are shown in Table 1. Unadjusted associations between participant characteristics and BMI are summarized in Additional file 1: Table S1. After model building, ethnic-specific covariates were: age, sex, smoking, diabetes, and arthritis in Hispanics; age, sex, education, diabetes, and arthritis in Asians and African Americans; and age, sex, income, education, smoking, physical activity, diabetes and arthritis in European Americans.

Population stratification
Additional file 2: Figures S1-S2 show quantile-quantile plots of observed vs. expected p-values in the discovery and replication datasets, before and after adjustment for population stratification. Before adjustment, there was evidence of genomic inflation in MESA Hispanics (λ = 1.019), WHI Hispanics (λ = 1.158), and WHI African Americans (λ = 1.662). After systematic adjustment for the first two ethnic-specific principal components (PCs) in linear models, these λ estimates improved (λ = 1.000 for MESA Hispanics, λ = 1.035 for WHI Hispanics, and λ = 1.034 for WHI African Americans), and all λ values were below our predetermined threshold of 1.05. Adjustment for additional PCs did not materially alter these estimates. We note that no evidence of genomic inflation was observed in MESA Asians, European Americans, and African Americans. Systematic adjustment for ethnic-specific PC1 and PC2 did not greatly influence the magnitude of the observed p-values in these populations.
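To make the λ values above concrete, the following minimal Python sketch computes a genomic inflation factor from a list of association p-values. It assumes the common median-based definition (the median observed 1-df chi-square statistic divided by its expected median under the null, about 0.455); the function name `genomic_inflation` is hypothetical, and this is not the PLINK routine used in the study.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Return lambda: median observed 1-df chi-square over its null median.

    Assumes every p-value lies strictly between 0 and 1.
    """
    norm = NormalDist()
    # Convert each two-sided p-value back to a 1-df chi-square statistic.
    chi2 = [norm.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Evenly spaced p-values mimic a well-calibrated test (lambda close to 1).
uniform_p = [(i + 0.5) / 1001 for i in range(1001)]
lambda_null = genomic_inflation(uniform_p)

# Systematically smaller p-values mimic inflation (lambda above 1).
lambda_inflated = genomic_inflation([p / 2.0 for p in uniform_p])
```

Under this definition, a value near 1.00 indicates that the bulk of the test statistics follows the null distribution, which is why λ is compared against a threshold such as 1.05.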

BMI-associated regions
Following SNP quality control (QC) in MESA, 853,278 SNPs in Hispanics, 683,998 in Asians, 871,948 in African Americans, and 749,659 in European Americans were analyzed (Additional file 3: Table S2). The top SNPs (p < 5 × 10⁻⁶) in the MESA ethnic groups are displayed in Table 2.
Regional plots visualizing association results for rs12253976, rs7763896, and rs6866721 and their respective flanking region (±500 kb) SNPs are shown in Additional file 5: Figures S3-S5. The chromosome 5 region in European American subjects contains SNPs with low p-values (p < 10⁻⁵) that are in strong linkage disequilibrium (LD) (r² > 0.8) with rs6866721; these include rs1672492, rs1672491, and rs7704264 (Table 2). In contrast, the plots for rs7763896 in African Americans and rs12253976 in Hispanics did not show evidence of association for SNPs in their respective flanking regions.

Ethnic-specificity
The associations between the top SNPs and BMI were generally ethnic-specific (Table 2). Two exceptions were Hispanic SNP rs12255372 and European American SNP rs7926805, whose associations with lower BMI achieved nominal significance (p < 0.05) in African Americans. Nevertheless, the I² statistics (from concurrent cross-ethnic meta-analysis in MESA) for all top SNPs were > 50 %, indicating substantial cross-ethnic heterogeneity. Moreover, regional association plots showed that no variants in the vicinity of those SNPs were significantly associated with BMI in the other ethnicities.

Replication analyses
Additional file 6: Tables S4a-b describe the WHI samples, and Additional file 7: Tables S5a-b show unadjusted associations between the examined subject characteristics and BMI.
The results of replication analyses in WHI Hispanics and African Americans are shown in Table 3. Three    [8]. Therefore, we examined these loci more thoroughly.

TCF7L2
rs12255372 is the Hispanic SNP with the most suggestive evidence of replication in WHI (p = 3.33 × 10⁻⁶ in MESA, and p = 0.037 and p = 0.054 respectively in the WHI age-adjusted and fully-adjusted models; Table 4). Minor allele frequencies (MAFs) for rs12255372 were similar across both Hispanic populations (0.244 in MESA and 0.249 in WHI; Table 3 and Additional file 8: Table S6). Figure 1 is a regional plot visualizing association results for rs12255372 and its flanking region (±500 kb) markers in both Hispanic populations. This plot shows four other SNPs (rs7901695, rs4506565, rs4132670, and rs12243326) with low p-values (p < 10⁻³) in MESA. These SNPs approached or achieved nominal significance in WHI and had betas of similar magnitude and direction in both Hispanic populations (Table 5). However, only rs12243326 is in strong LD with rs12255372 in our Hispanic populations (r² = 0.89 in both MESA and WHI). The other three (rs7901695, rs4506565, and rs4132670) have r² values of 0.37-0.54 in MESA and 0.54-0.63 in WHI Hispanics.
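The r² values above quantify LD between pairs of SNPs. As a rough sketch, r² can be approximated by the squared Pearson correlation between per-subject minor-allele counts (0, 1, or 2) at two SNPs. This genotype-based approximation, the function name `genotype_r2`, and the toy genotypes below are assumptions for illustration; they are not the study's pipeline or data.

```python
def genotype_r2(g1, g2):
    """Squared Pearson correlation between two vectors of allele counts."""
    if len(g1) != len(g2) or not g1:
        raise ValueError("genotype vectors must be non-empty and equal length")
    n = len(g1)
    mean1 = sum(g1) / n
    mean2 = sum(g2) / n
    # Sums of cross-products and squares; the 1/n factors cancel in the ratio.
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(g1, g2))
    var1 = sum((a - mean1) ** 2 for a in g1)
    var2 = sum((b - mean2) ** 2 for b in g2)
    if var1 == 0 or var2 == 0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return cov * cov / (var1 * var2)


# Toy allele counts for six hypothetical subjects at two SNPs.
snp_a = [0, 1, 2, 0, 1, 2]
snp_b = [0, 1, 2, 0, 1, 1]
r2_pair = genotype_r2(snp_a, snp_b)
```

Values close to 1 mean that one SNP is a near-perfect proxy for the other, whereas intermediate values such as those reported for rs7901695, rs4506565, and rs4132670 leave room for partly independent signals.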

ERBB4
rs6435678 is the African American SNP with the most suggestive evidence of replication in WHI (p = 2.85 × 10⁻⁶ in MESA, and p = 0.051 and p = 0.061 respectively in the WHI age-adjusted and fully-adjusted models; Table 6). MAFs for rs6435678 were similar across both African American populations (0.231 in MESA and 0.235 in WHI; Table 3 and Additional file 9: Table S7). Figure 2 is a regional plot visualizing association results for rs6435678 and its flanking region (±500 kb) markers in both African American populations. This plot shows another BMI-associated SNP at this locus: rs16847102 (p = 1.12 × 10⁻⁵ in MESA, and p = 0.053 and p = 0.074 respectively in the WHI age-adjusted and fully-adjusted models). rs16847102 is in strong LD with rs6435678 in our African American populations (r² = 0.90 and 0.87 respectively in MESA and WHI), and the strength, magnitude, and direction of its BMI-association across both populations mirror those of rs6435678 across the unadjusted, minimally-adjusted, and fully-adjusted models (Table 6).

Discussion
Ethnic-specific associations with BMI in MESA
In this study, we investigated genetic associations with BMI via an ethnic-specific GWAS approach. Using data from MESA, we identified suggestive SNPs (p < 5 × 10⁻⁶) displaying ethnic-specific effects on BMI. These include rs12253976 (10p15.1; KLF6) in Hispanics, rs7763896 (6q23.2; CTGF) in African Americans, and rs6866721 (5q23.1; intergenic) in European Americans. The I² values from concurrent cross-ethnic meta-analyses in MESA provided statistical evidence of substantial cross-ethnic heterogeneity, suggesting that the top SNP effects were not generalizable across ethnicities. Combining results across ethnicities would have masked the SNP effects at these loci (Additional file 10: Table S8). Our findings hence provide support for the hypothesis that ethnic-specific associations are involved in the genetic determination of BMI and highlight the importance of using ethnic-specific approaches for discovery of genetic associations with obesity-related traits.
Ethnic-specificity, which we define as heterogeneity of SNP effects across ethnicities, likely explains some previous failed replications of candidate obesity loci. Examples include an association in the SIM1 intronic region, discovered in Pima Indians but not generalizable to French Europeans [14], as well as functional coding variant W64R in ADRB3, associated with BMI in East Asians but not in Europeans [15]. Ethnic-specificity also has important implications for the evaluation of genetic loci as potential therapeutic agents for obesity: it emphasizes the need for a personalized medicine approach that focuses on identifying the most effective therapies for subjects of different ethnicities.

Replication in WHI
We sought to replicate the genome-wide significant association for rs12253976 in KLF6 using data from WHI Hispanics. However, no association signal was detected in this independent population. Since the WHI is an all-female cohort, we explored whether this failed replication could be partially explained by a difference in the SNP effect between men and women. For this purpose, we conducted the following analyses in MESA: a formal test for heterogeneity of the SNP effect by sex (with a cross-product SNP-by-sex interaction term added to the ethnic-specific model) and sex-stratified analyses. As shown in Additional file 11: Table S9, the p-value for the SNP-by-sex interaction for rs12253976 in MESA Hispanics was 2.51 × 10⁻⁴. While not genome-wide significant, this result prompted us to explore the results of sex-stratified analyses, which revealed that the BMI association signal for rs12253976 in MESA Hispanics was actually stronger in women. Therefore, we concluded that our failure to replicate the findings for this SNP in WHI Hispanics was likely not due to an initial male-driven association in MESA.
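The sex-stratified comparison described above can be illustrated with a simple z-test on the difference between two independent stratum-specific estimates. This is a simpler, closely related alternative to the SNP-by-sex interaction term, not a reimplementation of that model; the function name `stratum_difference_test` and all numbers in the usage line are hypothetical and are not taken from Table S9.

```python
from math import sqrt
from statistics import NormalDist


def stratum_difference_test(beta_a, se_a, beta_b, se_b):
    """Two-sided z-test for a difference between two independent estimates."""
    # The variance of a difference of independent estimates is the sum of variances.
    z = (beta_a - beta_b) / sqrt(se_a ** 2 + se_b ** 2)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical per-allele effects (kg/m^2) and standard errors in two strata.
z_stat, p_diff = stratum_difference_test(beta_a=2.0, se_a=0.6, beta_b=0.0, se_b=0.8)
```

A small p-value from such a test points to heterogeneity of the SNP effect by sex; the direction of the difference then shows which stratum drives the association.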
We also explored the following explanations for this failed replication: heterogeneity in the BMI distributions   Table 3). However, the genotype distributions were different across the two populations. There were no MESA Hispanic minor allele homozygotes; in fact, the observed association was driven by 27 heterozygotes, who, on average, were 5.8 BMI units heavier than major allele homozygotes (Additional file 13: Table S10). In contrast, all three genotypes were represented in WHI, and the estimated per-allele effect size was more modest. While these differences may be due to sample size differences, they may also indicate true heterogeneity between MESA and WHI. Therefore, the locus containing rs12253976 merits further investigation in other Hispanic populations.

Association with TCF7L2
We also investigated in WHI all other suggestive SNPs (p < 5 × 10⁻⁶) from MESA Hispanics. The Hispanic SNP with the most suggestive evidence of replication in WHI was rs12255372, an intronic variant in TCF7L2. The regional association plot for this locus displayed four other BMI-associated variants (rs7901695, rs4506565, rs4132670, and rs12243326). Of these, only rs12243326 was in strong LD with rs12255372. The other three SNPs (rs7901695, rs4506565, rs4132670) were in weak LD with rs12255372 in our Hispanic populations, but are expected to be in strong LD with the previously-reported BMI-associated TCF7L2 variant, rs7903146 [8] [estimated r² = 0.72-1.00 across the representative International HapMap Project (HapMap phase 3 [16]) populations of Mexican ancestry in Los Angeles, California (MEX), Utah Residents of Northern and Western European Ancestry (CEU), and Chinese in Metropolitan Denver, Colorado (CHD); Additional file 14: Table S11]. Therefore, rs7901695, rs4506565, and rs4132670 may serve as proxies for rs7903146 (not genotyped in the Affymetrix 6.0 SNP array), and we conclude that the rs12255372 BMI signal is independent of that of rs7903146. Altogether, this suggests that there are two BMI-associated regions at the TCF7L2 locus in our Hispanic populations, one tagged by the rs7903146 proxy SNPs and another by rs12255372.
The minor alleles of rs12255372 and rs7903146 have been consistently associated with an increased risk of T2D [17,18] and have thus been studied extensively in that context. Studies in populations of European ancestry [19], where the two SNPs are in strong LD (Additional file 14: Table S11 and Additional file 15: Figures S8-S13), have proposed rs7903146 as the causal TCF7L2 variant, given the stronger T2D-association signal at that SNP. However, studies in Hispanic and African American populations [20,21], where LD between the two SNPs is weak or non-existent (Additional file 14: Table S11 and Additional file 15: Figures S8-S13), have reported association signals at both SNPs, with one study [21] showing that, in Hispanics, rs12255372 yields a stronger T2D signal than rs7903146. These studies suggest a role for rs12255372 as an independent T2D signal in TCF7L2; and both SNPs may be functionally significant, as both reside in independent, predicted enhancer sites [22].
Our findings suggest a similar story in the context of BMI determination. Locke et al. [8], whose most significant TCF7L2 analysis only included subjects of European ancestry, proposed rs7903146 as a causal variant in this region. Our study, on the other hand, was able to detect two TCF7L2 signals, since LD does not mask the rs12255372 signal in Hispanic populations.
We note that our study did not find a significant association between the rs7903146 proxy variants (rs7901695, rs4506565, and rs4132670) and BMI in MESA European Americans. We propose two possible explanations for this: that our study was insufficiently powered to detect the purported effect size for rs7903146 in Europeans (−0.02 kg/m² per (minor) allele [8]), and/or that, unlike our study, which consisted entirely of population-based samples, the Locke et al. meta-analysis also included case-control studies of T2D. Regarding the latter, Locke et al. detected evidence of systematic ascertainment bias at this locus (stronger effects in T2D case-control studies than in population-based studies) [8]. This is in line with candidate gene investigations in population-based samples of European ancestry, such as DESIR [23] and the Framingham Heart Study [24], which refuted prior claims of TCF7L2 BMI-associations made by studies examining this relationship only among individuals with T2D [25].
Nonetheless, in our Hispanic population-based samples, we find that the minor alleles of TCF7L2 intronic variants are associated with lower BMI. Since our regression models had adjusted for diabetes, we examined the effect of removing this variable from the models in ad hoc analyses. Table 5 shows that the associations with lower BMI at this locus were either attenuated or unchanged after removing this variable. Thus, in our Hispanic populations, TCF7L2 intronic variants are associated with lower BMI independently of T2D.

Association with ERBB4
We also investigated in WHI all other suggestive SNPs (p < 5 × 10⁻⁶) from MESA African Americans. The SNP with the most suggestive p-values across both African American populations was rs6435678, an intronic variant in ERBB4. The regional plot for this locus showed that the association pattern of the rs6435678 flanking region markers reflected the LD structure of the MESA and WHI African American populations, thus lending additional support to our finding. We note that, upon inspecting this region across the other ethnicities, we found no significant evidence of BMI-associations. Thus, our data suggest that the BMI-effect of rs6435678 may be specific to African Americans, though further investigation in independent multiethnic samples is necessary to substantiate this finding.
ERBB4 was previously linked to BMI in populations of European ancestry via an association with rs7599312 [8], located ~10 kb upstream of this gene. rs6435678, which resides in intron 3, is not in LD with rs7599312 in our African American samples (r² = 0.00 and 0.01 respectively in MESA and WHI). Thus, we conclude that rs6435678 represents a novel signal in ERBB4. We note that no association with rs7599312 was detected in MESA European Americans. However, this was not surprising because the purported effect size for this SNP is only 0.02 kg/m² per-allele in populations of European ancestry [8], which our study was not powered to detect. We also note that there is no LD between rs7599312 and rs6435678 in European Americans (r² = 0.00 in MESA).
An association between variants in ERBB4 and BMI is biologically plausible. ERBB4 encodes a receptor tyrosine kinase expressed in various tissues, including liver and pancreas. In the liver, ERBB4 regulates lipogenesis by binding to Neuregulin 4, an epidermal growth factor secreted by brown adipose tissue [26]. In the pancreas, ERBB4 is involved in the epidermal growth factor receptor signaling pathway, which regulates islet cell differentiation [27,28] and β-cell signal transduction, and whose disruption has been linked to impaired glucose tolerance and reduced insulin response in mice [29].

Strengths and limitations
Conducting our GWAS analyses within MESA gave us the unique opportunity to compare genetic associations across four ethnic groups that were sampled in the same fashion from the same underlying source population. However, we note that stratifying MESA by ethnicity limited our statistical power to detect variants with small effect sizes: for variants with MAF ≥ 0.2, we had ≥ 80 % power to detect effect sizes of ≥ 1.4, ≥ 1.7, ≥ 1.2, and ≥ 1.7 kg/m² in Asians, Hispanics, European Americans, and African Americans, respectively (Additional file 16: Tables S12). This could explain why only one SNP achieved genome-wide significance; why no suggestive SNPs (p < 5 × 10⁻⁶) were identified in Asians; or why FTO was only nominally associated in European Americans and had inconsistent effects in other ethnicities. We acknowledge that a fixed-effects meta-analysis of ethnic-specific GWAS data would be better powered than our ethnic-stratified approach. However, in our own meta-analysis in MESA, we detected substantial evidence of cross-ethnic heterogeneity. Thus, we concluded that pooled effect estimates across ethnicities should not be presented; they would be meaningless, since the effect of the SNPs is not common to all ethnicities.
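The dependence of power on sample size, MAF, and effect size can be sketched with a normal approximation for an additive per-allele effect in linear regression. This is not the calculation behind Table S12: the function name `snp_power` is hypothetical, and the trait standard deviation (5.0 kg/m²) and significance level (5 × 10⁻⁸) below are assumed values that the text does not report. The sketch also assumes Hardy-Weinberg genotype frequencies and ignores covariates.

```python
from math import sqrt
from statistics import NormalDist


def snp_power(beta, maf, n, trait_sd, alpha):
    """Approximate power of a two-sided test for an additive SNP effect."""
    norm = NormalDist()
    # Variance of the allele count under Hardy-Weinberg equilibrium.
    genotype_var = 2.0 * maf * (1.0 - maf)
    # Approximate standard error of the per-allele effect estimate,
    # assuming the SNP explains little of the trait variance.
    se = trait_sd / sqrt(n * genotype_var)
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(beta) / se
    # Probability that the test statistic falls in either rejection region.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Hypothetical inputs: trait_sd and alpha are NOT taken from the study.
power_example = snp_power(beta=1.7, maf=0.2, n=1235, trait_sd=5.0, alpha=5e-8)
```

Because the standard error shrinks with the square root of the sample size, halving a sample through stratification raises the smallest detectable effect by roughly 40 %, which is the trade-off described above.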
Another limitation is that many of the SNPs that we identified in MESA did not show evidence of replication in WHI. The WHI Hispanic and African American populations-though similar to their MESA counterparts with respect to BMI distribution (Additional file 12: Figures S6-S7) and MAFs at the evaluated loci (Table 3)-are composed entirely of women, and this had the potential to affect our ability to replicate our findings. Given the results of our formal tests of heterogeneity of the SNP effects by sex and the accompanying sex-stratified analyses in MESA (Additional file 11: Table S9), there is no evidence to suggest that failure to replicate our findings is due to initial male-driven associations in MESA.
A final limitation is that the genotyping platform used by the MESA and WHI studies (the Affymetrix 6.0 SNP array) was designed to optimize coverage of common genetic variants (MAF ≥ 0.1). As noted in [30], lower frequency variants are more likely to be ethnic-specific. Therefore, in multiethnic studies, custom arrays optimized for minority populations would be preferable.

Conclusions
By employing an ethnic-specific GWAS approach, we identified suggestive BMI-associated SNPs in Hispanics, African Americans, and European Americans that can be explored in future studies. The Hispanic and African American SNPs directed us to TCF7L2 and ERBB4. We show that the TCF7L2 intronic region contains two BMI-association signals in Hispanics, one of which (rs12255372) would have likely gone undetected had we not employed an ethnic-specific analytic approach. We also show that the ERBB4 intronic region contains a novel BMI association signal (rs6435678) that may be specific to African Americans.
Overall, our data suggest that ethnic-specific associations are involved in the genetic determination of BMI. The existence of heterogeneous SNP effects across ethnicities highlights the need for utilizing ethnic-specific approaches for discovery of genetic associations and may have important implications for the development of gene-based therapies for common diseases such as obesity.

Methods
Discovery phase
Subjects providing data for the discovery phase included 1235 Hispanic, 706 Asian, 1549 African American, and 2395 European American subjects recruited into MESA, a multi-center, prospective study of risk factors affecting CVD progression. Recruitment has been described elsewhere [31]. Briefly, 6814 men and women aged 45-84 years were recruited from six U.S. field centers in 2000-2002. MESA ascertained subject race and ethnicity via a standard questionnaire that adopted the definitions used by the U.S. Office of Management and Budget (OMB). (For simplicity, our present study uses the term 'ethnicity' to refer to the four racial-ethnic groups defined in MESA.) MESA recruited overlapping ethnic groups among field centers to minimize confounding of ethnicity by site [31]. Blood was collected from each subject, and DNA samples were genotyped for 909,622 SNPs using the Affymetrix 6.0 SNP array. Samples were required to have a call rate > 95 %. Further details of sample preparation and genotyping are described elsewhere [31].
Genotype and phenotype information for MESA were obtained from the National Center for Biotechnology Information's database of Genotypes and Phenotypes (NCBI dbGaP study accession: phs000209.v11.p3 MESA SNP Health Association Resource (SHARe)). For the present analysis, MESA was stratified into four ethnic-specific samples. EIGENSTRAT [32] analyses verified that the MESA ethnic groups were clustering together based on genotype data (Additional file 17: Figures S14-S15). Subjects analyzed met the QC thresholds described below and summarized in Additional file 3: Table S2 and had complete data for all ethnic-specific model covariates.
All phenotypic data used herein were obtained at the MESA baseline examination. The primary outcome variable was BMI (kg/m²), calculated from height and weight measurements collected by trained staff at the field centers. All genotyped subjects had baseline BMI data. Variables examined as potential covariates due to their previously-reported associations with BMI were sex, baseline age, education, income, smoking, arthritis, diabetes, and physical activity. Details regarding covariate measurement are provided in Additional file 18: Supplemental Methods.
Ethnic-specific associations between BMI and potential covariates were examined in SAS 9.3 (SAS Institute, Cary NC). Ethnic-specific linear models were then built, with the most parsimonious models selected via backwards elimination of covariates with p > 0.05 until model adjusted-r² values were maximized. Age and sex were retained in the models regardless of the statistical significance of their associations with BMI. Individuals with missing values for any covariate included in the final ethnic-specific models were removed from the analyses. Details of variable parameterization and evaluation of the appropriateness of using linear regression on these data are provided in Additional file 18: Supplemental Methods.
Genetic QC procedures also included assessments for cryptic relatedness and population stratification. Cryptic relatedness between subjects in each ethnicity was examined within PLINK using pair-wise identity-by-descent (IBD) estimation. Pairs with π (the estimated proportion of the genome shared IBD) > 0.2 were inspected, and only one subject from each family was included. Population stratification was assessed by calculating genomic inflation factors (λ) in PLINK and conducting ethnic-specific PC analysis in EIGENSTRAT. The first two ethnic-specific PCs were systematically added as covariates to each ethnic-specific linear model, and these PCs were sufficient to control for genomic inflation (all λ below a pre-determined threshold of 1.05).
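The relatedness filter above can be sketched as a clustering step: subjects linked by a pairwise IBD estimate above the threshold are grouped into one family, and a single subject per family is kept. In the study the flagged pairs were inspected before exclusion; the automatic rule used here (keep the smallest subject ID in each cluster), the function name `unrelated_subset`, and the toy IDs are assumptions made only for illustration.

```python
def unrelated_subset(subjects, ibd_pairs, threshold=0.2):
    """Keep one subject from each cluster linked by IBD sharing above threshold."""
    parent = {s: s for s in subjects}

    def find(s):
        # Follow parent links to the cluster representative (with path halving).
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b, ibd_share in ibd_pairs:
        if ibd_share > threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                # Merge the clusters; the smaller ID stays as representative.
                keep, drop = sorted((root_a, root_b))
                parent[drop] = keep

    return sorted({find(s) for s in subjects})


# Toy data: A-B-C form one family; D and E share too little to be linked.
toy_subjects = ["A", "B", "C", "D", "E"]
toy_pairs = [("B", "C", 0.25), ("A", "B", 0.50), ("D", "E", 0.10)]
kept = unrelated_subset(toy_subjects, toy_pairs)
```

Clustering by connected pairs, rather than dropping one member of each flagged pair independently, avoids retaining two relatives who are linked only through a third subject.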
After identifying SNPs with suggestive p-values (p < 5 × 10⁻⁶) in each MESA ethnic group, we evaluated the generalizability of these associations to other ethnic groups. The ±500 kb flanking regions of the top SNPs were examined across all ethnicities to account for potential ethnic differences in LD patterns.
Lastly, we conducted cross-ethnic meta-analyses using an inverse-variance method in PLINK and calculated the I² statistic for each SNP. I² values quantify the percentage of variability in effect estimates attributable to heterogeneity rather than to chance alone [35,36], and, in this context, they can be interpreted as a measure of cross-ethnic heterogeneity.
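The pooling and heterogeneity calculation can be sketched as follows, assuming the standard fixed-effect formulas: each estimate is weighted by the inverse of its variance, Cochran's Q sums the weighted squared deviations from the pooled estimate, and I² = max(0, (Q − df)/Q) × 100 % with df equal to the number of groups minus one. The function name `inverse_variance_meta` and the example inputs are hypothetical; this is not PLINK's implementation.

```python
def inverse_variance_meta(betas, ses):
    """Fixed-effect pooled estimate, its SE, Cochran's Q, and I^2 (percent)."""
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need matching estimates and SEs from at least two groups")
    # Each group is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = (1.0 / total) ** 0.5
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2 is truncated at zero when Q is smaller than its degrees of freedom.
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, pooled_se, q, i2


# Hypothetical per-allele effects in two groups with opposite-leaning estimates.
pooled_beta, pooled_se, q_stat, i2_pct = inverse_variance_meta([1.0, 0.0], [0.2, 0.2])
```

In this toy case the pooled estimate sits halfway between the two group effects while I² is high, which illustrates how a pooled estimate can mask an effect that is present in only one group.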

Replication phase
Suggestive SNPs (p < 5 × 10⁻⁶) identified in MESA Hispanics and African Americans were investigated in 3379 Hispanic and 6871 African American women from WHI. Recruitment and selection criteria for WHI have been described previously [37]. Briefly, WHI recruited postmenopausal women aged 50-79 years at 40 U.S. field centers in 1993-1998. WHI also ascertained subject race and ethnicity via a standard questionnaire that adopted