Whole-genome association analysis to identify markers associated with recombination rates using single-nucleotide polymorphisms and microsatellites
BMC Genetics volume 6, Article number: S51 (2005)
Recombination during meiosis is one of the most important biological processes, and an individual's recombination rate is under genetic control. In this study, we conducted genome-wide association studies to identify chromosomal regions associated with recombination rates. We analyzed genotype data collected on the pedigrees in the Collaborative Study on the Genetics of Alcoholism data provided by Genetic Analysis Workshop 14. A total of 315 microsatellites and 10,081 single-nucleotide polymorphisms from Affymetrix on 22 autosomal chromosomes were used in our association analysis. Genome-wide gender-specific recombination counts for family founders were inferred first, and association analysis was then performed using multiple linear regression. We used the positive false discovery rate (pFDR) to account for multiple comparisons in the two genome-wide scans. Eight regions showed some evidence of association with recombination counts based on the single-nucleotide polymorphism analysis after adjusting for multiple comparisons. However, no region was found to be significant using microsatellites.
Recombination between two homologous chromosomes during meiosis generates novel gene combinations and creates genetic diversity among chromosomes. Furthermore, recombination is critical for proper segregation of homologous chromosomes, and is a major factor shaping linkage disequilibrium (LD) patterns in the genome. Much recent research has sought to establish human genetic maps based on recombination and to estimate local recombination rates in order to augment LD studies and aid in their design and interpretation [1–8]. Kong et al. [2] found marked regional differences in recombination rates and concluded that DNA changes contributing to evolution may not be completely random, but more concentrated within specific regions. This difference may be driven by sequence features. In addition, recombination rate is under genetic control, as exemplified by the finding of Ji et al. [9] that the maize meiotic mutant desynaptic is a recombination modifier that controls recombination rates. In this study, seeking to identify regions potentially affecting recombination rates, we conducted genome-wide association studies based on microsatellites and single-nucleotide polymorphisms (SNPs) of the Collaborative Study on the Genetics of Alcoholism (COGA) data provided by Genetic Analysis Workshop 14 (GAW14). A total of 315 microsatellites and 10,081 SNPs from Affymetrix on 22 autosomal chromosomes were analyzed. We found eight regions (thirteen SNPs) that showed some evidence of association with recombination counts. No region was found to be significant using microsatellites after adjusting for multiple comparisons based on the positive false discovery rate (pFDR) criterion.
The COGA data consist of 143 pedigrees with 1,614 individuals, including 1,109 male and female meioses. Genetic maps for microsatellites and SNPs were both provided by GAW14. Some distinct SNPs have the same genetic map position, which made it impossible to infer recombination events between them. Therefore, we added 1.0 × 10⁻⁶ to these SNPs' genetic map positions to make them distinguishable. To estimate the number of maternal and paternal recombination events for each female or male meiosis, we used the Best option in the haplotyping analysis in MERLIN [10], which outputs the most likely haplotype as well as the most likely sites for recombination throughout a pedigree. The genome-wide gender-specific recombination count for each parent was obtained by averaging, over all of that parent's offspring, the total number of recombination events observed across the 22 autosomal chromosomes in the corresponding meiosis. For pedigrees with only two generations, i.e., the nuclear families, the inferred average total number of recombination events per meiosis of each founder was then treated as a quantitative trait, and genome-wide association tests were conducted to identify markers associated with this trait. For the pedigrees with three or more generations, only recombination information from the founders was extracted and considered in the association tests. We compared the results from the two scans using either microsatellites or SNPs.
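The two bookkeeping steps described above, separating markers that share a map position and averaging recombination counts per parent, can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and input layout are hypothetical, the map is assumed to be sorted by position, and the handling of three or more tied markers (each shifted past the previous one) is an assumption, since the text only states that 1.0 × 10⁻⁶ was added.

```python
from collections import defaultdict


def separate_tied_positions(positions, offset=1.0e-6):
    """Make sorted map positions strictly increasing.

    Assumption: a marker tied with its predecessor is moved to
    `offset` past the (possibly already shifted) predecessor.
    """
    adjusted = []
    for pos in positions:
        if adjusted and pos <= adjusted[-1]:
            pos = adjusted[-1] + offset
        adjusted.append(pos)
    return adjusted


def mean_recombinations_per_parent(meioses):
    """Average genome-wide recombination counts over each parent's meioses.

    `meioses` is an iterable of (parent_id, per_chromosome_counts) pairs,
    one pair per offspring gamete transmitted by that parent.
    """
    totals = defaultdict(list)
    for parent_id, chromosome_counts in meioses:
        # Genome-wide count for one meiosis = sum over the chromosomes.
        totals[parent_id].append(sum(chromosome_counts))
    return {parent: sum(v) / len(v) for parent, v in totals.items()}


if __name__ == "__main__":
    print(separate_tied_positions([1.0, 1.0, 1.0, 2.5]))
    toy_meioses = [
        ("mother_1", [2, 1, 3]),  # toy data with three chromosomes only
        ("mother_1", [1, 1, 2]),
        ("father_1", [1, 0, 1]),
    ]
    print(mean_recombinations_per_parent(toy_meioses))
```

In the actual analysis the per-meiosis counts would come from the haplotyping output rather than from hand-entered lists.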
Genotyping error detection
Because genotyping errors may produce apparent double recombinations within a short distance, they can significantly affect the overall recombination counts. To minimize this impact, the error-checking algorithm implemented in MERLIN, which identifies unlikely genotypes based on double recombination events, was applied, and the erroneous genotypes were excluded before the haplotyping analysis. We used the default parameter in MERLIN, under which genotypes flagged as erroneous with a likelihood ratio p ≤ 0.025 were excluded. The same procedure was applied to both SNPs and microsatellites.
Association analysis to identify markers associated with recombination rates
We used multiple linear regression to evaluate the relation between recombination counts and markers across the 22 autosomal chromosomes, with adjustments for age and gender, for both SNPs and microsatellites. The analysis was restricted to Whites to reduce potential confounding related to ethnic differences. To account for the multiple comparison problem in the two whole-genome scans, we used the pFDR through q-values [12], with a cut-off of 5%. The q-value is a measure of significance in terms of the pFDR, defined as the minimum pFDR at which the statistic can be called significant. A pFDR of 5% means that, on average, 5% of the features called significant are expected to correspond to true null hypotheses. To obtain the q-value for each marker, we applied the software QVALUE [12] to the p-values obtained from the multiple regressions.
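As an illustration of the per-marker regression, the sketch below fits recombination count on one marker plus age and gender by ordinary least squares and returns the marker's coefficient and t statistic. It is a sketch under stated assumptions rather than the authors' implementation: the additive 0/1/2 genotype coding, the 0/1 gender coding, and all function names are hypothetical, and the text does not say how multi-allelic microsatellites were coded.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def marker_t_statistic(counts, genotypes, ages, sexes):
    """Fit count ~ intercept + genotype + age + sex by least squares.

    Returns (marker coefficient, its t statistic, residual degrees of freedom).
    Assumes genotypes are coded 0/1/2 and sexes 0/1 (illustrative coding).
    """
    rows = [[1.0, g, a, s] for g, a, s in zip(genotypes, ages, sexes)]
    p = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, counts)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    rss = sum(
        (y - sum(b * x for b, x in zip(beta, r))) ** 2
        for r, y in zip(rows, counts)
    )
    dof = len(counts) - p
    sigma2 = rss / dof
    # Column 1 of (X'X)^-1 gives the variance factor of the marker coefficient.
    inverse_column = solve_linear_system(xtx, [0.0, 1.0, 0.0, 0.0])
    standard_error = math.sqrt(sigma2 * inverse_column[1])
    return beta[1], beta[1] / standard_error, dof
```

A two-sided p-value follows from the t distribution with the returned degrees of freedom, for example `2 * scipy.stats.t.sf(abs(t), dof)` if SciPy is available. Repeating the fit for every marker yields the list of p-values that is passed to the q-value step.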
Results and Discussion
The genome-wide gender-specific recombination count of each founder was obtained by averaging the recombination counts over all meioses leading to his or her offspring. For SNPs, we inferred the founders' genome-wide recombination counts from the gametes of 1,334 offspring from 130 nuclear families. For microsatellites, we inferred the founders' genome-wide recombination counts from the gametes of 1,409 offspring from 111 nuclear families. This resulted in 189 White founders (121 females and 68 males) with information for SNPs and 199 White founders (129 females and 70 males) with information for microsatellites. Among these founders, 14 were missing all microsatellite genotype information and 23 were missing all SNP genotype information. They made no contribution to the multiple regression analysis, so the distributions of the founders' sex-specific recombination counts plotted in Figure 1 do not include them. This left 166 White founders (106 females and 60 males) with information for SNPs and 185 White founders (120 females and 65 males) with information for microsatellites. The scatter plots of the inferred recombination counts using SNPs and microsatellites, for Whites only, showed a higher correlation for males than for females. The recombination counts were much higher for females than for males, a well-known biological fact [2, 4]. We also noted that the recombination counts were higher using SNPs than using microsatellites, which may be because the SNPs were denser than the microsatellites and so captured recombination events that the microsatellites missed. The mean and median genome-wide gender-specific recombination counts are summarized in Table 1. From the scatter plot for the females, we noted that two female founders had very high inferred recombination counts using the SNPs, 105.4 and 77.8, respectively.
In our analysis, we removed the female founder with the average recombination count of 105.4. The above analysis was conducted after removing the possibly erroneous genotypes. Among the 1,614 individuals and the 315 microsatellites, a total of 353,015 genotypes were examined; 1,295 of them were flagged as likely erroneous by MERLIN's error-checking algorithm and set to missing, giving an estimated genotyping error rate of 0.367% for the microsatellites. Similarly, among the 1,614 individuals and the 10,081 SNPs, a total of 13,395,832 genotypes were examined; 27,338 of them were flagged as likely erroneous and set to missing, giving an estimated genotyping error rate of 0.204% for the SNPs.
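As a check on the arithmetic, each estimated error rate is simply the number of genotypes flagged as erroneous divided by the number of genotypes examined, using the counts reported above:

```latex
\[
\text{microsatellites: } \frac{1{,}295}{353{,}015} \approx 0.367\%,
\qquad
\text{SNPs: } \frac{27{,}338}{13{,}395{,}832} \approx 0.204\%.
\]
```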
We noted that our inferred female and male genome-wide recombination counts were slightly lower than those from previous studies. One reason may be that the 10,081 SNPs did not cover the entire 22 autosomes, since the updated SNP data from Affymetrix were not included in the analysis. Another possible reason is that some correct genotypes were excluded as erroneous by the genotyping error detection algorithm.
Markers associated with recombination counts
Multiple linear regressions with adjustments for age and gender generated a p-value for each marker; these raw p-values were not adjusted for multiple comparisons. The corresponding q-values based on the pFDR were calculated using the software QVALUE. Applying the 0.05 q-value cut-off gave 8 regions (13 SNPs) that showed some evidence of association with recombination counts. The positions of those regions, together with the raw p-values and q-values, are summarized in Table 2. At a q-value cut-off of 0.05, the expected number of false positives among these 13 SNPs is about 13 × 0.05 = 0.65, which suggests that roughly 1 of the 13 SNPs may not be associated with recombination counts. For microsatellites, no region was found to be significant after adjusting for multiple comparisons using the pFDR.
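The conversion from raw p-values to q-values can be sketched as below. This is a simplified illustration and not the QVALUE software itself: it estimates the proportion of true null hypotheses from the p-values above a single fixed threshold `lam` (an assumption; QVALUE may estimate this proportion differently), and it omits the finite-sample correction that distinguishes the pFDR from the FDR. The function name is hypothetical.

```python
def storey_qvalues(pvalues, lam=0.5):
    """Estimate q-values from p-values with a fixed-lambda null proportion.

    pi0 is estimated as the fraction of p-values above `lam`, rescaled by
    (1 - lam) and capped at 1. If no p-value exceeds `lam`, pi0 is 0 and
    all q-values are 0, so very small inputs should be treated with care.
    """
    m = len(pvalues)
    if m == 0:
        return []
    pi0 = min(1.0, sum(1 for p in pvalues if p > lam) / (m * (1.0 - lam)))
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # estimated FDR over all thresholds at or above that p-value.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvalues[idx] / rank)
        qvalues[idx] = running_min
    return qvalues


if __name__ == "__main__":
    toy_pvalues = [0.001, 0.02, 0.6, 0.8]  # toy values, not study data
    q = storey_qvalues(toy_pvalues)
    significant = [i for i, value in enumerate(q) if value <= 0.05]
    print(q, significant)
```

With a cut-off of 0.05, the markers whose q-values fall at or below the threshold are called significant, and about 5% of them are expected to be false positives on average.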
In summary, we have identified several candidate SNPs likely associated with recombination events. Further studies of these regions may help us gain valuable knowledge about recombination, better understand LD patterns, and develop more efficient methods to map disease genes.
COGA: Collaborative Study on the Genetics of Alcoholism
GAW14: Genetic Analysis Workshop 14
pFDR: Positive false discovery rate
1. Pritchard JK, Przeworski M: Linkage disequilibrium in humans: models and data. Am J Hum Genet. 2001, 69: 1-14. 10.1086/321275.
2. Kong A, Gudbjartsson DF, Sainz J, Jonsdottir GM, Gudjonsson SA, Richardsson B, Sigurdardottir S, Barnard J, Hallbeck B, Masson G, Shlien A, Palsson ST, Frigge ML, Thorgeirsson TE, Gulcher JR, Stefansson K: A high-resolution recombination map of the human genome. Nat Genet. 2002, 31: 241-247.
3. Yu A, Zhao C, Fan Y, Jang W, Mungall AJ, Deloukas P, Olsen A, Doggett NA, Ghebranious N, Broman KW, Weber JL: Comparison of human genetic and sequence-based physical maps. Nature. 2001, 409: 951-953. 10.1038/35057185.
4. Broman KW, Murray JC, Sheffield VC, White RL, Weber JL: Comprehensive human genetic maps: individual and sex-specific variation in recombination. Am J Hum Genet. 1998, 63: 861-869. 10.1086/302011.
5. Yu J, Lazzeroni L, Qin J, Huang MM, Navidi W, Erlich H, Arnheim N: Individual variation in recombination among human males. Am J Hum Genet. 1996, 59: 1186-1192.
6. Hudson RR: Two-locus sampling distributions and their application. Genetics. 2001, 159: 1805-1817.
7. McVean GA, Awadalla P, Fearnhead P: A coalescent-based method for detecting and estimating recombination from gene sequences. Genetics. 2002, 160: 1231-1241.
8. Stumpf MP, McVean GA: Estimating recombination rates from population-genetic data. Nat Rev Genet. 2003, 4: 959-968. 10.1038/nrg1227.
9. Ji YF, Stelly DM, Donato MD, Goodman MM, Williams CG: A candidate recombination modifier gene for Zea mays L. Genetics. 1999, 151: 821-830.
10. Abecasis G, Cherny SS, Cookson WO, Cardon LR: Merlin-rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet. 2002, 30: 97-101. 10.1038/ng786. [http://www.sph.umich.edu/csg/abecasis/Merlin/]
11. John S, Shephard N, Liu G, Zeggini E, Cao M, Chen W, Vasavda N, Mills T, Barton A, Hinks A, Eyre S, Jones KW, Ollier W, Silman A, Gibson N, Worthington J, Kennedy GC: Whole-genome scan, in a complex disease, using 11,245 single-nucleotide polymorphisms: comparison with microsatellites. Am J Hum Genet. 2004, 75: 54-64. 10.1086/422195.
12. Storey JD, Tibshirani R: Statistical significance for genomewide studies. Proc Natl Acad Sci U S A. 2003, 100: 9440-9445. 10.1073/pnas.1530509100. [http://faculty.washington.edu/~jstorey/qvalue/]
Supported in part by NIH grant R01 GM59507 and NSF grant DMS 0241160.
SH participated in the design of the study, performed the analysis, and drafted the manuscript. SW helped to obtain recombination counts and to prepare the manuscript. SW, NL, LC, and CO participated in the design and the discussion of the study. HZ conceived the study and helped to draft the manuscript. All authors read and approved the final manuscript.
Huang, S., Wang, S., Liu, N. et al. Whole-genome association analysis to identify markers associated with recombination rates using single-nucleotide polymorphisms and microsatellites. BMC Genet 6, S51 (2005) doi:10.1186/1471-2156-6-S1-S51