Robust physical methods that enrich genomic regions identical by descent for linkage studies: confirmation of a locus for osteogenesis imperfecta
© Brooks et al; licensee BioMed Central Ltd. 2009
Received: 13 September 2007
Accepted: 30 March 2009
Published: 30 March 2009
The monogenic disease osteogenesis imperfecta (OI) is due to single mutations in either of the collagen genes COL1A1 or COL1A2, but within the same family a given mutation is accompanied by a wide range of disease severity. Although this phenotypic variability implies the existence of modifier gene variants, genome-wide scanning of DNA from OI patients has not been reported. Promising genome-wide marker-independent physical methods for identifying disease-related loci have lacked the robustness needed for widespread application. Therefore we sought to improve these methods and to demonstrate their ability to identify known and novel loci relevant to OI.
We have improved methods for enriching regions of identity-by-descent (IBD) shared between related, afflicted individuals. The extent of enrichment reaches 10- to 50-fold or more for some loci. The efficiency of the new process is shown by confirmation of the COL1A2 locus in osteogenesis imperfecta patients from Amish families. Moreover, the analysis revealed additional candidate linkage loci that may harbour modifier genes for OI; a locus on chromosome 1q includes COX-2, a gene implicated in osteogenesis.
Technology for physical enrichment of IBD loci is now robust and applicable for finding genes for monogenic diseases and genes for complex diseases. The data support the further investigation of genetic loci other than collagen gene loci to identify genes affecting the clinical expression of osteogenesis imperfecta. The discrimination of IBD mapping will be enhanced when the IBD enrichment procedure is coupled with deep resequencing.
Mapping of regions identical-by-descent (IBD) is a powerful method for the identification of genetic loci shared within families and implicated in disease. Classically, typing of individual genetic markers throughout the genomes of the afflicted individuals mapped shared haplotypes and has been successful in finding loci linked with numerous monogenic traits. An alternative physical method, Genomic Mismatch Scanning (GMS), physically compares genomes of two affected individuals, related by a not too distant common ancestor, and enriches for the IBD regions they share. Despite its promise, the technique has not been exploited due to technical complexities.
As GMS offers the potential to avoid certain ambiguities associated with genotyping and may be applicable to pools of DNA samples from afflicted, related individuals, we aimed to improve the technique to reduce its inherent noise and to render it robust.
As compared with linkage discovery by genotyping, physical methods based on direct comparison of genomic sequences would enable more complete access to all IBD regions of the genome. Such methods rely on the fact that non-IBD regions are densely polymorphic between two individuals. A conceptually attractive approach for such a direct comparison involves the formation of duplex heterohybrid DNA fragments from the DNAs of related individuals sharing a trait of interest, and then challenging these fragments with reagents actuated by mispairings in the heterohybrids. Such reagents may bind to the mismatched fragment or may introduce strand breaks to permit separation of hybrid fragments that are not perfectly complementary from those that are perfectly paired from the IBD regions. Physical comparison has been successfully used with a variety of technologies that exploit the chemical or structural differences between perfectly matched and mismatched hybrids. These include chemical mismatch cleavage and various attempts to harness proteins that respond to mismatches such as resolvase [4, 5], single-stranded DNA-specific nucleases [6, 7], MutS mismatch binding [8–10] or cleavage by the mismatch-specific enzymatic activity of E. coli MutS, MutL and MutH. In addition, transfection of hybrid molecules into bacteria enables enrichment for perfectly paired fragments via an in vivo process dependent on mismatch repair activities [12, 13].
The use of these mismatch recognition methods has generally been limited to the analysis of a few targeted fragments, but adapting them to genome-wide analysis is conceivable. Global treatment of fragments from the entire genome, however, also requires elimination of reannealed homohybrid fragments (fragments between strands of DNA from the same individual formed during hybridization, including both mismatch-bearing hybrids formed with one paternal and one maternal strand and isohybrids comprising strands from the same parent) whose presence would confound the identification of IBD DNA. Ford and colleagues [14, 15], in presenting the concept of genome-wide enrichment of IBD fragments, introduced the strategy of tagging one genome with methyl groups, and then using restriction enzymes specific for either methylated or non-methylated DNA, but inactive with hemimethylated DNA, to remove homohybrid DNA. In addition, they suggested using various mismatch-specific agents, either by immunoprecipitation or by the E. coli mismatch repair system to eliminate mispaired fragments [12, 14]. Subsequently, Nelson et al. described GMS, a combination of the methylation-dependent homohybrid elimination method with in vitro cleavage by the mismatch repair proteins MutS, MutL and MutH, and digestion by exonuclease III. Application of GMS to pairs of related yeast strains, including mapping of the IBD-enriched regions by hybridization to arrayed clones, correctly identified meiotic recombination crossover points.
Use of GMS with mammalian genomes was demonstrated by enrichment of microsatellite alleles shared between related individuals [16, 17] and confirmation of loci containing previously documented disease-related genes with mapping of IBD regions by hybridization to DNA arrays targeted to the chromosome of the known locus or by microsatellite allele recovery. Despite the recognition of the potential of GMS [20, 21], the approach has not been widely exploited, due to a lack of availability of the mismatch repair proteins, various technical challenges inherent in a multi-step procedure and the lack of appropriate means to map the IBD-enriched DNA. A fundamental problem has been an apparent lack of appreciation of the need for a highly efficient process to ensure elimination of non-identical hybrid fragments. Such residual fragments will hybridize to the DNA features on microarrays and so increase non-specific noise. Modifications to GMS and IBD mapping have been reported but the results from genome-wide studies showed a substantial lack of concordance to confirm CEPH family meiotic crossovers. As linkage studies require processing of many samples, an additional limitation has been the use of reagent volumes and methods unsuitable to high throughput microtiter plate-based procedures.
We present here an improved protocol for physical enrichment of IBD regions. The monogenic disease osteogenesis imperfecta (OI; Brittle Bone Disease) provides a context for demonstrating that the protocol could correctly identify a well-established monogenic locus and an opportunity to discover novel loci relevant to OI etiology. OI includes a heterogeneous group of autosomal dominant inherited disorders characterized by bone fragility and other generalized connective tissue abnormalities. As analysis of locus-specific restriction fragment length polymorphisms had shown that nearly all cases of OI segregated with the two collagen genes [24–28], genome-wide linkage analysis was not perceived as necessary. However, given the wide variety of clinical expression of OI, it is possible that variants of modifier genes may also cosegregate with OI. Therefore a genome-wide analysis might detect loci harbouring such genes.
We have implemented an IBD enrichment protocol that overcomes many of the difficulties of the original GMS protocol. We report that application of the improved protocol to DNA from OI patients has correctly identified the shared IBD locus that includes COL1A2 bearing the disease-causing mutation and additional loci that may be relevant to OI etiology. The protocol is now suitable for finding linkage loci in applications ranging from monogenic disease to complex multigenic disease.
Overview of IBD enrichment process
Genomic DNA (gDNA) from each individual is cleaved with restriction enzymes that yield fragments of sufficient length to ensure a high probability that they will include common polymorphisms. The two sets of fragments are tagged in different ways such that homohybrid molecules with identical tags are targeted for subsequent elimination, but heterohybrids, with differentially tagged strands, are conserved. After tagging, the DNAs are mixed, denatured and reannealed to form hybrid fragments of different types as shown in Figure 1. Strands from one individual may reanneal with strands from the same individual to generate self-self hybrids (as isohybrids or as mixed parental hybrids), and with strands from the other individual to produce heterohybrids, either with strands of different parental origin (M/P heterohybrids) or with both strands derived from the same parent (M/M or P/P heterohybrids). Any hybrid with strands from different ancestral chromosomes, hence bearing mispairs at polymorphic sites, is a target for subsequent enzymatic attack by MutS, MutL and MutH (LSHase) and exo III. The DNA that survives the procedures that eliminate self-self hybrids and mismatch specific enzymatic digestion is enriched for the perfectly paired IBD DNA.
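The partition of hybrid types described above can be illustrated with a small enumeration. The following is a minimal sketch, not part of the protocol itself: it assumes that strands from the four chromosomes of two relatives (labelled A and B here, a hypothetical naming) reanneal at random and in equal amounts, and that any two non-IBD chromosomes differ at some polymorphic site within the fragment. The function name `partition` and its argument are illustrative only.

```python
from fractions import Fraction
from itertools import product

def partition(shared):
    """Fractions of hybrid classes at one locus after random reannealing.

    shared: set of parental origins ('M', 'P') for which relatives A and B
    carry the same ancestral chromosome (assumed perfectly identical).
    """
    # Each chromosome is (individual, parental origin).
    chroms = [(ind, par) for ind in "AB" for par in "MP"]
    counts = {"self-self": 0, "heterohybrid mismatched": 0, "heterohybrid perfect": 0}
    # One strand from chromosome `top`, the complementary strand from `bottom`.
    for top, bottom in product(chroms, repeat=2):
        if top[0] == bottom[0]:
            # Same individual: isohybrid or mixed parental hybrid.
            counts["self-self"] += 1
        elif top[1] == bottom[1] and top[1] in shared:
            # Different individuals, same ancestral chromosome: IBD duplex.
            counts["heterohybrid perfect"] += 1
        else:
            counts["heterohybrid mismatched"] += 1
    total = sum(counts.values())
    return {k: Fraction(v, total) for k, v in counts.items()}

for label, shared in [("biparental IBD", {"M", "P"}),
                      ("monoparental IBD", {"M"}),
                      ("non-IBD", set())]:
    print(label, partition(shared))
```

Under these assumptions, perfectly paired heterohybrids make up at most one quarter of the reannealed DNA at a biparental IBD locus and one eighth at a monoparental IBD locus, which is why the elimination steps must remove the large remaining fraction efficiently.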
Necessity for high efficiency of removal of non-specific DNA
Table: Partition of hybrid fragments after random reannealing of sibling pair DNAs (column headings: Maternal or Paternal; Biparental IBD locus; Monoparental IBD locus).
Robust enrichment of IBD DNA
To produce adequate yields of reannealed fragments, we ensured that all restriction digested gDNAs had a consistent range of fragment sizes, indicating intact gDNA. We optimized several components of the process, notably the choice and concentration of reagents for reannealing in a formamide emulsion [14, 30, 31] and conditioning of the resin that eliminates fragments resulting from nucleolytic digestion. Quality control during the process also includes monitoring of the DNA concentrations after digestion by the fragmenting restriction endonuclease, after reannealing, after the final enrichment process and after generic amplification to detect any atypical losses or excess yields. Several steps require DNA purification and/or concentration and with a view to automation and robust processing, we perform the process entirely in microtiter plates using ultrafiltration instead of alcohol precipitation. The number of steps has been significantly reduced.
Discrimination efficiency evaluated by quantitative PCR
Identification of IBD regions for grandparent-grandchild pairs by mapping on BAC microarrays
Such experiments with pairs of known IBD status validated the enrichment process and the behaviour of immobilized DNA clones. However, due to uneven distribution of SNPs and different amounts of unique and efficiently annealable sequence, each clone displays its particular characteristics, and hence there is a wide variability in the ratio values between clones. Indeed, as shown in Figure 5, the ratios for some clones detecting enrichment are similar to those of other clones detecting depletion. Therefore, as described in Methods, in subsequent experiments to discover unknown IBD regions, we standardized the ratios by mean centering using the variance of the ratios of each clone obtained from experiments with 150 CEPH sib pairs.
Application of IBD enrichment and mapping to identify monogenic disease loci
Table: Osteogenesis imperfecta relative pairs (column headings: Type of pair; Number of pairs; Expected shared IBD fraction).
Here we have described several critical modifications to a physical positional cloning process to ensure its reliability and to reduce noise in the final mapping analysis. We have recognized a major challenge in applying this methodology, namely that the procedure must eliminate a large fraction of the reannealed DNA, mainly comprised of strands derived from different chromosomes and from potentially confounding self-self isohybrid fragments. We have formally defined the dependence of the discrimination of IBD DNA from non-identical DNA as a function of the efficiency of the process of IBD selection. Therefore, we designed the procedure to ensure highly efficient and specific intermediate yields, and included step-specific quality control assays. Improvements included a reduction in the number of steps, optimization of reannealing, and conditioning of a DNA binding resin to render this key reagent suitably reliable.
The essential component for IBD selection is the enzymatic activity that recognizes and hydrolyzes mismatched DNA fragments. We cloned the three proteins into high yield overexpression vectors, and with various methods, including surface plasmon resonance studies, characterized their activities and optimized storage and reaction conditions [36–38]. We over-titrated the enzymatic activities and other reagents that eliminate unwanted DNA to provide sufficient discrimination for mapping of IBD regions.
Because the yield of the IBD enrichment process is low, we validated a generic amplification method that produced sufficient DNA for mapping. Deviations in copy number representation introduced by the amplification method are not detrimental to the final array ratio determinations because increases or decreases in copy number of different fragments are roughly balanced throughout the long BAC clone sequences and hence in contiguous clones representing an IBD region. Any clone-specific or sequence-dependent deviations are expected to be of similar direction and magnitude in both sample and reference DNAs. The range of discrimination factors between 1.3 and 4 that we observed by BAC clone hybridization might be interpreted as representing the efficiency of the enrichment process. However, microarray analysis is subject to numerous perturbing factors resulting in dynamic range dampening. To exclude any microarray-related factors, we assessed discrimination by qPCR and showed that the process can enrich some IBD fragments at least 10- to 50-fold. Even though numerous hybrid fragments might escape the steps designed to eliminate non-identical DNA and so contribute to noise, the improvements we have introduced to the overall process ensure that the preponderant hybridization signal from the BAC clones is due to strongly enriched IBD fragments. We have also developed a related protocol, Genome Hybrid Identity Profiling, which incorporates these improvements and the use of self-self hybrid discrimination via ligation of oligonucleotide tags; this method enables initial fragmentation of the gDNA with any restriction enzyme that generates efficiently ligatable termini.
The distinction between Mendelian monogenic disease and complex polygenic disease has blurred in recent years; modifier genes may influence the clinical phenotypes of monogenic conditions. To identify candidate loci harbouring disease or modifier genes, extended pedigrees are particularly useful, as any IBD regions conserved in most or all of the afflicted individuals are large, whereas the probability of finding any IBD loci shared among many distant relatives is small [42, 43]. To demonstrate adequate performance for genome-wide disease gene mapping, we applied the IBD-enrichment process to Old Order Amish OI family pairs from an extended pedigree and successfully identified a chromosome 7 locus containing COL1A2, the gene bearing the mutation responsible for OI in these families. Another IBD-enriched locus, mapping to chromosome 1q, includes PTGS2 (COX-2), located at 184.9 Mb. COX-2 is expressed in a regulated manner in osteoblasts and is a key regulator in bone formation, interacting with various key proteins of bone metabolism. Variants of COX-2 might affect the clinical outcome of collagen mutations and so may be involved in some of the pleiotropic phenotypes of OI [23, 44–46]. The phenotype of mice heterozygous for the Col1A2 Gly610Cys mutation mimics the milder clinical expression of the mutation in the Amish families, and the phenotype of homozygous mice is consistent with autosomal dominance. As phenotypic severity of the mutated mice varies substantially depending on the genetic background, these mice would be appropriate for constructing additional mutations in candidate modifier genes, such as COX-2.
The potential of family-based studies to identify complex disease genes has yet to be realized; numerous reports have identified linkage peaks of only suggestive p-values and loci are often not replicated. Due to modest gene effects, loci for complex diseases will be difficult to identify unless at least 1000 families are genotyped. Therefore, much effort has been invested in the genome-wide association approach using high density SNP genotyping arrays, and many credible associations have been described in case-control studies. Nevertheless, in contrast to linkage analysis, failure to find significant association in a region, even with high density arrays, cannot exclude a disease-related gene in the region. Genotyping several hundred thousand markers generates thousands of significant associations, such that the most significant p-values most likely correspond to gene regions unrelated to the disease, whereas the rare true associations are likely to be among the lower ranking p-values of the significant associations. Hence linkage studies that whittle the genome to a few regions, followed by association studies limited to these regions, decrease the multiple testing burden and the likelihood of false associations.
Genotypes from high density SNP arrays are also useful for IBD detection by comparing data from two or more related individuals to find long runs that contain no genotypes inconsistent with common inheritance from the same ancestral chromosome, and that exceed some calculated or simulated cut-off length, thus confidently excluding runs of random IBS [42, 43, 54]. This approach has been applied with small pedigrees to confirm a previously identified locus for prostate cancer, and to identify candidate loci for kidney cancer. This novel means to exploit SNP data and the physical method we present here provide alternative or complementary strategies for gene hunts in family collections and pedigrees. By addressing both the set of variants represented on SNP arrays and additional variants, such as those typically revealed only by fine mapping of regions of associated haplotypes, the physical method may detect shorter IBD regions than the genotyping approach, including regions that might be excluded as likely IBS, or regions of genuine IBD runs broken by erroneous genotypes.
To identify small IBD regions, the resolution of the mapping method must approach the expected mapping resolution of the physical IBD enrichment procedure. In addition, genome-wide mapping of IBD-enriched DNA should preserve the high level of relative enrichment revealed by qPCR. We expect that next-generation resequencing methods will permit mapping IBD regions by scoring sequence read depth. Analysis, including genotyping of sequenced SNPs, would be restricted to unique sequences and restriction fragments with some minimum density of SNPs of high minor allele frequencies, thus ignoring the sequences that contribute to dampening of discrimination in microarray analysis. In addition, sequence selection based on resequencing performance with pools of IBD-enriched DNA from CEPH family pairs would generate a genome-wide set of sequences that accurately report IBD sharing. Considering that reannealing to generate unique sequence hybrids reduces the genomic representation at least five-fold and that the level of IBD enrichment for well-behaved sequences is 10- to 50-fold, very few sequencing runs would be sufficient to obtain a depth of coverage comparable to the depth that permitted calling heterozygous SNPs after resequencing an entire human genome, requiring about 70 runs. Sufficiently deep coverage may permit distinction between monoparental and biparental sharing in IBD-enriched regions.
We have established a robust process that physically selects and maps genomic regions that are shared between family members. The methodological approach, originally proposed by Ford and colleagues to isolate "inheritance units" [14, 15], is now practicable for the simultaneous processing of several hundred relative pairs such as sibling pairs under rigorous quality control conditions. The improved process enabled mapping of loci for a monogenic trait, osteogenesis imperfecta. Using the physical enrichment process, we have previously reported the identification of loci for autism, including a locus on chromosome 16p. With subsequent high density genotyping in this locus, we found that PRKCB1, encoding protein kinase C beta, is associated with autism. Thus the procedure is now suitable for enriching for IBD DNA in applications ranging from monogenic to complex diseases. Coupled with mature deep resequencing methods to map the IBD-enriched DNA, the technology will enable increased discrimination that will be required to analyze more distantly related individuals and cost-efficient pools of IBD-enriched DNA samples.
Reagents and DNA
Reagents and suppliers included: restriction endonucleases, Dam methylase and exonuclease III, NE Biolabs; TempliPhi kit, Amersham; Repli-G WGA kit, Molecular Staging/Qiagen; SYBR Green, SYBR Gold and PicoGreen, Molecular Probes/InvitroGen; GenomeHIP reagent components including HyFast gDNA reannealing reagent (HyF), DNA affinity polymer (DAP, a reconditioned derivative of benzoyl-naphthoyl-DEAE cellulose, Sigma), DAP buffer, LSHase (a formulation of E. coli MutS, MutL and MutH), enzyme buffer (EB), a formamide-based array hybridization buffer (AHB), coverslip removal buffer (CRB) and stringent wash buffer (SWB), IntegraGen; Multiscreen filtration plates, Millipore; QIAquick glass-fibre purification plate, Qiagen; UltraGAPS slides, Corning.
CEPH family genomic DNAs were either prepared from immortalized tissue culture cells with the Recoverease kit (Stratagene), a procedure that yields highly intact dialyzed DNA, or obtained from the Coriell repository. BAC clones were obtained from InvitroGen or from the Centre National de Séquençage, Genoscope (Evry, France). Coriell provides DNA from osteogenesis imperfecta families A, B, C and D (for pedigrees and phenotypes, see http://ccr.coriell.org/nigms/phenotype/oi.html).
We measured DNA concentrations with PicoGreen in a 384-well fluorescent plate reader using calf thymus DNA as standard.
Hybrid reannealed DNA
We digested genomic DNA (gDNA) from pairs of relatives with Pst I and purified and concentrated the digests with Multiscreen Manu-30 ultrafiltration. We quantified the purified, digested DNA and verified the expected range of fragment sizes by agarose electrophoresis. We used Dam methylase to tag one of the Pst I-digested gDNAs. We combined 1.5 μg of each of the samples, denatured the DNA by incubation in 0.15 M NaOH at RT for 10 min, and neutralized the solution by addition of HyF buffer and phenol. We reannealed complementary strands by shaking the emulsion for 18 hours and recovered the aqueous phase after mixing with chloroform. We immobilized duplex fragments on glass fibre (QIAquick) in the presence of a chaotropic salt to eliminate non-renatured single-stranded DNA. We eluted the hybrid renatured fragments with TE and removed an aliquot for labelling and hybridization to microarrays.
Enrichment for IBD DNA
We incubated the reannealed hybrid DNA with LSHase in EB at 37°C for 15 min and heated for 10 min at 65°C. We incubated the product of the LSHase reaction with Mbo I and Dpn I for 30 min at 37°C and 10 min at 65°C. Then, we added exo III, incubated for 30 min at 37°C, added DAP buffer and treated with DAP to isolate IBD-enriched duplex DNA free of single-stranded digestion products.
Amplification of IBD-enriched DNA
To amplify the IBD-enriched DNA, we used either TempliPhi or Repli-G, as recommended by the manufacturers with minor modifications. We denatured the reannealed hybrid DNA or the IBD-enriched DNA with NaOH and amplified for 16 hrs at 30°C. We purified the amplified products by ultrafiltration and quantified the DNA. The extent of amplification ranged from about 250- to 2000-fold, or the equivalent of about 8 to 11 doublings.
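The correspondence between fold amplification and doublings quoted above follows from a base-2 logarithm. As a quick check (a sketch only; the helper name `doublings` is illustrative):

```python
import math

def doublings(fold_amplification):
    """Number of doublings equivalent to a given fold amplification."""
    return math.log2(fold_amplification)

low = doublings(250)    # lower end of the observed amplification range
high = doublings(2000)  # upper end of the observed amplification range
print(f"{low:.2f} to {high:.2f} doublings")
```

Rounded to whole numbers, these values give the 8 to 11 doublings stated in the text.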
With the aim of choosing 3000 clones with 1 Mb spacing, we used the program CloneTrek (IntegraGen). CloneTrek's input includes essential features of each clone and two parameters, the distance between two clones on the tiling path and the minimal distance accepted between two clones. Initially, all clones in the NCBI clone registry are placed on a tiling path. CloneTrek's iterative algorithm proceeds as follows: for each chromosome, while there are clones whose removal would create acceptable gaps, identify and remove the least favoured clone. The least favoured clone has the lowest score with respect to various defined criteria, including FISH data, STS content, sequencing status, size and mapping status. The arrays had 4 replicates of 2779 clones, including 2266 mapped to a unique genomic position with the Build 36 assembly (May, 2006). Average spacing was 1.2 Mb and median spacing 0.95 Mb. Duplicate sets of blocks were printed in two zones in order to maximally separate the two sets of duplicates printed in each block. Various controls including 15 rice BACs were also printed in quadruplicate. Subsequent versions of the arrays have 5500 clones.
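The iterative thinning of the tiling path can be sketched as a greedy loop. This is not the CloneTrek implementation, which is not described in detail here; it is a minimal illustration for one chromosome that assumes each clone is reduced to a position and a single precomputed score, and that a gap is "acceptable" when the distance between the neighbours of a removed clone does not exceed the target spacing. The second CloneTrek parameter (minimal accepted distance) is omitted, and the function name and data layout are hypothetical.

```python
def thin_tiling_path(clones, target_spacing):
    """Greedy thinning of a tiling path on one chromosome.

    clones: list of (position_bp, score) tuples; a higher score is more favoured.
    target_spacing: largest gap (bp) accepted between retained neighbours.
    Returns the retained clones sorted by position.
    """
    path = sorted(clones)
    while True:
        # Interior clones whose removal leaves an acceptable gap.
        removable = [i for i in range(1, len(path) - 1)
                     if path[i + 1][0] - path[i - 1][0] <= target_spacing]
        if not removable:
            return path
        # Remove the least favoured (lowest scoring) removable clone.
        worst = min(removable, key=lambda i: path[i][1])
        del path[worst]

example = [(0, 5), (300_000, 1), (600_000, 3),
           (1_000_000, 4), (1_500_000, 2), (2_000_000, 5)]
kept = thin_tiling_path(example, target_spacing=1_000_000)
print([pos for pos, _ in kept])
```

In this toy example the three lowest scoring clones are removed in turn, leaving clones spaced 1 Mb apart. The first and last clones of the chromosome are always retained in this sketch.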
We purified BAC DNA using alkaline lysis, filtration and isopropanol precipitation. We amplified BAC DNA as described above for amplification of IBD-enriched DNA. Clone identity was verified by end sequencing of all clones and restriction digest fingerprinting of some clones. We checked the fidelity of amplification by verifying that sample BAC fingerprint patterns matched before and after amplification. We digested 4 to 20 μg of each amplified BAC with Alu I and purified the digests by ultrafiltration. The DNA was dried by vacuum centrifugation and resuspended in 12 μl of 50% DMSO. We printed the DNA on GAPS2 slides (Corning) with a BioRobotics MicroGridII arrayer using BioRobotics quill pins; spot diameters were about 100 microns. We irradiated arrays with 100 mJ of 254 nm UV light and then baked them at 80°C for two hours. We scanned all slides and examined the images of auto-fluorescence to identify any pin-specific problems. We validated the printing batches of about 100 slides by SYBR green staining and hybridization tests on selected slides.
Mapping of IBD-enriched DNA on microarrays
Amplified reannealed hybrid DNA, labelled with Cy3, and amplified IBD-enriched DNA, labelled with Cy5, were hybridized to the BAC microarrays to enable a ratiometric analysis [34, 35]. Labelling reactions of 30 μl were incubated for 16 to 18 hours at 37°C with Klenow (exo-) DNA polymerase (NE Biolabs) and contained 1 μg of DNA, 125 μM random octamers, 120 μM each of dATP, dGTP and TTP, 60 μM dCTP and either 50 μM Cy5-dCTP or 50 μM Cy3-dCTP. We purified the labelled DNA by spun gel filtration through Sephadex G50 (APB Biotech) in HV45 microplates (Millipore). The specific fluorescence (fluorochromes/Kbp, fl/Kb) of each probe, as determined by DNA quantification and Cy-specific fluorescence readings in a fluorescence plate reader using Cy-dCTPs as standards, ranged from 20 to 50 fl/Kb. We prepared hybridization mixes by mixing Cy5- and Cy3-labelled DNA, concentrating by vacuum centrifugation and resuspending in 35 μl of AHB containing Cot1 DNA. Array slides were blocked by incubation in 10% BSA, 0.01% SDS at 37°C for 30 min. We pre-hybridized slides with 40 μl AHB containing 730 μg/ml salmon sperm DNA at RT for 30 min. We removed about 30 μl of the prehybridization mix, deposited the hybridization mixes on the arrays, covered with Hybrislips (Grace) and placed the arrays in individual hybridization chambers (Corning) in a water bath at 42°C for 2 to 3 days. We removed coverslips by gentle agitation in CRB, rinsed in 2× SSC, soaked in SWB at 45°C for 10 min, and then briefly rinsed slides in a series of baths: 0.2× SSC, filtered 0.1× SSC and 70% isopropanol. We have also tested and validated various commercial labelling kits, hybridization buffers and alternative protocols, including washing at higher temperatures in the absence of formamide.
Image and microarray data analysis
Arrays were scanned using an Agilent scanner and fluorescent intensities were corrected by subtraction of local background using GenePix® Pro 5.1. Spots with fluorescent signals indicating partial saturation (> 50,000) or signals less than 2 times the mean of the backgrounds from all autosomal clones were excluded. A ratio value of IBD-enriched DNA versus reannealed DNA was determined based on the four spot replicates for each clone. Clones with fewer than three morphologically acceptable replicates or with excessive variance of replicate ratios (Var(replicate) - Mean(Var(replicates)) > 2 * Var(Var(replicates))) were eliminated. Median ratios of the replicates for each clone were computed and data were normalized between arrays by dividing the ratio of each clone by the mean of the ratios of all autosomal clones. Unless otherwise noted, the ratios of each clone were standardized using 150 full sib pair controls by subtracting the mean of the control ratios of the clone and dividing by the variance of the control clone ratios. Only the clones mapped to a unique genomic position were used in the analysis.
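The per-clone ratio, between-array normalization and standardization steps can be sketched as follows. This is an illustrative outline rather than the analysis code used for the study: the function names and data layout are hypothetical, spot-level filtering (saturation, background, replicate variance) is assumed to have been applied already, and the standardization divides by the variance of the control ratios exactly as worded in the text, whereas a conventional z-score would divide by the standard deviation.

```python
from statistics import mean, median, variance

def clone_ratio(replicate_ratios, min_replicates=3):
    """Median ratio of the acceptable spot replicates, or None if too few remain."""
    if len(replicate_ratios) < min_replicates:
        return None
    return median(replicate_ratios)

def normalize_array(ratios_by_clone):
    """Divide each clone ratio by the mean ratio of all (autosomal) clones on the array."""
    array_mean = mean(ratios_by_clone.values())
    return {clone: r / array_mean for clone, r in ratios_by_clone.items()}

def standardize(ratio, control_ratios):
    """Centre on the control mean and scale by the control variance (as stated in the text)."""
    return (ratio - mean(control_ratios)) / variance(control_ratios)

# Toy usage with invented numbers, for illustration only.
print(clone_ratio([1.0, 2.0, 4.0]))              # median of three replicates
print(normalize_array({"bacA": 1.0, "bacB": 3.0}))
print(standardize(2.0, [1.0, 1.5, 2.0]))
```

Standardizing each clone against its own control distribution is what allows clones with very different raw ratio ranges to be compared on a common scale.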
IBD determination and linkage analysis
We set m as three, corresponding to a moving window of seven (2m + 1) adjacent clones over which the moving average ratio (MA-ratio) of each clone was computed. As we expect that the status of an average of 75% of the clones is IBD, a threshold ratio T was determined such that 75% of the MA-ratios from all clones and all sib pair controls were greater than T. We set the IBD status to one for clones with MA-ratios greater than T and to zero for clones with MA-ratios less than T. After these binary IBD scores were determined for each clone for each affected relative pair, we then counted the number of pairs that were IBD at each clone. A region represented by a clone or a series of consecutive clones is linked to the trait only if the number of affected pairs that are IBD for the region exceeds the number of pairs that by chance could have received copies of the same ancestral region. The null distribution of chance sharing appropriate for studies with different types of relative pairs was determined as described; alternative methods to determine the null distribution may be appropriate for various experimental designs, including those with inbred pedigrees [61–64]. To limit the genome-wide probability of false linkage to 5%, we used the P-value of 2 × 10⁻⁵ to set the pointwise significance threshold for declaration of significantly increased sharing.
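The IBD calling steps (moving average, threshold, binary scoring and counting of sharing pairs) can be outlined in a few lines. This is a sketch under stated assumptions, not the original analysis code: the function names are hypothetical, clones are assumed to be ordered along one chromosome, windows are simply truncated at chromosome ends, and T is taken as an empirical quantile of the control MA-ratios.

```python
def moving_average(ratios, m=3):
    """Average over a window of up to 2m + 1 adjacent clones (truncated at the ends)."""
    out = []
    for i in range(len(ratios)):
        window = ratios[max(0, i - m): i + m + 1]
        out.append(sum(window) / len(window))
    return out

def threshold(control_ma_ratios, ibd_fraction=0.75):
    """Value T such that about `ibd_fraction` of control MA-ratios exceed T."""
    ordered = sorted(control_ma_ratios)
    k = int(round((1 - ibd_fraction) * len(ordered)))
    return ordered[k - 1] if k > 0 else float("-inf")

def ibd_calls(ma_ratios, t):
    """Binary IBD status per clone: 1 if the MA-ratio exceeds T, else 0."""
    return [1 if value > t else 0 for value in ma_ratios]

def sharing_counts(calls_by_pair):
    """Number of relative pairs scored IBD at each clone."""
    return [sum(column) for column in zip(*calls_by_pair)]

# Toy usage with invented numbers, for illustration only.
ma = moving_average([0, 0, 0, 7, 0, 0, 0], m=3)
t = threshold([1, 2, 3, 4, 5, 6, 7, 8])
print(ma[3], t, ibd_calls([1, 3, 2.5], t), sharing_counts([[1, 0, 1], [1, 1, 0]]))
```

The per-clone counts returned by `sharing_counts` are the quantities that would then be compared with the null distribution of chance sharing described in the text.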
PCR amplifications were quantified by real time qPCR, using a common probe specific for CA microsatellite repeat sequences [66, 67]. Reactions of 25 μl contained 5 ng of amplified IBD-enriched DNA, AmpliTaqGold buffer, 0.2 mM each dATP, dCTP, dGTP, TTP, 0.4 mM dUTP, 5 mM MgCl2, 0.005 U/μl uracil-N-DNA glycosylase (Sigma), 0.03 U/μl AmpliTaqGold DNA polymerase (Roche), 0.2 μM each primer and the probe oligonucleotide 5' FAM-(CA)14-TAMRA. Incubations were at 37°C for 10 min, 95°C for 10 min, followed by 35 cycles at 95°C for 15 sec and 60°C for 1 min.
The microarray gpr files and three annotation files can be obtained from http://BMC_Genetics2008.integragen.org.
The file "Osteogenesis_samples.txt" lists the Coriell osteogenesis imperfecta samples.
The file "Osteogenesis_relative_pairs.txt" lists the experimental pairs of relatives and the identity codes for the microarray gpr files.
The file "BAC_clones.txt" lists the clone names and their chromosomal positions. If it was not possible to confidently position a clone, it is annotated as "Null".
We thank Jan Mous and Elke Roschmann for critical reading and comments; Lon Aggerbeck and David Rickman, CNRS, Gif-sur-Yvette, for initial DNA array printing; Stephanie Maillard and Virginie Decaulne for expert technical assistance; Rachel Rousseau for performing qPCR experiments; Jolanta Luberda for assistance in array printing; Safa Saker, Isabelle Lambert, Isabelle Bezier and Thierry Larmonier, Genethon, Association Française contre les Myopathies, for cell culture; Abdel Benajou for help with array data analysis; Bruno Copin for help with figure preparation; Gabor Gyapay, Jean Weissenbach and colleagues, Genoscope, for contributions to DNA preparations and sequencing; Mark Lathrop and colleagues, Centre National de Genotypage, for access to their facilities; and the reviewers for constructive criticism of the manuscript.
- Botstein D, Risch N: Discovering genotypes underlying human phenotypes: past successes for mendelian disease, future approaches for complex disease. Nat Genet. 2003, 33 (Suppl): 228-237. 10.1038/ng1090.
- Nelson SF, McCusker JH, Sander MA, Kee Y, Modrich P, Brown PO: Genomic mismatch scanning: a new approach to genetic linkage mapping. Nat Genet. 1993, 4 (1): 11-18. 10.1038/ng0593-11.
- Cotton RG, Scriver CR: Proof of "disease causing" mutation. Hum Mutat. 1998, 12 (1): 1-3. 10.1002/(SICI)1098-1004(1998)12:1<1::AID-HUMU1>3.0.CO;2-M.
- Mashal RD, Koontz J, Sklar J: Detection of mutations by cleavage of DNA heteroduplexes with bacteriophage resolvases. Nat Genet. 1995, 9 (2): 177-183. 10.1038/ng0295-177.
- Youil R, Kemper BW, Cotton RG: Screening for mutations by enzyme mismatch cleavage with T4 endonuclease VII. Proc Natl Acad Sci USA. 1995, 92 (1): 87-91. 10.1073/pnas.92.1.87.
- Till BJ, Burtner C, Comai L, Henikoff S: Mismatch cleavage by single-strand specific nucleases. Nucleic Acids Res. 2004, 32 (8): 2632-2641. 10.1093/nar/gkh599.
- Oleykowski CA, Bronson Mullins CR, Godwin AK, Yeung AT: Mutation detection using a novel plant endonuclease. Nucleic Acids Res. 1998, 26 (20): 4597-4602. 10.1093/nar/26.20.4597.
- Lishanski A, Ostrander EA, Rine J: Mutation detection by mismatch binding protein, MutS, in amplified DNA: application to the cystic fibrosis gene. Proc Natl Acad Sci USA. 1994, 91 (7): 2674-2678. 10.1073/pnas.91.7.2674.
- Geschwind DH, Rhee R, Nelson SF: A biotinylated MutS fusion protein and its use in a rapid mutation screening technique. Genet Anal. 1996, 13 (4): 105-111.
- Wagner R, Debbie P, Radman M: Mutation detection using immobilized mismatch binding protein (MutS). Nucleic Acids Res. 1995, 23 (19): 3944-3948. 10.1093/nar/23.19.3944.
- Smith J, Modrich P: Mutation detection with MutH, MutL, and MutS mismatch repair proteins. Proc Natl Acad Sci USA. 1996, 93 (9): 4374-4379. 10.1073/pnas.93.9.4374.
- Doutriaux MP, Wagner R, Radman M: Mismatch-stimulated killing. Proc Natl Acad Sci USA. 1986, 83 (8): 2576-2578. 10.1073/pnas.83.8.2576.
- Faham M, Cox DR: A novel in vivo method to detect DNA sequence variation. Genome Res. 1995, 5 (5): 474-482. 10.1101/gr.5.5.474.
- Casna NJ, Novack DF, Hsu MT, Ford JP: Genomic analysis II: isolation of high molecular weight heteroduplex DNA following differential methylase protection and Formamide-PERT hybridization. Nucleic Acids Res. 1986, 14 (18): 7285-7303. 10.1093/nar/14.18.7285.
- Sanda AI, Ford JP: Genomic analysis I: inheritance units and genetic selection in the rapid discovery of locus linked DNA markers. Nucleic Acids Res. 1986, 14 (18): 7265-7283. 10.1093/nar/14.18.7265.
- McAllister L, Penland L, Brown PO: Enrichment for loci identical-by-descent between pairs of mouse or human genomes by genomic mismatch scanning. Genomics. 1998, 47 (1): 7-11. 10.1006/geno.1997.5083.
- Cheung VG, Gregg JP, Gogolin-Ewens KJ, Bandong J, Stanley CA, Baker L, Higgins MJ, Nowak NJ, Shows TB, Ewens WJ, et al: Linkage-disequilibrium mapping without genotyping. Nat Genet. 1998, 18 (3): 225-230. 10.1038/ng0398-225.
- Cheung VG, Nelson SF: Genomic mismatch scanning identifies human genomic DNA shared identical by descent. Genomics. 1998, 47 (1): 1-6. 10.1006/geno.1997.5082.
- Mirzayans F, Mears AJ, Guo SW, Pearce WG, Walter MA: Identification of the human chromosomal region containing the iridogoniodysgenesis anomaly locus by genomic-mismatch scanning. Am J Hum Genet. 1997, 61 (1): 111-119. 10.1086/513894.
- Lander ES: Finding similarities and differences among genomes. Nat Genet. 1993, 4 (1): 5-6. 10.1038/ng0593-5.
- Kruglyak L, McAllister L: Who needs genetic markers?. Nat Genet. 1998, 18 (3): 200-202. 10.1038/ng0398-200.
- Smirnov D, Bruzel A, Morley M, Cheung VG: Direct IBD mapping: identical-by-descent mapping without genotyping. Genomics. 2004, 83 (2): 335-345. 10.1016/j.ygeno.2003.08.002.
- Byers PH, Steiner RD: Osteogenesis imperfecta. Annu Rev Med. 1992, 43: 269-282. 10.1146/annurev.me.43.020192.001413.
- Sykes B, Ogilvie D, Wordsworth P, Wallis G, Mathew C, Beighton P, Nicholls A, Pope F, Thompson E, Tsipouras P, et al: Consistent linkage of dominantly inherited osteogenesis imperfecta to the type I collagen loci: COL1A1 and COL1A2. Am J Hum Genet. 1990, 46 (2): 293-307.
- Byers PH, Shapiro JR, Rowe DW, David KE, Holbrook KA: Abnormal alpha 2-chain in type I collagen from a patient with a form of osteogenesis imperfecta. J Clin Invest. 1983, 71 (3): 689-697. 10.1172/JCI110815.
- Falk C, Schwartz R, Ramirez F, Tsipouras P: Use of molecular haplotypes specific for the human pro alpha 2(I) collagen gene in linkage analysis of the mild autosomal dominant forms of osteogenesis imperfecta. Am J Hum Genet. 1986, 38 (3): 269-279.
- Nicholls A, Pope F, Craig D: An abnormal collagen alpha chain containing cysteine in autosomal dominant osteogenesis imperfecta. Br Med J (Clin Res Ed). 1984, 288 (6411): 112-113. 10.1136/bmj.288.6411.112.
- Sykes B, Ogilvie D, Wordsworth P, Anderson, Jones N: Osteogenesis imperfecta is linked to both type I collagen structural genes. Lancet. 1986, 2 (8498): 69-72. 10.1016/S0140-6736(86)91609-0.
- McBride D, Streeten E, Mitchell B, Shuldiner A: Variable expressivity of a COL1A2 gly-610-cys mutation in a large Amish pedigree. Am J Hum Genet. 2002, 71 (Supplement): 1047.
- Wieder R, Wetmur JG: Factors affecting the kinetics of DNA reassociation in phenol-water emulsion at high DNA concentrations. Biopolymers. 1982, 21 (3): 665-677. 10.1002/bip.360210313.
- Kohne DE, Levison SA, Byers MJ: Room temperature method for increasing the rate of DNA reassociation by many thousandfold: the phenol emulsion reassociation technique. Biochemistry. 1977, 16 (24): 5329-5341. 10.1021/bi00643a026.
- Broman KW, Murray JC, Sheffield VC, White RL, Weber JL: Comprehensive human genetic maps: individual and sex-specific variation in recombination. Am J Hum Genet. 1998, 63 (3): 861-869. 10.1086/302011.
- Dib C, Faure S, Fizames C, Samson D, Drouot N, Vignal A, Millasseau P, Marc S, Hazan J, Seboun E, et al: A comprehensive genetic map of the human genome based on 5,264 microsatellites. Nature. 1996, 380 (6570): 152-154. 10.1038/380152a0.
- Solinas-Toldo S, Lampel S, Stilgenbauer S, Nickolenko J, Benner A, Dohner H, Cremer T, Lichter P: Matrix-based comparative genomic hybridization: biochips to screen for genomic imbalances. Genes Chromosomes Cancer. 1997, 20 (4): 399-407. 10.1002/(SICI)1098-2264(199712)20:4<399::AID-GCC12>3.0.CO;2-I.
- Pinkel D, Segraves R, Sudar D, Clark S, Poole I, Kowbel D, Collins C, Kuo WL, Chen C, Zhai Y, et al: High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat Genet. 1998, 20 (2): 207-211. 10.1038/2524.
- Acharya S, Foster PL, Brooks P, Fishel R: The coordinated functions of the E. coli MutS and MutL proteins in mismatch repair. Mol Cell. 2003, 12 (1): 233-246. 10.1016/S1097-2765(03)00219-3.
- Brooks P: MutS-DNA interactions and DNase protection analysis with surface plasmon resonance. Methods in Molecular Biology: DNA Repair Protocols: Procaryotic Systems. Edited by: Vaughan P. 2000, Totowa, NJ: Humana, 152: 119-132.
- Galio L, Bouquet C, Brooks P: ATP hydrolysis-dependent formation of a dynamic ternary nucleoprotein complex with MutS and MutL. Nucleic Acids Res. 1999, 27 (11): 2325-2331. 10.1093/nar/27.11.2325.
- Botwell D, Sambrook J, eds: DNA Microarrays: A Molecular Cloning Manual. 2003, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
- Brinkman RR, Dube MP, Rouleau GA, Orr AC, Samuels ME: Human monogenic disorders – a source of novel drug targets. Nat Rev Genet. 2006, 7 (4): 249-260. 10.1038/nrg1828.
- Antonarakis SE, Beckmann JS: Mendelian disorders deserve more attention. Nat Rev Genet. 2006, 7 (4): 277-282. 10.1038/nrg1826.
- Thomas A, Camp NJ, Farnham JM, Allen-Brady K, Cannon-Albright LA: Shared genomic segment analysis. Mapping disease predisposition genes in extended pedigrees using SNP genotype assays. Ann Hum Genet. 2008, 72 (Pt 2): 279-287. 10.1111/j.1469-1809.2007.00406.x.
- Kong A, Masson G, Frigge ML, Gylfason A, Zusmanovich P, Thorleifsson G, Olason PI, Ingason A, Steinberg S, Rafnar T, et al: Detection of sharing by descent, long-range phasing and haplotype imputation. Nat Genet. 2008, 40 (9): 1068-1075. 10.1038/ng.216.
- Marini JC, Forlino A, Cabral WA, Barnes AM, San Antonio JD, Milgrom S, Hyland JC, Korkko J, Prockop DJ, De Paepe A, et al: Consortium for osteogenesis imperfecta mutations in the helical domain of type I collagen: regions rich in lethal mutations align with collagen binding sites for integrins and proteoglycans. Hum Mutat. 2007, 28 (3): 209-221. 10.1002/humu.20429.
- Li L, Pettit AR, Gregory LS, Forwood MR: Regulation of bone biology by prostaglandin endoperoxide H synthases (PGHS): a rose by any other name. Cytokine Growth Factor Rev. 2006, 17 (3): 203-216. 10.1016/j.cytogfr.2006.01.005.
- Lau KH, Kapur S, Kesavan C, Baylink DJ: Up-regulation of the Wnt, estrogen receptor, insulin-like growth factor-I, and bone morphogenetic protein pathways in C57BL/6J osteoblasts as opposed to C3H/HeJ osteoblasts in part contributes to the differential anabolic response to fluid shear. J Biol Chem. 2006, 281 (14): 9576-9588. 10.1074/jbc.M509205200.
- McBride DJ, Carleton S, Phillips C, Kouznetsova N, Leikin S, Shapiro J, Mitchell B, Shuldiner A, Streeten E: COL1A2 G610C mice: a knock-in mouse model based on a large human OI kindred with phenotype variation. 9th International Meeting on Osteogenesis Imperfecta: 2005; Annapolis, Maryland: NICHD. 2005.
- Carleton SM, McBride DJ, Carson WL, Huntington CE, Twenter KL, Rolwes KM, Winkelmann CT, Morris JS, Taylor JF, Phillips CL: Role of genetic background in determining phenotypic severity throughout postnatal development and at peak bone mass in Col1a2 deficient mice (oim). Bone. 2008, 42 (4): 681-694. 10.1016/j.bone.2007.12.215.
- Altshuler D, Daly M, Kruglyak L: Guilt by association. Nat Genet. 2000, 26 (2): 135-137. 10.1038/79839.
- Risch NJ: Searching for genetic determinants in the new millennium. Nature. 2000, 405 (6788): 847-856. 10.1038/35015718.
- Donnelly P: Progress and challenges in genome-wide association studies in humans. Nature. 2008, 456 (7223): 728-731. 10.1038/nature07631.
- Curtis D, Vine AE, Knight J: A pragmatic suggestion for dealing with results for candidate genes obtained from genome wide association studies. BMC Genet. 2007, 8: 20. 10.1186/1471-2156-8-20.
- Zaykin DV, Zhivotovsky LA: Ranks of genuine associations in whole-genome scans. Genetics. 2005, 171 (2): 813-823. 10.1534/genetics.105.044206.
- Leibon G, Rockmore DN, Pollak MR: A SNP streak model for the identification of genetic regions identical-by-descent. Stat Appl Genet Mol Biol. 2008, 7 (1): Article 16.
- Shendure J, Ji H: Next-generation DNA sequencing. Nat Biotechnol. 2008, 26 (10): 1135-1145. 10.1038/nbt1486.
- Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, Hall KP, Evers DJ, Barnes CL, Bignell HR, et al: Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008, 456 (7218): 53-59. 10.1038/nature07517.
- Philippi A, Roschmann E, Tores F, Lindenbaum P, Benajou A, Germain-Leclerc L, Marcaillou C, Fontaine K, Vanpeene M, Roy S, et al: Haplotypes in the gene encoding protein kinase c-beta (PRKCB1) on chromosome 16 are associated with autism. Mol Psychiatry. 2005, 10 (10): 950-960. 10.1038/sj.mp.4001704.
- Cairns J: The bacterial chromosome and its manner of replication as seen by autoradiography. J Mol Biol. 1963, 6 (3): 208-213.
- Beld M, Sol C, Goudsmit J, Boom R: Fractionation of nucleic acids into single-stranded and double-stranded forms. Nucleic Acids Res. 1996, 24 (13): 2618-2619. 10.1093/nar/24.13.2618.
- Smalley SL, Woodward JA, Palmer CG: A general statistical model for detecting complex-trait loci by using affected relative pairs in a genome search. Am J Hum Genet. 1996, 58 (4): 844-860.
- Fisher RA: A fuller theory of "junctions" in inbreeding. Heredity. 1954, 8 (2): 187-197. 10.1038/hdy.1954.17.
- Cannings C: The identity by descent process along the chromosome. Hum Hered. 2003, 56 (1–3): 126-130. 10.1159/000073740.
- Thomas A, Skolnick MH, Lewis CM: Genomic mismatch scanning in pedigrees. IMA J Math Appl Med Biol. 1994, 11 (1): 1-16. 10.1093/imammb/11.1.1.
- Donnelly KP: The probability that related individuals share some section of genome identical by descent. Theor Popul Biol. 1983, 23 (1): 34-63. 10.1016/0040-5809(83)90004-7.
- Lander E, Kruglyak L: Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat Genet. 1995, 11 (3): 241-247. 10.1038/ng1195-241.
- Jouquand S, Andre C, Cheron A, Hitte C, Chuat JC, Galibert F: Using the fluorogenic 5' nuclease assay for high-throughput detection of (CA)n repeats in radiation hybrid mapping. Biotechniques. 2000, 28 (4): 754-758.
- Ginzinger DG, Godfrey TE, Nigro J, Moore DH, Suzuki S, Pallavicini MG, Gray JW, Jensen RH: Measurement of DNA copy number at microsatellite loci using quantitative PCR analysis. Cancer Res. 2000, 60 (19): 5405-5409.