Detection of mutations in the dystrophin gene via automated DHPLC screening and direct sequencing

Background: Currently, molecular diagnostic laboratories focus only on the identification of large deletion and duplication mutations (spanning one exon or more) for Duchenne Muscular Dystrophy (DMD), yielding 65% of causative mutations. These mutations are detected by an existing set of multiplexed polymerase chain reaction (PCR) primer pairs. Due to the large size of the dystrophin gene (79 exons), finding point mutations (substitutions, deletions or insertions of one or several nucleotides) has been prohibitively expensive and laborious. The aim of this project was to develop an effective and convenient method of finding all, or most, mutations in the dystrophin gene with only a moderate increase in cost.
Results: Using denaturing high performance liquid chromatography (DHPLC) screening and direct sequencing, 86 PCR amplicons of genomic DNA from the dystrophin gene were screened for mutations in eight patients diagnosed with DMD who had tested negative for large DNA rearrangements. Mutations likely to be disease-causative were found in six of the eight patients. All 86 amplicons from the two patients in whom no likely disease-causative mutations were found were completely sequenced and only polymorphisms were found.
Conclusions: We have shown that it is now feasible for clinical laboratories to begin testing for both point mutations and large deletions/duplications in the dystrophin gene. The detection rate will rise from 65% to greater than 92% with only a moderate increase in cost.

The DMD gene spans 2.4 million base pairs of genomic DNA on the X chromosome and its 14 kb transcript encodes a full-length protein (dystrophin) of 427 kiloDaltons. Dystrophin is a sarcolemmal protein that, through its interaction with many other proteins, participates in the linkage of the extracellular matrix to the cytoplasmic cytoskeleton [2][3][4]. Mutations in this gene result in DMD, Becker muscular dystrophy (BMD) or another dystrophinopathy. A major consequence of the dystrophin gene's large genomic size is a high rate of mutation; close to 30% of cases prove to be spontaneous mutations [5]. Approximately 60% of mutations causing DMD are deletions of large segments of the gene, usually including one or more exons [6][7][8]. Approximately 5% of mutations are duplications of large segments of the gene [8]. Large deletions and duplications are detected using multiplexed PCR primers to amplify a subset (approximately 20 of the 79) of exons that are empirically estimated to be involved in 98% of large deletions and duplications. Thus, most but not all large deletions and duplications are detected by this test [9,10]. The other 35% of mutations are presumed to be point mutations such as substitutions, deletions or insertions of one or several nucleotides, leading to premature termination codons (nonsense or frame-shift mutations), amino acid substitutions (missense and neutral mutations) and alterations in splice sites. These mutations have remained undetected in most patients, both male and female, because available techniques are relatively expensive and laborious given the size of the dystrophin gene.
Existing procedures for detection of point mutations include SSCP (single strand conformational polymorphism) [11,12], a variation of SSCP called DOVAM (Detection of Virtually All Mutations) [19], Enhanced SSCP [28], HA (Heteroduplex Analysis) [13,14], PTT (Protein Truncation Test) [15], DGGE (Denaturing Gradient Gel Electrophoresis) [16][17][18], chemical cleavage of mismatch [20], and RNase cleavage [21]. Each of these procedures has its advantages and limitations but the size of the gene and the cost of the procedures have not made them routine in most laboratories.
The purpose of this project was to develop an effective and convenient process to detect both large and small alterations in the DMD gene with only a moderate increase in cost. In an effort to address these issues, we developed a new primer set to be used in conjunction with denaturing high-performance liquid chromatography (DHPLC) and dye-terminator sequencing. DHPLC has been developed to screen for DNA variations by separating heteroduplex and homoduplex DNA fragments by ion-pair reverse-phase liquid chromatography [22]. We used this technique to study eight male patients and one female carrier relative of one of the eight males. Since this technique separates heteroduplexes and homoduplexes, it can also be used to analyze manifesting and carrier females with point mutations.

Results
We started the project with the intention of designing primers to amplify and sequence all of the exons of the dystrophin gene as well as the 3'-UTR and 5'-UTR. In order to ensure detection of splice site as well as exon sequence alterations, genomic sequence surrounding each of the sequence-known fragments of dystrophin was mined from the NIH GenBank database. Primers were designed (see Methods) that allowed for amplification of a product with 30-100 bases on either side of each exon. To look for point mutations, eight patients were chosen as test cases based on the absence of dystrophin, clinical symptoms consistent with DMD and no large deletion or duplication detected by the multiplexed PCR test. In addition, one sister thought to be a carrier female was also tested as described in the Methods and summarized in Table 1.
Initially, the primers were used to prepare amplicons of dystrophin exons for direct sequencing. Likely disease-causative mutations were found in four of the patients and confirmed in the female relative of patient 01 by direct sequencing of exons 2 through 55 and the three 5'-UTR fragments, the last of which includes the dp427m promoter and exon 1. These were sequenced prior to our obtaining the DHPLC system for screening. When a likely disease-causative mutation was found in each of these patients, no further fragments were sequenced for that patient. Patient 01 and his carrier sister were shown to have a C->T substitution in exon 39 at bp 5762, predicting a nonsense mutation Gln->Stop, leading to a likely disease-causative truncation of the protein (Table 2 and Figure 1). Patient 02 has an 11 bp deletion in exon 41 and patient 03 is missing exon 21. The exon 21 deletion was not recognized until the end of the project and had not been detected by the multiplexed PCR test because exon 21 is not one of the exons included in that test. These two mutations both create out-of-frame transcripts leading to eventual stop codons and premature truncation of the protein. Patient 07 has a C->T substitution in exon 23 at bp 3359 predicting a nonsense mutation Arg->Stop, leading to a likely disease-causative truncation of the protein. Patient 04 has a G->T substitution at base 738+1 of the intron 6 splice site. This donor splice site position is normally a G; among 1805 sites examined in the human genome, it is a T in only 8 [23]. The predicted aberrant splicing of exon 7 makes this a likely causative mutation in this patient.
The DHPLC system was obtained halfway into the study, and four patients' exons 56 through 78 and the 3'-UTR including exon 79 were analyzed on the DHPLC system prior to sequence analysis. In the DHPLC system, fragments are screened for variation in retention time or chromatogram shape from that of an unaffected control amplicon. Fragments exhibiting variation are then sequenced for confirmation and analysis. Two variations were detected by DHPLC, both in exon 59, for patients 05 and 06 (Figure 2a; only patient 06 shown). These two fragments were sequenced and a non-disease-causative Gln->Arg missense mutation at bp 9018 was found in both patients. As we found no likely disease-causative mutations under our first-pass DHPLC conditions, we sequenced all four patients' exons 56-78 and the 7 fragments of the 3'-UTR (the first of which contains exon 79). Only two of the 116 fragments (4 patients x 29 fragments each) sequenced showed alterations that were not detected at the first-pass temperatures chosen for DHPLC analysis. The first is a polymorphism, 9857+15 C->T in intron 66, in patient 06. The second is a probable novel disease-causative mutation, 10015+5 G->A in intron 67, within the splice site consensus sequence in patient 08. Based on 1788 donor sites from the human genome, 128 (7.16%) were adenosine while 1468 (82.1%) were guanosine at this position in the splice sequence, making it likely that exon 68 will be aberrantly spliced [23]. When these two fragments were run on the DHPLC system at the lower second-pass temperatures indicated in Additional file A: Primer sequences and DHPLC temperatures Excel spreadsheet, the alterations were detected (Figure 2c and 2b). In addition, the four previously identified mutations were confirmed on the DHPLC system, as was the carrier status of patient 01S (Figure 3).
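The donor-site frequencies quoted above follow directly from the counts reported in [23]. As a minimal sketch, the percentages can be recomputed as shown here; the function and variable names are illustrative and not part of the study.

```python
def percent(count: int, total: int) -> float:
    """Return count expressed as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Counts quoted in the text for donor-site position +5 (1788 sites) [23].
POSITION_5_TOTAL = 1788
a_pct = percent(128, POSITION_5_TOTAL)   # adenosine at +5
g_pct = percent(1468, POSITION_5_TOTAL)  # guanosine at +5

# Counts quoted in the text for donor-site position +1 (1805 sites) [23].
t_pct_pos1 = percent(8, 1805)            # thymidine at +1

print(f"+5 A: {a_pct:.2f}%  +5 G: {g_pct:.1f}%  +1 T: {t_pct_pos1:.2f}%")
```

Both observed substitutions (G->A at +5 in patient 08 and G->T at +1 in patient 04) replace the common base with one that is rare at that position, which is the basis for predicting aberrant splicing.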
In total, likely disease-causative mutations were found in six of the eight male patients and confirmed in the carrier sister (patient 01S) of one of the six patients (patient 01).
In addition, one silent and five neutral polymorphisms, and six intronic alterations that have previously been shown to be present in unaffected individuals were found. One silent polymorphism and eight previously unreported intronic alterations were also found. In two patients, 05 and 06, we did not find likely disease-causative mutations despite sequencing all 86 fragments.

Discussion
We used a combination of direct sequencing and DHPLC to search for mutations in 8 patients with DMD and a carrier sister. Likely causative mutations were found in 6 of 8 males and confirmed in the carrier sister of one. In our study, one likely causative mutation (deletion of exon 21 in patient 03) was found by PCR analysis alone, and 5 likely causative mutations were first detected by direct sequencing, but each was subsequently detected by DHPLC. Our experience indicates that the multiple different conditions of DHPLC would have detected all of these likely causative mutations and all polymorphisms. Direct sequencing of all the fragments is still more costly than DHPLC, and our results indicate that initial analysis using the far cheaper DHPLC should precede sequencing, thus reducing the cost of the analysis. Causative mutations were not found in two of the eight patients. Due to the enormous size of the dystrophin gene (2.4 million base pairs), finding 100% of mutations is improbable using these fragments because we cannot examine all the sequences and situations that might affect expression. It is possible, but unlikely, that one of the currently uncharacterized alterations that we found in patients 05 and 06 will eventually be proven a causative mutation rather than a polymorphism.
Two of the patients in whom we found disease-causative mutations have affected brothers; three, in addition to patient 01, have potential carrier sisters; and all six have mothers who could also be carriers. We have shown that 1) using the DHPLC system, it is now not only feasible but actually a simple process to determine whether any of these relatives have the same point mutation as their brother/son, and 2) carrier testing and prenatal diagnostic testing are now available to any of these patients and relatives who wish to use them.
As larger cohorts of patients are tested, a more accurate estimate of the percentage of undetectable causative mutations will emerge and present a clear new challenge. We estimate, based on our small sample and other unpublished data, that such cases will comprise a small percentage (less than 8 percent) of total DMD cases.
Possible explanations for such cases include duplications that we may have missed, mutations in unknown enhancers or translation modifiers hidden in exons or introns, mutations that create novel splice sites, and changes in the coding region that might be pathogenic but that, due to our lack of knowledge, are thought to be polymorphic. An example would be a change to an amino acid that is essential for some protein/protein interaction (potentially transportation), or that is modified/processed at the protein level, but that we currently assume to be merely a polymorphic change. Other possible mechanisms include mosaicism, in which DNA extracted from blood lymphocytes has a different sequence than DNA extracted from muscle cells, and cryptic chromosomal rearrangements. These will require dedicated efforts either to resolve them individually, case by case, or to develop new, more comprehensive routine tests, including RNA and protein analysis. Fortunately, there are other methods for detection of point mutations that can be compared to, or used in addition to, the method presented here, including PTT [15], DGGE [18], and DOVAM-S [19]. More information on DGGE analysis of the DMD gene can be found on the Leiden Muscular Dystrophy web site [http://www.dmd.nl/].
Clinical laboratories planning to begin testing for point as well as large mutations must clearly evaluate possible technologies in at least three areas: effectiveness, convenience and cost.
DHPLC followed by sequencing improves the effectiveness of mutation detection from 65%, using the existing multiplexed PCR technology that detects large mutations only, to approximately 92% by including detection of approximately 75% of point mutations as well. Clinical laboratories that screen cohorts of patients using the technique presented here will produce three important outcomes. The first will be a more accurate measure of the effectiveness of DHPLC screening for sequence variation followed by direct sequencing. The second will be improvements in the conditions for mutation detection using DHPLC. As new mutations are discovered, they could be entered into the database along with any suggested improvements to the DHPLC conditions for a given fragment/alteration. The third will be a collection of DNA, RNA and tissue from patients for whom all 86 fragments were sequenced with no likely disease-causing mutations detected. This collection will prove extremely useful for further investigations into the more subtle causes of dystrophin-deficient muscular dystrophy. Ultimately, this will provide more comprehensive procedures for the detection of mutations in dystrophin-absent patients.
The convenience of DHPLC followed by sequencing is readily apparent. It requires neither radioisotopes nor ethidium bromide gels when combined with a core sequencing facility or capillary sequencer (see Additional file B for details).
We found that the cost of reagents for DHPLC screening followed by direct sequencing was only moderately higher than the cost for the existing multiplexed PCR test. The reagent cost of the existing multiplexed PCR diagnostic is estimated at $25.00 per patient. Increasing the percentage of mutations detected in patients from the current 65% to approximately 92% by including point mutations would come at the moderate increase in reagent costs of approximately $57.75 per patient plus a moderate increase in other costs such as consumables and technician time. Although the initial investment in a DHPLC system is not minimal at approximately $80,000, the cost can be amortized over many patients and the DHPLC system can be used in the molecular diagnosis of many other diseases in addition to DMD. The same, of course, is true for the purchase of an automated sequencer. Details of the reagent cost per patient calculations are attached as Additional file A. Briefly, we assumed that likely disease-causative mutations would be found, on average, within approximately 43 fragments. We calculated the cost of reagents per 50 µl PCR and the cost of reagents to run a sample on the DHPLC system and multiplied by the number of samples required to screen a patient. We estimated that there will be four fragments per patient that require sequencing. We calculated the reagent costs to amplify, purify and sequence these four fragments per patient in both directions. We then combined the cost of the existing multiplex test for 100% of patients with the cost of DHPLC screening followed by direct sequencing for 35% of patients to arrive at an average cost per patient and the increase over the average cost per patient for the existing multiplexed PCR test alone.
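The averaging step described above can be sketched as a short calculation. The $25.00 multiplex cost, the 35% fraction and the $57.75 increase are taken from the text; the reagent cost for one fully screened patient is not stated in the text and is back-calculated here from those figures purely for illustration (the itemized values are in Additional file A). All names are hypothetical.

```python
MULTIPLEX_COST = 25.00     # reagent cost of multiplexed PCR per patient (from the text)
FRACTION_SCREENED = 0.35   # patients negative on multiplexed PCR (from the text)
STATED_INCREASE = 57.75    # average reagent cost increase per patient (from the text)

# ASSUMPTION: not stated in the text; implied by the two figures above.
implied_screen_cost = STATED_INCREASE / FRACTION_SCREENED


def average_cost_per_patient(multiplex: float, screen: float, fraction: float) -> float:
    """Every patient gets the multiplex test; only `fraction` go on to
    DHPLC screening followed by direct sequencing."""
    return multiplex + fraction * screen


avg = average_cost_per_patient(MULTIPLEX_COST, implied_screen_cost, FRACTION_SCREENED)
increase = avg - MULTIPLEX_COST

print(f"Implied screening cost per screened patient: ${implied_screen_cost:.2f}")
print(f"Average reagent cost per patient: ${avg:.2f} (increase ${increase:.2f})")
```

Under these assumptions the point-mutation workup costs roughly $165 in reagents for each patient who needs it, but because only about a third of patients proceed past the multiplex test, the average increase across all patients is much smaller.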

Conclusion
Point mutations can be found in both DMD/BMD male patients and asymptomatic or manifesting DMD/BMD carrier females via an effective and convenient process using automated DHPLC screening for variation and direct sequencing for confirmation and analysis at a moderate increase in cost per patient. We have shown that it is now feasible for clinical laboratories to begin testing for both point mutations and large mutations in the dystrophin gene using this or one of the other available methods for mutation detection. The detection rate for all mutations in the DMD gene can be increased from the present 65% to above 92%.

Patient materials
After reviewing medical records, we studied eight male patients who had been diagnosed as having DMD based on pathology lab report, disease progression and abnormalities in the level of dystrophin expression by either immunofluorescence or western blot analysis. In each patient, no large deletion or duplication was detected using the current multiplexed primer set. This current set consists of primers to amplify the dp427m promoter including exon 1 and exons 3, 4, 6, 8, 12, 13, 17, 19, 43, 44, 45, 47, 48, 49, 50, 51, 52 and 60. Although none of the patients had a family history of DMD, all were determined to be highly likely to have dystrophin gene mutations, as opposed to other genetic causes of muscular dystrophy (see Table 1).
Genomic DNA was prepared from whole blood using the Puregene kit (Gentra Systems, Minneapolis, Mn.). DNA concentration was determined by spectrophotometry. This study was approved by the Children's Hospital institutional review board, and informed consent was obtained from all adult participating subjects and from parents or legal guardians of participating minors.

Primer/fragment design
Table 3) contain most of the sequence obtained. Many compromises are involved in selection of primer sequences. The primers listed in Additional file A: Primer sequences and DHPLC temperatures Excel spreadsheet were designed using the Primer3 (Whitehead Institute, Cambridge, Ma.) and OLIGO 6.0 (Molecular Biology Insights, Inc., Cascade, Co.) software packages with four goals in mind. These were:
a) to include 30-100 bases of intronic sequence adjacent to each exon at both the 5' and 3' ends of the exon;
b) to create a single, visible, specific band of reasonable intensity when analyzed on agarose gel;
c) to create fragments that have melting characteristics appropriate for cost-effective DNA variation screening by DHPLC analysis. Appropriate melting characteristics are such that variations from unaffected sequence in any portion of the fragment will be detected using ideally only one running temperature but no more than three (see Additional file A: Primer sequences and DHPLC temperatures Excel spreadsheet);
d) to provide clean sequence in both directions without resorting to additional primers designed simply for sequencing the amplicon.
All these goals were met by the primer set presented here. The primer set and the sequence of the fragments as well as additional intron sequence are available on the Leiden Muscular Dystrophy pages [http://www.dmd.nl/DMD_DHPLC.html]. Obviously, many of these primers could be grouped into sets for multiplex PCR in order to initially screen for large mutations using the same primers presented here. Multiplexed PCR could be run on the DHPLC system in a size detection mode for detection of large deletion/duplication alterations in males. A page or process could be provided within the Leiden Muscular Dystrophy pages for posting of suggested improvements to DHPLC conditions for detection of specific alterations and for suggested primer design improvements. Quantitative PCR, FISH (Fluorescent In-Situ Hybridization) or MAPH (see [http://www.dmd.nl/DMD_MAPH.html]) analysis is still required for detection of female carriers of large deletion/duplication alterations. Quantitative PCR can be performed on the DHPLC system, the ABI 7700 Sequence Detector, the Bio-Rad iCycler or by standard densitometric procedures.
PCR cycling conditions consisted of an initial denaturation step at 95°C for 15 min, followed by 35 cycles of 94°C for 5 s, the specified annealing temperature (indicated in Additional file A: Primer sequences and DHPLC temperatures Excel spreadsheet) for 15 s and 72°C for 30 s, and ending with a final elongation step at 72°C for 3 min.
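For planning instrument time, the cycling program above can be written down as data and its total programmed hold time summed. This is a minimal sketch: the class and variable names are hypothetical, the annealing temperature is left unset because it is specific to each primer pair (Additional file A), and ramp times between temperatures, which depend on the thermocycler, are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: Optional[float]  # None = primer-pair specific (see Additional file A)
    seconds: int


INITIAL = Step("initial denaturation", 95.0, 15 * 60)
CYCLE = [
    Step("denaturation", 94.0, 5),
    Step("annealing", None, 15),
    Step("extension", 72.0, 30),
]
FINAL = Step("final elongation", 72.0, 3 * 60)
N_CYCLES = 35


def total_hold_seconds() -> int:
    """Sum of programmed hold times only; ramping is NOT included."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + N_CYCLES * per_cycle + FINAL.seconds


minutes, seconds = divmod(total_hold_seconds(), 60)
print(f"Programmed hold time: {minutes} min {seconds} s")
```

The actual run time will be longer than this figure because of temperature ramping.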

WAVE® system DHPLC analysis
The DHPLC instrument we used is the WAVE® system (Transgenomic Inc., Omaha, Ne.). Unpurified PCR amplicons from patients were mixed in a 1:1 ratio with an aliquot of unpurified PCR amplicon from an unaffected male. The unaffected male amplicon and the mix were heated to 95°C for 5 min and cooled slowly over 45 min to 25°C in a thermocycler. The unaffected and mixed reactions were then run at pre-determined temperatures (see Additional file A: Primer sequences and DHPLC temperatures Excel spreadsheet) on the WAVE® system and the resultant chromatograms compared for variation in shape or retention time. Figure 4a depicts typical chromatograms for samples in which no variation from unaffected control is detectable. Figures 4b and 4c show the mutation and size standards. We recommend that the first five injections of each run be a blank (0 volume), the size standard, a blank, the mutation standard and a blank. The retention time and peak/trough height of the standards should be compared to those obtained the very first time the standards were run (at installation of the system) to ascertain the separation performance of the column. If they vary by more than 10%, the run should be aborted and the column cleaned or other preventative/curative steps taken to return the system to operation within the specifications for the standards as stated in the system's manual.
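The 10% acceptance rule for the standards can be expressed as a simple check. The sketch below is illustrative only: the function names, the dictionary keys and the numerical values are hypothetical and are not taken from the WAVE® software or from the study data. It assumes that "vary by more than 10%" means a relative deviation from the installation baseline.

```python
# Recommended order of the first five injections of each run (from the text).
INJECTION_ORDER = ["blank", "size standard", "blank", "mutation standard", "blank"]


def relative_deviation(observed: float, baseline: float) -> float:
    """Absolute deviation of observed from baseline, as a fraction of baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return abs(observed - baseline) / abs(baseline)


def run_should_abort(observed: dict, baseline: dict, tolerance: float = 0.10) -> bool:
    """True if any standard metric deviates from its installation baseline
    by more than `tolerance` (10% by default)."""
    return any(
        relative_deviation(observed[key], baseline[key]) > tolerance
        for key in baseline
    )


# Hypothetical values for one standard, for illustration only.
baseline = {"retention_time_min": 5.0, "peak_height_mv": 10.0}
drifted = {"retention_time_min": 5.2, "peak_height_mv": 8.5}   # height off by 15%
in_spec = {"retention_time_min": 5.2, "peak_height_mv": 9.5}   # both within 10%

print(run_should_abort(drifted, baseline))
print(run_should_abort(in_spec, baseline))
```

Keeping the installation values as a fixed baseline, rather than comparing with the previous run, avoids slow drift going unnoticed.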