The human L-threonine 3-dehydrogenase gene is an expressed pseudogene

Background: L-threonine is an indispensable amino acid. One of the major L-threonine degradation pathways is the conversion of L-threonine via 2-amino-3-ketobutyrate to glycine. L-threonine dehydrogenase (EC 1.1.1.103) is the first enzyme in the pathway and catalyses the reaction: L-threonine + NAD+ = 2-amino-3-ketobutyrate + NADH. The murine and porcine L-threonine dehydrogenase genes (TDH) have been identified previously, but the human gene has not been identified.

Results: The human TDH gene is located at 8p23-22 and has 8 exons spanning 10 kb that would be expected to encode a 369-residue ORF. However, 2 TDH cDNA transcripts encode truncated proteins of 157 and 230 residues. These truncated proteins are the result of 3 mutations within the gene. An A to G SNP, present in the genomic DNA sequence of some individuals, results in the loss of the acceptor splice site preceding exon 4. The acceptor splice site preceding exon 6 was lost in all 23 individuals genotyped, and there is an in-frame stop codon in exon 6 (CGA to TGA) resulting in arginine-214 being replaced by a stop codon. These truncated proteins would be non-functional since they have lost part of the NAD+ binding motif and the COOH-terminal domain that is thought to be involved in binding L-threonine. TDH mRNA was present in all tissues examined.

Conclusions: The human L-threonine 3-dehydrogenase gene is an expressed pseudogene, having lost the splice acceptor site preceding exon 6 and having codon arginine-214 (CGA) mutated to a stop codon (TGA).


Background
Liver failure is a cause of considerable mortality; therefore bioartificial livers may offer significant therapeutic benefit. Porcine-derived hepatocytes are being used in clinical studies of bioartificial livers [1,2]. There may, however, be significant differences in the activities of liver enzymes between species [3]. These differences are also important considerations when the pharmaceutical industry conducts new drug metabolism and pharmacokinetic studies on key mammal species [4].
The liver plays a critical role in regulating the circulating concentrations of amino acids. The regulation of amino acid supply to bioartificial organs and maintaining the activity of the amino acid-metabolising enzymes will be important in their development. Active maintenance of the optimal amino acid concentrations will offer the possibility of prolonging the differentiated function of hepatocytes in bioartificial livers [5].
L-threonine is one of three indispensable amino acids since mammals do not possess the necessary enzymes for the transamination of threonine [6]. However, the transient apoenzyme form of L-threonine dehydratase bound to pyridoxamine 5'-phosphate dehydratase can carry out a half-transamination of L-threonine [7]. Once L-threonine is oxidised through various catabolic pathways it is lost for the purposes of protein synthesis. Therefore, it is important for determining levels of human nutrition that factors regulating threonine oxidation are identified. The international recommended intake of L-threonine for adults is 7 mg·kg⁻¹·d⁻¹ [8]. However, this has recently been challenged as being too low and a recommendation of 15 mg·kg⁻¹·d⁻¹ has been suggested [9].
In mammals, the removal of the 1-carbon of threonine is thought to occur through threonine dehydrogenase, this carbon being incorporated into glycine and released as CO2 by the mitochondrial glycine cleavage system. In normally-fed pigs and rats 80 to 87% of L-threonine catabolism occurs via threonine dehydrogenase [15,16] and in isolated rat and cat hepatocytes 35% to 50% of threonine oxidation occurs through this pathway [17,18]. Zhao et al. measured threonine oxidation in adult humans by the production of labelled CO2 from L-[1-13C]threonine and concluded that the conversion of threonine to glycine through the threonine dehydrogenase pathway was not important if it exists at all [19]. However, the dehydrogenase pathway is thought to account for 44% of total threonine oxidation in infants [20]. More recently, the same group used a more sensitive method to measure the production of labelled glycine and CO2 in adults and suggested that there was some threonine catabolism via the threonine dehydrogenase pathway, but it accounted for only 7-11% of total [21]. This suggested that the L-serine/threonine dehydratase pathway is the dominant route in the catabolism of threonine in humans.
Recently, the sequences of the murine and porcine TDH cDNAs have been reported [22]. The amino-terminal regions of these proteins have characteristics of a mitochondrial targeting sequence and are related to the UDP-galactose 4-epimerases, with both enzyme families having an amino-terminal NAD+ binding domain. The sequence of the human KBL transcript, the second enzyme in the pathway, has also been described [23] and this study aimed to identify the human TDH gene.

Human TDH gene
The murine L-threonine 3-dehydrogenase gene (TDH) has recently been identified [22]. The human genome was searched using this murine sequence and only a single putative human TDH gene was found. The putative human TDH gene is located on clone RP11-110L10, chromosome 8p23-22 (GenBank accession No. AC011959, Whitehead Institute/MIT Center for Genome Research, USA), between the gene for acyl-malonyl condensing enzyme and the gene for hypothetical protein C8orf13 (accession No. XP_088377) (Figure 1A). By comparison with the murine TDH gene and the porcine cDNA, 8 human exons were identified spanning 10 kb (Figure 1B). The predicted start of the ORF is in good sequence context for the initiation of translation [24].
The sequence of the human TDH exons (Fig. 2) has 84% identity with the porcine cDNA at the nucleotide level in the ORF and the theoretical full-length human protein has 85% identity and 97% similarity with the porcine TDH protein [22], suggesting that this is the human TDH gene. The first potential polyadenylation signal site (ATTAAA) is also conserved between species.
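The identity figures quoted above come from pairwise sequence comparison. As a minimal sketch of how percent identity can be computed from two already-aligned sequences, the following Python function counts matching columns and ignores columns containing a gap. The function name `percent_identity`, the choice of denominator (gap-free columns only) and the toy sequences are assumptions made for illustration; they are not the alignment software or the sequences used in this study.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are excluded from the
    comparison. Other definitions of identity use different denominators.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns in which both sequences have a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Toy sequences (hypothetical, not TDH): one mismatch in ten columns.
print(percent_identity("ATGCGATTAC", "ATGCGTTTAC"))
```

With real data, the two strings would be the human and porcine ORF rows taken from an alignment produced by a dedicated alignment program.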

Human TDH predicted secondary structure
Using the translation of the human genomic sequence (with the stop codon replaced with the conserved arginine residue found in other TDH proteins) the predicted secondary structure of the likely "ancestral" TDH protein was determined using Psi-Pred [30] and aligned with the crystal structure of GALE from E. coli [29,31] (Fig. 4). The human ancestral TDH protein would have been a mixed alpha-helix/beta-sheet protein with an NAD+ binding Rossmann fold, belonging to the tyrosine-dependent oxidoreductase protein family (also known as short-chain dehydrogenases). The characteristic Tyr-x-x-x-Lys couple (residues 191 and 195) found in all family members is important for catalysis, with the conserved tyrosine serving as the active-site base [33]. Two domains were identified, with the larger amino-terminal domain (residues 53-234) having an NAD+ binding motif and the smaller carboxy-terminal domain (residues 218-329) likely to be involved in L-threonine binding. It is likely that the substrate is located in the cleft between the two domains.

Mutations in the human TDH gene
However, the sequence of genomic clone RP11-110L10 carries 3 mutations that would disrupt RNA splicing and translation of the gene (Fig. 2). The acceptor splice sites preceding exons 4 and 6 are lost, and there is an in-frame stop codon in exon 6 (TGA in place of the expected CGA), resulting in arginine-214 being replaced by a stop codon. Together, these mutations suggested that the human TDH gene is a pseudogene.
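The effect of the CGA to TGA change can be illustrated with a short translation sketch in Python. The codon table below is a deliberately small subset of the standard genetic code, and the 9-nucleotide sequence is a hypothetical toy example rather than TDH sequence; the helper names `translate` and `point_mutate` are likewise illustrative.

```python
# Subset of the standard genetic code, sufficient for this toy example.
CODON_TABLE = {
    "ATG": "M",                                       # methionine
    "CGA": "R", "CGC": "R", "CGG": "R", "CGT": "R",   # arginine
    "TGA": "*", "TAA": "*", "TAG": "*",               # stop codons
}


def translate(seq):
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def point_mutate(seq, position, base):
    """Return seq with the base at a 0-based position replaced."""
    return seq[:position] + base + seq[position + 1:]


ancestral = "ATGCGAATG"                    # hypothetical: Met-Arg-Met
mutant = point_mutate(ancestral, 3, "T")   # CGA -> TGA

print(translate(ancestral))  # full-length toy peptide
print(translate(mutant))     # truncated at the new stop codon
```

A single C to T transition at the first position of the arginine codon is enough to terminate translation, so every residue downstream of the mutated codon is lost from the product.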
To determine whether these mutations were present in other species, sequence alignments around the acceptor splice sites preceding exons 4 and 6 were constructed (Fig. 5). The genomic sequences from the mouse and puffer fish (Takifugu rubripes or Fugu) and another human genomic clone (AF131216, Institute of Molecular Biotechnology, Germany) have intact acceptor splice sites preceding exon 4, suggesting that the loss of this splice site is a single nucleotide polymorphism (SNP) (Fig. 5A). In both human genomic DNA sequences the acceptor splice site preceding exon 6 is lost, but it is intact in both the mouse and puffer fish sequences (Fig. 5B). Similarly, the stop codon is present in both human genomic DNA sequences, but it is not present in genomic DNA, cDNA and EST sequences from other species. In the human genomic DNA sequence (AC011959) there is also an A to T mutation in codon 201 (ATG to TTG, in exon 5) resulting in a methionine to leucine substitution.

Genotyping of individuals by restriction enzyme digests and DNA sequencing
To determine whether these mutations in the 2 human genomic sequences were sequencing errors or just present in those individuals utilised for determining the human genome, the region encompassing the mutations was amplified by PCR from a number of individuals, sequenced and digested with restriction enzymes. Amplicons from the exon 4 splice acceptor site were digested with the restriction enzyme AciI. If the acceptor splice site was not present (G^CGG rather than GCAG, acceptor splice site underlined) the 158 bp amplicon was digested to produce restriction fragments of 104 and 54 bp. The restriction enzyme digest pattern and DNA sequence of 3 individuals are illustrated (Fig. 6) in which one individual has the splice site intact, another has lost the splice site and the third has the heterozygous condition. Digestion is not as complete as would be expected in the heterozygous condition, presumably due to heteroduplex formation during the PCR. This SNP was found in 20 out of 40 chromosomes examined.
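The logic of the AciI genotyping assay can be sketched as an in-silico digest in Python. The amplicon below is a hypothetical 158 bp filler sequence built only to reproduce the fragment sizes given in the text; the position and orientation of the site within the real amplicon, and the helper names `digest` and `make_amplicon`, are assumptions. Only the strand as written is searched, which is sufficient for this illustration.

```python
def digest(seq, site, cut_offset):
    """Fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases of the recognition site that remain
    on the left-hand fragment (1 for G^CGG).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def make_amplicon(core):
    """Hypothetical 158 bp amplicon with a 4 bp core flanked by filler."""
    return "A" * 53 + core + "A" * 101


intact_allele = make_amplicon("GCAG")  # acceptor splice site present
lost_allele = make_amplicon("GCGG")    # A to G SNP creates the AciI site

intact_fragments = digest(intact_allele, "GCGG", 1)
lost_fragments = digest(lost_allele, "GCGG", 1)
# A heterozygote carries both alleles, so all three bands are expected.
heterozygote_bands = sorted(set(intact_fragments + lost_fragments),
                            reverse=True)

print(intact_fragments, sorted(lost_fragments, reverse=True),
      heterozygote_bands)
```

The same function would model the Hsp92II assay by passing the recognition sequence CATG with a cut offset of 4.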
Amplicons from the exon 6 splice acceptor site from 3 individuals were digested with the restriction enzyme Hsp92II. If the acceptor splice site was not present (CATG^g rather than CATAg) the 207 bp amplicon was digested to produce restriction fragments of 112 and 95 bp. Three individuals examined had lost the splice site

Figure 4
Predicted secondary structure of human "ancestral" TDH protein was determined using the Psi-Pred program and was aligned with that of the crystal structure of uridine diphosphogalactose-4-epimerase protein (GALE) from Escherichia coli [29,31] using 3D-PSSM. The labels are: human "ancestral" TDH protein predicted secondary structure, HsTDH_PSSM; human "ancestral" TDH protein sequence, HsTDH_seq; E. coli GALE protein sequence, EcGale_seq; E. coli GALE protein secondary structure, 1udc_SS; alpha-helix, H, highlighted in light blue; beta-sheet, E, highlighted in yellow; c = turn, coil or loop. Identical residues are shown and additionally, a ":" indicates positive equivalence.

Copy DNA sequences of the human TDH gene
The human TDH gene is expressed. PCR amplification from cDNA libraries using primers designed to amplify the TDH ORF was carried out and amplicons of approximately 1000 bp were obtained from cDNAs from 2 individuals, cloned and sequenced. The 1019 bp sequence of clone 1 (AY101186) skipped exon 6, resulting in a premature stop codon in exon 7 (Fig. 8A), and would encode a 230 residue ORF (Fig. 8B). The 935 bp sequence of clone 2 (AY101187) skipped exon 4 and utilised a cryptic splice site in exon 6, 24 bp downstream of the expected site (Fig. 5B), resulting in a premature stop codon in exon 6 (Fig. 8A), and would encode a 157 residue ORF (Fig. 8B). The human TDH gene is therefore a pseudogene since all individuals examined contain at least 2 mutations that on translation would generate truncated proteins that would be non-functional since they would be unable to make appropriate contacts with the substrates, L-threonine and NAD+.
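Whether a skipped exon shifts the reading frame depends only on whether its length is a multiple of 3. The Python sketch below makes that check explicit. It assumes, for illustration only, that exon 4 corresponds to the deleted residues 96-146 mentioned in the Discussion; the 100 bp exon is a hypothetical length chosen to show a frame-shift and is not the measured length of any TDH exon.

```python
def residues_to_bp(first_residue, last_residue):
    """Coding length in bp of an inclusive range of residues."""
    return (last_residue - first_residue + 1) * 3


def skip_preserves_frame(exon_length_bp):
    """True if removing an exon of this length keeps downstream codons in frame."""
    return exon_length_bp % 3 == 0


# Assumption: exon 4 encodes residues 96-146 (51 residues, 153 bp).
exon4_bp = residues_to_bp(96, 146)
print(exon4_bp, skip_preserves_frame(exon4_bp))

# Hypothetical exon length that is not a multiple of 3: skipping it
# shifts the frame, so a premature stop codon is likely downstream.
print(skip_preserves_frame(100))
```

This is consistent with the Figure 8 legend, which notes that skipping exon 4 does not alter the reading frame whereas skipping exon 6 produces a frame-shift.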

Discussion
A search of the human genomic sequences for the TDH gene identified only one candidate location on chromosome 8p23-22. This is the expected site of the human TDH gene since there are a number of genes that neighbour the TDH genes in the genomes of human, mouse and puffer fish that are found in common in this locus. They are the myotubularin related protein 8, the B-lymphoid tyrosine kinase and the hypothetical protein C8orf13. There are differences in the gene order and orientation in the human and puffer fish TDH loci indicating that there have been re-arrangements in this locus since the divergence of the human and puffer fish lineages [34].
The human TDH gene contains 3 mutations that disrupt the gene transcription and translation. There is a loss of the acceptor splice sites preceding exons 4 and 6 and an in-frame stop codon in exon 6. A database search of expressed human sequences identified other TDH cDNAs and ESTs similar to the 2 cDNA sequences described. The sequence of the human cerebellum full-length cDNA clone FLJ25033 (AK057762, NEDO human cDNA sequencing project University of Tokyo, Japan) has 2 5'UTR exons, the first of which corresponds to exon -1 on the mouse genomic DNA sequence [22]. This clone skips both exons 4 and 6, and includes the polyadenylation site. The ORF would encode a truncated 180 residue protein.
Three very similar human ESTs (AI005002, AI243637 and AI809781) have exon 4 skipped, use the cryptic acceptor splice site in exon 6 that was found in clone 2 and possess the in-frame stop codon.
The human TDH gene is a pseudogene because in all individuals examined the gene contains 1 or more mutations that disrupt RNA splicing, giving rise to a variety of mRNAs in different individuals. In both pig and mouse TDH cDNAs there was no evidence of alternative splicing [22]. In humans, arginine-214 is replaced by a stop codon. On translation the human mRNAs would generate a variety of truncated proteins ranging from 157 to 230 residues, some of which would also contain deletions of residues 96-146, whereas the size of the core of the TDH enzyme (residues 48-364) differs by only 2 residues in other species, including some bacterial TDH proteins [22]. The human truncated proteins would be non-functional since they would be unable to make appropriate contacts with the substrates, L-threonine and NAD+. They have lost most of the carboxy-terminal domain, which by homology with GALE [26-29] would be expected to bind L-threonine, and have also lost various regions of the NAD+ binding motif that extends from exon 1 to exon 6. In comparison, even single point missense mutations in the human GALE gene disrupt protein function, resulting in epimerase-deficiency galactosemia [35].
At present, no known human genetic disease has been associated with defects in the threonine dehydrogenase and 2-amino-3-ketobutyrate coenzyme A ligase biochemical pathway. This may be because the enzymatic activity of L-serine/threonine dehydratase is sufficient to metabolise L-threonine in humans. The mitochondrial threonine dehydrogenase enzyme is thought to act in the maintenance of free somatic threonine concentration derived from dietary threonine [15], suggesting that humans may not be able to regulate the level of circulating threonine as well as other mammals. In vertebrates, L-threonine is degraded by two major enzymatic pathways, and attempts to determine their relative contributions in humans have relied on indirect methods. Zhao et al. measured threonine oxidation by the production of labelled CO2 in adult humans from labelled threonine and concluded that the conversion of threonine to glycine through the threonine dehydrogenase pathway was not important if it exists at all [19].
More recently, Darling et al. used a more sensitive method to measure the production of labelled glycine and CO2 and suggested that there was some threonine catabolism via the threonine dehydrogenase pathway, but it accounted for only 7-11% of total [21]. The absence of TDH enzymatic activity in human tissues such as liver remains to be demonstrated directly. In contrast, in normally-fed pigs and rats 80 to 87% of L-threonine catabolism occurs via threonine dehydrogenase [15,16].
Since the TDH gene is not functioning in humans, what enzyme(s) could be responsible for the small percentage of threonine catabolism to glycine? In bacteria, threonine aldolase also yields glycine, but mammals are thought to lack the "genuine" threonine aldolase [36]. However, serine hydroxymethyltransferase exhibits a low threonine aldolase activity in mammalian liver [36].
Whether the first debilitating mutation was the loss of the splice acceptor site preceding exon 6 or the arginine-214 to stop codon mutation is unknown, nor is it known when in human evolution the functional TDH gene was lost. However, the arginine codon CGA is used in only 11.1% of arginine residues in humans and is prone to mutate to a stop codon. The TDH gene has been conserved throughout evolution, being found in bacteria and other mammals [22], so why has it not been conserved in man? How could such a mutation establish itself in the ancestral population? It is likely that the mutation conferred some selectable advantage, under the prevailing environmental conditions, on those individuals who carried it, and arose at a time when the ancestral population was small. Under conditions of protein starvation, threonine dietary intake could have been a limiting factor on growth, survival and successful reproduction, and a reduction in the rate of threonine catabolism would have conferred a selective advantage on those individuals with a defective TDH gene. Additionally, if we consider that the threonine catabolism pathway is also a glycine synthesis pathway, then a reduction in the production of glycine in inhibitory glycinergic neurons [37] may have contributed to human neural evolution.
Since humans lack a functioning threonine dehydrogenase enzyme and human parasites such as trypanosomes do possess one (Edgar and Horn, unpublished results, GenBank AF529241), the parasite threonine dehydrogenase enzyme is a potential target for therapeutic intervention. Indeed, in trypanosomes such as Trypanosoma brucei, which causes sleeping sickness, L-threonine dehydrogenase is an important metabolic enzyme and inhibition of this enzyme by a wide range of sulphydryl reagents, such as tetraethylthiuram disulphide, leads to a loss of trypanosome viability [38-40].
Loss of other metabolically important genes in man results in disease susceptibility. Man and primates are scurvy-prone, having lost the L-gulono-gamma-lactone oxidase gene that is found in most other eukaryotes, and are unable to synthesize L-ascorbic acid [41]. Man and primates have also lost the urate oxidase enzyme that catalyses the conversion of uric acid to allantoin. This results in a high concentration of uric acid in the blood, predisposing man to hyperuricemia that can lead to gouty arthritis and renal stones [42]. Humans have recently lost 2 functional genes involved in sialic acid function. They are the CMP-N-acetylneuraminic acid hydroxylase gene (CMAH) [43] and the siglec-like molecule (Siglec-L1) [44]. However, no diseases have yet been associated with the loss of these genes. The CMAH enzyme converts the sialic acid N-acetylneuraminic acid to N-glycolylneuraminic acid, potentially affecting recognition by a variety of endogenous and exogenous sialic acid-binding lectins. Siglecs are immunoglobulin superfamily member lectins that selectively recognize different sialic acid types, and siglec-L1 preferentially recognizes N-glycolylneuraminic acid. To date, no disease has been associated with the loss of the TDH gene, but humans may not be able to metabolise high protein diets as efficiently as other mammals.

Figure 8
The human TDH pseudogene is expressed. (A) Clones from 2 individuals were sequenced and mapped to the TDH gene. Clone 1 skipped exon 6, and the resulting frame-shift generated a premature stop codon in exon 7 (TGA). Clone 2 skipped exon 4 and utilised a cryptic splice site in exon 6, and the resulting frame-shift generated a premature stop codon in exon 6. (B) The translation of clone 1 (exon 6 skipped) encoded a truncated 230 residue ORF. The skipped exon 4 in clone 2 does not alter the reading frame. However, the use of a cryptic splice site in exon 6 results in a premature stop codon in exon 6. Stop codons are indicated by *.

Conclusions
The human L-threonine 3-dehydrogenase gene is an expressed pseudogene, which accounts for the very low levels of threonine oxidation measured in humans by the production of labelled CO2 from labelled threonine. This suggests that the L-serine/threonine dehydratase pathway may be the only route in the catabolism of threonine in humans. The presence of a functional TDH gene in key mammal species such as pigs and mice and its loss in man should be taken into consideration when utilising hepatocytes in bioartificial livers and pharmacokinetic studies.