Breaking News

Peptidase Family U34 is an Extension of the N-Terminal Nucleophile Hydrolases

Peptidases are a diverse class of enzymes that hydrolyze the peptide bonds in proteins, peptides and various other substrates.  Due to their importance in biology and medicine, many peptidases are well studied for their sequence, structure and catalytic mechanisms.  Despite the diversity of peptidase sequences and structures, there are only a limited number of catalytic types: aspartic (A), cysteine (C), metallo (M), serine (S) and threonine (T).  A comprehensive sequence- and structure-based classification of peptidases is available from the MEROPS database (Barrett et al. 2001; Rawlings et al. 2002).  In the MEROPS database, a peptidase family consists of peptidases with significant sequence similarity.  Each peptidase family is named by a letter specifying its catalytic type and a number, such as S32.  Families considered to be evolutionarily related are grouped into clans.  Among the nearly 200 peptidase families in MEROPS, there are only a few with unknown catalytic type (U), for which the key residues for catalysis have not been revealed by experiments or sequence analysis.  With the development of sensitive similarity search tools such as PSI-BLAST (Altschul et al. 1997) and the enlargement of sequence and structure databases, more remote homology relationships can be discovered computationally.  This has been demonstrated for numerous peptidases (Makarova and Grishin 1999; Makarova et al. 2000; Pei and Grishin 2002).  Here, we describe the homology relationship between peptidase family U34 and penicillin V acylases.  This relationship indicates that U34 peptidases are cysteine proteases and that the U34 family is an extension of the diverse superfamily of N-terminal nucleophile hydrolases.
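
The family naming convention described above (a catalytic-type letter followed by a number) can be illustrated with a short Python sketch.  The function name parse_family is hypothetical, and the lookup table covers only the letters named in this text; it is not a complete description of MEROPS identifiers.

```python
import re

# Catalytic-type letters as listed in the text (assumption: no other letters).
CATALYTIC_TYPES = {
    "A": "aspartic",
    "C": "cysteine",
    "M": "metallo",
    "S": "serine",
    "T": "threonine",
    "U": "unknown",
}


def parse_family(identifier):
    """Split a family name such as 'S32' into (catalytic type, number)."""
    match = re.fullmatch(r"([A-Z])(\d+)", identifier.strip())
    if match is None or match.group(1) not in CATALYTIC_TYPES:
        raise ValueError(f"not a recognised family name: {identifier!r}")
    return CATALYTIC_TYPES[match.group(1)], int(match.group(2))


print(parse_family("S32"))
print(parse_family("U34"))
```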

Similarity searches for U34 family dipeptidases

The first representative of the U34 family was discovered in the lactic acid bacterial species Lactobacillus helveticus as a broad-specificity dipeptidase with 474 residues, designated dipeptidase A or pepDA (Dudley et al. 1996).  PSI-BLAST searches (Altschul et al. 1997) starting with the full sequence of pepDA against the non-redundant protein database (nr) maintained at NCBI (November 2nd, 1,226,480 sequences; 390,314,779 total letters; e-value cutoff 0.01) converged to about 40 U34 family homologs, among which there are a few eukaryotic and archaeal proteins.  The closest homologs of pepDA are mainly from Lactobacillus and Streptococcus species, and are often annotated as putative dipeptidase A or as hypothetical proteins.  The eukaryotic and archaeal homologs are all hypothetical proteins without functional annotations; we regard them as putative peptidases.  None of the homologs has a known structure or a characterized catalytic type.  Because PSI-BLAST results are sensitive to the query sequence, we took an extra step to ensure full coverage: the homologs found were grouped by single-linkage clustering (1 bit per site threshold, about 50% sequence identity), and representative sequences from each group were used as queries for further PSI-BLAST iterations, as scripted in the SEALS package (Walker and Koonin 1997).  During the course of these iterations, we found statistically supported evidence that U34 peptidases and penicillin V acylases (PVA) are remote homologs.  For example, when pepDA (NCBI gene identification (gi) number: 1072051) was used as a query, one homolog from Mus musculus (gi: 12852713) was found in the second iteration with e-value 2e-09.  When this eukaryotic homolog was used as a query, it found the penicillin V acylase from Lactococcus lactis (gi: 15673817) in the third iteration with e-value 4e-04.  The structure of penicillin V acylase from Bacillus sphaericus has been determined (Figure 1) (Suresh et al. 1999), which offers a fold prediction for the U34 family peptidases.
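
The grouping step can be sketched in Python as follows.  This is a minimal illustration of single-linkage clustering, not the SEALS implementation: it assumes pre-aligned sequences of equal length and uses fractional identity with a 0.5 cutoff in place of the bit-per-site score.  The sequences and the names identity, single_linkage and toy are hypothetical.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of identical positions between two equal-length aligned strings."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def single_linkage(seqs, threshold=0.5):
    """Group sequence names; any pair at or above threshold merges two groups."""
    parent = {name: name for name in seqs}

    def find(name):
        # Follow parent links to the root, shortening the path on the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy sequences, not real U34 homologs.
toy = {
    "seqA": "MCTSILVGKD",
    "seqB": "MCTSILVGRE",
    "seqC": "MCSAVLVARE",
    "seqD": "MKWPQHYFNE",
}

clusters = single_linkage(toy)
# One representative per cluster would seed the next round of searches.
representatives = [members[0] for members in clusters]
print(clusters)
print(representatives)
```

In this toy set, seqA and seqC fall below the cutoff when compared directly but still share a cluster because both are linked to seqB.  This chaining behavior is what lets single-linkage clustering collect divergent members into one group.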

Penicillin acylases and the Ntn-hydrolases

Penicillin acylases (EC 3.5.1.11) catalyze the hydrolysis of penicillin into 6-aminopenicillanic acid (6-APA) and an organic acid (Mahajan 1984).  There are two distinct groups of penicillin acylases with different substrate preferences: penicillin V acylases (PVA) prefer to cleave phenoxymethyl penicillin (penicillin V) and penicillin G acylases (PGA) have higher affinity for phenylacetyl penicillin (penicillin G).  Both enzymes are used in industry to produce semisynthetic penicillins.  Their structures (Duggleby et al. 1995; Suresh et al. 1999) reveal the same fold and both belong to a large superfamily of amidohydrolases called N-terminal nucleophile (Ntn) hydrolases (Brannigan et al. 1995).  Ntn-hydrolases use the sidechain of the amino-terminal residue to perform the nucleophilic attack on the target amide bond (Brannigan et al. 1995).  Many structures of Ntn-hydrolases have been solved.  In the SCOP (Structural Classification of Proteins) database (Murzin et al. 1995), the Ntn-hydrolase fold also includes several other families: class II glutamine amidotransferases (Smith et al. 1994), proteasome subunits (Lowe et al. 1995; Groll et al. 1997) and (glycosyl) asparaginases (Oinonen et al. 1995) (Table 1).  Their structures have a similar architecture consisting of two layers of beta sheets sandwiched between two layers of alpha helices (Figure 1).  They are considered to be evolutionarily related based on their structural similarity and the common location of the catalytic residue, although sequence similarity between different families is beyond detection.  The structures also show great variability in the number of secondary structure elements and the details of their arrangement (Brannigan et al. 1995; Oinonen and Rouvinen 2000).  The catalytic residue can be a cysteine, a serine or a threonine.  For example, PVA is a cysteine peptidase while PGA is a serine peptidase.
According to the MEROPS classification, Ntn-hydrolases form peptidase clan PB, which is the only clan with three different catalytic types (Table 1).  Ntn-hydrolases represent a superfamily of amidohydrolases that have developed great divergence in sequence, structure and substrate specificity (Table 1).  Peptidase family U34 is an extension of the Ntn-hydrolases.

The PVA/U34 family

Since the U34 peptidases and PVAs can be linked by similarity searches, we denote these homologs as the PVA/U34 family.  We have found about 150 proteins in this family by extension searches.  Besides the U34 peptidases and PVAs, these proteins include conjugated bile salt hydrolases (or choloylglycine hydrolases, EC 3.5.1.24), which have been noted before as close homologs of PVAs (Christiaens et al. 1992).  We also found that acid ceramidases (N-acylsphingosine amidohydrolase, EC 3.5.1.23) are close homologs of PVAs.  Acid ceramidases are eukaryotic enzymes that hydrolyze the sphingolipid ceramide into sphingosine and free fatty acid.  Deficiency of acid ceramidase activity leads to the lysosomal storage disorder known as Farber disease (Koch et al. 1996).  Another subgroup consists of the isopenicillin N acyltransferases (acyl-coenzyme A:6-aminopenicillanic-acid-acyltransferases, EC 2.3.1.-) (Montenegro et al. 1990), which MEROPS assigns to family C45 in clan PB.  Thus the PVA/U34 family hydrolases have diverged to catalyze the hydrolysis of a variety of substrates and to play different physiological roles.

Active site in PVA/U34 family

The most striking conserved feature in the PVA/U34 family is the catalytic cysteine residue, which is present in most of the detected homologs (Figure 2).  The sidechain of the cysteine serves as the nucleophile and its free α-NH2 group serves as the proton donor and acceptor in the catalytic process.  For all Ntn-hydrolases, the catalytic residue is uncovered in the active enzyme by the removal of the sequence N-terminal to it.  We have found different ways to achieve this in PVA/U34 family proteins.  Proteins in the subgroup of penicillin V acylases and the conjugated bile salt hydrolases usually have the catalytic cysteine as the second residue, so the active residue is revealed right after the removal of the initiator formyl-methionine (Figure 2b).  One close homolog of pepDA has been experimentally characterized as an extracellular arginine aminopeptidase from Streptococcus gordonii (gi: 16506526, Figure 2a) (Goldstein et al. 2002).  This protein has a typical export signal sequence of 14 hydrophobic residues.  The predicted catalytic cysteine residue lies immediately after the cleavage site and is thus exposed after the removal of the signal sequence.  Inhibitor studies also showed that this protein has some cysteine protease characteristics, in support of our predictions.  The acid ceramidases usually have a relatively long sequence N-terminal to the catalytic cysteine.  The removal of this N-terminal part may occur by autoproteolysis, as in many other Ntn-hydrolases.
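
The two simplest routes to exposing the nucleophile described above (removal of the initiator methionine, or removal of a signal sequence) can be checked with a short Python sketch.  Both precursor sequences and the function names are hypothetical; the cleavage position is supplied by the caller rather than predicted.

```python
def mature_n_terminus(sequence, n_removed):
    """Return the residue that becomes N-terminal after removing n_removed residues."""
    if not 0 <= n_removed < len(sequence):
        raise ValueError("n_removed must leave at least one residue")
    return sequence[n_removed]


def has_ntn_cysteine(sequence, n_removed):
    """True if processing leaves a cysteine as the first residue."""
    return mature_n_terminus(sequence, n_removed) == "C"


# Hypothetical toy precursors, not real protein sequences.
met_only = "MCTGLALETKDG"                  # Cys is the second residue
with_signal = "MKKLLAVLLAALAA" + "CSGFVAKG"  # 14-residue leader, then Cys

print(has_ntn_cysteine(met_only, 1))      # remove the initiator Met only
print(has_ntn_cysteine(with_signal, 14))  # remove the 14-residue leader
print(has_ntn_cysteine(with_signal, 1))   # Met removal alone is not enough
```
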
The strongest sequence signal for all PVA/U34 family proteins lies in the motif containing the catalytic cysteine residue and corresponds to the N-terminal beta-hairpin in the structure of B. sphaericus penicillin V acylase (pdb id: 3pva; Figure 1, in purple).  Other common features in this motif include the hydrophobic patterns and positions occupied mainly by small residues near the catalytic cysteine (Figure 2).  The beta-hairpin motif is longer in the close homologs of U34 family dipeptidases (Figure 2a) than in the close homologs of PVAs (Figure 2b).  Two other residues (Arg17 and Asp20) are also highly conserved in the motif in most of the PVA/U34 proteins.  In the structure of 3pva, Arg17 makes hydrogen bonds to the opposite beta sheet: two to the main-chain carbonyl groups of Tyr68 and Met80 and one to the sidechain of Asp69.  Arg17 should therefore be an important factor in maintaining the overall stability of the structure.  In addition, one sidechain nitrogen of Arg17 is only 3.8 Å away from the catalytic sulfhydryl group, suggesting that Arg17 could also be involved in catalysis.  The position corresponding to Arg17 is usually occupied by a positively charged residue in the other PVA/U34 homologs (Figure 2).  The sidechain of Asp20 makes a hydrogen bond with the free backbone amino group of the catalytic cysteine.  This interaction is critical for maintaining the orientation of the cysteine residue for catalysis.  Another important part of the catalytic machinery is the oxyanion hole, which stabilizes the negative charge that develops on the substrate carbonyl oxygen in the transition state.  The crystal structure of 3pva has revealed that the oxyanion hole consists of the sidechain Nδ2 of Asn175 and the main-chain NH of Tyr82 (Suresh et al. 1999).  However, PSI-BLAST local alignments of the U34/PVA homologs are usually restricted to the conserved beta-hairpin at the very N-terminus and do not cover the position of Asn175.
This suggests that the rest of the sequence is fairly diverse among different U34/PVA subgroups, which is consistent with the broad range of substrates that different subgroups can have.  We made a global alignment of all U34/PVA homologs found, using the program PCMA (Pei et al.), which emphasizes the consistency of profiles in the alignment process.  In this alignment, the position corresponding to Asn175 indeed has a high conservation value (Pei and Grishin 2001a).  The full alignment is available at ftp://iole.swmed.edu/dipep/dipep.aln.
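
To make the idea of a per-position conservation value concrete, the following Python sketch scores each alignment column by the frequency of its most common residue.  This is a deliberately simple stand-in, not the conservation measure cited above, and the toy alignment is hypothetical rather than taken from the U34/PVA alignment.

```python
from collections import Counter


def column_conservation(alignment):
    """Return, per column, the frequency of the most common non-gap residue."""
    lengths = {len(row) for row in alignment}
    if len(lengths) != 1:
        raise ValueError("all aligned rows must have the same length")
    scores = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        # Gaps count against conservation because we divide by all rows.
        scores.append(top / len(column))
    return scores


# Hypothetical toy alignment (rows are sequences, '-' marks a gap).
toy_alignment = [
    "CTRLD-N",
    "CSRVDGN",
    "CTKIDAN",
    "CARLE-N",
]

scores = column_conservation(toy_alignment)
invariant = [i for i, s in enumerate(scores) if s == 1.0]
print(scores)
print(invariant)
```

Columns that score 1.0 are invariant in the toy alignment.  In a real analysis, a weighted or entropy-based measure would usually be preferred, because a raw frequency is biased by groups of closely related sequences.
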
Conservation of Arg17, Asp20 and Asn175 is unique to the U34/PVA family (Figure 2d).  Structure comparisons have revealed a great diversity in the exact placement of active site components for different Ntn-hydrolases (Oinonen and Rouvinen 2000).  Even in the U34/PVA family, a few divergent subgroups have different conservation patterns in the beta-hairpin motif.  Two are shown in Figure 2c.  The isopenicillin N acyltransferases (Montenegro et al. 1990) have the positively charged Arg17 replaced by a glutamine residue.  The other subgroup consists of eukaryotic proteins, most of which are hypothetical proteins.  Two experimentally characterized members are the Drosophila protein LAMA (Perez and Steller 1996) and the Trypanosoma lysosomal membrane glycoprotein p67 (Kelley et al. 1999), both of which are related to development.  In this subgroup, the position of Arg17 is often occupied by a His, and Asp20 is replaced by a large hydrophobic residue, usually Trp.  In conclusion, the PVA/U34 family is a large and diverse family of cysteine-type Ntn-hydrolases.
