Annotation of genomes for nitrilases
In this study, a total of 16 genomes were annotated for nitrilase-encoding genes using the RAST server. Genome size ranged from 1.7 to 9.15 Mbp, with a GC content of 35.2 to 70%. The total number of predicted genes varied from 1900 in Pyrococcus horikoshii UBA8834 to 9063 in Bradyrhizobium diazoefficiens SEMIA 5080 (Table 1). Genome mining revealed the highest number of nitrilase-encoding genes in Bradyrhizobium diazoefficiens SEMIA 5080, followed by Acidovorax sp. MR-S7, A. oryzae RS-1, Acinetobacter sp. ATCC27244, Cupriavidus necator B9, C. necator 5, Rhodococcus opacus PD630, Sphingomonas sp. S17, Sphingomonas sp. LH128 and Streptomyces AC40, whereas the genomes of Bacillus cereus W, B. cereus G9241, Bacillus sp. SBA12, Pyrococcus horikoshii UBA8834, Rhodococcus rhodochrous BKS6-46, R. rhodochrous TRN7, Sphingomonas sp. KC8 and Streptomyces sp. AC30 lacked nitrilase-encoding genes (Table 1). The nitrilase predicted in Streptomyces AC40 was only 119 aa long, possibly because the genome sequence is a draft (Salwan et al., 2020). Further, analysis of the amino acid sequences grouped the predicted proteins into C-N hydrolase, amidase, glutamine amidotransferase, nitrile hydratase and periplasmic nitrile proteins (Table S1). Among these, four nitrilases showing identity to an uncharacterized subgroup of the nitrilase superfamily were characterized further to identify newer sources of nitrilases. These nitrilases ranged in size from 208 to 345 amino acids and shared 24–37% sequence identity with various nitrilases (Table 2). Phylogenetic analysis showed that the uncharacterized nitrilases AcNit, As7Nit, Cn9Nit and Cn5Nit are related to nitrilases belonging to the C-N hydrolase domain and group as a separate cluster (Fig. 1).
Table 1
Characteristics of genomes and annotation based on RAST server

| Organism | Genome Id | Size (bp) | GC Content (%) | Number of Contigs | Number of Coding Sequences | Number of RNAs | Nitrilases |
|---|---|---|---|---|---|---|---|
| Acidovorax sp. MR-S7 | BANPO1 | 5,007,754 | 68.3 | 130 | 4918 | 55 | 2 |
| Acidovorax oryzae RS-1 | AFPT | 5,522,282 | 68.7 | 156 | 5323 | 44 | 1 |
| Acinetobacter sp. ATCC27244 | ABYNO | 3,330,366 | 39.4 | 255 | 3333 | 64 | 2 |
| Bacillus cereus W | ABCZO2 | 5,240,573 | 35.5 | 102 | 5482 | 136 | 0 |
| Bacillus cereus G9241 | AAEKO1 | 5,934,942 | 35.2 | 207 | 6316 | 155 | 0 |
| Bradyrhizobium diazoefficiens SEMIA 5080 | ADOUO2 | 9,085,533 | 64 | 13 | 9063 | 53 | 6 |
| Cupriavidus necator B9 | FMSH | 7,303,706 | 66.3 | 550 | 7389 | 57 | 2 |
| Cupriavidus necator ISOLATE 5 | CAIGK | 7,191,616 | 64.5 | 59 | 6865 | 56 | 1 |
| Pyrococcus horikoshii UBA8834 | DUJNO1 | 1,724,643 | 41.8 | 8 | 1900 | 47 | 0 |
| Rhodococcus rhodochrous BKS6-46 | AGVW | 6,213,641 | 67.4 | 609 | 6195 | 53 | 0 |
| Rhodococcus opacus PD630 | AGVD | 9,149,864 | 67 | 491 | 9091 | 55 | 3 |
| Rhodococcus rhodochrous TRN7 | FBUKO1 | 4,871,006 | 70.2 | 173 | 5330 | 59 | 0 |
| Sphingomonas sp. KC8 | AFMP | 4,074,265 | 63.7 | 70 | 4036 | 49 | 0 |
| Sphingomonas sp. S17 | AFGGO1 | 4,268,406 | 65.7 | 62 | 4035 | 52 | 1 |
| Sphingomonas sp. LH128 | ALVCO1 | 6,457,516 | 64.9 | 658 | 6793 | 56 | 2 |
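The genome statistics in Table 1 can be summarized programmatically. The following is a minimal sketch, not the pipeline used in the study: the rows are transcribed from Table 1, and the names `GENOMES`, `summarize` and `SUMMARY` are hypothetical helpers introduced only for illustration.

```python
# Rows transcribed from Table 1:
# (organism, size in bp, GC %, coding sequences, nitrilase genes)
GENOMES = [
    ("Acidovorax sp. MR-S7", 5_007_754, 68.3, 4918, 2),
    ("Acidovorax oryzae RS-1", 5_522_282, 68.7, 5323, 1),
    ("Acinetobacter sp. ATCC27244", 3_330_366, 39.4, 3333, 2),
    ("Bacillus cereus W", 5_240_573, 35.5, 5482, 0),
    ("Bacillus cereus G9241", 5_934_942, 35.2, 6316, 0),
    ("Bradyrhizobium diazoefficiens SEMIA 5080", 9_085_533, 64.0, 9063, 6),
    ("Cupriavidus necator B9", 7_303_706, 66.3, 7389, 2),
    ("Cupriavidus necator ISOLATE 5", 7_191_616, 64.5, 6865, 1),
    ("Pyrococcus horikoshii UBA8834", 1_724_643, 41.8, 1900, 0),
    ("Rhodococcus rhodochrous BKS6-46", 6_213_641, 67.4, 6195, 0),
    ("Rhodococcus opacus PD630", 9_149_864, 67.0, 9091, 3),
    ("Rhodococcus rhodochrous TRN7", 4_871_006, 70.2, 5330, 0),
    ("Sphingomonas sp. KC8", 4_074_265, 63.7, 4036, 0),
    ("Sphingomonas sp. S17", 4_268_406, 65.7, 4035, 1),
    ("Sphingomonas sp. LH128", 6_457_516, 64.9, 6793, 2),
]


def summarize(rows):
    """Return size and GC ranges plus a nitrilase ranking for the given rows."""
    sizes = [row[1] for row in rows]
    gc_values = [row[2] for row in rows]
    # Keep only genomes with at least one nitrilase, highest count first.
    with_nitrilase = sorted(
        (row for row in rows if row[4] > 0), key=lambda row: row[4], reverse=True
    )
    return {
        "size_range_mbp": (round(min(sizes) / 1e6, 2), round(max(sizes) / 1e6, 2)),
        "gc_range": (min(gc_values), max(gc_values)),
        "genomes_with_nitrilase": len(with_nitrilase),
        "top_genome": with_nitrilase[0][0],
        "total_nitrilases": sum(row[4] for row in rows),
    }


SUMMARY = summarize(GENOMES)
```

Only the genomes listed in Table 1 are covered, so strains mentioned in the text but absent from the table are not part of this summary.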
Table 2
Identification of nitrilases retrieved from the annotated genomes using the BLASTp tool at NCBI

| Sequence ID | Amino acids | % Identity | Nearest neighbour | Identification |
|---|---|---|---|---|
| CAIGKG010000010, Cupriavidus necator 5 (Cn5Nit) | 273 | 34 | Chain A, Nitrilase, Pyrococcus abyssi GE5 | FIG003879: Uncharacterized subgroup of the nitrilase superfamily |
| | | 32 | Chain A, Hydrolase, Xanthomonas campestris pv. campestris | |
| | | 32 | Chain A, Amidase, Nesterenkonia sp. 10004 | |
| | | 27 | Chain A, Hydrolase, Carbon-nitrogen Family, Staphylococcus aureus subsp. aureus COL | |
| BANP01, Acidovorax sp. MR-S7 (AcNit) | 271 | 37 | Chain A, Hydrolase, Xanthomonas campestris pv. campestris | FIG003879: Uncharacterized subgroup of the nitrilase superfamily |
| | | 31 | Chain A, Amidase, Nesterenkonia sp. 10004 | |
| | | 26 | Chain A, Hydrolase, Carbon-nitrogen Family, Staphylococcus aureus subsp. aureus COL | |
| | | 26 | Chain A, N-carbamyl-D-amino Acid Amidohydrolase, Agrobacterium sp. KNK712 | |
| ABYN01000224, Acinetobacter sp. ATCC 27244 (As7Nit) | 274 | 26 | Putative carbon-nitrogen family hydrolase from Staphylococcus aureus | FIG003879: Uncharacterized subgroup of the nitrilase superfamily |
| | | 25 | Amidase from Nesterenkonia sp. 10004 | |
| | | 24 | N-carbamoyl-D-amino acid amidohydrolase, Agrobacterium tumefaciens | |
| FMSH01000197, Cupriavidus necator B9 (Cn9Nit) | 276 | 29 | Putative carbon-nitrogen family hydrolase from Staphylococcus aureus | FIG003879: Uncharacterized subgroup of the nitrilase superfamily |
| | | 32 | CN-hydrolase superfamily protein, Xanthomonas campestris pv. campestris | |
Comparative analysis of amino acid composition revealed molecular weights of ~24–54 kDa and pI values of 4.76–7.81, which are close to those of previously characterized aliphatic and aromatic nitrilases. The lower pI values are probably due to a higher content of acidic amino acids (30%) than of basic amino acids (25%). The GRAVY value was −0.032 and the aliphatic index was 93. In addition, a larger number of negatively charged amino acids on the surface, a high content of non-polar amino acids (55%) relative to polar amino acids (22%), and fewer Pro (4%) and more Gly (9%) residues may provide flexibility to the protein structures.
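For readers unfamiliar with the GRAVY and aliphatic index values quoted above, the following minimal sketch shows how such descriptors are conventionally computed from a protein sequence. It assumes the Kyte-Doolittle hydropathy scale for GRAVY and Ikai's formula for the aliphatic index; the source does not state which tool produced its values, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Ikai's aliphatic index: A + 2.9*V + 3.9*(I + L), in mole percent."""
    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


def fraction(seq, residues):
    """Fraction of the sequence made up of the given residue letters."""
    return sum(seq.count(aa) for aa in residues) / len(seq)


# Toy sequence for illustration only; it is not one of the study's nitrilases.
TOY_SEQUENCE = "MKLAVEDGPG"
toy_gravy = gravy(TOY_SEQUENCE)
toy_acidic = fraction(TOY_SEQUENCE, "DE")
toy_basic = fraction(TOY_SEQUENCE, "KRH")
```

A negative GRAVY indicates a slightly hydrophilic protein overall, and a high aliphatic index is usually read as a sign of thermostability.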
Domain and motif analysis
Domain analysis of the nitrilases AcNit, As7Nit, Cn5Nit and Cn9Nit revealed a conserved C-N hydrolase catalytic domain of ~252–297 aa belonging to the nitrilase superfamily. No signal peptide cleavage site was predicted by the SignalP server or InterProScan (http://www.cbs.dtu.dk/services/SignalP-4.1/; https://www.ebi.ac.uk/interpro/search/sequence/), suggesting an intracellular location. Three residues, Glu, Lys and Cys (EKC), form the catalytic triad, although their positions vary among the sequences. The triad is conserved: the Cys residue plays an important role in maintaining activity, while the Lys and Glu residues mediate acid-base catalysis (Raczynska et al., 2011). Crystal structures have been resolved for nitrilases from eukaryotes, including animals, plants and yeasts, as well as for prokaryotic nitrilases (6MG6 and 3WUY) (Shen et al., 2020). The putative substrate-binding pocket, formed of hydrophobic and hydrophilic residues, has been reported to lie near the surface in the nitrilase of Synechocystis sp. strain PCC6803 (Shen et al., 2020). Further characterization of the four nitrilases revealed a single motif of 28 aa, although its position varies among them (Table 3; Fig. 2b). The catalytic residues were conserved in all the predicted proteins (Table 3).
Table 3
Annotation of genes encoding nitrilases and prediction of motifs, domains, active sites, secondary structures (SS; H, helices; S, strands) and other conserved sites based on InterProScan and NCBI-CDD

| Sequence ID | Motifs | Domain | Catalytic triad | Active sites | SS | Genes for xenobiotic degradation |
|---|---|---|---|---|---|---|
| ABYN01000224, Acinetobacter sp. ATCC 27244 (As7Nit) | 21–49 | CN-Hydrolase 2–254 | E41, K115, C157 | E41, T99, K115, F119, E132, C157, Y158, L160, R161, A182 | 9H, 17S | K00362, K00627, K00622, K00791, K00930, K00626, K00982 |
| BANP01, Acidovorax sp. MR-S7 (AcNit) | 20–48 | CN-Hydrolase 1–298 | E40, K113, C158 | E40, N96, K113, F117, E128, C158, Y159, L161, R162, A183 | 9H, 18S | K00362, K00627, K00364, K00625, K00791, K00930, K00363, K00626, K00984, K00982 |
| CAIGKG010000010, Cupriavidus necator (Cn5Nit) | 26–54 | CN-Hydrolase 8–263 | E46, K119, C159 | E46, N102, K119, F123, E134, C159, Y160, L162, R163, A185 | 8H, 15S | K00362, K00625, K00791, K00626 |
| FMSH01000197, Cupriavidus necator B9 (Cn9Nit) | 29–57 | CN-Hydrolase 10–265 | E49, K122, C162 | E49, N105, K122, F126, E137, C162, Y163, L165, R166, A188 | 9H, 17S | K00625, K00791, K00626 |
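The catalytic triad positions listed in Table 3 can be compared across sequences by parsing the residue labels and measuring the spacing between E, K and C. The sketch below is illustrative only: the entries are keyed by the accession prefixes given in the table, and `parse_residues` and `triad_spacing` are hypothetical helper names.

```python
import re

# Catalytic triads transcribed from Table 3, keyed by sequence accession.
TRIADS = {
    "ABYN01000224": "E41, K115, C157",
    "BANP01": "E40, K113, C158",
    "CAIGKG010000010": "E46, K119, C159",
    "FMSH01000197": "E49, K122, C162",
}


def parse_residues(text):
    """Turn 'E41, K115, C157' into [('E', 41), ('K', 115), ('C', 157)]."""
    return [(m.group(1), int(m.group(2))) for m in re.finditer(r"([A-Z])(\d+)", text)]


def triad_spacing(text):
    """Return (E-to-K, K-to-C) distances in residues for an E/K/C triad."""
    residues = parse_residues(text)
    if [letter for letter, _ in residues] != ["E", "K", "C"]:
        raise ValueError(f"not an E/K/C triad: {text!r}")
    (_, e_pos), (_, k_pos), (_, c_pos) = residues
    return k_pos - e_pos, c_pos - k_pos


SPACINGS = {accession: triad_spacing(triad) for accession, triad in TRIADS.items()}
```

The E-to-K distance differs by at most one residue across the four sequences, which is consistent with the statement that the triad is conserved while its absolute position varies.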
Structural characterization of identified nitrilases
Secondary structure analysis revealed 30% helices, 31% sheets and 2% loops or intrinsically disordered regions, forming a 4-layer α-β-β-α sandwich structure as per the SCOP classification (Table 3; Fig. 2a). The Ramachandran plots showed 88–92% of amino acid residues in the most favored regions, 7.5–10.4% in additional allowed regions and 0.4–1.3% in disallowed conformations (Fig. 3). The 3D structures of the four nitrilases were modeled using the best-matching templates, which showed 32–38% identity with Mus musculus nitrilase-2 (PDB: 2W1V_A), 26% identity with the nitrilase of Staphylococcus aureus subsp. aureus (PDB: 3P8K) and 31% identity with the nitrilase of Nesterenkonia sp. 10004 (PDB: 3HKX_A). Superimposition of the modeled structures of AcNit, As7Nit, Cn5Nit and Cn9Nit on the template 2W1V_A gave root mean square deviation (RMSD) values of 1.7, 0.093, 0.137 and 0.156 Å, respectively. The modeled 3D structures gave QMEAN scores of 0.65, 0.67, 0.63 and 0.64, indicating that the models are of good quality, as the values fall within the prescribed limits. The geometry and topology of the predicted models were close to those of the 2W1V_A template. Crystal structures of nitrilases from Arabidopsis thaliana (PDB: 6I00), Caenorhabditis elegans (PDB: 1EMS), Helicobacter pylori (PDB: 6MG6), Mus musculus (PDB: 2W1V), Pyrococcus abyssi (PDB: 3KLC), Saccharomyces cerevisiae (PDB: 4H5U and 1F89) and Synechocystis sp. (PDB: 3WUY) have already been reported. However, most of the nitrilases with resolved crystal structures come from eukaryotes such as animals, plants and yeasts, and only 6MG6 and 3WUY come from prokaryotes. The residues of the catalytic triad EKC are distant from one another in the primary structure but come together to form the substrate-binding pocket along with other residues, a characteristic of the nitrilase superfamily (Fig. 4).
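The RMSD values reported above measure the average distance between equivalent atoms of a model and its template. As a minimal sketch of the quantity itself, the function below computes RMSD for two coordinate lists that are assumed to be already superimposed and paired atom by atom (for example, matched Cα atoms); the optimal superposition step performed by structure-alignment tools is not reproduced here, and `rmsd` is a hypothetical helper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired lists of (x, y, z) points.

    Assumes both structures are already superimposed and that the i-th
    point of each list refers to the same atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates in angstroms, for illustration only.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
toy_rmsd = rmsd(model, template)
```

Values well below 2 Å, as reported for the four models, are generally taken to indicate that the backbone of the model follows the template closely.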
Structural superimposition of the modeled nitrilases showed almost identical structural coordination of the catalytic triads with the selected template 2W1V_A. AcNit displays a relatively large substrate-binding pocket formed of aromatic and hydrophobic residues, E40, N96, K113, F117, E128, C158, Y159, L161, R162 and A183 (Fig. 4), which can accommodate both aliphatic and aromatic nitrile substrates, and it displays the broadest nitrile substrate spectrum. Aromatic residues in the substrate-binding pocket of AcNit are possibly involved in stabilizing aromatic nitrile substrates. Similarly, crystallographic studies have reported a substrate-binding pocket composed of hydrophobic (EKHEC) and hydrophilic (YFNYWWPMVF) residues in Synechocystis sp. strain PCC6803 (Zhang et al., 2014). Further, residues N118 and E142 have been reported to stabilize the active-site residues E53 and K135, respectively. Similarly, aromatic nitrile substrates are stabilized by W170 and F202, and other residues (PGSMVGQIF) form the substrate-binding loop (Zhang et al., 2014; Raczynska et al., 2011). Previous studies have also revealed the formation of an oxyanion hole involving phenylalanine, which plays a major role in substrate recognition (Shen et al., 2020). Therefore, in the selected nitrilase AcNit, the catalytic C158 acts as a nucleophile, E40 activates the sulfhydryl group of C158 by acting as a base, and K113 stabilizes the intermediates formed in the reaction. The conserved residues N96 and E128 also help to stabilize and activate the catalytic triad by forming hydrogen bonds with E40 and K113, respectively.
Protein ligand interaction based on docking
Nitrilases are known for their hydrolytic activity against aromatic, aliphatic and arylacetonitrile substrates. The predicted protein-ligand interactions of AcNit showed favorable binding energies of −5.79 kcal/mol for indole-3-acetonitrile, −4.56 kcal/mol for α-picolinic acid amide, −4.29 kcal/mol for phenylacetonitrile, −4.15 kcal/mol for 2-cyanopyridine, −3.32 kcal/mol for crotonolide, −3.24 kcal/mol for butenedinitrile, −2.73 kcal/mol for acrylonitrile and −2.68 kcal/mol for malononitrile, compared with +196.77 kcal/mol for hexonamide and +107 kcal/mol for benzamide. An interaction between a protein and a ligand is favorable only when the Gibbs free energy change (ΔG) is negative, and a negative ΔG indicates a stable protein-ligand interaction. Similarly, the nitrilase of L. aggregata DSM 13394 is reported not to act on aromatic and heterocyclic nitriles but shows a high preference for the hydrolysis of arylacetonitriles such as iminodiacetonitrile (Zhang et al., 2012). Other nitrilases, from Comamonas testosteroni, Geobacillus pallidus rapc8, Nocardia globerula NHB-2, Pseudomonas aeruginosa 10145, Rhodococcus rhodochrous J1 and Streptomyces sp. MTCC 7546, have been reported to act on aliphatic, aromatic and arylacetonitrile substrates and to be suitable for various industrial applications (Nigam et al., 2009; Levy-Schil et al., 1995; Williamson et al., 2010; Harper et al., 1985; Alonso et al., 2008; Kobayashi et al., 1989).
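To make the link between docking energy and affinity concrete, the sketch below ranks the ligands by the binding energies quoted above and converts each ΔG into an estimated dissociation constant using ΔG = RT ln Kd. This is an illustration under stated assumptions, not the docking workflow of the study: a temperature of 298.15 K and a 1 M standard state are assumed, the phenylacetonitrile value is taken as negative based on its place in the ordered list, and all function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
T_KELVIN = 298.15     # assumed temperature; not stated in the source

# Binding energies (kcal/mol) as quoted in the text for AcNit.
DOCKING_KCAL = {
    "indole-3-acetonitrile": -5.79,
    "alpha-picolinic acid amide": -4.56,
    "phenylacetonitrile": -4.29,  # sign assumed negative from the ordering
    "2-cyanopyridine": -4.15,
    "crotonolide": -3.32,
    "butenedinitrile": -3.24,
    "acrylonitrile": -2.73,
    "malononitrile": -2.68,
    "hexonamide": 196.77,
    "benzamide": 107.0,
}


def dissociation_constant(delta_g, temperature=T_KELVIN):
    """Estimate Kd (in M) from delta G via delta_G = R*T*ln(Kd)."""
    return math.exp(delta_g / (R_KCAL * temperature))


def rank_ligands(energies):
    """Sort ligands from most to least favorable (lowest energy first)."""
    return sorted(energies.items(), key=lambda item: item[1])


def favorable(energies):
    """Names of ligands with a negative predicted binding energy."""
    return [name for name, delta_g in rank_ligands(energies) if delta_g < 0]


RANKED = rank_ligands(DOCKING_KCAL)
FAVORABLE = favorable(DOCKING_KCAL)
```

Docking scores are approximate, so the derived Kd values are best read as a relative ordering of ligands and not as measured affinities.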
To investigate whether the aromatic amino acid Y159 plays an important role in determining substrate specificity, saturation mutagenesis was performed. Replacing Y159 with the non-aromatic residue alanine completely disrupted the catalytic activity towards indole-3-acetonitrile (IAN), as indicated by a binding energy of +5.69 kcal/mol for IAN (Fig. 4d). The strength of protein-ligand binding is reflected in the binding energy: the lower the binding energy, the stronger the affinity for the substrate. Therefore, the presence of hydrophilic residues in the substrate-binding pocket justifies the activity of the modeled nitrilase AcNit towards aromatic substrates as well.
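An in silico substitution such as Y159A amounts to replacing one residue in the sequence before the mutant is re-modeled and re-docked. The sketch below shows only that first step, together with the change in predicted binding energy implied by the values quoted above; the toy sequence is not the AcNit sequence, and `apply_point_mutation` is a hypothetical helper, not part of any docking package.

```python
import re


def apply_point_mutation(seq, mutation):
    """Apply a substitution written as e.g. 'Y159A' (1-based position)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    if seq[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {seq[position - 1]}"
        )
    return seq[: position - 1] + replacement + seq[position:]


# Toy sequence for illustration only.
toy_mutant = apply_point_mutation("MEKCY", "Y5A")

# Change in predicted binding energy for IAN, from the values in the text:
# wild type -5.79 kcal/mol, Y159A mutant +5.69 kcal/mol.
delta_delta_g = round(5.69 - (-5.79), 2)
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are common when model numbering differs from the deposited sequence.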
Gene prediction for xenobiotic degradation and secondary metabolites
Various genes responsible for the degradation of xenobiotics, including benzoate, bisphenol, fluorobenzoate, furfural, naphthalene, aminobenzoate, styrene, atrazine, xylene, caprolactam, chloroalkane and chloroalkene, cytochrome P450 and steroids, were predicted. Similarly, gene operons involved in the biosynthesis of phenylpropanoid, stilbenoid, diarylheptanoid and gingerol, flavonoid, flavones, isoflavonoid, indole alkaloid, isoquinoline alkaloid, acridone, penicillin and cephalosporin, monobactam and clavulanic acid were predicted. The presence of these genes suggests a role for nitrilases in the biodegradation of pollutants and xenobiotic compounds. These genes could also prove useful for the production of secondary metabolites.