A global atlas of fungal biosynthetic gene clusters reveals the diversification of diketopiperazine biosynthesis

doi:10.21203/rs.3.rs-4715743/v1


Research Article


This work is licensed under a CC BY 4.0 License

Version 1


Background

Fungi represent one of the largest and most promising reservoirs of structurally diverse natural products. However, the global biosynthetic potential of fungi has expanded significantly yet remains underexplored.

Results

Here, we present the most comprehensive fungal biosynthetic gene cluster (BGC) atlas to date, comprising 303,983 BGCs predicted from 13,125 fungal genomes and revealing many less-explored taxa that encode large biosynthetic diversity. The fungal BGCs were organized into 43,984 gene cluster families (GCFs), with 99.6% remaining uncharacterized and 91.7% being genus-specific. Gene-centric analysis revealed 359 cyclodipeptide synthases (CDPSs) of three distinct subcategories and 9,482 nonribosomal peptide synthetases (NRPSs) responsible for diketopiperazine biosynthesis in the fungal BGC atlas. Interestingly, 304 type one CDPSs with high homology to bacterial CDPSs were discovered in fungi for the first time and were found exclusively in Fusarium. A mass spectrometry-guided approach resulted in the isolation of eighteen indole diketopiperazine alkaloids, including three novel ones, from an Aspergillus strain. Bioinformatics analysis indicated that these compounds are synthesized by an NRPS protein and several post-modification enzymes.

Conclusions

The study presents the most comprehensive fungal BGC atlas and highlights the diversification of diketopiperazine biosynthesis in fungi, laying a crucial foundation for the exploration of specific types of natural products from fungi.

fungi

biosynthetic gene cluster

genome mining

diketopiperazine

cyclodipeptide synthase

Fungi are recognized as one of the largest and most promising reservoirs of structurally diverse natural products, as evidenced by the substantial annual increase in newly reported compounds, rising from 655 in 2011 to 1,236 in 2020 [1]. Over thirty secondary metabolites derived from fungi have been approved as pharmaceutical drugs or are currently under investigation at various clinical stages, including well-known drugs like penicillin, mizoribine, cyclosporine, and lovastatin [2]. Genome mining has proven to be a powerful approach for systematically investigating the potential of microorganisms to synthesize novel natural products and has been successfully applied to fungi, bacteria, and marine prokaryotes [3-5]. For instance, Robey et al. performed the first global analysis of fungal biosynthetic gene clusters (BGCs) using 1,037 fungal genomes, revealing that only 1% of the gene cluster families (GCFs) in fungi contained known BGCs from MIBiG, and highlighting the specific biosynthetic logic and chemical space in fungi [3]. Over the past few years, the number of publicly available fungal genome sequences has dramatically increased to over 10,000, presenting an opportunity for in-depth mining of the biosynthetic potential of fungi.

Cyclodipeptides, usually called 2,5-diketopiperazines, represent one of the most abundant classes of fungal natural products, with approximately 700 reported compounds exhibiting diverse biological activities, such as antitumor, antiviral, antifungal, and antibacterial activities [6]. Nonribosomal peptide synthetases (NRPSs) and cyclodipeptide synthases (CDPSs) have been recognized as the crucial enzymes responsible for the formation of 2,5-diketopiperazine scaffolds from two amino acids and aminoacyl-tRNAs, respectively [7]. Over 100 CDPSs have been functionally characterized, and nearly all of them originate from bacteria [8], except for two groups of eukaryotic cyclopeptide synthases identified by Yee et al. and Krause et al., respectively [9, 10]. Fungal NRPSs that form diketopiperazines typically exhibit a dimodular structure. In contrast, bacteria have only been reported to use two single-module NRPSs or a multi-module NRPS for diketopiperazine synthesis [11]. However, the diversity, novelty, and distribution patterns of CDPSs and diketopiperazine-forming NRPSs in fungi, which hold crucial value for the efficient exploration of diketopiperazine-type natural products, remain unknown.

Here, we constructed the most extensive fungal BGC atlas based on 13,125 fungal genomes, offering a comprehensive overview of the secondary metabolic potential of fungi. The atlas also systematically describes the diversity, novelty, and distribution patterns of biosynthetic gene cluster families (GCFs) in fungi. The gene-centric analysis uncovered the diversity, novelty, and specificity of CDPSs and diketopiperazine-forming NRPSs in fungi based on the BGC atlas. A mass spectrometry-guided approach, coupled with genome mining, was employed to investigate the potential of an Aspergillus strain in the biosynthesis of diketopiperazine natural products.

2.1. Construction of the fungal BGC atlas

A total of 13,125 fungal genomes or metagenome-assembled genomes (MAGs) were retrieved from the NCBI genome database on June 26, 2023. All genomes were analyzed using antiSMASH 6.0.1 with the default fungi setting to identify their biosynthetic gene clusters [12]. The number and classification of BGCs in each genome were obtained through our custom script “ex_region_info_from_gbk.py”. All BGCs were grouped into seven categories using a custom script “Big_Scape_Category.py” following the BiG-SCAPE classification table [13]. Genomic taxonomic information, size, quality assessment, and quantity of BGCs were visualized in iTOL and can be accessed in Dataset S1 [14]. The number of BGCs in each category was counted for each of the 13,125 genomes and used as the characteristic value, and the maximum likelihood phylogenetic tree was constructed by hierarchical clustering using the script “hcluster.R”.
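The per-genome category counts described above can be sketched as follows. This is a minimal illustration and not the authors' “Big_Scape_Category.py”: the product-to-category mapping is a small hypothetical excerpt, the hybrid rule is a simplifying assumption, and the function names `categorize` and `category_profile` are invented for this example.

```python
from collections import Counter

# Hypothetical, partial mapping from antiSMASH product labels to broad
# categories. The real BiG-SCAPE classification table is much larger.
PRODUCT_TO_CATEGORY = {
    "T1PKS": "PKS I",
    "T3PKS": "PKS other",
    "NRPS": "NRPS",
    "NRPS-like": "NRPS",
    "terpene": "Terpene",
    "fungal-RiPP": "RiPPs",
}


def categorize(products):
    """Assign one category to a BGC region given its product labels.

    Assumed rule: a region mixing PKS I and NRPS products is a hybrid;
    any other mixture, or an unmapped label, falls back to "Others".
    """
    categories = {PRODUCT_TO_CATEGORY.get(p, "Others") for p in products}
    if len(categories) == 1:
        return next(iter(categories))
    if categories == {"PKS I", "NRPS"}:
        return "PKS-NRPS hybrids"
    return "Others"


def category_profile(regions):
    """Count BGC regions per category for one genome."""
    return Counter(categorize(products) for products in regions)


# Toy genome with six predicted regions.
genome_regions = [
    ["T1PKS"],
    ["NRPS"],
    ["T1PKS", "NRPS"],
    ["terpene"],
    ["NRPS-like"],
    ["unmapped-product"],
]
profile = category_profile(genome_regions)
```

Stacking such profiles for all genomes gives a genome-by-category count matrix, which is the kind of input a hierarchical clustering step would take.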

2.2. Construction of the fungal gene cluster families

To assess the potential of fungi to generate novel metabolite scaffolds, a total of 212,478 complete BGCs (not on a contig edge) and characterized BGCs from the MIBiG database were grouped into families using BiG-SLiCE version 1.1.1 with default settings [15], yielding the fungal GCF atlas. The biosynthetic-Pfam and sublevel Pfam features of each GCF model were used to construct the phylogenetic tree of fungal GCFs using a custom script “matrix_to_tree.R”. The novelty of a GCF was determined qualitatively, based on whether characterized BGCs from MIBiG are present within the family, or quantitatively, based on the average cumulative BLAST score from antiSMASH. The most frequently occurring phylum and genus within each GCF were used as representatives to visualize the fungal GCF atlas in iTOL.

2.3. Mining of diketopiperazine synthases

To date, all characterized cyclodipeptide synthases can be categorized into three types: the widely recognized CDPSs of bacterial origin (T1CDPS) [16], and two groups of newly characterized eukaryotic cyclopeptide synthases (T2CDPS and T3CDPS) [9, 10]. Because these three types differ substantially in sequence similarity and length, potential CDPS proteins of each type were identified separately from the fungal BGC dataset, based on the experimentally characterized CDPSs, using HMMER [17]. Specifically, eight characterized T1CDPSs (NCBI accession numbers: WP_013152196.1, WP_003411681.1, WP_015803347.1, WP_019889609.1, AVP32201.1, Q8GED7.1, WP_159786149.1, and WP_099202916.1) [18–24], seven known T2CDPSs (NCBI accession numbers: UZP48213.1, WCB22921.1, WCB22922.1, WCB22923.1, UZP48212.1, XP_015405763, and XP_026617165.1) [9], and four known T3CDPSs (NCBI accession numbers: XP_453056.1, XP_037144654.1, XP_028891729.1, and CDO92882.1) [10] were collected from the literature and used as seed sequences for the respective CDPS searches. These seed sequences were searched against the protein-coding sequences extracted from the fungal BGC atlas using HMMER with the default settings. Functional annotations of the protein-coding sequences provided by antiSMASH were extracted via a custom Python script “ex_genefunction.py”. All hits were then compared against the seed sequences using BLAST 2.2.28+. HMM score, annotation results, and BLAST score were used as the selection criteria for CDPS candidates. The approximately-maximum-likelihood phylogenetic tree of putative CDPSs was calculated using FastTree 2.1 [25] and visualized in iTOL. The sequence similarity network of these proteins was generated using the Enzyme Similarity Tool with a threshold alignment score under 50 [26]. The biosynthetic gene cluster network of BGCs containing putative CDPSs was constructed using BiG-SCAPE version 1.1.5 at a cut-off of 0.7 [13]. The putative CDPS and BGC networks were visualized and annotated in Cytoscape 3.8 [27].

2.4. Mining of diketopiperazine-forming NRPSs

NRPSs are widely known as catalysts for the formation of diketopiperazine scaffolds from two amino acid substrates, especially in fungi. Fungal diketopiperazine-forming NRPSs were identified from the protein-coding sequences extracted from the fungal BGC atlas using a similar HMM-based screening approach, with four experimentally characterized diketopiperazine-forming NRPSs as seeds (NCBI accession numbers: B9WZX0.1, XP_754329.2, L7WU80.1, and OM307404.1) [28-31]. Because NRPSs are highly modular, the number of AMP-binding domains in each hit was extracted using the custom script “ex_A_domain_from_gbk.py”, and only NRPSs containing two AMP-binding domains were retained for the selection of diketopiperazine-forming NRPSs. HMM score, annotation results, and BLAST score were used as the selection criteria for diketopiperazine-forming NRPSs. The approximately-maximum-likelihood phylogenetic tree and NRPS network (alignment score of 350) were constructed and visualized as described above.
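The domain-count filter can be illustrated with a short sketch. It is not the authors' “ex_A_domain_from_gbk.py”: it assumes the domain annotations have already been extracted into a dictionary, and the protein identifiers, the domain label `AMP-binding`, and the function names are hypothetical choices for this example.

```python
A_DOMAIN_LABEL = "AMP-binding"  # assumed label for adenylation domains


def count_a_domains(domains):
    """Return the number of AMP-binding (adenylation) domains in a protein."""
    return sum(1 for name in domains if name == A_DOMAIN_LABEL)


def dimodular_candidates(hits):
    """Keep only hits that carry exactly two AMP-binding domains."""
    return [
        protein_id
        for protein_id, domains in hits.items()
        if count_a_domains(domains) == 2
    ]


# Toy input: protein ID -> ordered list of annotated domain names.
hits = {
    "prot_A": ["AMP-binding", "PCP", "Condensation", "AMP-binding", "PCP", "Condensation"],
    "prot_B": ["AMP-binding", "PCP"],
    "prot_C": ["AMP-binding", "PCP", "Condensation"] * 3,
    "prot_D": ["Condensation", "AMP-binding", "PCP", "AMP-binding"],
}
kept = dimodular_candidates(hits)
```

In this toy input only `prot_A` and `prot_D` pass the filter; single-module and three-module proteins are dropped before the score-based selection.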

2.5. Targeted discovery of diketopiperazines and the biosynthetic enzymes from fungi

Aspergillus sp. WHUF0304 was isolated from a mangrove soil sample collected at Yalong Bay, Sanya, Hainan, China, in December 2018. The fungus was identified as Aspergillus sp. according to its morphological characteristics and ITS gene sequences. The culture broth was extracted and analyzed on a SCIEX X500B Q-TOF spectrometer coupled to an ExionLC AC system. The raw LC-MS data underwent feature detection and alignment using MZmine2. Subsequently, CSV and MGF files were generated and imported into GNPS for network generation using default parameters. The generated molecular networks were visualized in Cytoscape, and diketopiperazine (DKP) metabolites were tentatively annotated using SIRIUS 45.

Genome assembly was performed using Flye 2.9.3 (https://github.com/fenderglass/Flye, SCR_017016) for Nanopore sequencing data [32], and consensus sequences and variant calls were created using Medaka v1.11.3 [33]. Subsequently, bowtie was employed to align the next-generation sequencing reads [34], followed by sorting and indexing using SAMtools [35]. Lastly, the assembled genome was refined using Pilon version 1.24 (https://github.com/broadinstitute/pilon, SCR_014731) based on the next-generation sequencing data, enabling the correction of assembly errors such as SNPs, indels, and gaps [36]. The genome was annotated using the Augustus web interface with a pre-trained Aspergillus fumigatus model as the species parameter and submitted to antiSMASH for analysis to obtain the asi BGC, which was then compared to the cri BGC [37]. A reference culture of Aspergillus sp. WHUF0304, maintained at −80 °C, is stored at Wuhan University, China.

2.6. Fermentation, extraction, and isolation

The fungus was cultured in liquid medium (soluble starch 15 g, glucose 5 g, peptone 5 g, yeast extract 5 g, (NH₄)₂SO₄ 0.5 g, K₂HPO₄ 0.5 g, NaCl 0.5 g, MgSO₄ 0.5 g, CaCO₃ 1.0 g, water 1 L, pH 7.5) as a seed solution, and then 5 mL of seed solution was inoculated into each of 70 × 1000 mL glass culture flasks containing solid rice medium (rice 70 g, distilled water 120 mL). The fungus was statically fermented for 30 days at room temperature. Fermentation products were extracted three times with EtOAc by soaking overnight. A crude extract (102.20 g) was obtained by vacuum distillation and was fractionated on a silica gel column using petroleum ether:EtOAc (5:1 to 0:1, v/v) to give six fractions (A to F).

Fraction B (8.87 g) was subjected to silica gel column chromatography, eluting with petroleum ether: EtOAc (5:1, 2:1, 1:1, 0:1, v/v) to obtain eleven fractions (B1 ~ B11). Fraction B8 (0.6 g) was separated by a silica gel column eluting with CH₂Cl₂:MeOH (1000:1, 600:1, 100:1, v/v) to yield three fractions (B8-1 ~ B8-3). Fraction B8-1 was separated by semipreparative HPLC (MeOH-H₂O, 80:20, v/v, 1.0 mL/min) to yield 2 (5.0 mg). Compound 2 was separated by a chiral column (CHIRALPAK IB) eluted with n-hexane–isopropanol (90:10, v/v, 1.0 mL/min, UV: 210 nm and 254 nm) to yield 2a and 2b. Fraction B8-2 was subjected to a Sephadex LH-20 column eluting with MeOH to obtain 5 (1.9 mg). Fraction B9 was passed over a Sephadex LH-20 column, eluting with MeOH, and further purified by semipreparative HPLC (MeOH-H₂O, 78:22, v/v, 3 mL/min) to obtain 6 (5.4 mg).

Fraction C (30.15 g) was applied to silica gel column chromatography, eluting with a gradient of CH₂Cl₂:MeOH (60:1 and 40:1, v/v), to obtain five fractions (C1 ~ C5). Fraction C2 (2.57 g) was separated into four fractions (C2-1 to C2-4) by ODS-C₁₈ column chromatography, eluting with a gradient of MeOH-H₂O (50:50 to 100:0, v/v). Fraction C2-3 was fractionated by semipreparative HPLC (Acetonitrile-H₂O, 50:50, v/v, 3 mL/min) to yield 3 (3.6 mg), 9 (7.1 mg), and 11 (3.5 mg). Fraction C2-4 was subjected to a silica gel column eluting with petroleum ether: EtOAc (10:1, 5:2, and 1:1, v/v) to obtain four fractions (C2-4-1 ~ C2-4-4). Fraction C2-4-2 was purified by semipreparative HPLC (Acetonitrile-H₂O, 70:30, v/v, 3 mL/min) to yield 4 (10.8 mg). Fraction C2-4-3 was separated by semipreparative HPLC (MeOH-H₂O, 84:16, v/v, 1 mL/min) to yield 1 (10.0 mg). 1a and 1b were separated on a chiral column (CHIRALPAK IH) eluting with n-hexane-EtOH (98:2, v/v, 1.0 mL/min) using a UV detector at 210 nm and 230 nm. Fraction C3 (4.12 g) was separated on a silica gel column eluting with petroleum ether: EtOAc (2:1, 1:1, and 0:1, v/v) to yield five fractions (C3-1 ~ C3-5). Fraction C3-2 was subjected to semipreparative HPLC using MeOH-H₂O (58:42, v/v, 3 mL/min) as mobile phase to obtain 12 (31.2 mg), 15 (12.1 mg), 17 (3.3 mg), and 18 (5.0 mg). Fraction C3-3 was purified by semipreparative HPLC using MeOH-H₂O (60:40, v/v, 3 mL/min) as mobile phase to obtain 13 (4 mg). Fraction C4 was fractionated by ODS-C₁₈ column chromatography into seven fractions (C4-1 ~ C4-7), eluting with a gradient of MeOH-H₂O (55:45 to 100:1, v/v). Fraction C4-3 was further purified by semipreparative HPLC (Acetonitrile-H₂O, 51:49, v/v, 3 mL/min) to yield 8 (9.8 mg) and 10 (5.4 mg).

Fraction D (15.65 g) was chromatographed over a silica gel column using CH₂Cl₂:MeOH (30:1, 20:1, and 41:1, v/v) as a mobile phase to afford five fractions (D1 ~ D5). Fraction D2 was purified by semipreparative HPLC (MeOH-H₂O, 60:40, v/v, 3 mL/min) to give 16 (77.0 mg). Fraction D4 (5.5 g) was separated by a silica gel column using petroleum ether: EtOAc (1:1, 1:2, and 0:1, v/v) as mobile phase to obtain six fractions (D4-1 ~ D4-6). Fraction D4-2 was isolated by semipreparative HPLC (Acetonitrile-H₂O, 46:54, v/v, 3 mL/min) to give 14 (6.55 mg). Fraction D4-3 was separated by semipreparative HPLC (MeOH-H₂O, 80:20, v/v, 3 mL/min) to yield 7 (20.1 mg).

3.1. Overview of the fungal BGC atlas

To gain a broader overview of the biosynthetic chemical space of fungi, a total of 13,125 fungal genomes or metagenome-assembled genomes (MAGs) were curated and analyzed using multiple bioinformatic tools. This genome dataset spans a wide phylogenetic range, encompassing 12 phyla and 1,102 genera (Dataset S1). It includes extensively studied fungal taxonomic groups like Aspergillus, Fusarium, Penicillium, and Talaromyces, as well as taxa with limited information about their BGCs, such as Pyricularia, Trichoderma, and Colletotrichum. The 13,125 fungal genomes were predicted to encode a total of 303,983 BGCs (Dataset S2). Among these, nonribosomal peptide synthetase (NRPS) BGCs were the most prevalent, constituting 38.5% (116,988). Type I polyketide synthase (PKS I) and terpene BGCs ranked as the second and third most abundant classes, accounting for 26.5% (80,518) and 16.4% (49,751) of the total BGCs, respectively. Conversely, saccharide BGCs were not identified in fungi, and ribosomally synthesized and post-translationally modified peptide (RiPP) BGCs represented only 0.4% (1,312). It is worth noting that the fungal genomes primarily come from the phyla Ascomycota, Basidiomycota, and Mucoromycota, accounting for 81.4% (10,683), 13.9% (1,827), and 2.9% (375), respectively. Each of the other phyla contributes about 0.7% of the genomes or less (at most 94 genomes). As expected, the phylum Ascomycota has the highest average BGC count, reaching 26.41, while the less-studied phylum Zoopagomycota surprisingly ranks second with an average BGC count of 17.04. Basidiomycota, which include many widely used medicinal mushrooms, encode the third-highest average number of BGCs among fungal phyla, in line with their reputation as prolific producers of biologically active natural products.
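The reported shares follow directly from the counts given above. As a quick check, the sketch below recomputes them from the totals stated in the text; the dictionary and function names are illustrative only.

```python
TOTAL_BGCS = 303_983
TOTAL_GENOMES = 13_125

# Counts as stated in the text.
bgc_class_counts = {"NRPS": 116_988, "PKS I": 80_518, "Terpene": 49_751, "RiPPs": 1_312}
genomes_per_phylum = {"Ascomycota": 10_683, "Basidiomycota": 1_827, "Mucoromycota": 375}


def percent_shares(counts, total):
    """Return each count as a percentage of the total, rounded to one decimal."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


class_shares = percent_shares(bgc_class_counts, TOTAL_BGCS)
phylum_shares = percent_shares(genomes_per_phylum, TOTAL_GENOMES)

for name, share in {**class_shares, **phylum_shares}.items():
    print(f"{name}: {share}%")
```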

The fungal BGC atlas intuitively uncovered the average number of BGCs in all 1,105 fungal genera. Among them, 225 genera have an average BGC count exceeding 30, including well-known genera such as Aspergillus (45.3 ± 12.7, n = 1,095), Fusarium (40.7 ± 7.2, n = 1,326), and Penicillium (41.0 ± 9.2, n = 393), as well as less-studied ones like Diaporthe (95.0 ± 19.3, n = 32), Calonectria (80.7 ± 17.4, n = 70), Macrophomina (62.9 ± 11.7, n = 28), Eutypa (62.8 ± 2.9, n = 41), and Pyricularia (49.2 ± 7.5, n = 383) (Fig. 1 and Dataset S1). The full list of fungal genera with great biosynthetic potential can be accessed in Dataset S1. It is noteworthy that the median genome size of these fungi is around 33.8 Mb, and scaffold N50 around 0.89 Mb (Fig. 1), indicating that the assembly quality of most fungal genomes has reached a moderate level. However, the incomplete genome sequences did affect the predicted total number of BGCs in a genome, especially for large-size BGCs, such as NRPS and PKS BGCs.

3.2. Diversity, novelty, and distribution pattern of fungal GCFs

Considering the existence of many homologous BGCs in phylogenetically closely related strains and the significant impact of incomplete BGCs on the novelty and specificity of GCFs, BiG-SLiCE was employed to investigate the diversity and novelty of 212,478 complete fungal BGCs (not on a contig edge). The BGCs sharing similar domain architectures were clustered into 43,984 GCFs using cosine-like (via l2-normalization) distances at the default threshold (T = 300), with 27,241 GCFs containing only one BGC. The dendrogram of the 16,743 GCFs consisting of two or more BGCs is presented in Fig. 2 and Dataset S3, revealing that these GCFs were predominantly found in the phyla Ascomycota (15,160, 90.5%) and Basidiomycota (882, 5.3%). The other three phyla encoded 2–70 phylum-specific GCFs, consistent with their small numbers of genomes and BGCs. However, the products encoded by these phylum-specific GCFs may be more easily obtained due to the clean secondary metabolic background. In addition, 575 GCFs harbor BGCs originating from multiple phyla, indicating that the core genes within these BGCs are relatively conserved across strains from diverse phyla. Interestingly, the genera known to produce the most secondary metabolites were also observed to encode the most genus-specific GCFs. Among the 13,075 genus-specific GCFs identified, Aspergillus encoded the largest number (2,298), followed by Fusarium (1,622), Penicillium (1,023), Colletotrichum (845), and Trichoderma (395). A total of 3,668 GCFs contain BGCs derived from multiple genera, approximately six times the number of phylum-nonspecific GCFs, implying that the core genes in a significant proportion of BGCs are conserved only at the phylum level. All BGCs within the same GCF belong to the same class. NRPS and PKS I still constitute the two predominant classes in the GCF dataset, accounting for 37.2% (6,230) and 33.8% (5,654), respectively. Biosynthetic-Pfam and sublevel Pfam features of the GCF models were used to construct the hierarchical relationship between the 16,743 GCFs consisting of two or more BGCs. The clustering results indicated that GCFs of the same class tend to cluster on nearby hierarchical branches, except for the PKS I and PKS-NRPS hybrid classes, which are highly intermixed. GCFs of the "Other" type are distributed across multiple core branches, consistent with the composition of their core biosynthetic gene clusters.
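The phrase "cosine-like (via l2-normalization)" can be made concrete with a small sketch. It shows only the general identity behind such a distance, namely that the squared Euclidean distance between two L2-normalized vectors equals 2(1 − cosine similarity). It does not reproduce BiG-SLiCE's feature extraction, scaling, or the meaning of T = 300, and all function names here are illustrative.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def cosine_like_distance(a, b):
    """Euclidean distance after L2-normalization; depends only on the angle."""
    return euclidean(l2_normalize(a), l2_normalize(b))


# Toy feature vectors (for example, domain counts for two BGCs).
bgc_1 = [3.0, 4.0, 0.0]
bgc_2 = [4.0, 3.0, 0.0]
d = cosine_like_distance(bgc_1, bgc_2)
cos = cosine_similarity(bgc_1, bgc_2)
```

Because the vectors are normalized first, two BGCs with proportional feature vectors have distance zero regardless of their absolute feature magnitudes.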

The fungal GCF atlas comprises 27,241 GCFs with a single BGC, 6,512 GCFs with two BGCs, 7,486 GCFs with three to ten BGCs, and 2,745 GCFs with more than ten BGCs. Only 254 GCFs harbor more than 100 BGCs, indicating the high species-specificity of these GCFs. Among PKS I-type GCFs, those with only two BGCs constitute 44.3% (2,505 of 5,654), a markedly higher percentage than in other categories. Conversely, among terpene-type GCFs, the largest share consists of GCFs with more than ten BGCs, reaching 29.5% (385 of 1,307). This indicates that PKS I-type BGCs are considerably more species-specific than terpene-type BGCs. The average cumulative BLAST score of BGCs within a GCF, calculated using KnownClusterBlast in antiSMASH, was employed to assess the novelty of the GCFs. The results indicate that 69.7% (11,675 of 16,743) of GCFs are entirely novel when compared to BGCs from the MIBiG database. Additionally, 8.0% (1,346 of 16,743) of GCFs exhibit an average cumulative BLAST score below 1000, and 15.3% (2,564 of 16,743) display an average cumulative BLAST score between 1000 and 5000. It is worth noting that 48.3% (949 of 1,965) of PKS-NRPS hybrid-type GCFs show an average cumulative BLAST score above 1000 and only 23 of them contain BGCs from MIBiG, indicating that the novelty of these BGCs is relatively low but that the majority of them remain uncharacterized. A large number of GCFs of other types also show low novelty, and the secondary metabolites encoded by these BGCs are well worth targeted exploration. In addition, among the 43,984 GCFs obtained, only 165 contain BGCs from MIBiG, further highlighting the great biosynthetic potential of fungi.
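The counts in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the text: the size bins sum to the 43,984 GCFs, the non-singleton GCFs number 16,743, and the GCFs not covered by the three stated novelty bins number 1,158, which matches the count of GCFs with a score above 5000 given in the discussion. Variable names are illustrative.

```python
# GCF counts by number of member BGCs, as stated in the text.
gcfs_by_size = {"1": 27_241, "2": 6_512, "3-10": 7_486, ">10": 2_745}

total_gcfs = sum(gcfs_by_size.values())
multi_member_gcfs = total_gcfs - gcfs_by_size["1"]

# Novelty bins among the multi-member GCFs, as stated in the text.
novelty_bins = {"no MIBiG hit": 11_675, "score < 1000": 1_346, "score 1000-5000": 2_564}
remaining_gcfs = multi_member_gcfs - sum(novelty_bins.values())

# Shares of the multi-member GCFs, rounded to one decimal place.
novelty_shares = {
    name: round(100 * n / multi_member_gcfs, 1) for name, n in novelty_bins.items()
}

# Share of all GCFs lacking any characterized MIBiG BGC (165 contain one).
GCFS_WITH_MIBIG = 165
uncharacterized_share = round(100 * (total_gcfs - GCFS_WITH_MIBIG) / total_gcfs, 1)

print(total_gcfs, multi_member_gcfs, remaining_gcfs, uncharacterized_share)
```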

3.3. Diversity and novelty of fungal cyclodipeptide synthases

The reported CDPSs are divided into three types: eight bacterial cyclodipeptide synthases (Type one CDPS, T1CDPS), seven eukaryotic arginine-containing cyclopeptide synthases (Type two CDPS, T2CDPS), and four other eukaryotic cyclopeptide synthases (Type three CDPS, T3CDPS). Using these characterized CDPS proteins as seed sequences, a total of 304 T1CDPS, 40 T2CDPS, and 15 T3CDPS candidates were identified by HMMER (Dataset S4), and their phylogenetic tree is presented in Fig. 3A. Interestingly, all T1CDPS homologs are found exclusively in Fusarium, while the T2CDPS candidates were identified from BGCs of Neofusicoccum (16), Aspergillus (13), and Trichophyton (11). The putative CDPSs were organized into 13 clusters, 6 of which consist of a single protein. All T1CDPS candidates fall into one cluster (cluster a). The T2CDPS candidates are divided into 5 clusters, with three large clusters containing 16 (cluster b), 12 (cluster c), and 8 (cluster e) proteins, respectively, while cluster g consists of 2 proteins and 2 seed sequences (Fig. 3B). The T3CDPS candidates are divided into two clusters, containing 6 (cluster e) and 5 (cluster f) proteins, respectively. Furthermore, the potential CDPSs from each of the genera Fusarium, Trichophyton, and Neofusicoccum fall within a single cluster, suggesting that Fusarium harbors the potential to synthesize type one diketopiperazine scaffolds, while Trichophyton and Neofusicoccum harbor the potential to synthesize type two diketopiperazine scaffolds.

The structure of the gene cluster similarity network for BGCs containing these CDPS homologs closely resembled the CDPS sequence similarity network (Fig. 3C). The T1CDPS homologs were mainly observed in BGCs of CDPS and PKS I types, the putative T2CDPS are mainly located in BGCs of CDPS-PKS I and CDPS-NRPS hybrid types, and the T3CDPS homologs are mainly located in BGCs of NRPS, NRPS-like and NRPS + T1PKS (Fig. 3C). The results indicate a high correlation between the CDPS sequences and the core biosynthetic genes within the same gene clusters, providing a promising opportunity for the targeted exploration of specific types of natural products based on CDPS sequences. The finding also highlights their potential to biosynthesize diketopiperazine-containing polyketides or peptides. For instance, BGCs containing CDPS from cluster 1 also harbor the essential genes for the biosynthesis of curvularin, indicating their potential to synthesize diketopiperazine-containing curvularin derivatives (SI Appendix Fig. S1). Most seed sequences of T1CDPS appeared as singletons under the given thresholds, indicating that the diversity of newly identified CDPS candidates is low, and the functionality of CDPSs within the same cluster is likely to be very similar.

3.4. Diversity and novelty of fungal diketopiperazine-forming NRPSs

Fungal diketopiperazine-forming NRPSs were identified based on four experimentally characterized NRPSs responsible for the biosynthesis of the diketopiperazine scaffold, using an HMM-based screening approach. A total of 175,418 hits were produced under the default settings, of which 24,808 were unique protein sequences containing two AMP-binding domains. As shown in SI Appendix Fig. S2, these 24,808 hits were plotted along the horizontal axis according to their HMM score and annotation results. A clear drop in the HMM score was observed around sequence 9,500. The first protein not annotated as an NRPS is sequence 9,473 (HMM score 796.6, BLAST e-value 2E-170). Consequently, all 9,472 proteins with HMM scores greater than 796.6, together with ten NRPSs showing comparable HMM scores and BLAST e-values of 0, were regarded as candidates for diketopiperazine-forming NRPSs (Dataset S5). They are distributed across multiple fungal phyla and are present in large numbers in genera such as Fusarium, Aspergillus, Penicillium, and Colletotrichum (Fig. 4A), which may explain the ubiquity of diketopiperazine natural products in fungi. The phylogenetic tree of the protein sequences indicates that diketopiperazine-forming NRPSs from the same fungal genus are mostly distributed across multiple branches, demonstrating a certain degree of diversity and genus-specificity. The majority of diketopiperazine-forming NRPSs with an HMM score > 1800 originate from Aspergillus and are classified into cluster e, which includes the characterized nonribosomal peptide synthetase (hasD) in the NRPS network. This underscores the promising potential of Aspergillus for synthesizing diketopiperazine natural products. The NRPS proteins in cluster b mostly have moderate HMM scores (1200–1800) and the longest sequence lengths (> 10 k nt), and are widely distributed across multiple fungal genera, but none of the proteins in this cluster have been functionally validated.
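The cutoff rule described above (rank hits by HMM score and stop at the first protein that is not annotated as an NRPS) can be sketched as follows. This is a simplified illustration, not the authors' script: the tuple layout and the function name are assumptions, and the additional rescue of ten lower-ranked NRPSs with BLAST e-values of 0 is not modeled because its exact criterion is not given in the text.

```python
def select_by_first_non_nrps(hits):
    """Return the IDs ranked above the first hit not annotated as an NRPS.

    hits: iterable of (protein_id, hmm_score, is_nrps_annotated) tuples.
    """
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    selected = []
    for protein_id, _score, is_nrps in ranked:
        if not is_nrps:
            break  # the score of this hit defines the cutoff
        selected.append(protein_id)
    return selected


# Toy hits, deliberately given out of order.
toy_hits = [
    ("p3", 950.2, True),
    ("p1", 2100.0, True),
    ("p5", 640.0, True),
    ("p4", 796.6, False),  # first non-NRPS annotation in score order
    ("p2", 1500.5, True),
]
candidates = select_by_first_non_nrps(toy_hits)
```

With the toy data, `p1`, `p2`, and `p3` are retained, and `p5` is excluded even though it is annotated as an NRPS, because it ranks below the cutoff.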

The diketopiperazine-forming NRPSs in cluster a represent the most abundant and diverse group, with the widest range of HMM scores and sequence lengths. Moreover, they are extensively distributed across various fungal genera, and the two characterized diketopiperazine-forming NRPSs (ftmA and notE) are classified within this group (Fig. 4B). The NRPS proteins in clusters c and d each form a tight, separate clade in the phylogenetic tree. Their sequence lengths are all below 8000 nt, and they are mainly derived from Fusarium. CriC, the first fungal diketopiperazine-forming NRPS shown to catalyze the formation of a cyclic dipeptide from L-tryptophan and L-alanine, falls within cluster g together with eight NRPS proteins from Aspergillus (Fig. 4), suggesting that these proteins possess significant catalytic potential for the cyclization of L-tryptophan and L-alanine.

3.5. Discovery of diketopiperazines and the diketopiperazine-forming enzyme from Aspergillus sp. WHUF0304

The genus Aspergillus is one of the major sources of diketopiperazine natural products and diketopiperazine-forming enzymes. Mass spectrometry-guided molecular network analysis revealed that the strain Aspergillus sp. WHUF0304 may produce a series of novel diketopiperazine natural products (Fig. 5A). Therefore, we explored the chemical diversity and biosynthetic mechanisms of diketopiperazines in this strain to illustrate the biosynthetic characteristics of diketopiperazine scaffolds in fungi. A total of eighteen indole diketopiperazine alkaloids (1–18), including three new ones, were characterized from the fermentation culture of the marine-derived fungus Aspergillus sp. WHUF0304 (Fig. 5A). Aspergillan A (1) was similar to cryptoechinulin D [38], except that a furan moiety was formed in 1 by the connection of C-24 and C-27 via oxygen. Notably, 1 has a zero specific rotation and no Cotton effect in its ECD spectrum, which indicates that Aspergillan A is a racemate. The enantiomers of Aspergillan A, 1a and 1b, were separated by chiral HPLC using a Chiralpak IB column. ECD calculations with time-dependent density functional theory (TD-DFT) were performed, and the Boltzmann-averaged ECD spectra of (12S,28R,31R)-1 and (12R,28S,31S)-1 matched well with the experimental ECD spectra of 1a and 1b, respectively. Aspergillan B (2) was the dehydro analog of aspergilline D [39]. Compound 2 was also assumed to be a racemate due to its zero specific rotation and baseline ECD curve, and the absolute configurations of 2a and 2b were assigned as 12S, 21R, 29S and 12R, 21S, 29R, respectively. Aspergillan C (3) was identified as an analog of aspergilline B [39].

The structures of compounds 4–18 were identified as Cryptoechinulin D (4), Eurotinoid B (5), Variecolortide B (6), Variecolortide C (7), Isoechinulin A (8), Variecolorin G (9), Neoechinulin D (10), Variecolorin J (11), Cryptoechinulin C (12), Variecolorin O (13), Variecolorin H (14), Aspergilline B (15), Neoechinulin A (16), Neoechinulin B (17), and Dihydroneoechinulin B (18) based on comparison of HRESIMS, ¹H NMR, and ¹³C NMR data with previously reported literature [38-45]. All these indole diketopiperazine alkaloids are presumed to be synthesized from L-tryptophan and L-alanine. Gene-centric analysis revealed that Aspergillus sp. WHUF0304 does not encode a CDPS gene in its genome. However, it does contain a diketopiperazine-forming NRPS protein, which exhibits 94% sequence similarity to CriC identified from Eurotium cristatum NWAFU-1 (Fig. 5B). In addition, homologs of the other five post-modification genes in the cri gene cluster are also located near the target NRPS gene, and their sequences exhibit high similarity (97%-99%). This case highlights the value of the present study for the efficient exploration of novel diketopiperazines and their biosynthetic enzymes.

Fungi are one of the most promising sources of natural product-derived drugs. With extensive genomic data resources now available, the construction of an updated fungal BGC atlas holds significant promise for efficiently discovering novel natural products from fungi. Diketopiperazines are a prevalent class of natural products in fungi. Research into the diversity of cyclodipeptide biosynthesis based on the fungal BGC atlas provides a valuable reference for the targeted exploration of specific types of natural products. The present study established the most extensive fungal BGC atlas based on 13,125 fungal genomes, which systematically outlines the diversity, novelty, and distribution patterns of GCFs in fungi. Through gene-centric analysis, the study revealed the diversity, novelty, and specificity of CDPSs and diketopiperazine-forming NRPSs in fungi. Additionally, a mass spectrometry-guided approach, combined with genome mining, was used to explore the potential of an Aspergillus strain to produce diketopiperazine natural products.

In 2021, Robey et al. presented an interpreted atlas of BGCs predicted from 1,037 fungal genomes, unveiling notable distinctions in biosynthetic logic and chemical space between bacteria and fungi3. With a more than tenfold increase in the number of fungal genomes, we have compiled the most comprehensive fungal BGC atlas to date. This atlas not only delineates the distribution of BGCs within each genome but also elucidates the diversity, novelty, and distribution patterns of each GCF, providing valuable information for the rational discovery of fungal natural products. For instance, the atlas reports the average number of BGCs for all 1,105 fungal genera, revealing several less-studied genera (Diaporthe, Calonectria, Macrophomina, Eutypa, and Pyricularia) that encode more than 30 BGCs on average. Consistent with the study by Robey et al., the number of BGCs per genome varies significantly at both the phylum and genus levels. The genomes of the class Eurotiomycetes encode an average of 37.8 BGCs, which is lower than the 48 BGCs reported previously3. This difference is mainly due to the inclusion of a large number of genomes from the orders Onygenales and Chaetothyriales, which typically have fewer BGCs (Dataset S1).
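The per-genus averages discussed above can be illustrated with a minimal Python sketch. It assumes one (genus, BGC count) record per genome; the genus labels and counts below are invented toy values, not data from the atlas, and the function names are hypothetical.

```python
from collections import defaultdict


def mean_bgcs_per_genus(records):
    """Return {genus: mean BGC count} from (genus, bgc_count) pairs, one per genome."""
    totals = defaultdict(lambda: [0, 0])  # genus -> [sum of BGCs, number of genomes]
    for genus, n_bgcs in records:
        totals[genus][0] += n_bgcs
        totals[genus][1] += 1
    return {genus: total / count for genus, (total, count) in totals.items()}


def rich_genera(means, threshold=30.0):
    """Return the sorted genera whose mean BGC count exceeds the threshold."""
    return sorted(genus for genus, mean in means.items() if mean > threshold)


# Toy records with placeholder genus labels (assumption: not real atlas data).
toy_genomes = [
    ("GenusA", 52), ("GenusA", 48),
    ("GenusB", 3), ("GenusB", 5),
    ("GenusC", 30),
]
means = mean_bgcs_per_genus(toy_genomes)
bgc_rich = rich_genera(means)

# Genome counts taken from the text: 13,125 here versus 1,037 in Robey et al.
fold_increase = 13_125 / 1_037
```

With the genome counts quoted in the text, `fold_increase` is about 12.7, which is consistent with the "more than tenfold" statement.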

Because homologous gene clusters are highly prevalent in the fungal BGC atlas, BiG-SLiCE was employed to cluster the fungal BGCs into GCFs. The potential of fungi to synthesize novel secondary metabolite scaffolds can be evaluated from the diversity and novelty of these GCFs. Only complete BGCs, roughly defined as those not located on a contig edge, were included in constructing the fungal GCF atlas, thereby reducing the influence of fragmented genome data on the results. The novelty of GCFs was determined qualitatively, based on whether characterized BGCs from MIBiG are present within the family, or quantitatively, based on the average cumulative BLAST score from antiSMASH. Of the 43,984 GCFs obtained, only 165 contain BGCs from MIBiG, suggesting that the metabolite scaffolds encoded by 99.6% of GCFs remain underexplored. Additionally, 1,158 GCFs show an average cumulative BLAST score > 5000, implying that at least 993 GCFs likely encode new analogs of reported metabolite scaffolds that have not yet been explored. Based on the structure of fungal gene cluster families and their relationship with characterized BGCs from MIBiG, it is possible to annotate the metabolite scaffolds encoded by numerous fungal gene clusters. This also helps to identify BGCs and GCFs that are worth prioritizing for further investigation.
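The two novelty figures quoted above follow from simple arithmetic, sketched here in Python using only the counts given in the text. The variable names are illustrative. The lower bound assumes the worst case, in which every MIBiG-containing GCF also exceeds the BLAST score threshold.

```python
# Counts taken from the text.
total_gcfs = 43_984       # all GCFs in the atlas
gcfs_with_mibig = 165     # GCFs containing at least one characterized MIBiG BGC
gcfs_high_score = 1_158   # GCFs with average cumulative BLAST score > 5000

# Share of GCFs without any characterized member.
uncharacterized = total_gcfs - gcfs_with_mibig
pct_uncharacterized = 100 * uncharacterized / total_gcfs

# Even if all 165 MIBiG-containing GCFs were among the high-scoring ones,
# this many high-scoring GCFs would still lack a characterized member.
min_unexplored_analog_gcfs = gcfs_high_score - gcfs_with_mibig

print(f"{pct_uncharacterized:.1f}% uncharacterized")
print(f"at least {min_unexplored_analog_gcfs} high-scoring GCFs without a MIBiG member")
```

This reproduces the 99.6% and 993 values reported in the paragraph above.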

Until now, all widely recognized CDPSs have been identified from bacteria. In this study, we discovered 304 T1CDPS candidates in the fungal BGC atlas for the first time, all originating from the genus Fusarium. The sequence and functional diversity of these BCDPSs, and particularly their evolutionary origin, merit further investigation. Arginine-containing cyclodipeptide synthases (T2CDPS) and T3PKS were the two groups of experimentally characterized fungal cyclodipeptide synthases9,10. Here, we identified 40 T2CDPS candidates in the fungal BGC atlas, primarily originating from Neofusicoccum, Aspergillus, and Trichophyton, which further expands the known diversity of cyclodipeptide biosynthesis in fungi. NRPSs have been identified as the primary catalysts for diketopiperazine biosynthesis in fungi. They are extensively distributed across genera such as Aspergillus, Penicillium, and Talaromyces, which are known for producing diketopiperazine natural products6. However, because diketopiperazine biosynthesis by two single-module NRPSs or by a multi-module NRPS has been reported only in bacteria, these types of NRPSs were not included in this study. Overall, this research systematically explored the diversity of CDPSs and diketopiperazine-forming NRPSs in fungal genomes, laying the foundation for research on the biosynthesis of cyclodipeptide natural products. Additionally, it offers valuable guidance for the enzymatic synthesis of cyclodipeptide molecules.

In the present study, eighteen indole diketopiperazine alkaloids were isolated from a marine-derived fungus, Aspergillus sp. WHUF0304. This finding suggested that the strain might harbor genes encoding BCDPS or NRPS proteins involved in diketopiperazine synthesis. However, HMMER analysis indicated the absence of CDPS proteins and detected only one diketopiperazine-forming NRPS, which is highly similar to CriC. CriC has been characterized as the first fungal NRPS responsible for catalyzing the cyclization of L-tryptophan and L-alanine28. Consistent with this, all the compounds isolated in our study are derived from diketopiperazines of L-tryptophan and L-alanine. Moreover, the key post-modification enzymes required for the biosynthesis of these natural products are encoded near this NRPS gene. Hence, it can be inferred with confidence that these compounds are synthesized by the asi gene cluster, a homolog of the cri gene cluster. This case provides a useful reference for further exploration and biosynthetic studies of diketopiperazine natural products from fungi.
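As an illustration of how presence or absence of a protein family can be read from HMMER results, the following Python sketch parses the tabular (`--tblout`) output of `hmmsearch` and keeps hits below an E-value cutoff. The profile names, protein identifiers, example rows, and the 1e-5 cutoff are assumptions made for this example; they are not the profiles or parameters used in this study.

```python
def parse_tblout(text, max_evalue=1e-5):
    """Parse hmmsearch --tblout text into (target, query, evalue, score) tuples.

    Only hits with a full-sequence E-value <= max_evalue are kept.
    The cutoff is an illustrative assumption.
    """
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        target, query = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])  # full-sequence values
        if evalue <= max_evalue:
            hits.append((target, query, evalue, score))
    return hits


# Hypothetical tblout excerpt (identifiers and numbers are made up).
EXAMPLE_TBLOUT = """\
# target name  accession  query name  accession  E-value  score  bias  ...
protein_0001 - hypothetical_NRPS_profile - 1.2e-80 270.5 0.1 1.5e-80 270.2 0.1 1.0 1 0 0 1 1 1 1 -
protein_0002 - hypothetical_CDPS_profile - 0.5 10.2 0.0 0.7 9.8 0.0 1.1 1 0 0 1 1 0 0 -
"""

hits = parse_tblout(EXAMPLE_TBLOUT)
cdps_hits = [h for h in hits if h[1] == "hypothetical_CDPS_profile"]
nrps_hits = [h for h in hits if h[1] == "hypothetical_NRPS_profile"]
```

In this toy input, `cdps_hits` is empty and `nrps_hits` contains a single protein, which mirrors the pattern described for Aspergillus sp. WHUF0304: no CDPS and one diketopiperazine-forming NRPS candidate.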

Conclusions

In conclusion, we have presented the most comprehensive fungal BGC atlas to date, shedding light on many less-explored taxa that encode a vast biosynthetic diversity. The fungal GCF atlas delineates the diversity, novelty, and distribution patterns of GCFs in fungi, revealing that 99.6% of GCFs remain uncharacterized and that 91.7% are genus-specific. Gene-centric analysis underscored NRPSs as the primary contributors to diketopiperazine biosynthesis in fungi and confirmed the presence of T1CDPSs in fungi, with an exclusive distribution in the genus Fusarium. Bioinformatics analysis confirmed that three new indole diketopiperazine alkaloids isolated from a marine-derived Aspergillus strain are synthesized by an NRPS protein and several post-modification enzymes. This study underscores the diversification of diketopiperazine biosynthesis in fungi and offers valuable guidance for exploring specific types of natural products from fungi.

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Supplementary Information

Additional file 1 (Supplementary Information). Supplementary Methods and Supplementary Results. Table S1. ¹H (600 MHz) and ¹³C (150 MHz) NMR data of 1–3 (δ in ppm, CDCl₃). Table S2. Re-optimized conformers above 5% population of 1 and 2 calculated at the B3LYP/6-311G(d,p) level. Figure S1. The composition of BGCs containing CDPS from cluster 3. Figure S2. Result of the gene-centric screening approach for genes encoding diketopiperazine-forming NRPSs. Supplementary Notes (NMR data).

Additional file 2 (Supplementary Tables). Dataset S1. Genomic features and the antiSMASH results of the 13,125 fungal genomes. Dataset S2. Information about the 303,983 BGCs predicted from the 13,125 fungal genomes. Dataset S3. Information about the 16,743 GCFs (≥ 2 BGCs). Dataset S4. Information about the identified 359 CDPSs. Dataset S5. Information about the 9,482 diketopiperazine-forming NRPSs. Dataset S6. Cartesian coordinates for the re-optimized conformers of 1 and 2 calculated at the B3LYP/6-311G(d,p) level.

Funding

This work was supported by the National Key Research and Development Programs (nos. 2022YFC2804700 and 2022YFC2804104), the Fundamental Research Funds for the Provincial Universities of Zhejiang (no. RF-A2022013), and the National Natural Science Foundation of China (no. 42276137).

Author Contribution

B.W. led the methodology, original draft writing, and review and editing of the manuscript. T.Y. was primarily responsible for data curation, and also contributed to formal analysis and visualization, while supporting the original draft writing. H.L. worked on the investigation and led the validation process. Z.Z. and H.C. both worked on data curation, with Z.Z. also contributing to software development. G.H. was involved in both the investigation and software development. H.L. provided support for the methodology. W.Y. supported the investigation and software development. Y.Y. assisted with the review and editing of the manuscript. A.F. contributed to the validation. K.H. provided resources for the study. X.L. contributed to data curation and supervision. H.W. was involved in funding acquisition, project administration, and manuscript review and editing.

Acknowledgement

We thank all authors for their contributions to the present project.

Availability of data and materials

Not applicable.

References

van Santen JA, Poynton EF, Iskakova D, McMann E, Alsup TA, Clark TN, et al. The Natural Products Atlas 2.0: A database of microbially-derived natural products. Nucleic Acids Res. 2022;50(D1):D1317-D23.
Prescott TA, Hill R, Mas-Claret E, Gaya E, Burns E. Fungal Drug Discovery for Chronic Disease: History, New Discoveries and New Approaches. Biomolecules. 2023;13(6):986.
Robey MT, Caesar LK, Drott MT, Keller NP, Kelleher NL. An interpreted atlas of biosynthetic gene clusters from 1,000 fungal genomes. Proc Natl Acad Sci. 2021;118(19):e2020230118.
Wei B, Du AQ, Zhou ZY, Lai C, Yu WC, Yu JB, et al. An atlas of bacterial secondary metabolite biosynthesis gene clusters. Environ Microbiol. 2021;23(11):6981–92.
Wei B, Hu G-A, Zhou Z-Y, Yu W-C, Du A-Q, Yang C-L, et al. Global analysis of the biosynthetic chemical space of marine prokaryotes. Microbiome. 2023;11(1):144.
Wang X, Li Y, Zhang X, Lai D, Zhou L. Structural diversity and biological activities of the cyclodipeptides from fungi. Molecules. 2017;22(12):2026.
Borgman P, Lopez RD, Lane AL. The expanding spectrum of diketopiperazine natural product biosynthetic pathways containing cyclodipeptide synthases. Org Biomol Chem. 2019;17(9):2305–14.
Gondry M, Jacques IB, Thai R, Babin M, Canu N, Seguin J, et al. A comprehensive overview of the cyclodipeptide synthase family enriched with the characterization of 32 new enzymes. Front Microbiol. 2018;9:46.
Yee DA, Niwa K, Perlatti B, Chen M, Li Y, Tang Y. Genome mining for unknown–unknown natural products. Nat Chem Biol. 2023:1–8.
Krause DJ, Kominek J, Opulente DA, Shen X-X, Zhou X, Langdon QK, et al. Functional and evolutionary characterization of a secondary metabolite gene cluster in budding yeasts. Proc Natl Acad Sci. 2018;115(43):11030–5.
Harken L, Li S-M. Modifications of diketopiperazines assembled by cyclodipeptide synthases with cytochrome P450 enzymes. Appl Microbiol Biotechnol. 2021;105:2277–85.
Blin K, Shaw S, Kloosterman AM, Charlop-Powers Z, Van Wezel GP, Medema MH, et al. antiSMASH 6.0: improving cluster detection and comparison capabilities. Nucleic Acids Res. 2021;49(W1):W29-W35.
Navarro-Muñoz JC, Selem-Mojica N, Mullowney MW, Kautsar SA, Tryon JH, Parkinson EI, et al. A computational framework to explore large-scale biosynthetic diversity. Nat Chem Biol. 2020;16(1):60–8. doi: 10.1038/s41589-019-0400-9.
Letunic I, Bork P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021;49(W1):W293-W6.
Kautsar S, van der Hooft J, de Ridder D, Medema M. BiG-SLiCE: a highly scalable tool maps the diversity of 1.2 million biosynthetic gene clusters. Gigascience. 2021;10(1):giaa154.
Huang Y, Li J, Chen S, Liu W, Wu M, Zhu D, et al. Advances in the biosynthesis of cyclodipeptide type natural products derived from actinomycetes. Chin J Biotechnol. 2023;39(11):4497–516.
Finn RD, Clements J, Eddy SR. HMMER web server: interactive sequence similarity searching. Nucleic Acids Res. 2011;39(suppl_2):W29-W37.
Giessen TW, von Tesmar AM, Marahiel MA. A tRNA-dependent two-enzyme pathway for the generation of singly and doubly methylated ditryptophan 2, 5-diketopiperazines. Biochemistry. 2013;52(24):4274–83.
McLean KJ, Carroll P, Lewis DG, Dunford AJ, Seward HE, Neeli R, et al. Characterization of active site structure in CYP121: A cytochrome P450 essential for viability of Mycobacterium tuberculosis H37Rv. J Biol Chem. 2008;283(48):33406–16.
Yu H, Xie X, Li S-M. Coupling of guanine with cyclo-L-Trp-L-Trp mediated by a cytochrome P450 homologue from Streptomyces purpureus. Org Lett. 2018;20(16):4921–5.
Yao T, Liu J, Liu Z, Li T, Li H, Che Q, et al. Genome mining of cyclodipeptide synthases unravels unusual tRNA-dependent diketopiperazine-terpene biosynthetic machinery. Nat Commun. 2018;9(1):4091.
Lautru S, Gondry M, Genet R, Pernodet J-L. The albonoursin gene cluster of S. noursei: biosynthesis of diketopiperazine metabolites independent of nonribosomal peptide synthetases. Chem Biol. 2002;9(12):1355–64.
Tian W, Sun C, Zheng M, Harmer JR, Yu M, Zhang Y, et al. Efficient biosynthesis of heterodimeric C3-aryl pyrroloindoline alkaloids. Nat Commun. 2018;9(1):4428.
Witwinowski J, Moutiez M, Coupet M, Correia I, Belin P, Ruzzini A, et al. Study of bicyclomycin biosynthesis in Streptomyces cinnamoneus by genetic and biochemical approaches. Sci Rep. 2019;9(1):20226.
Price MN, Dehal PS, Arkin AP. FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5(3):e9490.
Gerlt JA, Bouvier JT, Davidson DB, Imker HJ, Sadkhin B, Slater DR, et al. Enzyme function initiative-enzyme similarity tool (EFI-EST): a web tool for generating protein sequence similarity networks. BBA-Proteins Proteom. 2015;1854(8):1019–37.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–504.
Qi J, Han H, Sui D, Tan S, Liu C, Wang P, et al. Efficient production of a cyclic dipeptide (cyclo-TA) using heterologous expression system of filamentous fungus Aspergillus oryzae. Microb Cell Fact. 2022;21(1):146.
Yin W-B, Baccile JA, Bok JW, Chen Y, Keller NP, Schroeder FC. A nonribosomal peptide synthetase-derived iron (III) complex from the pathogenic fungus Aspergillus fumigatus. J Am Chem Soc. 2013;135(6):2064–7.
Maiya S, Grundmann A, Li SM, Turner G. The fumitremorgin gene cluster of Aspergillus fumigatus: identification of a gene encoding brevianamide F synthetase. ChemBioChem. 2006;7(7):1062–9.
Li S, Srinivasan K, Tran H, Yu F, Finefield JM, Sunderhaus JD, et al. Comparative analysis of the biosynthetic systems for fungal bicyclo[2.2.2]diazaoctane indole alkaloids: the (+)/(–)-notoamide, paraherquamide and malbrancheamide pathways. MedChemComm. 2012;3(8):987–96.
Kolmogorov M, Bickhart DM, Behsaz B, Gurevich A, Rayko M, Shin SB, et al. metaFlye: scalable long-read metagenome assembly using repeat graphs. Nat Methods. 2020;17(11):1103–10.
Oxford Nanopore Technologies Ltd: RNA and cDNA sequencing. https://nanoporetech.com/applications/techniques/rna-and-cdna-sequencing. Accessed 10 July 2024.
Langmead B, Wilks C, Antonescu V, Charles R. Scaling read aligners to hundreds of threads on general-purpose processors. Bioinformatics. 2019;35(3):421–32.
Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, et al. Twelve years of SAMtools and BCFtools. Gigascience. 2021;10(2):giab008.
Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One. 2014;9(11):e112963.
Stanke M, Keller O, Gunduz I, Hayes A, Waack S, Morgenstern B. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 2006;34(suppl_2):W435-W9.
Gao H, Zhu T, Li D, Gu Q, Liu W. Prenylated indole diketopiperazine alkaloids from a mangrove rhizosphere soil derived fungus Aspergillus effuses H1-1. Arch Pharmacal Res. 2013;36:952–6.
Wang M-L, Chen R, Sun F-J, Cao P-R, Chen X-R, Yang M-H. Three alkaloids and one polyketide from Aspergillus cristatus harbored in Pinellia ternate tubers. Tetrahedron Lett. 2021;68:152914.
Zhong W, Wang J, Wei X, Fu T, Chen Y, Zeng Q, et al. Three pairs of new spirocyclic alkaloid enantiomers from the marine-derived fungus Eurotium sp. SCSIO F452. Front Chem. 2019;7:350.
Chen G-D, Bao Y-R, Huang Y-F, Hu D, Li X-X, Guo L-D, et al. Three pairs of variecolortide enantiomers from Eurotium sp. with caspase-3 inhibitory activity. Fitoterapia. 2014;92:252–9.
Elsbaey M, Sallam A, El-Metwally M, Nagata M, Tanaka C, Shimizu K, et al. Melanogenesis inhibitors from the endophytic fungus Aspergillus amstelodami. Chem Biodiversity. 2019;16(8):e1900237.
Chen X, Si L, Liu D, Proksch P, Zhang L, Zhou D, et al. Neoechinulin B and its analogues as potential entry inhibitors of influenza viruses, targeting viral hemagglutinin. Eur J Med Chem. 2015;93:182–95.
Zhou LN, Zhu TJ, Cai SX, Gu QQ, Li DH. Three new indole-containing diketopiperazine alkaloids from a deep‐ocean sediment derived fungus Penicillium griseofulvum. Helv Chim Acta. 2010;93(9):1758–63.
Aoki T, Ohnishi K, Kimoto M, Fujieda S, Kuramochi K, Takeuchi T, et al. Synthesis and Neuroprotective Action of Optically Pure Neoechinulin A and Its Analogs. Pharmaceuticals. 2010;3(4):1063–9.
