Changes in pericarp structure
The pericarp of A. trifoliata is known to crack longitudinally at maturity, and the seeds disperse from along the ventral suture with fruit cracking. First, the dynamic structures of the fruit pericarps in different development stages were observed in this study (Fig. 1). In the non-cracking stage (PS), the arrangements of pericarp cells and cuticles were dense, with small intercellular spaces, and were distributed continuously (Fig. 1a, 1d, 1g). However, the cell wall became thinner, the cell volume became larger, the number of cell layers decreased, and the arrangement of cells was loose with poor integrity; further, the spacing between the cells became bigger and the cell of exocarp and mesocarp began to degrade in the initial cracking stage (PM) (Fig. 1b, 1e, 1h). The cells were arranged irregularly and were not compact, and the cell layers continued to reduce, with larger cell spaces, and cells continued to degrade in the total cracking stage (PL) (Fig. 1c, 1f, 1i).
Transcriptomic analysis overview
To obtain an overview of the A. trifoliata transcriptome during fruit development and ripening, nine cDNA libraries (i.e. PS, PM and PL, each with three repeats) were constructed. A total of 47.05, 46.92, and 54.00 million raw sequence reads were produced from the PS, PM and PL libraries, respectively. After removing reads with indeterminate base ratios > 10 %, as well as low-quality reads and adaptor sequences, 46.45, 46.35, 53.46 million clean reads with the percentage of Q30 bases and GC contents of 91.58−93.75 % and 45.94−48.48 %, respectively, were obtained (Table S1). The resultant A. trifoliata transcriptome contained 241 376 transcripts, ranging from 201 to 2000 bp, and 186 054 unigenes (> 200 bp; Table 1 and Fig. S1); the details of the size distribution of the transcripts and unigenes are shown in Fig. S1.
To determine the putative functions of the assembled transcripts, all unigenes were annotated using Basic Local Alignment Search Tool (BLAST) searches against the five databases, including National center for biotechnology information non-redundant protein sequences database (NR) (100 329; 53.9 % and 41.57 % of all identified unigenes and transcripts, respectively), SwissProt (56 346; 30.3 % and 23.3 %), Protein families database (Pfam) (34 428; 18.5 % and 14.3 %), Gene Ontology database (GO) (44 558; 23.9 % and 18.5 %), and Kyoto Encyclopedia of Genes and Genomes pathway database (KEGG) (30 298; 16.3 % and 12.6 %). This indicated that the NR database provided the largest number of annotations, suggesting that 100 329 unigenes correspond with sequences from at least one of the public databases, and 7283 unigenes were annotated to all databases. Moreover, a total of 17 601, 19 281, and 13 525 unique unigenes and 45 866, 48 104, and 43 303 absent unigenes were identified in PS, PM and PL, respectively. Most of these were uncharacterized, unknown, and hypothetical proteins in the annotation results (Table S2-S3).
Among these unigenes, 11 205 were identified as differently expressed genes (DEGs) using an absolute log2 fold-change >1 with p < 0.05 during fruit ripening. There were 779 up-regulated and 1924 downregulated genes in the PM compared to levels in the PS group, which are presented in a volcano plot in Fig. S2a, and 4623 upregulated and 1975 downregulated genes in PL compared to expression in the PM group, and these are presented in a volcano plot in Fig. S2b. There were 1904 DEGs that were co-expressed in PM_PS and PL_PM combinations (Table 2).
Functional classification of the identified DEGs
To further understand the function of the identified DEGs, bioinformatics analysis was performed based on gene functional classification and hierarchical cluster analysis. GO analysis indicated that most of the DEGs in the biological process (BP) category were involved in the cellular amide metabolic processes and amide metabolic processes; structural molecular activity and oxidoreductase activity comprised the highest proportions of DEGs in the molecular function (MF) category of both PM_PS and PL_PM groups, respectively; cytoplasmic parts and intracellular ribonucleoprotein complexes comprised the highest proportion of DEGs in the cell components (CC) in PM_PS and PL_PM, respectively (Fig. 2a-2b).
Moreover, a KEGG pathway analysis was carried out to further evaluate the DEGs. Many were enriched in metabolic pathways, ribosomes in PM_PS, and biosynthesis of secondary metabolites and metabolic pathways in PL_PM (Fig. 2e-2f). Additionally, a hierarchical cluster analysis was performed to further understand the expression changes in the cell wall- related DEGs, (Fig. 3). In all, 285 cell wall-related DEGs, including those associated with pentose and glucuronate interconversions, the phenylpropanoid pathway, galactose metabolism, starch and sucrose metabolism, amino sugar and nucleotide sugar metabolism and transcription factors were clustered closely both in PM_PS and PL_PM group. Notably, most cell wall-related DEGs were downregulated in the PM_PS group but were upregulated in PL_PM group (Fig. 3a-3f).
Quantitative proteome analysis
To understand the molecular mechanisms of pericarp cracking in A. trifoliata fruits, a quantitative proteomics analysis was performed using the TMT platform and LC-MS/MS analysis during fruit ripening, to complement the transcriptome analysis. Accordingly, totals of 812 625 spectra,
68 151 identified spectra, 12 456 peptides, and 10 572 unique peptides were found by proteomic analysis and 2839 proteins were identified (Table 1 and Table S3). In terms of protein mass distribution, proteins with molecular weights greater than 9 kDa had a wide range and good coverage, with a maximum distribution area of 10–40 kDa (Fig. S3a). Peptide quantitative analysis of the proteins showed that protein quantity decreased with an increase in the matching peptide (Fig. S3b).
Among of these proteins, 240 were identified as differentially abundant proteins (DAPs) using a fold-change >1.2 and < 0.83 with p < 0.05 as the thresholds for upregulated and downregulated, respectively. Further, 84 proteins were more abundant and 106 proteins were less abundant in the PM_PS group and are shown in a volcano plot in Fig. S2c; 20 DAPs were more abundant and 13 DAPs were less abundant in the PL_PM group, and these are shown in a volcano plot in Fig. S2d, whereas 17 were co-expressed in PM_PS and PL_PM.
Functional classification of the identified DAPs
Bioinformatics analysis of DAPs was carried out based on protein functional classifications and hierarchical cluster analysis. GO analysis showed that most DAPs in the BP category were involved in cellular responses to the chemical stimulus and cellular oxidant detoxification processes in the PM_PS group, as well as the metabolic and macromolecule metabolic processes in the PL_PM. The highest portions of DAPs in MF category were the oxidoreductase activity and antioxidant activity in PM_PS, as well as structural constituent of ribosome, and structural molecule activity in PL_PM group. Extracellular region in the PM_PS group and cytoplasmic parts in the PL_PM group comprised the highest portions of DAPs in CC category (Fig. 2c-2d).
Moreover, a KEGG pathway analysis was carried out to further evaluate the DAPs. Many were enriched in two-component system and ribosome pathways in the PM_PS and PL_PM groups, respectively (Fig. 2g-2h).
Hierarchical cluster analysis was performed to further explore the expression changes in the cell wall-related DAPs. A total of 40 cell wall-related DAPs, including those associated with pentose and glucuronate interconversions, the phenylpropanoid pathway, galactose metabolism, starch and sucrose metabolism, amino sugar and nucleotide sugar metabolism, and cell wall metabolism- related proteins were clustered closely in both PM_PS and PL_PM groups. Notably, most cell wall-related DAPs were upregulated in both the PM_PS and PL_PM groups, whereas those DAPs involved in phenylpropanoid pathways and galactose metabolism were downregulated in the PM_PS group and up-regulated in the PL_PM group (Fig. 3g-l).
Comparative analysis between protein abundance and gene expression levels
There were more DEGs (4607 and 6598) than DAPs (190 and 50; PM_PS and PL_PM, respectively) and many more shared DEGs (1904) than shared DAPs (17) in cracking fruit compared to those in non-cracking fruit. Most of these DEGs and DAPs were downregulated in the PM_PS group but upregulated in the PL_PM group, suggesting that greater changes in fruit cracking occurred during fruit ripening. Among the shared DEGs, 1123 were upregulated and 781 were downregulated in the PM_PS group, whereas 808 were upregulated and 1096 were downregulated in PL_PM group. Of the shared DAPs, nine were increased in abundance and eight were decreased in abundance the PM_PS group, whereas eight exhibited increased abundance and nine showed decreased abundance in the PL_PM group (Table 2). To evaluate the relationships between the transcriptomic and proteomic changes during fruit pericarp cracking, the quantitative data for DEGs and DAPs were used for correlation analysis. According to this, 14 and four DAPs and their corresponding DEGs were identified in the PM_PS and PL_PM groups, respectively. Of these, 12 DAPs (four with increased abundance and eight with decreased abundance) and four DAPs (two with increased abundance and two with decreased abundance) were regulated in the same direction as their corresponding DEGs in the PM_PS and PL_PM groups, respectively (Fig. 4a-4b); meanwhile, two were regulated in the opposite direction as their corresponding DEGs in the PM_PS group. There were more DEGs than DAPs in both the PM_PS and PL_PM groups, with significant differences in the trends of transcript levels and protein abundance.
Furthermore, the fold-changes of the DAPs indicated weak positive correlations with their corresponding DEGs based on Pearson’s correlation tests (r = 0.03 and 0.11, p < 0.01, in PM_PS and PL_PM, respectively; Fig. 4c-4d). The fold-changes in the DAPs were positively correlated with the DEG showing the same trend (r =0.9161 and 0.8, p < 0.01, in PM_PS and PL_PM, respectively; Fig. 4e-4f).
Identification of DAPs and DEGs associated with candidate pathways
To further clarify the biological functions of the co-regulated DEGs–DAPs genes, an enrichment analysis was conducted based on the GO and KEGG pathways analyses. The largest groups within the BP category were those linked to metabolic process and cellular process; catalytic activity and binding and cell and cell part were predominant in MF and CC categories, respectively, both in PM_PS and PL_PM groups (Fig. S4a-4b). In the PM_PS group, 14 DAPs were significantly enriched in seven pathways, both with respect to DEGs and DAPs, which included fructose and mannose metabolism pathway, phenylpropanoid biosynthesis, glutathione metabolism, ubiquinone and other terpenoid-quinone biosynthesis, pentose and glucuronate interconversions, amino sugar and nucleotide sugar metabolism, and galactose metabolism. In the PL_PM group, four DAPs were significantly enriched in the three pathways, both in terms of DEGs and DAPs, which included a calcium signaling pathway, pentose and glucuronate interconversions, and galactose metabolism pathway (Fig S4c-4d). The comparative analysis showed that two pathways, including pentose and glucuronate interconversions and galactose metabolism pathways, were shared between the transcriptome and proteome data, for both PM_PS and PL_PM groups. However, the phenylpropanoid biosynthesis pathway was only shared by DAPs and DEGs in the PM_PS group. Therefore, those shared metabolic pathways might play potential roles in A. trifoliata fruit cracking.
Moreover, the protein–protein interaction (PPI) network was analyzed to predict the biological functions of A. trifoliata fruit cracking using the STRING database. A total of 85 DAPs, including 29 upregulated (28 and one in PM_PS and PL_PM, respectively) and 56 downregulated DAPs (46 and 10 in PM_PS and PL_PM, respectively) were assigned to the interaction network (Fig. 5). Only four cell-wall related proteins, including endoglucanase 8 (TRINITY_DN136333_c1_g2), β-glucosidase 33 (BGLU33; TRINITY_DN137437_c3_g1), peroxiredoxin-2 (TRINITY_ DN137008_c1_g6), and β-galactosidase1 (β-GAL1; TRINITY_DN138388_c1_g1) were determined to interact with other proteins.
Validation of data reliability through reverse transcription real-time quantitative PCR (qPCR)
To validate the results of the RNA-Seq and TMT data, expression and correlation analyses between the qPCR and the fragments per kilobase per million reads mapped (FPKM) values obtained from the transcriptome and proteome data were performed. The 20 selected DEGs, involved in phenylpropanoid biosynthesis, pentose and glucuronate interconversions, amino sugar and nucleotide sugar metabolism, galactose metabolism, and starch and sucrose metabolism, as well as other cell wall metabolism-related DEGs, had shown differential expression patterns in the PM_PS and PL_PM groups, and the results of the qPCR are shown in Fig. 6. Specifically, in the PM_PS group, phenylpropanoid pathway-related genes 4-coumarate-COA-ligase (4CL), peroxidase (PRX), and PRX2 were downregulated, but cinnamyl-alcohol dehydrogenase (CAD) and shikimate O-hydroxycinnamoyltransferase (HCT) were upregulated. The galactose metabolism-related genes β-galactosidases (β-GAL1 and β-GAL2), amino sugar and nucleotide sugar metabolism-related gene beta-D-xylosidase (BXL), starch and sucrose metabolism related genes cellulase (CEL), cellulose synthase-like protein (CSLG), and glucan endo-1,3-beta-D-glucosidase (ENDOB) were downregulated. Meanwhile, the cell wall metabolism genes NAC, NAC-like, and EXP1 were downregulated, but the BHLH transcription factor and dirigent protein (DIR2) were upregulated. Further, pentose and glucuronate interconversion-related genes PL, PG, and PE were upregulated. Most of these genes were significantly upregulated in the PL_PM group, except for 4CL, CAD, β-GAL, and EXP1. Moreover, the expression of 13 candidate genes, including DIR2 (r = 0.7977, p < 0.05), NAC-like (r = 0.9464, p < 0.01), EXP1 (r = 0.8582, p < 0.01), CAD (r = 0.8015, p < 0.01), β-GAL1 (r = 0.8440, p < 0.05), β-GAL2 (r = 0.6675, p < 0.05), 4CL (r = 0.7466, p < 0.05), ENDOB (r = 0.7052, p < 0.05), PE (r = 0.8042, p < 0.05), BHLH (r = 0.7485, p < 0.05), PG3 (r = 0.6819, p < 0.05), CEL (r = 0.8732, p < 0.05), and PRX2 (r = 0.8325, p < 0.01) showed strong correlations with the RNA-Seq data, and seven genes showed poor correlations with the corresponding protein expression (Fig. 6; Table 3).
Then, the corresponding genes of 17 selected DAPs were analyzed by qPCR to further validate the proteomic data, and the qPCR results of these genes are shown in Fig. 7. Specifically, PRX, PRX3, PRX4, PRX5, β-GAL1, β-GAL2, BXL, PG, and PG2 were significantly downregulated, whereas glucan endo-1,3-beta-glucosidase (ENBG) and furostanol glycoside 26-O-beta-glucosidase (F26G), involved in starch and sucrose metabolism, and PL, PE, EXP1, and DIR1 were significantly upregulated in the PM_PS group. In the PL_PM group, most of the genes were upregulated, except for DIR1, EXP1, PG, PG2, β-GAL1, β-glucosidase33 (BGLU33), and alpha/beta hydrolase (α-HY). Moreover, expression of the 14 candidate genes, including DIR1 (r = 0.8316, p < 0.05), PG2 (r = 0.8336, p < 0.05), EXP1 (r = −0.7482, p < 0.05), F26G (r = 0.8907, p < 0.01), BGLU33 (r = 0.7248, p < 0.05), PE (r = 0.7596, p < 0.05), PRX (r = 0.7489, p < 0.05), BXL (r = 0.6766, p < 0.05), α-HY (r = 0.8270, p < 0.01), PRX5 (r = 0.7481, p < 0.05), PRX6 (r = 0.6748, p < 0.05), β-GAL1 (r = −0.8215, p < 0.05), β-GAL2 (r = 0.7797, p < 0.05), and ENBG (r = 0.8917, p < 0.01) showed strong correlations with the TMT data. However, levels of three genes showed poor correlations with their corresponding protein expression (Fig. 7; Table 3). In general, the qPCR results confirmed the gene expression patterns obtained by transcriptomic and proteomic data, suggesting that our results are reliable.
Validation of data reliability by parallel reaction monitoring (PRM)
The protein expression levels obtained by TMT were confirmed by quantifying their expression by PRM analysis. 20 proteins that exhibited significantly different levels by TMT analysis were selected for PRM analysis, of which 18 were successfully quantified (Table 4). Here, 14 of the 18 (77.8%) proteins showed the same trend as that observed when the protein levels were quantified by TMT, including PE, PL, PG2, F26G, β-GAL2, Auxin efflux carrier, α-HY, PRX2, PG4, PRX5, PRX3, endoglucanase 19, endoglucanase 8, and DIR1. Meanwhile the mean expression levels of PRX, beta-fructofuranosidase, BXL, and BGLU33 proteins were inconsistent with the protein levels quantified by TMT. In general, the trends in the expression changes measured by PRM and TMT were basically consistent.
Analysis of proteins expressed in A. trifoliata fruit identifies genes that might play relevant roles in fruit cracking
To illuminate which genes might play key roles in cell wall metabolism pathways during fruit cracking, the expression profiles of 10 genes were analyzed. Among these, PL (TRINITY_DN143250_c1_g6) was significantly upregulated in the PM_PS group, PE (TRINITY_DN143028_c0_g1), β-GAL2 (TRINITY_DN142386_c5_g1), F26G (TRINITY_ DN142424_c1_g1), PG (TRINITY_DN196976_c0_g1), PG3 (TRINITY_DN142042_c0_g2), and BXL (TRINITY_DN141432_c1_g2) were upregulated in the PL_PM group, and BGLU33 (TRINITY_DN137437_c3_g1) was downregulated in the IS_NS group. PG2 (TRINITY_ DN142943_c1_g1) and PG4 (TRINITY_DN141074_c0_g1) were downregulated in the PL_PM group based on both the transcriptome and proteome (Fig. 8). Notably, PL, PE, and β-GAL2 were upregulated in the PM_PS, and PL_PM groups, respectively, based on transcriptome, proteome, qPCR, and PRM data.