Based on the comparison of the relative coverage of reads between Cappable-seq and Illumina RNA-seq with ANNOgesic tool, 1641 TSSs were identified across the genome of L. monocytogenes RO15 (Table S3). Among all TSS, 1233 had highest coverage within 300 bp upstream of an open reading frame (ORF) and were classified as primary TSS. Secondary TSSs (i.e. TSSs with lower coverage within 300 bp upstream of an ORF) were detected for 90 genes. The number of TSSs located within the coding sequence of a gene (i.e. internal TSSs) was 233. Moreover, 199 TSSs located on the antisense strand of a protein coding gene (antisense TSSs) and 55 orphan TSSs not assigned to any coding DNA sequence (CDS) were detected. To facilitate visual inspection of these results, we created a public online genome viewer page (https://icemduru.github.io/listeria_ro15_transcript/).
By comparing the Cappable-seq reads of the HPP-treated samples to those of the control samples, we were able to predict TSSs that were specific to the HPP treatment. In total six TSS were predicted to be specific to HPP-treated samples. The six TSSs specific to HPP-treated samples were associated with five genes and one tRNA (Table 1).
TSS of prophages
Our previous studies investigating the genome of RO15 strain [12] and gene expression in response to HPP stress [13] revealed that prophages were associated with the phenotypic characteristics of the strain. Notably, we observed upregulation of phage genes during HPP stress, suggesting induction of prophages during HPP stress [13]. This led us to further explore phage TSSs, which previously have received limited attention in L. monocytogenes strains.
Previously we identified five distinct prophage regions in the RO15 genome, ranging in size from 10.7 kb (prophage 1) to 41.3 kb (prophage 5) [12]. Prophage 5 was further observed as a circular form, suggesting a free virus genome version [12]. Among these regions, prophage 5 harboured the highest number of TSSs, with a significantly higher proportion of antisense TSSs compared to the other prophages (Table 2).
Table 2
Prophage regions in RO15 and the number of TSS.
Prophage Region | Number of genes | Primary TSS | Internal TSS | Antisense TSS | Orphan TSS | Total |
Prophage 1 (125824–136550 bp) | 17 | 1 | 1 | 0 | 0 | 2 |
Prophage 2 (673631–706883 bp) | 44 | 1 | 4 | 1 | 0 | 6 |
Prophage 3 (2175980–2223958 bp) | 79 | 7 | 4 | 3 | 1 | 15 |
Prophage 4 (2559206–2601984 bp) | 64 | 7 | 0 | 3 | 2 | 12 |
Prophage 5 (2729417–2770759 bp) | 67 | 10 | 1 | 7 | 0 | 18 |
Interestingly, one of the HPP-specific TSS (TSS:2221338_f) was located in the prophage 3 region. This TSS, which is antisense to the gene OCPFDLNE_02235 (encoding phage terminase large subunit protein), was specifically induced in response to HPP stress. Cappable-seq analysis revealed the presence of TSS signals at this location in all HPP-treated samples, while no such signal was detected in control samples. Additionally, short Illumina RNA-seq reads indicated the presence of opposite-strand reads in the OCPFDLNE_02235 gene region, supporting the potential involvement of antisense RNA in this gene. However, no sRNA was predicted on this region in our sRNA prediction workflow
Prediction of sRNAs in L. monocytogenes strain RO15
The ANNOgesic tool was used to predict sRNAs in L. monocytogenes RO15, resulting in 81 identified candidates (Table S4). The majority (53 of 81) were classified as intergenic sRNAs. A total of 20 sRNAs were found to originate from untranslated regions (UTRs), with four derived from 3'UTRs and 16 from 5'UTRs. Six of the predicted sRNAs were identified as antisense sRNAs (asRNAs), and two were classified as InterCDS sRNAs (located within coding sequences).
To investigate asRNAs and gene expression patterns during high-pressure processing (HPP) treatment, gene counts were performed. However, no anticorrelation was observed between fold changes in asRNAs and their corresponding genes (Figure S1).
In a previous study, nine antisense RNAs (asRNAs) were experimentally validated (by Northern blot) in Listeria monocytogenes strain EGD-e [19]. To investigate these asRNAs in strain RO15, we mapped them to the genome of strain RO15. We then examined our TSS prediction data for RO15 and found that two of these nine validated asRNAs had corresponding TSSs in RO15. Interestingly, only one of our predicted sRNA was in the same genomic location as a validated asRNAs in EGD-e. However, our prediction classified this sRNA as intergenic in RO15 because the adjacent gene was not found in RO15.
Prediction of promoters, and UTR length in L. monocytogenes strain RO15
We analyzed the upstream regions of all predicted TSSs (-50 bp to 1 bp) using the MEME suite to identify promoter motifs in L. monocytogenes. Our findings revealed that there is only one well-conserved region in the promoter regions, with a -10 position “TATAAT”-like motif. This motif is present in all three types of TSSs. Additionally, we observed that the − 10 position motif is slightly different in secondary TSS promoter regions compared with other types of TSS regions. This suggests that there may be some subtle differences in the regulation of secondary TSSs compared with other types of TSSs (Fig. 2).
We investigated the length distribution of 5' UTRs in L. monocytogenes RO15 using TSSs and calculating UTR lengths. Our analysis revealed that most of the 5' UTRs were shorter than 100 bases (Figure S2). We observed that the median length of 5' UTRs using primary TSSs was 33 bases. When secondary TSSs were also considered, the median UTR length increased to 34 bases (Figure S2).
Operon prediction in L. monocytogenes strain RO15
By using a combination of Illumina short and ONT direct long RNA-seq reads, we analyzed the operon structure of the L. monocytogenes RO15 genome. Our predictions revealed that 71% (2210 genes) of the genes were organized into operons (Table S5). A total of 658 operons were identified, with an average of 3.35 genes per operon (median = 2 genes). The largest predicted operons were predominantly found within the prophage regions. Large operons were also seen in the L. monocytogenes RO15 genome, such as the cobalamin biosynthesis operon, comprising 19 genes (OCPFDLNE_01235 - OCPFDLNE_01253), emerged as one of the largest operons within the entire genome.
Furthermore, we visually examined our predictions and observed strong support from both Cappable-seq and direct RNA sequencing reads. For instance, the manXYZ operon (OCPFDLNE_00834 - OCPFDLNE_00837) was predicted by our workflow, and both Cappable-seq and direct RNA sequencing reads helped us visually confirm the operon prediction (Fig. 3).
A-to-I RNA editing in L. monocytogenes strain RO15 and strain ScottA
During the TSS analysis, we carefully examined the previously published Illumina RNA-seq data from RO15 and ScottA strains [13]. Our analysis revealed several sequence variants that were not attributed to sequencing errors and were not detected in the gDNA-based PacBio or Illumina data [12]. These variants, consisting of A to G (T to C for reverse strand) transitions, are indicative of A-to-I RNA editing, a post-transcriptional modification that has been previously studied in other bacteria [22, 23].
The presence of A-to-I RNA editing variants was higher in HPP-treated samples for several genes (Table 3, Table 4, Table S6). A subset of these variants was specifically observed in HPP-treated samples, suggesting their potential role in HPP response mechanisms. A-to-I RNA editing within genes spoVG, pbpX, mdxE, patB, hpf, and pepC was particularly common in both strains under HPP stress. While the majority of observed A-to-I RNA editing variants were confined to HPP-treated samples, a subset of these variants were also detected in control samples. These variants, found within genes pflA, mdxE, secA, and kdpD in strain RO15, and actA, ldh, dhaS, LMOSA_26480, and mdxE in strain ScottA. Notably, most of these variants were observed as partial variants, indicating that approximately half of the RNA-seq reads contained the variant base in the transcript.
Table 3
Summary of A > G (T > C on reverse strand) variant positions and the corresponding gene in strain RO15. The table shows the positions of A > G variants in RNA of RO15, and the locus tag, gene name, number of samples with the variant in treated and control groups, reference and variant amino acids of the corresponding gene, respectively. The p-values between treated and control samples were calculated using Fisher’s exact test. For the sake of clarity, only variants detected in more than 7 samples are presented. The complete list of variants is available in the supplementary Table S6.
Variant Pos. | Locus Tag | Gene Name | # variants treated (n = 51) | # variants control (n = 52) | ref. aa | var. aa | p-value |
2269171 | OCPFDLNE_02280 | mdxE | 37 | 10 | Y | C | 4.90E-08 |
2649787 | OCPFDLNE_02685 | hpf | 21 | 1 | L | L | 3.47E-07 |
497744 | OCPFDLNE_00478 | dhaS | 12 | 0 | T | A | 9.18E-05 |
1768810 | OCPFDLNE_01761 | yfhP | 10 | 0 | L | L | 4.90E-04 |
1544500 | OCPFDLNE_01549 | | 9 | 0 | Y | C | 1.11E-03 |
429958 | OCPFDLNE_00419 | mngB | 8 | 0 | Y | C | 2.47E-03 |
204136 | OCPFDLNE_00200 | spoVG | 9 | 1 | Y | C | 7.41E-03 |
2428075 | OCPFDLNE_02434 | pepC | 8 | 1 | M | V | 1.50E-02 |
2288303 | sRNA_92 | rli47 | 8 | 1 | - | - | 1.50E-02 |
2646986 | OCPFDLNE_02684 | secA | 26 | 14 | L | L | 1.52E-02 |
1451307 | OCPFDLNE_01455 | pflA | 51 | 47 | L | L | 2.70E-02 |
1213571 | OCPFDLNE_01217 | pdtaS | 9 | 3 | L | L | 6.98E-02 |
Table 4
Summary of A > G (T > C on reverse strand) variant positions and the corresponding gene in strain ScottA. The table shows the positions of A > G variants in RNA of RO15, and the locus tag, gene name, number of samples with the variant in treated and control groups, reference and variant amino acids of the corresponding gene, respectively. The p-values between treated and control samples were calculated using Fisher’s exact test. For the sake of clarity, only variants detected in more than 7 samples are presented. The complete list of variants is available in the supplementary Table S6.
Variant Pos. | Locus Tag | Gene Name | # variants treated (n = 53) | # variants control (n = 54) | ref. aa | var. aa | p-value |
2675586 | LMOSA_4770 | hpf | 21 | 0 | L | L | 3.21E-08 |
1717344 | LMOSA_25410 | pepV | 15 | 1 | Y | C | 8.61E-05 |
242871 | LMOSA_10890 | spoVG | 12 | 0 | Y | C | 1.08E-04 |
2324765 | LMOSA_1570 | gloA | 12 | 0 | Y | C | 1.08E-04 |
720351 | intergenic_region | | 10 | 0 | - | - | 5.55E-04 |
1775078 | LMOSA_25880 | - | 10 | 0 | T | A | 5.55E-04 |
2520887 | LMOSA_3290 | patB | 10 | 0 | T | A | 5.55E-04 |
2481510 | LMOSA_2970 | pepC | 11 | 1 | M | V | 1.91E-03 |
s252859 | LMOSA_10970 | actA | 39 | 26 | T | T | 9.89E-03 |
2281668 | LMOSA_1140 | mdxE | 14 | 4 | Y | C | 1.01E-02 |
2715706 | LMOSA_5200 | - | 9 | 2 | T | A | 2.85E-02 |
1282535 | LMOSA_21080 | - | 11 | 4 | F | L | 5.54E-02 |
568103 | LMOSA_13830 | dhaS | 13 | 21 | T | A | 1.46E-01 |
1843087 | LMOSA_26480 | - | 53 | 52 | G | G | 4.95E-01 |
257308 | LMOSA_11030 | ldh | 53 | 54 | I | V | 1.00E + 00 |
Interestingly, we observed a gradual increase in the editing percentage with increasing HPP treatment time (Figure S3, Figure S4). However, the expression patterns of the genes harbouring these A-to-I RNA editing variants were not consistently altered. Both upregulation and downregulation were observed for these genes (Figure S5).
We examined the distribution of the RNA edited sites (Fig. 4). Our findings revealed distinct preferences for modified sites between the RO15 and Scott A strains. Notably, while the predicted consensus motif differed between the two strains, two four-base motifs (TACG and GTAA) emerged as the most prevalent edited motifs in both strains (Fig. 4).
RNA m6A modification prediction in L. monocytogenes strain RO15 and strain ScottA
In the RO15 HPP-treated sample, we predicted 23 sites with m6A modifications, while only three positions were identified as m6A modified in the RO15 control sample. m6A modifications targeting the dnaD and arsC genes were observed in both treated and control samples (Table S7). Notably, carbohydrate transmembrane transporter activity-related genes such as manX (encoding a mannose transporter), mntB (encoding a manganese transporter), and mdxE (encoding a Maltodextrin-binding protein) were frequently targeted by m6A modifications in the HPP-treated sample (Table S7).
In contrast to RO15, fewer m6A RNA modifications were observed in HPP treated samples of ScottA. In the treated sample, four positions were predicted as m6A modified, while in the control sample, 11 positions were predicted as m6A modified. The modifications that target the gadB gene were seen in both samples. In addition to the gadB gene, m6A modification was observed in the inlA gene (encoding an Internalin-A protein) (Table S7).