Functional genomic analysis of L. d. bulgaricus LJJ
The raw sequencing data of L. d. bulgaricus LJJ were subjected to quality control based on the three generations of sequencing data [see Additional file 2]. The final genome assembly results are given in Table 1. Gene prediction for L. d. bulgaricus LJJ identified a total of 2003 ORFs, with a total length of 1,598,697 bp and an average length of 798.15 bp, accounting for 84.54% of the genome [see Additional files 3 and 4]. The genome-wide circle map of L. d. bulgaricus LJJ is shown in Figure 1. The results of GO annotation, COG classification, and KEGG classification are provided as additional material [see Additional file 5].
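The ORF statistics reported above are linked by simple arithmetic, which the following minimal sketch checks. The variable names are illustrative, and the genome size is an assumption back-calculated from the reported 84.54% coding fraction rather than a value quoted from the assembly results.

```python
# Reported values from the gene prediction summary.
orf_count = 2003
total_orf_length_bp = 1_598_697
coding_fraction = 0.8454  # ORFs account for 84.54% of the genome

# The average ORF length follows directly from the two totals.
average_orf_length_bp = total_orf_length_bp / orf_count

# Assumption: genome size is inferred here from the coding fraction,
# not taken from the assembly table.
implied_genome_size_bp = total_orf_length_bp / coding_fraction

print(f"Average ORF length: {average_orf_length_bp:.2f} bp")
print(f"Implied genome size: {implied_genome_size_bp:,.0f} bp")
```

The computed average agrees with the reported 798.15 bp, so the ORF count and total length are internally consistent.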
Comparative analysis of genomic sequences and basic features of L. d. bulgaricus LJJ and ATCC11842
Although the basic genomic characteristics of L. d. bulgaricus strains are similar, slight differences can exist between individual strains (Table 2). These differences may be attributed to evolutionary variation arising from environmental differences.
Comparative analysis of homology between L. d. bulgaricus LJJ and ATCC11842 genes
The results of the genome-wide homology analysis of L. d. bulgaricus LJJ and L. d. bulgaricus ATCC11842 are shown in Figure 2. A total of 1441 genes were homologous between the two strains, while 445 genes were unique to L. d. bulgaricus LJJ and 126 genes were unique to L. d. bulgaricus ATCC11842. The gene homology between L. d. bulgaricus LJJ and L. d. bulgaricus ATCC11842 is relatively high, reaching 84%. A small number of genes differed significantly or had no homolog in the other strain.
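The text does not state which denominator underlies the 84% homology figure. As a hedged illustration only, the sketch below computes several candidate ratios from the reported gene counts; the variable names and the choice of ratios are assumptions, not the authors' method.

```python
# Reported gene counts from the homology analysis.
shared = 1441        # genes with homology in both strains
unique_ljj = 445     # genes unique to LJJ
unique_atcc = 126    # genes unique to ATCC11842

# Fraction of each strain's compared genes that are shared.
shared_of_ljj = shared / (shared + unique_ljj)
shared_of_atcc = shared / (shared + unique_atcc)

# Fraction of the combined gene set that is shared.
shared_of_union = shared / (shared + unique_ljj + unique_atcc)

# Mean of the two per-strain fractions (one possible summary, an assumption).
mean_per_strain = (shared_of_ljj + shared_of_atcc) / 2

for label, value in [
    ("shared / LJJ genes", shared_of_ljj),
    ("shared / ATCC11842 genes", shared_of_atcc),
    ("shared / union of genes", shared_of_union),
    ("mean of per-strain fractions", mean_per_strain),
]:
    print(f"{label}: {value:.1%}")
```

The mean of the two per-strain fractions happens to fall close to the reported 84%, but this is only an observation and not a confirmed derivation of that value.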
Collinear analysis of L. d. bulgaricus LJJ and ATCC11842 genomes
The genome-wide collinearity analysis of L. d. bulgaricus LJJ and the type strain ATCC11842 is shown in Figure 3.
The results showed no large sequence rearrangement between the genomes of LJJ and ATCC11842, indicating good collinearity. Insertions, deletions, inversions, and translocations between the LJJ and ATCC11842 genomes occurred in three locally collinear blocks (LCBs). The insertion or inversion of short gene fragments implies that the two strains of L. d. bulgaricus may have undergone genetic recombination or gene transfer during evolution. It is therefore speculated that the insertion or deletion of these fragments is likely to contribute to the different acid tolerance of the two strains [10]. Follow-up work, such as knocking out the target genes, is required to validate this hypothesis.
Comparative Gene Ontology (GO) annotations of unique genes in L. d. bulgaricus LJJ and ATCC11842
The Gene Ontology annotation of the unique genes of LJJ and ATCC11842 covers three levels: cellular component, molecular function, and biological process. Compared with ATCC11842, LJJ has an advantage in cellular component categories such as cell part and membrane part [see Additional file 6]. In the molecular function classification, the sub-categories of transcription activity, binding, catalytic activity, and transporter activity play important roles. In the biological process classification, regulation, cellular processes, response to stimulus, single-organism processes, and metabolic processes are important.
LJJ is probably advantageous in these respects because these genes are associated with increased H+-ATPase activity on the cell membrane in acidic environments and play an important role in the acid tolerance process [11].
Cell membrane synthesis is an important biological property of the cell and requires the expression of a large number of genes related to cellular component functions [12]. Acid tolerance is the most important biological function of lactic acid bacteria against acid stress. The acid stress response involves the stimulation of cell information exchange, regulation of gene expression, and substance transport and metabolism, among other processes [13]. Consequently, biological processes are enhanced through the relationship between functional genes and molecular functions.
Clusters of Orthologous Groups (COG) comparative analysis of the unique genes in L. d. bulgaricus LJJ and ATCC11842
In comparison with ATCC11842, LJJ has a larger proportion of unique genes in the following categories: (L) replication, recombination, and repair; (M) cell membrane biosynthesis and outer membrane proteins; (C) energy production and conversion; (E) amino acid transport and metabolism; (V) defense mechanisms [see Additional file 6]. The number of functional genes related to cell protection in LJJ is significantly higher than in ATCC11842 and could be responsible for its higher acid tolerance. Although LJJ naturally inhabits acid-stressed environments, it could still have difficulty surviving when nutrients and energy are insufficient [14]. Without strong membrane and membrane protein synthesis or self-repair systems, even acid-tolerant probiotics struggle to survive in an acidic environment [6]. GO annotation and COG classification: comparative analysis of unique genes in L. d. bulgaricus LJJ and ATCC11842.
Preliminary screening of L. d. bulgaricus LJJ acid-tolerant genes
Based on the results of the comparative genomics, candidate acid-tolerant genes were initially screened from the unique genes of LJJ and ATCC11842 (Table 3). Most of the acid-tolerant genes initially screened are related to amino acid metabolism, suggesting a distinctive role for amino acids in bacterial acid tolerance.
PCR amplification and verification of acid-tolerant genes
Using the designed primers P1 to P16 and the genomes of 7 different L. d. bulgaricus strains as templates, 16 possible acid-tolerant genes of LJJ were amplified by PCR. The amplification results are shown in Figure 4. The results showed that all 7 strains of L. d. bulgaricus contained the target genes corresponding to the P1, P7, P10, P11, P14, and P16 primers. Most of these fragments were also similar in size, implying that the target genes had undergone no fragment insertion or loss. For the other primers, however, the target gene was not successfully amplified in some strains, suggesting that those strains do not contain the gene fragment. Where amplification succeeded but the amplified fragments differed in size, this indicates that the genes had undergone insertion or deletion in different strains.
Sequenced acid-tolerant genes and acid-tolerant-related gene verification in L. d. bulgaricus
The amplified fragment products were sequenced (Table 4). The fragment sizes obtained for the same gene differed markedly among strains. Some fragments were clearly shorter than the target gene fragment, which may reflect gene deletion during evolution. A few other fragments were significantly longer than the target gene. A possible reason is that fragment insertion occurred during the evolution of the gene, resulting in different fragment sizes and ultimately in changes of gene function. Consequently, the same gene is present in different forms in different strains, and some strains contain the gene while others do not. It is therefore speculated that, where the gene is deleted, the related acid-tolerant protein cannot be expressed normally, so the acid-tolerant metabolic pathway functions abnormally and acid tolerance is relatively weak.
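The size-based reasoning above can be sketched as a small classifier. This is an illustration only: the function name, the tolerance threshold, and all fragment lengths are hypothetical and are not taken from Table 4.

```python
def classify_fragment(observed_bp, expected_bp, tolerance_bp=20):
    """Classify an amplified fragment relative to the expected target length.

    observed_bp is None when no product was amplified.
    tolerance_bp is a hypothetical margin for treating sizes as similar.
    """
    if observed_bp is None:
        return "not amplified"
    difference = observed_bp - expected_bp
    if difference < -tolerance_bp:
        return "possible deletion"
    if difference > tolerance_bp:
        return "possible insertion"
    return "similar size"


# Hypothetical fragment lengths (bp) for one target gene in four strains.
expected_length = 900
observed_lengths = {
    "strain_A": 905,
    "strain_B": 610,
    "strain_C": 1150,
    "strain_D": None,
}

for strain, length in observed_lengths.items():
    print(strain, classify_fragment(length, expected_length))
```

A size difference alone only suggests an insertion or deletion; sequence alignment is still needed to confirm it.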
Sequence alignment analysis of acid-tolerant genes
The sequences of the selected acid-tolerant genes (No. 6, 12, and 13) were further analyzed, and the acid-tolerant genes dapA, dapH, and lysC were compared across strains to identify differences between them. The results of the alignment analysis are shown in Figure 5a, 5b, and 5c.
Figure 5(a) shows the alignment of dapA in different strains. The sequence alignment showed that large fragment deletions of dapA occur in JB, M13, ATCC11842, and GMC. It is speculated that the deletion of the DNA fragment may inactivate the active center of the enzyme, so that the lysine metabolic pathway can no longer be catalyzed to synthesize lysine. Thus, the acid tolerance of JB, M13, ATCC11842, and GMC is considered weak. Figure 5(b) and 5(c) show the alignment of the amplified sequences of dapH and lysC in different strains. The alignment showed that the sequences of these genes were consistent in the three strains with relatively strong acid tolerance, LJJ, SY3, and YL5. Based on the inferences from our study, dapH and lysC could be regarded as acid-tolerant genes of L. d. bulgaricus LJJ.
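Large deletions of the kind described for dapA appear in an alignment as long runs of gap characters. The following minimal sketch shows one way to locate such runs in a single aligned sequence; the function name, the minimum run length, and the toy sequence are hypothetical and do not reproduce the alignments in Figure 5.

```python
import re


def find_gap_runs(aligned_seq, min_length=10):
    """Return (start, end) positions of gap runs at least min_length long.

    Positions are 0-based and end is exclusive, following Python slicing.
    """
    return [
        (match.start(), match.end())
        for match in re.finditer(r"-+", aligned_seq)
        if match.end() - match.start() >= min_length
    ]


# Toy aligned sequence: one 12-column gap run and one 2-column gap run.
toy_alignment_row = "ATGGCT" + "-" * 12 + "GGATTC" + "--" + "TAA"

print(find_gap_runs(toy_alignment_row))
```

Only the 12-column run is reported here, because the 2-column run falls below the chosen threshold.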
Analysis of metabolic regulation of acid-tolerant genes of L. d. bulgaricus LJJ
According to previous research, acid tolerance in lactic acid bacteria involves the regulation of intracellular H+ [15], the regulation of the cell membrane [16], and the expression of stress response proteins [17]. Likewise, the regulation of amino acid metabolism, for example through the glutamate decarboxylase system [18] and the arginine deiminase (ADI) pathway [18], plays an important role in the regulation of intracellular pH.
In this study, we obtained three genes (dapA, dapH, and lysC) related to the acid tolerance of L. d. bulgaricus LJJ. The metabolic regulation of acid tolerance was analyzed using KEGG. The regulation of lysine synthesis by dapA, dapH, and lysC is shown in Figure 6.
The three important acid-tolerant genes dapA, dapH, and lysC mainly regulate the synthesis of lysine and participate in the lysine metabolic pathway. The lysC gene encodes an aspartate kinase, while the dapA gene encodes a 4-hydroxy-tetrahydrodipicolinate synthase. These two enzymes are important for the synthesis of lysine from aspartate and are involved in all variants of the pathway, namely the succinyl-DAP (diaminopimelate) pathway, the acetyl-DAP pathway, the DAP dehydrogenase pathway, and the DAP aminotransferase pathway. The dapH gene encodes a tetrahydrodipicolinate N-acetyltransferase, a key enzyme in the acetyl-DAP pathway.
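The gene, enzyme, and pathway relationships described above can be summarized in a small lookup structure. This is a sketch for illustration; the dictionary layout and the helper function are assumptions, and only the gene names, enzyme names, and pathway variants come from the text.

```python
# The four DAP pathway variants named in the text.
ALL_DAP_VARIANTS = (
    "succinyl-DAP",
    "acetyl-DAP",
    "DAP dehydrogenase",
    "DAP aminotransferase",
)

# Gene -> encoded enzyme and the pathway variants it takes part in.
LYSINE_GENES = {
    "lysC": {
        "enzyme": "aspartate kinase",
        "variants": ALL_DAP_VARIANTS,
    },
    "dapA": {
        "enzyme": "4-hydroxy-tetrahydrodipicolinate synthase",
        "variants": ALL_DAP_VARIANTS,
    },
    "dapH": {
        "enzyme": "tetrahydrodipicolinate N-acetyltransferase",
        "variants": ("acetyl-DAP",),
    },
}


def genes_in_variant(variant):
    """Return the sorted list of genes involved in a given pathway variant."""
    return sorted(
        gene for gene, info in LYSINE_GENES.items() if variant in info["variants"]
    )


print(genes_in_variant("acetyl-DAP"))
print(genes_in_variant("succinyl-DAP"))
```

In this representation, the acetyl-DAP variant is the only one that involves all three genes.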