Strain A2 growth conditions
Strain A2 was grown in nutrient medium at NaCl concentrations of 0–20% (0–3.5 mol/l) at 35°C and showed optimal growth at 1.7 mol/l NaCl. Growth was observable after 36 hours at 0.5–2.6 mol/l (3–15%) NaCl. Below 0.5 mol/l and above 2.6 mol/l, no growth was detectable within the same period.
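The correspondence between the percentage and molar scales can be checked with a short calculation. The sketch below assumes that the percentages are weight/volume (g per 100 ml) and uses the molar mass of NaCl (58.44 g/mol); the function names are illustrative only.

```python
# Minimal sketch: convert NaCl concentrations between % (w/v) and mol/l.
# Assumption: percentages in the text are weight/volume (g per 100 ml).

NACL_MOLAR_MASS = 58.44  # g/mol


def percent_to_molar(percent_wv: float) -> float:
    """Convert % (w/v) NaCl to mol/l."""
    grams_per_litre = percent_wv * 10.0
    return grams_per_litre / NACL_MOLAR_MASS


def molar_to_percent(molar: float) -> float:
    """Convert mol/l NaCl to % (w/v)."""
    return molar * NACL_MOLAR_MASS / 10.0


if __name__ == "__main__":
    for pct in (3, 15, 20):
        print(f"{pct}% NaCl ~ {percent_to_molar(pct):.2f} mol/l")
    print(f"1.7 mol/l NaCl ~ {molar_to_percent(1.7):.1f}%")
```

Under this assumption, 3%, 15%, and 20% correspond to roughly 0.51, 2.57, and 3.42 mol/l, consistent with the rounded values reported for the growth range, and the 1.7 mol/l optimum corresponds to about 9.9% NaCl.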
Cellular and colonial morphology
The colonies of strain A2 had an indefinite morphology, with an irregular, light brown growth pattern that expanded in all directions and a slightly viscous consistency. The cells were small, around 1 µm in length (Fig. 1), with a coccobacillus morphology, and stained Gram-negative.
Genome Overview
The genome sequence of A2 comprised 3,855,926 bp in 33 scaffolds, with an N50 of 336,455 bp, an N90 of 84,770 bp, a maximum scaffold length of 483,539 bp, a minimum scaffold length of 648 bp, and a GC content of 67.4% (Table 1). The genome contains 3,509 genes, with a coding-gene percentage of 87.86%. The genome carries a total of 74 RNA genes: 62 tRNA genes, eight rRNA genes (one 16S, one 23S, and six 5S rRNA genes), and four sRNA genes (Table S1). In addition, about 303 tandem repeat sequences were found, as well as 211 minisatellites and five microsatellites (Table S2). Six databases were used to annotate the functions of the CDSs. The numbers of genes annotated by the GO, KEGG, COG, NR, Pfam, and Swiss-Prot databases were 2557, 3380, 2967, 3405, 2557, and 1848, respectively (Table 1).
Table 1
Project information and genome statistics
Attributes | Value |
Sequencing platform | Illumina |
Genome size (bp) | 3,855,926 |
GC content % | 67.4 |
Scaffold Max Length (bp) | 483,539 |
Scaffold Min Length (bp) | 648 |
N50 Length (bp) | 336,455 |
N90 Length (bp) | 84,770 |
Scaffolds | 33 |
Number of coding genes | 3509 |
RNA genes | 74 |
tRNA | 62 |
rRNA genes | 8 |
5S rRNA | 6 |
16S rRNA | 1 |
23S rRNA | 1 |
sRNA | 4 |
GO annotation | 2557 |
KEGG annotation | 3380 |
COG annotation | 2967 |
NR annotation | 3405 |
Pfam annotation | 2557 |
Swiss-Prot annotation | 1848 |
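The N50 and N90 values in Table 1 summarise assembly contiguity. As a minimal sketch of how such statistics are derived from scaffold lengths, the example below uses hypothetical toy lengths (the individual A2 scaffold lengths are not listed here), and the function name `n_stat` is an illustrative choice.

```python
# Minimal sketch of how N50/N90 are derived from scaffold lengths.
# The lengths below are hypothetical toy values, not the A2 scaffolds.


def n_stat(lengths, fraction):
    """Return the length L such that scaffolds of length >= L together
    cover at least `fraction` of the total assembly size."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    target = fraction * sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return min(lengths)  # only reached if fraction > 1


toy_lengths = [80, 70, 50, 40, 30, 20]  # hypothetical scaffold lengths (bp)

if __name__ == "__main__":
    print("N50:", n_stat(toy_lengths, 0.5))
    print("N90:", n_stat(toy_lengths, 0.9))
```

For the toy data, the two largest scaffolds already cover half of the 290 bp total, so the N50 is 70 bp, whereas five scaffolds are needed to reach 90%, giving an N90 of 30 bp.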
Genome annotation
For A2, the GO annotations (Fig. 2a) fall into three main categories: molecular function (3167), cellular component (2303), and biological process (5552). In the molecular function category, the subcategories with the most annotations were catalytic activity (1380) and binding (1173). In the cellular component category, cell part and cell had the most annotations, with 949 each. In the biological process category, the most represented subcategories were metabolic process (1467), cellular process (1401), localisation (569), and establishment of localisation (552).
Based on sequence homology, a total of 3344 assignments were made across 24 COG categories (Fig. 2b). The most abundant categories were amino acid transport and metabolism (E; 9.9%), general function prediction (R; 8.9%), energy production and conversion (C; 7.0%), translation, ribosomal structure, and biogenesis (J; 6.9%), transcription (K; 6.9%), and inorganic ion transport and metabolism (P; 6.0%). These functions are essential for the survival of halophilic bacteria in hypersaline environments (Zhang et al. 2022).
The KEGG annotations (Fig. 2c) fall into six categories with many annotated genes. Within the metabolism category, amino acid metabolism and carbohydrate metabolism have the highest numbers of annotated genes, with 250 and 191, respectively. The annotated amino acid metabolism routes include alanine, aspartate, and glutamate metabolism (ko00250), arginine and proline metabolism (ko00330), lysine biosynthesis (ko00300), and cysteine and methionine metabolism (ko00270), all closely related to the biosynthesis and degradation of ectoine and other compatible solutes, which form part of the glycine, serine, and threonine metabolism pathway (ko00260). Within carbohydrate metabolism, pyruvate metabolism (ko00620) has the most annotations, followed by glyoxylate and dicarboxylate metabolism (ko00630), propanoate metabolism (ko00640), and butanoate metabolism (ko00650).
Phylogenetic and phylogenomic analysis
Phylogenetic analysis of the 16S rRNA gene (1528 bp) showed that A2 belongs to the genus Halomonas. In the NCBI database, the sequence shares 99.9% similarity with the partial 16S rRNA gene sequence of Halomonas sp. strain K-15-10-3 (OP077305.1; 1507 bp), 99.2% with that of Halomonas salifodinae strain BC7 (EF527873.1; 1428 bp), and 99.1% with that of Halomonas pacifica strain NBRC 102220 (1463 bp); in the new classification by de la Haba et al. (2023), however, H. pacifica becomes Bisbaumannia pacifica NBRC 102220 (NR_114047.1). Among type strains in the EzBioCloud database, A2 shows 99.3% similarity with H. salifodinae BC7, 98.9% with B. pacifica NBRC 102220, and 97.6% with Halomonas sediminicola CPS11 (NR_152067.1). The phylogenetic tree (Fig. 3) constructed with the results from the NCBI and RDP databases shows the close relationship of A2 with H. salifodinae strain BC7 and B. pacifica strain NBRC 102220, which form a branch apart from the other species. The phylogenetic tree constructed with the MLSA genes (Fig. 4) also indicates a close relationship between A2 and H. salifodinae JCM14803 (GCA_039522585.1), and both are closely related to B. pacifica DSM 4742. The 16S rRNA gene phylogeny (Fig. 5) and the genome-to-genome comparison (Fig. 6) carried out in TYGS give similar results and corroborate the phylogenetic closeness of A2 to H. salifodinae JCM14803 and B. pacifica NBRC 102220, which are found in the same branch. The proteome of A2 shows considerably higher similarity to the proteome of H. salifodinae IM328, followed by that of B. pacifica NBRC 102220, than to the proteomes of other closely related Halomonas species (Fig. 7). Based on these data, the genome of A2 was compared with the genomes of closely related species using ANI and dDDH analysis (Table 2, S3, S4). The ANI and dDDH values between A2 and H. salifodinae IM328 (GCA_036871055.1) were 95.95% and 85.6%, respectively.
The corresponding values between A2 and B. pacifica NBRC 102220 were 89.9% and 69.7%, respectively; the ANI and dDDH values for the other related species did not exceed 81.1% and 30.6%, respectively (Table 2). Based on the species delimitation cut-offs of 95% for ANI, 70% for dDDH, and ≥ 98.7% 16S rRNA gene similarity (Chun et al. 2018; Meier-Kolthoff et al. 2022), and supported by the results above, Halomonas sp. A2 is designated as Halomonas salifodinae strain A2. Furthermore, it is proposed that Halomonas salifodinae belongs to the genus Bisbaumannia because it shows greater similarity to that genus than to the genus Halomonas.
Table 2
The ANI and dDDH analysis between A2 and its closely related species
Hit taxon | ANI (%) | dDDH (%) |
Halomonas salifodinae IM328 | 95.9 | 85.6 |
Bisbaumannia pacifica NBRC 102220 | 89.9 | 69.7 |
Halomonas alkalicola strain CICC 11012s | 81.1 | 29.6 |
Billgrantia campisalis strain A4 | 80.9 | 28.4 |
Halomonas shengliensis CGMCC 1.6444 | 80.6 | 28.5 |
Halomonas campaniensis 5AG | 80.6 | 30.6 |
Halomonas ventosae CECT 5797 | 80.1 | 28.6 |
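The species-delimitation rule applied to Table 2 can be written as a short check. The sketch below is illustrative only: it hard-codes the Table 2 values and the cut-offs cited in the text (ANI ≥ 95%, dDDH ≥ 70%), and the function names are assumptions rather than part of any published tool.

```python
# Minimal sketch: apply the species cut-offs cited in the text
# (ANI >= 95%, dDDH >= 70%) to the values listed in Table 2.

ANI_CUTOFF = 95.0   # %
DDDH_CUTOFF = 70.0  # %

# (hit taxon, ANI %, dDDH %) as listed in Table 2
table2 = [
    ("Halomonas salifodinae IM328", 95.9, 85.6),
    ("Bisbaumannia pacifica NBRC 102220", 89.9, 69.7),
    ("Halomonas alkalicola strain CICC 11012s", 81.1, 29.6),
    ("Billgrantia campisalis strain A4", 80.9, 28.4),
    ("Halomonas shengliensis CGMCC 1.6444", 80.6, 28.5),
    ("Halomonas campaniensis 5AG", 80.6, 30.6),
    ("Halomonas ventosae CECT 5797", 80.1, 28.6),
]


def same_species(ani: float, dddh: float) -> bool:
    """True when both genome-based cut-offs are met."""
    return ani >= ANI_CUTOFF and dddh >= DDDH_CUTOFF


def conspecific_taxa(rows):
    """Return the taxa whose ANI and dDDH values both pass the cut-offs."""
    return [taxon for taxon, ani, dddh in rows if same_species(ani, dddh)]


if __name__ == "__main__":
    print(conspecific_taxa(table2))
```

Only H. salifodinae IM328 passes both cut-offs; B. pacifica NBRC 102220 falls below both, although its dDDH value (69.7%) lies just under the 70% threshold.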
Secretory system analysis
The general secretion (Sec) and twin-arginine translocation (Tat) pathways are the bacterial secretion systems most commonly used to transport proteins across the cytoplasmic membrane (Natale et al. 2008). Most proteins transported by the Sec and Tat pathways remain inside the cell, either in the periplasm or the inner membrane. The Sec system can also transport proteins that must stay in the inner membrane through the signal recognition particle (SRP) pathway (Green and Mecsas 2016). The Sec pathway transfers the protein to the periplasmic space before it is folded, whereas the Tat pathway transfers the folded protein to the periplasm. In contrast, the SRP pathway allows the protein to be folded and transferred simultaneously (Valent et al. 1998). The Type I secretion system is dedicated to transporting digestive enzymes, such as proteases and lipases, as well as adhesins and heme-binding proteins (Green and Mecsas 2016). The Type VI secretion system, meanwhile, translocates proteins to various recipient cells, including eukaryotic cell targets and, more commonly, other bacteria (Russell et al. 2014). A total of 28 genes related to bacterial secretion systems were found in the genome of H. salifodinae A2 (Table S5): 11 belong to the Sec-SRP pathway, three to Tat, one to the Type I secretion system, and 13 to the Type VI secretion system.
Ion transport proteins analysis
In many halophilic microorganisms, potassium uptake and the synthesis of compatible organic solutes control osmotic regulation under high salinity; K+ limitation inhibits growth and adaptation to the saline environment, especially at high salinities, and causes a decrease in intracellular compatible organic solutes (Kraegeloh and Kunte 2002). The Trk and Pha1 systems regulate K+ transport and entry into the cell. The Trk system acts as a transmembrane transport protein, is ATP-dependent, and promotes K+ uptake through an electrochemical gradient (Kraegeloh et al. 2005). The Pha1 system functions as a K+/H+ antiporter involved in potassium transport, with optimal activity under slightly alkaline conditions; it can also transport Na+ (Yamaguchi et al. 2009). In A2, two genes belonging to the Trk system, trkA and trkH, and five Pha1 system genes involved in K+ transport were identified (Table S6). Surviving in high-salinity environments is complex: K+ uptake helps keep the osmotic pressure under control, but the expulsion of Na+ is also essential to maintaining balance. Therefore, precise systems must be available for Na+ transport. In A2, we found six genes of the Nqr system, which encode subunits of the Na+-transporting NADH:ubiquinone reductase, and seven genes of the multisubunit sodium/proton antiporter Mrp system (Table S6).
Secondary metabolites analysis
The antiSMASH 7.0 prediction tool was used to detect biosynthetic gene clusters (BGCs) in the genome sequence of H. salifodinae A2 (Table S7). The results indicate the presence of BGCs for the following types of metabolites. Ranthipeptides are an emerging class of natural products belonging to the ribosomally synthesised and post-translationally modified peptide (RiPP) superfamily; analysis of two ranthipeptides has shown that they participate in quorum sensing and the control of cellular metabolism (Chen et al. 2020). The resorcinol cluster is associated with the catabolism of the aromatic compound resorcinol (Yang et al. 2022). Betalactones represent a poorly explored group of secondary metabolites with pharmaceutical potential, many of them with potent bioactivity against bacteria, fungi, or human cancer cell lines (Džunková et al. 2023; Robinson et al. 2019). Siderophores are synthesised and secreted by many bacteria, yeasts, fungi, and plants for Fe(III) chelation under low-iron conditions (Timofeeva et al. 2022).
Genomic islands analysis
Horizontal gene transfer is an important mechanism of microbial genome evolution, allowing rapid adaptation and survival in specific niches. Genomic islands, commonly defined as groups of bacterial or archaeal genes of probable horizontal origin, are of medical, environmental, or industrial interest (Bertelli et al. 2019). Using IslandViewer 4, groups of genes associated with genomic islands in the genome of A2 were detected by the IslandPath-DIMOB and SIGI-HMM prediction methods (Fig. 8). The analysis revealed that A2 has 12 genomic islands totalling 168 kbp, of which the three largest are 34 kbp, 27 kbp, and 19 kbp in size. Among the gene groups present in the genomic islands (Table S8) are six genes that encode proteins directly related to Fe metabolism: a TonB-dependent receptor, the siderophore-iron reductase FhuF, an AraC family transcriptional regulator, a ferrioxamine B receptor, an iron complex transport system ATP-binding protein, and the Fe3+-hydroxamate ABC transporter permease FhuB. All are related to different Halomonas species, and these genes are of great importance in the response to alkaline stress (Zhai et al. 2021). Also present are two ISAs1 family transposases from Halomonas halodenitrificans; four Type I-F CRISPR-associated genes, two associated with proteins and two associated with a helicase and an endonuclease from Halotalea alkalilenta; and three genes associated with the fimbrial adhesin protein and pilus assembly protein from Alcanivorax sp. PN-3.
Prophage regions analysis
Prophages, owing to their ability to share genes, play a key role in bacterial adaptation, helping the host compete with other bacteria and adjust its metabolism to the environmental conditions in which it must survive and grow (Bondy-Denomy and Davidson 2014). The prolonged presence of a prophage within a bacterium can lead to the degradation of genetic sequences of the prophage genome, a phenomenon called “phage domestication” (Touchon et al. 2014). Newly integrated phages appear to be inactivated by the host, and unhelpful genes are then eliminated through point mutations and deletions in genetic regions that are not under selection. This process may explain why most prophage sequences found within bacterial genomes are incomplete and do not contain essential genes for interaction with the bacteria (Bobay et al. 2014). A prophage region extending over 8.2 kb and containing ten ORFs was found at position 31514–39718 within the A2 genome using the PHASTER server (Table S9). The region was classified as incomplete (score < 70) and matched the sequences of PHAGE_Entero_mEp460_NC_019716. Enterobacteriaceae phage mEp460 is a dsDNA virus of the class Caudoviricetes, which comprises bacterial and archaeal viruses with head-tail morphology.
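The reported region size and completeness class can be reproduced with a few lines. This is a sketch under stated assumptions: the coordinates are treated as inclusive, the score bands are those commonly described for PHASTER (intact > 90, questionable 70–90, incomplete < 70) and should be checked against the server documentation, and the scores passed to the function are hypothetical because the actual score of the A2 region is not given in the text.

```python
# Minimal sketch: size and completeness class of a predicted prophage region.
# Assumptions: inclusive coordinates; score bands as commonly described
# for PHASTER (intact > 90, questionable 70-90, incomplete < 70).


def region_length(start: int, end: int) -> int:
    """Length in bp of a region given inclusive start/end coordinates."""
    return end - start + 1


def completeness_class(score: float) -> str:
    """Map a completeness score to a class label."""
    if score > 90:
        return "intact"
    if score >= 70:
        return "questionable"
    return "incomplete"


if __name__ == "__main__":
    length_bp = region_length(31514, 39718)
    print(f"Region length: {length_bp} bp (~{length_bp / 1000:.1f} kb)")
    # Hypothetical score, for illustration only
    print(completeness_class(40))
```

With inclusive coordinates, the interval 31514–39718 spans 8205 bp, which matches the reported size of about 8.2 kb.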
CRISPR sequence analysis
CRISPR elements are important components of bacterial genomes, providing acquired immunity against viruses and plasmids (Horvath and Barrangou 2010). Several CRISPRs with a unique or different repeat sequence can be found in each strain, but only one of each type is associated with the cas genes because the spacers in the CRISPRs are different. The unique sequences, or spacers, correspond mostly to foreign DNA fragments, i.e. viruses, plasmids, or mobile genetic elements. Several genes, called cas, are associated with CRISPRs and are found near them. They perform three different functions of the immune system (adaptation, crRNA maturation, and interference), and their number varies from one type to another. Phylogenetic studies of Cas proteins suggest that CRISPRs are acquired through horizontal transfer. CRISPR loci are transcribed into a pre-crRNA from the leader, which acts as a promoter, and this precursor then matures into small crRNAs that play a role in the selection and destruction of homologous foreign sequences (Couvin et al. 2018; Grissa et al. 2007). In A2, 15 CRISPR regions of 100 to 130 nucleotides in length were located with evidence level 1, together with a CAS-TypeIF cas gene cluster region with evidence level 4 (Table S10, S11).
Biotechnological importance genes
Many microorganisms produce natural polyesters such as polyhydroxyalkanoates (PHA) as energy and carbon reserve materials under stressful growth conditions. These can subsequently be degraded by intracellular depolymerases and metabolised as an alternative carbon and energy source (Philip et al. 2007). PHAs have properties like those of conventional petroleum-based plastics, so in addition to being biocompatible, they are considered materials with high biotechnological potential in industrial applications (Keshavarz and Roy 2010). In A2, the genes phbB (acetoacetyl-CoA reductase), phbC (polyhydroxyalkanoate synthase), and phaR (polyhydroxyalkanoate synthesis repressor PhaR), which participate in the biosynthesis of intracellular PHA, were detected.
Alpha-amylase is an enzyme of biotechnological interest because it is used for its antibiofilm properties to control microorganisms. Bacterial biofilms are a significant threat to industries, the environment, and the healthcare sector (Goel et al. 2022). The amyA gene encoding alpha-amylase was detected among the unique genes for A2.
Strain A2 also carries multiple genes directly involved in arsenic resistance; these genes have been reported in some species of the family Halomonadaceae (Diken et al. 2015; Lin et al. 2012; Wu et al. 2018). The ars system of arsenic detoxification is the most studied; its operation begins with the uptake of arsenate by the phosphate transporters of the pstABCS complex (Lin et al. 2012) and the uptake of arsenite by aquaglyceroporins, followed by the transformation of As(V) into As(III) by arsenate reductases and the efflux of As(III) by arsenite efflux permeases (Wu et al. 2018). The presence of the genes arsA (arsenite/tail-anchored protein-transporting ATPase), arsB (arsenite transporter), arsC (arsenate reductase), and arsH (arsenical resistance protein ArsH) suggests that A2 is an arsenite-specific efflux and arsenic-reducing prokaryote. These arsenic resistance genes, together with previous studies performed in other species, support the genomic potential of A2 as a candidate organism for bioremediation studies (Diken et al. 2015). Genes of biotechnological importance found in A2 are listed in supplementary Table S12.
Pangenome Analysis
Pangenome analysis was performed with Roary v3.13.0 using GFF3 files generated by Prokka, and included the A2 genome and 100 reference genomes of described Halomonas and related species found in the NCBI database. Genes were assigned to four classes: ‘core’ (95% ≤ strains ≤ 100%), ‘softcore’ (90% ≤ strains < 95%), ‘shell’ (15% ≤ strains < 90%), and ‘cloud’ (0% ≤ strains < 15%). The analysis revealed 136,122 genes composing the pangenome of A2 and associated species at the 90% threshold. The number of core genes shared by 95–100% of strains was 317, while the softcore (near-core accessory) genes shared by 90% to < 95% of strains numbered 16. The shell genes, the accessory gene clusters widely distributed among the species and shared by 15% to < 90% of the strains, numbered 3,457. The cloud genes, the rare accessory genes that occur in 0% to < 15% of the strains and are also called unique genes, were the most numerous, with 132,332 genes. The large number of cloud genes implies significant heterogeneity between species (Carpi et al. 2022). This agrees with de la Haba et al. (2023), who reported significant genetic variability within the genus that prevents the formation of a monophyletic group, which is why the taxonomy of the genus was reorganised. The pangenome matrix and phylogenomic trees (Fig. 9, 10) constructed from the alignment of 152 core genes from 40 related species show a close relationship between H. salifodinae A2 and H. salifodinae IM328; both are most closely related to B. pacifica NBRC 102220, forming a branch separate from the genus Halomonas.
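The class boundaries used in this section can be expressed as a simple classifier over the fraction of genomes in which a gene cluster is present. The sketch below is illustrative: it uses the thresholds stated in the text (with softcore at 90% to < 95% and shell at 15% to < 90%), assumes 101 genomes (A2 plus 100 references), and does not reproduce Roary's internal implementation; the function name is hypothetical.

```python
# Minimal sketch: assign a gene cluster to a pangenome class from the
# number of genomes in which it occurs, using the thresholds in the text.

N_GENOMES = 101  # A2 plus 100 reference genomes, as described in the text


def classify_cluster(n_present: int, n_genomes: int = N_GENOMES) -> str:
    """Return 'core', 'softcore', 'shell', or 'cloud' for a gene cluster."""
    fraction = n_present / n_genomes
    if fraction >= 0.95:
        return "core"
    if fraction >= 0.90:
        return "softcore"
    if fraction >= 0.15:
        return "shell"
    return "cloud"


# Class sizes reported in the text
reported_counts = {"core": 317, "softcore": 16, "shell": 3457, "cloud": 132332}

if __name__ == "__main__":
    for n in (101, 95, 50, 1):
        print(n, "genomes ->", classify_cluster(n))
    print("Pangenome size:", sum(reported_counts.values()))
```

The four reported class sizes add up to 136,122, the pangenome size given above.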
Unique genes
In H. salifodinae A2, 691 unique genes were detected in comparison with the other species, and 431 in comparison with H. salifodinae IM328 alone. Thirty-eight of the unique genes are important for adaptability and of biotechnological relevance; they are involved in molybdenum, tungsten, copper, and iron metabolism, antimicrobial resistance systems, biofilm formation, and acid resistance, and include an entire bo3-type cytochrome oxidase system. Some of these are described below and in supplementary Table S13.
The trace elements molybdenum and tungsten are used by virtually all life forms as cofactors in many enzymes (Johnson et al. 1996). Bacterial genes for molybdenum- and tungsten-containing enzymes are often differentially regulated depending on the availability of the metal in the environment (Rajeev et al. 2019). Mo and W atoms share several chemical characteristics; for this reason, cells use several strategies to differentiate them and avoid inserting the wrong metal into the active site of the enzymes. These elements enter the cell as the soluble oxoanions MoO₄²⁻ and WO₄²⁻ through specific ATP-binding cassette (ABC) transporter systems. Within prokaryotes, these transport systems are divided into three families: Mod, Wtp, and Tup (Otrelo-Cardoso et al. 2014). The genes tupA (tungstate-binding protein TupA) and tupC (tungstate uptake system ATP-binding protein TupC) of the TupABC system were detected; they participate in the cellular uptake of tungsten. The TupA component is a periplasmic protein that binds tungstate anions, which are then transported across the membrane by the TupB component using ATP hydrolysis as an energy source (a reaction catalysed by the TupC component). In addition, the molybdenum metabolism genes modA (molybdate-binding protein ModA), mobA (molybdenum cofactor guanylyltransferase), mobB (molybdopterin-guanine dinucleotide biosynthesis adapter protein), and moeA_1 (molybdopterin molybdenumtransferase) were detected in A2.
Bacteria use flagella for mobility in adverse environments and movement to less hostile sites suitable for adaptation and survival. They are usually formed by multiple protein subunits (He et al. 2023). Flagella assembly is highly ordered and governed by a hierarchical mode of regulation that is tightly controlled by at least 17 operons composed of more than 50 genes (Chevance and Hughes, 2008). In A2, nine genes were found to be directly involved in synthesising proteins to structure some of the parts of the flagellum, such as the basal body, axial structure, or flagellar filament. Some of these genes are fliD1 Flagellar hook-associated protein 2, fliF Flagellar M-ring protein, and flgA Flagella basal body P-ring formation protein FlgA.
The genes of the cytochrome bo3 complex were identified as unique genes in A2. Cytochrome bo3 is encoded by the cyoABCDE operon (Forte et al. 2019). The products of cyoA, cyoB, and cyoC are related to subunits II, I, and III, respectively, of the eukaryotic and prokaryotic aa3-type cytochrome c oxidases, and the enzyme belongs to the heme-copper oxidase type A-1 superfamily (Sousa et al. 2012). The enzyme generates a proton motive force with high efficiency (H+/e− = 2) since it is endowed with proton-pumping activity (Puustinen et al. 1991). The cytochrome bo3 complex predominates under growth conditions where oxygen tension is high (Chepuri et al. 1990). The enzyme consists of four subunits and has three redox cofactors, a low-spin heme b, a high-spin heme o3, and CuB, all located in subunit I. Heme b is the primary acceptor of electrons from ubiquinol, while heme o3 and CuB form a binuclear active centre where O2 chemistry occurs (Melin et al. 2018). Contrary to what was previously thought, cytochrome bo3 contains only one ubiquinol binding site rather than two: the high-affinity QH site, located on subunit I (Choi et al. 2017).
The Type I protein secretion system genes prsE and prsD were identified as unique genes in A2. This system is responsible for the secretion of the EPS-glycanases PlyA and PlyB, two enzymes that act on exopolysaccharides (EPS) and are necessary for biofilm formation (Russo et al. 2006). These enzymes play a crucial role in biofilm formation by cleaving EPS chains and modulating biofilm structure and maturation (Lucke et al. 2020).
A2 also carries unique genes for a system of resistance to acidic environments. The genes adiC (arginine/agmatine antiporter) and adiA (biodegradative arginine decarboxylase) are reported in Escherichia coli as components of one of three acid resistance systems. In E. coli, the adi gene region is organised into two transcriptional units, adiAY and adiC, which are coordinately regulated but transcribed independently. Data from E. coli also illustrate that the AdiC antiporter and the AdiA decarboxylase are designed to function only at high acidity levels, sufficient to damage the cell (Gong et al. 2003). Furthermore, arginine is more than a common amino acid for protein synthesis; it can be used as the sole nitrogen source by E. coli and as a carbon source by many other bacteria. It also serves as a substrate for synthesising polyamines, which are essential for the extreme acid resistance of E. coli (Charlier et al. 2019).
Two unique genes, bepE and bepF, which are important in antibiotic resistance and were reported in Brucella suis, were detected in A2. These genes are involved in forming efflux pumps that facilitate the exit of various toxic compounds from the cell. Resistance nodulation-cell division (RND) type efflux pumps are responsible for the multidrug resistance phenotype observed in many clinically relevant species. Furthermore, RND pumps have been implicated in physiological processes, with roles in the virulence mechanisms of several pathogenic bacteria (Martin et al. 2009).
Genes for the biosynthesis of compatible solutes
Ectoine is synthesised from aspartate through a series of enzymatic reactions. First, L-aspartate-phosphate is synthesised by aspartate kinase (LysC) through ATP-dependent phosphorylation of L-aspartate and is then converted to L-aspartate-semialdehyde by L-aspartate-beta-semialdehyde dehydrogenase (Asd) in an NADPH-dependent reaction (Schwibbert et al. 2011). L-Aspartate-semialdehyde is transaminated to L-2,4-diaminobutyric acid (DABA) by DABA transaminase (EctB), encoded by the ectB gene. An acetyl group is then transferred to DABA from acetyl-CoA by DABA-N-acetyltransferase (EctA), encoded by the ectA gene, yielding N-acetyl-L-2,4-diaminobutyric acid (Ono et al. 1999). Finally, ectoine synthase (EctC), encoded by the ectC gene, catalyses the cyclic condensation of N-acetyl-L-2,4-diaminobutyric acid, leading to the formation of ectoine (Galinski et al. 1997; Ono et al. 1999). Under certain stress conditions, Halomonas elongata and some other halophiles can convert ectoine to 5-hydroxyectoine with ectoine hydroxylase, encoded by the ectD gene (Bursy et al. 2007; García-Estepa et al. 2006). All genes for ectoine and hydroxyectoine synthesis were found in A2. The classic configuration of these genes in the genus Halomonas is maintained in A2, with the ectABC genes arranged contiguously in the downstream direction, whereas the ectD gene is separated from them in the upstream direction. We compared the gene configurations with those of different bacterial species (Fig. 11). Furthermore, ectoine can also be used as a carbon source through the doeABCDX system. The degradation of ectoine begins with the hydrolysis of ectoine to N-α-acetyl-L-2,4-diaminobutyric acid by DoeA (ectoine hydrolase), followed by deacetylation of N-α-acetyl-L-2,4-diaminobutyric acid to L-2,4-diaminobutyric acid by DoeB and a transaminase reaction by DoeD to form L-aspartate-semialdehyde. Finally, DoeC can oxidise L-aspartate-semialdehyde to aspartate. 
DoeX is an Lrp/AsnC family transcriptional regulator. The doeABCDX genes are found in the genome of A2, suggesting that it has the putative ability to degrade ectoine.
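The synthesis and degradation routes described above can be represented as ordered enzymatic steps and checked against a list of annotated genes. The following is a minimal sketch: compound names are abbreviated, the gene set is transcribed from the genes this section reports for A2 (not parsed from an annotation file), and the helper names are hypothetical.

```python
# Minimal sketch: ectoine pathways as ordered (gene, substrate, product)
# steps, with a check for genes missing from an annotated gene set.
# DABA = L-2,4-diaminobutyric acid; compound names are abbreviated.

SYNTHESIS = [
    ("ectB", "L-aspartate-semialdehyde", "DABA"),
    ("ectA", "DABA", "N-acetyl-DABA"),
    ("ectC", "N-acetyl-DABA", "ectoine"),
    ("ectD", "ectoine", "5-hydroxyectoine"),
]

DEGRADATION = [
    ("doeA", "ectoine", "N-alpha-acetyl-DABA"),
    ("doeB", "N-alpha-acetyl-DABA", "DABA"),
    ("doeD", "DABA", "L-aspartate-semialdehyde"),
    ("doeC", "L-aspartate-semialdehyde", "L-aspartate"),
]


def is_connected(pathway):
    """True if each step's product is the substrate of the next step."""
    return all(prev[2] == nxt[1] for prev, nxt in zip(pathway, pathway[1:]))


def missing_genes(pathway, annotated):
    """Return pathway genes (in pathway order) absent from `annotated`."""
    return [gene for gene, _, _ in pathway if gene not in annotated]


# Genes reported for A2 in this section (illustrative, transcribed by hand)
reported_in_a2 = {
    "ectA", "ectB", "ectC", "ectD",
    "doeA", "doeB", "doeC", "doeD", "doeX",
}

if __name__ == "__main__":
    for name, pathway in (("synthesis", SYNTHESIS), ("degradation", DEGRADATION)):
        print(name, "connected:", is_connected(pathway),
              "missing:", missing_genes(pathway, reported_in_a2))
```

Both routes are complete with respect to the reported gene set; the regulator doeX is included in the set but is not an enzymatic step.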
The betaine synthesis genes betAB, betI, and betT are present in A2. Betaine is synthesised from the precursor choline in a two-step process involving choline dehydrogenase (BetA) and betaine aldehyde dehydrogenase (BetB), which convert choline to betaine aldehyde and betaine aldehyde to betaine, respectively (Cánovas et al. 1996). The betT gene encodes BetT, a high-affinity choline transporter, and betI encodes BetI, a transcriptional repressor of the bet genes (Scholz et al. 2016).
The use of glutamate and glutamine as compatible solutes is very common in bacteria. Glutamate synthesis depends on glutamate dehydrogenase (Gdh), encoded by gdhA, while glutamine synthesis depends on glutamine synthetase (Gln), encoded by glnA (Kloosterman et al. 2006). Both gdhA and glnA were found in the genome of A2.
The proABC genes involved in proline synthesis are found in the genome of A2. The proB gene encodes glutamate 5-kinase, proA encodes glutamate-5-semialdehyde dehydrogenase, and proC encodes pyrroline-5-carboxylate reductase (Goswami et al. 2022). In addition, other genes related to proline metabolism were found: proS, encoding a proline-tRNA ligase; proV, proW, and proX, encoding the ATP-binding protein, permease protein, and substrate-binding protein, respectively, of the glycine betaine/proline transport system; putA, encoding proline dehydrogenase; and putP, encoding a sodium/proline symporter. The genes for synthesising compatible solutes found in A2 are listed in supplementary Table S14.