Phylogenetic analysis based on 16S rRNA gene sequences
The 16S rRNA gene sequence analysis showed that strain YIM B04394T was most closely related to R. toxinivorans DSM 16998T (98.3% similarity). The neighbor-joining and maximum-parsimony phylogenetic trees showed that strain YIM B04394T formed a distinct and stable clade within the genus Roseateles, with R. toxinivorans DSM 16998T as the closest relative and bootstrap support of 82–92% (Figs. S1 and S3). The maximum-likelihood phylogenetic tree clustered strain YIM B04394T with R. albus hw1T, R. koreensis hw8T, R. toxinivorans DSM 16998T and R. oligotrophus KCTC 42519T (Fig. S2). All three tree-building methods therefore placed strain YIM B04394T stably within the genus Roseateles. The GenBank accession number for the 16S rRNA gene sequence of strain YIM B04394T is PP813686.
Genomic analysis and genome‑based phylogeny
The genome-based phylogenetic tree showed that strain YIM B04394T formed a distinct and stable clade within the genus Roseateles, with R. asaccharophilus DSM 25082T as the closest relative, supported by a bootstrap value of 100% (Fig. 1). The dDDH and ANI values between strain YIM B04394T and the related type strains were 21.5% and 76.47%, respectively, with R. toxinivorans DSM 16998T, and 22.9% and 79.54%, respectively, with R. asaccharophilus DSM 25082T. These values are well below the currently established boundaries for genomic species delineation (70% for dDDH, 95–96% for ANI) (Wayne 1988; Richter and Rosselló-Móra 2009). Together with the phylogenetic placement, these results indicate that strain YIM B04394T belongs to the genus Roseateles and represents a novel species. Based on the 16S rRNA gene sequence similarity, the topology of the 16S rRNA gene trees and the genome-based phylogeny, R. toxinivorans DSM 16998T and R. asaccharophilus DSM 25082T were selected as reference strains, and their published data were used for comparison in this study.
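As an illustration of the comparison above, the following minimal Python sketch checks the reported dDDH and ANI values against the cited species boundaries. The function and variable names are hypothetical, and using 95% as the ANI cutoff (the lower end of the 95–96% range) is an assumption made for the example.

```python
# Species-boundary thresholds cited in the text (percent).
DDDH_THRESHOLD = 70.0
ANI_THRESHOLD = 95.0  # assumption: lower end of the 95-96% range

# Values reported for strain YIM B04394T against each reference strain.
comparisons = {
    "R. toxinivorans DSM 16998T": {"dddh": 21.5, "ani": 76.47},
    "R. asaccharophilus DSM 25082T": {"dddh": 22.9, "ani": 79.54},
}


def below_species_boundary(dddh: float, ani: float) -> bool:
    """Return True when both indices fall below the species thresholds."""
    return dddh < DDDH_THRESHOLD and ani < ANI_THRESHOLD


# One verdict per reference strain.
results = {
    name: below_species_boundary(values["dddh"], values["ani"])
    for name, values in comparisons.items()
}

for name, distinct in results.items():
    verdict = "distinct species" if distinct else "possibly the same species"
    print(f"{name}: {verdict}")
```

Both reference strains fall below both thresholds, which is the numerical basis for treating strain YIM B04394T as a separate species.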
Genome features
The assembled draft genome of strain YIM B04394T had a total length of 5,464,438 bp in 193 contigs, with an N50 of 629,318 bp, and contained 3 rRNA genes, 61 tRNA genes and 5,121 protein-coding genes. The DNA G+C content was 67.8 mol%. Genome completeness was estimated at 99.4%. The genomic features of the novel strain and the reference strains are presented in Table 1, and Fig. S4 shows the distribution of particular genomic regions of strain YIM B04394T. The draft genome of strain YIM B04394T has been submitted to DDBJ/ENA/GenBank under the accession number JBDPLH000000000.
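For readers unfamiliar with the N50 statistic reported above, the sketch below shows how it is commonly computed from a list of contig lengths: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly size. The function name and the toy contig lengths are hypothetical; the actual contig lengths of strain YIM B04394T are not given in the text.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths in bp."""
    total = sum(contig_lengths)
    running = 0
    # Walk through contigs from longest to shortest.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Stop once the cumulative length covers at least half the assembly.
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Toy example (hypothetical lengths, not the real assembly).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)
print(toy_n50)
```

In the toy assembly the total length is 300 bp, and the two longest contigs (80 + 70 = 150 bp) already cover half of it, so the N50 is 70.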
antiSMASH analysis of the genome predicted six biosynthetic gene clusters: one RiPP-like, one terpene, one NI-siderophore, one hydrogen cyanide and two N-acyl amino acid clusters. The results for strain YIM B04394T and the reference strains are shown in Table S1. All of these strains contain terpene and N-acyl amino acid gene clusters. N-acyl amino acids are amphiphilic molecules with varied fatty acid and head group moieties that play a key role in a variety of physiological functions (Prakash and Kamlekar 2021).
Functional analysis using the COG database annotated 4,581 genes of strain YIM B04394T (89.46% of all genes). The annotated genes were assigned to 4 categories and 22 types (Fig. S5). The most abundant type was function unknown (Type S, 1,386 genes), followed by amino acid transport and metabolism (Type E, 415 genes), signal transduction mechanisms (Type T, 351 genes), transcription (Type K, 325 genes) and inorganic ion transport and metabolism (Type P, 289 genes). CAZy annotation of strain YIM B04394T identified 141 genes assigned to 6 functional CAZy classes, with carbohydrate esterases, glycosyl transferases and glycoside hydrolases as the main classes (Fig. S6). Functional analysis using the KEGG database annotated 3,434 genes of strain YIM B04394T (Fig. S7).
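The COG percentages above can be reproduced with simple arithmetic. The short sketch below does so, assuming that "all genes" refers to the 5,121 protein-coding genes reported for the draft genome; that denominator and all variable names are assumptions made for illustration.

```python
# Counts reported in the text.
protein_coding_genes = 5121  # assumed denominator for "all genes"
cog_annotated_genes = 4581

# Five most abundant COG types and their gene counts.
top_cog_types = {
    "S (function unknown)": 1386,
    "E (amino acid transport and metabolism)": 415,
    "T (signal transduction mechanisms)": 351,
    "K (transcription)": 325,
    "P (inorganic ion transport and metabolism)": 289,
}

# Share of protein-coding genes with a COG annotation.
annotated_pct = 100 * cog_annotated_genes / protein_coding_genes
print(f"COG-annotated: {annotated_pct:.2f}% of protein-coding genes")

# Share of each top type among the COG-annotated genes.
type_shares = {
    cog_type: 100 * count / cog_annotated_genes
    for cog_type, count in top_cog_types.items()
}
for cog_type, share in type_shares.items():
    print(f"Type {cog_type}: {share:.1f}% of annotated genes")
```

Under this assumption the annotated fraction rounds to the 89.46% stated in the text, which suggests the percentage was indeed calculated against the protein-coding gene count.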
Morphological, physiological and biochemical characteristics
Strain YIM B04394T was Gram-stain-negative and aerobic. Transmission electron microscopy showed rod-shaped cells with a single polar flagellum (Fig. 2). Cells grown in R2A broth were motile when observed under a light microscope, consistent with the results of soft agar cultivation. The strain was weakly catalase-positive, producing few oxygen bubbles in 3% (v/v) aqueous hydrogen peroxide solution, and was oxidase-positive. Strain YIM B04394T grew well on R2A medium and weakly on NA. The strain grew at 10–40 °C (optimum, 28 °C), at pH 6.0–8.0 (optimum, pH 7.0) and at NaCl concentrations of 0–5% (optimum, 0%). Strain YIM B04394T was positive for nitrate reduction and for hydrolysis of gelatin, starch, Tween 40 and Tween 80, but negative for hydrolysis of Tween 20. In API 20NE and API ZYM strips, strain YIM B04394T was positive for urease, aesculin hydrolysis, esterase lipase (C8), lipase (C14), cystine arylamidase, trypsin, α-galactosidase and N-acetyl-β-glucosaminidase, and negative for tryptophan, arginine, D-galactose, D-glucose, D-mannitol, N-acetyl-D-glucosamine, gluconate, decylic acid, adipic acid, malic acid, citric acid, phenylacetic acid, esterase (C4), leucine arylamidase, valine arylamidase, chymotrypsin and acid phosphatase. A detailed physiological and biochemical comparison of strain YIM B04394T and related Roseateles species is presented in Table 2. The Biolog GEN III MicroPlate and antibiotic susceptibility results for strain YIM B04394T are shown in Tables S2 and S3, respectively.
Chemotaxonomic characteristics
The polar lipids of strain YIM B04394T comprised diphosphatidylglycerol, phosphatidylethanolamine and two unknown phospholipids (Fig. S8). MK-8 was the isoprenoid quinone present in strain YIM B04394T. The predominant fatty acids (>10%) of strain YIM B04394T were summed feature 3 (C16:1ω7c and/or C16:1ω6c, 37.36%) and C16:0 (29.09%). The fatty acid profile of strain YIM B04394T differed from those of closely related species (Table S4).