Localization and Classification of Tn916 and Its Family in Bacterial Genomes
Using AAB60030.1 as a benchmark, 588 transposons of Tn916 and its family were localized; they were integrated into 259 genomic sites, predominantly within Bacillota (Table S3).
Among 429 integrases with at least 90% identity to AAB60030.1, 342 Tn916s were localized, integrated at 144 genomic sites (Table S3-1). Tn916s are scattered across Bacillota, Pseudomonadota, and Mycoplasmatota, with the vast majority in Bacillota, mainly in Staphylococcus aureus and S. pseudintermedius, which account for 18.13% and 13.54% of the localized Tn916s, respectively. Pseudomonadota and Mycoplasmatota are each represented by a single strain. Haemophilus ducreyi (Pseudomonadota) is a Gram-negative bacterium, while the others are Gram-positive. Tn916 has a conserved size of about 18 kb, but 68 Tn916s differed in size: the largest is Tn916Ef − 18465 (39,169 bp), whereas Tn916Ed − VREdu (CP042598.1) is 16,082 bp and lacks the resistance gene (tetM). Most genomes carried a single copy; however, 21 Tn916sCd of about 18 kb exist in 14 strains, 7 of which carry 2 copies, suggesting that about 50% of these strains have a weaker defense against Tn916. The direct repeats (DRs) of Tn916s are all AT-rich.
The Tn916 family was divided into classes according to integration sites, and the representative integrases were subjected to phylogenetic analysis using MEGA XI. The integrases of each class cluster together in the phylogenetic tree, and the classes were named clockwise Tn916.1 to Tn916.9 (Fig. 2).
The 70 Tn916.1 transposons, inserted into 36 integration sites, occur in Bacillota and Actinomycetota, mainly in Clostridioides difficile and Enterococcus faecium of Bacillota, which account for 20% and 50% of the Tn916.1 transposons, respectively (Table S3-2). Tn916.1 transposons range in size from 5 to 16 kb, although Tn916.1Bb − JCM7017 is exceptionally large (163 kb). The 80 Tn916.2 transposons, integrated into 48 sites, occur in Bacillota and Actinomycetota, mainly in C. difficile of Bacillota (Table S3-3). Tn916.2 transposons range in size from 12 to 59 kb. Two strains of Clostridium innocuum each carry 3 Tn916.1 transposons, one strain carries 3 Tn916.2 transposons, and two strains each carry 2 Tn916.2 transposons, indicating that this species has weaker resistance to the Tn916 family. Amedibacterium intestinale strain JCM 30884 carries 3 Tn916.2 transposons of approximately 30 kb, suggesting immune evasion.
The 43 Tn916.3 transposons were found only in C. difficile of Bacillota, inserted into 7 integration sites; every strain carries a single copy, and sizes range from 22 to 46 kb (Table S3-4). The 4 Tn916.4 transposons were found only in Clostridia and Bacilli of Bacillota, inserted into 2 integration sites, and are all about 32 kb in size (Table S3-5). The 5 Tn916.5 transposons were found only in Streptococcus thermophilus of Bacillota, all integrated at the same site, and range in size from 15 to 26 kb (Table S3-6). The 5 Tn916.6 transposons were found in Lachnospiraceae of Bacillota, inserted into 3 integration sites, and range in size from 31 to 36 kb (Table S3-7). The 16 Tn916.7 transposons were found only in E. faecium of Bacillota, inserted into 8 integration sites, and range in size from 33 to 57 kb (Table S3-8). The 8 Tn916.8 transposons were found mainly in Streptococcus constellatus of Bacillota, integrated into 3 genomic sites, and range in size from 7 to 18 kb (Table S3-9). The 15 Tn916.9 transposons were found only in Bacillota, mainly in C. difficile, integrated into 7 genomic sites, and range in size from 30 to 55 kb (Table S3-10).
Determination of Inverted Repeats for Tn916 and Its Family
The IRs of Tn916 and its family were determined through ClustalW alignment (Fig. S1-1 to S10-2), starting from the 5’ or 3’ end and ending where three consecutive mismatches occur (Table 1). However, the IRs of Tn916.5 and Tn916.6 could not be determined through ClustalW because too few data were available; their structures may have degraded over time in the genome, or existing methods may not have localized them accurately. The IRs of Tn916 and its seven families are all AT-rich. In Tn916.2, the IRs of Tn916.2Bb − NRBB49, Tn916.2Ef − E4438−1, and Tn916.2Ci − LCLUMCCI001–3 are inconsistent with those of other transposons at the same integration sites (Fig. S3-1 and S3-2).
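The stopping rule used for the IRs (extend from the end until three consecutive mismatches occur) can be illustrated with a minimal Python sketch. It is not the ClustalW-based procedure used here: it assumes the two ends have already been brought into register without gaps, and the function names and the choice to trim trailing mismatches are illustrative assumptions.

```python
# Illustrative sketch only; helper names are hypothetical.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def ir_length(end_a: str, end_b: str, max_run: int = 3) -> int:
    """Length of the repeat shared by two ungapped, aligned ends.

    Scanning starts at position 0 and stops at the first run of
    `max_run` consecutive mismatches; isolated mismatches are tolerated.
    Assumption: mismatches at the very end of the repeat are trimmed.
    """
    run = 0  # current number of consecutive mismatches
    for i, (a, b) in enumerate(zip(end_a.upper(), end_b.upper())):
        if a == b:
            run = 0
        else:
            run += 1
            if run == max_run:
                # The repeat ends just before the mismatch run began.
                return i - max_run + 1
    # No run of `max_run` mismatches: use the whole compared span,
    # minus any mismatches left hanging at the end.
    return min(len(end_a), len(end_b)) - run


# Toy sequences (not real Tn916 ends): the right end is compared
# as its reverse complement so that both ends read in the same direction.
left_end = "ACTTTGTAAA"
right_end = reverse_complement("ACTTAGTCCC")
toy_ir = ir_length(left_end, reverse_complement(right_end))
```

With these toy sequences, one isolated mismatch is tolerated and the scan stops at the run of three mismatches, so `toy_ir` is 7.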
Determination of the Boundary Stem-Loop Structures of Tn916 and Its Family
Analysis of the DRs and adjacent sequences of Tn916 and its family members with clearly defined IRs showed that both the upstream and the downstream sequences form AT-rich stem-loop structures. The upstream stem-loop structure is composed of a sequence from the upstream host genome + a 5 bp coupling sequence + the upstream IR; the downstream stem-loop structure is formed by the downstream IR + a 5 bp coupling sequence + a sequence from the downstream host genome, where the sequences from the upstream/downstream host genome are reverse complementary to the upstream/downstream IR sequences (Fig. 3 and Supplementary Files). Over 4 or 5 bp, the sequence from the upstream or downstream host genome is identical or complementary to one end of the IR sequence, with the conserved bases being G/A/TTTT (AAAA/T/C) (Fig. 3 and Table 1).
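The upstream arrangement described above (host sequence + 5 bp coupling sequence + IR) can be checked for stem formation with a short Python sketch. It is a simplified illustration, assuming strict Watson-Crick pairing with no bulges or G-U pairs; the function names and the toy sequence are hypothetical and are not taken from the supplementary data.

```python
# Illustrative sketch only; names and sequences are hypothetical.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def stem_pairs(host_flank: str, ir: str) -> int:
    """Count consecutive base pairs of the stem, starting next to the loop.

    The last base of the host flank pairs with the first base of the IR,
    the second-to-last with the second, and so on, until pairing fails.
    """
    n = 0
    for a, b in zip(reversed(host_flank.upper()), ir.upper()):
        if PAIR.get(a) != b:
            break
        n += 1
    return n


def upstream_stem(region: str, ir_len: int, coupling_len: int = 5) -> int:
    """Split `region` into host flank + coupling sequence + IR and
    return the stem length; the coupling sequence acts as the loop."""
    ir = region[-ir_len:]
    flank = region[: len(region) - ir_len - coupling_len]
    return stem_pairs(flank, ir)


# Toy region: 8 bp host flank + 5 bp coupling sequence + 8 bp IR.
toy_region = "GGCTAAAA" + "ACGTT" + "TTTTAGAC"
toy_stem = upstream_stem(toy_region, ir_len=8)
```

In this toy region the six bases adjacent to the coupling sequence pair with the start of the IR, so `toy_stem` is 6. The downstream structure could be handled in the same way by passing the reverse complement of the downstream region.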
Reorganization of Conserved Bases in the Coupling Sequences of Tn916 and Its Family
In the coupling sequences of Tn916 and its family, the first nucleotide is A or the last nucleotide is T, forming the combinations TTTA or TAAA (the underlined part is close to the coupling sequence in the stem of the stem-loop structure) (Fig. 3). For Tn916 and each family, we calculated the proportion of TTTA or TAAA combinations appearing at least once in the upstream/downstream stem-loop structures (the numerator is the number of TTTA or TAAA combinations, and the denominator is the number of integration sites of the transposon family). The proportions are 100% in both Tn916.3 and Tn916.4, 94.44% (34/36) in Tn916.1, 80% (36/45) in Tn916.2, 75.69% (109/144) in Tn916, 71.43% (5/7) in Tn916.9, and 50% (5/10) in Tn916.7, whereas Tn916.8 had too few integration sites and did not form TTTA or TAAA combinations. These results show that in Tn916, Tn916.1, Tn916.2, Tn916.3, Tn916.4, and Tn916.9, the conserved base in the coupling sequences is the first A or the last T.
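The proportions reported above follow directly from the stated counts, as the short Python check below illustrates. Only the counts given in the text are used; Tn916.3 and Tn916.4 are omitted because their counts are not stated, and the variable names are illustrative.

```python
# (number of TTTA/TAAA combinations, number of integration sites),
# copied from the counts stated in the text.
counts = {
    "Tn916.1": (34, 36),
    "Tn916.2": (36, 45),
    "Tn916": (109, 144),
    "Tn916.9": (5, 7),
    "Tn916.7": (5, 10),
}

# Percentage of integration sites with a TTTA or TAAA combination.
proportions = {
    name: round(100 * hits / sites, 2)
    for name, (hits, sites) in counts.items()
}

for name, pct in proportions.items():
    print(f"{name}: {pct:.2f}%")
```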
Modular Structure of the Tn916 Family
Most members of the Tn916 family contain the same four modules as Tn916 (Table S4). The recombination module always contains an integrase and an excisionase, with the identity of the excisionase proteins exceeding 75% (Fig. 4). The transcription regulation module invariably possesses orf7 (sigma factor) and orf8 (a potential transcription regulatory protein, HTH domain-containing protein) (Fig. 4). The auxiliary function modules are composed of different antibiotic resistance genes, with Tn916 carrying the tetracycline resistance gene. Tn916.2 is divided into three subtypes according to the auxiliary function module, named Tn916.2①, Tn916.2②, and Tn916.2③, all of which carry the ABC transporter gene cluster but show low identity among the modules. Tn916.3 and Tn916.9 also carry the ABC transporter system gene cluster, which causes multidrug resistance. The conjugation transfer modules of the Tn916 family vary significantly. The module structure could not be determined for Tn916.4, because too few transposons were localized, or for Tn916.1, because the functional genes of each module vary widely and some of the auxiliary function modules are missing.
Distribution of Tn916 and Its Family in Chromosomes
Tn916 and its family are primarily distributed within species such as C. difficile, C. innocuum, E. faecalis, E. faecium, S. aureus, S. pseudintermedius, S. agalactiae, S. pneumoniae, S. pyogenes and S. suis. The largest numbers of integration sites of Tn916 and its family were identified in C. difficile DSM 27639, E. faecium E7237, E. faecalis JY32, and C. innocuum I46 (CP022722.1). In C. difficile DSM 27639, 3 Tn916.2, 1 Tn916.1, 1 Tn916.3 and 1 Tn916.9 were integrated, providing multidrug resistance through various ABC transporter gene clusters. This strain also contains 39 other integration sites of the Tn916 family, indicating a weaker resistance to the Tn916 family (Fig. S11-a). In E. faecium E7237, 1 Tn916.7 was integrated, conferring vancomycin resistance, and 21 other Tn916 family integration sites also exist in this strain (Fig. S11-b). In E. faecalis JY32, 1 Tn916 was integrated, conferring tetracycline resistance, and 9 other Tn916 family integration sites also exist in this strain (Fig. S11-c). In C. innocuum I46 (CP022722.1), 3 Tn916.1 and 2 Tn916.2 were integrated, conferring multidrug resistance, and 6 other Tn916 family integration sites also exist in this strain (Fig. S11-d).