The number of isolates per patient ranged from 6 to 15, collected over 12 to 44 days between the first respiratory tract isolate and the last bloodstream isolate. Antimicrobial susceptibility testing indicated that all isolates were carbapenem-resistant A. baumannii and were susceptible only to colistin (Table S1). We detected two Pasteur sequence types (STs) and six Oxford STs among the isolates. All isolates belonged to the predominant clonal complex CC2. The isolates from two patients (C and L) and 11 isolates from Patient A belonged to Pasteur ST2. Four isolates from Patient A belonged to a new Pasteur ST of CC2. Different Oxford STs were identified in the three patients: ST1968 and a new Oxford ST in Patient A, ST469 and ST436 in Patient C, and ST195 and ST208 in Patient L (Table S1). A total of twelve isolates were selected for further analysis: the five bloodstream isolates from the three patients, and the other isolates that shared an Oxford ST with the bloodstream isolates of the same patient (Fig. 1).
Within-host genetic diversity from respiratory tract carriage to bloodstream infection
To facilitate detailed analysis of strain relationships, we constructed a robust phylogeny based on single nucleotide variants (SNVs) in the core regions of the genome to represent ancestral relationships among the isolates (Fig. 2). The phylogenetic tree indicated that isolates collected from different patients grouped into distinct clades. Isolates within each patient showed different genomic characteristics, and isolates with the same ST clustered together. The bloodstream isolates A14, C4, C6, L9 and L10 clustered with the respiratory tract isolates from the same patient.
We defined the amount of genetic diversity as the number of SNVs among the 12 selected isolates within each patient. SNVs were identified by mapping the sequence reads of each isolate against the first isolate with the same Oxford ST as the bloodstream isolate in each patient (A1 for Patient A, C2 for Patient C, and L6 for Patient L). A total of twenty-one SNVs were identified (Table S2). A complete list of the genic mutations that differed between the isolates is given in Table 1. Apart from one hypothetical protein, the genic SNVs affected srpA, gspJ, srmB, a Hep Hag repeat protein gene, fimV, sca1, an IS66 family transposase gene, pilR, pcaJ and tniA, which encode an organic peroxide-dependent peroxidase, a general secretion pathway protein, a DEAD box helicase family protein, a Hep Hag repeat protein, Tfp pilus assembly protein FimV, a phage-related minor tail protein, an IS66 family transposase, a sigma-54 interaction domain protein, 3-oxoadipate CoA-transferase and a mu transposase, respectively. Two of the affected genes encode the same protein, Tfp pilus assembly protein FimV. The gene encoding 3-oxoadipate CoA-transferase, pcaJ, had the highest density of SNVs: two missense variants and one synonymous variant were detected in it.
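The SNV-based definition of diversity used here can be illustrated with a minimal sketch. It assumes that each isolate's variant calls are already available as a set of (position, alternate base) pairs relative to the same within-patient reference isolate; the isolate names, positions and bases below are hypothetical and are not taken from Table S2.

```python
from itertools import combinations

# Hypothetical variant calls: isolate -> set of (genome position, alternate base),
# all called against the same within-patient reference isolate.
calls = {
    "ref": set(),                          # the reference differs from itself nowhere
    "iso1": {(1200, "T")},
    "iso2": {(1200, "T"), (5300, "G")},
}

def snv_distance(a, b):
    """Number of genome positions at which two isolates differ."""
    # The symmetric difference keeps calls present in only one isolate;
    # collapsing to positions counts a site once even if both isolates
    # carry different alternate bases there.
    return len({pos for pos, _ in a ^ b})

def pairwise_snvs(calls):
    """SNV distance for every unordered pair of isolates."""
    return {
        (x, y): snv_distance(calls[x], calls[y])
        for x, y in combinations(sorted(calls), 2)
    }

if __name__ == "__main__":
    for pair, dist in pairwise_snvs(calls).items():
        print(pair, dist)
```

Working on sets of calls rather than full alignments keeps the sketch short; a real pipeline would also need to mask sites that were not confidently called in every isolate.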
Table 1
The genic SNVs found from respiratory tract carriage to bloodstream infection in three A. baumannii bacteremia patients.
| No. | SNV type | Coding region change | Gene name/function | Patient ID (reference isolate ID) | Oxford ST (gltA-gyrB-gdhB-cpn60-recA-gpi-rpoD) | Isolate IDs |
|---|---|---|---|---|---|---|
| 1 | Stop gained | Gln113* | srpA, has an organic peroxide-dependent peroxidase activity | A (A1) | New (122-3-3-2-2-97-3) | **A14** |
| 2 | Disruptive inframe insertion | Phe120_Lys121 ins Asn | gspJ, general secretion pathway protein | A (A1) | New (122-3-3-2-2-97-3) | **A14** |
| 3 | Conservative inframe insertion | His458_Ala459 ins ValValLysValValLeuLysThrValHisValValAlaSerValValLysIleValHisValAlaSerSerThrGlnThrValHisValValLysValValLeuLysThrValGlnAsnValAlaSerValValLysIleValArgValAlaSerSerThrGlnIleValHis | srmB, belongs to the DEAD box helicase family | C (C2) | 469 (1-12-3-2-2-103-3) | **C4**, **C6** |
| 4 | Upstream gene variant | | hypothetical protein | C (C2) | 469 (1-12-3-2-2-103-3) | **C6** |
| 5 | Upstream gene variant | | Hep Hag repeat protein | C (C2) | 469 (1-12-3-2-2-103-3) | **C4**, **C6** |
| 6 | Missense variant; Synonymous variant | Thr4Lys; Asp5Asp | fimV, Tfp pilus assembly protein FimV | C (C2) | 469 (1-12-3-2-2-103-3) | C2 |
| 7 | Synonymous variant | Asp18Asp | fimV, Tfp pilus assembly protein FimV | C (C2) | 469 (1-12-3-2-2-103-3) | **C4**, **C6** |
| 8 | Upstream gene variant | | sca1, phage-related minor tail protein | L (L6) | 195 (1-3-3-2-2-96-3) | L6, L8, **L10** |
| 9 | Missense variant | Gly184Ser | Transposase IS66 family | L (L6) | 195 (1-3-3-2-2-96-3) | L6, L7, **L9**, **L10** |
| 10 | Upstream gene variant | | pilR, sigma-54 interaction domain | L (L6) | 195 (1-3-3-2-2-96-3) | L8, **L9**, **L10** |
| 11 | Missense variant; Synonymous variant | Ile91Leu; Phe90Leu; Arg89Arg | pcaJ, 3-oxoadipate CoA-transferase activity | L (L6) | 195 (1-3-3-2-2-96-3) | L7, L8, **L9**, **L10** |
| 12 | Synonymous variant | Leu16Leu | tniA, mu transposase, C-terminal | L (L6) | 195 (1-3-3-2-2-96-3) | L7, **L10** |
SNVs were identified by mapping the sequence reads of each isolate against the first isolate with the same Oxford ST as the bloodstream isolate in each patient. Bloodstream isolates are shown in bold.
The mean number of SNVs between consecutively sampled isolates ranged from 0.67 to 8 across patients (Fig. 3A). To assess whether the accumulation of substitutions was time-dependent, we fitted a linear regression model of the number of accrued substitutions against the corresponding sampling time (Fig. 3B). In general, A. baumannii isolates accumulated mutations over time, at different rates in the three patients. Patient C, whose within-host substitution rate was 4.35114 × 10⁻⁵ SNVs site⁻¹ year⁻¹, was the exception; the other two patients showed lower within-host substitution rates, ranging from 6.46753 × 10⁻⁶ to 7.52486 × 10⁻⁶ SNVs site⁻¹ year⁻¹ (Table 2). The Oxford STs of Patient A (new ST) and Patient L (ST195) belonged to clonal complex 208 (CC208), the predominant CC in China. The CC208 isolates had similar substitution rates, corresponding to the introduction of ≈ 29 substitutions per year on average. The non-CC208 isolates (Patient C) showed a faster substitution rate (182.5 substitutions per year).
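The regression step described above can be sketched as an ordinary least-squares fit of accrued SNVs against sampling time. This is a minimal illustration rather than the analysis actually run for Fig. 3B: the function name `linear_fit` and the data points are hypothetical, and time is assumed to be measured in days since the first isolate.

```python
def linear_fit(days, snvs):
    """Ordinary least-squares fit of snvs = intercept + slope * days.

    Returns (slope, intercept, r_squared). Requires at least two distinct
    time points and at least two distinct SNV counts.
    """
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(snvs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, snvs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination: 1 - residual SS / total SS.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(days, snvs))
    ss_tot = sum((y - mean_y) ** 2 for y in snvs)
    return slope, intercept, 1 - ss_res / ss_tot

if __name__ == "__main__":
    # Hypothetical sampling days and accrued SNV counts for one patient.
    days = [0, 7, 14, 28]
    snvs = [0, 1, 1, 3]
    slope, intercept, r2 = linear_fit(days, snvs)
    print(f"slope={slope:.4f} SNVs/day, intercept={intercept:.4f}, R^2={r2:.4f}")
```

With only a handful of isolates per patient, the fitted slope is sensitive to individual points, and a fit through exactly two time points always returns a coefficient of determination of 1.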
Table 2
Within-host nucleotide substitution rates from respiratory tract carriage to bacteremia.
| Patient ID | CC208 | R² | Estimate | Substitution rate (SNVs site⁻¹ year⁻¹) | SNVs year⁻¹ |
|---|---|---|---|---|---|
| A | Y | 0.9944 | 0.08647 | 7.52486 × 10⁻⁶ | 31.56155 |
| C | N | 1 | 0.5 | 4.35114 × 10⁻⁵ | 182.5 |
| L | Y | 0.8176 | 0.07432 | 6.46753 × 10⁻⁶ | 27.1268 |
R² denotes the coefficient of determination. The estimated regression coefficient is expressed as SNVs per week. The substitution rate is expressed as SNVs site⁻¹ year⁻¹.
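The last three columns of Table 2 are linked by simple unit conversions, which the following sketch reproduces. Two points are assumptions rather than statements from the text. First, the tabulated SNVs per year equal the regression estimate multiplied by 365, so the sketch treats the estimate as a slope per day; this is an inference from the tabulated values, not something the table note states. Second, the genome length of 4,194,304 sites is back-calculated from the ratio of the two rate columns and is not reported in this section.

```python
DAYS_PER_YEAR = 365
# Assumption: number of sites back-calculated from Table 2
# (SNVs per year divided by SNVs per site per year); not stated in the text.
GENOME_SITES = 4_194_304

def snvs_per_year(slope_per_day):
    """Convert a regression slope (SNVs per day) to SNVs per year."""
    return slope_per_day * DAYS_PER_YEAR

def snvs_per_site_per_year(slope_per_day, sites=GENOME_SITES):
    """Normalise the annual rate by the number of genome sites."""
    return snvs_per_year(slope_per_day) / sites

# Regression estimates from Table 2.
estimates = {"A": 0.08647, "C": 0.5, "L": 0.07432}

if __name__ == "__main__":
    for patient, slope in estimates.items():
        print(patient,
              round(snvs_per_year(slope), 5),
              f"{snvs_per_site_per_year(slope):.5e}")
```

Under these assumptions the sketch reproduces the tabulated annual and per-site rates for all three patients to the printed precision.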
In two of the three patients, intergenic SNVs occurred more frequently than synonymous and non-synonymous SNVs (Fig. 3C). Five intergenic SNVs were identified in Patient C, comprising 2 upstream gene variants and 3 intergenic variants. In Patient L, we likewise discovered a total of 5 intergenic SNVs, comprising 3 upstream gene variants and 2 intergenic variants. Non-synonymous mutations mostly occurred during the transition from respiratory tract carriage to bacteremia. In Patients A and C, non-synonymous variants were detected only during this transition. In Patient L, we discovered a total of 3 missense variants among the respiratory tract isolates. Functional analysis suggested that most of the mutations were in genes associated with intracellular trafficking, secretion and vesicular transport; DNA replication, recombination and repair; and cell motility (Fig. 3D).
Homologous recombination during within-host evolution
Homologous recombination is the major driver of evolution in bacterial pathogens. To identify or rule out recombination, we aligned the genomes of the isolates from each patient and looked for genomic regions with a high density of SNVs, a well-known signature of recombination. We found evidence of within-host recombination in one patient (Patient L). The number of recombination blocks ranged from 1 to 3, and the number of SNVs per recombination block was 14 (range: 8 to 36). We then assessed the ratio of SNVs imported via recombination to random substitutions (r/m) and the ratio of total recombination blocks to random substitutions (ρ/θ), which are widely used statistics for quantifying the contribution of recombination to genomic diversification. Averaged across all phylogenetic branches where recombination had occurred, r/m and ρ/θ were 3.02 (range: 0 to 26) and 0.35 (range: 0 to 3), respectively.
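As a rough illustration of the two statistics, the sketch below computes r/m and ρ/θ per branch from counts and then averages them over branches on which recombination was detected. The `Branch` fields and all counts are hypothetical; the sketch also skips branches with no point mutations, where the ratios are undefined, which is a simplification and not necessarily how the reported averages were obtained.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Branch:
    blocks: int            # recombination blocks inferred on this branch
    recomb_snvs: int       # SNVs falling inside those blocks
    point_mutations: int   # SNVs attributed to random substitution

def r_over_m(b):
    """SNVs imported by recombination relative to point mutations."""
    return b.recomb_snvs / b.point_mutations

def rho_over_theta(b):
    """Recombination blocks relative to point mutations."""
    return b.blocks / b.point_mutations

def summarise(branches):
    """Mean r/m and rho/theta over branches with recombination."""
    # Branches without point mutations are skipped to avoid division by zero.
    usable = [b for b in branches if b.blocks > 0 and b.point_mutations > 0]
    return (mean(r_over_m(b) for b in usable),
            mean(rho_over_theta(b) for b in usable))

if __name__ == "__main__":
    # Hypothetical per-branch counts.
    branches = [Branch(1, 8, 4), Branch(2, 30, 5), Branch(0, 0, 3)]
    print(summarise(branches))
```

Because a single block can import many SNVs at once, r/m can be far larger than ρ/θ on the same branch.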
Patient-specific gene content during within-host evolution
Gene content showed patient specificity. Only 1.1% to 9.4% of the A. baumannii gene clusters observed in a specific patient (i.e., the subject-specific pan-genome) and 88.5% to 92.8% of the gene clusters observed in all isolates of a specific patient (i.e., the subject-specific core-genome) were shared among all three patients, and 0.5% to 6.3% of core gene clusters were entirely unique to a patient (Fig. 4). Profiling of antibiotic resistance genes and virulence genes showed that most genes were consistent among isolates of the same MLST type within each patient (Table S3). Only one isolate, A2, contained the fosfomycin resistance gene fosB4. The gene armA was absent only in isolate L7. The gene aph(3')-Ia was lost in isolates L2, L4 and L5. Among the virulence genes, only bap differed among isolates of the same MLST type: isolates C4 and C7 had two biofilm-associated bap genes, whereas C2 had only one. These findings suggest a personalization of gene content at the population level. The A. baumannii population found in a single host retains the inherent diversity of multiple founder lineages, and further evolution of the A. baumannii gene repertoire occurs in a host-specific manner. Our results showed that, despite their wide distribution in the SNV-based phylogeny, A. baumannii isolates within a single host had constrained gene content diversity.
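The subject-specific pan-genome and core-genome defined above map naturally onto set operations. The sketch below is a minimal illustration with hypothetical patient labels, isolate labels and gene cluster names (`g1`, `g2`, ...); it is not the clustering pipeline used for Fig. 4, and it assumes that gene clusters have already been assigned to each isolate.

```python
def pan_genome(isolates):
    """Gene clusters seen in at least one isolate of a patient."""
    return set().union(*isolates.values())

def core_genome(isolates):
    """Gene clusters seen in every isolate of a patient."""
    return set.intersection(*isolates.values())

def shared_fraction(sets_by_patient):
    """Per patient, the fraction of its clusters present in all patients."""
    shared = set.intersection(*sets_by_patient.values())
    return {p: len(shared) / len(s) for p, s in sets_by_patient.items()}

# Hypothetical presence data: patient -> isolate -> set of gene clusters.
patients = {
    "P1": {"i1": {"g1", "g2", "g3"}, "i2": {"g1", "g2", "g4"}},
    "P2": {"i1": {"g1", "g2", "g5"}, "i2": {"g1", "g2"}},
}

if __name__ == "__main__":
    pans = {p: pan_genome(iso) for p, iso in patients.items()}
    cores = {p: core_genome(iso) for p, iso in patients.items()}
    print("pan shared:", shared_fraction(pans))
    print("core shared:", shared_fraction(cores))
```

Clusters unique to one patient can be obtained the same way, by subtracting the union of the other patients' sets from that patient's set.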