Temperature-dependent biofilm analysis
Biofilm formation varied greatly between isolates at both 22°C and 37°C (Figure 1). Of the 280 isolates studied, 69% (n=193) produced more biofilm at 22°C (mean OD540 of 1.98) than at 37°C (mean OD540 of 1.29). OD540 measurements were less variable when isolates were grown at 37°C (mean = 1.291, standard deviation = 0.98) than at 22°C (mean = 1.983, standard deviation = 1.12) (Figure 2). The increased biofilm growth phenotype at 37°C was widely distributed across the phylogeny (Figure 3).
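The per-isolate comparison summarised above can be sketched in a few lines of Python. This is a minimal illustration and not the analysis code used in the study: the OD540 values are invented toy data, and the names `summarise`, `od_22` and `od_37` are hypothetical.

```python
from statistics import mean, stdev


def summarise(od_22, od_37):
    """Summarise paired OD540 readings (one pair per isolate)."""
    if len(od_22) != len(od_37):
        raise ValueError("readings must be paired per isolate")
    # Count isolates whose reading at 22°C exceeds the reading at 37°C.
    higher_at_22 = sum(a > b for a, b in zip(od_22, od_37))
    return {
        "n": len(od_22),
        "higher_at_22": higher_at_22,
        "fraction_higher_at_22": higher_at_22 / len(od_22),
        "mean_22": mean(od_22),
        "sd_22": stdev(od_22),  # sample standard deviation
        "mean_37": mean(od_37),
        "sd_37": stdev(od_37),
    }


# Toy data only (hypothetical values, not measurements from the study).
od_22 = [2.4, 1.9, 0.8, 3.1, 1.2]
od_37 = [1.1, 2.0, 0.5, 1.6, 1.0]
summary = summarise(od_22, od_37)
print(summary)
```

Applied to the real measurements, the same counting step would give the proportion of isolates producing more biofilm at 22°C, and the two standard deviations would give the spread at each temperature.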
Scoary analyses yielded 2,834 clusters of orthologous groups (COGs) and genes associated with the phenotypic trait of producing more biofilm at 22°C compared to 37°C (naïve p<0.05) (Suppl. Table 3). None of these hits were significant using a Bonferroni-corrected p-value of <0.05, but 74 were significantly associated with this phenotype using the conservative, but less stringent, Benjamini-Hochberg correction at p<0.05. Pyseer analysis using the same input gave 1,015 significant hits (naïve p<0.05) and 155 using a likelihood-ratio test (lrt) p<0.01 (Suppl. Table 4). The top 20 most significant hits using the two tests are shown in Table 1, with the 14 genes present in both lists shown in bold type. Scoary hits were ordered by Benjamini-Hochberg-adjusted p-values and Pyseer hits by lrt p-values. All hits are shown in Suppl. Tables 3 and 4. All 20 of the most significant hits identified by Scoary, and 15 of the 20 identified by Pyseer, are present in the genome of isolate WH-SGI-V-07050; genes in Table 1 were therefore numbered according to this genome (when present).
These genes include acr3, arsC, arsR and arsH, four genes involved in arsenic resistance/reduction, as well as two methyltransferases, an integrating conjugative element (ICE) protein and an ICE relaxase. ArsR-family transcriptional regulators are considered to be important in many physiological processes, including biofilm production[44]. The presence of arsenic in bacterial cells has been demonstrated to affect biofilm synthesis as well as chemotaxis and motility[45], and there is a suggestion that arsenic can prevent the switch between planktonic and sessile lifestyles[46]. The presence of the flagellar protein FliC is also significantly associated with increased biofilm growth at 22°C compared to 37°C using Pyseer (but not Scoary). Flagellar motility is well known as a requirement for biofilm production in P. aeruginosa[47], and suppression of fliC has been linked to increasing temperature in P. syringae[48], with a similar temperature association of flagella being reported in Listeria monocytogenes and Proteus vulgaris[49, 50]. The gene clsA, a cardiolipin synthase, has a demonstrated impact on biofilm formation in E. coli[51], although its association with temperature has not been evaluated. The two methyltransferases, and in particular the S-adenosylmethionine (SAM)-dependent methyltransferase, may have a role in the synthesis of N-acyl-homoserine lactones, the key molecules in P. aeruginosa quorum sensing (QS)[52], although further work would be required to establish their importance.
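The gap between zero Bonferroni-significant hits and 74 Benjamini-Hochberg-significant hits reflects how differently the two corrections scale the naïve p-values. The sketch below uses standard textbook formulations of both adjustments on invented p-values; it is not taken from Scoary or Pyseer, and the function and variable names are hypothetical.

```python
def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (step-up procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values only (hypothetical, not results from the study).
toy_p = [0.001, 0.012, 0.02, 0.03, 0.6]
bonf = bonferroni(toy_p)
bh = benjamini_hochberg(toy_p)
print("Bonferroni hits:", sum(p < 0.05 for p in bonf))
print("Benjamini-Hochberg hits:", sum(p < 0.05 for p in bh))
```

With these toy values, Bonferroni retains one hit at the 0.05 level and Benjamini-Hochberg retains four. This is the same qualitative pattern as in the Scoary results, where the less stringent correction retains more associations.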
The majority (33/40) of the most significant hits presented in Table 1 are in close proximity to each other in the WH-SGI-V-07050 chromosome, spanning genes numbered 373 (WH-SGI-V-07050_00373 in Table 1) to 461 (WH-SGI-V-07050_00461). These genes are present in between 63 and 80 of the 280 study isolates. Comparative analysis of DNA in this region from the genome of isolate WH-SGI-V-07050 with other study isolate genomes shows that this region is 165,376 bp long and contains 149 genes (from WH-SGI-V-07050_349 to WH-SGI-V-07050_498). A Blastn search of this region showed >99% DNA similarity over the entire region with several finished genomic sequences, including strain FDAARGOS-532 (GenBank accession GI:1519006927). A manual alignment of this genome with those of WH-SGI-V-07050 and PAO1 (GI:110645304) was performed using the Artemis Comparison Tool[53] (Figure 4). The alignment shows that this region is absent from strain PAO1 except for a 13,273 bp segment, corresponding to bases 2,923,150 to 2,936,423 of PAO1, that is common to all three genomes. This conserved segment contains ten open reading frames, including pgsA at the 5’ end and the hypothetical protein gene PAO1_02672 at the 3’ end.
We used Phandango to visualise the presence or absence, in the genomes of the 280 isolates in this study, of genes corresponding to WH-SGI-V-07050_349 to WH-SGI-V-07050_498. Figure 5 shows that most of these genes are present in the genomes of many isolates and that their distribution is not strongly associated with core genome phylogeny, although only a small number of genomes share the total length of this region with WH-SGI-V-07050 (shown in blue in Figure 5). Besides containing a cluster of arsenic resistance genes (see above), this region also contains heavy metal resistance genes for copper (copA2, copA_3, copA_4, copB_1, copC, copD, copK), mercury (merR_1, merP_1) and cadmium (cadA_1) that, with the exception of cadA, are significantly associated with the phenotype of producing more biofilm at ambient compared to body temperature (Scoary and Pyseer ranked hits are shown in Table 1 and raw data in Suppl. Tables 3 and 4). The region also contains a gene for tetracycline resistance (tetA_1) that is significantly associated with this phenotype, as well as several transcriptional regulators: zntR_1, zntR_2, cusR (adjacent to sensor kinase gene cusS), hcaR_1, cecR and dmlR_24. Also present in this region is a cluster of nine genes that are present in between 260 and 278 of the 280 genomes studied. These genes include the response regulator gene gacA, a putative transcriptional regulator gene and a colicin receptor gene. The architecture of this region suggests a history of recent recombination events. This is supported by the presence in this region of an Integrative and Conjugative Element (ICE) protein and a prophage integrase, which would strongly suggest that these gene clusters have been mobilised by conjugative transposition and phage transduction events into independent lineages.
Gene knockouts can be used to help elucidate the relative importance of individual candidate genes to observed biofilm phenotypes. The arsR family transcriptional regulator (Table 1) is present in 78 of 280 isolates and is not present in PAO1. It is distinct from the arsR gene present in 275 of 280 of our isolates that forms an arsenic resistance operon with arsB and arsC in PAO1. We found that two separate transposon mutants of the arsR gene in PAO1 showed enhanced biofilm growth at 22°C compared to 37°C (Suppl. Figure 2). This supports a role for this gene in biofilm production at ambient temperature.
Temperature associated core genome SNPs
Pyseer analysis yielded 2,083 significant SNPs (lrt p<0.001) associated with increased biofilm production at 22°C compared to 37°C (Suppl. Table 5). The 30 most significant hits (ordered by lrt-adjusted p-values) relative to the PAO1 genome are shown in Table 2. These include three genes with more than one SNP: Type II secretion system protein D, a sulfotransferase family protein and a nucleoside binding protein. Type II secretion systems are associated with pathogenesis and environmental survival in a number of species and involve the export of proteins to the extracellular biofilm matrix[54]. The possible roles in biofilm production of sulfotransferases and of the second most significant SNP, in cysW, a sulfate transport system permease gene, are unclear, as is the role of PAO1 gene 00240, a nucleoside binding protein. The most significant SNP associated with increased biofilm production at 22°C is in the PAO1 gene 04493, which codes for 1-acyl-sn-glycerol-3-phosphate acyltransferase. This enzyme and quiP, an acyl-homoserine lactone acylase gene, may be involved in interactions with quorum sensing systems, as the major QS signalling molecules in P. aeruginosa are acyl-homoserine lactones. Other SNPs in genes that have previously been found to be involved in biofilm production or regulation are in algX and algA, genes that form part of the operon involved in the production of alginate, the well-characterized extracellular polysaccharide biofilm component[55] that is associated with the hyper-mucosity phenotype observed in isolates from some cystic fibrosis patients. The response regulator pleD is known to play a role in motility and the transition between sessile and motile forms in Caulobacter crescentus[56], although its role in signalling in P. aeruginosa is less clear, as is the role of the helix-turn-helix transcriptional regulator syrM. 
Other significant SNPs in genes involved in cellular motility in Table 2 are the flagellar glycosyl transferase gene fgtA and a putative major fimbrial subunit gene lpfA that may facilitate bacterial attachment and promote biofilm formation[57].
Biofilms on stainless-steel
Of the isolates assessed for biofilm production on stainless-steel (Suppl. Figure 1), 40% (n=10 out of 25) of ST111 and 52% (n=13 out of 23) of ST235 isolates were considered high density biofilm producers. This demonstrates large intra-clone variation in the biofilm phenotype. However, the variation observed in one clone was similar to that in the other (i.e. not significantly different), suggesting that intra-clone variation may be a common feature of other clones.
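One way to check whether the proportion of high density biofilm producers differs between two clones is a two-sided Fisher's exact test on the reported counts. The text does not state which test was used, so the choice of test here is an assumption, and the function name `fisher_exact_two_sided` is hypothetical. The sketch uses only the Python standard library.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)
    )
    return min(1.0, p)


# Rows: ST111, ST235. Columns: high density producers, other isolates.
# Counts as reported in the text (10 of 25 and 13 of 23).
p_value = fisher_exact_two_sided(10, 15, 13, 10)
print(f"two-sided p = {p_value:.3f}")
```

A p-value above 0.05 from such a test would be consistent with the statement that the two clones do not differ significantly in the proportion of high density producers.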
BEAST 2 analysis suggests that this sample of ST111 isolates diverged from a common ancestor ≈ 43 years ago (Figure 6.a), with a pan-genome of 15,488 genes. The 107 ST235 isolates included in this study had a pan-genome of 15,178 genes and diverged from a common ancestor ≈ 28 years ago (Figure 6.b), suggesting that the two lineages emerged within approximately 15 years of each other. When considered alongside biofilm phenotype, isolates sharing phylogenetic similarity display similar biofilm phenotypes, except for ST235 isolates WH-SGI-V-07622 and WH-SGI-V-07625, as well as isolates WH-SGI-V-07498 and WH-SGI-V-07624. Nevertheless, this analysis suggests that many higher and lower density biofilm formers are closely related, and that biofilm production on stainless-steel is not necessarily predictable based only on genomic analysis.
The pan-genome of biofilm genes widely described in the literature (Figure 7) has a different core vs accessory structure in each of the two lineages in this study, with the majority of genes (≈54%) in both present in fewer than 15% of genomes. The ST235 isolates analysed here have a higher percentage of core genes compared to ST111 (61.7% vs 31.34%), whilst ST111 has a larger cloud genome compared to ST235 (40.3% vs 17.02%). This variation is due to a greater number of homologues of genes in the ST111 biofilm pan-genome: pelA has four homologues in ST111 isolates, and flgK, pilY1, pslI and pslJ have two each. The genes pilA and fimT were both identified in the ST111 pan-genome but not in the ST235 pan-genome, whilst pslC was found in the ST235 but not in the ST111 pan-genome.
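The core and cloud percentages above depend on how often each gene is present across the genomes of a lineage. The sketch below shows one way to make that classification explicit. The 15% cloud cutoff follows the text; the 99% core cutoff is an assumption modelled on common pan-genome conventions, and the gene names and counts are invented for illustration.

```python
from collections import Counter


def classify_gene(n_present, n_genomes, core_cutoff=0.99, cloud_cutoff=0.15):
    """Assign a pan-genome category from a gene's presence frequency.

    The cutoffs are assumptions for illustration, not the study's settings.
    """
    freq = n_present / n_genomes
    if freq >= core_cutoff:
        return "core"
    if freq < cloud_cutoff:
        return "cloud"
    return "shell"


# Toy presence counts in a lineage of 25 genomes (hypothetical values).
n_genomes = 25
presence = {"geneA": 25, "geneB": 22, "geneC": 2, "geneD": 3}
categories = {g: classify_gene(n, n_genomes) for g, n in presence.items()}
counts = Counter(categories.values())

for category, n in counts.items():
    print(f"{category}: {100 * n / len(presence):.1f}% of genes")
```

Counting each homologue as a separate gene, as happens when a gene family is split into several clusters, adds low-frequency entries and so enlarges the cloud fraction.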
Pan-genome association analysis of stainless-steel biofilm phenotype using Scoary on ST111 (n=25) and ST235 (n=23) isolates showed no significant associations after Bonferroni or Benjamini-Hochberg corrections for multiple testing (p<0.05). However, Pyseer allows pan-genome association analysis using mean OD540 values, and the most statistically significant associations between accessory gene presence and biofilm production on stainless-steel are shown in Table 3. In ST111 and ST235 the most significantly associated genes encode small proteins of 84 and 86 amino acids respectively that share 21% amino acid sequence identity with each other and only match hypothetical proteins using blastp. HHpred indicates partial but significant structural similarity to the same H. pylori Type IV secretion system translocase for both proteins. Protein translocation by these previously uncharacterised putative translocases could have a significant role in biofilm production. In ST111 isolates, other genes with significant associations include the sensor kinase cusS (also associated with temperature-dependent biofilm production, above), found in 22 isolates, and eight genes located together and present in only two ST111 isolates (isolate WH-SGI-V-07174 genes 02964-02971). These eight genes include two transposases, a type II toxin/antitoxin (TA) system RelE/ParE family toxin and an XRE family transcriptional regulator, and possibly represent an integrated plasmid sequence. Secreted TA system toxins have previously been found to be involved in biofilm formation in P. aeruginosa as well as playing important roles in pathogenicity and persistence[58]. In ST235, genes associated with enhanced biofilm production on stainless-steel include two genes involved in mercury resistance, two methyltransferase genes and an alginate biosynthesis gene, algA_3. 
The putative ST235 T4SS translocase (WH-SGI-V-07406 gene 05831) is present in five ST235 isolates as are three genes in close proximity (05832, 05844 and 05845 in this isolate) that may represent part of a mobile genetic structure.
Core genome SNPs associated with biofilm production on stainless-steel were found using Pyseer, and the most significant hits are shown in Table 4. In ST111 these include a SNP in a GGDEF and EAL domain-containing protein. Such proteins are known to have a key role in cell signalling and biofilm production[59]. The chemotaxis protein McpU and the PAS domain-containing protein are involved in cell signalling and play a role in biofilm formation. Polymorphisms in the membrane protein MprF and the DedA family membrane protein may affect biofilm formation through increased translocation of proteins and other macromolecules from the cell. In ST235 the most significant core genome SNP is in a tRNA-Asn region; however, its relevance to biofilm production on stainless-steel is unclear. Four significant SNP sites are present in the topoisomerase primase domain-containing protein DnaG, although the possible role of primase genes in biofilm production or regulation is obscure. Similarly, the importance of SNPs associated with biofilm phenotype for GTPase, ViaA and the two glycoside hydrolase family 19 proteins is unclear, although we theorise that such glycoside hydrolases could be involved in biofilm matrix hydrolysis, promoting the transition of P. aeruginosa from a sessile to a planktonic state.