Optimisation of data collection parameters
We started by optimising the data collection procedures, focusing particularly on experimental parameters specific to AFM-IR, such as the excitation laser power and pulse rate. Considering that optimal settings for a field of view measuring 10–20 µm wide necessitate slow scanning speeds (45), we quantified sample drift. Additionally, methods for plotting raw data from IR images were implemented for quality control purposes.
Bacterial cells were embedded in epoxy resin after fixation, and AFM-IR was conducted on 95 nm thick sections of the resulting resin blocks (Fig. 1A). This embedding approach provides superior sample shelf life and surface smoothness, facilitating imaging (46). To determine a common optimal laser power for all measurements, we collected spectra at various power levels at sample locations devoid of cells, which was confirmed by the absence of the protein-originating amide I band (Fig. 1B). A laser power of 1.37% was identified as optimal for these samples on our system: it exhibited less noise than a higher level (5.37%) and gave a higher signal than 0.69% and 2.87%. In resonance-enhanced AFM-IR, the repetition frequency of the IR laser needs to match the contact resonance frequency of the sample-cantilever system (24), which varies from cantilever to cantilever but also depends on where the deflection laser hits the cantilever (Fig. 1C). The optimal frequency was tracked using a phase-locked loop (PLL), as it is contingent upon the nanomechanical properties of the sample and cantilever and is subject to drift. The gains of the PLL were determined through scanning experiments on epoxy-embedded bacteria: P = 60 and I = 6 provided the best separation between epoxy and bacteria (Fig. 1D). When acquiring a collection of spectra at various locations throughout the sample, we opt for low PLL gains (P = 1, I = 0.1) to reduce noise while allowing the PLL Frequency to adapt to slow changes in the optimal pulse rate.
Given the slow scanning speeds employed, sample drift may cause issues if left uncorrected. To characterise drift, we collected a series of heightmaps of the same sample, in this case 2 × 2 µm images of amyloid protein on a gold substrate, over a period of more than 12 hours (Fig. 1E). Temperature variations in the laboratory environment were found to exert a pronounced influence on sample drift relative to the AFM probe: temperature-dependent drift is apparent both in the sample plane (x and y) and in the vertical position (z). Drift correction strategies were chosen based on the observed drift patterns. Based on our data, drift speeds on the order of 5–10 nm/min should be expected, even at relatively constant temperatures, and drift correction may be necessary (47).
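As a rough illustration of how a drift speed can be derived from two consecutive heightmaps, the sketch below estimates the lateral shift by FFT-based cross-correlation and converts it to nm/min. It is not the procedure used in this work: the function name, the pixel size and the acquisition interval are assumptions chosen for illustration, and the estimate is limited to whole-pixel shifts.

```python
import numpy as np

def estimate_drift(ref, img, pixel_nm, interval_min):
    """Estimate lateral drift between two heightmaps (hypothetical helper).

    ref, img     : 2D arrays of equal shape (two heightmaps of the same area)
    pixel_nm     : pixel size in nm (assumed to be known by the user)
    interval_min : time between the two acquisitions in minutes
    Returns (dx_nm, dy_nm, speed_nm_per_min).
    """
    # Remove the mean so the correlation peak reflects structure, not offset.
    a = ref - ref.mean()
    b = img - img.mean()
    # Circular cross-correlation computed via the FFT.
    xcorr = np.fft.ifft2(np.fft.fft2(b) * np.conj(np.fft.fft2(a))).real
    peak = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    # Indices past the midpoint correspond to negative (wrapped) shifts.
    shift = [p if p <= n // 2 else p - n for p, n in zip(peak, xcorr.shape)]
    dy_nm, dx_nm = (float(s) * pixel_nm for s in shift)
    speed = float(np.hypot(dx_nm, dy_nm)) / interval_min
    return dx_nm, dy_nm, speed
```

Applied to every consecutive pair in a long series, the resulting speeds can be plotted against the recorded temperature. Sub-pixel accuracy would require interpolating around the correlation peak, which is omitted here.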
The vertical sample drift was automatically compensated for by the AFM height tracking mechanism. However, drift in the cantilever's free air deflection requires additional consideration to ensure consistent force application during image acquisition. Similar measurements over 2 days at near-constant temperature (28 ± 0.2°C), with an average sample drift of only 0.4 nm/min, revealed differences in the free air deflection when automated deflection setpoint adjustment between acquired heightmaps was allowed (Fig. 1F). Given an engagement force of 0.3 V, these differences were large. The deflection setpoint therefore had to be reset between image acquisitions; otherwise, the force applied to the sample and cantilever, and therefore also the optimal pulse rate, would vary strongly.
Finally, we assessed the accuracy of the humidity sensor in our system because atmospheric water vapour profoundly impacts IR spectra in the mid-infrared region due to its sharp absorption lines. While regular collection of the laser emission spectrum before each measurement partially compensates for this effect, periodic verification of the relative atmospheric humidity throughout an experiment is advisable, ideally maintaining levels below 1%. Notably, the placement of the humidity sensor in a nanoIR3 system near the supply of dry air may yield humidity readings that appear overly optimistic compared to readings obtained from a sensor positioned adjacent to the sample location (Fig. 1G). Thus, it is imperative to allow humidity levels to fully equilibrate before collecting IR measurements.
Despite the implementation of these optimisations, the stability of the system may not always be sufficient to guarantee high-quality measurements. To ensure the integrity of our data, we acquired Height and Deflection images in one scanning direction and IR Amplitude, IR Phase, and PLL Frequency images in both scanning directions (trace and retrace) without applying any data processing. This approach enables the assessment of data quality both during and after measurement (Fig. 1F). It allows us to evaluate trace-retrace errors and to check that the deflection and IR phase signals deviate from zero as little as possible. For all images published in this work, the raw data can be found in Additional File 1. Similarly, all raw spectra are presented in Additional File 2.
Data analysis pipeline and signal reproducibility
We established a pipeline for the automated analysis of AFM-IR images and spectra collected with the predefined parameters (see the Methods section for details). To evaluate the performance of our measurement and analysis protocols, we prepared five identical samples of bacteria with spontaneous inclusion body (IB) formation and conducted multiple imaging sessions for each sample (n = 3–4), utilising the same cantilever whenever possible (refer to Additional File 3, Note S1 for additional sample and cantilever details). This approach enabled us to assess both technical and biological variability.
In each individual measurement, we collected two IR maps, one at 1625 cm⁻¹ (representing β-sheets (25)) and one at 1650 cm⁻¹ (representing α-helices and unordered loops (25)), along with five IR spectra corresponding to inclusion bodies (IB), cytoplasm (CP), and epoxy (background; BG). Representative spectra and their locations are shown in Fig. 2A-C. For all spectra in this study, location data are provided in Figure S2. To quantify the relative β-sheet content in each spectrum, we integrated the area from 1615 to 1635 cm⁻¹ (Fig. 2D). Our analysis revealed an enrichment of β-sheets in IBs compared to the cytoplasm. The observed β-sheet enrichment had a relative magnitude of 1.4 (95% CI: 1.36–1.50, two-sample t test: p_adj = .006). Notably, the technical variability observed did not yield statistically significant differences between repeat measurements (ANOVA on all data points for each sample: p_adj > .2). Moreover, no significant biological variability was observed (ANOVA on averages of each replicate: p > .24) in this assessment.
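The band integration described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in this work: the function names are hypothetical, the trapezoidal rule is an assumed integration scheme, and any baseline correction or normalisation applied before integration is not reproduced here.

```python
import numpy as np

def band_area(wavenumbers, absorbance, lo=1615.0, hi=1635.0):
    """Integrate a spectrum between lo and hi (cm^-1) with the trapezoidal rule."""
    wn = np.asarray(wavenumbers, dtype=float)
    ab = np.asarray(absorbance, dtype=float)
    order = np.argsort(wn)              # make sure wavenumbers are ascending
    wn, ab = wn[order], ab[order]
    sel = (wn >= lo) & (wn <= hi)       # keep only the band of interest
    x, y = wn[sel], ab[sel]
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def band_area_ratio(wavenumbers, ib_spectrum, cp_spectrum):
    """Ratio of the band areas of an IB spectrum and a cytoplasm spectrum."""
    return band_area(wavenumbers, ib_spectrum) / band_area(wavenumbers, cp_spectrum)
```

A ratio above 1 indicates a larger integrated signal in the 1615–1635 cm⁻¹ window for the IB spectrum than for the cytoplasm spectrum.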
The PLL Frequency analysis (Fig. 2E) revealed no significant technical variability (ANOVA on all data within each repeat: p_adj > .3) when a single outlier measurement was excluded (repeat 2). However, we observed significant between-sample differences in the PLL Frequency of IBs upon discarding measurement 2 and samples 4–5 due to a lack of repeats (ANOVA on averages of each replicate: p_adj = .04), with sample 1 exhibiting significantly lower values than the other samples (Tukey’s test, 1–2: p_adj = .008, 1–3: p_adj = .005, 2–3: p_adj = .9).
AFM-IR images provide a greater variety and depth of information than do spectra. They were first processed following the protocol detailed in the Methods section. The resulting dataset is shown in Figs. 2F-J. First, we observed polar enrichment of IBs (Fig. 2K); however, there were more IBs in the middle of the cell than expected from the literature (3). This may be a result of the random three-dimensional orientation of cells with respect to the sectioning plane, but it is also possible that AFM-IR is sensitive to small protein aggregates that were not previously picked up by fluorescence microscopy approaches. Note that the relative age of the cell poles is not accessible in this experiment and that therefore, the sign of the polar location has no meaning. The positive pole is simply the one located on the right in the image.
Second, this image-based dataset provides a measurement of the number of inclusion bodies per cell for each sample, as shown in Fig. 2L. Within this dataset, there was no significant technical variability (ANOVA on all data within each repeat: p_adj > .5). There was biological variability (ANOVA on averages of each replicate: p_adj = .001) caused by sample 3 (Tukey’s test: p_adj < .012).
Third, this dataset contains a distribution of IB sizes (Fig. 2M). There was no evidence of significant technical (ANOVA on all data within each repeat: p > .3) or biological variability between the samples (ANOVA on averages of each replicate: p_adj = .4).
Fourth, the segmentation maps can be correlated to the IR amplitude ratio and PLL images to assess the physical and structural properties of IBs in an unbiased manner. Because the intensities of IR amplitude images are inhomogeneous, as discussed before, it is important to express the β-sheet enrichment of an IB in relative terms: the mean of the 1625/1650 cm⁻¹ ratio image within the IB region is compared to that of the cytoplasm surrounding it (Fig. 2N). In this case, there was significant technical variability only within sample 3 (ANOVA on all data within sample 3: p_adj = .0005, for other samples: p > .22), but no biological variability between samples (ANOVA on averages of each replicate: p_adj = .06). The relative β-sheet enrichment of inclusion bodies in this dataset was 1.11 (95% CI: 1.07–1.15, two-sample t test: p_adj = .002). This enrichment value is lower than that measured in the spectral analysis, possibly because of the choice of wavenumbers for imaging.
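The relative enrichment of a single IB can be illustrated with the short sketch below. The function and argument names are hypothetical, the two boolean masks are assumed to come from a prior segmentation step, and the 1650 cm⁻¹ image is assumed to be strictly positive so that the ratio is defined everywhere.

```python
import numpy as np

def relative_beta_enrichment(amp_1625, amp_1650, ib_mask, cp_mask):
    """Mean 1625/1650 ratio inside an IB divided by that of nearby cytoplasm.

    amp_1625, amp_1650 : 2D IR amplitude images at the two wavenumbers
    ib_mask, cp_mask   : boolean masks of the IB and its surrounding cytoplasm
                         (assumed outputs of a segmentation step)
    """
    ratio = np.asarray(amp_1625, dtype=float) / np.asarray(amp_1650, dtype=float)
    return float(ratio[ib_mask].mean() / ratio[cp_mask].mean())
```

Because both means are taken from the same ratio image, slow intensity variations across the field of view largely cancel out, which is the motivation for comparing each IB to its local surroundings.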
Figure 2O shows the PLL Frequency difference between IBs and the surrounding cytoplasm. As in the spectral analysis, measurement 2 is an outlier. Excluding it, there was no statistical evidence for technical variability (ANOVA on all data within each repeat: p_adj > .5) or biological variability when testing for differences between samples 1–3 (ANOVA on averages of each measurement: p_adj > .06). The PLL frequency of IBs can be evaluated independently from that of the cytoplasm, but this approach introduces extensive technical and biological variability (Additional File 3, Note S2).
In summary, we developed a robust imaging pipeline providing data inaccessible by spectral analysis and independent of user bias due to the cherry-picking of spectrum locations. However, image analysis is limited by the discrete number of acquired wavenumbers and is more sensitive to technical artefacts, as shown in the ratio map in Fig. 2I.
The nature of a stressor is reflected in the structure of resulting inclusion bodies
Having developed a robust imaging pipeline and evaluated its sensitivity to technical and biological variability, we attempted to distinguish IBs from various stress conditions by AFM-IR. A panel was selected to include physical stress (heat shock), chemical stress (heavy metals such as NiCl2 and CoCl2, and oxidation by hydrogen peroxide) and proteotoxic stress (overexpression of the aggregation-prone p53 DNA-binding domain (41) or exposure to the peptins P2 and P33 (9)). Peptins are short hydrophobic peptides that nucleate the aggregation of endogenous proteins due to homology with aggregation-prone regions.
To increase the experimental throughput, only IR absorption spectra were collected for these samples, as shown in Fig. 3A. These experiments were performed in E. coli BL21 to accommodate the overexpression stress, but this strain also exhibited spontaneous IB formation in the buffer. IBs and cytoplasm were distinct from each other under all conditions, partly due to the increased β-sheet concentration, which was visible in the second derivative spectra (Fig. 3B). Figure 3C shows a quantification of the β-sheet content, the cytoplasmic levels of which were correlated with those in IBs (Pearson r = .84, 95% CI: .34–.97, p = .009; Fig. 3D). PCA indicated that the first principal component was highly sensitive to the β-sheet content (Fig. 3E). Both PCA and UMAP (48) projections (Figs. 3F-G) could distinguish between the IB and cytoplasm spectra. Furthermore, IBs from heat shock and proteotoxic stress conditions formed a cluster, and the chemical stresses were intermediate between them and the cytoplasm spectra. In this sense, the AFM-IR data corresponded with the severity and type of applied stress.
Because these results were based on a single sample per condition, they needed to be validated. We therefore compared H2O2 stress to heat shock with a larger number of samples (n = 3) and full imaging following the protocol developed in this paper. Heat shock was shown to induce a much greater IB load (Fig. 4A, B). There were some inclusions visible in the hydrogen peroxide sample in Fig. 4A, but they were not recognised by the image segmentation pipeline, presumably due to their lower β-sheet enrichment and smaller size.
These smaller IBs could still be studied by collecting IR absorption spectra (see Figs. 4C-D). Spectral analysis confirmed that heat shock IBs had the highest β-sheet content among all spectra quantified in Fig. 4E (Dunnett’s test: p < .033). Additionally, the second derivative spectra implied the existence of two new bands in the peroxide-stressed spectra at 1678 cm⁻¹ (antiparallel β-sheets) and 1616 cm⁻¹ (intermolecular β-sheets), although the latter was nearly invisible in the original spectra. The 1678 cm⁻¹ band set the peroxide cytoplasm spectra apart from all others (Fig. 4F): Dunnett’s test comparing all spectra to the control cytoplasm revealed no significant differences, except for the peroxide cytoplasm spectrum (p_adj = .01). We concluded that AFM-IR is sensitive enough to distinguish between different stresses based on the secondary structure of cytoplasmic and aggregated proteins in stressed cells.
Recovery from heat shock
To go even further, heat shock IBs were characterised in a time-resolved manner after returning to 37°C (samples were collected before heat shock and immediately, 30 min, 1 h and 2 h after heat shock; Figs. 5A-C).
Unsurprisingly, heat shock induced IBs, and their spectra contained a significantly greater contribution from β-sheets, as confirmed by second derivative analysis (Additional File 3, Note S3). A quantification of the β-sheet signal from these spectra (Fig. 5D) showed that the IB spectra at all timepoints were significantly enriched in β-sheets compared to the IB spectra before heat shock (ANOVA followed by Tukey’s test: p_adj < .0003), but there was no evidence of significant changes in the β-sheet content during the recovery period (Tukey’s test: p > .6). The cytoplasmic β-sheet content was stable over time (ANOVA: p = .4). The PLL Frequency did not change over time in either the IB spectra (ANOVA: p = .7) or the cytoplasm spectra (ANOVA: p = .7; Fig. 5E). In general, however, IBs were stiffer than cytoplasm (Wilcoxon signed-rank test: p_adj = 2e-5).
The image analysis data, specifically of the IB area (Fig. 5F) and number (Fig. 5G), showed similar trends: an increase during the heat shock with a steady state in the two hours afterwards. While the evolution of IB β-sheet enrichment was not statistically significant (ANOVA: p_adj = .1), its trend recapitulated the spectral quantification and remained significantly greater than 1 in general (95% CI: 1.13–1.18, two-sample t test: p_adj = 2e-18, Fig. 5H). Similarly, the difference in PLL Frequency between IBs and the cytoplasm (Fig. 5I) did not vary over time (ANOVA: p = .8) but was positive (95% CI: .15–.52, one-sample t test: p_adj = .0003).
In short, AFM-IR was unable to resolve any differences in the IB composition in the first two hours after heat shock. This could mean that disassembly takes longer than two hours under the conditions used in this paper (14), or it could be a limitation of the instrument. These data were validated by several orthogonal methods: the IBs were stained with the amyloid marker pFTAA and imaged using structured illumination microscopy to verify the amyloid nature of the β-sheets, one sample was imaged by SEM to measure electron density variations and surface wear due to the AFM measurement, and IBs were purified and reimaged by AFM-IR (Additional File, Note S4).
Using the full capabilities of AFM-IR
The protocol presented in this paper sacrifices resolution in favour of faster acquisition times and larger fields of view, yet the resulting data did offer evidence that IBs are not sharply defined objects but that they have diffuse boundaries spanning approximately 120 nm (Fig. 6A). This figure shows the average β-sheet enrichment and PLL difference of all pixels in the heat shock recovery dataset with respect to the distance to the closest IB border, with negative values indicating pixels outside an IB. To substantiate this evidence, we also present an example of the capabilities of the instrument at a sampling rate of approximately 1 pixel per 3 nm, as presented in Fig. 6B. This image clearly shows a heterogeneous IB with diffuse edges.
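A profile of a signal against the distance to the closest IB border can be sketched as follows. This is an illustration only, not the analysis code behind Fig. 6A: the function names and the bin width are assumptions, and the brute-force distance calculation is only practical for small image crops (a Euclidean distance transform would be the usual choice for full images).

```python
import numpy as np

def signed_border_distance(ib_mask, pixel_nm=1.0):
    """Signed distance of each pixel to the nearest pixel of the other class.

    Positive inside IBs, negative outside, matching the convention in the text.
    Requires at least one pixel inside and one pixel outside the mask.
    """
    ib_mask = np.asarray(ib_mask, dtype=bool)
    inside = np.argwhere(ib_mask)
    outside = np.argwhere(~ib_mask)
    # Pairwise distances between every inside pixel and every outside pixel.
    d = np.linalg.norm(inside[:, None, :] - outside[None, :, :], axis=2)
    dist = np.empty(ib_mask.shape, dtype=float)
    dist[ib_mask] = d.min(axis=1)     # nearest outside pixel
    dist[~ib_mask] = -d.min(axis=0)   # nearest inside pixel, negated
    return dist * pixel_nm

def distance_profile(values, dist, bin_nm):
    """Average pixel values in bins of signed distance (bin width in nm)."""
    bins = np.floor(dist / bin_nm).astype(int)
    return {int(b) * bin_nm: float(values[bins == b].mean()) for b in np.unique(bins)}
```

Pooling such profiles over many images yields an average curve whose transition width gives a rough estimate of how diffuse the IB boundary is.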
In addition to the size and number of IBs, their β-sheet content and PLL Frequency, a large set of other properties was measured, some of which were found to be intimately connected with each other (Fig. 6C). To avoid batch effects, values for each image were converted to a z score independently of the other images. As an internal control, neither cell orientation nor the location of an IB on the polar axis is correlated with any other variable in this dataset. Some of the correlations presented are technical artefacts, such as that between the mean and standard deviation of the PLL Frequency of an IB. A suite of correlations, including the proximity of an IB to a cell pole (ib_polar_projection_abs), is likely driven by apparent cell size, which in turn is strongly dependent on the orientation of the cell with respect to the sectioning plane. Interestingly, the relative β-sheet enrichment of an IB (β-to-α ratio divided by that of the surrounding cytoplasm, beta_ratio_fc) was largely uncorrelated with variables related to its stiffness. By definition, it is correlated with variables related to the IR amplitude measurements, but it was also positively correlated with the total IB area. The thickness of a section at an IB/cytoplasm location (ib_height_mean and cp_height_mean) was not strongly correlated with the IR amplitude at 1625 cm⁻¹ or 1650 cm⁻¹ or with the PLL measurement, indicating that these signals must come from variations in protein density, even if they can be difficult to accurately measure with AFM-IR (23). This mode of analysis shows the power and potential of image-based AFM-IR experiments.
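The per-image z-score conversion mentioned above can be sketched as follows. The function name is hypothetical, and the use of the sample standard deviation (ddof = 1) is an assumption; images containing a single IB would need separate handling because their standard deviation is undefined.

```python
import numpy as np

def zscore_per_image(values, image_ids):
    """Convert values to z scores separately within each image.

    values    : one measurement per IB (e.g. an area or a mean PLL Frequency)
    image_ids : identifier of the image each IB was measured in
    """
    values = np.asarray(values, dtype=float)
    image_ids = np.asarray(image_ids)
    z = np.empty_like(values)
    for img in np.unique(image_ids):
        sel = image_ids == img
        # Centre and scale using only the IBs from this image.
        z[sel] = (values[sel] - values[sel].mean()) / values[sel].std(ddof=1)
    return z
```

Two variables standardised in this way can then be correlated across all images, for example with `np.corrcoef(z_a, z_b)[0, 1]`, without image-to-image offsets driving the result.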