An optimized workflow for MS-based quantitative proteomics of challenging clinical bronchoalveolar lavage fluid (BALF) samples

doi:10.21203/rs.3.rs-2247886/v1

Method Article

This work is licensed under a CC BY 4.0 License

Journal Publication

published 02 Apr, 2023

Read the published version in Clinical Proteomics →

Background

Clinical bronchoalveolar lavage fluid (BALF) samples are rich in biomolecules, including proteins, and are useful for molecular studies of lung health and disease. However, mass spectrometry (MS)-based proteomic analysis of BALF is challenged by the wide dynamic range of protein abundance and the potential for interfering contaminants. A robust sample preparation workflow that is compatible with MS-based proteomics and suitable for BALF samples of both small and large volume would be useful for many researchers.

Results

We have developed a workflow that combines high abundance protein depletion, protein trapping, clean-up, and in-situ tryptic digestion, and that is compatible with either qualitative or quantitative MS-based proteomic analysis. The workflow includes a value-added collection of endogenous peptides for peptidomic analysis of BALF samples, if desired, as well as amenability to offline semi-preparative or microscale fractionation of complex peptide mixtures prior to LC-MS/MS analysis, for increased depth of analysis. We demonstrate the effectiveness of this workflow on BALF samples collected from COPD patients, including for smaller sample volumes of 1-5 mL that are commonly available from the clinic. We also demonstrate the repeatability of the workflow as an indicator of its utility for quantitative proteomic studies.

Conclusions

Overall, our described workflow consistently provided high quality proteins and tryptic peptides for MS analysis. It should enable researchers to apply MS-based proteomics to a wide variety of studies focused on BALF clinical specimens.

Bronchoalveolar lavage fluid

BALF

quantitative proteomics

mass spectrometry

sample preparation

lung disease

BALF: A valuable clinical sample for studying lung disease

Bronchoalveolar lavage fluid (BALF) is a clinical sample, generally collected from patients with lung conditions, via a bronchoscope passed through the upper airway and into the lungs. A saline solution is introduced to wash the distal lung tissue and the lavage is then aspirated, providing a means to obtain cells and secreted molecules directly from the lungs for further analysis. BALF’s value for molecular characterization of lung disease has long been known(1) and has been described for various clinical investigations(2–4).

The value of BALF for investigation of lung disease and biology stems from its rich repertoire of biomolecules, many of which are lung specific. The distal lung sampled in BALF collection contains alveolar macrophages, and, in disease, various inflammatory cells, along with cell secreted molecules (proteins and metabolites), host and microbial DNA(5), RNA (primarily packaged in extracellular vesicles(6)) and lipids(7). Proteins have long been known to be a major component of BALF(8), including proteins present in high amounts specific to disease pathologies (e.g. mucins, surfactant proteins)(9, 10). The fluid also contains endogenous peptides generated as products of protease activity on larger proteins, some of which have biological activity and can serve as biomarkers(11, 12). Small molecule metabolites, acted upon by enzymes, are also detectable(13–15). Finally, extracellular vesicles, packed with nucleic acids, proteins and other molecule types have also been well-described(16, 17).

BALF MS-based proteomics: history and existing challenges

Given the prominence of proteins within BALF, much attention has been given to applying mass spectrometry (MS)-based proteomics methods to the characterization of these samples. Early applications focused on using two-dimensional gel electrophoresis to separate and visualize complex protein mixtures derived from BALF(8), followed by digestion of separated proteins with trypsin and analysis using nanoscale liquid chromatography (LC) tandem mass spectrometry (MS/MS) to collect mass spectra of fragmented peptides for subsequent sequence database searching, peptide identification and protein inference(18). Over the last two decades numerous studies of BALF collected from patients with diverse lung conditions have been described in the literature(4, 19–21).

From the outset, the challenges that BALF presents to MS-based proteomic analysis have been appreciated and described in numerous publications(22), including a recent report from members of the American Thoracic Society(23). As outlined in these publications, the analytical challenges to working with BALF are numerous. They stem from the inherent chemical complexity of the lung tissue exudate that makes up BALF, and include: 1) the presence of many plasma-derived, high abundance proteins (e.g. albumin, transferrin, etc.) that may suppress detection of lower-abundance, lung tissue-derived proteins; 2) the potential for interfering molecules found in mucus, and/or lipid surfactants, as well as high salt content from the saline used in BALF collection, all of which are generally not compatible with nanoscale LC-MS/MS based proteomic systems; 3) dilution of tissue-derived molecules due to the large volumes of saline that are sometimes used to collect BALF samples; and 4) limiting amounts of protein material collected, depending on the subject (e.g. children vs adults) and/or disease pathology being studied, which challenges deep detection of proteins.

Given these challenges, it is not surprising that a number of publications have described workflows for MS-based proteomic analysis of BALF over the prior two or more decades(24–26). As MS-based technologies have improved in their sensitivity and accuracy for qualitative and quantitative proteomics, the depth of BALF proteomic studies has also increased. Whereas early efforts using MS-based proteomics could only identify tens to hundreds of proteins(8), newer approaches have greatly expanded the number of detectable proteins within these complex samples.

Notably, a recent qualitative study using a contemporary high-resolution LC-MS/MS system and employing depletion of high abundance plasma proteins coupled with extensive, semi-preparative offline high pH HPLC fractionation has identified over 4000 proteins in BALF from lung cancer patients(27). A recent quantitative study in BALF used semi-preparative offline high pH HPLC fractionation and label-free quantification to identify several thousand proteins, including those differentially expressed in diseases related to lung connective tissue(28). Another recent study showed the potential for emerging data-independent acquisition (DIA) to quantify BALF proteins in lung cancer, where direct LC-MS analysis of tryptic digests from patient samples quantified over 600 proteins in these samples(29).

Despite their success, the workflows employed for preparing samples in these studies have some limitations that prevent broader adoption for BALF studies. Notably, those studies demonstrating the deep identification of thousands of BALF proteins utilized relatively large amounts of starting BALF sample -- 20 mL or more of individual samples(28) or pooling patient samples to generate tens of milliliters of sample(27) in order to yield protein amounts necessary for semi-preparative scale fractionation and LC-MS/MS analysis. Although these amounts of starting material are acceptable for small scale proof-of-concept studies, in many cases BALF sample volumes available for analysis are only in the low milliliter range, especially for studies involving children. The methods also utilize processing steps such as precipitation, geared towards higher amounts of total protein, in order to remove contaminants; these steps may be susceptible to sample loss in samples with lower amounts of material. Finally, these studies have not demonstrated their compatibility with contemporary quantitative proteomic methods geared towards analysis of larger cohorts of patients (e.g. highly multiplexed isobaric peptide labeling), which is necessary for large-scale studies investigating clinical BALF samples. Therefore, a need still exists for a robust sample processing workflow that is amenable to BALF samples collected in low milliliter amounts, can sensitively detect even lower-abundance lung proteins, and can be applied to multiplexed quantitative analysis of larger patient cohorts.

A robust workflow for quantitative proteomics of BALF

Here, we describe a robust sample processing workflow with flexibility to a wide variety of studies focused on the proteomic characterization of BALF. The workflow brings together high abundance protein depletion, efficient protein trapping and contaminant removal using S-trap columns, and compatibility to multiplexed isobaric peptide tagging of resulting trypsin digested protein samples. The workflow includes a value-added step for collecting endogenous peptides for MS-based peptidomics, if desired, while being amenable to a wide range of total protein yields (tens of micrograms down to microgram or less) resulting from varying amounts of available BALF volumes. The workflow utilizes offline high pH HPLC peptide fractionation compatible with this range of protein yields (semi-preparative scale for large yields, microscale for limited yields). Through the analysis of clinically derived samples from patients with Chronic Obstructive Pulmonary Disease (COPD), we demonstrate the repeatability of our workflow indicating its utility for quantitative MS-based proteomics of BALF. Our workflow should be of value for a wide range of researchers seeking to understand proteome dynamics related to lung health and disease in routinely collected clinical BALF samples.

Figure 1 shows the BALF processing workflow, compatible with LC-MS/MS and quantitative proteomic analysis. Details of each step are provided in the Methods section. The key steps to the workflow include using a molecular weight cutoff spin filter which simultaneously concentrates the higher MW proteins and enables collection of endogenous peptides for analysis by LC-MS, if desired. Concentrated proteins are subjected to immunoaffinity depletion of high abundance plasma proteins, followed by concentration of non-retained proteins, and cleanup and trypsin digestion using protein trapping. Peptide mixtures are then ready for isobaric labeling, if desired, and/or direct analysis using nanoscale LC-MS/MS.

Initially our protocol included a protein precipitation step, similar to other past protocols(28, 30), to concentrate, and ostensibly eliminate contaminants from the BALF proteins. Methanol:chloroform precipitation was chosen as a means to phase separate potential lipid or surfactant contaminants from soluble proteins(30). However, we found that the precipitation step was inconsistent in its effectiveness at removing contaminants, as in some cases downstream LC-MS/MS analysis showed contaminants that obscured detection of BALF-derived peptides (see Supplementary Figure 1 for representative contamination results). Precipitation also had the downside of expanding sample volumes, which required extra handling and concentration steps.

As a solution, we introduced a concentration step using a MW cutoff spin filter, and downstream protein trapping via the S-Trap technology(31, 32). In our hands, protein trapping provided a means to efficiently capture proteins, remove contaminants, concentrate samples, and conduct tryptic digestion all within the same disposable filter, while providing an easy means to collect the peptides for further processing and/or analysis. After implementing the protein trapping step, we eliminated problems with contaminants.

Our workflow also provides a means to enrich endogenous peptides from BALF. These are most likely peptides produced by proteolytic processing, and they have been shown to have potential diagnostic value in BALF samples(31, 32). We have found that the flow-through collected from the 3 kDa molecular weight cut-off filters using starting volumes of 1-5 mL of BALF contains micrograms of endogenous peptides, which can be concentrated and desalted using STAGE Tips and detected directly by LC-MS/MS (data not shown).

The yield of high-quality proteins, and resulting tryptic peptides, is another critical parameter for ensuring deep and accurate results in MS-based proteomics from challenging clinical samples such as BALF. Once we had determined the ability of our workflow to reliably eliminate contaminants from BALF samples, we assessed the yield of proteins and tryptic peptides from starting sample amounts representative of those generated from patients in the clinic. Table 1 shows results from nineteen processed BALF samples collected from healthy control patients, each with a starting amount of only 5 mL or less of volume, which represents an amount of clinical sample that is commonly collected from infants and children where volumes are proportional to body weight(33, 34). Here, we show the average amounts of protein, and peptides after digestion with trypsin, quantified at key points across the steps shown in Figure 1. Important values shown in Table 1 include the amount of protein available after depletion of high abundance proteins (including a calculation of % depletion), and also the final amount of peptides available after digestion from the S-trap. Supplementary Table 1 shows the values for individual samples used for this assessment. Although there is some variation in these numbers depending on the sample, the processing workflow consistently yields microgram amounts of high-quality tryptic peptides, leaving an excess of sample for analysis on contemporary LC-MS/MS instrumentation platforms.

Table 1

Yield of proteins and peptides from 5 mL of BALF across processing workflow
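The yield bookkeeping described for Table 1 (percent depletion after removal of high abundance proteins, and the fraction of protein recovered as peptides) can be sketched as below. This is a minimal illustration: the function names and the example microgram values are hypothetical and are not taken from Table 1 or Supplementary Table 1.

```python
def percent_depletion(protein_before_ug: float, protein_after_ug: float) -> float:
    """Percentage of protein mass removed by the depletion step."""
    if protein_before_ug <= 0:
        raise ValueError("protein_before_ug must be positive")
    return 100.0 * (protein_before_ug - protein_after_ug) / protein_before_ug


def percent_recovery(input_ug: float, output_ug: float) -> float:
    """Percentage of material carried through a step (e.g. protein to peptide)."""
    if input_ug <= 0:
        raise ValueError("input_ug must be positive")
    return 100.0 * output_ug / input_ug


# Hypothetical example values in micrograms (assumed, not measured data).
example_depletion = percent_depletion(200.0, 50.0)   # protein before vs after depletion
example_recovery = percent_recovery(50.0, 20.0)      # depleted protein vs eluted peptide
print(f"depletion: {example_depletion:.1f}%  recovery: {example_recovery:.1f}%")
```

Tracking both numbers per sample makes it easier to see whether low peptide yield comes from a low-protein starting sample or from losses during processing.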

From the samples used for generating results in Table 1, we also assessed the depth of results from a direct LC-MS/MS analysis of these samples using an Orbitrap Eclipse system. Figure 2 shows a Venn diagram of the proteins identified from six of these samples, which yielded an average of 988 proteins identified via direct LC-MS/MS analysis. Parameters were set at 1% FDR with 2 or more unique peptides for protein identification and grouping (see Methods for details).

The amenability of any sample processing workflow to quantitative analysis is critically important, as most researchers will seek to compare changes in BALF protein abundance between different conditions. As such, we carried out a demonstration study using some of the samples from Table 1 to test repeatability of the workflow. We used the isobaric Tandem Mass Tags (TMT) reagents for multiplexed quantitative analysis(35). Proteins isolated from two different BALF samples (called here Test 1 and Test 2) were divided into two equal parts and taken through the protein depletion and clean-up steps of the workflow. The peptides from these test samples (5 micrograms per sample) were labeled with TMT reagents, as part of a larger testing experiment using the 16-plex TMTPro labeling kit and fractionated using offline high pH HPLC.

Table 2 shows results from the replicate TMT-labeled (Test 1 and Test 2) BALF samples. We found that samples divided into equal amounts prior to protein depletion had average protein ratios close to 1, with acceptable CVs for untargeted isobaric tagging experiments, demonstrating the reproducibility of our workflow. Across the entire 16-plex TMT experiment, we identified with high confidence 1844 protein groups from 24306 peptides. Supplementary Table 2 shows the complete set of protein identification results from this TMT-based experiment, and the quantitative analysis of the Test 1 and Test 2 replicate samples.

Table 2

Results from a demonstration TMT-based quantitative repeatability experiment in BALF samples using our sample preparation workflow.
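As an illustration of the repeatability metrics summarized in Table 2, the sketch below computes per-protein ratios between two replicate channels and a coefficient of variation (CV). The exact calculation used for Table 2 is not specified in the text, so the choice of CV over the replicate ratios, the function names, and the reporter-ion abundances are all assumptions made for illustration.

```python
from statistics import mean, stdev


def replicate_ratios(rep_a, rep_b):
    """Per-protein abundance ratio between two replicate channels.

    Proteins with a non-positive abundance in either channel are skipped.
    """
    return [a / b for a, b in zip(rep_a, rep_b) if a > 0 and b > 0]


def percent_cv(values):
    """Coefficient of variation (sample standard deviation / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical reporter-ion abundances for four proteins in two replicates.
rep_a = [100.0, 250.0, 80.0, 1200.0]
rep_b = [110.0, 240.0, 75.0, 1150.0]

ratios = replicate_ratios(rep_a, rep_b)
print(f"mean ratio: {mean(ratios):.3f}  CV of ratios: {percent_cv(ratios):.1f}%")
```

A mean ratio near 1 with a low CV is the pattern expected when a sample is split before processing and both halves behave the same through the workflow.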

Finally, we also tested the effectiveness of handling samples that yield lower amounts of protein prior to LC-MS/MS analysis. Here, individual BALF samples with very small starting sample sizes (1-10 mL) yielded very low amounts of recovered digested peptides, in some cases only about 1 microgram. To demonstrate the amenability of our workflow to in-depth, quantitative analysis using TMT labeling, we adapted a microscale labeling(36) and fractionation method(37, 38). Here, to minimize sample handling steps, the peptides are labeled while bound to C18 stationary phase. In this way, each of 16 separate samples with starting peptide amounts of 1 microgram each was subjected to TMT labeling, followed by microscale high pH fractionation (referred to here as “microfractionation”) resulting in 9 fractions for LC-MS/MS analysis. Table 3 shows the results from this analysis, comparing direct analysis with microfractionation in terms of the numbers of peptides and proteins identified. Here, results from three different 16-plex, TMT-labeled groups of samples were compared using either no fractionation or microfractionation, with average peptides and proteins identified across these different groups. Microfractionation increased the depth of protein identification by approximately 2-fold. Supplementary Table 3 shows the number of proteins identified within one of these 16-plex TMT-labeled groups of samples.

Table 3

Amenability of the workflow to material-limited samples and the effects of microfractionation on increasing depth of peptide and protein identification.

Our results demonstrate the effectiveness of our workflow for preparing samples for a wide variety of proteomic studies in BALF samples. BALF is a commonly collected clinical specimen, rich in biomolecules, and valuable for molecular characterization of lung disease and health. Proteins are a main component of BALF(9, 10, 21), making this sample type a prime target for MS-based proteomic analysis. However, the small sample volume, potential for contaminants and suppression by high abundance proteins are known to limit many researchers in their attempts to analyze BALF(23). Here, we present a robust workflow for processing BALF that overcomes these limitations.

We have demonstrated several advantages of our workflow that overcome these limitations:

The streamlined sample handling steps are amenable to relatively small starting volumes of BALF (as low as 1 mL of starting volume), in contrast to other studies that have described analysis of tens of milliliters of starting samples(27); we identify similar numbers of proteins as these past studies(27), with far less starting material.
The high-quality peptide mixtures generated by the workflow are amenable to either semi-preparative offline high pH HPLC fractionation (for sample amounts producing tens of micrograms of peptides) or a microfractionation (for samples producing as little as 1 microgram of peptides).
The processing steps, including depletion of high abundance proteins, S-trap based purification and tryptic digestion, have been demonstrated to maintain quantities of peptides, making this amenable to downstream quantitative proteomics.
We demonstrate the use of multiplexed TMT-labeling with our workflow as an example quantitative method, although label-free methods could also be used, including emerging data independent acquisition (DIA) methods(39).
Our approach offers a value-added, simple enrichment of endogenous peptides from BALF samples using molecular weight cutoff spin filters; these peptides are amenable to direct analysis using LC-MS/MS and offer additional information on diagnostic signatures and/or proteolytic processing.

During the course of our work, it came to our attention that our high abundance depletion spin columns (Seppro, Sigma Aldrich) are no longer manufactured. Fortunately, a very similar product is available from Thermo Fisher (High-Select Top14 Abundant Protein Depletion Resin), which can be used in a spin-column format similar to the methods we describe in our workflow. This product targets many of the same proteins as the Seppro product and should be easily implemented within our general workflow.

We have demonstrated a robust sample preparation workflow for clinical BALF samples. This workflow provides high quality proteins, and resulting tryptic peptides, for analysis in contemporary MS-based proteomics instrument platforms. Although we demonstrate its effectiveness in starting sample amounts down to 1 milliliter in volume, further optimization may be necessary for very dilute clinical samples or those with very small total volumes, such as those collected from infants or small children(33, 34). Nevertheless, our workflow should be effective on BALF samples collected in the clinic using standardized methods, and useful for studying proteome dynamics in a wide variety of studies focused on lung health and disease.

BALF collection

BALF was obtained using standard procedures(40, 41). Sample collection was performed by sequentially instilling and then withdrawing 50 mL aliquots of sterile normal saline up to a total of 200 mL into the right middle lobe or lingula. Samples were aliquoted and immediately stored at -80°C prior to processing and underwent one freeze-thaw cycle.

Initial BALF processing

BALF samples were thawed, vortexed, and centrifuged at 500 x G for 10 minutes at 4°C. The supernatant was transferred to a new 15 mL conical tube. The insoluble pellet was suspended in 500 uL of PBS and stored at -70°C for future use. BALF supernatant was refrozen prior to drying through lyophilization. Samples were resuspended in 1 mL of LC-MS grade water, ultra-centrifuged at 100,000 x G for 1 hour, and the supernatant was removed for further processing. The remaining pellet was suspended in 200 uL of PBS and added to the soft spin pellet collected prior to lyophilization.

Molecular weight cutoff step to collect endogenous peptides

Amicon Ultra-4 centrifugal filters, MWCO of 3 kDa, were conditioned with 4 mL of 5% methanol in LC-MS grade water and an additional rinse of 4 mL LC-MS grade water. BALF supernatant was ultra-centrifuged at 4,000 x G for 1 hour at 4°C. Flow-through, containing endogenous peptides, was frozen and stored at -70°C for future peptide analysis. The concentrated protein was removed from the filter’s sample reservoir and transferred to a 2.0 mL LoBind tube.

Quantification of proteins with BCA

BALF proteins were quantified using the Pierce BCA protein assay. BSA standards and samples were analyzed in a microplate reader at an absorbance of 562 nm. A standard curve was calculated to determine protein amounts in BALF samples.
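The standard-curve step can be sketched as follows. This is a minimal example assuming a straight-line fit of absorbance at 562 nm against BSA concentration; the text does not state which curve model was used, and the standard concentrations, absorbance readings, and function names here are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_absorbance(a562, slope, intercept):
    """Invert the standard curve to estimate concentration from absorbance."""
    return (a562 - intercept) / slope


# Hypothetical BSA standards (ug/mL) and blank-corrected A562 readings.
standards_ug_per_ml = [0.0, 125.0, 250.0, 500.0, 1000.0]
absorbance_562 = [0.00, 0.10, 0.20, 0.40, 0.80]

slope, intercept = fit_line(standards_ug_per_ml, absorbance_562)
sample_conc = concentration_from_absorbance(0.30, slope, intercept)
print(f"estimated sample concentration: {sample_conc:.0f} ug/mL")
```

BCA response can deviate from linearity at higher concentrations, so a linear fit is only reasonable over the range where the standards themselves fall on a line; samples reading above the top standard should be diluted rather than extrapolated.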

High abundance protein depletion

Seppro IgY14 spin columns were used to remove fourteen highly abundant plasma proteins (Albumin, IgG, α1-Antitrypsin, IgA, IgM, Transferrin, Haptoglobin, α2-Macroglobulin, Fibrinogen, Complement C3, α1-Acid Glycoprotein, HDL, LDL) from BALF samples, leaving an enriched pool of low abundance proteins. Prior to sample loading, spin columns were washed with two blank samples of Seppro dilution buffer to remove non-covalently bound IgY from the beads. Each “wash” included addition of buffer to spin columns, mixing of the beads by mechanical inversion/shaking of the column, and centrifugation at 400 x G for 30 seconds. Known amounts of BALF protein in 1x Seppro dilution buffer (tris(hydroxymethyl)aminomethane) were loaded into the spin column. Samples were mixed for 15 minutes, centrifuged at 400 x G for 30 seconds, and flow-through was collected. An additional wash of the column with 1x Seppro dilution buffer was performed and a second flow-through was collected. The two BALF washes were kept on ice for further analysis. Spin columns were washed twice with 1x Seppro dilution buffer. Bound proteins were stripped from the beads with four washes of 1x Seppro glycine-based stripping buffer, and immediately neutralized with 1x Seppro neutralization buffer (tris(hydroxymethyl)aminomethane). Beads were resuspended in a 1x Seppro dilution buffer containing 0.02% sodium azide, and stored at 4°C.

Molecular weight cutoff to concentrate the two washes after Seppro

Amicon Ultra-0.5 mL centrifugal filters, MWCO of 3 kDa, were conditioned with 0.5 mL of 5% methanol in LC-MS grade water and an additional rinse of 0.5 mL LC-MS grade water. The initial BALF flow-through collected from Seppro was ultra-centrifuged at 14,000 x G for 1 hour at 4°C. The second BALF flow-through was added and ultra-centrifuged at 14,000 x G for 1 hour at 4°C. Concentrated protein was removed from the filter’s sample reservoir and transferred to a 1.5 mL LoBind tube.

Quantification of proteins with BCA

BALF proteins were quantified using the Pierce BCA protein assay. BSA standards and samples were analyzed via Nanodrop at an absorbance of 562 nm. A standard curve was calculated to determine protein amounts in BALF samples.

Protein trapping, clean-up and tryptic digestion

Post-depletion BALF protein was frozen and lyophilized in a speed vac. Dried BALF samples were solubilized in 5% SDS, 50 mM TEAB, pH 8.5, sonicated at 90 sonics for 5 minutes, and centrifuged at 12,000 x G for 8 minutes. The supernatant was transferred to a 1.5 mL LoBind tube. Proteins were reduced, alkylated, and acidified to pH < 1. Samples were transferred to S-trap columns (ProtiFi) that were centrifuged at 4,000 x G for 30 seconds to trap proteins onto columns. Protein was washed 6x with 100 mM TEAB in 90% LC-MS grade methanol, pH 7.55, to remove all contaminants. Trypsin Gold, MS grade (Promega), in a 1:10 ratio of enzyme to protein, was added to the S-trap and columns were incubated overnight at 37°C. Digested proteins were eluted from the column with 50% acetonitrile / 50 mM TEAB, pH 8.5. BALF peptides were frozen and lyophilized in a speed vac for further use.

Peptide Assay

BALF peptides were resuspended in LC-MS grade water and quantified using the Pierce Quantitative Colorimetric Peptide Assay. Peptide digest standards and samples were analyzed via Nanodrop at an absorbance of 480 nm. A standard curve was calculated to determine peptide amounts recovered from S-traps for each BALF sample. Samples were frozen and lyophilized in a speed vac.

TMT labeling

Normal Scale. For samples with higher amounts of total peptides (greater than 5 total ug), TMT16pro label reagents (Thermo Fisher) were reconstituted in anhydrous acetonitrile. A total of 5 ug of BALF protein digest for each sample was suspended in 100 mM TEAB, pH 8.5. TMTpro labels were added to BALF samples and incubated for 1 hour at room temperature. 5% hydroxylamine in LC-MS grade water was added to the samples and incubated for 15 minutes to quench the reaction. Labeled BALF peptides were pooled, frozen, and lyophilized.

Microscale. For samples with lower amounts of total peptide (approximately 1 ug), TMT16pro label reagents were first reconstituted in anhydrous acetonitrile. A total of 1 ug of BALF protein digest for each sample was suspended in 0.1% formic acid in LC-MS grade water. The acidified BALF peptides were transferred to preconditioned C18 Stop and Go Extraction (STAGE) tips(42), and drawn through twice (centrifuged at 1000 x G for 1 minute) to bind to the C18 stationary phase. Peptides were washed with 0.1% formic acid in LC-MS grade water and labeled with TMT16pro tags in 20mM TEAB, pH 8 buffer. Labeled peptides were eluted with a 0.1% formic acid in 80:20 acetonitrile:water buffer followed by 20 mM ammonium formate, pH 10 in 80:20 acetonitrile:water. Each separate TMT-labeled BALF peptide sample was pooled, frozen, and lyophilized.

Offline high pH fractionation

Semi-preparative Scale Fractionation. For normal scale, processed samples (40 ug or more total peptides after pooling) were resuspended in 50 µL of 50 mM ammonium formate and fractionated offline by high pH C18 reversed phase (RP) chromatography as described previously(43) with the following changes. A Shimadzu Prominence HPLC (Shimadzu, Columbia, MD) with a Hot Sleeve-25L Column Heater (Analytical Sales & Products, Inc., Pompton Plains, NJ) was used with a column setup of a Security Guard precolumn housing a Gemini NX C18 cartridge (Phenomenex, Torrance, CA) attached to a C18 XBridge column, 150 mm x 2.1 mm internal diameter, 5 um particle size (Waters Corporation, Milford, MA). Buffer A was 20 mM ammonium formate, pH 10 in 98:2 water:acetonitrile and buffer B was 20 mM ammonium formate, pH 10 in 10:90 water:acetonitrile. The flow rate was 200 µL/min with a gradient from 2–7% buffer B over 0.5 min, 7–15% buffer B over 7.5 min, 15–35% buffer B over 45 min, and 35–60% buffer B over 15 min. Fractions were collected every 2 minutes and UV absorbances were monitored at 215 nm and 280 nm. Peptide-containing fractions were divided into three groups, “early”, “middle”, and “late”. A volume equal to 15 milli-absorbance units of the first “early” fraction was concatenated with the first “middle” and “late” fraction, and so on. Concatenated fractions were lyophilized and cleaned with STAGE tips using Waters Oasis MCX material as the stationary phase.
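The concatenation scheme described above (pooling the first “early”, first “middle”, and first “late” fraction, and so on) can be sketched as follows. This is an illustrative sketch only: the function name and the example of 12 peptide-containing fractions are hypothetical, and the volume normalization by UV absorbance is not modeled.

```python
def concatenate_fractions(fractions, n_groups=3):
    """Pool ordered fractions by position within contiguous groups.

    The ordered fractions are split into n_groups equal contiguous blocks
    (e.g. early, middle, late); the i-th fraction of every block is then
    combined into the i-th concatenated pool.
    """
    if len(fractions) % n_groups != 0:
        raise ValueError("number of fractions must be divisible by n_groups")
    block = len(fractions) // n_groups
    blocks = [fractions[i * block:(i + 1) * block] for i in range(n_groups)]
    return [tuple(b[i] for b in blocks) for i in range(block)]


# Hypothetical run with 12 peptide-containing fractions, numbered in
# collection order.
pools = concatenate_fractions(list(range(1, 13)))
for index, members in enumerate(pools, start=1):
    print(f"pool {index}: fractions {members}")
```

Combining fractions that elute far apart in the high pH dimension keeps each pool spread across the low pH LC-MS/MS gradient, which is the usual motivation for concatenation over simply merging adjacent fractions.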

Microscale Fractionation. For microscale samples, pooled peptides (16 ug total for TMT-labeled samples) were reconstituted in 100 mM NH₄HCO₂, pH 10. STAGE tips were prepared with a C8 core and C18-AQ resin packed on top, and conditioned and equilibrated. Peptides were transferred to STAGE tips and drawn through twice (centrifuged at 1000 x G for 2 minutes) to bind them to resin/filter. Peptides were washed and eluted into fractions sequentially with increasing concentrations (5%, 7.5%, 10%, 12.5%, 15%, 17.5%, 20%, 22.5%, 25%, 27.5%, 30%, 32.5%, 35%, 40%, 50%, 60%, 70%, 80%) of acetonitrile in LC-MS grade water.

LC-MS/MS analysis

For non-TMT labeled peptide mixtures, we reconstituted the dried peptide fractions in 97.9:2:0.1, H2O:acetonitrile (ACN):formic acid (FA) (load solvent) and analyzed ~300 nanograms of each fraction by capillary LC-MS with a Thermo Fisher Scientific, Inc (Waltham, MA) Dionex UltiMate 3000 RSLCnano system on-line with an Orbitrap Eclipse mass spectrometer (Thermo Scientific, Waltham MA) with FAIMS (high-field asymmetric waveform ion mobility) separation. We injected peptides directly in load solvent and performed gradient separation on a self-packed C18 column (Dr. Maisch GmbH ReproSil-PUR 1.9 um 120 Å C18aq, 100 um ID x 40 cm length) at 55°C with the following profile: 5% B solvent from 0–2 minutes, 8% B at 2.5 minutes, 21% B at 90 minutes, 35% B at 120 minutes and 90% B at 122 minutes with a flowrate of 400 nl/min from 0–2 minutes and 315 nl/minute from 2.5–122 minutes, where solvent A was 0.1% formic acid in water and solvent B was 0.1% formic acid in ACN. The FAIMS nitrogen cooling gas setting was 5.0 L/min, the carrier gas was 4.6 L/min and the inner and outer electrodes were set to 100°C. We scanned the CV (compensation voltage) at -45, -60 and -75 for 1 second each with a data dependent acquisition method. We employed the following MS parameters: ESI voltage +2.1 kV, ion transfer tube 275°C; no internal calibration; Orbitrap MS1 scan 120k resolution in profile mode from 380–1400 m/z with 50 msec injection time; 100% (4E5) automatic gain control (AGC); MS2 was triggered on precursors with 2–5 charges above 2.5E4 counts; MIPS (monoisotopic peak determination) was set to Peptide; MS2 settings (all CV’s) were: 1.6 Da quadrupole isolation window, 30% fixed collision energy, Orbitrap detection with 30K resolution at 200 m/z, first mass fixed at 110 m/z, 54 msec max injection time, 100% (5E4) AGC, 45 sec dynamic exclusion duration with +/- 10 ppm mass tolerance and exclusion lists were shared among CV’s.

For TMT-labeled peptide mixtures, we reconstituted the dried peptide fractions in 94.9:5:0.1 H2O:acetonitrile (ACN):formic acid (FA) (load solvent) and analyzed ~800 ng of each fraction by capillary LC-MS with a Dionex UltiMate 3000 RSLCnano system (Thermo Fisher Scientific, Waltham, MA) on-line with an Orbitrap Eclipse mass spectrometer (Thermo Fisher Scientific) equipped with FAIMS (high-field asymmetric waveform ion mobility spectrometry) separation. We injected peptides directly in load solvent and performed gradient separation on a self-packed C18 column (Dr. Maisch GmbH ReproSil-PUR 1.9 µm 120 Å C18aq, 100 µm ID x 40 cm length) at 55°C with the following profile: 5% solvent B from 0–2 minutes, 8% B at 2.5 minutes, 21% B at 135 minutes, 34% B at 180 minutes and 90% B at 182 minutes, with a flow rate of 325 nL/min from 0–2 minutes and 315 nL/min from 2.5–182 minutes, where solvent A was 0.1% formic acid in water and solvent B was 0.1% formic acid in ACN. The FAIMS nitrogen cooling gas setting was 5.0 L/min, the carrier gas was 4.6 L/min, and the inner and outer electrodes were set to 100°C. We scanned the compensation voltage (CV) at -45, -60 and -70 V for 1.5 seconds each with a data-dependent acquisition method. We employed the following MS parameters: ESI voltage +2.1 kV; ion transfer tube 275°C; no internal calibration; Orbitrap MS1 scan at 120K resolution in profile mode from 400–1400 m/z with 50 msec injection time and 100% (4 × 10⁵) automatic gain control (AGC); MS2 triggered on precursors with 2–6 charges above 2.5 × 10⁴ counts; MIPS (monoisotopic peak determination) set to Peptide. MS2 settings (all CVs) were: 0.7 Da quadrupole isolation window, 38% fixed collision energy, Orbitrap detection with 50K resolution at 200 m/z, first mass fixed at 110 m/z, 150 msec max injection time, 250% (1.25 × 10⁵) AGC, and 30 sec dynamic exclusion duration with +/- 10 ppm mass tolerance; exclusion lists were shared among CVs.
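The two gradient profiles described for the label-free and TMT-labeled runs can be compared with a short Python sketch. It is illustrative only: the text lists breakpoints but does not state how solvent B changes between them, so linear ramps are assumed here, and the names `LABEL_FREE`, `TMT` and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# (time in minutes, %B) breakpoints copied from the text.
LABEL_FREE = [(0, 5), (2, 5), (2.5, 8), (90, 21), (120, 35), (122, 90)]
TMT = [(0, 5), (2, 5), (2.5, 8), (135, 21), (180, 34), (182, 90)]


def percent_b(gradient, t):
    """Return %B at time t (minutes), assuming linear ramps between breakpoints."""
    times = [point[0] for point in gradient]
    if t < times[0] or t > times[-1]:
        raise ValueError(f"time {t} min is outside the gradient program")
    i = bisect_right(times, t) - 1  # index of the last breakpoint at or before t
    if i == len(gradient) - 1:
        return float(gradient[-1][1])
    (t0, b0), (t1, b1) = gradient[i], gradient[i + 1]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


# Midpoint of the second shallow segment in each method.
print(percent_b(LABEL_FREE, 105))
print(percent_b(TMT, 157.5))
```

Under the linear-ramp assumption, the main separation window (8% to 21% B) spans 87.5 minutes in the label-free method and 132.5 minutes in the TMT method, which is the principal difference between the two programs.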

Data analysis

Sequence database searching. We processed peptide tandem mass spectra using SEQUEST(44) (Thermo Scientific) in Proteome Discoverer (PD) 2.5. The human Universal Proteome (UP000005640) protein sequence database was downloaded from Uniprot.org on Sept 20, 2021 and merged with a common lab contaminant protein database (https://www.thegpm.org/crap/, groups 1, 2 and 3), for a total of 78,182 protein sequences. We applied the precursor mass recalibration node with precursor mass tolerance 20 ppm, product ion tolerance 0.1 Da, dynamic modification TMTpro (+304.2071 Da) on K (for TMT-labeled samples) and fixed carbamidomethyl (CAM) modification (+57.0215 Da) of C. The SEQUEST database search parameters were: trypsin with full specificity, a maximum of 2 missed cleavage sites, peptide length 6–50 amino acids, precursor tolerance 15 ppm, and fragment ion tolerance 0.06 Da. We specified CAM cysteine (+57.021 Da) as a fixed modification; the dynamic modifications were TMTpro on K and the peptide N-terminus (for TMT-labeled samples), acetylation of the protein N-terminus (+42.011 Da), oxidation of M (+15.995 Da), conversion of Q to pyroglutamic acid (-17.027 Da), M loss at the protein N-terminus (-131.040 Da), M loss plus acetylation at the protein N-terminus (-89.030 Da) and deamidation of N and Q (+0.984 Da). For protein inference, we applied 1% protein and peptide false discovery rate (FDR) filters using the Percolator algorithm(45) in PD.
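To make the search tolerances concrete, the following Python sketch converts a ppm tolerance into an absolute m/z window and checks that the combined modification mass follows from its two components. It is illustrative only and does not reproduce SEQUEST or Proteome Discoverer internals; the helper names and the example m/z of 800 are hypothetical.

```python
def ppm_window(mz, ppm):
    """Return the (low, high) m/z bounds for a symmetric ppm tolerance."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def ppm_error(observed, theoretical):
    """Return the signed mass error of an observation in ppm."""
    return (observed - theoretical) / theoretical * 1e6


# Mass shifts copied from the text (Da, rounded to three decimals).
MET_LOSS = -131.040   # M loss at the protein N-terminus
ACETYL = 42.011       # acetylation of the protein N-terminus

# The combined "M loss + acetylation" shift is the sum of the two parts.
combined = MET_LOSS + ACETYL

low, high = ppm_window(800.0, 15)  # 15 ppm precursor tolerance at m/z 800
print(f"15 ppm at m/z 800: {low:.3f} to {high:.3f}")
print(f"M loss + acetylation: {combined:.3f} Da")
```

Summing the rounded components gives -89.029 Da, which agrees with the listed -89.030 Da to within rounding of the third decimal place.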

Protein quantification. We used Proteome Discoverer (PD) for TMT-based protein quantification with the following parameters: unique and razor peptides were included, shared peptides were excluded, impurity corrections were applied, co-isolation threshold maximum was 50%, normalization was performed on the total peptide amount, protein ratio calculations were performed using pairwise ratio-based mode, which is similar to the method employed in MaxLFQ(46), and hypothesis testing was performed using the t-test. For multi-TMT experiments, we scaled the average reporter ion abundances of the pooled ‘control’ channel to 100 and scaled all other TMT reporter channels proportionally. PD employed the Benjamini-Hochberg false discovery rate procedure to control for errors associated with multiple hypothesis tests(47).
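Two of the steps above, scaling reporter channels relative to the pooled control and the Benjamini-Hochberg adjustment, can be illustrated with a minimal Python sketch. This is not the Proteome Discoverer implementation; the channel names, abundance values and p-values are hypothetical and serve only to show the arithmetic.

```python
def scale_to_control(abundances, control_channel, target=100.0):
    """Scale one protein's reporter abundances so the control channel equals target.

    All other channels are multiplied by the same factor, so ratios between
    channels are preserved.
    """
    factor = target / abundances[control_channel]
    return {channel: value * factor for channel, value in abundances.items()}


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical reporter abundances for one protein in one TMT plex.
scaled = scale_to_control({"pool": 50.0, "sample_1": 25.0, "sample_2": 100.0}, "pool")
print(scaled)

# Hypothetical p-values from four protein-level t-tests.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(q_values)
```

Because every plex is scaled to the same pooled control value, abundances from different plexes are placed on a common scale before they are compared.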

Ethics approval and consent to participate

All samples were de-identified, and the current study was approved by the University of Minnesota Institutional Review Board, which determined that the proposed activity is not research involving human subjects as defined by DHHS and FDA regulations.

Consent for publication

Not applicable.

Availability of data and materials

The mass spectrometry proteomics datasets generated and analyzed during the current study are available in the Zenodo repository with a data set identifier (DOI) 10.5281/zenodo.7153029.

The data can be accessed via https://doi.org/10.5281/zenodo.7153029.

Uploading of this data to the ProteomeXchange platform is currently in process.

Competing interests

The authors declare that they have no competing interests.

Funding

This work was supported by grant R01HL140971 from the NIH to C.H.W. and T.J.G. The Orbitrap Eclipse instrumentation platform used in this work was purchased through High-end Instrumentation grant S10OD028717 from the NIH.

Authors’ contributions

CW and TG designed the study and provided oversight of experiments and results interpretation. CW, TG, LP, AM, MK, and DW all planned steps involved in the sample preparation workflow and helped in troubleshooting methods and optimization. AM, MK and DW carried out experiments in the laboratory testing the workflow, optimizing, and demonstrating effectiveness. DW carried out final demonstration experiments on the complete, optimized workflows, including the TMT labeling experiments for quantitative proteomics. TM and LH advised in preparing the samples for MS analysis and generated the LC-MS/MS data. LH, PD, SM, and DW performed data processing using customized software tools. TG, CW, DW, PD, SM all interpreted results used for optimizing the method and demonstrating effectiveness. TG, CW, and DW wrote the manuscript. All authors read and approved the final manuscript.

Acknowledgements

We thank the Center for Mass Spectrometry and Proteomics at the University of Minnesota for providing services related to quantitative proteomics, including consulting on sample preparation methods, generating MS-based data and analysis of results.

References

Kahn FW, Jones JM. Analysis of bronchoalveolar lavage specimens from immunocompromised patients with a protocol applicable in the microbiology laboratory. J Clin Microbiol. 1988;26(6):1150-5.
Bhargava M, Wendt CH. Biomarkers in acute lung injury. Transl Res. 2012;159(4):205-17.
Domagala-Kulawik J. The relevance of bronchoalveolar lavage fluid analysis for lung cancer patients. Expert Rev Respir Med. 2020;14(3):329-37.
Wiktorowicz JE, Jamaluddin M. Proteomic analysis of the asthmatic airway. Adv Exp Med Biol. 2014;795:221-32.
Lin P, Chen Y, Su S, Nan W, Zhou L, Zhou Y, et al. Diagnostic value of metagenomic next-generation sequencing of bronchoalveolar lavage fluid for the diagnosis of suspected pneumonia in immunocompromised patients. BMC Infect Dis. 2022;22(1):416.
Chen J, Hu C, Pan P. Extracellular Vesicle MicroRNA Transfer in Lung Diseases. Front Physiol. 2017;8:1028.
Matthiesen R. MS-Based Biomarker Discovery in Bronchoalveolar Lavage Fluid for Lung Cancer. Proteomics Clin Appl. 2020;14(1):e1900077.
Wattiez R, Hermans C, Bernard A, Lesur O, Falmagne P. Human bronchoalveolar lavage fluid: two-dimensional gel electrophoresis, amino acid microsequencing and identification of major proteins. Electrophoresis. 1999;20(7):1634-45.
Cheng G, Ueda T, Numao T, Kuroki Y, Nakajima H, Fukushima Y, et al. Increased levels of surfactant protein A and D in bronchoalveolar lavage fluids in patients with bronchial asthma. Eur Respir J. 2000;16(5):831-5.
Sepper R, Prikk K, Metsis M, Sergejeva S, Pugatsjova N, Bragina O, et al. Mucin5B expression by lung alveolar macrophages is increased in long-term smokers. J Leukoc Biol. 2012;92(2):319-24.
Tirone C, Iavarone F, Tana M, Lio A, Aurilia C, Costa S, et al. Oxidative and Proteolytic Inactivation of Alpha-1 Antitrypsin in Bronchopulmonary Dysplasia Pathogenesis: A Top-Down Proteomic Bronchoalveolar Lavage Fluid Analysis. Front Pediatr. 2021;9:597415.
Vento G, Tirone C, Lulli P, Capoluongo E, Ameglio F, Lozzi S, et al. Bronchoalveolar lavage fluid peptidomics suggests a possible matrix metalloproteinase-3 role in bronchopulmonary dysplasia. Intensive Care Med. 2009;35(12):2115-24.
Callejon-Leblic B, Garcia-Barrera T, Gravalos-Guzman J, Pereira-Vega A, Gomez-Ariza JL. Metabolic profiling of potential lung cancer biomarkers using bronchoalveolar lavage fluid and the integrated direct infusion/ gas chromatography mass spectrometry platform. J Proteomics. 2016;145:197-206.
Nambiar S, Bong How S, Gummer J, Trengove R, Moodley Y. Metabolomics in chronic lung diseases. Respirology. 2020;25(2):139-48.
O'Connor JB, Mottlowitz M, Kruk ME, Mickelson A, Wagner BD, Harris JK, et al. Network Analysis to Identify Multi-Omic Correlations in the Lower Airways of Children With Cystic Fibrosis. Front Cell Infect Microbiol. 2022;12:805170.
Carnino JM, Lee H, Jin Y. Isolation and characterization of extracellular vesicles from Broncho-alveolar lavage fluid: a review and comparison of different methods. Respir Res. 2019;20(1):240.
Liu Z, Yan J, Tong L, Liu S, Zhang Y. The role of exosomes from BALF in lung disease. J Cell Physiol. 2022;237(1):161-8.
Rajczewski AT, Jagtap PD, Griffin TJ. An overview of technologies for MS-based proteomics-centric multi-omics. Expert Rev Proteomics. 2022:1-17.
Bhargava M, Viken KJ, Barkes B, Griffin TJ, Gillespie M, Jagtap PD, et al. Novel protein pathways in development and progression of pulmonary sarcoidosis. Sci Rep. 2020;10(1):13282.
Nguyen EV, Gharib SA, Schnapp LM, Goodlett DR. Shotgun MS proteomic analysis of bronchoalveolar lavage fluid in normal subjects. Proteomics Clin Appl. 2014;8(9-10):737-47.
Wattiez R, Falmagne P. Proteomics of bronchoalveolar lavage fluid. J Chromatogr B Analyt Technol Biomed Life Sci. 2005;815(1-2):169-78.
Guerrero CR, Maier LA, Griffin TJ, Higgins L, Najt CP, Perlman DM, et al. Application of Proteomics in Sarcoidosis. Am J Respir Cell Mol Biol. 2020;63(6):727-38.
Bowler RP, Wendt CH, Fessler MB, Foster MW, Kelly RS, Lasky-Su J, et al. New Strategies and Challenges in Lung Proteomics and Metabolomics. An Official American Thoracic Society Workshop Report. Ann Am Thorac Soc. 2017;14(12):1721-43.
Govender P, Dunn MJ, Donnelly SC. Proteomics and the lung: Analysis of bronchoalveolar lavage fluid. Proteomics Clin Appl. 2009;3(9):1044-51.
Leroy B, Falmagne P, Wattiez R. Sample preparation of bronchoalveolar lavage fluid. Methods Mol Biol. 2008;425:67-75.
Plymoth A, Lofdahl CG, Ekberg-Jansson A, Dahlback M, Lindberg H, Fehniger TE, et al. Human bronchoalveolar lavage: biofluid analysis with special emphasis on sample preparation. Proteomics. 2003;3(6):962-72.
Sim SY, Choi YR, Lee JH, Lim JM, Lee SE, Kim KP, et al. In-Depth Proteomic Analysis of Human Bronchoalveolar Lavage Fluid toward the Biomarker Discovery for Lung Cancers. Proteomics Clin Appl. 2019;13(5):e1900028.
Ye J, Liu P, Li R, Liu H, Pei W, Ma C, et al. Biomarkers of connective tissue disease-associated interstitial lung disease in bronchoalveolar lavage fluid: A label-free mass spectrometry-based relative quantification study. J Clin Lab Anal. 2022;36(5):e24367.
Ortea I, Rodriguez-Ariza A, Chicano-Galvez E, Arenas Vacas MS, Jurado Gamez B. Discovery of potential protein biomarkers of lung adenocarcinoma in bronchoalveolar lavage fluid by SWATH MS data-independent acquisition and targeted data extraction. J Proteomics. 2016;138:106-14.
Prely LM, Paal K, Hermans J, van der Heide S, van Oosterhout AJM, Bischoff R. Quantification of matrix metalloprotease-9 in bronchoalveolar lavage fluid by selected reaction monitoring with microfluidics nano-liquid-chromatography–mass spectrometry. Journal of Chromatography A. 2012;1246:103-10.
Elinger D, Gabashvili A, Levin Y. Suspension Trapping (S-Trap) Is Compatible with Typical Protein Extraction Buffers and Detergents for Bottom-Up Proteomics. Journal of Proteome Research. 2019;18(3):1441-5.
HaileMariam M, Eguez RV, Singh H, Bekele S, Ameni G, Pieper R, et al. S-Trap, an Ultrafast Sample-Preparation Approach for Shotgun Proteomics. Journal of Proteome Research. 2018;17(9):2917-24.
Malmstrom K, Lehto M, Majuri ML, Paavonen T, Sarna S, Pelkonen AS, et al. Bronchoalveolar lavage in infants with recurrent lower respiratory symptoms. Clin Transl Allergy. 2014;4:35.
Riedler J, Grigg J, Stone C, Tauro G, Robertson CF. Bronchoalveolar lavage cellularity in healthy children. American Journal of Respiratory and Critical Care Medicine. 1995;152(1):163-8.
Dayon L, Hainard A, Licker V, Turck N, Kuhn K, Hochstrasser DF, et al. Relative quantification of proteins in human cerebrospinal fluids by MS/MS using 6-plex isobaric tags. Anal Chem. 2008;80(8):2921-31.
Myers SA, Rhoads A, Cocco AR, Peckner R, Haber AL, Schweitzer LD, et al. Streamlined Protocol for Deep Proteomic Profiling of FAC-sorted Cells and Its Application to Freshly Isolated Murine Immune Cells. Mol Cell Proteomics. 2019;18(5):995-1009.
Dimayacyac-Esleta BRT, Tsai C-F, Kitata RB, Lin P-Y, Choong W-K, Lin T-D, et al. Rapid High-pH Reverse Phase StageTip for Sensitive Small-Scale Membrane Proteomic Profiling. Analytical Chemistry. 2015;87(24):12016-23.
Kim H, Dan K, Shin H, Lee J, Wang JI, Han D. An efficient method for high-pH peptide fractionation based on C18 StageTips for in-depth proteome profiling. Analytical Methods. 2019;11(36):4693-8.
Searle BC, Pino LK, Egertson JD, Ting YS, Lawrence RT, MacLean BX, et al. Chromatogram libraries improve peptide detection and quantification by data independent acquisition mass spectrometry. Nat Commun. 2018;9(1):5128.
Akata K, Leung JM, Yamasaki K, Leitao Filho FS, Yang J, Xi Yang C, et al. Altered Polarization and Impaired Phagocytic Activity of Lung Macrophages in People With Human Immunodeficiency Virus and Chronic Obstructive Pulmonary Disease. J Infect Dis. 2022;225(5):862-7.
Cribbs SK, Uppal K, Li S, Jones DP, Huang L, Tipton L, et al. Correlation of the lung microbiota with metabolic profiles in bronchoalveolar lavage fluid in HIV infection. Microbiome. 2016;4:3.
Rappsilber J, Ishihama Y, Mann M. Stop and go extraction tips for matrix-assisted laser desorption/ionization, nanoelectrospray, and LC/MS sample pretreatment in proteomics. Anal Chem. 2003;75(3):663-70.
Yang F, Shen Y, Camp DG, 2nd, Smith RD. High-pH reversed-phase chromatography with fraction concatenation for 2D proteomic analysis. Expert Rev Proteomics. 2012;9(2):129-34.
Eng JK, McCormack AL, Yates JR. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J Am Soc Mass Spectrom. 1994;5(11):976-89.
Käll L, Canterbury JD, Weston J, Noble WS, MacCoss MJ. Semi-supervised learning for peptide identification from shotgun proteomics datasets. Nature Methods. 2007;4(11):923-5.
Cox J, Hein MY, Luber CA, Paron I, Nagaraj N, Mann M. Accurate Proteome-wide Label-free Quantification by Delayed Normalization and Maximal Peptide Ratio Extraction, Termed MaxLFQ. Molecular & Cellular Proteomics. 2014;13(9):2513-26.
Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society: Series B (Methodological). 1995;57(1):289-300.

SupplementaryFigure1.xlsx
Additional file 1: Supplementary Figure 1 Representative spectra of contamination results; precipitation vs. S-traps (XLSX)
SupplementaryTable1.xlsx
Additional file 2: Supplementary Table 1 Protein and peptide yields across processing workflow for all BALF samples (XLSX)
SupplementaryTable2.xlsx
Additional file 3: Supplementary Table 2 Complete qualitative peptide/protein identification of experimental TMT16 plex and quantitative analysis of repeatability experiment (XLSX)
SupplementaryTable3.xlsx
Additional file 4: Supplementary Table 3 Quantitative protein and peptide identification across one TMT16 plex group; fractionated vs. unfractionated analysis (XLSX)


Editorial decision (major revision): 21 Dec, 2022
Reviews received at journal: 12 Dec, 2022
Reviewers agreed at journal: 08 Dec, 2022
Reviewers invited by journal: 07 Dec, 2022
Editor assigned by journal: 07 Dec, 2022
Submission checks completed at journal: 07 Dec, 2022
First submitted to journal: 07 Nov, 2022
