Pandas DataFrames are commonly used to represent MS data in a tabular structure. Mass spectrometry data is typically loaded from a raw experiment file (e.g., mzML, .d, .raw, .wiff) into a DataFrame object using a library capable of accessing the raw data (Figure 1, Listings 1-4) [6,18–20]. Further preprocessing and data manipulation, such as filtering, smoothing, or peak picking, can be performed directly on the pandas DataFrame before plotting. pyOpenMS-viz has been designed to make plotting MS data as accessible as possible. At its core, pyOpenMS-viz extends the pandas API, allowing users to plot mass spectrometry data directly from a DataFrame with a single line of code, which lowers the learning curve and encourages visual data exploration (Figure 1). For all plot types, users specify which columns of the DataFrame to plot and how to group the data. This reduces the need for strict column naming conventions or data reformatting, simplifying the visualization process without affecting upstream data processing tools. It also allows plots to be easily adapted to alternative data dimensions, such as ion mobility.
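As a minimal sketch of this workflow, the example below builds a small peak list, filters it directly on the DataFrame, and wraps the plotting step in a helper. The peak values, the column names, and the 1 % threshold are hypothetical; the `kind="spectrum"` and `backend="ms_matplotlib"` arguments are assumptions about the pyOpenMS-viz interface and require the package to be installed.

```python
import pandas as pd

# Hypothetical peak list; in practice it would come from a raw-file reader.
spectrum = pd.DataFrame(
    {
        "mz": [100.05, 150.10, 200.12, 250.20, 300.25],
        "intensity": [1200.0, 35.0, 8400.0, 12.0, 560.0],
        # Extra metadata column that is not needed for plotting.
        "ion_annotation": ["b1", "noise", "y2", "noise", "y3"],
    }
)

# Preprocessing on the DataFrame itself: drop peaks below 1 % of the base peak.
threshold = 0.01 * spectrum["intensity"].max()
filtered = spectrum[spectrum["intensity"] >= threshold].reset_index(drop=True)


def plot_spectrum(df: pd.DataFrame, backend: str = "ms_matplotlib"):
    """Plot a spectrum in a single call (assumed pyOpenMS-viz usage)."""
    # Columns are named explicitly, so no naming convention is imposed on df.
    return df.plot(x="mz", y="intensity", kind="spectrum", backend=backend)
```

Because the columns are passed by name, the additional `ion_annotation` column can stay in the DataFrame without affecting the plot.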
Figure 1: pyOpenMS-viz is a flexible plotting platform for mass spectrometry data visualization. pyOpenMS-viz is a Python-based visualization tool, which makes it compatible with a wide variety of pre-existing mass spectrometry tools. It extends the built-in plotting module of pandas, a widely used data analysis and manipulation tool that is already common in the mass spectrometry Python ecosystem. This allows for direct plotting of DataFrames. pyOpenMS-viz only uses user-specified columns, so users can store additional custom metadata that is not used in plotting. Currently supported plot types include spectrum, peak map, mobilogram, and chromatogram. Supported backends include Matplotlib, Bokeh, and Plotly, allowing both static and interactive visualization.
Table 1: Overview of supported plot types. pyOpenMS-viz supports a diverse set of mass spectrometry plots across different backends to enable the generation of both interactive and publication-quality plots. All plots are supported across all backends, with the exception of the 3D peak map, because the Bokeh backend has limited support for 3D visualizations. Visualizations of these plots across different backends can be found in Supp. Figure 2.
The class-based architecture (Supp. Figure 1) also ensures a consistent API across diverse plotting backends. This allows users to switch between generating static figures with Matplotlib and interactive figures with Bokeh or Plotly by changing a single parameter (Supp. Figure 2, Listing 5). pyOpenMS-viz is designed to work well in different environments (Figure 2). In script-based workflows, it facilitates the generation of static, publication-quality figures using Matplotlib, either for single files or while batch processing large data sets (Supp. Figure 3). Within Jupyter notebooks, researchers can process and visually explore their data in a single workflow. Sharing a notebook along with its visualizations is a convenient way to share results and can increase reproducibility. pyOpenMS-viz can also be integrated into Python-based web applications to create interactive visualization tools that allow users to explore MS data online. Because the package works the same way across these different setups, it simplifies the learning process for users and allows them to use it in different contexts. The currently supported plots include spectra, chromatograms, mobilograms, and peak maps (Table 1, Figure 1), which are further explained below.
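The single-parameter switch between backends can be sketched as follows. The backend identifiers are assumed to follow an `ms_<library>` naming scheme, and `resolve_backend` and `plot_chromatogram` are hypothetical helpers introduced only for illustration; the plotting call itself requires pyOpenMS-viz to be installed.

```python
import pandas as pd

# Backend identifiers assumed to follow an "ms_<library>" naming scheme.
BACKENDS = {
    "matplotlib": "ms_matplotlib",  # static figures
    "bokeh": "ms_bokeh",            # interactive figures
    "plotly": "ms_plotly",          # interactive figures
}


def resolve_backend(library: str) -> str:
    """Map a plotting library name to the assumed pandas backend identifier."""
    try:
        return BACKENDS[library.lower()]
    except KeyError:
        raise ValueError(f"Unsupported plotting library: {library!r}") from None


def plot_chromatogram(df: pd.DataFrame, library: str = "matplotlib"):
    """Issue the same call for every backend; only `backend` changes."""
    return df.plot(
        x="rt",
        y="intensity",
        kind="chromatogram",
        backend=resolve_backend(library),
    )
```

In a script, `plot_chromatogram(df, "matplotlib")` would produce a static figure, whereas the same call with `"bokeh"` or `"plotly"` would produce an interactive one in a notebook or web application.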
In mass spectrometry, the spectrum plot is widely used to verify the identity of an analyte. It visualizes recorded ion intensities (e.g., the measured ion current) distributed over a mass-to-charge (m/z) dimension. A common use case is the visualization of annotated tandem mass spectra for verifying analyte identification or intensity ratios in label-based quantification workflows (Figure 3a). pyOpenMS-viz spectrum plots leverage the groupby functionality to group and color peaks based on annotation. pyOpenMS-viz also supports custom peak annotations, such as text labels that highlight peaks of interest, as well as mirror spectrum plots for comparing an experimental spectrum with a reference spectrum (e.g., from spectral libraries or fragment intensity predictions) (Supp. Figure 4, Listing 6). Furthermore, to speed up plotting, pyOpenMS-viz bins peaks to reduce the number of peaks rendered (Listing 7).
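The effect of peak binning can be illustrated with plain pandas. The sketch below assumes fixed-width m/z bins and keeps only the most intense peak in each bin; the binning strategy actually used by pyOpenMS-viz may differ, and `bin_spectrum` is a hypothetical helper.

```python
import numpy as np
import pandas as pd


def bin_spectrum(df, n_bins, mz_col="mz", int_col="intensity"):
    """Keep the most intense peak in each of n_bins equal-width m/z bins."""
    edges = np.linspace(df[mz_col].min(), df[mz_col].max(), n_bins + 1)
    # Integer bin index per peak; include_lowest keeps the minimum m/z.
    mz_bin = pd.cut(df[mz_col], bins=edges, include_lowest=True, labels=False)
    mz_bin = mz_bin.rename("mz_bin")
    # Row label of the most intense peak within each occupied bin.
    keep = df.groupby(mz_bin)[int_col].idxmax()
    return df.loc[keep].sort_values(mz_col).reset_index(drop=True)


# Hypothetical spectrum with six peaks, reduced to at most two rendered peaks.
peaks = pd.DataFrame(
    {
        "mz": [100.0, 100.4, 101.0, 105.0, 109.0, 110.0],
        "intensity": [10.0, 50.0, 20.0, 5.0, 7.0, 30.0],
    }
)
reduced = bin_spectrum(peaks, n_bins=2)
```

Keeping the maximum per bin preserves the apparent envelope of the spectrum, while the number of rendered peaks is bounded by the number of bins.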
Chromatogram plots are commonly used to visualize the total ion current detected in a mass spectrometer or the elution profiles of one or more analytes tracked over retention time. In targeted experiments, a chromatogram of multiple mass traces is frequently inspected to verify the quality of identification and quantification. pyOpenMS-viz supports grouping of mass traces using the pandas groupby functionality. Grouped mass traces are plotted with different colors in the same plot (Figure 3b, Supp. Figure 4, Listing 6). Users can manually select and group columns of the same DataFrame to achieve different types of visualizations (Figure 3b, Supp. Figure 4, Listing 6). Furthermore, peak boundaries can optionally be supplied in an additional pandas DataFrame for annotation, which allows the output of automated peak-picking tools such as OpenSwath [21] or DIA-NN [22] to be validated (Listing 8). Plots can easily be adapted to alternative dimensions such as ion mobility. As an example, we include the Mobilogram plot class, a modified version of the chromatogram plot for visualizing ion mobility data in PASEF workflows (Figure 3c).
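A minimal sketch of grouped mass traces and a separate peak-boundary table is given below. The trace values, the column names (`transition`, `left`, `right`), and the boundary values are hypothetical. The `by` argument in the plotting helper is an assumption about the pyOpenMS-viz interface, and that call requires the package to be installed.

```python
import pandas as pd

# Hypothetical long-format chromatogram with two mass traces.
chrom = pd.DataFrame(
    {
        "rt": [10.0, 10.5, 11.0, 11.5, 12.0] * 2,
        "intensity": [0.0, 40.0, 100.0, 30.0, 0.0, 5.0, 20.0, 60.0, 15.0, 0.0],
        "transition": ["y4"] * 5 + ["y5"] * 5,
    }
)

# Hypothetical peak boundaries, e.g. as reported by a peak-picking tool.
boundaries = pd.DataFrame({"left": [10.5], "right": [11.5]})

# Apex of each trace: co-eluting traces should share the same apex time.
apex = chrom.loc[chrom.groupby("transition")["intensity"].idxmax()]

# Fraction of each trace's summed intensity that lies within the boundaries.
left, right = boundaries.loc[0, "left"], boundaries.loc[0, "right"]
inside = chrom["rt"].between(left, right)
fraction_inside = (
    chrom[inside].groupby("transition")["intensity"].sum()
    / chrom.groupby("transition")["intensity"].sum()
)


def plot_traces(df: pd.DataFrame, backend: str = "ms_matplotlib"):
    """Plot one colored line per transition (assumed pyOpenMS-viz usage)."""
    return df.plot(
        x="rt", y="intensity", kind="chromatogram", by="transition", backend=backend
    )
```

Numerical checks of this kind complement the visual inspection: a boundary that captures little of a trace's intensity, or traces with diverging apex times, would point to a questionable peak pick.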
Sometimes, visualization across a single dimension is not sufficient for quality control, and visualization across two or more dimensions is required. For example, feature detection algorithms may detect eluting analytes in the retention time and m/z dimensions, or additional dimensions of separation such as ion mobility can produce multidimensional spectra. Peak maps plot peak data along two dimensions using color-coded intensities, producing a plot that resembles a heatmap (Figure 3d-f). Since peak maps commonly display large amounts of data, which can be slow to render in a Python plotting environment, pyOpenMS-viz supports dynamic peak binning to display such data efficiently without much loss of visual information (Listing 7). Peak maps can also be plotted with marginal plots, such that a chromatogram representing the total ion current and a merged spectrum of the visible area are included in the margins (Figure 3g-i). In addition to the heatmap-like 2D view, peak maps can be displayed in a 3D format. In this view, intensity is represented as a third dimension, allowing for a clearer and more detailed representation of the data, particularly of elution profiles and isotopic peaks, in a single plot (Figure 3j-l, Supp. Figure 4).
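Peak binning for a peak map extends the same idea to two dimensions. The sketch below assumes an equal-width grid over retention time and m/z with summed intensity per cell; the aggregation and bin selection used by pyOpenMS-viz may differ, and `bin_peak_map` is a hypothetical helper.

```python
import pandas as pd


def bin_peak_map(df, n_rt_bins, n_mz_bins,
                 rt_col="rt", mz_col="mz", int_col="intensity"):
    """Aggregate peaks on an equal-width retention time x m/z grid."""
    rt_bin = pd.cut(df[rt_col], bins=n_rt_bins, labels=False)
    mz_bin = pd.cut(df[mz_col], bins=n_mz_bins, labels=False)
    return (
        df.assign(rt_bin=rt_bin, mz_bin=mz_bin)
        .groupby(["rt_bin", "mz_bin"], as_index=False)
        .agg(
            **{
                rt_col: (rt_col, "mean"),   # representative position of the cell
                mz_col: (mz_col, "mean"),
                int_col: (int_col, "sum"),  # summed intensity per cell
            }
        )
    )


# Hypothetical peak list with five peaks, reduced to a 2 x 2 grid.
peaks = pd.DataFrame(
    {
        "rt": [1.0, 1.1, 1.0, 9.0, 9.1],
        "mz": [100.0, 100.2, 500.0, 100.1, 499.0],
        "intensity": [10.0, 20.0, 5.0, 7.0, 3.0],
    }
)
binned = bin_peak_map(peaks, n_rt_bins=2, n_mz_bins=2)
```

Summing within cells preserves the total intensity, so marginal chromatograms and spectra derived from the binned data remain consistent with the unbinned input.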