Software. Data was processed with SciPy (version 1.3.1)10, NumPy (version 1.15.4)11, and pandas (version 0.23.4)12. Tabular data images were made with LibreOffice Calc (https://www.libreoffice.org/). Data visualization was performed with the matplotlib (version 3.0.2)13 and seaborn (version 0.9.0)14 libraries for the Python programming language. Principal Component Analysis (PCA) was performed with Scikit-learn (version 0.20.0)15. reString can be automatically installed via pip, the Python package installer (as detailed in Procedure), or downloaded from GitHub (https://github.com/Stemanz/restring) or from the Python Package Index, PyPI (https://pypi.org/project/restring/).
Software setup. reString is a Python program that needs a working Python 3.6+ environment. In its GUI form, it leverages tkinter16, the Python binding to the Tk GUI toolkit, which is included in most distributions. reString will work out of the box (when installed via pip, see below) within a standard Python installation that can be obtained at https://www.python.org/ for Windows, macOS, GNU/Linux distributions and other platforms.
Statistical analyses. Statistical analyses are detailed for each individual analysis in the appropriate figure or table caption, and were performed with GraphPad Prism software version 9.1.1 (223).
Procedure. The following protocol illustrates how to prepare an environment to use reString, as well as how to analyze sample RNAseq results. An active internet connection is required to download software and data. Further detailed and up-to-date information on additional features of reString can be found at the project’s repository (https://github.com/Stemanz/restring).
Procedure – Installation. The steps used in this protocol are intended for machines running macOS, GNU/Linux, or Windows. Some commands need to be executed from the UNIX shell (the Terminal app) and are prefixed by the “$” symbol, which must not be typed with the commands. The same commands can be entered in the Windows shell (Users will see a “>” symbol at the end of the prompt instead of a “$” symbol).
1 | Download and install Python. Head over to https://www.python.org/downloads/ and get the latest Python release for your operating system. Windows users can also install Python through the Microsoft Store. Please note that Python might already be installed on your system; check whether this is the case and whether its version is 3.6 or greater (open a terminal, see below, type ‘python --version’ and inspect the output; Windows users can alternatively search for Python in the Start menu, or type ‘py’ in a terminal). Install the software by following the on-screen instructions.
1.a | Optional for GNU/Linux: While all libraries needed by reString are included by default (or are installed automatically) in macOS and Windows, GNU/Linux distros vary greatly in this respect. While we assume that the average GNU/Linux user will be able to address any missing dependency, we know that not all Debian-based distros include tkinter, a library that reString relies upon. This can be fixed with the following terminal command (for any other issue, please refer to the updated documentation of the specific distro):
$ sudo apt-get install python3-tk
2 | Open a terminal. On macOS, this is done by running the Terminal app from the Utilities (to access Utilities, from Finder ⌘ + ⇧ + U, or Go > Utilities). On Windows, bring up the Start menu and type ‘cmd’ in the search field, then run it.
3 | Install reString. reString and its dependencies can be automatically installed from the command line. Type:
$ pip install restring
3.a | Optional troubleshooting for Windows: it is possible that after the installation the system does not know where to find pip, the Python package manager. Should this occur, locate the folder containing it by typing in a terminal:
> cd\
> dir pip.exe /a /s
The folder containing it (for example: C:\Users\username\AppData\Local\Programs\Python39\Scripts) needs to be added to the environment PATH variable. Start typing “environment variables” in the Windows search box, and click on “Edit the system environment variables”. Open a dialog by clicking on “Environment variables”, then double-click on PATH. Add the folder to the list, OK and exit. Close and reopen the terminal.
4 | Run reString. The installation takes care of creating a script that automatically runs the graphical user interface (GUI), which can be invoked directly from the terminal:
$ restring-gui
reString should launch and the User should see the program’s main window (Supplementary Fig. S1). Steps 1 and 3 will not be needed anymore to run reString.
Please note that on some Windows setups the antivirus might scan restring-gui.exe for threats. This is normal and should not take longer than a few seconds. Also on Windows, if the system fonts are scaled to 125% or above, reString fonts might be displayed too large. Reduce system fonts scaling to 100% to solve the issue.
5 | Update reString. To periodically ensure that reString is up to date, type in the terminal:
$ pip install restring --upgrade
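The checks described in the installation steps above (Python version, availability of tkinter, presence of reString, and the folder where pip places its scripts) can also be gathered from within Python itself. The following is a minimal sketch using only the standard library; the function names are hypothetical and are not part of reString, and the scripts folder reported by sysconfig may differ from the actual one for per-user installations.

```python
import importlib.util
import sys
import sysconfig

try:
    from importlib import metadata  # available from Python 3.8
except ImportError:  # Python 3.6/3.7: the version lookup is skipped
    metadata = None


def installed_version(package):
    """Return the installed version of `package`, or None if it cannot be found."""
    if metadata is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Collect the facts that the installation steps ask the user to verify."""
    # tkinter needs both the Python package and the compiled _tkinter module.
    tk_modules = ("tkinter", "_tkinter")
    return {
        "python_ok": sys.version_info >= (3, 6),
        "tkinter_available": all(
            importlib.util.find_spec(name) is not None for name in tk_modules
        ),
        "restring_version": installed_version("restring"),
        # Folder where scripts such as pip and restring-gui are usually placed.
        "scripts_dir": sysconfig.get_path("scripts"),
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If "restring_version" is reported as None, reString is either not installed in the interpreter that runs the sketch or the interpreter is older than Python 3.8.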
Procedure – Analysis. The protocol is illustrated through an example experiment that makes use of sample files, which Users can analyze to familiarize themselves with the file format accepted by reString. Ideally, each file should have a name that serves as the label for the experimental condition. The file structure is detailed in Supplementary Fig. S2.
1 | Download sample files: in the program’s main app, choose “File > Download sample data”. Alternatively, download them from https://github.com/Stemanz/restring/tree/main/sample_data.
2 | Prepare sample files. After downloading sample data (the file is called restring_sample_tables.zip and is found in the default browser’s download folder, usually Downloads), unzip the folder and copy it over to any desired location. In our example, we will be using the home directory (on the Mac, Finder > Go > Home, or ⌘ + ⇧ + H).
3 | Create the output folder. Create a folder of choice to store the results. In this example, we will create the folder output within the sample data folder.
4 | Choose the input files. To tell reString which input files to process, choose “File > Open...” or click “Open files..”. The file choosing dialog will open (Supplementary Fig. S3). Select all of them and click “Open”. For each file successfully opened, reString prints a message on the textual output frame (Supplementary Fig. S4). The frame is scrollable so that Users can always inspect each step of the analysis.
5 | Choose the output folder. Choose a previously created folder to store the analysis results. Choose “File > Set output folder” or click “Set folder” to open the dialog (Supplementary Fig. S5).
6 | Run the analysis with default settings. Choose “Analysis > New analysis” or click the “New analysis” button to start retrieving and aggregating results automatically. Please note that the computer must be connected to the internet. Refer to “Procedure – Analysis parameters” to learn about all settings.
reString will automatically retrieve from STRING functional enrichment information for statistically significant terms from the KEGG Pathways, Gene Ontology (Biological Processes, Molecular Function and Cellular Component), and Reactome Pathways knowledge bases. These will be stored in a subfolder of the output folder with the same name as the input file and are equivalent to those that Users can manually download from STRING (Supplementary Fig. S6). Depending on which genes were selected in the analysis, file names are prepended with “UP_”, “DOWN_” or “ALL_” (upregulated, downregulated and all genes simultaneously, see “Procedure – Analysis parameters”). reString details all steps it takes to retrieve functional enrichment analysis information automatically from STRING (Supplementary Fig. S7). reString will then aggregate retrieved results (Supplementary Fig. S8) and, for each functional enrichment searched (KEGG, Function, Component, Process and RCTM), produce two tables that contain the abridged version of the whole analysis: results and summary (Fig. 1).
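To illustrate the naming convention of the retrieved files, the sketch below groups a list of file names by their “UP_”, “DOWN_” or “ALL_” prefix. It is only an illustration: the function name and the example file names are hypothetical, and the actual names written by reString may follow a different pattern after the prefix.

```python
from collections import defaultdict
from pathlib import Path

# Prefixes described in the text; everything else is labeled "OTHER".
PREFIXES = ("UP_", "DOWN_", "ALL_")


def group_by_prefix(file_names):
    """Group file names by the gene set (UP, DOWN, ALL) encoded in their prefix."""
    groups = defaultdict(list)
    for name in file_names:
        base = Path(name).name  # ignore any leading folders
        for prefix in PREFIXES:
            if base.startswith(prefix):
                groups[prefix.rstrip("_")].append(base)
                break
        else:
            groups["OTHER"].append(base)
    return dict(groups)


# Hypothetical file names, for illustration only.
example_files = [
    "UP_enrichment_KEGG.tsv",
    "DOWN_enrichment_KEGG.tsv",
    "ALL_enrichment_Process.tsv",
    "notes.txt",
]
print(group_by_prefix(example_files))
```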
Procedure – Analysis parameters. The retrieval of functional enrichment information and the following aggregation are performed by reString with default settings, which can be adjusted.
Species – reString defaults to Mus musculus, but a different species can be selected: “Analysis > Set species” will open the species selection dialog (Supplementary Fig. S9). For species that are not listed, a taxonomy identifier can be manually set.
Upregulated and/or downregulated genes – reString knows from the input files whether genes/proteins are up- or downregulated in any given comparison between two conditions (Supplementary Fig. S2 and S10). This allows four different types of analysis to be selected via “Analysis > DE genes settings”: i) “Upregulated genes only”: Functional enrichment info is searched for upregulated genes only; ii) “Downregulated genes only”: Functional enrichment info is searched for downregulated genes only; iii) “Upregulated and Downregulated, separately”: This is the default option. For every comparison, both upregulated and downregulated genes are considered, but separately. This means that functional enrichment info is retrieved for upregulated and downregulated genes separately, but the terms are aggregated from both. If a term shows up in both UP and DOWN gene lists, then the one with the lowest P-value is recorded; iv) “All genes together”: Functional enrichment info is searched for all genes together, and the resulting aggregation will reflect the functional enrichment analysis retrieved with all genes combined.
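The aggregation rule of the default option (iii) can be sketched as follows: terms from the UP and DOWN lists are merged, and when a term appears in both, the lowest P-value is kept. This is a minimal illustration of the rule as described in the text, not reString’s actual implementation; the function name, term names and P-values are made up.

```python
def aggregate_terms(up_terms, down_terms):
    """Merge two {term: p_value} mappings, keeping the lowest P-value per term."""
    merged = dict(up_terms)
    for term, p_value in down_terms.items():
        if term not in merged or p_value < merged[term]:
            merged[term] = p_value
    return merged


# Hypothetical enrichment results for one comparison.
up = {"Lipid metabolic process": 1e-4, "Immune response": 3e-2}
down = {"Immune response": 5e-6, "Muscle contraction": 2e-3}

# "Immune response" is found in both lists; the DOWN value (5e-6) is lower.
print(aggregate_terms(up, down))
```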
Procedure – Data visualization. reString integrates a flexible data visualization tool that helps Researchers visualize the aggregated results and produce publication-quality heatmaps and clustermaps with a few clicks. The window can be opened via “Analysis > Draw clustermap” (Supplementary Fig. S11), and it is intended to work with reString results-type tables (Fig. 1). Detailed information on each option can be found in the Supplementary Materials and Methods or in the online documentation (https://github.com/Stemanz/restring/blob/main/README.md).
Animals and experimental procedures. Procedures involving animals and their care were conducted in accordance with institutional guidelines, in compliance with national (D.L. No. 26, March 4, 2014, G.U. No. 61 March 14, 2014) and international (EEC Council Directive 2010/63, September 22, 2010: Guide for the Care and Use of Laboratory Animals, United States National Research Council, 2011) laws and policies, and with the ARRIVE guidelines17. The experimental protocol was approved by the Italian Ministry of Health (Protocollo 2012/4).
Apoe knockout (EKO) mice (https://www.jax.org/strain/002052) in the C57BL/6J background were purchased from Charles River Laboratories (Calco, Italy); double Apoe and Apoa1 knockouts (DKO) were generated as previously described18,19.
Eight-week-old male mice were randomly divided, genotype-wise, into 8 groups and fed either a normal laboratory diet (NLD, 4RF21, Mucedola, Italy) or a Western-type diet (WD, TD.88137, Envigo, Italy) for 6 or 22 weeks, and sacrificed as described20.
Aortas were then snap-frozen in liquid nitrogen for RNA-seq analyses (n = 3) or longitudinally opened, pinned flat on a black wax surface in ice-cold PBS and photographed unstained for en face analysis (n = 6–7)20–26.
RNA extraction. Total RNA was isolated from mouse aorta and extracted as previously described26. RNA was quantified and purity was checked, and 1 µg RNA was retrotranscribed to cDNA, as described27. Possible gDNA contamination was ruled out by running a PCR on 20 ng of cDNA/RNA with a primer pair producing two amplicons of different size on cDNA (193 bp) and gDNA (677 bp), see Supplementary Table S1 and Supplementary Fig. S12.
Quantitative PCR. Twenty ng of cDNA were used as template for each qPCR reaction, performed on a CFX Connect thermal cycler with iTaq Universal SYBR Green Supermix (Bio-Rad, Segrate, Italy). Conditions and primers are detailed in Supplementary Table S1. A final melting curve analysis was always performed. Fold changes relative to the control group were calculated with the ΔΔCt method28. The gene cyclophilin A (Ppia) was used as reference gene29.
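For readers unfamiliar with the ΔΔCt calculation, the sketch below shows the standard form of the method, in which the fold change equals 2 raised to −ΔΔCt. It assumes roughly 100% amplification efficiency for both the target and the reference gene; the function name and the Ct values are hypothetical and are not taken from this study.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the ΔΔCt method (2^-ΔΔCt)."""
    # ΔCt: target gene Ct normalized to the reference gene (here, Ppia).
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # ΔΔCt: normalized sample value relative to the control group.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier in the
# sample than in the control, while the reference gene is unchanged.
print(fold_change(24.0, 18.0, 26.0, 18.0))  # ΔΔCt = -2, fold change = 4.0
```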
RNA-seq analyses. The quality of the mRNA was tested using the Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA) prior to RNA-seq; samples with RIN < 7.0 were discarded. RNA samples were processed using the RNA-Seq Sample Prep kit from Illumina (Illumina, Inc., CA, USA). Clusters of tagged libraries (8 to 9 per single Illumina flowcell, created using the Illumina Cluster Station) were sequenced on a Genome Analyzer IIx (Illumina, Inc., CA, USA) to produce 50-nt-long, unpaired reads. Reads were mapped on the UCSC genome assembly mm10 (reference strain C57BL/6J) using the bowtie and tophat programs of the classic tuxedo suite30. Estimation of gene expression levels was performed using cufflinks30. Genes with an adjusted P-value lower than 0.05 were considered differentially expressed (DE). All data and materials have been made publicly available at NCBI GEO. Data sets can be accessed at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE173974. [The accession is currently set as private, and will be made public upon acceptance for publication. For reviewing purposes, it can be accessed via this token: ibubqiakntirfwl. Please, keep this token private and do not share it]
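The selection of DE genes by adjusted P-value, and their separation into upregulated and downregulated sets, can be sketched as below. The 0.05 threshold comes from the text; splitting by the sign of the log2 fold change is an assumption made for illustration, and the function name, gene labels and values are hypothetical.

```python
ALPHA = 0.05  # adjusted P-value threshold stated in the text


def split_de_genes(records, alpha=ALPHA):
    """Return (upregulated, downregulated) gene lists from
    (gene, log2_fold_change, adjusted_p) tuples."""
    up, down = [], []
    for gene, log2_fc, adj_p in records:
        if adj_p >= alpha:
            continue  # not differentially expressed
        if log2_fc > 0:
            up.append(gene)
        elif log2_fc < 0:
            down.append(gene)
    return up, down


# Hypothetical results for one comparison between two conditions.
records = [
    ("GeneA", 1.8, 0.001),
    ("GeneB", -2.3, 0.010),
    ("GeneC", 0.4, 0.300),   # fails the threshold
    ("GeneD", -0.9, 0.049),
]
up_genes, down_genes = split_de_genes(records)
print(up_genes, down_genes)
```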
Data processing and visualization. Gene ontology analyses were performed with reString by querying STRING31, as thoroughly described in this manuscript, and terms with adjusted P-values lower than 0.05 were considered significant. Principal Component Analysis was performed with Scikit-learn15. Data visualization was performed with reString, as well as with the SciPy32, matplotlib13 and seaborn14 libraries for the Python programming language.