tspex is implemented as a Python package and can be used locally, through a Python Application Programming Interface (API) or a command-line interface, or online through a web application. Local installation with pip or conda is straightforward and requires few dependencies. Refer to the tspex GitHub repository for the most up-to-date source code, dependency details and instructions. An open-source web interface (Figure 1A), built with Flask and deployed using Docker containers, is also available at https://tspex.lge.ibi.unicamp.br/.
tspex provides twelve distinct tissue-specificity metrics, which differ in their assumptions, scale and properties. Broadly, these metrics can be divided into two groups [9]: (1) general scoring metrics, which summarize in a single value how tissue-specific or ubiquitous a gene is across all tissues, and (2) individualized scoring metrics, which quantify how specific the expression of each gene is to each tissue.
The general scoring metrics provided by tspex are: Counts [3], Tau [10], Gini coefficient [11], Simpson index [12], Shannon entropy specificity [13], ROKU specificity [14], Specificity measure dispersion (SPM DPM) [15], and Jensen-Shannon specificity dispersion (JSS DPM) [16]. The individualized scoring metrics are: Tissue-specificity index (TSI) [17], Z-score [18], Specificity measure (SPM) [19], and Jensen-Shannon specificity (JSS) [16]. Because the metrics produce values on different scales, tspex includes an option to transform tissue-specificity values so that they range from 0 (ubiquitous expression) to 1 (tissue-specific expression). The equations for all provided metrics, as well as their transformations, can be found in the Supplementary Material or at https://apcamargo.github.io/tspex/metrics/.
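To illustrate the difference between the two groups, the following sketch computes one general metric and one individualized quantity with NumPy alone. It does not use tspex. The Tau function follows the commonly cited definition (the sum over tissues of one minus the max-normalized expression, divided by the number of tissues minus one). The `tissue_fraction` function is a hypothetical stand-in for an individualized metric, chosen for simplicity. Both function names and the toy matrix are assumptions made for this example, and the exact formulas used by tspex should be taken from the Supplementary Material.

```python
import numpy as np


def tau(expression):
    """General metric: one Tau value per gene (rows are genes, columns are tissues)."""
    x = np.asarray(expression, dtype=float)
    max_per_gene = x.max(axis=1, keepdims=True)
    # Guard against division by zero for genes with no expression at all.
    safe_max = np.where(max_per_gene == 0, 1.0, max_per_gene)
    n_tissues = x.shape[1]
    scores = (1.0 - x / safe_max).sum(axis=1) / (n_tissues - 1)
    # Tau is undefined for unexpressed genes, so mark them as NaN.
    scores[max_per_gene[:, 0] == 0] = np.nan
    return scores


def tissue_fraction(expression):
    """Individualized quantity (illustrative): one value per gene and tissue."""
    x = np.asarray(expression, dtype=float)
    totals = x.sum(axis=1, keepdims=True)
    safe_totals = np.where(totals == 0, 1.0, totals)
    fractions = x / safe_totals
    fractions[totals[:, 0] == 0] = np.nan
    return fractions


# Toy matrix: 3 genes (rows) measured in 4 tissues (columns).
toy_expression = np.array([
    [0.0, 0.0, 10.0, 0.0],  # expressed in a single tissue
    [5.0, 5.0, 5.0, 5.0],   # uniformly expressed
    [8.0, 2.0, 0.0, 0.0],   # biased towards the first tissue
])

general_scores = tau(toy_expression)                    # shape (3,)
individual_scores = tissue_fraction(toy_expression)     # shape (3, 4)
```

In this toy example the general metric returns a vector with one entry per gene (1 for the single-tissue gene, 0 for the uniform gene), whereas the individualized quantity returns a matrix with the same shape as the input.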
As input, tspex requires an expression matrix (TSV, CSV or Excel format) in any appropriate unit, such as TPM, FPKM or CPM. Optionally, expression values can be log-transformed before tissue-specificity is computed, which reduces the dependency between expression variance and expression level and improves the reliability of tissue-specificity measurements [9]. Internally, the expression data and the tissue-specificity values are stored in a Python object and can be accessed for further analysis through the Python API.
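The effect of log-transforming expression values before scoring can be seen in a small NumPy sketch. It does not call tspex. The natural logarithm with a pseudocount of 1 (`np.log1p`) is an assumption made here for illustration, as are the helper name `tau_single` and the toy expression profile.

```python
import numpy as np


def tau_single(profile):
    """Tau for a single gene, given its expression across tissues."""
    x = np.asarray(profile, dtype=float)
    return float((1.0 - x / x.max()).sum() / (x.size - 1))


# Hypothetical expression profile of one gene across four tissues (e.g. in TPM).
gene = np.array([8.0, 2.0, 0.0, 0.0])

tau_raw = tau_single(gene)            # computed on the raw values, about 0.917
tau_log = tau_single(np.log1p(gene))  # computed on log(x + 1) values, about 0.833
```

The log transformation compresses the gap between the two expressed tissues (a ratio of 4 becomes a ratio of 2), so the gene appears somewhat less tissue-specific after transformation.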
Finally, the tspex package provides built-in functions for data visualization. Specifically, the user can plot histograms of tissue-specificity values (Figure 1B) and heatmaps of the expression of genes whose tissue-specificity is above a chosen value (Figure 1C). These visualizations allow quick inspection of the results and can be helpful for deciding threshold values.
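The logic behind these two views can be sketched without any plotting library. The example below bins a set of hypothetical tissue-specificity values, as a histogram would, and then selects the genes above a chosen threshold, which are the genes a heatmap would display. The scores, gene identifiers and threshold are invented for illustration, and the code does not reproduce the plotting functions of tspex.

```python
import numpy as np

# Hypothetical tissue-specificity values (already scaled to the 0-1 range).
scores = np.array([0.05, 0.12, 0.33, 0.48, 0.51, 0.77, 0.92, 0.98])
gene_ids = np.array([f"gene_{i}" for i in range(1, len(scores) + 1)])

# Histogram view: number of genes in each of five equal-width bins.
counts, bin_edges = np.histogram(scores, bins=5, range=(0.0, 1.0))

# Threshold view: keep only genes whose score is above the chosen value.
threshold = 0.75
selected_genes = gene_ids[scores > threshold]
```

Inspecting the bin counts before fixing the threshold shows how many genes a given cut-off would retain.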