Preliminary tests on the transwell functional assay
The transwell migration assay is commonly used to study the migratory response of cells to a biochemical concentration gradient. In our study, we induced cell migration by using two concentrations of fetal bovine serum on the two sides of the transwell membrane (Fig. 1a). Human fibroblasts were seeded at 1% serum concentration on the top of the transwell membrane and induced to migrate towards the 15% concentration at the bottom face of the membrane during a 48-h incubation period. Initially, we tested two membrane pore sizes, 3 and 5 µm (Fig. 1b), and found that the smaller pore size was associated with a lower ratio of cells between bottom and top compartments, indicative of a higher selection pressure (Fig. 1c). Given these results, the 3-µm pore condition was considered sufficiently selective for performing a screening based on this functional assay.
Design of the CRISPR-Cas9 knockout screen
Human fibroblasts were transduced with the Human CRISPR Knockout Pooled Library (GeCKO v2 library)15, which has a lentiviral backbone containing both the Streptococcus pyogenes Cas9 nuclease and the sgRNA scaffold, and includes puromycin resistance as an antibiotic selection marker. The library spans the whole genome with 3 sgRNAs for each of 19051 genes and 4 sgRNAs for each of 1864 miRNAs. Cells were transduced at a multiplicity of infection of 0.3 and with a coverage of ~ 300 cells/sgRNA, so that, statistically, most transduced cells carry a single virus and each sgRNA is sufficiently represented in the cell population. Afterwards, cells were expanded for 14 days under puromycin selection conditions, before starting the migration selective pressure in transwells (Fig. 1d). This expansion aimed to exclude from the cell population all sgRNAs that would be removed even in the absence of a migratory selective pressure, and to increase the signal-to-noise ratio of the subsequent selection.
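The transduction parameters above can be illustrated with a short calculation. This is a minimal sketch, assuming that viral integrations per cell follow a Poisson distribution with mean equal to the multiplicity of infection, and that the ~ 300 cells/sgRNA coverage refers to transduced cells; neither assumption is stated in the text. The library size is derived from the numbers given above, and the function names are illustrative.

```python
import math

MOI = 0.3                          # multiplicity of infection (from the text)
COVERAGE = 300                     # cells per sgRNA (from the text)
N_SGRNAS = 3 * 19051 + 4 * 1864    # 64609 sgRNAs, derived from the text


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson variable with this mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def transduction_summary(moi):
    """Fractions of cells by integration number (Poisson model is an assumption)."""
    p_none = poisson_pmf(0, moi)
    p_single = poisson_pmf(1, moi)
    p_transduced = 1.0 - p_none
    return {
        "transduced": p_transduced,
        "single_among_transduced": p_single / p_transduced,
    }


def cells_to_seed(n_sgrnas, coverage, moi):
    """Cells to expose to virus so that transduced cells reach the coverage."""
    transduced_needed = n_sgrnas * coverage
    return math.ceil(transduced_needed / transduction_summary(moi)["transduced"])


summary = transduction_summary(MOI)
print(f"sgRNAs in library: {N_SGRNAS}")
print(f"fraction transduced: {summary['transduced']:.3f}")
print(f"single integration among transduced: {summary['single_among_transduced']:.3f}")
print(f"cells to expose to virus: {cells_to_seed(N_SGRNAS, COVERAGE, MOI)}")
```

Under this model, roughly 86% of transduced cells at a multiplicity of infection of 0.3 carry exactly one integration, which is why a low multiplicity of infection makes single integration likely rather than certain.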
Human fibroblast selection in transwells was performed similarly to the preliminary experiment above, using 3-µm pore transwell devices. Three rounds of selection were performed, separately collecting the cells attached to the top and the bottom of the membrane. At each round, cells collected from the top compartment were seeded again at the top compartment of new devices, and cells collected from the bottom compartment were seeded at the top of the membrane of other new devices. Between consecutive rounds of transwell selection there was a 48-h expansion in conventional flasks to ensure a sufficient number of cells for collection and re-seeding. Figure 1d shows the nomenclature of the samples collected, which also includes control samples obtained by expanding fibroblasts in standard culture dishes. At the end of the experiment, the genomic DNA from each sample was sequenced and analyzed to obtain a count matrix for each sgRNA in each sample.
Quality control of CRISPR-Cas9 knockout screening results
The first quality control was performed on the plasmid library, amplified before lentiviral packaging, to check for any potential bias that may affect screening results. Only 4 sgRNAs were lost in the amplification procedure, and the sgRNA count distribution was narrow (Figure S1a). Then, raw data of all samples were analyzed to quantify the unevenness of the sgRNA distribution and the number of missing sgRNAs (Figure S1b-c). The Gini index measures the unevenness of sgRNA representation in culture: the higher the value, the lower the diversity. It equals zero for a perfectly even distribution (same number of counts for each sgRNA). Thus, the Gini index is expected to increase as sgRNAs are progressively selected by culture conditions. The Gini index for the plasmid library was as low as 0.074 (Figure S1b), confirming the narrow distribution shown in Figure S1a. The Gini index strongly increased in the Time 0 sample (0.141), due to the high number of sgRNAs that were eliminated during the first 14 days of expansion (Figure S1c). We hypothesized that these eliminated sgRNAs target genes or miRNAs that are essential for fibroblast culture in our experimental conditions. We asked whether these null sgRNAs in the Time 0 sample included those targeting genes recognized to be essential in multiple other cell lines according to a previous study16. Figure 2a shows that all but 3 of these known essential genes were excluded in our Time 0 sample.
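The two quality-control metrics described above can be sketched as follows. This is a minimal illustration of the standard Gini index formula applied to raw counts; whether counts were transformed (for example, log-transformed) before computing the values reported here is not stated in the text, so the use of raw counts is an assumption. Function names and the toy counts are illustrative.

```python
def gini_index(counts):
    """Gini index of a list of non-negative counts.

    0 means every sgRNA has the same count; values approach 1 as
    the reads concentrate on fewer and fewer sgRNAs.
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted counts (ranks start at 1).
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def count_missing(counts):
    """Number of sgRNAs with zero counts in a sample."""
    return sum(1 for value in counts if value == 0)


even_library = [5, 5, 5, 5]      # toy data: perfectly even representation
skewed_sample = [0, 0, 0, 10]    # toy data: one sgRNA dominates

print(gini_index(even_library), count_missing(even_library))
print(gini_index(skewed_sample), count_missing(skewed_sample))
```

The toy inputs show the two extremes: an even distribution gives an index of zero, while a sample dominated by one sgRNA gives a high index together with several missing sgRNAs.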
We filtered the data according to stringent criteria, excluding sgRNAs that did not reach a sufficiently high count (threshold of 25) in the Time 0 and Control samples, as well as genes and miRNAs not targeted by at least 3 sgRNAs. Principal component analysis (PCA) showed that Control samples cluster close to the Time 0 sample, without an evident temporal trend (Fig. 2b). On the other hand, two diverging temporal trajectories were obtained for Top and Bottom samples, demonstrating the ability of the selection process to progressively highlight differences between the two cell subpopulations. The histograms of the sgRNAs in the filtered data in each sample showed a progressive loss of sgRNAs along the migratory selective process, particularly in the Bottom samples (Fig. 2c).
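The filtering criteria can be sketched in a few lines. This is a minimal sketch under stated assumptions: the threshold is treated as inclusive (count of at least 25) and is required in every Time 0 and Control sample, and the data layout, sample names, and feature names are hypothetical; the text does not specify these details.

```python
from collections import defaultdict

MIN_COUNT = 25      # count threshold from the text (inclusive: an assumption)
MIN_SGRNAS = 3      # minimum sgRNAs per gene or miRNA (from the text)


def filter_counts(counts, sgrna_to_feature, reference_samples):
    """Keep sgRNAs passing the count threshold, then features with enough sgRNAs.

    counts: {sgrna_id: {sample_name: count}}
    sgrna_to_feature: {sgrna_id: gene or miRNA name}
    reference_samples: sample names in which the threshold must be met
    """
    # Step 1: drop sgRNAs below the threshold in any reference sample.
    kept = {
        sgrna: row
        for sgrna, row in counts.items()
        if all(row[sample] >= MIN_COUNT for sample in reference_samples)
    }
    # Step 2: drop features left with fewer than MIN_SGRNAS sgRNAs.
    per_feature = defaultdict(list)
    for sgrna in kept:
        per_feature[sgrna_to_feature[sgrna]].append(sgrna)
    valid = {f for f, guides in per_feature.items() if len(guides) >= MIN_SGRNAS}
    return {s: row for s, row in kept.items() if sgrna_to_feature[s] in valid}


# Toy data with hypothetical names: FEATURE_B loses one sgRNA and is dropped.
toy_counts = {
    "a1": {"Time0": 40, "Control1": 30}, "a2": {"Time0": 55, "Control1": 25},
    "a3": {"Time0": 90, "Control1": 60}, "b1": {"Time0": 10, "Control1": 80},
    "b2": {"Time0": 70, "Control1": 45}, "b3": {"Time0": 35, "Control1": 50},
}
toy_map = {s: "FEATURE_A" if s.startswith("a") else "FEATURE_B" for s in toy_counts}
filtered = filter_counts(toy_counts, toy_map, ["Time0", "Control1"])
print(sorted(filtered))
```

Because genes are targeted by 3 sgRNAs in the library, requiring at least 3 retained sgRNAs means that a gene is kept only if all of its sgRNAs pass the count threshold.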
Hit analysis confirms known migration-related genes
We identified significant hits, that is, genes and miRNAs whose sgRNAs are enriched or depleted, using the MAGeCK-VISPR maximum likelihood estimation algorithm. As mentioned above, after data filtering each gene and miRNA was targeted by at least 3 sgRNAs, for greater statistical robustness. First, we compared the three Control samples with the Time 0 sample, to exclude from the following migration analysis genes and miRNAs whose sgRNAs were enriched or depleted under conventional culture conditions. However, most of the sgRNA selection in normal culture occurred in the first 14 days after transduction, and no additional hits were identified in these comparisons (Fig. 3a).
Then, we compared Bottom samples with Top samples at each round of the selection process. Significant hits were identified from the second round and increased in number in the third round (Figs. 3b-c), with the third round results including most of the significant features (genes or miRNAs) from the previous one (Fig. 3d). Features whose sgRNAs were depleted in the Bottom compared to the Top samples are those that, once deleted, hinder cell migration, and are therefore referred to as promoting migration. Conversely, features whose sgRNAs were enriched in the Bottom samples are referred to as preventing migration because, once deleted, the cell migration propensity is increased. The full list of significantly enriched or depleted hits between Bottom 3 and Top 3 samples is presented in Supplementary dataset 1 and Supplementary dataset 2.
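The interpretation of hit direction can be summarized in a short sketch. It assumes that each feature is described by a beta score for the Bottom versus Top comparison (negative when sgRNAs are depleted in Bottom) and a false discovery rate, as produced by maximum likelihood estimation tools; the significance cutoff of 0.05 and the example values are hypothetical, since the text does not report the threshold used.

```python
FDR_CUTOFF = 0.05   # hypothetical cutoff; the threshold used is not stated


def classify_hit(beta, fdr, fdr_cutoff=FDR_CUTOFF):
    """Label a feature from its Bottom-vs-Top beta score and FDR.

    Depleted in Bottom (beta < 0): the knockout hinders migration,
    so the intact feature promotes migration.
    Enriched in Bottom (beta > 0): the knockout favors migration,
    so the intact feature prevents migration.
    """
    if fdr >= fdr_cutoff or beta == 0:
        return "not significant"
    return "promoting migration" if beta < 0 else "preventing migration"


# Hypothetical (beta, FDR) values for illustration only.
examples = {"FEATURE_1": (-1.2, 0.01), "FEATURE_2": (0.9, 0.02), "FEATURE_3": (-0.3, 0.40)}
for name, (beta, fdr) in examples.items():
    print(name, classify_hit(beta, fdr))
```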
We asked whether known genes involved in migration could be identified in the screening. In particular, we searched for genes belonging to the Gene Ontology-Biological Processes categories "cell motility", "cell projection organization", and "cell adhesion". The last two categories are well known to be related to the process of cell migration, because the initial response of a cell to a migration-promoting concentration gradient is to polarize and extend protrusions that assemble new adhesion points, while at the rear of the cell focal adhesions disassemble2,3. Migration-related genes that are significant hits are highlighted in Figs. 3e-f and listed in Table 1. Among the most significant migration-promoting genes, the p14-MP1 complex (LAMTOR2/3) is a known regulator of focal adhesion remodeling and its absence has been reported to reduce the migration speed17; RTTN plays a role in cilia structure and function18; MFAP4 is crucial for type I collagen, elastin, and tropoelastin binding19; IL36B, formerly known as IL-1F8, modulates inflammation and fibrosis20; MCC and SMURF1 are implicated in migration through the WNT pathway21,22, while CDC42EP5 and VAV2 are implicated in Rac-driven migration23–25. Two identified significant genes preventing migration are LAMB4 and GORASP1. LAMB4 belongs to the laminin family, whose members are constituents of the extracellular matrix; however, it is still poorly characterized, and it is currently not known to take part in any laminin heterotrimer26,27. The only study on its relation to migration reported that LAMB4 silencing in head and neck squamous cell carcinoma promotes migration, in agreement with what we found in fibroblasts28. GORASP1, also known as GRASP65, is a Golgi structural protein critical in establishing polarity in migrating cells once regulated by phosphorylation29.
Table 1
List of the significant genes identified by MAGeCK-VISPR algorithm in the comparison Bottom3 vs. Top3, belonging to the indicated categories, highlighted in Figs. 3e-f.
Gene Ontology ID | Gene Ontology Category name | Significant genes Bottom3 vs. Top3 |
GO:0048870 (BP) | Cell motility | BBS2, BMP7, ITGB8, LAMB4, LAMTOR2, MCC, NRTN, PALLD, SEMA6A, SHROOM2, SP100, SRGAP2, VAV2 |
GO:0030030 (BP) | Cell projection organization | BBS2, BLOC1S5, BMP7, CDC42EP, NEUROG3, NME5, NRTN, NRXN1, PALLD, PTPRD, RTTN, SEMA6A, SLC11A, SMURF1, SRGAP2, SYNE1, VAV2 |
GO:0007155 (BP) | Cell adhesion | ARG2, BMP7, HLA-DPB1, ICOS, IL36B, ITGB8, LAMB4, MFAP4, NRXN1, PALLD, PCDHA11, PCDHA5, PCDHAC1, PCDHB14, PCDHGB5, PDCD1LG2, PTPRD, SEMA6A, SRGAP2 |
Hit analysis identifies new targets in migration
In the CRISPR-Cas9 knockout screening, every counted sgRNA played a role in a single cell, where it silenced a targeted feature (gene or miRNA) at a specific genomic locus. Thus, features with significantly enriched or depleted sgRNAs are individually important. This is different from, for example, transcriptomic data, where a number of genes collectively describe cell behavior. However, given that biological processes typically include some regulatory redundancy, we asked whether multiple features could be ascribed to the same biological functions. We selected all Reactome pathways that contained at least three genes found to be significant in the screening. Reactome categories were then visualized according to the Reactome hierarchy in Fig. 4a.
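The pathway selection rule can be sketched as a simple set overlap. This is a minimal sketch, assuming that pathway membership is available as a mapping from pathway name to gene set; the pathway and gene names below are placeholders and do not reflect actual Reactome annotations.

```python
MIN_HITS = 3   # at least three significant genes per pathway (from the text)


def pathways_with_min_hits(pathway_genes, significant_genes, min_hits=MIN_HITS):
    """Return {pathway: sorted significant members} for pathways passing the rule."""
    significant = set(significant_genes)
    selected = {}
    for pathway, genes in pathway_genes.items():
        overlap = sorted(set(genes) & significant)
        if len(overlap) >= min_hits:
            selected[pathway] = overlap
    return selected


# Placeholder annotations and hits, for illustration only.
toy_pathways = {
    "pathway_A": {"GENE1", "GENE2", "GENE3", "GENE4"},
    "pathway_B": {"GENE2", "GENE5"},
    "pathway_C": {"GENE6", "GENE7", "GENE8"},
}
toy_hits = ["GENE1", "GENE2", "GENE3", "GENE5", "GENE6"]

selected = pathways_with_min_hits(toy_pathways, toy_hits)
print(selected)
```

This rule is a membership count rather than a statistical enrichment test, consistent with the point that each hit is individually important.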
We found a number of metabolic categories that can be at least partially explained by the biochemical gradient used in the screening, where top cells were exposed to a much lower serum concentration compared to bottom cells. We do not consider these categories necessarily related to the migration process itself. Rho GTPase, WNT, and MAPK signaling, together with extracellular matrix (ECM) organization, are processes already known to be related to migration and cancer metastasis30. Besides Rho GTPase signaling, mechanisms of membrane trafficking also emerge in multiple categories within Golgi vesicle transport regulated by glycosylation. The transport of small molecules, like ions, may suggest a role for mechanisms that control cell volume, facilitating cell squeezing through constrictions, as previously reported for SLC12A6 (a.k.a. KCC3)31. The role of fibroblasts in immune regulation has been described in inflammation, cancer, and infection, also in the context of fibroblast-secreted soluble and ECM molecules promoting migration of immune cells32. The number of immune-related genes identified in this screening also suggests a self-regulatory effect on fibroblast migration. Interestingly, a number of non-pseudogene olfactory receptors (OR10H1, OR10K2, OR1E1, OR2A42, OR52D1) were found to be significant. They are G protein-coupled receptors (GPCRs) better known for their role in the nose in the recognition of odorant molecules33. However, their expression in other tissues has been described, suggesting sensory functionality in other contexts34. OR1E1, OR2A42, and OR52D1 have already been listed as ectopically expressed35, and OR10H1 has been suggested as a biomarker in urinary tract cancer36.
The full list of genes belonging to each of the identified Reactome categories is reported in Supplementary dataset 3. Figure 4b-c shows the remaining most highly significant genes that could not be classified in Reactome categories shared with multiple other genes from the screening. They represent an important resource for further experimental studies. Furthermore, the full graphical results of significant genes and miRNAs are presented in Figures S2 and S3.