hESC maintenance and colonic lineage differentiation
hESCs were grown and maintained on 1% Matrigel (Corning)-coated six-well plates in StemFlex medium (Gibco) at 37°C with 5% CO2. For definitive endoderm (DE) differentiation, hESCs were cultured to 80-90% confluency and treated with 3 μM CHIR99021 (CHIR, Stem-RD) and 100 ng/ml Activin A (R&D Systems) in basal medium RPMI1640 (Cellgro) supplemented with 1X Pen-Strep (Gibco) for 1 day; the next day, the medium was changed to basal medium containing only 100 ng/ml Activin A. To induce CDX2+ hindgut endoderm, DE cells were treated with 3 μM CHIR99021 and 500 ng/ml FGF4 (Peprotech) in RPMI1640 supplemented with 1X B27 supplement (Gibco) and 1X Pen-Strep (Gibco) for 4 days, with daily medium changes. Organoids began to bud out from the 2D culture during the hindgut differentiation process. The hindgut endoderm was then subjected to colonic lineage induction by treatment with 100 ng/ml BMP2 (Peprotech), 3 μM CHIR99021 and 100 ng/ml hEGF (Peprotech) in Advanced DMEM/F12 medium supplemented with 1X B27 supplement (Gibco), 1X GlutaMax (Gibco), 10 mM HEPES (Gibco) and 1X Pen-Strep (Gibco) for 3 days, with daily medium changes. After colonic fate induction, the colon progenitor organoids were collected from the initial 2D cultures and embedded in a 100% Matrigel dome in a 24-well plate. Differentiation to mature colonic cell types was achieved by culturing these colon progenitor organoids in differentiation medium containing 600 nM LDN193189 (Axon), 3 μM CHIR99021 and 100 ng/ml hEGF in Advanced DMEM/F12 medium supplemented with 1X B27 supplement, 1X GlutaMax, 10 mM HEPES and 1X Pen-Strep. The differentiation medium was refreshed every 3 days for at least 40 days to achieve full colonic differentiation. The colon organoids were passaged and expanded every 10-14 days at a 1:6 split ratio.
To passage the organoids, the Matrigel domes containing the organoids were scraped off the plate and resuspended in cold splitting medium (Advanced DMEM/F12 medium supplemented with 1X GlutaMax, 10 mM HEPES and 1X Pen-Strep). The organoids were mechanically dislodged from the Matrigel dome and fragmented by pipetting in cold splitting medium. After pelleting, the old Matrigel and splitting medium were removed and the organoids were resuspended in 100% Matrigel. 50 µL of Matrigel containing fragmented colon organoids was plated in one well of a pre-warmed 24-well plate.
Cell Lines
HEK293T (human [Homo sapiens] fetal kidney) and Vero E6 (African green monkey [Chlorocebus aethiops] kidney) were obtained from ATCC (https://www.atcc.org/). Cells were cultured in Dulbecco’s Modified Eagle Medium (DMEM) supplemented with 10% fetal bovine serum and 100 I.U./mL penicillin and 100 μg/mL streptomycin. All cell lines were incubated at 37°C with 5% CO2.
SARS-CoV-2 Pseudo-Entry Viruses
Recombinant Indiana VSV (rVSV) expressing SARS-CoV-2 spikes were generated as previously described [23]. HEK293T cells were grown to 80% confluency before transfection with pCMV3-SARS-CoV2-spike (kindly provided by Dr. Peihui Wang, Shandong University, China) using FuGENE 6 (Promega). Cells were cultured overnight at 37°C with 5% CO2. The next day, the medium was removed and VSV-G pseudotyped ΔG-luciferase (G*ΔG-luciferase, Kerafast) was used to infect the cells in DMEM at an MOI of 3 for 1 hr, after which the cells were washed three times with 1X DPBS. DMEM supplemented with 2% fetal bovine serum, 100 I.U./mL penicillin and 100 μg/mL streptomycin was added to the infected cells, which were cultured overnight as described above. The next day, the supernatant was harvested and clarified by centrifugation at 300 × g for 10 min, and aliquots were stored at −80°C.
SARS-CoV-2 Viruses
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), isolate USA-WA1/2020 (NR-52281), was deposited by the Centers for Disease Control and Prevention and obtained through BEI Resources, NIAID, NIH. SARS-CoV-2 was propagated in Vero E6 cells in DMEM supplemented with 2% FBS, 4.5 g/L D-glucose, 4 mM L-glutamine, 10 mM Non-Essential Amino Acids, 1 mM Sodium Pyruvate and 10 mM HEPES. Infectious titers of SARS-CoV-2 were determined by plaque assay in Vero E6 cells in Minimum Essential Media supplemented with 2% FBS, 4 mM L-glutamine, 0.2% BSA, 10 mM HEPES, 0.12% NaHCO3 and 0.7% agar. All work involving live SARS-CoV-2 was performed in the CDC/USDA-approved BSL-3 facility of the Global Health and Emerging Pathogens Institute at the Icahn School of Medicine at Mount Sinai in accordance with institutional biosafety requirements.
SARS-CoV-2 pseudo-entry virus infections
To assay pseudo-typed virus infection of colon organoids (COs), COs were seeded in 24-well plates. Pseudo-typed virus was added at MOI=0.01 together with polybrene at a final concentration of 8 μg/mL, and the plate was centrifuged for 1 hr at 1200 × g. At 3 hpi, the infection medium was replaced with fresh medium. At 24 hpi, colon organoids were harvested for luciferase assays or immunostaining analysis. For chemical screening analysis, colon organoids were digested with TrypLE and seeded in 384-well plates at 1 × 10^4 cells per well. After chemical treatment, pseudo-typed virus was added at MOI=0.01 and the plate was centrifuged for 1 hr at 1200 × g. At 24 hpi, hPSC-COs were harvested for luciferase assays according to the Luciferase Assay System protocol (Promega).
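The MOI values above translate into an inoculum size once the cell number per well and the stock titer are known. The following is a minimal sketch of that arithmetic, not the authors' procedure; the function names and the stock titer of 1 × 10^6 infectious units per mL are hypothetical and chosen only for illustration.

```python
def infectious_units_needed(cells_per_well: float, moi: float) -> float:
    """Infectious units required to reach a target MOI in one well."""
    if cells_per_well < 0 or moi < 0:
        raise ValueError("cell number and MOI must be non-negative")
    return cells_per_well * moi


def inoculum_volume_ul(cells_per_well: float, moi: float, titer_per_ml: float) -> float:
    """Volume of virus stock (in µL) that delivers the target MOI."""
    if titer_per_ml <= 0:
        raise ValueError("titer must be positive")
    return infectious_units_needed(cells_per_well, moi) / titer_per_ml * 1000.0


# 384-well screening format from the text: 1 x 10^4 cells per well at MOI 0.01.
units = infectious_units_needed(1e4, 0.01)

# Hypothetical stock titer (assumption, not stated in the text).
ASSUMED_TITER_PER_ML = 1e6
volume = inoculum_volume_ul(1e4, 0.01, ASSUMED_TITER_PER_ML)

print(f"infectious units per well: {units:.0f}")
print(f"stock volume per well: {volume:.3f} µL")
```

At such low MOI the required stock volume per well is small, so in practice the stock would typically be pre-diluted in medium before dispensing.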
SARS-CoV-2 virus infections
hPSC-COs were infected with SARS-CoV-2 in the CO media at an MOI of 0.1, 0.05 or 0.01 as indicated and incubated at 37°C. At 24 hpi, cells were washed three times with PBS and harvested for either RNA analysis or immunofluorescence staining.
Approximately 2.5 × 10^5 Vero E6 cells were pre-treated with the indicated compounds for 1 h prior to infection with SARS-CoV-2 at an MOI of 0.01 in DMEM supplemented with 2% FBS, 4.5 g/L D-glucose, 4 mM L-glutamine, 10 mM Non-Essential Amino Acids, 1 mM Sodium Pyruvate and 10 mM HEPES. At 24 hpi, cells were washed three times with PBS before harvesting for immunofluorescence staining or RNA or protein analysis.
Cells were lysed in RIPA buffer for protein analysis or fixed in 5% formaldehyde for 24 h for immunofluorescent staining, prior to safe removal from the BSL-3 facility.
Colon organoid processing and immunostaining
The colon organoids were released from Matrigel using Cell Recovery Solution (Corning) on ice for 1 hr, fixed in 4% paraformaldehyde for 4 hr at 4°C, washed twice with 1X PBS and allowed to sediment in 30% sucrose overnight. The organoids were then embedded in OCT (TissueTek) and cryo-sectioned at 10 μm thickness. For indirect immunofluorescence staining, sections were rehydrated in 1X PBS for 5 min, permeabilized with 0.2% Triton in 1X PBS for 10 min, and blocked with blocking buffer containing 5% normal donkey serum in 1X PBS for 1 hr. The sections were then incubated with the corresponding primary antibodies diluted in blocking buffer at 4°C overnight. The following day, sections were washed three times with 1X PBS before incubation with fluorophore-conjugated secondary antibody for 1 hr at RT. The sections were washed three times with 1X PBS and mounted with Prolong Gold Antifade mounting media with DAPI (Life Technologies). Images were acquired using an LSM880 Laser Scanning Confocal Microscope (Zeiss) and processed with Zen or Imaris (Bitplane) software.
Immunofluorescent staining
Organoids and tissues were fixed in 4% PFA for 20 min at RT, blocked in Mg2+/Ca2+-free PBS plus 5% horse serum and 0.3% Triton-X for 1 hr at RT, and then incubated with primary antibody at 4°C overnight. The information for primary antibodies is provided in Extended Data Table 2. Secondary antibodies included donkey anti-mouse, anti-goat, anti-rabbit or anti-chicken antibodies conjugated with Alexa-Fluor-488, Alexa-Fluor-594 or Alexa-Fluor-647 fluorophores (1:500, Life Technologies). Nuclei were counterstained with DAPI.
Western blot
Protein was extracted from cells in Radioimmunoprecipitation assay (RIPA) lysis buffer containing 1X Complete Protease Inhibitor Cocktail (Roche) and 1X Phenylmethylsulfonyl fluoride (Sigma Aldrich) prior to safe removal from the BSL-3 facility. Samples were analyzed by SDS-PAGE and transferred onto nitrocellulose membranes. Proteins were detected using rabbit polyclonal anti-GAPDH (Sigma Aldrich, G9545), mouse monoclonal anti-SARS-CoV-2 Nucleocapsid [1C7] and mouse monoclonal anti-SARS-CoV-2 Spike [2B3E5] protein (a kind gift from Dr. T. Moran, Center for Therapeutic Antibody Discovery at the Icahn School of Medicine at Mount Sinai). Primary antibodies were detected using fluorophore-conjugated secondary goat anti-mouse (IRDye 680RD, 926-68070) and goat anti-rabbit (IRDye 800CW, 926-32211) antibodies. Antibody-mediated fluorescence was detected on a LI-COR Odyssey CLx imaging system and analyzed using Image Studio software (LI-COR).
Single cell organoid preparation for scRNA-sequencing
The colon organoids cultured in Matrigel domes were dissociated into single cells using 0.25% Trypsin (Gibco) at 37°C for 10 min, and the trypsin was then neutralized with DMEM/F12 supplemented with 10% FBS. The dissociated organoids were pelleted and resuspended in L15 Medium (Gibco) supplemented with 10 mM HEPES and 10 ng/ml DNaseI (Sigma). The resuspended organoids were then passed through a 40 µm filter to obtain a single cell suspension, and stained with DAPI followed by sorting of live cells using an ARIA II flow cytometer (BD Biosciences). The live colonic single cell suspension was transferred to the Genomics Resources Core Facility at Weill Cornell Medicine for processing with the Chromium Single Cell 3’ Reagent Kit v3 (10x Genomics, product code # 1000075) on a 10x Genomics Chromium Controller. A total of 10,000 cells were loaded into each channel of the Single-Cell A Chip to target 8000 cells. Briefly, according to the manufacturer’s instructions, the sorted cells were washed with 1x PBS + 0.04% BSA and counted with a Bio-Rad TC20 Cell Counter, and cell viability was assessed and visualized. A total of 10,000 cells and Master Mixes were loaded into each channel of the cartridge to generate the droplets on the Chromium Controller. Gel Beads-in-Emulsion (GEMs) were transferred and GEM-RT was undertaken in droplets by PCR incubation. GEMs were then broken and pooled fractions recovered. After purification of the first-strand cDNA from the post GEM-RT reaction mixture, barcoded, full-length cDNA was amplified via PCR to generate sufficient mass for library construction. Enzymatic fragmentation and size selection were used to optimize the cDNA amplicon size. TruSeq Read 1 (read 1 primer sequence) was added to the molecules during GEM incubation. P5, P7, a sample index, and TruSeq Read 2 (read 2 primer sequence) were added via End Repair, A-tailing, Adaptor Ligation, and PCR.
The final libraries were assessed on an Agilent Technologies 2100 Bioanalyzer and sequenced on an Illumina NovaSeq sequencer with a paired-end 100-cycle kit (28+8+91).
Sequencing and gene expression UMI counts matrix generation
The FASTQ files were imported into the 10x Cell Ranger data analysis pipeline (v3.0.2) to align reads, generate feature-barcode matrices and perform clustering and gene expression analysis. In the first step, cellranger mkfastq demultiplexed samples and generated FASTQ files; in the second step, cellranger count aligned the FASTQ files to the reference genome and extracted the gene expression UMI counts matrix. To measure viral gene expression, we built a custom reference genome by integrating the four viral genes and luciferase into the 10x pre-built human reference (GRCh38 v3.0.0) using cellranger mkref. The sequences of the four viral genes (VSV-N, VSV-NS, VSV-M and VSV-L) were retrieved from NCBI (https://www.ncbi.nlm.nih.gov/nuccore/335873), and the sequence of the luciferase was retrieved from HIV-Luc.
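One common way to extend a pre-built reference before running cellranger mkref is to append each extra gene as its own FASTA record and add a matching single-exon GTF annotation. The sketch below illustrates that idea under stated assumptions: the helper names are hypothetical, the sequences are short placeholders rather than the real VSV or luciferase sequences, and the exact files the authors produced are not described in the text.

```python
def fasta_record(name: str, sequence: str, width: int = 60) -> str:
    """Format one sequence as a FASTA record with fixed-width lines."""
    lines = [f">{name}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines)


def gtf_exon_record(name: str, length: int) -> str:
    """Single-exon GTF line that spans the whole added sequence."""
    attributes = (
        f'gene_id "{name}"; transcript_id "{name}"; '
        f'gene_name "{name}"; gene_biotype "protein_coding";'
    )
    fields = [name, "custom", "exon", "1", str(length), ".", "+", ".", attributes]
    return "\t".join(fields)


# Gene names come from the text; the sequences are PLACEHOLDERS (assumption).
extra_genes = {
    "VSV-N": "ATGC" * 20,
    "VSV-NS": "ATGC" * 15,
    "VSV-M": "ATGC" * 10,
    "VSV-L": "ATGC" * 40,
    "luciferase": "ATGC" * 25,
}

fasta_text = "\n".join(fasta_record(n, s) for n, s in extra_genes.items())
gtf_lines = [gtf_exon_record(n, len(s)) for n, s in extra_genes.items()]

# These blocks would be appended to the human genome FASTA and gene GTF,
# and the combined files would then be passed to cellranger mkref.
print(fasta_text.splitlines()[0])
print(gtf_lines[0])
```

Treating each added gene as its own contig keeps the human coordinates untouched, so reads mapping to viral or luciferase transcripts are counted as ordinary features in the UMI matrix.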
Single-cell RNA-seq data analysis
We filtered out cells with fewer than 300 or more than 8000 genes detected, as well as cells with mitochondrial gene content greater than 30%, and used the remaining cells (6175 cells for the uninfected sample and 2962 cells for the infected sample) for downstream analysis. We normalized the gene expression UMI counts for each sample separately using a deconvolution strategy [24] implemented in the R scran package (v.1.14.1). In particular, we pre-clustered cells in each sample using the quickCluster function; we computed a size factor per cell within each cluster and rescaled the size factors by normalization between clusters using the computeSumFactors function; and we normalized the UMI counts per cell by the size factors and applied a logarithmic transform using the normalize function. We further normalized the UMI counts across samples using the multiBatchNorm function in the R batchelor package (v1.2.1). We identified highly variable genes using the FindVariableFeatures function in the R Seurat package (v3.1.0) [25], and selected the top 3000 variable genes after excluding mitochondrial genes, ribosomal genes and dissociation-related genes. The list of dissociation-related genes was originally built on mouse data [26]; we converted them to human orthologs using Ensembl BioMart. We aligned the two samples based on their mutual nearest neighbors (MNNs) using the fastMNN function in the R batchelor package; this was done by performing a principal component analysis (PCA) on the highly variable genes and then correcting the principal components (PCs) according to their MNNs. We selected the top 50 corrected PCs for downstream visualization and clustering analysis. We ran the uniform manifold approximation and projection (UMAP) dimensional reduction using the RunUMAP function in the R Seurat package [25] with the number of training epochs set to 2000.
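The cell-level quality filter described at the start of this paragraph can be sketched as a simple predicate. This is an illustration in plain Python, not the authors' R code; the `CellQC` record and the example barcodes and values are hypothetical, while the three thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    barcode: str          # hypothetical cell identifier
    genes_detected: int   # number of genes with at least one UMI
    mito_fraction: float  # fraction of UMIs from mitochondrial genes


# Thresholds stated in the text.
MIN_GENES = 300
MAX_GENES = 8000
MAX_MITO_FRACTION = 0.30


def passes_qc(cell: CellQC) -> bool:
    """Keep cells with 300-8000 detected genes and at most 30% mitochondrial content."""
    return (
        MIN_GENES <= cell.genes_detected <= MAX_GENES
        and cell.mito_fraction <= MAX_MITO_FRACTION
    )


# Made-up example cells (assumption, for illustration only).
cells = [
    CellQC("cell_a", 2500, 0.08),   # kept
    CellQC("cell_b", 150, 0.05),    # too few genes
    CellQC("cell_c", 9200, 0.10),   # too many genes, possible multiplet
    CellQC("cell_d", 3100, 0.45),   # high mitochondrial content
]

kept = [c.barcode for c in cells if passes_qc(c)]
print(kept)
```

The lower gene bound removes empty or low-quality droplets, the upper bound removes likely multiplets, and the mitochondrial cutoff removes stressed or dying cells.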
We clustered cells into eight clusters by constructing a shared nearest neighbor graph and then grouping cells with similar transcriptome profiles using the FindNeighbors and FindClusters functions (resolution set to 0.2) in the R Seurat package. We identified marker genes for each cluster by performing differential expression analysis between cells inside and outside that cluster using the FindMarkers function in the R Seurat package. After reviewing the clusters, we merged four clusters that were likely from the stem cell population into a single cluster (LGR5+ or BMI1+ stem cells) and kept the other four clusters (KRT20+ epithelial cells, MUC2+ goblet cells, EPHB2+ TA cells, and CHGA+ NE cells) for further analysis. We re-identified marker genes for the resulting five clusters and selected the top 10 positive marker genes per cluster for the heatmap plot using the DoHeatmap function in the R Seurat package [25].
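The merge of the four stem-like clusters and the selection of top positive markers can be sketched as two small helpers. This is a plain-Python illustration rather than the Seurat workflow itself; the cluster IDs treated as stem-like, the cell names, the fold-change values, and the choice to rank markers by log fold change are all assumptions made for the example.

```python
from collections import defaultdict


def merge_clusters(assignments, stem_cluster_ids, merged_label="stem"):
    """Relabel every cell in a stem-like cluster with one shared label."""
    return {
        cell: (merged_label if cluster in stem_cluster_ids else cluster)
        for cell, cluster in assignments.items()
    }


def top_positive_markers(markers, n=10):
    """Pick up to n genes per cluster with the largest positive log fold change.

    markers: iterable of (cluster, gene, log_fold_change) tuples.
    """
    by_cluster = defaultdict(list)
    for cluster, gene, lfc in markers:
        if lfc > 0:  # positive markers only
            by_cluster[cluster].append((lfc, gene))
    return {
        cluster: [gene for _, gene in sorted(rows, reverse=True)[:n]]
        for cluster, rows in by_cluster.items()
    }


# Hypothetical cluster IDs: which four of the eight are stem-like is an assumption.
assignments = {"cell1": "0", "cell2": "1", "cell3": "4", "cell4": "2"}
merged = merge_clusters(assignments, stem_cluster_ids={"0", "1", "2", "3"})

# Gene names appear in the text; the fold-change values are made up.
markers = [
    ("stem", "LGR5", 1.2),
    ("stem", "BMI1", 0.8),
    ("stem", "KRT20", -0.5),
    ("goblet", "MUC2", 2.0),
]
top = top_positive_markers(markers, n=1)
print(merged)
print(top)
```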
In vivo transplantation and drug evaluation
hPSC-COs were harvested with a cell scraper, mixed with 20 µl Matrigel (Corning) and transplanted under the kidney capsule of 7-9-week-old male NSG mice. Two weeks post-transplantation, SARS-CoV-2 pseudo-entry virus was inoculated locally at 1 × 10^3 FFU. At 24 hpi, the mice were euthanized and used for immunohistochemistry analysis.
To determine the activity of MPA in vivo, the mice were treated with 50 mg/kg MPA in 10% DMSO/90% corn oil by IP injection. Two hours after drug administration, SARS-CoV-2 pseudo-entry virus was inoculated locally at 1 × 10^3 FFU. At 24 hpi, the mice were euthanized and used for immunohistochemistry analysis.
All animal work was conducted in agreement with NIH guidelines and approved by the WCM Institutional Animal Care and Use Committee (IACUC) and the Institutional Biosafety Committee (IBC).
High throughput chemical screening
To perform the high throughput small molecule screen, hPSC-COs were dissociated using TrypLE for 20 min in a 37°C water bath and replated into 10% Matrigel-coated 384-well plates at 20,000 cells/40 µl medium/well. After 6 hr, cells were treated with compounds from an in-house library of ~1280 FDA-approved drugs (Prestwick) at 10 µM. DMSO treatment was used as a negative control. One hour later, cells were infected with SARS-CoV-2 pseudo-entry virus (MOI=0.01). At 24 hpi, hPSC-COs were harvested for luciferase assay following the Luciferase Assay System protocol (Promega).
RNA-Seq following viral infections
Organoids were infected at an MOI of 0.1 in DMEM supplemented with 0.3% BSA, 4.5 g/L D-glucose, 4 mM L-glutamine and 1 μg/ml TPCK-trypsin, and harvested at 24 hpi. Total RNA was extracted in TRIzol (Invitrogen) and DNase I treated using the Direct-zol RNA Miniprep kit (Zymo Research) according to the manufacturer’s instructions. RNA-seq libraries of polyadenylated RNA were prepared using the TruSeq RNA Library Prep Kit v2 (Illumina) or TruSeq Stranded mRNA Library Prep Kit (Illumina) according to the manufacturer’s instructions. cDNA libraries were sequenced using an Illumina NextSeq 500 platform. Raw reads were aligned to the human genome (hg19) using the RNA-Seq Alignment App on BaseSpace (Illumina, CA), followed by differential expression analysis using DESeq2 [27]. Differentially expressed genes (DEGs) were characterized for each sample (adjusted p-value < 0.05). Volcano plots were constructed using custom scripts in R.
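DESeq2 adjusts p-values with the Benjamini-Hochberg procedure by default, so the DEG cutoff above amounts to thresholding BH-adjusted values. The sketch below shows that adjustment in plain Python under simplifying assumptions: it omits DESeq2's independent filtering step, and the gene names and raw p-values are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical genes and raw p-values (assumption, not from the study).
raw = {"geneA": 0.001, "geneB": 0.02, "geneC": 0.04, "geneD": 0.6}

genes = list(raw)
padj = dict(zip(genes, benjamini_hochberg([raw[g] for g in genes])))

# Threshold from the text: adjusted p-value < 0.05.
degs = [g for g in genes if padj[g] < 0.05]
print(padj)
print(degs)
```

Note that geneC has a raw p-value below 0.05 but is not called a DEG after adjustment, which is the practical effect of controlling the false discovery rate across many genes.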
Statistical analysis
N=3 independent biological replicates were used for all experiments unless otherwise indicated. n.s. indicates non-significance. P-values were calculated by unpaired two-tailed Student’s t-test unless otherwise indicated. *p<0.05, **p<0.01 and ***p<0.001.
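The test and significance labels above can be sketched with the standard library alone. This is an illustrative implementation, not the software the authors used: it computes the pooled-variance (equal-variance) Student's t statistic, obtains the two-tailed p-value by numerically integrating the t density with Simpson's rule, and maps it to the star notation. The replicate values are invented.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Unpaired Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)


def stars(p):
    """Significance labels as defined in the text."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "n.s."


# Made-up triplicate measurements (N=3 per group), for illustration only.
control = [1.0, 1.1, 0.9]
treated = [0.4, 0.5, 0.45]
t_stat, dof = student_t(control, treated)
p_value = two_tailed_p(t_stat, dof)
print(f"t = {t_stat:.2f}, df = {dof}, p = {p_value:.4f}, {stars(p_value)}")
```

With three replicates per group the test has only four degrees of freedom, so fairly large effects are needed to reach the lower p-value thresholds.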