Protein expression and purification
The gene sequence corresponding to the full-length SARS-CoV-2 N protein (GenBank QIG56001.1) was amplified from SARS-CoV-2 RNA and cloned into a pET28a-TEV vector for the expression of a 6xHis-fusion protein, as previously described51. The NTD (residues Q43-E174) and CTD (residues S250-P364) fragments were amplified from the full-length N construct using the forward primers NTD-F (5’-AACGTGGATCCCAAGGTTTACCCAATAATACTG-3’) and CTD-F (5’-CTAAGGGATCCGCTGCTGAGGCTTCTAAGAAG-3’) and the reverse primers NTD-R (5’-ACTGCCGCGGCCGCTTTATTCTGCGTAGAAGCCTTTTGG-3’) and CTD-R (5’-CTTTTTAGCGGCCGCTTATGGGAATGTTTTGTATGCGTC-3’), and inserted into the BamHI/NotI sites of a pET-SUMO vector (Invitrogen), which carries a SUMO sequence at the N-terminus. The expression vectors were used to transform Escherichia coli BL21 (DE3) cells (Novagen, USA). Freshly transformed cells were grown in LB-kanamycin (50 μg/mL) medium at 37°C to an OD600nm of 0.8. The temperature of the cultures was then lowered to 25°C (full-length N) or 18°C (NTD and CTD), and protein expression was induced with 0.1 mM (full-length N) or 0.5 mM (NTD and CTD) IPTG for 16 h at the respective temperatures. Cells were harvested by centrifugation (4,000 x g, 10 min) and stored at -80°C. To remove nucleic acids of bacterial origin, the proteins were purified under denaturing conditions using urea and a high salt concentration26. Frozen cell pellets were thawed, resuspended in buffer A (50 mM sodium phosphate, pH 7.6, 500 mM NaCl, 10% glycerol, 20 mM imidazole, 6 M urea) and lysed by sonication on ice. Lysates were centrifuged at 18,000 x g for 40 min at 4°C to remove cell debris, and the supernatants were applied onto a 5 mL HisTrap FF column (GE Healthcare) pre-equilibrated with buffer A. After washing, proteins were eluted in ten column volumes of buffer B (50 mM sodium phosphate, pH 7.6, 500 mM NaCl, 10% glycerol, 500 mM imidazole, 3 M urea).
Fractions containing the protein of interest were pooled and dialyzed against buffer C (50 mM sodium phosphate, pH 7.6, 500 mM NaCl, 10% glycerol) overnight at 4°C. Except for the full-length N construct, the recombinant proteins (NTD and CTD constructs) were cleaved with the appropriate TEV and SUMO proteases. Cleaved tags were removed by reverse affinity chromatography using buffers A and B without urea. Protein fractions were concentrated and fractionated on a Superdex 200 16/600 (full-length N) or Superdex 75 16/600 (NTD and CTD) size-exclusion column previously equilibrated with buffer C.
For crystallization tests, the CTD was purified using Turbonuclease from Serratia marcescens (Sigma, USA). Briefly, bacterial cells harvested after IPTG induction were suspended in lysis buffer (50 mM Tris-HCl, pH 8.0, 1 M NaCl, 5% glycerol, 1 mM β-mercaptoethanol) containing 200 units of Turbonuclease and lysed by sonication as described above. Lysed cells were centrifuged, and the supernatant was applied onto a 5 mL HisTrap FF column pre-equilibrated with buffer D (50 mM Tris-HCl, pH 8.0, 500 mM NaCl, 5% glycerol, 1 mM β-mercaptoethanol). Bound proteins were eluted using the same buffer containing 500 mM imidazole. The eluate was dialyzed against buffer D overnight at 4°C. After SUMO cleavage and reverse affinity chromatography, the proteins were fractionated on a Superdex 75 16/600 column pre-equilibrated with buffer E (20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 1 mM β-mercaptoethanol).
The quality of all protein preparations was verified by SDS-PAGE and dynamic light scattering (DLS). In addition, UV absorbance at 260/280 nm was used to estimate the amount of nucleic acid in the protein samples. Only protein samples with a monodisperse character and a 260/280 nm ratio of 0.5-0.6 were used in the experiments described below.
Fluorescence polarization assay
Chemically synthesized RNA probes, 5’-labelled with fluorescein isothiocyanate (FITC) and purified by HPLC, were obtained from Thermo Scientific (USA). Probe sequences were as follows: RNA1 (5’-CACUCACUGUCUUUUUUGAUGGUAGAGU-3’), RNA2 (5’-CACUCACUGUCAAAAAAGAUGGUAGAGU-3’), RNA3 (5’-CACUCACUGUCUUGUUUGAUGGUAGAGU-3’), RNA4 (5’-CACUCACUGUCUUUUUU-3’), RNA5 (5’-CACUCACUGUC-3’) and scramble control RNA6 (5’-AUAUAGCUAC-3’).
Fluorescence polarization (FP) assays were used to determine the binding affinity of the N protein for the RNA probes, which were solubilized in 50 mM sodium phosphate buffer (pH 7.6). Purified N protein at concentrations from 2.5 nM to 15 µM in 50 mM sodium phosphate buffer, 100 mM NaCl (pH 7.6) was mixed with each RNA at a final concentration of 10 nM in 384-well plates. FP data were acquired using a ClarioStar microplate reader (BMG LabTech), with excitation and emission wavelengths set to 485 and 530 nm, respectively. Binding curves were fitted to a Hill1 model using the OriginPro software.
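As an illustration of the curve model named above, the following minimal Python sketch evaluates a Hill-type binding function. The parameterization (baseline and plateau polarization, half-saturation concentration and Hill coefficient), the function name and all numerical values are assumptions chosen for illustration; they are neither the fitted parameters nor the exact OriginPro implementation.

```python
def hill_binding(conc_nm, start_mp, end_mp, k_nm, n):
    """Hill-type curve: polarization (mP) as a function of protein concentration (nM)."""
    if conc_nm < 0:
        raise ValueError("concentration must be non-negative")
    # Fraction of probe bound at this protein concentration.
    bound_fraction = conc_nm ** n / (k_nm ** n + conc_nm ** n)
    return start_mp + (end_mp - start_mp) * bound_fraction


# Hypothetical parameters, for illustration only (not fitted values).
START_MP, END_MP, K_NM, N = 50.0, 250.0, 500.0, 1.5

# 2.5 nM and 15 µM are the ends of the titration range; 500 nM equals K_NM here.
for conc in (2.5, 500.0, 15000.0):
    value = hill_binding(conc, START_MP, END_MP, K_NM, N)
    print(f"{conc:>8.1f} nM -> {value:.1f} mP")
```

In this form, the signal at a protein concentration equal to the half-saturation constant lies midway between the free-probe and bound-probe plateaus, which is how an apparent affinity is read from the fitted curve.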
High-throughput screening assays
A customized library of 3215 nonredundant compounds from the ‘FDA-approved’, ‘anti-COVID’, ‘anti-infection’ and ‘anti-virus’ collections was purchased from MedChemExpress (NJ, USA). The library, in 384-well plates, was diluted to 1 mM in dimethyl sulfoxide (DMSO) and stored at -20°C. Columns 1, 2, 23 and 24 of all microplates were filled with DMSO for the screening controls described below.
Binding of RNA1 to the N protein was monitored by FP, as described above. Screenings were performed in 384-well, flat-bottom, black polypropylene microplates (Greiner #781289), using the binding buffer supplemented with 0.01% Triton X-100, in a final volume of 25 µL. The final concentrations of RNA, N protein, library compound and DMSO were 10 nM, 500 nM, 20 µM and 2% (v/v), respectively. Initially, the assay plates were filled with the RNA probe solution (19.5 µL) using a MultiDrop dispenser (Thermo Fisher), and the compounds (0.5 µL) were transferred from the library to the assay plates with a Janus-MDT liquid handler platform (PerkinElmer). FP measurements were performed at this stage to detect possible interference from library compounds. The N protein (5 µL) was then transferred to all wells of the assay plates using the MultiDrop dispenser, except for columns 1 and 24, which received buffer, RNA and DMSO only and were used as negative controls (low control, free probe). Columns 2 and 23 received buffer, RNA, protein and DMSO, and were used as positive controls (high control, bound probe). After the protein was added to the assay mix, the plates were incubated at room temperature for 30 min before FP was measured. The mP values of the positive (100% binding) and negative (0% binding) controls were used for sample data normalization.
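The dispensed volumes and final concentrations stated above can be cross-checked with simple dilution arithmetic. The short Python sketch below does this; the helper names are hypothetical, and the RNA and protein stock concentrations it prints are back-calculated from the stated volumes and final concentrations rather than values reported in the text.

```python
FINAL_VOLUME_UL = 25.0


def final_concentration(stock, added_volume_ul, final_volume_ul=FINAL_VOLUME_UL):
    """Concentration after diluting a stock into the final assay volume."""
    return stock * added_volume_ul / final_volume_ul


def required_stock(final, added_volume_ul, final_volume_ul=FINAL_VOLUME_UL):
    """Stock concentration needed to reach a given final concentration."""
    return final * final_volume_ul / added_volume_ul


# The three additions fill the well: 19.5 µL RNA + 0.5 µL compound + 5 µL protein.
total_volume_ul = 19.5 + 0.5 + 5.0

# Library compounds are stored at 1 mM (1000 µM) in DMSO; 0.5 µL is transferred.
compound_um = final_concentration(1000.0, 0.5)
dmso_percent = final_concentration(100.0, 0.5)

# Back-calculated working stocks (derived here, not stated in the text).
protein_stock_nm = required_stock(500.0, 5.0)
rna_stock_nm = required_stock(10.0, 19.5)

print(f"total volume: {total_volume_ul} µL")
print(f"compound: {compound_um} µM, DMSO: {dmso_percent}% (v/v)")
print(f"protein stock: {protein_stock_nm} nM, RNA stock: {rna_stock_nm:.2f} nM")
```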
Concentration−response curves and IC50 determination
To confirm the activity and evaluate the potency of selected hit candidates, the compounds were subjected to dose-response assays. Except for compound concentration, the assay conditions were as described above: 0.5 µL of each compound was transferred to the assay plates, generating a concentration gradient from 1.8 nM to 50 µM. The concentration-response assays were performed in triplicate and, after acquisition of the FP data, the percentage of inhibition for each test compound was calculated as follows: % inhibition = (high control average - read value)/(high control average - low control average) × 100. Normalized data were fitted to the four-parameter log(concentration)-response equation with variable slope to extract the IC50 values using GraphPad Prism 9 (GraphPad LLC, v. 9.3.1).
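The normalization formula above, together with a commonly used form of the four-parameter variable-slope curve, can be sketched in Python as follows. The function names, the sign convention of the slope and the example readings are assumptions for illustration; the fitting itself was done in GraphPad Prism and is not reproduced here.

```python
from statistics import mean


def percent_inhibition(read_mp, high_controls_mp, low_controls_mp):
    """% inhibition = (high average - read) / (high average - low average) x 100."""
    high = mean(high_controls_mp)
    low = mean(low_controls_mp)
    if high == low:
        raise ValueError("high and low control averages must differ")
    return (high - read_mp) / (high - low) * 100.0


def four_parameter_logistic(log_conc, bottom, top, log_ic50, hill_slope):
    """Four-parameter curve on log10 concentration (assumed parameterization)."""
    exponent = (log_ic50 - log_conc) * hill_slope
    return bottom + (top - bottom) / (1.0 + 10.0 ** exponent)


# Hypothetical mP readings, for illustration only.
high_wells = [248.0, 252.0]  # bound probe (RNA + protein + DMSO)
low_wells = [49.0, 51.0]     # free probe (RNA + DMSO)
print(percent_inhibition(150.0, high_wells, low_wells))
```

With this parameterization the curve passes through the midpoint between its bottom and top plateaus when the log concentration equals log IC50, which is the quantity extracted from the fit.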
Biophysical analysis and hit confirmation
Aggregation assays were performed with the full-length N protein diluted to 10 µM in 50 mM sodium phosphate buffer, 100 mM NaCl (pH 7.6), with subsequent addition of 20 µM of each test compound or 1% DMSO (the diluent) as a control. The same assay was performed with the protein at 2.5 µM in the presence of 500 µM CA. Samples were evaluated by DLS in a ZetaSizer NanoS instrument (Malvern) at 10°C, using the instrument's default parameters.
The affinity of RNA1 for the N protein in the presence of selected hit compounds was assessed by FP. Serial dilutions of the protein:compound mixtures at a 1:2 ratio were prepared in 50 mM sodium phosphate buffer, 100 mM NaCl (pH 7.6). RNA1 at 10 nM was added to the serial dilutions and FP measurements were performed as described above.
To determine the amount of CA required for N protein-RNA1 dissociation, purified N protein at 2.5 µM in 50 mM sodium phosphate buffer, 100 mM NaCl (pH 7.6) was incubated with 10 nM RNA1 at 4°C for 60 min prior to the addition of increasing amounts of CA up to 500 µM. The FP data were acquired as described above and the binding curves were fitted to a Hill1 model using the OriginPro software.
The dissociation constant and thermodynamic parameters for the N protein-CA interaction were determined by isothermal titration calorimetry (ITC) using a VP-ITC calorimeter (Malvern). Purified N protein was dialyzed against 50 mM sodium phosphate buffer, 500 mM NaCl (pH 7.6), overnight at 4°C, and further diluted to 20 µM. The dialysis buffer was used to prepare CA at 250 µM. Both the cell and syringe solutions were prepared in the presence of 0.25% DMSO. CA was titrated into the protein solution (10 µL injections) at 20°C with 300 s intervals. CA titrations into the buffer and buffer titrations into the protein solution were performed as controls. After subtraction of the controls, the isotherms were analyzed using the Microcal Origin software provided with the equipment, and the data were fitted to the One Set of Sites model.
Protein crystallization and X-ray data collection
Freshly prepared CTD samples at 8 mg/mL were subjected to hanging-drop vapor diffusion crystallization trials performed in 24-well VDX plates at 18°C using Hampton Crystal Screen HT™ solutions. After optimization of the crystallization conditions, CTD crystals were obtained within three days in 100 mM Tris-HCl (pH 8.3), 30% PEG 4000, 0.2 M sodium acetate, with 2 µL drops (1 µL protein/1 µL reservoir solution) and 300 µL of reservoir solution. Protein-ligand crystals were grown within three weeks under the same crystallization condition with a reservoir solution containing 4 mM CA. Protein crystals were cryoprotected by rapid soaking in reservoir solution containing 25% glycerol and flash-cooled in liquid nitrogen. X-ray diffraction data were collected under cryogenic conditions (100 K) at wavelengths of 1.327 Å and 0.977 Å at the Manacá beamline (macromolecular micro and nano crystallography)52 of Sirius, the Brazilian synchrotron light source (LNLS, Campinas, Brazil), using a PILATUS 2M detector placed 145 mm from the crystal. The X-ray data were collected using a fine ϕ-slicing strategy, with the crystal rotated through 360° at a 0.1° oscillation range per frame.
X-ray data processing and structural determination
X-ray diffraction data were automatically processed with XDS53 using the Manacá Automatic Processing Pipeline (ManacáAutoProc) and analyzed and scaled using Pointless, Matthews and Scala from the CCP4 package36,54,55. The phases of the datasets were determined by molecular replacement with Molrep56 and Phaser57 using the SARS-CoV-2 CTD crystal structure (PDB code: 7C22) as the search model. The atomic structures were refined using REFMAC558. Geometry restraints for the ligands were generated with eLBOW59, and the ligands were modelled using COOT60. All figures were generated using PyMOL.
The volume of the CA binding site was estimated with the parKVFinder software61 in box adjustment mode around the CA molecule, with the following parameters: probe in of 1.4 Å, probe out of 12 Å and removal distance of 0.5 Å.
NMR experiments
All NMR spectra were obtained at 298 K using an Agilent DD2 500 MHz spectrometer or a Varian Inova 600 MHz spectrometer, both equipped with a 5 mm triple-resonance probe and a Z pulse-field gradient unit. The saturation transfer difference (STD) experiments were performed with samples containing 400 μM CA and 4 μM N protein dissolved in 80 mM sodium phosphate buffer, pH 7.4, prepared with deuterated water. The 1D 1H-STD spectra were obtained by subtracting the saturated spectra (on-resonance) from the reference spectra (off-resonance); the subtraction was performed automatically by phase cycling using the dpfgse satzfer pulse sequence implemented in the VNMRJ software (Agilent). The spectra were acquired using 2048 scans, with selective irradiation of the protein at -0.5 ppm (on-resonance) and 30 ppm (off-resonance). Forty G-shaped pulses of 50 ms separated by 1 ms delays were applied to the samples. The total length of the saturation train was 2.5 s. A T2 filter was applied to eliminate all protein background. The off-resonance spectra were used as reference spectra and were acquired with 1024 scans, keeping all other parameters equal to those of the 1D 1H-STD-NMR spectra. For the group epitope mapping analysis, the STD enhancements (ASTD) were determined as the integrals of individual ligand protons in the 1D 1H-STD-NMR spectrum (ISTD) divided by the integrals of the same signals in the reference spectrum (I0) and multiplied by the excess ratio of ligand to protein concentration ([L]/[P]), according to equation 1.
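Written out from the verbal definition above, with the symbols as defined in the text, equation 1 can be expressed as follows. This is a reconstruction from that description, not a reproduction of the typeset equation.

```latex
A_{\mathrm{STD}} = \frac{I_{\mathrm{STD}}}{I_{0}} \times \frac{[L]}{[P]} \tag{1}
```

For the sample composition stated above (400 μM CA and 4 μM N protein), the ligand excess [L]/[P] is 100.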
In vitro anti-SARS-CoV-2 activity
Vero CCL-81 cells (African green monkey kidney cell line - BCRJ, # 0245) were cultivated in DMEM medium supplemented with 10% fetal bovine serum (FBS), 1% L-glutamine and 1% penicillin/streptomycin. Calu-3 cells (human lung cell line - ATCC, # HTB-55™) were cultivated in DMEM/F12 (1:1, v/v) medium supplemented with 20% FBS, 1% L-glutamine and 1% penicillin/streptomycin. Both cell lines were grown at 37°C with 5% CO₂.
Antiviral assays were performed with the HIAE-02 SARS-CoV-2/SP02/human/2020/BRA strain (GenBank accession MT126808.1), kindly provided by Prof. Edison Luiz Durigon (USP-SP, Brazil). Virus stocks were propagated in Vero CCL-81 cells in 75 cm² flasks. After 30-36 h of growth, culture supernatants were centrifuged to remove cell debris and stored at -80°C. All assays involving infectious virus were performed in the BSL-3 unit of the Emerging Viruses Laboratory (LEVE) at the State University of Campinas, Brazil.
Cell viability in Vero CCL-81 and Calu-3 after CA treatment was measured by the MTT [3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl tetrazolium bromide] (Sigma–Aldrich, USA) method. Briefly, CA diluted in 10% DMEM at 25 and 100 µM final concentration was added to the confluent monolayer of cells grown in 24-well plates. After 48 h growth, the medium was replaced by fresh DMEM containing MTT (200 µg/mL) and cells were incubated for 3 h. DMSO was used to solubilize the formazan crystals and cell viability was measured by OD at 492 nm. The results were expressed according to the equation (T/C) × 100%, where T and C represented the mean optical density of treated and control, respectively. DMSO at 0.2% (v/v) in DMEM medium was used as the vehicle control.
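The viability expression (T/C) × 100% can be illustrated with a few lines of Python. The function name and the optical density readings are hypothetical and serve only to show the calculation.

```python
from statistics import mean


def percent_viability(treated_od, control_od):
    """Viability (%) = mean OD of treated wells / mean OD of control wells x 100."""
    control_mean = mean(control_od)
    if control_mean <= 0:
        raise ValueError("control optical density must be positive")
    return mean(treated_od) / control_mean * 100.0


# Hypothetical OD492 readings, for illustration only.
treated_wells = [0.81, 0.79, 0.80]
vehicle_wells = [0.90, 0.88, 0.92]  # vehicle control (0.2% DMSO in DMEM)
print(f"{percent_viability(treated_wells, vehicle_wells):.1f}% viability")
```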
To evaluate the activity of CA on SARS-CoV-2 replication, Calu-3 and Vero CCL-81 cells were seeded into 24-well plates at 2 × 10⁵ and 2.5 × 10⁵ cells per well, respectively. The antiviral activity was determined at a multiplicity of infection (MOI) of 1. The confluent monolayer of cells was incubated with the virus for 1 h, after which the culture medium was replaced by fresh medium containing CA at 25 and 100 µM final concentration. Culture supernatants were harvested 48 h after virus inoculation and viral load was determined by plaque assay.
Plaque assay
Vero cells were seeded into 24-well plates and incubated for 1 h with the supernatants from the antiviral assays, serially diluted down to 10⁻⁶. After virus incubation, cells were overlaid with semi-solid medium (1% w/v carboxymethylcellulose in DMEM supplemented with 5% FBS) and incubated for 4 days. After removal of the semi-solid medium, cells were fixed with 4% paraformaldehyde and plaques were visualized by staining with 1% methylene blue. Viral lysis plaques were counted and the results were expressed as plaque-forming units (PFU) per mL of sample.
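A plaque count is conventionally converted to a titer by dividing by the dilution and by the inoculum volume. The Python sketch below shows that conversion; the function name, the plaque count and the inoculum volume are hypothetical, since the volume applied per well is not stated above.

```python
def pfu_per_ml(plaque_count, dilution, inoculum_volume_ml):
    """Titer (PFU/mL) = plaques / (dilution x inoculum volume in mL)."""
    if dilution <= 0 or inoculum_volume_ml <= 0:
        raise ValueError("dilution and inoculum volume must be positive")
    return plaque_count / (dilution * inoculum_volume_ml)


# Hypothetical example: 30 plaques counted at the 10^-4 dilution,
# with an assumed inoculum of 0.2 mL per well.
titer = pfu_per_ml(30, 1e-4, 0.2)
print(f"{titer:.2e} PFU/mL")
```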