Optimizing variant-specific therapeutic SARS-CoV-2 decoys using deep-learning-guided molecular dynamic simulations.

doi:10.21203/rs.3.rs-1971184/v1


Article


https://doi.org/10.21203/rs.3.rs-1971184/v1

This work is licensed under a CC BY 4.0 License

Journal Publication

published 14 Jan, 2023

Read the published version in Scientific Reports →

You are reading this latest preprint version

Treatment of COVID-19 with a soluble version of ACE2 that binds to SARS-CoV-2 virions before they enter host cells is a promising approach, but it needs to be optimized and adapted to emerging viral variants. The computational workflow presented here consists of molecular dynamics simulations for RBD-ACE2 binding affinity assessments of ACE2 or RBD variants and a novel convolutional neural network architecture working on pairs of voxelized force-fields for efficient search-space reduction. We identified hACE2-Fc K31W along with multi-mutation variants as high-affinity candidates, which we also validated in vitro with virus neutralization assays. We evaluated binding affinities of these ACE2 variants with the RBDs of Omicron BA.3, Omicron BA.4/BA.5, and Omicron BA.2.75 in silico. In addition, candidates produced in Nicotiana benthamiana, an expression organism for potential large-scale production, showed a 4.6-fold reduction in half-maximal inhibitory concentration (IC₅₀) compared with the same variant produced in CHO cells and an almost six-fold IC₅₀ reduction compared with wild-type hACE2-Fc.

SARS-CoV-2

COVID-19

coronavirus

spike protein

ACE2

molecular dynamics

neutralization assay

variant of concern

Beta

Delta

Omicron

Catalophore Halo

binding-affinity prediction

artificial neural network

deep learning

The coronavirus disease 2019 (COVID-19) pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is being combated by vaccination programs, but therapeutic options such as the SARS-CoV-2 main protease inhibitor nirmatrelvir/ritonavir (Paxlovid™)¹ are also needed because the virus continues to evolve towards escape from existing immunity^2,3. Since coronaviruses are characterized by moderate to high mutation rates, treatment options that are independent of spike protein mutations or easily adaptable constitute a significant piece of the puzzle to reduce the virus’s danger, not only for the unvaccinated part of the population⁴. Further SARS-CoV-2 immune-escape variants may emerge in the future, as has already happened with Beta (B.1.351), Gamma (P.1), and, more recently, Omicron (B.1.1.529)^5–8.

During infection, the virus utilizes the receptor-binding domain (RBD) of its trimeric spike proteins, protruding from the viral surface, to bind to its cellular receptor angiotensin-converting enzyme 2 (ACE2) after two precleavage events^9,10. This key mechanism enables viral internalization into the host cell and subsequent replication in the host organism^3,11. Targeting the entry mechanism by blocking the interaction between the viral spike protein and human ACE2 (hACE2) is a promising treatment option because it prevents host-cell infection. Using human soluble ACE2 (hsACE2) as a decoy to block SARS-CoV-2 infection is therapeutically potent in human capillary organoids and in COVID-19 model hamsters^12,13. The application of recombinant hsACE2 has been reported to be safe in healthy human subjects¹⁴ but did not reduce mortality in a phase 2 trial (ClinicalTrials.gov ID: NCT04335136).

There have been approaches to modify the hsACE2 protein to increase its moderate binding affinity to the RBD (K_D ≈ 20 nM)^15–18. Using the hACE2 dimer (hACE2₂) instead of a monomer enhanced its effect against SARS-CoV-2 infection due to avidity effects^17,19. Fusion to segments of human IgG improved its pharmacokinetics in mouse models²⁰ and activated degranulation of NK cells²¹. In addition, mutations at hotspot positions, e.g., S19, T27, K31, H34, L79, and N330, enhanced hACE2’s affinity for the SARS-CoV-2 spike RBD^17,19,22–24, and treatment with a hsACE2₂ triple mutant reduced mortality in SARS-CoV-2 infected K18-hACE2 mice²⁵. In this regard, research groups reported hACE2-decoy–RBD dissociation constants in the low nanomolar to subnanomolar range^19,26.

Based on these observations and data, we devised a workflow using a combination of standard techniques and our point-cloud technology: First, we used extensive molecular-dynamics (MD) simulations to filter hACE2 mutations that strengthen the interaction with the spike RBD. To this end, we employed an empirical scoring function (ESF) closely related to the linear-interaction-energy (LIE) method, which has been calibrated on experimental binding data, as described previously²⁷. We performed virus-neutralization assays to evaluate the potential of four hACE2 variants, linked to an Fc segment of human IgG1 (hACE2-Fc), to inhibit the spreading of wild-type SARS-CoV-2 and also of SARS-CoV-2 Beta, which has acquired mutations that reduce binding to class 1 antibodies⁵. In addition, we examined the mass-production potential of such an approach via the expression of hACE2-Fc variants in Nicotiana benthamiana.

The MD-simulation data were combined with hACE2 and RBD Catalophore Halos²⁷ to train an artificial neural network (ANN). The deep-learning models presented here are intended as a tool to predict binding affinities of the SARS-CoV-2 spike protein with hACE2 variants. Apart from that, the models should be able to predict the hACE2-binding affinities for RBDs of newly emerging SARS-CoV-2 variants based solely on hACE2 and RBD Halos. When new virus mutants emerge, the vast mutational landscape of hACE2 could then rapidly be screened via the ANN, and the top performers verified using MD, so that the treatment could be adapted to the hsACE2 variant with the highest binding affinity to the presently dominant SARS-CoV-2 variant, or even optimized for several important variants.

Overall Strategy.

Deep-mutational scans examining a broad range of RBD and hACE2 single mutants have shown severe alterations in binding- and expression-properties^17,28. We collected high-affinity hACE2 mutants from literature and combined them with mutations identified by visual examination of the hACE2-RBD binding interface.

We then optimized an ESF, as recently described²⁷, to screen the collection for hACE2 mutants with enhanced binding affinities to the spike-RBD from SARS-CoV-2 variants of concern (VOCs). Furthermore, we trained an ANN on our MD-simulation data to produce binding-affinity estimates for many variants of both spike-RBD and hACE2 much faster than our approach based on the ESF. This will mainly serve as a future high-throughput filter for new spike-RBD variants.

Promising hACE2 variants detected in our simulations were expressed in CHO cells and in N. benthamiana plant leaves. Their SARS-CoV-2-neutralizing capacity was evaluated in wet-lab neutralization tests. A summary of our integrated computational and experimental strategy is visualized in Fig. 1.

ESF binding-energy predictions correlate with experimental data.

A LIE model is an efficient way to compute (binding-) energy differences from ensembles of short MD simulations of either different ligands in a binding site or the Gibbs free-energy difference of a bound ligand-protein system compared to the free ligand in water. Elsewhere²⁷, we have applied an ESF derived from the LIE method to the SARS-CoV-2-spike RBD as the “protein” together with the hACE2 as the “ligand”. While this choice seems somewhat arbitrary at first, we tested hACE2 and RBD as ligands and found the former choice to be more reliable in terms of model stability. Herein, we apply this model to a different set of data.

In particular, we validated our existing ESF by examining the correlation between predicted Gibbs free-energy values (ΔG_pred) of hACE2 mutants and experimental half-maximal effective-concentration (EC₅₀) values from in vitro binding-affinity experiments (Table S2, Fig. S5). We found that hACE2 interaction-energy changes were robust to the method of measurement (in vitro binding-affinity assay vs. ESF). Experimental data were well-correlated with predicted Gibbs free energies from the model (R² = 0.54).

ESF reveals high-affinity hACE2 variants.

Compared to wild-type hACE2, most preselected hACE2 variants show enhanced binding affinities to the five SARS-CoV-2 variants wild type, Beta, Delta, Omicron BA.1, and Omicron BA.2 (Fig. S6). K31W might play a key role in the interaction with the spike-RBD since hACE2-K31W is the only single mutant with strikingly low ΔG_pred values, and this amino acid exchange is also present in the top high-affinity multi-mutants. The highest binding affinities to the spike-RBD are shown for mutants with three to five amino acid substitutions. Reaching ΔG_pred values around −71 kJ/mol, hACE2 T27Y_L79T_N330Y_K31W and hACE2 T27Y_L79T_K31W reveal exceptionally high affinities to the Omicron BA.2 spike-RBD compared to −52 kJ/mol for the wild-type hACE2. In addition, we included the recently emerged Omicron sublineages BA.3, BA.4/BA.5, and BA.2.75 in our hACE2-RBD binding affinity evaluation. With predicted binding affinities of −67 kJ/mol and −62 kJ/mol, hACE2 T27Y_L79T_N330Y_K31W and hACE2 T27Y_L79T_K31W remained the top high-affinity variants for Omicron BA.3, whereas the effect diminished for BA.4/BA.5 and BA.2.75 (Fig. S6).

Halo procreation and deep-learning data preparation.

Catalophore Halos are computed from the protein structure as fields of physicochemical properties defined on a 3D point cloud at the hACE2-RBD binding interface²⁷ but above the protein’s surface. The deep-learning models use the Halo of the spike-RBD and the Halo of the hACE2 (visualized in Fig. 2 and Fig. S4) to predict the binding affinity of the two underlying proteins. The fact that the ANN is trained on MD-simulation data leads to a reliable deep-learning setup since both the MD-simulation results and the ANN predictions rely on input data from the same source. The MD simulations use the structural information about the protein, and the ANN uses the same information encoded in the Halos. The Halos are derived from the structures. After their computation, Halos are agnostic to the amino-acid sequence and should represent the functional capacities of the protein structure and, in particular, of the adjacent surface area.
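To make the notion of a voxelized field on a point cloud more concrete, the following minimal Python sketch bins points carrying one scalar property into a regular 3D grid by averaging per voxel. It is an illustration only: the function name `voxelize`, the grid parameters, and the sample points are assumptions and do not reproduce the actual Halo computation, which is described in ref. 27.

```python
from collections import defaultdict

def voxelize(points, values, origin, voxel_size, shape):
    """Average a scalar property of 3D points into a regular voxel grid.

    points: list of (x, y, z) coordinates; values: one scalar per point.
    Returns a nested list grid[i][j][k]; voxels without points hold 0.0.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for point, value in zip(points, values):
        # Integer voxel index along each axis
        idx = tuple(int((c - o) // voxel_size) for c, o in zip(point, origin))
        # Ignore points that fall outside the grid
        if all(0 <= i < n for i, n in zip(idx, shape)):
            sums[idx] += value
            counts[idx] += 1
    nx, ny, nz = shape
    grid = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for (i, j, k), total in sums.items():
        grid[i][j][k] = total / counts[(i, j, k)]
    return grid

# Hypothetical point cloud with a hypothetical physicochemical property
points = [(0.5, 0.5, 0.5), (0.9, 0.1, 0.2), (2.5, 1.5, 0.5)]
values = [1.0, 3.0, -2.0]
grid = voxelize(points, values, origin=(0.0, 0.0, 0.0),
                voxel_size=1.0, shape=(3, 2, 1))
```

A grid of this kind, computed once per property and per binding partner, is the sort of input a 3D convolutional network can consume; the real channel layout and resolution are not specified here.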

ANN Predictions and Validation.

ANN predictions were implemented in two different ways. First, the potential of the system to predict the effects of small mutational changes in RBD variants for the same hACE2 decoys was investigated using BA.1 to BA.2 as examples, and second, a large-scale screening of the mutational landscape spanned by all possible single-point mutations of hACE2 while the RBD remained fixed was performed.

As a first step, we trained our ANN on different data sets in order to explore its capabilities regarding the prediction of variants with unseen combinations of amino-acid exchanges, both on the spike-RBD- and the hACE2-side of the binding complex. Details of these experiments are given in the Methods section below.

In order to test the predictive power of such a network and setup, we used it to predict binding affinities for the spike-RBD variant of Omicron BA.2, combined with several hACE2 variants. The training was based on the entire ESF dataset except for Omicron-BA.2-variant data. In order to gauge the potential of an ANN prediction model, we made the following basic comparison:

The pure Pearson correlation between MD results for Omicron BA.1 and BA.2 variants, respectively, given the same set of hACE2 receptors, is 0.72. Remarkably, the correlation between MD-simulated BA.2 and the model-predicted BA.2 increases to 0.76, indicating that the model explains an additional 7% of the variance compared to a pure extrapolation from BA.1 variants. The model also exhibited a smaller worst-case error: the maximum error for the pure extrapolation from BA.1 to BA.2 was 8.80 kJ/mol, versus 6.65 kJ/mol for the model. Most of the worst performers lie at the low-affinity end (> −50 kJ/mol), whereas the predictive power for high binding affinities is of much more interest for the present manuscript and our ambition in general. Furthermore, the model mapped the highest outliers, whose MD ΔG values were < −70 kJ/mol, to the highest affinity value seen during training (−68 kJ/mol), and correctly predicted that the outlier with the second-highest affinity would also bind more strongly to the BA.2 spike-RBD, as seen in Fig. 3. This prediction shows that the ANN is not only able to predict the values close to the bulk of the affinity distribution better than an extrapolation from very closely related variants, but also that it reliably maps the strongly underrepresented high-affinity samples to the highest affinity bracket around −68 kJ/mol.
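The comparison above rests on two metrics, the Pearson correlation and the maximum absolute error, each evaluated once for the BA.1 extrapolation and once for the ANN prediction against the MD reference. The sketch below shows how such a comparison could be set up in Python; all energy values are hypothetical placeholders and not data from this study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def max_abs_error(pred, ref):
    """Largest absolute deviation between prediction and reference."""
    return max(abs(p - r) for p, r in zip(pred, ref))

# Hypothetical ΔG values in kJ/mol for the same five hACE2 variants
md_ba2 = [-52.0, -55.0, -60.0, -66.0, -71.0]   # MD reference for BA.2
md_ba1 = [-50.0, -57.0, -56.0, -63.0, -64.0]   # baseline: reuse BA.1 MD results
ann_ba2 = [-53.0, -56.0, -59.0, -64.0, -68.0]  # ANN prediction for BA.2

for name, pred in (("BA.1 extrapolation", md_ba1), ("ANN", ann_ba2)):
    print(f"{name}: r = {pearson(pred, md_ba2):.2f}, "
          f"max error = {max_abs_error(pred, md_ba2):.2f} kJ/mol")
```

Reporting the worst-case error next to the correlation is useful here because the high-affinity tail is small and contributes little to the correlation coefficient.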

This shows that the network can learn meaningful physical insights from Halos with a performance significantly better than simply learning a copy-function or regression-to-the-mean for the closest previously seen data. This is especially true for the case of strong binding. Furthermore, since all of the data in the training set were either Omicron BA.1, which is relatively distant from the wild-type RBD, or were much closer to the wild type, this indicates that the model is able to combine learned insights from relatively different inputs, in this case relatively distant sequences. This is likely an advantage of the sequence- and structure-agnostic nature of the ANN’s Halo training data.

After verifying the predictive power of the ANN, the mutational landscape of hACE2 was screened by predicting all possible single amino-acid exchanges, of which the 300 most promising predictions were verified using the MD-model. As shown in Fig. S1, the model identified single-mutants comparable to the best hACE2 mutant found in the initial MD-runs. (For a detailed description, see the ANN guided ACE2 pre-selection section in the supplementary information.)
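As an illustration of the size of this search space, the following Python sketch enumerates all single amino-acid exchanges for a sequence. The five-letter fragment is a toy input: K31, H34, and E35 match positions named in the text, while the remaining letters should be treated as placeholders. The function name `single_point_mutants` is an assumption made for this example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def single_point_mutants(sequence, first_residue_number=1):
    """Yield labels such as 'K31W' for every single amino-acid exchange."""
    for offset, wild_type in enumerate(sequence):
        position = first_residue_number + offset
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                yield f"{wild_type}{position}{mutant}"

# Toy fragment for illustration, numbered from residue 31
fragment = "KFNHE"
mutants = list(single_point_mutants(fragment, first_residue_number=31))
print(len(mutants))          # 95 (5 positions x 19 exchanges)
print("K31W" in mutants)     # True

# If every modeled hACE2 residue (19-615) were scanned, this would give
print(19 * (615 - 19 + 1))   # 11343
```

The count of 11,343 assumes that the full modeled construct is scanned; the range actually screened is described in the supplementary information and may differ.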

Virus-neutralization experiments are in line with ESF predictions.

Before we describe our experimental results, it is important to note that the connection between our ANN- and MD-model predictions and the neutralization experiments is not entirely direct. While in silico we study effects on interaction energies in a well-defined region, other factors influence the in vitro results. Nevertheless, our numerical approach is able to predict binding affinities without using extensive MD calculations, and binding affinity certainly is an important factor of neutralization efficacy.

In addition to wild-type hACE2, we expressed promising hACE2 (residues 19–740) variants from the MD screening with a C-terminal human IgG Fc tag in CHO cells. To exclude unintended physiological interactions in the human body¹⁹, we eliminated hACE2’s peptidase activity by mutation of the following positions: H374N and H378N²⁹. We conducted BSL-3 virus-neutralization assays to evaluate their potential to block SARS-CoV-2 infections, with either wild-type SARS-CoV-2 or SARS-CoV-2 Beta. Remdesivir, an inhibitor of the viral RNA-dependent RNA polymerase³⁰, served as a positive-control compound during the assay. Two independent methods, quantification of the SARS-CoV-2 RNA levels in the cell-culture supernatant by quantitative Reverse Transcription PCR (qRT-PCR) and immunohistochemical staining of infected cells using a monoclonal anti-SARS-CoV-2 Nucleocapsid antibody in combination with an HRP-conjugated secondary antibody, were implemented for the analysis of the neutralization experiments. A calibration curve allowed the determination of viral copies in the supernatant based on cycle threshold (Ct) values from the qRT-PCR.
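A calibration curve of this kind is commonly a linear fit of Ct against log₁₀ of the copy number of known standards, which is then inverted for unknown samples. The Python sketch below illustrates the principle; the standards, slope, and intercept are hypothetical and are not the calibration used in this study.

```python
def fit_standard_curve(log10_copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    n = len(cts)
    mx = sum(log10_copies) / n
    my = sum(cts) / n
    sxx = sum((x - mx) ** 2 for x in log10_copies)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log10_copies, cts))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the copy number of a sample."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical ten-fold dilution series of a quantified standard
standard_log10_copies = [3, 4, 5, 6, 7]
standard_cts = [30.1, 26.8, 23.5, 20.2, 16.9]

slope, intercept = fit_standard_curve(standard_log10_copies, standard_cts)
sample_copies = copies_from_ct(23.5, slope, intercept)
print(f"slope = {slope:.2f}, intercept = {intercept:.1f}, "
      f"copies = {sample_copies:.2e}")
```

Because Ct scales with the logarithm of the template amount, small Ct differences correspond to large differences in viral copies.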

Preincubation of wild-type SARS-CoV-2 with 25 µg/mL wild-type hACE2-Fc reduced the viral-copy number from 4.95E+08 and 1.66E+08 for the untreated control to 9.51E+04 and 1.03E+03 viral copies for the treated samples, respectively (Fig. S7). hACE2-Fc variants E35D, K31W, and T92C showed even enhanced potential against SARS-CoV-2 infections and led to nearly complete inhibition of infections (Fig. S7b, c). A similar effect was obtained for infections with the SARS-CoV-2 Beta variant (Fig. S7e, f). At a final concentration of 0.78 µg/mL (Fig. 4), a 53,480-fold decrease of viral copies was measured for hACE2-Fc K31W treated samples (5.59E+01 viral copies) compared to hACE2-Fc wild type treated samples (2.99E+06 viral copies). The same effect was also confirmed via immunohistochemical staining of SARS-CoV-2 infected cells (Fig. 4b).
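The fold decrease quoted above can be retraced from the two copy numbers given in the text. The short Python sketch below does this and also expresses the result as a log₁₀ reduction; with the rounded inputs the ratio is about 5.3 × 10⁴, so the small difference from the reported 53,480 presumably reflects rounding of the published copy numbers.

```python
import math

def fold_change(reference, treated):
    """Ratio of viral copies in the reference sample to the treated sample."""
    return reference / treated

# Viral copies at 0.78 µg/mL, as quoted in the text
copies_wild_type = 2.99e6   # hACE2-Fc wild type
copies_k31w = 5.59e1        # hACE2-Fc K31W

fold = fold_change(copies_wild_type, copies_k31w)
log10_reduction = math.log10(fold)
print(f"{fold:,.0f}-fold ({log10_reduction:.2f} log10)")
```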

hACE2-Fc expression in N. benthamiana plant leaves.

In addition to protein expression in CHO cells, we used N. benthamiana plant leaves for hACE2-Fc K31W production and evaluated the RBD-neutralization potential of the product in ELISA-based neutralization assays. We found that hACE2-Fc K31W produced in N. benthamiana leaves has a higher inhibitory activity against RBD compared to hACE2-Fc K31W produced in CHO cells (Fig. 5). Both hACE2-Fc K31W variants are more potent than wild-type hACE2-Fc, produced in CHO cells, in the described setting.

In the ongoing COVID-19 pandemic, there is still a need for widely available therapeutic options. The applicability of soluble recombinant hACE2 proteins as decoys to block the binding of SARS-CoV-2 to human cells has been studied by multiple groups^12,13,17,19,26. Using a diverse approach combining MD simulations, in vitro neutralization tests, live-virus infection assays, and artificial neural networks in conjunction with Catalophore Halos, we implemented a workflow that enables a fast efficiency evaluation of a particular hACE2 variant in combination with a specific SARS-CoV-2 VOC based solely on their hACE2 and RBD Halos. This can be done as soon as a newly emerged SARS-CoV-2 virus strain has been sequenced. In addition, our workflow allows the identification of more effective hACE2 variants in case of newly arising SARS-CoV-2 VOCs in the future.

Our rapid hACE2-RBD binding-affinity evaluation technique consists of a multi-layered strategy. First, in case of the emergence of a new VOC or another variant of interest, a homology model and RBD Halo is created based on its RBD sequence. Together with the already existing set of Halos for our hACE2 samples, or one particular Halo of a specific hACE2 variant of interest, the RBD Halo is then used as an input for an ANN prediction run. This yields results quickly, and a ranked list of the combinations is returned. Predicted high-affinity candidates are then fed into the MD-model workflow to validate and refine the ANN’s prediction. Homology modeling, Halo creation, and ANN prediction take less than half an hour on standard desktop hardware.
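The hand-over from the ANN stage to the MD stage amounts to ranking the predicted affinities and passing only the best candidates on. A minimal Python sketch of that selection step is shown below; the function name `top_candidates` and all ΔG values are hypothetical and serve only to illustrate the ranking.

```python
def top_candidates(predictions, n):
    """Return the n variants with the most negative predicted ΔG.

    predictions: dict mapping variant name to predicted ΔG in kJ/mol.
    More negative values indicate stronger predicted binding.
    """
    ranked = sorted(predictions.items(), key=lambda item: item[1])
    return ranked[:n]

# Hypothetical ANN output for one new RBD Halo against four hACE2 Halos
ann_predictions = {
    "wild type": -52.0,
    "E35D": -55.0,
    "K31W": -63.0,
    "T27Y_L79T_K31W": -70.0,
}

shortlist = top_candidates(ann_predictions, n=2)
for name, delta_g in shortlist:
    print(f"{name}: {delta_g:.1f} kJ/mol -> submit to MD verification")
```

Only the shortlist incurs the cost of replicate MD simulations, which is where the time saving of the two-stage design comes from.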

Although our workflow benefits from its promptness and cost-efficiency, one must not forget that it only covers alterations in the SARS-CoV-2 spike-RBD. Potential binding-affinity alterations and conformational changes of the whole trimeric spike protein due to substitutions in the spike S2 region, as have been shown for D614G³¹, are not covered. Furthermore, the influence of glycosylation-pattern changes cannot be detected since ANN-guided preselection and MD-simulation-based predictions both use deglycosylated input structures.

During our studies, we identified hACE2 K31W, hACE2 T27Y_L79T_N330Y_K31W, and hACE2 T27Y_L79T_K31W, amongst others, as hACE2 variants that showed good binding properties for a variety of SARS-CoV-2 VOCs. High binding affinity is a crucial factor when it comes to virus neutralization capacity, and the two factors correlated well in a previous study¹⁹. The neutralizing effects of hACE2 K31W, which carries a lower mutational load than variants with three to five mutations, have been validated in vitro. hACE2 K31W, along with three other hACE2 variants, each fused to the Fc region of human IgG at the C-terminal end, was expressed in CHO cells. To prevent unintended vasodilatory effects, we eliminated hACE2’s peptidase activity. The presence of wild-type hACE2-Fc, hACE2-Fc K31W, hACE2-Fc E35D, and hACE2-Fc T92C during one hour of VeroE6 virus infection (SARS-CoV-2 wild-type or Beta variant), followed by washing and 48 h incubation with the respective variant, strongly reduced the number of viral copies in the supernatant compared to the untreated control. At low hACE2-Fc concentrations between 3.13 and 0.78 µg/mL, the nearly complete suppression of SARS-CoV-2 spreading remained unaffected for hACE2-Fc K31W but vanished for wild-type hACE2-Fc. This observation was confirmed by the immunohistochemical analysis as an independent readout.

K31W has previously been described as a beneficial mutation regarding hACE2-RBD affinity optimization²⁶. The large aromatic side chain of tryptophan is likely to contribute to enhanced hydrophobic and pi-stacking interactions with residues from the RBD. In particular, the favorable interaction between W31 and Y489 is expected to be the key enhancing factor and has been shown to overshadow the effects of the disruption of two salt bridges (K31-D35 and K31-E484)²⁶.

To evaluate the mass-production potential of our hACE2 variants, we transiently expressed hACE2-Fc K31W in N. benthamiana plant leaves (hACE2-Fc K31W_NB) and tested their SARS-CoV-2-neutralization potential in ELISA-based neutralization assays. Compared to hACE2-Fc K31W produced in CHO cells (hACE2-Fc K31W_CHO), hACE2-Fc K31W_NB was more potent in inhibiting the binding of the RBD to immobilized wild-type hACE2-Fc. This is possibly due to differing hACE2-Fc glycosylation patterns. Wild-type ACE2-Fc expressed in N. benthamiana showed less processed complex N-glycans and a partial underglycosylation at the N90 position in recent LC-ESI-MS analyses compared to ACE2-Fc produced in a human cell line³². It has been reported previously that most of the hACE2 substitutions of N90 and T92, which together form a consensus N-glycosylation motif, would be beneficial for binding to the RBD¹⁷ since N-glycans could hinder binding through steric clashes or electrostatic effects³³. With a half-maximal inhibitory concentration (IC₅₀) of 0.313 µg/mL (hACE2-Fc K31W_NB) and 1.442 µg/mL (hACE2-Fc K31W_CHO), both hACE2 mutants had a lower IC₅₀ value compared to the wild-type hACE2-Fc, expressed in CHO cells, with 1.856 µg/mL. In summary, these data demonstrate that the production of hACE2-Fc variants with correct folding in N. benthamiana is possible, which is in line with previous studies^32,34. Plant-produced soluble ACE2 variants represent a promising, cost-effective therapeutic option in the treatment of patients suffering from COVID-19.
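The fold differences follow directly from the three IC₅₀ values quoted above. The short Python sketch below retraces that arithmetic; the dictionary keys are shorthand labels chosen for this example.

```python
# IC50 values in µg/mL, as quoted in the text
ic50 = {
    "K31W_NB": 0.313,   # hACE2-Fc K31W from N. benthamiana
    "K31W_CHO": 1.442,  # hACE2-Fc K31W from CHO cells
    "WT_CHO": 1.856,    # wild-type hACE2-Fc from CHO cells
}

# Fold reduction in IC50 of the plant-produced variant
fold_vs_cho = ic50["K31W_CHO"] / ic50["K31W_NB"]
fold_vs_wild_type = ic50["WT_CHO"] / ic50["K31W_NB"]

print(f"vs. K31W_CHO: {fold_vs_cho:.1f}-fold")        # 4.6-fold
print(f"vs. WT_CHO:   {fold_vs_wild_type:.1f}-fold")  # 5.9-fold
```

These ratios correspond to the 4.6-fold and almost six-fold IC₅₀ reductions stated in the abstract.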

This work presents a bioinformatic approach to monitoring the binding affinity of hACE2 variants, potentially used as therapeutic decoys for COVID-19 patients, to SARS-CoV-2 variants. The systematic two-pronged strategy enables a rapid binding-affinity evaluation of any given SARS-CoV-2 VOC in combination with a vast number of hACE2 designs at an early stage after sequencing a SARS-CoV-2 VOC. Our workflow could potentially be applied also to other viral targets, such as the MERS entry receptor DPP4. However, our system is especially useful in assessing the efficacy of a given hACE2 decoy to a new VOC at an early stage, shortening timelines for hACE2-decoy adaptation and reducing the number of samples for in vitro selection.

Construction of the MD-simulation system.

Preparation of spike-RBD-hACE2 structures.

A crystal structure of the wild-type SARS-CoV-2 spike-RBD bound to human ACE2 (hACE2) was downloaded from the PDB (ID: 6M0J). Energy minimization was performed using a short steepest-descent minimization followed by simulated annealing (timestep 2 fs, atom velocities scaled down by 0.9 every 10th step). For this purpose, the AMBER14 force field³⁵ was applied with an 8-Å force cutoff. The minimized structure was used as a template for homology modeling. hACE2 mutants with experimentally determined binding affinities (Table S2) were used to validate the empirical scoring function (ESF). Their structures were built by introducing the mutation into the protein sequence and subsequent homology modeling using Yasara³⁶. The same homology-modeling procedure was performed for the wild-type RBD:hACE2 amino-acid sequence (ID: 6M0J) to guarantee identical handling of input structures. The final input files contained residues 333–526 of the respective SARS-CoV-2 RBD and residues 19–615 of hACE2 coordinating one zinc ion.

System setup and training.

The system was established as described previously²⁷. Initial structures were solvated in a cuboid box with periodic boundaries. The cell was filled with water at a density of 0.997 g/cm³, ionizable groups were protonated according to pH 7.4 and 0.9% NaCl counter ions were added. Energy minimization took place before and after each simulation phase to clear bumps and adjust the covalent geometry. For this purpose, the same preparation procedure as described above was implemented. MD simulations were carried out using the AMBER14 force field with automatic parameter assignment by AutoSMILES³⁷ at 310 K and 1 bar. The RBD:hACE2-complex structure, as well as the unbound hACE2 structure, were simulated for 200 ps. Energy snapshots were extracted every 200 fs during the simulation.
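To give a sense of the amount of data behind each ΔG estimate, the sampling scheme can be counted out as follows. This Python sketch assumes that the initial frame is not counted separately and uses the 50 replicates per variant stated in the MD-simulation section.

```python
# Sampling parameters taken from the text
run_length_fs = 200 * 1000      # 200 ps per simulation
snapshot_interval_fs = 200      # one energy snapshot every 200 fs
replicates_per_variant = 50     # stated in the MD-simulation section
systems_per_variant = 2         # RBD:hACE2 complex and unbound hACE2

snapshots_per_run = run_length_fs // snapshot_interval_fs
snapshots_per_variant = (snapshots_per_run
                         * replicates_per_variant
                         * systems_per_variant)

print(snapshots_per_run)      # 1000
print(snapshots_per_variant)  # 100000
```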

hACE2 screening.

hACE2 design.

hACE2 mutations were initially selected according to visual inspection with the focus on affinity optimization using an RBD:hACE2 crystal structure from the PDB (ID: 6M0J) and the collection was extended by the top-ten high-affinity hACE2 variants, determined by Chan et al. in flow-cytometry binding-affinity experiments¹⁷ and combinations of the two approaches (Fig. S6).

Structure preparation.

To examine binding affinities between the suggested hACE2 variants and SARS-CoV-2 variant of concern (VOC) RBDs, suggested hACE2 mutations and RBD-residue changes reported by the WHO for Beta, Delta, Omicron BA.1, Omicron BA.2, Omicron BA.2.75, Omicron BA.3 and Omicron BA.4/BA.5³⁸ were manually incorporated into the wild-type sequence (PDB: 6M0J) and homology models were built with Yasara. The final input files contained residues 333–526 of the respective SARS-CoV-2 RBD and residues 19–615 of ACE2 coordinating one zinc ion.

MD-simulation.

The model-predicted value of ΔG is computed from the linear combination of two energy contributions, one term coming from van der Waals (vdw) and another from electrostatic forces (elec). The energy contributions are differences of ensemble averages of bound and unbound configurations. In the bound case, the hACE2 environment includes water molecules and RBD atoms, whereas, in the unbound state, hACE2 is exclusively surrounded by water molecules within the simulation box. While the energy differences come from MD ensembles, the weights for these contributions were determined using a set of 43 structures along with experimental K_D-values with simulation parameters optimized as described previously²⁷. For our model calculations, we used 200-ps simulations with 50 replicates per variant, which led to Eq. (1).

$$\Delta G = 0.765\,\Delta E^{\mathrm{vdw}} + 0.024\,\Delta E^{\mathrm{elec}} \tag{1}$$

It should be noted here that the weight for the electrostatic term is relatively small. This leads to a small contribution from the electrostatic energy differences to ΔG since energy differences in both terms are of the same order of magnitude. Our previous study showed that electrostatic interaction energies are too broadly distributed in each simulation run to lead to a good-enough signal-to-noise ratio for the corresponding bound-unbound difference. As a result, the model fit reduced the contribution of this term significantly in order not to destroy the correlation between predicted ΔG and the gauging data.
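Eq. (1) can be written out as a short Python sketch that averages the van der Waals and electrostatic energies over the bound and unbound ensembles and combines the differences with the fitted weights. The sign convention (bound minus unbound), the function names, and all energy values are assumptions for illustration; only the two weights are taken from the text.

```python
from statistics import fmean

# Weights from Eq. (1)
W_VDW = 0.765
W_ELEC = 0.024

def ensemble_difference(bound, unbound):
    """Difference of ensemble averages, assumed here as bound minus unbound."""
    return fmean(bound) - fmean(unbound)

def predict_delta_g(vdw_bound, vdw_unbound, elec_bound, elec_unbound):
    """Predicted ΔG (kJ/mol) as the weighted sum of the two energy differences."""
    d_vdw = ensemble_difference(vdw_bound, vdw_unbound)
    d_elec = ensemble_difference(elec_bound, elec_unbound)
    return W_VDW * d_vdw + W_ELEC * d_elec

# Hypothetical interaction energies (kJ/mol) from a handful of snapshots
vdw_bound = [-1100.0, -1090.0, -1110.0]
vdw_unbound = [-1020.0, -1010.0, -1030.0]
elec_bound = [-5200.0, -5100.0, -5300.0]
elec_unbound = [-5100.0, -5000.0, -5200.0]

delta_g = predict_delta_g(vdw_bound, vdw_unbound, elec_bound, elec_unbound)
print(f"{delta_g:.1f} kJ/mol")  # -63.6 kJ/mol
```

In this toy example the electrostatic difference is larger in magnitude than the van der Waals difference, yet it contributes only −2.4 kJ/mol of the total, which mirrors the small electrostatic weight discussed above.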

Model validation.

Comparison to experimental EC₅₀ values.

A subset of the hACE2-variant collection was tested in in vitro binding affinity experiments (see below). The correlation between predicted Gibbs free energy values from the simulation and logarithmic EC₅₀ values was evaluated by calculating R².
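One way to obtain such an R² value is a least-squares line through the pairs of predicted ΔG and log₁₀(EC₅₀), followed by the usual coefficient of determination. The Python sketch below illustrates this under the assumption of a simple linear fit; the data points and the EC₅₀ unit are hypothetical.

```python
import math

def r_squared(xs, ys):
    """Coefficient of determination of a least-squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical values: predicted ΔG (kJ/mol) and measured EC50 (arbitrary unit)
dg_pred = [-50.0, -55.0, -60.0, -65.0]
ec50 = [20.0, 9.0, 6.0, 1.5]

log_ec50 = [math.log10(value) for value in ec50]
r2 = r_squared(dg_pred, log_ec50)
print(f"R^2 = {r2:.2f}")
```

Taking the logarithm of EC₅₀ is the natural choice because binding free energy scales with the logarithm of an equilibrium constant.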

hACE2 production.

Construction of expression plasmid.

The full-length cDNA of human ACE2 (GenBank Accession No. AF291820) was purchased from Sino Biological Inc. (Beijing, China). The cDNA-encoding hACE2 (residues 19–720) for Chinese-hamster-ovary (CHO) expression was amplified by polymerase chain reaction (PCR) and cloned into the expression vector pYD11SP. The insert was fused to the human IgG1 signal peptide at the N-terminus and to the Fc region of human IgG1 at the C-terminal end. The designed mutations were introduced by PCR to generate hACE2 mutants with higher potency. To eliminate hACE2 peptidase activity, H374N and H378N mutations were introduced by overlap extension PCR.

For expression in N. benthamiana, coding sequences of hACE2 (K31W)-Fc were amplified using the following primers:

Forward primer: 5’-GCC GGT CTC GCT TCA GGC ATG GAT CCA TGT CCA CCA TTG AGG AAC AGG CCA AGA CA-3’
Reverse primer: 5’-CTT TTA GCT CAG CAT TCT GCT TTT GAG CTC TCA TTT ACC CGG AGA CAG GGA GAG G-3’

Sequences were inserted in the pSCMP plant-expression vector via BamHI- and SacI-restriction sites using In-Fusion® HD Cloning Plus PCR Cloning Kits (Takara Bio USA, Inc., San Jose, CA, USA), resulting in pSCMP-ACE2(K31W)-Fc fused to barley alpha-amylase 2 signal peptide.

Expression of soluble hACE2-Fc.

CHO cells were maintained in FreeStyle™ F17 medium (Invitrogen, Waltham, MA, USA) supplemented with 4 mM glutamine and 0.1% Kolliphor P-188 (Sigma, St. Louis, MO, USA). The cells were grown at 37°C in shake flasks on an orbital shaker set to 120 rpm in a humidified 5% CO₂ incubator. For transfections, cells were seeded at 1.0 × 10⁶ cells/mL on day 0 and transfected with Polyethylenimine MAX linear (LPEI MAX, MW 25, Polysciences Inc.) on day 1. Briefly, 80 µg DNA plasmid and 133 µL LPEI MAX stock solution (3 mg/mL) were each diluted with 5 mL FreeStyle™ F17 medium. The diluted DNA and LPEI MAX were combined and incubated at room temperature for 3 min. Then, 10 mL of the mixture was added to 90 mL of overnight CHO cell culture. At 4 to 24 h post-transfection, 3 mL of tryptone N1 (Organotechnie, La Courneuve, France) was added. The conditioned media were collected at 72 h after transfection for purification of soluble hACE2-Fc proteins.

N. benthamiana was cultivated in a growth room under a 16 h light:8 h dark photoperiod at 22°C and 50% relative humidity. The binary construct pSCMP-ACE2(K31W)-Fc was introduced into Agrobacterium tumefaciens strain AGL1 by the freeze–thaw method. The AGL1 strains were inoculated in selective liquid YEP media with 50 µg/mL rifampicin, 50 µg/mL carbenicillin, and 100 µg/mL kanamycin and grown at 28°C in a shaking incubator at 200 rpm for two days. For transient expression, the AGL1 pellet harboring pSCMP-ACE2-Fc was resuspended and diluted in 1 × infiltration buffer containing 10 mM 2-(N-morpholino) ethanesulfonic acid (MES), 10 mM MgSO₄, 5% glucose, and 200 µM acetosyringone at pH 5.3 to an OD₆₀₀ of 1.0. The AGL1 strains were vacuum-infiltrated into the leaves of 6-week-old N. benthamiana plants, which were then maintained in a growth room at 22°C. The leaf tissue was harvested 4 days post infiltration (dpi) for protein extraction and purification.

Purification of hACE2-Fc.

Cleared conditioned media from CHO cells transfected with soluble human ACE2-Fc were supplemented with 0.5 mL equilibrated MabSelect™ PrismA Resin (GE Healthcare, Chicago, IL, USA) and incubated overnight in a fridge with shaking. The resin was collected on a chromatography column and washed with 50 mL of buffer A (20 mM sodium phosphate, 150 mM NaCl, pH 7.2). The proteins were eluted with buffer B (0.1 M glycine, pH 3.5). The eluate was immediately neutralized with 1 M Tris, pH 10.6. The hACE2-Fc-containing fractions were pooled and the storage buffer was changed to 1 × PBS. The concentrations of purified hACE2-Fc proteins were determined by Bradford assay.

For purification of recombinant soluble hACE2-Fc expressed in N. benthamiana, 50 g of frozen leaves were extracted in ice-cold lysis buffer (1 × PBS containing 10% glycerol, 1% PVP, 0.5% Triton X-100, 1 mM PMSF, 0.01% β-me, 0.5% protease inhibitor cocktail) at a ratio of 1:2 (w/v) and mixed in a Magic Bullet blender (Homeland Housewares, Los Angeles, CA, USA) by applying three cycles of 30 s at 30-second intervals. The crude extract was then incubated in a fridge with shaking for 30 min before centrifuging at 13,000 × g for 25 min at 4°C. The pH of the extract was adjusted to 7.2, followed by centrifugation at 13,000 × g for 25 min at 4°C. The cleared extract was used to purify soluble hACE2-Fc as described above.

Cell culture and virus stocks.

African green-monkey kidney-epithelial cells VeroE6 (Biomedica, Vienna, Austria) were cultivated in Gibco’s Minimum Essential Medium (MEM) supplemented with Earle’s Salts and L-glutamine (all from Thermo Fisher Scientific, Waltham, MA, USA) with 5% fetal bovine serum (FBS; Thermo Fisher Scientific) and 1% penicillin/streptomycin (Thermo Fisher Scientific), in the following referred to as MEM (5% FBS). Cells were incubated at 37°C and 5% CO₂ unless stated otherwise.

A human 2019-nCoV isolate (Ref-SKU: 026V-03883, Charité, Berlin, Germany) and a human SARS-CoV-2 Beta variant isolate (Ref-SKU: 014V-04058, EVAg, Marseille, France) were propagated in VeroE6 cells. TCID50 titres were determined according to the Reed-Muench method³⁹ and plaque-forming units (PFU) were calculated using the conversion factor 0.7, based on the ATCC-LGC standards (www.atcc.org/support/technical-support/faqs/converting-tcid-50-to-plaque-forming-units-pfu). For all infection experiments, the working stocks were diluted to a calculated multiplicity of infection (MOI) of 0.0003 in MEM (2% FBS). All experimental steps with active SARS-CoV-2 virus isolates were performed under BSL-3 conditions.
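The titre arithmetic described above can be illustrated with a minimal Python sketch. It applies the stated conversion factor of 0.7 and the stated MOI of 0.0003. The stock titre in the example is a made-up value, and using the 30,000 cells seeded per well for the neutralization assay as the cell number at infection is an assumption made only for illustration.

```python
# Conversion factor stated in the text: PFU = 0.7 x TCID50.
TCID50_TO_PFU = 0.7


def pfu_per_ml(tcid50_per_ml: float) -> float:
    """Convert a TCID50/mL titre into an estimated PFU/mL titre."""
    return TCID50_TO_PFU * tcid50_per_ml


def pfu_needed(moi: float, n_cells: int) -> float:
    """Number of infectious units required for a given MOI and cell count."""
    return moi * n_cells


def stock_volume_ul(moi: float, n_cells: int, tcid50_per_ml: float) -> float:
    """Volume of undiluted stock (in microlitres) that carries the required PFU."""
    return pfu_needed(moi, n_cells) / pfu_per_ml(tcid50_per_ml) * 1000.0


# Hypothetical stock titre, NOT taken from the study.
example_titre = 1.0e6   # TCID50/mL
example_cells = 30_000  # assumed cell number per well at infection
example_pfu = pfu_needed(0.0003, example_cells)
example_volume = stock_volume_ul(0.0003, example_cells, example_titre)
```

With these inputs, an MOI of 0.0003 on 30,000 cells corresponds to about 9 PFU per well, which is why the working stocks have to be diluted substantially before use.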

SARS-CoV-2 neutralization assay.

Prior to every assay, purified hACE2-Fc solutions were freshly diluted in MEM (2% FBS) to hACE2-Fc concentrations of 0.78 µg/mL, 1.56 µg/mL, 3.13 µg/mL, 6.25 µg/mL, 12.5 µg/mL or 25 µg/mL.

SARS-CoV-2 neutralization assays were performed essentially as described previously⁴⁰. 24 h prior to the assay, VeroE6 cells were seeded (30,000 cells/well) in a 48-well plate in MEM (10% FBS). After preincubation of SARS-CoV-2 (wild-type or Beta variant) with hACE2-Fc protein (wild type or variant) at concentrations between 0.78 and 25 µg/mL for 30 min, cells were infected at a multiplicity of infection (MOI) of 0.0003 with the preincubation mix in a final volume of 200 µL per well in MEM (2% FBS). At this point, the dose control (DC) was sampled. After 1 h of incubation at 37°C and 5% CO₂, the mixture was removed and cells were washed twice with fresh medium to remove unadsorbed virus particles. The respective hACE2-Fc solutions were then added to the cells again. Cells were then incubated for 48 h at 37°C and 5% CO₂. In the assay, untreated infected cells were used as positive controls and non-infected cells served as negative controls. Remdesivir (THP Medical Products, Vienna, Austria) was applied as an additional control. In the respective wells, cells were preincubated with Remdesivir (10 µM) for 30 min prior to infection. Remdesivir was added again after the washing steps. 140 µL of the supernatant were harvested and inactivated to extract RNA and quantify viral-copy numbers via quantitative reverse transcription PCR (qRT-PCR). After removal of the remaining supernatant, the 48-well plate was fixed in 4% formalin for SARS-specific immunohistochemical staining (IHC).

RNA isolation, quantitative RT-PCR, and calculation of viral-copy numbers.

The supernatant samples were inactivated by adding AVL buffer (Qiagen, Hilden, Germany). The viral RNA was isolated following the manufacturer’s protocol using the QIAamp Viral RNA Mini Kit (Qiagen) and eluted in 40 µL ultra-pure H₂O. qRT-PCR of viral RNA was performed with the QuantiTect Probe RT-PCR Kit (Qiagen) using the Rotor-Gene Q cycler (Qiagen). Reactions took place in a total volume of 25 µL at 50°C for 30 min followed by 95°C for 15 min and 45 cycles of 95°C for 3 s and 55°C for 30 s. The employed N1 primer set and probe, which enable the detection of the N gene of SARS-CoV-2, are those recommended by the CDC for the 2019-Novel Coronavirus (2019-nCoV) Real-time rRT-qPCR Panel (www.cdc.gov/coronavirus/2019-ncov/lab/rt-pcr-panel-primer-probes.html).

2019-nCoV_N1-F (2019-nCoV_N1 forward primer): 5’-GAC CCC AAA ATC AGC GAA AT-3’
2019-nCoV_N1-R (2019-nCoV_N1 reverse primer): 5’-TCT GGT TAC TGC CAG TTG AAT CTG-3’
2019-nCoV_N1-P (2019-nCoV_N1 probe): 5’-FAM-ACC CCG CAT TAC GTT TGG TGG ACC-BHQ1-3’ (label: FAM; quencher: BHQ-1)

To allow the calculation of viral-copy numbers, a commercially available standard (ATCC VR-1986D genomic RNA from 2019 Novel Coronavirus, Lot: 70035624, ATCC, Manassas, VA, USA) was serially diluted and analyzed via qRT-PCR. The resulting Ct-values were plotted against ln[copy numbers] and the equation obtained from linear-regression analysis (y = -1.442 x + 35.079) was used to calculate the viral-copy numbers from the Ct-values of the samples for Primer and Probe N1. The calculated viral-copy numbers refer to a volume of 140 µL supernatant harvested after the neutralization assay.
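Inverting the standard curve above gives the copy number for any measured Ct-value. The short Python sketch below shows that inversion, reading y as the Ct-value and x as the natural logarithm of the copy number, as stated in the text. The function names are illustrative and not part of the original analysis.

```python
import math

# Standard-curve parameters from the text: Ct = SLOPE * ln(copies) + INTERCEPT.
SLOPE = -1.442
INTERCEPT = 35.079


def copies_from_ct(ct: float) -> float:
    """Invert the standard curve to obtain viral-copy numbers from a Ct-value."""
    return math.exp((ct - INTERCEPT) / SLOPE)


def ct_from_copies(copies: float) -> float:
    """Forward direction of the standard curve (useful for sanity checks)."""
    return SLOPE * math.log(copies) + INTERCEPT


# Round trip on a hypothetical Ct-value, NOT a measurement from the study.
example_ct = 25.0
example_copies = copies_from_ct(example_ct)
```

Because the slope is negative, lower Ct-values map to higher copy numbers, and a Ct equal to the intercept corresponds to a single copy in the 140 µL reference volume.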

Immunohistochemical analysis.

After removal of the supernatant and fixation of the cells with 4% formalin for 1 h at room temperature, cells were washed twice with PBS and incubated with PBS at room temperature for at least 10 min. Cells were treated with Triton X-100 (0.1% in PBS, Merck Millipore, Darmstadt, Germany) for 10 min. Cells were washed three times with PBS for 3 min. Endogenous peroxidases were blocked by applying H₂O₂ (3% in methanol, Merck) for 30 min. Cells were washed three times with PBS for 3 min. Samples were incubated for 1 h at room temperature with a 1:1000 dilution of primary antibody (SARS-CoV-2 (2019-nCoV) Nucleocapsid Antibody, Rabbit Mab, Cat: 40143-R019, Sino Biological Inc.) in antibody diluent (REAL Antibody diluent, Dako Cat: S202230_2, Agilent Technologies, Santa Clara, CA, USA). Cells were washed three times with PBS for 3 min. Cells were incubated for 30 min with the secondary peroxidase-conjugated anti-rabbit antibody using the REAL EnVDetectSys Perox/DAB+, Rb/M (Agilent Technologies) as a detection system. Cells were washed three times with PBS for 3 min. The cells were incubated with 100 µL Substrate-Chromogen (EC substrate-Chromogen, Dako, Cat: K346430–2, Agilent Technologies) until optimal staining of virus-infected cells was reached, but not longer than 3 min. The reaction was stopped by washing with PBS. High-quality images were obtained using a light microscope (40× magnification) in combination with the Jenoptik Gryphax Avior microscope camera and the Jenoptik Gryphax software (both from Jenoptik, Jena, Germany).

Binding-affinity assay.

The wells of microtiter plates were coated with 100 µL of 2 µg/mL recombinant His-tagged SARS-CoV-2 RBD protein in carbonate buffer, pH 9.6, overnight at 4°C. The next day, the coating solution was removed and the plate was washed three times with washing solution PBST (PBS + 0.05% v/v Tween20). The plate was blocked using 300 µL of 5% skim milk in PBST solution for 1 h at 37°C. The blocking solution was completely discarded and the plate was washed three times with the washing solution. Soluble hACE2-Fc proteins were serially diluted with PBST solution containing 0.1% BSA. 100 µL of each hACE2-Fc dilution were added to the wells and incubated at 37°C for 1 h. The plate was washed three times with the washing solution. 100 µL of HRP-conjugated anti-Fc antibody solution (1:10,000) were pipetted into each well and incubated for 40 min at 37°C. The antibody solution was removed and the plate was washed three times with washing solution. 100 µL of TMB substrate (Biopanda Diagnostics, Belfast, United Kingdom) were added to each well and incubated at room temperature for 5–10 min. After sufficient color development, 50 µL of stop solution (2N H₂SO₄) were added to the wells. The absorbance (optical density, OD) was read at 450 nm. The data were plotted and binding affinities were analyzed using GraphPad Prism.

ELISA-based neutralization assay.

The wells of microtiter plates were coated with 100 µL of 2 µg/mL recombinant wild-type hACE2-Fc protein in PBS buffer overnight at 4°C. The next day, the coating solution was removed and the plate was washed three times with washing solution PBST. The plate was blocked using 300 µL of 5% skim milk in PBST solution for 1 h at 37°C. HRP-conjugated recombinant SARS-CoV-2 RBD protein was prepared at a concentration of 200 ng/mL in PBST with 0.1% BSA. The blocking solution was completely discarded and the plate was washed three times with the washing solution. Soluble hACE2-Fc proteins were serially diluted with PBST solution containing 0.1% BSA and mixed with HRP-RBD at a ratio of 1:1. 100 µL of each mixture were added into the wells and incubated at 37°C for 30 min. The plate was washed four times with the washing solution. 100 µL of TMB substrate were added to each well and incubated at room temperature for 5–10 min. After sufficient color development, 50 µL of the stop solution (2N H₂SO₄) were added to the wells. The absorbance (optical density, OD) was read at 450 nm. The data were plotted and the neutralizing activities were analyzed using GraphPad Prism.

Generation of ANN training data.

Sequence data preparation.

Sequences used for the preparation of ANN training data included RBD and hACE2 sequences either retrieved from visual inspection, literature research^17,41, or spike protein sequences that were available at GISAID⁴² by January 4th, 2022. In addition to 39 specific RBD sequences obtained from a list of 145 representative spike protein sequences available at GISAID, RBD sequences (residues 319–541) were extracted from 1.3 million non-redundant spike protein sequences (out of 6.7 million entries) by pairwise alignment to the wild-type RBD and analyzed employing in-house tools in Python. For the ANN training data set, 1,077 specific RBD sequences without any insertions or deletions and containing at least two mutations in reference to the wild-type RBD were considered. Due to the large number of specific RBD sequences bearing two mutations, these were additionally restricted to sequences occurring at least twice in the list of 6.7 million entries. Including RBD mutations derived from literature research⁴¹ and current VOCs, a total of 1,165 RBD sequences were used for the training of the ANN. hACE2 mutations were retrieved either from visual inspection, literature research¹⁷, or a combination of both, resulting in 95 distinctive sequences. A list of the RBD-hACE2 pairs used for training the ANN is provided in Table S3.
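The selection rules described above (no insertions or deletions, at least two mutations, and double mutants only if they occur at least twice) can be sketched in Python as follows. This is not the in-house tooling used in the study. It assumes that each spike entry has already been aligned to the wild-type spike, so that residue numbers map directly to string positions and gaps appear as "-". The helper names and the toy sequences are hypothetical.

```python
from collections import Counter

# RBD boundaries from the text (1-based, inclusive spike residue numbers).
RBD_START, RBD_END = 319, 541


def extract_rbd(aligned_spike: str) -> str:
    """Cut the RBD out of a spike sequence aligned to wild-type numbering."""
    return aligned_spike[RBD_START - 1:RBD_END]


def select_rbds(aligned_spikes, wild_type_spike):
    """Return unique RBDs that pass the filters described in the text."""
    wt_rbd = extract_rbd(wild_type_spike)
    occurrences = Counter(extract_rbd(s) for s in aligned_spikes)
    selected = []
    for rbd, n_seen in occurrences.items():
        # Discard sequences with insertions or deletions (gap or length change).
        if "-" in rbd or len(rbd) != len(wt_rbd):
            continue
        n_mut = sum(a != b for a, b in zip(rbd, wt_rbd))
        if n_mut < 2:
            continue
        # Double mutants are kept only if they occur at least twice.
        if n_mut == 2 and n_seen < 2:
            continue
        selected.append(rbd)
    return selected


def mutate(seq: str, substitutions: dict) -> str:
    """Toy helper: apply {1-based position: residue} substitutions."""
    chars = list(seq)
    for pos, aa in substitutions.items():
        chars[pos - 1] = aa
    return "".join(chars)


# Toy data only: a poly-alanine "spike" stands in for the real sequence.
wt = "A" * 600
double_once = mutate(wt, {320: "C", 400: "D"})
double_twice = mutate(wt, {330: "C", 410: "D"})
triple = mutate(wt, {340: "C", 350: "D", 360: "E"})
single = mutate(wt, {500: "C"})
outside_rbd = mutate(wt, {100: "C", 200: "D"})
entries = [double_once, double_twice, double_twice, triple, single, outside_rbd]
example_selection = select_rbds(entries, wt)
```

In the toy data only the double mutant seen twice and the triple mutant are retained. The single mutant, the double mutant seen once, and the entry whose mutations lie outside residues 319–541 are dropped.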

Homology models of the respective RBD-hACE2 complexes were created and binding affinities for RBD (residues 333–526) and hACE2 (residues 19–615) were predicted via MD simulations as described above (“ACE2 screening”). Additionally, the homology models were used to calculate RBD Halos and ACE2 Halos.

Generation of Catalophore hACE2 Halos and spike-RBD Halos.

As described previously^27,43, a Catalophore Halo is a multivariate property field composed of a collection of points in Cartesian space discretized onto an equidistant grid annotated with currently 19 physicochemical and statistical properties (e.g. electrostatics, hydrophobicity, flexibility, potential energies, hydrogen-bonding potential, or dissolvability) that are projected by a biomolecule into its surroundings.

Spike-RBD-hACE2 homology models were deposited in the CATALObase platform⁴³ and used as input data for calculating hACE2- and RBD-Halos. This was achieved by a YASARA⁴⁴ structure-preparation step combined with a Halo-creation and -annotation step. The latter was performed using a modified version of the AutoGrid tool that is part of the AutoDock⁴⁵ suite, version 4.2.3. The 3D point clouds generated with a grid spacing of 0.75 Å cover the entire outer molecular surface of either RBD (for RBD-Halos) or hACE2 (for hACE2-Halos) with a thickness of 5 Å. Molecular surfaces were defined by a probe radius of 1.4 Å around the atoms’ van der Waals (vdW) radii. To focus on the binding-interface region, Halos were further restricted to a maximum distance of 5 Å from the atoms of the respective binding partner, taking their vdW radii into account, and were cropped by the space occupied by these atoms plus their vdW radii. Consequently, the binding partner only influences the shape but none of the 19 physicochemical and statistical properties of the Halo given by the underlying biomolecule. Ligand-atom types and properties used for the annotation of point clouds were: carbon, H-bond donor hydrogen, non-H-bonding nitrogen, H-bond acceptor oxygen, H-bond acceptor sulfur, desolvation potential, electrostatic potential, aromatic carbon, phosphorus, accessibility, hydrophobicity, flexibility, positioning of chains, sulfur, bromine, chlorine, fluorine, iodine.
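The geometric part of this procedure can be sketched in a few lines of Python. The sketch below only selects grid points. It does not reproduce the AutoGrid-based annotation with physicochemical properties, and it is not the software used in the study. It assumes that the molecular surface is the vdW surface expanded by the 1.4 Å probe radius and that the two 5 Å limits are measured from that surface and from the partner's vdW surface, respectively. Atoms are hypothetical (x, y, z, vdW radius) tuples.

```python
import math
from itertools import product

SPACING = 0.75         # grid spacing in Angstrom (from the text)
THICKNESS = 5.0        # Halo thickness in Angstrom (from the text)
PROBE = 1.4            # probe radius in Angstrom (from the text)
PARTNER_CUTOFF = 5.0   # maximum distance to the binding partner (from the text)


def surface_distance(point, atoms, extra=0.0):
    """Signed distance from a point to the nearest atom sphere (radius + extra)."""
    return min(math.dist(point, (x, y, z)) - (r + extra) for x, y, z, r in atoms)


def axis_ticks(lo, hi):
    """Equidistant grid coordinates covering [lo, hi]."""
    n = int(math.floor((hi - lo) / SPACING)) + 1
    return [lo + i * SPACING for i in range(n)]


def halo_points(own_atoms, partner_atoms):
    """Grid points in the 5 A shell of own_atoms that face the binding partner."""
    pad = max(a[3] for a in own_atoms) + PROBE + THICKNESS
    axes = [
        axis_ticks(min(a[i] for a in own_atoms) - pad,
                   max(a[i] for a in own_atoms) + pad)
        for i in range(3)
    ]
    points = []
    for p in product(*axes):
        d_own = surface_distance(p, own_atoms, PROBE)
        if not 0.0 <= d_own <= THICKNESS:
            continue  # inside the molecule or beyond the shell
        d_partner = surface_distance(p, partner_atoms)
        # Keep points near the partner but outside the partner's atoms.
        if 0.0 < d_partner <= PARTNER_CUTOFF:
            points.append(p)
    return points


# Toy system: one atom per "molecule", 8 A apart along x (hypothetical values).
own = [(0.0, 0.0, 0.0, 1.7)]
partner = [(8.0, 0.0, 0.0, 1.7)]
example_halo = halo_points(own, partner)
```

In the toy system all retained points lie on the side of the first atom that faces the second one, which mirrors the statement that the binding partner shapes the Halo without contributing to its properties.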

Artificial neural network.

Idea and Intention.

We trained an artificial neural network (ANN) on our MD-simulation data, augmented by experimental data for the spike-RBD-hACE2 binding affinity where appropriate. The ANN uses the Catalophore Halos of both the spike-RBD and the hACE2 to predict the binding affinity based on Halo information alone (i.e., without direct reference to sequence or structure). Since obtaining model results this way is many orders of magnitude faster than running an MD simulation of the spike-RBD-hACE2 interaction, we can employ the ANN model to obtain efficient estimates for many variants of both spike-RBD and hACE2. The cost of inference for a single variant of hACE2 essentially amounts to the computation of the Halo, which is a short add-on to preparing the mutated hACE2 structure that serves as an input for the binding-affinity MD simulation.

Based on numerous ANN predictions, we can test many more possible variants of hACE2, compile a ranked list of the results, and then feed the most promising candidates into the MD pipeline for validation. This serves two purposes: first, it validates the ANN model further; second, it provides predictions whose reliability is backed by MD-model gauging runs via the ESF. Overall, this approach allows us to choose a much more interesting and potentially representative sampling pattern for our hACE2-design approach than an unguided set of runs of our MD-simulation setup would provide.

Initial Pre-Omicron RBD Experiment.

Using a dataset of 1,049 RBD-hACE2 pairs with RBD amino-acid-exchange counts ranging from one to 20 exchanges (for a more detailed breakdown see Table S1 in the supporting information) compared to the wild-type RBD and a single Omicron BA.1 example, the model was trained using a random subset of 120 samples for validation. With the parameters from the epoch with the lowest validation loss for inference, the model was tested on a previously unseen set of 300 RBD-hACE2 combinations. Here it reached a mean error of 1.05 kJ/mol, with a maximum error of 4 kJ/mol. The Pearson correlation for this set was 0.69.
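The three figures reported above (mean error, maximum error, and Pearson correlation) can be computed from paired predictions and MD-derived reference values as in the following minimal Python sketch. Treating the "mean error" as a mean absolute error is an assumption, and the numbers in the example are invented for illustration only.

```python
import math


def mean_abs_error(predicted, reference):
    """Mean absolute deviation between predictions and reference values."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


def max_abs_error(predicted, reference):
    """Largest absolute deviation between predictions and reference values."""
    return max(abs(p - r) for p, r in zip(predicted, reference))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented binding affinities in kJ/mol, NOT data from the study.
reference = [-55.0, -60.0, -52.0, -66.0]
predicted = [-54.0, -61.5, -53.0, -64.0]
example_mae = mean_abs_error(predicted, reference)
example_max = max_abs_error(predicted, reference)
example_r = pearson_r(predicted, reference)
```

Reporting the maximum error alongside the mean is useful here because a small number of poorly predicted variants can be hidden by a low average.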

Initial Pre-Omicron hACE2 Experiment.

A subset of 50 unique hACE2 variants was isolated and removed from the set. The remaining set was again split into training and validation. Using the parameters that produce the lowest validation loss during training for inference, the influence of these hACE2 variants was predicted. The average prediction error was 1.2 kJ/mol, and the maximum error was 4.8 kJ/mol.

The chosen subset of hACE2 amino-acid exchanges induces a mean variance of 3.72 kJ/mol on calculated RBD-hACE2 binding affinities, compared to the RBD binding with the wild-type hACE2 alone. From this, we conclude that the model explains a large share of the variance induced by the hACE2 amino-acid exchanges. Fig. S2 shows the prediction error as a function of the ground-truth energies of the predicted samples.

Figure S3 shows the distribution of samples in the entire training set with respect to their binding-affinity values. The vast majority of samples is concentrated around −60 to −50 kJ/mol, with a strongly underrepresented minority found at binding affinities lower than −65 kJ/mol. Since a high accuracy in the low-energy regime is desirable and most important, the machine-learning specific challenge lies in avoiding a network bias towards the mean of the distribution and instead transferring information learned from the bulk of the distribution to the low-energy outliers. The Omicron BA.2 experiment described above indeed shows better performance than copying values from or regressing to the mean of the closest samples seen during training for high-energy values, thus indicating success in predicting the strongly underrepresented labels.

Network Architecture.

The ANN used for predicting binding affinities is a 3D convolutional neural network (CNN) which we named “Tandem ZipperNet”. The network takes two voxelized Halos as inputs and outputs predictions of Gibbs free-energy values (ΔG). The overall architecture consists of three blocks: the first block uses separated convolutions, processing the input Halos of RBD and hACE2 binding sites individually, but with shared weights. This layer is supposed to both increase spatial independence on the data and reduce the imbalance coming from different numbers of unique RBD and hACE2 variants in the training data.

These separated convolutions are then followed by a single convolution acting on the output volume stacked along the channel dimension, effectively joining the Halos spatially while keeping the channel size, thus acting as the “Zipper”. A third convolutional block acts on the joined data, first increasing channels and later introducing a bottleneck at the channel dimension. The output is subsequently flattened and fed into a multi-layer-perceptron (MLP). For more information see “Network Input Data” and “Network Regularization and Augmentation” in the supporting information.
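The data flow through the first two blocks can be illustrated with a deliberately small, dependency-free Python sketch. It is not the Tandem ZipperNet implementation. Kernels are reduced to size 1×1×1 so that each convolution becomes a per-voxel linear map over channels, and activations, the third convolutional block, and the MLP are omitted. All weights and input values are hypothetical.

```python
def pointwise_conv(volume, weights):
    """Apply a 1x1x1 convolution.

    volume:  list of C_in channels, each a flat list of voxel values
    weights: C_out rows with C_in entries each
    """
    n_voxels = len(volume[0])
    return [
        [sum(w * channel[v] for w, channel in zip(row, volume))
         for v in range(n_voxels)]
        for row in weights
    ]


def zipper_forward(rbd_halo, ace2_halo, shared_weights, zipper_weights):
    """Shared-weight branches, channel-wise stacking, then the joining convolution."""
    # Block 1: the SAME weights process both Halos separately.
    rbd_features = pointwise_conv(rbd_halo, shared_weights)
    ace2_features = pointwise_conv(ace2_halo, shared_weights)
    # Stack along the channel dimension (C + C channels).
    stacked = rbd_features + ace2_features
    # "Zipper": map the 2C stacked channels back to C channels.
    return pointwise_conv(stacked, zipper_weights)


# Toy inputs: 2 property channels, 4 voxels each (hypothetical values).
rbd = [[1, 2, 3, 4], [0, 1, 0, 1]]
ace2 = [[1, 1, 1, 1], [2, 2, 2, 2]]
shared_w = [[1, 0], [0, 1], [1, 1]]   # 2 -> 3 channels
zipper_w = [
    [1, 0, 0, 1, 0, 0],               # 6 -> 3 channels
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 1],
]
example_output = zipper_forward(rbd, ace2, shared_w, zipper_w)
```

The point of the sketch is the bookkeeping: both inputs pass through identical weights, and the joining step returns to the per-branch channel count, which corresponds to "keeping the channel size" in the description above.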

Statistics and display.

Statistical analyses were performed and plots were generated using GraphPad Prism 9 as well as Seaborn and Matplotlib. Coordinate files for Fig. 2 were generated in PyMOL. Images were rendered using Blender and Open3D.

Data availability.

Publicly available datasets were analyzed in this study. This data can be found here: https://www.gisaid.org/. Input and final structure files as well as Pandas Dataframes of interaction energies exported as Python Pickle files generated within this work are available for download at https://doi.org/10.6084/m9.figshare.19904953.

Acknowledgements

Financial support was provided by the Austrian Science Fund (FWF) through the doc.funds project DOC-46 "Catalox", the Doctoral Academy Graz of the University of Graz, the Austrian Centre of Industrial Biotechnology (Austrian Research Promotion Agency, FFG, project nr. 872161) in the Next Generation Bioproduction project nr. 92017 and of the Austrian Research Promotion Agency General Programme funding scheme project nr. 41404876 "VirtualCure - Rapid Development of an Automated & Expandable In-silico High-Throughput Drug Repurposing Screening Pipeline" and by The National Research Council of Canada Industrial Research Assistance Program (NRC IRAP) through EUREKA COVID-19 Project Number 956312. The computational results presented have been achieved in part using the Vienna Scientific Cluster (VSC) and HPC resources provided by Innophore. Technical and infrastructure support was provided by the Amazon Web Services Diagnostic Development Initiative (DDI). Some computational results presented in this manuscript have been produced in cloud computing facilities provided by Amazon Web Services within DDI, project nr. “CC ADV 00502188 2021 TR” entitled "virus.watch/SARSCoV-2". Catalophore is a registered trademark (AT 295631) of Innophore GmbH. Calculations were carried out using the software described in the methods section embedded in the Catalophore™ Drug Solver platform with a non-commercial open-science license granted by Innophore GmbH. Initial spike models were generated within the FASTCURE consortium (https://fastcure.net/). We thank all researchers who shared SARS-CoV-2 genome sequences in GISAID. A GISAID acknowledgment table containing sequence data used in this study is available at https://doi.org/10.6084/m9.figshare.19904953. The authors would like to thank the lab team of Prof. Zatloukal, especially Christine Langner, for assistance and accompaniment at the BSL-3 laboratory. We thank Verena Resch for her support in data visualization.

Author contributions

K.K. performed MD simulations, prepared Halo data, performed BSL-3 neutralization assays and drafted the manuscript with input from all authors. T.S. contributed to analysis of data and developed, trained and validated the ANN. V.D. performed MD simulations and wrote software to generate Halos. L.P. performed data preparation and sequence analysis, performed MD simulations and prepared Halo data. A.S. gave structural advice and structural biology input for data analysis. A.K. contributed to analysis of data, revised the manuscript, advised and contributed to the ANN development. M.C. suggested ACE2 variants and gave structural biology input for data analysis. W.W., X.Y., Y.Z., W.W.-S.W., C.S., T.Z., X.Z., C.B., L.L., Y.H., Z.X., Z.Z. and J.Y. produced ACE2 proteins and performed binding-affinity and neutralization assays. K.Z. gave medical and scientific advice. K.G. gave structural advice and suggested ACE2 variants. C.C.G. and G.S. contributed to evaluating, preparing and interpreting the data, and designed, managed and supervised the project. All authors edited the manuscript to its final form.

Declaration of interests

K.K., T.S., V.D., L.P., A.S., A.K. and M.C. report working for Innophore. W.W., X.Y., Y.Z., W.W.-S.W., C.S., T.Z., X.Z., C.B., L.L., Y.H., Z.X., Z.Z. and J.Y. report working for SignalChem Lifesciences Corporation. K.G., G.S. and C.C.G. report being shareholders of Innophore, an enzyme and drug discovery company. C.B., Y.H., Z.X., Z.Z. and J.Y. report being shareholders of SignalChem Lifesciences Corporation. Additionally, G.S. and C.C.G. report being managing directors of Innophore. K.Z. reports being a shareholder of Zatloukal Innovations GmbH, a company dedicated to the development of plant-based biopharmaceuticals. The research described here is open science and is scientifically and financially independent of the efforts in any of the above-mentioned companies.

Additional Information

Competing financial interests: The authors declare no competing interests.

Hammond, J.; Leister-Tebbe, H.; Gardner, A.; Abreu, P.; Bao, W.; Wisemandle, W.; Baniecki, M.; Hendrick, V. M.; Damle, B.; Simón-Campos, A.; Pypstra, R.; Rusnak, J. M. Oral Nirmatrelvir for High-Risk, Nonhospitalized Adults with Covid-19. N Engl J Med 2022, 386 (15), 1397–1408. https://doi.org/10.1056/NEJMoa2118542.
Wang, C.; Horby, P. W.; Hayden, F. G.; Gao, G. F. A Novel Coronavirus Outbreak of Global Health Concern. The Lancet 2020, 395 (10223), 470–473. https://doi.org/10.1016/S0140-6736(20)30185-9.
Zhou, P.; Yang, X.-L.; Wang, X.-G.; Hu, B.; Zhang, L.; Zhang, W.; Si, H.-R.; Zhu, Y.; Li, B.; Huang, C.-L.; Chen, H.-D.; Chen, J.; Luo, Y.; Guo, H.; Jiang, R.-D.; Liu, M.-Q.; Chen, Y.; Shen, X.-R.; Wang, X.; Zheng, X.-S.; Zhao, K.; Chen, Q.-J.; Deng, F.; Liu, L.-L.; Yan, B.; Zhan, F.-X.; Wang, Y.-Y.; Xiao, G.-F.; Shi, Z.-L. A Pneumonia Outbreak Associated with a New Coronavirus of Probable Bat Origin. Nature 2020, 579 (7798), 270–273. https://doi.org/10.1038/s41586-020-2012-7.
Su, S.; Wong, G.; Shi, W.; Liu, J.; Lai, A. C. K.; Zhou, J.; Liu, W.; Bi, Y.; Gao, G. F. Epidemiology, Genetic Recombination, and Pathogenesis of Coronaviruses. Trends Microbiol 2016, 24 (6), 490–502. https://doi.org/10.1016/j.tim.2016.03.003.
Greaney, A. J.; Starr, T. N.; Barnes, C. O.; Weisblum, Y.; Schmidt, F.; Caskey, M.; Gaebler, C.; Cho, A.; Agudelo, M.; Finkin, S.; Wang, Z.; Poston, D.; Muecksch, F.; Hatziioannou, T.; Bieniasz, P. D.; Robbiani, D. F.; Nussenzweig, M. C.; Bjorkman, P. J.; Bloom, J. D. Mapping Mutations to the SARS-CoV-2 RBD That Escape Binding by Different Classes of Antibodies. Nat Commun 2021, 12 (1), 4196. https://doi.org/10.1038/s41467-021-24435-8.
Hoffmann, M.; Arora, P.; Groß, R.; Seidel, A.; Hörnich, B. F.; Hahn, A. S.; Krüger, N.; Graichen, L.; Hofmann-Winkler, H.; Kempf, A.; Winkler, M. S.; Schulz, S.; Jäck, H.-M.; Jahrsdörfer, B.; Schrezenmeier, H.; Müller, M.; Kleger, A.; Münch, J.; Pöhlmann, S. SARS-CoV-2 Variants B.1.351 and P.1 Escape from Neutralizing Antibodies. Cell 2021, 184 (9), 2384-2393.e12. https://doi.org/10.1016/j.cell.2021.03.036.
Planas, D.; Saunders, N.; Maes, P.; Guivel-Benhassine, F.; Planchais, C.; Buchrieser, J.; Bolland, W.-H.; Porrot, F.; Staropoli, I.; Lemoine, F.; Péré, H.; Veyer, D.; Puech, J.; Rodary, J.; Baele, G.; Dellicour, S.; Raymenants, J.; Gorissen, S.; Geenen, C.; Vanmechelen, B.; Wawina-Bokalanga, T.; Martí-Carreras, J.; Cuypers, L.; Sève, A.; Hocqueloux, L.; Prazuck, T.; Rey, F. A.; Simon-Loriere, E.; Bruel, T.; Mouquet, H.; André, E.; Schwartz, O. Considerable Escape of SARS-CoV-2 Omicron to Antibody Neutralization. Nature 2022, 602 (7898), 671–675. https://doi.org/10.1038/s41586-021-04389-z.
Tuekprakhon, A.; Nutalai, R.; Dijokaite-Guraliuc, A.; Zhou, D.; Ginn, H. M.; Selvaraj, M.; Liu, C.; Mentzer, A. J.; Supasa, P.; Duyvesteyn, H. M. E.; Das, R.; Skelly, D.; Ritter, T. G.; Amini, A.; Bibi, S.; Adele, S.; Johnson, S. A.; Constantinides, B.; Webster, H.; Temperton, N.; Klenerman, P.; Barnes, E.; Dunachie, S. J.; Crook, D.; Pollard, A. J.; Lambe, T.; Goulder, P.; Paterson, N. G.; Williams, M. A.; Hall, D. R.; Conlon, C.; Deeks, A.; Frater, J.; Frending, L.; Gardiner, S.; Jämsén, A.; Jeffery, K.; Malone, T.; Phillips, E.; Rothwell, L.; Stafford, L.; Fry, E. E.; Huo, J.; Mongkolsapaya, J.; Ren, J.; Stuart, D. I.; Screaton, G. R. Antibody Escape of SARS-CoV-2 Omicron BA.4 and BA.5 from Vaccine and BA.1 Serum. Cell 2022, 185 (14), 2422-2433.e13. https://doi.org/10.1016/j.cell.2022.06.005.
Shang, J.; Wan, Y.; Luo, C.; Ye, G.; Geng, Q.; Auerbach, A.; Li, F. Cell Entry Mechanisms of SARS-CoV-2. Proc Natl Acad Sci U S A 2020, 117 (21), 11727–11734. https://doi.org/10.1073/pnas.2003138117.
Hoffmann, M.; Kleine-Weber, H.; Schroeder, S.; Krüger, N.; Herrler, T.; Erichsen, S.; Schiergens, T. S.; Herrler, G.; Wu, N.-H.; Nitsche, A.; Müller, M. A.; Drosten, C.; Pöhlmann, S. SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor. Cell 2020, 181 (2), 271-280.e8. https://doi.org/10.1016/j.cell.2020.02.052.
Letko, M.; Marzi, A.; Munster, V. Functional Assessment of Cell Entry and Receptor Usage for SARS-CoV-2 and Other Lineage B Betacoronaviruses. Nat Microbiol 2020, 5 (4), 562–569. https://doi.org/10.1038/s41564-020-0688-y.
Monteil, V.; Kwon, H.; Prado, P.; Hagelkrüys, A.; Wimmer, R. A.; Stahl, M.; Leopoldi, A.; Garreta, E.; Hurtado Del Pozo, C.; Prosper, F.; Romero, J. P.; Wirnsberger, G.; Zhang, H.; Slutsky, A. S.; Conder, R.; Montserrat, N.; Mirazimi, A.; Penninger, J. M. Inhibition of SARS-CoV-2 Infections in Engineered Human Tissues Using Clinical-Grade Soluble Human ACE2. Cell 2020, 181 (4), 905-913.e7. https://doi.org/10.1016/j.cell.2020.04.004.
Higuchi, Y.; Suzuki, T.; Arimori, T.; Ikemura, N.; Mihara, E.; Kirita, Y.; Ohgitani, E.; Mazda, O.; Motooka, D.; Nakamura, S.; Sakai, Y.; Itoh, Y.; Sugihara, F.; Matsuura, Y.; Matoba, S.; Okamoto, T.; Takagi, J.; Hoshino, A. Engineered ACE2 Receptor Therapy Overcomes Mutational Escape of SARS-CoV-2. Nat Commun 2021, 12 (1), 3802. https://doi.org/10.1038/s41467-021-24013-y.
Haschke, M.; Schuster, M.; Poglitsch, M.; Loibner, H.; Salzberg, M.; Bruggisser, M.; Penninger, J.; Krähenbühl, S. Pharmacokinetics and Pharmacodynamics of Recombinant Human Angiotensin-Converting Enzyme 2 in Healthy Human Subjects. Clin Pharmacokinet 2013, 52 (9), 783–792. https://doi.org/10.1007/s40262-013-0072-7.
Yan, R.; Zhang, Y.; Li, Y.; Xia, L.; Guo, Y.; Zhou, Q. Structural Basis for the Recognition of SARS-CoV-2 by Full-Length Human ACE2. Science 2020, 367 (6485), 1444–1448. https://doi.org/10.1126/science.abb2762.
Shang, J.; Ye, G.; Shi, K.; Wan, Y.; Luo, C.; Aihara, H.; Geng, Q.; Auerbach, A.; Li, F. Structural Basis of Receptor Recognition by SARS-CoV-2. Nature 2020, 581 (7807), 221–224. https://doi.org/10.1038/s41586-020-2179-y.
Chan, K. K.; Dorosky, D.; Sharma, P.; Abbasi, S. A.; Dye, J. M.; Kranz, D. M.; Herbert, A. S.; Procko, E. Engineering Human ACE2 to Optimize Binding to the Spike Protein of SARS Coronavirus 2. Science 2020, 369 (6508), 1261–1265. https://doi.org/10.1126/science.abc0870.
Wrapp, D.; Wang, N.; Corbett, K. S.; Goldsmith, J. A.; Hsieh, C.-L.; Abiona, O.; Graham, B. S.; McLellan, J. S. Cryo-EM Structure of the 2019-NCoV Spike in the Prefusion Conformation. Science 2020, 367 (6483), 1260–1263. https://doi.org/10.1126/science.abb2507.
Glasgow, A.; Glasgow, J.; Limonta, D.; Solomon, P.; Lui, I.; Zhang, Y.; Nix, M. A.; Rettko, N. J.; Zha, S.; Yamin, R.; Kao, K.; Rosenberg, O. S.; Ravetch, J. V.; Wiita, A. P.; Leung, K. K.; Lim, S. A.; Zhou, X. X.; Hobman, T. C.; Kortemme, T.; Wells, J. A. Engineered ACE2 Receptor Traps Potently Neutralize SARS-CoV-2. Proc Natl Acad Sci U S A 2020, 117 (45), 28046–28055. https://doi.org/10.1073/pnas.2016093117.
Liu, P.; Wysocki, J.; Souma, T.; Ye, M.; Ramirez, V.; Zhou, B.; Wilsbacher, L. D.; Quaggin, S. E.; Batlle, D.; Jin, J. Novel ACE2-Fc Chimeric Fusion Provides Long-Lasting Hypertension Control and Organ Protection in Mouse Models of Systemic Renin Angiotensin System Activation. Kidney Int 2018, 94 (1), 114–125. https://doi.org/10.1016/j.kint.2018.01.029.
Huang, K.-Y.; Lin, M.-S.; Kuo, T.-C.; Chen, C.-L.; Lin, C.-C.; Chou, Y.-C.; Chao, T.-L.; Pang, Y.-H.; Kao, H.-C.; Huang, R.-S.; Lin, S.; Chang, S.-Y.; Yang, P.-C. Humanized COVID-19 Decoy Antibody Effectively Blocks Viral Entry and Prevents SARS-CoV-2 Infection. EMBO Mol Med 2021, 13 (1), e12828. https://doi.org/10.15252/emmm.202012828.
Tanaka, S.; Nelson, G.; Olson, C. A.; Buzko, O.; Higashide, W.; Shin, A.; Gonzalez, M.; Taft, J.; Patel, R.; Buta, S.; Richardson, A.; Bogunovic, D.; Spilman, P.; Niazi, K.; Rabizadeh, S.; Soon-Shiong, P. An ACE2 Triple Decoy That Neutralizes SARS-CoV-2 Shows Enhanced Affinity for Virus Variants. Sci Rep 2021, 11, 12740. https://doi.org/10.1038/s41598-021-91809-9.
Linsky, T. W.; Vergara, R.; Codina, N.; Nelson, J. W.; Walker, M. J.; Su, W.; Barnes, C. O.; Hsiang, T.-Y.; Esser-Nobis, K.; Yu, K.; Reneer, Z. B.; Hou, Y. J.; Priya, T.; Mitsumoto, M.; Pong, A.; Lau, U. Y.; Mason, M. L.; Chen, J.; Chen, A.; Berrocal, T.; Peng, H.; Clairmont, N. S.; Castellanos, J.; Lin, Y.-R.; Josephson-Day, A.; Baric, R. S.; Fuller, D. H.; Walkey, C. D.; Ross, T. M.; Swanson, R.; Bjorkman, P. J.; Gale, M.; Blancas-Mejia, L. M.; Yen, H.-L.; Silva, D.-A. De Novo Design of Potent and Resilient HACE2 Decoys to Neutralize SARS-CoV-2. Science 2020, 370 (6521), 1208–1214. https://doi.org/10.1126/science.abe0075.
Ye, F.; Lin, X.; Chen, Z.; Yang, F.; Lin, S.; Yang, J.; Chen, H.; Sun, H.; Wang, L.; Wen, A.; Zhang, X.; Dai, Y.; Cao, Y.; Yang, J.; Shen, G.; Yang, L.; Li, J.; Wang, Z.; Wang, W.; Wei, X.; Lu, G. S19W, T27W, and N330Y Mutations in ACE2 Enhance SARS-CoV-2 S-RBD Binding toward Both Wild-Type and Antibody-Resistant Viruses and Its Molecular Basis. Signal Transduct Target Ther 2021, 6 (1), 343. https://doi.org/10.1038/s41392-021-00756-4.
Zhang, L.; Dutta, S.; Xiong, S.; Chan, M.; Chan, K. K.; Fan, T. M.; Bailey, K. L.; Lindeblad, M.; Cooper, L. M.; Rong, L.; Gugliuzza, A. F.; Shukla, D.; Procko, E.; Rehman, J.; Malik, A. B. Engineered ACE2 Decoy Mitigates Lung Injury and Death Induced by SARS-CoV-2 Variants. Nat Chem Biol 2022, 18 (3), 342–351. https://doi.org/10.1038/s41589-021-00965-6.
Havranek, B.; Chan, K. K.; Wu, A.; Procko, E.; Islam, S. M. Computationally Designed ACE2 Decoy Receptor Binds SARS-CoV-2 Spike (S) Protein with Tight Nanomolar Affinity. J Chem Inf Model 2021, 61 (9), 4656–4669. https://doi.org/10.1021/acs.jcim.1c00783.
Durmaz, V.; Köchl, K.; Singh, A.; Hetmann, M.; Parigger, L.; Krassnigg, A.; Nutz, D.; Korsunsky, A.; König, C.; Chang, L.; Krebs, M.; Bassetto, R.; Pavkov-Keller, T.; Resch, V.; Gruber, K.; Steinkellner, G.; Gruber, C. C. Structural-Bioinformatics Analysis of SARS-CoV-2 Variants Reveals Higher hACE2 Receptor Binding Affinity for Omicron B.1.1.529 Spike RBD Compared to Wild-Type Reference. Research Square, December 16, 2021. https://doi.org/10.21203/rs.3.rs-1153124/v1.
Starr, T. N.; Greaney, A. J.; Hilton, S. K.; Ellis, D.; Crawford, K. H. D.; Dingens, A. S.; Navarro, M. J.; Bowen, J. E.; Tortorici, M. A.; Walls, A. C.; King, N. P.; Veesler, D.; Bloom, J. D. Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding and ACE2 Binding. Cell 2020, 182 (5), 1295-1310.e20. https://doi.org/10.1016/j.cell.2020.08.012.
Moore, M. J.; Dorfman, T.; Li, W.; Wong, S. K.; Li, Y.; Kuhn, J. H.; Coderre, J.; Vasilieva, N.; Han, Z.; Greenough, T. C.; Farzan, M.; Choe, H. Retroviruses Pseudotyped with the Severe Acute Respiratory Syndrome Coronavirus Spike Protein Efficiently Infect Cells Expressing Angiotensin-Converting Enzyme 2. Journal of Virology 2004.
Gordon, C. J.; Tchesnokov, E. P.; Woolner, E.; Perry, J. K.; Feng, J. Y.; Porter, D. P.; Götte, M. Remdesivir Is a Direct-Acting Antiviral That Inhibits RNA-Dependent RNA Polymerase from Severe Acute Respiratory Syndrome Coronavirus 2 with High Potency. J Biol Chem 2020, 295 (20), 6785–6797. https://doi.org/10.1074/jbc.RA120.013679.
Gobeil, S. M.-C.; Janowska, K.; McDowell, S.; Mansouri, K.; Parks, R.; Manne, K.; Stalls, V.; Kopp, M. F.; Henderson, R.; Edwards, R. J.; Haynes, B. F.; Acharya, P. D614G Mutation Alters SARS-CoV-2 Spike Conformation and Enhances Protease Cleavage at the S1/S2 Junction. Cell Reports 2021, 34 (2). https://doi.org/10.1016/j.celrep.2020.108630.
Castilho, A.; Schwestka, J.; Kienzl, N. F.; Vavra, U.; Grünwald-Gruber, C.; Izadi, S.; Hiremath, C.; Niederhöfer, J.; Laurent, E.; Monteil, V.; Mirazimi, A.; Wirnsberger, G.; Stadlmann, J.; Stöger, E.; Mach, L.; Strasser, R. Generation of Enzymatically Competent SARS-CoV-2 Decoy Receptor ACE2-Fc in Glycoengineered Nicotiana benthamiana. Biotechnology Journal 2021, 16 (6), 2000566. https://doi.org/10.1002/biot.202000566.
Capraz, T.; Kienzl, N. F.; Laurent, E.; Perthold, J. W.; Föderl-Höbenreich, E.; Grünwald-Gruber, C.; Maresch, D.; Monteil, V.; Niederhöfer, J.; Wirnsberger, G.; Mirazimi, A.; Zatloukal, K.; Mach, L.; Penninger, J. M.; Oostenbrink, C.; Stadlmann, J. Structure-Guided Glyco-Engineering of ACE2 for Improved Potency as Soluble SARS-CoV-2 Decoy Receptor. eLife 2021, 10, e73641. https://doi.org/10.7554/eLife.73641.
Mamedov, T.; Gurbuzaslan, I.; Yuksel, D.; Ilgin, M.; Mammadova, G.; Ozkul, A.; Hasanova, G. Soluble Human Angiotensin-Converting Enzyme 2 as a Potential Therapeutic Tool for COVID-19 Is Produced at High Levels in Nicotiana benthamiana Plant with Potent Anti-SARS-CoV-2 Activity. Frontiers in Plant Science 2021, 12.
Maier, J. A.; Martinez, C.; Kasavajhala, K.; Wickstrom, L.; Hauser, K. E.; Simmerling, C. ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB. J Chem Theory Comput 2015, 11 (8), 3696–3713. https://doi.org/10.1021/acs.jctc.5b00255.
Krieger, E.; Vriend, G. New Ways to Boost Molecular Dynamics Simulations. J Comput Chem 2015, 36 (13), 996–1007. https://doi.org/10.1002/jcc.23899.
Jakalian, A.; Jack, D. B.; Bayly, C. I. Fast, Efficient Generation of High-Quality Atomic Charges. AM1-BCC Model: II. Parameterization and Validation. J Comput Chem 2002, 23 (16), 1623–1641. https://doi.org/10.1002/jcc.10128.
World Health Organization. Tracking SARS-CoV-2 variants. 2022-07-19. https://www.who.int/activities/tracking-SARS-CoV-2-variants (accessed 2022-07-20).
Ramakrishnan, M. A. Determination of 50% Endpoint Titer Using a Simple Formula. World J Virol 2016, 5 (2), 85–86. https://doi.org/10.5501/wjv.v5.i2.85.
Kicker, E.; Tittel, G.; Schaller, T.; Pferschy-Wenzig, E.-M.; Zatloukal, K.; Bauer, R. SARS-CoV-2 Neutralizing Activity of Polyphenols in a Special Green Tea Extract Preparation. Phytomedicine 2022, 98, 153970. https://doi.org/10.1016/j.phymed.2022.153970.
Zahradník, J.; Marciano, S.; Shemesh, M.; Zoler, E.; Harari, D.; Chiaravalli, J.; Meyer, B.; Rudich, Y.; Li, C.; Marton, I.; Dym, O.; Elad, N.; Lewis, M. G.; Andersen, H.; Gagne, M.; Seder, R. A.; Douek, D. C.; Schreiber, G. SARS-CoV-2 Variant Prediction and Antiviral Drug Design Are Enabled by RBD in Vitro Evolution. Nat Microbiol 2021, 6 (9), 1188–1198. https://doi.org/10.1038/s41564-021-00954-4.
Shu, Y.; McCauley, J. GISAID: Global Initiative on Sharing All Influenza Data - from Vision to Reality. Euro Surveill 2017, 22 (13), 30494. https://doi.org/10.2807/1560-7917.ES.2017.22.13.30494.
Gruber, K.; Steinkellner, G.; Gruber, C. Determining Novel Enzymatic Functionalities Using Three-Dimensional Point Clouds Representing Physico Chemical Properties of Protein Cavities. USA, US20150302142A1, October 22, 2015.
Krieger, E.; Koraimann, G.; Vriend, G. Increasing the Precision of Comparative Models with YASARA NOVA--a Self-Parameterizing Force Field. Proteins 2002, 47 (3), 393–402. https://doi.org/10.1002/prot.10104.
Morris, G. M.; Huey, R.; Lindstrom, W.; Sanner, M. F.; Belew, R. K.; Goodsell, D. S.; Olson, A. J. AutoDock4 and AutoDockTools4: Automated Docking with Selective Receptor Flexibility. J Comput Chem 2009, 30 (16), 2785–2791. https://doi.org/10.1002/jcc.21256.

No competing interests reported.

Editorial decision: Major revision
29 Oct, 2022
Reviews received at journal
03 Oct, 2022
Reviewers agreed at journal
16 Sep, 2022
Reviewers invited by journal
16 Sep, 2022
Editor assigned by journal
13 Sep, 2022
Editor invited by journal
13 Sep, 2022
Submission checks completed at journal
13 Sep, 2022
First submitted to journal
17 Aug, 2022
