A few years ago, the World Health Organisation (WHO) published a list of pathogens (https://www.who.int/activities/prioritizing-diseases-for-research-and-development-in-emergency-contexts) with high priority for research and development. All were viruses, none had known treatments or vaccines, and all had the potential to trigger pandemics that could kill thousands. WHO experts later decided to add a new entry: Disease X, referring to “a serious international epidemic caused by a pathogen currently unknown.” COVID-19, a new member of the coronavirus family that had never been encountered before and for which there are no vaccines or treatments and only limited diagnostic tools, has most of the characteristics of a Disease X. Many important questions about this new virus, which has spread to 25 countries in less than two months since first appearing in China, remain unanswered. One of the most intriguing of these questions concerns the protein encoded by Orf3b [1]. The origin of this completely novel short protein and its role in the viral life cycle and pathogenesis are still unknown.
Here we analyze the COVID-19 Orf3b protein using the informational spectrum method (ISM) [2]. This virtual spectroscopy method, which is based on two electronic molecular descriptors, the average quasi valence number (AQVN) and the electron-ion interaction potential (EIIP), allows functional analysis of protein sequences without any prior experimental data. The ISM was recently used to predict the potential receptor, natural reservoir, tropism, and therapeutic/vaccine targets of COVID-19 (https://f1000research.com/articles/9-52).
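To make the idea of an informational spectrum concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual ISM recipe: each residue is replaced by its EIIP value, the resulting numerical series is passed through a discrete Fourier transform, and the energy at each non-zero frequency forms the spectrum. The EIIP values are those commonly tabulated in the ISM literature and should be checked against reference [2]. The function names and the definition of S/N as peak energy divided by the mean energy of the spectrum are assumptions made for illustration.

```python
import numpy as np

# EIIP values (in Rydbergs) as commonly tabulated in the ISM literature.
# Assumption: verify against reference [2] before relying on them.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}


def informational_spectrum(sequence):
    """Return (frequencies, energies) of the EIIP-encoded sequence.

    Frequencies run from 1/N to 0.5; the zero-frequency term is dropped.
    """
    signal = np.array([EIIP[aa] for aa in sequence.upper()], dtype=float)
    energy = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(signal.size)
    return freqs[1:], energy[1:]


def dominant_peak(sequence):
    """Return (frequency, amplitude, S/N) of the strongest spectral peak.

    Assumption: S/N is the peak energy over the mean energy of the spectrum.
    """
    freqs, energy = informational_spectrum(sequence)
    k = int(np.argmax(energy))
    return float(freqs[k]), float(energy[k]), float(energy[k] / energy.mean())
```

For a toy sequence such as `"DL" * 10`, whose EIIP values alternate with period 2, `dominant_peak` reports the peak at frequency 0.5. A real protein sequence would be passed in the same way as a one-letter amino acid string.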
Figure 1 shows the informational spectrum (IS) of the COVID-19 Orf3b protein. The dominant peak in this IS is at the frequency F(0.047). According to the IS methodology, this frequency represents the information encoded by the Orf3b protein that determines its interaction with other proteins. To identify possible interactors of the COVID-19 Orf3b protein, the UniProt database (https://www.uniprot.org) was screened using the ISM for human proteins with a dominant peak at the frequency F(0.047). The human proteins that have a dominant IS peak at this frequency are listed in Table 1. Among these proteins, Importin alpha 3 (Karyopherin 3) and Importin alpha 4 (Karyopherin 4) have the highest signal-to-noise ratios (S/N) and are among those with the highest amplitudes at the frequency F(0.047). According to the IS criterion, these proteins are candidate interactors of the COVID-19 Orf3b protein.
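The screening step can be sketched as follows, under stated assumptions. The candidate sequences are supplied as a plain dictionary (no UniProt access is shown), the helper names `spectrum` and `screen` are hypothetical, the tolerance around the target frequency is an illustrative choice, and each sequence is mean-centred and zero-padded to a common length so that all spectra share one frequency grid. The EIIP values are those commonly tabulated in the ISM literature, and S/N is again taken as peak energy over mean energy. None of these details is confirmed by the text above.

```python
import numpy as np

# EIIP values (in Rydbergs) as commonly tabulated in the ISM literature.
# Assumption: verify against reference [2] before relying on them.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}


def spectrum(sequence, n_points):
    """Energy spectrum of a sequence, zero-padded to n_points samples."""
    signal = np.array([EIIP[aa] for aa in sequence.upper()], dtype=float)
    signal -= signal.mean()  # assumption: centre first to limit leakage from padding
    energy = np.abs(np.fft.rfft(signal, n=n_points)) ** 2
    return np.fft.rfftfreq(n_points)[1:], energy[1:]


def screen(candidates, target_freq, tol=0.005):
    """Return (name, amplitude, S/N) for candidates whose dominant peak
    lies within tol of target_freq, sorted by S/N in descending order."""
    n_points = max(len(seq) for seq in candidates.values())
    hits = []
    for name, seq in candidates.items():
        freqs, energy = spectrum(seq, n_points)
        k = int(np.argmax(energy))
        if abs(freqs[k] - target_freq) <= tol:
            hits.append((name, float(energy[k]), float(energy[k] / energy.mean())))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)
```

Calling `screen(proteins, target_freq=0.047)` on a dictionary of human protein sequences would yield a ranked list analogous in form to Table 1, although the numerical values depend on the normalisation and padding choices assumed here.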
Table 1. Human proteins with the dominant amplitude in the informational spectrum on the frequency F(0.047).
| Protein | Amplitude | S/N |
|---|---|---|
| Arylsulfatase D precursor | 5.97 | 6.28 |
| G1/S-specific cyclin-D3 | 2.83 | 6.08 |
| Protein C-ets-1 | 6.95 | 9.04 |
| Fibroblast growth factor 4 precursor | 1.70 | 5.83 |
| G antigen 1 | 1.19 | 5.52 |
| G antigen 3 | 1.02 | 5.73 |
| Glutamine synthetase | 3.15 | 5.16 |
| Hypoxia-inducible gene 2 protein | 0.72 | 7.40 |
| Homeobox protein Hox-A9 | 2.19 | 5.35 |
| Homeobox protein Hox-C9 | 2.34 | 5.88 |
| Importin subunit alpha-3 | 10.62 | 11.84 |
| Importin subunit alpha-4 | 8.43 | 9.19 |
| Kremen protein 2 precursor | 4.75 | 6.37 |
| Lymphocyte antigen 6D precursor | 1.44 | 7.54 |
| Motile sperm domain-containing protein 3 | 2.25 | 5.92 |
| Myotilin | 5.32 | 6.74 |
| Phosphofurin acidic cluster sorting protein 2 | 11.75 | 7.93 |
| Protocadherin gamma C4 precursor | 8.83 | 5.32 |
| Pituitary homeobox 3 | 2.84 | 7.56 |
| Something about silencing protein 10 | 5.47 | 6.67 |
| Septin-5 | 3.45 | 5.16 |
| Synaptotagmin | 3.73 | 5.62 |
| Transmembrane protein 28 | 4.77 | 6.66 |
Further, literature data mining reveals that the severe acute respiratory syndrome coronavirus (SARS-CoV) Orf6 protein antagonizes STAT1 function by sequestering karyopherin nuclear import factors (KPNA) on the rough endoplasmic reticulum/Golgi membrane, which leads to a loss of STAT1 transport into the nucleus [3]. The loss of STAT1 transport into the nucleus in response to interferon signaling blocks the expression of STAT1-activated genes that establish an antiviral state. Notably, the same mechanism of blocking the interferon response was reported for the Ebola virus, in which VP24 binds KPNA [4]. These data suggest that the COVID-19 Orf3b protein could represent a functional analog of the SARS-CoV Orf6 protein.
Finally, the COVID-19 Orf3b protein sequence was scanned for the domain that contributes most to the information represented by the frequency F(0.047) (Figure 2). This analysis revealed that the domain spanning residues 15-29 is probably essential for the interaction of the Orf3b protein with KPNA.
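One plausible way to perform such a scan is sketched below. It is not necessarily the procedure used for Figure 2. It assumes a sliding window of fixed length and scores each window by the energy its residues contribute at the target frequency, computed from the EIIP-encoded, mean-centred sequence. The function names, the window length, and the scoring rule are all hypothetical choices made for illustration, and the EIIP values are those commonly tabulated in the ISM literature.

```python
import numpy as np

# EIIP values (in Rydbergs) as commonly tabulated in the ISM literature.
# Assumption: verify against reference [2] before relying on them.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}


def window_scores(sequence, target_freq, window):
    """Energy contributed at target_freq by each window of consecutive residues."""
    signal = np.array([EIIP[aa] for aa in sequence.upper()], dtype=float)
    signal -= signal.mean()  # assumption: remove the zero-frequency offset
    positions = np.arange(signal.size)
    # Per-residue terms of the Fourier sum evaluated at the target frequency.
    phasors = signal * np.exp(-2j * np.pi * target_freq * positions)
    return np.array([
        abs(phasors[start:start + window].sum()) ** 2
        for start in range(signal.size - window + 1)
    ])


def best_domain(sequence, target_freq, window):
    """Return (first, last, score) of the top window, 1-based and inclusive."""
    scores = window_scores(sequence, target_freq, window)
    start = int(np.argmax(scores))
    return start + 1, start + window, float(scores[start])
```

With the Orf3b sequence as input, a call such as `best_domain(orf3b, target_freq=0.047, window=15)` would return the residue range of the highest-scoring 15-residue window. Whether that range matches the domain reported above depends on the scanning procedure actually used.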
In conclusion, the presented analysis suggests (i) that the COVID-19 Orf3b protein is a functional analog of the SARS-CoV Orf6 protein, (ii) that the COVID-19 Orf3b protein impairs the interferon signaling network and the host innate defense by binding to KPNA proteins, and (iii) that the domain 15-29 of the COVID-19 Orf3b protein is a putative binding site for KPNA proteins and could be used as a possible therapeutic target.