Sauropus androgynus (L.) Merr. is a palatable and nutritious leafy vegetable that is also used in traditional remedies for various diseases in South and Southeast Asia[1-3]. In Thailand, S. androgynus plants are infected by multiple viruses, including ageratum yellow vein virus (AYVV), tomato leaf curl New Delhi virus (ToLCNDV) and sauropus leaf curl virus (SaLCV) in the genus Begomovirus, as well as sauropus yellowing virus (SaYV) in the genus Polerovirus[4,5]. To date, few viruses infecting S. androgynus have been identified in China, and further studies are needed to determine the status of virus infection in Chinese S. androgynus.
In September 2023, curled and yellowed leaves reminiscent of viral disease symptoms were observed on S. androgynus in Fujian Province, China. First, RT-PCR (Promega, USA; Vazyme, China) was performed on RNA isolated from symptomatic S. androgynus using the universal begomovirus primer pair PLA1v1978B/PAR1c715H and the polerovirus primers Gen-1/Gen-2[5]. Neither begomovirus nor polerovirus sequences were detected (not shown), suggesting that the symptomatic S. androgynus plants might be infected by other pathogens.
Total RNA purified from a pool of ten symptomatic leaves collected from multiple plants was then used to prepare an RNA library by an rRNA-depletion method. The library was sequenced in PE150 mode on the Illumina NovaSeq 6000 platform at Novogene, China, generating 73,443,310 reads. After removal of low-quality reads, the remaining 71,682,618 reads were assembled de novo using Trinity (version 2.8.5) with default parameters. Contigs longer than 400 bp were searched against the NCBI non-redundant (NR) protein sequence database using BLASTx (-evalue 10 -use_sw_tback -max_target_seqs 3 -outfmt 6) to identify putative viral contigs. A contig of 7,968 bp with sequence similarity to members of the genus Allexivirus was identified. This sequence was found to encode multiple proteins, including one with an RNA-dependent RNA polymerase (RdRp) domain. The protein containing the RdRp domain shared only 64.3% identity with its closest match in the NR database, suggesting that the 7,968-bp sequence represents the genome of a new allexivirus. We therefore tentatively named the new virus sauropus androgynus virus (SaV).
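The contig length filter applied before the BLASTx search can be illustrated with a short sketch. This is not the pipeline used in the study; the FASTA parser, the function names and the toy contigs below are hypothetical, and only the 400-bp cutoff is taken from the text.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {contig_id: sequence} mapping."""
    records: Dict[str, List[str]] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The contig ID is the first whitespace-delimited token of the header.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def filter_by_length(contigs: Dict[str, str], min_len: int = 400) -> Dict[str, str]:
    """Keep only contigs strictly longer than min_len (400 bp in the text)."""
    return {name: seq for name, seq in contigs.items() if len(seq) > min_len}


# Toy input (hypothetical contigs, not data from the study).
example = [">contig_1 len=8", "ACGTACGT", ">contig_2", "A" * 250, "C" * 251]
kept = filter_by_length(read_fasta(example))
print(sorted(kept))
```

Only the contigs that pass this filter would then be written out and passed to BLASTx with the parameters listed above.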
The genus Allexivirus previously contained 13 species, whose genomes consist of a single molecule of linear positive-sense (+) single-stranded RNA, 7.4–8.8 kb in size, with a 3’-poly(A) tail[6, 7]. Seven allexiviruses belong to the subgenus Acarallexivirus and have complete genomes exceeding 8,000 nucleotides (nt). The ORFs of these acarallexiviruses sequentially encode the replicase protein, triple gene block protein 1 (TGBp1), TGBp2, a 40 kDa protein, the coat protein (CP), and a nucleic acid binding protein (NABP)[7]. The other six species are blackberry virus E (BVE), garlic mite-borne filamentous virus (GarMbFV), vanilla latent virus (VLV), arachis pintoi virus (ApV), senna severe yellow mosaic virus (SSYMV), and alfalfa virus S (AVS)[7]. The genomes of BVE, GarMbFV, VLV, ApV and SSYMV are shorter than those of the acarallexiviruses; their ORFs encode proteins similar to those of the acarallexiviruses plus an additional TGBp3, but they lack NABP[8].
To obtain the accurate sequence of SaV, six primer pairs (Table S1) covering the putative viral contig were designed and used to amplify fragments of the SaV genome by RT-PCR. The PCR amplicons were cloned into the T-vector using a pMD™ 19-T Vector Cloning Kit (Takara, China). The ligated plasmids were transformed into E. coli, and at least three positive clones of each fragment were sequenced. In addition, primers near both ends of the determined sequence were designed to determine the 5’ terminus using a 5’ RACE kit (Invitrogen, USA) and the 3’ terminus by RT-PCR (Table S1). The complete genome of the virus was obtained by sequence assembly and deposited in the NCBI GenBank database under accession number PQ177843.
The genome of SaV, excluding the 3’-poly(A) tail, consists of 8,007 nucleotides (nt) and contains six ORFs flanked by 5’ and 3’ untranslated regions (UTRs) of 94 nt and 197 nt, respectively (Fig. 1a). The complete nucleotide sequence of SaV shares 52.2–61.2% identity with recognized allexiviruses (Table 1). ORF1 (nt 94–4,675) of SaV encodes a 171.3-kDa replication-associated protein (replicase) that contains methyltransferase (amino acid (aa) positions 41–333), helicase (aa 770–1,003) and RNA-dependent RNA polymerase (aa 1,183–1,470) domains (Fig. 1a). ORF1 of SaV shares 57.8–65.7% nt and 58.1–69.5% aa identity with those of allexiviruses, with the highest identity to ApV (Table 1).
Table 1. Pairwise comparisons of nucleotide (nt) and putative amino acid (aa) sequence identities between SaV and the most closely related allexiviruses.
| Virus | ApV | AVS | BVE | SSYMV | VLV | GarVA | GarVC | GarVD | ShVX |
|---|---|---|---|---|---|---|---|---|---|
| % nt sequence identity | | | | | | | | | |
| Complete genome | 61.2 | 56.0 | 56.9 | 57.0 | 53.6 | 52.6 | 52.2 | 52.4 | 52.4 |
| ORF1/RdRp | **65.7** | 62.2 | 63.0 | 64.1 | 60.4 | 59.1 | 59.3 | 59.3 | 57.8 |
| ORF2/TGBp1 | **53.0** | 51.3 | 52.5 | 52.3 | 46.0 | 49.1 | 49.9 | 50.1 | 50.0 |
| ORF3/TGBp2 | 59.5 | **62.0** | 58.6 | 53.2 | 49.7 | 33.3 | 48.7 | 50.0 | 52.3 |
| ORF4/TGBp3 | 56.6 | 49.8 | **60.6** | 45.2 | 43.2 | - | - | - | - |
| ORF5/P40 | 43.2 | **44.3** | 40.4 | 40.3 | 31.6 | 41.3 | 38.9 | 41.7 | 24.1 |
| ORF6/CP | 53.4 | 54.0 | **61.8** | 56.6 | 31.7 | 50.5 | 50.4 | 49.7 | 49.3 |
| % aa sequence identity | | | | | | | | | |
| ORF1/RdRp | **69.5** | 63.0 | 63.7 | 65.8 | 64.8 | 60.6 | 60.6 | 58.6 | 58.1 |
| ORF2/TGBp1 | 47.0 | 43.9 | **50.6** | 44.7 | 42.4 | 39.3 | 40.6 | 41.5 | 43.2 |
| ORF3/TGBp2 | 52.9 | **59.6** | 48.5 | 50.0 | 46.0 | 41.6 | 41.6 | 40.6 | 42.6 |
| ORF4/TGBp3 | 25.9 | **38.9** | 34.4 | 31.5 | 28.4 | - | - | - | - |
| ORF5/P40 | 34.2 | **35.2** | 25.9 | 26.9 | 21.7 | 28.7 | 26.3 | 31.8 | 28.7 |
| ORF6/CP | 51.9 | 46.1 | **56.4** | 47.1 | 47.4 | 42.7 | 43.3 | 44.6 | 38.8 |
The analysis was performed using fast alignment with default parameters in DNAMAN software.
The maximum percentage of identity observed for each protein is shown in bold. -, no homologous gene
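As an illustration of what a pairwise identity value measures, the following minimal sketch computes percent identity from two sequences that have already been aligned. It is not the DNAMAN algorithm: the function name, the choice of denominator (all alignment columns except those where both sequences have a gap) and the toy sequences are assumptions, and different tools may count gaps differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns where both sequences carry a gap are ignored, and a
    gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # uninformative column
        compared += 1
        if x == y and x != gap:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


# Toy alignment (hypothetical sequences): 8 identical columns out of 10.
identity = percent_identity("ACGT-ACGTA", "ACGTTACCTA")
print(f"{identity:.1f}%")
```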
ORF2 (nt 4,819–5,532; 25.6 kDa) and ORF3 (nt 5,510–5,818; 11.0 kDa) of SaV encode putative proteins homologous to the TGBp1 and TGBp2 of Alphaflexiviridae members, respectively (Fig. 1a). TGBp1 and TGBp2 of SaV share 46.0–53.0% and 33.3–62.0% nt identity, and 39.3–50.6% and 40.6–59.6% aa identity, respectively, with their orthologs in other allexiviruses (Table 1). TGBp1 possesses an NTPase/helicase domain characterized by the GVPGCGKST motif[9]. TGBp2 contains a viral movement domain (aa 4–99). SaV ORF4 (nt 5,622–5,957; 11.7 kDa) encodes a TGBp3 with a CRLTITGHAISLSGC motif conserved in the Alphaflexiviridae[8] (Fig. S1), and shares 43.2–60.6% nt and 25.9–38.9% aa identity with its orthologs in allexiviruses (Table 1). SaV ORF5 (nt 5,938–7,038; 33.9 kDa) encodes the serine-rich 40K protein typical of allexiviruses (Fig. 1a), which is probably involved in virion assembly[10]. SaV ORF6 (nt 7,079–7,810; 25.6 kDa) encodes the coat protein, which carries the KYAAFDFFNGVTNDAA motif conserved in the Alphaflexiviridae. Pairwise comparisons reveal aa identities of 21.7–35.2% (nt identity 24.1–44.3%) for SaV ORF5 and 38.8–56.4% (nt identity 31.7–61.8%) for SaV ORF6 (Table 1). Overall, the nt and aa sequence comparisons indicate that SaV is similar to allexiviruses, especially AVS and BVE (Table 1).
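The ORF coordinates given above can be checked with simple arithmetic. The sketch below takes the genome length (8,007 nt) and the ORF2–ORF6 coordinates from the text and derives ORF lengths, reading-frame consistency, overlaps between consecutive ORFs and the 3’ UTR length; the variable and function names are hypothetical.

```python
GENOME_LENGTH = 8007  # nt, excluding the poly(A) tail (from the text)

# 1-based inclusive coordinates of ORF2-ORF6 as given in the text.
ORFS = {
    "ORF2": (4819, 5532),
    "ORF3": (5510, 5818),
    "ORF4": (5622, 5957),
    "ORF5": (5938, 7038),
    "ORF6": (7079, 7810),
}


def orf_length(start: int, end: int) -> int:
    """Length in nt of an ORF with 1-based inclusive coordinates."""
    return end - start + 1


def overlap(a: tuple, b: tuple) -> int:
    """Number of nt shared by two coordinate ranges (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


names = list(ORFS)
lengths = {n: orf_length(*ORFS[n]) for n in names}
# A complete ORF (start codon through stop codon) spans a multiple of 3 nt.
in_frame = {n: lengths[n] % 3 == 0 for n in names}
# Overlap between each ORF and the next one downstream.
overlaps = {(p, q): overlap(ORFS[p], ORFS[q]) for p, q in zip(names, names[1:])}
three_prime_utr = GENOME_LENGTH - ORFS["ORF6"][1]

for n in names:
    print(n, lengths[n], "nt", "in frame" if in_frame[n] else "frame mismatch")
print(overlaps)
print("3' UTR:", three_prime_utr, "nt")
```

The derived 3’ UTR length agrees with the 197 nt stated for the genome, and the overlaps between consecutive triple gene block ORFs follow directly from the coordinates.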
To determine the phylogenetic position of SaV within the family Alphaflexiviridae (Table S2), phylogenetic trees were constructed from the amino acid sequences of the replicase and the CP[11] (Fig. 1b, c). In both the ORF1 replicase and the CP trees, SaV clusters with allexiviruses but does not fall on the acarallexivirus branch (Fig. 1b, c), suggesting that SaV belongs to the genus Allexivirus but not to the subgenus Acarallexivirus. The sequence identities of the RdRp and the CP between SaV and other known allexiviruses are below the cutoff thresholds for species delineation in the genus Allexivirus (72% for nt or 80% for aa)[6] (Table 1). Therefore, SaV should be considered a novel species of the genus Allexivirus in the family Alphaflexiviridae. To our knowledge, this is the first virus of the family Alphaflexiviridae reported to infect S. androgynus.
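The species demarcation argument can be expressed as a small check. In this sketch, the thresholds (72% nt, 80% aa) and the maximum identities for ORF1/RdRp and ORF6/CP are taken from the text and Table 1; the names are hypothetical, and requiring both the nt and the aa value to fall below its threshold is a deliberately strict assumption rather than the formal wording of the criterion.

```python
NT_THRESHOLD = 72.0  # % nt identity cutoff cited in the text
AA_THRESHOLD = 80.0  # % aa identity cutoff cited in the text

# Highest identity between SaV and any compared allexivirus (Table 1).
TABLE1_MAX = {
    "ORF1/RdRp": {"nt": 65.7, "aa": 69.5},
    "ORF6/CP": {"nt": 61.8, "aa": 56.4},
}


def below_thresholds(nt_identity: float, aa_identity: float) -> bool:
    """True if both identities fall below the demarcation cutoffs.

    Assumption: both values are required to be below threshold, which is
    stricter than requiring either one.
    """
    return nt_identity < NT_THRESHOLD and aa_identity < AA_THRESHOLD


results = {
    gene: below_thresholds(values["nt"], values["aa"])
    for gene, values in TABLE1_MAX.items()
}
is_distinct_species = all(results.values())
print(results, is_distinct_species)
```

Because even the closest relatives fall below both cutoffs for both genes, the check passes under the stricter reading as well.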